O-Glucose Trisaccharide Is Present at High but Variable Stoichiometry at Multiple Sites on Mouse Notch1*

Notch activity is regulated by both O-fucosylation and O-glucosylation, and Notch receptors contain multiple predicted sites for both. Here we examine the occupancy of the predicted O-glucose sites on mouse Notch1 (mN1) using the consensus sequence C1XSXPC2. We show that all of the predicted sites are modified, although the efficiency of modifying O-glucose sites is site- and cell type-dependent. For instance, although most sites are modified at high stoichiometries, the site at EGF 27 is only partially glucosylated, and the occupancy of the site at EGF 4 varies with cell type. O-Glucose is also found at a novel, non-traditional consensus site at EGF 9. Based on this finding, we propose a revision of the consensus sequence for O-glucosylation to allow alanine N-terminal to cysteine 2: C1XSX(A/P)C2. We also show through biochemical and mass spectral analyses that serine is the only hydroxyamino acid that is modified with O-glucose on EGF repeats. The O-glucose at all sites is efficiently elongated to the trisaccharide Xyl-Xyl-Glc. To establish the functional importance of individual O-glucose sites in mN1, we used a cell-based signaling assay. Elimination of most individual sites shows little or no effect on mN1 activation, suggesting that the major effects of O-glucose are mediated by modification of multiple sites. Interestingly, elimination of the site in EGF 28, found in the Abruptex region of Notch, does significantly reduce activity. These results demonstrate that, like O-fucose, the O-glucose modifications of EGF repeats occur extensively on mN1, and they play important roles in Notch function.

The Notch family of single-pass transmembrane receptors is essential for early metazoan development, activating the expression of many genes involved in cell differentiation and tissue morphogenesis (1-3). Defects in Notch signaling have been implicated in a number of human diseases, including several forms of cancer, vascular defects such as cerebral autosomal dominant arteriopathy with subcortical infarcts and leukoencephalopathy (4-6), multiple sclerosis (6), and a number of developmental syndromes (4, 7-10). The canonical Notch signaling pathway is initiated by the interaction of Notch with its ligand on an apposed cell. Upon ligand binding, Notch undergoes presenilin-1-dependent proteolysis, releasing a soluble Notch intracellular domain, which enters the cell nucleus to interact directly with the transcription factors from the CSL family and regulate Notch target genes (2). There are four members of the Notch family in vertebrates (Notch1 to Notch4) interacting with several classes of Notch ligands: ligands of the DSL family (Delta-like 1, 3, and 4 and Jagged 1 and 2) (1) and newly characterized ligands without the DSL domain, such as DNER and MAGP-1 and -2 (11-13).
The Notch proteins consist of a large extracellular domain (ECD), a transmembrane region, and a large intracellular domain (1,2,14). The majority of the ECD consists of tandem epidermal growth factor-like (EGF) repeats (36 EGF repeats are present in mouse Notch1 and -2, 34 in mouse Notch3, and 29 in mouse Notch4), which are conserved protein domains characterized by six cysteine residues forming three disulfide bridges (15). EGF repeats are known to participate in protein-protein interactions, such as receptor-ligand binding. Many of the EGF repeats in Notch (and other EGF repeat-containing proteins) are modified with two unusual types of protein glycosylation: O-fucose and O-glucose (16). The O-fucose modification occurs at the consensus sequence C2XXXX(S/T)C3, located between the second and third cysteines of the EGF repeat (17). O-Fucose is added by protein O-fucosyltransferase 1 (Pofut1), which is a soluble protein in the endoplasmic reticulum (18-21). Knock-out or knockdown of Pofut1 in either mice (22) or Drosophila melanogaster (23,24) results in severe Notch-like phenotypes. Mouse Notch1 (mN1) is modified at multiple predicted sites with O-fucose glycans (17,25), and elimination of individual O-fucose sites on mN1 (EGF repeats 12, 26, and 27) alters activity in cell-based assays (25,26). Elimination of the O-fucose site on EGF 12 in vivo results in a hypomorphic allele with defects in T cell development, confirming an important role for O-fucosylation of this site (27). These results indicate that O-fucosylation is essential for Notch function.
The O-fucose moieties on Notch can be elongated with a GlcNAc by O-fucose-specific β1,3-N-acetylglucosaminyltransferases of the Fringe family (28-30). Three mammalian Fringe homologs exist, Lunatic Fringe, Manic Fringe, and Radical Fringe (31), varying in enzymatic activity levels (30) and expression patterns (32). The GlcNAc-β1,3-Fuc disaccharide can be extended to a tetrasaccharide in mammals (16). A branched O-fucose trisaccharide (GlcUA-β1,4-(GlcNAc-β1,3)-Fuc) has been detected in total extracts of Drosophila larvae and wing discs, but it is not clear whether this modification is on EGF repeats of Notch (33). Elongation beyond the GlcNAc-β1,3-Fuc disaccharide has not been detected on Drosophila Notch produced in S2 cells (34). The Fringe elongation of O-fucose results in a change in the ligand-dependent activation of Notch signaling and subsequently in regulation of Notch in some developmental contexts (35-37). Elimination of Lunatic Fringe in mice results in semilethality, with developmental abnormalities as a consequence of deregulated Notch signaling during somite formation (38-40). Both Lunatic and Manic Fringe are involved in T-cell and B-cell development (41). The studies on O-fucose and Fringe form a paradigm for the role of glycosylation in Notch signaling.
The O-glucose modifications of EGF repeats have received much less attention to date. O-Glucose was first discovered on bovine blood clotting factors VII and IX (42) as a trisaccharide (Xyl-α1,3-Xyl-α1,3-Glc-β1-O-Ser) (43). The same trisaccharide was later found on human factors VII and IX, protein Z (44), thrombospondin (45), and murine fetal antigen-1/delta-like protein (FA-1-DLK) (46). Comparison of the sites of modification on these five proteins led to a proposed consensus sequence, C1XSXPC2, between the first and the second conserved cysteine residues of an EGF repeat (47). To date, only serine residues have been found to be modified with O-glucose, but it is unknown whether threonine can be modified. This consensus site can be found in EGF repeats of numerous secreted and membrane-bound proteins (supplemental Table 1) (48,49), including the Notch receptors. mN1 contains O-glucose consensus sequences in 16 of its 36 EGF repeats (Fig. 1A). Endogenous Notch1 isolated from CHO cells is known to be modified with O-glucose trisaccharide (16), but the actual sites modified or structures of the modifications on Notch have not yet been established. A recent report confirmed that the O-glucose trisaccharide on Notch has the same structure as that reported above: Xyl-α1,3-Xyl-α1,3-Glc (50). An enzymatic activity capable of adding O-glucose to EGF repeats, protein O-glucosyltransferase (Poglut), was detected and initially characterized in extracts of cell lines and various rat tissues (48). The gene encoding Poglut, Rumi, was subsequently identified during a mutant screen in Drosophila for modifiers of Notch activity (51). Loss of Rumi results in a temperature-sensitive Notch loss-of-function phenotype in flies. Recent work shows that elimination of mouse Rumi also results in Notch-like phenotypes in mice (49).
In addition, genes encoding the O-glucose-α1,3-xylosyltransferase have recently been identified (two distinct genes encode enzymes with this activity: GXYLT1 and GXYLT2) (52). The gene encoding the xyloside-α1,3-xylosyltransferase has not yet been identified. Here we address the question of which predicted O-glucose sites on mN1 are modified and to what extent they are elongated with the xyloses. We also examine the functional importance of individual sites by introducing single amino acid mutations and analyzing the effects in cell-based Notch signaling assays.
Analysis of O-Glucose Sugars in Notch Fragments-Preparation of plasmids encoding the EGF repeats 1-5, 6-10, 11-15, 16-18, 19-23, 24-28, 29-36, and 1-18 from mN1 with C-terminal Myc-His6 tags (in pSecTag2; Invitrogen) was described previously (17). Lec1-CHO cells stably expressing each of these constructs were generated by treatment of transfected cells with 680 µg/ml hygromycin B (Invitrogen) for several weeks. Clones producing significant amounts of protein were isolated by dilution cloning and evaluated by immunoblot of media using anti-Myc antibody. Metabolic radiolabeling and purification of Notch fragments were performed essentially as described elsewhere (17), but the cells were labeled with 20 µCi/ml [6-3H]galactose. Alkali-induced β-eliminations and size fractionation of O-linked sugars on the Superdex column (Amersham Biosciences) were performed as described before (54). Analysis of peaks for the presence of glucitol following acid hydrolysis was performed by high pH anion-exchange chromatography with pulsed amperometric detection as described (16).

Analysis of O-Glucose Site Occupancy in Mouse Notch1 Using Collision-induced Dissociation (CID) Mass Spectrometry-Fragments of the mN1 ECD consisting of EGF repeats 1-5, 6-10, 11-15, 16-18, 19-23, 24-28, 29-36, and 1-18 (as described above) were produced in stably transfected Lec1-CHO cells and purified using nickel-NTA chromatography as described previously (17). Some Notch1 fragments were also produced in HEK293T, COS7, and NIH3T3 cells by transient transfection for comparison. The purified fragments were reduced using Tris(2-carboxyethyl)phosphine hydrochloride (Pierce) and carbamidomethylated with iodoacetamide prior to in-gel digestion, as described (51). For in-gel digests, ~200-500 ng of protein was analyzed per run. The bands of interest were identified using zinc sulfate negative staining (Bio-Rad), excised, and digested with a protease (trypsin, chymotrypsin, or V8) that, according to in silico digests (using the PeptideMass proteomics tool from the ExPASy Web site), yielded fragments of appropriate size for analysis in the ion trap in the m/z range 400-2200. The digestions were performed for 8-16 h at 37°C in 20 mM diammonium phosphate, pH 8.0. For in-solution digests, ~500 ng of protein was analyzed per run. Samples were reduced using 8 M urea, 0.4 M ammonium bicarbonate, pH 8.0, and 10 mM Tris(2-carboxyethyl)phosphine hydrochloride and carbamidomethylated with 100 mM iodoacetamide. Reduced and alkylated samples were diluted 4-fold with water and digested with trypsin or V8 for 8-16 h at 37°C. Chymotryptic digests (Princeton Separations) were incubated for 8 h at 30°C in 50 mM Tris-HCl and 2 mM CaCl2. All digests were desalted on a C18 ZipTip microcolumn (Millipore), dried, and resuspended in 20% acetonitrile, 0.1% formic acid. Samples were then subjected to capillary LC-MS/MS or nano-LC-MS/MS using CID, which first causes loss of labile modifications, such as O-glycans, followed by fragmentation of the more stable peptide backbone into b- and y-ions.
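The in silico digest step can be illustrated with a minimal sketch. It is not the PeptideMass tool used in the study; it assumes a simple trypsin rule (cleavage after K or R unless followed by P, no missed cleavages), monoisotopic residue masses, and carbamidomethylated cysteines. The function names `tryptic_digest`, `mz`, and `observable_charges` are hypothetical.

```python
import re

# Monoisotopic residue masses (Da). Cysteine is assumed to be
# carbamidomethylated (103.00919 + 57.02146).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 160.03065, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def tryptic_digest(sequence):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


def mz(peptide, charge):
    """m/z of the unglycosylated peptide carrying `charge` protons."""
    neutral = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER
    return (neutral + charge * PROTON) / charge


def observable_charges(peptide, low=400.0, high=2200.0, charges=(1, 2, 3)):
    """Charge states whose m/z falls inside the ion trap scan window."""
    return [z for z in charges if low <= mz(peptide, z) <= high]
```

For the EGF 4 peptide SCQQADPCASNPCANGGQCLPFESSYICR, this sketch gives a triply charged monoisotopic m/z near 1112.1, slightly below the observed value of about 1112.9, which would be expected if the observed value lies closer to the average mass.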
For capillary LC-MS/MS, digests were fractionated by reverse phase liquid chromatography coupled with electrospray ionization mass spectrometry using a Zorbax C8 capillary column (0.3 × 150 mm) (Agilent Technologies) with a 60-min linear gradient from 0 to 95% buffer B (buffer A: 0.1% formic acid; buffer B: 95% acetonitrile in 0.1% formic acid) at 5 µl/min. The column effluent was sprayed into an XCT ion trap mass spectrometer (Agilent Technologies) equipped with a capillary electrospray source, operating in the positive ion mode, and set up to perform MS/MS on either the two or three most intense ions in each spectrum.
Samples were also analyzed by nano-LC-MS/MS using an Agilent 6340 ion trap mass spectrometer equipped with an HPLC Chip-Cube interface. Sample volumes of 1.0-4.0 µl were injected onto a Zorbax 300SB-C18 chip with a 40-nl enrichment column and a 43 mm × 75 µm separation column (Agilent). Samples were initially loaded onto the enrichment column at 4.0 µl/min in 5.0% buffer B and then fractionated on the separation column at 450 nl/min with a 26-min non-linear gradient from 5.0 to 95% buffer B (buffer A: 0.1% formic acid; buffer B: 95% acetonitrile in 0.1% formic acid). The effluent from the HPLC-Chip was sprayed directly into the ion trap (Agilent), operating in the positive ion mode, and set up to perform MS/MS on the three most abundant ions in each scan, with 30-s active exclusion triggered after two consecutive spectra of a recurrent ion. All MS/MS experiments were carried out using the following settings: capillary voltage, 1700-1950 V; end plate offset, −500 V; dry gas, 5.0 liters/min at 325°C; trap drive, 100; smart target, 500,000; maximum accumulation time, 150 ms; and scan range, 300-2200 m/z. Spectra were confirmed by searching databases with the MS/MS data using the Global Proteome Machine search engine (available on the World Wide Web). All cysteines were carbamidomethylated, and masses of all peptides were adjusted accordingly. The O-glucose-modified peptides were identified by searching the MS/MS data for constant neutral losses of masses equivalent to the O-glucose mono-, di-, and trisaccharides (162, 294, and 426 Da, respectively) relative to parent ions using the Data Analysis Tool (Agilent) as described (55). O-Fucose-modified peptides were identified by constant neutral loss searches of 146 Da, and O-HexNAc-modified peptides by constant neutral loss searches of 203 Da.
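The neutral loss values quoted above follow from simple arithmetic on nominal sugar residue masses, and the loss seen on the m/z axis depends on the charge of the parent ion. The sketch below illustrates this; the pentose value of 132 Da is inferred from the difference between the quoted di- and monosaccharide losses, and the names are hypothetical rather than taken from the Agilent software.

```python
# Nominal residue masses (Da) of the sugars discussed in the text.
HEXOSE = 162        # glucose
PENTOSE = 132       # xylose (294 - 162)
DEOXYHEXOSE = 146   # fucose
HEXNAC = 203        # N-acetylhexosamine

# Neutral losses expected for the three O-glucose glycoforms.
O_GLUCOSE_LOSSES = {
    "monosaccharide": HEXOSE,
    "disaccharide": HEXOSE + PENTOSE,
    "trisaccharide": HEXOSE + 2 * PENTOSE,
}


def neutral_loss_mz(loss_da, charge):
    """Shift on the m/z axis when a neutral of mass `loss_da` leaves an ion
    that keeps all of its `charge` protons."""
    return loss_da / charge
```

For example, `neutral_loss_mz(426, 3)` gives 142, the value to search for when the trisaccharide is lost from a triply charged peptide.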
Subsequent searches of the MS/MS data were performed to identify peptides modified with mono- or disaccharide forms of O-glucose by generating extracted ion chromatograms using the masses of the unglycosylated peptide for the search.
Relative levels of the predicted glycoforms for each glycopeptide were analyzed using extracted ion chromatograms (EIC) for the appropriate parent ions. Glycosylation of peptides frequently suppresses ionization, so these analyses may underestimate the ratio of glycosylated peptide to naked peptide. Nonetheless, previous analysis of several O-glucosylated peptides prepared from Drosophila Notch expressed in control or Rumi-knockdown S2 cells showed that the decrease in intensity of O-glucosylated peptide corresponded quite well to the increase in naked peptide (51). Thus, in our experience, O-glucosylation does not cause significant suppression of ionization.
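One way to turn EIC peak areas into relative glycoform levels is sketched below. This is an illustration under stated assumptions, not the procedure of the Data Analysis Tool: it assumes all glycoforms ionize equally well (the caveat discussed above), and the function names and the "naked" key are hypothetical.

```python
def glycoform_fractions(eic_areas):
    """Convert EIC peak areas (one per glycoform) into fractions of the total."""
    total = sum(eic_areas.values())
    if total <= 0:
        raise ValueError("total EIC area must be positive")
    return {name: area / total for name, area in eic_areas.items()}


def site_occupancy(eic_areas, unmodified_key="naked"):
    """Fraction of the peptide that carries any glycan at the site."""
    fractions = glycoform_fractions(eic_areas)
    return 1.0 - fractions.get(unmodified_key, 0.0)
```

With illustrative (made-up) areas of 10 for the naked peptide, 10 for the monosaccharide form, and 80 for the trisaccharide form, `site_occupancy` returns 0.9.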
Electron Transfer Dissociation (ETD) Mass Spectrometry-ETD transfers electrons from an anionic reagent (fluoranthene) to protonated peptides, causing fragmentation into c- and z-ions. This lower energy fragmentation method leaves the labile O-glycan modification attached to its hydroxy amino acid, thereby retaining positional sequence information of the modification within a given peptide. Samples were characterized by nano-LC-MS/MS using an Agilent 6340 ion trap mass spectrometer equipped with ETD for peptide sequence analysis by MS/MS with glycans remaining intact. Parameters were as follows: reactant temperature, 60°C; ionization energy, 80 eV; emission current, 2 µA; ionization chamber, −3.7 V, with an accumulation time of 25-35 ms and trap drive of 25.
Mutagenesis of O-Glucosylation Sites in Mouse Notch1-The site-directed mutagenesis of all O-glucose sites in mN1 was performed using as a template a full-length construct of mN1 in which the C-terminal PEST domain was replaced with six tandem Myc epitopes (Notch1-Myc6) in the vector pCS2+ (a generous gift from Dr. Raphael Kopan, Washington University School of Medicine). All glycosylation sites (serine residues) were mutated to alanine using PCR-based mutagenesis, and the mutated constructs were sequenced to confirm the presence of the mutation (see supplemental Table 2 for a list of primers used). For EGF 28, the Ala residue was reverted to Ser to demonstrate that the aberrant signaling was not due to any other inadvertently introduced mutation in Notch. All mutations were confirmed by sequencing.
Co-culture Assay-This assay was adapted from our previously described co-culture assay for NIH3T3 cells (25). 1.0 × 10^5 NIH3T3 cells were seeded in a 24-well tissue culture plate and transiently transfected using Lipofectamine 2000 (Invitrogen), with 0.4 µg of wild type or mutant mouse Notch1-pCS2+ vector, 0.15 µg of TP-1 luciferase reporter construct (Ga981-6, a gift from Dr. Georg Bornkamm, Munich, Germany), and 0.15 µg of gWIZ β-galactosidase construct (Gene Therapy Systems) for transfection efficiency normalization. 4 h post-transfection, cells were allowed to recover in fresh DMEM. To begin co-culture, L-cells (control) or L-cells expressing Delta-like 1 or Jagged1 (kind gift of Dr. Gerry Weinmaster, UCLA) were overlaid on the transfected NIH3T3 cells at a density of 1.0 × 10^5 cells/well. 24 h post co-culture, cell lysates were prepared as described previously (25). All co-cultures were carried out in triplicate, with assays showing significant changes in Notch1 activation carried out at least twice.
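The normalization implied by the β-galactosidase co-transfection can be sketched as follows. The text does not spell out the exact calculation, so this is an assumed, commonly used scheme: each luciferase reading is divided by the β-galactosidase reading from the same well, and ligand-induced activation is expressed relative to co-culture with control L-cells. All names and numbers are hypothetical.

```python
from statistics import mean


def normalize(luciferase, beta_gal):
    """Divide each luciferase reading by the beta-galactosidase reading
    from the same well to correct for transfection efficiency."""
    if len(luciferase) != len(beta_gal):
        raise ValueError("each luciferase reading needs a matching beta-gal reading")
    return [luc / gal for luc, gal in zip(luciferase, beta_gal)]


def fold_activation(ligand_wells, control_wells):
    """Mean normalized activity with ligand cells relative to control L-cells."""
    return mean(ligand_wells) / mean(control_wells)


# Illustrative (made-up) triplicate readings.
control = normalize([1000, 1100, 900], [50, 55, 45])
ligand = normalize([6000, 6600, 5400], [50, 55, 45])
activation = fold_activation(ligand, control)
```

The same two functions could be applied to each Notch1 mutant so that mutant and wild-type fold activations are compared on the same scale.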
Mutation of Factor VII EGF Repeat-Mutation at the predicted O-glucose consensus site in human factor VII EGF was introduced by a conventional PCR-based site-directed mutagenesis method, converting serine 52 to threonine (S52T). PCR was performed using the pET-20b(+) plasmid containing factor VII EGF as a template and the following primers: 5′-GACCAGTGTGCCACGAGTCCATGCCAG-3′ and 5′-CTGGCATGGACTCGTGGCACACTGGTC-3′. Successful mutation was confirmed by DNA sequencing. Expression and purification of wild-type and mutated factor VII EGF proteins were performed as described previously (30). As a final purification step, reverse phase HPLC was carried out. The properly folded factor VII EGF repeats were identified by their ability to be O-fucosylated by Pofut1 in vitro (48). The final concentration of factor VII EGF was determined by a BCA assay using BSA as a standard.
Protein O-Glucosyltransferase Assay-Protein O-glucosyltransferase assays were performed with slight modification as described previously (48). Briefly, 10 µl of reaction mixture contained 50 mM HEPES, pH 7.0, 10 mM MnCl2, the indicated amounts of Factor VII EGF, 0.16 µM (0.01 mCi/ml) UDP-[3H]glucose, 10 µM UDP-glucose, 0.5% Nonidet P-40, and partially purified protein O-glucosyltransferase from mouse brain extracts. The reaction was incubated at 37°C for 1 h and stopped by adding 900 µl of 100 mM EDTA, pH 8.0. The sample was loaded onto a C18 cartridge (100 mg). After the cartridge was washed with 5 ml of H2O, the EGF repeat was eluted with 1 ml of 80% methanol. Incorporation of [3H]glucose into the EGF repeat was determined by scintillation counting of the eluate. Reactions without substrates were used as background control.
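To relate scintillation counts to the amount of glucose transferred, one needs the specific activity of the UDP-glucose pool after dilution with unlabeled donor. The sketch below works this out for the reaction described above, reading the donor concentrations as micromolar and the volume as microliters (an assumption about the units), and assuming counts have already been converted to dpm with a known counting efficiency. Function names are hypothetical.

```python
DPM_PER_MICROCURIE = 2.22e6  # 1 Ci = 3.7e10 decays/s = 2.22e12 dpm


def specific_activity_dpm_per_pmol(label_mci_per_ml, labeled_um, cold_um, volume_ul):
    """Specific activity of the mixed (labeled + unlabeled) UDP-glucose pool."""
    microcuries = label_mci_per_ml * 1000.0 * (volume_ul / 1000.0)  # uCi in the reaction
    total_dpm = microcuries * DPM_PER_MICROCURIE
    total_pmol = (labeled_um + cold_um) * volume_ul  # uM x ul = pmol
    return total_dpm / total_pmol


def pmol_incorporated(sample_dpm, background_dpm, dpm_per_pmol):
    """Background-corrected glucose incorporation in pmol."""
    return max(sample_dpm - background_dpm, 0.0) / dpm_per_pmol
```

Under these assumptions, a 10 µl reaction with 0.16 µM labeled and 10 µM unlabeled UDP-glucose at 0.01 mCi/ml contains about 101.6 pmol of donor at roughly 2185 dpm/pmol.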

Mouse Notch1 Is Extensively Modified with O-Glucose in Lec1-CHO Cells-We have previously shown that endogenous Notch1 is modified with O-glucose trisaccharide in Lec8-CHO cells and that most of the O-glucose is in the trisaccharide form (16). However, it is not known which of the predicted O-glucose sites are modified nor if the structure varies from one site to another. Mouse Notch1 contains 16 predicted O-glucose sites in its extracellular domain (Fig. 1A). To determine whether O-glucose modifications are spread across the entire ECD, as suggested by the distribution of predicted O-glucose sites (Fig. 1A), we expressed fragments of mN1 ECD containing EGF repeats 1-5, 6-10, 11-15, 16-18, 19-23, 24-28, and 29-36 with C-terminal Myc and His6 tags and N-terminal secretion signal sequence, as described previously (17).
The fragments were stably expressed in Lec1-CHO cells, and the cells were metabolically radiolabeled with [3H]galactose. Because galactose is converted to UDP-galactose and then epimerized to UDP-glucose in cells, radiolabeling with [3H]galactose permits specific labeling of the UDP-glucose pool without labeling all of the other products of glucose metabolism in cells (56). Because Lec1-CHO cells do not synthesize complex-type N-glycans (57), the radioactivity should be preferentially incorporated into O-glucose structures in Lec1-CHO cells. Radiolabeled Notch1 fragments were purified from conditioned medium using nickel-NTA-agarose and analyzed by SDS-PAGE followed by Western blot and fluorography (Fig. 1B). Under these conditions, each of the fragments expressed in Lec1-CHO cells was radiolabeled.
To confirm that the radiolabel incorporated into each Notch fragment was in the form of O-glucose and to address the question of which glycan structures are present on each of the Notch fragments, we released the O-linked sugars from the purified, radiolabeled fragments by alkali-induced β-elimination and analyzed the products by gel filtration chromatography (16,54). We found that both trisaccharide and monosaccharide species of O-glucose were present on all fragments, but trisaccharide was clearly the predominant species (Fig. 2). The radiolabel in each of the monosaccharide and trisaccharide peaks was confirmed to be in the form of glucitol (the expected product from β-elimination of O-glucose) using high pH anion exchange chromatography-pulsed amperometric detection following acid hydrolysis (data not shown) (16). These results indicate that O-glucose trisaccharide modifications are found at many sites scattered across the mN1 ECD.
Additionally, some of the fragments contained larger, radiolabeled oligosaccharides, especially abundant in the fragments EGF 16-18 and 29-36 (marked with an asterisk in Fig. 2). Analysis of these oligosaccharide species by high pH anion exchange chromatography following acid hydrolysis revealed the presence of radioactive galactose (not glucitol), a building block of mucin-type oligosaccharides and glycosaminoglycan cores (data not shown). Thus, these high molecular weight species represent galactose-containing species rather than O-glucose. Similar results were obtained in the original analysis of endogenous Notch1 in Lec8-CHO cells (16). It is not clear whether the material excluded from the gel filtration column was attached directly to the Notch fragments or to co-purifying material. If it was attached to co-purifying material, this suggests that some Notch fragments may interact with this material more than others. Further work is being done to examine the structure and source of this material.
Site-specific Mapping of O-Glucose Modifications Using Mass Spectrometry-To examine whether specific O-glucosylation sites were modified, we turned to mass spectral methodologies similar to those we have used to map O-fucose modification sites on Notch (34). We initially analyzed the same Notch fragments used in the radiolabeling experiments shown in Figs. 1 and 2, although larger portions of mN1 (e.g. EGF 1-18) were also analyzed to confirm that utilization of glycosylation sites was not affected by the size of the fragment being analyzed. Modified peptides were identified by neutral loss of the O-glucose saccharides upon low energy CID fragmentation. An example of this analysis, performed on the EGF 1-5 fragment from mN1 expressed in Lec1-CHO cells, is shown in Fig. 3. The top panel shows the base peak chromatogram (BPC), a measure of the most abundant ions eluting from the reverse-phase HPLC at each time point (Fig. 3A). The data were searched for ions that lose an O-glucose trisaccharide (the most abundant form of O-glucose; Fig. 2) upon fragmentation (constant neutral loss). Fig. 3B shows a constant neutral loss search of 142 Da (loss of the O-glucose trisaccharide, 426 Da, from a triply charged peptide). A single molecular ion was found in this search eluting at 53.5 min, and both the MS spectrum and the CID fragmentation (MS/MS) spectrum for this ion are shown (Fig. 3D). The MS spectrum (Fig. 3D, top) reveals an ion (m/z 1254.5) corresponding to a triply charged tryptic peptide containing an O-glucosylation site from EGF repeat 4 modified with the O-glucose trisaccharide. Fragmentation (Fig. 3D, bottom) results in major product ions corresponding to sequential loss of a pentose (xylose), another pentose (xylose), and a hexose (glucose) from the glycopeptide. These data indicate the presence of the O-glucose trisaccharide on this peptide.
The most abundant product ion (m/z 1112.9) is the unglycosylated peptide. The mass of this ion corresponds to the triply charged form of a predicted tryptic peptide from EGF repeat 4 of mN1 that contains an O-glucose consensus sequence (in boldface type): 137 SCQQADPCASNPCANGGQCLPFESSYICR 165 (see Table 1 for predicted masses).
Some fragmentation of the peptide (after loss of the sugars) also occurs, resulting in b- and y-ions that confirm the identification of the peptide (several y-ions are indicated in Fig. 3D, bottom). We do not reliably see b- or y-ions still modified with the glycans, so assignment of the modification to a specific residue is difficult to deduce from the data. Nonetheless, the presence of the O-glucose consensus site (C1XSXPC2) within the peptide is highly suggestive of the modified residue. Our analysis has revealed that the fragmentation pattern obtained here (major product ions resulting from the sequential losses of the two pentoses and a hexose) is diagnostic for identification of O-glucose trisaccharide-modified peptides. To search for the same peptide modified with O-glucose mono- or disaccharide, we searched the MS/MS data for the unglycosylated peptide (m/z 1112.8, Fig. 3C, EIC). Two closely eluting peaks were detected (labeled D and E in Fig. 3C). The larger peak, D, is the same species shown in Fig. 3D (peptide modified with O-glucose trisaccharide). The smaller peak, E, corresponds to the peptide modified with O-glucose monosaccharide. The MS scan and CID fragmentation of E is shown in Fig. 3E. This result is consistent with the presence of O-glucose monosaccharide on EGF 1-5 in the radiolabeling experiments (Fig. 2, EGF 1-5).
SEPTEMBER 9, 2011 • VOLUME 286 • NUMBER 36

O-Glucose Modifies Notch1 at High Stoichiometry
The peptide with O-glucose monosaccharide elutes slightly later (53.9 min) than the trisaccharide form (53.5 min), consistent with the absence of the xyloses.
By searching for neutral loss of the O-glucose saccharides with the characteristic losses of xylose, xylose, and glucose (as shown in Fig. 3), 16 O-glucose-modified peptides were identified (Fig. 3 and supplemental Fig. S1). The peptides were initially identified based on the characteristic fragmentation pattern of the O-glucose trisaccharide (as in Fig. 3D). The masses of the unglycosylated peptide were then matched to predicted masses of tryptic (or other proteases) peptides from mN1 that contain O-glucose consensus sites (Table 1). The identity of each peptide was confirmed by identifying unique b- and y-ions in the MS/MS spectra. One peptide, from EGF repeats 15 and 16, carried both the O-glucose trisaccharide and an additional HexNAc modification (Table 1 and supplemental Fig. S1F). Sequential loss of masses corresponded to the masses of GlcNAc, Xyl, Xyl, and Glc and totaled a loss of 629 Da. Either HexNAc or xylose can be lost first, indicating that both are terminal. Because this peptide contains the region predicted to be modified by O-GlcNAc in EGF repeat 15 (58), we believe the HexNAc is an O-GlcNAc modification, but this will require further verification. Several ions were identified corresponding to peptides containing both O-glucose and O-fucose modifications (EGF repeats 12, 20, 21, 27, and 31; Table 1 and supplemental Fig. S1). These peptides showed sequential losses of masses corresponding to fucose, xylose, xylose, and glucose, for a total loss of 572 Da. Either fucose or xylose can be lost independently, indicating that they are unlinked. A number of additional O-fucose sites were also mapped during this study using similar methods (EGF repeats 2, 3, 5, 12, 20, 21, 27, 31, and 35; Tables 1 and 2 and supplemental Fig. S1).
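The total losses quoted above (629 and 572 Da) are sums of the individual sugar losses, so an observed total loss can be matched against a small table of candidate combinations. The sketch below is illustrative only; it uses nominal masses, considers at most one extra O-fucose or O-HexNAc alongside an O-glucose glycoform, and all names are hypothetical.

```python
# Nominal neutral losses (Da).
O_GLUCOSE = {"Glc": 162, "Xyl-Glc": 294, "Xyl-Xyl-Glc": 426}
EXTRAS = {"Fuc": 146, "HexNAc": 203}


def candidate_losses():
    """All single modifications plus each O-glucose form paired with one extra."""
    table = dict(O_GLUCOSE)
    table.update(EXTRAS)
    for glc_name, glc_mass in O_GLUCOSE.items():
        for extra_name, extra_mass in EXTRAS.items():
            table[f"{glc_name} + {extra_name}"] = glc_mass + extra_mass
    return table


def assign_loss(total_loss_da, tolerance=1.0):
    """Names of candidate modifications whose mass matches the observed loss."""
    return sorted(
        name
        for name, mass in candidate_losses().items()
        if abs(mass - total_loss_da) <= tolerance
    )
```

Here `assign_loss(629)` returns the trisaccharide plus HexNAc and `assign_loss(572)` the trisaccharide plus fucose; a real assignment would still need the sequential loss pattern and peptide b- and y-ions for confirmation.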
All 16 peptides identified in this manner contain the consensus sequence C1XSXPC2 (Table 1), indicating that the consensus sequence is a useful prediction tool for identification of potentially O-glycosylated proteins. EIC searches for unglycosylated peptides (as in Fig. 3C) revealed that the trisaccharide form of O-glucose predominates on the samples analyzed (data not shown), consistent with the radiolabeling results shown in Fig. 2. One notable exception was EGF 27, which is modified by O-glucose as well as O-fucose. The most abundant form of this peptide contained the O-fucose monosaccharide with no O-glucose modification (Fig. 4A). Smaller amounts were observed bearing the O-glucose trisaccharide and O-fucose monosaccharide and even less with the O-glucose monosaccharide alone or naked peptide. These results indicate that this peptide is O-fucosylated at high stoichiometry but O-glucosylated at substoichiometric levels. Alternatively, the presence of the O-glucose trisaccharide could suppress ionization, although we have not observed such suppression on other peptides. In contrast, the glycopeptide from EGF 12 also contains dual O-glycosylation sites, but in this case, the predominant species contains both the O-glucose trisaccharide and the O-fucose monosaccharide (Fig. 4B). EGF 20, 21, and 31 also have dual O-glycosylation sites, and, like EGF 12, the major species is fully extended O-glucose trisaccharide with O-fucose monosaccharide (data not shown).
To determine whether the high stoichiometry of O-glucose site occupancy was cell type-specific, we analyzed the O-glucosylation of EGF 1-5 expressed in several other cell lines (COS7, HEK293T, and NIH3T3). In contrast to what we observed in Lec1-CHO cells (or HEK293T cells; not shown), a significant amount of unglycosylated peptide from EGF 4 could be detected in the samples from COS7 cells, suggesting underglucosylation of this site (Fig. 4, C and D). Interestingly, there is very little monosaccharide present in samples from either cell line, suggesting that although the xylosyltransferases extend O-glucose with high stoichiometry, there is a degree of variability in stoichiometry of O-glucosylation that is cell- or tissue type-dependent.
identified with O-fucose modifications
Peptides were identified by neutral loss of a mass corresponding to fucose as shown in Fig. 3. Spectra for each glycopeptide identified here are shown in supplemental Fig. S1, P-S. All masses were converted to the equivalent of singly charged (M + H)+ for the table. For each glycopeptide, the mass of the parent ion and the fully deglycosylated product and the difference between these (corresponding to the mass of the modification: O-fucose monosaccharide, 146 Da) are shown. The predicted mass of the unglycosylated peptide is also shown. All peptide masses are adjusted for carbamidomethylation of cysteines. For peptides with a mass below 2000 Da, monoisotopic masses were used. For those above 2000 Da, average masses were used. Predicted O-fucose modification sites are in boldface type and underlined (T).

A Non-traditional Consensus Site (C1ASAAC2) Is Modified with O-Glucose Trisaccharide-While performing neutral loss searches for O-glucosylated peptides, we discovered a 17th glycopeptide that did not correspond to any of the masses of peptides predicted to be O-glucosylated based on the consensus sequence C1XSXPC2. When the search was expanded to include masses of "non-glycopeptides" produced by chymotryptic digest of mouse Notch1, the glycopeptide was identified as including the cysteine 1 to cysteine 2 region of EGF 9 (Fig. 5A). To verify that the fragmentation pattern was in fact that of O-glucosylated EGF 9, the sample was subjected to MS2 and constant neutral loss-triggered MS3, both of which showed strong series of y-ions that correspond to the fragmentation pattern of EGF 9 (Fig. 5, B and C). Because the chymotryptic peptide contained several hydroxy amino acids, we wanted to confirm the position of O-glucosylation within the peptide by a second mass spectral method. Using ETD, we were able to map the modification to the only hydroxy amino acid (serine) between cysteines 1 and 2 of EGF 9 (Fig. 5, D and D′). We next examined the extent of elongation of the O-glucose at this novel site by EIC searches of the data. Like the majority of the traditional consensus sites, the glycopeptide including EGF 9 is O-glucosylated and also elongated to trisaccharide with high stoichiometry (Fig. 5E).
These results suggest that the proline in the consensus sequence may not be essential for O-glucosylation.
Mouse Notch1 contains one other similar site in EGF 35 without the proline, C 1 GSLRC 2 . Although no additional ions with neutral losses of 426 Da (trisaccharide) were detected during our neutral loss searches, we searched the data for peptides from EGF 35. Only the unmodified peptide was detected (Fig. 5, F and G). EIC searches for the masses of the theoretical trisaccharide, disaccharide, and monosaccharide glycoforms of this peptide were unsuccessful. We have also confirmed that a similar site in EGF 2 from Drosophila Notch (C 1 NSMRC 2 ) is unmodified (data not shown). These results reveal that alanine, but not arginine, can substitute for proline in the consensus sequence for O-glucose.
O-Glucosylation Only Occurs on Serine-Our analysis of O-glucose modification sites supports the proposal that only serine can be modified with O-glucose. To directly test this hypothesis, we mutated the serine in the O-glucose consensus sequence to a threonine within a bacterially expressed form of an EGF repeat from human factor VII. We have used this EGF repeat in previous studies as an acceptor substrate for protein O-glucosyltransferase (48). Interestingly, the threonine mutant did not serve as an acceptor substrate for protein O-glucosyltransferase in in vitro assays (Fig. 6A). Proper folding of the mutated EGF repeat was confirmed by demonstrating that it functions as well as the wild type EGF repeat in assays with Pofut1 (Fig. 6B), which only fucosylates properly folded EGF repeats. These results strongly support the concept that O-glucosylation only occurs on serine residues in the context of the C 1 XSX(P/A)C 2 consensus sequence. Consistent with this finding, we found that EGF repeat 21 from Drosophila Notch, which contains a threonine instead of serine in the conserved position (CVTNPC), is not O-glucosylated (data not shown).
Examination of Other Potential O-Glucose Sites-We have examined whether closely related sites can be O-glucosylated, including several in which serine residues are found in a location between cysteines 1 and 2 of the EGF repeat different from that defined by the consensus sequence. EGF 6 from mN1 contains the sequence C 1 SPSPC 2 but is unmodified (data not shown). Similarly, Drosophila Notch contains three EGF repeats with serine in a non-consensus location, none of which are O-glucosylated (EGF 6, C 1 SPSPC 2 ; EGF 23, C 1 SLSSPC 2 ; and EGF 36, C 1 SPNPC 2 ) (data not shown). Therefore, the revised consensus sequence accurately predicts the positions of the modified Ser residue and Ala/Pro residue between cysteines 1 and 2 of EGF repeats, and spatial positioning of these amino acids is essential for modification.
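The positional logic of the original and revised consensus sequences can be illustrated with a short pattern-matching sketch. This is not the procedure used in this study: the Python regular expressions, the function name `classify`, and the treatment of X as any single residue are illustrative assumptions, and the sketch does not check that a segment lies between cysteines 1 and 2 of a properly folded EGF repeat. The segments are those quoted in the text.

```python
import re

# Patterns written from the consensus sequences quoted in the text.
# Assumption: "X" is any single residue; EGF-repeat context is not checked.
ORIGINAL = re.compile(r"C.S.PC")     # C1-X-S-X-P-C2
REVISED = re.compile(r"C.S.[PA]C")   # C1-X-S-X-(P/A)-C2


def classify(segment: str) -> str:
    """Report which consensus (if any) a C1-to-C2 segment satisfies."""
    if ORIGINAL.fullmatch(segment):
        return "original"
    if REVISED.fullmatch(segment):
        return "revised-only"
    return "none"


# C1-to-C2 segments mentioned in the text (labels are descriptive only).
segments = {
    "mN1 EGF 9": "CASAAC",
    "mN1 EGF 35": "CGSLRC",
    "mN1 EGF 6": "CSPSPC",
    "Drosophila Notch EGF 2": "CNSMRC",
    "Drosophila Notch EGF 21": "CVTNPC",
    "Drosophila Notch EGF 23": "CSLSSPC",
    "factor VII EGF, wild type": "CASSPC",
    "factor VII EGF, S52T": "CATSPC",
}

for label, seq in segments.items():
    print(f"{label:28s} {seq:8s} {classify(seq)}")
```

Under these assumptions, only the factor VII wild-type segment satisfies the original pattern, and EGF 9 is the only additional segment admitted by the revised pattern, in agreement with the modification status reported above.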

Both O-glycan monosaccharide and naked peptide species are not detected, indicating that both sites are extensively modified. C, mouse Notch1 EGF 1-5 was expressed and purified from Lec1-CHO cells and analyzed by LC-MS/MS as described in Fig. 3. EIC searches for the ions corresponding to a peptide from EGF 4 modified with O-glucose trisaccharide (red), monosaccharide (blue), and naked peptide (black) show that EGF 4 is highly O-glucosylated and extended to trisaccharide in Lec1-CHO cells, with naked peptide and monosaccharide species at almost undetectable levels, indicating that glucosylation and xylosylation are occurring at high efficiency. D, mouse Notch1 EGF 1-5 was expressed and purified from COS7 cells and analyzed as in C. Parallel EIC searches for ions corresponding to the same EGF 4 peptide (trisaccharide (red), monosaccharide (blue), and naked peptide (black)) reveal a significant amount of naked peptide in COS7 cells, indicating underglucosylation of EGF 4 in these cells. Monosaccharide species is almost undetectable here, suggesting that xylosylation is efficient. Gray line, peptides; blue circle, glucose; orange star, xylose; red triangle, fucose.

Individual Mutation of Most O-Glucose Sites Has Little or No Effect on Delta-like 1- and Jagged1-mediated Notch1 Signaling Except for EGF 28-In order to determine which O-glucose modifications influence Notch signaling, we mutated individual O-glucose modification sites and tested them in a cell-based Notch signaling assay (25). All 16 conserved O-glucosylation sites, as well as the newly discovered site at EGF 9, were mutated individually (Ser to Ala). The effects of the mutations on cell surface expression of mN1 were examined using a cell surface biotinylation assay, to establish whether point mutations cause misfolding of the receptor. All mutants were expressed on the cell surface at a level similar to that of the wild type protein (Fig. 7A). The site mutants, as well as wild-type Notch1 and an empty vector control, were then used in NIH3T3 cell co-culture assays as described under "Experimental Procedures." The analysis revealed that elimination of the O-glucose site in EGF 28 reduced Delta-like 1-mediated Notch1 activation to nearly background levels relative to wild type Notch1 (Fig. 7, B and C). None of the O-glucose site mutants resulted in a statistically significant reduction of Jagged1-mediated Notch activation (Fig. 7, D and E). Mutations in other individual sites did not have any apparent effect on Notch signaling, including mutation in EGF 12. A revertant mutation (Ala to Ser) using the EGF 28 site mutant as the template was generated to ensure that the effect on Notch1 activity was not due to inadvertently introduced random mutations. This revertant exhibited the same level of activation as the wild type Notch1, confirming that the effect was specifically due to the mutation of the O-glucosylation site (Fig. 7, F and G). Because all mutants, including EGF 28, were expressed on the cell surface at a level similar to that of wild type protein (Fig. 7A), this suggests that loss of O-glucosylation on EGF 28 is affecting some other step in Notch activation.

DISCUSSION
The consensus sequence for O-glucosylation originally proposed by Nishimura et al. (44) in 1989 was formulated based on a comparison of the primary sequence of three glycoproteins confirmed to be O-glucosylated from humans and cows: blood coagulation factors VII and IX and protein Z. Over 40 mammalian proteins are predicted to be O-glucosylated based on this consensus sequence, of which the Notch family of receptors contains the greatest number of predicted sites (supplemental Table 1). Although we have previously shown Notch to be O-glucosylated, little was known to date about site occupancy and glycan structure at the 16 predicted sites on mN1 or to what extent all or a subset of these sites play a role in Notch signaling. The results presented here confirm that the mouse Notch1 extracellular domain is heavily decorated with O-glucose saccharides. Using the consensus sequence C 1 XSXPC 2 , we have identified peptides spanning the cysteine 1 to cysteine 2 region of all 16 EGF repeats of mouse Notch1 predicted to be modified by Poglut/Rumi, and confirmed that all 16 bear the O-glucose-xylose-xylose trisaccharide, as detected using mass spectrometry (summarized in Fig. 8A). Identification of ions that correspond to peptides containing the consensus site C 1 XSXPC 2 and that are greater by 426 Da than the predicted mass of the naked peptide in an MS scan, with subsequent loss of two xyloses and a hexose observed in MS/MS, is sufficient to determine the sites of modification.
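The 426-Da mass shift follows from the composition of the trisaccharide: one hexose (glucose, 162 Da as a glycosidic residue) plus two pentoses (xylose, 132 Da each). The sketch below illustrates this bookkeeping with nominal residue masses. The function name `assign_glycoform`, the 0.5-Da tolerance, and the 1500-Da peptide mass are hypothetical values chosen only for illustration and are not taken from the data presented here.

```python
# Nominal glycosidic residue masses in Da (monosaccharide minus water).
HEXOSE = 162.0       # glucose
PENTOSE = 132.0      # xylose
DEOXYHEXOSE = 146.0  # fucose; the O-fucose monosaccharide shift, shown for comparison

# Mass added to the naked peptide by each O-glucose glycoform.
GLYCOFORMS = {
    "naked peptide": 0.0,
    "Glc": HEXOSE,
    "Xyl-Glc": HEXOSE + PENTOSE,
    "Xyl-Xyl-Glc": HEXOSE + 2 * PENTOSE,
}


def assign_glycoform(observed_mass, naked_mass, tol=0.5):
    """Return the O-glucose glycoform whose mass shift matches, or None."""
    delta = observed_mass - naked_mass
    for name, shift in GLYCOFORMS.items():
        if abs(delta - shift) <= tol:
            return name
    return None


# Hypothetical peptide of 1500 Da (singly charged equivalent).
naked = 1500.0
print(GLYCOFORMS["Xyl-Xyl-Glc"])                      # trisaccharide shift
print(assign_glycoform(naked + 426.0, naked))         # trisaccharide form
print(assign_glycoform(naked + DEOXYHEXOSE, naked))   # not an O-glucose glycoform
```

In practice the assignment also relies on the sequential loss of the two xyloses and the hexose in MS/MS, which a mass difference alone does not capture.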
Although the C 1 XSXPC 2 consensus sequence is a useful tool in predicting sites of O-glucose modification, using our constant neutral loss search method, we discovered an additional 17th glycopeptide whose naked peptide mass did not correspond to the masses of peptides that included any of the 16 predicted sites. In silico digest indicated that the glycopeptide corresponded in mass to a non-traditional site at EGF 9, C 1 ASAAC 2 . To verify the fragmentation pattern, MS2 was repeated, and MS3 was carried out in CID mode. In addition, ETD confirmed the hydroxy amino acid within the peptide bearing the O-glucose trisaccharide was between cysteines 1 and 2 of EGF 9. Because the original consensus sequence missed at least one modified site, the remaining 19 nonconsensus EGF repeats were examined to determine if any other amino acids besides Pro and Ala are acceptable N-terminal to cysteine 2. Of these 18, only EGF 35 maintained a serine in the appropriate position without a proline residue before cysteine 2 (C 1 GSLRC 2 ), but it was only detected as naked peptide. Similarly, Drosophila Notch EGF 2 has an Arg before cysteine 2 (C 1 NSMRC 2 ) and has only been detected as naked peptide.
In light of these analyses, we propose refinement of the consensus sequence to allow alanine or proline, but not arginine, N-terminal to C 2 . Alanine is a non-conservative substitution for proline, and we do not yet know whether O-glucosylation of EGF 9 is context-specific or if alanine is permitted at this position in other EGF repeats. Several amino acids exist at this position in other potential sites (e.g. mouse Notch4 contains a serine within the potential site at EGF 20 (C 1 VSASC 2 ), and Drosophila Notch contains a glycine in a potential site at EGF 1 (C 1 TSVGC 2 )). Further analysis needs to be done to determine what amino acids N-terminal to cysteine 2 are permitted for O-glucosylation to occur, although our data suggest that bulky, charged residues, such as arginine, are disallowed. In addition, the fact that the analyses described here were performed using fragments of mouse Notch1 overexpressed in tissue culture cells could also affect our results. The glycosylation pattern may be different on full-length protein. We have begun mass spectral site mapping on full-length mouse Notch1 immunoprecipitated from COS7 cells, and the preliminary analysis confirms many of the sites and structures mapped here (data not shown). Because our neutral loss search approach allowed us to discover a novel non-consensus site that would otherwise have been overlooked, we have confidence that our site-mapping approach will be able to identify any additional novel sites should they exist.

A, protein O-glucosyltransferase assays were performed as described (48) using crude lysates of mouse brain as enzyme source and increasing amounts of bacterially expressed factor VII EGF repeat (black) or a mutated form of the EGF repeat, S52T (red), where the Ser in the O-glucose consensus sequence was replaced with a Thr, converting C 1 ASSPC 2 to C 1 ATSPC 2 . B, Pofut1 assays were performed as described (19) using 4 μM wild-type (black) or mutated (red) EGF repeat. All assays were performed in duplicate. Error bars, range of duplicates.

FIGURE 7. O-Glucose site mutation in EGF 28 of mouse Notch1 alters Notch activation. A, to evaluate cell surface expression, NIH3T3 cells were transiently transfected with wild type Notch1-Myc 6 or Notch1-Myc 6 O-glucose site mutants. 24 h post-transfection, cell surface biotinylation was carried out as described (25). Cell lysates and streptavidin-bound fractions were analyzed by immunoblot using an anti-Myc antibody to detect transfected Notch1. Immunoblotting with anti-pan-cadherin and anti-β-actin were used as positive and negative controls, respectively (not shown). B-G, single O-glucose site mutants were evaluated in comparison with wild type mouse Notch1 in a co-culture signaling assay, where NIH3T3 cells were transiently transfected with mN1 plasmids and co-transfected with luciferase to measure Notch1 activation by Delta-like1 (B, C, and F) or Jagged1 (D, E, and G). The results are expressed as relative luciferase units (RLU), which reflects -fold activation induced by ligand-expressing L-cells over that obtained with control L-cells. The co-cultures in which Notch bearing an O-glucose site mutant resulted in a statistically significant difference in relative luciferase units when compared with wild type, as determined by one-way analysis of variance, are indicated with an asterisk. To ensure that no inadvertent mutations contributed to the decrease in Notch1 activation that resulted from the EGF 28 mutant, this mutant was reverted to wild type and used in the co-culture assays (F and G). FL, full-length Notch1; TM/ICD, transmembrane and intracellular domain of Notch1.
In further testing the consensus sequence, we decided to explore whether serine is the only hydroxy amino acid modified with O-glucose, and whether the position of the serine within the consensus sequence is critical for O-glucosylation. To accomplish this, we took two approaches. The direct approach involved introducing a Ser to Thr mutation within human factor VII, a protein extensively studied and known to be O-glucosylated (48). Although properly folded, the Thr mutant did not serve as an acceptor substrate for O-glucosylation in vitro. The second approach involved examination of non-consensus site EGF repeats that contained hydroxy amino acids in various positions between cysteines 1 and 2 and of EGF repeats that naturally have a Thr instead of Ser two amino acids after cysteine 1. EGF 21 from Drosophila Notch has a Thr instead of serine between cysteines 1 and 2 (CVTNPC), and is not O-glucosylated. EGF 6 and 23, also from Drosophila Notch, contain multiple serine residues between cysteines 1 and 2 flanking a non-hydroxy amino acid (CSPSPC and CSLSSPC) but are also not O-glucosylated. Based on all of these results, we propose the revised consensus sequence, C 1 XSX(P/A)C 2 .
Revision of the consensus sequence reveals new potential O-glucosylation sites on Notch1 and Notch3 homologs (supplemental Table 1). Expanding the search to allow any amino acid N-terminal to cysteine 2 (other than Arg or Cys) reveals 23 additional potential sites, 16 of which are on new proteins not previously predicted to be O-glucosylated. Further studies need to be done to determine whether these additional proteins are in fact O-glucosylated.
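The difference between the revised consensus and the expanded search (any residue other than Arg or Cys N-terminal to cysteine 2) can be sketched as a scan over a protein sequence. The regular expressions, the function name `find_sites`, and the toy sequence are hypothetical illustrations rather than the search used to build supplemental Table 1; a real search would also need to confirm that each hit lies between cysteines 1 and 2 of an EGF repeat.

```python
import re

# Lookahead patterns so that overlapping candidate sites are all reported.
REVISED = re.compile(r"(?=(C.S.[PA]C))")    # C1-X-S-X-(P/A)-C2
EXPANDED = re.compile(r"(?=(C.S.[^RC]C))")  # any residue except Arg or Cys before C2


def find_sites(sequence, pattern):
    """Return (0-based start, motif) for every candidate site in the sequence."""
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence)]


# Toy sequence (hypothetical) embedding two motifs quoted in the text:
# CASAAC (mN1 EGF 9) and CVSASC (mouse Notch4 EGF 20).
toy = "GGCASAACNNCVSASCKK"

print(find_sites(toy, REVISED))
print(find_sites(toy, EXPANDED))
print(find_sites("CGSLRC", EXPANDED))  # Arg before C2 is excluded
```

The expanded pattern reports every site found by the revised pattern plus sites such as C 1 VSASC 2 , whose modification status remains to be determined.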
Our data suggest that the predicted sites, including the novel site at EGF 9, are modified at high stoichiometries. Very little unmodified peptide was typically observed, except at EGF 27 or on EGF 4 when produced in COS7 cells. These results indicate that some EGF repeats may be more efficiently modified than others (e.g. comparison of EGF 27 to others) and that efficiency may also be affected by cell type. We recently showed that the O-glucose site at EGF 16 from mouse Notch2 is similarly substoichiometrically modified (49). Thus, the extent of modification at individual sites may be controlled by both the sequence of an individual EGF repeat and expression levels of Rumi in a given cell. Preliminary data suggest that Rumi levels do vary in different tissues (49).
Elongation of O-glucose past the monosaccharide is also quite efficient. Both the metabolic radiolabeling experiments and EIC comparisons of peptides with monosaccharide versus trisaccharide forms confirm that all sites are modified with trisaccharide. This is consistent with our previous work on full-length endogenous Notch1 (16). These results suggest that the xylosyltransferases present in the cells examined here are also quite efficient. Of course, like the Fringes, there may be tissue-specific differences in their expression levels that alter the extent of elongation in a tissue-specific or developmentally regulated fashion.

Tables 1 and 2 and other sources (17,25). O-Fucose sites that have not been definitively mapped are shaded light red. Elongation of O-fucose past the monosaccharide is based on earlier work (17,25). The brackets indicate that elongation occurs on some or all of the O-fucose sites in that region. Symbols are based on CFG guidelines: glucose (blue circle), xylose (orange star), fucose (red triangle), GlcNAc (blue square), galactose (yellow circle), and sialic acid (purple diamond). B, structure of the EGF 12 from mouse Notch1 with O-fucose disaccharide, as recently determined (65)
The tryptic peptide from EGF 15 and 16 contained multiple O-glycan modifications, including O-glucose trisaccharide (EGF 16) and what appears to be the novel O-GlcNAc monosaccharide (on EGF 15). Although we cannot distinguish between GlcNAc and GalNAc based on mass alone, this modification occurs in the predicted location for O-GlcNAc on EGF repeats (58). O-GlcNAc has long been known to modify a variety of intracellular proteins, but it was recently reported on a non-intracellular protein for the first time when Okajima and co-workers (58) observed this modification on Drosophila Notch. If confirmed, this would be the first evidence of O-GlcNAc as an extracellular modification on a known mammalian protein.
The mutagenesis studies suggest that elimination of most individual O-glucose sites (other than EGF 28) has very little effect on Notch1 activity. Thus, the major effects of O-glucose may be due to elimination of O-glucose at multiple sites, as would occur in Rumi mutants (49,51). It is somewhat surprising that elimination of the O-glucose site in EGF 12 has no effect on Notch activity in cell-based assays. EGF 12 is known to be part of the ligand-binding domain (59,60), and recent modeling studies suggest that the O-glucose modification site sits in the interface between human Notch1 and Jagged1 (61). The site is also present in most but not all known Notch receptors (Fig.  1A). The lack of an effect upon elimination of O-glucose at EGF 12 is in stark contrast to the elimination of O-fucose at EGF 12 (25)(26)(27).
The largest effect on Notch1 activity was observed when O-glucose at EGF 28 was eliminated. The site in EGF 28 is not highly conserved in Notch proteins from different species (Fig.  1A) but is found in human, mouse, and rat Notch1 (16), suggesting that it may have mammal-specific effects. In addition, the effect was seen only with Delta-like 1 as ligand. The EGF 28 mutant responded like wild type to activation with Jagged1. EGF 28 is in the Abruptex region of Notch, named for a series of mutations in Drosophila Notch that result in a hyperactive Notch refractory to Fringe (17,62). Recent structural studies have suggested that the presence of calcium-binding motifs reduces the flexibility of the linker between adjacent EGF repeats (63). Most of the EGF repeats in Notch1 contain the calcium-binding motif, and as a result, the linkers are predicted to be rigid (Fig. 8A). Interestingly, the Abruptex region has several non-calcium-binding EGF repeats and may be a region of flexibility (Fig. 8A). The observation that mutation of O-fucose sites (EGF 26 and 27 (25)) and an O-glucose site (EGF 28) within the Abruptex region affects Notch1 activity in cell-based assays suggests that these glycans may be affecting the flexibility of this region. We and others have previously proposed that the O-fucose glycans may affect the overall conformation of mouse Notch1 ECD (25). Similar proposals have been made regarding Drosophila Notch (60). Support for this idea comes from the temperature-sensitive Notch phenotype in Drosophila Rumi mutants, suggesting the importance of O-glucosylation for the conformational changes of Notch during its activation (51).
The fact that the O-fucose and O-glucose glycans are fairly large with respect to the EGF repeat itself (Fig. 8B) suggests that the presence or absence of these modifications could easily affect the overall conformation of the Notch ECD, especially in these flexible regions. Structural studies will be necessary to determine whether such conformational changes do indeed occur.