Mammalian O-mannosylation of cadherins and plexins is independent of protein O-mannosyltransferases 1 and 2

Protein O-mannosylation is found in yeast and metazoans, and a family of conserved orthologous protein O-mannosyltransferases is believed to initiate this important post-translational modification. We recently discovered that the cadherin superfamily carries O-linked mannose (O-Man) glycans at highly conserved residues in specific extracellular cadherin domains, and it was suggested that the function of E-cadherin was dependent on the O-Man glycans. Deficiencies in enzymes catalyzing O-Man biosynthesis, including the two human protein O-mannosyltransferases, POMT1 and POMT2, underlie a subgroup of congenital muscular dystrophies designated α-dystroglycanopathies, because deficient O-Man glycosylation of α-dystroglycan disrupts laminin interaction with α-dystroglycan and the extracellular matrix. To explore the functions of O-Man glycans on cadherins and protocadherins, we used a combinatorial gene-editing strategy in multiple cell lines to evaluate the role of the two POMTs initiating O-Man glycosylation and the major enzyme elongating O-Man glycans, the protein O-mannose β-1,2-N-acetylglucosaminyltransferase, POMGnT1. Surprisingly, O-mannosylation of cadherins and protocadherins does not require POMT1 and/or POMT2 in contrast to α-dystroglycan, and moreover, the O-Man glycans on cadherins are not elongated. Thus, the classical and evolutionarily conserved POMT O-mannosylation pathway is essentially dedicated to α-dystroglycan and a few other proteins, whereas a novel O-mannosylation process in mammalian cells is predicted to serve the large cadherin superfamily and other proteins.

Protein O-glycosylation of the O-mannose type was originally thought to be found only in yeast and fungi, but studies over the last 30 years have identified O-Man glycans and specific glycoproteins carrying O-Man glycans in humans and rodents (1-9). The basement membrane glycoprotein α-dystroglycan (α-DG) was for some time the only well characterized O-mannosylated protein known in mammals despite evidence that O-Man glycans constitute a major part of the total O-glycans in the brain (1,2,8,9). O-Mannosylation of α-DG is essential for assembly and function of the dystrophin-glycoprotein complex that links the cytoskeleton with the extracellular matrix, and deficiencies in all of the enzymes involved in the O-Man glycosylation underlie a subgroup of congenital muscular dystrophies (10-12). More recently, the human O-Man glycoproteome was characterized, and it was found that the large superfamily of cadherins (cdhs) and protocadherins (pcdhs) is also decorated with α-linked O-Man glycans on extracellular cadherin (EC) domains. The attachment sites of these O-Man glycans appear to be highly conserved throughout evolution (13,14); moreover, O-mannosylation of E-cadherin was suggested to be crucial for E-cadherin-mediated cell adhesion (15,16).
O-Man glycosylation in metazoans is initiated in the endoplasmic reticulum (ER) by transfer of mannose from dolichol monophosphate-activated mannose to serine and threonine by the POMT1 and POMT2 protein O-mannosyltransferases. Our insight into the substrate specificities of these enzymes largely stems from studies of the yeast orthologs, which consist of a larger family of six or more protein O-mannosyltransferases (PMTs) (17,18). These are grouped into three subfamilies, PMT1, PMT2, and PMT4, and the two metazoan orthologs, POMT1 and POMT2, are grouped in subfamilies PMT4 and PMT2, respectively (17,18). This categorization is based on sequence similarities, but a recent study confirmed the functional similarity between the human POMT1 and yeast PMT4 enzymes (19). O-Mannosylation of proteins in yeast is widespread, and we recently characterized the yeast O-Man glycoproteome, identifying almost 300 glycoproteins that enter the secretory pathway (20). In addition, we found that yeast has a unique nucleocytoplasmic O-Man glycoproteome, which is predicted to be glycosylated by one or more as yet unknown cytosolic/nuclear O-mannosyltransferases different from the ER-located PMTs (21). The nucleocytoplasmic O-mannosylation system is only found in yeast and is predicted to serve functions similar to those of the nucleocytoplasmic O-GlcNAcylation found in all eukaryotic cells except yeast (22). The ER-located PMTs in yeast have wide glycosylation functions for ER/Golgi, cell wall, and secreted proteins, similar to the metazoan GalNAc-type O-glycosylation (23), and in fact it appears that the two types of glycosylation overlap considerably in proteins and glycosites as well as in biological functions (20,24). Interestingly, the metazoan orthologs POMT1 and POMT2 are predicted to have narrower substrate specificities and to glycosylate only a limited number of proteins, including α-DG and cdhs/pcdhs (13,15,19). 
However, the functions of the POMT1 and POMT2 isoenzymes and their relationship with the larger family of yeast PMTs are still poorly understood.
In this study, we aimed to explore the biological roles of O-mannosylation of the large families of important cdh and pcdh adhesion proteins. We used a knock-out strategy targeting the POMT1 and POMT2 genes as well as POMGNT1 to deconstruct the genetic regulation of O-mannosylation in two mammalian cell lines. We also used the SimpleCell O-glycoproteomics approach targeting both COSMC and POMGNT1 to demonstrate a predicted interplay between O-Man and O-GalNAc glycosylation in human HEK293 cells. We found, in agreement with previous reports, that both POMT1 and POMT2 were required for O-mannosylation of α-DG; however, to our surprise, deficiency in either or both POMT1 and POMT2 did not affect O-mannosylation of cdhs, pcdhs, and additional proteins. This finding was confirmed by analysis of a human skin fibroblast cell line derived from a POMT1-deficient patient. Thus, our study suggests that the function of the two mammalian POMTs is even more limited than previously predicted and serves α-DG and a few additional proteins, including the mucin-like protein KIAA1549. Most importantly, the results suggest the existence of a previously unknown ER/Golgi-located protein O-mannosylation pathway in mammalian cells that specifically controls O-Man glycosylation of the superfamily of cadherins.

Analyses of O-Man glycoproteins in mammalian cell lines with knock-out of POMGNT1, POMT1, and POMT2
We first generated CHO cells with knock-out of POMGNT1 (CHO PGNT1) (Table 1), termed SimpleCells, which carry truncated O-Man glycans suitable for lectin weak affinity chromatography (LWAC) enrichment of O-Man glycopeptides with the ConA lectin (Fig. 1A). We explored the O-Man glycoproteome and identified a comparatively low number of nine O-Man glycoproteins (Table 2). For seven of these, the human orthologs were previously identified in human MDA231 breast cancer cells (13), whereas two, reticulocalbin 3 and multiple coagulation factor deficiency protein 2-like, were novel. As expected, we found O-Man glycopeptides from α-DG and members of the cdh/pcdh and plexin families. We also identified the apparent CHO ortholog of the KIAA1549 protein, which we previously identified as a membrane protein with a large mucin-like ectodomain densely decorated with O-Man glycans (13). We also analyzed wild-type (WT) CHO cells and, somewhat surprisingly, identified 10 glycoproteins, including α-DG, members of the cdh/pcdh family, and plexins, with non-elongated O-Man monosaccharide structures as produced in CHO PGNT1 cells (Table 2 and supplemental Table S2). This suggested that our analysis of lysates of CHO WT cells included biosynthetic intermediates and/or that O-Man glycans in CHO cells are not fully elongated by the POMGnT1 enzyme.
We proceeded with analysis of the O-Man glycoproteomes in CHO PGNT1 with truncated O-Man glycans containing additional single and double knock-outs of POMT1 and POMT2 (Table 1). In both single POMT1 (CHO PGNT1/POMT1) and POMT2 (CHO PGNT1/POMT2) knock-out cells as well as double POMT1/POMT2 (CHO PGNT1/POMT1/POMT2) knock-out cells, we consistently did not identify O-Man glycopeptides derived from α-DG or KIAA1549 (Table 2 and supplemental Table S2). In contrast, in all mutant cell lines we identified a number of other O-Man glycoproteins, including members of the cdh/pcdh family and plexins as well as PDI-A3 and hepatocyte growth factor receptor (Table 2 and supplemental Table S2). These results confirm previous studies reporting that co-expression of both POMT1 and POMT2 is required for O-mannosylation of α-DG (25), but they also demonstrate that O-mannosylation of a number of other proteins, including the large cdh/pcdh family, may not depend on the POMT1 and POMT2 enzymes.
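The comparison logic in this paragraph can be illustrated with a minimal sketch. It assumes that each cell line is reduced to a set of identified O-Man glycoprotein labels; the labels below are the family-level names mentioned in the text, not the full contents of Table 2, and the function name `compare_identifications` is hypothetical. Note that failure to identify a glycopeptide in a shotgun experiment is not by itself proof of absence, which is why the quantitative analysis described later is needed.

```python
# Family-level labels taken from the text; the complete lists are in Table 2.
parent_line = {"alpha-DG", "KIAA1549", "cdh/pcdh family", "plexins"}
pomt_knockout = {"cdh/pcdh family", "plexins", "PDI-A3", "HGF receptor"}


def compare_identifications(parent_ids, knockout_ids):
    """Split parent-line identifications by whether they survive the knock-out."""
    return {
        # Identified in the parent line but not in the knock-out:
        # candidates for POMT-dependent O-mannosylation.
        "lost_in_knockout": sorted(parent_ids - knockout_ids),
        # Identified in both lines: candidates for POMT-independent
        # O-mannosylation.
        "retained": sorted(parent_ids & knockout_ids),
    }


result = compare_identifications(parent_line, pomt_knockout)
print(result)
```

With these illustrative sets, α-DG and KIAA1549 fall in the lost group, whereas the cdh/pcdh family and plexins are retained.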
In general, the number of O-Man glycoproteins identified in CHO cells by our strategy was quite low compared with our previous study of a human breast cancer cell line (13). We therefore turned to human HEK293 cells and generated isogenic cell lines with knock-out of COSMC (HEK293 SC) and both COSMC and POMGNT1 (HEK293 SC/PGNT1) to establish double SimpleCells with truncated O-Man and O-GalNAc glycans (26). We stacked POMT1 knock-out (HEK293 SC/PGNT1/POMT1) in the double SimpleCell background, where both O-Man and O-GalNAc glycans are truncated, facilitating sensitive analysis of both types of glycoproteomes simultaneously (Fig. 1 and Table 1). α-DG contains a central mucin-like domain with O-Man glycans in the N-terminal region and O-GalNAc glycans in the C-terminal region, and we have readily identified glycopeptides from both regions with our COSMC and POMGNT1 knock-out SimpleCell strategies enriching for O-Man and O-GalNAc (13,23). We also generated POMT1/POMT2 double knock-out (HEK293 POMT1/POMT2) in HEK293 WT cells (Table 1). Using these mutant cells, we identified a total of 63 O-Man glycoproteins following ConA enrichment and mass spectrometry (Table 3). In agreement with our data from MDA231 breast cancer cells (13), we identified O-Man glycosites on α-DG, but the cdh/pcdh family accounted for the majority of O-Man glycoproteins identified in HEK293 SC/PGNT1 cells (supplemental Table S3). In agreement with our studies in CHO cells, we did not detect α-DG O-Man glycopeptides in HEK293 SC/PGNT1/POMT1 cells deficient in POMT1, whereas we readily identified 25 O-Man glycoproteins, of which 20 were members of the cdh/pcdh superfamily (Table 3). Moreover, the same was found for the double POMT1/POMT2 knock-out HEK293 POMT1/POMT2 cell line without truncated O-glycans, where we identified 37 O-Man glycoproteins from cdh/pcdh family members as well as several other proteins (Table 3). 
The cumulative findings confirm that O-mannosylation of α-DG and the mucin-like KIAA1549 requires the function of both POMT1 and POMT2, whereas the cadherin superfamily, plexins, and several other proteins appear to be glycosylated by yet unknown enzymes in mammalian cells. Similar to CHO cells, we identified O-Man glycosites on cdh and pcdh in HEK293 cells without knock-out of POMGNT1 (HEK293 POMT1/POMT2), suggesting that cell lysates included biosynthetic intermediates and/or that O-Man glycans on these glycoproteins are not fully elongated by the POMGnT1 enzyme.

Analysis of O-Man glycans on recombinantly expressed cadherin/protocadherins in HEK293 cells with POMT1/POMT2 knock-outs
We previously characterized O-Man glycosites on the extracellular cadherin (EC) domains of protocadherin γ-C5 (EC1-4) recombinantly expressed in HEK293 cells (14). Interestingly, we found all identified O-Man glycans as single mannose monosaccharide residues. To further explore the role of POMT1 and POMT2 in glycosylation of cdh/pcdh, we expressed three representative members, mouse E-cadherin, Pcdhα-C2, and Pcdhγ-A4, as secreted His-tagged ectodomains in HEK293 POMT1/POMT2 mutant cells. The tagged proteins were purified, digested with trypsin, and subjected to bottom-up MS analysis to map glycosylation sites and structures. The mass spectrometric analyses identified 7, 8, and 7 O-Man glycosites in E-cadherin, Pcdhα-C2, and Pcdhγ-A4, respectively. The O-Man glycosylation sites were distributed across four of the six extracellular cadherin domains of Pcdhα-C2 and two of the six EC domains of Pcdhγ-A4, whereas E-cadherin was occupied with O-Man on four of the five EC domains, with the EC1 domain being the only one without detectable O-mannosylation, in agreement with our previous findings (supplemental Table S4) (13,14). We processed the data and searched for O-Man glycans matching all known glycan structures, e.g. M1, M2, and M3 cores, that have been observed on α-DG (27), but we identified no evidence of any elongation of cdh and pcdh O-Man glycans. All glycosites identified were occupied by single mannose residues with no evidence of the more complex structures that are normally found on α-DG, for example (28). Thus, analysis of the recombinantly expressed cdh/pcdhs in HEK293 mutant cells shows that O-mannosylation of this class of proteins is not dependent on POMT1 and POMT2 and furthermore that the O-Man glycans on these proteins do not appear to be elongated.

Table 1. Overview of indel mutations for the gene alleles in HEK293 and CHO. The stacking order for gene editing is given by gene order.

Quantitative differential O-Man glycoproteome analysis in HEK293 cells
To further confirm these findings, we used a recently developed comparative quantitative O-glycoproteomics strategy based on differential labeling of tryptic digests from isogenic cells using stable dimethyl isotopes (29,30). Total digests from HEK293 SC/PGNT1 and HEK293 SC/PGNT1/POMT1/POMT2 cells were labeled with light label (−N(CH3)2) and medium label (−N(CHD2)2), respectively. The labeled digests were then mixed in a 1:1 ratio and subsequently processed as a single sample. The O-Man glycopeptides were enriched by ConA LWAC, and the flow-through of this chromatography step was enriched for O-GalNAc glycopeptides by Vicia villosa agglutinin (VVA) LWAC (Fig. 1). To evaluate the relative abundance of peptides originating from HEK293 SC/PGNT1 (light label) and HEK293 SC/PGNT1/POMT1/POMT2 (medium label), medium/light ratios (M/L) were calculated based on the nano-LC-MS/MS elution profiles for each labeled glycopeptide and expressed on a log10 scale (Fig. 2). First, we observed that the total tryptic digest, collected as the flow-through fraction from the last (VVA) lectin enrichment step, demonstrated a normal distribution centered around 0 (Fig. 2A), thus showing that digested proteins from HEK293 SC/PGNT1 and HEK293 SC/PGNT1/POMT1/POMT2 cells were mixed in a 1:1 ratio prior to lectin enrichment. Next, we processed the elution fractions from the ConA lectin enrichment and identified 38 O-Man glycoproteins in total (supplemental Table S5), of which 23 were quantified by MS1 and ETD-MS2 acquisition. Of the 23 quantified O-Man glycoproteins, 16 belonged to the cdh/pcdh superfamily, all of which had O-Man glycopeptide medium/light ratios close to 1 (Figs. 2B and 3), i.e. <8-fold variation between HEK293 SC/PGNT1 and HEK293 SC/PGNT1/POMT1/POMT2 cells. 
Thus, the labeling strategy using stable dimethyl isotopes confirmed that O-mannosylation of cdhs and pcdhs was not substantially affected when POMT1 and POMT2 are knocked out (Fig. 3). Additional glycoproteins with <10-fold difference in O-mannosylation included plexin-B2 and PDIA3, for example (supplemental Table S5). In striking contrast, two α-DG O-Man glycopeptides, originating from the 339 VPTPTSPAIAPPTETMAPPVRDPVPGKPTVTIR 371 region, were identified with light label only, as expected (Figs. 2 and 3). One additional O-Man glycoprotein, SUN domain-containing ossification factor (SUCO), was also found to be dependent on POMT1/POMT2 for O-mannosylation (Fig. 2B and supplemental Table S5). The total number of O-Man glycoproteins identified in HEK293 SC/PGNT1/POMT1/POMT2 cells is therefore 36 (Table 3) due to the loss of O-Man glycosylation on α-DG and SUCO.
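The ratio-based reasoning above can be sketched in a few lines. This is a minimal illustration, not the quantification software used in the study: the function names are hypothetical, the intensities in the usage line are made-up numbers, and the 10-fold cutoff is an assumed parameter reflecting the "<10-fold" wording in the text. A glycopeptide detected in only one channel is given an infinite log ratio so that it is never classified as unchanged.

```python
import math


def log10_medium_over_light(light, medium):
    """Return log10(M/L); +/-infinity when only one channel has signal."""
    if light > 0 and medium > 0:
        return math.log10(medium / light)
    if light > 0:
        return -math.inf  # light label only: signal absent in the knock-out
    if medium > 0:
        return math.inf  # medium label only: signal present only in the knock-out
    raise ValueError("no signal in either channel")


def classify(light, medium, max_fold=10.0):
    """Classify a glycopeptide by its medium/light ratio (assumed cutoff)."""
    ratio = log10_medium_over_light(light, medium)
    if abs(ratio) < math.log10(max_fold):
        return "unchanged"
    return "lost in knock-out" if ratio < 0 else "gained in knock-out"


# Hypothetical intensities: a 2-fold difference stays within the cutoff.
print(classify(1.0e6, 2.0e6))
```

An equally mixed sample gives log10(M/L) = 0 for an unaffected peptide, which is why a distribution centered on 0 in the flow-through fraction serves as the mixing control.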
We proceeded with a direct quantitative comparison of O-GalNAc glycosylation between the HEK293 SC/PGNT1 and HEK293 SC/PGNT1/POMT1/POMT2 cells using the flow-through of the ConA LWAC. In the VVA LWAC elution, we identified 319 O-GalNAc glycoproteins (supplemental Table S6). In this analysis we readily detected glycopeptides from α-DG. We identified 14 glycosites in the C-terminal part of the mucin-like domain (Ala417-Ser485), which is normally O-GalNAc-modified as reported previously (13,31-34). The relative quantification (Figs. 4 and 5 and supplemental Table S7) of these α-DG glycopeptides further showed <10-fold variation between HEK293 SC/PGNT1 and HEK293 SC/PGNT1/POMT1/POMT2 cells, thus demonstrating that expression of the α-DG protein was not substantially affected by the genetic deletion of POMTs. Importantly, however, we also identified two GalNAc-glycopeptides, DPVPGKPTVTIRTR 373 and GAIIQTPTLGPIQPTR 389 (underlining indicates O-glycan attached), from the N-terminal region of the α-DG mucin domain that are normally identified as O-mannosylated glycopeptides (Figs. 4 and 5 and supplemental Table S7). The two GalNAc-modified glycopeptides had very high log10(M/L) values, showing that they are essentially only detectable in HEK293 cells with knock-out of POMT1/POMT2. Thus, it appears that loss of the ER-located POMT1/POMT2 enzymes leaves the N-terminal region of the mucin domain of α-DG available for GalNAc glycosylation by the Golgi-located polypeptide GalNAc-transferases (GalNAc-Ts) that normally only serve the C-terminal region of this mucin domain. This interpretation is in agreement with previous in vitro enzyme analysis of the substrate specificities of GalNAc-Ts and α-DG peptide substrates (24).

O-Man glycoproteome analysis of a human skin fibroblast with partial POMT1 deficiency
Finally, we also included analysis of a primary skin fibroblast derived from a compound heterozygote (S29R and R622X) POMT1-deficient patient. Using our O-Man glycoproteomics workflow with the ConA enrichment strategy, we identified 32 O-Man glycoproteins and 97 glycosites (supplemental Table S3). Similar to our results in CHO and HEK293 cell lines, we found O-glycopeptides with a single Man residue attached to a number of proteins, including members of the cadherin superfamily, five members of the plexin family, and PDIA3 and reticulocalbin 3. In addition, we identified two novel O-Man glycoproteins, hereditary hemochromatosis protein (HFE) and platelet-derived growth factor receptor α (PGFRA). We did not identify any glycopeptides from α-DG, which is likely because POMGNT1 knock-out is required for isolation of truncated O-Man glycopeptides, in agreement with our studies of HEK293 mutant cell lines. The finding that we readily identified O-Man glycopeptides via ConA LWAC from members of the cdh and pcdh families in cells without POMGNT1 knock-out, including the skin fibroblast, indicates that these glycans are not elongated, in agreement with our findings with recombinantly expressed cdhs and pcdhs. This result combined with those of CHO and HEK293 cells provides clear evidence that the POMT1-POMT2 enzyme complex is not required for O-mannosylation of the cdh/pcdh family of proteins.

Discussion
The original aim of our study was to explore the function of O-Man glycans on cadherins and protocadherins by genetic deconstruction of the protein O-mannosylation capacity in mammalian cells, thereby enabling direct functional assays. It was previously suggested that POMT2 was required for E-cadherin cell adhesion (15), and we wanted to evaluate the role of the two POMTs and POMGnT1 for functions of cdhs and pcdhs. To our surprise, knock-out of any of these three genes alone or in combination did not affect O-mannosylation of cdhs and pcdhs, in striking contrast to α-DG. We unequivocally demonstrate that both POMT1 and POMT2 are required for O-mannosylation of α-DG, and our findings strongly indicate the existence of a novel O-mannosylation process in mammalian cells that is distinct from the classical yeast type controlled by the PMT orthologs POMT1 and POMT2. The developed double POMT1/POMT2 knock-out cell lines now offer the opportunity to distinguish and screen for the glycosyltransferase genes controlling this novel type of O-mannosylation, and this gene hunt is now in progress. Interestingly, our results also suggest that this novel O-mannosylation pathway targets substrate sites involving distinct conformations such as the EC domains in cdhs/pcdhs and not unstructured regions such as the mucin domains of α-DG and KIAA1549. Moreover, the novel type of O-mannosylation appears to be limited to a single α-Man residue, although we cannot completely exclude that elongation may occur in other cell types. How the POMGnTs avoid elongating these O-Man glycans will be an interesting topic for future studies. 
A recent report demonstrates that POMGnT1 controlling the core M1 elongation uses a βGlcNAc-binding lectin domain in its stem region for clustered glycosylation of α-DG (35); however, it has also recently been demonstrated that POMGnT1 exhibits rather promiscuous acceptor substrate specificity, whereas POMGnT2 controlling the competing core M3 elongation pathway has restricted peptide acceptor specificity and thus is predicted to be the gatekeeper for the site-specific glycosylation of α-DG (36).
The functions of the metazoan POMT1 and POMT2 isoenzymes and their relationship with the larger family of yeast PMTs are still poorly understood. The cumulative functions of the yeast PMT isoenzymes are expected to cover a wide range of protein substrates as evidenced by our recent analysis of the yeast Saccharomyces cerevisiae and Schizosaccharomyces pombe O-Man glycoproteomes (20,21), but detailed substrate specificities of the individual PMTs and their contributions to the yeast O-Man glycoproteome are missing (19). Previous reports indicate that POMT1 and POMT2 are both required to form a functional complex for initiation of O-mannosylation of the α-DG mucin-like domain (37), and this was confirmed here in both CHO and HEK293 cells, where single knock-out of either POMT gene eliminated detection of α-DG O-Man glycopeptides (Figs. 2 and 3). It has been hypothesized that the individual POMT1 and POMT2 could have independent glycosylation functions (15,38), but we found no consistent differences in the O-Man glycoproteomes of single versus double knock-out of the two POMTs, suggesting that this is not the case. Although our gene-targeting strategy for POMT1 and POMT2 included targeting of different exons in CHO and HEK293 cells that are known to be important for the catalytic function, it should be emphasized that we cannot unequivocally exclude the remote possibility that the mutant genes could encode enzymatic functions. However, we propose that our results strongly indicate that POMT1 and POMT2 have narrow functions in O-glycosylation and serve a few proteins with unstructured mucin-like regions such as the N-terminal part of the mucin domain of α-DG and the KIAA1549 protein. How this selectivity is obtained, given that a large number of mucins and mucin-like proteins pass through the ER before being GalNAc O-glycosylated in the Golgi, is unclear but suggests the presence of a special signal. Hanisch and co-workers (28) originally suggested that an upstream sequence N-terminal to the α-DG mucin domain directed O-mannosylation, and such a mechanism is found for other types of glycosylation, including selection of N-glycoproteins destined for the lysosome by the GlcNAc-1-phosphotransferase (39) and the hormone β4GalNAc-transferases (40,41).
Genetic deconstruction and simplification of glycosylation capacities in cell lines have been fruitful strategies to uncover glycoproteomes and novel types of glycosylation (13,23), and the present example adds to this by deciphering the functions of the POMTs and uncovering a previously unknown type of O-mannosylation. Protein O-mannosylation in eukaryotes now comprises at least three distinct types as follows: the evolutionarily conserved PMT/POMT ER-located type that serves α-DG and some unstructured mucin-like proteins; the yeast-specific nucleocytoplasmic type (21); and a pathway as predicted here to serve the folded EC domains of the large cadherin superfamily in higher eukaryotic cells.
The finding that POMTs do not control glycosylation of cadherins suggests that the original observation that POMT2 is important for E-cadherin function may not be directly associated with O-glycans on cadherins. Strahl and co-workers (15) used genetic and pharmacological approaches to block POMTs and observed that mouse embryos deficient in O-mannosylation failed to proceed from the morula to the blastocyst stage due to defects in cell-cell contact. In a recent report, POMT2 was also shown to affect E-cadherin N-glycosylation and O-mannosylation in cancer (42). Strahl and co-workers (43) also demonstrated with a monoclonal antibody developed to an O-Man glycopeptide that reactivity in the murine brain was dependent on POMT2. Currently, we have no explanation for these findings in relation to our results.
Our finding also has relevance for congenital diseases of glycosylation associated with partial deficiencies in POMT1 or POMT2 (OMIM 607423 and 607439), which cause three different forms of muscular dystrophy-dystroglycanopathy: severe forms with brain and eye anomalies also designated Walker-Warburg syndrome or muscle-eye-brain disease, a less severe congenital form with mental retardation, and a milder limb-girdle form also designated LGMD2 (44-47). Deficiency in the POMGNT1 gene (OMIM 606822) underlies similar but less severe dystroglycanopathies and phenotypes than those of the POMTs (48), and this may be in agreement with our finding that this enzyme only elongates O-Man glycans on α-DG and related proteins and not those on the cadherin family of proteins. Provided that the highly conserved O-mannosylation of the large cadherin family has important functions, this would be in agreement with our finding that this process is unrelated to the POMTs and α-DG glycosylation.
An overlap in proteins and sites undergoing O-glycosylation by yeast PMTs and metazoan GalNAc-Ts has been discussed for some time, and for example, recombinant expression of human O-glycoproteins in yeast has resulted in attachment of O-Man glycans on sites normally undergoing GalNAc glycosylation (49). In vitro studies with peptides have further supported this (24), and here we provide evidence that elimination of POMT-driven O-mannosylation of α-DG in HEK293 cells results in GalNAc glycosylation at sites normally occupied by O-Man glycans, as shown previously by in vivo studies in Drosophila melanogaster (50). In normal human cells, the two glycosylation processes are topologically separated in ER and Golgi, but in cancer the GalNAc-Ts may relocate to the ER and thus potentially compete with POMTs (51,52). In this study we did not consider the POMGnT2 pathway (Fig. 1), involving addition of β4GlcNAc to O-Man glycans to form the M3 core (53,54). We reasoned that it would not substantially affect our global glycoproteomic analysis since it has only been reported to occur on α-DG (27,54-57).
Our study strongly suggests that the α-linked O-Man glycans on the cadherin superfamily are not elongated. Recombinant expression in HEK293 POMT1/POMT2 and bottom-up mass spectrometry of three representative members of the cadherin superfamily provided further insight into the O-Man biosynthesis on these glycoproteins. In contrast to the global approach using total cell lysates, the targeted analysis using purified cad- [...] Fig. 1) (55-57), which would result in generation of glycopeptides that evade detection and/or identification by our mass spectrometry approach. However, this possibility is unlikely considering that the purified cadherins migrated as homogeneous bands at the expected molecular weights following SDS-PAGE analysis (Fig. 6). Still, we cannot completely rule out temporal and/or spatial mechanisms capable of regulating elongation of cadherin O-Man glycans. Analogous to the Notch signaling system, where O-linked fucose (O-Fuc) may be present as a single monosaccharide unit or elongated into a tetrasaccharide with profound consequences on cell signaling (58), a similar but as yet unknown biosynthetic pathway capable of modifying and elongating the O-Man monosaccharides found on cadherins and plexins, for example, may still exist. However, using HEK293 as a model system, we find no evidence to support this hypothesis and conclude that the O-Man glycosylation found on cadherin superfamily members is not elongated into complex glycans but rather is limited to single O-Man monosaccharide units. This is in agreement with the recent conclusions drawn by Strahl and co-workers (43) based on immunohistochemical and glycoproteomics studies.
The cadherin superfamily is characterized by the specific EC domain protein fold (59), and the O-Man glycans on cadherins and protocadherins are all located at evolutionarily conserved sites found on the β-sheet elements of EC domains (13,14). In contrast, α-DG O-Man glycans are located in the mucin-like domain that is predicted to be unstructured and disordered. Notably, O-Man glycosylations outside the mucin-like domain of α-DG have been demonstrated in D. melanogaster (50). However, these O-Man glycosites were found on α-DG regions poorly conserved in higher species, e.g. mouse or humans, suggesting that O-Man glycosylation outside the unstructured mucin-like domain of α-DG may be restricted to lower species. Thus, the POMT1/POMT2-independent O-Man initiation appears to have a preference for specific folded protein domains, analogous to the process of O-Fuc glycosylation on structured EGF domains of Notch proteins (58). This conclusion is further supported by the observation that the POMT1/POMT2-independent O-Man glycosylation of plexins is found on β-strands of IPT/TIG domain folds. Furthermore, a third example of structured protein regions targeted by the POMT1/POMT2-independent O-Man pathway appears to include the Ig-like C2-type domains. We have previously identified intercellular adhesion molecule 1 with O-Man glycans on the Ig-like C2-type domain (13), and here we found two additional members, hereditary hemochromatosis protein (HFE) and platelet-derived growth factor receptor α (PGFRA). It is interesting to note that there are currently 22 and 323 human proteins in UniProtKB annotated with IPT/TIG and Ig-like C2-type domains, respectively (supplemental Table S8). It is conceivable that a majority of these proteins may be targets for the novel POMT1/POMT2-independent O-mannosylation pathway.
In summary, our study highlights that our knowledge of protein glycosylation is still incomplete and that O-mannosylation of proteins is far more complicated than previously thought. We present evidence that the large family of cadherins and other proteins are not O-mannosylated by the POMTs, indicating the existence of a novel O-mannosylation pathway for these proteins. Instead, the POMTs are suggested to have exquisitely narrow glycosylation functions for α-DG and a few other unstructured protein domains, which may explain the well defined α-dystroglycanopathy phenotypic features associated with deficiencies. Our study enables the discovery of the O-mannosylation enzymes for cadherins and the characterization of biological functions related to this unique type of protein glycosylation. In preliminary studies we have identified a homologous gene family conserved in metazoans that, in contrast to the POMTs as shown here, is indispensable for O-mannosylation of cadherins in HEK293 cells, and characterization of this gene family is in progress.

Precise gene targeting of glycogenes in CHO and HEK293 cells
Gene targeting was performed in the CHO ZN-GS −/− (glutamate synthase) (Sigma) cells or HEK293 (ATCC) using GFP/Crimson-tagged zinc finger nucleases and transcription activator-like effector nucleases (TALENs) or GFP-tagged clustered and regularly interspaced short palindromic repeats (CRISPR)-Cas9s (supplemental Table S1) with our recently developed screening strategy (60). CHO-GS cells were maintained as suspension cultures in EX-CELL CHO CD Fusion serum-free media, supplemented with 4 mM L-glutamine. HEK293 cells were maintained in Dulbecco's modified Eagle's medium supplemented with 10% FBS, 2 mM L-glutamine. Briefly, cells were transfected by electroporation using Amaxa kit V and program U24 for CHO cells or Q01 for HEK293 cells with Amaxa Nucleofector 2B (Lonza, Switzerland). At 72 h after transfection, GFP/Crimson-positive cells were enriched by FACS. After a 1-2-week culture, cells were single-sorted again for GFP/Crimson-negative cells in 96-well plates. Knock-out clones with frameshift mutations were identified by IDAA with gene-specific primers.
The gene-targeting strategy for the type 2 transmembrane glycosyltransferase, POMGNT1, was to disrupt the catalytic domain. For the chaperone COSMC resembling the type 2 glycosyltransferase C1GalT1, we similarly targeted the region homologous to the catalytic domain of C1GalT1. For the multitransmembrane glycosyltransferases, POMT1 and POMT2, we focused on exons previously shown to be important for enzymatic function, and we targeted different exons in CHO and HEK293 cells to disrupt the coding regions as early as possible and to avoid alternative splice variants as annotated in UniProt. Specifically for POMT1, exon 5 was targeted in HEK293 cells and exon 2 in CHO cells. For POMT2, exon 1 was targeted in HEK293 cells and exon 3 in CHO cells. Clones were selected with frameshift mutations that result in premature stop codons (Table 1 and supplemental Table S1). Previous studies have demonstrated that single amino acid substitutions in exons 1, 2, and 4 disrupt enzymatic functions of POMT1 (25,61). Similarly, for POMT2 single amino acid mutations in exons 1, 2, and 3 disrupt and/or impair function (25). All genes were targeted in the center of an exon, and clones were selected with small introduced indels limited to the particular exon. This strategy is predicted to preserve the normal intron/exon borders (supplemental Table S1). The potential truncated protein products of the POMT1 and POMT2 mutant clones selected are predicted to be catalytically inactive. All selected clones were confirmed by Sanger sequencing of 200-300 bp of the target regions.
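To illustrate the selection criterion described above (small indels that shift the reading frame and introduce a premature stop codon), the following is a minimal Python sketch. The function names and the toy coding sequences are hypothetical and are not taken from the study; the actual clones were characterized by IDAA and Sanger sequencing.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_frameshift(indel_bp: int) -> bool:
    """True when a net insertion (+) or deletion (-) is not a multiple of 3 bp."""
    return indel_bp % 3 != 0


def first_stop_codon(cds: str):
    """Return the 0-based codon index of the first in-frame stop, or None."""
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3].upper() in STOP_CODONS:
            return i // 3
    return None


# Toy example (hypothetical sequence): ATG CTG ACC GGA TAA
wild_type = "ATGCTGACCGGATAA"
# Deleting the single base at position 3 gives ATG TGA CCG GAT AA
mutant = wild_type[:3] + wild_type[4:]

indel = len(mutant) - len(wild_type)  # net change of -1 bp
print(is_frameshift(indel), first_stop_codon(wild_type), first_stop_codon(mutant))
```

In this toy case the 1-bp deletion moves the first in-frame stop from the fifth codon to the second, which is the kind of truncating outcome the clone selection aimed for.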
All zinc finger nucleases and TALENs were designed by Sigma, and gRNAs were designed using the on-line tool http://crispr.mit.edu/, selecting for lowest off-targets.

Lectin weak affinity chromatography
Packed cell pellets (0.5 ml) were trypsin-digested following a previously published protocol (13). Briefly, the cell pellets were lysed by sonication in 0.1% RapiGest, 50 mM ammonium bicarbonate and cleared by centrifugation (1,000 × g for 10 min). The cleared lysate was heated at 80°C for 10 min followed by reduction with 5 mM dithiothreitol (DTT) at 60°C for 30 min and alkylation with 10 mM iodoacetamide at room temperature for 30 min before overnight digestion at 37°C with 25 μg of trypsin (Roche Applied Science). Trypsin was heat-inactivated at 95°C for 20 min before N-glycan removal with 8 units of PNGase F (Roche Applied Science) at 37°C overnight, followed by addition of 3 units of PNGase F and incubation for 4 h. The N-deglycosylated digests were acidified with 12 μl of trifluoroacetic acid (TFA) at 37°C for 20 min and cleared by centrifugation at 10,000 × g for 10 min. The cleared acidified digests were loaded onto equilibrated SepPak C18 cartridges (Waters) and washed three times with 1 column volume (CV) of 0.1% TFA (1 CV = 2 ml). Columns were then washed using 3 CV of 0.1% formic acid (FA) and eluted with 0.5 ml of 50% methanol (MeOH) in 0.1% FA.

Protein production
HEK293 POMT1/POMT2 cells were maintained in FreeStyle™ 293 serum-free media (Thermo Fisher Scientific). Cells were cultured to a cell density of ~1.2 × 10^6 cells/ml with at least 90% viable cells in a shaking incubator. Expression constructs were transfected using PEI (DNA/PEI ratio of 1:2), 0.5 μg of DNA per ml of cells, and 10 mM supplemented CaCl2. Conditioned media were collected 6 days post-transfection and purified as follows. Conditioned media (100 ml) were supplemented with 20 mM Tris-HCl, pH 8.0, 3 mM CaCl2, 20 mM imidazole, 500 mM NaCl. Conditioned media were incubated with 5 ml of Ni-NTA-Sepharose beads with gentle stirring for 45 min, followed by loading on a gravity flow column. The beads were washed by gravity flow with 40 CVs of wash buffer (20 mM Tris-HCl, pH 8.0, 3 mM CaCl2, 20 mM imidazole, and 500 mM NaCl). Proteins were eluted in 20 mM Tris-HCl, pH 8.0, 3 mM CaCl2, 250 mM imidazole, 500 mM NaCl. 2.5 μg of purified protein was loaded on a NuPAGE™ Novex 4-12% BisTris protein gel and stained with InstantBlue™. 10 μg of purified protein was diluted in 100 μl of 50 mM ammonium bicarbonate, reduced in 5 mM DTT at 60°C for 30 min, alkylated in 10 mM iodoacetamide at room temperature for 30 min, followed by digestion with 0.25 μg of trypsin (Roche Applied Science) at 37°C overnight. The peptides were purified by in-house packed Stage tips (Empore disk-C18, 3M) and subjected to bottom-up MS.
For Velos Pro acquisition, precursor MS1 scan (m/z 355-1700) was acquired in the Orbitrap at a resolution setting of 30,000, followed by Orbitrap HCD-MS/MS and, for selected samples, also ETD-MS/MS of multiply charged precursors in the MS1 spectrum; a minimum MS1 signal threshold of 10,000-50,000 ions was used for triggering data-dependent fragmentation events; MS2 spectra were acquired at a resolution of 7,500 (HCD) or 15,000 (ETD). Supplemental activation (20%) of the charge-reduced species was used in the ETD analysis to improve fragmentation. For Fusion acquisition, precursor MS1 scan (m/z 355-1700) was acquired in the Orbitrap at a resolution setting of 120,000, followed by Orbitrap HCD-MS/MS and ETD-MS/MS of multiply charged precursors (z = 2-6) in the MS1 spectrum; a minimum MS1 signal threshold of 10,000-50,000 ions was used for triggering data-dependent fragmentation events; MS2 spectra were acquired at a resolution of 60,000 (HCD and ETD). For differential glycoproteomic analyses of dimethyl stable isotope-labeled samples, ConA-enriched O-Man glycopeptides were analyzed on the Fusion instrument using the parameters described above; VVA-enriched O-GalNAc glycopeptides were analyzed using 3-h acquisition methods on the Fusion instrument using the following parameters: method 1, as described above; method 2, precursor MS1 scan (m/z 355-1700, resolution = 120,000) acquired in the Orbitrap followed by Orbitrap ETD-MS/MS (resolution = 60,000) of the seven most abundant multiply charged precursors (z = 2-6) in the MS1 spectrum; a minimum MS1 signal threshold of 50,000 ions was used for triggering data-dependent fragmentation events; method 3, precursor MS1 scan (m/z 355-1700, resolution = 120,000) acquired in the Orbitrap followed by ETD-MS/MS (resolution = 60,000) of the seven most abundant multiply charged precursors (z = 3-4) in the MS1 spectrum.
For method 3, the LC gradient was adjusted to 5-15% B for 155 min followed by 15-80% B for 10 min and finally 80% B for 15 min. To improve fragmentation, ETD supplemental activation (ETcid = 25%) was used in all analyses described above for the Fusion instrument. The flow-through fraction from the VVA lectin column enrichment step was analyzed using a 3-h acquisition method (LC gradient IV) on the Fusion instrument with the following settings: precursor MS1 scan (m/z 355-1700, resolution = 120,000) acquired in the Orbitrap followed by Orbitrap HCD-MS/MS (resolution = 60,000) of the 10 most abundant multiply charged precursors (z = 2-6) in the MS1 spectrum; a minimum MS1 signal threshold of 50,000 ions was used for triggering data-dependent fragmentation events. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE (62) partner repository with the dataset identifiers PXD004341 and PXD004358 for CHO and human cells, respectively.
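The method 3 LC gradient stated above can be written as a piecewise profile. The sketch below is illustrative only: it assumes linear ramps within each segment (the text gives only the end points and durations), and the names GRADIENT and percent_b are hypothetical.

```python
# (duration in min, starting %B, final %B) for each segment of method 3
GRADIENT = [(155, 5.0, 15.0), (10, 15.0, 80.0), (15, 80.0, 80.0)]


def percent_b(t_min: float, segments=GRADIENT) -> float:
    """Return %B at time t_min, assuming a linear ramp within each segment."""
    if t_min < 0:
        raise ValueError("time must be non-negative")
    elapsed = 0.0
    for duration, start, end in segments:
        if t_min <= elapsed + duration:
            fraction = (t_min - elapsed) / duration
            return start + fraction * (end - start)
        elapsed += duration
    raise ValueError("time is beyond the end of the gradient")


total_minutes = sum(duration for duration, _, _ in GRADIENT)
print(total_minutes, percent_b(0), percent_b(160), percent_b(180))
```

The three segments sum to 180 min, which agrees with the 3-h acquisition methods described for the Fusion instrument.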

Data analyses
Data processing was carried out using Proteome Discoverer 1.4 software (Thermo Fisher Scientific), as described previously (13), with minor modifications as outlined below. Raw data files (.raw) were processed using the Sequest HT node and searched against the canonical CHO-specific proteome (November, 2014) or the human proteome (January, 2013) downloaded from the UniProtKB database (http://www.uniprot.org/). In all cases, the precursor mass tolerance was set to 10 ppm and fragment ion mass tolerance to 0.02 Da. Carbamidomethylation on Cys was used as a fixed modification, and oxidation of Met, deamidation of Asn, and hexose modification of Ser and Thr residues were used as variable modifications. A maximum of eight variable modifications were allowed per peptide. A maximum of two missed cleavage sites were tolerated. Spectral assignments worse than the high confidence level were resubmitted to a second Sequest HT node using semi-specific trypsin proteolytic cleavage. Final results were filtered for high-confidence (p < 0.01) identifications only. Peptide confidence levels were calculated using the Target Decoy PSM Validator node of Proteome Discoverer 1.4. HCD spectra were further processed with a subtraction routine as described previously (13). Briefly, all HCD spectra were extracted to a separate .mgf file, and the exact masses of one to four hexose residues were subtracted from each precursor ion, resulting in four separate .mgf files. Each .mgf file was subsequently processed as described above with the exception of omitting hexose as a variable modification at Ser or Thr residues.
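The arithmetic behind the hexose subtraction routine and the 10 ppm precursor tolerance can be sketched as follows. This is a minimal illustration, not the published routine: it assumes precursors are stored as m/z values with a known charge, uses the monoisotopic mass of a hexose residue (C6H10O5, 162.052824 Da), and the function names are hypothetical.

```python
HEXOSE_RESIDUE_DA = 162.052824  # monoisotopic mass of one hexose residue (C6H10O5)


def subtract_hexoses(precursor_mz: float, charge: int, n_hex: int) -> float:
    """Precursor m/z after removing n_hex hexose residues from an ion of given charge."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return precursor_mz - n_hex * HEXOSE_RESIDUE_DA / charge


def ppm_error(observed: float, theoretical: float) -> float:
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def within_tolerance(observed: float, theoretical: float, tol_ppm: float = 10.0) -> bool:
    """True when the absolute mass error does not exceed tol_ppm."""
    return abs(ppm_error(observed, theoretical)) <= tol_ppm


# One adjusted precursor per hexose count, mirroring the four separate .mgf files
adjusted = {n: subtract_hexoses(1000.0, 2, n) for n in range(1, 5)}
print(adjusted)
```

Each of the four adjusted precursor values would then be searched without hexose as a variable modification, as described in the text.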
For dimethyl stable isotope-labeled samples, glycopeptide medium/light ratios were determined using the Event Detector Node and the Precursor Ion Node of the Proteome Discoverer workflow as described previously (30). Briefly, the Event Detector node was used for peak area quantification clustering isotopes of precursor ions that elute during the same retention time. Isotopically labeled ions were finally quantified using Precursor Ions Quantifier Node. Quantifications were based on peptides identified by ETD only. The O-Man and O-GalNAc glycosylations identified in WT cells, knock-out cell lines, and purified cadherins/protocadherins are based on single shotgun mass spectrometry experiments.
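As a simple illustration of the quantity being reported, the sketch below computes a medium/light ratio from two precursor peak areas and summarizes several glycopeptide ratios by their median in log2 space. It does not reproduce the Proteome Discoverer nodes; the function names, the median summary, and the example values are assumptions made for illustration.

```python
import math
from statistics import median


def medium_light_ratio(medium_area: float, light_area: float) -> float:
    """Ratio of the medium-labeled to the light-labeled precursor peak area."""
    if light_area <= 0:
        raise ValueError("light peak area must be positive")
    return medium_area / light_area


def median_ratio(ratios) -> float:
    """Median of positive ratios, taken in log2 space so fold changes are symmetric."""
    return 2 ** median(math.log2(r) for r in ratios)


# Hypothetical peak areas for three glycopeptides (medium, light)
areas = [(50.0, 100.0), (100.0, 100.0), (200.0, 100.0)]
ratios = [medium_light_ratio(m, l) for m, l in areas]
print(ratios, median_ratio(ratios))
```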