Peptide-specific Transfer of N-Acetylgalactosamine to O-Linked Glycans by the Glycosyltransferases β1,4-N-Acetylgalactosaminyl Transferase 3 (β4GalNAc-T3) and β4GalNAc-T4*

Background: LacdiNAc (GalNAcβ1,4GlcNAc) is present on O- and N-linked carbohydrate moieties of pro-opiomelanocortin.
Results: β1,4-N-acetylgalactosaminyltransferases β4GalNAc-T3 and β4GalNAc-T4 mediate peptide-specific transfer of GalNAc to O-linked structures in vivo and in vitro.
Conclusion: β4GalNAc-T3 and β4GalNAc-T4 can account for LacdiNAc sequences on O-linked structures on specific glycoproteins.
Significance: The protein-specific addition of LacdiNAc to O-linked carbohydrates generates a family of unique structures recognized by carbohydrate-specific receptors.

N- and O-linked oligosaccharides on pro-opiomelanocortin both bear the unique terminal sequence SO4-4-GalNAcβ1,4GlcNAcβ. We previously demonstrated that protein-specific transfer of GalNAc to N-linked oligosaccharides on glycoprotein substrates is dependent on the presence of both an oligosaccharide acceptor and a peptide recognition motif consisting of a cluster of basic amino acids. We characterized how two β1,4-N-acetylgalactosaminyltransferases, β4GalNAc-T3 and β4GalNAc-T4, require the presence of both the peptide recognition motif and the N-linked oligosaccharide acceptors to transfer GalNAc in β1,4-linkage to GlcNAc in vivo and in vitro. We now show that β4GalNAc-T3 and β4GalNAc-T4 are able to utilize the same peptide motif to selectively add GalNAc to β1,6-linked GlcNAc in core 2 O-linked oligosaccharide structures to form Galβ1,3(GalNAcβ1,4GlcNAcβ1,6)GalNAcαSer/Thr. The β1,4-linked GalNAc can be further modified with 4-linked sulfate by either GalNAc-4-sulfotransferase 1 (GalNAc-4-ST1) (CHST8) or GalNAc-4-ST2 (CHST9) or with α2,6-linked N-acetylneuraminic acid by α2,6-sialyltransferase 1 (ST6Gal1), thus generating a family of unique GalNAcβ1,4GlcNAcβ (LacdiNAc)-containing structures on specific glycoproteins.

Oligosaccharides terminating with LacdiNAc modified with sulfate or NeuNAc are recognized by the mannose receptor (18-23) and the asialoglycoprotein receptor (24,25), respectively, and regulate the circulatory half-lives of glycoprotein hormones bearing these structures in vivo. LacdiNAc termini modified with sulfate or NeuNAc on O-linked structures may be recognized by the same and/or additional receptors and may also have important biological consequences in vivo.
Quantitation of GalNAc incorporated either in vitro or in vivo into Gaussia luciferase (GLuc)-containing chimeric glycoproteins was carried out as described (26). The medium was adjusted to pH 5.0 with sodium acetate for digestion with A. ureafaciens neuraminidase and DP β-galactosidase at 37 °C. The digestions were stopped by heating.

GLuc-CTP-CA(1-19)Myc-His and GLuc-CTP-CA(1-2)Myc-His. A limited number of glycoproteins bearing either N-linked or O-linked oligosaccharides containing the LacdiNAc sequence have been described in vertebrates, suggesting potential distinctive functions for this carbohydrate moiety. As illustrated in Fig. 1 for O-linked structures, GalNAc is added to the β1,6-linked GlcNAc moiety of core 2 type structures, generating structure 7, which can be further modified with 4-linked SO4 or α2,6-linked NeuNAc (structures 9 and 10). The addition of β1,4-linked GalNAc to N-linked oligosaccharides to form LacdiNAc sequences on glycoproteins such as LH and CA6 is mediated by protein-specific β1,4GalNAc transferases that recognize a peptide motif as well as the oligosaccharide acceptor (8,9,26). We previously utilized chimeric glycoproteins consisting of a secreted form of luciferase, GLuc (33,34), followed by a glycoprotein of interest and an epitope tag, Myc-His, to define the protein-specific addition of GalNAc to N-linked oligosaccharides by β4GalNAc-T3 and β4GalNAc-T4 in vivo and in vitro (26). We have now taken a similar approach to determine whether the same β4GalNAc-T3 and β4GalNAc-T4 enzymes can account for the protein-specific addition of GalNAc to core 2 O-linked oligosaccharides.
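The relationship between the core 2 acceptor and the LacdiNAc-containing products can be written out in the linear notation used in this paper. The following is only a minimal sketch for keeping the structures straight: the function and dictionary names are hypothetical, the unmodified core 2 string is the standard Galβ1,3(GlcNAcβ1,6)GalNAcα form, and the mapping of these strings onto the numbered structures of Fig. 1 is not attempted here.

```python
# Sketch: build linear-notation strings for the core 2 O-glycan family.
# All names here are illustrative assumptions, not part of the published methods.

def core2(branch: str = "GlcNAcβ1,6") -> str:
    """Core 2 O-glycan carrying the given β1,6-linked branch on GalNAcα-Ser/Thr."""
    return f"Galβ1,3({branch})GalNAcαSer/Thr"


def lacdinac_branch(cap: str = "") -> str:
    """β1,6 branch after β1,4-linked GalNAc addition, with an optional terminal cap."""
    return f"{cap}GalNAcβ1,4GlcNAcβ1,6"


# Terminal modifications named in the text and the enzymes reported to add them.
CAPS = {
    "unmodified": "",            # product of β4GalNAc-T3 or β4GalNAc-T4 alone
    "sulfated": "SO4-4-",        # GalNAc-4-ST1 (CHST8) or GalNAc-4-ST2 (CHST9)
    "sialylated": "NeuNAcα2,6",  # ST6Gal1
}

# One string per member of the LacdiNAc-containing family.
FAMILY = {name: core2(lacdinac_branch(cap)) for name, cap in CAPS.items()}

if __name__ == "__main__":
    print("acceptor:", core2())
    for name, structure in FAMILY.items():
        print(f"{name}: {structure}")
```

Writing the branch separately from the core keeps the point of the paragraph visible: only the β1,6-linked GlcNAc arm is extended, while the Galβ1,3 arm is unchanged.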

DISCUSSION
Our current studies expand the role of protein-specific synthesis of LacdiNAc structures by β4GalNAc-T3 and β4GalNAc-T4 to O-linked oligosaccharides. The subsequent addition of SO4, α2,6-linked NeuNAc, or other substituents can produce a family of unique carbohydrate structures that may have important biological roles, as we have defined for N-linked oligosaccharides modified with LacdiNAc on glycoprotein hormones such as LH (8,36-38). We can now attribute the presence of both N-linked and O-linked oligosaccharides containing the LacdiNAc sequence on POMC to the same enzymes, β4GalNAc-T3 and/or β4GalNAc-T4. Although a number of glycoproteins bearing N-linked structures containing LacdiNAc have been described since we originally reported this structure on the glycoprotein hormone LH (51), POMC (1) and zona pellucida 3 (39) have to date remained the only glycoproteins reported to bear core 2 O-linked structures with the LacdiNAc sequence in vertebrates. Thus, the addition of GalNAc to core 2 O-linked structures may also be restricted to glycoproteins bearing a peptide motif such as the sequences we have described on the glycoprotein hormone α subunit (6,7) and CA6 (9,26), which are recognized by β4GalNAc-T3 and β4GalNAc-T4.
The carboxyl-terminal amino acid sequence of the hCGβ subunit contains four Ser residues that are modified with O-linked oligosaccharide structures when expressed in CHO cells (35). Adding this sequence to the carboxyl terminus of GLuc produced a substrate that contained only O-linked oligosaccharides and was efficiently secreted into the medium of cells following transfection. We have demonstrated that the carboxyl-terminal 19-amino acid sequence found on CA6 is recognized by β4GalNAc-T3 and β4GalNAc-T4 and can account for the protein-specific addition of GalNAc to N-linked oligosaccharides both in vitro and in vivo (26). Adding either the CA1-19 sequence or alternatively just the CA1-2 sequence to the carboxyl terminus of the CTP from hCGβ yielded chimeric glycoproteins that did and did not contain a determinant recognized by β4GalNAc-T3 and β4GalNAc-T4, respectively, and could be utilized to define transfer of GalNAc in vitro as well as in vivo. We have used similar constructs to show that efficient transfer of GalNAc to N-linked glycans is dependent on the presence of a peptide recognition determinant such as CA1-19 (26). Constructs containing portions of the CA1-19 sequence were not modified as efficiently as those containing the full CA1-19 sequence. The construct GLuc-α(PLESEE)CA1-10, which contains only the first 10 amino acids of the CA1-19 sequence, was a poor substrate for GalNAc transfer by either β4GalNAc-T3 or β4GalNAc-T4 when compared with GLuc-α(PLESEE)CA1-19 (26). The independence of peptide recognition and GalNAc transfer to N-linked oligosaccharides suggests that the peptide requirements will be sim-
Three isoforms of the core 2 β1,6-N-acetylglucosaminyltransferase, C2GnT1 (27), C2GnT2 (40), and C2GnT3 (41), have been identified and cloned. An O-glycomic analysis of wild-type mice and mice deficient in individual C2GnTs or all three C2GnTs was recently published (42). Core 2 structures (see Table 1, m/z 1024, in Ref. 42) that contain the LacdiNAc sequence were present, although not abundant, in the colon of the wild-type mice but not in the colon of C2GnT-deficient mice. β4GalNAc-T3 transcripts have been detected in human stomach and colon (11). Furthermore, LacdiNAc structures were reported to be present on surface mucous cells of the human stomach based on WFA staining (43). The distribution of β4GalNAc-T3 transcripts and LacdiNAc-bearing core 2 structures in the colon and stomach of mice and humans suggests that additional O-linked glycans bearing LacdiNAc will be identified in the future. The LacdiNAc sequence may, however, be confined to only those core 2 structures that also have an accessible recognition motif for β4GalNAc-T3.
O-Glycosylation of Ser and Thr residues with α-linked GalNAc is an abundant form of glycosylation. As is illustrated by the O-glycomic analysis done by Ismail et al. (42), the structures produced are complex. As many as 20 distinct isoenzymes have been identified that mediate the site-specific addition of GalNAc (44,45). An additional repertoire of transferases serves to build complex oligosaccharide structures on the O-linked GalNAc. Recently developed approaches have identified a rapidly growing number of glycoproteins that are O-glycosylated at specific sites that in some cases serve to modulate critical biological processes (46). The selective addition of LacdiNAc sequences to O-linked structures that have an associated recognition motif for either β4GalNAc-T3 or β4GalNAc-T4 and their subsequent modification with sulfate or NeuNAc provide a mechanism to produce unique structures at very specific locations in glycoproteins with O-linked oligosaccharides.
β4GalNAc-Ts that are either not protein-specific or have a specificity that differs from that of β4GalNAc-T3 and β4GalNAc-T4 have been identified using in vitro assays (47-49). Furthermore, not all glycoproteins bearing N- or O-linked oligosaccharides modified with LacdiNAc have readily identifiable recognition motifs similar to those we have described on α and CA6. Although it was possible to detect β4GalNAc-T activity in vitro in CHO cells, no evidence of LacdiNAc addition to glycoproteins expressed in CHO cells was obtained (48). However, expression of a β1,4GalNAcT cloned from Caenorhabditis elegans in CHO Lec8 cells resulted in LacdiNAc synthesis on multiple endogenous glycoproteins as well as the glycoprotein hormone α subunit (50). Until other β4GalNAc-Ts can be identified or cloned, it will be difficult to assess whether they are protein-specific; however, the approach we have taken using chimeric glycoprotein acceptors should make this problem more tractable in the future. More examples of glycoproteins bearing O-linked structures modified with LacdiNAc will be required to assess whether they are confined to specific glycoproteins and whether β4GalNAc-T3 and/or β4GalNAc-T4 are responsible.
The presence of the LacdiNAc sequence on N-linked oligosaccharides of glycoprotein hormones such as LH is of critical importance for their biology. In the case of LH, the structural features of the LacdiNAc determine the circulatory half-life of the hormone following its release into the circulation and, as a result, its potency in vivo (22,23,36). SO4-4-GalNAcβ1,4GlcNAcβ is recognized by the N-terminal Cys-rich domain of the mannose receptor in its dimeric form (18,19,21,23), whereas NeuNAcα2,6GalNAcβ1,4GlcNAcβ is recognized by the asialoglycoprotein receptor (24). The mannose receptor and the asialoglycoprotein receptor are highly abundant endocytic receptors that reside in endothelial cells and parenchymal cells of the liver, respectively. Glycoproteins bearing multiple O-linked structures terminating with LacdiNAc, SO4-4-GalNAcβ1,4GlcNAcβ, or NeuNAcα2,6GalNAcβ1,4GlcNAcβ may interact with the mannose receptor and the asialoglycoprotein receptor differently from glycoproteins bearing N-linked oligosaccharides with the same termini. In addition, it is quite possible that the LacdiNAc-containing O-linked structures may be recognized by other receptors and have quite different functions in vivo, such as mediating cell or matrix recognition.
Efficient transfer of GalNAc to O- and N-linked structures on specific glycoproteins by β4GalNAc-T3 and β4GalNAc-T4 is dependent on the presence of a peptide motif that is recognized by these transferases and the presence of the appropriate acceptor structure. Further modification of the LacdiNAc sequence by the addition of sulfate or NeuNAc also reflects the repertoire of GalNAc-4-STs and α2,6NeuNAc transferases being expressed. As a consequence, a unique family of LacdiNAc-containing structures can be added to specific glycoproteins bearing either O-linked or N-linked carbohydrates. We expect that additional glycoproteins bearing O-linked glycans with LacdiNAc structures will be identified in the future. The protein-specific synthesis of this unique family of O-linked structures makes it highly likely that, like their N-linked counterparts, they will be recognized by specific receptors with functional consequences.
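The dependencies summarized above can be restated as a small decision rule. The sketch below is a deliberately qualitative simplification, assuming that motif recognition and acceptor availability can be treated as yes/no properties; the class and function names are hypothetical, and the code does not model the graded transfer efficiencies observed with partial motifs.

```python
# Qualitative sketch of which LacdiNAc termini could appear on a glycoprotein.
# Boolean inputs are a simplifying assumption; transfer efficiency is graded in practice.
from dataclasses import dataclass

GALNAC_TRANSFERASES = {"β4GalNAc-T3", "β4GalNAc-T4"}
SULFOTRANSFERASES = {"GalNAc-4-ST1", "GalNAc-4-ST2"}
SIALYLTRANSFERASE = "ST6Gal1"


@dataclass(frozen=True)
class Glycoprotein:
    has_peptide_motif: bool  # recognition motif for β4GalNAc-T3/-T4 present
    has_acceptor: bool       # appropriate N-linked or core 2 O-linked acceptor present


def possible_termini(protein: Glycoprotein, expressed: set) -> set:
    """Return the LacdiNAc-type termini allowed by the rule described in the text."""
    # GalNAc transfer needs the motif, the acceptor, and at least one transferase.
    if not (protein.has_peptide_motif and protein.has_acceptor):
        return set()
    if not (expressed & GALNAC_TRANSFERASES):
        return set()
    termini = {"GalNAcβ1,4GlcNAcβ"}
    # Capping reflects whichever modifying enzymes the cell expresses.
    if expressed & SULFOTRANSFERASES:
        termini.add("SO4-4-GalNAcβ1,4GlcNAcβ")
    if SIALYLTRANSFERASE in expressed:
        termini.add("NeuNAcα2,6GalNAcβ1,4GlcNAcβ")
    return termini
```

For example, under these assumptions a protein with both motif and acceptor in a cell expressing β4GalNAc-T3 and GalNAc-4-ST1 would yield the unmodified and sulfated termini, whereas the same protein lacking the motif would yield none.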