Oligosaccharides containing beta 1,4-linked N-acetylgalactosamine, a paradigm for protein-specific glycosylation.

The carbohydrate moieties found on glycoproteins have long been recognized as having great potential to bear biologically important information. However, actual examples of systems in which oligosaccharides play defined physiological roles have remained limited. These oligosaccharides with known biologic functions typically have distinctive structural features and are generally confined to specific glycoproteins. Synthesis of structurally unique oligosaccharides on specific glycoproteins at defined times is essential if these structures are to fulfill their biologic purpose. Since cells produce many distinct oligosaccharides as newly synthesized glycoproteins pass through the endoplasmic reticulum and the Golgi, mechanisms are required to assure that the correct structures are added to the numerous glycoproteins being synthesized. Determining how synthesis of the vast array of oligosaccharides produced by each cell is regulated is essential for understanding the biologic importance of these complex structures. Asn-linked oligosaccharides arise by processing of a common precursor structure, which is transferred en bloc from dolichol to the nascent peptide chain in the endoplasmic reticulum (1). As a result Asn-linked oligosaccharides have a common core region and differ primarily in the number and location of their peripheral branches and terminal modifications. Since all newly synthesized glycoproteins pass through the same subcellular compartments and are exposed to the same transferases, structural differences in oligosaccharides on individual glycoproteins and/or at individual glycosylation sites must in some fashion reflect the influence of the protein moiety on one or more glycosyltransferases. This suggests that key glycosyltransferases recognize features encoded within the peptide as well as the oligosaccharide of the target glycoprotein. 
Among the three glycosyltransferases thus far demonstrated to display peptide as well as oligosaccharide recognition, UDP-glucose:glycoprotein glucosyltransferase, UDP-N-acetylglucosamine:lysosomal enzyme N-acetylglucosamine-1-phosphotransferase, and UDP-GalNAc:glycoprotein hormone β1,4-N-acetylgalactosaminyltransferase (β1,4-GalNAcT, reviewed in Ref. 2), one of the most extensively characterized is the β1,4-GalNAcT, which produces the terminal sequence GalNAcβ1,4GlcNAcβ1-R on glycoproteins that contain a specific peptide recognition determinant in addition to an appropriate oligosaccharide acceptor. The product of the β1,4-GalNAcT may be further modified by the addition of sulfate, sialic acid, or fucose, thus producing a range of unique oligosaccharide structures defined by the presence of β1,4-linked GalNAc as illustrated in Fig. 1. Each of these structures has the potential to be recognized by a specific receptor or binding protein and thus mediate a distinct biological function. As will become apparent below, the β1,4-GalNAcT is a key component of a well characterized system, which includes unique oligosaccharide structures, highly specific glycosyltransferases, and oligosaccharide-specific receptors. This is therefore an excellent model system for understanding protein-specific glycosylation.

Even though a number of glycoproteins bearing oligosaccharides with the sequence GalNAcβ1,4GlcNAcβ1-R have been described, this carbohydrate structural motif is not common in vertebrates. The vast majority of other glycoproteins produced by the tissues or cells that synthesize glycoproteins bearing β1,4-linked GalNAc do not contain β1,4-linked GalNAc but instead contain β1,4-linked Gal, indicating that the addition of β1,4-linked GalNAc is generally a highly protein-specific process. A β1,4-GalNAcT that can account for the specific modification of Asn-linked oligosaccharides is present in a limited number of tissues and cell lines, including those known to produce oligosaccharides with the β1,4-linked GalNAc motif (11). In contrast, β1,4-galactosyltransferase (β1,4-GalT), which transfers Gal in β1,4-linkage to virtually any terminal GlcNAc, is expressed at relatively high levels in virtually all vertebrate tissues and cell lines (12,13). Since the β1,4-GalNAcT and β1,4-GalT compete for the same oligosaccharide acceptors (Fig. 1), preferential addition of GalNAc to an oligosaccharide must reflect recognition of the protein bearing the oligosaccharide.
An in vitro model system was established for examining the protein specificity of the β1,4-GalNAcT using human chorionic gonadotropin (hCG). hCG, which is closely related to LH but is synthesized in the placenta (14), binds to the same receptor as LH. hCG contains the peptide recognition determinant utilized by the β1,4-GalNAcT but bears oligosaccharides that terminate with Siaα2,3Gal because neither the β1,4-GalNAcT nor the GalNAc-4-sulfotransferase (reactions 3 and 4, respectively, in Fig. 1) is expressed in human placenta (15). We established the existence of the peptide recognition determinant by comparing glycoproteins and glycopeptides bearing the identical oligosaccharide, GlcNAc2Man3GlcNAc2Asn (the product of reaction 1 in Fig. 1), as acceptors for the addition of either Gal or GalNAc by transferases present in pituitary extracts. Gal is added to each of the glycoproteins tested with the same apparent Km of 1-2 mM and the same catalytic efficiency (Vmax/Km). In contrast, the apparent Km for addition of GalNAc is markedly influenced by the protein moiety. For example, GalNAc is transferred to oligosaccharides on glycoproteins such as transferrin with an apparent Km of 1-2 mM, whereas transfer to the same oligosaccharides on hCG occurs with an apparent Km of 5-10 μM. Thus, the catalytic efficiency for addition of GalNAc to oligosaccharides on certain glycoproteins is 100- to 500-fold greater than for transfer to the same oligosaccharides on other glycoproteins, indicating the presence of a specific β1,4-GalNAcT recognition determinant on a particular subset of glycoproteins including hCG (15-17). We have located two recognition determinants on hCG, one on the α subunit and one on the β subunit (17), and have established a number of features that are critical for recognition by the β1,4-GalNAcT (18).
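The arithmetic behind the fold difference in catalytic efficiency can be sketched as follows. This is only an illustration, not the authors' analysis: it assumes the hCG Km is in the micromolar range (5-10 μM), as the quoted 100- to 500-fold figure implies, and it assumes equal Vmax for both acceptors unless a hypothetical `vmax_ratio` is supplied. The function and constant names are invented for the example.

```python
def fold_enhancement(km_without, km_with, vmax_ratio=1.0):
    """Ratio of catalytic efficiencies (Vmax/Km) with vs. without the determinant.

    vmax_ratio is Vmax(with) / Vmax(without); the default of 1.0 is an
    assumption of this sketch, not a value reported in the text.
    """
    return vmax_ratio * km_without / km_with


# Apparent Km ranges quoted in the text, in mol/L.
# The hCG range is assumed to be micromolar.
KM_TRANSFERRIN = (1e-3, 2e-3)  # 1-2 mM, no recognition determinant
KM_HCG = (5e-6, 10e-6)         # 5-10 uM, recognition determinant present

# Smallest enhancement: lowest Km without determinant vs. highest Km with it.
lowest = fold_enhancement(KM_TRANSFERRIN[0], KM_HCG[1])
# Largest enhancement: highest Km without determinant vs. lowest Km with it.
highest = fold_enhancement(KM_TRANSFERRIN[1], KM_HCG[0])

print(f"Km-driven enhancement: {lowest:.0f}- to {highest:.0f}-fold")
```

Under the equal-Vmax assumption the Km ranges alone account for roughly a 100- to 400-fold difference, so the upper end of the quoted range is not reproduced by the Km ratio by itself.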
The recently solved crystal structure of hCG (19, 20), including four Asn-linked oligosaccharides, illustrated in Fig. 2, has greatly enhanced our understanding of the β1,4-GalNAcT recognition determinant. It is immediately apparent that each of the four Asn-linked oligosaccharides on hCG is nearly as large as the peptide portion of either the α or the β subunit, and only the innermost 2 or 3 sugars are in direct contact with the peptide. Since the oligosaccharides are highly mobile rather than being fixed in space, their molecular dimensions are even greater relative to the peptide when considered in real time. It is also apparent that the peripheral sugars, Sia-Gal or SO4-GalNAc, are distant from the peptide, mobile, and highly exposed, as has recently been established for at least one of the α subunit oligosaccharides by high resolution multinuclear NMR. The structures of LH, TSH, and follitropin (FSH), which have the same α subunit and highly homologous β subunits, are likely to be similar to that of hCG. The regions of the α subunit (Fig. 2, A and B) and the β subunit (Fig. 2B only) that include residues essential for recognition have been highlighted in yellow. Residues in the α subunit critical for recognition by the β1,4-GalNAcT include the basic amino acids in the sequence Pro40-Leu41-Arg42-Ser43-Lys44-Lys45 (18). These residues are present within two turns of an α-helix, forming a cluster of basic residues that projects out from the α subunit. The region of the β subunit that we have proposed to be critical for recognition is found at the N terminus and consists of the sequence Pro4-Leu5-Arg6-Pro7-Arg8-Cys9-Arg10 (17). We have shown that Arg6 is essential for recognition and are currently examining the contribution of other residues within this sequence.
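To make the two determinant sequences easier to compare, the short sketch below lists the basic residues in each one. It is purely illustrative: the one-letter sequences are transcribed from the residues named above, the dictionary and function names are invented for the example, and treating only Arg and Lys as basic is a simplification of this sketch.

```python
# One-letter to three-letter codes for the residues that occur in the two motifs.
THREE_LETTER = {"P": "Pro", "L": "Leu", "R": "Arg", "S": "Ser", "K": "Lys", "C": "Cys"}

# Residues counted as basic here (Arg and Lys only; an assumption of this sketch).
BASIC = {"R", "K"}

# (sequence, position of first residue) as given in the text.
DETERMINANTS = {
    "alpha": ("PLRSKK", 40),   # Pro40-Leu41-Arg42-Ser43-Lys44-Lys45
    "beta": ("PLRPRCR", 4),    # Pro4-Leu5-Arg6-Pro7-Arg8-Cys9-Arg10
}


def basic_residues(sequence, first_position):
    """Return labels such as 'Arg42' for every basic residue in the sequence."""
    return [
        f"{THREE_LETTER[aa]}{first_position + offset}"
        for offset, aa in enumerate(sequence)
        if aa in BASIC
    ]


for subunit, (sequence, start) in DETERMINANTS.items():
    print(subunit, basic_residues(sequence, start))
```

Both sequences begin with Pro-Leu-Arg and each contains three basic residues, which is consistent with the description of a projecting basic cluster.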
Recognition of the peptide determinants in hCG by the β1,4-GalNAcT does not require the maintenance of tertiary structure since the separated native or reduced and alkylated α and β subunits of CG are recognized by the β1,4-GalNAcT (16). GalNAc is transferred to the oligosaccharides on glycopeptide fragments containing as few as 23 amino acids with the same apparent Km as the native glycoprotein hormone subunits from which they were derived (17). Each peptide recognition determinant is capable of directing transfer of GalNAc to multiple oligosaccharides since both oligosaccharides on the separated native α and β subunits are modified with GalNAc in vitro. The β1,4-GalNAcT most likely simultaneously binds to the peptide recognition determinant and the oligosaccharide being modified, thereby reducing the apparent Km. Even though the recognition determinants on both the α and β subunits of hCG are in close proximity within the linear protein sequence to a glycosylation site, this is not the case for the more distal glycosylation sites on either subunit. It is the relationship of the recognition determinant to the oligosaccharide acceptor in three-dimensional space that is critical for determining which oligosaccharides will or will not be modified with GalNAc in the native protein. Although the recognition determinants on the isolated α and β subunits are both functional, we have not yet established if both contribute to recognition by the β1,4-GalNAcT in the dimeric form of the hormone. The recognition determinant on the α subunit (see Fig. 2) is in close proximity to the oligosaccharide at Asn52 of the α subunit and to both Asn-linked oligosaccharides on the β subunit. However, the oligosaccharides on the β subunit are found on the opposite surface of CG and may not be fully accessible to the β1,4-GalNAcT when it is bound to the recognition determinant on the α subunit. As a result, it is the recognition determinant near the N terminus of the β subunit (Fig. 2B) that may mediate addition of GalNAc to one or both β subunit oligosaccharides. The oligosaccharide at Asn78 of the α subunit is less extensively substituted with GalNAc than the one at Asn52 (22). This oligosaccharide is more distant from either recognition determinant (Fig. 2), suggesting the GlcNAc termini of this oligosaccharide are not in sufficiently close proximity to the recognition determinant in three-dimensional space to be efficiently modified.
The evidence is compelling that the peptide recognition determinants on the glycoprotein hormones reduce the apparent Km for GalNAc addition to the oligosaccharide acceptor by enhancing binding of the β1,4-GalNAcT. It is likely that peptide recognition determinants will be found for other glycosyltransferases that will enhance transfer of sugars to oligosaccharide acceptors on specific glycoproteins and/or at individual glycosylation sites. Since it is possible to alter the catalytic efficiency of GalNAc transfer in vitro by altering specific residues within the region recognized by the β1,4-GalNAcT, it will soon be possible to determine the impact of such changes in vivo.

The Unique Oligosaccharide Structural Motif and the Peptide Recognition Determinant Are Conserved among Glycoprotein Hormones from All Classes of Vertebrates
All classes of vertebrates are known to produce glycoprotein hormones closely related to those found in mammals (23). Furthermore, it is the region of the α subunit containing the recognition determinant for the β1,4-GalNAcT that is the most highly conserved among vertebrates. As a result, residues critical for recognition by the β1,4-GalNAcT are found in α subunits from all classes of vertebrates (24). This same region of the α subunit is essential for activation of adenylate cyclase activity following binding to the respective G-protein-coupled hormone receptor (25). We have found that the β1,4-GalNAcT and the GalNAc-4-sulfotransferase, which together account for the synthesis of SO4-4GalNAcβ1,4GlcNAcβ1-R termini, are expressed in the pituitaries of vertebrates ranging from fish to humans. Furthermore, the oligosaccharides on glycoprotein hormones from all classes of vertebrates terminate with GalNAc-4-SO4 (24). This is the first instance in which a specific oligosaccharide structural motif has been shown to be maintained on a family of glycoproteins from different classes of vertebrates. Thus, the unique carbohydrate structural motif, like the sequence and structure of the glycoprotein hormone peptides, has been conserved during the evolution of vertebrate species.
FIG. 1. Synthesis of oligosaccharides containing β1,4-linked GalNAc or Gal. The product of reaction 1 is a common synthetic intermediate, which can be further modified by the addition of β1,4-linked Gal, reaction 2, or β1,4-linked GalNAc, reaction 3. The β1,4-GalT does not display any peptide specificity and will transfer Gal to any β-linked terminal GlcNAc with the same catalytic efficiency. In contrast, the β1,4-GalNAcT recognizes features encoded in the peptide (i.e. the β1,4-GalNAcT recognition determinant) as well as the terminal β-linked GlcNAc. In the presence of the recognition determinant the catalytic efficiency for GalNAc addition to the same oligosaccharide is as much as 500-fold greater than for addition of either Gal or addition of GalNAc in the absence of the recognition determinant (15,17,18). α1,3-Fucosyltransferase will transfer fucose to GlcNAc in the presence of either β1,4-linked Gal (reaction 8) or GalNAc (reaction 5) (51), and α2,6-sialyltransferase will transfer sialic acid to either β1,4-linked Gal (reaction 7) or GalNAc (reaction 6) (52). In contrast, α2,3-sialyltransferase will add sialic acid to β1,4-linked Gal (reaction 9) but not GalNAc (37), while GalNAc-4-sulfotransferase will add sulfate to the 4-hydroxyl of β1,4-linked GalNAc (reaction 4) but not to β1,4-linked Gal (53,54). However, all of the oligosaccharides containing β1,4-linked GalNAc are distinct from those containing Gal and are largely confined to glycoproteins containing the β1,4-GalNAcT recognition determinant, which are synthesized by cells expressing the β1,4-GalNAcT. Symbols in the figure denote GalNAc, GlcNAc, Man, Gal, α2,6-Sia, α2,3-Sia, and Fuc; the shaded rectangle denotes the recognition determinant.
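The terminal modifications listed in the Fig. 1 legend can be tabulated to show which additions are open to a GalNAc-terminated versus a Gal-terminated branch. The sketch below is a plain lookup table transcribed from that legend; the variable and function names are invented for the example, and it models only which reactions are permitted, not their rates.

```python
# Reactions that follow addition of the terminal sugar, from the Fig. 1 legend.
# Keys are reaction numbers; values are
# (enzyme, terminal sugar that must be present, group added).
TERMINAL_REACTIONS = {
    4: ("GalNAc-4-sulfotransferase", "GalNAc", "SO4"),
    5: ("alpha1,3-fucosyltransferase", "GalNAc", "Fuc"),
    6: ("alpha2,6-sialyltransferase", "GalNAc", "alpha2,6-Sia"),
    7: ("alpha2,6-sialyltransferase", "Gal", "alpha2,6-Sia"),
    8: ("alpha1,3-fucosyltransferase", "Gal", "Fuc"),
    9: ("alpha2,3-sialyltransferase", "Gal", "alpha2,3-Sia"),
}


def modifications_for(terminal_sugar):
    """Return the reaction numbers available once the given sugar is in place."""
    return sorted(
        number
        for number, (_enzyme, sugar, _group) in TERMINAL_REACTIONS.items()
        if sugar == terminal_sugar
    )


def exclusive_groups():
    """Return groups that are added after only one of the two terminal sugars."""
    sugars_by_group = {}
    for _enzyme, sugar, group in TERMINAL_REACTIONS.values():
        sugars_by_group.setdefault(group, set()).add(sugar)
    return {
        group: next(iter(sugars))
        for group, sugars in sugars_by_group.items()
        if len(sugars) == 1
    }


print("GalNAc branch:", modifications_for("GalNAc"))
print("Gal branch:", modifications_for("Gal"))
print("Exclusive additions:", exclusive_groups())
```

In this table, sulfate is the only addition restricted to the GalNAc branch and α2,3-linked sialic acid the only one restricted to the Gal branch, which matches the contrast drawn in the legend.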
LH is a major product of the gonadotroph and is one of only a few proteins produced by gonadotrophs or other cells in the pituitary that terminate with GalNAc-4-SO4. The expression of β1,4-GalNAcT and GalNAc-4-sulfotransferase in the gonadotroph is modulated in parallel to LH levels in response to circulating levels of estrogen (26). As estrogen levels fall, the expression of β1,4-GalNAcT and GalNAc-4-sulfotransferase increases in concert with increased synthesis of LH. The coordinate regulation of LH synthesis and β1,4-GalNAcT expression assures that the oligosaccharides on LH, but not other glycoproteins produced in the gonadotroph, always terminate with GalNAc-4-SO4. In contrast, β1,4-GalNAcT activity in other tissues including the submaxillary gland and kidney is not responsive to estrogen levels. Conservation of these sulfated oligosaccharide structures during evolution in conjunction with hormonal regulation of β1,4-GalNAcT expression in gonadotrophs but not other cells further supports the view that these sulfated oligosaccharides play a central role in the biology of the glycoprotein hormones.

The Biological Significance of GalNAc-4-SO4 for the Glycoprotein Hormones
Consistent with the high degree of regulation seen for the synthesis of sulfated oligosaccharides on LH, these oligosaccharides have been found to mediate a crucial biological function. We have shown that the sulfated oligosaccharides on LH regulate its circulatory half-life following release into the blood (27). These oligosaccharides are recognized by a receptor in hepatic endothelial cells and Kupffer cells that is specific for the terminal sequence SO4-4GalNAcβ1,4GlcNAcβ1,2Manα1-R (28). Upon binding to the GalNAc-4-SO4 receptor, the hormone is rapidly internalized and transported to lysosomes where it is degraded. The receptor is plentiful, with 500,000 binding sites detectable at the surface of each endothelial cell, and has an apparent Km of 1.63 × 10⁻⁷ M for LH. The rapid and specific clearance of LH from the circulation on the basis of its terminal glycosylation was initially unexpected since rapid clearance of the hormone reduces its potency to induce ovulation following a single intravenous injection (27). This seeming contradiction is resolved upon consideration of the properties of the LH/CG receptor in the ovary and the hormonal cycle. The LH/CG receptor is a member of the seven-transmembrane domain G-protein-coupled receptor family. Upon hormone binding the receptor is activated and cAMP is produced; however, at the same time, hormone binding causes rapid inactivation and internalization of the receptor (29). As a result, continuous stimulation would result in the entire population of LH/CG receptors becoming refractory to further activation. Thus, during the 24-48-h pre-ovulatory surge in circulating LH levels, the LH/CG receptor would not be maximally activated due to down-regulation. However, circulating LH levels rise and fall in a pulsatile manner. During the pre-ovulatory surge it is the frequency and amplitude of these pulses that increase (30,31). This pulsatile rise and fall reflects both the episodic release of LH from granules and its rapid clearance from the blood. Other hormones such as FSH are also released episodically but have a long half-life and do not display this pulsatile rise and fall in blood levels. We have therefore proposed that this pulsatile rise and fall in LH levels is essential to obtain maximal stimulation of the LH/CG receptor. Key to the pulsatile appearance of LH in the circulation is its rapid clearance from the bloodstream mediated by its oligosaccharide component.
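The link between clearance rate and pulsatility can be illustrated with a toy model of episodic release followed by first-order clearance. This is a minimal sketch under stated assumptions rather than a physiological model: the half-lives (20 and 240 min), the 60-min pulse interval, and the unit pulse size are hypothetical values chosen for illustration, since the text gives no such numbers, and the function names are invented for the example.

```python
def simulate(half_life_min, pulse_interval_min=60, pulse_amount=1.0, total_min=1440):
    """Return per-minute hormone levels for bolus release with first-order clearance.

    All parameter values are hypothetical; they are not taken from the text.
    """
    decay_per_min = 0.5 ** (1.0 / half_life_min)  # fraction remaining after 1 min
    level = 0.0
    levels = []
    for minute in range(total_min):
        if minute % pulse_interval_min == 0:
            level += pulse_amount      # episodic release from storage granules
        levels.append(level)
        level *= decay_per_min         # clearance over the following minute
    return levels


def peak_to_trough(levels, pulse_interval_min=60):
    """Ratio of highest to lowest level within the final pulse cycle."""
    last_cycle = levels[-pulse_interval_min:]
    return max(last_cycle) / min(last_cycle)


# Hypothetical short and long half-lives, same release pattern.
fast = peak_to_trough(simulate(half_life_min=20))
slow = peak_to_trough(simulate(half_life_min=240))
print(f"peak/trough with rapid clearance: {fast:.2f}")
print(f"peak/trough with slow clearance:  {slow:.2f}")
```

With identical episodic release, the rapidly cleared hormone swings widely between pulses while the slowly cleared one stays nearly flat, which is the qualitative contrast drawn above between LH and hormones with a long half-life such as FSH.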
The crucial role mediated by these oligosaccharides is highlighted by the fact that a number of animal species, including humans and horses, synthesize a glycoprotein hormone, CG, in their placenta during the early stages of pregnancy. Equine CG and LH arise from the same gene and have identical peptide sequences (32,33). The Asn-linked oligosaccharides on equine CG terminate with Siaα2,3Galβ1,4GlcNAcβ1-R (34,35) while those on equine LH terminate with SO4-4GalNAcβ1,4GlcNAcβ1-R (34,36). Consistent with the presence of sialic acid-bearing oligosaccharides, CG has a long circulatory half-life. Thus equine CG and LH are different glycoforms of the same protein, which we have shown differ in their rate and site of clearance from the circulation (34). Furthermore, LH is stored in granules and released episodically into the circulation in response to a releasing factor while CG, which is not stored in granules, is released continuously from placental trophoblasts. Thus, the major difference between LH, the hormone of the ovulatory cycle, and CG, the hormone of pregnancy, is the difference in their circulatory half-lives, which results in episodic and continual stimulation of the LH/CG receptor, respectively.
FIG. 2. Structure of hCG (19,20). The α subunit and its oligosaccharides are colored blue and green, respectively. The β subunit and its oligosaccharides are colored gray and magenta, respectively. The residues constituting the recognition determinant on the α subunit, Pro40-Leu41-Arg42-Ser43-Lys44-Lys45, are found within two turns of an α helix and are shown in yellow in both views (panels A and B) while the residues proposed to constitute the recognition determinant present on the β subunit, Pro4-Leu5-Arg6-Pro7-Arg8-Cys9-Arg10, are highlighted in yellow only in the view shown in panel B. Residues 94-115 (21,38) of the β subunit are colored red and have been proposed to dictate receptor specificity for the different glycoprotein hormones. Differences in this region may influence β1,4-GalNAcT interaction with the peptide recognition determinant in FSH as compared with LH/CG and TSH. Because the oligosaccharides at Asn52 of the α subunit project from the same surface as the recognition determinant on the α subunit while those at Asn13 and Asn30 of the β subunit project from the opposite surface, the second recognition determinant on the β subunit may be required to mediate efficient GalNAc addition to the oligosaccharides on both subunits of LH/CG.

Other Roles for Oligosaccharides Containing β1,4-Linked GalNAc
The sulfated oligosaccharides on LH illustrate how unique oligosaccharide structures can play crucial physiologic roles. We have defined and characterized many of the components of the physiological system involving these oligosaccharides, including 1) their precise structures; 2) the transferases responsible for the synthesis of these structures; and 3) a receptor that specifically recognizes these sulfated oligosaccharides and mediates a specific biological function. Our results, furthermore, demonstrate that this system involves a high degree of regulation and specificity.
Not surprisingly, many of these same elements are used at different times and under different circumstances for other biologic purposes. As was noted above, the number of glycoproteins known to contain oligosaccharides terminating with the sequence GalNAcβ1,4GlcNAcβ1-R has increased. The addition of SO4, α1,3-linked fucose, or α2,6-linked sialic acid (Fig. 1, products of reactions 4, 5, and 6, respectively) has the potential to produce three additional distinct and unique oligosaccharide structures, each of which can potentially be recognized by a specific receptor similar to the hepatic GalNAc-4-SO4-specific receptor. Different glycoforms of the same protein may also arise at different times during development or in response to hormonal status (26,41). Thus, these glycoforms may fulfill a variety of biological purposes.
The demonstration of protein-specific glycosylation by the β1,4-GalNAcT, which acts on the glycoprotein hormones, is particularly exciting because it provides a model for understanding how these and other distinct oligosaccharide structures are synthesized in a protein- and even site-specific manner. It also exemplifies a mechanism for the addition of unique structures at precise times to specific glycoproteins. This is essential for oligosaccharides whose biological roles involve encoding of specific information. Oligosaccharides are ideally suited for such a purpose because they are highly exposed and accessible at the surface of the proteins that bear them and because of their enormous structural diversity. It is likely that we have only gained a glimpse of the potential functions of carbohydrates thus far and that many new discoveries await those willing to embark on the study of this form of posttranslational modification.