X-ray Crystal Structure of Leukocyte Type Core 2 β1,6-N-Acetylglucosaminyltransferase

Leukocyte type core 2 β1,6-N-acetylglucosaminyltransferase (C2GnT-L) is a key enzyme in the biosynthesis of branched O-glycans. It is an inverting, metal ion-independent family 14 glycosyltransferase that catalyzes the formation of the core 2 O-glycan (Galβ1-3[GlcNAcβ1-6]GalNAc-O-Ser/Thr) from its donor and acceptor substrates, UDP-GlcNAc and the core 1 O-glycan (Galβ1-3GalNAc-O-Ser/Thr), respectively. Reported here are the x-ray crystal structures of murine C2GnT-L in the absence and presence of the acceptor substrate Galβ1-3GalNAc at 2.0 and 2.7 Å resolution, respectively. C2GnT-L was found to possess the GT-A fold; however, it lacks the characteristic metal ion binding DXD motif. The Galβ1-3GalNAc complex defines the determinants of acceptor substrate binding and shows that Glu-320 corresponds to the structurally conserved catalytic base found in other inverting GT-A fold glycosyltransferases. Comparison of the C2GnT-L structure with that of other GT-A fold glycosyltransferases further suggests that Arg-378 and Lys-401 serve to electrostatically stabilize the nucleoside diphosphate leaving group, a role normally played by a metal ion in GT-A structures. The use of basic amino acid side chains in this way is strikingly similar to that seen in a number of metal ion-independent GT-B fold glycosyltransferases and suggests a convergence of catalytic mechanism shared by both GT-A and GT-B fold glycosyltransferases.

Leukocyte type core 2 β1,6-N-acetylglucosaminyltransferase (C2GnT-L) is a cis-medial Golgi resident glycosyltransferase that catalyzes the conversion of the core 1 O-glycan to that of the core 2 structure (1, 2). Core 2 O-glycans have been shown to be key ligands in selectin-mediated lymphocyte homing and leukocyte rolling; lymphocyte L-selectins bind to endothelial cell O-glycans containing 6-sulfo sialyl Lewis x on core 2 and extended core 1 structures, whereas neutrophils expressing sialyl Lewis x on core 2 O-glycans bind to E- and P-selectins on endothelial cells (3-5). C2GnT-L is also of considerable interest in the study of tumor metastasis given that its expression is highly correlated with tumor progression in a number of cancers. It is overexpressed in colorectal, lung, and prostate cancer, and recent work has shown that transfection of C2GnT-L into a prostate cancer cell line leads to increased tumor size in an experimental tumor model (6-8).
Although glycosyltransferases have been grouped into 83 different families based on sequence similarity (9), only two major fold types, termed the GT-A and GT-B folds, have been observed (13). The GT-A fold (SCOP fold: nucleotide-diphospho-sugar transferases) can be described as a single domain α/β/α structure in which the mixed β-sheet typically consists of seven β-strands of topology 3, 2, 1, 4, 6, 5, 7 (strand 6 is anti-parallel). All of the metal ion-dependent glycosyltransferases for which structures have been determined have been of this type, and all have utilized nucleoside diphosphate sugars as donor substrates. In these structures Mn²⁺ or Mg²⁺ ions are coordinated by one oxygen atom of each of the two phosphate groups of the nucleoside diphosphate sugar donor. In addition, the metal ion is also coordinated by one or two of the side chain carboxyl groups of the "DXD" motif (13). This degenerate sequence motif is critical for metal ion binding and catalysis and has been thought to be a characteristic of the GT-A fold enzymes. The GT-B fold (SCOP fold: UDP-glycosyltransferase/glycogen phosphorylase) is characterized by two distinct α/β/α domains, both of which contain all parallel β-sheets. The catalytic site is located in a cleft between the two domains, and none of the donor substrate complexes determined show evidence of bound metal ions. The recently determined structure of Campylobacter jejuni sialyltransferase CstII (14) represents the first example of a metal ion-independent glycosyltransferase not belonging to the GT-B fold. It is a single domain α/β/α structure and, as such, was first described as being a variant of the GT-A fold (14). Analysis of the arrangement of its secondary structural elements and topology, however, shows that CstII represents a novel fold, and it has now been assigned to a fold of its own (SCOP fold: alpha-2,3/8-sialyltransferase CstII) (15).
A DALI search using CstII did not return other GT-A or GT-B fold glycosyltransferases, and to date, it is the only glycosyltransferase not represented by one of these two fold types.
Inverting glycosyltransferases employ an SN2 reaction mechanism where nucleophilic attack by the acceptor hydroxyl group leads naturally to an inversion of stereochemistry at the anomeric center of the donor substrate (16). The mechanistic basis for the retention of configuration shown by retaining glycosyltransferases, however, is less clear. Although a double displacement mechanism similar to that observed in retaining glycosidases may be possible (17), a front side SNi mechanism has now been proposed for a number of retaining glycosyltransferases (18-21). In any case, mechanisms for activating the nucleophile and stabilizing the leaving group are thought to be general features of glycosyltransferase-mediated catalysis. The use of a catalytic base (e.g. an Asp or a Glu) and a bound metal ion, respectively, provides well characterized examples of how this has been achieved. Many glycosyltransferases also show ordered sequential Bi Bi kinetic mechanisms where the donor substrate binds before the acceptor and the glycosylated acceptor leaves before the nucleotide product (22). Moreover, both structural and thermodynamic data are beginning to emerge to suggest that donor substrate binding enhances or promotes binding of the acceptor in these enzymes (22, 23).
Reported here are the x-ray crystal structures of murine C2GnT-L in the absence and presence of Galβ1-3GalNAc at 2.0 and 2.7 Å resolution, respectively. The structures show that although C2GnT-L possesses the canonical GT-A fold, it lacks the characteristic metal ion binding DXD motif. The acceptor substrate complex defines the determinants of acceptor binding, and structural alignment shows that Glu-320 corresponds to the residue shown to be the catalytic base in other inverting GT-A fold glycosyltransferases. Structural alignment further suggests that the role played by the Mn²⁺ or Mg²⁺ ion in the metal ion-dependent GT-A structures is served by the basic amino acid residues Arg-378 and Lys-401 in C2GnT-L. This use of basic amino acid side chains is strikingly similar to that observed in a number of GT-B fold glycosyltransferases and provides structural evidence for a convergence of catalytic mechanism shared by both GT-A and GT-B fold glycosyltransferases.

EXPERIMENTAL PROCEDURES
Protein Expression and Purification-A soluble form of C2GnT-L (residues 38-428) was expressed as an N-terminal protein A fusion protein from a stably transformed CHO cell line (24) kindly provided by Dr. J. Dennis. Protein production was scaled up in a Celligen 2.2 L New Brunswick bioreactor using the basket impeller containing 30 g of FibraCel disks (New Brunswick, M11769984). The bioreactor was run in perfusion mode using CHO-S-SFM II medium (Invitrogen, 12052-098) supplemented with 3% fetal bovine serum, 2.25 g/liter glucose, 1× nonessential amino acids (Invitrogen, 11140-050), 1 mg/liter aprotinin (Bioshop, 9087-70-1), and 1× penicillin-streptomycin (Invitrogen, 15140-148) at a media flow rate of ~2 liters/day. The harvested medium was concentrated 10-fold, and the protein A·C2GnT-L fusion protein was purified by IgG-Sepharose and Q-Sepharose ion exchange chromatography. The protein A affinity tag was removed by elastase digestion, and the liberated C2GnT-L fragment was further purified by Phenyl-Sepharose hydrophobic interaction chromatography, which separated the protein A tag from the monomeric and dimeric forms of C2GnT-L. The dimeric form was then purified on a Superdex 200 gel filtration column (GE Healthcare) in 10 mM Tris, pH 7.5, 50 mM KCl, 2 mM MgCl₂, and 0.02% NaN₃ and concentrated to 10 mg/ml.
Crystallization and Data Collection-For both native and selenomethionine-labeled C2GnT-L, crystals were grown using the hanging drop vapor diffusion method by mixing 1 μl of protein (10 mg/ml) with 1 μl of well solution (20% polyethylene glycol 4000, 0.1 M glycine, pH 9.0, 0.6 M LiCl) and 0.22 μl of 33% 1,2,3-heptanetriol. These crystals were used for streak seeding/macroseeding into clear equilibrated drops prepared as described above but at a protein concentration of 5 mg/ml. Crystals typically grew to a size of ~0.2 × 0.2 × 0.05 mm³ 1 week after seeding. Galβ1-3GalNAc complexed crystals were grown in a similar manner by mixing 1 μl of protein (10 mg/ml), 1 μl of well solution (20% polyethylene glycol 4000, 0.1 M glycine, pH 9.0, 0.6 M LiCl), 0.22 μl 1,2,3-heptanetriol, and 0.66 μl 30 mM Galβ1-3GalNAc (Toronto Research Chemicals, A152000). A 2.5 Å resolution multi-wavelength anomalous dispersion (MAD) data set from selenomethionine crystals and a 2.0 Å native data set were collected on beamline F2 at the Cornell High Energy Synchrotron Source. Data from crystals of the Galβ1-3GalNAc complex were collected to 2.5 Å resolution on beamline 19-BM at the Advanced Photon Source, Argonne National Laboratory. Data collection for all crystals was performed at 100 K using 20% polyethylene glycol 4000, 0.1 M glycine, 0.6 M LiCl, and 25% glycerol as a cryoprotectant. All data were processed using HKL/HKL2000 (26). Data collection statistics are summarized in Table 1.
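As a rough check on the drop composition described above, the following sketch estimates component concentrations immediately after mixing. It assumes that volumes are additive and ignores the subsequent vapor-diffusion equilibration, which concentrates the drop. The function name `drop_concentrations` and the dictionary keys are hypothetical labels chosen for illustration; they are not part of the reported protocol.

```python
def drop_concentrations(components):
    """Return the concentration of each component right after mixing.

    `components` maps a label to (volume, stock_concentration). All volumes
    must share one unit; volumes are assumed to be additive.
    """
    total_volume = sum(volume for volume, _ in components.values())
    return {
        label: volume * stock / total_volume
        for label, (volume, stock) in components.items()
    }


# Volumes (microliters) and stock concentrations for the acceptor complex drop,
# taken from the text. A stock value of 1.0 means "fraction of the added stock".
complex_drop = {
    "protein_mg_per_ml": (1.0, 10.0),
    "well_solution_fraction": (1.0, 1.0),
    "heptanetriol_stock_fraction": (0.22, 1.0),
    "acceptor_mM": (0.66, 30.0),
}

initial = drop_concentrations(complex_drop)
for label, value in initial.items():
    print(f"{label}: {value:.3f}")
```

Under these assumptions the drop starts at roughly 6.9 mM Galβ1-3GalNAc and 3.5 mg/ml protein; the concentrations reached after equilibration against the well solution will be higher.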

Structure Solution and Refinement-The selenomethionine-labeled P2₁ crystal form was solved by the MAD phasing method. Selenium positions were determined using SOLVE (27) and SHARP (28). Electron density maps were improved by histogram matching/solvent flattening using DM (29) and 4-fold noncrystallographic symmetry averaging using RAVE (30). Model building was performed using O (31), and the structure (2 dimers in the asymmetric unit) was then refined to an R-free of 28.8% (data not shown) using CNS 1.1 (32). One dimer from this model was then used as a molecular replacement search model to solve the C2 crystal form using CNS 1.1. The structure was refined to 2.0 Å resolution using a combination of rigid body refinement, simulated annealing, positional refinement, and individual B-factor refinement as implemented in CNS 1.1. The P1 crystal form of the Galβ1-3GalNAc complex was solved using the C2 dimer as the molecular replacement model and refined to 2.7 Å resolution as described for the C2 crystal form. Refinement statistics are summarized in Table 1.
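For readers unfamiliar with the R-free statistic quoted above, the sketch below computes the standard crystallographic residual, R = Σ| |Fobs| − |Fcalc| | / Σ|Fobs|, separately for the working set and for the reflections held out of refinement. It is a minimal illustration only: it assumes the calculated amplitudes are already on the same scale as the observed ones, the toy amplitudes are invented, and the function names are hypothetical. It does not reproduce how CNS computes these values internally.

```python
def r_factor(f_obs, f_calc):
    """Crystallographic residual: sum(| |Fo| - |Fc| |) / sum(|Fo|)."""
    if len(f_obs) != len(f_calc) or len(f_obs) == 0:
        raise ValueError("need two non-empty amplitude lists of equal length")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


def split_r_factors(f_obs, f_calc, is_free):
    """Return (R-work, R-free) given a per-reflection test-set flag."""
    work_obs, work_calc, free_obs, free_calc = [], [], [], []
    for fo, fc, free in zip(f_obs, f_calc, is_free):
        if free:
            free_obs.append(fo)
            free_calc.append(fc)
        else:
            work_obs.append(fo)
            work_calc.append(fc)
    return r_factor(work_obs, work_calc), r_factor(free_obs, free_calc)


# Invented amplitudes, for illustration only (not data from this study).
toy_obs = [100.0, 80.0, 60.0, 40.0]
toy_calc = [90.0, 84.0, 66.0, 30.0]
toy_free = [False, False, False, True]

r_work, r_free = split_r_factors(toy_obs, toy_calc, toy_free)
print(f"R-work = {r_work:.3f}, R-free = {r_free:.3f}")
```

Because the held-out reflections never influence refinement, R-free is the less biased of the two values, which is why it is the figure reported for the intermediate P2₁ model.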

RESULTS AND DISCUSSION
Overall Fold-The structure reported here is that of a soluble form of C2GnT-L (residues 38-428) lacking the N-terminal cytoplasmic (residues 1-16) and transmembrane (residues 17-34) regions (Fig. 1). A disulfide-linked dimer is found in the asymmetric unit, each monomer of which is composed of two distinct regions. The first region (residues 38-121) is composed solely of α-helices and likely corresponds to a stem that extends the catalytic domain into the lumen of the Golgi. It is relatively disordered, having an average temperature factor of 45.0 Å² with no interpretable electron density for residues 38-44 and 49-55. Both of the N-linked glycans reside in this stem region (Asn-58 and Asn-95), but interpretable electron density is seen only for the Asn-linked GlcNAc moiety in both cases. The second region, which corresponds to the catalytic domain (residues 122-428), is an α/β/α structure consisting of a central six-stranded mixed β-sheet with topology 3, 2, 1, 4, 6, 5 (all strands parallel except for strand 6) flanked on one side by α-helices and a small three-stranded β-sheet and on the other side by α-helices and small two- and three-stranded β-sheets (Fig. 1).
FIGURE 1. Stereo ribbon representation of C2GnT-L in complex with Galβ1-3GalNAc. The catalytic domain is illustrated in cyan, the internal six-stranded β-sheet is highlighted in red, and the putative stem region is illustrated in blue. Cysteine residues involved in disulfide bond formation are illustrated (gray carbon atoms, yellow sulfur atoms). All figures were prepared using SPOCK (57) and rendered using RASTER3D (58).
SEPTEMBER 8, 2006 • VOLUME 281 • NUMBER 36
The structure of the catalytic domain belongs to the GT-A fold, and a DALI search with C2GnT-L identified other GT-A glycosyltransferases, most significantly Rhodothermus marinus mannosylglycerate synthase (Z = 10.3) (20), Homo sapiens glucuronyltransferase I (Z = 9.9) (33), Bos taurus α-1,3-galactosyltransferase (Z = 9.5) (34), and Oryctolagus cuniculus N-acetylglucosaminyltransferase I (GnT I; Z = 9.0) (35). Within the catalytic domain, residues 122-240, which include β-strands 3, 2, 1, and 4 of the central β-sheet, contain the binding site of the nucleotide portion of the donor substrate (36). These residues also correspond to those found to be the most structurally conserved among GT-A structures. Residues 241-428, which include β-strands 6 and 5 of the central β-sheet, contain the acceptor substrate binding site. Among GT-A structures there is little structural similarity in this region other than β-strands 6 and 5 and α-helix 5 where the catalytic base resides (see below). The monomers of the observed dimer are connected by a disulfide bond between Cys-235 on each of the two monomers in the asymmetric unit (r.m.s.d. between monomers = 0.37 Å). However, because the sample is predominantly monomeric when initially produced and Cys-235 is unique to murine C2GnT-L (Fig. 2), it is possible that this dimer does not represent a physiologically relevant form of the enzyme. Monomers and dimers can be separated from each other by hydrophobic interaction chromatography and gel filtration chromatography, and only the dimeric form was found to crystallize. With the exception of Cys-235 all of the other cysteine residues are conserved among C2GnT-L sequences, and in the structure eight are found to form intramolecular disulfide bonds and one is unpaired. The observed disulfide bond pattern is identical to that determined by mass spectrometric analysis (37).
Two of the observed disulfide bonds (Cys-151-Cys-199 and Cys-372-Cys-381) connect structural elements within the catalytic domain, whereas the other two (Cys-59-Cys-413 and Cys-100-Cys-172) connect the putative stem region with the catalytic domain. The high temperature factor of the stem region and the lack of extensive protein-protein contacts between it and the catalytic domain suggest that its conformation and/or the observed disulfide bonds to it might differ in the full-length molecule, a form found to dimerize in the Golgi membrane (38). The remaining cysteine, Cys-217, is unpaired and is located in the donor substrate binding site (as discussed below in more detail). It is conserved in all mammalian C2GnT-L, C2GnT-M, and C2GnT-T sequences (Fig. 2) and is responsible for the oxidative inactivation shown by C2GnT-L (39).

Structure of C2GnT-L and an Acceptor Substrate Complex
Acceptor Substrate Binding Site-Comparison of the apo and acceptor-bound structures reveals that C2GnT-L does not undergo significant structural change upon binding of the acceptor substrate, Galβ1-3GalNAc; the apo and Galβ1-3GalNAc complex show an r.m.s.d. of only 0.4 Å on all Cα atoms. The disaccharide Galβ1-3GalNAc binds in a solvent-exposed cleft, and strong electron density is observed for both the nonreducing Gal and the reducing GalNAc moieties (Fig. 3). Both of the monosaccharide moieties are in the low energy ⁴C₁ chair conformation, and the φ/ψ values for the disaccharide (φ = 262.8 ± 2.9°, ψ(n−1) = 201.2 ± 7.8°, averaged over the four copies in the asymmetric unit) are very close to those of Galβ1-3GalNAc in its lowest energy conformation (40). Based on the position of the C-1 atom of the GalNAc moiety, the polypeptide portion of the physiological substrates would point directly into solution.
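The glycosidic torsion angles quoted above are dihedral angles defined by four consecutive atoms. The sketch below shows one common way to compute such an angle from Cartesian coordinates and to report it on a 0-360° scale, as in the text. Which four atoms define each glycosidic torsion depends on the convention adopted, which is not restated here, so the caller is assumed to supply them; the function name and the test coordinates are hypothetical.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def torsion_angle(p0, p1, p2, p3):
    """Torsion angle p0-p1-p2-p3 in degrees, reported in the range [0, 360)."""
    b0 = _sub(p0, p1)  # bond pointing from the second atom back to the first
    b1 = _sub(p2, p1)  # central bond
    b2 = _sub(p3, p2)  # bond from the third atom to the fourth
    length = math.sqrt(_dot(b1, b1))
    b1 = tuple(c / length for c in b1)
    # Project the two outer bonds onto the plane perpendicular to the central bond.
    v = tuple(a - _dot(b0, b1) * c for a, c in zip(b0, b1))
    w = tuple(a - _dot(b2, b1) * c for a, c in zip(b2, b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x)) % 360.0


# Idealized geometry used only to exercise the function.
quarter_turn = torsion_angle((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
reverse_turn = torsion_angle((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, -1, 1))
trans = torsion_angle((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
print(quarter_turn, reverse_turn, trans)
```

Reporting angles modulo 360° means a value such as −97.2° appears as 262.8°, so the same conformation may be listed with either sign convention in other sources.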
Acceptor substrate binding is mediated, in part, by several hydrogen bonds. Glu-320, which is conserved among all C2GnT sequences, makes a bidentate interaction, accepting hydrogen bonds from both the O-4 and the nucleophilic O-6 of the GalNAc moiety. Glu-320 corresponds to the structurally conserved catalytic base shown through mutagenesis to be critical for activity in other GT-A fold glycosyltransferases (36, 41). The importance of the interaction between Glu-320 and O-6 is evidenced by the fact that removal of O-6 leads to a significant reduction in substrate binding affinity (42, 43). In addition to donating a hydrogen bond to Glu-320, the O-4 hydroxyl simultaneously accepts a hydrogen bond from Arg-254, a strong indication that this hydroxyl plays an important role in acceptor substrate binding as has been reported (42). Glu-243 also makes a bidentate interaction, accepting hydrogen bonds from both the O-4 and O-6 of the Gal moiety. In this case, the interaction with O-6 has been found to be critical, as the 6-deoxy acceptor analog, D-Fucβ1-3GalNAc, is a very poor substrate (1). Tyr-358 also appears to play an important role in binding, as it simultaneously accepts a hydrogen bond from the GalNAc NH and donates one to the Gal O-2, in this way bridging the two monosaccharide moieties of the acceptor. Lys-251 makes a hydrogen bond to the glycosidic oxygen of the acceptor saccharide, and acceptor binding is further stabilized by a stacking interaction between Trp-356 and both the Gal and GalNAc moieties of the acceptor. Stacking interactions of this type are characteristic of galactose-binding proteins (44).
Analysis of mammalian C2GnT-L, C2GnT-M, and C2GnT-T sequences revealed that although these enzymes show only ~50% sequence similarity, five of the six amino acids found to be involved in direct acceptor substrate interactions are conserved (Fig. 2). The sole difference occurs at Tyr-358 in C2GnT-L, a residue that is conserved in C2GnT-T (Tyr-373) but not in C2GnT-M (Gly-368). Interestingly, C2GnT-L and C2GnT-T both exhibit a strict specificity for Galβ1-3GalNAc-R (2, 12), whereas C2GnT-M can utilize (in addition to Galβ1-3GalNAc-R) GlcNAcβ1-3GalNAc-R (termed core 4 activity) and GlcNAcβ1-3Galβ1-R (termed I branching activity) (11). The observation that C2GnT-M can also utilize acceptor substrates containing GlcNAc at the nonreducing terminus can be explained by assuming that it binds its acceptors in a very similar orientation to that observed here for the C2GnT-L·Galβ1-3GalNAc complex. Because Glu-243 makes an edge-on bidentate interaction with O-4 and O-6 of the galactose moiety (see Fig. 3), only a slight reorientation of the ring and/or Glu-243 would be required to accommodate the change from an axial configuration at O-4 in Gal to an equatorial one in GlcNAc. Thus, the ability to accommodate an N-acetylated sugar at this position can be explained by the amino acid difference at Tyr-358. As shown in Fig. 3, the hydroxyl group of Tyr-358 in C2GnT-L donates a hydrogen bond to the O-2 of the galactose moiety, an interaction that is likely to sterically preclude the binding of substrates with an N-acetyl group at this position. The change to Gly-368 observed in C2GnT-M would generate the volume required to accommodate GlcNAc-containing substrates such as GlcNAcβ1-3GalNAc and provides an explanation for the core 4 and I branching activity exhibited by C2GnT-M.
The C2GnT-L complex reported here provides one of the few examples of an acceptor complex determined in the absence of the donor substrate or nucleotide product. Moreover, comparison of the apo and acceptor-bound forms shows that the C2GnT-L acceptor binding site is essentially preformed. This is to be contrasted with a number of GT-A fold glycosyltransferases, many showing Bi Bi kinetic mechanisms, where donor substrate binding leads to the ordering of flexible loops that in turn help to form the acceptor binding site (22). Nevertheless, it should be emphasized that even ordered sequential kinetics do not preclude the possibility that the acceptor could bind to the enzyme, in the absence of donor, to form what would necessarily be a nonproductive complex (45). This fact is illustrated by the x-ray crystal structure of the β1,4-galactosyltransferase I·α-lactalbumin·glucose ternary complex, which shows that high concentrations of the acceptor substrate alone can promote both loop ordering and acceptor binding (46). If C2GnT-L, like C2GnT-M (47), possesses an ordered sequential kinetic mechanism then the C2GnT-L·acceptor substrate complex reported here, like that of the galactosyltransferase I·α-lactalbumin·glucose structure, would be expected to represent a complex incapable of productively binding the donor substrate.

JOURNAL OF BIOLOGICAL CHEMISTRY 26697
Donor Substrate Binding Site-As discussed above, the ~120 N-terminal residues of the catalytic domain (residues 122-240 in C2GnT-L) show the highest degree of structural similarity among the GT-A fold glycosyltransferases. This region contains the main determinants of donor substrate binding and includes β-strands 3, 2, 1, and 4 of the central β-sheet, as well as the type I β-turn connecting β-strands 4 and 4′. Alignment of 28 Cα atoms from residues in the four β-strands gives an r.m.s.d. of 1.5 Å between mouse C2GnT-L and rabbit GnT I. The type I β-turn is of particular interest, because in the GT-A structures determined to date, the i to i + 2 residues of the turn contain the metal ion-coordinating DXD motif. As shown in Fig. 4, the β-turn in C2GnT-L is essentially identical in conformation to that seen in GnT I; the carbonyl oxygen atom of Cys-217 accepts the characteristic hydrogen bond from the backbone NH of Asp-220, the i + 3 residue of the turn. In contrast, the turn does not possess a DXD motif (²¹¹EDD²¹³ in GnT I); the corresponding residues in C2GnT-L are ²¹⁷CGM²¹⁹, and in addition, these residues are not conserved among the members of the GT-14 family. That C2GnT-L and the other GT-14 family members lack the DXD motif, and in particular an Asp residue in the third position, is certainly consistent with the fact that the GT-14 family members C2GnT-L, C2GnT-M, C2GnT-T (1, 2, 11, 12), and protein O-xylosyltransferase (48) are active in the absence of a metal ion. Comparison of the structures further shows that Arg-378 and Lys-401 of C2GnT-L are positioned to interact with the donor substrate moiety corresponding to that of the UDP-GlcNAc β-phosphate in the GnT I·UDP-GlcNAc complex (Fig. 5). These basic residues are highly conserved among all members of the GT-14 family and conceivably play the role served by the metal ion in all of the other GT-A structures determined to date.
In the transition state they would electrostatically counter the negative charge formed on the nucleoside diphosphate leaving group (i.e. UDP in these N-acetylglucosaminyltransferases). Interestingly, Lys-401 exists in the cis-peptide conformation, a rare occurrence for residues other than proline but when observed often indicative of an important functional role (49).
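The Cα r.m.s.d. values quoted in this section (for example, 1.5 Å over 28 aligned Cα atoms) summarize how far paired atoms lie from one another after superposition. The sketch below computes that quantity for two coordinate lists, assuming the pairing has already been chosen and the structures have already been superimposed; the least-squares fitting step itself is not shown, and the program used for the alignments in this work is not assumed. The function name and coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired, already superimposed atoms."""
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Invented coordinates (in Å) used only to exercise the function.
shifted = rmsd([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
               [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)])
uneven = rmsd([(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)],
              [(3.0, 0.0, 0.0), (0.0, 4.0, 0.0)])
print(shifted, uneven)
```

Because the deviations are squared before averaging, a few poorly matched atoms raise the r.m.s.d. disproportionately, which is one reason alignments are often restricted to a conserved core such as the four β-strands used here.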
Despite these differences in structure a number of key residues, closely aligned structurally, show that the determinants of UDP-GlcNAc binding are very similar between C2GnT-L and GnT I (Fig. 5). Asp-155 of C2GnT-L is well aligned with Asp-144 of GnT I, a residue conserved among many GT-A glycosyltransferases and in the GnT I complex found to make a key hydrogen bond to the N-3 atom of the uracil moiety. Also shown is the close alignment of the backbone carbonyl group (Ile-113 in GnT I and Val-128 in C2GnT-L) found to accept a hydrogen bond from O-3′ of the ribose moiety in the GnT I complex. Moreover, the apolar pocket formed by Leu-269 and Leu-331 that interacts with the GlcNAc N-acetyl methyl group in the GnT I complex is clearly replaced by Val-354 and Val-380 in C2GnT-L. In addition to these determinants of UDP-GlcNAc binding, Glu-320 and Asp-291, the presumed catalytic bases in C2GnT-L and GnT I, respectively, are closely aligned in three-dimensional space. As mentioned above, the nucleophilic C-6 hydroxyl group of the C2GnT-L acceptor substrate is likely activated for nucleophilic attack through interaction with Glu-320 and, as now shown in the overlay with the GnT I·UDP-GlcNAc complex (Fig. 5), is perfectly positioned for in-line attack on C-1 of the donor substrate. The structural similarities shared by the two binding sites certainly suggest that UDP-GlcNAc will be bound to C2GnT-L in a very similar fashion to that observed in the GnT I complex.
Examination of the electron density maps of both the apo and acceptor-bound C2GnT-L structures revealed additional electron density attached to the side chain sulfur atom of Cys-217. The extra electron density, consistent with oxidation to cysteine sulfenic acid (−SOH) (50), points directly into the donor substrate binding pocket and is in close proximity to the position corresponding to that of the ribose moiety of the donor substrate in the GnT I complex. This modification of the Cys-217 side chain presumably precludes donor substrate binding and as such provides an explanation for the observed Cys-217-dependent oxidative inactivation shown by C2GnT-L in vitro (39). Although this inactivation has been shown to be reversed by reducing agents (39), as expected for a cysteine sulfenic acid modification, the x-ray crystal structure determined from crystals soaked in UDP-GlcNAc and Tris(2-carboxyethyl)phosphine or dithiothreitol still shows the extra electron density and no evidence of donor substrate.
Insights into Glycosyltransferase Structure and Mechanism-The C2GnT-L structure reported here provides an example of a metal ion-independent GT-A glycosyltransferase where two basic amino acid residues, Arg-378 and Lys-401, are positioned to play the role of the metal ion in electrostatically stabilizing the UDP leaving group. To examine this further we compared the chemical/structural environment surrounding the C2GnT-L leaving group with that of the other metal ion-independent glycosyltransferases using structural alignments. Because these other structures are all of the GT-B or CstII fold type, we based our alignments on atoms from the saccharide and β-phosphate moieties of the nucleoside diphosphate sugar donor substrates. In this way the C2GnT-L and GnT I·UDP-GlcNAc structures, aligned as described above (based on the conserved N-terminal β-strands and as shown in Fig. 5), were simultaneously superimposed onto GT-B glycosyltransferase complexes using the GlcNAc and β-phosphate moieties of the GnT I·UDP-GlcNAc complex and the corresponding donor substrate moieties in the GT-B complexes. As shown in Fig. 6, for the overlay with the T4 β-glucosyltransferase·UDP-Glc complex (BGT) (51), Arg-195 and Arg-269 of BGT hydrogen bond to each of two different oxygen atoms of the β-phosphate of its bound UDP-Glc substrate in a manner similar to that observed in the C2GnT-L complex. Moreover, Arg-195 is virtually superimposed on Lys-401 of C2GnT-L, and both residues are very close to the Mn²⁺ ion in the GnT I complex. In other GT-B glycosyltransferases such as trehalose-6-phosphate synthase (52) and glycogen synthase (53), the side chains of basic amino acids are also found to interact with the β-phosphate oxygen atoms. Taken together these observations strongly suggest a convergence of catalytic mechanism shared by both GT-A and GT-B fold glycosyltransferases.
Site-directed mutagenesis experiments on the metal ion-independent human α1,6-fucosyltransferase have provided direct evidence for the critical role played by positively charged side chains in catalysis (54). Specifically, these experiments have shown that mutation of Arg-365 or Arg-366 leads to a dramatic decrease in enzymatic activity and that Arg-366 likely interacts directly with the β-phosphate moiety of the donor substrate, GDP-fucose. Arg-365 and Arg-366 are conserved among the metal ion-independent α1,6- and α1,2-fucosyltransferases, and an interaction with the β-phosphate clearly suggests a role similar to that observed in C2GnT-L and the GT-B glycosyltransferases as discussed above.
Not all metal ion-independent glycosyltransferase structures possess basic side chains positioned to interact with the β-phosphate moiety of the donor substrate. In some GT-B structures a positive helix dipole (55) and/or other hydrogen bond donors have been shown to interact with the negatively charged leaving group. Furthermore, in the sialyltransferase CstII, two tyrosine hydroxyl groups that donate hydrogen bonds to each of the two phosphate oxygen atoms of the CMP-NeuAc substrate have been proposed to assist in leaving group departure (14). As such, it might reasonably be expected that the "degree" to which the leaving group is stabilized would differ among glycosyltransferases, a suggestion that likely pertains to nucleophile activation as well. In fact, the recently determined structure of a GT-B-type flavonoid glucosyltransferase (56) provides an example of a novel catalytic triad-like activation of the nucleophile, a structural feature not shared by all members of this fold type. The extent to which the nucleophile is activated and the leaving group stabilized presumably reflects the reactivity of the substrates and the glycosidic linkage formed and as such may also be important in explaining how it is that both inverting and retaining glycosyltransferases are represented by both the GT-A and GT-B fold types. Unlike that which has been observed for retaining glycosidases (16), there is little evidence to support a double displacement mechanism for retaining glycosyltransferases; rather, a front side SN2-like attack (SNi) seems to be favored in both GT-A and GT-B glycosyltransferases (18-21). Because shifting the emphasis to leaving group stabilization would promote the more dissociative SNi mechanism required for retention of configuration, it might be relatively easy to convert from an inverting (SN2 mechanism) to a retaining glycosyltransferase and vice versa.
With the exception of a front side attack by the acceptor, there would be no requirement for a fundamentally different protein scaffold or donor substrate binding geometry, a suggestion consistent with the structures solved to date. It follows then that within fold types both retaining and inverting glycosyltransferases could have evolved from common ancestors, a suggestion also made recently by Flint et al. (20). Given the increased oxocarbenium ion character (and its increased reactivity) associated with the transition state of an SNi mechanism, it might also be expected that the need for nucleophile activation would differ. Interestingly, in both GT-A and GT-B fold retaining glycosyltransferases an Asp or Glu residue, positioned to activate the nucleophile in both types of inverting glycosyltransferases, has not been found. In both types of retaining glycosyltransferases it has been suggested that a donor substrate phosphate oxygen atom might serve to activate the nucleophile (19-21). Given the view that leaving group stabilization and nucleophile activation can vary significantly among glycosyltransferases, it follows that a particular "structural feature" should not necessarily be expected for any given fold type. Support for this suggestion is clearly provided by the C2GnT-L structure reported here, which now shows that metal ion dependence and the DXD motif are not intrinsic properties of the GT-A fold glycosyltransferases.