Identification of Active Site Residues in Glucosylceramide Synthase: A NUCLEOTIDE-BINDING/CATALYTIC MOTIF CONSERVED WITH PROCESSIVE β-GLYCOSYLTRANSFERASES*

Glucosylceramide synthase (GCS) transfers glucose from UDP-Glc to ceramide, catalyzing the first glycosylation step in the formation of higher order glycosphingolipids. The amino acid sequence of GCS was reported to be dissimilar from other proteins, with no identifiable functional domains. We previously identified His-193 of rat GCS as an important residue in UDP-Glc and GCS inhibitor binding; however, little else is known about the GCS active site. Here, we identify key residues of the GCS active site by performing biochemical and site-directed mutagenesis studies of rat GCS expressed in bacteria. First, we found that Cys-207 was the primary residue involved in GCS N-ethylmaleimide sensitivity. Next, we showed by multiple alignment that the region of GCS flanking His-193 and Cys-207 (amino acids 89–278) contains a D1,D2,D3,(Q/R)XXRW motif found in the putative active site of processive β-glycosyltransferases (e.g. cellulose, chitin, and hyaluronan synthases). Site-directed mutagenesis studies demonstrated that most of the highly conserved residues were essential for GCS activity. We also note that GCS and processive β-glycosyltransferases are topologically similar, possessing cytosolic active sites, with putative transmembrane domains immediately N-terminal

shown promising results for the reduction of glycolipid storage in Fabry disease mice and have been used to dramatically increase the cytotoxicity of anti-cancer drugs in tumor cells in culture (6-8). The PDMP-type drugs are effective inhibitors of GCS in vitro but are limited by a lack of oral availability, rapid elimination and/or degradation, and reported neurological side effects (9). Another GCS inhibitor, N-butyldeoxynojirimycin, is orally effective and has been shown to decrease glycosphingolipid accumulation in Tay-Sachs and Sandhoff disease mice and in a clinical trial of humans with Gaucher's disease (10-12). However, this inhibitor is not fully specific (it inhibits α-glycosidase I and II as well as GCS) and is a much less effective inhibitor of GCS in vitro than the PDMP class of compounds (4,9,12,13). New analogs of both classes of GCS inhibitor have been prepared and are being evaluated for their effectiveness and pharmaceutical properties (14-16). Further GCS inhibitors with desirable pharmaceutical properties could have considerable therapeutic value. However, the design of new inhibitors is hampered by a lack of knowledge concerning the GCS active site and catalytic mechanism.
The predicted amino acid sequences of cloned human, rat, and mouse GCSs were reported to have no significant homology with other proteins, including other glycosyltransferases (17-19). A recent classification of glycosyltransferases into different families based on sequence homology places mammalian GCS (>95% identical between rats, mice, and humans) in its own glycosyltransferase family (GTF21) along with putative homologs (51% identity with an open reading frame from Drosophila, ~40% identity with three sequences from Caenorhabditis elegans, and 28% identity with a protein from the blue-green bacterium Synechocystis) from the GenBank™ Data Bank (20).² No recognizable domains or protein motifs have been identified within the GCS sequence, with the exception of a predicted single membrane-spanning region near its N terminus (17-19). Using antibodies directed against specific regions of GCS, we recently showed that the carboxyl tail and a hydrophilic loop near the putative membrane-spanning domain of GCS are accessible to the cytosol (21), consistent with earlier reports by us and others that GCS possesses a cytosolically oriented active site on the Golgi membrane (22-24). We also demonstrated that His-193 of rat GCS resides within or near the UDP-Glc-binding region of the protein and that rat GCS H193A and H193N mutant proteins were resistant to PDMP, suggesting that this region is involved in binding to GCS inhibitors (19). Other than these few features, nothing is known about the domain structure or active site of GCS.
Here, we conducted a series of biochemical and site-directed mutagenesis studies on rat GCS expressed in bacteria. We identified a sequence motif that is highly conserved in GCS and numerous processive β-glycosyltransferases (e.g. cellulose and chitin synthases), classified in glycosyltransferase family 2 (GTF2) (20). Mutagenesis of conserved residues within this motif showed that most of these amino acids are critical for GCS activity and suggests that GCS and processive β-glycosyltransferases share a common catalytic mechanism.

MATERIALS AND METHODS
Preparation of GCS Mutants-The coding sequence for rat GCS was inserted in the pET-3d vector as described previously (19). Mutagenesis of GCS to alter individual residues to other amino acids was performed by polymerase chain reaction of GCS in the pET-3d vector using the QuikChange site-directed mutagenesis kit (Stratagene, La Jolla, CA). The polymerase chain reaction products were transformed into Epicurian Coli XL1-Blue supercompetent cells (Stratagene) and then sequenced utilizing ABI Prism BigDye Terminator chemistry and an ABI Prism 377 sequencer (PerkinElmer Life Sciences) to verify that they contained the expected mutations.
Expression of Recombinant GCS-BL21-DE3 bacteria were transformed with wild type and mutant GCS inserts in the pET-3d vector. Transformed cells were incubated with shaking at 37°C in 2 ml of Luria broth (LB) medium containing carbenicillin (50 μg/ml) until an A600 of 0.6 was reached and then stored at 4°C overnight. The bacteria were then pelleted, resuspended in 40 ml of LB medium with carbenicillin (50 μg/ml), and incubated with shaking at 37°C until an A600 of ~0.5 was reached. GCS protein expression was then induced with 1 mM isopropyl-β-thiogalactopyranoside. After further incubation for 1 h at 30°C, bacterial cell pellets were harvested by centrifugation. Bacterial pellets were then resuspended in GCS stabilizing buffer (50 mM HEPES, pH 7.4, 100 mM KCl, 20% glycerol, plus protease inhibitors (10 μg/ml each tosylarginylmethyl ester and leupeptin, 1 μg/ml each antipain and pepstatin, 25 μM 4-amidophenylmethanesulfonyl fluoride; all from Sigma)), aliquoted, flash-frozen in liquid N2, and stored at −70°C until use.
GCS Enzyme Assays and Biochemical Characterization-Aliquots of bacterial pellets (from 10 ml of bacterial culture) in which wild type and mutant GCS were expressed were thawed, diluted to 0.5 ml in GCS stabilizing buffer, and lysed by probe sonication (three times for 8 s each) on ice. The GCS activity of bacterial lysates (10-100 μl) was measured using 10 μM N-[7-(4-nitrobenzo-2-oxa-1,3-diazole)]-6-aminocaproyl-D-erythro-sphingosine (C6-NBD-labeled ceramide) (Molecular Probes, Eugene, OR) and 2.5 mM UDP-Glc (Sigma) in GCS assay buffer (50 mM HEPES, pH 7.4, 25 mM KCl, 5 mM MnCl2) for 30 min at 37°C, followed by lipid extraction and thin layer chromatography as previously described (25). Quantities of C6-NBD-glucosylceramide formed were measured by image analysis of TLC plates using NIH Image. The specific activity of wild type GCS was estimated using standard curves with known concentrations of C6-NBD-glucosylceramide (Molecular Probes) on TLC plates. For comparison of mutant and wild type proteins, GCS activity was expressed as a percent of wild type activity and normalized for protein expression (relative enzyme mass) assessed by Western blotting of wild type and mutant GCS proteins.
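The normalization described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: activity and blot densitometry readings are taken to be plain numbers, and the function name and example values are hypothetical rather than taken from the study.

```python
def normalized_activity(mutant_activity, wt_activity, mutant_mass, wt_mass):
    """Return mutant activity as a percent of wild type, normalized for expression.

    Activities may be in any common unit (e.g. product formed per minute per mg
    of total protein). Masses are densitometry readings of intact enzyme taken
    from the same Western blot.
    """
    if wt_activity <= 0 or wt_mass <= 0 or mutant_mass <= 0:
        raise ValueError("wild type values and mutant mass must be positive")
    percent_activity = 100.0 * mutant_activity / wt_activity  # percent of wild type
    relative_mass = mutant_mass / wt_mass                      # relative enzyme mass
    return percent_activity / relative_mass


# Hypothetical readings: half the raw activity at half the expression level
# means the mutant is as active as wild type per unit of enzyme.
print(normalized_activity(0.5, 1.0, 400.0, 800.0))
```

Dividing by relative enzyme mass keeps a poorly expressed but fully active mutant from being scored as catalytically impaired.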
For Western blotting, aliquots of GCS proteins in bacterial lysates were solubilized in SDS-polyacrylamide gel electrophoresis sample buffer without reducing agents or urea, run on 12% SDS-polyacrylamide gel electrophoresis gels, and transferred to polyvinylidene difluoride membranes overnight in transfer buffer with 20% methanol (21,26). Blots were probed with polyclonal anti-peptide antibodies (GCS-5, which recognizes a region near the GCS N terminus (amino acids 57-78), and GCS-1, which recognizes the GCS C terminus (amino acids 372-394)) followed by a goat anti-rabbit horseradish peroxidase secondary antibody (Roche Molecular Biochemicals) and visualized using Renaissance chemiluminescence reagent plus (PerkinElmer Life Sciences). For quantitation of relative enzyme mass, the levels of intact GCS (as detected by the GCS-1 antibody) were measured by densitometry of films using NIH Image. Enzyme mass levels of mutant GCS proteins were expressed as a percent of wild type GCS protein mass measured on the same blot. Recombinant rat GCS lacking even 10 amino acids at the N terminus or 8 at the C terminus possessed less than 4% of wild type activity.³ Thus, it was reasonable to quantify only intact GCS for enzyme activity studies. Only one mutant protein (C384A) was altered within the region recognized by one of our antibodies (GCS-1). However, this mutation apparently did not affect the ability of the antibody to interact with this protein, because blotting results were the same when either the GCS-1 or the GCS-5 antibody was used.
For studies of N-ethylmaleimide (NEM) inhibition of GCS activity, samples were preincubated with or without 4 mM UDP-Glc in a final volume of 98 μl of GCS stabilizing buffer for 20 min at room temperature. NEM (Sigma) was prepared fresh as a stock solution (1 M) in 100% ethanol. Dilutions of NEM in distilled water were prepared, and 2-μl aliquots were then added to GCS samples to achieve the desired range of final NEM concentrations. Samples were vortexed and then incubated for an additional 10 min at room temperature. Finally, 400 μl of GCS assay buffer containing 4 mM dithiothreitol, 4 mM UDP-Glc, and 5 nmol of C6-NBD-ceramide was added, and then samples were incubated for 20 min at 37°C. GCS activity was measured as described above.
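The dilution arithmetic implied by this protocol (a 2-μl aliquot added to a 98-μl sample) follows from C1·V1 = C2·V2. The Python sketch below works through it; the function names are hypothetical, and the 0.4 mM target is used only as an example of a final concentration.

```python
def final_concentration_mM(working_mM, added_ul=2.0, sample_ul=98.0):
    """Concentration after adding a small aliquot to a sample (C1*V1 = C2*V2)."""
    return working_mM * added_ul / (added_ul + sample_ul)


def working_concentration_mM(target_final_mM, added_ul=2.0, sample_ul=98.0):
    """Concentration the aqueous NEM dilution needs to reach a target final value."""
    return target_final_mM * (added_ul + sample_ul) / added_ul


STOCK_mM = 1000.0  # 1 M NEM stock in ethanol, as stated in the text

working = working_concentration_mM(0.4)   # dilution needed for 0.4 mM final NEM
fold_dilution = STOCK_mM / working        # how far to dilute the 1 M stock in water
print(working, fold_dilution)
```

Because the aliquot is 2% of the final 100-μl volume, each working dilution is 50 times the desired final concentration.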
Amino Acid Sequence Alignment and Hydrophobicity Plots of GCS and Other Glycosyltransferases-Initial pairwise comparisons of GCS with other proteins in the GenBank™ Data Bank were performed using the NCBI BLAST program (27). Further multiple alignments between GCS and other sequences were performed using the CLUSTALW, Blockmaker, and PIMA Multiple Sequence Alignment programs on the BCM Search Launcher, followed by manual adjustments. Confirmation of the significance of multiple alignments was performed by Smith-Waterman analysis (28,29). Kyte/Doolittle and Argos algorithms for prediction of protein hydrophobicity and transmembrane domains were performed using McVector™ v6.5 (Pharmacopeia, Princeton, NJ). Additional transmembrane predictions were performed using the TMpred program on the Swiss Institute for Experimental Cancer Research server (Epalinges, Switzerland).
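For readers unfamiliar with Kyte/Doolittle analysis, the following Python sketch shows the underlying calculation: a sliding-window average of per-residue hydropathy values. It is not the commercial implementation used in the study. The window length of 19 residues is an assumption (a common choice when looking for membrane-spanning segments), and the function name is hypothetical.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KD_SCALE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=19):
    """Return the mean hydropathy of every full window along the sequence.

    Element i of the result covers residues i .. i + window - 1 (0-based).
    Sustained high values suggest a hydrophobic, possibly membrane-spanning,
    segment; low values suggest a hydrophilic region.
    """
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KD_SCALE[aa] for aa in seq]  # KeyError on non-standard residues
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


# Toy sequence (not GCS): a hydrophilic stretch followed by a hydrophobic one.
toy = "DDKKEEDDKK" + "LLIIVVLLIIVVLLIIVVL"
profile = hydropathy_profile(toy, window=19)
print(len(profile), round(profile[0], 2), round(profile[-1], 2))
```

The last window of the toy sequence covers only hydrophobic residues and therefore scores far higher than the first, which is the kind of contrast a hydropathy plot makes visible.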

RESULTS
Preliminary experiments had shown that NEM and other cysteine-specific reagents inhibit GCS in human skin fibroblasts and rat liver Golgi fractions.⁴ Thus, we investigated the NEM sensitivity of GCS expressed in bacteria as a possible probe for the identification of important cysteine residues within this protein. The enzymatic activity of rat GCS expressed in bacteria was inhibited ~80% by exposure to 0.4 mM NEM for 10 min at room temperature (Fig. 1). Inhibition of GCS by NEM was dose- (Fig. 1) and time-dependent (data not shown). Preincubation of GCS with 4 mM UDP-Glc only partially protected against NEM inhibition, with protection of ~20% (relative to control activity) against 0.25 mM NEM and less protection at higher doses (data not shown).
We next prepared GCS mutant proteins in which the 11 cysteines in GCS were individually mutated to alanine (a double mutant, C321A,C323A, was also prepared) to determine which cysteines were the targets of NEM. Each mutant was expressed at significant levels in bacteria, as assessed by Western blotting (Table I). We tested the GCS activity of each mutant with or without 0.5 mM NEM. Each Cys → Ala mutant possessed a significant level of GCS activity, ranging from 25 to 150% of wild type activity (Table I). All mutant proteins were inhibited ≥80% by treatment with 0.5 mM NEM, similar to wild type GCS, except for the C207A mutant, which was inhibited only 15% (Table I). These results suggest that Cys-207 is the primary residue involved in inactivation of GCS by NEM. Of possible significance is the proximity of Cys-207 to His-193, which we previously demonstrated to be within or near the UDP-Glc-binding region of GCS (19). At higher concentrations of NEM, the C207A mutant showed significant dose-dependent inhibition, although much less than the wild type protein (Fig. 1). These data suggest that at high concentrations of NEM, other cysteines within GCS besides Cys-207 are alkylated by NEM.
FIG. 1. Aliquots of lysates from bacteria expressing wild type (●) or C207A (○) mutant GCS were preincubated at room temperature for 10 min with or without NEM. Samples were then diluted in assay buffer including 4 mM dithiothreitol, and GCS activity was assessed using C6-NBD-ceramide as a substrate (25). Results are means ± S.E. and are expressed as a percent of control values (total activity without NEM). Specific activity of wild type GCS in bacterial lysates averaged 1.17 ± 0.17 nmol/min/mg of total protein (mean ± S.E.) in results from four separate experiments with representative preparations of wild type GCS.
Alignment of GCS with Other Glycosyltransferases-We next began to look critically at the region around Cys-207 and His-193 of GCS for further clues to the substrate binding sites and catalytic mechanism of this enzyme. BLAST searches with rat GCS identified a number of proteins with very limited homology with GCS in this region (data not shown). Of greatest interest, several of these proteins were known glycosyltransferases (e.g. cellulose, chitin, and hyaluronan synthases). These proteins were all found to be classified as processive β-glycosyltransferases in GTF2 according to the amino acid sequence-based classification of Campbell et al. (20). Further computer-generated and visual multiple alignment of GCS (rat and Drosophila) with representative GTF2 proteins demonstrated the presence of a highly conserved set of residues between GCS and these enzymes within an ~200-amino acid region of each protein (referred to as the GTF2 domain; Fig. 2A and Table II). Pairwise global alignment demonstrated only 18-25% identity between rat GCS and the GTF2 proteins within the regions shown in Fig. 2A; however, 12 residues were entirely conserved (identical or strongly similar residues; shown in yellow) in all eight proteins, and certain other amino acids were also ≥50% conserved between GCS and GTF2 proteins (shown in gray). The significance (p < 0.05) of the alignments of the fragments shown in Fig. 2A with the rat GCS fragment was further confirmed by Smith-Waterman analysis (28,29). In addition to the proteins shown, the same motif of conserved residues was found in all known GCS homologs, other cellulose, chitin, and hyaluronan synthases, rhizobial NODC proteins, and numerous uncharacterized bacterial and Arabidopsis sequences in the GenBank™ Data Bank (data not shown).
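As an illustration of how a pairwise identity figure such as those quoted above can be obtained from two already aligned sequences, consider the Python sketch below. It assumes one common convention (identical positions divided by the number of columns in which neither sequence has a gap); alignment programs differ in the denominator they use, and the convention behind the values in the text is not stated. The function name and toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Toy alignment (not real GCS or GTF2 sequence): 5 ungapped columns, 4 identical.
print(percent_identity("ACD-EF", "ACDGEY"))
```

A low overall identity computed this way is fully compatible with a small set of invariant positions, which is the situation the alignment in the text describes.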
Surprisingly, the conserved residues include the D1,D2,D3,QXXRW motif (Fig. 2A) previously identified as characteristic of GTF2 processive β-glycosyltransferases (e.g. cellulose synthase and chitin synthase (30-34)) but not identified before in GCS. In addition to the D1,D2,D3,QXXRW motif, several other highly conserved residues are identified in Fig. 2A. (i) Lys-124 of rat GCS aligns with the KAG sequence of cellulose and curdlan synthases (31,34), as well as a corresponding lysine in each of the other proteins in Fig. 2A. (ii) A dibasic motif (RR or RK) is present in most of the proteins (amino acids 216-217 of GCS) but is replaced by RIK in chitin synthases (Fig. 2A and data not shown). (iii) A diglycine motif (GG) was found in seven of eight proteins in Fig. 2A. (iv) A conserved glutamate (Glu, replaced with Asp in hyaluronan synthases) was present immediately before D3 in each protein. In addition, we found that a decapeptide within the GTF2 domain of GCS including the diglycine motif and D3 can be aligned with a peptide region of porcine UDP-GalNAc pyrophosphorylase (35) identified as a uridine binding site by photoaffinity labeling (Fig. 2B).
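The (Q/R)XXRW portion of the motif is simple enough to scan for with a regular expression. The Python sketch below is only an illustration: the study located the motif by multiple alignment, not by pattern matching, and the function name and toy sequence are hypothetical. The three aspartates are separated by variable spacing, so they are not included in the pattern.

```python
import re

# Q or R, any two residues, then R and W. The lookahead lets overlapping
# occurrences be reported as well.
QXXRW = re.compile(r"(?=([QR][A-Z]{2}RW))")


def find_qxxrw(sequence):
    """Return (1-based start position, matched pentapeptide) for each hit."""
    return [(m.start() + 1, m.group(1)) for m in QXXRW.finditer(sequence.upper())]


# Toy sequence, not a real protein.
print(find_qxxrw("MKDAAQLLRWGG"))
```

A hit on this pattern alone is weak evidence, since a five-residue pattern with two wildcard positions can occur by chance in unrelated proteins; the surrounding conserved residues described in the text are what make the alignment meaningful.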
Site-directed Mutagenesis of GTF2 Residues in GCS-To examine the role in GCS function of the conserved amino acids identified in Fig. 2, residues were individually altered by site-directed mutagenesis. In addition, point mutants were prepared of other charged (but unconserved) residues in the regions flanking His-193. All mutant proteins were expressed at similar levels to that of the wild type enzyme, as assessed by Western blotting, with the exception of one mutant (G210I) for which only a protein fragment lacking the C terminus was detected (Fig. 3 and data not shown). The enzyme activity of the GCS proteins mutated at amino acids within the GTF2 region is shown in Fig. 4. In this figure all substitutions were to alanine, except for glycine residues, which were substituted by isoleucine. Alanine substitutions of highly conserved residues (Fig. 2A, highlighted in yellow) resulted in a complete loss of GCS activity for all mutants (Lys-124, Asp-144, Asp-236, Arg-272, Arg-275, Trp-276), except for the Glu-235 mutant, which retained 1.5% activity, and D92A (position D1; see Fig. 3A), which exhibited ~20% activity (Fig. 4). Mutation of either glycine in the conserved diglycine motif of GCS (Gly-224 and Gly-225) or of Gly-247 of GCS, which aligns with some GTF2 proteins (Fig. 2A), led to near or complete loss of activity (Fig. 4). Also shown in Fig. 4 are cysteine to alanine (from Table I) and histidine to alanine mutants (from Ref. 19). These data, as well as mutations of other unconserved positions within the GTF2 region of GCS, demonstrate that not all mutations within the GTF2 domain cause a significant loss of enzyme activity (Fig. 4). Additional point mutant proteins with conservative substitutions were prepared for some of the residues that yielded inactive proteins when substituted with alanine. As shown in Table III, some substitutions of the nonconserved residues Arg-177 and Arg-195 yielded proteins with significant GCS activity; however, even conservative (e.g. Asp → Glu, Arg → Lys, Trp → Phe) substitutions of Asp-144, Asp-236, Arg-275, or Trp-276 did not significantly restore GCS activity, suggesting that at these highly conserved positions, even slight changes in structure are enough to disrupt enzymatic function.

TABLE I
GCS activity of Cys → Ala mutant proteins
Wild type and mutant proteins in which individual cysteines were replaced with alanine were expressed in bacteria. Bacterial lysates were assayed for GCS activity using N-[7-(4-nitrobenzo-2-oxa-1,3-diazole)]-6-aminocaproyl-D-erythro-sphingosine as a substrate as described under "Materials and Methods." For NEM inhibition studies, bacterial lysates were preincubated for 15 min at room temperature with or without 0.5 mM NEM before measuring GCS activity.

DISCUSSION
Biochemical studies of GCS have been difficult because the protein has never been fully purified from mammalian sources (26). When expressed in Escherichia coli, rat GCS has a similar relative specific activity (normalized by Western blots) to the native rat liver Golgi form⁴; however, expression levels have still been too low to allow the purification of the enzyme for biophysical or structural studies. Thus, we have utilized biochemical and site-directed mutagenesis studies of GCS expressed in bacterial lysates to begin to identify the active site regions of GCS. Our findings that Cys-207 (Table I and Fig. 1) and His-193 (19) of GCS are each primary targets for amino acid-specific reagents (NEM and diethyl pyrocarbonate, respectively) that inactivate the enzyme suggest that the region of GCS including these residues may be an exposed loop that is important for substrate binding and/or enzymatic activity. Alignment of GCS with other glycosyltransferases showed that the D1,D2,D3,(Q/R)XXRW putative active site motif present in processive GTF2 β-glycosyltransferases was conserved in GCS in the region flanking His-193 and Cys-207. Site-directed mutagenesis of GCS at highly conserved residues showed that almost all were essential for GCS activity. These data represent the first extensive information on the GCS active site and strongly suggest that the shared motif represents a common substrate-binding and/or catalytic region in GCS and processive β-glycosyltransferases.

TABLE II
Characteristics of representative proteins that contain the glycosyltransferase family 2 (GTF2) domain
All protein information is from the GenBank™ Data Bank entries and related publications (31,37,46,48,49). The same proteins are aligned in Fig. 2.
Figure 2 and that contain the GTF2 domain.
b This open reading frame is characterized as a glucosylceramide synthase that utilizes UDP-Glc based on 50% identity with mammalian GCS, but its activity has not been tested.
Recognition of the Motif and Additional Conserved Residues-The finding of the D1,D2,D3,(Q/R)XXRW motif in GCS was unexpected because the cloned human, mouse, and rat enzymes were reported to have no significant homology to other proteins, including glycosyltransferases (17-19). This motif is fully retained in all GCS homologs identified in the GenBank™ Data Bank, including mammalian forms, C. elegans (three sequences), Drosophila, Bombyx, and Danio (data not shown). It was previously proposed that the full D1,D2,D3,QXXRW motif was present only in processive glycosyltransferases of GTF2 and that related nonprocessive glycosyltransferases only possessed D1 and D2 (30). Alignment of GCS to the full motif demonstrates an exception to this generalization. An alignment of Asp-92 and Asp-144 of human GCS with conserved aspartates of β-glycosyltransferases was recognized in a recent classification of glycosyltransferases (36); however, the more extensive consensus region shown in Fig. 2 was not identified in that analysis. Our conclusion that the D1,D2,D3,(Q/R)XXRW motif is a critical part of the GCS active site is based on the observation that even conservative substitutions of these residues resulted in a complete loss of GCS activity. In contrast, many other residues within the GTF2 domain could be mutated without a total loss of activity. One residue in GCS, Asp-92, which aligns with D1 in the D1,D2,D3,(Q/R)XXRW motif, could be mutated to alanine with retention of ~20% of wild type activity, suggesting that this residue is not an essential part of the GCS active site. However, two other aspartates occur immediately prior to Asp-92 and might substitute for the function of Asp-92 when it is absent.
In addition to the D1,D2,D3,(Q/R)XXRW motif, we have identified several other residues that are conserved between GCS and some glycosyltransferases but have not previously been identified as part of this motif (Fig. 2). These include Gly-210 of GCS, which when mutated to isoleucine led to a complete loss of expression of the full-length protein, and glycines 224 and 225, which when replaced with isoleucine caused a complete loss of activity. These results suggest that these glycines are critical for proper folding or stability in GCS and probably play the same role in other glycosyltransferases in which these positions are conserved (see Fig. 2).
The Shared GTF2 Motif Is a UDP-Sugar Binding or Catalytic Site-Multiple lines of evidence suggest that the conserved region of these proteins is involved in UDP-sugar binding and/or catalysis. First, point mutation studies of Saccharomyces chitin synthase have shown that the conserved (E)D3,QXXRW residues are essential for activity (37). Second, an expressed polypeptide fragment of Gossypium cellulose synthase catalytic subunit, which encompasses the GTF2 domain, binds UDP-Glc in vitro (38). Finally, a decapeptide within the conserved region of rat GCS, which includes the diglycine motif and D3, can be aligned with a peptide region of porcine UDP-GalNAc pyrophosphorylase (35) identified as a uridine binding site by photoaffinity labeling (Fig. 2B). Thus, we predict that the conserved residues in Fig. 2 define essential components of the GCS active site. The aligned proteins in Fig. 2 include enzymes that utilize different UDP-sugar donors (Glc, GlcNAc, and GlcA) and that catalyze the formation of different linkages (e.g. β-1-3 versus β-1-4; see Table II). Thus, unconserved amino acids, rather than conserved regions, may be responsible for the different UDP-sugar specificities and linkages formed by the different enzymes.

FIG. 3. A representative Western blot showing expression of GCS wild type and mutant proteins. GCS proteins were expressed in bacteria by induction with isopropyl-1-thio-β-D-galactopyranoside for 1 h at 37°C as described under "Materials and Methods." Lysate fractions corresponding to 0.15 ml of bacterial cultures were run on 12% SDS-polyacrylamide gel electrophoresis gels and transferred to a polyvinylidene difluoride membrane, and Western blots were performed with antibodies directed against human GCS peptide sequences (21). The upper blot was probed with antibody GCS-1, which recognizes the GCS C terminus, to detect intact GCS in these fractions. In the lower blot, the same membrane was stripped with urea and reprobed with antibody GCS-5, which recognizes a sequence near the N terminus of GCS. This antibody recognizes proteolytic or truncated fragments of GCS in addition to the intact form. Note that the mutant G210I lane showed no expression of intact GCS, but a GCS fragment recognized only by the GCS-5 antibody was detected.

FIG. 4. Activity of GCS with point mutations within the conserved GTF2 domain. Wild type and GCS mutant proteins prepared by site-directed mutagenesis were expressed in bacteria. GCS activity of proteins was assessed and expressed as a percent of wild type activity, as described in the legend to Fig. 1 and under "Materials and Methods." Amino acid positions are arranged in relation to the alignment in Fig. 2A. Spacings are approximate and not indicative of exact numbers of amino acids between relevant residues. All amino acids were mutated to alanine, except for glycines, which were mutated to isoleucine. Cysteine mutant data are from Table I; histidine mutant data are from our previous publication (19). GCS amino acids that align with D1,D2,D3,(Q/R)XXRW motif residues (Fig. 2A) are indicated. Asterisks indicate mutant proteins that were expressed as assessed by Western blotting but that had no quantifiable GCS activity (0-0.2% of wild type GCS).
In the past few years, the solved crystal structures of several glycosyltransferases have provided important information on the structural organization of catalytic sites and identified putative catalytic residues in these enzymes (39-44). Those glycosyltransferases that have been crystallized are classified in several different glycosyltransferase families and have little sequence homology; however, they have been shown to possess similarities in secondary structure, and key residues within each active site are proposed to serve equivalent functions. Although none of the previously crystallized glycosyltransferases (including SpsA, a GTF2 enzyme (20,39-44)) contains the full D1,D2,D3,(Q/R)XXRW motif, a loose alignment between GCS and these proteins (data not shown) suggests that Asp-236 of GCS (see D3 in Fig. 2A) is equivalent to the proposed catalytic base in the crystallized enzymes (e.g. Asp-291 of N-acetylglucosaminotransferase and Asp-191 of SpsA (40)).
Related Topology and Function of GCS and GTF2 Proteins-In addition to sharing the GTF2 domain, GCS and GTF2 proteins appear to be structurally and functionally related. First, they are all integral membrane proteins with cytosolically oriented active sites (Fig. 5). The GTF2 domain exists within a hydrophilic loop within each protein. This was demonstrated by Kyte/Doolittle plots of each representative protein from Fig. 2A and Table II (data not shown). The accessibility of His-193 and Cys-207 within GCS to diethylpyrocarbonate and NEM, respectively, also supports the premise that this region is exposed at the surface of this protein. Second, each protein has at least two hydrophobic regions that may be transmembrane domains immediately following the GTF2 domain, as judged by Kyte/Doolittle and Argos plots (data not shown; see also Refs. 18,21,37,38,45-47). Although GCS was proposed to have only a single N-terminal transmembrane domain, additional hydrophobic regions near the C terminus have been noted previously (18). These regions, which fall between the GTF2 domain and the C terminus, could be transmembrane domains or membrane-associated regions (Fig. 5), which function analogously to the post-GTF2 domain transmembrane regions of the other proteins depicted in Fig. 5. Various protein secondary structure programs (e.g. TMpred) predict transmembrane domains in this region of mammalian GCS, but no empirical studies have been performed to determine the topology of this region of GCS. However, if transmembrane domains are present near the C-terminal region of GCS, there are probably two of them, because the active site of GCS (presumably including the GTF2 region) is cytosolic in orientation, and we have shown that the C terminus of GCS also faces the cytosol. Finally, another common feature of GCS and GTF2 proteins is that, soon after synthesis, their glycosylation products (e.g. glucosylceramide or polysaccharide) are transported across the lipid bilayer (either to the Golgi lumen or the cell exterior) by unidentified mechanisms. This point was noted previously by Kapitonov and Yu (36). The mechanism of transbilayer movement is not well understood for any of these products. Possibly, the transmembrane domains of these proteins, acting either singly or as oligomers (21), play a role in the transbilayer transport of the products immediately after catalysis, as has been proposed for cellulose synthase (45).
The identification of a conserved glycosyltransferase motif common to GCS and GTF2 glycosyltransferases has important implications for understanding the structure and catalytic mechanisms of these enzymes. Parallel studies of GCS and other GTF2 proteins using site-directed mutagenesis, domain swapping, and, if possible, protein crystallization or NMR may lead to results that can be extrapolated to all proteins in glycosyltransferase families 2 and 21. Of particular importance, recognition of the structure and mechanism of the GCS active site should stimulate the development of new GCS inhibitors with desirable pharmaceutical properties for potential use in treatment of human disease.

TABLE III
GCS activity of site-directed mutant proteins with additional substitutions of selected residues within the GTF2 domain of GCS
GCS mutant proteins generated by site-directed mutagenesis were expressed in bacteria. GCS activity of proteins was assessed and expressed as a percent of wild type activity as described in Fig. 1

FIG. 5. ... (18,19,21), Dictyostelium cellulose synthase catalytic subunit (38,45), Glomus chitin synthase (37), and mouse hyaluronan synthase (46) were modeled after published data of the same or similar proteins and/or predicted using the Kyte/Doolittle and Argos algorithms. Proteins are described in Table II. For each protein shown, the active site containing the GTF2 domain (see Fig. 2A) is oriented toward the cytosol (up) and is followed by a series of predicted transmembrane or membrane-associated domains.