The UDP-glucose:glycoprotein glucosyltransferase is organized in at least two tightly bound domains from yeast to mammals.

The endoplasmic reticulum UDP-Glc:glycoprotein glucosyltransferase (GT) exclusively glucosylates nonnative glycoprotein conformers. GT sequence analysis suggests that it is composed of at least two domains: the N-terminal domain, which composes 80% of the molecule, has no significant similarity to other known proteins and was proposed to be involved in the recognition of non-native conformers and the C-terminal or catalytic domain, which displays a similar size and significant similarity to members of glycosyltransferase family 8. Here, we show that N- and C-terminal domains from Rattus norvegicus and Schizosaccharomyces pombe GTs remained tightly but not covalently bound upon a mild proteolytic treatment and could not be separated without loss of enzymatic activity. The notion of a two-domain protein was reinforced by the synthesis of an active enzyme upon transfection of S. pombe GT null mutants with two expression vectors, each of them encoding one of both domains. Transfection with the C-terminal domain-encoding vector alone yielded an inactive, rapidly degraded protein, thus indicating that the N-terminal domain is required for proper folding of the C-terminal catalytic portion. If, indeed, the N-terminal domain is, as proposed, also involved in glycoprotein conformation recognition, the tight association between N- and C-terminal domains may explain why only N-glycans in close proximity to protein structural perturbations are glucosylated by the enzyme. Although S. pombe and Drosophila melanogaster GT N-terminal domains display an extremely poor similarity (16.3%), chimeras containing either yeast N-terminal and fly C-terminal domains or the inverse construction were enzymatically and functionally active in vivo, thus indicating that the N-terminal domains of both GTs shared three-dimensional features.

Most proteins following the secretory pathway in eukaryotic cells are N-glycosylated in the endoplasmic reticulum (ER). A glycan (Glc3-Man9-GlcNAc2) is transferred to Asn in growing polypeptides. Glucoses are then trimmed by the action of glucosidase I, which removes the external Glc unit, followed by glucosidase II, which excises both Glc residues remaining in the glycan (1). Monoglucosylated N-glycans may be formed by partial deglucosylation of the transferred oligosaccharide or by reglucosylation of Glc-free glycans by the UDP-Glc:glycoprotein glucosyltransferase (GT) (2). This enzyme is a sensor of glycoprotein conformations, as it exclusively glucosylates N-glycans in not properly folded conformers. One or two ER-resident α-mannosidases may degrade Man9-GlcNAc2 to Man8-GlcNAc2 and Man7-GlcNAc2, which may also be reglucosylated by GT. Folding glycoproteins then oscillate between monoglucosylated and unglucosylated forms through the opposing activities of GT and glucosidase II. The monoglucosylated forms are recognized by two ER-resident lectins, calnexin and calreticulin. Upon reaching the proper tertiary structures, glycoproteins become substrates of glucosidase II, but not of GT. Properly folded molecules thus liberated from the lectins are then free to pursue their transit to the Golgi. Proteins that fail to properly fold are retained in the ER and are eventually transported to the cytosol, where they are degraded in the proteasomes. Interaction of monoglucosylated glycans with ER lectins not only retains misfolded glycoproteins in the ER, but also facilitates glycoprotein folding by preventing aggregation. GT is therefore the key constituent of the ER quality control of glycoprotein folding, as it is the only element in such a process that discriminates between glycoprotein conformers.
In vitro and in vivo assays have shown that, under both conditions, GT preferentially glucosylates glycoproteins not in extended conformations, but at more advanced folding, molten globule-like stages, when glycoprotein substrates already display secondary structures and some long-range interactions (3-5). Solvent-accessible hydrophobic amino acid patches have been identified as the structural elements recognized by GT in non-native conformers, as they are the only structural features exclusive of molten globule-like folding intermediates (5). Moreover, in vitro assays have shown that GT preferentially glucosylates N-glycans in the close vicinity of protein structural perturbations (6, 7).
Sequence analysis (BLAST search using a BLOSUM62 matrix) of mammalian, insect, and yeast GTs has suggested that they are composed of at least two domains: the N-terminal domain, which composes 80% of the molecule, has no significant similarity to other known proteins, and has been suggested to be involved in non-native conformer recognition; and the C-terminal domain, which binds 5-azido-[β-32P]UDP-glucose and displays a similar size and significant similarity to members of glycosyltransferase family 8 (8-12). Members of this family conserve the anomeric configuration of the monosaccharide transferred from a sugar nucleotide, presumably by forming a monosaccharide-enzyme intermediate. All GT C-terminal domains from different species share a significant similarity (65-70%), but no such similarity may occur between N-terminal domains. For instance, Rattus norvegicus GT (GenBank/EBI accession number AAF67072) and Drosophila melanogaster GT (accession number Q09332) N-terminal domains share a 32.6% similarity, but they show only 15.5 and 16.3% similarities, respectively, to the same portion of Schizosaccharomyces pombe GT (accession number S63669) (8-10). Although there is both structural and experimental evidence supporting the idea that the C-terminal domain is the catalytic portion of the enzyme, the notion that the N-terminal domain is responsible for recognition of non-native conformers has not been experimentally confirmed yet.
The work reported here provides experimental evidence for the two-domain GT structure and shows that both domains are tightly bound. If the N-terminal domain is indeed responsible for conformation recognition, this finding provides a molecular rationale for the close proximity between protein structural perturbations and N-glycans required for glucosylation. Moreover, the S. pombe GT N-terminal domain linked to the D. melanogaster GT C-terminal portion or the inverse construction formed enzymatically and functionally active enzymes in vivo. We therefore concluded that N-terminal domains from different GTs share three-dimensional features despite showing an extremely poor similarity in their primary sequences. In addition to the suggested role in the recognition of misfolded conformers, the results presented here indicate that the N-terminal domains are required for proper folding of the catalytic C-terminal portions, as the latter were enzymatically and functionally inactive and rapidly degraded in vivo when expressed in the absence of the former.
Methods-GT was assayed using UDP-[14C]Glc as a sugar donor and denatured thyroglobulin as a glucosyl acceptor as described (16). Proteins were microsequenced at the Department of Biochemistry and Molecular Biology of the University of Nebraska (Omaha, NE). In constructs described below, the accuracy of all DNA junctions generated was checked by DNA sequencing. S. pombe cells were labeled with [14C]Glc, and the N-glycans were analyzed as described (17), but with 2.5 mM N-methyldeoxynojirimycin. Paper chromatography was performed with Whatman No. 1 papers using solvent system A (1-propanol/nitromethane/water (3:2:1)) and solvent system B (1-butanol/pyridine/water (10:3:3)). Similarities of N- and C-terminal domains were obtained by BLAST analysis using a BLOSUM62 matrix.
Enzyme Purification and Related Procedures-Microsomes were prepared from R. norvegicus liver and S. pombe cells, and GTs were purified from them to homogeneity as described (17, 18). Assays used for attempting to separate cleaved GT N- and C-terminal domains were as described in the indicated purification procedures (17, 18).
Limited R. norvegicus GT Proteolysis-Purified enzyme (40 μg) was incubated in a total volume of 250 μl containing 10 mM imidazole buffer (pH 7.0), 5% sucrose, and 1 μg of endoproteinase Glu-C (protease V8, Sigma) at 37°C. Aliquots of 5 μl were used for GT assays after addition of 3,4-dichloroisocoumarin (Sigma) at a final concentration of 1 mM. Aliquots of 10 μl were used for 10% SDS-PAGE analysis.
Electrophoresis-Nonreducing standard 10% polyacrylamide gels were employed. Purified GT (2 μg) was resuspended in 10 μl of solution containing 10% glycerol, 2% SDS, and 50 mM Tris-HCl (pH 7.6). N-Ethylmaleimide was added to nonreduced samples at a final concentration of 25 mM. Dithiothreitol was added to reduced samples at a final concentration of 10 mM. Both reduced and nonreduced samples were heated at 95°C for 3 min, and N-ethylmaleimide (25 mM) was added to the reduced samples to allow direct comparison of both types of samples. Reduced and nonreduced samples were run side by side on the same gels.
Antibodies against the S. pombe GT C-terminal Domain-A 519-bp fragment of S. pombe GT (bases 3658-4176) was amplified using Pwo DNA polymerase (Roche Applied Bioscience), pBluescript SKgpt1+33A as template, and the following oligonucleotides as primers: 5′-CCGGAATTCGTACAATTGGCCACACTGGC-3′ (sense) and 5′-ATAAGAATGCGGCCGCTAAATCGATTGTTTTGG-3′ (antisense). The fragment released by EcoRI and NotI treatment (the primers employed generate cleavage sites for these two enzymes) was ligated to the pET22b+ expression vector (Novagen) previously treated with the same enzymes, and the construct was amplified in Escherichia coli NovaBlue cells (Novagen). A protein comprising amino acids 1220-1392 of S. pombe GT fused to a polyhistidine tail was synthesized by transforming E. coli BL26(DE3) cells (Novagen) with the above-described construct. The protein in inclusion bodies was washed; dissolved in 6 M urea, 25 mM Tris-HCl (pH 8.0), 0.5 M NaCl, and 5 mM imidazole buffer; and purified by affinity chromatography on Ni2+-iminodiacetic acid-Sepharose (Amersham Biosciences). A single 20-kDa protein appeared upon 10% SDS-PAGE. Antibodies raised in rabbits reacted with the recombinant protein at least at a 1:1000 serum dilution.
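The coordinates quoted for the antigen fragment can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the original procedures. It assumes that base numbering starts at the A of the ATG codon, and it uses the two primer sequences as printed above with line-break hyphens removed. The helper name `codon_of_base` is hypothetical.

```python
# Illustrative consistency check, not part of the original methods.
# Assumption: nucleotide 1 is the A of the initiating ATG codon.

def codon_of_base(base: int) -> int:
    """Return the 1-based codon index containing a 1-based nucleotide position."""
    return (base - 1) // 3 + 1

first_base, last_base = 3658, 4176
fragment_len = last_base - first_base + 1   # inclusive span of the amplified region
first_codon = codon_of_base(first_base)     # codon holding the first amplified base
last_codon = codon_of_base(last_base)       # codon holding the last amplified base

# Primers as printed in the text (line-break hyphens removed).
sense = "CCGGAATTCGTACAATTGGCCACACTGGC"
antisense = "ATAAGAATGCGGCCGCTAAATCGATTGTTTTGG"

# Recognition sequences of the two enzymes named in the text.
ECORI, NOTI = "GAATTC", "GCGGCCGC"
has_ecori = ECORI in sense
has_noti = NOTI in antisense

print(fragment_len, first_codon, last_codon, has_ecori, has_noti)
```

Under that numbering assumption, the base range spans 519 bp and maps onto codons 1220 through 1392, in agreement with the amino acid range given for the recombinant protein, and each primer carries the recognition site of the enzyme it is said to introduce.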
Construct with the S. pombe GT N-terminal and D. melanogaster GT C-terminal Domains-Two synonymous mutations (A3461G and A3464C) were introduced into the gpt1+ sequence encoding the N- and C-terminal domain junction in expression vector pREP3Xgpt1+, thus creating a StuI site (cleavage with this enzyme generates a blunt end). For this purpose, the mutagenic oligonucleotide 5′-TTTCAAACGTAAAGAGGCCTCTATAAATATTTTTTCTGTTGCC-3′ and the Altered Sites II in vitro mutagenesis system (Promega) were employed. The resulting vector (pREP3Xgpt1StuI) (Fig. 1A) coded for Glu-1154 and Ala-1155, the same as the parental one. The fragment encoding the D. melanogaster GT C-terminal domain was synthesized using oligonucleotide primers 5′-CATCTATCAACATTTTCTCTGTGGC-3′ (sense; generates a blunt end) and 5′-TGTACATCCGGACGGGGCTCATGAGAAGGCGTC-3′ (antisense; generates a BspEI site), plasmid pOTgpt1+ as template, and Pwo DNA polymerase. The PCR product was digested with BspEI and ligated to pREP3Xgpt1StuI previously digested with StuI and BspEI. The latter cuts S. pombe gpt1+ just before the ER retrieval sequence. The expression vector generated was termed pREP3Xgpt1CTDm (Fig. 1A).
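The claim that the two base changes create a StuI site without altering the encoded residues can be illustrated as follows. This is a sketch under stated assumptions, not the authors' procedure. The mutagenic oligonucleotide is taken from the text with its line-break hyphen removed. The parental codons (GAA GCA) are an inference obtained by reverting the two described A-to-G and A-to-C changes. The codon table holds only the four codons needed here.

```python
# Illustrative check of the StuI-creating synonymous mutations.
# Assumption: parental codons are GAA (Glu) and GCA (Ala), inferred by
# reverting the A->G and A->C changes described in the text.

oligo = "TTTCAAACGTAAAGAGGCCTCTATAAATATTTTTTCTGTTGCC"  # mutagenic oligo, hyphen removed
STUI = "AGGCCT"                                        # StuI recognition sequence

site_at = oligo.find(STUI)                   # 0-based start of the StuI site in the oligo
mutant = oligo[site_at - 1 : site_at + 5]    # the two codons overlapping the site
parental = "GAAGCA"                          # hypothetical parental codons (see assumption)

# Minimal codon table covering only the codons used in this example.
CODONS = {"GAA": "Glu", "GAG": "Glu", "GCA": "Ala", "GCC": "Ala"}

def translate_pair(six_bases: str) -> list:
    """Translate two consecutive codons using the minimal table above."""
    return [CODONS[six_bases[i:i + 3]] for i in (0, 3)]

# 0-based positions, within the two codons, at which parental and mutant differ.
changed = [i for i in range(6) if parental[i] != mutant[i]]

print(site_at, mutant, translate_pair(parental), translate_pair(mutant), changed)
```

Both codon pairs translate to Glu-Ala, and the two differences fall on third codon positions three bases apart, which matches the spacing of the quoted mutations (3461 and 3464).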
Construct with the D. melanogaster GT N-terminal and S. pombe GT C-terminal Domains-Four mutations (C56G, A57C, A58G, and G59G) were introduced into the gpt1+ sequence encoding the junction between the S. pombe GT signal sequence and the N-terminal domain to create a BssHII site. These mutations resulted in the point mutation K20R at the amino acid level. Mutations were generated by conducting three PCRs. The first one used primers 5′-CGCGGATCCCCCGGGCTGCAGG-3′ (sense; hybridizes with a noncoding region ~100 bp upstream of the ATG codon and generates a BamHI site) and 5′-CTTGACATCTAAAGGGCGCGCGGCATAGCAAATCG-3′ (antisense and mutagenic; hybridizes with the sequence encoding the signal peptide/N-terminal domain junction). The second PCR used primers 5′-CGATTTGCTATGCCGCGCGCCCTTTAGATGTCAAG-3′ (sense and mutagenic; hybridizes with the sequence encoding the signal peptide/N-terminal domain junction) and 5′-CAACCACTGCGTACGGAATGCC-3′ (antisense; hybridizes ~500 bp downstream from the ATG codon and generates a BsiWI site), the expression vector pREP3Xgpt1+ as template, and Pwo DNA polymerase. The products of both PCRs were purified, mixed, and used as template for a third PCR, in which the sense and antisense primers were those employed in the first and second reactions, respectively. The 606-bp fragment obtained was digested with BamHI and BsiWI and ligated to expression vector pREP3Xgpt1StuI (Fig. 1A) previously treated with the same enzymes, thus generating expression vector pREP3Xgpt1StuI-BssHII (Fig. 1B). The fragment encoding the N-terminal domain of D. melanogaster GT was synthesized using primers 5′-TTGGCGCGCGAATCCAGTCAGAGCTATCC-3′ (sense; creates a BssHII site at the 5′-end) and 5′-CCGTATCCTCATCAGAGGC-3′ (antisense; generates a blunt end), plasmid pOTgpt1+ as template, and Pwo DNA polymerase. The PCR product was digested with BssHII and ligated to pREP3Xgpt1StuI-BssHII previously digested with StuI and BssHII. The expression vector generated was termed pREP3Xgpt1NTDm (Fig. 1B).
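The three-PCR scheme described above works only if the two mutagenic primers are exact reverse complements, so that the products of the first two reactions overlap and can prime each other in the third. The following sketch checks this for the printed primer sequences (line-break hyphens removed) and confirms that both carry the BssHII recognition sequence. The function name `reverse_complement` is a helper written for this example, not a library call.

```python
# Illustrative check that the two mutagenic primers form a complementary pair.

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

# Mutagenic primers as printed in the text (hyphens from line breaks removed).
antisense_mut = "CTTGACATCTAAAGGGCGCGCGGCATAGCAAATCG"  # antisense primer, first PCR
sense_mut = "CGATTTGCTATGCCGCGCGCCCTTTAGATGTCAAG"      # sense primer, second PCR

BSSHII = "GCGCGC"  # BssHII recognition sequence (palindromic)

primers_overlap = reverse_complement(antisense_mut) == sense_mut
site_in_both = BSSHII in sense_mut and BSSHII in antisense_mut

print(len(sense_mut), primers_overlap, site_in_both)
```

Because the two primers are fully complementary 35-mers, the products of the first and second PCRs share a 35-bp overlap at the mutated junction, which is what allows them to serve jointly as template in the third reaction.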
Truncated Protein Constructs-The expression vector encoding the S. pombe GT signal peptide followed by the N-terminal domain and the ER retrieval sequence was synthesized by first performing inverse PCR using pBluescript SKgpt1+33A (which encodes full-length GT) as template and oligonucleotide primers 5′-CCGGACGAACTTTGAAAC-3′ (sense) and 5′-AAAGAAGCTGAGAGACTT-3′ (antisense). The first primer encodes the retrieval plus stop sequences, and the second one encodes the N- and C-terminal domain junction (centered in base 3444 from the ATG codon). The fragment was purified, phosphorylated, religated, and amplified in E. coli DH5α cells (Invitrogen). The resulting plasmid (pBluescript SKgpt1NT) was digested with SnaBI and BspEI. The 555-bp fragment generated was ligated to expression vector pREP2gpt1+ previously treated with the same restriction enzymes. The resulting expression vector was pREP2gpt1NT (Fig. 1C). The expression vector encoding the GT signal peptide followed by the C-terminal domain and the ER retrieval sequence was constructed in a similar way, but with the following primers: 5′-AATTTCAAACGTAAAGAAGC-3′ (sense) and 5′-GGCGGCATAGCAAATCG-3′ (antisense). The first primer encodes the N- and C-terminal domain junction, and the second one starts at the sequence coding for the junction between the GT signal sequence and the mature protein sequence. The resulting plasmid (pBluescript SKgpt1CT) was digested with BamHI and NruI. The resulting 590-bp fragment was ligated to the expression vector pREP3Xgpt1+ previously treated with the same restriction enzymes. The resulting expression vector was pREP3Xgpt1CT (Fig. 1C).
Insertion of the c-Myc Epitope at the C Termini-For inserting the c-Myc epitope (EQKLISEEDLN) into chimeric and truncated proteins before the ER retrieval sequence, oligonucleotide primers 5′-CCGGAACAAAAACTCATCTCAGAAGAGGATCTGAAT-3′ (sense) and 5′-CCGGATTCAGATCCTCTTCTGAGATGAGTTTTTGTT-3′ (antisense) were annealed and ligated to the above-described constructs previously digested with BspEI. The sequence coded for the epitope and left, at both ends, sequences able to bind DNA previously digested with the restriction enzyme. All constructs had an additional Pro residue before the c-Myc epitope. The resulting expression vector designations include "c-myc" to indicate insertion of the epitope-encoding sequence.

Linkage between the N- and C-terminal Domains Is Extremely Sensitive to Endoproteolysis-Fig. 2A shows that almost all pure S. pombe GT preparations yielded three bands when subjected to 10% SDS-PAGE. The same result was also obtained with some R. norvegicus enzyme preparations (Fig. 2B). The sizes of the fragments were 161, 130, and 35 kDa for yeast GT and 172, 135, and 38 kDa for the mammalian enzyme. The results obtained from microsequencing the N termini are depicted in Fig. 2 (A and B). They reveal that, in both cases, the smaller fragments had been produced by endoproteolytic breakage of the full-length species at the junction between the N- and C-terminal domains (amino acids 1149 and 1220 from the initial Met residue for the yeast and mammalian GTs, respectively) (Figs. 2, A and B, and 3). As expected, the largest and smallest (but not the middle) fragments of S. pombe GT reacted with polyclonal antibodies directed against a C-terminal 172-amino acid fragment of the molecule (Figs. 1A and 2C).
Noncovalent Linkages Tightly Bind Both Domains-The N- and C-terminal domains of R. norvegicus and S. pombe GTs proved to be tightly bound, as the small fragment yielded by endoproteolytic degradation could not be separated from the middle one by gel filtration on a Bio-Sil SEC 250 column (Fig. 4, A and C), by chromatography on a phenyl-Superose column (Fig. 4, B and D), by ion-exchange chromatography on a Mono Q HR 5/5 column, or by filtration through a Centricon-100 filter (data not shown). Moreover, the linkage between both domains was not mediated by disulfide bonding as revealed by 10% SDS-PAGE performed under reducing and nonreducing conditions (Fig. 4, E and F). The small and middle fragments of both R. norvegicus and yeast GTs were retained on a concanavalin A-Sepharose column despite the fact that the mammalian enzyme C-terminal domain lacks N-glycosylation consensus sequences and that the only one present at such a location in the yeast enzyme was unoccupied as revealed by 10% SDS-PAGE following endo-β-N-acetylglucosaminidase H treatment (data not shown). This result shows that the N- and C-terminal domains could not be separated by lectin affinity chromatography either.
Noncovalently Bound N- and C-terminal Domains Are Enzymatically Active and Able to Discriminate between Native and Non-native Conformers-We observed that certain S. pombe preparations were enzymatically active despite almost totally lacking the full-length protein component. A slight modification of the purification procedure was then introduced to fully eliminate it: the interaction period of the applied material with the concanavalin A-Sepharose column before elution with α-methylmannoside was increased from 1 to 12 h. As depicted in Fig. 5 (A and C), although the resulting preparation completely lacked the full-length enzyme, it displayed enzymatic activity. As expected, the largest fragment in the extensively degraded preparation did not react with antibodies directed against the last portion of the C terminus (Fig. 5B). As the same procedure (longer interaction of the applied material with concanavalin A-Sepharose) did not yield an R. norvegicus preparation devoid of the full-length enzyme, the purified GT was incubated with protease V8. As shown in Fig. 5D, the full-length R. norvegicus enzyme disappeared after a 5-min incubation, yielding 135- plus 38-kDa fragments that displayed almost full enzymatic activity. Microsequencing of the latter yielded the sequence SFKWG, indicating that protease V8 cleavage had occurred at amino acid 1211 from the initial Met residue, i.e. 9 amino acids ahead of that produced by endoproteolytic cleavage (see above). The 38-kDa protein was gradually further degraded to smaller fragments, and the enzymatic activity was concomitantly lost (Fig. 5D). Both R. norvegicus and S. pombe enzyme preparations devoid of full-length compo-

N- and C-terminal Domains May Be Expressed Separately, Yielding an Active Enzyme-The notion of a two-domain GT structure able to display enzymatic activity even if not covalently bound was supported by the synthesis of an active enzyme upon transfection of a GT null mutant (Sp61G4) with two plasmids separately encoding the N and C termini (Table I). Nevertheless, microsomes prepared from cells transformed with both expression vectors displayed a lower specific enzymatic activity than those isolated from cells transformed with a plasmid coding for the full-length enzyme. Constructs coding for either one of both domains had segments encoding the GT signal peptide before the enzymatic domains and a c-Myc epitope (EQKLISEEDLN) and the GT ER retrieval sequence (PDEL) after them (Fig. 1C). Western blots with c-Myc-specific monoclonal antibodies showed that both fragments had been effectively expressed (Fig. 6A). As with wild-type GT proteolytically cleaved at the domain junction, the enzyme formed by the separately expressed domains was also able to discriminate between native and non-native conformers. No enzymatic activity was detected in microsomes isolated from cells transformed with only the C-terminal domain-encoding vector, although, as will be seen below, the fragment was effectively expressed as revealed by Western blot analysis (Table I).
D. melanogaster and S. pombe GT N- and C-terminal Domains Are Mutually Interchangeable-As shown above, in both yeast and mammalian GTs, the N- and C-terminal domains are tightly bound and show enzymatic activity even in the absence of any covalent linkage between them. This was a rather unexpected result because whereas the C-terminal domains of R. norvegicus, D. melanogaster, and S. pombe share a high similarity (62.3-74.0%), much lower values occur for the N-terminal portions (32.6% for the R. norvegicus and D. melanogaster GTs and 15.5-16.3% for the latter enzymes and S. pombe GT). In other words, to bind highly similar C-terminal domains, the tertiary structures of the N-terminal portions of different GTs should be expected to share three-dimensional features. To confirm that the N-terminal domains from different GTs share elements of tertiary structure, two types of chimeras were constructed in expression vector pREP3X. The first one encoded the D. melanogaster GT N-terminal and S. pombe GT C-terminal domains (Fig. 1B). The second construct coded for the yeast GT N-terminal and fly GT C-terminal domains (Fig. 1A). These constructs, together with those encoding either the full-length S. pombe enzyme or only its C-terminal portion, were transfected into S. pombe Sp61G4A mutant cells. All constructs coded for the S. pombe GT signal peptide at the N termini and for the above-mentioned c-Myc epitope before the S. pombe GT ER retrieval sequence (PDEL).
Sp61G4A mutants lack GT and transfer Man9-GlcNAc2 instead of the complete glycan, as they lack the dolichol-P-Glc-dependent glucosyltransferase responsible for Glc1-Man9-GlcNAc2-P-P-dolichol formation. As they synthesize underglycosylated glycoproteins and lack the folding facilitation process mediated by glycoprotein-calnexin interaction, these double mutant cells grow poorly with a round morphology at 28°C and do not grow at 37°C. Their transformation with a GT-encoding expression vector rescued the wild-type phenotype, but no rescue was observed when two point mutations that abolished GT activity were introduced at the C-terminal domain (13).
Transformed cells were incubated with 5 mM [14C]Glc in the presence of N-methyldeoxynojirimycin, a glucosidase II inhibitor. Analysis of whole cell N-oligosaccharides released by endo-β-N-acetylglucosaminidase H showed that, in addition to peaks migrating as a Man9-GlcNAc standard, shoulders in the position expected for Glc1-Man9-GlcNAc were formed in cells transfected with full-length S. pombe GT or with both chimeras coding for mixed yeast-fly enzymes (Fig. 7, A, C, and D). On the contrary, the Glc1-Man9-GlcNAc shoulder was absent in cells transformed with the S. pombe GT C-terminal portion (Fig. 7B). Formation of Glc1-Man9-GlcNAc2 in cells transformed with full-length GTs and its absence in mutants transformed with the C-terminal domain were confirmed by strong acid hydrolysis of substances migrating as Glc1-Man9-GlcNAc and Man9-GlcNAc standards in Fig. 7, A-D. Labeled glucose residues were present only in substances derived from cells transformed by the expression vector encoding full-length GTs (Fig. 7, E-H).
The C-terminal domain was effectively expressed as revealed by Western blot analysis; but the protein was extremely unstable, as no expressed protein was detected after a 2-h incubation of intact cells with cycloheximide, thus strongly suggesting that it had been unable to properly fold in the absence of the N-terminal domain (Fig. 6B, lanes 3 and 4). On the contrary, full-length S. pombe GT continued to be detected after a similar cycloheximide treatment (Fig. 6B, lanes 1 and 2).

To confirm that the chimeric enzymes were not only enzymatically, but also functionally active in vivo, we investigated whether they rescued the non-growth phenotype of Sp61G4A mutants at 37°C. As shown in Fig. 8, A-C, in all cases, the mutant cells transformed with full-length GT constructs, but not with either the C-terminal domain-encoding expression vector or with the vector itself, grew at 37°C. As in Sp61G4A double mutant cells grown at 28°C (Fig. 6B, lanes 3 and 4), Western blot analysis of microsomal proteins isolated from Sp61G4 single mutant cells transformed with the C-terminal domain expression vector showed that the C-terminal domain was expressed at 37°C, but was unstable, as it disappeared after a 2-h incubation of cells with cycloheximide (Fig. 6B, lanes 5 and 6). (Sp61G4 cells lack GT, but transfer Glc3-Man9-GlcNAc2 and are able to grow at 37°C.)

FIG. 6 (legend, partial). ... 1 and 2) or only the C-terminal domain (expression vector pREP3Xgpt1CT-c-myc) (lanes 3-6). Cells were incubated with cycloheximide (0.1 mg/ml) for 2 h before microsomal preparation in lanes 2, 4, and 6. Cells were grown at 28 and 37°C in lanes 1-4 and lanes 5 and 6, respectively. For further details, see "Experimental Procedures."

DISCUSSION

The results reported herein provide experimental evidence for the two-domain GT structure suggested by the amino acid sequence, as expression of both portions, each of them encoded by different vectors, in an S. pombe GT null mutant resulted in the synthesis of an active enzyme. Expression of the C-terminal or catalytic domain resulted in the synthesis of an inactive, highly unstable protein. Moreover, although linkage between both domains in the wild-type enzyme was extremely sensitive to proteolysis, both portions could not be separated by ion-exchange, size-exclusion, lectin affinity, and hydrophobic interaction chromatographies without loss of enzymatic activity. It is worth remarking that only the N-terminal domain is N-glycosylated and that there is an almost 4-fold difference in size between both domains. Moreover, the cleaved molecules were able to discriminate between native and non-native glycoprotein conformers.
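As a rough check on the "almost 4-fold" size difference, the apparent SDS-PAGE masses of the proteolytic fragments reported in this paper (130 and 35 kDa for S. pombe GT; 135 and 38 kDa for R. norvegicus GT) can be compared directly. This assumes that the fragment masses approximate the masses of the two domains, which is an approximation introduced here for illustration.

```latex
\[
\frac{M_{\mathrm{N}}}{M_{\mathrm{C}}}\bigg|_{S.\ pombe} = \frac{130}{35} \approx 3.7,
\qquad
\frac{M_{\mathrm{N}}}{M_{\mathrm{C}}}\bigg|_{R.\ norvegicus} = \frac{135}{38} \approx 3.6
\]
\[
\frac{M_{\mathrm{N}}}{M_{\mathrm{N}} + M_{\mathrm{C}}} = \frac{130}{165} \approx 0.79
\quad\text{and}\quad
\frac{135}{173} \approx 0.78
\]
```

Both ratios are consistent with the statement that the N-terminal domain accounts for roughly 80% of the molecule.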
As mentioned above, the C-terminal domains displayed approximately the same size and significant sequence similarity to members of the glycosyltransferase family 8. It has also been reported that this portion of the molecule has the ability to bind 5-azido-[β-32P]UDP-glucose (10). These observations indicate that the C-terminal domain is the catalytic portion of the molecule where formation of the putative glycosyl-enzyme intermediate takes place. Expression of the R. norvegicus GT C-terminal domain in insect cells yielded a protein displaying ~5% of the specific full-length enzymatic activity (10). It was not reported, however, whether the recombinant truncated enzyme discriminated between properly and not properly folded glycoprotein substrates. As shown above and as will be further discussed below, we were unable to obtain an enzyme active either in vivo or in vitro upon expressing the S. pombe C-terminal domain in a yeast GT null mutant. It has been suggested that the GT N-terminal domain is responsible for sensing the folding status of glycoprotein substrates. An additional or alternative role (or perhaps the sole one) is, as demonstrated here, to contribute to proper folding of the catalytic or C-terminal domain within the ER luminal environment.
To confirm the presence of common tertiary structural elements in the N-terminal domains of different GTs, as suggested by the tight association between highly similar C-terminal and highly divergent N-terminal domains, chimeras containing the D. melanogaster GT N-terminal and S. pombe GT C-terminal domains or the yeast GT N-terminal and fly GT C-terminal domains were expressed in S. pombe GT null mutants. Chimeric proteins were enzymatically active in vivo and were able to rescue the wild-type phenotype (growth at 37°C) of gpt1/alg6 double mutant cells (Sp61G4A). These mutants lack GT and synthesize underglycosylated glycoproteins, as they transfer (inefficiently) Man9-GlcNAc2 instead of Glc3-Man9-GlcNAc2 (13). The severe ER stress to which they are submitted prevents their growth at 37°C, but the wild-type phenotype could be restored upon transformation with an expression vector coding for an active GT. The underglycosylated glycoprotein(s) that necessarily required GT-mediated interaction with calnexin for proper folding at high temperature was probably somehow involved in cell wall synthesis, as the wild-type phenotype could be rescued also in a hyperosmotic medium (1 M sorbitol) (13). The expressed C-terminal domain was inactive in vivo and was also unable to rescue the wild-type phenotype. As in cells grown at 28°C, the protein was unstable at 37°C, probably reflecting its inability to properly fold in the absence of the N-terminal domain.

FIG. 7 (legend, partial). In E-H, substances migrating as Glc1-Man9-GlcNAc and Man9-GlcNAc standards in A-D were submitted to strong acid hydrolysis and subjected to paper chromatography with solvent system B. Substances in A generated those in E, substances in B generated those in F, and so on. Standards were as follows: Man (M) and Glc (G).
As protein domains in glycoprotein substrates that determine the exclusive glucosylation of non-native conformers probably do not share identical three-dimensional structures, the corresponding recognition domains in glucosyltransferases from different species might not be expected to necessarily display significant similarity in their tertiary structures. In other words, it may be that recognition of hydrophobic amino acid patches in molten globule conformers could be attained by GTs displaying a variety of different N-terminal domain conformations if those domains were indeed responsible for sensing glycoprotein substrate conformations. In fact, GTs purified from R. norvegicus and S. pombe, which share only a 15.5% similarity in their N-terminal domain primary sequences, were shown to display the same exclusive specificity for non-native conformers in in vitro assays (17,18). The results presented here, however, show that, despite the extremely poor primary sequence similarity shown by GT N-terminal domains from different species, such portions probably share substantial elements of tertiary structure, as yeast and fly GT N-terminal domains were able to facilitate folding and stabilization of the GT C-terminal portions of both species alike.
A unique feature of GT is that it glucosylates only N-glycans in the very near proximity of protein structural perturbations. It was recently reported that only the N-glycan attached to the misfolded subunit of an artificial dimer formed by properly folded and misfolded RNase B monomers was glucosylated by GT in in vitro assays (6). It was further demonstrated that the N-glycan was glucosylated only when present in the immediate vicinity of the localized protein structural perturbation introduced into a mutant RNase B monomer by abolishing one of the four disulfide bonds present in the wild-type molecule (7). These recent results agree with previous ones reporting that N-glycans and misfolded protein subunits have to be covalently linked to be glucosylated (19). The close proximity between both structural elements required for efficient glucosylation (N-glycans and structural perturbations) may be best explained by the results reported herein, i.e. by the intimate association between the catalytic (C-terminal) and putative misfolded recognition (N-terminal) domains by noncovalent bonding.
It should be stressed that ascription of the location of the conformation recognition site to the N-terminal domain is merely speculative. It could be that its sole role is, as demonstrated here, to facilitate folding and stabilization of the catalytic C-terminal portion and that the conformation-sensing site resides also in this last part of the molecule. The sole evidence against this possibility is that enzymes belonging to glycosyltransferase family 8, with which GT C-terminal domains share a similar size and significant sequence similarity, probably do not have the capacity to discriminate between glycoprotein conformers, as most of them are not involved in glycoprotein glycosylation. On the other hand, the close proximity between the N-glycan and the protein structural perturbation required for glucosylation could also be explained if both the catalytic and conformation recognition sites reside in the relatively small (~35 kDa) C-terminal domains.