Two proteins homologous to the N- and C-terminal domains of the bacterial glycosyltransferase Murg are required for the second step of dolichyl-linked oligosaccharide synthesis in Saccharomyces cerevisiae.

Two highly conserved eukaryotic gene products of unknown function showing homology to glycosyltransferases involved in the second steps of bacterial peptidoglycan (Murg) and capsular polysaccharide (Cps14f/Cps14g) biosynthesis have been identified in silico. The amino acid sequence of the eukaryotic protein that is homologous to the lipid acceptor- and membrane-associating N-terminal domain of Murg and the Cps14f beta4-galactosyltransferase enhancer protein is predicted to possess a cleavable signal peptide and transmembrane helices. The other eukaryotic protein is predicted to possess neither transmembrane regions nor a signal peptide but is homologous to the UDP-sugar binding C-terminal domain of Murg and the Cps14g beta4-galactosyltransferase. Both the eukaryotic proteins are encoded by essential genes in Saccharomyces cerevisiae, and down-regulation of either causes growth retardation, reduced N-glycosylation of carboxypeptidase Y, and accumulation of dolichyl-PP-GlcNAc. In vitro studies demonstrate that these proteins are required for transfer of [3H]GlcNAc from UDP-[3H]GlcNAc onto dolichyl-PP-GlcNAc. To conclude, two gene products showing homology to bacterial glycosyltransferases are required for the second step in dolichyl-PP-oligosaccharide biosynthesis.

saccharide from LLO onto proteins and is initiated by UDP-GlcNAc dolichyl phosphate:N-acetylglucosamine-1-phosphate transferase (DPAGT1). Sequence data show that this enzyme belongs to a family of sugar phosphate transferases that act on phospholipid acceptors to initiate glycolipid synthesis in bacterial peptidoglycan, liposaccharide, exopolysaccharide, and capsular polysaccharide pathways (4,5). A practical consequence of this relationship is that the potent antibiotic tunicamycin is not clinically useful, because it blocks the first step of both the eukaryotic LLO (6) and bacterial peptidoglycan (7) pathways. The second step in peptidoglycan biosynthesis is carried out by UDP-GlcNAc undecaprenyl-PP-MurNAc pentapeptide:N-acetylglucosaminyltransferase (Murg, 8-10), which has been crystallized (11) and is now a target for antibiotic design (12,13). In analogous fashion, the second step of the eukaryotic LLO pathway entails conversion of dolichyl-PP-GlcNAc to dolichyl-PP-GlcNAc2 by a poorly characterized N-acetylglucosaminyltransferase (UDP-GlcNAc dolichyl-PP-GlcNAc:N-acetylglucosaminyltransferase, LLO-NAGT, 14-18). A structural relationship between bacterial Murg and LLO-NAGT has not been reported, but such a finding would be important for our understanding of the origins and mechanisms of eukaryotic protein glycosylation and would be critical for current projects dedicated to the design of clinically useful antibiotics that target steps in bacterial cell wall glycoconjugate biosynthesis. Finally, although a congenital disorder of glycosylation caused by a deficiency in LLO-NAGT has not yet been reported (1,19), the identification of the human gene encoding this enzyme will allow such an illness to be characterized at the molecular level and permit the design of an antenatal diagnostic test.
Here we identify two gene products that are required for the second step of LLO biosynthesis in Saccharomyces cerevisiae. These proteins are similar to two polypeptides that comprise Streptococcus pneumoniae capsular polysaccharide β4-galactosyltransferase, Cps14f/Cps14g, and correspond to the well characterized N- and C-terminal domains of the Escherichia coli β4-N-acetylglucosaminyltransferase, Murg.
* This work was supported by the Mizutani Foundation, the GIS-Institut des maladies rares/INSERM funded French Congenital Disorders of Glycosylation Research Network, and by institutional funding from INSERM. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Yeast Strains, Culture, and Radiolabeling-S. cerevisiae strains in which the promoters of the YBR243c (ALG7), YBR070c, and YGL047w open reading frames are individually replaced by a doxycycline-repressible (Tet-O) promoter (Open Biosystems, Huntsville, AL), and the parental strain R1158 (control cells) were grown in YPD medium at 30°C. Stationary precultures of the different Tet-O strains were diluted (A600 nm = 0.1) in YPD containing 1-10 μg/ml doxycycline. Culture progression was monitored by turbidimetry during 23 h. Cells were then harvested and reseeded (A600 nm = 0.1) in the same concentrations of drug before cultivating for a further 23 h. A third passage was performed in this manner.
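The reseeding step described above amounts to a simple dilution calculation. The following is a minimal sketch, not part of the published protocol: the function names, the preculture absorbance of 5.0, the 50-ml culture volume, and the 10 mg/ml doxycycline stock are all hypothetical values chosen for illustration, and the calculation assumes that absorbance scales linearly with cell density (dense stationary cultures normally have to be diluted before reading).

```python
def preculture_volume_ml(preculture_a600, target_a600, final_volume_ml):
    """Volume of preculture needed so the new culture starts at target_a600.

    Uses C1 * V1 = C2 * V2 and assumes A600 is linear in cell density.
    """
    if preculture_a600 <= 0 or target_a600 <= 0 or final_volume_ml <= 0:
        raise ValueError("absorbances and volume must be positive")
    if target_a600 > preculture_a600:
        raise ValueError("cannot dilute to a higher absorbance")
    return target_a600 * final_volume_ml / preculture_a600


def doxycycline_stock_ul(target_ug_per_ml, final_volume_ml, stock_mg_per_ml):
    """Microlitres of stock needed for the target doxycycline concentration.

    1 mg/ml equals 1 ug/ul, so ug required divided by mg/ml gives ul.
    """
    if stock_mg_per_ml <= 0:
        raise ValueError("stock concentration must be positive")
    return target_ug_per_ml * final_volume_ml / stock_mg_per_ml


# Hypothetical example: stationary preculture read at A600 = 5.0,
# reseeded into 50 ml of YPD at A600 = 0.1 with 10 ug/ml doxycycline.
inoculum_ml = preculture_volume_ml(5.0, 0.1, 50.0)
drug_ul = doxycycline_stock_ul(10.0, 50.0, 10.0)
print(f"inoculum: {inoculum_ml:.2f} ml, doxycycline stock: {drug_ul:.0f} ul")
```

The same two functions apply unchanged to the second and third passages, since each passage is reseeded to the same starting absorbance in the same drug concentration.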
Preparation and Incubation of Yeast Microsomes with Nucleotide Sugars-Microsomes (23) were washed with 50 mM Tris/HCl, pH 7.3, containing 10% w/v glycerol. 400-800 μg of protein were incubated with UDP-[3H]GlcNAc in the above buffer containing 20 mM MgCl2, and where indicated 100 μM Ac-NYT-NH2, in a final volume of 100 μl at 25°C. In experiments where microsomes were incubated with exogenous glycolipids, the DEAE-cellulose eluates were dried down and taken up in 10 μl of 5% Nonidet P-40. Reactions were terminated by addition of 200 μl of H2O, 600 μl of MeOH, and 900 μl of CHCl3. After shaking and centrifugation, the upper phase was removed and replaced with an equal volume of fresh upper phase. By repeating this procedure twice, the lower CHCl3 phase was washed before being assayed by scintillation counting.
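The volumes used to stop the reaction can be checked with a short calculation. The sketch below is illustrative only; it assumes that the 100-μl incubation mixture behaves as aqueous volume, and the function name is hypothetical. Under that assumption the stop mixture has a CHCl3/MeOH/H2O proportion of 3:2:1 by volume, a composition at which the mixture separates into two phases.

```python
from functools import reduce
from math import gcd


def solvent_ratio(reaction_ul, water_ul, methanol_ul, chloroform_ul):
    """Return the CHCl3:MeOH:H2O volume ratio in lowest integer terms.

    The reaction mixture is counted as water (an assumption of this sketch).
    All volumes must be whole numbers of microlitres.
    """
    aqueous_ul = reaction_ul + water_ul
    volumes = (chloroform_ul, methanol_ul, aqueous_ul)
    divisor = reduce(gcd, volumes)
    return tuple(v // divisor for v in volumes)


# Volumes from the text: 100 ul reaction, 200 ul H2O, 600 ul MeOH, 900 ul CHCl3.
print(solvent_ratio(100, 200, 600, 900))  # (3, 2, 1)
```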
Galactosyltransferase Assay-Sugars released from glycolipids were resolved on a Biogel-P2 column (125 × 0.7 cm) developed in 100 mM acetic acid. Fractions corresponding to those of standard GlcNAc and di-N-acetylchitobiose (GlcNAc2) were collected separately and assayed for these two sugars in a reaction volume of 50 μl containing 20 mM NaHCO3, pH 8.5, 20 mM MnCl2, 0.25 μCi UDP-[14C]galactose, and 5 mU bovine milk galactosyltransferase. After 3 h at 37°C the mixtures were applied to columns of AG 1-X2 and AG 50-X2, and the water eluates and washes were collected and subjected to scintillation counting.
Western Blot-Cell extracts (23) were resolved on 10% SDS-acrylamide gels and blotted onto nitrocellulose membranes. The anti-carboxypeptidase Y (CPY) monoclonal antibody was detected using a goat anti-mouse antibody conjugated with horseradish peroxidase.

Identification of a Highly Conserved Eukaryote Gene Product of Unknown Function Similar to the C-terminal Domain of E. coli Murg-DPAGT1, the enzyme responsible for the first step in the biosynthesis of LLO, has weak homology to UDP-MurNAc pentapeptide:undecaprenyl-P MurNAc(pentapeptide)-1-phosphate transferase (Mray), which catalyzes the formation of undecaprenyl-PP-MurNAc pentapeptide (4,5), the first lipid-linked intermediate in the biosynthesis of bacterial peptidoglycan (Fig. 1). The glycosyltransferases responsible for the second steps of these related pathways may also display similarities because, as well as employing a common activated sugar donor, the two lipid-linked acceptor molecules share common structural features (Fig. 1). In bacteria, the enzyme Murg transfers GlcNAc from UDP-GlcNAc onto C-4 of the MurNAc residue of undecaprenyl-PP-MurNAc pentapeptide (9) and has been characterized at the atomic level (13). The N- and C-terminal domains of the Murg enzymes belong to Protein Families (Pfam, www.sanger.ac.uk/Software/Pfam/, Ref. 24) PF03033 and PF04101, respectively. While examining PF04101, we noted that as well as containing 176 proteins corresponding to various bacterial Murg sequences it contained a group of proteins that possessed only the Murg C-terminal domain. Within this group we noted the presence of a small protein of unknown function that is highly conserved in yeast, plants, insects, worms, and mammals. To learn more about this protein, the human, toad, and yeast sequences were submitted to a fold recognition server (25) to detect structural features in common with other proteins present in the data bases. Irrespective of the organism of origin, the highest significant consensus values were obtained with E. coli Murg (Protein Data Bank entry 1F0K) as template. Murg comprises N- and C-terminal domains containing a Rossmann fold α/β open sheet structure (11). An alignment of the primary and secondary structural characteristics of Murg with those predicted for YGL047w-like sequences is illustrated in Fig. 2 (lower panel).
The YGL047w-like sequences begin with a conserved region that maps onto Murg G-loop 3, which is involved in binding the α phosphate of UDP-GlcNAc (11). Importantly, the YGL047w-like sequences also contain a conserved motif that maps onto a region of Murg (Cβ4-Cα4-Cβ5-Cα5) that contains a consensus sequence for UDP-glucuronosyltransferases (11). YGL047w-like sequences are predicted to possess neither an N-terminal signal peptide nor transmembrane spanning regions. As all the proteins known to interact with dolichyl-PP-sugars possess several transmembrane regions, the above observations suggested that the YGL047w gene product may not act efficiently on a highly lipophilic acceptor alone. Accordingly, we looked in PF03033 for mammalian proteins containing only the N-terminal domain of Murg; however, no such sequences were apparent.
FIG. 1. Identification of novel genes required for growth and N-glycosylation in yeast. The first two steps of eukaryotic glycoprotein biosynthesis are carried out by DPAGT1 and LLO-NAGT. The analogous steps of bacterial peptidoglycan biosynthesis involve UDP-MurNAc pentapeptide:undecaprenyl-P MurNAc(pentapeptide)-1-phosphate transferase (Mray) and UDP-GlcNAc:undecaprenyl-PP-MurNAc pentapeptide N-acetylglucosaminyltransferase (Murg). In Sphingomonas paucimobilis, the first two steps of sphingan biosynthesis are undertaken by the Spsb and Gelk transferases. Whereas the first step in S. pneumoniae capsular polysaccharide biosynthesis is effected by the cps14e gene product, the second step is carried out by cps14f and cps14g gene products, which together promote efficient galactosyltransferase activity.
Identification of a Second Highly Conserved Eukaryotic Gene Product of Unknown Function That Is Similar to the N-terminal Domain of E. coli Murg-Important information bearing on polypeptides corresponding to only the C-terminal domain of Murg derives from studies on bacterial liposaccharide, capsular polysaccharide, and exopolysaccharide synthesis, which have striking similarities to LLO and peptidoglycan biosynthesis (Fig. 1). In fact, the cps14g gene product that is involved in the second step of bacterial capsular polysaccharide biosynthesis (26) is a 167-residue β4-galactosyltransferase displaying similarities to the C-terminal domain of Murg (Fig. 2, lower panel), but its activity is greatly enhanced when assayed in the presence of the 149-residue cps14f gene product (26), which shows homology to the N-terminal domain of Murg (Fig. 2, upper panel). Accordingly, we searched the data banks for yeast, plant, and mammalian sequences showing homology to the cps14f gene product. A yeast gene, YBR070c, along with several plant and mammalian homologues were identified (Fig. 2, upper panel). This protein was submitted to the fold recognition server as described above, and irrespective of the organism of origin, the highest significant consensus values were again obtained with E. coli Murg (Protein Data Bank entry 1F0K) as template. The YBR070c-like sequences are predicted to possess a hydrophobic N-terminal signal peptide and 1-2 other transmembrane helices that are not present in Murg (Fig. 2, upper panel).
YBR070c and YGL047w Are Required for N-Glycosylation in S. cerevisiae-As YBR070c and YGL047w are essential for growth in S. cerevisiae, their functions were examined in YGL047wDR and YBR070cDR strains in which each gene is under the control of a doxycycline-repressible promoter (20). These cells, along with ALG7DR cells (in S. cerevisiae the DPAGT1 gene is called ALG7, Ref. 27, Fig. 1) and control cells (see "Experimental Procedures"), were grown in the presence of different concentrations of doxycycline. The growth curves shown in Fig. 3 demonstrate that all except the control cells display reduced growth rates when grown with drug. This effect was only observed during a second growth passage and was maximal after a third passage in the presence of 1-10 μg/ml doxycycline. Many yeast strains deficient in enzymes of the LLO pathway generate CPY forms bearing reduced numbers of N-glycans (23), and Fig. 3B shows that hypoglycosylated CPY forms appear when ALG7DR, YGL047wDR, and YBR070cDR (but not control) cells are cultivated with 10 μg/ml doxycycline (identical results were obtained with 1 μg/ml doxycycline). A comparison of the data presented in Fig. 3, A and B, shows that the onset of CPY hypoglycosylation coincides with the onset of growth retardation. In the experiments that follow, all the yeast strains used were harvested at mid-log phase after three growth passages in 1 μg/ml doxycycline.
FIG. 2. Sequence alignments and secondary structure predictions for the YGL047w and YBR070c gene products. Upper panel, the yeast YBR070c-encoded polypeptide (Sc, Swiss-Prot accession no. P38242) and homologous sequences from Homo sapiens (Hs, GenBank accession no. BC011706) and Xenopus laevis (Xl, GenBank accession no. BP699206) were aligned using the ClustalW program (32). This alignment was mapped manually onto the first 161 amino acids of E. coli Murg (SWALL accession no. P17443) and the S. pneumoniae Cps14f protein (Swiss-Prot accession no. P72514). Lower panel, similarly, polypeptides homologous to the YGL047w gene product (Hs, GenBank accession no. BC005336; Xl, GenBank accession no. AW644301; Sc, Swiss-Prot accession no. P53178) were aligned and manually mapped onto the C-terminal region (from position 181) of Murg and the S. pneumoniae Cps14g protein (Swiss-Prot accession no. P72515). For the ClustalW alignments, conserved residues are shown in red letters. Where these conserved residues are also conserved in Murg or the Cps14 polypeptides the residue is also indicated in red. A fold recognition server (25) was used to predict secondary structure features for the polypeptides, and in all cases E. coli Murg (Protein Data Bank code 1F0K) was calculated to be the best template for structure predictions. Murg N- and C-terminal β sheet regions (Nβ1-7 and Cβ1-6, Ref. 13) and predicted β sheet regions are boxed in blue, whereas pink boxes have the same significance with respect to α helices. Potential signal peptides are indicated by blue dotted lines. Amino acids predicted to occur in transmembrane helices are underlined by pink dotted lines, and green bars (1-3) indicate the positions of the Murg "G-loops," which are described in the text. The solid blue bar indicates the region corresponding to a peptide motif that is conserved in the UDP-glucuronosyltransferase family.
Metabolic radiolabeling of YGL047wDR cells with [2-3H]mannose for 30 min revealed a 1.5-fold increase in radiolabel incorporation into LLO but also into endo H-released N-glycans, and it was noted that the distribution of LLO species was the same in both YGL047wDR and control cells (data not shown). These data indicate that the observed block in glycoprotein biosynthesis (Fig. 3) occurs at an early step, before the addition of mannose to the LLO. A similar experiment in which cells were pulse radiolabeled with [6-3H]GlcNAc for 8 min revealed that the ratio of LLO bearing GlcNAc to LLO bearing GlcNAc2 was higher in YGL047wDR cells (0.65) than in the control cells (0.17). Although this result indicates a potential block in the addition of GlcNAc onto dolichyl-PP-GlcNAc, the low incorporation of radioactivity into these LLO rendered further analysis by this approach difficult.
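To put the two pulse-labeling ratios on a common footing, they can be expressed as a fold difference and as the share of label found in the GlcNAc-bearing species. This is an illustrative sketch only: it assumes that each ratio is radioactivity in the GlcNAc species divided by that in the GlcNAc2 species and that these two species account for all of the label considered; the function names are hypothetical.

```python
def fold_change(ratio_test, ratio_control):
    """How many times larger the test ratio is than the control ratio."""
    if ratio_control <= 0:
        raise ValueError("control ratio must be positive")
    return ratio_test / ratio_control


def monosaccharide_fraction(ratio):
    """Fraction of label in the GlcNAc species, given GlcNAc/GlcNAc2 = ratio."""
    if ratio < 0:
        raise ValueError("ratio must be non-negative")
    return ratio / (1.0 + ratio)


# Ratios reported in the text: 0.65 (YGL047wDR) and 0.17 (control).
print(f"{fold_change(0.65, 0.17):.1f}")        # 3.8
print(f"{monosaccharide_fraction(0.65):.2f}")  # 0.39
print(f"{monosaccharide_fraction(0.17):.2f}")  # 0.15
```

Under these assumptions roughly 39% of the label sits in the monosaccharide species in YGL047wDR cells against roughly 15% in control cells.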

The YGL047w and YBR070c Gene Products Are Required for the Synthesis of Dolichyl-PP-GlcNAc2-When incubated with UDP-[3H]GlcNAc, microsomes derived from the YGL047wDR and YBR070cDR strains incorporate 30-50% less [3H]GlcNAc into glycolipids than their wild type-derived counterparts (Fig. 4A, left panel). After fractionating the glycolipids by ion-exchange chromatography, it was noted that [3H]GlcNAc incorporation into neutral or positively charged (Unbound) species was the same in the three microsome populations (Fig. 4A, middle panel). By contrast, elution of the columns with 100 mM NH4Ac, a procedure known to displace dolichyl-PP-oligosaccharides from this ion exchanger (22), indicated that YGL047wDR and YBR070cDR microsomes incorporated only 25-30% of the amount of radioactivity into these glycolipids when compared with control membranes (Fig. 4A, right panel). In a similar experiment (data not shown) microsomes derived from ALG7DR cells were found to incorporate only 10% of control levels of radioactivity associated with these negatively charged glycolipids. TLC of the glycolipid fractions that did not bind to DEAE-cellulose revealed a single component that exhibited behavior that was similar irrespective of the microsome population from which it was derived (Fig. 4B, Unbound). Resolution of the negatively charged glycolipids revealed the presence of two major components (Fig. 4B, 100 mM NH4Ac) in control cell-derived microsomes, but much reduced quantities of the slower migrating species were present in microsomes from the YGL047wDR and YBR070cDR strains. Furthermore, incorporation of radiolabel into both these components was sharply reduced in ALG7DR membranes. Separation of the free sugars released from these glycolipids by mild acid hydrolysis (Fig. 4B, 100 mM NH4Ac after HCl) revealed GlcNAc and GlcNAc2 to be present. Indeed, for all incubations the relative amounts of these two sugars were found to reflect the distribution of the two major glycolipid species. It was also noted (data not shown) that these two glycolipid species remained intact after saponification (0.1 N NaOH for 15 min at 37°C).
Together, these data demonstrate that the glycolipids with Rf values of 0.25 and 0.14 are the DPAGT1 (alg7p)-dependent species dolichyl-PP-GlcNAc and dolichyl-PP-GlcNAc2, respectively, and that YGL047wDR and YBR070cDR membranes are defective in generating the latter species.
FIG. 3. Examination of growth rate and carboxypeptidase Y glycosylation in yeast strains harboring the ALG7, YBR070c, and YGL047w genes under the control of a doxycycline-repressible (Tet-O) promoter. A, the indicated yeast strains were cultivated in YPD growth medium (open circles) or in YPD containing doxycycline (solid circles, 1 μg/ml; solid triangles, 10 μg/ml) during three growth cycles. The first passage (P1) was started by seeding stationary phase cells into the appropriately supplemented YPD to a value of 0.1 A600 and grown for 24 h as described under "Experimental Procedures." The second and third passages (P2 and P3) were initiated by reseeding stationary phase cells from the previous passages (P1 or P2) as described above. B, cells taken from the logarithmic phase of the third passage (P3) grown in the presence of 10 μg/ml doxycycline were harvested, and cellular extracts were subjected to SDS-PAGE and Western blotting as described under "Experimental Procedures." The blots were probed with a monoclonal anti-CPY antibody. The migration positions of carboxypeptidase Y glycoforms possessing 0, 1, 2, 3, or 4 N-glycans are indicated to the left of the blots.
Dolichyl-PP-GlcNAc Accumulates in YGL047wDR and YBR070cDR Strains-If YGL047w and YBR070c encode proteins required for dolichyl-PP-GlcNAc2 biosynthesis, YGL047wDR- and YBR070cDR-derived microsomes would be expected to accumulate dolichyl-PP-GlcNAc, but data shown in Fig. 4B indicate that under our microsome radiolabeling conditions this may not be the case. It is known that the formation of dolichyl-PP-GlcNAc by DPAGT1 is subject to product inhibition (28); therefore, one explanation of this observation is that high levels of dolichyl-PP-GlcNAc already present in the YGL047wDR- and YBR070cDR-derived microsomes reduce DPAGT1-mediated [3H]GlcNAc addition onto dolichyl-P. LLOs were therefore extracted from the different cell populations and purified by ion-exchange chromatography as described above.
As bovine milk galactosyltransferase transfers galactose (Gal) from UDP-galactose onto GlcNAc and GlcNAc2 with different efficiencies (19), sugars released from the negatively charged glycolipids by mild acid hydrolysis were separated on Biogel P2 prior to quantitation using bovine milk galactosyltransferase and UDP-[14C]galactose. As shown in Fig. 5, A and B, the YGL047wDR and YBR070cDR strains reveal striking accumulations of the disaccharide Galβ4GlcNAc, indicating accumulation of a glycolipid bearing the hallmarks of dolichyl-PP-GlcNAc. If dolichyl-PP-GlcNAc does indeed accumulate in YGL047wDR and YBR070cDR strains, glycolipids from these cells would be expected to rescue both dolichyl-PP-GlcNAc2 synthesis and peptide glycosylation in ALG7DR membranes, which are unable to generate dolichyl-PP-GlcNAc or glycosylate peptides efficiently. LLOs were extracted from the different cell populations and purified by ion-exchange chromatography as described above prior to being incubated with ALG7DR membranes, UDP-[3H]GlcNAc, and the tripeptide, Ac-NYT-NH2, that contains the N-glycosylation consensus sequence. Fig. 5C shows striking dose-dependent increases in radiolabel incorporation into crude glycolipids when incubations are performed with glycolipids derived from YGL047wDR and YBR070cDR but not control cells. Furthermore, as shown in Fig. 5D, this increase in radioactivity could be accounted for by the increase of [3H]GlcNAc incorporation into dolichyl-PP-GlcNAc2. Finally, as demonstrated in Fig. 5E, glycolipids purified from YGL047wDR and YBR070cDR but not control cells were able to promote efficient glycosylation of the acceptor tripeptide. Together these results demonstrate the presence of striking dolichyl-PP-GlcNAc accumulations in YGL047wDR and YBR070cDR cells.
DISCUSSION
Data bank searches revealed the presence of two genes of unknown function that are highly conserved in eukaryotes and essential for growth in S. cerevisiae.
Our results demonstrate that down-regulation of these genes causes growth retardation and CPY hypoglycosylation in yeast. This phenotype is often observed in yeast possessing deficiencies in early acting enzymes of the LLO biosynthetic pathway. Glycoprotein hypoglycosylation can be caused by either deficiencies in LLO biosynthesis or transfer of the oligosaccharide from LLO onto protein (1). We were unable to pinpoint the defective step in glycoprotein biosynthesis in [2-3H]mannose-radiolabeled YGL047wDR cells. The observed increase in [2-3H]mannose incorporation into both LLO and N-glycans in this strain was surprising in view of the CPY hypoglycosylation results. In fact, we have often noted a similar occurrence in skin biopsy fibroblasts from patients with type I congenital disorder of glycosylation. Despite well described blockages in LLO biosynthesis, there is often an increased incorporation of [2-3H]mannose into N-glycans. This phenomenon might be explained by changes in pool sizes of the various activated mannose intermediates that are required for N-glycosylation. By contrast, pulse metabolic radiolabeling of the YGL047wDR strain with [3H]GlcNAc suggested that these cells were unable to efficiently add the second GlcNAc residue onto growing LLO.
We went on to demonstrate that both YGL047wDR and YBR070cDR cells accumulate dolichyl-PP-GlcNAc and that membranes derived from these cells have strikingly reduced capacities to transfer [3H]GlcNAc from UDP-[3H]GlcNAc onto this glycolipid. Taken together, our results show that the YGL047w and YBR070c genes are involved in protein N-glycosylation and, more particularly, in the transfer of the second GlcNAc residue onto growing LLO. Little is known about the enzyme that undertakes this reaction. LLO-NAGT has an unusual cation dependence in that it has been shown to be more sensitive to activation by Ca2+ or Mg2+ than Mn2+ (18), whereas most glycosyltransferase A family members are optimally activated by Mn2+. The enzyme is membrane-associated (15) and has similar solubilization properties to yeast Alg1p (GDP-mannose dolichyl-PP-GlcNAc2:β4-mannosyltransferase), which is predicted to possess four transmembrane regions, but is more readily solubilized than yeast DPAGT1, which is predicted to possess 10 transmembrane helices (18). In intact microsomes, LLO-NAGT is known to be particularly sensitive to trypsin (17), N-ethylmaleimide, and non-permeant N-ethylmaleimide derivatives (14), suggesting that important domains of the protein are exposed to the cytosolic face of the endoplasmic reticulum and contain one or more cysteine residues.
The gene products that we demonstrate to be involved in the second step of LLO biosynthesis in S. cerevisiae correspond to the two protein domains that have been shown to be required for UDP-sugar:undecaprenyl pyrophosphoryl monosaccharide glycosyltransferases involved in bacterial cell wall glycoconjugate biosynthesis pathways (9,26,29). These two protein domains are usually formed by a single polypeptide chain, but in a few instances the domains correspond to separate polypeptide chains encoded by different genes (26). In one experiment where the N- and C-terminal domains of the former type of enzyme (Fig. 1, Gelk) were dissociated and expressed separately, there was no evidence for transferase activity when the domains were assayed either independently or as an equimolar mixture (29). Nevertheless, it is possible that the two domains must be expressed together to generate the active complex, but to the best of our knowledge this experiment has not been performed on this type of enzyme. ADP-glucose pyrophosphorylase is a single polypeptide whose two domains are inactive when expressed separately (30). However, when the domains are co-expressed as separate polypeptides the two domains associate tightly, and enzyme activity is achieved (30). Although the cps14f and cps14g gene products are required for efficient β4-galactosyltransferase activity in S. pneumoniae, it is not known whether or not these two proteins form a complex (26). Inspection of genome-wide protein-protein interaction experiments in S. cerevisiae (Yeast Research Center, University of Washington) does not reveal an unambiguous complex between the YGL047w and YBR070c gene products, but in at least one experiment when the former protein is TAP-tagged at its C terminus and used as bait to detect interacting proteins, the YBR070c gene product appears second in a list of "hits" (31).
Although we do not know whether or not these two proteins do in fact associate, primary and secondary structure alignments show that together they possess the main structural features that are thought to be important for the functioning of the E. coli Murg glycosyltransferase (11). The C-terminal domain of Murg contains a peptide sequence that is conserved in the UDP-glucuronosyltransferase family, and a glycine-rich loop (G-loop 3) thought to be involved in binding the α phosphate of UDP-GlcNAc. Both of these regions map onto highly conserved domains of the YGL047w peptide sequences. The Murg N-terminal domain contains two glycine-rich loops (G-loops 1 and 2, Ref. 11) that are thought to bind the lipid acceptor (undecaprenyl-PP-MurNAc), and Fig. 2 indicates that these loops map onto conserved G-containing sequences in the YBR070c-like sequences. In view of the N-ethylmaleimide sensitivity of LLO-NAGT (14), it is of interest to note that the YBR070c gene product-like sequences contain a conserved cysteine residue in a region corresponding to one close to G-loop 2 of Murg. In addition to these structural features the YBR070c gene product is predicted to possess a signal peptide and transmembrane helices that are not predicted to occur in the bacterial Murg and cps14f sequences. The second transmembrane region of the YBR070c-associated sequences corresponds to the third α-helix of the Murg N-terminal domain (Nα3) that contains both hydrophobic and positively charged amino acids thought to be involved in docking the bacterial protein to the cell membrane (11).
In summary, we have identified two gene products involved in the second step of LLO biosynthesis in S. cerevisiae. Although it is not understood how the YGL047w- and YBR070c-encoded proteins function, bioinformatics information predicts that the proteins possess features required of a UDP-GlcNAc-dependent N-acetylglucosaminyltransferase capable of acting on dolichyl-PP-GlcNAc. This information provides insight into the evolutionary origin and mechanism of a key step in the eukaryotic N-glycosylation pathway, and whatever the precise function of these two proteins in this reaction, mutations in the human orthologs of these genes may underlie as yet undescribed subtypes of type I congenital disorder of glycosylation.