Sequence Properties of the 1,2-Diacylglycerol 3-Glucosyltransferase from Acholeplasma laidlawii Membranes

Synthesis of the nonbilayer-prone α-monoglucosyldiacylglycerol (MGlcDAG) is crucial for bilayer packing properties and the lipid surface charge density in the membrane of Acholeplasma laidlawii. The gene for the responsible, membrane-bound glucosyltransferase (alMGS) (EC 2.4.1.157) was sequenced and functionally cloned in Escherichia coli, yielding MGlcDAG in the recombinants. Similar amino acid sequences were encoded in the genomes of several Gram-positive bacteria (especially pathogens), thermophiles, archaea, and a few eukaryotes. All of these contained the typical EX7E catalytic motif of the CAZy family 4 of α-glycosyltransferases. The synthesis of MGlcDAG by a close sequence analog from Streptococcus pneumoniae (spMGS) was verified by polymerase chain reaction cloning, corroborating a connection between sequence and functional similarity for these proteins. However, alMGS and spMGS varied in dependence on anionic phospholipid activators phosphatidylglycerol and cardiolipin, suggesting certain regulatory differences. Fold predictions strongly indicated a similarity for alMGS (and spMGS) with the two-domain

Lipids are the local environment for most integral and peripheral membrane proteins, which often depend on the lipids for optimal function. The large diversity of lipids and the differences in composition and properties between membranes have made it difficult to identify common features of bilayer organization and to establish how lipids and proteins cooperate in local processes. Lipid-synthesizing pathways have been mapped for the most common types of lipids, and several of the enzymes catalyzing these reactions have been characterized. However, very little is known about the connection between regulation of bilayer properties and enzyme structure (1). So far, only a few lipid-synthesizing enzymes have been crystallized. Which structural properties are involved in the catalytic mechanism of these lipid enzymes, and how are the membrane properties sensed (1)?
In the well characterized plasma membrane of Acholeplasma laidlawii, the lipid composition is regulated so as to maintain (i) lipid phase equilibria, close to a potential bilayer to nonbilayer transition, (ii) a nearly constant radius of spontaneous curvature, and (iii) a certain anionic surface charge density of the lipid bilayer. The synthesis of the major nonbilayer-prone lipid in this membrane, monoglucosyldiacylglycerol (MGlcDAG) 1 (Scheme 1, step I), plays an important role in fulfilling the first two points above but also the third, since it is strongly regulated by negatively charged lipids (e.g. the major in vivo lipid phosphatidylglycerol (PG)) (2). MGlcDAG is consecutively processed into diglucosyldiacylglycerol (DGlcDAG) (Scheme 1, step II). Consequently, the formation of this glucolipid, a transfer of Glc from the donor UDP-Glc to the acceptor lipid diacylglycerol (DAG) catalyzed by a glucosyltransferase (EC 2.4.1.157) (3), plays a central part in understanding the total regulation of lipid syntheses in A. laidlawii membranes. Furthermore, glycolipids, including nonbilayer-prone ones, are major constituents in many cell surface membranes, certain bacterial groups, and most photosynthetic organelles. Relatively little is known about the synthesis and regulation of these. Usually, they are made in a separate pathway (as in A. laidlawii), branching from the conserved one to anionic phospholipids.
In this work, we have cloned the gene for the α-monoglucosyldiacylglycerol synthase from A. laidlawii membranes and a sequence analog from the pathogen Streptococcus pneumoniae and propose these genes, on the basis of sequence similarities, to belong to a new large group of lipid glycosyltransferases that are widely spread in nature. We also present a functional comparison between the two cloned glucosyltransferases and discuss structural properties based on two- and three-dimensional fold predictions from the primary structure. A striking similarity to two new, related structures for an Escherichia coli glycosyltransferase and epimerase, respectively, is indicated.

EXPERIMENTAL PROCEDURES
Strains and Genomic DNA-Strain A-EF22 of A. laidlawii was cultivated as described by Karlsson et al. (3), and genomic DNA was prepared using the GenomicPrep™ kit (Amersham Pharmacia Biotech). S. pneumoniae strain 19F CCUG 3030 was grown in Todd-Hewitt medium. For DNA extraction, an overnight culture was harvested, and the pellet was resuspended in H2O and heated at 95°C for 10 min. The supernatant from the following centrifugation contained the DNA.
PCR Amplification-The N-terminal sequence of the purified MGlcDAG synthase from A. laidlawii (alMGS) (3) was analyzed through Edman degradation, revealing a 20-residue sequence (4). An internal amino acid sequence of 10 residues was determined as above after proteolytic cleavage of the protein and separation of peptides by reverse phase high pressure liquid chromatography.
Degenerate oligonucleotides with the primary sequence 5′-ATT GGT ATI TT(T/C) TCI GAA GC-3′ and 5′-TTT ATC TGG ICC (A/G)TC (G/T)CC-3′ (DNA Technology, Denmark) were synthesized and used in a PCR amplification with genomic A. laidlawii DNA as the template. Amplification conditions using AmpliTaq® DNA polymerase were 30 cycles at 96°C for 1 min, 48°C for 1 min, and 72°C for 1 min followed by a final extension at 72°C for 10 min. Purified PCR product was ligated into a pCR-Script™ Amp SK(+) cloning vector (Stratagene) and cloned in TOP10F′ cells (Invitrogen). Screening for positive colonies was performed by blue-white color selection, combined with PCR or colony DNA hybridization. Sequence analysis of positive clones was performed using either vector or gene-specific primers and the ABI PRISM® BigDye™ Sequencing Kit (PE Applied Biosystems).
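The relationship between the two degenerate primers and the peptide sequences they were designed from can be checked with a short script. The Python sketch below expands each degenerate position and lists the amino acids every complete codon could encode. It treats inosine (I) as pairing with any base, which is a simplifying assumption, and the helper names (`parse_primer`, `encodable`, `compatible`) are illustrative rather than part of the original work. The peptide fragments tested are taken from the Edman sequences reported under Results.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def parse_primer(text):
    """Turn 'TT(T/C) TCI' style notation into a list of base sets."""
    s = text.replace(" ", "")
    positions, i = [], 0
    while i < len(s):
        if s[i] == "(":                      # explicit alternatives, e.g. (T/C)
            j = s.index(")", i)
            positions.append(set(s[i + 1:j].split("/")))
            i = j + 1
        elif s[i] == "I":                    # inosine: assumed to pair with any base
            positions.append(set("ACGT"))
            i += 1
        else:
            positions.append({s[i]})
            i += 1
    return positions


def reverse_complement(positions):
    """Reverse the primer and complement every allowed base."""
    return [{COMPLEMENT[b] for b in pos} for pos in reversed(positions)]


def encodable(positions):
    """For each complete codon, the set of amino acids it can encode."""
    result = []
    for k in range(0, len(positions) - 2, 3):
        codons = ("".join(c) for c in product(*positions[k:k + 3]))
        result.append({CODON_TABLE[c] for c in codons})
    return result


def compatible(positions, peptide):
    """True if every residue of the peptide is encodable by its codon."""
    sets = encodable(positions)
    return len(sets) == len(peptide) and all(aa in s for aa, s in zip(peptide, sets))


forward = parse_primer("ATT GGT ATI TT(T/C) TCI GAA GC")
reverse = parse_primer("TTT ATC TGG ICC (A/G)TC (G/T)CC")

print(compatible(forward, "IGIFSE"))                      # part of the N-terminal sequence
print(compatible(reverse_complement(reverse), "GDGPDK"))  # end of the internal peptide
```

Under these assumptions the forward primer is compatible with residues IGIFSE of the N-terminal sequence, and the reverse complement of the second primer with the last six residues (GDGPDK) of the internal peptide.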
Southern Blot and Hybridization-Restriction endonuclease HindIII was used for a complete digestion of A. laidlawii genomic DNA. The DNA fragments were separated by agarose gel electrophoresis and transferred to a Hybond-N membrane in a Southern blot procedure (5). Probes amplified by PCR with gene-specific primers from the cloned fragment of the ALmgs gene were [35S]dATP-labeled by nick translation and used in DNA hybridization for 18 h at 50°C. DNA fragments with the molecular mass corresponding to the hybridization bands were purified and used in a ligation reaction with the pCR-Script™ Amp SK(+) vector linearized by HindIII. Hybond-N membrane, [α-35S]dATP, and the nick translation kit N5500 were all purchased from Amersham Pharmacia Biotech. The hybridization was visualized by electronic autoradiography (Packard Instant Imager™).
Cloning and Expression-Oligonucleotides were designed for the start and stop codon region for the MGS gene from S. pneumoniae (SPmgs), with the forward primer sequence 5′-AAA GTG AGG TAA TCT ATG CGA ATT G-3′ and reverse primer sequence 5′-GCT GTT CCT CTT TCT ATT CTT CAT-3′. The corresponding oligonucleotides for the MGS gene from A. laidlawii (ALmgs) were designed with the sequence 5′-AAA GTG AGG TAA TCT ATG AGA ATT GGT ATT TTT TCG G-3′ and 5′-CTA CTT TTT ATT CAA TTT TTT GTT ATT TTT ATC-3′. Genomic DNA was used for PCR amplification, and the products were ligated into the pCR-Script vector and cloned as described above.
The alMGS was also constructed with an N-terminal His6 tag, 2 using the E. coli strain BL21 (Novagen) for cloning of the pET15b recombinant. Both TOP10F′ and BL21 were grown on agar plates supplemented with 100 μg/ml carbenicillin. Protein expression of all recombinant strains was performed in 1× LB medium supplemented with 50 μg of carbenicillin/ml. The strains were grown at 37°C, and 1 mM isopropyl-1-thio-β-D-galactopyranoside was added at A600 = 0.6. Cells were harvested by centrifugation after 5 h of incubation.
Solubilization of Cells and Lipids-E. coli cells were solubilized in assay buffer (110 mM HEPES, pH 8.0, 22 mM MgCl2, and 22 mM CHAPS) by extensive vortexing three times during incubation on ice for 3 h. The protein concentrations in solubilized cell extracts were 4-5 mg/ml, as determined by a micro-BCA kit (Pierce). Mixed micellar dispersions were prepared as described by Karlsson et al. (3), with an exception for the assay buffer (cf. above).
Enzymatic Assays-In the standard assay for MGlcDAG synthesis, 25 μl of protein solution (see above) was added to 20 μl of lipid micellar solution and incubated on ice for 30 min. The reaction was started by the addition of 5 μl of UDP-[14C]glucose to give a final concentration of 1 mM (30 GBq/mol). Standard lipid concentration was 10 mM (1 mM DAG substrate in addition to the activator 1,2-dioleoyl-sn-glycero-3-phosphoglycerol). After 30 min of incubation at 28°C, the reaction was stopped with 375 μl of methanol/chloroform, 2:1 (v/v), and the lipids were extracted and separated by TLC (2). Radiolabeled enzymatic product from MGlcDAG synthesis in vivo was utilized in a DGlcDAG synthesis assay (7), using a nearly homogeneous fraction of the DGlcDAG synthase from A. laidlawii. The lipid products on the TLC plates were visualized and quantified by electronic autoradiography (Packard Instant Imager™). All assays were done in duplicate.
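The amounts of substrate present in one standard assay follow directly from the volumes and final concentrations given above. The short Python sketch below works this out, assuming that all volumes are in microliters and that the stated concentrations are final concentrations in the complete reaction mixture; the function name `nmol` is illustrative.

```python
# Volumes in microliters, as given for the standard assay (assumed units).
protein_ul, lipid_ul, udp_glc_ul = 25.0, 20.0, 5.0
total_ul = protein_ul + lipid_ul + udp_glc_ul


def nmol(conc_mM, volume_ul):
    """Amount of substance: mM x microliter = nmol."""
    return conc_mM * volume_ul


udp_glc_nmol = nmol(1.0, total_ul)     # 1 mM UDP-glucose, final
lipid_nmol = nmol(10.0, total_ul)      # 10 mM total lipid, final
dag_nmol = nmol(1.0, total_ul)         # 1 mM DAG substrate, final

print(total_ul, udp_glc_nmol, lipid_nmol, dag_nmol)
```

Under these assumptions the total lipid amount comes to 500 nmol per assay, which agrees with the supplemented amount quoted under Results.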
The glucolipid products were also identified with a spray reagent. Lipid extracts were first separated by TLC as above, sprayed with a sulfuric acid/methanol mixture 1:1 (v/v), and exposed to 170°C. Lipids containing sugar moieties were colored purple after ~2 min (8).
Growth of Recombinant Strains-Overnight cultures were grown in 1× LB medium (50 μg of carbenicillin/ml) and used to inoculate (2%) the same medium containing 14.8 kBq of [14C]acetate/ml. The strains were grown at 37°C, and 2 mM isopropyl-1-thio-β-D-galactopyranoside was added at A600 = 0.6. Cells were harvested by centrifugation after 4 h of incubation. Lipids were extracted from the cell pellet twice by chloroform/methanol (2:1 and 0:1, v/v) and separated by TLC (0.2-mm Silica Gel 60) developed in chloroform/methanol/acetic acid (65:25:10, v/v/v). Plates were visualized by autoradiography (cf. above). The radioactivity is incorporated into the lipid acyl chains, and the labeling per lipid molecule was assumed to be approximately equal except for cardiolipin (4 chains), which was predicted to be labeled twice as much. A similar in vivo labeling was also performed with addition of [14C]UDP-Glc instead of acetate.
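The chain-number correction described above can be written as a small calculation: radioactivity per lipid spot is divided by the relative number of acyl chains before the molar fractions are normalized. The Python sketch below illustrates this under the stated assumption of equal labeling per acyl chain. The function name and the example counts are hypothetical and are not data from this study.

```python
# Acyl chains per lipid molecule; cardiolipin carries four, the others two.
ACYL_CHAINS = {"MGlcDAG": 2, "PE": 2, "PG": 2, "CL": 4}


def mole_percent(counts):
    """Convert per-lipid radioactivity into mol %, correcting for chain number."""
    relative_moles = {
        lipid: c / (ACYL_CHAINS[lipid] / 2) for lipid, c in counts.items()
    }
    total = sum(relative_moles.values())
    return {lipid: 100.0 * m / total for lipid, m in relative_moles.items()}


# Hypothetical TLC counts (arbitrary units), not data from this study.
example_counts = {"PE": 72.0, "PG": 20.0, "CL": 16.0}
print(mole_percent(example_counts))
```

With these example counts the cardiolipin signal is halved before normalization, so a spot carrying 16 units of radioactivity contributes the same number of molecules as a two-chain lipid carrying 8 units.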
Sequence Analysis and Structure Prediction-The amino acid sequence of the alMGS was used in searches for homologous/analogous sequences with PSI-BLAST (9) at the NCBI, in the data base of finished and unfinished genomes, and the Conserved Domain Data base at NCBI. Preliminary sequence data were obtained from the Institute for Genomic Research through their Web site. Genes listed in Table II and Fig. 1 are selections from searches updated March 6, 2001. Predictions of the primary and secondary structures of the MGS sequences were performed with tools available at the ExPASy Molecular Biology server (Swiss Institute of Bioinformatics) and with the Wisconsin Package version 9.1 (Genetics Computer Group, Madison, WI). For the three-dimensional structure predictions, the three-dimensional fold recognition service at the EMBL Web site (10) and Protein Data Bank (PDB)-ISL (11) at SCOP were utilized.
Nucleotide Sequence Accession Number-Nucleotide sequence data have been deposited at GenBank™ with accession number AF349769.

RESULTS
Gene Cloning and Sequence-From the purified MGlcDAG synthase of A. laidlawii (3), the N-terminal (MRIGIFSEAYLPLISGVV) and an internal (FIIIGDGPDK) amino acid sequence were determined. A PCR amplification from chromosomal DNA with the corresponding degenerate DNA oligonucleotide primers yielded a 762-base-long nucleotide fragment. This fragment was used as the basis for probes in a Southern blot procedure, which identified a 4.2-kilobase-long nucleotide sequence described in Table I. An open reading frame of 1197 bases, a putative transcription start at base −11, and a potential ribosome-binding site were identified on the basis of a matching of the translated sequence with the two amino acid sequences above. The G + C content of the DNA was typical for acholeplasmas (30.8%). This open reading frame coded for a protein with 398 amino acids, lacking a signal peptide (according to SignalP), and was given the gene name ALmgs. The amino acid sequence is not related to those listed by the UGT Nomenclature Committee (12). No other potential lipid-synthesizing enzyme genes were present on the contig. However, the two tRNA-amino acid synthases indicate a conserved chromosomal environment. The A. laidlawii tRNA Leu has been described earlier in Ref. 13.
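The reported reading frame length and protein length are mutually consistent if the 1197 bases include the stop codon, which is an assumption made here. The Python sketch below shows that arithmetic together with a simple G + C calculation. The function names are illustrative, and the short test sequence is only the coding part of the ALmgs forward primer given under Experimental Procedures, not the gene itself.

```python
def protein_length(orf_bases):
    """Residues encoded by an ORF whose length includes the stop codon."""
    if orf_bases % 3:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_bases // 3 - 1


def gc_percent(dna):
    """Percentage of G and C bases in a DNA string."""
    dna = dna.upper()
    return 100.0 * sum(dna.count(base) for base in "GC") / len(dna)


print(protein_length(1197))

# Coding part of the ALmgs forward primer (start codon onward), for illustration.
primer_fragment = "ATGAGAATTGGTATTTTTTCGG"
print(round(gc_percent(primer_fragment), 1))
```

A whole-contig G + C value would be obtained by passing the deposited nucleotide sequence to `gc_percent` in place of the primer fragment.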
The amino acid sequence for ALmgs was used as a query in a PSI-BLAST homology search. A selection of the best hits is presented in Table II; among the 20 top ones, putative glycosyltransferases from a large variety of organisms were found. All of these belong to family 4 (retaining GTs) in the glycosyltransferase systematics (CAZy) (14), share the typical residues for α-GTs in this class, and belong to family D in the classification by Breton et al. (15). However, the best scores were revealed in a data base for finished and unfinished microbial genomes. Here, the best hit was a sequence coding for a protein in Treponema denticola with 36% amino acid identity to the MGlcDAG synthase (the six best hits all had identities above 30%). The translated sequences from Enterococcus faecalis, Streptococcus pyogenes, S. pneumoniae, and T. denticola (second row) were aligned with the alMGS sequence (top) (Fig. 1). The conserved residues in all five sequences were mainly concentrated in three domains: the first 40 residues, residues 90-130, and above all amino acids 280-310, which contain the characteristic motif EX7E of family 4 GTs (cf. above).
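The EX7E motif is simple enough to locate with a regular expression: a glutamate, any seven residues, and a second glutamate. The following Python sketch scans a protein sequence for all occurrences, including overlapping ones. The function name and the toy sequence are hypothetical. Because such a short pattern will also occur by chance, a hit is only meaningful in the context of the surrounding conserved region.

```python
import re

# Lookahead so that overlapping occurrences are also reported.
EX7E = re.compile(r"(?=(E.{7}E))")


def find_ex7e(sequence):
    """Return (1-based start, matched segment) for every EX7E occurrence."""
    return [(m.start() + 1, m.group(1)) for m in EX7E.finditer(sequence.upper())]


# Hypothetical toy sequence, not a real protein.
toy = "MKE" + "A" * 7 + "EGG"
print(find_ex7e(toy))
```

Applied to an aligned set of homologs, the same function could be used to confirm that the motif falls within the conserved 280-310 region in each sequence.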
A search in the protein domain data base (at NCBI) showed that alMGS and the potential homologs belonged to pfam00534, which is the glycosyltransferase group 1. These proteins transfer NDP-linked sugars, like glucose, galactose, mannose, and X-glucose, to a variety of acceptor substrates such as glycogen and lipopolysaccharides. Part of the S. pneumoniae gene (Fig. 1) has been annotated before but with unknown function (16); the full sequence was retrieved from a contig (gnl|TIGR|S.pneumoniae_3836) in the finished and unfinished microbial genome data base and was given the name SPmgs. In order to establish the potential functions of the proteins in Fig. 1, SPmgs, ranked fifth in the figure, was PCR-cloned from S. pneumoniae chromosomal DNA.
Gene Functions-The genes ALmgs and SPmgs were placed under control of the lac promoter but out of frame with the β-galactosidase gene in the pCR-Script vector in E. coli TOP10F′. A putative ribosome-binding site found at −10 upstream from the start codon in SPmgs was used for both constructs. The ALmgs gene was also ligated into a pET15b vector and overexpressed in E. coli BL21 as a recombinant protein variant with a His6 tag fused to the N terminus. 2 Harvested cells from induced recombinant and control strains were solubilized by CHAPS detergent, and standard in vitro assays for MGlcDAG synthesis were performed, using substrates and activator lipid according to "Experimental Procedures" (cf. Ref. 3). The results (Table III) revealed that the encoded proteins were able to catalyze the assumed or predicted glucosylation reaction. The same radiolabeled product was synthesized independent of which of the two substrates, [14C]UDP-Glc or [14C]DAG, was labeled. In order to verify that the lipid product synthesized in vivo (Fig. 2, lane 5) was MGlcDAG, it was extracted from the TLC plate and used in an in vitro assay for DGlcDAG synthesis with purified DGlcDAG synthase from A. laidlawii membranes (7). This enzyme can only use α-MGlcDAG as the lipid substrate and not β-MGlcDAG or Gal variants. The extracted, radiolabeled lipid (Fig. 2, lane 1) could indeed be used as substrate in the DGlcDAG synthesis (Fig. 2). The intensity of the radiolabeled product was increased when 14C-labeled UDP-glucose was used (lane 3). The MGlcDAG TLC spot was also identified as a glycolipid by charring with sulfuric acid/methanol (1:1, v/v) (see "Experimental Procedures"). A typical purple color characteristic for glycolipids was observed (data not shown).
Hence, the two genes ALmgs and SPmgs, with the translated amino acid sequences indicated in Fig. 1, encode analogous enzymes, which both perform the synthesis of the membrane lipid MGlcDAG.
Lipid Composition in Recombinant Cells-The lipid composition in the recombinant E. coli strains was analyzed by incorporation of [14C]acetate into the lipids during growth. Four major lipids were recognized on the TLC plates: the glucolipid MGlcDAG, phosphatidylethanolamine (PE), and the negatively charged lipids phosphatidylglycerol (PG) and cardiolipin (CL) (Fig. 2). The control TOP10F′ strain, containing a pCR-Script vector, had a lipid composition normal for E. coli wild type, with about 72% PE and 28% negatively charged lipids (Table III). The recombinant strain with expressed alMGS contained a significant fraction of MGlcDAG, about 10%. The fraction of anionic lipids was kept constant, while the nonbilayer lipid PE had decreased to about 63%. However, the homologous glucosyltransferase from S. pneumoniae, although active in vitro, did not affect the lipid composition in vivo, and only traces of MGlcDAG were found (Table III, Fig. 2). The BL21 strain, overexpressing the His-tagged alMGS, showed a slightly lower synthesis of the glucolipid compared with TOP10F′, indicating that the N-terminal His extension did not seriously affect enzyme activity. The overexpressed GTs decreased the growth rates compared with controls, and the latter reached the stationary phase faster, which may influence the fractions of CL and PG.
The results in Fig. 3 show that all four lipids were potent activators for the two GTs, but to different extents. An increased fraction of PG and CL gave sigmoidal-like activation curves for alMGS, while spMGS responded only to PG. Cardiolipin was able to activate the spMGS but to a very low extent. The lipid-like detergent, PGD, showed only slightly activating effects on spMGS. This was also true for the alMGS at lower concentrations, but the activity increased significantly at concentrations above 25%. Without supplemented DAG substrate, traces of MGlcDAG product could still be detected. This was most probably due to a minor fraction of endogenous DAG present in the added E. coli cell suspension. The total content of E. coli lipids present in an assay was estimated to be less than 50 nmol (cf. the supplemented amounts of 500 nmol). All observed effects on the cloned alMGS (Fig. 3) were in agreement with earlier studies of the native enzyme (2, 17).
Structure Predictions-alMGS is firmly anchored in the membrane, and detergents are needed for solubilization (3). Potential hydrophobic transmembrane (TM) segments in the sequences were investigated with a number of prediction methods at the ExPASy Molecular Biology server. One TM (residues 3-22) was proposed according to HMMTOP (20), TMpred (residues 3-24) (21), and TopPred2 (residues 4-24) (22), but not by SOSUI (23). The orientation of the putative TM was uncertain. This segment also had a substantial amphipathic character according to a hydrophobic moment analysis (24). According to the majority of methods used, spMGS was a soluble protein lacking TM segments; the exception was HMMTOP (20), which predicted one at residues 3-21.
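A hydrophobic moment analysis of the kind mentioned above sums the residue hydrophobicities as vectors rotated by a fixed angle per residue (about 100° for an α-helix); a large resultant indicates that hydrophobic and polar residues segregate onto opposite faces. The Python sketch below illustrates the calculation. The Kyte-Doolittle hydropathy values are used here only as an assumed, convenient scale, and the method of Ref. 24 may rely on a different one; the test peptides are hypothetical.

```python
import math

# Kyte-Doolittle hydropathy values (an assumed scale for illustration).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_moment(segment, angle_deg=100.0, scale=KD):
    """Mean hydrophobic moment per residue for a periodic structure.

    angle_deg is the rotation per residue (about 100 degrees for an alpha-helix).
    """
    delta = math.radians(angle_deg)
    sin_sum = sum(scale[aa] * math.sin(n * delta) for n, aa in enumerate(segment))
    cos_sum = sum(scale[aa] * math.cos(n * delta) for n, aa in enumerate(segment))
    return math.hypot(sin_sum, cos_sum) / len(segment)


uniform = "L" * 18                      # same residue on every face of the helix
amphipathic = "LKKLLKKLLKKLLKKLLK"      # hypothetical peptide with a polar face
print(hydrophobic_moment(uniform), hydrophobic_moment(amphipathic))
```

Sliding a window of fixed length along a sequence and recording the moment at each position would give a profile from which strongly amphipathic segments can be picked out.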
The alMGS sequence showed a low homology to proteins in the structure data base (PDB). However, a three-dimensional fold prediction method based on homologous sequence searches (10) listed MurG from E. coli (sequence identity of 14%; 26% similar amino acids), which is a glycosyltransferase catalyzing the last step in the peptidoglycan precursor pathway (25,26), and a soluble UDP-N-acetylglucosamine 2-epimerase from E. coli (27) (12% identical and 24% similar amino acids). These two have very similar structures (27), but MurG was proposed to be attached to the inner membrane (no predicted TM). An alignment of MurG and alMGS by ClustalW was used to find sequence and potential structural similarities (some gaps included) between the two membrane-associated proteins. The aligned sequences were marked with the predicted secondary structure for alMGS (from Jpred (28)) and the established structure of MurG recently determined (29). The results (Fig. 4) indicated that the alMGS exhibits a similar but not identical secondary structure and topology to MurG and the epimerase. Besides this, initial circular dichroism studies of a purified His-tagged alMGS suggest both α- and β-structures, 2 in agreement with the α/β open sheet structure determined for the MurG domains.

FIG. 1. Sequence conservation in the alMGS and homologs from pathogenic bacteria. Amino acid sequence alignment (ClustalW) is shown of some of the closest homologs found by a BLAST search in microbial genomes data base (finished/unfinished) at NCBI. Shaded amino acids indicate residues identical for all homologs. Amino acid sequences obtained by Edman degradation of the N terminus and a fragment of the purified protein are underlined. The characteristic, potential active site EX7E motif for this group of glycosyltransferases is boxed. Ala, A. laidlawii; Tde, T. denticola (36% identity); Efa, E. faecalis (33% identity); Spy, S. pyogenes (34% identity); Spn, S. pneumoniae (32% identity).
Furthermore, two regions in this alignment had higher identities (Fig. 4); residues 62-105 and 290-318 (the EX7E motif) in alMGS showed similarity to residues 71-114 (25% identity) and 254-278 (37% identity) in MurG. A closer study of residues 74-85 in the alMGS revealed a mixture of hydrophobic and basic amino acids in a predicted amphipathic α-helix (24). The corresponding sequence in MurG (residues 84-95) is an α-helix proposed to be part of the membrane-binding domain (29). The amphipathic characters are also evident from helical wheel presentations (data not shown). Residues 102-106 in MurG consist of a glycine-rich loop (G loop) localized between a β-strand and an α-helix (29). This motif is related to the one included in the classical Rossmann fold (30,31). Interestingly, a similar motif is present in the alMGS (putative G loop sequence SXGXXG) (Fig. 4).
Other membrane-binding segments may be present as well. Searching the PDB Intermediate Sequence Library at the SCOP data base (11) with the alMGS sequence as a probe revealed domains in the two related botulinum (PDB structure 3BTA) and tetanus (PDB structure 1A8D) neurotoxins, close to the binding site for the negatively charged neuronal ganglioside lipid (32). Sequence segment 212-260 in alMGS, containing two conserved regions in the potential lipid GTs (Fig. 1), could be modeled on the PDB 1A8D structure template by SwissModel (ExPASy server (33)). Positions 225-239 had the largest hydrophobic moment (24) for the entire alMGS. Likewise, this segment could also be modeled on the membrane-binding 62-106 segment in MurG (cf. Fig. 4). Another motif spanning from about residue Ile111 to Tyr127, with a large hydrophobic moment in alMGS, was highly conserved in all the potential lipid GTs (Fig. 1). This amino acid stretch was, despite a low similarity, possible to align with a motif (Pro121-Lys136) conserved among MurG proteins (29). Generally, amphipathic segments (24) were less frequent and of smaller magnitude in MurG than in alMGS.
A theoretical pI was calculated to be around 9 (or higher) for the sequences in Fig. 1, with the exception of spMGS and E. faecalis with a pI of about 5-6. For alMGS, a high number of basic residues were found in the N-terminal half, while the second half was dominated by acidic residues. This polarization of charges along the sequence gave a high pI (~10) for the N-terminal half and a lower pI (~7) for the C-terminal half, a difference that was analogous but smaller for spMGS. Interestingly, a similar pattern of charge distribution is valid for MurG, with a pI for the full sequence calculated to be 10.2, while the C-terminal half had an acidic pI of 6.2.
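A theoretical pI is the pH at which the summed charges of the ionizable groups cancel, and comparing the two halves of a sequence is then a matter of running the same calculation on each half. The Python sketch below illustrates the idea with a bisection search on the net charge. The pKa values are approximate, textbook-style numbers chosen as an assumption for illustration; published pI tools use their own calibrated sets, so absolute values will differ from those reported here. The toy sequence is hypothetical.

```python
# Approximate side-chain and terminal pKa values (illustrative assumptions;
# published pI tools use their own calibrated sets).
POSITIVE = {"K": 10.5, "R": 12.5, "H": 6.0}
NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}
N_TERM_PKA, C_TERM_PKA = 9.0, 2.0


def net_charge(sequence, ph):
    """Net charge from Henderson-Hasselbalch terms for each ionizable group."""
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))
    charge -= 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))
    for aa in sequence:
        if aa in POSITIVE:
            charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE[aa]))
        elif aa in NEGATIVE:
            charge -= 1.0 / (1.0 + 10 ** (NEGATIVE[aa] - ph))
    return charge


def theoretical_pi(sequence, tol=1e-4):
    """Bisection on pH 0-14; net charge decreases monotonically with pH."""
    low, high = 0.0, 14.0
    while high - low > tol:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


def half_pis(sequence):
    """Theoretical pI of the N-terminal and C-terminal halves."""
    mid = len(sequence) // 2
    return theoretical_pi(sequence[:mid]), theoretical_pi(sequence[mid:])


# Hypothetical sequence: basic N-terminal half, acidic C-terminal half.
toy = "MKRKLAKGRK" + "DEELADGEDL"
print(theoretical_pi(toy), half_pis(toy))
```

For the toy sequence the N-terminal half gives a much higher value than the C-terminal half, with the full-length value in between, which is the qualitative pattern of charge polarization described above.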
Hence, the alMGS lipid glucosyltransferase seems to have several structural features in common with certain membrane-binding proteins of known structure, especially the E. coli GT MurG.

DISCUSSION
Lipid Glycosyltransferase Genes-The gene for the well studied MGlcDAG synthase from A. laidlawii strain A-EF22 was cloned, and the encoded catalytic function was confirmed. In the standard assay procedure, rac-1,2-diacylglycerol was utilized as the acceptor and UDP-α-glucose as the donor substrate for synthesis of the lipid product MGlcDAG. The stereochemistry of the sugar moiety was characterized indirectly by a coupled enzymatic synthesis of the subsequent glucolipid DGlcDAG, which specifically demands α-MGlcDAG as substrate (7). β-MGlcDAG is not present in A. laidlawii (34). Related Gram-positive bacteria, like the ones in Fig. 1, all contain the lipid α-MGlcDAG in their membranes (35).
In addition, α-MGlcDAG is the structural base of the lipoteichoic acid in S. pneumoniae, anchoring this cell wall polymer into the cytoplasmic membrane (36). The visualization of MGlcDAG from the S. pneumoniae gene (clone SPmgs; Fig. 2) strongly indicates that the corresponding genes in Fig. 1, being more similar to the alMGS sequence than spMGS, all encode the MGlcDAG synthesis function. Likewise, treponemas and other spirochetes all contain a monoglycosyl-DAG, where the hexose is glucose, galactose, or mannose (37). The T. denticola sequence in Fig. 1 most likely encodes the GT needed for this synthesis. In an analogous manner to S. pneumoniae, this lipid may also be the anchor to the complex outer membrane sheath lipid OML521 in T. denticola (38). Additional homologs from other pathogens, not shown here, were also found in, for example, Streptococcus mutans, Clostridium acetobutylicum, and Streptococcus equi.
In the Gram-positive sequences of Fig. 1, the MGlcDAG synthase genes are adjacent to another gene (potentially in an operon), tentatively identified by us as a glycosyltransferase of CAZy family 4 (cf. above). No other GT was identified next to the A. laidlawii MGS gene (see Table I). Likewise, in the related mollicute Mycoplasma pneumoniae, three potential lipid GT genes lie separated on the chromosome. 3 However, the alMGS sequence was not related to the M. pneumoniae ones, in agreement with the different glycolipid structures. 3 This potential group of lipid GTs seems to be widely spread in nature according to the list of selected orthologs in Fig. 1 and Table II. Prokaryotes, including Gram-positive and Gram-negative eubacteria, and archaea are represented, but the list also contains sequences from eukaryotes. Interestingly, they were found also in the hyperthermophiles Thermotoga maritima and Pyrococcus horikoshii, members of eubacteria and archaea, respectively. This all indicates that a common ancestor to these orthologs arose very early in evolution, before the separation of the bacterial from the archaeal lineages. All analogs (Fig. 1 and Table II) analyzed contained the EX7E motif typical for the retaining α-GTs of CAZy family 4 (14). A number of the analogs in Table II are involved in the synthesis of various lipids or lipid-based molecules. The Borrelia burgdorferi enzyme synthesizes monogalactosyl-DAG, 4 in accordance with the reported presence of this lipid in B. hermsii (40), and the S. pneumoniae homolog encodes the α-MGlcDAG synthase (this work). Lactococcus, Deinococcus, Thermotoga, and Pyrococcus species (Table II) all contain various glycolipids, including α-MGlcDAG in the two former ones (35,41). However, the gene from Pseudomonas aeruginosa is not the one synthesizing the excreted rhamnolipid (42).
The Synechocystis gene, ranked as number 11 (Table II), was recently suggested as a lipid α-glycosyltransferase catalyzing the synthesis of sulfoquinovosyl-diacylglycerol (43).
The Bacillus subtilis gene (Table II) is tuaC, involved in lipoteichoic acid synthesis (44). Number 17 is Rv0557 from Mycobacterium tuberculosis, and it was recently identified as a mannosyltransferase (PimB) acting on a phosphatidylinositol lipid (45). In CAZy family 4 (14), there are more than a dozen open reading frames from Arabidopsis thaliana. One of these genes (CAB69850; Table II) showed homology to the MGlcDAG synthase from A. laidlawii and an even higher similarity (35% identity) to the proposed sulfoquinovosyl-diacylglycerol synthase from Synechocystis (cf. above). Like the β-MGalDAG synthase (a GT) from cucumber (46), the two analogs from Arabidopsis and Synechocystis (slr0384) seemed to contain a leader and signal peptide of about 103 and 31 amino acids, respectively (ChloroP/SignalP prediction). They are probable transit peptides required for import through the chloroplast envelope and export to a proper Synechocystis compartment.
Hence, the alMGS enzyme is a member of a potentially large and conserved group of lipid glycosyltransferases in nature. Most important, this group is not closely related in sequence to the corresponding β-MGalDAG synthases in plant chloroplasts (47).

FIG. 4. … (29), was proposed to be similar to alMGS by a three-dimensional fold prediction method (10). The two sequences were aligned with ClustalW, and the potential secondary structure of alMGS was predicted by Jpred (at the ExPASy server); this is shown along the established MurG structure (x-ray). A very good prediction of the determined secondary structures of MurG and the epimerase was also achieved by Jpred (data not shown). Light gray, α-helix; dark gray, β-strand. Two interesting regions with high sequence homologies are boxed. The first stretch is proposed to be a membrane-associating domain, and the second contains the motif for UDP-sugar binding. The outside sequences are represented by a plus sign for positively charged residues (Lys, Arg), minus sign for negative ones (Asp, Glu), and dots for others. In the first box, residues in proposed G-loops are marked with boldface italic type (Gly, Ser). Positions 111-127 are strongly conserved in the …
Regulation of Activity-The enzymatic regulation of the MGlcDAG synthesis in A. laidlawii has been extensively characterized (2,6). Certain lipids activated the alMGS due to their charge properties, with PG as the most potent activator. Here, the two enzymatic activities expressed in E. coli were studied in vitro in a mixed micellar system with respect to the effects of negatively charged lipids (Fig. 3). The sigmoidal curves shown for PG and CL in the activation of alMGS reached their maximum at a similar fraction of negative charges (two in CL) (Fig. 3B). Similarly, PG also stimulated the activity of spMGS, whereas CL gave a very poor response. This difference in regulatory properties is notable, since both PG and CL are major lipids in S. pneumoniae (19), but only PG has been found in this strain of A. laidlawii (18). The binding of alMGS to lipid bilayers was recently shown to be modulated by electrostatic interactions (footnote 2), with a preference for binding to PG- and CL-enriched membranes. The regulation of spMGS activity may be governed in a different way; the low response to CL and the stimulatory effects of PG indicate a reduced ability to interact with CL. spMGS has substantially different charge properties, as illustrated by the lower pI of its N- and C-terminal halves (domains) (see above). This may serve to inhibit synthesis of too much nonbilayer-prone lipid, since both CL and MGlcDAG have such properties. Alternatively, CL may be a true inhibitor of the spMGS enzyme. In the A. laidlawii strain used, which lacks CL, this is evidently not the case. In the two ALmgs recombinant clones in vivo, the new nonbilayer-prone glucolipid was synthesized to ~10 mol %. The fraction of PE was lowered to the same extent, while the fraction of the negatively charged lipids was maintained. This down-regulation of the major nonbilayer E. coli lipid may be an enzymatic regulation of the lipid synthesis to keep certain biophysical properties, like the spontaneous curvature (48), intact in the bilayer. However, the nonbilayer-prone β-MGalDAG from cucumber did not cause an analogous reduction of only PE in E. coli (46).
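The per-charge comparison of PG and CL made in this subsection can be illustrated with a minimal sketch. It assumes that PG carries one negative charge and CL two (as stated in the text), that the "fraction of negative charges" is simply the number of charges per amphiphile molecule in the mixed micelle, and that the sigmoidal response can be approximated by a Hill-type function. The function names and all parameter values are hypothetical and are not fitted to the data in Fig. 3.

```python
# Assumed negative charges per lipid molecule (from the text: two in CL).
CHARGES = {"PG": 1, "CL": 2}


def charge_fraction(lipid, mol_fraction):
    """Negative charges per amphiphile molecule (assumed definition)."""
    return CHARGES[lipid] * mol_fraction


def hill_activity(x, v_max, x_half, n):
    """Hill-type sigmoidal response to charge fraction x.

    v_max, x_half and n are illustrative parameters, not measured values.
    """
    if x <= 0:
        return 0.0
    return v_max * x**n / (x_half**n + x**n)


if __name__ == "__main__":
    # On a per-charge axis, a given mole fraction of CL corresponds to
    # twice that mole fraction of PG.
    for lipid, x in (("PG", 0.2), ("CL", 0.1)):
        sigma = charge_fraction(lipid, x)
        print(lipid, x, sigma, hill_activity(sigma, 1.0, 0.2, 4))
```

Under these assumptions, activation curves for PG and CL that coincide when plotted against charge fraction would be consistent with a purely electrostatic activation, whereas the poor CL response of spMGS would show up as a deviation from the common curve.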
The lack of in vivo MGlcDAG synthesis in the SPmgs clone may depend on (i) the presence of CL in E. coli (cf. above); (ii) a lower density of basic amino acids in spMGS, leading to a weaker binding to an intracellular anionic membrane; or (iii) an inhibitor acting on the enzyme in vivo but not in vitro. Early studies of this enzyme in S. pneumoniae (49) localized the glucosyltransferase activity to a soluble fraction, indicating that the protein was not tightly bound to the membrane. The major nonbilayer-prone lipid PE in E. coli did not act as an inhibitor of this enzymatic activity according to results from in vitro experiments (data not shown). PE has not been found in S. pneumoniae.
Structure Proposal-Up to now, the structures of only a handful of NDP-glycosyltransferases have been determined. The majority of these use the inverting mechanism, and the glycosidic bonds formed in their products are in the β-configuration, but one exception is the newly determined structure of LgtC in Neisseria meningitidis (50), which uses a retaining mechanism. The sequence similarity between these structures is low, and they are classified into different glycosyltransferase (CAZy) families (14). However, their three-dimensional structures fall into only two superfamilies (51). The three-dimensional fold prediction (10) for alMGS (see "Results") is proposed to be similar to one of these, containing the membrane-bound E. coli MurG (29) and soluble UDP-N-acetylglucosamine 2-epimerase (27) but also phage T4 β-GT structures (51). Similar predictions were valid for all of the most closely related sequences in Fig. 1 and for Lactococcus lactis and B. burgdorferi in Table II. Likewise, this was also the case for the cucumber β-MGalDAG GT and its Arabidopsis homolog (data not shown). The latter two and MurG belong to CAZy family 28 (14), strongly indicating structural similarities between family 28 and family 4, including alMGS. The LgtC in N. meningitidis, with a retaining mechanism, belongs to family 8. Although they have an analogous catalytic mechanism, LgtC and alMGS do not show any strong structural homology.
A prediction of alMGS secondary structures and an alignment along the MurG sequence (Fig. 4) revealed several surprising similarities and several regions potentially involved in membrane binding. One of these (positions 212-260; Figs. 1 and 4) could be modeled on an analogous region in two membrane-binding toxins (cf. above), but most typical was the amphipathic character of all of these regions. Such features are described for many proteins binding to lipid bilayer surfaces, and they are analyzed in more detail for a number of established amphipathic helices from the latter (recently reviewed by Johnson and Cornell (52)). Most similar in this collection of amphipathic helices was a membrane-binding segment of DnaA (positions 366-388), initiating chromosome replication in E. coli (53); it aligned with positions 218-240 in the region with the largest hydrophobic moment of the entire alMGS sequence (cf. Figs. 1 and 4). The importance of this DnaA segment for phospholipid interaction is visualized by recent mutant studies (54).
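How a region "with the largest hydrophobic moment" can be located along a sequence is sketched below. This is a minimal illustration and not necessarily the procedure used here: it assumes an ideal α-helix (100° per residue), the Kyte-Doolittle hydropathy scale, and an arbitrary window length, none of which are specified in the text. The peptide in the usage example is hypothetical and is not the alMGS sequence.

```python
import math

# Kyte-Doolittle hydropathy values (assumed scale; the scale used in the
# original analysis is not stated).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_moment(segment, angle_deg=100.0):
    """Mean hydrophobic moment per residue of a segment.

    Each residue contributes a vector of length H_i pointing at i * angle_deg
    around the helix axis; the moment is the length of the vector sum.
    """
    delta = math.radians(angle_deg)
    s = sum(KD[aa] * math.sin(i * delta) for i, aa in enumerate(segment))
    c = sum(KD[aa] * math.cos(i * delta) for i, aa in enumerate(segment))
    return math.hypot(s, c) / len(segment)


def window_moments(sequence, window=18):
    """(start index, moment) for every window along the sequence."""
    return [
        (start, hydrophobic_moment(sequence[start:start + window]))
        for start in range(len(sequence) - window + 1)
    ]


def best_window(sequence, window=18):
    """Window with the largest hydrophobic moment (0-based start index)."""
    return max(window_moments(sequence, window), key=lambda item: item[1])


if __name__ == "__main__":
    toy = "GSGSGSLKKLLKKLLKKLLKKLLKGSGSGS"  # hypothetical peptide
    print(best_window(toy, window=18))
```

A uniformly hydrophobic segment gives a moment near zero in this scheme, because its contributions cancel around the helix; a large value requires hydrophobic and polar residues to segregate onto opposite faces, which is the amphipathic character discussed above.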
Searching the sequences of several lipid-binding proteins, we found several intriguing similarities. The negatively charged signal recognition particle receptor FtsY of E. coli, integrating into anionic phospholipids (55), has a positively charged amphipathic α-helix 4 in the structure (56), very similar in sequence to the position 75-91 amphipathic segment in alMGS (Fig. 4). Likewise, but with a slightly lower similarity, this alMGS segment resembled the C-terminal amphipathic anchor segments of the LgtC galactosyltransferase (57) and the E. coli phosphatidylserine synthase (58). Phosphatidylserine synthase is the rate-keeping step for the synthesis of the major nonbilayer-prone lipid PE in E. coli (39) and associates to a negatively charged lipid surface (6). The features discussed above and the similarity of this amphiphilic, positively charged segment in alMGS with the aligned membrane-binding segment in MurG (first box in Fig. 4) strongly support a similar anchoring function for these two. In the soluble epimerase (above) the corresponding helix sequence segment has fewer positive and more negative charges and is now the contact region in the dimer (27), with no membrane attachment.

FIG. 5. Schematic diagram showing a putative binding mechanism for the alMGS to a lipid membrane. The schematic is based on liposome binding studies of alMGS (6) and its structural homology to MurG from E. coli. The latter has two domains with a catalytic cleft in between and a membrane-binding site in the N-terminal domain. We propose a similar structure for alMGS with an uneven charge distribution on the protein surface. Basic and hydrophobic residues in one or more amphipathic helices in the N-terminal domain (labeled with stripes) induce binding to the anionic lipid membrane. The catalytic pocket includes the EX7E motif proposed to bind UDP-Glc.
A model (schematic diagram) for the interaction and anchoring of alMGS with a lipid bilayer surface by a combination of charge-charge and hydrophobic interaction is shown in Fig. 5. It is based on (i) the indicated similarities (see "Results") between the structurally determined MurG glycosyltransferase and alMGS (and modeled on the former); (ii) the cooperative dependence of alMGS activity on the activator lipid PG (3); (iii) the corresponding dependence of alMGS binding to PG-enriched (and CL-enriched) bilayers (footnote 2); (iv) the ability to release most alMGS from membranes only by detergents and chaotropic agents (footnote 5); and (v) the presence of several potential amphipathic helix segments in the alMGS sequence, typical for many lipid surface-associated proteins (cf. Johnson and Cornell (52)). Here, a close approach of the active site region to the bilayer surface, containing the hydrophobic substrate DAG, may be governed or modulated by the type and amount of negatively charged activator lipids.
In summary, the enzyme synthesizing the major nonbilayer-prone membrane lipid MGlcDAG in A. laidlawii is related to a large group of lipid glycosyltransferases in nature. It has homologs in related pathogenic bacteria and a structure potentially similar to E. coli MurG, and it is probably attached to the membrane by charge-charge and hydrophobic interactions.