The Assembly System for the Lipopolysaccharide R2 Core-type of Escherichia coli Is a Hybrid of Those Found in Escherichia coli K-12 and Salmonella enterica

In Escherichia coli F632, the 14-kilobase pair chromosomal region located between waaC (formerly rfaC) and waaA (kdtA) contains genes encoding enzymes required for the synthesis of the type R2 core oligosaccharide portion of lipopolysaccharide. Ten of the 13 open reading frames encode predicted products sharing greater than 90% total similarity with homologs in E. coli K-12. However, the products of waaK (rfaK) and waaL (rfaL) each resemble homologs in Salmonella enterica serovar Typhimurium but share little similarity with E. coli K-12. The F632 WaaK and WaaL proteins therefore define differences between the type R2 and K-12 outer core oligosaccharides of E. coli lipopolysaccharides. Based on the chemical structure of the core oligosaccharide of an E. coli F632 waaK::aacC1 mutant and in vitro glycosyltransferase analyses, waaK encodes UDP-N-acetylglucosamine:(glucose) lipopolysaccharide α1,2-N-acetylglucosaminyltransferase. The WaaK enzyme adds a terminal GlcNAc side branch substituent that is crucial for the recognition of core oligosaccharide acceptor by the O-polysaccharide ligase, WaaL. Results of complementation analyses of E. coli K-12 and F632 waaL mutants suggest that structural differences between the WaaL proteins play a role in recognition of, and interaction with, terminal lipopolysaccharide core moieties.

Lipopolysaccharides (LPS) are major and characteristic components of the outer membrane of Gram-negative bacteria. The hydrophobic lipid component (lipid A) anchors the LPS molecule in the outer membrane. Lipid A is linked to a core oligosaccharide (core OS) of 10-15 sugars; the core OS is often phosphorylated. The resulting basic structure is known as rough or R-LPS. In the Enterobacteriaceae, R-LPS is capped by an O antigen side chain polysaccharide (O-PS) to form LPS molecules termed smooth (or S-LPS). In contrast, some organisms, like Haemophilus influenzae or Neisseria gonorrhoeae, lack O-PS but modify their R-LPS by addition of a few glycosyl residues to produce lipo-oligosaccharide. In the Enterobacteriaceae, the core OS is divided into two structural regions, an inner core containing Kdo and heptose and an outer core region consisting primarily of hexose and acetamido sugars. Whereas the inner core is highly conserved among members of the Enterobacteriaceae, the outer core region exhibits variation in its components and structure. Indeed, although there is only one wild-type core structure currently described in Salmonella spp. (Ra core), there are five different core OS structures in Escherichia coli (designated K-12, R1, R2, R3, and R4) which are differentiated based on their outer core OS structures. The structures of the outer core OSs of Salmonella enterica, E. coli K-12, and E. coli R2 are shown in Fig. 1A.
Lipid A-core and O-PS are formed by independent assembly pathways (1-3). The core OS biosynthesis region of the chromosome (formerly known as the rfa region) contains genes that define unique core OS structures. Many of the genes at this locus code for glycosyltransferases which sequentially elongate the core OS on a lipid A acceptor. The chromosomal core OS biosynthesis region of E. coli K-12 has been entirely sequenced, and the majority of the equivalent region has been completed in S. enterica (Fig. 1B). Most of the known core OS biosynthesis genes in S. enterica have predicted products that are highly similar (greater than 70% total similarity) to E. coli K-12 counterparts. Striking exceptions are WaaK and WaaL, where the similarity is less than 35% (4). The WaaL protein is the only gene product known to be involved in the ligation of pre-assembled O-PS to lipid A-core. This occurs at the periplasmic face of the plasma membrane, prior to translocation of completed S-LPS to the outer membrane (reviewed in 1). WaaL mutants of both E. coli K-12 and S. enterica are unable to "cap" the lipid A-core molecule with an O-PS. The WaaL enzyme of E. coli K-12 has relaxed specificity for the polymer it attaches to lipid A-core since it can effectively ligate a number of "native" E. coli polymers, as well as an ever increasing range of O-PS structures resulting from expression of cloned O-PS-biosynthesis genes in E. coli K-12 (1). From the limited available data, it appears that the S. enterica WaaL protein shows similar relaxed specificity for polymer structure. Ligase enzymes from different bacteria are therefore expected to share a common
* This work was supported in part by funding awarded (to C. W.) by the Canadian Bacterial Diseases Network and by the Natural Sciences and Engineering Research Council. The costs of publication of this article were defrayed in part by the payment of page charges.
This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
The nucleotide sequence(s) reported in this paper has been submitted to the GenBank TM /EBI Data Bank with accession number(s) AF026386 and AF019375.
§ Recipient of a Natural Sciences and Engineering Research Council postdoctoral fellowship.
Ligation is a crucial step in the assembly of S-LPS. Since the O-PSs of pathogenic bacteria are usually required for resistance to complement-mediated killing (5), the ligation step is important for survival in the host and could potentially be exploited for novel therapeutic approaches. However, the mechanism of ligation is unknown. Differences in WaaL sequences of E. coli K-12 and S. enterica most likely reflect the varying structures in the outer core OSs (see Fig. 1A) which serve as acceptors for O-PS, but the structural requirements for a functional core OS acceptor have not been addressed in a systematic manner. Attempts to relate structure and function in WaaL homologs from E. coli K-12 and S. enterica are hampered by differences in both backbone glycan sequence and side chain substituents in their respective core OSs. The E. coli R2 core OS has a backbone identical to E. coli K-12 but contains a terminal α1,2-GlcpNAc side branch, as is found in S. enterica (Fig. 1A). Analysis of WaaL activity in this strain therefore allows distinction between structural requirements for ligation imposed by features of the core OS backbone and terminal side branch substitutions. The waaK gene of S. enterica has been implicated in the addition of the α1,2-linked GlcNAc residue (6). Available evidence suggests that this terminal core OS side branch is important for O-PS ligation activity (7), but the data are limited by the lack of precisely defined mutations and individually cloned genes for functional complementation experiments.
To resolve these ambiguities, the waaK and waaL genes were characterized in E. coli F632, a prototype strain with an E. coli R2 core OS. Structural and biochemical analyses of defined insertions in the E. coli R2 chromosomal genes, together with complementation experiments using single open reading frames and E. coli R2, K-12, and S. enterica core OS acceptors, were used to precisely define the effects of the terminal GlcNAc side branch of the E. coli R2 core OS on ligation activity.

EXPERIMENTAL PROCEDURES
Bacterial Strains and Plasmids-The bacterial strains and plasmids used in this study are listed in Table I. The R2 prototype strain used in this study, F632, is an O-PS-deficient derivative of E. coli O100, and although it does not produce an O-PS, it does contain a complete core OS (this study and Refs. 8 and 9).
DNA Methods-Restriction endonuclease digestion and ligation were performed essentially as described by Sambrook et al. (11). Restriction enzymes were purchased from either Life Technologies, Inc. (Burlington, Ontario), New England Biolabs (Mississauga, Ontario), or Boehringer Mannheim (Laval, Quebec). Plasmids were introduced into E. coli strains by using CaCl2-competent cells (11) or by electroporation using conditions described elsewhere (12) and a Gene Pulser from Bio-Rad (Mississauga, Ontario). Chromosomal DNA isolation was performed using the Qiagen genomic DNA isolation kit, and plasmid DNA was prepared using QIAprep plasmid spin columns (Qiagen Inc., Santa Clarita, CA). Where necessary, DNA fragments were isolated from agarose gels using the Geneclean kit from Bio/Can Scientific (Mississauga, Ontario).
FIG. 1. Structure of the outer core OS of E. coli K-12, E. coli R2, and S. enterica and organization of their core OS biosynthetic clusters. A, structure of the outer core OSs from the LPSs of E. coli K-12, E. coli R2, and S. enterica. Genetic determinants involved in their biosynthesis are also indicated. All sugars are in the pyranose configuration and the linkages are α, unless otherwise indicated. B, maps of the sequenced regions of the waa cluster from the chromosomes of S. enterica, E. coli K-12, and E. coli R2. Numbers indicate percent similarity and identity at both the amino acid and nucleotide levels for respective homologs. The waa* nomenclature is described elsewhere (http://www.angis.su.oz.au/BacPolGenes/BPGD.html, Ref. 43). Genes involved in the synthesis of the outer core OS are highlighted in white, and the waaL gene, which is involved in ligation of O-PS to lipid A-core, is highlighted in gray. The nucleotide sequence from waaC to waaA of the E. coli F632 chromosome has been deposited in the GenBank TM data base under accession number AF019375, and the nucleotide sequence from waaY to waaA of the S. enterica chromosome has been deposited in the GenBank TM data base under accession number AF026386.
PCR and Sequencing Techniques-Oligonucleotides were synthesized using a Perkin-Elmer 394 DNA synthesizer, and sequencing was performed using an ABI 377 DNA sequencing apparatus (Perkin-Elmer) at the Guelph Molecular Supercentre (University of Guelph). PCR was performed using a GeneAmp PCR System 2400 from Perkin-Elmer. The "expand high-fidelity enzyme mix" (Boehringer Mannheim) was used as the polymerase enzyme in PCR reactions where products were greater than 5 kb. For product sizes of less than 5 kb, PwoI DNA polymerase (Boehringer Mannheim) was used. PCR amplification of the 14-kb fragment flanked by the waaC and waaA genes was performed as follows: one initial cycle at 94°C for 1 min; 20 cycles at 94°C for 15 s and 68°C for 12 min; 16 auto cycles at 94°C for 15 s and 68°C for 12 min, with an auto-extension at 68°C for 15 s per cycle; a final cycle at 72°C for 10 min. The oligonucleotide primers were based upon similar regions of sequence between E. coli K-12 and S. enterica in the waaC and waaA genes and are as follows: (i) forward primer, 5′-ACGTTGCCCGCACTCACTGA-3′ and (ii) complementary reverse primer, 5′-TTCGGTGGCAGGTAAGGTTC-3′. PCR products were purified using the QIAquick PCR purification kit from Qiagen. To ensure error-free sequencing, the sequence of each of the DNA strands was determined from the product of separate PCR runs. In the rare instances where a mismatch in sequence between strands occurred, a small region surrounding the mismatch was reamplified and resequenced.
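As a quick sanity check on the two primers listed above, the following Python sketch reports their length, GC content, and a rough melting temperature. The Wallace rule (2°C per A/T, 4°C per G/C) is an assumption chosen only for illustration; it is not stated to be the method used in this study, and it is a crude estimate for 20-mers. All function names are hypothetical.

```python
# Primer sequences as given in the text, written 5' to 3'.
FORWARD = "ACGTTGCCCGCACTCACTGA"
REVERSE = "TTCGGTGGCAGGTAAGGTTC"

# Translation table for Watson-Crick complements.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Rough melting temperature in degrees C (Wallace rule, an assumption here)."""
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def reverse_complement(seq: str) -> str:
    """Sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    for name, primer in (("forward", FORWARD), ("reverse", REVERSE)):
        print(name, len(primer), f"{gc_fraction(primer):.2f}", wallace_tm(primer))
```

Both primers are 20-mers with similar GC content, which is consistent with their use as a matched pair at a single 68°C annealing/extension temperature.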
In Vitro Mutagenesis and Gene Replacement-The E. coli F632 waaK gene was mutated in vitro by insertion of a gentamicin-resistance cassette (the aacC1 gene from Tn1696). The cassette was isolated on an 835-bp SacI fragment from plasmid pUCGM, blunt-ended with T4 DNA polymerase, and inserted into the unique EcoRV site in the waaK coding region of plasmid pWQ900 (Fig. 2). The waaK::aacC1 gene was then recovered on a 1.8-kb SacI fragment which was inserted into the SmaI site of the suicide delivery vector pCVD442 (13). Plasmid pCVD442 carrying the waaK::aacC1 gene was maintained in the mobilizing strain SM10λpir and transferred to E. coli F632 by conjugation. E. coli F632 waaK::aacC1 was obtained by sucrose selection in the absence of NaCl at 37°C. Resulting colonies were tested for gentamicin resistance and ampicillin sensitivity. The presence of the waaK::aacC1 mutation was confirmed by Southern hybridization and PCR, followed by sequencing the junction sites of waaK-aacC1 on the amplified fragment. The E. coli F632 waaL gene was mutated in vitro by replacement of an internal 1.2-kb HpaI-MfeI fragment from the waaL coding region of pWQ900 (Fig. 2) with the gentamicin resistance cassette present on a SmaI fragment from plasmid pUCGM. This essentially removes the complete waaL coding region. The waaL::aacC1 gene was recovered on a 2.5-kb BstEII-BstXI fragment which was blunt-ended with Klenow enzyme and T4 DNA polymerase and inserted into the unique EcoRV site of the suicide delivery vector pMAK705 (14). E. coli F632 was transformed with pMAK705 carrying the waaL::aacC1 gene, and chromosomal gene replacement was carried out by a procedure described elsewhere (15). The presence of the waaL::aacC1 mutation in E. coli CWG302 was confirmed by sequencing the junctions of waaL-aacC1 in an amplified PCR fragment.
Computer Analysis-Sequence data were edited and analyzed using AssemblyLIGN and MacVector software (International Biotechniques Inc., New Haven, CT). Hydrophilicity plots of predicted amino acid sequences were performed using the MacVector software package and the method of Kyte-Doolittle, with a hydrophilicity window of 7 and an amphiphilicity window of 11. Homology searches of nucleotide and amino acid sequences in the National Center for Biotechnology Information data bases were done with the BLAST (basic local alignment search tool) server analysis program (16). Pairwise nucleotide sequence alignments and percentage identity scores were obtained using the NALIGN program of the PC/GENE software package (IntelliGenetics Inc, Mountain View, CA) with an open gap cost of 25 and a unit gap cost of 5. Pairwise protein alignments and percentage identity and similarity scores were obtained using the PALIGN program of PC/GENE with an open gap cost of 5 and a unit gap cost of 5. Multiple sequence alignments were performed using CLUSTALX (version 1.62b). Protein secondary structure was predicted using the GARNIER and GGBSM programs present in the PC/GENE software package and by hydrophobic cluster analysis (HCA) using the HCA plot program (Doriane Informatique, Le Chesnay, France).
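The sliding-window averaging behind a Kyte-Doolittle plot can be sketched in a few lines of Python. This is an illustration of the general method with the 7-residue window mentioned above, not a reproduction of the MacVector implementation; the function name is hypothetical. Note that the Kyte-Doolittle scale is a hydropathy scale (positive values are hydrophobic), so a hydrophilicity plot corresponds to the same profile with the sign convention reversed. The example peptide is the nine-residue WaaG segment reported later in the paper.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq: str, window: int = 7) -> list[float]:
    """Mean Kyte-Doolittle score for every full window along the sequence."""
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in seq.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


if __name__ == "__main__":
    # Short WaaG segment used purely as example input.
    for position, value in enumerate(hydropathy_profile("EAAGIVLLE"), start=1):
        print(position, round(value, 2))
```

Runs of consecutive windows with strongly positive means are the usual indicator of candidate membrane-spanning segments, which is how profiles of this kind are typically read for proteins such as WaaL.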
Lipopolysaccharide Analysis by SDS-PAGE-Small scale LPS preparations were made from SDS-proteinase K whole cell lysates by the method of Hitchcock and Brown (17). Large scale preparations used the hot phenol/water extraction as described elsewhere (18). LPS was separated on 10-20% gradient SDS-Tricine polyacrylamide gels that were obtained from Novex (San Diego, CA). Polyacrylamide gel electrophoresis (PAGE) conditions were those recommended by the manufacturer. Silver staining (19) and Western immunoblotting procedures have been described (15), as has production of polyclonal rabbit anti-D-galactan I serum (20). Throughout this study, LPS from an equivalent number of cells was loaded in each gel lane.
Generation of Core Oligosaccharides-Water-insoluble LPSs were obtained by hot water/phenol extraction of E. coli F632 and CWG300 cells (18) and treated with 1% acetic acid at 100°C to cleave the acid-labile ketosidic linkage between the core and lipid A. The water-insoluble lipid A was isolated from the hydrolysate as a pellet by centrifugation (5000 × g, 5°C). The supernatant containing core OS was purified through passage on a column of Bio-Gel P-2 (1 m × 1 cm) with water as eluent. The lipid A-free core OS eluted after the void volume and was detected by the phenol/sulfuric acid assay (21).
Sugar Composition and Methylation Linkage Analyses-Sugar composition analysis was performed by the alditol acetate method (22). Hydrolysis of glycosidic bonds was achieved by using 4 M trifluoroacetic acid at 100°C for 4 h. The samples were then reduced in H2O with NaBD4 and acetylated with acetic anhydride using residual sodium acetate as the catalyst. Characterization of the alditol acetate derivatives was performed by gas-liquid chromatography-mass spectrometry using a Hewlett-Packard chromatograph equipped with a 30-m DB-17 capillary column (210°C (30 min) to 240°C at 2°C/min). Mass spectrometry in the electron impact mode was recorded using a Varian Saturn II mass spectrometer. Enantiomeric configurations of the individual sugars were determined by the formation of the respective 2-(S)- and 2-(R)-butyl glycosides (23). Methylation linkage analysis was carried out by the Ciucanu and Kerek (NaOH/Me2SO-methyl iodide) procedure (24). The permethylated alditol acetate derivatives were fully characterized by gas-liquid chromatography-mass spectrometry in the electron impact mode using a column of DB-17 operated isothermally at 190°C for 60 min.
Fast Atom Bombardment-Mass Spectrometry-A fraction (25%) of the methylated sample was used for positive ion fast atom bombardment-mass spectrometry. This was performed by using a Jeol JMS-AX505H mass spectrometer with glycerol/thioglycerol as the matrix and a tip voltage of 3 kV.
Nuclear Magnetic Resonance (NMR) Spectroscopy-1H and 13C NMR spectra of the core OSs were recorded on a Bruker AMX 500 spectrometer at 300 K using standard Bruker software. Prior to performing the NMR experiments, the samples were lyophilized three times with D2O (99.9%). The internal references for 1H and 13C NMR were the HOD peak (δH 4.786) and acetone (δC 31.4), respectively.
Glycosyltransferase Assays-Incorporation of radiolabel from UDP-[14C]GlcNAc into LPS was used as a measurement of glycosyltransferase activity. Membranes were prepared as described previously (25) from 500 ml of log phase E. coli CWG300 (waaK::aacC1) cells, to provide the acceptor LPS. Membrane-free soluble enzyme extracts of each strain were prepared by collecting ultracentrifugation supernatants from cell-free lysates. Each reaction contained an equal amount of acceptor (20 μg of membrane protein) and an aliquot of soluble enzyme extract (15 μg protein) in a final volume of 0.1 ml. The buffer comprised 50 mM Tris-HCl, 10 mM MgCl2, and 1 mM dithiothreitol. The reaction was started by addition of 0.025 μCi of UDP-[14C]GlcNAc (specific activity 10.2 mCi/mmol; ICN), and the glycosyltransferase assays were performed at 37°C. To measure incorporation into LPS, aliquots of the reaction mixture were separated by descending paper chromatography. High molecular weight radiolabeled LPS was retained at the origin after descending paper chromatography using ethanol (95%) and 1 M ammonium acetate (7:3) (26). Unincorporated substrate migrates in this system, and the origins from the chromatogram were excised and counted in a scintillation counter.
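The amount of donor substrate in each assay follows from the radioactivity added and the specific activity. The short Python sketch below works through that unit conversion, taking the added radioactivity as 0.025 μCi (microcuries, the only physically plausible reading for a 0.1-ml reaction) and the specific activity as 10.2 mCi/mmol. The function names are hypothetical and the calculation is illustrative arithmetic, not a value reported by the authors.

```python
def substrate_nmol(activity_uci: float, specific_activity_mci_per_mmol: float) -> float:
    """Amount of labeled substrate in nmol.

    uCi / (mCi/mmol) = 1e-3 mmol = 1e3 nmol, hence the factor of 1000.
    """
    return activity_uci / specific_activity_mci_per_mmol * 1000.0


def concentration_um(amount_nmol: float, volume_ml: float) -> float:
    """Concentration in micromolar (nmol per ml is numerically equal to uM)."""
    return amount_nmol / volume_ml


if __name__ == "__main__":
    amount = substrate_nmol(0.025, 10.2)        # about 2.45 nmol
    conc = concentration_um(amount, 0.1)        # about 24.5 uM in 0.1 ml
    print(f"{amount:.2f} nmol UDP-GlcNAc, {conc:.1f} uM")
```

Under these assumptions each reaction contains roughly 2.5 nmol of UDP-GlcNAc, which puts the picomole-scale incorporation values reported later in context: only a small fraction of the donor is consumed.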

Genetic Determinants for Outer Core OS Biosynthesis in E. coli K-12, R2, and S. enterica-As a starting point for this study, the sequence of the core OS biosynthesis region of the chromosome in E. coli F632 (R2 core prototype) was determined. In E. coli K-12 and S. enterica, the waaC and waaA genes encode the heptosyltransferase I and the bifunctional Kdo transferase, respectively, for inner core biosynthesis (3). Similarities in the waaC and waaA genes (and gene products) between E. coli K-12 and S. enterica and preliminary Southern hybridization experiments (data not shown) suggested that these genes would likely be conserved in other E. coli core types. This predicted conservation was used to design PCR primers to amplify the region containing the outer core OS biosynthesis genes from E. coli F632. The complete nucleotide sequence of the resulting 14-kb PCR amplification fragment was determined, revealing a general organization typical of those seen in E. coli K-12 and S. enterica (Fig. 1B). The organization and function of the core OS biosynthesis regions in S. enterica and E. coli K-12 have been reviewed previously (2). However, the sequence information for S. enterica was incomplete and, in some regions, contained errors, effectively limiting comparisons with the region from E. coli K-12. These problems were resolved by sequencing PCR amplification products spanning gaps in S. enterica sequences and resequencing regions where some ambiguities remained. The structures of the completed regions from the three bacteria are shown in Fig. 1B, as are the sequence relationships between the genes and their predicted gene products.
Although the functions of some outer core OS biosynthesis enzymes have been established in biochemical analyses, others are inferred from structures of mutant core OSs resulting from defects in various genes (for reviews see Refs. 2 and 3). The relationships among predicted polypeptides representing the core OS backbone glycosyltransferases in S. enterica, E. coli K-12, and E. coli R2 are consistent with the structures of their respective core OSs.
All three core types have a Glcp-α(1→3)-Hepp linkage defining the junction between the inner and outer core OS. The waaG gene encodes a UDP-glucose:(heptosyl) lipopolysaccharide α1,3-glucosyltransferase (GlcI transferase) in S. enterica (27), and LPS chemical structure in a waaG mutant is consistent with a similar activity in E. coli K-12 (28). The S. enterica waaG mutant is complemented by the cloned genes from both S. enterica (29) and E. coli K-12 (30). The WaaG predicted proteins of S. enterica and E. coli K-12 and F632 share 85.8-96.0% identity (Fig. 1B). The three known WaaG proteins all contain a motif characteristic of one family of α-glycosyltransferases (31). The motif comprises two invariant glutamic acid residues in the signature sequence E(X7)E, located in a region of similar secondary structure as defined by hydrophobic cluster analysis (HCA). This sequence in all three WaaG proteins is E281AAGIVLLE289. These data, together with similar genetic organization, are consistent with the assignment of the waaG gene in E. coli F632.
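A signature of the form E(X7)E, two glutamates separated by any seven residues, maps directly onto a regular expression. The Python sketch below is a minimal illustration; the function name and the use of a lookahead to report overlapping hits are choices made here, not part of the published analysis. It is applied to the nine-residue segments quoted in this paper for WaaG, WaaB, and WaaK, with the reported position of the first glutamate supplied as an offset.

```python
import re

# Lookahead so that overlapping E(X7)E occurrences are all reported.
E_X7_E = re.compile(r"(?=(E[A-Z]{7}E))")


def find_e_x7_e(seq: str, first_residue: int = 1) -> list[tuple[int, int, str]]:
    """Return (start, end, segment) for each E(X7)E hit.

    Positions are numbered so that seq[0] is residue `first_residue`.
    """
    hits = []
    for match in E_X7_E.finditer(seq.upper()):
        start = match.start() + first_residue
        hits.append((start, start + 8, match.group(1)))
    return hits


if __name__ == "__main__":
    # Segments and first-residue numbers as quoted in the text.
    segments = {
        "WaaG": ("EAAGIVLLE", 281),
        "WaaB": ("EGFPMTLLE", 266),
        "WaaK": ("EAFCMVAVE", 288),
    }
    for protein, (segment, first) in segments.items():
        print(protein, find_e_x7_e(segment, first))
```

On a full-length protein sequence a pattern this permissive will usually match in several places, so in practice the hit would need to be confirmed by its sequence context and predicted secondary structure, as was done here with HCA.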
All three core types have an α1,6-linked Galp side branch on the GlcI residue. Structural determination of mutant LPSs, enzyme assays, and genetic complementation experiments all identify waaB as the structural gene for the UDP-galactose:(glucosyl) LPS α1,6-galactosyltransferase in S. enterica (29,32). The transferase involved, WaaB, is highly conserved. The E. coli F632 WaaB protein shares 63.2% identity with the S. enterica homolog but is much more closely related to the E. coli K-12 protein (92.5% identity). The E. coli K-12 and S. enterica WaaB proteins are functionally equivalent (33). Analysis of the WaaB sequence identifies the E(X7)E signature sequence (specifically E266GFPMTLLE274 for all three WaaB proteins) and similar HCA profiles to WaaG and several other prokaryotic α-glycosyltransferases (31).
Distal to the GlcI residue, the three core OS backbones diverge. Structural analysis and genetic data indicate that in S. enterica the products of the waaI and waaJ genes (formerly rfaI and rfaJ) are involved in the addition of the GalI and GlcII residues, respectively (2). The corresponding enzymes in E. coli K-12, WaaO and WaaR, have different substrate and acceptor specificities (Fig. 1A), and the UDP-glucose:(glucosyl) lipopolysaccharide α1,3-glucosyltransferase (GlcII transferase, now WaaO) has been studied in detail (30). In previous reviews, the E. coli K-12 GlcII and GlcIII transferases were also named WaaI and WaaJ, respectively, following nomenclature established for S. enterica. Since the E. coli K-12 enzymes differ in both substrate and acceptor specificities, we have given these enzymes unique designations. The E. coli K-12 and E. coli R2 backbones are identical, and the GlcII and GlcIII transferase homologs (WaaO and WaaR, respectively) share greater than 88% identity in these bacteria. The amount of similarity between WaaI and WaaO or WaaJ and WaaR is lower, as might be expected given their different specificities, but the values are still significant (>55% identity). Previously available sequence data for the S. enterica waaI and waaJ genes (34) identified smaller open reading frames than the waaO and waaR genes (33). On resequencing in the current study, the predicted WaaI and WaaJ proteins are found to be of comparable size to their WaaO and WaaR counterparts. The WaaO, -R, -I, and -J transferases all lack the α-glycosyltransferase motif typical of WaaB and WaaG. However, BLASTP searches of the data bases identify consensus features in these proteins which are found in a variety of other prokaryotic α-glycosyltransferases and thus define a new family of α-glycosyltransferases (Fig. 3 and Table III) which is discussed below.
A New Family of α-Glycosyltransferases: the WaaIJ Family-WaaI, and to a lesser extent WaaJ, of S. enterica have been established as the HexII and HexIII glycosyltransferases, respectively, involved in assembly of the outer core OS portion of the LPS molecule. The HexII and HexIII glycosyltransferases of E. coli K-12, WaaO and WaaR, have been identified, although direct biochemical evidence conclusively identifying their functions is currently limited. These proteins share four highly conserved regions of primary sequence (labeled I, II, III, and IV) which are highlighted in Fig. 3. BLASTP searches of the data bases using the WaaI, -J, -O, or -R proteins identify a number of other known or putative α-glycosyltransferases in a variety of other prokaryotes and one protein from a eukaryote (Table III). Secondary structure predictions using a number of programs within the PC/GENE software package as well as HCA analysis (35) predict the following: (i) sequence I is present in an α-helix; (ii) sequence II lies within a β-strand; (iii) sequence III is present in undetermined secondary structure; and (iv) sequence IV is part of random coil structure.
Characterization of the E. coli R2 waaK Gene-The E. coli R2 and S. enterica core OSs both have a side branch α1,2-linked GlcpNAc substitution on the terminal Glcp residue (Fig. 1A). A Hepp residue occupies the same position in E. coli K-12 which would explain the absence of a waaK homolog in its core OS biosynthesis gene cluster. S. enterica mutations mapping to waaK lack GlcNAc in their outer core OS (36), and the waaK gene has been identified in this organism (7). The predicted products of the WaaK homologs from E. coli R2 and S. enterica share 75.3% identity (83.2% total similarity) (Fig. 1B). The predicted molecular mass of the E. coli R2 WaaK protein is 42.8 kDa based on sequence analysis, and the cloned R2 waaK coding region (pWQ901) directs synthesis of a protein with a calculated molecular mass of 43 kDa in Coomassie-stained SDS-polyacrylamide gels (data not shown). The WaaK proteins from E. coli F632 and S. enterica have the E(X7)E α-glycosyltransferase motif also found in WaaG and WaaB. The motif identified in E. coli R2 WaaK is E288AFCMVAVE296, identical to that of S. enterica WaaK. Sequence data and structural similarities in the core OSs are consistent with WaaK homologs encoding α1,2-linked GlcpNAc transferases. Unambiguous assignment was achieved by structural and biochemical analyses of a precisely defined waaK mutant in the R2 core OS prototype strain, E. coli F632 (see below).
The E. coli R2 waaK Gene Product Encodes a UDP-N-Acetylglucosamine:(Glucosyl) LPS α1,2-N-Acetylglucosaminyltransferase for Outer Core OS Assembly-Insertional inactivation of the R2 waaK coding region in E. coli F632 gave strain CWG300 (waaK::aacC1). In SDS/Tricine-PAGE, the LPS lipid A-core band of CWG300 migrates slightly faster than that of the wild-type parent (Fig. 4), reflecting a core truncation. Introduction of plasmid pWQ901 (which carries the R2 waaK gene) into CWG300 yields LPS with the same migration as the wild-type R2 strain (F632), confirming that the defect results only from the expected single mutation.
Structural analysis was used to confirm the nature of the LPS defect in E. coli CWG300. The R2 core OS of E. coli F632 obtained after mild-acid hydrolysis contained D-glucose, D-galactose, and N-acetyl-D-glucosamine in an approximate molar ratio of 3:1:1. Also evident in the hydrolysate were L-glycero-D-manno-heptose and the 1,6-anhydro-LD-heptose derivative that forms under the hydrolytic conditions used (Fig. 5). These components are those expected for the published structure of the E. coli R2 core OS (Fig. 1A), and the complete outer core OS structure was confirmed by chemical analyses and 1H and 13C one-dimensional NMR (data not shown). The acetamido group of the GlcNAc residue gave the predicted resonances in 1H and 13C NMR spectra at 2.02 and ~23 ppm, respectively. The C=O bond of the N-acetyl group was characterized by the signal at 174 ppm in the 13 4 ] + (data not shown). In the R2 core, the substitution of the α-GlcIII residue at O-2 by GlcpNAc is incomplete. This is particularly evident in the results of linkage data obtained from gas-liquid chromatography-mass spectrometry of permethylated alditol acetate derivatives (Table II). The methylation experiments showed both terminal Gal and GlcNAc moieties as predicted from the complete structure, as well as terminal Glc residues. From consideration of the molar ratios of the terminal sugars and the amount of the →2)-Glc-(1→ moiety, it appears that the GlcpNAc side branch is present in 80% of the core OS molecules. Compositional data for the core OS from E. coli CWG300 indicated the presence of D-glucose and D-galactose in an approximate molar ratio of 3:1 (Fig. 5). No GlcNAc was detected in the glycose composition analysis, and the characteristic resonances of the methyl group of GlcNAc were absent in the NMR spectra.
Linkage analysis by methylation identified terminal Gal and Glc residues and an interior →2)-Glc-(1→ moiety (molar ratio 1:1:1; Table II) in CWG300 core OS, reflecting a truncated R2 core OS devoid of the terminal GlcpNAc substituent.
The results obtained from structural analyses of the core OS in strain CWG300 (waaK::aacC1) therefore illustrate the requirement for WaaK in the addition of the terminal GlcpNAc substituent on GlcIII. To demonstrate directly the appropriate glycosyltransferase activity in the WaaK protein, soluble enzyme fractions from E. coli strains F632, CWG300, and CWG300 (pWQ901) were tested for their ability to transfer [14C]GlcNAc from UDP-[14C]GlcNAc into acceptor LPS provided by membranes from strain CWG300. It is difficult to assess the absolute values for incorporation in these assays since it is not possible to know how much functional acceptor LPS is available to the glycosyltransferase in the membrane fraction. In fact, the observation of similar activities in the wild-type extract and that containing overexpressed waaK suggests that acceptor may well be a limiting factor. However, activities from F632 (2.95 pmol/μg soluble extract protein/h) and CWG300(pWQ901) (2.96 pmol/μg/h) were over 5-fold higher than the activity observed from CWG300 (0.56 pmol/μg/h). Control experiments indicated that the low level of background activity in the control extract (CWG300) was attributable to the membrane fraction itself and was not dependent on added soluble extract. This may reflect use of UDP-GlcNAc to assemble other cell wall components.
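The "over 5-fold" comparison can be checked directly from the three activities quoted above. The Python sketch below simply divides each value by the CWG300 background; because the units cancel in the ratio, the result does not depend on how the activity unit is read. The dictionary keys and function name are labels chosen here for illustration.

```python
# Activities as quoted in the text (same units for all three extracts).
ACTIVITIES = {
    "F632": 2.95,
    "CWG300(pWQ901)": 2.96,
    "CWG300": 0.56,
}


def fold_over_control(activities: dict[str, float], control: str) -> dict[str, float]:
    """Ratio of each activity to the control activity (control itself = 1.0)."""
    baseline = activities[control]
    return {strain: value / baseline for strain, value in activities.items()}


if __name__ == "__main__":
    for strain, fold in fold_over_control(ACTIVITIES, "CWG300").items():
        print(f"{strain}: {fold:.2f}-fold")
```

Both WaaK-containing extracts come out at roughly 5.3 times the background, in line with the statement in the text.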
Taken together, the structural and biochemical data show that the waaK gene from E. coli F632 encodes the UDP-N-acetylglucosamine:(glucosyl) LPS α1,2-N-acetylglucosaminyltransferase involved in addition of the terminal side branch in the outer core OS.
The Role of WaaK in Ligation of O-PS-An S. enterica mutant with the waaK953 allele (SL733) does not produce any detectable S-LPS in silver-stained SDS-PAGE (7). The same result is shown in Fig. 6A, lane 2. Introduction of a plasmid containing the complete waaK coding region from S. enterica was able to complement the waaK953 phenotype by restoring synthesis of O-PS in this strain (7). As might be expected given the similarities in core OS structures and WaaK homolog sequences, the R2 waaK gene carried on pWQ901 could functionally replace the waaK gene of S. enterica SL733, leading to restoration of S-LPS formation (Fig. 6A, lane 3).
These results suggest that the WaaK-added α1,2-linked GlcpNAc residue may be required in a functional LPS acceptor for ligated O-PS. However, previous structural analysis of the LPS from the prototype S. enterica waaK953 mutant indicates that the mutant is still able to ligate a trace amount of O-PS to lipid A-core (36). This result can be interpreted in one of two ways: (i) WaaK activity is essential for ligation, but the waaK953 mutation is leaky and retains some enzymatic activity; or (ii) WaaK activity is important for ligation, but its absence only alters the efficiency of the process. These alternatives could be addressed using the waaK::aacC1 null mutation in E. coli CWG300. Because E. coli F632 produces an R-LPS, a test O-PS was introduced into this strain to study the contribution of core structure to "capping" with O-PS. Plasmid pWQ3 contains all genes necessary for the production of the O-PS (D-galactan I) of Klebsiella pneumoniae O1 (20). As shown in Fig. 6, B and C, lane 1, the R2 core of E. coli F632 serves as an efficient acceptor for D-galactan I. However, as with S. enterica, the GlcNAc-deficient core OS of CWG300 (waaK::aacC1) is incapable of acting as acceptor for O-PS (Fig. 6, B and C, lane 2). No S-LPS (reflecting ligated O-PS) was detected in SDS-PAGE of LPS samples, either by silver staining or by using the more sensitive Western immunoblotting approach. Introduction of plasmid pWQ901 into CWG300 restores the wild-type ligation-proficient phenotype (Fig. 6, B and C, lane 3). From these data, the α-1,2-linked GlcpNAc appears to be essential for ligation.
Characterization of the E. coli F632 WaaL Gene Product-The waaL homolog in E. coli F632 was initially identified by its occupation of a similar position within the core OS biosynthesis cluster as those in E. coli K-12 and S. enterica. The predicted R2 WaaL protein is, however, much more similar to the S. enterica WaaL protein (81.1% total similarity) than to the E. coli K-12 WaaL protein (33.6% total similarity). Hydrophilicity plots of the three WaaL homologs show significant similarity in their predicted structures, and those for S. enterica and E. coli R2 are virtually identical (Fig. 7). Computer analysis predicts that all three WaaL proteins contain at least eight membrane-spanning domains. The distribution and the sizes of the transmembrane segments and surface-exposed loops are similar. The predicted E. coli K-12 WaaL protein is slightly larger (419 amino acids, 46,874 Da) than the WaaL homologs of either E. coli R2 (405 amino acids, 46,048 Da) or S. enterica (404 amino acids, 46,031 Da).
Inactivation of the waaL gene in both E. coli K-12 and S. enterica results in full-length core OS that is not "capped" by O-PS (7,37). Similarly, inactivation of the E. coli F632 waaL gene in strain CWG302 results in the inability of the organism to ligate D-galactan I O-PS to lipid A-core (Fig. 8A, lane 2). Introduction of plasmid pWQ902 into CWG302 restores the ability to ligate D-galactan I to lipid A-core (Fig. 8A, lane 3). Whereas plasmid pWQ902 readily complements the waaL phenotype in S. enterica (Fig. 8B, lane 3), it appears not to be able to fully complement the defect in the E. coli K-12 waaL mutant strain CS2334 (Fig. 8C, lane 3). The amount of D-galactan I that is ligated to lipid A-core in CS2334(pWQ902) is significantly less than that seen in the parent K-12 strain, AB1133, and is only clearly evident when Western immunoblotting is used to detect the S-LPS product (Fig. 8C, compare lanes 1 and 3).
DISCUSSION
Variations in outer core OS structures currently determine five different core types in the LPSs of E. coli and one in S. enterica. The data presented here establish that the E. coli R2 core OS biosynthesis gene cluster is a hybrid of those of E. coli K-12 and S. enterica. The predicted glycosyltransferases (WaaG, -O, and -R) for assembly of the outer core OS backbone are highly similar in E. coli K-12 and R2. In contrast, the products of the waaK and waaL genes of R2, which are involved in the completion of the core OS and ligation of O-PS, are highly conserved with homologs in S. enterica.
Relatively little is known about the mechanism of action of glycosyltransferases, and models lean heavily on the more extensive literature for glycosylhydrolases. As more sequences become available, it is apparent that there are several families of α- and β-glycosyltransferases. WaaI and -J provide the prototype for a new family of α-glycosyltransferases. Members of the family identified in prokaryotes contain four conserved regions of primary sequence (Fig. 3) located in regions of common secondary structure. Interestingly, one eukaryotic member of the family (protein T10M13.14 of Arabidopsis thaliana) lacks the sequence III motif. It is striking that where substrates for these proteins are known, the WaaIJ family proteins use UDP-hexose (Galp or Glcp), and many are involved with the core region of an LPS or lipo-oligosaccharide molecule (Table III). One, WbbM, is involved in O-PS synthesis, and only one (GspA) has been identified from Gram-positive bacteria. It is not possible to assign catalytic and/or binding residues without more extensive biochemical analysis, but the identification of conserved residues in this family provides the foundation on which such strategies will be based.
There are some potential open reading frames in E. coli K-12 whose functions remain obscure (reviewed in Ref. 2). Based primarily on SDS-PAGE data, the waaS gene has been proposed to play a role in an alternate form of LPS which is separate from those molecules which will become an acceptor for O-PS (2). The "waaS" regions in E. coli R2 and K-12 are relatively poorly conserved, and only remnants remain in S. enterica. Analysis of the structure of the predicted R2 WaaS protein indicates that it is an additional member of the WaaIJ family of α-glycosyltransferases (Fig. 3). These motifs are absent in the K-12 WaaS protein. Examination of the available structural data for the R2 and K-12 core OSs indicates they have non-stoichiometric substitutions of the KdoII residue with either Galp or Rhap residues, respectively (38). Structural features of the R2 WaaS protein are consistent with a transferase that uses UDP-Galp as a donor. The absence of a similar modification in S. enterica core OS is consistent with the absence of a complete open reading frame equivalent to waaS. Ultimately, the waaS genes may require unique designations, but this should await experimental determination of their precise roles in core OS assembly. Two additional open reading frames, waaZ and waaY, are conserved in all three core types. Their role in core assembly is presently unknown although the waaZ gene may also play a role in the production of an alternate form of LPS (2). The role of WaaQ is also unknown although it may function as a HeppIII transferase for inner core biosynthesis, based on limited resemblance to the sequences of other heptosyltransferases (3). Mutations in waaP influence inner core phosphorylation, but its precise role is not known (3). The waaQ and waaP genes (and gene products) are highly conserved in all three core types. The identification of conserved sequences for these largely uncharacterized genes in E. coli R2 does not shed light on the function of their gene products, and since they are not considered to influence outer core OS carbohydrate structure, their roles are not addressed here. However, structural and genetic information is now available to systematically address these additional questions in core OS assembly.
In the core OS of S. enterica, addition of the terminal α1,2-linked GlcpNAc side branch requires WaaK (7,36,39). Membranes of S. enterica are known to incorporate GlcpNAc from UDP-GlcpNAc (40), but the data directly linking WaaK to the glycosyltransferase activity have been circumstantial. Here, we show that the E. coli R2 WaaK homolog is the UDP-N-acetylglucosamine:(glucosyl) LPS α1,2-N-acetylglucosaminyltransferase for outer core OS assembly. Structural analysis of the S. enterica core OS (36) indicates this substitution is stoichiometric. However, in E. coli R2 (this study and Ref. 8), this terminal GlcNAc substitution is non-stoichiometric. Possible reasons for this minor difference remain unclear. The literature for the corresponding HexIII substitution in E. coli K-12 indicates the presence of terminal α-1,6-heptose, referred to as HepIV (38). As predicted from the structures, the K-12 core OS biosynthesis gene cluster does not contain a homolog of waaK. In E. coli K-12, the gene which we have renamed waaU (originally this gene was also referred to as rfaK) occupies the same location as waaK. It has previously been suggested that WaaU might still be a GlcpNAc transferase but one involved in transfer of GlcpNAc to an undefined location in the inner core OS (2,37). Interestingly, WaaU contains two α-glycosyltransferase motifs, E²²⁸QIKVIYQE²³⁵ and E²⁶¹IETLPFDE²⁶⁹, resembling the two closely occurring E(X₇)E motifs found in the C-terminal third of WaaC and WaaF proteins of E. coli K-12, R2, and S. enterica. This is in contrast to the hexosyltransferases such as WaaG, -B, and -K, which contain only a single copy of the motif. Furthermore, BLASTP searches identify regions of local similarity shared by WaaU and a variety of known and predicted heptosyltransferases, including WaaC and WaaF proteins. Thus, whereas the identity of the glycosyltransferase for the terminal α1,6-linked Hepp (HepIV) side branch in the E. coli K-12 core is equivocal, waaU is the most likely candidate.
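An E(X7)E-type motif (a glutamate, any seven residues, then a second glutamate) can be located in a protein sequence with a simple pattern search. The following is a minimal sketch, not the method used in this study: the function name find_ex7e is hypothetical, and the only sequence data used are the two short WaaU fragments quoted in the text, so the reported positions are relative to each fragment rather than to the full-length protein.

```python
import re

# E(X7)E: glutamate, any seven residues, glutamate.
# The zero-width lookahead allows overlapping occurrences to be reported.
EX7E = re.compile(r"(?=(E.{7}E))")


def find_ex7e(sequence, first_position=1):
    """Return (start_position, matched_segment) for every E(X7)E occurrence.

    Positions are numbered from first_position (1-based by default).
    """
    return [
        (match.start() + first_position, match.group(1))
        for match in EX7E.finditer(sequence.upper())
    ]


# The two WaaU segments quoted in the text (fragments only).
for fragment in ("EQIKVIYQE", "EIETLPFDE"):
    print(fragment, find_ex7e(fragment))
```

Applied to a full-length sequence, the same function would list every candidate motif; deciding whether a hit is functionally meaningful would still depend on the sequence context and the biochemical analysis discussed above.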
Currently, little is known of the mechanism by which O-PSs are ligated to the lipid A-core molecule. Insights into the ligation reaction could lead to novel therapeutic agents that prevent the attachment of O-PS and lead to a higher degree of complement-mediated killing by the host. The ligase enzyme is envisioned as a glycosyltransferase with a complex (lipid-linked oligosaccharide) substrate requirement. Motifs found in currently known glycosyltransferases are absent in the ligase protein, as might be expected since the ligase substrate is not a nucleotide diphospho sugar. There is little conservation in the primary sequences of the ligases of E. coli R2/S. enterica and E. coli K-12, although their secondary structures appear to be a conserved feature. Ligases from E. coli K-12, R2 (see above), and S. enterica (data not shown) all interact with and efficiently ligate the reporter O-PS, D-galactan I, to their respective core OS molecules. Furthermore, the ability of the K-12 ligase protein to interact with a variety of different polysaccharide structures is interesting and suggests a relaxed specificity for the ligated structure. To the extent that biosynthetic data are available, all of the polysaccharides currently known to be ligated to lipid A-core by WaaL are assembled on an undecaprenyl pyrophosphoryl lipid intermediate (reviewed in Refs. 1 and 41). The precise details of the trans-cytoplasmic assembly pathways can vary considerably, and the E. coli K-12 ligase efficiently ligates O-PS products from the three currently known pathways (1). The undecaprenyl pyrophosphoryl carrier may provide the conserved feature in the ligated substrate for ligase function.
We are interested in the structural requirements for the acceptor in the O-PS ligation reaction. Core OS structure has a profound effect on ligase specificity. Prior to this work, the only known ligases were those from S. enterica and E. coli K-12. Comparative analysis of these ligases is complicated by core OS structures that differ in both backbone sequence and terminal side branch substituents. Also, a full collection of precise mutations in core OS assembly as well as individually cloned and expressed genes have not been available to address directly the structural requirements. These limitations prompted the analysis reported here.
Previous cross-complementation data indicate that the waaL gene from E. coli K-12 cannot complement a ligase-defective mutant of S. enterica, suggesting structural specificity in terms of the core OS acceptor (37). Studies involving a prototype S. enterica waaK mutant (SL733) indicate that absence of the terminal α1,2-linked GlcNAc abrogates attachment of most O-PS; only trace amounts remained (36). As reported here, and in previous work by others (7), there is no evidence of residual S-LPS in the currently available isolate of S. enterica SL733. The nature of the waaK953 allele is unknown, leading to questions concerning whether WaaK activity is essential for ligation or only important for ligation efficiency. Complementation of the ligation defect by waaK genes from the same organism (7) and from E. coli R2 (this work) indicates that the defects are confined to waaK in S. enterica SL733. In a defined waaK null mutant of E. coli F632 (strain CWG300), the ligation of a reporter O-PS to the R2 core OS is eliminated, and a terminal side group substitution is therefore essential for ligation activity in E. coli R2 and probably in S. enterica. Consistent with this conclusion, there is one report of the structure of an O-PS (serotype O104) attached to an R2 core OS, and the WaaK-directed GlcNAc side branch was present in stoichiometric amounts in the linkage region of the resulting S-LPS (42). The ability of the E. coli R2 waaK gene product to efficiently complement the S. enterica WaaK-mediated ligation defect rules out any role for core OS backbone structure in determination of acceptor specificity in the group of ligase enzymes examined here, since these core OSs differ at the HexII position (i.e. Gal in S. enterica and Glc in E. coli R2).
Interestingly, in previous work from others, some ligation activity was restored when a plasmid containing waaL and waaU from E. coli K-12 was introduced into an S. enterica waaK mutant (37). Inactivation of the waaL coding region of this plasmid eliminated its ability to complement the waaK phenotype of S. enterica. This suggests that the "complementation" of the waaK defect in fact represented replacement of enough S. enterica core OS terminus with that of E. coli K-12 to allow the K-12 ligase to functionally replace the S. enterica chromosomally encoded WaaL. Unfortunately, the structure of the resulting core was not determined. In light of the published lack of complementation of the S. enterica waaL mutant with the cloned E. coli K-12 waaL gene (37), the requirement for a precise side group appeared to be absolute. However, using a plasmid carrying only the R2 waaL structural gene, we were able to restore low levels of ligation activity in the E. coli K-12 waaL mutant (CS2334). The differences in the two studies could reflect differences in sensitivity of detection methods or levels of waaL expression; previous studies did not use a defined inducible promoter in complementation plasmids. Based on the data presented here, a terminal side group is critical for this group of ligases, although the precise nature of the residue clearly affects efficiency of ligation. The different efficiencies of the R2 WaaL protein in ligation to the K-12 and R2/S. enterica core OSs presumably reflect steric hindrance resulting from the replacement of the GlcpNAc side branch (present in the native R2 core OS) with a Hepp residue (in K-12 core OS).