Chromosomal and Plasmid-encoded Enzymes Are Required for Assembly of the R3-type Core Oligosaccharide in the Lipopolysaccharide of Escherichia coli O157:H7*

The type R3 core oligosaccharide predominates in the lipopolysaccharides from enterohemorrhagic Escherichia coli isolates including O157:H7. The R3 core biosynthesis (waa) genetic locus contains two genes, waaD and waaJ, that are predicted to encode glycosyltransferases involved in completion of the outer core. Through determination of the structures of the lipopolysaccharide core in precise mutants and biochemical analyses of enzyme activities, WaaJ was shown to be a UDP-glucose:(galactosyl)lipopolysaccharide α-1,2-glucosyltransferase, and WaaD was shown to be a UDP-glucose:(glucosyl)lipopolysaccharide α-1,2-glucosyltransferase. The residue added by WaaJ was identified as the ligation site for O polysaccharide, and this was confirmed by determination of the structure of the linkage region in serotype O157 lipopolysaccharide. The initial O157 repeat unit begins with an N-acetylgalactosamine residue in a β-anomeric configuration.

The core OS has limited variation with only five known core structures in E. coli (designated K-12, R1, R2, R3, and R4), but there are conserved structural themes in these core OSs (3).
The E. coli core OS is conceptually divided into two structural regions, the inner core region and the outer core region. The inner core consists of the sugars 3-deoxy-D-manno-oct-2-ulosonic acid (Kdo) and L-glycero-D-manno-heptose (Hep). Phosphorylethanolamine residues can be attached nonstoichiometrically to KdoII, and the HepI and HepII residues are decorated with phosphoryl (P) and/or PPEtN residues (Fig. 1A). Phosphorylated inner core OS plays a crucial role in outer membrane stability (4-6), and these essential functions may place structural constraints on the inner core OS, accounting for conservation of its base structure. Some additional type-specific modifications occur at KdoII or HepIII (3,7,8), and these include an α-1,7-linked GlcNAc (N-acetylglucosamine) residue on HepIII in the E. coli R3 inner core (Fig. 1A). The outer core consists of a tri-hexose (HexI-HexII-HexIII) backbone where the sequence of glucose or galactose residues at the HexII and HexIII positions varies according to core type. The HexI residue represents the first sugar and is always Glc (Fig. 1A, GlcI). The trisaccharide backbone is modified with side branch substitutions of hexose and acetamidohexose residues that vary according to core type. Variations in outer core structure give rise to altered sites for the attachment of O-PS in some core types, as well as altered antigenic epitopes and bacteriophage receptors (3,9). The five E. coli core types can be differentiated by polyclonal and monoclonal antibodies (10,11) and by PCR tests that are based on unique sequences in the waa gene clusters encoding enzymes involved in core OS biosynthesis (12). From such analyses it is evident that the R1 core type predominates in E. coli isolates most frequently associated with extraintestinal infections, whereas the R3 core type is detected in most enterohemorrhagic E. coli (EHEC) isolates, including O157:H7.
The core OS is thought to be assembled by a coordinated complex of membrane-associated glycosyltransferases that sequentially extend the core OS on a lipid A acceptor molecule, by transfer of glycosyl residues from nucleotide sugar precursors (1). These reactions occur at the cytoplasmic face of the plasma membrane. The majority of the enzymes required for core OS assembly are encoded by genes within the chromosomal waa locus. The E. coli and Salmonella waa loci show conserved organizational features, and all consist of three operons (3). The long central operon contains genes whose products are required for modification of the inner core OS (e.g. addition of phosphoryl residues and HepIII), as well as genes encoding the glycosyltransferases necessary for outer core OS biosynthesis. In the case of the R1 and R4 core OS types, this operon also contains the waaL gene whose product is required for ligation of O-PS to the lipid A core acceptor.
The R3 core type is of biomedical interest because it is found in verotoxigenic isolates belonging to the common enterohemorrhagic E. coli (EHEC) serogroups O157, O111, and O26 (12,13) and in Shigella flexneri serotype 2a (14), a strain most commonly associated with bacillary dysentery or shigellosis. The R3 core type is also found in E. coli J5, a mutant with rough LPS (i.e. devoid of O-PS) originally isolated from E. coli O111:B4 (15). The J5 LPS has been widely used in immunochemical studies, in analyses of endotoxicity, and in attempts to develop antibodies against core OS-lipid A that would be broadly cross-reactive and protective in applications such as treatment of sepsis (15-19). Antibodies cross-reacting with R3 core OS epitopes were detected in 6 of 10 patients infected with different EHEC strains (20), and it has been proposed that IgG and IgA responses to the R3 core OS may be protective against E. coli and Shigella spp. (21).
The structure of the R3 core OS has been determined (Fig. 1A) (22), and the sequences of the waa loci have been reported for an R3 prototype (3), for two isolates of E. coli O157 (23,24), and for S. flexneri 2a (14,25). The sequences of the waa genes from S. flexneri 2a and E. coli O157:H7 share >99% identity (14). The functions of some waa gene products have been deduced from high levels of sequence identity shared with known enzymes, but others could not be assigned based on sequence data alone. Also, unlike the other E. coli core types, the R3 waa locus itself does not contain enough open reading frames to account for the observed core OS structure (3). The objective of this work was to resolve ambiguities in the assignment of enzymatic activities to genes involved in the biosynthesis of the R3 core OS structure and to establish the molecular architecture of the linkage region between core OS and O-PS in the LPS of E. coli serotype O157:H7.

EXPERIMENTAL PROCEDURES
Bacterial Strains, Plasmids, and Media-The bacterial strains and plasmids used in this study are listed in Table I. The R3 prototype strain used in this study, F653, is an O antigen-deficient derivative of E. coli O111:K58 (26) and contains a complete core OS. Also used in this study was a verotoxigenic-negative strain (EC960264) of E. coli O157:H7 obtained from Health Canada. Bacterial strains were all grown in Luria-Bertani (LB) broth at 37°C. Antibiotics were used at the following concentrations: ampicillin (100 μg/ml), chloramphenicol (30 μg/ml), gentamicin (30 μg/ml), kanamycin (30 μg/ml), tetracycline (10 μg/ml), and streptomycin (100 μg/ml). For growth and induction of strains containing pBAD24 derivatives, L-arabinose was used at a final concentration of 0.02%.
General DNA Methods-Oligonucleotide primer synthesis and DNA sequencing were carried out at the Guelph Molecular Supercenter (University of Guelph) or Sigma Genosys. PCR amplifications were carried out in a PerkinElmer Life Sciences GeneAmp PCR System 2400 thermocycler or a PTC-200 thermal cycler (MJ Research), using conditions optimal for each primer pair. Chromosomal DNA template was prepared using the Bio-Rad Instagene matrix, and PwoI polymerase from Roche Diagnostics was used for PCR. PCR amplification products were separated by electrophoresis on 0.7% agarose gels and stained using ethidium bromide. Purification of PCR and plasmid DNA fragments was carried out using the Ultra-Clean DNA purification kit (MOBIO Laboratories). Qiagen plasmid spin kit or GenElute plasmid minipreps (Sigma) were used to purify recombinant plasmids. Large endogenous virulence plasmid DNA was prepared using the large plasmid QIAprep spin miniprep kit (Qiagen). Ligation reactions were carried out using methods described elsewhere (27), and restriction endonuclease enzymes were purchased from either New England Biolabs or Invitrogen and used according to the manufacturer's instructions. A Gene Pulser from Bio-Rad was used for electroporation of plasmids (28), and the plasmids were maintained in E. coli DH5α. For Southern blots, restriction digests of ~5 μg of plasmid DNA were separated on an agarose gel. The gel was treated with denaturing buffer, and the DNA was then transferred to a positively charged nylon membrane (Roche Diagnostics) by downward alkaline transfer for 3 h (29). The DNA was then cross-linked to the membrane using a Stratalinker (Stratagene). After low stringency prehybridization at 37°C for 3 h, the membrane was probed with a digoxigenin-dUTP-labeled DNA probe at 37°C overnight. The probe was made by PCR amplification of a 600-bp internal fragment of the wabB gene and digoxigenin-labeled according to the manufacturer's protocol (Roche Diagnostics). The blot was washed under low stringency conditions (2× SSC, 0.1% SDS) three times at 25°C, and the hybridized probe was detected colorimetrically by using an anti-digoxigenin antibody-alkaline phosphatase conjugate. The sequences of the waaQ operon from E. coli F653 (GenBank™ accession number AF019745) and O157:H7 (accession number AE005590) and the wabB gene from O157:H7 (deposited as "rfbU", accession number AAC70095) are available in GenBank™.
Chromosomal Insertion Mutations-The waaJ and waaD genes of the R3 core OS biosynthesis region in E. coli F653 were independently mutated by insertion of a nonpolar gentamicin resistance cassette, aacC1, from plasmid pUCGm (30), giving strains CWG351 (waaD::aacC1) and CWG350 (waaJ::aacC1). To construct the waaJ mutant, a PCR-amplified DNA fragment from F653 containing waaI-Y-J-D was cloned in pBluescript II SK(+) (Stratagene) using the enzymes XhoI and SacII and sequenced to ensure error-free amplification. The primers used were 5′-AAGGAGATGACCGCGGAGCT-3′ (SacII, site underlined) and 5′-TGATGCTCGAGTTTATAC-3′ (XhoI). A 450-bp fragment of waaJ was removed by digesting the plasmid with EcoRV and PstI, and the ends were blunted using T4 polymerase (New England Biolabs). The SmaI-digested aacC1 gentamicin resistance cassette was ligated into the waaJ gene. The resulting plasmid was digested with EcoRI and PstI to isolate a fragment containing the gentamicin resistance cassette flanked by waaJ sequences; this fragment was then cloned into the temperature-sensitive suicide vector, pMAK705 (31), giving plasmid pWQ316. The waaD gene from F653 was PCR-amplified and cloned into pBAD24 using the primers 5′-GGTTAGAATTGAGATGGTTGATAAA-3′ and 5′-TTTGTTATCCATGGAAACGTA-3′, exploiting the EcoRI and NcoI sites (underlined) built into the primers. The resulting plasmid (pWQ157) was digested with NdeI, which cuts in the middle of the waaD gene, and the ends were filled in using the large Klenow fragment (New England Biolabs). The plasmid was then ligated to the SmaI-digested aacC1 cassette. A fragment containing the aacC1 gene flanked by waaD sequences was removed as an EcoRI and XbaI fragment and ligated into pMAK705, as pWQ317. pMAK705 constructs were separately transformed into F653, and allelic exchange was performed as described elsewhere (4). Each mutation was confirmed by PCR across the region, followed by sequencing of the mutation junction.
To verify phenotypes and the nonpolar nature of each mutation, each derivative of F653 was complemented with a recombinant pBAD24 plasmid carrying the corresponding PCR-amplified individual open reading frame. Plasmid pWQ157 carrying waaD is described above. The waaJ gene was cloned into pBAD24 using the primers 5′-GATTAGAATTCAGGGTAATGAAATTG-3′ and 5′-TTTATCAACCATGTCGACTATTAACC-3′, exploiting the EcoRI and SalI sites (underlined) to give plasmid pWQ156.
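The restriction sites engineered into the waaJ primers can be checked by a plain string search. The following is a minimal sketch, not part of the original methods: the recognition sequences (EcoRI, GAATTC; SalI, GTCGAC) are the standard ones, whereas the primer labels and the function name `find_sites` are hypothetical names introduced here for illustration.

```python
# Standard recognition sequences for the two enzymes named in the text.
SITES = {"EcoRI": "GAATTC", "SalI": "GTCGAC"}

# waaJ primers as printed in the text (5'->3'); the labels are hypothetical.
PRIMERS = {
    "waaJ_forward": "GATTAGAATTCAGGGTAATGAAATTG",
    "waaJ_reverse": "TTTATCAACCATGTCGACTATTAACC",
}


def find_sites(primer, sites=SITES):
    """Return {enzyme: 0-based start position} for each site present in the primer."""
    # Strip line-break hyphens that can survive typesetting, then normalize case.
    seq = primer.replace("-", "").upper()
    return {name: seq.find(site) for name, site in sites.items() if site in seq}


if __name__ == "__main__":
    for label, seq in PRIMERS.items():
        print(label, find_sites(seq))
```

A search of this kind only confirms that the site is present in the oligonucleotide; it says nothing about cutting efficiency near the end of a PCR product.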
The wzy gene was also mutated in strain E. coli EC960264, to give CWG653 (wzy::aacC1). The wzy gene was amplified and cloned into the pBAD24 expression vector by using the primers 5′-GAGAAAGGAGAATTCAAAATGAAGTCAG-3′ and 5′-GTTTTTTCTAACTCTAGAGCATTATTATAAG-3′, and the resulting plasmid was used subsequently to complement the wzy mutation and verify that the phenotype represented a single genetic defect. A SmaI fragment carrying the aacC1 gene was then inserted into the HpaI site in the middle of the wzy gene in pWQ60. An XbaI and SmaI fragment carrying the mutated wzy gene was then cloned into the sucrose-sensitive suicide plasmid, pRE112 (32), giving pWQ318. The resulting plasmid was then transformed into E. coli SM10λpir and transferred by conjugation to a streptomycin-resistant mutant of E. coli O157:H7 (EC960264). A double crossover event was selected for by plating on 10% sucrose, streptomycin, and gentamicin. The same method was used to mutate the wabB gene in CWG653, giving CWG654. Briefly, the wabB gene was amplified from E. coli O157:H7 using the primers 5′-GGGCCAAGGGTACCAGATAATTAATGA-3′ (KpnI site underlined) and 5′-GCTGTCTAGAGGATCCGCCGTTA-3′ (XbaI) and cloned into pBAD24, resulting in pWQ159. A SmaI fragment carrying the aacC1 gentamicin resistance cassette was cloned into the PvuII site in the middle of the wabB gene. The mutated gene was digested with EcoRI, and the end was blunted using the Klenow fragment. After additional digestion with XbaI, this DNA fragment containing wabB::aacC1 was cloned into SmaI/XbaI-digested pRE112, and the resulting plasmid, pWQ319, was transformed into SM10λpir. The plasmid was transferred to CWG653 by conjugation, and a single crossover event was selected by plating on chloramphenicol and streptomycin. The isolate was grown at 37°C to an A600 of ~0.6, and a double crossover event was then selected for on 10% sucrose and gentamicin. Both the wzy and the wzy/wabB mutants were verified by PCR and sequencing across the mutant junctions.
Lipopolysaccharide Analysis by PAGE-Small scale LPS preparations were made from SDS-proteinase K whole-cell lysates by the method of Hitchcock and Brown (33). LPS was separated on either 10-20% gradient SDS-Tricine polyacrylamide gels or 4-12% NuPAGE gels (obtained from NOVEX, San Diego, CA). SDS-PAGE and silver-staining conditions are reported elsewhere (34). In order to visualize the presence of D-galactan I O-PS in bacteria harboring pWQ3 (35), whole-cell lysates were separated on a 12% SDS-PAGE gel and analyzed by Western immunoblotting using procedures described elsewhere (36) and polyclonal rabbit anti-D-galactan I serum (37).
Isolation of LPS and Purification of Oligosaccharides-Large scale LPS preparations were made using the phenol, chloroform, and petroleum ether method as outlined elsewhere (38). Lipid A was cleaved from LPS (100 mg) by treatment in 5 ml of 2% AcOH at 100°C for 3 h. The lipid A precipitate was removed by centrifugation, and soluble products were separated on a Sephadex G-50 column (2.5 × 95 cm) eluted in pyridinium/acetate buffer, pH 4.5 (4 ml of pyridine and 10 ml of AcOH in 1 liter of water). The eluate was monitored using a refractive index detector. For O-deacylation, 100 mg of LPS was dissolved in 3 ml of anhydrous hydrazine and incubated for 1 h at 40°C. The mixture was then poured into cold acetone, and the precipitate was collected by centrifugation, washed with acetone, and lyophilized. The LPS backbone oligosaccharides were isolated by O,N-deacylation of 120 mg of LPS in 4 ml of 4 M KOH at 120°C for 16 h (39). The mixture was cooled and neutralized with 2 M HCl, and the precipitate was removed by centrifugation. The supernatant was desalted by gel chromatography on Sephadex G-50. Individual compounds from the O,N-deacylated LPS were isolated by high performance anion-exchange chromatography (HPAEC) on a column (250 × 9 mm) of Carbopac PA1 that was eluted with a linear gradient of 10-80% 1 M sodium acetate in 0.1 M NaOH at a flow rate of 3 ml/min for 60 min. After desalting, oligosaccharides were isolated as single compounds in yields of 2-10 mg. LPS was dephosphorylated by dissolving 10 mg in 1 ml of 48% aqueous hydrofluoric acid (HF), and the treatment was performed for 48 h at 4°C (40). The HF was removed under a stream of nitrogen, and the LPS was resuspended in 10 ml of distilled water and lyophilized. After another round of dissolving in water and lyophilization, the final products were dissolved in 1 ml of water and used to provide acceptor LPS in the in vitro glycosyltransferase assays.
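The HPAEC gradient described above can be expressed as a simple linear interpolation. The sketch below is illustrative only. It assumes that "10-80%" refers to the proportion of a 1 M sodium acetate stock (in 0.1 M NaOH) blended with 0.1 M NaOH alone, so that the NaOH concentration stays constant; the function names are hypothetical.

```python
def acetate_molarity(t_min, start_pct=10.0, end_pct=80.0,
                     duration_min=60.0, stock_molar=1.0):
    """Sodium acetate concentration (M) at time t_min of a linear gradient.

    Assumes the percentage is the fraction of the 1 M acetate stock in the eluent.
    """
    if not 0.0 <= t_min <= duration_min:
        raise ValueError("time lies outside the gradient window")
    pct = start_pct + (end_pct - start_pct) * t_min / duration_min
    return stock_molar * pct / 100.0


def eluent_volume_ml(flow_ml_per_min=3.0, duration_min=60.0):
    """Total eluent volume delivered over the gradient."""
    return flow_ml_per_min * duration_min


if __name__ == "__main__":
    for t in (0, 15, 30, 45, 60):
        print(f"{t:>2} min: {acetate_molarity(t):.3f} M sodium acetate")
    print(f"total eluent: {eluent_volume_ml():.0f} ml")
```

Under these assumptions the run starts at 0.1 M and ends at 0.8 M sodium acetate and consumes 180 ml of eluent, which is useful when estimating the acetate load to be removed in the desalting step.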
Compositional and Methylation Analysis-For compositional analysis, LPS was hydrolyzed in 4 M CF3CO2H (110°C, 3 h), and monosaccharides were converted into the alditol acetate derivatives. The products were analyzed by gas-liquid chromatography (GC) on an Agilent 6850 chromatograph equipped with a DB-17 (30 m × 0.25 mm) fused silica column using a temperature gradient of 180°C (2 min) → 240°C at 2°C/min. Methylation analysis was performed using the Ciucanu-Kerek procedure (41). Methylated products were hydrolyzed, and the monosaccharides were converted to 1d-alditol acetates by conventional methods and analyzed by GC-MS. GC-MS was performed on a Varian Saturn 2000 system equipped with an ion-trap mass spectral detector using the same column.
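The GC oven program quoted above (180°C held for 2 min, then ramped to 240°C at 2°C/min) fixes the minimum run time. A minimal sketch of that arithmetic follows; the function names are hypothetical, and holding the final temperature after the ramp is an assumption, since no final hold is stated in the text.

```python
def gc_run_minutes(start_c=180.0, hold_min=2.0, end_c=240.0, rate_c_per_min=2.0):
    """Time needed to complete the initial hold plus the ramp."""
    return hold_min + (end_c - start_c) / rate_c_per_min


def oven_temperature(t_min, start_c=180.0, hold_min=2.0,
                     end_c=240.0, rate_c_per_min=2.0):
    """Oven temperature (deg C) at time t_min; final temperature is held (assumed)."""
    if t_min <= hold_min:
        return start_c
    return min(end_c, start_c + (t_min - hold_min) * rate_c_per_min)


if __name__ == "__main__":
    print(f"program length: {gc_run_minutes():.0f} min")
    for t in (0, 2, 10, 20, 32):
        print(f"{t:>2} min: {oven_temperature(t):.0f} C")
```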
NMR Spectroscopy-NMR spectra were recorded at 25°C in D2O on a Varian UNITY INOVA 600 instrument using acetone as reference (1H, 2.225 ppm and 13C, 31.45 ppm). Varian standard programs COSY, NOESY (mixing time of 300 ms), TOCSY (spinlock time 120 ms), HSQC, and gHMBC (evolution delay of 100 ms) were used with digital resolution in the F2 dimension of <2 Hz/point. Spectra were assigned using the computer program Pronto (42).
Mass Spectroscopy-A crystal model 310 CE instrument (ATI Unicam, Boston) was coupled to a Q-Star quadrupole/time-of-flight mass spectrometer, or an API 3000 mass spectrometer (Applied Biosystems/MDS Sciex, Concord, Canada), via a micro-ion spray interface. The separations were obtained on an ~90-cm-long bare fused silica capillary, using 15 mM ammonium acetate in deionized water, pH 9.0, containing 5% methanol. A voltage of 15 kV was typically applied at the injection. The outlet of the capillary was tapered to ~15 μm internal diameter using a laser puller (Sutter Instruments, Novato, CA). For MS/MS analysis, fragment ions were formed by collision activation of selected precursor ions with nitrogen in the RF-only quadrupole collision cell and recorded using a time-of-flight mass analyzer. For pseudo MS/MS/MS analysis, the precursor ions were generated with an orifice voltage of +180 V, and mass spectra were acquired with nitrogen in the RF-only quadrupole collision cell. ESI MS was carried out as described previously (43).
In Vitro Glycosyltransferase Assays-The waaJ gene was amplified by PCR using the primers 5′-GATTAGAATTCAGGGTAATGAAATTG-3′ (EcoRI) and 5′-TTTATCAACCATGTCGACTATTAACC-3′ (SalI). The fragment was digested with EcoRI and SalI and cloned into the pET28a+ expression vector (Novagen), giving pWQ155. The resulting construct expressed WaaJ with an N-terminal His6 tag. The plasmid was transformed into E. coli BL21 (DE3) for overexpression. The strain was grown to an A600 of 0.6, and isopropyl-β-D-thiogalactoside was added to a final concentration of 1 mM. After 2 h of incubation, the cells were collected by centrifugation, and the cell paste was frozen overnight. The pellet was thawed and resuspended with lysis buffer consisting of 10 mM imidazole, 50 mM NaH2PO4, 300 mM NaCl, and 1 mg/ml lysozyme. The cells were then lysed by ultrasonication, and cell-free lysate was collected after ultracentrifugation (100,000 × g for 1.5 h at 4°C). Cell-free lysate was then incubated with nickel-nitrilotriacetic acid-agarose (Qiagen) and loaded in a gravity column. The column was washed in 50 mM imidazole, and the His6-WaaJ protein was eluted with a 250 mM imidazole solution.
To measure glycosyltransferase activity, each reaction contained 300 μg of CWG350 LPS acceptor, 10 μg of His6-WaaJ, 0.699 nmol of UDP- In vitro glycosyltransferase assays for WaaD and WabB were performed using cell-free lysates of E. coli 21548 (pWQ157; WaaD) and DH5α (pWQ159; WabB). The lysates from the same strains containing pBAD24 vector provided the controls. The cells were grown to an A600 of 0.6, and expression of the relevant protein (WaaD or WabB) was induced using 0.02% arabinose. The cells were grown for 2 h, collected by centrifugation, and frozen overnight. The pellet was thawed and resuspended with lysis buffer consisting of 50 mM Tris-HCl, pH 8.0, containing 10 mM MgCl2 and 1 mM dithiothreitol. The cells were lysed using ultrasonication (model 500, Fisher). Cell-free lysates were prepared by ultracentrifugation (100,000 × g for 1.5 h at 4°C). For the WaaD assay, LPS from CWG351 acted as the acceptor, and LPS from both ). The LPS collected on the filters was washed three times with stop solution to eliminate residual unincorporated radioactive substrate, and then the filters were dried. Incorporation was measured by scintillation as described above, this time using a Beckman Coulter LS6500 multipurpose scintillation counter.

Organization and Bioinformatic Analysis of the Central Operon of the R3 Core Type waa Locus-E. coli F653 provides the prototype for the R3 core OS structure (26), and the waa locus has been sequenced (3). Essentially identical sequences have resulted from the genome sequences of E. coli O157:H7 (23,24) and S. flexneri serotype 2a (14,25). The central operon of the waa locus (Fig. 1B) is located between the waaA and waaL genes. With a length of 9079 nucleotides, this is the smallest central operon of any of the E. coli waa loci. The waaA genes from K12 and R3 share 98.7% identity at the nucleotide level, and the predicted WaaA proteins are identical. The waaA gene product is a bifunctional Kdo transferase responsible for the addition of KdoI and KdoII (44). WaaL proteins are involved in O-PS ligation (1). Collectively, they share limited primary sequence identity, but all are predicted to be integral membrane proteins with more than eight transmembrane domains, and all have predicted hydrophilic domains of equivalent size and distribution, leading to common hydropathy profiles (3). The waaL gene in the R3 waa loci shares ~52% identity with the characterized WaaL "ligases" from E. coli F632 (the R2 prototype strain) (45) and from Salmonella enterica sv. Typhimurium (46). The waaQGP genes from the R3 system were identified based on >99% identity shared by their gene products with enzymes whose activity has been determined in the R1 system (4). The predicted WaaY protein shows less similarity when compared with its R1 counterpart (51.7% identity and 65.8% total similarity).
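Percent identity and total similarity values such as those quoted above are computed from a pairwise alignment. The sketch below shows one simple way to do this and is not the procedure used in the study: the conservative-substitution groups, the choice of gap-free aligned columns as the denominator, and the toy sequences are all assumptions made for illustration (alignment programs typically score similarity with a substitution matrix and may use a different denominator).

```python
# Simple conservative-substitution groups (an assumption for illustration only).
SIMILAR_GROUPS = ["ILVM", "FYW", "KRH", "DE", "ST", "NQ", "AG"]


def identity_and_similarity(aln_a, aln_b):
    """Return (% identity, % total similarity) over gap-free alignment columns."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aln_a.upper(), aln_b.upper())
               if a != "-" and b != "-"]
    if not columns:
        raise ValueError("no gap-free columns to compare")
    identical = sum(a == b for a, b in columns)
    similar = sum(a != b and any(a in g and b in g for g in SIMILAR_GROUPS)
                  for a, b in columns)
    n = len(columns)
    # "Total similarity" counts identical plus conservatively substituted positions.
    return 100.0 * identical / n, 100.0 * (identical + similar) / n


if __name__ == "__main__":
    # Toy aligned fragments, not real WaaY sequences.
    ident, simil = identity_and_similarity("MKT-LV", "MRTALI")
    print(f"identity {ident:.1f}%, total similarity {simil:.1f}%")
```

Because total similarity includes the identical positions, it can never be lower than the identity value, which matches the way the paired figures are reported in the text.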
The outer core OS backbone in all E. coli and Salmonella LPSs shares a trisaccharide backbone (HexI-HexII-HexIII) where the nonreducing HexI is always a glucose residue formed by the WaaG glycosyltransferases (Fig. 1A, GlcI) (47). The glycosyltransferases required for addition of HexII and HexIII in E. coli and Salmonella share a number of conserved motifs and form what has been termed the WaaIJ family (3). These enzymes belong to a larger grouping of Family 8 retaining glycosyltransferases based on BLAST analysis and shared predicted three-dimensional structures identified by methods including hydrophobic cluster analysis (48) (afmb.cnrs-mrs.fr/CAZY/index.html). The outer core backbones of E. coli R3 and S. enterica serovar Typhimurium share the identical α-GlcII-(1→2)-α-GalI-(1→3)-GlcI trisaccharide (Fig. 1A), and it was expected that the locus would contain genes homologous to those encoding the known HexII (waaI) and HexIII (waaJ) transferases from Salmonella. In the central waa operons of Salmonella and E. coli K12, R1, R2, and R4, genes encoding the HexII and HexIII transferases are contiguous (reviewed in Ref. 3). However, in the R3 waa locus the candidate waaI and waaJ genes are separated by the predicted waaY gene (Fig. 1B). The putative R3 waaI gene product shares ~50-60% identity with other HexII enzymes. The closest homolog to the R3 WaaI protein is that from S. enterica sv. Typhimurium (59.4% identity; 71.9% total similarity), as would be predicted since both core types include an α-GalI-(1→3)-GlcI linkage. The activity of the S. enterica sv. Typhimurium homolog has been established (47,49), but no biochemical confirmation of activity is available for the corresponding R3 enzyme. The closest WaaJ homologs with defined activity are those from S. enterica serovars Typhi and Typhimurium (~45% identity and 65% similarity). Confirmation of the sequence predictions concerning the identity of WaaJ was obtained by biochemical experiments (see below).
Only one additional gene (designated waaD) was present in the waa locus of E. coli F653. Sequence features predict WaaD to be a retaining glycosyltransferase belonging to family 4 (afmb.cnrs-mrs.fr/CAZY/index.html), and BLAST searches identify WaaK as its closest homolog. The WaaK enzyme adds an α-1,2-linked GlcNAc residue to HexIII (a Glc residue) in the S. enterica sv. Typhimurium and E. coli R2 core types (45). The WaaK and WaaD proteins share 57% identity (70% total similarity). These similarities could reflect either a common sugar nucleotide (i.e. WaaD could indeed be a GlcNAc transferase) or, alternatively, both enzymes may share a common acceptor (the 2-position of GlcII) to which they transfer different sugars. In this case, WaaD could be the GlcIII transferase for R3 core OS assembly. This was resolved by biochemical methods that identified WaaD as the GlcIII transferase (see below).
Aside from the tentative nature of some gene assignments from database searches, the R3 waa locus does not encode enough open reading frames to accommodate the number of glycosyltransferases needed for biosynthesis of the known core OS structure. This differs from the waa loci from E. coli K12, R1, R2, and R4 and S. enterica sv. Typhimurium and sv. Arizonae IIIA, where structural genes for all of the expected glycosyltransferases for outer core OS assembly have been identified in the waa locus. To establish which transferase was missing, the activities of WaaD and WaaJ were defined by a combination of structural determination of LPS from defined mutants and biochemical assays.
Structural Characterization of the Linkage Region between O-PS and the R3 Core OS-The biosynthesis of many E. coli O antigens involves an initiation reaction mediated by WecA, a UDP-GlcNAc:Und-P GlcNAc-1-P transferase (50). As a result, the GlcNAc residue in the F653 outer core OS structure could arise from WecA activity and reflect a residue that marked the attachment site for O antigen. The nature of the O-PS biosynthesis defect in F653 is unknown, and R3 is the only E. coli core OS where the O-PS ligation site has not been established. Therefore, to complete the structural analysis of the R3 core OS type, the ligation site was determined.
Since the E. coli R3 prototype F653 is a rough strain and does not contain O-PS, the ligation site was determined by using the R3 core in E. coli O157:H7. In order to simplify structural analysis, a chromosomal mutant in the wzy (O-PS polymerase) gene was constructed by allelic exchange. The Wzy protein is involved in the assembly of the undecaprenol-linked O-PS repeat units, and strains with wzy mutations contain a full lipid A core with only one O-PS repeat unit (reviewed in Ref. 1). PAGE analysis of the LPS from the wzy mutant strain, CWG653, showed the expected truncated LPS molecule containing lipid A core plus a single O-PS repeat unit and loss of all higher molecular weight O-PS (Fig. 2).
The LPS from CWG653 was extracted by the phenol, chloroform, and petroleum ether method (38). N,O-Deacylation of the LPS from CWG653 gave two major compounds, oligosaccharides 1 and 2, that were well separated by HPAEC (Fig. 3, lower  panel) as well as a mixture of the oligosaccharides 3 and 4. NMR and MS data (data not shown) for oligosaccharide 2 showed it to be identical to the complete R3 core OS, previously described as the major component of deacylated F653 LPS (22) (Fig. 4A). From NMR data, oligosaccharide 1 contained the core OS plus additional monosaccharides (residues W, X, Y, and Z) representing one repeating unit of the O-PS (Fig. 4A). A set of two-dimensional spectra (COSY, TOCSY, NOESY, HSQC, and gHMBC) were recorded and completely interpreted (Table II and Fig. 5). The identity of the monosaccharides was deduced from NMR chemical shifts and vicinal coupling constants. Connections between monosaccharides were determined on the basis of NOE and HMBC transglycosidic correlations. The NOE correlations Z1Y3, Y1X4, X1W3, W1K4, and W1K6 (Fig.  6) reflected the structure of the O157 repeat unit fragment with residue Z at the nonreducing end. The initial GalN (galactosamine; residue W) of the O-repeat unit is attached to O-4 of the outer core Glc (residue K). Through these experiments, the additional structure that distinguished oligosaccharide 1 from oligosaccharide 2 was identified as ␣ -Rha4N-(1-3) Fig. 4A). This corresponds to the repeating unit of O157 LPS with altered anomeric configuration of the GalN. The structure was confirmed by ESI-MS, which gave an observed mass of 3100.4 Da, consistent with a predicted average mass of 3101.4 Da (data not shown). Location of phosphate residues was based on downfield shifting of 1 H signals at the sites of phosphorylation, as well as additional 1 H-31 P couplings. 
The phosphorylation was verified by 1H-31P HMQC correlations: A1 to 31P at 2.05 ppm, B4 to 4.30 ppm, E4 to 3.20 ppm, and F4 to 3.45 ppm (data not shown).
The mixture of oligosaccharides 3 and 4 identified in HPAEC analysis of deacylated CWG653 LPS (Fig. 3, lower panel) was not analyzed in detail by NMR. ESI-MS showed two peaks, with masses of 3020.5 and 3181.3 Da, respectively. Oligosaccharide 3 corresponds to a derivative of oligosaccharide 1 lacking one phosphate group. Oligosaccharide 4 is also derived from oligosaccharide 1 but contains three phosphate groups (on residues A, B, and E; no phosphate on residue F) and carries an additional GlcN residue at O-7 of HepIII (residue H, Fig. 4). The oligosaccharide 3 and 4 structures were both predicted from the previous analysis of F653 LPS where, in the absence of an O repeat, similarly substituted oligosaccharide derivatives of R3 core OS were identified (22).
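The mass assignments for oligosaccharides 3 and 4 can be checked with simple arithmetic. The following is a minimal sketch, not part of the original analysis: it assumes that loss of one phosphate corresponds to HPO3 (average mass about 79.98 Da) and that the added GlcN is an anhydro-hexosamine residue (C6H11NO4, about 161.16 Da), and it uses the observed ESI-MS mass of oligosaccharide 1 as the starting point. The function name and constants are illustrative.

```python
# Average masses (Da) computed from standard atomic weights.
HPO3 = 1.008 + 30.974 + 3 * 15.999                            # one phosphate group, ~79.98
GLCN_RESIDUE = 6 * 12.011 + 11 * 1.008 + 14.007 + 4 * 15.999  # anhydro-GlcN (C6H11NO4), ~161.16

# Observed ESI-MS masses quoted in the text for oligosaccharides 1, 3, and 4.
OBSERVED = {1: 3100.4, 3: 3020.5, 4: 3181.3}


def predicted_mass(base, phosphates_lost=0, glcn_added=0):
    """Mass expected after removing phosphates and/or adding GlcN residues."""
    return base - phosphates_lost * HPO3 + glcn_added * GLCN_RESIDUE


# Oligosaccharide 3: oligosaccharide 1 minus one phosphate.
pred3 = predicted_mass(OBSERVED[1], phosphates_lost=1)
# Oligosaccharide 4: oligosaccharide 1 minus one phosphate plus one GlcN.
pred4 = predicted_mass(OBSERVED[1], phosphates_lost=1, glcn_added=1)

for name, pred in ((3, pred3), (4, pred4)):
    print(f"oligosaccharide {name}: predicted {pred:.1f} Da, "
          f"observed {OBSERVED[name]:.1f} Da, "
          f"difference {pred - OBSERVED[name]:+.2f} Da")
```

Under these assumptions, both predicted values fall within 0.5 Da of the observed masses, which agrees with the assignments given above.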
To confirm the structure of the O-PS-core OS linkage region, and verify that the structural data did not reflect unanticipated effects of the wzy mutation, the products from the water-phase LPS from wild type E. coli O157 were analyzed. The chemistry of the O157 LPS is such that smooth LPS partitions almost exclusively in the phenol phase (unlike most other smooth LPSs), with only short chain LPS recovered from the water phase (51). Deacylation of the short chain LPS and HPAEC of the products (Fig. 3, upper panel) yielded oligosaccharides that gave NMR spectra identical to the fully characterized oligosaccharides 1 and 2 from E. coli CWG653 (O157 wzy) (data not shown). The short chain LPS was also subjected to AcOH hydrolysis, and the products were separated by gel filtration chromatography (Sephadex G-50) to yield fractions containing the nonsubstituted core OS and core OS linked to one or two O157 repeat units (data not shown). The major component, oligosaccharide 5, was analyzed by NMR and mass spectroscopy, and the NMR data were partially interpreted. Complete assignment of all signals and NOE contacts was possible for all monosaccharides except heptoses and Kdo, which gave several series of signals due to different Kdo forms. The results were in complete agreement with the proposed structure (Fig. 4B). The fragmentation of cationic oligosaccharides typically proceeds by cleavage at the glycosidic bonds, which provides sequence and branching information (52). The product ion spectrum (MS/MS spectrum) obtained from the doubly charged ion at m/z 1255.12 is illustrated in Fig. 7A. In order to confirm the structure, the sample was also analyzed by CE-MS/MS/MS techniques using an API 3000 mass spectrometer. The CE-MS/MS/MS experiments were acquired using a high orifice voltage (180 V), and the extracted mass spectrum is shown in Fig. 7B. The series of fragments correspond to the sequences of oligosaccharide 5. Observation of sequential loss of the residues Z, Y, X, and W (Fig. 7B) confirms again that the Rha4NAc residue is present at the nonreducing end of the biological repeat unit of the O157 antigen. Unfortunately, no fragments were observed that could confirm the linkage between residues W and K. Taken together, the structural data identify the O-PS linkage site as O-4 of GlcII on the outer core OS (HexIII).

FIG. 4. Chemical structures of the major LPS oligosaccharides representing core OS carrying a single repeat unit of O157 antigen.
A shows the structures of oligosaccharides 1-4 isolated from N,O-deacylated E. coli CWG653 (serotype O157 wzy mutant) LPS by HPAEC. B shows the structure of oligosaccharide 5, the major component isolated from AcOH-hydrolysis of short chain water-soluble E. coli O157 wild type LPS.

TABLE II
Partial NMR data for the major compound (oligosaccharide 1) obtained from deacylation of E. coli CWG653 LPS
Chemical shifts for the remainder of the molecule were essentially identical to those published previously (22) for the R3 core OS (identical to oligosaccharide 2) lacking the O157 repeat unit (i.e. residues Z-Y-X-W).

To correlate these structural data with genes in the waa locus, strains CWG350 (waaJ) and CWG351 (waaD) were tested for their ability to ligate a test O-PS, the D-galactan I O-PS from Klebsiella pneumoniae serotype O1 (35).
The genes necessary for biosynthesis of D-galactan I are cloned in plasmid pWQ3 (35). LPS profiles were examined by PAGE and Western immunoblotting from transformants of CWG350, CWG351, and F653 harboring pWQ3. LPS containing D-galactan I O-PS was detected in F653 (pWQ3) but not in either CWG350 (pWQ3) or CWG351 (pWQ3) (Fig. 8). The absence of ligation in CWG350 is consistent with predictions that WaaJ is the transferase responsible for the addition of GlcII. In other systems, residues linked to the linkage residue also influence ligation proficiency (45,46), so the absence of ligation in more than one mutant was not unprecedented. However, this observation did not facilitate an unequivocal assignment of WaaD function.
The WaaJ Protein from the E. coli R3 Core OS Assembly System Is a UDP-glucose:(Galactose) LPS α-1,2-Glucosyltransferase
The absence of one core OS glycosyltransferase created ambiguity in the functional assignments. In one scenario, WaaJ would be responsible for adding a single residue (GlcII), with waaD encoding either the GlcIII or the GlcNAc transferase (see Fig. 1A). Alternatively, the R3 WaaJ enzyme could potentially add two α-1,2-Glc residues (GlcII and GlcIII), with WaaD representing the final activity, the GlcNAc transferase. There is precedent for multitransfer of glycose residues in core OS biosynthesis, and the Kdo transferase (WaaA) provides a fully characterized example (44).
In order to unequivocally identify the role of the waaJ gene in R3 core OS biosynthesis, a nonpolar chromosomal mutation was constructed in the putative waaJ gene by allelic exchange. The lipid A core molecules of the putative waaJ mutant (CWG350) migrated faster than the corresponding lipid A core from the parent, F653. The difference in migration between the lipid A core OSs was consistent with the loss of two glycose residues (Fig. 9A). To identify the missing sugar(s), linkage analysis of CWG350 LPS was performed by the methylation procedure. The derivatives were identified by GC-MS (data not shown). The samples from CWG350 and F653 differed in derivatives of the outer core sugars. The CWG350 LPS retained Gal, but the derivative was only substituted at positions 1 and 3, rather than the 1,2,3-substituted residue seen in F653. The F653 LPS contained three Glc derivatives substituted at positions 1 and 3 (from GlcI), 1 and 2 (from GlcII), and 1 only (from GlcIII). In CWG350, only 1,3-substituted Glc remained, reflecting the presence of only GlcI. These findings confirmed that WaaJ deficiency in CWG350 resulted in loss of the GlcII residue as well as the linked GlcIII residue but ruled out any effect on the GalI transferase activity (i.e. WaaI). However, the data provide no insight into the enzyme responsible for addition of GlcIII, and this was resolved by examining the enzymatic function of WaaJ.
The waaJ gene was cloned in pBAD24 to utilize arabinose-inducible expression and the optimal ribosome-binding site provided by the vector (53). Based on sequence analysis, the putative waaJ gene in E. coli F653 could be initiated with an ATT or TTG codon. Both are downstream (separated by 8 and 14 nucleotides, respectively) of a putative ribosome-binding site (5′-GAAGGG-3′). Identical sequences were identified in the GenBank™ accession for E. coli O157 (AE005590), S. flexneri serotype 2a (AE016991), and in several field isolates of E. coli whose core OS type was determined to be R3 (D. Heinrichs and C. Whitfield, unpublished results). An ATG initiating codon is used by all other genes encoding WaaIJ family proteins, but this was ruled out for WaaJ because the nearest ATG codon giving a reasonably sized open reading frame translated a protein lacking 154 N-terminal amino acid residues that contain a characteristic motif found in other members of the WaaIJ family. Aspartate residues in this motif have been shown to be important for catalytic activity in other HexII and HexIII transferases (54). Equally important, the N-terminal domain of Family 8 glycosyltransferases has been shown to contain residues important for interaction with UDP-hexose donors, residues critical to the coordination of metal cofactors (Mn2+ and Mg2+), and residues binding LPS acceptor motifs (55). As expected, the truncated polypeptide expressed from the downstream ATG was unable to restore core biosynthesis in the waaJ mutant, CWG350 (data not shown). In contrast, use of the ATT as an initiating codon in plasmid pWQ156 yielded functional enzyme that complemented the waaJ defect in CWG350 to restore synthesis of a lipid A core OS molecule that comigrated with F653 LPS in PAGE (Fig. 9A).
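The search for non-ATG initiation codons downstream of a ribosome-binding site can be illustrated with a short scan. This is a hedged sketch only: the toy sequence below is hypothetical and was constructed to mimic the arrangement described above (it is not the waaJ upstream region), the spacing window of 5 to 15 nucleotides is an arbitrary assumption, and the gap is counted as the number of nucleotides between the end of the GAAGGG motif and the first base of the candidate codon.

```python
RBS = "GAAGGG"               # putative ribosome-binding site from the text
ALT_STARTS = {"ATT", "TTG"}  # alternative initiation codons considered in the text


def candidate_starts(seq, rbs=RBS, starts=ALT_STARTS, min_gap=5, max_gap=15):
    """Return (gap, codon) pairs for start codons found downstream of each RBS match.

    gap = number of nucleotides between the last RBS base and the first codon base.
    """
    seq = seq.upper()
    hits = []
    pos = seq.find(rbs)
    while pos != -1:
        rbs_end = pos + len(rbs)
        for gap in range(min_gap, max_gap + 1):
            codon = seq[rbs_end + gap:rbs_end + gap + 3]
            if codon in starts:
                hits.append((gap, codon))
        pos = seq.find(rbs, pos + 1)
    return hits


# Hypothetical toy sequence: RBS, an 8-nt spacer, ATT, one codon, then TTG.
TOY_SEQUENCE = "CC" + "GAAGGG" + "CACACACA" + "ATT" + "CCC" + "TTG" + "GCAAAA"

print(candidate_starts(TOY_SEQUENCE))
```

In the toy sequence the two candidates are 6 nucleotides apart and therefore in the same reading frame, which matches the 8- and 14-nucleotide spacings reported for waaJ.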
To determine whether WaaJ was monofunctional or bifunctional in E. coli R3 core types, the activity of the enzyme was examined in vitro. The waaJ gene was cloned in pET28a(+) to generate a WaaJ derivative with an N-terminal His6 tag. The His6-WaaJ protein expressed from pWQ155 was functional in terms of its ability to complement the waaJ defect in PAGE analysis of CWG350 LPS (data not shown). The overexpressed His6-WaaJ protein was detected in both the soluble and membrane fractions from the host strain (E. coli BL21 [DE3]) (Fig. 10A). The sequence of WaaJ does not predict any transmembrane segments, and the predicted pI of 6.91 does not suggest an overall basic character that might facilitate interaction with membrane phospholipids. However, membrane association could be mediated by protein-protein interactions with other enzymes of the host core OS-biosynthesis complex. The soluble His6-WaaJ protein was partially purified (>90% purity) by nickel-chelation affinity chromatography (Fig. 10A) and was used for in vitro assays. The His6-WaaJ-containing lysate incorporated [14C]Glc from UDP-[14C]Glc into LPS, whereas no activity was detected in lysates lacking enzyme or in reactions with no added acceptor (data not shown). The enzyme was active and incorporated [14C]Glc into acceptor in a time-dependent manner (Fig. 10B). The products of an equivalent in vitro reaction using nonradioactive UDP-Glc substrate were examined by PAGE (Fig. 10C). The LPS resulting from WaaJ activity exhibited a mobility slower than that of the CWG350 LPS acceptor but faster than the F653 LPS. The in vitro results are therefore consistent with WaaJ being a monofunctional LPS α-1,2-glucosyltransferase and adding a single residue (GlcII) to the acceptor LPS. With the functional assignment of waaJ, still unaccounted for were the genes responsible for the addition of two outer core OS residues, GlcIII (linked to HexIII) and GlcNAc (linked to HexII).
The WaaD Protein from the E. coli R3 Core OS Assembly System Is a UDP-glucose:(Glucose) LPS α-1,2-Glucosyltransferase
To establish the role of WaaD, a nonpolar chromosomal waaD insertion mutant was made in F653 by insertion of a gentamicin resistance cassette. In PAGE analysis, the LPS lipid A core of the waaD mutant (CWG351) migrated midway between that of the parent F653 and the waaJ mutant CWG350 (Fig. 9B), indicating loss of a single glycose residue. The waaD gene cloned into the pBAD24 vector (pWQ157) complemented the waaD mutation to restore a full-length core, indicating that the LPS phenotype was due only to the waaD defect. The core OS structure of CWG351 was examined. After mild acid hydrolysis, the outer core OS contained D-Glc, GlcNAc, and D-Gal in an approximate molar ratio of 2:1:1, respectively. Although the data suggested that WaaD is not involved in the addition of the outer core GlcNAc residue, the interpretation is complicated by the presence of small amounts of GlcNAc attached to HepIII in the inner core OS (Fig. 1A; and seen in oligosaccharide 4 above). In further methylation experiments, the CWG351 outer core structure yielded 1,3-substituted Glc (derived from GlcI), 1,2,3-substituted Gal (from GalI), and 1-substituted Glc that would arise from GlcII in the absence of the GlcIII residue. The results obtained from structural analysis of the core OS in strain CWG351 therefore illustrate that the waaD gene encodes the transferase responsible for addition of the HexIII substitution, the terminal Glc residue.

FIG. 9. PAGE profile of the LPS from E. coli F653 waaJ and waaD mutants. A shows CWG350 (waaJ) and complementation of the waaJ defect with plasmid pWQ156, carrying waaJ. B shows the corresponding results for CWG351 (waaD) and its complementation with pWQ157 (carrying waaD). In complementation experiments, expression of the cloned genes was induced by growth in 0.002% arabinose.
To identify unequivocally the specificity of the WaaD glycosyltransferase, cytosolic fractions of E. coli 21548 (wecA) harboring pWQ157 or pBAD24 (vector control) were tested for in vitro glycosyltransferase activity using purified LPS from CWG351 (waaD) as the acceptor. As shown in Fig. 11, the extract with plasmid-expressed WaaD exhibited glucosyltransferase activity, whereas no activity was detected in the control lysates. When the same extracts were tested for the ability to transfer radioactivity to acceptor LPS from UDP-[14C]GlcNAc, no activity was detected (data not shown). Taken together, the structural and biochemical data show that the waaD gene from E. coli R3 prototype F653 encodes the α-1,2-linked glucosyltransferase responsible for the addition of the terminal Glc side branch (HexIII substitution), and WaaD plays no role in the addition of the α-1,3-linked GlcNAc attached to HexII in the outer core OS.
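The logic of the methylation (linkage) analyses for the waaJ and waaD mutants can be sketched as a small tree model. This is an illustrative reconstruction, not the authors' procedure: the attachment positions are inferred from the substitution patterns and linkages reported in the text, the residue labels are used as dictionary keys for convenience, and the position at which GlcI joins the inner core is left unspecified because it does not affect the patterns discussed.

```python
# child residue -> (parent residue, hydroxyl position on the parent carrying the child)
R3_OUTER_CORE = {
    "GlcI": ("inner core", None),  # attachment position on the heptose not modeled
    "GalI": ("GlcI", 3),
    "GlcNAc": ("GalI", 3),
    "GlcII": ("GalI", 2),          # residue added by WaaJ
    "GlcIII": ("GlcII", 2),        # residue added by WaaD
}


def remove_with_descendants(core, residue):
    """Drop a residue and everything attached to it (directly or indirectly)."""
    doomed = {residue}
    changed = True
    while changed:
        changed = False
        for child, (parent, _) in core.items():
            if parent in doomed and child not in doomed:
                doomed.add(child)
                changed = True
    return {r: link for r, link in core.items() if r not in doomed}


def substitution_pattern(core, residue):
    """Positions detected by methylation analysis: C-1 plus every substituted hydroxyl."""
    positions = {1}  # each residue is glycosidically linked through C-1
    positions |= {pos for _, (parent, pos) in core.items() if parent == residue}
    return sorted(positions)


wild_type = R3_OUTER_CORE
waaJ_mutant = remove_with_descendants(R3_OUTER_CORE, "GlcII")   # models CWG350
waaD_mutant = remove_with_descendants(R3_OUTER_CORE, "GlcIII")  # models CWG351

for label, core in (("F653", wild_type), ("CWG350", waaJ_mutant), ("CWG351", waaD_mutant)):
    patterns = {r: substitution_pattern(core, r) for r in core}
    print(label, patterns)
```

In this model, removing GlcII also removes GlcIII and converts GalI from 1,2,3- to 1,3-substituted, whereas removing only GlcIII leaves GalI unchanged and converts GlcII into a terminal (1-substituted) residue. Both outcomes match the derivatives reported for CWG350 and CWG351.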
Identification of a Plasmid-encoded Glycosyltransferase That Modifies the Inner Core OS in Type R3 LPS
With the identification and characterization of the waaJ and waaD genes, functional assignment was completed for genes within the waa cluster. However, the glycosyltransferases for the α-1,3-linked GlcNAc residue attached to HexII and the α-1,7-linked GlcNAc residue attached to HepIII (Fig. 1A) remained unknown. An additional putative glycosyltransferase gene has been identified on the 92-kb virulence plasmid, pO157, found in E. coli O157 isolates and is identified on the CAZY website (afmb.cnrs-mrs.fr/CAZY/index.html). This potential glycosyltransferase was designated "RfbU" due to similarities shared with a putative α-mannosyltransferase from the biosynthesis of serogroup D1 O-PS in S. enterica (56), but its exact activity remains unknown. The enzyme is a member of Family 4 retaining glycosyltransferases, whose members include the core OS biosynthesis enzymes WaaD and WaaG. The putative rfbU gene product from E. coli O157:H7 strain EDL933 (57) (GenBank™ accession number AAC70095.1) was used to identify other homologs in the databases by BLAST. Examples include a putative plasmid-encoded glycosyltransferase from the E. coli O157:H7 isolate derived from the Sakai outbreak (58) (GenBank™ accession number BAA31838.1), and a predicted glycosyltransferase homolog (accession number AF134403) from an enteroaggregative serotype O42 E. coli isolate (52% identity; 66% similarity) (59). Also identified was a gene from the S. flexneri 2a pWR100 virulence plasmid (60% identity; 74% similarity) (60) (GenBank™ accession number NP085405). The organization of the pO157 locus is reasonably well conserved (Fig. 12A).
RfbU provided a candidate for one of the "missing" transferases for R3 core OS biosynthesis, and with the establishment of its function (see below), the gene has been renamed wabB, following the convention for genes involved in bacterial polysaccharide synthesis (61). In order to be certain that the wabB gene was present in the R3 prototype strain E. coli F653, Southern blot analysis was performed. F653 contains endogenous plasmids, and a hybridization signal was detected with plasmid DNA purified from both E. coli O157 and F653 (Fig. 12B).
To investigate the possible contribution of WabB to assembly of the lipid A core, the wabB gene was mutated in E. coli CWG653 (serotype O157 wzy) by inserting a nonpolar gentamicin resistance cassette. However, there was no obvious difference in the PAGE migration of LPS from CWG653 and that from the wabB wzy double mutant CWG654 (Fig. 13A). These results could reflect no role for WabB in LPS biosynthesis or the fact that the GlcNAc substituent is present in less than 30% of all F653 LPS molecules (22), and the sensitivity of silver-stained PAGE gels could limit the ability to detect differences in whole-cell lysates of CWG653 and CWG654 (Fig. 13A). To determine the structures involved, the O,N-deacylated LPS from CWG654 was examined. HPAEC of this material yielded three major oligosaccharides that were analyzed by NMR and ESI-MS. None of these oligosaccharides contained α-1,7-linked GlcN at HepIII (residue H) (data not shown).

FIG. 12. ... (76) or E. coli O157 (shf-wabB-ecf3-msbB) (75). B shows a Southern hybridization in which plasmid DNA (~5 μg) from F653 and the positive control (O157:H7) were digested with EcoRI, and the resulting Southern blot was probed with an internal fragment of wabB from pO157.
To provide further insight into WabB function, the gene was cloned into a pBAD24 expression vector (giving pWQ159) and introduced independently into E. coli F653, an O157 isolate, and CWG653 (Fig. 13, B-D). Overexpression of wabB caused an altered SDS-PAGE migration of the LPS from all three strains, consistent with the addition of one sugar residue to LPS molecules ranging from uncapped rough species to O antigen-substituted LPS. Structural analysis was performed on purified LPS from F653 (pWQ159) (Fig. 13B). Complete deacylation resulted in two main compounds, oligosaccharides 6 and 7 (Fig. 14), that were isolated by preparative HPAEC and completely characterized by NMR and ESI-MS. The structures were identical to oligosaccharides reported in the determination of the R3 core OS structure (22) (data not shown). The relative amount of these components changed compared with those found in the parent strain (F653), with a significant increase in oligosaccharide 6, which contained α-1,7-linked GlcN at HepIII (residue H) (Fig. 14). LPS from F653 (pWQ159) was O-deacylated with hydrazine and analyzed by ESI-MS, which resulted in identification of two major components, oligosaccharides 8 (molecular mass of 3187.8; predicted 3185.8) and 9 (3063.2; predicted 3062.5). The observed substitution of HepIII (residue H) with α-GlcNAc is consistent with WabB being the GlcNAc transferase responsible for inner core modification.
Verification of the glycosyltransferase activity of WabB was obtained by in vitro assays. Soluble fractions from cell lysates of E. coli DH5α (pWQ159) and DH5α (pBAD24 control) were tested for their ability to transfer [14C]GlcNAc from UDP-[14C]GlcNAc onto acceptor LPS. In an initial experiment with CWG654 acceptor LPS, no activity was detected (Fig. 15). Structural data from this and previous studies (22) indicate that the substitution of HepIII (residue H) with GlcNAc and phosphorylation of HepII (residue F) are mutually exclusive events, suggesting that an appropriate acceptor for WabB may not be available in the LPS from CWG654. The experiment was therefore repeated using CWG654 LPS that was first dephosphorylated by HF treatment. This yielded an acceptor that was now modified by lysates containing WabB. Radioactivity was incorporated from UDP-[14C]GlcNAc into HF-treated CWG654 LPS in a time-dependent manner (Fig. 15). Collectively, these data confirm that plasmid-encoded WabB is a UDP-N-acetylglucosamine:(heptose) LPS α-1,7-N-acetylglucosamine transferase that adds an α-1,7-linked GlcNAc residue to the HepIII residue in the inner core OS of R3-type LPS.

DISCUSSION
In this study, we have characterized the waa genetic locus from E. coli F653 encoding enzymes for R3-type core OS assembly based on analysis of LPS structure in specific waa mutants and biochemical activities of specific enzymes. The waaA, waaQ, waaG, waaP, waaY, and waaL genes were all assigned by high levels of conservation shared with homologs whose activities have been established. The R3 core OS structure is consistent with these activities. Although some R3 glycosyltransferases could be clearly assigned by a bioinformatic approach, the role of others was not obvious, particularly with respect to addition of side branch residues attached to the outer core trisaccharide backbone.
The functions of WaaJ and WaaD in the synthesis of the terminus of the R3 core OS and O-PS ligation site have now been established by determining the structures of precise core OS-assembly mutants and by biochemical assays. The waaD gene encodes an α-1,2-glucosyltransferase responsible for addition of the HexIII substitution, a terminal glucose side branch. The sequence relationships shared by WaaD and its closest homolog, the WaaK GlcNAc transferase found in E. coli with R2 core OS (45) and some Salmonellae (62), presumably reflect the shared acceptor residues for the enzymes, rather than the nature of the residues transferred. These results emphasize the need for caution in the assignment of specific glycosyltransferase functions based on sequence data alone. Identification of the WaaJ and WaaD glycosyltransferase activities completes the functional assignment of genes within the R3 waa locus. The R3 waa locus seems to be genetically related to that of the Salmonellae, given the close similarity in the waaJ, waaI, waaL, and even the waaD/waaK genes. The origin of the GlcNAc (residue M) attached to outer core Gal (residue I) remains an open question. One possibility considered was that the GlcNAc residue is not a core sugar but instead is the first residue of the O-PS, added by WecA (50). The determination of the structure of the ligation site ruled out this possibility. Also, the LPS from a wecA mutant made in E. coli F653 showed the same migration in PAGE as the parent (data not shown). It is unlikely that any of the identified core OS biosynthesis enzymes possess an unanticipated additional activity, and the unidentified GlcNAc transferase must be significantly different from the members of the known families for it to have gone unrecognized during genome sequence annotation.
The ligation of O-PS to the lipid A core plays an important part in bacterial virulence, establishing a layer of O-PS crucial for resistance to complement-mediated killing (63). The details of the ligation mechanism have not been established, but it is thought to involve a complex designed to recognize a specific lipid A core acceptor molecule. The WaaL integral inner membrane protein is the only currently known component involved in the ligation process (64). The primary sequences of the waaL genes have little homology, but their secondary structure is similar. The data reported here establish that the ligation site for O-PS in the R3 core OS is the HexIII (Glc) residue. The same ligation site is used in E. coli R2 and S. enterica serovars Typhimurium and Arizonae IIIA (45,65). A waaD mutant contains the ligation site residue but is unable to serve as a functional acceptor for the model O-PS, D-galactan I. The dependence of ligation on a residue attached to HexIII represents another conserved feature in the R2 and Typhimurium core OS systems (45,46). Significantly, the R3 WaaL protein sequence has the most homology with the WaaL proteins from R2 (51% identity) and S. enterica serovars Typhimurium (53% identity) and Arizonae IIIA (52% identity). In contrast, the R3 ligase shares little primary sequence identity with that from the R1 system, where the O-PS is ligated onto a β-linked residue that is attached to HexII (66). In agreement with the sequence similarities, the R3, R2, and Salmonella ligases all generate β-linked products. The repeat unit structure of the bulk ...

FIG. 13. Influence of wabB expression on LPS structure. A shows the PAGE LPS profiles of the wabB mutant (CWG654) and its parent CWG653 (wzy); the mutation had no effect on the profile. The wabB gene was overexpressed from plasmid (pWQ159) in strains F653 (B), CWG653 (wzy) (C), and O157:H7 (D). In each case, the LPS molecules from strains expressing wabB migrated with a higher molecular mass, consistent with the addition of a single glycose residue to the lipid A core OS. Expression of wabB in pWQ159 was induced using 0.02% arabinose.

The WabB GlcNAc transferase represents the first example of a plasmid-encoded gene product involved in assembly of core OS in E. coli. The inner core OS of F653 contains nonstoichiometric substituents in the form of the 4-linked phosphate residue on the HepII residue transferred by the WaaY kinase (found on 70% of LPS molecules) (4) and the 1,7-linked GlcNAc residue on HepIII added by WabB (30%) (22). Most interesting, the structural data suggested that these residues are mutually exclusive, although the underlying reason was unknown. One speculation was that the GlcNAc transferase responsible for adding the 1,7-linked GlcNAc on HepIII could act as both a phosphatase and a sugar transferase (22). Here we demonstrate with biochemical analyses that this is not the case.
Instead, the explanation for the nonstoichiometry lies in the acceptor specificity of WabB. In in vitro experiments, WabB cannot modify acceptor LPS unless it is first chemically dephosphorylated. It is currently unknown whether the converse situation is also the case (i.e. WaaY cannot phosphorylate an acceptor modified with GlcNAc at HepIII), but the structural analyses of the R3 core OS revealed no molecules carrying both substituents (data reported above, and see Ref. 22). In a related situation, it has been shown in the E. coli R1 core type that the WaaY enzyme (required for phosphorylation of the HepII residue) is active only when the HepI residue is phosphorylated by WaaP and HepIII is added to HepII by WaaQ (4). These and other observations with inner core modifications in E. coli LPS (8) collectively suggest a fine balance in the various molecular species of LPS, but the precise impact of the various inner core modifications in the biology of E. coli is not fully understood. The LPS PAGE profiles for strains overexpressing WabB show that inner core modifications are shared by both rough and O-substituted LPS species, rather than reflecting a means of discriminating between the two forms of LPS.
A cross-reactive monoclonal antibody, WN1 222-5, can effectively bind to the core OS of a variety of clinical isolates of E. coli, S. enterica, and Shigella (19). This antibody also demonstrated cross-protective results in vivo against the endotoxic activities of LPS in challenges with whole bacteria and LPS (19). The precise structure of the WN1 222-5 epitope has been determined, and the epitope is present in all core OS structures of E. coli, S. enterica, and Shigella. WN1 222-5 binds to the distal part of the inner core OS, and both the HepIII residue and the 4-linked phosphate on HepII are important determinants of the epitope (69). Core OSs prepared from E. coli F653 (R3 prototype) and F470 (R1 prototype) that contained the 1,7-linked GlcNAc substitution on HepIII lack the 4-linked phosphate residue on HepII and were not recognized by WN1 222-5 in enzyme-linked immunosorbent assay (69). The identification of WabB as the enzyme responsible for adding the 1,7-linked GlcNAc substitution on HepIII may add further structural and genetic insight into the use of this monoclonal antibody as a therapeutic agent for vaccine development.

FIG. 14. Chemical structure of the major oligosaccharides isolated from the LPS of E. coli F653 (pWQ159). Two oligosaccharides (6 and 7) were isolated from O,N-deacylated LPS. A significant increase was evident in the relative amount of oligosaccharide 6, compared with wild type F653 (22). Oligosaccharides 8 and 9 were obtained after O-deacylation by treatment with hydrazine. The designations of the variable residues R2-R4 were selected for consistency with the structures of oligosaccharides 1-4 in Fig. 4.
The R3 core type is prevalent in E. coli and Shigella isolates involved in enteric infections. Although WabB is clearly not essential for LPS biosynthesis, the distribution of the wabB gene in E. coli O157 and S. flexneri 2a suggests the modification of HepIII with a GlcNAc residue is a common feature in R3 core OSs. However, the importance in pathogenesis of the GlcNAc modifications and the plasmid location of wabB are unknown. The wabB gene is part of a reasonably well conserved locus. Upstream of wabB is shf, a gene sharing low homology with icaB from Staphylococcus epidermidis. The ica operon is involved in the formation of polysaccharide intercellular adhesion polymer, but IcaB is a secreted protein with no known glycosyltransferase activity (70). In S. flexneri, rfbU (i.e. wabB) is followed by virK, a virulence gene from S. flexneri (71). The same location in E. coli O157 is occupied by ecf3, a gene encoding an integral membrane protein related to virulence proteins of unknown function (72). In most cases, downstream of the rfbU gene is a homolog of msbB, a gene whose product participates with a chromosomal msbB gene in full acyl-oxyacylation of lipid A and is required for full virulence in S. flexneri (1,73). The pAA2 plasmid from enteroaggregative E. coli O42 isolate differs in that it lacks the msbB homolog (59). The role of the plasmid-encoded MsbB protein in lipid A modification has been established (74,75). One advantage of having the wabB gene on a plasmid is the opportunity for differential expression between it and genes in the chromosomally located waa operon, affording independent regulation and maintenance of a fine balance of inner core substitutions. The genes in the shf-wabB-virK-msbB locus are cotranscribed in E. coli O157:H7 and are subject to thermoregulation (74,75). It remains to be established whether a wabB mutant would show diminished virulence.