Lipopolysaccharide O-antigens — bacterial glycans made to measure

Lipopolysaccharides are critical components of bacterial outer membranes. The more conserved lipid A part of the lipopolysaccharide molecule is a major element in the permeability barrier imposed by the outer membrane and offers a pathogen-associated molecular pattern recognized by innate immune systems. In contrast, the long-chain O-antigen polysaccharide (O-PS) shows remarkable structural diversity and fulfills a range of functions, depending on bacterial lifestyles. O-PS production is vital for the success of clinically important Gram-negative pathogens. The biological properties and functions of O-PSs are mostly independent of specific structures, but the size distribution of O-PS chains is particularly important in many contexts. Despite the vast O-PS chemical diversity, most are produced in bacterial cells by two assembly strategies, and the different mechanisms employed in these pathways to regulate chain-length distribution are emerging. Here, we review our current understanding of the mechanisms involved in regulating O-PS chain-length distribution and discuss their impact on microbial cell biology.
Lipopolysaccharides (LPSs) are a family of structurally related glycolipids found in the outer membranes of most Gram-negative bacteria (1). A typical Escherichia coli cell possesses ~2 × 10⁶ LPS molecules, covering about three quarters of the cell surface, and it has been estimated that 70,000 molecules/min are exported to the outer membrane to sustain the growth rate of this organism (reviewed in Ref. 2). These are important macromolecules with a substantial amount of cellular resources dedicated to their production.
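As a rough consistency check on the two estimates quoted above (about 2 × 10⁶ LPS molecules per cell and 70,000 molecules exported per minute), the following minimal Python sketch divides one by the other. It assumes a constant export rate and no turnover; the function and constant names are illustrative only.

```python
# Back-of-envelope check using only the two estimates quoted in the text.
LPS_PER_CELL = 2e6          # LPS molecules per E. coli cell (quoted estimate)
EXPORT_RATE_PER_MIN = 7e4   # LPS molecules exported per minute (quoted estimate)


def minutes_to_export_full_complement(lps_per_cell: float,
                                      export_rate_per_min: float) -> float:
    """Time needed to export one cell's worth of LPS at a constant rate."""
    if export_rate_per_min <= 0:
        raise ValueError("export rate must be positive")
    return lps_per_cell / export_rate_per_min


minutes = minutes_to_export_full_complement(LPS_PER_CELL, EXPORT_RATE_PER_MIN)
print(f"{minutes:.1f} min")  # prints "28.6 min"
```

The result, a little under half an hour, is of the same order as the generation time of E. coli growing in rich medium, which is what would be expected if export keeps pace with growth.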
LPS molecules contain a mix of well-conserved and highly variable structural elements. All LPS molecules contain a lipid component (lipid A), whose archetypal structure is composed of a bisphosphorylated disaccharide backbone carrying 4-7 acyl chains (2). Lipid A is a principal structural component in the outer leaflet of the outer membrane and is essential for viability of almost all LPS producers. The few species (Neisseria meningitidis, Moraxella catarrhalis, and Acinetobacter baumannii) that can withstand disruption of the conserved lipid A-biosynthesis pathway (Raetz pathway) require major reconfiguration of outer membrane biogenesis and homeostasis, and this is accompanied by a fitness cost (reviewed in Ref. 3). The unique structure of lipid A contributes to the barrier properties of the outer membrane and resistance to antimicrobial peptides and antibiotics like polymyxin and colistin. It also provides a pathogen-associated molecular pattern, recognized by complexes of Toll-like receptor 4 and myeloid differentiation factor 2 (TLR4:MD2), to activate a proinflammatory response (3). Regulated modifications to the base lipid A structure allow some pathogens to enhance resistance or evade immune detection, and much of our knowledge of lipid A structure results from efforts to understand the structural determinants of these key properties. However, an LPS composed solely of lipid A is so far confined to the intracellular pathogen Francisella novicida (4). In all other bacteria, lipid A is subject to further glycosylation, creating remarkable structural diversity.
LPS molecules typically contain a short "core oligosaccharide" structure (core OS), which is attached to lipid A via an α-linked 3-deoxy-D-manno-oct-2-ulosonic acid (Kdo) residue. Whereas the core OS architecture is relatively conserved within a species, the structures can vary considerably between species in terms of the number and type of sugars, the extent of branching, and the presence of nonglycose substituents, such as phosphate (5). The core OS may also contribute to outer membrane stability, but only the Kdo addition is essential for viability in vitro in the E. coli paradigm. This reflects the order of assembly where two Kdo residues are added to the nascent structure prior to completion of lipid A acylation; underacylated LPS is a poor substrate for the LPS export machinery, which cannot sustain growth without suppressor mutations (reviewed in Ref. 2).
In some bacteria (often mucosal pathogens), the core oligosaccharide can be modified by a phase-variable (on-off) extension of one or a few sugars in a form known as lipooligosaccharide. However, the more prevalent (and classical) LPS format shows a tripartite structure where the core OS is further glycosylated by a long-chain repeat-unit polysaccharide. This is generally called the O-antigen (O-PS), with the term originating from the Kauffmann-White serological classification system developed for Salmonella in the 1930s. LPS molecules carrying O-PS are called "smooth LPS" (S-LPS), whereas those lacking O-PS are termed "rough" (R)-LPS (Fig. 1A). Again, this follows Salmonella precedent, describing the colony morphologies of these bacteria grown on solid media. O-PSs are hypervariable repeat-unit polysaccharides with known structures compiled in the Carbohydrate Structure Database (RRID:SCR_018684) (6). They can differ in glycose (i.e. sugar or sugar derivative) and nonglycose components, linkages, and topology of the repeat unit (Fig. 1C), generating distinct O-antigen epitopes. For example, there are 46 main O-serogroups in Salmonella, and some of these structures are also found among the >180 recognized O-serotypes in E. coli, reflecting the propensity for horizontal gene transfer (7,8). The application of silver-stained SDS-PAGE offered the first insight into a critical LPS structural feature: heterogeneity in O-PS chain lengths within a population of bacteria (9). O-PS chain-length distribution is characteristic for a particular isolate, and some examples are shown in Fig. 1B. An important feature of the SDS-PAGE profiles from WT isolates is the presence of "modal" clusters (marked by brackets in Fig. 1B) of stained molecules, where the O-PS chain-length distribution is skewed toward a particular size range. 
As will be described below, this is a regulated process and is important in LPS properties and functions, and the extent of the heterogeneity can offer insight into the O-PS assembly mechanism. The Salmonella enterica serovar Typhimurium pattern reveals a wide range of chain lengths, with the amount (and staining intensity) generally decreasing as chain lengths increase, with the exception of a higher-molecular-weight modal cluster of bands. In this species, the high-molecular-weight modal cluster is eliminated by deletion of wzz genes encoding chain length regulators (see the mutant profile). The examples shown from E. coli and Klebsiella pneumoniae lack similar chain length regulators and use a different mechanism to establish modality in chain length. In K. pneumoniae O2a, the pattern shows a relatively wide distribution of O-PS size. Capping of O2a chains with O1 antigen increases maximum chain length and adds an additional higher-molecular-weight cluster containing the O1 antigen. In contrast, E. coli O9a and K. pneumoniae O12 show much tighter modality with no appreciable amounts of shorter O-PS lengths. How are the distinctive chain-length patterns achieved, and what impact does this have on the biology of the bacteria? These questions are the subject of this review. We will describe our emerging understanding of the different molecular mechanisms involved in the biosynthesis of O-PS, highlighting relationships to systems involved in the production of other bacterial cell-surface polymers. Particular emphasis is placed on the regulation of O-PS polymerization machinery, to generate products with defined chain-length distributions, and the importance of this regulation to the success of bacterial pathogens.

Biosynthetic bookends: shared processes in the initiation and termination of O-PS assembly
All O-PS assembly pathways begin with the formation of undecaprenyl diphosphate (Und-PP)-linked intermediates (10) in the cytoplasm and end with the ligation of Und-PP-linked O-PS glycans with defined chain-length distributions to lipid A core in the periplasm. Between these two conserved processes, the specific features of O-PS polymerization and export across the cytoplasmic membrane define three fundamentally different O-PS assembly strategies. Synthesis is initiated at the cytoplasmic face of the inner membrane by a phosphoglycosyltransferase (PGT) enzyme that transfers a hexose phosphate or acetamido sugar phosphate from the corresponding nucleotide diphosphoglycose donor to undecaprenyl phosphate (Und-P). PGT family members initiate all bacterial polysaccharides involving Und-PP-linked intermediates, including peptidoglycan, teichoic acids, N- and O-linked glycans in many glycosylated proteins, and some classes of capsular and exopolysaccharides. There have been important recent advances in understanding the molecular mechanisms of these enzymes (reviewed in Ref. 11). PGT enzymes have been assigned to two superfamilies, which differ in membrane topology and mechanism (11). MraY from peptidoglycan biosynthesis is a prototypical "polytopic" PGT, with 10-11 transmembrane helices (TMHs) in its functional catalytic core and a proposed active site positioned within a cleft at the membrane surface created by the cytosolic loops (12). Homologs of MraY include WecA, which initiates O-PS biosynthesis in many bacteria, including E. coli, Shigella, and Salmonella, by the transfer of 2-acetamido-2-deoxyglucose-1-phosphate (GlcNAc-1-P) to Und-P (13). Some O-serotypes then perform an epimerization to convert Und-PP-GlcNAc to Und-PP-GalNAc (14) before further elaboration. PglC from the N-glycosylation pathway in Campylobacter provides the prototype for monotopic PGTs. The PglC catalytic core reveals a new protein fold. 
The catalytic domain is stabilized at the membrane interface by an unusual strategy, involving a reentrant helix (that penetrates one membrane leaflet) and additional helices that are oriented planar to the membrane surface (15). Some O-PS PGTs, including the founding galactose-1-P transferase (WbaP) from Salmonella, share the same catalytic core as PglC. Although they may possess additional TMHs, these are not essential for catalysis, allowing their inclusion in the "monotopic" family (16,17).
O-PS synthesis ends with the production of an Und-PP-linked glycan chain, now located in the periplasm, which provides a donor substrate for glycosylation of lipid A core. The reaction is performed by the O-antigen ligase, encoded by a gene typically designated waaL (10). WaaL is an integral membrane protein, but no solved structures are available (18). Although the ligation reaction has been demonstrated biochemically with partially or fully defined substrates (19,20), its activity in most species is only inferred from the LPS phenotype of waaL mutants. The ligation reaction is analogous to protein glycosylation, and sequence similarities have been recognized between WaaL and some bacterial oligosaccharyltransferases (21). After ligation, the completed LPS molecule is translocated to the outer membrane and inserted into the outer leaflet by a conserved LPS transport (Lpt) machine spanning the cell envelope. The components and functions of the Lpt complex have been reviewed elsewhere (22,23), and structures of subcomplexes have been described (24-27).
Various aspects of O-PS assembly have been reviewed previously (e.g. see Refs. 10 and 28-31), and here we focus on recent advances in understanding chain length regulation and its impact on cellular functions.

Regulating a polymerase: the Wzx/Wzy-dependent pathway
The most prevalent O-PS biosynthesis process is the Wzx/Wzy-dependent pathway; as an example, it is used for 165 of 176 currently identified E. coli O-serotypes (8). Studies involving E. coli, Shigella, Pseudomonas, and Salmonella provide the framework of biochemical knowledge gathered on this O-PS pathway, and many of the central steps have parallels in the biosynthesis of capsular and extracellular polysaccharides in Gram-positive and Gram-negative bacteria (32).
After PGT-mediated initiation of O-PS biosynthesis, a series of reactions catalyzed by classical sugar nucleotide-dependent glycosyltransferases (GTs) completes the carbohydrate structure of the Und-PP-linked repeat unit at the cytoplasmic face of the inner membrane (Fig. 2A). Nonglycose components are sometimes added at this stage. The pathway is defined by the export of Und-PP-linked repeat units to the periplasm, where they are polymerized in a blockwise process by transfer of the growing O-PS from its lipid carrier to the nonreducing terminus of the incoming Und-PP repeat unit. These steps are performed by the Wzx exporter and Wzy polymerase, respectively. Modality of O-PS chain length distribution is established in the polymerization stage, prior to ligation (33,34), by proteins belonging to the Wzz family.
Wzx is a subfamily of the MOP (multidrug/oligosaccharidyl-lipid/polysaccharide) flippase family of transporters. These are diverse integral membrane proteins that exert substrate preference for the structure of the O-PS repeat unit. The principles of substrate recognition are complicated and poorly understood, but some structural determinants appear to be particularly important for high efficiency flipping (reviewed in Refs. 35 and 36). Export of nonnative O-PS can occur under conditions of Wzx overexpression. This suggests that the flippase can operate independently of a presumed assembly complex possessing fixed stoichiometry of components, and the physiological basis requires more investigation. Although the structure of Wzx has not been solved, structural and mechanistic insight is provided by a related flippase (MurJ) that exports Und-PP-linked repeat units of peptidoglycan. An overexpressed Wzx exporter from exopolysaccharide assembly can substitute for MurJ, reinforcing the possibility of a conserved mechanism in this class of exporters (37). The initial structures of two MurJ homologs in inward-facing conformations revealed a two-lobed structure with a predominantly cationic lumen that may be accessed via a lateral gate (38,39). Mutagenesis, trapping of intermediates, and structures of different conformations (38-42) support an alternating access model, where substrate binds at the lateral entry gate and is captured in the lumen in the inward-facing conformation. The substrate is then proposed to be flipped by a "rocker switch" mechanism in transition to the outward-facing conformation, and its release is encouraged by lumen shrinkage.
Wzy polymerases transfer the growing O-PS onto the nonreducing end of the newly delivered repeat unit (43). These proteins are integral membrane proteins with a conserved membrane topology (30). Wzy proteins share a periplasmic Wzy_C motif with O-antigen ligases and some oligosaccharyltransferases for O-linked protein glycosylation, which also use Und-PP oligosaccharide substrates (21). In Salmonella, Wzy proteins have complex specificities involving the glycose composition of the donor and acceptor oligosaccharides (44), but some homologs seem more promiscuous (45). The Wzy mechanism is still speculative, and its structure and function are expected to be complex, based on precedent from other enzymes that use lipid-linked sugar or oligosaccharide donors, including bacterial N-linked oligosaccharyltransferase (PglB) (46) and GT-C-fold GT enzymes such as ALG6 GT from eukaryotic N-linked glycan assembly (47) (see below). In vitro demonstrations of Wzy activity led to the conclusion that Wzy operates in a "distributive" mechanism (48,49). In contrast to processive enzymes that retain the growing glycan chain throughout chain extension, distributive enzymes release the glycan between additions, creating a broad profile of product sizes (50). This type of activity cannot explain the modal clusters seen in bacteria such as E. coli and Salmonella (Fig. 1B) and highlights the necessity for the final component of the assembly machinery, chain-length regulatory Wzz proteins belonging to the polysaccharide copolymerase family 1 (PCP-1, formerly MPA-1) (51,52).
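The argument that a purely distributive polymerase cannot generate modal clusters can be illustrated with a minimal Python sketch. It rests on one simplifying assumption that is not taken from the studies cited above: after each release, the chain is extended again with the same fixed probability (the hypothetical parameter p_extend), so chain lengths follow a geometric distribution. The function names are illustrative only.

```python
def distributive_length_distribution(p_extend: float, max_units: int) -> list[float]:
    """P(chain has exactly n repeat units) for n = 1..max_units.

    Assumes every release-and-rebind cycle extends the chain with the same
    probability p_extend (a hypothetical parameter, not a measured value).
    """
    if not 0.0 < p_extend < 1.0:
        raise ValueError("p_extend must lie strictly between 0 and 1")
    # Geometric distribution: (n - 1) successful extensions, then one stop.
    return [(1.0 - p_extend) * p_extend ** (n - 1) for n in range(1, max_units + 1)]


def has_modal_cluster(distribution: list[float]) -> bool:
    """True if abundance rises anywhere as chains get longer (an interior peak)."""
    return any(later > earlier
               for earlier, later in zip(distribution, distribution[1:]))


dist = distributive_length_distribution(p_extend=0.9, max_units=40)
print(has_modal_cluster(dist))  # prints False
```

Under this assumption the abundance falls monotonically with chain length whatever value of p_extend is chosen, so a cluster centered on longer chains needs some additional input, which is the role assigned to Wzz in the text.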
Initially, wzz genes were called rol (regulator of O-chain length) or cld (chain-length determinant) based on the loss of modal clusters of O-PS in the corresponding mutants (53,54) (Fig. 1B). A specific Wzz protein imparts a characteristic modality in O-chain lengths, but some organisms, including Salmonella (55), Shigella (56), and P. aeruginosa (57,58), express two wzz genes, imparting different modalities on the same O-PS. For example, some Salmonella isolates produce three ranges of O-PS lengths: "short" (<16 repeat units), "long" (16-35), and "very long" (>100) (59). The Wzz proteins compete for the same pool of Und-PP-linked intermediates, and high levels of Wzy are required to support the very long modality (60), but the precise cellular concentration of Wzy and its stoichiometry with other essential components of the assembly pathway remain to be determined. The long and very long O-PS modalities require WzzST and WzzFepE, respectively, whereas short chains are Wzz-independent and generate an SDS-PAGE pattern where the probability of further O-PS extension decreases as the chains get longer (Fig. 1B). Synthesis of the shorter chains requires Wzy and two proteins of currently unknown function, PbgE2 and PbgE3 (61). Both are integral membrane proteins, but they share no compelling structural similarity with Wzz proteins, and their mechanism(s) of action requires investigation.
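The three Salmonella size ranges quoted above can be written down as a small lookup, sketched here in Python. The boundaries are those given in the text; the function name and the "unassigned" label for lengths that fall between the quoted ranges (36-100 repeat units) are placeholders introduced for illustration.

```python
def salmonella_modality(repeat_units: int) -> str:
    """Assign an O-PS chain length to the ranges quoted in the text.

    "short" (<16), "long" (16-35) and "very long" (>100) come from the text;
    "unassigned" is a placeholder for lengths the text does not name.
    """
    if repeat_units < 1:
        raise ValueError("an O-PS chain needs at least one repeat unit")
    if repeat_units < 16:
        return "short"
    if repeat_units <= 35:
        return "long"
    if repeat_units > 100:
        return "very long"
    return "unassigned"


for n in (5, 20, 60, 120):
    print(n, salmonella_modality(n))
```

Keeping the gap between 35 and 100 repeat units explicit avoids implying a boundary that the quoted ranges do not define.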
Wzz proteins form oligomers located in the inner membrane, and each protomer possesses two TMHs separated by a large periplasmic domain (52). Crystal structures of periplasmic domains from several Wzz homologs from E. coli and Salmonella reveal a conserved fold with an α/β membrane-proximal base and an α-helical hairpin extending more than 100 Å into the periplasm (62). The protomers interact at their bases to create oligomers varying in size from trimers to nonamers in crystals. In contrast, EM images of negatively stained full-length proteins appear more uniform and reveal octamers (63). Reconstitution of three different Wzz proteins in proteoliposomes also generated a homogeneous population interpreted as hexamers (64), but the data could potentially fit an octamer (63). Subsequently, a cryo-EM structure of full-length detergent-solubilized Wzz was solved (65) (Fig. 2B). The packing of the protomers was consistent with the crystallographic structures, but they formed a dodecameric structure. Modeling and coarse-grained simulation in a lipid environment offered further structural insights: dodecamers are unstable in the membrane; the importance of the α/β domain in oligomerization was reinforced; the periplasmic α-helical domain has flexibility; and the TMHs are unlikely to contact one another within the lipid bilayer. Extensive mutagenesis has been performed on several Wzz homologs, and chimeric proteins have been constructed based on sequence and structural data. In general, surface-exposed residues in the periplasmic domains appear most likely to influence O-PS chain length, and regions distinguishing different modalities conferred by two Wzz proteins from P. aeruginosa O13 map to the periplasmic domain of those proteins (e.g. Refs. 58 and 66). However, some residues in the base and oligomer lumen are also important. 
Mutations that shift oligomer stability have also been reported to affect O-PS length, reinforcing the functional importance of an oligomeric structure (e.g. Refs. 66 and 67). However, it remains to be established whether the differences in the numbers of protomers in oligomers reported in different studies are simply a result of the varied preparation approaches or whether they reflect a mechanism where a dynamic oligomeric complex is integral to its function.
How does this all translate into the function as an O-PS chain-length regulator? Despite the amount of available data, a compelling structure-function interpretation is elusive (68), but the role of Wzz does appear to be structural rather than catalytic. Over the years, several possible models have been proposed. In one, O-antigen length is modulated by the transient nature of the interaction between Wzy and Wzz, eventually allowing transfer of the growing glycan to WaaL for ligation to lipid A core (54). In another proposed model, Wzz would facilitate interactions between Wzy and WaaL (69), but we have since learned that modality is established in the absence of WaaL (33,34). The initial Wzz structures led to a new hypothesis that Wzz and Wzy interact and different oligomer sizes influence modality (62). In support of interaction, Shigella Wzz seems to play a role in Wzy stability (70), and high levels of Wzy are needed to sustain synthesis of longer chains (60), but attempts to directly demonstrate Wzy:Wzz interactions have met with mixed results; interactions were detected in one study (71) but not in another (60). A more recent (hybrid) model proposes that polymerization continues until the binding capacity or lumen volume of a Wzz oligomer is met or Wzy:Wzz interactions cease (65). This invokes measurement of chain length against some structural element of the Wzz oligomer, and O-PS:Wzz interaction has been reported (72). However, further elaboration of models requiring Wzy:Wzz interaction must accommodate the low levels of Wzy expression relative to Wzz and are limited by not knowing the site of interaction with the polymerase (i.e. is Wzy on the exterior surface of the Wzz oligomer or within the lumen?). 
Genetic studies have led to the hypothesis that Wzx, Wzy, and Wzz all interact to form a membrane-associated polymerization complex (73), and it is conceivable that this complex may also include the PGT and GT enzymes to coordinate synthesis, export, and polymerization of Und-PP-linked repeat units, which would afford other regulatory opportunities. For example, the flow of building blocks into the system may influence chain-length distributions, and a role for Wzz in modulating the activities of the PGT and Wzx proteins could also underpin chain-length regulation. Notably, an altered chain length phenotype has been reported for derivatives of the Salmonella PGT (WbaP) with a deletion in a periplasmic loop (74). Genetic studies have also led to an interesting hypothesis where WaaL controls the level of Und-PP-linked intermediates available in the periplasm, potentially by regulating the PGT enzyme (75). In summary, there are many possible mechanistic explanations for regulating O-PS chain length distribution in Wzx/Wzy-dependent systems. Ultimately, a solution to the question is likely to require definitive structural insight into the identities and interactions of components in (multi)enzyme complexes to drive biochemical experiments.

Molecular rulers and transport coupling: ABC transporter-dependent processes
Another widespread O-PS assembly strategy is characterized by polymerization of the Und-PP-linked glycan at the cytoplasmic face of the membrane, prior to export to the periplasm by a member of the ABC transporter superfamily (Fig. 3A). In these systems, the PGT enzyme acts once per O-PS (providing the reducing terminal acetamido sugar), rather than once per repeat unit like Wzx/Wzy systems. Any acetamido sugars found in the repeat unit are added by additional GTs. O-PS chain-length distribution is still established at the polymerization stage, but two fundamentally different approaches have been described. The polymerization and export processes resemble those used in the production of teichoic acids and the glycans for S-layer glycoproteins (76,77). ABC transporters are also used for the export of some capsular polysaccharides, but these are built on lipid carriers other than undecaprenol (32).
For O-PS, two variations of the pathway have been described, and these differ in the presence or absence of a sophisticated chain-termination mechanism (Fig. 3A). The simpler system is exemplified by K. pneumoniae O2a, where O2a antigen (Fig. 1C) synthesis requires three GTs. Two adapter GTs add sequential Galp and Galf residues to Und-PP-GlcNAc (generated by WecA), directing it into the pathway, and the WbbM polymerase is solely responsible for extending the repeat-unit region of the glycan on this acceptor structure (78). WbbM is a dual GT domain protein that forms homotrimers, and the catalytic domains in each monomer are linked via a flexible tether to C-terminal membrane-associating amphipathic helices (78) (Fig. 3B). The six catalytic sites in the WbbM trimer are positioned on a single surface oriented toward the membrane where the Und-PP-glycan is anchored. WbbM interacts with other GTs in the pathway in a heterocomplex (79), and the other enzymes presumably occupy space between the membrane and the WbbM catalytic sites. WbbM behaves like a distributive polymerase in vitro, and the chain-length distribution of native O2 O-PS (Fig. 1B) may be influenced by concentrating the WbbM reaction products in a localized space, as well as the inherent flexibility of the linker. Polymerization is terminated by export via the ABC transporter. In the absence of export, the O2a chains become aberrantly long, whereas overexpression of the transporter results in chains with a shorter average size (80), but the mechanistic principles underpinning this type of regulation have not yet been established. In serotype O1, WbbM interacts with a second bifunctional polymerase (WbbY), which extends a polymer composed solely of Galp residues (Fig. 1B) on a short segment of O2a glycan (81). In O2c, two monofunctional GTs add a different disaccharide to an O2a acceptor, but WbbM does not interact with the additional GTs in two-hybrid experiments (81). 
The relative positioning of the various active sites in a productive heterocomplex is an important question for further research.
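The qualitative behavior described for K. pneumoniae O2a (longer chains without export, shorter chains with more transporter) is what a simple kinetic competition would predict. The Python sketch below is a toy model only, since the text notes that the actual mechanism has not been established. It assumes that each nascent chain is either extended by one repeat unit or exported, with fixed hypothetical rates k_polymerize and k_export in arbitrary units.

```python
def mean_units_before_export(k_polymerize: float, k_export: float) -> float:
    """Mean number of repeat units added before a chain is exported.

    Toy competition model: at every step the chain is exported with
    probability k_export / (k_polymerize + k_export), otherwise it gains one
    repeat unit. Both rates are hypothetical and in arbitrary units.
    """
    if k_polymerize <= 0 or k_export <= 0:
        raise ValueError("both rates must be positive")
    p_export = k_export / (k_polymerize + k_export)
    # Mean of a geometric distribution counting extensions before export.
    return (1.0 - p_export) / p_export


baseline = mean_units_before_export(k_polymerize=20.0, k_export=1.0)
more_transporter = mean_units_before_export(k_polymerize=20.0, k_export=2.0)
print(f"{baseline:.1f} {more_transporter:.1f}")  # prints "20.0 10.0"
```

In this model the mean length reduces to the ratio k_polymerize / k_export, so doubling the export rate halves the average chain length, and the average grows without bound as the export rate approaches zero.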
E. coli O9a and K. pneumoniae O12 show much finer control over O-PS chain length (Fig. 1B) and are leading examples of a more elaborate chain-length regulatory process. The O9a antigen (Fig. 1C) is part of a group of related O-PS structures found in E. coli, K. pneumoniae, and other bacteria, resulting from horizontal transfer of the genetic locus. The polymerization strategy is similar to that described above for K. pneumoniae O2a, with two adapter GTs participating with WecA to generate a dedicated acceptor, which is then extended by a dual GT-domain polymerase (WbdA) (82,83). In E. coli O9a, the N-terminal GT domain of WbdA adds two α1,2-linked Manp residues, and the C-terminal module adds two α1,3-linked residues to create the tetrasaccharide repeat unit (83,84), but some wbdA mutations alter the precise activity of the N-terminal domain, and those proteins add either one or three α1,2-linked residues, changing the serotype (85,86). The modular WbdA enzymes also operate in a distributive mechanism in vitro (83), and the relatively limited range of product chain lengths (Fig. 1B) is dictated by a dedicated chain-termination mechanism prior to export (87). WbdD is the bifunctional chain terminator that operates by adding a nonreducing terminal methylphosphate cap to prevent further polymerization (87-89). What determines when the terminator intervenes in the process to create the observed modality? Mathematical modeling of O-PS chain-length distributions observed under conditions where the ratios of polymerase and terminator were experimentally manipulated implicated a structural element in the regulation (90), and this was explained by the subsequent discovery of a molecular ruler built into the machinery (91). The WbdD terminator exists as a trimer, where the catalytic domains are separated from a membrane-associated amphipathic helix by an extended coiled-coil structure (91). 
The polymerase (WbdA) and the terminator (WbdD) form a membrane-bound complex where WbdA interacts with the noncatalytic membrane-associated C-terminal region of WbdD (92), creating a physical separation of the polymerization and termination catalytic sites (Fig. 3C). With this organization, polymerization can progress until the O-PS length is minimally sufficient for interaction with the terminator. Experimentally varying the length of the coiled-coil region by insertion or deletion of the defining heptad motifs caused corresponding changes in the size distribution of O9a O-PS, validating this component as the molecular ruler (91).
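To give a feel for the scale of the heptad insertion and deletion experiments, the following Python sketch estimates how much the coiled-coil ruler lengthens or shortens per heptad. It assumes an axial rise of about 0.15 nm per residue, a textbook approximation for an α-helix that is not a value reported for WbdD, and it makes no attempt to convert ruler length into O-PS repeat units. All names are illustrative.

```python
RESIDUES_PER_HEPTAD = 7
# Assumed approximate axial rise per residue in an alpha-helix (nm);
# not a measured value for the WbdD coiled coil.
RISE_PER_RESIDUE_NM = 0.15


def ruler_length_nm(n_heptads: int) -> float:
    """Approximate length of a coiled coil built from n_heptads heptads."""
    if n_heptads < 0:
        raise ValueError("number of heptads cannot be negative")
    return n_heptads * RESIDUES_PER_HEPTAD * RISE_PER_RESIDUE_NM


def length_change_nm(heptads_inserted: int) -> float:
    """Change in ruler length; negative input values represent deletions."""
    return heptads_inserted * RESIDUES_PER_HEPTAD * RISE_PER_RESIDUE_NM


for delta in (-2, 0, 2):
    print(f"{delta:+d} heptads: {length_change_nm(delta):+.2f} nm")
```

On this estimate each heptad changes the separation between the polymerization and termination sites by roughly 1 nm, which is consistent in direction with the observation that longer rulers give longer O9a chains and shorter rulers give shorter ones.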
What about other O-PS structures? The K. pneumoniae O12 O-PS repeat unit (Fig. 1C) is capped by a single β-Kdo residue (93), and its chain-length distribution is regulated by a conceptually similar strategy. In this case, a single protein (WbbB) possesses both polymerization and termination activity. Two GT modules near the C terminus perform polymerization, and they are separated from a chain-terminating β-Kdo GT by a coiled-coil molecular ruler (94,95) (Fig. 3C). In principle, this process does not have to be confined to O-PSs, and strong candidates are found among systems from diverse bacterial genera that build glycans that decorate paracrystalline protein layers (S-layers) on the surfaces of some Gram-positive bacteria (96). A recent bioinformatics survey has identified candidates for similar biosynthetic strategies and glycan terminator chemistries in a wide range of bacteria from different environments, producing glycans whose structures and functions have yet to be studied (77). These must now be validated experimentally to discover whether additional mechanistic complexity exists.
Core ABC transporter structures possess two transmembrane domains (TMDs) forming the membrane channel and two nucleotide-binding domains (NBDs) that turn over ATP to drive transport (reviewed in Refs. 97 and 98). Most O-PS transporters are formed from a complex of two TMD proteins (called Wzm) and two NBDs (Wzt) (reviewed in Ref. 76), and this is the format found in K. pneumoniae O2a (99). An exception has been reported in export of Lewis antigen-mimicking Helicobacter pylori O-PS (100). The exporter involved (Wzk) is composed of two identical polypeptides, which each contain one TMD and one NBD, resembling the PglK exporter from the Campylobacter N-glycosylation pathway (101). A solved structure is available for an ABC transporter from Aquifex aeolicus that closely resembles the E. coli O9a O-PS transporter (102,103). The transporter possesses a contiguous transmembrane channel to accommodate its glycan substrate, but it is notable that the O-PS substrates for these transporters are often linear glycans with side chains added post-transport (see below). The absence of elaborate side chains may reflect constraints imposed by the channel dimensions. An amphipathic gate helix at the NBD:TMD interface is thought to provide access to the channel for the substrate. The working model proposes iterative transport steps comparable with a translocase (like cellulose synthase, which catalyzes polymerization and export), rather than a conventional alternating access mechanism common in ABC transporters. The undecaprenyl lipid carrier is predicted to stay in the membrane phase, whereas the glycan traverses the membrane within the transporter lumen (102).
The K. pneumoniae O2a ABC transporter transports Und-PP-linked products with no specificity for the glycan structure (80). In contrast, ABC transporters from systems that use a chain-termination mechanism possess an additional carbohydrate-binding module (CBM) that recognizes the terminal part of the glycan as a prerequisite for export (104)(105)(106). In doing so, it ensures that the chain-length determination imparted during assembly is faithfully carried through to the final product on the cell surface. Exchange of the CBM confers new specificity on the transporter (107). One mechanistic consequence of a separate substrate recognition event is that synthesis and transport can be temporally uncoupled in E. coli O9a; substrate engagement apparently only requires the terminating moiety. In contrast, the direct participation of the K. pneumoniae O2a ABC transporter in chain-length determination requires obligatory coupling of synthesis and transport (80). The A. aeolicus CBM was removed to facilitate crystallization of the ABC transporter, but it shares the same strand exchange-stabilized dimer organization and immunoglobulin-like structure seen in other homologues from E. coli O9a and K. pneumoniae O12 (104,105,108). The CBM is thought to increase the local concentration of substrate near the entrance into the transporter, but the presence of the CBM also significantly increases the ability of the transporter to turn over ATP in vitro, so its functional role is more complex than simple substrate recognition (102). After the ligase reaction transfers nascent O-PS to lipid A core, the spent undecaprenyl diphosphate is dephosphorylated and the monophosphate form is returned for another reaction cycle. Whereas the Wzm-Wzt structure represents a major advance, important questions still remain, including the mechanisms initiating transport and the reorientation of the undecaprenyl phosphate after ligation.

An "outlier" assembly process in Salmonella O:54
The third O-PS assembly system is currently confined to a single example, S. enterica serovar Borreze O:54 (Fig. 4). The dedicated enzymes for this particular O-PS are encoded on a mobilizable plasmid, so the O:54 antigen can be coexpressed with a Wzx/Wzy pathway O-PS determined by the harboring isolate's chromosomal locus (109). At the heart of the system is a single protein responsible for both chain extension and translocation of the O-PS to the periplasm, analogous to processes performed by bacterial synthases. Bacterial cellulose synthase offers the most detailed structural and biochemical understanding of these enzymes (110). Like cellulose synthase, the O:54 synthase possesses a single-catalytic site GT module but has fewer TMHs, suggesting a topology closer to bacterial chitin and hyaluronan synthases, whose structures are not yet solved (111). The polymerizing O:54 synthase activity requires a disaccharide acceptor synthesized by the WecA PGT as well as a dedicated adapter GT encoded on the plasmid (112). The involvement of a PGT and WaaL to ligate the O-PS product to lipid A core dictates the involvement of Und-PP-linked intermediates, distinguishing it from other known synthases, which operate without lipid intermediates. It is unclear how this critical difference affects the translocation/flipping process. Cellulose synthase is a processive enzyme, and residues correlated with processivity are conserved in the O:54 enzyme, but how this class of enzymes determines glycan chain-length distribution remains unknown.

Bacteriophages and the generation of O-PS diversity
Bacteriophages are considered to be one of the main selective pressures driving bacterial surface glycan diversity (113). Bacteriophage infection of Gram-negative bacteria involves complex interactions with cell-surface receptors, frequently outer membrane proteins or membrane-proximal parts of the LPS molecule (114). O-PS layers can mask such receptors, and some phages require tail fibers with enzymatic (e.g. glycosidase or deacetylase) activities to infect the host, but their role may go beyond simply clearing access to the surface; in many cases, interaction with O-PS is a necessary step in the infection process (115). Understanding these processes and the potential steps to resistance is central to efforts to deploy bacteriophage therapies. Resistance to bacteriophages can follow several routes, the simplest being replacement of all or part of the O-PS locus by horizontal gene transfer (113) or mutation to alter the specificity of key enzymes (like the E. coli O9 example described above). Another strategy that is available for laboratory-grown bacteria is loss of the O-PS receptor, and O-PS-specific phages played a vital role in the selection and characterization of mutants during the development of our understanding of LPS genetics and biochemistry. However, this clearly comes at a fitness cost outside the laboratory, given the important roles of O-PS in microbial biology that will be discussed below. An elegant study in S. Typhimurium illustrates one strategy to balance these competing selective pressures. The horizontally acquired opvAB genes are phase-variable (transcriptionally on or off) (116). They are regulated by Dam-mediated methylation of DNA adenine residues and by the OxyR transcriptional regulator, which responds to reactive oxygen species. In the "on" phase, the longer O-PS chains generated by WzzST are diminished in favor of chains in the range of 3-8 repeat units. The proposed mechanism involves inhibition of WzzST (and presumably WzzFepE) function by the small 34-residue OpvA peptide and installation of OpvB (which shares sequence similarity with WzzST) as a replacement. The "on" phase thereby creates a subpopulation of bacteria resistant to O-PS-targeted phages; those cells are unable to withstand complement or infect macrophages, but virulent "off" phase cells can reemerge as soon as phage selection is removed. Some bacteriophages exploit a conceptually similar "inhibit and replace" strategy to alter Wzy polymerase specificity and change O-PS structure, effectively precluding further infection by phages using the same O-PS receptor. This process falls under the category of "serotype conversion," where structural modifications in the O-PS change its immunological epitopes and response to typing antisera. The impact of lysogenic bacteriophages on O-PS serology was recognized more than 60 years ago in Salmonella (117), and the change in backbone linkage was subsequently identified (118). In both Salmonella and P. aeruginosa, the native polymerase is inhibited by a small polypeptide containing a single TMH. The P. aeruginosa protein shares sequence similarity with the N-terminal TMH of Wzz, suggesting that it exploits structural elements that might normally be used in Wzy:Wzz functional interactions and consistent with the observation that inhibition is more effective in the absence of both Wzz proteins (119). A precise mechanism will require solved structures of Wzy:inhibitor and Wzy:Wzz complexes. Inhibition of the native Wzy is accompanied by expression of an alternate inhibitor-resistant Wzy protein, encoded by the same bacteriophage, that switches the anomeric configuration of the glycan linkage.
Other prophage-encoded structural diversification strategies include the addition of O-acetyl, phosphoethanolamine, or glycose residues to an O-PS structure. For example, many S. flexneri O-serotypes are based on the same O-PS backbone (serotype Y), and serological diversity is created by modifications at different sites (120) (Fig. 1C). Whereas phage-encoded O-acetylation appears to be a cytosolic activity, glycosylation involves a periplasmic modification with interesting parallels to compartmentally separated stages in protein N-glycosylation in eukaryotes (Fig. 5). Periplasmic glycosylation is performed by a system that acts as an accessory module to existing assembly pathways. These modules were initially found in Salmonella O-PS but now extend to other species and to glycosylation of different cell-surface glycans (reviewed in Ref. 121). In Salmonella and Shigella, the modification involves O-PS glucosylation (encoded by gtr genes), whereas K. pneumoniae performs galactosylation (gml genes). In some bacteria, the chromosomal glycosylation locus is not associated with a prophage, offering potential insight into the original source of these systems. Periplasmic glycosylation precludes the use of nucleotide sugars as donors, and the three-component systems that direct the process are characterized by Und-P-glycose direct donors. The first two components catalyze synthesis and export of the donor and are conserved and functionally interchangeable between serotypes that add the same glycose modification, whereas the final transfer reaction is structure-specific. Und-P-glycose donors are synthesized by enzymes related to eukaryotic dolichol phosphate mannose synthase 1 (DPM1) and possess a conventional N-terminal glycosyltransferase GT-A fold (family GT2) anchored in the membrane by two C-terminal transmembrane helices.
In the prototype from Synechocystis, the protein forms a tetramer with cytosolic catalytic sites located 15 Å from the membrane, and the reaction is thought to involve partial extraction of Und-P substrate from the membrane (122). The product is then flipped to the periplasmic face of the membrane by a protein sharing topology with flippases from related glycosylation systems and with EmrE, a small multidrug resistance (SMR) family transporter (121). Glycosylation of the O-PS backbone is performed by GT-C-fold enzymes resembling eukaryotic protein mannosyltransferases, like ALG6, which operate in the ER lumen. ALG6 possesses one structurally conserved and one variable integral membrane domain, and catalytic residues are placed in extramembrane loops (47).
Periplasmic glycosylation in Salmonella occurs on an Und-PP-linked O-PS substrate, prior to ligation to lipid A core (123). The observation that the reducing terminal repeat unit is not glycosylated, and that LPS in wzy mutants is not modified, led to a conclusion that glycosylation required a minimum glycan chain length (123)(124)(125)(126). The structure of ALG6 suggests a reaction in close proximity to the membrane, implying the enzyme either lacks access to, or has low affinity for, shorter O-PS chains. Resolution of this question will require further biochemical investigation. The subsequent discovery of side-chain Gal addition in serotypes of K. pneumoniae (e.g. seroconversion of serotype O2a to O2afg, Fig. 1C) provides experimentally validated examples of three-component glycosylation of O-PSs from ABC transporter-dependent pathways (127)(128)(129). NMR data reveal that modification of these O-PSs can be stoichiometric when the modification genes are overexpressed, but they do not offer a reliable picture of the status of the terminal repeat units. The observed structure suggests a model where repeat units are modified as they emerge from the ABC transporter, and the associations of the GT-C enzyme with either polymerase or ABC transporter offer the next challenge in understanding this process.
[...] (137), oxidative stress (138), and bile (139). They can also promote resistance to antimicrobial peptides (140)(141)(142), polymyxin (139), and colistin (143), which is predominantly driven by lipid A modifications (3). Some O-PS structures can also contribute to the recognition of LPS by the innate immune system and offer additional elements to canonical immune responses under certain circumstances. Examples include recognition of O-PS by soluble CD14 (144), activating cytokine production via a non-TLR4 pathway in NK cells (145), and recognition of O-PS by C-type lectin 2 (Dectin-2) (146).

Impact of chain-length distribution on O-PS function
The most detailed description of the importance of O-PS chain length is of its interaction with the complement system. Classic studies from Leive and co-workers (147) with Salmonella and E. coli established that resistance to complement-mediated killing in the absence of antibodies correlates strongly with O-PS size and, to some extent, its distribution (i.e. the percentage of "capped" lipid A cores). In these bacteria, O-PS efficiently activates the complement system, but a population of long-chain molecules preferentially binds complement component C3 (147,148). The net effect is that steric hindrance prevents the downstream complement cascade product, C5b-9, from interacting with the outer membrane surface and forming a stable, lasting, hydrophobic interaction. The amount of bound membrane attack complex, C5b-9, is a correlate of serum-mediated bacterial killing (149) as it is a prerequisite for insertion of the membrane attack complex into the outer membrane (150). Some O-PSs influence the rate of complement activation via different efficiencies of C3 deposition or C3 convertase function (151)(152)(153)(154), but it is uncertain whether these properties are direct effects of chemistry or indirect results of O-PS conformation. These examples all involve O-PS produced by Wzx/Wzy-dependent pathways, but the same principles likely apply to the products of ABC transporter-dependent mechanisms. Isolates of K. pneumoniae O1 are resistant to serum killing, but mutants defective in production of the O1 antigen become sensitive despite still producing O-PS composed solely of the O2a antigen (155). The killing effect was abrogated by heat-treating the serum, suggesting a role for complement. The survival of O1 isolates could reflect differences in either chain length or conformation. Notably, the addition of a side-chain galactose to the O2a antigen also confers some protection in K. pneumoniae O2afg isolates, which predominate in the carbapenem-resistant ST258 clone (127). In this case, the PAGE profiles of the corresponding LPS molecules do not suggest a large change in apparent size with or without the side-chain galactose.
Antibodies recognizing cell-surface epitopes promote bacterial killing by the alternative complement pathway without increasing the amount of bound C5b-9 (156), but O-PS has a complex and species-dependent impact on the efficacy of antibodies targeting surface epitopes. Early investigations suggested that O-PS presents a barrier to antibodies recognizing underlying proteins (e.g. Refs. 157 and 158), but a recent study illustrates that the role of O-PS in this context can involve more than a simple steric barrier (159). In this case, interactions between IgG and an outer membrane protein antigen were influenced by the physical space occupied by the antigen and its dynamic interaction with O-PS. The end result is that effective antibody binding required the cognate O-PS and outer membrane protein epitope. In Bordetella parapertussis, O-PS protects the bacterium against opsonization with antibodies induced against surface antigens by the acellular B. pertussis vaccine. Although these bacteria are taken up by polymorphonuclear leukocytes, they are targeted to lipid rafts and trafficked to nonbactericidal LAMP-negative phagosomes, where they survive (160). Perhaps surprisingly, high-titer IgG antibodies directed against O-PS can enhance complement resistance in Salmonella (161), Pseudomonas aeruginosa (162), and uropathogenic Escherichia coli (UPEC) (163). The underlying mechanism appears to involve a barrier of bound antibodies on long-chain O-PS that may further encourage complement deposition away from the outer membrane and physically block access of other protective antibodies.
E. coli, Salmonella, and P. aeruginosa possess multimodal O-PS chain-length distributions due to the possession of two or more chain-length regulatory proteins, and this offers a mechanistic avenue to modulate chain-length distributions and contribute differentially to the biology of the organism. In Salmonella, the O-PS pattern is influenced by the level of expression of the O-PS biosynthesis locus, which is regulated by the transcriptional antiterminator RfaH (164) and by σ factors RpoN and RpoS via RfaH (165) as well as its balance with Wzz proteins, which affect wzy expression and Wzy activity (166,167). "Long" O-PS is generally considered to protect Salmonella against serum killing and is dictated by WzzST. wzzST expression is regulated by the RcsAB two-component system associated with stress response and swarming behavior (168) and depends on Dam methylation (169). "Very long" O-PS, directed by WzzFepE, can confer a fitness advantage on serovar Typhimurium in the inflamed intestine by offering resistance to the elevated bile levels in the lumen during colitis (139). Interestingly, S. Typhi does not cause acute colitis, and the absence of very-long O-PS (due to wzzFepE being a pseudogene in this species) enhances immune evasion enabled by the capsular polysaccharide known as Vi-antigen produced by serovar Typhi (170).
The role of O-PS in macrophage interactions is complex and may involve further fine tuning of O-PS size and structure. In bacteria like E. coli, Salmonella, and Shigella, interpretations are complicated by the production of so-called group 4 capsules. These redirect O-PS glycan from Wzx/Wzy-dependent pathways to the cell surface by a separate translocation machinery without ligation to lipid A core (171). They can contribute to the same biological properties and functions as LPS-linked O-PS but in ways that are sometimes different and often poorly understood. Here we confine the discussion to the LPS-linked forms. Production of "very long" O-PS is also important for uptake of Salmonella by macrophages (172) as well as survival and replication within the Salmonella-containing vacuole (167). Growth conditions that mimic those within the macrophage lead to enhanced production of "very long" O-PS (173). In serovar Typhimurium, expression of pbgE2 and pbgE3 (associated with "short" O-PS), wzzST, and wzzFepE is regulated by the PmrAB two-component system (61,140,168). Subtle differences in regulation may occur between serovars, assisting niche adaptation. For example, PmrA apparently represses wzzST in serovar Enteritidis and induces wzzST in Typhimurium (169). The involvement of PmrAB integrates modulation of O-PS chain length into a coordinated response to polycations and iron that also includes regulated modifications of lipid A. The interdependence extends to an effect of WzzST on the balance of lipid A modifications (61). The broader connection is reinforced by VisP, a periplasmic protein that affects outer membrane structure and integrity and works together with WzzST and WzzFepE to affect wzy expression (167). Exactly how these processes are functionally integrated is an important question for further investigation.
Differences between the O-PS in Shigella sonnei and S. flexneri contribute to the reduced uptake and vacuolar escape observed for S. sonnei, in turn leading to less inflammatory cell death in infected macrophages (174). S. flexneri uses the needle structures of type-3 protein secretion systems (T3SS) to inject effector proteins as an essential step in the invasion of epithelial cells and evasion of innate immunity, but the efficiency of T3SS function is influenced by O-PS structure (135). Constitutive periplasmic O-PS glycosylation can affect the conformation of the closest linkage in the O-PS backbone in molecular modeling studies (175), leading to up to 50% reduction in the physical length of the O-PS chain in S. flexneri (135). Although this does not appear to affect resistance to serum killing, it does have a marked influence on the exposure and function of T3SS. Whereas glucosylation is constitutive in Shigella, O-PS glucosylation in laboratory-grown S. Typhimurium only occurs at low levels because it is subject to genetic (form) variation in this species (176), and the ability to switch glucosylation on/off promotes long-term colonization of the intestine (177). After macrophage infection, S. flexneri uses O-PS to block apoptosis by inhibiting caspase activity (142). O-PS-dependent anti-apoptotic function is not confined to Shigella. It has been proposed that Porphyromonas gingivalis O-PS interacts with bacterial (gingipain) proteases to promote their involvement in anti-apoptotic pathways (178). The influence of O-PS may also extend beyond the intact bacterial cell, as O-PS in some E. coli isolates may enhance the efficiency and uptake of outer membrane vesicles by host cells by enabling raft-dependent endocytosis (179).
Most research concerning the biological impact of O-PS chain length is directed at bacterial pathogenesis. However, O-PS also plays an interesting role in cell-surface architecture in some bacteria that produce paracrystalline protein arrays (S-layers) to form a molecular sieve that covers the cell and protects it against various environmental stresses. Caulobacter crescentus O-PS plays a pivotal role in nucleating and binding S-layer to the surface. During assembly, the S-layer monomers bind to a specific structural element in the O-PS repeat units in a Ca²⁺-dependent process and then migrate to oligomerize at the tips of O-PS to form the intact layer (180,181). Optimal assembly in this system is presumably aided by the tight modal chain-length distribution of O-PS in this species (182), and the process appears to involve an ABC transporter-dependent pathway (183).

Conclusions
In the last few years, we have seen major advances in our understanding of O-PS biosynthesis, due to the combined application of biochemistry and structural biology. Access to large amounts of genome sequence data is offering an opportunity to understand the distribution of prototypical systems and components as well as reveal new systems for investigation that do not conform to currently recognized parameters. This information is already being used to dissect broader functions of O-PS to focus on the roles of specific structures, chain lengths, and regulatory systems. An important application of the mechanistic insight into O-PS assembly lies in the expanding field of glycoengineering, which was founded on the discovery of strategies to transfer heterologous O-PS to protein carriers by exploiting the Campylobacter N-glycosylation system (reviewed in Ref. 184). The approach has been expanded to include different bacterial glycans and human glycan mimics in search of novel vaccines, therapeutics, and diagnostic tools. Bacterial glycan biosynthesis machinery is important because of the vast diversity of enzyme activities and availability of homologs that may have more advantageous properties. Central to these exciting applications is a fundamental understanding of how glycan assembly systems work, how components from different assembly pathways can be productively combined for new functionalities, and whether the critical specificity of a particular enzyme is affected by its use outside the natural context. There are still important questions to resolve around the structure and mechanism of key membrane proteins, the architecture of multienzyme complexes, and the cellular factors that determine the cellular distribution of these complexes and the allocation of shared Und-P carrier. However, the framework is established, and the necessary tools are now available to answer these questions.