Bacterial Protein N-Glycosylation: New Perspectives and Applications

Protein glycosylation is widespread throughout all three domains of life. Bacterial protein N-glycosylation and its application to engineering recombinant glycoproteins continue to be actively studied. Here, we focus on advances made in the last 2 years, including the characterization of novel bacterial N-glycosylation pathways, examination of pathway enzymes and evolution, biological roles of protein modification in the native host, and exploitation of the N-glycosylation pathways to create novel vaccines and diagnostics.

UDP-GlcNAc (10,11) and transferred to Und-P by PglC (12). After stepwise extension mediated by four glycosyltransferases (GTases), PglA, PglJ, PglH, and PglI (13,14), the full-length Und-PP-linked heptasaccharide (lipid-linked oligosaccharide (LLO)) is then flipped by the ATP-dependent flippase PglK from the cytosolic side of the bacterial inner membrane toward the periplasmic space (14,15). The oligosaccharyltransferase (OTase) PglB is the key enzyme of the pathway and is homologous to the STT3 subunit from the eukaryotic OTase complex (16). PglB catalyzes glycan transfer from the lipid carrier to specific asparagines within flexible loops of folded acceptor proteins (17) that carry the bacterial glycosylation sequon (D/E)XNZ(S/T) (where X and Z can be any amino acid except proline) (18). PglB also releases the heptasaccharide as free oligosaccharides into the periplasmic space, and this hydrolase activity is influenced by the osmotic environment of the cell (19).
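The sequon rule stated above can be illustrated with a minimal sketch that scans a protein sequence for (D/E)XNZ(S/T) with X and Z not proline. The function name, the regular expression, and the example sequence are illustrative assumptions and are not taken from the cited studies. A matching sequon marks only a candidate site: as noted above, the asparagine must also lie within a flexible loop of a folded acceptor protein, which a sequence scan cannot assess.

```python
import re

# Bacterial N-glycosylation sequon as described in the text:
# (D/E)-X-N-Z-(S/T), where X and Z are any residue except proline.
# The pattern sits inside a lookahead so that overlapping sequons are all reported.
# Assumption: the input uses one-letter amino acid codes.
BACTERIAL_SEQUON = re.compile(r"(?=([DE][^P]N[^P][ST]))")


def find_bacterial_sequons(protein_seq):
    """Return (asn_position, motif) pairs; positions are 1-based and point at the Asn."""
    seq = protein_seq.upper()
    hits = []
    for match in BACTERIAL_SEQUON.finditer(seq):
        start = match.start(1)  # 0-based index of the D/E residue
        # The Asn is the third residue of the five-residue motif.
        hits.append((start + 3, match.group(1)))
    return hits


# Hypothetical sequence used only for illustration.
candidate_sites = find_bacterial_sequons("MKDQNATFGG")
```

Here `candidate_sites` lists each candidate asparagine together with the five-residue motif that contains it; a proline at X or Z, or the absence of Asp/Glu at the −2 position, removes the site from the list.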
In eukaryotes, the OTase complex transfers a tetradecasaccharide unit (Glc3Man9GlcNAc2) to proteins, and these sugars are further modified by additional glycosyl hydrolases and GTases to yield a heterogeneous mixture of N-glycoproteins (1,20,21). Although the C. jejuni N-glycans are homogeneous in structure, the OTase is capable of recognizing various substrates (22-24) and is limited only by the requirement for an acetyl group at C-2 of the reducing end sugar (24,25). The promiscuity of the C. jejuni PglB OTase, coupled with its ability to function in Escherichia coli (16) in combination with enzymes from lipopolysaccharide O-antigen biosynthetic pathways and various GTases of bacterial and eukaryotic origin, provides opportunities to produce defined sugar structures on select acceptor proteins (see below for details).

Bacterial N-Linked Protein Glycosylation: An Update
There have been several publications and reviews describing the classical N-glycosylation pathway in the ε-subdivision of Proteobacteria over the last decade (for literature selection, see Refs. 6, 26, and 27). It has been demonstrated that nearly all bacteria within this class, which includes Campylobacter, Helicobacter, and Wolinella species, possess at least one ortholog of the bacterial OTase (for review, see Ref. 6). In addition, a subset of Deltaproteobacteria, including Desulfovibrio species, also possess the N-glycan pathway (28,29).

Periplasmic N-Glycosylation Pathways in Epsilonproteobacteria: Variations of a Common Theme
In contrast to C. jejuni, in which a single pglB gene is located within a gene cluster encoding all enzymes required for biosynthesis of the N-linked glycan, in other Campylobacter and Helicobacter species, the pgl genes are more dispersed across the genome (6, 30-32) and may contain two copies of pglB (6,31). The composition of the N-glycans is also more variable than originally thought (Fig. 1) (6,33,34). Helicobacter pullorum produces a linear pentasaccharide with unknown sugar residues with masses of 216 and 217 Da (31). Wolinella succinogenes produces a hexasaccharide containing three 216-Da monosaccharides and an unusual 232-Da residue at the nonreducing end and is the only species so far that has a sugar branching off the second sugar from the reducing end (30).
At least 16 different structures were identified in the Campylobacter genus (Fig. 1) (30,32). In all species examined, the first two to three reducing end sugars (which include diNAcBac) were conserved, and all of the variation occurred at the nonreducing end. On that basis, Campylobacter species were divided into two groups (32). Group I (Cj group) includes thermotolerant strains that (except for the Campylobacter lari-like species, which lacks the glucose branch) produce the C. jejuni heptasaccharide. The non-thermotolerant Campylobacter species (Group II) express different N-glycan chain lengths from hexasaccharides to octasaccharides and, in some cases, produce from two to four structural variants. The presence of monosaccharides with unusual masses of 217 Da (as observed in H. pullorum (31), Helicobacter winghamensis (31), Campylobacter rectus, Campylobacter showae, Campylobacter curvus, Campylobacter mucosalis (32), and Campylobacter concisus (30) and predicted to be 2-acetamido-2-deoxy-D-galacturonic acid (32)), 234 Da (identified as glucolactilic acid in C. concisus but also present in other species (32)), and 245 Da (present in Campylobacter hominis (32) and predicted to be an O-acetylated HexNAc (32)) suggests that some Campylobacter species have acquired these unique sugars independently (32).
Interestingly, the overlap in the N-glycoproteome was shown to be marginal, with the exception of the SecG protein, which is part of the general protein secretory machinery and has been demonstrated to be a target for N-glycosylation in every Campylobacter species examined (32). Although a variety of phenotypes are associated with loss of protein glycosylation (see "Biological Significance" below), it appears that this machinery has specific targets requiring N-glycosylation, and modification of other proteins may occur simply because they possess a sequon for N-glycosylation.
In addition to the variety in N-glycan structures, non-carbohydrate modifications can be found. Scott et al. (35) identified a phosphoethanolamine (pEtN) residue attached to the terminal GalNAc of the N-glycan of C. jejuni on multiple proteins. Although pEtN was not an immunodominant epitope, deletion of the pEtN transferase gene in C. jejuni JHH1 resulted in reduced motility and increased sensitivity toward polymyxin B (35). Interestingly, the glycans of Campylobacter gracilis were found to be nearly 100% pEtN-modified on the free oligosaccharides and ~50% modified on the glycoprotein, with non-modified forms being present on the same acceptor peptide (32). pEtN modifications are frequently present on lipid A and lipopolysaccharide core structures and have also recently been shown to modify the C. jejuni flagellar hook protein FlgG (36). However, the question of why this modification is added to the free oligosaccharides and N-glycans of only two species when all Campylobacter species possess a pEtN transferase homolog remains to be answered. Interestingly, Neisseria species that have a general pathway for O-glycosylation further modify their O-glycosylated pili and at least two other glycoproteins with pEtN, phosphocholine, and phosphoglycerol, but these modifications are added to unoccupied Ser/Thr residues on the protein and not directly to the sugars (37,38).

Biological Significance
Disruption of the C. jejuni N-glycosylation pathway results in pleiotropic phenotypes, including loss of protein recognition by both human (39) and rabbit (7,40) immune sera (Fig. 2). The observation that multiple proteins lose antibody reactivity when the pgl genes are disrupted led to the initial discovery of this general glycosylation pathway (7). Subsequently, it was demonstrated by van Sorge et al. (41) that the C. jejuni N-glycan can also be recognized by the innate immune system through the C-type lectin known as MGL (macrophage galactose lectin receptor with specificity for N-acetylgalactosamine) on dendritic cells. Mutation of the pgl genes also influences bacterial adherence and invasion of cultured intestinal cells in vitro as well as mouse and chicken colonization in vivo (40,42,43). Of the >60 proteins that have been annotated and demonstrated to be modified with N-glycans in C. jejuni (8), the majority are part of protein complexes. Despite the fact that several studies have demonstrated the importance of N-glycosylation in C. jejuni pathogenesis, the general role of the N-glycans is still unknown. Site-directed mutagenesis within the N-glycosylation sequon has been used to study proteins such as Cj1496c, ZnuA, and VirB10 (44-46). Removal of the N-linked glycans from Cj1496c, a protein involved in epithelial cell invasion and chicken colonization, and ZnuA, which is important for survival in low-zinc environments, showed no significant influence on the function of these proteins (42,45). Similarly, the N-glycan on the surface-associated lipoprotein JlpA, an adhesin required for chicken colonization, was not required for protein antigenicity (47,48), and N-glycosylation of two mechanosensitive channel proteins had no major effect on their function and stability (49). However, N-glycosylation at one site of VirB10, a periplasmic component of the type IV secretion system, was shown to be important for DNA uptake and stability of the complex (44). This was the first study to suggest that N-glycans are potentially involved in the proper assembly and function of protein complexes in C. jejuni.
CmeABC is the major efflux pump in C. jejuni responsible for conferring resistance to a broad range of antibiotics and bile salts and is also required for colonization of chickens in vivo (50). The three components of the CmeABC multidrug efflux pump in C. jejuni are also N-glycosylated. Mutagenesis of the Asn123 and Asn273 sites of the periplasmic component of this efflux pump, CmeA, produced patterns of antimicrobial sensitivity and reduced chicken colonization levels similar to those of a cmeA mutant. Because CmeA protein levels were not altered in the N-glycosylation mutant, studies are under way to determine whether N-glycosylation influences pump assembly.

Evolutionary Perspective
All Campylobacter species but one possess a pgl gene cluster with counterparts showing high levels of similarity and synteny (32), which suggests that the N-glycan pathway was likely present in the common ancestor of the Campylobacter genus prior to the division into species. Thus, it is not surprising that the first two to three reducing end sugars are conserved in all species capable of protein glycosylation (Fig. 1). This is in contrast to the diverse N-glycan structures at the nonreducing end, which are subject to evolutionary pressure from immune recognition. This is emphasized by the observation that C. jejuni N-glycan-specific antibodies recognize only the full-length heptasaccharide and do not react against pathway mutants devoid of the nonreducing end sugars (19,51). We have also shown that C. jejuni N-glycan antibodies do not react against N-glycans produced by non-thermotolerant Campylobacter species, and conversely, antibodies raised against the N-glycan of Campylobacter fetus fetus react only with N-glycans of species with the same terminal glycans (Fig. 1), although the three reducing end sugars are identical between these species (32).
Of the 29 Campylobacter species capable of N-glycosylating their proteins, all form the N-linkage to asparagine through the unusual diNAcBac sugar (Figs. 1 and 3) (32). In contrast, all eukaryotes form N-linkages through GlcNAc. To synthesize diNAcBac, bacteria start with UDP-GlcNAc and make the conversion using three enzymes, PglD, PglE, and PglF (as outlined above). All Neisseria species examined to date O-glycosylate their proteins. Synthesis of the oligosaccharide occurs on Und-P, followed by flipping across the membrane and addition to Ser/Thr residues of multiple target proteins by the promiscuous PglL OTase (52-54). Although this genus belongs to the Betaproteobacteria, its members also predominantly use diNAcBac at the reducing end (referred to as DATDH (diacetamidotrideoxyhexose) in the literature because the absolute conformation of the hexose has not been determined), although other variants such as GATDH (glyceramidoacetamidotrideoxyhexose) (55) and DADDGlc (diacetamidodideoxyglucopyranose) (56) have been detected. Thus, Neisseria species also possess homologs of the PglFED enzymes. Phylogenetic analyses of these three enzymes, which convert UDP-GlcNAc to UDP-diNAcBac, from all strains known to produce N-glycans and from Neisseria species containing the pgl locus showed the formation of two distinctive clades (Fig. 3) (data not shown). These results suggest that the Campylobacter and Neisseria protein glycosylation pathways have convergently evolved to use diNAcBac. Why this unusual sugar is produced by two different bacterial protein glycosylation systems remains a mystery. It is also surprising to see that the enzymes from W. succinogenes (which uses diNAcBac) (30) cluster with those from Helicobacter species, which possess pgl gene loci, including pglFED, but use HexNAc as the reducing end sugar (31). Desulfovibrio species were excluded from these analyses because they do not possess obvious PglFED orthologs. This is supported by x-ray crystallography data demonstrating that the reducing end sugar of Desulfovibrio gigas is GlcNAc (29).

Gammaproteobacteria Do It, Too! The Cytoplasmic N-Glycosylation Pathway
In 2003, St Geme and co-workers (57) demonstrated that the extracellular high-molecular weight non-pilus adhesin HMW1 of the Gram-negative bacterium Haemophilus influenzae was modified with mono- or dihexoses on asparagine residues. HMW1 mediates attachment to human epithelial cells, an essential step in the establishment of H. influenzae disease (58-60). HMW1C is a unique GTase that is required for glycosylation of HMW1 in the cytoplasm, and glycosylation was shown to be essential for HMW1 stability and translocation to the bacterial surface (57). Thirty-one modification sites were identified in HMW1, and all but one of the asparagines are within the conventional consensus sequence for eukaryotic N-glycosylation, NX(S/T) (61). In a subsequent study, HMW1C was demonstrated to have substrate specificity for UDP-α-D-glucose and UDP-α-D-galactose, but with the prerequisite that glucose must be the first hexose linked to asparagine in the glycopeptides containing dihexose modifications prior to further modification by the same enzyme (62). Because HMW1C orthologs can be found in several other Gram-negative pathogens, including enterotoxigenic E. coli, Yersinia pseudotuberculosis, Yersinia enterocolitica, Yersinia pestis, Haemophilus ducreyi, Actinobacillus pleuropneumoniae, Mannheimia spp., Xanthomonas spp., and Burkholderia spp., among others (62), these proteins were suggested to constitute a new family of bacterial enzymes capable of creating N-linked carbohydrate-protein and carbohydrate-carbohydrate bonds in the cytoplasm that do not require the formation of the monosaccharide units on a lipid-linked intermediate (61).
Recently, the structure of the HMW1C ortholog of A. pleuropneumoniae that is capable of glycosylating the H. influenzae HMW1 adhesin in vitro and in vivo (63) was solved (64). Structure-guided modeling in combination with site-directed mutagenesis and glycosylation assays demonstrated that the H. influenzae HMW1C protein adopts the same structure as A. pleuropneumoniae HMW1C, with critical residues for UDP-hexose binding and residues for acceptor protein binding being conserved (64). Schwarz et al. (65) confirmed that the HMW1C ortholog from A. pleuropneumoniae is an inverting GTase that transfers a glucose or galactose moiety (with specificity for UDP-glucose) to asparagine within the eukaryotic NX(S/T) consensus sequence, but the enzyme did not further elongate the N-linked monosaccharide. An additional, glucose-specific, polymerizing glucosyltransferase elongates the N-linked glucose, adding two glucose residues (and up to six residues in excess of donor) to generate Glcα(1-6)Glcα(1-6)Glc (65). Despite lacking the need for Asp/Glu at the −2 position of the sequon, HMW1C proteins could not glucosylate short peptides that work for C. jejuni PglB (22), indicating that binding of the substrate to the enzyme might require an extended contact surface (65). A proline in the +1 position of the acceptor peptide impaired glycosylation (like C. jejuni PglB). HMW1C does not require metal ions (unlike C. jejuni PglB) but operates on folded proteins (like C. jejuni PglB). Because HMW1C does not share any sequence identity with PglB and because the two enzymes belong to two structurally unrelated GTase families, these bacterial N-glycosylation systems seem to have evolved independently in two genera within two different bacterial classes and in different compartments of the cell.
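The difference in sequon requirements between the two systems can be made concrete with a short sketch that separates asparagines lying in an NX(S/T) context only from those that also carry Asp/Glu at the −2 position, as required by C. jejuni PglB. The function name, the returned dictionary keys, and the test sequence are hypothetical, and excluding proline at the +1 position for both patterns is a simplifying assumption based on the impaired glycosylation described above. The sketch addresses sequence context only; it does not model the extended contact surface or the folded state that HMW1C appears to need.

```python
import re

# N-X-(S/T) with X != Pro: the eukaryotic-type consensus described for HMW1C.
NXST = re.compile(r"(?=(N[^P][ST]))")
# (D/E)-X-N-Z-(S/T) with X, Z != Pro: the extended sequon described for C. jejuni PglB.
EXTENDED = re.compile(r"(?=([DE][^P]N[^P][ST]))")


def sequon_asn_positions(protein_seq):
    """Group candidate Asn positions (1-based) by the sequon type they satisfy."""
    seq = protein_seq.upper()
    # In the NX(S/T) pattern the Asn is the first residue of the match.
    nxst = {m.start(1) + 1 for m in NXST.finditer(seq)}
    # In the extended pattern the Asn is the third residue of the match.
    extended = {m.start(1) + 3 for m in EXTENDED.finditer(seq)}
    return {
        "nxst_only": sorted(nxst - extended),  # lacks Asp/Glu at -2 (or Pro at -1)
        "extended": sorted(extended),          # also satisfies the PglB-type sequon
    }


# Hypothetical sequence containing one site of each kind.
sites = sequon_asn_positions("DQNATFAANGSA")
```

Every asparagine in the `extended` group also satisfies NX(S/T), so the two groups together give all NX(S/T) candidates, whereas `nxst_only` isolates sites that match the eukaryotic-type consensus but not the PglB-type sequon.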

Bacterial Glycoengineering: A New Discipline Is on the Rise
Conjugate vaccines, which are composed of polysaccharide antigens covalently linked to carrier proteins, have been demonstrated to be the most effective and safest vaccines against bacterial pathogens. In contrast to isolated bacterial polysaccharides, conjugate vaccines induce a long-lasting T-lymphocyte-dependent immunological memory (66,67) and thus have played an enormous role in preventing infectious diseases (67,68). However, the majority of these vaccines are based on chemical conjugation of a protein to either purified polysaccharide from a natural source or carbohydrates produced by multiple chemical synthesis steps, an expensive and time-consuming process yielding variable products (69). Bacterial glycoengineering may provide a solution to facilitate the production of such compounds in a cost-effective and homogeneous manner.
The functional transfer of the C. jejuni glycosylation machinery into E. coli started the era of bacterial glycoengineering (16). The C. jejuni-derived glycoprotein CmeA, a periplasmic component of the multidrug efflux pump, was the preferred N-glycan acceptor protein to be modified with various bacterial oligosaccharides in a PglB-dependent manner (23,24). Now, researchers can take advantage of this system for the creation of novel glycoconjugates (70).

Vaccines, Therapeutic Proteins, and Novel Glycans
A novel vaccine against Brucella abortus, a major cause of brucellosis in humans and livestock (71,72), was described recently (73). The purified glycoconjugate, produced by PglB-dependent coupling of an N-formylperosamine homopolymer to C. jejuni CmeA, induced a pathogen sugar-specific IgG response in mice but was not protective in challenge experiments. However, when the recombinant glycoprotein was coated onto magnetic beads, it was efficient in differentiating between naïve and infected bovine sera, thus acting as a useful diagnostic tool (73). These examples, together with previous results (16,23,24), demonstrate that it is now feasible to produce a variety of bacterial glycans attached to protein acceptors with relative ease. However, the synthesis of eukaryote-like oligosaccharides to generate humanized glycoproteins is still a challenge within the field of bacterial glycoengineering. Recently, a combination of in vivo and chemoenzymatic approaches produced homogeneous eukaryotic N-glycoproteins by in vitro trimming and remodeling of a bacterially generated N-linked glycan by transglycosylation (76). In this proof-of-concept study, the authors were able to generate a natural high mannose-type glycan (Man9GlcNAc2) and a biantennary complex-type N-glycan attached to CmeA as well as two eukaryotic proteins, the CH2 domain of human IgG-Fc and the single-chain antibody F8 (76). A big step forward was the complete in vivo synthesis of a eukaryotic trimannosylchitobiose glycan by expressing four eukaryotic GTases and the subsequent transfer (via C. jejuni PglB) to acceptor proteins in E. coli (77). Despite the fact that only a very small fraction of each acceptor protein was glycosylated under the tested conditions, this study sets the stage for further engineering of eukaryotic glycoproteins using bacterial systems (77).
N-Linked protein glycosylation can also improve the biophysical and pharmacokinetic properties of existing therapeutic proteins such as antibodies when expressed and N-glycosylated in eukaryotic cell lines (78,79). Using the bacterial system, Lizak et al. (80) demonstrated that introduction of two N-glycan acceptor sites and subsequent N-glycosylation of a single-chain antibody (scFv 3D5) with the C. jejuni N-glycan in E. coli significantly increased the proteolytic stability and solubility without influencing the affinity and serum stability of the single-chain antibody. However, N-glycosylation did not affect the overall clearance rate in the murine model system, which was probably due to the properties of the attached C. jejuni N-glycan (80).

Understanding and Improving N-Glycosylation
The x-ray crystal structure of C. lari PglB, which is 56% identical to C. jejuni PglB and complements C. jejuni PglB function in E. coli (80,81), was solved in complex with the DQNATF acceptor peptide. This structure greatly advanced our understanding of the detailed mechanism for N-linked protein glycosylation at a molecular level (82). The structure not only identified the acceptor peptide-binding and catalytic sites in the transmembrane domain but provided insights into the mechanism of sugar transfer from LLOs to acceptor proteins. The structural requirements of the acceptor peptide (flexible loop formation), as well as important amino acid residues within C. lari PglB that explain the peptide sequon requirements and the catalytic mechanism (including divalent metal ion complexing), were identified.
Topology predictions further demonstrated the commonality of three motifs in bacterial, archaeal, and eukaryotic OTases that were the target for mutagenesis and functional analysis within C. jejuni PglB (83, 84). C. jejuni PglB activity was low, medium, or completely abolished in point mutants with substitutions of amino acid residues that are involved in substrate binding and catalysis, confirming and supporting the observations made by Lizak et al. (82) for C. lari PglB. Genetic and functional analysis suggested that an MXXI motif (residues 568-571) has an impact on C. jejuni PglB function (84); in addition, single point mutations introduced into the conserved DXXK motif (residues 475-478), earlier identified within the OTase of Pyrococcus furiosus (85), implied that Asp475 is involved but does not play an essential role in PglB activity (83). However, double valine or alanine substitutions of the Asp475 and Lys478 sites resulted in the complete loss of C. jejuni CmeA glycosylation activity in E. coli, an effect more drastic than a single Asp475 substitution. The fact that D475V and K478V single and double point mutants displayed a significant reduction in PglB-mediated free oligosaccharide release and N-glycosylation levels when analyzed in the native C. jejuni host suggests that these residues are indeed important for PglB activity. The PglB structure in combination with in vitro evolution systems such as cell-surface and phage display (86,87) will lead to the creation of OTases that efficiently glycosylate eukaryotic acceptor sites. The identification and characterization of novel OTases (88), especially those with more relaxed substrate specificities (28,81), will assist in engineering novel antigens. The use of codon-optimized GTase- and OTase-encoding genes, better glycoconjugate production strains, and improved culture conditions and glycoconjugate purification methods (74, 80, 89-91) will further help to optimize novel and existing glycoconjugate expression systems.

Conclusions
Over the last decade, an incredible amount of progress has been made in our understanding of bacterial N-glycosylation systems. However, several key questions still remain. For instance, what are the main targets for N-glycosylation? Although tremendous progress has been made in the field of bacterial glycoengineering, there is still room for improvement, and it needs to be determined whether it will be possible to produce humanized therapeutic proteins in the bacterial system in large enough quantities to make it feasible to replace existing technologies. Analogous to eukaryotes, which require flipping of an LLO from the cytoplasm to the lumen of the endoplasmic reticulum, protein N-glycosylation has been detected only in Gram-negative bacteria, which contain two membranes. Why do only some Gram-negative bacteria modify their proteins with sugars, whereas others use the same proteins without modification? Why was diNAcBac selected twice during bacterial evolution to act as the sugar that links oligosaccharides to proteins? Are there additional protein targets for the cytoplasmic HMW1C-dependent glycosylation pathway, or do these organisms use this pathway solely for the purpose of glycosylating the adhesin needed for human cell attachment? What are the exact roles of the sugars on the adhesin? The answers to the many remaining questions will provide more insight into the process of asparagine-linked protein glycosylation in bacteria.