Structure of the N-linked glycan present on multiple glycoproteins in the Gram-negative bacterium, Campylobacter jejuni.

Mass spectrometry investigations of partially purified Campylobacter jejuni protein PEB3 showed it to be partially modified with an Asn-linked glycan with a mass of 1406 Da and composed of one hexose, five N-acetylhexosamines and a species of mass 228 Da, consistent with a trideoxydiacetamidohexose. By means of soybean lectin affinity chromatography, a mixture of glycoproteins was obtained from a glycine extract, and two-dimensional gel proteomics analysis led to the identification of at least 22 glycoproteins, predominantly annotated as periplasmic proteins. Glycopeptides were prepared from the glycoprotein mixture by Pronase digestion and gel filtration. The structure of the glycan was determined by using nano-NMR techniques to be GalNAc-alpha1,4-GalNAc-alpha1,4-[Glcbeta1,3-]GalNAc-alpha1,4-GalNAc-alpha1,4-GalNAc-alpha1,3-Bac-beta1,N-Asn-Xaa, where Bac is bacillosamine, 2,4-diacetamido-2,4,6-trideoxyglucopyranose. Protein glycosylation was abolished when the pglB gene was mutated, providing further evidence that the enzyme encoded by this gene is responsible for formation of the glycopeptide N-linkage. Comparison of the pgl locus with that of Neisseria meningitidis suggested that most of the homologous genes are probably involved in the biosynthesis of bacillosamine.

Glycosylation of proteins was once considered to be specifically a eukaryotic phenomenon, but it is now clear that it is widespread in both the Archaea and Eubacteria domains (1,2). Glycosidic linkages of both the N- and O-types have been identified in a diverse group of prokaryotic organisms with a preponderance of N-linked sugars apparent in the Archaea, whereas linkage units of the O-type predominate in glycoproteins identified thus far in the Eubacteria (1,2). In addition, bacterial N- and O-linkages are formed with a wider range of sugars than those observed in eukaryotic glycoproteins.
Recently a gene locus was identified in the enteric pathogen Campylobacter jejuni, which appears to be involved in the glycosylation of multiple proteins and which provided the first evidence of a pathway for widespread protein glycosylation in a Gram-negative bacterium (3). Mutagenesis of genes within this locus, termed pgl (for protein glycosylation), resulted in loss of immunogenicity in multiple proteins. The glycan moieties of these proteins were also shown to be recognized by antisera from experimentally infected human volunteers (3). Removal of the glycan moieties by pgl mutation resulted in decreased adherence and invasion in vitro and loss of mouse colonization in vivo (4), suggesting that protein glycosylation influences the virulence properties of this organism. A similar genetic locus has recently been described for pilin O-glycosylation in Neisseria meningitidis (5,6).
This glycosylation pathway appears to be distinct from that involved in the glycosylation of flagellin observed for several bacterial species (7-10). Thibault et al. (11) recently demonstrated that the glycosyl moiety on the flagellin of C. jejuni consists of variants of an unusual nine-carbon sugar, pseudaminic acid, O-linked to Ser or Thr. This sugar may be considered a highly modified sialic acid (12).
We have been identifying and characterizing post-translational modifications of proteins in C. jejuni strain NCTC 11168, the strain for which the whole genome sequence has been described by Parkhill et al. (13). Among the proteins giving rise to multiple spots on two-dimensional gels was PEB3, or Cj0289c, a major antigenic protein of C. jejuni first described by Pei et al. (14). When purified and analyzed by one-dimensional SDS-PAGE, it revealed two bands with a mass difference of ~1500 Da, both of which had N-terminal sequences corresponding to authentic PEB3. Concurrent with our observations on PEB3, Linton et al. (15) identified two putative glycoproteins from C. jejuni by use of the GalNAc-specific lectin, soybean agglutinin, one of which was PEB3, and the other a putative periplasmic protein Cj1670c, which they named CgpA. The authors also observed a number of other putative glycoproteins, based upon their ability to bind to the lectin, but these were not identified. Furthermore, protein binding to the lectin was also affected by mutagenesis of genes in the pgl locus.
We have undertaken a comprehensive analysis of the C. jejuni NCTC 11168 glycoprotein complement, and in this report we present data on the identification of individual glycoproteins and structural characterization of the N-linked glycan moiety, which is common to all these glycoproteins, by mass spectrometry and NMR spectroscopy. In addition, mutation of a gene, pglB, whose homology to the STT3 subunit of the N-linked oligosaccharyltransferase of Saccharomyces cerevisiae suggested a role in glycoprotein biosynthesis (3,16), was shown to specifically affect the glycosylation of the identified glycoproteins.

EXPERIMENTAL PROCEDURES
Bacterial Strains and Plasmids-C. jejuni NCTC 11168 was routinely grown on Mueller Hinton agar under microaerophilic conditions (10% CO2, 5% O2, 85% N2) at 37°C. Escherichia coli DH10B (Invitrogen) was used as the host strain for cloning experiments, and clones were grown on S-Gal™/LB agar (Sigma) or MH agar at 37°C. When appropriate, antibiotics were added to the following final concentrations: kanamycin (30 µg/ml) and ampicillin (150 µg/ml). Plasmid pPCR-Script Amp (Stratagene) was used as the cloning vector.
Preparation of Glycoprotein Extracts-Cells from two plates of overnight growth were re-suspended in 10 ml of Mueller Hinton broth and used to inoculate 1 liter of MH culture medium. Cultures were grown under microaerophilic conditions at 37°C for 24 h with shaking at 150 rpm. Bacterial cells from 12 liters of culture media were harvested by centrifugation at 10,000 × g for 15 min and immediately frozen at −75°C. Frozen cell pellets were thawed on ice in 0.2 M glycine HCl buffer, pH 2.2 (17), and extracted for 15 min with gentle stirring. Extracts were clarified by centrifugation at 10,000 × g for 15 min, dialyzed against pure water (Milli-Q system, Millipore Corp.), and freeze-dried.
Purification and Analysis of PEB3-The PEB3 protein was purified to homogeneity by cation exchange chromatography of the glycine extract as previously described (14). A Mono S HR 5/5 column (Amersham Biosciences) was used on an ÅKTA Explorer LC system (Amersham Biosciences). The column eluate was monitored for UV absorbance at 280 nm, and fractions were examined by SDS-PAGE analysis (18) in Mini Protean II slab gels (Bio-Rad Laboratories). N-terminal sequencing of individual proteins was performed on a model 491 Procise protein sequencing system (Applied Biosystems Inc.), following transfer from SDS gels to ProBlot™ PVDF membrane (Applied Biosystems Inc.) as described by LeGendre et al. (19).
The protein molecular weight profiles of selected fractions were determined by electrospray ionization mass spectrometry using an Applied Biosystems/Sciex Q-Star hybrid quadrupole time-of-flight mass spectrometer. The fractions were first dialyzed extensively to remove salts and adjusted to 30% methanol, 0.2% formic acid. The solution was infused at a flow rate of 1 µl/min, and spectra were acquired over the range m/z 600-2000.
Analysis of Tryptic Peptides-Selected fractions were digested overnight at 37°C with modified trypsin (Promega) in 50 mM ammonium bicarbonate and analyzed by capillary LC-tandem mass spectrometry using a capillary high-performance liquid chromatography system (CapLC, Waters) coupled with a Q-TOF2 hybrid quadrupole time-of-flight mass spectrometer (Micromass). Approximately 250 ng of each digest was injected onto a 0.3 × 150-mm PepMap C18 capillary LC column (Dionex/LC-Packings) and resolved by gradient elution (5-90% acetonitrile, 0.2% formic acid in 45 min). The mass spectrometer was set to operate in automatic MS/MS acquisition mode, and spectra were acquired on doubly, triply, and quadruply charged ions.
Larger scale separation of the tryptic digest was carried out on a 4.6 × 250-mm Jupiter C18 LC column (Phenomenex Inc.). The fraction containing the glycopeptide was then infused at a flow rate of 1 µl/min into the microelectrospray interface of the Q-TOF2 mass spectrometer. Fragmentation of the glycopeptide prior to MS/MS analysis was achieved by front-end collision-induced dissociation (the orifice voltage was increased to 100 V from the normal 40 V). The MS/MS collision offset for the singly charged fragment ions produced in this manner was 20-25 V (laboratory frame of reference). For β-elimination experiments by the method of Rademaker et al. (20), approximately half of the glycopeptide-containing fraction was evaporated to dryness and dissolved in 25% aqueous ammonium hydroxide. The solution was left at room temperature overnight, evaporated to dryness for a second time, and re-dissolved in water. The solution was then examined by infusion MS as described above.
Purification and Analysis of Total Glycoproteins-The glycoproteins from the glycine extracts were isolated by affinity chromatography on SBA lectin-agarose (Sigma-Aldrich Ltd.). The freeze-dried glycine extract was re-dissolved in PBS (100 mM NaCl, 50 mM sodium phosphate, pH 7.5) and passed through an SBA-agarose column previously equilibrated in PBS. The column was washed with 10 column volumes of PBS, and bound glycoprotein was eluted with 0.1 M GalNAc in PBS. Glycoprotein-containing fractions were pooled, dialyzed against Milli-Q water, and freeze-dried.
The glycoproteins were separated by SDS-PAGE on 12.5% homogeneous polyacrylamide gels (18). Two-dimensional PAGE was performed using pre-cast IEF strips containing immobilized linear pH gradients of either pH 3-10, pH 4-7 (Bio-Rad Laboratories), or pH 6-11 (Amersham Biosciences). Proteins were solubilized in sample buffer according to the manufacturer's instructions and resolved by isoelectric focusing on the pre-cast IEF strips followed by SDS-PAGE on homogeneous 12.5% slab gels, 20 × 20 cm, for the second dimension. Gels were stained with Bio-Safe colloidal G-250 Coomassie Blue stain (Bio-Rad Laboratories) or silver-stained (20). For subsequent lectin probing, the gels were electroblotted onto PVDF membrane at 50 V for 1 h in 10 mM 3-(cyclohexylamino)-1-propanesulfonic acid buffer, pH 11, containing 10% methanol. The membrane was washed in Milli-Q water and blocked in Tris-buffered saline (100 mM NaCl, 50 mM Tris, pH 7.5) with 0.05% Tween 20 and 2% blocking buffer (Roche Molecular Biochemicals) for 2 h at room temperature. Following blocking, blots were further incubated with SBA-alkaline phosphatase conjugate (EY Laboratories Inc.) at a concentration of 10 µg/ml in the above blocking solution for 1 h at room temperature. Blots were washed three times in Tris-saline with 0.05% Tween 20 and developed using nitro blue tetrazolium chloride/5-bromo-4-chloro-3-indolyl phosphate in 0.1 M NaCl, 0.1 M Tris, pH 9.5, with 50 mM MgCl2.
MS Analyses of Two-dimensional Gel Spots-The protein spots were excised and destained with a 1:1 ratio of 30 mM potassium ferricyanide and 100 mM sodium thiosulfate (21). The gel spots were washed extensively with deionized water, shrunk with acetonitrile, and re-swollen with 50 mM ammonium bicarbonate containing Promega modified trypsin (10 ng/µl), and sufficient 50 mM ammonium bicarbonate was added to cover the gel pieces (typically 30 µl). The tubes were sealed and incubated overnight at 37°C. The digest solutions were removed, and the gel pieces were extracted with 50 µl of 5% acetic acid and then with 50 µl of 5% acetic acid in 50% aqueous methanol. The extracts were pooled with the digest solutions and concentrated to ~10 µl.
The peptide extracts from the intense protein spots were analyzed by MALDI-TOF MS using a M@LDI-LR mass spectrometer (Micromass). Approximately 0.5 µl of the MALDI matrix solution (10 mg/ml α-cyano-4-hydroxy-cinnamic acid in 50% acetonitrile, 0.2% trifluoroacetic acid) was deposited on the target plate and allowed to dry. The peptide extracts were desalted using C18 ZipTips™ (Millipore) and were deposited directly on the matrix spots. Acquisition of the MALDI-TOF MS spectra was carried out automatically. The spectra were calibrated externally using peptide standards and internally with trypsin autolysis peptides. Data base searching was carried out in batch mode using Mascot Daemon™ (Matrix Science) against the C. jejuni NCTC 11168 genome sequence data base.
The extracts from the fainter protein spots were analyzed by NanoLC-MS/MS using the Q-TOF2 mass spectrometer. The entire samples were injected onto a 0.3 × 5-mm C18 micro pre-column cartridge (Dionex/LC-Packings). The peptides were retained while the solvent components were washed to waste. The trap was then brought online with a 75-µm × 150-mm C18 Nano-Series column (Dionex/LC-Packings), and the peptides were separated with a gradient supplied by the CapLC pump (15-75% acetonitrile, 0.2% formic acid in 30 min, ~300 nl/min flow rate). The mass spectrometer was set to acquire MS/MS spectra in automated mode as described above. Data base searching was carried out as described for the MALDI-TOF MS analyses.
Glycopeptide Preparation-Freeze-dried total glycoprotein (5 mg) was dissolved in 250 µl of 100 mM Tris, pH 8.0, containing 2 mM CaCl2, and digested with Pronase as previously described (22). The digest was microcentrifuged at 10,000 × g for 15 min, and the supernatant was applied to a column (1 × 120 cm) of BioGel P4, 200 mesh (Bio-Rad Laboratories). The column was run in water, and the column eluate was monitored by refractive index. Fractions were screened by ESI-MS and precursor ion scanning mass spectrometry (precursors of the HexNAc oxonium ion at m/z 204) on an API 3000 triple quadrupole mass spectrometer (Applied Biosystems/Sciex). Fractions giving the HexNAc ion signature were pooled and freeze-dried. The glycopeptides were further purified on a 1 × 120-cm column of BioGel P2 fine grade, the fractions being monitored and screened as described above.
to within 10 Hz of the set value. Samples in 40 µl of D2O solution were recorded at 25°C and at 35°C to produce sharper peaks. The pH was unknown due to the small volume. Deuterated EDTA (CDN Isotopes Inc.) was added to chelate metal ions and provide sharper peaks for bacillosamine and amino acids. Although the glycopeptide isolate from a P4 column contained some amino acid and sugar impurities, spectra were of sufficient quality to allow complete resonance assignments of the glycopeptide in the presence of 15 mM deuterated EDTA. Much of the NMR structural work proceeded with this sample, because of the risk of losing the bulk of the isolated glycopeptide by further purification. The derived structure was confirmed by additional NMR experiments on glycopeptide that had been purified using a P2 column, lyophilized, and dissolved in 40 µl of D2O with 1 mM deuterated EDTA. The experiments were performed with suppression of the HOD signal at 4.78 ppm (25°C) and 4.67 ppm (35°C). Acquisition and processing of two-dimensional experiments (COSY, TOCSY, NOESY, HMQC, HMBC) were performed as described previously (23). The 1H reference was set by external acetone at 2.23 ppm. The 13C reference was set with the methyl resonance of external acetone at 31.07 ppm. The 1H and 13C chemical shifts in Table I (see below) were measured from the proton spectra and from C-H cross-peaks in the HMQC and HMBC spectra. One-dimensional TOCSY experiments with various spin-lock times from 30 to 151 ms and one-dimensional NOESY with mixing times from 400 to 800 ms were performed as described previously (23,24). Selective experiments are described as one-dimensional EXP (selected spins, selective excitation bandwidth, mixing time), where EXP is TOCSY or NOESY.
The use of magic angle spinning (MAS) for liquid state samples in the presence of both radio frequency and magnetic-field inhomogeneities has been shown to influence significantly the performance of mixing sequences in TOCSY experiments and can degrade performance (25,26). Use of adiabatic (WURST) mixing sequences can eliminate such effects (25,27). The standard two-dimensional TOCSY and one-dimensional TOCSY sequences were modified so that the MLEV-17 or DIPSI-2 mixing sequence was replaced with the adiabatic WURST-2 pulses. The adiabatic (WURST-2) mixing had a single adiabatic inversion pulse length of Tp = 1/MAS spin rate, a modulation depth of 8, and an adiabaticity of 2. Typically, for the WURST-2 pulse, the sweep bandwidth was 24 kHz, Tp = 0.333 ms (at a MAS spin rate of 3000 ± 10 Hz), B1(max) = 8.51 kHz, and B1(root mean square) = 4.77 kHz.
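The pulse length quoted above follows directly from the spinning rate. The sketch below assumes only the relation stated in the text (one adiabatic inversion pulse per rotor period); the function name is illustrative and does not correspond to any spectrometer software.

```python
def wurst_pulse_length_ms(mas_rate_hz: float) -> float:
    """Length of one adiabatic inversion pulse in ms, taken as one rotor period."""
    if mas_rate_hz <= 0:
        raise ValueError("MAS spin rate must be positive")
    return 1000.0 / mas_rate_hz  # period in seconds is 1/rate; convert to ms


# Nominal spin rate from the text, plus the stated +/- 10 Hz tolerance.
nominal = wurst_pulse_length_ms(3000.0)
shortest = wurst_pulse_length_ms(3010.0)
longest = wurst_pulse_length_ms(2990.0)

print(f"Tp at 3000 Hz: {nominal:.3f} ms")
print(f"Tp range for 3000 +/- 10 Hz: {shortest:.4f} to {longest:.4f} ms")
```

At 3000 Hz this reproduces the 0.333 ms value given in the text; the ±10 Hz tolerance changes the rotor period by only about a microsecond.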
GC-MS Analysis-The enantiomeric configurations of the Glc and GalNAc components of the P2 product were assigned by characterization of the but-2-yl glycosides in gas liquid chromatography-mass spectrometry (28). The derivatives were analyzed using a Hewlett-Packard chromatograph equipped with a 30-m DB-17 capillary column (180-260°C at 3.5°C/min), and spectra in the electron impact mode were obtained with a Varian Saturn II mass spectrometer.
Construction and Characterization of pglB Mutant-For construction of the pglB mutant, genes Cj1121c to Cj1126c were PCR-amplified from C. jejuni NCTC 11168 using the primers: Cj1121cF (5′-ACTCACTATTGCCATTAAGATAAGC-3′) and Cj1126cR (5′-AAAACCCTTATTTAGTTTTGTTTGC-3′). The PCR product was polished with Pfu polymerase and then ligated into pPCR-Script Amp (Stratagene) according to the manufacturer's instructions. The ligation mixture was electroporated into electrocompetent E. coli DH10B and selected for on LB S-gal agar (Sigma-Aldrich) with ampicillin. A blunt-ended kanamycin resistance cassette from pILL600 (29) was inserted into the filled-in XbaI restriction site of pglB, generating pEAp26. The orientation of the cassette was confirmed to be non-polar by sequencing with the ckanB primer (5′-CCTGGGTTTCAAGCATTAG-3′). DNA was sequenced using terminator chemistry and AmpliTaq cycle sequencing kits (Applied Biosystems) and analyzed on an Applied Biosystems 373 DNA sequencer. The mutated plasmid DNA was used for electroporation into C. jejuni NCTC 11168 (30), and the kanamycin-resistant transformants were characterized by PCR to confirm that the incoming plasmid DNA had integrated by a double cross-over event.
Proteins were extracted from C. jejuni whole cells using 0.2 M glycine at pH 2.2 (17) and dialyzed against water. Samples were analyzed by two-dimensional PAGE using 11-cm pH 3-10 ReadyStrips (Bio-Rad Laboratories) as above and pre-cast 12 × 8-cm 8-16% gradient Criterion slab gels (Bio-Rad Laboratories). Gels were stained with colloidal Coomassie Blue, photographed, and then partially destained by washing in water. Proteins were transferred for 1 h at 207 mA onto PVDF membranes using a Trans-Blot SD Semi-Dry Transfer Cell (Bio-Rad). After blocking overnight, membranes were probed with a 1:500 dilution of HS:2 serotyping serum followed by a 1:5000 dilution of goat anti-rabbit antiserum (Sigma-Aldrich) and developed with nitro blue tetrazolium chloride/5-bromo-4-chloro-3-indolyl phosphate (Roche Molecular Biochemicals).

RESULTS
Purification and Characterization of PEB3-PEB3 protein (Cj0289c) was identified in two-dimensional gels of a glycine extract by peptide mass fingerprinting, as a component of a group of spots focusing within a range of pH 9-10 (results not shown). PEB3 was purified from the extract by cation exchange chromatography, and re-fractionated on the same column, using a shallower NaCl gradient, resulting in the PEB3 appearing in three fractions (Fig. 1). SDS-PAGE analysis showed two bands, whose N-terminal sequences were determined following their transfer to a PVDF membrane. Ten cycles of sequencing identified the lower mass species as PEB3, whereas the higher mass, more abundant component, was also PEB3 with a minor sequence corresponding to PEB4 (Cj0596).
The mass spectrum and the reconstructed molecular mass profile for fraction #31 are presented in Fig. 2 (a and b). Three peaks were observed in the reconstructed mass profile. The peaks at 25,454 and 28,376 Da correspond well with the expected molecular masses of PEB3 (25,453 Da, Cj0289c) and PEB4 (28,377 Da, Cj0596), respectively, without signal peptides. To identify the protein of mass 26,861 Da, CapLC-MS/MS analysis was carried out on the tryptic digest of this fraction. All but one of the peptides identified could be assigned to PEB3 or PEB4, in accord with the N-terminal sequence data. MS/MS analysis of the unidentified ion (Fig. 3a) clearly identifies it as a glycopeptide. A fragmentation series composed of sequential losses of HexNAc (203 Da) and a single Hex (162 Da) can be observed in this spectrum. The tryptic peptide was identified as DFNVSK (residues 68-73) from PEB3. The residue mass of the oligosaccharide portion of this glycopeptide is 1406 Da, which corresponds well with the difference in the molecular weights of PEB3 and the unknown protein peak observed in Fig. 2b. Therefore, it appeared that ~50% of the PEB3 protein in this fraction was modified with a single oligosaccharide composed of 5 HexNAcs, 1 Hex, and an unusual sugar with a residue mass of 228 Da. Moreover, the MS/MS spectrum indicated that the oligosaccharide was linked to the peptide via the 228-Da sugar moiety.
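The mass bookkeeping in the paragraph above can be checked with a few lines of arithmetic. This is a minimal sketch: the average residue masses for HexNAc and Hex are standard values, and the decimal value for the 228-Da sugar is an assumption (it corresponds to a residue of formula C10H16N2O4, consistent with the later bacillosamine assignment). The dictionary keys and function name are illustrative.

```python
# Average residue masses in Da (monosaccharide minus water).
# "Sugar228" is an assumed value for a C10H16N2O4 residue; the text gives 228 Da.
RESIDUE_MASS = {"HexNAc": 203.19, "Hex": 162.14, "Sugar228": 228.25}

# Composition reported for the PEB3 glycan.
COMPOSITION = {"HexNAc": 5, "Hex": 1, "Sugar228": 1}


def glycan_residue_mass(composition, masses=RESIDUE_MASS):
    """Sum of residue masses for a glycan of the given composition."""
    return sum(masses[name] * count for name, count in composition.items())


glycan = glycan_residue_mass(COMPOSITION)
unmodified_peb3 = 25454.0   # observed peak assigned to PEB3 (Da)
observed_unknown = 26861.0  # observed peak of the unknown species (Da)

print(f"glycan residue mass: {glycan:.1f} Da")
print(f"predicted glycosylated PEB3: {unmodified_peb3 + glycan:.1f} Da")
print(f"difference from observed peak: {observed_unknown - (unmodified_peb3 + glycan):.1f} Da")
```

Under these assumptions the composition sums to about 1406 Da, and adding it to the unmodified PEB3 peak lands within 1 Da of the 26,861-Da species.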
Characterization of the Glycopeptide Linkage-The PEB3 tryptic peptide to which the oligosaccharide is attached contains sites for both N- and O-linkage, i.e. Asn and Ser. Therefore, it was necessary to carry out further experiments to determine the nature of this linkage. Previously, we have used β-elimination to remove O-linked carbohydrates from the flagellin of C. jejuni 81-176 (11). However, this procedure failed to remove the oligosaccharide in this instance. This was our first indication that the oligosaccharide is N-linked. This was confirmed by MS/MS analysis of the singly protonated fragment ion at m/z 937.0 produced by front-end collision-induced dissociation of the intact glycopeptide (Fig. 3b). This ion is composed of the tryptic peptide plus the unusual 228-Da sugar only. An ion was observed at m/z 605.1, which could only be assigned to the b3 fragment ion plus the 228-Da sugar moiety. No fragment ions were observed that would suggest linkage of the carbohydrate to the serine residue. All of this evidence strongly suggests that the oligosaccharide is linked to the peptide at Asn-70. Interestingly, this peptide contains the eukaryotic N-linkage consensus sequon, Asn-Xaa-Ser.
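The fragment-ion assignments above can be reproduced approximately from monoisotopic residue masses. The sketch below is illustrative only: the amino acid masses are standard monoisotopic values, the 228-Da sugar is assumed to be a C10H16N2O4 residue (228.111 Da), and the helper names are hypothetical. The b3 fragment covers Asp-Phe-Asn and therefore excludes the serine.

```python
# Standard monoisotopic residue masses (Da) for the residues in DFNVSK.
MONO = {
    "D": 115.02694, "F": 147.06841, "N": 114.04293,
    "V": 99.06841, "S": 87.03203, "K": 128.09496,
}
WATER = 18.01056
PROTON = 1.00728
SUGAR_228 = 228.11101  # assumed C10H16N2O4 residue; the text reports 228 Da


def peptide_mh(seq, modification=0.0):
    """m/z of the singly protonated peptide, optionally carrying a modification."""
    return sum(MONO[aa] for aa in seq) + WATER + PROTON + modification


def b_ion(seq, n, modification=0.0):
    """m/z of the singly charged b ion made of the first n residues."""
    return sum(MONO[aa] for aa in seq[:n]) + PROTON + modification


peptide = "DFNVSK"
mh_with_sugar = peptide_mh(peptide, SUGAR_228)
b3_with_sugar = b_ion(peptide, 3, SUGAR_228)

print(f"[M+H]+ of {peptide} + 228-Da sugar: {mh_with_sugar:.2f}")
print(f"b3 ({peptide[:3]}) + 228-Da sugar: {b3_with_sugar:.2f}")
```

Under these assumptions the calculated values fall within about 0.5 Da of the reported ions at m/z 937.0 and 605.1.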
Isolation and Identification of Glycoproteins-Putative glycoproteins were purified by SBA affinity chromatography from the glycine extracts of 40 g wet weight of cells. The yield of putative glycoproteins was 5 mg as estimated by UV absorbance at 280 nm. The GalNAc eluant was subjected to one- and two-dimensional PAGE (Fig. 4), and to ensure that the proteins purified in this manner possessed lectin binding properties, rather than nonspecific binding characteristics, Western blotting with an SBA/alkaline phosphatase conjugate was also carried out. Approximately 13 protein species were visualized following one-dimensional SDS-PAGE, but this number increased substantially when the product was analyzed by two-dimensional PAGE. The proteins in individual bands from one-dimensional SDS-PAGE and spots from two-dimensional PAGE were identified by mass fingerprinting and data base searching (Table I). Among the identified proteins were PEB3 (Cj0289c) and CgpA (Cj1670c) previously identified by Linton et al. (15). The vertical pattern of spots with identical pIs displayed by Cj1670c, and other proteins, likely indicates varying degrees of glycosylation, because examination of their predicted amino acid sequences, derived from the whole genome sequence of C. jejuni NCTC 11168 (13), revealed the presence of multiple potential N-linked glycosylation sites containing the sequon Asn-Xaa-Ser/Thr (Table I). In fact, MS/MS analysis of the Cj1670c-containing in-gel digest extracts indicated that three of its six N-linkage sites were occupied to varying extents (three Cj1670c glycopeptides were detected by CapLC-MS/MS: TDQNITLVAPPEFQKEEVK (residues 7-25), VLDVSVTIPEKNSSK (residues 77-91), and QESNSTANVEIPLQVAK (residues 92-108)). A single glycopeptide was also observed for Cj0114 (LSQVEENNQNIENNFTSEIQK, residues 71-91) and for Cj0200c (DSLKLEGTIAQIYDNNK, residues 1-17). Furthermore, the mass and composition of the glycan component of all these glycopeptides appear to be identical to that observed for PEB3.
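Scanning a sequence for the Asn-Xaa-Ser/Thr sequon is simple to sketch. The example below is an illustration rather than the search used in this work: the function name is hypothetical, and excluding Pro at the Xaa position is an assumption borrowed from the eukaryotic rule, which the text does not state for C. jejuni.

```python
import re

# Lookahead so that overlapping candidates (e.g. "NNST") are all reported.
# Excluding Pro at Xaa is an assumption taken from the eukaryotic sequon rule.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(sequence):
    """Return 0-based positions of Asn residues that start an N-X-S/T sequon."""
    return [match.start() for match in SEQUON.finditer(sequence.upper())]


# Glycopeptides named in the text, with the protein position of their first residue.
peptides = {
    "DFNVSK": 68,
    "TDQNITLVAPPEFQKEEVK": 7,
    "VLDVSVTIPEKNSSK": 77,
    "QESNSTANVEIPLQVAK": 92,
    "LSQVEENNQNIENNFTSEIQK": 71,
    "DSLKLEGTIAQIYDNNK": 1,
}

for seq, first_residue in peptides.items():
    sites = [first_residue + i for i in find_sequons(seq)]
    print(seq, "-> candidate Asn positions:", sites)
```

For DFNVSK the single hit maps to Asn-70, in line with the PEB3 assignment. The Cj0200c peptide gives no hit when scanned in isolation, because its Asn residues lie at the C-terminal end of the fragment and the residues that follow in the intact protein would be needed to evaluate them.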
However, certain proteins identified from the two-dimensional PAGE, notably Cj0147c, Cj0169, Cj0332c, Cj0334, Cj0638c, Cj1181c, and Cj1534c do not contain any of these specific sequons in their amino acid sequences. These proteins are either non-covalently associated with SBA-binding proteins or bind nonspecifically to the column. This conclusion is supported by the failure of these protein spots to react with the SBA/alkaline phosphatase conjugate following two-dimensional PAGE and electroblotting (Table I).
Preparation of Glycopeptides-The mixed glycoprotein sample was subjected to two rounds of Pronase digestion, and the products were separated by gel filtration on BioGel P4 (Fig. 5a). The carbohydrate-containing fractions were located by mass spectrometry and, after NMR studies (see below), the sample was re-purified on BioGel P2 (Fig. 5b). The final purified sample was composed mainly of the oligosaccharide linked to a single Asn (Fig. 5c). There was no evidence of any variation in the glycan component. The yield of glycopeptides was estimated at around 200 µg.
FIG. 4. Two-dimensional gels of the SBA affinity chromatography product. The proteins were separated on two-dimensional PAGE in two pH ranges, pH 4-7 (a) and pH 6-11 (b), and then silver-stained. The identities of the spots, shown by their Cj numbering, were determined by mass spectrometry of their tryptic digests. A full list of the identified proteins is given in Table I.
NMR Spectroscopy-With the use of selective methods, it was possible to work with the impure P4 sample. The complete resonance assignment of the sugar moiety was done with this sample for fear of loss of glycopeptide upon further purification. After a second purification using a P2 column, 25% of the glycopeptide was lost as judged by measurement of the signal-to-noise ratio for the P4- and P2-purified samples. An HMQC experiment was rerun on the P2-purified sample to confirm the assignments obtained using the P4-purified sample. From mass spectrometry results, the glycopeptide was composed of five HexNAc residues, one Hex residue, and an unknown sugar with a mass of 228 Da. The absolute configuration of the HexNAc and Hex residues was determined to be D by chemical analysis. Analysis of the 1H and 13C NMR data indicated the presence of eight anomeric protons labeled in alphabetical order (Fig. 6). A series of one-dimensional TOCSY analyses for the anomeric resonances was done for proton assignments (Fig. 7). Different mixing times were used to assign the spins within each residue. HMQC and HMBC were then used to assign the 13C resonances. The NMR assignments are given in Table II. One-dimensional NOESY experiments (Fig. 7) were used to obtain the sequence as shown (Fig. 6).
Residue g was assigned to β-D-Glcp. For the one-dimensional TOCSY of g1 with a mixing time of 144 ms (Fig. 7a), all spins up to the H6 and H6′ resonances were detected, indicative of the large J(H,H) couplings typical of β-glucopyranose. Resonance g1 also overlapped with f2, allowing detection of the f1 to f4 resonances. Residue g was terminal due to similar 13C and 1H chemical shifts for C2 to C6 with those of β-D-glucopyranose (31).
Five residues (a, c, d, e, and f) were identified as α-D-GalpNAc. A value of J(1,2) of 3.6 ± 0.2 Hz, the strong H1-H2 NOE (Fig. 7), and the J(H,H) coupling pattern, which included a small coupling to H4 (Fig. 7), showed that these units had the α-D-galactopyranosyl configuration. Although the e1 and f1 anomeric resonances could be selectively excited with a narrow bandwidth of 10 Hz, the spectra for simultaneous excitation of e1 and f1 are shown in Fig. 7b. Resonances f2 to f4 were also detected for the one-dimensional TOCSY of g1f2 (Fig. 7a). A chemical shift for C2 near 51 ppm was indicative of an acetamido group. The (C2, H4) HMBC correlations were used to assign the C2 resonances of the five GalNAc residues. The one-dimensional TOCSY-NOESY(H1, H4) was used to detect the NOE between H4 and H5 and thus assign the H5 resonances (23). The one-dimensional TOCSY experiments on the H5 resonances were then used to detect the H6 resonances (not shown). Integration of the P2-purified sample (Fig. 6b) indicated seven NAc groups, five of those corresponding to the five GalNAc residues. Comparison of the 13C chemical shifts of the GalNAc units with those of α-D-GalpNAc indicated that residues a, c, e, and f were linked at O-4 due to downfield shifts for C4 (32). Residue f had a branch point at O-3 as established by a downfield shift for C3. Residue d was terminal due to similar 13C chemical shifts for C2 to C6 with those of α-D-GalpNAc (32).
Residue b was assigned to β-D-bacillosamine (2,4-diacetamido-2,4,6-trideoxy-β-D-glucopyranose). For residues b and c, the anomeric resonances partially overlapped. For the one-dimensional TOCSY of the b1 and c1 resonances with a mixing time of 144 ms (Fig. 7e), broad resonances could be identified up to a CH3 resonance (b6) at 1.14 ppm. For the one-dimensional TOCSY for b6 at 1.14 ppm, with a mixing time of 66 ms (Fig. 7d), peaks up to b4 were observed. A series of one-dimensional TOCSY experiments with mixing times from 30 to 144 ms was performed to assign the peaks. For residue b, the broad peaks do not provide the coupling constants. However, for the one-dimensional TOCSY of b1 with a mixing time of 144 ms, resonances up to H6 could be observed, similar to the one-dimensional TOCSY for H1 of β-Glcp (residue g). Hence, such efficient transfer was indicative of large coupling constants typical of a β-glucopyranosyl configuration. The b1-b3 and b1-b5 NOE values observed in Fig. 7k were also typical of the β anomeric configuration. Chemical shifts for C2 and C4 at 55 and 58 ppm were indicative of acetamido groups, in accord with the presence of seven NAc groups in the structure (Fig. 6a). The chemical shift for C1 at 79 ppm and H1 at 5.1 ppm indicated an N-linked anomeric position similar to that found for β-GlcNAc-Asn (33). Comparison of the chemical shifts of residue b with those of 2,4-diamino-2,4,6-trideoxy-β-D-glucopyranose found in other structures indicated that Bac was linked at O-3, the only possible glycosidation site (34-36).
The absolute configuration of residue b was obtained by NOEs as previously described for another structure containing bacillosamine (34). A strong NOE observed between the CH3 resonance at 1.14 ppm (b6) and the NAc at 1.95 ppm was due to the close proximity of the CH3 group at C6 to the NAc-CH3 group at C4. Because the NAc resonance at 1.95 ppm was isolated from the other NAc resonances, it could be selectively excited. A strong NOE was observed between this NAc resonance and the α-D-GalpNAc H1 resonance of residue a, due to the a(1-3)b linkage as shown below. This NOE can only occur if residue b has the D configuration, where the NAc group at C4 of residue b is in close proximity (3 Å) to the anomeric proton of residue a. This NOE is not possible if residue b has the L configuration, because the H1a/4b-NAc interproton distances are greater than 5 Å.
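The residue assignments from the NMR analysis can be tallied to check that they agree with the composition from mass spectrometry and with the NAc integration. The sketch below is a bookkeeping aid only; the list layout and variable names are illustrative, and the per-residue NAc counts follow from the sugar identities given in the text (one per GalNAc, two for bacillosamine, none for Glc).

```python
from collections import Counter

# (residue label, assigned sugar, number of N-acetyl groups)
RESIDUES = [
    ("a", "GalNAc", 1),
    ("b", "Bac", 2),     # 2,4-diacetamido sugar carries two NAc groups
    ("c", "GalNAc", 1),
    ("d", "GalNAc", 1),
    ("e", "GalNAc", 1),
    ("f", "GalNAc", 1),
    ("g", "Glc", 0),
]

composition = Counter(sugar for _, sugar, _ in RESIDUES)
total_nac = sum(nac for _, _, nac in RESIDUES)

print("composition:", dict(composition))
print("residues:", len(RESIDUES), "NAc groups:", total_nac)
```

The tally gives five GalNAc, one Bac, and one Glc with seven NAc groups in total, matching the five HexNAc, one Hex, and one 228-Da sugar found by mass spectrometry and the seven NAc groups found by integration.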
Characterization of a pglB Mutant-A pglB mutant was constructed by cassette mutagenesis (Fig. 8). Glycine extracts of the mutant cells demonstrated dramatic changes in protein immunoreactivity by two-dimensional PAGE with Western blotting with HS:2 serotyping sera (Fig. 8, b and e). Several of the proteins that showed a change in the mobility and/or immunoreactivity on two-dimensional gels were identified by mass fingerprint analysis (Table I). The protein identifications are in agreement with those identified by SBA lectin affinity chromatography, providing further evidence that the proteins reactive with the GalNAc lectin are glycosylated by the Pgl pathway. The full set of SBA-reactive proteins was not observed in this experiment, because it was performed on whole glycine extracts and over a different pH range; hence Figs. 4 and 8 are not directly comparable. When PEB3 was purified from the pglB mutant by ion-exchange chromatography as described above, analysis by mass spectrometry showed that the protein completely lacked the glycan (Fig. 2, c and d).
To show that the pglB mutation only affected the glycosylation of the glycoproteins, analyses were performed of the lipooligosaccharide and capsular polysaccharide of the mutant. Deoxycholate-PAGE of proteinase K digests and mass spectrometry showed that the mass of the mutant LOS core was identical to that of the wild-type (results not shown). In addition, identical capsular repeats were visible in the extracts of the cells of the wild-type and the isogenic mutant on deoxycholate-PAGE. To further demonstrate that the capsule was unaltered, we examined the polysaccharide by HR-MAS NMR, which has been developed to examine capsular polysaccharide resonances on intact Campylobacter cells without the need for extensive growth and purification. The spectrum of the mutant was unchanged compared with that of the wild-type (results not shown).

DISCUSSION
By means of mass spectrometry and HR-MAS nano-NMR techniques (11,37), the structure of the Asn-linked glycan was established as a heptasaccharide comprising a linear chain of one D-Bac and five D-GalNAc residues with a D-Glc branch (Fig. 6a). The non-reducing terminal GalNAc residue would be responsible for the reactivity of C. jejuni glycoproteins with the lectin SBA (15). Bacillosamine (Bac), a diaminotrideoxyhexose, has been found in many prokaryotes (36), most recently in Pseudomonas reactans (35). An N-linkage between Bac and Asn residues has not been reported before, although it is possible that an O-linkage of Bac to Ser is present in the pilin of N. meningitidis (5,6). The mass data reported for the O-linked sugar are consistent with Bac (38), but its stereochemistry has not been reported. No evidence was found for any variation in the C. jejuni glycopeptide structure within the glycan moiety, although, as was to be expected, there were glycopeptides with various additional amino acids in the Pronase product (Fig. 5c). The PEB3 analysis showed that the substituted Asn was followed by Val-Ser, suggesting that the sequon for glycosylation was the same as in eukaryotes, Asn-Xaa-Ser/Thr.
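The stated composition (one hexose, five N-acetylhexosamines, and one bacillosamine) can be checked against the observed glycan mass of 1406 Da with a short calculation. The following Python sketch is illustrative only and is not part of the original analysis; it assumes monoisotopic atomic masses and treats each sugar as a glycosidically linked residue (free sugar minus water). The function and variable names are hypothetical.

```python
# Monoisotopic atomic masses (assumption: the paper does not state which
# mass scale was used for the reported 1406 Da value).
ATOM = {"C": 12.0, "H": 1.007825, "N": 14.003074, "O": 15.994915}

# Elemental compositions of linked residues (free sugar minus H2O).
RESIDUES = {
    "Hex": {"C": 6, "H": 10, "O": 5},             # e.g. Glc
    "HexNAc": {"C": 8, "H": 13, "N": 1, "O": 5},  # e.g. GalNAc
    "Bac": {"C": 10, "H": 16, "N": 2, "O": 4},    # 2,4-diacetamido-2,4,6-trideoxyhexose
}


def residue_mass(name):
    """Mass of one linked residue, summed from its elemental composition."""
    return sum(ATOM[element] * count for element, count in RESIDUES[name].items())


def glycan_mass(composition):
    """Mass added to the protein by a glycan with the given residue counts."""
    return sum(residue_mass(name) * count for name, count in composition.items())


# Composition reported for the C. jejuni N-linked heptasaccharide.
COMPOSITION = {"Hex": 1, "HexNAc": 5, "Bac": 1}

if __name__ == "__main__":
    print(f"Bac residue: {residue_mass('Bac'):.2f} Da")
    print(f"Glycan:      {glycan_mass(COMPOSITION):.2f} Da")
```

Under these assumptions the bacillosamine residue comes to about 228 Da and the heptasaccharide to about 1406 Da, in line with the values obtained by mass spectrometry.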
In all, 38 proteins were identified in one-dimensional and two-dimensional PAGE of the SBA affinity chromatography product, by MS of the products of in-gel tryptic digestions (Table I). Of these, 22 were found to react with SBA or immune sera in membrane blots. The remainder generally gave weak spots upon silver staining but still failed to react when blots of gels from higher protein loads were re-tested. Consistent with this behavior, all but one of the sequences of these proteins completely lacked any Asn-Xaa-Ser/Thr sequons. Although they might therefore be considered nonspecifically adsorbed proteins, the elution agent, 0.1 M GalNAc, is a mild one. Hence, it is possible that some of them form complexes with one or another of the glycoproteins. The glycoproteins are predominantly annotated as periplasmic proteins, which suggests that the glycosylation machinery is specific for this compartment. The majority of the proteins gave rise to several spots on two-dimensional PAGE, indicating multiple glycoforms, which in the case of the purified PEB3 include the non-glycosylated form. In addition, several other periplasmic proteins observed in the proteome map of C. jejuni have sequons but did not appear in the SBA extract. Hence, the glycosylation is generally incomplete. The PEB3 data showed Asn-Xaa-Ser to be the probable sequon. However, MS/MS analysis of the gel-isolated glycoprotein digests indicates that the alternative N-linkage sequon, Asn-Xaa-Thr, is being utilized in at least three other glycoproteins (Cj0114, Cj0200c, and Cj1670c). In addition, two other glycoproteins identified in the lectin-affinity isolate, Cj0415 and Cj1032, have Asn-Xaa-Thr sequons but no Asn-Xaa-Ser ones. The Thr sequon occurs relatively often in the C. jejuni glycoprotein sequences, at 43% of the putative Asn sequons.
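The sequon survey described above amounts to scanning each protein sequence for Asn-Xaa-Ser/Thr and counting the fraction of sites that end in Thr. The following Python sketch shows one way this could be done; it is not the procedure used in this work. The function names and the example sequence are hypothetical, and the exclusion of Pro at the Xaa position follows the eukaryotic convention, which is an assumption here and can be switched off.

```python
import re


def find_sequons(sequence, exclude_proline=True):
    """Return (1-based position, tripeptide) for every Asn-Xaa-Ser/Thr site.

    A lookahead is used so that overlapping sites (e.g. N-N-S-S) are all found.
    """
    middle = "[^P]" if exclude_proline else "."
    pattern = re.compile(f"(?=(N{middle}[ST]))")
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence.upper())]


def thr_fraction(sequons):
    """Fraction of sequons whose third residue is Thr (0.0 if there are none)."""
    if not sequons:
        return 0.0
    return sum(1 for _, site in sequons if site.endswith("T")) / len(sequons)


# Hypothetical test sequence, not a C. jejuni protein.
EXAMPLE = "MKNVSAANPTGNQTNNSS"

if __name__ == "__main__":
    sites = find_sequons(EXAMPLE)
    print(sites)
    print(f"Thr sequons: {thr_fraction(sites):.0%}")
```

Applied to the annotated sequences of the identified glycoproteins, a scan of this kind would give the list of putative sites from which a Ser/Thr proportion such as the 43% quoted above is derived; occupancy of any individual site still has to be established experimentally.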
That the glycan structure is an immunodominant feature was evident from blots of two-dimensional gels of the SBA-purified glycoproteins with immune sera, which were virtually identical to blots with SBA, and from the loss of protein spots in the blots of the pglB mutant (Fig. 8).
It has previously been pointed out (3,16) that the C. jejuni genome contains a gene, pglB, which shows homology to the gene encoding the STT3 subunit of the S. cerevisiae oligosaccharyltransferase (P39007; 14% identity and 26% similarity). This enzyme is responsible for the block transfer of glycans to Asn residues to form glycoproteins. Furthermore, there is an unannotated gene in a member of the Archaea, Methanobacterium thermoautotrophicum (3), that is more homologous to either of the above than they are to each other (AE000942; 15% identity and 27% similarity to STT3; 17% identity and 26% similarity to pglB). Hence, this pathway appears to be present in all three domains. The establishment here of the presence of N-linked glycans in C. jejuni suggests that PglB may also function as an oligosaccharyltransferase, and this was further confirmed by the complete loss of glycan from PEB3 in the pglB mutant. Although the genetic organization of the locus suggests many of the pgl genes are co-transcribed (3), our results suggest that the phenotype of the mutant does not reflect polar effects on downstream genes. However, PglB differs from STT3 in its putative specificity being for transfer to Asn of a heptasaccharide with a Bac reducing end group, instead of Glc3Man9GlcNAc2, although both oligosaccharides may be transferred from lipid carriers. Although PglB has not been shown to form a complex with other proteins in the manner of STT3, the two proteins have similar membrane topologies and homologous C-terminal domains (16). From the point of view of N-glycosylation, therefore, the periplasmic space would appear to be functionally similar to the endoplasmic reticulum.

FIG. 8. a, gene schematic of the general protein glycosylation locus of C. jejuni NCTC 11168. Genes that have homologues to genes in the pgl locus of Neisseria sp. are shown by gray arrows. The mutation in pglB shown below the locus was constructed using pEAp26. b-e, two-dimensional gel analysis of C. jejuni wild-type and isogenic pglB mutant. b and d, colloidal Coomassie Blue stain of two-dimensional gels before immunoblotting of wild-type and mutant, respectively. c and e, immunodetection of proteins by HS:2 serotyping sera of wild-type and mutant, respectively. The arrows indicate the proteins that showed differences in gel migration and/or immunoreactivity, which were excised for identification by mass fingerprinting. The identities of the spots are shown by their Cj numbering. The masses of the molecular mass protein markers in kDa are shown at the left.
In our hands, the structures of the LOS and capsule were unaffected by pglB mutation, as previously found (3,15). There are effects of pgl gene mutation on the immunological properties of flagellin (3), but these do not appear to be attributable to loss of the glycan from it. Although the two structural flagellin proteins have seven and eight potential glycosylation sites, respectively, only O-linked pseudaminic acid variants have been found on these proteins so far (11). The changes therefore appear to be due to some indirect effect, and it is noteworthy that the list of glycoproteins (Table I) includes one annotated as being involved in flagellin biosynthesis, Cj1565c (pflA), the paralysis factor (39).
The pgl locus of C. jejuni is very similar to one in N. meningitidis that is responsible for the O-glycosylation of pilin (5,6), as shown in Fig. 8. Although in general the annotation of carbohydrate-related genes cannot be very precise, there are some interesting connections between the annotations of genes in the loci and the glycan structures. The pilin of N. meningitidis is O-substituted at Ser 63 with Gal-β1,3-Gal-α1,3-trideoxy-diacetamidohexose (38). The configuration of the latter sugar has not been reported, but it could well be bacillosamine. In agreement with the amino acid linkages in the two structures, there is no homolog of Cj pglB in N. meningitidis, but three genes that are plausible candidates for Bac synthesis genes are present in both loci: Cj pglF (Nm pglD), a dehydratase; Cj pglE (Nm pglC), an aminotransferase; and Cj pglD (Nm pglB), an acetyltransferase (annotations from Parkhill et al. (13)). Cj pglC is a homolog of the second section of the apparently fused gene Nm pglB, which has been suggested to be a glycosyltransferase responsible for putting the first sugar, i.e. bacillosamine, onto a lipid carrier (6). This implies that both N- and O-linked structures are pre-assembled on lipids before attachment to protein, rather than the latter being assembled processively in the manner of eukaryotic O-linked glycoproteins (40).
This analysis suggests that much of the homology between the two loci is due to Bac-related genes. It is therefore surprising that N. gonorrhoeae also contains a set of these genes (41), when the reported glycan substitution on its pilin is Gal-α1,3-GlcNAc (42). The glycosyltransferases that assemble the rest of the structures are less easy to assign. The Gal-α1,3 linkage in the N. meningitidis structure has been shown to be formed by the Nm pglA product, which lies well outside the locus (5). This is homologous to the Cj pglA, which is therefore a candidate for the transferase forming the GalNAc-α1,3-Bac linkage. Nm pglA is also highly homologous to an N. gonorrhoeae gene pgtA, which could therefore be responsible for forming the Gal-α1,3-GlcNAc linkage.
The other putative glycosyltransferase genes in C. jejuni are Cj1127-1129 or pglH, pglI, and pglJ. These have relatively general annotations that do not permit assignment to GalNAc-α1,4- or Glc-β1,3-transferase functions. The lack of homology among them and there being only these three candidate genes in the locus for the remaining five glycoside linkages argue against postulating four independent GalNAc-α1,4 transferases for assembly of the remainder of the structure. One such transferase could assemble the GalNAc chain with the oligosaccharyltransferase being sensitive to the size of the assembled glycan. Finally, there is a nearby gene, Cj1130c (wlaB), an ABC transporter, which may be involved in transfer of the lipid-attached glycan across the inner membrane.
The bacterial glycoproteins that have been identified previously are predominantly single types of protein in each bacterium, such as flagellin or S-layers (1, 2), whose biological function is therefore known. In C. jejuni, we have demonstrated that multiple glycoproteins contain the same glycan, suggesting that several biological functions could potentially be affected. In addition, the glycoproteins, such as PEB3, can exist in both modified and unmodified forms. The ability to vary protein glycosylation has been described for M. tuberculosis Apa glycoproteins, which show differences in delayed-type hypersensitivity reactions and T-lymphocyte stimulation related to the extent of protein glycosylation (43,44). Other biological roles that carbohydrate modifications on bacterial glycoproteins have been shown to affect include adhesion (45), protection against proteolytic cleavage (46), solubility (47), antigenic variation (48), and protective immunity (49). In C. jejuni, alteration of the N-linked glycosylation pathway by mutation in the pgl locus has already been shown to influence adherence, invasion, colonization, and immunogenicity (3,4). We are currently investigating the biological role(s) of the glycan modifications in C. jejuni pathogenesis.