Structure and Molecular Characterization of Streptococcus pneumoniae Capsular Polysaccharide 10F by Carbohydrate Engineering in Streptococcus oralis

Although closely related at the molecular level, the capsular polysaccharide (CPS) of serotype 10F Streptococcus pneumoniae and coaggregation receptor polysaccharide (RPS) of Streptococcus oralis C104 have distinct ecological roles. CPS prevents phagocytosis of pathogenic S. pneumoniae, whereas RPS of commensal S. oralis functions as a receptor for lectin-like adhesins on other members of the dental plaque biofilm community. Results from high-resolution NMR identified the recognition region of S. oralis RPS (i.e. Galfβ1–6GalNAcβ1–3Galα) in the hexasaccharide repeat of S. pneumoniae CPS10F. The failure of this polysaccharide to support fimbriae-mediated adhesion of Actinomyces naeslundii was explained by the position of Galf, which occurred as a branch in CPS10F rather than within the linear polysaccharide chain, as in RPS. Carbohydrate engineering of S. oralis RPS with wzy from S. pneumoniae attributed formation of the Galf branch in CPS10F to the linkage of adjacent repeating units through subterminal GalNAc in Galfβ1–6GalNAcβ1–3Galα rather than through terminal Galf, as in RPS. A gene (wcrD) from serotype 10A S. pneumoniae was then used to engineer a linear surface polysaccharide in S. oralis that was identical to RPS except for the presence of a β1–3 linkage between Galf and GalNAcβ1–3Galα. This polysaccharide also failed to support adhesion of A. naeslundii, thereby establishing the essential role of β1–6-linked Galf in recognition of adjacent GalNAcβ1–3Galα in wild-type RPS. These findings, which illustrate a molecular approach for relating bacterial polysaccharide structure to function, provide insight into the possible evolution of S. oralis RPS from S. pneumoniae CPS.

The Mitis group of viridans streptococci includes the important pathogen Streptococcus pneumoniae and 12 commensal species that inhabit the upper respiratory tract of man (1,2). Streptococcus mitis and Streptococcus oralis, the two commensal species most closely related to S. pneumoniae, play an important role in colonization of tooth surfaces (3). Survival of these bacteria as pathogen or commensal depends on surface polysaccharides that have different ecological roles. Thus, capsular polysaccharides (CPS) of S. pneumoniae protect invading bacteria from phagocytic killing by the host, whereas the so-called receptor polysaccharides (RPS) of S. oralis and related oral species function as receptors for lectin-like surface adhesins of other members of the dental plaque biofilm community, such as type 2 fimbriae-bearing Actinomyces naeslundii (4). Lectin-like recognition of RPS depends on the presence of an immunorecessive host-like motif, either GalNAcβ1-3Gal (Gn) or Galβ1-3GalNAc (G), in the repeating units of different RPS structural types (5,6). The presence of adjacent β1-6-linked Galf may also be important for exposing these motifs along linear polysaccharide chains (7,8). Whereas the host-like features of RPS are critical for interbacterial adhesion, they contribute little to antigenicity, which instead depends on other features of these polysaccharides (6). RPS serotypes 1, 2, and 3, which occur in association with either Gn or G recognition motifs, are Glc- and L-Rha-containing polysaccharides, whereas RPS serotypes 4 and 5 lack Glc and L-Rha but instead contain ribitol phosphate in addition to Galf, Galp, and GalNAc. The latter polysaccharides contain Gn recognition motifs and are thus designated RPS4Gn and RPS5Gn.
The evolution of S. mitis from an ancestral S. pneumoniae-like pathogen was recently proposed from comparative taxonomic and genomic studies of these closely related species (2). Like the S. mitis genome, the nearly complete S. oralis genome is about 10% smaller than that of S. pneumoniae, thereby raising the possibility that S. oralis also evolved by genome reduction from ancestral S. pneumoniae, albeit at an earlier time than S. mitis. The possible evolution of pathogen to commensal is consistent with molecular similarities seen between the surface polysaccharides of different modern-day species, such as S. pneumoniae CPS serotype 21 with S. oralis RPS1Gn and RPS2G (9) and S. pneumoniae CPS serogroup 10 with S. oralis RPS4Gn and RPS5Gn (10). The high shared synteny and homology seen across the chromosomal loci of CPS10F and RPS4Gn (i.e. cps10F and rps4Gn) is especially striking and includes the three genes for the recognition region in RPS. Two of these genes, wefD and wefM in S. oralis (10) and homologous wciF and wcrC in S. pneumoniae (11), are associated with synthesis of GalNAcβ1-3Gal in the repeating units of CPS10F and RPS4Gn (Fig. 1). The third gene, wefE of S. oralis and homologous wcrH of S. pneumoniae, encodes putative Galf transferases that are 95% identical. However, based on the available structure of CPS10F (Fig. 1), WcrH was predicted (11) to link Galf to the Gal moiety of GalNAcβ1-3Gal, forming a branch in this polysaccharide (Fig. 1), whereas the action of WefE involved transfer of Galf to the terminal GalNAcβ moiety, forming the linear recognition region in RPS4Gn of S. oralis. Whereas these findings suggest that WefE and WcrH differ in acceptor specificity, they also may indicate an error in the CPS10F structure (Fig. 1), which, as noted elsewhere (12), is not well established.
The present study was initiated to clarify the structural and corresponding molecular relationship that exists between CPS10F of S. pneumoniae and RPS4Gn of S. oralis and thereby gain insight into the evolutionary history of these functionally distinct polysaccharides. The structure of CPS10F from S. pneumoniae 34355, the strain used to identify the cps10F locus (13), was determined by high-resolution NMR, and selected genes from this locus were characterized in S. oralis for their abilities to alter the structure and reactivity of RPS4Gn. The results associate the proposed evolution of RPS4Gn (and RPS5Gn) from a CPS10F-like ancestor with the different polymerases (Wzy) of S. oralis and S. pneumoniae. They also identify a previously unrecognized gene (wcrF) in the cps10F locus that appears to be critical for distinguishing different closely related members of CPS serogroup 10.

EXPERIMENTAL PROCEDURES
Bacterial Strains and Culture Conditions-Wild-type and mutant streptococci (Table 1) were cultured as previously described (10).
Antibodies and Immunochemical Methods-Previously described methods (6,14) were followed to prepare rabbit antiserum against S. pneumoniae 34355 by repeated intravenous injections of heat-inactivated whole bacteria and also to affinity purify anti-CPS10F IgG by 4 M MgCl₂ elution of bound antibody from partially oxidized CPS10F coupled to Affi-Gel Hz. The cross-reactive fraction of antibody referred to as anti-RPS2Gn/4Gn IgG was prepared from rabbit antiserum (R102) against RPS2Gn-producing S. gordonii 38 (6) by 4 M MgCl₂ elution from partially oxidized RPS4Gn coupled to Affi-Gel Hz. This antibody was absorbed with RPS− S. gordonii XC2 (14) to ensure RPS-specific immunoreactivity. Similarly, anti-RPS4Gn IgG was absorbed with RPS− S. oralis YC3 (10). Absorptions were performed by incubating 25 μg of IgG with ~2 × 10⁹ freshly washed bacteria in a total volume of 0.5 ml of phosphate-buffered saline containing 2 mg/ml bovine serum albumin for 2 h at 4°C, followed by centrifugation of absorption mixtures to remove added bacteria and membrane filtration (0.22-μm pore size) of antibody-containing supernatants. Dot immunoblotting (10) was performed to compare binding of each antibody at 50 ng/ml to decreasing numbers of streptococci spotted on nitrocellulose membranes.
Isolation of Polysaccharides-S. pneumoniae 34355 was cultured to late stationary phase in 18 liters of Todd Hewitt broth that was passed through a Millipore (Billerica, MA) PBTK Ultrafiltration Membrane (30-kDa pore size) to remove macromolecules prior to autoclaving. Following inoculation, cultures were incubated 3 days at 37°C; liquid phenol was then added to a final concentration of 1% to kill virulent bacteria (15), which were removed by centrifugation of the culture medium followed by membrane filtration (0.22-μm pore size) of the supernatant. Cell-free culture supernatant containing high molecular weight CPS10F was concentrated above a PBHK Ultrafiltration Membrane (100-kDa pore size) along with several volumes of water and buffer (pH 7.4, 10 mM TrisCl containing 15 mM NaCl, 2.5 mM MgCl₂, 0.5 mM CaCl₂, and 0.1% azide), which were added to wash residual media components through the membrane. The retained high molecular weight fraction was harvested from the concentrator and incubated with 15 mg of DNase I and 75 mg of RNase (both from Sigma) in a total volume of 650 ml for several hours prior to the addition of 500 mg of protease (Sigma) for overnight digestion at 37°C. The digest was chilled on ice prior to precipitation of protein in the presence of trichloroacetic acid, which was added to a final concentration of 5% (w/v). Precipitate was removed by centrifugation and membrane filtration (0.22-μm pore size) prior to neutralization of the cold filtrate by dropwise addition of concentrated Tris. Soluble material was further digested with 20,000 units of mutanolysin (Sigma) in 400 ml of 20 mM sodium/potassium phosphate buffer (pH 6.7) containing 0.5 mM MgCl₂, 0.5 mM CaCl₂, and 0.5% azide to cleave possible peptidoglycan links between CPS10F and C-polysaccharide (16). Following digestion, added protein was again removed by precipitation in the presence of 5% trichloroacetic acid, as described above.
The soluble fraction was dialyzed against water followed by 10 mM TrisCl buffer (pH 8.0) containing 100 mM NaCl and applied to a DEAE Sephacel (GE Healthcare) anion exchange column equilibrated with this buffer. The column was rinsed with starting buffer prior to elution with a linear gradient of NaCl (100-200 mM) in 10 mM Tris buffer. Column fractions were monitored by the phenol sulfuric acid reaction (17) and by immunodiffusion performed with rabbit anti-serogroup 10 serum (Statens Serum Institute). CPS10F emerged from the column as a symmetrical peak in fractions containing from 120 to 150 mM NaCl. Material from the region between 128 and 140 mM NaCl was used for structural studies. The central region of the ¹H-¹³C spectrum of this material was fully assigned to CPS10F (Fig. 2); contaminating C-polysaccharide was less than 5% based on comparisons of one-dimensional ¹H (not shown) and two-dimensional ¹H-¹³C NMR spectra with the data of Karlsson et al. (18). Comparable spectra of CPS10F samples that were not digested with mutanolysin prior to chromatography contained additional peaks that could be assigned to the form of C-polysaccharide that has two phosphocholine residues per repeating subunit (results not shown). RPS-like cell wall polysaccharides produced by plasmid-bearing mutant constructs of S. oralis C104 were isolated from mutanolysin digests of protease-treated cell walls and purified by DEAE Sephacel column chromatography as described previously (6,10).
Chemical Methods for Carbohydrate Composition and Linkage Analysis-Glycosyl composition analysis was done by gas chromatography-mass spectrometry (GC-MS) of the monosaccharide TMS derivatives at the Complex Carbohydrate Research Center (University of Georgia). After methanolysis of the polysaccharide in 4 N trichloroacetic acid at 100°C (4 h), the resulting monosaccharide methylglycosides were derivatized and analyzed on an Alltech AT1 fused silica capillary column. Linkage analysis was done by GC-MS of the partially methylated alditol acetates. Samples were permethylated with CH₃I in anhydrous NaOH in DMSO. The partially methylated polysaccharide was hydrolyzed in 2 M trichloroacetic acid at 121°C, reduced with NaBD₄, acetylated in pyridine/acetic anhydride, and analyzed on a DB-1 capillary column.
Aqueous HF was used to cleave the phosphate from selected samples prior to the analyses. Polysaccharide was treated in 48% aqueous HF at 4°C for 2 days, then evaporated in a stream of nitrogen followed by neutralization with NH₄OH. Salts were removed by Biogel P2 gel permeation chromatography, which resolved two peaks of carbohydrate. Both peaks were analyzed along with the intact polysaccharide.
Structural Characterization of Polysaccharides by NMR Spectroscopy-NMR spectra of purified polysaccharides were recorded as in previous studies (10,19) with a Bruker DRX500 with a cryoprobe and a DRX700 using standard acquisition software. Generally, a 1-5 mg sample of polysaccharide was exchanged twice by lyophilization from 3 ml of 99.96% D₂O and dissolved in 0.6 ml of 99.99% D₂O for a 5-mm sample tube or a lesser volume for a Shigemi tube. Chemical shifts were recorded at a probe temperature of 25°C or 45°C relative to internal acetone (¹H, 2.225 ppm; ¹³C, 31.07 ppm). All the data were processed using NMRPipe, NMRDraw, NMRView, and Sparky software. Double quantum filtered homonuclear correlation spectroscopy (COSY) and total correlation spectroscopy (TOCSY) along with gradient triple quantum filtered spectra (TQF-COSY) were carried out to assign the scalar coupled protons of each monosaccharide residue. ¹³C chemical shifts were assigned by heteronuclear single quantum coherence spectroscopy (HSQC) and combination HSQC-TOCSY. Inter-residue linkages were determined by nuclear Overhauser spectroscopy (NOESY) with mixing times of 100 ms and 300 ms and by long-range C-H heteronuclear multiple bond correlation spectroscopy (HMBC). All chemical shifts reported were measured from natural abundance ¹³C-¹H HSQC spectra to avoid chemical shift and lineshape distortion by ¹H strong coupling that is common in carbohydrates. A number of these spectra were acquired at relatively high resolution (5 Hz) in the indirect (¹³C) dimension by means of folding and acquisition of 2048 FIDs, which was performed to increase the information content of HSQC-TOCSY spectra. This was possible because the natural line width for these polysaccharides is relatively narrow.
Molecular Methods-Previously described methods (10) were used to prepare S. oralis YC7 and YC8 by replacement of wefL or wzy in S. oralis C104 with a nonpolar erm cassette and to prepare pJY-derived plasmids expressing genes of interest from S. pneumoniae (Table 1). The PCR primers (supplemental Table S1) used to prepare these plasmids were designed to amplify not only the S. pneumoniae gene of interest but also its upstream Shine-Dalgarno sequence.
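As a rough illustration of the 5 Hz indirect-dimension resolution quoted for the HSQC spectra in the NMR methods above, the sketch below converts a width in Hz on the ¹³C axis to ppm for the two spectrometers named there. The frequency ratio of about 0.25145 between ¹³C and ¹H is a standard approximate value, and the function names are illustrative; none of this is taken from the authors' processing scripts.

```python
# Approximate ratio of 13C to 1H Larmor frequencies (assumed constant,
# not a value reported in the paper).
GAMMA_RATIO_13C_TO_1H = 0.25145


def carbon_frequency_mhz(proton_mhz: float) -> float:
    """Approximate 13C Larmor frequency (MHz) for a given 1H frequency (MHz)."""
    return proton_mhz * GAMMA_RATIO_13C_TO_1H


def hz_to_ppm(width_hz: float, proton_mhz: float) -> float:
    """Convert a width in Hz on the 13C axis to ppm (Hz divided by MHz)."""
    return width_hz / carbon_frequency_mhz(proton_mhz)


if __name__ == "__main__":
    for proton_mhz in (500.0, 700.0):  # DRX500 and DRX700
        ppm = hz_to_ppm(5.0, proton_mhz)
        print(f"{proton_mhz:.0f} MHz: 5 Hz on the 13C axis is about {ppm:.3f} ppm")
```

On this estimate, 5 Hz corresponds to roughly 0.04 ppm at 500 MHz and somewhat less at 700 MHz, which is small compared with the ¹³C shift differences between the residues assigned in this study.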
Bacterial Adhesion-The bacteria overlay technique, performed with fluorescein isothiocyanate-labeled bacteria (20), was used to compare adhesion of A. naeslundii 12104 to different streptococci. Briefly, nitrocellulose membranes were spotted with decreasing numbers of streptococci as described for dot immunoblotting and dried overnight. Prior to use, membranes were blocked for 2 h by incubation in 20 mM phosphate-buffered (pH 7.4) saline containing 0.1 mM CaCl₂ and 5% bovine serum albumin. Blocked membranes were overlaid with 40 ml of fluorescein-labeled A. naeslundii 12104 (5 × 10⁸/ml) in the same buffer, incubated at room temperature for 2 h with occasional gentle mixing, washed with 20 mM Tris-buffered (pH 7.5) saline containing 0.05% Tween 20 to remove unattached bacteria, and scanned with a Typhoon scanner to detect adherent bacteria.

RESULTS
Structure of S. pneumoniae CPS10F-Based on the previously reported structure of CPS10F (Fig. 1), we expected that the presence of phosphate in this polysaccharide would adversely affect the results of carbohydrate composition and linkage analysis. Consequently, we analyzed both intact CPS10F and the HF-treated polysaccharide. Aqueous HF treatment effectively cleaves phosphate; however, it also cleaves furanoside and to some extent other glycosidic linkages, which may account for the recovery of HF-treated CPS10F in two fractions following Biogel P2 column chromatography (Table 2). The only components identified from intact CPS10F or either HF-treated fraction were ribitol, galactose, and galactosamine (Table 2). Linkage (methylation) analysis of intact CPS10F (Table 2) revealed two acetylated forms of ribitol (i.e. 2-Ac-1,3,4,5-tetramethyl- and 1,2,4,5-tetra-Ac-3-methyl ribitol); however, only the 2-acetylated derivative was identified from the HF-treated polysaccharide, thereby suggesting that ribitol was linked through either the 2 or 4 position. Terminal Galf was only identified from intact CPS, presumably because unsubstituted furanoside (Fig. 1) was cleaved by HF. A noteworthy finding involved the identification of 4,6-substituted GalNAc from the intact polysaccharide rather than 3,6-substituted Galp as was expected from the currently available structure of CPS10F (Fig. 1). The failure to detect 4,6-substituted GalNAc from HF-treated CPS10F suggested cleavage of a furanoside (or phosphate) from GalNAc. HF treatment of the polysaccharide also reduced the amount of 3-substituted Galp while increasing the amount of terminal galactopyranoside (t-Galp), thereby indicating substitution of Galp at the 3-position by a furanoside (or phosphate). The findings from chemical analysis were augmented by results from NMR spectroscopy.
The HSQC spectrum of the low field anomeric region (not shown) revealed five distinct signals indicating five sugar residues in the CPS10F repeating subunit. These were labeled A, B, C, D, and E for the purpose of assignment in Fig. 2 and Table 3. The letters, although arbitrary, were chosen to correspond to the notation used in previous studies of RPS4Gn from S. oralis C104 (21) and RPS5Gn from S. oralis SK144 (10). Spectra recorded at 45°C showed better resolution than those at room temperature because of greater polysaccharide mobility at the higher temperature.
For residue A, signals for H1 through H3 were readily assigned by COSY (supplemental Fig. S1) and TOCSY (supplemental Fig. S2) and the corresponding ¹³C were assigned by HSQC. The low field positions of the ¹³C shifts and the small values of J_HH identified this residue as a furanoside. A strong HMBC cross peak (supplemental Fig. S3) was noted between H1 and C4, characteristic of furanosides, and the assignments were confirmed by two-dimensional HSQC-TOCSY. Both HMBC and HSQC-TOCSY cross peaks (not shown) were observed between C4 and a pair of protons at 3.86 and 3.78 ppm that were identified as methylene protons by an edited HSQC spectrum. These signals were identified as H6′ and H6 of residue A (i.e. A-H6′ and A-H6, respectively) by TQF-COSY (not shown), which also identified A-H5 (4.355 ppm). The latter peak exhibited TOCSY signals to H6 and H6′ as well as to H4 and H3. Although the A-2 and A-4 resonances overlapped in the spectrum shown in Fig. 2, these resonances were resolved in spectra recorded at 25°C. Residue A of CPS10F was identified as β-Galf by comparison of chemical shifts to those of methylgalactofuranosides (22).
For residue B, H2 was identified by COSY spectra and TOCSY indicated that H2 and H3 were strongly coupled. HSQC-TOCSY at short mixing times (20 ms) identified B-C2 and B-C3 at 71.68 and 81.59 ppm, respectively. A narrow TOCSY cross peak between H1 and H4 (supplemental Fig. S2) identified the equatorial H4 at 4.094 ppm. An HMBC cross peak (supplemental Fig. S3) from the signal assigned by HSQC as B-C4 (69.56 ppm) placed B-H5 at 3.689 ppm, an assignment confirmed by NOE between H1 and H5. B-C5 (75.99 ppm), assigned by HSQC, exhibited HSQC-TOCSY to methylene peaks at 3.77 and 3.80 ppm, which were assigned as H6 and H6′. The H-H coupling constants and chemical shifts identified this residue as β-Galp.
For residue C, the downfield shift of the anomeric signal at 5.057, 109.14 ppm suggested a furanoside. The C-H2 signal was identified by COSY at 4.077 ppm and C-C2 by HSQC at 82.00 ppm. The strong HMBC cross peak between H1 and C4 (84.04 ppm) was also characteristic of a furanoside. C-H2 and C-H3 were strongly coupled; however, HMBC cross peaks (supplemental Fig. S3, upper panel) from H1 to C-C2 (82.00 ppm) and C-C3 at 77.72 ppm identified these resonances. HSQC-TOCSY from C-H4 (not shown) confirmed the assignments of C-C2 and C-C3. HSQC-TOCSY from C-C3 and C-C4 identified C-H5 as well as C-H6 and C-H6′ that were recognized as methylene protons in edited HSQC. Like residue A, residue C was assigned as β-Galf with the anomeric configuration based on chemical shifts (22).
For residue D, with signal D-1 at 4.679, 103.99 ppm, H2 was identified by COSY. HSQC located D-C2 at 54.20 ppm, a chemical shift characteristic of an amino sugar, which suggested that residue D was the GalNAc identified by chemical analysis (Table 2). H3 of this residue was located by homonuclear TOCSY as well as by an HSQC-TOCSY cross peak with D-C2 (data not shown). A sharp cross peak at 4.18 ppm in the same row of this spectrum, as well as in supplemental Fig. S2, was assigned as the equatorial proton, D-H4. An HMBC cross peak observed between the signal assigned by HSQC to D-C4 (77.95 ppm) and a signal at 3.887 ppm suggested assignment of the latter peak to D-H5, which was supported by NOE (supplemental Fig. S4) from D-H4. HSQC-TOCSY from D-C5 identified methylene peaks in edited HSQC for D-H6 and D-H6′. Any possible confusion resulting from the overlapping chemical shifts of D-H5 and D-H6 (3.888 ppm) was resolved in HSQC-TOCSY spectra (data not shown) run without ¹³C decoupling during acquisition. In such spectra, direct peaks were split by ¹J_CH, allowing the correct ¹H chemical shift of the relay peaks to be accurately determined. This residue was identified as β-GalNAc on the basis of the anomeric chemical shifts and the large coupling of H1 and H2.
For residue E, with the E-1 signal at 5.080, 99.18 ppm, cross peaks from H1 to H2 in COSY (supplemental Fig. S1) and between H1 and H3 in TOCSY (supplemental Fig. S2) identified the proton assignments and HSQC was used for assignment of E-C2 and E-C3. HMBC cross peaks (supplemental Fig. S3, lower panel) observed from E-C2 to both E-H3 and E-H4 located these proton resonances. This residue was identified as α-Galp by the small J_H1-H2, the large NOE between H1 and H2 (supplemental Fig. S4), and the narrow peak assigned to H4, which resulted from the small scalar coupling of this equatorial proton to H3 and H5. HMBC cross peaks were observed between E-H1 and 80.84 ppm (E-C3) and to peaks at 71.57 and 79.14 ppm, one of which was expected to be E-C5 (19). An HMBC cross peak observed between E-C4 (70.07 ppm) and 4.246 ppm was assigned as E-H5 to assist in the identification of E-C5. A signal at 62.21 ppm attributed to a methylene group by edited HSQC showed an HSQC-TOCSY cross peak with E-H5, identifying the former signal as E-C6 corresponding to E-H6 at 3.740 ppm.
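The anomeric shifts quoted above for residues C, D, and E show the pattern used as a first hint of ring form: the furanoside anomeric carbon lies well downfield of the others. The sketch below tabulates those three signals and applies a single cutoff to the ¹³C shift. The 106 ppm cutoff, the class, and the function names are illustrative assumptions, not values or procedures from this study, where the assignments also rested on coupling constants and H1-C4 HMBC cross peaks.

```python
from dataclasses import dataclass

# Illustrative cutoff (assumption): anomeric 13C shifts above this value are
# flagged as furanoside-like. It is not a threshold reported by the authors.
FURANOSIDE_C1_CUTOFF_PPM = 106.0


@dataclass(frozen=True)
class AnomericSignal:
    residue: str
    h1_ppm: float
    c1_ppm: float


# Anomeric (1H, 13C) shifts as quoted in the text for residues C, D, and E.
SIGNALS = [
    AnomericSignal("C", 5.057, 109.14),
    AnomericSignal("D", 4.679, 103.99),
    AnomericSignal("E", 5.080, 99.18),
]


def ring_form_hint(signal: AnomericSignal,
                   cutoff_ppm: float = FURANOSIDE_C1_CUTOFF_PPM) -> str:
    """Return a coarse hint based only on the anomeric 13C chemical shift."""
    return "furanoside-like" if signal.c1_ppm > cutoff_ppm else "other"


if __name__ == "__main__":
    for s in SIGNALS:
        print(f"residue {s.residue}: C1 = {s.c1_ppm} ppm -> {ring_form_hint(s)}")
```

With these inputs only residue C is flagged, in line with its assignment as β-Galf; a single cutoff cannot replace the full assignment procedure and serves only as a quick screen.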
The interpretation of the NMR data presented above, combined with the carbohydrate composition and linkage analyses (Table 2), identified five sugar residues in the repeating subunit of CPS10F. However, a number of resonances in the HSQC spectrum of this polysaccharide (Fig. 2) remained unassigned due to the presence of ribitol phosphate (Table 1), which lacks an anomeric signal. Ribitol (residue F) was identified in multiplicity-edited HSQC spectra of its methylene groups in the 1 and 5 positions. The spectrum shown in Fig. 2 contained two such negative signals (¹³C shifts of 63.95 and 65.79 ppm) that were not assigned to sugars. In ³¹P HSQC spectra (not shown), cross peaks were observed at 4.211 and 4.090 ppm, which corresponded to the negative peak at a ¹³C shift of 65.79 ppm in Fig. 2. Accurate ¹H chemical shifts for the signal assigned to C5 of ribitol were determined from the ¹³C HSQC spectrum. HSQC-TOCSY peaks between the F-C5 signal and 4.086 ppm identified F-H4 and HSQC provided a ¹³C shift of 79.14 ppm for F-C4. An HSQC-TOCSY cross peak was observed between this latter signal and 3.90 ppm, a resonance which was assigned as F-H3; F-C3 was located at 71.56 ppm by HSQC. Although this ¹³C chemical shift was identical to that of E-C5, the HSQC-TOCSY cross peak with E-H6 was easily distinguished from that of F-H2 at 3.814 ppm as well as that of F-H4. HSQC spectra located F-C2 at 72.39 ppm, a chemical shift close to that of D-C3. Nevertheless, HSQC-TOCSY cross peaks for F-H1 (3.677 ppm) and F-H1′ (3.826 ppm) in this row were readily distinguished from those arising from residue D. Our assignment of the C-H pairs summarized in Table 3 accounts for all the signals observed in the HSQC spectrum shown in Fig. 2.
Given the complete NMR assignment of S. pneumoniae CPS10F (Table 3), determination of the linkages between the residues by HMBC and NOE data was straightforward. In our new and revised structure of CPS10F (Fig. 3A), the scalar coupling between ¹³C of one residue and ¹H of the adjacent residue, indicated in red, provided unambiguous proof of linkage positions because 3-bond scalar coupling follows the chemical bonds. Also indicated, in blue in Fig. 3, are ¹H-¹H proximities derived from NOE data (supplemental Fig. S4); all of these support the proposed structure. The position of the phosphodiester linkage joining the 5-position of ribitol (F) to C5 of the β-Galf (A) was revealed by ³¹P-¹H HSQC spectra (not shown), which showed strong correlation of the ³¹P signal with A-H5 and with F-H5 and H5′.
Molecular Basis of Polysaccharide Structure and Reactivity-The three linkages that distinguish the structures of CPS10F from RPS4Gn (Fig. 3, A and B) have dramatic effects on the reactions of these polysaccharides both as antigens and as receptors for adhesion of type 2 fimbriated A. naeslundii (Fig. 4). To define these differences at the molecular level, we characterized selected genes from the cps10F or cps10A loci for their ability to alter the structure and reactivity of RPS4Gn produced by S. oralis C104.
The phosphodiester linkages between ribitol-5-phosphate and Galf in CPS10F and RPS4Gn were previously suggested to depend on wcrB in S. pneumoniae 34355 (11) and wefL in S. oralis C104 (10). To test these hypotheses, we replaced wefL in S. oralis C104 with a nonpolar erm cassette to obtain S. oralis YC7. Surprisingly, the loss of wefL reduced but did not abolish anti-RPS immunoreactivity. The end point of strain YC7 was ~10-fold lower than that of wild-type strain C104 in dot immunoblotting performed with anti-2Gn/4Gn reactive IgG (Fig. 4), which binds the Galfβ1-6GalNAc region of RPS4Gn (6), and 100-fold lower with anti-RPS4Gn reactive IgG, which binds the Galα1-1ribitol region of RPS4Gn (10). Considered together, these findings suggested reduced cell surface production of an antigenically altered polysaccharide. Although we did not isolate the YC7 polysaccharide for structural characterization, we suspected that it was a variant of RPS4Gn devoid of ribitol-5-phosphate. Formation of such a polysaccharide in the absence of S. oralis C104 and S. oralis YC1(pJY-12) (Fig. 4), the presence of Galα1-4ribitol in the polysaccharide of the latter strain abolished the reaction of anti-4Gn RPS specific IgG without increasing anti-CPS10F immunoreactivity, which was negative for both strains. Results from parallel bacteria overlay experiments performed with these strains showed that both supported adhesion of A. naeslundii.
The structure of CPS10F (Fig. 3A) suggested that the Galf branch in this polysaccharide was formed by the polymerase-dependent linkage of adjacent repeating units through subterminal GalNAc. To test this proposal, we replaced wzy in S. oralis C104 with a nonpolar erm cassette and transformed the resulting RPS− strain (S. oralis YC8) with pJY-13 expressing wzy from the cps10F locus. The yield of anionic polysaccharide from S. oralis YC8(pJY-13) was ~10-fold less than expected (i.e. 3 mg from an 18-liter culture), which increased the proportion of contaminating cell wall material, as indicated by the presence of a few unassigned signals in the HSQC spectrum of this sample (supplemental Fig. S7). However, these signals did not prevent complete assignment of the ¹H and ¹³C signals associated with the specific polysaccharide produced by S. oralis YC8(pJY-13). The signals assigned to residues A, E, and F of this polysaccharide were similar to those of S. oralis C104 RPS4Gn, while those assigned as B1 and B2, C5 and C6, and D4 and D6 were similar to the corresponding signals of S. pneumoniae CPS10F (Table 3). The inter-residue connectivities of the S. oralis YC8(pJY-13) polysaccharide established the structure shown in Fig. 3E. Results from dot immunoblotting (Fig. 4) of this construct showed reduced binding of both anti-RPS antibodies but increased binding of anti-CPS10F reactive IgG, thereby identifying the S. pneumoniae polymerase as an important molecular determinant of CPS10F immunoreactivity. Bacteria overlay experiments (Fig. 4) showed weak but significant adhesion of A. naeslundii to S. oralis YC8(pJY-13), which was confirmed by standard coaggregation assays performed with these strains. In control experiments, A. naeslundii WVU45M, which lacks type 2 fimbriae (23), failed to coaggregate with either S. oralis YC8(pJY-13) or any other strain listed in Fig. 4.
We also tested wcrH from the cps10F cluster (Fig. 5) for complementation of the wefE deletion in RPS− S. oralis YC6. NMR spectra recorded for the polysaccharide isolated from S. oralis YC6(pJY-14) were indistinguishable from those of wild-type RPS4Gn (results not shown). Likewise, the reactions of this construct and S. oralis C104 were identical in dot immunoblotting and bacteria overlay experiments (Fig. 4). Thus, wcrH and wefE appeared to represent the same gene in different species. RPS− S. oralis YC6 was then transformed with pJY-15 expressing wcrD, the gene associated (11) with the β1-3-linked Galf branch in CPS10A (Fig. 5). The chemical shifts recorded for the polysaccharide isolated from S. oralis YC6(pJY-15) (supplemental Fig. S8) were similar to those of S. oralis C104 RPS4Gn for residues A, B, E, and F but different for D2, D3, D5, and D6 (Table 3). Determination of the linkage positions by ¹H-¹³C HMBC and by ¹H-³¹P HSQC indicated that the S. oralis YC6(pJY-15) polysaccharide was identical to RPS4Gn except for the β1-3 linkage between residues C and D (Fig. 3F). In dot immunoblotting (Fig. 4), S. oralis YC6(pJY-15) was labeled with anti-RPS4Gn reactive IgG but not with anti-RPS2Gn/4Gn reactive IgG. Importantly, this construct failed to support adhesion of A. naeslundii (Fig. 4). In comparable studies, we transformed S. oralis YC6 with a plasmid expressing wcrG for the β1-6-linked Galp branch in CPS10A (Fig. 5B); however, the resulting construct did not produce an immunoreactive cell surface product (results not shown).

DISCUSSION
The structure of CPS10F (Fig. 3A), like those of previously characterized CPS10A (24) and RPS4Gn (21) (Fig. 5B), has now been established by chemical and high-resolution NMR methods. In addition, the molecular difference between these polysaccharides was defined by carbohydrate engineering of S. oralis RPS4Gn (Fig. 3) with genes from the cps10F and cps10A loci (Fig. 5A). The results of these studies provide new insight into the possible evolution of RPS from CPS and resolve inconsistencies that arose in earlier molecular studies (11) from errors in the available structure of CPS10F (Fig. 1). Based on the present findings, it is clear that the gene wcrB, which occurs in both the cps10A and cps10F loci, is associated with the same structural feature in the corresponding polysaccharides, namely, the linkage of ribitol-5-phosphate to the 5-OH of Galf, rather than 6-OH as in RPS4Gn (Fig. 5). It is also clear from the corrected structure of CPS10F (Fig. 5B) that the genes previously designated wcrC in both the cps10A and cps10F loci (11,13) are, in fact, distinct. This was firmly established by the ability of each gene to change the α1-1 linkage from Gal to ribitol in RPS4Gn. Thus, expression of wcrC from the cps10A locus in wefM-deficient S. oralis C104 (i.e. strain YC1) changed this linkage to α1-2 in our previous study (10), and expression of the corresponding gene from the cps10F locus changed the same linkage to α1-4 in the present study (Fig. 3D). In view of these findings, the gene previously designated wcrC in the cps10F locus has now been given the designation wcrF in the Bacterial Polysaccharide Gene Database (25), as indicated in Fig. 5. Finally, the present findings associate wcrH of S. pneumoniae and wefE of S. oralis with the β1-6 transfer of Galf to GalNAcβ and attribute formation of the Galf branch in CPS10F to the subsequent Wzy-dependent linkage of adjacent repeating units, which is comparable in CPS10F and CPS10A.
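The gene-to-linkage relationships summarized above can be restated as a small lookup table. The sketch below encodes only the four linkage features discussed in the text for RPS4Gn, CPS10F, and the engineered S. oralis YC6(pJY-15) polysaccharide, and lists where two polysaccharides differ. The dictionary keys, value strings, and function name are illustrative labels chosen here; they are not nomenclature from the paper or from any database.

```python
# Linkage features as described in the text; keys and strings are
# illustrative labels, not standard nomenclature.
RPS4GN = {
    "Gal-ribitol linkage": "alpha1-1",
    "Galf-GalNAc linkage": "beta1-6",
    "ribitol-5-phosphate position on Galf": "6-OH",
    "repeat units joined through": "terminal Galf",
}

CPS10F = {
    "Gal-ribitol linkage": "alpha1-4",           # associated with wcrF
    "Galf-GalNAc linkage": "beta1-6",            # associated with wcrH
    "ribitol-5-phosphate position on Galf": "5-OH",  # associated with wcrB
    "repeat units joined through": "subterminal GalNAc",  # S. pneumoniae wzy
}

# Engineered polysaccharide of S. oralis YC6(pJY-15): RPS4Gn except for the
# wcrD-dependent beta1-3 linkage between Galf and GalNAc.
YC6_PJY15 = {**RPS4GN, "Galf-GalNAc linkage": "beta1-3"}


def differing_features(a: dict, b: dict) -> dict:
    """Return {feature: (value_in_a, value_in_b)} for features that differ."""
    return {key: (a[key], b[key]) for key in a if a[key] != b[key]}


if __name__ == "__main__":
    for name, other in (("CPS10F", CPS10F), ("YC6(pJY-15)", YC6_PJY15)):
        print(f"RPS4Gn vs {name}:")
        for feature, (x, y) in differing_features(RPS4GN, other).items():
            print(f"  {feature}: {x} -> {y}")
```

Under this encoding RPS4Gn and CPS10F differ in three features, matching the three distinguishing linkages noted in Results, while the YC6(pJY-15) polysaccharide differs from RPS4Gn in one.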
The only structure-determining genes that are not considered in Fig. 5 are wciG and the closely related wefK, which encode putative O-acetyltransferases. In previous studies, we associated wefK of S. oralis 10557 with partial O-acetylation of RPS3G (26,27). However, the involvement of these genes in biosynthesis of either CPS10F or RPS4Gn remains to be established, as NMR spectra of these polysaccharides did not reveal any evidence of O-acetylation.
Adhesion of A. naeslundii to RPS4Gn-bearing S. oralis but not to CPS10F-bearing S. pneumoniae (Fig. 4) provided experimental evidence for the critical role of β1-6-linked Galf in RPS function. However, the same wzy-dependent linkage that abolished adhesion of A. naeslundii to S. pneumoniae only reduced adhesion to the surface polysaccharide of S. oralis YC8(pJY-13) (Fig. 4). Adhesion of A. naeslundii to S. oralis YC8(pJY-13) can be explained by the action of the S. pneumoniae polymerase, which is not expected to affect the recognition domain at the non-reducing end of each polysaccharide chain. Whereas functional receptors are also expected at the ends of CPS10F chains, the effective cell surface density of these may be reduced on S. pneumoniae by the presence of long CPS10F chains and increased on S. oralis YC8(pJY-13) by the presence of relatively short chains, a possibility consistent with the low yield of polysaccharide from this construct. In any case, A. naeslundii did not attach to S. oralis YC6(pJY-15) (Fig. 3F), which has a surface polysaccharide identical to wild-type RPS4Gn except for the wcrD-dependent β1-3 linkage between Galf and GalNAcβ1-3Galα. Thus, β1-6-linked Galf in RPS4Gn allowed recognition of adjacent GalNAcβ1-3Gal, whereas β1-3-linked Galf in the polysaccharide of S. oralis YC6(pJY-15) blocked recognition. Whether exposure of the host-like feature in wild-type RPS depends on a simple steric effect of the β1-6 linkage or, alternatively, on the flexibility of this linkage (8) and associated conformational effects remains to be determined.
The difference between CPS10F and RPS4Gn as receptors for interbacterial adhesion points to wzy replacement as an important step in the proposed evolution of CPS to RPS. Additional steps are suggested from the presence of ancestral cps10A-like sequences in both the cps10F and rps4Gn loci (Fig. 5A). Thus, the intergenic regions between wciF/wefD and wzx in these loci harbor similar 80 base pair sequences that resemble the 5′-end of wcrG in the cps10A locus. The loss of wcrG from a CPS10A-like serotype would eliminate the β1-6-linked Galp branch, thereby allowing acquisition of wcrH/wefE for a β1-6-linked Galf branch (Fig. 5B). Evidence for the loss of wcrD for a β1-3-linked Galf branch is also clear from the wcrD pseudogene in the cps10F locus and from sequences that closely resemble the 5′- and 3′-ends of wcrD in the intergenic region between wefM and wefD in the rps4Gn locus. The apparent loss of wcrG or wcrD and acquisition of wcrH/wefE may be explained by selective pressure from the host immune response for the emergence of new CPS serotypes (28,29). In contrast, conversion of CPS to RPS, via wzy replacement in the case of RPS4Gn, may depend on the advantage gained from RPS-mediated interactions with other commensal species, leading to the establishment of mutualism in biofilm communities (30). The time frame for the possible evolution of CPS to RPS, although largely unknown, could extend back to a common ancestor of man and the great apes, based on the host range of modern day S. pneumoniae (2). The greater homology seen between the cps10F and rps4Gn loci (Fig. 5) than between cps21 and either rps1Gn or rps2G loci (9) is consistent with the different distributions of the two RPS groups on modern day commensal species. Thus, ribitol phosphate-containing types of RPS, such as RPS4Gn, have only been identified from strains of S. oralis, a close relative of S. pneumoniae, whereas Glc and L-Rha-containing types of RPS (i.e. serotypes 1, 2, or 3) occur on strains of S. oralis and also on more distantly related species, including S. gordonii and S. sanguinis (6). Further insights into the evolution of these polysaccharides seem likely based on comparative genomic studies of different RPS-bearing commensal species.
The α1-4 linkage between Gal and ribitol of CPS10F could only be established by high resolution NMR of this polysaccharide, as the identification of 2-Ac-1,3,4,5-tetra-Me-ribitol by methylation analysis of HF-treated CPS10F (Table 2) could indicate either an α1-2 or α1-4 linkage due to the symmetry of ribitol. Based on the membership of WcrC and WefM in glycosyltransferase family 4 of the Carbohydrate Active Enzymes database (CAZy), we previously suggested (10) that the divergent N-terminal regions of these proteins formed different acceptor binding sites for ribitol-5-phosphate and the conserved C-terminal regions similar donor binding sites for UDP-Gal. Accordingly, sequence identities of ~50% seen between these proteins and presently designated WcrF (Fig. 5A) depend primarily on similarities in the C-terminal regions of these proteins. The genes for these allelic transferases are important molecular determinants of RPS serotype specificity. Thus, the structural and corresponding antigenic difference between RPS4Gn and RPS5Gn was previously shown to depend on the presence of wefM in the rps4Gn locus versus wcrC in the rps5Gn locus (10). In addition, the homologous reaction of anti-RPS4Gn specific IgG with S. oralis C104 was abolished in the present study by changing the wefM-dependent α1-1 linkage in RPS4Gn to a wcrF-dependent α1-4 linkage in the polysaccharide produced by S. oralis YC1(pJY-12) (Fig. 4). A puzzling feature of CPS serogroup 10 involved the apparent genetic identity of cps10A to cps10B and cps10C to cps10F (13). It is now clear that these closely related loci are distinguished by the presence of wcrC or wcrF. Thus, wcrC occurs in the cps10A and cps10C loci while wcrF occurs in the cps10B and cps10F loci. Based on the occurrence of these genes, we hypothesize that CPS10B is similar to CPS10A (Fig. 5B) except for the presence of an α1-4 linkage between Gal and ribitol-5-phosphate and, likewise, that CPS10C is similar to CPS10F (Fig. 5B) except for the presence of an α1-2 linkage between Gal and ribitol-5-phosphate. Comparative structural and molecular studies of these polysaccharides are currently underway to test these predictions.
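The reasoning behind these predictions can be summarized as a simple lookup: each locus carries one of two allelic transferase genes, and each gene is associated with one Gal-to-ribitol linkage. The following Python sketch is only an illustrative restatement of the associations given in the text. The dictionary names and the function `predicted_linkage` are hypothetical, linkages are written in ASCII (e.g. `alpha1-4`), and the "hypothesized" entries are predictions that remain to be tested, not results.

```python
from typing import Tuple

# Gal-to-ribitol linkage associated with each allelic transferase gene,
# as described in the text (wefM is the S. oralis rps4Gn gene).
LINKAGE_BY_GENE = {
    "wefM": "alpha1-1",
    "wcrC": "alpha1-2",
    "wcrF": "alpha1-4",  # previously designated wcrC in the cps10F locus
}

# Transferase gene reported in each CPS serogroup 10 locus.
GENE_BY_LOCUS = {
    "cps10A": "wcrC",
    "cps10B": "wcrF",
    "cps10C": "wcrC",
    "cps10F": "wcrF",
}

# Loci whose polysaccharide structure has been determined; the linkages
# for the remaining loci are hypotheses only.
STRUCTURE_DETERMINED = {"cps10A", "cps10F"}


def predicted_linkage(locus: str) -> Tuple[str, str]:
    """Return (linkage, status) for a serogroup 10 locus."""
    gene = GENE_BY_LOCUS[locus]
    status = "established" if locus in STRUCTURE_DETERMINED else "hypothesized"
    return LINKAGE_BY_GENE[gene], status


if __name__ == "__main__":
    for locus in sorted(GENE_BY_LOCUS):
        linkage, status = predicted_linkage(locus)
        print(f"{locus}: {GENE_BY_LOCUS[locus]} -> {linkage} ({status})")
```

Keeping the gene-to-linkage and locus-to-gene associations in separate tables mirrors the argument in the text: the predicted structures of CPS10B and CPS10C follow from gene content alone.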