The Structure of the Colony Migration Factor from Pathogenic Proteus mirabilis

Swarming by Proteus mirabilis is characterized by cycles of rapid and coordinated population migration across surfaces following differentiation of vegetative cells into elongated hyperflagellated swarm cells. It has been shown that surface colony expansion by the swarm cell population is facilitated by a colony migration factor (Cmf), a capsular polysaccharide (CPS) that also contributes to the uropathogenicity of P. mirabilis (Gygi, D., Rahman, M. M., Lai, H.-C., Carlson, R., Guard-Petter, J., and Hughes, C. (1995) Mol. Microbiol. 17, 1167–1175). In this report, the Cmf-CPS was extracted with hot water, precipitated with ethanol, and further purified by gel permeation chromatography. Its structure was established by glycosyl composition and linkage analyses, and by one- and two-dimensional NMR spectroscopy. The Cmf-CPS is composed of the following tetrasaccharide repeating unit:
→3)-α-D-ManpA-(1→4)-[β-D-GalpA-(1→3)]-α-D-GalpNAc-(1→3)-α-D-GlcpNAc-(1→

Proteus mirabilis is a pathogenic Gram-negative bacterium that frequently causes kidney infections, typically established by ascending colonization of the urinary tract (1-5). It exhibits a striking form of multicellular behavior, called swarming migration, in which motile vegetative rods growing on solid media differentiate into extremely elongated hyperflagellated swarm cells that undergo rapid and coordinated population migration away from the initial colony (4, 6, 7).
A transposon mutant of P. mirabilis, WT19, defective in mass migration of normally differentiated swarm cells, has been reported (8). Genetic analyses identified a lesion in a putative polysaccharide assembly locus, and electron microscopy and gel electrophoresis confirmed the specific loss of a capsular polysaccharide (CPS). This CPS, named Cmf (colony migration factor)-CPS, was suggested to facilitate population migration by enhancing medium surface fluidity and possibly influencing cell-cell interactions. The cmfA⁻ mutant was also attenuated in experimental uropathogenicity, showing greatly reduced colonization of the urinary tract (9).
Little is known about the structures of Proteus polysaccharides, and serology indicates substantial heterogeneity, with P. mirabilis and the closely related Proteus vulgaris divided into 49 O-serogroups and many smooth strains remaining unclassified (10). Analysis of the Cmf-CPS from wild-type P. mirabilis WT19 (8) indicated that it is an acidic type II molecule rich in galacturonic acid and N-acetylgalactosamine and that it can have a phospholipid anchor. This composition showed that it must be structurally different from previously reported, functionally anonymous CPSs from P. mirabilis ATCC49565 (11) and P. vulgaris CP2-96 (12). The former was reported to consist of a branched trisaccharide repeating unit of N-acetylglucosamine, N-acetylfucosamine, and glucuronic acid and the latter of a tetrasaccharide repeating unit of two glucosyl, one N-acetylgalactosaminosyl, and one glucuronosyl residue. To establish the nature of a functionally characterized Proteus polysaccharide and gain a view of the possible common structural features among the polysaccharides of this genus, we report the structure of the Cmf-CPS from P. mirabilis WT19.

EXPERIMENTAL PROCEDURES
Bacterial Strains and Growth Conditions-P. mirabilis WT19 strain U6450 (proticine type P3/S1) was isolated from a chronic urinary tract infection involving renal stone formation (13). The bacterial cells were grown overnight at 37°C on the surface of Brilliant Green agar (BBL, Becton Dickinson, Cockeysville, MD), and colony morphology was examined the next day. After this culture was observed to produce terraced, swarming colonies that extended across 50% of the agar surface, cells at the edge of the swarm colony were transferred to fresh Brilliant Green agar to confirm the absence of contaminants. The strain was identified biochemically as Proteus using a commercial package of diagnostic reagents (Enterotube II, Becton Dickinson) and was confirmed as P. mirabilis by the National Veterinary Services Laboratories (Ames, IA), although it atypically failed to ferment two sugars, maltose and xylose.
Isolation and Purification of Cmf-CPS-Bacteria were grown in BHI broth, harvested by centrifugation, and washed once in physiologically buffered saline as described by Lee and Cherniak (14). Bacterial cells (100 g, wet weight) were suspended in 300 ml of water and stirred vigorously in boiling water for 30 min. The suspension was cooled in an ice bath with stirring for 90 min. The cell residue was removed by centrifugation (10,000 × g, 30 min, 4°C), the supernatant was adjusted to 1% acetic acid, and crude Cmf-CPS was precipitated by the addition of ethanol (2.5 volumes, 24 h, −20°C). The Cmf-CPS precipitate was collected by centrifugation (10,000 × g, 30 min, 4°C), washed with ethanol, washed again with acetone, dried, dissolved in water, and lyophilized. This crude Cmf-CPS was suspended in a solution containing 6 ml of EDTA-phosphate (0.05 M Na₂HPO₄/0.005 M EDTA, pH 7.0), 3 mg of DNase (in 3 ml of 0.04 M MgCl₂), and 20 mg of RNase (in 3 ml of 0.04 M MgCl₂). This solution was incubated for 16 h at 37°C, followed by the addition of proteinase K (4 g) and a further incubation for 16 h at 37°C. The resulting solution was dialyzed against distilled water for 48 h and centrifuged at 5,000 × g for 20 min, and the supernatant was lyophilized. The final yield was 800 mg of crude Cmf-CPS. Crude Cmf-CPS was further purified by gel filtration chromatography on a Sephadex G-150 column (90 × 1.6 cm) equilibrated with a buffer solution consisting of 0.2 M NaCl, 1 mM EDTA, 50 mM Tris base, and 0.25% deoxycholic acid (DOC), pH 9.25. The content of each fraction was identified by polyacrylamide gel electrophoresis in the presence of DOC (DOC-PAGE) using 18% acrylamide (15). Gels were fixed in the presence or absence of Alcian blue (16) and silver-stained (17).
A fraction of the crude Cmf-CPS was also further purified by mild acid hydrolysis in aqueous 1% acetic acid at 100°C for 2 h. After hydrolysis, the solution was cooled and centrifuged (10,000 × g). The supernatant was extracted with diethyl ether (3 × 10 ml), and the aqueous layer was fractionated on a Sephadex G-75 column (90 × 1.6 cm). The fractions were assayed for hexose with phenol-sulfuric acid. The resulting Cmf-CPS and oligosaccharide (OS) fractions were lyophilized.
Nuclear Magnetic Resonance Spectroscopy-Samples were prepared for NMR analysis by lyophilizing twice from D₂O and were dissolved in 0.5 ml of D₂O. Spectra were recorded at 60°C. Chemical shifts are reported in ppm, using sodium 3-trimethylsilylpropanoate-d₄ (δH 0.00) and acetone (δC 31.00) as internal references. All NMR spectra were recorded on Bruker AMX-500 or DRX-600 MHz spectrometers. Two-dimensional DQF-COSY (18), TOCSY (19, 20), and NOESY (21) data sets were collected in phase-sensitive modes using the States-TPPI (22) method. In these experiments, low power presaturation was applied to the residual water (HOD) signal. Typically, data sets of 2048 (t₂) × 512 (t₁) complex points were collected with 16 scans per FID and a sweep width in both dimensions of 6 ppm. The TOCSY experiments contained MLEV17 (23) mixing sequences ranging from 60 to 320 ms, and the NOESY mixing delay was 200 ms.
A gradient HSQC (24) data set was collected using the echo/anti-echo method for pure absorption data. A data set of 2048 (t₂) × 512 (t₁) complex points was acquired, with 32 and 64 scans per FID. The sweep width was 7 ppm for proton (F₂) and 60 ppm for carbon (F₁). The GARP (25) sequence was used for ¹³C decoupling during acquisition. Data were processed typically with a Lorentzian-to-Gaussian weighting function applied to t₂ and a shifted squared sine bell function and zero-filling applied to t₁. Data shown were processed with Felix software (Molecular Simulations, Inc.).
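As a rough illustration of the acquisition parameters above, the following sketch converts a proton sweep width in ppm to Hz and estimates the t₂ acquisition time for 2048 complex points. The function names are hypothetical, and the nominal 500 and 600 MHz proton frequencies are assumed from the spectrometer model names rather than taken from exact values in the text.

```python
def sweep_width_hz(sweep_ppm: float, spectrometer_mhz: float) -> float:
    """Convert a sweep width in ppm to Hz (1 ppm corresponds to spectrometer_mhz Hz)."""
    return sweep_ppm * spectrometer_mhz


def acquisition_time_s(complex_points: int, sweep_hz: float) -> float:
    """Acquisition time when one complex point is sampled every 1/sweep_hz seconds."""
    return complex_points / sweep_hz


if __name__ == "__main__":
    # Nominal proton frequencies (assumption) for the AMX-500 and DRX-600.
    for mhz in (500.0, 600.0):
        sw = sweep_width_hz(6.0, mhz)        # 6 ppm homonuclear sweep width
        at = acquisition_time_s(2048, sw)    # 2048 complex points in t2
        print(f"{mhz:.0f} MHz: SW = {sw:.0f} Hz, t2 acquisition time = {at:.3f} s")
```

Under these assumptions the 6 ppm window corresponds to about 3000 Hz at 500 MHz and 3600 Hz at 600 MHz.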
Glycosyl Composition Analyses-Glycosyl composition analysis of the Cmf-CPS fractions (0.5 mg each) was performed by hydrolysis in 0.5 ml of 2 M trifluoroacetic acid in a closed vial at 120°C for 3 h. The glycoses in the hydrolysate were reduced with NaBH₄, acetylated, and analyzed by combined gas-liquid chromatography-mass spectrometry (GLC-MS) (26). For the determination of uronic acid, the Cmf-CPS sample (0.5 mg) was dried in vacuum and methanolyzed in 1 ml of methanolic 2 N HCl at 80°C for 16 h. The resulting methyl glycosides were either trimethylsilylated and analyzed by GLC-MS (26), or they were reduced with NaBH₄ (10 mg) in water (100 μl), acetylated, and analyzed by GLC-MS. The absolute configurations of the glycoses present were determined by GLC-MS analysis of the trimethylsilylated (S)-(+)-2-butyl and (R)-(−)-2-butyl glycosides (27, 28).
Glycosyl Linkage Analyses-Glycosyl linkage analysis was carried out using a modified NaOH method (29, 30). The sample (1 mg) was dissolved in dimethyl sulfoxide (100 μl), powdered NaOH (100 mg) was added, and the reaction mixture was stirred rapidly at room temperature for 30 min. Methylation was performed by the sequential additions of methyl iodide (10, 10, and 20 μl) at 10-min intervals. After an additional 20 min of stirring, 1 ml of 1 M sodium thiosulfate was added, and the methylated glycans were recovered in the organic phase by extraction with chloroform (3 × 0.5 ml). The permethylated product was further purified by reverse-phase chromatography using a Sep-Pak C18 cartridge (31). The methylated glycan was hydrolyzed with 2 M trifluoroacetic acid (120°C, 3 h), reduced with NaB²H₄, acetylated, and analyzed by GLC-MS (26). For the linkage determination of the uronic acid, the permethylated Cmf-CPS sample (0.5 mg) was dried in vacuum and methanolyzed in 1 ml of methanolic 2 N HCl at 80°C for 16 h.
Released, partially methylated methyl glycosides were N-acetylated with the addition of 200 μl of methanol, 20 μl of pyridine, and 20 μl of acetic anhydride at room temperature for 5 h and dried in air. The carboxyl methyl ester of uronic acid was reduced with NaB²H₄ (10 mg) in water (100 μl) for 2 h at room temperature, neutralized with acetic acid, and dried in air with the addition of methanol. The carboxyl-reduced products were hydrolyzed with 2 M trifluoroacetic acid (120°C, 3 h), reduced with NaB²H₄, acetylated, and analyzed by GLC-MS (26).
GLC-MS analyses were performed using capillary columns (length, 30 m; inner diameter, 0.32 mm) with helium as the carrier. A DB-5 column (J & W Scientific) was used for aminoglycosyl derivatives, and an SP2330 column (Supelco, Bellefonte, PA) was used for the neutral glycosyl derivatives.

RESULTS
Isolation and Purification of Cmf-CPS-Analysis of the crude Cmf-CPS by DOC-PAGE (Fig. 1) showed that it contained some contaminating LPS. The crude Cmf-CPS was separated from the contaminating LPS on a Sephadex G-150 column using buffer containing DOC. DOC-PAGE analysis of the fractions revealed a high molecular weight fraction that silver-stained only after fixing the gel in the presence of Alcian blue (Fig. 1A), a property characteristic of acidic polysaccharides (32). The low molecular weight fraction silver-stained after being fixed in the presence or absence of Alcian blue, a feature that is characteristic of LPS (32). Thus, the crude Cmf-CPS fraction contained a high molecular weight Cmf-CPS DOC and a low molecular weight LPS (a lipo-oligosaccharide, LOS DOC).
During the purification of Cmf-CPS DOC described in the previous paragraph, the sample was subjected to alkaline conditions (pH 9.25) for an extended period of time. Thus, any O-acetyl substituents, if present, would have been removed. To purify Cmf-CPS without removal of O-acetyl groups, a portion of crude Cmf-CPS (50 mg) was hydrolyzed with mild acid and purified by Sephadex G-75 column chromatography. Two fractions were obtained, the high molecular weight Cmf-CPS G75 and low molecular weight oligosaccharides (OS G75) derived from the LOS. The Cmf-CPS G75 eluted just after the void volume, and the OS G75 eluted at twice the void volume (not shown). Glycosyl compositions are given in Table I. Both Cmf-CPS DOC and Cmf-CPS G75 have very similar glycosyl compositions, i.e., mannuronic acid, galacturonic acid, N-acetylglucosamine, and N-acetylgalactosamine in a molar ratio of 1:1:1:1. The LOS DOC contains glycosyl residues characteristic of LPS core oligosaccharides, i.e., glucose, galactose, mannose, D,D-heptose, and L,D-heptose (33). Fatty acid analysis showed that neither of the Cmf-CPS fractions contained detectable fatty acyl residues, whereas the LOS DOC contained myristic, palmitic, and β-hydroxymyristic acids, a result consistent with the presence of lipid A. Determination of the absolute configurations of the glycoses present in the Cmf-CPS fractions revealed that all glycoses had the D-configuration.
Glycosyl Linkage Analysis-Glycosyl linkage analysis was performed by methylation followed by hydrolysis, reduction, and preparation of alditol acetates. Linkage analysis of the uronic acid was performed by methanolysis followed by reduction of the permethylated sample prior to hydrolysis. The glycosyl linkages of the Cmf-CPS DOC are shown in Table II. Prior to carboxyl group reduction, 3-linked N-acetylglucosamine (GlcNAc) and 3,4-linked N-acetylgalactosamine (GalNAc) were present in a 1:1 ratio. After the carboxyl group reduction, two additional glycosyl residues were observed. The mass spectra and retention times of their partially methylated alditol acetates were consistent with 3,6-linked mannosyl and 6-linked galactosyl residues with two deuterium atoms at C-6, showing that these two residues were derived from 3-linked mannuronic acid and terminally linked galacturonic acid, respectively.
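The inference in the last sentence can be sketched in a few lines: after carboxyl reduction with borodeuteride, a uronic acid appears as a hexose whose C-6 carries two deuterium atoms and an apparent "linkage" at position 6, so position 6 is removed to recover the native linkage. This is a minimal illustration only; the function and dictionary names are hypothetical and do not come from the paper.

```python
# Hexose observed after carboxyl reduction -> uronic acid it was derived from.
URONIC_NAME = {"Man": "ManA", "Gal": "GalA", "Glc": "GlcA"}


def native_residue(observed, positions, c6_dideuterated):
    """Map an observed partially methylated alditol acetate to the native residue.

    If C-6 carries two deuterium atoms, the residue was a uronic acid and the
    apparent substitution at position 6 is an artifact of carboxyl reduction.
    """
    if not c6_dideuterated:
        return observed, tuple(sorted(positions))
    remaining = tuple(sorted(p for p in positions if p != 6))
    return URONIC_NAME[observed], remaining


def describe(name, positions):
    """Render a residue as 'terminal X' or 'n,m-linked X'."""
    if not positions:
        return f"terminal {name}"
    return f"{','.join(str(p) for p in positions)}-linked {name}"


if __name__ == "__main__":
    print(describe(*native_residue("Man", (3, 6), True)))
    print(describe(*native_residue("Gal", (6,), True)))
    print(describe(*native_residue("GlcNAc", (3,), False)))
```

Applied to the residues reported above, the 3,6-linked deuterated mannosyl derivative maps back to 3-linked mannuronic acid and the 6-linked deuterated galactosyl derivative to terminal galacturonic acid.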
NMR Spectroscopic Analysis-The ¹H NMR spectrum of the Cmf-CPS DOC (Fig. 2) confirmed that galactosamine and glucosamine were N-acetylated, as indicated by a singlet at 2.05 ppm. The ¹H NMR spectrum of the Cmf-CPS G75 fraction (data not shown) was identical to that of Cmf-CPS DOC. The absence of a signal at about δ 2.10 in either Cmf-CPS fraction indicates that the Cmf-CPS does not contain O-acetyl substituents. The anomeric region shows three downfield α-anomeric proton signals (Table III). With the aid of two-dimensional COSY (spectrum not shown), TOCSY (Fig. 3A), and broad-band decoupled HSQC (spectrum not shown) analyses, most of the ¹H and ¹³C NMR signals could be assigned (Tables III and IV). The four glycosyl residues were designated A-D according to their decreasing anomeric chemical shifts.
Residue A has an anomeric signal at δ 5.37 and a J(H-1,H-2) coupling constant of 3 Hz, indicating that it is an α-linked residue. The H-1 to H-5 proton signals (Table III) for residue A were assigned from the COSY and TOCSY (Fig. 3A) spectra. A large J(H-3,H-4) coupling constant (>5 Hz) was observed for A, supporting the conclusion that it has a gluco configuration. The carbon signals (Table IV) from C-1 to C-5 for residue A were determined from the HSQC spectrum. The C-2 chemical shift of residue A is δ 54.0, typical of a nitrogen-bearing carbon. The downfield position of the C-3 carbon signal (δ 82.2) indicates that residue A is substituted at this position. Thus, A is the 3-linked N-acetylglucosaminosyl residue. The carbon chemical shifts from C-1 to C-5 for residue A (Table IV) are also similar to those previously reported for a 3-linked α-N-acetylglucosaminosyl residue (34).
a These values are for the partially methylated alditol acetates from carboxyl-reduced (CR) Cmf-CPS DOC. The mass spectra of these partially methylated alditol acetates show that they both have two deuterium atoms at C-6 and indicate that they were derived from the 3-linked mannuronic acid and from terminally linked galacturonic acid present in the Cmf-CPS.
a Mass spectrometric analysis shows that the alditol acetates of these residues have two deuterium atoms at C6, indicating that they were derived from their corresponding uronosyl residues, i.e. mannose and glucose from mannuronic and glucuronic acids, respectively.
The anomeric signal for residue B is δ 5.17 (J(H-1,H-2) not resolved), showing that it is α-linked. The proton chemical shifts (Table III) from H-1 to H-5 were assigned from the COSY (spectrum not shown) and TOCSY (Fig. 3A) spectra. A relatively small J(H-3,H-4) coupling constant (<5 Hz) for residue B indicates that it has a galacto configuration. The carbon chemical shifts (Table IV) from C-1 to C-5 for residue B were assigned from the HSQC spectrum. The C-2 chemical shift of residue B is δ 50.5, typical of a nitrogen-bearing carbon. The downfield shift of C-3 (δ 76.6) and C-4 (δ 79.1) indicates that residue B is substituted at C-3 and C-4. Glycosyl linkage analysis (Table II) showed that N-acetylgalactosamine is the only 3,4-linked aminoglycosyl residue found in the Cmf-CPS. Therefore, residue B is the 3,4-linked α-N-acetylgalactosaminosyl residue.
Residue C has an anomeric proton chemical shift at δ 5.00 (J(H-1,H-2) not resolved), indicating that it is α-linked. The proton chemical shifts (Table III) from H-1 to H-5 for residue C were assigned from the COSY and TOCSY (Fig. 3A) spectra. A small J(H-2,H-3) coupling constant (<5 Hz) indicates that residue C has a manno configuration. The carbon chemical shifts (Table IV) from C-1 to C-5 were assigned from the HSQC spectrum. The downfield chemical shift of C-3 (δ 76.5) indicates that residue C is substituted at this position. Glycosyl linkage analysis (Table II) showed only one 3-linked mannuronosyl residue in the Cmf-CPS. Thus, residue C is the 3-linked mannuronosyl residue.
The anomeric proton chemical shift for residue D is δ 4.79 (J(H-1,H-2) 7 Hz), indicating that it is β-linked. The proton chemical shifts (Table III) from H-1 to H-5 for residue D were assigned from the COSY and TOCSY (Fig. 3A) spectra. The J(H-3,H-4) coupling constant for residue D is similar to that for residue B (i.e., <5 Hz), indicating that it has a galacto configuration. The carbon chemical shifts (Table IV) from C-1 to C-5 for residue D were determined from the HSQC spectrum. There is no downfield chemical shift for any carbon of residue D, indicating that it is not substituted. The only terminally linked hexosyl residue observed in the glycosyl linkage analysis of the Cmf-CPS (Table II) was galacturonic acid. Thus, residue D is the terminally linked β-galacturonosyl residue.
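The residue assignments above can be collected in a small table, with the letters A-D recovered by sorting on decreasing anomeric chemical shift as described in the text. This is only an illustrative summary; the class and function names are hypothetical, while the shifts, anomeric configurations, and linkages are the values quoted in the preceding paragraphs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Residue:
    identity: str              # glycosyl residue
    anomeric_shift_ppm: float  # H-1 chemical shift quoted in the text
    anomer: str                # "alpha" or "beta"
    linkage: str               # substitution pattern from linkage analysis


# Deliberately listed out of order; designate() restores the A-D labels.
RESIDUES = [
    Residue("GalpA", 4.79, "beta", "terminal"),
    Residue("GlcpNAc", 5.37, "alpha", "3-linked"),
    Residue("ManpA", 5.00, "alpha", "3-linked"),
    Residue("GalpNAc", 5.17, "alpha", "3,4-linked"),
]


def designate(residues):
    """Label residues A, B, C, ... by decreasing anomeric chemical shift."""
    ordered = sorted(residues, key=lambda r: r.anomeric_shift_ppm, reverse=True)
    return {chr(ord("A") + i): r for i, r in enumerate(ordered)}


if __name__ == "__main__":
    for label, r in designate(RESIDUES).items():
        print(f"{label}: {r.linkage} {r.anomer}-{r.identity} "
              f"(H-1 at {r.anomeric_shift_ppm:.2f} ppm)")
```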
The sequence of glycosyl residues was determined from a NOESY experiment (Fig. 3B and Table V). In addition to intraresidue NOE contacts to H-2 and H-3, residue A has an NOE contact from H-1 to H-3 of residue C. Because residue C is 3-linked α-D-mannuronic acid, the sequence shown in Structure 1 was established. Residue B has a strong NOE contact from H-1 to H-3 of residue A, indicating that residue B is linked to the 3-position of residue A. Thus, the P. mirabilis WT19 Cmf-CPS consists of a tetrasaccharide repeating unit, [→3)-α-D-ManpA-(1→4)-[β-D-GalpA-(1→3)]-α-D-GalpNAc-(1→3)-α-D-GlcpNAc-(1→]n, as shown in Structure 4.
Fig. 3 legend (partial): The signals labeled in bold type on the NOESY spectrum indicate the strong inter-residue NOE contacts from which the glycosyl sequence was deduced. The mixing time for the TOCSY spectrum shown was 120 ms. Complete assignment required several TOCSY experiments with mixing times ranging from 60 to 320 ms. The spectra for these other experiments are not shown.
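The sequence deduction can be cross-checked against the linkage analysis with a short consistency test. In this sketch the A→C and B→A bonds are the NOE contacts stated in the text, whereas the C→B (position 4) and D→B (position 3) bonds are an assumption taken from the repeating unit given in the abstract, since the corresponding NOE description is not available here. All names are hypothetical.

```python
# Substituted positions per residue from glycosyl linkage analysis:
# A = 3-linked GlcNAc, B = 3,4-linked GalNAc, C = 3-linked ManA, D = terminal GalA.
LINKAGE_ANALYSIS = {"A": {3}, "B": {3, 4}, "C": {3}, "D": set()}

# (donor, acceptor, acceptor position) for each glycosidic bond.
# A->C and B->A come from the NOE contacts in the text; C->B and D->B are
# assumed from the repeating unit in the abstract.
GLYCOSIDIC_LINKS = [("A", "C", 3), ("B", "A", 3), ("C", "B", 4), ("D", "B", 3)]


def substituted_positions(links):
    """Collect, for every residue, the positions that carry another residue."""
    positions = {residue: set() for residue in LINKAGE_ANALYSIS}
    for _donor, acceptor, position in links:
        positions[acceptor].add(position)
    return positions


def backbone_cycle(links, start="C", branch="D"):
    """Follow donor -> acceptor bonds from `start` until the chain returns to it."""
    next_residue = {d: a for d, a, _ in links if d != branch}
    order, current = [start], next_residue[start]
    while current != start:
        order.append(current)
        current = next_residue[current]
    return order


if __name__ == "__main__":
    print(substituted_positions(GLYCOSIDIC_LINKS) == LINKAGE_ANALYSIS)
    print(" -> ".join(backbone_cycle(GLYCOSIDIC_LINKS)))
```

If the proposed bonds are correct, every substituted position found by methylation analysis is accounted for, and the backbone closes on itself as C, B, A with D as the side chain.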

DISCUSSION
Extracellular polysaccharides are central to bacterial survival, particularly against the immune defenses of the mammalian host. In uropathogenic P. mirabilis, the Cmf capsular polysaccharide has also been shown to facilitate the rapid multicellular migration of elongated hyperflagellated swarm cells, which correlates with the ability to establish experimental ascending colonization of the urinary tract and may be coupled to the formation of biofilms (7). Proteus migration requires close cell-cell contact, with swarm cells aligning along their long axes in multicellular rafts. The Cmf-CPS may provide a matrix for surface migration of the swarm cell rafts (35), stabilizing cell-cell contact and facilitating intercellular communication (8, 35). In addition, the acidic CPS is thought to act as a lubricant, creating a fluid environment through which Proteus can swarm by extracting water from the agar medium beneath the colony (4, 8). This latter hypothesis is supported by the observation that increased agar concentration or reduced polysaccharide biosynthesis, both of which result in a lowered agar/capsular polysaccharide osmotic activity ratio, reduce migration velocity but do not inhibit differentiation (4, 8). Surface active agents are also produced by other bacteria that undergo population migration as cell rafts, e.g., the unrelated Myxococcus produces an extracellular slime during fruiting body development (36).
Increasing the understanding of the apparently multiple roles of Proteus polysaccharides in colony expansion and virulence requires a knowledge of their structures. Including the Cmf-CPS structure of this report, three Proteus CPS structures have been described in the literature and are shown in Fig. 4. Although these three structures are quite different from one another, they have two general similarities. First, all three structures are acidic in that they all contain at least one uronosyl residue; the CPS from P. mirabilis ATCC49565 has a branching terminally linked α-D-glucuronosyl residue, the P. vulgaris CPS contains a 4-linked α-D-glucuronosyl residue, and the P. mirabilis WT19 CPS contains 3-linked α-D-mannuronosyl and branching terminally linked β-D-galacturonosyl residues. Second, all three structures have amino sugars; P. mirabilis ATCC49565 CPS contains both N-acetylglucosamine and N-acetylfucosamine, P. vulgaris CP2-96 CPS contains N-acetylgalactosamine, and P. mirabilis WT19 CPS contains both N-acetylglucosamine and N-acetylgalactosamine. Understanding the molecular basis by which these acidic CPSs facilitate Proteus swarming will require further investigation.
Fig. 3: s = strong, m = medium, and w = weak.