Structural and Functional Analysis of Campylobacter jejuni PseG

Flagella of the bacteria Helicobacter pylori and Campylobacter jejuni are important virulence determinants, whose proper assembly and function are dependent upon glycosylation at multiple positions by sialic acid-like sugars, such as 5,7-diacetamido-3,5,7,9-tetradeoxy-l-glycero-l-manno-nonulosonic acid (pseudaminic acid (Pse)). The fourth enzymatic step in the pseudaminic acid pathway, the hydrolysis of UDP-2,4-diacetamido-2,4,6-trideoxy-β-l-altropyranose to generate 2,4-diacetamido-2,4,6-trideoxy-l-altropyranose, is performed by the nucleotide sugar hydrolase PseG. To better understand the molecular basis of the PseG catalytic reaction, we have determined the crystal structures of C. jejuni PseG in apo-form and as a complex with its UDP product at 1.8 and 1.85 Å resolution, respectively. In addition, molecular modeling was utilized to provide insight into the structure of the PseG-substrate complex. This modeling identifies a His17-coordinated water molecule as the putative nucleophile and suggests the UDP-sugar substrate adopts a twist-boat conformation upon binding to PseG, enhancing the exposure of the anomeric bond cleaved and favoring inversion at C-1. Furthermore, based on these structures a series of amino acid substitution derivatives were constructed, altering residues within the active site, and each was kinetically characterized to examine its contribution to PseG catalysis. In conjunction with structural comparisons, the almost complete inactivation of the PseG H17F and H17L derivatives suggests that His17 functions as an active site base, thereby activating the nucleophilic water molecule for attack of the anomeric C–O bond of the UDP-sugar. As the PseG structure reveals similarity to those of glycosyltransferase family-28 members, in particular that of Escherichia coli MurG, these findings may also be of relevance for the mechanistic understanding of this important enzyme family.

nucleotide removal by a metal-independent C-O bond cleavage mechanism resulting in inversion of stereochemistry at C-1 of the product 2,4-diacetamido-2,4,6-trideoxy-L-altropyranose or 6-deoxy-AltdiNAc, similar to the catalytic properties of some GT-B glycosyltransferases.
Together, glycosyltransferases and glycoside hydrolases compose the majority of enzymes in both eukaryotes and prokaryotes that manipulate glycosidic bonds. Glycosyltransferases of the Leloir classification use sugar-nucleotide derivatives as glycosyl donors, resulting in transfer to acceptors such as a monosaccharide, oligosaccharide, or polysaccharide. It is therefore plausible that a "glycosyltransferase fold" in PseG has evolved to efficiently utilize water as an acceptor, instead of another carbohydrate, consequently behaving as a hydrolase (11). Based on structure, most glycosyltransferases fall into two groups, GT-A and GT-B, which exhibit different folds (17). For both families, depending on the particular enzyme, the reaction may result in either inversion or retention of stereochemistry at the donor anomeric carbon (see Fig. 2). In addition, GT-B family enzymes are metal-independent, lacking an important DXD motif present in most GT-A members. Based on the novelty of PseG and its role in H. pylori pathogenicity, we sought a greater structural and mechanistic understanding of this important enzyme.

Construction of
supplemental Table S1. The newly constructed plasmids were sequenced as described previously (13).
For crystallization, PseGHis₆ was expressed in 1 liter of TB medium containing 30 µg ml⁻¹ kanamycin. The culture was first grown for 2 h at 37°C, followed by induction with 100 µM isopropyl β-D-thiogalactopyranoside for an overnight period at room temperature. The same induction protocol was followed to express the selenomethionine (SeMet)-containing protein in the Escherichia coli metA⁻ auxotroph DL41(DE3) in LeMaster medium (19) containing 25 mg liter⁻¹ L-SeMet. Cells were harvested by centrifugation (4000 × g, 4°C, 30 min) and resuspended in lysis buffer containing 50 mM Tris-Cl, pH 7.5, 400 mM NaCl, 1% (v/v) Triton X-100, 5% (v/v) glycerol, 1 mM DTT, 20 mM imidazole, and a mixture of protease inhibitors (10 µM leupeptin, 0.5 mM benzamidine, and 0.1 mM Pefabloc™). Cells were disrupted by sonication for a total of 2 min with alternating cycles of 15 s on and 15 s off. The lysate was clarified by ultracentrifugation (100,000 × g, 4°C, 45 min), and the protein supernatant was incubated with 1 ml of pre-equilibrated nickel-nitrilotriacetic acid-agarose (Qiagen, Mississauga, Ontario, Canada) for 1 h with gentle shaking. Nickel-nitrilotriacetic acid beads were washed first with buffer (50 mM Tris-Cl, pH 7.5, 1 M NaCl, 5% (v/v) glycerol, 1 mM DTT, 20 mM imidazole, pH 7.5) followed by a similar buffer containing 0.4 M NaCl and 40 mM imidazole to remove nonspecifically bound proteins. Elution was performed in the same buffer containing 250 mM imidazole, pH 7.5. Purity of the eluted protein was assessed by SDS-PAGE. A total of 9 mg of pure native protein and 5 mg of SeMet-labeled protein was obtained using this procedure. The protein buffer was changed to 20 mM Tris-Cl, pH 7.5, 0.2 M NaCl, 5% (v/v) glycerol, 5 mM DTT, 1 mM MgCl₂ prior to concentration by ultrafiltration to a final concentration of 9 mg ml⁻¹ as determined by the method of Bradford (20).
Enzymatic Synthesis and Purification of PseG Substrate-Large scale enzymatic synthesis of UDP-6-deoxy-AltdiNAc was accomplished using a 45-ml reaction containing 1 mM UDP-GlcNAc, 1 mM pyridoxal-5′-phosphate, 10 mM L-glutamate, 1.5 mM acetyl-CoA, and ~10 mg each of H. pylori PseBHis₆, His₆PseC, and His₆PseH, similar to methods described previously (13). After passage through an Amicon Ultra-15 filter (10,000 molecular weight cutoff), the sugar preparation was lyophilized and desalted/purified using a Superdex Peptide 10/300 GL column (Amersham Biosciences) in 25 mM ammonium bicarbonate, pH 7.9. For kinetic analysis, the sample was subjected to a final polishing step using a Mono Q 4.6/100 PE column (Amersham Biosciences) with a 25-500 mM ammonium bicarbonate gradient, pH 7.9, over 20 column volumes. Purified fractions were pooled and lyophilized. Finally, the substrate was quantified using the molar extinction coefficient of UDP (ε₂₆₀ = 10,000).
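The quantification step can be illustrated with a minimal Beer-Lambert sketch. It assumes the extinction coefficient of 10,000 is expressed in M⁻¹ cm⁻¹ and that absorbance is read in a 1-cm cuvette; the function name, the absorbance reading, and the dilution factor are hypothetical and are not taken from the experiments described here.

```python
def concentration_from_a260(absorbance, epsilon=10_000.0, path_cm=1.0, dilution=1.0):
    """Return the molar concentration of the undiluted sample.

    Beer-Lambert law: A = epsilon * c * l, so c = A / (epsilon * l).
    epsilon is assumed to be in M^-1 cm^-1 (assumption, units not given in the text).
    """
    if epsilon <= 0 or path_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    return absorbance * dilution / (epsilon * path_cm)


# Hypothetical reading: A260 = 0.5 measured on a 20-fold diluted sample.
stock_molar = concentration_from_a260(0.5, dilution=20)
print(f"UDP-sugar stock: {stock_molar * 1e3:.2f} mM")
```

Because the uracil chromophore dominates absorbance at 260 nm, the same coefficient is applied to the UDP-sugar as to free UDP.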
CD Spectral Analysis of PseG and Substitution Derivatives-CD spectra were recorded on a Jasco J-600 spectropolarimeter (Utrecht, Netherlands) with a 0.05-cm quartz cuvette at ambient temperature (20-22°C). The instrument was calibrated with ammonium (+)-10-camphorsulfonate. Wavelengths from 190 to 260 nm were measured with a 0.2-nm step resolution, a 2.0-nm bandwidth, and 20-nm/min scan speed. All CD experiments were performed with protein samples in a buffer consisting of 25 mM sodium phosphate, pH 7.3, 25 mM NaCl. The final concentrations of the proteins were 0.098-0.104 mg/ml, calculated from the absorption at 280 nm. Four scans were collected and averaged, and the data were smoothed using the Jasco software. Data are expressed per peptide bond as mean residue ellipticity [θ] × 10⁻³ (degrees cm² dmol⁻¹).

Figure caption: A, PseG catalyzes the removal of UDP from UDP-2,4-diacetamido-2,4,6-trideoxy-β-L-Alt or UDP-6-deoxy-AltdiNAc. B, UDP-GlcNAc hydrolyzing 2-epimerase NeuC catalyzes the removal of UDP and the formation of ManNAc from UDP-GlcNAc. C, GlcNAc transferase MurG catalyzes the formation of undecaprenyl-phosphorylmuramyl-pentapeptide-GlcNAc via formation of a glycosidic linkage between UDP-GlcNAc and undecaprenylphosphoryl-muramyl-pentapeptide. R represents the phosphoryl-undecaprenyl moiety, with the pentapeptide having the specific sequence L-Ala-D-γGlu-L-Lys-D-Ala-D-Ala. Both A and C activities result in an initial inversion of stereochemistry at C-1 for the donor substrate. In contrast, the activity for B results in an initial retention of C-1 stereochemistry. Enzymatically altered anomeric bonds are indicated in red.

JULY 31, 2009 • VOLUME 284 • NUMBER 31

JOURNAL OF BIOLOGICAL CHEMISTRY 20991
Kinetic Measurements-For routine analysis of enzyme activity, PseGHis₆ or respective substitution derivatives were incubated at 37°C in 25 mM sodium phosphate buffer, pH 7.3, 25 mM NaCl, with UDP-6-deoxy-AltdiNAc, and reactions were analyzed by capillary electrophoresis as described earlier (14). For kinetic analysis, substrate conversion was measured using a continuous coupled assay for UDP formation by monitoring NADH oxidation (21). Specifically, enzyme reactions (150 µl each) were performed in a 96-well microplate at 37°C with 25 mM sodium phosphate, pH 7.3, 25 mM NaCl, 5 mM MgCl₂, 2 mM phosphoenolpyruvate, 0.2 mM NADH, 12 units each of lactate dehydrogenase and pyruvate kinase, and variable UDP-6-deoxy-AltdiNAc concentrations (0.03-2 mM). Reactions were initiated by the addition of PseGHis₆ derivatives (1.5-90 nM final concentration) as required. Substrate conversion was extrapolated from A₃₄₀ versus [NADH] standard curves using a microplate reader (Bio-Tek Instruments Inc., Winooski, VT), where initial [NADH] was determined using the molar extinction coefficient for NADH (ε₃₄₀ = 6220) and a quartz cuvette. Kinetic constants were calculated using initial velocities and Eadie-Hofstee plots with the program GraphPad Prism 3.
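The Eadie-Hofstee analysis mentioned above can be sketched as follows. The transform rearranges the Michaelis-Menten equation to v = Vmax - Km·(v/[S]), so a straight-line fit of v against v/[S] gives Vmax as the intercept and -Km as the slope. This is only an illustrative least-squares sketch, not the GraphPad Prism procedure; the function names are hypothetical and the data are synthetic, generated from assumed constants rather than measured.

```python
def michaelis_menten(s, vmax, km):
    """Initial velocity predicted by the Michaelis-Menten equation."""
    return vmax * s / (km + s)


def eadie_hofstee_fit(substrate, velocity):
    """Fit v = Vmax - Km * (v/[S]) by ordinary least squares.

    Returns (vmax, km). x is v/[S], y is v; slope = -Km, intercept = Vmax.
    """
    xs = [v / s for s, v in zip(substrate, velocity)]
    ys = list(velocity)
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, -slope


# Synthetic, noise-free data over the 0.03-2 mM substrate range used in the assay.
# Vmax = 1.0 (arbitrary units) and Km = 0.2 mM are assumed values for illustration.
substrate_mm = [0.03, 0.06, 0.125, 0.25, 0.5, 1.0, 2.0]
rates = [michaelis_menten(s, vmax=1.0, km=0.2) for s in substrate_mm]
vmax_fit, km_fit = eadie_hofstee_fit(substrate_mm, rates)
print(f"Vmax = {vmax_fit:.3f}, Km = {km_fit:.3f} mM")
```

With real data the velocity appears on both axes, so measurement error affects both coordinates; that is a known limitation of this linearization.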
Characterization of PseGHis₆-Dynamic light scattering was performed using a DynaPro plate reader (Wyatt Technologies, Santa Barbara, CA) using purified PseG at a concentration of 14 mg ml⁻¹ in 20 mM Tris-Cl, pH 7.5, 0.2 M NaCl, 5% (v/v) glycerol, 5 mM DTT, 1 mM MgCl₂. The mass of native and SeMet-substituted PseGHis₆ was determined by electrospray ionization mass spectrometry using an Agilent 1100 Series LC-MSD mass spectrometer (Agilent Technologies, Mississauga, Ontario, Canada) and analyzed using Agilent Chemstation version A.09.01 software.
Crystallization-Initial crystallization screening was performed in 96-well sitting drop plates using crystallization screens from Hampton Research (Aliso Viejo, CA) with a Hydra II crystallization robot (Thermo Fisher Scientific, Hudson, NH). Well-diffracting crystals were obtained from two reservoir conditions (a and b), following optimization from initial screens, using the hanging drop vapor diffusion method at 20°C as follows: condition a, 0.1 M BisTris, pH 6.5, 0.2 M ammonium sulfate, 23% (v/v) PEG 3350; condition b, 0.1 M BisTris, pH 6.5, 50 mM CaCl₂, 25% (v/v) PEG monomethyl ether 550. Crystals of SeMet-substituted protein were obtained under the same conditions but by increasing the precipitant concentration by 2-3% (v/v). PseGHis₆ crystals obtained by co-crystallization in the presence of 1 mM UDP-6-deoxy-AltdiNAc, 5 mM UDP-glucosamine, or 5 mM UDP-N-acetylglucosamine were obtained under the same conditions. Crystals obtained with UDP-N-acetylglucosamine showed electron density only for UDP and were used to determine the co-crystal structure. For cryoprotection prior to x-ray diffraction data collection, crystals from condition a were transferred to reservoir solution supplemented with 15% (v/v) glycerol, and crystals from condition b were transferred to reservoir solution alone; crystals were then picked up with a nylon loop and flash-cooled directly in the N₂ cold stream at 100 K (Oxford Cryosystems, Oxford, UK). Apo- and UDP-bound PseGHis₆ crystals belong to space group P4₁, with the apo-form having unit cell dimensions of a = b = 93.6, c = 42.7 and Z = 4, and a Matthews coefficient of 2.9 Å³ Da⁻¹ (22), whereas UDP-bound PseGHis₆ crystals have a unit cell of a = b = 94.4, c = 43.6 and Z = 4.
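The reported Matthews coefficient can be checked with a short calculation. The sketch below assumes that the unit cell dimensions are in Å, that the cell is tetragonal (volume a·b·c, as implied by space group P4₁), and that the molecular mass is the calculated 32,383 Da quoted for PseGHis₆ in the mass spectrometry results. The solvent-content estimate uses the conventional approximation 1 - 1.23/V_M; the function names are hypothetical.

```python
def matthews_coefficient(a, b, c, z, mass_da):
    """Matthews coefficient V_M in A^3 per Da for an orthogonal unit cell.

    V_M = V_cell / (Z * M), where Z is the number of molecules per unit cell.
    """
    cell_volume = a * b * c  # valid for cells with all angles at 90 degrees
    return cell_volume / (z * mass_da)


def solvent_fraction(vm):
    """Approximate solvent fraction from V_M (standard 1.23 A^3/Da estimate)."""
    return 1.0 - 1.23 / vm


# Apo-form cell from the text; mass is the calculated PseGHis6 mass (assumed input).
vm = matthews_coefficient(93.6, 93.6, 42.7, z=4, mass_da=32_383)
print(f"V_M = {vm:.2f} A^3/Da, solvent ~ {solvent_fraction(vm) * 100:.0f}%")
```

The result rounds to 2.9 Å³ Da⁻¹, consistent with one molecule per asymmetric unit in P4₁.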
Structure Determination and Refinement-X-ray diffraction data were collected at beamlines X12B and X29, National Synchrotron Light Source, Brookhaven National Laboratory, using a Quantum-4 CCD detector (X12B) or Quantum-315 detector (X29) (Area Detector Systems Corp., Poway, CA). For phase determination, data collected at the selenium anomalous peak were used in the selenium single-wavelength anomalous dispersion method (19). All three selenium sites present in the asymmetric unit were identified using SHELXD (23) with cross-correlation values of 42.3 and 25.1 for all and weak reflections, respectively. Initial phases were calculated using SHELXE (23), and the resulting maps were used directly for automated model building with ARP/wARP (24). The resulting model was 70% complete, with remaining residues fit manually using the program COOT (25). The final model of PseGHis₆ was refined using Refmac5 (26) to a final R_work and R_free of 0.193 and 0.237, respectively, for all reflections to 1.8 Å. No σ-cutoff was used in the refinement. The apo-PseGHis₆ structure was used as the starting model for molecular replacement, phase calculation, and refinement with data sets obtained for PseGHis₆ co-crystallized with ligands. In all cases, the resulting electron density maps clearly indicated the presence of bound UDP, which was modeled using the program COOT (25). The PseGHis₆-UDP complex was refined to 1.85 Å resolution with a final R_work and R_free of 0.177 and 0.211, respectively. Final refinement statistics for both models are given in Table 1. Coordinates for apo- and UDP-bound PseGHis₆ have been deposited in the RCSB Protein Data Bank (27) with accession codes 3HBM and 3HBN, respectively.
Molecular Modeling of Free and Bound Substrate-To facilitate molecular modeling of PseG-bound substrate, several energetically favorable free-substrate conformations (O_Gly-methylated) were calculated. All "free" structures were first optimized using the AM1 Hamiltonian as implemented in Hyperchem 6.0, starting from idealized ring structures, including several boat structures. Three different ring structures, ⁴C₁, ¹C₄, and ⁵S₁, were found. Each structure then had the hydroxyl rotamers optimized by using the conformational search routine available in Hyperchem. These three structures were then optimized using the Amsterdam Density Functional (ADF-2006) DFT QM program. Optimized structures were checked by frequency calculations (analytical second derivatives) for true convergence. Structures that exhibited negative frequencies were re-optimized using the coefficients of the negative modes to adjust the Cartesian coordinates and the optimization, with the frequency calculations repeated until all modes were positive. All structures were optimized as internal coordinates using the triple plus basis set with no frozen core. Keywords used for the DFT calculations were LDA, SCF, VWN, METAGGA, and HFEXCHANGE (see ADF manual). Full solvation was considered, parameterized to water. For further free conformer (⁴C₁, ¹C₄, and ⁵S₁) details, see supplemental Fig. S3.
Substrate docking was performed on the UDP-bound PseGHis₆ crystal structure, after removal of UDP, C-terminal fusion tag extension Leu-Glu-His₆, and water molecules. The protonation state of PseG was assigned with H++ (28). The UDP-6-deoxy-AltdiNAc substrate was built in Sybyl 6.9 (Tripos, Inc., St. Louis, MO), starting from the PseG-bound conformation of UDP to which the 6-deoxy-AltdiNAc sugar moiety was appropriately connected. Three substrate conformations were separately built, corresponding to the ¹C₄ chair, ⁴C₁ chair, and ⁵S₁ twist-boat sugar conformations calculated for the free O_Gly-methylated sugar as described earlier. Flexible docking of the sugar part of the substrate was carried out by Monte Carlo minimization (MCM) conformational sampling (29) applied to protein-ligand binding (30). The crystallographic binding mode of the UDP product was adopted for the UDP part of the substrate. Three independent docking runs of 10,000 MCM cycles were carried out, initiated from the expected ¹C₄, ⁴C₁, and ⁵S₁ conformations for the 6-deoxy-AltdiNAc sugar moiety. Conformational sampling included all acyclic and cyclic rotatable bonds starting from the P-α atom and covered the pyrophosphate and 6-deoxy-AltdiNAc moieties. In each MCM cycle, at least two dihedral angles were simultaneously and randomly perturbed. The O-5-C-1 endocyclic sugar bond was used for pyranose ring opening and reforming during sampling. The minimized energy of the complex formed the basis of accepting or rejecting the resulting conformation according to the Metropolis probability criterion at 300 K. Energy minimization was carried out with the AMBER force field (31, 32) and a distance-dependent (4R) dielectric constant. Ligand partial charges were calculated with Molcharge (OpenEye, Inc., Santa Fe, NM) based on the AM1-BCC method (33) and the ¹C₄ sugar conformation of the substrate.
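The accept/reject step of each MCM cycle follows the Metropolis criterion, which can be sketched as below. This is a generic illustration rather than the docking code used in this work: the function names are hypothetical, energies are assumed to be in kcal/mol, and the gas constant is taken as 0.0019872041 kcal mol⁻¹ K⁻¹.

```python
import math
import random

K_B_KCAL = 0.0019872041  # gas constant in kcal mol^-1 K^-1


def acceptance_probability(delta_e, temperature=300.0):
    """Metropolis acceptance probability for an energy change delta_e (kcal/mol)."""
    if delta_e <= 0.0:
        return 1.0  # downhill or neutral moves are always accepted
    return math.exp(-delta_e / (K_B_KCAL * temperature))


def metropolis_accept(e_old, e_new, temperature=300.0, rng=random.random):
    """Decide whether to keep the newly minimized conformation.

    rng is any callable returning a uniform number in [0, 1); it is injectable
    so the decision can be made deterministic in tests.
    """
    return rng() < acceptance_probability(e_new - e_old, temperature)


# Hypothetical minimized energies for two successive conformations.
print(acceptance_probability(1.0))          # uphill by 1 kcal/mol at 300 K
print(metropolis_accept(-250.0, -252.5))    # downhill move, always accepted
```

At 300 K the thermal energy is roughly 0.6 kcal/mol, so uphill moves of a few kcal/mol are accepted only occasionally, which lets the search escape shallow local minima.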
The protein region allowed to move during the energy minimization step included 36 residues around the putative sugar-binding pocket (Ser10

Separate hybrid quantum mechanics (QM)/molecular mechanics (MM) molecular dynamics (MD) simulations were carried out for PseG complexes with the substrate in ¹C₄ or ⁵S₁ conformations. These simulations were initiated from PseG-bound substrate geometries generated by MCM flexible docking runs as described above but without internal sampling of the sugar ring to prevent the ¹C₄ to ⁵S₁ conversion. Before starting QM/MM MD, 3 ns of classical MD simulations were performed to equilibrate the protein and allow the substrate to accommodate in the binding cavity, using the AMBER force field (31, 32) in AMBER 9 (34). AM1-BCC partial charges (33) were assigned to the substrate. Each complex was solvated in a truncated octahedron TIP3P water box (35), and electroneutrality was achieved by adding Na⁺ counterions. Applying harmonic restraints with force constants of 10 kcal mol⁻¹ Å⁻² to all solute atoms, the system was energy-minimized first, followed by heating from 100 to 300 K over 25 ps in the canonical ensemble (constant number of particles, volume, and temperature, NVT) and by equilibrating to adjust the solvent density under 1 atm pressure over 25 ps in the isothermal-isobaric ensemble (constant number of particles, pressure, and temperature, NPT) simulation. The harmonic restraints were then gradually reduced to 0 with four rounds of 25-ps simulations. After an additional 25-ps simulation, a 3-ns production NPT run was obtained with snapshots collected every 1 ps, using a 2-fs time step and 9-Å nonbonded cutoff. The Particle Mesh Ewald method (36) was used to treat long range electrostatic interactions, and bond lengths involving bonds to hydrogen atoms were constrained by SHAKE (37). The QM/MM MD simulation started from the last snapshot of the classical MD simulation and was carried out in AMBER 10 (38).
The entire substrate molecule was included in the QM region described with the PM3 Hamiltonian (39), whereas the solvated protein was treated at the molecular mechanics level using the AMBER force field. The SHAKE option was lifted, and the MM correction for peptide linkage was applied for the QM region. The QM/MM simulations were run for 120 ps with a time step of 1 fs and snapshots collected every 50 fs.
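As a bookkeeping aid, the number of integration steps and saved snapshots implied by the stated run lengths can be worked out as in the sketch below. It uses only the run length, time step, and save interval quoted in the text for the classical production run and the QM/MM run; the helper name is hypothetical, and integer femtoseconds are used to avoid floating-point rounding.

```python
FS_PER_PS = 1_000
FS_PER_NS = 1_000_000


def md_counts(length_fs, timestep_fs, save_interval_fs):
    """Return (integration steps, saved snapshots) for one MD run.

    All arguments are integer femtoseconds; the save interval and run length
    must both be whole multiples of the time step.
    """
    if length_fs % timestep_fs or save_interval_fs % timestep_fs:
        raise ValueError("run length and save interval must be multiples of the time step")
    return length_fs // timestep_fs, length_fs // save_interval_fs


# Classical production run: 3 ns, 2-fs time step, snapshots every 1 ps.
classical = md_counts(3 * FS_PER_NS, 2, 1 * FS_PER_PS)
# QM/MM run: 120 ps, 1-fs time step, snapshots every 50 fs.
qmmm = md_counts(120 * FS_PER_PS, 1, 50)
print(classical, qmmm)
```

The shorter time step in the QM/MM stage is consistent with SHAKE being lifted there, since unconstrained bonds to hydrogen vibrate too quickly for a 2-fs step.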
Standard analyses of MD trajectories were carried out with PTRAJ in AMBER 10. Water O and H occupancy plots and solute (PseG-substrate complex) average structures were generated from 120- and 90-ps QM/MM MD trajectories for the ⁵S₁ and ¹C₄ conformations, respectively. Each average structure was first fitted with a water molecule that was added into the water density identified at the putative catalytic position, and then energy-minimized in vacuum with the hybrid PM3-AMBER potential for 5000 steps, including the catalytic water along with the substrate in the QM region. PseG-substrate binding affinities were estimated with SIETRAJ (40) by calculating the average solvated interaction energy (SIE) (41). Relative protein strain energies were estimated by calculating the average solvated conformational energy of substrate-bound PseG. The SIE and solvated conformational energy were averaged over 160 snapshots at 0.5-ps intervals from the last 80 ps of QM/MM MD trajectories (see supplemental material for further details on SIE and solvated conformational energy calculations).

RESULTS AND DISCUSSION
Sequence Relationships-Surprisingly, there is very low sequence similarity among PseG hydrolases from different bacterial species. Based on a PSI-BLAST analysis (42), C. jejuni PseG (274 residues) shares only ~24-28% sequence identity with other putative bacterial PseG proteins. In fact, from this analysis, only weak sequence similarity is observed between C. jejuni and H. pylori PseG proteins, yet their identical enzymatic activity has been confirmed in vitro (13). The other possible (~24-28% identity) PseG proteins identified include FlmD from Aeromonas punctata (43) and Aeromonas hydrophila (44) that play a role in flagellar assembly and, additionally, for the former, in O-antigen biosynthesis. These pseG or flmD sequences are found in gene clusters highly similar to pseudaminic acid biosynthetic genes (13), and in fact, flagellin from A. punctata was shown to be glycosylated with pseudaminic acid (45). As well, the rkpO genes from Sinorhizobium meliloti Rm41 (46) and Rhizobium sp. NGR234 (47), associated with the synthesis of K-antigen capsular polysaccharides, are required for pseudaminic acid biosynthesis in these organisms, suggesting that these proteins also are PseG family members exhibiting low sequence similarity with their C. jejuni counterpart. Similar sequences have also been identified in the O-antigen biosynthetic loci of Pseudomonas aeruginosa serotype O7 and O9, ORF11 and ORF9, respectively (48). As these O-antigen biosynthetic loci also possess pseudaminic acid-like biosynthetic genes, and as this sugar is known to decorate the lipopolysaccharide and pili of P. aeruginosa (49, 50), we propose that these too are PseG members. Importantly, when performing a multiple sequence alignment using MUSCLE (51) with the PseG sequences described, a minimum consensus sequence (DX₅GXGHX₂R) for this family of proteins was identified (supplemental Fig. S4).
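A consensus of this form translates directly into a regular expression, which gives a quick way to scan candidate sequences for the motif. The sketch below is illustrative only: the function name is hypothetical, X is taken to mean any one of the 20 standard residues, and the test sequence is invented for the example rather than taken from any PseG protein.

```python
import re

# DX5GXGHX2R with X = any standard amino acid (one-letter code).
AA = "[ACDEFGHIKLMNPQRSTVWY]"
PSEG_MOTIF = re.compile(f"D{AA}{{5}}G{AA}GH{AA}{{2}}R")


def find_motif(sequence):
    """Return a list of (0-based start index, matched segment) for each motif hit."""
    return [(m.start(), m.group()) for m in PSEG_MOTIF.finditer(sequence.upper())]


# Invented sequence containing one motif occurrence (not a real protein).
toy_sequence = "MKVLDAAAAAGSGHVVRTL"
print(find_motif(toy_sequence))
```

A pattern match of this kind only flags candidates; given the low overall identity in this family, an alignment is still needed to confirm that a hit lies in the equivalent structural position.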
PseG Structure-PseG was purified as a C-terminally His₆-tagged fusion protein and crystallized in both apo-form as well as bound to UDP. Electrospray ionization mass spectrometry analysis of purified PseGHis₆ gave a mass of 32,376 Da (calculated mass 32,383 Da), whereas the selenium-substituted PseGHis₆ sample gave a mass of 32,516 Da (calculated mass 32,524 Da), consistent with the expected incorporation of three SeMet residues. Although the dynamic light scattering experiment showed the protein was polydisperse in solution, this did not adversely affect our efforts to obtain well-diffracting crystals of this protein.
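The link between the measured mass shift and the number of incorporated SeMet residues can be made explicit with a short calculation. The sketch assumes average atomic masses of 78.971 for selenium and 32.06 for sulfur, so that each Met-to-SeMet replacement adds about 46.9 Da; the function name is hypothetical.

```python
SE_MASS = 78.971  # average atomic mass of selenium
S_MASS = 32.06    # average atomic mass of sulfur
DELTA_PER_SEMET = SE_MASS - S_MASS  # mass gained per Met -> SeMet substitution


def estimate_semet_count(native_mass, semet_mass):
    """Estimate the number of SeMet residues from the observed mass difference."""
    return round((semet_mass - native_mass) / DELTA_PER_SEMET)


# Observed masses from the electrospray measurements quoted in the text.
n_semet = estimate_semet_count(32_376, 32_516)
print(n_semet, f"expected shift for that count: {n_semet * DELTA_PER_SEMET:.1f} Da")
```

The observed difference of 140 Da is within 1 Da of the shift expected for three substitutions, which also matches the three selenium sites located during phasing.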
The structures of PseGHis₆ in apo- and UDP-bound forms have been determined by selenium single-wavelength anomalous dispersion and refined with good geometry to R_work = 0.193, R_free = 0.237 at 1.8 Å resolution, and R_work = 0.177, R_free = 0.211 at 1.85 Å resolution, respectively (Table 1). The apo-PseGHis₆ structure contains three bound sulfate molecules, whereas the complex with UDP, in addition to the ligand, includes two molecules of glycerol (Fig. 3). The structure consists of the entire sequence, with the exception of Phe129, which was not modeled in the apo-PseGHis₆ structure because of poor density. Both models contain a C-terminal extension consisting of vector sequence from the C-terminal His₆ fusion tag, ²⁷⁵Leu-Glu-His₆²⁸², which is ordered in the crystal structures. This C-terminal extension forms both direct and water-mediated H-bonds with the PseG molecule itself, as well as with a surface-exposed cluster of negatively charged residues (Asp30, Asp44, Glu52, Glu67, and Glu68) from a symmetry-related PseG molecule. Based on the crystal structures and analysis by size exclusion chromatography (result not shown), we surmise that the enzyme is a monomer.

The overall structure of PseGHis₆ is composed of distinct N- and C-terminal domains, each having a classical α/β-fold (Fig. 3A). The N-terminal domain (residues 1-142) harbors a central seven-stranded parallel β-sheet (β3-β2-β1-β4-β5-β6-β7) flanked on either side by three α-helices, whereas the C-terminal domain (residues 153-282) has a six-stranded parallel β-sheet (β3-β2-β1-β4-β5-β6) surrounded by five α-helices. A short helix (residues 143-152) links the N- and C-terminal domains. Most of the inter-domain interactions are water-mediated, with one of the sulfates (301) acting as a bridge through H-bonding to the N of Phe15 from the N-terminal domain and the N of Gly166 and OG1 of Thr167 of the C-terminal domain.
The inter-domain interactions are contributed by residues situated in the loops between β1/α1 and at the C-terminal ends of β4-β7 of the N-terminal domain and the β1/α1 and β2/α2 loops as well as helices α4 and α5 of the C-terminal domain. The domain interface consists of both polar and non-polar residues; however, only one direct hydrogen bond is observed, between ND2 of Asn238 and O of Val114. The two domains are structurally related and can be superposed with an r.m.s.d. of 1.6 Å for 126 C-α pairs. The very low sequence identity and similarity (9 and 26 residues, respectively) from the structure-based alignment of the two domains are suggestive of an ancient gene duplication followed by divergence in sequence.
UDP-binding Site-Attempts to obtain a PseGHis₆ co-crystal structure with the sugar product alone were unsuccessful. However, co-crystallization experiments with UDP-GlcNAc, UDP-GlcN, or the natural substrate UDP-6-deoxy-AltdiNAc yielded a PseGHis₆ complex with bound UDP. Activity measurements in solution of PseGHis₆ with UDP-GlcNAc did not yield any significant activity, suggesting that the bound UDP observed in our structures is the result of either low UDP contamination of UDP-sugar preparations or spontaneous hydrolysis of the UDP-sugars either in the reservoir or upon storage (52).
Superposition of the apo- and UDP-bound forms of PseGHis₆ shows small but readily detectable domain movements, resulting in partial closure of the two domains upon UDP binding. This domain movement is clearly revealed upon superposition of the two structures, with an overall r.m.s.d. for all C-α atoms of the apo- and UDP-bound forms of 0.90 Å, with the r.m.s.d. decreasing to 0.39 Å when only the 142 residues of their respective N-terminal domains are used in the superposition (Fig. 3B). The UDP- or product-binding site, seen as a tunnel, is situated at the domain interface and stretches across the entire interface (supplemental Fig. S5).
In the PseGHis₆-UDP complex (Fig. 3C), UDP binds within a cleft between the two domains, interacting with β2 and the β4-α4 loop of the C-terminal domain and the β1-α1 loop of the N-terminal domain. The conserved consensus sequence motif identified based on alignment of PseG-related sequences, DX₅GXGHX₂R, lines one side of the PseG active site cleft and makes contacts with the β-phosphate of UDP. UDP is held mainly through interactions with residues of the C-terminal domain, with fewer contributions from the N-terminal domain. The 2′- and 3′-hydroxyl groups of the ribose are positioned by the side chains of Arg143 and Glu239, along with water-mediated interactions involving the 3′-hydroxyl. The phosphate groups exhibit H-bonds with the main-chain nitrogen atoms of Phe15, Gly16, Gly166, Ser235, and Leu236 along with the OH of Ser234 (Fig. 3D). The uracil base is positioned within a hydrophobic pocket formed by the side chains of Phe15, Cys163, Ala189, Ile219, Leu236, and the main-chain atoms of Gly165, Gly166, and Thr190. Water-mediated H-bonds are found between N-3, O-2, and O-4 of uracil with the His216 main chain and through additional waters to other residues, including Arg143, Glu145, Ile219, and Glu217.
Molecular Modeling of the PseG-Substrate Complex-Adjacent to the β-phosphate group of UDP, a cavity is observed at the N- and C-terminal domain interface suggestive of a putative binding site for the substrate sugar moiety, 6-deoxy-AltdiNAc. This cavity contains a bound molecule of glycerol in the PseGHis₆-UDP complex (Fig. 3D and supplemental Fig. S5B). The glycerol is involved in an intricate network of hydrogen bonding interactions, including direct interactions with the α- and β-phosphates of UDP, direct contacts with Gly16, His17, Tyr78, Ser234, Ser235, and the Arg20-Asp101 salt bridge, along with indirect hydrogen-bond interactions via several buried water molecules. The candidate sugar-binding pocket is also lined by other residues, including Ile13, Gly14, Thr167, Ile169, Ser232, Tyr252, Val253, Asn255, and Gln256. The general location of the PseGHis₆-bound glycerol is similar to that of the GlcNAc moiety of UDP-GlcNAc bound to MurG (18).
As efforts to co-crystallize the PseGHis₆-substrate complex using either wild-type or mutant enzymes were unsuccessful, we employed protein-ligand docking and molecular dynamics simulations to arrive at a plausible model of PseG-substrate binding. Flexible docking of the sugar part of the UDP-6-deoxy-AltdiNAc substrate was first carried out by MCM conformational sampling (29, 30) of acyclic and cyclic rotatable bonds in the sugar and pyrophosphate parts of the substrate, while maintaining the crystallographically observed binding mode of the uridine part. The three feasible pyranose ring conformations of the 6-deoxy-AltdiNAc sugar moiety, the ¹C₄ chair, ⁴C₁ chair, and ⁵S₁ twist-boat (supplemental Fig. S3), were incorporated into UDP-6-deoxy-AltdiNAc substrate models and independently subjected to MCM docking to PseG. All three MCM docking simulations converged toward the same lowest energy structure of the complex, corresponding to the twist-boat sugar ring conformation. Compelling evidence for the propensity of the 6-deoxy-AltdiNAc sugar, in the free state, to acquire a twist-boat conformation is the finding that the β-L-altropyranose sugar ring is flexible and found to adopt a twist-boat conformation (53). In addition, as stated above, ab initio structure calculations of O_Gly-methylated 6-deoxy-AltdiNAc (supplemental Fig. S3) show the ⁵S₁ twist-boat as a low energy conformation along with the ¹C₄ and ⁴C₁ chairs.
The predicted PseG-substrate binding mode was further investigated by hybrid QM/MM unconstrained molecular dynamics simulations in explicit solvent. Separate simulations were carried out for the substrate in the ¹C₄ and ⁵S₁ sugar ring conformations observed experimentally in the free state (53). Analysis of the QM/MM MD trajectories indicates that these ¹C₄ and ⁵S₁ sugar ring conformations are stable in the solvated PseG-binding site, i.e. they did not interconvert or transition to other conformations during the course of the simulations (supplemental Fig. S6A). The most common feature of the PseG-substrate binding modes obtained from these simulations is that the substrate sugar C-1 atom approaches the His17 side chain (supplemental Fig. S6B), with which it interacts via a structured water molecule in both the ¹C₄ and ⁵S₁ conformations (Fig. 4). This water molecule, which appears structured during both QM/MM MD simulations, is also present in the PseGHis₆-UDP crystal structure (Wat110), and it is anchored via H-bond interactions by the His17 NE atom and the Ile13 main-chain oxygen atom. Hence, these models predict that a water molecule at this location is the nucleophile that attacks the substrate at the sugar C-1 atom, and that His17 serves a key catalytic role in activating this water molecule.

Structure and Function of C. jejuni PseG
A closer examination of the QM/MM MD models appears to favor the twist-boat conformation of the sugar as representing the Michaelis complex. The putative catalytic water molecule is more favorably positioned for an S_N2 attack with inversion of configuration at the sugar C-1 atom in the case of the ⁵S₁ conformation than in the ¹C₄ conformation (Fig. 4, C and D). In the ⁵S₁ conformation, the water molecule is closer to the C-1 atom (3.5 versus 4.1 Å for ¹C₄) and more in line with the scissile anomeric C-1-O_Gly bond (174° versus 135° for ¹C₄). Such geometrical differences in the ground state would incur a lower energy barrier to the transition state of the hydrolytic reaction in the case of the twist-boat conformation. The QM/MM calculations also indicate a more polarized anomeric C-1-O_Gly bond in the case of bound ⁵S₁ conformation by 12% (0.11 e), i.e. the C-1 atom is more positively charged and the O_Gly atom is more negatively charged relative to the ¹C₄ conformation (supplemental Fig. S6, C and D). Thus, the twist-boat conformation may resemble electronically the transition state of the reaction. The QM/MM MD simulations also indicate that accommodation of the ⁵S₁ conformation is accompanied by smaller perturbations and increased rigidity of the PseG-binding site in comparison with the ¹C₄ conformation (supplemental Fig. S6, E and F). The ⁵S₁ substrate conformation is also favored by a more negative solvated interaction energy (40, 41) with PseG (by 1.7 kcal/mol) and a lower solvated conformational energy of the complexed PseG protein (by 11 kcal/mol) (supplemental Table S3). In conclusion, although we cannot rule out the possibility of a PseG-substrate-binding mode involving a ¹C₄ chair conformation, all of our computational data support the likelihood of it involving a twist-boat conformation of the 6-deoxy-AltdiNAc sugar moiety.
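The two geometric descriptors used in this comparison, the water-to-C-1 distance and the angle between the water oxygen, C-1, and the leaving-group oxygen, can be computed from atomic coordinates as sketched below. The coordinates are invented to give an idealized in-line arrangement and are not taken from the models; the function names are hypothetical.

```python
import math


def distance(p, q):
    """Euclidean distance between two 3D points (same units as the input)."""
    return math.dist(p, q)


def angle_deg(a, vertex, c):
    """Angle a-vertex-c in degrees, e.g. O(water)-C1-O(Gly) for in-line attack."""
    v1 = [x - y for x, y in zip(a, vertex)]
    v2 = [x - y for x, y in zip(c, vertex)]
    dot = sum(x * y for x, y in zip(v1, v2))
    cos_theta = dot / (math.hypot(*v1) * math.hypot(*v2))
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding error
    return math.degrees(math.acos(cos_theta))


# Invented coordinates (in A) for a perfectly in-line geometry.
water_o = (-3.5, 0.0, 0.0)
c1 = (0.0, 0.0, 0.0)
o_gly = (1.4, 0.0, 0.0)
print(distance(water_o, c1), angle_deg(water_o, c1, o_gly))
```

An angle near 180° corresponds to backside attack along the axis of the scissile bond, which is why the 174° geometry is described as better suited to inversion than the 135° geometry.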
Direct intermolecular interactions between PseG and the modeled ⁵S₁ twist-boat conformation of the 6-deoxy-AltdiNAc sugar are detailed in Fig. 4A. The 3-hydroxyl and 4-N-acetyl substituents of the sugar are predicted to form hydrogen bonds with the Tyr78, Asn255, and Gln256 residues of PseG. Substantial packing is achieved by the 4-N-acetyl and C-6-methyl substituents, which are buried in a deep pocket that is partially solvated and partially filled by the glycerol molecule co-crystallized with the PseGHis₆-UDP complex (supplemental Fig. S5). This PseGHis₆-bound glycerol molecule mimics the interactions established by the 4-N-acetyl and C-6-methyl substituents of the modeled substrate. Direct contacts in this pocket also implicate PseG residues Gly16, His17, Ser234, and the Arg20-Asp101 salt bridge. The methyl group of the more solvent-exposed 2-N-acetyl moiety is predicted to form non-polar contacts in a surface cradle formed by Ile169, Tyr252, and Val253. The 2-N-acetyl substituent also establishes intramolecular hydrogen bonds with the 1-phosphate and 3-hydroxyl substituents. The latter group is also found in proximity (3.4 Å) to the putative catalytic water molecule. In the case of the ¹C₄ conformation, predicted substrate interactions with the enzyme appear less intimate and less complementary (Fig. 4B), as reflected quantitatively by a less favorable SIE value relative to the ⁵S₁ conformation (supplemental Table S3). The sugar moiety does not fill the pocket occupied by the trapped glycerol molecule in the PseGHis₆-UDP complex, which remains filled by several water molecules that connect to a water channel sandwiched between substrate and enzyme (supplemental Fig. S7). In contrast with the bound ⁵S₁ conformation, the 4-N-acetyl of the bound ¹C₄ conformation is not engaged in direct H-bond contacts with the protein and, together with the C-6-methyl, is partially solvent-exposed. The 2-N-acetyl substituent is buried instead, with the carbonyl oxygen forming H-bonds with the Thr167 hydroxyl and the backbone amides of Gly165 and Gly166, and with the methyl group sandwiched between Gly165 and Tyr252. The 3-hydroxyl substituent forms a hydrogen bond with the main-chain carbonyl of Ile13.
FIGURE 4. Stereo views of the substrate-binding site show PseG residues (C atoms in cyan) in contact with the sugar moiety of the substrate (C atoms in yellow, anomeric C-1 atom in black) from models corresponding to the ⁵S₁ twist-boat conformation (A) and ¹C₄ chair conformation of the sugar ring (B). For Ile13, only the main-chain carbonyl group is rendered. Hydrogen bonds are shown as dashed lines, and the putative water nucleophile is shown as a red sphere. Hydrogen atoms are omitted for clarity. Geometric details in the ground state show the principal putative catalytic residues and the position of the putative catalytic water molecule relative to the anomeric scissile bond C-1–OGly of the substrate, from models corresponding to the ⁵S₁ twist-boat conformation (C) and the ¹C₄ chair conformation of the sugar ring (D). Rendering scheme for C and D is as in A and B.
Kinetic Characterization of PseG Mutants-For kinetic analysis, the PseG hydrolase reaction was monitored using a continuous coupled assay for UDP release (21) in a 96-well microplate format. The kinetic constants obtained for PseGHis₆ were similar to those reported by Liu and Tanner (11), yielding an apparent kcat of 25 ± 0.73 s⁻¹ and an apparent Km of 0.25 ± 0.014 mM (Table 2). Three active site residues were targeted for mutagenesis, based on the PseG-substrate model, to examine their contribution to catalysis. All substitution derivatives tested exhibited Michaelis-Menten kinetics (supplemental Fig. S8), with varying levels of activity, and the substitutions did not appear to induce gross protein misfolding or degradation as assessed by CD spectroscopy and SDS-PAGE analyses, respectively (supplemental Figs. S1 and S2 and supplemental Table S2). Substitution of His17 with Phe or Leu had the most dramatic effect on turnover, resulting in nearly unmeasurable activity and confirming the catalytic importance of this residue (Table 2). In contrast, an H17N substitution retained ~10% turnover, suggestive of possible functional complementation by the Asn side chain (see Fig. 5). Another important residue highlighted by the model was Asn255, and indeed an N255A substitution retained only ~1% turnover relative to wild type. The last substitution, Y78F, was the least affected kinetically, which may indicate a possible role for the phenyl ring that is conserved here (see below). Finally, PseGHis₆ is specific for UDP-6-deoxy-AltdiNAc, in that no activity was observed with UDP-4-amino-4,6-dideoxy-β-L-AltNAc or UDP-2,4-diacetamido-2,4,6-trideoxy-α-D-glucopyranose (or UDP-2,4-diacetamido-bacillosamine). This may be explained by the molecular modeling of the PseG-substrate complex, where the 4-N-acetyl and 6-methyl substituents of UDP-6-deoxy-AltdiNAc in the twist-boat conformation are buried within a deep cavity, resulting in substantial packing and intermolecular interactions (Fig. 4 and supplemental Figs. S5 and S6). As such, the positively charged 4N amino group of the former and the C-6 epimer stereochemistry of the latter may preclude a proper induced fit within the active site. We note that the residues altered in this study (His17, Tyr78, and Asn255) are invariant residues within the PseG hydrolase family (supplemental Fig. S4).
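To make the reported wild-type constants concrete, the short Python sketch below evaluates the Michaelis-Menten rate law with the apparent kcat (25 s⁻¹) and Km (0.25 mM) quoted above. The function name and the choice of substrate concentrations are illustrative assumptions; values for the substitution derivatives are not included because Table 2 is not reproduced here.

```python
def turnover_rate(substrate_mM, kcat, km_mM):
    """Michaelis-Menten rate per enzyme molecule (s^-1): v/[E] = kcat*[S] / (Km + [S])."""
    return kcat * substrate_mM / (km_mM + substrate_mM)

# Apparent wild-type constants reported for PseGHis6 in the text.
KCAT_WT = 25.0   # s^-1
KM_WT = 0.25     # mM

# Catalytic efficiency kcat/Km, in mM^-1 s^-1.
efficiency = KCAT_WT / KM_WT

# Illustrative substrate concentrations (mM); these are assumptions, not assay conditions.
for s in (0.05, 0.25, 1.0, 5.0):
    print(f"[S] = {s:5.2f} mM  ->  v/[E] = {turnover_rate(s, KCAT_WT, KM_WT):6.2f} s^-1")
print(f"kcat/Km = {efficiency:.0f} mM^-1 s^-1")
```

At [S] = Km the rate is half of kcat, which is the defining property of Km in this rate law.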
Implications for the Mechanism of Catalysis-Mechanistic studies of PseG, employing labeling using H₂¹⁸O, are most consistent with a concerted mechanism in which a water molecule attacks C-1 of the UDP-6-deoxy-AltdiNAc sugar, followed by cleavage of the C–O anomeric bond (11). This conclusion is consistent with both (a) incorporation of the ¹⁸O label in the sugar product and not the UDP leaving group and (b) the inversion of configuration of the C-1 hydroxyl. This mechanism is distinct from that used by the UDP-N-acetylglucosamine 2-epimerase NeuC (SiaA) from Neisseria meningitidis, which proceeds via formation of the putative intermediate, 2-acetamidoglucal (Fig. 2B) (54). Here an anti-elimination of UDP from the substrate is performed, with the carboxylate moieties of Glu122 and Asp131 important in catalysis, possibly through activation of a water molecule necessary to perform nucleophilic attack at C-1 of the 2-acetamidoglucal intermediate.
From a mechanistic point of view, the reaction catalyzed by PseG is similar to that catalyzed by E. coli GDP-mannose mannosylhydrolase, although, unlike PseG, GDP-mannose mannosylhydrolase is a member of the Nudix family and utilizes a divalent cation in catalysis (16,55,56). Both PseG and GDP-mannose mannosylhydrolase, however, participate in C–O bond cleavage with inversion of configuration of C-1 of the sugar. His124 of GDP-mannose mannosylhydrolase has been identified as a base responsible for activating a water molecule, with coordination of the Mg²⁺ ion to the water contributing to activation (57).
Our hybrid QM/MM molecular dynamics simulations of PseG complexes with the substrate in the ⁵S₁ twist-boat and ¹C₄ chair conformations point to a water molecule (Wat110 in the PseGHis₆-UDP crystal structure) that is hydrogen-bonded to His17 and is also suitably positioned for nucleophilic attack on the anomeric C-1 atom of UDP-6-deoxy-AltdiNAc, particularly in the case of the twist-boat conformation. This water molecule would contact the C-1 atom in line with the C-1–OGly anomeric bond, consistent with an SN2 reaction leading to inversion of configuration at C-1. The putative catalytic water molecule is also hydrogen-bonded to the main-chain carbonyl of Ile13 and further stabilized by the edge of the Tyr78 aromatic ring (Fig. 4, A and B). This may explain the higher turnover observed for the Y78F substitution derivative, with both residues containing an aromatic ring (Table 2). The catalytic mechanism inferred from our crystallographic, modeling, and kinetic data is shown in Fig. 5. The His17 residue would polarize a water molecule at the described location adjacent to the sugar C-1 atom of the substrate. This polarization would involve an equilibrium between the ionization states OH⁻-His17⁺ and water-His17. The hydroxide anion, tethered to His17⁺ and Ile13, would then attack the sugar C-1 atom of the substrate. In the case of the H17N substitution, the carbonyl oxygen atom of Asn17 would provide a weaker water polarization, as reflected by a 10-fold drop in the measured kcat value (Table 2). A substrate-binding mode with the sugar ring in a twist-boat conformation appears to favor the nucleophilic attack in line with the C-1–OGly anomeric bond for inversion of configuration at C-1. In contrast, the ¹C₄ chair conformation displays a less optimal ground state geometry for nucleophilic attack, given the constraints imposed by the binding site geometry and also possibly relating to a reduced access to the C-1 atom. The predicted twist-boat conformation of the sugar ring would be facilitated mainly through interactions with Asn255, which is tethered to the 3-OH substituent of the sugar, in turn poised for hydrogen bonding to the catalytic water molecule, consistent with the mutagenesis data (Table 2). It is therefore possible that the enzyme takes advantage of the relatively stable twist-boat sugar ring conformation of 6-deoxy-AltdiNAc to facilitate the attack. Recent crystallographic and theoretical studies have shown substrate binding in twist-boat conformations in several systems, notably retaining and inverting β-glycosidases (see Refs. 58, 59 and references therein). This nucleophilic attack promotes the SN2-like displacement of UDP to generate the C-1-inverted α anomer of the 6-deoxy-AltdiNAc sugar. It is expected that the α to β conversion of 6-deoxy-AltdiNAc would take place nonenzymatically in solution (11).
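As a rough guide to what a given loss of turnover means energetically, a fold-change in kcat can be converted into an apparent change in activation free energy, ΔΔG‡ = RT ln(fold change). This relation assumes simple transition-state theory with an unchanged pre-exponential factor, and the temperature of 298.15 K used below is an assumption, since the assay temperature is not stated in this section. The function name is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1

def barrier_increase(fold_drop_in_kcat, temperature_k=298.15):
    """Apparent increase in activation free energy (kcal/mol) for a fold-drop in kcat.

    Uses ddG = R*T*ln(fold), i.e. transition-state theory with a constant prefactor.
    The default temperature is an assumption, not a value taken from the study.
    """
    if fold_drop_in_kcat <= 0:
        raise ValueError("fold change must be positive")
    return R_KCAL * temperature_k * math.log(fold_drop_in_kcat)

# The 10-fold drop described for H17N, and a 100-fold drop for comparison.
for fold in (10, 100):
    print(f"{fold:>4}-fold drop  ->  ddG ~ {barrier_increase(fold):.2f} kcal/mol")
```

Under these assumptions a 10-fold reduction in kcat corresponds to a penalty of a little over 1 kcal/mol, comparable in size to a single weakened hydrogen bond.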
FIGURE 5. Proposed catalytic mechanism for PseG based on modeling of the UDP-sugar complex and activity analysis of substitution derivatives. Here, PseG employs a single displacement mechanism involving C–O bond cleavage via direct attack of the anomeric carbon by a hydroxide nucleophile. Specifically, His17 acts as a general base, abstracting a proton from the water molecule to activate it for attack. The activated water is anchored with its other proton hydrogen-bonded to the main-chain carbonyl of Ile13, and it is further stabilized by the partially positively charged edge of the aromatic ring of Tyr78 (not shown). Several PseG residues participate in direct intermolecular interactions with the sugar moiety and stabilize the Michaelis complex, including Asn255, which appears to be a determinant in selecting the twist-boat conformation already present in the free substrate. This twist-boat conformation of the sugar appears primed for catalysis, as it exposes the C-1 atom and favors attack by the His17-activated water, resulting in inversion of stereochemistry at C-1 and cleavage of the C-1–OGly bond. The liberated sugar product is expected to attain an equilibrium distribution of α and β anomers that interconvert nonenzymatically in solution. The inset shows a proposed scheme for His17 acting as a general base, abstracting a proton from water, or, in the H17N substitution, Asn17 H-bonding with the proposed catalytic water molecule.
PseG-MurG Comparative Structural Analysis and Insight into the Mechanism of MurG Glycosyltransferases-Within SCOP (60), PseG would be classified within the UDP-glucosyltransferase/glycogen phosphorylase fold (61), characterized by two dissimilar α/β domains that share similar central, parallel β-sheets. This fold currently includes nine families of carbohydrate-active enzymes, including several glycosyltransferases (18, 62-67), sugar phosphorylases (68, 69), and a UDP-sugar epimerase (70). Although there is a lack of significant sequence similarity between the two, C. jejuni PseG and MurG from E. coli (PDB codes 1F0K and 1NLM (18,71)) bear a striking structural resemblance. MurG is a membrane-associated UDP-GlcNAc:lipid I glycosyltransferase involved in peptidoglycan biosynthesis and is currently the most structurally similar protein to PseG. Relative to C. jejuni PseG, E. coli MurG contains some additional insertions around the central β-sheets, as well as a C-terminal α-helical extension (supplemental Fig. S4). The best structural alignment between PseG-UDP and molecule A of MurG bound to UDP-GlcNAc (PDB code 1NLM (18)), carried out using Swiss PDB Viewer (72), gives an r.m.s.d. of ~1.6 Å for 146 common C-α atoms, the vast majority of which reside in the N-terminal domain (supplemental Fig. S9A). According to this structural alignment, the putative catalytic base His17 of PseG is structurally conserved in MurG (His19), whereas there is a clear translation of the bound UDP moiety between PseG and MurG. An alternative fit of the two enzymes carried out on the bound UDP moiety and including select C-α atoms from the C-terminal β-sheet results in a displacement of the N-terminal domains together with a misalignment of the conserved putative catalytic histidine (supplemental Fig. S9).
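The r.m.s.d. quoted for the structural alignment is the root-mean-square distance between paired C-α atoms after superposition. A minimal Python sketch of that final step is given below. It assumes the two coordinate sets are already superposed and paired in the same order, so it does not reproduce the fitting performed by the alignment software; the toy coordinates are hypothetical.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of paired 3D points.

    Assumes the structures are already superposed; no fitting is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))

# Hypothetical toy example: three paired atoms, each displaced by 1.6 Å along x.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
shifted = [(x + 1.6, y, z) for x, y, z in reference]

print(round(rmsd(reference, shifted), 2))
```

In practice the value depends on which atom pairs are included, which is why the number of common C-α atoms is reported alongside the r.m.s.d.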
One characteristic observed in all GT-B transferases is substrate-induced domain movement. The shift observed in PseGHis₆ upon binding of UDP relative to the apo-structure is smaller (~1.4 Å) than that observed for MurG (~3.4 Å). It is possible that a sulfate ion situated near the domain interface in the apo-PseGHis₆ structure, which coincides with the β-phosphate position of UDP in the PseGHis₆-UDP complex, serves to hold the two domains together via interactions with loop regions of both domains. However, when comparing the ligand-bound structures of PseG and MurG, it is evident that the two domains are arranged into a more open conformation in the case of the substrate-bound MurG structure relative to the UDP-bound (or the substrate-bound) PseG structure (supplemental Fig. S9). This may be a consequence of the differing size of acceptor nucleophiles utilized by these two enzymes.
Despite different inter-domain openings leading to a translation of the bound UDP moiety relative to the aligned N-terminal domains, a combined structure-based alignment between PseG and MurG followed by alignment of related sequences showed conservation between several invariant residues across known PseG homologs and residues near the substrate-binding site of E. coli MurG (supplemental Fig. S4). These residues in PseG(MurG) include Gly16 (Gly18), His17 (His19), and Gln256 (Gln289) in the putative sugar-binding site, Glu239 (Glu269) involved in anchoring the ribose moiety, and Gly165 (Gly191) involved in anchoring the β-phosphate. The E269D mutation in MurG has a significant impact on substrate binding with little impact on turnover, indicating that binding of the ribose moiety is critical for MurG function (18,73). In addition, although there is structural conservation of side chains between the PseG and E. coli MurG residues Asn255 (Gln288) (supplemental Fig. S4), this conservation is not observed for the MurG family in general (71). This may be expected, as our study suggests Asn255 is involved in stabilizing the catalytic twist-boat conformation of the PseG substrate, a substrate conformation that is not adopted in the MurG reaction (18). Moreover, the Tyr78 residue altered in this study is not conserved in E. coli MurG. However, Asn128 of MurG appears to occupy a similar space in structural comparisons where, like Tyr78 of PseG, it interacts with the 4-substituent of the donor sugar. Notably, an N128A MurG substitution was found to greatly affect turnover (18). This agrees with our mechanistic model, whereby these residues may additionally facilitate the coordination of the acceptor nucleophile. As such, the differences between these residues may reflect not only the different donor sugars but, perhaps more importantly, the distinct acceptors used by the two enzymes.
The kinetic mechanism of MurG has been investigated, and it is found to follow a compulsory Bi-Bi mechanism, with binding of the donor UDP-sugar prior to the lipid-linked N-acetylmuramic acid (lipid I) (74), although the identity of the catalytic base that activates the C-4 hydroxyl moiety of the acceptor substrate remains unknown (Fig. 2C). The significant impact of the H19A substitution on MurG activity, which lowers the kcat by more than 3 orders of magnitude (18), prompted recent speculations that it may represent the catalytic base of MurG (75). However, His19 is located remotely from the donor sugar, 9.3 Å across the inter-domain cleft to the anomeric bond. Our mutagenesis and structural data, supporting a catalytic role for His17 as a general base in the PseG inverting glycoside hydrolase, strengthen the putative catalytic role of the structurally conserved His19 in the MurG inverting glycosyltransferase.