Crystal Structure of Streptococcus pneumoniae N-Acetylglucosamine-1-phosphate Uridyltransferase Bound to Acetyl-coenzyme A Reveals a Novel Active Site Architecture*

The bifunctional bacterial enzyme N-acetyl-glucosamine-1-phosphate uridyltransferase (GlmU) catalyzes the two-step formation of UDP-GlcNAc, a fundamental precursor in bacterial cell wall biosynthesis. With the emergence of new resistance mechanisms against β-lactam and glycopeptide antibiotics, the biosynthetic pathway of UDP-GlcNAc represents an attractive target for drug design of new antibacterial agents. The crystal structures of Streptococcus pneumoniae GlmU in unbound form, in complex with acetyl-coenzyme A (AcCoA) and in complex with both AcCoA and the end product UDP-GlcNAc, have been determined and refined to 2.3, 2.5, and 1.75 Å, respectively. The S. pneumoniae GlmU molecule is organized in two separate domains connected via a long α-helical linker and associates as a trimer, with the 50-Å-long left-handed β-helix (LβH) C-terminal domains packed against each other in a parallel fashion and the C-terminal region extended far away from the LβH core and exchanged with the β-helix from a neighboring subunit in the trimer. AcCoA binding induces the formation of a long and narrow tunnel, enclosed between two adjacent LβH domains and the interchanged C-terminal region of the third subunit, giving rise to an original active site architecture at the junction of three subunits.

one of the main cytoplasmic precursors of the bacterial cell wall, being situated at the branch point of two important biosynthetic pathways, namely peptidoglycan and lipid A biosynthesis (2). In eukaryotes, a bifunctional enzyme equivalent to GlmU is missing, and acetyltransfer and uridyltransfer are accomplished by two distinct enzymes, both very distantly related in sequence to GlmU, the latter thus advancing to an attractive target for the development of new antibiotics.
The crystal structures of a truncated form of Escherichia coli GlmU (GlmU-Tr) and of a GlmU-Tr·UDP-GlcNAc complex have been recently reported (3). These structures confirmed that the enzyme is organized in the following two separate domains as proposed previously (4,5): (i) an N-terminal uridyltransferase (PPase) domain, comprising Asn-3 to Arg-227, resembling the dinucleotide binding Rossmann fold, first reported in the lactate dehydrogenase family (6), and containing the signature motif G-X-G-T-(R/S)-(X)₄-P-K, found in the majority of pyrophosphorylases, and (ii) a C-terminal acetyltransferase domain, containing the hexapeptide repeat (L/I/V)-(G/A/E/D)-X₂-(S/T/A/V)-X, a signature of the unusual left-handed β-helix (LβH) structural motif, typically found in other bacterial acetyl- and acyltransferases (7) (Fig. 1B). Furthermore, the GlmU-Tr·UDP-GlcNAc complex structure identified the precise location of the uridyltransfer reaction, the pyrophosphorylase activity of GlmU-Tr being retained. However, acetyltransferase activity was lost because of spontaneous truncation during purification, confirming that the bifunctional enzyme possesses indeed two distinct active sites located in separate domains, with the acetyltransferase activity residing in the C-terminal portion of the enzyme (4). Although the crystal structure of the E. coli enzyme, coupled to mutagenesis studies, has revealed some residues crucial for pyrophosphorylase activity (3), the catalytic machineries responsible for both pyrophosphorylase and acetyltransferase activity remain to be elucidated.
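The two sequence signatures quoted above can be written as regular expressions and used to scan a protein sequence. The following is a minimal sketch only: the function name `find_motif` and the toy sequence are illustrative assumptions and are not taken from any GlmU entry.

```python
import re

# Pyrophosphorylase signature G-X-G-T-(R/S)-(X)4-P-K ("." stands for any residue X).
PPASE_MOTIF = re.compile(r"G.GT[RS].{4}PK")

# Hexapeptide repeat (L/I/V)-(G/A/E/D)-X2-(S/T/A/V)-X of the left-handed beta-helix.
HEXAPEPTIDE = re.compile(r"[LIV][GAED].{2}[STAV].")


def find_motif(pattern, sequence):
    """Return (1-based start, matched text) for every hit, overlapping hits included."""
    hits = []
    for i in range(len(sequence)):
        m = pattern.match(sequence, i)  # anchor the match at position i
        if m:
            hits.append((i + 1, m.group()))
    return hits


# Hypothetical toy sequence used only to exercise the patterns.
toy = "MAGSGTRMKSDPKAAIGDNVSIG"
print(find_motif(PPASE_MOTIF, toy))
print(find_motif(HEXAPEPTIDE, toy))
```

Scanning position by position, rather than with `finditer`, keeps overlapping hexapeptide repeats, which matter because consecutive coils of a β-helix can share residues between candidate matches.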
Here we present the crystal structures of full-length GlmU from the pathogenic organism Streptococcus pneumoniae in its unbound form, in complex with AcCoA, and in complex with both AcCoA and the product UDP-GlcNAc. These structures define the precise location of the acetyltransferase active site, reveal substantial conformational changes occurring upon both AcCoA and UDP-GlcNAc binding, and highlight the structural elements responsible for substrate recognition and catalysis in the two distinct active sites of this bifunctional enzyme.

EXPERIMENTAL PROCEDURES
Expression, Purification, and Crystallization-The coding region of SpGlmU was amplified from S. pneumoniae strain R 800 DNA by polymerase chain reaction and inserted into the bacterial expression plasmid pQE30 (Qiagen). Recombinant SpGlmU was overexpressed in M15 cells and purified to homogeneity by nickel-nitrilotriacetic acid-agarose and gel filtration chromatography. Enzyme activity has been tested and found similar to that of full-length E. coli GlmU (3). Crystals were grown at 20 °C by the hanging-drop vapor diffusion method by mixing equal volumes of protein solution (13 mg/ml) with reservoir solution composed of 26% (v/v) PEG 400, 50 mM NaCl, and 300 mM CaCl₂, buffered at pH 8.0 with Tris-HCl. Small rhombohedral crystals with a typical size of 0.1 × 0.1 × 0.1 mm appeared within 1 week. Crystals belong to space group R3 and contain two molecules per asymmetric unit. As molecular replacement with GlmU-Tr (Protein Data Bank entry 1FXJ) failed, selenomethionine-substituted enzyme was produced using the same bacterial strain grown in minimum medium and supplemented, before induction, with selenomethionine and amino acids known to inhibit methionine biosynthesis (8). The yield of selenomethionine substitution was about 50% as judged by matrix-assisted laser desorption ionization/time of flight mass spectroscopy analysis. Crystals of bigger dimensions and higher diffraction quality were obtained for the selenomethionine-substituted enzyme under the same crystallization conditions as adopted for the native protein. Crystals for the AcCoA complex were obtained by incubating the enzyme with 20 mM AcCoA prior to crystallization and lowering the PEG 400 concentration to 18% (v/v). AcCoA·UDP-GlcNAc complex crystals were obtained by cocrystallization with 20 mM AcCoA followed by harvesting into a stabilizing solution made of 30% (v/v) PEG 400, 50 mM NaCl, and 300 mM CaCl₂, buffered at pH 8.0 with Tris-HCl and supplemented with 10 mM UDP-GlcNAc.
* This work was funded in part by a Groupement d'Intérêt Public-Hoechst Marion Roussel grant and the Centre National de la Recherche Scientifique (UMR 6098, Marseille, France). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Data Collection, Structure Solution, and Refinement-All data sets were collected at 100 K on flash-frozen crystals. Cryosolutions were of the same composition as the crystallization/harvesting solutions with the addition of an increasing amount of PEG 400 and supplemented with 5% (v/v) glycerol. A 3-wavelength multiple anomalous dispersion data set for selenomethionine-substituted SpGlmU was collected on beamline BM14 (European Synchrotron Radiation Facility, Grenoble, France), a data set for native SpGlmU and data for the AcCoA complex were collected on beamline ID14-EH2, and data for the AcCoA·UDP-GlcNAc complex were collected on beamline ID14-EH3 (European Synchrotron Radiation Facility, Grenoble, France). Data were indexed and integrated with DENZO (9), and all further computing was carried out with the CCP4 program suite (10) unless otherwise stated. Data collection statistics are summarized in Table I and Table II. The SpGlmU structure was solved using the program SOLVE (11). The initial multiple anomalous dispersion phases had a mean figure of merit of 0.340 to 2.8-Å resolution and were improved by density modification with the program DM (12) and extended to the resolution of the native data set (2.3 Å). Because of the low yield of selenomethionine incorporation, only a few of these residues could be located in the experimental electron density maps, which were of mediocre quality. Non-crystallographic symmetry averaging and phase combination techniques were of great help in overcoming these problems, and a preliminary model could be constructed for most of the LβH domain and the core of the N-terminal PPase domain using the program TURBO-FRODO (13). However, most of the loop regions, some of the α-helices in the N-terminal domain, and the last 25 C-terminal residues turned out to be extremely disordered, if visible at all. A striking improvement of the map quality was observed for the AcCoA complex.
A continuous model could be built comprising residues Ser-2 to Gln-459. A crystal lattice rearrangement occurred upon soaking of AcCoA complex crystals in the solution containing UDP-GlcNAc, and the structure was solved by molecular replacement with the program AMoRe (14). Refinement was carried out with the programs REFMAC (15) and CNS (16), using the maximum likelihood method and incorporating bulk solvent corrections, anisotropic Fobs versus Fcalc scaling, and non-crystallographic symmetry restraints. Ten percent of the reflections were set aside during refinement for cross-validation purposes. Automated correction of the model and solvent building were performed with the program ARP/wARP (17). The stereochemistry of the final models was verified with the program PROCHECK (18). Refinement statistics are summarized in Table II. Coordinates have been deposited in the Protein Data Bank under accession reference numbers 1HM0 for apo-SpGlmU and 1HM8 and 1HM9 for the AcCoA and the AcCoA·UDP-GlcNAc complex, respectively. Fig. 1B was generated with Alscript (19), and Figs. 2-4 were generated with SPOCK (20) and Raster3D (21).
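The cross-validation scheme mentioned above can be illustrated with a short sketch. It shows only the bookkeeping, namely flagging a random 10% of reflections and evaluating the conventional residual R = Σ||Fobs| - |Fcalc|| / Σ|Fobs| over a chosen subset. It is not the procedure implemented in REFMAC or CNS; the function names, the random seed, and the toy amplitudes are assumptions, and no scale factor between Fobs and Fcalc is applied.

```python
import random


def split_free_set(n_reflections, free_fraction=0.10, seed=0):
    """Randomly flag a fraction of reflection indices as the cross-validation (free) set."""
    rng = random.Random(seed)
    n_free = round(n_reflections * free_fraction)
    return set(rng.sample(range(n_reflections), n_free))


def r_factor(f_obs, f_calc, indices):
    """Residual sum(||Fobs| - |Fcalc||) / sum(|Fobs|) over the given reflection indices."""
    indices = list(indices)
    num = sum(abs(abs(f_obs[i]) - abs(f_calc[i])) for i in indices)
    den = sum(abs(f_obs[i]) for i in indices)
    return num / den


# Hypothetical amplitudes, for illustration only.
f_obs = [10.0, 20.0, 30.0, 40.0]
f_calc = [9.0, 22.0, 30.0, 36.0]
print(r_factor(f_obs, f_calc, range(len(f_obs))))

free = split_free_set(1000)          # indices excluded from refinement
work = set(range(1000)) - free       # indices used for refinement
```

Evaluating `r_factor` over `work` and over `free` gives the working and free residuals, respectively; because the free reflections never enter refinement, the second value serves as an unbiased check on the model.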

RESULTS AND DISCUSSION
The crystal structure of full-length SpGlmU was determined by multiple anomalous dispersion techniques. The apo-SpGlmU, SpGlmU·AcCoA, and SpGlmU·AcCoA·UDP-GlcNAc structures were refined to 2.3, 2.5, and 1.75 Å, respectively, and have good stereochemistry. The apo-SpGlmU structure consists of residues Ser-2 to Val-142 and Val-149 to Glu-447. The surface loop Arg-143-Glu-148, located in the pyrophosphorylase domain, and the last 12 residues of the acetyltransferase domain, Tyr-448-Gln-459, could not be built because of lack of electron density. The two complex structures, SpGlmU·AcCoA and SpGlmU·AcCoA·UDP-GlcNAc, consist of residues Ser-2 to Gln-459, and clear unbiased electron density could be observed for both AcCoA and UDP-GlcNAc prior to their incorporation in the refinement (Fig. 2A).
The SpGlmU apo-structure, except for the two missing regions Arg-143-Glu-148 and Tyr-448-Gln-459, is highly similar to the SpGlmU·AcCoA complex structure, with a root mean square deviation of 0.450 Å for 440 Cα positions (Fig. 2C). The SpGlmU·AcCoA complex structure, in turn, is almost identical to the SpGlmU·AcCoA·UDP-GlcNAc complex structure in the acetyltransferase domain (root mean square deviation of 0.17 Å for 208 Cα positions). However, the two complex structures differ greatly in the pyrophosphorylase domain, as discussed further below.
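For reference, the root mean square deviation quoted for paired Cα positions is the square root of the mean squared distance between equivalent atoms. The sketch below computes it for two coordinate lists that are assumed to be already superposed and listed in the same residue order; the superposition step itself is not shown, and the function name and toy coordinates are illustrative only.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between paired, pre-superposed 3D points (in Å)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical Cα coordinates: every atom displaced by 1 Å along z.
apo = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
complexed = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0)]
print(rmsd(apo, complexed))
```

Because the deviations are squared before averaging, a few strongly displaced residues, such as a mobile surface loop, can dominate the value even when most of the domain is unchanged.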
The SpGlmU overall fold for residues Ser-2 to His-330 is similar to that of the truncated E. coli enzyme (3). However, the relative arrangement of the pyrophosphorylase and the acetyltransferase domains differs between the crystal structures of SpGlmU and E. coli GlmU-Tr (Fig. 3A). Indeed, the two GlmU structures present a 20° deviation in the direction of the α-helical linker, indicating that this is, in fact, a flexible hinge. A direct consequence of this deviation is that major differences between GlmU-Tr and SpGlmU occur in the regions of the pyrophosphorylase domain neighboring the N-cap of the α-helical linker. These conformational changes, together with a high overall mobility of the pyrophosphorylase domain, as opposed to the acetyltransferase domain, suggest that the presented structures may represent only snapshots of a highly dynamic system.
The Pyrophosphorylase Domain-The SpGlmU PPase domain can be divided into two lobes separated by the active site pocket. The first hundred residues, containing the consensus sequence motif G-X-G-T-(R/S)-(X)₄-P-K, form the nucleotide binding lobe, whereas the second lobe, responsible for recognition of the sugar moiety, encompasses the remaining residues of the N-terminal domain (Fig. 3B).
Striking differences exist between the PPase domains of apo-SpGlmU and the SpGlmU·AcCoA·UDP-GlcNAc complex (root mean square deviation of 2.2 Å for 226 Cα atoms), indicating that the enzyme undergoes a substantial conformational change upon substrate/product binding. In the absence of UDP-GlcNAc (apo-SpGlmU and SpGlmU·AcCoA), SpGlmU adopts an open conformation, whereas in the UDP-GlcNAc complex two regions within the sugar binding lobe move toward each other, giving rise to a closed conformation (Fig. 3B). Upon product binding the entire region encompassing residues Thr-132-Lys-166 moves as a rigid body, making a 20° tilt resulting in a 7-Å movement of the β5b-β6 surface loop. The melting of the last turn of the α-helix α5, facing the β5b-β6 loop, transforms the following α5-α6 surface loop (Asn-191-Tyr-197) into an extended thumb-shaped hairpin. These movements bring the two above surface loops close to each other, such that in the UDP-GlcNAc complex the Ala-192 N hydrogen bonds to Asp-157 OD1 (Fig. 3B), whereas in the unbound form these two residues are 14 Å apart. This suggests that the two surface loops function like a pair of tongs, closing up upon substrate binding and anchoring the sugar deep into the active site pocket, thereby shielding it from solvent.
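The relation between a rigid-body tilt and the resulting displacement of a surface loop can be made explicit with elementary geometry. Assuming, purely for illustration, that the movement is a pure rotation about a fixed hinge axis, a point at distance r from the axis rotated by an angle θ is displaced by the chord length d = 2r·sin(θ/2). The sketch below applies this relation; the function names are hypothetical, and the distance it returns is a back-of-the-envelope estimate under that assumption, not a value measured from the structures.

```python
import math


def chord_displacement(radius, angle_deg):
    """Straight-line displacement (Å) of a point at `radius` Å from the axis after rotation."""
    return 2.0 * radius * math.sin(math.radians(angle_deg) / 2.0)


def radius_for_displacement(displacement, angle_deg):
    """Distance from the axis (Å) that yields the given displacement for a given rotation."""
    return displacement / (2.0 * math.sin(math.radians(angle_deg) / 2.0))


# With the 20° tilt and 7-Å loop movement quoted in the text:
r = radius_for_displacement(7.0, 20.0)
print(f"implied distance from hinge axis: {r:.1f} Å")
```

Under this idealized model a 20° tilt combined with a 7-Å shift corresponds to a lever arm of roughly 20 Å, which is a plausible scale for a loop at the periphery of a domain of this size.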
The "breathing" of the PPase domain of SpGlmU could not be observed for the E. coli GlmU-Tr enzyme, where the crystal structures reveal a closed conformation for both the apo- and UDP-GlcNAc complexed forms (3). However, analysis of the crystal packing in the E. coli GlmU-Tr structures reveals that the pyrophosphorylase domain is constrained into its closed conformation in both the apo-form and the GlmU-Tr·UDP-GlcNAc complex by the packing environment, whereas no such constraints exist in apo- or complexed SpGlmU crystals.
The interactions of the enzyme with the nucleotide and the sugar are largely conserved within the complex crystal structures from S. pneumoniae and E. coli GlmU, yet significant differences reside in the surroundings of the pyrophosphate moiety. Whereas in the GlmU-Tr·UDP-GlcNAc complex both phosphates are solvent-exposed, in the SpGlmU·AcCoA·UDP-GlcNAc complex the α-phosphate is stabilized through weak hydrogen bonds to the side chains of sequence-conserved Arg-15 and Lys-22, located within the signature motif. Moreover, both phosphate groups interact through a calcium ion with Asp-102 and Asn-227, situated in the β4-α4 hairpin and in the N-cap of the long α-helical linker, respectively (Fig. 3C). This calcium ion exhibits the octahedral coordination geometry characteristic of Mg²⁺ ions and thus mimics the catalytically important Mg²⁺ ion. Arg-15 presents static disorder, with the conformation of minor occupancy contacting the β-phosphate (Fig. 3B). This disorder, together with the weak hydrogen bond to Lys-22, indicates that, instead of stabilizing the product, Arg-15 and Lys-22 must have a role either in substrate recognition or transition-state stabilization during the single displacement reaction (22), consistent with mutagenesis data of the E. coli GlmU enzyme (3).
Surprisingly, in the crystal structure of the E. coli GlmU-Tr·UDP-GlcNAc complex the functional residues Arg-18, Lys-25, Asp-105, and Asn-227 are located far away from the pyrophosphate group. These residues are carried by three structural elements in intimate contact with each other, with the α-helical arm, and with the acetyltransferase domain of a neighboring subunit. As mentioned above, the E. coli GlmU-Tr structures differ from the SpGlmU structures in the relative arrangement of the acetyltransferase and pyrophosphorylase domains, probably because of enzyme truncation. Consequently, in E. coli GlmU-Tr, the α-helical arm pushes the signature motif Gly-14-Lys-25 away from the substrate binding pocket of the pyrophosphorylase domain, which might explain the 2-fold reduction in the kcat value of E. coli GlmU-Tr, as compared with the wild-type enzyme (3).
The Acetyltransferase Domain-The C-terminal acetyltransferase LβH domain resembles an equilateral prism, with the three sides formed by three parallel β-sheets composed of short β-strands (Fig. 2B). The 50-Å long β-helix of full-length SpGlmU consists of 10 regular coils, whereas the E. coli GlmU-Tr structure is truncated after the fourth coil. The buried surface area to a 1.6-Å probe radius of a SpGlmU subunit upon trimer formation is 4690 Å², a value in the highest range when compared with other homologous trimeric LβH structures. The regularity of the prism is interrupted only at the seventh coil by a single insertion loop, encompassing the sequence-conserved region Asn-385 to Lys-393, which projects from one of the vertices of the prism and flanks an adjacent subunit (Fig. 2B). The dominant and striking feature of the SpGlmU trimeric assembly is the domain exchange of the C-terminal region. Although this is a novel feature within the family of bacterial acetyltransferases, such a domain exchange has been reported for a number of proteins and is referred to as three-dimensional domain swapping (23).
After 10 1/3 complete turns the peptide chain is exchanged with an adjacent subunit, thus forming the unique antiparallel β-strand of an additional coil within the LβH domain. Therefore, together with the α-helical linker sitting on top of the LβH domain, the C-terminal domain exchange contributes to the stabilization of the SpGlmU trimeric assembly. At residue Glu-447 the antiparallel β-strand reaches the vertex of the prism of a neighboring subunit. At this point the polypeptide chain inserts between two neighboring subunits and coils backwards in the direction of the N terminus, forming two successive 3₁₀-helices and ending in intimate contact with the insertion loop Asn-385-Lys-393 of an adjacent subunit. An 8-Å long and very narrow tunnel is formed in this way, enclosing bound AcCoA located at the interface of two subunits, and closed from the outside by the exchanged C-terminal arm of the third subunit and the insertion loop Asn-385-Lys-393 (Fig. 4A), revealing that the trimeric assembly is required for the acetyltransferase activity. To our knowledge, such an active site architecture located at the junction of three subunits is novel and exemplifies how an oligomeric assembly, coupled to a domain exchange, can create a specific binding site. The C-terminal region past residue Glu-447 could not be observed in the apo-structure, suggesting that this region is highly flexible and becomes structured only upon AcCoA binding. No other major structural rearrangements occur upon AcCoA binding in the acetyltransferase domain, except perhaps for the insertion loop Asn-385-Lys-393, which is highly disordered in the apo-structure, as indicated by a main chain average B-factor of 74 Å² as compared with an average main chain B-factor of 35 Å² for the rest of the acetyltransferase domain.
FIG. 1. Catalytic reactions and sequence alignment of SpGlmU. A, schematic representation of the two-step reaction catalyzed by GlmU: acetyltransferase (I) and pyrophosphorylase (II). B, the SpGlmU sequence is aligned with a consensus sequence calculated on the basis of 12 known sequences of bacterial GlmU. Invariant residues are highlighted in white with a black background. h, s, p, c, and . denote hydrophobic, small, polar, charged, and any residues, respectively. Residues buried at the trimer interface (black circles above sequence), involved in AcCoA (light gray triangles pointing upwards), UDP-GlcNAc/Ca²⁺ (gray triangles pointing downwards/gray circles), and putative GlcN-1-P (gray triangles pointing downwards) binding are shown; those forming the catalytic triad and involved in the PPase activity are shown as gray circles above sequence and black circles below sequence, respectively.
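The B-factor comparison given for the insertion loop and the rest of the acetyltransferase domain can be expressed as atomic displacements using the standard isotropic relation B = 8π²⟨u²⟩. The sketch below performs this conversion; the function name is an assumption, and the interpretation presumes that the B-factors reflect isotropic positional disorder only.

```python
import math


def rms_displacement(b_factor):
    """Root mean square displacement (Å) from an isotropic B-factor (Å^2): B = 8*pi^2*<u^2>."""
    if b_factor < 0:
        raise ValueError("B-factor must be non-negative")
    return math.sqrt(b_factor / (8.0 * math.pi ** 2))


# Main-chain averages quoted in the text for the apo-structure.
for label, b in (("insertion loop Asn-385-Lys-393", 74.0),
                 ("rest of acetyltransferase domain", 35.0)):
    print(f"{label}: B = {b:.0f} Å^2, rms displacement = {rms_displacement(b):.2f} Å")
```

On this scale the loop atoms are displaced by close to 1 Å on average, compared with about two-thirds of an ångström for the remainder of the domain.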
Major stabilization of the AcCoA cofactor is ensured by stacking of the adenine group between the side chains of Ile-437 and Arg-441, hydrogen bonds from the 3′-phosphate group to Lys-445 NZ and Tyr-448 OH, and electrostatic interactions with Arg-439. Additional stabilization arises from hydrophobic interactions and from hydrogen bonds from the β-mercaptoethylamine moiety to main-chain atoms of Asn-385, Ser-404, and Ala-422. The pyrophosphate is exposed to the solvent and does not interact with the protein.
Structural comparison of the SpGlmU·AcCoA complex with other related bacterial acetyltransferases reveals a common location of the AcCoA binding site. Indeed, AcCoA adopts a conformation very similar to the fishhook-like conformation observed for CoA in the tetrahydrodipicolinate N-succinyltransferase (24), bent at the pyrophosphate group and with an extended pantetheine arm running parallel to the LβH domain (Fig. 4A). Although the C-terminal domain exchange, from which the AcCoA binding site emerges, is novel and dissimilar to other related bacterial acetyltransferases, structuring of the C-terminal portion upon CoA binding has been previously reported for tetrahydrodipicolinate N-succinyltransferase (24).
Implications for Catalysis-Acetyltransferases utilizing AcCoA as substrate donor transfer the acetyl group, loosely bound through the weak thioester linkage, either to a cysteine residue, forming a covalent acetyl-enzyme intermediate, or directly to the substrate (25,26). In the light of the first of these two mechanisms, the role of the four cysteine residues in the E. coli enzyme was investigated by site-directed mutagenesis studies (27). However, none of the cysteine residues are conserved between known GlmU sequences, and acetyltransferase activity was dramatically decreased only by the Ala mutant of Cys-307, which is disulfide-bridged and points toward the interior of the LβH domain in the E. coli GlmU-Tr structure. SpGlmU contains only a single cysteine residue, Cys-369, located 10 Å from the active site, thus excluding the hypothesis of a covalent acetyl-cysteine enzyme intermediate.
Inspection of the SpGlmU active site points rather toward a direct acetyl group transfer, based on a catalytic triad formed by the conserved residues His-362, Glu-348, and Ser-404 (Fig. 4B). His-362 is the only residue in close proximity to the thioester that could function as a general base catalyst, activating the C-2 amine of glucosamine-1-P for nucleophilic attack. Hydrogen bonding of His-362 ND1 to Glu-348 OE1 ensures the proper tautomeric form of the imidazole, lacking one proton on NE2. Ser-404, located behind the thioester, is well positioned to stabilize, together with the main-chain nitrogen atom of Ala-379, the negative charge building up on the thioester carbonyl at the transition state. The sequence-conserved Asn-385 residue, within hydrogen bond distance to the sulfur, could have a role in proton transfer at the end of the catalytic cycle. The importance of His-362 is highlighted by a superimposition of SpGlmU with the crystal structure of tetrahydrodipicolinate N-succinyltransferase, which positions SpGlmU His-362 similarly to tetrahydrodipicolinate N-succinyltransferase Asp-141, a residue proposed to function as the general base (24). A histidine, His-79, has likewise been suggested to function as the general base in the related hexapeptide xenobiotic acetyltransferase from Pseudomonas aeruginosa (28).
In the absence of a complex with GlcN-1-P, we have modeled GlcN-1-P into the small pocket containing the catalytic triad and surrounded by bulky side chains protruding from two neighboring subunits and the insertion loop Asn-385-Lys-393. The orientation of GlcN-1-P is constrained by a cluster of sequence-conserved electropositive residues (Arg-332, Lys-350, and Lys-391), candidates for binding the C-1 phosphate group, a hypothesis supported by earlier kinetic studies showing that GlcN is a very poor substrate compared with GlcN-1-P for the acetyltransfer reaction (4). In our model the acceptor amino group on C-2 is within hydrogen-bonding distance of the proposed catalytic base (NE2 of His-362) and ideally poised to make a nucleophilic attack on the thioester (Fig. 4B).
The three crystal structures of SpGlmU in unbound and complexed form described in this paper highlight novel structural features necessary to achieve the acetyltransferase reaction and define a structural template to design new antibiotics. A detailed dissection of the two distinct GlmU catalytic mechanisms must await further crystallographic investigations of substrate and inhibitor complexes.