Open and Closed Structures of the UDP-glucose Pyrophosphorylase from Leishmania major

Uridine diphosphate-glucose pyrophosphorylase (UGPase) is a ubiquitous enzyme that catalyzes the formation of UDP-glucose, a key metabolite of the carbohydrate pathways of all organisms. In the protozoan parasite Leishmania major, which causes a broad spectrum of diseases and is transmitted to humans by sand fly vectors, UGPase represents a virulence factor because it is required for the synthesis of cell surface glycoconjugates. Here we present the crystal structures of the L. major UGPase in its uncomplexed apo form (open conformation) and in complex with UDP-glucose (closed conformation). The UGPase consists of three distinct domains. The N-terminal domain exhibits species-specific differences in length, which might permit distinct regulation mechanisms. The central catalytic domain resembles a Rossmann fold and contains key residues that are conserved in many nucleotidyltransferases. The C-terminal domain forms a left-handed parallel β-helix (LβH), which represents a rarely observed structural element. The presented structures together with mutagenesis analyses provide a basis for a detailed analysis of the catalytic mechanism and for the design of species-specific UGPase inhibitors.

Uridine-diphosphate-glucose pyrophosphorylase (UGPase; EC 2.7.7.9) is present in all three kingdoms of life and catalyzes the reaction UTP + glucose-1-phosphate → UDP-glucose + PPi in the presence of Mg²⁺ in vivo. UDP-glucose, the activated form of glucose, plays an essential role in the metabolism of carbohydrates in all organisms. UDP-glucose is the main glucosyl donor for the formation of glycogen, starch, and cellulose, as well as for the synthesis of glucose-containing glycolipids, glycoproteins, and proteoglycans (1,2). In addition, other important nucleotide sugars such as UDP-xylose, UDP-glucuronic acid and UDP-galactose are derived from UDP-glucose. In bacteria some of these activated sugars are used to build the bacterial polysaccharide capsule, which often represents the sole determinant of virulence of these organisms. In Streptococcus pneumoniae, for example, mutants containing an inactivated UGPase gene (galU) are known to be completely avirulent, as they are unable to form the polysaccharide capsule (3). Similarly, UGPase is involved in the biosynthesis of the LPS core in Escherichia coli, and E. coli galU mutants consequently show reduced adhesion (4).
The protozoan parasite Leishmania is the causative agent of a widespread group of diseases collectively known as leishmaniasis. The disease affects more than 12 million people worldwide, and until now no specific drug is available to cure it (5). Leishmania express various glycoconjugates on their cell surface, which is dynamically modified during the parasite life cycle, allowing survival and proliferation in the sand fly vector as well as in the mammalian host (6,7). The biosynthesis of glycoconjugates essentially depends on the availability of activated nucleotide sugars. UGPase occupies a key position in the activation of glucose and galactose, which are major components of Leishmania glycoconjugates. Formation of UDP-glucose is a prerequisite for the synthesis of UDP-galactose, which is achieved either by nucleotide transfer from UDP-glucose onto galactose-1-phosphate or by epimerization of UDP-glucose to UDP-galactose. As galactose-containing glycoconjugates are important virulence factors in Leishmania major (L. major), its UGPase has been intensively characterized. Like other pyrophosphorylases, the L. major UGPase requires a divalent metal ion, Mg²⁺, and acts reversibly in vitro. The enzymatic reaction follows an ordered bi-bi mechanism in which binding of UTP precedes binding of glucose-1-phosphate. Interestingly, UDP and UMP are not recognized by the enzyme and do not promote formation of the binding site for glucose-1-phosphate (8). Thus, the γ-phosphate group of UTP is essential both for binding of the nucleotide and for inducing the presumed change in conformation, which then allows glucose-1-phosphate to bind to the active site.
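The ordered bi-bi scheme mentioned above can be illustrated with the textbook steady-state initial-rate equation for an ordered two-substrate reaction in the absence of products. The sketch below is only an illustration: the function name and all kinetic constants are hypothetical placeholders, not values measured for the L. major enzyme.

```python
def ordered_bibi_rate(a, b, v_max, k_a, k_b, k_ia):
    """Initial rate for an ordered bi-bi mechanism without products.

    a    : concentration of the first substrate (here UTP)
    b    : concentration of the second substrate (here glucose-1-phosphate)
    k_a  : Michaelis constant for the first substrate
    k_b  : Michaelis constant for the second substrate
    k_ia : dissociation constant of the enzyme-first-substrate complex
    All constants are hypothetical and must share one concentration unit.
    """
    denominator = k_ia * k_b + k_b * a + k_a * b + a * b
    return v_max * a * b / denominator


# Illustrative call with made-up constants (arbitrary units).
rate = ordered_bibi_rate(a=1.0, b=1.0, v_max=1.0, k_a=1.0, k_b=1.0, k_ia=1.0)
```

With either substrate absent the rate is zero, and at saturating concentrations of both substrates it approaches v_max, as expected for a sequential mechanism.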
Several mutagenesis studies and a modeling approach in eukaryotic UGPases have been performed, which identified specific residues important for catalytic activity (9-12). However, these investigations alone were not sufficient to resolve the catalytic mechanism. Knowledge of the three-dimensional structure is, therefore, of utmost importance. In this study we solved the three-dimensional structure of the L. major UGPase apo protein and of the complex of UGPase with the product UDP-glucose. Comparison of the two structures revealed a transition from an open to a closed conformation upon product binding, accompanied by major domain motions. In combination with mutagenesis data, our results provide first insight into the mechanism of UDP-glucose formation and establish the structural basis for the rational design of inhibitors that specifically affect the L. major enzyme.

EXPERIMENTAL PROCEDURES
Protein expression and purification - The gene encoding UGPase from L. major was subcloned into the His-tag expression vector pET-22b (Novagen). The recombinant plasmid pET-UGP-His was transformed into the E. coli expression strain BL21(DE3), and the protein was overexpressed and purified by Ni²⁺ chelating chromatography as described previously (8). As a final purification step, the buffer of the purified protein was exchanged for buffer A (1 mM DTT, 2 mM MgCl₂, 10 mM Tris-HCl pH 7.5, 100 mM NaCl) using a HiTrap desalting column (Amersham). Finally, the protein was concentrated to 20 mg/ml. Selenomethionine (SeMet)-labeled protein was produced in the methionine-auxotrophic E. coli strain B834(DE3) grown in minimal medium without methionine, supplemented with SeMet (0.3 mM). After induction with 1 mM IPTG the cells were grown at 15°C for 18 h. The purification procedure was identical to that of the native protein.
Incorporation of selenomethionine was confirmed by determination of the molecular mass using ESI-MS.
In vitro activity measurement - The in vitro activity of mutant and wild type UGPase was measured in the forward reaction by a coupled enzymatic assay as described previously (8). Briefly, the assay was performed at 25°C in 50 mM Tris/HCl pH 7.8, 10 mM MgCl₂, 1 mM 2-mercaptoethanol, 1 mM UTP (Roche), 1 mM glucose-1-phosphate (Sigma), 1 mM NAD⁺ (Roche) and 0.05 units UDP-glucose dehydrogenase (Calbiochem). The reaction was started by the addition of wild type or mutant UGPase. The reduction of the cofactor NAD⁺ to NADH + H⁺ was monitored at 340 nm and the initial linear rates were used to calculate the enzymatic activity.
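As an illustration of how an initial slope at 340 nm can be converted into an enzymatic rate, the following sketch applies the Beer-Lambert law. It rests on assumptions that are not stated in the text: the commonly used molar absorption coefficient of NADH at 340 nm (6220 M⁻¹ cm⁻¹), a 1 cm path length, a 1 ml assay volume, and the known stoichiometry of UDP-glucose dehydrogenase, which reduces two NAD⁺ per UDP-glucose oxidized. The function name is hypothetical.

```python
# Molar absorption coefficient of NADH at 340 nm (commonly used literature value).
EPSILON_NADH_340 = 6220.0  # M^-1 cm^-1


def ugpase_rate_umol_per_min(delta_a340_per_min, assay_volume_ml=1.0,
                             path_length_cm=1.0, nadh_per_udp_glucose=2):
    """Convert an initial slope (absorbance units per minute) into
    micromoles of UDP-glucose formed per minute.

    Volume, path length and the NADH:UDP-glucose ratio are assumptions
    of this sketch, not parameters reported in the text.
    """
    # Beer-Lambert law: change in NADH concentration in mol/l per minute.
    nadh_molar_per_min = delta_a340_per_min / (EPSILON_NADH_340 * path_length_cm)
    # mol/l * ml = mmol * 1e-3, so multiplying by 1000 gives micromoles per minute.
    nadh_umol_per_min = nadh_molar_per_min * assay_volume_ml * 1000.0
    # Each UDP-glucose consumed by the coupling enzyme yields two NADH.
    return nadh_umol_per_min / nadh_per_udp_glucose


# Example with a made-up slope of 0.1244 absorbance units per minute.
rate = ugpase_rate_umol_per_min(0.1244)
```

Dividing the result by the amount of UGPase in the cuvette would give a specific activity, which allows mutants to be expressed as a percentage of wild type.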
Crystallization and Data Collection - Crystals were grown at 18°C using the sitting drop vapor diffusion method. As the crystallization of SeMet-labeled protein yielded better diffracting crystals, the labeled protein was used for all trials. Crystals of the apo protein were grown from 21% PEG 3350, 100 mM Bis-Tris pH 7.0, 200 mM Li₂SO₄. Crystals of UGPase in complex with UDP-glucose were obtained from 27% PEG monomethyl ether 2000, 100 mM Bis-Tris pH 6.5. For cocrystallization UDP-glucose (5 mM) was used in excess. All crystals belong to space group C222₁ and contain one molecule per asymmetric unit. Diffraction data of the apo protein as well as the MAD datasets of the complex were collected at beamline BW6 at the German Electron Synchrotron (Hamburg, Germany). Data sets were integrated, scaled and merged by using DENZO and SCALEPACK (13). The data collection statistics are summarized in Table 1.
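To give a sense of what "in excess" means here, the ligand concentration can be compared with the molar concentration of the protein stock. The short calculation below is a rough sketch: it assumes the 20 mg/ml stock and the 54.5 kDa molecular mass quoted elsewhere in the text, and it ignores any dilution on mixing with the reservoir solution. The function names are hypothetical.

```python
def molar_concentration_mM(mass_conc_mg_per_ml, molecular_mass_da):
    """Convert a mass concentration (mg/ml, equal to g/l) into millimolar."""
    return mass_conc_mg_per_ml / molecular_mass_da * 1000.0


def molar_excess(ligand_mM, protein_mM):
    """Ratio of ligand to protein molecules at the given concentrations."""
    return ligand_mM / protein_mM


# Assumed inputs: 20 mg/ml protein stock, 54.5 kDa, 5 mM UDP-glucose.
protein_mM = molar_concentration_mM(20.0, 54500.0)
excess = molar_excess(5.0, protein_mM)
```

Under these assumptions the protein stock is roughly 0.37 mM, so 5 mM UDP-glucose corresponds to an excess of more than tenfold.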
Phase Determination and Structure Refinement - The structure of the UGPase + UDP-glucose complex was solved by the multiwavelength anomalous dispersion (MAD) phasing method with programs from the CCP4 collection (14). The positions of 16 SeMet atoms present in the asymmetric unit of the complexed form were determined using the program SHELXD (15). Data were phased and refined to a resolution of 2.3 Å with MLPHARE. Initial phases were improved by solvent flattening with the program DM from the CCP4 package (Tab. 1).
The resulting electron density map was used in an automated model building approach. ARP/wARP including the warpNtrace protocol (16) was used to obtain a first glycine trace of the protein structure. Starting from this initial model, the complete catalytic domain and most of the N-terminal domain could be built. The model was refined in CNS (17) and rebuilt in O (18). The loop region between the catalytic domain and the C-terminal domain was not well defined in the electron density of the ligand-bound structure, while this part was traceable in the apo structure.
The apo structure was solved by molecular replacement using the likelihood-based molecular replacement program PHASER-1.2 (19). The coordinates of 482 residues, that is all except the first six and the last six residues, could be refined using the excellent difference electron density map. The loop containing residues 267 to 274 and residues 467 to 469 showed weak density. In neither of the two structures was it possible to detect electron density that could be assigned to Mg²⁺. However, in the apo structure two peaks of electron density are located at the active site and at the protein surface, which apparently are sulfate ions (SO₄²⁻) from the screening buffer. The refinement was done with CNS and the model rebuilding was performed in O.
The well defined C-terminal domain of the apo structure was used to solve the missing C-terminal domain of the ligand structure by molecular replacement. The final model of the ligand structure contained all residues except the first and the last six residues of the full-length protein. The ligand UDP-glucose was built into the model with the automated ligand building module from ARP/wARP version 6.1. Averaged sigma-A-weighted and composite-omit (2Fo-Fc) maps were used for model building of the ligand in O. Water molecules were added and refined by CNS. The statistical data of the refinement are presented in Table 2. Figures were prepared with … (21), RASTER3D (22), and ALSCRIPT (23).

RESULTS
The 2.3 Å resolution structure of the 54.5 kDa full-length L. major UGPase complexed with UDP-glucose (closed conformation) was solved by MAD using selenomethionine-substituted protein. Except for a small part of the N- and C-terminus, the whole model (S7-P488) could be built, with two surface loops (K267-D274, N466-S471) associated with weak electron density. The model has overall good stereochemistry with 99% of the residues in the preferred φ/ψ conformation and final R values of Rcrys = 0.21 and Rfree = 0.25.
The substrate-free apo protein (open conformation) also crystallized in space group C222₁ but with a different unit cell, and its structure was solved by molecular replacement. The model of the apo protein was refined to 2.4 Å with final R values of Rcrys = 0.21 and Rfree = 0.26. In the apo model 87% of the residues displayed a main chain conformation in the most favored regions and 12% in the additional allowed regions as defined by the Ramachandran plot. All residues with unusual main chain conformation were either very well defined in the electron density map or occupied flexible regions of the model. A summary of the data collection is presented in Table 1 and the refinement statistics are summarized in Table 2.
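The R values quoted above follow the definition given in the table footnotes, Rcrys = Σ|Fobs - Fcalc| / ΣFobs, with Rfree computed in the same way on a set of reflections excluded from refinement. The sketch below shows the arithmetic on made-up amplitudes. It is not the procedure implemented in CNS, and the helper that picks the test set is a hypothetical simplification.

```python
import random


def r_factor(f_obs, f_calc):
    """Crystallographic R value: sum |Fobs - Fcalc| / sum Fobs."""
    if len(f_obs) != len(f_calc):
        raise ValueError("amplitude lists must have the same length")
    numerator = sum(abs(o - c) for o, c in zip(f_obs, f_calc))
    return numerator / sum(f_obs)


def pick_free_set(n_reflections, free_fraction=0.05, seed=0):
    """Randomly choose reflection indices to omit from refinement.

    The fraction is 0.05 for the complex and 0.10 for the apo data set
    according to the table footnotes; the random selection itself is
    an assumption of this sketch.
    """
    rng = random.Random(seed)
    n_free = round(n_reflections * free_fraction)
    return set(rng.sample(range(n_reflections), n_free))


# Made-up structure factor amplitudes for illustration only.
f_obs = [100.0, 200.0, 300.0]
f_calc = [90.0, 210.0, 330.0]
r_value = r_factor(f_obs, f_calc)
```

Because the free reflections never enter refinement, a gap between Rfree and Rcrys of a few percentage points, as reported here, is the expected outcome for a model that is not overfitted.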
Overall structure - UGPase contains an N-terminal domain, a central catalytic domain and a C-terminal domain (Fig. 1a,b). The catalytic domain harbors the nucleotide sugar binding site and resembles the Rossmann fold seen in many nucleotide binding proteins. It consists of a central, highly bent, twisted and mixed sheet composed of seven β-strands arranged in the order 4-3-1-7-11-8-14, all of which are aligned in parallel except β11. This central sheet is topped at one end by a small two-stranded anti-parallel β-sheet (β2a-β2b) and flanked at the opposite side by a long two-stranded anti-parallel β-sheet (β9-β10), which points to a protruding extension. The central sheet is additionally surrounded by eleven α-helices and seven 3₁₀ helices. The side of the central sheet that faces the N-terminal domain is partially covered by the α-helix α8. In contrast, the top side of the central sheet is solvent accessible and faces the nucleotide sugar moiety. Furthermore, a sequence insertion in the Leishmania enzyme forms the long loop (residues R261-L280) protruding from the catalytic domain, which has the appearance of a handle (Fig. 1b).
The N-terminal domain is discontinuously formed by residues S7-T44, encompassing the two N-terminal α-helices α1 and α2, and two further building blocks (residues N163-P188 and P331-A355) that protrude from the catalytic domain (Fig. 1b). These segments contain a four-stranded anti-parallel β-sheet (β5a, β6, β12 and β13), flanked on one side by α1 and α2 and on the other side by the short β-strand β5b and a 3₁₀ helix.
The C-terminal domain is connected to the catalytic domain via a long loop that encompasses two anti-parallel β-strands (β15, β16) and a 3₁₀ helix. The domain comprises residues D391-P488, which are arranged in a left-handed parallel β-helix (LβH) composed of three complete rungs. The first rung contains a large loop and is atypically composed of a shortened β-strand, two subsequent α-helices, and another β-strand. The second and the third rung are formed by three canonical short β-strands. The LβH is assembled from tandem repeats of imperfect hexapeptides, which harbor strongly conserved hydrophobic residues at the so-called i-positions (Fig. 1b) pointing towards the interior of the LβH. Generally, small residues occupy the corner positions at the sharp turns.
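The six-residue periodicity of hydrophobic i-positions described above can be checked with a simple register scan. The following sketch is a toy illustration, not the analysis used for the structure: the residue set counted as hydrophobic and the example segment are assumptions chosen only to demonstrate the idea.

```python
# Residues counted as hydrophobic in this sketch (an assumption).
HYDROPHOBIC = set("AVLIMFWC")


def hexad_register_scores(segment, period=6):
    """For each register 0..period-1, return the fraction of residues
    at positions offset, offset+period, ... that are hydrophobic."""
    scores = []
    for offset in range(period):
        residues = segment[offset::period]
        hits = sum(1 for residue in residues if residue in HYDROPHOBIC)
        scores.append(hits / len(residues) if residues else 0.0)
    return scores


# Invented 18-residue segment with a hydrophobic residue every sixth position.
example_segment = "LGDGSN" + "IGKNSE" + "VGDNTK"
scores = hexad_register_scores(example_segment)
best_register = scores.index(max(scores))
```

A register whose score stands out clearly from the other five is a candidate for the i-position. Imperfect repeats, such as the atypical first rung, would lower the score without abolishing the pattern.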
Active site - The refined structure of L. major UGPase in complex with UDP-glucose, the product of the enzymatic reaction, defines the location of the active site of the enzyme (Fig. 2a). Figure 2b shows the well defined electron density corresponding to UDP-glucose. The UDP-glucose is bound in a deep groove located at the center of the catalytic domain that consists of highly conserved residues. The nucleotide moiety is fixed by a mixture of hydrogen bonds and hydrophobic interactions established by residues of the conserved NB loop (nucleotide-binding loop; Fig. 1b). The NB loop (residues K80-K95), which constitutes a roof above the nucleotide in the active site, contains the glycine-rich consensus sequence motif K-L-N-G-G-L-G-T-X-M-G-(X)₄-K. The basic residues H191 and K380 interact with the negatively charged phosphate groups of UDP-glucose (Fig. 3a). The glucose moiety of the product is bound to a depression that is adjacent to the nucleotide. The hydroxyl groups of the sugar are fixed by numerous hydrogen bonds (Fig. 3a). These observations are in good agreement with data obtained from saturation transfer difference (STD)-NMR, indicating that all protons of the sugar moiety are in intimate contact with the protein (8). At the end of the cleft F305 contacts the sugar moiety in the closed conformation, thereby limiting the space available for sugar binding. The SB loop (substrate binding loop, residues T250-G257) covers the sugar within the binding site and contains the two highly conserved glycines G256 and G257 that rearrange during the catalytic reaction (see below).
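The NB loop consensus given above, K-L-N-G-G-L-G-T-X-M-G-(X)4-K, spans 16 residues and translates directly into a regular expression in which X stands for any residue. The sketch below searches a one-letter protein sequence for the motif. The example sequence is invented for illustration and is not the L. major UGPase sequence.

```python
import re

# K-L-N-G-G-L-G-T-X-M-G-(X)4-K, with X matching any single residue.
NB_LOOP_MOTIF = re.compile(r"KLNGGLGT.MG.{4}K")


def find_nb_loop(sequence):
    """Return (1-based start position, matched segment) for each
    non-overlapping occurrence of the NB loop motif."""
    return [(match.start() + 1, match.group())
            for match in NB_LOOP_MOTIF.finditer(sequence)]


# Invented test sequence: three flanking alanines on each side of a motif instance.
hits = find_nb_loop("AAA" + "KLNGGLGTAMGSSSSK" + "AAA")
```

Applied to an alignment of UGPase sequences, a strict pattern of this kind would miss homologues with conservative substitutions, so a profile-based search would be the more sensitive choice in practice.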
The binding mode of UDP-glucose and the active site architecture of UGPase in general provide information about a putative binding mechanism of the UGPase substrate UTP. When the nucleotide moiety of UTP is superimposed with the uridyl group of UDP-glucose, the β- and γ-phosphate groups of UTP can be positioned into a small cavity above the nucleotide binding site that is lined by K380 and K95 (Fig. 3b), leading to a conformation of UTP similar to that of dTTP in the structurally related glucose-1-phosphate thymidylyltransferase (RmlA/RffH) (24,25).
Based on our structural data we mutated several substrate binding residues in the active site of UGPase to further study the importance of these residues for catalysis. K380 forms a hydrogen bond to the α-phosphate group of UDP-glucose. Most probably, both residues K380 and K95 bind to the phosphate groups of the substrate UTP. Consistent with this, no catalytic activity was observed for the K380A mutant, in contrast to a residual activity of 0.5% for the K95A mutant (Tab. 3). Residue H191 binds and stabilizes the β-phosphate group of UDP-glucose and probably also the phosphate group of the substrate glucose-1-phosphate. When H191 is substituted by the apolar leucine, a residue of similar size, the enzyme is inactive since the hydrogen bond between histidine and the β-phosphate is abolished. In the H191N mutant asparagine may still be able to form a hydrogen bond to the substrate, which is in agreement with the observed reduced activity of 0.3% of the H191N mutant and the natural occurrence of asparagine at position 223 in the human UDP-N-acetylglucosamine PPase (AGX) (26).
Comparison of open and closed form -Upon product binding and presumably also upon  Fig. 2a and 4).
The superposition of the structures in the open and the closed conformation reveals a significant relocation of the NB loop towards the ligand (Fig. 2a). On the opposite edge of the active site the SB loop with the conserved glycines G256 and G257 moves over the sugar moiety of the product, and the side chain of F305 rearranges to tighten up the active site (Fig. 2a, 3a and 4). The handle-like extension formed by β-strands β9 and β10 and the connecting loop adopts very different conformations in the apo and in the UDP-glucose complexed UGPase structures. At present, the role of the handle movement is not clear. However, residues G256 and G257 at the beginning and E284 at the end of the handle, as well as the adjacent residues V246 to A260 and L281 to S303, perform a 12° turn towards the sugar moiety in the UDP-glucose UGPase complex compared to the apo structure (Fig. 2a and 4). This observation supports the assumption that the large movement of the handle is at least partially induced by ligand binding.
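A turn such as the 12° movement described above is commonly quantified as the rotation angle of the rigid-body transformation that superimposes a segment in one structure onto the same segment in the other. The text does not state how the angle was measured, so the following is only a sketch of one standard approach: given a 3×3 rotation matrix from a superposition, the angle follows from its trace. Both function names are hypothetical.

```python
import math


def rotation_angle_deg(rotation):
    """Rotation angle (degrees) of a 3x3 rotation matrix, from its trace:
    cos(angle) = (trace - 1) / 2."""
    trace = rotation[0][0] + rotation[1][1] + rotation[2][2]
    # Clamp to [-1, 1] to guard against rounding errors.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))


def rotation_about_z(angle_deg):
    """Helper for the example: rotation by angle_deg about the z axis."""
    a = math.radians(angle_deg)
    return [[math.cos(a), -math.sin(a), 0.0],
            [math.sin(a), math.cos(a), 0.0],
            [0.0, 0.0, 1.0]]


# Example: a synthetic 12 degree rotation is recovered from its matrix.
angle = rotation_angle_deg(rotation_about_z(12.0))
```

In a real comparison the rotation matrix would come from a least-squares superposition of the moving segment after the two structures have been aligned on a fixed reference, for example the core of the catalytic domain.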

DISCUSSION
UGPases of eukaryotic origin are about five hundred residues long and show 40% similarity in protein sequence (Fig. 1b). Prokaryotic UGPases in general are shorter and their sequences align with a very limited degree of conservation. Nevertheless, both human and yeast UGPase are able to complement UGPase (GalU)-deficient E. coli mutants (27,28). In contrast to the monomeric L. major and plant UGPases (8,11), the active animal and fungal UGPases are oligomeric. The enzyme Ugp1p from Saccharomyces cerevisiae forms octamers in crystals (29). This enzyme also harbors a C-terminal domain with a left-handed parallel β-helix, which mediates the association in the homooctamer. In the related enzymes ADP-glucose pyrophosphorylase from potato tuber (30) and the prokaryotic UDP-N-acetylglucosamine PPase GlmU (31), similar LβHs also mediate the oligomerization. In particular, a trimeric assembly is seen in GlmU and in other enzymes with a β-helix domain (31), suggesting a possible function of these β-helix domains in protein assembly. In contrast, an N-terminal domain with a fold similar to that seen in UGPase is only found in the AGX (with a Dali Z score of 3.4) and Ugp1p enzymes. In Ugp1p of S. cerevisiae, specific phosphorylation of serine 11 by the kinase PAS (34) was shown, indicating a regulatory role of this N-terminal domain. Deletion of the PAS gene in yeast resulted in an altered sugar flux when grown on minimal media.
A close inspection of the active site residues reveals a high conservation between the human and the Leishmania sequences, which makes it extremely difficult to design specific active site inhibitors against L. major. Since our structure demonstrates that in the Leishmania enzyme glycine G220, which is replaced by isoleucine in humans, provides additional space in the vicinity of the active site, we suggest exploring this additional space with substrate analogues that are alkylated at O3 of the nucleotide ribose. Due to the limited space in the human enzyme, O3-alkylated derivatives should not bind to the mammalian enzyme.
Interestingly, the mammalian enzyme does not contain the long insertion forming the handle-like structure (Fig. 1b). Since amino acids at the base of this handle are involved in catalysis and their conformation differs in the open and closed structure of L. major UGPase, this site might be an attractive target for inhibitor design. Upon closure of the structure a hydrophobic patch consisting of I47, A51, I52, and L281 becomes more accessible for potential effectors (Fig. 2a). The mutation L281D weakens the structural integrity of this patch and results in a reduction of the activity to 16%. Further studies have to be performed to show whether this hydrophobic region offers a potential contact site for allosteric regulation.
Catalytic Mechanism - It was demonstrated that the activation of glucose-1-phosphate to UDP-glucose and the release of pyrophosphate by UGPase proceed via an ordered bi-bi mechanism (8,35,36). In accordance with these findings, the obtained structural data suggest the following reaction mechanism: a) First, UTP binds to the active site in a position comparable to that observed for the nucleotide moiety of the product UDP-glucose, with its β- and γ-phosphate groups pointing towards the positively charged residue K95, which is part of the conserved NB loop. Consistently, the mutation K95A resulted in an almost complete loss of activity of L. major UGPase (Tab. 3). Flores-Diaz and coworkers reported a second NB loop mutation for the Chinese hamster UGPase (1), corresponding to G84D in the L. major sequence, which resulted in a nearly inactive enzyme. Our crystal structure illustrates that any amino acid residue at position 84 that is larger than glycine causes steric hindrance of UTP binding. Moreover, all amino acids different from glycine interfere with the conformational freedom required at this position. Residues G83 and G84 adopt conformations that are accessible only to glycine. The same is seen for the glycines G256 and G257 within the SB loop (Fig. 3a), which display large conformational variations (G256: φ/ψ = -75°/169° in the open conformation and φ/ψ = 154°/-147° in the closed conformation). b) Upon coordination of UTP, glucose-1-phosphate binds to UGPase with its sugar moiety probably positioned similarly to the glucose in the UDP-glucose UGPase complex. In the apo UGPase structure the conserved asparagines N219 and N308 coordinate a sulfate ion that was present in the crystallization buffer (Fig. 3b). The position of this sulfate ion, which is localized between the corresponding positions of the two phosphate groups of UDP-glucose, may resemble that of the nucleophilic phosphate group of glucose-1-phosphate.
c) Presumably substrate binding induces structural changes towards the closed conformation of UGPase. The substrates should be fixed in their positions by the NB and the SB loop, whereby the active site is tightened up by an aliform approach of loops that surround the active site. In the transition state the two reacting phosphate groups have to approach each other to a distance below their van der Waals contact distance. The nucleophilic phosphate group of glucose-1-phosphate is coordinated by N219 and N308 via hydrogen bonds as described above. The attacked α-phosphate of UTP is coordinated by K380 and H191, which are able to stabilize the evolving negative charge on the oxygen atom of the α-phosphate group in the pentavalent transition state. Consequently, the mutations K380Q (9) and K380A (this work) resulted in an inactive enzyme (Tab. 3). Replacement of H191 in the H191N mutant also dramatically reduced the catalytic activity of UGPase, which demonstrates the importance of H191, which stabilizes the transition state by interacting with glucose-1-phosphate via hydrogen bonds as well as with UTP via hydrophobic interactions. d) Once the reaction has taken place, the phosphate group of glucose-1-phosphate becomes the β-phosphate of UDP-glucose. The coordination of this phosphate group changes from N219 and N308 to H191 and K380 as seen in the UDP-glucose bound structure, which results in a loss of two hydrogen bonds. In parallel, the covalent link to the β-phosphate group of UTP is lost, leading to less tightly bound products, which then easily dissociate from the enzyme.
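The size of the backbone rearrangement at G256 quoted in step a) can be made explicit by computing the change in each dihedral angle between the open (φ/ψ = -75°/169°) and closed (φ/ψ = 154°/-147°) conformations. Because dihedral angles are periodic, the difference has to be wrapped into the range from -180° to 180°. The helper below is a small sketch of that calculation, and its name is hypothetical.

```python
def wrapped_angle_difference(start_deg, end_deg):
    """Smallest signed rotation in degrees that takes start_deg to end_deg,
    reported in the interval (-180, 180]."""
    difference = (end_deg - start_deg) % 360.0
    if difference > 180.0:
        difference -= 360.0
    return difference


# G256 backbone angles as quoted in the text (open -> closed).
delta_phi = wrapped_angle_difference(-75.0, 154.0)
delta_psi = wrapped_angle_difference(169.0, -147.0)
```

With these values φ changes by 131° in magnitude while ψ changes by only 44°, which illustrates why a residue with the conformational freedom of glycine is needed at this position.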
In conclusion, the structures of L. major apo UGPase and of the UDP-glucose UGPase complex provide a basis for an understanding of the catalytic mechanism of UGPases and offer a template for the structure-based design of specific inhibitors. Nevertheless, the regulation of this important enzyme and the function of the N- and C-terminal domains require further studies.

Values for the highest resolution shell are given in parentheses.
a Rcrys = Σ|Fobs - Fcalc| / ΣFobs.
b Rfree was determined from 5% of the data (10% of the apo dataset) that were omitted from the refinement.
c Ramachandran plot distribution refers to the most favored/additional/generously/disallowed regions as defined by Procheck (37).