Structural Basis for Reversible Phosphorolysis and Hydrolysis Reactions of 2-O-α-Glucosylglycerol Phosphorylase

Kouki K. Touhara (東原幸起), Takanori Nihira (仁平高則), Motomitsu Kitaoka (北岡本光), Hiroyuki Nakai (中井博之), and Shinya Fushinobu (伏信進矢)
From the Department of Biotechnology, University of Tokyo, 1-1-1 Yayoi, Bunkyo-ku, Tokyo 113-8657, the Faculty of Agriculture, Niigata University, Niigata 950-2181, and the National Food Research Institute, National Agriculture and Food Research Organization, Tsukuba, Ibaraki 305-8642, Japan

In this study, we employed x-ray crystallography to clarify the molecular basis of GG recognition and the hydrolysis of βGlc1P. We determined two crystal structures of GGP, one in complex with glucose and the other in complex with isofagomine and glycerol. Moreover, automated docking and mutational analyses were carried out to investigate substrate recognition and the reaction mechanism of GGP.

EXPERIMENTAL PROCEDURES
Protein Expression and Purification-The expression vectors for C-terminally (His)₆-tagged wild-type GGP and its E475Q mutant (11) were introduced into Escherichia coli BL21 Codon-Plus (DE3)-RIL (Stratagene, La Jolla, CA). The transformants were cultured in Luria-Bertani medium containing 50 mg/liter of kanamycin and 100 mg/liter of chloramphenicol at 37°C for 4 h. Isopropyl 1-thio-β-D-galactopyranoside was added to a final concentration of 0.1 mM to induce protein expression. Following an additional incubation at 20°C for 24 h, the cells were harvested by centrifugation and suspended in 20 mM HEPES-NaOH (pH 7.5) and 500 mM NaCl. Cell extracts were obtained by sonication followed by centrifugation to remove cell debris. The protein was purified to homogeneity by sequential column chromatography on nickel-nitrilotriacetic acid superflow (Qiagen, Hilden, Germany), MonoQ 10/100 GL, and Superdex 200 pg 16/60 columns (GE Healthcare). The protein concentration was determined spectrophotometrically at 280 nm using a theoretical extinction coefficient of ε = 124,110 M⁻¹ cm⁻¹ based on the amino acid sequence. The E475Q mutant was overexpressed and purified in the same way as wild-type GGP.
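The spectrophotometric concentration estimate follows the Beer-Lambert relation, c = A / (ε · l). The sketch below is illustrative only: the absorbance reading, dilution factor, and path length are hypothetical, and the per-residue contributions (5,500 M⁻¹ cm⁻¹ for Trp, 1,490 M⁻¹ cm⁻¹ for Tyr) are the values commonly used for sequence-based estimates, assumed here because the text does not state which values were applied. Under that assumption, removing one Tyr or one Trp from the wild-type coefficient reproduces the mutant coefficients quoted under Construction of Mutants and Kinetic Analysis.

```python
# Beer-Lambert estimate of protein concentration from absorbance at 280 nm.
# EPSILON_WT is the value quoted in the text; everything else is illustrative.

EPSILON_WT = 124_110  # M^-1 cm^-1, wild-type GGP (from the text)

# Commonly used per-residue contributions at 280 nm (an assumption here;
# the text does not say which values the authors used).
EPS_TRP = 5_500  # M^-1 cm^-1
EPS_TYR = 1_490  # M^-1 cm^-1


def molar_concentration(a280, epsilon, path_cm=1.0, dilution=1.0):
    """Concentration (M) of the undiluted sample: c = A * dilution / (epsilon * l)."""
    if epsilon <= 0 or path_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    return a280 * dilution / (epsilon * path_cm)


def epsilon_after_substitution(epsilon, lost_trp=0, lost_tyr=0):
    """Adjust an extinction coefficient for Trp/Tyr residues replaced by Phe."""
    return epsilon - lost_trp * EPS_TRP - lost_tyr * EPS_TYR


if __name__ == "__main__":
    # Hypothetical reading: A280 = 0.62 measured on a 10-fold diluted sample.
    conc = molar_concentration(a280=0.62, epsilon=EPSILON_WT, dilution=10)
    print(f"wild-type concentration: {conc * 1e6:.1f} uM")
    print("Tyr->Phe mutant epsilon:", epsilon_after_substitution(EPSILON_WT, lost_tyr=1))
    print("Trp->Phe mutant epsilon:", epsilon_after_substitution(EPSILON_WT, lost_trp=1))
```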
Crystallography-Crystals of GGP were obtained at 20°C using the hanging drop vapor diffusion method by mixing 1 μl of protein solution containing 4.3-5.7 mg/ml of protein with an equal volume of reservoir solution containing 28% polyethylene glycol 400, 0.1 M HEPES-NaOH (pH 7.5), and 0.2 M CaCl₂. A protein solution containing 10 mM D-glucose was used to obtain wild-type GGP crystals in complex with glucose. The crystals of the E475Q mutant were obtained using a protein solution containing 0.1 mM isofagomine (Toronto Research Chemicals, North York, Canada) and 10% glycerol. The hexagonal crystals grew in ~5 days. The diffraction data were collected at 100 K (λ = 1.000 Å) using a charge-coupled device camera on BL17A at the Photon Factory of the High Energy Accelerator Research Organization (KEK, Tsukuba, Japan) and processed using HKL2000 (23). Molecular replacement was performed using Molrep (24). The search model was prepared using a homology modeling server with CsKP (PDB code 3WIR) as a template. Automated model building and refinement were performed using ARP/wARP (25). Manual model rebuilding and refinement were performed using Coot (26) and Refmac5 (27). The data collection and refinement statistics are provided in Table 1. The molecular interfaces were analyzed using the PDBe PISA server. Molecular graphic images were prepared using PyMol (DeLano Scientific, Palo Alto, CA).
Docking Simulations-The ligand molecule (βGlc1P) was generated using PCModel (Serena Software, Bloomington, IN). The crystal structure of GGP in complex with glucose was used for this docking study by removing all water and ligand molecules. The docking studies were carried out using the AUTODOCK 4.2 program (28). Using AutoDockTools, polar hydrogen atoms were added to amino acid residues, and Gasteiger charges were assigned to all atoms of the enzyme. The van der Waals and electrostatic energy grid maps were prepared using the AUTOGRID program, with 80 × 80 × 80 points spaced at 0.375 Å distances. The grid box was centered on the C1 atom of glucose bound to subsite −1. The ligand structure was docked with flexible torsion angles, whereas the protein structure was fixed. Eight torsion angles of βGlc1P were rotatable but the sugar ring was fixed in the ⁴C₁ conformation. An initial random population size of 150 individuals, a maximum number of 2,500,000 energy evaluations, and a maximum number of generations of 27,000 were used for the Lamarckian genetic algorithm search. At the end of a docking procedure of 256 docking runs, the resulting positions were clustered according to a root mean square criterion of 2.0 Å.
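The final clustering step can be pictured with a minimal sketch. It assumes a simple greedy scheme in which poses are visited from lowest to highest energy and each pose joins the first cluster whose lowest-energy member lies within the RMSD tolerance. This is an illustration of the idea, not the AUTODOCK implementation, which may differ in detail (for example, in how symmetric atoms are treated). The function names and the pose format are hypothetical.

```python
import math


def rmsd(a, b):
    """RMSD between two poses given as equal-length lists of (x, y, z) tuples.

    No superposition is applied, because docked poses share the receptor frame.
    """
    if len(a) != len(b) or not a:
        raise ValueError("poses must be non-empty and have the same atom count")
    squared = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(squared / len(a))


def cluster_poses(poses, tolerance=2.0):
    """Greedy clustering of docked poses.

    `poses` is a list of (energy, coords) tuples. Poses are processed in order
    of increasing energy; each joins the first cluster whose seed (its
    lowest-energy member) is within `tolerance` angstroms, else it seeds a new
    cluster. Clusters are returned ranked by the energy of their seed.
    """
    clusters = []
    for energy, coords in sorted(poses, key=lambda pose: pose[0]):
        for cluster in clusters:
            if rmsd(coords, cluster[0][1]) <= tolerance:
                cluster.append((energy, coords))
                break
        else:
            clusters.append([(energy, coords)])
    return clusters


if __name__ == "__main__":
    # Three made-up two-atom poses: the first two are close, the third is far away.
    toy = [
        (-7.1, [(0.5, 0.0, 0.0), (1.5, 0.0, 0.0)]),
        (-7.5, [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]),
        (-6.8, [(5.0, 5.0, 5.0), (6.0, 5.0, 5.0)]),
    ]
    for rank, cluster in enumerate(cluster_poses(toy), start=1):
        print(rank, len(cluster), cluster[0][0])
```

With 256 runs, the size of the first-ranked cluster and the energy gap to the next cluster are the two quantities read off such a ranking.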
Construction of Mutants and Kinetic Analysis-Site-directed mutagenesis was carried out with a PrimeSTAR mutagenesis kit (Takara Bio). The following primers and their complementary strands were used: 5′-caggcctttcagggcgcagcgttctgg-3′ (Y327F), 5′-tatgcatttatcagcggcaagaccggt-3′ (W381F), 5′-caaagaagacgagttttacggctatcca-3′ (Y572F), and 5′-tgcatcgcgcaggcggatgtcattcag-3′ (K587A). The expression plasmids for the mutants were verified by DNA sequencing. The mutant proteins were expressed, purified, and characterized as described previously (11). Extinction coefficients of ε = 122,620 and 118,610 M⁻¹ cm⁻¹ were used to determine the protein concentrations of the Tyr to Phe and Trp to Phe mutants, respectively. The reverse phosphorolytic (synthetic) activity was measured by subtracting the increase in D-glucose (hydrolytic activity) from the increase in Pi (hydrolytic plus synthetic activities) at 30°C in the reaction mixture containing 0-100 mM glycerol, 10 mM βGlc1P, and 40 mM HEPES-NaOH (pH 7.5) (11). The hydrolytic activity was measured based on the increase in D-glucose at 30°C in the reaction mixture containing 0.1-10 mM βGlc1P and 40 mM HEPES-NaOH (pH 7.5) (11). For each enzyme and activity, measurements were collected at no fewer than nine different substrate concentrations. Protein concentrations of 58 nM (wild-type), 1.
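The subtraction used to isolate the synthetic activity, and the subsequent estimation of kinetic parameters, can be sketched as follows. The fitting method is not described in this section (the text refers to (11)), so the Hanes-Woolf linearization used here is an assumption chosen to keep the example dependency-free; nonlinear regression is generally preferred for real data. All rates in the example are made up.

```python
def synthetic_rate(pi_rate, glucose_rate):
    """Reverse-phosphorolysis rate.

    Pi release reports hydrolysis plus synthesis, glucose release reports
    hydrolysis only, so their difference is the synthetic rate.
    """
    return pi_rate - glucose_rate


def fit_michaelis_menten(s_values, v_values):
    """Estimate (Vmax, Km) from the Hanes-Woolf line S/v = S/Vmax + Km/Vmax.

    Points with S <= 0 or v <= 0 cannot be placed on the line and are skipped.
    """
    points = [(s, s / v) for s, v in zip(s_values, v_values) if s > 0 and v > 0]
    if len(points) < 2:
        raise ValueError("need at least two usable points")
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("substrate concentrations must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx                      # 1 / Vmax
    intercept = mean_y - slope * mean_x    # Km / Vmax
    vmax = 1.0 / slope
    return vmax, intercept * vmax


if __name__ == "__main__":
    # Nine hypothetical glycerol concentrations (mM) spanning 0-100 mM.
    glycerol = [0, 2, 5, 10, 20, 40, 60, 80, 100]
    # Made-up release rates (arbitrary units) for illustration only.
    glucose = [1.0] * len(glycerol)
    pi = [1.0 + 10.0 * s / (5.0 + s) for s in glycerol]
    rates = [synthetic_rate(p, g) for p, g in zip(pi, glucose)]
    vmax, km = fit_michaelis_menten(glycerol, rates)
    print(f"Vmax = {vmax:.2f}, Km = {km:.2f} mM")
```

Dividing the fitted Vmax by the enzyme concentration in the assay gives kcat.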

RESULTS AND DISCUSSION
Overall Structure-The crystal structures of wild-type GGP in complex with glucose (GGP-glucose) and the E475Q catalytic acid residue mutant of GGP in complex with isofagomine and glycerol (GGP-isofagomine-glycerol) were determined at 1.9- and 2.3-Å resolutions, respectively (Table 1). The GGP-glucose crystal (C2 space group) contains two monomers (chains A and B) in the asymmetric unit, whereas the GGP-isofagomine-glycerol crystal (P2₁ space group) contains eight monomers (chains A-H). The main chain traces of these 10 structures are almost identical: the root mean square deviations for Cα atoms, calculated without a distance cutoff, are within 1.01 Å for all pairs of molecules. Fig. 1 shows a comparison of the overall structures of GGP, LbMP, and CsKP. GGP shows sequence identities of 28.3% (234/828) and 31.9% (257/805) to LbMP and CsKP, respectively. The GH65 enzymes consist of four distinct regions: an N-terminal β-sandwich domain (residues 1-250 in GGP, blue), a linker region (251-286, cyan), a catalytic (α/α)₆ barrel domain (290-687, yellow), and a C-terminal domain (287-289 and 688-761, red). GGP forms a dimer in solution (11). The GGP dimer shown in Fig. 1A was predicted to be the most probable form in solution among the crystallographic contacts. The dimer interface area is 1457 Å² with 29 hydrogen bonds and 14 salt bridges. The GGP dimer is basically similar to the head-to-head dimer of LbMP (Fig. 1B) (21), but their relative orientations are different. The LbMP dimer is asymmetric due to the large displacement (>10 Å) of the N-terminal domain (21), whereas the GGP dimer is nearly symmetric. The crystal packing of GGP lacks the disulfide bond-linked dimer interfaces between the N-domains that were observed in the CsKP crystals (Fig. 1C) (22).
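As a quick arithmetic check on the figures above, the sketch below recomputes the two sequence identities from the quoted counts and enumerates the chain pairs over which the Cα comparison runs. The chain labels and function names are hypothetical bookkeeping, not identifiers from the deposited structures.

```python
from itertools import combinations


def percent_identity(identical, aligned_length):
    """Percent identity for `identical` matches over `aligned_length` positions."""
    if aligned_length <= 0:
        raise ValueError("aligned length must be positive")
    return 100.0 * identical / aligned_length


def chain_pairs(chains):
    """All unordered pairs of chains to be compared."""
    return list(combinations(chains, 2))


if __name__ == "__main__":
    print(f"GGP vs LbMP: {percent_identity(234, 828):.1f}%")
    print(f"GGP vs CsKP: {percent_identity(257, 805):.1f}%")

    # Two chains from the glucose complex plus eight from the
    # isofagomine-glycerol complex (labels are illustrative).
    chains = ["glucose:A", "glucose:B"] + [f"isofagomine:{c}" for c in "ABCDEFGH"]
    print("pairwise comparisons:", len(chain_pairs(chains)))
```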
Three unique extended loops are present in the catalytic domain of GGP (magenta in Fig. 1A). Loop 1 (388-410, red), loop 2 (567-578, blue), and loop 3 (659-667, green) are shown in Fig. 2A. Loop 1 corresponds to "loop 3" in reports for Lactobacillus acidophilus maltose phosphorylase and CsKP (22,29), and it is a key determinant of the substrate specificity of these enzymes. These loops completely cover the active site of GGP. Thus, one or more loops must open to accept the substrates. Specifically, loop 1 contains three phenylalanine residues (Phe-395, Phe-396, and Phe-409), which form hydrophobic interactions with glycerol at the active site (discussed below). The three loops are not conserved in the GH65 family members (Fig. 2D). In contrast to GGP, the active site pocket of LbMP is rather open because it contains short loops at these regions (Fig. 2B). The pocket of CsKP is partially closed because a long loop 1 with two antiparallel β-strands covers the active site.
Ligand Binding Pathway of GGP-To investigate a possible mechanism for substrate entry into the completely closed active site of GGP, we examined the average B-factors of the three loops (Table 1). The average B-factors of loop 1, which contains the three key phenylalanine residues for glycerol binding, are lower than those of the overall structure. The B-factors for loop 2 are not significantly different from the overall average value. In contrast, loop 3 showed significantly higher average B-factor values that are 1.2-1.4 times larger than the overall average value. None of the three loops is affected by crystal packing (data not shown). Loops 1 and 2 are located near the dimer interface but loop 3 is not (Fig. 1A). These results indicate that loop 3 could flexibly move and facilitate ligand binding to GGP.
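The loop comparison amounts to dividing each loop's mean B-factor by the overall mean. A minimal sketch is given here, assuming per-residue B-factors are already available as a mapping from residue number to value; the input format, function names, and toy numbers are hypothetical, and averaging per residue rather than per atom is a simplification that may not match how the Table 1 values were computed. The loop ranges are those given in the text.

```python
# Residue ranges of the three extended loops (from the text).
LOOPS = {"loop 1": (388, 410), "loop 2": (567, 578), "loop 3": (659, 667)}


def mean(values):
    """Arithmetic mean of a non-empty collection."""
    values = list(values)
    if not values:
        raise ValueError("no values to average")
    return sum(values) / len(values)


def loop_bfactor_ratios(bfactors, loops=LOOPS):
    """Ratio of each loop's mean B-factor to the overall mean.

    `bfactors` maps residue number -> B-factor (square angstroms) for one chain.
    A ratio above 1 marks a loop that is more mobile than the chain average.
    """
    overall = mean(bfactors.values())
    ratios = {}
    for name, (start, end) in loops.items():
        in_loop = [b for res, b in bfactors.items() if start <= res <= end]
        ratios[name] = mean(in_loop) / overall
    return ratios


if __name__ == "__main__":
    # Toy chain: uniform B-factor of 30 with an elevated loop 3 (values made up).
    toy = {res: 30.0 for res in range(1, 762)}
    for res in range(659, 668):
        toy[res] = 40.0
    for name, ratio in loop_bfactor_ratios(toy).items():
        print(f"{name}: {ratio:.2f} x overall average")
```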
A phylogenetic tree for the GH65 enzymes indicates that GGP, nigerose phosphorylase, kojibiose phosphorylases, trehalose-6-phosphate phosphorylase, and 1,3-α-glucosylrhamnose phosphorylase diverge from the maltose phosphorylases and trehalose phosphorylases (11). Among the former five enzymes, GGP and nigerose phosphorylase contain three long loops, the kojibiose phosphorylases and trehalose-6-phosphate phosphorylase contain long loops 1 and 2, and 1,3-α-glucosylrhamnose phosphorylase contains the shortest loops in all three regions (Fig. 2D). The emergence of the three long loops could be a key event in the molecular evolution of GGP that yielded the specificity for GG phosphorolysis.
GGP-Glucose Structure-Both monomers in the asymmetric unit of the GGP-glucose structure contain a β-glucose molecule in the active site (subsite −1) with a clear electron density (Fig. 3A, chain B). The glucose forms direct hydrogen bonds with Trp-333, Asp-334, Lys-587, and Gln-588. These four residues stabilize all five hydroxyls of glucose and are perfectly conserved among all characterized bacterial GH65 family enzymes (Fig. 2D). Tyr-327 forms a water-mediated hydrogen bond with glucose. Tyr-327, Trp-381, Glu-475 (catalytic residue), Tyr-572, and Lys-587 (discussed below) form a hydrogen-bonded water network on the β-face (α-anomer side) of glucose.
GGP-Isofagomine-Glycerol Structure-The complex structure with isofagomine and glycerol was determined using a mutant at the catalytic residue (E475Q) (11). Six of the eight chains in the asymmetric unit (A and D-H) contain an isofagomine molecule at the active site (subsite −1). Forty-four glycerol molecules were found in total, and four glycerol molecules are located at the active site (chains D and F-H). Fig. 3B shows a glycerol molecule bound at the acceptor site (subsite +1) of chain D. This glycerol molecule adopts an extended conformation, and the three hydroxyls of isofagomine form hydrogen bonds with Trp-333, Asp-334, Gln-475 (mutated from Glu), and Gln-588 (Fig. 4B). The O2-hydroxyl of glycerol is located at a distance of 3.0 Å from the N1 atom of isofagomine. This distance is close enough for the site to represent the acceptor site at which glycerol reacts with βGlc1P to produce GG via the reverse phosphorolysis reaction. The three hydroxyls of glycerol are oriented in the same direction and fixed by direct hydrogen bonds with Tyr-327, Trp-381, and Gln-475. Two water molecules also form hydrogen bonds with glycerol, forming bridges with Tyr-572 and Lys-587. Furthermore, the hydrophobic side of glycerol is stabilized by interactions with Phe-395, Phe-396, and Phe-409. Although Tyr-327 and Lys-587 are conserved among the bacterial GH65 family enzymes, the other five residues that comprise the acceptor site, Trp-381, Phe-395, Phe-396, Phe-409, and Tyr-572, are not conserved (Fig. 2D). Specifically, Phe-395, Phe-396, and Phe-409 are located in loop 1, which is often lost among GH65 enzymes. CsKP, Clostridium phytofermentans nigerose phosphorylase, and Lactococcus lactis trehalose-6-phosphate phosphorylase have a long loop in this region, but contain only one or two phenylalanine residues.
The side chain of the Gln-475 residue adopts a conformation different from that of Glu-475 in the GGP-glucose structure. This difference may be an artifact due to the mutation. However, flexibility of the catalytic residue may be required to fulfill its roles as the proton donor (phosphorolysis) and the proton acceptor (reverse phosphorolysis and βGlc1P hydrolysis).
In summary, the acceptor (glycerol-binding) site of GGP contains a hydrophobic side with phenylalanine residues as well as a hydrophilic side with several hydrogen bond-forming residues, and this site is well designed to recognize a glycerol molecule. Importantly, glycerol is a relatively flexible molecule compared with pyranose sugars (e.g. glucose), which are acceptor molecules of other GH65 enzymes. Therefore, GGP features these extensive interactions to fix the acceptor. Most of the residues involved in these interactions (Trp-381, Phe-395, Phe-396, Phe-409, and Tyr-572) are located within or near loops 1 and 2 (Fig. 2D, orange stars).
Docking Analysis-We performed an automated docking analysis to obtain probable binding models of βGlc1P. The first-ranked cluster was reasonably large (55 of 256 docking runs) and shows a significantly lower binding energy (−7.52 kcal/mol) compared with other clusters (> −6.88 kcal/mol). These features yielded the best result for estimating the binding mode of βGlc1P (Fig. 3C). The glucose moiety of βGlc1P forms hydrogen bonds with Arg-320, Trp-333, Asp-334, Lys-587, and Gln-588. The phosphate moiety forms hydrogen bonds with the side chains of Arg-320, Tyr-327, Lys-587, and Ser-622, and the main chain NH group of Ser-623. In summary, the docked βGlc1P is coordinated by many hydrogen bonds, and the interactions are almost the same as those of the GGP-glucose crystal structure. Fig. 3D shows the superimposition of the GGP-glucose structure (green), docked βGlc1P (cyan thin lines), LbMP in complex with Pi (magenta), and CsKP in complex with glucose and Pi (yellow). The residues involved in the recognition of βGlc1P are highly conserved, but Arg-320 and Tyr-572 are not. The residue that corresponds to Arg-320 is generally Lys in GH65 (Lys-345 and Lys-330 in LbMP and CsKP), but a few members contain Arg at this site (Fig. 2D). The region near Tyr-572 shows very low conservation among the GH65 enzymes. In the LbMP structure, Trp-582 is located at a corresponding position, whereas a corresponding residue is lacking in CsKP (Fig. 3D). The glucose moiety of docked βGlc1P shows a slight displacement from the glucose molecules in the GGP and CsKP structures. The phosphate moiety of the docked βGlc1P is also displaced from the Pi molecules in the LbMP and CsKP structures, probably because the phosphate moiety in βGlc1P is covalently bonded to the glucose moiety, which is more strongly fixed by a larger number of hydrogen bonds. Alternatively, this phenomenon can be ascribed to the difference in the Arg-320 residue.
Mutational Analysis-The two crystal structures and the docking analysis suggested that Tyr-327, Trp-381, Tyr-572, and Lys-587 are the key residues contributing to the unique characteristics of GGP because they are involved in glycerol recognition. Therefore, we constructed mutants at these residues (Y327F, W381F, Y572F, and K587A). The kinetic parameters of the reverse phosphorolysis (GG synthesis) and βGlc1P hydrolysis were measured using these mutants (Table 2). The reverse phosphorolysis activities of the Y327F, W381F, and K587A mutants were not detectable or significantly reduced. This finding agrees with the results of the structural and docking analyses, as these residues directly recognize glycerol (Tyr-327 and Trp-381) or βGlc1P (Lys-587). Therefore, the substrate specificity for glycerol is mainly dictated by the two residues anchoring the O1 and O3 hydroxyls of glycerol, Tyr-327 and Trp-381 (Fig. 3B). The reverse phosphorolysis activity of Y572F, in which the mutated residue indirectly recognizes the O2 atom of glycerol, was also significantly impaired, with a 7.3-fold increase in Km and an 8.2-fold decrease in kcat. W381F retained its βGlc1P hydrolysis activity, whereas the other mutants exhibited no detectable or significantly decreased activities. The kcat of Y572F was 15 times lower than that of the wild-type.
Reaction Mechanisms-Our previous work (11) indicated that the βGlc1P hydrolysis of GGP can be explained as a glycosyl transfer reaction to a water molecule. In other words, GGP can utilize either a water molecule or a glycerol molecule as an acceptor for a glycosyl transfer reaction from βGlc1P. Therefore, the position of the key water molecule for hydrolysis presumably corresponds to the O2 atom of glycerol (Fig. 3B). This area is the α-anomer side of glucose, which features a hydrogen-bonded water network in the GGP-glucose structure (Fig. 3A).
βGlc1P hydrolysis must be primarily facilitated via proton extraction by Glu-475, which is located close to the O2 atom of glycerol. Consistent with the proposed reaction mechanism, the E475A and E475Q mutants did not show detectable activities for either hydrolysis or phosphorolysis (11). In this study, we focused on the additional water-mediated hydrogen bond, which holds the O2 atom of glycerol (Fig. 3B). This water molecule is further fixed by Tyr-572 and Lys-587. Interestingly, Tyr-572 is not conserved among the GH65 enzymes. The mutation of residues involved in water-mediated hydrogen bonds (Y572F and K587A) significantly reduced both the βGlc1P hydrolytic and reverse phosphorolytic activities. Additionally, a mutation at Tyr-327, which is involved in formation of the hydrogen-bonded water network at the α-anomer side of glucose at subsite −1 as well as the phosphate recognition of βGlc1P (Fig. 3C), significantly reduced βGlc1P hydrolytic activity. This result indicates that these structural features are important for the bifunctionality of GGP. GGP notably differs from the other GH65 enzymes in the flexible nature of its acceptor molecule, glycerol. Because glycerol does not contain a ring structure, a water-mediated hydrogen bond is required to fix the position of the O2 hydroxyl of glycerol. The presence of the two hydrogen bonds at the acceptor oxygen site may fix a water molecule at this site and thereby enable hydrolysis by correctly orienting a lone pair of the water molecule toward the anomeric C1 atom of βGlc1P. In summary, we propose possible reaction mechanisms for the phosphorolysis of GG, its reverse reaction, and the hydrolysis of βGlc1P (Fig. 4). In the phosphorolysis of GG, an oxygen atom of Pi attacks the anomeric C1 atom, and Glu-475 works as a general acid. This reaction mechanism is typical for inverting GH-type phosphorylases (30).
For the reverse phosphorolysis reaction, βGlc1P is assumed to bind at the position shown in the docking analysis. The O1 and O3 hydroxyls of glycerol are coordinated with Trp-381 and Tyr-327. However, a water molecule must form a hydrogen bond coordinated by Tyr-572 and Lys-587 to fix the O2 atom of glycerol to the position where the reverse phosphorolysis reaction would occur. The two hydrogen bonds fix and orient the O2 atom of glycerol or a water molecule to the anomeric C1 atom of βGlc1P to facilitate reverse phosphorolysis and hydrolysis reactions.
Recently, a faint hydrolytic activity on α-mannose 1-phosphate was identified for a GH130 β-1,4-mannooligosaccharide phosphorylase (31). GH130 enzymes are inverting phosphorylases that act on β-mannosides. Interestingly, a unique proton-relay reaction mechanism that utilizes a water molecule was proposed for a GH130 4-O-β-D-mannosyl-D-glucose phosphorylase because the distance between the catalytic residue and the glycosidic bond oxygen of the substrate is too large for direct protonation (32). Thus, the presence of water-mediated hydrogen bonds at the catalytic center of GH130 enzymes may also be a key for the faint hydrolytic activity on the glycosyl phosphate substrate.
Table 2 footnotes: a, measured with 0-100 mM glycerol and 10 mM βGlc1P; b, measured with 0.1-10 mM βGlc1P; c, ND, not detected (less than 1/1000 activity of wild-type); d, not reliable because the rates of the Pi and D-glucose releases were too close at all of the concentrations of glycerol.
In this study, a structural basis for substrate recognition and reaction mechanisms of GGP was clarified. Although GGP can efficiently and specifically produce GG via its reverse phosphorolysis activity, the considerable activity of the side reaction, βGlc1P hydrolysis, can be an obstacle for industrial applications. The structure determinations of sugar phosphorylases have provided solid foundations for endowing these enzymes with altered substrate specificities and increased thermostability (33-36). Our structural study will also facilitate future protein engineering of GGP to modulate and improve its function.