Structure-based Functional Annotation

Despite the generation of a large amount of sequence information over the last decade, more than 40% of well characterized enzymatic functions still lack associated protein sequences. Assigning protein sequences to documented biochemical functions is an interesting challenge. We illustrate here that structural genomics may be a reasonable approach to addressing these questions. We present the crystal structure of the Saccharomyces cerevisiae YMR099cp, a protein of unknown function. YMR099cp adopts the same fold as galactose mutarotase and shares the same catalytic machinery necessary for the interconversion of the α and β anomers of galactose. The structure revealed the presence in the active site of a sulfate ion held by an arginine clamp formed by the side chains of two strictly conserved arginine residues. This sulfate is ideally positioned to mimic the phosphate group of hexose 6-phosphate. We have subsequently demonstrated that YMR099cp is a hexose-6-phosphate mutarotase with broad substrate specificity. We solved high resolution structures of substrate-enzyme complexes, further confirming our functional hypothesis. The metabolic role of a hexose-6-phosphate mutarotase is discussed. This work illustrates that structural information has been crucial to assign YMR099cp to the orphan EC activity: hexose-phosphate mutarotase.

Despite the overwhelming impact of systematic genome sequencing on the understanding of the biology and evolution of organisms, enormous gaps remain in our current knowledge of the function of proteins. On the one hand, it is well known that many of the newly generated protein sequences cannot be assigned a precise function. On the other hand, a significant portion of experimentally determined biochemical functions are still not linked to a protein sequence. The latter point, which is not widely recognized, presents an important challenge. For instance, systematic analysis of the well structured and much used Enzyme Commission (EC) database revealed that more than 39% of the experimentally identified enzyme activities (corresponding to 1529 EC numbers) are not associated with protein sequences in major public databases (1,2). There is clearly an urgent need to fill this gap in our biochemical knowledge, and an organized combined bioinformatics and experimental effort will be needed to bridge it (3).
Structural genomics endeavors, through systematic approaches at all experimental steps of the structure determination process, aim to fill the gap between protein sequence and structure space. Apart from being the main source of novel fold discovery, structural genomics may also generate valuable information on proteins whose function was not identified before (4,5). It is certain that a considerable fraction of these targets is endowed with enzymatic activities, and the identification of the biochemical function of these proteins offers a high value structural genomics spin-off.
In our yeast structural genomics project we have focused on non-membrane proteins of unknown fold (6). Our initial target list consisted of a mix of proteins with known and unknown functions. In this article, we demonstrate how biochemical investigations based on initial structure determination may clearly define biochemical function. YMR099cp is a 34-kDa protein that is predicted from Psi-Blast sequence analysis to adopt an aldose-1-epimerase fold but that does not share significant homology with proteins whose biochemical function has been characterized. Hence, no biochemical function could be deciphered from these observations. We report here on the crystal structure of YMR099cp and show that it adopts the galactose mutarotase fold (11). Analysis of the putative active site suggested that its catalytic machinery may perform a mutarotation reaction but that YMR099cp probably has a different substrate specificity from the well studied galactose mutarotase with which it shares the same fold. Our present biochemical and biophysical investigations of the catalytic activity indicate that YMR099cp corresponds to the glucose-6-phosphate 1-epimerase (or mutarotase) previously identified in S. cerevisiae by biochemical experiments, but whose gene has not been identified since (12-15). Our results highlight how structural genomics may suggest biochemical experiments that allow for determination of the precise function of a target protein.

MATERIALS AND METHODS
Cloning, Expression, and Purification-The YMR099c open reading frame was amplified by PCR using the genomic DNA of the S. cerevisiae strain S288C as a template. An additional sequence coding for a His6 tag was introduced at the 3′-end of the gene during amplification. The PCR product was then cloned into a derivative of pET9 vector (Stratagene). Expression was done at 37°C using the transformed Escherichia coli Gold (DE3) strain and 2xYT medium (BIO101 Inc.) supplemented with kanamycin at 50 μg/ml. Cells were harvested by centrifugation, resuspended in 30 ml of 20 mM Tris-HCl, pH 7.5, 200 mM NaCl, 5 mM β-mercaptoethanol, and stored overnight at −20°C. Cell lysis was completed by sonication. The His-tagged protein was purified on a Ni-nitrilotriacetic acid column (Qiagen Inc.) followed by gel filtration on a Superdex™ 200 column (Amersham Biosciences).
* This work was supported by grants from the Ministère de la Recherche et de la Technologie (Programme Génopoles). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Glucose-6-phosphate Epimerase Activity Measurements-Because of the specificity of glucose-6-phosphate dehydrogenase for β-Glc6P, both spontaneous and YMR099cp-catalyzed formation of β-Glc6P could be followed by measuring the increase in absorbance at 340 nm caused by generation of NAD(P)H, coupled to the conversion of β-Glc6P into D-glucono-1,5-lactone 6-phosphate by glucose-6-phosphate dehydrogenase. All kinetic experiments were performed at 25°C in 50 mM imidazole, pH 7.6, 50 mM KCl, 8 mM MgSO4 using an Applied Photophysics SW18-MV stopped-flow spectrophotometer. Reactions were initiated by mixing equal volumes of 60 μM equilibrated Glc6P and 2 mM NADP+ with 160 units/ml glucose-6-phosphate dehydrogenase from Leuconostoc mesenteroides (Sigma), 2 mM NADP+, and various concentrations of YMR099cp. Stopped-flow data were fitted to double exponentials by nonlinear least-squares curve fitting using software provided by Applied Photophysics. Experiments were performed in triplicate.
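As a rough orientation for the signal size expected in this coupled assay, the maximum absorbance change can be estimated with the Beer-Lambert law. The sketch below is only illustrative: it assumes the commonly used molar absorption coefficient of NADPH at 340 nm (6220 M^-1 cm^-1), a 1-cm optical path, and one NADPH formed per Glc6P oxidized. None of these values is stated in the text, and the function names are hypothetical.

```python
# Assumed, not from the text: NADPH molar absorption coefficient at 340 nm.
EPSILON_NADPH_340 = 6220.0  # M^-1 cm^-1


def final_concentration_after_mixing(stock_molar, dilution_factor=2.0):
    """Concentration after mixing equal volumes (dilution factor of 2)."""
    return stock_molar / dilution_factor


def absorbance_change(nadph_molar, path_cm=1.0, epsilon=EPSILON_NADPH_340):
    """Beer-Lambert law: A = epsilon * c * l."""
    return epsilon * nadph_molar * path_cm


# 60 uM equilibrated Glc6P in one syringe, mixed 1:1 with the enzyme syringe.
glc6p_final = final_concentration_after_mixing(60e-6)
# Assumes complete oxidation with one NADPH formed per Glc6P (1-cm path assumed).
delta_a340 = absorbance_change(glc6p_final)
print(f"Glc6P after mixing: {glc6p_final * 1e6:.0f} uM")
print(f"Maximum expected A340 change: {delta_a340:.3f}")
```

A shorter optical path, as found in some stopped-flow cells, would scale the expected amplitude down proportionally.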
Fluorescence Measurements for Binding of Ligands-All phosphorylated sugars used for these measurements were purchased from Sigma. Excitation was performed at 295 nm and emission scanned from 300-500 nm with a Cary Eclipse fluorospectrophotometer (Varian). Measurements were performed at 20°C with 1 μM YMR099cp in 20 mM Tris-HCl, pH 7.5. Successive aliquots of ligands (from 0 to 20 mM) were added to the protein. Binding of ligands to YMR099cp was quantified as a difference of tryptophan fluorescence at 350 nm as a function of ligand concentration. Experiments were performed in triplicate.
NMR Spectroscopy-All NMR spectra were recorded in 99% D2O with a Bruker DRX400 instrument (400 MHz), with ligand and protein concentrations of 20 mM and 25 μM, respectively. Standard Bruker software was used to acquire and process the NMR data. 1D 1H NMR spectra were acquired with 16,000 data points and 16 scans. Two-dimensional NOESY spectra were recorded with 256 experiments of 2048 data points and 8 scans per t1 experiment and a mixing time of 200 ms. Sine-squared-bell apodization was applied in both dimensions, and spectra were processed in States-TPPI mode, followed by symmetrization.
Crystallization and Resolution of the Structure-Crystallization trials were performed at 18°C. Crystals of the apo form of the protein were grown from a 1:1 mixture of 20 mg/ml protein solution in 20 mM Tris-HCl, pH 7.5, 200 mM NaCl, and 10 mM β-mercaptoethanol with a crystallization solution of 25-30% polyethylene glycol 3000, 0.2 M lithium sulfate, 0.1 M Hepes pH 7.5. Crystals of the D-galactose 6-phosphate (Gal6P, 100 mM) and D-glucose 6-phosphate (Glc6P, 200 mM) bound forms were obtained under the following conditions: 32% polyethylene glycol 4000, 0.2 M lithium chloride, 0.1 M Hepes pH 7.5. These crystals could only be obtained by microseeding from crystals of the apo form. Crystals were cryoprotected by transferring either to paratone (Glc6P and Gal6P forms) or to a crystallization solution with progressively higher glycerol concentrations up to 30% v/v (unbound form) and then flash-cooled in liquid nitrogen.
Diffraction data were collected from a flash-cooled crystal at 100 K on beamline ID23-1 at the European Synchrotron Radiation Facility (ESRF, Grenoble, France). The apoprotein crystallized in space group P2₁2₁2 with cell dimensions a = 44.9 Å, b = 74.2 Å, c = 106.5 Å, corresponding to one molecule per asymmetric unit and a solvent content of 55%. Image processing, data reduction, and scaling were carried out using the XDS package (16). The structure of the Haemophilus influenzae Hi1317 protein (PDB code 1JOV; 26% sequence identity) was used as search model for molecular replacement trials using the MolRep program in the 20-4-Å resolution range (17). This initial model was automatically rebuilt using the Arp/wARP program using data to 1.7-Å resolution (18). The resulting model was improved by careful analysis of the 2Fo − Fc and Fo − Fc electron density maps using the molecular graphics program TURBO-FRODO. The refinement of this model was further carried out with the program REFMAC in the 20-1.7-Å resolution range (19). The final model for the apostructure includes all the residues from Pro2 to Lys288, 330 water molecules, 1 sodium ion, 2 sulfate ions, one Hepes molecule, and one glycerol molecule from the cryoprotection solution. The sugar-bound structures (Glc6P and Gal6P), crystallizing in a different space group (C222₁), were solved by molecular replacement and refined to 1.6 Å. Models were constructed for the Glc6P and Gal6P YMR099cp complexes for residues Pro2-Glu289.
Both complexes contain one hexose 6-phosphate sugar and in the case of Gal6P a barium ion. The Glc6P and Gal6P complex models were completed with 253 and 378 water molecules, respectively. The statistics on data collection and refinement are provided in Table 1.
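The relationship between cell dimensions, molecules per asymmetric unit, and solvent content quoted for the apo crystal can be checked with the Matthews coefficient. The following is a minimal sketch, assuming the usual approximation that the solvent fraction equals 1 − 1.23/Vm, four asymmetric units per cell for space group P2₁2₁2, and a molecular mass of 34 kDa that ignores the His tag. The exact value depends on the mass and protein density assumed, so the result should only be expected to fall in the same range as the reported 55%.

```python
def matthews_coefficient(a, b, c, molecules_per_cell, mass_da):
    """Vm in A^3/Da for an orthorhombic cell (all angles 90 degrees)."""
    cell_volume = a * b * c
    return cell_volume / (molecules_per_cell * mass_da)


def solvent_fraction(vm, protein_term=1.23):
    """Matthews approximation: solvent fraction = 1 - 1.23 / Vm."""
    return 1.0 - protein_term / vm


# Cell dimensions from the text; P2(1)2(1)2 has 4 asymmetric units per cell,
# each holding one molecule. The 34-kDa mass ignores the His tag (assumption).
vm = matthews_coefficient(44.9, 74.2, 106.5, molecules_per_cell=4, mass_da=34000.0)
print(f"Vm = {vm:.2f} A^3/Da, solvent fraction = {solvent_fraction(vm):.2f}")
```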
The atomic coordinates and structure factors for the YMR099c native protein and the Glc6P and Gal6P complexes have been deposited into the Brookhaven Protein Data Bank under the accession numbers 2CIQ, 2CIR, and 2CIS, respectively.

RESULTS
Structure of the YMR099cp-The crystal structure of the S. cerevisiae gene product YMR099cp (297 amino acids) has been solved by molecular replacement using the structure of a protein of unknown function (H. influenzae Hi1317; PDB code: 1JOV) as the search model and refined to 1.7-Å resolution (Table 1). All 2Fo − Fc electron density maps are of excellent quality, except for the N-terminal methionine and 9 C-terminal amino acids, which have not been built because they were not defined in electron density. One copy of YMR099cp is present in the asymmetric unit, consistent with the gel filtration elution profile, which clearly indicates a monomeric state of the protein in solution (data not shown). YMR099cp is made of a single globular domain of approximate dimensions 37 × 52 × 53 Å³. Its structure can be described as a β-sandwich made of 22 β-strands and two short α-helices (Fig. 1A). Strands are organized in four antiparallel β-sheets arranged in two parallel layers. The first one is made of three sheets: S1 composed of β-strands 1-4, S2 (strand order: β5, β10, and β21), and S4 (strand order: β11, β20, β19, β16, β13, and β14). The second layer is formed by the sole 9-stranded β-sheet S3 (strand order: β6-β9, β22, β18, β17, β12, and β15). The two helices α1 and α2 face the same side of the monomer and are part of the linkers connecting strands β7 to β8 and β19 to β20. This fold has already been described for galactose mutarotases (RMSD of 2.2-2.6 Å over 240 Cα positions; 15-17% sequence identity (11,20,21)), domain 5 of β-galactosidase (RMSD of 3.1 Å over 190 Cα positions, 7% sequence identity (22)) and the N-terminal domain of maltose phosphorylase (RMSD of 3.4 Å over 164 Cα positions, 13% sequence identity (23)). Structural similarity is also shared with proteins of unknown function: Caenorhabditis elegans C01B4.6 gene product (RMSD of 2.41 Å over 235 Cα positions, 13% sequence identity; PDB code: 1LUR) and S. cerevisiae YNR071cp (RMSD of 2.67 Å over 243 Cα positions, 12% sequence identity; PDB code: 1YGA).
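The RMSD values quoted above summarize how far paired Cα atoms lie from each other after superposition. The sketch below shows only the final averaging step, assuming the two coordinate sets have already been superimposed and paired by a structural alignment program; the superposition itself is not implemented, and the coordinates are hypothetical toy values rather than data from any deposited structure.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between paired, pre-superimposed atoms."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))


# Hypothetical toy coordinates (in angstroms), for illustration only.
model_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model_b = [(0.0, 1.0, 0.0), (3.8, 1.0, 0.0), (7.6, 1.0, 0.0)]
print(f"RMSD = {rmsd(model_a, model_b):.2f} A")
```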
Comparison with Galactose Mutarotases-The crystal structures of YMR099cp and of the H. influenzae Hi1317 protein (which provided the model used for molecular replacement) confirm the Psi-Blast prediction that they belong to the aldose-1-epimerase family. The best structurally and biochemically characterized member of this family is galactose mutarotase, an enzyme that catalyzes the interconversion of α- and β-D-galactose (24). This is the first step of the Leloir pathway, which in most organisms converts β-D-galactose to the more metabolically useful glucose 1-phosphate.
Surface projection of sequence conservation among YMR099cp-related proteins has highlighted a highly conserved pocket whose floor is made by sheet S4 and walls by helix α2 as well as by the long loop connecting β4 to β6 (Fig. 1B). This pocket, which contains a bound glycerol molecule in the so-called apo structure, corresponds to the active site in galactose mutarotase. Sequence alignment of YMR099cp orthologs and galactose mutarotases reveals five strictly conserved pocket residues (Figs. 1C and 2): two histidine residues (His82 and His159, YMR099cp numbering), two acidic amino acids (Asp203 and Glu264), and Tyr161. Crystal structures of Lactococcus lactis galactose mutarotase bound to sugars have shown that these residues are involved in sugar binding (11,25). Site-directed mutagenesis led to the conclusion that in L. lactis galactose mutarotase, His170 and Glu304 play the role of catalytic acid and base, respectively (26). Hence, YMR099cp seems to possess the catalytic machinery necessary to interconvert α and β sugar anomers, suggesting that YMR099cp is a yeast galactose mutarotase. However, Gal10p was annotated recently with this activity in S. cerevisiae (27). Gal10p is a bifunctional enzyme with the N-terminal domain harboring UDP-galactose 4-epimerase and the C-terminal domain galactose mutarotase activities (21). This raises the question of whether multiple genes code for galactose mutarotase activity in yeast or whether YMR099cp is an active mutarotase with a different substrate specificity. An indication supporting the latter hypothesis comes from the refinement of the native structure, which has revealed a sulfate ion bound in the active site pocket (Fig. 1C). This ion, present in the crystallization buffer, is complexed by two arginine side chains (Arg57 and Arg86) that are strictly conserved among YMR099cp orthologs but not among bona fide galactose mutarotases (Fig. 2, family 1, black arrows).
Interestingly, a sulfate ion occupies exactly the same position in the active site of H. influenzae Hi1317 (unknown function, PDB code: 1JOV), and is liganded by homologous arginines. Finally, structure superimposition of the glucose-bound galactose mutarotase E304Q mutant of L. lactis onto YMR099cp shows that the glucose O6 atom and an oxygen atom from sulfate almost overlap. The fact that sulfate ions often substitute for phosphate groups in protein structures (28,29) suggests that YMR099cp binds a hexose 6-phosphate and that it hence may have hexose-6-phosphate mutarotase (or 1-epimerase) activity. A glucose-6-phosphate 1-epimerase, which catalyzes the mutarotation of D-glucopyranose 6-phosphate (12-15), was previously purified from yeast, but the sequence of the protein was not identified. Interestingly, the previously published data for the enzyme purified from yeast cells (estimated molecular mass of 35 kDa and isoelectric point of 5.8; Ref. 12) fit very well with the values calculated from the YMR099cp sequence (34 kDa and pI of 6.07).
OCTOBER 6, 2006 • VOLUME 281 • NUMBER 40 JOURNAL OF BIOLOGICAL CHEMISTRY 30179
Functional Studies-These structural observations led us to test YMR099cp for hexose 6-phosphate epimerase activity. For this purpose, we used an enzyme-coupled assay. It consists of following the reduction of NAD(P)+ (λ = 340 nm) resulting from glucose-6-phosphate dehydrogenase activity, which specifically converts β-Glc6P (but not the α form) into D-glucono-1,5-lactone 6-phosphate (Fig. 3A). The formation of β-Glc6P from an equilibrated anomeric mixture has been investigated by stopped-flow at various YMR099cp concentrations (Fig. 3B), which revealed two distinct phases. The first one corresponds to the initial burst due to the β-Glc6P already present in the equilibrated mixture and hence is independent of the YMR099cp concentration. In the second phase, faster generation of NAD(P)H, reflecting a higher velocity constant (k+1) for the interconversion between glucose 6-phosphate anomers, is obtained with increasing concentration of YMR099cp, demonstrating the glucose-6-phosphate epimerase activity of the protein (Fig. 3B, inset). In our experiments, YMR099cp accelerates the reaction two to three times more than previously observed under the same conditions by Wurster and Hess (Fig. 3B, inset; Ref. 12). This probably reflects the general improvement in the quality (i.e. purity and stability) of a protein sample extracted and purified within a few hours from an overexpressing E. coli strain, compared with that of a protein sample obtained from a wild-type yeast lysate using a long and tedious purification protocol (12).
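The two phases described above are what a double-exponential fit of the stopped-flow traces separates. The exact parameterization used by the instrument software is not given in the text; the sketch below shows one common form, a sum of two exponential rises, together with the least-squares objective that a nonlinear fitting routine would minimize. All parameter values and names are hypothetical and chosen only for illustration.

```python
import math


def biphasic_absorbance(t, burst_amp, k_burst, slow_amp, k_slow, a0=0.0):
    """Sum of two exponential rises: a fast burst and a slower phase."""
    fast = burst_amp * (1.0 - math.exp(-k_burst * t))
    slow = slow_amp * (1.0 - math.exp(-k_slow * t))
    return a0 + fast + slow


def sum_of_squares(times, observed, params):
    """Least-squares objective that a nonlinear fitting routine would minimize."""
    return sum((obs - biphasic_absorbance(t, *params)) ** 2
               for t, obs in zip(times, observed))


# Hypothetical parameters (amplitudes in absorbance units, rates in s^-1).
params = (0.11, 50.0, 0.07, 1.0)
times = [0.0, 0.05, 0.5, 5.0]
trace = [biphasic_absorbance(t, *params) for t in times]
print([round(a, 3) for a in trace])
```

In such a model only the rate constant of the slow phase would be expected to vary with enzyme concentration, while the burst amplitude reflects the β anomer already present at mixing.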
Unfortunately, this enzymatic assay could not be used to test other hexose 6-phosphate substrates because glucose-6-phosphate dehydrogenase exhibits a high degree of specificity for β-Glc6P. A more common way to measure epimerase activity is polarimetry, but this method requires the pure α or β anomeric forms of the substrates, which are usually not commercially available. We therefore used NMR to identify activity on other hexose 6-phosphate substrates (30). To identify putative substrates to be tested by NMR, we first tested the binding of a few potential ligands using tryptophan fluorescence. Because the comparison of the YMR099cp and galactose mutarotase active sites revealed that the aromatic side chain (phenylalanine or tyrosine) that forms a hydrophobic sugar binding platform in galactose mutarotases is replaced by a strictly conserved tryptophan residue in all YMR099cp homologs, we reasoned that this side chain could be used as a fluorescent probe. This allowed us to test the binding affinity of several sugar candidates as ligands for YMR099cp. In addition to glucose 6-phosphate, shown to be a substrate of the protein, we tested the binding of glucose, three additional hexose 6-phosphate sugars (Gal6P, Man6P, and Fru6P), and ribose 5-phosphate (Rib5P). These measurements revealed that YMR099cp displays a slightly higher affinity for Fru6P and Man6P than for Glc6P (Kd of 114 μM, 160 μM, and 200 μM, respectively) and has lower affinity for Rib5P and Gal6P (~1 mM, Table 2). Interestingly, the Kd value of 200 μM measured for Glc6P by fluorescence is comparable to the Km value (144 μM) determined for yeast glucose-6-phosphate epimerase (15). No binding could be detected for glucose, demonstrating the importance of the phosphate group in substrate specificity.
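To give a feel for what these dissociation constants mean in practice, the fractional occupancy of the binding site can be estimated from a single-site binding model. The following is a minimal sketch, assuming one binding site per protein and ligand in large excess over protein, so that free and total ligand concentrations are nearly equal. The 200 μM ligand concentration is a hypothetical value chosen for illustration, and the model is not necessarily the one used to analyze the titrations.

```python
def fraction_bound(ligand_um, kd_um):
    """Single-site occupancy, assuming free ligand is close to total ligand."""
    return ligand_um / (kd_um + ligand_um)


# Kd values (uM) quoted in the text; Gal6P is reported as ~1 mM (985 uM).
KD_UM = {"Fru6P": 114.0, "Man6P": 160.0, "Glc6P": 200.0, "Gal6P": 985.0}

ligand_um = 200.0  # hypothetical ligand concentration chosen for illustration
for sugar, kd in KD_UM.items():
    occupancy = fraction_bound(ligand_um, kd)
    print(f"{sugar}: {occupancy:.2f} occupied at {ligand_um:.0f} uM ligand")
```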
We have tested both Man6P and Gal6P as mutarotase substrates of YMR099cp using NMR techniques. Man6P and Gal6P differ from Glc6P by the orientation of the hydroxyl group at position C2 and C4, respectively (Fig. 4). First, sugar binding has been monitored using a 1D 1H reference spectrum and a one-dimensional saturation transfer difference (STD) technique (31). The 1D 1H reference spectra recorded for these different ligands (present at 20 mM) in the absence or presence of 25 μM YMR099cp show a specific line broadening for the peaks corresponding to both the α and β anomeric protons upon enzyme addition. This indicates that YMR099cp binds to both anomeric forms of Glc6P, Gal6P, and Man6P (Fig. 5). This was confirmed by the STD spectra recorded for these three sugars (data not shown). Second, enzyme-catalyzed anomeric interconversion was investigated using two-dimensional NOESY experiments (32). Exchange cross-peaks between α and β Glc6P anomers could only be observed in the presence of the YMR099cp enzyme (Fig. 5A). These cross-peaks are due to a fast YMR099cp-catalyzed interconversion between α and β Glc6P, thus confirming the glucose-6-phosphate mutarotase activity measured by the enzyme-coupled assay. In addition, this validates the use of NOESY experiments to screen for putative YMR099cp substrates. NOESY exchange cross-peaks between anomers were also observed for Gal6P and Man6P when mixed with YMR099cp (Fig. 5, B and C). This clearly shows that YMR099cp also has mutarotase activity on Gal6P and Man6P. Altogether, these functional data clearly demonstrate that YMR099cp catalyzes the interconversion between the α and β anomers of at least three hexose 6-phosphate sugars (Glc6P, Gal6P, and Man6P).
In addition, the velocity constant measured for the YMR099cp-catalyzed anomerization of α to β Glc6P (k+1 = 62.8 min−1 with 1.72 μM YMR099cp) and the affinity constant determined for Glc6P (Kd = 200 μM) are comparable to those of the glucose-6-phosphate epimerase initially described by Wurster and Hess (12-14) in identical experimental conditions (k+1 = 32 min−1 with 1.72 μM enzyme and Kd = 55 μM and 144 μM for β and α anomers, respectively).
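The comparison of velocity constants can be made explicit with a simple ratio. This sketch uses only numbers quoted in the article (the spontaneous Glc6P mutarotation rate is the one cited in the Discussion) and assumes that the constants are directly comparable first-order rate constants measured under equivalent conditions, which is a simplification.

```python
def fold_change(rate, reference_rate):
    """Ratio of two rate constants expressed in the same units."""
    return rate / reference_rate


k_ymr099cp = 62.8       # min^-1, YMR099cp at 1.72 uM (this work)
k_wurster_hess = 32.0   # min^-1, enzyme of Wurster and Hess at 1.72 uM
k_spontaneous = 0.09    # min^-1, spontaneous Glc6P mutarotation (Discussion)

print(f"vs. Wurster and Hess enzyme: {fold_change(k_ymr099cp, k_wurster_hess):.1f}-fold")
print(f"vs. spontaneous reaction: {fold_change(k_ymr099cp, k_spontaneous):.0f}-fold")
```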

YMR099cp Is a D-Hexose-6-phosphate Mutarotase
Complexes with Hexose 6-Phosphate Sugars-To get a better understanding of the substrate preference of YMR099cp, we have solved the structures of its complexes with Glc6P and Gal6P to 1.6-Å resolution (Fig. 6). Virtually no conformational changes are observed between the free and complexed YMR099cp (RMSD value of 0.14-0.22 Å between the various structures).
For Glc6P, the 2Fo − Fc electron density map demonstrated that only the β anomer of the pyran form is bound and that the sugar ring adopts a chair conformation. The sugar phosphate moiety makes one hydrogen bond with the Gln81 side chain Nε2 atom and is bound by the positively charged side chains of Arg57 and Arg86, as observed for the sulfate ion in the apostructure (Fig. 6A). These two residues form an arginine clamp conserved only in YMR099cp orthologs. This arginine clamp could be the hallmark distinguishing mutarotases acting on hexose 6-phosphates from those acting on hexoses (no binding to glucose could be detected by fluorescence measurements). The sugar ring is sandwiched between the Gln81 and Trp238 side chains. The O2 and O4 hydroxyl groups and the O5 ring oxygen of Glc6P are hydrogen-bonded to Asp203 Oδ2, Lys244 N, and His82 Nε2, respectively. In addition, water-mediated hydrogen bonds connect the phosphate group to Tyr161 O and Glc6P O1 to Asp203 Oδ1. Finally, minor contacts are observed between Glc6P and Phe67, Gln183, and Met248.
In the unbound structure of YMR099cp, the strictly conserved His159 and Glu264 perfectly superpose with the catalytic residues identified in L. lactis galactose mutarotase (His170 and Glu304, see above). Structural and enzymatic studies on L. lactis enzyme mutants bound to galactose have unambiguously shown that His170 acts as the active site acid by protonating the O5 sugar ring oxygen (distance between His170 Nε2 and O5 atoms is 3 Å) while Glu304 is ideally located (its Oε1 and Oε2 oxygens are 2.7 Å away from the galactose O1 atom) to act as general base through deprotonation of the anomeric O1 hydroxyl group (26). This led us to assume that the anomerization of hexose 6-phosphate by YMR099cp proceeds via the same catalytic mechanism as in galactose mutarotase and that His159 and Glu264 are the catalytic acid and base, respectively. In our YMR099cp·Glc6P complex, these two residues adopt the same conformations as in the galactose mutarotase-galactose complex. However, they make no hydrogen bonds with either the O1 or O5 Glc6P oxygen atoms, because the sugar is less deeply buried in the active site pocket (Fig. 5A). Glc6P therefore probably forms a non-productive complex in our YMR099cp crystal. A similar observation was made for the crystal structures of the wild-type and E304Q mutant forms of L. lactis galactose mutarotase, which both bind only the β-anomer of glucose in a non-productive manner (25,26). On the contrary, the natural substrate galactose is well positioned in the active site and a mixture of both anomers is observed (kcat/Km = 185 and 12.65 s−1·mM−1 for galactose and glucose, respectively (26)). Because there exists no biochemical assay to measure the mutarotation of Man6P or Gal6P (and their pure anomers are commercially unavailable), we were not able to determine their kinetic parameters and we therefore cannot be sure of the preferred substrate of YMR099cp (the affinity for Man6P (Kd, 0.16 mM) is higher than for Glc6P (Kd, 0.2 mM)).
Surprisingly, the 1.6-Å resolution structure of the YMR099cp Gal6P complex unambiguously revealed that it is the β form of tagatose 6-phosphate (Tag6P), the Gal6P isomer (Fig. 6B), which is present in the active site. As observed for Glc6P, the Tag6P phosphate moiety interacts with the Gln81 side chain and with the Arg57 and Arg86 side chains from the arginine clamp, and the furan ring packs on Trp238. The Tag6P O3, O4, and O5 atoms are H-bonded to Asp203 Oδ2, His82 Nε2, and Gln81 Nε2, respectively. The Tag6P O1 atom interacts with His159 Nε2 and Glu264 Oε1. Finally, a barium ion (present as counter ion of the Gal6P used in the co-crystallization drop) is coordinated to the Tag6P O4 and O5 atoms. The presence of Tag6P in the enzyme active site was not expected because the starting material corresponds to the pyran form of galactose 6-phosphate, as confirmed by 1H NMR analysis (data not shown). Hence, small amounts of Tag6P (undetectable by NMR) are presumably present in commercial Gal6P, as observed for fructose 6-phosphate in Glc6P salts. Clearly, crystallization of YMR099cp in the presence of Gal6P is strongly selective for the Tag6P-containing complex. Fluorescence measurements have shown that YMR099cp has higher affinity for Fru6P (Kd of 114 μM), the Glc6P isomer, than for its pyran form. Fru6P and Tag6P differ only in the orientation of the hydroxyl group at position C4 (Fig. 4). By analogy, we can speculate that Tag6P, the Gal6P isomer, binds tighter to YMR099cp than Gal6P does (Kd of 0.985 mM), explaining why it is exclusively present in the crystal.
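The energetic scale of such affinity differences can be illustrated by converting a ratio of dissociation constants into a difference in binding free energy, ΔΔG = RT ln(Kd,weak/Kd,tight). The sketch below applies this to the measured Fru6P and Gal6P values at the 20°C used for the fluorescence titrations. No Kd was measured for Tag6P, so this only indicates the order of magnitude that a furanose-versus-pyranose preference could represent and is not a result of the study.

```python
import math

GAS_CONSTANT = 8.314  # J mol^-1 K^-1


def binding_free_energy_difference(kd_weak, kd_tight, temperature_k):
    """ddG = RT ln(Kd_weak / Kd_tight) in kJ/mol (both Kd in the same units)."""
    return GAS_CONSTANT * temperature_k * math.log(kd_weak / kd_tight) / 1000.0


# Kd values (uM) from the fluorescence titrations, measured at 20 C (293.15 K).
ddg = binding_free_energy_difference(985.0, 114.0, 293.15)
print(f"Fru6P binds more tightly than Gal6P by about {ddg:.1f} kJ/mol")
```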

DISCUSSION
Many enzymes involved in sugar metabolism are specific for the α or β anomer of their substrate and therefore sugar anomerization equilibrium plays a role in the general metabolic fluxes of the cell. For example, galactokinase specifically phosphorylates α-galactose at position 1 to produce α-D-galactose 1-phosphate (33), while glucose dehydrogenase only uses β-D-glucose as substrate (34). Although interconversion between the α and β anomers of these sugars occurs spontaneously in solution, the in vivo rate may be too low to support fast energy generation by metabolic pathways (the rates for spontaneous mutarotation of glucose and glucose 6-phosphate are 0.015 min−1 and 0.09 min−1, respectively (35)). Hence, aldose-1-epimerases (or mutarotases; EC 5.1.3.3) that increase the rate of anomerization are required. The most studied enzyme of this family, galactose mutarotase, catalyzes the first of the four steps of the Leloir pathway, which converts β-D-galactose (one of the lactose degradation products, along with α-D-glucose) into the metabolically more useful glucose 1-phosphate (24,36). This first step consists of the interconversion of β-D-galactose to α-D-galactose, which is then phosphorylated by galactokinase to yield α-D-galactose 1-phosphate. The latter reacts with UDP-glucose in a reaction catalyzed by galactose-1-phosphate uridylyltransferase to produce glucose 1-phosphate and UDP-galactose. Finally, UDP-galactose 4-epimerase regenerates UDP-glucose from UDP-galactose. In humans, mutations in the genes encoding any of the four enzymes involved in the Leloir pathway result in galactosemia, a rare but potentially lethal disease leading to cataract formation and liver dysfunction (37,38).
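The spontaneous rates quoted above are easier to appreciate as half-times. The sketch below treats each quoted value as a simple first-order rate constant, which is an assumption, since the original measurements may refer to the approach to anomeric equilibrium rather than to a single direction.

```python
import math


def half_time(rate_constant_per_min):
    """Half-time of a first-order process: t1/2 = ln 2 / k."""
    return math.log(2.0) / rate_constant_per_min


# Spontaneous mutarotation rates (min^-1) quoted in the text.
spontaneous_rates = {"glucose": 0.015, "glucose 6-phosphate": 0.09}
for sugar, k in spontaneous_rates.items():
    print(f"{sugar}: t1/2 = {half_time(k):.1f} min")
```

On this basis, uncatalyzed anomerization takes minutes to tens of minutes, which is slow relative to the time scale of metabolic flux and is consistent with the argument that a mutarotase is needed.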
In recent years, the galactose mutarotase catalytic mechanism has been dissected by the resolution of the crystal structures of the L. lactis enzyme bound to different substrates combined with site-directed mutagenesis studies (11,25,26). A two-layered β-sandwich fold made of 20-30 antiparallel β-strands was revealed for the L. lactis, human, and yeast enzymes (11,20,21). In addition, the structure of the galactose-bound enzyme has highlighted four strictly conserved residues (two histidines and two acidic amino acids) located in the galactose binding pocket that could potentially act as general acid/base in the mutarotation reaction. Substitution of the acidic residues (Asp243 and Glu304 according to L. lactis numbering) by Ala and of the two histidines (His96 and His170) by Asn has led to the conclusion that Glu304 and His170 act as catalytic base and acid, respectively (26). Glu304 starts the reaction by abstracting the proton from the C-1 hydroxyl group of the sugar while His170 protonates the C-5 ring oxygen. A growing number of mutarotases acting on a wide variety of furan or pyran sugars ((deoxy)ribose, fucose, rhamnose) have been characterized recently (30,39,40). Here, we bring structure-based experimental evidence that YMR099cp exhibits hexose-6-phosphate mutarotase activity on Glc6P, Gal6P, and Man6P. Glc6P mutarotase activity was previously described for an unidentified enzyme purified from wild-type yeast exhibiting a molecular weight and pI similar to those of YMR099cp (12-14). It was shown that specific anomers of both Man6P and Glc6P are subject to metabolic conversions. For instance, yeast phosphomannose isomerase, which catalyzes the interconversion of Man6P and Fru6P, is specific for the Man6P β-anomer, but both mannose anomers are phosphorylated by hexokinase (41). Hence, the mutarotase activity that we have observed on Man6P could provide sufficient amounts of β-Man6P for phosphomannose isomerase.
The need for an enzyme interconverting α- and β-Glc6P anomers is further justified by the specificity of different enzymes involved in sugar metabolism pathways (Fig. 7). First, glucose-6-phosphate dehydrogenase specifically catalyzes the conversion of the Glc6P β-anomer into D-glucono-1,5-lactone 6-phosphate. This is the first irreversible and rate-limiting step of the pentose phosphate pathway that generates NAD(P)H from NAD(P)+ upon conversion of β-Glc6P into Fru6P (42). Second, the Glc6P α-anomer is the specific substrate for both phosphoglucomutase and phosphoglucose isomerase. Phosphoglucomutase is involved in glycolysis as well as in the metabolism of glycogen, trehalose, and galactose and catalyzes the interconversion of α-D-Glc1P and α-D-Glc6P (43). Phosphoglucose isomerase converts α-D-Glc6P to Fru6P during glycolysis and the reverse reaction during gluconeogenesis (43). An enzyme with specificity for Gal6P anomers has not yet been described, and whether Gal6P is a biological substrate of YMR099cp remains an open question. Surprisingly, the complex between YMR099cp and Gal6P revealed the presence of the furanose isomer of Gal6P (Tag6P) in the crystals. The exclusive presence of Tag6P in the YMR099cp active site can be explained by the higher affinity of the protein for the furanose form of the substrate, at least in the crystal. It should be noted that YMR099cp exhibits higher affinity for the furanose Fru6P (Kd, 114 μM) than for the pyran form of Gal6P (Kd, 985 μM). The Tag6P bound to the YMR099cp active site is very likely to mimic Fru6P. Analysis of our complexes (Fig. 6) clearly shows that the main interaction with the substrates occurs through anchoring of the 6-phosphate group and that the active site pocket can easily accommodate various isomers at positions 2 and 4 on the sugar ring. This explains why we observe such broad substrate specificity.
Whether YMR099cp acts in a precise metabolic pathway, like galactose mutarotase does, remains an open question.
In conclusion, it has been recently reported that 40% of the enzymatic activities described by EC numbers are not associated with any protein sequence in major public databases. In this article we demonstrate that a systematic structural approach can help to fill this gap in favorable cases. We show here that the S. cerevisiae ymr099c gene product is an enzyme that corresponds to a hitherto orphan EC number (EC 5.1.3.15). The definition of the physiological role of the enzyme will need further exploration. Hopefully, this will explain why YMR099c belongs to the 1% of the genes regulated by the transcription activator GCN4 that, in the vast majority, encode proteins involved in amino acid and nucleotide metabolism (7). Similarly, it remains to be determined whether the co-purification of YMR099cp with TRZ1, a tRNA 3′ processing endonuclease responsible for removing the 3′ trailer from precursor tRNA (8-10), has any biological relevance or is an artifact.