Three New Nudix Hydrolases from Escherichia coli*

Three members of the Nudix (nucleoside diphosphate X) hydrolase superfamily have been cloned from Escherichia coli MG1655 and expressed. The proteins have been purified and identified as enzymes active on nucleoside diphosphate derivatives with the following specificities. Orf141 (yfaO) is a nucleoside triphosphatase preferring pyrimidine deoxynucleoside triphosphates. Orf153 (ymfB) is a nonspecific nucleoside tri- and diphosphatase and atypically releases inorganic orthophosphate from triphosphates instead of pyrophosphate. Orf191 (yffH) is a highly active GDP-mannose pyrophosphatase. All three enzymes require a divalent cation for activity and are optimally active at alkaline pH, characteristic of the Nudix hydrolase superfamily. The question of whether or not Orf1.9 (wcaH) is a bona fide member of the Nudix hydrolase superfamily is discussed.

Owing to rapid advances in DNA sequencing technology over the past several years, partial nucleotide sequences and whole genomes from a multitude of organisms have been deposited in the data banks at an explosive rate. These nucleotide sequences are analyzed by specialized "gene finder" software, so the identification of genes is no longer the purview of the geneticist who relies on some discernible function, or phenotype, for discovery. Consequently, large numbers of genes are being discovered and decoded into open reading frames representing proteins of unknown function. The goal of functional genomics, or proteomics, is to discover the roles played by these orphan proteins, whether they be structural, regulatory, or enzymatic. A major tool in annotating a protein, or relating it to a function, is the recognition of structural motifs or amino acid signature sequences.
In this vein, we have discovered a superfamily of enzymes related by the similarity of their substrates and defined by a highly conserved sequence of amino acids, the "Nudix box," as follows: GX₅EX₇REUXEEXGU, where X represents any amino acid, and U is usually a bulky hydrophobic amino acid, such as Leu, Val, or Ile. This amino acid motif is of highly predictive value, because proteins (enzymes) containing the Nudix box invariably catalyze the hydrolysis of nucleoside diphosphates linked to some other moiety, X, hence the acronym, Nudix (1), although in a few cases other organic compounds containing pyrophosphate linkages are also hydrolyzed. At this writing, over 2000 members of the superfamily from 200 species ranging from viruses to humans have been identified through BLAST (2) searches, and we have been systematically cloning and expressing them and characterizing their activities. The range of substrates includes, among others, nucleoside triphosphates, nucleotide sugars, ADP-ribose, various coenzymes, and diadenosine oligophosphates of the form adenosine(5′)pn(5′)adenosine, where n = 3-6.
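As a rough illustration of how a signature sequence of this kind can be used to screen open reading frames, the following Python sketch writes the Nudix box as a regular expression. It assumes the consensus GX5EX7REUXEEXGU and restricts U to Leu, Val, or Ile, although the text says only that U is "usually" such a residue. The function name `find_nudix_box`, the `allow_r_to_k` option, and the test sequence are hypothetical; the searches reported in this paper were done with BLAST, not with pattern matching.

```python
import re

# Nudix box consensus GX5EX7REUXEEXGU written as a regular expression.
# Assumption: U is limited to Leu, Val, or Ile; real family members may
# carry other bulky hydrophobic residues at these positions.
STRICT = re.compile(r"G.{5}E.{7}RE[LIV].EE.G[LIV]")

# Relaxed variant that also accepts Lys at box position 15, the
# conservative Arg-to-Lys replacement reported here for Orf191.
RELAXED = re.compile(r"G.{5}E.{7}[RK]E[LIV].EE.G[LIV]")


def find_nudix_box(protein, allow_r_to_k=False):
    """Return (start_index, matched_box) for the first match, or None."""
    pattern = RELAXED if allow_r_to_k else STRICT
    match = pattern.search(protein.upper())
    if match is None:
        return None
    return match.start(), match.group(0)


if __name__ == "__main__":
    # Synthetic sequence built to satisfy the consensus; not a real protein.
    box = "G" + "A" * 5 + "E" + "A" * 7 + "RELAEEAGL"
    print(find_nudix_box("MSTA" + box + "KKT"))
```

A strict pattern of this kind would miss a protein such as Orf191, which is one reason a relaxed variant is included in the sketch.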
All free-living organisms contain at least one Nudix gene, and many have several genes coding for different Nudix hydrolases, although the number of representatives in each species appears to depend on factors other than genome size. Bacillus cereus has 26 genes, Saccharomyces cerevisiae has 7, Drosophila melanogaster has 20, and Homo sapiens has 22, with respective genome sizes of 5.4, 13, 165, and 3200 Mb.
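The lack of correlation with genome size can be made explicit by dividing each gene count by the corresponding genome size. The short Python sketch below uses only the four pairs of numbers quoted in the preceding paragraph; the function name `genes_per_mb` is an illustrative choice.

```python
# Nudix gene counts and genome sizes (Mb) as quoted in the text.
GENOMES = {
    "Bacillus cereus": (26, 5.4),
    "Saccharomyces cerevisiae": (7, 13.0),
    "Drosophila melanogaster": (20, 165.0),
    "Homo sapiens": (22, 3200.0),
}


def genes_per_mb(genes, size_mb):
    """Number of Nudix genes per megabase of genome."""
    return genes / size_mb


if __name__ == "__main__":
    for species, (genes, size_mb) in GENOMES.items():
        density = genes_per_mb(genes, size_mb)
        print(f"{species}: {density:.4f} Nudix genes per Mb")
```

On these figures the density falls about 700-fold from B. cereus to H. sapiens, while the gene count itself is nearly the same in the two species.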
Escherichia coli, with a genome of 4.6 Mb and 13 Nudix hydrolase genes, may be considered the progenitor of the superfamily, since it was the first organism to reveal the amino acid signature sequence diagnostic of the family, and it may be considered paradigmatic of the rest of the extended family, because several of the enzymes first discovered in E. coli have since been found in other prokaryotes, archaea, and eukaryotes. Eleven of the 13 E. coli Nudix proteins contain the consensus Nudix box, one (Orf191) has a conservative replacement of arginine by lysine (R15_N → K), and one (Orf1.9) has a slight rearrangement of amino acids (Fig. 1). We have previously identified enzymatic activities for eight of these proteins. In this paper, we describe the cloning and expression of three more of the Nudix genes and the identification of their respective enzymatic activities. Thus, 11 of the 13 Nudix candidates have now been accounted for.

* This work was supported by National Institutes of Health Grant GM18649. This is publication 1529 from the McCollum-Pratt Institute. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

Materials
Oligonucleotide primers were from Integrated DNA Technologies (Coralville, IA); expression plasmids and E. coli hosts were from Novagen (Madison, WI) or laboratory stocks; E. coli chromosomal DNA for cloning was prepared in this laboratory from strain MG1655, which was kindly provided by Dr. Frederick R. Blattner (University of Wisconsin). Restriction enzymes, kits for the polymerase chain reaction, calf intestinal alkaline phosphatase, and yeast inorganic pyrophosphatase were from Stratagene (La Jolla, CA). Substrates and common biochemicals were from Sigma, and the Bradford reagent (3) for protein determinations was from Bio-Rad.

Cloning of the Nudix Genes of E. coli into pET Expression Vectors-The genes were amplified from genomic DNA using the polymerase chain reaction, incorporating NdeI and BamHI sites at the start and end of the gene, respectively. The amplified DNA was cut with the appropriate restriction enzymes and ligated into pET11b (orf141 and orf153) or pET24a (orf191 and orf192) under control of the T7 lac promoter. The resultant plasmids were transformed into HMS174(DE3), the transformants were screened for the correct size insert, and the cloned fragments were sequenced at the Johns Hopkins School of Medicine facility. Permanent stocks of the cells were stored at −80°C in LB broth containing 10% dimethyl sulfoxide.
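The primers actually used are not given in the text. Purely as an illustration of what "incorporating NdeI and BamHI sites at the start and end of the gene" can look like, the following Python sketch builds a primer pair for an arbitrary open reading frame. The recognition sequences (CATATG for NdeI, GGATCC for BamHI) are standard; the 5′ clamp, the annealing length, and the function names are hypothetical choices, and a real design would also consider melting temperatures.

```python
NDEI = "CATATG"   # NdeI site; its ATG can double as the start codon
BAMHI = "GGATCC"  # BamHI site
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_primers(orf, anneal=18, clamp="GCGC"):
    """Return (forward, reverse) primers for an ORF given 5' to 3'.

    orf    : coding strand from start codon through stop codon
    anneal : number of gene-specific bases per primer (assumed value)
    clamp  : extra 5' bases to help the enzymes cut (assumed value)
    """
    orf = orf.upper()
    if not orf.startswith("ATG"):
        raise ValueError("this sketch assumes an ATG start codon")
    if NDEI in orf or BAMHI in orf:
        raise ValueError("ORF contains an internal NdeI or BamHI site")
    # Forward primer: the ATG of the NdeI site replaces the ORF's own ATG.
    forward = clamp + NDEI + orf[3:3 + anneal]
    # Reverse primer: BamHI site followed by the reverse complement of
    # the 3' end of the ORF, stop codon included.
    reverse = clamp + BAMHI + reverse_complement(orf[-anneal:])
    return forward, reverse


if __name__ == "__main__":
    toy_orf = "ATG" + "GCT" * 10 + "TAA"  # made-up sequence, not a real gene
    for name, primer in zip(("forward", "reverse"), design_primers(toy_orf)):
        print(name, primer)
```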
Growth and Expression of the Transformed Cells-One colony of the expression strain was inoculated into 40 ml of LB medium containing 100 µg/ml ampicillin (pET11b) or 30 µg/ml kanamycin (pET24a), incubated on a rotary shaker at 37°C overnight, transferred to 2 liters of the same medium, and grown to an A600 of 0.8. At this point, the cells were derepressed by the addition of isopropyl-β-D-thiogalactopyranoside to a concentration of 1 mM and incubated for an additional 3 h, after which they were harvested by centrifugation, washed in buffered isotonic saline solution, and frozen at −80°C.
Purification of the Enzymes-All of the following steps were conducted at 0-4°C. The expressed Orf141 and Orf153 proteins were extracted from the thawed cells into 2.5 volumes of buffer A (50 mM Tris-Cl, pH 7.5, 1 mM EDTA, 0.1 mM dithiothreitol). Orf191 was not extractable under these conditions, so the cells were sonicated in 2.5 volumes of buffer A and centrifuged, and the precipitate was discarded. The extracts were adjusted to protein concentrations of 10 mg/ml with buffer A, and the nucleic acids were precipitated by the dropwise addition of 10% streptomycin sulfate to a final concentration of 1.5%. The precipitates were removed by centrifugation, and the supernatants were adjusted to 70% saturation with solid ammonium sulfate. After 1 h on ice, the precipitates were collected, dissolved in a minimal volume of buffer A, and chromatographed on a gel filtration column of Sephadex G-100 (2.5 × 50 cm), equilibrated and run with 200 mM NaCl in buffer A. Fractions were analyzed by polyacrylamide gel electrophoresis, and those estimated to be >90% pure were combined and concentrated by pressure dialysis or by precipitation with ammonium sulfate.
Enzyme Assays-The standard reaction mixture contained the following in 50 µl: 50 mM Tris-Cl, pH 8.5, 5 mM Mg²⁺ or Mn²⁺, 2 mM substrate, 0.2-2 milliunits of enzyme, and, for nucleoside triphosphates when PPi was one of the products, 0.5 units of yeast inorganic pyrophosphatase. The assay of all other substrates depends on the conversion of a phosphatase-resistant substrate to a phosphatase-sensitive product, and for these, 3 units of alkaline phosphatase were substituted for the inorganic pyrophosphatase. The mixture was incubated at 37°C for 15 min, the reaction was stopped by the addition of Norit (charcoal activated by an acid wash) for nucleoside triphosphates or EDTA for other substrates, and the liberated inorganic orthophosphate was assayed by the colorimetric procedure of Fiske and SubbaRow (4) as modified by Ames and Dubin (5). One unit of enzyme hydrolyzes 1 µmol of substrate/min. The enzymatic rates obtained with this assay are reliable to approximately ±10%.
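The arithmetic that links measured orthophosphate to enzyme units can be sketched as follows. The sketch assumes the unit definition of 1 µmol of substrate hydrolyzed per min, a 50-µl reaction containing 2 mM substrate, and 2 mol of Pi per mol of substrate after the coupling enzyme has acted (PPi cleaved to 2 Pi by pyrophosphatase, or both products dephosphorylated by alkaline phosphatase). The function names and the 30-nmol reading are made up for illustration.

```python
def enzyme_units(pi_nmol, minutes, pi_per_substrate):
    """Enzyme units (µmol substrate hydrolyzed per min) from measured Pi.

    pi_nmol          : orthophosphate found in the assay, in nmol
    minutes          : incubation time
    pi_per_substrate : mol Pi released per mol substrate hydrolyzed,
                       after the coupling enzyme has acted
    """
    substrate_nmol = pi_nmol / pi_per_substrate
    return substrate_nmol / 1000.0 / minutes


def fraction_hydrolyzed(pi_nmol, pi_per_substrate,
                        substrate_mM=2.0, volume_ul=50.0):
    """Fraction of the substrate consumed during the incubation."""
    total_substrate_nmol = substrate_mM * volume_ul  # mM x µl = nmol
    return (pi_nmol / pi_per_substrate) / total_substrate_nmol


if __name__ == "__main__":
    # Hypothetical reading: 30 nmol Pi after 15 min, 2 Pi per substrate.
    units = enzyme_units(30.0, 15.0, 2)
    print(f"{units * 1000:.2f} milliunits in the assay")
    print(f"{fraction_hydrolyzed(30.0, 2):.0%} of the substrate hydrolyzed")
```

With these made-up numbers the assay contains 1 milliunit of enzyme, inside the 0.2-2 milliunit range of the standard mixture, and 15% of the substrate has been consumed.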

RESULTS AND DISCUSSION
Gene Cloning, Expression, and Enzyme Purification-Cloning of all three genes went without incident, and the plasmid inserts were sequenced and found to be identical to those reported in the data banks. As with some of the other Nudix hydrolases (6-10), two of the proteins, Orf141 and Orf153, were extractable into buffer merely by freezing and thawing the cells. This left the bulk of the cellular proteins behind, thereby considerably simplifying the purification procedure. At present, there is no satisfactory explanation as to why only a subset of the expressed Nudix hydrolases is extractable under these conditions. It does not appear to reflect their localization within the cell, since foreign proteins expressed in E. coli also follow this unpredictable behavior.
Identification of Enzyme Activities and Specificities-A unifying property of the Nudix hydrolases is their activity on nucleoside diphosphate derivatives, although these substrates may vary widely for different members of the superfamily. It has been possible to classify some of the proteins into subfamilies based on landmark amino acids outside of the Nudix box that are diagnostic of enzymes having similar specificities (6, 11-13), and thereby to predict the substrate(s) for a newly discovered member of the superfamily. However, the three proteins described here do not have any distinguishing characteristics of predictive value. Accordingly, the enzymes were tested with a battery of nucleoside diphosphate derivatives.
AUGUST 11, 2006 • VOLUME 281 • NUMBER 32

Orf141-Table 1 shows that Orf141 prefers the pyrimidine deoxynucleoside triphosphates dUTP, dTTP, and dCTP and is less active on the corresponding diphosphates. Since the Nudix hydrolases Orf17 (7), MutT (14), and Orf135 (10) prefer dATP, dGTP, and dCTP, respectively, the activity of Orf141 on dTTP completes the set of enzymes in E. coli active on the four canonical deoxynucleoside triphosphates. As with the other three enzymes, the products of the hydrolysis catalyzed by Orf141 are a nucleoside monophosphate and inorganic pyrophosphate according to the following reaction.

dTTP + H₂O → dTMP + PPi
REACTION 1
When inorganic pyrophosphatase is omitted from the standard assay, dTMP is produced, as measured by high pressure liquid chromatography, but no Pi is formed (data not shown). It has been demonstrated previously, by conducting the enzymatic hydrolysis in H₂¹⁸O, that the mechanism of hydrolysis catalyzed by the Nudix hydrolases MutT dGTPase (15) and Orf17 dATPase (7) is nucleophilic attack on the β-phosphorus, leading to the formation of the corresponding nucleoside monophosphate and PPi. In all likelihood, Orf141 dTTPase follows the same mechanism, both for the triphosphates and for the hydrolysis of nucleoside diphosphates (Table 1) to nucleoside monophosphates and Pi.
Orf153-This enzyme differs from all four of the other Nudix nucleoside triphosphatases in E. coli in the following respects. 1) It hydrolyzes nucleoside triphosphates in a stepwise manner, forming Pi instead of PPi. Thus, inorganic pyrophosphatase can be omitted from the assay. 2) It has no general preference for deoxyribose or ribose. 3) The nucleoside diphosphates are generally as good substrates as the triphosphates. None of the other Nudix hydrolase substrates tested, including NADH, GDP-mannose, and ADP-ribose, are acted upon by the enzyme. Interestingly, with the exception of GDP, Orf141 is as active as or much more active than Orf153 on their common substrates, raising the question of what role Orf153 might play in the cell.
Orf191-In our original BLAST searches of the E. coli genome, we missed Orf191, most likely due to the replacement of the highly conserved arginine (R15_N) by a lysine (Fig. 1). Subsequently, using less stringent search criteria, we identified Orf191 as a potential Nudix hydrolase with a conservative R15_N → K substitution, and the proline 15 amino acids downstream of the Nudix box suggested that it might be a member of the ADP-ribose pyrophosphatase subfamily (6). Accordingly, we tested the expressed protein and found that it did indeed catalyze the hydrolysis of ADP-ribose, but a subsequent survey of related compounds revealed that GDP-mannose was a better substrate. Table 2 compares the relative activities of Orf191 on several potential substrates. It appears that the enzyme prefers purine over pyrimidine nucleotide sugars (compare GDP-glucose with UDP- and CDP-glucose) and, within the purine series, prefers a 6-keto over a 6-amino group (compare GDP-mannose, IDP-ribose, and GDP-glucose with ADP-ribose and ADP-glucose). X-ray crystallographic analysis of the enzyme-substrate complex should provide the structural basis of the specificity of the enzyme.

Fig. 1 legend: The checkmarks refer to those enzymes reported in this paper. The highly conserved amino acids of the signature sequence are in boldface type. Only the region surrounding the Nudix box is shown. The question mark refers to an uncertainty in the classification of Orf1.9 (see "Results and Discussion"). B, the two Nudix hydrolases that have not been positively identified. Orf192 has the CoA pyrophosphatase motif (13) (boldface type and underlined) but has not been expressed satisfactorily, and Orf180 remains a mystery.

The enzyme catalyzes the hydrolysis of the pyrophosphate bond, always releasing a nucleoside monophosphate and a phosphorylated moiety, depending on the substrate. For GDP-mannose, the reaction may be written as follows.

GDP-mannose + H₂O → GMP + mannose-1-phosphate
REACTION 2

In the absence of alkaline phosphatase, no Pi is liberated during the course of the reaction, whereas in its presence, 2 mol of Pi are liberated per mol of GDP-mannose hydrolyzed.
Other Properties of the Enzymes-As with all the other Nudix hydrolases characterized so far, these three enzymes absolutely require a divalent cation cofactor for activity, and Mg²⁺ at 5 mM is optimal. Also characteristic of the members of the superfamily, the three enzymes are optimally active in the alkaline range around pH 8.5. Both Orf141 and Orf153 elute from a calibrated gel filtration column as monomers, whereas Orf191, with a subunit molecular mass of 20 kDa, elutes as a 60-kDa homotrimer. One other Nudix hydrolase, the MutX dGTPase from Streptococcus pneumoniae, also behaves as a trimer in solution (16), whereas the ADP-ribose pyrophosphatase from E. coli (17) and the diadenosine tetraphosphate pyrophosphatase from Deinococcus radiodurans (18) are dimers. Thus, the members of the Nudix hydrolase superfamily do not share a common quaternary structure.
The kinetic constants for the three enzymes are shown in Table 3 along with the constants for Orf1.9, another GDP-mannose hydrolase present in E. coli. Orf1.9 was one of the original open reading frames detected in our first BLAST search for proteins containing the Nudix signature sequence (19), and at that time, we noticed that there were some differences in the arrangement of amino acids in its Nudix box. Two of the highly conserved amino acids, E16_N and E19_N, were replaced by a leucine and an alanine, respectively (Fig. 1). Nevertheless, we found that the expressed and purified protein was active on the nucleoside diphosphate derivative GDP-mannose (9), suggesting that the enzyme was a bona fide member of the Nudix hydrolases. In addition to the amino acid discrepancies, we were also surprised by the products of the reaction, which were GDP and free mannose according to the following reaction.

GDP-mannose + H₂O → GDP + mannose
REACTION 3

In all other Nudix hydrolase reactions, the nucleophilic attack is on phosphorus, resulting in the cleavage of a pyrophosphate linkage, whereas the mechanism of Orf1.9 involves attack on carbon, as has been shown by NMR of a reaction run in H₂¹⁸O (20). Our recent discovery of Orf191, an authentic GDP-mannose pyrophosphatase member of the Nudix hydrolases, raises the question as to whether Orf1.9 is indeed a member of the family. Comparison of the kinetic constants in Table 3 shows that Orf191 has a kcat 150 times higher and a catalytic efficiency (kcat/Km) 60 times higher than Orf1.9. An attractive hypothesis is that Orf1.9 is an example of convergent evolution, not gene duplication as is the case for all other members of the superfamily, and that hydrolysis of GDP-mannose is not the physiological role of the enzyme. One possibility is that Orf1.9 is in reality a GDP-mannosyl transferase that transfers mannose to water in the absence of its physiological acceptor. Precedent for this has been seen with phage T4 β-glucosyltransferase (21) and the UDP-glucosyltransferase of Clostridium difficile toxin (22).
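The two fold-differences quoted above also fix the ratio of the Km values, since catalytic efficiency is kcat divided by Km. The small Python sketch below works this out from the rounded ratios in the text only; the Table 3 values themselves are not reproduced here, so the result is approximate, and the function name is an illustrative choice.

```python
def implied_km_ratio(kcat_ratio, efficiency_ratio):
    """Km(A)/Km(B) implied by two fold-differences between enzymes A and B.

    kcat_ratio       : kcat(A) / kcat(B)
    efficiency_ratio : [kcat/Km](A) / [kcat/Km](B)

    Because efficiency = kcat / Km, the Km ratio is kcat_ratio divided
    by efficiency_ratio.
    """
    return kcat_ratio / efficiency_ratio


if __name__ == "__main__":
    # A = Orf191, B = Orf1.9; ratios as quoted in the text.
    print(implied_km_ratio(150, 60))
```

On these figures the Km of Orf191 for GDP-mannose would be about 2.5 times that of Orf1.9, so the advantage of Orf191 lies in turnover, not in tighter apparent binding.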
With the characterization of Orf141, Orf153, and Orf191 in this paper, 11 of the 13 proteins in E. coli identified in a BLAST search as members of the Nudix hydrolase superfamily have been shown to have enzymatic activities (Fig. 1A). In Fig. 1B, two as yet unidentified open reading frames are shown. Most likely, Orf192 is a CoA pyrophosphatase, because it has an amino acid motif upstream of the Nudix box (shown in boldface type, underlined) that designates the CoA pyrophosphatase subfamily of the Nudix hydrolases (13). However, although we have cloned Orf192 into several vectors and expressed it in several hosts, we have not yet been successful in producing enough soluble protein to identify the activity unequivocally. On the other hand, we have cloned, expressed, and purified Orf180, one of the first open reading frames we uncovered in a BLAST search, and although it is a readily soluble protein and contains a canonical Nudix box, its activity remains an enigma.