Cloning, expression, and characterization of polyphosphate glucokinase from Mycobacterium tuberculosis.

Polyphosphate glucokinase from Mycobacterium tuberculosis catalyzes the phosphorylation of glucose using polyphosphate or ATP as the phosphoryl donor. The M. tuberculosis H37Rv gene encoding this enzyme has been cloned, sequenced, and expressed in Escherichia coli. The gene contains an open reading frame for 265 amino acids with a calculated mass of 27,400 daltons. The recombinant polyphosphate glucokinase was purified 189-fold to homogeneity and shown to contain dual enzymatic activities, similar to the native enzyme from the H37Ra strain. The high G+C content in the codon usage (64.5%) of the gene and the absence of an E. coli-like promoter consensus sequence are consistent with other mycobacterial genes. Two phosphate binding domains conserved in the eukaryotic hexokinase family were identified in the polyphosphate glucokinase sequence; however, "adenosine" and "glucose" binding motifs were not apparent. In addition, a putative polyphosphate binding region is proposed for the polyphosphate glucokinase enzyme.

Glucose phosphorylation is catalyzed by a family of structurally related hexose phosphotransferases. These enzymes can be divided into two general groups: the Class I enzymes, which include hexokinase I, II, and III, are larger (~100 kDa) and have a high affinity for glucose (Km = 20-130 µM) (1, 2); the Class II enzyme, which is called hexokinase IV or glucokinase, is smaller (~50 kDa) and displays a low affinity for glucose (Km = 5-8 mM) (reviewed in Ref. 3). Comparison of the deduced amino acid sequences of mammalian liver glucokinases with those of hexokinase I, II, and III (4) revealed that the hexokinases are essentially dimers of glucokinase, thus providing evidence for the hypothesis that the hexokinases have evolved from a primordial glucokinase-like gene by gene duplication (1, 4, 5). However, from a phylogenetic hexokinase family tree constructed by comparing 60 sequences of sugar kinases, Bork et al. (6) observed that glucokinases appear in three clusters in separate branches, where (i) the mammalian glucokinases formed one cluster, (ii) the yeast glucokinase appeared to be clustered with yeast hexokinases rather than with mammalian glucokinases, and (iii) bacterial glucokinases from Zymomonas mobilis and Streptomyces coelicolor were grouped with the Zymomonas fructokinase. These authors concluded that a divergent evolutionary relationship between these glucokinases was unlikely. Rather, they argued that evolutionary convergence to glucose specificity must have occurred independently in mammals, yeast, and bacteria.
Our interest in the evolutionary origin of the glucokinases led us to investigate the properties of polyphosphate glucokinases (Poly(P)-glucokinase) from different sources. Inorganic polyphosphates (poly(P)s) are linear polymers of orthophosphate linked by phosphoanhydride bonds. These polymers have been found in almost all species, and their proposed biological functions have been reviewed in detail (7-10). One of the poly(P)-utilizing enzymes, Poly(P)-glucokinase (EC 2.7.1.63), on the other hand, has been found only in certain bacteria (7, 11). This Poly(P)-glucokinase, which was first observed in Mycobacterium phlei by Szymona (12), is an unusual glycolytic enzyme that utilizes poly(P) or ATP as the phosphoryl donor to phosphorylate glucose. We previously reported the purification of this enzyme from Propionibacterium shermanii (13) and Mycobacterium tuberculosis H37Ra (14) and demonstrated that the poly(P)- and ATP-dependent glucokinase activities were catalyzed by a single enzyme. Poly(P)-glucokinase from M. tuberculosis also utilized GTP, UTP, and CTP as phosphoryl donors (14). In addition, kinetic studies with the P. shermanii enzyme suggested that ATP and poly(P) have different binding sites (13). It has been hypothesized (13) that glucose phosphorylation in microorganisms may originally have been mediated by poly(P) and that, when ATP became available in the environment, the glucokinases made a transition to the latter phosphoryl donor. Thus, this bifunctional Poly(P)-glucokinase could represent an "intermediate" in the evolution of glucokinases, especially in prokaryotic cells.
In an attempt to identify the structural and functional domains of Poly(P)-glucokinase, we have cloned, sequenced, and expressed the Poly(P)-glucokinase gene (ppgk) from M. tuberculosis H37Rv. Characterization of the recombinant Poly(P)-glucokinase (re-Poly(P)-glucokinase) expressed in and purified from Escherichia coli showed that the cloned ppgk gene encodes a single polypeptide chain containing both the poly(P)- and ATP-dependent glucokinase activities. Through sequence alignment with other glucokinases, phosphate binding motifs were found to be conserved in the Poly(P)-glucokinase and other prokaryotic glucokinases. In addition, a putative poly(P) binding site for the Poly(P)-glucokinase is proposed.

EXPERIMENTAL PROCEDURES
Amino Acid Sequences of Poly(P)-glucokinase Peptides and Design of Degenerate Primers-Approximately 100 µg of Poly(P)-glucokinase was digested with endoproteinase Arg-C, and the peptides were fractionated by Tricine-SDS-PAGE (16) followed by electroblotting onto a polyvinylidene difluoride membrane (17). The membrane was then stained with Coomassie Brilliant Blue R-250 and destained. A peptide with a migration corresponding to 27 kDa (see Table I, peptide number 1) was excised and subjected to N-terminal sequencing on an Applied Biosystems Model 470 sequencer at the core facility of Case Western Reserve University. Because the native enzyme migrated as a 33-kDa protein (14), this peptide sequence was expected to be close to the N terminus of the enzyme. Therefore, we designed degenerate oligonucleotides, 5'-GGAATTCCTT(T/C)GGIGTIGA(T/C)GTNGGNGG-3', as the sense primers based on the amino acid sequence FGVDVGG. Internal peptides were obtained by digesting the enzyme with V8 protease or trypsin and separating the digest on a Synchropak C8 column (RP-8, 25 × 0.46 cm) using reverse-phase high performance liquid chromatography. Peptides were eluted with a linear gradient using 0.1% aqueous trifluoroacetic acid (solvent A) and 0.1% trifluoroacetic acid in CH3CN (solvent B). Peptide elution was monitored at 220 nm, and the peptides were collected manually and subjected to N-terminal sequencing. The antisense primer sequences, based on the sequence EEHYGAG of peptide number 4, are 3'-CT(T/C)CT(T/C)GA(A/G)AT(G/A)CCICGNCCICCTTAAGG-5'. In both primers, N denotes a position at which all four nucleotides (A, T, G, and C) were incorporated. Inosine (I) was used in place of N at some positions, reducing the degeneracy of each primer to 64-fold.
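The stated primer lengths and the 64-fold degeneracy can be checked by counting the choices at each position. The following is a minimal sketch in Python, not part of the original procedure; the function names are illustrative, and it assumes that inosine (I) contributes no degeneracy while N contributes four choices.

```python
import re

# Number of alternative bases each single-letter symbol stands for.
# Assumption: inosine (I) counts as one fixed base; N means any of A, T, G, C.
FOLD = {"A": 1, "C": 1, "G": 1, "T": 1, "I": 1, "N": 4}

def parse_primer(primer):
    """Split a primer written like 'TT(T/C)GGN' into one token per position."""
    compact = primer.replace(" ", "")
    return re.findall(r"\([ACGT](?:/[ACGT])+\)|[ACGTIN]", compact)

def degeneracy(primer):
    """Return (length in nucleotides, number of distinct sequences in the mix)."""
    tokens = parse_primer(primer)
    fold = 1
    for token in tokens:
        if token.startswith("("):
            # A bracketed position such as (T/C) offers one choice per listed base.
            fold *= len(token.strip("()").split("/"))
        else:
            fold *= FOLD[token]
    return len(tokens), fold

# Primer strings as printed in the text (orientation labels omitted).
SENSE = "GGAATTCCTT(T/C)GGIGTIGA(T/C)GTNGGNGG"
ANTISENSE = "CT(T/C)CT(T/C)GA(A/G)AT(G/A)CCICGNCCICCTTAAGG"

for name, primer in (("sense", SENSE), ("antisense", ANTISENSE)):
    length, fold = degeneracy(primer)
    print(f"{name}: {length}-mer, {fold}-fold degenerate")
```

Under these assumptions the sense primer is a 28-mer and the antisense primer a 29-mer, each 64-fold degenerate, in agreement with the values given in the text.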
Genomic Cloning of Poly(P)-glucokinase Gene-Degenerate sense primers (28-mers) and degenerate antisense primers (29-mers) corresponding to the sequences of two peptides (see Table I, numbers 1 and 4) of Poly(P)-glucokinase from M. tuberculosis H37Ra were used as primers in PCR. To isolate the DNA, approximately 10 g of M. tuberculosis H37Ra cells were freeze-thawed twice and resuspended in 35 ml of TE buffer (10 mM Tris-HCl, 1 mM EDTA, pH 7.4), and 2.5 ml of lysozyme (10 mg/ml) was added, mixed, and incubated for 1 h at 37°C. To this, 3.5 ml of 10% SDS and 300 µl of proteinase K (10 mg/ml) were then added, and the solution was incubated for 30 min at 65°C. Purification of genomic DNA was then carried out by published methods (15).
Around 100 ng of purified genomic DNA was used as a template in a PCR reaction. Sixteen different buffer conditions, with variable Mg2+ concentrations and pH values, were set up based on the PCR Optimizer kit (Invitrogen), and 30 cycles of 95°C (2 min), 55°C (3 min), and 72°C (3 min) were performed. A major 365-bp product was observed on a 1% agarose gel under the following conditions: Mg2+ (2.0 mM), Tris-HCl (60 mM, pH 9.0), and (NH4)2SO4 (15 mM). This fragment was gel-purified and subcloned into the vector pBluescript SK- (Stratagene, La Jolla, CA). The resulting plasmid was designated pBS-PCR-1. Sequence analysis of pBS-PCR-1 indicated that the deduced sequence encoded the amino acid sequences of the two isolated peptides (numbers 1 and 4) as shown in Table I. A high specific activity probe was generated using 25 ng of this 365-bp fragment as a template, the Random Primed DNA labeling kit, and 5 µCi of [α-32P]CTP. The resulting 32P-labeled probe was used to screen the λgt11 genomic library of M. tuberculosis H37Rv. Around 6 × 10^4 plaques were screened on four duplicate filters. Hybridizations were performed at 65°C overnight in 6× SSC buffer (20× SSC = 3 M NaCl and 0.3 M sodium citrate), 1% (w/v) bovine serum albumin, and 0.02% (w/v) SDS. The filters were then washed three times with 2× SSC containing 0.1% SDS at room temperature for 5 min each and twice with 1× SSC containing 0.1% SDS at 65°C for 2 h each. Autoradiographs were prepared by exposure to x-ray film (Kodak X-OMAT AR) at room temperature for 4 h or at -70°C overnight.
Eighteen positive clones were detected in the primary screening. After tertiary screening, only seven plaques remained positive. These phages were enriched separately on 150-mm LB medium plates, and the phage DNAs were purified, digested with EcoRI, and separated on a 0.8% agarose gel. From these seven λ-phage DNAs, eight fragments below 10 kb were observed that represented the inserts of M. tuberculosis H37Rv DNA. The fragments were then gel-purified and subcloned into pBluescript. The resulting plasmids were designated pBS-PPGK 1 to 8.
Reselection of pBS-PPGK Clones-Two primers based on the 365-bp insert in pBS-PCR-1 were synthesized to determine which of the eight pBS-PPGK plasmids was best suited for further analysis. Primer I contains 5'-end sequences (5'-CAGCGGGATCAAGGGCG-3'), and primer II contains 3'-end sequences (5'-TGCGGCGTCTGCGTCGTT-3'). The thermal cycles of the PCR reactions were one cycle of 94°C (2 min); 30 cycles of 94°C (1 min), 60°C (2 min), and 72°C (1 min); and one cycle of 72°C (7 min). This reaction was designed to produce a 280-bp fragment. Two of the eight clones yielded the expected result, and one, pBS-PPGK-7, was selected for further analysis.
Southern Blotting Analysis-pBS-PPGK-7 plasmid was digested with different restriction enzymes and separated on a 1% agarose gel in a Tris-boric acid-EDTA system. The DNA fragments were transferred onto a nylon membrane by the capillary method (18). Generation of the radioactive probe, hybridization conditions, and autoradiographs were the same as described above.
Sequence Analysis-Nucleotide sequence analysis was performed by using the Sanger dideoxy chain termination method (Sequenase Version 2.0, U. S. Biochemical Corp). DNA template and sequencing primer were incubated at 90°C for 2 min and transferred to 37°C for 15 min. The resulting sequences were used to design additional primers in a primer-walking strategy. For the pBS-PPGK-7 insert, 1100 bp were sequenced on both strands (see Fig. 1), including the 795 bp representing the ppgk open reading frame shown in Fig. 2.
Assay of Poly(P)-glucokinase Activity-The enzymatic activity was measured by coupling the formation of glucose-6-phosphate to glucose-6-phosphate dehydrogenase and monitoring the formation of NADH spectrophotometrically at 340 nm (14). One unit of glucokinase activity is defined as the amount of enzyme that catalyzes the formation of 1 µmol of glucose-6-phosphate/min at 30°C.
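As an illustration of how a coupled-assay reading is converted into the units defined above, the following Python sketch may be helpful. It is not taken from the original protocol: the assay volume, sample volume, and path length are hypothetical parameters; it assumes one NADH is formed per glucose-6-phosphate; and it uses the commonly quoted molar absorptivity of NADH at 340 nm, 6220 M^-1 cm^-1.

```python
# Commonly quoted molar absorptivity of NADH at 340 nm, in M^-1 cm^-1.
NADH_EXTINCTION = 6220.0

def units_per_ml(delta_a340_per_min, assay_volume_ml, enzyme_volume_ml, path_cm=1.0):
    """Convert an absorbance slope into units per ml of enzyme sample.

    One unit = 1 umol of glucose-6-phosphate formed per min.
    Assumes 1 mol NADH formed per mol glucose-6-phosphate (coupled assay).
    """
    # Beer-Lambert law: concentration change (mol/liter/min) = dA / (epsilon * path).
    molar_per_min = delta_a340_per_min / (NADH_EXTINCTION * path_cm)
    # Scale to umol formed per min in the cuvette.
    umol_per_min = molar_per_min * 1e6 * (assay_volume_ml / 1000.0)
    return umol_per_min / enzyme_volume_ml

def specific_activity(units_ml, protein_mg_per_ml):
    """Units per mg of protein."""
    return units_ml / protein_mg_per_ml

# Hypothetical reading: 0.0622 A340/min in a 1-ml assay with 10 ul of enzyme.
example = units_per_ml(0.0622, assay_volume_ml=1.0, enzyme_volume_ml=0.010)
print(round(example, 3), "units/ml")
```

Dividing the result by the protein concentration of the sample gives the specific activity in units/mg, the quantity reported in the purification table.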
Expression of Recombinant Poly(P)-glucokinase in E. coli BL21 and BL21 pLysS Strains-The construction of the expression vector, pET-23a-PPGK, is shown in Fig. 3. This plasmid was transformed into BL21 and BL21 pLysS E. coli strains separately, and control experiments were performed by transforming with pET-23a without the ppgk gene. The BL21 cells were grown in LB medium containing carbenicillin (50 µg/ml) at 37°C, and the BL21 pLysS cells were grown in LB medium containing carbenicillin (50 µg/ml) and chloramphenicol (34 µg/ml) at 37°C. When the culture reached an A600 nm of approximately 0.8-1, IPTG was added to a final concentration of 0.4 mM, and 2 ml of culture were removed at 1, 2, 4, and 6 h after induction. The cells were harvested by centrifugation and resuspended in 150 µl of buffer containing 50 mM glucose, 1 mM MgCl2, 0.5 mM EDTA, 0.5 mM β-mercaptoethanol, and 10 mM Tris-HCl (pH 7.4). The cells were sonicated and then centrifuged at 12,000 × g for 15 min. The supernatant was removed, and the pellet was resuspended in the buffer described above. Expression of the re-Poly(P)-glucokinase was examined by monitoring the poly(P)-dependent glucokinase activity.
Purification of Recombinant Poly(P)-glucokinase from E. coli-All purification steps were carried out at 4°C unless otherwise stated, and all buffers contained 50 mM glucose, 1 mM MgCl2, 0.5 mM EDTA, and 0.5 mM β-mercaptoethanol. BL21 pLysS cells that carried the pET-23a-PPGK plasmid were grown in LB medium containing 50 µg/ml carbenicillin and 34 µg/ml chloramphenicol at 37°C. When the optical density (A600 nm) reached 0.6-0.8, IPTG (final concentration, 0.4 mM) was added, and cells were permitted to grow for another 3 h. The cells (~18 g) were pelleted by centrifugation, resuspended in 25 ml of 10 mM Tris-HCl (pH 7.4), and sonicated 3 times for 5 min each. Streptomycin sulfate (final concentration, 6%, w/v) was added, and the suspension was mixed for 30 min and centrifuged at 20,000 × g in an SS34 rotor. The supernatant was collected, mixed with ammonium sulfate (final concentration, 30%, w/v) for 30 min, and centrifuged as above. The supernatant was brought to 70% (w/v) saturation with ammonium sulfate for 30 min and centrifuged as above. The pellet was collected, resuspended in 18 ml of 50 mM potassium phosphate (KPi) buffer, and dialyzed against 4 liters of 50 mM KPi (pH 6.8) buffer overnight. The dialysate was pooled and applied to a 60-ml phosphocellulose (P-11) column (Whatman), and bound proteins were eluted with a linear gradient of 50 to 500 mM KPi buffer (pH 6.8). Poly(P)-glucokinase activity was detected between 0.3 and 0.35 M KPi in the gradient. The fractions containing re-Poly(P)-glucokinase were dialyzed overnight against 4 liters of 50 mM KPi buffer and applied to a hydroxyapatite column. The bound protein was then eluted with a linear gradient of 50 mM to 500 mM KPi buffer. The re-Poly(P)-glucokinase eluted between 0.3 and 0.35 M KPi in the gradient. The fractions containing re-Poly(P)-glucokinase were then dialyzed against 10 mM Tris-maleate (pH 6.8) buffer overnight and applied to an Affi-Gel Blue column (Bio-Rad). The bound protein was then eluted with the same buffer containing 5 mM (in terms of orthophosphate concentration) polyphosphate (Type 35) and 2 mM MgCl2. The fractions containing re-Poly(P)-glucokinase were pooled, concentrated using an Amicon ultrafiltration apparatus equipped with a YM-10 membrane, and then stored at -20°C.

[Figure legend fragment (apparently Fig. 2): ... Fig. 1). A possible initiator methionine is labeled +1. The deduced amino acid sequence is shown below the nucleotide sequence. A putative Shine-Dalgarno sequence for translation initiation is underlined. Peptide sequences (shown in Table I ...]

[Table I, surviving fragment: VLNDADAAGLAEEHYGAGK; Trypsin]

RESULTS
Cloning of the Poly(P)-glucokinase Gene-Our cloning strategy for the ppgk gene involved generating a probe by PCR with a genomic DNA template, followed by screening an M. tuberculosis genomic library and identifying a full-length clone. Initially two degenerate primers were designed based on amino acid sequences of two peptides (Table I, numbers 1 and 4). These two primers together with genomic DNA were used in a PCR to generate a 365-bp product, which was used as a probe to screen the genomic library of M. tuberculosis. Out of 60,000 plaques, seven clones were identified by plaque hybridization. Reselection of pBS-PPGK plasmids 1-8 by an additional round of PCR, as described under "Experimental Procedures," yielded two plasmids with the expected 280-bp product, and they were designated pBS-PPGK-7 and pBS-PPGK-8. To position the open reading frame of the ppgk gene in the pBS-PPGK-7 insert, this plasmid was digested with different restriction enzymes, and the digested fragments were separated on a 1% agarose gel followed by Southern blot analysis with the 365-bp PCR product as a probe. As shown in Fig. 1, the 2.3-kb EcoRI insert was found to contain the open reading frame of the ppgk gene. An expanded scheme of the sequencing strategy is also shown below the restriction map in Fig. 1. The M. tuberculosis H37Rv ppgk gene sequence and the deduced amino acid sequence are shown in Fig. 2. The peptide sequences that were obtained from M. tuberculosis H37Ra (Table I) are also aligned below the sequence deduced from the H37Rv ppgk gene. The open reading frame of the gene consisted of 795 bases that predicted a polypeptide of 265 amino acids with a calculated molecular mass of 27,400 Da. In common with other M. tuberculosis genes (19-21), the ppgk gene showed a high G+C content (64.5%), and analysis of codon usage bias showed a strong preference for G and C in the third base position.
A putative Shine-Dalgarno sequence (GAGGAG) was also identified in the nucleotide sequence upstream from the proposed initiation codon (labeled as +1). However, a typical E. coli-like promoter consensus sequence was not apparent at the -10 or -35 nucleotide positions.
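The two sequence statistics quoted above, overall G+C content and the G/C preference at the third codon position, are simple to compute for any open reading frame. The sketch below is illustrative only: the ppgk sequence itself is not reproduced in this section, so a short invented ORF (TOY_ORF) stands in for it, and the function names are assumptions.

```python
def gc_fraction(seq):
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)

def gc3_fraction(orf):
    """G+C fraction at the third position of each codon of an in-frame ORF."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    third_positions = orf[2::3]  # every third base, starting at codon position 3
    return gc_fraction(third_positions)

# Invented 9-codon example, NOT the ppgk sequence.
TOY_ORF = "ATGGCCGACGTCGGCGGCACCAAGTGA"

print(f"G+C overall: {gc_fraction(TOY_ORF):.3f}")
print(f"G+C at third codon position: {gc3_fraction(TOY_ORF):.3f}")

# Codon count implied by the reported open reading frame length.
ORF_LENGTH_BASES = 795
print(ORF_LENGTH_BASES // 3, "codons")
```

Applied to the real 795-base open reading frame, the first function would be expected to reproduce the reported 64.5% figure; the 795 bases correspond to the 265 codons of the predicted polypeptide.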

Expression of Recombinant Poly(P)-glucokinase in E. coli-The construction of the expression plasmid, pET-23a-PPGK, is diagrammed in Fig. 3. Two E. coli strains, BL21 and BL21 pLysS, carrying this plasmid, were tested for the expression of re-Poly(P)-glucokinase. As shown in Table II, when BL21 pLysS or BL21 was used as the host and transformed with the pET-23a vector alone, Poly(P)-glucokinase activity was not detected in cell lysates by the enzymatic activity assay, using poly(P) as a substrate, as described under "Experimental Procedures." When BL21 pLysS was transformed with the pET-23a-PPGK plasmid, Poly(P)-glucokinase activity was detected only in those cells that were induced with IPTG. The re-Poly(P)-glucokinase was expressed as a soluble cytosolic protein, and no Poly(P)-glucokinase activity was found in the cell pellets. The ATP-dependent glucokinase activity was also determined in these cells. As shown in Table II, the specific activity of ATP-dependent glucokinase was about two times higher if the cells were induced by IPTG. The increased specific activity is likely due to the expression of the re-Poly(P)-glucokinase encoded in the pET-23a-PPGK plasmid. Similar results were obtained when the BL21 strain was used, although some re-Poly(P)-glucokinase was expressed without IPTG induction. This leakage is common in BL21 cells as described by the manufacturer (Novagen). Hence, the BL21 pLysS strain was used for the expression of this recombinant enzyme.
Purification of Recombinant Poly(P)-glucokinase from E. coli-The purification procedures are described under "Experimental Procedures," and the results are shown in Table III. As calculated from the purification table, the re-Poly(P)-glucokinase was not expressed at high levels (0.4% of total cellular proteins). The re-Poly(P)-glucokinase was purified 189-fold to near homogeneity, as judged by SDS-PAGE (Fig. 4), with an 18% recovery. The ATP-dependent glucokinase activity co-purified with the poly(P)-dependent glucokinase activity during the course of purification, with a constant ratio of poly(P)-dependent activity to ATP-dependent activity (around 3.5 except in the crude extract stage). When the purified enzyme was chromatographed on a gel filtration column (TSK-G3000SW) or a C8 reverse-phase column, only a single peak was observed (data not shown). In addition, the single peak from the gel filtration column was found to contain both poly(P)- and ATP-dependent glucokinase activities. Hence, the minor band seen in lane 7 of Fig. 4 might be an artifact from the SDS preparation as was observed in other cases (22). Similar to Poly(P)-glucokinase from M. tuberculosis H37Ra, the purified re-Poly(P)-glucokinase possessed both the poly(P)- and ATP-dependent glucokinase activities. The specific activities of this purified enzyme were found to be 203 units/mg and 61 units/mg for the poly(P)- and ATP-dependent glucokinase activities, respectively (Table III). These values are similar to those obtained for the Poly(P)-glucokinase from the H37Ra strain (14).

[Fig. 3 legend fragment: ... , and one cycle of 72°C (7 min). VentR DNA polymerase (from New England Biolabs) was used. This amplified product was subcloned into the EcoRI site of pBluescript plasmid to confirm the sequence after PCR. This ppgk gene was then subcloned into the NdeI site of the expression vector pET-23a (Novagen). Correct orientation of insertion was determined by SalI and PvuII digestion. The resulting plasmid was designated pET-23a-PPGK.]
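The purification figures quoted above are related by simple arithmetic, which can be sketched in Python as follows. Only the three reported numbers (203 units/mg, 61 units/mg, 189-fold) come from the text; the helper functions and variable names are illustrative, and the crude-extract value is merely the one implied by those numbers, not a value read from Table III.

```python
def fold_purification(specific_activity, crude_specific_activity):
    """Ratio of a step's specific activity to that of the crude extract."""
    return specific_activity / crude_specific_activity

def percent_recovery(total_units, crude_total_units):
    """Percentage of the starting activity retained at a given step."""
    return 100.0 * total_units / crude_total_units

# Values reported in the text for the purified recombinant enzyme.
FINAL_POLYP_UNITS_PER_MG = 203.0
FINAL_ATP_UNITS_PER_MG = 61.0
REPORTED_FOLD = 189.0

# Crude-extract specific activity implied by the reported fold purification
# (about 1.07 units/mg; an inference, not a Table III entry).
implied_crude = FINAL_POLYP_UNITS_PER_MG / REPORTED_FOLD

# Poly(P)-dependent to ATP-dependent activity ratio for the purified enzyme
# (about 3.3, in line with the "around 3.5" stated for the purification steps).
activity_ratio = FINAL_POLYP_UNITS_PER_MG / FINAL_ATP_UNITS_PER_MG

print(f"implied crude specific activity: {implied_crude:.2f} units/mg")
print(f"poly(P)/ATP activity ratio: {activity_ratio:.2f}")
```

A constant activity ratio across purification steps is the kind of evidence the text uses to argue that both activities reside in one protein.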
Thus, these results support our previous finding that both activities are catalyzed by a single protein based on a number of different criteria (14). Although the calculated molecular mass is 27,400 Da, the re-Poly(P)-glucokinase migrated as a 33-kDa protein on SDS-PAGE (Fig. 4, lane 7), which is identical to the migration of the native enzyme from the H37Ra strain (Fig. 4, lane 1). The difference between the observed and calculated molecular mass may be due to the presence of clusters of charged groups (amino acids 188-200 and 222-229, Fig. 2), which could cause anomalous migration of the enzyme on SDS-PAGE. Such an effect of clusters of charged groups on the mobility of proteins on SDS-PAGE has been observed with the RNA-binding protein (23, 24).

DISCUSSION
Previously, we reported the purification of Poly(P)-glucokinase from M. tuberculosis H37Ra to homogeneity and characterized some of its biochemical properties (14, 25). In this study, we report the cloning and sequencing of the ppgk gene. The cloned ppgk gene contains a full-length open reading frame, and several peptide sequences of the purified Poly(P)-glucokinase from the H37Ra strain were identified in the deduced amino acid sequence. The expressed re-Poly(P)-glucokinase purified from E. coli contained both the poly(P)- and ATP-dependent glucokinase activities. On the basis of these results, we conclude that this clone represents the ppgk gene of M. tuberculosis H37Rv and that its translated polypeptide is Poly(P)-glucokinase. The cloning and expression of the mycobacterial Poly(P)-glucokinase in E. coli has enabled us to obtain purified enzyme in a relatively short time, considering the slow-growing nature of M. tuberculosis. The identification of two conserved phosphate binding domains through sequence alignment with other glucokinases (see below) has paved the way for future studies to test the putative functional residues through site-directed mutagenesis.
We demonstrated previously that the Poly(P)- and ATP-glucokinase activities from both M. tuberculosis and P. shermanii were catalyzed by a single enzyme (13, 14). Although the protein chemical evidence was compelling, earlier biochemical data on the Poly(P)-glucokinase from different sources raised the possibility that, at least in some cases, the poly(P)- and ATP-dependent glucokinase activities were catalyzed by distinct enzymes (26-30). However, the observation that the M. tuberculosis ppgk gene expressed in E. coli displayed both the poly(P)- and ATP-dependent activities offers conclusive evidence at the DNA level that the two activities are catalyzed by a single protein.
Characterization of the Poly(P)-glucokinase Gene-Although a potential Shine-Dalgarno region was identified 5 bp upstream of the proposed translation initiation site, an E. coli-like consensus promoter region was not obvious. Preliminary in vitro transcription experiments using E. coli RNA polymerase and the ppgk gene template (nucleotide positions -78 to 97) did not yield any transcribed mRNA. Other studies on the promoter regions of a number of mycobacterial genes, including the M. tuberculosis 85A antigen (31) and cpn 60 (32), the Mycobacterium bovis bacillus Calmette-Guérin hsp 60 (33), the bacillus Calmette-Guérin mph 70 (34), and the Mycobacterium smegmatis ask (35) genes, showed that E. coli-like promoter regions were also absent in these genes. Hence, to identify the promoter region of the M. tuberculosis ppgk gene, we will have to subclone the potential promoter region from the ppgk gene into a Mycobacterium-E. coli shuttle vector and then transform it into a nonpathogenic mycobacterial strain.
The ppgk gene sequence predicts a polypeptide of 265 amino acids with a potential translational initiator methionine. Bork et al. (6, 36) found that the three-dimensional structures of the actin, hexokinase, and Hsp70 protein families contained common motifs interacting with the ATP molecule, which are the "Phosphate-1" and "Phosphate-2" motifs contacting the β- and γ-phosphates of ATP and the "Connect-1" and "Connect-2" motifs at the interface between the subdomains. Residues in these motifs involved in the interaction with ATP are highly conserved in many glucokinases; they are Asp and Gly (in Phosphate-1), Asp (in Connect-1), Gly and Thr (in Phosphate-2), and Gly (in Connect-2). As shown in Table IV, analysis of the deduced amino acid sequence of the ppgk gene shows that this enzyme contains regions that are homologous to the Phosphate-1 and Phosphate-2 regions of yeast glucokinase. Further sequence alignment analyses of other prokaryotic glucokinase sequences (Table IV) also indicate the presence of phosphate binding motifs. The homologies within the Phosphate-1 motif and Phosphate-2 motif were analyzed by the Multiple Alignment Construction and Analysis Workbench program (BLOSUM-62) and were found to be statistically significant, with p values of 4.5 × 10^-16 and 3.8 × 10^-15, respectively. This result implies that these phosphate binding sites are conserved from eukaryotic hexokinases to prokaryotic glucokinases.
The identity of sequences involved in polyphosphate binding is less clear. Previously, we demonstrated that tryptophans in the peptide KNDWTYPKWAKQ of Poly(P)-glucokinase from the H37Ra strain were selectively oxidized by N-bromosuccinimide with concomitant loss of enzymatic activity (25). Tetrapolyphosphate or long chain polyphosphate substrate afforded protection against this oxidation and the loss of activity. Residues 177-216 of the Poly(P)-glucokinase sequence from H37Rv encode a closely related sequence, RKDWSYARWSEE. Hence this region might be a binding site for polyphosphate, in addition to Phosphate 1 and 2, which specifically enables the Poly(P)-glucokinase to utilize polyphosphates. Several charged groups around this region, Lys188, Glu189, Lys190, Asp192, Lys197, and Lys200, may fulfill the requirement for poly(P) binding. However, conclusive evidence for a poly(P) binding site will have to await site-directed mutagenesis of residues in this region.

[Table IV legend and footnotes, recovered; the table body did not survive apart from the entry "123900 (40)":
The designations are based on known three-dimensional structures and alignments of sequences of three functionally diverse families of proteins, namely, actin, hexokinase, and Hsp70, as proposed by Bork et al. (6, 36). Sequence alignment among glucokinases was performed by using the Multiple Alignment Construction and Analysis Workbench program (41, 42) (available from the National Center for Biotechnology Information).
a Identification is the maximal subalignment among all sequences with score assignments (BLOSUM-62). The p value of this alignment is 4.6 × 10^-16, which means the degree of the homology is statistically significant (default value is 10^-4).
b Identification is the maximal subalignment among all sequences with score assignments (BLOSUM-62). The p value of this alignment is 3.8 × 10^-15, which means the degree of the homology is statistically significant (default value is 10^-4).
c Glucokinase sequences were obtained from GenBank.
d Numbers indicate the position of the closest amino acid.]
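The relationship between the H37Ra peptide and the related H37Rv segment can be inspected position by position. The sketch below is only an illustration: it performs a naive ungapped comparison of the two 12-residue strings quoted in the text, which is not the alignment method used for Table IV, and the function names are assumptions.

```python
# Sequences as quoted in the text.
H37RA_PEPTIDE = "KNDWTYPKWAKQ"
H37RV_SEGMENT = "RKDWSYARWSEE"

def identical_positions(a, b):
    """1-based positions at which two equal-length sequences share a residue."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [i for i, (x, y) in enumerate(zip(a, b), start=1) if x == y]

def residue_positions(seq, residue):
    """1-based positions of a given residue in seq."""
    return [i for i, x in enumerate(seq, start=1) if x == residue]

matches = identical_positions(H37RA_PEPTIDE, H37RV_SEGMENT)
print("identical positions:", matches)
print("Trp in H37Ra peptide:", residue_positions(H37RA_PEPTIDE, "W"))
print("Trp in H37Rv segment:", residue_positions(H37RV_SEGMENT, "W"))
```

In this ungapped comparison, both tryptophans fall at the same positions in the two sequences, which is consistent with the proposal that the corresponding H37Rv region participates in polyphosphate binding.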