J. Biol. Chem., Vol. 281, Issue 31, 22131-22141, August 4, 2006
The Structure of an Ancient Conserved Domain Establishes a Structural Basis for Stable Histidine Phosphorylation and Identifies a New Family of Adenosine-specific Kinases*
From the
Received for publication, March 31, 2006 , and in revised form, May 24, 2006.
Phosphorylation of both small molecules and proteins plays a central role in many biological processes. In proteins, phosphorylation most commonly targets the oxygen atoms of Ser, Thr, and Tyr. In contrast, stably phosphorylated His residues are rarely found, due to the lability of the N-P bond, and histidine phosphorylation features most often in transient processes. Here we present the crystal structure of a protein of previously unknown function, which proves to contain a stably phosphorylated histidine residue. The protein is the product of open reading frame PAE2307, from the hyperthermophilic archaeon Pyrobaculum aerophilum, and is representative of a highly conserved protein family found in archaea and bacteria. The crystal structure of PAE2307, solved at 1.45-Å resolution (R = 0.208, Rfree = 0.227), forms a remarkably tightly associated hexamer. The phosphorylated histidine at the proposed active site, pHis85, occupies a cavity that is at the interface between two subunits and contains a number of fully conserved residues. Stable phosphorylation is attributed to favorable hydrogen bonding of the phosphoryl group and a salt bridge with pHis85 that provides electronic stabilization. In silico modeling suggested that the protein may function as an adenosine kinase, a conclusion that is supported by in vitro assays of adenosine binding, using fluorescence spectroscopy, and crystallographic visualization of an adenosine complex of PAE2307 at 2.25-Å resolution.
Phosphorylation of appropriate functional groups provides one of the key mechanistic devices in biology. Thus, phosphoryl transfer reactions involving molecules such as ATP and GTP, and the phosphorylation state of small molecule substrates, are critical to many enzyme reactions. Likewise, the phosphorylation and dephosphorylation of amino acid side chains in proteins forms the basis of many signaling and regulatory processes. The most common targets are the -OH groups of amino acid side chains such as serine, threonine, and tyrosine and small molecules such as sugars. In part this reflects the ubiquitous occurrence of such groups, but it also depends on the relative stability of the O-P bond to hydrolysis, which gives long-lived phosphorylation. In contrast, the phosphorylation of nitrogen atoms, with the formation of N-P bonds, is not often observed (1). In proteins, the phosphorylation of histidine residues produces a phosphoramidate bond that has a large standard free energy of hydrolysis. This makes phosphohistidines the most unstable of any known phosphoamino acid (2) and favors the utilization of histidine in rapid processes involving transient phosphorylation. Examples include the use of phosphohistidines as enzyme intermediates, as in the mechanisms of succinyl-CoA synthetase (3) and nucleoside diphosphate kinase (4), or for rapid signaling processes such as the two-component systems of bacteria (5, 6). In the latter, a sensor is connected to a regulator through histidine phosphorylation and a subsequent phosphotransfer to an aspartate residue.
Another well characterized system, the bacterial phosphoenolpyruvate:sugar phosphotransferase system (PTS),4 consists of proteins that carry out four successive phosphoryl transfers until the final phosphorylation of the sugar, concomitant with transport into the cell across the bacterial cell membrane (7). In the PTS, Enzyme I (EI) first autophosphorylates, using phosphoenolpyruvate as a substrate, and then transfers the phosphoryl group to the histidine-containing protein. Subsequently the phosphoryl group is transferred to the sugar-specific Enzymes II (EIIA and EIIB), which can consist of multisubunit enzymes or separate proteins. Each of four proteins, EI, histidine-containing protein, EIIA, and EIIB, is transiently phosphorylated at a histidine residue, except certain EIIBs, which instead can be phosphorylated at a cysteine residue. During the phosphotransfer in the PTS, the phosphoryl group, commencing with the N
The characteristic instability of phosphorylated histidines means that only a few examples have been defined structurally. The structure of the phosphohistidine form of histidine-containing protein has been determined by NMR spectroscopy (9), but none of the phosphohistidine forms of the PTS proteins have been elucidated by x-ray crystallography. The structures of phosphohistidine intermediates have been determined crystallographically for three enzymes, succinyl-CoA synthetase (3), nucleoside diphosphate kinase (4), and a cofactor-dependent phosphoglycerate mutase (10). From these, it is clear that phosphohistidine residues can be stabilized in certain environments. Indeed, in histone H4, a phosphohistidine (at residue 75) has been shown to have a half-life of 12 days at room temperature and pH 7.6 (11). Here we describe the crystal structure of a functionally uncharacterized protein, PAE2307, which has been discovered to contain a stably phosphorylated histidine residue. This protein was selected as a target in a pilot structural genomics enterprise, focused on open reading frames (ORFs) from the hyperthermophilic archaeon Pyrobaculum aerophilum (12), which were annotated as unknowns and for which no functional or structural predictions could be made based on amino acid sequence. PAE2307 is a representative of a conserved family of proteins found in both archaeal and bacterial species, which has been described as an "ancient conserved domain" (13) and presumably has an evolutionarily distant origin, prior to the divergence of the bacteria and the archaea. The high level of conservation in this family suggests some important but previously uncharacterized biological function (14, 15). The discovery of a phosphorylated histidine strongly suggests that PAE2307 is involved in a phosphotransfer reaction common to both bacteria and archaea and enables us to analyze the structural features that contribute to stable histidine phosphorylation. 
Additionally, bioinformatic evidence coupled with weak structural similarity to a thermostable DNA polymerase suggests that nucleotides or nucleosides are the most likely substrate for the phosphorylation reaction thought to be catalyzed by PAE2307. In silico modeling identified a binding site for adenosine that places the C5'-hydroxyl group of ribose adjacent to the phosphorylated histidine, suggesting that the protein may function as a nucleoside kinase. This model is supported by in vitro measurements using fluorescence spectroscopy and the visualization by x-ray crystallography of the adenosine-bound form of the protein.
ORF Selection
ORFs that were annotated as "unknown" in the preliminary annotation of the P. aerophilum genome (12) were selected as an initial target set. These ORFs were then screened for putative transmembrane helices using DAS (16) and TMHMM (17). PSI-BLAST (18) searches using the ORFs that lacked predicted transmembrane helices were manually inspected, and those with any significant similarity to characterized genes were removed. Finally, the remaining ORFs were passed through a fold prediction algorithm (19), and those that gave a Z-score of >5 were removed. The remaining ORFs were considered "true" unknowns and were selected for further study.
Protein Expression, Purification, and Crystallization
PAE2307 was overexpressed in Escherichia coli, using the expression vector pET28, and was purified by immobilized metal ion affinity chromatography and size-exclusion chromatography as described (20). The expression vector adds an N-terminal polyhistidine tag, which was used for purification, but was not cleaved, and therefore remained on the protein during subsequent studies. Native PAE2307 crystals were initially obtained in two crystal forms, only one of which was suitable for x-ray analysis (20). This crystal form was grown from an unbuffered solution of 50 mM KH₂PO₄ containing 20% polyethylene glycol 8000, and proved to be tetragonal, space group I4₁22, with unit cell dimensions a = b = 120.0, c = 156.5 Å and three copies of the protein monomer in the asymmetric unit. The crystals diffracted to 1.45-Å resolution, and they were used for the native structure determination but were not reproducible.
For subsequent ligand soaking experiments, a third crystal form (Type III) was obtained by mixing protein solution (6.8 mg/ml in 20 mM HEPES, 150 mM NaCl, pH 8.0) in a 1:1 ratio with mother liquor (21% (w/v) methoxypolyethylene glycol 5000, 0.2 M N-(1,1-dimethyl-2-hydroxyethyl)-3-amino-2-hydroxypropanesulfonic acid/KOH, pH 8.3) and incubating in a sitting-drop experiment at 18 °C. Thick, plate-like crystals appeared after 3 days and reached their maximum size after 14 days. These crystals proved to be orthorhombic, space group P2₁2₁2₁, a = 76.9, b = 109.8, c = 112.5 Å, with six molecules in the asymmetric unit.
Data Collection, Structure Determination, and Refinement
The structure of native apoPAE2307 was solved by single-wavelength anomalous diffraction phasing from a platinum derivative, as described previously (20). Manual building using O (21) into a map calculated from experimental phases to 2.1 Å yielded a partial model, which was refined against a 1.45-Å resolution native data set, collected at 110 K at the Stanford Synchrotron Radiation Laboratory (beamline 9-2, Area Detector Systems Corp. Quantum 4 detector, wavelength 0.9792 Å). The data statistics are in Table 1. A single round of simulated annealing using CNS (22) reduced the Rfree from 53.3% to 43.4%, and subsequent automatic model building using ARP/wARP (23) resulted in an almost complete amino acid chain for all protomers. Two further rounds of model building using O, interspersed with simulated annealing, energy minimization, and individual atomic B-factor refinement using CNS, resulted in a final model comprising residues 5-167 of monomers A and B, and residues 6-167 of monomer C, with Rcryst = 20.8% and Rfree = 22.1%. No density was found for the N-terminal His tag on any of the three molecules in the asymmetric unit. Full refinement statistics are shown in Table 1.
The Ramachandran plot produced by PROCHECK (24) showed that 90.8% of all residues fall in the most favored regions, and only two residues per monomer, Ala26 and Phe28, are in disallowed regions; both are adjacent to the active site, and both have unambiguous electron density.
The complex with adenosine was obtained by soaking a Type III native crystal in a series of cryoprotectant solutions, which consisted of mother liquor containing 1%, 5%, 10%, 15%, and 20% glycerol, and which were all saturated with adenosine. Adenosine was poorly soluble in the mother liquor, but its concentration was estimated to be at least 250 µM. The crystal was then flash-cooled in liquid nitrogen for data collection. Data were collected at 110 K at the European Synchrotron Radiation Facility, Grenoble (beamline ID-29, wavelength 0.9793 Å). The data were processed with MOSFLM and SCALA from the CCP4 program suite (25), giving a full data set to 2.25 Å (Table 1). The structure of the adenosine complex was solved by molecular replacement with MOLREP (26), using the 1.45-Å native structure as search model. Refinement was with REFMAC5 (27), with manual rebuilding achieved by using COOT (28). Clear electron density for a bound adenosine molecule was found in the putative active sites of each of the six molecules in the asymmetric unit; these were modeled in and refined with full occupancy. The final structure comprises residues 5-167 of molecules A, B, D, and E, and 6-167 of molecules C and F; six adenosine molecules; two phosphate ions; and 407 water molecules. Final values of Rcryst and Rfree were 18.2% and 24.1%, respectively (see Table 1 for full details).
Crystallographic Verification of His85 Phosphorylation in ApoPAE2307
The first electron density maps of the I4₁22 crystal form strongly suggested that His85 was phosphorylated in each of the three protomers. The initial refinement, in which this residue was modeled as an unmodified histidine, returned B-factors for His85 and the associated Asp139 that were consistent and similar: 10-14 Å² and 13-15 Å², respectively, for protomer A, 12-15 Å² and 11-13 Å² for protomer B, and 11-13 Å² and 14-15 Å² for protomer C. Difference maps after this refinement showed unequivocal electron density for phosphoryl groups attached to each histidine, at levels of 4
In Silico Docking Experiments
The structures of potential ligands at the highest resolution available were obtained from the HIC-Up data base (29) and were energy-minimized and converted to MOL2 format using the PRODRG server (30). In silico ligand binding was performed using the program GOLD (31).
Fluorescence Spectroscopy
Spectroscopic experiments were performed on a Hitachi F-4500 fluorescence spectrophotometer, with excitation and emission slit widths of 5 nm, and a scan speed of 240 nm/min. Excitation and emission wavelengths of 280 nm and 347 nm, respectively, were used for all samples, using a quartz cuvette with 0.5-cm path length. Analysis by mass spectrometry gave a mass of 20,674 Da, compared with 20,675 Da expected for the full-length His-tagged protein minus its N-terminal Met residue. This showed that the protein used in these assays was homogeneous and not phosphorylated. Protein (5 µM) and ligands (0.5-500 µM) were each dissolved in a buffer solution containing 20 mM HEPES and 150 mM NaCl at pH 8.0. The data were corrected for inner filter effects using a correction factor as described by Eftink (32). The true protein fluorescence intensity was determined by subtracting the buffer fluorescence intensity from the sample fluorescence intensity, and multiplying by the correction factor (C) for that particular sample, where C = 10((
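The intensity correction described above (buffer subtraction followed by multiplication by C) can be sketched in a few lines of Python. The expression for C is truncated in the text, so the helper `inner_filter_factor` below assumes the commonly used form C = 10^((A_ex + A_em)/2), where A_ex and A_em are the sample absorbances at the excitation and emission wavelengths; this may differ from the exact expression the authors used. The function names and the example readings are hypothetical.

```python
def inner_filter_factor(a_ex: float, a_em: float) -> float:
    """Assumed form of the correction factor: C = 10 ** ((A_ex + A_em) / 2).

    a_ex and a_em are the sample absorbances at the excitation and emission
    wavelengths. This form is an assumption, not taken from the source text.
    """
    return 10.0 ** ((a_ex + a_em) / 2.0)


def corrected_intensity(f_sample: float, f_buffer: float, c: float) -> float:
    """Buffer-subtracted fluorescence intensity multiplied by the factor C."""
    return (f_sample - f_buffer) * c


if __name__ == "__main__":
    # Hypothetical readings, for illustration only.
    c = inner_filter_factor(a_ex=0.10, a_em=0.02)
    print(round(corrected_intensity(f_sample=850.0, f_buffer=30.0, c=c), 1))
```

With zero absorbance at both wavelengths the assumed factor is 1 and the correction reduces to plain buffer subtraction.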
Protein Sequence Analysis
A PSI-BLAST query of the NCBI non-redundant protein sequence data base (release date: December 15, 2005), with the PAE2307 sequence, shows convergence after three iterations (using an E-value cutoff of 0.05) to produce a set of protein sequences derived from 42 distinct species. The proteins form a discrete family of well conserved sequences, all of which are at least 35% identical to each other in amino acid sequence. The members of this protein family are drawn from both archaeal species (18 examples) and bacterial species (24 examples). They cover a broad phylogenetic range, including representatives from the crenarchaeota, euryarchaeota, multiple subdivisions of the proteobacteria, thermotogae, actinobacteria, Deinococci, and cyanobacteria. However, the family contains no eukaryotic members. A multiple sequence alignment of a phylogenetically diverse selection of these protein sequences is shown in Fig. 1. The multiple sequence alignment reveals 24 residues that are absolutely conserved, including the phosphorylated histidine and several other residues that surround the putative active site. This protein family is classified in the Pfam and Interpro databases as protein of unknown function DUF355 (Pfam accession number PF04008; Interpro accession number IPR007153) and is annotated with the comment that "The high level of conservation in this family suggests some as yet unknown important biological function." The genomic context of the PAE2307 gene is uninformative, because it does not sit in an obvious operon and is flanked in the genome by poorly characterized ORFs, which are annotated as a putative resistance protein and putative transcriptional regulator. There is also no clear synteny, as the context of orthologs in other species is variable.
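Counting absolutely conserved positions in a multiple sequence alignment, as done above for the 24 invariant residues, amounts to finding alignment columns in which every sequence carries the same residue. The following is a minimal sketch of that bookkeeping; the function name and the toy alignment are hypothetical and are not taken from Fig. 1.

```python
from typing import List


def invariant_columns(alignment: List[str]) -> List[int]:
    """Return 0-based indices of columns where all sequences share one residue.

    Columns containing a gap character ('-') in the first sequence are
    never reported, so a shared gap does not count as conservation.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    first = alignment[0]
    return [
        i
        for i in range(length)
        if first[i] != "-" and all(seq[i] == first[i] for seq in alignment)
    ]


if __name__ == "__main__":
    # Toy alignment for illustration only (not real PAE2307 homologs).
    toy = ["MHF-GA", "MHFAGS", "LHFTGA"]
    print(invariant_columns(toy))  # columns 1, 2 and 4 are invariant
```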
Overall Fold of PAE2307
The PAE2307 monomer (Fig. 2a) is folded into a single domain with an extended C-terminal arm, culminating in an
Quaternary Structure
The protein monomers are arranged in a tightly associated hexamer (Fig. 2c), best described as a dimer of trimers. Dynamic light scattering analysis showed that PAE2307 has a particle size of 123 ± 10 kDa in solution, consistent with a hexameric species. The trimer is formed by the three monomers of the crystal asymmetric unit, with the hexamer being completed by 2-fold crystallographic symmetry. At the center of the trimer, the outer strands (β8; residues 131-140) of the small β-sheet from each of the three monomers face each other around the 3-fold axis. Taking molecule A as the reference, residues 131-135 run antiparallel to 136-140 of molecule B and residues 136-140 run antiparallel to 131-135 of molecule C. Many hydrophobic side chains pack between these three small β-sheets to give stability to the trimer.
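The hexamer assignment from dynamic light scattering can be checked with simple arithmetic, using the monomer mass of 20,674 Da measured by mass spectrometry for the His-tagged protein. The sketch below lists the oligomeric states whose calculated mass falls within the reported 123 ± 10 kDa; the function names are hypothetical.

```python
from typing import List

MONOMER_MASS_DA = 20_674   # mass spectrometry value for the His-tagged protein
DLS_MASS_KDA = 123.0       # particle size from dynamic light scattering
DLS_ERROR_KDA = 10.0       # reported uncertainty


def implied_subunits(particle_kda: float, monomer_da: float) -> float:
    """Number of monomers implied by the particle mass (not rounded)."""
    return particle_kda * 1000.0 / monomer_da


def consistent_states(particle_kda: float, error_kda: float,
                      monomer_da: float, max_n: int = 12) -> List[int]:
    """Oligomeric states whose mass lies within particle_kda +/- error_kda."""
    low, high = particle_kda - error_kda, particle_kda + error_kda
    return [n for n in range(1, max_n + 1)
            if low <= n * monomer_da / 1000.0 <= high]


if __name__ == "__main__":
    print(round(implied_subunits(DLS_MASS_KDA, MONOMER_MASS_DA), 2))
    print(consistent_states(DLS_MASS_KDA, DLS_ERROR_KDA, MONOMER_MASS_DA))
```

Six monomers give 124.0 kDa, whereas five (103.4 kDa) and seven (144.7 kDa) fall outside the reported range.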
Residues 139-144 of molecule A continue on to make extensive inter-subunit contacts, passing through a cleft made by the
Finally, the C-terminal helix has a striking amino acid composition with no fewer than 11 charged residues out of 16, and extends as an arm to make extensive interactions with the adjacent monomer in the trimer (A with C, C with B, and B with A). Three arginine residues on the inner face of the helix, Arg152, Arg155, and Arg160, play a central role, making three salt bridges, and additional hydrogen bonds, with residues from
A phosphate ion is found within the narrow tunnel that runs along the 3-fold axis of the trimer, bound by the Arg111 side chain of each protomer in the trimer. The P-O2 bond lies along the 3-fold axis such that O2 interacts with the NH2 groups of three Arg111 side chains. In addition, N
The trimer can be described as a domed assembly with a rather flat surface as its base. The flat face is formed by the three sets of helices
Fold Similarity
Searching the Protein Data Bank with the PAE2307 monomer using DALI (33) or SSM (34) gives two obvious matches: the conserved hypothetical proteins TT1634 from Thermus thermophilus (PDB code: 1VGG) and TA1353 from Thermoplasma acidophilum (PDB code: 1RLH), which are both orthologs of PAE2307. The structure of the T. thermophilus protein is extremely similar to PAE2307. It is also a hexamer, and the two structures superimpose with a root-mean-square (r.m.s.) difference in C
Besides these obvious cases, the structural similarity to known protein structures detected by fold comparison is limited to a partial match of the central
Active Site
The phosphorylated histidine residue, pHis85, is located in a cavity at the interface between two subunits. This is the presumed active site, giving 6 active sites per hexamer. The cavity is formed by residues 26-30 and 53-58 of one subunit (the
Each of the oxygen atoms of the phosphoryl group on pHis85 is hydrogen-bonded, with good geometry. The O2P atom is at the N terminus of a 3₁₀ helix with a hydrogen bond to the peptide NH of Phe28, O1P hydrogen bonds to the peptide NH of Ala55, and O3P bonds to Oγ of Ser56 and Nδ2 of Asn118 of the adjacent subunit. The stabilizing environment of the phosphoryl group is illustrated in Fig. 4. Many of the groups in and around the putative active site are fully or mostly conserved in all homologous sequences (Fig. 1). The phosphorylated His85, together with Phe28, Phe87, Asn118, Asp139, Gly140, and Tyr165 are all invariant, residue 95 is always aromatic, residue 97 is always either Ile or Val, and Asn20 and His27 are only replaced by similar hydrogen-bonding residues. Between them, these residues account for many of the conserved residues in the sequence alignment in Fig. 1, with the other residues presumably conserved for conformational reasons (Gly and Pro) or because they make multiple stabilizing hydrogen bonds or salt bridges.
Functional Hypothesis and in Silico Ligand Binding
The presence of a stably phosphorylated histidine strongly suggested that PAE2307 has a biochemical role that involves phosphate transfer. The putative substrate for this phosphorylation is not clear, but two lines of evidence imply a role in DNA or nucleotide metabolism. First, there is the weak structural similarity to a B-type DNA polymerase described above. Secondly, when analyzed using a phylogenetic profiling technique based on the comparison of orthologs from 81 microbial genomes,5 PAE2307 is found to cluster with DNA-binding proteins from Methanopyrus kandleri, Sulfolobus solfataricus, and Thermoplasma acidophilum, a gyrase from S. solfataricus, and adenylate and predicted nucleotide kinases from M. kandleri.
To test the feasibility of a role for PAE2307 in nucleotide metabolism, in silico docking studies were carried out with a number of nucleosides and nucleotides, using the program GOLD (31). The phosphate group of pHis85 was taken as the center of the active site, and a radius of 20 Å was used to define the possible binding surface, which encompassed the whole inter-subunit cleft. Phosphorylated nucleotides bound in a number of configurations, with the binding being primarily dictated by the interaction of their phosphate groups with the side chains of Arg62 and Lys58, and the base ring making few contacts with the protein. However, nucleosides reproducibly bound in the hydrophobic pocket adjacent to the active site, with GOLD fitness scores (
In Vitro Ligand Binding
To test the hypothesis that PAE2307 may function as a nucleoside kinase, in vitro ligand binding experiments were carried out to establish whether nucleosides were able to bind to the protein. The presence of Trp95 in the predicted nucleoside binding site enabled intrinsic protein fluorescence to be used to monitor ligand binding. Addition of various nucleosides to the protein in solution resulted in the reduction of observed fluorescence and allowed the measurement of binding curves, as shown in Fig. 6. Adenosine showed relatively tight binding in comparison to the other nucleosides tested (Fig. 6a), with a calculated KD of 15 µM compared with 515, 223, and 302 µM for guanosine, thymidine, and deoxyuridine, respectively. Therefore, PAE2307 appears to be specific in binding adenosine over other nucleosides. To test whether the phosphorylation state of the nucleoside was important in binding, the three phosphorylated forms of adenosine, AMP, ADP, and ATP, were tested for their ability to bind (Fig. 6b). AMP bound only slightly less tightly than adenosine alone, with a calculated KD of 26 µM. However, the di- and triphosphates bound much more weakly, with KD values of 740 and 741 µM, respectively. Interestingly, adenine base alone bound to the protein more tightly than either ADP or ATP, with an estimated KD of 198 µM. This preference for binding non- or mono-phosphorylated forms of adenosine over the higher phosphorylated forms supports the hypothesis that PAE2307 acts as a kinase, with adenosine and/or AMP as preferred substrate.
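To give a feel for what these KD values mean under the assay conditions (5 µM protein, ligand up to 500 µM), the sketch below evaluates the fraction of protein occupied for a single-site binding model that accounts for ligand depletion. The single-site model is an assumption made for illustration and is not necessarily the model the authors fitted to the data in Fig. 6; the function name is hypothetical.

```python
import math


def bound_fraction(p_total: float, l_total: float, kd: float) -> float:
    """Fraction of protein in complex for one binding site per protein.

    Solves KD = [P][L]/[PL] with mass balance on total protein and total
    ligand (all concentrations in the same units, e.g. micromolar).
    """
    b = p_total + l_total + kd
    complex_conc = (b - math.sqrt(b * b - 4.0 * p_total * l_total)) / 2.0
    return complex_conc / p_total


if __name__ == "__main__":
    protein_um = 5.0    # protein concentration used in the assays
    ligand_um = 100.0   # an arbitrary point within the 0.5-500 uM range
    for name, kd_um in [("adenosine", 15.0), ("AMP", 26.0), ("ATP", 741.0)]:
        frac = bound_fraction(protein_um, ligand_um, kd_um)
        print(f"{name}: {frac:.2f} of protein bound at {ligand_um} uM ligand")
```

Under this model, most of the protein is occupied by adenosine or AMP at 100 µM ligand, whereas only a small fraction is occupied by ATP at the same concentration.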
Structure of Adenosine Complex
To confirm that adenosine binds as predicted by in silico modeling, the crystal structure of an adenosine complex of PAE2307 was determined and refined at a resolution of 2.25 Å (Table 1). Electron density maps showed unambiguous positive difference peaks at the predicted adenosine binding sites, as shown in Fig. 5b. Adenosine modeled into this difference density refined well, with atomic B-factors that were very similar to surrounding residues (25-30 Å²), and no residual difference electron density, thus validating the predictions made from in silico modeling. A comparison of the predicted and observed conformations of adenosine is shown in Fig. 5c. The binding mode is largely as predicted, but there are several small changes in the active site, the most notable of which is that the side chain of Trp95 has rotated to enable better stacking with the ring of the adenine base. The conformation of the observed ligand is slightly different from that predicted by modeling, with a hydrogen bond made from the N1 rather than the N6 nitrogen of the adenine base to the phenolic oxygen of Tyr165, and a rotation around the N9-C1' bond, which enables hydrogen bonds to be made from the O2'- and O3'-hydroxyl groups on the ribose ring to the side chain of Asn20 and the backbone nitrogen of Ala117, respectively. One additional change is that, because this form of the protein is not phosphorylated, a hydrogen bond is able to form between the Nε2 nitrogen of His85 and the O5' of the adenosine ribose. Overall, the in silico modeling had accurately placed the adenosine ligand, with a root-mean-square difference in atomic positions of only 1.85 Å between the predicted and observed conformations.
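A root-mean-square difference such as the 1.85 Å quoted above is computed over matched atom pairs from the two ligand poses. The following is a minimal sketch, assuming both coordinate sets are already in the same reference frame (that is, the protein models have been superimposed beforehand) and list their atoms in the same order; the function name and the toy coordinates are hypothetical.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(pose_a: Sequence[Coord], pose_b: Sequence[Coord]) -> float:
    """Root-mean-square difference between matched atoms (no refitting)."""
    if len(pose_a) != len(pose_b) or not pose_a:
        raise ValueError("poses must be non-empty and contain the same atoms")
    total = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(pose_a, pose_b):
        total += (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
    return math.sqrt(total / len(pose_a))


if __name__ == "__main__":
    # Toy two-atom example: every atom is displaced by 2 A along z.
    predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    observed = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
    print(rmsd(predicted, observed))  # 2.0
```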
The structure of PAE2307 demonstrates the powerful role that structural biology can play in providing functional hypotheses for proteins of previously unknown function. The unexpected discovery of a phosphorylated histidine in PAE2307, in a solvent-exposed cavity that contains a number of fully or mostly conserved residues, leaves little doubt that this is the active site and that the function of this protein and its homologs is in phosphoryl transfer. Given the rarity of stably phosphorylated histidine residues, it also allows us to examine the factors that stabilize such modifications and how these may be related to function.
There are currently three other examples of crystal structures of phosphohistidine-containing proteins, nucleoside diphosphate kinase (4), succinyl-CoA synthetase (3), and cofactor-dependent phosphoglycerate mutase (10). A common factor in each of these structures is that the non-phosphorylated imidazole nitrogen of the active histidine side chain is hydrogen-bonded to an oxygen atom of the protein. In nucleoside diphosphate kinase, N
We conclude that histidine phosphorylation is stabilized by appropriate ion pair hydrogen bonds involving the non-phosphorylated imidazole nitrogen. This is consistent with the full occupancy for the phosphoryl groups in nucleoside diphosphate kinase and succinyl-CoA synthetase, compared with the partial occupancy (0.28) in cofactor-dependent phosphoglycerate mutase where there is a hydrogen bond but no ionic interaction. In agreement with this, the salt bridge in PAE2307 between pHis85 N
The in vivo substrate for the phosphorylation reaction predicted to be catalyzed by PAE2307 is not known, but the phylogenetic linkage of orthologous proteins to DNA-binding proteins and nucleotide kinases, and a weak structural similarity to a domain (of unknown function) from a thermostable DNA polymerase, gave the first suggestion that a nucleotide or nucleoside may be the substrate. The favorable adenosine-binding mode predicted by in silico ligand binding analyses led to our functional hypothesis that the in vivo biochemical function of this highly conserved protein family is as an adenosine-specific nucleoside kinase. This hypothesis is strongly supported by our experimental binding studies. The in vitro binding studies demonstrated that adenosine, but not other nucleosides, bound tightly to the protein (KD = 15 µM) and that di- or triphosphorylated adenosine nucleotides bind much less well than adenosine or AMP. The crystallographic analysis of adenosine binding confirmed its specific and stereochemically favorable binding mode, with the base snugly accommodated between invariant or highly conserved residues and the O5' oxygen of the adenosine ribose in a position to receive a phosphate transferred from the phosphorylated His85. A priori, it is possible to hypothesize that AMP may be the phosphate donor rather than the substrate of the proposed kinase activity. If this were the case, AMP would lose a phosphate to become adenosine in the course of the reaction.
However, as adenosine binds more tightly than AMP (KD of 15 µM compared with 26 µM for AMP), this scenario would lead to a dead-end adenosine-protein complex. If, on the other hand, AMP or adenosine is the acceptor substrate and phosphorylation leads to formation of ADP or ATP, the di- or tri-phosphorylated product would dissociate from the protein, as both have a much lower binding affinity (30- to 50-fold weaker). We therefore propose that AMP or adenosine is the acceptor in a phosphoryl transfer from another (unknown) donor.
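The "30- to 50-fold weaker" figure can be checked directly from the dissociation constants reported in the binding experiments. The short sketch below computes the KD ratios; the dictionary and function names are hypothetical, and the values are those quoted in the text (in µM).

```python
# Dissociation constants from the fluorescence binding experiments (uM).
KD_UM = {
    "adenosine": 15.0,
    "AMP": 26.0,
    "ADP": 740.0,
    "ATP": 741.0,
}


def fold_weaker(ligand: str, reference: str) -> float:
    """How many times larger the KD of `ligand` is than that of `reference`."""
    return KD_UM[ligand] / KD_UM[reference]


if __name__ == "__main__":
    for product in ("ADP", "ATP"):
        for substrate in ("adenosine", "AMP"):
            ratio = fold_weaker(product, substrate)
            print(f"{product} vs {substrate}: {ratio:.1f}-fold weaker")
```

The ratios range from about 28 (ADP or ATP relative to AMP) to about 49 (relative to adenosine), in line with the range quoted above.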
Adenosine kinases have been characterized from a number of species, with structures known for the human (36) and Toxoplasma gondii (37) enzymes. These belong to a wider family of carbohydrate kinases called the ribokinase family after the archetypal member (38). They share a common fold comprising a large
PAE2307 has a different fold and appears to be representative of a novel family of nucleoside kinases. Although it is possible that the active site histidine, His85, could act as a base to remove the C5' hydroxyl proton, the observed stable phosphorylation of His85 and its favorable orientation for phosphoryl transfer to the hydroxyl group strongly suggest that its mechanism involves a phosphohistidine intermediate. Unlike the enzymes of the ribokinase family, however, PAE2307 does not undergo any conformational change outside of the active site as a result of adenosine binding; the native and adenosine-bound structures are essentially identical. Although the phosphate donor in the reaction catalyzed by PAE2307 is not yet identified, its likely binding site can be proposed from an examination of the structure. The cavity in the protein surface in which the active site histidine is found is "L"-shaped, with His85 sitting at the corner of the "L." The adenosine molecule binds into the short arm of the "L," and it is possible that the phosphate donor will bind into the cavity along the long arm of the "L," in such a way as to be able to transfer phosphate to adenosine via His85. Several strongly or absolutely conserved residues line this region, including Glu54, Ser56, and Gly57 on one side and Ile29 and Arg155 on the other, with Glu54 and Arg62 forming the cavity floor. However, attempts to model the binding of potential phosphate donor molecules to this region in silico were not successful.
PAE2307 exhibits a strong preference for adenine-containing ligands (KD of 15 µM for adenosine compared with 515, 223, and 302 µM for guanosine, thymidine, and deoxyuridine, respectively). The reason for this specificity is apparent upon inspection of the adenosine binding site, as the C2 carbon of the adenine ring packs closely against the side chain of Ile97. The C2 carbon of the base ring is modified in guanine by the addition of an amino group, and in thymine and cytosine by a carbonyl oxygen, and either of these modifications would cause a clear steric clash, disfavoring productive binding. Ile97 is itself part of a strongly conserved sequence motif (96P(I/L)N(I/V)L100) in helix
In summary, the structure of PAE2307 identifies it as a likely nucleoside kinase, with a strong preference for adenosine over other nucleosides. The nature of the donor molecule for the phosphorylation reaction is unknown, but the reaction appears to proceed via a phosphohistidine intermediate, in contrast to known nucleoside kinases. The biological purpose of this phosphorylation reaction is unclear, but the strong conservation of sequence within the protein family and its wide phylogenetic distribution among the archaea and bacteria imply that it is likely to have an important role.
The atomic coordinates and structure factors (codes 1WVQ and 2GL0) have been deposited in the Protein Data Bank, Research Collaboratory for Structural Bioinformatics, Rutgers University, New Brunswick, NJ (http://www.rcsb.org/).
* This work was supported in part by the Marsden Fund of New Zealand and the Centre for Molecular Biodiscovery. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
1 Canada Research Chair in Structural Biochemistry. On Administrative Leave from the University of Saskatchewan for work at the University of Auckland.
2 Supported by a Royal Society (United Kingdom) Traveling Fellowship.
3 To whom correspondence should be addressed. Tel.: 64-9-373-7599; Fax: 64-9-373-7619; E-mail: ted.baker{at}auckland.ac.nz.
4 The abbreviations used are: PTS, phosphoenolpyruvate:sugar phosphotransferase system; PAE2307, protein product of open reading frame 2307 from the P. aerophilum genome; ORF, open reading frame; pHis, phosphorylated histidine; EI and EII, Enzymes I and II; r.m.s., root mean square.
5 Y.-C. Huang, P. Riddle, C. Triggs, V. L. Arcus, and J. S. Lott, manuscript in preparation. Database searchable at www.cs.auckland.ac.nz/~yhua033.
X-ray diffraction data from the native crystal were collected by Peter Haebel and from the adenosine-soaked crystal by Fasseli Coulibaly and Richard Bunker. Mass spectrometry data were obtained by Kayal Rajasekaran and Martin Middleditch.