First Structure of a Eukaryotic Phosphohistidine Phosphatase

Phosphatases are a diverse group of enzymes that regulate numerous cellular processes. Much of what is known relates to the tyrosine, threonine, and serine phosphatases, whereas the histidine phosphatases have been studied far less. The structure of phosphohistidine phosphatase (PHPT1), the first identified eukaryotic protein histidine phosphatase, has been determined to a resolution of 1.9 Å using multiple-wavelength anomalous dispersion methods. This enzyme can dephosphorylate a variety of proteins (e.g. ATP-citrate lyase and the β-subunit of G proteins). A putative active site has been identified by its electrostatic character, ion binding, and conserved protein residues. Histidine 53 is proposed to play a major role in histidine dephosphorylation based on these observations and previous mutational studies. Models of peptide binding are discussed to suggest possible mechanisms for substrate recognition.

Reversible phosphorylation of residues is crucial in a variety of signaling pathways. Most of our understanding of these signaling events in eukaryotes comes from studies of tyrosine and serine/threonine kinases and phosphatases (1). Less well characterized is histidine phosphorylation-dependent signaling in eukaryotes. A little more than thirty years ago, histone H4, the first vertebrate protein with a phosphorylated histidine residue, was identified (2). Since then there has been a measured increase in knowledge of mammalian histidine kinases (3). Unfortunately, very little information regarding eukaryotic histidine phosphatases has become available during this same period. This gap in knowledge is notable because histidine phosphorylation is quite prevalent in the cell and likely accounts for ~6% of all phosphorylation in eukaryotes (4). Thus far, only one protein in vertebrates (the β-subunit of heterotrimeric G proteins) has been identified as undergoing reversible histidine phosphorylation for which both the kinase (NDPK B) and the phosphatase (PHPT1) are known (for a recent review, see Ref. 5). However, more information regarding histidine phosphatases is slowly beginning to emerge. To date, the only other structure of a histidine phosphatase is that of Escherichia coli SixA (6). Under certain anaerobic respiratory conditions, SixA is involved in down-regulation of the E. coli ArcB-to-ArcA phosphorelay system. SixA shows structural homology to the well studied family of arginine-histidine-glycine (RHG) phosphatases (6) but no sequence homology to PHPT1.
Mammalian phosphohistidine phosphatase (PHPT1) was first identified and characterized as a 14-kDa protein in 2002 (7, 8). The enzyme can dephosphorylate the phosphohistidine-containing peptide succinyl-Ala-His(P)-Pro-Phe-p-nitroanilide, E. coli CheA, rabbit ATP-citrate lyase, and the rat β-subunit of G proteins (7-10). PHPT1 has been suggested to be highly involved in neuronal function. Unlike most phosphatases, it does not require divalent cations for activity. Individual point mutations of conserved histidine and arginine residues indicated that Arg45, His53, and His102 may play a role in the reaction mechanism, because mutating them to alanine eliminated or reduced phosphatase activity (11). PHPT1 is expressed in a variety of vertebrates but not in fungi or bacteria. The PHPT1 DNA sequence shows similarity to the testes-specific Drosophila proteins ocnus, janus-a, and janus-b, according to the Conserved Domain Data Base (12). These proteins may be phosphatases, but little information is available on their function. In this work, we report the first crystal structure of a eukaryotic phosphohistidine phosphatase: PHPT1. The structure enables us to define a substrate-binding pocket and to model possible phosphatase-substrate docking modes. We thereby provide the structural basis for further biochemical, biophysical, and genetic studies in the rapidly developing field of eukaryotic phosphohistidine signaling.

EXPERIMENTAL PROCEDURES
Cloning, Protein Expression, and Purification-Residues 5-125 of the human PHPT1 gene (gi: 19353099) were cloned by ligation-independent cloning into a pET-28 based expression vector incorporating a TEV-cleavable N-terminal His-tag fusion (pNIC-Bsa4). After transformation and liquid culture growth using standard methods, recombinant expression of PHPT1 was induced at 291 K by addition of 0.5 mM isopropyl β-D-thiogalactopyranoside to either Terrific broth for expression of native protein or minimal medium for selenomethionine incorporation according to the methionine pathway inhibition method (13). After harvest, PHPT1 was purified using IMAC on a 1 ml HiTrap chelating HP column followed by gel filtration on a Superdex 75 column (columns from GE Healthcare, Uppsala, Sweden).
Crystallization, Data Collection, and Structure Solution-Initial crystal screening with selenomethionine-derivatized protein using commercial screens gave hits in several salt conditions. After optimization, large rhombohedral crystals could be obtained by mixing protein solution (15 mg/ml) with an equal volume of reservoir (2.0 M ammonium sulfate, 0.1 M Bistris, pH 5.5). Crystals were swept into a reservoir solution with 15% butanediol and then plunged into liquid nitrogen. The data were indexed in space group R3 with unit cell parameters a = b = 228 Å, c = 29.4 Å. Multiple-wavelength anomalous dispersion data (Table 1) were collected around the selenium K-edge at Beamline 14.1 (BESSY II, Berlin, Germany). XDS/XSCALE (14) was used to process the data and put the three data sets on an approximately similar scale. XPREP (Bruker AXS) was used to prepare FA values for substructure solution in SHELXD (15), which found 10 of the possible 12 sites. The identified sites were used for phase calculation in SHARP (16) followed by density modification in PIRATE (see footnote 3). The resulting map could be partially autotraced using ARP/wARP (17).
Unfortunately, only one of the four protomers in the asymmetric unit was well ordered and numerous attempts with different refinement protocols failed to give acceptable residuals. Since twinning is common in space group R3, twinned refinement was attempted with different operators but did not improve residuals or the quality of difference maps.

A search for alternative crystal forms was initiated with native protein (23 mg/ml). Protein both with and without an intact His-tag was used in this screening process. A new crystal form was obtained in 2.0 M sodium formate, 0.1 M Bistris propane, pH 7.0. After optimization, the best diffracting crystals were grown in 1.9 M sodium formate, 0.1 M Bistris propane, pH 6.5, at room temperature in 2 days by the hanging drop vapor diffusion method. Diffraction data were collected to 1.9 Å (λ = 1.033 Å) on a PHPT1 crystal at Beamline ID29 (European Synchrotron Radiation Facility, Grenoble, France). The data were indexed in space group R3 with unit cell parameters a = b = 112.5 Å, c = 29.4 Å. In the new crystal form, there is only one PHPT1 molecule in the asymmetric unit. Diffraction data were processed using XDS/XSCALE (14). The large cell model was used as the starting model to obtain a molecular replacement solution using MOLREP (18). The resulting model was improved through several rounds of model building and refinement in Coot (19) and REFMAC5 (18). Progress of the refinement was monitored by the free R factor calculated for 5% of the data (20). The final model consisted of 100 amino acid residues, 17 formate ions, and 59 waters. The N-terminal His-tag and linker (17 residues), residues 31-38, and the last three C-terminal amino acids are not visible in the structure. The final residuals for this model were R = 17.1% and Rfree = 22.5%.
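The packing in the two crystal forms can be checked with a Matthews coefficient estimate. The sketch below is illustrative only and is not part of the original analysis. It assumes the hexagonal setting of R3 (nine asymmetric units per cell), takes the molecular mass as the 14 kDa quoted in the introduction (ignoring any retained tag residues), and uses the conventional 1.23/V_M approximation for the protein fraction. The function names are hypothetical.

```python
import math


def hexagonal_cell_volume(a: float, c: float) -> float:
    """Volume in cubic angstroms of a cell with a = b and gamma = 120 degrees."""
    return a * a * c * math.sin(math.radians(120.0))


def matthews_coefficient(a: float, c: float, copies_per_asu: int,
                         mass_da: float, asu_per_cell: int = 9) -> float:
    """Matthews coefficient V_M (cubic angstroms per dalton).

    asu_per_cell = 9 assumes space group R3 in the hexagonal setting.
    """
    volume = hexagonal_cell_volume(a, c)
    return volume / (asu_per_cell * copies_per_asu * mass_da)


def solvent_fraction(v_m: float) -> float:
    """Approximate solvent fraction from the usual 1 - 1.23 / V_M estimate."""
    return 1.0 - 1.23 / v_m


# Assumption: ~14 kDa per protomer, as quoted for PHPT1 in the introduction.
MASS_DA = 14_000.0

# Small cell: one molecule per asymmetric unit (cell parameters from the text).
vm_small = matthews_coefficient(112.5, 29.4, copies_per_asu=1, mass_da=MASS_DA)
# Large cell: four protomers per asymmetric unit (cell parameters from the text).
vm_large = matthews_coefficient(228.0, 29.4, copies_per_asu=4, mass_da=MASS_DA)

for name, vm in (("small cell", vm_small), ("large cell", vm_large)):
    print(f"{name}: V_M = {vm:.2f} A^3/Da, solvent ~ {solvent_fraction(vm):.0%}")
```

Under these assumptions both forms give similar V_M values, which is consistent with the large cell holding four protomers in an asymmetric unit roughly four times the volume of the small one.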
Structural Similarity Searches-DALI (21) was used to determine the closest structural neighbors of PHPT1. The top five structures based on statistical significance (Z-score cutoff of 3.3) were aligned and compared with PHPT1.
Modeling of Ligands-The peptides succinyl-Ala-His(P)-Pro-Phe-p-nitroanilide, Met-Gly-His(P)-Ala-Gly-Ala-Ile, and Tyr-Ser-His(P)-Asp-Asn-Ile-Ile-Cys-Gly were modeled into the putative active site using the Molegro virtual docker (22). The sequences with phosphorylated histidines, MGHAGAI and YSHDNIICG, were chosen based on the conserved sequence among succinyl-CoA synthetases and its structure (23) and on a loop between two β-strands in the β-subunit of the G protein transducin structure (24), respectively. In the case of the β-subunit of the G protein transducin peptide, the histidine was modeled as phosphorylated on the Nε2 position. Typical docking runs consisted of docking a single ligand with the PHPT1 protein with no solvent molecules. The search algorithm typically used default values for 5-10 runs. In some cases the population size was increased from 50 to 100. Multiple poses were returned for each run, and those with phosphohistidines near His53 were considered potential ligands.
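The pose-selection criterion described above (keeping poses whose phosphohistidine lies near His53) can be sketched as a simple distance filter. This is a minimal illustration and not the procedure built into the docking program. The 4.0 Å cutoff, the function names, and all coordinates are hypothetical placeholders, since the text does not state a numerical threshold.

```python
import math
from typing import Dict, List, Tuple

Coord = Tuple[float, float, float]


def min_distance(point: Coord, targets: List[Coord]) -> float:
    """Shortest distance (angstroms) from one atom to any atom in targets."""
    return min(math.dist(point, t) for t in targets)


def select_poses(phosphorus_by_pose: Dict[str, Coord],
                 his53_ring_nitrogens: List[Coord],
                 cutoff: float = 4.0) -> List[str]:
    """Return names of poses whose phosphohistidine P atom lies within
    `cutoff` of either His53 ring nitrogen, nearest pose first.

    The default cutoff is an assumed value chosen for illustration.
    """
    scored = [(min_distance(p, his53_ring_nitrogens), name)
              for name, p in phosphorus_by_pose.items()]
    return [name for dist, name in sorted(scored) if dist <= cutoff]


# Made-up coordinates for illustration only; not taken from the structure.
his53_nitrogens = [(0.0, 0.0, 0.0), (2.2, 0.0, 0.0)]
poses = {
    "pose_1": (1.0, 3.0, 0.0),
    "pose_2": (9.0, 4.0, 1.0),
    "pose_3": (2.5, 1.5, 1.0),
}

selected = select_poses(poses, his53_nitrogens)
print(selected)
```

Measuring to both ring nitrogens avoids committing to Nδ1 or Nε2 as the interacting atom, which the docking results leave open.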
Protein Structure Accession Numbers-The atomic coordinates and structure factors have been deposited in the Research

3 K. Cowtan, unpublished results.

RESULTS AND DISCUSSION
Structural Model-PHPT1 is a 125-amino acid human cytosolic protein that folds into a compact elbow-shaped molecule of a mixed α/β fold with novel topology (Fig. 1, A and B). The molecule is defined by six β-strands flanked by two α-helices. A single helix and three strands lie approximately perpendicular to each other near the base of the molecule. There are two central anti-parallel β-strands (β2 and β4) that extend the full length of the protein. On either side of these strands are two sets of parallel β-strands (β1, β3 and β5, β6). The core of the protein is defined by this β-sheet, and the N-terminal histidine tag extends from it. Electron density for the tag was only partly interpretable, so it was not modeled, but the tag is likely required for crystallization because of the numerous crystal contacts formed with this extended strand. On the surface there is a positively charged pocket that consists of conserved residues (Fig. 2, A and B). This area is a potential site for phosphatase activity because of its positive surface charge. The cavity volume is ~30 Å³, and the cavity is found at the base of the molecule (Fig. 1C). Two individual amino acid substitutions to alanine resulted in the loss of phosphatase activity (11). These mutations, His53 → Ala and His102 → Ala, map to this region of the enzyme. Additionally, many identical and conserved residues of orthologous proteins define this basic patch and surrounding atoms (Fig. 1D). Formate and sulfate ions are found in this pocket in the small and large cell structures, respectively (Fig. 2, A and B). Both ions bind in similar locations near His53 in the positive patch but form binding networks distinct from one another. This arrangement suggests there may be multiple ways to bind substrates within the active site based on their size and charge. In total, 17 formate ions solvate PHPT1. Some of these ions are bound to surfaces distinct from the active site.
Fig. 1 legend (fragment): [...] The colors indicate positive (blue) and negative (red) electrostatic potential at the solvent accessible surface. The electrostatic calculations were done with APBS (32). D, solid surface representation of identical residues (blue), conserved (yellow), and semiconserved (aquamarine) substitutions among orthologous proteins as described previously (11). All panels were prepared with PyMOL (33).
A DALI (21) [...] (Fig. 2C). Almost all of these residues are identical in orthologous proteins among the animal kingdom. Therefore, these residues likely contribute important structural features to maintain the phosphatase activity of the enzyme. The active site residue His53 is located at the beginning of helix α1; near this residue, two formate ions and two water molecules form an intricate hydrogen bonding network. A total of five formate molecules are found in a pocket near His53. The pocket is defined by two loops: Glu51 and Tyr52 on a loop joining helix α1 and β-strand β4 define one side of the pocket, while Tyr93 and Met95 on a loop between β5 and α2 define the other side. Furthermore, Arg78 is found in the middle of the β4 strand and may be involved in coordinating the phosphate group. The Arg78 → Ala mutant resulted in a 30% decrease in activity (11), suggesting that Arg78 does play a role in phosphatase substrate recognition or catalysis. There are three conserved glycines immediately upstream of Arg78; these glycines offer Arg78 a great deal of conformational flexibility, presumably needed in substrate recognition and catalysis. Furthermore, in a scenario where substrate interacts with Arg78, the nearby glycine-rich region may flex so that the pocket tightens around the peptide substrate. Close to Arg78, Lys21 is located in the middle of the β2 strand and is involved in hydrogen bonding to a formate ion. In the pocket, only Arg78 and Lys21 are found on β-strands. These residues may act as anchors and assist in coordination of the phosphohistidine to the active site. It is not known whether the Nδ1 or Nε2 of the imidazole ring of His53 is involved in dephosphorylation, and it is unfortunately not clear from the structure in what orientation the imidazole ring should be positioned. From a conformational standpoint, as the side chain of His53 projects radially from its turning backbone with the imidazole pointing downward toward the bottom of the active site pocket, a substrate Nδ1 interaction is most likely. Interestingly, His53 is calculated to have a pKa of 3.13 using PROPKA (25). Therefore, it is likely that His53 is not protonated at the Nε2 position in the crystallization solution and in the physiological context.
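The protonation argument can be made quantitative with the Henderson-Hasselbalch relation. The sketch below is a simplified illustration that treats His53 as a single titratable site with the PROPKA estimate quoted above. The pH of 6.5 is the crystallization condition given in the experimental section, whereas the value of 7.4 is an assumed stand-in for physiological pH that does not come from the text.

```python
def fraction_protonated(pka: float, ph: float) -> float:
    """Fraction of a single titratable site in its protonated form,
    from the Henderson-Hasselbalch relation."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


PKA_HIS53 = 3.13  # PROPKA estimate quoted in the text

conditions = (
    ("crystallization buffer, pH 6.5", 6.5),
    ("assumed physiological pH 7.4", 7.4),  # assumption, not from the text
)

for label, ph in conditions:
    frac = fraction_protonated(PKA_HIS53, ph)
    print(f"{label}: {frac:.1e} of His53 protonated")
```

With a pKa this far below both pH values, the protonated fraction comes out well under 0.1% in either condition, in line with the conclusion drawn above.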
Substrate Binding-ATP-citrate lyase (ACL) is a substrate for PHPT1, and it has been suggested that PHPT1 dephosphorylates phospho-His760 (9). A BLAST (26) search revealed that a conserved domain (SucD, succinyl-CoA α-subunit) is found from residues 649-777 of human ACL. The histidine maps to this domain, and sequence analysis shows that this residue and adjacent residues (GHAGA) are highly conserved among higher eukaryotes (data not shown). The structure of this subunit contains a phosphorylated histidine at residue His259 (23). This amino acid corresponds to His760 in CLUSTALW (27) sequence alignments of the succinyl-CoA α-subunit with ACL. This residue is found on an extended loop of 22 residues between two α-helices near the surface of the protein. Considering the length and makeup of the loop, it is entirely possible that it is flexible and could bind in the pocket of PHPT1 (Fig. 2D).
The β-subunit of G proteins is phosphorylated on histidine residue 266 (28-31) and is dephosphorylated by PHPT1 (10). The structure of the β-subunit (24) shows this residue in a surface loop of eight amino acids connecting two β-strands. This small loop is modeled in the pocket of PHPT1 (Fig. 2E). In these two cases it appears that some rearrangement of these loops would be required for the substrate to bind in the pocket. It is unclear if there are additional sites of binding separate from the active site. The binding pocket of PHPT1 can accommodate a variety of ligands, and studies are under way to assess possible inhibitors and binding partners, which should allow the mechanism of action to be discerned.
Fig. 2 legend (fragment): [...] (32). C, PHPT1 active site residues and solvent molecules. The pocket is defined by the labeled residues. Also visible are five formate ions and two waters (red spheres). Schematic coloring is as for Fig. 1A. D, peptide MGHAGAI of succinyl-CoA synthetase bound in one possible orientation in the putative active site of PHPT1. The ligand (magenta) is hydrogen-bonded to Lys21 and His53. E, peptide YSHDNIICG of the β-subunit of the G protein transducin bound in a possible orientation in the proposed active site of PHPT1. The ligand (magenta) is hydrogen-bonded to His53, Arg78, and Tyr93. Residues labeled in purple and black are for the ligand and PHPT1, respectively. In both cases, the histidine in the ligand is phosphorylated at the Nε2 position. The nitrogen (Nδ1 or Nε2) of His53 that interacted with substrate varied within each set of docking solutions.
The structure of PHPT1 is the first glimpse at a eukaryotic histidine phosphatase. The protein is of a novel topology, and the structure reveals a positively charged active site defined by conserved residues that provide a suitable environment for binding phosphohistidine-containing substrates. His53 is proposed to be involved with the phosphatase activity of this enzyme. Furthermore, Arg78 and Lys21 may act as anchors to provide a stable scaffold for substrate and phosphohistidine interactions as well as provide a possibility for charge stabilization of transition states in the catalytic reaction. Future challenges include investigation of co-crystal structures with target peptides to understand in detail the structural basis for substrate recognition and catalysis.