Evolutionary and Structural Analyses of Mammalian Haloacid Dehalogenase-type Phosphatases AUM and Chronophin Provide Insight into the Basis of Their Different Substrate Specificities*

Background: Substrate specificity determinants of mammalian haloacid dehalogenase (HAD) phosphatases are poorly understood. Results: AUM (aspartate-based, ubiquitous, Mg2+-dependent phosphatase) is a novel tyrosine phosphatase and paralog of the serine/threonine- and pyridoxal 5′-phosphate phosphatase chronophin. Conclusion: Conserved cap residues in AUM or chronophin determine phosphatase substrate specificity. Significance: These findings provide a starting point for structure-based development of HAD phosphatase inhibitors. Mammalian haloacid dehalogenase (HAD)-type phosphatases are an emerging family of phosphatases with important functions in physiology and disease, yet little is known about the basis of their substrate specificity. Here, we characterize a previously unexplored HAD family member (gene annotation, phosphoglycolate phosphatase), which we termed AUM, for aspartate-based, ubiquitous, Mg2+-dependent phosphatase. AUM is a tyrosine-specific paralog of the serine/threonine-specific protein and pyridoxal 5′-phosphate-directed HAD phosphatase chronophin. Comparative evolutionary and biochemical analyses reveal that a single, differently conserved residue in the cap domain of either AUM or chronophin is crucial for phosphatase specificity. We have solved the x-ray crystal structure of the AUM cap fused to the catalytic core of chronophin to 2.65 Å resolution and present a detailed view of the catalytic clefts of AUM and chronophin that explains their substrate preferences. Our findings identify a small number of cap domain residues that encode the different substrate specificities of AUM and chronophin.

HAD phosphatases are aspartate- and Mg2+-dependent enzymes encompassing a Rossmannoid fold and the active site signature sequence hhhDXDX(T/V)(L/I)h (where h is a hydrophobic residue; X is any amino acid; human consensus motif). The first aspartate in this HAD motif I acts as the essential nucleophile, and additional catalytic residues cluster in a total of four HAD motifs that are positioned within the Rossmannoid core. The catalytic cleft of HAD phosphatases has to be temporarily shielded during the phosphoaspartyltransferase reaction, because the nucleophilic attack requires active site solvent exclusion, whereas the subsequent hydrolysis of the phosphoaspartate intermediate is solvent-dependent. Control of active site accessibility is achieved by so-called flap structures, which are small, mobile elements bordering the catalytic core, and by highly diversified cap domains that can provide more extensive shielding (2,3).
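The motif I signature described above can be expressed as a regular expression. The following is a minimal sketch only: the set of residues counted as hydrophobic (`HYDROPHOBIC`) is an assumption made for illustration, because the text does not define "h" residue by residue, and the toy sequence is not a database entry.

```python
import re

# Assumption: an illustrative hydrophobic alphabet; the text does not define "h".
HYDROPHOBIC = "AVLIMFWC"

# hhhDXDX(T/V)(L/I)h, where "." stands for X (any amino acid).
MOTIF_I = re.compile(rf"[{HYDROPHOBIC}]{{3}}D.D.[TV][LI][{HYDROPHOBIC}]")


def find_motif_i(sequence):
    """Return 1-based positions of the first Asp (the nucleophile) of each match."""
    # The first Asp sits three residues after the start of the match.
    return [m.start() + 4 for m in MOTIF_I.finditer(sequence.upper())]


# Toy sequence for illustration only (hypothetical, not a real protein record).
toy_sequence = "MKLIVFDCDGVLWAAA"
print(find_motif_i(toy_sequence))
```

Matches are non-overlapping, which is adequate for locating a single motif I per sequence; a stricter or looser hydrophobic alphabet would change which sequences match.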
Cap domains can additionally play a decisive role in the selectivity for low or high molecular weight substrates; HAD phosphatases with small (C0) caps have an open catalytic cavity and tend to process macromolecular substrates, whereas the larger cap domains of the C1/C2 subfamily members typically occlude the active site to facilitate the dephosphorylation of low molecular weight substrates (12). However, some phosphoproteins with terminal phosphorylation sites can also be dephosphorylated by C2-capped HAD phosphatases such as chronophin (8). Importantly, C1/C2 caps can supply structural elements involved in substrate interactions, and thus contribute to phosphatase specificity (13-17).
Although all HAD phosphatases belong to the same fold, their catalytic features have evolved independently multiple times, illustrating the functional diversity of the Rossmannoid core (2,3). These "large scale" mechanisms for structural and functional divergence have been studied and are well understood (18). Yet, given one structural type, how different can the substrates be? And, given that they are different, how is substrate specificity determined? To address these questions, we have analyzed AUM (aspartate-based, ubiquitous, Mg2+-dependent phosphatase; gene annotation: phosphoglycolate phosphatase, PGP). This gene is the closest paralog of chronophin (also known as pyridoxal 5′-phosphate (PLP) phosphatase) (8,19,20). Interestingly, AUM acts as a tyrosine phosphatase, whereas chronophin dephosphorylates serine/threonine-phosphorylated proteins (8,21-24) and PLP (19,20,25). We have integrated evolutionary, biochemical, and structural analyses of AUM and chronophin to unravel the determinants behind the substrate specificity of these two closely related proteins.

EXPERIMENTAL PROCEDURES
Phylogenetic Analysis of AUM and Chronophin-Metazoan chronophin and AUM phosphatases were identified using PFAM 25.0 (26), clan CL0137, as described (12), and human sequences were used as queries for BLAST searches of the indicated genomes. Ensembl accession numbers are listed in the legend of Fig. 1B. The sequences were aligned with Muscle (27) and curated with GBlocks (28). The phylogenetic tree was calculated using PhyML with default parameters (29,30). Conserved sites were identified as 90% consensus in chronophin and AUM, respectively (www.hiv.lanl.gov). Site-specific evolutionary rates were identified using MrBayes (31,32) with a γ model for the rate distribution and 10 categories considering the AUM and chronophin subfamilies. Differently conserved sites were identified with SDPfox (33). To test for sites under different selective pressure, HMMDiverge was used (34). All sites with a probability of ≥0.8, of which the evolutionary rate is higher in one family, were selected. Branch-specific rate analyses were performed with PAML (35). The following models were tested: 1) one rate for the whole tree; 2) one additional rate for the branches of chronophin and AUM, respectively, following the duplication; and 3) one additional rate for the chronophin and the AUM sub-branch, respectively. To test for significance, twice the difference of the log likelihoods was compared with the χ² value, taking into account the difference in degrees of freedom. Neither model 2 nor model 3 fitted the data significantly (p < 0.05) better than the simple model 1.
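The significance test for nested rate models can be sketched as a likelihood ratio test. This is an illustration rather than the PAML workflow itself: the function names and the log-likelihood values in the usage lines are hypothetical, and the χ² survival function is computed from the standard recurrence for the regularized incomplete gamma function using only the Python standard library.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    # Closed-form starting points: df = 2 (even) or df = 1 (odd).
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    # Recurrence Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1).
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


def likelihood_ratio_test(lnl_simple, lnl_complex, df_difference):
    """Return (statistic, p-value) for a nested model comparison."""
    statistic = 2.0 * (lnl_complex - lnl_simple)
    return statistic, chi2_sf(statistic, df_difference)


# Hypothetical log likelihoods; the complex model has two extra rate parameters.
stat, p = likelihood_ratio_test(-1000.0, -998.0, 2)
print(stat, p, "significant" if p < 0.05 else "not significant")
```

A complex model is preferred only if the p-value falls below the chosen threshold (0.05 in the text).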
Bacterial Expression Constructs-The full-length clone encoding murine AUM (gene annotation, phosphoglycolate phosphatase) was obtained from the German Resource Centre for Genome Research. For bacterial expression, full-length AUM cDNA was cloned into the bacterial expression vector pETM11 (EMBL), producing AUM with an N-terminal His6 tag that is followed by a tobacco etch virus (TEV) protease cleavage site. Murine chronophin was reverse-transcribed from adult mouse brain tissue, and the PCR product was subcloned into pETM11 to create N-terminally His6-tagged chronophin. Site-directed mutagenesis was performed using the QuikChange mutagenesis kit (Stratagene). The AUM(1-113)-chronophin(101-207)-AUM(234-321) (ACA) hybrid in pETM11 was obtained by gene synthesis after optimizing the sequence for maximal protein production without changing the murine AUM or chronophin amino acid sequences (Geneart). Nested PCR was used to create the chronophin(1-100)-AUM(114-233)-chronophin(208-292) (CAC) hybrid in pETM11. All constructs were verified by sequencing.
Recombinant Protein Expression and Purification-pETM11 constructs encoding His6-tagged AUM WT and AUM variants were transformed into Escherichia coli BL21 (DE3) and expressed for 20 h at 28°C after induction with 0.5 mM isopropyl β-D-1-thiogalactopyranoside supplemented with 20 μg/ml chloramphenicol, 50 μg/ml kanamycin, and 1 ng/ml tetracycline. To increase solubility, AUM was co-expressed with the chaperones groES-groEL-tig from the pG-Tf2 plasmid (Takara) according to the manufacturer's instructions. Cells were harvested at 8000 × g for 10 min and resuspended in TNM (50 mM triethanolamine (TEA), 200 mM NaCl, 5 mM MgCl2; pH 7.5) supplemented with 10 mM imidazole and protease inhibitors (EDTA-free protease inhibitor tablets; Roche Applied Science). All purification steps were carried out at 4°C. Cells were lysed in the presence of 150 units/ml DNase I (Applichem) using a cell disruptor (Constant Systems), and cell debris was removed by centrifugation (10,000 × g, 30 min, 4°C). For His6-AUM purification, cleared supernatants were loaded on a nickel-nitrilotriacetic acid-agarose column (HisTrap HP, GE Healthcare) operated on an ÄKTA liquid chromatography system (GE Healthcare) in TNM, and His6-tagged proteins were eluted using a linear 10-400 mM imidazole gradient in TNM. Peak fractions were tested for phosphatase activity using pNPP as a substrate (see below). Active fractions were pooled, and the His6 tag was cleaved with TEV protease for 4 days at 4°C. Subsequently, cleaved AUM was separated from uncleaved AUM and from the His-tagged TEV protease on a HisTrap HP column. Active fractions containing untagged AUM were pooled, concentrated (10-kDa MWCO; Amicon Ultra-15, Millipore), further purified on a HiLoad 16/60 Superdex 200 pg size exclusion chromatography column (GE Healthcare), and eluted in TNM.
Chronophin and the CAC hybrid were expressed and purified as described above for AUM with the following modifications: His6-tagged enzymes were expressed for 18 h at 20°C after induction with isopropyl β-D-1-thiogalactopyranoside. After centrifugation, cells were resuspended in 100 mM TEA, 500 mM NaCl, 20 mM imidazole, 5 mM MgCl2; pH 7.4, and lysed using a cell disruptor in the presence of protease inhibitors and DNase I. Cleared supernatants were loaded on a HisTrap HP column in binding buffer (50 mM TEA, 500 mM NaCl, 20 mM imidazole, 5 mM MgCl2; pH 7.4), and His6-tagged proteins were eluted using a linear gradient of up to 50% elution buffer (50 mM TEA; 250 mM NaCl; 500 mM imidazole; 5 mM MgCl2; pH 7.4). The fractions containing His6-chronophin were pooled; the His6 tag was cleaved with TEV protease, and untagged chronophin was isolated using a HisTrap HP column and further purified on a HiLoad 16/60 Superdex 200 pg size exclusion chromatography column in 50 mM TEA; 250 mM NaCl; 5 mM MgCl2; pH 7.4.
Analytical Size Exclusion Chromatography-To determine the oligomeric state of AUM, analytical size exclusion chromatography was performed using a HiLoad 16/60 Superdex 200 pg column, calibrated with globular proteins of known molecular weight (gel filtration LMW calibration kit, GE Healthcare). Blue dextran was used to determine the void volume of the column. Protein elution volumes were determined by monitoring the absorption at 280 nm. The partition coefficient (K_av) was calculated with the formula K_av = (V_e − V_o)/(V_t − V_o), where V_e is the elution volume; V_o is the void volume, and V_t is the total column volume. The apparent molecular weight was then derived from the inverse logarithm of the partition coefficient.
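The calculation above can be sketched as follows. The sketch assumes the usual linear calibration of the partition coefficient against log10 of the molecular weight; the function names and all volumes and standards in the usage lines are hypothetical and are not values from this study.

```python
import math


def partition_coefficient(v_e, v_o, v_t):
    """K_av = (V_e - V_o) / (V_t - V_o)."""
    return (v_e - v_o) / (v_t - v_o)


def fit_calibration(standards, v_o, v_t):
    """Least-squares line K_av = slope * log10(MW) + intercept.

    standards: list of (molecular_weight_da, elution_volume_ml) tuples.
    """
    xs = [math.log10(mw) for mw, _ in standards]
    ys = [partition_coefficient(v_e, v_o, v_t) for _, v_e in standards]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


def apparent_mw(v_e, v_o, v_t, slope, intercept):
    """Invert the calibration line to obtain the apparent molecular weight."""
    k_av = partition_coefficient(v_e, v_o, v_t)
    return 10 ** ((k_av - intercept) / slope)


# Hypothetical column volumes (ml) and calibration standards.
V_O, V_T = 40.0, 120.0
standards = [(10_000, 88.0), (100_000, 64.0)]
slope, intercept = fit_calibration(standards, V_O, V_T)
print(apparent_mw(76.0, V_O, V_T, slope, intercept))
```

In practice more than two standards would be used so that the quality of the linear fit can be judged.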
Analytical Ultracentrifugation-Sedimentation velocity analytical ultracentrifugation was carried out using a Beckman Optima XL I analytical ultracentrifuge (Beckman Coulter) with an eight-hole An-50 Ti rotor at 40,000 rpm and 20°C. Four hundred μl of highly purified recombinant murine AUM, murine chronophin, or reference buffer solution were loaded in standard double-sector charcoal-filled Epon centerpieces equipped with sapphire windows. Protein concentrations corresponded to an A280 of 0.25-0.8. Data were collected in continuous mode at a step-size of 0.003 cm, using absorption optical detection at a wavelength of 280 nm. Data were analyzed using the software SEDFIT (National Institutes of Health) to determine continuous distributions for solutions to the Lamm equation c(s), as described previously (36). Analysis was performed with regularization at confidence levels of 0.68 and floating frictional ratio (f/f0 ~1.22 ± 0.04 for AUM and f/f0 ~1.32 ± 0.02 for chronophin, suggesting a globular conformation for both enzymes), time-independent noise, base line, and meniscus position to root mean square deviation (r.m.s.d.) values of <0.0064 for AUM and <0.012 for chronophin. Consistent results were obtained in three independent experiments.
Protein Crystallization and Data Collection-The chronophin(1-100)-AUM(114-233)-chronophin(208-292) (CAC) hybrid protein was concentrated to 10 mg/ml (as determined by absorption at 280 nm using a calculated molar extinction coefficient of 21,430 M−1 cm−1) in crystallization buffer (10 mM TEA; 0.1 M NaCl; 1 mM MgCl2; pH 7.4) using 10-kDa molecular mass cut-off centrifugal filter devices (Amicon Ultra-15, Millipore). Crystals were grown at 20°C in 15% (w/v) PEG 3350 and 0.2 M Mg(NO3)2 using the sitting-drop vapor diffusion method, by mixing 0.6 μl of protein solution with 0.4 μl of reservoir solution. CAC crystals appeared as thin plates with dimensions of ~0.25 × 0.5 × 0.05 mm after 2-3 days, and the majority of crystals displayed very high mosaicity. Crystals were cryoprotected for flash-cooling in liquid nitrogen by soaking in mother liquor containing 30% (v/v) glycerol. Diffraction data were collected on an R-axis HTC image plate detector mounted on a Micromax HF-007 rotating anode x-ray generator. Data were processed using iMosflm (37) and scaled with Scala from the CCP4 program suite (38). The structure was solved by molecular replacement with the program Phaser (39) with human pyridoxal-5′-phosphatase (PDB entry 2OYC) as a search model. The structure was refined at 2.65 Å resolution with Phenix (40), incorporating torsion angle noncrystallographic symmetry (ncs) restraints. Data collection and refinement statistics are summarized in Table 4. The figures of the chronophin and CAC structures were generated with PyMOL (The PyMOL Molecular Graphics System, Version 1.5.0.4, Schrödinger, LLC.). Topology diagrams were generated with the pro-origami system (41) using PDB files 2P69 and 4BKM. Dimer interface calculations were performed with the PISA on-line tool (42).
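The concentration determination by absorption at 280 nm follows the Beer-Lambert law and can be sketched as below. The extinction coefficient is the value given in the text; the molecular weight of the CAC hybrid is not stated in this section, so `MW_CAC_ASSUMED` is a hypothetical placeholder, as are the function names and the 1-cm path length.

```python
EPSILON_CAC = 21430.0      # M^-1 cm^-1, calculated value quoted in the text
MW_CAC_ASSUMED = 33000.0   # g/mol, hypothetical placeholder (not from the text)


def molar_concentration(a280, epsilon, path_cm=1.0):
    """Beer-Lambert law: c = A / (epsilon * l), in mol/liter."""
    return a280 / (epsilon * path_cm)


def mass_concentration_mg_per_ml(a280, epsilon, mw_da, path_cm=1.0):
    """mol/liter * g/mol = g/liter, which is numerically equal to mg/ml."""
    return molar_concentration(a280, epsilon, path_cm) * mw_da


def a280_for_target(mg_per_ml, epsilon, mw_da, path_cm=1.0):
    """Absorbance expected for a target mass concentration."""
    return mg_per_ml / mw_da * epsilon * path_cm


# Absorbance that an undiluted 10 mg/ml sample would give under these assumptions.
print(a280_for_target(10.0, EPSILON_CAC, MW_CAC_ASSUMED))
```

Because the expected absorbance of a 10 mg/ml sample lies well above the linear range of most spectrophotometers, such a sample would normally be diluted before measurement and the dilution factor applied afterwards.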
In Vitro Phosphatase Activity Assays-Phosphatase activity assays were conducted in 96-well microtiter plates. For assays using pNPP as a substrate, 0.8 μg of the purified proteins were preincubated for 30 min at 37°C in TNM, in the presence or absence of phosphatase inhibitors or their respective solvent controls. To study the effect of NaF or BeF3− on phosphatase activity, AUM or chronophin was preincubated with 1 mM NaF or with 1 mM NaF + 0.1 mM BeCl2 (BeF3−). The reaction was started by the addition of pNPP (final concentration ranging from 0.5 to 9 mM in a total assay volume of 100 μl and 3.5 mM for single point assays). The kinetics of pNP generation were followed spectrophotometrically by measuring the absorbance at 405 nm every 30 s on a microplate reader (Envision 2104; PerkinElmer Life Sciences). pNP generation was quantitated using pNP standard curves. To derive Km and kcat values, the data were fit by nonlinear regression to the Michaelis-Menten equation using GraphPad Prism, version 4.01. For nucleotide or PLP dephosphorylation assays, 0.16 μg of protein were preincubated for 10 min at 22°C in TNM. The reactions were started by the addition of nucleotides (0-3 mM) or PLP (final concentration ranging from 0 to 1 mM in a total volume of 50 μl, and 0.5 mM for single point assays) and stopped after 5.5 min by the addition of 100 μl of malachite green solution (Biomol Green; Enzo Life Sciences). Released phosphate was determined by measuring A620 and extrapolating the values to a phosphate standard curve, and Km and kcat values were calculated using GraphPad Prism, version 4.01. All pNPP and PLP phosphatase assays were performed with three independently purified protein batches. For phosphopeptide array assays, phosphotyrosine and mixed phosphotyrosine/-serine/-threonine phosphatase substrate sets (JPT Peptide Technologies) were probed with purified AUM according to the manufacturer's protocol.
Briefly, peptides were incubated with 83 ng of AUM in 25 μl of phosphatase assay buffer (40 mM TEA, pH 7.5; 30 mM NaCl; 0.1% (v/v) Triton X-100), yielding a final substrate concentration of 10 μM. The plate was incubated at 37°C under agitation, and the reaction was quenched after 45 min with 25 μl of malachite green solution. Released phosphate was determined by measuring A620.
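The derivation of Km and kcat from initial velocities can be sketched as below. The kinetic analysis in this study was performed in GraphPad Prism; the sketch is not that implementation but a plain Gauss-Newton least-squares fit seeded by a Hanes-Woolf linearization, written with the standard library only. The function names, the example data, and the enzyme molecular weight in the usage lines are hypothetical.

```python
def michaelis_menten(s, vmax, km):
    return vmax * s / (km + s)


def fit_michaelis_menten(s_values, v_values, iterations=50):
    """Return (Vmax, Km); requires s > 0 and v > 0 for every data point."""
    n = len(s_values)
    # Initial guess from Hanes-Woolf: s/v = s/Vmax + Km/Vmax.
    ys = [s / v for s, v in zip(s_values, v_values)]
    mean_x, mean_y = sum(s_values) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(s_values, ys))
             / sum((x - mean_x) ** 2 for x in s_values))
    vmax = 1.0 / slope
    km = (mean_y - slope * mean_x) / slope
    for _ in range(iterations):
        # Gauss-Newton step: solve (J^T J) delta = J^T r for two parameters.
        a11 = a12 = a22 = b1 = b2 = 0.0
        for s, v in zip(s_values, v_values):
            d_vmax = s / (km + s)
            d_km = -vmax * s / (km + s) ** 2
            r = v - michaelis_menten(s, vmax, km)
            a11 += d_vmax * d_vmax
            a12 += d_vmax * d_km
            a22 += d_km * d_km
            b1 += d_vmax * r
            b2 += d_km * r
        det = a11 * a22 - a12 * a12
        if det == 0.0:
            break
        step_vmax = (b1 * a22 - b2 * a12) / det
        step_km = (a11 * b2 - a12 * b1) / det
        vmax += step_vmax
        km += step_km
        if abs(step_vmax) < 1e-12 and abs(step_km) < 1e-12:
            break
    return vmax, km


def enzyme_molar(mass_g, mw_da, volume_l):
    """Molar enzyme concentration from mass, molecular weight, and volume."""
    return mass_g / mw_da / volume_l


def k_cat(vmax_molar_per_s, enzyme_molar_conc):
    return vmax_molar_per_s / enzyme_molar_conc


# Hypothetical velocities (arbitrary units) at substrate concentrations in mM.
substrate = [0.5, 1.0, 2.0, 3.5, 5.0, 9.0]
velocity = [michaelis_menten(s, 2.0, 1.5) for s in substrate]
print(fit_michaelis_menten(substrate, velocity))
```

To obtain kcat, Vmax must first be converted to molar product per second and divided by the molar enzyme concentration in the well, for example `enzyme_molar(0.8e-6, mw, 100e-6)` for 0.8 μg of enzyme in 100 μl, with `mw` supplied by the user.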
Phosphatase Overlay Assays-To test the phosphatase activity of AUM against cellular phosphoproteins, HeLa cells were treated with 100 μM freshly prepared pervanadate solution for 15 min at 37°C to block the activity of cellular tyrosine phosphatases. Cells were rinsed in phosphate-buffered saline (PBS), scraped in ice-cold lysis buffer (150 mM NaCl, 1% (v/v) Triton X-100, 1 mM β-glycerophosphate, 2.5 mM sodium pyrophosphate, 1 mM sodium orthovanadate, 10 μg/ml aprotinin, 10 μg/ml leupeptin, 1 mM pepstatin, 1 mM phenylmethylsulfonyl fluoride (PMSF)) supplemented with phosphatase inhibitor cocktails I and II (Sigma), and lysed by repeatedly passing through a 20-gauge needle. Insoluble material was removed by centrifugation at 21,000 × g for 10 min at 4°C, and the cleared supernatants were mixed with 2× Laemmli's sample buffer. Proteins were separated by SDS-PAGE using a single-well comb and transferred onto nitrocellulose membranes (Hybond C; Amersham Biosciences). The nitrocellulose membrane was then cut into vertical strips and incubated for 1 h at 37°C under rotation with the indicated concentrations of AUM WT or AUM D34N in phosphatase assay buffer supplemented with 0.5% (w/v) nonfat milk powder and 2 mM MgCl2. Tyrosine phosphorylation patterns were visualized by probing with α-phosphotyrosine antibodies (clone 4G10, Millipore).
RNA Interference and Analysis of Cellular Tyrosine Phosphorylation-For RNA interference-mediated knockdown of AUM in the murine spermatogonial cell line GC1-spg, lentiviral particles containing AUM-directed or nontargeting control shRNAs (MISSION shRNA panel SHCLND-NM_025954 or SHC002, respectively; Sigma) were generated in human embryonic kidney 293F cells by co-transfection with the packaging vector psPAX2 and the envelope vector pCR-VSV-G. Cells stably expressing the shRNA constructs were selected in medium containing 1 μg/ml puromycin for 2-3 days. GC1-spg cells expressing AUM shRNA (TRCN0000081477) or control shRNA (SHC002) were seeded on 3-cm diameter dishes, precoated with 0.1 mg/ml poly-L-lysine (Sigma), and starved in DMEM without FCS for 20 h. Cells were stimulated with human epidermal growth factor (EGF; 100 ng/ml, Sigma) for the indicated time points and lysed on ice in 250 μl of 2× Laemmli's sample buffer. Proteins were separated by SDS-PAGE and blotted onto nitrocellulose, and membranes were probed with 4G10 α-phosphotyrosine antibodies. Blots were stripped and reprobed with antibodies against AUM to assess AUM depletion and with α-tubulin antibodies (clone DM1A, Sigma) to control for comparable protein loading.
Generation of AUM Antibodies-AUM-specific rabbit polyclonal antibodies were generated by Charles River using purified full-length murine untagged AUM D34N as an immunogen in New Zealand White rabbits, and antibodies were purified by affinity chromatography.
Preparation of Tissue and Cell Lysates-An adult C57BL/6 male mouse was sacrificed by cervical dislocation and dissected, and the indicated tissues/organs were immediately snap-frozen and pulverized in liquid nitrogen. One hundred mg of the respective tissue powder were solubilized in 1 ml of tissue lysis buffer (50 mM Tris-HCl, pH 7.2; 150 mM NaCl; 0.1% (w/v) SDS; 1% (v/v) Triton X-100; 1 mM EDTA; 0.5 mM sodium orthovanadate; 1 mM sodium fluoride; 1% (w/v) sodium deoxycholate; 10 μg/ml aprotinin; 10 μg/ml leupeptin; 1 mM pepstatin; 1 mM PMSF) supplemented with phosphatase inhibitor cocktails I and II. Insoluble material was removed by centrifugation at 21,000 × g for 10 min at 4°C.

RESULTS
AUM Is a Chronophin Paralog-The overall amino acid sequence identity between HAD phosphatases is typically <15%. By database mining, we identified AUM as the closest chronophin homolog in metazoa with an overall amino acid sequence identity of 45% between murine AUM and chronophin and 47% between the two human enzymes. Fig. 1A shows that AUM contains the four HAD-type phosphatase signature motifs that are strictly conserved across evolution. The corresponding amino acid residues are largely identical in AUM and chronophin, yet markedly different from the more distantly related C2-capped phosphomannomutases PMM1 and -2. Based on the overall sequence identity with chronophin and the insertion of the capping domain between HAD motifs II and III, AUM likely belongs to the structural subfamily of NagD-like, C2-capped HAD phosphatases (2,12). To determine the phylogenetic relationships of AUM and chronophin, we analyzed their transcripts from different vertebrate species together with urochordate sequences by calculating a phylogenetic tree (43). As shown in Fig. 1B, AUM and chronophin evolved via duplication of an ancestral gene at the origin of the vertebrates. Retained duplicated genes usually undergo either neofunctionalization (i.e. one of the genes evolves a new function) or subfunctionalization (i.e. each gene retains a subset of the original function) (44). A typical indication of neofunctionalization is an increased evolutionary rate in one of the two duplicated genes. To test for such a rate difference in the case of AUM and chronophin, models allowing for different rates at the base of each subgroup as well as in the whole subgroups were compared with models allowing only a single rate over the entire phylogenetic tree. None of the complex models improved the fit to the data significantly (p < 0.05). Thus, the evolutionary rates of the two genes cannot be distinguished, which disagrees with a typical model of neofunctionalization.
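Pairwise sequence identity values such as those quoted above depend on how aligned positions are counted. The following sketch shows one common convention, identical residues divided by the number of columns in which neither sequence has a gap; this convention and the function name are assumptions, since the text does not state which denominator was used, and the aligned strings are toy examples.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns without a gap in either sequence."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Toy alignment for illustration only.
print(percent_identity("ACD-F", "ACE-F"))
```

Using the full alignment length or the length of the shorter sequence as the denominator would give lower values for the same alignment.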
We raised polyclonal antibodies against murine AUM to compare the distribution of AUM and chronophin. Fig. 2A shows that AUM-directed antibodies recognize recombinant AUM and do not cross-react with recombinant chronophin. AUM is expressed in all investigated mouse tissues, with high levels present in testis, whereas chronophin appears to be relatively enriched in brain (Fig. 2B). Fig. 2C demonstrates that AUM is also expressed in commonly used cell lines, including the mouse spermatogonial cell line GC1-spg, which we used for further studies (see Fig. 3F).
AUM Is a HAD-type Protein-tyrosine Phosphatase-We verified that AUM has intrinsic phosphatase activity by first analyzing the enzyme kinetics of recombinant AUM against pNPP, a generic phosphatase substrate. Fig. 3A shows that AUM dephosphorylates pNPP, whereas the activity of chronophin measured in parallel was very low, as reported earlier (8,19,20). The replacement of the putative nucleophilic Asp in AUM with Asn (AUM D34N) abolishes pNPP dephosphorylation, demonstrating that AUM is indeed an aspartate-dependent phosphatase. The catalytic efficiency of AUM is about 1000-fold higher than the pNPP phosphatase activity of the HAD-type protein-tyrosine phosphatase Eya3, as determined for the recombinant murine enzyme (45), yet it is about 1000-fold lower in comparison with classical tyrosine phosphatases such as PTP1B, TC-PTP, or SHP1 (46).
HAD phosphatases form a stable complex with BeF3− that structurally mimics their tetragonal phosphoaspartate intermediate (16,47,48). Fig. 3C shows that the hydrolytic activity of both AUM and chronophin is completely blocked by BeF3− but is insensitive to NaF. AUM was insensitive to inhibitors of type 1 and 2A Ser/Thr protein phosphatases, such as okadaic acid and calyculin A (both tested up to a concentration of 1 μM) but was concentration-dependently inactivated by orthovanadate, a phosphate mimic that can act as a product inhibitor (IC50 = 41.4 μM). Consistent with the critical role of Mg2+ for HAD phosphatases, the Mg2+-chelator EDTA (5 mM) completely abolished AUM activity, and AUM was also inhibited by Ca2+, which can displace Mg2+ from the active site (49). These results support the classification of AUM as a Mg2+-dependent HAD phosphatase. Some HAD phosphatases display in vitro activity against phosphopeptides (7). We therefore explored the activity and potential substrate preferences of AUM in array assays containing a total of 720 different human phosphopeptides. AUM dephosphorylated a small fraction (~3.5%) of the 488 tested Tyr(P) peptides (Fig. 3D), whereas no activity was detected against any of the 174 Ser(P) or the 58 Thr(P) peptides. We next examined the potential protein phosphatase activity of AUM in phosphatase overlay assays. Fig. 3E shows that AUM directly hydrolyzes tyrosyl-phosphorylated proteins isolated from HeLa cell extracts, whereas the activity of AUM D34N is markedly compromised. Like Eya3, AUM thus displays in vitro activity against denatured tyrosyl-phosphorylated proteins at concentrations that are equivalent to those reported for SHP1 and PTP1B in comparable in vitro phosphatase overlay assays (50).

When we depleted endogenous AUM by RNA interference in the mouse spermatogonial cell line GC1-spg and stimulated the cells with epidermal growth factor (EGF), the loss of AUM resulted in a transient increase in EGF-induced protein tyrosine phosphorylation. After 3 min of EGF treatment, we detected hyperphosphorylated bands of ~150 and ~250 kDa in AUM-depleted cells compared with control shRNA cells, and an ~75-kDa band was hyperphosphorylated after 5 min of EGF treatment in AUM-depleted cells (Fig. 3F). These data demonstrate that in contrast to the Ser/Thr phosphatase chronophin, which is inactive against standard Tyr(P)-containing peptides and tyrosyl-phosphorylated proteins (8), AUM acts as a tyrosine phosphatase with narrow protein substrate specificity in cells.
Determinants of Substrate Specificity-Which features determine the difference in AUM and chronophin substrate preferences? To address this question, we swapped the cap domains between the two enzymes. Interestingly, the exchange of the AUM and chronophin cap modules resulted in a clear change of substrate specificity (Fig. 4). Although the phosphatase hybrid consisting of the chronophin cap and the AUM core domain (AUM(1-113)-chronophin(101-207)-AUM(234-321), referred to as "ACA") is inactive against pNPP, the presence of the chronophin cap is sufficient for substantial PLP dephosphorylation. In contrast, the presence of the AUM cap in the chronophin core/AUM cap phosphatase hybrid (chronophin(1-100)-AUM(114-233)-chronophin(208-292), termed "CAC") is incompatible with phosphatase activity against PLP. Consistent with a role of residues in the AUM core domain in efficient pNPP dephosphorylation (Table 2), the pNPP phosphatase activity of the CAC hybrid is substantially reduced compared with AUM WT but comparable with the low basal pNPP activity of chronophin WT. The fact that pNPP dephosphorylation by CAC was not enhanced compared with chronophin supports the concept that caps provide features geared toward the specific recognition of physiological substrates.
We performed an evolutionary analysis of residues that are strictly retained in the respective orthologs to predict sites that could be of importance for substrate specificity (Table 3). Leu-204 in AUM and His-182 in the corresponding position in chronophin were identified as the most significant candidate residues with a Z-score of 6.88. When we introduced a His residue in the AUM position 204 (AUM L204H), the pNPP activity of AUM L204H was reduced to ~64.3% of the AUM WT activity (Fig. 4A), yet AUM L204H dephosphorylated PLP almost as well as the ACA mutant (Fig. 4B). This result confirms that residues in the catalytic core of AUM are important for efficient pNPP dephosphorylation (see Table 2). Furthermore, it clearly demonstrates that the introduction of a His residue at position 204 in the AUM cap module can transfer chronophin-like specificity onto AUM, while simultaneously attenuating pNPP dephosphorylation. We conclude that this position is critical for the substrate specificity of AUM and chronophin.
Structural Basis of Substrate Specificity-Further insights into the structural basis of the different substrate specificities of AUM and chronophin were obtained from x-ray crystallographic studies. Although we were unable to crystallize full-length AUM, we succeeded in growing and analyzing crystals of the CAC hybrid, containing the AUM cap domain inserted into the chronophin catalytic core (PDB entry 4BKM; Table 4). The atomic model of the CAC hybrid was refined at 2.65 Å resolution to an R-factor of 19.8% and an Rfree of 25.6%. The protein crystallized in the space group P21 with four molecules (two homodimers) per asymmetric unit. The overall structure of the murine chronophin core/AUM cap hybrid is highly similar to human chronophin (Fig. 5A). When only the amino acids corresponding to the catalytic core of human chronophin (PDB 2P69) and the catalytic core of murine chronophin in the CAC hybrid are structurally aligned (residues 1-100 and 207-292 in chronophin or 1-100 and 200-306 in CAC, respectively), the

FIGURE 3. AUM substrate specificity in vitro and in cells. A, in vitro pNPP phosphatase assays were performed in 96-well microtiter plates in a total assay volume of 100 μl, using recombinantly expressed and purified AUM, AUM D34N, or chronophin (0.8 μg of protein/well) and 3.5 mM pNPP as a substrate. The kinetics of pNP generation were followed spectrophotometrically by measuring the absorbance at 405 nm. B, in vitro PLP phosphatase assays with purified AUM and chronophin were performed in 96-well microtiter plates in a total assay volume of 50 μl, using recombinantly expressed, purified AUM or chronophin (0.16 μg of protein/well), and 0-1 mmol/liter PLP. The reaction was stopped with malachite green, and released phosphate was determined by measuring A620. The enzyme velocity toward increasing PLP concentrations is shown. C, effect of BeF3− on AUM or chronophin activity toward pNPP or PLP. The recombinant purified enzymes (0.8 μg of AUM or 0.16 μg of chronophin/well) were preincubated for 30 min at 37°C (AUM) or for 10 min at 22°C (chronophin) in the absence (control) or presence of 1 mM NaF (NaF) or 1 mM NaF + 0.1 mM BeCl2 (BeF3−) before phosphatase activity against pNPP (3.5 mM) or PLP (0.5 mM) was measured as described above. Left panel, velocity of AUM-dependent pNPP dephosphorylation ± inhibitors; right panel, velocity of chronophin-dependent PLP dephosphorylation ± inhibitors. A-C, results are mean values ± S.E. of n = 3 independent experiments. D, activity of AUM in phosphopeptide array assays. A set of 720 different phosphopeptides phosphorylated on tyrosine, serine, or threonine residues (final substrate concentration, 10 μM) was incubated with 100 nM purified AUM in a 384-well microtiter plate (final assay volume, 25 μl) for 45 min at 37°C. The reaction was quenched with malachite green, and released phosphate was determined by measuring A620. Absorbance values were normalized to the peptide substrate yielding the highest reading (100% hydrolysis), and all peptides with absorbance values of ≥66% over the background are listed. In these sequences, acidic residues are highlighted in red, basic residues in blue, and proline residues in gray. Swiss-Prot accession numbers of the corresponding proteins are given on the right. E, determination of AUM and AUM D34N activity toward tyrosine-phosphorylated proteins in overlay assays. HeLa cells were left unstimulated (unstim.) or treated with pervanadate. After SDS-PAGE, cell lysates were blotted onto nitrocellulose, and the membrane was cut and incubated in the absence (−) or presence of the indicated concentrations of AUM WT or AUM D34N. Protein tyrosine phosphorylation was analyzed with 4G10 α-phosphotyrosine antibodies. A short (left panel) and longer exposure (right panel) is shown (n = 3). F, comparison of cellular phosphotyrosine levels in control (ctrl) or AUM-depleted GC1-spg cells. GC1-spg cells expressing control (ctrl) shRNA or AUM shRNA were stimulated with 100 ng/ml EGF for the indicated time points. Cells were lysed, and proteins were separated by SDS-PAGE and transferred onto nitrocellulose membranes for immunoblotting. Cellular tyrosine phosphorylation levels were analyzed with 4G10 α-phosphotyrosine (pTyr) antibodies; AUM depletion was assessed with α-AUM antibodies, and tubulin served as a loading control (n = 3).

TABLE 1 Catalytic constants of AUM toward nucleotides
Nucleotide dephosphorylation was measured in 96-well microtiter plates in a total assay volume of 50 µl, using recombinantly expressed, purified AUM (0.16 µg of protein/well) and 0-3 mM nucleotides. The reactions were stopped with malachite green, and released phosphate was determined by measuring A₆₂₀. Results are means ± S.E.; n = 3.

The inclusion of the cap domain in the superposition increases the r.m.s.d. to 1.1 Å, still indicating a significant structural homology between the two proteins. However, the superimposition of only the catalytic cores of the chronophin and CAC structures reveals that the orientation of the capping domain in the CAC hybrid differs from chronophin by a 6.5° rotation as determined by LSQKAB of the CCP4 program suite, resulting in a slightly more open conformation of the cap relative to the core domain.
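The two quantities used in this comparison, the r.m.s.d. between superposed atoms and the rotation angle relating two domain orientations, can be illustrated with a minimal Python sketch. It does not reproduce LSQKAB. It assumes that the two coordinate sets are already superposed and paired atom by atom, and that a 3x3 rotation matrix relating the two cap orientations is available. The function names (rmsd, rotation_angle_deg, rotation_about_z) are hypothetical helpers introduced here for illustration only.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two paired, already superposed
    lists of (x, y, z) coordinates, in the units of the input (e.g. Å)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (a - b) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for a, b in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(coords_a))


def rotation_angle_deg(rot):
    """Rotation angle in degrees of a 3x3 rotation matrix, obtained from
    its trace via cos(theta) = (trace - 1) / 2."""
    trace = rot[0][0] + rot[1][1] + rot[2][2]
    # Clamp to [-1, 1] to guard against rounding errors before acos.
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_theta))


def rotation_about_z(angle_deg):
    """Illustrative rotation matrix for a rotation about the z axis."""
    t = math.radians(angle_deg)
    return [
        [math.cos(t), -math.sin(t), 0.0],
        [math.sin(t), math.cos(t), 0.0],
        [0.0, 0.0, 1.0],
    ]
```

In this sketch, a matrix built with rotation_about_z(6.5) is only a stand-in for the rotation that a superposition program would report; the angle is recovered from the matrix trace regardless of the rotation axis.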
Both caps consist of a central parallel β-sheet in 8-7-9-10-13 (CAC) or 7-8-9-12 (chronophin) orientation, with α-helices (CAC: F, E, G, H, I; chronophin: E, F, G, H) connecting the strands of the sheet (Fig. 5B). Preceding strand 10 (CAC) or 9 (chronophin), a β-hairpin followed by a helix (I in CAC and H in chronophin) is inserted into the capping domain (β-hairpin residues, 201-211 in AUM, 190-200 in CAC, and 181-191 in chronophin). This β-hairpin covers the entrance of the active site and harbors the substrate specificity determining residue (Leu-204 in AUM/Leu-191 in CAC, His-182 in chronophin) and is therefore referred to as "substrate specificity loop." Compared with the orientation of the substrate specificity loop in the chronophin cap, the corresponding β-hairpin is notably skewed in AUM (Fig. 5A, open arrow). The AUM cap also features an additional extensive loop not present in chronophin, indicated by the closed arrow in Fig. 5A (residues 140-164 in AUM/127-151 in CAC; referred to as "transverse loop," see below), which is located between strands 8 and 9 and comprises a short helical stretch (helix G in CAC, see Fig. 5B). Thus, consistent with the structural conservation of the cap domains in HAD subfamilies, the overall structures of the AUM and chronophin capping domains are closely related in terms of structural elements and folding with an r.m.s.d. of 1.28 Å, although remarkable differences exist (see below and Fig. 6A).
The CAC hybrid crystallizes as a dimer (Fig. 5C), and we confirmed that AUM also predominantly exists as a dimer in solution, by employing size exclusion chromatography and analytical ultracentrifugation sedimentation velocity measurements (Fig. 5, D and E). The AUM phosphatase activity in the size exclusion chromatography fractions containing AUM dimers or AUM tetramers was comparable. CAC dimerization is mediated via the AUM capping domain, mainly by helices H and I located between strands 9 and 13 of the β-sheet and following the β-hairpin substrate specificity loop. Interestingly, the transverse loop in the AUM cap of molecule A in the dimer is positioned in close proximity to the substrate specificity loop in the cap of molecule B. Arg-197 in the β-hairpin and Leu-144 in the CAC transverse loop (corresponding to Arg-201 and Leu-157 in AUM) may be important residues for this interaction between two CAC protomers. This characteristic loop in the AUM cap also partially occludes the active site entrance of the adjacent protomer. When the four CAC protomers (two homodimers) of one asymmetric unit are superimposed, no marked differences are seen in the orientation of this loop, indicating that this is the predominant conformation rather than an artifact of crystal packing. Thus, both the substrate specificity loop and the large and presumably flexible transverse loop are likely to determine AUM phosphatase substrate accessibility and selectivity.
A comparison of the active sites of chronophin and the CAC hybrid provides a detailed view of residues important for substrate specificity (Fig. 6A). Although all core domain residues that are directly involved in the dephosphorylation reaction are identical between AUM and chronophin (with the exception of Thr-67 in AUM, which contributes to the orientation of the substrate for nucleophilic attack by forming a hydrogen bond with its transferring phosphoryl group and thus is required for AUM activity rather than specificity; see Table 2), remarkable differences exist in the AUM and chronophin cap domains. Importantly, the imidazole ring of His-182 in the chronophin cap directly coordinates and orients the PLP pyridine ring, which explains its critical function for substrate recognition (see Fig. 4). In contrast, the aliphatic Leu-204 of AUM (Leu-191 in CAC) is unable to coordinate PLP. Two other notable amino acid exchanges in the cap are the replacement of the acidic Asp-177 in chronophin for the charge-neutral Asn-199 in AUM (Asn-186 in CAC) and the substitution of Tyr-150 in chronophin for Phe-173 in AUM (Phe-159 in CAC).

TABLE 2 Catalytic constants of AUM, AUM mutants, and chronophin toward pNPP
Exchange of the cap domains and of amino acid residues in AUM with the corresponding chronophin residues in regions of AUM/chronophin sequence divergence. pNPP dephosphorylation assays were performed as described in the legend to Fig. 3. The substitution of three divergent amino acids adjacent to HAD motif I of AUM for the corresponding chronophin residues (AUM R41N; T44R; A45I) slightly elevates the activity toward pNPP. In contrast, the swap of residues in the seven-amino acid stretch of AUM that immediately succeeds the conserved Ser/Thr residue in HAD motif II (AUM T67S; S71R; K72R; T73A) abolishes AUM phosphatase activity toward pNPP, which is partially restored in the presence of the HAD motif II residue Thr-67 (AUM S71R; K72R; T73A; ~58.7% of AUM WT activity). Ser/Thr of motif II forms a hydrogen bond with the substrate's phosphoryl group and thus contributes to the orientation of the substrate for nucleophilic attack. The catalytic constants of AUM T67S; K72R; T73A, AUM T73A, the CAC hybrid, and chronophin could not be determined (ND) because the pNPP activity of these proteins was too low. None of the investigated AUM variants dephosphorylated the chronophin substrate PLP. The protein concentrations of those AUM mutants that could not be purified to apparent homogeneity (AUM T67S; S71R; K72R; T73A, AUM T67S; K72R; T73A, AUM T67S; S71R; T73A, see Fig. 4C) were determined densitometrically. To this end, proteins were subjected to SDS-PAGE and stained with Coomassie Blue, and the bands corresponding to the respective AUM mutant were analyzed using the ImageJ software (National Institutes of Health). Various concentrations of BSA were run on the same SDS-polyacrylamide gels and were analyzed as described above to construct protein concentration standard curves. Results are mean values ± S.E. of three independent experiments performed in triplicate with three independently purified protein batches.

FEBRUARY 7, 2014 • VOLUME 289 • NUMBER 6

a Numbers in parentheses refer to the respective highest resolution data shell in the data set.
b $R_{\mathrm{sym}} = \sum_{hkl}\sum_i |I_i - \langle I\rangle| / \sum_{hkl}\sum_i I_i$, where $I_i$ is the ith measurement, and $\langle I\rangle$ is the weighted mean of all measurements of $I$.
c $R_{\mathrm{p.i.m.}} = \sum_{hkl} (1/(n-1))^{1/2} \sum_i |I_i - \langle I\rangle| / \sum_{hkl}\sum_i I_i$, where $n$ is the multiplicity of the observed reflection.
d $\langle I/\sigma I\rangle$ indicates the average of the intensity divided by its S.D. value.
e $R_{\mathrm{cryst}} = \sum |F_o - F_c| / \sum |F_o|$, where $F_o$ and $F_c$ are the observed and calculated structure factor amplitudes. $R_{\mathrm{free}}$ is same as $R_{\mathrm{cryst}}$ for 5% of the data randomly omitted from the refinement.
f Estimated coordinate error is based on $R_{\mathrm{free}}$.
g Ramachandran statistics indicate the fraction of residues in the favored, allowed, and disallowed regions of the Ramachandran diagram, as defined by MolProbity (73).
h Number of serious clashes per 1000 atoms is shown in Ref. 73.
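The data quality and refinement statistics defined in the table footnotes (R sym, R p.i.m., and R cryst) can be written out as a short Python sketch. This is an illustration of the formulas only, not the software used for the structure determination. It assumes an unweighted mean for the average intensity (the footnote specifies a weighted mean), skips reflections measured fewer than twice, and uses hypothetical function names (r_sym, r_pim, r_cryst) and toy numbers.

```python
import math


def _multiply_measured(groups):
    """Keep only reflections with at least two intensity measurements."""
    return [obs for obs in groups if len(obs) >= 2]


def r_sym(groups):
    """R_sym = sum_hkl sum_i |I_i - <I>| / sum_hkl sum_i I_i.
    `groups` holds one list of repeated intensity measurements per unique
    reflection; <I> is taken here as the unweighted mean (an assumption)."""
    num = den = 0.0
    for obs in _multiply_measured(groups):
        mean = sum(obs) / len(obs)
        num += sum(abs(i - mean) for i in obs)
        den += sum(obs)
    return num / den


def r_pim(groups):
    """R_p.i.m.: as R_sym, but each reflection's deviation sum is scaled
    by sqrt(1 / (n - 1)), where n is the multiplicity."""
    num = den = 0.0
    for obs in _multiply_measured(groups):
        n = len(obs)
        mean = sum(obs) / n
        num += math.sqrt(1.0 / (n - 1)) * sum(abs(i - mean) for i in obs)
        den += sum(obs)
    return num / den


def r_cryst(f_obs, f_calc):
    """R_cryst = sum |F_o - F_c| / sum |F_o| over paired amplitudes."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("amplitude lists must be non-empty and equal in length")
    return sum(abs(o - c) for o, c in zip(f_obs, f_calc)) / sum(abs(o) for o in f_obs)
```

Applying r_cryst to a working set and, separately, to a held-out set of reflections (5% in the footnote) would give the two values reported as R-factor and R free; the split itself is not shown here.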

Characterization of the Novel Mammalian Tyrosine Phosphatase AUM
Finally, we have conducted a topological comparison of the catalytic clefts in chronophin and the CAC hybrid. Fig. 6B illustrates that the cap portion of the active site pocket in CAC that is dedicated to substrate recognition is deeper than in the chronophin pocket. This result is consistent with the observed AUM phosphatase activity toward the Tyr(P)-mimetic pNPP substrate and toward Tyr-phosphorylated peptides and proteins, and it indicates that in contrast to the shallower binding groove of chronophin, the conformation of the substrate binding pocket of AUM allows for the accommodation of bulkier Tyr residues.
Evolution of Substrate Specificity-To identify further sites responsible for the different substrate specificities of AUM and chronophin, we adopted a comparative evolutionary approach. If, following gene duplication, the functions of two paralogs change, two types of sites can typically be distinguished. Provided that a corresponding position in the two proteins is essential for their respective functions, this site is highly conserved in both proteins but harbors different residues (class II sites (51)). The substrate switch resulting from the His-182/Leu-204 swap has clearly revealed the importance of such differentially conserved residues for AUM and chronophin functions. We have identified 15 additional class II sites that may be involved in functional differentiation (Table 3).
When mapping these sites on the AUM cap structure (Fig. 7A), some differentially conserved residues stand out as candidates involved in the determination of substrate specificity: Asn-199 in AUM (Asp-177 in chronophin) is located in the substrate recognition part inside the active cleft (see also Fig. 6A), whereas Arg-203 in AUM (Trp-181 in chronophin) maps to the substrate specificity loop but is oriented toward the outer protein surface and may thus function in substrate recognition at the entrance to the active cleft. Other differentially conserved residues in AUM, including Ala-128, Tyr-178, Cys-217, and Ala-226, are exposed on the cap surface and may play a role in regulatory protein-protein interactions. Sites involved in functional differentiation can also be highly conserved in one paralog but under poor selection constraint in the other (class I sites; see Table 5). Of interest, Glu-207 in AUM is positioned on top of the specificity loop and may be involved in initial substrate contact of AUM but not of chronophin (Fig. 7B), and Arg-41 in AUM is located at the entrance to the active cleft (Fig. 7B and Table 2).
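The distinction between class II sites (conserved in both paralogs, but with different residues) and class I sites (conserved in one paralog, poorly constrained in the other) can be sketched as a simple per-column rule. The Python sketch below is a deliberately simplified heuristic and is not the statistical procedure of Ref. 51. The conservation measure (fraction of the most common residue), the 0.9 threshold, the function names, and the example columns are all assumptions made for illustration; the columns are toy data, not taken from the actual alignment.

```python
from collections import Counter


def majority(column):
    """Return the most common residue in an alignment column and the
    fraction of sequences that carry it."""
    residue, count = Counter(column).most_common(1)[0]
    return residue, count / len(column)


def classify_site(col_a, col_b, conserved=0.9):
    """Classify one aligned position from the ortholog columns of two
    paralogs (one character per species in each string).

    'class II': conserved in both paralogs, with different residues
    'class I' : conserved in one paralog only
    'shared'  : conserved in both paralogs, with the same residue
    'variable': conserved in neither paralog
    The threshold `conserved` is a hypothetical cutoff."""
    res_a, frac_a = majority(col_a)
    res_b, frac_b = majority(col_b)
    a_conserved = frac_a >= conserved
    b_conserved = frac_b >= conserved
    if a_conserved and b_conserved:
        return "class II" if res_a != res_b else "shared"
    if a_conserved != b_conserved:
        return "class I"
    return "variable"


# Toy columns: a Leu-versus-His position and a position that is
# invariant in one paralog but variable in the other.
example = {
    "toy site 1": classify_site("LLLLLL", "HHHHHH"),
    "toy site 2": classify_site("EEEEEE", "ADKQSG"),
}
```

A rule of this kind only flags candidate positions; mapping them onto the structure, as done in the text, is what indicates whether a site lies in the active cleft or on the protein surface.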
To visualize catalytic core residues that are differently conserved, we have mapped them on the chronophin structure. Fig. 7C shows that these residues cluster around the active center of chronophin. Chronophin-Arg-63 is located at the active site entrance and corresponds to AUM-Lys-72, a core domain residue important for AUM activity (see Table 2 for biochemical evidence on the relevance of this residue). The differently conserved chronophin residues Ser-195, Phe-156, and Gly-117 and residues Arg-74 and Ala-198 that are only conserved in chronophin are located on the chronophin surface and are potential candidates for protein-protein interaction sites. The relevance of these conserved residues for substrate specificity can be systematically explored once a physiological AUM substrate has been identified.

DISCUSSION
We have identified the previously unexplored chronophin paralog AUM as a mammalian HAD-type tyrosine phosphatase and have characterized it by biochemical, structural, and computational techniques. Although physiological AUM substrates are currently unknown, the elevated levels of tyrosine-phosphorylated proteins observed upon stimulation of AUM-depleted cells with EGF implicate AUM as a tyrosine phosphatase involved in growth factor-induced signaling pathways. The gene encoding human AUM was previously annotated as phosphoglycolate phosphatase (PGP) based on     (58), which is clearly distinct from the properties of the substrate binding groove in AUM as shown in this study. Still, should AUM indeed have PGP activity in addition to its tyrosine phosphatase activity, this would add yet another dimension to the functional potential of this HAD phosphatase subfamily. In contrast to AUM, its paralog chronophin is a PLP- and Ser/Thr-directed phosphatase. Classical Ser/Thr- or Tyr-directed phosphatases constitute separate, structurally unrelated enzyme families (59,60). In contrast, all HAD phosphatases share a structurally conserved Rossmannoid catalytic core. Previous work has identified HAD family members that possess protein Ser/Thr or Tyr phosphatase activity (2,3). Although classical Ser/Thr-directed phosphatases recruit an array of regulatory subunits to precisely target their substrates (61,62), and classical Tyr phosphatases have acquired their specificity by domain fusion events (63,64), HAD phosphatases have specialized their functions by the fusion of a "generic" catalytic core with structurally diversified cap domains (2,63,65). Crystallographic studies in prokaryotes (14,66), mammalian nucleotidases (16,67), human phosphomannomutases (68), and human phosphatases (69) have led to the realization that HAD caps can provide specificity domains, yet little experimental evidence has been available so far to support this presumed role.
We have therefore created functional HAD phosphatase cap hybrids between two HAD paralogs and show that these domain swaps can indeed toggle between chronophin- and AUM-like phosphatase specificities.
Employing a combination of biochemical, evolutionary, and structural data, we have identified the mechanisms behind this striking functional flexibility. We show that the functional specialization of HAD phosphatases can be attributed not only to the presence of different cap structures but also to seemingly small changes within cap domains. Despite their highly conserved fold, the caps of chronophin and AUM contain residues involved in substrate recognition. Using an evolutionary-based prediction of residues determining specificity, we identified a candidate site responsible for the change in function. Indeed, a single, differently conserved residue in AUM and chronophin is sufficient to switch between the characteristics of a Tyr- or Ser/Thr-directed phosphatase, respectively. This cap residue (Leu-204 in AUM and His-182 in chronophin) is positioned as part of a characteristic β-hairpin structure referred to as substrate specificity loop (13,14). Although the presence of this loop is common to both AUM and chronophin, AUM additionally contains a transverse loop that may serve to orient the substrate specificity loop of the adjacent monomer in the homodimer.
In addition to their preference for Tyr- or Ser/Thr-phosphorylated residues, phosphatases also have to evolve specificity toward their particular target proteins. To discover such sites, we combined structural and evolutionary data. We identified four AUM-specific residues around the substrate entry site to the catalytic core (Arg-41, Lys-72, Arg-203, and Glu-207) that may be involved in initial AUM/substrate contact, and two differentially conserved residues inside the AUM substrate binding groove (Asn-199 and Leu-204) that may be involved in defining the electrostatic environment of the active site and in substrate recognition. These sites are therefore excellent candidates for further functional characterization of AUM. Thus, structural information allowed us to map the functional divergence of AUM and chronophin to defined substitutions in the substrate binding pocket.
Yet the following question remains: how did these differences evolve? AUM and chronophin arose from a duplication of an ancestral gene at the base of the vertebrates. Unexpectedly, we did not find a significant difference between the evolutionary rates of the two paralogs, indicating that they both differ in function from the ancestral gene. The orthologs of AUM and chronophin in the urochordates Ciona savignyi and Ciona intestinalis harbor neither a histidine nor a leucine residue in the specificity determining site but instead a methionine. It will be interesting to test the function of this gene or even to reconstruct the ancestral precursor gene.
When looking at the large and ancient family of HAD proteins, the evolution of their functional divergence can be traced back to the addition of different cap domains to a Rossmannoid core (2). We thus propose a two-step process in the evolution of HAD phosphatase specificity. In the first evolutionarily old step, different cap domains were added to the Rossmannoid core. Subsequently, phosphatase functions within structural HAD subclasses were further diversified by amino acid substitutions within the caps. It seems that this evolutionary mechanism has equipped HAD phosphatases with a high degree of functional flexibility. Indeed, orthologs of AUM and chronophin were expanded independently in nematodes and arthropods (12). Differences in the residues positioned in the specificity determining sites hint at supplementary functions in these model organisms. Thus, the AUM/chronophin subfamily might provide an excellent model system to study the evolution of functional divergence of HAD phosphatases.
Despite their important roles in human diseases, which make some HAD phosphatases conceptually attractive drug targets (5,12,25,70,71), knowledge of mammalian HAD phosphatase specificity determinants is currently very limited. We demonstrate that HAD substrate specificity can be encoded by a very small number of predictable amino acid residues. This concept may be instrumental in the search for specific HAD phosphatase inhibitors that target the unique substrate recognition pockets of individual enzymes. Finally, it will be important to elucidate the in vivo functions of the Ser/Thr- and PLP-directed