Differential catalytic promiscuity of the alkaline phosphatase superfamily bimetallo core reveals mechanistic features underlying enzyme evolution

Members of enzyme superfamilies specialize in different reactions but often exhibit catalytic promiscuity for one another's reactions, consistent with catalytic promiscuity as an important driver in the evolution of new enzymes. Wanting to understand how catalytic promiscuity and other factors may influence evolution across a superfamily, we turned to the well-studied alkaline phosphatase (AP) superfamily, comparing three of its members, two evolutionarily distinct phosphatases and a phosphodiesterase. We mutated distinguishing active-site residues to generate enzymes that had a common Zn²⁺ bimetallo core but little sequence similarity and different auxiliary domains. We then tested the catalytic capabilities of these pruned enzymes with a series of substrates. A substantial rate enhancement of ∼10¹¹-fold for both phosphate mono- and diester hydrolysis by each enzyme indicated that the Zn²⁺ bimetallo core is an effective mono/di-esterase generalist and that the bimetallo cores were not evolutionarily tuned to prefer their cognate reactions. In contrast, our pruned enzymes were ineffective sulfatases, and this limited promiscuity may have provided a driving force for founding the distinct one-metal-ion branch that contains all known AP superfamily sulfatases. Finally, our pruned enzymes exhibited 10⁷–10⁸-fold phosphotriesterase rate enhancements, despite the absence of such enzymes within the AP superfamily. We speculate that the superfamily active-site architecture involved in nucleophile positioning prevents accommodation of the additional triester substituent. Overall, we suggest that catalytic promiscuity, and the ease or difficulty of remodeling and building onto existing protein scaffolds, have greatly influenced the course of enzyme evolution. Uncovering principles and properties of enzyme function, promiscuity, and repurposing provides lessons for engineering new enzymes.

Enzymes catalyze biological reactions with enormous rate enhancements of up to 10²⁹-fold, and they do so with high specificity, thereby ensuring rapid and controlled flux of substrates through metabolic pathways and permitting exquisite biological regulation (1,2). But how did these remarkably efficient enzymes arise? What are the mechanistic and structural properties that determine the catalytic proficiencies of a family of proteins and thus the catalytic landscape that evolution has traversed?
Catalytic promiscuity, the ability of an enzyme to carry out reactions other than its evolutionarily optimized reaction, is expected to strongly bias the emergence of new enzymes, as it can provide an evolutionary head start following gene duplication (3,4). Correspondingly, the catalytic promiscuity of modern-day enzymes and their conserved elements may be an evolutionary vestige. It may provide clues to the catalytic capabilities and mechanistic features that have influenced evolutionary history, as well as insights for the engineering of new enzymes (4-13).
We can identify enzymes that are members of superfamilies, i.e. enzymes with similar overall folds and common underlying catalytic features, and these enzymes very likely emerged from a common evolutionary precursor (3,8,10,11,14,15). Indeed, extant superfamily members that catalyze different reactions often exhibit cross-promiscuity, consistent with an evolutionary role for promiscuity (3,11,16). In some cases, for enzymes that have evolved subsequent to the last universal common ancestor (LUCA) of current cellular life, reconstructions can provide information about potential evolutionary pathways (17-21), whereas in other cases reconstructions are stymied as superfamily members were already present in LUCA or emerged very early after the common ancestor (22) (see Footnote 3). In addition to our ability to create models for the sequences of evolutionary precursors (17-21), we would like to know, from a mechanistic standpoint, why certain enzymes have evolved from one another and, conversely, why others did not.

Footnotes: This work was funded by National Institutes of Health Grants GM049243 and GM64798 (to D. H.). The authors declare that they have no conflicts of interest with the contents of this article. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. We dedicate this paper to Patsy Babbitt on the occasion of her retirement in honor of her profound and creative insights into enzyme evolution and function. This article contains supplemental Figs. S1-S6, Tables S1-S4, and Refs. 1-8. The atomic coordinates and structure factors (codes 5TPQ and 5TOO) have been deposited in the Protein Data Bank (http://wwpdb.org/).
Footnote 1: To whom correspondence should be addressed. E-mail: herschla@stanford.edu.
As noted above, catalytic promiscuity is one factor that has likely played an important role, and we gain new insights into evolutionary pathways from the determinations of catalytic promiscuities herein. Nevertheless, catalytic promiscuity does not ensure the development and fixation of a new activity, even if it can provide a selective advantage; in addition to the inevitable role of chance in evolution, effective mutational paths need to avoid major sustained fitness dips that would prevent or greatly hinder optimization.
Thus, information about the physical and mechanistic features that determine catalytic efficiency and catalytic promiscuity will help us understand how enzymes have emerged throughout evolution and the evolutionary potential and evolvability of enzymes. Ultimately, these insights may be harnessed to engineer new, beneficial enzymes (Refs. 4-13, 23, 24 and references therein and Refs. 23, 25-27). We have taken advantage of the well-established functional, structural, and evolutionary connections within the alkaline phosphatase (AP) superfamily to relate the catalytic promiscuity of the superfamily's Zn²⁺ bimetallo core to the elaboration of enzyme classes across this superfamily (6, 28-37). To ensure that our results would be representative rather than idiosyncratic, we compared three AP superfamily members, mutating the distinguishing active-site residues to create mutant enzymes with a common Zn²⁺ bimetallo core but pervasive sequence differences and distinct auxiliary domains. Determination of the catalysis of these three "pruned" enzymes across a series of reaction classes and consideration of catalytic and structural features allowed us to relate the intrinsic catalytic prowess of the Zn²⁺ bimetallo core to evolutionary decisions made across this superfamily.

Bimetallo branch of the AP superfamily
The AP superfamily subdivides into two main subgroups, one with enzymes containing a single metal ion and predominantly catalyzing sulfatase reactions and the other with enzymes containing the canonical alkaline phosphatase bimetallo center (38). We focused on the latter group, which possesses two metal ions coordinated by conserved residues in the active site and a universally conserved serine or threonine nucleophile (Fig. 1). This group includes phosphatases, phosphate diesterases, and phosphomutases, among other enzymes (Scheme 1) (30-34, 39). The enzymes studied here typify a cross-section of the structural diversity seen across the AP superfamily bimetallo branch while maintaining Zn²⁺ ions in their bimetallo cores, which facilitates the comparisons herein (Fig. 2). Escherichia coli alkaline phosphatase (EcAP) and Chryseobacterium meningosepticum alkaline phosphatase PafA (PafA) are both phosphate monoesterases (Table 1), but they are evolutionarily distant, as suggested by a low sequence identity (8.9%; supplemental Fig. 1) (40) and distinct inserted elements (Fig. 2, A and B) (34,40). Xanthomonas citri nucleotide pyrophosphatase/phosphodiesterase (NPP) has a subset of active-site and scaffold features in common with PafA but is evolutionarily distant from PafA (10% sequence identity; supplemental Fig. 1) (40) and is a phosphate diesterase rather than a monoesterase (Table 1), related to a large set of AP superfamily diesterases (Fig. 2, A and B) (34, 40-42).
The active-site residues common to these enzymes reside in analogous positions within conserved secondary structure elements (Fig. 2A, white and gray), and these secondary structure elements include highly similar tertiary structures (Fig. 2B, gray) (30, 31, 33, 34, 38-40, 43). All three enzymes have a serine or threonine nucleophile, two Zn²⁺ ions, and identical Zn²⁺ ligands (Fig. 2, A and C, black). Mechanistic similarities are also readily apparent. The conserved seryl/threonyl residue is a ligand of Zn₂ and serves as a nucleophile that attacks the phosphorus atom of the phosphate monoester or diester substrate; the leaving group oxygen atom, which develops a negative charge in the transition state, is stabilized by Zn₁; and one of the oxygen atoms of the transferred phosphoryl group situates between the Zn²⁺ ions, presumably making close electrostatic interactions that stabilize the trigonal bipyramidal transition state (Fig. 2D) (44).

Figure caption: Generic active site of AP bimetallo enzyme shows the alkoxide nucleophile (Ser or Thr), Zn²⁺ atoms, and Zn²⁺ ligands with a depiction of the reaction's transition state, and general reaction mechanism for AP superfamily enzymes via a covalent intermediate (E-P), shown for reaction of a phosphate monoester giving an inorganic phosphate (Pi) product.
Footnote 3: P. C. Babbitt, personal communication.

Mechanistic features underlying evolution in the AP superfamily
These structural and mechanistic similarities contrast with local and global differences (Fig. 2). Fig. 2C shows the residues that sit opposite to the bimetallo center (colored residues). The two monoesterases (EcAP and PafA) donate multiple hydrogen bonds to each of the two oxygen atoms on the transferred phosphoryl group that face away from the Zn²⁺ bimetallo site, interactions that are unique to each enzyme and suggestive of evolution along distinct pathways within the AP superfamily (40,41). The diesterase (NPP) makes interactions with the nonesterified outward-facing phosphoryl oxygen atoms that are the same as for the PafA monoesterase, apparently reflecting a shared evolutionary origin (Fig. 2C, NPP, Asn-111 (magenta); PafA, Asn-100 (green)) (37,40,41), whereas the other outward-facing phosphoryl oxygen makes different interactions, in its esterified form with an NPP hydrophobic pocket and in its nonesterified form with a PafA hydrogen bond network (Fig. 2, B and C, NPP, magenta; PafA, green) (34,37,45).
Mechanistic studies have shown, as expected, that mutations of groups making specialized interactions with a phosphoryl oxygen atom (in the monoesterases) or with the ester group attached to the transferred phosphoryl group (in the diesterase) preferentially lower activity of the cognate reaction and thereby decrease specificity (6,36,40,45). In addition to the local active-site differences and very low sequence conservation within their structurally conserved cores (sequence identities: 6.0% for EcAP/NPP, 8.9% for EcAP/PafA, and 10% for NPP/PafA; supplemental Fig. 1), these enzymes contain different sets of added helices and sheets (Fig. 2, A and B, color-coded regions) (40).
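To make the pairwise sequence identities quoted above concrete, the following is a minimal sketch of how a percent identity can be computed from a pre-aligned pair of sequences. It assumes identity is counted over alignment columns in which neither sequence has a gap; other denominators are in common use, and the convention behind the values in supplemental Fig. 1 is not stated here. The function name `percent_identity` and the toy sequences are hypothetical and are not taken from EcAP, NPP, or PafA.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over columns where neither aligned sequence has a gap.

    Assumption: the two strings come from the same pairwise alignment and
    therefore have equal length, with `gap` marking inserted positions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0   # columns with a residue in both sequences
    identical = 0  # compared columns where the residues match
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue
        compared += 1
        if res_a == res_b:
            identical += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Toy, hypothetical alignment used only to exercise the function.
    toy_a = "MKT-AYLSG"
    toy_b = "MRTQAY-SA"
    print(f"{percent_identity(toy_a, toy_b):.1f}% identity")
```

Restricting the count to a structurally conserved core, as in the comparison above, would amount to passing only the aligned core columns to the function.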

Phosphate monoesterase and diesterase activities of the Zn²⁺ bimetallo scaffold of AP superfamily members
We first asked how effective the AP superfamily Zn²⁺ bimetallo center is in catalyzing phosphate monoester and diester hydrolysis when embedded in each of the three superfamily scaffolds after truncation of the active-site side chains particular to each active site (Fig. 2C, colored residues interacting with the substrate non-bridging oxygen atoms facing away from the Zn²⁺ bimetallo center). It has been suggested that the scaffold tunes the Zn²⁺ bimetallo site to favor monoesterase over diesterase activity (34), and the ample sequence differences throughout these scaffolds could cause these or other catalytic differences. Furthermore, whereas it is reasonable that the scaffolds make the same functional interactions in phosphate monoester and diester transition states, the charges of reactants and transition state structures for reactions vary, so there is no reason a priori to expect the same strength for each catalytic interaction and the same catalytic contribution; in other words, we cannot predict the relative rate constants for reactions with different substrates, even ones making the same atom-to-atom interactions.
We created pruned versions of each enzyme by subtractive mutagenesis, denoted throughout with asterisks (i.e. EcAP*, NPP*, and PafA*), by mutating the side chains directly contacting and within 4 Å of the substrate non-bridging oxygen atoms and, for NPP, side chains that interact with the methyl substituent of the me-pNPP used herein (Fig. 2C, color-coded side chains; Scheme 2) (45). Subtractive mutagenesis was used in all cases except for mutations of the nucleophile, T79S and T90S for PafA and NPP, respectively, which were made so that all three pruned enzymes contained the same nucleophile. The pruned constructs had mutations as follows: EcAP*: D101A, R166S, D153A, E322A, and K328A; NPP*: T90S (nucleophile), F91A, N111A, L123A, and Y205A; and PafA*: T79S (nucleophile), N100A, K162A, and R164A (Table 2). Prior studies showed that the only non-alanine mutant, R166S, gives kinetics similar to those of the corresponding alanine mutation (36), and for each enzyme, we created at least one additional mutated version to control for unintended consequences from the specific mutations made. For Glu-322, mutation to tyrosine was employed as prior studies showed equivalent effects for it and alanine and as the homologous residue in NPP is a tyrosine (Tyr-205) (6). The ratio between the rate constants for monoester and diester hydrolysis of each enzyme set fell within 3-fold of one another (Table 3), providing evidence against unintended or idiosyncratic effects.

Table footnote: Assay conditions used are as follows: 0.1 M sodium MOPS, pH 8.0, 0.5 M NaCl, 100 µM ZnCl₂, and 500 µM MgCl₂ at 25°C, unless noted otherwise. Errors (in parentheses) in activities are standard deviations from a duplicate with the same or independent enzyme preparations. ND, not determined.
We expected, most simply, that the bimetallo cores would not rearrange in the mutated enzymes, as numerous EcAP variants with mutations in these active-site residues exhibit no measurable change in the bimetallo center or in more remote parts of the structure (6, 36, 47-50). Nevertheless, as we mutated four or five side chains simultaneously in each active site, we tested for structural rearrangements.
For each enzyme, the wild-type (WT) and pruned forms gave CD spectra that were identical within error (supplemental Fig. 2). To obtain a more detailed view of possible structural rearrangements, crystals of PafA* and EcAP* were obtained, and their structures were determined to 2.03 and 2.45 Å resolution, respectively (Tables 4 and 5 and supplemental Fig. 3). The overall structures of WT EcAP and EcAP* were superimposed with a backbone r.m.s.d. of 0.42 Å over 801 atoms (Fig. 3A), and WT PafA and PafA* were superimposed with a backbone r.m.s.d. of 0.18 Å over 449 atoms (Fig. 3B). The conserved active-site Zn²⁺ ions and ligands were identical for the two sets of wild-type and pruned enzymes (Fig. 3C), with the only exception being the serine nucleophile of PafA* (T79S), which was not coordinated to Zn₂ and appeared to be interacting with another ion in the active site (supplemental Fig. 3). This difference presumably arose because the PafA* crystals were grown at pH 4.3, a pH at which the enzyme is not active and the serine residue is likely neutral (Fig. 3C; data not shown) (51).

Fig. 4A compares the phosphate monoesterase activity for the WT and pruned versions of each enzyme (gray and black bars, respectively). As expected, the pruned monoesterases (EcAP* and PafA*) are more diminished in monoester activity than the pruned diesterase (NPP*). Intriguingly, the monoesterase activities of the three pruned enzymes were within 5-fold of one another. Fig. 4B shows the corresponding comparisons for phosphate diesterase activity. As expected, converse effects on the two reactions are observed, i.e. the diesterase pruned construct (NPP*) gave the largest effect. Remarkably, as for the monoesterase reaction, all three enzymes had very similar diesterase activity, again within a 5-fold range.
The phosphate monoesterase and diesterase results are summarized together in Fig. 4C as the ratios of these activities. There is an enormous difference in the substrate preferences of the WT enzymes of ∼12 orders of magnitude, and the intrinsic discrimination is greater still (i.e. the chemical step is not rate-limiting in all cases) (Fig. 4C, gray bars) (40,51). In contrast, for the pruned constructs the monoesterase/diesterase ratios are uniform, all within 2-fold (Fig. 4C, black bars (near axis)). Furthermore, the pruned enzymes provide substantial rate enhancements of ∼10¹⁰–10¹¹-fold for the two reactions, indicating that catalysis from the bimetallo scaffold rivals or exceeds that of many fully evolved enzymes (Fig. 5) (2-4).
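The two quantities compared above, the monoesterase/diesterase ratio and the rate enhancement, can be sketched as simple ratios of rate constants. This is an illustration under stated assumptions, not the authors' analysis: it assumes the rate enhancement is the enzymatic kcat/Km divided by a second-order rate constant for the uncatalyzed reaction expressed in the same units, and every numerical value below is a hypothetical placeholder rather than a value from Tables 1-3. The function names are likewise introduced here for illustration.

```python
import math


def rate_enhancement(kcat_over_km: float, k_uncat: float) -> float:
    """Enzymatic rate constant divided by the uncatalyzed one (same units)."""
    if kcat_over_km <= 0 or k_uncat <= 0:
        raise ValueError("rate constants must be positive")
    return kcat_over_km / k_uncat


def specificity_ratio(mono_kcat_over_km: float, di_kcat_over_km: float) -> float:
    """Monoesterase/diesterase activity ratio for a single enzyme."""
    if mono_kcat_over_km <= 0 or di_kcat_over_km <= 0:
        raise ValueError("rate constants must be positive")
    return mono_kcat_over_km / di_kcat_over_km


def orders_of_magnitude(value: float) -> float:
    """Express a positive ratio on a log10 scale."""
    return math.log10(value)


if __name__ == "__main__":
    # Hypothetical placeholder values (M^-1 s^-1), NOT measured data.
    pruned_kcat_over_km = {"mono": 3.0, "di": 2.0}
    uncatalyzed = {"mono": 1e-11, "di": 1e-11}

    ratio = specificity_ratio(pruned_kcat_over_km["mono"], pruned_kcat_over_km["di"])
    print(f"mono/di ratio: {ratio:.2f}")
    for reaction, k_enz in pruned_kcat_over_km.items():
        enhancement = rate_enhancement(k_enz, uncatalyzed[reaction])
        print(f"{reaction}: ~10^{orders_of_magnitude(enhancement):.1f}-fold")
```

Working on a log10 scale keeps comparisons that span many orders of magnitude, such as WT versus pruned specificity, readable on a single axis.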
The similarity of the catalytic ratios for pruned enzymes derived from two different monoesterases and a diesterase strongly suggests that the bimetallo site is functionally conserved across the AP superfamily, despite the very low sequence identity of the surrounding scaffold and despite the different additional structural elements (Fig. 2, A and B) (33,39,40). Thus, specialization for the individual cognate reactions arises from the residues that have been pruned from each construct, and evolution has not independently tuned the common Zn²⁺ bimetallo center to favor phosphate monoester or diester hydrolysis. Most pertinent for understanding catalysis and evolution in the AP superfamily, the Zn²⁺ bimetallo center appears to be a potent and general enzyme, as elaborated below and under the "Discussion."

Enzymatic activity of the Zn²⁺ bimetallo scaffold as a potential evolutionary driver across the AP superfamily
Interestingly, the monoesterase/diesterase ratios for the pruned enzymes were not just similar, but were also near unity (Fig. 4C; Tables 2 and 3). To test whether this similarity was deep-seated or coincidental due to the particular phosphate ester leaving group used, we carried out analogous measurements with phosphate mono- and diesters across a range of intrinsic reactivities, using phosphate mono- and diesters with a 3-nitrophenoxide or 3,4-dinitrophenoxide instead of p-nitrophenoxide leaving group and thereby spanning a range of leaving group pKa values of three units and reactivity of ∼10³-fold (52-54) (see Footnote 4). Monoesterase/diesterase activity ratios near unity were observed for these substrates as well (supplemental Table 1). We do not know if this identity represents an equivalence of individual catalytic contributions for the different reactions or differential individual effects that coincidentally sum to the same value.
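As a rough reading of the range quoted above (a back-of-the-envelope estimate, not a fitted parameter from this study), a ∼10³-fold change in intrinsic reactivity across three leaving-group pKa units corresponds to an average Brønsted-type slope of about −1. The symbol β_lg and the assumption of a linear free energy relationship across these substrates are introduced here only for illustration; the sign is negative because reactivity increases as the leaving-group pKa decreases.

```latex
\beta_{\mathrm{lg}} \;\approx\; \frac{\Delta \log_{10} k}{\Delta \mathrm{p}K_a}
\;\approx\; \frac{-3}{3} \;=\; -1
```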
The similarity of the monoesterase and diesterase activities of the AP superfamily bimetallo scaffold suggests a model in which this promiscuous scaffold provided a strong physical and functional foundation that Nature built upon to make distinct, more efficient, and specialized monoesterases and diesterases. Physically, two of the three phosphoryl oxygen atoms face away from the bimetallo center, so that these atoms are accessible to the local active site and serve as functional handles that provide evolutionary opportunities to sculpt new interactions or to coopt existing residues for new purposes along the pathway from a generalist to specialist, from a monoesterase to diesterase, or vice versa. Functionally, the substantial ∼10¹⁰–10¹¹-fold rate enhancements from the bimetallo scaffold would provide a considerable head start toward the evolution of fully optimized monoesterases and diesterases or allow significant catalysis while progressing from one activity to the other (Fig. 5) (2-4).

Footnote 4: P. Babbitt and J. Lassila, unpublished results.
Table footnote: Assay conditions used are as follows: 0.1 M sodium MOPS, pH 8.0, 0.5 M NaCl, 100 µM ZnCl₂ at 25°C. Errors (in parentheses) in activities are standard deviations from a duplicate with the same or independent enzyme preparations.
Table 3. Rate constants for alternative pruned constructs of EcAP, NPP, and PafA (a)
a Assay conditions used are as follows: 100 mM sodium MOPS, pH 8.0, 500 mM NaCl, 100 µM ZnCl₂ at 25°C. Errors (in parentheses) in activities are standard deviations from a duplicate with the same or independent enzyme preparations.
b Rate constants from Ref. 47.
c Data were calculated as the ratio (kcat/Km)[pNPP, pruned] / (kcat/Km)[pNPP, alt. pruned].
d Data were calculated as the ratio (kcat/Km)[me-pNPP, pruned] / (kcat/Km)[me-pNPP, alt. pruned].
The observation of different interactions in the EcAP and PafA monoesterase active sites (Fig. 2C, colored residues) indicates an evolutionary divergence, perhaps to provide distinct catalytic properties best suited to the growth conditions of each parent organism (40,42,55).

Weak sulfatase activity of the Zn²⁺ bimetallo scaffold may have restricted evolutionary diversification within this AP superfamily branch
Highly promiscuous phosphate monoesterase and diesterase activities of the ancient Zn²⁺ bimetallo core may have promoted the evolution of AP superfamily phosphate monoesterases and diesterases. Conversely, activities not found in extant AP superfamily bimetallo enzymes may be absent because they are not efficiently catalyzed by this motif. The AP superfamily sulfatases provide a particularly incisive test of this prediction, as the sulfatase and phosphatase substrates and transition states are highly analogous geometrically (56,57), but sulfatases are not found in the bimetallo branch, falling instead solely in the one-metal-ion active-site subgroup (38, 43, 58-60).
The sulfatase activities of our bimetallo scaffolds were extremely low, so low that only upper limits could be determined (Fig. 5, pNPS; Table 2). Our most sensitive measurements revealed that the bimetallo scaffold is at least 10⁵-fold less effective as a sulfatase than as a phosphate monoesterase or diesterase. Furthermore, the very low intrinsic reactivity of sulfate esters necessitates highly efficient catalysis (16,61). Thus, our results support a mechanistic mandate for the absence of bimetallo cores among sulfatases. Intriguingly, the AP superfamily appears to have overcome the limitation of the bimetallo scaffold catalysis by utilizing only one metal ion for sulfatases, instead of two. Although the ancient origin of the AP superfamily members prevents us from tracing possible ancestral sulfatases and possible pathways for this conversion (see Footnote 3), our results reveal a fundamental underlying mechanistic property that Nature faced as it navigated these evolutionary processes.

Phosphotriesterase activity of the Zn²⁺ bimetallo scaffold?
As all known cognate phosphate triesterases exist outside of the AP superfamily (62), we wondered whether the AP superfamily bimetallo scaffold is also less efficient as a triesterase. Indeed, the triesterase rate enhancements were ∼10³–10⁴-fold lower than those for reactions of phosphate diesters and monoesters (Fig. 5, paraoxon; Table 2 and supplemental Fig. 4). The lower catalysis could arise from differences in substrate charge or transition state charge distribution, due to the tighter triester transition state structures, although such differences did not manifest for the mono- and diesters despite their charge and transition state differences (44).
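The comparison above can be checked on a log scale. This is only order-of-magnitude bookkeeping using the ranges quoted in the text; RE denotes the rate enhancement, a symbol introduced here for convenience.

```latex
\log_{10} \mathrm{RE}_{\text{triester}}
\;\approx\; \log_{10} \mathrm{RE}_{\text{mono/diester}} - (3\ \text{to}\ 4)
\;\approx\; (10\ \text{to}\ 11) - (3\ \text{to}\ 4)
\;\approx\; 6\ \text{to}\ 8
```

Treating the two ranges as independent gives 6 to 8 log units, which brackets the 10⁷–10⁸-fold phosphotriesterase rate enhancements stated in the abstract.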
The bimetallo scaffold is at least 2-3 orders of magnitude more effective at catalyzing the phosphate triester reaction than the sulfatase reaction, and the greater intrinsic reactivity of phosphate triesters than sulfate esters (63) would render it more probable for Nature to be able to take advantage of this catalytic promiscuity. Nevertheless, there is an apparent absence of an AP superfamily triesterase, and this may arise from the lower catalytic effectiveness of the bimetallo center for this reaction relative to phosphate mono-and diester reactions. However, it may also, or instead, arise from the rarity of phosphate triesters in biology and/or from factors intrinsic to the AP superfamily structural scaffold that render it difficult to sculpt binding sites for both substituents of the transferred phosphoryl group, as elaborated under "Discussion."

Discussion
The probability of de novo evolution of proteins that adopt specific structures is extremely low, and the probability that such proteins exhibit substantial and specific enzymatic activities is lower still. But once a structural motif evolves, gene duplication allows it to be put to multiple uses, and promiscuous catalytic activity greatly increases the probability that evolution will be able to find, utilize, and optimize a potentially beneficial activity (3-5, 64). These factors have presumably led to the preponderance of enzymes with shared folds and shared catalytic features (3,10,11).
We have posed two fundamental questions at the interface of catalytic function and evolution, using the Zn²⁺ bimetallo scaffold of the AP superfamily. 1) Does the superfamily bimetallo core intrinsically favor certain reactions over others and, if so, are these preferences mirrored in evolutionary history? 2) Did evolution, subsequent to divergence in the AP superfamily, "tune" individual bimetallo cores and scaffolds to favor cognate over non-cognate reactions?
We addressed the second question by determining the activity of three AP superfamily Zn²⁺ bimetallo scaffolds lacking the specialized interactions specific to each subset of superfamily members. The AP superfamily members studied herein, Escherichia coli alkaline phosphatase (EcAP), an evolutionarily distinct alkaline phosphatase from C. meningosepticum (PafA), and a phosphate diesterase from X. citri (NPP), have very low overall sequence identity and distinct architectural differences within their common bimetallo scaffolds (Fig. 2, A and B, gray and white; supplemental Fig. 1). Nevertheless, these enzymes, with a small set of divergent active-site side chains pruned away (Fig. 2C, color-coded), have highly similar phosphate monoesterase and phosphate diesterase activities, each within a 5-fold range (Figs. 4, A and B, and 5). This observation is consistent with the common Zn²⁺-Zn²⁺ distance observed for AP superfamily members that catalyze hydrolysis of phosphate mono- and diesters (65), and these similarities suggest that electrostatic or geometric effects rooted in the Zn²⁺ bimetallo site did not specialize the active-site Zn²⁺ bimetallo center for cognate over non-cognate reactions. Rather, active-site features other than the bimetallo core are responsible for monoester versus diester specialization. As illustrated in Figs. 2 and 6, the bimetallo core provides a scaffold for these specializing auxiliary domains and side chains (6,40,45).
With this information in hand, we turn to the broad question of the interplay of catalytic promiscuity and the selection and optimization of enzymes through evolution. We found that the rate enhancements of our pruned enzymes for both phosphate monoester and diester hydrolysis are substantial, 10¹⁰–10¹¹-fold (Fig. 5), rivaling or exceeding the catalytic power of many natural enzymes (2). Thus, an AP superfamily scaffold with the Zn²⁺ bimetallo center alone is a highly effective catalyst that could have provided an effective starting point for the evolution of the extant enzymes that specialize in each of these reactions, as depicted by the central generalist "bimetallo core" in Fig. 6 (see also simplified version of Fig. 6 in supplemental Fig. 6). Although the ancient origin of the alkaline phosphatase superfamily prevents tracing its evolutionary history, the generalist ability of the bimetallo core seems likely to have contributed to its use in both phosphate mono- and diesterases, whether these activities arose from a common generalist or from one another.
In contrast to this phosphate mono-/di-esterase generalist activity, the pruned enzymes are many orders of magnitude less effective in catalyzing reactions of sulfate esters (Fig. 5). Intriguingly, sulfatase reactions are catalyzed exclusively by enzymes on a distinct one-metal-ion branch of the AP superfamily (38, 43, 58-60). These observations suggest a mechanistic imperative for the absence of AP superfamily bimetallo sulfatases and a functional driving force for the formation of a distinct sulfatase superfamily branch with just one divalent metal ion (Fig. 6, orange) (59). Although the mechanistic origin of the inability of the bimetallo centers to catalyze sulfuryl transfer remains to be determined, linear free energy comparisons of EcAP reactions reveal a strong dependence of catalytic power on substrate charge (56). Thus, to achieve sulfatase activity, the alkaline phosphatase superfamily may have had to undergo a more involved remodeling of its core, in addition to taking on auxiliary domains (38). We speculate that this extensive remodeling might have been evolutionarily allowed if metal ligand mutations led to loss of the second metal ion and use of one or more of the freed ligands to aid sulfatase catalysis. Alternatively, the sulfatases may have arisen from a highly evolved AP superfamily phosphatase, using the catalytic interactions beyond the bimetallo core to provide modest and selectable sulfate ester catalysis, while the second metal site was repurposed. It is also possible that the phosphatases arose from the addition of a second metal ion site to a sulfatase precursor.

Figure caption (beginning truncated): … Table 1 and Refs. 40, 47, 51, 86). The values for the pruned enzymes are shown in black (Table 2). B, values of kcat/Km for hydrolysis of minimal diester substrate me-pNPP. For WT, the value in light gray is for the full cognate diester substrate, dT5′-pNPP (Table 1). The values for the pruned enzymes with the minimal diester are shown in black. C, specificity of the enzymes as ratio of reaction rates of monoester and diester hydrolysis, for WT (gray) and pruned enzymes (black). For the WT enzymes, the gray bars are obtained from the corresponding bars in A and B for mono- and diesters, respectively. The values for the pruned enzymes are near unity and therefore difficult to see. Ratios for the reactions are given in supplemental Table 3.
We also determined the promiscuous phosphotriesterase activity of the AP superfamily bimetallo scaffold, a biological reaction that is catalyzed by members of the amidohydrolase superfamily (66–68) but is not known to be catalyzed within the AP superfamily. Catalysis by the pruned AP superfamily enzymes is substantial and considerably greater than that for a sulfate ester, although still ∼10^3-fold less than that for phosphate monoesters and diesters (Fig. 5).
Figure 6. Building catalysis and specificity from the AP superfamily bimetallo core. The bimetallo core (gray, hypothetical core extracted from NPP (2GSN) (34)) is a phosphate monoesterase and diesterase but a poor sulfatase and a less efficient triesterase. This low catalytic promiscuity may have led to the loss of one of the metal ion sites in proceeding to the AP superfamily sulfatases (orange). PafA and NPP, despite carrying out phosphate monoesterase and diesterase reactions, respectively, share common "Asn auxiliary domains" (magenta) (40). AP superfamily phosphate monoesterases, such as EcAP and PafA, contain a common phosphatase helix (light blue), whereas this helix is absent in diesterases, and the resultant cavity contains residues involved in diester substrate binding (diester pocket, red) (40). The AP superfamily lacks phosphate triesterases, and the interaction of the nucleophile backbone amide with one of the monoester or diester phosphoryl oxygen atoms (Fig. 2D) may render the AP superfamily scaffold difficult to evolve or reengineer as a triesterase. The rate for each enzyme construct is shown graphically next to the structures with phosphomonoesterase in blue, phosphodiesterase in red, sulfate esterase in orange, and phosphotriesterase in brown. The values for WT EcAP, PafA, NPP, and EcAP* are from Tables 1 and 2. The values for the Asn core are from Sunden et al. (40). The PAS kinetic constants are from Babtie et al. (63), and those for PTE from the amidohydrolase superfamily are from Mohamed et al. (46). The schematic presents possible evolutionary relationships, but the ancient origin of the AP superfamily prevents evolutionary ordering.
So why did Nature choose structural and catalytic motifs other than those of the AP superfamily to catalyze phosphate triester reactions (62, 66, 69)? The simplest models are probabilistic, especially given the small number of phosphotriesterases found in Nature (70). When the selective advantage for a phosphotriesterase arose, random duplications may have occurred for members of the amidohydrolase superfamily, or duplications in that family may have had higher promiscuity and thus been favored early in the evolution of these new enzymes. However, there is another possibility, rooted in the structural architecture of the AP superfamily. Below we describe how the architecture has been used and remodeled through the evolution of phosphate mono- and diesterases, and we then return to how architectural features may have limited the ability to evolve an AP superfamily triesterase.
Remarkably, the monoesterase PafA and diesterase NPP share a common auxiliary domain that is distinct from any in EcAP, despite PafA and EcAP both functioning as monoesterases (Fig. 6, Asn auxiliary domain, magenta) (40, 41). This element is inserted in a common position in the bimetallo core and is structurally homologous (Fig. 2A). Its role appears to be to position the asparagine residue (Fig. 2C, N111 and N100 in NPP and PafA, respectively) that interacts with one of the phosphoryl oxygen atoms that faces away from the bimetallo core (Fig. 2D, O1). This element provides an equivalent mono- and di-esterase enhancement, consistent with the interaction with an unesterified oxygen atom for mono- and diester substrates (40). Architecturally, there is enough room, and there are sufficient interactions to build this auxiliary domain onto the bimetallo core to enhance catalysis.
How, then, are monoester and diester substrates distinguished? As highlighted in Fig. 6, PafA, EcAP, and all known alkaline phosphatase superfamily monoesterases contain a common active-site α-helix (Fig. 6, EcAP and PafA, "phosphatase helix," light blue) (40). This helix appears to position active-site residues that interact with the other outward-facing phosphoryl oxygen atom (Fig. 2D, O2). However, the phosphatase helix sits in the area where the second phosphodiester substituent is situated, and it is absent in all known diesterases (40). Instead, residues that line the region exposed by the absence of the phosphatase helix are used to create a binding pocket for this substituent (Fig. 6, NPP, diester pocket, red), and different NPP subfamilies utilize different residues in this area and in nearby auxiliary domains to achieve distinct substituent specificities (34, 71, 72).
The different sets of auxiliary domains in EcAP, PafA, and NPP highlight the versatility of the bimetallo core as a scaffold onto which additional domains and elements can be built (Fig. 6). What about the potential to build an alkaline phosphatase superfamily active site that is optimal for a phosphate triester substrate? Stereochemical studies indicate that the second diester substituent has a strong preference for the O2 position (Fig. 2D) in diesterase reactions catalyzed by EcAP, i.e. in the same position it occupies in NPP reactions, despite the presence of the phosphatase helix and absence of a substituent binding pocket (37). This observation implies inhibition of positioning of the substituent in the opposite position. Indeed, inspection of active-site interactions of EcAP and PafA reveals that the backbone amide of the serine or threonine nucleophile donates a hydrogen bond to this oxygen atom (O1, Figs. 2D and 6).
Given that this backbone interaction is located at the site of nucleophilic attack, we speculate that resculpting an AP superfamily member to replace this interaction and accommodate an additional triester substituent would be difficult, and this difficulty may account for the absence of known AP superfamily triesterases (Fig. 6, brown X). These observations underscore the role of idiosyncratic structural and chemical properties of each protein scaffold in determining evolvability for new enzymatic functions (73–77). Analogously, these features will determine which scaffolds can be efficiently engineered to carry out new reactions effectively.
Overall, our results suggest a profound connection between catalytic mechanism and evolution and support a prominent role for catalytic promiscuity in deciding which enzymes and scaffolds were adopted for particular catalytic roles and how they were adapted to take on new catalytic roles throughout evolution (Fig. 6). Furthermore, our results, combined with prior mechanistic and structural studies, provide perspective on how AP superfamily structural elements and domains may have been used to enhance and specialize catalysis (Fig. 6). Once a scaffold like the AP superfamily bimetallo core is in place, its further evolutionary elaboration will depend on what reactions are "needed," which reactions can be catalyzed by that scaffold, and how readily that scaffold can accommodate new structural and functional elements. The insights herein and elsewhere into the structural and mechanistic properties of AP and other superfamilies will help guide efforts to select enzymes, redesign existing ones, and design new ones de novo for practical applications in chemistry, medicine, and engineering.

Protein expression and purification
Mutants of AP from E. coli (EcAP), NPP from Xanthomonas axonopodis pv. citri, and PafA from C. meningosepticum were expressed in E. coli SM547 (DE3) cells. Cells were grown to an optical density of 0.6–0.8 in rich medium and glucose (10 g of tryptone, 5 g of yeast extract, 5 g of NaCl, and 2 g of glucose per liter) with 50 μg/ml carbenicillin at 37 °C. Protein expression was induced by adding 0.3 mM isopropyl thiogalactopyranoside. Cultures were grown at 30 °C for 16–20 h and were harvested by centrifugation.
EcAP was purified from a fusion pMAL-p2X construct containing an N-terminal maltose-binding protein (MBP), with a Factor Xa cleavage site between it and the natural N-terminal end, and a C-terminal StrepII tag, as described previously (6, 47). Briefly, pelleted cells with EcAP were lysed by osmotic shock, and the protein was purified over amylose resin. The resin was washed extensively, and the protein was eluted with 10 mM maltose. The enzyme was buffer-exchanged into storage buffer (10 mM sodium MOPS, pH 8.0, 50 mM NaCl, and 100 μM ZnCl2). Constructs with and without the MBP tag used for purification for NPP* and EcAP* were shown to have the same activity, as did PafA* with and without the Strep tag.
NPP was purified from a pMAL-p2X fusion construct similar to that used for EcAP, containing an N-terminal MBP and a C-terminal StrepII tag, with a Factor Xa cleavage site between the MBP and the natural N-terminal end, as described previously (45). Briefly, cells containing NPP were lysed by osmotic shock, and the lysate was run over a Q-Sepharose column. After extensive washing, the protein was eluted, and protein-containing fractions were pooled. StrepTactin resin (5 ml, IBA Life Sciences, Göttingen, Germany) was washed extensively with 0.5 M sodium hydroxide and was packed onto a gravity column. Following neutralization of the resin with 1 M Tris-HCl, pH 8.0, and washing with several column volumes of column buffer (100 mM Tris-HCl, pH 8.0, and 150 mM NaCl), the pooled enzyme fractions were loaded over the resin. After the resin was washed with column buffer, the enzyme was eluted with 5 mM desthiobiotin in the same buffer and buffer-exchanged into storage buffer (100 mM Tris-HCl, pH 8.0, 150 mM NaCl, and 100 μM ZnCl2).
PafA was purified from a fusion pET-22b construct containing a C-terminal StrepII tag. The cell pellets were resuspended in column buffer (100 mM Tris-HCl, pH 8.0, and 150 mM NaCl) and either frozen for later purification or lysed by passing the suspension through an Emulsiflex (Avestin, Ottawa, Ontario, Canada) three times. The lysate was centrifuged to remove cell debris (20,000 × g for 20 min), and the supernatant was filtered through a 0.45-μm filter. StrepTactin resin (5 ml, IBA Life Sciences) was washed with sodium hydroxide as described above, and after neutralization of the resin, the filtrate was loaded over the resin. The resin was washed with 6 column volumes of column buffer, and the protein was eluted with 5 mM desthiobiotin in column buffer. Fractions containing purified PafA were pooled and buffer-exchanged into storage buffer (10 mM sodium MOPS, pH 8.0, 50 mM NaCl, and 100 μM ZnCl2). Protein purity was assessed with SDS-PAGE and was estimated to be >95% for all enzymes by staining with Coomassie Blue.
Aryl methyl diesters were prepared using the method originally reported by Ba-Saif et al. (78), using commercially available dimethyl chlorophosphate (me-3,4-diNPP and me-3-NPP, respectively). After demethylation, the acid form of the methyl diester was isolated by resuspending the lithium salt in 5 M HCl and extracting into diethyl ether. This method resulted in ∼1% contamination with aryl monoester. The material was further purified over a RediSep Rf Gold C18 (Teledyne Isco, Lincoln, NE) column equilibrated with 100 mM triethylammonium chloride, pH 5.4. Gradient chromatography was run with this buffer and a 0–40% acetonitrile gradient at 2% per min. Fractions containing the aryl methyl diester were pooled and dried by rotary evaporation, and the acid form of the aryl methyl diester was isolated as described above. Using the isolated acid form, concentrations for kinetic assays were determined gravimetrically and by absorbance of the leaving group after total hydrolysis, and values agreed within 20% in all cases.

Kinetic assays
Activity measurements were performed at 25 °C in a UV/visible Lambda 25 spectrophotometer (PerkinElmer Life Sciences) in 0.1 M sodium MOPS, pH 8.0, 0.5 M NaCl, 100 μM ZnCl2, unless otherwise noted. For substrates with a p-nitrophenolate (Scheme 2) or 3,4-dinitrophenolate leaving group, the formation of free p-nitrophenolate or 3,4-dinitrophenolate from hydrolysis of the substrates was monitored continuously at 400 nm. For substrates with a 3-nitrophenolate leaving group, the formation of free 3-nitrophenolate was monitored continuously at 394 nm. Rate constants were determined from initial rates, and the activity of the free enzyme, kcat/Km, was determined. The reactions were shown to be first-order in both enzyme and substrate concentration, with the concentration of each varied over at least a 5-fold range. Linear fits had R2 values of >0.98 in all cases. Reported errors were estimated from at least two independent kinetic measurements. Comparisons with independent enzyme preparations for each mutant gave values within error.
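As an illustration of the analysis described above, the following minimal Python sketch estimates kcat/Km from initial rates, assuming subsaturating substrate so that v0 = (kcat/Km)[E][S]. The function name, the least-squares fit through the origin, and all numerical values are illustrative assumptions and are not the fitting procedure or data used in this work.

```python
def fit_kcat_over_km(substrate_M, initial_rates_M_per_s, enzyme_M):
    """Estimate kcat/Km (M^-1 s^-1) as the slope of v0/[E] versus [S].

    Assumes [S] << Km, so v0 = (kcat/Km) * [E] * [S], and fits a line
    through the origin by least squares. Returns (slope, R^2).
    """
    y = [v / enzyme_M for v in initial_rates_M_per_s]
    sxx = sum(s * s for s in substrate_M)
    sxy = sum(s * yi for s, yi in zip(substrate_M, y))
    slope = sxy / sxx
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - slope * s) ** 2 for s, yi in zip(substrate_M, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, r_squared


# Hypothetical, noise-free data: substrate varied over a 5-fold range.
substrate = [20e-6, 40e-6, 60e-6, 80e-6, 100e-6]      # M
enzyme = 1e-6                                         # M
assumed_kcat_over_km = 150.0                          # M^-1 s^-1, made up
rates = [assumed_kcat_over_km * enzyme * s for s in substrate]  # M s^-1

kcat_over_km, r2 = fit_kcat_over_km(substrate, rates, enzyme)
print(f"kcat/Km = {kcat_over_km:.1f} M^-1 s^-1, R^2 = {r2:.3f}")
```

With real data, repeating the fit at several enzyme concentrations would check the first-order dependence on enzyme as well as on substrate.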
The following buffers were used in pH dependences that established that kinetics were followed in a pH-independent region for each of the pruned enzymes: sodium MES, pH 5.5 and 6.0; sodium MOPS, pH 7.0; sodium CHES, pH 9.0; sodium CAPS, pH 10, each at 100 mM and in the presence of 500 mM NaCl and 100 μM ZnCl2 (supplemental Fig. 4). The pH dependence for pNPP and me-pNPP is similar in each enzyme, suggesting that the dianionic form of pNPP is reacting and not the monoanionic species. The pH dependence for pNPP and me-pNPP for EcAP* is also similar to the paraoxon pH dependence, suggesting that paraoxon is hydrolyzed in the same active site.
As the mutant activities are orders of magnitude lower than those for the wild-type enzymes with their cognate substrates, we carried out inhibition experiments to ensure that the measured activities represent the pruned enzymes and not a trace contaminant and that phosphate monoester and diester hydrolysis reactions occurred at the same active site for each enzyme. Inhibition experiments with inorganic phosphate (Pi) were carried out with subsaturating amounts of substrate. The Pi concentration range was 0.1–100 mM for EcAP, 0.5–100 mM for NPP, and 0.13–100 mM for PafA (supplemental Fig. 5). The observation of Ki values that are the same for the monoester and diester substrates and distinct from the Ki values for the wild-type enzymes suggests that the same enzyme is catalyzing both reactions in each case and that this enzyme is not a contaminating wild-type enzyme. To further test the origin of the observed catalysis, we purified pruned versions of each enzyme with the nucleophilic Ser or Thr residue mutated to Gly: EcAP, S102G/D153A/R166S/E322Y/K328A; NPP, T90G/F91A/N111A/L123A/Y205A; and PafA, T79G/N100A/K162A/R164A. In each case, the observed reaction rate was decreased by 1–2 orders of magnitude relative to the pruned enzymes, and no reaction was observed above the non-enzymatic background hydrolysis, strongly suggesting that the observed activities of the pruned constructs arose from the pruned enzymes studied herein.
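The Ki comparison described above can be sketched as follows, assuming simple competitive inhibition with subsaturating substrate, for which (kcat/Km)obs = (kcat/Km)0/(1 + [Pi]/Ki). The inhibition model, the linearized fit, the function name, and the numbers are assumptions chosen for illustration and are not the fitting procedure or data from this work.

```python
def fit_ki(inhibitor_M, kobs):
    """Estimate Ki (M) from observed kcat/Km values at several [Pi].

    Assumes competitive inhibition with [S] << Km:
        kobs = k0 / (1 + [I]/Ki)
    so 1/kobs is linear in [I] with intercept 1/k0 and slope 1/(k0*Ki).
    Returns (k0, Ki) from an ordinary least-squares line.
    """
    n = len(inhibitor_M)
    y = [1.0 / k for k in kobs]
    mean_x = sum(inhibitor_M) / n
    mean_y = sum(y) / n
    sxx = sum((x - mean_x) ** 2 for x in inhibitor_M)
    sxy = sum((x - mean_x) * (yi - mean_y) for x, yi in zip(inhibitor_M, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    k0 = 1.0 / intercept
    ki = intercept / slope
    return k0, ki


# Hypothetical, noise-free data for one substrate (values are made up).
assumed_k0 = 150.0      # M^-1 s^-1, uninhibited kcat/Km
assumed_ki = 2e-3       # M
phosphate = [0.1e-3, 1e-3, 10e-3, 100e-3]   # M
observed = [assumed_k0 / (1.0 + p / assumed_ki) for p in phosphate]

k0_fit, ki_fit = fit_ki(phosphate, observed)
print(f"k0 = {k0_fit:.1f} M^-1 s^-1, Ki = {ki_fit * 1e3:.2f} mM")
```

Running the same fit on monoester and diester data and comparing the two Ki values mirrors the logic of the test above; for noisy measurements a nonlinear fit would usually be preferable to this reciprocal linearization, which weights points unevenly.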

Metal occupancy
To test for metal ion concentration-dependent activation of expressed enzyme, PafA, NPP, and EcAP were incubated with the following metal concentrations: 10 μM ZnCl2, 100 μM ZnCl2, 500 μM ZnCl2, 1.0 mM MgCl2, and 100 μM ZnCl2 in 10 mM sodium MOPS, pH 8.0, and 50 mM NaCl at 25 °C. No activation was observed for either PafA or NPP. The EcAP-pruned variant containing the E322Y mutation displayed activation behavior, with a half-time for activation of ∼4 days, whereas the E322A-containing mutant maintained constant activity. These results mirrored the activation previously observed in the otherwise wild-type context (6). The E322Y and E322A AP* constructs had activities within 2-fold after activation. The occupancy of metal ions in the active site of all of the pruned enzymes and WT enzymes was determined with atomic emission spectroscopy (supplemental Table 2), as described previously (6, 47), following overnight incubation in storage buffer with 100 μM ZnCl2 at room temperature.

Circular dichroism (CD)
CD spectra were collected for the WT and pruned enzymes, each with their MBP tags removed, in 10 mM sodium MOPS, pH 8.0, 50 mM NaCl, 100 μM ZnCl2. The following enzyme concentrations were used: EcAP WT and pruned, 1.4 μM; NPP WT and pruned, 1.0 μM; and PafA WT and pruned, 1.4 μM.

Crystallization and structure determination of PafA* and EcAP*
PafA* was buffer-exchanged into 10 mM Tris-HCl, pH 8.0, 50 mM NaCl, 100 μM ZnCl2 and concentrated to 2.5 mg/ml. Equal volumes of PafA and precipitant solution (22% PEG3350, 0.1 M sodium acetate, pH 4.4, 0.2 M ammonium sulfate) were mixed and placed over a reservoir of 1 ml of precipitant solution to crystallize by the hanging drop method. Crystals were harvested and frozen in liquid nitrogen without cryoprotectant, and crystallographic data were collected at the Stanford Linear Accelerator at beamline 11-1, using the Pilatus 6M detector (Dectris) with thin slice oscillations (0.2°) for a total 240° rotation. Diffraction images were processed with iMosflm and scaled with the AIMLESS and POINTLESS programs from the CCP4 suite of software (79). The dataset was complete (99.9%) and redundant (8.5-fold) to 2.03 Å resolution, with good merging statistics, which are summarized in Table 4. Dataset resolution was kept at the limit of the detector (2.03 Å), because strong merging statistics, high completeness, and high redundancy persisted into the highest resolution bin (Table 4).
The structure was solved by molecular replacement using PHENIX (80) with the WT PafA structure (PDB code 5TJ3 (40)) as a search model. Prior to molecular replacement, we modified the search model by removing all solvent and metal atoms, side chains surrounding the two active-site metal sites, and the side chains of the mutated residues (T79S, N100A, K162A, and R164A). The molecular replacement solution showed strong electron density peaks for the missing active-site side chains and the metal ions, a lack of density for the residues mutated to alanines, and a clear change in the position of the T79S loop. Additionally, a strong positive density peak was found coordinated between the two active-site Zn2+ ions and Ser-79, which we interpreted as Cl−. After several rounds of refinement with phenix.refine (80) and manual adjustments with Coot (81), the final model had excellent geometry and stereochemistry, with Rwork = 17.0% and Rfree = 20.9% (Table 4).
EcAP* was buffer-exchanged into 10 mM sodium MOPS, pH 7.0, 50 mM NaCl, 100 μM ZnCl2 and concentrated to 5 mg/ml. Equal volumes of EcAP* and precipitant solution (23% PEG3350, 0.2 M NH4F, 0.2 M sodium HEPES, pH 8.0) were mixed and placed over a reservoir of 1 ml of precipitant solution to crystallize by the hanging drop method. Crystals were harvested and frozen in liquid nitrogen with paratone-N oil as cryoprotectant. Crystallographic data were collected at the Stanford Linear Accelerator at beamline 12-2 using the Pilatus 6M PAD detector with thin slice oscillations and shutterless mode for a 360° rotation, resulting in a highly redundant 2.45 Å resolution dataset containing anomalous signal from zinc, despite the energy offset compared with the zinc peak energy (12658 eV versus 9668 eV). Diffraction images were processed with the AUTOXDS script (Ana Gonzalez, SSRL) using the current XDS version and the AIMLESS and POINTLESS programs from the CCP4 package (79). Anomalous signal was kept during the processing. The dataset resolution cutoff was chosen based on the high completeness, >35-fold multiplicity, and >30% CC1/2, as suggested by the scaling program AIMLESS.
The structure was solved by molecular replacement with a single monomer (subunit A) from the 3TG0 PDB entry in the PHASER program (82) that produced a single solution in space group P6322 (no. 182), with two polypeptide chains in the unit cell. The molecular replacement solution showed negative peaks in the Fo − Fc map for the mutated side chains (D153A, E322A, and K328A) or a lack of side-chain density in the 2Fo − Fc map for D101A and R166S. The following features were observed and incorporated into the structural model: (a) a well-populated metal site ligated by His-162 and Glu-164; (b) a less-occupied metal site ligated by His-125 and Asp-304; (c) the StrepII tag at the C terminus providing a histidine ligand (His-452) to the metal-binding site (a). Anomalous difference maps using the refined model suggest that all metal-binding sites have Zn2+ ions incorporated. The model was updated and checked in COOT (81), while being refined initially with the maximum likelihood program REFMAC (83) and finally with BUSTER (version 2.10.2 (84)). The final model had excellent geometry and stereochemistry, in the top 1% distribution for this resolution range in MolProbity (85), and low residual factor values: R = 16.46% and Rfree = 21.90% (Table 5), justifying the inclusion of the weaker data in improving the resulting model.
Author contributions: F. S. and I. A. expressed and purified enzymes, performed all characterization of the enzymes, crystallized the enzymes, and mounted them for X-ray studies; T. D. and A. L. solved the crystal structures for PafA* and EcAP*; J. S. synthesized the substrates with alternative leaving groups; D. H. and F. S. designed the study, analyzed the data, and wrote the manuscript with contributions from all authors.