Mechanistic basis for the evolution of chalcone synthase catalytic cysteine reactivity in land plants

Flavonoids are important polyphenolic natural products, ubiquitous in land plants, that play diverse functions in plants' survival in their ecological niches, including UV protection, pigmentation for attracting pollinators, symbiotic nitrogen fixation, and defense against herbivores. Chalcone synthase (CHS) catalyzes the first committed step in plant flavonoid biosynthesis and is highly conserved in all land plants. In several previously reported crystal structures of CHSs from flowering plants, the catalytic cysteine is oxidized to sulfinic acid, indicating enhanced nucleophilicity in this residue associated with its increased susceptibility to oxidation. In this study, we report a set of new crystal structures of CHSs representing all five major lineages of land plants (bryophytes, lycophytes, monilophytes, gymnosperms, and angiosperms), spanning 500 million years of evolution. We reveal that the structures of CHS from a lycophyte and a moss species preserve the catalytic cysteine in a reduced state, in contrast to the cysteine sulfinic acid seen in all euphyllophyte CHS structures. In vivo complementation, in vitro biochemical and mutagenesis analyses, and molecular dynamics simulations identified a set of residues that differ between basal-plant and euphyllophyte CHSs and modulate catalytic cysteine reactivity. We propose that the CHS active-site environment has evolved in euphyllophytes to further enhance the nucleophilicity of the catalytic cysteine since the divergence of euphyllophytes from other vascular plant lineages 400 million years ago. These changes in CHS could have contributed to the diversification of flavonoid biosynthesis in euphyllophytes, which in turn contributed to their dominance in terrestrial ecosystems.

In their transition from aquatic domains to terrestrial environments, early land plants faced several major challenges, including exposure to damaging UV-B radiation once screened by aquatic environments, lack of structural support once provided by buoyancy in water, drought, and novel pathogens and herbivores. To cope with many of these stresses, land plants have evolved a series of specialized metabolic pathways, among which phenylpropanoid metabolism was probably one of the most critical soon after the transition from water to land (1).
Flavonoids are a diverse class of plant phenolic compounds found in all extant land plants, with important roles in many aspects of plant life, including UV protection, pigmentation for attracting pollinators and seed dispersers, defense, and signaling between plants and microbes (2). Some flavonoids are also of great interest for their anti-cancer and antioxidant activities as well as other potential health benefits to humans (3). After the core flavonoid biosynthetic pathway was established in early land plants, new branches of the pathway continued to evolve over the history of plant evolution, producing structurally and functionally diverse flavonoids to cope with changing habitats, co-evolving pathogens and herbivores, and other aspects of plants' ecological niches. Basal bryophytes biosynthesize the three main classes of flavonoids, namely flavanones, flavones, and flavonols, which likely emerged as UV sunscreens (4). The lycophyte Selaginella biosynthesizes a rich diversity of biflavonoids, many of which were shown to be cytotoxic and may function as phytoalexins (5). The ability to synthesize the astringent, polyphenolic tannins, which defend against bacterial and fungal pathogens, seems to have evolved in euphyllophytes (4). Finally, seed plants, including gymnosperms and angiosperms, developed elaborate anthocyanin biosynthetic pathways to produce the vivid colors used to attract pollinators or ward off herbivores.
Chalcone synthase (CHS), a highly conserved plant type III polyketide synthase (PKS), is the first committed enzyme in the plant flavonoid biosynthetic pathway. CHS synthesizes naringenin chalcone from one molecule of p-coumaroyl-CoA and three molecules of malonyl-CoA (Fig. 1A) (6). The proposed catalytic mechanism of CHS involves loading of the starter molecule p-coumaroyl-CoA onto the catalytic cysteine, which also serves as the attachment site of the growing polyketide chain during the iterative elongation steps (7). This initial reaction step requires the cysteine to be present as a thiolate anion before loading of the starter molecule (Fig. 1B). Using thiol-specific inactivation and the pH dependence of the malonyl-CoA decarboxylation reaction, the pKa of the catalytic cysteine (Cys-164) of Medicago sativa CHS (MsCHS) was measured to be 5.5, a value significantly lower than the 8.7 of free cysteine (8).
Interestingly, we observed that the catalytic cysteine residues in the previously reported MsCHS crystal structures appear to be oxidized to sulfinic acid (PDB codes 1BI5 and 1BQ6) (11). Furthermore, the same phenomenon was observed in the crystal structures of several other plant type III PKSs evolutionarily derived from CHS, including Gerbera hybrida 2-pyrone synthase (PDB code 1QLV) (Fig. S1) (9). The other, noncatalytic cysteines in these proteins do not appear to be oxidized. These findings suggest that the oxidation of the catalytic cysteine observed in several type III PKS crystal structures may not simply be an artifact of X-ray crystallography but rather reflects the intrinsic redox potential and reactivity of the catalytic cysteine evolved in this family of enzymes. Indeed, the propensity for a particular cysteine residue to undergo oxidation has been previously indicated to correlate with a low pKa (10).
Here, we present a set of new crystal structures of orthologous CHSs representing five major lineages of land plants, namely bryophytes, lycophytes, monilophytes, gymnosperms, and angiosperms, spanning 500 million years of land plant evolution. Through comparative structural analysis, in vivo complementation, in vitro biochemistry, mutagenesis studies, and molecular dynamics simulations, we reveal that CHSs of basal land plants, i.e. bryophytes and lycophytes, contain a catalytic cysteine less reactive than that of the CHSs from higher plants, i.e. euphyllophytes. We probe into the structure-function relationship of a set of residues that modulate the reactivity of the catalytic cysteine, which leads us to propose that euphyllophytes may have evolved a more catalytically efficient CHS to enhance flavonoid biosynthesis relative to their basal-plant relatives.

Basal-plant CHSs contain reduced catalytic cysteine in their crystal structures
To examine the structural basis for the evolution of CHS across major land plant lineages, we cloned, expressed, and solved the crystal structures of the five CHS orthologs from the bryophyte Physcomitrella patens (PpCHS), the lycophyte Selaginella moellendorffii (SmCHS), the monilophyte Equisetum arvense (EaCHS), the gymnosperm Pinus sylvestris (PsCHS), and the angiosperm Arabidopsis thaliana (AtCHS) (Fig. 2 and Table 1). Like previously reported crystal structures of type III polyketide synthases, all five CHS orthologs form symmetric homodimers and share the same αβαβα thiolase fold, suggesting a common evolutionary origin (11). The catalytic triad of cysteine, histidine, and asparagine is found in a conformation highly similar to that in other PKS and related fatty acid biosynthetic β-ketoacyl-(acyl-carrier-protein) synthase III (KAS III) enzymes, suggesting that they share a similar general catalytic mechanism (Fig. 2B).
Based on the previously proposed reaction mechanism for MsCHS, the catalytic cysteine is Cys-169 in AtCHS and Cys-159 in SmCHS. This residue initiates the reaction mechanism by performing nucleophilic attack on p-coumaroyl-CoA (Fig. 1B). The other two members of the catalytic triad are His-309 and Asn-342 in AtCHS, and His-302 and Asn-335 in SmCHS. The catalytic histidine contributes to the lowered pKa of the catalytic cysteine by forming a stable imidazolium-thiolate ion pair (8). The histidine and asparagine also form the oxyanion hole that stabilizes the tetrahedral transition states formed during the initial nucleophilic attack by cysteine on p-coumaroyl-CoA and after malonyl-CoA decarboxylation (Fig. 1B).
Notably, SmCHS and PpCHS are the first CHSs for which a reduced catalytic cysteine has been observed in the crystal structure (Fig. 2B). The catalytic cysteine in SmCHS can still become oxidized to sulfenic acid when the crystal is soaked in hydrogen peroxide, indicating that it remains susceptible to oxidation, albeit at a lower rate (Fig. S2 and Table S1). Like most other euphyllophyte type III PKS crystal structures solved to date, AtCHS, PsCHS, and EaCHS contain doubly oxidized catalytic cysteine sulfinic acid (Fig. 2B). This interesting observation suggests a functional divide between basal-plant and euphyllophyte CHSs. Despite shared orthology, the redox potential of the catalytic cysteine in PpCHS and SmCHS may differ from that of the euphyllophyte CHSs, resulting in different levels of sensitivity to oxidation under similar crystallization conditions. This could be due to the evolution of some novel molecular features in euphyllophyte CHSs not present in the basal-plant CHSs.

Basal-plant CHSs only partially complement the Arabidopsis CHS-null mutant
CHS orthologs have been identified in all land-plant species sequenced to date, suggesting a highly conserved biochemical function. To test whether the five CHSs from the five major plant lineages are functionally equivalent, we generated transgenic A. thaliana lines expressing each of the five different CHSs driven by the Arabidopsis CHS promoter in the CHS-null mutant transparent testa 4-2 (tt4-2) background (12) (Fig. S3).
Twenty independent T1 plants were selected for each construct. The phenotypes of the transgenic plants described below were represented by the majority of independent transgenic events for each unique construct. As the name indicates, the tt4-2 mutant is devoid of flavonoid biosynthesis and therefore lacks the accumulation of the brown condensed tannin pigments in seed coats, revealing the pale yellow color of the underlying cotyledons (12). Although AtCHS, PsCHS, and EaCHS fully complement the tt phenotype of tt4-2, PpCHS and SmCHS only partially rescue the seed tt phenotype of tt4-2 (Fig. S3), suggesting that PpCHS and SmCHS are likely less active than their higher-plant counterparts in vivo. This result also correlates with the crystallographic observation that the catalytic cysteines of basal-plant and euphyllophyte CHSs exhibit differential susceptibility to oxidation.

Figure 2 (caption, partial). Top, the homodimeric form of CHS is shown with a color gradient from blue at the N terminus to red at the C terminus of each monomer. Bottom, backbone and side chains of the catalytic triad and the differentially conserved cysteine/serine are shown. The 2Fo − Fc electron density map contoured at 1.5 σ is shown around the catalytic cysteine. CHSs from euphyllophytes show the catalytic cysteine oxidized to sulfinic acid, whereas CHSs from basal land plants have a reduced catalytic cysteine. The red or yellow dot next to the enzyme name indicates the presence of serine or cysteine, respectively, in position 347 (AtCHS numbering).

pKa of the catalytic cysteine is higher in basal-plant CHSs than in euphyllophyte CHSs
To perform nucleophilic attack on the p-coumaroyl-CoA substrate, the catalytic cysteine must be present in the thiolate anion form. As shown previously in MsCHS, the pKa of the catalytic cysteine is lowered to 5.5, well below physiological pH, to stabilize this deprotonated state (8). Two factors could contribute to the depressed pKa of Cys-164. First, His-303, a member of the CHS catalytic triad in the vicinity of Cys-164, provides an ionic interaction with Cys-164 that can further stabilize the cysteine thiolate anion. Second, Cys-164 is positioned at the N terminus of the MsCHS α-9 helix (11), which provides a stabilizing effect on the cysteine thiolate anion through the partial positive charge of the helix dipole (13). The acidic pKa of the catalytic cysteine in CHS ensures the presence of a cysteine thiolate anion in the enzyme active site at physiological pH to serve as the nucleophile for starter molecule loading.

Table 1. Crystallographic data collection and refinement statistics for the five wild-type CHSs. The highest-resolution shell values are given in parentheses.

Table 2. Crystallographic data collection and refinement statistics for the three mutant CHSs. The highest-resolution shell values are given in parentheses.

To measure the pKa of the catalytic cysteine in the five land-plant CHS orthologs, we performed pH-dependent inactivation of CHS using iodoacetamide, a thiol-specific compound that reacts with sulfhydryl groups that are sufficiently nucleophilic, followed by a CHS activity assay at the usual reaction pH. At pH values above the pKa, the catalytic cysteine is deprotonated and able to react with iodoacetamide, thus inactivating CHS. At pH values below the pKa, the catalytic cysteine is protonated and protected from iodoacetamide modification, thus retaining CHS activity in the subsequent enzyme assay. The amount of CHS activity remaining after iodoacetamide treatment was expressed as a ratio compared with the CHS activity of a control treatment at the same pH but without iodoacetamide. The pKa value was calculated using nonlinear regression to fit a log(inhibitor) versus response equation, which gave the pH at which 50% of maximal inhibition was obtained.
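The fitting step can be illustrated with a minimal sketch. It assumes a three-parameter sigmoid (two plateaus and a midpoint, unit slope), which is one common form of a log(inhibitor) versus response curve; the exact equation and software used in the study are not stated here. The function names, the grid-scan fitting strategy, and the activity ratios are all hypothetical and serve only to show how the midpoint pH is extracted.

```python
def fraction_protonated(ph, pka):
    """Henderson-Hasselbalch fraction of the protonated (thiol) form."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


def fit_pka(ph_values, ratios, lo=4.0, hi=9.0, step=0.001):
    """Least-squares fit of ratio = bottom + (top - bottom) * f(pH, pKa).

    For a fixed trial pKa the model is linear in its two plateaus, so they
    are solved in closed form; only pKa itself is scanned on a grid.
    """
    n = len(ph_values)
    mean_r = sum(ratios) / n
    best = None
    for i in range(int(round((hi - lo) / step)) + 1):
        pka = lo + i * step
        f = [fraction_protonated(ph, pka) for ph in ph_values]
        mean_f = sum(f) / n
        sff = sum((x - mean_f) ** 2 for x in f)
        if sff == 0.0:
            continue
        span = sum((x - mean_f) * (y - mean_r) for x, y in zip(f, ratios)) / sff
        bottom = mean_r - span * mean_f
        sse = sum((bottom + span * x - y) ** 2 for x, y in zip(f, ratios))
        if best is None or sse < best[0]:
            best = (sse, pka, bottom, bottom + span)
    return best[1], best[2], best[3]


# Hypothetical, noise-free activity ratios generated from pKa = 5.5
# (not measured data; residual activity plateaus of 0.1 and 1.0 are assumed).
ph_points = [4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0]
ratios = [0.1 + 0.9 * fraction_protonated(ph, 5.5) for ph in ph_points]

pka_fit, bottom_fit, top_fit = fit_pka(ph_points, ratios)
print(f"fitted pKa = {pka_fit:.2f}")
```

With real data the ratios would carry noise, and a dedicated nonlinear least-squares routine with confidence intervals would be preferable to this grid scan.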
The pKa value for AtCHS was measured to be 5.428, which is close to the 5.5 measured for MsCHS (Fig. 3A). The pKa value for SmCHS was measured to be 6.468, ~1 pH unit higher than that of the two angiosperm CHS orthologs. This elevated pKa value measured for SmCHS is consistent with the observation of a catalytic cysteine that is less reactive and less prone to oxidation. Also consistent with the crystallographic and plant complementation results, pKa values around 5.5 were measured for the euphyllophyte orthologs PsCHS and EaCHS and around 6.5 for the basal-plant ortholog PpCHS (Fig. S4).
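To put these values in perspective, the Henderson-Hasselbalch relation gives the fraction of a thiol present as thiolate at a given pH. The sketch below applies it to the pKa values reported in the text; the pH of 7.0 is an assumed illustrative value, and the calculation treats the cysteine as a simple isolated acid, which is a simplification for an enzyme active site.

```python
def thiolate_fraction(ph, pka):
    """Fraction of a thiol present as thiolate (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# pKa values as reported in the text; the pH is an assumption for illustration.
reported_pka = {"AtCHS": 5.428, "SmCHS": 6.468, "free cysteine": 8.7}
assumed_ph = 7.0

fractions = {name: thiolate_fraction(assumed_ph, pka) for name, pka in reported_pka.items()}
for name, value in fractions.items():
    print(f"{name}: thiolate fraction at pH {assumed_ph} = {value:.3f}")
```

Under this simplification, both enzymes hold most of the catalytic cysteine as thiolate at neutral pH, whereas free cysteine is almost entirely protonated; the difference between the two orthologs becomes larger as the pH approaches their pKa values.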

Residues near the active-site cavity affect the pKa and reactivity of the catalytic cysteine
We next examined the sequence and structural differences between basal-plant and euphyllophyte CHSs that could play a role in modulating catalytic cysteine reactivity. This led us first to identify a residue near the active site that is conserved as Cys-347 (AtCHS numbering) in AtCHS and other euphyllophyte sequences, and as Ser-340 (SmCHS numbering) in SmCHS and other lycophyte and bryophyte sequences (Fig. 2A).
To investigate the role of this residue in modulating catalytic cysteine reactivity, we generated the reciprocal mutations in SmCHS and AtCHS, respectively, and first characterized these mutant proteins using X-ray crystallography (Fig. 3B and Table 2). Under crystallization conditions identical to those used for WT SmCHS, the SmCHS S340C mutant exhibits a partially oxidized catalytic cysteine in its crystal structure, suggesting that the residue does play some role in determining cysteine reactivity. The AtCHS C347S mutant, however, still retains an oxidized catalytic cysteine in its crystal structure.
We then measured the pKa value of the catalytic cysteine in both SmCHS S340C and AtCHS C347S mutants (Fig. 3C).

Figure 3. pKa measurement of the catalytic cysteine and characterization of key residues that affect pKa. A, pKa measurement of AtCHS and SmCHS WT enzymes. CHS enzyme was pre-incubated at various pH values with or without 25 μM iodoacetamide inhibitor for 30 s, and an aliquot was taken to run in a CHS activity assay. The ratio of naringenin product produced in the iodoacetamide treatment divided by the control treatment was calculated for each pH point. A nonlinear regression was performed to fit a log(inhibitor) versus response curve to determine the pH at which 50% of maximal inhibition was achieved, which was determined to be the pKa value of the catalytic cysteine residue. The pKa of AtCHS is close to the 5.5 determined for other euphyllophyte CHSs, whereas the pKa of SmCHS is over 1 pH unit higher. B, overall structures and active-site configurations of AtCHS C347S and SmCHS S340C single mutants. The 2Fo − Fc electron density map contoured at 1.5 σ is shown around the catalytic cysteine. SmCHS S340C shows oxidation of Cys-159, unlike the SmCHS WT. AtCHS C347S has an oxidized Cys-169, like AtCHS WT. C, pKa measurements of AtCHS C347S and SmCHS S340C mutants.

also consistent with the observation that the AtCHS C347S crystal structure retained an oxidized catalytic cysteine. Taken together, the crystallographic and pKa measurement results suggest that the reciprocal mutation at this position is not sufficient to act as a simple switch between the active-site environments of euphyllophyte and basal-plant CHSs to modulate catalytic cysteine reactivity. Additional sequence and structural features likely contribute to an active-site environment that lowers the pKa of the catalytic cysteine in AtCHS.
To identify these features, we examined a multiple sequence alignment of CHS orthologs from diverse plant species and identified residues that show conserved variations between euphyllophytes and basal-plant lineages (Fig. 4A and Fig. S5). Two residues, Phe-170 and Gly-173 in euphyllophyte CHSs, were found to be replaced by serine and alanine, respectively, in basal-plant lineages. Because of their positions in the α-helix immediately C-terminal to the catalytic Cys-169, we postulated that these two residues could play a role in determining the structure of the helix, which would affect the electronic environment of the active site through the helix dipole's contribution to lowering the catalytic cysteine pKa value (11). Four additional residues near the active-site opening of CHS were also identified as differentially conserved between euphyllophytes and basal plants. We postulated that these positions might affect the dynamics of the active-site tunnel and solvent access to the active site. The six aforementioned residues were mutated in the SmCHS S340C background to their corresponding residues in AtCHS to generate the SmCHS I54M/S160F/A163G/G203S/A207Q/V258T/S340C septuple mutant, termed SmCHS M7. Likewise, the reciprocal mutations were made in the AtCHS C347S background to generate AtCHS M7.
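The idea of a differentially conserved position can be made concrete with a small sketch: a column of the alignment qualifies when each clade is internally conserved but the two clades carry different residues. The function names, the conservation threshold, and the seven-residue toy strings below are invented for illustration; they are not real CHS sequences and this is not the procedure used in the study.

```python
from collections import Counter


def consensus(column, min_fraction=1.0):
    """Return the dominant residue of a column if it is conserved, else None."""
    residue, count = Counter(column).most_common(1)[0]
    if residue == "-" or count / len(column) < min_fraction:
        return None
    return residue


def differentially_conserved(group_a, group_b, min_fraction=1.0):
    """List (index, residue_a, residue_b) for columns conserved within each
    group but occupied by different residues in the two groups."""
    hits = []
    for i in range(len(group_a[0])):
        a = consensus([seq[i] for seq in group_a], min_fraction)
        b = consensus([seq[i] for seq in group_b], min_fraction)
        if a and b and a != b:
            hits.append((i, a, b))
    return hits


# Toy aligned segments (hypothetical, equal length, zero-based columns).
euphyllophyte_like = ["QCFAGGT", "QCFAGGT", "QCFSGGT"]
basal_like = ["QCSAGAT", "QCSAGAT"]

hits = differentially_conserved(euphyllophyte_like, basal_like)
print(hits)
```

Lowering `min_fraction` below 1.0 would tolerate occasional exceptions within a clade, which is usually necessary with real alignments containing many species.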
Compared with SmCHS S340C, the six additional mutations in SmCHS M7 lower the pKa by nearly 0.7 pH units, from 6.429 to 5.738 (Fig. 4B). Similarly, the six mutations of AtCHS M7 raise the pKa by almost 1 pH unit, from 5.181 to 6.167, compared with AtCHS C347S. Consistent with the pKa observation, the dimeric crystal structure of AtCHS M7 has one monomer with a catalytic cysteine singly oxidized to sulfenic acid and one monomer with a reduced cysteine (Fig. 4C). Sulfenic acid is more reduced than the doubly oxidized sulfinic acid seen in other euphyllophyte crystal structures, indicating that these six mutations decreased the reactivity of the catalytic cysteine. These mutations represent a part of a possible evolutionary path from ancestral basal-plant CHSs toward the stronger pKa-lowering properties of euphyllophyte CHSs. Any further attempts at engineering CHS to fully swap the pKa-lowering properties between AtCHS and SmCHS would likely require different methods of searching for conserved sequence differences, beyond visual observation of structural differences. An analysis of the CHS multiple sequence alignment using ancestral sequence reconstruction with FastML (14) identified eight additional positions that are differentially conserved between euphyllophytes and basal plants and could affect CHS function based on their position in the CHS crystal structure (Fig. S6).
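The pKa shifts quoted above can be tabulated directly from the reported values, and each shift can be converted into the corresponding change in the standard free energy of thiol deprotonation through ΔΔG = ln(10)·RT·ΔpKa. The sketch below does this arithmetic; the temperature of 298.15 K is an assumption, as is the choice of kcal/mol as the unit.

```python
import math

R_KCAL = 1.987204e-3   # gas constant in kcal mol^-1 K^-1
TEMPERATURE = 298.15   # K; assumed, not stated in the text

# pKa values as reported in the text.
pka = {
    "AtCHS WT": 5.428, "AtCHS C347S": 5.181, "AtCHS M7": 6.167,
    "SmCHS WT": 6.468, "SmCHS S340C": 6.429, "SmCHS M7": 5.738,
}


def shift(reference, variant):
    """Return (delta pKa, delta delta G in kcal/mol) of variant vs. reference."""
    delta_pka = pka[variant] - pka[reference]
    delta_g = math.log(10) * R_KCAL * TEMPERATURE * delta_pka
    return delta_pka, delta_g


comparisons = [
    ("AtCHS C347S", "AtCHS M7"), ("SmCHS S340C", "SmCHS M7"),
    ("AtCHS WT", "AtCHS M7"), ("SmCHS WT", "SmCHS M7"),
]
for reference, variant in comparisons:
    d_pka, d_g = shift(reference, variant)
    print(f"{variant} vs {reference}: dpKa = {d_pka:+.3f}, ddG = {d_g:+.2f} kcal/mol")
```

At this temperature one pKa unit corresponds to roughly 1.36 kcal/mol, so the shifts produced by the M7 mutation sets are on the order of 1 kcal/mol.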

Molecular dynamics simulations reveal differences in active-site interactions between basal-plant and euphyllophyte CHSs
Our crystal structures revealed a correlation between the pKa value of the catalytic cysteine and a set of residues near the active site. To further investigate the mechanisms underlying these conserved differences between euphyllophyte and basal-plant CHSs, we employed molecular dynamics (MD) simulations to examine the interactions between these residues. We first surveyed the potential role of the C347S substitution (AtCHS numbering) in affecting the active-site environment in WT AtCHS and SmCHS (Fig. 5A). In WT AtCHS, where the largest cluster represents 70.3% of all structures sampled in this simulation, the thiol group of Cys-347 points away from the active site and cannot form any stable interaction with the catalytic His-309 (distance 6.5 Å). In contrast, the corresponding Ser-340 in SmCHS is 2.8 Å away from the histidine in the largest cluster, representing 98.7% of all structures sampled in the SmCHS simulation.
Next, we determined the inter-residue distances between the ionic pair Cys-169–His-309 in AtCHS or Cys-159–His-302 in SmCHS and between residue Cys-347 (AtCHS)/Ser-340 (SmCHS) and the catalytic histidine (Fig. 5B). For the WT SmCHS simulation, we observed a sharp peak at around 2.8 Å between Ser-340 and His-302, reflecting a stable hydrogen bond between the two residues. In contrast, no such short-distance peak was observed for WT AtCHS. These results suggest that the catalytic histidine is stabilized by forming a hydrogen bond with Ser-340 in SmCHS, whereas the corresponding interaction is relatively loose in AtCHS. Similar differences between the other euphyllophyte and basal-plant CHSs are also seen for PsCHS, EaCHS, and PpCHS (Fig. S7).
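The kind of distance analysis described above can be sketched as follows: for each trajectory frame, compute the distance between two tracked atoms, then summarize the distribution as a histogram and as the fraction of frames within a hydrogen-bond-like cutoff. Everything in this sketch is an assumption made for illustration: the frames are synthetic Gaussian draws rather than simulation output, the 3.5 Å cutoff is a commonly used heuristic rather than a value from the study, and the function names are hypothetical.

```python
import math
import random


def pair_distances(frames):
    """Distance (in Å) between the two tracked atoms in every frame."""
    return [math.dist(atom_a, atom_b) for atom_a, atom_b in frames]


def summarize(distances, cutoff=3.5, bin_width=0.5):
    """Return the fraction of frames within `cutoff` and a coarse histogram
    mapping the lower edge of each bin to its frame count."""
    within = sum(d <= cutoff for d in distances) / len(distances)
    histogram = {}
    for d in distances:
        bin_start = math.floor(d / bin_width) * bin_width
        histogram[bin_start] = histogram.get(bin_start, 0) + 1
    return within, histogram


def synthetic_frames(rng, mean, spread, n=1000):
    """Hypothetical trajectory: atom A at the origin, atom B on the x axis."""
    return [((0.0, 0.0, 0.0), (rng.gauss(mean, spread), 0.0, 0.0)) for _ in range(n)]


rng = random.Random(0)
tight = pair_distances(synthetic_frames(rng, 2.8, 0.15))  # sharp, short-distance peak
loose = pair_distances(synthetic_frames(rng, 6.5, 0.8))   # broad, long-distance spread

tight_fraction, tight_hist = summarize(tight)
loose_fraction, loose_hist = summarize(loose)
print(f"tight: {tight_fraction:.2f} of frames within cutoff")
print(f"loose: {loose_fraction:.2f} of frames within cutoff")
```

In practice the atom coordinates would be read from the trajectory with an MD analysis package, and a hydrogen-bond criterion would typically include an angle term in addition to the distance.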
To further investigate the motion of the catalytic histidine in various mutant enzyme active-site environments, we also performed MD simulations of AtCHS C347S, SmCHS S340C, AtCHS M7, and SmCHS M7 (Fig. 5A). The largest cluster sizes were 86.0, 66.6, 96.7, and 71.0%, respectively. Ser-347 and His-309 in AtCHS mutants adopt a similar conformation to the corresponding residues in WT SmCHS. In contrast, no stable hydrogen bond between Cys-340 and His-302 is formed in the largest cluster of the SmCHS mutant simulations. Introducing point mutations dramatically changes the distributions of those key inter-residue distances. In the AtCHS C347S mutant, the Ser-347-His-309 distance dramatically shortens to a peak around 2.8 Å, and introducing the six additional mutations in AtCHS M7 further increases the height of the peak. This suggests that mutating these seven positions in AtCHS to the corresponding residues in SmCHS can allow the active-site residues to approximate the interactions of WT SmCHS. The opposite effect is seen in SmCHS S340C and SmCHS M7 mutants, which recapitulate the weak interaction between Cys-347 and His-309 seen in WT AtCHS.
Based on these results, we hypothesize that the strong Ser-340–His-302 interaction facilitated by the SmCHS active-site environment may weaken the stabilizing effect of His-302 on the catalytic cysteine thiolate compared with that in AtCHS, thus contributing to the higher pKa value. Meanwhile, the inter-residue distance of the catalytic cysteine-histidine ionic pair is rather stable in all CHS simulations, ranging from 3 to 5 Å and centered around 4.1 Å. This suggests that the C347S substitution (AtCHS numbering) does not directly break this ionic interaction but may subtly influence the charge distribution on the histidine imidazole ring to perturb the catalytic cysteine pKa value (Fig. 6). In addition, the presence of a cysteine appears to decrease solvent content in the active site compared with serine, which would increase the pKa-lowering effect of the ionic interaction between histidine and the catalytic cysteine (Fig. S8 and supporting Note). Taken together, our results suggest that euphyllophyte CHSs have evolved to enhance the reactivity of the catalytic cysteine through the modification of specific interactions between active-site residues to allow for stronger stabilization of the thiolate.

Discussion
As early plants initially migrated from water to land and further radiated to occupy diverse terrestrial environmental niches, they continuously encountered new challenges from biotic and abiotic stresses. The greatly expanded diversity and increased abundance of flavonoids in certain plant lineages could have increased the demand for metabolic flux into flavonoid biosynthesis. One adaptive strategy to meet this demand, among many others, is to increase the enzymatic efficiency of chalcone synthase, the first committed enzyme of flavonoid biosynthesis that gates flux from general phenylpropanoid metabolism. One property of CHS that affects its enzymatic efficiency is the reactivity of the first step of nucleophilic attack on p-coumaroyl-CoA. To investigate this, we performed structural, biochemical, mutagenesis, and molecular dynamics experiments on CHS orthologs from five major plant lineages. Our results suggest that euphyllophyte CHSs have indeed evolved new structural features to increase the reactivity of their catalytic cysteine compared with basal-plant CHSs.
To identify sequence and structural features that differ between euphyllophyte and basal-plant CHSs and lead to this difference in enzymatic properties, we generated mutants in the background of AtCHS and SmCHS at various positions with conserved sequence differences segregating euphyllophyte and basal-plant CHSs. Relative to the WT enzymes, the pKa value of AtCHS M7 was raised and that of SmCHS M7 was lowered, each by about 0.7 pH units (5.428 to 6.167 and 6.468 to 5.738, respectively). Furthermore, AtCHS M7 also exhibits a less oxidized catalytic cysteine in its crystal structure than WT AtCHS. These results indicate that we were able to identify residue changes that partially traced the evolutionary path from SmCHS to AtCHS that increased the reactivity of the catalytic cysteine. In the type III PKS family, the introduction of a large number of mutations to yield subtle changes in enzyme activity is not unprecedented. Stilbene synthase produces resveratrol, a tetraketide product whose biosynthetic mechanism differs from that of naringenin chalcone in only the final cyclization step. In a previous study, a total of 18 point mutations were required to convert CHS activity to stilbene synthase activity, through small changes in the hydrogen-bonding network in the active site (15).
To examine in detail the intramolecular interactions that lead to enhanced cysteine reactivity, we performed molecular dynamics simulations on CHS. In comparing different CHS orthologs and point mutants, we observed that the presence of a cysteine in position 347 (AtCHS numbering) leads to a weak interaction between that cysteine and histidine, as indicated by the broad distribution of inter-residue distances centered at a distance greater than 5 Å, too long for a stable hydrogen bond. In contrast, when a serine is present, the sharp peak of serine-histidine inter-residue distance around 2.75 Å suggests the presence of a strong hydrogen bond. This hydrogen bonding likely shifts the electron density of the histidine away from the catalytic cysteine, weakening the imidazolium-thiolate ion pair. This weakened ionic interaction would lead to less pKa depression compared with CHS orthologs and mutants containing a cysteine in the nearby position, where the histidine is able to maintain a stronger ion pair with the catalytic cysteine and lower the pKa to a greater degree. This is reminiscent of the role of aspartate 158 in papain, a cysteine protease that also uses a cysteine-histidine-asparagine catalytic triad for nucleophilic attack on its substrates (16). Although Asp-158 is not essential for papain activity, its side chain affects the pH-activity profile by forming a hydrogen bond with the backbone amide of the catalytic histidine. This interaction stabilizes the catalytic ionic pair and maintains an optimal orientation of active-site residues. A D158E mutant papain had a pH-activity profile shifted by 0.3 pH units, about the same magnitude as the effect we observed on pKa for CHS cysteine/serine mutants. We propose a model of the role of position 347 in enhancing CHS reactivity (Fig. 6).
In the basal example of SmCHS, the serine interacts more strongly with the histidine of the catalytic triad, weakening the ionic interaction that stabilizes the thiolate form of the catalytic cysteine. In euphyllophyte CHSs, this position mutated to a cysteine, which interacts more poorly with the histidine, strengthening the ionic interaction and stabilizing the activated thiolate of the catalytic cysteine.
Although the mechanism by which the other six mutations in the M7 mutants affect the catalytic cysteine is not entirely clear, we noticed that, possibly due to the smaller side chains introduced by the S213G and Q217A mutations, AtCHS M7 has a surface helix in a slightly different conformation than WT AtCHS, leading to a slightly wider active-site opening (Fig. S9). There is also a newly solvent-accessible cavity, as determined by computational cavity-finding software (Fig. S8). These structural differences could lead to subtle changes in the amino acid backbone dynamics near the active site and thus alter the active-site volume or electronic environment, which could alter the pKa value of the catalytic cysteine (8).
Although cysteine sulfenic and sulfinic acid have been thought of as crystallographic artifacts, an increasing number of studies have shown that this type of cysteine oxidation can play an important functional role. In particular, cysteine sulfinic acid has been shown to play a regulatory role in reversible inhibition of the activity of enzymes such as protein-tyrosine phosphatase 1B and glyceraldehyde-3-phosphate dehydrogenase, suggesting that cysteine redox potential can be an evolved trait (17,18).
Our results suggest that euphyllophytes could have evolved a CHS enzyme that is intrinsically more active, with increased cysteine reactivity as a component, as one adaptation to produce the larger suite of flavonoids needed to counter the various environmental stresses they face. Although it may seem counterintuitive for euphyllophytes, which encounter more oxidative environments than do basal plants, to rely on a CHS enzyme that is more susceptible to oxidation, this susceptibility may be an unavoidable trade-off resulting from the chemical nature of a more nucleophilic cysteine: a catalytic cysteine more reactive toward substrate is also more reactive toward oxidants like hydrogen peroxide. To compensate for this increased susceptibility to oxidation, euphyllophytes may have evolved other systems to better maintain the redox environment inside the cell, one of those systems being the antioxidant flavonoids themselves.

Cloning and site-directed mutagenesis of CHSs
Total RNA was obtained from A. thaliana, P. sylvestris, E. arvense, S. moellendorffii, and P. patens. Reverse transcription was performed to obtain cDNA. The open reading frames (ORFs) of the five CHS orthologs were amplified via PCR from cDNA, digested with NcoI and XhoI, and ligated into NcoI- and XhoI-digested pHis8-3 or pHis8-4B Escherichia coli expression vectors. Site-directed mutagenesis was performed according to the QuikChange II site-directed mutagenesis protocol (Agilent Technologies).

Transgenic Arabidopsis
The AtCHS promoter (defined as 1328 bp of sequence upstream of the CHS transcription start site) was amplified via PCR from Arabidopsis genomic DNA, digested with HindIII and XhoI, and ligated into HindIII- and XhoI-digested pCC 1136, a promoterless Gateway cloning binary vector containing a BAR resistance gene marker, to generate pJKW 0152. The five CHS ORFs described above were then PCR-amplified from cDNA and cloned into pCC 1155, an ampicillin-resistant version of the pDONR221 Gateway cloning vector, with BP clonase in the Gateway cloning method (ThermoFisher Scientific). The resulting vectors were recombined with pJKW 0152 using LR clonase in the Gateway cloning method to generate the final binary constructs. Agrobacterium tumefaciens-mediated transformation of Arabidopsis was performed using the floral dipping method (44).

Recombinant protein expression and purification
CHS genes were cloned into pHis8-3 or pHis8-4B, bacterial expression vectors containing an N-terminal His8 tag followed by a thrombin or tobacco etch virus protease cleavage site, respectively, for recombinant protein production in E. coli. Proteins were expressed in the BL21(DE3) E. coli strain cultivated in terrific broth and induced with 0.1 mM isopropyl β-D-1-thiogalactopyranoside overnight at 18°C. E. coli cells were harvested by centrifugation, resuspended in 150 ml of lysis buffer (50 mM Tris, pH 8.0, 500 mM NaCl, 30 mM imidazole, 5 mM DTT), and lysed with five passes through an M-110L microfluidizer (Microfluidics). The resulting crude protein lysate was clarified by centrifugation (19,000 × g, 1 h) prior to Qiagen nickel-nitrilotriacetic acid (Ni-NTA) gravity flow chromatographic purification. After loading the clarified lysate, the Ni-NTA resin was washed with 20 column volumes of lysis buffer and eluted with 1 column volume of elution buffer (50 mM Tris, pH 8.0, 500 mM NaCl, 300 mM imidazole, 5 mM DTT). One milligram of His-tagged thrombin or tobacco etch virus protease was added to the eluted protein, followed by dialysis at 4°C for 16 h in dialysis buffer (50 mM Tris, pH 8.0, 500 mM NaCl, 5 mM DTT). After dialysis, the protein solution was passed through Ni-NTA resin to remove uncleaved protein and His-tagged tobacco etch virus protease. The recombinant proteins were further purified by gel filtration on an ÄKTA Pure fast protein liquid chromatography (FPLC) system (GE Healthcare). The principal peaks were collected, verified by SDS-PAGE, and dialyzed into a storage buffer (12.5 mM Tris, pH 8.0, 50 mM NaCl, 5 mM DTT). Finally, proteins were concentrated to >10 mg/ml using Amicon Ultra-15 centrifugal filters (Millipore).

Protein crystallization
All protein crystals were grown by hanging drop vapor diffusion at 4°C, except for EaCHS at 20°C. For AtCHS WT and C347S crystals, 1 µl of 10 mg/ml protein was mixed with 1 µl of reservoir solution containing 0.

X-ray diffraction and structure determination
X-ray diffraction data were collected at beamlines 8.2.1 and 8.2.2 of the Advanced Light Source at Lawrence Berkeley National Laboratory on ADSC Quantum 315 CCD detectors for AtCHS WT, AtCHS C347S, and SmCHS S340C crystals. X-ray diffraction data were collected at beamlines 24-ID-C and 24-ID-E of the Advanced Photon Source at Argonne National Laboratory on an ADSC Quantum 315 CCD detector, Eiger 16M detector, or Pilatus 6M detector for SmCHS WT, EaCHS, PsCHS, and AtCHS M7 crystals. Diffraction intensities were indexed and integrated with iMosflm (19) and scaled with Scala under CCP4 (20,21). The phases were determined with molecular replacement using Phaser under Phenix (22). Further structural refinement utilized Phenix programs. Coot was used for manual map inspection and model rebuilding (23). Crystallographic calculations were performed using Phenix.

Comparative sequence and structure analyses
CHS protein sequences were derived from NCBI and the 1000 Plants (1KP) Project (24,25). In all cases, AtCHS was used as the search query. Amino acid alignment of CHS orthologs was created using MUSCLE with default settings (26). UCSF Chimera and ESPript were used to display the multiple-sequence alignments shown in Fig. 2 and Figs. S5 and S6 (27,28). Phylogenetic analysis was performed using MEGA7 (29). All structural figures were created with the PyMOL Molecular Graphics System, version 1.3 (Schrödinger, LLC) (30). Active-site cavity measurements for the AtCHS and AtCHS M7 structures were determined using KVFinder (31).

Enzyme assays and pKa measurement
A 4CL-CHS-coupled assay was used for kinetic analysis. A 4CL reaction master mix was made by incubating 917 nM A. thaliana 4CL1 (NCBI accession number NP_175579.1) in 100 mM Tris-HCl, pH 8.0, 5 mM MgCl2, 5 mM ATP, 100 µM p-coumaric acid, 100 µM CoA, and 10 or 50 µM malonyl-CoA for 30 min at room temperature to generate p-coumaroyl-CoA at a final concentration of 70 µM. This 4CL reaction mix was divided into individual aliquots of 196 µl in Eppendorf tubes. CHS enzyme was incubated for 30 or 60 s in 16-µl volumes using a triple buffer system (50 mM AMPSO, 50 mM sodium phosphate, 50 mM sodium pyrophosphate, various pH values) (32,33) at room temperature in the presence of 25 µM iodoacetamide for the inactivation sample or water for the control sample. Aliquots (4 µl) were withdrawn from the incubation mixture and added to the standard coupled CHS assay system. The CHS reaction was run for 10 min at room temperature and stopped by addition of 200 µl of methanol.
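The dilution step in this protocol can be checked with a short calculation. The sketch below assumes that the "standard coupled CHS assay system" receiving each 4-µl aliquot is one of the 196-µl 4CL reaction aliquots, giving a 200-µl reaction; the function and variable names are illustrative and do not come from the original protocol.

```python
def diluted_concentration(stock_conc, aliquot_volume, receiving_volume):
    """Concentration after adding an aliquot to a receiving volume (C1*V1 = C2*V2)."""
    total_volume = aliquot_volume + receiving_volume
    return stock_conc * aliquot_volume / total_volume


# Volumes and concentrations quoted in the text.
ALIQUOT_UL = 4.0          # pre-incubation mixture transferred to the assay
ASSAY_UL = 196.0          # 4CL reaction aliquot (assumed to be the receiving volume)
IODOACETAMIDE_UM = 25.0   # iodoacetamide concentration during pre-incubation
COUMAROYL_COA_UM = 70.0   # p-coumaroyl-CoA in the 4CL reaction mix

# Fold dilution experienced by everything in the pre-incubation mixture.
dilution_factor = (ALIQUOT_UL + ASSAY_UL) / ALIQUOT_UL

# Iodoacetamide carried over into the CHS reaction.
carried_over_um = diluted_concentration(IODOACETAMIDE_UM, ALIQUOT_UL, ASSAY_UL)

# p-Coumaroyl-CoA after the assay aliquot is diluted by the 4-µl addition.
substrate_um = diluted_concentration(COUMAROYL_COA_UM, ASSAY_UL, ALIQUOT_UL)

print(f"dilution factor: {dilution_factor:.0f}-fold")
print(f"iodoacetamide in assay: {carried_over_um:.2f} µM")
print(f"p-coumaroyl-CoA in assay: {substrate_um:.1f} µM")
```

Under this assumption the alkylating agent is diluted 50-fold on transfer, which is consistent with the pre-incubation, rather than the 10-min CHS reaction, being the step that sets the extent of inactivation.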
The assay samples were centrifuged and analyzed directly by LC-MS. LC was conducted on a Dionex UltiMate 3000 UHPLC system (ThermoFisher Scientific), using water with 0.1% formic acid as solvent A and acetonitrile with 0.1% formic acid as solvent B. Reverse-phase separation of analytes was performed on a Kinetex C18 column, 150 × 3 mm, 2.6-µm particle size (Phenomenex). The column oven was held at 30°C. Samples were eluted with a gradient of 5-60% B for 9 min, 95% B for 3 min, and 5% B for 3 min, with a flow rate of 0.7 ml/min. MS analysis was performed on a TSQ Quantum Access Max mass spectrometer (ThermoFisher Scientific) operated in negative ionization mode with a SIM scan centered at 271.78 m/z to detect naringenin chalcone.
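The elution program can be written as a piecewise function of time. This is a minimal sketch that assumes a linear ramp from 5 to 60% B over the first 9 min and instantaneous steps to the 95% B wash and the 5% B re-equilibration; the text does not state how the transitions between segments were programmed, and the function name is illustrative.

```python
FLOW_ML_PER_MIN = 0.7
RUN_LENGTH_MIN = 15.0  # 9 min gradient + 3 min wash + 3 min re-equilibration


def percent_b(t_min):
    """Percent solvent B at time t_min (minutes) under the assumed program."""
    if not 0.0 <= t_min <= RUN_LENGTH_MIN:
        raise ValueError("time outside the 15-min run")
    if t_min <= 9.0:
        # Linear ramp from 5% to 60% B (assumed linear).
        return 5.0 + (60.0 - 5.0) * t_min / 9.0
    if t_min <= 12.0:
        # Column wash at 95% B.
        return 95.0
    # Re-equilibration at the starting composition.
    return 5.0


# Total mobile phase consumed per injection at the stated flow rate.
solvent_per_run_ml = FLOW_ML_PER_MIN * RUN_LENGTH_MIN

for t in (0.0, 4.5, 9.0, 10.0, 13.0):
    print(f"t = {t:4.1f} min: {percent_b(t):5.1f}% B")
```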
The pH profiles (pH on the x axis, ratio of naringenin chalcone produced with iodoacetamide treatment to that of the control on the y axis) were determined by fitting raw data to the log(inhibitor) versus response equation using nonlinear regression in Prism, version 6.0f (GraphPad Software).
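The shape of this fit can be illustrated with a small, dependency-free sketch. It assumes the three-parameter form of the log(inhibitor) versus response curve (unit Hill slope), with pH in place of log[inhibitor] and the midpoint read as the apparent pKa; the text does not say which Prism variant was used. Instead of Prism's nonlinear regression, the sketch scans candidate midpoints on a grid and solves for the two plateaus by ordinary linear least squares at each one. The data are synthetic, generated from the model itself, and are not measurements from this study.

```python
def ratio_model(ph, top, bottom, midpoint):
    """Activity ratio (treated/control): 'top' at low pH, 'bottom' at high pH."""
    return bottom + (top - bottom) / (1.0 + 10.0 ** (ph - midpoint))


def fit_midpoint(ph_values, ratios, lo=3.0, hi=10.0, step=0.01):
    """Grid search over the midpoint; plateaus solved by linear least squares."""
    n = len(ph_values)
    y_mean = sum(ratios) / n
    best = None
    for i in range(int(round((hi - lo) / step)) + 1):
        midpoint = lo + i * step
        # For a fixed midpoint the model is linear: y = bottom + span * f(pH).
        f = [1.0 / (1.0 + 10.0 ** (ph - midpoint)) for ph in ph_values]
        f_mean = sum(f) / n
        sff = sum((fi - f_mean) ** 2 for fi in f)
        if sff == 0.0:
            continue  # f is constant; the plateaus cannot be determined
        sfy = sum((fi - f_mean) * (yi - y_mean) for fi, yi in zip(f, ratios))
        span = sfy / sff                  # top - bottom
        bottom = y_mean - span * f_mean
        sse = sum((bottom + span * fi - yi) ** 2 for fi, yi in zip(f, ratios))
        if best is None or sse < best[0]:
            best = (sse, midpoint, bottom + span, bottom)
    _, midpoint, top, bottom = best
    return midpoint, top, bottom


# Synthetic, noise-free example (hypothetical parameters, not from the paper).
ph_points = [4.0 + 0.5 * k for k in range(11)]           # pH 4.0 to 9.0
synthetic = [ratio_model(ph, 1.0, 0.2, 6.0) for ph in ph_points]

fitted_midpoint, fitted_top, fitted_bottom = fit_midpoint(ph_points, synthetic)
print(f"apparent pKa = {fitted_midpoint:.2f}, "
      f"top = {fitted_top:.2f}, bottom = {fitted_bottom:.2f}")
```

With real, noisy data a proper nonlinear least-squares routine that also reports confidence intervals would be preferable; the grid scan is used here only to keep the example self-contained.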

Molecular dynamics
All MD simulations were performed using the GROMACS 5.1.4 package (34) and the CHARMM force field (35). The catalytic residues were modeled as protonated histidine (His-309 in AtCHS numbering) and deprotonated cysteine (Cys-169 in AtCHS numbering). All CHSs were constructed as dimers and were pre-aligned to the WT AtCHS crystal structure using the Multiseq plugin of VMD (36). All CHS dimers were solvated with 0.1 M NaCl in a dodecahedron box. Before the production runs, all systems were subjected to energy minimization, followed by a 500-ps NVT and a 500-ps NPT run with heavy atoms constrained. This was followed by another 5-ns NPT simulation with the protein backbone constrained. In all simulations, an integration time step of 2 fs was used, with bonds involving hydrogens constrained using LINCS (37,38). The van der Waals interaction was smoothly switched off starting from 10 Å, with a cutoff distance of 12 Å. The neighbor list was updated every 10 steps with the Verlet cutoff scheme. The electrostatic interaction was evaluated using Particle-Mesh-Ewald (PME) summation (39) with a grid spacing of 1.5 Å to account for the long-range interaction, whereas its short-range interaction in real space had a cutoff distance of 12 Å. The velocity-rescaling thermostat (40) and Parrinello-Rahman barostat (41, 42) were employed to maintain the temperature at 300 K and the pressure at 1 bar.
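The run settings described above map onto GROMACS .mdp options roughly as follows. This is a sketch and not the authors' input file: it converts the stated values from ångströms and femtoseconds to the nanometers and picoseconds that GROMACS expects, and it assumes a force-switch modifier for the van der Waals switching, which the text does not specify. Settings the text does not give (coupling groups, coupling time constants, compressibility, output frequencies) are deliberately left out, so the rendered fragment is incomplete as a real input.

```python
ANGSTROM_TO_NM = 0.1


def production_mdp(run_length_ns=200.0, dt_fs=2.0):
    """Collect the stated production-run settings as GROMACS .mdp key/value pairs."""
    dt_ps = dt_fs / 1000.0
    nsteps = int(round(run_length_ns * 1000.0 / dt_ps))
    return {
        "integrator": "md",
        "dt": dt_ps,                                    # 2 fs time step
        "nsteps": nsteps,                               # 200 ns per copy
        "constraints": "h-bonds",                       # bonds involving hydrogens
        "constraint-algorithm": "lincs",
        "cutoff-scheme": "Verlet",
        "nstlist": 10,                                  # neighbor list every 10 steps
        "vdwtype": "cut-off",
        "vdw-modifier": "force-switch",                 # ASSUMPTION: switch type not stated
        "rvdw-switch": round(10.0 * ANGSTROM_TO_NM, 3), # switching starts at 10 Å
        "rvdw": round(12.0 * ANGSTROM_TO_NM, 3),        # 12 Å cutoff
        "coulombtype": "PME",
        "rcoulomb": round(12.0 * ANGSTROM_TO_NM, 3),    # 12 Å real-space cutoff
        "fourierspacing": round(1.5 * ANGSTROM_TO_NM, 3),  # 1.5 Å PME grid spacing
        "tcoupl": "v-rescale",
        "ref-t": 300,                                   # K
        "pcoupl": "Parrinello-Rahman",
        "ref-p": 1.0,                                   # bar
    }


def render_mdp(params):
    """Render the dictionary as 'key = value' lines."""
    return "\n".join(f"{key:<22} = {value}" for key, value in params.items())


mdp_params = production_mdp()
print(render_mdp(mdp_params))
```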
For each CHS, three copies of 200-ns production runs were performed. The aggregated simulation time of all CHS WT and mutant systems is 5.4 µs. The two monomers of a given CHS were treated equivalently in the analysis, i.e. the three copies of trajectories of each monomer were combined after they were aligned to chain A of the associated crystal structure, resulting in a total of 1.2-µs trajectory for analysis of a given CHS system. Clustering analysis was carried out with GROMACS gmx cluster with RMSD cutoff of 0.1 nm. The inter-residue distance was measured using the Tcl scripting abilities provided by VMD (43). The minimum distance between the two nitrogen atoms of the catalytic histidine and the associated hydroxyl, thiol, or thiolate group of its serine or cysteine partner was taken as the inter-residue distance. Water occupancy calculation was performed using the volmap plugin of VMD (43).
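The inter-residue distance defined above can be sketched in a few lines. The original measurement was scripted in Tcl within VMD; the Python below is only an illustration of the same definition, taking the partner group's position as that of its heavy atom (OG for serine, SG for cysteine). The coordinates are made up for the example and do not come from any structure or trajectory in this study.

```python
import math


def his_partner_distance(his_nd1, his_ne2, partner_atom):
    """Minimum distance from either histidine ring nitrogen to the partner O/S atom."""
    return min(math.dist(his_nd1, partner_atom),
               math.dist(his_ne2, partner_atom))


def distance_series(frames):
    """Apply the definition to each frame of a trajectory.

    Each frame is a dict with 'ND1', 'NE2', and 'partner' coordinates in Å
    (a hypothetical layout chosen for this example).
    """
    return [his_partner_distance(f["ND1"], f["NE2"], f["partner"]) for f in frames]


# Two made-up frames; 'partner' stands for Cys SG or Ser OG.
example_frames = [
    {"ND1": (0.0, 0.0, 0.0), "NE2": (2.0, 0.0, 0.0), "partner": (5.0, 0.0, 0.0)},
    {"ND1": (0.0, 0.0, 0.0), "NE2": (2.0, 0.0, 0.0), "partner": (0.0, 4.0, 0.0)},
]

distances = distance_series(example_frames)
print([round(d, 2) for d in distances])
```

Taking the minimum over both ring nitrogens makes the measure insensitive to which nitrogen faces the partner residue in a given frame.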