Pterocarpan synthase (PTS) structures suggest a common quinone methide–stabilizing function in dirigent proteins and proteins with dirigent-like domains

The biochemical activities of dirigent proteins (DPs) give rise to distinct complex classes of plant phenolics. DPs apparently began to emerge during the aquatic-to-land transition, with phylogenetic analyses revealing the presence of numerous DP subfamilies in the plant kingdom. The vast majority (>95%) of DPs in these large multigene families still await discovery of their biochemical functions. Here, we elucidated the 3D structures of two pterocarpan-forming proteins with dirigent-like domains. Both proteins stereospecifically convert distinct diastereomeric chiral isoflavonoid precursors to the chiral pterocarpans, (–)- and (+)-medicarpin, respectively. Their 3D structures enabled comparisons with stereoselective lignan– and aromatic terpenoid–forming DP orthologs. Each protein provides entry into diverse plant natural products classes, and our experiments suggest a common biochemical mechanism in binding and stabilizing distinct plant phenol–derived mono- and bis-quinone methide intermediates during different C–C and C–O bond–forming processes. These observations provide key insights into both their appearance and functional diversification of DPs during land plant evolution/adaptation. The proposed biochemical mechanisms based on our findings provide important clues to how additional physiological roles for DPs and proteins harboring dirigent-like domains can now be rationally and systematically identified.


Lignan-forming DPs
The first DPs reported were the (+)- and (-)-pinoresinol-forming DPs affording entry into the lignan metabolic pathways (i.e. provided that one-electron (1e⁻) oxidation capacity was also present) (1, 7-10, 12-15) (Fig. 2A). In this way, the (+)- and (-)-pinoresinol-forming DPs (DIR-a subfamily members, Fig. 1) engender distinct stereoselective intermolecular couplings, in the presence of a 1e⁻ oxidase or oxidant, of the prochiral coniferyl alcohol quinone methide (QM) free radicals so formed (i.e. to give the two distinct enantiomeric forms of pinoresinol, depending upon the DIR-a subfamily DP type). Conversely, in the absence of the DPs, only nonregiospecific and nonstereoselective phenoxy radical coupling occurs to afford a mixture of racemic products.

Pterocarpan-forming DPs
In pterocarpan (phytoalexin) biosynthesis studies, such as those on (+)-pisatin in pea (Fig. 2C), it was deduced that DPs in the DIR-b/d subfamily were involved (18). Based on this deduction, Dr. Tomoyoshi Akashi, following completion of his term as a visiting scientist in the research group of the late Hans Van Etten, examined formation of the structurally related (-)-medicarpin in licorice (Glycyrrhiza echinata) on returning to Japan.

Lignin-forming DPs
In addition to the DPs in the above diverse metabolic pathways, cell wall structural reinforcement via lignin deposition has been implicated to involve DIR-e subfamily members (e.g. Arabidopsis AtDIR10; Fig. 1) (20,21) in the angiosperms at least. The latter DPs are reportedly part of supramolecular complexes in enabling another metabolic product, lignin, to be formed in Casparian band tissues. However, the actual physiological substrates that these DPs utilize have neither been identified nor demonstrated in vitro.
The genes encoding DPs for entry points in pterocarpan, lignan, lignin biopolymer, and aromatic terpenoid biosynthesis are all of similar size. Of these DPs, the DIR-e lignin-forming DPs have much longer β1-β2 loops in their 3D structures, when compared with other DPs (e.g. DRR206 (12), AtDIR6 (22), and GhDIR4 (17)). However, the biochemical significance of these much longer β1-β2 loops is currently unknown.
With the availability of structures of stereoselective medicarpin-forming DPs (nonglycosylated), stereoselective lignan-forming DPs (both apparently requiring post-translational glycosylation for stability), and a homology-modeled aromatic diterpenoid DP (GhDIR4), it was instructive to probe and compare the mechanistic biochemical features of these distinct DP types.
Described herein are the 3D structures of two stereoselective pterocarpan-forming DPs from pea and licorice, which preferentially produce either (+)- or (-)-medicarpin, depending on the substrate (Fig. 2C). These findings are discussed in the context of this DP type, which has dirigent-like (amino acid sequence similarity) domains as compared with the stereoselective lignan- and aromatic terpenoid-forming DPs. Of particular interest was whether there was a common DP biochemical mechanism and, if so, what were the underlying mechanistic principles involved.
We describe that pterocarpan synthases, containing dirigent-like domains, initially engender mono-QM formation from their chiral substrates, this being followed by intramolecular cyclization (C-O bond formation) to afford entry into the …

[Figure 1 caption, beginning not recovered] … 2) is maintained, with some families split where clear divisions were apparent (e.g. DIR-a1 and DIR-a2). Proteins whose functional characterization has been described in the literature are indicated (e.g. DRR206 (12,14), a (+)-pinoresinol-forming DP from pea (P. sativum), in the DIR-a1 subfamily; AtDIR6 (8,9,13), a (-)-pinoresinol-forming DP from A. thaliana, in the DIR-a2 subfamily; GePTS1 (19) and PsPTS1, medicarpin-forming DPs from licorice (G. echinata) and pea, respectively, in the DIR-b/d subfamily; GhDIR4 (16,17), an aromatic diterpenoid ((+)-gossypol-forming) DP from cotton (G. hirsutum); and AtDIR10 (20), a Casparian band lignin-forming DP from A. thaliana, in the DIR-e subfamily). The narrow distributions of sequences from gymnosperms, lycophytes, and bryophytes are easily discernible and contrast with the broad distribution of angiosperm dicots and even broader distribution of extant angiosperm monocots (mainly crop grasses). Ends of each branch of the tree are colored for different land plant families as indicated (e.g. light blue ends indicate lycophytes).

Heterologous expression and gel-permeation chromatography
GePTS1 and PsPTS1 coding sequences were individually codon-optimized for Escherichia coli, with each synthetic gene cloned into the pET101/D-TOPO® E. coli expression vector harboring a C-terminal 6× polyhistidine region. The vector constructs were then each used to transform E. coli BL21 (DE3) cells. After induction with isopropyl 1-thio-β-D-galactopyranoside, the resulting recombinant His-tagged proteins were individually purified to apparent homogeneity. Gel-permeation chromatography (GPC) was next carried out on a TSKgel G3000SWXL column, precalibrated with molecular weight standards, to determine the oligomeric state of both PTSs. GePTS1 and PsPTS1, in solution, exist mainly as trimers (~68.0 kDa), with (because of association/aggregation) a small amount of higher-molecular weight entities also being evident (roughly corresponding to 410-500 kDa).
The pea medicarpin-forming DP (PsPTS1) catalyzed the same conversions (Fig. S2, C and J). Control assays (no DP present) gave smaller amounts of racemic medicarpin products (Fig. S2, A and H) because of nonenzymatic conversion of cis- and trans-DMI.
For PsPTS1, the assays to obtain kinetic parameters were exactly as described above. Again, the cis-DMI (3R,4R) isomer was utilized under these conditions, whereas the (3S,4S) cis-DMI was not. In our hands, PsPTS1 displayed much lower (~7-fold) catalytic turnover (kcat/Km) relative to GePTS1, this in part being due to the ~6-fold increase in Km.
Under these conditions, the corresponding trans-DMI (3S,4R) isomer was also utilized, whereas the (3R,4S) trans-DMI was not converted. However, the catalytic turnover (kcat/Km) was reduced ~6-fold, relative to GePTS1, whereas the Km values were very similar for both PsPTS1 and GePTS1.
Additionally, the (3S,4S) and (3R,4S) enantiomers were slowly converted into (+)- and (-)-medicarpin when >1 mg of DP was used in the assays and when longer incubation times were used (30 min or more; data not shown).
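The fold changes quoted above for PsPTS1 relative to GePTS1 can be decomposed into a kcat contribution and a Km contribution. The short worked calculation below is only a sketch: it uses the approximate fold values given in the text (roughly 7-fold and 6-fold for cis-DMI; roughly 6-fold with similar Km for trans-DMI), treats "very similar" Km values as a ratio of 1 (an assumption), and the subscripts Ps and Ge are labels introduced here for PsPTS1 and GePTS1.

```latex
% Ratio of catalytic efficiencies splits into a k_cat term and a K_m term
\[
\frac{(k_{\mathrm{cat}}/K_m)_{\mathrm{Ps}}}{(k_{\mathrm{cat}}/K_m)_{\mathrm{Ge}}}
  = \frac{k_{\mathrm{cat,Ps}}}{k_{\mathrm{cat,Ge}}}
    \cdot \frac{K_{m,\mathrm{Ge}}}{K_{m,\mathrm{Ps}}}
\]

% cis-DMI (3R,4R): efficiency ratio ~ 1/7, K_m ratio (Ps/Ge) ~ 6
\[
\frac{k_{\mathrm{cat,Ps}}}{k_{\mathrm{cat,Ge}}}
  \approx \frac{1}{7} \times 6 \approx 0.86
\]

% trans-DMI (3S,4R): efficiency ratio ~ 1/6, K_m ratio (Ps/Ge) ~ 1 (assumed)
\[
\frac{k_{\mathrm{cat,Ps}}}{k_{\mathrm{cat,Ge}}}
  \approx \frac{1}{6} \times 1 \approx 0.17
\]
```

Under these approximations, the lower efficiency of PsPTS1 with cis-DMI would be attributable mostly to weaker apparent binding (higher Km), whereas with trans-DMI it would reflect mostly a lower kcat.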

Medicarpin-forming DP structure determinations
Medicarpin-forming DP (GePTS1 and PsPTS1) crystals were obtained as described under "Experimental procedures" following initial screening at the Hauptman Woodward Institute (Buffalo, NY), where 1,536 conditions were tested (25).
The GePTS1 structure was solved by molecular replacement at 2.6 Å resolution (Fig. S3A). Six independent DP monomer molecules, labeled A-F (Fig. 4), were located in the crystallographic asymmetric unit, arranged as a dimer of trimers. The buried surface area between the two trimers is small, and the biologically active form in solution is presumed to be a trimer, as confirmed by GPC analysis.
In any event, the two trimers from the X-ray analysis are related by a noncrystallographic symmetry 2-fold axis roughly parallel to the body diagonal of the P3₁21 asymmetric unit. The residues in each of the six GePTS1 monomers are as follows: A, 23-65, 72-189; B, 26-29, 38-194; C, 26-189; D, 23-190; E, 24-197; F, 24-191 (Fig. 3). The mature sequence of GePTS1 begins at residue Ala23 and ends at Tyr188. Additional electron density was observed extending away from the C terminus to various extents in all six monomers. Monomers A and C have one additional residue that could be modeled, monomer D has two residues, and monomer F has three. Nine additional residues were modeled in monomer E, and monomer B has an additional 19 residues. The latter were identified as part of the linker for the C-terminal His tag from the pET101/D-TOPO® expression vector (Fig. S4).
The GePTS1 monomer is an eight-stranded antiparallel β-barrel composed of two curved antiparallel sheets formed by strands β1′, β2, β3, β4, β5, and β6′ (designated sheet I) and β6, β7, β8, and β1 (designated sheet II) (Fig. 5A) that contact each other only slightly at the β1′-β8 and β5-β6′ interfaces. The N and C termini are located in adjacent β-strands at one end of the barrel. The six monomers superimpose onto each other with root mean square deviations (RMSDs) in Cα positions of between 0.53 and 1.05 Å. Inspection of the superimposed structures shows that the β-barrels align almost perfectly, with the main deviations occurring in the N and C termini and in loops at the opposite end of the molecule (Fig. 5B). If the three monomers comprising a trimer are superimposed, the N and C termini are quite divergent in structure, but when the molecules are related by the noncrystallographic symmetry 2-fold (A-D, B-E, and C-F), there is a much closer structural similarity. Further inspection of the two trimers as a whole shows that the N- and C-terminal extensions wrap around neighboring monomers. The interface between the two trimers in the asymmetric unit is, however, not compact, and the protein is not likely to adopt the hexameric (dimer of trimers) state in dilute solution (i.e. as demonstrated with the GPC analyses above).
The pea medicarpin-forming DP, PsPTS1, structure was solved by single-wavelength anomalous diffraction (SAD) methods using the signal from intrinsic sulfur atoms (sulfur-SAD) in the 11 methionine residues (excluding the N-terminal methionine). Its structure was refined against 1.5 Å resolution native data to a final Rfree of 0.1907 (Fig. S3B). It consists of a single monomer in the asymmetric unit, residues Phe35-Tyr187, plus two residues at the C terminus (Lys188 and Gly189) from the linker for the C-terminal His tag.
The extended N terminus observed in some of the GePTS1 monomers is not resolved in the PsPTS1 structure. A trimeric complex is formed by the crystallographic 3-fold axis parallel to the body diagonal of the cubic unit cell, this being in agreement with the GPC. PsPTS1 also has an eight-stranded β-barrel structure (Fig. S5), composed of two curved β-sheets with the same topology as GePTS1 (sheet I: β1′, β2, β3, β4, β5, and β6′; sheet II: β6, β7, β8, and β1).
The PsPTS1 and GePTS1 monomer structures are thus very similar, with RMSDs between 0.47 and 0.75 Å for the superposition of the PsPTS1 monomer on the six independent GePTS1 monomers. The entrance to the putative active site of the GePTS1 and PsPTS1 monomers is located at the end of the barrel opposite the N and C termini (Fig. 5B and Fig. S5). The opening of the active site is surrounded by five loops that show some degree of structural difference among the six monomers. These loops are between strands β1 and β1′ (loop I), strands β1′ and β2 (loop II), strands β3 and β4 (loop IV), strands β5 and β6 (loop VI), and strands β7 and β8 (loop VIII). The other loops, III (Ω), V, and VII, project out from the side of the monomer opposite that involved in trimer formation.
The putative active-site cavity is a tunnel that extends into the barrel to a depth of ~18 Å from the outermost external loops. For example, in GePTS1, the cavities have volumes between 350 and 500 Å³ (calculated with ICM-Pro (26)) and are lined by predominantly aromatic and hydrophobic residues and two aspartate residues (Asp50 and Asp83) (Fig. 6B). (The GePTS1 and PsPTS1 residues lining the interior pocket and forming the putative active site are identical, with all important/conserved residues numbered as in GePTS1 (Fig. 3) in the following discussion.) The roughly cylindrical active-site cavity is long and narrow (around 7 Å diameter) and nearly parallel to the 3-fold symmetry axis of the trimer, where Tyr181 sits at the base of the tunnel with its hydroxyl group projecting along the tunnel axis. Additionally, lining the tunnel are polar residues Asn137 and Arg145, conserved in both PTSs and in many DIR-b/d sequences (Fig. S6). Compared with DRR206 and AtDIR6 (PDB ID 4REV and 5LAL, respectively), the GePTS1 and PsPTS1 active sites are narrower and deeper and aligned more parallel with the trimer symmetry axis, whereas those of DRR206 and AtDIR6 are wider and shallower and point outward more. Like DRR206 and AtDIR6, GePTS1 and PsPTS1 structures contain an Ω loop (Figs. 5 and 6C and Fig. S5) that folds back to contact the exterior of the barrel with conserved loop residues Thr86 and Ser93 (Fig. 3) forming a cluster with the highly conserved residue His49 on the barrel itself. A second exterior loop on the same side of the barrel occurs at the end of β1 prior to β1′ (Fig. 5B and Fig. S5). A similar loop and the following short β-strand are present in the structure of AtDIR6, but not in DRR206, where the corresponding sequence is disordered. Finally, a conserved β-bulge is found in both GePTS1 and PsPTS1 structures near the end of β7.
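As a rough consistency check on the tunnel dimensions quoted above, the cavity can be idealized as a right circular cylinder. This is an assumption made here for illustration only: the real cavity is irregular, and the depth of about 18 Å is measured from the outermost external loops, so the cylinder should be read as an upper-bound estimate rather than a substitute for the ICM-Pro volumes.

```latex
% Idealized cylinder: diameter d ~ 7 \AA, depth h ~ 18 \AA
\[
V_{\mathrm{cyl}} = \pi \left(\frac{d}{2}\right)^{2} h
  \approx \pi \,(3.5\ \text{\AA})^{2}\,(18\ \text{\AA})
  \approx 693\ \text{\AA}^{3}
\]
```

The reported cavity volumes of 350 to 500 Å³ fall below this bound, as would be expected if only part of the 18 Å depth is enclosed and the tunnel narrows along its length.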

Docking studies
Docking of DMI substrates (3S/R,4R-DMI) and the presumed 3S/R-DMI-QM intermediate in the GePTS1 active site (Fig. 6, C-E) showed that they could potentially bind in a lengthwise fashion in the active-site tunnel with either end pointing in toward the conserved Tyr181 residue at the base of the tunnel. These docking simulations were used to evaluate whether the proposed mechanism discussed below is plausible, given the constraints placed on it by the dimensions of the active site and the size of the substrate and intermediate, and to identify low-energy orientations of the bound substrates and intermediates that are consistent with this mechanism. Conserved polar residues Asp50, Asp83, Tyr103, Asn137, and Arg145 (Fig. 6B) are located along the sides of the tunnel, and the substrate and intermediate presumably must be able to bind in an orientation that places their key reactive components close to the necessary residues. These criteria were used to guide selection of the docked structures shown in Fig. 6 (C-E) for GePTS1. The preferred orientation thus has the QM oxygen at the bottom of the tunnel near Tyr181 and C-4 (bearing the labile OH in DMI) oriented toward Asp50. We found that for both DMI and DMI-QM, substrates with the S configuration at C-3 performed better in docking simulations. However, the substrate with the R configuration at C-3 appears to be the preferred substrate in plants. Furthermore, better docking was found when the pyran ring conformation was such that the phenol substituent on C-3 was equatorial, making the entire structure flatter and less bent, consistent with the straight and narrow nature of the active-site tunnel. We note that the simulations did not allow for any adjustment of side-chain conformations within the protein active site upon binding, which presumably could alter the specificity and ligand-binding interaction energy.

Overall topology
The DRR206 (Fig. 7A) (12), AtDIR6 (Fig. 7B) (22), GePTS1 (Fig. 5A), and PsPTS1 (Fig. 7C) monomers, in their respective trimers, all have the same eight-stranded β-barrel topology. Of these, the (+)-pinoresinol-forming DP (DRR206) from pea, obtained at 1.95 Å resolution (12), was the first 3D DP structure (PDB entry 4REV) solved. Its structure contained two independent monomers in the asymmetric unit, and the trimeric structure was generated by the crystallographic 3-fold axis of the H3 space group. In the same way, the PsPTS1 trimer is also crystallographic, with the three monomers related by the 3-fold body diagonal of the cubic unit cell. The Arabidopsis (-)-pinoresinol-forming DP (AtDIR6) structure was also solved as two monomers, and the trimer was generated crystallographically (22).
Superposition of the PsPTS1 and GePTS1 monomers against DRR206 and AtDIR6 using secondary structure-matching algorithms (27) implemented in COOT (28) gave RMSDs ranging from 1.4 to 1.6 Å for the β-barrel core, slightly higher than those between the two PTS1 enzymes themselves (Table S1). The eight β-strands match very well between the PTS1 dirigent-like proteins and the other two DPs, with the main differences occurring in the N and C termini and in the loops between the β-strands, in particular loops I, II, IV, and V. Interestingly, the Ω loop adopts the same conformation in all four enzymes, which hints at a functionality for this structural element as suggested for DRR206 (12). When the GePTS1 and symmetry-generated PsPTS1 trimers are superimposed upon the symmetry-generated DRR206 and AtDIR6 trimers, the core RMSDs are similar for the three β-barrel cores, indicative of a highly conserved oligomeric structure. The RMSDs are significantly greater, however, when the Cα atoms of all residues are matched using ICM-Pro (26), primarily due to the conformational variability in the loop regions between these enzymes (Table S1).
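For readers less familiar with the metric, the RMSD values compared above are root mean square distances between matched Cα positions after superposition. The sketch below illustrates only that final step, under stated assumptions: the two coordinate lists are already superimposed and paired residue by residue, and the function name `ca_rmsd` and the toy coordinates are hypothetical illustrations, not data from the structures or the procedure used in COOT or ICM-Pro.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of matched (x, y, z) C-alpha positions.

    Assumes the structures are already superimposed and that position i in
    coords_a corresponds to position i in coords_b. Units follow the input
    (Angstroms for PDB coordinates).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Hypothetical toy coordinates (three matched atoms), for illustration only.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
mobile = [(0.0, 1.0, 0.0), (3.8, 0.0, 1.0), (7.6, 0.0, 0.0)]

print(f"C-alpha RMSD: {ca_rmsd(reference, mobile):.3f} A")
```

Because the sum runs over whichever atoms are matched, restricting the list to β-barrel core residues yields lower values than matching all residues, which is the pattern described for the core versus all-residue comparisons.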
Currently, the GePTS1 structure is the only one that shows the trimer without requiring it to be generated by crystallographic symmetry, being found as a dimer of trimers in the asymmetric unit. Furthermore, even though the gossypol-forming DP GhDIR4 (a member of the DIR-b/d family) has low sequence identity to GePTS1/PsPTS1 (~35%) and AtDIR6/DRR206 (~25%), these DP structure determinations allowed for homology modeling with reasonable quality for the core barrel structure (Fig. 7D) (29,30).

Ω loops and other alignments
The (+)- and (-)-pinoresinol-forming DP (DRR206 and AtDIR6) structures contain an Ω loop that folds back upon the exterior of the barrel, with conserved residues (Thr84 and Ser91) that form a small cluster with His39 located on the barrel (Fig. 8A). This loop is also present in the medicarpin-forming (His48/49, Thr85/86, and Ser92/93 in PsPTS1/GePTS1, Fig. 8B and Fig. 6 (C-E)) and gossypol-forming (His46, Ser88, and Arg81, in place of Thr, in GhDIR4; Fig. 8C) DPs, and it appears to be a general feature of dirigent proteins. The conserved residues and structure of this loop and its position on the exterior of the barrel suggest either that it may be a locus of interaction with other proteins or that it may mediate flexibility in the upper portion of the barrel that comprises the active site.

[Figure 7 caption, beginning not recovered] … (B), and PsPTS1 (6OOD) (C). D, homology model of GhDIR4 created with Phyre2 in one-to-one threading mode using the PsPTS1 structure as a template (30,31). The β-strands are colored blue to red from the N to the C terminus: royal blue, β1; slightly lighter blue, β1′; light blue-green, β2; green, β3; yellow-green, β4; yellow, β5; lighter orange, β6 and β6′; darker orange, β7; red, β8.

The second exterior loop between β1 and β1′ in the medicarpin-forming DPs (Fig. 7C) is also similar to that found in the (-)-pinoresinol-forming DP AtDIR6 (Fig. 7B), but not in DRR206 (Fig. 7A), where an alternate transition directly to a larger, more disordered β1-β2 loop is found. It is conceivable that the disordered loop in DRR206 is capable of forming the additional β-strand and intervening loop. The proximity of the β1-β1′ loop to the Ω loop is noteworthy. A conserved β-bulge found in both GePTS1 and PsPTS1 structures is also observed in AtDIR6, but not in DRR206, near the end of β7. The significance, if any, of these observations is currently unknown.
The medicarpin-forming DP (GePTS1 and PsPTS1) structures, being distant from both DRR206 and AtDIR6 in sequence space, also provided additional homology modeling leverage, particularly in the large DIR-b/d family. Their structures helped to clarify ambiguity in how sequence alignments of distantly related DPs might be constructed, particularly in β8, which has comparatively little sequence conservation throughout the DP superfamily (Fig. 3). Insofar as Tyr181 is on β8 in GePTS1 (Figs. 3 and 6B) (and Tyr180 in PsPTS1; Fig. 8B), it may be that additional functionally important residues in other subfamily classes of DPs are also located along this strand.
Our alignments also suggest that the β-strands forming the core β-barrel structures in all three DP types are largely conserved, this in turn indicating that homology modeling may be used to model and understand the active sites and, in particular, the surrounding loops. These are depicted in dark blue in the alignment (Fig. 3) and vary significantly in both length and sequence. We currently hypothesize that these loops hold important roles, possibly helping confer substrate specificity. Loops on the opposite end of the barrel (Fig. 7) are more conserved, particularly the β2-β3 (Ω) and β6-β7 loops, and may represent potential interaction sites for, for example, a DP-specific (per)oxidase in a putative protein supramolecular complex.

Putative active-site pocket
From domain-swapping experiments giving different coupling stereoselectivities, we provisionally identified key regions for substrate binding and coupling in the putative active site (9). This, with the X-ray data, led to the deduction that each (+)-pinoresinol-forming DP (DRR206) monomer in the trimer has a prominent deep pocket at one end of the barrel, surrounded by flexible loops. We proposed that this pocket, oriented toward the outside of the trimer and lined with hydrophobic residues, is provisionally the substrate-binding site for (+)-pinoresinol formation (Fig. 8A). The volume of the pocket is large enough that two monolignol-derived substrates could bind in a single pocket. Similar conclusions were drawn from structure determination of the (-)-pinoresinol-forming AtDIR6 (22).
The putative (+)-pinoresinol-forming DP (DRR206) active-site cavity is shallower and broader than that in the pterocarpan synthases, GePTS1 and PsPTS1, harboring dirigent-like domains. This presumably is indicative of differences in size and geometry of the putatively bound substrates and QM intermediates (e.g. mono- versus bis-QMs). Our homology model of GhDIR4 suggests that the binding site is larger and more accessible than those of PTS1, DRR206, or AtDIR6, partly because six fewer residues comprise loop VI and adjacent portions of strands β5 and β6 to accommodate two bulkier hemigossypol substrates.
Some residues forming the putative binding/active site in the interior of the barrel are conserved between DRR206 (Fig. 8A), AtDIR6, GePTS1 (not shown), PsPTS1 (Fig. 8B), and GhDIR4 (Fig. 8C). A notable exception is Tyr181/Tyr180 (in GePTS1/PsPTS1), this being a conserved residue in the majority of DIR-b/d subfamily sequences, although not that of GhDIR4. Indeed, the sequences similar to GePTS1 and PsPTS1 are most likely homologous pterocarpan synthases from other legumes (Fig. S6). Conversely, a corresponding tyrosine is neither conserved in the pinoresinol-forming DPs, DRR206 or AtDIR6, in the DIR-a subfamily nor found in the gossypol-forming DP in the DIR-b/d subfamily. This may make sense, insofar as the gossypol-forming DP mechanism might be more like that of pinoresinol-forming DPs (in which Tyr181 is absent) given the similarity in their putative prochiral QM radical substrates.
GePTS1 and PsPTS1 sequences lack a conserved aspartate as found in pinoresinol-forming DPs (Asp134 in DRR206, Asp137 in AtDIR6). The conserved Asp134/Asp137 residue was proposed to reprotonate one of the bis-QM carbonyl oxygens to facilitate nucleophilic addition by the C-9 OH at C-7′ to form one of the cyclic ether rings of pinoresinol (Fig. 2A). This aspartate may not be needed, or Tyr181/Tyr180 or a water molecule might fulfill this role. In place of this Asp, Asn137 is conserved in PTS1 and many DIR-b/d sequences (Fig. S6). The GhDIR4 sequence lacks either asparagine or aspartate at the equivalent position but has aspartate at the subsequent position. This is located in loop VI, which is considerably truncated in GhDIR4 and thus may have a function similar to that proposed in AtDIR6 (22). Finally, Arg145 is highly conserved in DPs, although not GhDIR4; this residue is near the aforementioned Asp/Asn in the active site. However, in GhDIR4, Arg130 is a few residues away in the sequence and nearby, in loop VI, in the homology model, and could fulfill the same role as Arg145.

Biochemical mechanism considerations in the medicarpin-, pinoresinol-, and gossypol-forming DPs
The major difference between these three DP types is their distinct substrate versatilities, reflecting differences in substrate recognition and binding in their active-site pockets, as well as product outcome. All three DP types use substrates that initially had a free phenolic group functionality in their aromatic ring(s) and, in the case of hemigossypol, in both rings. The coniferyl alcohol and hemigossypol-derived substrate radicals also have very different aromatic group substitutions, and these need to be understood better from a substrate-binding requirement. With the need for an oxidase to generate the presumed free radical species, how the DP and the oxidase(s) interact for stereoselective coupling also needs to be determined.
Following one-electron oxidation of the phenolic OH groups in coniferyl alcohol and hemigossypol, prior to coupling, the stereoselectivity of the coupling reactions requires that the prochiral substrates be bound and oriented such that coupling only occurs at the specific regio-centers and not at other potential coupling sites. In the absence of the DPs, these substrates only produce free radical-derived racemic products through coupling, some of which are nonregiospecific.
This leads to the question as to what is being bound in the DP active site prior to coupling. One-electron oxidation of the C-4 phenolic group in coniferyl alcohol and at the equivalent position in hemigossypol would generate intermediates (QM radicals) with similar extended delocalization. These intermediates then stereoselectively couple to afford the corresponding chiral bis-QM intermediates. Their DP active sites can thus be envisaged as able to possibly bind both the various electron-delocalized intermediate (free radical) monomers and the corresponding chiral bis-QM intermediates. The monomer binding and orientation in the active sites, however, control the stereoselectivity outcomes. Subsequent intramolecular cyclization (C-O bond formation) and re-aromatization presumably occur in these DP active sites as well (12,22).
In contrast, the medicarpin-forming DPs, with their dirigent-like domains, apparently process chiral substrates, with the R stereochemistry of the OH functionality at C-4 being favored over the S-configuration for C-O bond formation. These data thus suggest that the chirality of the 4-OH group is of considerable importance for preferentially undergoing dehydration to generate the presumed QM intermediate (or a functional equivalent) prior to C-O bond formation. However, the presence of the aromatic 7-OH group also appears to be essential, presumably enabling generation of the putative QM intermediate prior to ring closure to afford the pterocarpan skeleta.

Proposed pterocarpan synthase mechanism
Uchida et al. (19) proposed that conversion of DMI to medicarpin catalyzed by PTS would likely have two or more reaction steps and proceed via a QM intermediate, possibly involving different conformational states of the enzyme. Determination of the structure of PTS, together with evaluation of active-site residue mutants, and substrate and intermediate docking simulations now allow the proposed mechanism to begin to be evaluated in greater detail and in context of the positions of conserved residues in the active site.
At a minimum, the mechanism would likely require an acidic residue that protonates the 4-OH of DMI to facilitate its departure as H2O, thereby producing what is formally a benzylic carbocation. Another residue or a bound water could then reversibly accept the phenolic 7-OH proton of the benzo-dihydropyran ring, affording the para-QM intermediate (Fig. 2C). The mechanism would also likely require stabilization of this intermediate and promotion of attack by the phenolic 2′-OH on the QM carbon (C-4) through (or simultaneous with) removal of the hydroxyl proton to form the new partially reduced furan ring of medicarpin.
The conserved polar residues in the active site of GePTS1 (Asp50, Asp83, Tyr103, Asn137, Arg145, and Tyr181) are likely to facilitate this mechanism. Mutagenesis of four of these residues showed significant effects on activity (Asn137 and Arg145 were not targeted for mutagenesis).
To investigate whether Asp50 or Tyr103 had any effect on conversion of cis-DMI and trans-DMI substrates, these residues were individually replaced with alanine and phenylalanine, respectively (i.e. Asp50 → Ala and Tyr103 → Phe). As shown in Table 1, these two mutations resulted in massive reductions in catalytic turnover (i.e. down to 1 and 3% for the cis-DMI (3R,4R) substrate and to 1.6 and 7.8% with the trans-DMI (3S,4R) isomer, relative to WT GePTS1). Moreover, for the cis-DMI (3R,4R) substrate, the Km values were much higher for both mutants (i.e. Km values of 1,175 and 555 μM versus 145 μM for WT GePTS1), with the Vmax for each mutant also greatly attenuated (220 and 306 versus 2,674 picokatals/mg of protein). In addition, when the trans-DMI (3S,4R) isomer was used, the Km value for D50A was only slightly lower (i.e. Km of 520 μM versus 680 μM for WT GePTS1), whereas for Y103F, it greatly increased to 3,320 μM. On the other hand, Vmax values were reduced to 9 and 265 picokatals/mg of protein, respectively, versus 712 picokatals/mg of protein for WT GePTS1.
In the proposed mechanism for pterocarpan (medicarpin) formation by PTS, the QM forms after the chiral substrate (DMI) binds. Thus, one might expect polar active-site residues that are not conserved in pinoresinol-forming DPs to fulfill this additional function. The residues fitting this description in GePTS1 (Fig. 6, A-E) are Asp83 (conserved in all PTS sequences and present, but not highly conserved, in some other DP sequences) and Tyr181 (conserved in all PTS sequences and in many other Dir-b/d sequences, although not in GhDIR4, and absent in other DP sequences). To investigate this possibility, the GePTS1 mutants D83A and Y181F were also obtained, and the resulting proteins were purified.
Kinetic data (Table 1) established that the D83A and Y181F mutations also had significant deleterious effects on catalytic turnover. With the cis-DMI (3R,4R) substrate, catalytic turnover was reduced to 1.2 and 1.5%, relative to WT GePTS1, whereas for the trans-DMI (3S,4R) isomer, the reductions were down to 8.3 and 3.0% of WT GePTS1 activity. For the cis-DMI isomer, Km values increased to 1,300 and 825 μM versus 145 μM for WT GePTS1. Vmax values for each were also greatly attenuated (283 and 233 versus 2,674 picokatals/mg of protein for WT GePTS1). With the corresponding trans-DMI, however, the Km value for D83A was only slightly attenuated (i.e. Km of 665 versus 680 μM for WT GePTS1), whereas for Y181F it was greatly increased to 5,665 μM. Vmax values were also attenuated (58 and 180 picokatals/mg of protein versus 712 picokatals/mg of protein for WT GePTS1). In other words, both of these mutations also had massive overall deleterious effects on catalytic turnover.
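The relative turnover percentages quoted for the four mutants can be cross-checked from the Km and Vmax values given in the text. The following is a minimal sketch, assuming that "catalytic turnover" here corresponds to the ratio Vmax/Km expressed relative to WT GePTS1; that interpretation is an assumption, as are the dictionary layout and function name, although it reproduces the quoted percentages to within rounding of the tabulated constants. Because only ratios are taken, the result does not depend on the concentration unit of Km.

```python
# Kinetic constants as quoted in the text (from Table 1).
# Each entry is (Km, Vmax): Km in the concentration unit of Table 1,
# Vmax in picokatals/mg of protein.
KINETICS = {
    "cis-DMI (3R,4R)": {
        "WT": (145, 2674),
        "D50A": (1175, 220),
        "Y103F": (555, 306),
        "D83A": (1300, 283),
        "Y181F": (825, 233),
    },
    "trans-DMI (3S,4R)": {
        "WT": (680, 712),
        "D50A": (520, 9),
        "Y103F": (3320, 265),
        "D83A": (665, 58),
        "Y181F": (5665, 180),
    },
}


def relative_efficiency(substrate: str, variant: str) -> float:
    """Return (Vmax/Km) of a variant as a percentage of the WT value.

    Assumption: this ratio is what the text calls catalytic turnover.
    """
    km_wt, vmax_wt = KINETICS[substrate]["WT"]
    km, vmax = KINETICS[substrate][variant]
    return 100.0 * (vmax / km) / (vmax_wt / km_wt)


if __name__ == "__main__":
    for substrate, variants in KINETICS.items():
        for variant in variants:
            if variant == "WT":
                continue
            pct = relative_efficiency(substrate, variant)
            print(f"{substrate:18s} {variant:6s} {pct:5.1f}% of WT")
```

Under this assumption, the small residual differences from the quoted percentages (for example, for Y103F with trans-DMI) would be expected from rounding of the Km and Vmax values reported in the text.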
These effects on PTS1 activity from mutagenesis of Asp50, Asp83, Tyr103, and Tyr181, combined with the inferences from docked substrate and intermediate orientations, can thus be used to propose the following roles for polar active-site residues in the proposed mechanism.
In GePTS1, Tyr181 or a nearby bound water in the active site may have a role in accepting the 7-OH proton during formation of the QM intermediate, particularly if Tyr181 exists as the phenolate, which could be stabilized by the nearby side chain of Arg145. Alternatively, Tyr181 and Arg145 may facilitate QM formation with a bound water as proton acceptor, rather than the phenolate directly. Tyr181 or a nearby protonated water would also presumably reprotonate the QM oxygen (7-O) upon cyclization at C-4 to form the new furan-like ring in medicarpin, regenerating the original (phenol) OH functionality. Docking predictions for DMI and DMI-QM structures in which the incipient QM is buried most deeply in the active site suggest this role for Tyr181. Notably, Tyr181 is not conserved in pinoresinol-forming DPs, such as AtDIR6 and DRR206.
Asp50 is likely the donor that initially protonates the 4-OH, which then leaves as water to ultimately generate the QM. This same residue could then provide a negative charge to stabilize the partial positive charge on the QM carbon and could subsequently serve as a proton acceptor for the 2′-OH proton during attack on the QM carbon (C-4) through which cyclization to form the new partially reduced furan ring occurs. The proposed mechanism does, however, require that Asp50 be in an un-ionized form for the initial step, to protonate the 4-OH, and provisionally suggests that catalysis would be inhibited by low pH.
Docking experiments identified several bound DMI and DMI-QM orientations with the 7-OH directed inward toward the bottom of the active-site tunnel and near Arg145 and Tyr181. Among these orientations were some in which the 4-OH of DMI (see Fig. 6, C-E) and C-4 of DMI-QM are proximal to Asp50. Asp50 in GePTS1 appears to form a hydrogen bond with Tyr103 and is also near Asp83; both residues are conserved in PTS1 sequences (Tyr103 is conserved widely across DPs) and may help modulate the proton donor and/or acceptor activity of Asp50.
Our working hypothesis for the role of Arg145, which is highly conserved in most dirigent proteins and dirigent-like domains, including the pinoresinol-forming dirigent proteins, is that the positively charged side-chain guanidino group stabilizes the QM intermediate by balancing the partial negative charge on the QM carbonyl oxygen. Whereas QMs are frequently drawn as a half-quinone (e.g. a 2,5-cyclohexadienone with an exocyclic double bond to a benzylic carbon para to the carbonyl), it is useful to consider the zwitterionic resonance form: a phenolate with a benzylic carbocation at the para position. Stabilization of a reactive species with highly electron-rich and electron-poor moieties likely requires suitably located charged groups. In the proposed mechanism for PTS1, the likely role of the conserved arginine (Arg145 in GePTS1) is to stabilize the partial negative charge on the QM carbonyl oxygen, whereas the conserved aspartate (Asp50 in GePTS1) likely stabilizes the QM benzylic carbon and facilitates nucleophilic attack by a hydroxyl group to form a furan-like ring.
Asp50, Tyr103, and Arg145 are conserved in both AtDIR6 and DRR206 (Fig. 3), and the positions of the side chains are nearly identical in the superposition of all three structures. This suggests a common role for these residues, despite the apparent dissimilarity in their substrates, including the likelihood that AtDIR6 and DRR206 bind two coniferyl alcohol QM radicals, whereas GePTS1/PsPTS1 bind a single DMI substrate, with a mechanism that is unlikely to involve a QM radical. A p-QM has an electrophilic carbon at the benzylic position, para to a partially negatively charged carbonyl oxygen. Protonation of this oxygen decreases the energy barrier of the second cyclization step by making the benzylic carbon (C-4) considerably more electrophilic (31-33).
In GePTS1, the phenolic 2′-OH group, which attacks the electrophilic C-4 of the DMI-QM intermediate, is equivalent to either one of the nucleophilic oxygens (9- or 9′-OH) in the bis-QM intermediate en route to pinoresinol formation, which attack the electrophilic C-7′ and C-7 atoms, respectively, in this intermediate. In both substrates, the nucleophilic OH and electrophilic carbon are separated by three carbons, such that intramolecular cyclization forms a five-membered cyclic ether, a reduced furan (or partially reduced in the case of DMI). Both Asp50 in GePTS1 and its homologue in pinoresinol-forming DPs (Asp49 in AtDIR6) therefore presumably could have similar roles in stabilizing the partial positive charge on the QM carbon as well as in accepting the proton from the nucleophilic hydroxyl group.
In the pinoresinol-forming dirigent proteins DRR206 and AtDIR6, where the proposed mechanism has a bound bis-QM intermediate resulting from 8-8′ coupling of two coniferyl alcohol QM radicals, the homologous residue to Asp50 (Asp49 in AtDIR6) was proposed to have a somewhat different role. There it was envisaged as protonating the carbonyl oxygen at one end of the bis-QM, making the proximal methide carbon more electrophilic (formally resembling a benzylic carbocation) and facilitating cyclization in that half of the bis-QM, via attack by the nucleophilic 9-OH originating from the other coniferyl alcohol radical substrate, and thereby forming one of the cyclic ether rings in pinoresinol (22).
However, our interpretation is that, as suggested by the enzyme kinetics of GePTS1 (19) and data herein, there may be a conformational change in the protein upon binding and/or a rearrangement of the position of the substrate. We note that the barrel itself in GePTS1 and the conformations of residues inside it, particularly Phe48 and Asp50 (Fig. 3), are potentially influenced by His49 (on the outside of the barrel and in contact with conserved residues in the V loop). These influences may exert subtle effects on the active site either through interactions with a partner protein or in response to other stimuli.
Thus, perhaps significantly, Phe48, His49, and Asp50 in GePTS1 (whose equivalents in PsPTS1 are Phe47, His48, and Asp49; Fig. 8B) are conserved in nearly all dirigent proteins (Fig. 3). In addition, the role of the conserved residue Arg145 (conserved in many DP sequences, including DRR206 and AtDIR6) is currently unproven, as discussed above, but a role in stabilization of the partial negative charge on the QM oxygen (at C-7) remains a reasonable assumption. Resolution of these and other ambiguities will likely require crystallization of PTS with bound substrates, products, intermediates, or their analogues.

Concluding remarks
The key mechanistic aspects of the three DP types herein are (a) binding of monomeric species (achiral or chiral), (b) QM formation and binding (or a radical or ionic counterpart) (e.g. either via intermolecular coupling and bis-QM generation or mono-QM generation), and (c) re-aromatization (through either intramolecular cyclization (C-O bond formation) or intramolecular rearrangement).
It appears that in all three DP types (medicarpin-, pinoresinol-, and gossypol-forming DPs), the active site must be able to accommodate and stabilize QM intermediates. Assuming these are generated, both the lignan- and pterocarpan-forming DPs can then undergo intramolecular cyclization (C-O bond formation) to afford the corresponding products. However, whether this occurs at the DP active sites or following release of the mono- or bis-QM intermediates remains to be established. This differs, however, from the aromatic terpenoid (+)-gossypol-forming DP, which undergoes re-aromatization, with the latter occurring also either before or after release from the DP active site. The DP active sites thus can accommodate either at least two monomers for coupling or alternatively larger molecules for further processing (here intramolecular cyclization to afford pterocarpans).
These insights, we propose, will be of critical importance in both predicting and establishing the precise biochemical roles of the vast DP multigene families awaiting discovery in the future and in establishing the full diversity of the metabolic pathways involved, leading to different plant phenol metabolic classes. Clearly, any distinct land plant phenol metabolic class entry point (e.g. to lignans, lignins, aromatic diterpenoids, and pterocarpans thus far) requiring formation of QM intermediates (or a radical or ionic counterpart) can now be considered as having genes encoding either a DP or DP-like function.
In years gone by, terpenes were considered by some researchers to be produced nonenzymatically, but this notion evaporated when terpene synthases were discovered. As DP functions in land plant metabolism and evolution are identified, the importance of how such organisms actually control QM biochemistries will be perhaps key to better understanding how successful land plant adaptation originated and evolved.

Materials
All solvents and reagents were purchased from either Sigma-Aldrich or Fisher Scientific. Racemic vestitone was purchased from Santa Cruz Biotechnology, Inc., and synthetic (+)-medicarpin was kindly provided by Dr. K. H. Lee (University of North Carolina, Chapel Hill, NC, USA).
NMR spectra were recorded on a Varian VNMRS spectrometer operating at 599.64 and 150.79 MHz for 1H and 13C, respectively, and equipped with a 5-mm HCN cryoprobe (Varian) with a cold carbon preamp. The sample temperature was maintained at 20°C for all experiments. J values are given in Hz. One-dimensional 1H and 13C and two-dimensional gHSQCAD, gHMBCAD, and gCOSY spectra (Figs. S8-S18) were acquired for both cis- and trans-DMI using typical acquisition and processing parameters. For the cis-DMI sample, a HOMO2DJ experiment was also acquired to aid in resolving the peak positions and J-coupling on a multiplet region centered at 6.39 ppm (see spectra in Figs. S8 and S10). Chemical shifts were referenced internally to the solvent methanol-d4 (3.31 ppm for the residual methyl proton and 49.15 ppm for C). For full NMR acquisition details, see the supporting information.

Synthesis of cis-DMI ((3R,4R) and (3S,4S)) and trans-DMI ((3S,4R) and (3R,4S))
The four stereoisomers of cis- and trans-DMI were chemically prepared by sodium borohydride (NaBH4) reduction of racemic vestitone as described (19,34). To a solution of racemic (3RS)-vestitone (40 mg) in ethanol (2 ml) was added NaBH4 (80 mg) at room temperature. The contents were stirred for 2-3 h until the vestitone was totally reduced to the corresponding DMI. After completion of the reaction, excess ethanol was removed in vacuo, the reaction mixture was quenched with water (3 ml), and the whole was extracted with ethyl acetate (2 × 30 ml). The ethyl acetate solubles were combined, passed through an anhydrous Na2SO4 plug, and evaporated to dryness in vacuo. The residue so obtained was subjected to silica gel preparative TLC as described in Uchida et al. (19) using toluene/ethyl acetate/methanol/benzene (6:4:1:3) to individually afford cis-DMI ((3R,4R) and (3S,4S)) (11.7 mg) and trans-DMI ((3S,4R) and (3R,4S)) (21.5 mg), respectively.

Cloning and heterologous expression of G. echinata pterocarpan synthase 1 (GePTS1) and mutants
The GePTS1 coding sequence (GenBank™ accession no. LC121822), as well as four individual mutants (D50A, D83A, Y103F, and Y181F), were codon-optimized for E. coli and synthesized via GeneOptimizer® (Invitrogen) without the N-terminal signal peptide (23 amino acids). The GePTS1 gene was cloned into the pET101/D-TOPO® E. coli expression vector, whereas the four mutants were cloned into pET100/D-TOPO®. GePTS1 and each of the four mutant constructs were transformed into One Shot® BL21 Star™ (DE3) competent E. coli (Invitrogen) according to the manufacturer's protocol. Initial Luria-Bertani medium cultures (10 ml) containing 100 μg/ml carbenicillin were incubated overnight (~15 h) at 37°C with shaking at 250 rpm. A 500-μl aliquot of each culture was then used to inoculate Luria-Bertani medium (50 ml) containing 100 μg/ml carbenicillin.
After incubating at 37°C with shaking at 250 rpm to obtain an A600 of ~0.6, the cultures were induced with isopropyl 1-thio-β-D-galactopyranoside at a final concentration of 1 mM. After continued shaking at 28°C for 24 h, cells were harvested by centrifugation at 3,000 × g for 20 min at 4°C, with the pellets frozen and stored at -80°C.

Cloning and heterologous expression of P. sativum medicarpin-forming DP (PsPTS1)
The GePTS1 sequence was used to search the P. sativum "Cam_eor" UniGene set (24), resulting in a gene (PsCam039127) being selected and named PsPTS1. PsPTS1 had ~92%/85% similarity/identity to the GePTS1 peptide sequence. The PsPTS1 coding sequence was codon-optimized for E. coli and synthesized as above without its N-terminal signal peptide (21 amino acids). Cloning and expression of PsPTS1 were performed using the GePTS1 protocols above.

Purification of GePTS1, PsPTS1, and the four GePTS1 mutant His tag fusion proteins
Pelleted cultures were individually lysed using BugBuster® Protein Extraction Reagent (EMD Millipore) with Benzonase® Nuclease and rLysozyme™ added. Purification of each protein was individually performed using a POROS™ 20 MC metal chelate affinity (Thermo Scientific) column. Each cell-free extract was applied to the POROS™ 20 MC column equilibrated in binding buffer (20 mM Tris-HCl, pH 7.9, 500 mM NaCl, and 20 mM imidazole) at 4°C and then washed with 10 bed volumes of binding buffer to remove unbound proteins. Each recombinant protein was next eluted using elution buffer (20 mM Tris-HCl, pH 7.9, 500 mM NaCl) containing imidazole initially at a concentration of 150 mM and then 300 mM.
Individual fractions of each recombinant protein were subjected to SDS-PAGE using a Mini-PROTEAN® TGX™ precast gel, 4-20% gradient (Bio-Rad), with visualization done by silver staining. Fractions containing each of the recombinant proteins (Fig. S1) were individually pooled, and the buffer was exchanged to 25 mM Tris-HCl (pH 7.9) using a PD10 column (GE Healthcare), following which the resulting protein solutions were individually concentrated using an Amicon® Ultra-4 10K centrifugal filter (Millipore). Protein quantification was carried out using the Bradford assay (Bio-Rad) microassay procedure. Typically, 5-7 mg of each pure protein were obtained from a 30-ml E. coli culture.

CD spectrophotometry
CD spectra of WT GePTS1 and mutants were recorded on an AVIV model 410 CD spectrophotometer. Samples were dissolved in 20 mM Tris-HCl buffer, pH 7.9. Protein concentration ranged from 170 to 650 μg/ml and was measured as above. Spectra were recorded at 25°C in 1-mm quartz cuvettes over a wavelength range from 270 to 190-200 nm, depending on concentration. Data were collected with 0.5-nm wavelength steps, 1.0-nm bandwidth, and 1.0-s averaging time. Four scans were averaged, and a buffer blank was collected prior to each sample and subtracted from the average of the four scans. Spectra were not smoothed but were normalized to the same concentration for comparability.

Crystallization and X-ray data collection
Initial crystallization conditions for GePTS1 and PsPTS1 were obtained using the microbatch-under-oil method employing 1,536-well microassay plate high-throughput screening (25). GePTS1 and PsPTS1 crystals were subsequently flash-cooled in crystallization buffer, supplemented with either 25% (v/v) glycerol or ethylene glycol in H2O, stored in cryo-vials, and shipped to the Stanford Synchrotron Radiation Lightsource (SSRL) for data collection. The GePTS1 crystals belong to the trigonal space group P3₁21 with unit cell dimensions a = b = 162.572 Å, c = 99.763 Å, and diffracted to ~2.6 Å resolution. A complete data set comprising 1,000 images with a rotation angle of 0.2° was collected from a single crystal on SSRL beamline BL9-2 using X-rays at 12,658 eV (0.97946 Å) and a PILATUS 6M PAD detector running in the shutterless mode. Data were processed with XDS (35) and scaled with AIMLESS (36) from the CCP4 suite of programs (37). The Matthews coefficient (38), assuming six molecules in the asymmetric unit, was 3.1 Å³/Da (60% solvent content). Final data collection statistics are given in Table 2 (39-42).
The PsPTS1 crystals belong to the cubic space group P2₁3 with unit cell dimensions a = b = c = 78.893 Å, diffracting to ~1.5 Å resolution. A complete data set comprising 900 images with a rotation angle of 0.2° was collected from a single crystal on SSRL beamline BL9-2 using X-rays at 12,658 eV (0.97946 Å) and a PILATUS 6M PAD detector running in the shutterless mode. Data were processed with XDS (35) and scaled with AIMLESS (36). The Matthews coefficient (38), assuming one molecule in the asymmetric unit, was 2.53 Å³/Da (51% solvent content). Final data collection statistics are given in Table 2.
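The solvent contents quoted alongside the Matthews coefficients can be checked with a short calculation. The sketch below assumes the conventional approximation that the protein volume fraction is 1.23/Vm (with Vm in Å³/Da), which corresponds to a typical protein partial specific volume; the function names are illustrative only. The unit cell volumes are computed from the cell dimensions given above, taking gamma = 120° for the trigonal cell on hexagonal axes.

```python
import math


def cell_volume(a, b, c, alpha=90.0, beta=90.0, gamma=90.0):
    """Unit cell volume (cubic angstroms) from the general triclinic formula."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1.0 - ca**2 - cb**2 - cg**2 + 2.0 * ca * cb * cg)


def solvent_fraction(vm):
    """Approximate solvent fraction from a Matthews coefficient Vm.

    Assumption: protein volume fraction = 1.23 / Vm, the conventional
    estimate for a typical protein partial specific volume.
    """
    return 1.0 - 1.23 / vm


if __name__ == "__main__":
    # GePTS1: trigonal cell on hexagonal axes (gamma = 120 degrees)
    v_ge = cell_volume(162.572, 162.572, 99.763, gamma=120.0)
    # PsPTS1: cubic cell
    v_ps = cell_volume(78.893, 78.893, 78.893)
    print(f"GePTS1 cell volume: {v_ge:,.0f} cubic angstroms")
    print(f"PsPTS1 cell volume: {v_ps:,.0f} cubic angstroms")
    print(f"GePTS1 solvent content: {100 * solvent_fraction(3.1):.0f}%")
    print(f"PsPTS1 solvent content: {100 * solvent_fraction(2.53):.0f}%")
```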
An additional data set from a second cryo-cooled crystal was collected on BL9-2 using X-rays at 7,500 eV (1.65307 Å) via the inverse beam method and wedges of 30° to maximize the anomalous signal from the intrinsic sulfur atoms. A complete data set comprising 1,800 images of 0.2° each was collected and also processed with XDS (35) and scaled with AIMLESS (36). Statistics are given in Table 2.
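The photon energies and wavelengths quoted for the data collections are related by λ = hc/E. As a minimal sketch (the constant is the standard value of hc in eV·Å; the function names are illustrative only), the conversion and the total rotation range covered by each data set can be computed as follows.

```python
HC_EV_ANGSTROM = 12398.42  # Planck constant times speed of light, in eV * angstrom


def wavelength_angstrom(energy_ev):
    """Convert an X-ray photon energy (eV) to wavelength (angstrom)."""
    return HC_EV_ANGSTROM / energy_ev


def total_rotation_deg(n_images, step_deg):
    """Total crystal rotation covered by n_images frames of step_deg each."""
    return n_images * step_deg


if __name__ == "__main__":
    for energy in (12658, 7500):
        print(f"{energy} eV -> {wavelength_angstrom(energy):.5f} angstrom")
    # (label, number of images, rotation per image) as given in the text
    for label, n, step in (("GePTS1 native", 1000, 0.2),
                           ("PsPTS1 native", 900, 0.2),
                           ("PsPTS1 sulfur-SAD", 1800, 0.2)):
        print(f"{label}: {total_rotation_deg(n, step):.0f} degrees")
```

The computed wavelengths agree with the quoted values to within about 1e-4 Å; the longer wavelength used for the second PsPTS1 data set is consistent with the stated aim of enhancing the anomalous signal from sulfur.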

Data processing, structure determination, and refinement
The GePTS1 structure was solved by molecular replacement using a starting model derived from the dirigent protein AtDIR6 from A. thaliana (PDB code 5LAL) (22). Two models were used comprising (a) monomeric AtDIR6 and (b) trimeric AtDIR6. GePTS1 and AtDIR6 sequences were aligned, and both AtDIR6 models were converted into pseudo-GePTS1 models using the program CHAINSAW (43) from the CCP4 suite (37), whereby identical residues in the two sequences were retained, and those that differed were truncated at the Cβ atom. A good molecular replacement solution was obtained using the trimeric pseudo-GePTS1 model (searching for two copies) using the program PHASER in the PHENIX suite (44), with a translation function Z-score (TFZ) of 47.2 and a log-likelihood gain (LLG) after refinement of 2,432. The same solution was obtained using the monomeric pseudo-GePTS1 model (TFZ = 34.5, LLG = 2,442), searching for six copies. This latter solution was submitted to a round of automated model building with PHENIX.AUTOBUILD using data to 2.65 Å resolution, giving a crystallographic Rwork and Rfree of 0.313 and 0.367, respectively, with 824 residues built in 43 fragments covering the six molecules in the asymmetric unit. Refinement of the GePTS1 structure using all data to 2.6 Å resolution was completed with PHENIX.REFINE (44), alternating with manual building of the model using the molecular graphics program COOT (28). Water molecules were added at structurally and chemically relevant positions, and the atomic displacement parameters for all atoms in the structure were refined isotropically. Final refinement statistics are given in Table 3 (45).
The PsPTS1 structure was solved by sulfur-SAD methods implemented in PHENIX. Following solvent flattening and density modification, the overall figure of merit was 0.325. Autobuilding in PHENIX generated a model comprising 95 of 169 expected residues. Initial refinement with PHENIX.REFINE gave Rwork and Rfree values of 0.33 and 0.35, respectively. The model was rebuilt into the density-modified electron density, and subsequent refinement was switched to the 1.5 Å resolution native data. Water molecules were added at structurally and chemically relevant positions, with atomic displacement parameters for all atoms in the structure refined isotropically. Final refinement statistics are given in Table 3. Final coordinates and structure factors have been deposited in the Protein Data Bank with accession codes 6OOC (GePTS1) and 6OOD (PsPTS1).

Substrate docking
Ligand-protein docking simulations were set up, run, and analyzed in the Windows version of AutoDockTools (ADT version 1.5.6), a graphical user interface to the AutoDock 4 suite of programs for predicting binding of small molecules (substrates, inhibitors) to a macromolecular receptor's 3D structure (46). The four substrate (DMI) diastereomers and two QM intermediate enantiomers, with two alternate conformations of the flavone pyran ring where the 3′ phenol substituent was either pseudo-equatorial or pseudo-axial for each (12 structures total), were built and energy-minimized in Chem3D and saved in .mol2 format. Docking simulations were performed with AutoGrid and AutoDock. AutoGrid was used to precompute the grid maps of interaction energies for various atom types in the ligand with the enzyme. These grid maps were then used in the AutoDock docking simulations to determine the total ligand-protein interaction energy. To prepare the structures of the ligand and the protein for docking, missing hydrogen atoms and Gasteiger partial atomic charges were added to their 3D structures loaded from their respective .mol2 and .pdb files.
The water molecules present with the enzyme structure were removed. AutoDockTools identified five active torsions in the ligand. The grid box was centered upon the enzyme with a grid spacing of 0.375 Å, with sufficient size to cover the ligand- and receptor-binding sites. No motion was permitted in the protein backbone or side chains. After the structures were prepared, AutoGrid was run to obtain the grid maps for the AutoDock calculations. The AutoDock calculations were executed using the Lamarckian genetic algorithm with 100 dockings per ligand and 2,500,000 energy evaluations per docking. Finally, 100 enzyme-bound ligand conformations were obtained and analyzed.
Table 2 footnotes: a, Rmeas is the redundancy-independent merging R factor (39). b, Rpim is the precision-indicating merging R factor (40). c, Percentage of correlation between intensities from random half-sets of data (41). d, Correlation of ΔIanom from two random half-sets (42).
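The size of the docking campaign described above follows directly from the stated numbers. The sketch below simply enumerates the 12 ligand structures and tallies the docking runs and energy evaluations; the stereodescriptor labels for the two QM enantiomers and all variable names are illustrative assumptions, and no AutoDock input files are generated here.

```python
from itertools import product

# Ligand species as described in the text; the QM labels are assumed
# placeholders for the two enantiomers of the DMI-QM intermediate.
DMI_DIASTEREOMERS = ("(3R,4R)", "(3S,4S)", "(3S,4R)", "(3R,4S)")
QM_ENANTIOMERS = ("(3R)", "(3S)")
RING_CONFORMERS = ("pseudo-equatorial", "pseudo-axial")

# Search settings quoted in the text
DOCKINGS_PER_LIGAND = 100
ENERGY_EVALS_PER_DOCKING = 2_500_000


def build_ligand_set():
    """Return every (species, ring conformer) pair submitted to docking."""
    species = [f"DMI {s}" for s in DMI_DIASTEREOMERS]
    species += [f"DMI-QM {s}" for s in QM_ENANTIOMERS]
    return list(product(species, RING_CONFORMERS))


ligands = build_ligand_set()
total_poses = len(ligands) * DOCKINGS_PER_LIGAND
total_evals = total_poses * ENERGY_EVALS_PER_DOCKING

if __name__ == "__main__":
    for name, conformer in ligands:
        print(f"{name:16s} {conformer}")
    print(f"{len(ligands)} ligand structures, {total_poses} docked poses, "
          f"{total_evals:,} energy evaluations in total")
```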

Data availability
Coordinates and structure factors for GePTS1 and PsPTS1 have been deposited in the Protein Data Bank with accession codes 6OOC and 6OOD, respectively. All other data are contained within the article and the supporting information.
Acknowledgments-This paper is dedicated to Professor W. David Nes (Texas Tech) on the occasion of his 65th birthday and to Emeritus Professor Robert Verpoorte for his scientific contributions dedicated to the plant sciences. A portion of the research was performed using EMSL (grid.436923.9), a United States Department of Energy Office of Science User Facility sponsored by the Office of Biological and Environmental Research. Use of the Stanford Synchrotron Radiation Lightsource, SLAC National Accelerator Laboratory, was supported by the Department of Energy, Office of Science, Basic Energy Sciences (BES), under Contract DE-AC02-76SF00515. The SSRL Structural Molecular Biology Program is supported by the Department of Energy Office of Biological and Environmental Research (BER) and by NIGMS, National Institutes of Health, Grant P41GM103393.