Deletion of a Single Amino Acid Residue from Different 4-Coumarate:CoA Ligases from Soybean Results in the Generation of New Substrate Specificities

Plant 4-coumarate:coenzyme A ligases, acyl-CoA ligases, peptide synthetases, and firefly luciferases are grouped in one family of AMP-binding proteins. These enzymes do not only use a common reaction mechanism for the activation of carboxylate substrates but are also very likely marked by a similar functional architecture. In soybean, four 4-coumarate:CoA ligases have been described that display different substrate utilization profiles. One of these (Gm4CL1) represented an isoform that was able to convert highly ring-substituted cinnamic acids. Using computer-based predictions of the conformation of Gm4CL1, a peptide motif was identified and experimentally verified to exert a critical influence on the selectivity toward differently ring-substituted cinnamate substrates. Furthermore, one unique amino acid residue present in the other isoenzymes of soybean was shown to be responsible for the incapability to accommodate highly substituted substrates. The deletion of this residue conferred the ability to activate sinapate and, in one case, also 3,4-dimethoxy cinnamate and was accompanied by a significantly better affinity for ferulate. The engineering of the substrate specificity of the critical enzymes that activate the common precursors of a variety of phenylpropanoid-derived secondary metabolites may offer a convenient tool for the generation of transgenic plants with desirably modified metabolite profiles.

The well-regulated secondary metabolism of plants can be conceived as an adaptive network of specialized enzymes, responding to diverse environmental and internal stimuli. This network ensures an efficient partitioning of resources for the benefit of the plant. The phenylpropanoid pathway of plants is a clear example of this concept. This pathway uses phenylalanine-derived intermediates for the biosynthesis of compounds used as UV protectants, defense chemicals, signaling compounds, and flower pigments, as well as for the building units of one of the most important structural polymers, lignin. One essential early step in the biosynthesis of phenylpropanoids is the activation of differently ring-substituted cinnamic acids to the corresponding coenzyme A thioesters. This reaction is catalyzed by 4-coumarate:CoA ligase (4CL; EC 6.2.1.12), an enzyme that employs a two-step reaction mechanism related to that of other adenylate-forming enzymes (1) such as fatty acyl-CoA synthetases, peptide synthetases, and luciferases. In the first step, ATP forms an adenylate intermediate with the carboxyl moiety of the respective substrate from which, in the second step, the activated group is transferred to an acceptor, mostly the sulfhydryl group of an enzyme-bound cofactor, followed by the concomitant release of AMP. Crystal structures are available for some members of the protein family, allowing the structural modeling of plant 4CLs (2,3).
Plant 4CLs have been characterized from a wide range of species, showing different isoform distribution patterns. Some plants exhibit only a single form, whereas others contain multiple isoforms. In plants with multiple isoforms, these can exhibit similar or highly divergent substrate specificities in discriminating differently ring-substituted cinnamic acids. For example, in parsley (Petroselinum crispum) (4), potato (Solanum tuberosum) (5), and loblolly pine (Pinus taeda) (6), the multiple 4CL isogenes encode identical or very similar proteins. Conversely, soybean (Glycine max) (7,8), tobacco (Nicotiana tabacum) (9), aspen (Populus tremuloides) (10), hybrid poplar (Populus trichocarpa × P. deltoides) (11), and Arabidopsis thaliana (12) contain structurally divergent 4CL isoforms. Due to the pronounced differences at the structural, enzymatic, and regulatory levels, the four soybean isozymes deserve special attention (7,8). The two genes encoding isoforms 3 and 4 are very closely related and could not be distinguished with respect to the enzymatic properties of the encoded proteins. These isoforms were strongly up-regulated at the transcriptional level by elicitation or infection (7,8). In contrast, the structurally diverse isoform 1 was demonstrated to be down-regulated after elicitation, whereas isoform 2 showed nearly no response (8). In addition to this rather unusual behavior, isoform 1 displayed an interesting biochemical feature: it represented the first 4CL for which a cDNA had been isolated and whose encoded enzyme was able to convert sinapate to its respective CoA ester (8), a step potentially of critical importance for the synthesis of syringyl-type lignin.
Protein sequence alignments of 4CLs revealed the existence of conserved peptide motifs (5,7,11,12). Box I, described as a putative nucleotide-binding motif, has been used as a signature for the superfamily of adenylate-forming enzymes (13,14), whereas the absolute conservation of the box II motif seemed to be restricted to 4CLs (5). The box I and II motifs have been the subject of recent investigations using mutant 4CL isoforms from Arabidopsis (2,3,15). Interestingly, the modification of highly conserved residues included in each of these boxes, which have been postulated to be essential for the enzymatic reaction, did not result in the total loss of activity but showed rather subtle changes of the kinetic parameters of the conversion of caffeate to the respective CoA ester (2). Putative substrate-binding domains within the amino acid sequence have been identified by domain-swapping approaches (15) and have been shown to include specificity-conferring residues that were predicted to be in direct contact with the ring substituent of the cinnamic acid substrate, ferulate (3).
In our earlier report, we described the first cloned 4CL isoenzyme capable of converting sinapate to its thiol ester (8). We now aimed to pinpoint the major difference(s) in the structures of the soybean isozymes that are responsible for the different substrate specificities. In a first attempt, domain swapping between one isoform using a narrow substrate range and one isoform with a broad substrate range was used to confine the regions important for the binding of the acid substrate. Second, structural comparisons have been conducted using 4CL protein alignments in conjunction with crystal structures derived from enzymes utilizing similar enzymatic mechanisms. This allowed the deduction of regions or single amino acid residues with a suspected importance in the enzymatic reaction toward the cinnamic acid substrate. Site-directed mutagenesis was then used to generate modified proteins, which were assayed for altered substrate specificities.
Intriguingly, the deletion of one single amino acid residue from Gm4CL2 as well as from Gm4CL3, which is absent in wild-type Gm4CL1, resulted in the generation of new specificities and allowed the mutant enzymes to convert sinapate. The possibility of converting 4CL isoforms into sinapoyl-synthesizing enzymes may lead to valuable transgenic plants, where both lignin production and composition may be customized by using inducible promoters in conjunction with recombinant 4CL proteins expressing the substrate specificity of choice.

EXPERIMENTAL PROCEDURES
Structural Analysis-Amino acid sequences were aligned and modeled using SWISS-Model (www.expasy.ch). The crystal structure of luciferase (Protein Data Bank codes 1LCI, 1BA3) (16) was used as a template for the prediction of the putative conformations of Gm4CL1, Gm4CL3, as well as Gm4CL3dVal367.
4CL Hybrids-The plasmids pQE-30/Gm4CL1 and pQE-31/Gm4CL3, described in (8), were used for the generation of hybrids between the soybean 4CL isoforms 1 and 3. Restriction sites available at identical positions in both cDNAs or generated by PCR at the respective positions were used to assemble different portions of the cDNAs according to Fig. 1A and to yield the chimeric proteins depicted in Fig. 1B.
Site-specific Mutagenesis-The modification of single nucleotide residues was performed using proof-reading PCR according to Refs. 17 and 18. Briefly, for each mutation, a pair of oligonucleotides was synthesized including the desired alterations surrounded by at least eight nucleotides of the original sequence. The size of the primers was adjusted to yield a melting temperature of 75-77°C by using the following formula: Tm = 81.5 + 0.41 × GC (%) − 675/number of bases − sequence deviation (%). For the amplification, 10 ng of column-purified plasmid DNA (Genomed, Bad Oeynhausen, Germany) was used in a total volume of 15 µl, including 1 µM of each primer, 200 mM dNTPs, and one unit of Pfu polymerase (Promega/Serva). In total, 18 cycles were conducted, each consisting of 35 s at 94°C, 45 s at 55°C, and 12 min at 72°C, preceded by a melting step at 94°C for 2 min and followed by a final extension step at 72°C for 10 min. Subsequently, the parental template DNA was digested with DpnI (Promega/Serva), and the amplified plasmids were purified and transformed into Escherichia coli DH5α (19). The mutations were verified by sequencing.
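The melting temperature formula used for primer design can be evaluated with a short script. The following Python sketch is only an illustration under stated assumptions: the function names and the example primer are hypothetical and are not taken from the original protocol, and both GC content and sequence deviation are expressed as a percentage of the primer length.

```python
def primer_tm(gc_percent: float, n_bases: int, deviation_percent: float) -> float:
    """Tm = 81.5 + 0.41 * GC(%) - 675 / number of bases - sequence deviation(%)."""
    if n_bases <= 0:
        raise ValueError("primer length must be positive")
    return 81.5 + 0.41 * gc_percent - 675.0 / n_bases - deviation_percent


def primer_stats(primer: str, n_altered: int) -> tuple[float, float]:
    """Return (GC %, sequence deviation %) for a primer.

    n_altered is the number of primer positions that differ from the
    template (an assumption about how the deviation is counted).
    """
    seq = primer.upper()
    gc_percent = 100.0 * (seq.count("G") + seq.count("C")) / len(seq)
    deviation_percent = 100.0 * n_altered / len(seq)
    return gc_percent, deviation_percent


if __name__ == "__main__":
    # Hypothetical 40-mer with one altered position; not a real primer.
    example = "ATGC" * 10
    gc, dev = primer_stats(example, n_altered=1)
    print(f"GC = {gc:.1f}%, deviation = {dev:.1f}%, "
          f"Tm = {primer_tm(gc, len(example), dev):.1f} °C")
```

In practice the primer would be lengthened or shortened until the computed value falls into the 75-77°C window stated in the text.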
Expression in E. coli and Enzyme Purification-Heterologous expression, enzymatic assays, and calculations of isoenzyme specificities were performed as described (8). Isolation of recombinant proteins was achieved by affinity dye chromatography combined with immobilized metal chelate affinity chromatography. Briefly, crude bacterial extracts were applied to reactive blue CL-6B (Sigma) in 50 mM Tris/HCl, pH 8, 14 mM β-mercaptoethanol, and 30% (v/v) glycerol (buffer A), the column was washed with buffer A including 0.6 M NaCl, and proteins were eluted by addition of 2 M NaCl. The eluate was applied to nickel-nitrilotriacetic acid agarose (Qiagen), equilibrated previously in buffer A including 2 M NaCl. After washing (buffer A, 1 M NaCl, 10 mM imidazole), the bound proteins were eluted by raising the imidazole concentration to 200 mM.
Antisera and Bacterial Strains-An antiserum raised against parsley 4CL (20) was used in conjunction with anti-rabbit IgG from goat coupled to alkaline phosphatase (Sigma) to verify the expression from different constructs according to Ref. 8. E. coli DH5α was used for the propagation of recombinant plasmids, and the strain SG13009 (Qiagen) was used for the expression of proteins.

RESULTS
Domain Swapping-The plant 4CLs belong to a family of AMP-binding enzymes utilizing a common two-step activation mechanism for carboxylate substrates as diverse as fatty, acetic, amino, or cinnamic acids or chlorobenzoate and luciferin (1,13,16,21). PheA, the phenylalanine-activating subunit of gramicidin S synthetase 1 from Bacillus brevis, which employs the same reaction mechanism, has been crystallized in a ternary complex with the substrates AMP and phenylalanine (22). Luciferase from the firefly Photinus pyralis, another member of this enzyme family for which a structure is available, showed a structure similar to PheA, which allowed the presumptive assignment of the amino acid residues implicated in the binding of the substrates when compared with PheA (16). Since firefly luciferase and plant 4CLs share 31-33% identical amino acid residues, it was then possible to use the structure of the firefly luciferase as a template to model the hypothetical three-dimensional conformation of Gm4CL1. All these proteins fold into a larger N-terminal domain and a smaller C-terminal domain with the substrate-binding pocket being located near the transition of the N-terminal domain to the C-terminal domain. This putative substrate-binding region mapped to the central part of Gm4CL1.
We tested this prediction by using a domain-swapping approach exploiting the two soybean 4CL isoforms, Gm4CL1 and Gm4CL3, which showed the most diverse substrate specificities (8). Following the assembly of different hybrids from Gm4CL1 and Gm4CL3 cDNAs (Fig. 1A), the derived cDNAs were cloned into a bacterial expression vector (pQE30 or pQE31). As summarized in Fig. 1B, nine chimeric proteins have been generated, from which five were shown to be inactive. The hybrid proteins H1 and H5, which each contained the highly conserved C-terminal domain of the other form, showed no differences in substrate specificities (Table I), excluding an impact of this protein region on the utilization of the cinnamic acid substrate. The replacement of the central region of Gm4CL1 with the respective portion of Gm4CL3 (H3, H4) resulted in a significant alteration of the substrate specificity. The hybrid proteins H3 and H4 accepted only the same restricted substrate range of cinnamic acid substrates as isoform 3 (Table I). The failure to generate active hybrid proteins in the reverse combination of isoforms 1 and 3 (H7, H8) may have been due to the N-terminal extension present in Gm4CL3 as opposed to Gm4CL1, which may have resulted in a structural perturbation that was not tolerated by an isoform 1-type enzyme. The hybrid forms H2 and H6, which were merged in the middle of the central region, as well as H9, which contained only the central part of Gm4CL1 in the backbone of Gm4CL3, all yielded inactive enzymes despite a positive expression of all of these recombinant proteins (Fig. 1C).
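The composition of the chimeric proteins can be made explicit with a small bookkeeping sketch. The Python example below encodes the fusion points of the four active hybrids (H1, H3, H4, and H5) as they are listed in the legend to Fig. 1. The class and function names are hypothetical, and the parent sequences in the usage example are placeholder strings, not the real Gm4CL1 or Gm4CL3 sequences.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    parent: str  # "Gm4CL1" or "Gm4CL3"
    start: int   # first residue, 1-based, inclusive
    end: int     # last residue, 1-based, inclusive

    @property
    def length(self) -> int:
        return self.end - self.start + 1


# Fusion points of the active hybrids, taken from the Fig. 1 legend.
HYBRIDS = {
    "H1": [Segment("Gm4CL1", 1, 400), Segment("Gm4CL3", 425, 570)],
    "H3": [Segment("Gm4CL1", 1, 131), Segment("Gm4CL3", 156, 570)],
    "H4": [Segment("Gm4CL1", 1, 131), Segment("Gm4CL3", 156, 444),
           Segment("Gm4CL1", 421, 546)],
    "H5": [Segment("Gm4CL3", 1, 444), Segment("Gm4CL1", 421, 546)],
}


def hybrid_length(segments: list[Segment]) -> int:
    """Total number of residues in a chimeric protein."""
    return sum(seg.length for seg in segments)


def assemble(segments: list[Segment], parents: dict[str, str]) -> str:
    """Concatenate the parental sequence stretches named by the segments."""
    return "".join(parents[seg.parent][seg.start - 1:seg.end] for seg in segments)


if __name__ == "__main__":
    # Placeholder sequences of the parental lengths (546 and 570 residues).
    parents = {"Gm4CL1": "A" * 546, "Gm4CL3": "C" * 570}
    for name, segments in HYBRIDS.items():
        chimera = assemble(segments, parents)
        print(name, hybrid_length(segments), len(chimera))
```

With real parental sequences substituted for the placeholders, the same function would return the predicted primary sequence of each hybrid.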
Alanine Scanning-The involvement of the central domain of 4CL in the recognition of cinnamic acid substrates was surveyed in more detail. The sequential superposition of the central protein regions of PheA, which have been shown to build part of the active site cavity, in the first step to luciferase and then to a computer-generated model of Gm4CL1, led to the selection of a protein region of 4CL presumably involved in the binding of the cinnamic acid substrate (residues 331-348 of Gm4CL1, Fig. 2A). In addition, in the vicinity of the selected amino acid sequence, a tripeptide (VPP), which was strictly conserved between 4CLs (Fig. 2A, positions 284-286 in Gm4CL1), was observed. The residues Ala236, Ile330, Cys331 and Ala322, Ala301, Thr278 of PheA have been found to line both sides of the specificity pocket for the phenylalanine side chain, completed by Trp239 at the bottom of the cavity (22). From these, four residues were located in or close to the highly conserved peptides shown in the restricted alignment of PheA with the soybean and Arabidopsis 4CLs (Fig. 2A). Accordingly, these residues represented promising candidates for a functional role in 4CLs as well.
Both regions of Gm4CL1 highlighted in Fig. 2A were therefore chosen for an alanine-scanning mutagenesis approach to attribute functional importance to specific amino acid side chains by replacing consecutive residues with alanine. The substrate specificities of the recombinant proteins were determined and compared with the wild-type Gm4CL1 enzyme (Fig. 2B). One replacement (T331A) did not have any influence on enzyme activity, whereas nine of the mutations displayed major effects resulting in nearly complete loss of conversion of cinnamic acid substrates (G333A, Y336A, G337A, M338A, T339A, E340A, G342A, L344A, M348A) despite the positive expression of all of the recombinant proteins (Fig. 2C). Strikingly, the modification of residues 284-286, as well as of residue 334, resulted in larger activities with the highly substituted substrate sinapate when compared with the respective reference values obtained with 4-coumaric acid. With the exception of the mutants P285A and Q334A, this alteration of the enzyme activity ratio was also observed with the artificial substrate 3,4-dimethoxy cinnamate (3,4-DMC).
Deletion of a Single Residue-The alignment of soybean 4CL1 with other plant isoforms for which the substrate range was specified revealed one crucial difference (Fig. 2A and data not shown): the absence of a single valine residue (lacking between Pro343 and Leu344 of Gm4CL1) in the region that was shown to build the phenylalanine-binding pocket in PheA (22). According to the designation of functional side chain residues, as identified by the alanine-scanning mutagenesis, this part of the putative binding pocket showed only a moderate influence on the selectivity for differently substituted cinnamic acids (Fig. 2B). Furthermore, Gm4CL1 represents the only isoform that is able to activate sinapate and 3,4-DMC.
We therefore tested the impact of the valine deletion by engineering it into the two other isoforms (Gm4CL2 and 3) of soybean, which have been shown to display intermediate and narrow substrate utilization ranges, respectively (8). The mutant enzymes were indeed able to convert sinapate, although the affinities were shown to be rather low (Km values of 1208 and 323 µM, respectively, Table II) when compared with Gm4CL1 (Km value of 4.7 µM). Gm4CL2dVal345 did not show any further significant alteration of the catalytic efficiency, in contrast to Gm4CL3dVal367. The latter mutant form gained not only the capacity to convert sinapate but also the capacity to utilize the artificial substrate 3,4-DMC, showing a slightly better relative conversion rate in comparison with sinapate. Moreover, the efficiency of activating ferulic acid was greatly enhanced when compared with the wild-type 4CL3 enzyme (Table II). In the case of Gm4CL3, the deletion of one single residue resulted in a considerable alteration of the functionality of the isoenzyme, albeit accompanied by a decrease of the efficiency for 4-coumarate, the highly preferred substrate of the wild-type isoform, as well as for caffeate.
FIG. 1 (legend, partial). The type of substrate specificity of each chimeric protein is given to the right, as deduced from the data shown in Table I. −, no enzyme activity detectable. The fusion points for the various chimeric proteins are as follows: Hybrid H1: Gm4CL1(M1-G400)::Gm4CL3(Y425-P570); H2: Gm4CL1(M1-A256)::Gm4CL3(A280-P570); H3: Gm4CL1(M1-A131)::Gm4CL3(Y156-P570); H4: Gm4CL1(M1-A131)::Gm4CL3(Y156-T444)::Gm4CL1(G421-N546); H5: Gm4CL3(M1-T444)::Gm4CL1(G421-N546); H6: Gm4CL3(M1-E294)::Gm4CL1(L272-N546); H7: Gm4CL3(M1-K159)::Gm4CL1(I136-N546); H8: Gm4CL3(M1-K159)::Gm4CL1(I136-G400)::Gm4CL3(Y425-P570); H9: Gm4CL3(M1-E294)::Gm4CL1(L272-G400)::Gm4CL3(Y425-P570). In the above list of chimeric proteins, the single-letter codes in parentheses represent the amino acids, and the numbers represent their positions. C, detection of the recombinant chimeric Gm4CL proteins generated in E. coli by Western blotting. Crude protein extracts (10 µg of protein each) were separated on 10% SDS-polyacrylamide gels and transferred onto nitrocellulose filters. For immunodetection, antiserum raised against parsley 4CL (20) combined with goat anti-rabbit IgG conjugated to alkaline phosphatase was used. The relative molecular masses of protein standards are shown on the left.

DISCUSSION
The soybean 4-coumarate:CoA ligase 1 (1) (Gm4CL1) cDNA (8) represents a valuable tool for investigations on the active site of a plant 4CL isoform, which is capable of converting the highly ring-substituted cinnamic acids, sinapate and 3,4-DMC, to the respective CoA thioesters. The aim of our present study was to elucidate the critical difference, inherent in Gm4CL1, that allows this enzyme to accommodate a great range of differently substituted substrates. The analyses were prompted and facilitated by the availability of two other soybean 4CL cDNAs, coding for Gm4CL2 and Gm4CL3, which are distinguished from Gm4CL1 by an intermediate and narrow substrate utilization range, respectively (8). By successively pinpointing one region of the primary sequence of the enzyme contributing to the active site, a single amino acid residue was identified that, in Gm4CL2 and Gm4CL3, prohibited the conversion of the highly substituted substrates.
FIG. 2 (legend, beginning missing). ... Gm) and A. thaliana (At) 4CL isoforms 1, 2, and 3, respectively, in comparison with PheA, the phenylalanine-activating subunit of gramicidin S synthetase 1 from B. brevis. The denoted numbering is according to each single polypeptide, as found in the databases. B, enzyme activity profiles of different recombinant Gm4CL1 proteins generated by site-directed mutagenesis in the protein regions depicted in panel A in comparison with the wild-type (wt) enzyme activity of Gm4CL1 (4CL1/wt). The recombinant proteins were expressed in E. coli, and the specific activities were measured in crude extracts as described previously by using 2 mM cinnamate or 0.5 mM of all other cinnamic acid substrates, respectively (8). The conversion rates for the cinnamic acids, as catalyzed by the recombinant proteins, are each depicted in the order cinnamate, 4-coumarate, caffeate, ferulate, sinapate, and 3,4-DMC (left to right), respectively, and are denoted by different shading of the columns. C, detection of the single-residue mutant proteins and wild-type Gm4CL1 generated in E. coli by Western blotting, performed as described in the legend for Fig. 1. The relative molecular masses of protein standards are shown on the left.
Table legend: The recombinant 4CLs were expressed in E. coli, and specific enzyme activities were determined as described previously (8). The cinnamic acid substrates were used at the following concentrations in the assays: cinnamate, 2 mM; 4-coumarate, caffeate, sinapate, and 3,4-DMC, 0.5 mM; ferulate, 0.1 mM.
One further important prerequisite for the design of our experiments was structural information describing proteins executing similar enzymatic reactions. 4CLs belong to the superfamily of AMP-binding enzymes, which also includes PheA, the adenylation domain of gramicidin S synthetase, and the firefly luciferase (13,16,21,22).
The structures of these proteins have been determined, in the case of PheA even as a ternary complex in the presence of the substrates AMP and phenylalanine (16,22). Based on the amino acid sequence alignment of Gm4CL1 and the unliganded luciferase, a structural model was created, which then was used to superpose PheA. In the predicted conformation of Gm4CL1, the putative substrate-binding cavity mapped to the central amino acid sequence including the amino acid residues 284 to 348 of ligase 1. This prediction was verified by domain-swapping experiments (Fig. 1). The hybrid H4 revealed a conversion of the broad substrate range to the restricted one, as typified by Gm4CL3, after the substitution of a sequence of just 288 amino acid residues in the center of the Gm4CL1 protein with the respective region of Gm4CL3. The difficulties in generating active hybrid proteins experienced with the majority of the constructs prevented more in-depth studies of the localization of the regions that may be important for the determination of the substrate range. This may have been due to the higher divergence in similarity and size of the soybean isoforms 1 and 3 (58% identity (8)) as compared with the Arabidopsis forms 1 and 2 (86% identity (12)), for which a larger number of active hybrid enzymes have been generated recently (15). The analyses of the substrate recognition profiles of the chimeric proteins from Arabidopsis pointed to the same region in the central part of the enzymes, as was found for soybean. Moreover, the dissection of this inner part of the Arabidopsis 4CLs resulted in the designation of two substrate-binding domains (sbd I and sbd II), which independently were able to confer a change in the substrate range (15). Included in these domains are the regions that were analyzed by alanine-scanning mutagenesis in the present study (Fig. 2A).
The sequence of residues 284-286 of Gm4CL1 corresponds to a highly conserved tripeptide in the alignment of the amino acid sequences of a large number of the adenylate-forming enzymes (data not shown) and was therefore chosen for closer inspection. According to the structural data, this tripeptide (Leu279-Pro280-Pro281 in PheA) is in close proximity to the phenolic side chain of the aromatic acid substrate (22). Interestingly, the exchange of each of the respective positions in Gm4CL1 with alanine resulted in an increase of the activity with more highly ring-substituted cinnamic acids with the remarkable exception of the loss of 3,4-DMC activation by the mutation P285A (Fig. 2B). This artificial substrate contains an O-methyl group in para-position, which may sterically hinder the formation of a productive enzyme-substrate complex in the modified substrate groove. Although the participation of the tripeptide in substrate binding cannot be firmly deduced without the elucidation of the real structure of a 4CL in the presence of the acid substrate, our data clearly support the computational model of Gm4CL1, which places this peptide at one end of the substrate-binding pocket. The different ring substitutions of the cinnamic acids imply different spatial requirements for the substrates, which may or may not fit into the groove built by this stretch of amino acid residues, depending on their respective side chains. This conclusion was corroborated by the recent description of a substrate specificity-conferring amino acid in 4CL2 from Arabidopsis (3). The position Met293 of At4CL2, which was speculated to correspond to Thr278 of PheA, was shown to influence the spatial arrangement of the substrate-binding pocket in such a way that only the exchange with a smaller residue allowed the utilization of ferulate (3).
Regardless of minor displacements of the PheA amino acid sequence in relation to different plant 4CL protein alignments used in the aforementioned publication (3), published recently elsewhere (15), or presented in our study (Fig. 2A), the accumulated results indicate that in silico biology can be applied to approach the questions posed in these studies. According to the alignment of Fig. 2A, the second selected region of Gm4CL1 (positions 331 to 348) corresponds to a small amino acid motif of PheA, which includes residues contributing (22). The importance of this central part of the polypeptide chain of Gm4CL1 is reflected by the loss of activity observed for 9 out of 13 single mutant enzymes (Fig. 2B). Interestingly, the exchange at Q334A did not result in a total loss of activity despite the importance of the respective functionally analogous residue Asn321 of PheA. Here, the side chain was proposed to interact both with the exocyclic nitrogen of adenine as well as with N1 of the purine ring through a well-ordered water molecule (22). Moreover, the substrate specificity of the mutant Q334A displayed a reversal of the pattern when compared with that of the wild-type Gm4CL1 enzyme: caffeate and sinapate were preferably converted, in contrast to coumarate, ferulate, and 3,4-DMC. This result could indicate a major influence on cinnamic acid substrate discrimination rather than on AMP binding. The most conspicuous difference of the central part of the polypeptide chain between Gm4CL1 and all other known plant 4CL proteins was the absence of a conserved valine residue between the positions Pro343 and Leu344 of Gm4CL1 (Fig. 2A and data not shown). We substantiated the important contribution of this amino acid to the substrate specificity by engineering its absence into two other 4CL isoforms of soybean that were known to accept only a subset of the ring-substituted cinnamic acids as substrates (8).
The single amino acid deletions had a profound impact on the range of accommodated substrates, generating new substrate specificities for Gm4CL2 and Gm4CL3 (Table II). The mutation of ligase 3, which efficiently converted only 4-coumarate and caffeate, to a form that not only accepted two additional substrates (sinapate and 3,4-DMC) but also showed a drastic improvement of the affinity for another (ferulate), illustrated the relevance of this amino acid for substrate discrimination. Computer-based modeling suggested that the deletion of the single valine residue caused a displacement of the respective peptide loop of Gm4CL3 and concomitantly resulted in a steric rearrangement of the orientation of the neighboring leucine side chain. Interestingly, the exchange at L344A of Gm4CL1 resulted in the total loss of enzyme activity (Fig. 2B). Furthermore, recent evidence indicated the participation of additional neighboring regions of the polypeptide chain, which allowed the mutant Arabidopsis enzymes to convert ferulate (3,15). Taken together, these findings suggest that spatial restrictions of the binding groove rather than specific interactions between the substrate and amino acid residues of the polypeptide chain may be decisive in determining the substrate specificity of distinct 4CL isoenzymes.
In summary, the comparison of enzymatically highly divergent isoforms of 4CL from soybean enabled the prediction of functionally important amino acid residues. Active mutant enzymes with single amino acid exchanges revealed some of the structural principles responsible for the ability to utilize differently substituted cinnamic acids to a different extent. The principles detected for 4CL isozymes from soybean as well as from other sources (2,3) help to understand the catalytic action of this important enzyme of the general phenylpropanoid pathway of plants. They could be useful also for the design of novel 4CL specificities. For example, the manipulation of the lignin composition of trees used for paper production is an economically important trait since a higher content of syringyl monolignols improves the commercial lignin degradation. Although recent discoveries concerning alternative pathways for the synthesis of monolignols provide interesting targets for genetically modified crop plants (23-25), the utilization of 4CL to preferentially activate a range of cinnamic acid substrates leading to highly substituted lignin building units may also constitute a valuable tool.