Bifunctional cis-Abienol Synthase from Abies balsamea Discovered by Transcriptome Sequencing and Its Implications for Diterpenoid Fragrance Production

Background: Balsam fir produces cis-abienol, a natural product of value to the fragrance industry.
Results: We describe the genomics-based discovery of balsam fir cis-abienol synthase.
Conclusion: cis-Abienol synthase is a bifunctional diterpene synthase that produces a bicyclic diterpenol in the class II active site.
Significance: cis-Abienol synthase is a new enzyme for metabolic engineering of plants or microorganisms to produce high value fragrance compounds.

The labdanoid diterpene alcohol cis-abienol is a major component of the aromatic oleoresin of balsam fir (Abies balsamea) and serves as a valuable bioproduct material for the fragrance industry. Using high-throughput 454 transcriptome sequencing and metabolite profiling of balsam fir bark tissue, we identified candidate diterpene synthase sequences for full-length cDNA cloning and functional characterization. We discovered a bifunctional class I/II cis-abienol synthase (AbCAS), along with the paralogous levopimaradiene/abietadiene synthase and isopimaradiene synthase, all of which are members of the gymnosperm-specific TPS-d subfamily. The AbCAS-catalyzed formation of cis-abienol proceeds via cyclization and hydroxylation at carbon C-8 of a postulated carbocation intermediate in the class II active site, followed by cleavage of the diphosphate group and termination of the reaction sequence without further cyclization in the class I active site. This reaction mechanism is distinct from that of synthases of the isopimaradiene- or levopimaradiene/abietadiene synthase type, which employ deprotonation reactions in the class II active site and secondary cyclizations in the class I active site, leading to tricyclic diterpenes. Comparative homology modeling suggested the active site residues Asp-348, Leu-617, Phe-696, and Gly-723 as potentially important for the specificity of AbCAS. As a class I/II bifunctional enzyme, AbCAS is a promising target for metabolic engineering of cis-abienol production.

Conifers produce a diverse array of diterpenoids as major oleoresin components that play a key role in the chemical defense against herbivores and pathogens, such as bark beetles and their associated fungi (1-3). Oleoresin diterpenoids are also used as large-volume, renewable raw material for the production of a suite of industrial resins and coatings and other bioproducts (4,5). Although tricyclic diterpene resin acids are ubiquitously abundant in the pine family (Pinaceae), the oleoresin of balsam fir (Abies balsamea (L.) Mill.) contains cis-abienol 4 (see Fig. 1), a bicyclic, tertiary diterpene alcohol, as the major diterpenoid (6). Among other applications, cis-abienol 4 and other oxygen-containing diterpenoids of plant origin (e.g. sclareol and manool) are important in the fragrance industry to produce Ambrox, which serves as a sustainable replacement for the use of ambergris in high-end perfume formulations (7). Whereas Ambrox is produced from plant terpenoids, ambergris is an animal product secreted from the intestines of sperm whales, which are listed as an endangered species.
In addition to balsam fir, a gymnosperm tree, only a few angiosperm plant sources, such as tobacco (Nicotiana tabacum; family Solanaceae) trichomes (8,9) or the tuberous roots of Bolivian sunroot (Polymnia sonchifolia; family Asteraceae) (10), are known to produce cis-abienol 4 in amounts that are relevant for industrial scale extraction. The importance of cis-abienol 4 as a plant-derived precursor for the fragrance and bioproducts industry has spawned an interest in the discovery of the relevant biosynthetic genes and enzymes and their future application in metabolically engineered microbial or plant production systems.
In contrast to the well characterized biosynthesis of tricyclic diterpenes by conifer diTPSs of the Iso- and LAS-type, a conifer gene or enzyme for the biosynthesis of cis-abienol 4 has not yet been reported. A plausible reaction sequence of cis-abienol 4 formation, catalyzed by a bifunctional conifer class I/II diTPS, could proceed via water capture of a carbocation intermediate at carbon C-8 and subsequent ionization of the allylic diphosphate group without further cycloisomerization (Fig. 1). Recently, a monofunctional angiosperm diTPS catalyzing the formation of a bicyclic oxygen-containing diterpenoid, copal-8-ol diphosphate synthase from Cistus creticus (Cistaceae), has been reported (23). This enzyme represents a class II diTPS, which catalyzes the protonation-initiated cyclization of GGPP 1 to form the hydroxylated CPP compound. In addition, Mafu et al. (24) described a bifunctional class I/II diTPS, SmCPSKSL1, from the lycophyte Selaginella moellendorffii for the formation of labda-7,13-dien-15-ol, where the primary hydroxyl group is introduced during the reaction of the class I active site.
We describe here the development of a 454 transcriptome resource for balsam fir, which, in conjunction with terpenoid metabolite profiling, was used for the identification and functional characterization of a class I/II gymnosperm cis-abienol synthase (CAS). The same sequence resource also revealed two additional bifunctional class I/II diTPSs, representing the balsam fir Iso and LAS enzymes. Comparative homology modeling and site-directed mutagenesis revealed unique features of cis-abienol 4 biosynthesis and suggested functional divergence in the evolution of CAS relative to the paralogous Iso- and LAS-type diTPSs associated with substitution of key residues in the active sites. A novel feature of CAS is the formation of a tertiary diterpenol at the class II active site of a bifunctional class I/II diTPS.
FIGURE 1. Schematic of the proposed biosynthesis of diterpene olefins and alcohols by class I/II bifunctional diTPSs in conifers. The known activity of LAS- and Iso-type diTPSs in the formation of tricyclic diterpenes involves the stepwise cyclization of GGPP 1 via (9S,10S)-CPP 5 (i.e. CPP of normal or (+)-stereochemistry). Protonation-initiated cyclization of GGPP 1 to CPP 5 is catalyzed by the class II active site of LAS- and Iso-type diTPSs. At the class I active site of Iso enzymes, ionization-dependent secondary cyclization of CPP 5 and deprotonation of the resulting sandaracopimaren-8-yl cation 6 lead to the formation of isopimaradiene 7. Alternatively, in the class I active site of LAS enzymes, additional proton transfer and methyl migration afford the tertiary abietenyl cation 8, and further deprotonation and hydroxylation (i.e. water capture) result in the formation of the unstable 13-hydroxy-8(14)-abietene 9/10 product of the LAS activity. Dehydration of this tertiary diterpene alcohol yields several different diterpene olefins, including abietadiene 11, palustradiene 12, levopimaradiene 13, and neoabietadiene 14. A different reaction sequence is proposed for the formation of the bicyclic diterpene alcohol cis-abienol 4 by the class I/II bifunctional AbCAS enzyme. The class II activity of AbCAS converts GGPP 1 to labda-13-en-8-ol diphosphate 3 via water capture at the C-8 carbon of the labda-13-en-8-yl+ cation 2. Subsequent ionization of the allylic diphosphate at the class I active site, without additional cyclization, yields cis-abienol 4.

EXPERIMENTAL PROCEDURES
Plant Material-Two-year-old saplings of Abies balsamea var. phanerolepis (L.) Mill. were purchased from Arbutus Grove Nursery Ltd. (North Saanich, Canada) and maintained in a greenhouse as described previously (25). Needles, xylem/wood, and phloem/bark were harvested from the upper interwhorls and used fresh or flash frozen in liquid N2 for metabolite and RNA extraction, respectively.
Diterpene Standards-Authentic diterpene resin acid standards were purchased from Orchid Cellmark (New Westminster, Canada). The corresponding diterpene olefins were synthesized from the acids at Best West Laboratories Inc. (Salt Lake City, UT) as described previously (20).
Metabolite Analysis-Terpenoids were extracted from samples of 2 g of pulverized tissue with 1.5 ml of diethyl ether for 16 h at room temperature. Water was removed by the addition of anhydrous Na2SO4, and extracts were passed through 0.22-µm GHP membrane filters (PALL Corp., De Miniac, Canada). GC-MS analysis was performed on an Agilent 6890N GC (Agilent Technologies Inc., Mississauga, Canada), 7683B series autosampler and a 5975 Inert XL MS Detector at 70 eV and 1 ml min⁻¹ helium as carrier gas using a SGE Solgel-Wax column (polyethylene glycol, 30 m, 250 µm inner diameter, 0.25-µm film) with the following GC temperature program: 40°C for 5 min, 3°C min⁻¹ to 80°C, 8°C min⁻¹ to 250°C, 10°C min⁻¹ to 270°C, hold 5 min, pulsed splitless injector held at 250°C. Diterpene resin acids were extracted from samples of ~150 mg of tissue with 1.5 ml of tert-butylmethylether following the method described by Lewinsohn et al. (26) and derivatized with 2 M trimethylsilyl diazomethane (Sigma) prior to GC-MS analysis using an Alltech AT-1000 column (polyethylene glycol acid-modified, 30 m, 250-µm inner diameter, 0.25-µm film) and GC specifications as follows: 150°C initial temperature, 1.5°C min⁻¹ to 220°C, 20°C min⁻¹ to 240°C. Compound identification was achieved by comparison of mass spectra with those of authentic standards and reference mass spectral databases of the National Institute of Standards and Technology MS library searches (Wiley W9N08). Quantifications were based on three independent biological replicates.
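As a rough planning aid, the oven run time implied by a GC temperature program like the one given for terpenoid analysis can be worked out by summing the hold times and the time spent in each ramp. The sketch below is illustrative only; the function name `gc_run_time` and its argument layout are assumptions introduced here, not part of the published method, and the calculation ignores equilibration and cool-down time.

```python
def gc_run_time(initial_temp, initial_hold, ramps, final_hold):
    """Return the oven-program run time in minutes.

    initial_temp -- starting oven temperature in degrees C
    initial_hold -- hold time at the starting temperature in minutes
    ramps        -- list of (rate_c_per_min, target_temp_c) tuples, applied in order
    final_hold   -- hold time at the last target temperature in minutes
    """
    total = initial_hold
    temp = initial_temp
    for rate, target in ramps:
        # Time in a linear ramp is the temperature difference divided by the rate.
        total += (target - temp) / rate
        temp = target
    return total + final_hold


# Terpenoid program from the text: 40 C for 5 min, 3 C/min to 80 C,
# 8 C/min to 250 C, 10 C/min to 270 C, hold 5 min.
terpenoid_minutes = gc_run_time(40, 5, [(3, 80), (8, 250), (10, 270)], 5)
print(round(terpenoid_minutes, 2))  # 46.58
```

Under these assumptions, each terpenoid injection occupies the oven for roughly 47 min.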
RNA Isolation and cDNA Synthesis-Total RNA was isolated from samples of 150 mg of bark tissue according to the method of Kolosova et al. (27), and mRNA was purified on Dynabeads (Invitrogen). RNA integrity and amounts were determined on a Bioanalyzer 2100 using an RNA Pico Chip (Agilent). Synthesis of cDNA as a template for PCR amplification of candidate genes was carried out with random hexamer oligonucleotides using the SMARTer cDNA rapid amplification of cDNA ends (RACE) amplification kit (Clontech, Mountain View, CA).
cDNA Library Construction and 454 Transcriptome Sequencing-Construction of a non-normalized bark tissue cDNA library and subsequent 454 transcriptome sequencing was conducted at the McGill University and Génome Québec Innovations Centre (Montreal, Canada). The cDNA library was constructed from 200 ng of fragmented mRNA using the cDNA Rapid Library Preparation kit, GS FLX Titanium series (Roche Applied Science) following the manufacturer's protocols. Yield of cDNA and the size range of cDNA fragments were assessed using a Bioanalyzer 2100 Pico Chip (Agilent), and 200 ng of the cDNA library were subjected to a half-plate reaction of 454 pyrosequencing using the Roche GS FLX Titanium technology.
De Novo Sequence Assembly-The 454 transcriptome sequencing generated 797,060 raw reads. After trimming of 5′ adapter sequences and exclusion of reads smaller than 45 bp, de novo assembly of the bark transcriptome was performed using the GS De Novo Assembler version 2.5p1 (Roche Applied Science). BLASTx search (28) of the resulting isotigs against a comprehensive in-house curated sequence database of plant diTPS sequences was used to identify a subset of candidate isotigs. Reads from these isotigs were reassembled in PHRAP (29) for closer investigation.
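The size-exclusion step described above (discarding reads shorter than 45 bp before assembly) can be illustrated with a minimal sketch. This is not the GS De Novo Assembler's own implementation; the helper names `parse_fasta` and `filter_short_reads` and the toy reads are hypothetical and serve only to show the filtering rule.

```python
def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_short_reads(records, min_length=45):
    """Keep reads of at least min_length bases; shorter reads are excluded."""
    return [(name, seq) for name, seq in records if len(seq) >= min_length]


# Toy input (hypothetical reads, not data from the study).
toy_fasta = ">read1\n" + "ACGT" * 12 + "\n>read2\n" + "ACGT" * 5 + "\n"
kept = filter_short_reads(parse_fasta(toy_fasta))
print([name for name, _ in kept])  # ['read1']
```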
Isolation of Full-length cDNAs-A full-length (FL) sequence of AbdiTPS4 was retrieved from the assembled transcriptome sequences. Completion of the 3′-sequences of AbdiTPS1 and -2 was achieved by 3′-RACE using the SMARTer cDNA RACE cDNA amplification kit (Clontech) and Phusion DNA polymerase (New England Biolabs, Pickering, Canada) with the following PCR program: initial denaturation at 98°C for 30 s, five cycles of denaturation at 98°C for 20 s and extension at 72°C for 90 s, five cycles of denaturation at 98°C for 20 s, annealing at 70°C for 20 s and extension at 72°C for 90 s, 27 cycles of denaturation at 98°C for 20 s, annealing at 65°C for 20 s and extension at 72°C for 90 s, and a final extension at 72°C for 3 min. The obtained amplicons were gel-purified and ligated into pJET using the CloneJET kit (Clontech). Pseudomature versions of the FL cDNAs (starting at (K/N)REFPP) were amplified (initial denaturation at 98°C for 30 s followed by 30 cycles of denaturation at 98°C for 10 s, annealing at 65-68°C for 30 s, and extension at 72°C for 80 s and final extension at 72°C for 10 min), cloned into pJET, and subcloned into the NheI/SalI and NotI restriction sites of pET28b(+) (EMD Biosciences, San Diego, CA). All cDNA constructs were sequence-verified at the Nucleic Acid Protein Service Unit (University of British Columbia, Vancouver, Canada) prior to expression in Escherichia coli. Oligonucleotides used in PCR procedures are listed in supplemental Table 1.
Functional Characterization of Diterpene Synthases-Recombinant proteins were expressed in E. coli BL21DE3-C41, Ni2+ affinity-purified as described elsewhere (16), and desalted into 20 mM HEPES (pH 7.2), 150 mM NaCl, 10% glycerol, 5 mM DTT using PD MiniTrap G-25 columns (GE Healthcare). Enzyme assays were carried out as described before (16) in 50 mM HEPES (pH 7.2), 10 µM MgCl2, 5% glycerol, 5 mM DTT, using 100 µg of purified protein (100 µg each for coupled assays) and 15 µM (E,E,E)-GGPP 1 (Sigma) with incubation for 1 h at 30°C. After extraction of reaction products with 500 µl of pentane, GC-MS analysis was conducted on an Agilent 7890A GC, 7683B series autosampler and a 7000A Triple Quad MS Detector at 70 eV and 1.2 ml min⁻¹ helium flow using an HP5ms column (5% phenyl methyl siloxane, 30 m, 250 µm inner diameter, 0.25-µm film) and the following GC temperature program: 40°C for 2 min, 20°C min⁻¹ to 300°C, hold 2 min, pulsed splitless injection (injector held at 250°C). For the detection of diphosphate intermediates, reaction products were dephosphorylated prior to extraction by incubation with 10 units of calf intestinal alkaline phosphatase (Invitrogen) for 16 h at 37°C and analyzed on a Solgel-Wax column as described above (see "Metabolite Analysis") with modified GC parameters: 40°C for 2 min, 25°C min⁻¹ to 250°C, hold 5 min, pulsed splitless injector held at 250°C. Analysis of reaction products via LC-MS was performed on an Agilent 1100 series LC/MSD Trap XCT Plus MS with atmospheric pressure chemical ionization (APCI) in positive mode on an Agilent Zorbax RX-Sil silica column (4.6 mm inner diameter × 150 mm × 5 µm) as reported previously (18).
Nuclear Magnetic Resonance (NMR) Analysis-cis-Abienol 4 was prepared from a pool of 10 individual enzyme assays as described above using 20 µM GGPP and an incubation time of 2 h to maximize product formation. To enhance product purity, pentane was purified on alumina prior to use, and assays were performed in buffer pre-extracted with alumina-purified pentane. NMR spectra were recorded on a Bruker Avance 600 spectrometer equipped with a QNP or TCI cryoprobe (600 MHz) using deuterochloroform (CDCl3) as the solvent, which was neutralized by distillation and filtration through potassium carbonate prior to preparation of the sample. Signal positions (δ) were calculated in parts per million (ppm) as compared with tetramethylsilane (δ 0) and were measured relative to the signal of the solvent (CDCl3: δ 7.26, 1H NMR; δ 77.0, 13C NMR).
Generation of Site-directed Protein Variants of AbdiTPS4-Alanine substitutions of Asp-402, Asp-404, or Asp-621 of AbdiTPS4 were generated following the QuikChange site-directed mutagenesis protocol (Stratagene, Mississauga, Canada) with the cDNA in pET28b(+) as template and the following PCR program: initial denaturation at 98°C for 90 s followed by 29 cycles of denaturation at 98°C for 50 s, annealing at 60°C for 30 s and extension at 72°C for 4 min, and final extension at 72°C for 10 min. Oligonucleotides used for mutagenesis are listed in supplemental Table 1.
Phylogenetic Analysis-Multiple protein sequence alignments were performed using the CLC bio Main Workbench 5.7.1 (CLC bio, Århus, Denmark). Phylogenetic analyses were conducted on the basis of the maximum likelihood algorithm using PhyML 3.0 (30) with four rate substitution categories, the LG substitution model, a BIONJ starting tree, and 100 bootstrap repetitions, and displayed as a phylogram using treeview32 1.6.6.
Computational Structure Analysis-Homology models of AbLAS, AbIso, and AbCAS were built using the CPHmodels 3.0 server (31) based on the tertiary structure of TbTXS (32) (PDB code 3p5pA, chain A) and certified as high quality, with more than 91% of residues assigned to most favored regions in Ramachandran plot statistics using PROCHECK (33). Lack of structural errors in the models was validated using the ProSA-web server (34). Pairwise comparison of these modeled structures with the DaliLite server (35) demonstrated a high structural similarity of AbLAS, AbIso, AbCAS, and TbTXS with root mean square deviations of ≤1 Å for the βγ domain and the α domain, respectively, thus allowing for a structural comparison of the active sites. Using Molegro Virtual Docker 2010.4.0.0 (36), a semiautomated docking approach was applied to place labda-13-en-8-ol diphosphate 3 in the class I and class II active site of AbCAS (Fig. 6). For this purpose, active site cavities were predicted using a probe radius of 1.0 Å and a grid resolution of 0.6 Å with further manual optimization. Substrate docking was then restricted to the predicted cavities, inclusive of the required Mg2+ complex. Energy-minimized Protein Data Bank coordinates of the ligands were generated using the PRODRG server (37).

RESULTS
Terpenoid Metabolite Profiling Confirms cis-Abienol as Major Diterpene in Bark Tissue of Balsam Fir-To identify a suitable tissue source for transcriptome mining of genes of cis-abienol 4 biosynthesis, we established diterpenoid metabolite profiles for wood and xylem, bark and phloem, and needles of balsam fir sapling trees (supplemental Table 2). As supporting information, we also measured mono- and sesquiterpenoids. Although only trace amounts of these terpenoids were detected in wood tissue (data not shown), bark/phloem and needle samples had similar total amounts of terpenoids, consisting mostly of diterpenoids, with lesser amounts of monoterpenoids and relatively minor amounts of sesquiterpenoids (supplemental Table 2). Both the bark/phloem and the needle samples contained the diterpene resin acids, abietic acid, isopimaric acid, dehydroabietic acid, and palustric acid, as major diterpenoids. These are all commonly abundant in conifers. cis-Abienol 4 was found almost exclusively in the bark/phloem tissue, where it was the most abundant terpenoid metabolite, accounting for more than 25% of the total diterpenoid content of balsam fir stem tissue (supplemental Table 2).
Discovery of diTPS Genes via Transcriptome Sequencing and de Novo Sequence Assembly-Based on the high abundance of cis-abienol 4 in bark/phloem of balsam fir sapling stems, we used this tissue source for the preparation of a non-normalized cDNA library and subsequent transcriptome sequencing. A half-plate reaction of Roche 454 sequencing generated a total of 797,060 sequence reads with an average GC content of 46% and an average read length of 359 bp (supplemental Table 3). After adapter trimming, the remaining high quality reads were subjected to a de novo assembly in GS De Novo Assembler 2.5p1 with a size exclusion of 45 bp. The assembly contained 85% of all input reads, yielding 14,699 isogroups from 17,122 isotigs with an average size of 1,114 nucleotides. For the discovery of TPS genes, these isotigs were searched against an in-house curated database of 146 known plant TPSs, resulting in a subset of isotigs that represented candidate genes for nine different putative mono- and sesqui-TPSs and five putative diTPSs (supplemental Table 4). Of the diTPSs, four candidate isotigs (AbdiTPS1, AbdiTPS2, AbdiTPS3, and AbdiTPS4) resembled bifunctional class I/II gymnosperm diTPSs of the TPS-d group (14,17,19) containing the characteristic DXDD, DDXXD, and NSE/DTE motifs (38-40). To optimize and validate the sequence assemblies, reads corresponding to these four isotigs were reassembled in PHRAP (29), from which three unique partial diTPS cDNA sequences (AbdiTPS1, AbdiTPS2, and AbdiTPS3) and one FL cDNA sequence (AbdiTPS4) were obtained. The encoded proteins of AbdiTPS1, AbdiTPS2, and AbdiTPS3 showed highest similarity with grand fir (Abies grandis) abietadiene synthase (AgAS) (11,12), with more than 90% protein sequence identity. In contrast, the deduced protein sequence AbdiTPS4 showed highest similarity with Norway spruce (Picea abies) isopimaradiene synthase (PaIso) (14), although at a substantially lower level of sequence identity of only 75%, highlighting AbdiTPS4 as a unique candidate diTPS sequence (supplemental Fig. 1).
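The motif-based screening mentioned above (DXDD, DDXXD, and NSE/DTE) can be sketched as a simple pattern scan over a deduced protein sequence. This is a minimal illustration, not the annotation procedure used in the study. The regular expression for the NSE/DTE motif assumes the commonly cited consensus (N,D)D(L,I,V)X(S,T)XXXE, and the function name `find_motifs` and the toy sequence are hypothetical.

```python
import re

# X in the motif names stands for any residue, written as "." in the patterns.
MOTIFS = {
    "DXDD (class II)": r"D.DD",
    "DDXXD (class I)": r"DD..D",
    # Assumed consensus for the NSE/DTE motif: (N,D)D(L,I,V)X(S,T)XXXE
    "NSE/DTE (class I)": r"[ND]D[LIV].[ST]...E",
}


def find_motifs(protein):
    """Return 1-based start positions of each motif in a protein sequence."""
    hits = {}
    for name, pattern in MOTIFS.items():
        # A lookahead reports overlapping occurrences as well.
        hits[name] = [m.start() + 1 for m in re.finditer(f"(?=({pattern}))", protein)]
    return hits


# Toy sequence (hypothetical, not an AbdiTPS sequence).
toy = "AAADIDDTAAAADDLYDAAAANDLASWVQEAAA"
for name, positions in find_motifs(toy).items():
    print(name, positions)
# DXDD (class II) [4]
# DDXXD (class I) [13]
# NSE/DTE (class I) [22]
```

A candidate carrying all three motifs is consistent with, but does not prove, a bifunctional class I/II diTPS; functional characterization remains necessary.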
Functional Characterization of Balsam Fir diTPSs and Discovery of Novel cis-Abienol Synthase-The FL cDNA clone for AbdiTPS4 was amplified based on the in silico assembled FL sequence. FL or nearly FL cDNAs for AbdiTPS1 and AbdiTPS2, respectively, were recovered by 3′-RACE, with AbdiTPS2 lacking ~40-50 bp of the plastidial transit peptide. Despite using a number of different primer combinations and PCR conditions, we could not amplify a cDNA clone corresponding to the AbdiTPS3 sequence assembly. For functional characterization of AbdiTPS1, AbdiTPS2, and AbdiTPS4, we cloned pseudomature versions of these diTPSs starting at the conserved (N/K)RX6W motif and lacking the putative plastidial transit peptide (16). Recombinant proteins were expressed in E. coli and Ni2+ affinity-purified, resulting in soluble proteins of the expected molecular mass of 90-91 kDa (supplemental Fig. 2). Enzyme assays with GGPP 1 as substrate followed by GC-MS or LC-MS analysis of reaction products against proper controls and, where available, authentic standards identified functions for AbdiTPS1, AbdiTPS2, and AbdiTPS4 as described below.
The product profile of AbdiTPS1 was identified by GC-MS to consist of four peaks corresponding to abietadiene 11, palustradiene 12, levopimaradiene 13, and neoabietadiene 14 in a ratio of ~4:3:2:1 (Fig. 2). This profile closely matches that previously identified for grand fir AgAS (12), a diTPS that is 99% identical with AbdiTPS1 on the amino acid level. According to these patterns, AbdiTPS1 can be classified as an LAS-type bifunctional class I/II diTPS, which appears to be orthologous with AgAS. Following the recent discovery of epimers of a tertiary alcohol, 13-hydroxy-8(14)-abietene 9/10, as the initial enzyme products of Norway spruce PaLAS (18), we also investigated the product of AbdiTPS1 by LC-MS, which confirmed the formation of 13-hydroxy-8(14)-abietene 9/10 (Fig. 3). The detected mass fragment of m/z 273 in the LC-MS analysis (Fig. 3) corresponds to the dehydration product of 13-hydroxy-8(14)-abietene 9/10 as detailed by Keeling et al. (18).
FIGURE 2. Activity of balsam fir diTPSs AbdiTPS1 (AbLAS), AbdiTPS2 (AbIso), and AbdiTPS4 (AbCAS). A, total ion chromatograms (TIC) of reaction products from in vitro assays with purified recombinant enzymes using GGPP 1 as a substrate. GC-MS analysis was performed on an Agilent HP5ms column with electronic ionization at 70 eV. Enzymatic activity assays were confirmed with three independent experiments. Peak IS, internal standard 1.6 µM 1-eicosene; peak a, palustradiene 12; peak b, levopimaradiene 13; peak c, abietadiene 11; peak d, neoabietadiene 14; peaks e and f, epimers of 13-hydroxy-8(14)-abietene 9/10; peak g, isopimaradiene 7; peak h, cis-abienol 4; the authentic standard of abietadiene 11 contained an unknown contamination; the authentic standard of levopimaradiene 13 contains other diterpene isomers, which could not be removed. B, mass spectrum of cis-abienol 4, produced by recombinant AbdiTPS4 with GGPP 1 as substrate and comparison with the reference mass spectrum of cis-abienol 4 as described by Vlad et al. (41) and obtained from the National Institute of Standards and Technology MS library searches (Wiley W9N08). Authentic cis-abienol 4 standard is not commercially available.
AbdiTPS4 showed a unique single-peak product profile (Figs. When the product of AbdiTPS4 was analyzed by GC-MS, under a number of different conditions, we observed poor resolution of the shape of the cis-abienol 4 peak (Fig. 2), relative to the peak shape of diterpene olefins, and the occurrence of trace amounts of two additional compounds. This profile is probably due to degradation of cis-abienol 4 during GC-MS as reported previously (42-44). However, LC-MS analysis confirmed cis-abienol 4 as a single product of AbdiTPS4, with m/z 273 in the LC-MS analysis corresponding to the predicted dehydration product of the diterpene alcohol (Fig. 3). Additional structural and stereochemical analysis by proton and carbon NMR and comparison with previously reported analyses (45, 46) confirmed the identity of the AbdiTPS4 product as cis-abienol 4.
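The assignment of m/z 273 to a dehydration product can be checked with a short mass calculation. The sketch below assumes the molecular formula C20H34O for the diterpene alcohol (cis-abienol 4, and likewise 13-hydroxy-8(14)-abietene 9/10) and assumes that the ion observed in positive-mode APCI is the protonated molecule after loss of water, [M + H - H2O]+. The helper name `mono_mass` is introduced here for illustration only.

```python
# Monoisotopic masses in unified atomic mass units.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}
PROTON = 1.00727647


def mono_mass(formula):
    """Monoisotopic mass for a formula given as {element: count}."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


diterpenol = {"C": 20, "H": 34, "O": 1}  # assumed formula, C20H34O
water = {"H": 2, "O": 1}

neutral = mono_mass(diterpenol)
# Assumed ion: protonated molecule minus one water, [M + H - H2O]+
dehydrated_ion = neutral - mono_mass(water) + PROTON

print(round(neutral, 2))         # 290.26
print(round(dehydrated_ion, 2))  # 273.26
```

Under these assumptions the calculated value agrees with the nominal m/z 273 reported for the LC-MS analysis.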
Based on functional characterization, AbdiTPS1, AbdiTPS2, and AbdiTPS4 will be referred to as AbLAS, AbIso, and AbCAS, respectively. AbCAS represents a new type of conifer diTPS, producing a bicyclic, tertiary diterpene alcohol, as opposed to tricyclic products of the LAS- and Iso-type diTPSs. With the present study, LAS- or Iso-type diTPSs have now been identified in three different genera of the pine family, namely in true firs (Abies), spruce (Picea), and pine (Pinus), which allows for analysis of gene orthology within this family (Fig. 4). It appears that gene duplications and neofunctionalization leading to paralogous pairs of LAS and Iso genes occurred independently in Abies and Picea, after the separation of these genera. Within the spruce genus, the dichotomy of LAS and Iso genes happened apparently prior to the speciation of Norway spruce and Sitka spruce.
The bifunctional class I/II AbCAS described here has only been cloned from balsam fir, matching the major diterpenoid produced by this species (supplemental Table 2). Within the group of conifer class I/II diTPSs, the AbCAS gene is separate from and appears basal to a clade of LAS and Iso genes from firs, spruces, and pine (Fig. 4).

Site-directed Mutagenesis of DXDD (Class II Active Site) and DDXXD Motifs (Class I Active Site) of AbCAS Demonstrates That Formation of the Tertiary Alcohol and Cyclization Occur during the Class II Reaction-A plausible mechanism for the formation of cis-abienol 4 requires hydroxylation via water quenching of a labda-13-en-8-yl carbocation 2 at C-8 (Fig. 1). In principle, hydroxylation can occur during reactions at the class II active site or at the class I active site. To delineate which of the two active sites of bifunctional AbCAS catalyzed the formation of the tertiary alcohol, we generated a set of alanine substitutions of the DXDD (class II active site) and DDXXD (class I active site) motifs to obtain monofunctional AbCAS variants that contain either a non-functional class II (AbCAS:D402A/D404A) or a non-functional class I (AbCAS:D621A) active site.
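The variant labels used here (for example AbCAS:D402A/D404A) follow the usual wild-type residue, position, replacement residue notation. As a hedged illustration of how such labels map onto a protein sequence, the sketch below parses a label and applies the substitutions in silico, checking that the stated wild-type residue is actually present. The function names and the short toy sequence are hypothetical; positions in the paper refer to the full-length AbCAS protein, which is not reproduced here.

```python
import re


def parse_variant(label):
    """Split a label such as 'AbCAS:D402A/D404A' into (enzyme, [substitutions])."""
    enzyme, _, substitutions = label.partition(":")
    return enzyme, substitutions.split("/")


def apply_substitutions(protein, substitutions):
    """Apply point substitutions written as e.g. 'D402A' (1-based positions)."""
    residues = list(protein)
    for sub in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
        if match is None:
            raise ValueError(f"cannot parse substitution {sub!r}")
        wild_type, position, replacement = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(f"expected {wild_type} at position {position}, "
                             f"found {residues[position - 1]}")
        residues[position - 1] = replacement
    return "".join(residues)


print(parse_variant("AbCAS:D402A/D404A"))  # ('AbCAS', ['D402A', 'D404A'])

# Toy sequence containing a DXDD-like stretch (hypothetical numbering).
toy = "MADIDDT"
print(apply_substitutions(toy, ["D3A", "D5A"]))  # MAAIADT
```

The wild-type check guards against numbering mismatches, which are easy to introduce when working with pseudomature rather than full-length constructs.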
AbCAS:D621A converted GGPP 1 (peak i in Fig. 5) into trace amounts of CPP 5 (peak j) and a product with a significantly longer retention time, indicative of a hydroxylated diphosphate (peak k). This compound was identified, upon cleavage of the diphosphate group, as labda-13-en-8,15-diol based on characteristic mass fragments (m/z 290 [M+], m/z 275, m/z 257, m/z 192, and m/z 177) as compared with the mass spectrum of the authentic compound (23). epi-Manoyl oxide and manoyl oxide (peaks l and m in Fig. 5) were apparently not products of AbCAS but rather resulted from ether formation of the dephosphorylated labda-13-en-8-ol diphosphate 3 under GC-MS conditions because they were not seen in the wild-type enzyme product profile and did not occur under different GC conditions. Results from enzyme assays with AbCAS:D621A suggested that the class II active site of AbCAS catalyzes protonation-initiated formation of bicyclic labda-13-en-8-ol diphosphate 3 via water quenching of the parental carbocation.
Alanine substitution of Asp-402 and Asp-404 (AbCAS:D402A/D404A) in the class II active site resulted in complete loss of enzymatic activity with GGPP 1 as substrate. Activity of AbCAS:D402A/D404A could not be restored by combining this mutant with a protein variant of PaLAS (PaLAS:D611A), which contains an inactive class I active site and accumulates (9S,10S)-CPP 5 (peak j in Fig. 5). This result suggested that, unlike the LAS- and Iso-type enzymes, the class I active site of AbCAS is not active with (9S,10S)-CPP 5 as an intermediate substrate.
Formation of cis-abienol 4 (peak h in Fig. 5) from GGPP 1 was restored in coupled assays of the two monofunctional AbCAS class I and class II protein variants, AbCAS:D402A/D404A and AbCAS:D621A, confirming that the non-mutated active sites remained functional in the two mutated proteins.
Together, the mutational analysis of class I and class II active sites of AbCAS suggested a reaction sequence of the bifunctional class I/II enzyme in which both the bicyclization and water capture occur in the class II active site, resulting in an intermediary labda-13-en-8-ol diphosphate 3, which undergoes cleavage of the diphosphate group and final deprotonation at the class I active site (Fig. 1).

DISCUSSION
Variations on a common theme of diTPS-catalyzed cycloisomerization of GGPP 1 contribute substantially to the chemical diversity of diterpene metabolites found in the oleoresin of conifers and in nature in general. Introduction of hydroxy functions further increases the diversity of naturally occurring diTPS products. Hydroxylation of conifer diterpenes can result from activity of P450s acting on products of diTPSs (20,22) or can result from capture of water by carbocation intermediates during the diTPS reaction (18). For the fragrance industry, bicyclic hydroxylated diterpenes, such as cis-abienol 4 and sclareol, are of particular value as plant-derived precursors for the sustainable production of Ambrox, which replaces the controversial use of animal-derived ambergris in perfume formulations.
Surprisingly, relatively little is known about diTPSs catalyzing cyclohydration reactions. Reported examples are the biosynthesis of copal-8-ol diphosphate by a monofunctional class II diTPS from Cistus creticus (23), the formation of ent-16α-hydroxy-kaurene as a product of the bifunctional class I/II ent-copalyl diphosphate/ent-kaurene synthase from the non-vascular plants Physcomitrella patens and Jungermannia subulata (47,48), and the biosynthesis of labda-7,13-dien-15-ol catalyzed by an ent-copalyl diphosphate/ent-kaurene synthase-like diTPS from the lycophyte Selaginella moellendorffii (24). In addition, a recent study on the product specificity of Norway spruce PaLAS (18) demonstrated a tricyclic, tertiary diterpene alcohol as the initial, but highly unstable, product of this diTPS. In the case of PaLAS, the allylic diterpenol product dehydrates to a set of diterpene olefins. Against this background of prior knowledge, the identification of bifunctional class I/II AbCAS adds a unique new catalyst to the known portfolio of diTPSs, with possible applications in the bioproducts industry. The genomics-based discovery of AbCAS, alongside AbLAS and AbIso, also substantially contributes to our understanding of the molecular, biochemical, and evolutionary underpinnings of conifer diterpenoid diversity.
The present study highlights the power of combined metabolite profiling, tissue-specific deep transcriptome sequencing, and functional (i.e. biochemical) genomics for the successful discovery and characterization of new enzymes of natural product biosynthesis. The general framework for such an approach has also recently been applied for the discovery of a suite of new TPS genes in, for example, tomato (49) and Euphorbia fischeriana (50). Of fundamental importance for the success of a genomics- or transcriptomics-based approach to natural product enzyme discovery is the traditional or new knowledge of the occurrence of specialized metabolites associated with particular plant species or particular tissues. Balsam fir was ideally suited for the discovery of CAS because this species accumulates large amounts of cis-abienol 4 in the oleoresin of bark/phloem tissue (supplemental Table 2). Xylem/wood or needles of balsam fir do not contain cis-abienol 4 in relevant amounts and therefore were not included in the transcriptome sequencing for CAS discovery.
Beyond the selection of species and tissues via metabolite profiling prior to 454 sequencing, another key element in the successful discovery strategy of AbCAS was a directed BLASTx search of the de novo transcriptome assembly against a comprehensive sequence database of characterized TPSs. This strategy allowed for the curator-optimized annotation of TPS genes, which substantially streamlined the efforts of functional gene characterization toward four candidate AbdiTPSs. Of these four candidates, the corresponding enzymes of AbdiTPS1 (AbLAS), AbdiTPS2 (AbIso), and AbdiTPS4 (AbCAS) were successfully characterized.
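As an illustration only, the directed search described above could be post-processed along the following lines. This is a minimal sketch, not the pipeline used in this study: it assumes the BLASTx results were written in the standard 12-column BLAST+ tabular format, and the function name `best_tps_hits` and both cutoffs (`max_evalue`, `min_length`) are hypothetical choices made for the example.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_tps_hits(tabular_text, max_evalue=1e-10, min_length=100):
    """Return {contig_id: best_hit} for hits that pass both cutoffs.

    The cutoffs are illustrative assumptions, not values from the study.
    The best hit per contig is the one with the highest bit score.
    """
    best = {}
    reader = csv.DictReader(io.StringIO(tabular_text),
                            fieldnames=FIELDS, delimiter="\t")
    for row in reader:
        evalue = float(row["evalue"])
        length = int(row["length"])
        bitscore = float(row["bitscore"])
        # Discard weak or very short alignments.
        if evalue > max_evalue or length < min_length:
            continue
        current = best.get(row["qseqid"])
        if current is None or bitscore > current["bitscore"]:
            best[row["qseqid"]] = {
                "subject": row["sseqid"],
                "pident": float(row["pident"]),
                "length": length,
                "evalue": evalue,
                "bitscore": bitscore,
            }
    return best
```

Keeping only the top-scoring characterized TPS per contig gives a short candidate list for manual curation. Any such list is a starting point for annotation, not a functional assignment.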
Based on the relatively high protein sequence identity of more than 90% with AgAS, a similar catalytic function was predicted for AbdiTPS1 and AbdiTPS2, whereas the lower sequence identity of AbdiTPS4 (75%) was indicative of a distinct function. This general prediction was confirmed by functional characterization, which remains a necessary part of TPS gene annotation. The functional characterization of AbdiTPS1 (AbLAS) with 13-hydroxy-8(14)-abietene 9/10 as the initial reaction product and multiple olefin compounds occurring upon dehydration of the alcohol was important to substantiate a recent discovery of the formation of a tricyclic tertiary C-13 alcohol as the primary product of Norway spruce PaLAS (18). Likewise, the discovery of AbdiTPS2 (AbIso) was important because it was the first Iso-type diTPS to be characterized outside of the spruce genus (Picea), with phylogenetic patterns suggesting that functional divergence of LAS- and Iso-type diTPS occurred independently in the spruce and fir (Abies) lineages (Fig. 4). In agreement with predictions based on sequence divergence, AbdiTPS4 (AbCAS) represents a novel diTPS function. A bifunctional cis-abienol synthase has not been previously reported for any plant species. Together, the set of three types of bifunctional diTPSs, AbIso, AbLAS, and AbCAS, account for the majority of diterpene structures found in the specialized diterpene metabolism of balsam fir bark tissue.
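For readers who want to reproduce identity comparisons of this kind, the following is a minimal sketch of a percent-identity calculation over a pairwise alignment. It assumes the two sequences have already been aligned by an external tool and that identity is counted only over columns where neither sequence has a gap. Other conventions, such as dividing by the full alignment length, give different numbers, and the text does not state which one was used. The function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither sequence has a gap.

    Both inputs must be rows of the same pairwise alignment, so they
    must have equal length. Comparison is case-insensitive.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # gapped columns are excluded from the denominator
        compared += 1
        if a.upper() == b.upper():
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared
```

A value computed this way only ranks candidates by similarity to a characterized enzyme. As noted above, functional characterization remains necessary to assign function.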
In the absence of pure cis- or trans-abienol standards, stereochemistry of the product of AbdiTPS4 was assigned based on comparison of retention times, under several different GC and LC conditions, with those of the cis-abienol metabolite in the resin mixture of balsam fir. Previous studies demonstrated that abienol from balsam fir exhibits the cis-configuration (51). Our proton and carbon NMR analysis further corroborated the stereochemistry of the reaction product of AbdiTPS4 as cis-abienol.
Despite extensive efforts of TPS gene discovery in species of spruce (14,17) and in grand fir (52), there is no known gene in any plant species that is closely related (>70% protein identity) to AbCAS. It is therefore possible that a CAS-type bifunctional class I/II diTPS is unique to balsam fir, which is the first conifer species for which three types of functionally distinct diTPSs are now known. The phylogenetic position of AbCAS between Ginkgo biloba GbLS and other known conifer diTPSs of specialized metabolism in spruce, firs, and pine (Fig. 4) may suggest an evolutionary origin of AbCAS by gene duplication prior to speciation within the pine family, with its possible loss in some or all of the other species of this comparison.
Bifunctional class I/II diTPSs of conifer specialized metabolism are members of the TPS-d group (14,19). Within this group, new genes and enzyme functions appear to have evolved from a common ancestor by repeated gene duplication. Neo- or sub-functionalization of members of this gene family involved mutation of specific active site residues (16,53-55). The mutational analysis of AbCAS and comparative structural analysis of AbCAS with AbIso and AbLAS identified distinct catalytic mechanisms and revealed unique residues in the AbCAS class I and class II active sites that may determine product specificity. The initial bicyclization of GGPP 1, resulting in a proposed labda-13-en-8-yl+ intermediate 2 of (9S,10S) stereochemistry, is probably a common feature of the class II reactions of CAS-, LAS-, and Iso-type diTPSs (Fig. 1). Previous work showed that naturally occurring cis-abienol 4 is optically pure and represents the (9S,10S) configuration (51). In a unique sequence of the class II reaction, AbCAS may then promote water capture at the C-8 carbon of the carbocation to form labda-13-en-8-ol diphosphate 3, whereas the LAS- and Iso-type diTPSs form (9S,10S)-CPP 5 through deprotonation at the C-8 position (Fig. 1). Structural modeling and molecular docking of labda-13-en-8-ol diphosphate 3 in the class I and class II active sites of AbCAS (Fig. 6) revealed only one unique amino acid in the class II active site, namely Asp-348, in proximity to the docked intermediate that may control this particular hydroxylation reaction. Located at the posterior of the active site opposite the DIDD motif and Trp-358, which have previously been reported to contribute to the AgAS-catalyzed class II reaction (39), the negatively charged side chain of Asp-348 may stabilize the positive charge at C-8 for water quenching to occur in the formation of labda-13-en-8-ol diphosphate 3 (Fig. 6B). This local negative charge is not present in any LAS- or Iso-type enzymes, which contain a conserved histidine in this position (supplemental Fig. 1).
FIGURE 6. Unique active site residues of AbCAS relative to AbLAS and AbIso and implication for the formation of cis-abienol. Structural models were built based on the tertiary structure of T. brevifolia taxadiene synthase (32) (Protein Data Bank code 3p5p, chain A). A, superimposed homology models of AbCAS, AbLAS, and AbIso resembling the common α-helical folding pattern, comprised of the N-terminal β domain (cyan) and γ domain (orange) and C-terminal α domain (green). The position of labda-13-en-8-ol diphosphate 3 (yellow) was estimated in the active sites by molecular docking using Molegro Virtual Docker 2010.4.0.0 (36), with the diphosphate group bound to the Mg2+ cluster (pink), which in turn is coordinated by the DDXXD and NDXXTXXXE motifs (red). B, superimposition of putative catalytic residues in the class II active site of AbLAS (blue), AbIso (green), and AbCAS (light brown). Asp-348 is located at the posterior of the class II active site, opposite the DIDD motif, with its side chain protruding toward C-8 of labda-13-en-8-ol diphosphate 3. C, Leu-617 and Phe-696 are located in the class I active site cavity upstream of the DDLYD motif, creating an expansion of the hydrophobic pocket relative to AbLAS and AbIso. D, Gly-723 and Val-724 contribute to a change in the hinge region between helix G1 and G2, which has previously been shown to be critical for the plasticity of conifer diTPSs (16,51,54).
Mutational analysis and complementation assays suggest that the class I active site of AbCAS is selective for labda-13-en-8-ol diphosphate 3 as a substrate but does not convert (9S,10S)-CPP 5, which is the class I active site substrate of LAS- and Iso-type diTPSs (Fig. 5). Unlike LAS- and Iso-enzymes, the class I active site of AbCAS catalyzes the ionization of the diphosphate group without cyclization of a C-ring (Fig. 1). Several unique residues were found in a radius of 7 Å around the hydroxy group of labda-13-en-8-ol diphosphate 3 docked within the class I active site of AbCAS that may control the AbCAS reaction (Fig. 6, C and D). Among these, Leu-617, Phe-696, and Gly-723 are of particular interest because they contribute to the contour of the active site cavity. Leu-617 and Phe-696 are located in the class I active site cavity upstream of the DDXXD motif on helix D, creating an expansion of the hydrophobic pocket relative to AbLAS and AbIso. Gly-723 and Val-724 account for a change in the hinge region between helix G1 and G2. These residues may contribute to the release of a bicyclic product rather than facilitating a secondary cyclization. Interestingly, residues corresponding to Leu-617, Phe-696, and Gly-723 have previously been shown to be critical for the catalytic plasticity of conifer diTPSs (16,53-56) and thus appear to represent key positions for the functional evolution of these enzymes.
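The 7 Å neighborhood search around the docked hydroxy group can be illustrated with a short distance filter. This is a sketch under stated assumptions, not the procedure used with the docking software: it assumes atom coordinates (in Å) have already been extracted from a structural model as `(residue_label, (x, y, z))` pairs, and the function name `residues_within` is hypothetical.

```python
import math


def residues_within(atoms, center, radius=7.0):
    """Return sorted labels of residues with any atom within `radius` of `center`.

    atoms  : iterable of (residue_label, (x, y, z)) pairs, coordinates in Å
    center : (x, y, z) of the reference atom, e.g. the ligand hydroxy oxygen
    """
    near = set()
    for label, xyz in atoms:
        # A residue counts as soon as one of its atoms lies inside the sphere.
        if math.dist(xyz, center) <= radius:
            near.add(label)
    return sorted(near)
```

Running the same filter on each homology model and comparing the resulting residue lists at aligned positions is one simple way to flag residues that differ between enzymes near the docked intermediate.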
Plant TPSs are useful enzymes for the metabolic engineering of bioproducts and biofuels in yeast and E. coli (4,57). Metabolic engineering of microbial hosts or plants to produce oxygenated diterpenoids, such as cis-abienol 4 or sclareol, may provide a sustainable production platform for these compounds for the fragrance industry and other applications. The patent literature (58,59) suggests that genes for cis-abienol 4 and sclareol biosynthesis have been cloned from angiosperms. However, unlike the bifunctional conifer class I/II diTPSs, the known angiosperm diTPSs are exclusively monofunctional enzymes (19). Thus, using the angiosperm enzymes for pathway engineering would require the dual expression optimization of separate class I and class II enzymes. In contrast, the use of a bifunctional class I/II diTPS, such as AbCAS, would require only the expression of a single gene. Optimization of protein expression is also likely to be less complicated for the bifunctional class I/II diTPS (i.e. AbCAS), because the two active sites required for the conversion of GGPP 1 to cis-abienol 4 will be present in equal amounts as part of the same protein and will be in close physical proximity to each other. In essence, for applications of metabolic engineering, the bifunctional class I/II AbCAS offers a scaffolded arrangement of the two essential active sites. This scaffold evolved in nature for the high volume production of cis-abienol 4 in the oleoresin of balsam fir. Because AbCAS is only very distantly related to angiosperm diTPSs (less than 30% protein sequence identity), this enzyme would also be a preferred choice for metabolic engineering of cis-abienol 4 production in industrial crops, such as tobacco, because co-suppression effects on endogenous diTPSs are unlikely.