Biochemical Conservation and Evolution of Germacrene A Oxidase in Asteraceae

Sesquiterpene lactones are characteristic natural products in Asteraceae, which constitutes ∼8% of all plant species. Despite their physiological and pharmaceutical importance, the biochemistry and evolution of sesquiterpene lactones remain unexplored. Here we show that germacrene A oxidase (GAO), evolutionarily conserved in all major subfamilies of Asteraceae, catalyzes three consecutive oxidations of germacrene A to yield germacrene A acid. Furthermore, it is also capable of oxidizing non-natural substrate amorphadiene. Co-expression of lettuce GAO with germacrene synthase in engineered yeast synthesized aberrant products, costic acids and ilicic acid, in an acidic condition. However, cultivation in a neutral condition allowed the de novo synthesis of a single novel compound that was identified as germacrene A acid by gas and liquid chromatography and NMR analyses. To trace the evolutionary lineage of GAO in Asteraceae, homologous genes were further isolated from the representative species of three major subfamilies of Asteraceae (sunflower, chicory, and costus from Asteroideae, Cichorioideae, and Carduoideae, respectively) and also from the phylogenetically basal species, Barnadesia spinosa, from Barnadesioideae. The recombinant GAOs from these genes clearly showed germacrene A oxidase activities, suggesting that GAO activity is widely conserved in Asteraceae including the basal lineage. All GAOs could catalyze the three-step oxidation of non-natural substrate amorphadiene to artemisinic acid, whereas amorphadiene oxidase diverged from GAO displayed negligible activity for germacrene A oxidation. The observed amorphadiene oxidase activity in GAOs suggests that the catalytic plasticity is embedded in ancestral GAO enzymes that may contribute to the chemical and catalytic diversity in nature.

Terpenoids, derived from isopentenyl diphosphate, are structurally the most diverse class of natural products with many known biological functions and commercial applications (1). One subclass of terpenoids is sesquiterpene lactone (STL), characterized by its α-methylene γ-lactone moiety on the 15-carbon core backbone (2). Although STLs are found in several plant families including Cupressaceae (3), liverwort (4,5), and even fungus (6,7), their occurrence in nature is by far the most frequent among Asteraceae (or Compositae) plants, the second largest plant family, after Orchidaceae, composed of 23,000 plant species (8,9). Despite the structural complexities of STLs, the basic backbones of all STLs are constrained to several core skeletons, such as germacranolide, eudesmanolide, guaianolide, and helenanolide (2). This may suggest that STL biosynthesis is governed by a limited number of biochemical rules.
Although structural and bioactivity studies of STLs have been extensively performed, the biosynthetic route of STLs remains poorly understood at the molecular level. The proposed biosynthetic route of the simplest STL, costunolide, is depicted in Fig. 1 (19,20). Farnesyl diphosphate is cyclized to C15 hydrocarbon germacrene A by germacrene A synthase (GAS), and subsequently its C12 methyl group undergoes a three-step oxidation reaction to yield the germacrene A acid (germacra-1(10),4,11(13)-trien-12-oic acid). An additional hydroxylation at the C6 position of germacrene A acid facilitates nonenzymatic lactonization of C6 hydroxyl and C12 carboxylic group, yielding costunolide. The costunolide in turn serves as a framework of guaianolides, eudesmanolides, germacranolides, and other STLs by unknown mechanisms. Last, elaborate chemical decorations of STL scaffolds are carried out by P450s and several other modifying enzymes to produce more complex and biologically active STL end products.
A number of GAS genes have been isolated and characterized in Asteraceae (21-24); however, the biochemical mechanism of germacrene A C12 oxidation and other downstream modifications have not been fully understood. Based on cell-free assays in chicory, it has been proposed that three distinct enzymes, germacrene A C12 hydroxylase (P450), germacrene alcohol dehydrogenase, and germacrene aldehyde dehydrogenase, are involved in the biosynthesis of germacrene A acid in chicory (19). The P450-mediated biosynthesis of costunolide from germacrene A acid has also been demonstrated in chicory extract (20). Nonetheless, the corresponding genes responsible for these reactions have not been identified to date. Recently, the scenario of three independent enzymes for germacrene A acid synthesis has been challenged by the discovery of the multifunctional amorphadiene oxidase (AMO or CYP71AV1), which can catalyze three-step oxidations of amorphadiene to yield artemisinic acid in the Asteraceae plant Artemisia annua (Fig. 1) (25,26).
From the perspective of biochemical evolution, the artemisinic acid biosynthesis is a specific evolutionary event that only occurred in a single modern species, A. annua. It can, therefore, be theorized that the biochemistry of artemisinic acid diverged from the more general germacrene A acid biosynthetic pathway. Comparative analysis of artemisinic acid and germacrene A acid biosynthesis would be an attractive model to understand the adaptive evolution of enzymes in terpenoid metabolism.
Germacrene A acid is certainly a necessary chemical for in-depth investigation of STL metabolism in Asteraceae. However, this compound is difficult to obtain because germacrene A acid is a low-abundance, transient intermediate in the STL biosynthetic pathway (27), and the chemical synthesis of terpenoids is difficult to achieve. One report showed that a minute amount (2 mg) of germacrene A acid could be purified from 300 g of costus (Saussurea lappa) (27). Thus, microbial de novo production of germacrene A acid using cDNA clones can be a convenient alternative to acquire germacrene A acid.
In this report germacrene A oxidase (GAO) isolated from lettuce (Lactuca sativa) was expressed in an engineered yeast to synthesize germacrene A acid de novo. Furthermore, GAO cDNAs were isolated from several other Asteraceae plants representing major subfamilies, including the phylogenetically basal species Barnadesia spinosa. Resulting biochemical data provided evidence that GAO activity is highly conserved in Asteraceae. In addition, the cross-reactivity of GAOs toward amorphadiene highlighted the evolutionary significance of the catalytic plasticity encoded in GAOs.

EXPERIMENTAL PROCEDURES
Plasmid Construct for Gene Expression-Sequence information at the start and stop codons of LsGAO was obtained from the Compositae Genome Project Database at the University of California Davis. LsGAO was amplified from the cDNA pool prepared from lettuce leaf by a forward primer, 5′-CGAGGTCTAGAATGGAGCTTTCAATAACCACC-3′, and a reverse primer, 5′-GCCCTCTAGAGCAAAACTCGGTACGAGTAACAAC-3′. The amplified product was digested by XbaI and ligated into the SpeI site of pESC-Ura plasmid. A. annua cytochrome P450 reductase (CPR) in a pESC-Ura plasmid (25) was digested by BamHI and NheI, and the digested fragment was ligated to the corresponding sites in LsGAO::pESC-Ura, resulting in a dual expression plasmid, LsGAO/CPR::pESC-Ura. This plasmid and the previously generated GAS::pESC-Leu plasmid (24) were co-transformed to the EPY300 strain (25,28). For chemical purification purposes, plasmid stability was enhanced by coding three genes in a single plasmid. This was achieved by amplifying the expression cassette of GAS from the GAS::pESC-Leu plasmid by a forward primer, 5′-GTCAATCACTACGTGAGTACGGATTAGAAGCCGCCGA-3′, and a reverse primer, 5′-GTCAATGCCGGCCTTCGAGCGTCCCAAAACCT-3′. The amplified product was digested by DraIII and NaeI, and the digested fragment was ligated into the corresponding sites of an empty pESC-Leu2d. This DNA manipulation freed two multiple cloning sites for further cloning. Two expression cassettes for LsGAO and CPR were digested from the LsGAO/CPR::pESC-Ura by PacI and ScaI, and the digested fragment was ligated to the corresponding sites of the newly generated GAS::pESC-Leu2d, resulting in the triple expression plasmid named GAS/LsGAO/CPR::pESC-Leu2d. Bioinformatic analyses identified start and stop codons of chicory and sunflower, and their open reading frame sequences were used to design appropriate primers. The Barnadesia clone ordered from the Arizona Genomics Institute at the University of Arizona encoded a full-length cDNA (clone CCHS24399).
For costus SlGAO clone isolation, a 1.4-kb fragment of SlGAO was first obtained from costus cDNA using primers designed at the highly conserved domains of other GAOs. The primer pair used was a forward primer, 5′-ACCGTGGCTCAAAGCTCTCAGTC-3′, and a reverse primer, 5′-GACTCCCCATAATCGGTCACATGC-3′. Both 5′- and 3′-rapid amplification of cDNA ends were conducted to determine start and stop codons of SlGAO followed by the recovery of a full-length cDNA. All the isolated GAOs were first cloned to pESC-Leu vector to make translational fusions to the FLAG epitope. For subcloning purposes, HaGAO was amplified using a forward primer, 5′-GCACTAGTATGGAAGTCTCCCTCACCACTTC-3′, and a reverse primer, 5′-CGATACTAGTGCAAAACTTGGTACAAGCATCAA-3′. SlGAO was amplified using a forward primer, 5′-TAATCTAGAATGGAACTCTCCTTCACCACTTCCATTGC-3′, and a reverse primer, 5′-TATTCTAGACGAAAACTAGGTACCAGTACCAAATGAGTC-3′. CiGAO was amplified using a forward primer, 5′-ACGTCTAGAATGGAGCTCTCACTCACTACTTCCA-3′, and a reverse primer, 5′-ACGTCTAGAGCAAAACTTGGTACGAGTATCAATTCGGT-3′. BsGAO was amplified using a forward primer, 5′-ATATCTAGAACCATGGAACTCACTCTCACCACTTCCC-3′, and a reverse primer, 5′-ATACTAGTCGAGCAGAGTTGTTAGCAGTCTTGTAAGCTG-3′. The amplified fragments were digested by XbaI or SpeI and cloned into the SpeI site of pESC-Leu vector. The entire open reading frames in fusion with the FLAG epitope were digested by NotI and PacI, and the LsGAO in the triple expression vector was removed and replaced with the NotI- and PacI-digested GAO open reading frames.
Yeast Culture and Metabolite Sample Preparation-For standard yeast culture, the transgenic yeast strain of interest was inoculated in 3 ml of Synthetic Complete (SC) medium omitting the appropriate amino acids with 2% Glc. The inocula were cultured overnight at 30°C and 200 rpm. The start culture was diluted 100-fold in SC medium omitting the appropriate amino acids with 1.8% Gal and 0.2% Glc. 100 ml of medium was cultured for metabolite profiling, whereas 1.0-2.5 liters of medium was cultured for chemical purification and identification. Rearranged sesquiterpenoids (costic acids) were extracted and analyzed by GC-MS according to the published methods (28). For buffered neutral culture, culture medium was adjusted to have 100 mM HEPES/NaOH (pH 7.5). After cultivating yeast for 48-72 h at 30°C, the culture medium was adjusted to pH 6.0 with 2 M HCl, and the medium was extracted with ethyl acetate twice. The ethyl acetate fractions were evaporated in N2 gas to concentrate samples to 1 ml, and 1 μl was analyzed by GC-MS. For LC-MS analyses, the solvent was replaced with methanol.
Microsome Preparation, Immunoblot Analysis, and in Vitro Enzyme Assay-Microsomes were prepared according to the published protocol (29) except that a micro-beadbeater (Biospec Products, Bartlesville, OK) was used for 90 s with glass beads (500-μm diameter). For immunoblot analysis, microsomal proteins were separated on a 10% SDS-PAGE and transferred onto a polyvinylidene fluoride membrane. The membrane was blocked by 5% nonfat milk in TBST buffer (25 mM Tris-HCl, pH 7.5, 150 mM NaCl, and 0.05% Tween 20) for at least 1 h, incubated with anti-FLAG M2 antibodies (Sigma) in 1:5000 dilution, washed 3 times with TBST, and incubated with goat anti-mouse secondary antibody (GE Healthcare) in a 1:5000 dilution. After washing the membrane three times with TBST, the bound secondary antibodies were detected with ECL Plus detection reagents (GE Healthcare). For in vitro enzyme assay, the protease-deficient Saccharomyces cerevisiae YPL 154C:Pep4 KO strain was transformed with LsGAO/CPR::pESC-Leu2d and CPR::pESC-Leu2d. After 15 h of cultivation in Glc, the yeast culture was shifted to fresh medium with 2% Gal, and the yeasts were further cultivated for 24 h. The in vitro enzyme reactions were carried out in 3 ml of 50 mM HEPES/NaOH (pH 7.5) buffer containing 3 mg of microsomal protein, 200 μM germacrene A, 500 μM NADPH, and an NADPH regeneration system (10 mM Glc 6-phosphate and 3 units of Glc-6-phosphate dehydrogenase). The reaction occurred at 23°C for 4 h with gentle agitation. The reaction product was acidified with 2 M HCl to pH 6.0 and extracted with ethyl acetate. The extract solvent was then replaced with methanol for LC-MS analysis.
GC-MS Analysis of Sesquiterpenoids-Products obtained by the expression of the GAS/LsGAO/CPR::pESC-Leu2d plasmid were analyzed on an Agilent 6890N gas chromatography system coupled to an Agilent 5975B mass spectrometer. 1-μl samples were injected at a port temperature of 150°C and analyzed on a DB5-MS column (30 m × 250-μm inner diameter × 0.25-μm film thickness). Helium was used as carrier gas with a constant flow rate of 1 ml min⁻¹. The temperature program was set to 40°C for 1 min followed by a linear gradient of 10°C min⁻¹ to 250°C. For separating α- and β-costic acids, 2-μl samples were analyzed on an Agilent chiral Cyclodex-B column (30 m × 250-μm inner diameter × 0.25-μm film thickness). The temperature program used was 1 min hold at 40°C, 10°C min⁻¹ increments from 40°C to 100°C, and 3°C min⁻¹ increments from 100 to 245°C. Retention indices for methyl esters of α-, β-, and γ-costic acid and ilicic acid were 1807, 1805, 1788, and 1966, respectively, in the DB-5 column and 1914, 1920, 1889, and 2208, respectively, in the Cyclodex-B column. Retention index values for native forms of α-, β-, and γ-costic acid and ilicic acid were 1873, 1870, 1852, and 2103, respectively, in the DB-5 column. The Cope-rearranged product, elemenic acid, showed a retention index of 1762, and electron impact MS relative ion intensity as follows: M⁺ 234 (1)

NMR Analyses-For costic acids and ilicic acid, NMR spectra were recorded in 3-mm standard NMR tubes on a Varian Unity Inova 500 MHz spectrometer equipped with a 3-mm indirect detection pulsed field gradient probe. The ¹H and ¹³C NMR chemical shifts were referenced to solvent signals at δH/C 7.14/127.68 (C6D6) relative to tetramethylsilane. One- and two-dimensional homonuclear NMR spectra were measured with standard Varian pulse sequences, and the experiments performed included gCOSY, TOCSY, ROESY, gHSQCAD, and gHMBCAD.
Adiabatic broadband and band-selective GHSQCAD and GHMBCAD spectra were recorded using CHEMPACK 4.0 pulse sequences (implemented in Varian Vnmrj 2.1B spectrometer software). For germacrene A acid, ¹H and ¹³C NMR spectra were acquired in 5-mm standard NMR tubes at 400.13 and 100.6 MHz on a Bruker AVANCE 400 Spectrometer equipped with a 5-mm inverse probe with triple axis gradients. Chemical shifts (δ) were referenced to internal tetramethylsilane for both ¹³C and ¹H. Spectra were recorded with standard Bruker pulse sequences under Xwinnmr. Experiments performed included one-dimensional proton, one-dimensional ¹³C with proton decoupling, ¹³C attached proton test with proton decoupling, COSY with double quantum filter, TOCSY with 60-ms mixing time, heteronuclear multiple quantum coherence, and heteronuclear multiple bond coherence.
Isolation of Ilicic Acid (4)-The EtOAc extract was concentrated to 100 μl in a vacuum concentrator and separated by TLC on silica gel 60 F254 (Merck) with hexane/EtOAc (1:4 v/v) as the solvent. UV 254 nm identified an absorbent band with an RF value of 0.08 that was eluted from silica with EtOAc and subsequently identified to contain 4 (peak 4) in Fig. 2. Further purification was achieved by using an isocratic HPLC fractionation (P580 Dionex Liquid Chromatography system, Grom-Sil 120 ODS-5 ST column, 5 μm, 4.6 × 250 mm (Grom, Herrenberg, Germany)). The mobile phase used was 40% acetonitrile acidified to pH 2.5 with trifluoroacetic acid, and compounds

FIGURE 2. Analyses of the metabolites de novo synthesized from transgenic yeast. A, GC-MS chromatographs are shown for the sesquiterpenoids from yeast transformed with the indicated genes. Lines a and b are negative controls, and line c displays the metabolites unique to the yeast transformed with three genes (GAS, GAO, and CPR). Compounds 2 and 3 were not separated by the DB-5 MS column but were clearly separated by the chiral column (Cyclodex-B column) as shown in the inset. B, the mass fragmentation patterns of compounds 1 to 4 are given. C, proposed acid-induced rearrangements of germacrene A acid to α-, β-, and γ-costic acid and additional modification of costic acids to ilicic acid in yeast culture are shown. The speculated structure of peak 2 as α-costic acid was also given with a question mark. D, Cope rearrangement of germacrene A acid to elemenic acid by heat is shown.
Isolation of Costic Acids (1 and 3)-The same TLC separation was conducted as described above. In addition to the ilicic acid band, another band on the TLC plate showing strong absorbance at UV 254 nm with an RF value of 0.62 was identified to contain 1 and 3 (peaks 1 and 3, respectively) in Fig. 2 by GC-MS analysis. This band was excised, eluted with EtOAc, and further separated by HPLC as described above. Compound 1 (γ-costic acid) was eluted at 9.6 min with a shoulder peak. A separate collection of the shoulder peak and subsequent NMR analysis afforded β-costic acid (3).
Isolation of Germacrene A Acid-For purification of germacrene A acid, the culture medium was extracted with EtOAc, concentrated to 100 μl, and fractionated by a reverse-phase column, Sep-Pak Plus C18 cartridge 55-105-μm column (Waters). Step gradients of elution solvents (water and acetonitrile) starting from 100% water with 10% increments of acetonitrile were used to isolate germacrene A acid. A total of 56 fractions (0.5 ml each) were collected, and the presence of germacrene A acid was visualized as a blue spot on TLC plate using the method described by de Kraker et al. (27).

Total ion scans were used in both negative and positive mode, and specific ion masses were selected for further mass analysis. For germacrene A acid detection in B. spinosa, 100 mg of ground fresh tissue (a mixture of flower and leaf that were used for the cDNA library preparation in the Compositae genomics project) was provided from the Center for Genomics and Bioinformatics, Indiana University. The sample was extracted with 5 ml of ethyl acetate, filtered, dried, and dissolved in methanol. The same Rapid Resolution LC program as above was employed using the Zorbax SB-C18 column (2.1 × 30 mm, 3.5 μm).
Phylogenetic Analysis-GAO and AMO amino acid sequences were aligned with the ClustalW algorithm. Phylogenetic analysis was performed using the Phylogenetic Analysis Using Parsimony (PAUP) 4.0 software (Sinauer Associates Inc., Sunderland, MA). The first 21 amino acids corresponding to the membrane domain were excluded. Characters were reweighted according to the rescaled consistency index. Parsimony analysis was performed using the tree-bisection-reconnection (TBR) algorithm. 1000 replicates of the bootstrap analysis were performed to evaluate the statistical significance of each node.
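The trimming and resampling steps described above can be illustrated with a minimal Python sketch. It is not the PAUP procedure used in this work; it only shows how one bootstrap replicate is drawn from an alignment after the N-terminal region is dropped. The function name, the toy sequences, and the treatment of the "first 21 amino acids" as the first 21 alignment columns are assumptions made for illustration, and the tree search on each replicate is not shown.

```python
import random

def trim_and_bootstrap(alignment, skip=21, seed=None):
    """Drop the first `skip` alignment columns, then draw one bootstrap
    replicate by resampling the remaining columns with replacement."""
    rng = random.Random(seed)
    trimmed = {name: seq[skip:] for name, seq in alignment.items()}
    n_cols = len(next(iter(trimmed.values())))
    if any(len(seq) != n_cols for seq in trimmed.values()):
        raise ValueError("all aligned sequences must have the same length")
    # The same resampled column indices are applied to every sequence,
    # so each resampled column remains a true column of the alignment.
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[i] for i in cols) for name, seq in trimmed.items()}

# Toy aligned sequences (hypothetical, not real GAO or AMO data).
toy = {
    "seqA": "M" * 21 + "ACDEFGHIKL",
    "seqB": "M" * 21 + "ACDEYGHIRL",
}
replicate = trim_and_bootstrap(toy, skip=21, seed=0)
print(replicate)
```

In a full analysis this resampling would be repeated (1000 times here), a tree would be inferred from each replicate, and the fraction of replicates recovering a given node would be reported as its bootstrap support.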

Isolation of GAO and Pathway Reconstitution in Engineered Yeast-Lettuce (L. sativa) accumulates STLs inside its laticiferous cells in stems and leaves (30), and hence, we first aimed at functionally identifying a gene encoding the enzyme for germacrene A oxidation from lettuce. A bioinformatics screening of the lettuce expressed sequence tag (EST) database using A. annua AMO (or CYP71AV1) identified a gene highly homologous to AMO (25,26). PCR amplification using primers designed at the start and stop codons of the identified ESTs allowed us to isolate a full-length gene from lettuce leaf cDNA. The isolated cDNA encodes a polypeptide of 488 amino acids with a predicted molecular mass of 54.9 kDa. The deduced amino acid sequence from this gene showed 86.7% identity to that from A. annua AMO. This P450 gene was designated as germacrene A oxidase (GAO) based on its catalytic property (see below).
To assess the enzymatic activity of GAO, yeast strain EPY300 engineered to produce a markedly increased level of farnesyl diphosphate (an immediate precursor of germacrene A) served as a platform strain. The production of the hydrocarbon germacrene A in the EPY300 strain expressing previously characterized lettuce GAS has been demonstrated (21,24). For in vivo catalytic coupling of GAS and GAO, open reading frames of GAO with a FLAG epitope tag and A. annua CPR with a cMyc epitope tag under Gal10 and Gal1 promoters, respectively, were placed in the pESC-Ura dual expression vector. This plasmid and the previously generated GAS::pESC-Leu were co-transformed into EPY300, resulting in simultaneous expression of GAS, GAO, and CPR in farnesyl diphosphate-abundant cells. Upon induction of transgenes, the presence of GAO and CPR recombinant proteins was confirmed by immunoblot analysis with FLAG and cMyc antibodies (data not shown).
Terpenoids were extracted, derivatized as described previously (25), and subjected to GC-MS analysis. Unexpectedly, four novel compounds that were not present in the yeast expressing either GAS or GAO/CPR were identified in the yeast expressing GAS, GAO, and CPR (Fig. 2A). The four compounds (1, 2, 3, and 4) showed the same parental mass of m/z 248 (methyl ester form) but distinct fragmentation patterns in electron impact MS (Fig. 2B). The electron impact MS analysis of their native (non-methylated) forms revealed an identical parent mass of m/z 234 in each case. The observation of m/z 248 (methyl ester) and 234 (native form) suggested that the hydrocarbon germacrene A (Mr 204) is sequentially oxidized three times by GAO as is the case for the biosynthesis of artemisinic acid by AMO (Fig. 1).

Structure Analyses of Products from Yeast Culture under Acidic Conditions-We pursued the chemical purification of the new compounds for structural analysis. Before purification, plasmid stabilities were enhanced by coding all three genes (GAS, GAO, and CPR) in a single plasmid in pESC-Leu2d vector. The EPY300 strain transgenic for this new construct was cultured in 2.5 liters, and the newly synthesized compounds were purified through TLC and HPLC. Compounds 1, 3, and 4 were successfully purified and subjected to extensive one- and two-dimensional NMR studies (see "Experimental Procedures" for details). The structures of compounds 1 and 4 were unambiguously determined to be γ-costic acid and ilicic acid, respectively (supplemental Table 1, Fig. 2C). Due to a limited amount of 3, the ¹H NMR and indirectly detected ¹³C signals of peak 3 could only be partially assigned. Nevertheless, structure elucidation indicated this compound to be β-costic acid. Compounds 1, 3, and 4 represent cyclization products of germacrene A acid (Fig. 2C). Although purification of 2 was not achieved, one can assume that this compound could be α-costic acid, the third form of cyclized product from germacrene A acid, in accordance with the observed molecular mass of 234 atomic mass units. Compound 4 (ilicic acid) is believed to be derived from hydration of the double bonds in costic acids.
Altered Metabolite Profile from the pH Adjusted Yeast Cultivation-We speculated that the cyclization of germacrene A acid occurs in the course of de novo germacrene A acid synthesis in an acidic culture condition. When pH values of the yeast culture were monitored, the yeast medium (initially pH 6) became very acidic after a 48-h cultivation (pH ∼3). To overcome the medium acidity, the pH was stabilized by supplementing the medium with 100 mM HEPES buffer (pH 7.5). In the buffered medium, the pH of the medium decreased to 6 after a 48-h cultivation. In the GC-MS analysis of the metabolites, the four peaks identified in the acidic condition almost disappeared, and instead a very broad, new peak with m/z 234 (without methylation) appeared (Fig. 3A, top). The broad peak was also observed previously when the hydrocarbon germacrene A was subjected to GC-MS analysis (27). This broadening was attributed to the on-column heat-induced Cope rearrangement of germacrene A into faster migrating β-elemene.
Before chemical purification, we evaluated if this new compound (broad peak) was indeed the precursor of the costic acids detected previously. To assess this possibility, the compound from neutral cultivation was treated with trichloroacetic acid. When the acidified samples were reanalyzed by GC-MS, the three early eluting peaks identified as costic acids were clearly detected with a complete disappearance of the broad peak (Fig. 3A, middle). In addition, when the compound from the neutral cultivation was injected at a 330°C GC inlet temperature (previously 150°C), the minute peak preceding the broad peak increased by 19-fold (Fig. 3A, arrow). GC-MS analysis suggested that the fast-migrating compound was β-elemenic acid (heat-induced product), in agreement with the published data (27) (Fig. 2D, see "Experimental Procedures" for the full MS spectrum). In LC-MS analysis, the samples prepared from nonbuffered medium showed the presence of costic acids (m/z 233) and ilicic acid (m/z 251) with a negligible amount of putative germacrene A acid (m/z 233) (Fig. 3B, top). However, in the samples prepared from the buffered medium, the amount of costic acids and ilicic acid decreased to 15 and 34%, respectively, relative to their levels in non-buffered culture. Concomitantly, a new peak of m/z 233 at a slightly earlier retention time than costic acids increased by 44-fold (Fig. 3B, bottom). These GC- and LC-MS results showed that simply cultivating the transgenic yeast in buffered medium dramatically alters the sesquiterpenoid profile from aberrant rearranged products (costic acids and ilicic acid) to a new peak with m/z 233 in negative ions.
Structure Elucidation of Germacrene A Acid-4 mg of the new compound was purified from a 1-liter neutralized medium by solid-phase extraction column (C18 reverse phase −, 233.1547). Based on the complete signal assignment of the compound by standard one- and two-dimensional NMR experiments (see "Experimental Procedures," supplemental Table 1), the compound was determined to be germacrene A acid. The extensive peak broadening observed in the ¹H NMR spectra was attributable to the presence of different conformers as were found in the NMR analysis of germacrene A (31). In this case, the broad signals are due to intermediate exchange on the NMR time scale by the various conformers. The in vitro activity of the recombinant GAO was further confirmed using the microsomes isolated from the yeast expressing GAO and CPR. In a carefully pH-controlled experiment (pH was >6.0 in all experimental conditions except for the 12-min HPLC run), (−)LC-MS analysis showed that 233 ions for germacrene A acid were clearly detectable, but the other 233 ions from costic acids could not be detected (Fig. 3C). The in vivo, in vitro, and chemical analysis data demonstrated that GAO encodes an enzyme for the conversion of germacrene A to germacrene A acid.
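The masses quoted in this section can be checked with a short calculation. The sketch below is illustrative only: the elemental formulas (C15H24 for germacrene A, C15H22O2 for germacrene A acid and the costic acids, C15H24O3 for ilicic acid) are assumptions inferred from the nominal masses 204, 234, and the m/z 251 ion reported here, and the function names are hypothetical. Under these assumptions, the deprotonated ion of C15H22O2 is expected at m/z 233.1547, consistent with the value given above for the purified compound.

```python
# Monoisotopic masses (u) of the most abundant isotopes, and the electron mass.
MONO = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}
ELECTRON = 0.00054858

def monoisotopic_mass(formula):
    """Monoisotopic mass of a neutral molecule given as {element: count}."""
    return sum(MONO[element] * count for element, count in formula.items())

def mz_deprotonated(formula):
    """m/z of the singly charged [M-H]- ion (loss of one proton)."""
    return monoisotopic_mass(formula) - MONO["H"] + ELECTRON

# Assumed elemental formulas (not stated explicitly in the text).
germacrene_a = {"C": 15, "H": 24}
germacrene_a_acid = {"C": 15, "H": 22, "O": 2}
ilicic_acid = {"C": 15, "H": 24, "O": 3}

print(round(monoisotopic_mass(germacrene_a)))       # nominal mass 204
print(round(monoisotopic_mass(germacrene_a_acid)))  # nominal mass 234
print(round(mz_deprotonated(germacrene_a_acid), 4)) # 233.1547
print(round(mz_deprotonated(ilicic_acid)))          # nominal m/z 251
```

The three-step oxidation of a methyl group to a carboxylic acid adds two oxygens and removes two hydrogens (a net gain of 30 nominal mass units, 204 to 234), and methylation of the acid adds a further 14 units, which accounts for the m/z 248 methyl ester.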
GAO Activity Is Highly Conserved in Asteraceae-The in vivo system was used to trace the advent of GAO in various Asteraceae plant species. To study the evolutionary lineage of GAO, a combination of expressed sequence tag mining, PCR on conserved regions, and rapid amplification of cDNA ends was used to isolate full-length clones of AMO/GAO homologs from selected Asteraceae plants. Sunflower (Helianthus annuus), chicory (Cichorium intybus), and costus (S. lappa) were chosen as the representatives of Asteroideae, Cichorioideae, and Carduoideae, respectively; these are the core subfamilies that constitute 95% of all Asteraceae. B. spinosa was selected as the representative of the phylogenetically basal lineage, subfamily Barnadesioideae (Fig. 4A) (9).
The enzymatic activities of the isolated GAO clones were examined in the yeast in vivo system by co-expressing GAS and CPR. In the (−)LC-MS analyses, the 233 ions from the germacrene A acid were detected in all samples tested, and quantitative analyses showed that comparable amounts of germacrene A acid were synthesized in yeast expressing either a lettuce, chicory, costus, or Barnadesia clone (Fig. 5A). Although the germacrene A acid from the yeast expressing the sunflower clone was an order of magnitude lower (10-15-fold) than that of the others, semiquantitative immunoblot analysis of isolated microsomes revealed that the sunflower recombinant protein was at least 10-fold lower than that from the other clones (Fig. 5B). Hence, after accounting for the lower protein abundance, the sunflower enzyme also showed a comparable level of catalytic activity.
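The reasoning behind the sunflower comparison is a simple normalization of product level by enzyme level, sketched below. The function name is hypothetical, and the inputs are the approximate fold differences quoted above rather than measured values; because the protein level was "at least" 10-fold lower, the results are best read as lower bounds.

```python
def relative_specific_activity(product_ratio, protein_ratio):
    """Product formed per unit of enzyme, relative to a reference clone.

    product_ratio: product level relative to the reference (e.g. 1/10).
    protein_ratio: enzyme level relative to the reference (e.g. 1/10).
    """
    if protein_ratio <= 0:
        raise ValueError("protein_ratio must be positive")
    return product_ratio / protein_ratio

# Sunflower clone: 10- to 15-fold less product, about 10-fold less protein.
upper = relative_specific_activity(1 / 10, 1 / 10)
lower = relative_specific_activity(1 / 15, 1 / 10)
print(f"relative activity per unit protein: {lower:.2f} to {upper:.2f}")
```

On these rough figures the per-protein activity of the sunflower enzyme falls within the same order of magnitude as that of the other clones.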
Although the involvement of germacrene A acid is apparent in the sesquiterpenoid metabolism of sunflower, costus, chicory, and lettuce on the basis of chemical structures and/or in vitro enzyme assays (19,20,24,27), the biochemical relevance of germacrene A acid in B. spinosa has not been addressed. Thus, the presence of germacrene A acid was examined by (−)LC-SIM (selective ion mode) from the ethyl acetate extract of B. spinosa (leaf and flower). In the (−)233 ion monitoring, germacrene A acid together with costic acids was detected at the concentration of 0.013% (weight/fresh weight) (Fig. 5C). This result showed that germacrene A acid is present in B. spinosa, most likely as an intermediate. Collectively, these results demonstrated that GAOs and their corresponding enzymatic activities (three consecutive oxidations of germacrene A C12) are highly conserved in the phylogenetically basal clade of Asteraceae (i.e. Barnadesioideae) and retained in three major subfamilies of Asteraceae.
Phylogenetic Analysis of GAO and AMO-The deduced amino acid sequences from these clones shared significant sequence identities ranging from 78.4 to 97.3% (Fig. 4C). Interestingly, AMO shared a higher degree of sequence identity with the GAOs from lettuce, chicory, sunflower, and costus (84.2-86.8%) than BsGAO did with these GAOs (79.6-82.6%). To obtain a better insight into the AMO and GAO evolution, a phylogenetic tree was reconstructed from AMO and five GAOs using two cytochrome P450s for sesquiterpene oxidations as outgroups (32,33). The phylogenetic analysis showed that AMO forms a distinctive node from the major GAO clade within the lineage originated from the Barnadesia GAO (Fig. 4B). Although both sunflower and Artemisia belong to the same Asteroideae subfamily, sunflower GAO constitutes part of the major GAO clade (Fig. 4B, bracket) that can be distinguished from AMO with strong statistical support. These data implied that AMO in A. annua recently underwent a specific biochemical microevolution that was not mirrored by the overall speciation patterns of the subfamily Asteroideae.
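Pairwise identity values such as those quoted above are computed from aligned sequences. The sketch below shows one common convention and is not necessarily the one used for Fig. 4C: identical residues are counted over the columns in which neither sequence has a gap. The function name, the gap handling, and the toy sequences are assumptions for illustration.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity between two aligned sequences of equal length.

    Columns in which either sequence carries a gap are ignored; other
    conventions (e.g. dividing by the full alignment length) give
    slightly different values.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)

# Toy aligned fragments (hypothetical, not real GAO or AMO sequences).
identity = percent_identity("MELSITT-SL", "MEVSLTTFSL")
print(f"{identity:.1f}% identity")
```

Applying such a function to every pair of aligned sequences yields an identity matrix from which ranges like those reported here can be read off.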
Cross-reactivities of AMO and GAO-In the sequence alignment, after excluding the membrane-bound domain, only 21 amino acids in AMO were different from the conserved residues in GAO, and an additional 7 amino acids in AMO were unique residues that were not present among the variable residues of other GAOs (Fig. 4C). Such high sequence similarity between AMO and GAOs raised questions about the substrate specificities of these two types of enzymes and the molecular evolution of AMO in Asteraceae. To address these questions, cross-reactivities of GAOs toward amorphadiene and AMO toward germacrene A were investigated using an in vivo system. Co-expression plasmids for the swapped gene pairs (GAS/AMO or ADS/GAOs) were constructed in the CPR::pESC-Leu2d plasmid and were transformed to the EPY300 strain. When co-expressed, the GAS and AMO gene pair displayed negligible activity (0.04%) for germacrene A acid synthesis, relative to the activity detected from the native enzyme pair, GAS and GAO (Table 1). However, when co-expressed with ADS, all five GAOs from various Asteraceae plants displayed ∼10²-10³-fold higher relative activities (5-40%) for artemisinic acid synthesis than that from the GAS and AMO pair. These results suggest that AMO appears to have lost its ancestral capacity to oxidize germacrene A during its evolutionary path, but the GAOs (predecessors of AMO) from various Asteraceae species including B. spinosa possess the catalytic potential (or plasticity) to oxidize amorphadiene.
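The fold difference quoted above follows directly from the relative activities. The short sketch below reproduces the arithmetic using only the percentages given in the text (0.04% for AMO on germacrene A; 5-40% for the GAOs on amorphadiene); the variable names are illustrative.

```python
import math

# Relative activities (% of the corresponding native enzyme pair).
amo_on_germacrene_a = 0.04          # GAS + AMO
gao_on_amorphadiene = (5.0, 40.0)   # ADS + GAOs, lowest and highest

folds = [gao / amo_on_germacrene_a for gao in gao_on_amorphadiene]
for gao, fold in zip(gao_on_amorphadiene, folds):
    print(f"{gao:g}% vs {amo_on_germacrene_a:g}%: "
          f"{fold:.0f}-fold (10^{math.log10(fold):.1f})")
```

The two ends of the range correspond to roughly 125-fold and 1000-fold, that is, between 10² and 10³.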

DISCUSSION
In this work GAO catalytic activity was unambiguously determined to be germacrene A oxidase using pathway reconstitution in a yeast in vivo system and a standard in vitro assay. A lack of substrates and chemical standards often impedes progress in natural product biochemistry. Moreover, the instability of transient intermediates complicates biochemical studies. Both germacrene A and germacrene A acid are unstable, transient intermediates in the STL metabolic pathway in plants, and thus the acquisition of these compounds by purification from plant sources is difficult. In the absence of an authentic standard (germacrene A acid), the enzymatic production of germacrene A acid by in vitro assays and subsequent purification to the level adequate for standard NMR analyses would be technically challenging. The results presented here demonstrate that pathway reconstruction in yeast is a convenient alternative for the characterization of gene function, mass production of natural products for chemical analysis, and substrate preparation for further biochemical studies.

FIGURE 4. A, … (9) is shown. Asterisks indicate the four subfamilies where GAOs were isolated, and the parentheses indicate specific species names. B, the phylogenetic tree of AMO/GAO in Asteraceae is shown. HPO and EAH, used as outgroups, are Hyoscyamus muticus premnaspirodiene oxygenase and tobacco 5-epi-aristolochene dihydroxylase, respectively. Bootstrap values are shown as percentages from 1000 replicates. The bracket indicates the GAO clade that is clearly distinguished from BsGAO and AMO. C, alignment of deduced amino acids from GAOs and AMO is shown. Amino acid sequences were obtained from cDNAs deposited at the NCBI. AMO, amorphadiene oxidase from A. annua (DQ268763 or DQ315671); LsGAO, germacrene A oxidase from L. sativa (GU198171); other GAOs are from C. intybus (Ci; GU256644), S. lappa (Sl; GU256645), H. annuus (Ha; GU256646), and B. spinosa (Bs; GU256647). Stars indicate the residues conserved in GAO but different in AMO. Circled marks indicate the residues unique in AMO but not conserved in GAO. The alignment is shaded to a 50% consensus. Dark and light shading indicate identical and similar residues, respectively.
The β- and γ-costic acids identified in this study are known natural products that can be derived from germacrene A acid in vitro (27). It was, therefore, not surprising to detect these products in acidic yeast culture conditions, but the identification of ilicic acid was not expected from the yeast system. Ilicic acid and its hydroxyl derivatives have been reported as natural products in Asteraceae plants (34-36), and their anti-tumor and anti-inflammatory activities have been suggested (34,37). Given the stereospecific hydroxyl group at the C4 position, the formation of ilicic acid from costic acids could be mediated by a nonspecific yeast enzyme(s), but the nature of this chemical conversion was not pursued further in this study.
The origin and phylogeny of Asteraceae have been the subject of intensive systematic studies for the last 30 years. Based on fossil records and molecular systematic data, it is believed that Asteraceae originated about 30 million years ago in South America and radiated rapidly to all continents except Antarctica (9,38,39). In 1987 it was discovered that a 22-kb DNA inversion in the chloroplast genome, shared by all other Asteraceae plants, is lacking in the Barnadesioideae group, and these inversion data have since served as key molecular evidence to support an ancient split between Barnadesioideae and the rest of the Asteraceae (39).
The yeast in vivo reconstitution results in this study clearly demonstrated that GAO and its activity are highly conserved in five plant species representing the major subfamilies and the phylogenetic base of Asteraceae. In particular, the presence of the GAO gene and its three sequential oxidation activities in the ancient surviving species B. spinosa indicates the occurrence of GAO at the beginning of Asteraceae evolution and the subsequent retention of GAO and its enzyme activity in the major subfamilies for more than 30 million years. Considering that genes in plant secondary metabolism, such as cytochrome P450s of the CYP71 family, often diversify rapidly through neofunctionalization, the conservation of GAO in Asteraceae suggests that GAO offers increased fitness to plants in this family. Because various STLs have been isolated from sunflower, lettuce, chicory, and costus (30,40,41), it is tempting to speculate that STLs synthesized through GAO activity are the chemical entities responsible for the enhanced fitness of Asteraceae. However, phytochemical studies of Barnadesioideae have been limited to flavonoids to date (42,43), and further investigation is required to correlate GAO with STLs in Barnadesioideae.

TABLE 1 (footnotes)
a Natural pairs of sesquiterpene synthase and P450 (e.g., lettuce GAS and lettuce GAO or Artemisia ADS and Artemisia AMO) were set as 100% activities. The production levels of germacrene A acid or artemisinic acid were normalized by relative P450 abundance estimated by immunoblot analysis. Values (mean ± S.D.) were obtained from at least three independent transformants.
b Germacrene A acid measurement.
c Artemisinic acid measurement.
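The normalization described in footnote a of Table 1 can be sketched as follows. This is one plausible reading of the footnote, not the authors' stated formula: product level is divided by relative P450 abundance, then expressed as a percentage of the same quantity for the native enzyme pair. All numbers below are hypothetical placeholders, not measured data, and the function name is illustrative.

```python
import statistics


def normalized_relative_activity(
    product: float,
    p450_abundance: float,
    native_product: float,
    native_p450_abundance: float,
) -> float:
    """Percent activity relative to the native pair, per unit of P450.

    Assumed formula: (product / P450) / (native product / native P450) * 100.
    """
    if p450_abundance <= 0 or native_p450_abundance <= 0 or native_product <= 0:
        raise ValueError("abundances and native product must be positive")
    per_enzyme = product / p450_abundance
    native_per_enzyme = native_product / native_p450_abundance
    return 100.0 * per_enzyme / native_per_enzyme


# Hypothetical example: a swapped pair makes 2.0 units of product with half
# the P450 abundance of a native pair that makes 10.0 units.
value = normalized_relative_activity(2.0, 0.5, 10.0, 1.0)

# Hypothetical replicate values from three independent transformants,
# summarized as mean and sample standard deviation.
replicates = [38.0, 40.0, 42.0]
mean_activity = statistics.mean(replicates)
sd_activity = statistics.stdev(replicates)

print(f"{value:.1f}%  mean {mean_activity:.1f} ± {sd_activity:.1f}")
```

Dividing by P450 abundance before comparing pairs separates differences in catalytic capacity from differences in expression level, which is presumably why the immunoblot estimate was included in the normalization.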