Bacterial Biosynthetic Gene Clusters Encoding the Anti-cancer Haterumalide Class of Molecules

Background: Oocydin A is an anticancer haterumalide with strong antimicrobial activity against agriculturally important plant pathogenic fungi and oomycetes.
Results: The oocydin A gene cluster has been identified and characterized in four plant-associated enterobacteria.
Conclusion: The ooc gene cluster is organized in three transcriptional units encoding enzymes that belong to a growing class of trans-acyltransferase polyketide synthases.
Significance: Oocydin A has potential agricultural, pharmacological, and chemotherapeutic applications.

Haterumalides are halogenated macrolides with strong antitumor properties, making them attractive targets for chemical synthesis. Unfortunately, current synthetic routes to these molecules are inefficient. The potent haterumalide, oocydin A, was previously identified from two plant-associated bacteria through its high bioactivity against plant pathogenic fungi and oomycetes. In this study, we describe oocydin A (ooc) biosynthetic gene clusters identified by genome sequencing, comparative genomics, and chemical analysis in four plant-associated enterobacteria of the Serratia and Dickeya genera. Disruption of the ooc gene cluster abolished oocydin A production and bioactivity against fungi and oomycetes. The ooc gene clusters span between 77 and 80 kb and encode five multimodular polyketide synthase (PKS) proteins, a hydroxymethylglutaryl-CoA synthase cassette, and three flavin-dependent tailoring enzymes. The presence of two free-standing acyltransferase proteins classifies the oocydin A gene cluster within the growing family of trans-AT PKSs. The amino acid sequences and organization of the PKS domains are consistent with the chemical predictions and functional peculiarities associated with trans-acyltransferase PKSs. Based on extensive in silico analysis of the gene cluster, we propose a biosynthetic model for the production of oocydin A and, by extension, for other members of the haterumalide family of halogenated macrolides exhibiting anti-cancer, anti-fungal, and other interesting biological properties.

Plants have to cope with potentially devastating biotic and abiotic environmental stresses during their life cycle. Among these stresses, fungal and oomycete plant pathogens are responsible for many of the most agriculturally important diseases worldwide (1). Effective chemical control of these infections is extremely difficult, and biopesticides (pest management agents based on living microorganisms or natural products) are considered to be one of the most promising methods for rational crop management (2). Many bacteria can synthesize bioactive secondary metabolites, some of which can combat key fungal and oomycete plant pathogens, and thus these metabolites may have utility in biocontrol systems (3,4).
Oocydin A is a chlorinated macrolide that was isolated in 1999 from the plant epiphytic bacterial strain Serratia marcescens MSU97 because of its strong bioactivity against plant pathogenic oomycetes (5). The same macrolide, named haterumalide NA, was also isolated from the sponge Ircinia sp. (6) and subsequently from the rhizosphere bacterium, Serratia plymuthica A153, where it showed the ability to inhibit the hyphal growth of the plant pathogenic fungus, Sclerotinia sclerotiorum (7). Furthermore, in 2005, it was also isolated from Serratia liquefaciens number 1821 on the basis of its anti-hyperlipidemic activity and given the code FR177391 (8). Spectroscopic data for haterumalide NA, oocydin A, and FR177391 are identical, although there are differences in their reported optical rotations, so it remains formally possible that the bioactive metabolite from Serratia species is the enantiomer of that derived from the sponge.
Although there are no reports to date on the biosynthetic route to oocydin A, its structure suggests a polyketide origin. During the biosynthesis of polyketides, the polyketide chain is assembled and elongated while covalently attached to acyl carrier protein (ACP) domains. The elongation is performed by the C-C bond-forming ketosynthase (KS) domains, and acyltransferase (AT) domains are responsible for introducing (mostly) malonyl or methylmalonyl building units to the ACP during each cycle of elongation. Type I polyketide synthases (PKSs) are usually multidomain proteins in which the domains form an assembly line, consisting of multiple modules, each of which is responsible for one round of chain elongation. The minimal KS-AT-ACP module can be supplemented with one or more of a ketoreductase (KR) domain (converting the β-keto group to a β-hydroxy group), a dehydratase (DH) domain (eliminating water to generate a C=C double bond), and an enoylreductase (ER) domain, which reduces double bonds to saturated intermediates (14). According to the textbook model, the number of modules in a polyketide gene cluster is correlated with the number of extension cycles executed by the PKS and therefore with the structure of the eventual secondary metabolite. After the synthesis of the polyketide backbone, the chain is released from the PKS, usually by hydrolysis or cyclization catalyzed by a thioesterase (TE) domain, and in many cases, the polyketide is then modified by a range of tailoring enzymes. These modifications include glycosylations, hydroxylations, acyl transfers, epoxidations, and halogenations (14-16). In bacteria, polyketide biosynthesis genes are normally organized in gene clusters with the PKS genes being part of a single operon, which reflects both the coordinate regulation required for the activation of the biosynthetic pathway and the evolutionary origin of the cluster, in most cases, by horizontal gene transfer between microbial genomes (17).
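The textbook co-linearity rule described above can be illustrated with a minimal sketch. It assumes a canonical cis-AT assembly line in which every module contains KS, AT, and ACP domains, and it only reports the oxidation state left at the β-carbon by each module. The function name and the four example modules are hypothetical and do not describe the Ooc PKS.

```python
# Minimal sketch of the textbook (cis-AT) co-linearity rule.
# Module contents below are hypothetical, not taken from the ooc cluster.
REQUIRED = {"KS", "AT", "ACP"}

def beta_carbon_state(domains):
    """Return the beta-carbon state produced by one canonical module."""
    domains = set(domains)
    missing = REQUIRED - domains
    if missing:
        raise ValueError(f"not a minimal module, missing: {sorted(missing)}")
    if "KR" not in domains:
        return "keto"          # no reduction of the beta-keto group
    if "DH" not in domains:
        return "hydroxy"       # KR only: beta-hydroxy group
    if "ER" not in domains:
        return "double bond"   # KR + DH: C=C double bond
    return "saturated"         # KR + DH + ER: fully reduced

# One extension cycle per module in the textbook model.
example_modules = [
    ["KS", "AT", "ACP"],
    ["KS", "AT", "KR", "ACP"],
    ["KS", "AT", "DH", "KR", "ACP"],
    ["KS", "AT", "DH", "ER", "KR", "ACP"],
]
predicted = [beta_carbon_state(m) for m in example_modules]
print(len(example_modules), "extension cycles:", predicted)
```

The sketch deliberately ignores stereochemistry, starter-unit loading, and chain release, so it should be read only as a restatement of the rule, not as a predictor of real polyketide structures.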
In this study, we employed genome sequencing, comparative genomics, mutagenesis, and chemical analysis to identify the PKS gene cluster responsible for the biosynthesis of oocydin A. The oocydin A gene cluster is organized in three different transcriptional units, and it is present in four different plant-associated enterobacteria, three of them belonging to the genus Serratia. Based on the structure of the gene cluster and analysis of its conserved motifs and catalytic residues, we propose a model for the biosynthesis of oocydin A.
In Vitro Nucleic Acid Techniques-A S. marcescens MSU97 cosmid library was constructed from high molecular weight genomic DNA (35-45 kb) using the pWEB-TNC Cosmid cloning kit following the manufacturer's instructions (Epicenter Biotechnologies). Plasmid DNA was isolated using the Anachem Keyprep plasmid kit. For DNA digestion, the manufacturer's instructions were followed (New England Biolabs and Fermentas). Separated DNA fragments were recovered from agarose using the Anachem gel recovery kit. Ligation reactions, total DNA extraction, and Southern blots were performed by standard protocols (28). DNA digoxigenin-dUTP probes were obtained via PCR following the instructions of the manufacturer (Roche Applied Science). Competent cells were prepared using calcium chloride, and transformations were performed by standard protocols (28). Phusion high fidelity DNA polymerase (New England Biolabs) was used in the amplification of PCR fragments for cloning. Sequences of these PCR fragments were verified to discard amplicons containing mutations. Routine DNA sequencing was carried out at the University of Cambridge DNA Sequencing Facility on an Applied Biosystems 3730xl DNA analyzer.
Genome Sequencing and Bioinformatics Analyses-Genomic DNA sequencing was performed at the DNA Sequencing Facility, Department of Biochemistry (University of Cambridge), using 454 DNA pyrosequencing technology on a PicoTiterPlate for a Roche Applied Science Genome Sequencer FLX system. The shotgun assemblies were carried out using 454 GS de novo assembler software (Newbler version 2.6). For the S. marcescens MSU97 genome sequence, the assembly used 521,156 reads or 204 MB of raw data to give 38× coverage of the genome and resulted in 81 contigs, 68 of which were larger than 500 bp. The average contig size was 77,365 bp, and the largest contig was 418,649 bp. The Serratia plymuthica A153 assembly used 308,585 reads or 129 MB of raw data to give 22× coverage of the estimated genome size and resulted in a total of 36 contigs, 24 of them larger than 500 bp. The average contig size was 230,980 bp, and the largest contig was 1,516,666 bp.
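The relationship between raw data and fold coverage quoted above can be checked with a short sketch. It assumes that "MB" denotes megabases of sequence and uses the 5.3-Mb size of the MSU97 draft genome given under "Results and Discussion"; the function names are illustrative only.

```python
# Sketch: average fold coverage = sequenced bases / genome length.
# Assumes "MB" of raw data means megabases of sequence.

def fold_coverage(total_bases, genome_size):
    """Average depth of coverage for a shotgun data set."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size

def implied_genome_size(total_bases, coverage):
    """Genome size implied by a reported coverage value."""
    if coverage <= 0:
        raise ValueError("coverage must be positive")
    return total_bases / coverage

# MSU97: 204 Mb of raw data over a 5.3-Mb draft genome.
msu97_coverage = fold_coverage(204e6, 5.3e6)
# A153: 129 Mb of raw data at the reported 22-fold coverage.
a153_genome = implied_genome_size(129e6, 22)

print(f"MSU97 coverage: about {msu97_coverage:.1f}x")
print(f"A153 implied genome size: about {a153_genome / 1e6:.2f} Mb")
```

For MSU97 the calculation returns a value that rounds to the reported 38-fold coverage; the A153 figure is only the genome size implied by the reported numbers, not a measured value.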
Automated annotation of the bacterial sequences was done using the BASys web server (29). Anti-SMASH was used to analyze the potential secondary metabolite biosynthesis gene clusters present in the genomes (30). Genome comparison analyses were performed using the wgVISTA on-line tool (31). Open reading frames (ORFs) in the oocydin A gene cluster were automatically predicted using Glimmer 3.0 (32). BLAST analyses were used for functional gene assignment. Protein domain organization was identified using the NCBI conserved domains database (33) and the Pfam database (34). Multiple sequence alignments were carried out with ClustalW2 (European Bioinformatics Institute). Artemis software (Wellcome Trust Sanger Institute) was used to visualize genomic sequences.
Transposon Mutagenesis-Random transposon mutagenesis of S. plymuthica A153 using Tn-KRCPN1 was performed as follows. In a biparental conjugal mating, 500 μl of overnight cultures of E. coli β2163 (pKRCPN1) and S. plymuthica A153 were mixed, collected by centrifugation, resuspended in 30 μl of fresh LB, and spotted on an LB agar plate supplemented with 300 μM 2,6-diaminopimelic acid. After overnight incubation at 30°C, cells were scraped off the plate and resuspended in 1 ml of LB. Serial dilutions were plated on LB agar medium containing 75 μg ml⁻¹ kanamycin. 2,6-Diaminopimelic acid was not added to the LB agar medium, to allow counterselection of the E. coli donor. In total, 5000 kanamycin-resistant insertion mutants were screened for their inability to inhibit Pythium ultimum growth. The insertion site of transposon Tn-KRCPN1 in mutants of interest was determined using random primed PCR following the method described previously (35) and using primers described in supplemental Table S5. In all 22 isolated mutants exhibiting no, or reduced, bioactivity, the insertions were located in the oocydin A gene cluster. Within the isolated mutants, some of the insertion sites were identical, suggesting that the strains were most likely clonal isolates.
Marker Exchange Mutagenesis-Specific site-directed mutants defective in oocK and oocM were constructed by homologous recombination using derivative plasmids of the suicide vector pKNG101 (20). These plasmids, which are listed in Table 1, were confirmed by DNA sequencing, and they carried mutant alleles for the replacement of wild type genes in the chromosome. In all cases, plasmids were transferred to S. plymuthica A153 by triparental conjugation using E. coli CC118pir and E. coli HH26 (pNJ500) as helper. Mutants defective in oocK and oocM were generated using plasmids pMAMV54 and pMAMV62, respectively (Table 1). All relevant mutations were confirmed by PCR, sequencing, and Southern blot analysis. Primers used in this marker exchange mutagenesis are listed in Table S5.
Generalized Transduction-The newly isolated generalized transducing phage, MAM1, was used for transduction of chromosomal mutations using a method similar to that described previously (36). Transductants were selected on plates containing kanamycin, and retention of phage sensitivity was confirmed in the transductants.
RNA Extraction, cDNA Synthesis, and Reverse Transcription-PCR (RT-PCR) Analyses-RNA was extracted from late exponential (12 h) cultures grown in enriched potato dextrose medium using an RNeasy mini kit (Qiagen) according to the manufacturer's instructions. RNA concentration was determined spectrophotometrically, and RNA integrity was assessed by agarose gel electrophoresis. Genomic DNA contamination was eliminated by treating total RNA with Turbo DNA-free (Ambion). The synthesis of cDNA was performed using random hexamers (GE Healthcare) and SuperScript II reverse transcriptase (Invitrogen) in a 20-μl reaction with 2.5 μg of total RNA and incubation at 42°C for 2 h. A negative control reaction was also performed, omitting the reverse transcriptase enzyme. Then the equivalent of 50 ng of total RNA was subjected to PCR amplification using primers to amplify across the junctions (supplemental Table S5). Positive and negative control PCRs were performed using genomic DNA and no-RT cDNA samples, respectively, as templates. PCR conditions consisted of 30 cycles of denaturation for 1 min at 94°C, annealing for 1 min at 62°C, and extension for 40 s at 72°C.
Antibacterial, Antifungal, and Anti-oomycete Activity in Vitro-Production of antibiotic compounds was tested against E. coli ESS and Bacillus subtilis JH642 using lawn assays as described previously (37). Antagonistic activities of bacterial strains against the fast growing plant pathogenic oomycete P. ultimum were assayed by spotting 5 μl of overnight cultures of the selected strains on a PDA plate. Following incubation for 16 h at 25°C, the plates were inoculated with 5-mm diameter mycelial plugs taken from a culture of P. ultimum grown on PDA. Plates were incubated at 25°C for 3-5 days. For the fungicide assays, indicator top agars of Verticillium dahliae, Thanatephorus cucumeris, Alternaria solani, and Fusarium oxysporum f. sp. lycopersici were prepared by vortexing a 5-mm fungal plug in 10 ml of sterile distilled water. Then 15 ml of PDA was added and mixed, and 5 ml of top lawns were poured into PDA plates. Five microliters of overnight cultures of the selected strains were spotted on the surface of the fungal agar lawn and incubated for 7-10 days at 25°C. To determine fungicide/antioomycete levels in bacterial supernatants, culture samples were taken, cells were pelleted by centrifugation (14,000 × g, 10 min), and the supernatant was filtered (0.2 μm). Three hundred microliters of the filter-sterilized supernatant were added to wells cut into the PDA plate and incubated at 25°C for 3-5 days. All the experiments were repeated at least five times.
Caenorhabditis elegans Virulence Assays-The C. elegans virulence assays were adapted from those described by Kurz et al. (38). Briefly, NGM plates were inoculated with 50 μl of an overnight culture of the oocydin A-producing wild type strains (S. marcescens MSU97, S. plymuthica A153, S. odorifera 4Rx13, and Dickeya dadantii Ech703) or the oocydin A-deficient S. plymuthica A153 mutants. Plates inoculated with E. coli OP50 were also included as a negative control. The plates were incubated for 16 h at 25°C. For each assay, worms previously fed on E. coli OP50 were transferred to the plates inoculated with the strains to be tested. Fifty L4 stage hermaphrodite DH26 worms (genotype fer-15(b26)II, sterile at 25°C) were used for each strain tested (10 worms per plate). Plates were incubated at 25°C, and the number of live worms was scored every 24 h. Worms were considered dead when they failed to respond to touch. Survival curves were analyzed using GraphPad Prism software. p values <0.05 were considered statistically significant.
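Survival data of the kind scored above (live worms counted every 24 h) are commonly summarized with a Kaplan-Meier estimate. The following is a minimal sketch of that estimator, offered as an assumption about the type of analysis involved; the text states only that curves were analyzed in GraphPad Prism. The death and censoring times are invented for illustration and are not data from this study.

```python
# Sketch of a Kaplan-Meier survival estimate (pure Python).
# Times are in days; all values below are hypothetical.

def kaplan_meier(death_times, censored_times=()):
    """Return a list of (time, survival fraction) steps."""
    curve = []
    survival = 1.0
    for t in sorted(set(death_times)):
        # Worms still under observation just before time t.
        at_risk = (sum(1 for x in death_times if x >= t)
                   + sum(1 for x in censored_times if x >= t))
        deaths = sum(1 for x in death_times if x == t)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve

# Hypothetical assay with 50 worms: 5 die on day 1, 10 on day 2,
# 15 on day 3, and 20 are still alive when scoring stops on day 4.
deaths = [1] * 5 + [2] * 10 + [3] * 15
censored = [4] * 20
curve = kaplan_meier(deaths, censored)
for day, fraction in curve:
    print(f"day {day}: {fraction:.2f} surviving")
```

Comparing two such curves for statistical significance requires an additional test, which is not sketched here.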
Motility and Biofilm Formation Assays-Motility and biofilm assays were performed at 25°C in enriched potato dextrose medium, conditions where we observed enhanced oocydin A production. Swimming assays were performed on 0.3% agar plates. Biofilm formation assays were carried out in 96-well microtiter plates with shaking at 75 rpm. Crystal violet staining and quantification method were performed as described previously (39).
LC-MS Studies-For the analysis of the bacterial supernatants, 25 ml of enriched potato dextrose broth was inoculated with the strains to be analyzed and incubated at 25°C. After 48 h of incubation, cells were pelleted by centrifugation, and the supernatant was filtered. Before the extraction, the pH of the culture supernatant was adjusted to 3.8 with citric acid. Six milliliters of the pH-adjusted supernatant were extracted twice with dichloromethane (6 ml). The organic layers were combined, dried over sodium sulfate, and evaporated under reduced pressure. The residue was resuspended in 1 ml of H2O/acetonitrile (1:1, v/v), and 10 μl was analyzed by LC-MS on a Finnigan MAT LCQ instrument using a Phenomenex Kinetex 2.6 μm XB-C18 100 Å column of size 100 × 2.1 mm eluted at 0.3 ml min⁻¹ with a linear gradient over 15 min from 95:5 to 5:95 of water to acetonitrile, each containing 0.1% formic acid.
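The linear gradient described above can be written as a simple interpolation. This sketch assumes an ideal gradient with no delay volume and holds the composition constant outside the 15-min window; the function name and default parameter names are illustrative.

```python
# Sketch: solvent composition during a linear water/acetonitrile gradient
# running from 95:5 to 5:95 over 15 min (values from the text).

def acetonitrile_percent(t_min, start=5.0, end=95.0, duration=15.0):
    """Percent acetonitrile at time t_min, clamped to the gradient window."""
    t = min(max(t_min, 0.0), duration)
    return start + (end - start) * t / duration

FLOW_ML_PER_MIN = 0.3
gradient_volume_ml = FLOW_ML_PER_MIN * 15.0  # eluent delivered over the gradient

for t in (0.0, 7.5, 15.0):
    print(f"{t:4.1f} min: {acetonitrile_percent(t):.0f}% acetonitrile")
```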

RESULTS AND DISCUSSION
Identification of the Oocydin A Gene Cluster-To identify the oocydin A gene cluster, the genome sequence of S. marcescens MSU97 was obtained using 454 pyrosequencing technology, automatically annotated, and analyzed. Based on the presence of NRPS- and PKS-encoding genes, at least five candidate biosynthetic gene clusters were identified in the 5.3-Mb MSU97 draft genome. Among them are gene clusters involved in the biosynthesis of secondary metabolites such as siderophores and the red pigmented linear tripyrrole, prodigiosin.
A number of studies have reported that genes encoding enzymes responsible for the halogenation of secondary metabolites are closely associated with particular NRPS and PKS gene clusters (40-44). Therefore, we hypothesized that the gene cluster involved in the biosynthesis of oocydin A was likely to encode a halogenase enzyme. During the in silico screening of the MSU97 genome, we identified a non-heme chloroperoxidase and a peroxidase that were not associated with any of the candidate biosynthetic gene clusters. However, two flavin-dependent monooxygenases were identified in a 77.5-kb gene cluster involved in the biosynthesis of an unknown polyketide. Flavin-dependent halogenases are the main class of enzyme involved in the halogenation of aliphatic compounds (15,45), so it seemed possible that this was the oocydin A gene cluster, although its complexity made it difficult to predict the structure of the polyketide synthesized.
In a first direct attempt to isolate the oocydin A gene cluster, we constructed a cosmid library of S. marcescens MSU97. Specific primers for five of the MSU97 biosynthetic gene clusters were designed, and screening of 2000 clones yielded 90 cosmid clones that hybridized positively to the probes. However, none of these cosmids could induce the production of oocydin A in E. coli, a host generally appropriate for the expression of cloned enterobacterial genes (46-48). Random mutagenesis was then employed as a second strategy to isolate S. marcescens MSU97 mutants with reduced antimicrobial activity toward the oomycete P. ultimum. However, the MSU97 strain proved to be recalcitrant to various genetic tools that we have previously used successfully for the genetic analysis of secondary metabolite production in Serratia and Erwinia (36,47,49). This genetic intractability, coupled with high intrinsic multidrug resistance, made a classical mutagenesis approach unfeasible.
Besides S. marcescens MSU97, we had access to another oocydin A-producing strain, S. plymuthica A153 (7). We postulated that both strains should carry the same gene cluster for oocydin A biosynthesis. S. plymuthica A153 is a genetically tractable strain and therefore susceptible to efficient transposon mutagenesis. With the aim of isolating oocydin A-defective mutants, a transposon mutant library was screened in dual culture plate bioassays for mutants defective in antimicrobial activity toward the oomycete P. ultimum. We identified six independent transposon insertion mutants showing complete loss of the bioactive properties against the oomycete (Fig. 1, E and F, and supplemental Fig. S1A). We also isolated a mutant (MMnO15) that showed reduced anti-Pythium activity (Fig. 1E). All the insertion mutations were transduced back into a S. plymuthica A153 wild type genetic background using a new phage (MAM1) that we have recently isolated, to ensure that the phenotypes were indeed associated with single insertions and to confirm the association between mutation and phenotype. The resulting transductants displayed the same phenotype as the original random mutants, confirming genetically that loss of bioactivity was caused by the transposon insertion. Random primed PCR and genome sequence analysis of S. plymuthica A153 confirmed that all the transposon insertions mapped to a gene cluster homologous to the 77.5-kb MSU97 gene cluster (supplemental Table S2) that we had proposed as the main candidate for the biosynthesis of oocydin A (Fig. 2A). This corroborated the hypothesis that this particular polyketide gene cluster is also carried by S. plymuthica A153 (Table 2). We then successfully inactivated, in A153, the two candidate halogenase-encoding genes present in the gene cluster (oocK and oocM) using allelic replacement through the insertion of a kanamycin cassette. The resulting mutants were incapable of inhibiting the growth of P. ultimum (Fig. 1F and supplemental Fig. S1A) as expected.
To confirm the identity of the oocydin A gene cluster and to corroborate its role in the biosynthesis of the macrolide, we analyzed by LC-MS the culture supernatants of S. marcescens MSU97 and S. plymuthica A153 wild type strains and identified a metabolite with the same molecular weight as oocydin A (Fig. 3, A and B). This metabolite showed the characteristic isotope pattern of a monochlorinated compound, with peaks at m/z 493 and 495 [M + Na]+ in a 3:1 ratio as a result of the two isotopes 35Cl and 37Cl (supplemental Fig. S2A). The MS/MS fragmentation of the [M + Na]+ ion at m/z 493 yielded two major ions, at m/z 433 [M + Na - AcOH]+ and 449 [M + Na - CO2]+ (supplemental Fig. S2, B and C). The same analysis showed that all the nonbioactive mutants displayed a complete loss of oocydin A production (Fig. 3F and supplemental Fig. S3, B-D), confirming the role of the gene cluster in the biosynthesis of the halogenated macrolide. No dechloro analogs of oocydin A were detected in the extracts of the oocK or oocM mutants, which could formally be due to polar effects of the kanamycin cassette insertion in the respective genes. The analysis of the MMnO15 extracts showed that oocydin A was produced at much lower levels (3-5%) than in the wild type strain A153 (Fig. 3E). This result suggests that other trans-acting ATs and ERs encoded in the genome of A153 could be catalyzing the required reactions, although less efficiently than OocU, -V, and -W. Taking all these data into account, we have named the genes of this cluster ooc, for oocydin A biosynthetic genes.
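The mass spectral arguments above rest on simple arithmetic, which can be reproduced with the following sketch. It works with nominal (integer) masses only, uses the standard natural abundances of 35Cl and 37Cl (about 75.8 and 24.2%), and ignores the small 13C and 18O contributions to the M+2 peak; the variable names are illustrative.

```python
# Sketch: chlorine isotope pattern and neutral losses for the
# [M + Na]+ ion at m/z 493 (nominal masses only).

NOMINAL_MASS = {"C": 12, "H": 1, "O": 16}
CL_ABUNDANCE = {35: 0.7576, 37: 0.2424}  # standard natural abundances

def nominal_mass(formula):
    """Nominal mass of a formula given as {element: count}."""
    return sum(NOMINAL_MASS[el] * n for el, n in formula.items())

ACETIC_ACID = {"C": 2, "H": 4, "O": 2}   # AcOH, 60 Da
CARBON_DIOXIDE = {"C": 1, "O": 2}        # CO2, 44 Da

parent = 493                              # 35Cl isotopologue, from the text
m_plus_2 = parent + 2                     # 37Cl isotopologue
fragments = {
    "loss of AcOH": parent - nominal_mass(ACETIC_ACID),
    "loss of CO2": parent - nominal_mass(CARBON_DIOXIDE),
}
isotope_ratio = CL_ABUNDANCE[35] / CL_ABUNDANCE[37]

print(f"M and M+2 peaks: {parent} and {m_plus_2}, ratio about {isotope_ratio:.1f}:1")
print(fragments)
```

The calculated fragments (m/z 433 and 449) and the roughly 3:1 peak ratio agree with the values reported for the metabolite.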
Oocydin A Gene Cluster Is Present in Several Plant-associated Enterobacteria-In addition to S. marcescens MSU97 and S. plymuthica A153, genome comparison analysis revealed that the ooc gene cluster is also present in the rhizobacterium, S. odorifera 4Rx13 (24) (GenBank™ accession no. ADBX01000002.1; genomic coordinates 102562-180808), and in the phytopathogenic bacterium D. dadantii Ech703 (GenBank™ accession no. NC_012880; genomic coordinates 1624970-1705054). Chemical analysis of 4Rx13 and Ech703 by LC-MS showed that they also produced oocydin A (Fig. 3, C and D, and supplemental Fig. S2, D and E), and, as expected, they show very clear anti-Pythium properties (Fig. 1, C and D). Interestingly, the genomic context of the ooc gene clusters in S. marcescens MSU97, S. plymuthica A153, and D. dadantii Ech703 is completely different. Therefore, the upstream and downstream ends of the different loci were assigned based on the homologies between the three biosynthetic clusters. The ooc gene clusters of S. plymuthica A153 and S. odorifera 4Rx13 show the highest homology (94.8% at DNA level; supplemental Table S1), and they also share the same genomic context. Therefore, these data suggest that the biosynthetic cluster could have been transferred horizontally between different genera and species of bacteria. In fact, although the G + C content of the ooc gene clusters of MSU97 (57.8%), A153 (56.6%), 4Rx13 (56.0%), and Ech703 (52.3%) is similar to the genomic G + C content in the chromosome, sequences reminiscent of mobile genetic elements, such as a bacteriophage P4 integrase, border the ooc gene cluster in D. dadantii Ech703.
The oocydin A gene clusters of A153, 4Rx13, and Ech703 are 78.2, 78.3, and 80.1 kb, respectively, and they are 81.7, 81.8, and 70.9% identical, respectively, at the DNA level to the 77.5-kb cluster of MSU97 (Fig. 4 and supplemental Table S1). As expected from the high nucleotide homology, the PKS proteins encoded by the four ooc gene clusters all share the same domains and domain organization. Furthermore, on average, the proteins encoded by the S. marcescens MSU97 ooc gene cluster are 87.1, 87.3, and 78.6% identical to those of A153, 4Rx13, and Ech703, respectively. However, there are some interesting differences between them. First, the initial gene of the cluster in MSU97, A153, and 4Rx13 (encoding a putative alpha/beta hydrolase) is not present in Ech703 (supplemental Table S4). Second, OocN and OocS of the MSU97, A153, and Ech703 ooc gene clusters are divided into three and two genes, respectively, in S. odorifera 4Rx13 (supplemental Table S3), although it is formally possible that the difference might be due to an error in genome sequencing. In accordance with this notion, domains that are encoded by two different proteins in 4Rx13 were found contiguous within the same PKS protein in other bacteria (supplemental Table S3). Finally, the region containing the putative small genes, oocH and oocI, of MSU97 is poorly conserved between the strains, showing 49.4, 46.5, and 41.7% identity at the DNA level compared with A153, 4Rx13, and Ech703, respectively. Because this is a region of divergence of two transcriptional units (Fig. 2A), it is reasonable to hypothesize that this region has an important regulatory role(s) in the expression of this complex gene cluster. Interestingly, the G + C content of this intergenic region in MSU97, A153, 4Rx13, and Ech703, which is 42.6, 41.7, 42.1, and 37.1%, respectively, is significantly lower than the mean G + C content of the gene clusters.
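The cluster sizes and G + C values discussed above follow from two elementary calculations, sketched here. The span calculation assumes that the published genomic coordinates are 1-based and inclusive; the short test sequence is invented for illustration, and the function names are hypothetical.

```python
# Sketch: locus length from genomic coordinates and G + C content.
# Assumes 1-based, inclusive coordinates as used in GenBank records.

def span_kb(start, end):
    """Length of a locus in kb from inclusive start/end coordinates."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return (end - start + 1) / 1000.0

def gc_percent(sequence):
    """Percentage of G and C bases in a DNA sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(seq.count(base) for base in "GC") / len(seq)

# Coordinates quoted in the text for the two clusters.
ooc_4rx13_kb = span_kb(102562, 180808)
ooc_ech703_kb = span_kb(1624970, 1705054)
print(f"4Rx13: {ooc_4rx13_kb:.1f} kb, Ech703: {ooc_ech703_kb:.1f} kb")

# Hypothetical sequence, used only to exercise gc_percent.
print(f"G + C of test sequence: {gc_percent('ATGCGCAT'):.1f}%")
```

The spans obtained this way are close to the cluster sizes quoted in the text; small differences can arise from how the cluster boundaries were assigned.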
OocB, a putative transporter present in the gene cluster of MSU97, A153, and Ech703 (supplemental Table S3), was not present in the GenBank™ draft annotation of the 4Rx13 cluster, but the reannotation of the cluster showed that it is, in fact, present.

NOVEMBER 9, 2012 • VOLUME 287 • NUMBER 46

Identification of Oocydin A Biosynthetic Gene Clusters
Bioinformatic Analysis of the ooc Biosynthetic Gene Clusters-As described above, the biosynthetic gene cluster of oocydin A spans between 77 and 80 kb and consists of 23 ORFs. The likely function for each gene product was deduced by sequence comparison with proteins of known function, and 16 of the proteins can be assigned possible roles in oocydin A biosynthesis (Table 2 and supplemental Tables S2-S4).
The oocydin A biosynthetic cluster encodes five multifunctional PKS enzymes (OocJ, OocL, OocN, OocR, and OocS), which include a total of 16 β-KS, 7 KR, 20 ACP, 5 DH, 2 enoyl-CoA hydratase (EH), and 2 methyltransferase (MT) domains, a TE, and an NRPS condensation (C) domain (Fig. 2B, Table 2, and supplemental Tables S2-S4). One of the first observations derived from the analysis was the absence of integrated AT domains in all the biosynthetic modules of the five PKS proteins encoded by the ooc gene cluster. This places the oocydin A PKS in the growing class of PKSs that use trans-acting AT domains. In these enzymes, each module receives its acyl building block from one or more discrete ATs encoded by separate genes (50). The ooc gene cluster encodes two such proteins, OocV and OocW, containing two and one AT domains, respectively, with all three of the domains containing the catalytic Ser-His dyad and the highly conserved N-terminal GQGSQ loop (supplemental Fig. S4) (51,52). In accordance with the trans-AT nature of the ooc gene cluster, analysis of its interdomain sequences identified regions with homology to truncated AT domains following most of the KS domains. Recently, it has been suggested that these regions are the binding sites for the separate AT domains in trans-AT PKSs (53). Two or more free-standing ATs are infrequently found in trans-AT PKSs, and to our knowledge, only nine examples have been described as follows: bacillaene (54); elansolid (55); etnangien (56); kirromycin (57); mupirocin (58); pederin (59); rhizopodin (60); rhizoxin (61); and sorangicin (62). Multiple sequence alignments of OocV and OocW against known trans-ATs revealed that OocV-AT1 and OocW share all the residues characteristic of a malonyl-CoA-specific AT, whereas OocV-AT2 possesses more active site residue diversity (supplemental Fig. S4) (51,52,62).
It has been suggested that this residue diversity may be associated with a broader range of substrate specificity or specialized functions (52,62-64). BLAST analyses and multiple sequence alignments also revealed that OocV-AT1 and OocV-AT2 are most similar to BryP-AT1 and BryP-AT2, respectively (Table 2 and supplemental Tables S2-S4; supplemental Fig. S4), which have been grouped in two different classes of trans-ATs (64). Lopanik et al. (64) showed that both BryP-AT1 and BryP-AT2 prefer to transfer malonyl-CoA, although BryP-AT1 is more active than BryP-AT2. Multiple alignment analyses (supplemental Fig. S4) also grouped OocV-AT2 with the stand-alone acyltransferase PedC. A recent study showed that PedC does not exhibit AT activity but possesses acyl hydrolase activity and cleaves acyl groups bound to ACP, suggesting a role in PKS biosynthetic proofreading (65).
The number of domains in each of the five Ooc PKS proteins varies between 5 (OocR) and 18 (OocN). Interestingly, the number of KS (16 as domains within the modular proteins and one as a separate ORF) and ACP (22 domains, two of them present in separate ORFs) domains exceeds the number needed for the biosynthesis of oocydin A, which would be 8 and 9 domains, respectively. Presumably, some of the KS domains are either inactive and skipped or are simply used for transferring the acyl intermediate to the next ACP domain without extending it. In accordance with this observation, the catalytic CHH triad present in active KS domains is absent in KS1 (SHH), KS5 (CQH), KS10a (CEH), and OocF (SHH) (supplemental Figs. S5-S8; the numbering of domains refers to the module number as shown in Fig. 2B) (66). All seven potential KR domains identified showed the characteristic Rossmann fold for cofactor binding. The SYN conserved triad located in the active site (68) was modified in KR4 (SYA) and KR11 (S(Y/S)(A/I)) (supplemental Figs. S5-S8). Module 3 contains a region of 55 amino acids, which shows homology with the Rossmann fold of NAD(P)-binding proteins, but it lacks the SYN conserved triad, which indicates that it does not function as a ketoreductase. Of the five predicted DH domains found in the PKS, three (DH7a, DH8, and DH10) contain the HXXXGXXXXP motif characteristic of DH domains (supplemental Figs. S5-S8) (69). Module 1 contains two EH domains showing altered conserved motifs essential for the oxyanion hole that stabilizes the enolate anions (supplemental Fig. S9) (70). However, the "oxyanion hole" consensus motifs are present in OocC and OocD (supplemental Fig. S9). A TE and a C domain containing the GXSXG (71) and the HHXXXDG (72) conserved motifs, respectively, were identified in the last multimodular PKS of the cluster.
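Contiguous sequence motifs of the kind cited above can be located with regular expressions, where X stands for any residue. The sketch below is one possible way to screen a protein sequence and is not the procedure used in this study, which relied on multiple sequence alignments. The motif strings are those given in the text (including the GQGSQ loop of the ATs and the LEXGXGXG motif of MT domains); the dictionary keys, the function name, and the test peptide are hypothetical. Catalytic triads such as CHH or SYN are formed by non-adjacent residues and cannot be found this way.

```python
# Sketch: locating contiguous conserved motifs with regular expressions.
# "." matches any residue (the X of the consensus motif).
import re

MOTIFS = {
    "DH": "H...G....P",   # HXXXGXXXXP
    "TE": "G.S.G",        # GXSXG
    "C": "HH...DG",       # HHXXXDG
    "MT": "LE.G.G.G",     # LEXGXGXG
    "AT": "GQGSQ",        # conserved N-terminal loop
}

def find_motifs(sequence):
    """Return {motif name: list of 0-based start positions}."""
    seq = sequence.upper()
    hits = {}
    for name, pattern in MOTIFS.items():
        # A lookahead is used so that overlapping matches are reported.
        hits[name] = [m.start() for m in re.finditer(f"(?={pattern})", seq)]
    return hits

# Invented peptide containing one DH-like and one TE-like motif.
test_peptide = "MKLHAAAGTTTTPWGYSLGQ"
hits = find_motifs(test_peptide)
print(hits)
```

A domain lacking its motif, as described for several Ooc domains, would simply return an empty list for that entry.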
Finally, at least three putative tailoring enzymes are encoded by genes in the cluster, and they probably catalyze oxidative/reductive transformations (OocK, OocM, and OocU). The cluster also encodes a protein (OocB) 40% identical (60% similar) to the putative efflux protein RRSL_03865 present in Ralstonia solanacearum UW551 and in other genera such as Enterobacter, Photorhabdus, and Burkholderia. This protein might be involved in the efflux of the bioactive molecule to the environment. Perhaps surprisingly, no genes encoding obvious candidate regulatory proteins were found in the ooc gene cluster.
Model for the Biosynthesis of Oocydin A-It has been observed that trans-AT systems possess numerous functional peculiarities as compared with cis-AT PKSs, in which the AT domains are located within the multidomain PKS. These peculiarities are often translated into unusual chemistry and include the loss of co-linearity found in canonical PKSs, lack of expected domains, unusual domain orders and domain sets, and the splitting of modules between two PKS proteins. Additionally, PKS modules can be "skipped" or used more than once during the biosynthetic pathway (16,50). These unusual characteristics make it difficult to establish an obvious predictive correlation between the polyketide structure and the architecture of the trans-AT PKS modules. However, the substrate specificity of KS domains of trans-AT PKSs has been correlated with their sequence, and they have been found to be most similar to other KS domains that process similar substrates (50,55). Thus, based on an extensive in silico analysis, we propose a mechanism for the biosynthesis of oocydin A as shown in Fig. 2B. In this model, functional KS domains have been compared with the nearest characterized homologs to determine what acyl groups could be accepted by the respective domains. As has been observed in other trans-AT PKSs, the oocydin A PKS deviates from the textbook PKS model architecture in several features.
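The idea of assigning a KS domain the substrate of its nearest characterized homolog can be sketched as follows. This is a deliberately simplified stand-in that ranks pre-aligned reference sequences by percent identity; the sequences, reference names, and substrate labels are hypothetical, and the comparison method actually used in the study may differ.

```python
from typing import Dict, Tuple


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a pairwise alignment ('-' marks a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Columns where both sequences carry a gap hold no information.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def nearest_homolog(query: str,
                    references: Dict[str, Tuple[str, str]]) -> Tuple[str, str, float]:
    """Return (reference name, its substrate label, percent identity).

    `references` maps a name to (aligned sequence, substrate label).
    """
    best = max(references,
               key=lambda name: percent_identity(query, references[name][0]))
    aligned_seq, substrate = references[best]
    return best, substrate, percent_identity(query, aligned_seq)


if __name__ == "__main__":
    # Hypothetical toy data, not real KS sequences.
    refs = {
        "refA": ("ACDEFGHIKV", "beta-hydroxy acyl"),
        "refB": ("ACQQFGHQQL", "alpha,beta-unsaturated acyl"),
    }
    print(nearest_homolog("ACDEFGHIKL", refs))
```

A prediction of this kind is only as good as the reference set, so a low best-hit identity should be read as "no confident assignment" rather than as a substrate call.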
The structure of oocydin A shows carboxyl groups at both ends of the linear chain of the polyketide, so its biosynthesis could, in principle, be started at either end. However, the organization of the domains makes it much more likely that the starter unit is C-17 to C-19. The predicted DH-MT-FkbH domains in the Load module are uncommon but have some similarity to the DH-KR-FkbH Load module of BryA, involved in the biosynthesis of bryostatin, which functions to load a lactyl starter unit onto an ACP domain (73). However, the structure of oocydin A does not require any such starter unit; the expected starter unit is a malonyl unit instead of the glyceric acid unit normally loaded by the FkbH domain (74). Therefore, we propose that the first three domains (DH, MT, and FkbH) might be nonfunctional relics in the process of evolutionary degeneration. In accordance with this, the DH and MT domains present in this unusual DH-MT-FkbH module lack the characteristic HXXXGXXXXP and LEXGXGXG conserved motifs, respectively. Furthermore, KS1 has a serine in place of the active site cysteine and so is probably incapable of chain extension. Thus, the proposed biosynthesis would begin with the transfer of an activated malonyl starter unit from malonyl-CoA to the ACP1 domain of OocJ by one of the proposed trans-ATs. It is possible that KS1 catalyzes the decarboxylation of this malonyl group to an acetyl group, but it is equally likely that this carboxyl is retained, and it is the acetic acid unit introduced by the 3-hydroxy-3-methylglutaryl-CoA synthase (HCS) cassette (see below) that is decarboxylated. KS2 would then catalyze the first round of chain extension to give a β-keto-thioester attached to ACP2.
The methyl group at C-17 is attached to a carbon that would initially have been a carbonyl group. Such methyl groups are usually added by HCS cassettes that consist of stand-alone HCS, ketosynthase, and ACP proteins (63,73,75). Importantly, these three proteins are contiguous in the cluster (OocE-G). In this mechanism for generating a methyl group, two dehydratase (DH)-like EH domains are also required. Two genes for stand-alone EH proteins (OocC and OocD) that complete the operon encoding the HCS cassette are present in the ooc gene cluster. The similarity between the EH domains of OocD and OocC is only 19.1% (supplemental Fig. S9), and it has been hypothesized that such EH domains have different functional roles, catalyzing, respectively, the dehydration and decarboxylation required to produce the methyl group (70). Two side-by-side EH domains are also present in module 1 of OocJ. The integration of EHs within a PKS module is unusual, although it has been described in the gene clusters of the related PKSs that make pederin (59) and onnamide A (76) as examples of the biosynthetic versatility of PKSs. The fact that the motifs essential for the oxyanion hole are poorly conserved in EH1a and EH1b and that a mutant defective in the EH OocC (MMnO14) does not produce oocydin A suggests that OocC and OocD are responsible for the required dehydration and decarboxylation. Thus, the HCS cassette (or alternatively the DH-like domains in module 1) would function at module 2 by adding a methyl group at C-17 (or adding -CH2CO2H if decarboxylation of the starter unit does occur). The domain structures of the other four PKSs (OocL, OocN, OocR, and OocS) do not show clearly the order in which they act. However, the structure of oocydin A suggests that they act in the same order as the genetic organization of their cognate genes, which is normally the case.
Thus, the subsequent biosynthetic steps can be largely rationalized by the co-linearity rule, with several deviations because some modules have mutated catalytic domains and several modules lack domains that would be predicted from the oocydin A structure. Module 3 appears to contain a fully active KS domain but only has a fragment of a KR domain, which lacks the active site groups and cannot be functional. As oocydin A has no keto groups that have not been reduced, this suggests that this module may be skipped. Module skipping has been described in other trans-AT PKSs, such as those producing difficidin (54), myxovirescin A (77), leinamycin (78), and chivosazol (79). Following the assembly line, KS5 at the N-terminal end of OocN has a mutated CHH motif, so it probably does not elongate the chain. However, it may be involved in transferring the chain to the next module without elongating it.
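The reasoning applied to modules 1, 3, and 5 can be written as a small decision rule. The sketch below is a simplification for illustration only: the class, field names, and role labels are hypothetical, the CHH triad assigned to module 3 is an assumption based on the statement that its KS appears fully active, and the rule ignores cases in which another module or a trans-acting enzyme supplies the missing activity.

```python
from dataclasses import dataclass
from typing import Optional

CANONICAL_KS_TRIAD = "CHH"


@dataclass
class Module:
    name: str
    ks_triad: str
    # True/False when the text comments on the KR region; None when it does not.
    kr_functional: Optional[bool] = None


def predicted_role(module: Module) -> str:
    """Assign a tentative role from the KS triad and the KR status."""
    if module.ks_triad.upper() != CANONICAL_KS_TRIAD:
        # A KS without the catalytic cysteine or histidines is unlikely
        # to elongate, but it may still pass the chain along.
        return "transfer-only"
    if module.kr_functional is False:
        # Elongation here would leave a keto group that the product
        # does not contain, so the module is a candidate for skipping.
        return "skip-candidate"
    return "elongating"


# Triads for modules 1 and 5 are quoted in the text; the module 3 triad is assumed.
MODULES = [
    Module("module 1", "SHH"),
    Module("module 3", "CHH", kr_functional=False),
    Module("module 5", "CQH"),
]

if __name__ == "__main__":
    for m in MODULES:
        print(m.name, "->", predicted_role(m))
```

Such labels are hypotheses to be tested, for example by site-directed mutagenesis of the corresponding domains, not conclusions.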
Further along OocN, KS7 is predicted, by homology with other PKS domains, to accept the (S)-β-hydroxy-acyl group that is produced by module 6. Module 7 has two DH-like domains. In other PKSs, such as the sorangicin one (62), the extra DH domain is proposed to catalyze a cyclization in which a hydroxyl group attacks the β-position of an α,β-unsaturated thioester in a Michael addition mechanism (interestingly, many domains in the ooc PKS show high similarity to equivalent domains in the sor PKS). In accordance with this model, KS8 has homology to KSs that accept acyl groups where a cyclization has occurred. In the case of oocydin A, this cyclization needs to be from a hydroxyl at C-14, as shown in Fig. 2B. It therefore appears that hydroxylation at C-14 occurs while the acyl chain is incomplete and still attached to the PKS (most likely while it is attached to module 4, before reduction of the keto group at C-15, so that deprotonation of C-14 is facilitated). The enzyme responsible for hydroxylation of C-14 is probably one of the two flavin-dependent monooxygenases OocK and OocM, the genes for which are located between PKS-encoding genes, immediately before and after oocL. Because PKS genes are very often contiguous in a cluster, this location of oocK and oocM may suggest that the encoded enzymes act on substrates that are still bound to one of the PKS proteins.
Given that the fragment of a KR domain in module 3 lacks the catalytic triad, the number of active KR domains identified in the ooc biosynthetic cluster is insufficient for the biosynthesis of the macrolide. Therefore, one of the modules is probably used twice, catalyzing two rounds of elongation. This unusual feature has mainly been reported in trans-AT systems, such as rhizopodin (60), lankacidin (53), and oxazolomycin (50,80). We propose that module 8 would be used twice, but another possibility is that KS9 is in fact active in chain elongation but then passes the β-keto acyl group back to module 8 for processing by its KR and DH domains (which would be an example of nonlinearity often found in trans-AT PKSs).
At this point in assembly, an ER domain would be needed, but OocN (like all of the other four PKS proteins) lacks this type of domain. However, OocU has high similarity to proteins found in PKS clusters that have been shown to be trans-acting flavin-dependent ERs, such as PksE from B. subtilis (81) or BatK from Pseudomonas fluorescens (66). Therefore, we postulate that this ER might reduce the α,β,γ,δ-conjugated diene attached to module 8, as shown in Fig. 2B, leaving the remaining double bond in the β,γ-position.
Further along the PKS production line, module 11, which spans OocR and OocS, contains an MT domain that we suggest adds the methyl group present at C-4. The structure of oocydin A also requires a DH to act at this stage to generate the C-4/C-5 double bond. There is no DH in module 11, but there is one in module 10, so perhaps the β-hydroxyacyl group from module 11 gets passed back for dehydration by that domain. Alternatively, a trans-acting DH may effect the dehydration.
The TE and the NRPS C domains of OocS show high similarity to the domains present in the PKS proteins BryX and BryD, respectively, involved in the biosynthesis of bryostatin (73). Interestingly, the last PKS proteins involved in the chain extension of oocydin A and bryostatin both terminate with these C domains. It is not known whether it is the TE or the C domain that catalyzes the macrolactonization in bryostatin biosynthesis. By analogy, we propose that either the TE or the NRPS C domain in OocS is the likely candidate for the macrolactonization that releases the acyl group from the PKS. After the release of the polyketide from the PKS, the final acetylation (shown in Fig. 2B) could be catalyzed by a separate acyltransferase. However, it is also possible that one of the KS14, KS15, TE, or C domains of OocS catalyzes this acetylation. KSs, TEs, and C domains all catalyze acyl transfers of some kind, so in principle, O-acylation would be only a small change of function.
Finally, the chlorine atom at C-8 should be introduced by a chlorinase. Chlorinating enzymes have been classified into highly specific halogenases requiring dioxygen and either a reduced flavin or α-ketoglutarate as co-substrates and less specific haloperoxidases that use hydrogen peroxide, often P450 enzymes (15,45). Because there are no predicted P450 enzymes in the ooc gene cluster, the main candidates for the halogenase are the flavin-dependent monooxygenases OocK and OocM. In agreement with this, the mechanism of action of flavin-containing halogenases is just a slight modification of that of monooxygenases (82), and crystallization of some of these halogenases has revealed a flavin monooxygenase domain in most of them (83). The chlorination could be a post-PKS biosynthesis step (Fig. 2B) or it could happen on a PKS-bound intermediate, perhaps in module 7 at the β-keto-thioester stage, when C-8 is easy to deprotonate.
Oocydin A Cluster Is Formed by Three Transcriptional Units-The genetic organization of the oocydin A cluster suggests the presence of at least three transcriptional units. To further investigate this hypothesis, transcript analysis by RT-PCR was performed on cultures of S. marcescens MSU97. To select the conditions where the oocydin A genes were being expressed, MSU97 culture samples were taken at different points along the growth curve. Filter-sterilized supernatants from these samples were added to holes punched in P. ultimum bioassay plates. Oomycete growth inhibition was observed in the supernatants after 12 h of growth in enriched potato dextrose medium (5), and these growth conditions were therefore used for the RT-PCR studies. For this analysis, primers were designed to cover the region between the 3′ end of the upstream gene and the 5′ end of the contiguous downstream gene (Fig. 5A). Two smaller transcriptional units, one consisting of oocA and oocB and a second consisting of oocG, oocF, oocE, oocD, and oocC, were identified. In addition, RT-PCR products were detected across all the intergenic regions containing oocJ-W, indicating the existence of a large polycistronic transcript (Fig. 5B). Unexpectedly, prominent PCR products were also obtained for the region covering the genes oocG-I (Fig. 5B). This might indicate the presence of divergent overlapping messenger RNAs in this region, consistent with the proposed regulatory role of this locus in the ooc cluster. Our analysis confirmed the presence of the three predicted transcriptional units in the oocydin A cluster, although the absence of additional internal promoters cannot be assumed.
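The logic of inferring transcriptional units from junction-spanning RT-PCR products can be sketched in a few lines of Python. The gene names, strands, and results below are hypothetical toy data, not the ooc results, and the rule used (adjacent genes are joined only when they lie on the same strand and their junction gave a product) is an assumed simplification.

```python
from typing import FrozenSet, List, Set, Tuple

Gene = Tuple[str, str]  # (name, strand), strand is "+" or "-"


def group_transcriptional_units(genes: List[Gene],
                                positive: Set[FrozenSet[str]]) -> List[List[str]]:
    """Join adjacent same-strand genes whose junction gave an RT-PCR product."""
    if not genes:
        return []
    units: List[List[str]] = []
    current = [genes[0][0]]
    for (prev, prev_strand), (name, strand) in zip(genes, genes[1:]):
        if prev_strand == strand and frozenset((prev, name)) in positive:
            current.append(name)
        else:
            units.append(current)
            current = [name]
    units.append(current)
    return units


def unexpected_junctions(genes: List[Gene],
                         positive: Set[FrozenSet[str]]) -> List[Tuple[str, str]]:
    """List positive junctions between adjacent genes on opposite strands.

    Such products cannot come from one co-directional transcript and may
    point to overlapping or read-through RNAs that need a closer look.
    """
    return [(prev, name)
            for (prev, prev_strand), (name, strand) in zip(genes, genes[1:])
            if prev_strand != strand and frozenset((prev, name)) in positive]


if __name__ == "__main__":
    # Hypothetical locus in chromosomal order.
    genes = [("g1", "+"), ("g2", "+"), ("g3", "-"), ("g4", "-"),
             ("g5", "+"), ("g6", "+"), ("g7", "+")]
    positive = {frozenset(p) for p in
                [("g1", "g2"), ("g3", "g4"), ("g4", "g5"),
                 ("g5", "g6"), ("g6", "g7")]}
    print(group_transcriptional_units(genes, positive))
    print(unexpected_junctions(genes, positive))
```

A positive junction shows only that some transcript spans it. As noted above, it does not exclude additional internal promoters within a unit.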
Biological Properties of Oocydin A-Oocydin A has been shown to be very active against plant pathogenic oomycetes belonging to the Pythium and Phytophthora genera (5). However, the activity of oocydin A against plant pathogenic fungi such as S. sclerotiorum was controversial because two different studies reported contradictory results (5, 7). These apparently contradictory findings could perhaps be explained by the concentration of bioactive molecules present in each bioassay because oocydin A had fungistatic or fungicidal effects, depending on the concentration (7). With the aim of determining the biological roles of oocydin A, we characterized phenotypically the strain S. plymuthica A153 and its non-oocydin A-producing mutants. The other oocydin A producers, S. marcescens MSU97, S. odorifera 4Rx13, and D. dadantii Ech703, were also analyzed. Perhaps surprisingly, given the number of enzymes that it encodes, inactivation of the large ooc gene cluster had no detectable effect on the growth rate. Neither was there any detectable impact on motility or biofilm formation of S. plymuthica A153 (data not shown). Besides the very high bioactivity that oocydin A shows against the oomycete P. ultimum, we also tested its bioactive properties against fungal plant pathogens. In contrast to the report by Strobel et al. (5), we observed that the halogenated molecule was also active against the fungus V. dahliae, because all the producing strains showed bioactivity against the fungus, whereas the A153 mutants defective in the ooc gene cluster had lost the anti-Verticillium activity (Fig. 1H and supplemental Fig. S1B). Interestingly, as we had observed previously in the P. ultimum inhibition assays, MMnO15 also showed reduced bioactivity against V. dahliae (Fig. 1H). In our study we have also verified that oocydin A inhibits the growth of other fungal plant pathogens, including A. solani, F. oxysporum, and, to a lesser degree, T. cucumeris (supplemental Fig. S1, C-E). However, oocydin A was not active against the ascomycetes Saccharomyces cerevisiae and Schizosaccharomyces pombe (data not shown).
S. plymuthica A153 and S. marcescens MSU97 also show antibacterial activity against Gram-positive and Gram-negative bacteria (supplemental Fig. S10). However, the non-oocydin A-producing mutants of A153 showed the same antibacterial properties (supplemental Fig. S11), whereas these activities were not detected in the other oocydin A producers Ech703 and 4Rx13 (supplemental Fig. S10), showing that, in accordance with previous observations (8), oocydin A is not responsible for the antibacterial properties.
A similar result was observed with C. elegans. The nematode C. elegans has been used previously as a model system for the in vivo identification of bacterial virulence factors (84). A153 and MSU97 showed marked virulence in C. elegans assays (supplemental Fig. S12). However, the non-oocydin A-producing mutants killed C. elegans at the same rate as the wild type strain (supplemental Fig. S13), whereas 4Rx13 and Ech703 showed reduced, or no, virulence against the nematode (supplemental Fig. S12), implying that oocydin A is unlikely to be responsible for the virulence observed in MSU97 and A153 (supplemental Fig. S12).
Concluding Remarks-Oocydin A is an antifungal and anti-oomycete halogenated macrolide that has also been shown to possess antitumor properties. Therefore, this molecule, or analogs, may have considerable utility and promise for crop disease biocontrol and/or in medical chemotherapy (5,6,9,11). In this work, we describe for the first time the identification of the ooc biosynthetic gene cluster in four different strains of plant-associated enterobacteria, including bacteria from the Serratia and Dickeya genera. The ooc biosynthetic gene cluster consists of 23 genes (or 22 genes in D. dadantii Ech703) organized in three transcriptional units. The analysis of the domains and module organization in the PKS proteins has revealed that these oocydin A biosynthetic enzymes belong to the subclass of trans-AT PKSs. In accordance with this, uncommon domain orders and splits of modules between PKS proteins were found in the ooc cluster products, similar to other trans-AT PKSs involved in biosynthesis of bryostatin (73), rhizopodin (60), rhizoxin (61), and sorangicin (62). An HCS cassette, required to modify the oocydin A PKS-bound intermediate, was also identified in the ooc gene cluster.
The broad spectrum of biological activities of oocydin A has made it an attractive compound for chemical studies. Although its chemical synthesis has been accomplished, the best current synthesis remains inefficient, with an overall yield of only 1.3% (9). Thus, our definition of the genes responsible for the biosynthesis of oocydin A and our proposed biosynthetic model now provide an excellent opportunity to investigate the enzymatic mechanisms involved and the nature of the biosynthetic machinery. By a combination of molecular genetics and (bio)chemistry, we can now test the model for biosynthesis and confirm the assembly pathway. Furthermore, given the distribution of the ooc cluster in different enterobacterial genera and in nonconserved genetic contexts, we are now in a position to investigate how the production of this curious bioactive molecule is regulated. Future research will provide information that enables knowledge-based strategies for enhancing productivity and for generating novel analog haterumalides by synthetic biology methods. Such novel analogs might have important agricultural, pharmacological, and chemotherapeutic applications.