Structure and Catalytic Mechanism of the Thioesterase CalE7 in Enediyne Biosynthesis

The biosynthesis of the enediyne moiety of the antitumor natural product calicheamicin involves an iterative polyketide synthase (CalE8) and other ancillary enzymes. In the proposed mechanism for the early stage of 10-membered enediyne biosynthesis, CalE8 produces a carbonyl-conjugated polyene with the assistance of a putative thioesterase (CalE7). We have determined the x-ray crystal structure of CalE7 and found that the subunit adopts a hotdog fold with an elongated and kinked substrate-binding channel embedded between two subunits. The 1.75-Å crystal structure revealed that CalE7 does not contain the critical catalytic residue (Glu or Asp) conserved in other hotdog fold thioesterases. Based on biochemical and site-directed mutagenesis studies, we proposed a catalytic mechanism in which the conserved Arg37 plays a crucial role in the hydrolysis of the thioester bond, and in which Tyr29 and a hydrogen-bonded water network assist the decarboxylation of the β-ketocarboxylic acid intermediate. Moreover, computational docking suggested that the substrate-binding channel binds a polyene substrate that contains a single cis double bond at the C4/C5 position, raising the possibility that the C4=C5 double bond in the enediyne moiety could be generated by the iterative polyketide synthase. Together, the results revealed a hotdog fold thioesterase distinct from the common type I and type II thioesterases associated with polyketide biosynthesis and provided insight into the enediyne biosynthetic mechanism.

core, enediyne natural products are categorized into two groups with either a 9- or 10-membered enediyne moiety (1,2). The antitumor activity of enediyne natural products derives from their capacity to induce chromosomal DNA cleavage through an oxidative radical mechanism (3). The biosynthetic mechanism for the enediyne moiety has, however, remained elusive despite clues gleaned from early isotope-feeding experiments (4,5). Pioneering genetic studies of the biosynthesis of calicheamicin and C-1027 from two research groups yielded major insights into the biosynthetic pathways, suggesting that an iterative polyketide synthase (PKS) plays a central role in the assembly of both the 9- and 10-membered enediyne moieties (6,7). The gene clusters also contain open reading frames encoding hypothetical proteins for the downstream processing of the PKS product. The involvement of similar genes in enediyne biosynthesis was later confirmed for neocarzinostatin, maduropeptin, dynemicin, and several putative enediyne natural products in soil and marine microorganisms (8-11).
Polyketide and non-ribosomal peptide synthesis generally involves the so-called type I and type II thioesterases for the release of the final product or the removal of aberrant products. Type I thioesterases (TE I) are cis-acting domains fused to the C terminus of the most downstream module of a PKS or non-ribosomal peptide synthase for the release and cyclization of the final product (15,16). By contrast, type II thioesterases (TE II) are discrete proteins responsible for the trans hydrolytic release of aberrant products (17-19). TE II proteins are structurally and evolutionarily related to a family of well known α/β hydrolases that contain 240-260 residues (20). A common serine esterase motif GXSXG and another downstream motif GXH are conserved in TE II proteins (21,22). The stand-alone 146-amino acid CalE7 does not belong to the TE II family, because it is neither an α/β fold hydrolase nor a protein containing the two conserved motifs for TE II. Instead, CalE7 shares moderate sequence homology with a family of hotdog fold proteins characterized by a long central α-helix packed against a five-stranded anti-parallel β-sheet. Such hotdog fold proteins include many characterized and hypothetical thioesterases that use acyl-CoA as substrates (23). The three-dimensional structure and substrate specificity of several hotdog fold thioesterases have been determined, including YbgC from Helicobacter pylori (24), PaaI from E. coli (25), HB8 from Thermus thermophilus (26), FcoT from Mycobacterium tuberculosis (27), YciA from Haemophilus influenzae (28), human THEM2 (25), and 4-hydroxybenzoyl-CoA thioesterases (4-HBT) from Pseudomonas sp. strain CBS-3 and Arthrobacter sp. strain SU (29-31). Despite their diverse specificity toward acyl substrates (23,25), all known hotdog fold thioesterases catalyze the hydrolysis of the thioester bond using a Glu/Asp residue as a nucleophile or general-base catalyst, with the exception of FcoT (27).
Here we present structural and biochemical data showing that CalE7 does not contain an acidic residue in its active site and is thus likely to utilize a different catalytic mechanism. The results also suggest that CalE7 facilitates a subsequent decarboxylation step to yield the carbonyl-conjugated polyene (1). Hence, the results introduce a hotdog fold thioesterase with a novel product-releasing mechanism in comparison with the traditional type I and II thioesterases associated with the biosynthesis of polyketide natural products. Furthermore, the crystal structure revealed a kinked substrate-binding channel that is predicted to bind a cis-double bond-containing polyene substrate, raising the possibility that CalE8 is able to generate a cis-double bond.
Cloning, Protein Expression, and Purification-Cloning, expression, and purification of CalE8 and CalE7 have been described previously (13). Site-directed mutagenesis of CalE7 was performed using the QuikChange II kit (Stratagene) following the manufacturer's instructions, and the mutations were confirmed by DNA sequencing. The mutant proteins were expressed and purified following the same procedure as the wild-type protein. For the expression and purification of SgcE10, the SgcE10 gene was subcloned into the vector pCDF-2 according to the instruction manual for the pCDF-2 Ek/LIC kit (Novagen) to give the expression plasmid pCDF-SgcE10. The expression and purification of the (His)6-tagged SgcE10 was carried out following the protocol for CalE7 (13). Briefly, the E. coli strain BL21(DE3) carrying the expression plasmid was grown in LB medium supplemented with streptomycin (50 μg/ml) and incubated at 37°C. Protein expression was induced with 0.2 mM isopropyl 1-thio-β-D-galactopyranoside at an A600 of 0.6. After shaking for 20 h at 16°C, the cells were harvested and lysed in a buffer containing 50 mM Tris (pH 8.0), 200 mM NaCl, 1 mM dithiothreitol, and 5% glycerol. The protein was purified by Ni2+-NTA affinity and size-exclusion chromatography. Fractions containing the recombinant protein were pooled, concentrated, and stored at -80°C until use.
The CalE7 gene was subcloned into pCDF-2 according to the instruction manual for the pCDF-2 Ek/LIC kit (Novagen). The resulting plasmid pCDF-CalE7 was co-transformed with pET-CalE8 into BL21(DE3) competent cells for protein co-expression. The LB medium inoculated with the transformed cells was supplemented with kanamycin (50 μg/ml) and streptomycin (50 μg/ml) and incubated at 37°C overnight. Protein expression was induced with 0.4 mM isopropyl 1-thio-β-D-galactopyranoside when the A600 reached 0.6. After shaking for 20 h at 16°C, the cells were harvested and lysed for protein purification following the procedure described previously (13). Because both CalE8 and CalE7 contain a His6 tag, purification by Ni2+-NTA affinity chromatography yielded a solution that contained both proteins. The colorless CalE8 and colored CalE7 were subsequently separated and purified by size-exclusion chromatography using a Superdex 200 column (Amersham Biosciences).
Enzymatic Assay and Product Analysis by Absorption Spectroscopy and HPLC-Steady-state kinetic measurements were carried out with 10.5 μM CalE7 or CalE7 mutant, 1.12 μM CalE8, 175 μM NADPH (saturating concentration), and malonyl-CoA at various concentrations in 200 μl of buffer (100 mM Tris, pH 8.2, 300 mM NaCl, 1 mM dithiothreitol) at 30°C. The reaction mixture without malonyl-CoA was equilibrated at 30°C for 15 min in the temperature-controlled sample chamber of a Shimadzu UV-visible 1700 spectrometer. The enzymatic reaction was initiated by the addition of malonyl-CoA. A full wavelength spectrum scan was performed at 2-min intervals throughout the course of the experiment. The absorbance at 420 nm was corrected for baseline before being used for calculation of initial velocities. The initial velocity data were fitted non-linearly to the Michaelis-Menten equation to obtain Vmax. For large scale reactions, the pre-reaction mixture consisting of 3.6 μM CalE8, 7.0 μM CalE7, 0.15 mM acetyl-CoA, and 1.05 mM NADPH was incubated at 30°C for 15 min. Malonyl-CoA (1.05 mM) was added to start the reaction. Following 3 h of incubation, the yellow reaction mixture was acidified with trifluoroacetic acid and extracted with ethyl acetate. The resulting organic extract was evaporated using a SpeedVac. The dried sample was re-dissolved in methanol before application onto an analytical Eclipse XDB C18 column (4.6 × 250 mm) using an Agilent 1200 HPLC. A 60-min linear gradient was used, running from 100% Buffer A (HPLC grade water with 0.045% trifluoroacetic acid) to 100% Buffer B (acetonitrile with 0.045% trifluoroacetic acid). The diode-array detector was set at 410 nm with a reference wavelength of 600 nm. The characterization of the products by high resolution liquid chromatography-mass spectrometry has been described previously (13).
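The non-linear fit of initial velocities to the Michaelis-Menten equation, v = Vmax[S]/(Km + [S]), can be sketched as follows. This is a minimal illustration and not the fitting software used in the study: the function names, the search bounds for Km, and the example data are all hypothetical. The sketch exploits the fact that the model is linear in Vmax, so that for any trial Km the least-squares Vmax has a closed form and only Km needs to be searched numerically.

```python
import math


def mm_rate(s, vmax, km):
    """Michaelis-Menten rate v = Vmax*[S] / (Km + [S])."""
    return vmax * s / (km + s)


def best_vmax(substrate, rates, km):
    """Closed-form least-squares Vmax for a fixed Km (model is linear in Vmax)."""
    x = [s / (km + s) for s in substrate]
    return sum(v * xi for v, xi in zip(rates, x)) / sum(xi * xi for xi in x)


def sse(substrate, rates, km):
    """Sum of squared residuals at a trial Km, using the optimal Vmax for that Km."""
    vmax = best_vmax(substrate, rates, km)
    return sum((v - mm_rate(s, vmax, km)) ** 2 for s, v in zip(substrate, rates))


def fit_michaelis_menten(substrate, rates, km_lo=1e-2, km_hi=1e4):
    """Return (Vmax, Km); km_lo and km_hi are assumed search bounds."""
    # Coarse scan over log10(Km) to bracket the minimum.
    lo, hi = math.log10(km_lo), math.log10(km_hi)
    n = 200
    grid = [lo + (hi - lo) * i / n for i in range(n + 1)]
    errs = [sse(substrate, rates, 10 ** g) for g in grid]
    k = min(range(n + 1), key=errs.__getitem__)
    a, b = grid[max(k - 1, 0)], grid[min(k + 1, n)]

    # Golden-section refinement inside the bracket.
    inv_phi = (math.sqrt(5) - 1) / 2
    for _ in range(100):
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if sse(substrate, rates, 10 ** c) < sse(substrate, rates, 10 ** d):
            b = d
        else:
            a = c
    km = 10 ** ((a + b) / 2)
    return best_vmax(substrate, rates, km), km


# Hypothetical, noise-free example data (arbitrary units), not measured values.
example_s = [5, 10, 25, 50, 100, 250, 500]
example_v = [mm_rate(s, 2.0, 50.0) for s in example_s]
fitted_vmax, fitted_km = fit_michaelis_menten(example_s, example_v)
print(f"Vmax = {fitted_vmax:.3f}, Km = {fitted_km:.2f}")
```

With real absorbance-derived velocities, the same routine would be applied to the baseline-corrected initial rates at each malonyl-CoA concentration.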
Product Analysis for CalE8/CalE7 and CalE8/SgcE10 Co-expression-The plasmid pCDF-CalE7 or pCDF-SgcE10 was co-transformed with pET-CalE8 into BL21(DE3)-competent cells. The colored CalE7 and SgcE10 were expressed and purified following the procedure described above. The protein solution was treated with trifluoroacetic acid and extracted with ethyl acetate. The organic extract was evaporated using a SpeedVac, and the dried sample was re-dissolved in methanol before application onto an analytical Eclipse XDB C18 column (4.6 × 250 mm) using an Agilent 1200 HPLC. Product quantification was carried out using the same HPLC solvent system and conditions described above.
Crystallization and X-ray Diffraction Data Collection-Prior to crystallization, CalE7 purified from co-expression with CalE8 was buffer exchanged and concentrated to ~15 mg/ml in 50 mM potassium phosphate (pH 6.5), 150 mM NaCl using Amicon centrifugal concentrators (Millipore). An automated initial crystallization screen of 672 conditions was performed using the CyBio-Crystal Creator (Jena Biosciences). A volume of 200 nl of purified CalE7 was added to an equal volume of crystallization solution using the sitting drop, vapor-diffusion method. Crystals were obtained at 291 K with Index Screen condition 38 (0.1 M HEPES, pH 7.0, 30% Jeffamine M-600, Hampton Research). Crystals up to 0.2 mm × 0.2 mm × 0.1 mm in size could be grown in 0.1 M HEPES, pH 7.0, 28% Jeffamine M-600. Before data collection, crystals were transferred to a cryoprotectant of 0.1 M HEPES, pH 7.0, 28% Jeffamine M-600, and 25% glycerol and cooled to 100 K in a gaseous N2 stream using an Oxford cryosystem. Diffraction intensities were collected at the NSRRC (Hsinchu, Taiwan) to a resolution of 1.75 Å using a single crystal and were integrated, merged, and scaled using the HKL2000 suite (32).
Structure Solution, Refinement, and Model Analysis-The structure of CalE7 was solved using the molecular replacement software MrBUMP (33), with the crystal structure of a hypothetical protein (AQ1494) from Aquifex aeolicus (PDB code 2EGI, RIKEN Structural Genomics/Proteomics Initiative), which shares an overall homology of only 28% over 126 aligned residues with CalE7, as the search probe (Fig. 3A). Automated model building was carried out with ARP/wARP (34). The structure was refined using molecular dynamics and simulated annealing as implemented in the program CNS (35), with positional and individual temperature factor refinement without NCS restraints. The computer graphics software O was used for model rebuilding between refinement cycles (36). Analysis of the atomic model was carried out with the CCP4 suite of programs (37). The refined coordinates and structure factor amplitudes have been deposited in the PDB (PDB code: 2W3X).
Computational Docking-The substrates used for docking were built from scratch using the Build tools available in the Maestro program and were processed further with the program LigPrep (38). The docking of the substrates into the substrate-binding pocket of CalE7 was performed using GLIDE in the standard precision mode (38). Both the protein and substrates were modeled using the OPLS-AA force field (39).

RESULTS
In Vitro Enzymatic Activity with CalE8 and CalE7-The enzymatic assays were carried out by incubating CalE8 and CalE7 with acetyl-CoA, malonyl-CoA, and NADPH as described under "Experimental Procedures." At alkaline pH (pH 8.8), the carbonyl-conjugated polyene 1 (C15H18O, [M+1]+, m/z 215.1432) and its geometrical isomers were the major products, as shown by absorption spectroscopy and HPLC analysis (Fig. 2). The 15-carbon product 1 has been proposed to derive from a PKS-tethered β-ketocarboxylic acid intermediate after hydrolysis and decarboxylation (13) (Fig. 1). We noticed that CalE8 and CalE7 also generated three minor products (2, 3, and 4) with shorter retention times. These minor products exhibited absorption maxima of 310 nm (2), 355 nm (3), and 375 nm (4), relative to the 395 nm maximum for 1. High resolution mass spectrometry suggested that 2, 3, and 4 are a series of aberrant or immature products with predicted molecular formula of C 8  . Although the characterization of the molecular structures of 2, 3, and 4 by NMR is difficult given the low yield, the deduced molecular formulas and tandem mass spectrometry spectra suggested that they are most likely the lactones formed in solution from the linear tetra-, penta-, and hexaketide PKS products (supplemental Fig. S1). The yield of 1 decreased with decreasing pH, to the point (pH 6.6) at which 2, 3, and 4 became the major products. Importantly, only negligible amounts of 1, 2, 3, and 4 could be observed in the absence of CalE7 under all pH conditions, indicating that CalE7 is responsible for the release of 1 as well as the tetra-, penta-, and hexaketide linear intermediates. The observation that CalE7 did not discriminate against the ACP domain-tethered acyl substrates with shorter chain length indicates that CalE7 is not the determining factor of chain length.
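As a consistency check on the mass assignment for product 1, the expected m/z of the singly protonated ion can be computed from the molecular formula C15H18O. The sketch below is illustrative only: it assumes that the reported [M+1]+ ion corresponds to the protonated species [M+H]+, uses standard monoisotopic masses entered by hand, and defines hypothetical helper names that do not come from the original analysis.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes (standard reference values).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}
# Mass of a proton (hydrogen atom minus one electron).
PROTON = 1.00727647


def monoisotopic_mass(formula):
    """Monoisotopic mass of a simple formula such as 'C15H18O' (no brackets)."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total


def protonated_mz(formula):
    """m/z of the singly charged [M+H]+ ion (assumed ion type)."""
    return monoisotopic_mass(formula) + PROTON


def ppm_error(observed, calculated):
    """Mass error of an observed m/z relative to the calculated value, in ppm."""
    return (observed - calculated) / calculated * 1e6


calculated = protonated_mz("C15H18O")
print(f"calculated [M+H]+ m/z: {calculated:.4f}")
print(f"error vs. reported 215.1432: {ppm_error(215.1432, calculated):.2f} ppm")
```

Under these assumptions, the reported value of 215.1432 lies within about 1 ppm of the calculated value, consistent with the assigned formula.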
Overall Structure of CalE7-We previously observed the expression of product-bound CalE7 when this protein was coexpressed with CalE8 (13). The product-bound CalE7 was used for crystallization in an attempt to obtain the three-dimensional structure of an enzyme-product complex. However, despite the intense yellow appearance of the original protein solution due to the binding of the product, crystallization in the presence of Jeffamine M-600, a polyether amine, appears to select the subpopulation of unliganded CalE7. As a result, only the structure of the apo-enzyme was obtained, which was refined to a final Rwork of 0.196 and Rfree of 0.234 with good stereochemistry (Table 1). The asymmetric unit contains six molecules that can be overlaid with an average r.m.s.d. of 0.56 Å. Mobile amino acids located at the N- and C-terminal ends of each monomer could not be traced. The CalE7 monomer (Fig. 3B) exhibits a typical α/β hotdog fold with the five-stranded antiparallel β-sheet wrapping around the long α-helix α1 and two shorter α-helices α2 and α3 at the N terminus of the central helix (23). The main conformational differences between the six independent monomers are observed in the β5-α3 loop, where a displacement of ~2.8 Å is observed for one monomer (supplemental Fig. S2). A search for similar protein structures (40) revealed that CalE7 is closely related to members of the YbgC-like thioesterases from E. coli and H. pylori (24), with r.m.s.d. of 1.6 Å for 129 and 127 equivalent Cα atoms after superposition, and to the 4-hydroxybenzoyl-CoA thioesterase of Pseudomonas sp. strain CBS-3 (30) (with an r.m.s.d. of 1.9 Å for 135 equivalent Cα atoms). Based on size-exclusion chromatography, CalE7 is a tetramer in solution. Six subunits were seen in the asymmetric unit: four form a 222 (or D2) tetramer and two form a 2-fold symmetric dimer. A second complete 222 tetramer can be generated from the latter dimer via the crystallographic dyad.
As seen for other hotdog fold thioesterases, dimerization proceeds through the formation of a continuous 10-stranded antiparallel β-sheet with the central helices on the inner side of the β-sheet (Fig. 3C). This buried dimer interface (4540 Å2) is stabilized via main-chain interactions between strands β2 and by residues 25-34 of the central helix α1 with their 2-fold counterparts, as well as by hydrophobic interactions between α2 of one subunit and the β1-α1 loop of the other subunit. Tetramer formation leads to the burying of the four central helices while the β-sheets are exposed to the solvent, as observed in other hotdog fold thioesterases with the εγ oligomeric arrangement (24,29,30). A hydrophilic channel ~8 Å wide lies at the center of the tetramer along a 2-fold symmetry axis. Pyrophosphate molecules, possibly carried over from bacterial expression, are observed at both ends of this solvent-accessible channel.
Substrate-binding Pocket-An L-shaped putative substrate-binding pocket is visible at each dimer interface, over the β-sheet edge (Fig. 4A). The location of the substrate-binding pocket agrees well with other characterized hotdog fold thioesterases. The inner channel extends inside the other subunit, surrounded by residues from helices α1 and α2 and sealed by the short C-terminal helix α3 (Fig. 4, A and B). The putative substrate-binding channel is lined with hydrophobic residues that project from the two neighboring subunits, as well as five polar residues that include Arg37 and Thr60 from one subunit and Asn19, Asn23, and Tyr29 from the adjacent subunit (Fig. 4, A and B). An additional ~7-Å deep side pocket capped by the phenolic ring of Tyr29 faces the inner segment of the substrate-binding channel, with several water molecules making hydrogen bonds to residues Tyr29 (2.9 Å), Asn19 (2.84 Å), Arg37 (2.85 Å), and Thr60 (2.7 Å) (Fig. 4A). As inferred from a superposition with substrate-bound structures of 4-HBT from Pseudomonas (30), the segment forming the entrance of the channel at the dimer interface is likely to house the phosphopantetheinyl moiety of the substrate. Interestingly, only one (subunit E) of the six substrate-binding pockets adopts an open conformation, with the substrate-binding channels of the other five subunits (A-D and F) shielded from bulk solvent by loop β5-α3. Structural comparison reveals that a movement of 2.8 Å of the β5-α3 loop results in the opening of the substrate-binding pocket to the bulk solvent. This observation indicates that substrate binding is likely to be accompanied by movement of the flexible β5-α3 loop, which acts as a gate (supplemental Fig. S2).
No bound substrate was observed in the substrate-binding pocket. However, at the edge of the dimer interface, close to the entrance of the substrate-binding pocket, a Jeffamine molecule from the crystallization solution is clearly visible in the electron density map (Fig. 3D). The Jeffamine molecule sits between one dimer and the symmetry-related tetramer, stabilized mostly by hydrophobic contacts. Comparison with 4-HBT of Pseudomonas sp. strain CBS-3 (PDB code: 1LO9) reveals that the Jeffamine fragment overlaps with the nucleotide and phosphopantetheinyl moieties of hydroxybenzoyl-CoA (Fig. 3C). The location of the Jeffamine fragment may mimic interactions formed by the exposed region of the phosphopantetheinyl arm tethered to the ACP domain with the CalE7 protein.
Computational Docking-Computational docking was performed to explore the structure of the enzyme-substrate complex. The initial effort to dock the phosphopantetheinyl-linked full-length substrate containing six conjugated C=C bonds into the open pocket of subunit E was not successful. The binding channel appears a few angstroms short for accommodating a full-length substrate with six conjugated C=C bonds. We reasoned that the binding of the rigid polyene substrate probably requires a conformational change that involves movement of the short C-terminal α3 helix to extend the channel, which was not taken into consideration during in silico docking. To circumvent the difficulty, shorter substrates that contain two, three, and five C=C bonds were used for docking instead. Docking of the substrates in the binding pocket was observed only when a cis-double bond at the C4/C5 position was incorporated into the substrates. Despite the different lengths of the in silico substrates, the three sets of docking all placed the thioester bond at a similar position within the substrate channel. As shown by the highest ranked docking structure with the substrate containing five C=C bonds (Fig. 5A), the phosphopantetheinyl moiety is bound by the entrance segment of the channel through interactions with several main-chain groups and the side chains of Lys57, Phe59, and Glu70. The docking results also suggest that the hydrophobic polyene moiety is bound by the inner segment of the channel, with the orientation of the thioester bond differing slightly among the docking conformations. The oxygen atoms of the carbonyl and thioester groups of the substrate were positioned within hydrogen-bond distance of Asn19 and Arg37, respectively. Comparison of the model with other substrate-bound thioesterase structures such as 4-HBT (30) revealed a similar position for the thioester bond (Fig. 5A), lending further support to the reliability of the docking result.
For the closed channel in subunits A-D and F, docking the putative β-ketocarboxylic intermediate (4,6,8,10,12,14-ene-3-one-hexadechexaenoic acid) generated by thioester hydrolysis confirmed that the channel can only accommodate a full-length substrate with a cis C=C bond at the C4/C5 position (Fig. 5B). The substrate fits tightly in the solvent-inaccessible part of the channel, with the carbonyl oxygen now moved away from the initial hydrogen-bond partner Asn19 and the carboxylic end of the intermediate shifted into the side pocket that contains several water molecules. Judging by the distance, a few water molecules hydrogen-bonded by Asn23, Tyr29, and Thr60 in the crystal structure could remain in the side pocket and form hydrogen bonds with the carboxylic group of the ligand.
Probing the Catalytic Mechanism by Site-directed Mutagenesis-Given the absence of an acidic residue (Glu or Asp) in the binding pocket, CalE7 must catalyze the hydrolysis of the thioester bond using a mechanism different from that of other hotdog fold thioesterases. The binding pocket mainly consists of hydrophobic residues, with the exception of five conserved polar residues, namely Asn19, Asn23, Tyr29, Arg37, and Thr60 (Fig. 5A). Five additional polar residues (Glu17, Cys36, Glu38, Glu70, and Asp75) in the vicinity of the substrate-binding channel are also well conserved among CalE7 homologues (supplemental Fig. S3). Site-directed mutagenesis and enzymatic assays were carried out to examine the roles of the ten residues in catalysis.
Incubation of CalE8 and each CalE7 mutant with substrates at pH 8.2 produced 1 and its geometrical isomers as the major products, although the mutants generated varying amounts of product relative to wild-type CalE7 (Fig. 6). For the five residues located in the pocket, the mutation N23A had a negligible effect on product yield. The greatest effect was observed for the mutation R37Q or R37K, which completely abolished the enzymatic activity, with negligible product formation. The mutations T60A and Y29F had a significant impact, reducing the yield to 17 and 28%, respectively. On the other hand, the mutation N19A seems to facilitate the formation of 1, with a yield of almost 300% relative to CalE7. For the five conserved residues located outside the binding pocket, the mutation E17Q caused a notable decrease in product formation (80%), whereas mutations C36A, E38Q, E70Q, and D75N caused only moderate decreases in product yield. The effect of the E17Q mutation was surprising given that the side chain of Glu17 is distant from the substrate-binding pocket (supplemental Fig. S3). The E17Q mutant was found to be a dimer in solution by size-exclusion chromatography, whereas all the other mutants remained tetrameric (data not shown). From the structure, it can be seen that Glu17 is located at the tetrameric interface of CalE7 and surrounded by positively charged residues (supplemental Fig. S4). Hence, the effect of the E17Q mutation is most likely due to structural perturbation at the tetramer interface.
The wild-type enzyme and five mutants were further characterized by steady-state kinetic measurements. The effect of mutating the five residues in the substrate pocket on the enzymatic rate was examined by following product formation using absorption spectroscopy at 420 nm. The effect of the mutations on Vmax correlates well with the effect on product yield (Fig. 6 (inset) and supplemental Table S1). Not surprisingly, the greatest effect was observed for mutation R37Q or R37K, which displayed negligible activity even at high enzyme concentration. The T60A mutant exhibited a ~2-fold reduction in Vmax relative to the wild-type enzyme, whereas the N19A mutant showed a 1.2-fold greater Vmax than that of the wild-type CalE7. The mutations Y29F and N23A caused similar decreases (~1.4-fold) in catalytic rate. Together, the data suggested that Arg37 is the only essential residue for catalysis.
Substitution of CalE7 by SgcE10-SgcE and SgcE10 are the homologues of CalE8 and CalE7 in the biosynthesis of the 9-membered enediyne-containing C-1027. Zhang et al. observed the production of 1,3,5,7,9,11,13-pentadecaheptaene (C15H18) from the co-expression of SgcE and SgcE10 in E. coli (12), in contrast to the carbonyl-conjugated polyene (1, C15H18O) produced by CalE8 and CalE7. To investigate whether the thioesterase (CalE7 or SgcE10) exerts any control on the formation of the product, we cloned and expressed SgcE10 and examined the product generated from CalE8 and SgcE10 by in vitro enzymatic assay and in vivo co-expression. In the in vitro assay, the incubation of CalE8 and SgcE10 with substrates yielded 1 instead of the 1,3,5,7,9,11,13-pentadecaheptaene. However, the yield appears to be lower than that of CalE7, with a relative yield of 25% (Fig. 6). For in vivo co-expression, the bright yellow pentadecaheptaene that was generated by SgcE/SgcE10 and found in the cell debris was not observed for CalE8/SgcE10 co-expression, as evidenced by the pale cell debris. Instead, co-expression of CalE8 and SgcE10 produced the carbonyl-conjugated polyene 1 with a lower yield compared with the CalE8/CalE7 system. These observations support the notion that product formation is mainly determined by the PKS, i.e. CalE8 and SgcE, and that CalE7 is more efficient than SgcE10 in releasing product 1 from CalE8.
FIGURE 3. Sequence and structure of CalE7. A, sequence alignment of CalE7 with SgcE10 and the hypothetical protein (AQ1494) from A. aeolicus (PDB code 2EGI). Strictly conserved residues are highlighted in red, and partially conserved residues are yellow. The residues mutated in the present study are indicated by blue asterisks. B, representation of the CalE7 monomer. The protein secondary structure elements are labeled and colored from blue at the N terminus to red at the C terminus. C, schematic representation of the CalE7 tetramer showing the Jeffamine fragment (sticks) binding site between two subunits. Subunits are colored gray and cyan for one dimer and dark blue and dark gray for the other. The β5-α3 loops at the active sites are colored red. The inset shows a close-up view of the Jeffamine binding site superposed with 4-hydroxybenzoyl-CoA-bound 4-HBT (PDB code: 1LO9). D, simulated annealing difference Fourier map with coefficients |Fobs| - |Fcalc| and phases calculated from the protein model with atoms from the Jeffamine fragment omitted, contoured at a level of 3σ.

DISCUSSION
Thioesterases associated with fatty acid synthase, PKS, and non-ribosomal peptide synthase generally belong to the so-called type I and II thioesterases. On the other hand, hotdog fold thioesterases are mainly known to utilize acyl-CoA as substrates, with only a few known examples that act on peptidyl carrier protein or ACP-tethered acyl substrates (23,41,42), including the EntH protein (or YbdB) involved in the biosynthesis of the non-ribosomal peptide-derived enterobactin (41,43,44). Hence, CalE7 appears to represent an uncommon example where a hotdog fold thioesterase was recruited for polyketide synthesis.
Structural Binding by CalE7-The overall structure of CalE7 resembles other hotdog fold thioesterases with the εγ oligomeric arrangement. However, with a sharply kinked substrate-binding channel and an unusually large side pocket, the substrate-binding pocket of CalE7 differs significantly from those of other thioesterases despite the common location of the pocket at the subunit interface. According to the docking results, the channel is a few angstroms short for the full-length substrate, presumably due to the shortening of the channel by a conformational change involving helix α3. We propose that the binding of the ACP-tethered substrate requires conformational changes in the β5-α3 loop as well as the α3 helix. The movement of the "gating" β5-α3 loop would open the main entrance, whereas the movement of the short α3 helix that contains Leu133 and Phe137 would extend the channel. Docking of substrates of various lengths suggests that the protein binds and orients the substrate for hydrolysis largely through interactions between the phosphopantetheinyl moiety and the entrance segment of the binding pocket. This is consistent with the finding that CalE8/CalE7 was able to generate 1 as well as the derailed polyketide products that eventually form lactones 2, 3, and 4 (supplemental Fig. S1).
The substrate of CalE7 is covalently attached to the phosphopantetheinyl group of the ACP domain of CalE8. Efficient cleavage of the product requires the recognition of CalE7 by CalE8. No significant complex formation between CalE8 and CalE7 was detected using size-exclusion chromatography, indicating that the two proteins interact with each other with only moderate affinity. Presumably, a portion of the phosphopantetheinyl group would be bound inside CalE7 while the remainder of the phosphopantetheinyl group would protrude from the tunnel at the dimer interface of CalE7. The ACP domain of CalE8 is thus expected to be located in the proximity of the entrance of the active site. A few positively charged residues are found at the entrance of the substrate-binding site that could neutralize the negatively charged surface of ACP surrounding the attached phosphopantetheinyl group. Indeed, surface potential calculation for CalE7 revealed such a region around Lys62 and Arg113, which could be complementary to ACP. Considering that most residues in the substrate-binding channels are conserved between CalE7 and SgcE10, the lower product yield and overall catalytic efficiency of SgcE10 could result from the loss of specific interactions required for such protein-protein interaction.
FIGURE 6. Enzymatic activity of the wild-type and mutant CalE7. The relative product yields for CalE7 and its mutants are compared, with the product yield for SgcE10 included. Reaction conditions are described under "Experimental Procedures," and the relative yield was calculated based on the total peak area of the product (1) and its isomers obtained from the HPLC chromatogram. Inset: comparison of Vmax for CalE7 and its mutants. The reactions were followed by monitoring the formation of the products at 420 nm using a UV-visible spectrophotometer. The catalytic activity for the R37Q and R37K mutants was too low to be measured.
Catalytic Mechanism of CalE7-A common feature of the hotdog fold thioesterases is the presence of an essential acidic residue (Glu or Asp) on either side of the binding pocket (23). Although initial sequence comparison of CalE7 with other thioesterases suggested that the conserved Glu17 or Glu75 could be the catalytic residue, the crystal structure revealed that the two residues are too distant from the substrate-binding channel to play a direct role in catalysis. The absence of an acidic residue as nucleophile or general base was recently reported for the thioesterase FcoT from Mycobacterium tuberculosis, an acyl-CoA thioesterase that prefers long-chain fatty acyl-CoA substrates (27). It was proposed that FcoT activates the thioester bond and stabilizes the oxyanion intermediate via the side chains of Tyr33, Asn74, and Tyr66, and that hydrolysis occurs through a direct attack by a hydroxide ion. The superposition of the structures of FcoT and CalE7 showed that these three residues are not conserved in CalE7, indicating that CalE7 utilizes a catalytic mechanism different from that of FcoT.
Based on the structural and biochemical data, we propose a catalytic mechanism for CalE7 as illustrated in Fig. 7. The substrate is first bound by CalE7 with the oxygen atoms of the thioester and carbonyl groups anchored by Arg37 and Asn19, respectively. This binding configuration places the two carbonyl oxygen atoms within hydrogen-bond distance of Arg37 and Asn19. The nucleophilic attack on the thioester carbonyl by a water molecule or hydroxide anion is facilitated by Arg37, which acts as an oxyanion hole to stabilize the transition state. This mechanism is consistent with the observation of several water molecules within the active site pocket (Fig. 4A). Although we cannot totally rule out that Arg37 functions as a general base for water activation, this alternative mechanism is rather unlikely, because it would require a reduction of the pKa of Arg37 by ~4-5 pKa units. Subsequently, the collapse of the transition state would break the thioester bond to generate the 16-carbon β-ketocarboxylic acid intermediate. The intermediate undergoes a re-positioning in the substrate-binding channel, likely driven by a conformational change that could involve the closing of the β5-α3 gating loop and the movement of the C-terminal helix α3. As a result, the carboxylic end of the intermediate swings into the short side pocket that contains several solvent molecules. Note that the hydrogen bond between the carbonyl oxygen and Asn19 must be broken for the intermediate to move into the side pocket. Following the conformational change, the channel is shortened by the movement of helix α3 and concealed from the bulk solvent by the β5-α3 loop. The carboxylic group now becomes hydrogen-bonded to the water molecules in the side pocket (Figs. 5B and 7). A facile decarboxylation of the β-ketocarboxylic acid could occur with an intramolecular proton transfer between the carbonyl and carboxylic groups within a six-centered transition state. Alternatively, the decarboxylation step may take place with Tyr29 playing the role of proton acceptor, resembling the mechanism proposed for isocitrate dehydrogenase (45). In the last step, the ketone 1 is generated after the immediate tautomerization of the enol intermediate, with a water molecule in the side pocket acting as proton donor.
FIGURE 7. Proposed catalytic mechanism for CalE7. The substrate-binding channel is represented by the gray shade. In the hydrolysis step (1 and 2), the essential residue Arg37 functions as an oxyanion hole. The step following hydrolysis represents the breakage of the thioester bond as well as a thermally driven conformational change inside the substrate-binding channel, mainly involving the β5-α3 loop and the α3 helix. The decarboxylation occurs with an intramolecular proton transfer within the six-centered transition state, followed by tautomerization to form the ketone.
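The pKa argument against Arg37 acting as a general base can be illustrated with a short Henderson-Hasselbalch calculation. This is only a sketch: the unperturbed side-chain pKa of 12.5 is a commonly quoted textbook value rather than a value measured for CalE7, the pH of 8.0 is an illustrative choice, and the function and constant names are hypothetical.

```python
def neutral_fraction(pka: float, ph: float) -> float:
    """Fraction of a basic group present in its deprotonated (neutral) form,
    from the Henderson-Hasselbalch relation."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# Assumptions for illustration only; neither value comes from this study.
ASSUMED_ARG_PKA = 12.5  # textbook pKa of a free arginine side chain in water
ASSUMED_PH = 8.0        # illustrative reaction pH

if __name__ == "__main__":
    for shift in (0.0, 4.0, 5.0):
        pka = ASSUMED_ARG_PKA - shift
        frac = neutral_fraction(pka, ASSUMED_PH)
        print(f"pKa lowered by {shift:.0f} units -> pKa {pka:.1f}, "
              f"neutral fraction {frac:.3g}")
```

Under these assumptions, only a very small fraction of an unperturbed arginine would be neutral, and the neutral form becomes appreciably populated only once the pKa has dropped by several units, which is why the general-base role is regarded as the less likely alternative.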
The proposed catalytic mechanism is consistent with several major experimental observations. Firstly, Arg37 is the only essential catalytic residue, as established by the mutagenesis study. This is understandable considering the critical role of Arg37 as the oxyanion hole in hydrolysis. Secondly, the intriguing observation that the mutant N19A exhibited a higher product yield and greater Vmax can be rationalized by considering the re-positioning of the intermediate prior to decarboxylation. The hydrogen bond between Asn19 and the keto oxygen of the substrate presents a kinetic barrier for the re-positioning process. Elimination of the side chain of Asn19 would disrupt this hydrogen bond and lower the kinetic barrier for the reorientation, resulting in increases in overall rate and product yield. Thirdly, the T60A mutation did not abolish the enzymatic activity, ruling out the possibility that the hydroxyl side chain functions as the nucleophilic group attacking the thioester. Instead, Thr60 is most likely to play a secondary or structural role in catalysis, by positioning the side chain of Arg37 through hydrogen bonding and by anchoring one of the water molecules involved in decarboxylation using its main-chain carbonyl group. Fourthly, the residue Tyr29 is located at the end of the side pocket and is 13 Å away from the side chain of Arg37, indicating that it is unlikely to participate in the hydrolysis. Thus, the reduction in product yield and rate caused by the mutation Y29F indicated that the phenolic oxygen promotes another step along the catalytic coordinate. It is well known that β-ketocarboxylic acids are chemically labile and prone to decarboxylation in solution owing to a small kinetic barrier (46,47). The decarboxylation may therefore require little assistance from the enzyme for kinetic barrier reduction. However, the presence of the unusual side pocket and the effect of the Y29F mutation led us to propose an assisting role of Tyr29 in decarboxylation.
A network of several water molecules was seen to be anchored by Tyr29, Asn23, and Thr60 in the side pocket. Judging by the distance, Tyr29 could promote the protonation of the enol intermediate by lowering the pKa of the water through hydrogen bonding (Fig. 7).
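The kinetic rationale offered above for the N19A mutant, in which removing a hydrogen bond lowers the barrier for re-positioning, can be put on a rough quantitative footing with the exponential dependence of rate on activation free energy. The sketch below is purely illustrative: the barrier reductions are hypothetical placeholder values, not measurements from this study, and it assumes that the re-positioning step limits the overall rate and that the pre-exponential factor is unchanged by the mutation.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def rate_enhancement(barrier_drop_kcal: float, temp_k: float = 298.15) -> float:
    """Factor by which a rate constant increases when the activation free
    energy is lowered by barrier_drop_kcal (kcal/mol) at temperature temp_k."""
    return math.exp(barrier_drop_kcal / (R_KCAL * temp_k))


if __name__ == "__main__":
    # Hypothetical barrier reductions chosen only to show the trend.
    for drop in (0.5, 1.0, 2.0):
        print(f"barrier lowered by {drop:.1f} kcal/mol -> "
              f"rate increases {rate_enhancement(drop):.1f}-fold")
```

The point of the sketch is only that a modest change in barrier height, of the order expected for a single hydrogen bond, is enough to produce a measurable change in rate.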
The exact function of the essential residue Arg37 remains to be fully established. Current structural and biochemical data do not distinguish whether Arg37 acts as a transition state stabilizer or as a general base, even though the latter is less likely. We propose that hydrolysis may take place through a direct attack by a hydroxide anion, similar to the mechanism proposed for FcoT and the hydrolytic antibody D2.3 (27,48). Because the rate of such a nucleophilic reaction depends on the concentration of the hydroxide anion, this is to some extent in line with the observation that high pH favored product formation. Although it is rare for Arg to act as a general base in enzymes because of its high pKa, we cannot totally rule out the possibility that the water molecule could be activated by the neutral form of Arg37 if the pKa of Arg37 is lowered by several units in CalE7.
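The dependence on hydroxide concentration can be made concrete with a minimal sketch. It assumes, for illustration only, that the hydrolysis step is first order in hydroxide and that no other property of the enzyme or substrate changes with pH; neither assumption is established by the data presented here. The function names and the pH values are hypothetical.

```python
KW_25C = 1.0e-14  # ion product of water at 25 degrees C


def hydroxide_molar(ph: float, kw: float = KW_25C) -> float:
    """Hydroxide concentration (mol/L) in water at the given pH."""
    proton_molar = 10.0 ** (-ph)
    return kw / proton_molar


def relative_rate(ph: float, ref_ph: float) -> float:
    """Rate at ph relative to ref_ph for a step assumed to be first order
    in hydroxide, with all other factors held constant."""
    return hydroxide_molar(ph) / hydroxide_molar(ref_ph)


if __name__ == "__main__":
    # Illustrative pH values only.
    for ph in (7.0, 8.0, 9.0):
        print(f"pH {ph:.1f}: [OH-] = {hydroxide_molar(ph):.1e} M, "
              f"rate relative to pH 7 = {relative_rate(ph, 7.0):.0f}x")
```

Under these assumptions each unit increase in pH raises the hydroxide concentration, and hence the rate of the hydroxide-dependent step, tenfold, which is the qualitative trend that the observed preference for high pH would reflect.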
Implications for Enediyne Biosynthesis-The structural and mechanistic study of CalE7 provided some interesting insights into the enediyne biosynthetic pathway. First, there was speculation that the formation of the bicyclic enediyne from the putative biosynthetic intermediate 1 might be catalyzed by CalE7, given the observation that CalE7 shares some structural homology with the aromatase/cyclase in aromatic polyketide biosynthesis (49). Because of its size, an undecorated bicyclic enediyne cannot be sterically accommodated in the pocket of CalE7. Therefore, the observation of a narrow and elongated substrate-binding channel, rather than a large and spacious pocket such as the one in tetracenomycin aromatase/cyclase, argues against the possibility that CalE7 also functions as a cyclase. In addition, the sharply kinked substrate-binding channel led us to propose that the product of CalE8 contains a cis double bond. The presence of the cis double bond at the C4/C5 position was not revealed previously by the NMR spectrum of 1, owing to signal overcrowding in the olefinic proton region (13) or possibly to the geometrical isomers generated by thermal or light-induced cis-trans isomerization. Intriguingly, the position of the inferred cis double bond in the putative biosynthetic intermediate 1 coincides with the position of the cis C4=C5 bond in the 10-membered enediyne of calicheamicin according to the proposed catalytic mechanism (Fig. 1A) (13). This implies that the cis double bond in the 10-membered enediyne could be generated by CalE8 rather than by a downstream tailoring enzyme. Although the presence of the cis C=C in the biosynthetic intermediate needs to be confirmed by more direct evidence, the structural results presented here raise the tantalizing possibility that the DH domain of the iterative PKS may be able to generate a cis double bond.
Finally, it is important to stress that the proposed function of CalE7, the release of the linear product 1 for downstream processing, rests largely on the assumption that 1 is the mature product of CalE8 and thus the biosynthetic intermediate. However, the current structural and biochemical data cannot totally rule out the possibility that 1 is only an aberrant product of CalE8. In that case, the function of CalE7 would be to remove the derailed products from the PKS, resembling the editing roles of other type II TEs. These uncertainties should be resolved with the elucidation of the downstream cyclization and oxidation steps.