Crystal Structure of Vinorine Synthase, the First Representative of the BAHD Superfamily

Vinorine synthase is an acetyltransferase that occupies a central role in the biosynthesis of the antiarrhythmic monoterpenoid indole alkaloid ajmaline in the plant Rauvolfia. Vinorine synthase belongs to the benzylalcohol acetyl-, anthocyanin-O-hydroxy-cinnamoyl-, anthranilate-N-hydroxy-cinnamoyl/benzoyl-, deacetylvindoline acetyltransferase (BAHD) enzyme superfamily, members of which are involved in the biosynthesis of several important drugs, such as morphine, Taxol, or vindoline, a precursor of the anti-cancer drugs vincaleucoblastine and vincristine. The x-ray structure of vinorine synthase is described at 2.6-Å resolution. Despite low sequence identity, the two-domain structure of vinorine synthase shows surprising similarity with structures of several CoA-dependent acyltransferases such as dihydrolipoyl transacetylase, polyketide-associated protein A5, and carnitine acetyltransferase. All conserved residues typical for the BAHD family are found in domain 1. His160 of the HXXXD motif functions as a general base during catalysis. It is located in the center of the reaction channel at the interface of both domains and is accessible from both sides. The channel runs through the entire molecule, allowing the substrate and co-substrate to bind independently. Asp164 points away from the catalytic site and seems to be of structural rather than catalytic importance. Surprisingly, the DFGWG motif, which is indispensable for the catalyzed reaction and unique to the BAHD family, is located far away from the active site and seems to play only a structural role. Vinorine synthase represents the first solved protein structure of the BAHD superfamily.

The acyl-CoA-dependent BAHD superfamily is a fast growing enzyme family that has only recently been defined (1). The name BAHD is coined from the first four enzymes of the family isolated from plant species. The members of this family play an important role in the biosynthesis of a variety of secondary metabolites. The family might become significantly larger in the near future because ~70 BAHD-related genes have been identified recently in the Arabidopsis genome (2), and in most cases, their biochemical function still needs to be explored. Several BAHD members occurring in medicinal plants and fungi play very specific metabolic roles in biosynthetic pathways. The most prominent members are, for instance, those participating in the biosynthesis of the Catharanthus alkaloid vindoline (3), a precursor of the anti-cancer drugs vincaleucoblastine and vincristine, the Papaver alkaloid morphine (4), the diterpenoid alkaloid Taxol (5-7), anthocyanins (8-10) as well as some phytoalexins (11), and enzymes involved in floral scent (12).
A well-characterized enzyme of this family is vinorine synthase (VS; EC 2.3.1.160), which is of central importance in the endogenous formation of monoterpenoid indole alkaloids of the ajmalan type in the plant genus Rauvolfia. The synthase is located in the middle of the complex biosynthetic pathway that starts with tryptamine and the monoterpene secologanin and leads, finally, to the six-membered ring system of ajmaline that bears nine chiral carbon atoms (Fig. 1). Ajmaline is an antiarrhythmic drug from the Indian plant Rauvolfia serpentina, which has been known as a medicinal plant for about 3000 years. VS catalyzes the acetyl-CoA-dependent reversible biosynthesis of the ajmalan-type alkaloid vinorine from the alkaloid 16-epi-vellosimine. The latter belongs to the class of sarpagan alkaloids containing a five-ring system, and this is the final ring closure reaction during the biosynthesis of ajmaline (Fig.  1). VS connects the two different types of alkaloids biosynthetically and occupies a central role in the metabolism of alkaloids in the genus Rauvolfia.
VS has been identified previously in de-differentiated cell suspension cultures of R. serpentina and preliminarily characterized (13). Only recently has it been functionally expressed in Escherichia coli and purified to homogeneity (14,15). The synthase is a monomeric enzyme with a molecular mass of 46.8 kDa. Knowledge of the primary structure of the enzyme allowed sequence alignment studies placing VS into the BAHD family as a new member (15). This classification was based on the consensus sequences HXXXD and DFGWG. The typically low overall sequence identity (25-34%) to other BAHD members might indicate a divergent evolution of the family from one ancestral gene (1). Some functional significance of both motifs has been demonstrated by site-directed mutagenesis performed on a malonyl-CoA transferring plant enzyme (9), and more detailed mutation studies have been carried out on vinorine synthase (15). The results showed, however, that a better understanding of the catalytic process and the function of the conserved residues would be best addressed by three-dimensional structural analysis. Because there was no structural information available from members of this enzyme family, we have crystallized vinorine synthase from R. serpentina (16,17) and solved the x-ray crystal structure at 2.6-Å resolution. Structural analysis combined with previously reported biochemical and mutagenesis studies allows us to propose a model for VS catalysis and provides insight into the function of conserved motifs within the BAHD superfamily.
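The motif-based classification described above can be illustrated with a minimal sketch that scans a one-letter protein sequence for the HXXXD and DFGWG consensus sequences. The function name `find_motifs` and the toy sequence are hypothetical and serve only as an illustration; the toy sequence is not the VS sequence, and the sketch is not the alignment procedure used in the cited studies.

```python
import re

# Consensus motifs named in the text; X stands for any residue.
MOTIFS = {"HXXXD": r"H...D", "DFGWG": r"DFGWG"}


def find_motifs(sequence):
    """Return {motif name: [1-based start positions]} for a protein sequence."""
    hits = {}
    for name, pattern in MOTIFS.items():
        # A lookahead is used so that overlapping matches are also reported.
        hits[name] = [m.start() + 1
                      for m in re.finditer(f"(?={pattern})", sequence)]
    return hits


# Hypothetical toy sequence, NOT the vinorine synthase sequence.
toy_sequence = "MAAHTLADGSKKDFGWGPL"
print(find_motifs(toy_sequence))
```

A motif hit alone does not establish family membership; in the text the classification also rests on the overall sequence alignment.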

EXPERIMENTAL PROCEDURES
Overexpression, Purification, and Crystallization of VS-VS was subcloned into the pQE-2 vector and overexpressed in E. coli. The soluble protein was purified by nickel-nitrilotriacetic acid affinity chromatography, anion exchange, and gel filtration chromatography as described previously. The N-terminal His tag was removed for crystallization (15,16). Crystals of VS were obtained at 32°C by the hanging drop vapor diffusion method. The reservoir solution contained 0.1 M Tris-HCl, pH 8.7, 2 M ammonium sulfate, and 2% polyethylene glycol 400. The enzyme (2-3 mg/ml) was in a buffer containing 20 mM Tris-HCl, pH 7.5, 10 mM β-mercaptoethanol, 1 mM EDTA, and 0.5 mM acetyl-CoA. SeMet VS was obtained by inhibition of the methionine biosynthetic pathway (18) with the same expression vector and E. coli strain used for expression of the native VS. Purification and crystallization of SeMet VS were carried out using a protocol similar to that used for wild-type VS (17).
Data Collection and Processing-Both SeMet VS and native VS crystals were cryoprotected by addition of 20-25% glycerol to the precipitant buffer before being flash-cooled in a stream of cold nitrogen at 100 K. Native data and multi-wavelength anomalous diffraction data from SeMet VS crystals measured at three different wavelengths around the selenium absorption edge were collected using synchrotron radiation on the BW7A beamline of the European Molecular Biology Laboratory at the DORIS storage ring of the Deutsches Elektronen-Synchrotron (Hamburg, Germany). The SeMet crystals diffracted to 3.24 Å, whereas the native crystals diffracted to 2.60 Å. The diffraction data were processed using the HKL program package (19). The crystals belong to the space group P2₁2₁2₁ with two molecules in the crystallographic asymmetric unit. The data collection and processing statistics are summarized in Table I.
Structure Determination, Model Building, and Refinement-The structure was solved using the three wavelength multi-wavelength anomalous diffraction protocol of the European Molecular Biology Laboratory Hamburg automated crystal structure determination platform (20). Within the platform, positions of the anomalous scattering atoms were determined with the program SHELXD (21), and 18 Se sites were further refined using MLPHARE (22) to generate initial phases. Phase improvement by density modification was performed using DM (23). The platform provided the correct selenium sites and an interpretable map with a partial α-helical model containing 167 of 842 residues. The partial model was produced by the program ESSENS (24) within the platform.
Once the map was judged to be interpretable, 50% of the polyalanine model was built using a semiautomatic procedure with the programs MAID (25), RESOLVE (26), and XTALVIEW/XFIT (27). Later, phases were extended to 2.6 Å using data from the native crystal by density modification and 2-fold non-crystallographic symmetry averaging. Manual model building was continued using XTALVIEW/XFIT, and a relatively complete polyalanine model was built into the improved density. The selenium sites were used as markers to place the correct side chains in the electron density, and building of most of the side chains of a single molecule was possible. The second molecule was then generated using the non-crystallographic symmetry operator. At this stage, refinement of the structure was initiated using simulated annealing, followed by positional and restrained B-factor refinement as implemented in CNS (28). The geometry and completeness of the model were iteratively improved by refinement with CNS, using the simulated annealing and torsion angle dynamics protocols, and by manual building. In the final stage, refinement was carried out using non-crystallographic symmetry restraints, bulk solvent correction, and anisotropic scaling, with each domain of each monomer defined as a TLS group in the modeling of anisotropy using the program REFMAC5 (29,30). The refinement was monitored throughout using the free R-factor calculated with 3.2% of the unique reflections. Of 421 residues, 8 residues in chain A and 9 residues in chain B are not visible in the electron density (see "Results") and are probably disordered. The acetyl-CoA could not be located in the electron density; hence these crystals are referred to here as native crystals. The refinement statistics for the native structure are shown in Table I.
The overall geometric quality of the model was assessed using PROCHECK (31). 87.7% of the amino acid residues of VS were found in the most favorable regions of the Ramachandran plot, and no residues were in the disallowed regions. All figures were produced using MOLSCRIPT (32), PyMOL (33), and RASTER3D (34).
Modeling of CoA-The structural similarity of VS to the CoA-dependent acyltransferase family was used to model CoA into the active site of VS. Because the structure of dihydrolipoyl transacetylase (Protein Data Bank code 1EAD) has been solved in complex with CoA (35), this protein was used as the basis for the modeling of the CoA molecule in VS. The dihydrolipoyl transacetylase trimer binds CoA in the solvent channel that is located at the interface of two subunits. The two subunits of dihydrolipoyl transacetylase are analogous to a single monomer of the VS structure that contains two domains. Both structures were superimposed in order to provide a starting model for the VS·CoA complex. To improve the accuracy of positioning CoA, all residues within a 4-Å distance of CoA in the molecules were selected using the program CONTACT (22), and these residues were superimposed with an r.m.s.d. of 1.8 Å onto corresponding residues in VS using the program LSQKAB (22). The matrix was applied to the CoA molecule in Protein Data Bank code 1EAD in order to transfer CoA into the solvent channel of the VS structure. A model of a single molecule of VS with the fitted CoA was then subjected to model refinement in CNS (28), excluding the x-ray terms. The resulting model was then used for further analysis.
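Two steps of the modeling procedure above, the selection of residues within 4 Å of the ligand and the r.m.s.d. between matched atoms, can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the CONTACT or LSQKAB implementation: the function names, the data layout, and the toy coordinates are hypothetical, and the least-squares superposition itself is omitted (the coordinates passed to `rmsd` are assumed to be superimposed already).

```python
import math


def residues_within(atoms, ligand_atoms, cutoff=4.0):
    """Return sorted residue ids with at least one atom within `cutoff` Å
    of any ligand atom.

    atoms: iterable of (residue_id, (x, y, z)) tuples.
    ligand_atoms: iterable of (x, y, z) tuples.
    """
    ligand_atoms = list(ligand_atoms)
    selected = set()
    for residue_id, xyz in atoms:
        if any(math.dist(xyz, lig) <= cutoff for lig in ligand_atoms):
            selected.add(residue_id)
    return sorted(selected)


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched coordinate lists
    that are assumed to be superimposed already."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


# Hypothetical toy coordinates in Å, for illustration only.
protein = [(101, (0.0, 0.0, 0.0)), (102, (3.0, 0.0, 0.0)), (103, (10.0, 0.0, 0.0))]
ligand = [(0.0, 0.0, 1.0)]
print(residues_within(protein, ligand))
print(rmsd([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)], [(0.0, 0.0, 2.0), (1.0, 0.0, 0.0)]))
```

In practice the superposition matrix would be obtained from a least-squares fit of the selected residues and then applied to the ligand coordinates, as described in the text.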

RESULTS
Structure Determination of VS-The structure of VS was solved by the multi-wavelength anomalous diffraction method using selenomethionine-substituted VS. The crystals formed in space group P2₁2₁2₁ with two molecules in the asymmetric unit. The model was refined to a final crystallographic R-value of 21.1% (R_free = 27.2%), using data from 20.0- to 2.6-Å resolution. The presented atomic model of VS shows all residues except N-terminal residues 1-3 from both molecules in the asymmetric unit and a surface loop (residues 235-239 for A molecule and 235-240 for B molecule). The crystallographic information is summarized in Table I. The contacts among the non-crystallographic symmetry-related dimers in the crystals are generally weak and hydrophilic in nature. The structural observations are consistent with biochemical data that VS is active as monomer as determined previously by size exclusion chromatography (14).
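For orientation, the R-values quoted above are conventionally defined as shown below. This is the standard textbook form, given here as an assumption; it is not taken from the documentation of the refinement programs used, and the symbols F_obs and F_calc (observed and scaled calculated structure factor amplitudes) do not appear in the original text.

```latex
R = \frac{\sum_{hkl} \bigl|\, |F_{\mathrm{obs}}(hkl)| - |F_{\mathrm{calc}}(hkl)| \,\bigr|}
         {\sum_{hkl} |F_{\mathrm{obs}}(hkl)|}
```

R_free is computed with the same expression, with the sums restricted to the test set of reflections excluded from refinement (3.2% of the unique reflections in this work).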
Overall Structure of VS-The structure of VS contains 14 β-strands (β1-β14) and 13 helices (α1-α13) and consists of two approximately equal-sized domains. The domains are connected with a large crossover loop (residues 201-213) that spans nearly 36 Å. Domain 1 contains a mixed 6-stranded β-sheet (β1-β2, β5-β7, β12), which is covered on both sides by 7 helices (α1-α7) (Fig. 2). Strand 12 (residues 370-372) protrudes out from domain 2 and forms part of an anti-parallel sheet in domain 1. Domain 1 also contains a pair of β-strands (β3 and β4) on the surface of the protein at one end of the central β-sheet. Domain 2 contains 6 helices and a mixed 6-stranded β-sheet (β8-β11, β13-β14). A loop from domain 2 between β-strands 9 and 10 extends into domain 1 and contacts α6. Domain 1 and domain 2 share a very similar polypeptide backbone fold; however, their topology is different. Their backbones can be aligned to within 3.1-Å r.m.s.d. over 85 amino acids. The secondary elements that correspond in the two domains include the 6-stranded β-sheet and two α-helices (α2 in domain 1 and α9 in domain 2). The sequence identity among these aligned positions is rather low, with only seven pairs of identical residues (8.2%).
Architecture of Solvent Channel and Location of Active Site-A solvent channel runs through the VS molecule (Fig. 3) and is formed between the two domains by two loops, which protrude from domain 2 to contact domain 1 (Fig. 2). The first loop is located between the two parallel strands β11 and β13 of domain 2 and includes strand β12 of domain 1. A second loop is situated between β9 and β10. The DFGWG and GN motifs in the first and second loop, respectively, are absolutely conserved throughout the BAHD superfamily (Fig. 4). The active site HXXXD sequence motif in the VS structure is located at the interface between the two domains, and the catalytic residue His160 of this motif is accessible from both sides of the channel (Fig. 3).
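The inter-domain sequence identity quoted in the Overall Structure paragraph (seven identical pairs over 85 aligned positions, 8.2%) follows from a simple ratio. The sketch below shows the arithmetic and a generic identity calculation over a gapped pairwise alignment; the function name and the toy alignment strings are hypothetical, and only the numbers 7 and 85 are taken from the text.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over positions where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != gap and b != gap]
    if not pairs:
        raise ValueError("no aligned (ungapped) positions")
    identical = sum(a == b for a, b in pairs)
    return 100.0 * identical / len(pairs)


# Figures reported in the text: 7 identical pairs over 85 aligned residues.
print(round(100.0 * 7 / 85, 1))

# Hypothetical toy alignment, not taken from the VS structure.
print(percent_identity("ACDE-F", "ACGEKF"))
```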
VS Structure Represents a Member of the CoA-dependent Acyltransferase Family-Structurally related proteins can be retrieved from secondary structure matching (www.ebi.ac.uk/msd-srv/ssm/cgi-bin/ssmserver) servers using the whole VS molecule or separate domains as search models. The closest structure to VS is the polyketide synthase-associated protein 5 ( . All of these aligned proteins are CoA-dependent acyltransferases and contain the conserved HXXXD motif in the active site. In all these acyltransferases except VibH, His of this motif plays a critical role in the CoA-dependent acyltransfer reaction mechanism (35-39). VibH also contains the HXXXD motif, and the conserved His is favorably positioned in the active site, but mutation of this His to Ala or Glu has little effect on catalysis, indicating that the HXXXD motif in VibH is not used for an equivalent role in acyltransfer catalysis (40). On the basis of the structural alignment and sequence motifs present in the protein, it is evident that VS is a new member of the CoA-dependent acyltransferase family.

The Active Site of VS and Proposed Reaction Mechanism-The HXXXD motif is highly conserved in the BAHD gene family and a number of other acyltransferases. Our previous biochemical and mutagenesis studies have shown that His160 in VS is indispensable for acetyltransferase activity (15). The VS structure presented here explains the functional importance of this residue. His160 is located in a loop between helix 5 and strand 7, situated directly in the center of the solvent channel. This structural arrangement allows the ligand (acetyl-CoA) and the substrate 16-epi-vellosimine to approach the active site independently from the front face (CoA binding) and the back face (substrate binding) of the enzyme (Fig. 3). In fact, based on kinetic data obtained previously with an enriched VS prepara-  (13) in which a ternary complex between enzyme, substrate, and co-substrate is involved. It was concluded that substrate and co-substrate are independently bound at the active site of VS, a conclusion that is in agreement with the described structure. Interestingly, the side chain of His160 of the catalytic site adopts a rather unusual conformation (χ1 = -140°, χ2 = -31°) to form an intra-residue hydrogen bond (2.9 Å) between the imidazole nitrogen Nδ1 and the carbonyl oxygen of the same amino acid. In addition, the Nδ1 of His160 is hydrogen-bonded with the carbonyl oxygen of Ala163 (2.9 Å) and with the side chain of Asn293 (3.0 Å). Structures of several related CoA-dependent acyltransferases have been solved in complex with co-factor and substrate, such as Azotobacter vinelandii dihydrolipoyl transacetylase (Protein Data Bank codes 1EAD and 1EAB) with CoA and substrate lipoamide and mouse carnitine acetyltransferase (Protein Data Bank codes 1NDB and 1NDI) with substrate carnitine and CoA (35,36). By superimposing the dihydrolipoyl transacetylase monomer on domain 1 of VS, we could map the CoA and lipoamide binding sites onto VS.
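Side-chain conformations such as the one quoted for the catalytic His are described by torsion angles: for histidine, χ1 is the dihedral N-Cα-Cβ-Cγ and χ2 is Cα-Cβ-Cγ-Nδ1. As a minimal sketch, a signed dihedral angle can be computed from four atomic positions as shown below. The function name and the test coordinates are hypothetical, and this is not the routine of any program cited in the text.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed torsion angle in degrees about the p1-p2 bond."""
    b0 = _sub(p0, p1)          # vector from p1 back to p0
    b1 = _sub(p2, p1)          # central bond
    b2 = _sub(p3, p2)          # vector from p2 to p3
    length = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / length for x in b1)
    # Components of b0 and b2 perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# Hypothetical coordinates: an eclipsed (cis) and a right-angle arrangement.
print(dihedral((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 1, 0)))
print(dihedral((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 0, 1)))
```

Applied to the N, Cα, Cβ, and Cγ coordinates of a residue, the function yields χ1; shifting the atom window by one position along the side chain yields χ2.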
In this model, His160 is located at the same position as the catalytic residue His610 in dihydrolipoyl transacetylase. Based on our VS-CoA model, it can be seen that CoA enters the solvent channel from the front face of the molecule (Fig. 3) between β-strands 11 and 13. Lipoamide binding can be mapped to the opposite side of the CoA binding site (data not shown). Based on biochemical results (13,15) and our structural analysis, we propose the following acetyl-transfer mechanism for VS. The His residue in the active site, acting as a general base, extracts the proton from the 17-hydroxyl group of 17-deacetylvinorine or from the thiol group of CoA, depending on the direction of the reversible reaction. The activated hydroxyl or thiol group can then directly attack the carbonyl carbon in acetyl-CoA or vinorine, and the reaction proceeds without the formation of an acetyl-enzyme intermediate (Fig. 5).
Role of Conserved Residues in BAHD Superfamily Enzymes-Multiple protein sequence alignment revealed that BAHD family proteins share a sequence identity of 25-34% and contain 19 amino acid residues that are absolutely conserved (Fig. 4). Interestingly, all these conserved residues belong exclusively to domain 1 (Fig. 6). Future structural analysis of other members of the enzyme family may show that strict domain-located conservation is a common feature of the superfamily.
In the HXXXD motif, His160 is hydrogen-bonded with two main chain carbonyl oxygens in addition to the side chain of Asn293. Asp164 points away from His160 and the active site. Although mutation of Asp164 to Ala resulted in complete loss of activity (15), the side chain orientation of Asp164 is such that it is not involved in hydrogen bonding with His160. Therefore, it is unlikely that these two residues function as a dyad in catalysis as proposed for human carnitine acetyltransferase (37). Asp164 is rather involved in the formation of a salt bridge with the conserved Arg279, which is most likely to be important for maintaining the geometry of the active site. Thus, Asp164 does not appear to have a direct role in catalysis, and it is most likely of structural importance, as has been discussed for several other acyltransferases (38,39). The importance of His and Asp in the HXXXD consensus sequence for other members of the BAHD family has also been demonstrated by chemical modification and mutagenesis experiments (1,9,15). The BAHD family enzymes might therefore have a similar conformation of the catalytic His and use a reaction mechanism similar to that proposed for VS.
Another highly conserved region within the BAHD acyltransferases is the DFGWG motif near the C terminus. This motif is unique for BAHD enzymes and has been suggested to be important for the catalysis or binding of CoA (1,9,15). The structure analysis of VS reveals, however, that the DFGWG motif is remote from the active site, and therefore it is unlikely to have a direct role in either substrate binding or catalysis. Modeling of CoA into the VS binding pocket also showed that this particular turn has contact with neither the pantetheine nor the adenosine moiety of CoA (Fig. 6). Therefore, this conserved sequence seems to play an important structural role by maintaining the conformational integrity of the enzyme structure rather than being involved in catalytic function. The importance of Asp in the DFGWG motif has been identified by two previous mutagenesis experiments. Its mutation to Ala caused complete loss of activity in anthocyanin 5-O-glucoside-6‴-O-malonyltransferase (9) and a 65% decrease of the catalytic activity in VS (15). The DFGWG motif is located at a turn between β11 and β12 (Fig. 6). Asp362 is part of the turn and is hydrogen-bonded with the main chain amide nitrogens of Trp365 and Gly366. Therefore, the orientation of Asp362 seems to play a vital role in maintaining the turn. As also observed in carnitine acetyltransferase and dihydrolipoyl transacetylase, β11 and β13 in domain 2 are splayed apart from each other at the front face (Fig. 6), and this may create the opening for the binding of CoA. The DFGWG motif may also have importance for maintaining the integrity of the CoA binding pocket. Modeling of CoA into the binding channel shows that several residues may have contact with CoA. However, except for His160, there are no other residues that are strictly conserved in this region.
In order to gain more detailed insight into the nature of the binding pocket and the reaction mechanism of this enzyme, the crystal structure of ligand- and substrate-bound VS will be required, and this work is now under way.
FIG. 4. [...] deoxytaxol N-benzoyltransferase from T. canadensis, hydroxyanthranilate hydroxycinnamoyltransferase from Avena sativa, and Taxadienol acetyltransferase from T. cuspidata). The Swiss-Prot accession numbers of the representative members of the BAHD family are Q9FVF1, Q94FT4, Q9ZTK5, Q6TXD2, O64988, Q9M6E2, Q8GSM7, Q8LL69, Q7XXP3, and Q9M6F0, respectively. The sequence identities of these enzymes are in the range of 25-34%. Horizontal helical segments above the sequences indicate α-helices (labeled α1-α13); horizontal arrows indicate β-strands (labeled β1-β14). The sequence numbering is shown according to VS.
Future Prospects-The biosynthesis of ajmaline, illustrated in Fig. 1, is one of the most elaborated pathways in the field of natural product biosynthesis. It is also one of the best known examples in modern proteomics research for which experimental evidence is available not only for all enzymes directly involved in the pathway but also for those catalyzing side routes (41). Together, this yields a comprehensive knowledge of alkaloid metabolism in Rauvolfia at the enzymatic level. Moreover, about half of the proteins involved in ajmaline biosynthesis have now been functionally overexpressed in E. coli. In addition to VS, two other enzymes (strictosidine synthase and strictosidine glucosidase) have been successfully crystallized, and preliminary x-ray analyses were carried out recently (42,43). The biosynthesis of ajmaline therefore offers a unique opportunity to investigate the details of alkaloid formation at a structural level in the near future. Such an investigation could deliver a much better understanding of the extraordinarily high substrate specificity, which is typical for most of the participating enzymes. It would also allow the search for a specific indole binding site at a structural level and would provide not only information on evolutionary origins but also information on the relationship of single domains or entire Rauvolfia enzymes and on an entire pathway of natural product biosynthesis.
FIG. 6. Conservation of residues belonging to domain 1 in the BAHD family. Ribbon diagram of VS (the coloring scheme is as described in Fig. 2) with modeled CoA (in black) is shown in an orientation similar to that depicted in Fig. 3A. There are 19 residues that are strictly conserved and belong exclusively to domain 1. These residues are indicated with black balls at the Cα positions and labeled with the respective residue name.