Structure of a Eukaryotic Nonribosomal Peptide Synthetase Adenylation Domain That Activates a Large Hydroxamate Amino Acid in Siderophore Biosynthesis*

Nonribosomal peptide synthetases (NRPSs) are large, multidomain proteins that are involved in the biosynthesis of an array of secondary metabolites. We report the structure of the third adenylation domain from the siderophore-synthesizing NRPS, SidN, from the endophytic fungus Neotyphodium lolii. This is the first structure of a eukaryotic NRPS domain, and it reveals a large binding pocket required to accommodate the unusual amino acid substrate, Nδ-cis-anhydromevalonyl-Nδ-hydroxy-l-ornithine (cis-AMHO). The specific activation of cis-AMHO was confirmed biochemically, and an AMHO moiety was unambiguously identified as a component of the fungal siderophore using mass spectroscopy. The protein structure shows that the substrate binding pocket is defined by 17 amino acid residues, in contrast to both prokaryotic adenylation domains and to previous predictions based on modeling. Existing substrate prediction methods for NRPS adenylation domains fail for domains from eukaryotes due to the divergence of their signature sequences from those of prokaryotes. Thus, this new structure will provide a basis for improving prediction methods for eukaryotic NRPS enzymes that play important and diverse roles in the biology of fungi.

The nonribosomal peptide synthetases (NRPSs) are large, multimodular enzymes that are ubiquitous in both bacteria and fungi. These enzymes are involved in the synthesis of a wide array of secondary metabolites that have diverse biological roles, including iron sequestration and antimicrobial, insecticidal, and antiviral activity (1,2). These compounds have had a very significant impact on human health since the discovery and development of penicillin by A. Fleming, H. Florey, and E. B. Chain in the 1920s and 1930s, and they are currently used as antibiotics, antivirals, immunosuppressants, and antitumor agents in humans. NRPSs synthesize peptides by a multiple carrier thiotemplate mechanism similar to that employed by polyketide synthases and fatty-acid synthases (1,3). In general, NRPSs are modular, with each module catalyzing the incorporation of one amino acid substrate into the growing peptide. NRPS modules are, in turn, made up of independently folding functional domains that catalyze the individual reactions of peptide synthesis. The adenylation domain selects and activates the amino acid substrate before attaching it to a 4′-phosphopantetheine (4′ppant) prosthetic group on the adjacent peptide carrier protein (PCP) domain.
NRPS adenylation domains are members of the acetyl-CoA synthetase-like superfamily (as defined by SCOP (4)). The corresponding Pfam family (PF00501) (5) contains over 17,000 sequences from ~1550 different species, including bacteria, archaea, and eukarya. Despite low sequence similarity, this superfamily shows a high degree of structural and functional conservation (6). Members of the superfamily activate carboxylic acids by an adenylation reaction, utilizing ATP and forming an enzyme-bound adenylate intermediate. In most cases, these enzymes subsequently catalyze the formation of a thioester bond to either a CoA molecule or a 4′ppant prosthetic group. Members of the superfamily play roles in the biosynthesis and degradation of an array of important primary and secondary metabolites. Different members of the superfamily activate diverse substrates ranging from very small carboxylic acids such as acetate (7) to very large substrates such as 2-amino-9,10-epoxy-8-oxodecanoic acid (8). Thus, the architecture of the binding pocket used to achieve substrate specificity by individual family members is of significant interest.
Following the publication of the first structure of an NRPS adenylation domain, PheA (the phenylalanine-activating adenylation domain from GrsA, the gramicidin S NRPS from Bacillus brevis (9)), a number of methods for predicting the specificity of NRPS adenylation domains have been developed based on the similarity of the binding pocket residues to those of previously characterized domains with known specificity (10-12). These methods work well for predicting the substrates for adenylation domains from prokaryotic NRPSs but are less successful for both eukaryotic NRPSs (13-15) and prokaryotic domains that activate unusual substrates. The three known structures of NRPS adenylation domains (9,16,17) are all of prokaryotic origin (as are all current NRPS domain structures). Furthermore, the previously determined structures of members of the acetyl-CoA synthetase-like superfamily bind and activate small and medium sized substrates, and it is expected that the binding pockets for larger substrates are substantially different (13,18).
Among the largest substrates of the acetyl-CoA synthetase-like superfamily are the Nδ-acyl-Nδ-hydroxyornithines (AHOs) found in the NRPS-synthesized fungal hydroxamate siderophores. These siderophores chelate iron through the bidentate hydroxamate group of the AHOs (19). One class of fungal hydroxamate siderophores is the ferrichromes, which are made up of three AHO residues and, in most cases, three other proteogenic amino acids (19). The particular acyl group of the AHOs differs between the various ferrichromes that have been identified and characterized (19).
A novel NRPS gene, sidN, from the Epichloë-Neotyphodium complex (phylum Ascomycota, family Clavicipitaceae) of grass endophytes has recently been cloned and characterized. The NRPS encoded by sidN is a siderophore synthetase with a three-module organization that resembles previously characterized NRPSs (13,20) that synthesize ferrichromes. The mutualistic relationship between the fungal endophytes of the Epichloë-Neotyphodium complex and the agronomic grasses that they colonize plays a vital role in the pastoral agriculture of many parts of the world by improving the tolerance of grass plants to abiotic and biotic stresses (21,22). SidN is essential for the maintenance of the mutualistic character of this relationship, and the details of the cloning and characterization of sidN will be reported elsewhere. Recently, attempts have been made to predict the specificities of the adenylation domains of siderophore synthetases using methods such as homology modeling and inspection of the predicted substrate binding pocket residues (13,18). These studies have predicted that the third adenylation domain of the three-module siderophore synthetases binds and activates the AHO residues and that these domains have a larger binding pocket than has previously been characterized (13,18). However, experimental confirmation of these predictions has been hampered by technical difficulties in producing soluble protein of eukaryotic adenylation domains in heterologous expression systems (13).
We have expressed, purified, and determined the three-dimensional structure of the third adenylation domain of SidN (SidNA3) from Neotyphodium lolii Lp19 at 2.0 Å resolution. We have also biochemically characterized this domain, confirming that SidNA3 adenylates the large nonproteogenic amino acid, Nδ-cis-anhydromevalonyl-Nδ-hydroxy-L-ornithine (cis-AMHO). In addition, this amino acid has been confirmed as a component of the fungal siderophore using mass spectroscopy. This is the first reported structure of a eukaryotic NRPS domain and thus provides details of the architecture of the specificity-determining pocket for eukaryotic adenylation domains. The structure and biochemistry of SidNA3 also extend our knowledge of the large acetyl-CoA synthetase-like superfamily for which there are only a handful of structures despite their diversity and ubiquity in nature.

EXPERIMENTAL PROCEDURES
Protein Expression and Purification-The SidNA3 domain was amplified from N. lolii Lp19 genomic DNA and cloned using the Gateway system (Invitrogen) into the pDest17 expression vector (Invitrogen). A two-step nested PCR protocol was used to introduce the sequence for a recombinant tobacco etch virus protease cleavage site into the PCR product to enable removal of the polyhistidine fusion tag (23). The plasmid was transformed into Escherichia coli Rosetta™ (DE3) cells by electroporation. Expression was performed at 18°C in ZYP-5052 autoinduction medium (24). The expressed protein (in 25 mM sodium phosphate, pH 6.5, 500 mM NaCl, 10 mM imidazole, 5 mM β-mercaptoethanol) was purified by immobilized metal affinity chromatography. The polyhistidine fusion tag was removed by digestion with recombinant tobacco etch virus protease. The cleaved SidNA3 protein was purified by subtraction immobilized metal affinity chromatography and then further purified by size exclusion chromatography using a HiLoad™ 16/60 Superdex™ 75 pg column (GE Healthcare) with 25 mM MES, pH 6.5, 500 mM NaCl, 5 mM β-mercaptoethanol as the running buffer. Seleno-methionine-substituted protein was prepared by the same protocol except that PASM-5052 medium (24) was used as the expression medium.
Mass Spectrometry of the SidN Siderophore-A liquid culture of Epichloë festucae strain Fl1 was grown for ~2 weeks at 22°C in modified Mantle media (25), with yeast extract replaced with 0.6 µM thymine. The supernatant was separated by centrifugation and freeze-dried, and 50 g of the dried material was extracted with methanol (500 ml). The extract was evaporated to dryness under reduced pressure, reconstituted in MilliQ water (200 ml), and eluted through a solid phase extraction cartridge (1 g, Isolute ENV+, International Sorbent Technology). The cartridge was washed with water (10 ml), and a siderophore fraction was eluted with methanol (10 ml). The fraction was evaporated to dryness, stored at 0°C, and reconstituted in 1:1 acetonitrile/water with 0.1% formic acid for infusion into a hybrid ion trap Fourier transform ion cyclotron resonance mass spectrometer (LTQ FT, Thermo Fisher Scientific) using static nanospray electrospray ionization in +ve mode. The spray voltage was 5.0 kV, and the capillary temperature was 225°C. Collision-induced dissociation of isolated parent ions was carried out with a collision energy of 60 units.
Nδ-cis-Anhydromevalonyl-Nδ-hydroxy-L-ornithine Purification-cis-AMHO was obtained by alkaline hydrolysis of fusigen (EMC Microcollections) (26). A solution of fusigen was adjusted to pH 12.0 using 1 N NaOH and incubated at room temperature for 1 h followed by overnight incubation at 4°C. The solution was neutralized with 1 N HCl. The resulting cis-AMHO was purified by high performance liquid chromatography using a Luna 3-µm C18(2) 100-Å 150 × 3-mm column (Phenomenex) and a water/acetonitrile gradient (no ion-pairing agent) of 0-22% over 20 min using a 0.3 ml min⁻¹ flow rate. Separation was performed on a System Gold instrument (Beckman Coulter), and detection was via absorbance at 220 nm. The cis-AMHO had a retention time of ~13 min. The cis-AMHO was dried using a Vacufuge Concentrator 5301 (Eppendorf) to remove the acetonitrile and redissolved to the required concentration in water prior to use.
Inorganic Pyrophosphate Release Assay-The release of inorganic pyrophosphate was assayed using an EnzChek pyrophosphate assay kit (Invitrogen) (27). The assays were carried out at 30°C in 100-µl volumes in a 96-well microtiter plate. Each reaction contained 50 mM Tris-HCl, pH 7.5, 10 mM MgCl2, 1 mM ATP, 0.25 mM 2-amino-6-mercapto-7-methylpurine ribonucleoside, 1 unit of purine nucleoside phosphorylase, 0.03 units of inorganic pyrophosphatase, 1 µM SidNA3, and 0.2 mM amino acid substrate. Following a 20-min incubation to remove contaminating pyrophosphate and phosphate, the reaction was initiated by addition of the amino acid substrate. The 360 nm absorbance was monitored using an EnVision 2104 multilabel plate reader (PerkinElmer Life Sciences), reading every 2.5 min. The initial rates for steady-state kinetics were measured using the same method except that 0.4 mM 2-amino-6-mercapto-7-methylpurine ribonucleoside and 2 mM ATP were used together with varying amounts of the amino acid substrate, cis-AMHO. For the steady-state kinetics measurements, two replicates were performed for each concentration, and absorbance readings were made every 10 s.
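As a minimal sketch of how initial rates from an assay of this kind can be reduced to Km and kcat, the Python below fits the Michaelis-Menten equation using the Hanes-Woolf linearization. The fitting method is an assumption chosen for simplicity (the text does not state which procedure was used), the substrate concentrations are hypothetical, the rates are synthetic values generated from the model rather than measurements, and the 1 µM enzyme concentration is assumed to carry over from the general assay conditions.

```python
# Illustrative sketch only: the rates below are synthetic, generated from the
# Michaelis-Menten equation, and are NOT data from this study.

def michaelis_menten(s, vmax, km):
    """Initial rate v at substrate concentration s: v = vmax * s / (km + s)."""
    return vmax * s / (km + s)

def hanes_woolf_fit(substrate, rates):
    """Estimate (vmax, km) by ordinary least squares on the Hanes-Woolf form
    s/v = s/vmax + km/vmax, which is a straight line in s."""
    x = list(substrate)
    y = [s / v for s, v in zip(substrate, rates)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx                    # equals 1 / vmax
    intercept = mean_y - slope * mean_x  # equals km / vmax
    vmax = 1.0 / slope
    km = intercept * vmax
    return vmax, km

ENZYME_UM = 1.0                           # assumed enzyme concentration, µM
conc_um = [5, 10, 20, 40, 80, 160, 320]   # hypothetical substrate series, µM
synthetic_rates = [michaelis_menten(s, 0.92, 40.0) for s in conc_um]  # µM/min

vmax_fit, km_fit = hanes_woolf_fit(conc_um, synthetic_rates)
kcat_fit = vmax_fit / ENZYME_UM           # min^-1, since kcat = vmax / [E]
print(f"Km = {km_fit:.1f} µM, kcat = {kcat_fit:.2f} min^-1")
```

With real, noisy data a weighted nonlinear fit of the untransformed equation is usually preferred, because linearizations distort the error structure.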
Crystallization-Crystallization was performed using the sitting drop vapor diffusion method at 18°C. Initial crystallization trials were conducted with drops consisting of 100 nl each of protein (8 mg ml⁻¹ in 25 mM MES, pH 6.5, 200 mM NaCl, 5 mM β-mercaptoethanol, 1 mM MgCl2, 2 mM ATP, and 5 mM CoA) and precipitant solutions. The initial crystals grew in 25.5% polyethylene glycol 4000, 0.17 M ammonium sulfate, and 15% glycerol. Extensive optimization of the crystallization conditions and micro-seeding resulted in diffraction-quality crystals, with the best crystals growing in 0.1 M MES, pH 6.5, 13-14% polyethylene glycol 4000, 0.01-0.02 M ammonium sulfate, and 15% glycerol using protein at 5 mg ml⁻¹ in 25 mM MES, pH 6.5, 200 mM NaCl, 5 mM β-mercaptoethanol, 1 mM MgCl2, and 2-5 mM CoA. The crystallization behavior of the seleno-methionine-substituted protein was identical to that of native protein.
Structure Determination-Data collection was performed on flash-cooled crystals at the Stanford Synchrotron Radiation Laboratory, Stanford University, Palo Alto, CA. One high resolution dataset and one multiple wavelength anomalous dispersion dataset from seleno-methionine-substituted protein crystals were collected and used for structure determination. The data were processed with MOSFLM (28) and SCALA (29). AutoSharp (30) was used to determine phase information from multiple wavelength anomalous dispersion. Nineteen selenomethionine sites were found with a figure of merit for phasing of 0.37. There were two molecules per asymmetric unit, but density was only present for one of the two C-terminal domains.
Automatic model building was carried out using RESOLVE (31) through the PHENIX AutoBuild wizard (32) followed by manual building in Coot (33) and refinement using phenix.refine (34). The high resolution dataset and experimental phases from the multiple wavelength anomalous dispersion dataset were used. Restrained individual coordinates, restrained isotropic B-factors, and TLS parameters were refined with noncrystallographic symmetry restraints applied to the N-terminal domains. For the TLS refinement, three TLS groups consisting of each of the two N-terminal domains and the ordered C-terminal domain were defined.
The final model included a total of 903 residues out of the expected 1124 residues per asymmetric unit (of the missing 221 residues, 136 were from the missing C-terminal domain). Chain A (N-terminal domain only) consisted of residues 13-40, 46-179, and 183-421, and chain B (N- and C-terminal domains) consisted of residues 12-40, 46-180, 183-426, 430-456, 465-482, and 487-535. The single C-terminal domain that was modeled appears to be mobile with weak density and a high average B-factor of 77.9 Å² compared with 30.9 Å² for the N-terminal domain in the same molecule (chain B). In the final model, 96.2% of the residues lie in favored regions, and 3.6% of the residues lie within allowed regions of the Ramachandran plot. There is just one Ramachandran outlier in each chain, which is Gly-322. This residue lies at the entrance to the amino acid binding pocket. Data collection and refinement statistics are given in Table 1.
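The residue totals quoted above can be checked directly from the listed ranges. The short Python sketch below does that arithmetic; the function and variable names are illustrative only.

```python
def count_residues(ranges):
    """Number of residues covered by a list of inclusive (start, end) ranges."""
    return sum(end - start + 1 for start, end in ranges)

# Modeled residue ranges as listed in the text.
chain_a = [(13, 40), (46, 179), (183, 421)]
chain_b = [(12, 40), (46, 180), (183, 426), (430, 456), (465, 482), (487, 535)]

EXPECTED_PER_ASU = 1124  # expected residues per asymmetric unit (from the text)

modeled = count_residues(chain_a) + count_residues(chain_b)
missing = EXPECTED_PER_ASU - modeled
print(modeled, missing)  # prints: 903 221
```

Chain A contributes 401 residues and chain B contributes 502, which reproduces the 903 modeled and 221 missing residues stated in the text.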
Ligand Docking-The energy-minimized AHO models were constructed using ChemBio3D Ultra (CambridgeSoft). Using GOLD (35), the ligand was docked into an area of the SidNA3 structure with a radius of 7-8 Å centered on the binding pocket and using the "Detect Cavity" option. The "Standard default settings" genetic algorithm parameters and the default parameters of the ChemScore (36) scoring function were used. Ten docking runs were performed for each ligand model, and the top three solutions were kept.

RESULTS
SidNA3 Expression and Purification-Successful expression and purification of the third adenylation domain of SidN (SidNA3) were achieved by making a range of constructs that were tested for soluble protein expression. Optimization of the N- and C-terminal borders of the domain was of particular importance, as differences of just 14 amino acids at either end of the construct led to either soluble or insoluble protein expression (37). Initially, longer constructs of the SidNA3 domain exhibited a strong tendency to form large soluble aggregates, as determined by size exclusion chromatography. Subsequent optimization of the construct borders and of the purification and storage buffer conditions yielded protein that was suitable for biochemical characterization and crystallization. The final 61-kDa SidNA3 protein domain comprised residues 2270-2826 of SidN.
Product of SidN-Evidence that SidN incorporates AMHO residues was provided by mass spectrometric examination of the siderophore produced by SidN from E. festucae Fl1. For these studies, E. festucae Fl1 was used as a model organism for producing the siderophore after mass spectroscopy showed that the siderophores produced by sidN from N. lolii Lp19 and E. festucae Fl1 are identical. Furthermore, a comparison of the amino acid sequence of the SidNA3 domain from E. festucae Fl1 shows 99% identity to that from N. lolii Lp19. A siderophore-rich fraction was obtained from an extract of the supernatant of an E. festucae Fl1 culture grown under iron-depleted conditions using solid phase extraction (38). The fraction was examined by high resolution Fourier transform mass spectrometry on a hybrid ion-trap ion cyclotron mass spectrometer. On infusion into the electrospray ionization source, an ion of m/z 1083.53405 ([MH]+, C46H75N12O18, calculated mass 1083.53168) was observed. This ion was not detected in samples from cultures of E. festucae Fl1 in which the sidN gene had been knocked out by gene replacement. Collision-induced dissociation of this species in the ion-trap spectrometer gave major product ions from neutral losses of 112.05040 (anhydromevalonyl moiety, C6H8O2, calculated mass 112.05188), 130.07472 (hydroxyornithyl moiety, C5H10N2O2, calculated mass 130.07368), and 242.12498 (AMHO moiety, C11H18N2O4, calculated mass 242.12611), and water. These results show that the acyl group of the three AHO residues in the SidN-produced siderophore is anhydromevalonyl. The anhydromevalonyl group can exist as either the cis or trans isomer, but the mass spectrometry results do not provide any information on which isomer is present in the SidN siderophore. Further work to fully characterize the siderophore, using high resolution NMR, is currently underway.
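The agreement between observed and calculated masses can be expressed as a parts-per-million error. The Python sketch below recomputes the [MH]+ mass from the stated elemental composition using standard monoisotopic atomic masses (with one electron mass subtracted for the single positive charge) and then reports the ppm error for each observed/calculated pair quoted in the text. The helper names are illustrative, and the ppm values are derived here rather than reported in the source.

```python
# Standard monoisotopic atomic masses (u) and the electron mass (u).
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503207,
    "N": 14.0030740048,
    "O": 15.9949146196,
}
ELECTRON_MASS = 0.000548579909

def cation_mass(composition, charge=1):
    """Monoisotopic mass of a cation whose composition already includes
    any added protons; one electron mass is removed per positive charge."""
    neutral = sum(MONOISOTOPIC[el] * n for el, n in composition.items())
    return neutral - charge * ELECTRON_MASS

def ppm_error(observed, calculated):
    """Signed mass error in parts per million."""
    return (observed - calculated) / calculated * 1e6

# [MH]+ composition given in the text: C46H75N12O18.
mh_plus = cation_mass({"C": 46, "H": 75, "N": 12, "O": 18})
print(f"[MH]+ recomputed: {mh_plus:.5f}")

# (label, observed, calculated) values exactly as quoted in the text.
quoted = [
    ("parent ion", 1083.53405, 1083.53168),
    ("anhydromevalonyl loss", 112.05040, 112.05188),
    ("hydroxyornithyl loss", 130.07472, 130.07368),
    ("AMHO loss", 242.12498, 242.12611),
]
parent_ppm = ppm_error(quoted[0][1], quoted[0][2])
for label, obs, calc in quoted:
    print(f"{label}: {ppm_error(obs, calc):+.1f} ppm")
```

The recomputed [MH]+ value agrees with the calculated mass given in the text to the fifth decimal place.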
SidNA3 Activates Nδ-cis-Anhydromevalonyl-Nδ-hydroxy-L-ornithine-Previous studies have predicted that the third adenylation domain of the three-module siderophore synthetases adenylates the AHO residues (13,18). To confirm this prediction, a continuous spectroscopic assay (27) that measures the release of inorganic pyrophosphate by the adenylation reaction was employed to investigate the substrate specificity of the SidNA3 domain. cis-AMHO was obtained by hydrolysis of the siderophore fusigen (26), a cyclic trimer of cis-AMHO monomers connected by head-to-tail ester bonds. The activity assays showed that SidNA3 specifically adenylates cis-AMHO but did not adenylate L-ornithine or any of the 20 proteogenic amino acids (Fig. 1). We were unable to source trans-AMHO, and thus we could not investigate whether SidNA3 shows any preference for the cis or trans isomer of the anhydromevalonyl group. By monitoring the steady-state kinetics of pyrophosphate release via this assay, we determined the observed Km and kcat values for adenylation of cis-AMHO to be 40.0 ± 5.0 µM and 0.92 ± 0.04 min⁻¹, respectively. The two auxiliary enzymes used in this enzyme-coupled assay were present in excess and have reported rates that suggest that the adenylation reaction was rate-limiting. Thus, the observed Km and kcat values seen here are likely to reflect the adenylation step catalyzed by SidNA3. This Km value for cis-AMHO and SidNA3 is of the same order of magnitude as the Km values previously determined for other NRPS adenylation domains and their cognate amino acid substrates (39,40). The low kcat value is the result of the slow catalytic turnover due to slow release of the enzyme-bound adenylate. It is likely that this rate would be greatly enhanced in the presence of the flanking domains of the NRPS, as the 4′ppant prosthetic group on the adjacent PCP domain would immediately react with the adenylated amino acid.
FIGURE 1. SidNA3 activity assays. Indicative plots of the accumulation of PPi, as monitored by absorbance at 360 nm, for SidNA3 activity using the 20 proteogenic amino acids, cis-AMHO, and ornithine (Orn) as substrates. The assay was conducted three times using different concentrations of enzyme and substrate. In each case, cis-AMHO was the only amino acid that was adenylated. Subsequently, duplicate assays were conducted to calculate observed kcat and Km values (see text for details). AU, absorbance units.
TABLE 1 footnotes. b … is the jth measurement of the intensity of the unique reflection h, and I(h) is the mean over all symmetry-related measurements. c R-factor = Σ|Fobs − Fcalc|/ΣFobs, where Fobs and Fcalc are the observed and calculated structure-factor amplitudes, respectively. d Rfree is the R-factor of the 5% of data (selected randomly) not used in refinement. e Data are according to the definitions of Lovell et al. (49).
Structure of SidNA3-SidNA3 was crystallized by the sitting drop method in the presence of CoA. The structure of SidNA3 was determined by multiple wavelength anomalous dispersion using seleno-methionine-substituted protein. SidNA3 has two domains (Fig. 2), an N-terminal domain (residues 12-421) and a smaller C-terminal domain (residues 422-535) (numbering for SidNA3 used here begins with the first native residue in the expression construct; Fig. 3). The N-terminal domain is made up of a distorted six-stranded antiparallel β-barrel together with an αβαβα structure formed by two predominantly parallel β-sheets (seven- and eight-stranded) flanked by α-helices. The core of the C-terminal domain is made up of a three-stranded antiparallel β-sheet surrounded by three α-helices. A long loop in the C-terminal domain, located between the final β-strand and the final α-helix, loops back toward the N-terminal domain (Fig. 2). The secondary and tertiary structures for each domain are very similar to those of PheA (9), with r.m.s.d. values of 1.6 Å for the N-terminal domain (369 residues aligned) and 1.4 Å for the C-terminal domain (79 residues aligned). Nevertheless, SidNA3 does show some significant differences to the acetyl-CoA synthetase-like superfamily. SidNA3 lacks three α-helices that are found in PheA and, in turn, has an additional two α-helices (Fig. 3). One of the three absent α-helices is at the N terminus, and some of the residues that would be expected to form this α-helix were not included in the SidNA3 construct that was used (Fig. 3). Hence, it is not clear whether this α-helix is present in the native SidNA3 domain. However, the remaining two absent α-helices and one of the two additional α-helices are within the N-terminal domain and are associated with insertions and deletions in the sequence (Fig. 3). The final additional α-helix in SidNA3 is at the C terminus. Apart from PheA, this C-terminal α-helix is present in all of the structures of members of the acetyl-CoA synthetase-like superfamily that contain this region of the protein.
SidNA3 Is in an "Open" Conformation-Most members of the acetyl-CoA synthetase-like superfamily are thought to adopt two distinct conformations to catalyze the two reactions that are performed by these enzymes, the adenylation and thiolation reactions (7,41). Previous structures show the two conformations differ by an ~140° rotation of the C-terminal domain relative to the N-terminal domain. SidNA3 was crystallized in a third conformation where the C-terminal domain is rotated ~45° relative to the orientation seen in the adenylation conformation (Fig. 4A). We describe this as an open conformation as there is a wide separation between the C- and N-terminal domains of SidNA3. There are very few interactions between the two domains or between the C-terminal domain and neighboring molecules in the crystal. The accessibility of the active site, located between the two domains at the surface of the N-terminal domain, is much greater in the open conformation of SidNA3 compared with the adenylation conformation seen in some other structures, such as that of PheA (Fig. 4B).
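One common way to quantify a domain rotation of this kind is to superimpose the N-terminal domains of two structures, derive the rotation matrix that maps one C-terminal domain onto the other, and read the angle from the trace of that matrix. The text does not state how the ~140° and ~45° values were obtained, so the Python below is only a sketch of that general approach, and the test matrices are hypothetical rotations about a single axis rather than matrices derived from any structure.

```python
import math

def rotation_angle_deg(r):
    """Rotation angle in degrees for a 3x3 rotation matrix (nested lists),
    using trace(R) = 1 + 2*cos(theta)."""
    trace = r[0][0] + r[1][1] + r[2][2]
    # Clamp to [-1, 1] to guard against floating-point round-off.
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_theta))

def rotation_about_z(angle_deg):
    """Hypothetical test matrix: rotation by angle_deg about the z axis."""
    t = math.radians(angle_deg)
    return [
        [math.cos(t), -math.sin(t), 0.0],
        [math.sin(t), math.cos(t), 0.0],
        [0.0, 0.0, 1.0],
    ]

for angle in (45.0, 140.0):
    recovered = rotation_angle_deg(rotation_about_z(angle))
    print(f"input {angle:.0f} deg, recovered {recovered:.1f} deg")
```

The trace formula returns only the magnitude of the rotation (0 to 180°); the rotation axis would have to be extracted separately.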
As would be expected from the lack of molecular interactions between the domains, the C-terminal domain of SidNA3 is not held rigidly in the open conformation and is probably quite mobile, as reflected in its weaker electron density and its high average B-factor (77.9 Å²) compared with the N-terminal domain in the same molecule (30.9 Å²). Electron density is not present for many side chains of residues in this C-terminal domain, and there is no density for several other residues (residues 427-429, 457-464, 483-486, and 536-557). Further support for C-terminal domain mobility comes from the complete absence of electron density for the C-terminal domain of the second of the two molecules in the asymmetric unit. Significantly, this missing C-terminal domain cannot occupy the same orientation as the ordered C-terminal domain of the other monomer, as this would cause clashes with neighboring molecules in the crystal. Similarly, it cannot occupy the adenylation or thiolation conformations previously reported due to the geometry of crystal packing. The missing C-terminal domain is not the result of degradation of the protein, as demonstrated by SDS-PAGE and mass spectroscopy (data not shown). The positional disorder of the C-terminal domain is consistent with the lack of interactions between the two domains in the open conformation.
FIGURE 3. … (9). The positions of the β-strands and α-helices for both structures are shown. The numbering scheme for the β-strands is that used in Conti et al. (9). The residues not present in the models are indicated by lowercase letters, and the first five N-terminal residues of SidNA3, which are cloning artifacts, are shown in gray lettering and given negative numbers. The residues involved in binding the amino acid substrate are shaded (dark gray for the standard PheA residues and light gray for the extra SidNA3 residues).
Similar open conformations with a wide separation between the two domains have been observed in four other members of the superfamily (Protein Data Bank codes 1lci (42), 1ult (43), 2vsq (17), and 3g7s). The orientation of the two domains is different in each of these structures. It is likely that the individual crystal packing arrangements stabilize the exact orientation seen in each of these structures and that the C-terminal domains are all significantly mobile in solution.
Architecture of the SidNA3 Active Site and Substrate-binding Pocket-The ATP and amino acid binding pockets of the active site were identified by comparison with PheA (9). The ATP-binding channel and the amino acid binding pocket are adjacent to each other on the face of the N-terminal domain between it and the C-terminal domain. The ATP binding pocket of SidNA3 shows the archetypal conserved motif ¹⁷⁸TSGSTGTPK¹⁸⁶, which is similar to the P-loop seen in many ATP-binding proteins (44). This loop is thought to be flexible and has been observed in many conformations in the structures of acetyl-CoA synthetase-like superfamily members. In SidNA3, there is also evidence for flexibility insofar as three amino acids are missing due to poor electron density (Gly-180, Ser-181, and Thr-182).
The amino acid binding pocket of SidNA3 is much larger than that of PheA (Fig. 5), consistent with it binding the large amino acid substrate AMHO. In contrast to PheA, the SidNA3 pocket is defined by the main chain and side chains of 17 residues (Phe-198, Trp-202, Ile-206, Phe-222, Asp-231, Val-232, Gly-235, Glu-236, Leu-239, Gly-272, Tyr-293, Gly-295, Val-296, Gly-297, Val-320, Ile-328, and Gly-329). These comprise the equivalents of the nine residues that line the PheA substrate binding pocket, together with eight additional residues (Phe-198, Trp-202, Ile-206, Phe-222, Glu-236, Leu-239, Tyr-293, and Val-296) that lie deeper in the interior of the N-terminal domain, where they bind the hydrophobic anhydromevalonyl moiety. The larger size of the binding pocket in SidNA3 when compared with PheA is the result of the substitution of a tryptophan residue that defines the bottom of the binding pocket (Trp-239 in PheA) with a glycine residue (Gly-235 in SidNA3) (Fig. 5B). There are five glycines in total lining the SidNA3 pocket, giving a large hydrophobic cavity containing only a few potential hydrogen bonding groups.
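The residue bookkeeping in this paragraph (17 pocket residues, of which eight are additional relative to PheA and five are glycines) can be verified with a few lines of Python. The residue labels are taken from the text; the variable names are illustrative.

```python
# The 17 pocket-lining residues of SidNA3, as listed in the text.
sidna3_pocket = [
    "Phe-198", "Trp-202", "Ile-206", "Phe-222", "Asp-231", "Val-232",
    "Gly-235", "Glu-236", "Leu-239", "Gly-272", "Tyr-293", "Gly-295",
    "Val-296", "Gly-297", "Val-320", "Ile-328", "Gly-329",
]

# The eight residues described as additional to the PheA-equivalent set.
additional = {
    "Phe-198", "Trp-202", "Ile-206", "Phe-222",
    "Glu-236", "Leu-239", "Tyr-293", "Val-296",
}

# Residues equivalent to the nine that line the PheA pocket.
phea_equivalent = [r for r in sidna3_pocket if r not in additional]
glycines = [r for r in sidna3_pocket if r.startswith("Gly-")]

print(len(sidna3_pocket), len(phea_equivalent), len(glycines))  # prints: 17 9 5
print(phea_equivalent)
```

The nine PheA-equivalent positions recovered this way include four of the five glycines plus Gly-235, the residue that replaces the pocket-floor tryptophan of PheA.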
Previous homology modeling had predicted that the AHO-adenylating domains of three-module siderophore synthetases would have larger pockets than PheA (13,18); however, the observed architecture of the SidNA3 pocket differs from that predicted. Schwecke et al. (13) predicted that the binding pockets of AHO-adenylating domains would be lined by three extra residues, beyond the nine seen in PheA. Only one of these three predicted extra residues (Phe-222) is seen to line the binding pocket of SidNA3. Similarly, Bushley et al. (18) predicted that seven residues, in addition to the nine seen in PheA, had the potential to contribute to the pocket of AHO-adenylating domains. For SidNA3, only two of these predicted residues (Glu-236 and Leu-239) were observed to line the pocket. Thus, the experimental determination of the SidNA3 structure provides a new template to advance our understanding of the determinants of substrate binding in NRPSs that adenylate large amino acids and, in particular, those that synthesize hydroxamate siderophores.
Open Conformation of SidNA3 Has Implications for Substrate Binding-In PheA, the α-amino and α-carboxylate groups of the phenylalanine ligand are held in place by interactions with Asp-235 and Lys-517 (which is also involved in binding ATP) (Fig. 5C) (9). The equivalent residue to Asp-235 in SidNA3, Asp-231, is in an almost identical position (Fig. 5C). However, Lys-517, and its equivalent in SidNA3, Lys-526, are located on a loop that extends from the C-terminal domain into the active site. Due to the different C-terminal domain orientation in the SidNA3 structure, Lys-526 is shifted ~11 Å from the equivalent position of Lys-517 in the PheA structure. Another consequence of the difference in the position of the C-terminal domain loop is that in the SidNA3 structure the side chain of the conserved aromatic residue Phe-230 is in a different conformation to that of the equivalent residue in PheA, Phe-234. In PheA (9), the C-terminal domain loop restricts the rotation of the side chain of Phe-234, ensuring that it points toward the amino acid binding pocket where it forms a binding surface for the main chain atoms of the phenylalanine ligand, restricting their movement (Fig. 5C). In contrast, in the SidNA3 structure this surface does not exist as the orientation of the C-terminal domain loop allows the Phe-230 side chain to point away from the amino acid binding pocket, opening up the mouth of the pocket (Fig. 5C). To correctly bind the amino acid ligand, a conformational change from the open conformation to the adenylation conformation would be required to move the C-terminal domain loop into a position such that Phe-230 and Lys-526 can take on their roles in ligand binding.
Ligand Docking of Nδ-Acyl-Nδ-hydroxyornithines-To gain insight into the binding specificity of SidNA3, we conducted in silico ligand docking with the SidNA3 structure, using a library of AHOs containing the eight acyl groups that have been observed in siderophores (Fig. 6). Each stereoisomer was modeled separately, resulting in a library of 32 compounds. The ligand docking was carried out with GOLD (35), and binding poses were assessed using the ChemScore (36) scoring function. Consistent with SidNA3 being an AMHO-adenylating domain, the top-ranked ligand from the docking run was cis-AMHO, with Nδ-cis-anhydromevalonyl-Nδ-hydroxy-D-ornithine ranked second (Table 2). Visualization of the binding poses of the top three solutions for the cis-AMHO ligand showed that the ligand is placed in an extended conformation in the binding pocket, with all three solutions being well clustered with an r.m.s.d. value of 0.34 Å (Fig. 5D). All of the 10 top-ranked ligands, except Nδ-cis-5-acetoxy-3-methylpent-2-enoyl-Nδ-hydroxy-D-ornithine, showed an identical hydrogen bonding pattern. The terminal hydroxy group of the ligand hydrogen bonds to the side chain hydroxyl of Tyr-293, and the Nδ-hydroxy of the hydroxamate group hydrogen bonds to the main chain carbonyl oxygen of Ile-328 (Fig. 5D). In contrast, the likely hydrogen bond to the carbonyl oxygen of the hydroxamate group remains unsatisfied in all of the docking solutions, which would be energetically unfavorable. The hydroxamate group is able to adopt either the synperiplanar or antiperiplanar conformations (45). This group was allowed to flip during the GOLD docking, with the ligand adopting the antiperiplanar conformation in the top-ranked GOLD solutions. In any case, there are no obvious hydrogen bonding partners in the binding pocket for either ligand conformation.
However, the pocket is very wide at this point, and it is possible that a water molecule could be involved in satisfying the hydrogen bonding potential of this carbonyl oxygen and bridging between the ligand and main chain atoms. Despite extensive crystallization experiments, we have been unable to determine a structure of SidNA3 with cis-AMHO in the binding pocket.
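The clustering of docking solutions described above is summarized by an r.m.s.d. between binding poses. The following is a minimal sketch of such a calculation, assuming that poses are given as lists of heavy-atom coordinates in the same atom order and in the receptor frame (so no superposition is applied). The function names and the toy coordinates are hypothetical and are not taken from GOLD or from the SidNA3 docking data; how the 0.34 Å value was computed in the original analysis is not stated in the text.

```python
import math
from itertools import combinations


def pose_rmsd(pose_a, pose_b):
    """Root mean square deviation between two poses (no superposition).

    Each pose is a list of (x, y, z) tuples in angstroms, with atoms
    listed in the same order in both poses.
    """
    if not pose_a or len(pose_a) != len(pose_b):
        raise ValueError("poses must be non-empty and have the same atom count")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(pose_a, pose_b)
    )
    return math.sqrt(squared / len(pose_a))


def max_pairwise_rmsd(poses):
    """Largest r.m.s.d. over all pairs of poses: one simple measure of cluster spread."""
    return max(pose_rmsd(a, b) for a, b in combinations(poses, 2))


# Toy two-atom "ligand" in three hypothetical poses (illustration only).
pose_1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
pose_2 = [(0.3, 0.0, 0.0), (1.3, 0.0, 0.0)]  # pose_1 shifted 0.3 A along x
pose_3 = [(0.0, 0.4, 0.0), (1.0, 0.4, 0.0)]  # pose_1 shifted 0.4 A along y

print(round(pose_rmsd(pose_1, pose_2), 2))
print(round(max_pairwise_rmsd([pose_1, pose_2, pose_3]), 2))
```

Because docking poses share the coordinate frame of the fixed receptor, leaving out the superposition step keeps rigid-body shifts of the ligand in the pocket visible in the r.m.s.d.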
Comparison of the Binding Pockets of Other Siderophore Synthetases-To further investigate the binding of AHOs to the adenylation domains of siderophore synthetases, we compared the pocket-lining residues of a range of domains that adenylate these substrates. AHO-adenylating domains from other fungal siderophore synthetases were identified by searching the relevant literature. A multiple sequence alignment containing SidNA3 and these domains was produced using T-Coffee (46), and 15 residues predicted to line the pocket for these adenylation domains were extracted from the multiple sequence alignment (Table 3). These putative pocket-lining residues were the equivalents of those lining the SidNA3 pocket apart from Asp-    (Table 3).
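The extraction of pocket-lining residues from a multiple sequence alignment can be sketched as follows. This is an illustrative outline only, assuming that the alignment is available as a mapping of names to equal-length gapped strings and that pocket positions are given in the residue numbering of a reference sequence. The function names, the toy alignment, and the positions are hypothetical; they are not the T-Coffee output or the SidNA3 pocket residues.

```python
def reference_to_column(aligned_ref, first_residue_number=1, gap="-"):
    """Map residue numbers of the reference sequence to alignment columns."""
    mapping = {}
    number = first_residue_number
    for column, symbol in enumerate(aligned_ref):
        if symbol != gap:
            mapping[number] = column
            number += 1
    return mapping


def extract_signatures(alignment, ref_name, pocket_positions, first_residue_number=1):
    """Return, for every aligned sequence, the residues found in the
    alignment columns that correspond to the reference pocket positions."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    mapping = reference_to_column(alignment[ref_name], first_residue_number)
    columns = [mapping[pos] for pos in pocket_positions]
    return {
        name: "".join(seq[col] for col in columns)
        for name, seq in alignment.items()
    }


# Toy alignment (hypothetical sequences, for illustration only).
toy_alignment = {
    "reference": "MK-DAFGT",
    "domain_b": "MRSDA-GS",
}

# Reference residues 2, 5 and 7 map to alignment columns 1, 5 and 7.
signatures = extract_signatures(toy_alignment, "reference", [2, 5, 7])
print(signatures)
```

A gap character in an extracted signature marks a pocket position with no aligned equivalent in that domain, which is one way in which pocket-lining residues can differ between domains.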
Common Binding Pocket Architectures of the Acetyl-CoA Synthetase-like Superfamily-The structures of 18 other members of the acetyl-CoA synthetase-like superfamily are available. To gain insight into the common architecture of the binding pockets of the superfamily, we compared the pockets of these structures with each other and with that of SidNA3 (Table 4). Three of the structures, the long chain fatty acyl-CoA synthetase from Thermus thermophilus (43) and the luciferase enzymes from the Japanese (47) and American (42) fireflies, have larger binding pockets than SidNA3 (Table 4). Strikingly, these larger pockets are lined by the structural equivalents of most of the residues that line the SidNA3 pocket plus some additional residues. The additional size of the long chain fatty acyl-CoA synthetase pocket is unsurprising as it binds and activates myristate and palmitate, substrates that are larger than those of SidNA3. The larger size of the luciferase binding pocket in comparison with that of SidNA3 is somewhat surprising as the substrate, luciferin, is similar in size to cis-AMHO. However, most of the extra binding pocket residues seen, in addition to those in SidNA3, do not appear to be in contact with the luciferin ligand; rather, they play a catalytic role (47). The remaining 14 structures have binding pockets that are similar in size to or smaller than that of SidNA3 (Table 4), and the pockets of these enzymes are mainly lined by a subset of residues equivalent to those in SidNA3. The binding pocket of SidNA3 defines the upper limits of NRPS adenylation domain binding pockets, as cis-AMHO is one of the largest amino acids to be incorporated into secondary metabolites by NRPSs.

DISCUSSION
We have determined the structure of SidNA3, the first structure of a eukaryotic NRPS domain. Activity assays clearly show that SidNA3 adenylates cis-AMHO but not any of the 20 proteinogenic amino acids or L-ornithine. This is the first experimental evidence in support of the predictions that the third module of three-module siderophore synthetases adenylates AHO residues (13,18). Consistent with its binding of cis-AMHO, one of the largest substrates of the acetyl-CoA synthetase-like superfamily, the amino acid binding pocket of SidNA3 is much larger than that of PheA (9). The SidNA3 binding pocket is lined by 17 rather than 9 residues (as seen for PheA), and it defines the upper limits of NRPS adenylation domain binding pockets. Previous homology modeling predicted that the AHO-activating domains would have larger pockets than PheA (13, 18) but did not correctly identify all of the residues involved in substrate binding. This highlights the value of an experimentally determined structure in understanding substrate specificity for the large family of eukaryotic NRPS enzymes.
The N-and C-terminal domains in the SidNA3 structure are arranged in an open conformation previously unseen in NRPS adenylation domains. This conformation is distinct from the two catalytically active conformations thought to be adopted by members of the acetyl-CoA synthetase-like superfamily (7).
However, similar open conformations have previously been observed in four other members of the superfamily (Protein Data Bank codes 1lci (42), 1ult (43), 2vsq (17), and 3g7s). The exact orientation of the C-terminal domain relative to the N-terminal domain is different in each of these structures. Nevertheless, the structures are similar in that there is a wide separation between the two domains and a highly accessible active site. Given the lack of interactions between the two domains in the open conformation, it is likely that the exact orientation of the C-terminal domain seen in these structures is stabilized by crystal packing and that the C-terminal domain is mobile in solution. This is supported by the high mobility of the C-terminal domains observed in the SidNA3 structure. Furthermore, in the structure of the open conformation of the long chain fatty acyl-CoA synthetase (43), the C-terminal domain of each monomer of the enzyme, which forms a domain-swapped dimer, is in a different open orientation (43). The recent structure of the surfactin synthetase termination module (SrfAC) from Bacillus subtilis (17) has provided further insights into the relevance of the open conformations for NRPS adenylation domains. This structure shows that the condensation domain and the N-terminal part of the adenylation domain associate closely to form a catalytic platform. The PCP domain and the C-terminal part of the adenylation domain rearrange on this platform to move the tethered peptide product between the reaction centers (17). It seems probable that the conformational changes of the adenylation domain would be coordinated with rearrangements in the positioning and conformation of the PCP domain. This is further reinforced by the fact that the 4′ppant arm in SrfAC is not long enough to reach each of the catalytic centers, implying considerable mobility among the domains of NRPSs. Thus, in addition to the adenylation and thiolation conformations, the adenylation domain may also adopt at least two other stable conformations that allow the PCP domain and 4′ppant to be positioned in the reaction centers of the preceding and successive domains in the peptide synthesis reaction pathway. In the SrfAC structure, the PCP domain is positioned such that the 4′ppant arm would be able to reach into the active site of the preceding condensation domain, suggesting that the module is in a conformation suitable for catalyzing the condensation reaction (17). The adenylation domain is in an open conformation with the C-terminal domain stabilized by interactions with the neighboring condensation domain (17). SidNA3 was crystallized in isolation from its neighboring domains and, hence, cannot form similar interactions, which would explain the high mobility of the C-terminal domain.
TABLE 3
Amino acid residues lining the binding pockets of Nδ-acyl-Nδ-hydroxyornithine-activating adenylation domains
a Data were predicted but experimentally unconfirmed.
TABLE 4
The pocket-lining residues in the acetyl-CoA synthetase-like superfamily
a Only the main chain atoms of these residues line the pocket.
b The binding pocket exposed to solvent in the FAAL28 structure, which was solved in unliganded form, is too small to accommodate its long chain fatty acid substrate. There is an internal cavity in the protein closed off by Met-233. Met-233 is the equivalent of the Trp-234 residue in the long chain fatty acyl-CoA synthetase, which closes off the entrance to the binding pocket in the unliganded form and opens up upon ATP binding. Thus, it is likely that Met-233 in FAAL28 is performing a similar function and that the true binding pocket is much larger.
c In the long chain fatty acyl-CoA synthetase, sequence alignments place Asn-232 as the equivalent of the PheA residue 236, but structural comparisons show that it is actually Trp-234 that is the structurally equivalent residue due to an alternate main chain route.
The residues lining the substrate binding pocket of PheA (9) have previously been used to develop methods to predict the specificities of newly identified adenylation domains (10-12). The specificity of a novel adenylation domain is predicted based on its similarity to the signature sequences of adenylation domains with known substrates. These methods work well for prokaryotic NRPSs but are generally unsuccessful for eukaryotic NRPSs (13). Of significant interest is whether the SidNA3 binding pocket is able to discriminate between acyl-hydroxyornithines with the various acyl groups seen in siderophores (Fig. 6). In silico ligand docking of AHOs into the SidNA3 structure was undertaken and showed cis-AMHO as the top-ranked substrate, consistent with the fact that AMHO is the in vivo substrate. Visualization of the in silico-bound cis-AMHO shows that the terminal alcohol of the ligand is hydrogen-bonded to the side chain hydroxyl of Tyr-293. Tyr-293 is one of the few hydrogen bonding partners in the SidNA3 binding pocket and is located at the very base of the pocket, which is otherwise a largely hydrophobic site. Furthermore, comparisons of the pocket-lining residues of other AHO-adenylating domains showed that the SidNA3 domain clustered with those known to activate cis-AMHO (Table 3). The discrimination among the AHOs that are used to synthesize the siderophore in vivo may also be due to limitations in the pool of AHOs available in the cell. Further experimental investigation is necessary to resolve the questions of the specificity of the SidNA3 domain for AHO residues. Nevertheless, our results demonstrate the specificity of SidNA3 for AHOs over ornithine or any of the other 20 common amino acids.
The structure of SidNA3 shows that although the substrate binding pockets of eukaryotic NRPS adenylation domains are constructed along fundamentally similar principles to their prokaryotic counterparts, their elaboration to accommodate more complex substrates is not readily predictable from sequence comparison alone. The low success rate of specificity prediction methods for eukaryotic domains is due to the divergence in their signature sequences from those of prokaryotes. The structure of SidNA3 therefore provides a springboard for the development of more accurate specificity prediction methods for the abundant eukaryotic NRPS adenylation domains.
It is possible that eukaryotes utilize a more diverse set of signature sequences than the prokaryotes. Furthermore, it is perhaps rather simplistic to attempt prediction of NRPS substrate specificities using just the nine residues that line the PheA pocket. It is obvious from the comparisons of the substrate binding pockets of superfamily members that the size of the pocket varies considerably. By limiting the comparisons to the nine PheA residues, some of the important specificity-determining residues may be missed, especially for larger substrates. Expanding the number of residues considered to match those found in the SidNA3 pocket will serve to improve the coverage of the possible specificity-determining residues. This would be at the expense of increasing the noise for smaller pockets, however, and the pocket size itself is probably a key specificity determinant for many of these enzymes. Thus, it is likely that the greater use of structurally based tools such as homology modeling based on appropriate templates, employed to inform comparisons of the unknown domains with those of known specificity, will prove more fruitful than simple sequence comparisons. Development of an accurate substrate prediction method for domains of unknown specificity is the subject of current work.
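The signature-based prediction discussed above, in which a domain of unknown specificity is assigned the substrate of the most similar known signature, can be illustrated with a minimal sketch. It assumes a plain fraction-identity score over signatures of equal length; this is a simplification for illustration and is not the scoring used by the published methods (10-12). The signatures, substrate labels, and function names below are hypothetical.

```python
def signature_identity(sig_a, sig_b, gap="-"):
    """Fraction of positions at which two equal-length signatures carry
    the same residue; gap characters never count as a match."""
    if not sig_a or len(sig_a) != len(sig_b):
        raise ValueError("signatures must be non-empty and of equal length")
    matches = sum(a == b and a != gap for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def predict_substrate(query, known_signatures):
    """Return (substrate, score) for the known signature most similar to the query.

    known_signatures maps a signature string to a substrate label.
    """
    best_signature = max(
        known_signatures, key=lambda sig: signature_identity(query, sig)
    )
    score = signature_identity(query, best_signature)
    return known_signatures[best_signature], score


# Hypothetical nine-residue signatures (not real adenylation domain data).
known = {
    "DAKLMNPQR": "hypothetical_substrate_X",
    "DGHIKLSTV": "hypothetical_substrate_Y",
}

substrate, score = predict_substrate("DAKLMNPQW", known)
print(substrate, round(score, 2))
```

The same functions accept signatures of any common length, so a comparison over nine positions can be repeated over a longer set of pocket-lining positions, provided that all signatures are extracted from the same alignment columns.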