Chain Initiation in the Leinamycin-producing Hybrid Nonribosomal Peptide/Polyketide Synthetase from Streptomyces atroolivaceus S-140

Nonribosomal peptide natural products are biosynthesized from amino acid precursors by nonribosomal peptide synthetases (NRPSs), which are organized into modules. In a typical NRPS initiation module, an adenylation (A) domain activates an amino acid and installs it onto a peptidyl carrier protein (PCP) domain as a thioester; in an elongation module, a condensation (C) domain located between every consecutive pair of A and PCP domains catalyzes the formation of the peptide bond between the upstream aminoacyl/peptidyl-S-PCP and the free amino group of the downstream aminoacyl-S-PCP. D-Amino acid constituents in peptide natural products usually arise from the L-enantiomers through the action of integral epimerization (E) domains of an NRPS. The biosynthetic gene cluster for leinamycin, a hybrid nonribosomal peptide/polyketide containing a D-alanine moiety, does not encode a typical NRPS initiation module with the expected A-PCP-E domains; instead, it has only an A protein (LnmQ) and a PCP (LnmP), which are encoded by separate genes. Here we report biochemical experiments showing that (i) LnmQ directly activates D-alanine as D-alaninyl-AMP and installs it onto LnmP to generate a D-alaninyl-S-PCP intermediate; (ii) aminoacylation of LnmP by LnmQ in trans results from specific communication between the separate A and PCP proteins; and (iii) leinamycin production can be improved by supplementing the fermentation broth of Streptomyces atroolivaceus S-140 with exogenous D-alanine. These findings unveil an unprecedented NRPS initiation module structure that is characterized by a discrete D-alanine-specific A protein and a PCP.

Nonribosomal peptide natural products include many clinically important drugs such as vancomycin (antibacterial), bleomycin (anticancer), and cyclosporine (immunosuppressant). They are biosynthesized from amino acid precursors by nonribosomal peptide synthetases (NRPSs), multifunctional enzymes organized into modules (1). Three groups of NRPS systems are known to date (2). (i) Type A (linear) NRPSs are analogous to type I polyketide synthases (PKSs) (3,4), in which each module harbors a set of distinct, noniteratively acting activities responsible for the catalysis of one cycle of peptide chain elongation. In this system, the amino acid sequence of the resultant linear peptide chain is co-linear with the number and order of the modules. (ii) Type B (iterative) NRPSs use their modules or domains more than once in the assembly of a single product. This strategy is employed to build up peptide chains containing repeated smaller sequences. (iii) Type C (nonlinear) NRPSs harbor at least one unusual arrangement of the modules or core domains, with two primary differences from typical type A NRPSs. One is the unusual internal cyclization or branch point synthase that leads to a deviation from the normal function of a linear NRPS, resulting in a nonlinear peptide product. The other is the specialized condensation or ligase domain that incorporates small soluble molecules, such as amines, into the assembled peptides.
Three enzymatic activities, normally specified by a module that contains condensation-adenylation-peptidyl carrier protein (C-A-PCP) domains, are necessary for one complete elongation cycle of peptide synthesis catalyzed by an NRPS. The A domain selects the correct amino acid from the pool of available substrates and activates it as an aminoacyl adenylate, and the activated amino acid is then transferred onto the -SH group of the 4'-phosphopantetheinyl (4'PP) arm attached to the PCP domain. Next, the C domain, usually present between every consecutive pair of A and PCP domains, catalyzes peptide bond formation between the upstream aminoacyl/peptidyl-S-PCP and the free amino group of the downstream aminoacyl-S-PCP, thus facilitating the translocation of the growing peptide chain onto the next module. Correspondingly, a typical initiation module requires two catalytic activities arranged as an A-PCP didomain for the formation of a starter aminoacyl-S-PCP; a native elongation module can be biochemically converted into an initiation module by deleting the C domain (5).
One of the hallmarks of nonribosomal peptide natural products is the presence of D-amino acids. For example, three of the four amino acids of glycopeptidolipids (6), three of the seven amino acids of mycosubtilin (7), and four of the seven amino acids of vancomycin (8) have the D-configuration. In general, the A domains of NRPSs select the readily available L-amino acids for activation, and epimerization (E) domains generate the D-configuration while the amino acid is tethered to the PCP (1,2,3,5). In a few cases, D-Ala is the only exception to this rule. A number of nonribosomal peptide natural products, such as HC toxin (9), cyclosporine (10), daptomycin (11), microcystin (12), dendroamide A (13), and ramoplanin (14), contain D-Ala. Sequence data suggest that an internal C-A-PCP-E module, which activates L-Ala and then epimerizes it into D-Ala, is present in the NRPS genes for daptomycin (11) and microcystin (12) biosynthesis. The NRPS module for D-Ala incorporation in cyclosporine or HC toxin biosynthesis, however, does not contain an integrated E domain and cannot activate L-Ala; instead, D-Ala is among the pool of available amino acids and is directly activated. In these two cases, D-Ala is provided by a dedicated Ala racemase that is encoded by a separate gene (15,16).
Leinamycin (LNM) (Fig. 1A), a hybrid nonribosomal peptide/polyketide natural product containing a D-Ala, exhibits potent antitumor activity, especially against tumors that are resistant to clinically important anticancer drugs (17). We have reported on the identification, localization, sequencing analysis, and genetic and biochemical characterization of the biosynthetic gene cluster for LNM biosynthesis in Streptomyces atroolivaceus S-140 (18-21). Sequencing analysis of the lnm gene cluster revealed discrete NRPS PCP (LnmP) and A (LnmQ) proteins, instead of a conventional A-PCP-E start module or an A-PCP module for direct loading of D-Ala. The essential role of LnmQ in LNM biosynthesis was proven by gene inactivation and complementation experiments (20). Here we show in vitro that LnmQ directly activates D-Ala and loads it onto the -SH group of the 4'PP moiety in LnmP as a thioester to generate a D-alaninyl-S-PCP intermediate. Transfer of D-Ala between LnmQ and LnmP is mediated through specific interactions between the A and PCP proteins and does not involve dissociation of the aminoacyl-AMP followed by adventitious thiolation by the free-standing PCP in solution. Furthermore, we show that LNM production can be improved by supplementing the fermentation broth of S. atroolivaceus with exogenous D-Ala. These findings provide new insights into the biochemistry of NRPS initiation and new opportunities for the generation of LNM analogs through engineering of the NRPS initiation module.

EXPERIMENTAL PROCEDURES
Bacterial Strains, Plasmids, Biochemicals, and Chemicals-Escherichia coli DH5α was used as the host for general subcloning, and E. coli BL21(DE3) was used as the host for protein expression (Novagen, Madison, WI). An LNM producer, S. atroolivaceus S-140, and authentic LNM were gifts from Kyowa Hakko Kogyo Co. Ltd. (Machida, Tokyo, Japan). The cloning vector pGEM-9Zf(+) (Promega, Madison, WI) and the protein expression vectors pET28a and pET37b (Novagen) were from commercial sources. D-Ala was from Sigma, and all other common biochemicals and chemicals were from standard commercial sources.
DNA Isolation and Manipulation-Plasmid preparation and DNA fragment extraction were carried out by using commercial kits (Qiagen, Santa Clarita, CA). Restriction enzyme digestions and ligations were done by standard methods (22). PCR primer synthesis and DNA sequencing were carried out at the University of Wisconsin-Madison Biotechnology Center. PCRs were performed on a GeneAmp 2400 thermocycler (PE-ABI, Foster City, CA).
Expression of lnmQ and lnmP Genes in E. coli and Purification of the Resultant Proteins-The lnmQ gene was amplified by PCR from cosmid pBS3007 (18) with the following primer pair: 5'-CG GAA TTC CAT ATG AGC GGC GCC AAG CTG C (EcoRI-NdeI) and 5'-CG CGC AAG CTT GGA CGC CGG GGC GAG GTT (HindIII). (The restriction enzyme sites introduced into the N- and C-terminal portions of the gene are named in parentheses after each primer sequence.) PCR products were purified by agarose gel electrophoresis and extraction from gel slices, digested with EcoRI/HindIII, and ligated into the same sites of pGEM-9Zf(+) to yield plasmid pBS3048. This plasmid was identified by restriction enzyme digestion and confirmed by sequencing, and then its NdeI/HindIII fragment was cloned into the same sites of pET37b to make the expression plasmid pBS3049. The resulting plasmid, as well as pBS3021, an lnmP expression plasmid constructed previously (19), was transformed separately into E. coli BL21(DE3) for gene expression. For overproduction of the LnmQ and LnmP proteins, cells harboring pBS3049 (for lnmQ) or pBS3021 (for lnmP) (19) were grown in LB medium supplemented with 50 μg/ml kanamycin. Cultures (2 × 500 ml) were grown to an A600 of 0.4-0.6 at 37°C and then cooled to 18°C for 30 min, and gene expression was induced by the addition of 0.2 mM IPTG. Cultures were grown for an additional 18 h. The His6- or His8-tagged fusion proteins were purified with nickel-nitrilotriacetic acid affinity resin according to the manufacturer's manual (Qiagen, Valencia, CA), and the resultant proteins were dialyzed against 25 mM Tris-HCl (pH 7.5), 25 mM NaCl, 2 mM DTT, and 10% glycerol and stored at -80°C.
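As a quick consistency check on the primer design above, the short Python sketch below searches the two primer sequences, copied from the text, for the recognition sequences of the enzymes named in parentheses. The recognition sequences are the standard ones for EcoRI, NdeI, and HindIII; the function name find_sites is a hypothetical helper introduced only for this illustration.

```python
# Standard recognition sequences of the enzymes named in the text.
SITES = {"EcoRI": "GAATTC", "NdeI": "CATATG", "HindIII": "AAGCTT"}

# Primer sequences as printed in the text (5' to 3'), with the spacing removed.
FORWARD = "CG GAA TTC CAT ATG AGC GGC GCC AAG CTG C".replace(" ", "")
REVERSE = "CG CGC AAG CTT GGA CGC CGG GGC GAG GTT".replace(" ", "")


def find_sites(primer, sites=SITES):
    """Return {enzyme: 0-based start index} for each site found in the primer."""
    found = {}
    for enzyme, motif in sites.items():
        index = primer.find(motif)
        if index != -1:
            found[enzyme] = index
    return found


if __name__ == "__main__":
    print("forward primer:", find_sites(FORWARD))
    print("reverse primer:", find_sites(REVERSE))
```

The forward primer carries adjacent EcoRI and NdeI sites and the reverse primer carries a HindIII site, which matches the two-step cloning route described (EcoRI/HindIII into pGEM-9Zf(+), then NdeI/HindIII into pET37b).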
Amino Acid Specificity Assay of the LnmQ Protein-Amino acid-dependent ATP-sodium pyrophosphate (PPi) exchange assays were performed according to literature procedures (18, 23-25). A typical 100-μl assay reaction contained 125 nM LnmQ, 3 mM ATP, 0.1 mM PPi with 0.5 μCi of [32P]PPi (40.02 Ci/mmol; PerkinElmer Life Sciences), 5 mM MgCl2, 0.1 mM EDTA, and 0.5 mM of the amino acid under test in 75 mM Tris-HCl (pH 7.5) buffer. After 30 min of incubation at 30°C, the assays were stopped by the addition of 0.5 ml of 1% (w/v) activated charcoal in 4.5% (w/v) tetrasodium pyrophosphate and 3.5% (v/v) perchloric acid. The precipitates were collected on glass fiber filters (2.4 cm, G-4, Fisher), washed successively with 10 ml of 40 mM sodium pyrophosphate plus 1.4% perchloric acid, 10 ml of water, and 5 ml of ethanol, and briefly dried in air. The filters were mixed with 7 ml of scintillation fluid (ScintiSafe Gel, Fisher) and counted on a Beckman LS-6800 scintillation counter to determine the radioactivity. To determine the kinetic parameters for the substrates, reactions were carried out at 30°C in a total volume of 100 μl that contained 75 mM Tris-HCl (pH 7.5), 10 mM MgCl2, 5 mM DTT, 3 mM ATP, 50 nM LnmQ, 1 mM PPi with 0.8 μCi of [32P]PPi (12.1 Ci/mmol; PerkinElmer Life Sciences), and increasing concentrations of D-Ala (0.1, 0.2, 0.4, 0.8, 1.6, 3.2, 5.0, 7.5, 10, and 15 mM) for 5 min. For Gly, the concentrations were 2, 5, 10, 20, 30, 40, 60, 80, 100, 120, 150, 200, 300, and 400 mM, and the reaction time was 10 min. Initial rate conditions were maintained by keeping the conversion of the substrate amino acid below 10%, on the assumption that these conditions were within the linear range of enzyme concentration and enzyme turnover. The data were fitted to the Michaelis-Menten equation to extract the kinetic constants.
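The text does not say how the Michaelis-Menten fit was performed, so the following Python sketch is only one possible way to extract Km and Vmax from initial-velocity data. It uses a Hanes-Woolf linearization (S/v plotted against S) rather than nonlinear regression, because that needs nothing beyond the standard library. The velocities are synthetic, noise-free values generated from assumed parameters and are not data from this study; the function name fit_michaelis_menten is hypothetical.

```python
def fit_michaelis_menten(substrate, velocity):
    """Estimate (Km, Vmax) by Hanes-Woolf linearization.

    Rearranging v = Vmax*S/(Km + S) gives S/v = S/Vmax + Km/Vmax, so an
    ordinary least-squares line through (S, S/v) has slope 1/Vmax and
    intercept Km/Vmax.
    """
    xs = list(substrate)
    ys = [s / v for s, v in zip(substrate, velocity)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / slope
    km = intercept * vmax
    return km, vmax


# D-Ala concentrations (mM) listed in the text.
D_ALA_MM = [0.1, 0.2, 0.4, 0.8, 1.6, 3.2, 5.0, 7.5, 10, 15]

# Synthetic velocities from ASSUMED parameters (Vmax in arbitrary units).
TRUE_KM, TRUE_VMAX = 1.62, 100.0
velocities = [TRUE_VMAX * s / (TRUE_KM + s) for s in D_ALA_MM]

km, vmax = fit_michaelis_menten(D_ALA_MM, velocities)

if __name__ == "__main__":
    print("Km = %.3f mM, Vmax = %.2f (arbitrary units)" % (km, vmax))
```

With real, noisy measurements a direct nonlinear least-squares fit is usually preferred, since linearized plots weight the data points unevenly.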
In Vitro Aminoacylation Assays of PCP (LnmP)-The in vitro phosphopantetheinylation and aminoacylation of LnmP were carried out according to literature methods (17, 23-25). A typical reaction (100 μl) contained 75 mM Tris (pH 7.5), 10 mM MgCl2, 5 mM DTT, 0.3 mM CoA, 10 μM LnmP, and 100 nM Svp, a known promiscuous phosphopantetheinyl transferase (23), and was incubated for 30 min at 30°C to allow complete phosphopantetheinylation of the apoLnmP. Then 0.2 μM LnmQ and 1.5 μCi of D-[14C]Ala or [14C]Gly (55 mCi/mmol, ICN Biomedicals Inc., Irvine, CA) were added to the reaction, followed by 3 mM ATP to initiate the aminoacylation reaction, which was allowed to proceed at 30°C for another 2-10 min. The reactions were quenched with 0.9 ml of acetone at -80°C for 2 h; the precipitated proteins were collected by centrifugation at 14,000 rpm and 4°C for 20 min, and the protein pellets were redissolved and loaded on a 4-20% SDS-polyacrylamide gel. After being stained with Coomassie Blue R-250, destained, and dried, the gels were visualized with a Phosphor Imager (low energy screen, GE Healthcare).

Identification of the Reaction Products of LnmQ and LnmP by HPLC and ESI-MS Analysis-After complete phosphopantetheinylation of the apoLnmP (20 μM) as described above, 0.5 μM LnmQ and 60 μM D-Ala or 200 μM Gly were added to the reaction, followed by 3 mM ATP to initiate the aminoacylation reaction, which was allowed to proceed at 30°C for another 2-5 min. The reactions were analyzed by injecting a portion into an HPLC equipped with a Jupiter C18 reverse phase column (5 μm, 300 Å, 250 × 4.6 mm) and chromatographing with a linear gradient of 40-75% acetonitrile in water (containing 0.1% trifluoroacetic acid) over 25 min at a flow rate of 1 ml/min, with monitoring by a UV detector at 215 nm. Peaks were collected, lyophilized, redissolved in H2O, and subjected to ESI-MS analysis (Agilent 1000 HPLC-MSD SL instrument, Palo Alto, CA).
Determination of the Kinetic Parameters of LnmP Aminoacylation by LnmQ-To determine the kinetic parameters for holo-LnmP aminoacylation by LnmQ, reactions were carried out at 30°C in a total volume of 100 μl that contained 75 mM Tris-HCl (pH 7.5), 10 mM MgCl2, 5 mM DTT, 3 mM ATP, 5 nM LnmQ, 2 mM D-Ala (or 40 mM Gly), and various concentrations of holo-LnmP (2.5, 5, 7.5, 10, 15, 20, 25, and 30 μM) for 3 min. The conversion of holo-LnmP to aminoacyl-S-LnmP was quantified by HPLC as described above. Initial rate conditions were maintained by keeping substrate conversion below 10%, on the assumption that these conditions were within the linear range of enzyme concentration and enzyme turnover. The data were fitted to the Michaelis-Menten equation to extract the kinetic constants.
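The conversion measured by HPLC can be turned into an initial velocity with simple arithmetic, sketched below in Python. This is an illustration under stated assumptions, not the authors' procedure: it assumes that the aminoacyl-S-LnmP and holo-LnmP peaks respond equally at 215 nm, so that the area fraction equals the mole fraction, and it applies the 10% conversion limit to holo-LnmP. The peak areas in the usage lines are hypothetical, and initial_velocity is a name made up for this example.

```python
def initial_velocity(area_acyl, area_holo, holo_um, minutes, enzyme_um,
                     max_conversion=0.10):
    """Convert HPLC peak areas into conversion, velocity, and turnover.

    ASSUMPTION: both PCP species have the same detector response, so the
    area fraction of the aminoacyl-S-PCP peak equals its mole fraction.
    """
    fraction = area_acyl / (area_acyl + area_holo)
    if fraction > max_conversion:
        raise ValueError(
            "conversion of %.1f%% exceeds the initial-rate limit" % (100 * fraction)
        )
    product_um = fraction * holo_um      # uM aminoacyl-S-PCP formed
    velocity = product_um / minutes      # uM product per min
    turnover = velocity / enzyme_um      # apparent turnover, per min
    return fraction, velocity, turnover


# Hypothetical peak areas; 10 uM holo-LnmP, 3 min, and 5 nM (0.005 uM) LnmQ
# are conditions taken from the text.
fraction, velocity, turnover = initial_velocity(6.0, 94.0, 10.0, 3.0, 0.005)

if __name__ == "__main__":
    print("conversion = %.1f%%" % (100 * fraction))
    print("velocity   = %.3f uM/min" % velocity)
    print("turnover   = %.1f per min" % turnover)
```

Raising an error above the conversion limit keeps points that violate the initial-rate assumption out of the subsequent Michaelis-Menten fit.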
LNM Production, Isolation, and Analysis-LNM production and isolation from wild type S. atroolivaceus and HPLC analysis were carried out as reported previously (18-21); the only difference was the feeding of D-Ala to the fermentation broth. Seed culture (5 ml) was inoculated into the fermentation medium (50 ml), which was allowed to grow at 28°C, 300 rpm for 10-15 h, and then different concentrations of D-Ala (5, 10, 25 mM) were added separately. After 3-5 h, moisturized Diaion HP-20 resin was added to 5% (w/v), and the fermentation was continued for up to 48 h (18-21).
JULY 13, 2007 • VOLUME 282 • NUMBER 28 JOURNAL OF BIOLOGICAL CHEMISTRY 20275

RESULTS
Characterization of LnmQ as an NRPS A Domain with a Substrate Specificity for D-Ala-According to the LNM structure (Fig. 1A), the widely accepted bacterial NRPS/PKS paradigm (4,26), sequencing analysis, and the result of in vivo gene replacement (20), we have proposed that LnmQ serves as the NRPS A domain of the LNM biosynthetic initiation (or loading) module, activating D-Ala as an aminoacyl-AMP and transferring it onto LnmP, the PCP partner of LnmQ (Fig. 1A). To validate this hypothesis, we amplified the lnmQ gene by PCR and cloned it into the T7 overexpression vector pET37b. This construct allows the production of LnmQ as a C-terminal His8-tagged protein. LnmQ was overproduced in E. coli BL21(DE3) and purified by nickel affinity chromatography to homogeneity, as visualized by SDS-PAGE and Coomassie Blue staining (Fig. 1B). The yield of purified fusion protein was ~20 mg/liter. Substrate-dependent ATP-[32P]PPi exchange assays showed that, among D-Ala, β-Ala, and the 20 proteinogenic L-amino acids tested, D-Ala is the preferred substrate of LnmQ (Fig. 2A). LnmQ also showed activity toward Gly, albeit much weaker (11% relative to D-Ala under the specific assay conditions). In addition, kinetic parameters for D-Ala and Gly as substrates of LnmQ (Fig. 2, B and C, and Table 1) were comparable with those of other A domains of NRPSs (27-30). The Km value for D-Ala is 1.62 mM, which is 22 times lower than that for Gly, and the catalytic efficiency for D-Ala is 264 mM⁻¹·min⁻¹, which is about 50 times higher than that for Gly. These results support D-Ala as the natural substrate of LnmQ.
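The numbers quoted above imply a few further quantities that follow from the definition of catalytic efficiency (kcat/Km). The Python sketch below back-calculates them from the values and fold differences stated in the text; the results are approximate implied values, not entries taken from Table 1, and the variable names are chosen only for this illustration.

```python
# Values stated in the text.
KM_DALA_MM = 1.62      # Km for D-Ala, mM
EFF_DALA = 264.0       # kcat/Km for D-Ala, mM^-1 min^-1
KM_FOLD = 22.0         # Km for Gly is about 22 times higher
EFF_FOLD = 50.0        # efficiency for Gly is about 50 times lower

# Implied values (ASSUMPTION: the stated fold differences are exact).
kcat_dala = EFF_DALA * KM_DALA_MM    # min^-1, since kcat = (kcat/Km) * Km
km_gly = KM_DALA_MM * KM_FOLD        # mM
eff_gly = EFF_DALA / EFF_FOLD        # mM^-1 min^-1
kcat_gly = eff_gly * km_gly          # min^-1

if __name__ == "__main__":
    print("kcat (D-Ala)    ~ %.0f per min" % kcat_dala)
    print("Km (Gly)        ~ %.1f mM" % km_gly)
    print("kcat/Km (Gly)   ~ %.2f per mM per min" % eff_gly)
    print("kcat (Gly)      ~ %.0f per min" % kcat_gly)
```

On these assumptions the preference for D-Ala comes mainly from the lower Km and only to a smaller extent from a higher turnover number.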
Characterization of LnmP as an NRPS PCP-The lnmP gene, encoding a discrete PCP, is located upstream of lnmQ and is transcriptionally coupled with the lnmQ open reading frame (Fig. 1A). LnmP contains a modestly conserved signature motif of GX(HD)S, in which the Ser residue can be posttranslationally modified by the covalent attachment of the 4'PP group (20,25). To test LnmP as the PCP partner of LnmQ, we expressed this small gene using the pET28a vector in E. coli and purified the resultant PCP as an N-His6- and C-His6-tagged fusion protein to homogeneity (Fig. 1C), with a yield of ~50 mg/liter (19). To our surprise, two distinguishable protein bands were visualized on SDS-PAGE. HPLC and ESI-MS analysis (Fig. 3C, panel I, and Table 2) confirmed that one band was the apo-form and the other was the holo-PCP, which carries a 4'PP group added by the E. coli ACPS phosphopantetheinyl transferase (24,25). By HPLC analysis, the holo-form represented ~40% of the total PCP after 18 h of induced expression at 18°C (Fig. 3C, panel I). When LnmP was incubated with CoA and the Svp phosphopantetheinyl transferase (25), the apoPCP was completely converted into the holo-form (Fig. 3C, panel II).
FIGURE 2. Amino acid substrate specificity of LnmQ as an adenylation enzyme. A, amino acid-dependent ATP-[32P]PPi exchange assay; 100% relative activity corresponds to 8 × 10⁵ cpm (standard three-letter amino acid designations are used). B and C, D-Ala (B) or Gly (C) was used as a substrate, and the initial velocities of ATP-[32P]PPi exchange were plotted as a function of D-Ala or Gly concentration.
Aminoacylation of LnmP by LnmQ-We previously proposed that LnmQ and LnmP constitute the initiation module that is responsible for chain initiation of LNM biosynthesis by incorporating a D-Ala moiety (Fig. 1A) (19,20). To verify whether holo-LnmP can be aminoacylated by LnmQ, we incubated the holo-LnmP and LnmQ with D-[14C]Ala or [14C]Gly and ATP to directly test amino acid loading (Fig. 3A). Reaction mixtures were subjected to SDS-PAGE and phosphorimaging to detect specific loading of the 14C-amino acid onto the 4'PP group of PCP (Fig. 3B). Reaction mixtures with nonradioactive substrates were subjected to HPLC (Fig. 3C), and the resolved apoPCP, holo-PCP, and aminoacyl-S-PCP species were collected and subjected to ESI-MS analysis to confirm the predicted modifications (Table 2). The collective results showed that both D-Ala and Gly can be effectively activated by LnmQ as aminoacyl-AMP and transferred onto holo-LnmP to form the D-Ala or Gly aminoacyl-S-PCP intermediate, and that formation of aminoacyl-S-LnmP is dependent on ATP (Fig. 3B, lanes 2 and 5) and holo-LnmP (Fig. 3B, lanes 3, 4, 6, and 7). The kinetic parameters of holo-LnmP aminoacylation by LnmQ were subsequently determined ( Table 3 (30); these constants are also comparable with those observed in typical NRPS systems where the aminoacyl-AMP ligase must aminoacylate the cognate holo-PCP in trans (31,32).
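The mass differences expected between the apo, holo, and aminoacyl-S-PCP species can be estimated from the elemental composition of the groups added at each step. The Python sketch below does this with average atomic masses; it is an illustration only, the apo mass passed in the usage line is a hypothetical placeholder rather than the measured value in Table 2, and the helper names are made up for the example.

```python
# Average atomic masses (standard values, rounded).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999,
               "P": 30.974, "S": 32.06}

# Net elemental composition added to the protein at each step.
GROUPS = {
    # 4'-phosphopantetheine transferred from CoA onto the Ser hydroxyl
    "4'PP arm": {"C": 11, "H": 21, "N": 2, "O": 6, "P": 1, "S": 1},
    # alanine (C3H7NO2) minus water on thioester formation
    "D-alanyl": {"C": 3, "H": 5, "N": 1, "O": 1},
    # glycine (C2H5NO2) minus water on thioester formation
    "glycyl": {"C": 2, "H": 3, "N": 1, "O": 1},
}


def average_mass(formula):
    """Average mass (Da) of an elemental composition given as {element: count}."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def expected_masses(apo_mass):
    """Expected masses of the holo and aminoacylated forms of a carrier protein."""
    holo = apo_mass + average_mass(GROUPS["4'PP arm"])
    return {
        "apo": apo_mass,
        "holo": holo,
        "D-Ala-S-PCP": holo + average_mass(GROUPS["D-alanyl"]),
        "Gly-S-PCP": holo + average_mass(GROUPS["glycyl"]),
    }


if __name__ == "__main__":
    # 10000.0 Da is a HYPOTHETICAL apo mass used only to show the shifts.
    for species, mass in expected_masses(10000.0).items():
        print("%-12s %.1f Da" % (species, mass))
```

The shifts are about 340 Da for phosphopantetheinylation, 71 Da for D-alanyl loading, and 57 Da for glycyl loading, so the two aminoacylated species differ by one methylene group (about 14 Da).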

Confirming the Protein-Protein Interaction between LnmQ and LnmP-The results above confirmed that LnmQ, a monofunctional A domain, activates D-Ala and transfers it to the LnmP PCP. Next we asked whether there is a specific protein-protein interaction between this discrete A:PCP pair and, if so, whether the aminoacylation of LnmP depends upon this specific communication. To address these questions, we first studied the natural A-PCP didomain of LnmI, with L-Cys as substrate, in an orthogonal experiment. The LnmI A-PCP was originally used to show specific loading of L-Cys to the PCP (18). In the present work, the LnmI-A and LnmI-PCP domains were expressed as individual monodomain proteins and then used to establish that LnmI-A can still specifically activate L-Cys and load it onto the 4'PP group of LnmI-PCP (data not shown). In contrast, neither the LnmQ (for D-Ala):LnmI-PCP pair nor the LnmI-A (for L-Cys):LnmP pair could achieve effective aminoacylation. These results suggest that LnmQ and LnmP interact with each other specifically to serve as the initiation module for LNM biosynthesis.
For a pair of NRPS A and PCP proteins, there are two possible mechanisms of aminoacylation in trans. One mechanism results from specific communication between the A and PCP domains; alternatively, aminoacylation could occur if dissociation of the aminoacyl-AMP is followed by adventitious thiolation by the free-standing PCP in solution (Fig. 4A) (29). To distinguish between these mechanisms, we designed the following experiment. LnmQ, together with its substrates D-Ala and ATP in assay buffer, was placed in a dialysis tube with a molecular mass cutoff of 8 kDa. This dialysis tube was then submerged in the same assay buffer containing holo-LnmP (Fig. 4B). If the aminoacylation of LnmP by LnmQ depends upon specific communication, the outside holo-PCP should not be aminoacylated, because the dialysis tube prevents direct protein-protein interaction; in contrast, if the aminoacylation results from the dissociation of the aminoacyl-AMP followed by adventitious thiolation by the free-standing PCP in solution, the outside holo-PCP could be aminoacylated, because the D-alaninyl-AMP generated by LnmQ can dissociate and permeate through the dialysis tube. By HPLC analysis of the holo-PCP mixture (Fig. 4C), no aminoacylated product, D-alaninyl-S-LnmP, was formed, even when the reaction was allowed to proceed for up to 1 h; once the dialysis tube was broken, nearly all holo-PCP was converted into D-alaninyl-S-LnmP within 2 min. These results suggest that there is specific communication between the discrete A and PCP, which contributes to the aminoacylation in trans.
Improving LNM Production by in Vivo Feeding of D-Ala-One of the well characterized examples of D-Ala incorporation in nonribosomal peptide natural products is HC toxin, a secondary metabolite produced by a filamentous fungus (9). In this case, an NRPS A domain directly activates D-Ala, which is produced by a dedicated Ala racemase. Disruption of the Ala racemase gene abolished the production of D-Ala-containing isoforms of HC toxin. Supplementation of exogenous D-Ala to the growth medium restored the normal HC toxin production profile of the Ala racemase null mutant. Higher D-Ala concentrations resulted in higher levels of the D-Ala-containing isoforms of HC toxin (9). This knowledge inspired us to ask whether LNM production in S. atroolivaceus S-140 can be improved by feeding additional D-Ala in the fermentation medium. Indeed, when the fermentation medium was supplemented with different concentrations of D-Ala, LNM production was significantly improved. Supplementation with 25 mM D-Ala resulted in a 2-3-fold increase in LNM production, compared with S. atroolivaceus S-140 fermentation under identical conditions without D-Ala supplementation, as analyzed by HPLC.

DISCUSSION
LNM, an antitumor antibiotic, represents a new class of microbial hybrid nonribosomal peptide/polyketide scaffold featuring a macrolactam with a spiro-linked 1,3-dioxo-1,2-dithiolane 5-membered ring structure that has not been found in any other natural product. Gene cloning and sequencing analysis, plus genetic and biochemical characterization of the LNM biosynthetic gene cluster, have revealed not only a hybrid NRPS/PKS with a novel AT-less type I PKS and several other unprecedented features (18-21) but also a novel type of NRPS initiation (or loading) module with the following features. (i) It is composed of discrete A (LnmQ) and PCP (LnmP) proteins (Fig. 1). (ii) LnmQ directly selects and activates D-Ala to form D-alaninyl-AMP, making it the first characterized A enzyme with D-Ala as a substrate in prokaryotic NRPSs (Fig. 2). (iii) The D-Ala transfer in trans from LnmQ onto LnmP to generate the D-alaninyl-S-PCP intermediate depends upon specific communication between the discrete A and PCP proteins (Figs. 3 and 4).
NRPSs assemble their peptide products via a thiotemplate process (1,2,4,5), in which A domains activate amino acid substrates as the aminoacyl-AMP before their transfer onto the -SH group of 4'PP on adjacent PCP domains. Subsequently, C domains catalyze the formation of peptide bonds between the upstream peptidyl-S-PCP and the free amino group of the downstream aminoacyl-S-PCP. An archetypical NRPS contains a didomain (A-PCP) module for chain initiation with an L-amino acid and repeated modules of C-A-PCP for chain elongation (Fig. 5A, panel I), as seen in the biosynthetic systems for vancomycin family antibiotics (8,33,34). A variation of an initiation module, organized as A-PCP-E, where the E domain epimerizes the L-amino acid into the D-configuration once it is installed as an aminoacyl-S-PCP intermediate, represents an initiation module for D-amino acid incorporation (Fig. 5A, panel II), as exemplified by tyrocidine (28). For biosynthesis of more complex moieties, such as the N-acylated amino acid starter units of surfactin (35) or daptomycin (11), the initiation module starts with an additional N-terminal condensation (C') domain (Fig. 5A, panel III). Similarly, for the N-formylated amino acid starters, the initiation module starts with an additional formylation (F) domain (Fig. 5A, panel IV), as has been recently characterized for gramicidin (36). Additionally, when the starter unit is something other than an amino acid, such as the aromatic acids in mycobactin (37), yersiniabactin (38), myxochelin (39), pyochelin (40), vibriobactin (41), bacillibactin (42), enterobactin (31), and thiocoraline (43) or the amine group in bleomycin (44) or tallysomycin (45), the starter unit is activated by an acyl-CoA ligase-like domain with the consumption of ATP, a process mechanistically similar to NRPS A catalysis (therefore these acyl-CoA ligase-like domains have sometimes also been named A domains) (Fig. 5A, panel V).
FIGURE 5. [...] N-formylation); panel V, special module for aromatic acids instead of amino acids (the A domains are acyl-CoA ligase-like domains, and the PCP domains are also known as aryl-carrier proteins); and panel VI, the special module exemplified by LnmQ and LnmP here, with D-Ala as a start unit. Ox, oxidization; C or C', condensation; Cy, cyclization. B, classification of NRPS initiation according to the organization of the start (or loading) module as follows: panel I, the starter module fused with the extension modules as the N-terminal part of a large NRPS protein (1); panel II, the starter module existing as a protein separate from the extension modules (46); panel III, only the A domain of the starter module existing as a discrete protein, with the PCP domain of the starter module fused with the extension modules as the N-terminal part of the NRPS protein (38); panel IV, the starter module composed of discrete A and ICL-PCP didomain proteins (42); and panels V and VI, the starter module composed of discrete, monofunctional A and PCP proteins for an aromatic acid (panel V) (43) or for D-Ala as exemplified by LnmQ and LnmP reported here (panel VI). Sal, salicylic acid; Dhb, 2,3-dihydroxybenzoic acid; Hqa, 3-hydroxyquinaldic acid.
NRPS initiation modules can also be classified by the organizational patterns of domains. Besides a typical initiation module fused together with the domain(s) of an elongation module (Fig. 5B, panel I), as seen in the bacitracin (1) and vancomycin (8) biosynthetic systems, the initiation module can also exist as a discrete protein, separated from the elongation module (Fig. 5B, panel II), as exemplified by tyrocidine (28) and gramicidin S (46). These two types of NRPS initiation modules are commonly observed in type A (linear) NRPS systems (2), in which the sequence of the resulting linear peptide chain is co-linear with the number and order of the modules. A third type of initiation module is known, in which only the A domain exists as a discrete protein and the PCP domain is fused with the elongation modules. This rare case has been found in yersiniabactin (38) and pyochelin (40) biosynthesis (Fig. 5B, panel III). Another type of initiation module, as seen in the enterobactin (31), myxochelin (39), vibriobactin (41), and bacillibactin (42) biosynthetic systems, has a discrete A domain and a bifunctional protein with a PCP domain fused with an isochorismate lyase (ICL) domain (Fig. 5B, panel IV). Finally, an initiation module consisting of a discrete acyl-CoA ligase-like domain and PCP (Fig. 5B, panel V) has also been proposed for the marine natural product thiocoraline, a thiodepsipeptide; the 3-hydroxyquinaldic acid starter is first activated as an acyl-AMP and subsequently loaded onto the PCP to initiate its biosynthesis (43). A variation of this organization has also been reported for the quinoxaline antibiotics, whose initiation module consists of a discrete A enzyme and an acyl carrier protein (47,48). These three types use aromatic acids as starter units; the discrete A proteins are also called acyl-AMP ligases, and the PCP domains are also called aryl-carrier proteins (49). They are found only in type B (iterative) or type C (nonlinear) NRPS systems (2), in which the modules or domains can be used more than once in the assembly of a single product, resulting in repeated sequences in peptide chains or unusual internal cyclization or branch point syntheses.
A sixth type of NRPS initiation module is described here from the LNM biosynthetic system. It is composed of two discrete monofunctional proteins: LnmQ, acting as the A domain for activation of D-Ala to form D-alaninyl-AMP, and LnmP, as the cognate PCP partner of LnmQ (Fig. 5, A, panel VI, and B, panel VI). This type of A:PCP pair has been postulated as a strategy to activate and sequester amino acids as enzyme-bound aminoacyl thioesters for specific modification (30,50,51). In the biosynthesis of the pyrrole moieties in pyoluteorin, undecylprodigiosin, and coumermycin A1, an A domain activates L-proline as L-prolyl-AMP and transfers the L-prolyl moiety to the -SH group of the 4'PP of the respective holo-PCP partner, and then the L-prolyl-S-PCP acts as the substrate for further desaturation and oxidation to form pyrrole-2-carboxy-S-PCP, which serves as the activated monomer for incorporation into the final products (30). Another example is barbamide, a hybrid peptide-polyketide from a marine source (51), in which the A domain activates L-leucine as L-leucinyl-AMP and transfers it to the holo-PCP partner. The L-leucinyl-S-PCP then acts as the substrate for chlorination to form trichloroleucinyl-S-PCP, which is incorporated into the final product. In either case, the A:PCP pair seems to have evolved from a typical NRPS assembly line to serve a distinct purpose, i.e. not to catalyze peptide bond formation but to create a pool of covalently tethered amino acids dedicated to subsequent modification (50).
In LNM biosynthesis, the unique A:PCP pair could have two functions: to act as an initiation module for oligopeptide biosynthesis and to divert a portion of the cellular D-Ala pool into secondary metabolism. This notion is supported by the following three observations. (i) The Km value for D-Ala activation by the A enzyme (LnmQ) is about 1.6 mM; once D-Ala is activated as the D-alaninyl-AMP form, the Km value for holo-PCP (LnmP) aminoacylation by the A domain is about 6 μM. The free-standing, monofunctional D-alaninyl-S-PCP thus may balance kinetic sequestration and thermodynamic activation. (ii) In the other two systems that directly use D-Ala as an NRPS substrate, for the synthesis of the fungal products cyclosporine (10) and HC toxin (9), D-Ala is provided by pathway-specific Ala racemases (15,16). In the LNM biosynthetic gene cluster, however, no pathway-specific Ala racemase gene is found, indicating that the LNM pathway may compete with primary bacterial cell wall metabolism for a source of D-Ala. (iii) LNM production can be improved by supplementing the fermentation broth with exogenous D-Ala.
In most NRPS systems, the A domain activates an amino acid substrate as its aminoacyl-AMP and then transfers the aminoacyl moiety to the cognate PCP in cis. When the A and PCP domains are artificially separated, in trans aminoacylation between the A domain and its cognate PCP can also work effectively, yet it has been reported that aminoacylation in trans failed with the cognate PCP but succeeded with heterologous PCP protein fragments (29). In addition, a free-standing PCP of the bleomycin biosynthetic pathway has no apparent cognate A domain partner but can be successfully aminoacylated in trans (24). These observations raise the question of whether paired A:PCP proteins communicate with each other and whether the communication provides partner identification, as well as covalently enforced proximity (in cis) or noncovalent proximity (in trans). Either mechanism must ensure a high local concentration of the 4'PP -SH to capture the aminoacyl-AMP lodged in the adjacent A domain; otherwise, aminoacylation of the PCP would have to arise from dissociation of the aminoacyl-AMP followed by adventitious thiolation by the free-standing PCP in solution (in trans) or by the covalently linked PCP (in cis).
With LNM biosynthesis, the free-standing LnmQ and LnmP A:PCP pair provided an opportunity to answer this question. The facts that LnmQ specifically loads D-Ala onto LnmP but not the LnmI-PCP, and that LnmI-A specifically loads L-Cys onto LnmI-PCP but not LnmP suggest that the in trans aminoacylation is specific in this case. The results of the dialysis tube experiments support this conclusion.
Manipulation of chain initiation in PKS assembly lines has proven very useful for generating novel natural products (52,53); in contrast, attempts at engineering the chain initiation step of NRPS biosynthesis largely remain at the in vitro biochemical level (54). The A domain is the critical fidelity-controlling unit for amino acid monomer selection, activation, and transfer to the paired PCP; at the same time, the C domain shows low selectivity toward the upstream donor residue and higher selectivity toward the downstream receptor residue (55). Thus, the chain initiation module should be the most promising candidate for engineering the production of novel metabolites because it is the only module without a C domain in the NRPS assembly line. The results reported herein underscore the structural and mechanistic flexibility and versatility of NRPS initiation modules in natural product biosynthesis and provide another opportunity for engineered biosynthesis of novel LNM analogs by manipulating the initiation module.