Engineering of Peptide Synthetases

Peptide synthetases are large enzymatic complexes that catalyze the synthesis of biologically active peptides in microorganisms and fungi and typically have an unusual structure and sequence. Peptide synthetases have recently been engineered to modify the substrate specificity to produce peptides of a new sequence. In this study we show that surfactin synthetase can also be modified by moving the carboxyl-terminal intrinsic thioesterase region to the end of the internal amino acid binding domains, thus generating strains that produce new truncated peptides of the predicted sequence. Omission of the thioesterase domain results in nonproducing strains, thus showing the essential role of this region and the possibility of obtaining peptides of different lengths by genetic engineering. Secretion of the peptides depends on the presence of a functional sfp gene.
Two mechanisms for the biosynthesis of peptides are known to exist in bacteria and fungi: ribosomal synthesis and production by peptide synthetases. The latter are large enzymatic complexes responsible for the synthesis of hundreds of types of peptides, some of which have immunoactive, antibiotic, antifungal, or surfactant properties. Whereas polypeptides produced by ribosomal synthesis typically contain only the amino acids directly specified by the triplets of the genetic code, peptides built on synthetases often contain unusual amino or hydroxy acids that are not present in proteins. The amino acids can be modified by peptide synthetases through methylation, hydroxylation, or enantiomerization. The peptides are typically short (up to about 20 residues) and can be linear, circular, or branched (1,2). Peptide synthesis proceeds by the "multiple carrier thiotemplate mechanism" (3,4), although many of the details of these systems remain to be investigated experimentally. According to this mechanism each domain recognizes a specific amino (or hydroxy) acid that is covalently bound to the cofactor via a thioester bond after activation to the corresponding acyladenylate derivative. The growth of the polypeptide chain thus occurs through a series of thioester bond cleavages and the simultaneous formation of amide or ester bonds in the peptide. At the end of synthesis of each peptide, the chain is thought to be released from the enzyme by a thioesterase (TE) activity encoded in the synthetase gene (1).
Peptide synthetases are interesting not only from a scientific and evolutionary point of view but also for their biotechnological potential. Many enzymatically synthesized peptides are biologically active, and some, including cyclosporins, surfactin, and fungicides, are produced industrially. A growing number of laboratories are involved in applied research projects focused on the isolation and characterization of new peptide synthetases and on the genetic manipulation of peptide synthetase genes to optimize peptide production and to modify the sequence of the peptides produced. In fact, the structural organization of peptide synthetases makes these enzymes particularly amenable to protein engineering. Sequence analysis of the genes encoding peptide synthetases (2, 5-9) reveals a clear organization of the enzymatic complex in domains associated with diverse functions such as amino acid binding and activation, methylation, or racemization. The order and substrate specificity of repeated homologous regions that are about 1,000 residues long determine the sequence of the peptide product and the type of modification of each amino acid unit of the peptide. Amino acid binding domains have been shown to exist as functional units within the repeated regions; these regions contain conserved sequence motifs homologous to nucleotide binding regions and ATPases, and phosphopantetheine cofactor binding sequence motifs have been demonstrated experimentally (3,10,11).
Recently it was shown that amino acid binding domains of peptide synthetases can be exchanged among different synthetases to generate hybrid enzymes capable of producing new peptides. In this way, Stachelhaus et al. (12) replaced the last leucine binding domain of Bacillus subtilis surfactin synthetase with domains derived from the Bacillus brevis grs operon and from the fungal L-α-aminoadipyl-cysteinyl-D-valine synthetase (acvA) gene. These authors demonstrated that the peptides produced by the engineered synthetase carried at the terminal position the amino acid recognized by the replacing domain.
In the course of our investigation of the elements necessary and sufficient for peptide synthesis and in an attempt to further expand the possibilities of engineering peptide synthetases, we have focused our attention on the thioesterase that is responsible for the release of the newly synthesized peptide from the enzyme. Interestingly, in the surfactin synthetase (srfA) and gramicidin synthetase (grs) operons, two regions contain sequences homologous to thioesterases. One encodes a 25-kDa protein (srfAORF4 (5) or the grsT product (13)) homologous to fatty acid synthase thioesterase type II. The other one, present in all peptide synthetases characterized so far, lies downstream from the sequence of the last amino acid binding domain and is homologous to fatty acid synthase thioesterase type I (6,7,14). We have shown that insertional inactivation or deletion of the thioesterase type I region results in a stable but unproductive surfactin synthetase (15), whereas deletion of srfAORF4 has no effect on peptide production (5).
In this study we show that recombinant enzymes obtained by repositioning the integral thioesterase type I domain within surfactin synthetase efficiently synthesize peptides of reduced length as long as the region is properly fused to an amino acid activating domain of the enzyme complex. These data open the way to the synthesis of peptides of desired length and amino acid composition.

EXPERIMENTAL PROCEDURES
Plasmid Constructions-The integrative plasmids used to fuse the thioesterase domain at the end of amino acid activating domains were all derivatives of pJM103 or pJM102 (the two vectors differ only in the orientation of the multicloning site) and were constructed as follows. Parts of the srfA fourth and fifth domains, corresponding to the 12400-13827 and 16538-16949 regions of srfA (5), were amplified using JH642+ chromosomal DNA as the template and the primer pairs 4VAL-FOR 5′-GCATATGTGAATTCGGATCCAGCGCTTCCTGGG-3′ and 4VAL-REV 5′-TTCCTTATGAGCTCCTCTTGAATTTTCGCCGTCA-3′, and 5ASP-FOR 5′-TCACGGAATTCAAGAAGCGGTT-3′ and 5ASP-REV 5′-CTTTTCTCGAGCTCCGCCTGAATGTTGGCAATCA-3′, which encode EcoRI (GAATTC) and SacI (GAGCTC) sites at the two extremities. Cloning of the two fragments in pJM103 (16) between these two sites generated plasmids pVAL and pASP. For the construction of plasmids pVAL-TE and pASP-TE, the EcoRI-SacI inserts from pVAL and pASP were introduced between the EcoRI and SacI sites of a pJM103-derived construct, pTE. In this construct, the 795-base pair sequence encoding the last 260 residues of srfAORF3 was cloned as a polymerase chain reaction fragment between the SacI and BamHI sites, using the oligonucleotides TE-FOR 5′-AACAAAGAGCTCGGGATTGATCTTCCA-3′ and TE-REV 5′-GTGTGGATCCATTTATGAAACCGTTACGGTTTG-3′ as primers for amplification of this region from chromosomal DNA.
All plasmids containing fragments derived from polymerase chain reactions were sequenced by the dideoxy terminator method (17) to confirm the absence of mutations. The integrative plasmids were introduced into the B. subtilis surfactin-producing strain JH642+ as described (5).
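The restriction sites carried by the primers can be checked mechanically. The following is a minimal sketch in Python, not part of the original procedure: the primer sequences are those given above, the recognition sequences (EcoRI GAATTC, SacI GAGCTC, BamHI GGATCC) are the standard ones, and the helper names `normalise` and `sites_in` are hypothetical.

```python
# Standard recognition sequences of the three enzymes named in the text.
SITES = {"EcoRI": "GAATTC", "SacI": "GAGCTC", "BamHI": "GGATCC"}

# Primer sequences from the text, written 5' to 3'.
PRIMERS = {
    "4VAL-FOR": "GCATATGTGAATTCGGATCCAGCGCTTCCTGGG",
    "4VAL-REV": "TTCCTTATGAGCTCCTCTTGAATTTTCGCCGTCA",
    "5ASP-FOR": "TCACGGAATTCAAGAAGCGGTT",
    "5ASP-REV": "CTTTTCTCGAGCTCCGCCTGAATGTTGGCAATCA",
    "TE-FOR": "AACAAAGAGCTCGGGATTGATCTTCCA",
    "TE-REV": "GTGTGGATCCATTTATGAAACCGTTACGGTTTG",
}


def normalise(seq: str) -> str:
    """Upper-case a sequence and keep only A, C, G, T (drops hyphens and spaces)."""
    return "".join(ch for ch in seq.upper() if ch in "ACGT")


def sites_in(seq: str) -> dict[str, int]:
    """Map each enzyme whose site occurs in seq to the 0-based start of its first occurrence."""
    clean = normalise(seq)
    return {name: clean.find(site) for name, site in SITES.items() if site in clean}


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, sites_in(seq))
```

Each forward primer for the domain fragments should report an EcoRI site, each reverse primer a SacI site, and the TE primers a SacI and a BamHI site, respectively.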
HPLC and TLC Methods-The supernatant from 1.5-ml cultures was acidified to pH 2.0 with 6 N HCl to precipitate surfactin and surfactin-derived peptides. The pellet was then collected by centrifugation and resuspended in 100 µl of methanol. For HPLC analysis, 20 µl of this solution was injected on a reverse phase C8 column (HP LiChrospher 60-RP, Select B, 5 µm, 125 × 4 mm). A linear solvent gradient was applied ranging from 70 to 95% (v/v) acetonitrile in water (both solvents containing 0.05% (v/v) trifluoroacetic acid) for 10 min followed by 5 min with 95% acetonitrile at a 1 ml/min flow rate. Elution was monitored spectrophotometrically at 220 nm.
TLC was performed as follows: 5-µl samples were spotted on a silica plate (Merck Silica gel 60, 0.25 mm thick) and separated using acetone/butanol/H2O (5:3:1) as the mobile phase. Surfactin and its lipopeptide fragments were detected by spraying the plate with water. For preparative purposes the equivalent of about 150 µl of culture, run as separate samples, was collected, and running conditions were optimized to allow maximal separation of the diverse species. The water-repellent spots were scraped from the plate and dried, and the lipopeptides were eluted from the silica with methanol. The profile and the approximate quantity of the eluted peptides were determined by HPLC analysis. For the mutant strains the analysis was performed as follows. Strains integrated with pVAL-TE: two spots representing all water-repellent material were scraped from the plate, and the amino acid composition was analyzed (the major contaminant, a non-water-repellent Gly-rich peptide of known amino acid composition that is secreted by the parental strain and corresponds to the material shown in panel A of the HPLC pattern, was removed in this way). Strains integrated with pASP-TE: three water-repellent spots representing all of the lipopeptides were scraped, eluted with methanol, and analyzed by HPLC; their amino acid composition was determined. Since the slowest migrating spot was a mixture of the contaminant described above and the major lipopeptide species as detected by HPLC (peak 1), this species was HPLC purified to homogeneity, and its amino acid composition was analyzed. The remaining material consisted of the Gly-rich contaminant of known composition and a minor peak representing about one fifth of the peak purified by HPLC; the peptide in this minor peak had the same amino acid composition as peak 1.
The faster-migrating TLC material represents the minor peaks eluting after peak 1; its amino acid composition was analyzed directly and was again the same as that of peak 1, as reported in Table I.
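The solvent programme described above (a 10-min linear ramp from 70 to 95% acetonitrile followed by a 5-min hold at 95%) can be written as a simple function of time. This is only an illustrative sketch; the function name `acetonitrile_percent` is hypothetical, and the programme is assumed to start at injection (t = 0).

```python
def acetonitrile_percent(t_min: float) -> float:
    """Percent (v/v) acetonitrile at time t_min (minutes after the start of the gradient)."""
    start, end = 70.0, 95.0   # gradient limits, % acetonitrile
    ramp, hold = 10.0, 5.0    # duration of the linear ramp and of the final hold, min
    if not 0.0 <= t_min <= ramp + hold:
        raise ValueError("time outside the 15-min programme")
    if t_min <= ramp:
        # Linear interpolation between the two gradient limits.
        return start + (end - start) * t_min / ramp
    return end


if __name__ == "__main__":
    for t in (0, 5, 10, 15):
        print(t, acetonitrile_percent(t))
```

At the stated flow rate of 1 ml/min, the time axis in minutes is also the eluted volume in millilitres.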
Amino Acid Analysis-Peptide hydrolysis was performed in vapor phase (19) in standard hydrolysis tubes previously pyrolyzed at 500°C using a Waters Pico-Tag work station for 90 min at 120°C. Amino acid analysis was carried out using the phenylthiocarbamyl derivatization method (20). The amino acid samples were extensively dried under a vacuum to remove HCl, neutralized with a 2:1 methanol/diisopropylethylamine mixture, dried under a vacuum, and then derivatized with a freshly prepared mixture of methanol/diisopropylethylamine/water/ phenylisothiocyanate 7:1:1:1. The reaction was allowed to proceed for 25 min at room temperature, then the mixture was dried under vacuum and reconstituted with 190 mM sodium acetate buffer just before injection into a Beckman Ultrasphere HPLC column. All of the amino acid compositions were normalized to glutamic acid.
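Normalization to glutamic acid amounts to dividing every measured amount by the amount of Glu in the same sample. A minimal sketch follows; the function name `normalise_to` is hypothetical and the input amounts are invented for illustration only, not measurements from this study.

```python
def normalise_to(composition: dict[str, float], reference: str = "Glu") -> dict[str, float]:
    """Express each amino acid amount as a molar ratio relative to the reference residue."""
    ref = composition[reference]
    if ref <= 0:
        raise ValueError("reference amount must be positive")
    return {aa: amount / ref for aa, amount in composition.items()}


if __name__ == "__main__":
    # Invented amounts (arbitrary units) for a hypothetical hydrolysate.
    measured = {"Glu": 2.0, "Leu": 4.1, "Val": 1.9}
    ratios = normalise_to(measured)
    # Rounded ratios give the apparent residue count per Glu.
    print({aa: round(r) for aa, r in ratios.items()})
```

Because surfactin and its truncated derivatives each contain a single Glu residue, the normalized values can be read directly as residues per peptide.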

RESULTS
srfA Gene Manipulation-As we have previously shown, 3′ deletions of various length in the surfactin synthetase gene (srfA) result in the synthesis of truncated enzymes that are able to recognize and bind amino acids but are unable to produce the corresponding peptides (15). Since experimental evidence indicated that the integrity of the thioesterase type I-like domain (here designated the TE region) is necessary for production of surfactin (5), we have examined the productivity of truncated forms of the enzyme after fusion to the TE region at the carboxyl-terminal boundary of amino acid-activating domains. The extent and degree of homology of the TE region to several thioesterases is shown in Fig. 1.
To obtain strains in which the TE region is fused to the truncated surfactin synthetase, integrative plasmids containing the appropriate constructs were integrated in the B. subtilis chromosome by recombination. The TE region (comprising the carboxyl-terminal 260 amino acids of srfAORF3) was amplified from the chromosome of the surfactin-producing strain JH642+ (5) as a SacI-BamHI 795-base pair fragment. This fragment was fused to regions encoding carboxyl-terminal portions of the fourth or fifth domains of surfactin synthetase that were amplified as EcoRI-SacI fragments and cloned in the integrative plasmid pJM103 (16), thus generating plasmids pVAL-TE and pASP-TE. Two plasmids (pVAL and pASP) containing the same inserts but lacking the TE domain were also constructed and inserted in the chromosome (see Fig. 2 and "Experimental Procedures" for details).
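The expected product of each construct follows from the colinearity between the order of the amino acid-activating domains and the peptide sequence. The sketch below illustrates that reasoning; the domain order Glu-Leu-Leu-Val-Asp-Leu-Leu is the known surfactin heptapeptide sequence (an assumption brought in here, ignoring stereochemistry and the fatty acid moiety), and the function name `predicted_peptide` is hypothetical.

```python
from typing import Optional

# Assumed order of the seven amino acid-activating domains of surfactin synthetase.
SURFACTIN_MODULES = ["Glu", "Leu", "Leu", "Val", "Asp", "Leu", "Leu"]


def predicted_peptide(last_module: int, te_fused: bool) -> Optional[list[str]]:
    """Predict the peptide released when the enzyme ends after `last_module` (1-based).

    Returns None when no TE region follows the last domain, since truncated
    enzymes lacking the TE region do not release a product.
    """
    if not 1 <= last_module <= len(SURFACTIN_MODULES):
        raise ValueError("module index out of range")
    if not te_fused:
        return None
    return SURFACTIN_MODULES[:last_module]


if __name__ == "__main__":
    print("pVAL-TE:", predicted_peptide(4, True))   # fusion after the fourth (Val) domain
    print("pASP-TE:", predicted_peptide(5, True))   # fusion after the fifth (Asp) domain
    print("pVAL:   ", predicted_peptide(4, False))  # no TE region, no product expected
```

Under these assumptions the fusion after the fourth domain is expected to yield a lipotetrapeptide and the fusion after the fifth domain a lipopentapeptide.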
Transformation of the surfactin-producing JH642+ strain with the integrative plasmids pVAL-TE and pASP-TE results in the insertion of the plasmid into one of two possible regions of srfA, depending on the point where Campbell recombination occurs. Recombination in the fourth or fifth amino acid-activating domain generates hybrid surfactin synthetases in which one or the other of these domains is fused to the TE region, whereas recombination at the level of the srfAORF3 TE region results in strains in which the srfA operon preserves the wild-type configuration up to the end of ORF3. Interruption of the srfA operon after this point by pJM103 integration has no effect on surfactin production (5). Transformation of JH642+ cells with plasmids pVAL and pASP is predicted to result in plasmid integration and interruption of srfA transcription at the level of the "junction" region (5) downstream of the fourth amino acid binding domain (VAL) for pVAL and a few residues downstream of the cofactor attachment site of the fifth (ASP) domain for pASP. To discriminate between the possible alternative regions of integration in the case of plasmids pVAL-TE and pASP-TE and to confirm the insertion point in the case of plasmids pVAL and pASP, the protein and DNA patterns of the recombinant strains were analyzed.
Protein Analysis-Transformation with pVAL-TE and pASP-TE resulted in the selection on chloramphenicol of two classes of transformants as detected on spo plates: halo producers and no-halo transformants. Production of surfactin can be visualized on spo agar plates as a clear halo around the colony. In the case of transformation with pVAL and pASP, only one class of colonies was obtained, which did not show a halo around the colony. The protein pattern of several colonies from each class was therefore examined further.
We have previously shown (15) that the srfAORF1, srfAORF2, and srfAORF3 subunits of wild-type surfactin synthetase are readily visible on SDS-polyacrylamide gels by Coomassie Blue staining. Representative patterns of strains in which the srfA gene is interrupted by integration are shown in Fig. 3, lanes 3-6, where they are compared with the srfA wild-type pattern (lane 7) and with the pattern of a mutant strain in which the srfA gene is not transcribed (15). The interruption of srfAORF2 in all halo-negative strains could be identified by the disappearance of the corresponding wild-type ORF2 band. These strains were named dom4::pVAL, dom4::pVAL-TE, dom5::pASP, and dom5::pASP-TE, respectively, which denotes the region of integration and the name of the integrative plasmid (Fig. 2). Halo-positive colonies in which integration occurred at the level of the TE region have a protein pattern identical to the wild-type strain on these gels and are not shown in this figure. The appearance of new bands (Fig. 3, asterisks) can be seen in strains in which integration occurred in the fifth domain (lanes 5 and 6), which indicates the presence of new hybrid proteins migrating at the expected position on the polyacrylamide gel. In the class of colonies in which integration occurred at the carboxyl end of the fourth domain, the band that corresponds to srfAORF2 disappears, but no new protein species could be detected by this method. Analysis of DNA (by polymerase chain reaction or Southern blotting techniques) of representative colonies from each class confirmed plasmid integration in the expected regions (data not shown).
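The possible integration outcomes and their halo phenotypes can be enumerated for each plasmid. The sketch below is only a bookkeeping aid built from the description in the text: a plasmid carrying a domain fragment alone has one homologous region available for Campbell recombination, whereas a plasmid that also carries the TE region has two. The `Integrant` type and the `integrants` function are hypothetical names.

```python
from typing import NamedTuple


class Integrant(NamedTuple):
    strain: str   # "<region of integration>::<plasmid>"
    halo: bool    # True if a surfactin halo is expected on spo plates


# Domain fragment carried by each plasmid, and whether the TE region is also present.
PLASMIDS = {
    "pVAL": ("dom4", False),
    "pVAL-TE": ("dom4", True),
    "pASP": ("dom5", False),
    "pASP-TE": ("dom5", True),
}


def integrants(plasmid: str) -> list[Integrant]:
    """List the single-crossover integrants expected for one plasmid."""
    domain, has_te = PLASMIDS[plasmid]
    # Crossover within the domain fragment interrupts srfAORF2: no surfactin halo.
    outcomes = [Integrant(f"{domain}::{plasmid}", halo=False)]
    if has_te:
        # Crossover within the TE region leaves the operon wild type up to the end of ORF3.
        outcomes.append(Integrant(f"TE::{plasmid}", halo=True))
    return outcomes


if __name__ == "__main__":
    for name in PLASMIDS:
        print(name, integrants(name))
```

This reproduces the observation that pVAL and pASP give a single halo-negative class, while pVAL-TE and pASP-TE give both halo-negative and halo-positive transformants.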
Peptide Production by the Mutant Strains-Peptide production by the mutant synthetases was investigated by growing the mutant strains in conditions that allow surfactin production and by looking for the presence of lipopeptides in the culture medium. The culture supernatants were acidified, and the precipitate was extracted with organic solvents and analyzed by TLC and reverse phase HPLC.
Analysis of the products obtained by TLC shows the appearance of slower migrating lipopeptides in the dom4::pVAL-TE and dom5::pASP-TE integrants in which the thioesterase is fused at the end of the domains, whereas cells in which the TE domain is missing do not secrete appreciable amounts of lipopeptides (Fig. 4).
Interestingly, the R_F of the new peptides is not altered by treatment with NaOH (data not shown), whereas with surfactin this treatment hydrolyzes the lactone bond and a much slower migrating species is detected. Fig. 5 shows the HPLC profiles of peptides purified from strains in which the srfA gene has been inactivated by promoter deletion (15)  These results indicate that the recombinant strains, in which the TE region is fused at the end of amino acid-activating domains, produce new lipopeptide species that can be resolved into three major peaks by reverse phase HPLC. In these conditions wild-type surfactins are detected as a group of three main peaks representing fatty acid length polymorphism. The material secreted by strain dom5::pASP-TE (panel C) appears to be less hydrophobic than surfactin and the peptides secreted by strain dom4::pVAL-TE (panel B). The lipopeptides purified from TE::pVAL-TE and TE::pASP-TE had an elution profile similar to surfactin (data not shown).
Amino Acid Analysis of the Peptides-The amino acid composition of the peptides produced by the strains containing the TE region fused to amino acid-activating domains four and five was analyzed as described under "Experimental Procedures." The acid-precipitable material was further purified by TLC and HPLC, and amino acid analysis of the different fractions showed that the peptide composition is the same for all fractions. Table I  case with surfactins (21). We estimated from HPLC and amino acid analysis that the amount of lipotetrapeptide produced is about 48.3 mg/liter, whereas the pentapeptide is about 18.9 mg/liter, representing about 23 and 9% of surfactin production by weight, respectively, in laboratory conditions.
Production of the Peptides Is sfp Dependent-The presence of an intact sfp gene is known to be necessary for surfactin production, although the step involved remains unclear. To investigate whether the synthesis of the shorter peptide is dependent on the presence of an intact sfp gene, an sfp0 strain (5) was transformed with plasmid pVAL-TE. Three independent colonies in which the protein pattern indicated that the TE region was fused to the valine domain were analyzed for peptide production by HPLC and TLC. None of the strains produced measurable amounts of lipopeptides, indicating the dependence of peptide production on sfp.

DISCUSSION
Peptide synthetases are encoded by genes organized in a modular structure in which repeated domains are associated with specific functions. This organization in structurally and functionally separated regions suggests the possibility of altering the order and type of building units genetically to create new enzymes of novel specificity and to produce peptides that may contain many of the unusual amino acids that are present in these secondary metabolites and not in proteins.
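The two yield figures can be cross-checked against each other: dividing each titer by its stated fraction of surfactin production should give the same reference surfactin titer. A short sketch of that arithmetic follows; the function name `implied_reference_titer` is hypothetical, and since the percentages are quoted as approximate, the result is only indicative.

```python
def implied_reference_titer(titer_mg_per_l: float, fraction_of_reference: float) -> float:
    """Reference titer implied by a product titer and its fraction of the reference."""
    if fraction_of_reference <= 0:
        raise ValueError("fraction must be positive")
    return titer_mg_per_l / fraction_of_reference


if __name__ == "__main__":
    tetra = implied_reference_titer(48.3, 0.23)  # lipotetrapeptide, 23% of surfactin
    penta = implied_reference_titer(18.9, 0.09)  # lipopentapeptide, 9% of surfactin
    print(f"surfactin titer implied by tetrapeptide: {tetra:.0f} mg/liter")
    print(f"surfactin titer implied by pentapeptide: {penta:.0f} mg/liter")
```

Both values come out at roughly 210 mg/liter, so the two percentages are mutually consistent with a single wild-type surfactin titer under these laboratory conditions.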
We have previously shown that for efficient surfactin production it is necessary to maintain in the enzyme complex the function(s) associated with the carboxyl-terminal region (TE) fused to the last amino acid activation domain. Deletion of this region impairs peptide production even in the presence of normal amounts of the truncated enzyme. We show here that the fusion of the TE domain to the carboxyl terminus of different amino acid activation domains results in the efficient production and release of the expected peptides from the enzyme. These results identify a boundary between the amino acid activation domain and the TE region that can be exploited to create fusion proteins, which maintain the activity of both enzymatic domains.
In the absence of direct structural data on the amino acid activation domains and on the entire enzyme complex, one could presume that the fusion point is in a region separating structurally independent domains. Although our experiments do not directly show that the TE region is enzymatically active, sequences conserved in thioesterases can be identified in this region. In particular a GXSXG motif is present, which is common to many thioesterases, esterases, and lipases known to be serine hydrolases, where the central serine residue is part of the catalytic triad (22). Two classes of thioesterases have been grouped by primary sequence homology: one with higher homology to the thioesterase integral domain of fatty acid synthase complexes (thioesterase I), and a second class more similar to fatty acid thioesterase II enzymes, which are polypeptides about 260 residues long catalyzing chain termination and release of medium-chain fatty acids from fatty acid synthases. The srfAORF3 terminal region is more homologous to thioesterases type I, especially in the N-terminal half. SrfAORF4, together with the protein encoded by grsT, shares homology with thioesterases belonging to class II. We have shown previously that inactivation of this gene in B. subtilis has no influence on the levels of surfactin production. All of the integrations in srfA described in this work also disrupt transcription of the srfAORF4 gene. It would thus appear that the thioesterase encoded by the srfAORF4 gene is not needed either for the release of the peptide or for whatever role one could presume for this protein. However, the presence in the cell of other proteins with a similar role that might replace it cannot be excluded. One could predict, for example, that the recently cloned pps synthetase of B. subtilis (23) might be associated with a protein homologue of srfA4 and grsT.
Many similarities exist between the structural organization of the peptide synthetase and polyketide synthase enzymatic complexes, and the engineering of polyketide synthase has resulted in the possibility of producing new polyketides by domain replacement or rearrangement (24). In particular the carboxyl-terminal portion of deoxyerythronolide B synthase, encoding the thioesterase-cyclase function that is responsible for the detachment of deoxyerythronolide B from the synthase, was repositioned in an internal region of the complex to generate a shorter triketide lactone (25).
The experiments reported here also show that the production of the linear tetra- and pentapeptides is dependent, like that of surfactin, on the presence of a wild-type copy of the sfp gene. The function of this gene, necessary for surfactin production, is still unclear. sfp0 strains accumulate normal levels of the structural components of the surfactin synthetase complex, and the enzyme is active in vitro as measured by the pyrophosphate exchange assay (15). Since the product of sfp (and of its grs homologue, gsp) shares sequence similarity with and can functionally replace the product of the entD gene involved in iron transport, it was suggested that wild-type sfp might be needed for peptide secretion or iron acquisition (26,27). The sfp dependence of peptide production shown in the present study only indicates that the step inhibited in sfp mutant strains is not dependent on either peptide sequence or peptide conformation.
Nothing is known about the mechanism of secretion of surfactin or of lipopeptides in general. However, our experiments indirectly show that the mechanism might not be very sequence specific and that the linear and shorter peptides can be transported equally well; the same holds for surfactin variants that carry internal deletions (Francesca de Ferra, Ornella Tortora, and Claudio Tosi, manuscript in preparation). Engineering of peptide synthetases is of great industrial interest for at least two reasons: it implies the possibility of non-ribosomal synthesis of commercially valuable natural compounds in hosts that are better suited for industrial production, and it opens the way to the biological synthesis of new molecules, analogues, and substituted versions of biologically active peptides as an alternative to chemical synthesis, which is often difficult and costly, especially in the case of non-linear and modified peptides. We have determined that insertion of the thioesterase region at the end of amino acid binding domains can be exploited for efficient production of peptides of predefined length. Since surfactin yield reaches 1-2 g/liter when fermentation parameters are optimized, peptide yields from the engineered synthetases can be of interest for pharmaceutical applications, notwithstanding the reduction compared with the wild-type enzyme.
Although successful examples of peptide synthetase manipulation are at hand (this study and Ref. 12), many aspects of the thiotemplate mechanism of peptide production are still unclear and need to be investigated to better exploit this system for the efficient synthesis of new peptides by engineered microorganisms.
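A GXSXG motif, where X stands for any residue, can be located in a protein sequence with a simple pattern search. The following sketch uses Python's standard `re` module; the function name `find_gxsxg` is hypothetical, and the example sequence is invented for illustration and is not taken from srfAORF3.

```python
import re

# G, any residue, S, any residue, G. The lookahead lets overlapping matches be reported.
GXSXG = re.compile(r"(?=(G.S.G))")


def find_gxsxg(protein: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched pentapeptide) for every GXSXG occurrence."""
    return [(m.start(), m.group(1)) for m in GXSXG.finditer(protein.upper())]


if __name__ == "__main__":
    example = "MKTGHSMGGLA"  # invented sequence, one-letter amino acid code
    print(find_gxsxg(example))
```

A match only flags a candidate serine hydrolase signature; as noted above, it does not by itself demonstrate that the region is enzymatically active.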