J. Biol. Chem., Vol. 280, Issue 25, 23605-23614, June 24, 2005
From the Recombinant Gene Products Group, International Centre for Genetic Engineering and Biotechnology, Aruna Asaf Ali Marg, New Delhi 110067, India
Received for publication, March 21, 2005, and in revised form, April 15, 2005.
| ABSTRACT |
|---|
-D-galactopyranoside, some library members were found to express dicodon-incorporated proteins. Because of this, the host cells, in our case Escherichia coli, were unable to grow any further. The bacteriostatic/lytic nature of the dicodon proteins was monitored by growth curves as well as by zone clearance studies. Transmission electron microscopy of the affected cells illustrated the extent of cell damage. The proteins themselves were overexpressed as fusion partners and subsequently purified to homogeneity. One such purified protein was found to bind heparin strongly, an indication that the de novo proteins may interact with the nucleic acids of the host cell, much like many of the naturally occurring antibacterial peptides, e.g. Buforin. Therefore, our approach may help in generating a multitude of finely tuned antibacterial proteins that can potentially be regarded as lead compounds once the method is extended to pathogenic hosts such as Mycobacteria.

| INTRODUCTION |
|---|
A little more than a decade ago, Hecht and co-workers (10) put forth a novel hypothesis, that of binary patterning of proteins. According to this simple concept, it is the binary patterning of polar and non-polar amino acids in the designed protein, and not the exact identity of those amino acids, that eventually dictates the secondary structure of the designed protein. The binary-patterning concept may be viewed as a part-rational approach, whereby one can predetermine the library folds but not the exact folds of its individual members. Indeed, de novo protein libraries based on binary patterning have furnished many functional as well as structural proteins (11, 12), although the problems of a suitable assay system and of the fixed length of library members have yet to be addressed satisfactorily. The second seminal study, by Mekalanos and co-workers (13), involves the generation of non-rational peptide libraries based on degenerate oligonucleotide design. Their work elegantly demonstrated that an inducible non-rational DNA library, once translated in vivo, could lead to the selection of a functional peptide. This so-called ABBIS approach was successfully used for isolation of a number of peptides that, upon overexpression in the host cell, led to severe growth attenuation of the host. The ABBIS method also has its limitations, which stem primarily from its use of fixed-length degenerate oligonucleotides as a starting point for making a peptide library. Nonetheless, it remains a powerful example of a non-rational approach toward the de novo synthesis of useful proteins, in their case antimicrobial peptides (AMP).1
Of late, there is an urgent need for generating newer, more potent AMPs, given their newfound importance in the context of widespread antibiotic resistance among human pathogens (14-16). AMPs have been isolated from a variety of species, including around 40 AMPs from humans (15, 17). Thus far, around 700 AMPs have been isolated and characterized (18), and many among them are in advanced clinical trials as drug candidates against bacterial and fungal infections (15, 19). A general characteristic of AMPs is their net positive charge, which is believed to be crucial for gaining entry into negatively charged bacterial outer membranes, although lately some anionic AMPs have also been isolated (20, 21). Consequently, many models for bacterial entry have been postulated (17, 22). However, recent DNA-microarray studies on bacteria affected by AMPs have suggested that cell wall lysis may not be the sole mode of AMP action, given that the expression of a host of cytoplasmic genes is severely affected upon AMP cell entry (23, 24). Notwithstanding the uncertainty surrounding their action, it is clear that AMPs represent a class of molecules that may in the near future provide an effective alternative to the currently used antibiotics (14). Therefore, we believe that an approach that leads to the synthesis of a diverse pool of AMPs, and one that addresses the earlier problems associated with AMP synthesis, would be of considerable help.
In this report, we describe one such approach that is based on the application of a directed-evolution technique that we recently developed called codon shuffling (25). We also describe how, through codon shuffling, libraries are generated that: (a) can be both non-rational as well as part-rational; (b) have members large enough to be classified as proteins and not just peptides; (c) are not restricted by limits of length of their corresponding genes; and (d) can be preferentially skewed in amino acid attributes like charge and hydrophobicity, thereby narrowing down the search for a successful AMP.
| EXPERIMENTAL PROCEDURES |
|---|
Construction of ds Hairpin-encapsulated DC Fragment (HPDN) Library. For the initial part of library construction, the protocol followed was as described earlier (25) with one variation. Immediately after the mixing of 14 DCs, the solution was made to 7.5% polyethylene glycol and the DC mixture was incubated at 4 °C for 24 h. To this mixture was added 100 pmol of 5'-phosphorylated and PAGE-purified ds hairpin (5'-tttaaacacgtggcggccgctctagaggcccgcgcgggcctctagagcggccgccacgtgtttaaa-3') that had been self-annealed earlier. The ligation temperature was increased to 16 °C, and the incubation was prolonged for another 12 h. The ligation mixture DNA was precipitated and extracted once with phenol/chloroform. The resuspended DNA was digested with XbaI for 4 h at 37 °C, after which 1 µl of the digested DNA was used as a template for PCR using the 5'-phosphorylated oligonucleotide 5'-agcggccgccacgtgtttaaa-3', which served as both forward and reverse primer. The PCR products were eluted using DEAE membrane and fractionated based on their lengths (50-400 bp). The purified fragments were used directly as inserts for creation of de novo libraries.
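The statement that one oligonucleotide can serve as both forward and reverse primer follows from the symmetry of the hairpin sequence, and this can be checked directly on the two sequences quoted above. The following is a minimal Python sketch, not part of the published protocol; the variable and function names are ours, and only the two sequences and the XbaI recognition site (tctaga) are taken as given.

```python
# Sequences copied from the protocol text (5'->3', lower case as printed).
HAIRPIN = "tttaaacacgtggcggccgctctagaggcccgcgcgggcctctagagcggccgccacgtgtttaaa"
PRIMER = "agcggccgccacgtgtttaaa"
XBAI_SITE = "tctaga"  # XbaI recognition sequence

_COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a lower-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


# A self-annealing hairpin oligonucleotide reads the same as its own
# reverse complement.
is_self_complementary = reverse_complement(HAIRPIN) == HAIRPIN

# The primer is identical to the 3'-terminal 21 nt of the hairpin ...
primer_is_3prime_arm = HAIRPIN.endswith(PRIMER)

# ... and its reverse complement is the 5'-terminal 21 nt, which is why
# a single oligonucleotide can prime both strands of a capped fragment.
primer_rc_is_5prime_arm = HAIRPIN.startswith(reverse_complement(PRIMER))

# Number of XbaI sites available for opening the hairpin before PCR.
xbai_site_count = HAIRPIN.count(XBAI_SITE)

print(is_self_complementary, primer_is_3prime_arm,
      primer_rc_is_5prime_arm, xbai_site_count)
```

Under these checks the hairpin is fully self-complementary and carries two XbaI sites flanking its central segment, consistent with the digestion step preceding PCR.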
Construction of Expression Plasmid pTEMDNp28. A 420-bp-long DNA fragment from the TEM-1 gene of plasmid pSC1 (25) was PCR-amplified using Pfu polymerase and the oligonucleotides 5'-aagcatgcaaggagatggcgcccaacagtccc-3' and 5'-cctacgtatgcacccaactgatcttcagcatc-3' as forward and reverse primers, respectively. The PCR product was cloned in pBluescriptSRF vector, and the fragment was excised using NcoI and SnaBI restriction enzymes and cloned in pDNp28 vector previously cut with the same two enzymes. The resulting plasmid was designated pTEMDNp28. Plasmid pTEMDNp28 carries the SnaBI site into which DC fragment libraries can be inserted. The site is immediately downstream of the TEM-1 secretion signal sequence and just upstream of the His6 tag.
Western Blot Analysis. Whole-cell lysates of EQAMP1-3 cultures were run on 15% SDS-PAGE and transferred onto supported nitrocellulose membrane (Invitrogen). The membrane was blocked with 1% polyvinylpyrrolidone and then probed with anti-His monoclonal antibody (1:7500 dilution). The blot was developed by using 5-bromo-4-chloro-3-indolyl phosphate/nitro blue tetrazolium (Promega).
Heparin Binding Studies. Procedures relating to the construction of the GST-SKAMP1 gene and subsequent purification of the corresponding protein are described under supplemental data. Pure GST-SKAMP1 protein (20 µg) was loaded onto a column containing heparin-Sepharose. The column was washed extensively with buffer C. Elution buffer (buffer C with an NaCl gradient from 250 mM to 1 M) was then poured over the column, and fractions were collected.
Western Blot. The eluted fractions, as well as the heparin resin, were probed with anti-GST antibodies (1:25000 dilution) for 1 h. The nitrocellulose membrane was then washed four times with 1x phosphate-buffered saline plus Tween 20 (5 min per wash). Secondary probing was done with anti-rabbit IgG heavy and light chain AP-conjugated antibodies (1:5000 dilution) for 1 h. After washing the membrane with 1x phosphate-buffered saline plus Tween 20, the blot was developed with 5-bromo-4-chloro-3-indolyl phosphate/nitro blue tetrazolium substrate.
| RESULTS AND DISCUSSION |
|---|
250-300 bp in length is astronomically high and much beyond what may be required to exhaust all of the possible protein folds. Therefore, at first glance, codon shuffling represents a useful method for generating de novo protein libraries.
...) or non-polar (...) residues, rather than an arrangement of such residues that promotes a secondary structure feature (α or β). Indeed, by predetermining the degeneracy of parent oligonucleotides, such that the resulting peptides may possess polar/non-polar arrangements, Hecht and coworkers (10, 29) were able to generate a library of peptides wherein most members displayed a well-folded behavior. Consequently, the library size in such situations need not be large to begin with, much like in codon shuffling. One possible explanation in the codon-shuffling context may be that the 14 dicodons themselves possess a range of inherent secondary structure-forming attributes because of the manner in which they are paired, such as polar/non-polar, non-polar/polar, non-polar/turn, non-polar/non-polar, and other such combinations (Table I). For example, Glu-Leu or Asp-Ile repeats yield a binary pattern that represents β-sheet structure, whereas the string Met-His-Asp-Ile-Met-His-Glu-Leu-Ser-Thr yields a binary pattern that would preferentially form an α-helix. Additionally, the appearance of the dicodons Trp-Pro, Pro-Gly, or Gly-Ala may induce turns or breaks, whereas a scattering of Cys-Ala within the DC protein sequence may create intramolecular or intermolecular disulfide bridges.
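The two dicodon strings named above can be reduced to polar/non-polar patterns with a few lines of Python. This is an illustrative sketch only: the residue classes below (non-polar: A, V, L, I, M, F, W, C; turn-prone: G, P; everything else polar) are our assumption and are not the classification used in Table I, and the letters P, N, and t are arbitrary labels rather than the symbols of the original figures.

```python
# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# ASSUMED residue classes (not taken from the source).
NONPOLAR = set("AVLIMFWC")
TURN_PRONE = set("GP")


def binary_pattern(peptide: str) -> str:
    """Map a hyphenated three-letter peptide to a pattern string.

    'N' = non-polar, 'P' = polar, 't' = turn-prone (Gly/Pro).
    """
    symbols = []
    for name in peptide.split("-"):
        residue = THREE_TO_ONE[name]
        if residue in NONPOLAR:
            symbols.append("N")
        elif residue in TURN_PRONE:
            symbols.append("t")
        else:
            symbols.append("P")
    return "".join(symbols)


# Four Glu-Leu repeats: strict alternation of polar and non-polar residues.
sheet_like = binary_pattern("-".join(["Glu", "Leu"] * 4))

# The ten-residue string quoted in the text.
helix_like = binary_pattern("Met-His-Asp-Ile-Met-His-Glu-Leu-Ser-Thr")

print(sheet_like)
print(helix_like)
```

With these assumed classes, the Glu-Leu repeat alternates every residue, whereas the ten-residue string places non-polar residues roughly every three to four positions, which is the periodicity usually associated with an amphipathic helix.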
To further investigate this hypothesis, we undertook a search of all of the proteins in the Protein Data Bank to find the location of dicodon pairs within protein structures. The search string was established as a dicodon-dicodon pair (DC-DC), amounting to a search of a total of 196 DC-DC motifs (14 DC × 14 DC). The DC preferences for particular secondary structure elements are listed in Table I. Predictably, the proline-containing DCs (Trp-Pro and Pro-Gly) were preferentially found in turn elements, whereas DCs that were formed of amino acid pairs of high α-helix-forming propensity, such as Phe-Glu, were found to be embedded in α-helices. Notably, only 11% of DC-DC pairs were found in unstructured and random coil elements. This may indicate that a de novo DC protein library would contain proteins that possess secondary structure elements and are therefore well folded, as opposed to proteins from a degenerate oligonucleotide library. To test our hypothesis, we cloned codon-shuffled genes in specially designed vectors (see "Experimental Procedures") using the equimolar DC protocol described previously (25). Two representative clones were sequenced and shown to consist wholly of incorporated DCs (Supplemental Fig. S1A). When overexpressed in E. coli, the clones yielded DC proteins that were preferentially found in the insoluble fractions. Nonetheless, because of reasonably good expression, the proteins could be routinely purified by metal-affinity chromatography in 8 M urea buffer. Postdialysis, one of the proteins could be solubilized in as little as 300 mM urea (Supplemental Fig. S1B).
These results indicated to us that de novo DC protein libraries were capable of generating reasonably large quantities of proteins of expected sizes and that subsequent application of a selection system could fish out desirable proteins. However, one caveat needed to be addressed before codon shuffling could be used efficiently for constructing de novo protein libraries. Because codon-shuffled DNA fragments were ligated directly to expression vectors for their eventual translation, not all such fragments were represented in the DC library, since only one ds DNA fragment (i.e. insert fragment) of each type was available for cloning. In other words, the impossibility of experimentally cloning each and every ds DNA fragment meant that the available complexity of the DC library was being compromised.
Use of ds DNA Hairpins for Amplification of Codon-shuffled Fragments. To address the above-mentioned problem, we envisaged the use of a ds DNA hairpin that would first cap the two exposed ends of the DC-assembled DNA fragment and then allow the capped fragment to be used as a template for amplification with oligonucleotides that are complementary to the hairpin sequence (30). Moreover, the ds hairpin, in addition to its primary role of amplifying DC fragments, could also be used as a scaffold for encapsulating DC proteins. Protein scaffolds have previously been employed to constrain randomly generated de novo peptides in the hope that the peptides may fold better when their N and C termini are no longer free (31, 32). One such scaffold, the thioredoxin (trx) protein, has been employed on numerous occasions (13, 33, 34). However, whether the peptides are as effective once they are removed from their scaffolds or, more worryingly, whether the de novo peptide activity is partly due to the scaffold itself are issues that remain largely unaddressed. One could nevertheless be prudent and reduce the deleterious effects of the scaffold simply by keeping its size to a minimum. Using a large protein scaffold may force the de novo protein into a shape that would be altogether different without the scaffold. Therefore, in designing the ds hairpin, we focused on our initial goal: to isolate proteins that act as antibacterials.
A significant number of isolated AMPs are α-helical in nature, and many that are not become so once they interact with bacterial membranes (15, 35). Therefore, it would be helpful to construct AMPs over an α-helical scaffold. As a starting point, we envisaged that the ds hairpin should code for a string of 6-7 amino acid residues. After a preliminary study exploring binary patterns for the scaffold, we settled on the pattern shown in Fig. 1. This binary pattern and its "antisense" pattern are both predicted to form an α-helix based on the Hecht hypothesis (10). We then searched the Protein Data Bank to document the presence of these two patterns in the various fold elements. Interestingly, both patterns show a strong preference for α-helices, thereby supporting the practicability of using Hecht binary patterning for de novo peptide design (Fig. 1). As a next step, we embedded some unique restriction enzyme (RE) sites (none of which appears in any given DC fragment) within the ds hairpin sequence, such that they would not disturb the binary pattern of the translated hairpin (Fig. 1B). The final sequence of the designed hairpin (shown in Fig. 1) coded for the following two sequences, "SGRHVFK" and "FKHVAAA," depending upon which hairpin strand was translated. Importantly, both these sequences conform to an α-helix-forming Hecht pattern.
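The two heptapeptides can be reproduced from the hairpin oligonucleotide given under "Experimental Procedures" by translating its two 21-nucleotide arms with the standard genetic code. The sketch below is illustrative; the reading frame (starting at the first nucleotide of each arm) and all names in the code are our assumptions, chosen because they reproduce the reported peptides.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate DNA in frame 1, ignoring any trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Hairpin sequence as printed in "Experimental Procedures".
HAIRPIN = "tttaaacacgtggcggccgctctagaggcccgcgcgggcctctagagcggccgccacgtgtttaaa"
ARM_LENGTH = 21  # assumed: 7 codons, the length of the amplification primer

five_prime_arm = HAIRPIN[:ARM_LENGTH]
three_prime_arm = HAIRPIN[-ARM_LENGTH:]

print(translate(five_prime_arm))   # FKHVAAA
print(translate(three_prime_arm))  # SGRHVFK
```

Because the hairpin is its own reverse complement, the two arms are the two strands of the same 21-bp duplex, which is why either peptide can flank a DC protein depending on the orientation in which the strand is read.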
Finally, we wanted to explore a case where the DC protein is encapsulated by the traditional, often employed trx scaffold. Therefore, we ligated the hairpin-encapsulated DC fragment library with an expression vector (pTRXDNp21) that carried the modified trx gene wherein a unique SnaBI site had been positioned within the trx sequence that corresponded to the region between the two cysteines (see supplemental data for details). The DNA of many of the obtained clones was sequenced and found to contain hairpin-encapsulated DC fragment sequences positioned, as expected, within the SnaBI site (Supplemental Fig. S2A). One representative member of the trx-DC library was further analyzed to check for protein expression. Upon induction with IPTG, the protein trxHPDN3 was found to be strongly overexpressed. However, most of the protein was found in the insoluble fractions, although it could be purified to homogeneity under denaturing conditions (8 M urea) using Ni-NTA chromatography (Supplemental Fig. S2B). Upon dialysis, we found the protein to be soluble in a minimum of 150 mM urea. Circular dichroism studies of THPDN3 along with native trx protein showed that the chimera displayed a predominantly α-helical content (helix, 24.1%; sheet, 10.0%; turn, 34.4%; random, 31.4%), much like the parent trx protein (helix, 21.5%; sheet, 14.3%; turn, 31.3%; random, 32.8%; Supplemental Fig. S2C). Without the use of more incisive techniques such as x-ray crystallography or NMR, it is not possible to determine whether the DC protein has folded outside the trx scaffold or whether the obtained CD data are for a single protein entity. Further work in this direction is currently underway in our laboratory.
In summary, we have investigated the usefulness of DC proteins as candidates for de novo protein libraries. We have also found that it is worthwhile to encapsulate the DC proteins using scaffolds. Of the two scaffolds that we used for this purpose, we found the trx scaffold to yield chimeras that were insoluble and thus of little use in vivo.
Design and Synthesis of de Novo Anti-bacterial Proteins. Having put in place a method for the design, synthesis, and selection of de novo proteins with two different scaffolds, we now focused our efforts on the synthesis of de novo proteins that would act as antibacterials. Our strategy was to select for AMPs in vivo by identifying those transformed bacterial colonies that were unable to grow in the presence of IPTG, in contrast to robust growth in the absence of the inducer. Additionally, we decided to tether the DC proteins to an N-terminal signal sequence for the following reason. Because the AMP site of action could be the cytoplasm, the periplasm, or indeed the extracellular environment of the host, in our case E. coli, we felt that it was important not to exclude, through the method of selection, DC proteins acting at any of these three sites. For example, if the DC library were not tethered to an N-terminal signal, all of the DC proteins would be restricted to the cytoplasm of the cell. On the other hand, if the ompA signal sequence were chosen as the primary tether for the AMPs, all DC proteins would be directed toward the extracellular environment of the host cell. As a result, AMPs with a cytoplasmic or a periplasmic site of action would not be selected.
β-lactamase as the primary AMP tether. This was because we and others (25, 36) have shown that, under induction conditions, the TEM signal sequence directs transport of the β-lactamase protein to all three regions of the bacterial cell (the cytoplasm, the periplasm, and the extracellular environment) in the approximate ratio of 70:25:5. Therefore, by choosing the TEM-1 signal, we would not be restricting the site of action of de novo AMPs to a particular cellular environment. The expression vector that would accept the incoming hairpin-encapsulated DC fragments was redesigned to include the segment of the TEM-1 gene that codes for the 23-amino acid-long N-terminal signal (see "Experimental Procedures"). This vector (Supplemental Fig. S3, pTEMDNp28) was digested with SnaBI and introduced to an equimolar hairpin-encapsulated DC mixture. The ligation mixture was used to transform E. coli BL21(DE3)-competent cells, and the cells were plated onto media containing kanamycin (the vector marker). We obtained a library size of 103 colony-forming units; the size varied by a factor of a fold when the experiments were repeated, presumably because of differences in ligation and transformation efficiencies. The colonies were then replica-plated onto plates that, in addition to kanamycin, also contained 0.5 mM IPTG. Although the majority of colonies displayed robust growth, we were able to isolate four plate I colonies that showed no growth on plate II (kanamycin + IPTG). Sequencing of DNA isolated from the colonies showed that all of them carried hairpin-DC sequences. Predicted translation products as well as physical properties of the three unique sequences are shown in Fig. 3. There is a wide variance in protein length and pI among the three proteins, named EQAMP1, EQAMP2, and EQAMP3. As a control, the DNA from a colony that grew well on both plates, I and II, was also sequenced (cEQ1). Consensus secondary structure predictions on all four proteins displayed a dramatic difference between the three proteins and the control protein (Fig. 3). Whereas EQAMP1-3 were predicted to be rich in α-helical content, the control protein, cEQ1, was predicted to be rich in random coils. This difference between the three EQAMP proteins and the control made us revisit the initial HPDN DC protein library (Fig. 2). Indeed, all HPDN1-8 proteins were predicted to have a random coil content >55%. Therefore, it is tempting to suggest that the EQAMP proteins, which were selected for their in vivo AMP behavior, display pronounced helical fold characteristics that stand in contrast to the predicted fold characteristics of non-selected DC proteins.
Confirmation of AMP Behavior of EQAMP Proteins. To confirm beyond reasonable doubt that the in vivo AMP activity witnessed was due to DC proteins and nothing else, we carried out the following experimental checks. (a) Plasmid DNA from the strains exhibiting growth inhibition upon addition of IPTG was isolated, purified, and used to transform a fresh lot of E. coli competent cells. The addition of IPTG to growing cultures of the freshly transformed cells also resulted in growth inhibition, thereby showing that the growth inhibition was a property of the plasmid DNA used for transformation. (b) The DC fragment within each plasmid was amplified using universal primers that bound only to the vector regions flanking the DC fragments and nowhere else. The amplified products were digested with restriction enzymes that cut within the vector sequences and ligated to freshly cut pET28a vector. The resulting plasmids (which should be identical to the starting plasmids) were used to investigate growth inhibition and were all found to exhibit the same. This ruled out growth inhibition by any means (such as contamination and other factors) other than the DC fragments themselves. (c) The supernatant of IPTG-induced growth cultures of EQAMPs was isolated and spread on agarose plates containing E. coli DH5α lawns, and the plates were incubated for a period of 14-18 h at 37 °C. The absence of any plaque formation ruled out phage-mediated inhibition as the reason for growth inhibition of EQAMP cultures. (d) As a final check, we wanted to confirm that the growth inhibition was due to DC proteins and not because of their DC-mRNA. For this purpose, the unique NcoI site in each EQAMP plasmid (that also contained the ATG start codon) was cut and filled in using Klenow polymerase, and the plasmids were self-ligated. This resulted in a frameshift starting from the NcoI ATG codon, so that the DC proteins were no longer the translated products. However, the mRNA in each case remained identical to the wild-type EQAMP mRNA. When the resulting plasmids EQAMPnco1-3 were used, no growth inhibition was seen (Supplemental Fig. S4), thereby confirming that it was the DC protein and not the DC-mRNA that caused inhibition.
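The frameshift in check (d) can be illustrated with a short Python sketch. NcoI cuts C^CATGG, leaving a four-nucleotide 5' overhang; filling in both ends and re-ligating the blunt ends duplicates those four nucleotides, turning CCATGG into CCATGCATGG. The construct below is a hypothetical toy sequence written for this example and is not an EQAMP gene; all names in the code are ours.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

NCOI_SITE = "CCATGG"         # cut as C^CATGG
FILLED_AND_LIGATED = "CCATGCATGG"  # 4-nt overhang (CATG) duplicated


def fill_in_ncoi(seq: str) -> str:
    """Model NcoI digestion, Klenow fill-in, and blunt-end self-ligation."""
    if seq.count(NCOI_SITE) != 1:
        raise ValueError("expected exactly one NcoI site")
    return seq.replace(NCOI_SITE, FILLED_AND_LIGATED)


def translate_from_first_atg(seq: str) -> str:
    """Translate from the first ATG up to, but excluding, the first stop."""
    start = seq.find("ATG")
    protein = []
    for i in range(start, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# HYPOTHETICAL toy construct: short leader + NcoI site containing the ATG
# + a small dicodon-like open reading frame ending in a stop codon.
wild_type = "AAGGAGATATA" + "CCATGG" + "AACTGGATATCATGCATGAACTGTAA"
mutant = fill_in_ncoi(wild_type)

wild_type_protein = translate_from_first_atg(wild_type)
mutant_protein = translate_from_first_atg(mutant)

print(len(mutant) - len(wild_type))  # 4
print(wild_type_protein)             # MELDIMHEL
print(mutant_protein)                # MHGTGYHA
```

In this toy case the four added nucleotides shift the reading frame immediately after the start codon, so the translated product changes completely and terminates early, while the transcript differs from the original only by the duplicated overhang.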
α-helical elements (Fig. 3). Indeed, if one were to assume this prediction to be correct, the helical-wheel projection drawing of EQAMP2, for example, nicely illustrates the demarcation of its residues, non-polar in the interior and polar at the exterior of the assumed all-helical bundle (Supplemental Fig. S7). However, at present these are at best conjectures. Isolation of purified EQAMP proteins and their subsequent structural characterization would lead us to an accurate representation. As the next best alternative, we isolated representative AMPs (EQAMP2 and SKAMP1) as trx fusion proteins and obtained their CD spectra (Fig. 6). The AMP fusion proteins displayed a well folded structure, as indicated by the CD data for EQAMP2 (helix, 31%; sheet, 0%; turn, 38%; random, 31%) and for SKAMP1 (helix, 29%; sheet, 4%; turn, 34%; random, 33%; Fig. 6B).
Direct Visualization of the in Vivo Effect of EQAMPs. Although growth inhibition experiments had earlier indicated to us that the EQAMP1-3 proteins were most probably bacteriostatic rather than bacteriolytic in nature, we decided to investigate by TEM the state of the host cell post-IPTG induction, with a non-induced cell culture as a reference control. TEM results (Supplemental Fig. S8, A-D) showed that induced samples displayed a morphology that was markedly different from that of the non-induced samples. In addition to the disintegration of the cell wall, widespread cytoplasmic contraction was also clearly visible. The latter is generally caused when a membrane protein of a cell is affected or sequestered or when the nucleic acids of the cell are targeted (17). We decided to investigate this phenomenon further by studying the possible association of EQAMPs with heparin (heparan sulfate). It is well known that proteins that bind heparin also bind nucleic acids (38, 39). In addition, many AMPs have been shown to bind heparin directly (40).
Although obtaining pure EQAMPs had proved earlier to be unsuccessful, we were able to obtain good quantities of soluble EQAMPs as fusion partners of the GST protein. As a representative example of such fusion proteins, we decided to investigate the SKAMP1-GST fusion protein (see below for the synthesis and isolation of SKAMP1 protein). Purified SKAMP1-GST (molecular mass of 36.7 kDa) was able to bind to heparin at varying pH conditions, and the fusion protein could partly be eluted with buffer containing 1 M NaCl (Fig. 7). Some of the protein was seen bound to heparin resin, even after elution with the above-mentioned buffer, indicating very strong binding. It has previously been shown that GST on its own does not bind to heparin (41), a finding also confirmed by us using purified GST protein as a control (Fig. 7). Therefore, these results indicate association of AMPs with a negatively charged moiety (in this case heparin) that could be either a nucleic acid or indeed a negatively charged membrane protein. Further studies in this direction are ongoing.
α-helical elements. Growth inhibition and zone clearance studies with SKAMP1 yielded results similar to those with EQAMPs, indicating that expression of SKAMP1 was also severely deleterious to the host cell (Fig. 4 and Supplemental Fig. S6). Thus, the isolation of an AMP from a skewed DC library is a step in the direction of tailoring protein libraries to meet specific needs. We are currently studying the prospect of creating "severely skewed" libraries wherein the negatively charged dicodons have been altogether removed from the DC mixture.

| CONCLUSIONS |
|---|
| FOOTNOTES |
|---|
The on-line version of this article (available at http://www.jbc.org) contains Supplemental data and Figs. S1-S8.
Both authors contributed equally to this work.
¶ Present address: School of Medicine, Stanford University, Palo Alto, CA 94305.
|| To whom correspondence should be addressed. Tel.: 91-11-26195007; Fax: 91-11-26162316; E-mail: anand{at}icgeb.res.in.
1 The abbreviations used are: AMP, antimicrobial peptide; trx, thioredoxin; DC, dicodon; HPDN, hairpin-encapsulated DC fragment; IPTG, isopropyl β-D-thiogalactoside; GST, glutathione S-transferase; Ni-NTA, nickel-nitrilotriacetic acid; TEM, transmission electron microscopy; ds, double-stranded.
| ACKNOWLEDGMENTS |
|---|
| REFERENCES |
|---|