Functionalized protein-like structures from conformationally defined synthetic combinatorial libraries.

An approach is described for the de novo design of protein-like structures in which synthetic combinatorial libraries (SCLs) were incorporated into an amphipathic alpha-helical scaffold (an 18-mer sequence made up of leucine and lysine residues) to generate conformationally defined SCLs. In particular, the SCLs in which the "combinatorialized" positions were on the hydrophilic face showed an alpha-helical conformation in mild buffer. These SCLs were used to generate context-independent but position-dependent scales of alpha-helical propensity for the L-amino acids. These scales were then used to design highly alpha-helical peptides that self-associated in mild buffer. The same approach was also found to permit the identification of conformation-dependent decarboxylation catalysts.

Synthetic combinatorial libraries (SCLs) are broadly recognized as having the capability of greatly accelerating the discovery of new lead compounds. SCL approaches have primarily been focused on the generation of small molecule diversities (i.e. short peptides, peptidomimetics, or organic compounds) (1). The generation of molecular diversities based on defined structural motifs can be expected to broaden the use of SCLs for those applications requiring the presence of a well defined secondary and/or tertiary structure. For example, the de novo design of artificial receptors and catalysts in most cases requires the generation of protein-like molecules having the general structural and functional properties found in natural enzymes. Thus, a productive strategy can be seen in the selection-based design of well defined secondary and tertiary structures, which maintain sufficient flexibility to allow the accessibility of the functionalities required for catalysis to occur.
In an initial effort directed toward the design of new protein-like structures using SCL approaches, conformationally defined combinatorial libraries have been designed by randomizing five positions in two different 26-amino acid-long naturally occurring peptides. The first of these libraries, built in our laboratory, was based on the amphipathic peptide melittin (2), whereas a Cys₂His₂ consensus "zinc finger" motif was used as a scaffold by Bianchi et al. (3). These first examples have proven to be useful case studies to confirm that synthetic combinatorial chemistry can be readily extended to structurally defined polypeptides or protein segments.
Based on our earlier studies on structure-activity relationships using basic polypeptides (4-6), a series of conformationally defined SCLs were then constructed around an 18-mer peptide composed solely of leucine and lysine residues (7). Leucine/lysine-based peptides represent simple model systems capable of forming protein-like conformations through self-aggregation. The resulting structures, however, lack the specific electrostatic, hydrogen-bonded, and van der Waals interactions that are necessary to stabilize unique protein conformations (8). Therefore, we used combinatorial diversity with the aim of overcoming a number of these limitations while also allowing the design of new, functionalized, positively charged, self-assembling proteins having unique, well defined structures for specific functions. In an initial step, the structural nature of these libraries was used to study the helical propensity of amino acids, which permitted the identification of peptides that self-aggregated into α-helical conformations in mild buffer.
The advantage of self-aggregated amphipathic peptides is their inherent tendency to produce hydrophobic cores, much like the intrinsic character of protein interiors, while providing an outwardly directed polar environment. Such characteristics have been applied in different studies for the design of new macromolecules that are able to specifically bind to ligands and/or to bring into close proximity functional groups that result in known biological functions (9-12). The conformationally defined SCL approach was used in the present study to design new protein-like structures that have binding step-based catalytic activity. The catalytic properties of individual self-aggregated peptides derived from these conformationally defined SCLs were explored with the decarboxylation of oxaloacetate serving as a model reaction system.

EXPERIMENTAL PROCEDURES
SCLs and Individual Peptide Synthesis-The SCLs and individual peptides were prepared by simultaneous multiple peptide synthesis using t-Boc chemistry as described elsewhere (13). The mixture positions were obtained using a mixture of 19 L-amino acids (cysteine was omitted) based on a predefined chemical ratio (14) at each coupling step. Final cleavage and deprotection steps were carried out using a "low-high" hydrogen fluoride procedure (15, 16). Individual peptides were purified by preparative reversed phase-high performance liquid chromatography (HPLC) using a DeltaPrep 3000 reversed phase-HPLC combined with a Foxy fraction collector (Millipore, Waters Division, San Francisco, CA). Analytical reversed phase-HPLC and laser desorption time-of-flight mass spectroscopy (Kompact Maldi-Tof mass spectrometer, Kratos, Ramsey, NJ) were used to determine the purity and identity of the individual peptides.
Circular Dichroism Measurements-All measurements were carried out on a Jasco J-720 CD spectropolarimeter (Eaton, MD) in conjunction with a Neslab RTE 110 water bath and temperature controller (Dublin, CA). CD spectra were the average of a series of 3-7 scans made at 0.2-nm intervals recorded at 25°C. CD spectra of the same buffer without peptide were used as baseline in all the experiments. Ellipticity is reported as mean residue ellipticity [Θ] (deg cm² dmol⁻¹); the limits of error of measurements at 222 nm were ±200 deg cm² dmol⁻¹. Peptide concentrations were determined by UV spectrophotometry at 276 nm in buffer using ε = 1420 M⁻¹ cm⁻¹ for tyrosine (17) and ε = 5570 M⁻¹ cm⁻¹ for tryptophan (18).
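The concentration and ellipticity conversions described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes the Beer-Lambert law, assumes that tyrosine and tryptophan contributions at 276 nm are simply additive, and uses the standard conversion from observed ellipticity in millidegrees to mean residue ellipticity. The function names and the default 1-cm path length are hypothetical.

```python
# Extinction coefficients at 276 nm quoted in the text (M^-1 cm^-1).
EPSILON_TYR = 1420.0
EPSILON_TRP = 5570.0


def peptide_concentration(a276, n_tyr=1, n_trp=0, path_cm=1.0):
    """Molar peptide concentration from absorbance at 276 nm (Beer-Lambert).

    Assumption: chromophore contributions are additive.
    """
    epsilon = n_tyr * EPSILON_TYR + n_trp * EPSILON_TRP
    if epsilon <= 0:
        raise ValueError("peptide needs at least one Tyr or Trp residue")
    return a276 / (epsilon * path_cm)


def mean_residue_ellipticity(theta_mdeg, conc_molar, n_residues, path_cm):
    """Convert observed ellipticity (mdeg) to [theta] in deg cm^2 dmol^-1."""
    return theta_mdeg / (10.0 * conc_molar * path_cm * n_residues)
```

For an 18-mer carrying a single tyrosine, `peptide_concentration(a276)` gives the molar concentration that is then passed to `mean_residue_ellipticity` together with the cell path length.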
The GdnHCl denaturation studies were carried out by preparing mixtures of a stock solution of peptide mixture in buffer (5 mM MOPS-NaOH, 200 mM NaCl, pH 7.0), buffer alone, or a solution of 7.2 M GdnHCl in buffer. The ratios of buffer and 7.2 M GdnHCl solutions were varied to give the appropriate final GdnHCl concentrations. The samples were allowed to equilibrate for 30 min at room temperature prior to CD measurement.
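The mixing scheme above amounts to a simple dilution calculation. The sketch below assumes that every sample receives the same volume of peptide stock and that volumes are additive; the function name and the microliter units are illustrative choices, and only the 7.2 M stock concentration comes from the text.

```python
GDNHCL_STOCK_M = 7.2  # denaturant stock concentration given in the text


def mixing_volumes(target_gdnhcl_m, total_ul, peptide_stock_ul):
    """Return (peptide_ul, buffer_ul, gdnhcl_ul) for one sample.

    Assumption: volumes are additive and the peptide stock volume is
    held constant so that all samples share one peptide concentration.
    """
    gdnhcl_ul = target_gdnhcl_m / GDNHCL_STOCK_M * total_ul
    buffer_ul = total_ul - peptide_stock_ul - gdnhcl_ul
    if gdnhcl_ul < 0 or buffer_ul < 0:
        raise ValueError("target concentration not reachable with these volumes")
    return peptide_stock_ul, buffer_ul, gdnhcl_ul
```

Holding the peptide volume fixed caps the highest reachable denaturant concentration below 7.2 M; the function raises an error rather than returning a negative buffer volume.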
Activity Assay for Decarboxylation of Oxaloacetate-The kinetic parameters for decarboxylation of oxaloacetate with the different peptides were determined by following the loss of absorbance at 280 nm arising from the enol of oxaloacetate as described by Johnsson et al. (11) using a Hewlett Packard 8452A diode array UV spectrophotometer (Palo Alto, CA). Each peptide mixture was initially screened at 0.2 mM with 11.4 mM oxaloacetate in phosphate-buffered saline buffer (35 mM phosphate, 0.15 M NaCl, pH 7) at 25°C. The reaction was monitored for 45 min. The specific activity was defined as the change in oxaloacetate concentration as a function of reaction time and peptide or peptide mixture concentration. The average standard error between measurements was ±0.44 × 10⁻³ s⁻¹.
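One way to read the definition of specific activity given above is as the rate of oxaloacetate loss divided by the peptide concentration, which yields units of s⁻¹. The sketch below follows that reading under two stated assumptions that are not taken from the text: the 280 nm absorbance is proportional to the remaining oxaloacetate, and the rate is obtained from a straight-line fit over the supplied time window. All function names are hypothetical.

```python
def _slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def specific_activity(times_s, absorbances,
                      oxaloacetate_m=11.4e-3, peptide_m=0.2e-3):
    """Specific activity in s^-1 from an A280 time course.

    Assumption: absorbance scales linearly with remaining oxaloacetate,
    with the first reading corresponding to the starting concentration.
    """
    a0 = absorbances[0]
    conc = [oxaloacetate_m * a / a0 for a in absorbances]
    rate_m_per_s = -_slope(times_s, conc)  # positive for substrate loss
    return rate_m_per_s / peptide_m
```

The default concentrations are the screening conditions quoted in the text (11.4 mM oxaloacetate, 0.2 mM peptide).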

RESULTS AND DISCUSSION
Design of Conformationally Defined SCLs-The 18-mer peptide scaffold used in these studies (termed YLK) (Table I) was found in earlier studies to be a random coil in mild buffer but to adopt a monomeric amphipathic α-helical conformation (6) both in trifluoroethanol, a solvent known to promote α-helical conformation (19), and in a lipid environment. Three different SCLs were initially constructed (Table I and Fig. 1): one in which the "combinatorialized" positions were on the hydrophobic face of an amphipathic helix (i.e. replacing leucine 4, 7, 11, and 14), one in which the combinatorialized positions were on the hydrophilic face (i.e. replacing lysine 6, 9, 13, and 16), and one in which the four combinatorialized positions were placed on a complete helical turn (i.e. replacing leucine 7 and 8 and lysine 9 and 10). Each library was characterized by a single defined position (termed the O position) with one of 19 L-amino acids (cysteine was omitted to prevent the formation of dimers or polymers) and three mixture positions (termed the X positions), which are made up of a close to equimolar mixture of the same 19 L-amino acids.
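The library layout described above can be made concrete with a short enumeration. In this sketch the scaffold sequence is an assumption: it is inferred by restoring lysine at positions 6, 9, 13, and 16 of the active peptide quoted later in the text, and it is not copied from Table I. The function names are hypothetical. Each peptide mixture (one O position, three X positions) then contains 19³ = 6859 sequences.

```python
from itertools import product

AMINO_ACIDS = "ADEFGHIKLMNPQRSTVWY"  # 19 L-amino acids, cysteine omitted
YLK = "YKLLKKLLKKLKKLLKKL"           # assumed scaffold sequence (see lead-in)

# Combinatorialized positions (1-based) of the three initial libraries.
LIBRARY_POSITIONS = {
    "hydrophobic face": (4, 7, 11, 14),
    "hydrophilic face": (6, 9, 13, 16),
    "helical turn": (7, 8, 9, 10),
}


def mixture_pattern(positions, defined_pos, defined_aa, scaffold=YLK):
    """Sequence pattern with the O position defined and X at the others."""
    seq = list(scaffold)
    for pos in positions:
        seq[pos - 1] = defined_aa if pos == defined_pos else "X"
    return "".join(seq)


def mixture_members(pattern):
    """Yield every individual sequence represented by a mixture pattern."""
    slots = [i for i, residue in enumerate(pattern) if residue == "X"]
    for combo in product(AMINO_ACIDS, repeat=len(slots)):
        seq = list(pattern)
        for index, aa in zip(slots, combo):
            seq[index] = aa
        yield "".join(seq)
```

For example, `mixture_pattern(LIBRARY_POSITIONS["hydrophilic face"], 6, "E")` describes the mixture with glutamic acid defined at position 6.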
In order to analyze the structural behavior of these three different libraries, the apparent helical contents were measured for each peptide mixture using CD spectroscopy either in MOPS buffer at neutral pH or in the presence of 80% trifluoroethanol. In buffer, the CD spectra showed clear indications of an α-helical conformation only when the randomized region was contained on the hydrophilic face of the helix (Fig. 2). The highest apparent ellipticity at 222 nm (negative minimum characteristic for α-helices) was found when glutamic acid was in the defined position (i.e. replacing lysine 6), and the lowest apparent ellipticity occurred with proline in this position. In addition, the helical content was found to be concentration-dependent, indicating the formation of aggregates under these conditions. A tetrameric aggregate was found to best fit the concentration dependence curve obtained for the peptide mixtures tested in mild buffer (data not shown). It should be noted that high ionic strength and high peptide concentrations were necessary for the original YLK peptide to form tetrameric aggregates. All of the peptide mixtures from the three libraries were found to adopt an α-helical conformation in trifluoroethanol with relatively small variations in apparent helical content.
To properly mimic naturally occurring proteins, the de novo designed proteins should have a well defined conformation under physiological conditions. The library having its diversity on the hydrophilic face of YLK appeared to be more suitable for studies directed toward the generation of such proteins. Furthermore, the variation seen in apparent helical content between the peptide mixtures within this library is anticipated to allow the modulation of the helical content of final individual basic polypeptides. In order to obtain information on all of the randomized positions in a single screening process, this library was extended to generate four related libraries in which the O position was at position 6, 9, 13, or 16 (these four libraries represent a positional scanning SCL) (Table I and Ref. 20).
Understanding the Helical Propensity of Amino Acids-Despite the numerous studies directed toward understanding α-helical propensity at the amino acid level (21-29), the use of α-helical propensity scales to predict the secondary structures of peptides and proteins remains problematic. Conformationally defined SCLs represent a new approach for the evaluation of the helical propensity of amino acids in a random local environment (i.e. considering average intra- or intermolecular side chain interactions) in basic polypeptides. Such an approach can be expected to provide information about the average contribution of an amino acid toward the helical stability of basic polypeptides and protein segments in a local context-independent but position-dependent manner.
Both hydrogen bonding between the residues at positions i and i+4 or i−4 and the side chain interactions between residues at positions i and i+3 or i−3 are known to be the main contributors to helical stability. As summarized in Table II, the extent of local random environments depends on the position of the defined amino acid within the sequence. For instance, a close to completely random local environment was achieved when the defined residue was located at position 9 or 13. In both cases, a random environment was provided by one of the two hydrogen bonds (on the carbonyl side or amide hydrogen side, respectively) and by the i−3 or i+3 position, respectively. In contrast, when the defined residue was located at position 6 or 16, a random environment was provided only by the position i+3 or i−3, respectively. A completely random environment having X positions at i±4 and i±3 could not be achieved while maintaining the amphipathicity (i.e. without substituting leucine residues) and/or maintaining the mixture positions in the middle of the hydrophilic face (i.e. at positions that are exposed to the solvent and distant from the hydrophobic/hydrophilic interface).
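The positional argument above can be checked mechanically. The following sketch lists, for each defined position, which of the i±3 and i±4 neighbors fall on another mixture position and therefore contribute a random local environment. The function name is hypothetical; the positions and the 18-residue length come from the text.

```python
MIXTURE_POSITIONS = (6, 9, 13, 16)  # hydrophilic-face library positions


def random_neighbours(defined_pos, positions=MIXTURE_POSITIONS, length=18):
    """Map each offset (-4, -3, +3, +4) to the mixture position it reaches.

    Offsets that land outside the peptide or on a fixed scaffold residue
    are left out of the result.
    """
    result = {}
    for offset in (-4, -3, 3, 4):
        neighbour = defined_pos + offset
        if 1 <= neighbour <= length and neighbour in positions:
            result[offset] = neighbour
    return result
```

Positions 9 and 13 each reach two mixture positions (one at i±4 and one at i∓3), whereas positions 6 and 16 reach only one, which matches the distinction drawn between "close to random" and more fixed local environments.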
Each peptide mixture from the four libraries was analyzed by CD spectroscopy in the absence or the presence of GdnHCl. The stabilities of the peptide mixtures to GdnHCl denaturation were measured by monitoring the mean residue ellipticity at 222 nm as a function of denaturant concentration. In a manner similar to proteins or single sequence defined polypeptides, the mixtures unfolded cooperatively with midpoints that depended markedly on the nature and position of the defined amino acid (Fig. 3). The apparent helical content of the peptide mixtures defined with a given amino acid was found to vary according to the location of this amino acid within the sequence (i.e. which library the corresponding peptide mixture belongs to). The degree of variability was found to be amino acid-dependent. For instance, alanine at position 13 (in the middle of the helix) led to a higher helical content than when it was located close to the N terminus (i.e. at position 6) (Fig. 4A). In contrast, a glycine at position 6 led to a higher helical content when located near the N terminus than when located in the middle of the helix or close to the C terminus (Fig. 4B). Other amino acids, such as asparagine, could be located at any position without significant variations in the resulting helical content (Fig. 4C). The apparent free energy of helix formation (ΔG_f^app) was calculated for each peptide mixture based on a tetramer formation, assuming that the helix-coil transition is a totally cooperative two-state transition (5, 30, 31). The resulting rank order of free energies for helix stabilization of each amino acid by position in the sequence (i.e. in a given local environment) is summarized in Table III. Similar values were obtained when the free energy for helix formation was estimated from the data resulting from GdnHCl denaturation and extrapolation of the free energy at each individual concentration of GdnHCl back to zero concentration (data not shown).
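The text does not spell out the equations behind ΔG_f^app, so the following is only one plausible sketch of a two-state monomer-tetramer treatment. It assumes the equilibrium 4 coil monomers ⇌ 1 helical tetramer with a 1 M standard state, reference ellipticities for the fully coiled and fully helical states supplied by the user, and a straight-line extrapolation of the free energy to zero denaturant. All names and numerical inputs are hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def fraction_helix(theta_obs, theta_coil, theta_helix):
    """Fraction folded from the 222 nm ellipticity, assuming two states."""
    return (theta_obs - theta_coil) / (theta_helix - theta_coil)


def delta_g_tetramer(frac, total_peptide_m, temp_k=298.15):
    """Apparent free energy (kcal/mol) for 4 monomers <-> 1 tetramer.

    total_peptide_m is the total peptide concentration in monomer units.
    """
    if not 0.0 < frac < 1.0:
        raise ValueError("fraction folded must lie strictly between 0 and 1")
    monomer = (1.0 - frac) * total_peptide_m
    tetramer = frac * total_peptide_m / 4.0
    k_assoc = tetramer / monomer ** 4
    return -R_KCAL * temp_k * math.log(k_assoc)


def extrapolate_to_zero(denaturant_m, delta_g):
    """Least-squares line through (denaturant, dG); returns dG at 0 M."""
    n = len(denaturant_m)
    mean_x = sum(denaturant_m) / n
    mean_y = sum(delta_g) / n
    sxx = sum((x - mean_x) ** 2 for x in denaturant_m)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(denaturant_m, delta_g))
    slope = sxy / sxx
    return mean_y - slope * mean_x
```

Because the tetramer free energy covers four chains, a per-monomer comparison divides it by four, as is done in the legend to Fig. 5.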
In addition to the position dependence found for the ΔG_f^app values for given amino acids, the relative rank orders of the amino acids varied from one scale to the other. As expected, the highest similarity was found for the scales generated from libraries having defined positions at either 9 or 13, both of which represent a close to random local environment for the amino acids studied. As in other scales presented in the literature, proline was found to have the least helical propensity, a property that was independent of its location. Proline ΔG_f^app was 3-4 kcal/mol lower than the amino acid with the highest helical propensity in each scale.
The discrepancy seen between the scales generated from the libraries representing "fixed" local environments (defined position at 6 or 16) and those representing "random" local environments (defined position at 9 or 13) further confirms that the nature of amino acid-amino acid interaction plays an important role in determining the helical propensity. We therefore believe that a local random environment can translate the average helical propensity of an amino acid, which, in turn, can be expected to assist in understanding the secondary structure of a given peptide sequence. In addition, these results show that conformationally defined SCLs providing random local environments can be used to yield an overall helical propensity value for each amino acid that is independent of the nature of the interacting amino acid side chains and of the contribution of the donor/acceptor pair of hydrogen bonding amino acids. When comparing the relative free energies of helix formation obtained in a close to random environment with other experimental and statistical data, the best linear correlation was with the relative free energies required to orient the amino acids in helical dihedral angles described by Muñoz and Serrano (29) (Fig. 5). This can be explained by the fact that in contrast to other approaches, this method does not involve the contribution of the hydrogen bonds and side chain-side chain interactions. It should be noted that the relative effect toward the stability of the α-helix by the six amino acids Ala, Ile, Met, Gln, Val, and Thr correlates well for all the different proposed scales (21-29). Thus, these amino acids can be considered as helix stabilizers or destabilizers regardless of their local environment. The analysis of the relative effect of the other 14 naturally occurring amino acids shows significant discrepancies between the different scales, indicating their greater susceptibility to their local environment.
In order to test whether the average positional helical propensity found in the present studies can be translated to individual peptides, three different series of peptides were prepared. In two of the series, the peptides represented the combination at positions 6, 9, 13, and 16 of either the four amino acids with the highest or, separately, the lowest helical propensity from each library or combinations of amino acids having high propensity with others having low helical propensity. The first series of peptides was based on the original YLK peptide, whereas the second series was synthesized incorporating these amino acids into an intrinsically less amphipathic 18-mer peptide composed solely of alanine and serine residues (termed YAS) (Table IV). In agreement with our scales, the peptide YLK[W⁶L⁹A¹³E¹⁶] showed the highest helical content in aqueous buffer when compared with the other peptides of its series (Table IV). A completely unrelated peptide, PGLa, was also selected for validation of the described scales.
PGLa is a 21-residue naturally occurring peptide isolated from frog skin that is known to be inducible into an amphipathic α-helical conformation (32). Two PGLa analogs were prepared by replacing the residues at positions 6, 9, 13, and 16 either with the four amino acids having the highest helical propensity in the respective positional scales (i.e. Trp, Leu, Ala, and Glu, respectively) or with four alanine residues, because alanine shows one of the highest helical propensities in most of the reported scales (21-29). Although the helical content was low for the three peptides in MOPS buffer, analog PGLa-[W⁶L⁹A¹³E¹⁶] had the highest helical content, followed by PGLa-[A⁶A⁹A¹³A¹⁶] and PGLa (Table IV). Greater differences in ellipticities were observed in 10% trifluoroethanol, which allows the detection of weak helical tendencies in unstructured peptides ([Θ]₂₂₂ = −7793, −5377, and −4331 deg cm² dmol⁻¹ for PGLa-[W⁶L⁹A¹³E¹⁶], PGLa-[A⁶A⁹A¹³A¹⁶], and PGLa, respectively).

Table footnote a: The CD spectra were recorded in 5 mM MOPS buffer, 200 mM NaCl, pH 7, 25°C at a peptide concentration of 70 µM.

FIG. 5. Correlation between ΔΔG_f^app and ΔΔG_int. The ΔΔG_f^app represents the ΔG_f^app of each amino acid when defined at position 13 relative to the ΔG_f^app of glycine and was divided by four to account for the contribution of the monomeric units. The ΔΔG_int represents the intrinsic ΔG of each amino acid relative to the intrinsic ΔG of glycine reported by Muñoz and Serrano (29). The correlation coefficient of these values (after removing the values for Arg, Leu, Ser, and Trp) was 0.91, and the slope was 0.95.
The similarities in the helical content rank order found between these three separate series illustrate the potential of average helical propensity in the design of new, α-helical basic polypeptides. In particular, using conformationally defined SCLs, we were able to modify in a single optimization step the sequence of the original YLK scaffold to generate individual sequences having well defined secondary structures in mild buffer. Furthermore, the use of the nonsequence-related PGLa series shows that one can increase the helical content of a given sequence by incorporating high propensity amino acids as determined using random local environment scales at specific positions. Overall, these results support the contribution positional dependence makes to the α-helical stability of amino acids.
Identification of Peptides Having Catalytic Activity-Enzyme engineering has been used by several groups to develop new catalysts that operate outside their native context or catalyze reactions not represented in nature. A number of peptides have been designed that have catalytic activities, although these activities are low when compared with natural enzymes (9-12). In particular, Johnsson et al. recently reported the catalytic activity of a 14-mer peptide, called oxaldie-1, for the oxaloacetate decarboxylation reaction (11). Such activity was postulated to be closely related to the ability of this peptide to adopt a partial α-helical conformation in aqueous solution, which, due to the resulting helix dipole, lowers the pKa of the reactive N-terminal amino group, and to electrostatically attract the substrate through lysine residues. Due to the intrinsic character of the conformationally defined SCLs described in these studies, we used this catalytic reaction as a model system to validate the use of these SCLs for the identification of new catalytic compounds.
Each peptide mixture was assayed for its ability to enhance the decarboxylation of oxaloacetate. The original YLK sequence and oxaldie-1 served as references in the assay used in this study and showed similar specific activities. All of the peptide mixtures, except when proline was at the defined position, were found to have equal or higher activity when compared with the original YLK and oxaldie-1. The good correlations observed between the specific activity and the ability of the mixtures to adopt an α-helical conformation in buffer (Fig. 6) support the reported role of this conformation in the catalytic activity of such peptides. This is confirmed by the fact that none of the peptide mixtures from the library built on the hydrophobic face of the helix showed significant activity. It should also be noted that none of these latter peptide mixtures adopted a defined conformation in buffer (Fig. 2A).
The identification of active individual sequences from the screening of an SCL in a positional scanning format involves selecting the most active peptide mixtures from each individual library and using combinations of the corresponding defined amino acids to prepare all of the possible individual peptides (20). Thus, a set of individual peptides was prepared that represented all possible combinations of the amino acid(s) selected at each position from the screening assay: Glu, Lys, or Trp at position 6; Ala, Asn, or Trp at position 9; Glu, Gln, Val, or Trp at position 13; and Asp, Met, Gln, or Arg at position 16. The combinations of these amino acids resulted in the generation of a set of 144 individual 18-mer peptides (3 × 3 × 4 × 4 = 144). Each individual peptide was screened for catalytic activity in a manner similar to that carried out for the libraries. The specific activities ranged from 10.2 × 10⁻³ to 3.9 × 10⁻³ s⁻¹ (representative sequences are shown in Table V). The most active individual peptides were found to catalyze the decarboxylation of oxaloacetate via a UV-detectable enamine intermediate (11, 33) with Michaelis-Menten saturation kinetics corresponding to a K obs value 3-4-fold higher than that found for oxaldie-1 under the assay conditions used (Fig. 7). This activity is in the same order of magnitude as that first reported for catalytic antibodies (34) and represents a 10³-10⁴-fold enhancement over the same reaction catalyzed by simple amines (11). The activity found for these peptides was related to their ability to fold into an α-helical conformation in buffer ([Θ]₂₂₂ = −28300 and −4460 deg cm² dmol⁻¹ for YKLLKELLAKLKWLLRKL-NH₂ and oxaldie-1, respectively; peptide concentration was 200 µM in 5 mM MOPS buffer, 200 mM NaCl, pH 7, 25°C). A negative control peptide containing three proline residues was included in this set of peptides in order to confirm the low activity observed for the peptide mixtures defined with proline. As anticipated, this peptide had lower activity than any of the 144 peptides tested (specific activity of 3.75 × 10⁻³ s⁻¹), combined with a low helical content in buffer ([Θ]₂₂₂ = −3985 deg cm² dmol⁻¹).
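The deconvolution step described above, in which the residues selected at each position are combined into individual sequences, can be sketched as a Cartesian product. The selected residues are those listed in the text; the scaffold sequence is an assumption inferred from the active peptide quoted in the text (lysine restored at positions 6, 9, 13, and 16), and the function name is hypothetical.

```python
from itertools import product

SCAFFOLD = "YKLLKKLLKKLKKLLKKL"  # assumed YLK sequence (see lead-in)

# Residues selected at each defined position (1-based) in the screening.
SELECTED = {
    6: "EKW",    # Glu, Lys, Trp
    9: "ANW",    # Ala, Asn, Trp
    13: "EQVW",  # Glu, Gln, Val, Trp
    16: "DMQR",  # Asp, Met, Gln, Arg
}


def candidate_peptides(scaffold=SCAFFOLD, selected=SELECTED):
    """Return every individual sequence built from the selected residues."""
    positions = sorted(selected)
    peptides = []
    for combo in product(*(selected[pos] for pos in positions)):
        seq = list(scaffold)
        for pos, aa in zip(positions, combo):
            seq[pos - 1] = aa
        peptides.append("".join(seq))
    return peptides
```

`candidate_peptides()` returns 3 × 3 × 4 × 4 = 144 distinct 18-mers, including the sequence YKLLKELLAKLKWLLRKL reported as highly active.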
Combinatorial Approaches to the de Novo Design of Proteins-The present results demonstrate that highly simplified, functional proteins can be efficiently designed using a combination of strategies involving the rational design of a protein framework and the "natural" selection process of combinatorial library approaches (i.e. selection driven by the assay system of interest). Thus, based on a protein framework that provides the minimum required physicochemical properties resulting in a desired biological function, the ability to engineer randomized positions along such a framework using combinatorial chemistry allows the optimization of both the structural and functional properties of these proteins. This first use of conformationally defined SCLs for the design of new catalytic compounds serves as a further example of the strength of the SCL approach for the preparation of small functionalized proteins. Furthermore, these approaches enhance the structural understanding of polypeptides and protein segments.

FIG. 7. Kinetic analysis for the decarboxylation activity of individual peptides derived from the SCLs. All the parameters were determined in phosphate-buffered saline buffer at 25°C. The peptide concentration was selected at 0.2 mM, which was found to be in the linear section of the maximum rate versus peptide concentration plots. The initial rates are plotted for YKLLKELLAKLKWLLRKL-NH₂, YKLLKLLLPKLKPLLPKL-NH₂, oxaldie-1, and spontaneous hydrolysis. The derived kinetic parameters are shown in the top panel.