Local Interactions in Protein Folding: Lessons from the α-Helix

The α-helix is an elegantly simple structure (1), but helix thermodynamics has proven to be complex and often refractory. The appealing idea that helix formation is an early guiding event in protein folding (2-6) remains controversial (7-11). We suggest that the known facts are sufficient to resolve this controversy because they imply that side chain conformational entropy, and therefore helix propensity, must play an important organizing role during the earliest stages of protein folding.
Briefly, the argument presented in this minireview is as follows. There are only two substantially populated conformers available to a non-glycyl/non-prolyl peptide (12,13), α and β, corresponding to regions of the φ,ψ map near (−60°,−40°) and (−120°,+135°), respectively. The repetition of α-values for successive residues results in a helix, a conformation that compacts the chain, thereby expelling water while engendering intrasegment hydrogen bonds (1) and exquisite van der Waals contacts (14). Although the peptide backbone has a strong tendency to leave the vapor phase and enter water (15), the helix per se can be an even better "solvent" (16). Further, water liberated by backbone polar groups upon helix formation is released, an entropically favored event.
The preceding factors that promote helix formation are opposed by the entropic cost of freezing the main chain into a single conformation. On balance, helix formation must be energetically favorable for main chain atoms because alanine-based peptides are helical (17). Nonetheless, most peptides and protein segments are not helical, so unfavorable factors usually outweigh favorable ones. The several helix-promoting factors mentioned above all involve the peptide backbone and are therefore common to every residue (except glycine and proline). Consequently, helical conformation is the preferred state of the backbone (e.g. polyalanine), while helix-disfavoring factors must arise in the side chain.
Side chains also pay an entropic price for helix formation because the presence of the bulky helix backbone is sterically incompatible with some side chain conformers (18,19). Unlike the backbone, where helix-affecting factors are the same from residue to residue, side chain entropy differs from one residue type to the next. For example, helix formation largely restricts a central valine to only one of its three possible side chain configurations because one of the γ-carbons "bumps" into a backbone atom in either of the other two. These side chain steric factors predispose segments of the chain toward either α or β regions of the φ,ψ map, and their influence would be exerted early in folding because the interactions are local, exerted between atoms that are close in sequence. Consecutive residues that preferentially populate the same region, either α or β, become candidates for further stabilization into helices or strands, respectively. These entropically driven segments of nascent secondary structure affect subsequent folding by favoring certain pathways and suppressing others. Thus, helix and strand formation will be guiding events in protein folding.
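The idea that consecutive residues populating the same region become candidates for helix or strand can be illustrated with a minimal sketch. It assumes each residue has already been assigned a one-letter region label ("a" for α, "b" for β); the function name `nascent_segments` and the `min_length` cutoff are hypothetical choices made for illustration and are not taken from the studies cited here.

```python
from itertools import groupby


def nascent_segments(regions, min_length=3):
    """Find runs of consecutive residues that share one backbone region.

    regions    -- sequence of per-residue labels, e.g. "aaabbbb"
                  ("a" = alpha region, "b" = beta region; labels are assumed)
    min_length -- shortest run reported (hypothetical cutoff)

    Returns a list of (label, first_index, last_index) tuples, 0-based.
    """
    segments = []
    start = 0
    for label, run in groupby(regions):
        length = len(list(run))
        if length >= min_length:
            segments.append((label, start, start + length - 1))
        start += length
    return segments


if __name__ == "__main__":
    # Runs of "a" would be candidate helices, runs of "b" candidate strands.
    print(nascent_segments("aaaabbaaabbbbb"))
```

Such a scan only identifies candidates; whether a run is actually stabilized depends on the energetic factors discussed in the following sections.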

The α-Helix
The α-helix (1) is a well designed structure. Its hydrogen bonds are intrasegment and therefore self-contained, with near ideal geometry. Backbone atoms are close packed, and all interactions are local, confined between consecutive turns of the helix. Finally, the pattern is completely extensible; any number of consecutive residues can adopt a helical conformation with these favorable design characteristics.
Helices are observed frequently in proteins (20), suggesting that the α-helix is also a stable structure. The autonomous stability of the helix has been a topic of keen interest for many years, motivated in part by conjectures about protein folding. One engaging idea has been that helices seed the folding pathway and influence later folding events. Unfortunately, early experimental evidence indicated that the cooperative unit for stable helix formation is ~100 residues in length (21). This threshold exceeds the length of the average protein helix (~12 residues (22)) by almost an order of magnitude. Consequently, the conclusion that protein helices were far too short to function as independent folding units seemed inescapable.
This prevailing view of the 1970s came to be reversed in the 1980s, after Bierzynski et al. (23), expanding upon earlier work by Brown and Klee (24), demonstrated that residues 1-13 of ribonuclease, liberated upon cyanogen bromide cleavage, contained a helix that was stable in water at near physiological temperature. Other isolated protein fragments were found to be structured as well (25,26). Such results prompted a reevaluation; helices might function as independent folding units after all.
What factors are responsible for helix stability (27)? Two "textbook" features of helices are usually invoked to account for stability: hydrogen bonding (28) (but see Ref. 29) and tight main chain packing (14). A third, implicit feature is that when the chain folds into a helix, bound water is released and returned to the bulk phase. These helix-promoting factors are opposed by the entropic cost of constraining the main chain to a single conformation. For backbone atoms, there appears to be a net tendency toward helix formation because alanine-based peptides (17,30) are observed to be helical.
What factors are responsible for helix specificity (27,31)? That is, why are some peptides and protein segments helical, while others are not? The favorable main chain contributions to helix stability are constant from one residue to the next because all have identical backbones (except glycine and proline). Yet, most peptides are not helical nor are about 75% of the residues in proteins (20). Therefore, these helix-stabilizing main chain factors that cause polyalanine to be helical cannot be the ones that differentiate helix from non-helix.
* These minireviews will be reprinted in the 1997 Minireview Compendium, which will be available in December, 1997. This is the second of five articles in the "Protein Folding and Assembly Minireview Series." Financial support for this work was received from the National Institutes of Health.
Helix capping has been hypothesized as a general mechanism that discriminates between helices and other conformational alternatives, including coil (22,32). In proteins, the helix of average length (~12 residues) has eight intrasegment hydrogen bonds between successive amide hydrogen donors and carbonyl oxygen acceptors situated four residues previously in sequence (i.e. N-H(i) ··· O=C(i − 4)). Unavoidably, the initial four amide hydrogens and final four carbonyl oxygens of the helix lack intrasegment main chain hydrogen bonds because, upon termination, no next turn of helix exists to provide such partners. Thus, Pauling-Corey-Branson hydrogen bonds account for only 50% of the total in the helix of average length, with the first four N-H groups and last four C=O groups accounting for the remaining 50%.
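The hydrogen bond bookkeeping in the preceding paragraph can be checked with a short sketch. It assumes an idealized helix of n residues in which every N-H(i) pairs with O=C(i − 4) whenever that partner exists, and it follows the text's convention of counting each unsatisfied terminal group as one potential hydrogen bond. The function name and the returned keys are hypothetical.

```python
def helix_hbond_counts(n_residues, offset=4):
    """Count i -> i-4 backbone hydrogen bonds in an idealized helix.

    Assumes every residue carries one N-H donor and one C=O acceptor
    (i.e. no proline) and that N-H(i) bonds to O=C(i - offset).
    """
    if n_residues < 1:
        raise ValueError("n_residues must be at least 1")
    hbonds = max(n_residues - offset, 0)   # donors that find a partner
    free_nh = min(n_residues, offset)      # first N-H groups, no partner
    free_co = min(n_residues, offset)      # last C=O groups, no partner
    total = hbonds + free_nh + free_co     # bonds plus unsatisfied groups
    return {
        "hbonds": hbonds,
        "free_nh": free_nh,
        "free_co": free_co,
        "intrahelical_fraction": hbonds / total,
    }


if __name__ == "__main__":
    print(helix_hbond_counts(12))
```

For the 12-residue helix of average length, this reproduces the eight intrasegment bonds and the even split between satisfied bonds and uncapped groups described above. Longer helices shift the balance toward intrasegment bonds, which is one way to see why capping matters most for short helices.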
In both proteins and peptides, the chain leaving the helix tends to occlude some of these unsatisfied donors and acceptors, hindering access by solvent water. Provision of hydrogen bond partners for these otherwise unsatisfied amide hydrogens and carbonyl oxygens is termed helix capping. In addition to polar backbone groups, apolar side chains situated near helix termini can be solvent exposed, and therefore helix destabilizing, unless the chain folds so as to foster a hydrophobic contact. Recently, our definition of capping has been extended to include these hydrophobic capping interactions as well.¹ The helix capping hypothesis has been confirmed experimentally in both peptides (34-37) and proteins (38,39). Recurrent capping motifs such as the capping box (40-43) and the Schellman motif (44,45), which contribute to the stability of protein helices (39), have been shown to persist in peptides (46,47), where they can inhibit expected fraying (48) at helix ends.
In 1990, individual residue contributions to helix stability were assessed in four separate host/guest systems (49-52). In such experiments, guest residues are substituted systematically at central positions (to avoid end effects) within a host peptide of known helical content and the resultant effect measured. Stabilizing substitutions increase helix content; destabilizing substitutions decrease it. An experimental scale of helix propensities is then derived by quantifying these effects in terms of the free energy differences (ΔΔG) between the stability of the host (ΔG(helix→coil, host)) and each guest (ΔG(helix→coil, guest)). Although the host systems differed among these four groups, the resulting rank order of helix propensities was remarkably similar.
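As a rough illustration of how such a free energy difference might be obtained from measured helix content, the sketch below treats each peptide as a two-state helix/coil equilibrium. This is a deliberate simplification and an assumption of the sketch, not the analysis used in the cited studies, which generally rely on helix-coil theory. The function names, the sign convention (guest minus host), and the default temperature are likewise hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol K)


def dg_helix_to_coil(f_helix, temperature=273.15):
    """Two-state estimate of the helix -> coil free energy in kcal/mol.

    f_helix is the fractional helicity (0 < f_helix < 1). Under the
    two-state assumption K = [coil]/[helix] = (1 - f) / f, and
    dG = -RT ln K, so a positive value means the helix is favored.
    """
    if not 0.0 < f_helix < 1.0:
        raise ValueError("f_helix must lie strictly between 0 and 1")
    k_unfold = (1.0 - f_helix) / f_helix
    return -R_KCAL * temperature * math.log(k_unfold)


def ddg_guest_vs_host(f_host, f_guest, temperature=273.15):
    """ddG = dG(helix->coil, guest) - dG(helix->coil, host).

    With this (assumed) sign convention, a guest that lowers helix
    content gives a negative ddG.
    """
    return (dg_helix_to_coil(f_guest, temperature)
            - dg_helix_to_coil(f_host, temperature))


if __name__ == "__main__":
    # Hypothetical helicities, not experimental data.
    print(ddg_guest_vs_host(f_host=0.5, f_guest=0.25))
```

Ranking guests by this quantity yields a propensity scale in the sense described above, although the numerical values depend on the model used to convert helix content into free energy.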
What is the basis for these measured differences in helix propensity? Why does a small, nondescript residue like alanine have a higher helix propensity than valine? We hypothesized that the differences are due in large part to the loss of side chain conformational entropy upon helix formation (19). Physically, this effect reflects the difference between the side chain's conformational freedom in the relatively flexible coil state and in the more restricted helical state, with its bulky helix backbone.
Conformational entropy is not the only proposed explanation for differences in helix specificity. Another is the drive to segregate polar and apolar residues on opposite helical faces, giving rise to an amphipathic helix (61-63). While protein helices are often amphipathic, the host/guest peptide systems (49-52) are not. Therefore, amphipathic segregation is an unlikely explanation for the differences in helix propensity in these peptide systems.
The electrostatic field resulting from the helix dipole has been proposed as yet another factor that contributes to both helix stability and specificity (64,65). Åqvist et al. (66) demonstrated that this effect is short-ranged, and its influence is confined largely to individual backbone dipoles localized within the first and last turns of the helix, resulting in a formal positive charge at the helix N terminus and formal negative charge at the C terminus. Localized backbone charges can be stabilized by compensating side chain charges, a fact which helps explain the early observation that the helix N terminus tends to be enriched in acidic residues, while the C terminus is enriched in basic residues (67,68). These electrostatic interactions may involve side chain-to-backbone hydrogen bonding, thereby coupling the helix dipole to helix capping, though the two effects can be disentangled (39).
One popular idea holds that the burial of side chain apolar surface is an important source of helix stability in both proteins (55,56) and alanine-based peptides (29). Of necessity, burial would be limited to the β-carbon in the latter case. As emphasized by these authors (29,56), most of the stabilization energy must be contributed by side chain to backbone interactions. Interactions among side chains are precluded in an alanine-based peptide because the β-carbons cannot reach one another. Although interactions between proximate side chains are possible in peptides of heterogeneous composition, such interactions are too infrequent and haphazard to qualify as a general explanation of stability. Often, contacting side chains pay an entropic price that outweighs any enthalpic gain, and their net effect is to destabilize the molecule (69,70).
The hydrophobic contribution made by burial of side chain apolar surface is proportional to the difference in area between the helix and coil states. Although the surface area of a residue side chain in a polyalanyl helix is easily computed, the corresponding area in the coil state is elusive (71). Often, an extended tripeptide is used to represent the coil state, but this model is open to question (72). As an alternative, Creamer et al. (72) developed two limiting cases that bracket the expected behavior of the coil between reliable extremes. One extreme was represented by simulated hard sphere peptides and the other by fragments excised from folded proteins. Using these limits, it was shown that the area buried by apolar side chains upon helix formation is considerably less than that estimated from a tripeptide. Upon transfer from the coil to a midhelical position, an alanine side chain loses an area between 10 and 0 Å², and a valine side chain exposes an area between 0 and 17 Å².
These results underscore our intuitive expectation. In the coil, a central residue in a peptide is partially shielded by surrounding neighbors, while in the helix, that same residue would be extruded into a hyperexposed configuration (like a peacock's tail). Thus, loss of side chain apolar surface from side chain-backbone interactions is an implausible source of helix stability.
Summarizing the conclusions from these studies, the α-helix is an energetically stabilized conformation for main chain atoms. Helix specificity arises in the side chain, where loss of conformational entropy upon helix formation is a major determinant of the helix-forming tendencies of residues in both peptides and proteins. When present, helix capping contributes additional specificity.

The Lessons
The thermodynamic hypothesis of Anfinsen asserts that the folded state of a population of proteins corresponds to a global minimum of free energy (73). Given that the fold is unique, the protein will have lost all conformational entropy or nearly so. Therefore, it would appear that an effective solution to the folding problem is to minimize the internal energy.
However, this attractive strategy is called into question by studies of the α-helix described in the previous section. Recapping, side chain conformational entropy is a major determinant of helix propensity in both peptides and proteins. In peptides, the reason some sequences are helical and others are not is explained, in large part, by differences in the entropic price that their side chains must pay to leave the coil state and adopt a helical conformation. A parallel situation exists in proteins where there are only two substantially populated regions in φ,ψ space (74) that a non-prolyl/non-glycyl residue can adopt, α and β, corresponding to regions of the φ,ψ map near (−60°,−40°) and (−120°,+135°), respectively. Analogous to coil, β (i.e. an extended conformation) imposes little steric restriction on the conformation of nearby side chains. Thus, helix propensities in proteins involve entropy differences between extended and helical conformations.
A clear conclusion from these studies is that side chain conformational entropy influences whether a protein segment will be helical or extended. This observation is of central importance to understanding the folding problem. The protein interior is composed almost exclusively of residues from either α-helices or β-strands,² a consequence of the fact that these two regular secondary structures are unique in their capacity to provide buried backbone carboxamides with intramolecular hydrogen bond partners (76). In other words, conformational entropy is implicated in determining the structure of the protein core, and its effect is exerted in discriminating between helix and strand.
Physically, loss of side chain conformational entropy measures the effect of side chain atoms bumping into the rest of the polypeptide chain. Side chain steric factors affect helicity markedly, and their influence is local, realized primarily through interaction with other atoms close in sequence (77). For illustration, Fig. 1 shows the results of two identical simulations, one of a 15-residue polyalanine and the other of a similar 15-mer where the central alanine has been replaced by a valine. The presence of even the single valine reduces the fractional helix population by up to 20% at an adjacent site. Longer range steric interactions will have little influence on helicity because, in a helix, side chain β- and γ-carbons cannot reach beyond the backbone of an adjacent helical turn.
Taken together, these studies depict a simple picture of the folding process. Proteins, being macromolecules, are large enough to enclose a solvent-shielded interior within which hydrophobic groups can be sequestered. Shielded hydrophobic side chains are covalently attached to the polar backbone, which is also shielded in most cases (78). Were this backbone unable to form H-bonds within the molecular interior, then hydrogen bonding would push the conformational equilibrium far toward the unfolded state, where backbone groups could H-bond to solvent water. Consistent with this idea, almost all backbone groups within the interior of proteins of known structure are found to be H-bonded (76). There are only two structures that provide ubiquitous hydrogen bonding for interior residues, α-helix and β-sheet (1,79). While other interior structures are found occasionally (80), they do not lend themselves readily to routine hydrogen bonding. Parenthetically, with only two conformational possibilities, the Levinthal paradox (81,82) is reconciled for the protein core.
Of the two core conformations, helix is the thermodynamically preferred state for the main chain, but some side chains lose sufficient conformational entropy in a helix that they push the residue toward the only other allowed region, viz. extended (i.e. β-strand). Thus, conformational entropy plays a crucial role in selecting between helix and sheet.
Given that side chain conformational entropy is a local effect, it must arise as an early folding event, e.g. within the unobservable burst phase of kinetic experiments. Such events will predispose residues to populate either α or β regions of φ,ψ space preferentially. Consecutive residues that populate the same region become candidates for further stabilization as helix or strand. These entropically driven segments of nascent secondary structure guide subsequent folding by favoring certain pathways and suppressing others. It is important to emphasize that this organizing effect is due to the entropically driven enrichment of α microstates over β microstates, or the converse, and it is exerted before stable secondary structure can be detected experimentally.
The move set was as follows: a set of three consecutive residues was selected at random. In all three, backbone dihedral angles were assigned to either the α-region (φ = −64 ± 7°, ψ = −43 ± 7°) or the β-region (φ = −120 ± 40°, ψ = +120 ± 30°). For each residue, exact values of φ and ψ were chosen at random from the given ranges. Either the α- or β-region was chosen randomly with equal probability. In valine-containing triplets, the side chain torsion (χ) angle was rotated at random in the range −180° to +180°. A hard sphere potential was used to describe interactions between atoms. In this potential, atoms have no attractive component, only excluded volume. United atoms were employed, i.e. CH, CH₂, and CH₃ groups were treated as single atoms with inflated radii. The atomic radii used were scaled to 90% of their van der Waals values (33), bond lengths and angles were fixed at standard values, and peptide units were held rigid and planar (ω = 180°).
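The move set described above can be sketched in a few lines. This is an illustrative reconstruction under stated assumptions, not the original simulation code: angles are drawn uniformly within the quoted ranges (the text says only "at random"), the hard sphere overlap test that would accept or reject each proposal is omitted because it requires atomic coordinates, and all names (`REGIONS`, `propose_move`, `valine_positions`) are hypothetical.

```python
import random

# Ranges quoted in the text, stored as (center, half-width) in degrees.
REGIONS = {
    "alpha": {"phi": (-64.0, 7.0), "psi": (-43.0, 7.0)},
    "beta": {"phi": (-120.0, 40.0), "psi": (120.0, 30.0)},
}


def sample_angle(center, half_width, rng):
    """Draw an angle uniformly from center +/- half_width (assumed uniform)."""
    return rng.uniform(center - half_width, center + half_width)


def propose_move(n_residues, rng, valine_positions=frozenset()):
    """Propose new torsions for three consecutive residues.

    One region (alpha or beta) is chosen with equal probability and applied
    to the whole triplet; phi and psi are then sampled per residue. Residues
    listed in valine_positions also receive a random side chain chi angle.
    Returns {residue_index: {"region", "phi", "psi"[, "chi"]}}.
    """
    if n_residues < 3:
        raise ValueError("need at least three residues")
    start = rng.randrange(n_residues - 2)      # first residue of the triplet
    region = rng.choice(["alpha", "beta"])     # equal probability
    spec = REGIONS[region]
    move = {}
    for i in range(start, start + 3):
        entry = {
            "region": region,
            "phi": sample_angle(*spec["phi"], rng),
            "psi": sample_angle(*spec["psi"], rng),
        }
        if i in valine_positions:
            entry["chi"] = rng.uniform(-180.0, 180.0)
        move[i] = entry
    return move


if __name__ == "__main__":
    rng = random.Random(0)
    # 15-mer with a central valine at index 7 (0-based), as in the Fig. 1 setup.
    print(propose_move(15, rng, valine_positions={7}))
```

In a full simulation, each proposed triplet would be accepted only if the resulting structure has no hard sphere overlaps, so that the fraction of accepted α states at each position reflects excluded volume alone.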