How Membranes Shape Protein Structure*

Constitutive α-helical membrane proteins (MPs) are assembled in membranes by means of a translocation/insertion process that involves the translocon complex (1). After release into the membrane’s bilayer fabric, an MP resides stably in a thermodynamic free energy minimum (evidence reviewed in Refs. 2 and 3). This means that the prediction of MP structure from the amino acid sequence is fundamentally a problem of physical chemistry, albeit a complex one. Physical influences that shape MP structure include interactions of the polypeptide chains with water, each other, the bilayer hydrocarbon core, the bilayer interfaces, and cofactors (Fig. 1). Two recent reviews (3, 4) provide extensive discussions of the evolution, structure, and thermodynamic stability of MPs. Here we provide a distilled (and updated) overview that addresses four broad questions. What is the nature of the bilayer matrix that encloses MPs? How can the thermodynamic principles of MP stability be discovered? How does the bilayer matrix induce structure? How can the structure of MPs be predicted? We focus primarily on α-helical proteins, but the thermodynamic principles we present also apply to β-barrel MPs, which Lukas Tamm discusses elsewhere in this series.
Two influences will emerge as paramount in shaping MP structure. First, as implied in Fig. 1, the bilayer fabric of the membrane has two chemically distinct regions: hydrocarbon core (HC) and interfaces (IFs). Interfacial structure and chemistry must be important, because the specificity of protein signaling and targeting by membrane-binding domains could not otherwise exist (5). Second, the high energetic cost of dehydrating the peptide bond, as when transferring it to a non-polar phase, causes the peptide bond to dominate the formation of structure (6). The only permissible transmembrane structural motifs of MPs are α-helices and β-barrels, because internal H-bonding ameliorates this cost.

time-averaged probability distribution functions of water and lipid component groups (carbonyls, phosphates, etc.), representing projections of three-dimensional motions onto the bilayer normal (9,10). The liquid crystallographic structure of an Lα-phase dioleoylphosphatidylcholine (DOPC) bilayer is shown in Fig. 2A (11).
Three features of this structure are important. First, the widths of the probability densities reveal the great thermal disorder of fluid membranes. Second, the combined thermal thickness of the IFs (defined by the distribution of the waters of hydration) is about equal to the 30-Å thickness of the HC. The thermal thickness of a single IF (~15 Å) can easily accommodate an α-helix parallel to the membrane plane (Fig. 2B). The common cartoons of bilayers that assign a diminutive thickness to the bilayer IFs are thus misleading. Third, the thermally disordered IFs are highly heterogeneous chemically. As the regions of first contact, the IFs are especially important in the folding and insertion of non-constitutive MPs, such as toxins (12), and to the activity of surface-binding enzymes, such as phospholipase A2 (13). But they are also important in shaping MP structure (Fig. 1).
A molecule moving from water to the bilayer HC must experience a dramatic variation in environmental polarity over a short distance because of interfacial chemical heterogeneity, as illustrated by the yellow curve of Fig. 2B (14). An amphipathic helix such as melittin (15), represented schematically in Fig. 2B, locates (16) at the midpoint of the steep descent of the polarity gradient. Because the polarity changes over a distance corresponding roughly to helix diameter, peptide-bilayer interaction energies must be very sensitive to polarized helices, such as amphipathic ones.
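The thermal disorder described in this section can be put on a length scale. The legend to Fig. 2 quotes B-factors of about 150 Å² for lipid components and about 30 Å² for atoms in protein crystals. The sketch below converts those values to root-mean-square displacements along one axis, assuming the standard isotropic crystallographic relation B = 8π²⟨u²⟩. The function name is illustrative and not taken from the text.

```python
import math


def rms_displacement_from_b(b_factor):
    """Return the rms displacement (in Å) along one axis for a B-factor in Å^2.

    Assumes the standard isotropic relation B = 8 * pi^2 * <u^2>.
    """
    return math.sqrt(b_factor / (8.0 * math.pi ** 2))


lipid_rms = rms_displacement_from_b(150.0)    # typical lipid component, fluid bilayer
protein_rms = rms_displacement_from_b(30.0)   # typical atom in a protein crystal

print(f"lipid component: {lipid_rms:.2f} Å")
print(f"protein atom:    {protein_rms:.2f} Å")
print(f"ratio:           {lipid_rms / protein_rms:.2f}")
```

Because the displacement scales as the square root of B, a 5-fold larger B-factor corresponds to a displacement larger by a factor of √5, a little over 2.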

Coming to Thermodynamic Terms with Insoluble Membrane Proteins
Experimental exploration of the stability of intact MPs is problematic because of their general insolubility. One approach to stability is to "divide and conquer" by studying the membrane interactions of fragments of MPs, i.e. peptides. Because MPs are equilibrium structures, folding and stability can be examined by constructing thermodynamic pathways (3) such as those shown in Fig. 3. Although these pathways do not mirror the actual biological assembly process of MPs, they are nevertheless useful for guiding biological experiments, because they provide a thermodynamic context within which biological processes must proceed.
The four-step model (3) of Fig. 3 is a logical combination of an early three-step model of Jacobs and White (17) and the two-stage model of Popot and Engelman (18,19) in which TM helices are first "established" across the membrane and then assemble into functional structures (helix association). The model summarizes the types of experiments on MP folding now being pursued in several laboratories.
In Fig. 3, the free energy reference state is taken as the unfolded protein in an IF. However, this state cannot actually be achieved with MPs because of the solubility problems, nor can it be achieved with small non-constitutive membrane-active peptides, such as melittin, because binding usually induces secondary structure (partitioning-folding coupling). Thus, as is often the case in solution thermodynamics, the reference state must be a virtual one. It can be defined for phosphocholine IFs by means of an experimental interfacial free energy (hydrophobicity) scale (20) derived from the partitioning into POPC bilayers of tri- and pentapeptides (17,20) that have no secondary structure in the aqueous or interfacial phases. This scale, which includes the peptide bonds as well as the side chains, allows calculation of the virtual free energy of transfer of an unfolded chain into an IF. For peptides that cannot form regular secondary structure, such as the antimicrobial peptide indolicidin (21), the scale predicts observed free energies of transfer with remarkable accuracy (22). This validates the scale for the computation of virtual free energies for partitioning into phosphocholine IFs. Similar scales are needed for other lipids and lipid mixtures.

How Membranes Induce Structure: The Importance of the Peptide Bond
The high cost of interfacial partitioning of the peptide bond (20), 1.2 kcal mol⁻¹, explains the origin of partitioning-folding coupling and also why the interface is a potent catalyst of secondary structure formation. Wimley et al. (23) showed for interfacial β-sheet formation that H-bond formation reduces the cost of peptide partitioning by about 0.5 kcal mol⁻¹ per peptide bond. The folding of melittin into an amphipathic α-helix on POPC membranes involves a per residue reduction of about 0.4 kcal mol⁻¹ (24). The folding of the antimicrobial peptide magainin on charged bilayers seems to entail a smaller per residue value, about 0.1 kcal mol⁻¹ (25). The cumulative effect of these relatively small per residue free energy reductions can be very large when tens or hundreds of residues are involved, as in the assembly of the β-barrel transmembrane domain (26) of α-hemolysin that buries ~100 residues in the membrane.
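To see how small per residue reductions add up, the sketch below multiplies a per residue value by a residue count and converts a total free energy into a two-state population ratio with the Boltzmann factor exp(ΔG/RT). The function names and the temperature of 25 °C are assumptions made for illustration. The per residue value of 0.5 kcal mol⁻¹ and the count of 100 residues are the figures quoted above.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def cumulative_gain(per_residue_gain, n_residues):
    """Total free energy reduction (kcal/mol) for n residues that each gain the same amount."""
    return per_residue_gain * n_residues


def folded_to_unfolded_ratio(total_gain, temperature=298.15):
    """Two-state population ratio for a folded state favored by total_gain (kcal/mol).

    The temperature (25 degrees C by default) is an assumption, not a value from the text.
    """
    return math.exp(total_gain / (R_KCAL * temperature))


# About 0.5 kcal/mol per peptide bond over roughly 100 buried residues (values from the text).
sheet_gain = cumulative_gain(0.5, 100)
print(f"cumulative reduction: {sheet_gain:.0f} kcal/mol")

# A folded state favored by 5 kcal/mol is populated a few thousand times more than the unfolded one.
ratio_5kcal = folded_to_unfolded_ratio(5.0)
print(f"folded:unfolded at 5 kcal/mol = {ratio_5kcal:.0f}")
```

The exponential dependence is the point: a total of only about 5 kcal mol⁻¹ already gives a ratio in the thousands, comparable to the value of ~4700 quoted for melittin in the legend to Fig. 3. The exact number depends on the temperature assumed.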
Determination of the energetics of TM α-helix insertion, which is critically important for predicting structure, is difficult because non-polar helices tend to aggregate in both the aqueous and interfacial phases (27). Several efforts have been made, with mixed success (27-31). Although precise values for the free energy of helix insertion remain to be established, the broad energetic issues are clear (32). Computational studies (33,34) suggest that the transfer free energy ΔG_CONH of a non-H-bonded peptide bond from water to alkane is +6.4 kcal mol⁻¹, compared with only +2.1 kcal mol⁻¹ for the transfer free energy ΔG_Hbond of an H-bonded peptide bond. The per residue free energy cost of disrupting H-bonds in a membrane is therefore about 4 kcal mol⁻¹. A 20-amino acid TM helix would cost about 80 kcal mol⁻¹ to unfold within a membrane, which explains why unfolded polypeptide chains cannot exist in a transmembrane configuration. Fig. 4 illustrates the importance of ΔG_Hbond in setting the threshold for transmembrane stability as well as the so-called decision level in hydropathy plots (35). Using the single membrane-spanning helix of glycophorin A (36) as an example, panel A shows that the free energy of transfer of the side chains dramatically favors helix insertion, whereas the transfer cost of the helical backbone dramatically disfavors insertion. Panel B shows that an uncertainty of 0.5 kcal mol⁻¹ in the per residue cost of backbone insertion has a major effect on the interpretation of hydropathy plots and on the establishment of the minimum value of side chain hydrophobicity required for transmembrane helix stability. What is the most likely estimate of ΔG_Hbond? The practical number, in the context of Fig. 4A, is the cost ΔG_glycyl^helix of transferring a single glycyl unit of a polyglycine α-helix into the bilayer HC. Electrostatic calculations (34) ... (3).
This is borne out by work in progress² using the recently developed MPtopo database of MPs of known topology (38), accessible via the World Wide Web (blanco.biomol.uci.edu/mptopo).
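The arithmetic behind the estimate of about 80 kcal mol⁻¹ for unfolding a 20-residue TM helix can be written out explicitly. The sketch below uses the two computed transfer free energies quoted above (+6.4 and +2.1 kcal mol⁻¹). The function and constant names are illustrative. The unrounded difference is 4.3 kcal mol⁻¹ per residue, which the text rounds to about 4.

```python
DG_CONH = 6.4   # kcal/mol, non-H-bonded peptide bond, water to alkane (value quoted in the text)
DG_HBOND = 2.1  # kcal/mol, H-bonded peptide bond, water to alkane (value quoted in the text)


def unfolding_cost_in_membrane(n_residues, dg_free=DG_CONH, dg_bonded=DG_HBOND):
    """Return (per-residue cost, total cost) in kcal/mol of breaking backbone H-bonds in the membrane."""
    per_residue = dg_free - dg_bonded
    return per_residue, per_residue * n_residues


per_residue_cost, total_cost = unfolding_cost_in_membrane(20)
print(f"per residue: {per_residue_cost:.1f} kcal/mol")
print(f"20-residue helix: {total_cost:.0f} kcal/mol")
```

Whether the per residue figure is taken as 4 or 4.3 kcal mol⁻¹, the total for a membrane-spanning helix is many tens of kcal mol⁻¹, which is the basis for the statement that unfolded chains cannot span the membrane.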
The hydrophobic effect is generally considered to be the major driving force for compacting soluble proteins (39). However, it cannot be the force driving compaction (association) of TM ␣-helices. Because the hydrophobic effect arises solely from dehydration of a non-polar surface (40), it is expended after helices are established across the membrane. Helix association is most likely driven primarily by van der Waals forces, more specifically the London dispersion force (reviewed in Refs. 3 and 4), but why would van der Waals forces be stronger between helices than between helices and lipids?
Extensive work (41-45) on dimer formation of glycophorin A in detergents reveals the answer: knob-into-hole packing that allows more efficient packing between helices than between helices and lipids. Tight, knob-into-hole packing has been found to be a general characteristic of helical bundle MPs as well (46,47). For glycophorin A dimerization, knob-into-hole packing is facilitated by the GXXXG motif, in which the glycines permit close approach of the helices. The substitution of larger residues for glycine prevents the close approach and hence dimerization (41,44,45). The so-called TOX-CAT method (48) has made it possible to sample the amino acid motifs preferred in helix-helix association in membranes by using randomized sequence libraries (49). The GXXXG motif is among a significant number of motifs that permit close packing. A statistical survey of MP sequences disclosed that these motifs are very common in membrane proteins (50).

... (69), which also include solvent properties peculiar to bilayers that arise from motional anisotropy and chemical heterogeneity.

FIG. 2. ... (3,14,70). A, structure of a DOPC bilayer (5.4 waters/lipid) determined by joint refinement of x-ray and neutron diffraction data (11); B, polarity profile (yellow curve) of the DOPC bilayer computed from the absolute values of atomic partial charges (14). The end-on view in panel B of an α-helix with a diameter of ~10 Å (typical for MP helices (46)) shows the approximate location of the helical axes of the amphipathic helix peptides Ac-18A-NH2 (71) and melittin (16), as determined by a novel, absolute scale x-ray diffraction method (reviewed in Ref. 72). The "structure" of the bilayer shown in panel A comprises a collection of transbilayer Gaussian probability distribution functions representing the lipid components that account for the entire contents of the bilayer unit cell. The areas under the curves correspond to the number of constituent groups per lipid represented by the distributions (1 phosphate, 2 carbonyls, 2 methyls, etc.). The widths of the Gaussians measure the thermal motions of the lipid components and are simply related to crystallographic B-factors (9,16,71). The thermal motion of the bilayer is extreme: lipid component B-factors are typically ~150 Å², compared with ~30 Å² for atoms in protein crystals. In addition to this thermal motion, two other features of the bilayer are important for shaping membrane protein structure. First, the IFs have a combined thickness equal to that of the hydrocarbon core (~30 Å). A 15-Å-thick IF can comfortably accommodate an MP helix lying parallel to the membrane plane. Second, the IFs are chemically heterogeneous. Panel A shows that they are composed of water, choline, phosphate, glycerol, carbonyls, and even some methylenes that spill into the IFs because of thermal motion. Panel B reveals steep gradients of polarity in the IFs that change over a distance approximately equal to the diameter of an α-helix.

Minireview: How Membranes Shape Protein Structure
Dimerization studies of glycophorin in detergent micelles (44) do not permit the absolute free energy of association to be determined because of the large free energy changes associated with micelle stability. However, estimates (3) suggest 1-5 kcal mol⁻¹ as the free energy cost of separating a helix from a helix bundle within the bilayer environment. The cost of breaking H-bonds within the bilayer HC (above) implies that H-bonding between α-helices could provide a strong stabilizing force for helix association. This is borne out by recent studies of synthetic TM peptides designed to hydrogen bond to one another (51,52). Interhelical H-bonds, however, are not common in MPs (reviewed in Ref. 3). Indeed, lacking the specificity of knob-into-hole packing, they could be hazardous because of their tendency to cause promiscuous aggregation (4). However, they are probably important in the association of transmembrane signaling proteins (53).
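The GXXXG motif described above for glycophorin A is simple enough to locate with a pattern search. The following is a minimal sketch, not the method used in the cited surveys: it reports every occurrence of two glycines separated by any three residues, including overlapping ones. The function name is illustrative, and the sequence string is an assumption: it is the glycophorin A transmembrane segment as commonly quoted in the literature and is not given in the text.

```python
import re

# Lookahead so that overlapping occurrences (e.g. GXXXGXXXG) are all reported.
GXXXG = re.compile(r"(?=(G[A-Z]{3}G))")


def find_gxxxg(sequence):
    """Return (0-based start index, five-residue segment) for every GXXXG occurrence."""
    return [(m.start(), m.group(1)) for m in GXXXG.finditer(sequence.upper())]


# Assumed input: glycophorin A TM segment as commonly quoted; treat as illustrative.
GPA_TM = "ITLIIFGVMAGVIGTILLISYGI"

for start, segment in find_gxxxg(GPA_TM):
    print(f"GXXXG at position {start}: {segment}")
```

A motif hit by itself says nothing about whether two helices actually associate. It only flags candidate sites where small residues could permit the close approach discussed above.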

Predicting the Structure of Helical Membrane Proteins
As for soluble proteins, the ultimate solution to the problem of predicting three-dimensional structure of MPs from sequence will come from a deep quantitative understanding of the energetics of protein folding. The experimental approaches described above lead in that direction. At a simple level, the prediction of MP topology is fairly easy and reliable because of the high hydrophobicity of TM helices. Such sequence segments are generally apparent in hydropathy analysis (Fig. 4B), which is now a standard prediction tool (reviewed in Ref. 35). However, the reliability of the resulting topologies depends strongly upon the hydrophobicity scale used, and there are many (mostly side chain-only scales). An analysis² using the MPtopo database (38) reveals that side chain-only scales significantly overpredict TM segments because they neglect ΔG_bb, for reasons illustrated by Fig. 4B. The experiment-based whole-residue hydrophobicity scale of White and Wimley (3), which takes ΔG_bb into account, greatly reduces overprediction.² Membrane Protein Explorer (MPEx) is a Web-based hydropathy analysis tool using this scale (blanco.biomol.uci.edu/mpex). The incorporation into prediction algorithms of additional knowledge of MP structure and stability, such as the so-called positive-inside rule (54,55) or secondary structure propensity (56), can improve the reliability of topology prediction algorithms. Statistical algorithms that rely in part on alignment of MP sequences with significant homology to a sequence of interest can also improve accuracy (57-61).

FIG. 3. ... (3) for describing the partitioning, folding, insertion, and association of α-helical polypeptides. The aqueous insolubility of membrane proteins, folded or unfolded, precludes direct determinations of interaction free energies. The only possibility for understanding the energetics of MP stability is through studies of small, water-soluble peptides (20,23,24,27). This approach, summarized in the figure, uses the unfolded peptide in the IF as the thermodynamic reference state. The free energy of unfolded partitioning in phosphocholine IFs can now be estimated using the whole-residue interfacial hydrophobicity scale of Wimley and White (20). Unfolded peptides are driven toward the folded state in the IF because hydrogen bond formation dramatically lowers the cost of peptide bond partitioning, which is the dominant determinant of whole-residue partitioning. The free energy reduction accompanying secondary structure formation is typically ~0.4 kcal mol⁻¹ per residue (23,24) but may be as low as 0.1 kcal mol⁻¹ (74). Although small, such changes in aggregate can be large. For example, the folding of 12 residues of 26-residue melittin into an α-helical conformation causes the folded state to be favored over the unfolded state by ~5 kcal mol⁻¹. To put this number in perspective, the ratio of folded to unfolded peptide is ~4700. The cost of partitioning the peptide bond also dominates transmembrane helix insertion (Fig. 4). The association of TM helices is probably driven by van der Waals interactions, giving rise to knob-into-hole packing (43-45,75). The GXXXG motif is especially important in helix-helix interactions in membranes (49,50).

FIG. 4. The energetics of transmembrane helix insertion and the consequences for hydropathy plot analysis. Panel A is based upon Wimley and White (27). A, estimated relative free energy contributions of the side chains (ΔG_sc) and backbone (ΔG_bb) to the helix insertion energetics of glycophorin A (36); B, hydropathy plots of the L subunit of the photosynthetic reaction center of Rhodobacter sphaeroides showing the importance of knowing the correct value of ΔG_bb (the green horizontal lines identify the known transmembrane helices (76)). In panel A, the net side chain contribution (relative to glycine) was computed using the n-octanol hydrophobicity scale of Wimley et al. (37). The per residue cost (ΔΔG_bb) of partitioning a polyglycine α-helix can be estimated from the theoretical work of Honig and colleagues (34,77) to be +1.25 kcal mol⁻¹. The cost of partitioning an unfolded glycyl unit into n-octanol is +1.15 kcal mol⁻¹, suggesting that the whole-residue n-octanol scale (37) provides a reasonable estimate of the free energy of inserting α-helical amino acid residues into bilayers. An exact value of ΔΔG_bb is essential for placing hydropathy plots on an absolute thermodynamic scale, which is necessary for distinguishing TM from non-TM peaks in hydropathy plots. This is shown in panel B. The blue, black, and red curves are plots made using ΔΔG_bb = 0.75, 1.25, and 1.75 kcal mol⁻¹ per residue, respectively. If ΔΔG_bb is too small, TM helices will be overpredicted; if too large, they will be underpredicted.
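The sensitivity of hydropathy analysis to the per residue backbone cost, described in the legend to Fig. 4, can be illustrated with a sliding-window sum. The sketch below is a simplified illustration under stated assumptions and is not the MPEx algorithm: the 19-residue window, the sign convention (negative free energy favors the membrane), the function names, and above all the side chain values are assumptions. The profile is made up and corresponds to no real protein or published scale. Only the three backbone costs (0.75, 1.25, and 1.75 kcal mol⁻¹ per residue) come from the text.

```python
def window_free_energies(side_chain_dg, ddg_bb, window=19):
    """Return the summed whole-residue transfer free energy of every sliding window.

    side_chain_dg: per-residue side-chain free energies in kcal/mol (negative favors the membrane).
    ddg_bb: per-residue backbone cost in kcal/mol (positive), added to every residue.
    """
    whole_residue = [sc + ddg_bb for sc in side_chain_dg]
    return [sum(whole_residue[i:i + window])
            for i in range(len(whole_residue) - window + 1)]


def count_tm_windows(side_chain_dg, ddg_bb, window=19):
    """Count windows whose net insertion free energy is favorable (below zero)."""
    return sum(1 for g in window_free_energies(side_chain_dg, ddg_bb, window) if g < 0)


# Made-up profile, NOT a real protein or a published scale:
# polar loop, strongly hydrophobic stretch, polar loop, weakly hydrophobic stretch, polar loop.
toy_profile = [0.5] * 10 + [-1.5] * 20 + [0.5] * 10 + [-1.0] * 20 + [0.5] * 10

counts = {ddg_bb: count_tm_windows(toy_profile, ddg_bb) for ddg_bb in (0.75, 1.25, 1.75)}
for ddg_bb, n_windows in counts.items():
    print(f"backbone cost {ddg_bb:.2f} kcal/mol per residue: {n_windows} favorable windows")
```

With these toy numbers, the smallest backbone cost flags both hydrophobic stretches, the middle value flags only the strongly hydrophobic one, and the largest flags neither. This mirrors the overprediction and underprediction described in the legend.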
Perspectives
Considerable progress has been made during the past 15 years in understanding the physical principles underlying MP structure and stability. Of great importance is the growing number of MPs whose structures have been determined to high resolution (an up-to-date list is maintained at blanco.biomol.uci.edu/Membrane_Proteins_xtal.html). About 40 structures have now been published, and all are either helical bundles or β-barrels. An important question is whether new motifs will emerge. Whatever they may be, they would have to include H-bonded peptide bonds in the transmembrane segments. One possibility is the β-helix motif (62). A significant feature of many big MPs, such as sarcoplasmic reticulum calcium ATPase (63), is large extracellular domains. This means that the prediction of MP structure will depend as well upon success in predicting the structure of soluble proteins. Another feature not included in any prediction algorithm is the arrangement of subunits, which are common in large MPs.
More information about the assembly of MPs by the translocon apparatus may result in new insights into structure determination. New insights are also likely to result from our growing understanding of the role of lipids in MP folding (reviewed in Refs. 64 and 65). Finally, a more detailed understanding of specific molecular interactions, particularly in mixed-lipid bilayers, will clarify how membrane interfaces shape protein structure. Of particular importance are the interactions of aromatic residues (66) and charged residues (67), and how hydrophobic and electrostatic interactions combine to stabilize proteins at interfaces (22).