An Alanine-Zipper Structure Determined by Long Range Intermolecular Interactions*

A major challenge in protein folding is to identify and quantify specific structural determinants that allow native proteins to acquire their unique folded structures. Here we report the engineering of a 52-residue protein (Ala-14) that contains exclusively alanine residues at the hydrophobic a and d positions of a natural heptad-repeat sequence. Ala-14 is unfolded under normal solution conditions yet forms a parallel three-stranded α-helical coiled coil in crystals. Ala-14 trimers in the solid state associate with each other through the pairing of polar side chains and formation of an extended network of water-mediated hydrogen bonds. In contrast to the classical view that local intramolecular tertiary interactions dictate the three-dimensional structure of small single-domain proteins, Ala-14 shows that long range intermolecular interactions can be essential in determining the metastable alanine-zipper structure. A similar interplay between short range local and longer range global forces may underlie the conformational properties of the growing class of natively unstructured proteins in biological processes.

The classical view of protein folding posits that local secondary and tertiary interactions promote the association of cooperative structural elements in a process of sequential stabilization to form a limited set of transient intermediates and finally the native state (1-12). The current theory emphasizes the role of the free energy landscape of a polypeptide chain in driving rapid and efficient folding into a unique well defined structure (13-17). In any description of protein folding, a major challenge is to identify specific structural determinants that guide the folding process. Although considerable progress has been achieved toward understanding how compact single-domain proteins fold (1-12), the factors that specify and stabilize larger and more extended protein complexes remain largely undefined despite their critical importance in the complex environment inside a living cell. We report here a new alanine-zipper structure that undergoes a native folding transition only during crystallization. A network of long range solvent-mediated interactions is observed in the 1.3-Å x-ray crystal structure of this alanine-zipper trimer that appears to directly control its folding into a helical trimer structure.

EXPERIMENTAL PROCEDURES
Protein Production-Plasmid pAla14 encoding Ala-14 was derived from pLpp56 (Ref. 18). The sequence Cys-Gly-Gly was added to the N terminus of Ala-14 to produce Ala-14N. Similarly, Gly-Gly-Cys was added to the C terminus of Ala-14 to produce Ala-14C. The sequences Cys-Gly-Gly and Gly-Gly-Cys were added at the N and C termini, respectively, of Ala-14 to generate Ala-14NC. The proteins were expressed in Escherichia coli strain BL21(DE3)/pLysS (Novagen). Cells were grown at 20°C in LB media to an optical density of 0.8 at 600 nm and induced with isopropylthio-β-D-galactoside for 10 h. Cells were lysed with glacial acetic acid on ice. The bacterial lysate was centrifuged (35,000 × g for 30 min) to separate the soluble fraction from inclusion bodies. The soluble fraction containing denatured Ala-14 was dialyzed into 5% acetic acid overnight at 4°C. Proteins from the soluble fraction were purified to homogeneity by reverse-phase high performance liquid chromatography (Waters, Inc.) on a Vydac C-18 preparative column (Hesperia, CA) using a water-acetonitrile gradient in the presence of 0.1% trifluoroacetic acid and lyophilized. The molecular weight of each peptide was confirmed by matrix-assisted laser desorption ionization time-of-flight mass spectrometry (PerSeptive Biosystems, Framingham, MA).
To produce the disulfide-linked Ala-14SOS protein (see Fig. 2C), the reduced Ala-14N, Ala-14C, and Ala-14NC peptides were mixed and incubated in redox buffer (1 mM oxidized glutathione, 1 mM reduced glutathione, 50 mM NaCl, 0.1 M Tris-HCl, pH 8.7, 1 mM EDTA) for 3 days at 4°C at a total protein concentration of 40 mg/ml. This treatment was followed by high pressure liquid chromatography purification as described above. The designed Ala-14SOS protein, with disulfides between Ala-14N and the N terminus of Ala-14NC and between Ala-14C and the C terminus of Ala-14NC, was analyzed by proteolysis with trypsin and mass spectrometry. In the oxidized sample, the disulfide-bonded homodimers of the CGGSSNAK and ADNAAGGC fragments were observed. In the reduced sample, the masses of these homodimers disappear and new masses appear, corresponding to the individual fragments.
* This work was supported by Grant AI 42382 from the National Institutes of Health and by Grant 50151N from the American Heart Association. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Crystallization, Structure Determination, and Refinement-A 25 mg/ml stock of Ala-14 was prepared in water. The best diffracting crystals grew at 4°C from 1 μl of the stock added to 1 μl of the reservoir buffer (0.1 M sodium acetate, pH 4.0, 1.3 M sodium citrate) and allowed to equilibrate against the reservoir buffer. The crystals belong to the space group P1 (a = 21.84 Å, b = 26.94 Å, c = 45.80 Å; α = 88.05°, β = 84.95°, γ = 87.56°) and contain three monomers in the asymmetric unit. Prior to data collection, crystals were transferred into cryosolution containing the reservoir buffer and 15% glycerol, harvested, and frozen in liquid nitrogen. Diffraction data were collected at the X25 beamline at the Brookhaven National Laboratory using a Brandeis B4 CCD detector. Reflection intensities were integrated and scaled with the programs Denzo and Scalepack (19). Initial phases were determined by molecular replacement using the Ala-7 trimer (20) as a search model in the program AMoRe (21). Electron density map interpretation and model building were done with the program O (22). The structure of Ala-14 was refined at 1.5-Å resolution using the program CNS 1.0 (Ref. 23). The isotropic CNS model was further refined at 1.3-Å resolution using restrained anisotropic displacement parameters for all atoms with the program SHELX97 (Ref. 24). The final model was verified using simulated annealing omit maps. All main-chain torsional angles fall within the helical region of a Ramachandran plot. The conformations of the majority of the residues are well defined with the exception of the two most N-terminal residues of chains A and C and the side chains of Lys-5B, Lys-19C, and Gln-46B.
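The cell constants quoted above fix the unit-cell volume, which in turn relates to the crystal volume per unit of protein mass reported in the Results (1.70 Å³/dalton). The following is a minimal sketch of that arithmetic, assuming the standard triclinic volume formula and the fact that a P1 cell holds exactly one asymmetric unit. The function names are illustrative and the protein mass is back-calculated from the reported value rather than taken from the text.

```python
import math


def triclinic_cell_volume(a, b, c, alpha_deg, beta_deg, gamma_deg):
    """Volume (in cubic angstroms) of a general triclinic unit cell."""
    ca, cb, cg = (math.cos(math.radians(x))
                  for x in (alpha_deg, beta_deg, gamma_deg))
    return a * b * c * math.sqrt(1.0 - ca**2 - cb**2 - cg**2 + 2.0 * ca * cb * cg)


def matthews_coefficient(cell_volume, protein_mass_da):
    """Crystal volume per dalton of protein in the cell (A^3/Da)."""
    return cell_volume / protein_mass_da


# Cell constants of the Ala-14 crystals (space group P1, one trimer per cell).
volume = triclinic_cell_volume(21.84, 26.94, 45.80, 88.05, 84.95, 87.56)

# Assumption: the protein mass per cell is inferred from the reported
# 1.70 A^3/Da; it is not stated explicitly in the text.
implied_mass = volume / 1.70

print(f"cell volume = {volume:.0f} A^3")
print(f"implied protein mass per cell = {implied_mass:.0f} Da")
```

Dividing the implied mass by three gives a per-chain mass in the range expected for a 52-residue, alanine-rich peptide, which is a quick consistency check on the reported packing density.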
Structure Analysis-Superhelical parameters for the Ala-14 and Lpp-56 trimer structures were obtained by fitting the Cα backbones to a supercoil parameterization suggested by Crick (25). Root mean square deviations were calculated using the program CNS 1.0 (23). Surface areas were calculated with a 1.4-Å radius probe with CNS 1.0 (23). Buried surface areas were calculated from the difference of the accessible side-chain surface areas of the trimer structure and of the individual helical monomers. Atomic solvation energy was calculated using the method of Eisenberg and McLachlan (26). The net solvation energies were derived by subtracting the energies of the component α helices from the energy of the trimer structure. The three most N-terminal residues and the most C-terminal residue of Ala-14 were omitted from the calculation to minimize end effects.
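The two difference calculations described above can be written out explicitly. This is only a sketch of the bookkeeping: the per-chain areas and energies would come from a program such as CNS, and the numbers used below are hypothetical placeholders, not values from this study.

```python
def buried_surface_area(monomer_areas, trimer_area):
    """Area buried on assembly: isolated helices minus the assembled trimer."""
    return sum(monomer_areas) - trimer_area


def net_solvation_energy(trimer_energy, helix_energies):
    """Net solvation energy: trimer energy minus that of its component helices."""
    return trimer_energy - sum(helix_energies)


# Hypothetical inputs for illustration only (A^2 and kcal/mol).
example_monomer_areas = [3000.0, 3010.0, 2990.0]
example_trimer_area = 7500.0
example_helix_energies = [-10.0, -11.0, -9.0]
example_trimer_energy = -45.0

buried = buried_surface_area(example_monomer_areas, example_trimer_area)
net_energy = net_solvation_energy(example_trimer_energy, example_helix_energies)
print(buried, net_energy)
```

With this sign convention a positive buried area and a negative net solvation energy both indicate that assembly is favorable.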
Biophysical Experiments-Circular dichroism spectra were acquired on an Aviv 62DS spectropolarimeter as described previously (27). The measurements of [θ]222 were carried out on 3 mg/ml solutions of protein at 0°C in 50 mM sodium phosphate, pH 7.0, and 150 mM NaCl. A [θ]222 value of −36,000 degrees cm² dmol⁻¹ was taken to correspond to 100% helix (28). Thermal melts were performed on 3 mg/ml protein solutions in the same buffer by measuring ellipticity at 222 nm in 1°C steps with a 2-min equilibration time and a 30-s integration time. All melts were reversible. Superimposable folding and unfolding curves were observed, and >95% of the signal was regained upon cooling. Analytical ultracentrifugation measurements were performed on a Beckman XL-A analytical ultracentrifuge as described previously (29). Protein solutions were loaded at initial concentrations of 0.5, 1.5, and 5 mg/ml and analyzed at rotor speeds of 35,000 and 38,000 rpm for Ala-14 and 22,000 and 25,000 rpm for Ala-14SOS at 4°C.
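The conversion from mean residue ellipticity to fractional helicity implied by this reference value is a simple ratio. The sketch below assumes a linear scale with a random-coil value of zero, which is the simplest reading of the calibration stated above; the function name is illustrative.

```python
# Mean residue ellipticity at 222 nm taken to correspond to 100% helix
# (degrees cm^2 dmol^-1), as stated in the text.
FULL_HELIX_THETA_222 = -36000.0


def helix_fraction(theta_222):
    """Fractional helicity from [theta]222, assuming zero signal for coil."""
    return theta_222 / FULL_HELIX_THETA_222


# A signal of -7,200 degrees cm^2 dmol^-1 corresponds to 20% helix.
print(helix_fraction(-7200.0))
```

On this scale, the roughly 20% helix content reported for Ala-14 in solution corresponds to a [θ]222 of about −7,200 degrees cm² dmol⁻¹.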

RESULTS AND DISCUSSION
Alanine Zipper-The outer membrane lipoprotein is the most abundant protein of E. coli and contains a trimeric α-helical coiled-coil domain (Lpp-56) that is embedded between the outer membrane and the periplasmic peptidoglycan (18). Classical coiled-coil proteins share a characteristic seven-amino acid repeat containing bulky hydrophobic side chains at the first (a) and fourth (d) positions (30-32). The crystal structure of Lpp-56 reveals that this coiled-coil trimer has an unusual feature: three alanines occupying successive a and d positions form a local structure we refer to as an alanine zipper (18). In previous studies, we showed that four additional alanine layers can be stably incorporated into the triple-helical structure despite the fact that each alanine layer is significantly destabilizing relative to large hydrophobic groups (20). Because alanine has relatively low hydrophobicity, the effect generally considered to be an important determinant of protein stability (33-35), we surmised that van der Waals packing interactions among the -CH3 groups are insufficient to specify and sustain an extended alanine-zipper conformation under physiological conditions. To explore the limits of three-dimensional structure formation in the absence of strong hydrophobic stabilization, we engineered an alanine-zipper protein called Ala-14 by the complete substitution of each bulky hydrophobic amino acid at the a and d positions of the Lpp-56 heptad-repeat sequence with alanine (Fig. 1A).
Crystal Structure of the Ala-14 Trimer-To evaluate the high resolution features of the interfacial alanine side-chain packing, we determined the x-ray crystal structure of the Ala-14 peptide at 1.3-Å resolution (Fig. 1B). The Ala-14 structure was refined to a conventional R-factor of 15.0% with a free R-factor of 20.9% and root mean square deviations from ideal bond lengths and bond angles of 0.010 Å and 2.0°, respectively (Table I). Ala-14 forms a parallel triple-stranded α-helical structure with novel and unique features (Fig. 1, C and D). The superhelix creates a cylinder that is ~18 Å wide and ~74 Å long. The Ala-14 trimer shows a classical acute knobs-into-holes packing characteristic of parallel triple-stranded coiled coils (25, 36). At both the a and d layers, the Cα-Cβ bond of each alanine knob makes an acute angle (~60°) with the Cα-Cα vector at the base of the recipient hole among four residues on a neighboring helix. This similar geometric packing at positions a and d is a distinguishing structural feature of the trimer conformation (36). In 13 of 14 a and d layers, the dihedral angles ψ and φ of the alanine backbones are approximately −40° and −70°, corresponding to the α-helix region of Ramachandran space (37). The alanine residues in the C-terminal-most d layer assume dihedral angles near 160° and −70° and adopt an extended conformation. These end effects cause the superhelix to be locally underwound, and five coordinated water molecules are observed to be anchored to the Ala-48B backbone carbonyl in the core structure.
An interplay between alanine side-chain packing and backbone conformation can be discerned in the trimer structure as dramatic changes in the twist of individual helices and the helix-helix spacing. The superhelical parameters (supercoil radius R0, pitch, and a-position Cα phase angle) for the Ala-14 and Lpp-56 trimers are as follows: R0 = 5.1 Å, pitch = 99 residues/turn, and phase angle = 13.1° for Ala-14; R0 = 6.1 Å, pitch = 115 residues/turn, and phase angle = 16.8° for Lpp-56. The alanine backbone deviation from classical coiled coils can be described as a localized unwinding of the superhelix (decrease in pitch), decrease in the phase angle, and decrease in the supercoil radius R0. As a result, the backbones of the three helices in the Ala-14 trimer are more highly curved and wrap more tightly around the superhelical axis, producing a very close spacing between adjacent helices (8.8 Å). This type of helix contact may indeed correspond to an optimal geometry for a novel class of tight helix-helix association in proteins (38).
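The link between the supercoil radius and the helix-helix spacing can be checked with elementary geometry. The sketch below assumes that, in a cross-section perpendicular to the superhelical axis, the three helix axes sit at the vertices of an equilateral triangle inscribed in a circle of radius R0, and it ignores the tilt of the helices relative to that axis. The function name is illustrative.

```python
import math


def adjacent_axis_spacing(supercoil_radius, n_helices=3):
    """Distance between neighboring helix axes placed symmetrically on a
    circle of the given radius (a chord subtending 2*pi/n radians)."""
    return 2.0 * supercoil_radius * math.sin(math.pi / n_helices)


ala14_spacing = adjacent_axis_spacing(5.1)   # R0 of the Ala-14 trimer
lpp56_spacing = adjacent_axis_spacing(6.1)   # R0 of the Lpp-56 trimer

print(f"Ala-14: {ala14_spacing:.2f} A, Lpp-56: {lpp56_spacing:.2f} A")
```

Under these assumptions the Ala-14 radius of 5.1 Å gives a spacing of about 8.8 Å, in line with the value quoted above, whereas the Lpp-56 radius of 6.1 Å gives about 10.6 Å.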
Solution Properties of Ala-14 - Surprisingly, Ala-14 has only ~20% helix content at 3 mg/ml protein concentration in neutral pH phosphate-buffered saline at 0°C as determined by CD (Fig. 2A). Under these conditions, Ala-14 lacks a "folded" baseline at low temperatures and displays a broad noncooperative thermal unfolding transition (Fig. 2B). Sedimentation equilibrium measurements indicate that Ala-14 exists as a monomer, even at a concentration of 5 mg/ml. Strikingly, there is no apparent increase in the α-helical content and/or the thermal stability of Ala-14 in the presence of 0.1 M sodium acetate, pH 4.0, 1.3 M sodium citrate (crystallization conditions) relative to phosphate-buffered saline, pH 7.0. Taken together, these data indicate that the Ala-14 peptide lacks intermolecular association and forms a mix of intramolecular helix and coil under native folding conditions in solution.
To assess the role of the monomer-trimer equilibrium of Ala-14 on its folding in aqueous solution, we generated a covalently linked Ala-14 trimer whose stability is concentration-independent. In this single-chain version of the protein, denoted Ala-14SOS, the three Ala-14 strands are connected by short disulfide-bonded peptide linkers (Fig. 2C). The helical CD signal observed in Ala-14SOS is comparable to, and the thermal stability identical to, that found in the isolated Ala-14 peptide (Fig. 2). In contrast to what is commonly found in other coiled-coil models, the stability of Ala-14 is insensitive to its molecularity. Thus, the Ala-14 peptide is largely unfolded in solution, notwithstanding that the high resolution crystal structure of the Ala-14 trimer shows specific tertiary packing among helices. This conclusion is in accord with the predicted loss of ~15.6 kcal/mol in net hydrophobic stabilization energy of the triple-helical structure associated with alanine substitutions (26) and the notion that the -CH3 side chain is too short to form stable nonpolar clusters (39).
Intertrimer Packing Interactions-What interactions might then explain the precise native-like assembly of Ala-14 helical trimers in the solid state? One clue comes from the observation that crystals of the Ala-14 peptide have exceptionally low values of solvent content (26.5%) and crystal volume/unit of protein molecular mass (1.70 Å³/dalton). The crystal structure displays a distinctive packing arrangement in which layers of parallel triple helices are situated closely along the helical axis with respect to each other (Fig. 3A). In the direction of the helical axis, there is little space between the C-terminal end of one triple-helical molecule and the N-terminal end of the next (Fig. 3B). This compact arrangement makes it possible to create sufficient intertrimer contacts to stabilize the crystal lattice. In effect, the lattice provides helix-terminating interactions that anchor the ends of successive triple helices in the crystal. Residues Ser-1B, Ser-2B, Asn-3A, Asn-3B, Asn-3C, Ala-4A, Lys-5A, Asp-7B, Gln-14B, Lys-19B, Asn-28A, Ser-32A, Asp-33A, Lys-38B, Asp-39B, Arg-47A, Asp-49A, Asp-49B, Asn-50A, Asn-50C, Ala-51A, and Ala-51C are positioned such that favorable hydrogen-bonded interactions involving the polar side chain and main chain atoms and structured water molecules can occur between the N-terminal and C-terminal ends of neighboring triple helices (Fig. 4, Lys-5A/Asn-50A*). A major problem in helix formation is the lack of hydrogen-bonding partners for exposed terminal amide protons and carboxyl oxygens (40). The helix capping and solvation of the ends presumably account in part for the presence of a high resolution molecular structure in the Ala-14 trimer under crystallization conditions. A second unique feature of Ala-14 in the crystalline state is the presence of continuous strips of intertrimer interactions along the helical axis. These appear to be critical in determining the compact lateral assembly of Ala-14 trimers (Fig. 5A). The network of extended interstrand connections among peptide atoms and ordered water molecules includes the following: (i) 2 salt bridges and 16 charge-stabilized hydrogen bonds between the adjacent triple helices/trimer (Fig. 5, B-E, Asp-12C/Arg-31C* and Arg-43C/Ser-11B*) and (ii) frequent water-mediated hydrogen-bonding interactions between polar or charged side chains and backbone carbonyl or amide groups (Fig. 5, A and C-E). Of the 186 water molecules added in the crystal structure, 154 (82%) are involved in direct interactions (within 3.25 Å) with the peptide. The average B-factor value of the structured water molecules, 28.2 Å², is not much higher than the 21.3 Å² B-factor of protein side-chain atoms. The hydration of the Ala-14 trimer seems to be dominated by its involvement in water-mediated interchain hydrogen bonds. The combination of specific intertrimer electrostatic interactions with capping effects at the helix termini suffices to lock a labile Ala-14 solution structure into a defined native state in the crystal.

i for the intensity (I) of i observations of reflection h.
c R-factor = Σ|Fobs − Fcalc| / Σ|Fobs|, where Fobs and Fcalc are the observed and calculated structure factors, respectively. No σ-cutoff was applied.
d Rfree = R-factor calculated using 10% of the reflection data chosen randomly and omitted from the start of refinement.
e r.m.s., root mean square.
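The R-factor and free R-factor defined in the table footnotes can be illustrated with a short calculation. This is a toy sketch with made-up amplitudes, not the refinement procedure used in CNS or SHELX97; the function names and the fixed random seed are assumptions introduced for the example.

```python
import random


def r_factor(f_obs, f_calc):
    """Sum of |Fobs - Fcalc| divided by the sum of |Fobs|."""
    numerator = sum(abs(o - c) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def split_free_set(n_reflections, free_fraction=0.10, seed=0):
    """Randomly reserve a fraction of reflection indices for the free set.

    The free reflections are excluded from refinement, and the R-factor
    computed over them alone is the free R-factor.
    """
    rng = random.Random(seed)
    indices = list(range(n_reflections))
    rng.shuffle(indices)
    n_free = round(n_reflections * free_fraction)
    return sorted(indices[n_free:]), sorted(indices[:n_free])


# Hypothetical structure-factor amplitudes for illustration only.
toy_obs = [10.0, 20.0, 30.0]
toy_calc = [9.0, 22.0, 30.0]
print(r_factor(toy_obs, toy_calc))

work_set, free_set = split_free_set(100)
print(len(work_set), len(free_set))
```

Because the free set never influences the model, a free R-factor that stays close to the conventional R-factor is taken as evidence that the model is not overfitted.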
Polar Interactions-Proteins acquire their spatial structures under the influence of a manifold of noncovalent interactions among backbone and side-chain atoms including the hydrophobic effect, van der Waals interactions, and hydrogen bonds and other electrostatic interactions. Whereas the relative importance, selectivity, and context dependence of these forces remain imperfectly understood, the burial of hydrophobic surfaces is thought to play a major role in governing protein architecture and folding, at least in relatively small protein folds (3, 33-35, 41-44). Ala-14 provides a striking example of how a native structure can still form once this localized driving force has been severely attenuated. In contrast to proposals that solvent-exposed electrostatic interactions contribute only marginally to protein stability, we find that an array of weak polar interactions distributed along the chain can impose sufficient conformational constraints to stabilize the alanine-zipper trimer.
Buried alanine side chains provide a small van der Waals component to the thermodynamic stability of the alanine-zipper trimer. The residual enthalpic interactions among the core Ala side chains are not sufficient to maintain the alanine-zipper fold in solution. Our finding that the covalently tethered Ala-14 trimer analog fails to form a triple-helical structure in solution indicates that alanine-zipper formation is not solely opposed by the loss of translational conformational entropy. It would appear that favorable enthalpy contributions from residual van der Waals packing, backbone hydrogen-bonding, and the network of solvent interactions guide and stabilize the formation of the alanine-zipper structure during crystallization. The precision of this process is reflected by the high resolution of the crystal structure (1.3 Å) with low temperature factors (16.6 and 28.9 Å² for protein and solvent atoms, respectively).

FIG. 2. Solution properties of the Ala-14 (triangles) and Ala-14SOS (circles) proteins.
A, circular dichroism spectra at 0°C in phosphate-buffered saline, pH 7.0, and 3 mg/ml protein concentration. B, thermal melts monitored by circular dichroism at 222 nm. C, a schematic model of the designed protein Ala-14SOS. Three Ala-14 helices are linked using short disulfide-bonded peptide sequences. N-to-N linker and C-to-C linker are formed by disulfide bond formation between the N-terminal residues Cys-Gly-Gly and the C-terminal residues Gly-Gly-Cys, respectively. In each case, two glycine residues were added between the helix and the cysteine residue to provide flexibility for disulfide bond formation. D, representative sedimentation equilibrium data of a 0.5 mg/ml solution of Ala-14SOS collected at 4°C and 22,000 rpm in phosphate-buffered saline, pH 7.0. The data fit best to an Ala-14 trimer model (curve 3). Curves for an Ala-14 dimer (curve 2) and an Ala-14 tetramer (curve 4) are indicated for comparison.
Biological Implications-This process of triple-helical assembly in the crystalline state is to our knowledge the first example of stabilization of a native state directed by long range intermolecular rather than short range intramolecular interactions. It is tempting to speculate that extended solvent-mediated interactions such as those we describe here mediate the assembly of the coiled-coil rod of myosin to form filaments in muscle. There is growing evidence that a class of proteins referred to as intrinsically unstructured proteins are unfolded and devoid of native tertiary structure under normal solution conditions yet can self-assemble into highly organized aggregates that play a causative role in the pathogenesis of human diseases including Alzheimer's and the transmissible spongiform encephalopathies (45, 46). Understanding the fundamental nature of the mechanism of such large-scale assembly processes is a critical step in the development of strategies to prevent and treat these incurable diseases. By analogy with Ala-14, long range intermolecular interactions might entail formation of solvent-mediated interactions that promote assembly of the proteins into oligomeric fibrils.