Low Energy Pathways and Non-native Interactions

Four versions of a β-sheet protein (CD2.d1) have been made, each with a single artificial disulfide bond inserted into hairpin structures. Folding kinetics of reduced and oxidized forms shows bridge position strongly influences its effect on the folding reaction. Bridging residues 58 and 62 does not affect the rapidly formed intermediate (I) or rate-limiting transition (t) state, whereas bridging 33 and 38, or 31 and 41, lowers the t-state energy, with the latter having the stronger influence. Bridging residues 79 and 90 stabilizes both I- and t-states. To assess additivity in the energetic effects of these bridges, four double-bridge variants have also been made. All show precise additivity of overall stability, with two showing additivity when ground states and the rate-limiting t-state are assessed, i.e. no measurable change in the folding mechanism occurs. However, combining 31-41 and 79-90 bridges produces a molecule that folds through a different pathway, with a much more stable intermediate than expected and a much higher t-state barrier. This is explained by the artificial introduction of stabilizing, non-native contacts in the I-state. More surprisingly, for another double-bridge version (58-62 and 79-90) both I- and t-states are less stable than expected, showing that conformational constraints introduced by the two bridges prevent formation of non-native contacts that would otherwise stabilize the I- and t-states, thereby lowering the energy of the folding landscape in the wild-type (unbridged) molecule. We conclude that the lowest energy path for folding has I- and t-state structures that are stabilized by non-native interactions.

The first domain of the cell surface molecule CD2 is a member of the immunoglobulin family of β-sandwich proteins. It has proved to be a productive experimental model for the study of folding in proteins that comprise only β-sheets, because of the reversibility and high yield of the folding reaction and the large signal change that accompanies the folding process. In addition, the absence of disulfide bonds in the folded structure means that the native state is maintained only by non-covalent interactions, and the folding reaction is described by a single exponential process, showing that there are no subpopulations that fold by a different route and that there is only one rate-limiting transition state.
Despite the uncomplicated nature of the folding and unfolding kinetics, analysis of the denaturant dependence of the folding rate constant shows that the protein folds through a rapidly formed intermediate state, such that the overall folding process can be accurately described by a three-state mechanism as shown in Equation 1,

$$\mathrm{U} \overset{K_{I/U}}{\rightleftharpoons} \mathrm{I} \underset{k_{F-I}}{\overset{k_{I-F}}{\rightleftharpoons}} \mathrm{F} \qquad \text{(Eq. 1)}$$

where U is the unfolded state, I is a partially folded intermediate state, and F is the fully folded state. The initial rapid collapse, described by the equilibrium constant K_I/U, occurs on the sub-millisecond time scale and leads to the formation of an I-state in which some aspects of the overall topology of the molecule are established. Evidence for this is drawn from hydrogen-exchange experiments in which the most extensive I-state protection is detected in a nucleus of four β-strands (see Fig. 1, B, C, E, and F). Interestingly, the hydrogen-bonded strand pairings in this nucleus are distant in sequence (B with E and C with F) and form the most topologically complicated substructure within the folded molecule (1,2). The simpler and more sequence-local substructures, namely the β-hairpins (a β-strand, followed by a turn of any description and a second β-strand that is hydrogen-bonded to the first), are also protected to some extent, particularly the long hairpin created by the hydrogen bonding of the F and G strands. The rate-limiting transition state for folding of CD2 has also been examined by mutational analysis (3). These studies also emphasize the importance of the B,C,E,F nucleus (see Figs. 1 and 3b) in the transition state, because the only residues with transition state φ-values greater than 0.2 are in this region. It is also interesting to note that there are no statistically relevant φ-values greater than 0.5 in the whole protein, implying a rather poorly ordered transition state with respect to the intimacy of side-chain interactions.
In two other proteins of the immunoglobulin superfamily, namely TI I27 and TNfn3 (belonging to the Ig I-set and the fn3 set of folds, respectively), it is also the formation of the B,C,E,F nucleus that is the key to folding (4,5). These findings reinforce the argument that topology and the transition state in the folding pathway are strongly related.
The kinetic influence of disulfide bridges has been used as a tool in dissecting the folding pathway of numerous proteins (6-9). The topological properties of the intermediate and transition states of CD2 have also been probed by examining the effects of preformed disulfide bridges on folding dynamics (10). It was found that, in the majority of cases, well-designed cross-links (11,12) stabilized the folded state by an amount that was predictable from a calculation of the entropic penalty in the unfolded state (13). Surprisingly, however, cross-links that connect β-strands that are distant in sequence were found predominantly to stabilize the rapidly formed intermediate state, suggesting that these strand-strand interactions occur in the initial stages of folding. Cross-links that stabilize local hairpins have their major influence on the second, rate-determining step, leading to significant enhancements in the folding rate. The result implies that the slow, rate-limiting step must involve the consolidation of localized hairpin structures, a process that occurs after the complex elements of overall topology are established. Thus long-range contacts, and therefore molecular order, can be established in early, non-rate-limiting processes.
To extend the above study of disulfide engineering, in this report we have looked at the additivity of effects, i.e. the extent to which the energetic properties of a molecule in which two disulfide bonds have been engineered can be predicted by simply summing the effects of the two single bridges. We undertook this work to address three questions. First, could we combine bridges to produce hyper-stable versions of the protein? This would clearly have implications for engineering extreme robustness, at least into β-sheet-containing proteins. Second, could we combine bridges that individually produced fast-folding proteins to create an ultra-fast folder? Third, and more speculatively, if we found nonadditivity in combining the bridges, what information might this reveal about the nature of the folding process?

EXPERIMENTAL PROCEDURES
Mutagenesis-All mutations were made on the rat gene cloned into the pGEX-2T glutathione S-transferase fusion vector (14). The proteins were expressed and purified as described previously (2).
Cross-linking Disulfides-To form the disulfide bridges, 10 μM protein was oxidized in an oxygen-purged buffer containing 20 μM zinc chloride, 50 mM triethanolamine hydrochloride at pH 7.5. To break the bridges, 10 μM protein in a helium-purged buffer containing 1 mM EDTA and 50 mM triethanolamine hydrochloride at pH 7.5 was reduced by the addition of 10 mM dithiothreitol. In addition, all other buffers were filtered and purged with the relevant gases. The proteins were then checked for oxidation and reduction using mass spectrometry and Ellman's reagent (15).
Kinetic Analysis-The data have been fitted to Equation 2, where k_F-I and k_I-F are rate constants describing the forward and reverse reactions, respectively, between the folded and intermediate (I) states. K_I/U is the equilibrium constant ([I]/[U]) for the rapid interconversion of the intermediate and unfolded states (16). In the fitting routine, the relationships shown in Equations 3-5 were used, where the subscript w describes the rate and equilibrium constants in water, and the m parameters describe the shifts in the stabilities of each state (designated by the subscript) as a function of the guanidine hydrochloride (GuHCl) denaturant activity (D), where

$$D = \frac{C_{0.5}\,[\mathrm{GuHCl}]}{C_{0.5} + [\mathrm{GuHCl}]} - 2.6\,[\mathrm{Na_2SO_4}]$$

and C_0.5 is a denaturation constant with a value of 7.5 M. This treatment has been explained in detail elsewhere (2,16,17), with the coefficient of 2.6 derived from the linear relationship between sodium sulfate (Na2SO4) concentration and denaturant activity between 0 and 0.4 M (i.e. solutions of 0, 0.1, 0.2, 0.3, and 0.4 M give molar denaturant activities of 0, −0.27, −0.51, −0.78, and −1.03 M, respectively). Na2SO4 is a kosmotropic agent and works in a manner analogous but opposite to GuHCl, increasing the free energy of solvation of hydrocarbon and consequently driving the folding reaction in favor of more compact and desolvated states. The molar ability of Na2SO4 to increase the extent of hydrocarbon burial has been scaled to the molar ability of GuHCl to decrease it, allowing us to calculate the denaturant activity of Na2SO4 (16). This results in negative values of denaturant activity.
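The denaturant activity scale defined above can be illustrated with a short calculation. The following is a minimal sketch that uses only the constants quoted in the text (C_0.5 = 7.5 M and the sulfate coefficient of 2.6); the function and variable names are hypothetical and chosen for this illustration. With no GuHCl present, the linear sulfate term gives activities within about 0.01 M of the values quoted for 0 to 0.4 M Na2SO4.

```python
C_HALF = 7.5          # denaturation constant C_0.5 in M, as quoted in the text
SULFATE_COEFF = 2.6   # scales Na2SO4 concentration to denaturant activity


def denaturant_activity(guhcl_molar, na2so4_molar=0.0):
    """Return the denaturant activity D (in M) of a GuHCl / Na2SO4 solution."""
    # Saturating GuHCl term: approaches C_0.5 at very high GuHCl concentration.
    guhcl_term = C_HALF * guhcl_molar / (C_HALF + guhcl_molar)
    # Na2SO4 acts in the opposite sense, so it contributes a negative activity.
    return guhcl_term - SULFATE_COEFF * na2so4_molar


if __name__ == "__main__":
    for sulfate in (0.0, 0.1, 0.2, 0.3, 0.4):
        d = denaturant_activity(0.0, sulfate)
        print(f"{sulfate:.1f} M Na2SO4 -> D = {d:.2f} M")
```

Expressing both additives on one activity axis is what allows folding rates measured in stabilizing (negative D) and destabilizing (positive D) conditions to be plotted on a single rate profile.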
Folding measurements were initiated by mixing a 10 μM solution of unfolded Domain 1 of Cluster Determinant 2 (CD2.d1) containing 50 mM triethanolamine hydrochloride and 3.27 M GuHCl against 10 volumes of a given concentration of GuHCl at 298 K in an SX.18MV stopped-flow apparatus (Applied Photophysics Ltd.). An excitation wavelength of 295 nm was selected by a single monochromator (bandpass 5 nm) from a mercury-xenon light source. The fluorescence intensity above 340 nm was recorded using an emission cut-off filter. For the unfolding reactions, a 10 μM solution of folded CD2.d1 in 50 mM triethanolamine hydrochloride, pH 7.5, was mixed with 10 volumes of an appropriate concentration of GuHCl at 298 K and the reaction recorded as above. Each data point is the average of at least three measurements. The data are independent of protein concentration, and temperature control and mixing efficiency are highly reliable in this apparatus. All reaction solutions were maintained at the appropriate temperature using a thermostated circulating water bath and were monitored continuously with a sensitive thermocouple. From this, the fluctuation in temperature was determined to be no more than ±0.1 °C. Rate constants determined in the same conditions but in different experimental data sets vary by only 6-7%.
All data were fitted using the GraFit analysis software (Erithacus Software). When fitting kinetic data to Equation 2, proportional weighting was used so that the fitted values took account of rate constants equally across the whole range.
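To make the fitting procedure more concrete, the sketch below shows one common way to model the observed rate constant for a three-state scheme in which U and I equilibrate rapidly before a slower I-to-F step. It is an assumption-laden illustration, not a reproduction of Equations 2-5: the exponential dependence of each constant on denaturant activity, the slope names (m_iu, m_if, m_fi), and the example parameter values are all hypothetical. The second function shows what proportional weighting amounts to in practice, namely scaling each residual by the observed value so that slow and fast rate constants contribute equally to the fit.

```python
import math

RT = 1.987e-3 * 298.0  # kcal/mol at 298 K, the temperature used in the text


def observed_rate(d, k_if_w, k_fi_w, k_eq_iu_w, m_iu, m_if, m_fi):
    """Observed rate constant for U <=> I <=> F with a fast U/I pre-equilibrium.

    d is the denaturant activity (M). The m_* slopes (kcal/mol per M) are
    hypothetical names for this sketch, not the paper's m parameters.
    """
    k_eq_iu = k_eq_iu_w * math.exp(-m_iu * d / RT)  # I is disfavored as D rises
    k_if = k_if_w * math.exp(-m_if * d / RT)        # folding step slows as D rises
    k_fi = k_fi_w * math.exp(m_fi * d / RT)         # unfolding step speeds up
    fraction_i = k_eq_iu / (1.0 + k_eq_iu)          # share of the U/I pool that is I
    return k_fi + k_if * fraction_i


def proportional_residuals(observed, calculated):
    """Residuals scaled by the observed value (proportional weighting)."""
    return [(obs - calc) / obs for obs, calc in zip(observed, calculated)]


if __name__ == "__main__":
    # Illustrative parameter values only; they are not taken from the paper.
    params = dict(k_if_w=10.0, k_fi_w=0.01, k_eq_iu_w=50.0,
                  m_iu=1.0, m_if=0.5, m_fi=0.3)
    for d in (-1.0, 0.0, 1.0, 2.0, 3.0):
        print(f"D = {d:+.1f} M  k_obs = {observed_rate(d, **params):.4g} s^-1")
```

In a model of this kind, a well populated intermediate (large K_I/U) makes the folding limb of the rate profile flatten or roll over at low denaturant activity, which is the signature used to detect the I-state.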
Structure-The coordinates for wild-type CD2.d1 were obtained from the Protein Data Bank (1HNG).
As has been described previously, an analysis of the folding kinetics of the reduced and oxidized forms shows that the position of the bridge strongly influences its effect on the folding reaction (10). The effects of the bridges, with respect to free energy changes, are summarized in TABLE TWO, which splits the reaction into three stages. The first stage (U-to-I) is the rapid transition from the unfolded state to the intermediate. The second is the passage from this intermediate state to the rate-limiting transition state (I-to-t). The third is the descent from this high energy state to the folded ground state (t-to-F). For each stage in the reaction, the difference between the free energy change for the unbridged (reduced) molecule and the bridged (oxidized) molecule is given. These ΔΔG(bridge) values should be a true reflection of the topological effect of folding with the bridge constraint compared to folding without it, since the change in volume and hydration potential between two cysteine residues (2 × SH) and a cysteine bridge (S-S) is relatively small. Additionally, by using the reduced state as the reference, rather than the wild-type, small local perturbations caused by replacing the wild-type residue with a cysteine are discounted, and we reveal only the effect of oxidizing the cysteines. For the double-bridge versions, the values in brackets represent the expected ΔΔG(bridge) when the appropriate single-bridge values are summed together.

Dissection of Folding Pathways-Shown in
Effects of Single Bridges-The data presented in the upper half of TABLE TWO show the wide spectrum of effects in the single-bridge versions of CD2. For instance, a bridge between residues 58 and 62 has no effect on the rapidly formed intermediate and only slightly increases the energy barrier that defines the rate-limiting transition state for folding. The major effect of the bridge is to destabilize the folded state, presumably by introducing some degree of strain. The insertion of a bridge between residues 33 and 38 leads to a moderate reduction in the size of the transition state barrier (ΔΔG(bridge) = −0.7 kcal/mol) and produces a protein with a 3-fold faster folding rate. The bridge between residues 31 and 41 is in the same hairpin and lowers the energy of the transition state to a greater degree (ΔΔG(bridge) = −1.8 kcal/mol). In this case, we see a 20-fold enhancement in the folding rate. The most striking effect is produced by a bridge between residues 79 and 90, which both markedly stabilizes the intermediate state (ΔΔG(bridge) = −2.2 kcal/mol) and reduces the height of the transition state barrier by 1.5 kcal/mol.

[Figure legend fragment: ... (Fig. 1) are shown for data collected in oxidizing (ox) and reducing (red) conditions.]
[Table legend fragment: The bridge position is described by the mutation in the single-letter code and by the strands that are joined. Quoted errors are cumulative S.E. calculated from the global least-squares fits to the chevron plots (based on a 95% confidence limit) in Fig. 2.]
Combination of Bridges; Additivity in Overall Free Energy of Folding-What is more relevant to the study conducted here is the additivity of effects. All four double-bridge variants show precise additivity with respect to overall stability (ΔΔG(bridge) U-to-F). That is, the sum of the effects of the single bridges (see values in parentheses) is equal to the effect of the double bridge. This result demonstrates that, at least in principle, engineering high stability by insertion of multiple disulfide bonds into β-sheet regions of proteins is feasible.

[Table column headings: k_I-F(w), k_F-I(w), K_I/U(w), m_u, m_I, m_t, ΔG]
Additivity of Energies along the Folding Pathway-The data in TABLE TWO show that the property of additivity of free energy is maintained through every step in the folding mechanism for two of the four double-bridge proteins (31-41 + 58-62 and 33-38 + 58-62). Within error, the measured effect of introducing two bridges into a single molecule is the same as summing the measured effects in the two individual single-bridge versions (see values in brackets in TABLE TWO), showing that the bridges induce no measurable change in the folding mechanism.
However, this property is not shown by the remaining two combinations. Combining the 31-41 and 79-90 bridges produces a molecule that folds through a different pathway, with an intermediate that is much more stable than expected from the additivity principle and a transition state barrier that is much higher in energy (see schematic profile in Fig. 3a). More explicitly, the ΔΔG(bridge) for the U-to-I transition in the double-bridge version is greater than −3.7 kcal/mol (in the sense of being more stable). The fact that we see no evidence for a measurable population of intermediate in the reduced form means that we do not know the upper limit for this effect. However, from summing the values for the individual bridges we would expect a ΔΔG(bridge) of −1.5 kcal/mol, i.e. the measured effect of the double bridge is >2.2 kcal/mol greater than the predicted effect. This can be defined as a synergistic combination of bridges when considering the energy of formation of the folding intermediate, i.e. the sum is greater than expected from its parts.
When examining the height of the transition state barrier, it is interesting to note that the reduced and oxidized versions of the 31-41 + 79-90 mutant fold at almost the same rate, whereas we would expect a 260-fold enhancement of the rate with its bridges intact. This means that the barrier is 3.2 kcal/mol higher than expected because of the introduction of stabilizing, non-native contacts in the I-state that have to be broken before the molecule can achieve the productive transition state.
The hope in combining the CC′ and FG bridges was that we would not alter the folding pathway and so combine the reductions in the height of the folding barrier occasioned by the single bridges. This would have created a variant of CD2 with a folding rate of >10,000 s⁻¹ if there were a 260-fold enhancement of the rate constant for the reduced system. However, the distortion in the pathway defeated this aim.
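The fold enhancements quoted above follow from the barrier changes if the rate constant is assumed to scale as exp(−ΔG‡/RT) with an unchanged pre-exponential factor. The sketch below works through that arithmetic at 298 K for the single-bridge ΔΔG(bridge) values given in the text and for their sum; the exponential relationship and the function name are assumptions made for this illustration rather than a statement of how the authors derived the figures.

```python
import math

R_KCAL = 1.987e-3   # gas constant in kcal/(mol K)
TEMPERATURE = 298.0  # K, the temperature of the kinetic measurements


def rate_enhancement(ddg_barrier_kcal):
    """Fold change in rate for a change in barrier height (kcal/mol).

    A negative value (lower barrier) gives an enhancement greater than 1.
    Assumes k is proportional to exp(-barrier / RT) with a fixed prefactor.
    """
    return math.exp(-ddg_barrier_kcal / (R_KCAL * TEMPERATURE))


if __name__ == "__main__":
    # Barrier changes for the single bridges, as quoted in the text.
    single_bridges = {"33-38": -0.7, "31-41": -1.8, "79-90": -1.5}
    for name, ddg in single_bridges.items():
        print(f"{name}: {rate_enhancement(ddg):.1f}-fold")

    # If the barrier reductions were additive, the CC' and FG bridges
    # together would lower the barrier by 1.8 + 1.5 = 3.3 kcal/mol.
    combined = single_bridges["31-41"] + single_bridges["79-90"]
    print(f"31-41 + 79-90 (additive): {rate_enhancement(combined):.0f}-fold")
```

Under these assumptions the three single bridges give enhancements of roughly 3-, 20-, and 13-fold, and the additive combination of the 31-41 and 79-90 bridges gives roughly 260-fold, in line with the figures in the text.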
Perhaps more surprisingly, the remaining double-bridge version (58-62 + 79-90) has an intermediate state and a transition state that are less stable than expected. The measured ΔΔG(bridge) for the U-to-I transition in the double-bridge version (see TABLE TWO) is close to zero. However, from summing the values for the individual bridges we would expect a ΔΔG(bridge) of −2.2 kcal/mol, i.e. the measured effect of the double bridge is >2.2 kcal/mol smaller than the predicted effect (see Fig. 3a). This can be defined as an anti-synergistic combination of bridges when considering the energy of formation of the folding intermediate, i.e. the sum is less than expected from its parts. However, as the term anti-synergistic contains an intrinsic contradiction, this effect will be called a destructive combination.
It is interesting to note that this destructive effect persists into the next stage of folding: the transition state barrier is 0.6 kcal/mol higher in the double mutant than would be expected from summation of the singles (Fig. 3b).
This means that the constraints on conformation introduced by the two bridges in the 58-62 + 79-90 mutant are acting in the opposite way to those described for the 31-41 + 79-90 mutant. That is, they must prevent rather than cause the formation of non-native contacts that would otherwise stabilize the I-state and the t-state in the folding pathway of the single-bridge versions.
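The additivity test used throughout this section can be written as a small bookkeeping exercise: compare the measured ΔΔG(bridge) for a double-bridge variant with the sum of the single-bridge values, and label the combination according to the sign of the deviation. The sketch below is illustrative only. The U-to-I numbers are those quoted in the text (the −3.7 kcal/mol figure is a limit rather than a measured value, and −0.2 kcal/mol is the observed value quoted in the Discussion), while the tolerance of 0.3 kcal/mol is a hypothetical stand-in for the experimental errors in TABLE TWO, which are not reproduced here.

```python
def classify_combination(measured, expected, tolerance):
    """Label a double-bridge effect relative to the sum of the singles.

    All values are ddG(bridge) in kcal/mol; more negative means more
    stabilizing. `tolerance` is an assumed error margin, not a value
    taken from the paper.
    """
    deviation = measured - expected
    if abs(deviation) <= tolerance:
        return "additive"
    # More stabilizing than the sum of the parts is synergistic;
    # less stabilizing is a destructive combination.
    return "synergistic" if deviation < 0 else "destructive"


if __name__ == "__main__":
    tolerance = 0.3  # hypothetical error margin in kcal/mol
    # (measured, expected) U-to-I values quoted in the text.
    u_to_i = {
        "31-41 + 79-90": (-3.7, -1.5),
        "58-62 + 79-90": (-0.2, -2.2),
    }
    for variant, (measured, expected) in u_to_i.items():
        label = classify_combination(measured, expected, tolerance)
        print(f"{variant}: measured {measured}, expected {expected} -> {label}")
```

The same comparison can be applied to each stage (U-to-I, I-to-t, t-to-F) in turn, which is how a pathway can be non-additive in individual steps while remaining additive in overall stability.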

DISCUSSION
Implications of Synergistic Combinations-The principle behind additivity of free energy in a pathway is that when two bridges combine synergistically, i.e. their effect on the energy change for a given step is greater than predicted from the sum of the individuals, then the pathway has changed by creating a new conformation of lower free energy than is achieved in either of the single-bridge proteins. This is exemplified by the 31-41 + 79-90 version of CD2, which has an intermediate state that is >2.2 kcal/mol more stable than predicted. This means that there are stabilizing, non-native interactions between the CC′ and FG hairpins in the double-bridge version that are absent in the singles (Fig. 3b). However, the fact that additivity is maintained across the whole pathway, from unfolded to fully folded state, means that this advantage must be lost at a later stage. In fact, the transition state barrier is 3.2 kcal/mol higher than predicted in the doubly bridged molecule. The fact that there is a negative slope in the rate plot for the double mutant at low denaturant concentrations (see Fig. 2) shows that the intermediate is more compact than the transition state. In turn, this demonstrates that the stabilizing non-native contacts formed in the intermediate state have to be broken before the transition state can be attained.
This result can be rationalized in structural terms by the following argument. As shown in the schematic in Fig. 3, the CC′ and FG hairpins are not in contact in the folded state, and analysis of the single-bridge 31-41 version of CD2 shows that the CC′ hairpin is not normally formed in the I-state. If the presence of this disulfide bridge forces hairpin formation, then there is the possibility of a strong non-native interaction between the two hydrophobic faces of the hairpins. This interaction would then have to be broken to achieve the native topology.
Implications of Destructive Combinations-In accordance with the above paradigm, two bridges can be considered to combine in a destructive manner when their effect is less than predicted from the sum of the individual effects, i.e. the pathway is changed in the double mutant by preventing stabilization of non-native interactions that normally occur in the single-bridge proteins. The 58-62 + 79-90 version of CD2 exemplifies this phenomenon, suggesting that such non-native contacts between the DE and FG hairpins occur in the normal folding pathway (see Figs. 1 and 3a). The predicted free energy stabilization (ΔΔG(bridge)) in the double mutant for the formation of the I-state is −2.2 kcal/mol, whereas the observed is only −0.2 kcal/mol.
In the case of the 79-90 single-bridge version, we see a marked stabilization of the I-state (>3.7 kcal/mol). This result shows that the FG hairpin must be at least partially formed in this conformation when the wild-type molecule folds. However, the single 58-62 bridge has, if anything, a destabilizing effect on the I-state, showing that in the wild-type folding pathway the DE hairpin is not formed in the I-state. The destructive nature of the DE-FG combination in the double-bridge molecule can therefore be explained if the FG hairpin in the normal folding pathway is stabilized by non-native interactions with the DE segment of the protein that has not, at this stage, formed a hairpin.
It is interesting to note that the expected reduction in the transition state barrier from the summation of the single-bridge effects is 1.0 kcal/mol, whereas the observed stabilization is only 0.4 kcal/mol. This result implies that the non-native contacts that normally stabilize the intermediate state are maintained to a lesser degree in the transition state.
Concluding Remarks-Although one of the objectives of these experiments was to make a fast-folding version of CD2 by combining an artificial bridge across the FG hairpin with one across the CC′ hairpin, we would argue that our failure to do so was instructive. Rather than making a fast folder, we succeeded only in forming a misfolded, overstable intermediate, i.e. we distorted the folding pathway by stabilizing a substructure (CC′) that is normally unfolded in the I-state, thereby creating non-native interactions between the two hairpins. The lesson here is to engineer more intelligently and, when combining bridges to engineer high folding rates, to choose to stabilize hairpins that are in contact in the native state. Non-native interactions that increase the height of the major energy barrier would probably be avoided by pursuing this strategy.
To add to this, we found evidence for the beneficial role of non-native interactions in a folding pathway. When the DE bridge was combined with the FG, we were surprised to find that the double-bridge molecule folded through a less stable intermediate and through a higher energy transition state than would have been predicted from the effects of the two single bridges. This shows that the double bridge has abolished non-native contacts that normally stabilize intermediate and transition states in the folding pathway. The result implies that, perhaps counterintuitively, contacts between regions of the polypeptide chain that are not in contact in the folded native state can assist, rather than inhibit, the kinetics of the folding process.