Exploring the Cytochrome c Folding Mechanism

Understanding the role of partially folded intermediate states in the folding mechanism of a protein is a crucial yet very difficult problem. We exploited a kinetic approach to demonstrate that a transient intermediate of a thermostable member of the widely studied cytochrome c family (cytochrome c552 from Thermus thermophilus) is indeed on-pathway. This is the first clear indication of an obligatory intermediate in the folding mechanism of a cytochrome c. The fluorescence properties of this intermediate demonstrate that the relative position of the heme and of the only tryptophan residue cannot correspond to their native orientation. Based on an analysis of the three-dimensional structure of cytochrome c552, we propose an interpretation of the data that explains the residual fluorescence of the intermediate and is consistent with the established role played by some conserved interhelical interactions in the folding of other members of this family. A limited set of topologically conserved contacts may guide the folding of evolutionarily distant cytochromes c through the same partially structured state, which, however, can play different kinetic roles, acting either as an intermediate or as a transition state.

The mechanism by which an unfolded polypeptide chain finds its unique native state is one of the most intriguing problems in biology. Extensive kinetic studies led to the hypothesis that protein folding proceeds along a defined reaction pathway whereby the polypeptide is driven through one or more partially structured intermediates. Following this view, much experimental work has focused on the identification and characterization of folding intermediates and transition states employing rapid reaction techniques (1). However, the discovery of kinetic traps (2,3) and parallel folding pathways (4) required the significance of partially folded species to be critically re-examined.
Understanding the significance of folding intermediates is often difficult (5,6). Under refolding conditions, they are usually populated within the dead time (2-5 ms) of conventional stopped-flow instruments. The kinetic data for the majority of proteins that seem to fold by means of a three-state mechanism are compatible with both a productive on-pathway intermediate and a misfolded off-pathway species that must unfold again before reaching the native state. Only recently, with the development of ultrarapid mixing devices, has the kinetic role of some intermediates been directly addressed (7-10).
Thermostable proteins are potentially useful in studying the structural and energetic properties of folding intermediates and their kinetic role, because it is reasonable to envisage that these may be particularly stable species. Because of experimental complexities, however, only limited information on the kinetic folding mechanism of hyperthermophilic proteins is available (11,12). Here, we report on the folding kinetics of cytochrome (cyt) c552 from the extreme thermophilic bacterium Thermus thermophilus (13,14) in an attempt to shed light on the folding mechanism of the cyt c family. Kinetic experiments led to the identification of a partially structured species and provided direct evidence that cyt c552 folds through a compact on-pathway intermediate. Analysis of the three-dimensional structure (14) of native cyt c552 made it possible to derive a hypothesis for the structure of this on-pathway intermediate, consistent with its fluorescence properties and with the conservation of several interhelical contacts, previously shown to play a role in the folding of other members of the cyt c family (15-18).

EXPERIMENTAL PROCEDURES
Native cytochrome c552 from T. thermophilus was purified as described previously (19). All experiments were carried out on the oxidized protein, as checked spectrophotometrically. The buffers used were: 50 mM sodium phosphate, pH 7.0, and 50 mM sodium citrate, pH 2.1. All reagents were of analytical grade.
Stopped-flow Measurements-All fluorescence-detected kinetic folding experiments were carried out on an Applied Photophysics DX-17MV stopped-flow instrument (Leatherhead, UK); the excitation wavelength was 290 nm, and the fluorescence emission was measured by using a 320-nm cut-off glass filter. In all experiments (performed at 10°C), refolding and unfolding were initiated by an 11-fold dilution of the denatured or the native protein in the appropriate buffer.
Double-mixing Measurement-Interrupted refolding experiments were carried out on an Applied Photophysics SX-18MV stopped-flow instrument with double-mixing facility (Leatherhead, UK). Refolding and unfolding were initiated by a symmetric mixing of the denatured or the native protein with the appropriate buffer. Unfolded cyt c552 was obtained by incubation in 5 M GdnHCl at pH 2.1.
* This work was partially supported by the Ministero dell'Istruzione, Università e Ricerca of Italy (PRIN 2001 on "Structural Dynamics of Hemeproteins" to M. B. and Centro di eccellenza "Biologia e medicina molecolare"). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact. This paper is dedicated to the memory of Professor Eraldo Antonini, eminent biochemist prematurely deceased on March 19, 1983.
Data Analysis-Equilibrium experiments. Assuming a standard two-state model, the GdnHCl-induced denaturation transitions were fitted to the equation

$$\Delta G_d = \Delta G_w - m_{UN} D \qquad \text{(Eq. 1)}$$

where ΔG_w and ΔG_d are the free energy of folding in water and at a concentration D of denaturant, respectively, m_UN is the slope of the transition (proportional to the increase in solvent-accessible surface area in going from the native to the denatured state), and D is the denaturant concentration (20). In some cases, an equation that takes into account the pre- and post-transition baselines has been used to fit the observed unfolding transition (21).
Kinetic Experiments-Analysis was performed by nonlinear least squares fitting of single- or double-exponential phases by using the fitting procedures provided in the Applied Photophysics software. The chevron plot describing the U ⇌ I transition was fitted by numerical analysis based on the two-state model where U represents the unfolded state and I represents the intermediate state. The observed folding kinetics is described by the equation k_obs = k_UI + k_IU, where the following applies.

$$k_{UI} = k_{UI}^{0} \exp(-m_{UI} D / RT) \qquad \text{(Eq. 2)}$$

$$k_{IU} = k_{IU}^{0} \exp(m_{IU} D / RT) \qquad \text{(Eq. 3)}$$

Here R is the gas constant and T is the absolute temperature.
These equations permit one to determine the refolding and unfolding rates at any denaturant concentration D and thereby k_IU^0 and k_UI^0, i.e. the unfolding and refolding rate constants in the absence of denaturant. From the chevron plot describing the I ⇌ N transitions, where N represents the native state, only the refolding limb at GdnHCl concentrations ≤ 2.5 M (reporting on the I → N transition) and the unfolding limb at GdnHCl concentrations ≥ 4 M (reporting on the N → I transition) were fitted according to Equations 2 and 3, respectively. The microscopic rate constants and the associated m-values for this plot are reported in Table I.
Structural Analysis-All x-ray structures described in this work are available from the Protein Data Bank (22) and were analyzed by using the programs Insight II (23) and Database of Secondary Structure in Proteins (24).
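The two-state chevron analysis described under "Kinetic Experiments" can be sketched numerically as follows. This is a minimal illustration and not the fitting procedure of the Applied Photophysics software: it assumes a log-linear dependence of each microscopic rate constant on denaturant concentration, m-values expressed in kcal mol⁻¹ M⁻¹, and a temperature of 10°C. The function names and all parameter values are hypothetical and were chosen only to produce a chevron-shaped curve.

```python
import math

R_KCAL = 1.987e-3   # gas constant, kcal mol^-1 K^-1
T_KELVIN = 283.15   # 10 degrees C, the temperature quoted for the stopped-flow runs


def rate_constant(k0, m, denaturant, sign):
    """Log-linear denaturant dependence of a microscopic rate constant.

    sign = -1 for a refolding step (slowed by denaturant),
    sign = +1 for an unfolding step (accelerated by denaturant).
    Assumes m is given in kcal mol^-1 M^-1.
    """
    return k0 * math.exp(sign * m * denaturant / (R_KCAL * T_KELVIN))


def k_obs(denaturant, k_ui0, m_ui, k_iu0, m_iu):
    """Observed relaxation rate of a two-state U <-> I transition."""
    k_ui = rate_constant(k_ui0, m_ui, denaturant, -1)
    k_iu = rate_constant(k_iu0, m_iu, denaturant, +1)
    return k_ui + k_iu


# Hypothetical parameters (NOT the values of Table I).
params = dict(k_ui0=1.0e4, m_ui=1.0, k_iu0=1.0e-2, m_iu=0.5)

for d in [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]:
    print(f"[GdnHCl] = {d:.1f} M  k_obs = {k_obs(d, **params):.3g} s^-1")
```

With these illustrative numbers the observed rate falls along the refolding limb, passes through a minimum near the transition midpoint, and rises again along the unfolding limb, which is the shape from which the rate constants in the absence of denaturant are extrapolated.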

RESULTS
Hydrophobic Interactions and Stability of cyt c552-Cyt c552 from T. thermophilus is a class I cyt c with an unusually long (131 residues) amino acid sequence, only distantly related to sequences of other members of this class. Inspection of its three-dimensional structure (14) shows that the protein is endowed with some unique features that are absent in the canonical cyt c fold ( folding transitions both at equilibrium and in kinetic experiments. The GdnHCl denaturation curves of the oxidized cyt c552 at pH 7.0 and 2.1 are shown in Fig. 2A; the parameters obtained by fitting the data to Equation 1 (see "Experimental Procedures") are reported in the legend.
Stopped-flow unfolding experiments carried out under the same conditions (Fig. 2B) show that unfolding is cooperative and extends over a time scale spanning approximately three orders of magnitude, whereas the unfolding time courses are always single exponential. The unfolding rate constants extrapolated to 0 M GdnHCl are: 6.7 ± 2.0 × 10⁻¹⁰ s⁻¹ at pH 7.0 and 3.7 ± 1.0 × 10⁻⁶ s⁻¹ at pH 2.1. These values are much smaller than those generally reported for mesophilic globular proteins (10⁻¹ to 10⁻⁴ s⁻¹) but are still on the time scale expected for a thermophilic protein (11,12). At pH 2.1, the rate constant dependence on [GdnHCl] tends to become non-linear only at the highest concentrations. This behavior is typical of members of the cyt c family and is attributed to the Met-iron deligation becoming rate-limiting (15,17,25).
The fact that cyt c552 still retains considerably slow unfolding kinetics at pH 2.1, where ionizable groups should all be protonated and salt bridges disrupted, implies that there is an important component to stability caused by forces that are nonelectrostatic in nature. On the other hand, the observation that the unfolding rate constants at pH 2.1 and 7.0 (Fig. 2B) differ by more than three orders of magnitude poses an interesting question: can this large effect be accounted for by differences in the free energy of the native state or of the unfolding transition state? Examination of the three-dimensional structure of cyt c552 (14) shows a peculiar distribution of ion pairs. Although their overall number is very close to the average found for mesophilic proteins (0.004/residue; Ref. 26), in cyt c552 salt bridges are clustered in the C-terminal α-helices (14). This led to the hypothesis that these so-called "thermo helices" represent a dominant factor for thermostability (14). From the equilibrium unfolding curves at pH 2.1 and 7.0 (see Fig. 2A), we obtain a ΔΔG_UN = 5.5 ± 0.2 kcal mol⁻¹, which is similar to the destabilization of the unfolding transition state at pH 2.1 compared with pH 7.0 (ΔΔG‡ = 4.80 ± 0.01 kcal mol⁻¹; see Fig. 2B and Table I). This indicates that the slower unfolding kinetics at pH 7.0 is essentially caused by a greater stabilization of the native state rather than by an increase in the barrier for unfolding. This observation is in agreement with the proposal that ion pairs play a kinetic role by clamping the protein in key surface positions, thereby reducing the normal vibrational modes accessible upon unfolding (11).
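The link between the two extrapolated unfolding rate constants and the quoted ΔΔG‡ can be checked with a short calculation. The sketch below assumes that the difference in activation free energy equals RT ln(k₁/k₂), with T taken as 10°C (the temperature given for the stopped-flow experiments); the function name is hypothetical.

```python
import math

R_KCAL = 1.987e-3   # gas constant, kcal mol^-1 K^-1
T_KELVIN = 283.15   # 10 degrees C (assumed from the stopped-flow conditions)


def barrier_difference(k_fast, k_slow, temperature=T_KELVIN):
    """Difference in activation free energy (kcal/mol) implied by a rate ratio."""
    return R_KCAL * temperature * math.log(k_fast / k_slow)


# Unfolding rate constants extrapolated to 0 M GdnHCl, as quoted in the text.
k_unfold_ph21 = 3.7e-6   # s^-1 at pH 2.1
k_unfold_ph70 = 6.7e-10  # s^-1 at pH 7.0

ratio = k_unfold_ph21 / k_unfold_ph70
ddg = barrier_difference(k_unfold_ph21, k_unfold_ph70)

print(f"rate ratio      = {ratio:.0f}")
print(f"log10(ratio)    = {math.log10(ratio):.2f}")
print(f"ddG (kcal/mol)  = {ddg:.2f}")
```

Under these assumptions the ratio exceeds three orders of magnitude and the corresponding free energy difference comes out between 4.8 and 4.9 kcal mol⁻¹, in line with the reported ΔΔG‡.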
Folding Kinetics-The refolding kinetics of cyt c552, followed by Trp fluorescence quenching, is complex. It was suggested previously (27) that unfolded cyt c552 at pH 7.0 may exist as a mixture of different 6-coordinate states, most probably because of iron miscoordination by His residues (His32 and His86), which may interfere with the refolding reaction; this is the most likely explanation for the multiphasic refolding time course that we observed at pH 7.0 (not shown). Miscoordination of the heme iron by His residues was abolished in horse cyt c either by lowering the pH (thereby protonating the His residues) or by adding imidazole as an extrinsic Fe³⁺ ligand (15,28). In the case of cyt c552, heterogeneity of the refolding reaction was still observed not only at pH 7.0 (even after the addition of 0.2 M imidazole to the refolding buffer, data not shown) but also at very low pH values (i.e. at pH 2.1). A pH titration of the GdnHCl-unfolded cyt c552, followed by the appearance of the characteristic high spin absorbance band at 620 nm, yielded a curve consistent with a single transition, with a pK = 4.5 and a Hill coefficient of n = 1, indicating the binding/dissociation of a single group (data not shown). Therefore, we resorted to exploring the complete [GdnHCl] dependence of the refolding rates at pH 2.1, where His residues are completely protonated. Under these conditions, the protein is still native, as judged from the acid titration (27) and our unfolding data (Fig. 2A), although it is sufficiently destabilized to allow kinetic refolding experiments to be carried out. Surprisingly, even at pH 2.1, the refolding time course in the milliseconds to seconds time range is clearly biphasic and is described only by a double exponential, which cannot be caused by miscoordination. A third and much slower refolding phase (k ~ 0.01 s⁻¹) with a very small amplitude was also observed.
However, ad hoc interrupted unfolding experiments, with variable delay times (1-10 min; not shown) demonstrate that this very slow process is consistent with a minor species refolding along a pathway rate-limited by isomerization of the prolyl-peptide bonds.
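The statement that the titrating group is essentially fully protonated at pH 2.1 follows from the fitted pK of 4.5 and Hill coefficient of 1. A minimal sketch, assuming a simple Hill-type titration curve (the function name is hypothetical, and the curve is an idealization of the unpublished titration data):

```python
def protonated_fraction(ph, pk=4.5, n=1.0):
    """Fraction of the population on the low-pH (protonated) side of a
    single Hill-type transition with midpoint pk and Hill coefficient n."""
    return 1.0 / (1.0 + 10.0 ** (n * (ph - pk)))


for ph in (2.1, 4.5, 7.0):
    print(f"pH {ph:.1f}: protonated fraction = {protonated_fraction(ph):.4f}")
```

With these parameters, more than 99% of the population sits on the protonated side of the transition at pH 2.1 and less than 1% at pH 7.0, which is why the pH 2.1 condition was suited to excluding His miscoordination.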
The dependence of the refolding and unfolding rate constants on [GdnHCl] at pH 2.1 is shown in Fig. 3. Because we can exclude miscoordination events, we assumed that an intermediate species was populated; therefore, we carried out experiments to clarify whether this intermediate is on-pathway (U ↔ I ↔ N) or off-pathway (I ↔ U ↔ N). The faster phase (k ~ 100 s⁻¹) of the refolding limb of the chevron plot is assumed to represent the formation of I from U. To assess the dependence of the rate of unfolding of I on [GdnHCl], we used a double-mixing protocol. Refolding after denaturant dilution (first mixing) was allowed to proceed for 2 s at pH 2.1 and 2.5 M GdnHCl to preferentially populate I. Subsequently (second mixing), the intermediate was unfolded by mixing with GdnHCl at various final concentrations, thus generating the rate profile for the I ↔ U reaction (Fig. 3).
The rate constants derived from the faster phase at low and high GdnHCl concentrations define, respectively, the microscopic rate constants k_UI and k_IU together with their denaturant dependences (m_UI and m_IU); on the other hand, the rate constants for the slower phase define, respectively, k_IN and k_NI, together with their GdnHCl dependences m_IN and m_NI (Table I). The situation emerging from inspection of Fig. 3 (5,28). The validity of the on-pathway model is further supported by the good agreement between the equilibrium parameters calculated using the kinetic data (ΔG_UN
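For a linear on-pathway scheme U ↔ I ↔ N, the two observable relaxation rates are the nonzero eigenvalues of the rate matrix, and they can be written in closed form from the four microscopic rate constants. The sketch below is a generic illustration of that relationship, not the analysis used in this work. Only the order of magnitude of the fast phase (about 100 s⁻¹) is taken from the text; the remaining rate constants are hypothetical.

```python
import math


def relaxation_rates(k_ui, k_iu, k_in, k_ni):
    """Return (fast, slow) observable rates for the scheme U <-> I <-> N.

    The two nonzero eigenvalues satisfy:
      fast + slow = k_ui + k_iu + k_in + k_ni
      fast * slow = k_ui*k_in + k_ui*k_ni + k_iu*k_ni
    """
    total = k_ui + k_iu + k_in + k_ni
    product = k_ui * k_in + k_ui * k_ni + k_iu * k_ni
    root = math.sqrt(total * total - 4.0 * product)
    fast = (total + root) / 2.0
    slow = product / fast  # numerically safer than (total - root) / 2
    return fast, slow


# k_ui ~ 100 s^-1 follows the fast refolding phase quoted in the text;
# the other three values are hypothetical placeholders.
fast, slow = relaxation_rates(k_ui=100.0, k_iu=1.0, k_in=5.0, k_ni=0.01)
print(f"fast phase = {fast:.2f} s^-1, slow phase = {slow:.2f} s^-1")
```

When the U ↔ I step is much faster than the I ↔ N step, the fast phase approaches k_UI + k_IU and the slow phase approaches k_IN multiplied by the fraction of I in the pre-equilibrium, plus k_NI. This separation is what allows each chevron limb to be assigned to a single microscopic step.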

DISCUSSION
The parameters derived from the folding/unfolding data have been used to calculate the free energy diagram shown in Fig. 4, which depicts the relative compactness of each species expressed through its β value (see the legend to Fig. 4). The strong denaturant dependence of the U → I transition (Fig. 3), which yields a large m_UI value (Table I), indicates that the relative transition state (TS1) is fairly compact (~45% of the native protein). Because of the extremely steep dependence of the refolding rate on [GdnHCl], amplitude analysis is very difficult (data not shown). Thus, although a burst phase is hardly evident, we cannot exclude the possibility that an additional collapsed intermediate may precede transition state TS1. The free energy profile in Fig. 4 also indicates that intermediate I is very compact, given that ~60% of its surface (exposed on unfolding) is buried.
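The compactness values quoted above can be related to kinetic m-values through a cumulative, Tanford-type β calculation. The sketch below assumes that β for each species is the sum of the m-values up to that species divided by the total m-value between U and N. The exact expression used in the Fig. 4 legend is not reproduced here, and the m-values are hypothetical numbers chosen only to reproduce the approximate percentages mentioned in the text, not the values of Table I.

```python
def beta_values(m_ui, m_iu, m_in, m_ni):
    """Fractional burial of solvent-accessible surface area along U -> N.

    Each m-value is taken as a positive magnitude; the total between U and N
    is their sum (assumed cumulative Tanford-type definition).
    """
    total = m_ui + m_iu + m_in + m_ni
    return {
        "U": 0.0,
        "TS1": m_ui / total,
        "I": (m_ui + m_iu) / total,
        "TS2": (m_ui + m_iu + m_in) / total,
        "N": 1.0,
    }


# Hypothetical m-values (arbitrary units), NOT taken from Table I.
hypothetical_m = dict(m_ui=0.9, m_iu=0.3, m_in=0.6, m_ni=0.2)
betas = beta_values(**hypothetical_m)

for species, beta in betas.items():
    print(f"{species:>3}: beta = {beta:.2f}")
```

With these placeholder inputs TS1 sits at about 0.45 and I at about 0.60 on the reaction coordinate, and the further burial between I and TS2 is about 0.30, mirroring the figures discussed in the text.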
The I → N transition in the folding of cyt c552 displays a very shallow dependence on [GdnHCl] (Fig. 3 and Table I), which implies that the decrease in solvent-accessible surface area (~30%) between the intermediate and TS2 for folding to N is associated with limited additional compaction. Nevertheless, this process is associated with additional quenching of the Trp91 emission, which should be accounted for. We first considered the possibility that the residual fluorescence may arise from an equilibrium mixture of U and I, the former being fully fluorescent and the latter fully quenched (similar to the native state). To test this hypothesis, we simulated the time course at 1.5 M GdnHCl, assigning the intermediate either 25% or 0% residual fluorescence and exploring different values for the parameters in Table I. Simulations (not shown) indicate that if the intermediate were assigned a relative fluorescence significantly smaller than 25%, k_IU should be at least 65 s⁻¹ to fit the observed time course. Because this value is grossly inconsistent with the rate constant for the unfolding of the intermediate extrapolated to 1.5 M GdnHCl (~1 s⁻¹ from Fig. 3), we discarded this hypothesis and concluded that the intermediate is fluorescent, and thus the relative position of the heme and Trp91 must be different from that of the native state (14).
Fig. 4 legend (fragment): ... (Table I) with a pre-exponential factor arbitrarily set to 10⁷ s⁻¹. The reaction coordinate β represents the fractional burial of solvent-accessible surface area.
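The size of the mismatch between the two candidate values of k_IU (at least 65 s⁻¹ required by the simulation versus about 1 s⁻¹ extrapolated from the chevron plot) can be expressed as a difference in barrier height. The sketch below assumes the relation ΔG‡ = RT ln(k₀/k) with the pre-exponential factor k₀ = 10⁷ s⁻¹ mentioned in the Fig. 4 legend and T = 10°C; the function name is hypothetical.

```python
import math

R_KCAL = 1.987e-3    # gas constant, kcal mol^-1 K^-1
T_KELVIN = 283.15    # 10 degrees C (assumed from the stopped-flow conditions)
PREFACTOR = 1.0e7    # s^-1, the arbitrary pre-exponential factor of Fig. 4


def activation_free_energy(k, prefactor=PREFACTOR, temperature=T_KELVIN):
    """Barrier height (kcal/mol) implied by a rate constant k (s^-1)."""
    return R_KCAL * temperature * math.log(prefactor / k)


dg_extrapolated = activation_free_energy(1.0)   # k_IU ~ 1 s^-1 from the chevron plot
dg_required = activation_free_energy(65.0)      # k_IU >= 65 s^-1 needed by the simulation
gap = dg_extrapolated - dg_required

print(f"barrier for k = 1 s^-1  : {dg_extrapolated:.2f} kcal/mol")
print(f"barrier for k = 65 s^-1 : {dg_required:.2f} kcal/mol")
print(f"difference              : {gap:.2f} kcal/mol")
```

Because the pre-exponential factor cancels in the difference, the gap depends only on the rate ratio. Under these assumptions it amounts to more than 2 kcal mol⁻¹, which illustrates why the equilibrium-mixture hypothesis was judged incompatible with the kinetic data.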
Although it has been shown recently that even on-pathway intermediates may contain non-native structures (29), the compactness of the on-pathway cyt c552 intermediate suggests that a substantial fraction of its structural features and tertiary interactions might be similar to those of the native protein; on this basis, we present an interpretation of our experimental data in the context of the native structure of the protein (14). The physical-chemical character of most residues involved in the interaction between helices A and D in cyt c552 is conserved (Fig. 1C), as reported originally by Ptitsyn (16) on the basis of a structural comparison of other class I cytochromes c. Equilibrium and kinetic experiments carried out on several site-directed mutants of yeast iso-1-cyt c (30), horse cyt c (15), and Pseudomonas aeruginosa cyt c551 (17,31) clearly showed that the interactions at the interface between the N- and C-terminal helices play a crucial role in the folding mechanisms of these proteins. These interactions have been claimed to stabilize an intermediate state for the two eukaryotic proteins, whereas they are involved in the formation of the transition state in the prokaryotic cyt c551. Based on structural considerations, it is expected that this same network of interactions may be responsible for the stabilization of the on-pathway intermediate state of cyt c552. On the other hand, the observed residual fluorescence of this intermediate shows that the position of Trp91 (at the end of helix D) relative to the heme (covalently bound to helix A) is different from that of the native state, despite the compactness of the intermediate (Fig. 4).
On the basis of a careful analysis of the native structure of cyt c552, it is possible to propose a model for the structure of the intermediate accounting for both of these observations. All of the residues of helix A involved in the interactions with the nearby helix D belong to the first portion of the helix, whereas the residues covalently bound to the heme (Cys11 and Cys14) belong to the second (Fig. 1B); more to the point, we noticed that helix A is kinked at position 9 (see Fig. 1A). If the relative positions of the two portions of helix A separated by this kink are postulated to be different in the intermediate as compared with the native state, the maintenance of the interactions between helices A and D would be consistent with the incomplete quenching of the Trp fluorescence in the folding intermediate, which implies a greater overall distance from the heme. Furthermore, the high level of compactness of the intermediate (see Fig. 4) suggests that most of the protein has a native-like structure, which is consistent with the hypothesis that the docking of the region of helix A after the kink (carrying the heme) is sufficient to account for the additional decrease in the solvent-accessible surface area upon transition to N. A direct test of this hypothesis may be attempted when mutants of cyt c552 are available.
In conclusion, we propose that a common folding mechanism is applicable to different members of the cytochrome c family: the same set of conserved tertiary interactions (see Fig. 1C) seems to be responsible for the formation of a similar, partially structured state in all cases. This hypothesis has been championed by Englander and co-workers (18), whose work shows that native-like, partially folded intermediates occur during horse cyt c folding. However, the role that this species plays in the folding mechanism of the different cytochromes c may be different, because it may either represent a well defined local minimum in the energy barrier (the transient intermediate of cyt c552) or an unstable high energy state, as in cyt c551 (17). From this perspective, the "historical" distinction between two-state and three-state folding mechanisms may demand a more careful interpretation, as the main difference between the two models may originate from the height of the energy barrier along the reaction coordinate (32,33), which in turn depends on a limited but crucial set of stabilizing interactions in the folding nucleus.