Unveiling a Hidden Folding Intermediate in c-Type Cytochromes by Protein Engineering

Several investigators have highlighted a correlation between the basic features of the folding process of a protein and its topology, which dictates the folding pathway. Within this conceptual framework we proposed that different members of the cytochrome c (cyt c) family share the same folding mechanism, involving a consensus partially structured state. Pseudomonas aeruginosa cyt c551 (Pa cyt c551) folds via an apparent two-state mechanism through a high energy intermediate. Here we present kinetic evidence demonstrating that it is possible to switch its folding mechanism from two-state to three-state by stabilizing the high energy intermediate through rational mutagenesis. Characterization of the folding kinetics of a single-site mutant of Pa cyt c551 (Phe7 to Ala) indeed reveals an additional refolding phase and a fast unfolding process, both of which are explained by the accumulation of a partially folded species. Further kinetic analysis highlights the presence of two parallel processes both leading to the native state, suggesting that this species is a non-obligatory on-pathway intermediate. The crystallographic structure of F7A shows an extended internal cavity hosting three "bound" water molecules, and an H-bond in the N-terminal helix that is shorter than in the wild type protein. These two features allow us to propose a detailed structural interpretation for the stabilization of the native and especially the intermediate states induced by a single crucial mutation. These results show how protein engineering, x-ray crystallography and state-of-the-art kinetics concur to unveil a folding intermediate and the structural determinants of its stability.

Do proteins sharing similar tertiary structures fold via the same mechanism? During the last few years, attempts to answer this question have led to a number of studies exploring the folding mechanisms of homologous proteins. Quantitative analysis of transition and intermediate state structures by means of Φ-value analysis (1) suggests that proteins with similar tertiary structure generally share the same folding pathway (2,3). Furthermore, the demonstration that appropriate mutagenesis may switch apparent two-state mechanisms to three-state mechanisms (4,5), rather than arguing for dramatic changes in the folding mechanism, adds evidence for the relative robustness of the folding pathway to sequence variations. Altogether these observations are in agreement with a folding mechanism characterized by sequential transition states and obligatory high energy intermediates (6), although alternative models have been proposed (7). Within this scenario it should be possible to energetically tune the folding pathway of a protein by selectively (de)stabilizing intermediate and transition states; it is predicted that the stabilization of high energy intermediates, by changing solvent conditions or by appropriate mutagenesis, would be associated with their transient population.
The folding mechanism of the c-type cytochromes has been extensively studied, with horse cytochrome c (cyt c) representing a widely used model system (8-13). We recently proposed a consensus folding mechanism for several cytochromes c, from prokaryotes to eukaryotes, involving an on-pathway intermediate species with essentially conserved structural features, which may be either a low or a high energy state (14). More recently, quantitative analysis of the folding kinetics of the cyt c552 from Hydrogenobacter thermophilus (Ht cyt c552) allowed the identification of an intermediate state as predicted by the proposed consensus mechanism (15).
Cytochrome c551 from the mesophilic bacterium Pseudomonas aeruginosa (Pa cyt c551) is a close structural homologue of Ht cyt c552 (root mean square deviation is 0.6 Å; sequence identity is 56%). Pa cyt c551 folds through a complex mechanism involving parallel pathways; a fraction of the molecules reaches the native state within 10 ms following a fast folding track, while the remaining portion folds via a sequential transition state mechanism in which the presence of a high energy intermediate was inferred from kinetic analysis in the ms time range (16).
In agreement with the proposed consensus mechanism, we show in this paper that the high energy intermediate state of Pa cyt c551 can be stabilized by a single critical mutation (Phe7 → Ala) and becomes kinetically detectable. Determination of the crystal structure of the F7A mutant allows us to propose a structural interpretation for the stabilization of the native state and of the consensus intermediate.

EXPERIMENTAL PROCEDURES
Protein Expression and Purification. Pa cyt c551 wt was expressed and purified as described previously (17). The mutant F7A was produced with the "GeneTailor Site-Directed Mutagenesis System" kit (Invitrogen) and then expressed and purified as for wt Pa cyt c551.
Crystallization and Structure Determination. The protein was crystallized using the sitting-drop method at 294 K; drops (2.0 μl) were prepared by mixing equal volumes of protein solution (6.5 mM) and crystallization solution consisting of 26-30% polyethylene glycol 4000, 0.2 M zinc acetate, 0.1 M sodium acetate, pH 4.5, and allowed to equilibrate against 1.0 ml of the same reservoir solution. Crystals grew in ~3 days. They were cryoprotected with a solution made up of 34% polyethylene glycol 4000, 0.2 M zinc acetate, and 0.1 M sodium acetate, pH 4.5, and flash-frozen in liquid nitrogen. A complete data set at 1 (Grenoble, France). Reflection intensities were integrated and scaled by using DENZO/SCALEPACK (18). The crystal belongs to the P65 space group with unit cell dimensions of a = 66.766, b = 66.766, c = 62.462 Å (Table 1). The structure was solved by the molecular replacement method using the program Molrep (19); the structure of cytochrome c551 from P. aeruginosa (Protein Data Bank code 351C) was used as a template. Two molecules were found in the asymmetric unit. The model was iteratively modified using COOT (20) and refined using Refmac5 (21); the last few cycles were performed using TLS (translation libration screw-motion) restrained refinement. Solvent molecules were added into the Fo − Fc density map, contoured at 3σ, and visually inspected; only the ones that showed proper hydrogen bonding to the protein and displayed a B-factor less than 60 Å² were kept in the structure. The model was refined to an R-factor of 17.8% and an Rfree value of 23.3%, and its geometrical quality was checked using the program PROCHECK (Ref. 22; see Table 1). Coordinates and structure factors have been deposited in the Protein Data Bank (Protein Data Bank code 2EXV). Structural alignments were performed with CE (23), and pocket analysis was carried out using CASTp (24).
Equilibrium Experiments. Chemical denaturation was monitored by fluorescence emission at 354 nm (excitation at 290 nm; 10 mm light path) using a Jobin-Yvon fluorometer and by circular dichroism at 222 nm using a Jasco spectropolarimeter (2 mm light path). Samples (1 μM for fluorescence and 11 μM for CD measurements) were thermostated at 10 ± 0.1 °C.
Kinetic Experiments. Single mixing kinetic experiments were carried out using a "Pi-star-180" instrument (Applied Photophysics, Leatherhead, UK). Folding and unfolding were initiated by an 11-fold dilution of the denatured or native protein in the appropriate GdnHCl solution. Fluorescence emission was collected above 320 nm using a cut-off filter (excitation was at 290 nm). Double mixing kinetic experiments were carried out with a "SX-18" stopped-flow instrument (Applied Photophysics). In these experiments refolding or unfolding was initiated by an asymmetric mixing (1-to-10) of the denatured or native protein with buffer or denaturant, respectively, followed by a symmetric mixing with an unfolding or refolding solution. In all the kinetic experiments reported the temperature was set at 10 ± 0.1 °C. The program supplied by Applied Photophysics was used to determine the nonlinear least squares fit of the fluorescence kinetic data to monophasic, plus steady-state and biphasic decay equations.
Data Analysis. Equilibrium fluorescence and far-UV CD measurements as a function of denaturant concentration were fitted according to a two-state model in which only the native and the fully denatured states are populated, using Equation 1.
where [GdnHCl]1/2 is the GdnHCl concentration at which 50% of the protein is denatured and mD-N is the m-value. The free energy of denaturation in the absence of denaturant (ΔG⁰D-N) was calculated assuming a linear dependence of the free energy of unfolding on denaturant concentration.
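The two-state analysis described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' fitting routine: it assumes the commonly used linear-extrapolation form ΔG = m × ([GdnHCl]1/2 − [GdnHCl]), which may differ in detail from Equation 1, and it assumes m is expressed in kcal mol⁻¹ M⁻¹. The function names, the baseline signals, and the midpoint used in the usage note are hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def delta_g(denaturant_molar, midpoint_molar, m_value):
    """Free energy of unfolding (kcal/mol) at a given [GdnHCl],
    assuming a linear dependence on denaturant concentration."""
    return m_value * (midpoint_molar - denaturant_molar)


def fraction_denatured(denaturant_molar, midpoint_molar, m_value,
                       temperature_k=283.15):
    """Fraction of denatured protein in a two-state N <-> D equilibrium.
    The default temperature corresponds to 10 degrees C."""
    dg = delta_g(denaturant_molar, midpoint_molar, m_value)
    k_unfold = math.exp(-dg / (R_KCAL * temperature_k))
    return k_unfold / (1.0 + k_unfold)


def observed_signal(denaturant_molar, midpoint_molar, m_value,
                    signal_native, signal_denatured, temperature_k=283.15):
    """Population-weighted signal (fluorescence or CD); flat baselines
    are an assumption made here for simplicity."""
    f_d = fraction_denatured(denaturant_molar, midpoint_molar, m_value,
                             temperature_k)
    return signal_native * (1.0 - f_d) + signal_denatured * f_d
```

With an illustrative midpoint of 2.0 M and m = 2.9, `fraction_denatured(2.0, 2.0, 2.9)` returns exactly one half, and `delta_g(0.0, 2.0, 2.9)` gives the extrapolated stability in the absence of denaturant.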

RESULTS
Structure of F7A. The three-dimensional structure of the F7A mutant of Pa cyt c551 in the oxidized form was determined by x-ray crystallography at 1.86 Å resolution (crystallographic parameters are listed in Table 1). As expected, the overall structure of the mutant is very similar to that of the wt (root mean square deviation calculated on Cα atoms between the two is 0.6 Å). Analysis reveals that in the N-terminal helix the H-bond between the carbonyl oxygen of Glu4 and the amide group of Lys8 is significantly shorter than in the wt (3.58 Å in the latter protein versus 3.13 or 3.18 Å in the former, depending on which monomer of the asymmetric unit is considered; see Fig. 1A). Interestingly in Ht cyt c552, where Ala is present at position 5, topologically equivalent to position 7 in Pa cyt c551, the same segment of the N-terminal helix is correctly stabilized by the expected H-bond network (e.g. the length of the H-bond between Leu2-CO and Lys6-NH, equivalent to Glu4-Lys8, is 2.96 Å; see Ref. 15). Moreover the three-dimensional structure of F7A reveals that the mutation produces near residue 7 a cavity of ~55 Å³ (the corresponding one in the wt being 24 Å³). This cavity, which is accessible to solvent in the F7A mutant (Fig. 1B) but not in the wt protein, displays a more polar character, 30% of the lining residues being polar. In this cavity three water molecules are clearly detectable in the electron density map in both monomers of the asymmetric unit; they are H-bonded to each other and to the protein (His16-CO, Tyr27-OH, Glu4-Oε2, Val13-NH) and display low B-factors (average B-factor is 29 Å²), indicating that they are well ordered in the structure.
Equilibrium Unfolding. The GdnHCl-induced unfolding transition of F7A was characterized with the ferric derivative at pH 3.0 and 4.7, following the change in the intrinsic Trp fluorescence. We chose to study this protein at pH 4.7 to directly compare the results with those obtained on Ht cyt c552, the thermophilic counterpart of Pa cyt c551. In all conditions the equilibrium transition corresponds to a single reversible process with no evidence of intermediate states (Fig. 2). At pH 4.7 the single point mutation stabilizes the protein by about 2.2 kcal/mol (ΔGD-N = 7.7 ± 0.2 kcal/mol; m = 2.9 ± 0.1 M⁻¹) with respect to the wt (ΔGD-N = 5.5 ± 0.4 kcal/mol; m = 2.9 ± 0.2 M⁻¹). This significant gain in stability is more than halfway toward that of Ht cyt c552 (ΔGD-N = 9.5 ± 1.5 kcal/mol, measured at the same pH and solvent conditions).
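The arithmetic behind the stability comparison can be checked with a short sketch. The ΔG and m values are those quoted above for pH 4.7; the function name is hypothetical, and the implied transition midpoints are derived here from the linear-extrapolation assumption (ΔG in water = m × midpoint) rather than taken from the text.

```python
def stability_summary(dg_wt, dg_mutant, dg_reference, m_value):
    """Compare mutant and wild-type stabilities (kcal/mol).

    dg_reference is the stability of the more stable homologue used as
    a yardstick; midpoints assume a linear extrapolation model.
    """
    ddg = dg_mutant - dg_wt
    gap = dg_reference - dg_wt
    return {
        "ddG": ddg,                            # gain in stability
        "fraction_of_gap": ddg / gap,          # progress toward the homologue
        "midpoint_wt": dg_wt / m_value,        # implied [GdnHCl]1/2, wt
        "midpoint_mutant": dg_mutant / m_value,  # implied [GdnHCl]1/2, mutant
    }


# Values quoted in the text: wt, F7A, Ht cyt c552, and the shared m-value.
summary = stability_summary(5.5, 7.7, 9.5, 2.9)
```

Under these assumptions the mutation accounts for 2.2 of the 4.0 kcal/mol separating the wt from Ht cyt c552, i.e. 55% of the gap, and the implied midpoints shift from roughly 1.9 M to roughly 2.7 M GdnHCl.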
Folding Kinetics. Contrary to the wt protein, the refolding time course of the F7A mutant, measured at pH 4.7 and low GdnHCl concentration (≤1.75 M), is clearly biphasic and is satisfactorily fitted to a double exponential decay, although a third and much slower phase (k is approximately 0.03 s⁻¹) with a small amplitude (<10%) was also observed. The latter has not been included in the analysis because interrupted unfolding experiments (data not shown) demonstrated that this slow phase reflects a minor fraction of molecules refolding along a pathway rate-limited by prolyl-peptide bond isomerization processes (25). A semi-logarithmic plot of the folding and unfolding rate constants versus denaturant concentration (chevron plot) is reported in Fig. 3; interestingly, while the logarithm of the faster refolding rate constant linearly decreases with increasing [GdnHCl], the slower one displays a clear curvature (roll-over effect) at low denaturant concentrations (between 0.5 and 2.0 M GdnHCl). Both features, which are generally interpreted in terms of multistate folding (26,27), were not observed in the refolding of the wt protein from the GdnHCl denatured state (28). To exclude the possibility that the refolding roll-over is caused by formation of transient aggregates (28,29), we followed the protein concentration dependence (between 0.1 and 50 μM after mixing) of the refolding time course at 1 M GdnHCl, without any detectable effects (data not shown).
Table 1 (excerpt). Ramachandran statistics: residues in most favored regions, 93.2%; in allowed regions, 6.8%; in not allowed regions, 0%.
To demonstrate that for the F7A mutant an intermediate species is accumulated under refolding conditions, we carried out interrupted refolding experiments (30,31). In these experiments the completely denatured protein was allowed to renature by dilution with a refolding buffer (first mix); then, after a controlled delay time, the protein was challenged with a high denaturant concentration in a second mix. Using this protocol the native protein is expected to unfold with a rate slower than any partially folded intermediate, being separated from the denatured state by the highest energy barrier. As can be seen from Fig. 4, while interrupted refolding experiments carried out at pH 4.7 with the wt protein do not reveal any additional process over and above the unfolding of N, in the case of the F7A mutant a fast unfolding species becomes evident using short delay times. Fitting the unfolding time course observed at 4.15 M GdnHCl to a double exponential decay yields a rate constant for the slow phase (0.20 s⁻¹), which is virtually identical to that measured under the same conditions in single mix unfolding experiments (Fig. 3) and represents the unfolding of the native protein formed during the delay time. The additional fast phase (k = 34 s⁻¹) observed only for the mutant indicates the existence of a population of partially structured molecules not observed in the case of the wt protein. At longer delay times the amplitude of the fast unfolding phase decreases progressively, and at delay times >5 s a single exponential process is observed, as expected if only the native state were present.
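The separation of time scales that makes the intermediate detectable in the second mix can be illustrated with a minimal sketch of a double exponential unfolding trace. The two rate constants are those quoted above for 4.15 M GdnHCl; the amplitudes passed in by the caller are hypothetical, since they depend on the delay time, and the function names are illustrative only.

```python
import math


def half_life(rate_constant):
    """Half-life (s) of a first-order process with the given rate (s^-1)."""
    return math.log(2.0) / rate_constant


def unfolding_trace(t, amp_fast, amp_slow, k_fast=34.0, k_slow=0.20):
    """Double exponential unfolding signal at time t (s).

    amp_fast tracks the partially folded species and amp_slow the native
    protein present at the end of the delay time.
    """
    return amp_fast * math.exp(-k_fast * t) + amp_slow * math.exp(-k_slow * t)
```

The fast phase has a half-life of about 20 ms and the slow phase of about 3.5 s, so after 0.2 s essentially only the unfolding of the native protein remains in the trace.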
The observation that a detectable amount of native state can be formed in the short time between the first and the second mixing (100 ms) suggested the presence of parallel refolding routes. To determine whether the native state is formed only in a sequential reaction, involving an obligatory on-pathway intermediate (U-I-N), or whether other mechanisms are operative (32), we plotted the amplitudes obtained in the same interrupted refolding experiment as a function of the delay time. For the F7A mutant the amplitude dependence of the slow unfolding reaction on the delay time at pH 4.7 is well described by a double exponential process (Fig. 5), with no evidence of native protein formation at the shortest delay time (10 ms). The two phases have rate constants and relative amplitudes close to those measured in classical single mix dilution experiments, at the same pH and GdnHCl concentration (i.e. at 1 M GdnHCl: k1 = 27 s⁻¹ versus 36 s⁻¹, a1 = 85% versus 87%; k2 = 1.40 s⁻¹ versus 1.45 s⁻¹, a2 = 15% versus 11%; see also Fig. 3). This result clearly indicates that the two phases detected in single mix stopped-flow experiments both represent processes leading to the native state and implies a triangular folding mechanism characterized by parallel pathways, as depicted in Scheme 1. When the interrupted refolding experiments were carried out at pH 3.0 no fast unfolding species was detected (data not shown); moreover, a plot of the amplitudes against the delay time can be fitted to a single exponential (Fig. 5), yielding a rate consistent with that of the main phase observed in single mix dilution experiments under the same conditions (approximately 3 s⁻¹). Both these observations suggest that at the lower pH the alternative refolding route is probably abolished and the intermediate state is no longer populated (see below for further discussion).
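Why a triangular mechanism yields biphasic formation of the native state can be seen from a minimal kinetic sketch. It assumes three irreversible steps (U → N, U → I, I → N), which is a simplification of Scheme 1 appropriate only under strongly native conditions; the microscopic rate constants in the usage note are hypothetical values chosen to reproduce the observed rates (27 and 1.40 s⁻¹) and slow-phase amplitude (15%), not fitted parameters from this work.

```python
import math


def triangular_populations(t, k_un, k_ui, k_in):
    """Fractions of U, I and N at time t (s), starting from pure U.

    Assumes irreversible steps U->N (k_un), U->I (k_ui), I->N (k_in)
    and k_un + k_ui != k_in.
    """
    k_sum = k_un + k_ui  # total rate of depletion of U
    u = math.exp(-k_sum * t)
    i = k_ui / (k_in - k_sum) * (math.exp(-k_sum * t) - math.exp(-k_in * t))
    n = 1.0 - u - i
    return u, i, n


def native_amplitudes(k_un, k_ui, k_in):
    """Fractional amplitudes (fast, slow) of native-state formation.

    N(t) = 1 - fast * exp(-(k_un + k_ui) t) - slow * exp(-k_in t)
    """
    slow = k_ui / (k_un + k_ui - k_in)
    return 1.0 - slow, slow
```

For example, `native_amplitudes(23.16, 3.84, 1.40)` gives a fast phase of 85% decaying at 27 s⁻¹ and a slow phase of 15% decaying at 1.40 s⁻¹. In a strictly sequential U-I-N scheme (k_un = 0) the same expressions describe a lag in native-state formation instead of two phases with positive amplitudes, provided the first step is the faster one.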

DISCUSSION
A consensus folding mechanism involving the presence of a conserved intermediate state, populated in the ms time window, has been proposed to be common to all c-type cytochromes (14). A subsequent kinetic study on Ht cyt c552 provided further evidence for the existence of such an intermediate (15). The underlying hypothesis implies that the higher stability and the different distribution of the helical propensity of Ht cyt c552, compared with Pa cyt c551, stabilize the folding intermediate sufficiently for it to be populated, detected, and to some extent kinetically characterized. Previous work on Pa cyt c551 showed that a selected set of five mutations can confer on this mesophilic cytochrome the stability characteristic of its thermophilic homologue Ht cyt c552 (33). To verify our hypothesis, we investigated in detail the kinetic folding mechanism of the Pa cyt c551 mutant F7A, which was selected among other multiple mutants as a promising combination of: a minor structural perturbation, an approximate 4-fold increase in the helical propensity of the N-terminal helix (calculated by AGADIR (34)), and increased thermodynamic stability (35). Moreover this substitution was particularly intriguing given that Phe7 is in a central position of the N-terminal helix, whose stability and docking to the C-terminal helix are considered key events in the folding process of the c-type cytochromes (12,36).
Analysis of the F7A mutant folding kinetics at pH 4.7 shows that a double exponential decay is required to satisfactorily describe the refolding time courses at low [GdnHCl], at variance with the wild type protein, which is always described by a single exponential in the ms time range. Since the most probable interpretation is the accumulation of an intermediate state during refolding of F7A, we carried out double-mixing interrupted refolding experiments, which showed a fast unfolding phase, absent in the wt protein (Fig. 4). Thus we concluded that at pH 4.7 a partially folded species is populated during the short delay time between the two mixings. In addition, the interrupted refolding experiments indicate that the two refolding phases observed in single mixing experiments at pH 4.7 (Fig. 3) represent the formation of native protein (Fig. 5). This suggests the existence of parallel pathways. However, at pH 3.0 there was no evidence for this intermediate, and no fast unfolding species was observed, indicating that the native state formation occurs by a single process (Fig. 5). We believe that at pH 3.0, where the mutant F7A is considerably less stable (ΔGD-N = 4.9 kcal/mol) than at pH 4.7, the intermediate is likely to be high energy, probably because of the protonation of Glu70. This residue, indeed, forms a salt bridge with Lys10, an interaction shown to be important for the docking of the N- and C-terminal helices (28).
The data shown in Fig. 3 were fitted with a three-state model, involving the presence of an intermediate state. Discrimination between the on- and the off-pathway model is unfortunately very difficult because of the limited number of points to describe the refolding roll-over and unfolding of the intermediate. The presence of a fast unfolding process may suggest a sequential mechanism involving an on-pathway intermediate (U-I-N). However, the biphasic character of the native state formation, as seen in the interrupted refolding experiments (Fig. 5), suggests that it is more plausible to reconcile the data with a triangular mechanism in which the native state is formed via two parallel reactions, i.e. a faster one directly from the denatured state and a slower one involving the transient population of an intermediate en route to the native state. Although an off-pathway mechanism cannot be ruled out, a triangular scheme (Scheme 1) seems more plausible, and it is consistent with evidence for parallel pathways in the wt protein (37) and the results obtained for Ht cyt c552 (which refolds through an on-pathway intermediate; Ref. 15).
The crystallographic structure of ferric Pa cyt c551 (38) shows that the N-terminal helix, extending from Pro3 to Asn9, deviates from the canonical hydrogen bonding pattern of α-helices, showing a bending at Val5 and an unusually long H-bond between Glu4-CO and Lys8-NH (see Fig. 1A). On the contrary, analysis of the Ht cyt c552 crystallographic structure (15), where an Ala is present at the topological position corresponding to Phe7 in Pa cyt c551, shows that the distance between the carbonyl oxygen of Glu2 and the amide group of Lys6 is shorter, implying the formation of a stronger H-bond. We concluded that the distortion of the N-terminal helix in Pa cyt c551 is probably due to the unfavorable fit of the bulky side chain of Phe7 within the protein core. Thus, it is likely that replacement of the aromatic side chain with Ala removes steric constraints, allowing the main chain atoms to properly fit the complete H-bond network of the N-terminal helix, resulting in the stabilization of the protein. Before commenting on the structural data of the single F7A mutant illustrated above, we recall that Hasegawa et al. (33,35) proposed, on the basis of the solution NMR structure, that the increased thermodynamic stability of the double mutant F7A/V13M with respect to the wt should mainly be ascribed to a tighter packing of the new side chains in the protein core: the packing defect due to the replacement of Phe7 with Ala would be efficiently filled by the side chain of Met13, resulting in a better compaction of the protein interior and, therefore, an increased stability. The F7A mutant x-ray crystal structure clearly shows that removal of the bulky Phe ring produces a cavity, which is twice as big as that in the wt and is considerably more polar. While the formation of a larger cavity in the mutant was predicted, it was surprising that such a notable packing defect exists in an overstabilized mutant like F7A (39,40).
An explanation for this apparent contradiction may reside in the three well ordered water molecules present in the cavity (average B-factor is 29 Å²). Indeed modeling and experimental studies indicate that polar cavities almost invariably host water molecules, which usually have a stabilizing effect (41-43). It has been reported that intracavity water molecules compensate to a variable extent for the decrease in stability due to packing defects: on average this is estimated to be more than 50% (44), although in some cases it has been postulated that they could even fulfill a real structural role, resulting in a stabilization (45-47). This may indeed be the case for the F7A mutant of Pa cyt c551. The three buried water molecules (Fig. 1B) interact with each other and with the residues enclosing the cavity, forming a "bridging" H-bond network whose importance for the global energetic balance of the protein is likely to be relevant. We believe that the destabilization due to the considerable increase in the size of the cavity is counterbalanced and possibly overcompensated by the stabilizing effect of the water molecules, resulting in a net positive contribution and thereby an increased stability of F7A.
Thus, we think it is plausible to propose that the observed stabilization of the intermediate and the native state in the mutant F7A is the result of different contributions, as seen from the structure. The replacement of the Phe ring with the methyl group of Ala leads to an ~4-fold enhancement of the helical propensity of the N-terminal helix, and more importantly, it allows the strengthening of a main chain H-bond in the same helix. As a result, the N-terminal helix is more stable, increasing the probability for the intermediate state to be populated. Moreover, the packing defect resulting from the mutation is filled by an array of three water molecules representing the "core" of an H-bond network, which could contribute, together with the stronger Glu4-Lys8 H-bond, to the stabilization of the native state.
In conclusion, we present kinetic and structural evidence that even a single rational mutation can remarkably change the shape of the energetic landscape characteristic of a protein folding process, resulting not only in striking stabilization of the native state but also in the unveiling of the consensus folding intermediate, otherwise energetically inaccessible.