A conserved "hydrophobic staple motif" plays a crucial role in the refolding of human glutathione transferase P1-1.

The specific (i, i+5) hydrophobic staple interaction involving a helix residue and a second residue located in the turn preceding the helix is a recurrent motif at the N terminus of alpha-helices. This motif is strictly conserved in the core of all soluble glutathione transferases (GSTs) as well as in other protein structures. Human GSTP1-1 variants mutated in amino acid Ile(149) and Tyr(154) of the hydrophobic staple motif of the alpha6-helix were analyzed. In particular, a double mutant cycle analysis has been performed to evaluate the role of the hydrophobic staple motif in the refolding process. The results show that this local interaction, by restricting the number of conformations of the alpha6-helix relative to the alpha1-helix, favors the formation of essential interdomain interactions and thereby accelerates the folding process. Thus, for the first time it is shown that the hydrophobic staple interaction has a role in the folding process of an intact protein. In P(i) class GSTs, Tyr(154) appears to be of particular structural importance, since it interacts with conserved residues Leu(21), Asp(24), and Gln(25) of the adjacent alpha1-helix which contributes to the active site. Human GSTP1-1 variants L21A and Y154F have also been analyzed in order to distinguish the role of interdomain interactions from that of the hydrophobic staple. The experimental results reported here suggest that the strict conservation of the hydrophobic staple motif reflects an evolutionary pressure for proteins to fold rapidly.

It is known that the folded conformation adopted by a protein is governed by its amino acid sequence, suggesting an underlying stereochemical code for protein folding. Current protein folding models attribute a critical role to local interactions that reduce the conformational freedom at certain points of the polypeptide chain and thus define the folding pathways. Consequently, significant attention has recently been directed to identifying local motifs responsible for the formation of secondary structure elements as well as of supersecondary regions comprising combinations of such elements. At the N terminus of helices two local motifs have been described. The first one, named the N-capping box (1,2), involves the reciprocal main chain-side chain hydrogen bonds between the N-cap (Ser/Thr) and the N-3 (Glu/Asp) residues of a helix. (We follow the nomenclature proposed by Richardson and Richardson (1): N″-N′-Ncap-N1-N2-N3-N4, where N1-N4 belong to the helix and Ncap is the boundary residue.) The second one, termed the hydrophobic staple motif (3-5), involves a specific hydrophobic interaction between residues located at the N′ and N-4 positions. A global survey of helix capping in proteins of known structure has recently been reported (6). When present in combination with a capping box (expanded box (3)), the hydrophobic staple motif gives a cooperative effect in defining the α-helix starting point. A statistical analysis (4) has also shown its presence in the absence of a capping box, suggesting that it may represent an independent motif. Analysis of protein structures in databases as well as studies involving short peptides have shown that residues at positions N′ and N-cap adopt dihedral angles characteristic of the β-region of Ramachandran plots (4). It has therefore been hypothesized that the hydrophobic staple motif may serve to determine the direction of a polypeptide chain as it forms the α-helix (4,5).
However, the function of the hydrophobic staple motif still remains to be clarified. Notably, its suggested contribution to folding and stability has never been studied experimentally in a protein, despite its frequent occurrence at the N terminus of helices.
Glutathione transferases (GSTs) are a superfamily of detoxication enzymes (7) that catalyze the nucleophilic attack of glutathione (GSH) on the electrophilic groups of a wide range of hydrophobic toxic compounds (8). GSTs are dimeric proteins in which each 25-kDa subunit is characterized by two distinct domains and an active site that acts independently of the other subunit. The smaller N-terminal domain (domain I) adopts an α/β topology and contributes most of the contacts to GSH. The C-terminal domain (domain II) is all α-helical and provides most of the amino acid residues of the hydrophobic substrate-binding site, which lies adjacent to the GSH-binding site (9-13).
We have previously shown that the amino acid residues forming an N-capping box and a hydrophobic staple motif at the N terminus of the α6-helix in domain II are strictly conserved in all known sequences of soluble GSTs (more than 50) (14). While the N-capping box, being involved in the α6-helix formation, is essential to the folding of GSTs (15), the possible role of the hydrophobic staple motif in the refolding of GST is still unknown. It should be noted that it is located in the hydrophobic core of the GSTs and that, in a very similar structural environment, it is also conserved in other protein superfamilies (16). Thus, GSTs serve as good model proteins for investigations of the specific role played by this supersecondary interaction in protein folding.
Site-directed mutagenesis was employed to change the amino acid residues forming the hydrophobic staple motif in the human enzyme GSTP1-1. The results indicate that the interaction between the side chains of the loop residue N′ and the α6-helix residue N-4 favors the appropriate conformation of the α6-helix with respect to the preceding structural element, thereby enhancing the rate of protein folding and defining the folding pathway.
* This work was supported by grants of the Ministero dell'Università e della Ricerca Scientifica e Tecnologica (60%), the Carl Trygger Foundation, and the Swedish Natural Science Research Council. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
¶ To whom correspondence should be addressed: Dipartimento di Scienze Biomediche, Università degli Studi "G. D'Annunzio," Via dei Vestini, 31, I-66100 Chieti, Italy. Tel.: 39-0871-3555272; Fax: 39-0871-3555356; E-mail: a.aceto@dsb.unich.it.
Construction of GSTP1-1 Mutants-Site-directed mutagenesis was employed to generate the hydrophobic staple mutants. The following oligonucleotides were used for the mutations: I149A, 5′-GACCAGGCTTCCTTCGCTGACTACAACCTGCTG-3′; I149G, 5′-GACCAGGGCTCCTTCGCTGACTACAACCTGCTG-3′; Y154A, 5′-GACCAGATCTCCTTCGCTGACGCTAACCTGCTG-3′; Y154F, 5′-GACCAGATCTCCTTCGCTGACTTCAACCTGCTG-3′; I149A/Y154A, 5′-GACCAGGCTTCCTTCGCTGACGCTAACCTGCTG-3′; I149G/Y154A, 5′-GACCAGGGCTCCTTCGCTGACGCTAACCTGCTG-3′; I149G/Y154F, 5′-GACCAGGGCTCCTTCGCTGACTTCAACCTGCTG-3′. The reverse primer had the sequence 5′-TCCCACAATGAAGGTCTTGCCTCCCTGGTTCTG-3′. For the L21A mutation the mutagenic primer was 5′-CGTATGGCGCTGGCTGACCAGGGTCAG-3′ and the reverse primer was 5′-CAGTGCAGCGCAACGACCACGAACTGG-3′. Altered nucleotides are underlined. Oligonucleotides were phosphorylated and then used in pairs, one mutagenic primer with the reverse primer, in an inverted polymerase chain reaction. The expression clone pKHP1 (17) was used as a template. The polymerase chain reaction mixture contained 0.8 μM of each primer, 0.2 mM dNTPs, 2.5 units of Pfu DNA polymerase (Stratagene, La Jolla, CA), the buffer supplied with the enzyme, and various amounts of DNA template. The temperature program started at 94°C for 10 min and was followed by 25 cycles of 94°C for 1 min, 70°C for 1 min, and 72°C for 9 min. The program terminated with a reaction at 72°C for 30 min. After electrophoresis the polymerase chain reaction product was recovered from the agarose gel. The DNA was ligated and used to transform E. coli XL-1 Blue cells. The cDNA encoding each isolated GSTP1-1 mutant was sequenced in its entirety to verify that no undesired mutations had been introduced in the polymerase chain reaction.
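Because the underlining that marked the altered nucleotides does not survive in plain text, the substitutions encoded by each mutagenic primer can be recovered by translating the primers and comparing them with a reference. The following is a minimal sketch, not part of the original methods. It assumes that the primers are in frame from their first nucleotide, that the first codon corresponds to residue 147, and that the wild-type sequence over this stretch is the one obtained by combining the unmutated codons of the single-mutant primers (the name `ASSUMED_WILD_TYPE` and the helper functions are illustrative).

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Mutagenic primers as listed in the text (5' to 3').
PRIMERS = {
    "I149A": "GACCAGGCTTCCTTCGCTGACTACAACCTGCTG",
    "I149G": "GACCAGGGCTCCTTCGCTGACTACAACCTGCTG",
    "Y154A": "GACCAGATCTCCTTCGCTGACGCTAACCTGCTG",
    "Y154F": "GACCAGATCTCCTTCGCTGACTTCAACCTGCTG",
    "I149A/Y154A": "GACCAGGCTTCCTTCGCTGACGCTAACCTGCTG",
    "I149G/Y154A": "GACCAGGGCTCCTTCGCTGACGCTAACCTGCTG",
    "I149G/Y154F": "GACCAGGGCTCCTTCGCTGACTTCAACCTGCTG",
}

# ASSUMPTION: wild-type stretch inferred from the primers themselves
# (codon 149 taken from the Y154A primer, codon 154 from the I149A primer).
ASSUMED_WILD_TYPE = "GACCAGATCTCCTTCGCTGACTACAACCTGCTG"
# ASSUMPTION: the first codon of each primer encodes residue 147.
FIRST_RESIDUE = 147


def translate(dna: str) -> str:
    """Translate an in-frame DNA string into one-letter amino acid codes."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def substitutions(primer: str) -> list[str]:
    """List the amino acid changes a primer encodes relative to the reference."""
    wild_type = translate(ASSUMED_WILD_TYPE)
    mutant = translate(primer)
    return [
        f"{wt}{FIRST_RESIDUE + i}{mt}"
        for i, (wt, mt) in enumerate(zip(wild_type, mutant))
        if wt != mt
    ]


for name, primer in PRIMERS.items():
    print(name, translate(primer), substitutions(primer))
```

Under these assumptions each primer translates to a variant of the peptide DQISFADYNLL, and the recovered substitutions match the mutant names given in the text.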
Protein Expression and Stability in the Host Cells as a Function of the Growth Medium Temperature-Cultures of E. coli XL-1 Blue containing plasmids were grown in 500 ml of LB broth in a 2-liter Erlenmeyer flask at 37°C. At an A555 of 0.35, isopropyl-1-thio-β-D-galactopyranoside was added to a final concentration of 0.2 mM. From the time of addition the culture was grown for about 20 h at one of two incubation temperatures, 25 or 37°C. The subsequent purification of wild-type and mutant proteins, expressed at different growth temperatures, was performed as described previously (18), with the only modification that the enzymes were purified on a GSH-Sepharose affinity column. The intracellular protein yields of GST in bacteria were expressed as the percentage of GST purified in relation to total cytosolic protein. The GST activity in the cytosol and the level of protein expression were also determined as a function of temperature.
Kinetic and Structural Studies on Wild-type and Hydrophobic Staple Motif Mutants Expressed at 25°C-The kinetic parameters, k_cat and K_m, were determined at 25°C as described previously (19). The saturating concentrations for GSH and CDNB (1-chloro-2,4-dinitrobenzene) were 2 and 1 mM, respectively. Spectroscopic properties of the mutants and the wild-type enzyme were also studied. A Jasco-600 spectropolarimeter was used for CD measurements in the far-ultraviolet region from 200 to 250 nm. Spectra were recorded at 0.3 mg/ml protein concentration with cuvettes of 0.1-cm path length in a thermostatic cell holder. Intrinsic fluorescence emission spectra were measured with a Spex (model Fluoromax) spectrofluorometer. The excitation wavelength was 280 nm, and the λmax and the fluorescence intensity of the emission spectra were analyzed at a protein concentration of 0.1 mg/ml.
Heat Inactivation Assays-Enzyme was incubated at each temperature for 10 min at a protein concentration of 0.05 mg/ml in 0.1 M potassium phosphate (pH 7.0) containing 1 mM EDTA and 5 mM dithiothreitol to prevent oxidative inactivation. The enzyme was heat inactivated in sealed Eppendorf tubes and the temperature was monitored with a Cryson telethermometer. The inactivation time courses were determined by withdrawing suitable aliquots at different time points from the denaturation mixture for assay of remaining activity. The activity was assayed in 0.1 M potassium phosphate (pH 6.5) with 2 mM GSH and 1 mM CDNB at 25°C. In no case could the thermal inactivation be reversed by lowering the temperature of incubation.
Temperature Dependence of Refolding in Vitro for Wild-type and Hydrophobic Staple Mutants-When the refolding of human GSTP1-1 and its mutants was to be monitored, 10 μM enzyme was first denatured in 4 M guanidinium chloride (0.2 M phosphate, 1 mM EDTA, 5 mM dithiothreitol, pH 7.0) at 25, 32, or 40°C for 30 min and then diluted (defining time 0) 1:40 into renaturation buffer (0.2 M phosphate, 1 mM EDTA, 5 mM dithiothreitol, pH 7.0) at the same temperature. The final guanidinium chloride concentration was 0.1 M during refolding. All refolding experiments were carried out by rapid addition of the denatured enzyme to the renaturation buffer. Recovered activity was monitored as a function of time by withdrawal of appropriate aliquots of the renaturation mixture and immediate assay for activity at 25°C. Refolding rate constants were determined by nonlinear regression analysis of the experimental data by using the KaleidaGraph 3.0.5 program (Abelbeck Software). The refolding rates of all variants were independent of the protein concentration in the range 5 to 50 μM. At even higher enzyme concentrations all variants were characterized by a reduced refolding yield. The values reported in this paper represent the mean of at least three different experimental data sets. Under our conditions, an equation describing a single exponential process can be fitted to the data,

$$F(t) = A\,\bigl(1 - e^{-k_f t}\bigr) + B$$

where F(t) is the activity at time t, A the amplitude, k_f the rate constant, and B the reactivation value at time 0. The effect of mutation on the energy of the transition state of folding can be calculated using transition-state theory in a manner similar to that reported by Jackson et al. (20). The stability of the transition state of a mutant protein relative to that of wild-type is calculated from

$$\Delta\Delta G_f = -RT \ln\!\left(\frac{k_f}{k_f'}\right)$$

where ΔΔG_f is the difference in energy of the transition state of folding relative to the unfolded state between wild-type and mutant proteins; k_f and k_f′ are the respective rate constants of folding.
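The two relations used in this analysis can be illustrated numerically. The sketch below is not the authors' fitting procedure (which used nonlinear regression in KaleidaGraph); it only evaluates the model. It assumes the single-exponential form F(t) = A(1 − exp(−k_f t)) + B and the sign convention ΔΔG_f = −RT ln(k_f/k_f′), chosen so that a slower-folding mutant gives a negative value, in line with the negative values reported in this paper. The function names and the example rate constants are hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def activity(t: float, amplitude: float, k_f: float, baseline: float) -> float:
    """Single-exponential reactivation: F(t) = A * (1 - exp(-k_f * t)) + B."""
    return amplitude * (1.0 - math.exp(-k_f * t)) + baseline


def ddg_f(k_wild_type: float, k_mutant: float, temp_celsius: float = 25.0) -> float:
    """Transition-state destabilization in kcal/mol.

    ASSUMED sign convention: -RT ln(k_f / k_f'), so a mutant that refolds
    more slowly than wild-type yields a negative value.
    """
    rt = R_KCAL * (temp_celsius + 273.15)
    return -rt * math.log(k_wild_type / k_mutant)


# Hypothetical rate constants: only their ratio matters. A 6-fold slower
# mutant corresponds to roughly -1.06 kcal/mol at 25 degrees C.
print(f"{ddg_f(k_wild_type=6.0, k_mutant=1.0):.2f} kcal/mol")
```

Only the ratio of the two rate constants enters the calculation, so the units of k_f cancel as long as both constants are measured under the same conditions.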
Double mutant cycles have been used to calculate ΔΔG_int, the interaction energy between two residues in a protein, under conditions in which they can also make additional interactions. ΔΔG_int is obtained by subtraction of the ΔΔG_f values of the corresponding single mutants from the ΔΔG_f of the double mutant, calculated as above. The general theory of double mutant cycle analysis has been discussed extensively elsewhere (21). It is assumed that ΔΔG_int is 0 in the unfolded state, and this is taken as the reference state. The reactivation behavior of all GSTP1-1 variants can be described by a single exponential equation. Thus, we have assumed: (i) that the enzyme reactivation follows, kinetically, a two-state model without intermediates, so that the refolding rate, measured under native-like conditions, can be used to characterize the transition state for folding; and (ii) that none of the mutations has any significant effect on the folding pathway, at least not when analyzed at subphysiological temperatures.
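The subtraction that defines the interaction energy can be sketched as follows. This is an illustration only: the values −0.17 and −0.70 kcal/mol are those quoted later in the text for the I149A substitution in the wild-type and Y154A backgrounds, whereas the Y154A value used here is a hypothetical placeholder, included to show that the result does not depend on it. The function name is an assumption.

```python
def interaction_energy(ddg_double: float, ddg_single_1: float, ddg_single_2: float) -> float:
    """Double mutant cycle: ddG_int = ddG_f(double) - ddG_f(single 1) - ddG_f(single 2)."""
    return ddg_double - ddg_single_1 - ddg_single_2


# Values quoted in the text (kcal/mol).
ddg_i149a_in_wild_type = -0.17  # wild-type -> I149A
ddg_i149a_in_y154a = -0.70      # Y154A -> I149A/Y154A

# HYPOTHETICAL placeholder for wild-type -> Y154A; it cancels out below.
ddg_y154a = -0.36

# The double mutant relative to wild-type is the sum along one path of the cycle.
ddg_double = ddg_y154a + ddg_i149a_in_y154a

ddg_int = interaction_energy(ddg_double, ddg_i149a_in_wild_type, ddg_y154a)
print(f"{ddg_int:.2f} kcal/mol")
```

Because the Y154A term appears once with each sign, the interaction energy reduces to the difference between the effect of I149A in the two backgrounds, which is the form used in the Discussion.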
Molecular Graphics Analysis-Coordinates of GST x-ray structures were derived from the Brookhaven Protein Data Bank via the Anonymous file-transfer protocol. The crystal structures were analyzed by using Hyperchem (22) (Autodesk, Sausalito, CA) and MolView 1.4.6 (Purdue University) programs. Figures were generated by using the RasMol (version 2.6) program.

Identification of a Conserved Hydrophobic Staple Motif and Molecular Graphics Analysis-At the N terminus of the α6-helix in domain II, the sequence (Ser/Thr)XXAsp is conserved in the alpha, mu, Pi, theta, and sigma classes of GSTs. We have recently demonstrated that this motif corresponds to a "capping box" signal (1, 2, 6) which exerts a strong influence on the α6-helix nucleation (14) as well as on GST folding (15). The sequence analysis has also revealed the presence of a second structural motif conserved at the N terminus of the α6-helix in GSTs of the same classes (14). This motif seems to correspond to a specific (i, i+5) hydrophobic interaction, named the hydrophobic staple motif (3-5), between the residue N′ located before the N-cap and the residue located at position N-4 within the helix (residues Ile149 and Tyr154 in GSTP1-1, respectively). It has been suggested (4) that this motif, flanking the capping box, enhances the demarcation of the N-terminal limit and serves to cause a reversal of the polypeptide chain direction as it forms the α-helix. Three-dimensional structure analysis of GSTs shows that the side chains of the hydrophobic residues at the N′ and N-4 positions are located within a distance of 4 Å from each other in all known GST structures and that the N′ and N-cap residues always adopt dihedral angles in the β-region of the Ramachandran plot. Thus, this supersecondary interaction possesses all structural characteristics of a hydrophobic staple motif and, despite the low overall sequence identity (in some cases less than 5%), is well conserved among GSTs and GST-related proteins (14).
As shown in Fig. 1, residue Tyr154, in addition to the hydrophobic staple interaction with Ile149, forms very favorable contacts with amino acid residues Leu21, Asp24, and Gln25, all belonging to an important structural element (the α1-helix) which contributes to the active site. Moreover, the hydrogen bond and the hydrophobic interaction made by Tyr154 with Asp24 and Leu21, respectively, are conserved in class Pi GSTs. It is interesting to note that these polar and apolar interdomain interactions seem to be specific for Pi class GSTs, since they are maintained in the bovine, murine, porcine, and rat GSTP1-1 but not in GST structures from other classes (not shown).
Design, Expression, and Purification of Hydrophobic Staple Mutants of Human GSTP1-1-To determine the role of the hydrophobic staple motif in the folding process, site-directed mutagenesis was used to produce the two single mutations I149A and Y154A as well as the double mutation I149A/Y154A. However, considering that an alanine residue can potentially still form a weak hydrophobic staple during the folding process, whereas a glycine cannot, the I149G variant and the corresponding I149G/Y154F and I149G/Y154A double mutants were also generated. Furthermore, in order to distinguish the contribution of the hydrophobic staple from the other interactions of Tyr154 with the α1-helix, the variants Y154F and L21A were also analyzed. These two mutants make it possible to separate the contribution of the hydrogen bonds (Y154F mutant) from that of the hydrophobic interaction (L21A variant) made by Tyr154 with domain I. The mutants and the wild-type GSTP1-1 were expressed in E. coli and purified in a single chromatographic step using a GSH affinity matrix. No more than 2-3% of the total enzymatic activity remained unbound for any of the different GST variants. Apparently, the mutations do not affect the affinity for GSH-Sepharose. The purified proteins were homogeneous as judged by SDS-polyacrylamide gel electrophoresis (results not shown).
The possibility that the mutations could have resulted in temperature-sensitive species was investigated by expressing the proteins at different growth temperatures of the host cells. As shown in Table I, the intracellular yields of wild-type GSTP1-1 and mutants Y154A and I149A, expressed in bacteria grown at 25°C, were essentially the same. With increasing temperature, the yields of these mutants slightly decreased, unlike the yields of the wild-type, L21A, Y154F, and I149G variants, which remained unchanged. A much lower intracellular yield, strongly dependent on the temperature of host cell growth, was found for the I149A/Y154A, I149G/Y154F, and I149G/Y154A double mutants. At 37°C the protein expression levels of all double mutants were more than 15 times lower than those of the wild-type and single-point mutants. The probable explanation for these results is that the double mutants fail to attain the native conformation at physiological temperatures. The activity in the crude bacterial lysate of all double mutants (not shown) and the specific activity of the purified I149A/Y154A and I149G/Y154A variants (Table I) were markedly lower when the growth temperature of the host cell was 37°C.
Kinetic and Structural Properties of Wild-type and Hydrophobic Staple Mutant Enzymes Expressed at 25°C-The level of protein expression of some mutants depended, to varying degrees, on the growth temperature of the bacterial host cells. Moreover, some mutants displayed a lower specific activity when expressed at 37°C. Thus, kinetic and structural characterizations were performed using enzymes expressed at the more permissive temperature of 25°C. Table II summarizes the kinetic parameters for the conjugation of CDNB with GSH catalyzed by wild-type GSTP1-1 and mutant variants. The k_cat, K_m(GSH), and k_cat/K_m values of all mutants were, although to various extents, higher than those of the wild-type enzyme. In particular, mutants I149A/Y154A, L21A, and I149G/Y154A displayed k_cat values about 2-fold higher than that of the parent enzyme. The K_m values for CDNB were also higher than that of the wild-type enzyme.
Expression at 25°C yielded wild-type and mutant proteins with very similar structural properties. The similar far-UV CD spectra of all variants as well as the same gel filtration retention times (not shown) indicate that the secondary structure content and the dimeric arrangement of the proteins are essentially unaffected by the mutations. Fig. 2 shows the intrinsic fluorescence spectra of the proteins. The λmax values were the same, indicating that a similar polarity characterizes the environment of the tryptophan residues of all enzyme variants. However, the normalized intensities of fluorescence of most mutants were lower than that of the wild-type enzyme. This finding suggests that significant conformational changes distinguish the final structure of the mutants from that of the parent enzyme. These differences imply shorter distances between some fluorescence quenching groups and one or both tryptophanyl residues located in the GST domain I, far from the mutation site.
Thermal Stability of the Wild-type and Mutant Enzymes-The thermal stability of wild-type and hydrophobic staple motif mutant proteins, heterologously expressed at 25°C, was investigated. Residual activity was monitored after 10 min of exposure to different temperatures. As shown in Fig. 3, A and B, all mutants were destabilized compared with the wild-type GSTP1-1. The I149A mutant (Fig. 3A) was completely inactivated at 50°C, whereas the wild-type enzyme, under the same conditions, was inactivated to less than 10%. Furthermore, mutants Y154A and I149A/Y154A were significantly more unstable than both wild-type GSTP1-1 and the I149A mutant, being almost completely inactivated by incubation at 40°C for 10 min. For the Y154A and the I149A/Y154A mutants about 60% of the activity remained after the same time at 37°C. The fact that the thermal inactivation curves for the single Y154A and the double I149A/Y154A mutants were essentially the same suggests that interactions made by the Tyr154 residue with other regions of the molecule play a major role in the global stability of the protein. At 40°C, the L21A and Y154F mutants were 40 and 10% inactivated, respectively, suggesting that the hydrophobic interaction made by Tyr154 with the α1-helix is essential for enzyme stability (Fig. 3B). The results also show that about 40% of the activity of mutant I149G is lost when this variant is incubated at 40°C. Moreover, the I149G/Y154A variant, being inactivated to 70% at 37°C (Fig. 3B), is more unstable than the I149A/Y154A double mutant (Fig. 3A). Thus, an Ala to Gly substitution at position 149 in GSTP1-1 clearly affects the overall stability of the enzyme.
Reactivation Yields at Different Temperatures-The proposed role of the hydrophobic staple motif involves determination of the direction of the polypeptide chain during folding. One consequence of substituting residues that contribute to this motif could be the generation of temperature-sensitive folding mutants. To test this possibility, reactivation yields of GSTP1-1 and its mutants at different temperatures were determined. Fig. 4 shows that when the reactivation was performed at 25°C the double mutant I149A/Y154A possessed a higher specific activity (35 units/mg) than that observed for the wild-type and the single mutants I149A and Y154A. Still, at 25°C all enzymes showed the same (70-80%) reactivation yield. Fig. 4 also shows that while the wild-type enzyme was essentially unaffected by temperature in the 25-40°C range, the reactivation yields of the mutants decreased, although to different extents, as the temperature increased toward physiological values. In particular, the I149A/Y154A double mutant displayed a substantial reduction in the yield of refolded active form already at 32°C. In contrast, at the same temperature, the reactivation of the I149A single mutant was complete, with a specific activity (25 units/mg) slightly higher than that of the wild-type. At 32 and 40°C lower refolding yields were observed for the Y154A variant than for the I149A mutant, suggesting that a role in the refolding could be played by the specific interdomain interactions between the α6- and α1-helices identified in the crystal structure of the native enzyme. To address this aspect, the refolding yields of L21A and Y154F were also analyzed.
The I149G single mutant is completely unable to refold at 40°C, and even at 32°C it displayed a refolding yield significantly lower than that attained at 25°C (Fig. 5). Moreover, for the corresponding I149G/Y154A (Fig. 6) and I149G/Y154F (not shown) double mutants complete recovery of enzymatic activity could not be obtained even at 25°C.
Analysis of Reactivation at 25°C-A single exponential equation could be fitted to all data sets for the refolding kinetics (not shown). However, as shown in Table III, while the refolding rates of both I149A and Y154A single mutants were only slightly different from that of the wild-type enzyme, the I149A/Y154A double mutant, in which the staple motif has been significantly weakened, was characterized by a reactivation velocity about 6 times lower than that of the wild-type. These differences in the refolding rates reflect changes in the free energy of activation of folding upon mutation. Double mutant cycle analysis (20,21) was used to estimate the energy contribution (ΔΔG_int) of the staple interaction at the transition state of folding. As reported in Table III and Fig. 7, an estimated value significantly different from 0 (−0.53 kcal/mol) indicates that the hydrophobic staple interaction is present in the transition state of the reactivation process. The results also show that the I149G single mutant refolds at a lower rate than does the wild-type enzyme; the estimated ΔΔG_f was −0.25 kcal/mol. The I149G/Y154F and I149G/Y154A double mutants, in contrast to the I149A/Y154A variant, were unstable even at temperatures as low as 25°C, suggesting that the substitution of Ile149 by Gly destabilizes the transition state of folding more dramatically than does the I149A mutation. In contrast to the other GSTP1-1 variants, the reactivation rate of the L21A mutant was not significantly different from that of the wild-type, suggesting that the hydrophobic contact formed by Tyr154 with domain I of the protein probably forms later on the folding pathway. For all variants, the reactivation rate was independent of the protein concentration (not shown).

[Figure legend, refolding experiments] Enzyme, heterologously expressed at 25°C, was first denatured in 4 M guanidinium chloride at 25, 32, and 40°C, respectively, for 30 min. This denaturant concentration was sufficient to completely unfold the proteins, as indicated by the loss of their CD signal at 222 nm (not shown). Subsequently each unfolded enzyme was diluted (defining time 0) 1:40 into renaturation buffer (0.2 M phosphate, pH 7.0, 5 mM dithiothreitol) at the same temperature as during denaturation. The final guanidinium chloride concentration was 0.1 M during refolding. Appropriate aliquots from this incubation mixture were immediately assayed for catalytic activity at 25°C.

DISCUSSION
Although the (i, i+5) hydrophobic staple interaction is a recurrent motif at the N terminus of α-helices, its possible function in folding and stability has not previously been tested experimentally in a complete protein. In the present work we have found that this supersecondary interaction plays an essential role in the folding and stability of a GST protein. In the GST superfamily the amino acid residues that make up this motif represent, together with Gly146, the only residues strictly conserved in domain II in all GSTs, but their role has not been investigated until now.
To study folding and stability in vitro, GSTP1-1 and its corresponding mutants were expressed at 25°C. At this permissive temperature the kinetic and structural properties of all mutants were only slightly different from those of the wild-type protein, suggesting that the global fold of all variant enzymes was very similar. However, all mutants were less stable, although to various degrees, than the wild-type enzyme (Fig. 3). The N-4 mutant (Y154A), in particular, was significantly more destabilized than the N′ mutant (I149A). A simple explanation for this difference is provided by the analysis of the three-dimensional structure of human GSTP1-1. As shown in Fig. 1, the residue Tyr154, in addition to the hydrophobic staple interaction with Ile149, forms conserved interactions with amino acid residues Leu21, Asp24, and Gln25, belonging to an important structural element (the α1-helix) of the active site. The almost identical thermal inactivation curves for the single mutant Y154A and the double mutant I149A/Y154A support the interpretation that these interdomain interactions are critical for stability of class Pi GSTs. Furthermore, the L21A mutant is much more unstable than the Y154F variant. This indicates that among all interdomain contacts made between Tyr154 and the α1-helix, the conserved hydrophobic interaction of Tyr154 with Leu21 plays an essential role in the overall stability of the protein. The results also show that the I149G/Y154A variant is more unstable than the I149A/Y154A double mutant. Thus, although the complex unfolding behavior of GSTP1-1 does not allow accurate determinations of thermodynamic parameters (18), the present results demonstrate that a glycine residue at position 149 significantly destabilizes the structure of GSTP1-1.
Refolding in vitro of all mutants was thermosensitive. The results show that the reactivation yield of the I149A/Y154A double mutant was much lower than that found for either of the I149A and Y154A single mutants and provide an explanation for the very low yield observed for the heterologous expression of this mutant in E. coli. In particular, the double mutant I149A/Y154A is completely unable to refold at 32°C, although the final structure of this variant is stable at this temperature. This means that during refolding of GSTP1-1 the contribution from the staple motif is significant even in a situation where other interactions between Tyr154 and the α1-helix occur. The results also show that refolding yields of the Y154A variant were lower than those of the I149A mutant. Considering the interactions made by Tyr154 with other residues of the α1-helix, these results suggest that the appropriate docking of the α6-helix with the α1-helix in domain I is also required for a productive folding pathway. In order to better clarify the significance of the different interactions formed between Tyr154 and domain I, the reactivation of the L21A and Y154F mutants has also been analyzed. While the refolding yield of the latter variant was almost identical to that of wild-type, the very low refolding yields observed for the L21A mutant suggest that the interdomain hydrophobic interaction made by Tyr154 with Leu21 plays a major role in the juxtapositioning of the domains and in establishing the proper conformation of class Pi GSTs.
The analysis of the reactivation rates at 25°C indicates that the removal of the hydrophobic staple motif destabilizes the transition state of folding. Upon mutation of GSTP1-1, the rate constants decrease for nearly all mutants (Table III). This is because side chain deletions, to various extents, destabilize the transition state and thereby increase the activation energy for folding. In particular, the ΔΔG_f value for the Y154A variant is higher than that for the I149A mutant (Fig. 7). As discussed below, some additional interactions made by Tyr154 with its environment further contribute to the stabilization of the transition state. The results also show that the reactivation rate of the I149A/Y154A double mutant is lower than that of either the I149A or the Y154A single mutant, as could be expected (Table III). It should be noted that a single exponential equation can be fitted to the refolding data for all GSTP1-1 variants, suggesting that no significant amount of intermediate is accumulated during the reactivation of wild-type and variants and that none of the mutations has a dramatic effect on the refolding pathway. Thus, double mutant cycle analysis was used to estimate quantitatively the free-energy contribution of the staple interaction between Ile149 and Tyr154 to the activation energy of the folding process. As shown in Fig. 7, the ΔΔG_f value (−0.70 kcal/mol) for the substitution of Ile149 by Ala in the Y154A single variant was higher than that determined (−0.17 kcal/mol) for the same substitution in the wild-type enzyme. The difference (−0.53 kcal/mol, Table III and Fig. 7) represents the contribution (ΔΔG_int) of the staple interaction to the activation barrier of folding and indicates that this motif affects the transition state.
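As a worked illustration of the numbers quoted above, the interaction energy and the rate factor it corresponds to can be written out explicitly. The conversion to a rate factor is not stated in the text; it assumes RT ≈ 0.592 kcal/mol at 25°C and the transition-state relation between rate constants and activation free energy used in the Methods.

```latex
\begin{align*}
\Delta\Delta G_{int}
  &= \Delta\Delta G_f(\mathrm{Y154A} \rightarrow \mathrm{I149A/Y154A})
   - \Delta\Delta G_f(\mathrm{wt} \rightarrow \mathrm{I149A}) \\
  &= -0.70 - (-0.17) = -0.53\ \mathrm{kcal/mol} \\[4pt]
\exp\!\left(\frac{|\Delta\Delta G_{int}|}{RT}\right)
  &\approx \exp\!\left(\frac{0.53}{0.592}\right) \approx 2.4
\end{align*}
```

On this reading, the staple interaction alone would account for a roughly 2.4-fold change in the refolding rate constant at 25°C, with the remainder of the effect in the double mutant arising from the individual side chain deletions.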
Similarly, the analysis of the reactivation rate of the Y154F mutant suggests that the interdomain hydrogen bond between Tyr 154 and Asp 24 is formed early during GSTP1-1 folding. In contrast, the refolding rate of the L21A mutant was similar to that of the wild-type. This indicates that the interaction between Tyr 154 and Leu 21 does not influence the activation energy of the folding process. It is conceivable that this interdomain contact is formed very late on the folding pathway, after the transition state, and may not be realized until the final state is approached.
Since Ala residues can still form a weak hydrophobic staple during folding (4,5), the reactivation of the I149G variant as well as of the I149G/Y154A and I149G/Y154F double mutants was also analyzed. The I149G mutant is unable to refold at physiological temperatures. It should be noted that at 37°C the native state of this variant (expressed at 25°C) is stable for several hours (not shown), confirming that the mutation affects an essential step during GSTP1-1 refolding. The corresponding I149G/Y154A double mutant, in contrast to the I149A/Y154A variant, was characterized by a very low refolding yield even at 25°C. Thus, the presence of a glycine instead of a hydrophobic residue at position 149 in the core of GSTs is even more destabilizing and precludes accurate energy determinations of the possible Ala-Ala interaction during refolding.
The present results indicate that the removal of the hydrophobic staple motif does not necessarily prevent folding but, at least at permissive temperatures, decreases the rate of the process. At 25°C, variants of GSTP1-1 are generated that are characterized by a higher catalytic efficiency but a lower structural stability. Given the lower solubility of folding intermediates (23), the slower refolding rate could favor, at more restrictive temperatures, competing aggregation reactions that lead to the complete loss of reactivation at physiological temperature. The reactivation rate was independent of the protein concentration, indicating that the elimination of the hydrophobic staple motif gives rise to a new limiting step of refolding that does not involve dimerization. Considering the proposed role of this motif in determining a reversal of the polypeptide chain direction, the results suggest that the appropriate conformation of the α6-helix relative to the preceding long loop represents an important feature of a transition state occurring during monomer formation. The conserved hydrophobic staple motif could stabilize one out of several closely related conformations of the α6-helix, namely the one that favors essential interdomain interactions, some of which are formed later in the refolding process.
It has been hypothesized (24,25) that, for monomeric single domain proteins, buried and highly conserved residues, making a definite set of native-like contacts, determine the formation of a specific folding nucleus that serves as a necessary template onto which other parts of the molecule can form. The present work provides evidence that such specific nucleus contacts also exist in a more complex multidomain enzyme and that, in GSTs, they are at least in part determined by conserved local motifs.
Several theoretical studies suggest that there is an evolutionary pressure for proteins to fold rapidly (23, 25-28). Amino acid replacements that dramatically retard the folding process might be selected against, because competing aggregation reactions could reduce the yield of the native form. The present work identifies a rapid-folding determinant of the polypeptide chain that, at least in GSTs, is represented by a conserved hydrophobic staple motif.