A Major Kinetic Trap for the Oxidative Folding of Human Epidermal Growth Factor*

The folding pathway of human epidermal growth factor (EGF) has been characterized by structural and kinetic analysis of the acid-trapped folding intermediates. Oxidative folding of the fully reduced EGF proceeds through 1-disulfide intermediates and accumulates rapidly as a single stable 2-disulfide intermediate (designated as EGF-II), which at its maximum represents more than 85% of the total protein along the folding pathway. Among the five 1-disulfide intermediates that have been structurally characterized, only one is native, and nearly all of them are bridged by neighboring cysteines. Extensive accumulation of EGF-II indicates that it accounts for the major kinetic trap of EGF folding. EGF-II contains two of the three native disulfide bonds of EGF, Cys14–Cys31 and Cys33–Cys42. However, formation of the third native disulfide (Cys6–Cys20) in EGF-II is slow and does not occur directly. Kinetic analysis reveals that an important route for EGF-II to reach the native structure is via a rearrangement pathway through 3-disulfide scrambled isomers. The pathway by which EGF-II attains the native structure differs from that of the three major 2-disulfide intermediates of bovine pancreatic trypsin inhibitor (BPTI). The dissimilarities in folding mechanism(s) between EGF, BPTI, and hirudin are discussed in this paper.

A protein that contains three disulfide bonds can potentially assume 75 different disulfide isomers. The disulfide folding pathway is defined by the heterogeneity and structures of disulfide isomers that accumulate along the process of oxidative folding. Meticulous application of this technique, pioneered by Creighton (1-3), has produced a major model of protein folding, the disulfide folding pathway of bovine pancreatic trypsin inhibitor (BPTI) (4-8). The original model of BPTI was based mainly on the analysis of folding intermediates trapped by iodoacetate and separated by ion-exchange chromatography (4-6). A subsequent study, using the method of acid trapping and HPLC analysis, revealed a somewhat different pattern of folding intermediates of BPTI (7,8). Specifically, intermediates with non-native disulfides were detected at much lower concentrations than those found previously. Despite a few discrepancies, the folding pathway of BPTI (4-9) is characterized by the predominance of a limited number of folding intermediates that adopt native disulfide bonds and native-like structures. Most importantly, only 1- and 2-disulfide intermediates were observed during the folding of BPTI.
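The figure of 75 isomers follows from counting the ways six cysteines can be paired into one, two, or three disulfide bonds. The short Python sketch below is not part of the original study; it simply enumerates the pairings, and its function names are illustrative. The cysteine positions and the native pairing are those of human EGF as quoted in the abstract.

```python
from itertools import combinations

# Cysteine positions of human EGF and its native pairing, as quoted in the abstract.
EGF_CYSTEINES = (6, 14, 20, 31, 33, 42)
NATIVE_EGF = frozenset({(6, 20), (14, 31), (33, 42)})


def pairings(residues):
    """Yield every way to pair up all residues (even-length tuple) as a frozenset of pairs."""
    if not residues:
        yield frozenset()
        return
    first, rest = residues[0], residues[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for sub in pairings(remaining):
            yield sub | {(first, partner)}


def disulfide_isomers(cysteines, n_bonds):
    """Return all isomers with exactly n_bonds disulfides among the given cysteines."""
    isomers = set()
    for used in combinations(sorted(cysteines), 2 * n_bonds):
        isomers.update(pairings(used))
    return isomers


counts = {n: len(disulfide_isomers(EGF_CYSTEINES, n)) for n in (1, 2, 3)}
total = sum(counts.values())
# Scrambled isomers are the fully oxidized (3-disulfide) species other than the native one.
scrambled = disulfide_isomers(EGF_CYSTEINES, 3) - {NATIVE_EGF}

print(counts, total, len(scrambled))
```

The enumeration gives 15 one-disulfide, 45 two-disulfide, and 15 three-disulfide isomers (75 in total), so there are 60 possible 1- and 2-disulfide species and 14 possible scrambled 3-disulfide species for any protein with six cysteines.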
Using the same technique of acid trapping, our laboratory has analyzed the folding pathway(s) of three single-domain, 3-disulfide-containing proteins that have sizes similar to that of BPTI, including hirudin (10), potato carboxypeptidase inhibitor (PCI) (11), and tick anticoagulant peptide (12). Their folding mechanism(s) have been shown to differ from that of BPTI in two important aspects. 1) Their 1- and 2-disulfide intermediates are far more heterogeneous than those found in the case of BPTI. At least 40 fractions of 1- and 2-disulfide intermediates were identified along the folding pathways of hirudin (10). 2) Scrambled 3-disulfide isomers, which have not been observed in the case of BPTI, were shown to serve as folding intermediates of hirudin, PCI, and tick anticoagulant peptide. Moreover, accumulation of scrambled isomers as folding intermediates can be greatly enhanced by the presence of oxidized glutathione or cystine (11-13). For instance, when folding of PCI was performed in the presence of 0.5 mM cystine, more than 98% of the total protein was trapped as scrambled species before even a trace amount of the native PCI appeared (11).
Another protein that has been analyzed in our laboratory and others is human epidermal growth factor (EGF) (14,15). The native EGF comprises 53 amino acids and three disulfide bonds, Cys6–Cys20, Cys14–Cys31, and Cys33–Cys42 (Fig. 1). The folding mechanism of EGF differs from that of BPTI as well as that of hirudin, although it displays similarities to both. For example, the folding intermediates of EGF (14) include scrambled isomers that are not found in the case of BPTI and also a predominant 2-disulfide species that is absent from the pathway of hirudin. These discrepancies indicate that the folding pathway of small disulfide-containing proteins is indeed more complex than what has been learned from the BPTI model alone.
However, in contrast to the pathways of hirudin (10), PCI (11), and the updated model of BPTI (7,8), which were derived primarily from the analysis of acid-trapped intermediates, existing data on EGF folding were acquired by trapping with either carboxymethylation (14) or cyanylation (15). From the experience with BPTI (3,7), it is known that acid trapping represents a more efficient method for quenching the reshuffling of disulfide bonds. To make a more valid and accurate comparison between EGF, hirudin, and BPTI, we have re-examined the folding mechanism of EGF using the technique of acid trapping. These efforts allowed us to identify the structures of three new species of 1-disulfide intermediates and permitted a detailed kinetic analysis of the predominant 2-disulfide intermediate of EGF using stop/go folding experiments.

EXPERIMENTAL PROCEDURES
Materials-Recombinant human EGF was supplied by the Protein Institute Inc. (Broomall, PA). The purity of EGF was greater than 96% as judged by HPLC and N-terminal sequence analysis. Thermolysin (P-1512), dithiothreitol, guanidine hydrochloride, reduced glutathione (GSH), oxidized glutathione (GSSG), cysteine (Cys), and β-mercaptoethanol were products of Sigma with purity of greater than 99%.
Oxidative Folding of the Fully Reduced EGF-The native EGF (2 mg/ml) was first reduced and denatured in Tris-HCl buffer (0.1 M, pH 8.4) containing 10 mM EDTA, 5 M guanidine hydrochloride, and 30 mM dithiothreitol. Reduction and denaturation were carried out at 22°C for 90 min. To initiate the folding, the sample was passed through a PD-10 column (Amersham Pharmacia Biotech) equilibrated in the 0.1 M Tris-HCl buffer (pH 8.4). Reduced and denatured EGF was recovered in a volume of 1.1 ml, which was immediately diluted with the same Tris-HCl buffer to a final protein concentration of 0.5 mg/ml (83 µM) and with selected concentrations of redox agents. Folding intermediates of EGF were trapped in a time course manner by mixing aliquots of the sample with an equal volume of 4% trifluoroacetic acid in water. Trapped folding intermediates were analyzed by HPLC or stored at -20°C.
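As a quick arithmetic check on the concentrations in this procedure, the Python sketch below back-calculates the molar mass implied by the paired values 0.5 mg/ml and 83 µM, and the concentrations reached after quenching with an equal volume of 4% trifluoroacetic acid. It is an illustration only; the function names are hypothetical, and the molar mass it returns is the value implied by the quoted figures, not an independently measured one.

```python
def implied_molar_mass(mg_per_ml, micromolar):
    """Molar mass in g/mol implied by a paired mass and molar concentration."""
    grams_per_litre = mg_per_ml          # 1 mg/ml equals 1 g/l
    mol_per_litre = micromolar * 1e-6
    return grams_per_litre / mol_per_litre


def after_equal_volume_mix(concentration):
    """Concentration after mixing with an equal volume of a solution lacking that solute."""
    return concentration / 2


molar_mass = implied_molar_mass(0.5, 83)          # folding mixture: 0.5 mg/ml, 83 µM
quenched_protein_uM = after_equal_volume_mix(83)  # protein concentration after the acid quench
quenched_tfa_percent = after_equal_volume_mix(4)  # trifluoroacetic acid after the quench

print(round(molar_mass), quenched_protein_uM, quenched_tfa_percent)
```

The implied molar mass is about 6.0 kDa, consistent with a 53-residue polypeptide, and each acid-trapped aliquot contains roughly 41.5 µM protein in 2% trifluoroacetic acid.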
Reductive Unfolding of the Native EGF-The native EGF (0.5 mg/ml) was dissolved in the Tris-HCl buffer (0.1 M, pH 8.4) containing varying concentrations of dithiothreitol (0.5-100 mM). Reduction was carried out at 22°C. To monitor the kinetics of unfolding, aliquots of the sample were removed at time intervals, quenched with an equal volume of 4% aqueous trifluoroacetic acid, and analyzed by HPLC. The samples were stored at -20°C.
Stop/Go Folding of the Predominant 2-Disulfide Intermediate EGF-II-Acid-trapped EGF-II was isolated by HPLC, freeze-dried, and allowed to resume folding by dissolving the sample (0.5 mg/ml) in the same Tris-HCl buffer containing selected redox agents. Folding intermediates of EGF-II were similarly trapped by acidification, stored at -20°C, and analyzed by HPLC.
Structural Analysis of 1- and 2-Disulfide Intermediates of EGF-The acid-trapped folding intermediates of EGF were purified by HPLC and freeze-dried. The samples (20 µg) were derivatized with 50 µl of vinyl pyridine (0.1 M) in Tris-HCl buffer (0.1 M, pH 7.5) at 23°C for 45 min. Vinyl pyridine-derivatized samples were further purified by HPLC, freeze-dried, and treated with 2 µg of thermolysin (Sigma, P-1512) in 65 µl of N-ethylmorpholine/acetate buffer (50 mM, pH 6.4). Digestion was carried out at 37°C for 16 h. Peptides were then purified by HPLC and analyzed by amino acid sequencing and mass spectrometry to identify the disulfide-containing peptides.
Amino Acid Sequencing and Mass Spectrometry-Amino acid sequences of disulfide-containing peptides were analyzed by automatic Edman degradation using a PerkinElmer Life Sciences Procise sequencer (model 494) equipped with an on-line phenylthiohydantoin derivative analyzer. The molecular masses of disulfide-containing peptides were determined with a matrix-assisted laser desorption/ionization time-of-flight mass spectrometer (PerkinElmer Life Sciences Voyager-DE STR).

RESULTS
Rapid Accumulation of a 2-Disulfide Intermediate (EGF-II) along the Pathway of Oxidative Folding of EGF-Oxidative folding of the fully reduced EGF was carried out at alkaline pH in the absence and presence of redox agents. Folding intermediates were trapped by acidification and analyzed by HPLC (Fig. 2). The structures of folding intermediates were established through analysis of thermolysin-digested peptides (see following sections). The major intermediates consist of five 1-disulfide species, one predominant 2-disulfide species, and two species of 3-disulfide scrambled isomers. Kinetics of the folding was determined quantitatively by the flow rate of various classes of intermediates and the recovery of native structure.
The results demonstrate that oxidative folding of EGF undergoes a sequential flow of the 1-disulfide, 2-disulfide, and 3-disulfide scrambled species as intermediates (Fig. 3). Quantitative recovery of the native EGF can be achieved by allowing the folding to proceed in buffer containing different combinations of redox agents, including the simple presence of 2-mercaptoethanol. Specifically, the folding of EGF and the recovery of the native structure could be significantly accelerated by the presence of oxidized glutathione (Fig. 3). However, when folding of EGF was performed in the absence of redox agent, about 43% of the protein became trapped as 3-disulfide scrambled species, unable to convert to the native structure due to the absence of a thiol catalyst. Despite the variations in kinetics, the patterns of folding intermediates remain practically indistinguishable under a wide range of folding conditions.
FIG. 2. Folding was carried out in the Tris-HCl buffer (pH 8.4) containing 2-mercaptoethanol (0.25 mM) (Control+) or without redox agent (Control-). One is oxidation by oxygen in the presence of 2-mercaptoethanol and the other is oxidation by oxygen alone. EGF concentration was 0.5 mg/ml (83 µM). Folding intermediates were analyzed by HPLC using the following conditions. Solvent A was water containing 0.05% trifluoroacetic acid. Solvent B was acetonitrile/water (9:1 by volume) containing 0.042% trifluoroacetic acid. The gradient was 14-34% solvent B, linear, in 15 min, then 34-56% solvent B, linear, from 15 to 50 min. The column was Zorbax C-18, 4.6 mm, 5 µm. Column temperature was 23°C. N and R stand for the native and the fully reduced EGF. II, I-A, I-B, I-C, I-D, and I-E indicate the predominant 2-disulfide intermediate and five identified 1-disulfide intermediates. III-A and III-B are two major species of 3-disulfide scrambled isomers.
The most striking feature of the folding kinetics of EGF is the rapid formation of a predominant 2-disulfide intermediate, designated as EGF-II. Accumulation of EGF-II occurs both in the absence and presence of redox agents (Figs. 2 and 3), suggesting that EGF-II adopts a highly stable structure and represents a major kinetic trap of EGF folding. Structural analysis revealed that EGF-II contains two native disulfide bonds (Cys14–Cys31 and Cys33–Cys42) (see the following sections).
EGF-II Is Also a Major Intermediate along the Pathway of Reductive Unfolding of EGF-Reductive unfolding of the native EGF was performed at pH 8.4 using various concentrations of dithiothreitol as the reducing agent. The unfolding intermediates were trapped in a time course manner by acidification and were subsequently analyzed by reversed phase HPLC. Unfolding of native EGF proceeds through a stable 2-disulfide intermediate (II) (Fig. 4). This 2-disulfide intermediate subsequently converts to the fully reduced EGF (R) without significant buildup of 1-disulfide intermediates along the pathway. The same phenomenon was observed with concentrations of dithiothreitol ranging from 2 to 100 mM. This 2-disulfide unfolding intermediate has the same retention time as that of EGF-II observed along the pathway of oxidative folding (Fig. 2). Structural analysis confirms that this major unfolding intermediate is indeed identical to the predominant folding intermediate EGF-II.
Structures of 1-Disulfide Folding Intermediates of EGF-Five major species of acid-trapped 1-disulfide intermediates were isolated (I-A, I-B, I-C, I-D, and I-E) (Fig. 2). They were treated with vinylpyridine, further purified by HPLC, and were all shown to contain a single species. They were digested with thermolysin. Thermolytic peptides were isolated by HPLC (Fig. 5) and analyzed by Edman sequencing and matrix-assisted laser desorption/ionization mass spectrometry to identify the structures of the disulfide-containing peptides. Data obtained are summarized in Table I.
The Structure of a Predominant 2-Disulfide Folding Intermediate of EGF-The predominant 2-disulfide folding intermediate (EGF-II) was analyzed by the same method used to characterize 1-disulfide intermediates. The HPLC peptide mapping of EGF-II is shown in Fig. 5, and the structural data of thermolytic peptides are given in Table II. The results confirm that EGF-II contains two native disulfide bonds (Cys14–Cys31 and Cys33–Cys42). The structure of the 2-disulfide unfolding intermediate (Fig. 4) was similarly characterized, and it was found to be identical to that of EGF-II.
Structures of Well Populated 3-Disulfide Intermediates of EGF-Scrambled 3-disulfide intermediates of EGF comprise at least seven species. The disulfide structures of two major species, III-A and III-B, representing about 90% of the total scrambled isomers, have been determined previously (14). Their structures are Cys6–Cys42, Cys14 …
… Tables I and II. Peptides were separated under the HPLC conditions described in the legend to Fig. 2, except for using a different gradient consisting of 5-22% solvent B, linear, over 30 min, 22-60% solvent B from 30 to 31 min, held at 60% solvent B for 5 min, and then returned to 5% solvent B within 2 min.
Intermediate EGF-II-To further assess the kinetic role of EGF-II, acid-trapped EGF-II was isolated, freeze-dried, and allowed to resume folding in the absence and presence of a wide selection of redox agents. Folding intermediates were trapped by acidification and analyzed by HPLC. The data, presented in Figs. 7 and 8, display two important characteristics of the EGF-II folding. First, conversion of EGF-II to the native structure requires only the formation of the third native disulfide bond (Cys6–Cys20). However, this process is relatively slow. In the absence of redox agent (bottom panel of Fig. 7 and panel A of Fig. 8), about half of the protein again became trapped as scrambled isomers, unable to convert to the native structure during prolonged folding. Quantitative recovery of the native EGF could be achieved only in the presence of a redox agent containing free thiol. But even under optimized conditions, a 50% recovery of the native EGF requires longer than 1 h of folding. Indeed, the inclusion of oxidized glutathione (panels G and H of Fig. 8), which has been shown to significantly promote the rate of formation of disulfide bonds and the native structure in numerous different proteins (13), has only a modest effect in accelerating the recovery of native EGF.
Second, conversion of EGF-II to form the native structure apparently requires structural rearrangement of EGF-II. Folding intermediates of this rearrangement pathway include scrambled 3-disulfide isomers (III-A and III-B), which exist under all folding conditions investigated (Fig. 8). The level of accumulation of scrambled EGF is dependent upon the concentration of thiol agent (compare panels A-D of Fig. 8) and becomes most evident when folding of EGF-II is performed in the absence of free thiol (see panels A and G of Fig. 8). A separate analysis of folding kinetics using both EGF-II and EGF-X (III-A and III-B) as starting material has also been performed to compare the rate constants k(X→N) and k(II→N), where X, II, and N stand for scrambled EGF, EGF-II, and the native EGF, respectively. These experiments were carried out in the presence of 2-mercaptoethanol (0.25 mM) or reduced glutathione (0.25 mM). Using 2-mercaptoethanol, k(X→N) (5 × 10⁻³ min⁻¹) was shown to be about 5-fold greater than k(II→N) (1.1 × 10⁻³ min⁻¹). In the presence of glutathione, k(X→N) (2 × 10⁻³ min⁻¹) was 4.3-fold greater than k(II→N) (4.6 × 10⁻⁴ min⁻¹). These results demonstrate that disulfide reshuffling of scrambled EGF and their conversion to the native structure represent a major pathway and also an additional rate-limiting step for the folding of EGF-II.

DISCUSSION
The pathways of oxidative folding of several 3-disulfide-containing proteins have been characterized, including the most extensively documented case of BPTI (4-8) as well as hirudin (10,16) and PCI (11) elucidated in our laboratory. The folding intermediates of BPTI comprise only 1- and 2-disulfide species (Table III). Out of the 60 possible 1- and 2-disulfide isomers, only five to seven species were shown to predominate along the folding pathway of BPTI, and most of them contain only native disulfide bonds. The folding pathways of hirudin and PCI are nearly indistinguishable. However, they differ from that of BPTI by 1) a much higher degree of heterogeneity of 1- and 2-disulfide intermediates and 2) the presence of 3-disulfide scrambled isomers as folding intermediates (Table III). The folding mechanism of EGF described here differs further from those of both BPTI and hirudin. Comparison and discussion of these discrepancies are essential for comprehending the diversity of disulfide folding pathway(s). These comparisons focus on the heterogeneity of folding intermediates, their disulfide contents, disulfide structures, and folding kinetics.
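To make the kinetic comparison concrete, the rate constants reported in the stop/go experiments can be turned into fold differences and half-times. The Python sketch below is illustrative only: it assumes simple first-order (single-exponential) conversion, which the paper does not state explicitly, and the dictionary layout and function names are hypothetical. The numerical rate constants are the ones quoted in the Results.

```python
import math

# Rate constants (min^-1) quoted in the text for conversion to native EGF (N),
# starting from scrambled EGF (X) or from EGF-II (II), each thiol at 0.25 mM.
RATE_CONSTANTS = {
    "2-mercaptoethanol": {"X->N": 5e-3, "II->N": 1.1e-3},
    "reduced glutathione": {"X->N": 2e-3, "II->N": 4.6e-4},
}


def half_time(k):
    """Half-time in minutes, ASSUMING first-order kinetics: t1/2 = ln(2) / k."""
    return math.log(2) / k


def fraction_native(k, minutes):
    """Fraction converted to native after a given time under the same assumption."""
    return 1.0 - math.exp(-k * minutes)


fold = {agent: ks["X->N"] / ks["II->N"] for agent, ks in RATE_CONSTANTS.items()}
half_times = {
    agent: {step: half_time(k) for step, k in ks.items()}
    for agent, ks in RATE_CONSTANTS.items()
}

for agent in RATE_CONSTANTS:
    print(agent, round(fold[agent], 2),
          {step: round(t) for step, t in half_times[agent].items()})
```

Under this assumption the ratios come out near 4.5 with 2-mercaptoethanol and 4.3 with glutathione, and the half-time for EGF-II in 0.25 mM 2-mercaptoethanol is on the order of 10 h, compared with a little over 2 h for scrambled EGF under the same conditions.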
Comparison of the Folding Pathways between EGF and Hirudin-Oxidative folding of hirudin and PCI undergoes an initial stage of nonspecific disulfide pairing that leads to the formation of scrambled 3-disulfide isomers as intermediates. This is followed by disulfide reshuffling of scrambled species in the presence of thiol agent to reach the native structure (10,11,13,16). The folding intermediates of hirudin and PCI are thus far more heterogeneous than those of EGF. Analysis by HPLC and capillary electrophoresis revealed that more than 60% of the 60 possible 1- and 2-disulfide isomers of hirudin are present along the folding pathway (10). Among the 14 possible scrambled 3-disulfide isomers of hirudin, 11 have been structurally identified (16) as folding intermediates (Table III). There is also no evidence for the existence of predominant and thermodynamically stable intermediates at the level of 1- and 2-disulfide species.
The kinetics of hirudin and PCI folding are sensitive to the presence of redox agent and can be regulated in a two-stage manner by varying the composition of GSH/GSSG or Cys/Cys-Cys (11,13). Specifically, GSSG and Cys-Cys accelerate the formation of disulfide bonds, whereas Cys and GSH, like 2-mercaptoethanol, catalyze the disulfide reshuffling. When folding of hirudin and PCI was performed in the presence of GSSG or Cys-Cys (0.5 mM), more than 98% of the intermediates accumulated rapidly as scrambled isomers before a significant amount of the native structure appeared (11). Under these conditions, conversion of scrambled isomers to form the native structure represents the rate-limiting step of hirudin folding. Such folding kinetics differ significantly from those of EGF, in which a stable 2-disulfide intermediate (EGF-II) accounts for the major kinetic trap and rate-limiting step of oxidative folding even in the presence of GSSG (Fig. 3).
The major similarity between hirudin and EGF is the presence of scrambled 3-disulfide isomers as folding intermediates. For both hirudin (10) and EGF, about 50-70% of the total protein becomes trapped as scrambled isomers when folding is performed in the absence of redox agent. In both cases, the predominant scrambled isomers adopt the beads-form disulfide pairing, in which the disulfide bonds are formed by three pairs of neighboring cysteines (14-16).
Comparison of the Folding Pathways between EGF and BPTI-The folding mechanism of EGF also bears resemblance as well as dissimilarity to that of BPTI. Both proteins exhibit only limited numbers of folding intermediates at the level of 1-disulfide and 2-disulfide species (Table III). The two major 1-disulfide intermediates of BPTI, Cys30–Cys51 and Cys5–Cys55, contain native disulfides and adopt native-like structures (4-8, 17-19). They accumulate substantially along the folding pathway of BPTI and represent more than 80% of the total 1-disulfide intermediates of BPTI (4,7,9). By contrast, only one of the five identified 1-disulfide intermediates of EGF (I-D) contains a native disulfide bond. Nearly all of them are formed by neighboring cysteines, and in none of them are the paired cysteines separated by more than 11 amino acids (Fig. 6). The only native disulfide, Cys33–Cys42 (I-D), may possibly serve as a direct precursor of EGF-II. However, I-D and the remaining 1-disulfide intermediates of EGF appear to exist in a state of equilibrium throughout the folding (Fig. 2). These results demonstrate that 1-disulfide intermediates of EGF are unlikely to adopt native-like stable structures. This conclusion is further supported by their low concentrations along the folding pathway (Fig. 2), which were observed under a wide range of folding conditions.
Another feature that distinguishes EGF from BPTI is the presence of 3-disulfide scrambled isomers as folding intermediates. These fully oxidized species were not observed in the case of BPTI, but have been found along the folding pathways of four different 3-disulfide proteins, including hirudin and EGF. This difference is a major hallmark for the diversity of disulfide folding pathway (20). Through kinetic analysis of reductive unfolding, we have recently shown that the absence or presence of scrambled isomers as intermediates depends largely upon how native disulfide bonds are being stabilized (20).
FIG. 7. Stop/Go folding of the predominant 2-disulfide intermediate of EGF. Acid-trapped EGF-II was purified by HPLC under acidic conditions, freeze-dried, and then dissolved in the Tris-HCl buffer (pH 8.4) to allow the folding either in the absence (Control-) or presence (Control+) of 2-mercaptoethanol (0.25 mM). Folding intermediates were trapped with acid (4% trifluoroacetic acid) and analyzed by HPLC using the same conditions as those described in the legend to Fig. 2. N and II stand for the native EGF and EGF-II. III-A and III-B are two major species of 3-disulfide isomers of EGF.
However, the most intriguing dissimilarity between EGF and BPTI lies in the properties of their 2-disulfide intermediates. In the case of BPTI, three major acid-trapped 2-disulfide intermediates were identified. They are (Cys30–Cys51, Cys5–Cys55), (Cys30–Cys51, Cys14–Cys38), and (Cys5–Cys55, Cys14–Cys38). All contain exclusively native disulfide bonds, and all of them were shown to be capable of adopting native-like structures (21, 22). Among them, only (Cys30–Cys51, Cys14–Cys38) and (Cys5–Cys55, Cys14–Cys38) accumulate significantly along the pathway (7). These two intermediates either form their respective third native disulfide very slowly or rearrange slowly to form (Cys30–Cys51, Cys5–Cys55), which then converts rapidly to the native BPTI by forming the third native disulfide bond, Cys14–Cys38 (4,5,7). (Cys30–Cys51, Cys5–Cys55), which was designated as N SH SH (7), thus adopts a native-like structure and represents a direct precursor of the native BPTI during its oxidative folding. Its rapid conversion to the native BPTI is consistent with the low concentration at which it accumulates along the folding pathway of BPTI.
For EGF, there is only one predominant 2-disulfide intermediate. EGF-II also contains two native disulfide bonds (Cys14–Cys31, Cys33–Cys42). But unlike N SH SH of BPTI, EGF-II does not readily convert to the native EGF. Under the same folding conditions (e.g. in the presence of 0.25 mM GSSG), conversion of EGF-II to the native EGF is at least 500-fold slower than the conversion of N SH SH to the native BPTI. The property of EGF-II also differs from that of the other two predominant 2-disulfide intermediates of BPTI, which convert slowly to N SH SH (4,7,9). The major pathway, if not the only pathway, for EGF-II to reach the native structure is through 3-disulfide scrambled species. This conclusion is substantiated not only by the presence of scrambled EGF as intermediates but also by the fact that conversion of EGF-II to the native EGF could be greatly accelerated in the presence of thiol agents (2-mercaptoethanol or GSH) that catalyze disulfide reshuffling (Figs. 7 and 8). However, it is likely that formation of the third native disulfide bond may lead a fraction of EGF-II to N directly. What makes the comparison between EGF and BPTI even more baffling is that EGF-II and N SH SH display otherwise very similar properties. Both EGF-II and N SH SH are the sole intermediates (20) that accumulate along the unfolding pathways of EGF and BPTI (23-25), respectively, and both convert to the fully reduced species without buildup of 1-disulfide species.
These results indicate that N SH SH of BPTI and EGF-II, despite their many comparable characteristics as folding and unfolding intermediates, adopt fundamentally very different structures. The folding mechanism exhibited by EGF only magnifies the extent of diversity of disulfide folding pathways.

a Disulfide structures of these isomers have not been determined due to their heterogeneity (10).
b Species containing mostly non-native disulfide bonds (Refs. 14 and 15 and this study).
e The structures of two major species (III-A and III-B) have been determined. They contain mostly non-native disulfide bonds (14).