The NMR Structures of the Major Intermediates of the Two-domain Tick Carboxypeptidase Inhibitor Reveal Symmetry in Its Folding and Unfolding Pathways

There is a lack of experimental structural information about folding intermediates of multidomain proteins. Tick carboxypeptidase inhibitor (TCI) is a small, disulfide-rich protein consisting of two domains that fold and unfold autonomously through the formation of two major intermediates, IIIa and IIIb. Each intermediate contains three native disulfide bonds in one domain and six free cysteines in the other domain. Here we have determined the NMR structures of these two intermediates trapped and isolated at acidic pH in which they are stable and compared their structures with that of the native protein analyzed under the same conditions. Both IIIa and IIIb were found to contain a folded region that corresponds to the N- and C-terminal domains of TCI, respectively, with structures very similar to the corresponding regions of the native protein. The remainder of the polypeptide chains of the intermediates was shown to be unfolded in a random coil conformation. Solvent exchange measurements further indicated that the two protein domains are not completely independent, but affect each other in terms of dynamics and stability, in agreement with reported inhibitory activity data. The derived results provide structural evidence for symmetric TCI folding and unfolding mechanisms that converge in IIIa and IIIb and reveal the structural basis that accounts for the strong and simultaneous accumulation of both intermediates. Altogether, this work has important implications for a better understanding of the folding mechanisms of multidomain, disulfide-rich proteins.

Understanding the sequence of folding events that lead to a biologically active protein from its amino acid sequence is one of the major challenges in structural and molecular biology. The initial theoretical efforts devoted to the study of protein folding have been reinforced in the last several years by the discovery of a wide range of pathological diseases associated with protein misfolding and the increasing pharmacological interest in protein drug design (1,2). The structural characterization of partially folded intermediates and the analysis of the interactions that stabilize them are of fundamental importance to unveil the folding mechanisms (3,4). But these studies are hampered by the rapid and cooperative nature of protein folding, and therefore, the short half-life of the intermediates arising during the folding reaction. It is however possible to trap and isolate discrete intermediates from the oxidative folding of disulfide-rich proteins because of the particular chemistry of disulfide bond formation that takes place as an integral part of disulfide folding (5,6).
A large set of single-domain, disulfide-rich proteins has been investigated to date in terms of oxidative folding, including outstanding examples such as bovine pancreatic trypsin inhibitor (BPTI), ribonuclease A (RNase A), hirudin, insulin-like growth factor-1 (IGF-1), and conotoxins (7-11). These folding studies have involved the trapping (either by acidification or alkylation), isolation by liquid chromatography, and final disulfide analysis of the intermediates that accumulate. The extent of heterogeneity of the intermediates, together with the number and type (native or non-native) of their disulfide bonds, has been used as a general premise to classify the folding landscape of these proteins (12,13). Structural studies of disulfide folding intermediates have mostly focused on analogs of single-domain proteins, such as BPTI, RNase A, or lysozyme, with the cysteines of the missing disulfide bond(s) mutated to alanines or serines (14-19). Only a few genuine intermediates isolated directly from the folding reactions of BPTI, lysozyme, IGF-1, leech carboxypeptidase inhibitor, and the cyclotides kalata B1 and MCoTI-II have been structurally analyzed by NMR (20-26). However, many disulfide-rich proteins are of a multidomain nature, for which the folding processes are expected to be more complicated. There is therefore a need for structural studies of genuine intermediates of multidomain, disulfide-rich proteins to shed light on their folding mechanisms. Of special interest is to clarify to what extent the folding pathways of this kind of polypeptide are modulated by interdomain interactions.
Tick carboxypeptidase inhibitor (TCI) is a small protein of 75 residues that tightly inhibits metallocarboxypeptidases (MCPs) of the A/B subfamily with nanomolar K_i (27). Its strong inhibition of plasma CPB, also known as thrombin-activable fibrinolysis inhibitor (TAFI), stimulates the fibrinolysis of blood clots, making it a promising adjuvant for use in thrombolytic therapies based on the tissue-type plasminogen activator (28). The x-ray crystal structures of TCI in complex with bovine CPA, human CPB, and bovine TAFI have revealed the presence of two domains that are structurally similar despite sharing a low sequence homology (29,30). The NMR solution structure of free TCI at pH 5.5 has recently shown that the two globular domains are joined by a flexible linker (31). Each domain comprises three disulfide bonds of identical pattern that constrain the TCI structure, contributing to its extremely high stability against temperature and denaturing agents. The oxidative folding and reductive unfolding pathways of TCI and its individual domains have been examined in a previous study by acid-trapping and further chromatographic analysis of the occurring intermediates (32). TCI folding proceeds through a sequential oxidation of cysteines that leads to the formation of a complex population of non-native 6-disulfide (scrambled) isomers. The reshuffling of these forms into the native protein represents the strongest kinetic trap of the folding reaction. Importantly, the two domains of TCI appear to fold and unfold autonomously based on the predominant accumulation of two 3-disulfide intermediates (termed IIIa and IIIb) that contain, respectively, the three native disulfide bonds of the N- and C-terminal domains of TCI.
The current work reports the solution structures of the IIIa and IIIb intermediates trapped by acidification and purified from the folding process by reversed-phase liquid chromatography. The ensemble of NMR structures, determined at pH 3.5, showed for both species a native-like fold in the domain crosslinked by the three native disulfide bonds and an unfolded conformation in the other domain containing the six free cysteines. Amide proton exchange experiments revealed a decreased stability of native TCI at acidic pH and increased backbone dynamics in the folded domain of each intermediate with respect to the native form. This indicates that the two protein halves are not completely independent of each other in the folded state, which is in agreement with reported stability and inhibitory activity data. To our knowledge, this study represents the first structural determination of genuine folding intermediates of a two-domain, disulfide-rich protein, providing interesting insights into the molecular rules that account for symmetry in the oxidative folding and reductive unfolding of TCI.

EXPERIMENTAL PROCEDURES
Protein Expression and Purification-TCI was expressed and purified as previously reported (27). Briefly, the Escherichia coli strain BL21(DE3), transformed with the pBAT-4-OmpA-TCI plasmid, was grown at 37°C in M9 minimal medium (0.5% glycerol) supplemented with 0.2% casamino acids and induced at A600 = 1.0 with 1 mM isopropyl-β-D-thiogalactopyranoside (final concentration). The secreted TCI was refolded by overnight incubation of the supernatant with 2 mM cystine and 4 mM cysteine (final concentration) at pH 8.4 to obtain the maximum amount of native protein. The native TCI was initially purified by hydrophobic chromatography in a Sep-Pak C18 cartridge (Waters), followed by cation exchange chromatography in a Sep-Pak Accell Plus CM cartridge (Waters), and reversed-phase high performance liquid chromatography (RP-HPLC) in a 4.6-mm Protein C4 column (Vydac Grace). The protein was finally loaded onto a gel filtration column (Superdex Peptide, GE Healthcare) using 30% acetonitrile as solvent and kept lyophilized. Protein identity and purity were confirmed by mass spectrometry and automated Edman degradation. The concentration of TCI in solution was determined by measuring the absorbance at 280 nm and using a calculated absorption coefficient E0.1% = 0.97. The native inhibitor was fully active (>95%) as determined by titration with bovine carboxypeptidase A (Sigma), assuming an equimolar interaction between inhibitor and enzyme.
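The concentration determination described above can be illustrated with a minimal sketch. It assumes that E0.1% = 0.97 is the absorbance at 280 nm of a 1 mg/ml solution; the 1-cm path length default and the function name are assumptions made for illustration and are not stated in the protocol.

```python
def tci_concentration_mg_per_ml(a280, e_0_1_percent=0.97, path_cm=1.0):
    """Convert an absorbance reading at 280 nm into a concentration in mg/ml.

    e_0_1_percent is the absorbance of a 0.1% (1 mg/ml) solution; the text
    gives 0.97 for TCI. path_cm = 1.0 is an assumed cuvette path length.
    """
    if a280 < 0:
        raise ValueError("absorbance must be non-negative")
    return a280 / (e_0_1_percent * path_cm)


# Example: a reading of 0.485 corresponds to 0.5 mg/ml under these assumptions.
concentration = tci_concentration_mg_per_ml(0.485)
```

Converting this value to a molar concentration would additionally require the molecular mass of TCI, which is not given in this section.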
Oxidative Folding and Reductive Unfolding Experiments-Native TCI (N) was reduced and unfolded in 0.1 M Tris-HCl buffer, pH 8.4, containing 6 M guanidine thiocyanate and 100 mM dithiothreitol (DTT) for 2 h at 23°C. To initiate oxidative folding, the protein was passed through a PD-10 column (Sephadex G-25; GE Healthcare) previously equilibrated with 0.1 M Tris-HCl, pH 8.4, and immediately diluted to a final concentration of 0.5 mg/ml in the same buffer, both in the absence and presence of 0.25 mM 2-mercaptoethanol. The refolding reaction was monitored by removing aliquots of the sample at various time intervals and quenching with the same volume of 4% aqueous trifluoroacetic acid. Acid-trapped intermediates were subsequently analyzed by RP-HPLC using a linear 10-40% gradient of acetonitrile with 0.1% trifluoroacetic acid over 50 min in a 4.6-mm Jupiter C4 column (Phenomenex) at a flow rate of 0.75 ml/min. In the reductive unfolding experiments, native TCI (0.5 mg/ml) was dissolved at 23°C in either 0.1 M Tris-HCl, pH 8.4, or 0.1 M sodium acetate buffer, pH 4.5, containing different concentrations (5, 20, and 100 mM) of DTT or tris(2-carboxyethyl)phosphine (TCEP), respectively. To monitor the unfolding reaction, time course aliquots of the samples were trapped with 4% trifluoroacetic acid and similarly analyzed by RP-HPLC.
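The arithmetic behind the quenching and gradient conditions above can be sketched as follows. The function names are hypothetical, and the calculation assumes ideal volume additivity and a strictly linear gradient from 10 to 40% acetonitrile over 50 min.

```python
def acetonitrile_percent(t_min, start=10.0, end=40.0, duration_min=50.0):
    """Acetonitrile percentage at time t_min for a linear gradient."""
    if not 0 <= t_min <= duration_min:
        raise ValueError("time is outside the gradient window")
    return start + (end - start) * t_min / duration_min


def final_tfa_percent(sample_volume, quench_volume, quench_tfa_percent=4.0):
    """TFA percentage after mixing a sample with the quenching solution."""
    return quench_tfa_percent * quench_volume / (sample_volume + quench_volume)


# Gradient slope: (40 - 10) / 50 = 0.6% acetonitrile per minute.
midpoint = acetonitrile_percent(25.0)
# Quenching with an equal volume of 4% TFA halves the acid concentration.
quenched = final_tfa_percent(100.0, 100.0)
```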
NMR Sample Preparation-Approximately 2.5 mg of IIIa and IIIb intermediates and of fully reduced/unfolded protein (R) were obtained by reducing 15 mg of native TCI with 5 mM DTT, trapping the intermediates after 1 h of unfolding reaction with 4% trifluoroacetic acid, and isolating the species using RP-HPLC as detailed under "Oxidative Folding and Reductive Unfolding Experiments." The NMR samples (1 mM N, IIIa, IIIb, or R) were prepared by dissolving the lyophilized protein powder in 300 μl of H2O:D2O (19:1, ratio by volume), and adjusting the pH to 3.5 with concentrated HCl and NaOH. The TCI samples were analyzed at acidic pH to maintain the free cysteines in a reduced form and prevent their oxidation. For the amide proton exchange experiments, the samples prepared at pH 3.5 were lyophilized and resuspended in 300 μl of 99.98% D2O. Measurements were always conducted using 5-mm susceptibility-matched NMR tubes (Shigemi).
NMR Spectroscopy-NMR experiments were performed at 298 K on a Bruker AV 700 spectrometer. Homonuclear two-dimensional TOCSY (mixing time of 80 ms) and NOESY (mixing time of 150 ms) spectra were acquired both for the determination of the sequence-specific polypeptide backbone chemical shift assignments and for the structure calculations of the native and intermediate forms of TCI. NMR data were processed with the program TOPSPIN (Bruker Biospin), and the program NMRView (33) was used for interactive spectrum analysis. Chemical shifts were measured relative to 2,2-dimethyl-2-silapentane-5-sulfonate sodium salt used at 50 μM as internal reference for 1H. The stability of the native form and of the IIIa and IIIb intermediates during the time necessary to perform the NMR measurements was checked by comparing one-dimensional spectra recorded before and after the two-dimensional experiments. Within the resolution provided by these spectra, no significant changes in the frequencies of the signals were detected, suggesting that the conformation of the intermediates and the redox state of their cysteines had not changed over time. Furthermore, in the series of two-dimensional TOCSY spectra spanning 5 days to measure the exchange rates for each of the two intermediates, no frequency changes or appearance of new signals were observed, again indicating that the intermediates were stable over this time period.
Structure Calculation-The NMR-based constraints used for the structure calculation of TCI and the two intermediates were derived from the homonuclear two-dimensional NOESY spectra. Peak picking of the spectra was carried out manually, and peak volumes were determined using the automatic integration function of NMRView (33). In the spectra of the IIIa and IIIb intermediates, only peaks corresponding to the folded domains were picked and used for the structure calculation. The three-dimensional structures were determined by combined automated NOESY cross-peak assignment (34) and structure calculation with torsion angle dynamics (52) implemented in the program CYANA (53). The standard CYANA protocol of seven iterative cycles of NOE assignment and structure calculation, followed by a final structure calculation, was applied. The consistency between the automated NOE assignment and structure calculation was assessed with two criteria: first, that the chemical shift assignment should cover close to 90% of the protons, and second, that the backbone root mean square deviation (r.m.s.d.) to the mean for the structure ensemble after the first cycle of structure calculation should be smaller than 3 Å (36). Stereospecific assignments for some isopropyl methyls and methylene groups were determined by the GLOMSA method (54) before the final structure calculation by analyzing the structures obtained in the preceding seventh NOE assignment/structure calculation cycle. Pseudoatoms with appropriate distance corrections were used for distance restraints involving protons with no stereospecific assignment (37). A set of upper and lower distance limits was introduced for each pair of cysteine residues involved in disulfide bonds (2.1/2.0 Å for Sγ(i)-Sγ(j) and 3.1/3.0 Å for Cβ(i)-Sγ(j) and Sγ(i)-Cβ(j)).
The compatibility of the disulfide bond pattern identified in each protein (32) with the structures from the initial rounds of automated calculation was evaluated before introducing these restraints in the subsequent structure calculations. Several intercysteine NOE connectivities confirm the disulfide pairing of the intermediates. This pairing was consistent with the full set of measured NOEs, which produced three-dimensional structures in which the two cysteine side chains were at the distance necessary for the formation of the disulfide bond (see below). Weak constraints on (φ,ψ) torsion angle pairs and on side chain torsion angles between tetrahedral carbon atoms were used temporarily during the NOE assignment/structure calculation cycles to favor the allowed regions of the Ramachandran plot and staggered rotamer conformations, respectively (35). In each cycle, the structure calculation started from 100 randomized conformers, and the standard CYANA simulated annealing schedule was used with 10,000 torsion angle dynamics steps per conformer. The 20 conformers with the lowest values of the final CYANA target function were subjected to restrained energy minimization in explicit water using the AMBER 9.0 program (38). The TIP3P model was used to describe a 9-Å thick cubic box of water molecules surrounding the TCI molecule under periodic boundary conditions. Initially, the protein was subjected to 2000 cycles of restrained energy minimization in vacuum. In a second step, a standard protocol of preparation and equilibration of water was used. Finally, the whole system was minimized. The 20 minimized conformers were used to validate the final structure using the program PROCHECK-NMR (39). The program MOLMOL (40) was used to visualize the structures and prepare the figures.
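The disulfide distance limits quoted in this section (2.1/2.0 Å for Sγ-Sγ and 3.1/3.0 Å for Cβ-Sγ and Sγ-Cβ) can be expanded into explicit restraints per cysteine pair, as in the sketch below. The record layout and the residue numbers in the example are hypothetical placeholders; they are neither the CYANA input format nor the actual disulfide pairing of TCI.

```python
# Upper and lower distance limits in Å, keyed by (atom of Cys i, atom of Cys j).
DISULFIDE_LIMITS = {
    ("SG", "SG"): (2.1, 2.0),
    ("CB", "SG"): (3.1, 3.0),
    ("SG", "CB"): (3.1, 3.0),
}


def disulfide_restraints(cysteine_pairs):
    """Build three upper/lower distance restraints for each disulfide bond."""
    restraints = []
    for res_i, res_j in cysteine_pairs:
        for (atom_i, atom_j), (upper, lower) in DISULFIDE_LIMITS.items():
            restraints.append({
                "res_i": res_i, "atom_i": atom_i,
                "res_j": res_j, "atom_j": atom_j,
                "upper": upper, "lower": lower,
            })
    return restraints


# Hypothetical residue numbers, used only to show the shape of the output.
example = disulfide_restraints([(3, 31), (10, 27)])
```

Each disulfide bond therefore contributes three restrained atom pairs, so a domain with three bonds would carry nine such restraints.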
Amide Proton Exchange-The exchange of native TCI amide protons with solvent deuterons was measured at pH 3.5 and 298 K by dissolving the lyophilized sample in 300 μl of 99.98% D2O. NMR data acquisition was started within 15 min of the initiation of the exchange reaction. A series of TOCSY spectra (80 ms mixing time, 2048 complex data points, 512 t1 increments, 8 scans per increment) was collected over the course of twelve days. The acquisition time for each experiment was 1 h and 24 min. Amide exchange experiments for the IIIa and IIIb intermediates were acquired under the same conditions over a time period of 5 days. For the reduced/unfolded form, exchange of the amide protons was complete during the acquisition of the first TOCSY spectrum. All the spectra were processed with NMRPipe (41) using the same processing scheme and parameters. The amide proton exchange rate constants were obtained from the intensity of the HN(i)-Hα(i) cross-peaks corresponding to each residue in the TOCSY spectra. The decay in cross-peak intensities over time was fitted to a single exponential equation of the form: I = A × exp(−k_ex t) + C, where I represents the intensity of the cross-peak, A is the amplitude of the exchange curve, k_ex is the observed exchange rate, t is the time expressed in min, and C is a constant that takes into account the residual non-deuterated water. The intensities of the cross-peaks were measured, and the data were fitted, with the rate analysis module of NMRView (42). The fastest values of k_ex were estimated based on the duration of the first experiment, and assuming that we could not reliably measure a remaining intensity ≤ 10%. For both the native and intermediate forms of TCI we could measure k_ex ≤ 0.154 min⁻¹. According to the Linderstrøm-Lang model (43), the exchange reaction of a protected amide proton takes place when the protection is lost, and the proton is exposed to the solvent as a result of a structural fluctuation.
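The single-exponential fit described above can be sketched in a few lines. This is not the NMRView rate analysis module: it is a stand-in that, for each trial k_ex, solves A and C by linear least squares and then searches for the best k_ex with a golden-section search on log(k_ex). The search bounds, the function names, and the synthetic data are assumptions. The helper at the end shows that a 10% intensity cutoff combined with the 15-min delay before acquisition gives a limit close to the 0.154 min⁻¹ quoted in the text; linking the limit to that delay is our reading, not a statement from the source.

```python
import math


def _linear_fit_for_k(times, intensities, k):
    """For a fixed k, least-squares A and C in I = A*exp(-k*t) + C, plus the SSE."""
    x = [math.exp(-k * t) for t in times]
    n = len(x)
    sx, sy = sum(x), sum(intensities)
    sxx = sum(v * v for v in x)
    sxy = sum(v * y for v, y in zip(x, intensities))
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    c = (sy - a * sx) / n
    sse = sum((a * v + c - y) ** 2 for v, y in zip(x, intensities))
    return a, c, sse


def fit_exchange_decay(times, intensities, k_min=1e-6, k_max=1.0, iterations=200):
    """Fit I = A*exp(-k_ex*t) + C; returns (A, k_ex, C) with k_ex in 1/min."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    lo, hi = math.log(k_min), math.log(k_max)
    for _ in range(iterations):
        m1 = hi - ratio * (hi - lo)
        m2 = lo + ratio * (hi - lo)
        sse1 = _linear_fit_for_k(times, intensities, math.exp(m1))[2]
        sse2 = _linear_fit_for_k(times, intensities, math.exp(m2))[2]
        if sse1 < sse2:
            hi = m2  # minimum lies in [lo, m2]
        else:
            lo = m1  # minimum lies in [m1, hi]
    k_ex = math.exp(0.5 * (lo + hi))
    a, c, _ = _linear_fit_for_k(times, intensities, k_ex)
    return a, k_ex, c


def fastest_measurable_rate(delay_min, min_fraction=0.10):
    """Largest k for which at least min_fraction of the signal survives delay_min."""
    return -math.log(min_fraction) / delay_min


# Synthetic, noise-free decay sampled every 84 min after a 15-min delay.
times = [15.0 + 84.0 * i for i in range(40)]
data = [100.0 * math.exp(-0.002 * t) + 5.0 for t in times]
amplitude, k_ex, offset = fit_exchange_decay(times, data)
```

With real, noisy spectra a dedicated nonlinear least-squares routine that also reports parameter uncertainties would be preferable to this simple search.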
There are two limiting mechanisms for the exchange (44): EX1, when each opening event leads to exchange, and EX2, when exchange itself is the rate-limiting step of the exchange process. The exchange mechanism of native TCI at pH 5.5 is of the EX2 type (31); therefore, we may assume that the same occurs at pH 3.5. Under an EX2 exchange mechanism, the apparent free energies of exchange can be calculated from the equation ΔG_ex = −RT ln(k_ex/k_int), where R is the gas constant and T the absolute temperature. The intrinsic exchange constant, k_int, was calculated for every amino acid of the TCI sequence using the SPHERE web tool.
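The EX2 relation ΔG_ex = −RT ln(k_ex/k_int) can be evaluated as in the following sketch. The gas constant is expressed in kcal mol⁻¹ K⁻¹ so that the result is in kcal/mol; the rate values in the example are invented for illustration and are not taken from the supplemental tables. Both rates must be given in the same units.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # kcal/(mol*K)


def delta_g_ex(k_ex, k_int, temperature_k=298.0):
    """Apparent free energy of exchange (kcal/mol) under the EX2 regime."""
    if k_ex <= 0 or k_int <= 0:
        raise ValueError("rate constants must be positive")
    return -GAS_CONSTANT_KCAL * temperature_k * math.log(k_ex / k_int)


# Illustrative values only: an amide exchanging 1000-fold more slowly than
# its intrinsic (unprotected) rate at 298 K.
example_dg = delta_g_ex(k_ex=1.0e-3, k_int=1.0)
```

A protection factor of 1000 thus corresponds to roughly 4.1 kcal/mol at 298 K, and each further 10-fold slowing adds about 1.4 kcal/mol.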

RESULTS
Oxidative Folding and Reductive Unfolding of TCI-The oxidative folding of fully reduced and unfolded TCI (R) was carried out in Tris-HCl buffer at pH 8.4 in the absence and presence of 0.25 mM 2-mercaptoethanol. The folding reaction was monitored by acid-trapping of the occurring intermediates and further RP-HPLC analysis as described under "Experimental Procedures." A high number of intermediates populate the initial stages of the oxidative process (up to 5 h), with similar RP-HPLC chromatograms regardless of the presence or absence of reducing agent (Fig. 1A). The addition of 2-mercaptoethanol only influences the final steps of the folding process by promoting disulfide rearrangement of scrambled isomers into the native form (N). This effect greatly increases the efficiency of the reaction, with more than 80% of protein recovered as native TCI after 72 h, compared with ~20% obtained in the absence of reducing agent. As previously shown (32), TCI refolds through the sequential formation of 1-, 2-, 3-, 4-, 5-, and 6-disulfide intermediates that finally render two products: the Xa scrambled isomer and the native form. This sequential oxidation of disulfide bonds is thought to take place autonomously for each domain, leading to the predominant accumulation of two major intermediates (IIIa and IIIb), which contain three native disulfide bonds in one domain and six free cysteines in the other one (32). The reductive unfolding of TCI also seems to proceed autonomously through the formation of the IIIa and IIIb intermediates (Fig. 1B). The reactions performed in Tris-HCl buffer, pH 8.4, using DTT as reducing agent lead to an almost equivalent accumulation of both species without the presence of any other detectable intermediate. The use of the alternative reducing agent TCEP in acetate buffer, pH 4.5, significantly alters the ratio of IIIa and IIIb, diminishing the accumulation of the former intermediate.
This is explained by the lower working pH, which causes reductive reactions to prevail over disulfide reshuffling, slowing down and hindering the conversion of IIIb into IIIa. Previous Stop/Go experiments demonstrated the fluctuation of IIIb into IIIa at alkaline pH (32). On the other hand, both intermediates reduce their three native disulfide bonds in a cooperative manner following an "all-or-none" mechanism, with IIIb resisting slightly higher concentrations of reducing agent than IIIa (data not shown). This result is in agreement with the higher accumulation of IIIb versus IIIa in the reductive unfolding conducted at acidic pH.
OCTOBER 3, 2008 • VOLUME 283 • NUMBER 40
JOURNAL OF BIOLOGICAL CHEMISTRY 27113

NMR Structural Characterization of Native and Intermediate Forms of TCI-The low field region of the one-dimensional 1H NMR spectrum of the native, intermediate, and reduced/unfolded forms of TCI at pH 3.5 is shown in Fig. 2. The fully reduced and unfolded state shows most of the amide signals at 8-8.5 ppm, the typical range of random coil polypeptide chains, indicating no or little remaining structure. The IIIa and IIIb intermediates show some spectral features similar to those of the native form plus an increased intensity in the random coil region, indicating the coexistence of both natively folded and unstructured regions. The assignment of the backbone 1H resonances of native TCI was performed following the standard sequential assignment strategy using two-dimensional homonuclear proton spectroscopy (45). The assignment of the backbone proton resonances of native TCI is complete, except for Asn1, whose amine protons are not observed. Further analysis of the TOCSY and NOESY spectra permitted a nearly complete 1H side chain assignment. Overall, the assignment covers 90.2% of the chemical shifts of non-labile protons. Two distinct sets of cross-peaks were observed in the two-dimensional spectra of the intermediates. A well resolved set of signals could be unambiguously assigned to one of the two domains of TCI: the resolved signals in the spectrum of IIIa to the N-terminal domain and those in the IIIb spectrum to the C-terminal domain. The remaining signals, with chemical shifts close to random coil values and generally sharper than the others, were not resolved enough to achieve an unambiguous sequential assignment. Nevertheless, the identification of the spin system was possible for all the peaks. In intermediate IIIa we found 12(CHWDNS), 5G, 6(EQ), 6(KR), 2A, 3L, and 1V, and for intermediate IIIb we identified 16(CFYDNS), 4G, 4(EQ), 3(KR), 1A, 3L, 2V, and 2T. These results are consistent with the amino acid composition of the unassigned polypeptide chain and confirm the assignment of the folded region of the two intermediates.
The overall assignment of the non-labile protons covers 93.8 and 91.9% of the folded domains of IIIa and IIIb, respectively. For all Xxx-Pro bonds present in the native and intermediate forms of TCI, a trans conformation was confirmed by intense Xxx(Hα)-Pro(Hδ) sequential NOEs (45).
The chemical shifts of the assigned residues in the folded regions of the folding intermediates are very similar to the corresponding values in the spectrum of native TCI, suggesting that they have similar structures. This is confirmed by the observation that the side chain proton Hε1 of Trp73 has the same resonance frequency in N and IIIb, while it is closer to its random coil value in R and IIIa (Fig. 2). Also, the signal of the Hα of Tyr24 has a highly anomalous chemical shift both in native TCI and IIIa but not in R or IIIb, indicating a similarly folded structure around this residue in the first pair of species but not in the second one. The plot of the differences in the Hα chemical shifts of native TCI at pH 3.5 and pH 5.5 (supplemental Fig. S1A) demonstrates that the structure of this protein is almost identical at the two pH values, with only a few residues showing slight differences due to the proximity of side chains with acid groups that titrate in this pH range. The same figure showing the differences between the folded parts of IIIa and IIIb and the equivalent regions of N at pH 3.5 demonstrates that the structures of the folded domains in the intermediates are very similar to those in the native conformation (supplemental Fig. S1B). Tertiary NOEs from different regions of the two-dimensional NOESY spectra, diagnostic of the three-dimensional structure of the TCI domains, further confirm this assessment. Cross-peaks corresponding to NOEs between residues found far away in the sequence can be identified both in the intermediate and native forms of TCI (supplemental Fig. S2).
Solution Structure of Native and Intermediate Forms of TCI-The reliable identification of the signals from the folded and unfolded regions in the intermediates, together with the large dispersion of resonances corresponding to the folded domains in the intermediates and in native TCI, allowed automatic NOE assignments and structure calculations for the folded parts of the three species (see "Experimental Procedures"). These calculations were done solely on the basis of the homonuclear NOESY spectra recorded at pH 3.5. Of the total NOESY cross-peaks, 87, 83, and 81% could be assigned by the program CYANA for native TCI, IIIa, and IIIb, respectively. Statistics about the quality and precision of the 20 best NMR conformers that represent the solution structures of native TCI and the folded domains of the intermediates are summarized in Table 1. The ensembles of NMR structures are well defined and are highly consistent with the experimental data, with no distance constraint violations larger than 0.35 Å. The precision of the structures is characterized by low r.m.s.d. values to the mean coordinates for the backbone and for all heavy atoms, excluding the unstructured regions at the polypeptide chain termini (residues 1 and 74-75) and at the interdomain linker (residues 37-38). The quality of the structures is also reflected by a high percentage of (φ,ψ) backbone torsion angle pairs being found in the most favorable or additionally allowed regions of the Ramachandran plot. The solution structure of native TCI at pH 3.5 (Fig. 3A) is essentially the same as that determined recently at pH 5.5 by manual NOE assignment (PDB entry 2JTO; (31)), within the precision that could be achieved with the assigned NOEs (supplemental Table S1).
Also, the structures of the N-terminal domain of IIIa and the C-terminal domain of IIIb are very similar to the equivalent regions of native TCI, although the precision is not as high in these two cases due to the smaller number of assigned NOEs (Fig. 3B). As already described, the structure of TCI consists of a one-turn α-helix (α1N or α1C) and a central triple-stranded antiparallel β-sheet (β1N, β2N, and β3N or β1C, β2C, and β3C) in each of the two domains. The C-terminal domain contains an additional two-turn α-helix between the last two β-strands (α2C).
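The precision measure used above, the r.m.s.d. to the mean coordinates, can be sketched as follows. This simplified version assumes that the conformers have already been superimposed and that each one lists the same atoms in the same order; it omits the superposition step and is not the routine used to produce Table 1. The function name and the toy coordinates are hypothetical.

```python
import math


def rmsd_to_mean(ensemble):
    """Average r.m.s.d. (same units as the input) of conformers to their mean.

    ensemble: list of conformers, each a list of (x, y, z) tuples for the
    selected atoms (for example backbone atoms of the structured residues).
    """
    n_conf = len(ensemble)
    n_atoms = len(ensemble[0])
    # Mean coordinates of every atom over all conformers.
    mean = [
        tuple(sum(conf[a][d] for conf in ensemble) / n_conf for d in range(3))
        for a in range(n_atoms)
    ]
    per_conformer = []
    for conf in ensemble:
        squared = sum(
            (conf[a][d] - mean[a][d]) ** 2
            for a in range(n_atoms)
            for d in range(3)
        )
        per_conformer.append(math.sqrt(squared / n_atoms))
    return sum(per_conformer) / n_conf


# Toy ensemble: two conformers of two atoms, displaced by 2 Å along x.
toy = [[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)], [(2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]]
precision = rmsd_to_mean(toy)
```

Restricting the atom selection to the structured residues, as done in the text by excluding the termini and the interdomain linker, keeps flexible segments from dominating the value.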
Amide Proton Exchange of Native and Intermediate Forms of TCI-Fig. 4 shows the first TOCSY spectra recorded after the exchange reaction was initiated for native TCI and the two intermediates. A total of 41, 23, and 13 NHs were observed in the spectra of native TCI, IIIa, and IIIb, respectively, most of them located in the β-sheet that forms the central protein core of the two globular domains. Only residues from folded domains were observed for the intermediates, with those in the unfolded region exchanging too fast to be detected. The exchange rates, k_ex, could be measured for 33, 16, and 9 residues of N, IIIa, and IIIb, respectively. These values could not be determined for the other residues for different reasons: some of them exchanged so fast that they could not be observed in the first TOCSY spectrum, and for others the decay of the signals was too rapid, and the curve could not be fitted to yield a reliable estimate of the exchange rate. The exchange rate constants for the three TCI forms are given in supplemental Tables S2, S3, and S4. Native TCI shows essentially the same relative pattern of exchange along the sequence at pH 3.5 and 5.5 (31), although the exchange rate could be measured for three more residues at pH 3.5 because of the slower rate of exchange occurring at this pH. Backbone amide proton exchange with the solvent provides a window for observing the dynamics of the main chain, from the closed incompetent state for exchange to the open competent state (46). Compared with the data at pH 5.5, there is a general decrease in the exchange rates measured at pH 3.5 for native TCI. This is, however, due to the intrinsic retardation of the exchange at this more acidic pH.
When the exchange rates are normalized by the intrinsic values (k_int), there is a general increase in the exchange rates at pH 3.5, suggesting increased backbone dynamics in the time scale at which the exchange with the solvent deuterons takes place. The same effect is qualitatively observed when comparing the folded domains of the intermediates with the corresponding regions of the native protein at the same pH, indicating that the presence of unfolded chains increases the backbone dynamics of the folded regions.

Under our working conditions (EX2 exchange regime, see "Experimental Procedures"), the exchange rates can be used to calculate the equilibrium constant between the open and closed forms for each residue of the protein, and the corresponding free energy of exchange (ΔG_ex), which in this case is equivalent to the free energy of the opening reaction. The calculated ΔG_ex values for native TCI and for the folded domains of the intermediates are represented along the TCI sequence in Fig. 5. Assuming that the free energy of exchange of the slowest exchanging amide proton corresponds to the free energy of the global unfolding event, native TCI is about 2.2 kcal/mol less stable at pH 3.5 than at pH 5.5 (see also supplemental Table S2 and Ref. 31). In addition, the amide proton exchange data do not reveal any significant difference in stability between the N- and C-terminal domains of the native protein.

DISCUSSION
Protein domains are evolutionarily independent units that may constitute a single-domain polypeptide on their own or recombine with others to form part of a multidomain protein. Although ~75% of eukaryotic proteins contain more than one domain (47), there is limited information about the folding mechanisms of individual domains from multidomain proteins. The folding/unfolding transitions of two-domain proteins can be described by three different models: (i) the two-state model, with the two domains either folded or unfolded; (ii) the autonomous folding/unfolding of the two domains; and (iii) the folding/unfolding of interacting domains exhibiting mutual effects on their stability and/or kinetic behavior. In the first case no intermediate state exists between the fully unfolded and completely folded protein, whereas in the two latter cases one or more partially folded intermediates populate the folding reaction. In most cases, the presence of these intermediates can be inferred only by monitoring deviations from two-state behavior at equilibrium or in time-resolved folding/unfolding experiments using indirect structural probes such as intrinsic fluorescence or ellipticity. Deciphering the conformation of each domain in the intermediates arising along the folding pathway would clearly provide a more direct assessment of their folding mechanism. Therefore, studies of multidomain proteins stabilized by disulfide bonds might provide considerable advances in this field.
The particular two-domain architecture and disulfide cross-linking of TCI make it an attractive system for structural and folding studies. We have recently reported the high-resolution structure of free TCI in solution at pH 5.5 (31). The ensemble of NMR structures shows two globular and compact domains separated by a two-residue linker. The conformation of this short segment is defined only by local constraints and displays an increased flexibility as compared with the two protein domains. In addition, we have previously characterized the oxidative folding and reductive unfolding of TCI and its dissected domains (32). The comparative folding analysis of the whole protein and its individual domains suggested that the folding pathway of this two-domain protein results from the additive contribution of its individual N- and C-terminal domains, which appear to fold autonomously. The folding and unfolding routes converge in the formation of two major intermediates (IIIa and IIIb) that contain, respectively, the three native disulfide bonds of the N- and C-terminal domains of TCI (see Fig. 6). In the present work, we have determined the solution structures of these two intermediates at pH 3.5 by NMR spectroscopy and compared them with that of native TCI analyzed under the same acidic conditions. The results reveal a native-like structure for the N- and C-terminal domains of IIIa and IIIb, respectively, as well as a random coil conformation for the accompanying halves. The structures solved here further support the autonomous folding and unfolding ability of both TCI domains. It is common that when two domains fold/unfold autonomously in multidomain proteins, only one of them, either the most stable or the one with faster folding kinetics, but always the same, is able to keep the native structure in the intermediates (48). As far as we know, symmetric situations like those observed for TCI intermediates have not been described before, which broadens the possible folding scenarios in multidomain proteins.
The inset in each spectrum corresponds to the three-dimensional structure of the protein, highlighting in black the residues whose cross-peaks remain after exchange in the corresponding TOCSY spectrum. A few false peaks that result from artifacts in the spectra are labeled with asterisks.
Despite their high level of structural similarity (r.m.s.d. of 1.8 Å for backbone atoms) and identical disulfide bond pattern, the two domains of TCI fold at very different rates and through distinct mechanisms (32). This behavior is reminiscent of the cases of BPTI and tick anticoagulant peptide, two small, single-domain proteins with similar structures and disulfide bonds but with completely different folding pathways (49). The folding of the C-terminal domain of TCI involves the formation of a smaller number of intermediates and is much faster than that of the N-terminal one (32). In contrast, the N-terminal half displays a significantly higher number of solvent exchange protected residues, especially at the β1 and β2 strands and the turn in between. Also, in general, the residues at equivalent spatial positions of the N-terminal domain are more hydrophobic than those in the C-terminal region (e.g. Phe8/Gly45, Gly9/Glu46, Leu11/Asn48, Ala20/Lys55, and Tyr20/Glu60), suggesting that the formation of a hydrophobic core is favored in the former domain. The contact order, defined as the normalized average sequence separation between interacting residues in the folded state (50), is a measure of local versus long-range interactions in the native-state structure. This parameter is small for proteins stabilized mainly by local interactions but large when residues in a protein interact frequently with partners far away in the sequence. The contact order of the C-terminal domain is 22% while that of the N-terminal one is 19%, indicating a slightly higher extent of local contacts in the native state of the latter. Although this concept was developed for two-state folding proteins, it could still be useful to compare the formation of the first folding nucleus in multistate folding proteins. In the case of TCI, it would suggest that the N-terminal domain could adopt a compact structure earlier than the C-terminal one.
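The contact order defined above can be illustrated with a short calculation. The sketch below assumes the usual relative contact order form, in which the sequence separations of all contacting residue pairs are summed and divided by the number of contacts times the chain length. The function name and the toy contact lists are hypothetical; how contacts were identified for the TCI domains (for example, the distance cutoff) is not stated here, so the contact list is taken as a given input.

```python
def relative_contact_order(contacts, n_residues):
    """Relative contact order as a fraction (multiply by 100 for percent).

    contacts   : iterable of (i, j) residue-number pairs that are in
                 contact in the folded structure
    n_residues : number of residues in the chain or domain
    """
    contacts = list(contacts)
    if n_residues <= 0:
        raise ValueError("n_residues must be positive")
    if not contacts:
        raise ValueError("at least one contact is required")
    # Sum of sequence separations over all contacting pairs
    total_separation = sum(abs(i - j) for i, j in contacts)
    # Normalize by the number of contacts and by the chain length
    return total_separation / (len(contacts) * n_residues)


# Toy 20-residue chain (hypothetical contacts, not TCI data)
mostly_local = relative_contact_order([(1, 3), (2, 4)], 20)
long_range = relative_contact_order([(1, 4), (2, 10), (5, 20)], 20)
```

In this toy case the chain with only short-range contacts gives a lower value than the one with long-range contacts, which is the sense in which a lower contact order indicates a larger share of local interactions.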
How can we correlate this information with the folding of TCI? For the C-terminal domain, the intermediates that accumulate during the initial stages of the folding performed in the absence of redox agents correspond to a reduced number of three-disulfide intermediates, whose composition does not evolve significantly over time (32). The addition of a redox compound promoting disulfide reshuffling causes a strong acceleration of the folding process. Loosely packed conformations with high backbone dynamics would favor this kind of folding reaction. By contrast, the individual N-terminal domain folds more slowly through the accumulation of 1-, 2-, and 3-disulfide intermediates that finally render two products: the native domain and a predominant Xa-like scrambled isomer (32). The burial of particular disulfides into a more compact initial structure with reduced conformational dynamics and disfavored disulfide exchange might promote this effect. As discussed above, the faster compaction of the N-terminal domain could promote a faster and preferential formation of native disulfides in a fraction of the polypeptide population at the beginning of the folding reaction, thus explaining why IIIa and IIIb coexist even if the folding of the C-terminal domain is much faster and more efficient.
Based on the exchange data, the backbone dynamics of the individual TCI domains are higher in the intermediates than in the native protein at the time scale at which exchange takes place. Therefore, the presence of unfolded regions in IIIa and IIIb increases the backbone dynamics of their folded regions. This indicates that although the respective folding pathways are controlled by the primary sequence of each domain, the two moieties are not completely independent of each other in the final folded structure, providing TCI with additional conformational stability relative to that displayed by the individual domains. This finding is consistent with the stability and inhibitory properties of the intermediates, for which the unfolded domain has a destabilizing effect on the folded part and significantly contributes to the inhibition of metallocarboxypeptidases (32). The higher dynamics of the C-terminal domain with respect to the N-terminal one suggest that the motional mode could be important for the functionality of the inhibitor because a rapid sampling of the conformational space to select the most suitable conformation would enhance the affinity for the target protease, especially in the first binding steps. Accordingly, the structure of the C-terminal half undergoes the largest conformational changes upon insertion of the C terminus into the active site groove of human CPB (31).
On the other hand, it seems clear that both domains can attain their native structure autonomously because no stable intermediates with disulfide bonds cross-linking the two TCI halves are observed. A recent study analyzing the relationship between packing density as well as interface structure and interdependence in the folding of domains from multidomain proteins suggests that the domains displaying a flexible linker and a small interface tend to fold autonomously, whereas the folding of domains sharing a large interface is coupled to some extent (48). In agreement with this report, the autonomous assembly of TCI would be facilitated by the presence of a short but flexible linker connecting the two globular domains and the absence of contacts between them. However, Stop/Go experiments reveal that the folding of the N-terminal domain becomes more effective when covalently linked to the folded C-terminal part (in IIIb) (32), and vice versa for the unfolded C-terminal domain (in IIIa), thus indicating that at least in kinetic terms they mutually influence each other.
FIGURE 6. Schematic representation of the oxidative folding and reductive unfolding of TCI. The pathways of oxidative folding and reductive unfolding are shown by solid and dashed arrows, respectively. R and N indicate the fully reduced/unfolded and native forms of TCI. XS are ensembles of molecules with X number of disulfide bonds. IIIa and IIIb are two major intermediates with three native disulfide bonds at the N- and C-terminal domain, respectively. Xa is a predominant six-disulfide scrambled isomer accumulating at the end of the folding reaction. Three-dimensional models with the corresponding ribbon structures and disulfide bond pairings are represented for R, IIIa, IIIb, and N.
In any case, the autonomous folding of TCI domains renders a quite efficient global folding process, probably much faster than a folding reaction involving the formation of interdomain disulfide bonds that would imply an increase in the intermediate complexity of the folding landscape.
For small proteins under an EX2 exchange regime, it is assumed that the highest ΔGex value estimated by amide proton exchange experiments corresponds to the overall unfolding ΔGu value calculated from the unfolding transition curve obtained with denaturants or heat (51). The data obtained herein indicate that once folded, the two TCI domains display only marginal differences in their conformational stability. In addition, their disulfide bonds are similarly protected: Cys3, Cys16, Cys31, and Cys32 in the N-terminal domain, together with Cys40, Cys64, Cys70, and Cys71 in the C-terminal one expose less than 10% of their surface to solvent. This locking in of disulfides explains why both intermediates can be largely and simultaneously detected during the oxidative folding and reductive unfolding of TCI, giving rise to symmetric folding and unfolding routes as schematically shown in Fig. 6. The reductive unfolding reactions carried out at acidic pH show a slightly higher prevalence of IIIb (with a folded C-terminal domain) over IIIa (with a folded N-terminal part). Previous unfolding experiments on the dissected TCI domains already pointed out this result (32). Although the protection of the C-terminal domain is globally lower than that of the N-terminal one, the particular protection of the longer β3 strand located at the C terminus, with two consecutively buried cysteines, could explain the higher disulfide stability of this domain. The autonomous unfolding behavior of the two TCI domains offers an obvious advantage to the protein in vivo: the unfolding of the N-terminal domain will not promote the unfolding of the adjacent C-terminal half, which would otherwise result in an unfolding cascade that could compromise the functionality of the inhibitor. This property might be important for a molecule designed to function in a harsh environment like blood, where it stimulates the fibrinolysis of clots during the parasitic infection by inhibition of TAFI.