NMR Elucidation of Early Folding Hierarchy in HIV-1 protease

Folding studies on proteases by conventional hydrogen exchange experiments are severely hampered because the autolytic reaction interferes with the interpretation of the exchange data. Against this background, we report here the NMR identification of the hierarchy of early conformational transitions (folding propensities) in HIV-1 protease, obtained by systematically monitoring the changes in the state of the protein as it is subjected to different degrees of denaturation by guanidine hydrochloride. Secondary chemical shifts, HN-Hα coupling constants, 1H-15N NOEs and 15N transverse relaxation parameters have been used to report on the residual structural propensities, motional restrictions, conformational transitions, etc., and the data suggest that even under the strongest denaturing conditions (6M guanidine) hydrophobic clusters as well as different native and non-native secondary structural elements are transiently formed. These constitute the folding nuclei, which include residues spanning the active site, the hinge region and the dimerization domain. Interestingly, the proline residues influence the structural propensities, and the small amino acids Gly and Ala enhance the flexibility of the protein. On reducing the denaturing conditions, partially folded forms appear. The residues showing high folding propensities are contiguous along the sequence at many locations or are in close proximity in the native protein structure, suggesting a certain degree of local cooperativity in the conformational transitions. The dimerization domain, the flaps and their hinges seem to exhibit the highest folding propensities. The data suggest that even the early folding events may involve many states near the surface of the folding funnel.


INTRODUCTION
The folding of a protein is conceptually described in terms of a folding funnel (1-6). The narrow end of the funnel represents the folded native state, and the broad end the unfolded state, consisting of millions of rapidly interconverting conformers. As a protein folds from an unfolded state, it goes through one or more partially folded intermediates, which need to be characterized to elucidate its folding pathways. Experimentally, this is a very challenging task. The most common and direct approach for this purpose relies on kinetic pulse labeling experiments of amide protons coupled with hydrogen exchange at different time points along the folding reaction of the protein (reviewed in ref. 7). However, as has been pointed out (8), this approach also has limitations: first, it requires detectable protection against exchange; second, a lack of protection does not necessarily imply absence of structure; and third, it biases the interpretation of the structure in intermediates toward the native state. Even so, useful information has been obtained in many protein systems (7). This approach is, however, complicated in the case of proteases with autolytic activity, since the autolytic reaction interferes with interpretation of the hydrogen exchange data. Therefore, in these systems it becomes necessary to look for alternative experimental avenues.
Several equilibrium, real-time NMR and kinetic pulse labeling studies (8-22) have indicated that in many proteins the local structural features of the kinetic intermediates have much in common with the partially unfolded states created by chemical denaturants such as urea or guanidine, or by extreme pH conditions (15-22). This means that characterization of the partially unfolded states created by denaturants can provide useful insights into the structural features of the kinetic intermediates. Further, the progressive folding of a protein will be associated with significant changes in its internal dynamics at all time scales (picoseconds to milliseconds). The fully unfolded state is highly dynamic, with motions occurring mostly on picosecond time scales. Any restriction in the motions implies transient ordering of the polypeptide chain. As the protein starts to fold, more and more structure forming-breaking events (millisecond to microsecond time scale) occur, and these lead to an increase in the slow motions. Thus a systematic monitoring of these graded changes in the motional characteristics, as well as in the residual structures along the polypeptide chain, under different conditions of denaturation provides very valuable information on the hierarchy of folding propensities in a protein. Following these ideas, we describe below the NMR identification of the hierarchy of folding propensities in HIV-1 protease using a tethered dimer construct of the protein in which the two monomers are joined head-to-tail by a flexible linker, GGSSG. This protein, hereafter referred to as HIVTD, folds similarly to the native homodimer in vitro (23, 24) and also has similar activity towards the substrates (unpublished results).

NMR experiments
Isotopically (15N) labeled HIV-1 protease tethered dimer for the NMR experiments was prepared as described earlier (25). 1  collected with 3 sec relaxation delay and 2 sec presaturation of the protons. The equilibrium experiment was performed with a 5 sec relaxation delay. The NOEs were calculated as peak intensity ratios, I_sat / I_eq, where I_sat is the peak intensity in the spectrum with proton saturation and I_eq is the peak intensity in the equilibrium experiment. The experiments were carried out using the pulse sequences described by Farrow et al., 1994 (26). All the experiments were performed on a 600 MHz Varian Unity Plus spectrometer and the data were processed using FELIX on an SGI workstation.
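The NOE calculation described above can be sketched in a few lines of Python. This is only an illustration of the I_sat / I_eq ratio: the function names, the residue labels used as keys, and the peak intensities are hypothetical, and the noise propagation is the standard first-order formula for a ratio, which is an assumption and not something stated in the methods above.

```python
import math

def hetero_noe(i_sat, i_eq):
    """Return the steady-state NOE as the intensity ratio I_sat / I_eq."""
    if i_eq == 0:
        raise ValueError("equilibrium intensity must be non-zero")
    return i_sat / i_eq

def noe_uncertainty(i_sat, i_eq, noise_sat, noise_eq):
    """First-order error propagation for the ratio (assumed, not from the paper)."""
    noe = hetero_noe(i_sat, i_eq)
    return math.sqrt(noise_sat ** 2 + (noe * noise_eq) ** 2) / abs(i_eq)

# Hypothetical peak intensities (arbitrary units): residue -> (I_sat, I_eq).
peaks = {"L24": (4.0e5, 8.0e5), "G40": (-1.0e5, 5.0e5)}
noes = {res: hetero_noe(i_sat, i_eq) for res, (i_sat, i_eq) in peaks.items()}

for res, value in noes.items():
    print(f"{res}: NOE = {value:+.2f}")
```

A negative ratio, as for the second hypothetical residue, simply means the saturated peak has the opposite sign to the equilibrium peak, which is commonly seen for highly flexible segments.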
Analysis of the primary structure of HIVTD to locate hydrophobic clusters was carried out using the program HCA_Draw (27).

NMR spectral features of the denatured states of HIVTD
We first monitored the changes in the state of the protease when the guanidine denaturant concentration was systematically decreased from 6M to 1M (Fig. 1A). At 6M, the 1H-15N HSQC spectrum was characteristic of an unfolded state. As the guanidine concentration was reduced to 5M, the 6M peaks were still present, but additional peaks appeared (see below). The protein was, however, still intact (Fig. 1A). At 3M guanidine, a coexistence of folded (or partially folded) and unfolded species was seen, and as the concentration was further reduced the protein started showing protease activity: characteristic tryptophan side chain peaks of auto-cleavage products were seen in the HSQC spectrum, and auto-cleavage products were also seen by gel electrophoresis. The autolytic activity increased progressively as the guanidine concentration was further reduced. Therefore, to gain mechanistic insight into the early hierarchical folding events in the protein, we investigated the structural and dynamical characteristics of the protein at 6M and 5M guanidine concentrations, under which conditions the protein still remains intact. spectrum is as must be expected from the equivalence of residues in the two halves of the tethered dimer. All the peaks present in the 6M spectrum are also present in the 5M spectrum at almost identical positions, barring a few which show small shifts. This allowed an easy transfer of assignments (6M assignments have already been reported (25)). Besides, there are several weaker peaks, which suggest the presence of some partially folded forms. Keeping in mind the fact that the denatured and partially folded states are highly dynamic and heterogeneous, one can envisage that the partially folded states could be in slow exchange with the '6M denatured state'.
Though, intuitively, one may assign these additional peaks to their nearest neighbors, there could be more than one partially folded species existing simultaneously in solution and the peaks could correspond to different ones. The presence of these peaks indicates that the state identified by the conserved peaks in the two spectra would have differences in the dynamical characteristics under 6M and 5M guanidine conditions. This is also evident from the fact that the peak resolution in the 5M spectrum is much less, which must be attributed to reduced coupling constants or increased line width due to conformational exchange or both. We have monitored these differences using a variety of NMR parameters, which provide valuable insight into the folding conformational transitions in the protein.

Local Structural preferences in 6M guanidine: folding nuclei
A large body of evidence in the literature indicates that the denatured state is not a random coil, but contains some residual structure or at least some local structural preferences in an otherwise heterogeneous dynamic model (28). Certain regions of (φ,ψ) dihedral angle space may exhibit higher probabilities than others in the Ramachandran plot (29). These are believed to be the folding nuclei, or the regions where the initial folding events occur in the polypeptide chain.
Detailed characterization of the residual structures in the denatured states of proteins is generally obtained from residue-wise (Cα, Hα, Cβ, CO) chemical shift deviations from random coil values (secondary shifts), HN-Hα coupling constants, amide proton temperature coefficients and, in favourable cases, the amide proton protections against deuterium exchange (30). Although both the secondary chemical shifts and the HN-Hα coupling constants reflect the secondary structural propensities, the former are much more sensitive and reflect even very small population differences in the (φ,ψ) space. On the other hand, the amide temperature coefficients (less than ~7 ppb/K) indicate hydrogen bonding and thus report on the presence of persistent structures (30). In the present case we monitored the carbon secondary chemical shifts, the HN-Hα coupling constants and the amide proton temperature coefficients in HIVTD, and a part of their analysis has been described earlier (25). As mentioned in that paper, the sequence At this stage we may ask whether the tethering of the two monomers by the flexible linker GGSSG in HIVTD has any influence on the structural propensities. One may expect to observe these influences at the C-terminus of the first monomer and the N-terminus of the second monomer. In the NMR spectra we do observe separate peaks for the residues P1, Q2, V3, T96, L97 and N98 from the two monomers, and the secondary shifts for these residues are also slightly different, as indicated by filled bars in Fig. 2A.
However, the shifts for the C-terminal residues are very small for both the monomers but the shifts for the N-terminal residues P1 and V3 are large for both. This indicates that the observed propensities are primarily dictated by the intrinsic sequence and the tethering has only a small influence if at all. However, as we shall discuss below, the tethering may have some influence on the motional characteristics, because of local interactions of the residues around the linker.
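As an illustration of how secondary shifts of the kind discussed above are obtained, the following Python sketch subtracts a random coil value from each observed Cα shift and labels the sign of the deviation. The random coil numbers are placeholder values for illustration only, and the 0.1 ppm threshold, the function names and the observed shifts are all hypothetical; a real analysis would use a published, sequence-corrected random coil table. The sign convention assumed here is the usual one for Cα: positive deviations point to helical propensity and negative deviations to β propensity.

```python
# Placeholder random-coil C-alpha shifts in ppm (illustrative values only).
RANDOM_COIL_CA = {"A": 52.5, "G": 45.1, "L": 55.1, "P": 63.3, "V": 62.2}

def secondary_shifts(residues, observed_ca, random_coil=RANDOM_COIL_CA):
    """Return observed minus random-coil shift (ppm) for each residue."""
    if len(residues) != len(observed_ca):
        raise ValueError("one observed shift is needed per residue")
    return [obs - random_coil[aa] for aa, obs in zip(residues, observed_ca)]

def propensity(delta, threshold=0.1):
    """Classify a C-alpha secondary shift; the threshold is a hypothetical cut-off."""
    if delta > threshold:
        return "helix-like"
    if delta < -threshold:
        return "beta-like"
    return "none"

# Hypothetical observed shifts for a three-residue stretch A-L-V.
deltas = secondary_shifts("ALV", [52.9, 55.1, 61.8])
labels = [propensity(d) for d in deltas]
print(list(zip("ALV", [round(d, 2) for d in deltas], labels)))
```

Because the deviations in a denatured state are small, the choice of random coil reference set matters a great deal, which is why the table is passed in as a parameter rather than hard-wired into the function.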
The residue-wise HN-Hα coupling constants in HIVTD in 6M guanidine, measured from the fine structures of the peaks in the HSQC spectra, are shown in Fig. 2B. The values range from 5.5 to 9.0 Hz with an average close to 7 Hz. This is typical of random coils, and the small variations observed are within the ranges expected from the sequence dependence of these coupling constants (32). Thus the observed coupling constants do not show any specific secondary structural preferences, in contrast to the results from the secondary chemical shifts. However, such differences have been seen in many proteins, which has led to the belief that the coupling constants are less sensitive to small variations in the (φ,ψ) populations and hence are less diagnostic (30). Examination of the local structural preferences derived from the secondary chemical shifts in the protein, in the light of its primary structure, reveals an interesting correlation: many of the preferences are seen to be associated with prolines. In the primary sequence, prolines are located at positions 1, 9, 39, 44, 79 and 81. Of these, P9 and P39 seem to induce α-helical propensity in the neighbouring residues. On the other hand, all the other prolines are associated with β propensities. Thus, it appears that in HIVTD the prolines may play a major role in dictating the initial folding events when the folding starts from an unfolded state. The different native and non-native folding nuclei described above, along with the locations of the prolines, are depicted on the native protein structure in Fig. 3. Interestingly, these cover the residues at the dimer interface and in the hinge region of the native protein structure. This suggests that native-like folding of active site residues and perhaps active site formation may be an early event in the folding of HIVTD.   gly and ala residues act as molecular hinges in the folding mechanism of a protein.

Conformational transitions along the folding funnel
When the denaturing conditions are made slightly milder, the protein treks along the folding funnel. As mentioned before, at 5M guanidine, the 6M peaks in the HSQC spectrum are nearly conserved, and additional peaks corresponding to other partially folded conformers appear. This indicates that the secondary structural propensities of the conserved species, between 6M and 5M guanidine, could only be marginally different.
However, there could be differences in dynamics, which would reflect on the conformational transitions on the 'folding funnel'. We have monitored these effects from the changes in the residue-wise HN-Hα coupling constants and 15N transverse relaxation rates as described below. The differences in these parameters are also qualitatively evident in Fig. 1B itself. Fig. 5A shows the changes in the coupling constants along the sequence of the protein as we go from 6M to 5M guanidine. The changes are clearly non-random, and there seems to be a decrease in the coupling constant for all the residues (barring a few).
This would indicate a slight increase in the helical propensity for most residues in 5M guanidine. Comparing with Fig. 2A, we observe that the stretch T12-D60, which has very little structural propensities in 6M guanidine, barring a few residues (E21, L24, D25) near the active site and a few near the hinge region (E34, S37, G40, K43, P44), acquires some preferences in 5M guanidine. This is a reflection on the hierarchy of the folding events.
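The link between a smaller HN-Hα coupling constant and a larger helical population can be made concrete with a Karplus relation. The sketch below is only illustrative: the coefficients are a commonly used literature parameterization assumed here rather than taken from this work, the φ values of -57° and -120° are textbook representatives of helical and extended conformations, and the two-state averaging is a deliberate simplification of a heterogeneous denatured ensemble.

```python
import math

# Karplus coefficients in Hz: assumed literature values, not taken from this paper.
A, B, C = 6.51, -1.76, 1.60

def j_hn_ha(phi_deg):
    """Three-bond HN-Halpha coupling (Hz) for a backbone dihedral phi in degrees."""
    theta = math.radians(phi_deg - 60.0)
    return A * math.cos(theta) ** 2 + B * math.cos(theta) + C

def averaged_j(helix_fraction, phi_helix=-57.0, phi_beta=-120.0):
    """Population-weighted coupling for a simple two-state helix/extended ensemble."""
    if not 0.0 <= helix_fraction <= 1.0:
        raise ValueError("helix_fraction must lie between 0 and 1")
    return (helix_fraction * j_hn_ha(phi_helix)
            + (1.0 - helix_fraction) * j_hn_ha(phi_beta))

# A modest shift of population toward the helical region lowers the average J.
for fraction in (0.4, 0.5, 0.6):
    print(f"helix fraction {fraction:.1f}: <J> = {averaged_j(fraction):.2f} Hz")
```

With these assumed parameters the extended conformation gives a coupling near 10 Hz and the helical one near 4 Hz, so an averaged value close to 7 Hz corresponds to roughly comparable populations, and any decrease in the average is consistent with a small gain in helical propensity.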
The 15N transverse relaxation rates showed a substantial increase in 5M guanidine for 75% of the residues, and the average R2 value was higher by about 30%. A large portion of this increase may possibly come from assembly of the two monomeric halves into the dimer, which may be facilitated by the head-to-tail covalent linkage through the linker in HIVTD. This transient assembly results in increased linewidths. Such a conclusion is based on the observation that as the guanidine concentration is further reduced the protein starts acquiring autolytic activity (Fig. 1), implying that a proper specific dimer is formed even in the presence of a certain concentration of the denaturant. However, as can be seen in Fig. 5B  One might wonder whether the present data shed any light on whether the folding of the protease is a simple two-state or a complex multistate process; this is of significance for understanding the mechanism of folding of the protein. Conventionally, this is discerned by monitoring the folding using a variety of probes such as circular dichroism (CD), fluorescence, calorimetry, IR spectroscopy, etc., and in the event that it is a two-state process, the folding profiles from all the techniques would be completely superimposable (39). In the present case, generation of a complete folding profile by any of the conventional techniques, in either kinetic or equilibrium experiments, is severely hampered by the autocleavage property of the protein. Nevertheless, we observe that in 5M guanidine the HSQC spectra show many extra peaks compared to the 6M spectrum, and most of these peaks do not correspond to the native state of the protein.
This means that there is at least one other non-native state which exchanges slowly with the 6M denatured state. It is more likely that there are several non-native states which are slightly different from the 6M denatured state, and that these arise from different protein molecules at the surface of the folding funnel following different paths along the funnel. This is consistent with the different extents of residue-level folding propensities along the sequence of the polypeptide chain. Thus we believe that the protein has a complex folding mechanism and that even the early events involve many states near the surface of the folding funnel. Different molecules may follow different paths, and there may be local cooperativity of folding transitions along each path.
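The summary statistics quoted earlier in this section for the 15N transverse relaxation rates (the fraction of residues whose R2 increases on going from 6M to 5M guanidine, and the change in the average R2) can be computed as in the following Python sketch. The function name, the residue numbers and the R2 values are hypothetical and serve only to show the bookkeeping; they are not the measured data.

```python
def compare_r2(r2_6m, r2_5m):
    """Compare per-residue R2 values (s^-1) measured under two conditions.

    Only residues present in both data sets are used.
    """
    common = sorted(set(r2_6m) & set(r2_5m))
    if not common:
        raise ValueError("no residues in common between the two data sets")
    increased = [res for res in common if r2_5m[res] > r2_6m[res]]
    mean_6m = sum(r2_6m[res] for res in common) / len(common)
    mean_5m = sum(r2_5m[res] for res in common) / len(common)
    return {
        "fraction_increased": len(increased) / len(common),
        "mean_change_percent": 100.0 * (mean_5m - mean_6m) / mean_6m,
    }

# Hypothetical R2 values keyed by residue number.
r2_6m = {3: 4.0, 24: 5.0, 40: 3.0, 97: 4.0}
r2_5m = {3: 5.0, 24: 7.0, 40: 3.0, 97: 6.0}
summary = compare_r2(r2_6m, r2_5m)
print(summary)
```

Restricting the comparison to residues observed under both conditions avoids biasing the average when peaks are lost to overlap or exchange broadening in one of the spectra.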

CONCLUSIONS
We have attempted here to obtain useful insights into the early events in the folding pathway of HIV-1 protease, using a tethered dimer construct as its representative.
From a number of NMR parameters determined under conditions of guanidine denaturation, the early folding nuclei have been identified. These include native as well as non-native secondary structures, and the native structures span the residues at the active site, the hinge and the dimerization domain. Interestingly, proline residues seem to have a significant influence on the local structural preferences in the denatured state of the protein. Hydrophobic interactions between the side chains seem to cause substantial restrictions on the motional properties along the chain, and these may be important in driving the folding process. It is also observed that glycines and alanines cause enhanced conformational flexibility and thus may act as molecular hinges in the folding, as suggested earlier (38). As the denaturant concentration is progressively reduced, partially folded species start appearing and the protein begins to acquire autolytic activity. The investigations at 6M and 5M guanidine show that the dimerization domain, the flaps, and the hinges or the elbow of the protein have the highest propensity for conformational transitions. The residue-wise graded propensities also suggest a certain degree of local cooperativity in these conformational transitions. Finally, from the observation of more than one non-native species even in 5M guanidine and the sequence-wise hierarchy of the conformational transitions, it appears that the protein folds by a complex mechanism, with different molecules following different pathways.
To our knowledge this is the first experimental description of the early events in the folding mechanism of HIV-1 protease.

ACKNOWLEDGEMENT
We thank the National Facility for High Field NMR at TIFR, for all the facilities.