Folding and Misfolding in a Naturally Occurring Circularly Permuted PDZ Domain

One of the most extreme and fascinating examples of naturally occurring mutagenesis is represented by circular permutation. Circular permutations involve the linking of two chain ends and cleavage at another site. Here we report the first description of the folding mechanism of a naturally occurring circularly permuted protein, a PDZ domain from the green alga Scenedesmus obliquus. Data reveal that the folding of the permuted protein is characterized by the presence of a low energy off-pathway kinetic trap. This finding contrasts with what was previously observed for canonical PDZ domains that, although displaying a similar primary structure when structurally re-aligned, fold via an on-pathway productive intermediate. Although circular permutation of PDZ domains may be necessary for a correct orientation of their functional sites in multi-domain protein scaffolds, such structural rearrangement may compromise their folding pathway. This study provides a straightforward example of the divergent demands of folding and function.

A crucial development of our knowledge on protein folding has been contributed by correlating rate constants of folding of small proteins with their topology as measured by the gross parameter of the contact order (1). The contact order represents the average distance, on the primary structure, between interacting residues in the tertiary structure. A protein with a low contact order will by and large present interacting residues that are close in sequence. On the other hand, high contact order implies a large number of long-range interactions. Baker and coworkers (2) first showed a strong correlation between kinetic and structural parameters, suggesting that protein topology is a key factor in determining the folding pathways and speed. An important corollary of these observations is that folding transition states must reflect a distorted version of the native state. This feature was already captured by the nucleation condensation model (3,4), which suggested the protein to fold all at once around an extended, weakly formed, folding nucleus.
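The contact order described above can be illustrated with a minimal sketch. It assumes the commonly used relative form, in which the average sequence separation of contacting residue pairs is normalized by chain length. The contact list, the chain length, and the helper `circularly_permute` are hypothetical and are not taken from any PDZ structure.

```python
def relative_contact_order(contacts, chain_length):
    """Average sequence separation of contacting pairs, divided by chain length."""
    if not contacts:
        raise ValueError("at least one contact is required")
    total_separation = sum(abs(i - j) for i, j in contacts)
    return total_separation / (len(contacts) * chain_length)


def circularly_permute(contacts, chain_length, cut):
    """Renumber contacts after joining the old termini and opening the chain
    so that old residue `cut` becomes new residue 0 (0-based numbering)."""
    return [((i - cut) % chain_length, (j - cut) % chain_length)
            for i, j in contacts]


# Hypothetical 10-residue toy chain with four tertiary contacts.
toy_contacts = [(0, 9), (1, 8), (2, 5), (3, 4)]
co_original = relative_contact_order(toy_contacts, 10)
co_permuted = relative_contact_order(
    circularly_permute(toy_contacts, 10, cut=5), 10)
print(co_original, co_permuted)
```

For this toy map the same set of contacts gives a relative contact order of 0.5 before permutation and 0.3 after it, which shows how a change in sequence connectivity alone can alter the parameter while the three-dimensional contacts are unchanged.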
The notion that protein folding pathways are governed by protein topology (1) has recently been challenged by ingenious experiments using topological mutants such as circularly permuted variants (5-9). Despite the dramatic change experienced by the primary structure, circular permutations seem well tolerated by several protein sequences. In nature, circular permutations have been recognized in ~5% of proteins of known structures (10,11). Folding studies on artificially permuted proteins have been specifically aimed at monitoring the effect on both folding speed and mechanism. By systematically altering the sequence connectivity of the ribosomal protein S6, Oliveberg and coworkers (12) showed that protein folding rate constants of circularly permuted variants are well predicted by the contact order parameter. On the other hand, analysis of protein folding pathways reveals apparently contradicting results. In particular, while the folding pathway of chymotrypsin inhibitor 2 retains its nucleus when challenged with circular permutation (8), both S6 and the Src homology 3 domain appear to fold via different folding trajectories (6,9). In an attempt to reconcile these contradicting results, Lindberg and Oliveberg (13) recently suggested pathway malleability to be a consensus feature of protein folding. When and if the denatured chain may reach its native conformation by means of different independent nuclei, circular permutation, involving the cleavage of a folding nucleus present in a dominant foldon, may result in a different dominant folding pathway. A critical test to verify this hypothesis would imply performing circular permutation experiments of multi-state single domain proteins. Indeed, if pathway malleability is a general feature of protein folding, it is tempting to speculate that circular permutation may affect the stability and possibly the mechanistic role of folding intermediates.
Based on lattice model simulations, Li and Shakhnovich (5) predicted circular permutation to remarkably affect the stability of folding intermediates. However, no experimental folding study of circular permutation has so far been performed on a multi-state single domain protein.
PDZ domains are small globular protein-protein interacting modules that display a fold comprising six β-strands and two α-helices (Fig. 1). Although numerous PDZ domains are found in bacteria and plants, their presence in yeast is questionable. It has therefore been suggested that the PDZ domains in bacteria and plants result from horizontal gene transfer (14). Furthermore, the PDZ domains from bacteria and plants may be considered as circularly permuted variants of their metazoan counterparts, i.e. sharing the same overall fold but characterized by N and C termini located at different positions along the sequence (15). Hence, this protein family represents an ideal system to investigate the relationships between sequence connectivity and protein topology and to do so, for the first time, on naturally evolved sequences.
The canonical eukaryotic PDZ domains have been shown to fold via a conserved mechanism involving an on-pathway intermediate (16-18); Φ-value analysis on the second PDZ domain of PTP-BL (PDZ2) demonstrated that interaction between N and C termini is a key event in PDZ folding (19). Here we report the kinetic folding mechanism of a PDZ domain of the D1 C-terminal-processing protease (D1pPDZ) from the green alga S. obliquus (15).
A structure-based sequence alignment of D1pPDZ with other PDZ domains (Fig. 1) reveals that the primary structure of D1pPDZ is similar to that of its canonical counterparts, displaying pairwise sequence identity of ~25% and sequence similarity of ~50%. Despite the high degree of sequence similarity with other members of the PDZ domain family, we demonstrate here that the folding of D1pPDZ differs considerably from the folding of the canonical PDZ domains and that its denatured state may be trapped in a misfolded intermediate that competes with productive folding.

EXPERIMENTAL PROCEDURES
Protein Expression and Purification-A synthetic gene encoding D1pPDZ (residues 159-253 of the full D1p) (15) was purchased from GENEART. As previously described for other PDZ domains (17), an engineered tryptophan was introduced to function as a fluorescent probe (V178W). The gene was subcloned into the expression vector pET28(c), and protein was expressed in BL21(DE3) cells (Invitrogen) as described elsewhere (16). After induction with 1 mM isopropyl-1-thio-β-D-galactopyranoside the cells were grown for 24 h at 25°C. The resulting hexahis-tagged protein partly formed inclusion bodies and was resuspended using an excess of urea and purified using nickel affinity chromatography (20 mM Tris-HCl, pH 7.8, elution using a gradient from 0-1 M imidazole). D1pPDZ was also independently purified from soluble fractions (using a nickel affinity column followed by an S cation exchanger resin), resulting in protein samples displaying similar folding behavior (data not shown).
Site-directed mutagenesis was performed using a QuikChange site-directed mutagenesis kit (Stratagene). Four single mutants were constructed, (A188G, V202A, V215A, and V227A), and the mutations were confirmed by DNA sequencing of the coding regions.
Equilibrium Denaturations-Urea-induced equilibrium experiments were performed on wild type D1pPDZ in 50 mM phosphate buffer, pH 7.2, 25°C, in the absence and presence of 0.4 M sodium sulfate using a FluoroMax-4 spectrofluorometer (Horiba Jobin Yvon). Protein samples were excited at 280 nm, and the emission spectra were recorded between 320 and 380 nm. The experiments were repeated at different protein concentrations (0.2-15 μM). Urea-induced equilibrium denaturations were also performed on the four mutants of D1pPDZ at a protein concentration of 1 μM and in 0.4 M sodium sulfate.
Kinetic Experiments-Single-jump (un)folding experiments on wild type D1pPDZ were performed using a Pi-star stopped-flow apparatus (Applied Photophysics, Leatherhead, UK) in 50 mM sodium phosphate, pH 7.0, in the presence and in the absence of 0.4 M sodium sulfate. The protein sample was excited at 280 nm, and the folding reaction was followed by the change in fluorescence using different cutoff filters (305, 320, 335, and 360 nm). An 11-fold dilution of denatured or native protein in appropriate buffer initiated refolding and unfolding. Single-jump (un)folding experiments were also performed on the four variants of D1pPDZ in the presence of 0.4 M sodium sulfate, using the 360 nm cutoff filter.   (20). On the other hand, the time-resolved native state appearances of wild type D1pPDZ and site-directed mutants were studied by performing double-jump interrupted refolding experiments at different delay times between a first (refolding) and a second (unfolding) mix as described previously (21).

RESULTS
The function of the D1 C-terminal-processing protease is to remove the C-terminal extension of the D1 polypeptide of photosystem II of oxygenic photosynthesis, which is necessary for assembly of the photosynthesis complex (23,24). The suggested function of the PDZ of D1p is to serve as binding site of the C-terminal extension of the target protein (15).
A synthetic peptide (EAPSVNA) mimicking the suggested natural binding target, the C-terminal extension of the D1 polypeptide of photosystem II (15,23), was employed in ligand binding equilibrium experiments on wild type D1pPDZ (Fig. 2). As previously shown in the case of the PDZ domain family, binding of a ligand may induce a small conformational change that can be probed by intrinsic fluorescence (25). The effect of the ligand binding on the tryptophan fluorescence emission of wild type D1pPDZ was followed between 315 and 385 nm at different peptide concentrations (10-1500 μM). The observed transition corresponds to a simple binding isotherm at all wavelengths. The data were fitted to a hyperbolic equation, yielding an apparent K_D of ~250 μM, which is very similar to the K_m of the recombinant full D1p (~300 μM) (26). This suggests that the isolated D1pPDZ domain binds its physiological target sequence with an affinity consistent with previous experiments. These observations indicate that the recombinant D1pPDZ is in a native functionally competent conformation.
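A hyperbolic binding analysis of this kind can be sketched as follows. This is only an illustration under stated assumptions: the signal is taken to follow F = F0 + ΔF·[L]/(K_D + [L]), the fit scans a grid of trial K_D values and solves the remaining linear problem exactly, and the data are synthetic values generated from the model rather than measurements from Fig. 2. The function names are hypothetical.

```python
def hyperbola(ligand, f0, delta_f, k_d):
    """Fluorescence expected for a simple 1:1 binding isotherm."""
    return f0 + delta_f * ligand / (k_d + ligand)


def fit_hyperbola(ligand_conc, signal, k_d_grid):
    """Scan trial K_D values; for each, F0 and delta_F follow from a
    straight-line least-squares fit of signal versus [L]/(K_D + [L])."""
    n = len(ligand_conc)
    mean_y = sum(signal) / n
    best = None
    for k_d in k_d_grid:
        x = [c / (k_d + c) for c in ligand_conc]
        mean_x = sum(x) / n
        sxx = sum((xi - mean_x) ** 2 for xi in x)
        sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, signal))
        delta_f = sxy / sxx
        f0 = mean_y - delta_f * mean_x
        sse = sum((hyperbola(c, f0, delta_f, k_d) - yi) ** 2
                  for c, yi in zip(ligand_conc, signal))
        if best is None or sse < best[0]:
            best = (sse, f0, delta_f, k_d)
    return {"f0": best[1], "delta_f": best[2], "k_d": best[3]}


# Synthetic, noise-free data (concentrations in micromolar); the amplitude
# and baseline are placeholders, only the K_D of 250 echoes the text.
peptide = [10, 25, 50, 100, 250, 500, 1000, 1500]
synthetic_signal = [hyperbola(c, 1.0, -0.3, 250.0) for c in peptide]
fit = fit_hyperbola(peptide, synthetic_signal, k_d_grid=range(50, 1001, 10))
print(fit)
```

The grid scan is chosen here only because it needs no external fitting library; a nonlinear least-squares routine would normally be used on real data.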

Urea-induced equilibrium denaturations of D1pPDZ in the presence and absence of stabilizing salt (0.4 M sodium sulfate), monitored by decrease of tryptophan emission, are reported in Fig. 3. At all recorded wavelengths the observed transition follows a simple two-state behavior that would suggest the absence of stable equilibrium intermediates. However, as detailed below, quantitative analysis of observed kinetics reveals a complex (un)folding mechanism of D1pPDZ involving at least one intermediate. The thermodynamic parameters calculated from Fig. 3 using a two-state model are listed in Table 1. Importantly, calculated parameters were independent of protein concentration, as revealed by equilibrium unfolding experiments performed at different protein concentrations (varying from 0.2 to 15 μM).
The kinetics of the (un)folding of D1pPDZ was investigated by single- and double-jump stopped-flow experiments both in the absence and in the presence of stabilizing salt (0.4 M sodium sulfate). The engineered tryptophan was excited at 280 nm, and the (un)folding reactions were followed by fluorescence using different cutoff filters (Fig. 4). Regardless of the filter used, the refolding was clearly biphasic. The fluorescence signals and the relative amplitudes of the two phases were highly dependent on the cutoff filter used, allowing unequivocal determination of the two rate constants over a wide range of denaturant concentration. Furthermore, the two observed phases were independent of protein concentration within a range of 1-80 μM, thus excluding the possibility of transient aggregation events (27). The enhanced stability of D1pPDZ in the presence of the stabilizing salt allowed a quantitative description of all the microscopic rate constants describing D1pPDZ folding. Hence, for the purpose of this study we focus on kinetic data recorded in the presence of 0.4 M sodium sulfate.
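The two-state treatment of the equilibrium transitions can be sketched as below. The text states only that a two-state model was used; the linear dependence of the unfolding free energy on urea concentration (ΔG = ΔG_water − m·[urea]) is an assumption of this sketch, and the numerical values are placeholders rather than the parameters of Table 1.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def fraction_unfolded(urea, dg_water, m_value, temperature=298.15):
    """Fraction of denatured molecules for a two-state D <-> N equilibrium.

    dg_water : unfolding free energy in the absence of urea (kcal/mol)
    m_value  : assumed linear urea dependence of that free energy (kcal/mol/M)
    """
    dg_unfolding = dg_water - m_value * urea
    return 1.0 / (1.0 + math.exp(dg_unfolding / (R_KCAL * temperature)))


def observed_signal(urea, dg_water, m_value, signal_native, signal_denatured):
    """Population-weighted fluorescence of the native and denatured states."""
    f_u = fraction_unfolded(urea, dg_water, m_value)
    return signal_native * (1.0 - f_u) + signal_denatured * f_u


# Placeholder parameters: 3 kcal/mol stability, m = 1 kcal/mol/M.
midpoint = 3.0 / 1.0  # urea concentration at which half the protein is unfolded
curve = [(u, fraction_unfolded(u, 3.0, 1.0)) for u in range(0, 9)]
print(midpoint, curve)
```

In this model the transition midpoint is ΔG_water/m, so a stabilizing salt that raises ΔG_water without changing m would shift the whole curve to higher urea concentration.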
The urea dependence of the two (un)folding rate constants (chevron plot) of D1pPDZ is reported in Fig. 5a. The biphasic behavior of the refolding may be explained either by an on-pathway scheme, in which the fast formation of intermediate (I) is followed by formation of native protein (N), or by an off-pathway scheme, in which there is a competing formation of I and N from denatured protein followed by slow breakdown of I (20).
Discrimination among the on- and off-pathway scenarios demands experiments aimed at the detection of the fraction of native molecules forming along the time course (see Fig. 4, c and d, in Ref. 20). Following Kiefhaber (21), such a task may be tackled by performing double-jump interrupted refolding experiments. This approach makes it possible to distinguish partially folded intermediates from native molecules because these states are characterized by different unfolding rate constants. In particular, the native protein, being separated from the denatured state by the highest energy barrier, should unfold more slowly than any partially folded intermediate. Thus, the fractional population of native molecules formed during the delay time, between a first (refolding) and a second (unfolding) mix, is represented by the relative amplitude of the slowest unfolding event. In agreement with an off-pathway scenario, the amplitudes were readily fitted to a double exponential decay (Fig. 5b), the on-pathway mechanism predicting single exponential behavior and a lag phase, as was discussed elsewhere (20). On the basis of this observation we conclude that the observed intermediate in the folding of D1pPDZ is an off-pathway species.
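The different time courses predicted for native state appearance can be illustrated with a small numerical sketch. It integrates first-order kinetics for a simplified on-pathway scheme (D → I → N) and a simplified off-pathway scheme (I ⇌ D → N) under strongly refolding conditions, neglecting unfolding of N. All rate constants are hypothetical values chosen for illustration and are not those of D1pPDZ.

```python
def native_fraction(rates, delay_times, dt=1e-4):
    """Fraction of native molecules at each delay time, starting from 100% D.

    rates maps (source_state, destination_state) to a first-order rate
    constant in s^-1; integration uses simple explicit Euler steps.
    """
    pops = {"D": 1.0, "I": 0.0, "N": 0.0}
    steps_done = 0
    native = []
    for delay in sorted(delay_times):
        target = round(delay / dt)
        for _ in range(target - steps_done):
            change = {state: 0.0 for state in pops}
            for (src, dst), k in rates.items():
                flow = k * pops[src] * dt
                change[src] -= flow
                change[dst] += flow
            for state in pops:
                pops[state] += change[state]
        steps_done = target
        native.append(pops["N"])
    return native


# Hypothetical rate constants (s^-1).
ON_PATHWAY = {("D", "I"): 50.0, ("I", "N"): 5.0}
OFF_PATHWAY = {("D", "I"): 40.0, ("D", "N"): 10.0, ("I", "D"): 2.0}

delays = [0.005, 0.01, 0.1, 1.0, 20.0]  # seconds between the two mixes
on_curve = native_fraction(ON_PATHWAY, delays)
off_curve = native_fraction(OFF_PATHWAY, delays)
for delay, n_on, n_off in zip(delays, on_curve, off_curve):
    print(f"{delay:7.3f} s  on-pathway N = {n_on:.4f}  off-pathway N = {n_off:.4f}")
```

In the on-pathway case N cannot form before I has accumulated, so its early growth is roughly quadratic in time (a lag). In the off-pathway case a fraction of molecules reaches N directly in the fast phase and the remainder follows only as I slowly breaks down, which gives a time course with two rising phases and no lag.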
It is accepted that multiple pathways leading to fast and slow formation of native molecules may either arise from genuine transiently populated off-pathway intermediates or from structural heterogeneity in the denatured state, as in the case of prolyl-peptidyl cis-trans isomerization events (28). It should be noted that, whereas in the first scenario the biphasic appearance of native molecules essentially arises from the fortuitous similarity of the microscopic rate constants for intermediate and native state formation (kinetic coupling), in the latter a multi-phasic signal change is always associated with a multi-phasic native state appearance. In the former case, the relative fraction of native molecules channeled into a fast folding track will approximately approach the partition coefficient, K_part, as shown in Equation 1,
$$K_{part} = \frac{k_{DN}}{k_{DN} + k_{DI}} \qquad (1)$$
where k_DN and k_DI are the microscopic rate constants for native and intermediate state formation, respectively.
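As a worked illustration of the partition coefficient, the sketch below assumes it takes the usual form of a kinetic branching ratio, k_DN/(k_DN + k_DI), for a denatured state that partitions between the native state and the intermediate. The rate constants are hypothetical and serve only to show the arithmetic.

```python
def partition_coefficient(k_dn, k_di):
    """Fraction of denatured molecules that fold directly to the native state,
    assuming simple kinetic partitioning between D -> N and D -> I."""
    return k_dn / (k_dn + k_di)


# Hypothetical rate constants in s^-1.
reference = partition_coefficient(k_dn=10.0, k_di=40.0)
# A hypothetical variant in which only the D -> N step is slowed tenfold.
slowed_variant = partition_coefficient(k_dn=1.0, k_di=40.0)
print(reference, slowed_variant)
```

With these numbers about one fifth of the molecules take the fast track in the reference case, against roughly 2% in the slowed variant. At such a low partition coefficient the fast phase of native state appearance would be hard to detect, and the time course would look close to a single exponential.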
To further strengthen our hypothesis of an off-pathway intermediate, we compared the folding kinetics of D1pPDZ with that of destabilized site-directed mutants. Because our purpose was to probe the effect of site-directed mutagenesis on the kinetic coupling between intermediate and native state formation, our strategy was to alter the stability of their relative transition states by introducing conservative mutations in different regions of D1pPDZ (namely A188G, V202A, V215A, and V227A; see Fig. 1). The (un)folding kinetics observed for the four mutants were all biphasic, and the chevron plots of the different variants are shown in Fig. 6 together with their urea-induced equilibrium unfolding. Calculated folding parameters according to a three-state off-pathway folding model are listed in Table 1. Importantly, while the four variants display biphasic fluorescence refolding kinetics, their native state appearance follows a single exponential time course in some cases, as expected from a reduced partition coefficient K_part (Fig. 7). This observation suggests that by selectively destabilizing the transition state for intermediate and native state formation it is possible to fine-tune the kinetic coupling responsible for the heterogeneous native state appearance.

DISCUSSION
The strategy of circular permutation has been applied to a number of small protein domains, aiming at the investigation of the stability and folding mechanisms upon change of sequence connectivity (5-9). The most important theoretical breakthrough in the protein folding field is generally considered the concept of funneled energy landscape (29), which suggests the denatured chain to fold to its native conformation via an energy landscape displaying an overall funneled topography. Under such conditions, folding is considered a stochastic process so that a protein reaches its native conformation through folding pathways made up by an ensemble of different trajectories. Because many proteins exhibit single folding pathways, it has been difficult to establish the extent to which these trajectories can differ from each other and characterize the different routes by exquisitely experimental studies. In a recent review, Lindberg and Oliveberg (13) proposed the shifts in folding pathways induced by circular permutation to be a consequence of the funneled energy landscape. Loop entropy perturbations, introduced by permutation, may specifically destabilize dominant folding pathways and favor alternative routes. Thus, the study of circularly permuted proteins provides the unique opportunity to explore experimentally the role of topology on different competing folding trajectories of a given protein. In line with these observations, understanding the effect of circular permutations on multistate proteins may provide clear-cut evidence for pathway rerouting along the folding funnel.
Here we report the first folding mechanism of a multi-state naturally evolved circularly permuted protein in comparison to that of previously characterized PDZ domains (16-19). In agreement with theoretical predictions (5), in the case of the PDZ family circular permutation produces a remarkable stabilization of a folding intermediate. Indeed, contrary to what has been observed for other members of the PDZ family (17), (i) the intermediate observed for D1pPDZ is a low energy species accumulating during folding; (ii) surprisingly, such an intermediate state represents an off-pathway kinetic trap competing with PDZ folding. It is tempting to speculate that the off-pathway intermediate in the folding of D1pPDZ involves the formation of contacts between residues in the last two β-strands, which in the related eukaryotic PDZ domains correspond to a long-range early forming folding nucleus (19). Loop entropy perturbations, such as circular permutation, may at the same time perturb the productive folding pathway and stabilize this nucleus, which may thus act as a kinetic trap.
It has been suggested that folding pathways of single domain proteins are selected to avoid local energy traps and that protein sequences are sufficiently optimized to rely on minimal energetic frustration of their energy landscapes (30,31). The observation that protein folding pathways display energetic traps when sequence connectivity is varied may provide new insights into the energy landscape theory for protein folding, suggesting that (i) alterations of sequence connectivity may dramatically alter the preferential folding pathway of a given protein, and (ii) in line with the observations by Baker (1), protein topology, rather than sequence composition, plays per se a major role in sculpting folding energy landscapes and in optimizing the minimal frustration requirement for productive (and fast) folding.
PDZ domains are generally part of multi-domain proteins, and they exert their function by recognizing small target motifs exploiting a specific binding groove. Circular permutation of PDZ domains may have been necessary for the proper integration of the PDZ binding site with respect to the functional sites of other domains (11). However, it appears as if the divergent evolutionary demands of function and folding have compromised the folding process of D1pPDZ, as indicated by the presence of a misfolded kinetic trap in the folding process. Over and above the relevant exception of the sunflower albumin 8, which displays an unusually high portion of solvent-exposed hydrophobic residues (32), this was rarely observed in the folding of small single domain proteins. The proteins showing off-pathway kinetic traps often display a complex topology (33). The molecular details of the folding of the D1pPDZ to its native and intermediate states remain to be elucidated by extensive Φ-value analysis (34). Finally, given the postulated relevance of off-pathway intermediates in triggering major misfolding events (35), the notable stability of the off-pathway intermediate of D1pPDZ makes it an excellent candidate to further explore the early events of protein misfolding.