Templated folding of intrinsically disordered proteins

Much of our current knowledge of biological chemistry is founded in the structure-function relationship, whereby sequence determines structure that determines function. Thus, the discovery that a large fraction of the proteome is intrinsically disordered, while being functional, has revolutionized our understanding of proteins and raised new and interesting questions. Many intrinsically disordered proteins (IDPs) have been determined to undergo a disorder-to-order transition when recognizing their physiological partners, suggesting that their mechanisms of folding are intrinsically different from those observed in globular proteins. However, IDPs also follow some of the classic paradigms established for globular proteins, pointing to important similarities in their behavior. In this review, we compare and contrast the folding mechanisms of globular proteins with the emerging features of binding-induced folding of intrinsically disordered proteins. Specifically, whereas disorder-to-order transitions of intrinsically disordered proteins appear to follow rules of globular protein folding, such as the cooperative nature of the reaction, their folding pathways are remarkably more malleable, due to the heterogeneous nature of their folding nuclei, as probed by analysis of linear free-energy relationship plots. These insights have led to a new model for the disorder-to-order transition in IDPs termed “templated folding,” whereby the binding partner dictates distinct structural transitions en route to product, while ensuring a cooperative folding.

The study of the mechanism whereby a polypeptide chain acquires its native conformation has represented one of the biggest challenges of biochemistry and molecular biology over the last half century (1). In fact, because the functions of proteins are mainly dictated by their three-dimensional structure, it is not surprising that the folding process has gained the attention of different scientific communities and that the folding field has played an influential role in protein science. Moreover, medical interest arose when it was realized that many human pathologies, including Alzheimer's, Parkinson's, and Creutzfeldt-Jakob diseases, are linked to misfolding and aggregation of specific proteins (2)(3)(4).
Some of the general properties of the folding process have already been characterized in detail, and our understanding of the basic features of this reaction has grown enormously over the last few decades. The collaborative efforts of experimentalists and theoreticians have resulted in substantial breakthroughs in the characterization of folding pathways, to the point that the folding of small single-domain proteins can be successfully described at nearly atomic resolution (5)(6)(7)(8)(9)(10)(11)(12)(13)(14)(15)(16).
The discovery that about 40% of the human proteome is essentially disordered if expressed as single proteins (17)(18)(19)(20)(21) challenged existing paradigms drawn from the structure-function relationship. In addition to the complexities of integrating natively disordered sequences into our concepts of cell biology, researchers were immediately keen to know more about how and when these proteins adopt structure. It has been shown that whereas some of these proteins retain a high degree of disorder in all of their physiologically relevant states, a large fraction of them undergo a disorder-to-order folding transition upon binding to specific partners, which could be other proteins or nucleic acids (22)(23)(24)(25). From a mechanistic perspective, proteins undergoing such a coupled binding and folding process represent particularly interesting systems, where supramolecular organization is coupled to bimolecular recognition.
In this review, we explore the advances made thus far in understanding IDP folding by considering the main similarities and differences emerging from the comparison of the folding of globular versus intrinsically disordered proteins (IDPs). We will first focus on the overall features of the observed folding transitions, as mirrored by the apparent kinetics. Then we will highlight the differences and similarities of the structural features of the main transition states. These comparisons have pointed to a new model for IDP folding called "templated folding," a mechanism that appears to capture the peculiar properties of the folding-upon-binding event of IDPs and can be used as a general framework for IDP research going forward.

Cooperativity in globular and IDP folding
The initial view of protein folding depicted the reaction as a stepwise formation of native-like structure (26,27). All proteins were assumed to fold via a framework model postulating secondary structure to form before tertiary interactions were locked in place. This scenario implicitly involved the accumulation of a series of partially structured intermediates, with an increasing degree of native-like structure. The discovery that small single-domain proteins were in fact capable of folding in an all-or-none fashion therefore came as a surprise. After the first observation of this so-called "two-state folding" by Jackson and Fersht in 1991 (28), it soon became clear that the majority of proteins containing fewer than 100 amino acids could fold in a highly cooperative manner, despite the formation and breakage of hundreds of weak noncovalent bonds. Two-state protein folding, where only the fully native and denatured states could be experimentally detected, revolutionized our view of the folding reaction and established some of the general rules in the field (29).
A two-state reaction is characterized by the lack of low-energy intermediates, populated either at equilibrium or transiently. Consequently, the denatured and folded states are separated by a single energy barrier defining the transition state, predicting that the time evolution of any two-state folding reaction should conform to a single-exponential decay (30). However, because of the complexity of these reactions and the challenges in monitoring this transition, it is difficult to conclusively determine whether intermediates do not exist or are just elusive. Rigorous experimental tests have therefore been employed to exclude their presence. For example, the two-state assumption can be verified by comparing the consistency of the thermodynamic parameters obtained from different experimental methods, such as by using different probes (e.g. CD, fluorescence, calorimetry, and NMR) and comparing equilibrium with kinetic experiments (31)(32)(33).
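The single-exponential prediction can be illustrated with a minimal sketch. It assumes a reversible two-state reaction between the denatured and native states, for which the observed rate constant is the sum of the folding and unfolding rate constants. The function names, rate constants, and signal amplitudes below are hypothetical and chosen only for illustration; they are not taken from any of the studies cited here.

```python
import math


def two_state_signal(t, k_fold, k_unfold, signal_start, signal_end):
    """Signal at time t for a reversible two-state reaction D <-> N.

    With no populated intermediates, the relaxation is a single
    exponential whose observed rate constant is k_fold + k_unfold.
    """
    k_obs = k_fold + k_unfold
    return signal_end + (signal_start - signal_end) * math.exp(-k_obs * t)


def half_time(k_fold, k_unfold):
    """Time at which half of the total signal change has occurred."""
    return math.log(2) / (k_fold + k_unfold)


# Hypothetical rate constants (s^-1) and normalized signal amplitudes.
k_f, k_u = 50.0, 0.5
for t in (0.0, 0.01, 0.05, 0.2):
    print(f"t = {t:5.2f} s  signal = {two_state_signal(t, k_f, k_u, 1.0, 0.0):.4f}")
print(f"half-time = {half_time(k_f, k_u):.4f} s")
```

A time course that requires more than one exponential term to be described would, under these assumptions, argue against a strict two-state mechanism.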
IDPs provided an immediate challenge to these expectations: because IDPs are by definition unstructured and dynamic in isolation (34), the existence of cooperative motions and transition states was no longer guaranteed. Instead, the encounter with and binding to a physiological partner often triggers a conformational change, and a disorder-to-order transition may occur upon recognition of the partner. This phenomenon has generally been referred to as "coupled folding and binding" or "induced folding." The discovery of these transitions cast doubt on whether the folding rules established for globular proteins would apply to IDPs and raised several questions. Does induced folding follow an induced fit or conformational selection kind of model? Would the structure of the same IDP bound to different partners be the same? Would different partners bind to the same regions of the IDP?
Importantly, it should be noted that what "induced folding" entails might also be very diverse for different IDPs. In fact, whereas in some cases the IDP may fold to a structured conformation, some other systems may retain a considerable level of disorder even in their bound states, a phenomenon generally referred to as "fuzziness" (Fig. 1) (35). Thus, there is a range of possible scenarios from disorder-to-(complete) order to disorder-to-disorder transitions, but, whatever the case, binding will result in some change of structure and dynamics of the bound ground state. Our ensuing discussion on templated folding might be valid for any degree of fuzziness.
In theory, the mechanism of recognition between an IDP and its partner is expected to be a complex reaction involving, at least, the productive encounter between the two interacting units, the folding of the IDP in the bound conformation, and the locking of key stabilizing interactions. Thus, it would be expected that the mechanism of induced folding would imply the presence of multiple reaction intermediates. Nevertheless, in analogy to the folding of globular proteins, the expected theoretical complexity is contrasted by a striking simplicity of the binding kinetics.
Several IDPs were found to conform to robust two-state kinetics, displaying single-exponential time courses, as well as a linear dependence of the observed rate constant on reactant concentration under pseudo-first-order conditions, another hallmark of two-state behavior (36-51). Note that apparent two-state behavior does not imply that there are no intermediates, but that any intermediates are high-energy states, which rapidly convert to the equilibrium bound state or back to the free state. It is very difficult to directly assess these fast transitions between high-energy intermediates, but one of the first kinetic characterizations of an induced folding reaction, by Hagen and co-workers (52), actually achieved this by using laser-induced temperature-jump spectroscopy. In this case, inducing folding with trifluoroethanol cosolvent allowed measurement of the folding and unfolding rate constants of an intrinsically disordered protease inhibitor (IA3 from yeast), which were found to lie in the microsecond time range, indicating that the folding of IDPs may occur very rapidly.
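The linear concentration dependence mentioned above can be sketched as follows, assuming a one-step binding reaction measured with the partner in large excess over the IDP, so that the observed rate constant equals k_on times the partner concentration plus k_off. The function names and numerical values are hypothetical and serve only as an illustration.

```python
def observed_rate(partner_conc, k_on, k_off):
    """Observed rate constant (s^-1) under pseudo-first-order conditions.

    partner_conc: concentration of the partner in excess (M)
    k_on:  association rate constant (M^-1 s^-1), slope of the plot
    k_off: dissociation rate constant (s^-1), intercept of the plot
    """
    return k_on * partner_conc + k_off


def dissociation_constant(k_on, k_off):
    """Equilibrium dissociation constant (M) of a one-step reaction."""
    return k_off / k_on


# Hypothetical rate constants, not taken from any cited study.
k_on, k_off = 1.0e7, 20.0
for conc in (1e-6, 2e-6, 5e-6, 1e-5):
    print(f"[partner] = {conc:.1e} M  k_obs = {observed_rate(conc, k_on, k_off):.1f} s^-1")
print(f"K_D = {dissociation_constant(k_on, k_off):.1e} M")
```

Under these assumptions, a hyperbolic rather than linear dependence of the observed rate constant on concentration would point to an additional, concentration-independent step such as the accumulation of an intermediate.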
Furthermore, it is important to highlight that not all of the IDPs studied to date conform to a two-state scenario. In fact, reaction intermediates have been identified by NMR (53)(54)(55), by detecting multiexponential kinetics (56,57) as well as by analyzing the concentration dependence of observed rate constants (56-60). Nevertheless, in most cases, these intermediates tend to be very elusive and scarcely populated, indicating that the cooperative nature of induced folding of IDPs is very similar to that of the folding of globular proteins.

Figure 1. It is well known that some IDPs undergo a disorder-to-order transition upon binding their physiological partner. Nevertheless, the level of disorder retained in the complex may vary substantially in different cases. The figure reports two cases: the complex between KIX and pKID (23) and that between GCN4 and Med15 (90).

The structure of the transition state: Φ value analysis
As mentioned above, examination of the transition state structure has come to be a central step in analyzing the folding pathways of globular proteins. However, whether similar considerations would be relevant for describing IDP folding was not immediately clear. The description of reaction mechanisms implies identifying all of the intermediates along the pathway and then characterizing their structure. As outlined above, in the case of protein folding, the process often takes place in a highly cooperative manner, such that most often only the fully native and denatured states may be populated. Thus, information that is accessible to the experimentalist is generally very limited as no snapshots between reactants and products can be characterized. In this context, it becomes clear how the study of the transition state is critical to pinpoint the key residues dictating folding.
Because the transition state at the top of the energetic barrier never accumulates, information about its structure must be inferred indirectly. In this context, a powerful approach called Φ value analysis was conceived by Fersht and co-workers (61). By systematically mutating amino acid side chains while measuring the effect of each structural perturbation on the activation free-energy barrier and on the ground states, it is possible to map interaction patterns in the transition state. Quantitatively, the index of native-like structural content is measured by the Φ value, which normalizes the change in free energy of the transition state (ΔΔG‡) upon mutation to that of the native state (ΔΔG), i.e. Φ = ΔΔG‡/ΔΔG. A Φ value close to 1 is often interpreted as native-like structure in the folding transition state of that specific residue, whereas a Φ value equal to 0 suggests that the mutated residue is as unstructured in the transition state as it is in the denatured state.
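As a minimal sketch of how a Φ value might be computed for a binding-induced folding reaction, the example below assumes that ΔΔG‡ is obtained from the ratio of association rate constants and ΔΔG from the ratio of dissociation constants of the wild-type and mutant proteins. The function names, the 0.4 kcal/mol reliability cutoff, and all numerical values are illustrative assumptions, not parameters reported in the cited studies.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def ddg_transition_state(k_on_wt, k_on_mut, temperature=298.15):
    """Change in activation free energy upon mutation (kcal/mol)."""
    return R_KCAL * temperature * math.log(k_on_wt / k_on_mut)


def ddg_equilibrium(kd_wt, kd_mut, temperature=298.15):
    """Change in binding free energy upon mutation (kcal/mol)."""
    return R_KCAL * temperature * math.log(kd_mut / kd_wt)


def phi_value(k_on_wt, k_on_mut, kd_wt, kd_mut, temperature=298.15, min_ddg=0.4):
    """Return Phi = ddG(transition state) / ddG(equilibrium).

    Returns None when the equilibrium destabilization is below min_ddg,
    an assumed cutoff under which the ratio is too noisy to interpret.
    """
    ddg_eq = ddg_equilibrium(kd_wt, kd_mut, temperature)
    if abs(ddg_eq) < min_ddg:
        return None
    return ddg_transition_state(k_on_wt, k_on_mut, temperature) / ddg_eq


# Hypothetical mutants: (name, k_on in M^-1 s^-1, K_D in M).
wild_type = (1.0e7, 1.0e-6)
mutants = [("mutant A", 1.0e7, 1.0e-5), ("mutant B", 3.0e6, 1.0e-5)]
for name, k_on_mut, kd_mut in mutants:
    phi = phi_value(wild_type[0], k_on_mut, wild_type[1], kd_mut)
    print(name, "Phi =", None if phi is None else round(phi, 2))
```

In this sketch, a mutation that weakens binding without changing the association rate constant yields a Φ value of 0, whereas one that slows association by the same factor by which it weakens binding yields a Φ value of 1.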
The Φ value analysis represents the only experimental method available to describe the structure of a transition state, and it has been employed extensively to describe the mechanism of folding of globular proteins (62,63) and lately the coupled binding and folding of IDPs (36-38, 40, 48, 50, 59, 60, 64-73). Interestingly, it has been generally observed that Φ values tend to be between 0 and 1, with very few cases of unusual values (i.e. lower than 0 or larger than 1). Hence, in analogy to what is observed for globular proteins, when folding takes place, IDPs tend to avoid misfolded conformations, and the transition state represents a distorted version of the bound state. Thus, whereas several alternative structures are populated by IDPs in their free state, the transition state of induced folding appears committed to the structure of the complex. Of additional interest, it may be noted that the magnitudes of the Φ values, calculated for induced folding of different IDPs, tend to be broadly distributed along the sequence, rather than clustered in regions of 0 and 1. This tendency, which has previously been discussed extensively in the case of globular proteins (6,63), indicates that IDPs display structurally diffused nuclei in their folding transition states, where diverse residues contribute fractional formation of native-like contacts.
One of the main assumptions of the Φ value analysis lies in a negligible effect of the mutation on the structure of the denatured (or unbound) state (61,63). In fact, when and if the structural perturbation induced by site-directed mutagenesis affects the stability of the denatured state (e.g. by disrupting an element of its residual structure), observed Φ values may deviate from the expected values, and unusual values may be detected (74). Because many IDPs retain elements of embryonic secondary or tertiary structure in their unbound states, it is worth considering this complication when comparing the results obtained with IDPs with those of globular proteins. As recalled above, however, none of the Φ value analyses performed to date present a significant number of unusual values, indicating that the effect of mutagenesis on the residual structure of the IDPs (studied so far) is not relevant enough to jeopardize these kinds of experiments and that such structures are relatively robust to the mutations considered. However, it will be important to conduct additional work to further support this conclusion.

Linear free-energy relationship (LFER) plots: The nucleation-condensation model versus templated folding
A general concept in chemical kinetics implies that optimal rates are obtained when reaction intermediate(s) have higher free energy than the reactants and products and are not accumulating, existing at very low concentrations (62,75). Similarly, in the case of protein folding, optimal folding rate constants may be obtained when reaction intermediates are very unstable, so that the reaction is consistent with two-state folding (31). Under such conditions, local elements of structure display a low tendency to interact in isolation, and folding may only occur in a highly cooperative manner. This scenario, which is very recurrent in the folding of globular proteins, conforms to the so-called nucleation-condensation mechanism (31,76,77).
The nucleation-condensation model postulates the existence of a folding nucleus in globular proteins. Importantly, because such a nucleus is weak, structurally diffused, and extended, formation of the discontinuous network of interactions stabilizing the nucleus can occur only when a significant fraction of the overall fold has acquired an approximately correct overall conformation. Under such conditions, therefore, nucleation is coupled with general condensation of a native-like fold, with secondary and tertiary structure forming simultaneously and the transition state reflecting a distorted version of the native state (76,77).
A valuable method to characterize the nucleation-condensation model is the analysis of LFER plots, also referred to as Leffler or Brønsted plots (78). This type of analysis correlates the activation free energy for a given reaction with its equilibrium free energy. LFER plots were originally introduced in physical organic chemistry to evaluate the position of the transition state along the reaction coordinate (79). In fact, by perturbing the structure and thereby the reactivity of a reactant, the dependence of the activation free energy on the equilibrium free energy generally results in a linear correlation, with a slope α reflecting the position of the transition state along the reaction coordinate. When applying this type of analysis to protein folding, linear LFERs may be obtained only when the transition state resembles the structure of the native state.
If two systems respond similarly to the same perturbation, induced for example by mutagenesis, it may be deduced that the systems are also similar in structure. Consequently, the linearity in the LFER plot represents strong experimental evidence that the transition state of folding reflects by-and-large a distorted version of the native state (78). Nearly all globular proteins that have been subjected to LFER analysis display a linear plot, suggesting that nucleation-condensation represents a general model for protein folding (13,80,81). Moreover, the slope of the LFER plot is highly similar for different proteins, with a value of α of ~0.3 (80). This finding indicates that, despite the fact that the folding nucleus is specific to each protein, the degree of native-like structure in the transition states of globular proteins is very robust.
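The slope of an LFER plot can be estimated with an ordinary least-squares fit, as in the minimal sketch below. It assumes that each mutant contributes one pair of values, the change in equilibrium free energy and the change in activation free energy, and that the slope of the fitted line is taken as α. The function name and the data points are hypothetical; the correlation coefficient is reported only as a simple indicator of linearity.

```python
import math


def lfer_fit(ddg_eq, ddg_ts):
    """Least-squares line through (ddG_eq, ddG_ts) points.

    Returns (alpha, intercept, r), where alpha is the slope of the
    LFER plot and r is the Pearson correlation coefficient.
    """
    n = len(ddg_eq)
    mean_x = sum(ddg_eq) / n
    mean_y = sum(ddg_ts) / n
    sxx = sum((x - mean_x) ** 2 for x in ddg_eq)
    syy = sum((y - mean_y) ** 2 for y in ddg_ts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ddg_eq, ddg_ts))
    alpha = sxy / sxx
    intercept = mean_y - alpha * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return alpha, intercept, r


# Hypothetical free-energy changes (kcal/mol) for a set of mutants.
ddg_eq = [0.6, 0.9, 1.3, 1.8, 2.4]
ddg_ts = [0.20, 0.25, 0.42, 0.52, 0.75]
alpha, intercept, r = lfer_fit(ddg_eq, ddg_ts)
print(f"alpha = {alpha:.2f}, intercept = {intercept:.2f}, r = {r:.3f}")
```

In such a fit, a slope near 0 would correspond to a transition state that responds to mutation like the starting state, and a slope near 1 to one that responds like the final state.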
It is of particular interest to compare the LFER analysis performed on globular proteins versus those of IDPs. The LFER analysis of the induced folding of different IDPs shows that this class of proteins also displays a linear correlation. Nonetheless, peculiarities have been discovered, as shown in the binding reaction of the transactivation domain of c-Myb, an IDP system, to the globular protein KIX. In this case, it was shown that the transition state contains a very high degree of native-like structure, with a value of α of 0.89. This finding indicated the transition state of this IDP to be much more ordered than what is typically observed for the folding of globular proteins (48).
To test the robustness of the structure of the transition state of c-Myb, the LFER analysis was repeated, measuring the binding with three different site-directed mutants of KIX, displaying a different degree of hydrophobicity in their binding pocket (67). Surprisingly, whereas the linearity in the LFER plot was maintained, it was observed that mutations of KIX had a pronounced effect on the values of α calculated for the binding of c-Myb, with α progressively decreasing with decreasing hydrophobicity of the binding pocket, down to 0.19 in the case of the least hydrophobic variant (Fig. 2). This variability demonstrates a degree of structural plasticity in the transition state. Subsequently, other clear signatures of this plasticity have been observed, such as in the case of the interaction between NCBD and ACTR (82) and in the case of the induced folding of NTAIL with XD (59,60). However, the protein BH3 appears more robust with regard to binding partners (37). Thus, at variance with what is observed for globular proteins, there is a remarkable structural malleability, or plasticity, in the transition state of induced folding, which is dictated by the binding partner.

Templated folding of IDPs
Despite the considerable interest in understanding the properties and peculiarities of IDPs, our current understanding of the mechanisms whereby binding-induced folding takes place is still relatively limited and based on the study of small protein systems. Furthermore, because the disorder-to-order transitions of IDPs are coupled to a binding reaction, it is generally [...]
The comparison between the spontaneous folding of globular domains and the induced folding of IDPs presented in this review highlights some key differences, which can be interpreted in light of homogeneous and heterogeneous nucleation. In phase transitions, heterogeneous nucleation is a process whereby the interactions leading to the formation of the nucleus are established in contact either with the heterogeneities found in the generating phase or with a surface (83). On the other hand, homogeneous nucleation occurs through condensation of a single type of chemical compound. Thus, in binding-induced folding, the interacting partner of the IDP participates directly in the nucleation process, leading to a heterogeneous variable nucleation as opposed to the homogeneous nucleation observed in globular folding. Consequently, IDPs tend to follow a "templated folding" mechanism whereby the structure of the transition state is dictated by the nature of the interacting partner (Fig. 3). This behavior is in contrast to what would be expected from a homogeneous nucleation type mechanism, in which the disorder-to-order transitions would appear robust and imprinted in the amino acid sequence of the protein.
Whereas the folding nuclei of globular proteins are highly diffused, given that nucleation occurs homogeneously, the overall reaction appears rather robust. Thus, in spontaneous globular folding, nucleation sites are highly conserved and are maintained in different members of the same protein family, circular permutants, and truncated or circularized variants. This property is nicely illustrated by the so-called Φ-Φ plot analysis, where Φ values of corresponding residues in homologous proteins are plotted versus each other (13). On the other hand, the heterogeneous nucleation invoked by templated folding implies that binding-induced folding of IDPs would be more malleable, with alternative pathways and nucleation sites emerging with changing binding partners or experimental conditions (59,60,67,70,82,84). It is likely that this behavior may also be reflected in structural malleability of the bound state upon changing experimental conditions and/or mutagenesis, a hypothesis that seems to be supported by the observed fuzziness of several IDP systems (24,25,35,85,86) as well as crystal structures of site-directed mutants (87). Such structural malleability could be further tested by monitoring the structural behavior of different variants of the same IDP by different techniques, such as NMR, SAXS, or single-molecule FRET. Furthermore, we predict that when and if the propensity to form ordered structure in an IDP is increased, the folding mechanism would most likely transition toward a classic homogeneous nucleation type scenario and would therefore appear more robust to changing conditions or binding partners. This hypothesis is readily testable by future experiments.
We note that templated folding is conceptually similar to the model of "slaving," previously introduced by Frauenfelder and collaborators (88,89) to describe the role of solvent in controlling some of the dynamics and function of globins. In fact, by following this view, it was proposed that the conformational motions of proteins were essentially dictated by the external fluctuations of their hydration shell. Analogously, in the case of IDPs, the heterogeneous nature of their folding nuclei results in a remarkable malleability, which in their bound state is directly influenced by their physiological partner. We propose that templated folding represents a general mechanism whereby multiple alternative partners can recognize the same IDP and induce cooperative folding. In fact, templated folding ensures the robustness of the cooperativity and minimizes the possibility of establishing aberrant interactions with potential pathological consequences, while increasing the possibility of having an extended repertoire of different interaction partners.

Figure 3. Folding pathways of globular proteins (top) are characterized by a robust transition state, which forms through a homogeneous nucleation process. On the other hand, folding of IDPs may take place via heterogeneous nucleation and, therefore, may display a remarkable malleability of the transition state. The latter scenario is characteristic of the so-called "templated folding" mechanism.
JBC REVIEWS: Binding-induced folding of IDPs