Design Principles Involving Protein Disorder Facilitate Specific Substrate Selection and Degradation by the Ubiquitin-Proteasome System

The ubiquitin-proteasome system (UPS) regulates diverse cellular pathways by the timely removal (or processing) of proteins. Here we review the role of structural disorder and conformational flexibility in the different aspects of degradation. First, we discuss post-translational modifications within disordered regions that regulate E3 ligase localization, conformation, and enzymatic activity, and also the role of flexible linkers in mediating ubiquitin transfer and reaction processivity. Next we review well studied substrates and discuss that substrate elements (degrons) recognized by E3 ligases are highly disordered: short linear motifs recognized by many E3s constitute an important class of degrons, and these are almost always present in disordered regions. Substrate lysines targeted for ubiquitination are also often located in neighboring regions of the E3 docking motifs and are therefore part of the disordered segment. Finally, biochemical experiments and predictions show that initiation of degradation at the 26S proteasome requires a partially unfolded region to facilitate substrate entry into the proteasomal core.

Many cellular pathways and regulatory networks require spatial and temporal control of effector protein levels. Regulated degradation mediated by the ubiquitin-proteasome system (UPS) is an important post-translational mechanism that helps to achieve precise fine-tuning of protein levels and is being linked to a growing number of pathways. The UPS is the major intracellular degradation pathway and has evolved into a complex system consisting of several hundred dedicated components (1). Important examples of regulated degradation include cell cycle regulatory proteins (e.g. cyclins, cyclin-dependent kinase inhibitors, etc.) that need to be degraded or inactivated before cell cycle checkpoint mechanisms decide upon progress (2,3). Transcription factors (e.g. mammalian Myc, Jun, E2-F, p53, etc.) that activate gene expression triggered by specific stimuli are usually maintained at low levels (4,5); further, the ubiquitin system also triggers processing (by limited proteolysis) and activation of transcription factors such as NF-κB (6). Cell surface growth factor/hormone receptors undergo internalization and degradation to switch off signaling inputs (7-12). The UPS also tightly regulates other key intracellular effectors (e.g. Smad proteins, Bcl-2) of signaling pathways (13-15). Not surprisingly, defects in regulated degradation are being linked to increasing numbers of diseases, including neurodegeneration and cancer, making the UPS very attractive for drug design (16-19).

Current Challenges in Studying UPS-mediated Regulated Degradation
One of the current challenges is the identification of substrates for E3 ligases, recently addressed by several large-scale strategies (24-29). The other crucial aspect is the detailed characterization of specific elements (degrons) that the E3 ligases detect in their substrates (30). Furthermore, there are multiple regulatory mechanisms (cellular localization, post-translational modification (PTM) status, conformational state, etc.) that need to be outlined (31,32). These regulatory mechanisms act both on the E3 ligase and on substrates and connect regulated degradation with signaling outcomes. For example, E3s are held in inactive conformations and require phosphorylation of defined residues to trigger conformational activation of enzymatic activity (33,34). On the substrate side, the phosphorylation profile of degrons and their neighborhood plays an important role in degron recognition by E3s. We have recently analyzed the role of structural disorder in enabling E3 ligase ubiquitination mechanisms (20) and also analyzed known degrons in experimentally validated substrates for structural disorder (35). In this minireview, we focus on the relationship of protein (structural) disorder with: (i) E3 regulation and function; and (ii) substrate degrons that specify recognition by E3 ligases for regulated turnover. We also outline the functional advantages conferred by the use of disorder in regulated protein degradation.

[Fig. 1 legend, partial: structures are colored by residue-wise disorder scores (97); the color scale is shown. IUPred disorder scores range from 0 to 1; the higher the score, the greater the predicted disorder. Scores of 0.5 and greater indicate disordered residues.]

PTMs in Disordered Regions Regulate the Subcellular Localization and Activity of E3 Ligases
E3 ligases determine specificity in the UPS by selecting substrates for regulated degradation. Diverse mechanisms regulate E3 activity and prevent unnecessary protein degradation. Phosphorylation has been well studied in this context (although diverse PTMs may be used). Using phosphorylation as a trigger allows degradation to be linked to signaling pathways and facilitates signal integration. Structural disorder has direct and indirect consequences on the cellular localization, stability, and activity of E3s. It has been demonstrated that disordered segments are enriched both in short linear peptide motifs (SLiMs) and in phosphorylation sites that regulate SLiM functions, such as cellular localization, binding interactions, and catalytic activity (36). PTMs in disordered regions act combinatorially and allow complex regulatory decisions (37). How do these mechanisms regulate E3 function?
MDM2 is a well characterized E3 that regulates p53 (38). Multi-site phosphorylation of MDM2 occurs in several disordered segments within residues 114-294. Akt-mediated phosphorylation of Ser166 and Ser186 within close proximity of nuclear localization sequences (NLSs) and a nuclear export sequence (NES) in this disordered region of MDM2 stimulates its nuclear entry, which is critical in regulating p53 (39,40). BRCA1, another well studied E3, possesses NLSs within highly disordered segments and also NESs within its N-terminal RING domain: phosphorylation-dependent use of these motifs changes binding to the nuclear export/import machinery and regulates the cellular localization of BRCA1 (41). In another example, phosphorylation of Thr24/Ser29 residues in the highly disordered N terminus of the E3 ligase Siah2 by p38 MAPK results in its exclusion from the nucleus, changing its association with its nuclear target prolyl hydroxylase 3 (PHD3) (42). However, Siah2 has both nuclear and cytosolic substrates, and therefore its localization can affect its selection of substrates. The inherent flexibility and complexity of the system are underscored by multiple phosphosites and docking motifs for multiple kinases being located within this long disordered region, which allows regulation of the same E3 ligase by different pathways. Siah2 phosphorylation can also be mediated by other kinases, including c-Jun N-terminal kinase, dual specificity tyrosine phosphorylation-regulated kinase 2, and homeodomain-interacting protein kinase 2, which can phosphorylate similar motifs within the disordered N terminus of Siah2 (43). Nucleocytoplasmic shuttling of the yeast E3 Rsp5 (44), von Hippel-Lindau protein (45), hRPF1/Nedd4 (46), the RING-IBR protein RBCK1 (47), and muscle-specific E3 MAFbx/Atrogin 1 (48) via NLS and NES signals (often in disordered regions and regulated by phosphorylation) provides other examples of the regulation of E3 activity via cellular localization.

PTMs in Disordered Interdomain Linker Regions Regulate the Conformational State and Activity of E3 Ligases
An important mechanism of E3 activation relies on phosphorylation events that modify critical residues in E3s and switch the E3 conformation from an inactive to an active state. E3 ligases have evolved a modular design whereby catalytic (ubiquitin-transferring), substrate-targeting, and other functionalities are often segregated into distinct domains (in the case of single-subunit E3s (ssE3s)) or into distinct subunits (in multi-subunit E3s (msE3s)) (Fig. 1). Modularity enables regulation of supertertiary structure (49), i.e. the relative arrangement of domains within multi-domain proteins, where the dynamics of flexible inter-domain linkers lead to the formation/disruption of intramolecular, inter-domain contacts; this appears to be a widely used strategy in the regulation of E3 ligase activity (50). Again, this links degradation to signaling, as kinases are frequently used to modify residues in disordered/flexible interdomain linkers and thereby regulate E3 activity by affecting conformational states, as described next.
For example, the E3 ligase Itch is regulated by a phosphorylation-induced conformational change (33). When unphosphorylated, the activity of the Ub-transferring, catalytic HECT (homologous to E6-AP C terminus) domain is inhibited via an intramolecular interaction with the WW domain. JNK1 phosphorylates Itch on three sites within the disordered proline-rich region, altering the conformation of the WW domain, which weakens the WW-HECT interaction and concomitantly increases the catalytic activity of the HECT domain. Similarly, the catalytic RING domain of c-Cbl is negatively regulated by other domains (the tyrosine kinase-binding (TKB) and linker helix domains) in the protein. The linker helix region is predicted to be partly disordered and contains two critical Tyr residues (Tyr371 and Tyr368) that mediate phosphorylation-induced activation of c-Cbl. Tyr371 and Tyr368 phosphorylation removes negative regulation by inducing a conformational transition to an "open," active state (34,51). The neuro-protective E3 ligase Parkin is also maintained inactive in a closed, auto-inhibited conformation by intramolecular interactions (52). PTEN-induced putative kinase 1 (PINK1)-dependent phosphorylation of Ser65, which is buried within a pocket formed between the disordered linker region (residues 77-140) and the N-terminal UBL domain (residues 1-76), stimulates opening of the intertwined domains, allowing movement of UBL and the flexible linker that enables catalytic activity (53). Recent structural analysis provided detailed insights into progressive conformational changes following phosphorylation of ubiquitin and the Parkin UBL domain, which removes auto-inhibition and activates Parkin (54). Given the importance of these phosphoacceptor residues, several mutations at these positions have been implicated in disease. Fig. 1 shows the domain organization and structural design of the major E3 ligase subfamilies.
The common design principles are: (i) modular construction, and (ii) spatial separation of substrate binding and E2-Ub binding functions. Once active, "open" conformations are achieved, ubiquitin transfer necessitates conformational flexibility that can bring E2-Ub into close proximity to the bound substrate such that a suitable microenvironment for catalytic Ub transfer is created. This is achieved by linker flexibility in the case of ssE3s and arises from the flexibility of intervening subunits in the case of msE3s. When crystal structures of representative E3s are colored by residue-wise predicted disorder scores (Fig. 1, bottom), we can identify putative flexible regions that may be crucial for Ub transfer dynamics.
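The residue-wise disorder scores used to color the structures in Fig. 1 can also be converted into explicit candidate flexible segments. The following is a minimal sketch, assuming the scores are already available as a plain list of per-residue values between 0 and 1 (for example, exported from a predictor such as IUPred); the function name `disordered_segments`, its parameters, and the toy scores are illustrative assumptions, not part of any published tool. The 0.5 cutoff follows the Fig. 1 legend.

```python
from typing import List, Tuple


def disordered_segments(
    scores: List[float], threshold: float = 0.5, min_len: int = 1
) -> List[Tuple[int, int]]:
    """Return 1-based inclusive (start, end) runs of residues whose
    predicted disorder score is >= threshold and whose length is >= min_len."""
    segments: List[Tuple[int, int]] = []
    start = None  # 1-based start of the run currently being extended
    for i, score in enumerate(scores, start=1):
        if score >= threshold:
            if start is None:
                start = i
        elif start is not None:
            # Run ended at residue i - 1.
            if i - start >= min_len:
                segments.append((start, i - 1))
            start = None
    # Close a run that extends to the C terminus.
    if start is not None and len(scores) - start + 1 >= min_len:
        segments.append((start, len(scores)))
    return segments


# Hypothetical scores for a 10-residue toy sequence (not real data).
toy_scores = [0.9, 0.8, 0.7, 0.2, 0.1, 0.3, 0.6, 0.5, 0.55, 0.4]
print(disordered_segments(toy_scores))
```

Raising `min_len` restricts the output to longer runs, which is one simple way to focus on linker-sized regions rather than isolated high-scoring residues.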

Disorder Facilitates the Dynamics of Ubiquitin Transfer
Molecular simulations carried out on Cbl showed that the flexible linker helix connecting the E2-binding RING domain and the substrate-binding domain functions as a hinge and allows large conformational transitions that bring E2-Ub and substrate in close proximity (20). HECT E3s also employ similar mechanisms: WWP1/AIP5 has a two-lobed structure in which conformational flexibility enabled by rotation about a hinge region linking the N- and C-terminal lobes appears essential for catalytic activity (55). Multi-subunit E3 ligases use substrate-binding subunits such as VHL (von Hippel-Lindau) box, SOCS (suppressor of cytokine signaling) box, or F-box proteins (Fig. 1) that possess two domains: one binds the substrate, and the other binds to Skp1/DDB1/Elongin BC subunits. Molecular dynamics simulations of nine such substrate-binding proteins (Skp2, Fbw7, β-TrCP1, Cdc4, Fbs1, TIR1, pVHL, SOCS2, and SOCS4) demonstrated that their flexible inter-domain linker acts as a hinge, rotating the substrate-binding domain toward the RING domain-bound E2-Ub (located on the other end of the msE3 complex), thus optimally positioning the bound substrate for ubiquitin transfer (56). Furthermore, investigating the cullin subunits of CRL complexes also showed that instead of being purely rigid scaffolds, the N-terminal domains of cullins contain hinge residues (highly conserved glycines) that impart flexibility (57). Thus, for large multi-subunit E3s such as the CRLs, flexibility throughout the complex, multi-protein structure is clearly evident: in the RING (Rbx1) module (58), in the cullin scaffold (57), and in the substrate-binding subunits (56).
Experimental elucidation of the role of flexibility in E3 linker regions/subunits mostly comes from comparison of multiple crystal structures and small-angle x-ray scattering experiments (as demonstrated in CRLs (59)) that highlight multiple linker conformations and significant conformational transitions to open structural forms that promote ubiquitination. Mutational analyses of linkers by changing linker length and introducing residues with different backbone geometries have also enabled elucidation of the role of linker flexibility (59).
Structural disorder in the E3 ligase BRCA1 has been experimentally characterized using NMR spectroscopy in conjunction with CD spectroscopy and limited proteolysis. BRCA1 has an ~1500-residue-long central disordered region (located between its N-terminal RING and C-terminal, tandem BRCT domains) that functions as a flexible scaffold for multiple interaction partners (60).

Disorder Facilitates Processivity in Ubiquitin Transfer by E3 Ligases
Structural disorder also enables processivity in ubiquitination. Ubiquitination enzymes undergo large conformational changes during their catalytic cycles, such as bridging large distances so that ubiquitin transfer onto appropriate substrate lysine residue(s) can take place (50,59,61). Ubiquitination is a processive modification because most E3s will catalyze multiple rounds of ubiquitin addition to the bound substrate. Processivity is a kinetic phenomenon widespread among enzymes that act on polymeric substrates, such as DNA, RNA, polysaccharides, and proteins (62). Ubiquitination by E3 ligases is often highly processive, resulting in either multiple monoubiquitination of the substrate on proximal Lys acceptor sites or the buildup of a polyubiquitin chain after several successive steps of modification (63,64). It has been suggested that disordered regions in E3s may be instrumental in enabling "intramolecular diffusion" of substrate- and E2-binding regions of the E3 toward each other, resulting in processivity (20).

Structural Disorder and Folding Transitions in E3 Ligases upon Substrate Binding
In addition to disordered/flexible inter-domain linkers, disordered E3 regions may also play a role in substrate targeting. During previous work (see Fig. 5 of Ref. 20 and references therein), we observed two instances where disordered segments of E3 ligases undergo induced folding upon binding to their substrate proteins. The interaction between the E3 ligase SMURF1 and its substrate SMAD1 (Protein Data Bank (PDB) 2LAZ) is an instance of co-folding (or synergistic folding) between two disordered regions. Interaction between the E3 ligase RING2 and RYBP (PDB 3IXS) is another example.

Substrate Regions Harboring E3 Recognition Motifs Are Highly Disordered
Next we survey substrates that undergo UPS-mediated regulated degradation and the nature of the specific determinants that E3 ligases recognize on their targets. Yeast has ~100 E3 ligases, and this number increases to ~600 in higher eukaryotes, including humans (65). These 600 ligases are responsible for targeting, in principle, the entire proteome. An important but unanswered question is the nature of the degron that is recognized by these E3s. The name degron has been coined for substrate elements that confer metabolic instability (66). We discuss results demonstrating that degradation-specifying elements are distributed within the substrate ("distributed" degron architecture) and that many currently identified degrons are closely associated with disordered protein regions.
A number of E3 ligases (the precise fraction of which is unknown) recognize short, linear (peptide) motifs on their target proteins (30). We refer to these as "primary degrons" because they mediate the first step in regulated degradation (35). We collected and analyzed 28 distinct primary degron types from the literature and from the Eukaryotic Linear Motif (ELM) database (of experimentally verified SLiMs) (67). Primary degrons are typical SLiMs (67): they are short (3-15 residues) sequences, conserved among orthologous proteins, and they contain specificity determinants that enable recognition by the substrate-targeting domains/subunits of E3 ligases. The D-box and KEN motifs were the first such degrons to be characterized in cell cycle regulatory proteins (68); they are recognized by the anaphase-promoting complex/cyclosome (APC/C), a multi-subunit E3 that regulates cell cycle progression in eukaryotes (69,70). Primary degrons are docking motifs for E3 ligases and initiate substrate entry into the UPS.
What kinds of substrates are regulated by these degrons? Our recently compiled dataset comprises 157 substrates (containing a total of 171 experimentally validated instances corresponding to the 28 degron types). These proteins are involved in a wide variety of pathways, such as cell cycle checkpointing, apoptosis, transcriptional regulation, etc. (Fig. 2). Based on our analysis of the 171 primary degron instances (35), we detected a significant correlation between the location of these primary degrons and intrinsically disordered substrate regions (Fig. 2). Almost 80% of known degron instances are present in disordered interdomain linkers or in disordered regions outside domains. The remainder localized to surface (mostly unstructured) loops of folded domains. Not surprisingly, the primary degron region could be observed in PDB structures of unbound (free) substrates for only 1 out of 157 substrates (IκB kinase β (IKK-β), residues 34-39, NQETGE, PDB ID: 4E3C). In all other structures, either the highly disordered region encompassing the degron was not included in the crystallization construct (as is often done to facilitate crystallization), or missing electron density was observed for the degron and its neighborhood.

Disordered Primary Degrons Are Regulated by PTMs within Degrons and Their Flanking Regions
PTMs such as phosphorylation often regulate primary degron recognition: many degrons are turned on/off after modification of one (or more) residues. The CBL family of E3s targets protein tyrosine kinases via the recognition of [DN]XpY[ST]XXP (X indicates any residue, and p indicates phosphorylation) and DpYR phosphotyrosine motifs (71,72). Phosphodegrons are also well known recognition sites for members of the F-box family (73), which form multi-subunit, S phase kinase-associated protein 1 (Skp1)-cullin 1 (Cul1)-F-box protein (SCF) complexes. The members of the F-box family function as substrate adaptors in the context of SCF complexes and mediate the degradation of many regulatory proteins. Phosphorylation-mediated on/off switching of degradation is a common strategy in substrates controlled by SCF E3 ligases. For example, SCF-βTrCP binds the consensus DpSGX{2,3}[pST] that is activated after double phosphorylation, SCF-Fbw7 binds [LIVMP]X{0,2}(pT)PXX([pST]) sequences, and SCF-Skp2 targets [DE]X(pT)PXK (74,75). Structural disorder facilitates deposition of PTMs, and it can also be argued that multiple modifications within a restricted region would benefit from the structural plasticity/malleability offered by disordered segments (76).
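Because these consensus motifs are written in a regular-expression-like notation, candidate degron sites can be located in a protein sequence with a simple pattern scan. The sketch below is illustrative only: it assumes the motifs are applied to the unmodified sequence (pS/pT/pY are written as S/T/Y, and X as "."), so it cannot tell whether a site is actually phosphorylated, and a hit is merely a candidate. The dictionary keys, the function name `find_degron_candidates`, and the toy sequence are hypothetical.

```python
import re
from typing import Dict, List, Tuple

# Consensus motifs quoted in the text, rewritten as regular expressions over
# the unmodified sequence (assumption: pS/pT/pY -> S/T/Y, X -> ".").
# Phosphorylation state is NOT checked; matches are candidates only.
DEGRON_PATTERNS: Dict[str, str] = {
    "SCF-betaTrCP": r"DSG.{2,3}[ST]",       # DpSGX{2,3}[pST]
    "SCF-Fbw7": r"[LIVMP].{0,2}TP..[ST]",   # [LIVMP]X{0,2}(pT)PXX([pST])
    "SCF-Skp2": r"[DE].TP.K",               # [DE]X(pT)PXK
    "CBL_PTK": r"[DN].Y[ST]..P",            # [DN]XpY[ST]XXP
    "CBL_DpYR": r"DYR",                     # DpYR
}


def find_degron_candidates(sequence: str) -> List[Tuple[str, int, str]]:
    """Return (motif name, 1-based start, matched text), sorted by position."""
    hits: List[Tuple[str, int, str]] = []
    for name, pattern in DEGRON_PATTERNS.items():
        # A lookahead with a capture group also reports overlapping matches.
        for match in re.finditer(f"(?=({pattern}))", sequence):
            hits.append((name, match.start() + 1, match.group(1)))
    return sorted(hits, key=lambda hit: hit[1])


# Made-up toy sequence (not a real protein) containing two motif-like stretches.
toy_sequence = "MAAADSGIHSGAAALTPEESAAA"
for hit in find_degron_candidates(toy_sequence):
    print(hit)
```

In practice such a scan would need to be combined with phosphorylation evidence and with a disorder filter, since short motifs match by chance in many sequences.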
Several examples are known where degrons with multiple phosphorylation sites can be targeted by multiple kinases, which adds regulatory complexity to substrate recognition. For example, cyclin-dependent kinase 2 and glycogen synthase kinase 3 phosphorylate different residues of the cyclin E degron (77). In certain cases, one (or multiple) priming phosphorylations in degron-flanking residues are necessary before phosphorylation of the degron itself can take place; such priming events have been shown to be critical for substrates such as c-Jun (78), β-catenin (79), and Yes-associated protein transcriptional coactivator (80). Priming phosphorylations sequentially create docking sites for downstream kinases, and it is highly likely that local structural disorder facilitates multiple interactions (multiple kinase/phosphatase pairs) required for the regulation of such events.
The use of PTMs to (in)activate substrate/E3 recognition can also confer structural advantages. Phosphorylation in disordered regions can modulate local conformational preferences such that bound-state-like pre-structuring is observed (81). Thus, priming modifications in disordered regions can serve to achieve specificity for E3 recognition, using a signaling event as a trigger and thereby achieving temporal and signal-dependent binding specificity.

Multiple Degrons Present in Disordered Segments Increase Avidity of Interaction
Avidity in E3-substrate interactions can be enhanced by multiple degrons in a disordered segment. Sic1 contains multiple suboptimal phosphodegrons that have evolved an ultrasensitive switch-like response such that phosphorylation of a certain defined number of degrons is required before E3 binding becomes stable enough (82). The creation of such an intricate docking network within phosphorylation clusters requires significant structural plasticity to ensure a functional system. Other examples of polyvalent cooperative interactions facilitated by disordered regions can be seen in the multiple Ser/Thr-rich degrons of the E3 ligase Cul3-HIB/SPOP that are clustered in disordered substrate regions, and whose in vivo cooperativity appears important for E3 binding and degradation (83).

Ubiquitin-Acceptor Lysines on Substrates and Correlation with Disordered Regions
Following substrate recognition and binding, E3-E2 pairs catalyze Ub transfer onto substrate lysines. Ubiquitinated lysines that are linked to proteasomal degradation have been termed "secondary degron(s)" (35). The identity and characteristics of selected lysines are not fully understood. Disordered substrates such as p27 and p21 undergo non-selective Lys modifications that lead to degradation. For other substrates, the geometry of the E3-E2 machinery should lead to preferential orientation and selection of defined surface regions based on an accessible search radius (ubiquitination zone (84-86)) containing one or more Lys. In recent analyses, we observed that degradation-linked, ubiquitinated lysines were often missing from PDB electron density maps, and many were predicted to fall into locally disordered regions (35). This leads us to speculate that a conformationally fluctuating surface/region should increase the probability of fruitful Ub transfer to multiple substrate lysines. The APC/C was shown to prefer lysines in disordered regions for ubiquitination (87). Similar observations were also made by bioinformatics studies, suggesting a bias for degradation-linked ubiquitination sites to be more disordered when compared with unmodified lysines (88). Ubiquitinated proteins are targeted to the 26S proteasome for degradation, and a higher intrinsic flexibility of segments containing multiple ubiquitinated lysines would also serve to better engage the Ub receptors located on the regulatory subunits of the proteasome with higher avidity interactions (89).

Substrates Require a Disordered Initiation Site for Efficient Degradation by the 26S Proteasome
Detailed biochemical analyses suggested that ubiquitination is necessary but not sufficient for proteasome-mediated degradation. A disordered (or partially unfolded) region on the substrate is also required, because substrates without this disordered degradation initiation site (we term it the "tertiary degron" (35)) are not effectively degraded, despite association with the proteasome (90,91). The function of this disordered region is to initiate productive proteasomal engagement of the substrate and subsequent ATP-dependent unfolding. Because the ubiquitin receptors on the 19S regulatory particle of the proteasome are located ~70-80 Å away from the ATPase unfolding channel, an effective substrate requires a disordered region of a minimum of 20-30 residues in length, located next to the polyubiquitin tag (92,93). We found that this feature distinguishes ubiquitination sites involved in degradation from those with regulatory functions: nearly 60% of degradation-linked sites are located in the immediate vicinity (within 0-10 residues) of a long disordered region (of at least 20 consecutive disordered residues), whereas the equivalent fraction is only 20-30% in the case of ubiquitinated lysines that include non-degradation, regulatory functions (35). Further, degradation efficiency drops sharply when the two sites (site of ubiquitin tagging and the disordered segment) are gradually separated (91), and paralogs lacking a local long disordered region are often involved in signaling rather than degradation (94). It may be that degradation becomes less efficient for these paralogs but does not stop altogether; this would ensure control of signaling decay with time.
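The proximity criterion described above (a ubiquitinated lysine within 0-10 residues of a run of at least 20 consecutive disordered residues) can be expressed as a short check on per-residue disorder scores. This is a minimal sketch under stated assumptions: scores are a plain list of values between 0 and 1, a residue counts as disordered at a score of 0.5 or greater (the cutoff in the Fig. 1 legend), and "within 0-10 residues" is interpreted as lying inside the run or no more than 10 residues from either end. The function names and the toy profile are hypothetical and are not taken from the cited analysis (35).

```python
from typing import List, Sequence, Tuple


def long_disordered_runs(
    scores: Sequence[float], threshold: float = 0.5, min_len: int = 20
) -> List[Tuple[int, int]]:
    """1-based inclusive runs of >= min_len consecutive disordered residues."""
    runs: List[Tuple[int, int]] = []
    start = None
    # A trailing sentinel below any threshold closes a run at the C terminus.
    for i, score in enumerate(list(scores) + [float("-inf")], start=1):
        if score >= threshold:
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_len:
                runs.append((start, i - 1))
            start = None
    return runs


def near_long_disorder(
    lys_pos: int,
    scores: Sequence[float],
    max_gap: int = 10,
    threshold: float = 0.5,
    min_len: int = 20,
) -> bool:
    """True if the 1-based lysine position lies inside, or within max_gap
    residues of, a long disordered run."""
    return any(
        start - max_gap <= lys_pos <= end + max_gap
        for start, end in long_disordered_runs(scores, threshold, min_len)
    )


# Toy profile (not real data): ordered 1-30, disordered 31-55, ordered 56-85.
toy_profile = [0.2] * 30 + [0.8] * 25 + [0.2] * 30
print(near_long_disorder(25, toy_profile))  # 6 residues before the run
print(near_long_disorder(10, toy_profile))  # 21 residues before the run
```

The defaults mirror the numbers quoted in the text; changing `max_gap` or `min_len` makes it easy to test how sensitive a classification is to those choices.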
In this context, it has been shown that the requirement for a disordered degradation initiation site appears to be the most important criterion even for ubiquitin-independent proteasomal substrates (examples include the transcriptional regulator Rpn4, thymidylate synthase, and ornithine decarboxylase) (95,96).

Conclusions and Perspectives
Alongside transcriptional and translational control, regulated degradation is a primary mechanism by which the cell controls the functioning of its extremely complex collection of proteins, i.e. the proteome. Ubiquitin ligases select substrates for degradation by recognizing signals (degrons) within target proteins; these signals are often unknown but nonetheless confer specific recognition. Although the area is of significant contemporary interest, much work is required to identify the precise nature of these signals and details about their operation.
Here we have reviewed multiple lines of evidence suggesting the involvement of structural disorder and conformational flexibility in this pathway. Structural disorder also makes it more difficult to recognize degrons in the sequence and structure of proteins. We are at the beginning of unraveling the complex regulatory interplay between different signals and post-translational modifications in physiological and pathological functions. Dedicated bioinformatics, modeling, high-throughput proteomics, and detailed structural studies will be required to unravel the complexities associated with regulated degradation.