The FCS-like zinc finger scaffold of the kinase SnRK1 is formed by the coordinated actions of the FLZ domain and intrinsically disordered regions

The SNF1-related protein kinase 1 (SnRK1) is a heterotrimeric eukaryotic kinase that interacts with diverse proteins and regulates their activity in response to starvation and stress signals. Recently, the FCS-like zinc finger (FLZ) proteins were identified as a potential scaffold for SnRK1 in plants. However, the evolutionary and mechanistic aspect of this complex formation is currently unknown. Here, in silico analyses predicted that FLZ proteins possess conserved intrinsically disordered regions (IDRs) with a propensity for protein binding in the N and C termini across the plant lineage. We observed that the Arabidopsis FLZ proteins promiscuously interact with SnRK1 subunits, which formed different isoenzyme complexes. The FLZ domain was essential for mediating the interaction with SnRK1α subunits, whereas the IDRs in the N termini facilitated interactions with the β and βγ subunits of SnRK1. Furthermore, the IDRs in the N termini were important for mediating dimerization of different FLZ proteins. Of note, the interaction of FLZ with SnRK1 was confined to cytoplasmic foci, which colocalized with the endoplasmic reticulum. An evolutionary analysis revealed that in general, the IDR-rich regions are under more relaxed selection than the FLZ domain. In summary, the findings in our study reveal the structural details, origin, and evolution of a land plant–specific scaffold of SnRK1 formed by the coordinated actions of IDRs and structured regions in the FLZ proteins. We propose that the FLZ protein complex might be involved in providing flexibility, thus enhancing the binding repertoire of the SnRK1 hub in land plants.

signals (3). In agreement with this, FLZ proteins were found to be interacting with regulators of hormone signaling, with development and abiotic and biotic stresses supporting their role as adaptors to mediate SnRK1 signaling (8,10). Functional analysis of FLZ genes identified their roles in the regulation of many SnRK1-regulated processes (11)(12)(13). Consistent with these studies, we recently identified that two members of Arabidopsis FLZ gene family work as negative regulators of SnRK1 signaling (14). Interestingly, many FLZ proteins also showed interaction with RAPTOR, which suggests that they may work as scaffolds by bringing both SnRK1 and TOR complexes in close proximity, which is required for the energy-dependent antagonistic interaction (8,10).
The scaffold proteins usually possess high conformational flexibility to accommodate the interaction with diverse proteins (15). The plasticity and adaptability of scaffold proteins are usually achieved by the presence of intrinsically disordered regions (IDRs) in these proteins (16). FLZ proteins show low sequence conservation outside the FLZ domain indicating that these regions might be enriched with disordered regions (6,7). IDRs lack fixed secondary or tertiary structures. They are particularly enriched in eukaryotic proteins and thought to be related to organismal complexity (17). The pliable nature of IDRs increases the binding repertoire of proteins. Consistent with this, hub proteins are generally found to be enriched with IDRs (17, 18). The versatile interaction property of FLZ proteins, coupled with their low sequence conservation in the N and C termini, indicates that IDRs might be contributing to the protein interactions of FLZ proteins.
In this study, we employed a large dataset of FLZ proteins from 33 sequenced plant genomes to study the origin and evolution of IDRs in the FLZ protein family. Our analyses predict that the FLZ family proteins possess evolutionarily conserved protein-binding IDRs in the N and C termini. Specific enrichment of post-translational modification (PTM) sites responsible for protein-protein interaction in the IDR-rich region indicates their role in enhancing the interaction repertoire. The protein-protein interaction assays identified that IDRs in the N termini and the structured FLZ domain mediate the interactions with specific SnRK1 subunits in Arabidopsis. The scaffold proteins provide an interaction surface for other proteins by virtue of their large size or through dimerization (19). We also found that FLZ proteins form homo-and heterodimers through the IDRs in the N termini. Interestingly, in planta interaction assays identified that the interaction of FLZ proteins with SnRK1 subunits was specific to cytoplasmic foci that colocalizes with the endoplasmic reticulum (ER). The evolutionary analysis identified that the sequence divergence in the IDR-rich N and C termini is more dynamic compared with the structured FLZ domain. Thus, this study uncovers the origin and evolution of a plant-specific complex formed by the division of labor between the structured FLZ domain and the IDRs, and this complex might be important in mediating the function of the conserved eukaryotic energy gauge, SnRK1.

N and C termini of FLZ proteins are enriched with intrinsically disordered regions
The FLZ proteins from 33 sequenced plant genomes, which belong to key taxonomical positions, were identified through BLAST and HMM-based searches (Fig. S1). A nonredundant dataset of these FLZ proteins was created and used for the subsequent analysis (Table S1). To find out the disorder propensity of FLZ proteins, we employed PONDR-FIT, which is a metapredictor of PONDR-VLXT, PONDR-VSL2, PONDR-VL3, FoldIndex, IUPred, and TopIDP predictors and displays enhanced accuracy over individual predictors (20). FLZ proteins show limited domain acquisition throughout the plant lineage, and they usually possess a solitary FLZ domain ( Fig. 1A and Table S2), which is predicted to form an ␣-␤-␣ topology (6,7). Using PONDR-FIT, we first predicted the disorder propensity of FLZ proteins from the model plant Arabidopsis. Most of the FLZ proteins from Arabidopsis are predicted to have high propensity for disorder in N and C termini compared with the FLZ domain (Fig. 1B). FLZ domain-containing proteins are absent in all the green algae (Chlorophyta) species sequenced until now, and we identified an FLZ protein from Klebsormidium flaccidum, which belongs to Charophyta, suggesting that the origin of FLZ domain coincides with the terrestrialization of plants ( Fig. S1) (21). The FLZ protein of K. flaccidum was also predicted to have high propensity for disorder at the N and C termini. Furthermore, the FLZ proteins from Bryophyta and Pteridophyta also showed a similar trend indicating that the disorder of N and C termini of FLZ proteins is conserved (Fig. 1C). The FLZ gene family is highly expanded in spermatophytes due to the common and lineage-specific whole-genome duplication (WGD) events ( Fig. S1) (6). Analysis of the disorder propensity of FLZ proteins from 28 species belonging to Spermatophyta suggested that N and C termini of FLZ proteins possess high propensity for disorder in all species analyzed (Fig. 1D). These results collectively suggest that disordered nature of N and C termini of FLZ proteins is an evolutionarily conserved feature.
IDRs generally show great variation in sequence and length. We calculated the number of predicted short (10 -29 residues) and long (30 or more residues) IDRs from the FLZ protein family across the plant lineage (Table S3). We found that the number of predicted short IDRs is very high compared with long IDRs in the FLZ protein family (Fig. 1E). The N termini showed high frequency of both predicted short and long IDRs followed by C termini (Fig. 1F). Interestingly, many IDRs were also predicted in the junction of the N and C termini and FLZ domain. The probability for IDRs was found to be less in the structured FLZ domain region (Fig. 1F).
Although the number of FLZ proteins is highly expanded in spermatophytes, the average size of FLZ proteins is conspicuously reduced in spermatophytes (Fig. S1). We compared the size of FLZ proteins from lower plants with Arabidopsis and rice, and it seems that the reduction in the size of IDR-rich N terminus is responsible for the reduction in the overall protein size in spermatophytes (Fig. S2).

FLZ proteins form SnRK1 scaffold in land plants N and C termini of FLZ proteins predicted to have high propensity for protein binding
IDRs generally facilitate versatile protein and nucleic acid binding (22). To get clues about the molecular functions of the IDRs in the N and C termini of FLZ proteins, we analyzed the nucleic acid-and protein-binding propensities using Diso-RDPbind, which is an efficient multiple parameter-based pre-dictor of protein-, DNA-, and RNA-binding residues in IDRs (23). In our analysis, both N and C termini were predicted to have consistently high propensity for protein binding in all species analyzed. In the case of the C terminus, many proteins were also predicted to have high RNA-binding propensity ( Fig. 2A).
The IDRs are generally enriched with sites for PTMs such as phosphorylation, acetylation, and methylation (24 -28). These The bars indicate the relationship between IDR length and frequency after categorizing IDRs into groups based on length. The IDRs with a length difference of up to five amino acids were grouped together. The red dots indicate the relationship between IDR length and frequency without any grouping. F, average distribution of short (SIDR; 10 -29 residues) and long (LIDR; 30 or more residues) IDRs in different regions of FLZ protein family. The detailed table of IDRs in FLZ protein family is given in Table S3.  Table S4. The p value obtained in the paired t test indicating the statistically significant difference in the number of PTM sites between N or C termini and FLZ domain is given in the graphs.

FLZ proteins form SnRK1 scaffold in land plants
modifications alter the target-binding properties of IDRs and increase the repertoire of protein states in the cell (17, 29). Phosphorylation, arginine methylation, and lysine acetylation are known to modulate protein-protein interactions (22,27,28,30). Because the N and C termini of FLZ proteins are predicted to have high propensity for protein binding, we analyzed the extent of enrichment of putative phosphorylation-, acetylation-, and methylation-prone residues in these regions as compared with the structured FLZ domain. Indeed, we found a significant increase in the number of potent serine and threonine phosphorylation sites in N terminus followed by the C terminus as compared with the FLZ domain ( Fig. 2B; Table S4). However, we did not find any major increase in the putative tyrosine phosphorylation sites in the N terminus. A significant decrease in the tyrosine sites was observed in C terminus as compared with the FLZ domain region ( Fig. 2B; Table S4). Furthermore, we observed significant increases in the putative arginine methylation and lysine acetylation sites in the N termini compared with FLZ domain (Fig. 2, C, and D; Table S4).

Prediction of disorder-to-order transition regions and low complexity regions in the FLZ proteins
IDRs may or may not undergo disorder-to-order transition upon binding with targets (17, 31). To understand the evolutionary perspective of disorder-to-order transition in FLZ proteins, we predicted the regions that can undergo disorder-toorder transition upon binding to globular protein partners in the IDRs of FLZ proteins from lower plants and Amborella trichopoda, a basal species from the sister lineage of angiosperms using ANCHOR (32,33). The IDRs in the N terminus of FLZ proteins from lower plants were predicted to be highly enriched with such binding regions, whereas a general reduction in their number and restricted distribution of binding regions was predicted in A. trichopoda (Fig. S3A). Other spermatophytes also showed the same scenario (Fig. S3B). Molecular recognition features (MoRFs) are short motifs (typically 10 -70 residues long) in the IDRs that undergo disorder-toorder transition upon binding with their targets, and they are generally involved in facilitating protein-protein interactions (22,34). The fMoRFpred prediction revealed low enrichment of MoRFs in the FLZ proteins in both lower plants and spermatophytes (Fig. S3, C and D) (35,36).
The low complexity regions (LCRs) in the proteins are formed due to limited diversity and repetition of certain amino acid types. They are highly abundant in eukaryotic proteins in their IDRs and often enhance the binding promiscuity (37)(38)(39). In our analysis using SEG algorithm (40), LCRs were predominantly predicted in the IDRs of the N terminus of lower plants, whereas reduction in their number and restricted distribution was observed in spermatophytes (Fig. S3, E and F).

Arabidopsis FLZ proteins interact with all subunits of SnRK1
The in silico analysis suggested that FLZ proteins possess IDRs, which are potentially involved in protein-protein interaction. It is already reported that many FLZ proteins promiscuously interact with ␣ kinase subunits of SnRK1 in Arabidopsis (8,9). To experimentally validate the role of IDRs in proteinprotein interaction, we selected the interaction of FLZ proteins with SnRK1. SnRK1 is an obligate heterotrimeric enzyme, and FLZ proteins are proposed to be an adaptor of SnRK1; therefore, we hypothesized that the interaction might not be restricted to kinase subunits. To test this, we cloned all 18 Arabidopsis FLZs in the BD vector for the Y2H experiment with SnRK1 subunits. Prior to the Y2H experiment, BD-FLZ clones were transformed in yeast, and their auto-activation and toxicity were checked. In this assay, all 18 Arabidopsis FLZ proteins showed no auto-activation and toxicity in yeast (Fig. S4). Next, we cloned SnRK1␣1, -␣2, -␤1, -␤2, -␤3, -␤␥, and -␥1 subunits in the AD vector. The BD and AD constructs were cotransformed in yeast, and interaction was analyzed. As reported previously, FLZ proteins showed promiscuous interaction with SnRK1␣ subunits in yeast (Fig. 3). In a previous study, no interaction of SnRK1␣ subunits with FLZ14 and FLZ17/18 in yeast was observed (9). Interestingly, in our analysis, we found that all 18 Arabidopsis FLZ proteins, including FLZ14 and FLZ17/18, can interact strongly with both ␣ subunits in yeast. The ␤ subunits showed interaction with many FLZ proteins in variable strengths. Among the ␤ subunits, SnRK1␤3 interacted with most numbers of the FLZ protein family. SnRK1␤␥ showed strong interaction with FLZ2 and FLZ13 and weak interaction with FLZ8. Earlier, based on the sequence similarity, a cystathionine ␤-synthase protein was annotated as a ␥ subunit in Arabidopsis; however, protein-protein interaction and complementation studies found that it can neither complement the yeast snf4 mutant nor interact with ␣ and ␤ subunits (41,42). Later, this protein was found to have sequence similarity with SDS23, which works as an alternative energy sensor in fungi (5). None of the 18 FLZ proteins showed interaction with this SnRK1␥1/SDS23 LIKE (SDS23L) protein suggesting that FLZ proteins specifically interact with SnRK1 subunits that were found to be forming an enzyme complex in the earlier study ( Fig. 3) (42). SNF1-related-kinase 1 activating kinase 1 and 2 (SnAK1 and SnAK2) are redundant upstream kinases of SnRK1 that are essential for SnRK1 activity (43). Because these kinases physically interact with the SnRK1 complex, we investigated whether FLZ proteins can also interact with SnAK1 and SnAK2. In the Y2H assay, none of the 18 FLZ proteins showed interaction with SnAK1 or SnAK2 confirming that the interaction of FLZ proteins is restricted to the SnRK1 complex (Fig. S5).

FLZ domain mediates the interaction with SnRK1␣ subunits
The FLZ domain of FLZ1 is sufficient to mediate its interaction with PFA-DSP3 and STH2 suggesting that FLZ domain is the canonical protein-protein interaction module in the FLZ proteins (6). Consistent with this, the FLZ domain of FLZ12 alone was found to be sufficient to mediate interaction with SnRK1␣ subunits (9). Our in silico analysis identified that the N and C termini of FLZ proteins are highly enriched with IDRs with protein-binding propensities, and FLZ domain is the least disordered region in FLZ proteins. To find out which part of the FLZ protein is responsible for the interaction with SnRK1␣ subunits, we first cloned the FLZ domain region from five diverse FLZ proteins (Fig. 4A) and checked their ability to interact with SnRK1␣1 and -␣2. Interestingly, all five FLZ domains showed strong interaction with SnRK1␣2, whereas the FLZ  The CDS of SnRK1 and FLZ genes were cloned in AD and BD vectors respectively. The constructs were cotransformed, and interaction was screened on DDO (upper row in each group) and QDO plates supplemented with X-␣-Gal and AbA (lower row in each group). Simultaneously, a negative control experiment with BD vector and AD construct was carried out to identify false interactions.

FLZ proteins form SnRK1 scaffold in land plants
domain of FLZ1, -2, and -8 showed strong interaction with SnRK1␣1 ( Fig. 4B). However, the FLZ domain of FLZ3 showed a weak interaction, and the FLZ domain of FLZ15 did not show any interaction at all (Fig. 4B). Consistent with the previous report (9), these results suggest that the FLZ domain is majorly responsible for facilitating interaction with SnRK1␣ subunits. Furthermore, we checked whether the IDR-rich N and C termini of FLZ1 and FLZ2 can establish interaction with SnRK1␣ subunits. In the Y2H assay, neither N nor C termini showed interaction with SnRK1␣ (Fig. 4C). To confirm the role of the FLZ domain in mediating interaction with SnRK1␣ subunits, we replaced conserved cysteine residues with serine in the FLZ domain of FLZ8 (Fig. 4D). In the interaction assay with FLZ8⌬1, the strength of interaction with SnRK1␣1 was reduced compared with the intact FLZ domain, whereas FLZ8⌬2 and FLZ8⌬1-⌬2 showed a more pronounced reduction in the interaction (Fig. 4E). The interaction property of FLZ8⌬1 was dramatically reduced when we also replaced the adjacent cysteine (Cys-225 and Cys-227, FLZ8⌬1-⌬3 construct) residues with serine residues (Fig. 4E). This result suggests that the ability of the nonconserved cysteine residues (Cys-225 and Cys-227) to partially replace the function of conserved cysteine residues (Cys-226 and Cys-229) could be the reason for the retention of interaction in the FLZ8⌬1 construct. Taken together, mapping and site-directed mutagenesis (SDM) analysis confirmed that the FLZ domain is the canonical SnRK1␣-interacting module in FLZ proteins.

N terminus mediates the interaction with SnRK1␤ and ␤␥ subunits
We further investigated the role of different parts of FLZ proteins in mediating interaction with SnRK1␤ and ␤␥ subunits. Strikingly, except for a few weak interactions, the FLZ domain failed to establish interaction with SnRK1␤1-3 and ␤␥ subunits (Fig. 4, F-I). This result suggested that other regions in the FLZ proteins might be responsible for mediating interaction with SnRK1␤ and ␤␥ subunits. To test this, we checked the interaction of the N and C termini of FLZ1 with the SnRK1␤ subunits and the N and C termini of FLZ2 with the SnRK1␤ and ␤␥ subunits. We found that the N terminus alone is sufficient to mediate interaction with SnRK1␤ and ␤␥ subunits (Fig. 5, A and  B). Apart from ␣ subunits, FLZ8 interacts strongly with all three ␤ subunits in yeast (Fig. 3). Because the N termini of FLZ1 and FLZ2 are responsible for interaction with SnRK1␤ subunits, we hypothesized that the mutation of cysteine residues in the FLZ domain should not hamper the interaction of FLZ8 with ␤ subunits. In our assay, we found that FLZ8⌬1, FLZ8⌬2, and FLZ8⌬1-⌬2 constructs produced strong interaction with all three ␤ subunits indicating that the IDR-rich N terminus is responsible for establishing interaction with SnRK1␤ and ␤␥ subunits (Fig. 5, C-E).

FLZ-SnRK1 interaction site colocalizes with endoplasmic reticulum
To confirm the Y2H results and to identify the in planta interaction site of FLZ-SnRK1, we performed BiFC assay of

FLZ proteins form SnRK1 scaffold in land plants
SnRK1 subunits with different FLZ proteins. In the BiFC assay, all FLZ proteins were found to be interacting with SnRK1␣ subunits in cytoplasmic foci as sinuous bodies ( Fig. 6A; Fig. S6). These bodies were predominantly found to be closely associated with the nucleus. These results prompted us to investigate whether they are ER-associated with the nucleus. To investigate this possibility, we employed a widely used ER-marker, which was constructed by the fusion of ER-signal peptide of WALL-ASSOCIATED KINASE 2 and an ER retention peptide to mCherry (44). Indeed, we found a strong colocalization of BiFC signal with the ER-marker signal ( Fig. 6A; Fig. S6). To further confirm this, we used ER-Tracker Red dye. The interaction of FLZ15 with SnRK1␣1 was found to colocalize with the signal of ER-Tracker dye (Fig. 6B). These results collectively confirm the colocalization of the SnRK1-FLZ interaction with ER. The interaction of FLZ4 with SnRK1␤3 was also found to be localized to cytoplasmic sinuous bodies (Fig. S7A). Furthermore, staining with ER-Tracker Red dye identified that this interaction also colocalizes with ER (Fig. S7B). The BiFC signal of FLZ-SnRK1 interaction was found to be specific because removal of SnRK1 subunits or FLZ proteins resulted in the abolition of the fluorescent signal (Fig. S8). Intriguingly, subcellular localization analysis found that only FLZ9 and FLZ15 are predominantly localized to cytoplasmic bodies, whereas most of the FLZ proteins were found to be localized uniformly in both nucleus and cytoplasm (Fig. S9).

FLZ proteins undergo homo-and heterodimerization
The in planta BiFC assay identified that FLZ proteins interact with SnRK1 subunits in the same cytoplasmic foci. This result suggested that either different FLZ proteins might be recruited to the SnRK1 complex on the basis of specific cues or that SnRK1 might be forming a complex with multiple FLZ proteins simultaneously. The promiscuous interaction of FLZ proteins with SnRK1␣ subunits led us to propose their role as adaptor proteins of SnRK1, which help in the recruitment of SnRK1 targets to the complex (9). In this investigation, we also observed that multiple FLZ proteins interact with not only

FLZ proteins form SnRK1 scaffold in land plants
SnRK1␣ but also with the ␤ and ␤␥ subunits. These results led us to speculate that various FLZ proteins might be forming a complex with SnRK1, and the FLZ proteins might be physically interacting among themselves to facilitate the formation of this complex.
To test this hypothesis, we cloned 10 FLZs in the AD vector, and their interaction with 16 FLZ proteins in the BD vector was screened through the Y2H assay. Indeed, we found homo-and heterodimerization of FLZ proteins with varying strengths (Fig.  7). Among the combinations we analyzed, FLZ7, FLZ10, FLZ12, and FLZ15 showed homodimerization. Interestingly, these proteins also showed a high degree of heterodimerization with other proteins, whereas members such as FLZ1, FLZ6, and FLZ11 interacted specifically with one or two FLZ proteins (Fig. 7).

N terminus is involved in the interaction among FLZ proteins
To identify which part of the protein is responsible for mediating interaction among the members of FLZ family, we first analyzed which part of FLZ2 can mediate the interaction with the other FLZ proteins. Mapping of interacting regions identified that among the different parts of FLZ2, only the N terminus could recapitulate the interaction with full-length proteins (Fig.  8A). To confirm the role of the N terminus in mediating the interaction with other FLZ proteins, we mapped the FLZ-interaction region of FLZ1, which interacted with FLZ7 and FLZ15 in the previous assay. In the interaction site mapping, FLZ7 and FLZ15 were found to be strongly interacting with the N terminus of FLZ1. FLZ15 showed a weak interaction with the FLZ domain as well (Fig. 8B). The Y2H assay with the N terminus of both interacting FLZ proteins identified strong interaction confirming that the N terminus alone is sufficient for mediating interaction among these FLZ proteins (Fig. 8, C and D).

IDRs in the N terminus are involved in mediating interaction with FLZ proteins and SnRK1 subunits
The in silico analyses predict that the N terminus of FLZ proteins is enriched with protein-binding IDRs. In agreement with this, protein-protein interaction analysis identified that the IDR-rich N terminus of FLZ proteins is responsible for mediating interaction with other FLZ proteins and SnRK1␤ and ␤␥ subunits. To decipher the specific role of IDRs in mediating interactions, we first selected FLZ1 and FLZ2, a paralogous pair, which showed extensive interaction with other FLZ proteins and SnRK1 subunits in the Y2H assays. FLZ1 is a comparatively large protein due to the longer N terminus harboring a long IDR of 79 residues, which covers most of the N terminus (hereinafter referred as FLZ1 NIDR1 ) (Fig. 9A). In FLZ2, the middle region of FLZ1 NIDR1 is lost during evolution, which resulted in the formation of two separate IDR-rich regions (hereafter referred to as FLZ2 NIDR1 and FLZ2 NIDR2 ). The second IDR-rich region in the FLZ2 was found to be only 9 amino acids long on the cutoff parameters we employed; however, because these regions show similarity with the long FLZ1 NIDR1 , we considered it as a short IDR (Fig. 9A). We first cloned the short IDRs of FLZ2 separately, and their interaction property was analyzed with FLZ7 and FLZ10, which was found to be interacting with FLZ2 through the N terminus in the earlier experiment (Fig.  8C). Intriguingly, only FLZ2 NIDR2 could recapitulate the interaction of FLZ2 with FLZ7 and FLZ10 (Fig. 9B). This result suggested that specific IDRs in the N terminus might be involved in mediating interaction among FLZ proteins. Furthermore, we analyzed which of the IDRs is involved in mediating the interaction of FLZ2 with SnRK1␤ and ␤␥ subunits. SnRK1␤ and ␤␥ subunits were found to be interacting with both IDRs in yeast (Fig. 9C). Collectively, these results suggest that IDRs specifically or collectively are involved in mediating interactions of FLZ proteins with other proteins. To identify whether this specificity of IDR is conserved, we checked the interaction capacity of the long IDR of FLZ1. We divided the FLZ1 NIDR1 into two parts, where FLZ1 NIDR1  harbors the region that shows similarity with FLZ2 NIDR1 and FLZ1 NIDR1 (49 -81) shows similarity with FLZ2 NIDR2 (Fig. 9A). In the interaction assay with both parts of FLZ1 NIDR1 , only FLZ1 NIDR1 (49 -81) , which shows sequence similarity with FLZ2 NIDR2 , could recapitulate the interaction of FLZ1 with FLZ7 and FLZ15 (Fig. 9D). FLZ1 interacts with all three SnRK1␤ subunits through the N terminus. In the Y2H assay with FLZ2 NIDR1 and FLZ2 NIDR2 , we found that both these IDRs are cooperatively involved in facilitating the interaction with SnRK1␤ subunits (Fig. 9C). To find whether this property is conserved in FLZ1 also, we tested the interaction of FLZ1 NIDR1  and FLZ1 NIDR1 (49 -81) with SnRK1␤ subunits. As observed in FLZ2, both regions were found to be interacting with SnRK1␤ subunits (Fig. 9E). These results suggest that specific IDR regions might be involved in mediating different interaction in FLZ proteins, and this specificity might be conserved among paralogs. The FLZ genes were cloned in AD and BD vectors; constructs were cotransformed; and interaction was screened on DDO (upper row in each group) and QDO plates supplemented with X-␣-Gal and AbA (lower row in each group). Simultaneously, a negative control experiment with BD vector and AD construct was carried out to identify false interactions.

FLZ proteins form SnRK1 scaffold in land plants
To further confirm the role of IDRs in mediating specific interactions, we constructed a chimeric construct where the FLZ2 NIDR2 , which mediates the interaction with FLZ10, was fused with the N terminus of FLZ1. In the Y2H assay, the FLZ1 N terminus did not show interaction with FLZ10, whereas the chimeric construct with FLZ2 NIDR2 showed interaction with FLZ10 confirming the role of IDRs in mediating specific interactions (Fig. 9F).

Different regions of FLZ proteins show different rates of sequence divergence
During evolution, protein structure tends to be more conserved than the sequence (45). Consistent with this, many studies suggest enhanced sequence divergence among IDRs (46,47). However, later elaborative studies identified a more nuanced pattern of sequence divergence among IDRs where some IDRs and amino acids contributing to IDR formation were found to be highly conserved (17, [47][48][49]. FLZ domain was found to be the most conserved region in FLZ proteins suggesting the existence of different rates of sequence divergence among the different regions in the FLZ proteins (6,7). To know whether the IDR-rich N and C termini of FLZ proteins show more sequence divergence compared with the ordered FLZ domain, we estimated the K a /K s ratio of the N and C termini and FLZ domain region of putative orthologous genes from six closely related species each from monocot and eudicots ( Fig. 10; Table S5). Interestingly, among the 112 total gene pairs we analyzed, 92 pairs showed increased K a /K s ratio for the N terminus compared with FLZ domain. The remaining 20 pairs showed high ratio value for FLZ domain region compared with the N terminus (Table S5). This observation suggests that different regions of FLZ proteins are evolving under different evolutionary constraint, and in general the N terminus is under a more relaxed selection than the FLZ domain. Interestingly, the C-terminal region showed great variation in the K a /K s ratio ( Fig. 10; Table  S5). In many proteins, the C-terminal region showed very high sequence divergence, and in some proteins this region was found to be under strong purifying selection. We found high variation in ratio among the orthologous genes from the same species pair. The difference in the age of duplicated genes, which occurred due to the rampant WGD events in the angiosperms, could be the major contributing factor for this variation in the ratio (50).

Discussion
SnRK1/SNF1/AMPK1 is a conserved obligate heterotrimeric serine/threonine kinase in eukaryotes. Mounting evidence from studies using diverse eukaryotic models identified that AMPK and its homologs work as master regulators of adaptive growth in response to energy deficit (3). To achieve this, SNF1/AMPK1 interacts with a large number of proteins involved in the regulation of primary metabolism, transcription, splicing, translation, protein trafficking, autophagy, protein degradation, etc. (51)(52)(53)(54)(55)(56). Many of these proteins are phosphorylation targets of SNF1/AMPK1, and through these phosphorylation events, it works as a central regulator of growth in eukaryotes (57,58). Understandably, AMPK signaling is implicated in many diseases in humans (59). The SnRK1 subunits are also found to be interacting with diverse proteins and possess functions similar to SNF1/AMPK (8,60). These studies suggest that SnRK1/SNF1/AMPK1 complex works as a convergent point/hub of many diverse signaling pathways. Earlier, due to promiscuous nature of the interaction of FLZ proteins with SnRK1␣ subunits and common interacting partners, FLZ proteins were proposed to be scaffolds of SnRK1 enzyme complex (9,10). In this study, our in silico analysis predicts that FLZ proteins possess evolutionarily conserved IDRs. Together with the FLZ domain, they contribute to the interaction of these proteins with SnRK1 subunits and other FLZ proteins. These IDRs might be playing an important part in the proposed role of FLZ proteins as the scaffolding proteins.
The enhanced propensity for protein binding and the enrichment of PTM sites observed in the in silico analysis prompted us to investigate the role of IDRs in protein-protein interaction. Intriguingly, the FLZ domain region showed significant enrichment of potential tyrosine phosphorylation sites that could be due to the presence of two relatively conserved tyrosines in the spacer region across the plant lineage (6,7). The conservation of these residues across the plant lineage suggests their possible roles in regulating the protein-protein interactions mediated by the FLZ domain. In our Y2H assays with all subunits of SnRK1 in Arabidopsis, we could identify previously unknown interactions of FLZ proteins with SnRK1 kinase subunits that were also verified by the BiFC assay. Furthermore, FLZ proteins also showed extensive interaction with SnRK1␤ and ␤␥ subunits. It should be noted that there could be more interactions between SnRK1 subunits and FLZ proteins than what we identified in this study because the Y2H interactions are heavily dependent on the orientation of the construct and choice of AD and BD vectors (61).
The promiscuous interactions of FLZ proteins with SnRK1 subunits further reinforce their role as scaffold proteins of the SnRK1 complex. The FLZ proteins are relatively small proteins in angiosperm, and we speculated that the scaffold might be

FLZ proteins form SnRK1 scaffold in land plants
formed due to the interaction of different FLZ proteins. Indeed, we found that FLZ proteins show homo-and heterodimerization properties. Intriguingly, some proteins, such as FLZ7, FLZ12, and FLZ15, showed promiscuous dimerization property, whereas proteins such as FLZ1 and FLZ6 showed very specific interactions. It is indeed an interesting observation because, in multiprotein complexes, some core proteins can possess a promiscuous interaction property that helps in recruiting other subunits to the complex. A remarkable example for such core proteins is MED14 and MED17 subunits in the Mediator complex, which interact with a large number of other subunits (62). More functional and structural studies can uncover whether FLZ7, FLZ12, and FLZ15 possess similar functions in facilitating the formation of the FLZ protein complex.
Our results suggest that the IDRs in the N terminus are involved in the complex formation of FLZ proteins with SnRK1␤ and ␤␥ subunits and FLZ proteins. The FLZ domain was found to be solely responsible for mediating interaction with SnRK1␣ subunits. However, in some cases the FLZ domain produced weak or no interaction. Previously, we have shown that although the FLZ domain is sufficient to mediate interaction with other proteins, the strength of the interaction is significantly diminished when the FLZ domain alone was used, suggesting that other regions in the protein facilitate stronger binding (6). Apart from this, the stringency of selection medium must have also contributed to the weak interaction; and no interaction observed in case of FLZ3 and FLZ15 with SnRK1␣1. Although we predicted the enrichment of IDRs in the C terminus, in our interaction analysis with C-terminal constructs, we could not recapitulate any interactions. The C termini in FLZ proteins are usually small, and it is yet to be seen whether the IDR in this region possess any specific function or indirectly contribute to interactions facilitated by other regions.
We also found a general reduction in the size of FLZ proteins in most of the angiosperms with limited novel domain acquisition. This reduction in the overall protein size seems to have occurred due to the contraction of N termini. Although the number of FLZ proteins is limited in algae and bryophytes, they showed a relatively long N terminus with more disordered regions. In angiosperms, due to the contraction of N termini, the number and length of IDRs are generally reduced. The protein-protein interaction analysis in Arabidopsis suggests that FLZ proteins interact among themselves, and these interactions might be important in scaffold formation. Many regulatory pathways, which are essential for the survival in terrestrial environment, originated as simpler pathways with few proteins and regulatory modules in algae and bryophytes, and their complexity is gradually increased with the addition of more genes and regulatory interactions in higher plants (21,63,64). A classic example is the evolution of the auxin perception signaling pathway where successive additions of different modules resulted in the formation of a multilayered and complex regulatory pathway in higher plants (65). Interestingly, the expansion of FLZ gene family is also implicated in the evolution of body plan complexity in angiosperms (66). One possibility is that the solitary or limited number of FLZ proteins in the Char-ophyta and Bryophyta with its long N terminus enriched with IDRs can harbor SnRK1 and its limited number of binding partners in lower plants. The rampant gene duplication events in higher plants and the gradual increase of biological complexity might also have increased the complexity of SnRK1 signaling. The concurrent duplication and divergence of FLZ genes in higher plants might have resulted in functional specialization and facilitated the scaffold formation by the interaction of multiple FLZ proteins. Supporting this hypothesis, comparison of the distribution of ANCHOR-based disorder-to-order transition regions and LC regions in lower plants and spermatophytes identified a restricted distribution where some proteins in a species showed an enhanced number of such regions in spermatophytes. Taken together, these results suggest a gradual increase in the complexity of SnRK1-FLZ signaling pathway in higher plants. Interestingly, even after keeping very low stringency (five tandem amino acid residues with a propensity score of Ն0.5) for MoRF prediction, a very low number of MoRFs was predicted in the FLZ proteins, especially at the N-terminal region. In fact, proteome level analysis identified that only about 21% of the IDRs in eukaryotes contain MoRFs. This percentage is slightly increased to 29% in bacteria and archaea (36). These results indicate that a large majority of IDRs do not undergo disorder-to-order transition upon binding to the targets. These IDRs with increased conformational plasticity are often involved in the formation of fuzzy complexes (17). ANCHOR predicted more disorder-to-order transition regions in FLZ proteins compared with fMORFPred. This difference could be due to the difference in the prediction methods where MoRFPred uses more elaborative prediction strategy and displayed significantly better performance (33,35,67). The reduced frequency of MoRFs in the IDRs of FLZ proteins suggests that they might be involved in the formation of complexes with more conformational freedom such as fuzzy complexes, and this feature might help in recruiting diverse targets of SnRK1. The increased sequence divergence observed in the IDR-rich regions of FLZ proteins might have accelerated the adaptability and enhanced the binding repertoire of IDRs.
The in planta analysis identified that FLZ proteins interact with SnRK1 subunits in the common cytoplasmic foci indicating the formation of a complex. The colocalization analysis with a marker protein and stain in this study and a previous study identified that SnRK1-FLZ interaction colocalizes with ER, which provides clues about the biological significance of this complex (14). In most cases, these interactions were found to be located in the proximity of the nucleus suggesting that it might be the regions in ER where ribosomes are studded and mRNA translation is undergoing. Studies in various eukaryotes, including plants, suggest a pivotal role of SNF1/AMPK/SnRK1 signaling in negatively regulating the protein synthesis during energy starvation (3,4,68). Upon activation by starvation signals, AMPK inhibits TOR activity through phosphorylating RAPTOR, which promotes its dissociation from the TOR complex (69). This phosphorylation and the negative regulation are found to be conserved in plants as well (70). The multisubunit eukaryotic initiation factor 3 (eIF3), which is involved in the translation initiation, also works as a scaffold for TOR and its immediate target, ribosomal S6 kinases 1 and 2 (S6K1/2). Upon

FLZ proteins form SnRK1 scaffold in land plants
activation, the mTOR complex phosphorylates S6K1/2, which results in its dissociation from the eIF3 complex and further phosphorylation and activation by PDK1. The activated S6K1/2 phosphorylates eukaryotic translation initiation factor 4B (eIF4B) and ribosomal protein S6 (RPS6), a subunit of the 40S ribosome (71). mTOR, through the phosphorylation events detailed above and other phosphorylations, promotes translation in response to activation signals (68). In plants, this mechanism is found to be conserved where auxin was identified as one of the potent activators of TOR (72). TOR complex can also directly associate with ribosome (72,73). Taken together, these results indicate that the TOR and S6K shuttle from the inactive to active pool depending on signals and that they are linked with ribosomes. Although the role of AMPK as a potent inhibitor of mTOR-mediated activation of translation is known, how the AMPK is recruited to this complex is still elusive. The interaction of cell death-inducing DNA fragmentation factor 45-like effector A (CIDEA) with AMPK␤ subunit colocalizes with ER in brown adipose tissue suggesting that AMPK also exists in close proximity with ribosomes (74). In the subcellular localization assays, SnRK1 kinase subunits were found to be predominantly localized in the nucleus and cytoplasm (3,9,75). We found that most of the FLZ proteins also show a similar localization pattern. However, the specific localization of the SnRK1-FLZ interaction to the bodies colocalizing with ER indicates that FLZ proteins might be responsible for the recruitment of SnRK1 to the translation-regulating complex. This hypothesis is supported by the fact that FLZ proteins also show promiscuous interaction with RAPTOR (8). The expression of FLZ genes is highly regulated by cellular energy levels (76). Indeed, functional analysis with individual FLZ genes identified that they not only work as inert scaffolding proteins but also regulate the level of SnRK1␣1 and hence SnRK1 and TOR activity (14). Energy status and SnRK1 signaling regulate the transcription of FLZ genes (76). These results indicate that this complex formation might be regulated by energy status and might be involved in the regulation of SnRK1 signaling in plants. Taken together, this study identified that the FLZ domain and IDRs in the FLZ proteins coordinate in the formation of a complex with SnRK1. Because SnRK1 is a highly structured complex, the surface provided by the interaction of multiple FLZ proteins may help in the recruitment of upstream and downstream factors to the SnRK1 complex.

Identification of FLZ proteins from different plant genomes
In the previous studies, we identified FLZ proteins from more than 40 plant genomes through BLASTP and PFAM and InterPro Id (PF04570 and IPR007650)-based searches (6,7). We used this dataset to create a hidden Markov model profile of the FLZ domain. Using this HMM profile, proteins containing the FLZ domain were identified from 33 genomes, which represent important taxonomical positions in the plant lineage (Fig. S1) using Hmmsearch available in the HMMER web server version 2.15.0 with the given parameters (significance e-value: 0.01 for sequence and 0.03 for hits). Simultaneously, three rounds of iterative-BLAST searches were performed in the ref-erence proteomes using aligned FLZ domain sequences using jackhmmer available in the HMMER web server version 2.15.0 with the given parameters (significance e-value: 0.01 for sequence and 0.03 for hits). The hits obtained from the profile HMM-based search and iterative BLAST searches were combined with the previous dataset and outliers and repeats were removed. Finally, the nonredundant dataset was screened with Batch Web CD-search Tool (77) for the presence of FLZ and other protein domains (Table S2). The final set of protein sequences used for the analysis is given in Table S1. The species tree of the 33 genomes of interest was retrieved from NCBI Taxonomy Common Tree Tool and edited and visualized in FigTree version 1.4.3 (http://tree.bio.ed.ac.uk/software/figtree/). 7 The reported common and lineage-specific WGD events (50,79) in these species were also annotated in the species tree. The average size of FLZ proteins in a species is calculated as the mean size of all FLZ proteins in the given species.

Disorder prediction
The disorder of FLZ proteins was predicted by the metapredictor PONDR-FIT (20) using protein sequences as input. The average disorder score and standard deviation of each protein residue were obtained in the tabular form which is used for subsequent analysis (deposited in Figshare; https://figshare. com/s/0372c02ce2ed1173d93e). 7 We classified IDRs into two groups. Amino acid stretches ranging from 10 to 29 residues with average disorder score for each residue of Ն0.5 are classified as short IDRs. Amino acid stretches of Ն30 residues long with average disorder score for each residue of Ն0.5 are categorized as long IDRs. As used in the earlier studies (80, 81), a maximum three tandem amino acids with less than a 0.5 average disorder score is set as the tolerance limit. The boundaries of the N terminus, the FLZ domain, and the C terminus were determined by Batch Web CD-search Tool (77), and the average disorder score of each part was calculated from the disorder score of all amino acid residues in the region. An IDR is categorized as junction-IDR if at least five disordered residues (disorder score for each residue Ն0.5) are located in the other region. The IDRs identified from different regions of FLZ proteins are given in Table S3. Statistical sampling and data sorting was performed by employing in-house Cϩϩ scripts, and figures were generated in gnuplot version 5.2.

Binding propensity, disorder-to-order transition, and low complexity region predictions
The propensity of protein and nucleic acid binding of the N and C termini were calculated by DisoRDPbind (23) using protein sequences as input. The protein, DNA, and RNA binding scores of each protein residue were obtained in tabular form. The average binding propensity of N and C termini was calculated from the binding score of all amino acid residues in the particular region (deposited in Figshare; https://figshare.com/ s/0372c02ce2ed1173d93e).
The protein-binding disordered regions of FLZ proteins, which undergo disorder-to-order transition upon binding with globular proteins, were predicted by ANCHOR (33) using protein sequences as input. The results obtained were presented in tabular form (Table S6). The MoRFs were predicted by fMoRFpred (36). At least five amino acids in tandem with the propensity score of Ն0.5 for MoRF was considered as a potential MoRF. The final results obtained were presented in tabular form (Table S7).
The LCRs in the FLZ proteins were identified by SEG algorithm (40), and the results obtained were presented in tabular form (Table S8). The average protein and nucleic acid binding propensity, frequency of ANCHOR-based binding regions, MoRF, and LCR were converted to heat maps using Multi-Experiment Viewer (MeV, version 4.8) (82).

Post-translational modification site prediction
The putative serine, threonine, and tyrosine phosphorylation sites with a prediction score of Ն0.5 in the N and C termini and FLZ domain regions of FLZ proteins were identified by Net-phos3.1 (26). The putative arginine and lysine methylation sites with support vector machine score of Ն0.5 in different regions of FLZ proteins were identified by PMeS (83). The acetylation residues in the N and C termini and FLZ domain was predicted with PAIL (84) with medium stringency. The PTM site data obtained from this analysis was presented in Table S4. Paired t test was used to identify the difference in the distribution of PTM sites in N and C termini compared with FLZ domain.

Yeast two-hybrid assays
The full-length CDS of FLZ genes and SnRK1 subunits were amplified by CDS-specific primers (Table S9) and cloned into the pCR8/GW/TOPO vector using the pCR8/GW/TOPO cloning kit (Invitrogen). Subsequently, positive clones were mobilized to pGBKT7g (BD) and pGADT7g (AD) (61) vectors using Gateway cloning strategy (Invitrogen). All partial clones of FLZ genes were prepared using specific primers (Table S9) using the same strategy. The chimeric FLZ1 N -FLZ2 NIDR2 was constructed in pJET1.2 vector (ThermoFisher Scientific) and mobilized to pCR8/GW/TOPO vector. Subsequently, the construct was mobilized to pGBKT7g vector. All Y2H experiments were conducted in Y2H gold yeast strain according to the manufacturer's protocol (Clontech). Before the Y2H experiment, all the BD constructs were subjected to auto-activation and toxicity test as per the manufacturer's protocol (Clontech). For Y2H assays, respective AD and BD constructs were cotransformed in Y2H gold strain using EZ-Yeast transformation kit (MP Biomedicals), and transformed colonies were selected on Double Dropout medium (DDO; TrpϪ/LeuϪ). The transformed colonies were cultured, and equal amounts of cells were spotted on interaction screening Quadruple Dropout medium (QDO) supplemented with X-␣-Gal and aureobasidin A (TrpϪ/LeuϪ/ HisϪ/AdeϪ/X␣Galϩ/AbAϩ). Simultaneously, a negative control experiment with BD vector and AD construct was carried out to identify false interactions. The experiments were repeated three times.

Site-directed mutagenesis
The SDM of FLZ domain of FLZ8 (C225S, C226S, C227S, C229S, C249S, and C252S) was performed in the pCR8/GW/ TOPO-FLZ8 construct by QuikChange site-directed mutagenesis kit using the specific primers (Table S9) according to the manufacturer's protocol (Agilent). All mutations were verified by sequencing. Subsequently, the mutated constructs were mobilized to pGBKT7g using Gateway cloning (Invitrogen).

Bimolecular fluorescent complementation
The FLZ and SnRK1 subunits cloned in pCR8/GW/TOPO vector were mobilized to pSAT4-DEST-N(1-174) EYFP-C1 and pSAT5-DEST-C(175-END) EYFP-C1 vectors (85) using Gateway cloning strategy (Invitrogen). The BiFC assays were performed in onion epidermal peel system by bombarding both constructs by PDS-1000 Helios Gene Gun (Bio-Rad). A negative control experiment was conducted with vector alone along with the other construct to find out false-positive results. The ER-marker construct (44) was obtained from ABRC and transformed with BiFC constructs for colocalization. After bombardment, samples were incubated at 22°C for at least 16 h in the dark. DAPI staining was performed as described previously (6). For staining with ER-Tracker Red dye (Invitrogen), the samples were washed three times in Hanks' balanced salt solution (HBSS) without phenol red. Subsequently, samples were stained with 1 M ER-Tracker Red dye prepared in HBSS at 30°C for 30 min in dark. The samples were subjected to a quick wash in HBSS before visualization. All visualization and photography were performed in TCS SP2 (AOBS) laser confocalscanning microscope (Leica Microsystems) or AxioImager M2 Imaging System (Zeiss).

Subcellular localization assays
Subcellular localization assays were performed in onion epidermis and Arabidopsis mesophyll protoplasts. The FLZ genes cloned in pCR8/GW/TOPO vector were mobilized to pEG104 vector (86) using Gateway cloning strategy (Invitrogen). The constructs and vector alone were transfected individually to onion epidermis through PDS-1000 Helios Gene Gun (Bio-Rad) and incubated at 22°C for at least 16 h in the dark. Arabidopsis mesophyll protoplasts were prepared and transfected with vector and constructs as described previously (87). After the transfection, mesophyll protoplasts were incubated at 22°C for 16 h in the light. All visualization and photography were performed in TCS SP2 (AOBS) laser confocal-scanning microscope (Leica Microsystems).

Analysis of selection pressure on FLZ proteins
The putative orthologs from six closely related species each from monocots and eudicots were recovered through Bayesian phylogenetic reconstruction in TOPALI version 2.5 (88). The protein and CDS sequences were split to N and C termini and FLZ domain based on Batch Web CD-search Tool (77). The nonsynonymous (K a ) and synonymous (K s ) substitution rates and K a /K s ratio of each orthologous pair were estimated in the codeml program in PAML (78) package (Table S5). Protein regions with a minimum of 15 amino acid residues were considered for K a and K s estimation. The data sorting was performed by in-house Cϩϩ scripts, and the graph was generated in gnuplot version 5.2.