The Central Region of the Drosophila Co-repressor Groucho as a Regulatory Hub*

Background: The co-repressor Groucho has an essential, but disordered, central region.
Results: We identified over 160 central region-binding proteins, many of which, including components of the spliceosome, modulate Groucho-mediated repression.
Conclusion: Groucho regulates transcription by multiple mechanisms and may link the transcriptional and splicing machineries.
Significance: Its central region may serve as the hub of a regulatory network.

Groucho (Gro) is a Drosophila co-repressor that regulates the expression of a large number of genes, many of which are involved in developmental control. Previous studies have shown that its central region is essential for function even though its three domains are poorly conserved and intrinsically disordered. Using these disordered domains as affinity reagents, we have now identified multiple embryonic Gro-interacting proteins. The interactors include protein complexes involved in chromosome organization, mRNA processing, and signaling. Further investigation of the interacting proteins using a reporter assay showed that many of them modulate Gro-mediated repression either positively or negatively. The positive regulators include components of the spliceosomal subcomplex U1 small nuclear ribonucleoprotein (U1 snRNP). A co-immunoprecipitation experiment confirms this finding and suggests that a sizable fraction of nuclear U1 snRNP is associated with Gro. The use of RNA-seq to analyze the gene expression profile of cells subjected to knockdown of Gro or snRNP-U1-C (a component of U1 snRNP) showed a significant overlap between genes regulated by these two factors. Furthermore, comparison of our RNA-seq data with Gro and RNA polymerase II ChIP data led to a number of insights, including the finding that Gro-repressed genes are enriched for promoter-proximal RNA polymerase II. We conclude that the Gro central domains mediate multiple interactions required for repression, thus functioning as a regulatory hub. Furthermore, interactions with the spliceosome may contribute to repression by Gro.

Groucho (Gro) is a conserved metazoan co-repressor that may be particularly critical for long range repression whereby repressors are able to establish large transcriptionally silent domains that can spread over many thousands of base pairs (1-3). Gro is essential in many developmental processes, including sex determination, neurogenesis, and pattern formation in Drosophila as well as myogenesis and hematopoiesis in vertebrates (2,4,5). Gro also has roles in multiple signal transduction pathways, including the Ras and Notch pathways (6-8). Furthermore, increased Gro activity correlates with the appearance of certain forms of cancer such as lung cancer (9,10). Thus, understanding the mechanism of Gro-mediated repression should contribute to our understanding of long range repression and its role in development, signaling, and disease.
Sequence comparison of Gro family proteins reveals five domains (2,10). The C-terminal WD repeat domain forms a β-propeller that interacts with the WRPW and eh1 motifs found in many Gro-dependent DNA-binding repressors (11). The N-terminal Q domain folds into a coiled coil structure that forms tetramers and perhaps higher order oligomers, and this self-association is required for robust repression (12-15). The central GP, CcN, and SP domains are believed to have essential functions even though their primary sequences are not well conserved. The GP domain interacts with the histone deacetylase Rpd3/HDAC1 (16,17). Histone deacetylation is broadly associated with gene silencing, and treatment of flies with histone deacetylase inhibitors attenuates Gro-mediated repression (18). In addition, the GP domain is essential for nuclear localization because deletion of this domain prevents Gro nuclear uptake (19). The SP domain regulates Gro function negatively as its deletion leads to promiscuous repression and developmental defects (19). Phosphorylation of the SP domain by Ras/MAPK signaling was shown to attenuate repression, providing a mechanism for regulating repression in response to environmental cues (20). Finally, the CcN domain is also targeted for phosphorylation by protein kinases and is required for repression by Gro (19,21).
Sequence analysis of the Gro central domains strongly suggests that they are intrinsically disordered (19). Intrinsically disordered regions in proteins lack rigid three-dimensional structures under native conditions and can serve as hubs of large regulatory networks by mediating a wide array of highly specific protein interactions (22,23). Increasing evidence suggests that intrinsically disordered domains have critical functions in transcriptional regulation (24,25).
In this study, we set out to illuminate the mechanisms of Gro-mediated repression by identifying proteins that interact with the N-terminal Q domain and the three central domains. A proteomic screen revealed over 160 interacting proteins, many of which are components of protein complexes in a variety of functional categories such as chromatin remodeling and RNA processing. Perhaps most notably, the interactors included multiple components of the spliceosome, and a coimmunoprecipitation experiment suggests that a sizable fraction of U1 snRNP (a subcomplex of the spliceosome) is associated with Gro in embryonic nuclei.
As a means of systematically validating the functional significance of these interactions, we carried out a novel reporter assay using three different luciferase reporters that could be monitored simultaneously. These assays showed that many of the interacting proteins, including the protein components of U1 snRNP, are required for optimal Gro-mediated repression. Lastly, we compared the effects on the gene expression profile of Gro and U1 snRNP knockdown, finding a significant overlap in the regulated genes. Our results indicate that the central domains of Gro mediate multiple interactions required for repression and reveal a possible mechanism of Gro-mediated repression through an interaction with the spliceosome complex or subcomplexes. This reinforces previous studies suggesting that the spliceosome has roles in transcriptional regulation in addition to its roles in RNA processing (26-30).

Experimental Procedures
Plasmids-To generate plasmids for expression of glutathione S-transferase (GST) fusion proteins, sequences encoding the Gro domains were amplified by PCR and inserted between the BamHI and XhoI sites of pGEX4T (GE Healthcare). The Q domain included Gro amino acids 1-133, the GP domain included amino acids 134-194, the CcN domain included amino acids 195-257, and the SP domain included amino acids 258-390. Sequences of PCR primers are provided in Table 1.
Affinity Purification and Identification of Gro-interacting Proteins-Plasmids encoding the recombinant domains fused to GST or GST alone were transformed into BL21 cells. 250 ml of midlog cells were induced with 0.25 mM isopropyl 1-thio-β-D-galactopyranoside for an hour. Cells were pelleted at 4,000 × g, resuspended in 25 ml of salty TE (0.15 M NaCl, 10 mM Tris, pH 8, 1 mM EDTA) with protease inhibitor (Life Technologies catalog number 88266), and incubated on ice for 30 min. Samples were incubated at 4°C for 15 min after DTT and Triton X-100 were added to final concentrations of 5 mM and 1%, respectively. Cells were then disrupted through a microfluidizer (Microfluidics M110L) using standard conditions. The lysate was collected and centrifuged at 14,000 × g for 10 min at 4°C. Supernatant was collected, and 1 ml of glutathione-agarose resin (50% slurry) was added. After overnight incubation, the resin was washed with ice-cold PBS three times and stored at 4°C.
Drosophila embryo nuclear extracts were prepared as described previously (32). To isolate Gro-interacting proteins, 20 μg of glutathione bead-immobilized recombinant domains was mixed with nuclear extract containing 30 mg of protein (20 mg/ml) in 8 ml of HEMNK buffer (40 mM HEPES, pH 7.5, 5 mM MgCl2, 0.2 mM EDTA, 1 mM DTT, 0.5% Nonidet P-40, 0.1 M KCl) at 4°C overnight. Samples were washed six times for 15 min with 5 ml of HEMNK buffer. Proteins were first eluted with 5 ml of 2 M NaCl in HEMNK buffer and then with 2.5 ml of 2 M NaCl in HEMNK buffer for 20 min each. Eluted proteins were subjected to TCA precipitation prior to multidimensional protein identification technology (MudPIT) analysis. MudPIT analysis was performed as described previously (33). Peptide identifications were filtered using a false discovery rate cutoff of 0.05 as determined by the decoy database approach. Protein level false positive rates were less than 0.03 for all individual runs.
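The decoy database filtering step can be illustrated with a minimal sketch. It assumes each peptide-spectrum match carries a score (higher is better) and a flag marking decoy hits, and it estimates the false discovery rate as decoys divided by targets above a score threshold. The function name, the input layout, and this particular estimator are illustrative assumptions; the software and scoring actually used in the MudPIT pipeline (33) are not described here.

```python
from typing import List, Tuple

# Hypothetical input layout: (score, is_decoy) for each peptide-spectrum match.
PSM = Tuple[float, bool]

def filter_by_decoy_fdr(psms: List[PSM], max_fdr: float = 0.05) -> List[PSM]:
    """Return target matches above the most permissive score threshold
    at which the estimated FDR (decoys / targets) is still <= max_fdr."""
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = 0
    decoys = 0
    accepted = 0  # number of top-ranked matches that pass the cutoff
    for i, (_, is_decoy) in enumerate(ranked, start=1):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Estimated FDR among everything ranked at or above this match.
        if targets > 0 and decoys / targets <= max_fdr:
            accepted = i
    # Report only target (non-decoy) matches inside the accepted range.
    return [p for p in ranked[:accepted] if not p[1]]
```

With a cutoff of 0.05, a list is truncated at the last rank where no more than one decoy has accumulated per twenty target matches.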
Supplemental Table S1C includes all the mass spectrometry data for the two independent replicate screens carried out with each GST fusion protein and GST alone, and supplemental Table S1, A and B, include selected data for 159 proteins that were detected in both replicates as well as three proteins (Histone H3, Caf1, and Bic) that were only detected in one replicate but for which other data confirm the significance of the interaction (supplemental Table S1B, notes 2 and 3). Ribosomal proteins were excluded from the lists in supplemental Table S1, A and B.
Gro Immunoprecipitation and Reverse Transcription-qPCR (RT-qPCR) Analysis of U1 snRNA-500 μg of nuclear extract was incubated with 1.875 μg of affinity-purified rabbit antibody against the Gro GP domain or rabbit IgG in a final volume of 250 μl of HEMNK buffer overnight at 4°C. 225 μg of Protein A Dynabeads (Invitrogen catalog number 10001D) were incubated with the samples at 4°C for 1 h. Samples were then washed with HEMNK buffer three times for 10 min each. For RT-qPCR, RNA was eluted in 10 μl of water by heating to 80°C for 2 min. Samples were treated with DNase I according to the manufacturer's protocol (Promega catalog number M6101). Reverse transcription was performed with 300 ng of random primer (Invitrogen catalog number 48190-011), and qPCR was performed using primers amplifying U1 snRNA (Table 2). Threshold cycle values were converted to percent input values by comparison with a standard curve generated from multiple serial dilutions of RNA isolated by TRIzol (Life Technologies catalog number 10296010) extraction from the input nuclear extract. Primer specificity was validated by melting curve analysis of the amplification products (data not shown).
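The conversion of threshold cycle values to percent input can be sketched as follows. The sketch assumes a log-linear standard curve, in which the threshold cycle is a linear function of log10 of the fraction of input RNA in each dilution, fitted by ordinary least squares. The function names and the example dilution series are hypothetical; the fitting procedure actually used is not specified in the text.

```python
import math
from typing import Sequence, Tuple

def fit_standard_curve(input_fractions: Sequence[float],
                       cts: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of Ct = slope * log10(fraction of input) + intercept."""
    xs = [math.log10(f) for f in input_fractions]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def percent_input(ct: float, slope: float, intercept: float) -> float:
    """Invert the standard curve to express a sample Ct as percent of input."""
    return 100.0 * 10 ** ((ct - intercept) / slope)

# Hypothetical 10-fold dilution series of input RNA (fractions of input).
slope, intercept = fit_standard_curve([1.0, 0.1, 0.01, 0.001],
                                      [20.0, 23.32, 26.64, 29.96])
```

A slope near -3.32 cycles per 10-fold dilution corresponds to approximately 100% amplification efficiency, which is a useful sanity check on the dilution series before converting sample values.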
For immunoblotting, samples were eluted in SDS-PAGE loading buffer. Proteins were detected with a mixture of mouse anti-Gro (Developmental Studies Hybridoma Bank; 1:650 dilution) and affinity-purified rabbit anti-GP domain (1:100 dilution) antibodies. Immunoblots were subsequently probed with goat anti-mouse 680 and goat anti-rabbit 800 IRDye-coupled secondary antibodies (LI-COR) and imaged with a LI-COR Odyssey imager.
Three-reporter Luciferase Assay-To guard against off-target effects, each candidate gene was knocked down with three nonoverlapping dsRNAs when possible (the complete list of dsRNAs used is available upon request). Each dsRNA was tested in triplicate. dsRNA was synthesized by the Drosophila RNAi Screening Center and realiquoted into white flat bottom 96-well plates (USA Scientific catalog number CC7682-7968) at 150 ng/well in 10 μl of water using a Beckman Coulter BioMek FX work station.
Transfections were carried out with Effectene reagent (Qiagen catalog number 301425). 6 μg each of G5DE5-pCBR and DE5G5-pCBG68, 0.6 μg of RpIII128-Rluc, 1 μg of pPac Dl, 0.3 μg of pPac Twi, and 1.2 μg of pAct Gal4-Gro were suspended in 600 μl of buffer EC. 33 μl of this mixture was added to 25 μl of enhancer. After 2-3 min, 7.5 μl of Effectene was added and mixed by pipetting up and down. 6 μl of this mixture was immediately added into each well of a 96-well plate containing 150 ng of dsRNA. 4-8 min later, 100 μl of S2 cells (diluted to 1 × 10^6 cells/ml) was added to each well. Cells were incubated at 24°C for 2 days before assaying.
The luminescence signal was measured with a Molecular Devices LJL Analyst HT microplate reader using emission filters ET510/80m and E610LP (Chroma catalog numbers S-022658 and 138951). 50 μl of D-luciferin (Chroma-Glo system, Promega catalog number E2980) was added to each well. Five minutes later, the reaction was stopped by the addition of 50 μl of stop buffer containing coelenterazine (Dual-Luciferase system, Promega catalog number E1960). The luminescence signal was measured immediately without applying a filter.
To address the issue of signal overlap, raw signals were subjected to filter correction. The corrected red luminescence signal, R′, and green luminescence signal, G′, were calculated according to the following equations.
Parameters were determined by expressing the individual luciferases and recording the luminescence signals with red and green filters and with no filter (data not shown). The ratio of green signal passed through the red filter, Grf/Ggf, was determined to be 0.0975; the ratio of red signal passed through the red filter, Rrf/R, was determined to be 0.42; the ratio of red signal passed through the green filter, Rgf/R, was determined to be 0; and the ratio of green signal passed through the green filter, Ggf/G, was determined to be 0.47. Lrf and Lgf are luminescence signals in which cells are cotransfected with both red and green luciferases. Lrf is the signal recorded with the red filter, and Lgf is the signal recorded with the green filter.
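The correction equations are not reproduced in this text, so the following is one possible reconstruction rather than the published formulas. It assumes that each filtered reading is a linear mixture of the red and green emissions, Lrf = (Rrf/R)·R + (Grf/G)·G and Lgf = (Rgf/R)·R + (Ggf/G)·G, with Grf/G obtained as (Grf/Ggf) × (Ggf/G), and solves this pair of equations for R and G. With the reported parameters (Rgf/R = 0), this reduces to R′ = (Lrf - 0.0975 × Lgf)/0.42 and G′ = Lgf/0.47. The function and constant names are illustrative.

```python
from typing import Tuple

# Filter transmission ratios reported in the text.
GRF_PER_GGF = 0.0975  # green signal through red filter / green signal through green filter
RRF_PER_R = 0.42      # fraction of red signal passing the red filter
RGF_PER_R = 0.0       # fraction of red signal passing the green filter
GGF_PER_G = 0.47      # fraction of green signal passing the green filter

def correct_signals(l_rf: float, l_gf: float) -> Tuple[float, float]:
    """Unmix red and green luciferase signals from the two filtered readings.

    Assumed linear model (a reconstruction, not taken from the source):
        l_rf = a * R + b * G
        l_gf = c * R + d * G
    """
    a = RRF_PER_R
    b = GRF_PER_GGF * GGF_PER_G  # Grf/G, derived from the two reported ratios
    c = RGF_PER_R
    d = GGF_PER_G
    det = a * d - b * c
    red = (d * l_rf - b * l_gf) / det
    green = (a * l_gf - c * l_rf) / det
    return red, green
```

Solving the general two-by-two system, rather than hard-coding the simplified form, keeps the sketch valid if a different filter set gives a nonzero Rgf/R.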
The signal from untransfected cells was then subtracted from the corrected data to eliminate background. Processed data were then normalized to the internal control Renilla luciferase. Finally, data were compared with the signal from cells in the same plate that were treated with control GFP dsRNA. A change in long or short range repression was considered significant if the p value was < 0.1. If multiple dsRNAs were tested for a given gene (as was true in most cases; supplemental Table S2), then a change was listed only if the p value was < 0.1 for at least two separate dsRNAs.

Gene Sequence
Gro ATACTTACCTGGCGTAGAGGTTAACC AACGCCATTCCCGGCTA

RNA-seq Library Preparation-Gro dsRNA was generated by PCR amplification of the first 800 nucleotides of the coding sequence using primers containing T7 promoters followed by in vitro transcription with T7 RNA polymerase. snRNP-U1-C dsRNA was generated by PCR and in vitro transcription of the snRNP-U1-C coding sequence with primers 5′-taatacgactcactatagggtactCAAAGTACTATTGCGACTACTGC and 5′-taatacgactcactatagggtactCTTGGGTCCGTTCATGATTCC (lowercase letters represent the T7 promoter sequences). Transfection was carried out as described previously (34). RT-qPCR was used to determine the knockdown efficiency prior to RNA-seq library preparation. RT-qPCR primers targeted the 3′-UTRs of Gro and snRNP-U1-C. Rpl32 was used as a reference gene. The specificity of all primers was validated by melting curve analysis of the amplification products (data not shown). Sequences of the qPCR primers are listed in Table 2.
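Knockdown efficiency relative to a reference gene such as Rpl32 is commonly computed with the comparative threshold cycle (ΔΔCt) method. The text does not state which quantification method was used, so the sketch below is an assumption, as are the function name, the default amplification efficiency of 2 (perfect doubling per cycle), and the example values.

```python
def relative_expression(ct_target_kd: float, ct_ref_kd: float,
                        ct_target_ctrl: float, ct_ref_ctrl: float,
                        efficiency: float = 2.0) -> float:
    """Expression of the target in knockdown cells relative to control cells,
    normalized to a reference gene (comparative Ct method)."""
    delta_kd = ct_target_kd - ct_ref_kd        # target vs reference, knockdown
    delta_ctrl = ct_target_ctrl - ct_ref_ctrl  # target vs reference, control
    delta_delta = delta_kd - delta_ctrl
    return efficiency ** (-delta_delta)

# Hypothetical example: the target crosses threshold two cycles later after
# dsRNA treatment while the reference gene is unchanged.
remaining = relative_expression(26.0, 18.0, 24.0, 18.0)
fold_knockdown = 1.0 / remaining
```

Under these assumptions, a two-cycle shift in the target with an unchanged reference corresponds to a 4-fold knockdown.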
Total RNA was extracted with TRIzol according to the manufacturer's protocol. RNA integrity was determined with an Agilent 2100 Bioanalyzer using the RNA 6000 Nano kit (Agilent catalog number 5067-1511). Isolation of mRNA was carried out as follows. Streptavidin magnetic beads (Promega catalog number Z5481) were prepared in aliquots of 120 and 60 μl in 0.5× SSC with 10 mM EDTA. 15 μg of total RNA was mixed with 1.5 μM biotinylated 15-mer poly(T) oligonucleotide in 0.5× SSC with 10 mM EDTA. Samples were first incubated at 75°C for 5 min followed by 15°C for 10 min and 10°C for 10 min. Samples were then incubated with 120 μl of magnetic beads at 4°C for 2 h followed by 60 μl of magnetic beads at 4°C for 30 min. The two aliquots of beads were combined and washed four times with 300 μl of ice-cold 0.1× SSC containing 10 mM EDTA. mRNA was first eluted with 100 μl of water followed by 150 μl of water at 37°C for 10 min each. Samples were precipitated with ethanol and stored at −80°C. Pulldown efficiency of mRNA and depletion efficiency of 18S rRNA were determined by RT-qPCR (data not shown).
The RNA-seq library was prepared according to the manufacturer's protocol (Epicenter, catalog numbers SSV21124 and RSBC10948). The concentration of the library was determined with Pico Green (Life Technologies catalog number Q32851) according to the manufacturer's directions. Fluorescence signal was measured using a TECAN M1000 fluorescence plate reader.

Results
Identification of Gro-interacting Proteins-A previous study showed that deletion of the GP or CcN domain in the Gro central region led to a loss of Gro-mediated repression and to lethality, whereas deletion of the SP domain led to reduced specificity of Gro-mediated repression and to reduced viability (19). To identify possible regulatory partners of these domains, we used them as affinity reagents to purify interacting proteins, which were then identified by mass spectrometry. The three central domains of Gro were expressed as GST-tagged proteins and purified from Escherichia coli lysates (Fig. 1, A and B). We also constructed a similarly tagged form of the N-terminal Q domain because previous studies suggested that, in addition to mediating Gro oligomerization, the Q domain engages in interactions with regulatory targets (39,40).
The glutathione bead-immobilized GST-fused domains (or, as a negative control, immobilized unfused GST) were incubated with a Drosophila embryo nuclear extract. After extensive washing, interacting proteins were eluted with 2 M salt and analyzed by MudPIT (33) (supplemental Table S1C). Duplicate extract preparations and affinity purifications were carried out and analyzed on separate dates, and there was a high degree of overlap between the sets of proteins identified in these duplicate experiments (Fig. 1C). With three exceptions (see "Experimental Procedures"), only proteins that appeared in both replicates were included in our list of Gro-interacting proteins (Fig. 1C and supplemental Table S1, A and B). Gene ontology analysis of this list of 162 proteins revealed a variety of functions, including regulation of gene expression, RNA processing, and developmental processes (Table 3).

FIGURE 1. Purification of Gro-interacting proteins. A, schematic representation of Gro. The Q, GP, CcN, and SP domains were tagged with GST. B, the GST-tagged domains were expressed in E. coli and purified with glutathione-agarose beads. They were then resolved by 10% SDS-PAGE and visualized by Coomassie Blue staining. These proteins were then used as affinity reagents in the purification of Gro-interacting proteins from embryonic nuclear extracts that were subsequently identified by MudPIT (see Table 4 and supplemental Table S1). C, Venn diagram showing overlap between the nonribosomal proteins identified in two replicate sets of affinity purification experiments. Fisher's exact test indicates that the overlap between the two sets is highly significant (p < 2.2 × 10^-16).
89 of the 162 Gro-interacting proteins associated uniquely with one domain (in all but one case, the SP domain), whereas 32 interacted with two domains. In the case of 23 of the 32 proteins that interacted with two domains, one of these domains was the Q domain (supplemental Table S1A). This is consistent with the known role of the Q domain in homo-oligomerization (12-15). In accord with this role, chromatography using GST-Q as the affinity reagent resulted in the purification of some full-length endogenous Gro (supplemental Table S1C and data not shown). This could lead to the co-purification of Gro-interacting proteins that bind to regions outside the Q domain. Thus, 112 (89 plus 23) of the 162 detected interacting proteins can, in principle, be accounted for by binding to a single central domain of Gro. However, at least 50 proteins (162 minus 112) are able to bind independently to two or three central domains. The ability to interact with multiple Gro domains could allow tighter binding or more versatile control of binding.
The list of interacting proteins (Table 4 and supplemental Table S1, A and B) contains multiple components of known multisubunit protein complexes. For example, we identified the α and β subunits of casein kinase II (CKII), a previously identified regulator of Gro activity (21). We also detected protein complexes involved in chromosome organization, including both components of the ATP-dependent chromatin remodeling and assembly factor (ACF), Acf1 and Iswi (41). Our proteomic screens also identified all the core protein components of the nucleosome (the core histones) as well as histone variant H2Av, consistent with previous studies demonstrating functional interactions between Gro and nucleosomes (42-44).
Perhaps most surprisingly, we discovered a number of components of the spliceosome among the group of Gro-interacting proteins, including all three proteins unique to U1 snRNP, components of U4/U6 snRNP, U2 snRNP, and the Sm complex (45,46). To validate the interaction between Gro and U1 snRNP, Drosophila embryo nuclear extracts were subjected to immunoprecipitation using an affinity-purified antibody against the Gro GP domain or, as a negative control, rabbit IgG. An anti-Gro immunoblot of the immunoprecipitated material demonstrates the efficiency of the immunoprecipitation (Fig. 2A). RNA was extracted from the immunoprecipitates and analyzed by RT-qPCR with primers specific for U1 snRNA (a component of U1 snRNP). The results show that ~13% of the U1 snRNA in the nuclei of 0-12-h embryos is associated with Gro (Fig. 2B).
Functional Analysis of Gro-interacting Proteins-We next carried out functional assays to determine whether the interacting proteins are required for regulation of a Gro-responsive reporter gene. Previous studies established a reliable reporter assay for Gro function using a luciferase reporter containing Gal4 binding sites (UAS elements) as well as an artificial enhancer containing binding sites for the Dorsal and Twist activators (14,16,18,47). Dorsal/Twist-activated transcription of this reporter is strongly repressed upon introduction of a Gal4-Gro fusion protein. By altering the position of UAS elements relative to the artificial enhancer, we were able to examine both short range and long range Gro-mediated repression simultaneously (Fig. 3, A and B). The reporter system relied on two variants of click beetle luciferase that use D-luciferin as a substrate and emit either red or green light (48). In addition, a plasmid encoding Renilla luciferase, which uses coelenterazine as a substrate, was used as an internal control for transfection efficiency, cell viability, and general effects on transcription and translation. We validated the three-reporter system using dsRNA against Dorsal, Gro, and Rpd3 (which is partially required for Gro-mediated repression (18)) (Fig. 3C). As predicted, Dorsal knockdown resulted in a complete loss of activation, Gro knockdown resulted in a complete loss of repression, and Rpd3 knockdown resulted in a partial loss of repression.
Each of the candidates from the screen for Gro-interacting proteins was knocked down by RNAi using up to three dsRNAs per gene to guard against off-target effects. We excluded the histones from this analysis under the assumption that knockdown of these essential chromatin components would have pleiotropic deleterious effects on cell metabolism and because each histone is encoded by multiple genes, making efficient knockdown problematic. We therefore tested 157 genes in this S2 cell luciferase assay, in most cases with multiple dsRNAs per gene (three if available), and each dsRNA was tested in triplicate. In total, we carried out ~1,300 assays (including controls) in a 96-well plate format using a partially automated approach (see "Experimental Procedures"). A candidate was scored as a regulator of Gro-mediated repression if knockdown reproducibly resulted in either an increase or a decrease in the level of repression (see "Experimental Procedures" for explanation of the statistical test of significance). 44 candidates met these criteria of which 28 interfered with optimal repression (i.e. repression increased upon knockdown; these were termed "negative regulators of Gro"), and 16 were required for optimal repression (i.e. repression decreased upon knockdown; these were termed "positive regulators of Gro"). We provide representative data for one negative regulator (Vir), one positive regulator (snRNP-U1-C), and one protein that is neither a positive nor a negative regulator (SR protein kinase, SRPK) (Fig. 3D); a list of all the positive and negative regulators (Table 5); and a separate list showing the quantitative effect of RNAi knockdown of each of the 44 regulators on repression by Gal4-Gro (supplemental Table S2). Of particular interest, four spliceosomal proteins, including two components of U1 snRNP, act as positive regulators of Gro, confirming the functional significance of the interaction between Gro and U1 snRNP.
A few other noteworthy examples among the Gro regulators (Table 5 and supplemental Table S2) include both components of the CKII complex (CKIIα and CKIIβ), which act as negative regulators, and the chromatin remodeling factor Acf1, which acts as a positive regulator (see "Discussion").
Expression Profiling of Gro and snRNP-U1-C Knockdown Cells-snRNP-U1-C is one of the components of the U1 snRNP complex, which is responsible for 5′ splice site recognition (46). In addition to its role in RNA processing, it has been shown to repress transcription of EWS/FLI-transactivated genes (30). Because our data indicated that snRNP-U1-C may also modulate Gro function, we examined the genome-wide role of snRNP-U1-C in Gro-mediated repression. Using RNA-seq, we compared the effects of snRNP-U1-C knockdown with those of Gro knockdown on the gene expression profile in S2 cells. Cells were treated with Gro or snRNP-U1-C dsRNA for 4 days, leading to 4-fold or greater knockdown of the Gro and snRNP-U1-C mRNA (Fig. 4A). The transcriptomes in wild-type and Gro knockdown S2 cells were quantitatively similar to those published previously (49,50) (Fig. 4, B and C). We note that the genes differentially expressed in the snRNP-U1-C knockdown are enriched for genes containing introns as would be expected given the role of U1 snRNP in splicing. However, this set of genes also contains a number of intronless genes, consistent with the idea that snRNP-U1-C has roles in gene regulation apart from its role in splicing (Fig. 4D). We note that changes in the expression of an intronless gene could also reflect a requirement for the product of an intron-containing gene in the expression of the intronless gene.
98 genes were differentially expressed in both Gro and snRNP-U1-C knockdown cells (Fig. 4E), of which 36 were up-regulated in both cases. These coordinately up-regulated targets included genes in various signaling pathways such as the Wnt, Notch, and Toll pathways (Table 6). Comparison with publicly available ChIP-seq data on histone modification and transcription factor binding revealed that these coordinately regulated genes were most enriched for histone H3K36 methylation and the H3K36 methyltransferase ASH1 (Fig. 4F).
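The significance of an overlap between two gene sets drawn from a common universe can be assessed with a one-sided Fisher's exact test, which is equivalent to the upper tail of the hypergeometric distribution. The following minimal sketch computes that tail with exact integer arithmetic. The function name is illustrative, and in the commented example only the overlap of 98 comes from the text; the universe and set sizes are hypothetical placeholders.

```python
from math import comb

def overlap_p_value(universe: int, size_a: int, size_b: int, overlap: int) -> float:
    """P(X >= overlap) when size_b genes are drawn without replacement from a
    universe of genes containing size_a 'marked' genes (hypergeometric tail,
    equivalent to a one-sided Fisher's exact test for enrichment)."""
    total = comb(universe, size_b)
    max_overlap = min(size_a, size_b)
    tail = sum(comb(size_a, k) * comb(universe - size_a, size_b - k)
               for k in range(overlap, max_overlap + 1))
    return tail / total

# Hypothetical usage (universe and set sizes are placeholders, not values
# reported in the text; only the overlap of 98 is taken from the text):
# p = overlap_p_value(universe=15000, size_a=300, size_b=3000, overlap=98)
```

Exact integer binomial coefficients avoid the rounding problems that arise when very small tail probabilities are accumulated in floating point.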
To determine whether the regulatory effects of knocking down Gro are likely to be direct, we compared our RNA-seq FIGURE 2. Validation of the interaction between Gro and U1 snRNP. A, 0 -12-h Drosophila embryo nuclear extracts were subjected to immunoprecipitation using an affinity-purified polyclonal antibody directed against the Gro GP domain or, as a control, rabbit IgG. To assess immunoprecipitation efficiency and specificity, immunoprecipitates were subjected to SDS-PAGE and immunoblotting. The blot was probed with a mixture of the rabbit anti-GP domain antibody and a mouse monoclonal anti-Gro antibody and with IRDye-labeled secondary antibodies. The signal from the rabbit antibody was detected in the green channel of the IR imager, whereas the signal from the mouse antibody was detected in the red channel. Rabbit IgG heavy chain (IgG) and Gro bands are indicated with arrows on the right. The orange-yellow color of the Gro band is indicative of the overlap between the red and green signals. Lane 1, markers labeled in kDa; lane 2, 10% input; lane 3, anti-Gro immunoprecipitate (IP); lane 4, rabbit IgG immunoprecipitate; lane 5, mock anti-Gro immunoprecipitate from which input nuclear extract was omitted. B, RNA was extracted from immunoprecipitates prepared as described in A. The RNA from the immunoprecipitates as well as the RNA extracted from the input nuclear extracts was analyzed by RT-qPCR as described under "Experimental Procedures" to determine U1 snRNA levels. Error bars based on two independent biological replicates indicate S.D. A two-tailed t test gives p ϭ 0.016. data from Gro knockdown S2 cells with available S2 cell Gro ChIP data (49). Gro appears to bind many genes that it does not repress (Fig. 5A). This is consistent with observations made with numerous regulatory factors (51,52) and suggests that binding, although required, is not sufficient for regulation. 
We observed an enrichment of Suppressor of Hairless (Su(H)) and Brinker (Brk) binding motifs within Gro ChIP-seq peaks in the differentially expressed genes but not in the non-differentially expressed genes (Fig. 5B). Comparison of our RNA-seq data from Gro knockdown cells with available Pol II ChIP-chip data (53) also reveals an enrichment in Pol II pausing near the transcriptional start site in genes that are up-regulated upon Gro knockdown (i.e. genes that are repressed by Gro; Fig. 6). . The three-reporter high throughput luciferase assay. A, schematic representation of the three reporters. Constructs are not drawn to scale. In the red luciferase reporter, the Gal4 binding sites (UAS elements) are immediately upstream of the enhancer, whereas in the green luciferase reporter, the UAS elements are about 2 kb downstream of the transcriptional start site. Expression is induced by the Dorsal (Dl) and Twist (Twi) activators and repressed by Gal4-Gro. The Renilla luciferase reporter under control of the class II promoter from the gene encoding RNA polymerase III subunit (RplIII128) was used as an internal control for transfection efficiency, cell viability, and general effects on transcription and translation. B, flow chart of the reporter assay. C, validation of the reporter assay. Co-transfection with Dorsal and Twist (Dl/Twi)-encoding plasmids activated both the red and green reporters, whereas addition of a plasmid encoding the Gal4-Gro fusion resulted in repression of the reporters. Dorsal, Gro (including Gal4-Gro), and the histone deacetylase Rpd3, which is partially required for Gro-mediated repression (18), were knocked down by RNAi. Data are normalized to the red and green signals from the Gro dsRNA sample. Error bars based on triplicate transfection assays represent S.D. D, representative results of the reporter assay. 
The luciferase reporter assay was carried out using three non-overlapping dsRNAs from the genes encoding Vir, snRNP-U1-C, and SR protein kinase (SRPK). The result of transfection with each dsRNA was compared with that of transfection with GFP dsRNA. Error bars based on triplicate transfection assays represent S.D. DECEMBER 11, 2015 • VOLUME 290 • NUMBER 50 FIGURE 4. Genome-wide expression profiling reveals co-regulation of genes by Gro and snRNP-U1-C. A, expression of Gro and snRNP-U1-C mRNA after dsRNA treatment. RT-qPCR was performed after extraction of total RNA. Data were normalized to reference gene Rpl32. Error bars based on duplicate experiments represent S.D. B, comparison of transcriptomes from our wild-type S2 cell RNA-seq data and the modENCODE S2 cell RNA-seq data. C, comparison of transcriptomes from our Gro knockdown RNA-seq data and previously published Gro knockdown RNA-seq data (49). The transcripts that were detected at significant levels in only the previously published Gro knockdown study (represented by the points in contact with the vertical axis) correspond primarily to non-polyadenylated transcripts. In B and C, the scale on both axes is log 2 (CPM) where CPM is counts per million sequence reads. D, based on RNA-seq analysis of wild-type and snRNP-U1-C knockdown cells, genes were categorized as non-differentially expressed upon knockdown (Non-DE; 12,028 genes), up-regulated upon knockdown (1,431 genes), and down-regulated upon knockdown (1,691genes). The percentage of genes in each category with no introns is shown. Some Drosophila genes lack annotated transcripts, and thus it was not possible to determine their intron count. This results in a small numerical discrepancy between the number of differentially expressed genes included in this analysis and the number of snRNP-U1C differentially expressed genes shown in E. 
E, Venn diagram showing numbers of differentially expressed genes in Gro and snRNP-U1-C knockdown cells and the overlap between these sets. Fisher's exact test indicates that the overlap is highly significant (p < 2.2 × 10⁻¹⁶). F, enrichment of Gro/snRNP co-regulated genes for various features. Normalized enrichment scores were calculated using cumulative recovery curves (37). Scores above 2.5 are considered significant.

Table S2 for quantitative information on positive and negative regulation by these factors. See "Experimental Procedures" for an explanation of the test of statistical significance that genes had to pass to be included in this list.
a Negative regulators are defined as the products of those genes the knockdown of which led to increased repression by Gal4-Gro in the reporter assay.
b Positive regulators are defined as the products of those genes the knockdown of which led to decreased repression by Gal4-Gro in the reporter assay.
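The overlap test named in the Figure 4E legend can be illustrated with a minimal sketch. It computes a one-sided Fisher's exact p-value (the upper tail of the hypergeometric distribution) for the overlap between two gene sets drawn from a common background. The function name and the toy counts are hypothetical and are not taken from the study, and the published value may have been obtained with a two-sided test in a statistics package.

```python
from math import comb

def overlap_p_value(background, set_a, set_b, overlap):
    """One-sided Fisher's exact test (hypergeometric upper tail).

    Probability of seeing at least `overlap` shared genes when `set_b` genes
    are drawn at random from `background` genes, `set_a` of which are marked.
    """
    total = comb(background, set_b)
    max_overlap = min(set_a, set_b)
    tail = sum(
        comb(set_a, k) * comb(background - set_a, set_b - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total

# Hypothetical toy counts (not from the study): 20 genes in the background,
# 8 differentially expressed in one knockdown, 6 in the other, 5 shared.
p = overlap_p_value(background=20, set_a=8, set_b=6, overlap=5)
print(f"p = {p:.4f}")  # p = 0.0181
```

With genome-scale counts the same sum runs over many more terms, and the p-value can fall below the smallest value a statistics package reports.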

Discussion
Previous studies showed that the disordered Gro central domains are essential for properly regulated transcriptional repression (2,19). To shed light on the mechanism by which these domains function, we used them as affinity reagents to purify interacting proteins in Drosophila embryo nuclear extracts that were then identified by MudPIT. We identified over 160 interacting polypeptides, many of which associate with one another in a variety of multiprotein complexes. Several of these interacting proteins (e.g. the core histones and CKII) were previously characterized as Gro interactors, thus partially validating the screen. In addition, we validated the interaction between Gro and U1 snRNP by demonstrating the presence of U1 snRNA in an anti-Gro immunoprecipitate of embryonic nuclear extracts.
As a means of systematically validating interactions, we used a functional assay in Drosophila cells in which 157 of the interactors were each knocked down by RNAi to determine their requirement for Gal4-Gro-mediated repression of a luciferase reporter. In this way, we obtained evidence that 44 of the interactors have functional roles in Gro-mediated repression. Of these, 28 are required for repression, whereas 16 antagonize it. The number 44 is probably an underestimate of the true number of functional interactors due to the artificiality of the reporter assay. For example, because we artificially recruit Gro to the reporter by tethering it to the Gal4 DNA binding domain, any interactions that help to recruit Gro to the template will not be required. In addition, the reporters are introduced by transient transfection, and certain chromatin structures or modifications that contribute to Gro-mediated repression may not be reproduced in this context.
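The logic of this classification can be sketched as follows. This is only an illustration: the function name, the fold-repression inputs, and the fixed 20% tolerance are assumptions standing in for the statistical test described under "Experimental Procedures." A knockdown that weakens repression marks the gene product as a positive regulator (required for repression), and one that strengthens repression marks it as a negative regulator (an antagonist).

```python
def classify_interactor(fold_repression_kd, fold_repression_ctrl, tolerance=0.2):
    """Label an interactor by how its knockdown changes Gal4-Gro repression.

    fold_repression_* = reporter signal without Gal4-Gro / signal with Gal4-Gro.
    `tolerance` is an arbitrary illustrative cutoff, not the study's test.
    """
    ratio = fold_repression_kd / fold_repression_ctrl
    if ratio < 1 - tolerance:
        return "positive regulator"   # knockdown decreased repression
    if ratio > 1 + tolerance:
        return "negative regulator"   # knockdown increased repression
    return "no detectable effect"

# Hypothetical fold-repression values, compared with a control dsRNA (e.g. GFP).
examples = {"geneA": 4.0, "geneB": 15.0, "geneC": 10.5}
control = 10.0
labels = {gene: classify_interactor(value, control) for gene, value in examples.items()}
print(labels)
```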

Gro Interactors Include Chromatin Remodelers, Protein Kinases, and Protein Complexes Involved in RNA Processing-
Gro-mediated repression may be associated with changes in chromatin structure, including histone deacetylation and possibly increased nucleosome density (3,18,54). Consistent with this possibility, our proteomic screen identified a number of histone modifiers and ATP-dependent chromatin remodelers, including subunits of the ACF chromatin remodeling complexes (Acf1 and Iswi), the histone chaperone NAP1, and the histone kinases JIL-1 and Ball. Consistent with the idea that chromatin remodelers may be required for Gro-mediated repression by catalyzing changes in nucleosome density or higher order chromatin structure, our reporter assay showed that Acf1 is required for optimal repression by Gro.
CKII is a heterotetrameric complex consisting of two copies of a catalytic subunit (CKIIα) and two copies of a regulatory subunit (CKIIβ) (55,56). A previous study showed that CKII phosphorylates Gro at multiple sites, including serines 239 and 253, to promote repression (21). We identified both the α and β subunits of CKII and the CKII negative regulator Nopp140 in our proteomic screen, but our findings are inconsistent with the view that CKII is a positive regulator of Gro and that Nopp140 acts by inhibiting CKII. This is because our reporter assays show that CKIIα, CKIIβ, and Nopp140 are all negative regulators of Gro. However, our results are consistent with other findings showing that Gro phosphorylation can block repression (2). Furthermore, the effect we observed due to Nopp140 knockdown could reflect the role of this factor in processes other than CKII regulation (57).
In addition to several expected protein complexes, we also isolated many novel Gro-interacting proteins, one of which is the RNA helicase Rm62 (also known as p68). Rm62 is a DEAD box RNA helicase that has multiple functions, including roles in RNA processing, RNAi, and transcriptional regulation (58). Previous studies have shown a dual role for Rm62 in transcriptional regulation; its interaction with coactivator CBP/p300 may lead to gene activation (59), whereas its interaction with HDAC1 may lead to repression (60,61). Our reporter assay confirms its function as a positive regulator of Gro-mediated repression as knocking down Rm62 resulted in attenuated Gro activity. Interestingly, Rm62 was also shown to be an essential splicing component through its action on the U1 snRNP (62,63). The possible significance of the spliceosome in Gro-mediated repression is discussed below.
An Unanticipated Role for the Spliceosome in Gro-mediated Repression-One of the most surprising findings from our proteomic screen was the purification of a significant portion of the spliceosome complex, which suggests a potential role for the spliceosome in transcriptional regulation.
Pre-mRNA processing frequently occurs co-transcriptionally (64-66). Splicing factors are often recruited to nascent transcripts by the C-terminal domain of the Pol II large subunit and elongation factors (67,68). In addition, there is evidence that co-activators are able to interact with splicing factors (27). The interaction between the transcriptional and splicing machinery may be functionally relevant because different promoters can yield transcripts that are subject to differential alternative splicing (69,70). Although many studies have focused on the effect of transcription factors on splicing, there is also increasing evidence that promoter-proximal splicing elements can influence transcription (26,28,71). U1 snRNP, a part of the spliceosome, consists of U1 snRNA, three U1 snRNP-specific proteins, and the seven-subunit Sm complex (46). Our list of 162 Gro-interacting proteins (supplemental Table S1, B and C) includes all three U1 snRNP-specific proteins (snRNP-U1-C, snRNP-U1-70K, and snRNP-U1-A) as well as two subunits of the Sm complex (Sm-D2 and Sm-D3). We note that we also detected at least four other Sm complex subunits in one of the two replicate screens (Sm-B, Sm-F, Sm-D1, and Sm-G) (supplemental Table S1C). Additionally, we showed by co-immunoprecipitation that ~13% of U1 snRNA, the RNA component of the U1 snRNP, is associated with Gro in embryonic nuclei. Thus, we have detected essentially the entire U1 snRNP in our proteomic screens for Gro-interacting proteins. Data from our reporter assay suggest that the U1 snRNP complex is required for optimal Gro-mediated repression as snRNP-U1-C and snRNP-U1-70K knockdown attenuated repression. Consistent with our finding, it has been shown that snRNP-U1-C overexpression can decrease EWS/FLI-activated transcription (30). It is worth noting that the U1 snRNA is known to associate with transcription factor IIH and promote transcriptional initiation in vitro (29). Thus, the effect of the U1 snRNP complex on transcriptional regulation may be context-dependent.
Gro Recruitment Is Insufficient for Repression-The available S2 cell Gro ChIP-seq data (49) reveal 1,242 Gro binding sites in the S2 cell genome associated with 748 genes, whereas our RNA-seq analysis revealed that only 46 of these 748 genes are differentially expressed in Gro knockdown S2 cells, implying that Gro binds to many genes that it does not regulate. The apparent contradiction could be explained by the absence of a required transcriptional activator in S2 cells to activate these genes upon Gro depletion. Regardless of the reason for the finding that Gro binds to many more genes than it regulates, this is a phenomenon that is common to many (perhaps most) eukaryotic gene-specific transcriptional regulators (51,52). Gro ChIP-seq peaks associated with genes differentially expressed upon Gro knockdown are enriched for Su(H) and Brk binding motifs. This is in agreement with the known roles of Su(H) and Brk in the recruitment of Gro to target genes in the Notch and Dpp signaling pathways, respectively (72-74).
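The comparison between binding and regulation amounts to a set intersection. The sketch below uses hypothetical gene identifiers to show the operation and then applies the same arithmetic to the counts reported above (46 differentially expressed genes among 748 Gro-bound genes). The function name is illustrative and is not part of any published pipeline.

```python
def fraction_regulated(bound_genes, de_genes):
    """Return (shared genes, fraction of bound genes that are differentially expressed)."""
    bound = set(bound_genes)
    shared = bound & set(de_genes)
    return shared, len(shared) / len(bound)

# Hypothetical identifiers, for illustration only.
bound = ["g1", "g2", "g3", "g4", "g5"]
de = ["g2", "g5", "g9"]
shared, frac = fraction_regulated(bound, de)
print(sorted(shared), frac)

# Counts reported in the text: 46 of 748 Gro-bound genes are differentially expressed.
reported_percent = 100 * 46 / 748
print(f"{reported_percent:.1f}% of Gro-bound genes change upon Gro knockdown")
```

By these counts, roughly 6% of Gro-bound genes respond to Gro knockdown in S2 cells.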
Genes that are up-regulated in Gro knockdown cells (and that are therefore candidate Gro repression targets) exhibit enrichment in Pol II pausing near the transcriptional start site. This finding is in agreement with the hypothesis that Pol II pausing is one mechanism to repress gene expression (75,76). We note that our proteomic screen revealed the Pol II C-terminal domain kinase Cdk12 as a Gro-interacting protein (supplemental Table S1). By phosphorylating the C-terminal domain on Ser-2, Cdk12 may function to allow release of paused Pol II (77). Consistent with this idea, our reporter assay shows that Cdk12 functions to alleviate Gro-mediated repression (Table 5 and supplemental Table S2).
Genes that are differentially expressed in Gro and snRNP-U1-C knockdown cells are enriched for H3K36me1 as well as the H3K36 methyltransferase ASH1. Although H3K36me is involved in multiple functions, including transcriptional regulation, splicing, and DNA repair (78,79), these findings suggest a previously unknown role for this histone mark in Gro-mediated repression.

The percentage of non-differentially expressed genes and of genes that are either up-regulated or down-regulated in Gro knockdown cells that have no Pol II bound, have Pol II bound, or are enriched for promoter-proximal Pol II, as ascertained by Pol II ChIP-chip analysis (53).