The FF4 and FF5 Domains of Transcription Elongation Regulator 1 (TCERG1) Target Proteins to the Periphery of Speckles

Background: Coordinated transcription and splicing occur at the periphery of speckles.
Results: The FF4 and FF5 domains of transcription elongation regulator 1 (TCERG1) form a structural unit that directs proteins to the periphery of speckles.
Conclusion: The FF4 and FF5 domains constitute a novel speckle periphery-targeting signal.
Significance: This speckle periphery-targeting signal might participate in the coordination of transcription and splicing.

Transcription elongation regulator 1 (TCERG1) is a human factor implicated in interactions with the spliceosome as a coupler of transcription and splicing. The protein is highly concentrated at the interface between speckles (the compartments enriched in splicing factors) and nearby transcription sites. Here, we identified the FF4 and FF5 domains of TCERG1 as the amino acid sequences required to direct this protein to the periphery of nuclear speckles, where coordinated transcription/RNA processing events occur. Consistent with our localization data, we observed that the FF4 and FF5 domains must be present as a pair to fold in solution, suggesting that the pair forms a functional unit. When added to heterologous proteins, the FF4-FF5 pair is capable of targeting the resulting fusion protein to speckles. This represents, to our knowledge, the first description of a targeting signal for the localization of proteins to sites peripheral to speckled domains. Moreover, this "speckle periphery-targeting signal" contributes to the regulation of alternative splicing decisions of a reporter pre-mRNA in vivo.

The mammalian cell nucleus is a highly dynamic organelle that contains numerous morphologically defined structures, some of which have been implicated in essential processes such as RNA biogenesis. Nuclear speckles are one of these nuclear bodies, and studies on their composition, structure, and behavior have provided useful information for understanding the functional compartmentalization of the cell nucleus (1). Speckles are enriched in pre-mRNA splicing factors and are located in the interchromatin region of the nucleoplasm of mammalian cells (2-4). They appear as 20-50 irregular regions per mammalian nucleus that are generally defined by immunofluorescence staining of RNA-processing factors, such as the serine/arginine-rich (SR) splicing factor SC35 (5). Speckles are dynamic structures, and they become round and increase in size upon transcriptional or splicing inhibition (6). It is believed that speckles are storage/assembly sites for splicing components and that transcription and pre-mRNA splicing do not occur within these structures. However, a significant proportion of RNA polymerase II-mediated transcription in the cell nucleus is associated with the periphery of speckles (7-9). Nascent transcript formation near the speckle compartment results in the recruitment of splicing factors from these nuclear bodies to the processing site, and this exchange rate might be regulated by continuous phosphorylation and dephosphorylation events. Thus, phosphorylation of SR proteins is necessary for their recruitment from nuclear speckles to sites of transcription in vivo (10). In summary, speckles appear to modulate the relative concentration of processing factors at active transcription sites, thus acting as an architectural integrator of the dynamic molecular associations that are involved in the coordination of transcription and RNA processing.
Although significant progress has been made on the role of speckles in gene expression, little is known about the sequence motifs responsible for the accumulation of splicing factors at the speckle region. In the case of the SR family of proteins, the RNA recognition motif (RRM) and the RS domain direct these splicing factors to the nuclear speckles (11)(12)(13)(14). Other regions of specific splicing factors can also act as targeting signals to nuclear speckles, such as the threonine-proline repeats found in SF3b155 (15) and the arginine-, proline-, and serine-rich domains of SRm160 (16). In the case of protein kinases CrkRS and DYRK1A, the RS domain and a histidine-rich region, respectively, are required for localization to speckles (17)(18)(19). To date, no localization signal has been clearly defined to target proteins to the interface between speckles and surrounding transcription sites.
TCERG1 participates in transcriptional elongation and alternative splicing of pre-mRNAs, and a role for this protein in coordinating both processes has been proposed (20,21). TCERG1 is composed of 1098 residues (22) and contains three WW domains at its N terminus followed by six FF domains at its C terminus. TCERG1 was first described as a transcriptional elongation regulator and was initially found in HIV-1 Tat-responsive HeLa nuclear extract fractions (22,23). However, accumulating evidence indicates a potential role of TCERG1 in splicing and, hence, in the coupling between transcription and splicing. TCERG1 affects the alternative pre-mRNA splicing of β-globin, β-tropomyosin, CD44, and fibronectin splicing reporters (24-27) and of putative cellular targets identified by microarray analysis following TCERG1 knockdown (26). Consistent with a potential role in the coupling of transcription and splicing, TCERG1 localizes at the interface of splicing factor-rich nuclear speckles and what are presumably nearby transcription sites (21), and it associates with RNA polymerase II and with elongation and splicing components (21,24,28,29).
In this study, we identified the FF4 and FF5 domains of TCERG1 as the region required to direct this protein to the periphery of nuclear speckles. We performed NMR-based analyses and observed that although the FF4 domain is folded and stable, the FF5 domain is not. However, when both domains are expressed as a pair, the folded properties of FF5 are improved. These observations suggest that both domains form a functional unit and provide insights into the nature of FF protein domains. Moreover, our data demonstrate that both of these FF domains specifically direct the localization of fused unrelated proteins to these nuclear regions. Therefore, we defined the FF4 and FF5 domains as novel targeting signals for the localization of proteins at the interface between speckles and what are presumably nearby transcription sites. Placing our data in a functional context, this "speckle periphery-targeting sequence" contributes to the regulation of alternative splicing decisions of a reporter pre-mRNA in vivo.
All constructs were verified by DNA sequencing. Plasmids were transformed into DH5α cells for selection.
Proteins-Constructs encoding the FF4 domain (residues 878-956) and the FF4-FF5 pair (residues 878-1022) of TCERG1 were amplified by PCR and subcloned into the pETM-11 vector (a gift from the European Molecular Biology Laboratory-Heidelberg Protein Expression Facility) using NcoI and HindIII sites. All constructs were confirmed by sequencing.
Unlabeled, ¹⁵N-labeled, ¹³C,¹⁵N-labeled, and ²H,¹³C,¹⁵N-labeled proteins were expressed in Escherichia coli BL21(DE3) in Luria broth or minimal medium (M9) using either H₂O or D₂O (99.89%, CortecNet) enriched with ¹⁵NH₄Cl and/or D-[¹³C]glucose as the sole sources of nitrogen and carbon, respectively (30). E. coli cells were lysed using an EmulsiFlex-C5 (Avestin) cell disrupter equipped with an in-house developed Peltier temperature controller system. Soluble fusion proteins were purified by nickel affinity chromatography (HiTrap Chelating High Performance column, GE Healthcare), and samples were eluted using buffer (20 mM Tris, 10 mM imidazole, 150 mM NaCl) with EDTA. After nickel affinity purification, the proteins were cleaved with tobacco etch virus (TEV) protease and further purified by gel filtration on a HiLoad Superdex 75 prep grade column (GE Healthcare). All samples were prepared in 20 mM sodium phosphate buffer, 130 mM NaCl, 0.5 mM NaN₃ in 90% H₂O, 10% D₂O or 100% D₂O (pH 5.8). To avoid aggregation of the FF4-FF5 sample, the buffer was supplemented with 5% glycerol.
Antibodies-Antibodies against the T7 tag were purchased from Bethyl and used at dilutions of 1:20,000 and 1:1000 for immunoblotting and immunofluorescence, respectively. Antibody against CDK9 (catalog no. sc-484, Santa Cruz Biotechnology) was used at a dilution of 1:500. Antibodies against U2AF65 were kindly provided by J. Valcárcel (CRG, Barcelona, Spain) and were used at a dilution of 1:500. Antibody against splicing factor SC35 (catalog no. S4045, Sigma) was used at a dilution of 1:4000. For Western blot analysis, primary antibodies were detected using HRP-conjugated secondary antibodies to rabbit and mouse (PerkinElmer Life Sciences), generally at dilutions of 1:5000. For immunofluorescence, we used Alexa Fluor 488-conjugated goat anti-mouse and Alexa Fluor 647-conjugated goat anti-rabbit antibodies (Molecular Probes), generally at dilutions of 1:500.
Cell Culture and Transfection Assays-HeLa and HEK293T cells were grown and maintained as described previously (27). Transfection assays were carried out using protocols described previously (21,27). Cells were transfected by using calcium phosphate and/or Lipofectamine 2000 reagent (Invitrogen) according to the protocols of the manufacturer. Empty vector was used to keep the total amount of nucleic acid constant.
Immunofluorescence, Image Processing, and Quantification-HeLa and HEK293T cells were grown on coverslips and transfected with the constructs indicated in the legends to the figures. Approximately 24 h after transfection (50-60% confluence), cells were fixed with 3.5% paraformaldehyde in PBS buffer (pH 7.4) for 45 min on ice. Cells were washed three times with PBS, permeabilized in PBS containing 0.5% Triton X-100 for 5 min at room temperature, and blocked in PBS containing 2.5% BSA overnight at 4°C. Cells were incubated with primary antibodies at appropriate dilutions in PBS containing 0.1% BSA for 1 h at room temperature (humidity chamber) and subsequently washed extensively with 0.1% BSA in PBS and incubated with appropriate secondary antibodies under the conditions described previously. After staining, cells were rinsed five times with 0.1% BSA in PBS and three more times with PBS. Coverslips were then mounted onto glass slides using ProLong Gold antifade reagent (Molecular Probes). Images were acquired with an inverted Leica SP2 confocal microscope, using an HCX PL APO CS 40.0×1.25 OIL UV objective. In cases where double immunofluorescence was performed, images were all taken simultaneously. GFP, CFP, and Alexa Fluor 488 were excited with the 488-nm line of the argon laser, whereas Alexa Fluor 647 was excited with a 633-nm HeNe laser. The pinhole diameter was kept at 1 μm. The quantification analysis shown in Fig. 1B was performed by measuring and comparing the average pixel intensity of the ECFP at the nuclear speckle site and in the adjacent nucleoplasm in a region of interest scan for each TCERG1 mutant. Thirty-six regions of interest from three different experiments with 12 cells per experiment (72 and 108 measurements for the speckles and nucleoplasm, respectively) were quantified for each TCERG1 mutant. Statistical analysis was performed using a standard Student's t test. Acquisition software was LAS AF v2.3.6 Build 5381sps.
All images were digitally processed for presentation using Adobe Photoshop CS3 extended v10.0 software.
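The comparison of ECFP intensity at speckles versus the adjacent nucleoplasm can be illustrated with a minimal Python sketch. It assumes the pooled-variance (equal-variance) form of the two-sample Student's t statistic; the function name `student_t` and every intensity value below are hypothetical and are not measurements from this study.

```python
from math import sqrt
from statistics import mean, variance


def student_t(a, b):
    """Two-sample Student's t statistic with pooled variance.

    Returns the t statistic and the degrees of freedom (n_a + n_b - 2).
    """
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled * (1.0 / na + 1.0 / nb))
    return t, na + nb - 2


# Hypothetical average ECFP pixel intensities (arbitrary units) per measurement.
speckle = [182.0, 175.5, 190.2, 168.9, 186.4, 179.1]
nucleoplasm = [96.3, 101.8, 89.7, 104.2, 93.5, 99.0, 97.6, 102.4, 91.1]

# Ratio of mean speckle signal to mean nucleoplasmic signal.
enrichment = mean(speckle) / mean(nucleoplasm)
t_stat, dof = student_t(speckle, nucleoplasm)
```

The t statistic and degrees of freedom would then be compared against a t distribution to obtain a p value; that lookup is omitted here because the Python standard library does not provide it.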
NMR Spectroscopy-All experiments were recorded on a Bruker Avance III 600-MHz spectrometer equipped with a z pulse field gradient unit and a triple (¹H, ¹³C, ¹⁵N) resonance probe head. Double- and/or triple-labeled samples were prepared to acquire the experiments used for sequence-specific assignment (HNCACB/HN(CO)CACB or CBCA(CO)NH/CBCANH). All spectra were processed with the NMRPipe/NMRDraw software (31) and were analyzed with Computer Aided Resonance Assignment, CARA (32). The ¹⁵N relaxation experiments were acquired for ¹⁵N-labeled samples of FF4+FF5 (0.5 mM) and FF4 (0.5 mM) essentially as described (33). The heteronuclear {¹H}-¹⁵N NOE experiment was performed using standard 2D experiments, with the reference and proton-saturated spectra collected in an interleaved fashion. The values of the steady-state ¹H-¹⁵N NOEs were determined from the ratios of the peak intensities measured in spectra recorded either with (Is) or without (Io) presaturation during the relaxation delay as described (34). The standard deviation of the NOE was determined on the basis of measured background noise levels using the following relationship:
$$\sigma_{NOE}/NOE = \left[(\sigma_{I_s}/I_s)^2 + (\sigma_{I_o}/I_o)^2\right]^{1/2}$$
where $\sigma_{I_s}$ and $\sigma_{I_o}$ are the background noise levels of the two spectra. T1 and T2 experiments were acquired with 135 (t1) × 2048 (t2) total real points. T1 data points were obtained with 12 different relaxation periods: 20.8, 52, 104, 156, 265, 424, 520, 676, 832, 1040, 1352, and 1664 ms. Ten delay times were also sampled in the T2 experiments: 12, 24, 36, 48, 60, 72, 84, 120, 144, and 168 ms. T1 and T2 values were determined by fitting the measured peak heights to a two-parameter function of the form:
$$I(t) = I_0 \exp(-t/T_{1,2})$$
where I(t) is the intensity after a delay of time t and $I_0$ is the intensity at time t = 0.
The optimum values of the $I_0$ and $T_{1,2}$ parameters were determined with the Levenberg-Marquardt optimization algorithm by minimizing the $\chi^2$ goodness-of-fit parameter:
$$\chi^2 = \sum_t \frac{\left(I_c(t) - I_e(t)\right)^2}{\sigma_I^2}$$
where $I_c(t)$ are the intensities calculated from the fitting parameters, $I_e(t)$ are the experimental intensities, $\sigma_I$ is the standard deviation of the experimental intensity measurements, and summation is performed over the number of time points recorded in each experiment.
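As an illustration of this fitting step, the following Python sketch fits a two-parameter mono-exponential decay by a hand-written Levenberg-Marquardt loop. It is a minimal sketch under stated assumptions, not the software used in this work: the function names, the starting damping factor, the stopping thresholds, the noise level, and the synthetic noise-free peak heights (generated with a hypothetical T2 of 80 ms) are all illustrative. Only the list of T2 delays is taken from the text.

```python
from math import exp, inf


def decay(t, i0, tau):
    """Two-parameter mono-exponential decay, I(t) = I0 * exp(-t / T)."""
    return i0 * exp(-t / tau)


def chi_square(times, obs, sigma, i0, tau):
    """Sum of squared, noise-weighted differences between model and data."""
    return sum(((decay(t, i0, tau) - y) / sigma) ** 2 for t, y in zip(times, obs))


def fit_decay(times, obs, sigma, i0, tau, max_iter=200):
    """Levenberg-Marquardt fit of I0 and T; returns (I0, T, chi-square)."""
    lam = 1e-3  # damping factor (assumed starting value)
    best = chi_square(times, obs, sigma, i0, tau)
    for _ in range(max_iter):
        # Accumulate the 2x2 normal equations (J^T J) and the term J^T r.
        a11 = a12 = a22 = g1 = g2 = 0.0
        for t, y in zip(times, obs):
            e = exp(-t / tau)
            d_i0 = e / sigma                       # weighted d(model)/d(I0)
            d_tau = i0 * e * t / tau ** 2 / sigma  # weighted d(model)/d(T)
            r = (y - i0 * e) / sigma               # weighted residual
            a11 += d_i0 * d_i0
            a12 += d_i0 * d_tau
            a22 += d_tau * d_tau
            g1 += d_i0 * r
            g2 += d_tau * r
        # Marquardt damping scales the diagonal; solve the 2x2 system directly.
        m11, m22 = a11 * (1.0 + lam), a22 * (1.0 + lam)
        det = m11 * m22 - a12 * a12
        new_i0 = i0 + (g1 * m22 - g2 * a12) / det
        new_tau = tau + (g2 * m11 - g1 * a12) / det
        trial = chi_square(times, obs, sigma, new_i0, new_tau) if new_tau > 0 else inf
        if trial < best:
            gain = best - trial
            i0, tau, best = new_i0, new_tau, trial
            lam /= 10.0  # accepted step: reduce damping
            if gain < 1e-12:
                break
        else:
            lam *= 10.0  # rejected step: increase damping
            if lam > 1e12:
                break
    return i0, tau, best


# T2 delays (ms) from the text; peak heights are synthetic and noise-free.
t2_delays = [12, 24, 36, 48, 60, 72, 84, 120, 144, 168]
heights = [decay(t, 1000.0, 80.0) for t in t2_delays]
i0_fit, t2_fit, chi2_min = fit_decay(t2_delays, heights, sigma=5.0,
                                     i0=heights[0], tau=50.0)
```

With real spectra, `sigma` would be set from the measured noise level, and the fit would be repeated per residue for both the T1 and T2 series.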
Differential Scanning Calorimetry-Thermal denaturation was studied using differential scanning calorimetry (DSC). The experiments were performed using a VP-DSC MicroCalorimeter (in the Polymorphism and Calorimetry Platform of the Scientific and Technical Services (SCT), Universitat de Barcelona) in 20 mM sodium phosphate buffer, 130 mM NaCl, 0.5 mM NaN₃ (pH 5.8). The protein solution was heated at a constant rate of 1°C/min from 10 to 80°C at constant pressure. The temperature dependence of the excess heat capacity was analyzed and plotted with Origin 7.0 software (OriginLab Corp.).
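For orientation, two quantities commonly read from an excess heat capacity curve can be sketched in Python: the transition midpoint, taken here as the temperature of maximal excess heat capacity, and the calorimetric enthalpy, taken as the area under the curve. This is a simplified sketch assuming a baseline-subtracted, single-peak thermogram; the function names and the Gaussian-shaped synthetic curve (peak at 55°C) are hypothetical and do not represent the data or the Origin analysis of this study.

```python
from math import exp, pi, sqrt


def transition_midpoint(temps, cp_excess):
    """Temperature at which the excess heat capacity is maximal."""
    i_max = max(range(len(cp_excess)), key=lambda i: cp_excess[i])
    return temps[i_max]


def calorimetric_enthalpy(temps, cp_excess):
    """Area under the excess heat capacity curve (trapezoidal rule)."""
    return sum(
        0.5 * (cp_excess[i] + cp_excess[i + 1]) * (temps[i + 1] - temps[i])
        for i in range(len(temps) - 1)
    )


# Synthetic thermogram: 10 to 80 degrees C in 0.5-degree steps, Gaussian peak.
temps = [10.0 + 0.5 * i for i in range(141)]
cp = [10.0 * exp(-((t - 55.0) ** 2) / (2 * 4.0 ** 2)) for t in temps]

tm = transition_midpoint(temps, cp)
dh = calorimetric_enthalpy(temps, cp)
```

Real thermograms require buffer and baseline subtraction before either quantity is meaningful, and aggregation during heating (as discussed for the FF4/FF5 pair) can distort both.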
RT-PCR Analysis-Approximately 2 × 10⁶ HEK293T cells were seeded 24 h before transfection. Cells were transfected with 0.6 μg of Bcl-X2 minigene reporter (kindly provided by Benoit Chabot, University of Sherbrooke) and 1 μg of the TCERG1 derivative plasmids as indicated in the legend of Fig. 6. Total RNA was isolated from transfected cells by using the TRIzol reagent (Invitrogen). Approximately 1 μg of RNA was digested with 10 units of RNase-free DNase (Roche). One-half of the digested RNA was used for RT using the RT-Sveda primer and Moloney murine leukemia virus (M-MLV) reverse transcriptase (Invitrogen) according to the instructions of the manufacturer. 10% of the RT reaction was used as template together with the X34 and X-Agel-R primers. Reaction products were analyzed on a 2% agarose gel and quantified using Quantity One 4.5.0 software. The following primers were used: 5′-GGGAAGCTAGAGTAAGTAG-3′ (RT-Sveda1-Rev), 5′-AGGGAGGCAGGCGACGAGTTT-3′ (X34), and 5′-GTGGATCCCCCGGGCTGCAGGAATTCGAT-3′ (X-Agel-R).

RESULTS
Identification of a Sequence Element Required for the Localization of TCERG1 to Nuclear Speckles-We reported previously that TCERG1 is present along the periphery of the speckles (21) but that the sequences responsible for this location have not yet been defined. In our previous study, we found that a TCERG1 protein lacking the FF5 and FF6 repeat motifs is not able to localize to speckles (21). At least three interpretations are possible: 1) FF5 is essential for the subnuclear localization of TCERG1; 2) FF6 and FF5 are both important but neither is essential; or 3) a minimum number of FF repeats are required for the localization to nuclear speckles. To identify the elements within the FF repeat motifs of TCERG1 responsible for its accumulation in nuclear speckles, we constructed a series of mutants of human TCERG1 fused to ECFP and investigated their subnuclear localization in transfected HEK293T cells by confocal laser scanning microscopy. All mutants retain the putative nuclear localization signal found in the middle of the protein (22). The localization of full-length ECFP-tagged TCERG1 was similar to that of the endogenous TCERG1 (Fig.  1A, ECFP and TCERG1 ). A TCERG1 deletion that completely eliminated the FF domains was localized in a diffuse pattern throughout the nucleoplasm without any evident accumulation on the speckles (Fig. 1A, ECFP and TCERG1 ), in agreement with our data published previously (21). A mutant containing the amino terminal region and the FF5 domain of TCERG1 showed a slight enrichment in speckles, indicating that this domain might be required for targeting the protein to nuclear speckles but may not be sufficient for efficient speckle localization (Fig. 1A, ECFP, compare the signal obtained with TCERG1  to the one obtained with TCERG1[1-662]-FF5). 
Addition of the FF6 domain to this mutant created a fusion protein with a staining pattern similar to the previous mutant, and it did not accumulate in nuclear speckles in significant amounts (supplemental Fig. S1A, TCERG1[1-662]-FF5/ FF6). To further analyze the speckle-targeting capacity of the FF domains, we generated a chimera with the amino-terminal region fused to the FF4 and FF5 domains. Remarkably, cells expressing this chimera showed strong nuclear speckle fluorescence that was indistinguishable from that of wild-type TCERG1 (Fig. 1A, ECFP and TCERG1  Fig. S1A).
These results suggest that the combination of FF4 and FF5 is the sequence element necessary and sufficient for localization to speckles. This was not due to differential protein expression levels, as confirmed by Western blotting (supplemental Fig.  S1B).
To confirm that the nuclear staining pattern of the transiently expressed proteins coincides with nuclear speckles, we carried out an immunofluorescence analysis with an antibody against the essential splicing factor SC35, which commonly serves to define nuclear speckles. Expression of wild-type and the FF4/FF5 mutant resulted in nuclear colocalization with speckles, whereas the FF5 mutant displayed a partially overlapping signal at the speckle region (Fig. 1A). The spatial relationship between wild-type and mutant TCERG1 variants relative to SC35 was identified by quantitatively scanning specific nuclear regions containing speckles and is shown on the right in Fig. 1A.
To further analyze the spatial distribution of TCERG1, we examined the relative distributions of either wild-type TCERG1 or TCERG1[1-662]-FF4/FF5 and SC35 in individual nuclear speckles by confocal microscopy optical sectioning (0.3-μm sections). We found that the TCERG1 proteins are located at the periphery and in internal regions of the speckles but are excluded from the core region of the speckles (supplemental Fig. S2). The analysis of those images showed that the TCERG1 peaks partially overlap, but do not coincide with, the SC35 peaks (supplemental Fig. S2), lending additional support to the observation that wild-type TCERG1 (21) and TCERG1[1-662]-FF4/FF5 are enriched at the speckle periphery.
We performed quantitative and statistical analyses that involved computing the proportion of signal intensity contained in speckles (see "Experimental Procedures"). Intensity measurements agreed with the above data and confirmed that FF4/FF5 is required for the efficient targeting of TCERG1 to the speckle compartment (Fig. 1B).
To further corroborate that the FF4 and FF5 domains are the targeting signal for the localization of TCERG1 to nuclear speckles, we expressed wild-type TCERG1 and FF4/FF5-deleted protein tagged with ECFP at the amino terminus and examined their nuclear localization in HEK293T cells using immunofluorescence microscopy. As shown in Fig. 1C, the TCERG1 protein with a deletion encompassing the FF4 and FF5 domains (TCERG1ΔFF4ΔFF5) exhibited diffuse localization throughout the nucleoplasm without any evident accumulation on the speckles, thus supporting the conclusion that this region of TCERG1 contains the sequence required for proper localization to speckles.
Previously, we have shown that a deletion of the FF6 domain does not affect the speckle distribution of TCERG1, which is consistent with our current data, but that a further deletion that included the FF5 domain results in perturbation of TCERG1 localization to speckles (21). To test whether the FF5 domain is essential for the proper localization of TCERG1 to speckles, we transiently transfected cells with a plasmid encoding a TCERG1 protein in which the FF5 domain had been deleted (TCERG1ΔFF5) and examined its nuclear distribution. The expressed protein was present diffusely throughout the nucleoplasm (Fig. 1C). These results indicate that a deletion of the FF5 domain disrupts the localization of TCERG1 to nuclear speckles and imply that FF5 is essential for this association.
NMR Studies of the FF4 and FF5 Domains of TCERG1-The sequence comparison of the FF4-FF5 pair of domains in different species revealed a high level of sequence conservation in this pair of domains (Fig. 2A). Remarkably, secondary structure predictions of the FF4 and FF5 domains suggested that seven residues (951-957) are shared by these two FF domains, with a fragment of the last helix of the FF4 domain overlapping with a part of the first helix of the FF5 domain. This prediction could imply that the FF4 and FF5 domains may not fold independently and might require the presence of the pair to acquire a stable fold. To evaluate the presence or absence of tertiary structure in the pair and to compare it with that of the independent domains, we used recombinant fragments corresponding to the FF4 and FF5 domains as well as a construct containing the FF4/FF5 pair and acquired heteronuclear single-quantum coherence (HSQC) NMR spectra at 298 K. These experiments provide  fold with an α1-α2-3₁₀-α3 topology (35,36), whereas the FF5 domain presented properties of a sample containing mixtures of partially folded and unfolded molecules. A superimposition of the HSQC spectra of the independent domains on that of the FF4/FF5 pair revealed chemical shift differences in several residues of the FF4 domain and also that the FF5 domain substantially improved its chemical shift dispersion when compared with that of the isolated FF5 domain (Fig. 2, B and C). These differences seem to indicate that in the FF4-FF5 pair, the domains may contact one another. To support this hypothesis further, we analyzed the thermodynamic behavior of the FF4/FF5 pair and compared it to that of the independent FF4 domain using DSC. DSC measures the enthalpy of unfolding processes caused by thermal denaturation and also provides the thermal transition midpoint (Tm), which correlates with protein stability.
In agreement with the NMR data, the FF4/FF5 pair presented a unique unfolding curve, suggesting that the pair unfolds in a concerted manner (Fig. 2D). The observed decrease in the midpoint thermal transition of the pair with respect to that of the FF4 domain (5.6°C) is attributed to an aggregation process occurring in the pair that is absent in the isolated FF4 domain. The aggregation process, which affects the overall stability of the pair, probably involves the FF5 domain because of its lower stability compared with that of FF4.
To identify the elements of secondary structure present in the FF4/FF5 pair, we used a (²H, ¹³C, ¹⁵N)-labeled recombinant sample and acquired backbone triple-resonance experiments. Assignments were obtained for up to 70% of all possible residues. A comparison of the α and β carbon chemical shifts of the assigned residues to random coil values revealed positive values that correlate with the presence of α helices, characteristic of FF domains, shown in Fig. 2E as a green line. The high number of overlapped residues in the FF5 domain precluded a proper identification of all secondary structure elements (37). However, the positive ¹³C values obtained for the unambiguously assigned regions of the FF5 domain suggested that this domain also presents helical secondary structure. From the obtained data it seems that the first helix of the FF5 domain is a continuation of the last helix of the FF4 domain, that the 3₁₀ and third helices are shorter than expected on the basis of a sequence comparison with other described FF domains, and finally, that the second helix is almost undetectable.
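The comparison against random coil values can be sketched as a secondary chemical shift calculation. The Python example below assumes the commonly used combination (Cα observed minus random coil) minus (Cβ observed minus random coil), for which positive values point to helical conformation; whether exactly this combination was plotted in Fig. 2E is an assumption. The random coil entries are approximate literature-style values given for illustration only, the 1.0-ppm cutoff is arbitrary, and the observed shifts are hypothetical. Deuteration also shifts ¹³C resonances, which a real analysis would have to correct for.

```python
# Approximate random coil shifts in ppm as (C-alpha, C-beta); illustrative only,
# a published reference table should be substituted for real analyses.
RANDOM_COIL = {
    "A": (52.5, 19.1),
    "E": (56.6, 29.9),
    "L": (55.1, 42.4),
    "K": (56.2, 33.1),
}


def secondary_shift(residue, ca_obs, cb_obs):
    """Return (dCa - dCb), the combined secondary chemical shift in ppm."""
    ca_rc, cb_rc = RANDOM_COIL[residue]
    return (ca_obs - ca_rc) - (cb_obs - cb_rc)


# Hypothetical assigned residues: (one-letter code, C-alpha ppm, C-beta ppm).
observed = [
    ("A", 55.0, 18.3),
    ("E", 59.2, 29.3),
    ("L", 57.8, 41.6),
    ("K", 56.3, 33.0),
]

scores = [secondary_shift(*row) for row in observed]
# Arbitrary illustrative cutoff: values above 1.0 ppm are flagged as helical.
helical = [score > 1.0 for score in scores]
```

In practice a helix is called only when several consecutive residues show positive values, not from a single residue.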
To further examine the structural and dynamical properties of the pair, we performed heteronuclear {¹H}-¹⁵N NOE relaxation experiments (34). Internal dynamics are directly related to folding, with less ordered proteins displaying higher internal motions. The data obtained for the assigned residues (shown in Fig. 2F) revealed that the distribution of NOE values was, on average, similar for both domains in the pair. To further explore the dynamic behavior of the FF4/FF5 pair, we also performed T1 and T2 experiments and compared the obtained values to those of the independent FF4 domain. The correlation times obtained for the assigned residues (supplemental Fig. S3, A and B), T1, T2, and the T1/T2 ratio are predominantly in the same range, with less uniform data for the pair than for FF4. We attributed the observed results to the slightly higher flexibility of FF5 when compared with FF4, probably because of the presence of long loops connecting the helices in FF5, whereas the values corresponding to the secondary structured regions were more similar.
These results point out that the fourth and fifth FF domains of TCERG1 have similar properties when expressed together, suggesting that the FF5 domain is more stable in the pair. This organization is a feature specific to these two domains, as we did not observe any structural organization when FF5 was combined with FF6 (supplemental Fig. S3C).
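The per-residue NOE values discussed above are intensity ratios, and their uncertainties follow from the spectral noise. A minimal Python sketch is given below, assuming standard error propagation for a ratio (relative errors added in quadrature); the function name and the intensities are hypothetical.

```python
from math import sqrt


def noe_with_error(i_sat, i_ref, noise_sat, noise_ref):
    """Steady-state NOE (I_sat / I_ref) and its propagated standard deviation."""
    noe = i_sat / i_ref
    sigma = noe * sqrt((noise_sat / i_sat) ** 2 + (noise_ref / i_ref) ** 2)
    return noe, sigma


# Hypothetical peak intensities and background noise levels (arbitrary units).
noe, sigma = noe_with_error(i_sat=80.0, i_ref=100.0, noise_sat=2.4, noise_ref=4.0)
```

Here the relative errors are 3% and 4%, so the combined relative error is 5% and the NOE of 0.8 carries an uncertainty of 0.04.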
FF4 and FF5 of TCERG1 Comprise a Novel Nuclear Localization Signal Targeting Proteins to the Periphery of Speckles-The SR protein SRSF1 (formerly SF2/ASF) is an essential splicing factor that regulates alternative splicing of many pre-mRNAs and is located at the speckle region (38). SRSF1 contains two functional modules: an RS (arginine/serine-rich) domain and two RRMs. Previous work demonstrated that at least two of these domains are necessary for SRSF1 localization to nuclear speckles (14). The same work described mutant proteins carrying individual domains (RRM1 and RRM2) that localized throughout the cell without any evident accumulation on the nuclear speckles. We sought to test whether the FF4/FF5 sequence could target those SRSF1 mutants to nuclear speckles. To this end, we ligated the FF4/FF5 and FF5 sequences to the C terminus of the RRM1 and RRM2 coding sequences, and the fusion genes were transfected into HeLa cells. All constructs encoded proteins with a bacteriophage T7 epitope tag at their amino terminus, allowing detection of the exogenous proteins with antibodies that recognize this epitope. Immunofluorescence experiments using a T7 tag antibody showed that the transiently expressed wild-type SRSF1 protein localized exclusively in the nucleus with a typical speckled pattern (Fig. 3a). Double-labeling experiments with anti-SC35 antibodies confirmed SRSF1 localization at nuclear speckles (Fig. 3, b and c). In contrast, when individual domains were expressed (RRM1 or RRM2), the mutant protein localized throughout the cell, and colocalization with nuclear speckles was not detected (Fig. 3, d-f and m-o), which agrees with a previous report (14). Small changes in the staining pattern were observed upon addition of the FF5 region to the mutant proteins (Fig. 3, g-i and p-r). However, addition of the FF4/FF5 region clearly led to speckle localization of the fusion protein (Fig. 3, j-l and s-u).
Expression of the various SRSF1 constructs was nearly identical, as assessed by Western blotting (supplemental Fig. S4).
Next, we generated GFP fusion proteins containing the FF4/FF5 domains and the FF5 domain alone. The GFP-FF5 protein showed a diffuse staining pattern similar to that of GFP alone with small punctate areas in the nucleus (18) (supplemental Fig. S5). The GFP-FF4/FF5 protein clearly accumulated in the speckle compartment (supplemental Fig. S5), indicating that the presence of this sequence is sufficient for nuclear speckle localization of GFP. These results, together with the analysis of the SRSF1 mutants, indicate that the FF4/FF5 region shown to be necessary for the localization of TCERG1 protein to speckles is sufficient to direct the localization of heterologous proteins to nuclear speckles.
To explore further the association of FF4/FF5-containing RRMs with speckles, we treated cells with the RNA polymerase II inhibitor α-amanitin. Upon RNA polymerase II inhibition, speckles decrease in number, enlarge, and become rounded because of the accumulation of the splicing machinery (39). We found similar changes in the immunofluorescence pattern using antibodies against SC35 and RRM1-FF4/FF5 proteins upon transcriptional inhibition (Fig. 4). We conclude that the localization of FF4/FF5 fusion proteins to nuclear speckles is not dependent on active transcription and that the accumulation of these proteins shows behavior similar to that of splicing factors.
FIGURE 3. FF4/FF5 directs SRSF1 domain-deletion mutants to nuclear speckles. Cells were transfected with the indicated plasmids and dually labeled with antibodies directed against the expressed SRSF1 protein (left column, green) and SC35 (center column, red). The merged images are also shown (right column). In all cases, colocalization of expressed proteins with the endogenous marker was assessed by confocal imaging. A diagrammatic representation of the T7-tagged SRSF1 mutants used is shown at the left of the figure. The structure of the SRSF1 domain-deletion mutants was described previously (51). Scale bars = 3 μm.
FF motifs are putative protein-protein interaction domains named for two conserved phenylalanine (F) residues (41). To study the requirement of particular residues within the FF4/FF5 domain for their nuclear targeting activity, we generated TCERG1[1-662]-FF4/FF5 constructs containing phenylalanine-to-alanine mutations at positions Phe-903, Phe-946, or Phe-961, as well as the double mutants F903A,F946A and F903A,F961A. We then expressed the wild-type FF4/FF5 domain and the phenylalanine-to-alanine mutants tagged with ECFP at the amino terminus and examined their nuclear localization using immunofluorescence microscopy. As shown in Fig. 5, the TCERG1[1-662]-FF4/FF5 protein and the mutants within the FF4 domain (F903A, F946A, and the double mutant F903A,F946A) exhibited a similar nucleoplasmic distribution with an increased signal in speckles. Mutation of the first phenylalanine of the FF5 domain at position 961 resulted in a decreased enrichment in speckles. The F903A,F961A double mutant localized in a diffuse pattern throughout the nucleoplasm without any evident accumulation on the speckles (Fig. 5). Expression of the various phenylalanine-to-alanine constructs was nearly identical, as assessed by Western blotting (supplemental Fig. S6). These experiments were repeated with the same phenylalanine-to-alanine mutations in the context of a full-length TCERG1 protein (data not shown). These results demonstrate the involvement of the highly conserved phenylalanine residue within the FF5 domain in the targeting to the nuclear speckles. They also show that TCERG1 is able to localize to the periphery of nuclear speckles when the Phe-903 and Phe-946 residues within the FF4 domain have been mutated. This might indicate that the folding of this domain is less perturbed upon mutating these residues. This hypothesis is suggested by our NMR data (Fig. 2), which indicate a higher degree of stability for this domain.
Role of the FF4/FF5 Domains in TCERG1 RNA Splicing Activity. TCERG1 is involved in the process of pre-mRNA splicing and can affect the splicing of several minigene splicing reporters (see "Introduction"). We sought to investigate the role of the FF4/FF5 domains in the regulation of alternative splicing. Recently, we found that TCERG1 can affect alternative splicing of the apoptosis gene Bcl-x (40). Bcl-x pre-mRNA uses an alternative 5′ splice site in exon 2 to produce the antiapoptotic Bcl-xL or the proapoptotic Bcl-xS isoform (Fig. 6A). A Bcl-x minigene was transfected into HEK293T cells in combination with vectors expressing wild-type or FF4/FF5-deleted TCERG1. The presence of different splice products was assessed by RT-PCR. As shown in Fig. 6B, cotransfection of wild-type TCERG1 with this reporter increased the short Bcl-xS variant relative to transfection of the vector alone (Fig. 6B, WT). Furthermore, deletion of the FF4/FF5 domains reduced the ability of TCERG1 to affect 5′ splice site selection (Fig. 6B, ΔFF4ΔFF5). These results demonstrate the involvement of the FF4/FF5 domains in the determination of 5′ splice site selection in the Bcl-x gene, which supports the assumption that the FF4/FF5 unit is a functional entity. Although more work is clearly needed to show how the FF4/FF5 mutants affect splicing of the Bcl-x reporter, our data suggest that TCERG1 might exert its activity through its localization to the nuclear speckle region.
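As a rough illustration of how an RT-PCR readout such as that in Fig. 6B could be summarized numerically, the following sketch computes the fraction of Bcl-xS among the two isoforms from band intensities. The function name, the lane labels, and the intensity values are hypothetical and are not data from this study; the quantification procedure actually used is not described in this section.

```python
def bclx_short_fraction(xs_intensity: float, xl_intensity: float) -> float:
    """Return Bcl-xS / (Bcl-xS + Bcl-xL) from two band intensities."""
    if xs_intensity < 0 or xl_intensity < 0:
        raise ValueError("band intensities must be non-negative")
    total = xs_intensity + xl_intensity
    if total == 0:
        raise ValueError("at least one band must have signal")
    return xs_intensity / total


# Hypothetical densitometry values (arbitrary units), for illustration only.
# Each entry is (Bcl-xS band, Bcl-xL band).
lanes = {
    "vector": (20.0, 80.0),
    "WT": (40.0, 60.0),
    "dFF4dFF5": (25.0, 75.0),
}

for name, (xs, xl) in lanes.items():
    print(f"{name}: Bcl-xS fraction = {bclx_short_fraction(xs, xl):.2f}")
```

Expressing the result as a fraction of the summed isoform signal, rather than as a raw band intensity, makes lanes with different loading or amplification efficiency comparable.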

DISCUSSION
We have identified the sequences within the transcription- and splicing-related factor TCERG1 that target it to the periphery of splicing factor-rich nuclear speckles. These consist of two contiguous FF domains of TCERG1, FF4 and FF5 (amino acids 878-1022). Both domains are necessary, but neither is sufficient by itself to achieve complete accumulation within the speckle region. We showed previously that the FF5 domain is implicated in the localization of TCERG1 to nuclear speckles (21). We show here that FF5 is necessary but not sufficient to induce the accumulation of proteins in speckles; the adjacent FF4 domain is also required for proper localization to the speckled regions. Comparison of the TCERG1 speckle-targeting sequences with known speckle domain-targeting sequences (11-15, 17, 18) suggests that the TCERG1 speckle determinants are novel. This is consistent with the fact that the spatial distribution of TCERG1 differs somewhat from that of many splicing factors that show more diffuse nucleoplasmic staining than TCERG1. The FF4 and FF5 domains of TCERG1 thus represent a novel nuclear speckle-targeting signal: they are necessary for TCERG1 localization and sufficient to target a heterologous protein to speckles. To our knowledge, this targeting signal is the first sequence reported to direct proteins to the periphery of nuclear speckles, at the interface between speckles and nearby transcription sites.
The function of FF domains is less well understood than that of the better-characterized WW domain. The FF domain is a ~60-amino acid module that contains two strictly conserved phenylalanine residues, one near the N terminus and one near the C terminus. FF domains often occur in repeated arrays of four to six domains separated by linker sequences of variable length (41), and this organization is likely to be important for their biological function. Structural studies of the FF domains of TCERG1 show a flexibility that is consistent with a domain organization model in which the FF domains are multifunctional units that act as a scaffold to bind a diverse repertoire of molecules (42,43). The regulation of FF domain interactions is an intriguing issue. The TCERG1 FF domains could have individual specificities, or they could act in concert to achieve optimal recognition of binding partners. The interactions of TCERG1 with the phosphorylated carboxyl-terminal domain (CTD) of RNAPII (44) and with the splicing factor Tat-SF1 (29) via the FF domains suggest that each FF domain within a tandem repeat possesses independent binding activity, conferring an ability to mediate many different protein-protein interactions through relatively weak contacts. The possibility that each FF domain within a repeated structure mediates distinct ligand recognition was initially supported by our own data implicating the TCERG1 FF5 domain as the critical FF domain for TCERG1 localization to the periphery of speckles (21). Here, we have presented NMR data that challenge the current view of FF domains acting as separate entities and have provided evidence that FF domains can organize in pairs to optimally achieve a relevant biological function, such as targeting to speckles. Although future investigations are necessary, our results provide a starting point for understanding the structural basis of speckle targeting by FF domains.
Speckles are enriched in pre-mRNA splicing factors (1,45). However, transcription and pre-mRNA splicing take place outside of the speckle compartment (7,8,46,47). We described previously that the majority of speckle-associated TCERG1 is found along the periphery of nuclear speckles (21), where newly synthesized RNA is also found (7). The distribution of TCERG1 is consistent with a potential role of this factor in linking transcription with the splicing machinery. Indeed, many of the factors that interact with TCERG1, such as SF1 and U2AF65, are also present in speckles. An interesting hypothesis is that TCERG1 could recruit processing components from speckles to the nearby transcription sites. If so, interfering with the localization of TCERG1 would probably disrupt some of these interactions. This is suggested by our experiments showing that the signals that determine TCERG1 nuclear speckle localization are important for its alternative splicing function (Fig. 6). Real-time mRNA biogenesis studies will address the question of whether TCERG1 shuttles between speckles and speckle-associated transcription sites.
TCERG1 undergoes posttranslational modification by SUMOylation (27). Computer algorithms aimed at locating putative SUMO acceptor residues predict two potential SUMOylation sites within the FF4/FF5 domains. Because SUMOylation influences the targeting of proteins to different cell compartments, we investigated whether putative SUMOylation of sequences within the FF4/FF5 domains affects their targeting properties. Our results show that mutation of the putative SUMO acceptor lysine residues within FF4/FF5 generates a protein that localizes to the nucleus similarly to the wild-type protein, ruling out a role for SUMO modification in the spatial distribution of this protein in the cell.⁴ TCERG1 can also be modified by phosphorylation (48,49). Computational predictions of phosphorylation sites using the NetPhos 2.0 server show that many serine and threonine residues within the FF4/FF5 domains are potential phosphorylation sites. Some of these sites are located in the loops of FF5. This is reminiscent of the phosphorylation status of SR proteins. The RS domain of SR proteins is extensively phosphorylated on serine residues, and this controls the subnuclear localization and activity of SR proteins (50). Similarly, phosphorylation events might regulate the nuclear localization and functionality of TCERG1. Moreover, TCERG1 interacts directly with phosphorylated RNAPII CTD via its FF repeats (44), and interactions between SR-related proteins and RNA polymerase II CTD have also been reported (52).
Although no classical RRM has been identified in TCERG1, a previous study (53) described the presence of a putative RGG box in the N terminus of the Chironomus tentans homologue of TCERG1, which is also found in its human counterpart. The RGG box, a protein motif present in one class of RNA-binding proteins involved in various aspects of RNA processing (54), is characterized by the presence of an RGG triplet and is rich in arginines and glycines. In addition, TCERG1 has highly basic FF domains, raising the possibility that some of these, such as FF1-3, FF5, and FF6, may interact with RNA. In fact, we observed nucleic acid binding activity with recombinant FF1 and FF2 proteins.⁵ On the basis of these observations, an exciting possibility remains that TCERG1 behaves as an SR protein in the coupling between transcription and splicing. Strikingly, the Drosophila homologue of human TCERG1 was identified as an RS domain-containing protein in a genome-wide survey of RS domain proteins (55).
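The sequence-motif reasoning used in this discussion (putative SUMO acceptor lysines, RGG triplets) can be illustrated with a minimal pattern scan. This is only a sketch under stated assumptions: it is not the prediction algorithm used in this study, which is not named in this section, and it is not NetPhos. The SUMO pattern is the widely cited ψ-K-x-[E/D] consensus, with ψ approximated here as V, I, L, M, or F; the function names and the toy sequence are hypothetical, and the toy sequence is not TCERG1.

```python
import re

# Assumption: canonical SUMO consensus psi-K-x-[E/D], with psi approximated as
# V/I/L/M/F. Lookaheads are used so that overlapping matches are all reported.
SUMO_CONSENSUS = re.compile(r"(?=([VILMF]K[A-Z][ED]))")
RGG_TRIPLET = re.compile(r"(?=(RGG))")


def find_sumo_lysines(seq: str) -> list[int]:
    """Return 1-based positions of the lysine in each consensus match."""
    # m.start() is the 0-based index of psi; the lysine is the next residue.
    return [m.start() + 2 for m in SUMO_CONSENSUS.finditer(seq.upper())]


def find_rgg_triplets(seq: str) -> list[int]:
    """Return 1-based start positions of every RGG triplet."""
    return [m.start() + 1 for m in RGG_TRIPLET.finditer(seq.upper())]


# Made-up toy sequence for illustration only (not a TCERG1 sequence).
TOY_SEQ = "MAVKQEGRGGFRGGSLKTDAA"

print("putative SUMO acceptor lysines:", find_sumo_lysines(TOY_SEQ))
print("RGG triplets start at:", find_rgg_triplets(TOY_SEQ))
```

A consensus match only flags a candidate residue; as with the lysine mutants described above, any functional role has to be tested experimentally.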