Identification of Sam68 arginine glycine-rich sequences capable of conferring nonspecific RNA binding to the GSG domain.

Sam68 is an RNA-binding protein that contains a heterogeneous nuclear ribonucleoprotein K homology (KH) domain embedded in a larger RNA binding domain called the GSG (GRP33, Sam68, GLD-1) domain. This family of proteins is often referred to as the STAR (signal transduction and activators of RNA metabolism) proteins. It is not known whether Sam68 is a general nonspecific RNA-binding protein or whether it recognizes specific response elements in mRNAs with high affinity. Sam68 has been shown to bind homopolymeric RNA and a synthetic RNA sequence called G8-5 that has a core UAAA motif. Here we performed a structure-function analysis of Sam68 and identified two arginine glycine (RG)-rich regions that confer nonspecific RNA binding to the Sam68 GSG domain. In addition, by using chimeric proteins between Sam68 and QKI-7, we demonstrated that one of the Sam68 RG-rich sequences of 26 amino acids was sufficient to confer homopolymeric RNA binding to the GSG domain of QKI-7, another STAR protein. Furthermore, that minimal sequence can also give QKI-7 the ability (like Sam68) to functionally substitute for HIV-1 REV to facilitate the nuclear export of RNAs. Our studies suggest that neighboring RG-rich sequences may impose nonspecific RNA binding to GSG domains. Because the Sam68 RNA binding activity is negatively regulated by tyrosine phosphorylation, our data lead us to propose that Sam68 might be a specific RNA-binding protein when tyrosine phosphorylated.

The GSG domain is an evolutionarily conserved protein module initially identified by aligning the first three members of this family (17,18). In addition to the KH domain, the GSG domain contains ~75 amino acids N-terminal and ~25 amino acids C-terminal to the KH domain called the NK (N-terminal of KH) and CK (C-terminal of KH) regions, respectively (schematically represented in Fig. 1). Several properties have been ascribed to the GSG domain including RNA binding (10,12,18-21), self-association (10,18,19,22), heterodimerization (10,20,23), and protein localization (13).
The physiological importance of the GSG domain is demonstrated by the fact that many genetic mutations that result in growth or developmental defects have been identified in this protein module. In the nematode Caenorhabditis elegans, the GSG protein GLD-1 functions as a tumor suppressor that is required for normal oocyte development (35,36). Thirty-two gld-1 mutations have been identified that fall into six phenotypic classes (17). In mice, a missense mutation in the quaking gene (qk) has been identified (24) that is known to be embryonic-lethal (37). This mutation, altering glutamic acid 48 to glycine (24), occurs in the NK region of the GSG domain and has been shown to prevent QKI dimerization (19). In Drosophila melanogaster, HOW plays a critical role in skeletal muscle development, because weak alleles result in the "held-out-wings" phenotype (33,34).
The phenotype of the quaking viable and lethal mice suggests that the QKI proteins are involved in myelination and

* This work was supported in part by Medical Research Council of Canada Grant MT13377. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
‡ Both authors contributed equally to this work. § Supported by a post-doctoral fellowship from the Canadian Institutes of Health Research.
¶ Supported by a fellowship from Consejo Nacional de Ciencia y Tecnología.
Previously we have shown that the Sam68 GSG domain, together with 50 amino acids at its C terminus, is necessary and sufficient for RNA binding (10). To examine the role of the C-terminal amino acids in RNA binding, we generated chimeric proteins between two STAR proteins, Sam68 and QKI. Because the primary amino acid sequences of the QKI-5, -6, and -7 isoforms are identical except for the last 6-30 amino acids (depending on the isoform) and are predicted to have identical RNA binding specificity, we chose QKI-7 for our analysis. Here we identified two small regions harboring arginine-glycine repeats in Sam68 that can confer nonspecific RNA binding activity to an adjacent GSG domain. In addition, a novel Sam68 dimerization region has been identified in its C-terminal sequences.

MATERIALS AND METHODS
DNA Constructions-The constructs encoding Myc-QKI-7, Myc-QKI-7:E48G, Myc-Sam68, and Myc-Sam68Δ1-67 (now renamed SΔN) were described previously (5,10,19). Myc-S-Q encodes a chimeric protein containing the N-terminal region of Sam68 and the C-terminal region of QKI-7. The DNA sequence encoding the C-terminal 180 amino acids of Myc-Sam68Δ1-67 was removed by restriction endonuclease digestion using EcoRV and KpnI and replaced with a sequence encoding the C-terminal 136 amino acids of QKI-7. The QKI-7 fragment was generated with polymerase chain reaction (PCR) using Myc-QKI-7 as the DNA template and the universal reverse primer and 5′-GAC GAT ATC AAG AAG ATG CAG CTG ATG-3′ (the EcoRV site is underlined) as oligonucleotides. The chimeric protein Myc-Q-S contains the N-terminal region of QKI-7 and the C-terminal region of Sam68. The plasmid expressing this protein was constructed as follows: a DNA fragment encoding the C-terminal 188 amino acids of Sam68 was generated with PCR using Myc-Sam68 as the DNA template, 5′-CCT GGT ACC AGA TAT GAT GGA T-3′ as the forward primer, and universal reverse primer as the reverse primer. The fragment was digested with KpnI (the restriction site is underlined) and subcloned into Myc-QKI-7, the C-terminal 145 amino acids of which had been removed with KpnI digestion. Myc-Q-S:E→G was constructed using the same strategy as that of Myc-Q-S except that Myc-QKI-7:E48G, instead of Myc-QKI-7, was used to subclone the Sam68 DNA fragment. The construct encoding Myc-Q-S:A→N was generated by inverse PCR using Myc-Q-S as the DNA template and 5′-ATT CAA CTT GAA GCA GAA ACG GGA-3′ and 5′-ATT TGT AAG TCC TCT AGG TCC AAG-3′ as primers (underlined nucleotides denote changes introduced). The Myc-Q-S and Myc-S-Q C-terminal deletion constructs, Myc-Q-S330, Myc-Q-S294, Myc-S-Q284, and Myc-S-Q205, were generated with PCR using Myc-Q-S and Myc-S-Q as the DNA templates, respectively.
The T7 promoter primer was used as the forward primer and the following oligonucleotides were used as reverse primers: 5′-AGG AAT TCA TGG CAC CCC TCG AGT CAC A-3′ (for Myc-Q-S330), 5′-TAG AAT TCA GGC AGC TCC TCG TCC TCT CAC-3′ (for Myc-Q-S294), 5′-AGG GAA TTC AGA TTA ACC CAG CTT CAG GCC-3′ (for Myc-S-Q284), and 5′-ATG GAA TTC TAT CTG TAG GTG CCA TTC AG-3′ (for Myc-S-Q205). The amplified DNA fragments were digested with EcoRI and subcloned into Myc-Bluescript KS(+) (5,10).
The plasmids encoding Myc-Q(GSG)-S, Myc-Q(GSG)-SΔ4RG, and Myc-Q(GSG)-SΔ6RG were constructed by a two-step subcloning strategy. DNA fragments encoding the C-terminal regions of Sam68 were amplified by PCR with universal reverse primer and 5′-GTG GTC GAC GGG TAT CTG TGA GAG GAC-3′ for Myc-Q(GSG)-S, 5′-GGA GTC GAC CCT CCT CCT CCA CCT GT-3′ for Myc-Q(GSG)-SΔ4RG, or 5′-GTG GTC GAC CAC CTA GAG GAG CTT-3′ for Myc-Q(GSG)-SΔ6RG as primers. The fragments were digested with SalI (the restriction site is underlined) and KpnI (the site is in the vector) and inserted in the same sites of pBluescript KS(+). The resulting plasmids were then used to subclone the DNA fragments encoding Myc-tagged N-terminal regions of QKI-7. The Myc-QKI-7 fragments were generated by PCR using Myc-QKI-7 as template, T7 promoter primer as the forward primer, and the following oligonucleotides as reverse primers: 5′-CTG GTC GAC TAA TGT TGG CGT CTC TGT-3′ for Myc-Q(GSG)-S, 5′-AGT GTC GAC AAG AGA AAA GGC AAG GGC-3′ for Myc-Q(GSG)-SΔ4RG, and 5′-CAG GTC GAC GCC CAG TGA TGA TCC TTG-3′ for Myc-Q(GSG)-SΔ6RG. These fragments were digested with BamHI (the site is in the vector) and SalI (the site is underlined) and subcloned in the corresponding Sam68-pBluescript plasmids described above. The construct encoding Myc-Q(GSG)-SΔ11RG was also generated in two steps. A PCR fragment encoding the Myc-epitope tag and the N-terminal 256 amino acids of QKI-7 was first subcloned in the BamHI and SalI sites of pBluescript KS(+), and the XhoI fragment (C-terminal 112 amino acids) of Sam68 was then inserted in the SalI site of the resulting plasmid. To generate the Myc-QKI:1-256 fragment, Myc-QKI-7 was used as DNA template and T7 promoter primer and 5′-GTA GTC GAC TGA TCA AAG GCA TTA-3′ as primers (the SalI site is underlined). Myc-QKI-5RG is a chimeric protein in which a Sam68 sequence harboring five RG repeats is introduced in QKI-7.
The plasmid encoding this protein was generated as follows: a PCR fragment encoding the C-terminal 65 amino acids of QKI-7 was first inserted in the HindIII and KpnI sites of pBluescript KS(+), and a second PCR fragment containing coding sequences for the Myc tag, QKI-7 amino acids 1-231, and Sam68 amino acids 308-333 was then subcloned in the BamHI and HindIII sites of the resulting plasmid. The first PCR fragment was generated using Myc-QKI-7 as template and universal reverse primer and 5′-AGA AAG CTT TCA TGC CAA ACG GAA CTC-3′ as primers. The second PCR fragment was generated using Myc-Q(GSG)-SΔ6RG as template and T7 promoter primer and 5′-GTG AAG CTT GCA CCC CTC GAG TCA CAG-3′ as primers (the HindIII site is underlined).
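The restriction sites engineered into the primers listed above can also be located programmatically, which helps when underlining is not preserved in a plain-text copy. The following is a minimal Python sketch; the names `RECOGNITION_SITES` and `find_site` are illustrative and not part of the original methods, and the recognition sequences are the standard ones for each enzyme.

```python
# Standard recognition sequences for the enzymes named in the text.
RECOGNITION_SITES = {
    "EcoRV": "GATATC",
    "KpnI": "GGTACC",
    "EcoRI": "GAATTC",
    "SalI": "GTCGAC",
    "HindIII": "AAGCTT",
}


def find_site(primer: str, enzyme: str) -> int:
    """Return the 0-based position of the enzyme site in a primer, or -1 if absent."""
    seq = primer.replace(" ", "").upper()
    return seq.find(RECOGNITION_SITES[enzyme])


# A few primers copied from the text, paired with the enzyme named for each.
PRIMERS = [
    ("EcoRV", "GAC GAT ATC AAG AAG ATG CAG CTG ATG"),
    ("KpnI", "CCT GGT ACC AGA TAT GAT GGA T"),
    ("EcoRI", "AGG AAT TCA TGG CAC CCC TCG AGT CAC A"),
    ("SalI", "GTG GTC GAC GGG TAT CTG TGA GAG GAC"),
    ("HindIII", "AGA AAG CTT TCA TGC CAA ACG GAA CTC"),
]

for enzyme, primer in PRIMERS:
    print(enzyme, find_site(primer, enzyme))
```

In each of these primers the site sits a few nucleotides from the 5′ end, leaving a short overhang upstream of the recognition sequence.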
The construct encoding the G8-5 RNA sequence (12), 5′-GGG UGA CAC ACU AGC UAU AGC AUU AAA AGA CCG AGC AAG U-3′ (the UAAA motif is underlined), was generated by annealing two oligonucleotides (5′-GCC GAA TTC GGG TGA CAC ACT AGC TAT AGC ATT A-3′ and 5′-AGC TCT AGA CTT GCT CGG TCT TTT AAT GCT ATA GCT-3′) and filling in the ends with the Klenow fragment of DNA polymerase I. This DNA fragment was digested with EcoRI and XbaI (the restriction sites are underlined) and subcloned into pBluescript SK(+). The plasmids encoding the tra-2 3′ UTR and a deletion mutant of tra-2 3′ UTR, (−108) 3′ UTR, were kindly provided by Tim Schedl (Washington University).
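The anneal-and-fill-in step for the G8-5 template can be checked in silico. The sketch below, with illustrative function names that are not from the original methods, finds the 3′ overlap between the two oligonucleotides, builds the filled-in top strand, and confirms that it contains the G8-5 sequence flanked by the EcoRI and XbaI sites.

```python
def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def fill_in(top: str, bottom: str):
    """Anneal two oligonucleotides by their 3' ends; return (top strand, overlap length)."""
    bottom_rc = reverse_complement(bottom)
    # Search for the longest suffix of `top` that is a prefix of the bottom strand's complement.
    for k in range(min(len(top), len(bottom_rc)), 0, -1):
        if top.endswith(bottom_rc[:k]):
            return top + bottom_rc[k:], k
    raise ValueError("oligonucleotides do not overlap")


# Sequences copied from the text (spaces removed).
OLIGO_1 = "GCC GAA TTC GGG TGA CAC ACT AGC TAT AGC ATT A".replace(" ", "")
OLIGO_2 = "AGC TCT AGA CTT GCT CGG TCT TTT AAT GCT ATA GCT".replace(" ", "")
G8_5_RNA = "GGG UGA CAC ACU AGC UAU AGC AUU AAA AGA CCG AGC AAG U".replace(" ", "")

product, overlap = fill_in(OLIGO_1, OLIGO_2)
g8_5_dna = G8_5_RNA.replace("U", "T")

print("overlap (nt):", overlap)
print("product:", product)
print("contains G8-5:", g8_5_dna in product)
print("UAAA motif at 0-based RNA position:", G8_5_RNA.find("UAAA"))
```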
The identities of all the above plasmid constructs were verified by dideoxynucleotide sequencing.
Protein Expression and Analysis-Proteins were expressed in HeLa cells, using the vaccinia virus T7 expression system as described previously (5). HeLa cells were lysed in lysis buffer (1% Triton X-100, 150 mM NaCl, 20 mM Tris-HCl (pH 8.0), 50 mM NaF, 100 mM sodium vanadate, 0.01% phenylmethanesulfonyl fluoride, 1 μg of aprotinin/ml, and 1 μg of leupeptin/ml), and the cellular debris and nuclei were removed by centrifugation. For immunoprecipitation, the supernatant was incubated on ice with the specified antibody for 1 h. Then 20 μl of a 50% protein A-Sepharose slurry was added and incubated at 4°C for 30 min with constant end-over-end mixing. The beads were washed twice with lysis buffer and once with PBS. Protein samples were analyzed on SDS-polyacrylamide gels and transferred to nitrocellulose membranes. Immunoblotting was performed using the anti-Myc (9E10), anti-hemagglutinin (HA), or anti-p59fyn antibodies. The rabbit anti-p59fyn antibody was kindly provided by André Veillette (Institut de Recherche Clinique de Montréal, Université de Montréal, McGill University). The designated primary antibody was followed by goat anti-mouse or goat anti-rabbit antibodies conjugated to horseradish peroxidase (ICN), and chemiluminescence was used for protein detection (DuPont).
In Vitro Transcription-32P-labeled G8-5 RNA and tra-2 3′-UTR RNA were transcribed in vitro with the T7 RNA polymerase following the protocols recommended by the manufacturer (Promega). After in vitro transcription, the template DNA was digested with DNase I (Promega), and the RNA was extracted with phenol-chloroform, precipitated with ethanol, and resuspended in diethyl pyrocarbonate-treated water at a concentration of 10^6 cpm/μl.
RNA Binding and Functional Assays-For the poly(U) binding assay, Myc-tagged proteins expressed in HeLa cells were incubated at 4°C for 30 min with poly(U)-Sepharose beads or control Sepharose beads in lysis buffer supplemented with 2 mg/ml heparin. The bound proteins were separated by SDS-PAGE, transferred to nitrocellulose, and immunoblotted with anti-Myc antibodies. For G8-5 or tra-2 3′-UTR RNA binding, Myc-tagged proteins expressed in HeLa cells were immunoprecipitated with an anti-Myc antibody or mouse IgG (control), and the immunoprecipitates were incubated at 4°C for 30 min with 1 μl (10^6 cpm) of 32P-labeled RNA in lysis buffer supplemented with 2 mg/ml heparin. The beads were washed twice with lysis buffer and once with PBS, and the bound radioactivity was quantitated by scintillation counting. To verify the identity of the radiolabeled RNA bound to beads, the bound RNA was eluted with sample buffer and analyzed with nondenaturing polyacrylamide gel electrophoresis and autoradiography. To verify protein expression, the immunoprecipitates were analyzed by immunoblotting with anti-Myc antibody. REV assays were performed as described previously (7).

RESULTS
STAR proteins contain a GSG domain, which is a tripartite protein module containing from N to C terminus the NK region, the KH domain, and the CK region (Fig. 1B). The Sam68 KH domain is necessary for RNA binding because its deletion prevents RNA binding (10,12,30). To investigate whether the RNA binding specificity of the STAR proteins resides only in the KH domain or whether neighboring regions can regulate RNA binding, we constructed chimeric proteins between QKI-7 and Sam68. These two proteins show a high degree of homology in their GSG domain (Fig. 1A) but possess distinct RNA binding specificities. Sam68 has been shown to bind homopolymeric RNA poly(U) and poly(A) (2,10) as well as a synthetic RNA (G8-5) amplified by using systematic evolution of ligands by exponential enrichment (12). The QKI proteins bind the C. elegans GLD-1 target, tra-2 (41), but not homopolymeric RNA (10). Chimeric proteins were generated between Sam68 and QKI-7 in such a way that the CK region and the C terminus of one protein were replaced by the corresponding region of the other protein (Fig. 1B). Q-S (QKI-7-Sam68 chimeric protein) contains the NK region and the KH domain of QKI-7 and the CK region and the C terminus of Sam68 (Fig. 1B). S-Q (Sam68-QKI-7 chimeric protein) contains the Sam68 N-terminal portion, the NK region and the Sam68 KH domain, and the QKI-7 C terminus including the CK region.
The C-terminal Portion of Sam68 Confers Poly(U) Binding to QKI-7-To examine the RNA binding specificity of these chimeric proteins, Myc-tagged Q-S and S-Q were expressed in HeLa cells and tested for their ability to bind poly(U)-Sepharose. Q-S bound poly(U)-Sepharose, whereas S-Q did not (Fig. 2A, lanes 9 and 12, upper panel). This difference in binding was also observed if the Q-S and S-Q proteins were incubated together with the same poly(U)-Sepharose beads, hence eliminating the possibility of a recovery problem (Fig. 2A, lanes 22-24). As positive and negative controls, respectively, Sam68Δ1-67 (herein renamed SΔN, Fig. 1B) bound poly(U) and QKI-7 did not (Fig. 2A, upper panel, lanes 6 and 3, respectively). We used SΔN in this assay because it has been shown to bind poly(U) as well as full-length Sam68, and its poly(U) binding activity is regulated by p59fyn (11). We have shown previously that substituting glycine for glutamic acid 48 in the QKI-7 NK region prevents QKI-7 dimerization (19). To examine whether dimerization via the QKI-7 NK domain was required for Q-S to bind poly(U)-Sepharose, we introduced the E48G amino acid substitution in Q-S. Q-S:E→G was expressed in HeLa cells and tested for its ability to bind poly(U)-Sepharose. The poly(U) binding activity of Q-S:E→G was similar to that observed for Q-S (Fig. 2A, lane 15, upper panel), suggesting that self-association via the predicted coiled-coil region in the QKI-7 NK region was not required for poly(U) binding. Because we have shown previously that Sam68 devoid of its KH domain does not bind poly(U) (10), these findings demonstrated that sequences neighboring the KH domain cannot bind poly(U) per se. Thus the ability of Q-S to bind poly(U) suggested that the Sam68 sequences conferred to the QKI-7 KH domain the ability to bind poly(U)-Sepharose.
To further delineate the Sam68 portion required to confer poly(U) binding to the Q-S chimeric protein, we performed C-terminal deletions resulting in Q-S294 and Q-S330 (Fig. 1B). We have shown previously that in the context of the SΔN protein, 50 amino acids C-terminal to the GSG domain are required for RNA binding such that Sam68:330 bound RNA and Sam68:294 did not (10). We examined whether the Q-S chimeric protein truncated at Sam68 amino acids 294 and 330 bound poly(U)-Sepharose. The chimeric protein Q-S330 bound poly(U)-Sepharose and Q-S294 did not (Fig. 2A, lanes 18 and 21, upper panel). These results demonstrated that Q-S, interestingly, was behaving like Sam68. Thus the minimal Sam68 sequence required to confer poly(U) RNA binding to QKI-7 resided from Sam68 amino acids 256-330, a region harboring the CK region as well as an additional 50 amino acids.
The poly(U) binding activity of SΔN is negatively regulated by p59fyn (11). Thus we examined the effect of p59fyn on the poly(U) binding activity of the chimeric proteins. The chimeric proteins were co-expressed with p59fyn in HeLa cells, and their poly(U) binding activity was examined. The poly(U) binding activity of Q-S and Q-S:E→G was severely impaired with the co-expression of p59fyn (Fig. 2A, lanes 12 and 15, lower panel).
These results indicated that the poly(U) binding activity of Q-S and Q-S:E→G was regulated by p59fyn and further demonstrated that Q-S behaved like Sam68. The poly(U) binding of Q-S330 was not affected by p59fyn (Fig. 2A, lane 18, lower panel). This finding was expected, because Q-S330 does not contain the phosphorylation sites for p59fyn that reside in the C terminus of Sam68 (5). The RNA binding activity of SΔN was inhibited by the expression of p59fyn and served as a positive control for the assay (Fig. 2A, lane 6, lower panel). The expression of p59fyn was confirmed by immunoblotting an aliquot of total cell lysate corresponding to Fig. 2A with anti-p59fyn antibodies (Fig. 2B). Because the C terminus of Sam68 harbors a regulatory domain that can abrogate RNA binding when phosphorylated by p59fyn on tyrosine residues (11), it was conceivable that the C-terminal region of QKI-7 in S-Q may be inhibiting the ability of the Sam68 KH domain to bind poly(U). To eliminate this possibility, we made C-terminal deletions in S-Q (Fig. 1B, S-Q:284, and S-Q:205) and measured their ability to interact with poly(U)-Sepharose. None of these chimeric proteins bound poly(U)-Sepharose (Fig. 2C: individually, lanes 1-12; mixed, lanes 13-15), demonstrating that the C-terminal region of QKI-7 does not harbor a sequence that inhibits RNA binding. An isoleucine to asparagine substitution in the second KH domain of the fragile X mental retardation gene product (FMRP) is sufficient to severely impair RNA binding (42). The equivalent amino acid substitution in QKI-7, alanine 110 to asparagine (A→N), was introduced in Q-S to examine the contribution of the QKI-7 KH domain in the poly(U) binding observed with Q-S. Q-S:A→N had impaired poly(U) binding compared with wild-type Q-S (Fig. 2D), suggesting that the QKI-7 KH domain is required for Q-S poly(U) binding.
These findings suggested that the C-terminal sequences of Sam68 are able to confer a new RNA binding activity to the QKI-7 KH domain.
The Sam68 C-terminal Region Confers G8-5 RNA Binding-The physiological RNA targets for Sam68 are unknown, but a degenerate RNA sequence containing a UAAA motif called G8-5 has been identified by systematic evolution of ligands by exponential enrichment that binds Sam68 with high affinity (12). Myc-Sam68, -QKI-7, -S-Q, and -Q-S expressed in HeLa cells were immunoprecipitated with anti-Myc antibodies or control mouse IgG, and the immunoprecipitates were incubated with in vitro transcribed 32P-labeled G8-5 RNA. The immunoprecipitates were washed, and the amount of bound RNA was quantitated and expressed as counts per minute (Fig. 3A). The radioactivity bound by Sam68 and Q-S anti-Myc immunoprecipitates was 15-20 times higher than control immunoprecipitates, whereas there was only a 2-3-fold difference between anti-Myc and control immunoprecipitates of QKI-7 and S-Q (Fig. 3A). The bound RNAs were analyzed with nondenaturing polyacrylamide electrophoresis and visualized by autoradiography to verify that the radioactivity correlated with 32P-labeled G8-5. The G8-5 RNA was observed in Sam68 and Q-S Myc immunoprecipitates (Fig. 3B), confirming that Sam68 and Q-S bound G8-5. The absence of G8-5 binding with QKI-7 and S-Q was not caused by a lower expression of these proteins, because anti-Myc immunoblotting of the immunoprecipitates showed comparable expression of the Myc-tagged proteins (Fig. 3C). These data are consistent with the poly(U) binding results shown in Fig. 2, confirming that the Q-S chimeric protein has an RNA binding specificity similar to Sam68.
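The fold differences quoted above compare bound counts per minute in anti-Myc immunoprecipitates with those in control IgG immunoprecipitates. A minimal sketch of that calculation follows; the function names are illustrative, and the cpm values are hypothetical placeholders, not data from this study.

```python
from statistics import mean, stdev


def summarize(cpm_values):
    """Return (mean, sample standard deviation) for replicate cpm measurements."""
    return mean(cpm_values), stdev(cpm_values)


def fold_over_control(ip_cpm, control_cpm):
    """Ratio of mean bound cpm in the specific IP to mean bound cpm in the control IP."""
    return mean(ip_cpm) / mean(control_cpm)


# Hypothetical replicate values, for illustration only.
anti_myc = [15000, 20000]
control_igg = [1000, 1000]

avg, sd = summarize(anti_myc)
print(f"anti-Myc: {avg:.0f} +/- {sd:.0f} cpm")
print(f"fold over control IgG: {fold_over_control(anti_myc, control_igg):.1f}")
```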
G8-5 RNA Binding Activity of Sam68 and Q-S Is Regulated by p59fyn-We investigated whether the ability of Sam68 and Q-S to bind G8-5 was regulated by p59fyn. Although G8-5 is a known high affinity RNA target for Sam68, it is not known

FIG. 2. The C-terminal domain of Sam68 confers poly(U) binding to QKI-7.
A, cDNAs expressing Myc-epitope-tagged proteins were transfected alone (upper panel, −fyn) or co-transfected with p59fyn (lower panel, +fyn) in HeLa cells. The cells were lysed, and an aliquot of lysates was kept for total cell lysate (TCL) or incubated with control Sepharose (C) or poly(U)-Sepharose beads (pU) in the presence of 2 mg/ml heparin. The bound proteins were separated by SDS-PAGE, transferred to nitrocellulose membranes, and immunoblotted with anti-Myc antibodies. As an internal recovery control, separately expressed Q-S and S-Q proteins were mixed together with the same poly(U)-Sepharose beads, and the binding reaction was carried out as before (lanes 22-24). B, total cell lysates corresponding to those of A were immunoblotted with anti-p59fyn antibodies to verify the expression of p59fyn. C, S-Q205 and S-Q284 C-terminal truncation mutant proteins were tested for poly(U) binding as described above. Again, the poly(U)-Sepharose beads binding assay was also carried out with a mixture of all four proteins in the same reaction (lanes 13-15). D, an amino acid substitution in the QKI-7 KH domain, A110N, was introduced in Q-S to examine the contribution of the QKI-7 KH domain in poly(U) binding.
whether G8-5 RNA binding is regulated by tyrosine phosphorylation. Myc-Sam68, -SΔN, and -Q-S were transfected in HeLa cells with or without p59fyn, the cells were lysed, and the lysates were immunoprecipitated with anti-Myc or control IgG antibodies. The immunoprecipitated proteins were subsequently incubated with 32P-labeled G8-5 RNA, and the beads were washed and counted in a scintillation counter, with the bound radioactivity expressed in counts per minute. The G8-5 binding activity of full-length Sam68, SΔN, and Q-S was inhibited by co-expression of p59fyn (Fig. 4A). The expression of p59fyn nearly abolished G8-5 binding to SΔN, whereas G8-5 binding to Sam68 and Q-S was reduced by ~50% (Fig. 4A). These findings suggested that the Sam68 N-terminal 67 amino acids regulate the ability of Src kinases to abrogate RNA binding. Equivalent Myc and p59fyn expression were observed in the different samples (Fig. 4, B and C). These data further demonstrated that Q-S behaved like Sam68.
The RG Repeats in Sam68 Are Necessary for Poly(U) Binding-The minimal region of Sam68 required to confer poly(U) RNA binding to QKI-7 resided in Sam68 amino acids 256-330, which included the CK region (Fig. 1B, Q-S330). To verify whether the Sam68 sequence harboring the CK region or the RG repeats within the additional 50 amino acids was responsible for the new specificity of the Q-S chimera, a new chimeric protein was constructed that extended the QKI-7 sequences to include its CK region. This chimeric protein named Q(GSG)-S, which now contained the entire QKI-7 GSG domain, was tested for its ability to bind poly(U)-Sepharose. Q(GSG)-S retained the ability to bind poly(U) to the same extent as SΔN or Q-S (Fig. 5A). Thus the sequences C-terminal of the Sam68 GSG domain, and not the Sam68 CK region, were capable of conferring poly(U) binding to the QKI-7 GSG domain. The minimal region essential to confer poly(U) binding specificity to QKI-7 was mapped by engineering chimeric proteins where the junction between QKI-7 and Sam68 sequences was gradually displaced toward the C terminus. The chimeric proteins were named according to the number of RG repeats that were deleted (Fig. 5A: Q(GSG)-SΔ4RG, Δ6RG, and Δ11RG). These chimeras were expressed in HeLa cells and tested for their ability to bind poly(U)-Sepharose. The deletion of four or six RG repeats had little or no effect on the ability of the QKI-7-Sam68 chimeras to bind poly(U)-Sepharose (Fig. 5A: Q(GSG)-SΔ4RG and Δ6RG). In contrast, a larger truncation deleting 11 RG repeats of the Sam68 sequence completely abolished poly(U) binding (Fig. 5A: Q(GSG)-SΔ11RG). The minimal Sam68 sequence capable of changing the RNA binding specificity of the QKI-7 GSG domain was located between amino acids 308-333, which harbors five RG repeats. This 26-amino acid sequence from Sam68 was

FIG. 3. The Sam68 C-terminal region confers G8-5 RNA binding.
A, HeLa cell lysates containing Myc-tagged Sam68, QKI-7, S-Q, or Q-S were immunoprecipitated with anti-Myc antibody (hatched bars) or control IgG (white bars) and then incubated with 32P-labeled G8-5 RNA in the presence of 2 mg/ml heparin. The beads were washed, and the radioactivity associated with them was quantitated. Each bar represents the mean ± standard deviation of data from more than six independent immunoprecipitations (IP) carried out during more than three separate experiments. B, RNAs eluted from the beads from A as well as the G8-5 RNA probe were analyzed with nondenaturing polyacrylamide gel electrophoresis and autoradiography. A typical representation is shown. C, after quantitation of radioactivity, the immunoprecipitates in A were analyzed by immunoblotting with anti-Myc antibodies to verify the expression of proteins.

FIG. 4. G8-5 RNA binding activity of Sam68 and Q-S is regulated by p59fyn.
A, Myc-tagged Sam68, SΔN, or Q-S were expressed alone or co-expressed with p59fyn in HeLa cells and immunoprecipitated with control mouse IgG (white bars) or anti-Myc antibodies (hatched bars). The immunoprecipitates were incubated with 32P-labeled G8-5 RNA, and the bound radioactivity was quantitated. Each bar represents the mean ± standard deviation of data from more than six independent immunoprecipitations carried out during more than three separate experiments. B, a typical representation of the expression of Myc-tagged proteins is shown. C, total HeLa cell lysates were analyzed with anti-p59fyn immunoblotting to verify the expression of p59fyn.

Sam68 Sequences Required for Nonspecific RNA Binding
introduced at a similar position in QKI-7. The plasmid expressing this QKI-5RG chimeric protein was transfected in HeLa cells and examined for its ability to bind poly(U)-Sepharose. The chimeric protein QKI-5RG bound poly(U)-Sepharose (Fig. 5A). These findings demonstrate that the Sam68 26 amino acids spanning amino acids 308-333 are sufficient for conferring homopolymeric RNA binding to the QKI-7 GSG domain. As a control for loading and recovery, a representative poly(U) binding reaction with Myc-tagged QKI-7 was reimmunoblotted using anti-Sam68 antibodies, confirming that endogenous Sam68 bound poly(U) and not the epitope-tagged Myc-QKI-7 (Fig. 5B, upper half).
Three of the five RG repeats in the SΔN sequences spanning amino acids 308-333 were replaced with alanine and serine residues, respectively. The SΔN RG→AS mutant proteins were expressed in HeLa cells, and their capacity to bind poly(U) was examined (Fig. 6). If the RG repeats participate in conferring homopolymeric RNA binding to the Sam68 GSG domain, their removal by amino acid substitution or deletion should prevent poly(U) binding. The substitution of the individual RGs at position 315, 320, or 325 had a comparable effect such that poly(U) binding was reduced by more than 50% (Fig. 6: SΔN: 315RG→AS, 320RG→AS, and 325RG→AS). When the double substitution of 315 and 325 or the deletion of amino acids 315-325 was performed, poly(U) binding was completely abrogated (Fig. 6: 315,325RG→AS and SΔNΔ3RG). These data demonstrate that the RG repeats at positions 315 and 325 are necessary for Sam68 poly(U) binding.
The deletion of amino acids 315-325 was performed next in the context of the full-length Sam68 protein, and its ability to bind poly(U)-Sepharose was examined (Fig. 6: Sam68Δ3RG). Sam68Δ3RG bound poly(U)-Sepharose with wild-type affinities unlike SΔN:Δ3RG (Fig. 6). These findings suggested that the N-terminal sequences of Sam68 function in a redundant manner with the residues located between 315 and 325 to confer poly(U) binding to the Sam68 GSG domain. The Sam68 N-terminal 67 amino acids (1,5) harbor several individual arginine residues and a motif that matches the consensus of an RGG box, a type of RNA binding motif (43). To identify the N-terminal sequence required to confer poly(U) binding to the Sam68 GSG domain, a series of mutant proteins was engineered in the Sam68Δ3RG "background." Sam68Δ3RG was chosen to eliminate any contribution from the C-terminal 315-325 amino acids. Alteration of arginines 10, 13, and 17 to alanines had a minor effect on poly(U) RNA binding (Fig. 6: R→A: 10,13,17). In contrast, substitution of arginines at position 43 (just outside the RGG box) and 45 (eliminating the first RGG repeat) to alanines was sufficient to severely impair the interaction with poly(U) RNA (Fig. 6: R→A: 43,45). Removing the second RGG sequence as well as the adjacent arginine had little or no effect on poly(U) binding (R→A: 52,56), suggesting that the first RGG sequence plays a major role in conferring poly(U) specificity to the Sam68 GSG domain.
The RG Repeats Can Confer QKI-7 the Ability to Substitute for REV in an HIV RNA Export Assay-The cellular role of Sam68 is still unknown, but it has been proposed to be the cellular homologue of REV in transporting HIV RNA (14). We examined whether the QKI-Sam68 chimeras could functionally substitute for REV in mediating REV response element (RRE)-directed reporter gene expression. COS-7 cells were transfected with an RRE-chloramphenicol acetyltransferase (CAT) reporter plasmid in the presence of expression vectors encoding REV, SΔN, QKI-7, or the chimeric proteins. The presence of REV or SΔN induced an approximate 8-fold increase in CAT activity that was not observed in cells co-transfected with vector alone (pcDNA) or a mutant of REV (M10) that is RNA binding-defective (Fig. 7). The transfection of Q-S induced an approximate 6-fold increase in CAT activity consistent with our data that Q-S binds poly(U)-Sepharose and G8-5. QKI-7 and the S-Q chimeric protein had no significant activity on the RRE-CAT reporter plasmid (Fig. 7). The deletion of the three RG repeats in the SΔN protein (SΔN:Δ3RG) abolished its REV-like activity, and the addition of Sam68 amino acids 308-333 in QKI-7 (QKI-5RG) conferred REV-like activity to QKI-7. These results demonstrate that the RG-rich sequences are necessary for poly(U) and G8-5 binding and for Sam68 to functionally substitute for REV in the transport of RNAs.
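According to the legend of Fig. 7, CAT activity was normalized to the activity of a co-transfected β-galactosidase vector before fold increases were reported. A minimal sketch of that normalization is given below; the function names and all numeric values are hypothetical and serve only to illustrate the arithmetic.

```python
def normalized_cat(cat_activity: float, bgal_activity: float) -> float:
    """CAT activity corrected for transfection efficiency via beta-galactosidase."""
    return cat_activity / bgal_activity


def fold_activation(sample, vector_control) -> float:
    """Fold increase of a sample over the vector-alone control.

    Each argument is a (CAT activity, beta-galactosidase activity) pair.
    """
    return normalized_cat(*sample) / normalized_cat(*vector_control)


# Hypothetical readings in arbitrary units, for illustration only.
vector_alone = (100.0, 2.0)
effector = (800.0, 2.0)

print(f"fold activation: {fold_activation(effector, vector_alone):.1f}")
```

Dividing by the β-galactosidase signal first means that a well with twice the transfection efficiency, and therefore twice the raw CAT signal, yields the same fold activation.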
The C-terminal Portion of Sam68 Harbors a Region That Mediates Self-association-STAR proteins have been shown to self-associate into oligomers (10). We have shown that the Sam68 KH domain and the QKI-7 NK region are required for self-association (10,19). The introduction of the E48G mutation in the NK region abolishes QKI-7 self-association (19). To investigate the ability of the chimeric proteins to associate with either Sam68 or QKI-7, Myc-Sam68, -QKI-7, -S-Q, -Q-S, -Q-S:E→G, and -Q-S330 were individually co-expressed with either HA-tagged Sam68 or QKI-7 in HeLa cells. The transfected cells were lysed, and the cell lysates were immunoprecipitated with anti-Myc antibody or control mouse IgG. The immunoprecipitates as well as an aliquot of the corresponding total cell lysates were separated by SDS-PAGE, transferred to nitrocellulose, and immunoblotted with anti-HA antibodies. The presence of HA-Sam68 or HA-QKI-7 in anti-Myc immunoprecipitates indicated an association of Myc-tagged proteins with Sam68 or QKI-7. Sam68 and QKI-7 self-associated (Fig. 8, lanes 3 and 12) and did not associate with each other (lanes 6 and 9). S-Q associated with Sam68 (lane 15) consistent with the Sam68 self-association region residing in the KH domain. The S-Q chimeric protein did not interact with QKI-7, as predicted (lane 18). Surprisingly, Q-S associated with both Sam68 and QKI-7 (Fig. 8, lanes 21 and 24). The association of Q-S with QKI-7 was expected because the chimeric protein contains the QKI-7 NK region. However, the association with Sam68 was not predicted and suggests that another multimerization region resides in the Sam68 C-terminal 188 amino acids. The deletion of the C-terminal 113 amino acids in Q-S (Q-S330) abolished the association with Sam68 (lane 33) but maintained the association with QKI-7 (lane 36). Similar results were obtained when the C-terminal 149 amino acids of Sam68 were deleted from Q-S (Q-S294, data not shown).
The chimeric protein harboring the QKI-7 lethal mutation, Q-S:E→G, did not associate with QKI-7, but the association with Sam68 remained intact (lane 30), consistent with the idea that the NK region of QKI-7 mediates the association with QKI-7. In summary, the chimeric proteins associated with QKI-7 in a predicted fashion: if the NK region of QKI-7 was present, there was association with QKI-7, and if the NK region was absent or if it contained the lethal point mutation E48G, there was no association with QKI-7. The association with Sam68 was more complex. Both the known region for self-association, namely the KH domain, and a newly identified region located in the C-terminal 113 amino acids were involved in association with Sam68. Thus the use of chimeric proteins has permitted the discovery of a region in the C terminus of Sam68 that is involved in multimerization and that was not observed by deletion and/or mutation analysis (10).
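The association pattern summarized above can be written as two simple rules and checked against the six Myc-tagged constructs tested in Fig. 8. The sketch below is only a bookkeeping aid: the class, field, and function names are hypothetical, and the Sam68 rule is a simplification that holds for these six constructs rather than a general model of Sam68 self-association.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Construct:
    name: str
    sam68_kh: bool      # contains the Sam68 KH domain
    sam68_c113: bool    # contains the Sam68 C-terminal 113 amino acids
    qki_nk: bool        # contains the QKI-7 NK region
    e48g: bool = False  # NK region carries the E48G point mutation


def associates_with_qki7(c: Construct) -> bool:
    """Association with QKI-7 requires an NK region without the E48G mutation."""
    return c.qki_nk and not c.e48g


def associates_with_sam68(c: Construct) -> bool:
    """Association with Sam68 via the KH domain or the C-terminal 113 amino acids."""
    return c.sam68_kh or c.sam68_c113


# Domain content of each construct as described in the text (labels are ours).
constructs = [
    Construct("Sam68", sam68_kh=True, sam68_c113=True, qki_nk=False),
    Construct("QKI-7", sam68_kh=False, sam68_c113=False, qki_nk=True),
    Construct("S-Q", sam68_kh=True, sam68_c113=False, qki_nk=False),
    Construct("Q-S", sam68_kh=False, sam68_c113=True, qki_nk=True),
    Construct("Q-S:E->G", sam68_kh=False, sam68_c113=True, qki_nk=True, e48g=True),
    Construct("Q-S330", sam68_kh=False, sam68_c113=False, qki_nk=True),
]

# Map each construct name to (associates with Sam68, associates with QKI-7).
predicted = {
    c.name: (associates_with_sam68(c), associates_with_qki7(c)) for c in constructs
}
for name, (with_sam68, with_qki7) in predicted.items():
    print(f"{name}: Sam68={with_sam68}, QKI-7={with_qki7}")
```

Under these rules, Q-S is the only construct predicted to bind both partners, and Q-S330 and Q-S:E→G each lose exactly one association, matching the lanes cited above.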
Tra-2 3′-UTR RNA Binding Activity of Sam68, QKI-7, and Chimeric Proteins-One question that remained unanswered was whether the S-Q chimera had gained a QKI-like RNA binding specificity. In the course of these studies it was shown by Goodwin and co-workers (41) that QKI-7 bound the 3′ UTR of C. elegans tra-2. To test whether the chimeric proteins associated with an RNA target bound by QKI-7, Myc-tagged Sam68/QKI-7 chimeric proteins were expressed in HeLa cells and immunoprecipitated with anti-Myc or control IgG, and immunoprecipitates were incubated with in vitro transcribed 32P-labeled tra-2 3′-UTR RNA or a mutant RNA with a deletion of 108 nucleotides (−108 3′ UTR, Fig. 9C). The amount of bound RNA after several washes was quantitated and expressed as counts per minute. Anti-Myc immunoprecipitates of QKI-7 bound tra-2 (~70,000 cpm, Fig. 9A) but not the −108 mutant RNA, which is consistent with previous studies (41). In contrast, Sam68, Q-S, and S-Q had negligible tra-2 binding (<1000 cpm, Fig. 9A). The absence of binding with Sam68, Q-S, and S-Q was not caused by a lower expression of these proteins, because anti-Myc immunoblotting of the immunoprecipitates showed comparable expression of all Myc-tagged proteins (Fig. 9B). The Q(GSG)-S chimera, which contained an intact QKI-7 GSG domain, bound very strongly to the tra-2 3′-UTR RNA, to a level comparable with the wild-type QKI-7 protein (Fig. 9A). These results demonstrate that the entire QKI-7 GSG domain was required for tra-2 binding and that the C terminus of Sam68 did not participate in or influence this binding (Fig. 9 and data not shown).
FIG. 7. The RG repeats can confer QKI-7 the ability to substitute for Rev in an HIV RNA export assay. COS-7 cells were transfected with an RRE-CAT reporter plasmid in the presence of the indicated expression vectors as well as the pCH110 β-galactosidase expression vector. CAT activity was normalized for β-galactosidase activity to eliminate transfection efficiency variations. Shown is a representative CAT assay experiment autoradiogram, with the histograph above calculated from at least five distinct experiments.
FIG. 8. The C-terminal portion of Sam68 harbors a region that mediates self-association. Myc-tagged Sam68, QKI-7, S-Q, Q-S, Q-S:E→G, or Q-S330 were co-transfected with HA-tagged Sam68 or HA-tagged QKI-7 in HeLa cells as indicated. The cells were lysed, and an aliquot was kept for the total cell lysate (TCL). The remaining cell lysate was divided equally into two and immunoprecipitated (IP) with anti-Myc antibody (myc) or control mouse IgG (C). The bound proteins as well as an aliquot of total cell lysate were separated by SDS-PAGE, transferred to nitrocellulose, and immunoblotted with anti-HA antibodies. The position of the HA-tagged Sam68 and HA-tagged QKI-7 as well as the heavy chains of the antibodies (IgG) is indicated on the left, and the relative molecular mass markers are shown on the right. The asterisk in lane 3 most likely denotes a degradation product of HA-Sam68.
DISCUSSION
In the present study, we demonstrate that an RG-rich sequence C-terminal to the Sam68 GSG domain spanning amino acids 308-333 is necessary for poly(U) binding, G8-5 binding, and functionally substituting for REV in the transport of HIV RNAs. Using chimeric proteins, we also demonstrate that these 26 amino acids of Sam68 are sufficient to confer on another STAR protein, QKI-7, the ability to bind poly(U)-Sepharose and G8-5 and functionally substitute for REV in the transport of HIV RNAs. Because a functional KH domain is required for these three activities of Sam68 (10,12,14,30), it is clear that the RG sequences themselves do not possess intrinsic RNA binding activity. The most likely explanation is that the RG repeats confer to neighboring GSG domains the ability to bind certain RNAs.
The collaboration between the RG-rich regions of Sam68 and the GSG domain is reminiscent of a study by Rosbash and co-workers (44) on the GSG domain of BBP/SF1, another STAR protein. BBP/SF1 contains both a GSG domain and a Zn knuckle RNA binding motif. The GSG domain of these proteins is involved in specific recognition of the pre-mRNA branchpoint sequence, but one or more accessory modules are required to achieve efficient binding (44).
The inhibitory effect of p59fyn on SΔN was more severe than on full-length Sam68, suggesting that the N-terminal region of Sam68 may contain regulatory sequences. The mouse Sam68 N-terminal 67 amino acids contain two RGG repeats interspaced by four residues (5) and may be a bona fide RGG box, a type of RNA binding motif (43). The deletion of the Sam68 N-terminal 67 amino acids had no effect on poly(U) and G8-5 RNA binding, suggesting that the RGG repeats do not play a major role in RNA binding or that other sequences in Sam68 function in a redundant manner. Indeed, RG sequences on either side of the GSG domain function in a redundant manner to confer poly(U) binding to the GSG domain. Deletion of the N-terminal RGG boxes or deletion of amino acids 315-325 had no effect on poly(U) binding. However, a Sam68 protein containing the double deletion/mutation was unable to associate with poly(U)-Sepharose. The tyrosine phosphorylation of SΔN by p59fyn has been shown to abrogate poly(U) binding (11), suggesting that the phosphorylated C terminus of Sam68 can negatively regulate the nonspecific RNA binding contributions from the RG-rich sequences located at amino acids 315-325. The fact that the full-length protein was less affected by phosphorylation by p59fyn suggests several possibilities: 1) the N-terminal RG sequences are not regulated by p59fyn and may require additional signals such as arginine methylation (15), and 2) a tyrosine kinase may phosphorylate different tyrosines on Sam68 that may now regulate the N-terminal RG region. In summary, our analysis has uncovered a role for the N-terminal RGG sequences that was not obvious using simple deletion strategies.
The Q-S chimeric protein has lost its ability to interact with a QKI-7-specific RNA target, the 3′ UTR of tra-2 (see Fig. 9). Previous studies demonstrated that the GSG domain was the minimal region required for RNA binding and that GSG proteins devoid of a CK region had impaired RNA binding (19,21). Our results are consistent with this notion, because we show that when the complete GSG domain of QKI-7 is included in the chimera (Q(GSG)-S), specific binding to the tra-2 RNA is restored. Moreover, the Q(GSG)-S chimeric protein still displays Sam68-like RNA binding specificity, namely interaction with poly(U) and G8-5 RNAs. Thus the GSG domain is the only RNA binding region in the QKI-7 protein required for specific high affinity RNA binding.
The construction of chimeric QKI-7/Sam68 proteins has also permitted us to find an oligomerization region located in the Sam68 C-terminal 113 amino acids. Our previous studies demonstrated that the Sam68 GSG domain (Sam68:103-269) is able to associate with a wild-type Sam68 protein and that a deletion in the KH domain of Sam68 (Sam68ΔKH) abolished self-association (10). These data showed that the GSG domain was necessary and sufficient for self-association and that the Sam68 C-terminal ~200 amino acids in Sam68ΔKH were not sufficient, without an intact KH domain, to mediate self-association (10). The QKI-7/Sam68 chimeras make it evident that the C-terminal region of Sam68 harbors a region required for self-association. In this situation the QKI-7 KH domain seems to compensate for the loss of the Sam68 KH domain. The coiled coil in the NK region of QKI-7 did not participate in the Sam68 association, because the introduction of the E48G substitution (Q-S:E→G), known to abolish the association with QKI-7 (19), did not affect the association with Sam68. The presence of two regions in Sam68 that mediate self-association suggests that Sam68 may be forming head-to-tail multimers and/or that Sam68 is involved in intramolecular interactions. We have shown previously that the tyrosine phosphorylation of Sam68 by p59fyn prevents self-association (10). The mechanism by which tyrosine phosphorylation regulates self-association is unknown. Now, with the discovery of a new self-association region that resides in the C-terminal tyrosine-rich region of Sam68, it is possible that the GSG domain associates with a region in the C-terminal tyrosine-rich region and that the phosphorylation of this region by Src kinases would interfere with this self-association, which is consistent with our previous observations (10).
FIG. 9. tra-2 3′-UTR RNA binding activity of Sam68, QKI-7, and chimeric proteins. A, Myc-tagged proteins were expressed in HeLa cells and immunoprecipitated with control mouse IgG or anti-Myc antibodies. The immunoprecipitates were incubated with 32P-labeled wild-type or mutant (−108) tra-2 3′-UTR RNA, and the bound radioactivity was quantitated. Each bar represents the mean ± standard deviation of data from more than six independent experiments normalized for the background counts obtained with control mouse IgG. B, a typical representation of the expression of Myc-tagged proteins is shown. C, 32P-labeled wild-type or mutant (−108) tra-2 3′-UTR RNAs were produced in vitro using T7 RNA polymerase and resolved on nondenaturing polyacrylamide gels. Molecular size markers are indicated in nucleotides.
Based on the results presented in this study, we propose the following model: Sam68 in the resting state would be nonphosphorylated and exhibit nonspecific RNA binding mediated by the concerted effort of its GSG domain and its flanking RG-rich regions. The tyrosine phosphorylation of Sam68 would render it a specific RNA-binding protein. Supporting this model is the fact that poly(U) binding, binding to the UAAA RNA target G8-5, and the REV-like function of Sam68 were identified by using unphosphorylated Sam68 (2,12,14). Moreover, Sam68 poly(U) binding (11,45), G8-5 binding (this study), and REV-like function (7) are negatively regulated by tyrosine kinases. The absence of secondary structure in poly(U) and in RNAs selected by systematic evolution of ligands by exponential enrichment, such as G8-5, together with the lack of defined sequence/structure in the RRE recognized by Sam68, suggests that specific RNA binding, mediated by only the Sam68 GSG domain, may only be observed in the context of the whole protein when tyrosine-phosphorylated.