The Heterogeneous Nuclear Ribonucleoprotein C Protein Tetramer Binds U1, U2, and U6 snRNAs through Its High Affinity RNA Binding Domain (the bZIP-like Motif)*

Based on UV cross-linking experiments, it has been reported that the C protein tetramer of 40 S heterogeneous nuclear ribonucleoprotein complexes specifically interacts with stem-loop I of U2 small nuclear RNA (snRNA) (Temsamani, J., and Pederson, T. (1996) J. Biol. Chem. 271, 24922–24926), that C protein disrupts U4:U6 snRNA complexes (Forne, T., Rossi, F., Labourier, E., Antoine, E., Cathala, G., Brunel, C., and Tazi, J. (1995) J. Biol. Chem. 270, 16476–16481), that U6 snRNA may modulate C protein phosphorylation (Mayrand, S. H., Fung, P. A., and Pederson, T. (1996) Mol. Cell. Biol. 16, 1241–1246), and that hyperphosphorylated C protein lacks pre-mRNA binding activity. These findings suggest that snRNA-C protein interactions may function to recruit snRNA to, or displace C protein from, splice junctions. In this study, both equilibrium and non-equilibrium RNA binding assays reveal that purified native C protein binds U1, U2, and U6 snRNA with significant affinity (∼7.5–50 nM) although nonspecifically. Competition binding assays reveal that U2 snRNA (the highest affinity snRNA substrate) is ineffective in C protein displacement from branch-point/splice junctions or as a competitor of C protein’s self-cooperative RNA binding mode. Additionally, C protein binds snRNA through its high affinity bZLM, and mutations in the RNA recognition motif at suggested RNA binding sites primarily affect protein oligomerization.

The C protein tetramer of 40 S heterogeneous nuclear ribonucleoprotein (hnRNP) particles is one of the three major core particle proteins involved in the packaging of pre-mRNA in vertebrates (for review, see Ref. 1). The tetramer is highly stable in solution and, unlike the A- and B-group proteins, binds RNA in a salt-resistant manner (2,3). Hydrodynamic and ultrastructural studies have shown that three C protein tetramers bind approximately 700 nt of RNA in a cooperative manner and fold the RNA into a unique 19 S triangular complex that can be recovered from intact 40 S monoparticles following their dissociation in low ionic strength buffers (4-6).
Interestingly, when all of the core particle proteins are allowed to bind long pre-mRNAs in vitro, a contiguous array of 20-nm 40 S hnRNP particles assemble such that each contains approximately 700 nt of RNA. The same experiment performed with purified native C protein alone yields one triangular complex per 700 nt of RNA. These in vitro activities are consistent with electron micrographs of native hnRNP complexes that mostly reveal contiguous arrays of 20-nm particles. Thus, C protein may function in vivo as a protein ruler that nucleates monoparticle assembly (4,7,8). C protein's intrinsic ability to melt RNA secondary structure further indicates that it may act as an RNA chaperonin to maintain a single-stranded state and to orient the RNA such that trans-acting factors can access their cognate sequences (9).
The in vitro reconstitution studies described above together with C protein's cooperative binding mode suggest that C protein may completely coat elongating transcripts in vivo. In support of this possibility, it has been reported that C protein is present in the nonspecific pre-spliceosomal H complex but that it is displaced from splice junctions during the early events of spliceosome assembly (10). In the ontogeny of mRNA, the site-specific displacement of C protein would be a significant biochemical event. In this context, several recent reports have implicated snRNAs and their associated factors in a mechanism that could modulate C protein's affinity for RNA in a site-specific manner. Although snRNPs are not required for C protein-RNA interactions in vitro, previous studies have indicated that both U1 and U2 snRNPs are required for C protein binding to pre-mRNA (11). More recent reports have suggested that C protein interacts specifically with stem-loop I of U2 snRNA (12) and that C protein binds U6 snRNAs that contain elongated uridylate stretches at their 3′ end (13). In the latter studies, it was reported that C protein disrupts U6:U4 snRNA interactions when elongated U6 snRNAs are a part of the complexes. Finally, it has been reported that U6 snRNA plays a role in the modulation of C protein phosphorylation (14), and it is now well established that hyperphosphorylated C protein lacks RNA binding activity (15,16). It is therefore possible that the displacement of C protein from branch-point/splice-junction sites might be mediated directly by snRNP binding or by an snRNP-directed C protein phosphorylation event.
To explore these possibilities further, we have characterized the in vitro interactions of purified native C protein with U1, U2, and U6 snRNA. C protein contains a consensus RNA recognition motif (CS-RRM) at its amino-terminal end (residues 8-87) (17,18). Initially, the RRM was believed to be the major determinant of RNA binding. However, we have recently shown that C protein's high affinity RNA binding domain is located between residues 140 and 180 (19). This 40-residue RNA binding domain is highly basic (28% Arg and Lys) and immediately precedes a 28-residue leucine zipper motif (residues 180-207). Highly basic nucleic acid binding domains preceding leucine zipper motifs have been well characterized in several DNA-binding bZIP transcription factors (e.g. GCN4, Fos, and Jun) (20,21). It is therefore not surprising that this bZIP-like motif (the bZLM) is responsible for C protein's wild type RNA binding activity. Deletion constructs of C protein lacking the CS-RRM, but retaining the bZLM, are able to bind long lengths of RNA with a slightly higher affinity than full-length C protein (19). This brings into question the function of C protein's CS-RRM. Because the interactions of certain CS-RRMs with snRNA are now well characterized, we were particularly interested in the possibility that C protein may bind cooperatively to long pre-mRNAs through its bZLM and to snRNA through its CS-RRM.
Using both equilibrium and non-equilibrium binding assays, we report here that C protein's CS-RRM is not responsible for binding U2 or U6 snRNA. Rather, C protein binds these RNAs in a nonspecific manner through its bZLM. We also show that 19 S complex formation is only slightly attenuated by the presence of U2 snRNA. Finally, we present evidence for the involvement of C protein's CS-RRM in the tetramer-tetramer interactions associated with formation of 19 S complexes. These studies further the ongoing search for possible functions of CS-RRMs and provide additional evidence for C protein's nonspecific affinity for RNA.

EXPERIMENTAL PROCEDURES
C Protein Purification-Typically, nuclei from approximately 9 billion HeLa S3 cells were isolated as described previously (22,23). The nuclei were sonicated briefly on ice and exposed to mild digestion with micrococcal nuclease (Boehringer Mannheim, 150 units/ml sonicate) for 5 min at 37°C. Particulate material was removed by centrifugation (5856 × g for 10 min in a Sorvall SS-34 rotor). Pepstatin, leupeptin, and aprotinin were added to the supernatant prior to loading on a Hi Trap Q column (anion-exchange; Amersham Pharmacia Biotech). Proteins were eluted with a 200-460 mM NaCl gradient. Fractions containing C protein were combined, diluted 3-fold, and further purified using the Amersham Pharmacia Biotech Mono Q HR 5/5 column. Purified C protein was then equilibrated in 1× STE via dialysis for 12 h at 4°C. Protein concentrations were determined using the BCA assay (Pierce).
In Vitro Transcription-The 709-nt transcript of the human β-globin gene was prepared by enzyme digestion (BamHI) of the pHBG709 vector, followed by in vitro transcription using T7 polymerase as described previously (8). This particular RNA possesses the coding sequence for exon 1, intron 1, and a portion of exon 2 of human β-globin. The pHU6-1 (DraI), pMRG3U2 (BamHI), and the pHU1A (SalI) vectors were kindly provided by Dr. Thoru Pederson. U2 and U6 snRNAs were transcribed using a T7-MEGAshortscript in vitro transcription kit (Ambion, Inc.) in order to produce high yields of RNA. U1 snRNA, transcribed with SP6 polymerase under previously described conditions, was used as a control substrate (8). As an additional control, a 116-nt RNA was transcribed from a portion of antisense 18 S ribosomal DNA. In each case where RNA was radiolabeled, [α-32P]CTP (3000 Ci) was added to the transcription reaction. The r(U)14, the SELEX-identified winner sequence, and the 20-nt β-globin sequence were commercially prepared by Cruachem. The sequence of the winner oligonucleotide has been previously described by Gorlach et al. (38). The β-globin 20-mer is a sequence from IVS1 of human β-globin pre-mRNA that is not thought to function in pre-mRNA processing (see Burd and Dreyfuss (18) for the nucleotide sequence). All three oligonucleotides were end-labeled with T4 polynucleotide kinase in the presence of [γ-32P]ATP (3000 Ci). RNA concentrations were determined through A260 measurements. The extinction coefficients of U1, U2, and U6 snRNA are similar due to their similar uridine percentages (i.e. 24%, 29%, and 27%, respectively). Observed differences in C protein's affinity for these substrates are thus not due to differences in substrate concentrations. Although differences in the extinction coefficient between the 20-nt β-globin transcript and the r(U)14 substrate could lead to lower molar concentrations of the homoribopolymer, this alone cannot explain the absence of observed binding to these RNAs, especially inasmuch as, in separate experiments, they were used at a 10-fold higher concentration than the snRNA substrates. Optical density measurements at 280 nm were also obtained to ensure the absence of protein contamination in the RNA substrates.
Gel Mobility Shift Assays-RNA-protein interactions were examined using gel mobility shift assays. Each 20-μl reaction consisted of 20 mM Tris-HCl, pH 7.5, 100 mM NaCl, 1 mM dithiothreitol, 10% glycerol, 0.1% Nonidet P-40, C protein (or a deletion construct of C protein), and 100 pM amounts of the appropriate 32P-labeled RNA. Increasing concentrations of protein were used in each sample. Each sample was allowed to incubate for 20 min at room temperature. A 4.5% nondenaturing polyacrylamide gel (acrylamide to bisacrylamide ratio of 30:1) was preelectrophoresed in 0.5× TBE for 30 min (Bio-Rad Mini-PROTEAN II electrophoresis system). The samples were first loaded into the gel while it was running at 100 V and then electrophoresed for 30 min at 200 V. Radioactive bands were detected with the use of a Molecular Dynamics PhosphorImager. In Figs. 1, 2, and 6, the apparent mobility retardation of the free RNA probes seen in both panels is a result of loading consecutive lanes while the gel was running and does not represent protein-RNA interactions. Control experiments using Epstein-Barr virus nuclear antigen-1 (EBNA-1) and EBNA-1 DNA (provided with the bandshift kit from Amersham Pharmacia Biotech) were performed to confirm that binding occurs under the conditions described above. As a negative control, U2 and U6 snRNA were titrated with bovine serum albumin. Bovine serum albumin was not observed to bind either substrate under these conditions, even at high bovine serum albumin concentrations.
In the case of the band shift assay, dissociation constants were determined by visually identifying the electrophoretic lane representing the protein concentration at which half of the RNA present is bound (24,25). The Kd values listed in Table I are therefore approximate. This qualitative method has been shown to be valid for Kd estimations provided the nucleic acid concentration is at least 10-fold lower than the protein concentration at the midpoint (24). In this context, Kd estimates from the band shift assay were confirmed through the equilibrium binding study described below. The band shift assay was also used to identify the primary snRNA binding domain. In these studies the approximate Kd values determined for the M1-F115 construct (the RRM) and the Y119-G290 construct (the bZLM) were sufficiently dissimilar so as to render the results unambiguous.
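The midpoint estimate described above can be illustrated with a short sketch. It assumes simple 1:1 binding with the RNA concentration far below the protein concentration, so that the fraction of RNA bound is approximately [P]/(Kd + [P]). The function names and the simulated lane values are hypothetical and are not data from this study.

```python
def fraction_bound(protein_nM, kd_nM):
    """Fraction of RNA bound for simple 1:1 binding when [RNA] << [protein]."""
    return protein_nM / (kd_nM + protein_nM)


def estimate_kd_midpoint(protein_nM, bound_fraction):
    """Estimate Kd as the protein concentration at half-saturation.

    Linearly interpolates between the two lanes that bracket 50% bound.
    Returns None if the titration never crosses half-saturation.
    """
    pairs = list(zip(protein_nM, bound_fraction))
    for (p_lo, f_lo), (p_hi, f_hi) in zip(pairs, pairs[1:]):
        if f_lo <= 0.5 <= f_hi and f_hi > f_lo:
            return p_lo + (0.5 - f_lo) * (p_hi - p_lo) / (f_hi - f_lo)
    return None


def midpoint_is_valid(rna_nM, kd_nM, min_fold=10.0):
    """Check that the RNA concentration is at least min_fold below the midpoint."""
    return kd_nM >= min_fold * rna_nM


# Hypothetical titration simulated with Kd = 7.5 nM (illustrative values only).
lanes_nM = [1.0, 2.5, 5.0, 10.0, 25.0, 50.0]
fractions = [fraction_bound(p, 7.5) for p in lanes_nM]
kd_estimate = estimate_kd_midpoint(lanes_nM, fractions)
```

Because the estimate is interpolated between discrete lanes, it only approximates the simulated value, which mirrors why Kd values read from a gel are approximate. With 100 pM (0.1 nM) RNA, `midpoint_is_valid(0.1, 7.5)` satisfies the 10-fold condition.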
Fluorescence Titrations: The Equilibrium RNA Binding Assay-Equilibrium binding isotherms were obtained by measuring the change in the fluorescence signal (enhancement) that occurs when C protein binds the fluorescent probe RNA, poly-r(εA). The probe substrate was approximately 3000 nt. Fluorescence measurements were performed with an SLM Aminco Bowman Series 2 luminescence spectrometer. Excitation and emission slits were fixed at 4 and 8 nm, respectively. Titrations were conducted by exciting a fixed amount (0.5-2 μM) of the probe RNA at 310 nm and measuring the change in its emission at 410 nm as a function of increasing protein concentration. The fluorescent probe RNA was prepared as described previously (26). The concentration of the probe RNA was determined using an extinction coefficient (ε) of 5330 for optical density measurements at 260 nm (24). Fluorescence measurements were corrected for dilution due to the addition of protein sample and for background fluorescence. In all cases, the dilution of probe RNA did not exceed 5% of the total sample volume. Inner filter corrections were not necessary as the quantities of RNA used in these titrations have negligible absorbance at 310 nm. To determine if native C protein binds the fluorescent probe through a single binding mode, the macromolecular binding density function analysis (27) was performed at RNA concentrations of 0.7, 1.4, and 2.8 μM (28).
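The concentration and correction steps above can be sketched as follows. This is a minimal illustration, assuming that the extinction coefficient of 5330 is expressed in M^-1 cm^-1 per nucleotide, that the path length is 1 cm, and that the background is measured at the same titration point; the function names and example readings are hypothetical.

```python
def concentration_from_absorbance(absorbance, epsilon, path_cm=1.0):
    """Beer-Lambert law: concentration (M) = A / (epsilon * path length)."""
    return absorbance / (epsilon * path_cm)


def correct_fluorescence(observed, background, initial_volume_ul, added_volume_ul):
    """Subtract background, then scale by the dilution factor (V0 + v) / V0."""
    dilution = (initial_volume_ul + added_volume_ul) / initial_volume_ul
    return (observed - background) * dilution


def fractional_enhancement(corrected, corrected_initial):
    """Binding-induced enhancement relative to the protein-free probe signal."""
    return (corrected - corrected_initial) / corrected_initial


# Hypothetical readings: A260 of 0.00533 corresponds to 1 uM nucleotide
# under the assumed units for the extinction coefficient.
probe_molar = concentration_from_absorbance(0.00533, 5330)
signal = correct_fluorescence(observed=105.0, background=5.0,
                              initial_volume_ul=1000.0, added_volume_ul=50.0)
```

The example uses a 5% volume addition, the upper limit of dilution stated for these titrations.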
The competitor RNAs were used in 2-fold molar excess (total nucleotides) depending on the strength of the competitor as a substrate for C protein binding (determined through test titrations). An expression for the apparent equilibrium constant (Kapp) for the interaction of C protein with each competitor RNA was derived considering the following equilibria (29,30).
In a competition titration, the free protein concentration for both mass action expressions is the same. As a result, both expressions can be equated through [P]free. The solution for Kcomp is shown in the following equation.
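One form of the two equilibria and of the resulting expression that is consistent with the description above is sketched here. It is offered as an assumption, not as the authors' printed equations; the symbols P (C protein), εA (probe RNA), and comp (competitor RNA) are labels chosen for illustration.

```latex
% Assumed reconstruction; requires the amsmath package.
\begin{align}
P + \varepsilon A &\rightleftharpoons P\cdot\varepsilon A, &
K_{\varepsilon A} &= \frac{[P\cdot\varepsilon A]}{[P]_{\mathrm{free}}\,[\varepsilon A]_{\mathrm{free}}} \\
P + \mathrm{comp} &\rightleftharpoons P\cdot\mathrm{comp}, &
K_{\mathrm{comp}} &= \frac{[P\cdot\mathrm{comp}]}{[P]_{\mathrm{free}}\,[\mathrm{comp}]_{\mathrm{free}}}
\end{align}
% Solving each expression for [P]_free and equating the two results:
\begin{equation}
K_{\mathrm{comp}} = K_{\varepsilon A}\,
\frac{[P\cdot\mathrm{comp}]\,[\varepsilon A]_{\mathrm{free}}}
     {[P\cdot\varepsilon A]\,[\mathrm{comp}]_{\mathrm{free}}}
\end{equation}
```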
In the above equation, KεA was derived from a fit of data from a control titration (no competitor present) using the McGhee-von Hippel noncooperative binding equation (31).
Velocity Sedimentation-Velocity sedimentation centrifugation in 15-30% linear glycerol gradients was used to characterize the sedimentation properties of various protein-RNA complexes and the efficiency of 19 S complex assembly in the presence of competitor RNAs. The gradients were subjected to centrifugation at 134,400 × g for 24 h at 4°C in the Beckman SW 28 rotor. The protein and RNA present in consecutive fractions were precipitated with three volumes of cold ethanol and resolved by SDS-PAGE (32). The gels were first stained with Coomassie Brilliant Blue R-250 to identify the fractions containing the protein peak. Following this, the gels were dried (1 h at 80°C) and exposed to phosphor screens to visualize radiolabeled RNA. Radioactive bands were detected with the use of a Molecular Dynamics PhosphorImager.
The efficiency of 19 S complex formation was calculated by densitometry using NIH Image. The area under each peak (or for each band) was determined and the values (which were directly proportional to the band's density) were summed to determine the total integrated area. The integrated area for the 19 S bands alone was also determined and divided by the total in order to calculate percent 19 S complex formed.
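The densitometry calculation described above reduces to a ratio of integrated areas. The sketch below assumes band areas keyed by gradient fraction number; the function name and the example areas are hypothetical and do not represent measurements from this study.

```python
def percent_19s(band_areas, fractions_19s):
    """Percent of total integrated area found in the 19 S fractions.

    band_areas: dict mapping gradient fraction number -> integrated band area
    fractions_19s: fraction numbers assigned to the 19 S peak
    """
    total = sum(band_areas.values())
    if total <= 0:
        raise ValueError("total integrated area must be positive")
    area_19s = sum(band_areas.get(f, 0.0) for f in fractions_19s)
    return 100.0 * area_19s / total


# Hypothetical integrated areas (arbitrary densitometry units).
example_areas = {8: 200.0, 9: 400.0, 10: 200.0, 12: 50.0, 13: 50.0, 14: 50.0, 15: 50.0}
example_percent = percent_19s(example_areas, fractions_19s=[8, 9, 10])
```

Keeping the 19 S fraction assignment as an explicit argument makes the choice of peak boundaries visible, since that choice directly affects the reported percentage.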
C Protein Constructs-The M1-F115 and the Y119-G290 constructs were generated as described previously (19). The His-tagged 8KTD/A and 13SMNS/A constructs were generated by run-around polymerase chain reaction using the pET-28a vector, containing the full-length C1 cDNA sequence, as the template. In each case, primers were created such that the codons that were to be mutated were replaced with alanine (GCG). After kinasing and religating, the vector was then purified by gel extraction (Qiagen, Inc.). The purified plasmid was then transformed into both DH5α and BL21(DE3) pLysS Escherichia coli strains. Expression and purification of these C protein constructs were performed as described previously (19).

RESULTS
C Protein-snRNA Interactions Can Be Monitored through Nonequilibrium and Equilibrium Binding Assays-Gel mobility shift assays were performed to determine if interactions between purified C protein and U1, U2, and U6 snRNA are of sufficient affinity to monitor under nonequilibrium conditions. Although this study was designed to evaluate interactions between C protein and U2 and U6 snRNA, U1 snRNA and a 116-nt transcript of ribosomal DNA were used to further examine the question of sequence-specific binding. In vitro transcribed 32P-labeled snRNAs were incubated with increasing concentrations of purified native C protein (1.0-50 nM), and the preparations were resolved in 4.5% nondenaturing polyacrylamide gels. As shown in Fig. 1A, C protein binds U2 snRNA with an approximate Kd of 7.5 nM. C protein also binds U6 snRNA but with lower affinity (Fig. 1B). In the latter titrations, the protein concentration yielding 50% RNA bound complexes (Kd) could not be determined due to practical limitations associated with the methodology for obtaining highly purified native C protein. It is apparent, however, that the Kd for the interaction of native C protein with U6 snRNA is not significantly above 50 nM. In Fig. 2 (A and B), it can be seen that C protein binds U1 snRNA with a Kd near 10 nM. It can also be seen that C protein binds with slightly lower affinity to a 116-nt transcript of ribosomal RNA (about 25 nM) (see Table I for summary).
To determine if these differences in relative affinity occur in solution under equilibrium conditions, binding to the three snRNAs and to the 116-nt control was monitored using fluorescence spectroscopy (27, 33-35). In this assay, the ability of the individual snRNAs to compete as substrates with fluorescently labeled poly(A) was determined. RNA preparations containing a 2:1 molar excess (total nucleotides) of snRNA over the fluorescent probe RNA were titrated with increasing amounts of C protein (0.5-16.5 nM). In these experiments, C protein binding to competitor RNA is seen as an attenuation of the binding-induced enhancement of fluorescence from the probe RNA. Consistent with the order of affinity observed in the band shift assay, calculations from the data shown in Fig. 3 reveal that C protein binds U2 snRNA under equilibrium conditions with 2-fold higher affinity than U1 snRNA, a 10-fold higher affinity than U6 snRNA, and a 4-fold higher affinity than the 116-nt control substrate. Additionally, C protein binds the 116-nt control substrate with a 2-fold higher affinity than U6 snRNA. Although the equilibrium binding assay used here can detect subtle differences in relative binding affinities, it does not yield absolute dissociation constants because the competitors are present in molar excess to the probe and because the poly(εA) probe is of sufficient length (3000 nt) to accommodate C protein's cooperative binding mode (36). The results from both assays were not unexpected, as our previous findings have demonstrated that C protein binds RNA regardless of sequence or the presence of RNA secondary structure (4,8,37). The differences in affinity observed here do not seem adequate to direct in vivo specificity for any of the RNA substrates tested.
FIG. 1. Gel mobility shift analysis of C protein binding to U2 and U6 snRNA. 100 pM amounts of the respective 32P-labeled RNAs were incubated with increasing concentrations of C protein (0-50 nM) for 20 min at room temperature. Samples were then electrophoresed on a 0.5× TBE nondenaturing polyacrylamide gel. RNA was then visualized using phosphorimaging techniques (see "Experimental Procedures"). C protein concentration is labeled above the gels. A, both free U2 snRNA and a slower migrating C protein-U2 snRNA complex are as identified. The estimated Kd value for this interaction is 7.5 nM. B, a U6 snRNA band shift is also observed, indicating C protein-U6 snRNA complex formation at 50 nM concentration of C protein. From this experiment, a Kd value could not be determined. The apparent retardation in RNA mobility (lanes 2-5) seen in both panels is a result of loading consecutive lanes while the gel was running and does not represent protein-RNA interactions. The position of the protein-RNA complex is indicated in the figure.
C Protein Binds U2 and U6 snRNA through Its bZLM-To characterize the individual interaction of C protein's two RNA binding domains with U2 and U6 snRNA, two bacterially expressed deletion constructs were utilized. The amino-terminal construct (M1-F115) contains the CS-RRM. The second construct contains the bZLM and C protein's acidic carboxyl terminus (Y119-G290) (Fig. 4). The band shift assay shown in Fig. 5A reveals that the M1-F115 construct binds U2 snRNA with a Kd near 125 nM (see Table I). It is apparent that at higher protein concentrations a second RNA-protein complex forms. This result could be due to the binding of a second mole of protein at high protein concentrations. In Fig. 5B, both the first and second complexes are again observed, indicating an interaction between U2 snRNA and the bZLM-containing construct (Y119-G290). The Kd for the CS-RRM-U2 snRNA interaction is at least 10-fold greater (indicating a lower affinity) than that of the bZLM-U2 snRNA interaction. Similar results were obtained for the interaction of these two RNA binding domains with U6 snRNA. In Fig. 6 (A and B), it can be seen that the affinity of the bZLM is at least 10-fold higher than that of the CS-RRM. As observed with U2 and U6 snRNA, it was found that the Y119-G290 construct binds with higher affinity than the M1-F115 construct to U1 snRNA and to the 116-nt control (results summarized in Table I). These results are consistent with our previous studies using various pre-mRNAs as substrates (19,36), namely that the bZLM is the high affinity determinant of C protein-RNA interactions.
C Protein Binds snRNA with Higher Affinity than It Binds Uridine-rich Oligonucleotides-Several previous publications have indicated that C protein displays its highest binding affinity for homoribopolymers of uridine and for uridine-rich oligonucleotides (38-43). Additionally, in previous UV-induced cross-linking studies on C protein-snRNA interactions, it has been suggested that C protein may preferentially bind a uridine-rich loop of U2 snRNA (12) and an unusually long polyuridine element at the 3′ terminus of U6 snRNA (13). However, in salt-dissociation studies, in in vitro reconstitution studies (4), and in equilibrium binding assays (36), we have not observed that native C protein preferentially binds uridine-rich sequences. It therefore seemed important to compare (via the band shift assay) C protein's affinity for snRNA (U1, U2, and U6 snRNA) with its affinity for two uridine-rich oligonucleotides previously described as high affinity substrates (38). Specifically, as substrates we used a 14-nucleotide homoribopolymer of uridine (r(U)14) and the uridine-rich "winner" sequence identified through selection and amplification (SELEX) (38). A 20-nucleotide sequence (from IVS1 of β-globin) that is not uridine-rich and that has not been characterized as a high affinity substrate for C protein binding was used as a control substrate. The results of the band shift assays shown in Fig. 7 reveal that C protein does not form stable associations with any of the three substrates. Additionally, these titrations were performed at molar RNA concentrations 10-fold higher than those used for the snRNA titrations. At the highest concentration of protein (50 nM), a smear of labeled r(U)14 can be seen (Fig. 7B). This could result from the dissociation of weakly interacting species. Consistent with these findings, we have used fluorescence spectroscopy to monitor C protein binding in solution to various oligonucleotides suggested as high affinity target sequences. In these experiments, it was necessary to use target oligonucleotide concentrations at least 500-fold higher than those used here to evaluate C protein-snRNA and C protein-pre-mRNA interactions.
FIG. 2. Gel mobility shift analysis of C protein binding to U1 snRNA and a 116-nt RNA. These RNAs were used as competitors to determine whether the binding of C protein to U2 and U6 snRNA was sequence-specific. C protein concentration ranged from 0 to 50 nM. A, identified are the migrations of free U1 snRNA and the C protein-U1 snRNA complex. The Kd value is estimated to be 10 nM. B, the findings here show that C protein will also bind to the 116-nt RNA; however, the approximated Kd value is slightly higher than that of the C protein-U1 snRNA interaction (25 nM). As in Fig. 1, the apparent mobility retardation of the free RNA probe (especially apparent in the second and third lanes of panel A) is due to loading consecutive lanes while the gel was running and does not represent protein-RNA interactions. The position of the protein-RNA complex is indicated in both panels.
FIG. 3. Equilibrium binding isotherms for the interaction of purified native C protein with various RNAs. These RNA binding isotherms reveal the relative abilities of U1, U2, and U6 snRNA (and the 116-nt control) to compete with the fluorescently labeled probe RNA substrate. The magnitude of fluorescence attenuation from the probe RNA correlates with increased C protein affinity for the RNAs indicated. More specifically, these equilibrium binding isotherms reveal that U2 snRNA is a better substrate competitor than U6 snRNA. In these experiments the concentration of the fluorescent probe RNA was 1 μM. C protein concentrations ranged from 0 to 20 nM and the concentration of the competitors (U1, U2, U6, and the 116-nt control) was 2 μM (see "Experimental Procedures"). The upper solid tracing (denoted probe RNA) is a nonlinear least squares fit of the titrations of poly-r(εA) with C protein using the McGhee and von Hippel non-cooperative model (31,47). The lines for the competitor RNAs are interpolations of the data points. The competitor titrations were not carried to plateau (Emax) due to limitations of C protein availability and concentration.

FIG. 4. Schematic representation of full-length C protein and two deletion constructs. Shown in the top diagram is the full-length C protein with its four known major domains: the CS-RRM, the highly basic RNA binding domain and the four-heptad leucine zipper (or bZLM), and the acidic carboxyl-terminal region. Below the full-length depiction of C protein are the two deletion mutants, M1-F115 and Y119-G290, respectively. The M1-F115 mutant consists primarily of the CS-RRM whereas the Y119-G290 possesses the bZLM and the acidic carboxyl terminus.
Relative Affinities of C Protein for Pre-mRNA and U2 snRNA: U2 snRNA Moderately Attenuates 19 S Formation-The results described above demonstrate that native C protein binds U2 snRNA with significant affinity. Although the binding of C protein to U2 snRNA appears nonspecific, it could function in vivo to block the RNA-activated interactions between tetramers that direct 19 S complex assembly at splice junctions. Additionally, interactions between snRNA and C protein might induce the site-specific displacement of C protein from intronic branch points. To address this question, a series of in vitro reconstitution studies were conducted to determine the relative affinities of C protein for pre-mRNA and U2 snRNA and to determine if U2 snRNA can dissociate or block the assembly of 19 S C protein-RNA complexes. In these experiments, purified native C protein was allowed to associate in solution with a 32P-labeled 709-nt human β-globin transcript containing all of the first exon and intron and a 205-nt portion of exon 2 (44). In other experiments, C protein was either preincubated with 32P-labeled U2 snRNA or the latter was present with the β-globin transcript prior to C protein addition. Because C protein, U2 snRNA, and nascent transcripts are all abundant intranuclear constituents in actively dividing cells, the molar concentration of the RNA substrates in most of the competitive reconstitution assays was adjusted to near equality. The various C protein-RNA preparations were resolved in 15-30% linear glycerol gradients. Following centrifugation, the distribution of RNA was monitored by pumping the gradients through a 30-μl flow cell of a diode array spectrophotometer. Additionally, SDS-PAGE was performed on successive 1-ml gradient fractions to more precisely identify the RNA (radioactivity) and protein components (Coomassie stain) of the various gradient-resolved species.
In several previous studies, we have shown that each C protein tetramer binds a 230-nt length of RNA and that three tetramers associate to fold approximately 700-nt RNA substrates (regardless of sequence or source) into a unique triangular complex that sediments at 19 S. Purified C protein alone sediments at approximately 5.8 S while the 709-nt transcript used here sediments in a well resolved manner slightly faster than C protein (4) (see Fig. 8, A and B).
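The stoichiometry stated above (one tetramer per 230 nt, three tetramers per triangular complex) can be expressed as a small calculation. This is only an arithmetic sketch; the function name and default parameters are hypothetical conveniences, and partial sites at the end of a transcript are simply ignored.

```python
def expected_assembly(transcript_nt, site_nt=230, tetramers_per_complex=3):
    """Return (tetramers bound, 19 S complexes) expected for a transcript length.

    Assumes whole binding sites only; leftover nucleotides are not counted.
    """
    tetramers = transcript_nt // site_nt
    complexes = tetramers // tetramers_per_complex
    return tetramers, complexes


# The 709-nt transcript accommodates three tetramers, i.e. one 19 S complex.
tetramers_709, complexes_709 = expected_assembly(709)
```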
In the control experiment shown in Fig. 8, it can be seen that when C protein is allowed to bind the 709-nt pre-mRNA transcript in the absence of snRNA and under conditions of slight protein excess, the great majority of RNA and protein sediment at 19 S. It can be seen that a very small amount of unbound RNA (fractions 12 and 13, panel A) and unbound protein (fractions 14 and 15, panel B) sediment as free moieties. It is also apparent that the great majority of RNA sediments with C protein about a peak at 19 S. In these SDS-PAGE gels, RNA secondary structures are not denatured and it can be seen that the 709-nt transcript is folded by C protein to form a unique and stable but protein-free, slow migrating secondary structure (upper band in fractions 8-10 of Fig. 8A).
FIG. 7. Gel mobility shift analysis of native C protein interaction with short oligonucleotides. In each case, the substrate was incubated with an increasing concentration of full-length C protein (0-50 nM). A, binding of the SELEX-identified winner sequence (14-mer) to C protein was monitored; however, a slower migrating band was not observed, indicating no complex formation. B, r(U)14 was the probe used in this second experiment. Minimal interaction can be observed at the highest C protein concentration (50 nM). C, a small portion of the β-globin sequence (20-mer) also does not interact with C protein at the concentrations used.

FIG. 8. Velocity sedimentation analysis of the 19 S complex. A radiolabeled 709-nt transcript was incubated with native C protein and then loaded onto a 15-30% glycerol gradient. The gradient was spun at 4°C for 24 h at 134,400 × g and then fractionated into 1-ml fractions. The fractions were ethanol precipitated and then electrophoresed on an 8.75% Laemmli gel. The gel was stained for protein and then dried. RNA was visualized by phosphorimaging. A, distribution of RNA throughout the gradient. Fraction numbers are labeled above the gels. The free 709-nt RNA is found in fractions 12 and 13. RNA that is part of the 19 S complex can be seen in fractions 8, 9, and 10. B, distribution of Coomassie-stained protein throughout the gradient. The result shown here indicates that C protein sedimented in the corresponding 19 S fractions (fractions 8, 9, and 10). Because C protein was present in excess, unbound protein is observed in fractions 14 and 15. The two isoforms of C protein, C1 and C2, are as indicated.

To observe the effects of U2 snRNA on 19 S complex formation, C protein (300 nM) was incubated with both U2 snRNA (46 nM) and the 709-nt transcript (40 nM) simultaneously. In Fig. 9A, it can be seen that most of the pre-mRNA moiety was assembled into 19 S complex. Even though the snRNA used in these experiments was labeled at approximately one-third the specific activity of the 709-nt transcript, it is clear that the U2 snRNA was not packaged with pre-mRNA in the 19 S complex. This demonstrates that, in competition with U2 snRNA, C protein preferentially binds long pre-mRNAs. A small amount of C protein is, however, present in the same fractions as the bulk of U2 snRNA (fractions 12 and 13, panels A and B). From these results alone, it cannot be determined whether the protein in these fractions is bound to U2 snRNA alone, is bound to both U2 snRNA and the 709-nt transcript, or merely cosediments with them by coincidence. To resolve this question, reconstitution experiments were performed under conditions of RNA excess (snRNA = 310 nM, 709-nt transcript = 71 nM, C protein = 110 nM). U2 snRNA was again recovered with protein that was not committed to 19 S complex (data not shown). Finally, in the absence of U2 snRNA, the distribution of protein and pre-mRNA typically appears symmetrical about the 19 S peak. These findings indicate that in vitro U2 snRNA does not effectively block 19 S complex assembly on a length of RNA containing the elements of a functional splice junction. The moderate attenuation of 19 S complex formation seen in the presence of U2 snRNA is not inconsistent with C protein's moderate affinity for U2 snRNA. Densitometry of stained bands (shown in Figs. 8B and 9B) reveals that when present at approximately equal molar concentrations with pre-mRNA, U2 snRNA attenuates 19 S complex formation by approximately 25%.
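The densitometric comparison behind an attenuation figure of this kind can be sketched as follows. This is only an illustration of the arithmetic, not the procedure used in the study: the function names and the band intensities are hypothetical placeholders, and only the assignment of fractions 8-10 to the 19 S peak is taken from the figure legend.

```python
from typing import Dict, Iterable

# Fractions assigned to the 19 S peak in the Fig. 8 legend.
PEAK_FRACTIONS = (8, 9, 10)


def fraction_in_peak(intensities: Dict[int, float],
                     peak_fractions: Iterable[int] = PEAK_FRACTIONS) -> float:
    """Share of total stained-band intensity found in the peak fractions."""
    total = sum(intensities.values())
    if total <= 0:
        raise ValueError("total intensity must be positive")
    peak = sum(intensities.get(f, 0.0) for f in peak_fractions)
    return peak / total


def percent_attenuation(control_share: float, test_share: float) -> float:
    """Percent drop in the peak share relative to the control gradient."""
    if control_share <= 0:
        raise ValueError("control share must be positive")
    return 100.0 * (1.0 - test_share / control_share)


if __name__ == "__main__":
    # Hypothetical placeholder intensities (arbitrary units), NOT measured data.
    without_u2 = {8: 3.0, 9: 5.0, 10: 3.0, 14: 0.5, 15: 0.5}
    with_u2 = {8: 2.0, 9: 4.0, 10: 2.0, 12: 1.5, 13: 1.5, 14: 0.5, 15: 0.5}
    drop = percent_attenuation(fraction_in_peak(without_u2),
                               fraction_in_peak(with_u2))
    print(f"attenuation of 19 S complex formation: {drop:.1f}%")
```

Normalizing each gradient to its own total intensity before comparing the two is a design choice made here so that differences in gel loading or staining do not masquerade as attenuation.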
The CS-RRM May Function in Tetramer-Tetramer Interactions in the 19 S Complex
Having observed that C protein's high affinity interaction with both pre-mRNA and snRNA is mediated through its bZLM motif, experiments were designed to gain further information on the role of the amino-terminal CS-RRM in the assembly of the 19 S C protein-RNA complex. Site-specific mutants of the CS-RRM were constructed and characterized regarding their ability to form 19 S structures. In previous experiments where the amino-terminal domain was mixed with r(U)8, NMR spectra revealed that residues 8-10 (KTD) and 13-16 (SMNS) are involved in associations with RNA (45). In the present study, these residues were changed to alanines, creating two mutants: 8KTD/AAA and 13SMNS/AAAA (Fig. 10B). These constructs were separately incubated with the 709-nt β-globin transcript and resolved in 15-30% glycerol density gradients. Like wild type C protein, the 8KTD/AAA mutant was competent in 19 S complex assembly (Fig. 10A). However, the 13SMNS/AAAA construct was not only defective in 19 S complex assembly but also directed the assembly of a faster sedimenting anomalous ribonucleoprotein complex. Because the latter construct functioned well in RNA binding but caused the assembly of an artifactual ribonucleoprotein complex, the CS-RRM may function in the RNA-activated tetramer-tetramer contacts that direct 19 S complex assembly.

DISCUSSION

Both equilibrium and nonequilibrium binding studies reveal here that purified native C protein interacts with three snRNAs (U1, U2, and U6) and a 116-nt ribosomal DNA transcript with significant in vitro affinities (7.5-50 nM). The differences observed do not, however, appear sufficient to direct in vivo binding specificity (46).
Although one binding assay monitors the stability of protein-RNA complexes under nonequilibrium conditions (the band shift assay) and the other monitors binding in solution under equilibrium conditions (fluorescence spectroscopy), both assays revealed the same order of binding affinity (U2 > U1 > 116 > U6). It was observed here that C protein binds snRNAs with much higher affinity than it binds a 14-nt homoribopolymer of uridine or the SELEX-identified "winner" sequence. More specifically, no binding was observed in the band shift assay and, in a separate study, 500-fold higher concentrations of these oligonucleotides were required to monitor binding under equilibrium conditions (47). Although C protein binds snRNAs with relatively high affinity, in competition with a 709-nt human β-globin transcript, U2 snRNA, at slight molar excess, did not effectively block C protein from folding the bulk of the pre-mRNA into the 19 S C protein-RNA complexes that nucleate 40 S hnRNP core particle assembly. This finding is consistent with our previous report that C protein binds pre-mRNA through a cooperative binding mode (36).
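To give a feel for what the reported affinity range implies, the sketch below computes fractional saturation for a simple noncooperative 1:1 binding model, θ = [P] / (Kd + [P]). The single-site model, the approximation that free protein equals total protein, and the name `fraction_bound` are illustrative assumptions rather than the analysis used in this study; the two Kd values are simply the ends of the reported range (7.5-50 nM), and 50 nM is the highest protein concentration used in the band shift assays.

```python
def fraction_bound(protein_nM: float, kd_nM: float) -> float:
    """Fraction of RNA bound for a 1:1 model (assumed): P / (Kd + P)."""
    if protein_nM < 0:
        raise ValueError("protein concentration must be non-negative")
    if kd_nM <= 0:
        raise ValueError("Kd must be positive")
    return protein_nM / (kd_nM + protein_nM)


# Ends of the affinity range reported for the snRNA substrates.
KD_RANGE_NM = {"tightest reported Kd": 7.5, "weakest reported Kd": 50.0}

# Highest C protein concentration used in the gel shift titrations.
PROTEIN_NM = 50.0

if __name__ == "__main__":
    for label, kd in KD_RANGE_NM.items():
        theta = fraction_bound(PROTEIN_NM, kd)
        print(f"{label} ({kd} nM): fraction bound = {theta:.2f}")
```

Under these assumptions, a substrate at the weak end of the range is half saturated at 50 nM protein, whereas one at the tight end is mostly bound. Because C protein also binds long RNA cooperatively, this isotherm should be read only as a rough guide for short substrates.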
In a previous study on C protein-snRNA interactions, Temsamani and Pederson (12) did not observe UV-mediated cross-linking of C protein to U1 snRNA. They did, however, report that C protein can be cross-linked to stem-loop I of U2 snRNA. From the results described here (that C protein binds U1 and U2 snRNA with similar affinities), one might expect that C protein would cross-link equivalently to U1 and U2 snRNA. This discrepancy may be explained by the intrinsically high photoreactivity of uridine and the presence of four peripherally located U's in the loop of stem-loop I of U2 snRNA (48,49). In contrast, U1 snRNA does not possess a loop containing four uridines. In a second study utilizing UV irradiation, it was observed that C protein is preferentially recovered in small subfractions of U6 snRNA that contain unusually long poly(U) sequences at the 3′ terminus. Based on additional findings, it was further concluded that C protein may function to specifically dissociate U4-U6 complexes possessing this unique subset of U6 snRNAs. The binding affinities reported here are not inconsistent with an RNA-dissociating activity, but they suggest that the enhanced photoreactivity of uridine may underlie observations implying binding specificity based on UV-induced cross-links.
FIG. 10. Velocity sedimentation of the site-specific C protein mutants 8KTD/AAA and 13SMNS/AAAA. Each mutant was incubated with a 709-nt RNA and then loaded onto a 15-30% gradient. The gradients were spun at 4°C for 24 h at 134,000 × g, fractionated, and then monitored by pumping them through a 30-μl flow cell of a diode array spectrophotometer. A, shown are the spectrophotometric results of the effects of these mutations on 19 S complex formation. Although the 8KTD/AAA mutant behaved like the native C protein tetramer, the 13SMNS/AAAA mutant could not efficiently form 19 S complexes. Instead, a larger anomalous ribonucleoprotein particle formed. B, schematic representation of C protein. Important regions within the protein are identified. The locations of the site-specific mutations are indicated at the amino terminus of the CS-RRM.

Because the C protein tetramer possesses four copies each of two different RNA binding domains (the CS-RRM and the bZLM), it was of interest to determine if C protein might specifically bind snRNAs through its CS-RRM motif. Such an activity could function in vivo to recruit snRNPs to nascent transcripts, to block C protein binding at branch point/splice junctions, or perhaps to displace nonspecifically bound C protein from splice junctions. Previous studies have shown that the bZLM is the primary determinant of C protein's high affinity interaction with pre-mRNAs. Two observations described here indicate that the interaction between C protein and snRNA is also mediated via the bZLM motif. First, in reconstitution studies where both pre-mRNA and U2 snRNA were present at equal molar concentration prior to C protein addition, no U2 snRNA was recovered in the 19 S C protein-pre-mRNA complexes.
Second, the M1-F115 amino-terminal construct possessing the CS-RRM binds U2 and U6 snRNA with significantly reduced affinity (8-12-fold), whereas the Y119-290 construct, containing the bZLM motif, binds these RNAs with affinities equal to or higher than wild type C protein. These two findings support our previous suggestion that the CS-RRM may function as a negative allosteric modulator of C protein-RNA interactions (19). The absence of U2 snRNA in 19 S C protein-RNA complexes formed in the presence of U2 snRNA indicates that the CS-RRM does not independently bind RNA. Other proteins in which deletion constructs containing the primary determinant for binding bind more tightly than the full-length protein include the Drosophila Sex-lethal protein, the E. coli σ70 subunit of RNA polymerase, and the mammalian glucocorticoid receptors (50). The finding that the CS-RRM is not the primary RNA binding domain is not without precedent. It has been observed previously that the COOH-terminal RNA binding domain of the human U1A protein does not bind to any of the following RNAs: snRNAs, an RNA hairpin, rA16, rU16, rC16, rA3U3GUA4, or random RNAs (51).
A finding of considerable interest is the appearance of two concentration-dependent RNA-protein complexes when either of the deletion constructs binds U2 snRNA. This result is U2 snRNA-dependent, as we did not observe second complex formation at high protein concentrations with the other RNAs tested. The binding of a second mole of protein to U2 snRNA may relate to the size of U2 (188 nt) versus the other RNA substrates (U6 at 107 nt, the 116-nt control, and U1 at 178 nt). In fact, the overall order of binding affinity reported here correlates strongly with RNA length (including the absence of stable binding to r(U)14 and the winner oligonucleotide). In this context, we have shown previously that a single mole of native C protein occludes approximately 230 nt of RNA; however, slightly longer substrates can support the binding of a second tetramer (4). Thus, U2 snRNA appears to be of sufficient length for a second mole of truncated protein to bind at high protein concentrations. The binding of a second mole of protein to substrates of sufficient length has been reported by Kanaar et al. (50). Using gel mobility shift assays, they determined that two Drosophila Sex-lethal proteins will bind to transcripts containing two poly(U) sequences in a concentration-dependent manner.
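The stated correspondence between substrate length and affinity can be checked directly from the numbers given in the text. The following minimal sketch ranks the four substrates by length and compares the result with the reported affinity order (U2 > U1 > 116 > U6); the variable and function names are illustrative only.

```python
# Substrate lengths in nucleotides, as given in the text.
RNA_LENGTH_NT = {"U2": 188, "U1": 178, "116": 116, "U6": 107}

# Reported order of binding affinity, highest first.
AFFINITY_ORDER = ["U2", "U1", "116", "U6"]


def order_by_length(lengths: dict) -> list:
    """Return substrate names sorted from longest to shortest."""
    return sorted(lengths, key=lengths.get, reverse=True)


def length_matches_affinity(lengths: dict, affinity_order: list) -> bool:
    """True when ranking by length reproduces the affinity ranking."""
    return order_by_length(lengths) == list(affinity_order)


if __name__ == "__main__":
    print("by length:  ", " > ".join(order_by_length(RNA_LENGTH_NT)))
    print("by affinity:", " > ".join(AFFINITY_ORDER))
    print("same order: ", length_matches_affinity(RNA_LENGTH_NT, AFFINITY_ORDER))
```

With only four substrates, an identical ranking is suggestive rather than statistically compelling, which is in keeping with the cautious wording of the paragraph above.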
In our efforts to evaluate the individual roles of the CS-RRM and the bZLM in snRNA binding, an observation of particular significance deserves mention here. Namely, two C protein constructs, each possessing mutations at sites in the CS-RRM thought to function in RNA binding (8KTD/AAA and 13SMNS/AAAA), showed no attenuation in RNA binding activity. However, the second construct revealed clear defects in the RNA-activated tetramer-tetramer interactions that lead to 19 S complex formation. These findings, when taken together with the CS-RRM's low affinity for RNA, indicate that the amino-terminal RRM may play a major role in protein-protein interactions. A precedent for this possibility exists in the findings of others. More specifically, the NH2-terminal RRM of U2 snRNP protein U2B″ is responsible for interacting with snRNP U2A′, and the residues involved in protein-protein interactions apparently function in binding to U2 snRNA as well (52,53). Similarly, both RNA binding domains of the Drosophila Sex-lethal protein are required for homodimerization (54). Additionally, it has been reported that the mammalian spliceosome-associated protein, SAP-49, interacts with SAP-145 through its two RRMs, and that the RRM of snRNP U1A is responsible for the protein's homodimer formation (55,56).