Mutational Definition of RNA-binding and Protein-Protein Interaction Domains of Heterogeneous Nuclear RNP C1*

The heterogeneous nuclear ribonucleoprotein (hnRNP) C proteins, among the most abundant pre-mRNA-binding proteins in the eukaryotic nucleus, have a single RNP motif RNA-binding domain. The RNA-binding domain (RBD) comprises ∼80–100 amino acids, and its structure has been determined. However, relatively little is known about the role of specific amino acids of the RBD in binding to RNA. We have devised a phage display-based screening method for the rapid identification of amino acids in hnRNP C1 that are essential for its binding to RNA. The identified mutants were further tested for binding to poly(U)-Sepharose, a substrate to which wild type hnRNP C1 binds with high affinity. We found both previously predicted, highly conserved residues and additional residues in the RBD to be essential for C1 RNA binding. We also identified three mutations in the leucine-rich C1-C1 interaction domain near the carboxyl terminus of the protein that both abolished C1 oligomerization and reduced RNA binding. These results demonstrate that although the RBD is the primary determinant of C1 RNA binding, residues in the C1-C1 interaction domain also influence the RNA binding activity of the protein. The experimental approach we describe should be generally applicable to the screening and identification of amino acids that play a role in the binding of proteins to nucleic acid substrates.

The heterogeneous nuclear ribonucleoprotein (hnRNP) 1 proteins are a group of about 20 abundant proteins that bind nascent RNA polymerase II transcripts and are involved in various aspects of pre-mRNA processing, mRNA transport, and mRNA metabolism (1–4). Among them, the hnRNP C1 protein is one of the most avid pre-mRNA-binding proteins, and it has been shown to bind preferentially to uridine-rich RNA sequences (5–9). In vitro selection/amplification from pools of random sequence RNA (SELEX procedure) demonstrated that C1 binds avidly to sequences containing a stretch of five or more uridines, its high affinity "winner sequence," with an apparent dissociation constant (Kd) of about 170 nM (10). Recent studies have reported that C1 protein also binds specifically and with high affinity to several U snRNAs (11, 12). C1 also binds U6 snRNA, which contains an elongated uridylated stretch at the 3′ end, and induces disruption of U4:U6 snRNA base pairing (13).
The RNA binding activity of hnRNP C1 has been thought to be mainly conferred by its single RNP motif RNA-binding domain (RBD) comprising the amino-terminal 94 amino acids (9, 10). The RBD (also referred to as RRM, which stands for RNA-recognition motif) is the most prevalent RNA-binding motif in eukaryotes (1, 14–20). It is an evolutionarily conserved domain present in pre-mRNA-, mRNA-, pre-rRNA-, and snRNA-binding proteins, including hnRNP proteins, splicing factors, and polyadenylation factors (1, 15). The RBD comprises ∼80–100 amino acids in which two consensus sequences, an octapeptide termed RNP1 and a hexapeptide termed RNP2, about 30 amino acids apart, and many other hydrophobic amino acids are particularly highly conserved (15, 17, 18, 20). The RBD is folded into a compact βαββαβ domain structure (21–23). The four-stranded antiparallel β-sheet is packed against the two perpendicularly oriented α-helices. The highly conserved RNP1 and RNP2 consensus sequences are juxtaposed on the central β3 and β1 strands, respectively. NMR studies of the hnRNP C1 RBD bound to its high affinity RNA substrate suggest that the β-sheet, the loops connecting the strands of the sheet, and the contiguous NH2- and COOH-terminal regions of the RBD together form an exposed platform for direct and specific RNA binding (9). Crystal structure studies of the U1A RBD complexed with its cognate U1 snRNA stem-loop II further support the view that the conserved RNP1 and RNP2 and the COOH-terminal extension of the RBD interact extensively with RNA (24). Additionally, site-directed mutagenesis has been carried out on the RBDs of many RNA-binding proteins, and these studies have pinpointed several amino acids, particularly in the conserved RNP1 and RNP2, as essential residues for the RNA binding activity of this domain.
For example, residues Asn9, Thr11, Tyr13, Gln54, Phe56, and Gln83 of U1A, which correspond to Asn15, Arg17, Phe19, Phe52, Phe54, and Asn83, respectively, in hnRNP C1, were identified as such essential amino acids (25–28).
More recent studies, however, have shown that C1 lacking the canonical RBD retained considerable RNA binding in vitro to U1, U2, and U6 snRNA as well as to its SELEX winner sequence (12, 29). It was suggested that instead of the RBD, the major determinant for C1 RNA binding is a highly basic domain that consists of residues from Val140 to Asn161 and immediately precedes a leucine zipper motif. The zipper motif forms a coiled-coil structure that mediates C1 oligomerization. In general, the basic zipper motif present in C1 protein is reminiscent of the DNA-binding bZIP motifs found in many transcription factors. However, compared with the essential role of the RBD, the importance of the bZIP motif in RNA binding by intact C1 has not been fully established, because site-directed mutagenesis failed to generate expressible proteins for functional assays (12, 29). Here we describe a method to systematically identify, in the context of full-length C1 protein, amino acids that are essential for RNA binding. It is based on a functional screening of a randomly mutagenized C1 expression library constructed in phage. This screen identified many conserved and thus expected residues in the RBD as essential for C1 RNA binding. It also implicated previously unidentified amino acids, particularly residues in the C1-C1 interaction domain (CID), in binding to RNA. Additionally, the mutations identified in the CID resulted in a defect in C1-C1 interaction, indicating a connection between the ability of C1 to form oligomers and the RNA binding activity of this protein. This functional screening method should be generally applicable to any protein of interest to identify amino acids that are required for binding to its cognate nucleic acid or protein ligands.

EXPERIMENTAL PROCEDURES
Construction of the hnRNP C1 Mutant Library-To generate random point mutations in the coding region of the hnRNP C1 protein, error-prone PCR was performed using the standard protocol according to Leung et al. (30) with several modifications. The entire coding region of human hnRNP C1 cDNA (873 base pairs) was amplified by using the plasmid pHC12 (31) as the template and by using primers 5′-TCGAATTCGATGGCCAGCAACGTT-3′ and 5′-CAGGCTCGAGACCCCACTATGTGCTTAA-3′, which contain an EcoRI or XhoI restriction enzyme site, respectively. Mutation frequency was estimated to be about 0.25–0.4% when using the following PCR conditions. Reaction mixtures (100 µl) contained 10 ng of template, 80 pmol of each primer, 1 mM of each dNTP, 16.6 mM (NH4)2SO4, 67 mM Tris-HCl, pH 8.8, 6.1 mM MgCl2, 6.7 µM EDTA, pH 8.0, 0.17 mg/ml bovine serum albumin, 10% dimethyl sulfoxide, 10 mM β-mercaptoethanol, and 4 units of Taq DNA polymerase (PerkinElmer Life Sciences). Four identical but separate PCR reactions were subjected to 28 cycles of 95°C for 1 min, 54°C for 1 min, and 72°C for 3 min. The pooled PCR products were extracted with phenol/chloroform/isoamyl alcohol (25:24:1), ethanol precipitated, cut with EcoRI and XhoI, and purified on a 1% (w/v) agarose gel. The purified EcoRI-XhoI fragments were then ligated into the EcoRI-XhoI cleaved Uni-Zap XR vector arms (Stratagene). The ligation mixture was packaged using the Gigapack II Packaging Extract kit (Stratagene) according to the manufacturer's suggested conditions to generate a Uni-Zap phage library containing C1 cDNAs with random mutations. The titer of this library was ∼1 × 10⁶ plaque-forming units/ml.
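As a rough illustration of what the stated mutation frequency implies for library design, the sketch below converts the per-base frequency (0.25–0.4%) and the coding-region length (873 base pairs) into an expected number of substitutions per clone. The Poisson assumption used to estimate the share of unmutated clones is ours, not the authors', and the function names are hypothetical.

```python
from math import exp

CODING_LENGTH_BP = 873                   # coding region length given in the text
MUTATION_FREQUENCIES = (0.0025, 0.004)   # 0.25% and 0.4% per base, from the text


def expected_mutations(length_bp: int, freq: float) -> float:
    """Mean number of base substitutions per amplified clone."""
    return length_bp * freq


def fraction_unmutated(length_bp: int, freq: float) -> float:
    """Share of clones carrying no substitution.

    ASSUMPTION: substitutions are independent and Poisson distributed;
    the source does not state a distribution.
    """
    return exp(-expected_mutations(length_bp, freq))


for freq in MUTATION_FREQUENCIES:
    mean = expected_mutations(CODING_LENGTH_BP, freq)
    clean = fraction_unmutated(CODING_LENGTH_BP, freq)
    print(f"{freq:.2%} per base: {mean:.1f} substitutions/clone, "
          f"{clean:.1%} of clones unmutated")
```

Under these assumptions a typical clone carries roughly two to three substitutions, which fits the later observation that some clones contained a second mutation in addition to a silent one.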
Screening the Mutant Library and Identifying Mutations-To screen the mutant library, Escherichia coli XL-1 Blue bacteria were infected at ∼200 plaque-forming units/plate and plated onto 50 LB plates (100 mm) with top agarose. After incubation at 37°C for 4 h, each plate was overlaid with a nitrocellulose filter impregnated with 10 mM isopropylthio-β-D-galactoside. Following incubation for 6–8 h at 37°C, the filter was lifted, replaced with another isopropylthio-β-D-galactoside-treated nitrocellulose filter, and incubated at 37°C overnight. The first set of filters was immunoblotted with the anti-C1 monoclonal antibody 4F4 and then with 125I-labeled goat anti-mouse F(ab′)2 as previously described (32). After washing, the immunoblotted filters were exposed to x-ray film. The second set of filters was incubated in screening buffer (50 mM Tris-HCl, pH 7.5, 100 mM NaCl, 1 mM EDTA, and 10 mM dithiothreitol) for 30 min at room temperature with constant agitation and then blocked with 3% nonfat dried milk and 0.02% sodium azide in phosphate-buffered saline without MgCl2 for 1 h at room temperature. After a brief wash with screening buffer, the filters were incubated in 100 ml of screening buffer containing denatured salmon sperm DNA (0.1 mg/ml) for 30 min at room temperature. Then, about 10 pmol of 32P-labeled oligo(dT)25 (25-mer) was added to the screening buffer, and the incubation was continued for 1 h. After washing five times with screening buffer, the filters were exposed to x-ray films. Plaques scoring positive for 4F4 and negative for oligo(dT)25 were isolated and excised using ExAssist helper phage and SOLR E. coli strain as the host cell (Stratagene) according to the manufacturer's recommendations. The mutated bases within the C1 cDNA from these clones were identified by DNA sequencing using the Sequenase version 2.0 DNA sequencing kit (United States Biochemicals). Samples from each ddNTP-terminated reaction were run as single base ladders on polyacrylamide gels to facilitate rapid identification of mutations.
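The selection rule used on the replica filters (positive for 4F4, negative for oligo(dT)25) can be written as a simple filter. The sketch below is only an illustration of that logic; the plaque identifiers and the dictionary layout are hypothetical and do not come from the paper.

```python
def select_candidates(plaques):
    """Return ids of plaques that express full-length C1 (4F4 positive)
    but fail to bind the nucleic acid probe (oligo(dT)25 negative)."""
    return [p["id"] for p in plaques if p["4F4"] and not p["oligo_dT25"]]


# Hypothetical scoring of four plaques read from the two replica filters.
example_plaques = [
    {"id": "p1", "4F4": True,  "oligo_dT25": True},   # wild type-like
    {"id": "p2", "4F4": True,  "oligo_dT25": False},  # candidate RNA-binding mutant
    {"id": "p3", "4F4": False, "oligo_dT25": False},  # truncated or not expressed
    {"id": "p4", "4F4": False, "oligo_dT25": True},   # no 4F4 epitope, not selected
]

print(select_candidates(example_plaques))
```

Requiring the 4F4 signal is what excludes clones that lose probe binding simply because they do not express the carboxyl-terminal portion of the protein.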
Production of Wild Type and Mutant C1 Proteins-Wild type and mutant hnRNP C1 proteins were produced in vitro using the TnT T7/T3 polymerase coupled rabbit reticulocyte lysate system (Promega) in the presence of [35S]methionine (Amersham Pharmacia Biotech) according to the manufacturer's protocols. The constructs used for TnT were either in pBluescript SK(−) vector (Stratagene) and transcribed by T3 RNA polymerase or in pcDNA3 vector (Invitrogen) and transcribed by T7 RNA polymerase. To produce recombinant proteins from E. coli, wild type and mutant C1 cDNA fragments were subcloned into EcoRI and XhoI cleaved pET-28b vector (Novagen) to produce His-tagged proteins or pGEX-5X-3 vector (Amersham Pharmacia Biotech) to produce GST-tagged proteins. The resulting plasmids were transformed into BL21(DE3) E. coli (Novagen) for expression and purification of the fusion proteins according to the manufacturer's recommendations.
Site-directed Mutagenesis-The MT2N mutant was constructed using PCR to generate two partial COOH-terminal fragments. The more 5′-terminal of these fragments spanned the unique BsrGI site in C1. The 3′ primer for this fragment generated the V194N mutation and a silent mutation that formed an XhoI site at amino acids 198 and 199. The second fragment was amplified by PCR to contain the same engineered XhoI site, the L201N mutation, and the remaining COOH-terminal end of C1 followed by an XbaI site. These fragments were digested with BsrGI-XhoI and XhoI-XbaI, respectively, and both were subsequently inserted into BsrGI-XbaI cleaved pcDNA3-Myc-C1 plasmid (33) that contained the amino-terminal portion of the C1 cDNA. The mutations were confirmed using the Sequenase version 2.0 DNA sequencing kit (United States Biochemicals).
Ribonucleotide Homopolymer Bead Binding Assays-Binding of in vitro transcribed and translated hnRNP C1 and mutants to AGpoly(U) type 6 beads (Amersham Pharmacia Biotech) was carried out as previously described (6). 5 µl of in vitro produced proteins were used in each binding reaction. The poly(U)-binding buffer contained 10 mM Tris-HCl, pH 7.5, 2.5 mM MgCl2, 0.5% Triton X-100, 1 µg/ml pepstatin A, 1 µg/ml leupeptin, 0.5% aprotinin, and various concentrations of NaCl as indicated.
In Vitro Protein Binding Assays-In vitro protein interaction assays were carried out as previously described (34). Briefly, purified GST or GST-C1 proteins (2 µg) bound to 30 µl of glutathione-Sepharose 4B resin (Amersham Pharmacia Biotech) were incubated with 5 µl of in vitro translated protein in 500 µl of binding buffer (50 mM Tris-HCl, pH 7.5, 100 mM NaCl, 2 mM EDTA, 0.1% Nonidet P-40, 1 µg/ml leupeptin, 1 µg/ml pepstatin A, and 0.5% aprotinin). Following incubation at 4°C for 1 h, the resin was washed five times with 1 ml of binding buffer. Bound proteins were then eluted in SDS-PAGE sample buffer, separated by 12.5% SDS-PAGE, and visualized by fluorography.
In Vitro Transcription and Gel Mobility Shift Assays-In vitro transcription and purification of U2 snRNA were done as previously described (35). The 32P-labeled and polyacrylamide gel-purified RNA probe (2 × 10⁴ cpm) was incubated with the indicated recombinant His-tagged C1 proteins (25–250 nM) in 20 µl of reaction mixture containing 50 mM Tris-HCl, pH 7.5, 100 mM NaCl, 1 mg/ml yeast RNA (Sigma), and 5% glycerol at room temperature for 20 min. Reaction mixtures were loaded onto 4% nondenaturing polyacrylamide gels with 1/4× TAE buffer (40 mM Tris-acetate, pH 7.5, 1 mM EDTA) that had been prerun at 100 V for 1 h. Gels were run at 100 V until bromphenol blue (loaded into an empty lane) migrated about 10 cm into the gel. The gel was then dried for autoradiography.
Protein Cross-linking and Western Blot Analysis-For cross-linking, gel mobility shift reactions were prepared with cold RNA probe, but instead of loading onto gels, glutaraldehyde (final concentration, 0.01%) was added, and the reaction mixtures were incubated at room temperature for 30 min. Reactions were stopped by adding 1/10 volume of 1 M Tris-HCl, pH 7.5 (36). Proteins were separated by 10% SDS-PAGE and transferred to a nitrocellulose membrane. Western blotting was carried out essentially as previously described (32), using 4F4 at a 1:1,000 dilution.

RESULTS
Identification of RNA-binding Mutants of hnRNP C1-To identify hnRNP C1 mutants that do not bind RNA, a phage expression library constructed from randomly mutated hnRNP C1 cDNA was screened with a high affinity nucleic acid probe. To generate random point mutations in C1, error-prone PCR was performed using the entire coding region of C1 as the template. Under the conditions used, this technique has been shown to produce a mutation frequency of about 0.25–0.4% over 30 cycles of PCR amplification (30, 37), based on ∼10⁻⁴ errors/nucleotide synthesized by Taq DNA polymerase (38). To ensure that library clones expressed full-length or near full-length C1 protein, a set of replica library filters was screened with 4F4, which binds to an epitope close to the carboxyl terminus of C1. 2 Thus, clones not producing full-length C1 will not be detected with 4F4. Another replica set of library filters was probed with 32P-labeled oligo(dT)25. We used oligo(dT)25 instead of poly(U), the more typically used high affinity substrate for C1 (6, 10), because it is more stable and produces lower background than poly(U) (data not shown). Furthermore, in the in vitro bead binding assay, oligo(dT)25 binds to C1 as well as poly(U) does. 3 Fig. 1 (A and B) shows filter duplicates probed with 4F4 and oligo(dT)25, respectively. Phage plaques that are positive with 4F4 but negative with oligo(dT)25 represent phage that likely express full-length C1 proteins but contain mutations that reduce RNA binding. After screening ∼10,000 clones, we obtained 60 clones that had lost oligo(dT)25 binding capacity.
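The figures quoted in this paragraph can be checked with simple arithmetic. The sketch below uses a deliberately crude model, in which the cumulative mutation frequency is the polymerase error rate multiplied by the number of cycles, to see whether ∼10⁻⁴ errors per nucleotide over 30 cycles lands in the quoted 0.25–0.4% range, and it computes the fraction of screened clones recovered as binding-defective. The linear accumulation model is our simplifying assumption, not a statement of how the cited studies derived their values.

```python
TAQ_ERROR_RATE = 1e-4     # errors per nucleotide synthesized (from the text)
CYCLES = 30               # PCR cycles quoted in the text
CLONES_SCREENED = 10_000  # approximate number of clones screened
CLONES_RECOVERED = 60     # clones that lost oligo(dT)25 binding


def cumulative_mutation_frequency(error_rate: float, cycles: int) -> float:
    """Crude estimate: errors accumulate linearly with the number of cycles.

    ASSUMPTION: every cycle copies each template once and errors are never
    reverted; real accumulation depends on the actual number of doublings.
    """
    return error_rate * cycles


def hit_rate(recovered: int, screened: int) -> float:
    """Fraction of screened clones scored as binding-defective."""
    return recovered / screened


freq = cumulative_mutation_frequency(TAQ_ERROR_RATE, CYCLES)
print(f"estimated mutation frequency: {freq:.2%}")
print(f"screen hit rate: {hit_rate(CLONES_RECOVERED, CLONES_SCREENED):.2%}")
```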
The phagemids containing mutated hnRNP C1 coding regions were excised in vivo from the phage, and the entire coding region of each clone was sequenced (39). The sequencing gels were run as single nucleotide ladders to facilitate rapid identification of mutations. A portion of such a sequencing gel illustrating representative mutations is shown in Fig. 1C. For example, in clone 3, a guanine is mutated to a thymine, which results in a Gln to His change at amino acid 56 (Q56H), and in clone 5, a cytosine is changed to a thymine, resulting in a silent mutation at the amino acid level. The loss of oligo(dT)25 binding activity of clone 5 is attributed to the presence of a second mutation (data not shown). Among a total of 60 clones identified in the screening and sequenced, we found mutations at 29 amino acid positions. Mutations at a few positions were identified multiple times (data not shown).
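To make the link between a nucleotide change and its amino acid consequence concrete, the sketch below uses the standard genetic code to list which single G → T substitutions convert a Gln codon into a His codon, and to classify an arbitrary point substitution as silent, missense, or nonsense. The function names are hypothetical, and the TTC → TTT example is only a stand-in for a silent C → T change because the text does not give the codon affected in clone 5.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def g_to_t_routes(from_aa: str, to_aa: str):
    """List (codon, mutant codon) pairs where one G -> T change
    converts from_aa into to_aa (one-letter amino acid codes)."""
    routes = []
    for codon, aa in CODON_TABLE.items():
        if aa != from_aa:
            continue
        for i, base in enumerate(codon):
            if base == "G":
                mutant = codon[:i] + "T" + codon[i + 1:]
                if CODON_TABLE[mutant] == to_aa:
                    routes.append((codon, mutant))
    return routes


def classify_substitution(codon: str, position: int, new_base: str) -> str:
    """Classify a point substitution at a 0-based codon position."""
    mutant = codon[:position] + new_base + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutant]
    if before == after:
        return "silent"
    return "nonsense" if after == "*" else "missense"


print(g_to_t_routes("Q", "H"))              # Gln -> His via G -> T
print(classify_substitution("TTC", 2, "T"))  # illustrative C -> T change
```

In the standard code the only G → T route from Gln to His is CAG to CAT, a third-position change, which matches the kind of single-band shift read from the sequencing ladder.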
To verify that the mutants scored by the filter binding screening of the library indeed produced RNA binding defective hnRNP C1, the mutant C1 proteins were produced by in vitro transcription and translation and tested for binding to poly(U) beads at 0.1 M NaCl. The results of the binding assays for several of these mutants are shown in Fig. 2. Compared with wild type C1 protein, mutants F19S and Q56H completely abolished poly(U) binding, and mutants S16F and G51Y significantly reduced binding. These results demonstrate that the filter binding screening successfully identified RNA binding defective mutants of hnRNP C1.
FIG. 1. A and B, phage display screening of a library containing randomly mutagenized C1 cDNAs. A shows a nitrocellulose membrane immunoblotted with 4F4. B shows a replica of this membrane probed with 32P-labeled oligo(dT)25. Arrows point to phage plaques that show a positive signal with 4F4 but a negative signal with oligo(dT)25. They indicate candidate C1 mutants defective in RNA binding. C, a portion of one sequencing gel illustrating nucleotide changes in C1 cDNA clone 1 to clone 6. To facilitate identification of mutations, sequencing reactions terminated by ddATP (lanes A), ddCTP (lanes C), ddGTP (lanes G), or ddTTP (lanes T) for each sample were run side by side. Indicated are a G → T change observed for clone 3 (asterisks) and a C → T change found for clone 5 (inverted open triangles).

The C1 mutants we characterized and their relative poly(U) binding avidities are listed in Fig. 3B. The positions of the mutations on the βαββαβ domain structure of the RBD are indicated in Fig. 3A. Most of the mutations lie within the two RNP consensus motifs that form the two central β-strands. The mutations within RNP consensus motifs, e.g. F19(S/L), G21V, F52L, and Q56H, showed no binding or severely impaired binding in the poly(U) bead binding assay. Mutations located in close proximity to the RNP consensus motifs, e.g. S16F and H49(Y/R/N), were less defective in binding to poly(U). A cluster of mutations near the COOH-terminal end of the RBD identified in the library screening (D81N, N83(D/Y), and E87G) had the same affinity as wild type C1 for poly(U) as measured by the in vitro bead binding assay. Amino acid residues Phe37, Gly41, and Ala66 are well conserved in several RBD-containing proteins, and they are located in the two α-helices (Fig. 3A).
Mutations in these three positions were identified as RNA binding defective in the initial library screening. However, these mutants displayed wild type binding activity in the poly(U) bead binding assay (Fig. 3B). For two reasons, the poly(U) bead binding assay is probably less sensitive than the library screening with oligo(dT)25 in detecting a slight decrease in binding affinity. First, the nucleic acid substrate used for screening, oligo(dT)25, is 25 nucleotides in length, whereas poly(U) beads present polyribouridylic acid chains of about 100 nucleotides. Second, the amount of poly(U) used (3 µM) was in excess over in vitro produced mutant C1 proteins (0.5 nM), whereas a relatively low concentration of oligo(dT)25 (0.1 nM) was used in the library screening. Nevertheless, the above mutations were generally expected in a successful library screening, because most of these residues are conserved in many RBD-containing proteins (Fig. 3A). Furthermore, previous studies have indicated that residues located in the β-sheet (22, 26, 28) and in the flanking amino- and carboxyl-terminal regions (9) are involved in RNA interaction.

FIG. 3. A collection of point mutations identified from the screening for RNA binding defective C1. A, sequence alignment of the RBDs from hnRNP and snRNP proteins. Consensus RNP motifs and conserved amino acid residues are in shaded boxes. Shown below is the corresponding secondary structure of the RBD, which consists of a four-stranded β-sheet flanked by two α-helices. Indicated with asterisks are amino acid residues identified in the mutated C1 proteins that are defective in oligo(dT)25 binding. The number on the left indicates the position of the first amino acid shown on the line, and the numbers on top indicate the positions of corresponding amino acid residues in hnRNP C1. B, mutants contain amino acid substitutions that are located in the RBD and in the CID. Their relative poly(U)-binding avidity at 100 mM NaCl compared with wild type C1 protein is indicated. +++, wild type binding; +, reduced binding; −, no binding; +++/−, wild type binding at 100 mM NaCl and no binding at high salt conditions; n/d, not determined.
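The sensitivity argument made for the two assays (a large excess of poly(U) on beads versus a trace concentration of oligo(dT)25 in the screen) can be illustrated with a one-site binding isotherm. In the sketch below the wild type Kd of 170 nM is borrowed from the SELEX value quoted in the Introduction purely as a placeholder, the mutant with a 10-fold weaker Kd is hypothetical, and the model ignores avidity, oligomerization, and ligand length, so the numbers are illustrative only.

```python
def fraction_occupied(ligand_nM: float, kd_nM: float) -> float:
    """One-site isotherm: fraction of protein bound at a given free ligand
    concentration (valid when ligand is not depleted by binding)."""
    return ligand_nM / (ligand_nM + kd_nM)


KD_WILD_TYPE = 170.0            # nM; placeholder taken from the SELEX Kd in the text
KD_MUTANT = 10 * KD_WILD_TYPE   # HYPOTHETICAL mutant with 10-fold weaker binding

ASSAYS = {
    "bead assay, 3 uM poly(U)": 3000.0,          # nM
    "filter screen, 0.1 nM oligo(dT)25": 0.1,    # nM
}

for label, ligand in ASSAYS.items():
    wt = fraction_occupied(ligand, KD_WILD_TYPE)
    mut = fraction_occupied(ligand, KD_MUTANT)
    print(f"{label}: wild type {wt:.4f}, mutant {mut:.4f}, "
          f"signal ratio {wt / mut:.1f}")
```

Under these assumptions a 10-fold loss of affinity changes the signal less than 2-fold when ligand is near saturating, but almost 10-fold when ligand is far below the Kd, which is the qualitative point made in the text.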
Our library screening identified amino acid changes that have not been previously predicted to be involved in the interaction with RNA. For example, multiple mutations were observed at Val28 (V28(G/I)) and at Asp71 (D71(V/G)). These residues are located at the termini of the loop structures connecting the β-sheets and the α-helices and thus are likely to be critical for the global folding of the RBD. Of particular interest, we discovered three mutations affecting C1 binding to RNA that map outside of the RBD in the CID (Fig. 3B).
The Effect of Point Mutations in CID on hnRNP C1-C1 Interaction-In isolated HeLa nuclear hnRNPs, the C1 and C2 proteins are present at a stoichiometry of (C1)3C2, possibly as tetramers (40). Bacterially expressed C1 proteins spontaneously form C1 tetramers (41). CD spectra analysis of C1 deletion mutants revealed that a leucine-rich coiled-coil domain (Leu180-Glu207) mediates C1-C1 interaction (29). Using the yeast two-hybrid system, we have also mapped CID to the same region. 4 However, it has not been previously known that in the context of full-length C1 protein, the coiled-coil motif is a determinant of C1 protein-protein interaction. Because three point mutations (L187Q, Q192P, and L201P) found in the search for RNA binding defective mutant C1 were located within the CID, we tested the effect of these mutations on C1-C1 interaction. CID is predicted to comprise four heptad repeats (Fig. 4A). Hydrophobic residues at positions 1 and 4 likely form the hydrophobic inner core of the coiled-coil. Mostly charged residues at other positions likely form the polar outer surface. The L187Q and L201P mutations change the hydrophobic leucine residue to the polar residue glutamine or to proline, respectively. The Q192P mutation changes the amino acid from polar to aliphatic. We also engineered a mutant (MT2N) that changes two hydrophobic residues (Val194 and Leu201) to polar residues by site-directed mutagenesis. All these mutations are predicted to disrupt the α-helical structure or the hydrophobic environment of the coiled-coil. We produced C1 proteins harboring such mutations individually by in vitro transcription and translation and tested their binding to bacterially expressed and purified GST fused to C1 (GST-C1). All four CID mutants abolished C1 homotypic interactions, whereas the wild type protein was able to interact with GST-C1 (Fig. 4B).
These results demonstrated that CID is critical for protein-protein interaction in the context of intact C1 protein.
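The heptad description above can be turned into a small calculation. The sketch below assigns each CID residue a position from 1 to 7, assuming that the heptad register begins at Leu180 (position 1) and runs without interruption through Glu207. That register is our assumption for illustration; the actual assignment is the one shown in Fig. 4A, which is not reproduced here.

```python
CID_START, CID_END = 180, 207   # Leu180-Glu207, from the text
CORE_POSITIONS = {1, 4}         # hydrophobic core positions named in the text


def heptad_position(residue: int) -> int:
    """Heptad position (1-7) of a CID residue.

    ASSUMPTION: the register starts at residue 180 as position 1.
    """
    if not CID_START <= residue <= CID_END:
        raise ValueError(f"residue {residue} lies outside the CID")
    return (residue - CID_START) % 7 + 1


# Residues altered in the screen (187, 192, 201) and in MT2N (194, 201).
for residue in (187, 192, 194, 201):
    pos = heptad_position(residue)
    role = "core" if pos in CORE_POSITIONS else "outer surface"
    print(f"residue {residue}: heptad position {pos} ({role})")
```

With this register the 28-residue segment divides into exactly four heptads, and positions 187, 194, and 201 fall on the same core position one heptad apart, whereas Gln192 falls outside the core, consistent with the text's description of the Leu and Val substitutions as hydrophobic-to-polar changes.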
The Effect of C1 Oligomerization on RNA Binding Activity-The finding of C1 protein-protein interaction defective mutations in the screening for RNA binding defective C1 prompted us to examine C1 oligomer RNA binding by gel mobility shift analysis. We first bacterially expressed and purified His-tagged wild type and mutant C1 proteins (Fig. 5A). The Q56H mutant has a mutation in the RBD that completely abolished poly(U) binding in the in vitro bead binding assay (Fig. 3). The L187Q mutation is located in the CID, and it disrupted C1-C1 protein interaction in vitro (Fig. 4). Previous studies have shown that the hnRNP C protein tetramer binds U1, U2, and U6 snRNAs with high affinity (11–13). Therefore, U2 snRNA was transcribed in vitro in the presence of [32P]UTP and used in gel mobility shift assays. As shown in Fig. 5B, wild type C1 readily bound the U2 snRNA probe. In contrast, the Q56H mutant did not bind U2 snRNA under the same conditions. Furthermore, the Q56H mutant was unable to bind Ad2 pre-mRNA (42) under similar gel mobility shift conditions (data not shown). These results further confirmed that the RBD is the primary determinant of C1 RNA binding activity, because a single mutation in the RNP1 consensus motif abolished C1 binding to U snRNA, pre-mRNA, and poly(U) homopolymers. When L187Q mutant C1 was used at low concentration, a faster migrating RNA-protein complex than that of wild type C1 was detected (Fig. 5B). As the amount of L187Q protein increased, we observed a gradual increase in the size of the shifted RNA-protein complex. This is likely due to binding of multiple C1 molecules to one RNA at higher protein concentrations. Nonetheless, the L187Q-U2 snRNA complex is smaller than that of the wild type C1, most likely because of the inability of L187Q to form oligomers on RNA.

4 V. Pollard and G. Dreyfuss, unpublished data.
To examine C1 oligomerization under gel mobility shift conditions, we used glutaraldehyde as a cross-linking reagent (36) and then analyzed C1 protein oligomer formation by SDS-polyacrylamide gel electrophoresis followed by Western blotting with 4F4 (Fig. 5C). At a final concentration of 125 or 250 nM, both wild type C1 and the L187Q mutant migrated at about 45 kDa, which corresponds to the size of the monomeric C1 as observed when no cross-linking reagent was added. For wild type C1 cross-linked with 0.01% glutaraldehyde, the band representing the monomer form dramatically diminished, whereas a band corresponding to the oligomeric form appeared. The size of the oligomerized C1 indicates the formation of C1 tetramers or even larger oligomers. Under the same cross-linking conditions, however, L187Q mutant C1 is almost exclusively monomeric. Taken together with the gel shift analysis shown in Fig. 5B, these results suggest that wild type C1 binds RNA as an oligomer, whereas the L187Q mutant binds RNA as a monomer.
The fact that the L187Q mutant is able to bind U2 snRNA with similar apparent affinity as the wild type C1 protein raised the question of why we found L187Q as an RNA binding defective mutant in the library screening. To address this issue, we produced wild type C1 and mutant C1 (L187Q and Q192P) by in vitro transcription and translation and performed the poly(U) bead binding assay under more stringent conditions. The ability to bind poly(U) even at 2 M NaCl is a characteristic property of C1, which distinguishes C1 and C2 as high affinity specific binders of poly(U) from the other major hnRNP proteins (6). Consistent with this, we found that wild type C1 bound to poly(U) equally well at 0.1, 1, and 2 M salt concentrations (Fig. 6). For mutants L187Q and Q192P, RNA binding was similar to that of the wild type at 0.1 M NaCl. However, the RNA binding of these mutants was significantly reduced at 1 M NaCl and completely abolished at 2 M NaCl (Fig. 6). These findings show that mutations in CID reduce C1 RNA binding under high salt conditions and indicate a role for CID in the RNA binding activity of this protein.

FIG. 5. Demonstration of U2 snRNA-binding and oligomerization by C1 wild type and mutants. A, purified recombinant proteins (2 µg each) on a Coomassie Blue-stained 12.5% SDS-polyacrylamide gel. B, gel mobility shift analysis. U2 snRNA was in vitro transcribed and labeled with [32P]UTP and incubated with increasing concentrations of the indicated C1 proteins (25–250 nM) for 20 min at room temperature. Samples were then analyzed on a 1/4× TAE 4% nondenaturing polyacrylamide gel followed by autoradiography. Both free U2 snRNA and slower migrating C1-U2 snRNA complexes are indicated. C, oligomer formation of C1 proteins under gel mobility shift conditions. Following incubation of gel shift reaction mixtures, proteins were cross-linked by 0.01% glutaraldehyde. Samples were then resolved by 12.5% SDS-PAGE and transferred to nitrocellulose membrane for Western blotting with 4F4. Positions of the monomeric and oligomeric C1 proteins are indicated. Sizes are indicated in kilodaltons.

DISCUSSION
Compared with site-directed mutagenesis and deletional mutagenesis approaches, random PCR mutagenesis coupled with phage display screening provides an easy and unbiased way to quickly identify essential amino acid residues that are involved in RNA binding in the context of full-length C1 protein. Although other mutagenic chemicals can be used (43), the production of a large pool of random mutations by use of PCR with Taq DNA polymerase under essentially standard reaction conditions is extremely simple and efficient (37). It has been reported that the mutagenesis conditions can be modified to produce the desired frequency of mutations from 0.25 to 2% within a specific region that ranges from 200 to 1,200 base pairs (30,37). The 4F4 epitope, near the carboxyl terminus of hnRNP C1, provided a convenient marker for the expression of full-length or near full-length C1 proteins. It is also possible, however, to construct any other terminal epitope tag to ensure expression of full-length proteins. This method should be readily applicable to many other RNA-protein interactions as well as DNA-protein interactions. Recently, a similar approach was used to study protein-protein interactions of Ran-binding protein 1 (RanBP1) and identified mutations that disrupt its association with RanGTPase and hence affect the function of Ran (44). Finally, this method could also be used to engineer proteins with novel ligand binding specificity or higher substrate binding affinity.
CID was defined as the coiled-coil motif that mediates C1 protein-protein interaction using CD and velocity sedimentation studies of C1 deletion constructs (29). Efforts to create C1 mutant protein containing L187A/V194A double mutations failed, because the mutations likely resulted in incorrectly folded polypeptides that are degraded rapidly in E. coli cells (29). Here we identified three C1 mutants containing point mutations in their CID (L187Q, Q192P, and L201P) that could be displayed by phage. Furthermore, we were able to produce these mutant proteins by coupled transcription and translation and by expression and purification from E. coli cells. Compared with wild type C1, these mutants are incapable of binding to C1 in vitro as measured by protein binding assays (Fig. 4B). They form a smaller complex with U2 snRNA, most likely because of a defect in oligomer formation (Fig. 5, B and C), and gel filtration analysis indicates that they form a smaller complex than wild type C1. 3 These results demonstrated that a single amino acid change in CID in the context of full-length C1 protein is detrimental to C1 oligomerization, thus establishing the crucial role of CID in mediating C1-C1 interaction.
LeStourgeon and colleagues (12, 29) raised a question concerning the critical role of the RBD in C1 RNA binding. The studies they performed used various C1 deletion mutants and led them to conclude that the basic region preceding CID (Val140-Asn161) is the primary determinant of the C protein high affinity binding for both the C1 winner sequence and U snRNAs (12, 29). However, individual amino acid residues comprising the basic region have not been experimentally examined to see whether substitution of any of them in intact C1 protein indeed results in loss of RNA binding. In our random mutagenesis and screening experiments for RNA binding defective mutants, no mutation was found to map within this basic region, which suggests that this segment of the protein is not essential for the RNA binding activity of full-length C1 protein. However, we cannot entirely exclude the possibility that mutations in this region were not present in our mutagenized C1 library. One unexpected but very interesting finding to emerge from our random mutagenesis studies is that several mutations outside of the RBD drastically affect RNA binding. Each of these mutations changes a single residue in CID and results in impaired C1 oligomerization (Fig. 4). Compared with wild type C1, these mutations very strongly reduced or abolished binding to poly(U) at higher salt concentrations (1 and 2 M NaCl), although binding at lower salt concentration (0.1 M NaCl) was not affected (Fig. 6). Similar defects in ribonucleotide homopolymer binding under high salt conditions were observed for mutants of the RNA-binding proteins hnRNP K and FMR1 (45). One possible explanation in the case of C1 is that its oligomerization may affect its affinity for RNA. However, two findings argue against this notion. First, at high salt concentrations, C1-C1 interactions were not detected by in vitro protein binding assays (data not shown). Second, McAfee et al. (29) found that the binding isotherm of a tetramerization-defective deletion construct was essentially unchanged. Another possibility is that mutations in CID simply cause a conformational change that leads to reduced RNA binding affinity, because the coil structure could be involved in optimally orienting the RNA-binding sites vis à vis the RNA.
FIG. 6. Mutations in CID affect C1 poly(U) binding activity at high salt conditions. Wild type and mutant (L187Q and Q192P) in vitro translated proteins (5 µl each) were incubated with poly(U)-Sepharose beads in poly(U)-binding buffer containing 0.1, 1, or 2 M NaCl. After washing, bound proteins were analyzed on a 12.5% SDS-polyacrylamide gel followed by fluorography. 20% of the input samples are shown in the TnT lanes. Bound proteins under the indicated salt concentrations are shown in the Binding lanes.

Nevertheless, among the collection of RNA binding-defective mutants, we found that most of the critical residues are located within, or close to, the consensus RNP motifs, RNP1 and RNP2. This is consistent with, and adds direct evidence to, the conclusions drawn from previous structural studies on the hnRNP C1 RBD and on proteins of the same family (e.g. U1A). For example, for the U1A snRNP protein, structural modeling implicated several residues in its β-sheet in RNA binding. These include Thr 11, Tyr 13, Asn 15, Asn 16, and Gln 54 of U1A, which correspond to Arg 17, Phe 19, Gly 21, Asn 22, and Phe 52 of C1, respectively (22,28). We found mutations at these positions in our screen and further demonstrated that the mutant proteins are unable to bind poly(U), a specific high affinity ribonucleotide homopolymer substrate of C1. Additionally, NMR studies on the C1 RBD indicated that Gly 51 is topologically located at the tight turns between the β-sheets and thus is likely to be part of the RNA-binding surface (9,21). Indeed, a mutation was found at this position (G51Y), although its effect on poly(U) binding is less severe. Besides finding mutations within or near the RNP consensus sequences, we also discovered a cluster of mutations located at the COOH-terminal end of the RBD. This finding is in agreement with the conclusion drawn from the crystal structure studies of the U1A RBD complexed with an RNA hairpin, in which hydrogen bonding was observed between the RNA and residues Arg 83, Gln 85, and Tyr 86 of U1A (corresponding to Asp 81, Asn 83, and Leu 84 of C1, respectively) (24). Our studies strongly support the conclusion that the RBD is the primary RNA-binding domain of C1 and further suggest that structures outside of the RBD, such as CID, influence the RNA binding activity of hnRNP C1.
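For readers who want to work with these residue assignments computationally, the following is a minimal illustrative sketch in Python and is not part of the original study. It records the U1A-to-C1 residue correspondences exactly as stated in the text, parses the point-mutation labels named in this section, and checks whether any of them fall within the basic region (Val 140-Asn 161). The names `U1A_TO_C1`, `BASIC_REGION`, `parse_mutation`, and `in_basic_region` are hypothetical helpers introduced here for illustration only.

```python
import re

# U1A -> hnRNP C1 residue correspondences, copied from the text above.
U1A_TO_C1 = {
    "Thr11": "Arg17",
    "Tyr13": "Phe19",
    "Asn15": "Gly21",
    "Asn16": "Asn22",
    "Gln54": "Phe52",
    "Arg83": "Asp81",
    "Gln85": "Asn83",
    "Tyr86": "Leu84",
}

# Basic region preceding CID, as given in the text (Val 140 to Asn 161, inclusive).
BASIC_REGION = range(140, 162)

# Point mutations named in this section (one-letter amino acid code).
NAMED_MUTANTS = ["G51Y", "L187Q", "Q192P", "L201P"]

# Matches labels such as "L187Q": wild type residue, position, mutant residue.
MUTATION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(label):
    """Split a label like 'L187Q' into (wild_type, position, mutant)."""
    match = MUTATION_RE.match(label)
    if match is None:
        raise ValueError(f"not a point-mutation label: {label!r}")
    wild_type, position, mutant = match.groups()
    return wild_type, int(position), mutant


def in_basic_region(label):
    """Return True if the mutated position lies within Val 140-Asn 161."""
    _, position, _ = parse_mutation(label)
    return position in BASIC_REGION


if __name__ == "__main__":
    for label in NAMED_MUTANTS:
        print(label, parse_mutation(label), "basic region:", in_basic_region(label))
```

In this sketch, none of the four named mutations maps to the basic region, in line with the observation that the screen recovered no mutations there. The check covers only the mutants named in this section, not the full mutant collection.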