Optimized RNA targets of two closely related triple KH domain proteins, heterogeneous nuclear ribonucleoprotein K and alphaCP-2KL, suggest distinct modes of RNA recognition.

The KH domain mediates RNA binding in a wide range of proteins. Here we investigate the RNA-binding properties of two abundant RNA-binding proteins, alphaCP-2KL and heterogeneous nuclear ribonucleoprotein (hnRNP) K. These proteins constitute the major poly(C) binding activity in mammalian cells, are closely related on the basis of the structures and positioning of their respective triplicated KH domains, and have been implicated in a variety of post-transcriptional controls. By using SELEX, we have obtained sets of high affinity RNA targets for both proteins. The primary and secondary structures necessary for optimal protein binding were inferred in each case from SELEX RNA sequence comparisons and confirmed by mutagenesis and structural mapping. The target sites for alphaCP-2KL and hnRNP K were both enriched for cytosine bases and were presented in a single-stranded conformation. In contrast to these shared characteristics, the optimal target sequence for hnRNP K is composed of a single short "C-patch" compatible with recognition by a single KH domain whereas that for alphaCP-2KL encompassed three such C-patches suggesting more extensive interactions. The binding specificities of the respective SELEX RNAs were confirmed by testing their interactions with native proteins in cell extracts, and the importance of the secondary structure in establishing an optimized alphaCP-2KL-binding site was supported by comparison of SELEX target structure with that of the native human alpha-globin 3'-untranslated region. These data indicate that modes of macromolecular interactions of arrayed KH domains can differ even among closely related KH proteins and that binding affinities are substantially dependent on the presentation of the target site within the RNA secondary structure.

Post-transcriptional controls play an important role in the determination of gene expression. The controls over RNA splicing, transport, localization, translation, and/or stability either contribute to or are the major component(s) of gene modulation during development (1). These controls are often mediated via interactions between specific mRNA sequences and/or structures and corresponding trans-acting RNA-binding proteins (2,3). In several cases such interactions have been described in detail and emphasize the importance of primary and higher order structural RNA motifs (4-9). The number and variety of RNA-binding proteins reported in the literature are rapidly expanding. Some of these RNA-binding proteins, such as those associated with heterogeneous nuclear RNA, show low sequence specificity, suggesting general packaging functions (3,10). Others demonstrate high level RNA-binding specificity, suggesting circumscribed functions in gene control. Examples of the latter group of proteins include cytosolic iron-response element-binding protein (6,11), human immunodeficiency virus Rev response element-binding protein (8), and the sex-lethal alternative-splicing factor (13). RNA-binding proteins, like their DNA-binding protein counterparts, tend to be modular in structure with conserved RNA-binding domains and "auxiliary" domains that function in the assembly of multiprotein complexes central to specific RNP functions (14). Four common motifs have been identified in mRNA-binding proteins as follows: the RNP domain (also called RNA recognition motif) (15), the RGG box (16), zinc fingers (17), and the KH (hnRNP K Homology) domain (18,19).
The KH domain is one of the most commonly identified RNA-binding motifs. This domain, first identified in hnRNP K (19), has been found in more than 50 proteins in a wide range of organisms. Although proteins with single KH domains have been identified (20), these proteins more commonly contain 2-4 or 15 KH repeats (reviewed in Ref. 21). The KH domain encompasses ~70 amino acids. The conserved structure of this domain, as defined by NMR and x-ray crystallography, composes a compact βααββα configuration (18,22-24). This structure projects an invariant Gly-X-X-Gly loop between the first and second α-helices and a variable loop between the second and third β-sheets. These two loops directly participate in RNA binding by forming a "molecular vise" that makes multiple sequence-specific contacts with 4-5 contiguous core bases within the target RNA (24). How such limited contacts can result in high specificity and high affinity interactions is not clearly understood.
hnRNP K, the founding member of the KH family of RNA-binding proteins, was initially characterized as a component of the hnRNP complex (19). This complex is responsible for packaging nuclear heterogeneous transcripts. Functional motifs in hnRNP K in addition to the three KH domains have been implicated in nuclear localization, nucleo-cytoplasmic shuttling, binding of protein kinases, and transcriptional control. These data suggest that hnRNP K is involved in a wide variety of cell functions (26). The only native RNA target defined for hnRNP K to date is a CU-rich sequence (DICE; differentiation control element) repeated multiple times in the 3′-untranslated region (3′-UTR) of 15-lipoxygenase (LOX) mRNA. Binding of hnRNP K to this site mediates translation silencing of LOX mRNA in erythroid precursors (27). Due to the ubiquitous distribution and high abundance of hnRNP K, it is probable that many of its targets and functions remain to be identified.
A major group of KH domain proteins is composed of the αCP proteins (α-globin mRNA poly(C)-rich segment-binding protein; see below). αCPs are also referred to as PCBPs (poly(C)-binding protein) and hnRNP E (27,28). αCP proteins exist in the cell in multiple isoforms encoded by four dispersed loci in both mouse and man (21,29). The three best described isoforms are αCP-1, αCP-2, and αCP-2KL. All three bind tightly to poly(C). αCP-2KL is encoded by an alternatively spliced αCP-2 transcript that lacks a single exon (exon 8a) corresponding to an internal 31-amino acid segment (29) (Fig. 1). αCP-1 and the two αCP-2 isoforms can each independently bind to the hα-globin mRNA 3′-UTR to form the α-complex that is functionally linked to α-globin mRNA stabilization (30). Tissue surveys reveal that each of these isoforms is found in a wide range of tissues and cell lines. RNA and protein analyses suggest that αCP-2KL is the most abundantly expressed isoform (21,29).
Interest in the αCP proteins emerged from studies of post-transcriptional control of gene expression. Stabilization of the human (h) α-globin mRNA in erythroid cell lines is tightly linked to the formation of a binary complex ("α-complex") between a single αCP molecule and a pyrimidine-rich binding site in the 3′-UTR (30-33). The identification of functionally important αCP binding sites in the 3′-UTRs of additional long lived mRNAs including collagen (35) and tyrosine hydroxylase (36) has further suggested that the α-complex may serve as a general determinant for high level mRNA stabilization (34). αCP proteins have also been implicated in translational control. They appear to maintain LOX mRNA in a translationally silent state until the terminal stages of erythroid differentiation by binding to the 3′-UTR DICE motif in conjunction with hnRNP K (27,37). Remarkably, αCP can also mediate translational enhancement. In this case αCP increases the efficiency of cap-independent translation of picornavirus RNA by binding to two specific sites within the 5′-UTR internal ribosome entry site as follows: the 5′-terminal cloverleaf structure of the 5′-UTR (38,39) and stem-loop IV (38,40). The interaction with the 5′ cloverleaf, which is dependent on the co-binding of the viral protein 3CD, also controls the switch from translation to replication of the polio RNA viral genome (41). Additional αCP-mediated translational controls have been reported in a variety of unrelated viral systems (42-44). Finally, αCPs may play a role in translational recruitment of dormant mRNAs during early development of the Xenopus embryo via controlled cytoplasmic poly(A) elongation (45). Thus, αCP proteins appear to be involved in a wide range of post-transcriptional controls affecting mRNA stability, modification, and expression. These controls target a specific subset of mRNAs and are mediated in an apparently sequence-specific and selective manner. Despite this wealth of descriptive data, the underlying mechanisms involved in these controls remain undefined.
The αCP and hnRNP K proteins are closely related in structure and binding properties. Together they constitute the major poly(C) binding activity in the cell. Both proteins contain three copies of the KH domain arranged in a similar manner as follows: KH1 and KH2 at the N terminus separated from the more C-terminal KH3 domain by a central region of variable length and sequence (Fig. 1; also see Ref. 21). Close structural and evolutionary relationships between hnRNP K and the αCPs are further supported by the observation that the primary sequences of their three corresponding KH domains are more closely related to each other than are KH domains within the same protein (28,29). This conservation of KH domain number, sequence, and positioning, and the shared binding to the DICE element in the LOX 3′-UTR (27) suggest commonalities in their modes and specificities of RNA binding. However, the unique ability of αCPs to mediate translational enhancement, modification, and stabilization of specific mRNAs suggests that these proteins may be distinct in their relative RNA binding specificities.
Whereas both αCP and hnRNP K are categorized as poly(C)-binding proteins, their optimal binding sites appear to be more complex and distinct from each other than is suggested by their common homopolymer recognition profiles. For example, hnRNP K cannot bind effectively to the α-globin 3′-UTR nor can it form the α-complex that is linked to stabilization of a number of additional mRNAs (30). The structural basis for αCP binding also appears to reflect more than a simple recognition of poly(C), as the major isoform of αCP, αCP-2KL, has a 6-fold higher affinity for its α-globin 3′-UTR target sequence than it does for poly(C) homoribopolymers.² The observation that mutations outside of the defined minimal αCP-binding site within the α-globin mRNA 3′-UTR can severely decrease α-complex formation (32,33) further suggests that higher order RNA structures might be of considerable importance in determining RNA target preference and affinity. In the present study we define the optimal sequences and structures of the RNA binding sites for hnRNP K and αCP-2KL. Parallel analyses of these two sets of protein-RNA interactions revealed well defined differences in how these two closely related proteins interface with their respective RNA targets.

EXPERIMENTAL PROCEDURES
SELEX-The SELEX protocol (46) was utilized to obtain high affinity binding sites for recombinant αCP-2KL or hnRNP K. 600 pmol (corresponding to ~4 × 10¹⁴ different molecules) of the polyacrylamide gel-purified oligonucleotide A2N50, 5′-GCG GAA GCT TCT CTA CAT GCA ATG GNN NNN NNN NNN NNN NNN NNN NNN NNN NNN NNN NNN NNN NNN NNN NNN NNN CAC GTG TAG TAT CCT CTC CC-3′, was used as template for PCR with the two primers A1T7, 5′-GCG AAT TCT AAT ACG ACT CAC TAT AGG GAG AGG ATA CTA CAC GTG-3′, and A3, 5′-GCG GAA GCT TCT CTA CAT GCA ATG G-3′, for 6 cycles in a total volume of 10 ml. A1T7 encodes the T7 RNA polymerase promoter, and A3 was used as the downstream primer for PCR amplification as well as serving as primer for cDNA synthesis by reverse transcriptase. These primers also carry the recognition sequences for the EcoRI and HindIII restriction enzymes (shown in italics), respectively, to facilitate cloning of reverse transcriptase-PCR products for sequencing. The DNA was phenol/chloroform-extracted, precipitated, and resuspended in 400 μl of TE. After purification on a NAP-5 gel filtration column (Amersham Pharmacia Biotech), the DNA was precipitated and resuspended in TE, and 300 μg was utilized as template for T7 transcription in the presence of trace amounts of [³²P]CTP to generate RNA for the first round of selection. Upon removal of the transcription template with RQ1 DNase (Promega), the RNA was phenol/chloroform-extracted, precipitated, and finally purified on preparative 10% denaturing polyacrylamide gels containing 8 M urea. The band corresponding to the 95-nt run-off transcript was identified by exposure to film. The band was cut out and crushed in a 15-ml plastic tube, and the RNA was eluted in diethyl pyrocarbonate/H₂O by incubation on a shaker at room temperature overnight. After phenol/chloroform extraction the RNA was precipitated with ethanol and resuspended in a suitable volume of diethyl pyrocarbonate/H₂O.
After gel purification, a total RNA pool of 6.4 nmol (205 μg) was obtained. This pool was split in two for parallel SELEX experiments against αCP-2KL and hnRNP K, respectively. RNA (100 μg for the first round) was incubated with recombinant protein under standard conditions for αCP binding (30) in 1× binding buffer (10 mM Tris-HCl, pH 7.4, 150 mM KCl, 1.5 mM MgCl₂, 1 mM dithiothreitol) with the following modifications: no carrier RNA or heparin was used in the first round of selection. In subsequent rounds tRNA was added to 0.5 μg/μl and heparin to 5 μg/μl. Partitioning for the first 5 rounds was performed by nitrocellulose filtration (47). Briefly, after incubation of RNA and protein at room temperature for 20 min, the mixture was passed through a pre-wetted nitrocellulose filter (Millipore). Unbound RNAs were washed through using 5 ml of 1× binding buffer containing 0.5 mg/ml tRNA. The filter was cut in pieces, and RNAs binding to protein were eluted by shaking for 30 min at room temperature in a 7 M freshly prepared urea solution with 2 volumes of added phenol (46). Eluted RNA was subsequently phenol/chloroform-extracted, precipitated in the presence of glycogen (Roche Molecular Biochemicals), and reverse-transcribed with avian myeloblastosis virus reverse transcriptase (Promega) at 50°C for 20 min after annealing of the A3 oligonucleotide. PCR was performed with the A1T7 and A3 oligonucleotides for a suitable number of cycles (typically 6 to 10), and the yield of the PCR product was monitored by 7% polyacrylamide gel electrophoresis. We found that an excessive number of cycles would lead to formation of higher molecular weight PCR products. It should be noted that for each round the input RNA was subjected to sizing by gel purification to avoid the accumulation of less than full-length transcripts and that all RNA was subjected to a renaturation step by heating to 75°C for 3 min in 1× binding buffer and slowly cooling to room temperature.
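The pool quantities quoted above can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the published protocol: it converts 600 pmol of template to a molecule count, compares that with the theoretical sequence space of a 50-nt randomized segment, and back-calculates the average mass per nucleotide implied by the reported 6.4 nmol (205 μg) of 95-nt RNA. All function names are illustrative assumptions.

```python
# Illustrative arithmetic only; function names are hypothetical.
AVOGADRO = 6.02214076e23  # molecules per mole


def molecules_from_pmol(pmol: float) -> float:
    """Convert an amount in picomoles to a number of molecules."""
    return pmol * 1e-12 * AVOGADRO


def sequence_space(random_positions: int, alphabet: int = 4) -> int:
    """Number of distinct sequences for a fully randomized segment."""
    return alphabet ** random_positions


def implied_molar_mass(mass_ug: float, amount_nmol: float) -> float:
    """Molar mass (g/mol) implied by a mass in micrograms and an amount in nanomoles."""
    return (mass_ug * 1e-6) / (amount_nmol * 1e-9)


# 600 pmol of A2N50 template, as stated in the text (about 3.6e14 molecules).
template_molecules = molecules_from_pmol(600)

# A 50-nt random segment allows 4**50 sequences, so the starting pool
# samples only a tiny fraction of the theoretical space.
space = sequence_space(50)
sampled_fraction = template_molecules / space

# 205 ug for 6.4 nmol of a 95-nt transcript.
mass_per_nt = implied_molar_mass(205, 6.4) / 95

print(f"template molecules: {template_molecules:.2e}")
print(f"sequence space:     {space:.2e}")
print(f"fraction sampled:   {sampled_fraction:.1e}")
print(f"g/mol per nt:       {mass_per_nt:.0f}")
```

The molecule count agrees with the stated ~4 × 10¹⁴ to within rounding, and the implied mass per nucleotide falls in the range usually assumed for RNA, so the two reported pool figures are mutually consistent.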
The molar ratio of protein to RNA was decreased gradually from 1:10 for round 1 to 1:250 for rounds 6 and 7. To prevent selection of RNAs toward unwanted targets, the partitioning of binders from non-binders was sequentially performed in 3 different ways. Rounds 1-5 were performed by nitrocellulose filtration and subsequent elution (46); round 6 was performed as preparative gel-shift, cutting out and extracting all shifted complexes; and round 7 was performed by co-immunoprecipitation of RNA-protein complexes with FF3 (for αCP-2KL) or anti-T7 tag antibody (the recombinant hnRNP K used in this study carries an N-terminal T7 tag) and protein A-Sepharose (Amersham Pharmacia Biotech). At the ends of the 6th and 7th rounds, individual cDNA clones were generated from selected RNAs by digesting the reverse transcriptase-PCRs with EcoRI and HindIII and ligating the pool of fragments into pUC19 EcoRI-HindIII. The ligation was transformed into DH5α, and individual clones were picked in a random and nonexclusive manner for plasmid preparation and sequencing.
Generation of Mutant SELEX RNAs-T7 transcription templates encoding the mutant variants were created by PCR amplification of oligonucleotides carrying the corresponding nucleotide changes. The primers used as template for PCR amplification with primers A1T7 and A3 were as follows ( PCR products were cloned as described for the analysis of individual cDNA clones above, and the primary structure of each mutant cDNA was confirmed by sequencing. The T7 transcription template was prepared by PCR amplification with the A1T7 and A3 primers.
FIG. 2. Binding of recombinant αCP-2KL or recombinant hnRNP K to human α-globin 3′-UTR (α3′-UTR) (lanes 1-5), or starting RNA library (R0) (lanes 6 and 9), or the pool present after seven rounds of SELEX against αCP-2KL and hnRNP K (R7α and R7K, respectively) (lanes 7, 8, 10, and 11). EMSA assays were performed with the amounts of protein indicated above each lane. The presence or absence of poly(C) competitor is indicated by + or −, respectively. The smearing of the free probe in lanes 6-11 reflects the structural complexity of the RNAs in the RNA pools.
Recombinant Proteins-Recombinant proteins (mouse αCP-2KL and human hnRNP K) were expressed as His-tagged variants in Escherichia coli and purified by standard procedures as described (30). To avoid differences in values due to unanticipated variation in recombinant protein activity, all studies in this report were carried out with a single constant preparation of each of the two recombinant proteins.
Electromobility Shift Analysis (EMSA)-EMSA (gel shift) analysis with recombinant proteins was performed in the above-mentioned 1× binding buffer as described (30) on 5% non-denaturing polyacrylamide gels run in 0.5× TBE at room temperature. The amount of recombinant protein used in each incubation is noted above the respective lanes of each gel. To maximize the accuracy of inter-experimental comparisons, all EMSA studies were carried out using a single preparation of recombinant protein (see above). RNase T1 (Roche Molecular Biochemicals) was added to each incubation prior to gel analysis as described previously (30,32). RNase was specifically omitted in the case of the gel shifts shown in Figs. 2, 5, and 7. Only in Fig. 9 (C-E) was RNase T1 treatment performed. The binding value for each RNA was determined by plotting log (complexed/free probe) versus log (protein concentration) and determining the protein concentration resulting in 50% of the probe shifted into protein-RNA complexes. These values were then calculated relative to that found, respectively, for R7α1 binding to αCP-2KL or R7K15 binding to hnRNP K as determined by EMSA. This "relative dissociation value" is indicated for each of the RNAs in the figures and throughout the text.
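The log-log analysis described above can be illustrated with a short calculation. This is a minimal sketch under stated assumptions rather than the authors' actual procedure: it assumes that the fraction of probe shifted has already been quantified for each lane, fits a straight line to log(complexed/free) versus log(protein) by ordinary least squares, and reads off the protein amount at which the ratio equals 1 (50% shifted). The function names and the titration values are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def half_maximal_protein(protein, fraction_bound):
    """Protein amount at which 50% of the probe is shifted.

    Fits log10(complexed/free) against log10(protein) and solves for the
    point where the ratio is 1, i.e. where its logarithm is zero.
    """
    xs = [math.log10(p) for p in protein]
    ys = [math.log10(f / (1.0 - f)) for f in fraction_bound]
    slope, intercept = fit_line(xs, ys)
    return 10 ** (-intercept / slope)


def relative_dissociation_value(test_half, reference_half):
    """Half-maximal protein amount expressed relative to a reference RNA."""
    return test_half / reference_half


# Hypothetical titrations (ng of protein per lane); not data from this study.
protein_ng = [12.5, 25.0, 50.0, 100.0, 200.0]
reference_bound = [p / (p + 50.0) for p in protein_ng]   # half-maximal at 50 ng
mutant_bound = [p / (p + 175.0) for p in protein_ng]     # half-maximal at 175 ng

reference_half = half_maximal_protein(protein_ng, reference_bound)
mutant_half = half_maximal_protein(protein_ng, mutant_bound)
relative_value = relative_dissociation_value(mutant_half, reference_half)
print(round(reference_half, 3), round(mutant_half, 3), round(relative_value, 3))
```

Expressing each RNA relative to an index RNA measured with the same protein preparation removes the dependence on the unknown active fraction of recombinant protein, which is the rationale the text gives for reporting relative rather than absolute values.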
RNase Structural Mapping-5′-End labeling and RNase mapping were performed as described (48), except that RNAs were renatured and probed in 1× binding buffer containing 0.25 μg of tRNA/μl. Incubations were carried out at 37°C for 4 min. The reactions were terminated by addition of formamide loading buffer, and the samples were electrophoresed on a 10% polyacrylamide sequencing gel. In lanes C, 1× binding buffer was added in place of diluted enzyme. RNases T1, T2, and V1 were obtained from Roche Molecular Biochemicals, Life Technologies, Inc., and Amersham Pharmacia Biotech, respectively.
Analysis of Cellular Extracts-S-100 extracts from MEL cells were prepared as described (34). Fractionation of total extract on a Superdex 200 HR 10/30 gel filtration column (Amersham Pharmacia Biotech) in 1× S100 extract buffer was performed as described with a flow rate of 0.5 ml/min, collecting 500-μl fractions on an automated fraction collector. Western and Northwestern analyses were performed as described previously (30). For Western analyses, the antibody specific to αCP-2 and αCP-2KL (lab antibody identifier, FF3) was used at a 1:5,000 dilution. The signals were developed by incubation with horseradish peroxidase-conjugated goat anti-mouse IgG as a secondary antibody.
The complexes were detected using the ECL system from Amersham Pharmacia Biotech. For Northwestern analyses, poly(C) was end-labeled with [γ-³²P]ATP. Gels were run and transferred as for Western studies. Nitrocellulose membranes were incubated in Northwestern buffer (Tris-HCl, pH 7.4, 50 mM NaCl, 1 mM EDTA, 1× Denhardt's solution) for 2 h with added dithiothreitol (1 mM) and heparin (50 μg/ml). This was followed by incubation in Northwestern buffer in the presence of 10 μg/ml tRNA and 100,000 cpm/ml of probe for another 2 h. The membranes were then washed in Northwestern buffer (3 times for 5 min each), partially dried, and exposed to film.

RESULTS
Generation of RNA Pools Selected for High Affinity Binding to αCP-2KL and hnRNP K-The SELEX protocol (46) was used to investigate the RNA binding specificities of αCP-2KL and hnRNP K (Fig. 1). αCP-2KL was chosen as a representative αCP isoform based on its high abundance in cells surveyed (29). High affinity RNA targets were isolated for these two proteins from a pool of 95-nucleotide (nt) RNAs containing a fully randomized 50-base central segment. The two SELEX studies were carried out in parallel under identical experimental settings. The initial RNA pool complexity was estimated at 4 × 10¹⁴ (see "Experimental Procedures"). Protein binding activities of successive RNA pools were monitored by EMSA. Incubation of an RNA probe corresponding to the native hα-globin 3′-UTR (α 3′-UTR) with 125 ng of recombinant αCP-2KL resulted in the formation of a distinct complex ( (lane 9), whereas the RNA pool obtained after 7 rounds of SELEX against the hnRNP K protein (R7K) formed a cluster of poly(C)-sensitive complexes when incubated with this protein (lanes 10 and 11, respectively). Therefore, seven rounds of SELEX generated RNA pools highly enriched for RNAs binding to either αCP-2KL or hnRNP K.
RNAs Selected for High Affinity Binding to αCP-2KL Displayed an Extended Single-stranded Structure Encompassing a Triplicated Poly(C) Patch-EMSA of the RNA pools after 7 rounds (Fig. 2) was indistinguishable from similar analyses of round 6 SELEX RNAs (data not shown). Thus, we inferred that both pools were similarly enriched for high affinity protein binders. The primary sequences of a randomly chosen set of 26 cloned RNAs from rounds 6 and 7 (R6 and R7) were determined and compared (Fig. 3). The most striking common feature of these RNAs was the presence of three C-patches. Each of these C-patches was 3-5 nucleotides in length (2 of the 78 patches had only 2 Cs). No RNAs found at this stage of the αCP-2KL SELEX experiment contained fewer than three such C-patches. The C-patches were uniformly flanked by A- and U-rich segments with a strong bias against G bases throughout the conserved region. The spacing between the C-patches was variable but tended to be quite short. The consensus sequence was
Binding of each of the αCP-2KL-selected RNA targets to αCP-2KL was quantified by EMSA. We have previously reported that the Kd(app) of the native α-complex is 0.5 × 10⁻⁹ M (30). This value was determined by incubating native α-globin 3′-UTR with unfractionated cell extracts. In the present study we have used recombinant αCP-2KL protein for the binding studies; the apparent Kd value for the native α-globin 3′-UTR with the recombinant αCP-2KL is 20 × 10⁻⁹ M. The difference between these two values is likely to reflect the fact that only a fraction of the recombinant protein appears to be biologically active (27).³ Therefore all binding studies reported in the present study were carried out with a single preparation of recombinant αCP-2KL to minimize interassay variation, and the dissociation value for each RNA is reported relative to that of the index SELEX R7α1.
The "relative dissociation values" of each of the SELEX RNAs were found to be within 2-fold of each other and were on average 10-20-fold lower (i.e. 10-20-fold higher binding affinity) than for the native α-complex (see below for examples). The SELEX procedure thus enriched for a set of RNAs with a common primary sequence motif and a remarkably high affinity for αCP-2KL.
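The C-patch feature described above lends itself to a simple sequence scan. The sketch below is illustrative only and is not the comparison method used in the study: it reports every run of consecutive cytosines of at least a chosen length, together with the G content of the sequence. The function names and the example sequence are hypothetical; the example is not one of the SELEX clones.

```python
import re


def find_c_patches(sequence: str, min_len: int = 3):
    """Return (start, length) for each maximal run of C of at least min_len.

    Positions are 0-based. DNA input is accepted and converted to RNA.
    """
    rna = sequence.upper().replace("T", "U")
    pattern = re.compile("C{%d,}" % min_len)
    return [(m.start(), len(m.group())) for m in pattern.finditer(rna)]


def g_fraction(sequence: str) -> float:
    """Fraction of G bases, a rough check on the reported bias against G."""
    rna = sequence.upper()
    return rna.count("G") / len(rna) if rna else 0.0


# Hypothetical A/U-rich sequence carrying three C-patches.
example = "AAUCCCUUACCCCAUUCCCUAA"
patches = find_c_patches(example)
print(patches)              # [(3, 3), (9, 4), (16, 3)]
print(g_fraction(example))  # 0.0
```

Lowering `min_len` to 2 would also capture the shorter two-cytosine patches that the text notes in 2 of the 78 cases, at the cost of more incidental matches.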
Predicted secondary structures of the αCP-2KL selected RNAs were generated using the M-fold program version 3.0 (49,50). Each of the selected RNAs had a high probability of assuming a secondary structure with a number of consistent properties. Six representative examples are shown in Fig. 4A. All of these RNAs contained an extensively base-paired stem topped by a large loop. In all cases the primary consensus sequence was entirely encompassed within the loop. There was no apparent conservation of primary sequence in the variable part of the stem regions. The predicted secondary structures of two of the RNAs (R7α1 and R7α2) were verified experimentally by probing in vitro synthesized ³²P-5′-end-labeled RNAs with three structure-specific ribonucleases as follows: RNases T1, T2, and V1. The results were consistent with the M-fold predictions (Fig. 4B). Of particular note was the marked sensitivity of the entire consensus sequence to cleavage by the single strand-specific RNase T2. These data suggested that all of the RNAs selected for high affinity binding to αCP-2KL conformed to a general structure in which the conserved primary sequences, consisting of three closely linked C-patches, were presented in an extended and uninterrupted single-stranded configuration.
The relationship of target structure to binding affinity was further tested by relating the structure of the SELEX RNAs to that of the native α-globin 3′-UTR target. The native α-globin 3′-UTR binding target has three C-patches and yet binds to αCP-2KL with a 20-fold lower relative affinity than the SELEX targets (for example compare Fig. 2, lanes 4 and 7, and data not shown). The structure of the native α-globin 3′-UTR target as predicted by M-fold and confirmed by RNase mapping is diagrammed in Fig. 4C (primary data not shown). These data revealed that the first of the three C-patches previously implicated in α-complex formation (32,33) was incorporated in a double-stranded structure. The second patch was present as a small open loop, and the third and most extensive segment, a 16-base pyrimidine-pure and C-rich region, although predicted to be in a single-stranded conformation by M-fold, was not appreciably sensitive to single strand-specific RNases. These mapping data stand in marked distinction to the αCP-2KL SELEX targets in which the three C-patches were encompassed in a continuous and extensive domain with marked RNase T2 sensitivity (Fig. 4, A and B). Thus, whereas the primary structure of the native α-globin 3′-UTR target was consistent with the SELEX consensus, its higher order structure suggested a suboptimal presentation of the poly(C) patches.
The importance of the conserved C-patches for αCP-2KL binding was established by mutagenesis of a representative RNA SELEX target (R7α1; Fig. 5). C→A transversions were introduced at various positions in the target sequence. The substitutions were to A rather than G in order to minimize secondary structural changes. Substitution of all cytosines in the consensus sequence completely abolished αCP-2KL binding (Fig. 5A, compare 1st and 4th panels). Substitution of a single cytosine in each of the three C-patches caused a 3.5-fold increase in relative dissociation value (2nd panel), and replacement of all C-bases within the 3′-most C-stretch resulted in a 2.1-fold increase in relative dissociation value (3rd panel). A similar 2-3-fold increase in the relative dissociation value was observed after replacing all C-bases in either the most 5′ or central C-patch (data not shown). Thus, optimal high affinity binding required the presence of three C-rich patches in the target RNA, but αCP-2KL binding is still reasonably tight when two of the three patches remain intact.
In contrast to the modest decrease in the relative binding affinity caused by loss of a single C-patch, loss of two of the three patches resulted in a more substantial loss of binding activity (Fig. 5B). A single preserved 5′ or central C-patch mediated binding at an affinity that was more than 10-fold lower than the native SELEX RNA (compare 1st panel with 2nd and 3rd panels). A single preserved 3′ C-patch mediated an even more dramatic loss of binding activity (4th panel). In addition to resulting in a marked decrease in binding affinity, the elimination of two of the three C-patches also altered the migration of the resultant RNP complex. As seen most clearly with the first two sets of mutations (2nd and 3rd panels), the major complex migrated more rapidly than that formed by the intact SELEX RNA (1st panel) or that formed on a mutant RNA with 2 intact C-patches (Fig. 5A, 3rd panel). The structure of this low affinity complex was not further defined (see "Discussion"). Taken together, these data demonstrated that the SELEX RNA target site comprising three C-rich patches in a single-stranded configuration maximized the affinity of αCP-2KL binding.
The Consensus Sequence for High Affinity Binding to hnRNP K Composed a Single C-patch-SELEX enrichment for RNAs binding to hnRNP K was carried out in parallel with the studies on αCP-2KL. The consensus sequence of the hnRNP K SELEX RNAs isolated after 6 and 7 rounds consisted of a single short conserved sequence motif, 5′-UC₃₋₄(U/A)(A/U)-3′ (Fig. 6). Although additional short C-stretches could be found in some of the RNAs (for example, R6K16 and R7K10), eight high affinity binders contained only one sequence conforming to this motif (R6K6, R6K7, R7K6, R7K7, R7K15, R7K18, R7K21, and R7K23). These data suggested that a single short C-stretch was sufficient for maximal high affinity hnRNP K binding.
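The hnRNP K consensus, a U followed by three or four C residues and then (U/A)(A/U), translates directly into a regular expression. The following is a minimal sketch for locating such sites in an RNA sequence; it is an illustration, not the sequence-comparison method used in the study. The function name and the test sequences are hypothetical and are not taken from the SELEX clones.

```python
import re

# U, then 3-4 C, then U or A, then A or U. The lookahead wrapper lets
# finditer report sites that overlap one another.
HNRNP_K_MOTIF = re.compile(r"(?=(UC{3,4}[UA][AU]))")


def hnrnp_k_motif_sites(sequence: str):
    """Return (start, matched_text) for each site matching the consensus.

    Positions are 0-based. DNA input is accepted and converted to RNA.
    """
    rna = sequence.upper().replace("T", "U")
    return [(m.start(), m.group(1)) for m in HNRNP_K_MOTIF.finditer(rna)]


print(hnrnp_k_motif_sites("GGAUCCCUAGG"))  # [(3, 'UCCCUA')]
print(hnrnp_k_motif_sites("AUCCCCAUGG"))   # [(1, 'UCCCCAU')]
print(hnrnp_k_motif_sites("GGUCCCGAGG"))   # []
```

A scan of this kind only tests primary sequence; the structural requirement that the patch be single-stranded would need a separate folding prediction or mapping experiment.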
Secondary structure predictions of the hnRNP K SELEX RNAs revealed a common configuration. The primary consensus sequence (single C-patch) was presented on top of a stable stem structure or as a single-stranded structure bridging two adjacent stems. Six examples of computer folding of hnRNP K SELEX RNAs are shown in Fig. 7A. RNase mapping of two selected RNAs (R7K6 and R7K15) confirmed the predicted secondary structures, specifically emphasizing the single-stranded configuration of the C-patch (Fig. 7B).
R7K15 was studied in further detail to define the relationship of the C-patch to hnRNP K binding (Fig. 7C). The dependence of binding on the identified C-patch was supported by the dramatic loss of hnRNP K binding upon introducing a single C→A substitution in the consensus sequence (compare 1st and 2nd panels). All evidence of interaction was blocked by mutating all 3 Cs in the patch (3rd panel). The consensus sequence for high affinity binding to hnRNP K therefore consisted of a single short C-rich patch presented in a single-stranded configuration.
Specificity of αCP-2KL and hnRNP K for Their Corresponding SELEX RNAs-The hnRNP K and αCP-2KL SELEX experiments, carried out in parallel, yielded distinct consensus sequences. The specificity of the two sets of high affinity interactions was investigated by cross-binding comparisons (Fig. 8). R7K15, the hnRNP K SELEX RNA, was recognized by hnRNP K (1st panel) but not αCP-2KL (3rd panel). In contrast, R7α1, the αCP-2KL SELEX target, was bound by both hnRNP K and αCP-2KL (2nd panel). It should be noted, however, that this binding of R7α1 to hnRNP K was at a 2.8-fold lower affinity than to αCP-2KL itself (4th panel). These cross-comparisons reinforced the conclusion that multiple C-patches are necessary for high affinity binding to αCP-2KL, whereas a single C-patch is sufficient for hnRNP K binding.
Specific Binding of SELEX RNAs to Native αCP and hnRNP K in Cell Extracts-The SELEX protocol was based on the binding of RNAs to recombinant (His₆-tagged) proteins. To extend our findings, we determined whether the SELEX RNAs demonstrated binding specificity to native αCP-2KL or hnRNP K in the context of cell extracts. Studies were carried out with unfractionated and size-fractionated S100 protein extracts from a mouse erythroleukemia cell line (MEL) known to contain abundant levels of both proteins (45). The column fractions containing 68-kDa hnRNP K and 38-40-kDa αCPs were identified by Northwestern analysis (Fig. 9A). The fractions containing αCP-2 and αCP-2KL were specifically identified by Western analysis (Fig. 9B). EMSA analysis revealed a single slowly migrating complex when R7K15 RNA was incubated with the total MEL extract (Fig. 9C, lane 1). This complex could be fully competed by poly(C) (lane 2) but was resistant to competition by poly(CT) (lane 3). In the fractionated extract, the peak of complex-forming activity with the R7K15 probe coincided with the peak of hnRNP K activity detected by Northwestern analysis (fraction 23; Fig. 9A, lane 9). Thus, in the context of a cellular extract, the R7K15 SELEX RNA bound specifically to hnRNP K.
The αCP-2KL SELEX RNA, R7α1, was similarly analyzed for binding activity to native proteins. When this RNA was incubated in MEL extract, a set of closely migrating complexes was generated (Fig. 9D, lane 1). These complexes were competed by the addition of poly(C) and were resistant to poly(CT) competition (lanes 2 and 3). We noted that under conditions of a high concentration of poly(C) (and exclusively under these conditions), where all αCP proteins are saturated with competitor, the R7α1 probe was free to form a weak complex of lower mobility than the poly(C)-sensitive complexes. The CT-binding protein responsible for this complex was not further characterized. The proteins forming the strong poly(C)-sensitive complexes with the R7α1 RNA peaked at fractions 26 (lane 13, slightly lower mobility complexes) and 29 (lane 18, slightly higher mobility complexes) (Fig. 9D). These fractions coincided with the presence of αCPs as identified by Northwestern and Western analyses (Fig. 9, A and B, respectively). Thus, in the context of a native extract, the R7α1 SELEX RNA bound specifically to αCP proteins.
The identity of the cluster of complexes formed with the R7α1 RNA (Fig. 9D, lane 1) was further analyzed. These complexes all coincided with fractions enriched for αCP (see above). To confirm that these were exclusively αCP-containing complexes, EMSA supershift studies were carried out (Fig. 9E). The complete set of complexes forming with R7α1 in the MEL extract (lane 2) could be quantitatively supershifted with the epitope-specific antibody recognizing both αCP-2 and αCP-2KL (lane 5) but not with unrelated antisera (lanes 8 and 9). These data confirmed that R7α1 bound native αCP-2KL and αCP-2 with high affinity and specificity.
FIG. 6. RNAs selected against hnRNP K. Details are the same as noted in Fig. 3. The highlighted R7K15 target was used for subsequent detailed study.

DISCUSSION
The optimized RNA-binding sequences for two closely related RNA-binding proteins, αCP-2KL and hnRNP K, were determined in the present study. These proteins each contain three structurally conserved copies of the KH domain (Fig. 1) and constitute the major poly(C) binding activities in murine and human cells (Fig. 9A). The two sets of SELEX RNAs revealed distinct primary consensus sequences (Figs. 3 and 6) encompassed in consistent secondary structures (Figs. 4 and 7). The data demonstrated that despite parallels in protein structure, αCP-2KL and hnRNP K have distinct requirements for high affinity RNA-protein complex formation.
FIG. 7. Structures of high affinity hnRNP K binders. A, computer (M-fold)-generated secondary structures of hnRNP K SELEX RNAs. Details as in Fig. 4A. B, structural map of high affinity hnRNP K binders. Markings as detailed in Fig. 4B. C, effect of point mutations in the R7K15 RNA consensus sequence on hnRNP K binding. Details as in Fig. 5. All studies were carried out with a single preparation of recombinant hnRNP K.
SELEX studies have been previously reported on three additional KH proteins: Nova-1, Nova-2, and Vigilin. The neuronal RNA-binding Nova proteins each contain three KH domains. The SELEX-determined consensus for Nova-1 is (UCAU(N)₀₋₂)₃ (51). Computer-generated folding predicts that the three short pyrimidine-rich patches in this target are in a single-stranded configuration. In contrast, SELEX studies of the closely related Nova-2 protein revealed a consensus consisting of a single short binding site, 5′-GAGUCAU-3′ (12). The difference in structures of the Nova-1 and Nova-2 consensus sequences would appear to parallel the differences between the αCP-2KL and hnRNP K targets. However, in contrast to the present study of hnRNP K and αCP-2KL, structural mapping to confirm the structure of the RNA-binding sites was not carried out for the Nova proteins, and cross-competition binding studies between the two Nova proteins failed to confirm a difference in their respective binding specificities (12). A SELEX study has also been carried out with Vigilin, a protein that contains 14 KH domains. This protein binds unstructured, single-stranded RNAs containing multiple conserved 5′-(A)ₙCU-3′ and 5′-UC(A)ₙ-3′ motifs in a largely G-free region. Approximately 75 bases were required for optimal binding (53). The multiple sequence repeats and the open configuration of the Vigilin target site are similar to those of the αCP-2KL and Nova-1 targets. These three studies support the model that KH domain proteins interact with short and sometimes multiply arrayed RNA target sites and that these sites must be in a single-stranded configuration for optimal interaction.
The αCP-2KL consensus sequence was 5′-(A/U)₂C₃₋₅(A/U)₂₋₆C₃₋₅(A/U)₂₋₆C₃₋₅(A/U)₂-3′ (Fig. 3). The three C-rich patches that constituted this consensus all contributed to αCP-2KL binding (Fig. 5). Remarkably, this tripartite C-rich consensus sequence bore considerable resemblance to the previously described αCP-binding site in the native human α-globin 3′-UTR (30,33). Identical or closely related C-rich regions have also been identified in the 3′-UTRs of 15-lipoxygenase, α(1)-collagen, and tyrosine hydroxylase mRNAs (34). The RNP complexes ("α-complexes") formed between these regions and αCPs appear to serve as critical determinants of mRNA stability and/or translational control (35)(36)(37). Thus, the functionally important αCP-binding sites previously identified in four native mRNAs were consistent with the consensus sequence obtained by αCP-2KL-based SELEX from a library with sequence complexity of >4 × 10¹⁴. These data suggest that the native binding sites are indeed targeted by αCP and that the spectrum of native high affinity αCP-binding sites may be limited to this single motif.
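As an illustration only, the tripartite consensus stated above can be transcribed literally into a regular expression and used to scan a sequence for candidate sites. This is a minimal sketch: the names `ALPHA_CP_CONSENSUS` and `find_alpha_cp_sites` and the example sequence are hypothetical, and a primary-sequence match says nothing about whether the site is presented in a single-stranded conformation, which the mapping data indicate is also required for high affinity binding.

```python
import re

# Literal transcription of the consensus given in the text:
# (A/U)2 C3-5 (A/U)2-6 C3-5 (A/U)2-6 C3-5 (A/U)2
# Hypothetical helper names; not part of the original study.
ALPHA_CP_CONSENSUS = re.compile(
    r"[AU]{2}C{3,5}[AU]{2,6}C{3,5}[AU]{2,6}C{3,5}[AU]{2}"
)


def find_alpha_cp_sites(rna):
    """Return (start, matched_sequence) for each non-overlapping match."""
    rna = rna.upper().replace("T", "U")  # accept DNA-style input
    return [(m.start(), m.group()) for m in ALPHA_CP_CONSENSUS.finditer(rna)]


# Invented example sequences, for illustration only.
triple_patch = "GGAUCCCUUACCCCAUUCCCAUGG"
single_patch = "GGAUCCCAUGG"

print(find_alpha_cp_sites(triple_patch))
print(find_alpha_cp_sites(single_patch))
```

A sequence carrying only one C-patch does not match, which mirrors the observation that every selected αCP-2KL target contained three patches.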
The hnRNP K consensus binding motif was 5′-UC₃₋₄(U/A)₂-3′ (Fig. 6). The cytosines in this motif were critical to protein binding (Fig. 7). The size and structure of the hnRNP K-binding sites were of note for several reasons. First, their sizes conformed to the 4-5-base patch that can be recognized by a single KH domain (26). Second, they conformed to the short DICE motif (5′-UCCCCAA-3′) present in 11 copies within the 192-nucleotide repeat region of the LOX 3′-UTR and described as a native hnRNP K-binding site (27). Third, the structure was remarkably similar to the individual C-patches within the αCP-2KL consensus. These observations suggested that binding of hnRNP K to its target RNAs was mediated by a single KH domain. The lower complexity of the hnRNP K optimal binding site, when compared with that of αCP-2KL, would suggest that its interactions are less stringent in their sequence and structural constraints. This would be consistent with the inclusion of hnRNP K but not αCP in hnRNP complexes involved in general mRNA packaging.
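The correspondence between the hnRNP K consensus and the DICE motif can be checked mechanically. The sketch below transcribes the consensus 5′-UC(3-4)(U/A)(2)-3′ into a regular expression and tests the DICE sequence quoted above against it; the names `HNRNP_K_MOTIF` and `matches_hnrnp_k_motif` are hypothetical, and the check addresses primary sequence only.

```python
import re

# Literal transcription of the hnRNP K consensus given in the text:
# U, then 3-4 C, then two residues that are each U or A.
# Hypothetical names; an illustration, not the authors' analysis.
HNRNP_K_MOTIF = re.compile(r"UC{3,4}[UA]{2}")


def matches_hnrnp_k_motif(rna):
    """True if the whole sequence is a single instance of the consensus."""
    rna = rna.upper().replace("T", "U")  # accept DNA-style input
    return HNRNP_K_MOTIF.fullmatch(rna) is not None


dice = "UCCCCAA"  # DICE motif as quoted in the text
print(matches_hnrnp_k_motif(dice))
print(matches_hnrnp_k_motif("UCCAA"))  # only two cytosines
```

`fullmatch` is used so that the test asks whether the motif as a whole is an instance of the consensus, rather than whether it merely contains one.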
The isolation of αCP-2KL targets with three C-patches reflected the power of SELEX to discriminate among targets with relatively close binding affinities. Mutagenesis of a representative αCP-2KL SELEX target (R7α1) revealed that slightly lower affinity binding could occur on less extensive targets. For example, elimination of one of the three patches resulted in a 2.1-fold decrease in relative binding affinity (Fig. 5A). Despite this limited loss of binding affinity, there were no SELEX RNAs that contained fewer than three C-patches. Equally remarkable was the observation that elimination of two of the three C-patches did not completely eliminate αCP-2KL binding. For example, two of the mutant R7α1 RNAs containing only a single residual C-patch were still able to mediate protein binding, albeit at a substantially lower affinity (Fig. 5B). In contrast to the interaction of hnRNP K with its single C-patch target, the interaction of αCP-2KL with a single C-patch was weak, and the complex appeared to differ in structure from that with the intact target (Fig. 5B). This weak and qualitatively distinct interaction could not be attributed to suboptimal (i.e. nonexposed) structure of the mutated R7α1-binding site because αCP-2KL also demonstrated very weak or no binding to a number of the hnRNP K SELEX targets that present the single C-patch in an optimized single-stranded conformation. Thus, while a single C-patch may be able to mediate interaction with an individual KH domain, and is sufficient for hnRNP K binding, the tandem array of three C-patches maximized αCP-2KL binding to its RNA target.
Interaction of αCP-2KL with mutated SELEX targets containing only a single residual C-patch not only demonstrated lower binding affinity than to the triple C-patch target but also resulted in an RNP complex with a more rapid electrophoretic mobility (Fig. 5B). This faster migration may have reflected an alteration in the overall geometry of the RNP complex or a change in the stoichiometry of the RNA-protein interaction. Of note, the optimal R7α1 target assembled a similar fast complex when αCP-2KL was added at low concentrations (Fig. 5B, 1st panel). We cannot exclude the possibility that the lower mobility complex might be generated by multimerization of αCP-2KL on multiple C-patches, whereas such a multimerization would not be compatible with a single C-patch configuration as seen in Fig. 5B. This model would be consistent with the ability of αCP-2KL to dimerize in cell extracts and in yeast two-hybrid assays (38).³ Selection for single C-patches on the hnRNP K SELEX targets might reflect a corresponding inability of hnRNP K to undergo productive protein-protein interactions.
FIG. 9. Specific binding of SELEX RNAs to native hnRNP K and αCP-2KL in the context of a cellular extract. (Note, all fractions were treated in parallel for the analyses shown in A-D.) A, detection of the major poly(C) binding activity in the extract. Superdex 200 HR 10/30 column fractions of MEL cell S100 extract were electrophoresed on an SDS-10% polyacrylamide gel, electroblotted to a nitrocellulose membrane, and incubated with ³²P-labeled poly(C). The fraction numbers are indicated above respective lanes. The elution positions of ovalbumin (43 kDa), bovine serum albumin (67 kDa), and catalase (232 kDa) are indicated above. The positions of protein size markers for the SDS-polyacrylamide gel electrophoresis analysis (right) and hnRNP K and αCP are indicated (left). B, immunodetection of αCP-2 and αCP-2KL. Protein fractions are as in A. The filter was probed with the rabbit polyclonal antibody specific for αCP-2 and αCP-2KL (30). The position of the αCP-2 and αCP-2KL immunoreactive proteins (left) and size markers (right) are indicated. C, gel shift with R7K15 probe. Total MEL S100 extract (lanes 1-3) or fractions 17-33 (lanes 4-20) were incubated with ³²P-labeled R7K15 probe and electrophoresed on a native 5% acrylamide gel. Competition experiments with unlabeled excess poly(C) (lane 2) or poly(CT) (lane 3) are shown. The position of the hnRNP K complex is indicated (left). Note that the rapidly migrating band present in all lanes most likely corresponded to an RNase T1 digestion fragment of the probe because this band is also present in the absence of S100 extract (data not shown). Based on the intensity of this "background" band, it appears that lane 11 (fraction 24) is underloaded in this gel. D, gel shift with R7α1 probe. Lanes as described for C. The position of the α-complex is indicated (left). E, supershift of complexes forming on R7α1 RNA with total MEL S100 extract in the presence of an antibody specific to αCP-2 and αCP-2KL. Gel shift analysis was carried out under conditions identical to D. Affinity-purified antibodies specific to αCP-2 and αCP-2KL (lanes 5-7), anti-glutathione S-transferase (GST) (lane 8), and anti-c-Myc (lane 9) were all derived from rabbits and added at similar protein concentrations to the gel shift incubation mixture. The addition of MEL S100 extract, poly(C), or poly(CT) competitor is indicated with + and − above each lane.
The secondary structures of both sets of SELEX targets appeared to be crucial to high affinity interactions. The M-fold-generated secondary structures and RNase mapping demonstrated that the consensus sequences were encompassed within a single-stranded loop and that these loops were substantially longer than would have been necessary to accommodate the consensus binding sites (Figs. 4 and 7). The relevance of RNA secondary structure to αCP-2KL binding affinity was highlighted by comparing the structure of the SELEX target to that of the native α-globin 3′-UTR. Although the primary sequence of the α-globin 3′-UTR-binding site was consistent with the SELEX consensus, its affinity for αCP-2KL was substantially lower than that of the αCP-2KL SELEX RNAs (Fig. 2, lanes 4 versus 7). Secondary structure mapping of the α-globin 3′-UTR (summarized in Fig. 4C) demonstrated that the only segment in a clearly defined single-strand configuration (RNase T2-sensitive) was a tight loop containing part of the second C-patch (Fig. 4C). The 20-fold lower relative binding affinity of the α-globin 3′-UTR compared with the R7α1 SELEX RNA is similar to that for the mutated R7α1 containing a single C-patch. Thus the lower binding affinity of the native α-globin 3′-UTR may reflect a suboptimal presentation of the binding site within the secondary structure, which effectively exposes only a single C-patch. This lower strength of RNA-protein interaction may be more consistent with its in vivo functions. Further studies using appropriately altered binding sites in model target mRNAs can address this possibility.
The specificities of hnRNP K and αCP-2KL for their respective SELEX RNAs (Fig. 8) were consistent with their optimized SELEX-binding sites; a single C-patch was sufficient for high affinity binding by hnRNP K but not for αCP-2KL, whereas a triple C-patch was necessary for high affinity binding by αCP-2KL. Cross-binding studies with recombinant proteins demonstrated that hnRNP K could bind to αCP-2KL SELEX targets although it appeared to interact most strongly with its own SELEX targets. This higher affinity of hnRNP K for its own SELEX targets may have reflected a more limited and sterically constrained single-stranded binding site compared with that presented on the αCP-2KL SELEX RNAs containing multiple exposed C-patches. The specificity of RNA-protein interactions appeared to be even greater when tested in the context of native cell extracts. SELEX RNAs bound specifically to their corresponding native proteins; R7K15 RNA bound to hnRNP K and R7α1 bound to αCPs (Fig. 9, C and D). The specificity of R7α1, the αCP-2KL SELEX target, for αCP-2KL in the extract was unexpected as this target contains multiple C-patches that can be bound by recombinant hnRNP K (Fig. 8). This higher apparent specificity of R7α1 in the context of extracts may reflect the relatively lower levels of hnRNP K than αCP in the cytosol (hnRNP K is predominantly a nuclear protein). The selective interaction of R7K15, the hnRNP K SELEX target, with hnRNP K and its lack of binding to αCP-2KL were consistent with the requirement for multiple C-patches for high affinity interaction with αCP-2KL.
In conclusion, the generation and analysis of SELEX targets to αCP-2KL and hnRNP K in the present study have defined optimal RNA structures for each of these two KH domain proteins. The physical analysis of the SELEX targets and binding studies using both recombinant proteins and native cytosolic extracts support the conclusion that αCP-2KL and hnRNP K have distinct binding specificities. The greater structural complexity of the RNA target binding sites for αCP-2KL versus hnRNP K may correspond to the respective roles that these two proteins play in sequence-specific post-transcriptional controls and in nuclear RNA packaging. The structural basis for these distinct binding interactions will be of interest for subsequent study.