Determination of the RNA binding specificity of the heterogeneous nuclear ribonucleoprotein (hnRNP) H/H'/F/2H9 family.

Members of the heterogeneous nuclear ribonucleoprotein (hnRNP) H protein family, H, H', F, and 2H9, are involved in pre-mRNA processing. We analyzed the assembly of these proteins from splicing extracts onto four RNA regulatory elements: a high affinity hnRNP A1-binding site (WA1), a sequence involved in Rev-dependent export (p17gag INS), an exonic splicing silencer from the beta-tropomyosin gene, and an intronic splicing regulator, the downstream control sequence (DCS), from the c-src gene. The entire family binds the WA1, instability (INS), and beta-tropomyosin substrates, and the core-binding site for each is a run of three G residues followed by an A. Transfer of small regions containing this sequence to a substrate lacking hnRNP H binding activity is sufficient to promote binding of all family members. The c-src DCS has been shown to assemble hnRNP H, not hnRNP F, from HeLa cell extracts, and we show that hnRNP 2H9 does not bind this element. The DCS contains five G residues followed by a C. Mutation of the C to an A changes the specificity of the DCS from a substrate that binds only hnRNP H/H' to a binding site for all hnRNP H family members. We conclude that the sequence GGGA is recognized by all hnRNP H family proteins.

Some proteins of the heterogeneous nuclear ribonucleoprotein (hnRNP) class were initially described as components of a nucleosome-like structure that protects and organizes nascent RNA polymerase II transcripts (1-3). Our understanding of hnRNP functions has broadened from this purely structural vision to include multiple aspects of mRNA biogenesis such as transcription, splicing, capping, polyadenylation, nuclear transport, and mRNA stability. More than 30 hnRNPs have been identified so far. Many hnRNPs have a modular structure characterized by one or more RNA binding domains and by one or more auxiliary domains that are frequently enriched in a few amino acids, mainly glycines. The RNA recognition motif (RRM), the arginine-rich motif, the KH domain, and the RGG box are some of the main motifs that are found in hnRNP proteins (4).
The precise roles of hnRNP proteins in gene expression are likely to be reflected by their RNA binding specificity. Therefore, it is important to have a clear understanding of the RNA binding specificities and affinities of the different family members. Early studies demonstrated that several hnRNPs have affinities for immobilized ribohomopolymers: poly(G) is bound by hnRNPs E, H, F, and M, poly(C) by hnRNPs K and J, and poly(U) by hnRNPs C and M (5). Subsequently, several hnRNPs were observed to bind specific RNA sequences by UV-mediated cross-linking. Furthermore, utilizing a selection and amplification approach from pools of random RNAs, high affinity binding sequences for hnRNPs A1 (6) and C (7) have been identified. The binding properties of hnRNP A1 are the best characterized so far. Several reports (1, 8-12) have shown that hnRNP A1 and other members of the hnRNP A/B group (hnRNP A1b, A2, and B1) share common RNA substrate specificities and functions in pre-mRNA splicing. In a previous study (11) we showed that hnRNP A1, A2, and B1 specifically bind an exonic splicing silencer (ESS) in HIV-1 tat exon 2. The binding of the hnRNP A/B family to this sequence is responsible for inhibiting splicing of the viral transcript (11).
Members of the hnRNP H group of hnRNPs, hnRNP H, hnRNP H', hnRNP F, and hnRNP 2H9, are involved in mRNA processing and exhibit extensive sequence homology. In humans, hnRNPs F, H, H', and 2H9 are encoded by different genes but share a common structure of two (hnRNP 2H9) or three (hnRNPs F, H, and H') repeats of a similar RNA binding domain named the quasi-RNA recognition motif (qRRM) and two glycine-rich auxiliary domains (13-15). hnRNP F has been shown to be involved in the neuronal specific splicing of the N1 exon of the c-src gene through its interaction with an intronic splicing enhancer sequence (16). hnRNP F has also been shown to interact with the nuclear cap-binding protein complex (17), and recently a role in transcription has been proposed via its interaction with the TATA-binding protein (18). hnRNPs H and H' are 96% identical and are likely to share common properties and functions (13). They have been found to be associated with nuclear-matrix proteins (19). hnRNP H' has been implicated in pre-mRNA 3' end formation (20), whereas hnRNP H is involved in splicing regulation as part of the intronic splicing enhancer complex in the c-src neuronal specific N1 exon (21) and through binding to the exon 7 exonic splicing silencer of the rat β-tropomyosin gene (22). hnRNP 2H9 is the most recently identified member of this family, and little is known about its functions except for its role in the splicing arrest induced by heat shock (14, 23).
In this report we set out to define the RNA binding specificities for the members of the hnRNP H subfamily of hnRNP proteins. We use an RNA affinity chromatography assay in which we examine hnRNP protein assembly from splicing extracts onto four different sequences known to regulate splicing or mRNA transport. In addition to studying the specific assembly of the hnRNP H group onto these substrates, we also examine the ability of other hnRNP proteins to bind to these sequences including hnRNP A1, L, K/J, and C1/C2. These hnRNPs have been found to bind to specific RNA sequences and to regulate different steps of the mRNA processing pathway (1, 24-33). In this work we show that all members of the hnRNP H group specifically assemble onto an hnRNP A1 high affinity-binding site previously identified through iterative selection (6). This assembly is independent of hnRNP A1 and involves a different subset of sequences. We also show that all members of the hnRNP H group can assemble onto the HIV-1 p17gag instability (INS) sequence that acts synergistically with the Rev response element (RRE) to promote the export of the unspliced HIV-1 transcripts (34). All of the hnRNP H group members assemble onto the rat β-tropomyosin exon 7 ESS sequence. This sequence has been shown previously to be an hnRNP H-binding site, and hnRNP H binding inhibits exon inclusion (35, 36). Finally, we analyzed the hnRNPs assembling specifically onto the neural c-src N1 exon intronic splicing enhancer, known as the downstream control sequence (DCS), which has been extensively characterized in previous work (16, 21, 37-40). In this interesting case, we see distinct specificities of the different family members for assembling onto the substrate, and this assembly is dependent on the cell line source of the extract.
We identified the core sequence required for the assembly of hnRNPs of the H family onto RNA substrates, and we show that a small region containing the core of the binding region is all that is required for specific binding. The sequence GGGA is required for the binding of all hnRNP H proteins, whereas a run of five Gs followed by a C promotes only hnRNP H and H' binding. The overlapping nature of the binding specificity for family members indicates that these proteins may have overlapping yet distinct functions in the various mRNA processing events discussed.

EXPERIMENTAL PROCEDURES
Immobilization of RNA on Agarose Beads and RNA Affinity Assays-RNAs were covalently linked to adipic acid dihydrazide-agarose beads by modification of a published procedure (41) as described previously (11). 500 pmol of RNA were placed in a 400-µl reaction mixture containing 100 mM sodium acetate, pH 5.0, and 5 mM sodium m-periodate (Sigma). Reaction mixtures were incubated for 1 h in the dark at room temperature. The RNA was then ethanol-precipitated and resuspended in 500 µl of 0.1 M sodium acetate, pH 5.0. 400 µl of adipic acid dihydrazide-agarose bead 50% slurry (Sigma) was washed four times in 10 ml of 0.1 M sodium acetate, pH 5.0, and pelleted after each wash at 300 rpm for 3 min in a clinical centrifuge. After the final wash, 500 µl of 0.1 M sodium acetate, pH 5.0, were added to the beads, and the slurry was then mixed with the periodate-treated RNA and rotated for 12 h at 4°C. The beads with the bound RNA were then pelleted and washed three times in 1 ml of 2 M NaCl and three times in 1 ml of buffer D (20 mM HEPES-KOH, pH 7.6, 5% v/v glycerol, 0.1 M KCl, 0.2 mM EDTA, 0.5 mM dithiothreitol). The binding efficiency of RNA to the beads was between 70 and 80% as determined using 5' 32P-end-labeled RNA.
The beads containing immobilized RNA were incubated in a reaction mixture containing 250 µl of HeLa cell nuclear extract and 400 µl of buffer D for 20 min at 30°C. Beads were then pelleted by centrifugation at 1000 rpm for 3 min and washed four times with 1 ml of buffer D. After the final centrifugation, the proteins bound to the immobilized RNA were eluted by addition of 60 µl of protein sample buffer.
Substrate RNA Synthesis-Substrate RNAs for bead immobilization were synthesized by in vitro transcription using T7 RNA polymerase. Linker sequences were added to short RNA substrates to prevent steric hindrance of protein complex assembly onto the RNAs by the agarose beads. The complete sequences of all the substrate RNAs are reported in Table I.
Preparation of hnRNP A/B-depleted Nuclear Extracts-HeLa cell nuclear extracts were depleted of hnRNP A/B proteins as described (11). Two consecutive rounds of depletion were performed utilizing the hnRNP A/B high affinity ESS WT RNA sequence derived from HIV-1 tat exon 2 immobilized on agarose beads. The low affinity control RNA was immobilized on beads and used to treat extracts to make a control extract not depleted of hnRNP A/B proteins.
Isolation and Sequencing of hnRNP H-The isolation of hnRNP H

RESULTS
hnRNPs Binding to an hnRNP A1 High Affinity Binding Sequence-By utilizing iterative selection (SELEX), Burd and Dreyfuss (6) identified a high affinity hnRNP A1-binding sequence. The sequence of this binding site, referred to as WA1, is 5'-UAUGAUAGGGACUUAGGGUG-3'. An additional protein of 50 kDa from HeLa extracts, unrelated to hnRNP A1, was also observed to bind to this same sequence in a UV cross-linking assay (6). In a previous study (11), using RNA affinity chromatography, we also detected a 50-kDa protein from HeLa cell extracts assembling onto this high affinity hnRNP A1-binding sequence, in addition to the expected binding by members of the hnRNP A/B subfamily. We identified the 50-kDa protein as hnRNP H by sequencing the protein directly (see "Experimental Procedures"). The identity of the protein was confirmed with hnRNP H-specific antibodies (Fig. 1C).
hnRNP H is known to have a role in RNA processing (21,22). We sought to determine whether the recruitment of hnRNP H to the WA1 sequence, which can function as a splicing silencer element when substituted for an ESS in HIV-1 tat exon 2 (11), is dependent on hnRNP A1 activity. If a dependence exists, this would imply a new cooperative interaction between these two subfamilies of hnRNP proteins. hnRNP A1 can interact with other hnRNPs through its glycine-rich domain (45), and this domain might be responsible for the recruitment of hnRNP H to the substrate. To test this hypothesis, we prepared a HeLa cell nuclear extract that is depleted of hnRNP A/B proteins utilizing RNA affinity chromatography. Two consecutive rounds of RNA bead depletion were performed on HeLa nuclear extract using an immobilized RNA that binds hnRNP A/B proteins with high affinity but not hnRNP H. This hnRNP A/B-binding RNA sequence is derived from the second tat exon of HIV-1 and serves as a splicing silencer element (11). After two rounds of depletion, more than 95% of the hnRNP A/B proteins have been removed, yet this treated extract retains splicing activity in in vitro splicing assays (11). When nuclear extract depleted of hnRNP A/B proteins is incubated with beads containing the WA1 substrate RNA immobilized to them, the amount of hnRNP H bound (Fig. 1A, lane 2) does not vary with respect to the amount that binds from undepleted and mock-depleted nuclear extracts (Fig. 1A, lanes 1 and 3). This implies that the binding of hnRNP H to the WA1 sequence is independent of hnRNP A/B protein activity.
Because hnRNP H and hnRNP A/B proteins appear to bind to the WA1 sequence independently, they must both have binding sites within this sequence. The core sequence bound by hnRNP A1 has been described as UAGGG(A/U) (6); the WA1 RNA substrate contains two of these elements. To determine the binding specificity of hnRNP A1 and hnRNP H to the WA1 sequence, we tested the binding of both proteins to RNAs containing a small number of base substitutions in this sequence (Fig. 1B). These RNAs were covalently linked to agarose beads, and the RNA-linked beads were incubated in HeLa cell nuclear extract and then washed extensively. Proteins remaining bound to the beads were eluted in sodium dodecyl sulfate-containing buffer and loaded onto SDS-PAGE gels. The gels were transferred to nitrocellulose and probed with antibodies specific for different hnRNP proteins. In addition to testing for the presence of hnRNP A1 and hnRNP H, we also immunoblotted with antibodies specific for the hnRNP H-related proteins hnRNP F and hnRNP 2H9, as well as antibodies specific for the unrelated proteins hnRNP C, hnRNP K/J, and hnRNP L (Fig. 1C). A control lane containing 10 µl of total nuclear extract was also included in this immunoblot experiment. When this amount of nuclear extract was compared with the amount of extract incubated with the beads used for each experiment, the 10 µl of nuclear extract (NE) lane represents 1/6 the amount of nuclear extract from which the proteins in each affinity binding lane in the gel were derived. However, in our experimental methods there is a loss of beads during the various washing steps, so we estimate (based on monitoring 32P-end-labeled RNAs bound to the beads) that only 50% of the original quantity of beads was recovered after all of the washing steps.
Therefore, the amount of nuclear extract loaded in the control lane is roughly 1/3 of the nuclear extract from which the eluted proteins were derived in each experiment. When a protein is indicated as assembling specifically onto an RNA sequence, its binding to the RNA sequence was much greater than its binding to a nonspecific RNA control, and at least 1/3 of the total amount of that protein that was present in the starting extract remained bound to the beads after the washing steps.
Fig. 1 legend (fragment): [...] Table I for complete sequences) were covalently linked to agarose beads and incubated in HeLa nuclear extracts. After extensive washing, proteins bound to the substrates were eluted and loaded onto an SDS-PAGE gel (lanes 1-5) along with 10 µl of HeLa cell nuclear extract (lane 6). Gels were transferred to nitrocellulose and immunoblotted with indicated antibodies against various hnRNP proteins.
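The loading estimate above can be written out as a short calculation. This is only a sketch of the stated reasoning; the symbols $V_{\mathrm{NE}}$ (extract loaded in the control lane), $V_{\mathrm{aff}}$ (extract equivalent behind each affinity lane), and $r$ (fraction of beads recovered) are labels introduced here for illustration and do not appear in the original text.

```latex
\[
\frac{V_{\mathrm{NE}}}{V_{\mathrm{aff}}} = \frac{1}{6},
\qquad r \approx 0.5
\]
\[
\frac{V_{\mathrm{NE}}}{r\,V_{\mathrm{aff}}}
  = \frac{1/6}{0.5}
  = \frac{1}{3}
\]
```

Under these assumptions, a band in an affinity lane that is at least as intense as the corresponding band in the control lane corresponds to retention of roughly one-third or more of that protein from the starting extract.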
hnRNP A1 and hnRNP H both assemble specifically onto the WA1 and WA1 M2 sequences, implying that the WA1 M2 mutation does not disrupt binding for either protein (Fig. 1C,  lanes 1 and 3). The WA1 M1 sequence dramatically decreases binding affinity for both proteins indicating that the two sets of UAG triplets that are mutated in WA1 M1 are important for binding of both hnRNPs (Fig. 1C, lane 2). WA1 M3 has a striking effect on the binding of these proteins. WA1 M3 still retains strong affinity for hnRNP A1, but it disrupts hnRNP H binding indicating that mutation of a single guanosine residue in both of the GGG triplets present in WA1 selectively disrupts the binding of hnRNP H (Fig. 1C, lane 4). Based on the lack of hnRNP H binding to the WA1 M1 and WA1 M3 substrates, it appears that the run of three G residues forms the core of hnRNP H binding. This binding to a run of G residues is consistent with previous observations (5) of hnRNP H binding to poly(G) homopolymers. Analysis of hnRNP F and hnRNP 2H9 assembly to these substrates indicates that these two proteins, with extensive sequence homology to hnRNP H, have indistinguishable binding specificity from hnRNP H. None of the other hnRNPs tested (C1, C2, K, J, and L) bind the WA1 substrate with higher specificity than a control RNA sequence containing no known regulatory motifs (Fig. 1C, lane 5).
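The motif positions discussed above can be checked directly on the WA1 sequence quoted in the text. The following is a minimal sketch, not part of the original analysis: the function name `find_sites` and the pattern names are illustrative assumptions, and only the WA1 sequence, the UAGGG(A/U) core, and the GGG/GGGA motifs come from the text. The mutant sequences (WA1 M1 to M3) are listed in Table I and are not reproduced here.

```python
import re

# WA1 sequence as quoted in the text (5' to 3').
WA1 = "UAUGAUAGGGACUUAGGGUG"

# hnRNP A1 core as described in the text: UAGGG followed by A or U.
A1_CORE = re.compile(r"UAGGG[AU]")
# Candidate hnRNP H family core examined in this report.
H_CORE = re.compile(r"GGGA")
# Any run of three or more G residues.
G_RUN = re.compile(r"G{3,}")


def find_sites(pattern, rna):
    """Return (0-based start, matched text) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in pattern.finditer(rna)]


if __name__ == "__main__":
    print("A1 core:", find_sites(A1_CORE, WA1))
    print("GGGA   :", find_sites(H_CORE, WA1))
    print("G runs :", find_sites(G_RUN, WA1))
```

The scan finds two UAGGG(A/U) elements, in line with the statement that WA1 contains two hnRNP A1 core elements, and shows that both G triplets sit inside those elements, which is why mutations in the triplets can affect the two protein groups differently.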
hnRNPs Binding to the p17gag Instability (INS) Sequence-Given the similarity between the RNA sequence recognized by hnRNP A1 and H/H'/F/2H9, we sought examples of naturally occurring sequences that bind hnRNP A1 and that could potentially bind members of the H/H'/F/2H9 group as well. A clear candidate is the HIV-1 p17gag instability (INS) sequence. This sequence acts synergistically with the Rev response element to promote Rev-dependent export of unspliced transcripts (Fig. 2A) (46). Interestingly, the INS sequence has been shown to be a specific binding site for both hnRNP A1 and for an unknown protein of 50 kDa (34). When the INS is substituted with the hnRNP A1 high affinity WA1 sequence, the new viral substrate RNA retains the synergistic stimulation of Rev-dependent transport (34). The INS sequence contains a GAUGGGA element that when mutated disrupts the binding of both hnRNP A1 and the 50-kDa protein. This disruption also inhibits the INS activity of the sequence (34). Given the sequence similarity between the INS and the WA1 sequence GAUAGGGA (with an extra A residue relative to the INS sequence), hnRNP H is a candidate to be the 50-kDa protein.
To confirm the identity of the unknown 50-kDa cellular protein, we analyzed hnRNP assembly on the wild type and mutant INS sequences using RNA affinity chromatography from HeLa cell nuclear extracts (Fig. 2B). The assay we performed confirms the specific binding of hnRNP A1 to the INS sequence and identifies the 50-kDa protein as hnRNP H. Analysis of hnRNP F and 2H9 confirms that these proteins bind to the INS element with specificity similar to hnRNP H/H' (Fig. 2C, lane 1). To test whether hnRNP H family members and hnRNP A1 have similar overlapping binding, we created a mutation in the three G residues that in the case of WA1 form the core for hnRNP H family member binding. As can be seen in Fig. 2C, lane 2, this mutation does indeed block binding of both hnRNP H group proteins and hnRNP A1. When we compare the assembly of hnRNPs H/H'/F/2H9 onto the INS sequence, the mutated substrate, or a random RNA sequence, the specificity [...] These results suggest the possible involvement of hnRNP H/H' and its relatives F and 2H9 together with hnRNP A1 in the Rev-dependent mRNA export pathway through interaction with the INS element. By using a UV cross-linking technique, a previous study did not identify hnRNP F and 2H9 in the same complex as hnRNP A1 and the 50-kDa protein, here characterized as hnRNP H (34). As in the case of WA1, the GGG motif is again essential for the assembly of hnRNP H family members onto an RNA substrate.
hnRNPs Binding to the Rat β-Tropomyosin Exon 7 Exonic Splicing Silencer-The rat β-tropomyosin gene has been extensively used as a model system to study the regulation of alternative RNA splicing (22, 35, 36, 47, 48). The β-tropomyosin pre-mRNA contains 2 pairs of mutually exclusive exons as follows: non-muscle and smooth muscle cells include exons 6 and 11 in the final mRNA, whereas skeletal muscle cells include exons 7 and 10. Mutational analysis identified two sequences that regulate exon 7 exclusion in non-muscle cells, an ESS located in exon 7 and an intronic regulatory element in the upstream intron (35, 49) (Fig. 3A). A protein complex assembles on the intronic regulatory element and includes polypyrimidine tract-binding protein (PTB), FUSE-binding protein, and a homolog of the human Sam 68 tyrosine phosphoprotein (50, 51). hnRNP H has been shown to bind to the ESS and to be required for inhibiting inclusion of exon 7 in smooth muscle cells (22).
Our results on the assembly of hnRNP H family members onto the WA1 and p17gag INS sequences indicate that hnRNPs H, H', F, and 2H9 are likely to have similar if not identical binding specificity for target RNA sequences characterized by a GGG motif. Because the β-tropomyosin exon 7 ESS sequence UGUGGGGA has been shown to bind hnRNP H specifically, we sought to determine whether hnRNP F and 2H9 also assemble onto the ESS sequence. We performed RNA affinity chromatography on HeLa cell extract using either the wild type β-tropomyosin exon 7 sequence or a mutant that disrupts the GGGG motif in exon 7 as substrate RNAs (Fig. 3B). After eluting the proteins bound specifically to the RNA-containing beads, we separated them by SDS-PAGE, transferred the gels to nitrocellulose, and probed with antibodies specific for different hnRNP family members. The results are shown in Fig. 3C. hnRNP F and 2H9 specifically assemble onto the exon 7 ESS as does hnRNP H/H'. The similarity in binding specificity and the sequence homologies among hnRNPs H/H', F, and 2H9 suggest the possible involvement of all the hnRNP H family members in the inhibition of exon 7 inclusion in smooth muscle cells, which was previously demonstrated for hnRNP H (22). hnRNP A1 shows stronger binding to the β-tropomyosin exon 7 sequence (Fig. 3C, lane 1) relative to a control RNA sequence (lane 3), but this binding is not inhibited by mutations to the hnRNP H family binding site (lane 2). This indicates that hnRNP A1 binds elsewhere to exon 7, and any possible role for this binding is unknown. When the β-tropomyosin substrates were tested for their ability to recruit hnRNPs C1, C2, K, J, and L, we could not detect any difference in binding between those substrates and the control RNA sequence.
hnRNPs Binding to the c-src N1 Exon Downstream Control Sequence-The mouse c-src gene contains an 18-nucleotide-long exon that is included in neurons but skipped in other cell types. This splicing regulatory pattern is maintained in HeLa cells where the N1 exon is skipped and in the WERI-1 retinoblastoma cell line where the N1 exon is efficiently included in mature messages. Mutational analysis has revealed a complex positively acting intronic regulatory sequence, named the downstream control sequence (DCS), located downstream of the N1 exon (16, 37). A protein complex assembles on the DCS in both WERI-1 and HeLa cell nuclear extracts, but only the complex assembled in neuronal cells is active in promoting splicing (16). The complex assembled in neural cells contains hnRNP F and H, two related proteins named KH-type splicing regulatory protein and FUSE-binding protein, PTB, and a recently characterized neural homolog of polypyrimidine tract-binding protein, nPTB (16, 21, 38, 52, 53). HeLa cells express only PTB and not nPTB, whereas WERI-1 cells express both PTB and nPTB, with nPTB in excess over PTB. It has been shown clearly that nPTB binds to the DCS. However, because the same antibody is used to identify both PTB and nPTB, and they have similar migrations on gels, the presence of PTB on the DCS in WERI-1 cells cannot be ruled out (52). Both hnRNP H and hnRNP F have been shown to be required for N1 exon inclusion in vitro (16, 21). The complex that assembles on the DCS in HeLa nuclear extracts lacks hnRNP F and nPTB (52). Because nPTB is the only one of these factors specifically expressed in neural cells (52), it is thought to be one of the factors that specifically regulates exon N1 inclusion in neural tissue.
hnRNP H but not hnRNP F binds to the GGGGGCUG element within the DCS in HeLa cell extracts, whereas both proteins bind this element in WERI-1 extracts as determined by cross-linking experiments. This is very different from our results presented here so far in which hnRNP H and hnRNP F in HeLa extract bind to various regulatory elements with similar specificity. We sought to confirm this result of differential hnRNP H family member affinity using RNA affinity chromatography while at the same time testing for the affinity of hnRNP 2H9 for this element.
We performed RNA affinity chromatography on HeLa cell nuclear extracts and WERI-1 cell nuclear extracts using a DCS substrate RNA and two mutant versions of this sequence (Fig. 4, B and C). Mutant N1 M1 disrupts a CUCUCU polypyrimidine run shown previously to be a binding site for PTB and nPTB (52). Mutant N1 M2 disrupts the run of 5 Gs shown previously to be a binding site for hnRNP H in HeLa extract and hnRNP H and hnRNP F in WERI-1 extract (52). When we analyzed the hnRNPs bound to the DCS, we found that our data matched the results previously obtained by Markovtsov et al. (52) using different techniques. hnRNPs H/F and PTB/nPTB bind specifically to two distinct elements in the DCS, the guanosine stretch and the CUCUCU sequence, respectively. This can be seen by comparing the binding of proteins to the N1 WT RNA (Fig. 4C, lanes 1 and 6) to the N1 M1 RNA that has a disruption in the CUCUCU sequence that inhibits binding of PTB/nPTB (Fig. 4C, lanes 2 and 7) and to the N1 M2 RNA that has a disruption in the GGGGG sequence that inhibits binding of hnRNPs H and F. Binding of hnRNP F and nPTB was observed in WERI-1 but not in HeLa nuclear extracts. Surprisingly, hnRNP 2H9, which we observed to be related to hnRNPs H and F in its RNA binding specificity, does not bind to the DCS in either WERI-1 or HeLa extracts. It is possible that in neural cells hnRNP F interaction with the DCS complex could be stabilized by neural specific factors such as nPTB. However, in mutant N1 M1, which disrupts nPTB binding in WERI-1 extract (Fig. 4C, lane 7), no concomitant decrease in hnRNP F binding was observed. It is possible that additional uncharacterized factors are important for selectively recruiting hnRNPs H and/or F but not hnRNP 2H9 to the DCS in the different extracts. Indeed, published data (52) indicate that not all the components of the protein complex binding to the DCS have been identified.
The protein complexes assembling on the DCS in HeLa and WERI-1 extracts were assayed for the presence of hnRNPs A1 (Fig. 4C) and C1, C2, K, J, and L (data not shown). No specific binding of any of these proteins to the substrate RNAs was detected.
Fig. 4 legend (fragment): [...] 7 and lanes 3 and 8), and control RNA substrate (lanes 4 and 9) were covalently linked to agarose beads and incubated in HeLa and WERI-1 nuclear extracts. Proteins bound to the substrates were eluted, separated on SDS-PAGE, and immunoblotted with indicated antibodies specific for hnRNPs. Lanes 5 and 10 contain 10 µl of HeLa and WERI-1 nuclear extract, respectively.
Specificity for Binding to hnRNP H Family Proteins Lies within a Short Sequence of RNA Centered on the GGG Element-To understand better the RNA specificity of the hnRNP H protein family members, we sought to determine a minimal binding consensus sequence. To this aim we inserted 10 nucleotides centered around the β-tropomyosin ESS and the DCS GGGGG run into the control RNA (Fig. 5A) that does not bind hnRNPs of the H family (Fig. 5B, lane 1). Insertion of the 10-nucleotide stretch derived from the β-tropomyosin exon 7 into the control RNA substrate was sufficient to promote the recruitment of all the hnRNPs of the H family to the substrate RNA (Fig. 5B, lane 4). When the 10 nucleotides derived from the c-src DCS were inserted into the control RNA, only hnRNPs H/H' were recruited (Fig. 5B, lane 5). Thus 10 nucleotides are sufficient to transfer the distinct binding specificities of these RNA elements for hnRNPs of the H family. Given the different binding specificity of the β-tropomyosin ESS and the c-src DCS substrates, we further wanted to prove that no sequences outside of the 10 nucleotides centered around the G run were required for specificity. In Fig. 5B, lanes 6 and 7, we show that the hnRNP H family binding specificity of the β-tropomyosin and the c-src substrates can be swapped when the 10 nucleotides centered around the G run are transferred between the two substrates. Therefore, the binding specificity of the DCS element for only hnRNP H and H' but not hnRNPs F or 2H9 in HeLa cell extracts lies in this short sequence centered around the run of five G residues. Bases outside this region are not involved in this selectivity.
A Single Nucleotide Substitution Can Promote Binding of all hnRNP H Proteins to the DCS Element-Analysis of the sequences bound by all members of the hnRNP H family, the WA1, INS, and β-tropomyosin ESS, indicates that the GGG sequence is required for binding. In all cases, the three Gs are followed by an A. In the DCS sequence where only hnRNP H/H' bind, a run of three Gs is essential, but the GGGGG sequence is followed by a C. To test the possibility that the GGGA sequence is responsible for the ability to bind all hnRNP H family members, the sequences UGGGGA and GGGGA derived from the core-binding site of the β-tropomyosin ESS were both inserted into the control substrate RNA that did not bind any hnRNP H family proteins (Fig. 6A). As shown in Fig. 6B, lanes 2 and 3, transferring either of these sequences to the control substrate RNA transfers the ability to bind all hnRNP H family members. Therefore, we have experimentally identified the minimal essential binding sequence for all proteins of the H family as the pentanucleotide GGGGA. This core is likely to be only the tetranucleotide GGGA, because the conserved sequences in the WA1 and p17INS substrates that bind all the family members are AGGGA and UGGGA, respectively.
The c-src DCS binds only hnRNP H/H' but not F and 2H9 in HeLa extract. Its sequence is divergent from the other hnRNP H family binding sites studied in this report in that although it contains a run of Gs, the Gs are followed by a C and not an A. To test whether this is the source of the specificity of the DCS for only hnRNP H/H', we generated two new RNA substrates (Fig. 5A). In the first, DCS UA, we mutated the DCS sequence by a two-nucleotide substitution around the run of five Gs from GGGGGC to UGGGGA, so that it better matched the β-tropomyosin core sequence. In the second DCS mutant substrate, DCS A, we mutated the C after the G run to an A, GGGGGA, to test whether the establishment of a GGGA sequence was sufficient to alter the specificity. Fig. 5B, lanes 6 and 7, shows that both of these two changes were sufficient to change the specificity of the DCS from only binding hnRNP H/H' to binding of all family members. This is most dramatically seen in lane 6 where a single nucleotide change to establish a GGGA sequence allows the c-src DCS to bind all the members of the hnRNP H family.

DISCUSSION
Previous studies showed that the role of hnRNP proteins in mRNA biogenesis is reflected by their RNA binding specificity (1, 8-12). In this study we sought to identify specific binding sequences for the hnRNP H group of hnRNP proteins which contains hnRNPs H, H', F, and 2H9. Identification of high affinity RNA-binding sequences for these proteins would be helpful in understanding the function of this protein family, members of which have been shown to have roles in mRNA processing (14, 16, 17, 20-23). The proteins of the hnRNP H group share extensive sequence homology in their RNA binding domains that are characterized by a distinctive RRM named the qRRM (13-15). This suggests that the hnRNP H family members may share common RNA binding affinities and related functions.
Our results demonstrate that hnRNP H, H', F, and 2H9 do indeed share common RNA binding specificity, assembling on RNA substrates characterized by the GGGA sequence. In all of the substrates we tested, mutations in the GGG motif inhibit binding of all hnRNP H group members.
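The relationship between the nucleotide that follows the G run and the set of family members that assemble can be summarized as a small lookup. The sketch below is illustrative only: the function name `predicted_binders` and the category labels are assumptions made here, and the rule covers just the two cases reported in HeLa extract in this work (G run followed by A, G run followed by C). Any other context is reported as not tested rather than guessed.

```python
import re

# One run of three or more G residues plus the nucleotide after it, if any.
G_RUN_NEXT = re.compile(r"(G{3,})([ACU]?)")


def predicted_binders(rna):
    """Map a G-run element to the HeLa binding pattern reported in the text.

    Only two cases are supported by the data: a G run followed by A
    (all family members) and a G run followed by C (hnRNP H/H' only).
    """
    hits = G_RUN_NEXT.findall(rna)
    if not hits:
        return "no G run"
    following = {nxt for _, nxt in hits}
    if "A" in following:
        return "H, H', F, 2H9"
    if "C" in following:
        return "H/H' only (HeLa)"
    return "not tested"


# Element sequences exactly as quoted in the text.
ELEMENTS = {
    "WA1": "UAUGAUAGGGACUUAGGGUG",
    "INS element": "GAUGGGA",
    "beta-tropomyosin ESS": "UGUGGGGA",
    "c-src DCS": "GGGGGCUG",
    "DCS UA": "UGGGGA",
    "DCS A": "GGGGGA",
}

if __name__ == "__main__":
    for name, seq in ELEMENTS.items():
        print(f"{name:22s} {seq:22s} {predicted_binders(seq)}")
```

The lookup deliberately ignores extract-specific effects, such as the hnRNP F binding to the DCS seen in WERI-1 extract, which depend on factors beyond the RNA sequence.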
There is strong evidence of a role for GGG triplets in splicing regulation. GGG triplets have been found associated with the intronic portion of 5Ј-and 3Ј-splicing sites (54,55) and have been shown to positively regulate exon selection when posi- tioned in short introns (56). Furthermore, a screen in tissue culture cells for sequences that promote exon skipping in vivo, when placed in an alternative central exon of a reporter gene, identified sequences enriched in G triplets as capable of promoting exon skipping (57). These results suggest that GGG sequences when located in introns may promote splicing of the flanking exons, and when inserted in exons they may act to repress splicing. Although the molecular mechanisms regulating these processes are unknown, members of the hnRNP H/HЈ/ F/2H9 group are likely to be involved. In agreement with this model the exonic splicing silencer sequence of the rat ␤-tropomyosin exon 7 has been shown to bind to hnRNP H, and this protein is important for active inhibition of exon 7 splicing (22). We show that hnRNPs H, HЈ, F, and 2H9 all specifically assemble onto this sequence indicating that other hnRNP H group members may also play a role in this splicing regulation (Fig. 3C). In the c-src gene, the intronic DCS regulatory region, which promotes exon N1 inclusion, contains a GGGGGC motif upon which hnRNP H and hnRNP F assemble in neural cells (52) (Fig. 4C). Both of these proteins have been shown to enhance splicing of the N1 exon in neural tissues (16,21).
These two examples suggest that members of the hnRNP H/H′/F/2H9 group can either stimulate or repress splicing upon binding to a GGG motif.
In this work we have shown that the GGGA sequence is the minimal element required for binding by members of the hnRNP H/H′/F/2H9 group, and no other flanking RNA sequences are required. In the DCS regulatory element of the c-src N1 exon, this GGGA sequence is not found. The GGGGGC sequence assembles only hnRNP H/H′ from HeLa extracts, and other family members do not bind (52) (Fig. 4C). This argues against cooperative binding of different family members together to the RNA. In contrast, when this same substrate is incubated in WERI-1 cell extracts, both hnRNP H/H′ and hnRNP F assemble onto the substrate; 2H9 still does not bind. The binding of hnRNP F is dependent on the G motif. This implies that there is an extract-specific component to the assembly of hnRNP F onto this motif. This may involve other protein factors that are part of the neuronal DCS complex, such as nPTB. However, for an RNA substrate in which the PTB/nPTB-binding sequence CUCUCU is mutated, binding of nPTB is diminished, but hnRNP F binding remains strong (Fig. 4C, lane 7). This implies that other, perhaps unknown, proteins in the DCS complex may stabilize the assembly of hnRNP F from the WERI-1 extract. A similar situation, in which hnRNP F binding specificity depends on other protein factors, has been seen for hnRNP F binding to the nuclear messenger RNA cap-binding complex. hnRNP F does not bind the RNA in the absence of cap-binding proteins, whereas hnRNP H binds to the RNA substrate in the presence or absence of these proteins (17). When the GGGGGC sequence is mutated to GGGGGA, hnRNPs F and 2H9 bind efficiently to the substrate in HeLa extract, and an increase in hnRNP H/H′ binding efficiency is seen as well (Fig. 5B).
These results are consistent with strong direct binding of all hnRNP H family members to the GGGA sequence, whereas on weaker consensus sequences the binding of individual family members can be modulated by cooperative binding with other protein factors.
hnRNP 2H9 is the most recently identified of the proteins we analyzed in this study, and this is the first report characterizing its RNA binding specificity. We showed that hnRNP 2H9 assembles onto three of the RNA substrates with a specificity similar to that of the other hnRNP H group members. The exception was the N1 exon DCS regulatory sequence, onto which hnRNP 2H9 does not assemble. hnRNP 2H9 is clearly the most divergent member of the group. It lacks the upstream qRRM, and it shows an overall homology of 57 and 59% when compared with F and H/H′, respectively. Whereas hnRNP 2H9 shares the GGGA core-binding sequence with the other hnRNP H group members, its divergent primary amino acid sequence may confer a different ability to interact with other protein factors or with suboptimal RNA-binding sequences, as evidenced by its inability to assemble onto the N1 DCS in both HeLa and WERI-1 extracts. There is evidence that at least six forms of the hnRNP 2H9 mRNA are generated by alternative splicing from a single gene (58). The protein isoforms encoded by these messages may also have distinct but overlapping functions in the cell, expanding the repertoire of hnRNP H group functions.
We have identified GGGA as the core-binding site for the hnRNP H protein family members. Because hnRNPs H, H′, F, and 2H9 have similar RNA binding affinities, they are likely to share common functions. As evidenced by our study of the β-tropomyosin exon 7 splicing silencer and the HIV-1 p17gag INS element, characterization of the binding of one family member should lead researchers to check whether other family members also assemble on the regulatory element and whether they too have a role in RNA processing. The determination of the GGGA-binding sequence for the hnRNP H family should prove valuable in the identification of putative cis-regulatory elements involved in alternative splicing and in other RNA processing mechanisms.
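As an illustration of how the GGGA core sequence might be used to flag candidate elements, the following minimal sketch scans an RNA sequence for every occurrence of the motif. It is not part of the experimental work reported here: the function name `find_core_sites` and the dictionary `dcs_variants` are hypothetical, and the only data used are the three c-src DCS hexamers given in the text. A match marks a candidate site only, since binding of individual family members to weaker sequences also depends on the extract and on other protein factors.

```python
import re
from typing import List

CORE_SITE = "GGGA"  # core-binding sequence reported in the text


def find_core_sites(rna: str, motif: str = CORE_SITE) -> List[int]:
    """Return 0-based start positions of every (possibly overlapping) motif match."""
    # Normalize case and accept DNA-style input by converting T to U.
    seq = rna.upper().replace("T", "U")
    # A zero-width lookahead reports overlapping occurrences as well.
    return [m.start() for m in re.finditer(f"(?={re.escape(motif)})", seq)]


# The three c-src DCS hexamers described in the text (labels are illustrative).
dcs_variants = {
    "DCS (wild type)": "GGGGGC",
    "DCS UA": "UGGGGA",
    "DCS A": "GGGGGA",
}

for name, hexamer in dcs_variants.items():
    # An empty list means the hexamer contains no GGGA core site.
    print(name, hexamer, find_core_sites(hexamer))
```

Under these assumptions the wild-type hexamer yields no match, whereas both mutant hexamers contain one GGGA site, which parallels the change in binding specificity observed in HeLa extract.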