Sequences Adjacent to the 5′ Splice Site Control U1A Binding Upstream of the IgM Heavy Chain Secretory Poly(A) Site

We have recently shown that the stability of the alternatively expressed immunoglobulin M heavy chain secretory mRNA is developmentally regulated by U1A. U1A binds novel non-consensus sites upstream of the secretory poly(A) site and inhibits poly(A) tail addition in undifferentiated cells. U1A’s dependence for binding and function upon a stem-loop structure has been extensively characterized for the consensus sites. We therefore probed the structure surrounding the novel U1A binding sites. We show that two of the three novel binding sites represent the major single-stranded regions upstream of the secretory poly(A) site, consistent with a major role at this site. The strength of binding and the ability of U1A to inhibit poly(A) polymerase correlate with the accessibility of the novel sites. However, long range interactions are responsible for maintaining them in an open configuration. Mutation of an RNase V1-sensitive site 102 nucleotides upstream, directly adjacent to the competing 5′ splice site, changes the structure of one of the U1A binding sites and thus abolishes the binding of the second U1A molecule and the ability of U1A to inhibit poly(A) polymerase activity at this site. These sites bind U1A via its N-terminal domain but with a 10-fold lower affinity than U1 small nuclear RNA. This lower binding affinity is more conducive to U1A’s regulation of poly(A) tail addition to heterologous mRNA.

The major source of diversity in metazoans is the ability to process pre-mRNA into alternative mRNAs via the selection of alternative splice sites and alternative 3′-ends, giving rise to a range of proteins encoded by the same gene that are differentially expressed during growth and development (for review see Ref. 1). An important pattern of alternative processing involves a competition between the splicing on of extra exons and 3′-end cleavage at an upstream site, which cuts off these exons (for review see Ref. 2). This group of genes includes a number of important receptors involved in growth and differentiation. The best characterized model for this pattern is the IgM heavy chain gene, which is alternatively processed into mRNAs that encode a membrane receptor or secreted antibody during differentiation (3) (see Fig. 1). To prevent inappropriate expression of secreted antibody in undifferentiated cells, the stability of secretory mRNA is highly controlled. Changes in stability of the secretory mRNA during differentiation play a major role in the dramatic burst of production of secreted antibody once B cells differentiate into immunoglobulin secreting cells (4-7).
3′-End cleavage in metazoans takes place upon the recognition of a bipartite poly(A) signal consisting of a consensus AAUAAA and a less defined GU-rich sequence, upstream and downstream of the cleavage site, respectively, by components of the cleavage/polyadenylation complex. These consist of the multimeric cleavage polyadenylation specificity factor and cleavage stimulatory factor, cleavage factors I and II, and poly(A) polymerase (PAP) (reviewed in Ref. 8). After cleavage the RNA is specifically polyadenylated by PAP tethered to the RNA via the 160-kDa component of cleavage polyadenylation specificity factor, bound to the hexanucleotide sequence (9). Most eukaryotic mRNAs acquire a poly(A) tail during 3′-end processing within the nucleus, and this is coupled to efficient export of mRNAs to the cytoplasm (10,11). mRNAs that fail to acquire a poly(A) tail are retained at or near transcription sites and degraded by the nuclear exosome (12).
We have previously shown that the change in stability is regulated by the addition of a poly(A) tail to the secretory mRNA (13). The U1A protein binds upstream of the secretory poly(A) site and inhibits polyadenylation of the secretory mRNA in a developmentally regulated manner. We demonstrated that U1A inhibits poly(A) addition to the heterologous μ-secretory poly(A) site. This inhibition depends on sequences that resemble the consensus (AUUGCAC) U1A binding site but are non-consensus (A(U/G)GCN₁₋₃C) (13). These sequences are located in the intron that is spliced out upon formation of the membrane mRNA, i.e. between the 5′ splice site and the secretory poly(A) site. In this way they are not present in the membrane form of mRNA and do not regulate poly(A) addition to this form of mRNA. Thus U1A selectively regulates expression of the upstream poly(A) site. In addition, the effect of U1A is exerted more strongly in undifferentiated cells, in which the membrane form is predominantly expressed. In this way leaky recognition of the secretory poly(A) site in undifferentiated cells does not lead to the formation of viable secretory mRNA at an inappropriate stage.
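The difference between the consensus site and the non-consensus motif can be made concrete with a pattern search. The following is a minimal sketch, assuming the motif A(U/G)GCN1-3C is read literally as a regular expression in which N is any of A, C, G, or U; the function name and the toy sequence are hypothetical and are not taken from the IgM gene.

```python
import re

# Non-consensus motif as written in the text: A(U/G)GC N(1-3) C.
# The lookahead lets overlapping occurrences be reported (an illustrative choice).
NOVEL_MOTIF = re.compile(r"(?=(A[UG]GC[ACGU]{1,3}C))")

# Consensus U1A binding sequence quoted in the text.
CONSENSUS = "AUUGCAC"


def find_novel_motifs(rna):
    """Return (0-based start, matched text) for each motif occurrence."""
    return [(m.start(), m.group(1)) for m in NOVEL_MOTIF.finditer(rna.upper())]


# Hypothetical toy sequence, for illustration only.
toy = "GGAUGCAACUUAGGCUCAA"
print(find_novel_motifs(toy))
print(find_novel_motifs(CONSENSUS))
```

Under this literal reading, the consensus AUUGCAC does not itself match the non-consensus pattern, because its third base is U rather than G. A sequence match says nothing about accessibility, which is the subject of the structural probing described in this report.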
The ability of U1A to regulate poly(A) polymerase activity was first demonstrated in an autoregulatory system in which two molecules of U1A bind a complicated stem-loop structure called the polyadenylation inhibitory element (PIE) immediately upstream of its own hexanucleotide sequence (14). The binding of U1A to its consensus sequence on U1 snRNA and on PIE RNA has been extensively studied both biochemically and by NMR and crystallography (14-25). In addition to requiring certain nucleotides of the U1A consensus binding sequence for high affinity binding (26), the stem-loop structure of the binding site has also been shown to be essential for maintaining the binding site in an accessible configuration and for making contact with certain U1A residues (14,27).
We therefore wanted to determine the secondary structure surrounding the novel U1A binding sites upstream of the μ-secretory poly(A) site. We show that the U1A binding sites are the major accessible regions upstream of the secretory poly(A) site, consistent with a major role at this site. These structures are dependent on the flanking sequences, which control the accessibility and function of the U1A binding sites. We identify the sequences in the 5′ flanking sequence necessary for the structure of the hexanucleotide proximal (8s) U1A binding site and find that they are directly adjacent to the 5′ splice site involved in the competing splicing reaction.

EXPERIMENTAL PROCEDURES
Plasmid Constructs-For in vitro transcription, the PCR products from the secretory poly(A) site and surrounding sequences, containing mutations and a 5′ EcoRI site and a 3′ XbaI site introduced as part of the synthetic primers, were cloned into pGem 3Zf (containing a T7 promoter in the forward direction) between the EcoRI and XbaI sites. The forward primers were as follows: for V1S1 and 2, CCGGAATTCCGGACCGTGGACAAGTggtgaGGTAAACCCtgtgTGTACAATGTCTCCCTGATCATGTCTG; for V1S1, CCGGAATTCCGGACCGTGGACAAGTggtgacGGTAAACCCACACTGTACAATGTCTCC; and for V1S2, CCGGAATTCCGGACCGTGGACAAGTCCACTGGGTAAACCCtgtgTGTACAATGTCTCCCTGATCATGTCTG. The reverse primer was GCGTCTAGATAGGGTGGAGGCAAGTATGC, as previously described (13).
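As a quick consistency check on the primer design, the cloning sites and the mutated stretches can be located programmatically. This is a minimal sketch under two assumptions that are not stated in the text: that hyphens and spaces inside the printed primers are line-break artifacts, and that lowercase bases mark the mutated positions. The dictionary keys and helper names are hypothetical.

```python
import re

ECORI = "GAATTC"  # EcoRI recognition sequence
XBAI = "TCTAGA"   # XbaI recognition sequence

# Primer sequences as printed, with line-break hyphens and spaces removed (assumption).
PRIMERS = {
    "V1S1+2 forward": "CCGGAATTCCGGACCGTGGACAAGTggtgaGGTAAACCCtgtgTGTACAATGTCTCCCTGATCATGTCTG",
    "V1S1 forward": "CCGGAATTCCGGACCGTGGACAAGTggtgacGGTAAACCCACACTGTACAATGTCTCC",
    "V1S2 forward": "CCGGAATTCCGGACCGTGGACAAGTCCACTGGGTAAACCCtgtgTGTACAATGTCTCCCTGATCATGTCTG",
    "reverse": "GCGTCTAGATAGGGTGGAGGCAAGTATGC",
}


def site_position(primer, site):
    """0-based index of the first occurrence of a restriction site, or -1 if absent."""
    return primer.upper().find(site)


def lowercase_runs(primer):
    """Stretches of lowercase bases, taken here to be the mutated positions (assumption)."""
    return re.findall(r"[a-z]+", primer)


for name, seq in PRIMERS.items():
    site = XBAI if name == "reverse" else ECORI
    print(name, site, site_position(seq, site), lowercase_runs(seq))
```

Each forward primer carries the EcoRI sequence near its 5′ end and the reverse primer carries the XbaI sequence, in line with the cloning strategy described above.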
RNA Substrates-Templates for in vitro binding assays (EMSAs) were obtained by restriction of the pGem 3Zf constructs with XbaI or DdeI. Uniformly labeled RNA substrates were synthesized by in vitro transcription using T7 RNA polymerase as previously described (28). Transcripts that were 5′-end-labeled were obtained by the same protocol, substituting [γ-³²P]GTP as the radiolabeled nucleotide. RNA substrates that were 3′-end-labeled were synthesized by the same protocol omitting [α-³²P]UTP. The cold RNA was subsequently labeled at the 3′-end with [5′-³²P]Cp using T4 RNA ligase as previously described (29). All transcripts were purified by extraction after electrophoresis on 8% polyacrylamide gels containing 7 M urea.
Recombinant Proteins-Untagged recombinant wild type U1A and U1A 1-117 were purified from Escherichia coli as described previously (30). Recombinant bovine poly(A) polymerase, tagged at the C terminus with 6 histidines, was purified from E. coli on Ni²⁺-NTA. The His-tagged C terminus ensured that all of the C-terminal residues were present after purification (31).
RNA Secondary Structural Probing and Footprinting-Probing of the secondary structure of RNA was carried out as previously described (29). Briefly, all RNA substrates were first refolded by heating in reaction buffer at 70°C for 2 min followed by cooling to 37°C over a 30-min period. For RNase T1, RNase V1, and lead (II) cleavage, 10,000 cpm of 3′- or 5′-end-labeled RNA substrate was incubated at 37°C in a total volume of 5 μl with either 0.4-0.001 unit of RNase T1 (Roche Applied Science) or 0.072-0.00072 unit of RNase V1 (originally Amersham Biosciences, kind gift of Jean Patterson) in 50 mM PIPES, pH 7.0, 100 mM NaCl, 10 mM MgCl₂ for 5 min or 0.1-0.5 mM Pb(OAc)₂ in 50 mM Tris-HCl, pH 7.5, 100 mM NH₄Cl, and 10 mM MgCl₂ for 10 min. The reactions were stopped by the addition of 10 M urea, 25 mM EDTA. Samples were analyzed on 15% PAGE containing 7 M urea. A reference nucleotide ladder was made by boiling of substrates with 50 mM NaOH, 10 mM EDTA for 10 s. Footprints were obtained by adding U1A before T1 digestion. These reactions were stopped by the addition of 180 μl of proteinase K buffer (50 mM Tris, pH 7.9, 10 mM EDTA, 230 mM NaCl, 0.2% SDS) and 50 μg of proteinase K and incubated for 10 min at 30°C. This was followed by phenol/chloroform extraction and ethanol precipitation before loading on PAGE as above.
Electrophoretic Mobility Shift and Poly(A) Addition Assays-U1A binding assays were performed as previously described (14) using 10,000 cpm of ³²P-labeled RNA substrate per lane and recombinant U1A as indicated. Nonspecific poly(A) addition assays were performed as previously described (32) with 50 ng of recombinant poly(A) polymerase, and incubations were carried out for 30 min at 30°C.

RESULTS
U1A Binding Sites Are in the Major Accessible Regions Upstream of the Secretory Poly(A) Site-Fig. 1A is a schematic diagram of the alternative processing of the μ-heavy chain pre-mRNA, showing that the μ-secretory poly(A) site is located within the intron used to splice on the membrane exons. Fig. 1B shows the precise positions of the various elements known to be important for μ-secretory mRNA expression mentioned in the introduction, including the location of the U1A binding motifs, as well as the location of the 5′- and 3′-ends of the RNA substrate used in this report. It also shows the proximity between the 5′ splice site of this intron and the μ-secretory poly(A) site used for the μ-secretory mRNA. Positions are numbered according to the mouse IgM sequence with accession number V00818. Fig. 1C shows the sequence between 1838 and 1950, which we have previously scanned with mutations (mut1 through mut10), and the position of the 2s, 4s, and 8s mutations of the three novel U1A binding sites.
Because U1A binding to the consensus sequence on U1 snRNA is dependent upon a particular RNA hairpin stem-loop structure, we wanted to probe the structure surrounding the novel U1A binding sites that we have recently identified upstream of the μ-secretory poly(A) site. For this we used partial RNase T1 digestion. Substrates were digested with limiting amounts of RNase T1 for 5 min and then immediately run on 15% denaturing PAGE. RNase T1 cuts 3′ to Gs with a preference for single-stranded RNA. Thus Gs that give a strong pattern of cuts will be situated in the single-stranded region of hairpin loops or bulges (33).
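Because RNase T1 cuts 3′ to G residues, the set of positions at which a cut is possible at all follows directly from the sequence. The sketch below only enumerates those candidate positions; it does not model accessibility, which is what the partial digest actually measures. The function name, the toy sequence, and its starting coordinate are hypothetical.

```python
def t1_candidate_sites(rna, first_position=1):
    """Positions of G residues, numbered from first_position.

    RNase T1 cleaves 3' to each G, so these are the only positions at which
    a T1 cut can appear, whether or not the G is accessible in the folded RNA.
    """
    return [first_position + i for i, base in enumerate(rna.upper()) if base == "G"]


# Hypothetical toy fragment and numbering, for illustration only.
toy = "CAUGCACUAGGCU"
print(t1_candidate_sites(toy, first_position=100))

# Replacing every G with A removes all candidate T1 sites from the fragment.
print(t1_candidate_sites(toy.replace("G", "A"), first_position=100))
```

Comparing the observed cuts against such a list separates Gs that are protected, for example by base pairing, from Gs that lie in single-stranded regions.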
We first examined the pattern of RNase T1 digestion of the 3′-end-labeled IgM-(1790-2085) substrate, which spans all the known important elements involved in regulation of expression of μ-secretory mRNA (Fig. 2A). These include the upstream 5′ splice site, the novel U1A binding sites, the hexanucleotide sequence, and both downstream GU-rich regions. In this way the RNA structure surrounding the U1A binding sites is preserved in a larger context and more accurately represents the physiological structure. A single-nucleotide ladder was generated by hydroxy radical cleavage and served as a marker for nucleotide position (Fig. 2A, lane 1). The deepest RNase T1 cuts occur around the hexanucleotide sequence and the cleavage/polyadenylation site, which is predicted to form a loop containing these sequences in a form accessible to components of the polyadenylation and cleavage complex (and RNase T1) (Fig. 2A, lanes 3-8). The proximal and distal GU-rich regions are also accessible (labeled on Fig. 2A). The region of RNA surrounding the U1A binding motifs (Gs at 1854, 1877, and 1925) is indicated by a bracket. It can be seen from Fig. 2A that these are the major accessible regions upstream of the hexanucleotide sequence.
We labeled the IgM-(1790-2085) RNA substrate either at the 3′-end or at the 5′-end and ran the gels long enough to be able to accurately determine the position of the T1 cuts in the region surrounding the U1A binding motifs (1854-1930) and were able to identify a reproducible pattern. The RNase T1 partial digestion of the 5′-end-labeled wild type substrate is shown in Fig. 2B (lane 2). A distinct reproducible pattern of cuts can be seen around U1A binding motifs 4s (1875-1882) and 8s (1923-1930) (Fig. 2B, lane 2). The 8s motif consists of one deep cut (1925) and two lighter cuts (1919 and 1921) that are closer to a mirror image set of cuts between the two U1A binding motifs (1903). The 4s motif consists of a pattern of two deep cuts (1871 and 1877) followed by two further cuts (1881 and 1883). The U1A binding motifs are labeled individually in Fig. 2B (8s, 4s, and 2s). The 2s motif was not RNase T1-sensitive.
To confirm the identities of the cuts ascertained by comparison with the hydroxy ladder, we employed mutants that replace the AUGC (or AGGC) motifs with AAAA (see Fig. 1C). These mutations abolish U1A binding (13). As the G is replaced by an A, these mutants lose the ability to be cut by RNase T1 at that site. It can be seen in Fig. 2B that mutation of either the 4s or 8s site results in the loss of the deep cuts at 1877 or 1925, respectively (see asterisks in lanes 3 and 4). This confirms that these accessible sites are at the position of the novel U1A binding sites. To further confirm that open structures form around the 4s and 8s U1A binding sites, we performed lead cleavage analysis, which we have successfully used previously to identify bulges in RNA structure (29). Lead (II) cleavage patterning is a technique to probe tertiary structures of RNA (34). Furthermore, lead (II) preferentially cleaves at bulges of RNA structure independent of nucleotide sequence (35). It can be seen that lead cuts around the hexanucleotide sequence, and two patterns of cuts form around the 4s and 8s U1A binding sites (lane 6). Taken together these data show that the 4s and 8s novel U1A binding sites are the major accessible regions upstream of the hexanucleotide sequence.
We noted that the pattern of RNase T1 cuts surrounding the 8s binding motif changes only in the region of the 8s mutation, in that the deep cut at 1925 is lost (Fig. 2B, compare lanes 2 and 4). In contrast, for the 4s mutation, the pattern of RNase T1 cuts surrounding the (1875-1882) motif changes more extensively (compare lanes 2 and 3), suggesting a remodeling of the surrounding area upon 4s mutation.
Is the Double-stranded 2s Motif Remodeled upon U1A Binding?-To determine the extent of remodeling with the 4s mutation and to further probe the structure surrounding the 2s binding site, we performed RNase V1 partial digestion. RNase V1 preferentially cuts double-stranded RNA (33). It can be seen in Fig. 3A that the 2s motif lies within a double-stranded region (lane 2). In fact it appears to be within a symmetrical structure consisting of four RNase V1-sensitive sites.
The 8s mutation once again appears not to be remodeled as probed by RNase V1 (compare lanes 2 and 6). In contrast, the 4s mutation leads to remodeling of the RNase V1-sensitive sites from 1911 to 1858 (compare lanes 2 and 4, labeled with a bar in lane 4). The RNase V1-sensitive site at 1855 (the 2s motif) is still visible but diminished in the 4s mutation (compare lanes 2 and 4).
We wondered whether the susceptibility of this region to remodeling upon mutation of the AUGC motif might reflect its ability to be remodeled upon U1A binding to this motif. We therefore attempted to footprint the RNA with U1A to look for changes in RNase T1 accessibility upon U1A binding. As can be seen in Fig. 3B, with 80 ng of U1A a footprint begins to appear over the 4s site but not the 8s site (lane 3, asterisk). No change in the pattern of RNase T1 digestion can be seen at the 2s site at this concentration. At higher concentrations all bands appear lighter (data not shown). We conclude that the lack of a definite footprint is due to the low affinity with which U1A binds these sites. Thus, whether or not U1A binding at 4s remodels the 2s site cannot be determined from this approach. Similarly, Teunissen et al. (36) found that the whole region surrounding the U1A binding sites on PIE RNA was footprinted and changes in structure could not be monitored. Nevertheless, this experiment suggests that 4s is the first site to bind U1A.
The Flanking Sequences Are Necessary to Maintain the Structure of the U1A Binding Sites-As a first step to determine the secondary structure surrounding the single-stranded U1A binding motifs, we were interested in knowing the minimal RNA sequence that could maintain the U1A binding sites in accessible regions. For this we tested the effect of deleting the 5′ and 3′ flanking regions. The substrates were 3′-end-labeled and run for 5 h on 15% gels so that the pattern surrounding the U1A binding sites was well separated and that the pattern surrounding the hexanucleotide sequence was retained within the gel for convenient reference. The pattern of RNase T1 digestion in the whole substrate, which can be used as a reference point, can be seen in lanes 9-14 of Fig. 4A. We performed a partial RNase T1 digest of 3′-end-labeled substrates, with or without the flanking regions, to look for changes in RNA structure brought about by the addition of the flanking sequences. The results are presented in Fig. 4 (A and B), and the pattern of cuts for each substrate is summarized in the schematic diagram in C. which are both greater than when the 5′ flanking region is present. The 4s cut at 1877 is only 50 nucleotides from the end of the substrate when the 5′ flanking sequence is missing and may be more accessible for this reason. We presume that a comparison of the fate of the 4s cut upon deletion of the 3′ flanking sequence has more physiological relevance in the longer substrate, which includes the 5′ flanking regions, but that sequences in the 5′ flanking region have a small effect on the 4s site.
Deletion of both 5′ and 3′ flanking regions results in the complete disruption of the RNA structure surrounding the hexanucleotide sequence, and a new deep cut appears in the 1938-54 region (Fig. 4B, lanes 2-7). The region between 1925 and 1903 is remodeled, and the deep cuts at 1925 and 1903 are lost. In both substrates missing the 5′ flanking region, a cut appears at 1855 in the region surrounding the 2s (1854-1860) binding motif (see IgM-(1838-2085) and IgM-(1838-2030), Fig. 4A, lanes 1-7 and Fig. 4B, lanes 1-7, respectively). From this we conclude that sequences 5′ of 1838 are necessary to maintain the 2s binding site in an inaccessible form.
From consideration of the pattern of sites affected by the respective flanking regions, a "criss-cross" pattern emerges: Deletion of the 3′ flanking sequence has the greatest effect on the cut at 1877 (compare substrate IgM-(1790-2085) with IgM-(1790-2030), Fig. 4A, lanes 8-14 with Fig. 4B, lanes 8-14, respectively). In contrast, deletion of the 5′ flanking region has the most effect on the 1925 cut and results in the appearance of the cut at 1855 (compare substrate IgM-(1790-2085) with IgM-(1838-2085), Fig. 4A, lanes 8-14 with Fig. 4A, lanes 1-7, respectively). Thus the 3′ flanking region affects the 4s U1A binding site, whereas the 5′ flanking sequence affects the 8s U1A binding site in a criss-cross effect. However, the 5′ flanking region also affects both the 4s and the 2s site and appears to have a greater effect on the 8s motif than does the 3′ flanking sequence on the 4s motif. These effects are summarized in Fig. 4C.
The Accessibility of the U1A Binding Sites Correlates with the Strength of Binding of U1A and Its Ability to Inhibit Poly(A) Addition in Vitro-We next wanted to test if the accessibility of the U1A binding sites had an effect on U1A binding. We performed electrophoretic mobility shift assays comparing the various substrates using uniformly labeled in vitro transcribed RNA substrates and recombinant non-tagged full-length wild type U1A. U1A binds the substrate IgM-(1838-2030) that spans the U1A binding motifs, the hexanucleotide sequence of the poly(A) site, and the proximal GU-rich region (Fig. 5A, lanes 1-4) (see Fig. 1 for location of sequences). Inclusion of the sequence 2030-2085 enhances the binding (lanes 5-8). This sequence spans a stem-loop structure that includes the distal GU-rich region and is necessary for full polyadenylation activity. However, this sequence does not bind U1A on its own (data not shown). We next tested the effect of 5′ flanking sequences. Adding 5′ flanking sequences (1790-1838) greatly increased U1A binding to both the IgM-(1838-2030) and IgM-(1838-2085) substrates (Fig. 5A, compare lanes 1-4 with 9-12 and lanes 5-8 with 13-16, respectively). Thus both the 3′ and 5′ flanking sequences separately enhance the binding of U1A to the substrate, although inclusion of the 5′ flanking sequence has a greater effect than the 3′ flanking sequence. Furthermore, 5′ and 3′ flanking sequences in combination enhance this binding further (Fig. 5A, lanes 13-16). Thus the strength of binding correlates with the accessibility of the binding sites determined by the RNase T1 partial digestion (compare Figs. 4 and 5). Because the extra sequences at either end of the U1A binding sites do not bind U1A themselves, we conclude that the flanking sequences, in particular the 5′ flanking sequence, contribute to the enhanced binding by helping to maintain the binding sequences in an accessible configuration.
We note that, on all substrates with one or more of the flanking regions, three bands form, the slowest mobility band representing a complex of RNA with three molecules of U1A. This is in contrast to the two accessible regions on the naked RNA (see Fig. 4C). This raises the possibility that two U1A molecules binding the accessible sites recruit the third molecule to the inaccessible 2s site.
Accessibility of U1A Motifs Correlates with the Ability to Inhibit Poly(A) Tail Formation-We next examined the functional effect of the changes in secondary structure brought about by inclusion of the 5′ flanking sequence on the ability of U1A to inhibit poly(A) polymerase activity. For this we used a reconstituted system containing only recombinant PAP and U1A and in vitro transcribed RNA (Fig. 5B). We compared the two cleaved substrates IgM-(1838-2030) and IgM-(1790-2030) for the ability of U1A to inhibit poly(A) polymerase activity (Fig. 5B, lanes 5-8 and 9-12, respectively). Both substrates contain the U1A binding motifs but, as can be seen from the structural analysis, IgM-(1838-2030) does not present both U1A binding motifs in single-stranded, RNase T1-accessible regions (Fig. 4A, compare lanes 1-7 with 8-14, position 1925 (8s)). In addition, it can be seen in Fig. 5A that the second molecule of U1A necessary for co-operative inhibition of PAP binds this substrate poorly (Fig. 5A, lanes 1-4). These substrates were compared with the minimal RNA substrate containing only the hexanucleotide sequence and no upstream U1A binding motifs (Fig. 5B, lanes 1-4). All three substrates produced strong poly(A) tails in the absence of U1A (lanes 1, 5, and 9). However, U1A was unable to inhibit poly(A) polymerase activity on the IgM-(1838-2030) substrate, although it contains all the U1A binding motifs. A mild inhibition is seen with 100 ng of U1A (lane 8), which is comparable to that seen with the minimal substrate that does not contain the binding sites (lane 4) and is therefore a nonspecific effect. In contrast, on IgM-(1790-2030), U1A could begin to inhibit PAP activity at 10 ng of U1A and completely abolish it with 30 ng. Thus we conclude that the ability to inhibit PAP activity at the μ-secretory poly(A) site correlates with the secondary structure of RNA that holds both U1A binding motifs in a single-stranded configuration.
Identifying Potential Hybridizing Sequences in the 5′-Flanking Region-As a first approach to identifying sequences in the 5′ flanking regions that could interact with the 8s U1A binding site, we probed the RNA structure of this region. For this, we examined the pattern of partial RNase T1 and V1 digestion of the 5′-end-labeled (1790-2085) substrate run on 15% gels for shorter times so that the smallest fragments at the 5′-end of the substrate are retained on the gel (Fig. 6A). We were able to discern two possible structures. One of these lies between 1836 and 1865; 1838, 1843, 1855, and 1863 are the centers of four patterns of RNase V1 cuts, suggesting a "strong double-stranded structure" (Fig. 6A, lanes 6 and 7), which contained the 2s motif as noted in Fig. 3A. The second, between 1803 and 1824, spans the 5′ splice site. The Gs at 1803 and 1824 are more accessible than the two Gs that flank the 5′ splice site at 1810 and 1811 (Fig. 6A, lanes 3 and 4, labeled "structure surrounding the 5′ splice site"). This is surrounded on either side by a pattern of RNase V1 cuts, suggesting double-stranded regions, the most prominent occurring at 1808 and 1820, which are both Cs (see Fig. 6B, where Cs are boxed). From an examination of the sequences on either side of the 5′ splice site there is no obvious pattern of hybridization that would be consistent with a hairpin stem-loop structure (see Fig. 6B, sequences indicated by lines and labeled V1S1 and V1S2, where V1S = RNase V1-sensitive). We therefore wondered if these would hybridize with sequences in the vicinity of the 8s U1A binding site. Thus we mutated these sequences into their opposite and investigated the effect on the structure of the 8s U1A binding site.
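One reading of mutating a sequence "into its opposite" is a base-by-base complement (not a reverse complement). The sketch below checks that reading against the primers listed under Experimental Procedures. It rests on assumptions: the lowercase stretches in the primers are taken as the mutated bases, and the wild type stretch at each position is read from the primer that leaves that position unmutated. The dictionary names are hypothetical.

```python
# DNA alphabet, because the sequences are taken from the PCR primers.
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq):
    """Base-by-base complement, written in the same 5' to 3' order as the input."""
    return seq.upper().translate(DNA_COMPLEMENT)


# Wild type stretches as they appear in the primer that does not mutate them (assumption).
wild_type = {"V1S1": "CCACTG", "V1S2": "ACAC"}
# Lowercase stretches from the single-mutant forward primers (assumption).
mutant = {"V1S1": "ggtgac", "V1S2": "tgtg"}

for name in wild_type:
    print(name, complement(wild_type[name]), complement(wild_type[name]) == mutant[name].upper())
```

Under these assumptions both mutated stretches equal the complement of the corresponding wild type stretch, which would disrupt any base pairing those sequences make while keeping their length unchanged.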
As can be seen in Fig. 6C, mutating both RNase V1-sensitive sites (V1S1 plus 2) alters the structure of the 8s binding site (Fig. 6C, compare lanes 5 and 6 with lanes 2 and 3 at position 1925 (8s)). The cut at 4s was diminished, but the pattern was maintained. Furthermore, the accessibility of the region between 1971 and 1925 is altered, demonstrating that mutation of these two sequences has multiple effects on the structure. Nevertheless, the inaccessibility of 2s was unaffected. To isolate the effect of the mutations on the 8s binding site, we mutated the two sequences separately. The V1S1 mutation yielded a pattern of T1 partial digest very similar to that of the wild type (compare lanes 8 and 9 with lanes 2 and 3). In contrast, the V1S2 mutation produced a pattern around the 8s site that resembled that of the double-mutated RNA, whereas the 4s site and the region between 1971 and 1925 are comparable with the wild type. We therefore conclude that the sequence downstream of the 5′ splice site (V1S2) is responsible for the structure of the 8s U1A binding site. Inclusion of the V1S1 mutation with the V1S2 mutation appears to have an effect on the overall structure, but the effect of the V1S1 mutation alone was minor.
[Fig. 5, legend fragment] ... (IgM 1838-2085, lanes 5-8) and (IgM 1838-2030, lanes 1-4), respectively, were compared with those with the addition of the 5′ flanking sequence (IgM 1790-2085, lanes 13-16) and (IgM 1790-2030, lanes 9-12), respectively. EMSAs consisted of full-length wild type recombinant U1A and uniformly labeled RNA substrates. Increasing concentrations of U1A are as indicated. U1A·RNA complexes are labeled with a bracket in each panel. All samples were run on 5% native PAGE and visualized by phosphorimaging. B, the effect of the 5′ flanking sequences on the ability of U1A to inhibit poly(A) polymerase activity. Nonspecific polyadenylation assays using in vitro transcribed uniformly labeled substrates with (IgM 1790-2030, lanes 9-12) and without (IgM 1838-2030, lanes 5-8) the 5′ flanking sequences, recombinant poly(A) polymerase, and U1A at the indicated amounts. The core poly(A) site without the U1A binding sites (IgM 1951-2030, lanes 1-4) was included as a reference. Samples were run on 10% denaturing PAGE and visualized by phosphorimaging.
Sequences Adjacent to the 5′ Splice Site Are Responsible for the Ability of the Second U1A Molecule to Bind and Function-We next investigated whether the change in structure at the 8s U1A binding site affects U1A binding and function at this site. We first compared the wild type and the double mutant (V1S1 plus 2) in EMSAs (Fig. 7A). It can be seen that, although two molecules of U1A can bind the wild type substrate (lanes 1-6), only one molecule can bind the mutated substrate (lanes 7-12). We next compared the effect of the V1S1 or V1S2 mutant on U1A binding (Fig. 7B). Whereas both the wild type and V1S1 mutant could bind two molecules of U1A (lanes 1-4 and 5-8), the V1S2 mutant could only bind one molecule and the second could no longer bind.
To determine the functional significance of these changes, we tested the consequences of the mutations in reconstituted polyadenylation assays. Because it has been shown in the autoregulatory system that two molecules are necessary to inhibit poly(A) polymerase activity, any reduction in the number of molecules of U1A bound would be predicted to result in loss of polyadenylation inhibitory activity at the secretory poly(A) site. It can be seen in Fig. 7C that in the case of the wild type (lanes 1-6) and the V1S1 mutation (lanes 11-15), in which two molecules of U1A can bind, increasing concentrations of U1A result in the inhibition of poly(A) polymerase activity at this site. However, in the double mutant (V1S1 plus 2) (lanes 6-10) and the single V1S2 mutant (lanes 16-20), in which only one molecule can bind, increasing concentrations of U1A can no longer inhibit poly(A) polymerase activity. From this we conclude that the mutation in the sequence directly downstream of the 5′ splice site (V1S2), which affects the structure of the 8s U1A binding site, also affects the ability of the second U1A molecule to bind and thus the ability of U1A to inhibit poly(A) polymerase activity at this site.
Estimation of Relative Binding Affinity of Full-length U1A and the Ability of the N-terminal RRM of U1A to Bind Upstream of the Secretory Poly(A) Site-Having characterized the secondary structure necessary for forming the U1A binding sites, we wanted to look at U1A binding to these novel motifs more closely to relate it to what is known for U1A binding to the consensus sequence in U1 snRNP. U1A binds U1 snRNA via RRM1, its N-terminal domain (27), with a Kd value of 2 × 10⁻¹¹ M (19).
We first compared the ability of IgM-(1790-2085) versus U1 snRNA to bind full-length U1A in EMSAs (Fig. 8A). U1 snRNA binds one molecule of U1A, whereas multiple U1As bind upstream of the secretory poly(A) site. We therefore compared low U1A concentrations when only one U1A bound per RNA substrate (Fig. 8A). This would presumably represent the U1A binding site with the highest affinity for U1A. We adjusted the relative input of uniformly [³²P]UTP-radiolabeled U1 snRNA and IgM-(1790-2085) for the relative numbers of Us in the substrates, because IgM-(1790-2085) is almost twice as long as the U1 snRNA substrate (295 nucleotides versus 164 nucleotides, compare input in Fig. 8A, lane 1 with lane 7). We quantitated the amount of RNA bound to U1A relative to input for each substrate by phosphorimaging and constructed binding curves for four separate assays. We estimated the affinity of the IgM-(1790-2085) substrate for the wild type full-length U1A to be 10 ± 4-fold lower than that of U1 snRNA. We then constructed binding curves for U1A binding to the substrates containing single or double mutations in the U1A binding sites to gauge the relative strength of each of the binding sites (data not shown). We established the order of strength of the novel U1A binding sites as 4s > 8s > 2s (data not shown).
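To illustrate what a 10-fold difference in affinity means for a binding curve, the sketch below uses a simple one-site isotherm. It is not the analysis used in this work. It assumes that protein is in excess over RNA, that only the first (highest affinity) site is considered, and that a 10-fold lower affinity corresponds to a 10-fold higher Kd; the Kd for the IgM substrate is therefore an illustrative value, not a measured one.

```python
def fraction_bound(protein_molar, kd_molar):
    """Fraction of RNA bound for a single site with protein in excess over RNA."""
    return protein_molar / (protein_molar + kd_molar)


KD_U1_SNRNA = 2e-11  # M, value quoted in the text for U1A binding to U1 snRNA
FOLD_LOWER = 10      # central estimate from the text (10 +/- 4)

# Assumption: a fold difference in affinity is treated as the same fold difference in Kd.
KD_IGM_ASSUMED = KD_U1_SNRNA * FOLD_LOWER

for u1a in (2e-12, 2e-11, 2e-10, 2e-9):
    print(
        f"[U1A] = {u1a:.0e} M  "
        f"U1 snRNA bound: {fraction_bound(u1a, KD_U1_SNRNA):.2f}  "
        f"IgM bound (assumed Kd): {fraction_bound(u1a, KD_IGM_ASSUMED):.2f}"
    )
```

In this model half-saturation of each RNA occurs when the U1A concentration equals its Kd, so the curve for the lower affinity substrate is shifted 10-fold toward higher U1A concentrations.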
It has been previously shown that the N-terminal RRM is solely responsible for the binding of U1A to U1 snRNA (19,21). To test whether this is also the case for IgM-(1790-2085), we examined its affinity, relative to U1 snRNA, for the truncated version of U1A containing only amino acids 1-117, which span RRM1. We employed two different EMSA conditions for the two lengths of U1A. The shift due to the very small truncated U1A (1-117) was not visible above the long IgM-(1790-2085) substrate unless we ran the gel for 5 h (Fig. 8B). This was not necessary for the U1A wild type, which is larger, and the shift was visible after 1.5 h (Fig. 8A). Quantitation and analysis showed that the truncated U1A bound the substrate IgM-(1790-2085) with a similar relative affinity to that of the wild type (Fig. 8, compare A and B). We overexposed the results for U1 snRNA relative to IgM-(1790-2085) for easier visual comparison of RNA bound versus input (Fig. 8B, compare lanes 1, 2, and 8). Taken together, we conclude that U1A binds upstream of the secretory poly(A) site via RRM1 with a 10-fold lower affinity than to U1 snRNA.

DISCUSSION

We have recently shown that U1A binds novel sites upstream of the secretory poly(A) site and regulates poly(A) tail addition in a developmental manner (13). Here we further characterize these novel binding sites in terms of what is known about U1A binding to consensus sites. U1A binding to consensus sites requires a stem loop structure in which both an accessible binding site and a specific interaction with the closing pair of the stem play essential roles. We therefore wanted to determine how far these novel sites satisfy these requirements. The major findings of this work are that two of the three novel sites occur at the major accessible regions upstream of the secretory poly(A) site, that long range interactions maintain these in an accessible configuration, and that U1A binding and function correlate with the accessibility of these sites.
Furthermore, we have identified sequences responsible for the structure of the 8s site, and these are directly adjacent to the upstream 5′ splice site involved in the splicing out of the intron containing the secretory poly(A) site and the U1A binding sites. Thus these novel sites satisfy the requirements determined for the consensus sites but at the same time extend the definition of a U1A binding site. U1A binds these novel sites via its N-terminal RRM1, which is the same domain that binds the consensus sites, but with a 10-fold lower affinity. We suggest that the lower binding affinity is more consistent with a regulatory role of U1A on a heterologous mRNA. These findings have consequences when searching for similar binding sites on other genes that undergo the same type of regulation.
The IgM Secretory Poly(A) Site Exhibits a Criss-cross Structural Pattern-RNA structural probing of the whole secretory poly(A) site and regions 5′ of this that span the 5′ splice site used for the competing splicing reaction shows that two of the U1A binding sites are the major single-stranded regions upstream of the hexanucleotide sequence, consistent with a major role in the control of the secretory poly(A) site. These U1A binding sites (4s and 8s) consist of open structures, which are maintained in an open configuration by surrounding sequences. The sequences that maintain the most 3′ U1A binding site (8s) in an open configuration are located 102 nucleotides upstream in the 5′ flanking region, directly adjacent to the competing 5′ splice site. Furthermore, the 3′ flanking sequences affect the upstream 4s U1A binding site. Taken together, the flanking sequences affect the U1A binding site most distal to them in a criss-cross fashion, suggesting a more complicated secondary structure than a simple stem loop. Because U1A binding is dependent upon the secondary structure of RNA, this raises the possibility that the effect of U1A could be regulated by structural remodeling caused by developmentally induced factors binding quite distant from the U1A binding site.
The Novel Sites Compared with the Crystallographic and NMR Structures of Consensus Sites-We have shown that the binding strength and ability of U1A to inhibit poly(A) polymerase at the secretory poly(A) site correlate with the accessibility of two of its binding sites (4s and 8s), which is consistent with what is known to be important for consensus U1A binding sites (14,27). We have also shown that U1A binds this site via its N-terminal domain, the same as is used for binding to U1 snRNP. Based on crystallographic and NMR studies of the interaction of the N-terminal domain of U1A with the consensus binding sites on U1 snRNP and PIE RNA (19,21,22,24), it is possible to envision how U1A binds the novel sites. The bases of the consensus binding site, in particular the UGC of the AUUGCAC of the unpaired loop, stack with amino acid side chains within the β sheet of the U1A protein via electrostatic interactions (see Protein Data Bank accession number 1DZ5). One of the loops connecting the β sheet of the U1A protein (loop 3) protrudes through the RNA loop in a "button in a button-hole" effect, and another (loop 1) interacts with the 5′ stem (22). Via these loops, U1A makes specific contacts with the 3′ C of the AUUGCAC and the C:G pair at the base of the RNA hairpin, respectively (22). Both loop 3 of U1A and the RNA hairpin to which it binds become more structured in the protein·RNA complex in an "induced fit" effect (16,18,22). Furthermore, loop 3 contacts U1 snRNA and the PIE elements differently, suggesting flexibility in this interaction (15,22). The UGC sequence is common to the AUGC of the novel sequences and could be predicted to interact in the same way.
There is a C three nucleotides downstream of the AUGC motif for both motifs (see Fig. 1). The distance between the electrostatic interacting portion and the C is increased by two nucleotides, which could be accommodated by the flexibility of the interaction. There is no C in close proximity 5′ of either of the novel motifs that could form a C:G pair for interaction with loop 1. However, the interaction between the 5′-adjacent sequence and those adjacent to the 8s U1A binding site may provide the "stem" for interaction with loop 1. In this case the most likely pairing would be G:C, which has been shown to bind with lower affinity (22). We suggest that these adjustments for binding are responsible for the 10-fold lower binding affinity of U1A to the novel motifs.
Juxtaposition of Two U1A Molecules to Form the PAP Inhibitory Pocket-In the autoregulatory system the two dimerizing U1A molecules are juxtaposed by the complicated stem loop structure of PIE. In this way they form a bipartite binding pocket for PAP that is necessary to inhibit its activity (14,25,37). We show here that U1A behaves in the same way, in that failure to bind a second U1A molecule abolishes the ability of U1A to inhibit PAP activity at the secretory poly(A) site. Thus two U1A molecules can dimerize to form the PAP inhibitory binding pocket on this RNA despite the distance of at least 40 nucleotides between them. Therefore, their juxtaposition is imposed either by a rigid RNA structure or by the dimerization of two U1A molecules across a distance. The latter would require a flexible RNA structure, which we favor because of the remodeling of the RNA surrounding the 4s U1A binding site when we introduced mutations into this site (see Fig. 2B). Furthermore, U1A is known to bind RNA with an induced fit, as discussed above.
Is the 2s Site Remodeled?-We found that, although two of the novel motifs (4s and 8s) were accessible to RNase T1 on the naked RNA, the third (2s) was within a double-stranded RNase V1-sensitive site. We know that 2s is a U1A binding site, because we have previously shown that mutation of 2s alone causes a release in the inhibition of poly(A) site expression in vivo and that mutation of all three motifs is necessary for the complete loss of binding to this substrate in vitro. In addition, we have shown previously (13), and it can be seen in Figs. 5A and 7A, that three shifted bands form with these substrates. We noted that mutation of the 4s motif caused a remodeling of the surrounding regions to at least within three nucleotides of the 2s site. This finding, combined with what is known about U1A dependence upon an unpaired region for binding, raises the possibility that U1A binding at the 4s site remodels this region, increasing the accessibility of the 2s site and at the same time recruiting U1A to the 2s site via the dimerization interaction. If a U1A at 4s alone were sufficient to recruit a second molecule to the 2s site, these two U1A molecules would presumably be able to dimerize and create an inhibitory binding pocket for PAP even when the 8s site is lost. However, this is not the case (see Fig. 7). A scenario consistent with these findings is that U1A binding at both the 4s and 8s sites is necessary to achieve sufficient binding strength to recruit a third U1A to 2s. However, this is a subject for future investigation.
Is This a General Mechanism?-We reasoned that other non-immunoglobulin genes that produce alternative forms by activation of a poly(A) site within an internal exon might be regulated by U1A binding these novel motifs. Using the findings from this study, we identified eight candidate genes that contained clusters of potential novel U1A binding sites near the regulated poly(A) site.² These are predominantly receptors, but transcription factors and biologically important enzymes are also represented. For instance, the oncogenes epidermal growth factor receptor and HER2 receptor are both receptors with truncated forms (38,39). In both cases the truncated versions are festooned with novel U1A binding motifs that may play a role in silencing the truncated forms. An investigation of these genes will determine whether this is a more widely used regulatory mechanism.
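As an illustration of how such a search might be set up, the following Python sketch scans the region upstream of a poly(A) hexanucleotide for AUGC motifs. It is not the procedure used to identify the candidate genes above, which is not described here. The function names, the 300-nucleotide window, the three-site cluster threshold, and the toy sequence are all hypothetical. A sequence match alone also says nothing about accessibility, which the structural probing in this work shows to be important for binding.

```python
def find_motif(seq, motif="AUGC"):
    """Return 0-based start positions of every (possibly overlapping) motif hit."""
    seq = seq.upper().replace("T", "U")  # accept DNA-style input
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append(start)
        start = seq.find(motif, start + 1)
    return hits

def motifs_upstream(seq, polya_index, window=300, motif="AUGC"):
    """Motif hits lying entirely within `window` nt upstream of `polya_index`.

    `window` is a HYPOTHETICAL search distance, not a value from this study.
    """
    lo = max(0, polya_index - window)
    return [i for i in find_motif(seq, motif)
            if i >= lo and i + len(motif) <= polya_index]

def has_cluster(seq, polya_index, min_sites=3, window=300, motif="AUGC"):
    """True if at least `min_sites` motifs (a HYPOTHETICAL threshold) are found."""
    return len(motifs_upstream(seq, polya_index, window, motif)) >= min_sites

# Toy sequence built for illustration only: three AUGC motifs followed by an
# AAUAAA hexanucleotide. It is not taken from any real gene.
toy = "GG" + "AUGC" + "AA" + "AUGC" + "CCUU" + "AUGC" + "GG" + "AAUAAA" + "GG"
polya = toy.find("AAUAAA")

print(motifs_upstream(toy, polya))
print(has_cluster(toy, polya))
```

Candidates flagged in this way would still need structure probing and binding assays of the kind described above before any regulatory role could be inferred.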