A Role for the Ψ-U Mismatch in the Recognition of the 5′ Splice Site of Yeast Introns by the U1 Small Nuclear Ribonucleoprotein Particle

The U1 small nuclear ribonucleoprotein particle (snRNP)/5′ splice site (5′SS) interaction in yeast is essential for the splicing process and depends on the formation of a short RNA duplex between the 5′ arm of U1 snRNA and the first intronic nucleotides. This RNA/RNA interaction is characterized by the presence of a mismatch that occurs with almost all yeast introns and concerns nucleotides 4 on the pre-mRNA (a U) and 5 on U1 snRNA (a Ψ). The latter nucleotide is well conserved from yeast to vertebrates, but its role in yeast and the significance of the associated mismatch in the U1 snRNA/5′SS interaction have never been fully explained. We report here that the presence of this mismatch is a determinant of stability that mainly affects the off rate of the interaction. To our knowledge

Pre-mRNA splicing is the post-transcriptional maturation step that removes nuclear introns from primary transcripts of split genes. Introns are recognized by inspection of conserved sequences that define the sites of cleavage at the 5′ and 3′ borders. These sequences promote the assembly of a large ribonucleoprotein complex, called the spliceosome, that catalyzes the two transesterification steps leading to intron removal and exon joining.
Spliceosome assembly is traditionally believed to occur in a temporally ordered and sequential manner, although a more recent view holds that a preformed spliceosomal entity exists independently of the precursor RNA (1,2). Five small nuclear ribonucleoprotein particles (snRNPs), each composed of one RNA (snRNA), various specific proteins, and a set of common proteins (Sm proteins), play a major role in the process. Several non-snRNP proteins also intervene in spliceosome assembly (3,4).
Splice sites are inspected multiple times during the process, and a number of proofreading mechanisms ensure accuracy in the cleavage and ligation steps. One of the first steps is the recognition of the 5′ splice site (5′SS) by the U1 snRNP particle. The 3′ region of the intron is subsequently recognized by a complex formed by the branch point binding protein (yBBP/hSF1) and the associated yMud2p/hU2AF factor that, at least in metazoans, binds to the pyrimidine-rich region downstream of the branch point (5,6). These steps define the first complexes that can be isolated by biochemical means, called the commitment complex in yeast and the E complex in metazoans. The first ATP-requiring step is the formation of the prespliceosome. In this complex, the U2 snRNP interacts with the branch point sequence by base pairing, thereby replacing the yBBP·Mud2p complex. A number of protein factors, among them two ATPases, Sub2p and Prp5p, are involved in this step (7-10). After the interaction of the preformed tri-snRNP particle (U4/U6-U5 snRNP) the spliceosome enters a series of structural rearrangements during which the U1 and U4 snRNPs are displaced and the spliceosome is activated (11). The U6 snRNA is a focal point in these rearrangements: it replaces the U1 snRNA in a mutually exclusive interaction with the 5′ splice site, while base pairing with U4 snRNA is disrupted to allow interaction with the U2 snRNP. The disruption and formation of RNA/RNA, RNA/protein, and protein/protein interactions, which are often mutually exclusive, are carefully controlled by a family of RNA-dependent ATPases (or RNA helicases) to ensure the folding, positioning, and activation of the catalytic center in a timely fashion (11).
Recognition of the 5′ splice site is paradigmatic in this regard. In yeast, the 5′SS sequence is first recognized by base pairing with the 5′ arm of U1 snRNA. Based on genetic and cross-linking studies, it is believed that U1 snRNP-associated proteins, among them U1Cp, Prp40p, Nam8p, and the Sm complex, stabilize this interaction (12-15). More recently (16) it has been shown that the protein component of U1 snRNP can recognize to some extent the 5′ splice site sequence even in the absence of base pairing interaction. The U1 snRNP is displaced from the 5′SS later in the process, allowing base pairing between the U₄G₅U₆ portion of the 5′SS and the U6 snRNA. This step is ATP-dependent and is somehow controlled by the Prp28p DEAD box helicase (17), although the exact mechanism of action of this protein is still unknown. It has been suggested that this transition at the 5′SS is operated by unwinding the U1 snRNA·5′SS duplex (17) and/or by actively displacing the U1Cp protein (18). However, it remains possible that Prp28p somehow favors more directly the association of U6 snRNA.
Finally, two additional factors have been shown by genetic and biochemical experiments to interact with the 5′SS region: the U5 snRNA and Prp8p, which interact with the last nucleotides of the 5′ exon and the first nucleotides of the intron (19-25).
The sequence of the 5′ splice site is therefore the result of multiple selective constraints, which are generally difficult to deconvolute. In yeast, the large majority of introns contain the sequence GUAUGU at the 5′ border. This sequence is only partially complementary to the 5′ region of the U1 snRNA, which leads to the formation of a partially base paired RNA duplex in the U1 snRNP·pre-mRNA complex. This duplex contains a non-Watson-Crick interaction between U₄ of the 5′SS and a pseudouridine (Ψ₅) of U1 snRNA (26). This mismatch is absent in higher eukaryotes, where the 4th intronic nucleotide is generally an A that can base pair with Ψ₅. The conservation of U₄ in the yeast 5′SS might be explained by its interaction with the U6 snRNA. However, the presence of Ψ₅ in the U1 snRNA (and of the Ψ₅-U₄ mismatch) remains unexplained. Although it was previously suggested that it might constitute a determinant of stability (27), this hypothesis was never demonstrated. We provide here evidence that the presence of the mismatch induces a stabilization of the interaction between the U1 snRNP and the 5′SS. Surprisingly we found that the stability of the mismatch-containing complex is comparable to the stability of a complex containing a fully paired U1 snRNA·pre-mRNA duplex. By performing SELEX experiments we also obtained an artificial phylogeny, which allows the isolation of the selective constraint on the 5′SS solely due to U1 snRNP binding.

EXPERIMENTAL PROCEDURES
RNA Synthesis and Preparation of Extracts-RNA substrates for U1 snRNP binding and immunoprecipitation experiments were synthesized from PCR products carrying a T7 RNA polymerase promoter. The full sequence is shown in Fig. 1. Biotinylated BP-U and BP-A RNAs were prepared as described previously (17). Oligonucleotide 5′-U1Ψ was purchased from Dharmacon Research and is identical to the first 11 nt of U1 snRNA. Extracts were prepared with the Umen and Guthrie method (28) with minor modifications. The U170kHA yeast strain was a gift from J. Tang. The nam8Δ,U170kHA strain was obtained from the U170kHA strain by deleting NAM8 with a PCR-based gene disruption method (29).
Immunoprecipitation Experiments-Immunoprecipitation experiments were carried out essentially as described in Abovitch et al. (30). Briefly, 0.5-2 fmol of radioactive RNA were incubated at 25°C in ATP-depleted extracts or U2 snRNA-inactivated extracts for the times indicated. The 30-μl reactions were diluted into 500 μl of NET100 containing 20 μl of GammaBind Plus beads (Amersham Biosciences) preincubated with anti-HA antibody (Roche Molecular Biochemicals) and incubated at 4°C for 30 min. After extensive washings, the radioactive RNA was recovered by phenol extraction and analyzed by PAGE. Alternatively the amount of complex formed was evaluated by counting the radioactivity associated with the beads. As a control for immunoprecipitation efficiency, U1 snRNA from a fraction of the recovered RNA was primer-extended. This method was very sensitive and reliable, although the absolute amount of immunoprecipitated material depended somewhat on the particular extract preparation. For this reason, comparisons between different RNAs for U1 snRNP binding were always performed in side-by-side experiments.
Complex stability was assessed by adding a 100-fold excess of cold RNA after a 20-min incubation of radioactive RNA in the extract. Aliquots of the reaction were immunoprecipitated at various time points as described.
Oligonucleotide 5′-U1Ψ was incubated for 20 min with biotinylated BP-U or BP-A in splicing salts before addition of streptavidin-agarose beads and incubation for 15 min. The retained radioactivity was measured after extensive washings.
SELEX Experiments-The starting pool of sequences was constructed by PCR as described previously (31). In the first SELEX experiment (8N selection) 8 nt were randomized 2 nt downstream of the branch point sequence. The short random window was specifically chosen to preclude selection of an aptamer to one of the U1 snRNP components. In the second experiment (4N + 3N selection), the U1 snRNP canonical binding site (GGUAUGU) was included as a constant region, and 4 nt upstream and 3 nt downstream were randomized. Selection was performed by incubating the pool in the tagged extract for 15 min and immunoprecipitating the complexes as described above. The process was reiterated after reverse transcription-PCR amplification and T7 transcription and stopped when the selected pools bound U1 snRNP better than the BP-U RNA (respectively six and four cycles for the 8N and 4N + 3N selections). The winning pools were cloned and sequenced together with 15 clones derived from the nonselected pools to determine the nucleotide bias. Sequences from the 8N selection were aligned according to two criteria: first, the GGUA motif that was almost universally present (only three sequences did not contain the first G); and second, complementarity to the U1 snRNA 5′ arm was used to align the remaining three sequences (sequences 3, 12, and 19) and to define the register of interaction with the U1 snRNP (and thereby the family assignment based on the identity of the 4th nucleotide). For three sequences (sequences 5, 11, and 16) the pattern of alignment according to the U1 snRNA complementarity is ambiguous, and they could be classified in the Ψ-G family provided that the GGUA pattern is misaligned in these sequences. One likely possibility is that these sequences have evolved a "double" and overlapping U1 snRNA binding site (e.g. in sequence 5, uG₁UAG₄GUA₇Au and uGUA₋₂GG₁UAA₄u), which is likely to confer a kinetic advantage by lowering the entropic cost for binding.
Interestingly six sequences of 32 winners of the 4N + 3N selection have also independently evolved a motif (UGUA in the 4 nt that precede the canonical GGUAUGU) that could allow a double and overlapping U1 snRNP binding site.
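The register search by complementarity to the U1 snRNA 5′ arm can be illustrated with a minimal sketch. It assumes that the first 10 nt of yeast U1 snRNA read AUACUUACCU (with the pseudouridines at positions 5 and 6 written as U), a sequence that is not spelled out in this text; the function names are hypothetical, and only Watson-Crick pairs are counted (G-U wobble pairs are ignored). It is a simplification meant to show the idea, not the procedure actually used for the alignment.

```python
# ASSUMPTION: first 10 nt of yeast U1 snRNA, 5'->3', with the two
# pseudouridines (positions 5 and 6) written as U. Not given in the text.
U1_ARM = "AUACUUACCU"

# Only canonical Watson-Crick pairs are scored in this sketch.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def paired_positions(site, u1_arm=U1_ARM):
    """Return one boolean per site nucleotide (5'->3'): True where the
    nucleotide faces a Watson-Crick partner in the antiparallel U1 arm."""
    if len(site) != len(u1_arm):
        raise ValueError("site and U1 arm must have the same length")
    partner = u1_arm[::-1]  # antiparallel: site 5' end faces U1 3' end
    return [(s, p) in WATSON_CRICK for s, p in zip(site.upper(), partner)]


def best_register(rna, u1_arm=U1_ARM):
    """Slide a window of the U1 arm length along rna and return
    (start_index, number_of_pairs) for the first best-scoring window."""
    width = len(u1_arm)
    if len(rna) < width:
        raise ValueError("rna is shorter than the U1 arm")
    scores = [
        (sum(paired_positions(rna[i:i + width], u1_arm)), i)
        for i in range(len(rna) - width + 1)
    ]
    # max() keeps the first window among equally good ones
    n_pairs, start = max(scores, key=lambda item: item[0])
    return start, n_pairs


if __name__ == "__main__":
    for name, site in [("U at +4", "AGGUAUGUAU"), ("A at +4", "AGGUAAGUAU")]:
        print(name, sum(paired_positions(site)), paired_positions(site))
```

Under these assumptions the site AGGUAUGUAU (positions −2 to +8) scores nine pairs, with the single unpaired position at intron nucleotide 4, whereas the variant with A at position 4 scores ten.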
The expected ΔG for the formation of the RNA duplex with the U1 snRNA arm was calculated with the help of the MFOLD server (bioinfo.math.rpi.edu/~mfold) using version 3.1 of the MFOLD program and the latest version of the free energy parameters (version 3.0) (32-34). A free energy term of +4.1 kcal/mol for duplex initiation and a penalty of +0.45 kcal/mol for terminal AU base pairs have been included in the calculation (33). In the absence of measured free energy parameters, the contribution of the Ψ-U noncanonical pair has been approximated as a U-U mismatch.

RESULTS
We have previously demonstrated using in vivo randomization-selection experiments (31) that a sequence identical to the intronic portion of a 5′ splice site can act as a splicing enhancer in yeast when located immediately downstream of the branch point sequence. Our experiments also suggested that U1 snRNP was the mediator of this enhancer effect. The conservation in our selected sequences of a U at position 4 was, however, somewhat surprising. The presence of this nucleotide decreases complementarity of the enhancer sequence with the U1 snRNA. Even more puzzling was the observation that substitution of an A for U₄ (GUAaGU), which promotes uninterrupted base pairing with the 5′ arm of U1, led to almost complete loss of splicing enhancement (31).
The Presence of a Mismatch Stabilizes the Interaction of the RNA Substrate with the U1 snRNP-These experiments suggested that the presence of the Ψ-U noncanonical interaction might somehow favor the interaction of U1 snRNP with the enhancer sequence (and as a consequence, with the bona fide 5′SS). To test this hypothesis, we set up in vitro pull-down experiments with a variety of small 54-nt synthetic RNAs containing the relevant region. Splicing extracts were prepared from yeast strains containing an HA-tagged form of the U1 snRNP-associated U170K (or Prp40p, data not shown) protein.
The substrate RNAs, containing a branch point sequence, the wild type or a mutated enhancer sequence, and the 3′ end of the RP51B intron (Fig. 1A), were incubated in the tagged or control untagged extract, and the U1 snRNP-containing complex was immunoprecipitated with anti-HA antibodies. The radioactive RNA that co-immunoprecipitated with the U1 snRNP complex was extracted and analyzed by denaturing PAGE. An aliquot of the purified RNA was primer-extended with an oligonucleotide complementary to U1 snRNA to ensure that immunoprecipitation was equally effective in all cases. Besides the selected sequence (BP-UGUAUGU), we used a U−1G mutant (BP-gGUAUGU), a U4A mutant (BP-UGUAaGU), and a U−1G,U4A mutant (BP-gGUAaGU). All these mutations were expected to increase the complementarity between U1 snRNA and the tested RNA sequences.
Surprisingly, while the U−1G mutation led to the expected increase in the amount of complex formed ( Fig Since the tested sequences are virtually identical to bona fide 5′ splice sites (with or without the U-Ψ mismatch), these experiments suggest the surprising conclusion that conservation of U at the fourth position of introns and of a Ψ at the fifth position of U1 snRNA is at least partially linked to the formation or stability of the U1 snRNP·pre-mRNA complex.
To investigate more closely the stabilizing role of the Ψ-U mismatch, we decided to pursue this work with the two substrates that gave the highest amount of complex, namely BP-gGUAUGU (hereafter called BP-U) and BP-gGUAaGU (BP-A), containing, respectively, a U and an A at the fourth position. To estimate the amount of complex formed in the various conditions described below, we simply measured the amount of radioactive RNA retained on the beads with the U1 snRNP after incubation in extract, immunoprecipitation, and extensive washings. As before, the appropriate controls were performed with nontagged extracts and with primer extension to verify that equivalent amounts of U1 snRNA were retained on the beads. Finally, an aliquot of RNA-containing supernatant was analyzed by denaturing PAGE to ensure that the stability of BP-A and BP-U RNAs in the extract was equivalent (data not shown). Since the absolute amount of complex formation was somewhat dependent on the extract preparation, comparisons between different RNAs were always performed in side-by-side experiments.
Formation of the complex with U1 snRNP might occur faster for BP-U than for BP-A. Alternatively, the former complex might be intrinsically more stable than the latter. In a first attempt to answer this question, we analyzed the time course of complex formation for both substrates. At early time points both complexes were formed at equivalent rates, although in some experiments BP-A complex formation was faster (Fig. 3A and data not shown). At later time points BP-U reproducibly formed more complex than BP-A, confirming the data shown in Fig. 2.
The Presence of a BP Sequence Upstream of the 5′SS Sequence Destabilizes the Fully Paired Complex to a Larger Extent Compared with the Mismatch-containing Complex-We reasoned that the presence of a branch point sequence upstream of the U1 snRNP binding site might somehow affect complex formation or stability. This sequence is a binding site for the BBPp-Mud2p heterodimer (5,6): the interaction of these proteins 2 nucleotides upstream of the U1 snRNP interaction site might either favor or hinder formation of the complex. In both cases, this might affect to different extents a mismatched and a fully paired U1 snRNA·RNA complex. We then constructed two additional variants of the BP-U and BP-A RNAs in which the branch point sequence was mutated (UACUAAC to UACUggC, constructs GG-U and GG-A) or deleted (Δbp-U and Δbp-A). Mutation of the two As is known to affect binding of BBPp to the branch point (5). Since the results were identical for the mutation and deletion of the branch point sequence we will only discuss the experiments performed with the former constructs. The four RNAs (BP-A, BP-U, GG-A, and GG-U) were incubated in parallel experiments with tagged and nontagged extracts and were immunoprecipitated as described above (Fig. 3B). Interestingly mutation of the BP sequence led to an increase in the amount of complex formed (compare constructs GG with constructs BP), suggesting that the branch point sequence located upstream of the 5′ splice site sequence is a destabilizing factor. Most importantly, and surprisingly, this destabilizing effect was stronger when the complex relied on a fully paired U1 snRNA/5′ splice site interaction than when the RNA/RNA interaction contained a Ψ-U mismatch.
GG-A formed the highest amount of complex, but this amount was strongly reduced in the presence of a nonmutated branch point sequence (construct BP-A); on the contrary, GG-U and BP-U formed roughly the same amount of complex, leading to the observed order GG-
The Ψ-U Mismatch Affects the Stability of the Complex Even in the Absence of the BP Sequence-Whatever the mechanistic reasons for the destabilizing effect of the branch point might be, the outcome of these experiments is compatible with a simple model: once formed, the U1·RNA complex is stabilized depending on the presence of the Ψ-U mismatch in the duplex. The mismatch-containing U1·RNA complex would then be more resistant to the challenging effect of the destabilizing BP sequence.
One important implication of this hypothesis is that the higher amount of complex formed with a fully paired BP-less sequence (GG-A) should be less stable than the corresponding, mismatch-containing GG-U·U1 snRNP complex. To address this question we performed the following experiments. We incubated in parallel reactions radiolabeled GG-A and GG-U with U1-tagged extracts for 20 min to allow formation of the U1 snRNP complex. We then added a large excess of cold competitor RNA to isolate the radioactive complex and immunoprecipitated aliquots of the reactions over time to measure the decay rates of the radioactive complexes. We used cold GG-A, cold GG-U, and an equimolar mixture of the two: since the results were identical in the three cases, we only show results obtained with the latter experiment. As shown in Fig. 4, the amount of radioactive complex retained on the beads decreased faster for the fully paired GG-A·U1 snRNP complex than for GG-U·U1 snRNP, which is consistent with the latter being more stable than the former. In a second experiment, we measured differences in the decay rates of the two purified complexes in the absence of challenging cold competitor by repeatedly washing the beads and measuring the retained radioactivity over time. Although the U1 snRNP·RNA complex decayed more slowly in these conditions, the dissociation rate was again faster for GG-A than for GG-U (data not shown).
These experiments strongly suggest that the Ψ-U mismatch plays a role in the stability of the U1 snRNP·5′SS complex.
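As an illustration of how decay curves from such a cold-chase experiment could be compared, the sketch below fits a single-exponential decay to the fraction of radioactive complex retained over time. The single-exponential model, the function names, and the numbers in the example are assumptions made for illustration; they are not data or analysis from this work, and such a fit is better suited to comparing two substrates side by side than to extracting absolute rate constants.

```python
import math


def fit_off_rate(times_min, fraction_bound):
    """Least-squares fit of ln(fraction) = ln(f0) - k_off * t.

    Returns (k_off, f0), with k_off in min^-1 when times are in minutes.
    Assumes a single-exponential decay and strictly positive fractions.
    """
    if len(times_min) != len(fraction_bound) or len(times_min) < 2:
        raise ValueError("need at least two matched (time, fraction) points")
    if any(f <= 0 for f in fraction_bound):
        raise ValueError("fractions must be positive to take logarithms")
    ys = [math.log(f) for f in fraction_bound]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return -slope, math.exp(intercept)


def half_life(k_off):
    """Half-life of the complex for a first-order off rate."""
    return math.log(2) / k_off


if __name__ == "__main__":
    # Synthetic, noise-free example values (hypothetical, not measured).
    times = [0, 5, 10, 20, 40]
    fractions = [math.exp(-0.05 * t) for t in times]
    k, f0 = fit_off_rate(times, fractions)
    print(k, f0, half_life(k))
```

A larger fitted off rate for one substrate than for the other, measured in the same extract, would correspond to the faster decay described above.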
Strengthening the U1 snRNP/BP-A Interaction Allows a Fully Paired Complex to Withstand the Destabilizing Activity of the Branch Point Sequence-The above experiments suggest that in the presence of a fully paired interaction, a higher amount of complex can be formed (GG-A versus GG-U) that is more "vulnerable" to destabilization and generally less stable. This model predicts that strengthening the U1 snRNP·BP-A interaction should allow the complex to counteract the destabilization due to the branch point sequence. In this case, it is expected that even in the presence of a BP sequence, a hyperstabilized fully paired duplex allows more complex formation than a mismatch-containing counterpart. We then constructed BP-A*, in which an additional base pair was added at position +7 of the donor site, and compared this sequence with BP-U* (which is identical but contains the mismatched position) in our U1 snRNP binding assay (these sequences are identical to the selected sequences 1 and 15, see below). As shown in Fig. 5A, and consistent with expectations, a higher amount of U1 snRNP-containing complex was formed with BP-A* than with BP-U*.
FIG. 2. A, immunoprecipitation of substrate RNAs from a U170K tagged or nontagged extract. The RNA co-immunoprecipitated with the U1 snRNP (pellets) is extracted and analyzed by PAGE. As a degradation control, RNAs from the supernatants are also analyzed (supernatants). A fraction of the immunoprecipitated RNA was primer-extended with an oligonucleotide specific for the U1 snRNA (primer extensions) as a control for equal immunoprecipitation efficiency. B, immunoprecipitation of BP constructs requires the U1 snRNA 5′ arm and is independent of U2 snRNP. Extracts were preincubated with an excess of 2′-O-methyl oligonucleotide against the U1 snRNA 5′ arm (αU1) or a DNA oligonucleotide against U2 snRNA (αU2) before incubation with the radioactive RNAs and immunoprecipitation.
The Integrity of the U1 snRNP Complex Is Required for the Mismatch-dependent Stabilization of U1 snRNP·RNA Interaction-It is possible that the stabilizing nature of the Ψ-U mismatch is linked to the presence of a protein that recognizes the unpaired region of the RNA/RNA interaction. A number of proteins, among them Nam8p, Prp40p, U1Cp, and SmD3p, have been directly involved in the stabilization of the U1 snRNA/pre-mRNA interaction (12-15). Our attempts to repeat our in vitro U1 snRNP binding experiments in a genetically U1Cp-depleted strain or in a heat-inactivated U1Cp thermosensitive strain failed because of the very low amount of complexes formed in these conditions (data not shown). Deletion of Nam8p has no growth phenotype and only a very modest biochemical phenotype (13). We then constructed a nam8Δ,U170kHA strain and repeated our binding assays in this U1 snRNP-defective environment. Interestingly (Fig. 5B), while deletion of Nam8p decreased the amount of complex formed with both BP-U and BP-A, the U1 snRNP·BP-U complexes were affected to a larger extent compared with the U1 snRNP·BP-A complexes.
Since a number of additional proteins are absent or loosely associated with the U1 snRNP complex in the absence of Nam8p (13), it cannot be concluded from this experiment that Nam8p is the stabilizing factor. However, these data strongly suggest that the integrity of the U1 snRNP complex is required for the mismatch-dependent stabilization of U1 snRNP·RNA interaction.
The Ψ-U Mismatch Destabilizes Formation of a Naked RNA·RNA Duplex-In the absence of measured thermodynamic parameters for the stability of Ψ-U mismatches in short RNA duplexes, it is possible that this mismatch is intrinsically stable in the sequence context of the U1 snRNA/5′SS interaction. To take this possibility into account, we synthesized an 11-mer RNA oligonucleotide (5′-U1Ψ) with the same sequence as the U1 snRNA 5′ arm (containing two Ψs at positions 5 and 6). We then performed pull-down experiments with biotinylated BP-U and BP-A and 5′-radiolabeled 5′-U1Ψ. As a control, we used a second radiolabeled oligonucleotide against a region shared by BP-U and BP-A (i.e. outside the mismatch region). As shown in Fig. 6, 5′-U1Ψ was efficiently retained on streptavidin beads only in the absence of a mismatch (BP-A), while the control oligonucleotide was retained equally well by biotinylated BP-U and BP-A. This experiment shows that the Ψ-U mismatch is not intrinsically stable, at least in this sequence context. It also suggests that the protein component (or the overall integrity) of the U1 snRNP is required for the mismatch-dependent stabilization of the complex.
Sequences Forming a Mismatched Complex Are Efficiently Recovered in SELEX Experiments-The 5′ splice site sequence is recognized multiple times during the splicing process. Therefore its conservation depends on the existence of multiple, overlapping, selective pressures, one of which is related to U1 snRNP binding.
To substantiate the data presented above and to isolate the U1 snRNP interaction from the other factors that contribute to the definition of the 5′ splice site sequence, we set up a SELEX experiment. RNAs containing eight random positions were immunoselected by tagged U1 snRNP. After extensive washings and reverse transcription-PCR amplification, the procedure was reiterated for additional rounds of selection. We included the branch point sequence upstream of the random region to increase the stringency of the assay. Based on our previous experiments, sequences binding strongly to U1 snRNA by virtue of extended base pairing were expected to dominate the selection. It was of interest to assess to what extent Ψ-U mismatch-containing sequences could survive in a direct competition assay with fully paired sequences. The pool obtained after five rounds of selection bound U1 snRNP better than BP-U, and sequences from this pool were cloned and analyzed (Fig. 7). A pattern essentially identical to a 5′ splice site was easily identified. As expected, a sequence identical to BP-A* dominated the selection. The remaining sequences were roughly equally distributed into three classes containing an A, U, or G nucleotide at position 4. Finally, no sequences containing a C at position 4 were selected despite the fact that Cs were present in the initial pool at a frequency around 20%.
It might be expected that sequences in the "U" class evolve additional base pairs to compensate for the mismatch and be able to compete with fully paired sequences. We then calculated for every sequence the expected change in free energy for the formation of the most stable RNA duplexes with U1 snRNA (the Ψ-U mismatch was assigned the same ΔG penalty as a U-U mismatch). In striking contrast to the fact that sequences of the three classes were present at roughly equivalent frequencies, the calculated ΔG values were significantly different for the U class and for the A or G classes. Sequences of the U class had expected ΔG values ranging from −3.6 to −7 (average, −5.5) (corresponding to Kd values ranging from 3 × 10⁻³ to 1.6 × 10⁻⁵), while calculated ΔG values for the two other classes were significantly more negative (averages of −9.6 and −9.4, respectively, for the A and G classes), and the corresponding dissociation constants were in the 10⁻⁵ to 10⁻⁸ range. These experiments indicate that RNA·U1 snRNP complexes containing a Ψ-U mismatch can efficiently compete with complexes containing a fully paired RNA duplex.
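The correspondence between calculated free energies and dissociation constants follows from the standard relation Kd = exp(ΔG/RT). The sketch below applies it under the assumption that ΔG is expressed in kcal/mol and that the temperature is 37°C (the usual default for folding parameters; the text does not state it). The function names are hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def kd_from_delta_g(delta_g, temp_c=37.0):
    """Dissociation constant (M) for a duplex formation free energy.

    delta_g: free energy of duplex formation in kcal/mol (negative = stable).
    temp_c: temperature in degrees Celsius (ASSUMED 37, not stated in text).
    """
    return math.exp(delta_g / (R_KCAL * (temp_c + 273.15)))


def fold_change_in_kd(delta_delta_g, temp_c=37.0):
    """Factor by which Kd increases for a destabilization of delta_delta_g."""
    return math.exp(delta_delta_g / (R_KCAL * (temp_c + 273.15)))


if __name__ == "__main__":
    for dg in (-3.6, -7.0, -9.6):
        print(dg, kd_from_delta_g(dg))
    print(fold_change_in_kd(3.0))
```

With these assumptions a ΔG of −3.6 kcal/mol gives a Kd close to 3 × 10⁻³, and a ΔG of −7 kcal/mol gives a value of the order of 10⁻⁵, in line with the range quoted above; a 3 kcal/mol destabilization corresponds to a more than 100-fold increase in Kd.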
The SELEX experiment provided some interesting information concerning the interaction of nucleotides surrounding the GGUAUGU sequence with the 5′ arm of U1 snRNP. Notably it suggested that the interaction site on the RNA extends from nucleotides −2 to +8 (facing U1 nucleotides 1-10). To substantiate this result we performed a second SELEX experiment by keeping constant the GGUAUGU sequence and randomizing 4 nucleotides upstream and 3 nucleotides downstream (Fig. 6B). Analysis of the data indicates the existence of a strong selective pressure on nucleotides −2 (A), +7 (A), and +8 (U), which define the U1 interaction sequence on the RNA as AG:GUAUGUAU.

DISCUSSION
The consensus sequence for the 5′ splice site of vertebrate introns contains a predominant A at position +4, which forms a base pair with the 5th nucleotide of the U1 snRNA, a pseudouridine (Ψ) (26). In contrast to vertebrates, in yeast the sequence of the 5′ splice site is almost invariant, which testifies to the existence in this organism of a strong selective pressure on the identity of the first 5 intronic nucleotides. Four of the first 5 nucleotides in yeast introns are compatible with the consensus sequence of vertebrates. The most significant difference between the two systems is the presence of a U at the fourth position in yeast introns, which precludes the possibility of forming a canonical base pair with the corresponding Ψ₅ of yeast U1 snRNA.
In the spliceosome assembly process, the U6 snRNA replaces U1 snRNA at the 5′ splice site, and the intronic U₄G₅U₆ base pairs with A₄₇C₄₈A₄₉ of U6 snRNA. It is possible that conservation of U₄ in yeast is rather relevant to the latter interaction, which would only be essential in yeast. However, the conservation of a pseudouridine at position 5 of U1 snRNA remains to be explained. Given the high free energy price paid for the maintenance of a mismatch in this region (with calculated dissociation constants that can increase up to 3 orders of magnitude, Fig. 7, compare sequences 1 and 15; see also Fig. 6) one would expect the existence of a strong selective pressure to maintain a fully paired helix, leading to the U5A mutation in U1 snRNA (as, for instance, might be the case in Euglena gracilis (35)).
It is possible that conservation of Ψ₅ is relevant to U1 snRNP integrity, e.g. biogenesis or stability of the particle. However, we have been unable to detect alterations in the mature levels of a mutated U1 snRNA bearing the Ψ5A mutation. We favor the hypothesis that the Ψ-U mismatch plays a role in splicing, which is essentially related to the regulation of the interaction of the U1 snRNP with the 5′ splice site.
In this report we first used a sensitive U1 snRNP binding and immunoprecipitation assay to compare the U1 snRNP binding efficiencies of 5′ splice sites containing an A or a U at position 4, leading, respectively, to the formation of a fully paired or a mismatch-containing complex. The RNAs we used contain a branch point sequence upstream of the 5′ splice site sequence, which leads to the partial destabilization of the complex formed with U1 snRNP as shown by mutation of the two A residues of the BP sequence (Fig. 3). This destabilization is likely to be linked to the binding of the branch point-recognizing protein factors, which might either directly challenge the formation/stability of the 5′ splice site complex, impede the stabilizing activity of the cap binding complex, or somehow alter the structural integrity of the U1 snRNP particle. Surprisingly we found that a complex containing the presumably destabilizing Ψ-U mismatch was formed at equivalent or higher levels than a complex that contains a Ψ·A base pair at the same position. This was also confirmed by directly comparing the decay rates of the two complexes in the absence of the destabilizing BP sequence: although the absolute dissociation rate constants cannot be reliably measured by these assays, they were significantly different and higher for the complex that contains a fully paired helix. These data are also particularly significant in light of the parallel failure of biotinylated BP-U to efficiently pull down an 11-mer RNA oligonucleotide bearing the same sequence as the 5′ arm of U1 snRNA. Note that the outcome of the latter experiment is compatible with the expected +3 kcal/mol free energy change that accompanies in mismatched duplexes the disruption of two stacking interactions and the associated hydrogen bonds (A-A and A-G), which implies an increase in the dissociation constant of more than 2 orders of magnitude.
We also performed a SELEX experiment and asked whether U₄-containing sequences could efficiently compete with sequences that could form theoretically more stable helices. As expected, sequences containing a U-mismatched position 4 were efficiently recovered in this experiment.
Interestingly, no sequences containing a C in this position were selected, which strongly suggests that Ψ-U and Ψ-C mismatches are not equivalent in this context. This result, which was also directly confirmed by immunoselection assays with individual sequences containing a C at position 4 (data not shown), raises the question of why C is the second most frequent nucleotide in natural 5′ splice sites, while Gs are never found (see below).
The outcome of the SELEX experiment might be considered somewhat different from that of our pull-down assays in that sequences with A (and G) at the mismatch position were not out-competed by sequences containing Us. This is likely due to the fact that the selection protocol is not able to discriminate between complexes with different off rates but similar on rates. It is possible that the Ψ-U-dependent stabilization follows duplex formation, which is expected to be faster for fully complementary helices (as we observed in some experiments, data not shown). Since the U1 snRNP target is limiting, the composition of the winning pool might essentially reflect the competition between sequences for access to the target. In the absence of such a competition (i.e. in pull-down experiments with individual sequences) the differences in the off rates would become apparent.
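The kinetic argument above can be made explicit with a minimal two-regime sketch. It is an illustrative simplification and not a model fitted to the data: the symbols S_i (RNA species i), k_on,i and k_off,i (its association and dissociation rate constants), f_i (its share of the captured target), and C_i (its complex with U1 snRNP) are all introduced here as assumptions, and dissociation is assumed to be negligible during the competition step.

```latex
% Competition for a limiting target (SELEX): the share of target captured
% by species i depends only on on rates and concentrations.
\begin{equation*}
f_i \;\approx\; \frac{k_{\mathrm{on},i}\,[S_i]}{\sum_j k_{\mathrm{on},j}\,[S_j]}
\end{equation*}

% Individual pull-down followed by a chase or wash of duration t:
% the retained complex decays with its own off rate.
\begin{equation*}
[C_i](t) \;=\; [C_i](0)\, e^{-k_{\mathrm{off},i}\, t}
\end{equation*}
```

In the first expression k_off does not appear, so sequences with similar on rates are recovered in similar proportions whatever their off rates; in the second, a difference in k_off translates directly into a difference in the amount of complex recovered.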
We favor the hypothesis that a protein factor plays an important role in "locking" the U1 snRNP/RNA interaction by recognizing the helix distortion induced by the unpaired nucleotides. The observation that the absence of Nam8p has a more dramatic effect on a mismatch-containing than on a fully paired complex is consistent with this hypothesis, although it does not directly implicate Nam8p in the mismatch-dependent stabilization. The absence of Nam8p has in fact been reported (36) to loosen the association of other protein factors with the U1 snRNP, like Snu56p, Snu71p, and Prp40p, which might be involved indirectly or directly (as suggested for Prp40p (12)) in the process. U1Cp has also been implicated in stabilizing the U1 snRNP/pre-mRNA interaction in vertebrates (37) and shown to cross-link, albeit weakly, to U4 in yeast introns (14). More recently, Chen et al. (18) reported the isolation of multiple U1C alleles (mutation of the same amino acid, Leu13) that allow bypass of the requirement for Prp28p, a DEAD box ATPase implicated in the dissociation of U1 snRNP from the 5′ splice site. Finally, recombinant U1Cp was shown to directly recognize a portion of the 5′ splice site that includes U4 (Footnote 3). Although these results suggest a role for U1Cp in the recognition of the mismatch, it has to be stressed that they do not necessarily implicate a direct role in the mismatch-dependent stabilization of the interaction. All our attempts to demonstrate a role for this protein using U1Cp thermosensitive extracts failed, essentially due to the very low amount of complexes formed with our RNA substrates.
Staley and Guthrie (17) have recently shown that artificially extending the base pairing between U1 snRNA and the 5′SS decreases the efficiency of U1 snRNP displacement and inhibits splicing at low temperatures. It is possible that a U1 snRNA containing an A in the fifth position (and therefore able to form a fully paired duplex) might not dissociate efficiently from the 5′SS, which would be the rate-limiting step for growth. In apparent accordance with this hypothesis is the observed cold sensitivity of a U1 Ψ5A mutant strain (Footnote 2). However, we have recently shown (Footnote 4) that the cold-sensitive phenotype of this strain is mainly due to the defective splicing of one single intron contained in a quasi-essential gene that we temporarily called LMD1. Replacement of the gene with its cDNA copy efficiently suppresses the cold sensitivity of the strain, indicating that the growth defect does not reside in a slow U1 snRNP dissociation step from one or more introns containing canonical 5′ splice sites. Rather, it is likely that binding to the LMD1 intron is limiting. In fact, the 5′SS of LMD1 contains an unusual A at position 5 that leads to the formation of an A-A mismatch with the mutant U1 snRNA and to inefficient splicing of this particular intron (Footnote 4). It has to be stressed, however, that the role played by the Ψ-U mismatch in the stabilization of the U1 snRNP/5′SS interaction does not preclude an additional requirement for an efficient dissociation step. Indeed, our favorite hypothesis is that its presence is required to "transfer" part (or most) of the interaction energy of the U1 snRNP·5′SS complex to one (or more) protein factors. Modulation of the activities of the latter by phosphorylation or the action of other proteins during the splicing cycle would in turn favor association or dissociation of the particle from the 5′SS.
This strategy might be more economical in that it would still exploit the RNA moiety for accurate positioning but would rely on the protein component for modulation of the affinity, which is presumably more efficient than disrupting a fully paired RNA duplex.
Why is the Ψ-U mismatch not conserved in vertebrates, where the majority of 5′ splice sites contain an A at the fourth position? One likely explanation might be that the yeast U1 snRNP contains a set of specific proteins that are not associated with the vertebrate particle (36). Interestingly, at least one of these proteins (Nam8p) has a vertebrate homologue (38) that is not part of the U1 snRNP. It is likely that protein-mediated stabilization of the U1 snRNP/5′SS interaction in vertebrates is not universally mediated by a U1 snRNP component but occurs through the intervention of a set of specific, non-U1 snRNP-associated factors (e.g. SR proteins) playing major roles both in the selection of the less well defined 5′SS and the regulation of its use (for instance, see Refs. 39 and 40).
The 5′SS in yeast introns is inspected multiple times during the splicing process. Its sequence is therefore the result of several selective constraints, only one of which is U1 snRNP binding. Although a number of studies provided essentially genetic evidence for the interaction between individual nucleotides of the U1 snRNA 5′ arm and the 5′SS (27, 41-43), there is very limited biochemical evidence that directly addresses the question of how many intronic nucleotides can actually be inspected by U1 snRNP. The outcome of our SELEX experiments allows the isolation of the latter evolutionary constraint by providing an artificial phylogeny solely based on U1 snRNP binding. Incidentally, the strategy we used, i.e. selection based on complex pull-down assays, might be generally more informative than classical SELEX to identify protein binding sites on the RNA. Contrary to selections based on binding to single polypeptides, complex pull-down selection is more likely to reveal physiological situations when the protein of interest assembles with a complex on the RNA. In this case, indeed, the protein/RNA-interacting surface might be constituted by more than one polypeptide that potentially contributes to the affinity and/or the specificity of the interaction.
The consensus sequence of our artificial phylogeny parallels data issued from statistical analysis of the splice sites of (almost) all yeast introns. In complete accordance with the latter is the conservation of nucleotides −2 (A), 1-3 (GUA), and 5 and 6 (GU). Nucleotide G−1 is strongly conserved in our analysis but relatively poorly in yeast introns (36%) (44). However, taking into account the nucleotide bias in yeast and consistent with our results, Lopez and Seraphin (45) concluded that a statistically significant selective pressure acts on this position.
Overall, the consensus sequence issued from our experiments is consistent with the first 10 nucleotides of U1 snRNA (A1-U10) being available for base pairing with the 5′SS sequence; beyond these positions, interactions between the two RNAs are unlikely to occur, consistent with the proposed involvement of nucleotide U11 in the stem I structure of U1 snRNA (46) and the absence of conservation for an A at position −3 in our experiments. It is tempting to speculate that the two helices (stem I in U1 snRNA and the 5′SS·U1 snRNA duplex) might stack together, which would provide additional stabilization. Finally, our data imply that the conservation found in natural 5′SS at positions −3 and −4 (two As) and position 7 (U) has to be accounted for by selective pressures other than U1 snRNP binding.
Finally, Gs are virtually absent from yeast 5′SS at position 4 but are present in roughly a third of sequences in our selection. Alternative (and negative) selective pressures (e.g. U6 snRNA interaction) can certainly be invoked to explain this discrepancy. One interesting hypothesis is that in the presence of this nucleotide the 5′SS is composed of two almost identical half-sites (AGGUAG4GUAU, where G4 denotes the G at position 4), which might interfere with correct positioning of the cleavage site.