Complex role of the β2-β3 Loop in the Interaction of U1A with U1 Hairpin II RNA*

RNA recognition motifs (RRMs) are characterized by highly conserved regions located centrally on a β-sheet, which forms the RNA binding surface. Variable flanking regions, such as the loop connecting β-strands 2 and 3, are thought to be important in determining the RNA-binding specificities of individual RRMs. The N-terminal RRM of the spliceosomal U1A protein mediates binding to an RNA hairpin (U1hpII) in the U1 small nuclear RNA. In this complex, the β2-β3 loop protrudes through the 10-nucleotide RNA loop. Shortening of the RNA loop strongly perturbs binding, suggesting that an optimal “fit” of the β2-β3 loop into the RNA loop is an important factor in complexation. To understand this interaction further, we mutated or deleted loop residues Lys50 and Met51, which protrude centrally into the RNA loop but do not make any direct contacts to the bases. Using BIACORE, we analyzed the ability of these U1A mutants to bind to wild type RNAs or to RNAs with shortened loops. Alanine replacement mutations only modestly affected binding to wild type U1hpII. Interestingly, simultaneous replacement of Lys50 and Met51 with alanine appeared to alleviate the loss of binding caused by shortening of the RNA loop. Deletion of Lys50 or Met51 caused a dramatic loss in stability of the U1A·U1hpII complex. However, deletion of both residues simultaneously was much less deleterious. Simulated annealing molecular dynamics analyses suggest this is due to the ability of this mutant to rearrange flanking amino acids to substitute for the two deleted residues. The double deletion mutant also exhibited substantially reduced negative effects of RNA loop shortening, suggesting the rearranged loop is better able to accommodate a short RNA loop. Our results indicate that one of the roles of the β2-β3 loop is to provide a steric fit into the RNA loop, thereby stabilizing the RNA·protein complex.

The RNA recognition motif (RRM), also known as the ribonucleoprotein (RNP) consensus domain or RNA binding domain (RBD), is the most common and best characterized RNA binding domain. It is present in one or more copies in hundreds of RNA binding proteins (1-3). Proteins carrying RRM domains play critical roles in a wide variety of cellular processes, including RNA processing and packaging, mRNA export, translation, and RNA degradation. RRM domains are about 90 amino acids long and fold into a globular structure consisting of a four-stranded antiparallel β-sheet (the RNA binding surface) backed by two α-helices (see Fig. 1, A and B). RRM domains are characterized by the presence of two highly conserved stretches of 8 and 6 amino acids, known as the RNP1 and RNP2 consensus sequences (see Fig. 1A). These consensus sequences lie strategically in the center of the β-sheet surface and contain conserved aromatic residues critical for RNA binding (2, 4-14). Because the highly conserved nature of the RNP sequences precludes a major role in controlling the specificity of the interaction (1, 2, 11, 15, 16), the less conserved loop regions surrounding the β-sheet surface are thought to be the predominant determinants of the target specificity of individual RRM proteins. The mechanism by which these flanking regions confer specificity is of considerable interest. Here we describe studies of one such region, the β2-β3 loop, which plays an important role in RNA binding by the U1A protein.
U1A, the A protein of the U1 small nuclear ribonucleoprotein, is the most extensively studied RRM protein, and is often used as a paradigm for RRM domain/RNA interaction. Its N-terminal RRM is necessary and sufficient for high affinity binding to U1 hairpin II (U1hpII), a stem-loop structure in the U1 small nuclear RNA (see Fig. 1D) (17, 18). The β2-β3 loop of U1A (amino acids 46-52, see Fig. 1, A-C) inserts into the RNA loop, causing the RNA loop residues to splay out (5). This allows the presentation of the bases to highly conserved amino acids within the RNP sequences and the establishment of the critical interactions that mediate high affinity binding (5). The interaction between the RNA and protein loops also appears to be important for providing stability to the RNA·protein complex, as suggested by the increased dissociation rate of complexes containing a Leu49→Met U1A mutant or RNAs with shortened loops (9, 19). Here we describe experiments aimed at testing whether proper complex stability might require a snug "fit" of the RNA and protein loops. We hypothesized that a reduction in the bulk of the β2-β3 loop might allow it to interact better with RNAs with shorter loops. Using a surface plasmon resonance-based biosensor (BIACORE) (20-22) we investigated the effects of β2-β3 loop mutants on the interaction of U1A with U1hpII RNAs carrying loops of different sizes. We show that removal of two amino acids from the β2-β3 loop favors the interaction with smaller RNA hairpins, supporting a steric role of this part of U1A.

Construction of U1A Mutants and Protein Purification-Throughout these studies, an N-terminal fragment of the human U1A (amino acids 1-101, herein referred to as U1A) containing the first RRM was used (9). This fragment has been demonstrated to be necessary and sufficient for specific and high affinity binding to U1hpII (17, 18). The U1A fragment was inserted into a modified pET3d vector such that a myc and a His6 tag were appended to the C terminus of the RRM, as described previously (9). For cloning purposes, engineered restriction sites were introduced within the U1A coding region. All clones were generated by digestion of restriction sites that flank the area to be mutated and replacement with complementary oligonucleotides encoding the desired substitutions or deletions. The sequence identity of each clone was confirmed using both restriction digests and sequencing. Proteins were expressed in Escherichia coli BL21(DE3) (Novagen, Madison, WI), and purification was carried out using the hexahistidine tag at the C terminus of the protein (9, 23). After binding to Ni2+ beads (Qiagen, Valencia, CA) samples were eluted using increasing concentrations of imidazole (50-250 mM). The concentration of each protein was estimated using the Bradford assay (Bio-Rad, Hercules, CA) and confirmed by Coomassie Blue staining of an extensive protein dilution series next to a standard on SDS-PAGE gels. The identities of the deletion proteins (ΔLys50, ΔMet51, and ΔLys50Met51) were confirmed by mass spectroscopy.
Biosensor Analysis-Binding experiments were performed on BIACORE 2000 and BIACORE 3000 instruments (Biacore Inc., Piscataway, NJ). All RNAs were chemically synthesized carrying a 5′-biotin tag (Dharmacon Research, Boulder, CO) to allow immobilization of the RNAs onto streptavidin-coated sensor chips (SA chips, Biacore Inc.). RNA was diluted to a final concentration of 1 μM in HBS buffer (10 mM HEPES, pH 7.4, 150 mM NaCl, 3.4 mM EDTA, 0.005% surfactant P20) followed by heating at 80°C for 10 min and cooling to room temperature to allow annealing of the stem. The sample was then diluted 500-fold in running buffer (10 mM Tris/HCl, pH 8.0, 150 mM NaCl, 5% glycerol, 62.5 μg ml−1 bovine serum albumin, 125 μg ml−1 tRNA, 1 mM dithiothreitol, 0.05% surfactant P20) and injected over the sensor chip surface at 10 μl min−1 at 20°C. To provide an optimal comparison of the results obtained from all different U1A mutants, we prepared an intermediate density RNA surface (100-125 resonance units, RU) that would yield sufficient signal, even when proteins with lower affinities (the deletion mutants) were used. (Wild type protein and the alanine mutants were also analyzed using a low density surface (25 RU), which yielded identical results.) To test for the specificity of the RNA-binding interaction, binding of all proteins to two high density (100 RU) control surfaces was also tested. One consisted of a U1hpII RNA in which the order of the loop nucleotides had been reversed from 5′-AUUGCACUCC-3′ to 5′-CCUCACGUUA-3′ ("reverseU1hpII"), which changes 8 of the 10 loop nucleotides, including 6 of the 7 highly conserved loop residues (11, 24-27) but leaves the loop structure intact. A second control surface contained an unrelated 19-nucleotide single-stranded RNA. Proteins were serially diluted in running buffer to the concentrations described in each sensorgram and injected at 20°C at a flow rate of 50 μl min−1 for 1 min.
Disruption of any complex that remained bound after a 5-min dissociation was achieved using a 1-min injection of 2 M NaCl at 20 μl min−1. Samples with different concentrations of protein were injected in random order, and every injection was performed at least twice within each experiment. All experiments were done in triplicate. To subtract any background noise from each data set, all samples were also run over an unmodified sensor chip surface and random injections of running buffer were performed throughout every experiment ("double referencing"). Data were analyzed using CLAMP (28) and a simple 1:1 Langmuir interaction model with a correction for mass transport (29).
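To make the shape of the fitted model concrete, the following is a minimal sketch of the closed-form 1:1 Langmuir response for a 1-min injection followed by a 5-min dissociation, as in the protocol above. It deliberately omits the mass transport correction, it is not the CLAMP implementation, and the rate constants, Rmax, and protein concentration are hypothetical values chosen only for illustration (they are not taken from Table I or II).

```python
import math

def association(t, conc, ka, kd, rmax):
    """Response (RU) t seconds into an injection of protein at conc (M)."""
    kobs = ka * conc + kd                 # observed pseudo-first-order rate
    req = rmax * ka * conc / kobs         # plateau response at this concentration
    return req * (1.0 - math.exp(-kobs * t))

def dissociation(t, r0, kd):
    """Response (RU) t seconds after the injection ends, starting from r0."""
    return r0 * math.exp(-kd * t)

def sensorgram(conc, ka, kd, rmax, t_inject=60.0, t_dissoc=300.0, step=1.0):
    """Return (time, response) points for the injection and dissociation phases."""
    points = []
    for i in range(int(t_inject / step) + 1):
        t = i * step
        points.append((t, association(t, conc, ka, kd, rmax)))
    r_end = points[-1][1]
    for i in range(1, int(t_dissoc / step) + 1):
        t = i * step
        points.append((t_inject + t, dissociation(t, r_end, kd)))
    return points

# Hypothetical parameters for illustration only.
K_ON = 1.0e7    # association rate constant, M^-1 s^-1
K_OFF = 1.0e-3  # dissociation rate constant, s^-1
R_MAX = 100.0   # surface capacity, RU
curve = sensorgram(conc=1.0e-9, ka=K_ON, kd=K_OFF, rmax=R_MAX)
```

In a global fit, one set of ka, kd, and Rmax would be adjusted so that curves like this one match the measured responses at every injected concentration simultaneously.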
Simulated Annealing Molecular Dynamics-Calculations were performed on structures based on the x-ray coordinates of human U1A (amino acids 2-97) complexed to the RNA hairpin 5′-AAUCCAUUGCACUCCGGAUUU (5). Because spacer nucleotides U13, C14, and C15 are not defined in the x-ray structure, they were independently built and inserted through energy minimization (of U13, C14, and C15 only) using AMBER4.0 (30, 31). Proteins with single amino acid deletions (ΔLys50 and ΔMet51) were modeled by removing the appropriate amino acid and then "sealing" the backbone by minimizing (1000 cycles) the region around the deletion (from Leu44 to Gln54 in wild type numbering, encompassing the β2-β3 loop) while keeping the remainder of the protein and the RNA fixed. Using a similar procedure, the double deletion protein (ΔLys50Met51) was constructed by deleting Met51 from the minimized ΔLys50 β2-β3 loop. The single and double deletion protein·RNA complexes were then minimized for 4000 cycles, using a 12-Å nonbonded cutoff and applying a distance-dependent dielectric constant (ε = 4r) to the electrostatic interactions. The β2-β3 loop region of each of these models (from Leu44 to Gln54, using the numbering of the wild type protein) was then subjected to a brief molecular dynamics simulation (10 ps, time step 0.002 ps, 400 K, other conditions as above) followed by a minimization (using the conditions above) of the structure formed after the 10-ps simulation. This procedure was repeated 50 times for each complex, each time starting from the previous simulation. As a control, a similar series of 50 dynamics and minimization cycles was also performed for the wild type protein·RNA complex.
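The control flow of this protocol (a short high-temperature dynamics run, a minimization, and 50 repeats that each start from the previous result) can be sketched on a toy system. The example below is only a schematic under loud assumptions: a one-dimensional double-well function stands in for the force-field energy, overdamped Langevin dynamics in reduced units stands in for the 400 K molecular dynamics, and steepest descent stands in for the minimizer. None of it reproduces AMBER or the actual U1A calculations.

```python
import math
import random

def energy(x):
    """Toy double-well potential standing in for a force-field energy."""
    return (x * x - 1.0) ** 2 + 0.3 * x

def gradient(x):
    """Derivative of the toy potential."""
    return 4.0 * x * (x * x - 1.0) + 0.3

def minimize(x, cycles=4000, step=0.01):
    """Steepest-descent minimization for a fixed number of cycles."""
    for _ in range(cycles):
        x -= step * gradient(x)
    return x

def hot_dynamics(x, rng, n_steps=5000, dt=0.002, temperature=1.0):
    """Overdamped Langevin steps (reduced units) as a stand-in for brief hot MD."""
    noise = math.sqrt(2.0 * temperature * dt)
    for _ in range(n_steps):
        x += -gradient(x) * dt + noise * rng.gauss(0.0, 1.0)
    return x

def anneal(x0, n_cycles=50, seed=0):
    """Repeat dynamics + minimization, each cycle starting from the previous one."""
    rng = random.Random(seed)
    x = minimize(x0)
    energies = []
    for _ in range(n_cycles):
        x = hot_dynamics(x, rng)
        x = minimize(x)
        energies.append(energy(x))  # minimized energy recorded after every cycle
    return energies

energies = anneal(x0=0.0)
```

Tracking the minimized energy after each cycle is what allows a steady downward drift, of the kind reported for the double deletion complex, to be distinguished from a trace that stays flat.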

Side-chain Replacements of β2-β3 Loop Residues Lys50 and Met51 Affect Binding to U1hpII in Kinetically Distinct Ways-The β2-β3 region of the N-terminal U1A RRM domain includes amino acids 46-52 (Fig. 1, A and B). Of these, Ser48, Leu49, Lys50, and Met51 appear to protrude into the RNA loop, with Leu49-Met51 inserted deeply into the middle of the RNA loop (Fig. 1C). Of the three deeply inserted residues, Leu49 contacts the RNA through a water-mediated hydrogen bond from the main-chain carbonyl to N3 of G4. Lys50 and Met51 do not appear to form short-range interactions with the RNA at all. Therefore, these latter two residues are the best suited to examine a steric role of the β2-β3 loop in complex formation and stability. To study the effects of decreasing the bulk of the β2-β3 loop, we therefore either reduced the size of side chains of Lys50 or Met51 or removed these residues altogether.
FIG. 1 (legend, partial). Nucleotides U−5 to G15 are identical to the natural U1hpII sequence. The "spacer" nucleotides, whose identity is unimportant for binding, are U8-C10. Nucleotide C9 was deleted individually or together with U8 to generate U1hpII variants with shorter loops (U1hpIIΔC9 and U1hpIIΔUC, respectively).
Lys50 and Met51 were replaced individually or simultaneously with alanine, in a U1A fragment containing the N-terminal RRM (9). The ability of the alanine replacement mutants to bind to wild type U1hpII was first tested, using BIACORE. Highly accurate kinetic data, yielding affinity constants that agree well with those obtained by other methods, can be obtained by BIACORE analysis of the U1A·U1hpII interaction (9). We had previously shown that the high affinity of the U1A·U1hpII complex (~35 pM) results from a very high association rate and very low dissociation rate (9) (Fig. 2A and Table I). Replacement of Lys50 or Met51 with an alanine reduced the affinity by 10- and 7-fold, respectively (Fig. 2, D and G, and Table I). Although the main effect of the Lys50→Ala mutation was on the association rate, suggesting that this residue plays a role in the electrostatically mediated association of the two molecules (9), the loss in binding seen with the Met51→Ala mutant was due to a 6-fold increase in the dissociation rate (Table I). This suggests that these two residues stimulate U1hpII binding through different mechanisms. The main role of Lys50 appears to be the electrostatic recruitment of the RNA (as confirmed by salt dependence experiments (9)), whereas the main role of Met51 appears to be to maximize non-bonded contacts between protein and RNA. The Met51→Ala mutation thus appears to have an effect similar to replacing Leu49 with methionine, which also increases complex dissociation (19). When both Lys50 and Met51 were replaced by alanine, the association rate was slightly faster than with the substitution of Lys50 alone, suggesting the double replacement mutant might allow compensation for the loss of charge, perhaps by rearrangement of neighboring amino acids (Fig. 2J and Table I).
However, the dissociation rate was faster than that of either of the single mutants; the Lys50Met51→Ala·U1hpII complex exhibited a dissociation rate that was 16-fold faster than that of the wild type U1A·U1hpII complex. A U1A mutant, in which Lys50 plus Met51 was mutated to Gly plus Ser, has been described previously (32). The Gly plus Ser mutant also exhibited a loss in affinity for U1hpII; however, the effect was only 2.5-fold. The observed difference in the magnitude of the effect of Lys50 plus Met51 replacement might arise from the identity of the replacement residues (Ala plus Ala versus Gly plus Ser), from differences in experimental conditions (neither absolute KD values nor kinetic parameters are given by the authors), or from the fact that the Gly plus Ser mutation was made in the context of the full-length protein.
The increased instability of RNA·protein complexes formed by Met51→Ala and the double mutant suggests that these proteins are hampered by a reduced fit between the RNA loop and the protein loop. This reduced fit potentially arises from the reduction in size of the side chains that normally pack against the 5′ half of the U1hpII loop (Fig. 1C). We hypothesized that it might be possible to partially restore the ability of these mutants to bind to RNA by reducing the size of the RNA loop. Accordingly, we analyzed the interaction between the alanine substitution mutants and RNAs lacking one or two loop nucleotides.
The Double Alanine Substitution Appears to Alleviate the Effect of RNA Loop Shortening-Of the 10 nucleotides in the U1hpII loop (Fig. 1D), the sequence of only the first seven bases is crucial, as determined by in vitro selection and biochemical experiments (9, 11, 24-27). In contrast, the identity of loop nucleotides 8-10 is irrelevant, because they can be replaced by a polyethylene glycol linker (27). We therefore chose to delete one or two of the spacer nucleotides to examine steric interactions between the β2-β3 loop and the RNA loop.
We compared the interaction of U1A or the alanine mutants with U1hpII to that of these same proteins with U1hpII lacking C9 or U8-C9 (U1hpIIΔC9 and U1hpIIΔUC, Fig. 1D). In agreement with what was shown previously (9, 27), the interaction of U1A with U1hpII weakens progressively with the removal of each nucleotide (Fig. 2, A-C, and Table I). This is due mainly to an increase in the dissociation rate (Table I and Ref. 9). When the alanine replacement mutants were used, reducing the RNA loop length also caused a marked increase in dissociation rates compared with the same proteins bound to U1hpII (Fig. 2, D-L, and Table I). Interestingly, the double alanine replacement mutant showed a 10-fold smaller negative effect of RNA loop shortening as visualized by comparison of the dissociation constant (KD) of the RNA lacking two loop nucleotides relative to the wild type RNA (Table I, penultimate column). Although the affinity of wild type U1A for the mutant RNAs decreased ~3000-fold with the removal of two nucleotides, the affinity of the double alanine replacement mutant appeared to be less affected (~300-fold decrease). Thus, removing spacer residues from the RNA appears to be less deleterious for the double alanine mutant than for wild type U1A or the single alanine replacements, suggesting that a reduction in size of residues in the β2-β3 loop might compensate to a limited extent for RNA loop shortening. However, the double alanine replacement (which reduces binding affinity) and the shortening of the RNA loop (which has a dramatic effect on binding) were not sufficiently complementary to offset the loss in binding ability, so that a net increase in KD was still observed for the mutant protein·RNA combination (Table I, final column).
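As an illustration of what these fold changes mean energetically, a fold increase in KD can be converted to a binding free-energy penalty with ΔΔG = RT ln(fold). The sketch below applies this standard relation to the approximate ~3000-fold and ~300-fold values quoted above, assuming the 20°C measurement temperature; the function name is hypothetical and the resulting numbers are derived here for illustration, not reported in the tables.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1

def ddg_from_fold(fold_change_kd, temp_c=20.0):
    """Free-energy penalty (kcal/mol) corresponding to a fold increase in K_D."""
    temp_k = temp_c + 273.15
    return R_KCAL * temp_k * math.log(fold_change_kd)

wild_type = ddg_from_fold(3000.0)    # ~3000-fold loss, wild type U1A on the shortened loop
double_ala = ddg_from_fold(300.0)    # ~300-fold loss, double alanine mutant
difference = wild_type - double_ala  # equals RT ln(10)
```

On this scale the 10-fold smaller effect seen with the double alanine mutant corresponds to roughly 1.3 kcal/mol, a modest compensation compared with the total penalty of RNA loop shortening.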
To verify that the binding we observed to the various RNAs in these experiments was not due to nonspecific binding that might occur at higher protein concentrations, all proteins were tested for binding to a U1hpII target ("reverseU1hpII"), in which the order of the loop nucleotides had been inverted (from 5′-AUUGCACUCC to 5′-CCUCACGUUA), using protein concentrations identical to those used for the wild type and deletion target RNAs. No binding to this RNA was observed for any of the proteins at these concentrations. Thus, the observed ability of U1A and alanine mutants to bind to the wild type and deletion RNAs is specific and depends on the identity of the loop nucleotides.
Together, our observations suggest that, although the double alanine replacement is somewhat beneficial in interactions with RNAs carrying smaller loops, a larger adjustment than that provided by two side-chain substitutions may be required to restore binding. We therefore explored the effect of completely removing Lys50 and/or Met51 from the β2-β3 loop.
Deletion of Residues from the β2-β3 Loop Dramatically Reduces Binding to U1hpII-Three deletion mutants were generated: ΔLys50, ΔMet51, and ΔLys50Met51. Deletion of Lys50 resulted in a ~9,000-fold loss in binding affinity for the wild type RNA target, which can be attributed largely to an increase in the dissociation rate (~1,000-fold, Fig. 2M and Table II). The removal of Met51 was even more deleterious, resulting in a ~50,000-fold loss in binding affinity, again attributable almost exclusively to a large decrease in complex stability (~8,000-fold increase in dissociation rate, Fig. 2P and Table II). Surprisingly, deletion of both residues was less calamitous, showing a ~900-fold loss in affinity for U1hpII compared with the wild type U1A (Fig. 2S and Table II). This loss was again based largely on an increase (~200-fold) in the dissociation rate. Thus, in all cases, the loss in affinity was almost entirely due to a dramatic loss in complex stability. Importantly, the association rate changed less than 10-fold for each of the three mutants (Table II), indicating that the initial step in complex formation is largely unaffected (9). Because the weak affinity of the deletion mutants necessitated the use of very high protein concentrations, we tested the ability of the proteins to bind to reverseU1hpII and an unrelated single-stranded 19-nucleotide RNA. None of the proteins bound detectably to the single-stranded RNA. However, wild type U1A and the deletion mutants were able to bind the reverseU1hpII RNA at these high concentrations, with an apparent affinity in the low micromolar range (~4 μM for wild type U1A). For ΔLys50 and ΔLys50Met51, this background binding was at least 10-fold weaker than the binding to U1hpII, supporting the specificity of the interactions of these proteins with U1hpII. However, ΔMet51 bound reverseU1hpII with an affinity of ~6 μM, which approaches its affinity for wild type U1hpII (~2 μM, Table II). This suggests that the ability of ΔMet51 to bind to U1hpII is largely due to sequence-independent binding. This is in agreement with the observation that the binding of ΔMet51 to U1hpII appears to be primarily based on association and that the formed complex is highly unstable. As we have suggested previously, association of U1A with U1hpII appears to be mediated by electrostatic interactions between positively charged residues on the protein and negative charges on the RNA (9). Because such negative charges would be derived from the phosphate backbone, they are expected to be sequence-independent. Yet association could well be structure- and/or charge-dependent, requiring presentation of the RNA in a defined context for optimal association. This is supported by the observation that no binding of ΔMet51 was observed to the single-stranded RNA target.
FIG. 2 (legend, partial). ... F, I, L, O, R, and U). To allow optimal comparison of binding behaviors, the same RNA surfaces carrying the three different synthetic 5′-biotinylated RNAs, each coated on a different flow cell, were used for all data sets shown (A-U). Increasing protein concentrations (in triplicate and in randomized order) were injected over the RNA surfaces at the concentrations shown in the leftmost column and were identical for all RNAs except in the bottom row, where the concentrations used in panel S (indicated) were lower than those used in panels T and U (indicated in T). The black lines represent triplicate protein injections at the indicated concentrations. The red lines represent the global fit of each data set to a single site interaction, including a term for mass transport. The response units scale is set to 120 RU for all experiments to allow comparison of the sensorgrams. The insert in panel L shows a magnification of the data in that panel. Kinetic parameters for the experiments are given in Tables I and II.
TABLES I and II (footnotes). a The standard error of the mean, based on three independent experiments (each consisting of randomized injections repeated at least twice), is given. b The calculated standard error of the KD value (KD = kd/ka) is given. c The affinity of ΔMet51 for U1hpII falls into the non-sequence-specific binding range, based on our observation that U1A binds to reverseU1hpII RNA with an affinity of 4 ± 3 μM (based on three independent experiments) and ΔMet51 binds to reverseU1hpII at 6.0 ± 0.3 μM (again based on three independent experiments). d NA, not applicable. A comparison is difficult to make, because this protein/RNA combination binds with an affinity close to non-sequence-specific binding, as determined by binding to reverseU1hpII.
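Footnote b defines KD = kd/ka and reports a calculated standard error for it. The text does not say how that error was computed; one common choice, assumed here purely for illustration, is first-order propagation of independent errors in ka and kd. The function name and the input values below are hypothetical and are not taken from the tables.

```python
import math

def kd_with_error(ka, ka_se, kd_rate, kd_rate_se):
    """Return (K_D, standard error of K_D) for K_D = kd / ka.

    Assumes the errors in ka and kd are independent and small enough for
    first-order propagation: relative variances add for a quotient.
    """
    k_eq = kd_rate / ka
    rel_se = math.sqrt((ka_se / ka) ** 2 + (kd_rate_se / kd_rate) ** 2)
    return k_eq, k_eq * rel_se

# Hypothetical rate constants with 10% standard errors each.
k_eq, k_eq_se = kd_with_error(ka=1.0e7, ka_se=1.0e6, kd_rate=1.0e-3, kd_rate_se=1.0e-4)
```

With two 10% relative errors the propagated relative error on KD is about 14%, which shows why a KD derived from two fitted rates is typically less certain than either rate alone.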
The loss in complex stability for the deletion mutants was much greater than that observed with the alanine replacement mutants. Taking into account the lack of any direct contacts between Lys50 or Met51 and the RNA, and the limited effect of the alanine replacement mutations on RNA binding, this suggests that it is not the loss of the Lys50 or Met51 side or main chains (and accompanying electrostatic and steric contributions) per se that dramatically destabilizes the complex but effects of the removal of these residues on the conformation of the β2-β3 loop region. Removal of 1 or 2 residues could lead to rearrangements in the flanking amino acids and the RNA or protein contacts that these residues make. Three of the loop residues directly contact the RNA (5): the main-chain carbonyl of Arg47 makes a hydrogen bond to G11 in the closing base pair of the RNA stem, the main-chain carbonyl group of Leu49 makes a water-mediated hydrogen bond to G4, and Arg52 makes hydrogen bonds to A1 and the closing C-G pair. In addition, certain loop residues are also involved in intramolecular contacts within the protein (5, 33): the main-chain carbonyl of Lys50 contacts Gln54, and Arg52 makes a similar contact to Gln54 and contacts the main-chain carbonyl groups of Arg47 and Ser48. Thus, these loop residues appear to form a network of interactions that directly or indirectly affect Gln54. Gln54 is a key residue that appears to be involved in positioning Tyr13, which stacks on C5 and is one of two critically important aromatic residues located in the RNP consensus sequences (the other is Phe56) (4-6, 8, 10-12, 14, 33, 34). Deletion of Lys50 or Met51 could result in a shift in this intricate network of contacts, indirectly interfering with Tyr13/C5 stacking. This is supported by NMR studies that show rearrangements of the β2-β3 loop in Tyr13→Phe and Gln54→Glu mutants (33).
Loss of the Tyr13 interaction leads to a dramatic loss in binding affinity, reminiscent of the consequences of a Phe56→Ala mutation in U1A (9, 12, 33). Thus, it would appear that the most likely mechanism for the strong destabilization of the complex seen with the Lys50 and/or Met51 deletion mutants lies in the local rearrangement of the β2-β3 loop region, and the effects thereof, both on RNA contacts by other β2-β3 loop residues and on a network of interactions positioning Tyr13. Although their effects are strong, these rearrangements are likely relatively small in the context of the whole RRM domain, because the high association rate of the deletion mutants suggests that the global conformation of the domain and its charge distribution are largely intact.
Simultaneous Deletion of Lys50 and Met51 Partially Compensates for RNA Loop Shortening-Because the β2-β3 loop deletion mutants were still able to bind to U1hpII (albeit with greatly increased dissociation rates) we wished to explore whether the deletions in the β2-β3 loop would affect the ability to interact with the RNAs containing shorter loops. Therefore, we examined the interaction between the β2-β3 loop deletion mutants and U1hpIIΔC9 or U1hpIIΔUC RNAs. ΔMet51, whose binding to U1hpII appears to be sequence-independent (see above), bound even more weakly (KD ~20 μM) to U1hpIIΔUC than to reverseU1hpII. This effect is due to a loss in association rate, suggesting that the absence of two phosphates from the RNA loop can weaken even sequence-independent background binding. ΔLys50, which could still bind specifically to U1hpII, exhibited a similar loss in association rate when bound to U1hpIIΔUC. When combined with its increased dissociation from U1hpIIΔUC, the binding observed (KD ~10 μM) dropped to background binding.
Only the double deletion mutant, ΔLys50Met51, bound U1hpIIΔUC with an affinity above background. Interestingly, ΔLys50Met51 appears to be affected less by deletion of the two spacer nucleotides than U1A. U1A shows a loss of ~3000-fold in binding affinity when confronted with an RNA loop shortened by two nucleotides (Table I, penultimate column), whereas ΔLys50Met51 becomes only 20-fold weaker (Table II, penultimate column). The partial complementarity of RNA and protein loop deletions can also be visualized by comparing the affinity of ΔLys50Met51 for a given target RNA to the affinity of wild type U1A to that same RNA. Although ΔLys50Met51 exhibits a loss in affinity of ~900-fold in binding to U1hpII, this loss is only ~5-fold on the double deletion RNA (Table II, final column). Again, deleting Lys50 and Met51 is less harmful in the context of an RNA with a smaller loop, suggesting changes in these areas of protein and RNA are somewhat complementary. However, as with the double alanine mutant, the gain in the ability of the double deletion protein to bind to RNA with shortened loops is offset by a much larger loss in general RNA-binding activity (see below), so that the combination of RNA and protein deletion mutants does not result in an absolute gain in affinity.
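The two comparisons above are two routes around the same thermodynamic cycle (wild type or ΔLys50Met51 protein, full-length or shortened RNA loop), so they should give the same coupling between the protein and RNA deletions. The sketch below checks this with the approximate fold changes quoted in the text; the function name is hypothetical, and because the inputs are rounded "~" values only rough agreement can be expected.

```python
def coupling_factor(fold_loss_reference, fold_loss_perturbed):
    """Ratio of two fold losses in affinity along opposite edges of a cycle.

    A value above 1 means the second perturbation costs less in the
    background of the first, i.e. the two changes partly compensate.
    """
    return fold_loss_reference / fold_loss_perturbed

# Effect of RNA loop shortening: wild type U1A (~3000-fold) vs ΔLys50Met51 (~20-fold).
via_rna = coupling_factor(3000.0, 20.0)
# Effect of the protein deletion: on U1hpII (~900-fold) vs on U1hpIIΔUC (~5-fold).
via_protein = coupling_factor(900.0, 5.0)
```

Both routes give a coupling on the order of 150 to 180, so the two ways of presenting the data are mutually consistent within the rounding of the quoted values.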
A Complex Role for the β2-β3 Loop Region-The observed beneficial interaction between RNAs with shorter loops and proteins with reduced β2-β3 loops supports a steric role for the β2-β3 loop region. We envision this role as assisting in the splaying out of the RNA loop residues, as well as in achieving a snug "fit" of this part of the protein into the RNA, stabilizing the complex. However, the dramatic reduction in binding affinity exhibited by the Lys50 and Met51 deletion mutants suggests the β2-β3 loop plays one or more additional roles in complex formation. This region of U1A was originally thought to be disorganized in the free protein, because it was visible in only one of the two subunits of the crystal (4). It has since become clear that the β2-β3 loop contains a single turn of a helix. This turn is clearly seen in NMR studies of the free protein, although some mobility is suggested in this region (35). The same turn can be seen in the complex (5), suggesting it is preformed to some extent and important for complexation. A somewhat rigid structure in the β2-β3 loop would be helpful as the protein interacts with the RNA loop (which is thought to be mobile (24, 35)), because structuring a disorganized β2-β3 loop as the complex forms is likely to be energetically costly (33). The single turn in the β2-β3 loop is composed of residues Leu49-Met51, so that its disruption by the Lys50 and Met51 deletion mutants seems inevitable. Accordingly, we suggest that the β2-β3 loop also has a structural role and that one possible reason for the loss of affinity of ΔLys50 or ΔMet51 is the energy required to accommodate the altered loop into the complex.
Another possible reason for the loss in affinity is the likely repositioning of residues flanking Lys50 or Met51, which would affect their direct and indirect interactions with the RNA and amino acids elsewhere in the protein. Of particular concern would be the effect on the network of interactions involving residues in the highly conserved RNP consensus sequences, such as Tyr13, Phe56, and Gln54. NMR studies of U1A have shown that mutations in these residues lead to rearrangements of β2-β3 loop residues (in particular Leu49 and Lys50) (10, 33). Thus, the β2-β3 loop region can also be seen as an integral part of the RNA-binding platform, connected with and affecting the positioning of critical residues on the β-sheet surface.
In light of these possible roles of the ␤ 2 -␤ 3 loop, the observation that deletion of Lys 50 and Met 51 simultaneously is less disruptive than the individual deletions could be explained in several ways. First, although removal of a single amino acid might disorganize the ␤ 2 -␤ 3 loop and increase its mobility, removal of 2 residues might limit this mobility. Thus, less energy would be required to organize the ␤ 2 -␤ 3 loop during complex formation. Alternatively, removal of a single nucleotide might distort the position of ␤ 2 -␤ 3 loop amino acids flanking the deletion more than removal of two nucleotides, thereby more strongly affecting the (in)direct interactions supported by this region. Lastly, it is possible that removal of 2 amino acids results in major changes in the position of flanking amino acids but that these changes are in fact more favorable to RNA binding than those caused by removal of a single amino acid. In an attempt to distinguish between these possibilities, we used 50 cycles of molecular dynamics and energy minimization to predict the possible structural changes that might result from the single and double amino acid deletions. These calculations focused on the possible conformational changes in the shortened ␤ 2 -␤ 3 loop (from Leu 44 to Gln 54 , wild type protein numbering). Two results of interest emerged. There was a steady decrease in the energy of the double deletion protein⅐RNA complex over the 50 cycles, implying that a conformational rearrangement of the ␤ 2 -␤ 3 loop might be occurring, whereas the minimized energies of the wild type complex and the two single deletion complexes did not decrease significantly. The nature of the conformational change in the double deletion protein is illustrated in Fig. 3. Arg 47 and Leu 49 in the ⌬Lys 50 Met 51 /U1hpII complex may compensate for the loss of Lys 50 and Met 51 , respectively. 
In the shortened loop, Leu49 is able to move into the space created by the deletion of Met51, thus playing the role of the hydrophobic space-filling amino acid in the center of the protein loop/RNA loop interaction. Arg47 is able to rotate toward the center of the RNA loop and at least partially compensate for the loss of Lys50. These motions are apparently only possible in the double deletion protein·RNA complex, because neither was observed in the simulations of the ΔLys50·U1hpII and ΔMet51·U1hpII complexes. These observations suggest that repositioning of flanking residues is at least in part responsible for the higher RNA-binding affinity of the double deletion U1A mutant compared with the single deletion mutants. This repositioning of amino acids in response to RNA binding may be illustrative of the conformational flexibility that can occur in RNA-protein interactions. A more complete understanding of such phenomena could aid in the rational design of engineered proteins that will bind specific RNA molecules with high affinity.
Comparison of the Role of β2-β3 Loops in Different RRM Proteins-The prominent role of the β2-β3 loop in the U1A-U1hpII interaction may be directly related to the fact that U1hpII carries a 10-nucleotide loop, because this sizeable loop offers a docking site for the protein. A smaller RNA loop or an unstructured RNA may not provide as much access to the bases or as good a structural "handle" for the protein. In the autoregulatory interaction of U1A with the Polyadenylation Inhibition Element (PIE) in the U1A mRNA (a complex structure containing two conserved U1A binding sites similar to U1hpII), β2-β3 loop residues protrude into the RNA loops (16). The RNA loops are comparable in size to the U1hpII loop but differ because a double-stranded stem linking the two binding sites protrudes from the spacer region. In this complex, β2-β3 loop residues make a large number of contacts with the RNA, locking the protein into the hole defined by the RNA loop (16). The U2B″ protein (which is closely related to U1A) forms a ternary complex with the U2A′ protein and the U2 hairpin IV RNA (a stem with an 11-nucleotide loop). In this complex, the β2-β3 loop region is again inserted into the RNA loop and makes extensive contacts with the spacer nucleotides, which form a rigid step ladder (36). Thus, the role of the β2-β3 loop is similar in the interactions of U1A and U2B″ with their cognate RNA hairpins, all of which carry loops of comparable sizes. A very different case is presented by nucleolin, which carries four RRM domains, the first two of which are required to bind to the Nucleolin Regulatory Element (NRE), an RNA hairpin with a six-nucleotide loop (37). The smaller RNA loop accommodates only two amino acids, and these are derived from different parts of the protein: Phe56, which lies at the border of a very short β2-β3 loop in the first RRM, and Lys94, which lies in the hinge region between RRMs 1 and 2. 
Phe56 and Lys94 protrude into the loop from different sides and clamp the RNA between the two RRMs (14). Possibly, a loop that is too small limits the extent to which the β2-β3 loop region can participate in binding and may thereby necessitate the involvement of additional RRM domains. Obviously, RNAs lacking a loop or other secondary structure altogether would further limit the role of the β2-β3 loop. It is therefore of interest that the specific binding of unstructured (single-stranded) RNAs apparently requires two, three, or even four RRMs, as is evident from studies of poly(A) binding protein (38), Sex-lethal (39), hnRNP A1 (40), U2AF (41), HuD (23,42), and nucleolin (43). In those cases where the structures of complexes between multiple RRMs and unstructured RNA have been solved, the β2-β3 loops (which are generally longer in these RRMs than in the U1A N-terminal RRM) appear to contribute to RNA binding by extending the RNA binding surface or "reaching out" to the RNA, which is strung across the β-sheet surfaces (14,38,39,42). These observations suggest that the nature of the bound RNA target is a key element defining the possible roles for the β2-β3 loop in complex formation. It will be of interest to determine the effects of β2-β3 loop mutations on proteins that bind to non-hairpin target RNAs, such as single-stranded RNA.