Structuring of the 3′ Splice Site by U2AF65*

Recognition of the 3′ splice site in mammalian introns is accomplished by association of the splicing factor U2AF with the precursor mRNA (pre-mRNA) in a multiprotein splicing commitment complex. It is well established that this interaction involves binding of the large U2AF65 subunit to sequences upstream of the 3′ splice site, but the orientation of the four domains of this protein with respect to the RNA, and hence their role in structuring the commitment complex, remain unclear and are the basis of contradictory models. We have examined the interaction of U2AF65 with an RNA representing the 3′ splice site using a series of U2AF deletion mutants modified at the N terminus with the directed hydroxyl radical probe iron-EDTA. These studies, combined with an analysis of extant high resolution x-ray structures of protein·RNA complexes, suggest a model whereby U2AF65 bends the pre-mRNA to juxtapose reactive functionalities of the pre-mRNA substrate and organize these structures for subsequent spliceosome assembly.

Removal of non-coding intron sequences from pre-mRNAs in eukaryotes involves two sequential transesterifications. In the first step, the branch point adenosine within the intron carries out a nucleophilic displacement at the 5′ splice site, producing the 5′ exon and lariat intermediate. In the second step, the liberated 5′ exon attacks the 3′ splice site to yield ligated exon and lariat intron products. Both reactions are catalyzed by the spliceosome, a large (~60 S) ribonucleoprotein assembly. The spliceosome consists of the U1, U2, and U4/U5/U6 snRNPs (small nuclear ribonucleoprotein particles), each containing a unique snRNA and associated proteins, as well as non-snRNP splicing factors (1-3). Assembly of the spliceosome on the pre-mRNA proceeds through the formation of several complexes and is directed by conserved sequences at the 5′ and 3′ splice sites as well as other sequences in the pre-mRNA; recognition of the 3′ splice site is closely coupled to recognition of the proximal branch region and polypyrimidine tract within the intron. Regulation of this assembly process results in differential splice site usage, and the resulting patterns of alternative splicing are a major source of proteome diversity in higher eukaryotes (4,5).
Commitment of a pre-mRNA to the splicing pathway involves the ATP-independent formation of the early or commitment complex on the pre-mRNA substrate (1-3). In mammals, this complex includes U1 snRNP, tightly associated with the 5′ splice site, as well as non-snRNP protein factors. These proteins include the heterodimer U2AF, containing large (U2AF65) and small (U2AF35) subunits, which binds to the polypyrimidine tract and 3′ splice site, the branch-binding protein SF1, and members of the SR protein family (6-11). Following the formation of the commitment complex, U2 snRNP is recruited to the pre-mRNA in an ATP-dependent process. This association involves the formation of a duplex between U2 snRNA and the pre-mRNA branch region, which bulges out the branch adenosine, specifying it as the nucleophile for the first transesterification (12).
U2AF is at the center of a network of cross-intron protein-protein bridging interactions in the splicing commitment complex, which has been extensively characterized in the mammalian system. For example, both U2AF35 and the U1 snRNP-associated protein U1 70K have been shown to interact with the SR protein SC35 by Far Western analysis (13). Similarly, it has been proposed that bridging interactions between U1 snRNP, the yeast branch-binding protein, and MUD2, the yeast homolog of U2AF65, are conserved in the mammalian system in a structure involving U1 snRNP, SF1, U2AF65, and possibly FBP11, the metazoan counterpart of yeast Prp40 (14,15). U2AF coordinates the complex process of 3′ splice site recognition by virtue of indirect interactions with the branch region through the branch-binding protein SF1 and direct recognition of the conserved AG dinucleotide of the 3′ splice site (6,7,9,11).
U2AF (U2 snRNP auxiliary factor) was first identified as an activity required for recruitment of U2 snRNP to the branch region, and extracts depleted of U2AF can be reconstituted for splicing by the addition of recombinant U2AF65 alone (10). U2AF65 contains three C-terminal RNA recognition motif (RRM) domains as well as an N-terminal region, commonly referred to as the RS domain, rich in basic residues and containing seven RS dipeptide repeats. Association of U2AF65 with the polypyrimidine tract is most likely mediated by two of the three RRM domains. Although it was initially reported that all three RRMs were required for tight RNA binding, recent experiments indicate that this may not be the case. RRM1 and -2 but not RRM3 can be cross-linked to polypyrimidine RNA, and the recent structure of the U2AF65·SF1 interface suggests that RRM3 may not be an RNA-binding module (10,16,17). Recognition of the polypyrimidine tract by U2AF65 is central to splice site selection and spliceosome formation for the major class of introns in higher eukaryotes. Both the length and the pyrimidine content of this sequence are critical determinants of the relative strength of competing 3′ splice sites (18,19). Specific negative regulators of 3′ splice site selection function as antagonists of the U2AF65-pyrimidine tract interaction and include the protein Sex-lethal (Sxl), which regulates female-specific alternative splicing of the tra pre-mRNA in Drosophila, as well as the pyrimidine tract-binding protein PTB, which has been shown to modulate alternative splicing of c-src (20,21).
The role of the RS region in U2AF65 function has proven difficult to define. In U2AF-depleted extracts, reconstitution of splicing activity by the addition of U2AF65 required the presence of the RS domain (10). Fusion of this domain to a heterologous polypyrimidine tract-binding module also reconstituted splicing in depleted extracts (21). The presence of an RS domain in U2AF is essential but redundant: studies in Drosophila have demonstrated that either the RS domain of U2AF65 or a similar domain in U2AF35, but not both, was required for viability (22). Green and co-workers (21,23) initially suggested that the RS domain of U2AF65 was positioned over the branch region and functioned in the recruitment of U2 snRNA to the pre-mRNA, and it was subsequently shown that U2AF65 possessed a non-specific RNA annealing activity. In support of a model in which the RS region of U2AF65 was positioned near the branch region, a cross-link could be formed between the RS domain and nucleotides directly proximal to the branch (24). Subsequent to these studies, it has been determined that the branch-binding protein SF1 interacts with the C terminus of U2AF65, and it has been demonstrated that U2AF35, which interacts with N-terminal amino acids 85-112 of U2AF65, directly recognizes the 3′ splice site (7,9,11,25). Recent models of the commitment complex therefore propose an arrangement in which the C terminus of U2AF65 is positioned proximal to the branch, whereas the N terminus is situated in the vicinity of the 3′ splice site (17,26,27). Neither of these models accommodates all of the biochemical data, and thus, our current understanding of commitment complex structure at the 3′ splice site is confused. Because of the central role of U2AF in organizing the assembly of factors at the 3′ splice site and in promoting spliceosome assembly, we have undertaken a series of studies to determine the disposition of U2AF65 on the pre-mRNA.
We have investigated the positioning of the RS domain of U2AF65 with respect to bound RNA using a series of U2AF65 deletion mutants that have been site-specifically modified at the N terminus with the chemical nuclease iron-EDTA (Fe-EDTA). Our studies confirm that the RS domain is associated with the pre-mRNA branch region but also that the N terminus of U2AF65 is close to the 3′ splice site. This establishes the orientation of U2AF65 with respect to the RNA and indicates a pronounced bending of the RNA upon association of U2AF with the pre-mRNA. Structural data from x-ray studies of RRM·polypyrimidine complexes are consistent with both the proposed orientation and the RNA bending that juxtaposes the branch region and 3′ splice site.

EXPERIMENTAL PROCEDURES
RNA Synthesis-Wild type RNA (5′-GGGCUCGUCUCGAGGGUGCUGACUGGCUUCUUCUCUCUUUUUCCCUCAGGCCUACUCUUCU-3′) was synthesized by T7 transcription from synthetic double-stranded DNA templates (40 mM Tris, pH 8, 3 mM NTPs, 25 mM MgCl2, 0.4 µM annealed DNA template, and 8 µg of T7 RNA polymerase) containing a T7 promoter sequence attached to the 3′ splice site region of PIP85.B pre-mRNA. The scrambled branch RNA substituted GUCGUAC for the optimal UGCUGAC. Transcriptions were performed at 37°C for 4 h, and the transcripts were then purified by denaturing PAGE (8%, 19:1 acrylamide/bisacrylamide), visualized by UV, excised, and extracted from the gel. Purified transcripts were dephosphorylated with calf intestinal alkaline phosphatase (Roche Applied Science) and labeled with 32P using T4 polynucleotide kinase (Roche Applied Science). Labeled RNAs were gel-purified by denaturing PAGE (8%, 19:1 acrylamide/bisacrylamide), visualized by autoradiography, excised, and extracted from the gel. Gel-purified RNAs were resuspended in double distilled water and stored at −20°C.
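The layout of the model RNA can be checked directly from the sequence given above. The following Python sketch is illustrative only: it assumes the hyphen inside the printed sequence is a line-break artifact, takes the single adenosine in UGCUGAC as the branch adenosine, and treats the first AG downstream of the branch motif as the 3′ splice site. The function names `annotate` and `scramble_branch` are hypothetical and are not part of the published procedures.

```python
# Wild-type RNA as printed in the text; the internal hyphen is assumed to be
# a line-break artifact and has been removed.
WILD_TYPE = "GGGCUCGUCUCGAGGGUGCUGACUGGCUUCUUCUCUCUUUUUCCCUCAGGCCUACUCUUCU"
BRANCH = "UGCUGAC"            # optimal branch sequence (from the text)
SCRAMBLED_BRANCH = "GUCGUAC"  # scrambled branch sequence (from the text)
PYRIMIDINES = set("CU")


def annotate(rna, branch=BRANCH):
    """Locate the branch motif, its adenosine, and the first downstream AG.

    All positions are 0-based indices into `rna`.
    """
    start = rna.find(branch)
    if start < 0:
        raise ValueError("branch motif not found")
    # Assumption: the only A in the 7-nt motif is the branch adenosine.
    branch_a = start + branch.index("A")
    # Assumption: the first AG after the motif marks the 3' splice site.
    splice_ag = rna.find("AG", start + len(branch))
    if splice_ag < 0:
        raise ValueError("no AG dinucleotide downstream of the branch motif")
    interval = rna[start + len(branch):splice_ag]
    return {
        "branch_start": start,
        "branch_a": branch_a,
        "splice_ag": splice_ag,
        "interval_len": len(interval),
        "interval_pyrimidines": sum(base in PYRIMIDINES for base in interval),
    }


def scramble_branch(rna):
    """Return the RNA with the optimal branch motif replaced by the scrambled one."""
    if rna.count(BRANCH) != 1:
        raise ValueError("expected exactly one branch motif")
    return rna.replace(BRANCH, SCRAMBLED_BRANCH)


if __name__ == "__main__":
    print(annotate(WILD_TYPE))
    print(annotate(scramble_branch(WILD_TYPE), branch=SCRAMBLED_BRANCH))
```

Because the scrambled motif has the same length as the optimal one, the positions reported for the two RNAs can be compared directly.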
Expression and Derivatization of U2AF65-based Probes-A panel of cDNAs representing full-length and deletion mutants of U2AF65, each with an N-terminal His6 tag and a Factor Xa site followed by an introduced Cys, was prepared by PCR and cloned into the pET expres-
Binding and Affinity Cleavage-Binding reactions containing 5′-32P-labeled RNA (50-100 × 10^3 cpm) and 0-100 pmol of derivatized U2AF65 were performed in 10 mM Hepes, pH 7.9, 60 mM KCl, 2 mM MgCl2, 1 µg of tRNA, 0.25 mM DTT, 0.1 mM EDTA, and 10% (v/v) glycerol. Reactions were incubated at room temperature for 60 min and then immediately analyzed by native PAGE (6%, 89:1 acrylamide/bisacrylamide) run with 50 mM Tris-glycine running buffer at 110 V for 3 h. RNase protection/footprinting assays were performed in binding reactions prepared as described above with 1.0 µM U2AF, followed by the addition of 0.5 ng of RNase A (Roche Applied Science) or 0.3 units of RNase T1 (Roche Applied Science) and digestion at room temperature for 3 and 1 min, respectively. Reactions were quenched with phenol/chloroform, extracted, and ethanol-precipitated. Affinity cleavage reactions contained 0.6 µM U2AF-EDTA (dialyzed into buffer D lacking EDTA and glycerol) in binding reactions as described above. Binding was performed at room temperature for 60 min, followed by the addition of 0.5 µM FeSO4 and a further 10-min incubation on ice to allow chelation. Affinity cleavage was initiated by the addition of 0.05% (v/v) H2O2 and 5 mM ascorbic acid, and the reactions were allowed to proceed on ice for 10 min. Reactions were quenched with 30 mM thiourea containing 1% (v/v) glycerol and ethanol-precipitated. Following ethanol precipitation, reactions were subjected to denaturing PAGE (10%, 19:1 acrylamide/bisacrylamide). Dried gels were exposed to an Amersham Biosciences phosphor screen and scanned using an Amersham Biosciences Storm 840 PhosphorImager. For cleavage experiments in the context of multiprotein complexes, His6-tagged U2AF35 and C4 were overexpressed in E. coli and purified by Ni-NTA chromatography.

Functionalization of U2AF65 with the Chemical Nuclease Fe-EDTA-Hydroxyl radicals generated from probes covalently attached to RNA or protein have been useful for examining both intra- and intermolecular contacts to RNA in both structured RNAs and RNA·protein complexes (28,29). Diffusible radicals produced from a tethered Fe-EDTA moiety are excellent probes of local RNA structure since they are only capable of cleaving the phosphodiester backbone within ~10-20 Å from their site of generation. We decided to map the interaction of the N terminus of U2AF65 with RNA by using a series of deletion mutants chemically modified at their N terminus with Fe-EDTA as local probes of protein-RNA interaction. Preparation of this panel of probes was carried out by chemoselective reaction of an EDTA thioester (EDTA-3MPA) with the appropriate U2AF65-derived precursor modified to present an N-terminal cysteine residue following site-specific proteolysis (Fig. 1A). Thioesters react specifically with N-terminal Cys since an initial thio-transesterification is followed by rapid rearrangement to yield a stable amide linkage (Fig. 1B). Transient modification of internal Cys residues is reversed using an excess of DTT, and the pH at which the modification is performed minimizes reaction with Lys side chains. This strategy is based on work by Verdine and co-workers (31), who extended Kent's peptide ligation chemistry (30) to the N-terminal functionalization of proteins with Fe-EDTA. In an elegant study, these workers were able to demonstrate NFAT (nuclear factor of activated T cells)-mediated modulation of DNA binding by AP1 using patterns of DNA cleavage as a reporter of binding orientation (31). In the present work, a panel of four modified U2AF65 probes, representing successive N-terminal deletions, was prepared, each containing an N-terminal His6 tag and a Factor Xa cleavage site followed by an introduced Cys residue (Fig. 1, A and B).
Cleavage of partially purified protein yielded the desired precursor with an N-terminal Cys, which was reacted with EDTA-3MPA to yield the functionalized probe, which could then be activated by the addition of Fe2+, ascorbic acid, and hydrogen peroxide (31). To confirm that modification of the N-terminal Cys was specific, we performed modification/chase experiments with both EDTA-3MPA and the thioester biotin-3MPA since protein modification with biotin can be monitored by using an avidin-horseradish peroxidase conjugate (Fig. 1C). These experiments showed that modification of U2AF65 was specific to the N-terminal Cys (in the presence of six internal Cys residues) and was essentially quantitative (Fig. 1C, lanes 4-6).
The 3′ splice site RNA is a good model system for examination of 3′ splice site recognition by U2AF65 and subsequent pre-spliceosome assembly; in HeLa nuclear extracts, a complex containing U2 snRNP stably associated with the branch region forms on this RNA in a U2AF65-dependent fashion (12).
We examined the interaction of derivatized U2AF65 with RNA by native gel shift and RNase protection (Fig. 2). Functionalized full-length U2AF65 and the three deletion mutants bound the RNA similarly (Fig. 2A, KD ~ 0.3-1 µM, data not shown), with an affinity roughly equal to that reported for U2AF65 alone (22). Protection experiments with RNase A and RNase T1 showed an extended footprint on the RNA for full-length U2AF65 from the 3′ splice site through the polypyrimidine tract and including the branch point sequence (Fig. 2B, lanes 3, 4, 9, and 10). Deletion of the RS region resulted in a loss of protection in the vicinity of the branch region, but in all deletion constructs, the polypyrimidine tract was protected, consistent with the presence of the C-terminal RRMs (Fig. 2B, lanes 3-6 and 9-12). Thus, the RS region either directly covers the branch region or indirectly modifies the RNA structure to protect it.
Mapping U2AF65-RNA Interaction Using Tethered Fe-EDTA-We investigated the structure of the 3′ splice site RNA bound to the U2AF65 probes using 5′ end-labeled RNA, initiating Fe-EDTA-mediated cleavage under conditions where all of the RNA was bound and analyzing the results by denaturing PAGE (Fig. 3). Using full-length U2AF65, we observed strong cleavage of the RNA in the vicinity of the branch region (Fig. 3A, lane 4). A similar pattern of cleavage was observed using the U2AF65 construct derivatized directly adjacent to the RS domain (Δ1-14, Fig. 3A, lane 5). These observations are consistent with the RS domain of U2AF65 directly contacting the branch region. Interestingly, using the probe derivatized adjacent to the RS domain (Δ1-14), some cleavage at the 3′ splice site was also observed, indicating that in the RNA·U2AF65 complex, the branch region and 3′ splice site are close to one another (Fig. 3A, lane 5). When probes lacking the RS domain (Δ1-64, Δ1-94) were assayed, cleavage at the branch region was no longer seen; instead, strong patterns of cleavage proximal to the 3′ splice site, roughly 25 nucleotides downstream, were observed (Fig. 3A, lanes 6 and 7). These cleavages were centered at the splice site (Δ1-64) and immediately 5′ to the splice site (Δ1-94). These patterns of cleavage suggest an association of the N-terminal region of U2AF65 with the 3′ splice site, consistent with the observation that the 3′ splice site recognition factor U2AF35 interacts with amino acids 85-112 of U2AF65 (27). The patterns of cleavage observed in these experiments are consistent with previous disparate observations (7,9,11,24,25) and suggest that they may be reconciled by a model in which U2AF65 bends the RNA to juxtapose the branch and 3′ splice site. In addition, the shift in cleavage pattern observed between the two constructs (Δ1-64, Δ1-94) suggests a polarity of RNA recognition whereby the RNA is bound in a 3′ to 5′ sense N- to C-terminal on U2AF65 (Fig. 3A).
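The geometric reasoning behind the bending inference can be made explicit with a rough calculation. The sketch below is an illustration under stated assumptions, not an analysis performed in this work: the probe reach (10-20 Å) and the separation of roughly 25 nucleotides come from the text, whereas the per-nucleotide spacings of 3 and 6 Å are assumed values chosen to bracket a compact and an extended single strand. The function names are hypothetical. Two sites cleaved from one tethered probe must both lie within the probe's reach, so they can be at most twice that reach apart.

```python
# Values taken from the text.
PROBE_REACH_ANGSTROM = (10.0, 20.0)  # approximate cleavage radius of tethered Fe-EDTA
SEPARATION_NT = 25                   # branch region to 3' splice site, "roughly 25 nucleotides"

# Assumed per-nucleotide spacings (NOT from the text): a compact and an
# extended single strand, used only to bracket the estimate.
ASSUMED_RISE_ANGSTROM = {"compact": 3.0, "extended": 6.0}


def max_site_separation(reach):
    """Largest distance between two sites that one probe of given reach can both cleave."""
    return 2.0 * reach


def linear_span(n_nt, rise_per_nt):
    """End-to-end length of n_nt nucleotides laid out in a straight line."""
    return n_nt * rise_per_nt


def requires_bend(n_nt, rise_per_nt, reach):
    """True if a straight RNA would place the two sites too far apart for one probe."""
    return linear_span(n_nt, rise_per_nt) > max_site_separation(reach)


if __name__ == "__main__":
    reach = max(PROBE_REACH_ANGSTROM)  # most permissive case for a straight RNA
    for label, rise in ASSUMED_RISE_ANGSTROM.items():
        span = linear_span(SEPARATION_NT, rise)
        print(label, span, max_site_separation(reach),
              requires_bend(SEPARATION_NT, rise, reach))
```

Under these assumptions, even the compact spacing places the two sites farther apart than a single probe could span on a straight RNA, which is the sense in which cleavage at both sites by the Δ1-14 probe points to a bent RNA.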
We investigated the sequence dependence of RNA recognition by performing a series of cleavage experiments with the same panel of derivatized U2AF65 probes using an RNA containing a non-functional scrambled branch sequence (Fig. 3B). The probes bound this RNA with similar affinity (data not shown), and the cleavage patterns observed were similar to those seen using the consensus branch sequence (Fig. 3, A and B). Thus, structuring of the 3′ splice site by U2AF65 is independent of the branch sequence. However, nucleotides 5′ to the polypyrimidine tract do increase the affinity of U2AF65 for the RNA (data not shown), consistent with a non-specific interaction between the positively charged RS domain and the negatively charged RNA.
Structure of Higher Order Complexes at the 3′ Splice Site-U2AF65 coordinates recognition of the 3′ splice site by binding tightly to the pyrimidine tract and through interactions with the branch-binding protein SF1 and the 3′ splice site recognition factor U2AF35. We repeated our analysis of U2AF65·RNA interaction using three of the U2AF65-based probes (Δ1-94 does not include part of the U2AF65/35 interface) in the presence of combinations of U2AF35 and C4, a deletion mutant of SF1 that is functionally equivalent to the full-length protein (25). Using these factors, we were able to form specific ternary and quaternary complexes on RNA as assayed by native gel shift (Fig. 3C), mimicking the known assembly of factors at the 3′ splice site. The RNA cleavage patterns observed upon incubation of the U2AF65-based probes with C4 and U2AF35 alone or with both proteins together are very similar to those observed with U2AF65 alone (Fig. 3, A and C). We conclude that U2AF65 is the primary determinant of RNA structure at the 3′ splice site within the commitment complex.

DISCUSSION
The splicing factor U2AF is the central organizing force in recognition of the 3′ splice site in higher eukaryotes, and as such, plays a critical role in the regulation of gene expression. Our studies indicate that U2AF65 alone, bound to the 3′ splice site region, bends the polypyrimidine tract to bring the 3′ splice site and branch region close together. To refine a model of U2AF-RNA interaction, we considered known RRM-polypyrimidine-binding modes as determined by x-ray crystallography: specifically, the structures of the Sxl and HuD proteins complexed to RNA (Fig. 4A) (32,33). The Drosophila protein Sxl regulates female-specific alternative splicing of the tra pre-mRNA by binding to a specific polypyrimidine tract, thus preventing the binding of U2AF and utilization of the proximal 3′ splice site (21). Human Hu proteins protect short-lived RNAs from degradation by binding to adenosine-uridine-rich elements in their 3′-untranslated region.
The domain structure of Sxl includes tandem RRM domains that mediate binding to a 17-nucleotide uridine-rich sequence. The individual RRMs have a characteristic structure consisting of a four-stranded antiparallel β-sheet buttressed by two α-helices. The 2.6-Å x-ray structure of Sxl bound to RNA reveals a V-shaped cleft formed by the β-sheets that are the principal determinants of RNA recognition; this cleft accommodates, in part, the RNA, which is strongly kinked in the bound structure (Fig. 4A). X-ray structures of the tandem adenosine-uridine-rich element-binding RRMs of HuD bound to RNAs derived from the c-fos and TNF (tumor necrosis factor) adenosine-uridine-rich elements show a remarkable similarity to each other and to the structure of Sxl bound to RNA, although the bound RNAs differ in sequence at several positions. HuD also forms a cleft for the recognition of an extended RNA, which is sharply kinked (Fig. 4A). Given the similarity in consensus sequences recognized by U2AF, Sxl, and HuD, it seems likely that the general modes of RNA recognition are quite similar. Thus, it is likely that association of U2AF65 with the pre-mRNA 3′ splice site strongly bends the RNA, as observed using the Fe-EDTA modified probes (Fig. 4B).
U2AF65 recognizes a diverse array of polypyrimidine-rich sequences with a typical length of 20-40 nucleotides (18,19). In addition to direct recognition of the polypyrimidine tract, U2AF-RNA interaction is modulated by U2AF35 binding to the 3′ splice site and indirectly by the strength of the branch region-SF1 interaction. A 17-nucleotide uridine-rich consensus binding site for U2AF65 has been identified by SELEX (34). From both biochemical experiments and the high resolution structural data for Sxl-RNA interaction, it is expected that this sequence could be recognized by RRM1 and RRM2 of U2AF65. The recent NMR structure of the U2AF65·SF1 interface reveals that the canonical RNA-binding surface of RRM3 of U2AF65 is blocked by an α-helix (17). A similarly positioned helix is displaced in the association of the U1A RRM with its cognate RNA. However, the atypical sequence of U2AF65 RRM3, lacking conserved RNA-interacting aromatics, the overall negative surface potential of the RRM, and the fact that RRM3 is not cross-linked to polypyrimidine RNA suggest that the functional role of this domain is in protein-protein recognition (16,17). In contrast to the highly specific polypyrimidine tract recognition of Sxl, U2AF65 binds to tracts of varying length and sequence composition; long pyrimidine sequences may be recognized by a looping out of the RNA, whereas differences in pyrimidine tract sequence may be accommodated by different registers of binding (16).
FIG. 4. Structuring at the 3′ splice site by the polypyrimidine-binding factor U2AF65. A, high resolution structures of Sxl·RNA and HuD·RNA complexes (32,33). Both polypyrimidine tract bending and C- to N-terminal orientation of the protein with 5′ to 3′ orientation of the RNA match the observed interaction of U2AF65 with the 3′ splice site. B, orientation of U2AF65 at the 3′ splice site structures the polypyrimidine tract to juxtapose the branch point sequence and the 3′ splice site, positioning the RS domain of U2AF65 in the vicinity of the branch point sequence. Interactions of U2AF65 with SF1 at the branch point sequence and U2AF35 at the 3′ splice site in a multiprotein complex mimic the pre-spliceosome commitment complex.
Recently, using pre-mRNAs site-specifically functionalized with Fe-EDTA, we have shown that both the branch and the 3′ splice site are in close proximity to the 5′ splice site in the mammalian commitment complex. Consequently, we suggested that this arrangement of critical sequences templates further spliceosomal assembly and snRNA rearrangement (35). Structures of the first two RRMs of U2AF65, as well as the U2AF65·SF1 and U2AF65·U2AF35 interfaces and the SF1·branch region complex, have now been determined (17,27,36,37). However, the spatial and structural organization of these factors at the 3′ splice site is unclear. The studies described here suggest that the 3′ splice site complex will have a relatively compact structure in which the key conserved intron sequences are positioned close to one another by virtue of their interaction with the U2AF65 scaffold. A more complete understanding of commitment complex assembly and function will involve high resolution structural analysis of higher order complexes containing these and other factors.