Solution Conformation and Thermodynamic Characteristics of RNA Binding by the Splicing Factor U2AF65*

The U2 auxiliary factor large subunit (U2AF65) is an essential pre-mRNA splicing factor for the initial stages of spliceosome assembly. Tandem RNA recognition motifs (RRMs) of U2AF65 recognize polypyrimidine tract signals adjacent to 3′ splice sites. Despite the central importance of U2AF65 for splice site recognition, the relative arrangement of the U2AF65 RRMs and the energetic forces driving polypyrimidine tract recognition remain unknown. Here, the solution conformation of the U2AF65 RNA binding domain determined using small angle x-ray scattering reveals a bilobal shape without apparent interdomain contacts. The proximity of the N and C termini within the inter-RRM configuration is sufficient to explain the action of U2AF65 on spliceosome components located both 5′ and 3′ to its binding site. Isothermal titration calorimetry further demonstrates that an unusually large enthalpy-entropy compensation underlies U2AF65 recognition of an optimal polyuridine tract. Qualitative similarities were observed between the pairwise distance distribution functions of the U2AF65 RNA binding domain and those either previously observed for N-terminal RRMs of Py tract-binding protein that lack interdomain contacts or calculated from the high resolution coordinates of a U2AF65 deletion variant bound to RNA. To further test this model, the shapes and RNA interactions of the wild-type U2AF65 RNA binding domain were compared with those of U2AF65 variants containing either Py tract-binding protein linker sequences or a deletion within the inter-RRM linker. Results of these studies suggest inter-RRM conformational plasticity as a possible means for U2AF65 to universally identify diverse pre-mRNA splice sites.

Pre-mRNA splicing is an essential source of transcript diversity in multicellular eukaryotes (reviewed in Refs. 1 and 2), as reflected by the significant number of cancers and hereditary diseases associated with mutations in pre-mRNA splice site signals or splicing factors (3-5). The splicing machinery (spliceosome) is faced with the task of recognizing relatively short exons (~150 nucleotides on average in the human genome) located within vast stretches of intron RNA (>1500 nucleotides) (6). Although consensus sequences mark the 5′ and 3′ splice sites of the pre-mRNA, these sequences are relatively short and degenerate so that cryptic splice sites outnumber bona fide splice sites by an order of magnitude (7). Furthermore, to be distinguished and regulated in a specific manner, "weak" alternative splice sites may deviate substantially from the optimal consensus of "strong," constitutive splice sites (8). The consensus sequences of 3′ splice sites recognized by the major spliceosome are more extensive than those of 5′ splice sites (9) and include a branch point sequence closely followed by a polypyrimidine (Py) tract, which is primarily composed of uridines and cytidines. How the spliceosome accurately identifies and pairs the splice sites remains an outstanding question.
Assembly of the core spliceosome (U1, U2, U4, U5, and U6 small nuclear ribonucleoprotein particles) first requires splice site identification by the U2 auxiliary factor large subunit (U2AF65) (10). The U2AF65 subunit serves as an extensive molecular surface, with domains responsible for critical functions (Fig. 1A). (i) As addressed here, two RNA recognition motifs (RRMs) recognize the Py tract splice site signal (11,12); (ii) a region near the RRMs recruits the ATPase UAP56 to the assembling spliceosome (13); (iii) a C-terminal U2AF homology motif (UHM) domain organizes the SF1 (14) and SF3b155 (15) splicing factors at the branch point sequence; (iv) a tryptophan-containing UHM ligand motif (16) positions the U2AF35 small subunit at the 3′ splice site (17); and (v) an N-terminal arginine-serine-rich (RS) domain promotes branch point sequence/U2 small nuclear RNA annealing (18). To accomplish these tasks, the C-terminal UHM of U2AF65 and the N-terminal RS domain are required to interact with splicing factors and RNA sites, respectively, that are both located upstream (5′) of the Py tract binding site. Simultaneously, the UHM ligand motif near the U2AF65 N terminus interacts with U2AF35 bound to the 3′ splice site. Directed hydroxyl radical footprinting shows that the U2AF65 N terminus contacts pre-mRNA sequences located both preceding and following the Py tract (19). To account for these distant interaction sites, the C- and N-terminal domains of U2AF65 must be brought into proximity of one another in the context of the folded three-dimensional structure, possibly by a bent configuration of the central RRMs.
We previously investigated the source of the Py tract specificity of U2AF65 by determining the atomic resolution structure of a modified U2AF65 RNA binding domain in complex with an optimal Py tract composed of uridines (20) (Fig. 1B). Despite revealing specific RNA interactions, this high resolution structure was insufficient to test models of a bent inter-RRM arrangement of U2AF65 because co-crystallization required a 20-residue deletion within the inter-RRM linker (21). Structures for two other Py tract binding factors composed of tandem RRMs currently are available for comparison with U2AF65, including Py tract-binding protein (PTB) (22) and FUSE-interacting repressor (FIR) (23). In PTB, an extended linker allows the N-terminal RRM1 and RRM2 to tumble independently, whereas the C-terminal RRM3 and RRM4 assemble into an integrated unit (22,24,25). In contrast, the FIR structure demonstrates a compact arrangement of its tandem RRM1 and RRM2 domains (23), with a detailed inter-RRM configuration distinct from that of PTB RRM3/RRM4. Either the PTB-type or the FIR-type model would be compatible with a bent configuration of the U2AF65 RRM surfaces; however, the RRM1 and RRM2 registers would be flexible in the model represented by the N-terminal PTB RRMs, whereas the FIR-type model would require a relatively rigid inter-RRM arrangement. Therefore, additional experimental information is required to address the conformation of U2AF65.
Here, we present the overall molecular shape of the U2AF65 RNA binding domain composed of RRM1 and RRM2, using small angle x-ray scattering (SAXS). A bilobal molecular envelope without apparent interdomain contacts between rigid body models of the known RRM coordinates was observed. Structural information was complemented by thermodynamic characterization, which revealed that a significant enthalpy-entropy compensation drives association of the U2AF65 RNA binding domain with an optimal polyuridine RNA. To further explore a possible analogy between the bilobal shapes of U2AF65 and PTB RRM1-RRM2, a U2AF65 variant containing an inter-RRM linker region from PTB was investigated. Separately, a U2AF65 variant with a shortened inter-RRM linker was characterized for comparison with the crystal structure (PDB 2G4B) (20). Solution conformations, thermodynamic contributions to Py tract recognition, and in vitro pre-mRNA splicing activities of the U2AF65 variants were similar to those of the unmodified U2AF65 domain. Overall, these results support a model whereby the U2AF65 RRM1 and RRM2 domains act with relative independence, which has significant ramifications for current paradigms of splice site recognition.
RNA Preparation-RNA oligonucleotides composed of 20 uridines (U20) were synthesized by Dharmacon Research, Inc. (Lafayette, Colorado) and deprotected by the manufacturer's protocol for use in calorimetry experiments. RNA concentrations were estimated using calculated molar extinction coefficients as described (26).
Small Angle X-ray Scattering-SAXS data were collected at the SIBYLS Beamline 12.3.1 of the Advanced Light Source (Lawrence Berkeley National Laboratory) using a MARCCD x-ray detector system located 1.6 m from the sample chamber to collect data in the q-spacing 0.01-0.32 Å⁻¹, where q = 4π sin θ/λ (2θ is the scattering angle and λ = 1.03 Å is the wavelength). U2AF65 R12 variants were exchanged into 100 mM NaCl, 15 mM HEPES, pH 7.4 by size exclusion chromatography (Superdex-75, GE Healthcare), and the scattering of this buffer, collected before or after each protein sample, was subtracted to correct the scattering data. Monodispersity of the samples was further checked prior to SAXS data collection by dynamic light scattering. SAXS data were collected at concentrations of 2.5, 5.0, and 10.0 mg/ml for 6- and 60-s exposures followed by a 6-s exposure to check for radiation damage. Low and high resolution data were scaled and merged from the short and long exposures, respectively. The radii of gyration were analyzed using the Guinier approximation (Rg^G) (29) with low angle data (q < 1.3/Rg) to evaluate possible interparticle effects and showed little or no variation with concentration (ΔRg < 0.3 Å). Accordingly, scattering profiles at all three concentrations of each variant superimposed within the errors of the experiments after scaling and were merged using PRIMUS (30). The Rg values (Rg^P) and maximum dimensions (Dmax) were also computed from the entire scattering profiles of the merged files using the program GNOM (31), as given in Fig. 2B and Table 1.
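The Guinier analysis described above can be illustrated with a minimal Python sketch. It assumes the standard Guinier relation, ln I(q) = ln I(0) - (Rg²/3)q², and an iterative choice of the fitting range so that q·Rg stays below 1.3. The function names, the starting window of five points, and the synthetic noise-free data are illustrative assumptions; this is not the procedure implemented in PRIMUS or GNOM.

```python
import math


def linear_fit(x, y):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def guinier_rg(q, intensity, q_rg_limit=1.3, n_start=5, max_iter=50):
    """Fit ln I(q) = ln I(0) - (Rg^2 / 3) q^2 over the range q * Rg < q_rg_limit.

    q must be sorted in ascending order; intensities must be positive.
    Returns (Rg, I(0), number of points used in the final fit).
    """
    n_used = n_start
    for _ in range(max_iter):
        x = [qi ** 2 for qi in q[:n_used]]
        y = [math.log(ii) for ii in intensity[:n_used]]
        slope, intercept = linear_fit(x, y)
        if slope >= 0:
            raise ValueError("non-negative Guinier slope; Rg is undefined")
        rg = math.sqrt(-3.0 * slope)
        # Re-select the low-angle window using the current Rg estimate.
        n_new = max(n_start, sum(1 for qi in q if qi * rg < q_rg_limit))
        if n_new == n_used:
            break
        n_used = n_new
    return rg, math.exp(intercept), n_used


# Synthetic, noise-free profile (hypothetical values, not the measured data).
q_values = [0.01 + 0.002 * i for i in range(156)]
i_values = [100.0 * math.exp(-(qv * 24.5) ** 2 / 3.0) for qv in q_values]
rg_fit, i0_fit, n_points = guinier_rg(q_values, i_values)
```

With real data, comparing the fitted Rg across the three protein concentrations gives the kind of concentration-dependence check described in the text.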
Molecular Modeling-GASBOR (32) was used for ab initio modeling with the default settings, assuming a starting particle with no symmetry and of unknown shape. The program BUNCH (33) combined rigid body modeling of the known RRM1 and RRM2 structures from PDB 2G4B with ab initio modeling of the inter-RRM linker regions (corresponding to residues 230-257 of human U2AF65). For each modeling method, 10 independent models were aligned and averaged to determine common structural features using SUPCOMB (34) and DAMAVER (35), with the exception of the wtU2AF65 R12 BUNCH data, for which 15 independent models were analyzed. Each set of models agreed well with one another, as indicated by normalized spatial discrepancies (NSD < 1.0), as detailed in Table 1.
Fluorescence Anisotropy-The apparent equilibrium dissociation constants (KD) for association of the wild-type and variant U2AF65 R12 domains with U20 RNA were determined using fluorescence anisotropy. Anisotropy changes were measured after the addition of U2AF65 protein to a solution of 30 nM 5′-fluorescein-labeled U20 in 100 mM NaCl, 25 mM Hepes, pH 6.8. The average KD and standard deviation of more than two independent experiments are indicated on representative fitted curves in Fig. 3. Data were fit by non-linear regression assuming single site binding to obtain the apparent KD. In the fitted equation, x is the total protein concentration, [RNA] is the total RNA concentration, r is the observed anisotropy at the ith titration point, r_B is the anisotropy at zero protein concentration, and r_F is the anisotropy at saturating protein concentration (floated in fit).
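A single-site model consistent with the variable definitions above can be sketched in Python as follows. The sketch assumes the standard quadratic (ligand depletion) expression for the fraction of RNA bound; this functional form, the function names, and the example numbers are assumptions chosen for illustration and are not confirmed as the exact equation used for the fits in Fig. 3. Following the definitions in the text, r_b is the anisotropy at zero protein and r_f is the anisotropy at saturation.

```python
import math


def fraction_bound(x, rna_total, kd):
    """Fraction of RNA bound for 1:1 binding, given total protein x.

    Solves the quadratic mass-action expression; x, rna_total, and kd
    must share the same concentration units.
    """
    s = kd + rna_total + x
    return (s - math.sqrt(s * s - 4.0 * rna_total * x)) / (2.0 * rna_total)


def anisotropy(x, rna_total, kd, r_b, r_f):
    """Observed anisotropy as a population-weighted average of the two states.

    r_b: anisotropy at zero protein; r_f: anisotropy at saturating protein.
    """
    return r_b + (r_f - r_b) * fraction_bound(x, rna_total, kd)


def sum_squared_residuals(params, x_values, r_observed, rna_total):
    """Objective that a non-linear regression routine would minimize."""
    kd, r_b, r_f = params
    return sum(
        (r_obs - anisotropy(x, rna_total, kd, r_b, r_f)) ** 2
        for x, r_obs in zip(x_values, r_observed)
    )


# Hypothetical example in micromolar units: 0.03 uM labeled RNA, KD = 2.35 uM.
half_point = fraction_bound(2.35, 0.03, 2.35)
start_value = anisotropy(0.0, 0.03, 2.35, r_b=0.05, r_f=0.15)
```

Because the labeled RNA concentration (30 nM) is far below the micromolar KD values reported here, the fraction bound at x = KD is close to one half, and the quadratic form approaches a simple hyperbola.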
Isothermal Titration Calorimetry-The heats generated by the addition of U20 RNA to U2AF65 R123 variants were measured at 30°C using a VP-ITC calorimeter (MicroCal, LLC). Proteins were dialyzed extensively against buffer containing 25 mM Hepes, pH 7.4, 100 mM NaCl, and 0.2 mM tris(2-carboxyethyl)phosphine, and concentrated RNA stock solutions were diluted >40-fold into this dialysis buffer. Samples were filtered and degassed before loading the calorimeter. RNAs at 200-250 μM concentrations were titrated into 1.4 ml of 12-17 μM U2AF65 fragments over 28 injections of 10 μl each, with constant stirring at 307 rpm and 2-min injection spacings. Data were corrected for dilution and buffer effects by subtracting the average of 3-5 terminal injection points from the saturated tail of the binding curve. A control experiment titrating U20 RNA into buffer showed that the heats of U20 dilution were insignificant. Data were analyzed using the least-squares fitting routines available in the Origin v7.0 software (MicroCal, LLC). Values shown in Fig. 4D and Table 2 are the averages of two experiments.

RESULTS
Overall Shape of U2AF65 RNA Binding Domain-A major goal of this investigation was to determine the solution conformation of the Py tract recognition domain of U2AF65 using SAXS. The solution x-ray scattering profile of the wtU2AF65 domain composed of RRM1 and RRM2 (R12, residues 148-336) is shown in Fig. 2A. The average dimensions (radius of gyration, Rg) and maximum size (Dmax) of the wtU2AF65 R12 molecule were 24.5 and 79 Å, respectively, as estimated from the paired-distance distribution function (Fig. 2B and Table 1). Similar values of Rg were obtained from the Guinier plot. The Dmax of wtU2AF65 R12 is ~26 Å less than that of the analogous R12 domain of PTB (Rg = 29 Å, Dmax = 105 Å (25)). In light of the 23-residue greater length of the PTB inter-R12 linkers, the solution dimensions of wtU2AF65 R12 are consistent with comparable or weaker interactions between the U2AF65 RRM1 and RRM2 domains as compared with those of PTB.
The Dmax from the paired-distance distribution function of wtU2AF65 R12 was used as a starting point for ab initio shape restorations using the program GASBOR (32), which represents the protein structure as a chain-like ensemble of dummy residues. Ten independent restorations gave reproducible results (mean NSD 0.85). The averaged, most populated envelope demonstrates a distinctly bilobal shape consistent with two loosely associated RRMs (Fig. 2D). Next, the rigid body modeling program BUNCH (33) was used to position the high resolution structures of the separated RRM1 (residues 148-228) and RRM2 (residues 260-334) from PDB ID 2G4B and to connect these independent domains via ab initio modeling of the inter-RRM linker region (residues 229-259). Fifteen independent BUNCH models are compatible with the overall shape of the reconstructions from GASBOR, as reflected by the overlay shown in Fig. 2D (NSD = 0.82 between the typical BUNCH and mean GASBOR models shown, calculated using the program SUPCOMB (34)). Like the bilobal ab initio models, the rigid body arrangements of the RRM1 and RRM2 domains remain beyond distances compatible with direct contacts between the domains (>14 Å closest separation). To further illustrate this point, the wtU2AF65 R12 solution data are a better match for the relatively extended RRMs of the crystal structure of a deletion mutant, dU2AF65 (PDB 2G4B) (20), that lacks a portion of the inter-RRM linker (Rg 22.7 Å, discrepancy value χ² = 2.1) than for the compact fold of FIR R12 (PDB 2QFJ) (23) (Rg 18.9 Å, χ² = 13.0) (Fig. 2E). Thus, SAXS data are inconsistent with a closely packed arrangement for U2AF65 RRM1 and RRM2 in the context of the core R12 domain studied here.
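The discrepancy values used to compare experimental scattering with curves calculated from atomic coordinates can be illustrated with a short Python sketch. It assumes a commonly used reduced chi-square form, in which the calculated curve is first scaled to the data by weighted least squares. The exact definition applied by CRYSOL may differ in detail, and the function names and toy numbers are hypothetical.

```python
def scale_factor(i_exp, i_calc, sigma):
    """Weighted least-squares scale c that minimizes sum(((I_exp - c * I_calc) / sigma)^2)."""
    numerator = sum(e * m / s ** 2 for e, m, s in zip(i_exp, i_calc, sigma))
    denominator = sum(m * m / s ** 2 for m, s in zip(i_calc, sigma))
    return numerator / denominator


def chi_square(i_exp, i_calc, sigma):
    """Reduced discrepancy between experimental and scaled model intensities.

    Assumed form: chi^2 = 1 / (N - 1) * sum(((I_exp - c * I_calc) / sigma)^2).
    """
    c = scale_factor(i_exp, i_calc, sigma)
    n = len(i_exp)
    total = sum(((e - c * m) / s) ** 2 for e, m, s in zip(i_exp, i_calc, sigma))
    return total / (n - 1)


# Toy intensities (hypothetical): a model that matches up to a scale factor
# gives zero discrepancy, whereas a model of the wrong shape does not.
matched = chi_square([2.0, 4.0, 6.0], [1.0, 2.0, 3.0], [1.0, 1.0, 1.0])
mismatched = chi_square([1.0, 2.0, 3.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0])
```

Under this definition, a lower value indicates closer agreement, which is the sense in which the extended dU2AF65 coordinates match the solution data better than the compact FIR fold.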
Affinity of Polyuridine Tract Binding-As a prelude to rigorously investigating the enthalpy and entropy changes responsible for Py tract recognition by U2AF65, the apparent equilib-

FIGURE 2. Small angle x-ray scattering analysis of U2AF65 R12 variants. Color schemes are consistent throughout: U2AF65 sequences, blue; PTB variants, maroon. A, experimental x-ray scattering profiles as compared with data calculated from the most typical BUNCH model (solid lines). Scattering intensities from the low q-region for short exposures and high q-region for long exposures were integrated and merged to achieve the experimental scattering profiles shown. The relative scattering intensities are arbitrarily displaced along a logarithmic y axis for clarity. B, comparison of P(r) functions for wtU2AF65 R12, ptbU2AF65 R12, and dU2AF65 R12 calculated from the experimental scattering profiles using the program GNOM (31). The functions are presented in arbitrary units. The radius of gyration (Rg) and maximum intraparticle size (Dmax) of the variants are in the inset. C, the P(r) functions calculated from the experimental dU2AF65 R12 or wtU2AF65 R12 scattering data, respectively, as compared with data calculated from the protein coordinates of the dU2AF65 R12 (PDB ID 2G4B) or FIR R12 structures (PDB ID 2QFJ) using the program CRYSOL (46). D, envelope restorations of wtU2AF65 R12 (colored as in Fig. 1B), ptbU2AF65 R12, and dU2AF65 R12. Mean ab initio shapes resulting from the program GASBOR (32) are superimposed with the most typical model built by the program BUNCH (33). For the BUNCH models, the ab initio models of the inter-RRM linker regions are shown as spheres, and the rigid body models of the individual RRMs are depicted by ribbon diagrams. The mean χ² value for the GASBOR models and the NSD value of the most typical BUNCH model are given. E, for comparison, the solvent accessible surfaces and ribbon diagrams of the dU2AF65 R12 and FIR R12 coordinates are shown following removal of nucleotides. The locations of RRM1 and RRM2 are indicated for wtU2AF65 R12 and FIR R12, and remaining models are oriented similarly. Panels D and E were drawn using PyMOL.

TABLE 1
Overall parameters and quality indicators derived from scattering data for U2AF65 R12 variants
Rg^G, radius of gyration value from the Guinier analysis; Rg^P, radius of gyration value from the P(r) analysis; Dmax, maximum size; χ² values are the discrepancies between the experimental data and scattering from the kth model as indicated by the subscript: χ²_ab, ab initio GASBOR model; χ²_RB, rigid body/ab initio BUNCH model; χ²_dU2AF, dU2AF65 R12 crystal structure (PDB ID 2G4B with RNA coordinates removed); χ²_FIR, FIR R12 crystal structure (PDB ID 2QFJ, chain b with nucleotide removed). CRYSOL was used to calculate scattering curves to q = 0.2 Å⁻¹ from PDB coordinates with a single hydration layer of density 0.38 e/Å³ added to the molecular surface. NSD is the average normalized spatial discrepancy among: NSD_ab, 10 ab initio envelopes; NSD_RB, 10 BUNCH models for ptbU2AF65 R12 and dU2AF65 R12, 15 BUNCH models for wtU2AF65 R12; NSD_SUP, between the average ab initio envelope and the most typical BUNCH model superimposed using SUPCOMB.

(36,37); and (iii) uridine is the nucleotide most frequently observed in natural Py tracts (9). Anisotropy changes were monitored as fluorescein-labeled U20 solutions were titrated with U2AF65 proteins (Fig. 3). The domain necessary and sufficient for Py tract binding, wtU2AF65 R12, bound fluorescein-U20 with an apparent affinity (KD 2.35 ± 0.30 μM) comparable with that previously measured for an immobilized biotin-labeled U20 RNA using surface plasmon resonance (KD 3.4 ± 1.8 μM) (20). This value was compared with that of a larger construct including all three RRM-like motifs (RRM1-RRM2-UHM, R123). Despite the absence of detectable RNA cross-linking to the U2AF65 UHM or chemical shift changes in the presence of RNA (12,38,39), the wtU2AF65 R123 bound U20 with ~7-fold higher affinity (KD 0.35 ± 0.07 μM) than the minimal R12 construct, perhaps due to RNA interactions by residues following RRM2. Because greater amounts of material are required for studying lower affinity interactions by calorimetry, the R123 variants were used for thermodynamic characterization.

Thermodynamic Characteristics of Polyuridine Tract Binding-Isothermal titration calorimetry (ITC) was used to fully analyze the thermodynamic basis for U2AF65 interactions with a representative Py tract (Fig. 4 and Table 2). For the wtU2AF65 R123 protein, the U20 binding enthalpy (ΔH°) of -69 ± 0.3 kcal mol⁻¹ was nearly offset by a corresponding change in binding entropy (-TΔS°) of 61 ± 0.3 kcal mol⁻¹, demonstrating that recognition of the uridine tract is enthalpically driven. The enthalpy-entropy compensation is unusually large as compared with the typical thermodynamic signatures for protein-protein or protein-double-stranded DNA interactions (for example, ΔH° = -10 kcal mol⁻¹ and -TΔS° = 6 kcal mol⁻¹ for U2AF65 UHM binding to an SF3b155 fragment (40)). The one other available example of ITC characterization of single-stranded RNA binding (by the bacterial RNA chaperone, Hfq) also demonstrates a large enthalpy-entropy compensation (for example, ΔH° = -41 kcal mol⁻¹ and -TΔS° = 30 kcal mol⁻¹ for Hfq binding an 18-mer polyadenosine) (41). Although further experiments are needed to test the generality and source of such effects, these results serve as a preliminary indication that remarkably large enthalpy and entropy changes may be a general characteristic of single-stranded RNA recognition.
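The extent of the compensation can be made concrete with a short Python calculation. This is an illustrative sketch that uses only the rounded values quoted above, the standard relations ΔG° = ΔH° - TΔS° and KD = exp(ΔG°/RT) with a 1 M standard state, and the 30°C temperature of the ITC experiments. The function names are hypothetical, and the fitted values in Table 2 are not reproduced here.

```python
import math

GAS_CONSTANT = 1.987204e-3  # kcal mol^-1 K^-1


def binding_free_energy(delta_h, minus_t_delta_s):
    """Gibbs free energy change (kcal/mol) from its enthalpic and entropic terms."""
    return delta_h + minus_t_delta_s


def kd_from_free_energy(delta_g, temp_celsius):
    """Dissociation constant (M, 1 M standard state) implied by delta_g."""
    rt = GAS_CONSTANT * (temp_celsius + 273.15)
    return math.exp(delta_g / rt)


def free_energy_from_kd(kd_molar, temp_celsius):
    """Inverse relation: free energy change (kcal/mol) implied by a KD in M."""
    rt = GAS_CONSTANT * (temp_celsius + 273.15)
    return rt * math.log(kd_molar)


# Rounded values quoted in the text for wtU2AF65 R123 binding to U20.
delta_g = binding_free_energy(-69.0, 61.0)
kd_estimate = kd_from_free_energy(delta_g, 30.0)
```

Two opposing terms of roughly 60-70 kcal/mol leave a net free energy of only about -8 kcal/mol, which corresponds to a dissociation constant in the micromolar range. Because the net value is a small difference between two large numbers, the implied KD is sensitive to rounding of either term.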

Design of Interdomain Linker Variants-The elongated bilobal shape of the U2AF65 R12 fragment implied that the relative RRM1 and RRM2 arrangement lacked significant interdomain constraints, consistent with the previous observation that both U2AF65 RRM1 and RRM2 slide with overlapping cross-linking patterns across Py tracts (12). The U2AF65 R12 pairwise distance distribution function was qualitatively similar to that of the corresponding PTB R12 domain (25). The independent action of PTB RRM1 and RRM2 has been established by the absence of interdomain nuclear Overhauser effects, rotational correlation times, and elongated average shape determined using SAXS (22,24,25). Thus, to further test the possibility that the U2AF65 RRM1 and RRM2 act independently of a tightly packed, intramolecular conformation, we tested the ability of the PTB RRM1-RRM2 interdomain linker to functionally substitute for that of U2AF65.
A hybrid ptbU2AF65 construct replaced 20 residues (residues 238-257) of the U2AF65 inter-RRM linker with sequences from the PTB linker between RRM1 and RRM2 (Fig. 1A). Given that this RRM1-RRM2 linker of PTB has greater length than that of U2AF65, the central region of the PTB linker (residues 148-167 of human isoform a) was chosen for substitution. The low sequence identity of these PTB and U2AF65 regions (two identical out of 20 residues) allows the majority of the region to be replaced with unrelated amino acids. Importantly, these PTB residues lack detectable contacts with the flanking RRMs or RNA in the context of their native protein (22).
Separately, we constructed a deletion variant of these residues in the U2AF65 inter-RRM linker region (Fig. 1A). These residues were absent from the atomic resolution dU2AF65 R12 structure because the wtU2AF65 RNA binding domain eluded co-crystallization in the absence of the internal linker deletion (21). Here, we expanded previous work demonstrating that these residues were dispensable for polyuridine binding affinity and in vitro splicing of the AdML substrate (20) by further analyzing their contribution to the molecular shape of the domain and thermodynamic forces underlying Py tract recognition by ITC. The possible effects of these linker modifications on the RNA binding characteristics and nanostructures of U2AF65 are described below.
PtbU2AF65 Linker Variant Supports Pre-mRNA Splicing-The ability of the PTB linker sequences to function in place of the natural U2AF65 sequences was investigated by pre-mRNA splicing assays with the ptbU2AF65 variant (Fig. 1C). Deletion of the corresponding inter-RRM linker region in dU2AF65 had been shown previously to lack detectable effects on in vitro pre-mRNA splicing assays with the prototypical AdML pre-mRNA substrate (20). In an analogous experiment, the ability of ptbU2AF65 to restore splicing activity was tested by the addition of the variant protein to nuclear extracts depleted of wtU2AF65. The recombinant ptbU2AF65 restored pre-mRNA splicing to levels indistinguishable from wtU2AF65 when similar amounts were added to the splicing reaction (Fig. 1C). Thus, the sequence composition of the PTB RRM1-RRM2 linker supports the fundamental ability of U2AF65 to promote splicing of an optimal pre-mRNA substrate.
Comparison of Apparent Polyuridine Tract Affinities-Fluorescence anisotropy assays were used to compare the RNA affinities of the ptbU2AF65 R12 and dU2AF65 R12 variants with wtU2AF65 R12 (Fig. 3, C and D). All three R12 proteins bound the U20 RNA binding site with comparable affinities. Given that sequences within the PTB RRM1-RRM2 linker lack detectable interactions with RNA in the context of the native protein (22,24), this result suggests that residues 238-257 of U2AF65 likewise do not substantially contribute to Py tract affinity.
Thermodynamic Characteristics of RNA Binding are Comparable for Wt-, Ptb-, and dU2AF65 R123 Variants-ITC characterization allowed the detailed thermodynamic similarities or differences to be compared among the inter-RRM variants of U2AF65. Representative isotherms for the titration of U20 RNA into the wtU2AF65 R123, ptbU2AF65 R123, or dU2AF65 R123 are shown in Fig. 4, A-C. Consistent with the results of fluorescence anisotropy assays, the free energy changes for uridine tract binding by all three variants are the same within error. The enthalpy and entropy changes are also very similar (Fig. 4D and Table 2), with the qualification that ~20% decreases in their magnitudes are conferred by the PTB-linker substitution. Overall, residues 238-257 of the inter-RRM linker do not contribute significantly to the thermodynamic basis for polyuridine binding by U2AF65.
U2AF65 Variants Exhibit Bilobal Shapes-To determine how substitution with PTB sequences or reduction in length influences the overall arrangements of the U2AF65 RRMs, the dU2AF65 R12 and ptbU2AF65 R12 variants were characterized using SAXS (Fig. 2 and Table 1). An experimental distance distribution plot of the dU2AF65 R12 variant as compared with the calculated profile of the high resolution, RNA-bound dU2AF65 R12 coordinates (PDB 2G4B) (χ² = 2.75) shows that the Rg and Dmax dimensions decreased by 2 and 6 Å, respectively, in solution. These differences reflect a somewhat more collapsed average conformation in solution than in the RNA-bound crystal structure of the deletion variant, although in both cases, a bilobal shape is observed. As compared with the unmodified wtU2AF65 R12 protein, the dU2AF65 R12 variant demonstrates a decreased average size Rg (4 Å difference) and maximum length Dmax (16-20 Å difference), consistent with the 20-residue deletion within the dU2AF65 R12 interdomain linker. Substitution of the U2AF65 linker residues with the PTB sequences in ptbU2AF65 R12 results in ~5% increases in Rg and Dmax, which could reflect partial structural differences between the PTB and the U2AF65 linker regions and/or a greater proclivity for the natural linker to associate with its cognate RRMs. Both U2AF65 variants display extended, bilobal ab initio molecular envelopes qualitatively similar to wtU2AF65 R12, without apparent interdomain contacts between the rigid body models of the RRM1 and RRM2 coordinates (Fig. 2D). These qualitatively similar shapes indicate that residues 238-257 are unlikely to directly determine the relative RRM arrangement of U2AF65. The respective low resolution shape restorations of the U2AF65 variants are consistent with bilobal mass distributions separated by a flexible linker (Fig. 2B). This observation is also supported by P(r) functions, which display bimodal distributions.
However, the low resolution shapes of U2AF65 variants are unable to elucidate the structural elements within the natural inter-RRM linker. As such, it is worth considering experimental measurements of the average per-residue dimensions for sequences of well characterized structural composition. A short polymer of glycines, the residue with the fewest backbone restrictions, adopts a more extended conformation than an α-helical peptide (Rg of ~1.5 Å and Dmax of ~5.7 Å per glycine residue (42) as compared with an average Rg of ~0.4 Å per α-helical residue (43)). However, chimeric multidomain proteins with α-helical linkers are more elongated than counterparts containing flexible linkers, as reflected by increases in the Rg and Dmax values derived from SAXS analysis (44). This difference is thought to arise from rearrangements of the flexible linker to accommodate molecular attraction between the protein domains, whereas sequences with α-helical propensity confer rigidity. The similar bilobal shapes of wtU2AF65 R12, dU2AF65 R12, and ptbU2AF65 R12 support the view that the U2AF65 RRM1 and RRM2 domains are relatively separated in solution, although additional linker variants would need to be analyzed to fully evaluate the role of interdomain sequences. These observations are relevant to our consideration of weak versus strong Py tract recognition.

DISCUSSION
The overall structures for U2AF65 and its 3′ splice site assemblies are currently unknown, which presents a major obstacle to understanding their critical role during initiation of pre-mRNA splicing. Previously, we determined the detailed interactions of U2AF65 with an optimal polyuridine binding site from the high resolution structure of a variant lacking a portion of the interdomain linker (20). Here, we reveal key features of the intact U2AF65 RNA binding domain and further investigate how modification of the interdomain linker influences these features. First, the intact RNA binding domain of U2AF65 possesses a bilobal shape consistent with the physical separation of the tandem RRMs observed in the crystal structure (PDB 2G4B). Second, unusually large enthalpy and entropy changes serve as the energetic basis for U2AF65 association with an optimal Py tract. Third, a region from the well characterized, flexible inter-RRM linker of PTB is capable of substituting for native U2AF65 sequences to support in vitro splicing, thermodynamic characteristics of uridine tract recognition, and the bilobal shape of the RNA binding domain. Although the exact structural features of the linker region remain to be elucidated, the separation of the U2AF65 RRM1 and RRM2 domains, coupled with the functionally neutral interchange of the PTB and U2AF65 linker sequences, supports a model in which the U2AF65 RRM1 and RRM2 act independently in a manner comparable with the N-terminal RRM1 and RRM2 of PTB, rather than the tightly coupled domain architecture observed for the RRMs of FIR or the C-terminal RRM3 and RRM4 of PTB.
Our structural and biochemical results have important implications for the mode of U2AF65 recognition of the Py tract consensus sequences of the pre-mRNA substrates (Fig. 5). As shown in Fig. 5A, a map of the known interactions with U2AF65 requires a spatial organization of protein domains and RNA sites beyond the simple linearity of the primary sequences. Because the U2AF65 RRMs responsible for identifying the Py tract consensus sequences are centrally located (11,12), a hinge motion between the RRMs is one means for bringing the N- and C-terminal domains of U2AF65 into proximity. Our SAXS analysis demonstrates that the average organization of the U2AF65 RRMs is relatively straight in the absence of RNA or other factors, illustrating that a bent RRM1-RRM2 conformation is not prearranged. Closer inspection reveals that given the well established topology of the RRMs, there is no need to invoke an acute inter-RRM angle as an explanation for the proximity of the flanking N- and C-terminal U2AF65 domains. The inter-RRM linker by definition connects the C terminus of RRM1 with the N terminus of RRM2. Concurrently, the topology of the RRM fold constrains the C terminus to protrude adjacent to the N terminus of the same domain (Fig. 5B). When connected by a linker of finite length, the relative RRM rotations are constrained so that the U2AF65 sequences directly preceding or following the tandem RRMs are naturally positioned close to one another in three-dimensional space. Accordingly, the N and C termini of the RRM1 and RRM2 domains are oriented toward one another in all rigid body models docked within the U2AF65 molecular envelope. This orientation is an outcome of the simple requirement for the ab initio linker to connect the termini of the two RRMs (Fig. 5B).
The Py tracts of multicellular organisms are often marked by interspersed rather than contiguous uridine tracts, in some cases as markers for selective regulation of alternative splice sites (8). For example, in the α-tropomyosin transcript, a continuous, uridine-rich Py tract directs inclusion of a default exon, whereas an alternatively spliced exon is preceded by several short uridine tracts interrupted by guanosines (45). Although the default Py tract is considerably stronger in its ability to direct splicing, the constitutive splicing factor U2AF65 manages to recognize the weaker splice site, albeit with lower affinity (11). Based on our previous high resolution structure, we suggested that U2AF65 could adjust to recognize weak Py tracts such as that found in α-tropomyosin by rearranging flexible side chains or intermediary water molecules. The separation of the U2AF65 RRM1 and RRM2 observed here, coupled with the ability of the flexible PTB inter-RRM linker to support U2AF65 activities, opens a new avenue for diverse splice site recognition; a malleable conformation of inter-RRM linker sequences could serve as a potential means for U2AF65 to adapt to diverse Py tract sequences (Fig. 5C). Accordingly, multiple binding registers are observed in cross-linking experiments between U2AF65 and Py tracts of different lengths and sequences (12). We have shown that linker residues 238-257 are dispensable for U2AF65 to recognize polyuridine sequences and to promote splicing of an optimal substrate marked by eight consecutive uridines. Nevertheless, a larger number of constructs needs to be examined to unambiguously relate U2AF65 functions to its inter-RRM sequence composition and length. In particular, shortening of this linker may interfere with the ability of U2AF65 to recognize divergent splice sites by limiting the ability of the inter-RRM register to adjust.
This potential means for U2AF65 action, along with the conformation of the overall U2AF65-splice site complex, are thus highlighted as significant areas for future investigation as we progress toward a more sophisticated three-dimensional view of pre-mRNA splice site identification.