Cancer-relevant Splicing Factor CAPERα Engages the Essential Splicing Factor SF3b155 in a Specific Ternary Complex*

Background: CAPERα contains a U2AF homology motif (UHM) that typically recognizes U2AF ligand motifs (ULM) of pre-mRNA splicing factors. Results: Crystal structures reveal CAPERα UHM/SF3b155 ULM interactions. CAPERα preferentially associates with the SF3b155 ULM-containing domain. Conclusion: CAPERα is recruited for pre-mRNA splicing via UHM/ULM interactions with SF3b155. Significance: Knowledge of CAPERα/SF3b155 complexes will enhance understanding of angiogenic spliceoform regulation by CAPERα.
U2AF homology motifs (UHMs) mediate protein-protein interactions with U2AF ligand motifs (ULMs) of pre-mRNA splicing factors. The UHM-containing alternative splicing factor CAPERα regulates splicing of tumor-promoting VEGF isoforms, yet the molecular target of the CAPERα UHM is unknown. Here we present structures of the CAPERα UHM bound to a representative SF3b155 ULM at 1.7 Å resolution and, for comparison, in the absence of ligand at 2.2 Å resolution. The prototypical UHM/ULM interactions authenticate CAPERα as a bona fide member of the UHM family of proteins. We identify SF3b155 as the relevant ULM-containing partner of full-length CAPERα in human cell extracts. Isothermal titration calorimetry comparisons of the purified CAPERα UHM binding known ULM-containing proteins demonstrate that high affinity interactions depend on the presence of an intact, intrinsically unstructured SF3b155 domain containing seven ULM-like motifs. The interplay among bound CAPERα molecules gives rise to the appearance of two high affinity sites in the SF3b155 ULM-containing domain. In conjunction with the previously identified, UHM/ULM-mediated complexes of U2AF65 and SPF45 with SF3b155, this work demonstrates the capacity of SF3b155 to offer a platform for coordinated recruitment of UHM-containing splicing factors.

The coactivator of activating protein-1 and estrogen receptors, CAPERα (also known as HCC1.3, RBM39, FSAP59, and RNPC2), is a pre-mRNA splicing factor and transcriptional coactivator that has emerged as a candidate tumor suppressor in several human malignancies. CAPERα was first identified as an autoantigen in a hepatocellular carcinoma patient (1); its overexpression reduces tumor vascularization and growth (2) and inhibits v-Rel-mediated lymphocyte transformation (3). CAPERα is detected in human spliceosome complexes (4-7) and has been shown to function in the pre-mRNA splicing pathways of fission yeast, Drosophila, and humans (8-10). In fission yeast, the CAPERα homologue Rsd1 has been shown to connect the U1 small nuclear ribonucleoprotein particle (snRNP) at the 5′-splice site with Prp5 and the SF3b subunit of the U2 snRNP at the 3′-splice site (9). In humans, CAPERα promotes formation of the less angiogenic spliceoforms of the vascular endothelial growth factor (VEGF), which could explain the reduced vascularization and growth of CAPERα-overexpressing tumors derived from Ewing sarcoma cells (2). Likewise, CAPERα expression levels are reduced in a CREBBP+/− mouse model of myelodysplastic syndrome (11). In addition to its importance for alternative splicing, CAPERα also functions as a transcriptional coactivator of hormone-sensitive genes (8,12). Despite an increasing knowledge of CAPERα's physiological roles, the interaction partners of human CAPERα in the considerable splicing factor network (13) remain elusive.
Comparisons of protein sequences afford clues concerning the basis for CAPERα interactions with components of the pre-mRNA splicing machinery (Fig. 1). A similar domain organization, including an N-terminal RS domain, central RNA recognition motifs (RRM1 and RRM2), and a C-terminal U2AF homology motif (UHM), marks CAPERα as a paralogue of the essential splicing factor U2AF65. However, the specific regulatory function of CAPERα in alternative splicing (2,8) differs from the general U2AF65 requirement for splicing of the major class of introns (14,15). CAPERα and U2AF65 are members of a broader U2AF65-like family of pre-mRNA splicing factors that also includes Puf60 (also called FIR) and CAPERβ (Fig. 1A). The presence of both Puf60 and U2AF65 together has been shown to cooperatively stimulate pre-mRNA splicing in vitro, and the relative ratios of Puf60 and U2AF65 can influence alternative splice site choice in HeLa cells (16). CAPERα is a close relative of CAPERβ at the level of primary sequence conservation (49% sequence identity between CAPERα and CAPERβ, compared with only 19 or 11% identity with U2AF65 or Puf60, respectively). Yet CAPERα differs from CAPERβ in the addition of a distinctive CAPERα UHM. Functionally, CAPERα is expressed at lower levels than CAPERβ in human kidneys, and the two proteins differ slightly in their abilities to influence alternative splicing and transactivation of regulated genes (8). These differences and the unknown pre-mRNA splicing partners highlight the importance of elucidating the structure and molecular interactions of the distinctive CAPERα UHM.
UHMs comprise a subfamily of RRMs that have evolved specialized features to bind U2AF ligand motifs (ULM) (for review, see Ref. 17) as opposed to RNA. The distinctive UHM features include an acidic α-helix and an RXF-containing loop that embellish a core RRM-like fold. Candidate UHMs are found in a relatively small group of proteins, the majority of which function in pre-mRNA splicing (for example, Fig. 1A). The abilities of most UHM family members to bind ULMs have been confirmed, yet the partner(s) of the CAPERα UHM remains an open question. In contrast with the globular UHM domains, the consensus ULM comprises an intrinsically unstructured, linear stretch of basic residues preceding a key tryptophan (Fig. 1B); the basic residues engage the acidic UHM residues, and the tryptophan stacks between the arginine and phenylalanine side chains of the RXF-loop. Matches with the relatively short and degenerate ULM consensus ((R/K)X₀₋₃W(D/N)(Q/E)) are identified in >500 human proteins using ScanProSite (18). Nevertheless, functionally important UHM targets have been established for only three ULM-containing pre-mRNA splicing factors, including single ULMs in each of the early stage splicing factors SF1 (19) and U2AF65 (20,21) and seven ULM-like motifs in the later stage SF3b155 subunit, five of which have been confirmed to bind UHMs (22-25). A growing number of UHM/ULM structures have been determined, including the U2AF35 UHM/U2AF65 ULM heterodimer for 3′-splice site recognition (20) and the alternative splicing factor SPF45 UHM bound to the fifth ULM (ULM5) of SF3b155 (22) as well as an NMR characterization of an intramolecular UHM/ULM complex in the nuclear envelope protein Man1 (26). Most recently, the core U2AF65 UHM/SF1 ULM structure (19) showed a U2AF65 UHM interface with a coiled-coil domain of SF1 (27,28), which in turn is phosphorylated by the UHM-containing KIS kinase (21,25,29).
Despite emerging structural views of UHM/ULM complexes, the rules governing UHM/ULM selection and the functional interplay among the five contiguous SF3b155 ULMs remain to be defined.
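The degenerate ULM consensus quoted above can be illustrated with a short pattern search. The sketch below is not the ScanProSite procedure itself; it simply assumes that (R/K)X₀₋₃W(D/N)(Q/E) translates into the regular expression `[RK].{0,3}(W)[DN][QE]`, and the function name `find_ulm_like` and its `first_residue` argument are hypothetical. The test sequence is the SF3b155 ULM5 peptide (residues 333-342) given in the Table 2 legend.

```python
import re

# Assumed translation of the consensus (R/K) X0-3 W (D/N) (Q/E);
# the tryptophan is captured so that its position can be reported.
ULM_CONSENSUS = re.compile(r"[RK].{0,3}(W)[DN][QE]")


def find_ulm_like(sequence, first_residue=1):
    """Return (tryptophan residue number, matched text) for each match.

    first_residue is the residue number of sequence[0] in the full protein.
    Matches are non-overlapping, so this is only a rough scan.
    """
    hits = []
    for match in ULM_CONSENSUS.finditer(sequence):
        trp_number = first_residue + match.start(1)
        hits.append((trp_number, match.group()))
    return hits


# SF3b155 ULM5 peptide, residues 333-342 (Table 2 legend)
ulm5 = "KRKSRWDETP"
print(find_ulm_like(ulm5, first_residue=333))
```

With this numbering the reported tryptophan corresponds to Trp-338 of ULM5. Because the pattern is so short, many matches in a proteome-wide scan would be expected to lack any UHM partner, which is the point made in the text.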
Here we demonstrate that CAPERα contains a bona fide UHM that engages in prototypical ULM interactions by determining 2.20 Å and 1.74 Å resolution structures of the apoCAPERα UHM and its complex with a representative SF3b155 ULM (ULM5), respectively. We identify SF3b155 as the primary ULM partner of full-length CAPERα in human cell extracts and show that the CAPERα UHM specifically recognizes the SF3b155 ULM-containing domain with a >50-fold higher affinity than other ULM-containing proteins. Together with the previously described association of SF3b155 ULMs with the UHMs of U2AF65 (23,30) and SPF45 (22), these findings suggest that the SF3b155 ULM-containing domain coordinates the recruitment of multiple UHM-containing proteins to the pre-mRNA splice site.
FIGURE 1. A, human CAPERα (NCBI RefSeq NP_004893) compared with human paralogues CAPERβ (NP_060577), U2AF65 (NP_009210), Puf60 (NP_001258027), and SPF45 (NP_001139019). B, human ULM-containing splicing factors SF1 (NP_004621) and SF3b155 (NP_036565). Circled P, phosphorylated SF1 SPSP motif. HEAT, helical repeats; KH-QUA2, K-Homology Quaking-Homology-2; RS, arginine-serine-rich; RRM, RNA recognition motif (blue); UHM, U2AF homology motif (cyan); ULM, U2AF ligand motif (magenta); Zn, zinc knuckle. Sequence boundaries of domains relevant to this study and the residue numbers of C termini are indicated above. C, ULM "consensus" compared with known sequences of human splicing factor ULMs. ULM tryptophans are highlighted in yellow.
Crystallization and Structure Determination-For crystallization experiments, the CAPERα UHM was purified by size exclusion chromatography in 50 mM NaCl, 15 mM HEPES, pH 7.4, 0.2 mM tris(2-carboxyethyl)phosphine (TCEP). All crystals were grown at 277 K by sitting drop vapor diffusion from equal-volume mixtures of protein and reservoir solutions (400 nl total volume). The apoCAPERα UHM crystals were obtained from a 24 mg/ml protein solution and a reservoir composed of 0.1 M sodium acetate, pH 4.5, 0.1 M Bis-Tris, pH 5.5, 25% (w/v) PEG 3350. The SF3b155-bound crystals were obtained from 18.8 mg/ml CAPERα UHM in the presence of a 4-fold excess of SF3b155 ULM5 and layered with a reservoir solution of 0.8 M sodium phosphate, 0.8 M potassium phosphate, 0.1 M HEPES, pH 7.5. In both states the space group was P2₁2₁2₁, and the unit cell dimensions were similar (Table 1). Crystals were coated with a 1:1 (v/v) mixture of Paratone-N and silicone oils and flash-cooled in liquid nitrogen. Crystallographic data were collected using an in-house microfocus sealed-tube x-ray generator equipped with a CCD detector (λ = 1.54 Å) and processed with the Proteum software package (Bruker AXS, Inc).
Initial attempts to use other UHM structures as search models for molecular replacement were unsuccessful. A molecular replacement solution was obtained using the unpublished structure of mouse apoCAPERα UHM at 0.95 Å resolution, which meanwhile became available through the Joint Center for Structural Genomics/Partnership for T-Cell Biology (PDB ID 3S6E). The resulting structures are similar (r.m.s.d. 0.4 Å relative to PDB ID 3S6E, compared with a 0.25 Å maximum-likelihood coordinate error for the human apo structure presented here). The CAPERα UHM/SF3b155 ULM5 structure was determined by difference Fourier, keeping the same R_free set as the apo counterpart. PHENIX (31) was used for molecular replacement and refinement, and Coot (32) was used for manual adjustments. The final structures were evaluated using MolProbity (33) and illustrated using PyMOL (Schrödinger, LLC). Buried surface areas were calculated using NACCESS (34) as the sum of the solvent-accessible surface areas of the two molecules less that of the pair. The crystallographic data and refinement statistics are reported in Table 1.
Cell extracts and in vitro translation products were treated with 10 μg/ml RNase I (R4875, Sigma) for 5 min at room temperature in conditions that were optimized to remove RNA as monitored by agarose gel electrophoresis and ethidium bromide staining and stored at −80 °C before use. Immediately before pulldown assays, cell extracts and in vitro translation products were thawed and clarified by centrifugation at 20,000 × g for 10 min. For each pulldown experiment, 40 pmol of purified GST fusion protein was used as bait with either 200 μl of HEK293 cell extract (see Fig. 4) or 0.3 μl of diluted in vitro translation product in 250 μl of interaction buffer (10 mM Tris-HCl, pH 8.0, 50 mM NaCl, 0.1% Nonidet P-40, 10% (v/v) glycerol, and 1 μg/μl BSA as a nonspecific competitor) (see Fig. 6). For pulldown assays from increasing amounts of cell extract (see Fig. 7), 40 pmol of GST-SF3b155 was incubated with the indicated volumes of cell extract. Comparisons of SF1 phosphorylation states (see Fig. 4) used GST-SF1 protein incubated with or without purified KIS kinase in kinase assay conditions as described (27).
The interaction reactions were incubated for 90 min at 4 °C. Glutathione beads (10 μl) (GE Healthcare) were washed twice with interaction buffer, incubated with the interaction reaction for 30 min, and washed rapidly 5 times with interaction buffer, and the retained proteins were separated by SDS-PAGE. For assays with in vitro translated proteins, gels were first stained with Coomassie Blue, and then radioactivity was quantified using phosphorimaging (Typhoon FLA9000, GE Healthcare). Quantification of the Coomassie Blue staining was achieved using a 700-nm infrared laser scanner (Odyssey, Li-Cor). The fraction of ³⁵S-labeled proteins bound to the GST fusion proteins in the interaction mixture was then calculated as the fraction of radioactivity recovered on the beads divided by the fraction of GST fusion proteins recovered on the beads. For experiments with cell extracts, bound proteins were analyzed by SDS-PAGE and immunoblotting with anti-U2AF65 (mouse mAb, clone MC3; Sigma), anti-CAPERα (mouse mAb P14; Santa Cruz), and anti-GST (mouse mAb B14; Santa Cruz Biotechnology). A secondary 800-nm-IRDye-conjugated anti-mouse antibody (Rockland Immunochemicals) was used for detection, and the fluorescence signal was acquired with an infrared laser scanner (Odyssey, Li-Cor). Quantification was performed with ImageJ (35).
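The normalization described above for the in vitro translated proteins can be written as a ratio of two recovered fractions. The following is a minimal sketch under that reading; the function name, argument names, and example numbers are hypothetical and are not taken from the study.

```python
def fraction_bound(bead_counts, input_counts, bead_gst, input_gst):
    """Fraction of labeled protein bound, corrected for bait recovery.

    bead_counts / input_counts: radioactivity recovered on the beads
        relative to the radioactivity in the interaction mixture.
    bead_gst / input_gst: GST fusion protein recovered on the beads
        relative to the GST fusion protein added (Coomassie signal).
    """
    if input_counts <= 0 or input_gst <= 0 or bead_gst <= 0:
        raise ValueError("input and recovered bait signals must be positive")
    radioactivity_fraction = bead_counts / input_counts
    bait_fraction = bead_gst / input_gst
    return radioactivity_fraction / bait_fraction


# Hypothetical example: 6% of the counts and 80% of the bait recovered
print(fraction_bound(1200.0, 20000.0, 0.8, 1.0))
```

Dividing by the bait recovery compensates for lane-to-lane differences in how much GST fusion protein survives the washes, so that variants can be compared on the same scale.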
Isothermal Titration Calorimetry-Binding affinities of the CAPERα UHM for the SF3b155 ULM5, wild-type and mutant SF3b155, SF1₁₄₋₁₃₂, and the U2AF65 ULM were measured using a VP-ITC (MicroCal, LLC). Both the CAPERα UHM and the respective ULM protein, except for the SF3b155 ULM5 and ULM5L peptides, which were diluted ≥250-fold into the dialysis buffer, were dialyzed against 50 mM NaCl, 25 mM HEPES, pH 7.4, 0.2 mM TCEP before calorimetry and then extensively degassed. The CAPERα UHM was injected in 28 aliquots of 10 μl each at 2 s μl⁻¹ into 1.4 ml of the ULM binding partner at 30 °C with constant stirring at 307 rpm, 4 min of relaxation time between injections, and 15 μcal s⁻¹ reference power. A control experiment titrating CAPERα UHM into buffer was used to correct the isotherms for the heats of CAPERα UHM dilution. The isotherms were fit using Origin v7.0 (MicroCal, LLC). The average values and S.D. resulting from two independent titration experiments are given in Table 2, and representative isotherms are shown in supplemental Fig. 1.
Size Exclusion Chromatography-The SF3b155 ULM-containing domain was incubated with a 7:1 molar ratio of CAPERα UHM and then separated from excess UHM on a pre-packed Superdex-200 size exclusion column (GE Healthcare) in 50 mM NaCl, 25 mM HEPES, pH 7.4, 0.2 mM TCEP. Elution profiles of the individual subunits were compared. Blue dextran 2000 was used to determine the column void volume. K_av = (V_e − V_0)/(V_c − V_0) was calculated from the void volume (V_0), total column volume (V_c), and elution volumes of molecular mass standards (V_e): chymotrypsinogen (25 kDa), ovalbumin (43 kDa), conalbumin (75 kDa), and aldolase (158 kDa). The linear fit of K_av plotted against the logarithm of the molecular masses of the standard proteins was used to calculate the apparent molecular masses from the elution volumes of the experimental samples. The concentration of the eluted CAPERα/SF3b155 complex was ~5 μM assuming 2:1 stoichiometry.
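The calibration just described can be sketched as follows: K_av is computed for each standard, a straight line is fit to K_av versus log10(molecular mass), and the line is inverted for an experimental elution volume. The molecular masses of the standards are those listed above; the void volume, column volume, and elution volumes in the example are hypothetical placeholders, and the function names are illustrative only.

```python
import math


def k_av(v_e, v_0, v_c):
    """Partition coefficient K_av = (V_e - V_0) / (V_c - V_0)."""
    return (v_e - v_0) / (v_c - v_0)


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = s_xy / s_xx
    return slope, mean_y - slope * mean_x


def calibrate(standards, v_0, v_c):
    """Fit K_av against log10(mass); standards = [(mass_kDa, V_e_ml), ...]."""
    log_masses = [math.log10(mass) for mass, _ in standards]
    k_values = [k_av(v_e, v_0, v_c) for _, v_e in standards]
    return fit_line(log_masses, k_values)


def apparent_mass(v_e, v_0, v_c, slope, intercept):
    """Invert the calibration line to get an apparent mass in kDa."""
    return 10 ** ((k_av(v_e, v_0, v_c) - intercept) / slope)


# Masses from the text; all volumes below are hypothetical (ml).
V0, VC = 8.0, 24.0
standards = [(25.0, 17.0), (43.0, 16.0), (75.0, 15.0), (158.0, 13.6)]
slope, intercept = calibrate(standards, V0, VC)
print(apparent_mass(15.9, V0, VC, slope, intercept))
```

Because the calibration assumes globular standards, an intrinsically disordered sample will elute earlier than its mass predicts and yield an inflated apparent mass.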
Amino Acid Analysis-After 24 h of hydrolysis at 110 °C in the presence of 6 N HCl and 1% phenol, the amino acid contents of the purified CAPERα/SF3b155 complex were quantified using a post-column derivatization technique with ninhydrin and an internal 2.0 nmol of norleucine standard on a Hitachi L-8800 amino acid analyzer at the University of California-Davis Proteomics Core Facility.
Circular Dichroism Spectra-Separate CAPERα UHM, SF3b155, and the saturated complex were purified by size exclusion chromatography and dialyzed against 20 mM phosphate buffer, pH 7.4, and 0.2 mM TCEP. Reference spectra of stoichiometric mixtures of the CAPERα UHM and SF3b155, isolated proteins, and the saturated complex were collected on a Jasco J-815 spectropolarimeter in continuous mode with a 1.0-nm bandwidth, 2-s data integration time, accumulation of 3 scans, and 50 nm min⁻¹ scan speed at room temperature using a 0.1-cm cell. Spectra were corrected for buffer absorbance.

RESULTS
Structure of the ApoCAPERα UHM-To confirm that the C-terminal domain of CAPERα (residues 411-524) adopts a canonical UHM-fold, we determined the 2.20 Å resolution structure in the absence of ligand (Table 1, Fig. 2). Two independent molecules in the crystallographic asymmetric unit are similar, with r.m.s.d. values for the human CAPERα molecules ranging from 0.5 Å between Cα atoms located in regular secondary structure to >3 Å in the RXF-loops, which are engaged differently by crystal packing (overall r.m.s.d. 1.1 Å for 108 matching Cα atoms). The buried surface area at the interface between the two CAPERα molecules (800 Å²) is less than would be expected for a weak homodimer (1620 ± 670 Å²) and within the expected range for crystal packing (570 ± 520 Å²) (36). Accordingly, the CAPERα UHM migrates as a monomer by size exclusion chromatography (Fig. 5A).
The structure of the C-terminal domain of human CAPERα matches the core βαββαβ topology of the parent RRM and specialized UHM-fold families (Fig. 2A). In a structural homology search using DaliLite (37)
where I_i is an intensity I of the ith measurement of a reflection with indices hkl, and ⟨I⟩ is the weighted mean of all measurements of I. c R_cryst = Σ|F_obs − F_calc|/ΣF_obs, where F_obs and F_calc are the observed and calculated structure factor amplitudes, respectively. R_cryst and R_free, respectively, were calculated using the working set and a 10% test set of randomly selected reflections that were excluded from the refinement.
CAPERα) packs against the antiparallel β-sheet (Fig. 2B), which in RRMs offers consensus ribonucleoprotein motifs for RNA interactions (RNP1 and RNP2) (for review, see Ref. 38). Among UHMs, two key RNP residues are degenerate (Gln-421 at the second RNP2 and Asn-467 at the third RNP1 positions of CAPERα). The fifth RNP1 position, which for RRMs is an aromatic residue that stacks with bound nucleotide bases, is conserved among the four CAPERα-like UHMs (Tyr-469 of CAPERα) but is masked from solvent exposure and hence inadvertent RNA binding by hydrophobic residues in the appended UHM α-helix (Leu-508, Phe-509, and Tyr-505 in CAPERα). After this α-helix, the extended CAPERα C terminus adopts a unique proline kink (Pro-510) that alters the backbone trajectory and initiates a short 3₁₀ helix (residues 510-512). Finally, the CAPERα extension integrates between the exterior β-strand (β2) and adjacent α-helix (α1) via a Leu-519 anchor of hydrophobic interactions. Altogether, the interface of this distinctive, 20-residue extension of the CAPERα RRM-like domain buries 1295 Å² of surface area, which approaches the average size of a protein-protein interface (1910 ± 760 Å²) (36).
Structure of the CAPERα UHM Bound to a Prototypical ULM from SF3b155-We next investigated the nature of ULM interactions with the CAPERα UHM. Given that CAPERα is a paralogue of U2AF65, we initially attempted to co-crystallize the CAPERα UHM with the interaction domains of the U2AF65 partner, SF1 (SF1₁₄₋₁₃₂, residues 14-132; Ref. 27), in the phosphorylated and unphosphorylated forms. In parallel, we screened for co-crystallization of the CAPERα UHM with the U2AF65 ULM (residues 85-112; Ref. 20) based on the documented co-localization and co-immunoprecipitation of CAPERα and U2AF65 (39). Last, we chose to co-crystallize the CAPERα UHM with the SF3b155 ULM5, the most conserved and highest affinity U2AF65 UHM-binding site out of the five confirmed SF3b155 ULMs (23). Although complexes of the CAPERα UHM with any of these ULM-containing regions could be isolated by size exclusion chromatography, only the SF3b155 ULM5 produced diffraction-quality co-crystals with the CAPERα UHM (Table 1). The 1.74 Å resolution structure of the CAPERα UHM/SF3b155 ULM5 complex revealed strong difference density for the bound ULM in one of the two copies in the crystallographic asymmetric unit (Fig. 3A). A symmetry-related molecule occludes the binding site of the other CAPERα UHM. Apparent stoichiometries from size exclusion chromatography (see Fig. 5A) and isothermal titration calorimetry (ITC) (Table 2) provide respective support for a monomeric CAPERα UHM and one-to-one complex between the CAPERα UHM and SF3b155 ULM5 peptide. The bound copy of the CAPERα UHM is essentially preconfigured to fit the SF3b155 ULM5, with no detectable differences in key residue conformations at the binding site (r.m.s.d. 0.56 Å between 103 Cα of the corresponding bound and apo-polypeptide chains).
CAPERα UHM Interactions with SF3b155 ULM5-The key interactions between the CAPERα UHM and the minimal SF3b155 ULM5 are structurally conserved compared with known UHM/ULM complexes (Fig. 3, B and C). The SF3b155 ULM5 tryptophan (Trp-338), which in general is essential for ULM/UHM interactions (19,20,23), is buried in a hydrophobic pocket between two α-helices of the CAPERα UHM. One surface of the indole ring stacks in a T-shape with Phe-490 from the CAPERα RXF-loop, whereas the other is masked by a salt bridge between Arg-488 of this loop and Glu-447 from the opposing α-helix. Preceding the ULM tryptophan, a serine residue (Ser-336) is located within a cluster of acidic CAPERα residues (Asp-443, Glu-446, and Glu-447) at a position analogous to that of a serine of the SF1 ULM (Ser-20), whose phosphorylation by protein kinase G-I interferes with U2AF65 association (40).
Among bound ULMs, the pairwise Cα-Cα distances of the CAPERα-bound ULM5 most closely match the U2AF65 ULM bound to the U2AF35 UHM (20) (r.m.s.d. 0.13 Å) (Fig. 3C). This may be due to a key tryptophan at the X position of the RXF-loop that is a shared feature of the CAPERα and U2AF35 UHMs and appears to influence the conformation of the C-terminal ULM residues. The U2AF65 ULM sandwiches the U2AF35 indole group between prolines near the C terminus of the peptide. Although the C terminus of the SF3b155 ULM5 peptide used for structural studies is truncated compared with the

TABLE 2 Isothermal titration calorimetry of CAPERα UHM binding ULMs or ULM-containing proteins
Average values and S.D. of two independent experiments. ΔG° was calculated using ΔG° = −RT ln(K_D⁻¹), and −TΔS° was calculated using ΔG° = ΔH° − TΔS°, T = 303 K. Values for each class of multiple sites in the wild-type SF3b155 domain describe binding of one CAPERα UHM to one of these SF3b155 sites. Representative isotherms are given with c-values in supplemental Fig. 1. Boundaries of SF3b155, -W293, -W338 are residues 190-344; ULM5 is residues 333-342 (KRKSRWDETP); ULM5L is residues 333-355 (KRKSRWDETPASQMGGSTPVLTP).
3C) and lies in similar proximity to an SPF45 tyrosine in an "RYF" motif. In contrast, the C-terminal trajectory of the SF1 ULM is more distant from and passes on the opposite face of the U2AF65 RXF-loop (27), which rather than an aromatic residue offers a solvent-exposed lysine at the X position. A C-terminal proline in either the SF3b155 ULM5 or U2AF65 ULM is preceded by a threonine (SF3b155 Thr-341 and U2AF65 Thr-103) that is phosphorylated in vivo (41). As such, phosphorylation of ULM "TP" motifs offers a potential switch for UHM/ULM regulation through the known influence of phosphothreonine on the cis/trans-isomerization of an adjacent proline (42).
CAPERα Specifically Recognizes SF3b155-To identify the preferred ULM-containing partner of CAPERα, we compared the abilities of endogenous, full-length CAPERα from human cell extracts to associate with GST fusions of known ULM-containing splicing factors (Fig. 4A), including the SF3b155 ULM-containing domain, SF1 (residues 1-255 including the ULM, coiled-coil, and RNA binding domains), or U2AF65. We tested SF1 either in the unphosphorylated state or after treatment with KIS kinase to introduce specifically phosphorylated serines that slightly enhance U2AF65 association (21). As a negative control, we included a "nonbinding" SF3b155 domain with all ULM tryptophans mutated to alanine (SF3b155-dW) (23).
As a positive control, we re-probed the same pulldown assays by immunoblots for endogenous U2AF65, which is known to function in a complex with phosphorylated SF1 (21,27). Whereas CAPERα pulldown by U2AF65 or by SF1/phosphorylated SF1 is, respectively, weak or not detected, CAPERα exhibits significant association with SF3b155, consistent with the presence of CAPERα in B/B^act spliceosome complexes that contain the SF3b subunits (6,7).

To directly confirm the apparent preference of CAPERα for binding SF3b155, we used ITC to quantify the association of purified CAPERα UHM with various ULMs or ULM-containing splicing factors ( (27). The SF3b155 titrations with the CAPERα UHM were best fit by a model of two nonidentical types of sites (supplemental Fig. 1, A and B, χ² 9.6E4 for identical sites versus 0.5E4 for nonidentical sites). The first of these two classes of sites comprised high affinity sites (K_D 58 ± 2 nM) with an apparent stoichiometry of ~2 CAPERα UHM:1 SF3b155 domain. The second class comprised lower affinity sites (K_D 330 ± 42 nM) with an apparent stoichiometry of three CAPERα UHM per SF3b155 domain. Thus, the total apparent stoichiometry in the ITC experiments was close to 5 CAPERα UHM:1 SF3b155, in agreement with five documented SF3b155 ULMs that detectably bind U2AF65 (23). The apparent equilibrium dissociation constants (K_D) of the CAPERα UHM for the U2AF65 ULM or SF1₁₄₋₁₃₂ were, respectively, 20- or 35-fold weaker than even the lower affinity class of SF3b155 sites. The substantially higher affinity of the CAPERα UHM for SF3b155 than for the U2AF65 ULM, which in turn was slightly greater than for SF1, provided quantitative evidence for direct interactions that corroborated the binding preferences in human cell extracts (Fig. 4).
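For orientation, the dissociation constants of the two site classes can be converted to standard free energies with the relation given in the Table 2 legend, ΔG° = −RT ln(K_D⁻¹) at T = 303 K. The sketch below assumes energies in kcal/mol and a gas constant of 1.987 × 10⁻³ kcal mol⁻¹ K⁻¹; the function names are hypothetical, and no ΔH° values are supplied because the table body is not reproduced here.

```python
import math

R_KCAL = 1.987204e-3  # gas constant, kcal mol^-1 K^-1 (assumed unit choice)


def delta_g(k_d_molar, temperature_k=303.0):
    """Standard free energy of binding, dG = -RT ln(1 / K_D), in kcal/mol."""
    return -R_KCAL * temperature_k * math.log(1.0 / k_d_molar)


def minus_t_delta_s(delta_g_value, delta_h_value):
    """Entropic term from dG = dH - T dS, i.e. -T dS = dG - dH."""
    return delta_g_value - delta_h_value


high_affinity = delta_g(58e-9)   # K_D 58 nM class of sites
low_affinity = delta_g(330e-9)   # K_D 330 nM class of sites
print(round(high_affinity, 2), round(low_affinity, 2))
print(round(low_affinity - high_affinity, 2))  # free energy gap between classes
```

On this scale the two classes of SF3b155 sites differ by roughly 1 kcal/mol, which gives a sense of how small an energetic contribution would be enough to produce the apparent high affinity sites.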
SF3b155 Simultaneously Associates with Two CAPERα UHMs-The presence of seven ULM-like candidates within SF3b155, five of which have been documented to bind UHMs (22-25), raised the question of whether the different SF3b155 sites would sterically compete or simultaneously bind multiple CAPERα molecules. The ITC fits suggested the presence of two high affinity and three lower affinity binding sites for the CAPERα UHM in the SF3b155 ULM-containing domain; however, this apparent stoichiometry is subject to the fitting procedure and concentrations of active molecules. We first attempted to characterize the stoichiometry of a saturated CAPERα UHM/SF3b155 complex by size exclusion chromatography of a 7:1 mixture in comparison with the individual subunits and molecular weight standards (Fig. 5A). The CAPERα UHM eluted close to the expected monomeric molecular mass (18-kDa apparent molecular mass compared with 13-kDa expected molecular mass). However, the isolated SF3b155 domain eluted at a substantially larger apparent size than expected (44-kDa apparent molecular mass compared with 17-kDa expected molecular mass), consistent with the intrinsic disorder of apoSF3b155 (23) that confers a large effective hydrodynamic radius (43). This discrepancy complicates interpretation of the CAPERα UHM/SF3b155 stoichiometry based on the elution profiles. Assuming a globular shape and considering the expected molecular mass of the subunits, the apparent molecular mass of the complex (86 kDa) would be consistent with a 5 CAPERα UHM:1 SF3b155 ratio. In contrast, a stoichiometry of ~2 CAPERα UHM:1 SF3b155 is calculated using the apparent molecular masses of the individual subunits and assuming that the SF3b155 subunit remains largely unfolded when bound to CAPERα.
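The two readings of the 86-kDa elution peak follow from simple mass bookkeeping. The sketch below shows one way to reproduce them, assuming that the mass of the complex is the sum of one SF3b155 domain plus n copies of the CAPERα UHM; the function name is hypothetical, and the exact calculation used for Fig. 5A may differ.

```python
def uhm_copies(complex_mass_kda, sf3b155_mass_kda, uhm_mass_kda):
    """Number of UHM copies implied by complex = SF3b155 + n * UHM."""
    return (complex_mass_kda - sf3b155_mass_kda) / uhm_mass_kda


# Expected (sequence-based) subunit masses, i.e. a compact, globular complex
n_globular = uhm_copies(86.0, 17.0, 13.0)

# Apparent (elution-based) subunit masses, i.e. SF3b155 stays largely unfolded
n_extended = uhm_copies(86.0, 44.0, 18.0)

print(round(n_globular, 1), round(n_extended, 1))
```

The first estimate rounds to five UHM copies and the second to two, which is why the elution profile alone cannot distinguish the 5:1 and 2:1 models.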
We resolved these possible interpretations by use of three alternative methods that are independent of the hydrodynamic radius and converged on the conclusion that the CAPERα UHM/SF3b155 complex most likely comprises a 2:1 stoichiometry after size exclusion chromatography. First, the Coomassie Blue-stained band intensities following SDS-PAGE of the purified complex in comparison with CAPERα UHM/SF3b155 mixtures of known composition agree with a 2:1 complex (Fig. 5B). Second, amino acid analysis of the purified complex best matched the predicted composition of a 2:1 complex (Fig. 5C). Third, the circular dichroism (CD) spectrum of the purified complex was nearly indistinguishable from the CD spectrum of a known 2:1 mixture of the CAPERα UHM and SF3b155 subunits but clearly differed from the CD spectra of a 5:1 mixture or the individual subunits (Fig. 5D). Altogether, the purified complex with the SF3b155 ULM-containing domain contains two CAPERα UHM molecules, in agreement with the apparent stoichiometry of the high affinity class of sites resulting from the ITC fits.
CAPERα Weakly Associates with SF3b155 Variants Containing Individual ULMs-To aid structural interpretation and test this known ULM with highest affinity for U2AF65 (23), we first determined the affinity of the co-crystallized SF3b155 ULM5 for the CAPERα UHM using ITC (Table 2, supplemental Fig. 1). The CAPERα UHM bound the minimal SF3b155 ULM5 with 7-fold lower affinity than the weaker class of sites in the intact SF3b155 domain. To account for this disparity, we initially pursued two straightforward hypotheses, either that the bona fide ULM5 comprises additional bordering sequences or that CAPERα preferentially associates with other SF3b155 ULMs. To test these hypotheses, we leveraged our established SF3b155 variants in which all but one of the SF3b155 ULM tryptophans are replaced by alanines (23), leaving each variant with only a single ULM with integrity for UHM interactions. Although only five ULM-like tryptophans of SF3b155 have been documented to bind UHMs (22-24,30), it remains possible that the additional two tryptophans in poorly conserved motifs (Trp-254 and Trp-310) could bind the CAPERα UHM. As such, we included these tryptophans in our SF3b155 mutations. Both the wild-type SF3b155 domain and SF3b155-dW share similar random-coil circular dichroism spectra (23), indicating that differences in UHM binding are unlikely to result from disruption of a well ordered SF3b155 structure.
To investigate the role of sequences surrounding SF3b155 ULM5, we used ITC to characterize the CAPERα UHM association with an SF3b155 variant containing only the single tryptophan (Trp-338, called SF3b155-W338) corresponding to ULM5 (Table 2, supplemental Fig. 1). An apparent binding stoichiometry of 1:1 confirmed that the CAPERα UHM primarily associated with the remaining ULM5 without nonspecific binding to other mutated ULMs. However, the context of the SF3b155-W338 domain reduced the CAPERα UHM affinity by 5-fold (14.0 μM) relative to the isolated ULM5 (residues 333-342), indicating that local extension of ULM5 could not account for the high CAPERα affinity for wild-type SF3b155. Indeed, the SF3b155 domain context reduced the apparent ULM5 affinity for the CAPERα UHM, possibly due to steric hindrance and/or to favorable interactions of the positively charged N terminus of the ULM5 peptide with the acidic UHM α-helix 2, which was expected to be in close proximity. To confirm that sequences directly flanking the SF3b155 ULM5 had little impact on CAPERα UHM affinity, we compared titrations of the CAPERα UHM into a longer ULM5 peptide (ULM5L, residues 333-355) that contained C-terminal TP repeats and found little improvement in affinity relative to the minimal ULM5 (respective K_D values 2.3 and 2.4 μM, Table 2).
We next explored the potential contributions of other ULMs to SF3b155 association with CAPERα. To enable the relevant, full-length CAPERα protein to be studied and to circumvent the large amount of material required for ITC (3 mg of CAPERα UHM per experiment), we used pulldown methods to comprehensively compare levels of CAPERα association for all seven different tryptophan-containing sites within SF3b155 (Fig. 1C). Initial pulldown assays using human cell extracts failed to show detectable co-precipitation of endogenous CAPERα with any SF3b155 variants containing single ULMs (data not shown), in contrast with the clear CAPERα association with the wild-type SF3b155 domain (Fig. 4). To improve the signal, we modified the assay as described (21) to input ³⁵S-labeled CAPERα that was produced by in vitro translation for pulldown by GST-fused SF3b155 variants (Fig. 6). Using the ³⁵S-detection method, CAPERα detectably associates with SF3b155 variants containing unmodified ULM1 (W200) and ULM5 (W338) and to a lesser extent with an SF3b155 variant containing only ULM4 (W293). However, the sum of the CAPERα pulldown by SF3b155 ULM-containing variants remained nearly an order of magnitude less than that by the intact SF3b155 domain.
We confirmed these results using ITC experiments to quantify the CAPERα UHM affinities for SF3b155 variants containing the tryptophans of either ULM1 (W200) or ULM4 (W293) (Table 2, supplemental Fig. 1). The ULM5 (W338) variant had already been characterized by ITC as described above. The SF3b155-W293 variant bound the CAPERα UHM with extremely low affinity (68 μM) (Table 2), 200-fold weaker than the lower affinity class of wild-type SF3b155 sites. Although the SF3b155-W200 variant bound the CAPERα UHM with 6-fold higher affinity than SF3b155-W338, the affinity (2.3 μM) remained 7- and 40-fold weaker than the lower and higher affinity sites of wild-type SF3b155, respectively.
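The fold differences quoted above follow directly from ratios of the apparent K_D values. The short Python sketch below reproduces that arithmetic and converts each K_D into a standard binding free energy. It is only an illustration: the temperature of 25 °C is an assumption (the ITC temperature is not stated in this section), the function and variable names are hypothetical, and the wild-type K_D values are back-calculated from the quoted fold differences rather than taken from Table 2.

```python
import math

R_KCAL_PER_MOL_K = 1.987e-3  # gas constant in kcal/(mol*K)
TEMP_K = 298.15              # ASSUMPTION: 25 C; not stated in this section

# Apparent K_D values (micromolar) quoted in the text for the CAPERalpha UHM.
KD_UM = {
    "ULM5 peptide": 2.4,
    "ULM5L peptide": 2.3,
    "SF3b155-W338": 14.0,
    "SF3b155-W200": 2.3,
    "SF3b155-W293": 68.0,
}


def fold_weaker(kd, kd_reference):
    """Ratio of two K_D values: how many times weaker kd is than kd_reference."""
    return kd / kd_reference


def binding_free_energy(kd_micromolar, temp_k=TEMP_K):
    """Standard binding free energy dG = RT ln(K_D) in kcal/mol (1 M standard state)."""
    return R_KCAL_PER_MOL_K * temp_k * math.log(kd_micromolar * 1e-6)


for name, kd in KD_UM.items():
    print(f"{name:>14}: K_D = {kd:5.1f} uM, dG = {binding_free_energy(kd):6.2f} kcal/mol")

# Domain context versus isolated peptide (the text rounds this ratio to 5-fold).
w338_vs_peptide = fold_weaker(KD_UM["SF3b155-W338"], KD_UM["ULM5 peptide"])
# W200 versus W338 (the text rounds this ratio to 6-fold).
w338_vs_w200 = fold_weaker(KD_UM["SF3b155-W338"], KD_UM["SF3b155-W200"])

# Back-calculated ESTIMATES of the two wild-type site classes from the quoted
# 7- and 40-fold differences; these are not the Table 2 values.
implied_lower_site_um = KD_UM["SF3b155-W200"] / 7
implied_higher_site_um = KD_UM["SF3b155-W200"] / 40

print(f"W338 vs ULM5 peptide: {w338_vs_peptide:.1f}-fold weaker")
print(f"W338 vs W200: {w338_vs_w200:.1f}-fold weaker")
print(f"implied wild-type sites: ~{implied_lower_site_um:.2f} and ~{implied_higher_site_um:.3f} uM")
```

Under these assumptions, the 200-fold difference quoted for SF3b155-W293 (68 μM) and the 7-fold difference quoted for SF3b155-W200 (2.3 μM) both point to a lower affinity wild-type site class in the sub-micromolar range, so the quoted ratios are mutually consistent.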
Altogether, we concluded that none of the SF3b155 ULM-like sites, either individually or in the context of a mutated SF3b155 region, could fully account for the strong association of the CAPERα UHM with the intact SF3b155 ULM-containing domain. Additional molecular effects must therefore contribute to the preferential association of CAPERα with its SF3b155 partner in the spliceosome.
Multiple ULMs Enhance CAPERα Association with SF3b155-The observations that CAPERα detectably binds three different SF3b155 ULM-containing variants and that at least two CAPERα UHMs simultaneously associate with the SF3b155 ULM-containing domain raised the alternative explanation that interactions with multiple ULMs enhance CAPERα affinity for SF3b155. To investigate this possibility, we further assessed the SF3b155 binding characteristics of full-length CAPERα using a series of carefully quantified pulldown assays. Due to difficulties in expressing and manipulating in vitro translated CAPERα, the GST fusion of the SF3b155 ULM-containing domain was incubated with endogenous CAPERα in increasing amounts of HEK 293 cell extract. The retained CAPERα was detected by immunoblot analysis (Fig. 7).
The resulting plot of retained CAPERα versus volume of cell extract exhibits a sigmoidal shape that is typical of cooperative binding. Considering our finding that two CAPERα UHMs simultaneously associate with one SF3b155 domain (Fig. 5), the Hill coefficient of this plot (1.8 ± 0.3) suggests that the two apparent high affinity sites are likely to arise from strong positive cooperativity between SF3b155-bound CAPERα UHM molecules.
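To illustrate how a Hill coefficient can be extracted from a sigmoidal pulldown curve of this kind, the following Python sketch fits the linearized Hill plot, log(θ/(1-θ)) = n·log(x) - n·log(K), by ordinary least squares. It is a minimal sketch and not the fitting procedure used for Fig. 7, which is not described in this section. The extract volumes, the half-saturation volume, and the noise-free signal are synthetic, hypothetical values, and extract volume is treated as a proxy for CAPERα concentration.

```python
import math


def hill(x, k_half, n, b_max=1.0):
    """Hill equation: bound signal as a function of ligand amount x."""
    return b_max * x**n / (k_half**n + x**n)


def fit_hill_linearized(xs, ys, b_max=1.0):
    """Estimate (n, k_half) from the Hill plot by least squares.

    Uses log(theta / (1 - theta)) = n*log(x) - n*log(k_half),
    where theta = y / b_max. Points with theta outside (0, 1) are skipped.
    """
    points = [
        (math.log(x), math.log(y / (b_max - y)))
        for x, y in zip(xs, ys)
        if 0.0 < y < b_max
    ]
    count = len(points)
    mean_u = sum(u for u, _ in points) / count
    mean_v = sum(v for _, v in points) / count
    s_uu = sum((u - mean_u) ** 2 for u, _ in points)
    s_uv = sum((u - mean_u) * (v - mean_v) for u, v in points)
    n = s_uv / s_uu                    # slope of the Hill plot
    intercept = mean_v - n * mean_u    # equals -n * log(k_half)
    k_half = math.exp(-intercept / n)
    return n, k_half


# HYPOTHETICAL, noise-free data generated with n = 1.8 (the value quoted in the text).
volumes_ul = [5.0, 10.0, 20.0, 40.0, 80.0, 160.0]
signal = [hill(v, k_half=40.0, n=1.8) for v in volumes_ul]

n_fit, k_fit = fit_hill_linearized(volumes_ul, signal)
print(f"Hill coefficient = {n_fit:.2f}, half-saturation volume = {k_fit:.1f} uL")
```

For real immunoblot data, which carry noise and an uncertain maximal signal, nonlinear regression of the Hill equation with the plateau as a fitted parameter would generally be preferable to this linearization. For two binding sites, the Hill coefficient cannot exceed 2, so a value near 1.8 is close to the upper limit expected for a two-site system.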

DISCUSSION
Here, our crystal structures reveal CAPERα as a bona fide member of the UHM family bound to a prototypical ULM ligand. The CAPERα UHM complex with the SF3b155 ULM5 is mediated by (i) a ULM tryptophan (Trp-338 of SF3b155) inserted within a hydrophobic pocket between the UHM α-helices and (ii) preceding basic residues of the ULM interacting with an acidic α1-helix of the UHM. These central interactions are shared among known UHM/ULM structures (19,20,22,27,28), yet new themes and variations emerge from structural comparisons with the CAPERα UHM/SF3b155 ULM5 complex.
We now conclude that nearly all UHM structures append a C-terminal α-helix (α3 of the CAPERα UHM) to the core RRM-like topology (with the exception of the U2AF35 UHM, which is followed by a structurally uncharacterized zinc knuckle). This C-terminal α-helix packs against the degenerate RNP2 and RNP1 motifs of the UHM, in particular concealing a singular conserved aromatic residue with the capacity to interact with RNA (CAPERα Tyr-469). Notably, it was recently determined that a coiled-coil extension of the SF1 ULM interacts with the C-terminal α-helix of the U2AF65 UHM, stabilizing its position against the RNP motifs (27,28). The corresponding C-terminal extension of the CAPERα UHM is remarkably long; the α-helix is followed by a 3₁₀ helix and an extended polypeptide wrapped across the core UHM surface. By analogy, these observations raise the possibility that intact SF3b155 could selectively recognize the distinctive structure of the CAPERα C-terminal extension.
The conformation of the SF3b155 ULM5 bound to CAPERα adopts an inverted U-shape, which is shared by both the same ligand bound to the SPF45 UHM (22) and the U2AF65 ULM bound to the U2AF35 UHM (20). In contrast, the trajectory of the SF1 ULM bound to U2AF65 is nearly linear and passes across the opposite face of the RXF-loop (19,27,28). The conformations of the CAPERα- or SPF45-bound SF3b155 ULM5s are secured by intramolecular electrostatic interactions between charged ULM residues flanking the central tryptophan (SF3b155 Arg-337/Glu-340) as well as by steric limits set by a bulky aromatic residue at the X position of the UHM RXF-loop (CAPERα Trp-338 or SPF45 Tyr-376). These SF3b155 ULM5 and CAPERα or SPF45 residues diverge from their counterparts in the SF1 ULM/U2AF65 UHM complex (respectively, SF1 Arg-21/Gln-24 and U2AF65 Lys-453).
Although the U2AF65 ULM lacks the intramolecular Arg-337/Glu-340 salt bridge of SF3b155 ULM5, its backbone conformation is nearly identical to that of the bound SF3b155 ULM5 ligands. As seen for the SF3b155 ULM5, the U2AF65 ULM is constrained sterically by U2AF35 Trp-134 in the RXF-loop. This UHM tryptophan additionally engages a proline-rich extension of the U2AF65 ULM in a hydrophobic sandwich that is required for U2AF heterodimer formation (20). A C-terminal proline of the minimal, co-crystallized SF3b155 ULM5 abuts the RXF-loop, where in the context of the TP repeats found in intact SF3b155 it could interact with an exposed aromatic UHM residue in a manner analogous to the U2AF65 ULM/U2AF35 UHM complex. Nevertheless, comparison with a C-terminally extended SF3b155 ULM5L peptide shows little contribution of this region to CAPERα binding, at least in the absence of SF3b155 phosphorylation.
Importantly, we demonstrate that the second-step pre-mRNA splicing factor, SF3b155, is the ULM-containing partner of CAPERα both for purified proteins and in human cell extracts. Our finding that CAPERα preferentially associates with SF3b155 agrees with mass spectrometry findings that CAPERα is associated with catalytically active spliceosomes (4-7). Furthermore, CAPERα is a known interaction partner of the SR-related alternative splicing factor, SRrp53, which is required for the second step of splicing (44).
We find that two major sites within the ULM-containing domain of SF3b155 can associate simultaneously with CAPERα molecules and further demonstrate that the integrity of multiple ULMs is important for CAPERα selection of SF3b155 over off-target ULM-containing proteins. This finding expands the paradigm that most splicing factors function as single copy subunits during spliceosome assembly. For example, early hypotheses for multimeric assemblies of the polypyrimidine tract-binding protein (PTB) were clarified by evidence for looping of the RNA bound to a PTB monomer (45,46). The UHM-containing U2AF65 paralogue Puf60 forms weak homodimers (24,47) yet lacks detectable cooperativity for binding SF3b155 fragments containing either ULM1-ULM2 or ULM2-ULM3 (24). The formal prospect remains that the intact ULM-containing domain of SF3b155 is important for Puf60 as we observe here for CAPERα. Now, our establishment that at least two CAPERα molecules simultaneously bind one SF3b155 supports the possibility that multiple copies of CAPERα and possibly other UHM-containing proteins contribute to spliceosome assembly.
In a wider context, an open question is whether various UHM-containing splicing factors will compete for binding the multiple SF3b155 ULMs or exhibit heterotypic positive cooperativity. The ULM-containing domain of SF3b155 has been shown to interact with several UHM-containing splicing factors, including U2AF65 (23,30), SPF45 (22), and Puf60 (24). The preferences of U2AF65 (23) and here CAPERα for binding each of the separated tryptophan-containing SF3b155 sites have been characterized and now can be compared. Both U2AF65 and CAPERα associate strongly with ULM5 (Trp-338), yet CAPERα displays a distinct preference for SF3b155 ULM1 (Trp-200) and to a lesser extent ULM4 (Trp-293) (Fig. 6). Altogether, considering that U2AF65 and CAPERα co-localize in cells (39) as well as bind distinct SF3b155 ULMs and that the U2AF65 and Puf60 proteins synergistically promote splicing (16), it is tempting to speculate that cross-cooperativity occurs among UHM-containing proteins. Although little is known concerning the RNA sequence specificity of CAPERα, an assembly of multiple RNA-binding proteins would assist recognition of longer cognate RNA sites. Certainly, a sigmoidal response resulting from cooperative UHM binding to SF3b155 would offer the means for subtle changes in splicing factor levels to invoke nearly all-or-none regulation of alternative splicing.
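The switch-like character of a cooperative response can be made concrete with the Hill equation. The derivation below is an idealized sketch that assumes the response follows a simple Hill function with coefficient n and half-saturation level K (symbols introduced here for illustration); it compares the fold change in factor level needed to move from 10% to 90% occupancy with and without cooperativity.

```latex
\theta(x) = \frac{x^{n}}{K^{n} + x^{n}}
\quad\Longrightarrow\quad
x_{\theta} = K \left( \frac{\theta}{1-\theta} \right)^{1/n}

\frac{x_{0.9}}{x_{0.1}}
  = \left( \frac{0.9/0.1}{0.1/0.9} \right)^{1/n}
  = 81^{1/n}

n = 1:\quad \frac{x_{0.9}}{x_{0.1}} = 81
\qquad
n = 1.8:\quad \frac{x_{0.9}}{x_{0.1}} = 81^{1/1.8} \approx 11.5
```

Under this assumption, a Hill coefficient of about 1.8 narrows the concentration window separating low from high occupancy roughly 7-fold relative to noncooperative binding, which is the sense in which modest changes in splicing factor levels could produce a nearly all-or-none outcome.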
Altogether, we envision a tightly controlled UHM/ULM network wherein several UHM-containing splicing factors, including U2AF65 and CAPERα, recognize multiple, "weak" SF3b155 sites in a concerted fashion and selectively enhance or disrupt recognition of specific splice sites (Fig. 8). SF3b155 is the target of several lead compounds such as spliceostatin A that confer cytotoxicity and anti-tumor effects (for review, see Ref. 48). Furthermore, mutations in the SF3b1 gene encoding SF3b155 occur in ~75% of myelodysplastic syndromes with ring sideroblasts (49-51) as well as in chronic lymphocytic leukemia (for review, see Ref. 52) and other cancers (53-55). Although the affected residues lie outside the SF3b155 ULM-containing region per se, future studies will determine whether these preleukemia mutations or natural product binding sites in SF3b155 could have an indirect, allosteric influence on the interplay among multiple, bound UHM-containing proteins. Such downstream effects potentially could include inhibition of CAPERα functions as a tumor suppressor controlling the expression of angiogenic VEGF spliceoforms (2).