Representative cancer-associated U2AF2 mutations alter RNA interactions and splicing

Open Access. Published: October 05, 2020. DOI: https://doi.org/10.1074/jbc.RA120.015339
      High-throughput sequencing of hematologic malignancies and other cancers has revealed recurrent mis-sense mutations of genes encoding pre-mRNA splicing factors. The essential splicing factor U2AF2 recognizes a polypyrimidine-tract splice-site signal and initiates spliceosome assembly. Here, we investigate representative, acquired U2AF2 mutations, namely N196K or G301D amino acid substitutions associated with leukemia or solid tumors, respectively. We determined crystal structures of the wild-type (WT) compared with N196K- or G301D-substituted U2AF2 proteins, each bound to a prototypical AdML polypyrimidine tract, at 1.5, 1.4, or 1.7 Å resolutions. The N196K residue appears to stabilize the open conformation of U2AF2 with an inter-RNA recognition motif hydrogen bond, in agreement with an increased apparent RNA-binding affinity of the N196K-substituted protein. The G301D residue remains in a similar position as the WT residue, where unfavorable proximity to the RNA phosphodiester could explain the decreased RNA-binding affinity of the G301D-substituted protein. We found that expression of the G301D-substituted U2AF2 protein reduces splicing of a minigene transcript carrying prototypical splice sites. We further show that expression of either N196K- or G301D-substituted U2AF2 can subtly alter splicing of representative endogenous transcripts, despite the presence of endogenous, WT U2AF2 such as would be present in cancer cells. Altogether, our results demonstrate that acquired U2AF2 mutations such as N196K and G301D are capable of dysregulating gene expression for neoplastic transformation.
      Large-scale sequencing projects, together with an emerging plethora of protein structures, have revealed statistically significant clustering of disease-associated mutations at protein–ligand interfaces (
      • Wang X.
      • Wei X.
      • Thijssen B.
      • Das J.
      • Lipkin S.M.
      • Yu H.
      Three-dimensional reconstruction of protein networks provides insight into human genetic disease.
      ,
      • Kamburov A.
      • Lawrence M.S.
      • Polak P.
      • Leshchiner I.
      • Lage K.
      • Golub T.R.
      • Lander E.S.
      • Getz G.
      Comprehensive assessment of cancer missense mutation clustering in protein structures.
      ,
      • Gao M.
      • Zhou H.
      • Skolnick J.
      Insights into disease-associated mutations in the human proteome through protein structural analysis.
      ,
      • Shao C.
      • Yang B.
      • Wu T.
      • Huang J.
      • Tang P.
      • Zhou Y.
      • Zhou J.
      • Qiu J.
      • Jiang L.
      • Li H.
      • Chen G.
      • Sun H.
      • Zhang Y.
      • Denise A.
      • Zhang D.E.
      • et al.
      Mechanisms for U2AF to define 3′ splice sites and regulate alternative splicing in the human genome.
      ). This revelation explains how mis-sense mutations of the same gene can cause different diseases by affecting distinct functional interfaces of an encoded protein product. Conversely, trans-acting mutations that modify mutually interacting surfaces of a multisubunit complex often produce similar disease symptoms. For example, mis-sense mutations in different domains of the WAS (Wiskott–Aldrich syndrome) protein cause clinically distinct disorders such as Wiskott–Aldrich syndrome or X-linked neutropenia (
      • Wang X.
      • Wei X.
      • Thijssen B.
      • Das J.
      • Lipkin S.M.
      • Yu H.
      Three-dimensional reconstruction of protein networks provides insight into human genetic disease.
      ). On the other hand, mutations clustered at the mutual interfaces of either complement factor H or component C3 proteins result in hemolytic uremic syndrome (
      • Wang X.
      • Wei X.
      • Thijssen B.
      • Das J.
      • Lipkin S.M.
      • Yu H.
      Three-dimensional reconstruction of protein networks provides insight into human genetic disease.
      ). Indeed, understanding the 3D aspects of the gene-to-disease process is of great interest to pharmaceutical and medical industries seeking to identify druggable targets for precision medicine. For example, a Phe508 deletion remodels the interactome of the cystic fibrosis transmembrane conductance regulator (ΔF508). Reduced levels of specific ΔF508 cystic fibrosis transmembrane conductance regulator interactors can partially restore channel function (
      • Pankow S.
      • Bamberger C.
      • Calzolari D.
      • Martínez-Bartolomé S.
      • Lavallée-Adam M.
      • Balch W.E.
      • Yates 3rd, J.R.
      ΔF508 CFTR interactome remodelling promotes rescue of cystic fibrosis.
      ). As a second example, oncogenic mutations of the p53 tumor suppressor protein can stimulate interactions with the transcription factor Nrf2, which in turn increases expression of proteasome genes and confers proteasome inhibitor resistance to cancer cells (
      • Walerych D.
      • Lisek K.
      • Sommaggio R.
      • Piazza S.
      • Ciani Y.
      • Dalla E.
      • Rajkowska K.
      • Gaweda-Walerych K.
      • Ingallina E.
      • Tonelli C.
      • Morelli M.J.
      • Amato A.
      • Eterno V.
      • Zambelli A.
      • Rosato A.
      • et al.
      Proteasome machinery is instrumental in a common gain-of-function program of the p53 missense mutants in cancer.
      ). The premise that disease-relevant mutations tend to cluster at 3D protein interfaces has been incorporated in several recent computational approaches for distinguishing neutral passenger mutations from candidate drivers of human disease (
      • Shim J.E.
      • Kim J.H.
      • Shin J.
      • Lee J.E.
      • Lee I.
      Pathway-specific protein domains are predictive for human diseases.
      ,
      • Ashford P.
      • Pang C.S.M.
      • Moya-García A.A.
      • Adeyelu T.
      • Orengo C.A.
      A CATH domain functional family based approach to identify putative cancer driver genes and driver mutations.
      ,
      • Chen S.
      • Fragoza R.
      • Klei L.
      • Liu Y.
      • Wang J.
      • Roeder K.
      • Devlin B.
      • Yu H.
      An interactome perturbation framework prioritizes damaging missense mutations for developmental disorders.
      ).
      Among hematologic malignancies and certain types of cancers, acquired mis-sense mutations frequently affect pre-mRNA splicing factors involved in the early stages of 3´ splice-site selection (
      • Dvinge H.
      • Kim E.
      • Abdel-Wahab O.
      • Bradley R.K.
      RNA splicing factors as oncoproteins and tumour suppressors.
      ). Most often, mutational hot spots affect clusters of residues at the protein–RNA interfaces of SF3B1, SRSF2, and U2AF1 (11). The cancer-associated mutations of SF3B1 further are believed to modify the toroidal structure of the protein and alter its recruitment of RNA helicases (
      • Tang Q.
      • Rodriguez-Santiago S.
      • Wang J.
      • Pu J.
      • Yuste A.
      • Gupta V.
      • Moldón A.
      • Xu Y.Z.
      • Query C.C.
      SF3B1/Hsh155 HEAT motif mutations affect interaction with the spliceosomal ATPase Prp5, resulting in altered branch site selectivity in pre-mRNA splicing.
      ,
      • Carrocci T.J.
      • Zoerner D.M.
      • Paulson J.C.
      • Hoskins A.A.
      SF3b1 mutations associated with myelodysplastic syndromes alter the fidelity of branchsite selection in yeast.
      ). The recurrent mutations of SF3B1, SRSF2, and U2AF1 in turn dysregulate the functions of the encoded proteins for gene expression and are thought to be drivers of cancer progression. Lower-frequency mutations of other splicing factors also may represent potential drug targets that alter proto-oncogenic functional events. Precedents for clinical consequences from such “long-tail” mutations in the frequency distributions of somatically mutated genes already have been established outside the field of pre-mRNA splicing, as exemplified for the paralogous mutations of Ras superfamily members (
      • Chang M.T.
      • Asthana S.
      • Gao S.P.
      • Lee B.H.
      • Chapman J.S.
      • Kandoth C.
      • Gao J.
      • Socci N.D.
      • Solit D.B.
      • Olshen A.B.
      • Schultz N.
      • Taylor B.S.
      Identifying recurrent mutations in cancer reveals widespread lineage diversity and mutational specificity.
      ).
      We previously documented cancer-associated mutations for U2AF2, the heterodimeric partner of the U2AF1 pre-mRNA splicing factor (
      • Glasser E.
      • Agrawal A.A.
      • Jenkins J.L.
      • Kielkopf C.L.
      Cancer-associated mutations mapped on high-resolution structures of the U2AF2 RNA recognition motifs.
      ). In most cases, U2AF2 mutations affect residues that are located in discrete domains of the protein, most prominently the two central RNA recognition motifs (RRM1 and RRM2), as well as an N-terminal region for heterodimerization with U2AF1 and a C-terminal protein-interaction motif (Fig. 1). The U2AF2 RRM1 and RRM2 are responsible for recognizing a polypyrimidine (Py) tract signal preceding the major class of 3´ splice sites (
      • Singh R.
      • Valcárcel J.
      • Green M.R.
      Distinct binding specificities and functions of higher eukaryotic polypyrimidine tract-binding proteins.
      ,
      • Singh R.
      • Banerjee H.
      • Green M.R.
      Differential recognition of the polypyrimidine-tract by the general splicing factor U2AF65 and the splicing repressor Sex-lethal.
      ). Structure determinations by NMR and X-ray crystallography demonstrate that the two RRMs recognize a continuous nine-uridine Py tract in an open, side-by-side configuration (
      • Mackereth C.D.
      • Madl T.
      • Bonnal S.
      • Simon B.
      • Zanier K.
      • Gasch A.
      • Rybin V.
      • Valcárcel J.
      • Sattler M.
      Multi-domain conformational selection underlies pre-mRNA splicing regulation by U2AF.
      ,
      • Huang J.R.
      • Warner L.R.
      • Sanchez C.
      • Gabel F.
      • Madl T.
      • Mackereth C.D.
      • Sattler M.
      • Blackledge M.
      Transient electrostatic interactions dominate the conformational equilibrium sampled by multidomain splicing factor U2AF65: a combined NMR and SAXS study.
      ,
      • Agrawal A.A.
      • Salsi E.
      • Chatrikhi R.
      • Henderson S.
      • Jenkins J.L.
      • Green M.R.
      • Ermolenko D.N.
      • Kielkopf C.L.
      An extended U2AF65-RNA-binding domain recognizes the 3′ splice site signal.
      ,
      • Voith von Voithenberg L.
      • Sánchez-Rico C.
      • Kang H.S.
      • Madl T.
      • Zanier K.
      • Barth A.
      • Warner L.R.
      • Sattler M.
      • Lamb D.C.
      Recognition of the 3′ splice site RNA by the U2AF heterodimer involves a dynamic population shift.
      ). In the absence of RNA, the inter-RRM conformation is dynamic and can adopt a range of RRM1/RRM2 proximities (
      • Agrawal A.A.
      • Salsi E.
      • Chatrikhi R.
      • Henderson S.
      • Jenkins J.L.
      • Green M.R.
      • Ermolenko D.N.
      • Kielkopf C.L.
      An extended U2AF65-RNA-binding domain recognizes the 3′ splice site signal.
      ,
      • Jenkins J.L.
      • Laird K.M.
      • Kielkopf C.L.
      A broad range of conformations contribute to the solution ensemble of the essential splicing factor U2AF65.
      ). A closed U2AF2 conformation, in which the RNA-binding surface of RRM1 is masked by RRM2 (18), is stabilized in the heterodimer with the U2AF1 subunit (
      • Warnasooriya C.
      • Feeney C.F.
      • Laird K.M.
      • Ermolenko D.N.
      • Kielkopf C.L.
      A splice site-sensing conformational switch in U2AF2 is modulated by U2AF1 and its recurrent myelodysplasia-associated mutation.
      ). Notably, many of the cancer-associated U2AF2 mutations are predicted to cluster near the RNA or RRM1/RRM2 interface of the respective open and closed conformations (
      • Glasser E.
      • Agrawal A.A.
      • Jenkins J.L.
      • Kielkopf C.L.
      Cancer-associated mutations mapped on high-resolution structures of the U2AF2 RNA recognition motifs.
      ). This observation suggests that cancer-associated mutations of the U2AF2 RRMs could modulate binding to the Py tract and dysregulate pre-mRNA splicing or other U2AF2–RNA-dependent processes. However, the structural and functional consequences of such long-tail U2AF2 mutations have yet to be investigated empirically.
      Figure 1. U2AF2 RNA recognition motifs recognize the polypyrimidine splice-site signal and can be affected by cancer-associated mutations. A, schematic diagram of the U2AF2 complex with the 3´ splice site. The C-terminal domain (UHM) is thought to sequentially bind SF1 and then SF3B1 during spliceosome assembly with the splice sites. B, location of the N196K and G301D mutations on the U2AF2 domains. The domain boundaries of the construct used for RNA-binding and crystallization experiments (U2AF212L) are shown at the bottom.
      Here, we use X-ray crystallography, RNA-binding assays, and pre-mRNA splicing assays of minigene reporters and endogenous transcripts to investigate the consequences of two representative cancer-associated mutations of U2AF2. We focused on an N196K substitution of the N-terminal RRM1 that recurs among patients with acute myeloid leukemia (AML) and a distinct, G301D substitution of the C-terminal U2AF2 RRM2 observed in cases of colon adenocarcinoma and castration-resistant prostate carcinoma. We compared the respective 1.4 and 1.7 Å resolution structures of the N196K- and G301D-substituted proteins bound to a prototypical Py tract with a baseline, 1.5 Å resolution structure of the wild-type (WT) complex. The U2AF2 loop containing the N196K mutation shifts to form an inter-RRM hydrogen bond that appears to stabilize the open conformation. The G301D conformation remains unchanged but may disfavor binding to the nearby phosphodiester backbone of the RNA. Accordingly, the N196K mutation increases the apparent RNA-binding affinity of U2AF2, whereas the G301D mutation has a converse effect. The mutations differently affect splicing of representative transcripts, with the G301D mutation showing effects similar to siRNA-mediated reductions in U2AF2 levels. These different structural and functional consequences affirm that the N196K and G301D mutations of U2AF2 have the potential to drive dysregulated gene expression in leukemias and cancers and, in addition, resolve how mutations in the same U2AF2 gene can result in clinically distinct disorders.

      Results

      Structure of human U2AF212L bound to the AdML Py tract

      As a baseline to distinguish the structural influence of the N196K and G301D mutations on a bona fide splice-site complex, we first determined the crystal structure of the U2AF2 RRM-containing region (U2AF212L; Fig. 1B) bound to a modified, Py tract oligonucleotide (5´-UUUU(dU)U(5BrdU)CC-3´) (Table 1 and Fig. 2A). The sequence of the co-crystallized oligonucleotide matched the prototypical adenovirus major late promoter transcript (AdML), which also is identical to the preferred U2AF2 binding site as determined by in vitro selection (
      • Singh R.
      • Valcárcel J.
      • Green M.R.
      Distinct binding specificities and functions of higher eukaryotic polypyrimidine tract-binding proteins.
      ). The two terminal cytidines differed from the primarily uridine sequences of our prior U2AF212L structures (
      • Agrawal A.A.
      • Salsi E.
      • Chatrikhi R.
      • Henderson S.
      • Jenkins J.L.
      • Green M.R.
      • Ermolenko D.N.
      • Kielkopf C.L.
      An extended U2AF65-RNA-binding domain recognizes the 3′ splice site signal.
      ). As for prior structures, including a deoxyuridine (dU) and 5-bromo-dU in the oligonucleotide marked the sequence register and facilitated high-quality crystals. The U2AF212L–AdML Py tract structure was determined at 1.5 Å resolution by molecular replacement. Based on similar crystallization conditions (succinate, pH 7.0), crystal packing, and resolution limits, the prior structure of U2AF212L bound to an all-uridine oligonucleotide (PDB code 5EV3) was an appropriate starting model for refinement and comparison of structural changes to bind the terminal cytidines. Other high-resolution U2AF212L structures bound to uridines (PDB codes 5EV1 and 5EV2) were available; however, the nonphysiological low pH of the crystallization conditions (pH 4.0) altered relevant terminal nucleotide interactions.
      Table 1. Crystallographic data collection and refinement statistics
      U2AF212L variant + UUUU(dU)U(5BrdU)CC | Wild type | N196K | G301D
      PDB accession code | 6XLW | 6XLV | 6XLX
      Data collection (a)
       Wavelength (Å) | 1.033 | 0.979 | 0.979
       Resolution range (Å) | 38.71–1.50 (1.52–1.50) | 38.45–1.40 (1.42–1.40) | 38.71–1.70 (1.73–1.70)
       Space group | P 21 21 21 | P 21 21 21 | P 21 21 21
       Unit cell (Å) | 43.3, 62.7, 77.4 | 43.3, 63.1, 76.9 | 43.5, 61.9, 77.4
       Total no. reflections | 230,397 | 251,647 | 136,172
       Multiplicity | 6.9 (6.0) | 6.2 (3.8) | 5.8 (5.2)
       Completeness (%) | 97.0 (85.0) | 96.0 (70.7) | 99.7 (97.1)
       Mean I/σ(I) | 22.3 (9.9) | 13.1 (1.3) | 13.2 (2.4)
       Rmerge (%) (b) | 5.3 (9.8) | 5.5 (67.7) | 6.3 (57.0)
       Rp.i.m. (%) (c) | 3.2 (6.2) | 2.4 (38.1) | 2.8 (26.5)
       CC1/2 (%) (d) | 99.8 (99.1) | 99.8 (70.3) | 99.8 (89.5)
      Refinement
       No. reflections (work/test) | 33,522/2,223 | 76,426/5,061 | 44,172/2,922
       Rwork/Rfree (%) (e) | 13.1/15.3 | 14.1/16.9 | 16.3/19.4
       No. of atoms
        Macromolecules | 1,560 | 1,566 | 1,524
        DNA/RNA | 160 | 160 | 160
        Solvent | 266 | 282 | 178
       RMSD
        Bonds (Å) | 0.008 | 0.010 | 0.004
        Angles (°) | 1.041 | 1.209 | 0.835
       Ramachandran (%)
        Favored | 99.5 | 99.5 | 98.5
        Allowed | 0.5 | 0.5 | 1.5
        Outliers | 0.0 | 0.0 | 0.0
       MolProbity score (f) | 1.12 | 0.98 | 0.94
       〈B-factor〉 (Å2) | 22.0 | 27.8 | 37.4
        Protein | 20.6 | 27.3 | 37.5
        DNA/RNA | 20.9 | 20.8 | 35.7
        Solvent | 30.7 | 34.6 | 37.9
      a Statistics for the highest-resolution shell are shown in parentheses.
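      b $R_{\mathrm{merge}} = \sum_{hkl}\sum_i |I_i - \langle I \rangle| / \sum_{hkl}\sum_i |I_i|$, where $I_i$ is an intensity $I$ for the $i$th measurement of a reflection with indices $hkl$ and $\langle I \rangle$ is the weighted mean of all measurements of $I$.
      c $R_{\mathrm{p.i.m.}} = \sum_{hkl} \sqrt{1/(n-1)} \sum_i |I_i - \langle I \rangle| / \sum_{hkl}\sum_i |I_i|$, where $n$ is the number of observations of the intensity $I_i$.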
      d CC1/2, correlation coefficient between intensities of random-half data set (
      • Karplus P.A.
      • Diederichs K.
      Linking crystallographic model and data quality.
      ).
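      e $R_{\mathrm{work}} = \sum_{hkl} \big| |F_{\mathrm{obs}}(hkl)| - |F_{\mathrm{calc}}(hkl)| \big| / \sum_{hkl} |F_{\mathrm{obs}}(hkl)|$ for the working set of reflections. $R_{\mathrm{free}}$ is $R_{\mathrm{work}}$ for ∼7% of the reflections excluded from the refinement. All data were used in the refinement.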
      f Calculated using the program MolProbity (
      • Chen V.B.
      • Arendall 3rd, W.B.
      • Headd J.J.
      • Keedy D.A.
      • Immormino R.M.
      • Kapral G.J.
      • Murray L.W.
      • Richardson J.S.
      • Richardson D.C.
      MolProbity: all-atom structure validation for macromolecular crystallography.
      ).
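The merging statistics defined in footnotes b and c can be illustrated with a short calculation. The following is a minimal sketch, not the data-processing software used for Table 1. It assumes an unweighted mean per reflection (the footnote specifies a weighted mean), uses the conventional square-root multiplicity weighting for Rp.i.m., and skips reflections measured only once. The function name and input layout are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def merging_r_factors(observations):
    """Return (R_merge, R_pim) as fractions for (hkl, intensity) pairs.

    Assumptions (not taken from the paper): the per-reflection mean is an
    unweighted average, and reflections observed only once are skipped
    because they carry no merging information.
    """
    groups = defaultdict(list)
    for hkl, intensity in observations:
        groups[hkl].append(intensity)

    num_merge = 0.0  # sum over hkl and i of |I_i - <I>|
    num_pim = 0.0    # same sum, weighted by sqrt(1 / (n - 1)) per reflection
    denom = 0.0      # sum over hkl and i of |I_i|
    for intensities in groups.values():
        n = len(intensities)
        if n < 2:
            continue
        mean_i = sum(intensities) / n
        deviation = sum(abs(i - mean_i) for i in intensities)
        num_merge += deviation
        num_pim += sqrt(1.0 / (n - 1)) * deviation
        denom += sum(abs(i) for i in intensities)
    return num_merge / denom, num_pim / denom


# Toy data: two reflections measured two and three times, respectively.
toy = [((1, 0, 0), 100.0), ((1, 0, 0), 110.0),
       ((2, 0, 0), 50.0), ((2, 0, 0), 50.0), ((2, 0, 0), 56.0)]
r_merge, r_pim = merging_r_factors(toy)
```

Because the weighting factor is at most 1, Rp.i.m. never exceeds Rmerge for the same data, which matches the ordering of the values reported in Table 1.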
      Figure 2. Structure of U2AF212L bound to the AdML Py tract. A, overall structure of U2AF212L (blue) bound to an oligonucleotide sequence corresponding to the AdML Py tract (5´-UUUU(dU)U(5BrdU)CC, where dU is deoxyuridine, U is colored magenta, and C is yellow). The N/C termini of the polypeptide and 5´/3´ termini of the oligonucleotide are labeled. Cyan spheres mark the Cα positions of the Asn196 and Gly301 residues. B, comparison of the backbone conformations of the U2AF212L–AdML structure with a prior structure bound to an all-uridine Py tract (5´-UU(dU)UU(5BrdU)dUdU; PDB code 5EV3 with the protein shown in grayscale and the polyU tract in magenta). C and D, feature-enhanced electron density maps contoured at 1 σ show well-ordered terminal cytidines for the current U2AF212L-bound AdML RNA (C) compared with the dU counterparts of PDB code 5EV3 (D). E and F, close view of U2AF212L interactions with the terminal nucleotides.
      The overall structure of the U2AF212L–AdML complex is nearly identical to the uridine counterpart (RMSD 0.1 Å for 185 matching Cα atoms of PDB code 5EV3), apart from a large, ∼3 Å shift in the positions of the terminal cytidines (Fig. 2B). These tandem cytidines are well-ordered in an unbiased, feature-enhanced electron density map (
      • Afonine P.V.
      • Moriarty N.W.
      • Mustyakimov M.
      • Sobolev O.V.
      • Terwilliger T.C.
      • Turk D.
      • Urzhumtsev A.
      • Adams P.D.
      FEM: feature-enhanced map.
      ) (Fig. 2C), where they are engaged by a combination of direct and water-mediated hydrogen bonds with U2AF2 Arg146, Arg150, and Asp231 (Fig. 2E). The well-defined contacts with the Asp231 side chain are consistent with the ability of a D231V-variant U2AF2 to alter specificity for the terminal nucleotides of the Py tract (
      • Agrawal A.A.
      • McLaughlin K.J.
      • Jenkins J.L.
      • Kielkopf C.L.
      Structure-guided U2AF65 variant improves recognition and splicing of a defective pre-mRNA.
      ). Despite comparable resolutions and similar crystallization conditions, the terminal uridines of the prior structure are poorly defined in the electron density (Fig. 2D). A Gln147 side chain that mediates hydrogen bonds with the uracil base edges (Fig. 2F) has been displaced by arginine side chains in the current cytidine-bound structure. Although a prior deoxyribose substitution may contribute structural differences at the terminal nucleotides, the base-specific interactions and absence of 2′ hydroxyl contacts support the conclusion that U2AF2 preferentially secures the tandem cytidines of the AdML Py tract compared with the uridine counterpart.
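The RMSD values quoted for matching Cα atoms follow the usual root-mean-square definition. As a minimal sketch, assuming the two coordinate sets are already superimposed and listed in the same residue order (the superposition step and the software used for the reported values are not reproduced here), the calculation is as follows. The function name is hypothetical.

```python
from math import sqrt


def ca_rmsd(coords_a, coords_b):
    """RMSD (same units as the input) between paired Cα coordinates.

    Assumes both lists hold (x, y, z) tuples for matching residues and
    that the structures have already been superimposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return sqrt(squared / len(coords_a))


# Two atoms, each displaced by 0.1 Å along z in opposite directions.
example = ca_rmsd([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
                  [(0.0, 0.0, 0.1), (1.0, 0.0, -0.1)])
```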

      Structure of N196K U2AF212L bound to the AdML Py tract

      To view the structural changes caused by a representative cancer-relevant substitution of the U2AF2 RRM1, we determined the crystal structure of an N196K-substituted U2AF212L bound to the AdML Py tract at 1.4 Å resolution (Table 1). This N196K substitution is among the most common U2AF2 mutations, resulting from A → T transversions in four AML patients (COSMIC code COSU544). The precipitant (malonate, pH 7.0) and crystal packing environment are similar to the WT AdML counterpart described above, such that structural changes can be attributed to the amino acid substitution. The Asn196 residue is located in a loop of the N-terminal RRM1 near the bound oligonucleotide and at the RRM1/RRM2 interface (Fig. 2A). Apart from local movement of this loop region, the overall structure of the N196K-substituted protein remains similar to the WT complex (RMSD 0.6 Å between 197 matching Cα atoms). In the WT complex, the Asn196 side chain is poorly ordered and modeled as two alternative conformations, one of which mediates a hydrogen bond with the uracil-O2 atom (Fig. 3, A and C). By contrast, the mutant lysine side chain is well-defined in the electron density (Fig. 3, B and D). While remaining well-defined, the Ser294 residue shows evidence of two alternative conformations in the N196K-substituted structure. Rather than directly interacting with the nucleotide, the position of the Lys196-containing loop has shifted to achieve a hydrogen bond with one conformation of the Ser294 backbone carbonyl in the opposite RRM. This interaction could potentially stabilize the open U2AF2 conformation for association with uridine-rich Py tracts.
      Figure 3. U2AF212L N196K interactions with bound oligonucleotide. A feature-enhanced electron density map contoured at 1 σ shows A, multiple conformations (a/b) of the WT N196 side chain. B, the Lys196 variant is well-defined but introduces Ser294 alternative conformations. C and D, view of U2AF212L interactions at these sites. The electron density map is colored to indicate the following: oligonucleotide, purple; Asn196 or Lys196 residues, cyan; other residues, marine.

      Structure of G301D U2AF212L bound to the AdML Py tract

      We next investigated the structural changes caused by a representative cancer-relevant substitution of the U2AF2 RRM2 by determining the crystal structure of a G301D-substituted U2AF212L bound to the AdML Py tract at 1.7 Å resolution (Table 1). This G301D substitution results from a G → A transition identified in colon adenocarcinoma and castration-resistant prostate carcinoma patients (
      • Cancer Genome Atlas, N
      Comprehensive molecular characterization of human colon and rectal cancer.
      ,
      • Chen E.J.
      • Sowalsky A.G.
      • Gao S.
      • Cai C.
      • Voznesensky O.
      • Schaefer R.
      • Loda M.
      • True L.D.
      • Ye H.
      • Troncoso P.
      • Lis R.L.
      • Kantoff P.W.
      • Montgomery R.B.
      • Nelson P.S.
      • Bubley G.J.
      • et al.
      Abiraterone treatment in castration-resistant prostate cancer selects for progesterone responsive mutant androgen receptors.
      ). A related G301S substitution also occurs in papillary renal cell carcinoma (International Cancer Genome Consortium code DO48476). The crystallization conditions and packing environment of the G301D structure remained similar to the N196K and unmodified counterparts. The Gly301 residue is located preceding the third β-strand of RRM2 and adjacent to the first uridine of the bound oligonucleotide (Fig. 2A). There, the Gly301 residue appears to serve a structural role and lacks direct contacts with the bound oligonucleotide (Fig. 4, A and C). Overall, the protein backbone remains unchanged by the G301D substitution (RMSD 0.2 Å between 188 matching Cα atoms). The mutant Asp301 side chain is anchored by hydrogen bonds to the Lys328 and Asn268 side chains of neighboring RRM2 loops, from which it displaces an ordered water molecule of the Gly301 structure (Fig. 4, B and D). The partial negative charge of the acidic Asp301 side chain is located within van der Waals packing distance of the electronegative terminal phosphate of the oligonucleotide (3.7 Å oxygen–oxygen). This close proximity is expected to disfavor U2AF2–RNA association.
      Figure 4. U2AF212L G301D interactions with bound oligonucleotide. A, a feature-enhanced electron density map contoured at 1 σ shows the WT Gly301 side chain interacting with an alternative conformation (a/b) of Asn268 and two ordered water molecules. B, the Asp301 variant displaces one water molecule and instead forms direct hydrogen bonds with a single Asn268 conformation and the Lys328 side chain. The terminal phosphates have been omitted for clarity. C and D, view of U2AF212L interactions at these sites. The electron density map is colored to indicate the following: oligonucleotide, purple; Gly301 or Asp301 residues, cyan; other residues, marine blue; waters, red spheres.

      N196K and G301D substitutions alter U2AF212L affinity for the AdML Py tract

      Based on the structures of N196K and G301D U2AF212L, we predicted that these amino acid substitutions would influence the RNA-binding affinities of the mutated proteins. To test this prediction, we measured the fluorescence anisotropy changes during titration of the WT and mutant proteins into a consensus 3´ splice site labeled with 5´ fluorescein and fit the apparent binding affinities (Fig. 5). The N196K substitution increased the apparent binding affinity of U2AF212L for this splice-site RNA by approximately 4-fold. Conversely, the G301D substitution decreased the RNA-binding affinity of U2AF212L by nearly 12-fold. The magnitudes of the mutation-induced effects on U2AF212L–RNA association (∼1–1.5 kcal mol−1) agree with the positive charge and inter-RRM contacts of the Lys196 residue, as well as with the Asp301 negative charge introduced near the phosphodiester backbone.
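The relationship between a fold change in apparent affinity and the corresponding free-energy difference can be checked with a short calculation using |ΔΔG| = RT ln(fold change). This is a sketch under an assumed temperature of 298.15 K, which is not stated in this section; the function name is hypothetical.

```python
from math import log

# Gas constant in kcal mol^-1 K^-1.
GAS_CONSTANT_KCAL = 1.987204e-3


def ddg_from_fold_change(fold_change, temperature_k=298.15):
    """Magnitude of the binding free-energy change (kcal/mol).

    Uses |ddG| = R * T * ln(fold change in KD). The default temperature
    is an assumption for illustration, not a value from the paper.
    """
    if fold_change <= 0:
        raise ValueError("fold_change must be positive")
    return GAS_CONSTANT_KCAL * temperature_k * log(fold_change)


ddg_n196k = ddg_from_fold_change(4.0)   # about 4-fold tighter binding
ddg_g301d = ddg_from_fold_change(12.0)  # nearly 12-fold weaker binding
```

Under this assumption, the two fold changes correspond to roughly 0.8 and 1.5 kcal mol−1, in the neighborhood of the range quoted above. The sign depends on direction: tighter binding lowers the free energy of association, and weaker binding raises it.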
      Figure 5. N196K and G301D cancer-associated mutations alter U2AF2–RNA-binding affinity. A, representative fluorescence anisotropy curves for U2AF212L variants binding a 5´-fluorescein-labeled, 32mer RNA oligonucleotide containing a near-consensus 3´ splice site (5´-CCUGUCCCUUUUUUUUUUUUAGGUCCUGGGCA, with the AG consensus underlined). B, bar graph of binding affinities. The average apparent equilibrium dissociation constants (KD) and standard deviation of three replicates are inset. The fluorescence emission intensities remained similar throughout the titrations. Two-tailed unpaired t tests with Welch's correction of the average values from three experiments were calculated for the mutants compared with WT in GraphPad Prism: *, p < 0.05; ***, p < 0.0005. WT, grayscale; N196K, teal; G301D, red.
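To illustrate how an apparent KD relates to an anisotropy titration curve, the sketch below evaluates a single-site binding model that accounts for depletion of free protein. The fitting equation actually used for Figure 5 is not given in this section, so the model, function names, and parameters here are assumptions. Because the emission intensities remained similar throughout the titrations, the observed anisotropy is treated as a linear, fraction-weighted average of the free and bound values.

```python
from math import sqrt


def fraction_bound(protein_total, rna_total, kd):
    """Fraction of labeled RNA in complex for 1:1 binding (quadratic solution).

    All three arguments share the same concentration units.
    """
    b = protein_total + rna_total + kd
    complex_conc = (b - sqrt(b * b - 4.0 * protein_total * rna_total)) / 2.0
    return complex_conc / rna_total


def anisotropy(protein_total, rna_total, kd, r_free, r_bound):
    """Predicted anisotropy, assuming equal brightness of free and bound RNA."""
    f = fraction_bound(protein_total, rna_total, kd)
    return r_free + (r_bound - r_free) * f


# With the labeled RNA far below KD, half-saturation occurs near [protein] = KD.
half_point = fraction_bound(protein_total=1.0, rna_total=1e-6, kd=1.0)
```

Fitting this model to measured data would additionally require a least-squares routine, which is omitted here.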

      N196K and G301D substitutions alter splicing of a U2AF2-responsive minigene model

      We hypothesized that the altered RNA binding caused by the N196K and G301D substitutions would in turn affect splice-site selection. We first tested this hypothesis by RT-PCR and quantitative real-time RT-PCR of a well-characterized pyPY minigene (Fig. 6 and Fig. S1A), which comprises IgM M1 and partial M2 exons fused to an intron and exon from AdML (
      • Pacheco T.R.
      • Coelho M.B.
      • Desterro J.M.
      • Mollet I.
      • Carmo-Fonseca M.
      In vivo requirement of the small subunit of U2AF for recognition of a weak 3′ splice site.
      ) (Fig. 6A). These alternative 3´ splice sites are marked by uridine-poor (py) or uridine-rich (PY) Py tracts. Because the PY sequence of the minigene corresponds to the AdML prototype used for our U2AF212L co-crystal structures, this reporter is well-suited for evaluating U2AF2 structure–function relationships. In our cell line (HEK 293T) and culture conditions (“Experimental procedures”), the pyPY transcript was primarily unspliced (Fig. 6B). A small amount of splicing was detected at the strong, consensus PY site. A minor product of a cryptic splice site corresponds in size to a proximal AG, which lacks a distinguishable Py tract and is not expected to respond to U2AF2 levels. Transfection of a plasmid expressing WT U2AF2 increased splicing, particularly at the weak py site as noted previously (
      • Agrawal A.A.
      • Salsi E.
      • Chatrikhi R.
      • Henderson S.
      • Jenkins J.L.
      • Green M.R.
      • Ermolenko D.N.
      • Kielkopf C.L.
      An extended U2AF65-RNA-binding domain recognizes the 3′ splice site signal.
      ). Expression of the N196K-substituted U2AF2 also increased use of the py splice site, consistent with a gain in RNA-binding affinity. Conversely, expression of the G301D-substituted U2AF2 reduced py splicing nearly to background levels. These results agree with the differences in structure and RNA-binding affinities of the two mutant proteins, although downstream consequences for other interactions (e.g. regulation of U2AF1) may play roles in the altered splicing.
      Figure 6. N196K and G301D mutations of U2AF2 alter splicing of a minigene prototype. A, schematic diagram of the pyPY minigene. The respective central and distal exons are preceded by either the uridine-poor 3´ splice site (py) of the IgM transcript or the uridine-rich AdML splice site (PY). The sequences of the intron preceding each 3´ splice site are shown above for py and below for PY. B, representative RT-PCR of pyPY transcripts from HEK293T cells stably expressing the pyPY minigene and transfected with either WT or the indicated mutant U2AF2. A cryptic splice site resulting in an ∼330-bp band represents an “AG” consensus closest to the 5´ splice-site donor. This site lacks a detectable Py tract and remains unchanged by U2AF2 expression, unlike the U2AF2-sensitive py and PY 3´ splice sites. STD, molecular size standards. C and D, quantitative real-time PCR analysis of the relative expression levels of the py (C) and PY (D) isoforms. Two-tailed unpaired t tests with Welch's correction of the average values from three experiments were calculated for the mutants compared with WT in GraphPad Prism: n.s., not significant (p > 0.05); *, p < 0.05; **, p < 0.005. Immunoblots of the transfected samples are shown in .
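The unpaired t test with Welch's correction named in the figure legends does not assume equal variances between groups. The sketch below computes the Welch t statistic and the Welch–Satterthwaite degrees of freedom with the Python standard library only. It is an illustration with made-up numbers, not a reproduction of the GraphPad Prism analysis, and it stops short of the p value, which requires the Student t distribution.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom).

    Each sample needs at least two values and nonzero spread in at least
    one group. The two-tailed p value is not computed here.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard errors of each group mean (sample variance / n).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t_stat = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    dof = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t_stat, dof


# Hypothetical triplicate measurements for two conditions.
t_stat, dof = welch_t([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

With three replicates per group, as in the legends, the effective degrees of freedom fall between 2 and 4 depending on how unequal the two variances are.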

      N196K and G301D substitutions of U2AF2 alter splicing of representative endogenous pre-mRNAs

      We next tested the influence of expressing N196K- and G301D-substituted U2AF2 on splicing of representative endogenous pre-mRNAs in a human cell line (HEK 293T) (Fig. 7). We focused on three transcripts known to exhibit U2AF2-responsive exon-skipping (
      • Shao C.
      • Yang B.
      • Wu T.
      • Huang J.
      • Tang P.
      • Zhou Y.
      • Zhou J.
      • Qiu J.
      • Jiang L.
      • Li H.
      • Chen G.
      • Sun H.
      • Zhang Y.
      • Denise A.
      • Zhang D.E.
      • et al.
      Mechanisms for U2AF to define 3′ splice sites and regulate alternative splicing in the human genome.
      ): GSK3B, THYN1, and SAT1. To mimic the heterozygous context of acquired mutations in cancers, we overexpressed the WT, N196K, or G301D variant in the presence of endogenous U2AF2. For comparison, we examined the effects of siRNA-mediated reductions in U2AF2 levels (Fig. S2). As expected (
      • Shao C.
      • Yang B.
      • Wu T.
      • Huang J.
      • Tang P.
      • Zhou Y.
      • Zhou J.
      • Qiu J.
      • Jiang L.
      • Li H.
      • Chen G.
      • Sun H.
      • Zhang Y.
      • Denise A.
      • Zhang D.E.
      • et al.
      Mechanisms for U2AF to define 3′ splice sites and regulate alternative splicing in the human genome.
      ), loss of U2AF2 increased inclusion of the GSK3B cassette exon and decreased inclusion of THYN1 and SAT1 cassette exons. The N196K and G301D substitutions led to subtle but reproducible changes in splicing (Fig. 7). Overexpression of the G301D-mutant U2AF2 had an effect similar to, but smaller than, that of U2AF2 knockdown, supporting the idea that this mutant can stall the splicing process. The N196K variant either slightly enhanced splicing of the GSK3B, THYN1, and SAT1 sites or had effects similar to those of WT U2AF2, consistent with its RNA-binding properties. Together with the alterations in pyPY splicing, these differences demonstrate that the N196K- and G301D-mutant U2AF2 can influence splicing of endogenous gene transcripts even in the presence of the normal U2AF2 counterpart.
      Figure 7. N196K and G301D mutations of U2AF2 alter splicing of representative transcripts. A–C, schematic diagrams of the indicated cassette exons from representative GSK3B, THYN1, or SAT1 transcripts. Splicing of these sites in HEK 293T cells, transfected with either empty control vector (pCMV) or plasmids expressing WT U2AF2 or the N196K or G301D variants, was analyzed by D–F, RT-PCR followed by agarose gel electrophoresis with ethidium bromide staining or G–I, quantitative real-time RT-PCR normalized to GAPDH. STD, molecular size standards. Immunoblots are shown in , corresponding analyses of U2AF2 knockdown samples are shown in , and primer sequences are listed in . Two-tailed unpaired t tests with Welch's correction of the average values from three experiments were calculated for the mutants compared with WT in GraphPad Prism: n.s., not significant (p > 0.05); *, p < 0.05; **, p < 0.005; ***, p < 0.0005.

      Discussion

      A number of the cancer-associated mutations of U2AF2 affect residues at the RNA interface of the U2AF2 RRMs (
      • Glasser E.
      • Agrawal A.A.
      • Jenkins J.L.
      • Kielkopf C.L.
      Cancer-associated mutations mapped on high-resolution structures of the U2AF2 RNA recognition motifs.
      ). In the present study, we demonstrate structural and functional consequences for two different representatives of this mutational class, namely a leukemia-associated N196K mutation and a solid tumor–associated G301D mutation. The positively charged N196K mutation of RRM1 favors U2AF2 binding to the negatively charged RNA. The mutant lysine also promotes a local conformational change and mediates a hydrogen bond bridge to the neighboring RRM2, which appears to promote the open U2AF2 conformation for RNA binding. Accordingly, the N196K mutation increases the RNA-binding affinity of U2AF2. The G301D mutation, on the other hand, introduces a negative charge abutting the 5´ terminal phosphate of the bound oligonucleotide in the crystal structure. Although the G301D protein and RNA conformations remain similar to the WT structure, the aspartate side chain is expected to repel the neighboring phosphate. In support of this conclusion, the G301D mutation penalizes the RNA-binding affinity of U2AF2.
      For a prototypical pyPY minigene, expressing the N196K variant evokes an effect similar to that of WT U2AF2, most likely by binding to and activating the weak py tract. Conversely, the G301D-substituted U2AF2 is unable to activate splicing of the pyPY minigene, in agreement with its reduced RNA-binding affinity. Because U2AF2 in turn regulates expression of its heterodimeric U2AF1 subunit and likely other splicing factors (
      • Shao C.
      • Yang B.
      • Wu T.
      • Huang J.
      • Tang P.
      • Zhou Y.
      • Zhou J.
      • Qiu J.
      • Jiang L.
      • Li H.
      • Chen G.
      • Sun H.
      • Zhang Y.
      • Denise A.
      • Zhang D.E.
      • et al.
      Mechanisms for U2AF to define 3′ splice sites and regulate alternative splicing in the human genome.
      ,
      • Pacheco T.R.
      • Coelho M.B.
      • Desterro J.M.
      • Mollet I.
      • Carmo-Fonseca M.
      In vivo requirement of the small subunit of U2AF for recognition of a weak 3′ splice site.
      ), it is possible that the U2AF2-associated effects on pyPY splicing are indirect. However, the clear response of pyPY splicing to U2AF2 levels and substitutions is consistent with a primarily direct effect under the conditions of this experiment. The splicing of endogenous transcripts in human cells is even more complex because of competing factors and coupled processes such as transcription and polyadenylation. Moreover, we expressed the cancer-associated U2AF2 variants in the presence of normal U2AF2 to mimic the expected heterozygous state of cancer cells that have acquired the U2AF2 mutations. Nevertheless, expression of the N196K or G301D U2AF2 variants subtly but detectably alters splicing of endogenous transcripts. Although small, such changes could destabilize gene expression sufficiently to promote a cancerous state. Moreover, differences in the RNA-binding properties of the mutant proteins appear to affect splicing in different ways that could contribute to separate disease outcomes (e.g. AML for N196K versus colorectal/prostate carcinomas for G301D variants).
      An ongoing challenge in the field of personalized medicine is to distinguish cancer “driver” mutations from the millions of neutral variants that have been documented in nearly every human gene (
      • Lawrence M.S.
      • Stojanov P.
      • Mermel C.H.
      • Robinson J.T.
      • Garraway L.A.
      • Golub T.R.
      • Meyerson M.
      • Gabriel S.B.
      • Lander E.S.
      • Getz G.
      Discovery and saturation analysis of cancer genes across 21 tumour types.
      ). For example, inherited single-nucleotide variants can in some cases predispose carriers to cancers yet more often are simply neutral passengers. Although we did not detect significant growth differences for HEK 293T cells expressing the mutant U2AF2 proteins in the time frame of our experiments (data not shown), the WT U2AF2 allele is essential for cell viability, and accordingly, its acquired mutations typically are heterozygous. Our finding that the N196K or G301D variants of U2AF2 change splicing of representative gene transcripts, coupled with the apparent absence of N196K- or G301D-encoding mutations among inherited U2AF2 single-nucleotide variants (
      • Wang L.
      • Lawrence M.S.
      • Wan Y.
      • Stojanov P.
      • Sougnez C.
      • Stevenson K.
      • Werner L.
      • Sivachenko A.
      • DeLuca D.S.
      • Zhang L.
      • Zhang W.
      • Vartanov A.R.
      • Fernandes S.M.
      • Goldstein N.R.
      • Folco E.G.
      • et al.
      SF3B1 and other novel cancer genes in chronic lymphocytic leukemia.
      ), suggests that critical functional consequences could prevent passage of these mutations through the germline. Taken together, these findings support the conclusion that the N196K or G301D mutations of U2AF2 are capable of contributing to the oncogenic dysregulation of gene expression.
      Conversely, mutations that affect mutual interfaces of distinct subunits can have analogous functional consequences and cause the same disease, i.e. “guilt by association” (
      • Wang X.
      • Wei X.
      • Thijssen B.
      • Das J.
      • Lipkin S.M.
      • Yu H.
      Three-dimensional reconstruction of protein networks provides insight into human genetic disease.
      ,
      • Oliver S.
      Guilt-by-association goes global.
      ,
      • Sahni N.
      • Yi S.
      • Taipale M.
      • Fuxman Bass J.I.
      • Coulombe-Huntington J.
      • Yang F.
      • Peng J.
      • Weile J.
      • Karras G.I.
      • Wang Y.
      • Kovács I.A.
      • Kamburov A.
      • Krykbaeva I.
      • Lam M.H.
      • Tucker G.
      • et al.
      Widespread macromolecular interaction perturbations in human genetic disorders.
      ). Because U2AF2 contacts the majority of 3´ splice sites (
      • Shao C.
      • Yang B.
      • Wu T.
      • Huang J.
      • Tang P.
      • Zhou Y.
      • Zhou J.
      • Qiu J.
      • Jiang L.
      • Li H.
      • Chen G.
      • Sun H.
      • Zhang Y.
      • Denise A.
      • Zhang D.E.
      • et al.
      Mechanisms for U2AF to define 3′ splice sites and regulate alternative splicing in the human genome.
      ), a mutant Py tract signal would have little impact on the transcriptome compared with a mis-sense mutation of the U2AF2 protein itself. However, acquired mutations affecting protein partners of U2AF2 (including U2AF1, SF3B1, SF1, or RNA unwindases) could trigger similar downstream effects. Accordingly, our crystal structure suggests that the AML-associated N196K substitution stabilizes the open U2AF2 conformation, as observed for the AML/myelodysplasia–associated S34F mutation of U2AF1 in complex with a subset of splice sites (
      • Warnasooriya C.
      • Feeney C.F.
      • Laird K.M.
      • Ermolenko D.N.
      • Kielkopf C.L.
      A splice site-sensing conformational switch in U2AF2 is modulated by U2AF1 and its recurrent myelodysplasia-associated mutation.
      ). As a second example, the cancer-associated K700E mutation of SF3B1 is expected to weaken SF3B1–RNA contacts and modulate RNA unwindases (
      • Jenkins J.L.
      • Kielkopf C.L.
      Splicing factor mutations in myelodysplasias: Insights from spliceosome structures.
      ,
      • Tang Q.
      • Rodriguez-Santiago S.
      • Wang J.
      • Pu J.
      • Yuste A.
      • Gupta V.
      • Moldón A.
      • Xu Y.Z.
      • Query C.C.
      SF3B1/Hsh155 HEAT motif mutations affect interaction with the spliceosomal ATPase Prp5, resulting in altered branch site selectivity in pre-mRNA splicing.
      ,
      • Carrocci T.J.
      • Zoerner D.M.
      • Paulson J.C.
      • Hoskins A.A.
      SF3b1 mutations associated with myelodysplastic syndromes alter the fidelity of branchsite selection in yeast.
      ), which could mimic the G301D-dependent destabilization of the U2AF2–Py tract complex. Third, a search using cBioPortal (
      • Gao J.
      • Aksoy B.A.
      • Dogrusoz U.
      • Dresdner G.
      • Gross B.
      • Sumer S.O.
      • Sun Y.
      • Jacobsen A.
      • Sinha R.
      • Larsson E.
      • Cerami E.
      • Sander C.
      • Schultz N.
      Integrative analysis of complex cancer genomics and clinical profiles using the cBioPortal.
      ,
      • Cerami E.
      • Gao J.
      • Dogrusoz U.
      • Gross B.E.
      • Sumer S.O.
      • Aksoy B.A.
      • Jacobsen A.
      • Byrne C.J.
      • Heuer M.L.
      • Larsson E.
      • Antipin Y.
      • Reva B.
      • Goldberg A.P.
      • Sander C.
      • Schultz N.
      The cBio cancer genomics portal: an open platform for exploring multidimensional cancer genomics data.
      ) indicates that SF1 alterations recur among castration-resistant prostate cancers and colorectal adenocarcinomas, which are the same cancer types associated with the U2AF2 G301D mutation. Indeed, the most common mis-sense mutations of SF1 (R255Q/W and R135C/H) are located at its RNA interface, which is expected to reduce RNA binding to the ternary SF1–U2AF2–U2AF1 complex, as would the G301D substitution of U2AF2. These observations suggest that in certain contexts, the long-tail N196K and G301D U2AF2 mutations may evoke consequences similar to those of more common mutations in other splicing factors, thereby dysregulating pre-mRNA splicing and contributing to neoplastic transformation.
      In conclusion, the results presented here offer a concrete molecular mechanism for N196K and G301D mis-sense mutations of U2AF2 to contribute to the progression of malignancies by altering splice-site signal recognition. By analogy, other known 3D interfaces of the U2AF2 protein can explain the enrichment of cancer-associated mutations in its splicing factor partners. Taken collectively, these examples of long-tail splicing factor mutations represent a source of pre-mRNA splicing aberrations among cancers that may be more widespread than apparent based on inspection of the relatively rare, individual occurrences. Beyond the N196K/G301D-containing cluster of mutations at the U2AF2–RNA interface, other subsets of cancer-associated U2AF2 mutations are located at the inter-RRM interface of its closed conformation or in the C-terminal SF1/SF3B1-interaction motif. More studies are needed to distinguish the relevance of other long-tail mutations for U2AF2 structure, function, and associated cancers. Meanwhile, our confirmation that the cancer-associated N196K and G301D mutations modify the structural and functional properties of U2AF2 raises the possibility of targeting U2AF2 and its partners as a potential means to investigate and treat leukemias and cancers.

      Experimental procedures

      Expression and purification

      For crystallization and RNA-binding experiments, the WT, N196K, or G301D variants of the U2AF2 RNA-binding domain included the N/C-terminal extensions of RRM1/RRM2, RRM1, RRM2, and the inter-RRM linker (residues 141–342 of NCBI RefSeq NP_009210). These U2AF212L proteins were expressed and purified as described (
      • Agrawal A.A.
      • Salsi E.
      • Chatrikhi R.
      • Henderson S.
      • Jenkins J.L.
      • Green M.R.
      • Ermolenko D.N.
      • Kielkopf C.L.
      An extended U2AF65-RNA-binding domain recognizes the 3′ splice site signal.
      ). Following a final step of size-exclusion chromatography on a Superdex-75 prep-grade column (Cytiva Inc.) equilibrated with 100 mM NaCl, 15 mM HEPES, pH 6.8, 0.2 mM TCEP, the purified U2AF212L was concentrated using a Vivaspin 15R (Sartorius Corp.) centrifugal concentrator with a 10-kDa molecular mass cutoff. The protein concentration was estimated using the calculated extinction coefficient of 8,940 M−1 cm−1 and absorbance at 280 nm. Purified, deprotected oligonucleotides for co-crystallization were purchased from Integrated DNA Technologies Inc. Purified, fluorescein-labeled RNA oligonucleotides were purchased from Horizon Discovery Ltd. and deprotected according to the manufacturer's instructions.
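
      As a worked illustration of the concentration estimate described above, the following minimal sketch applies the Beer–Lambert law with the extinction coefficient quoted in the text. The absorbance reading, dilution factor, path length, and molecular mass in the example are hypothetical placeholders chosen for illustration, not values reported in this study.

```python
# Extinction coefficient quoted in the text, in M^-1 cm^-1.
EXTINCTION_COEFF_M_CM = 8940.0


def molar_concentration(a280: float,
                        path_length_cm: float = 1.0,
                        dilution_factor: float = 1.0) -> float:
    """Beer-Lambert law: c = A / (epsilon * l), corrected for sample dilution."""
    if a280 < 0 or path_length_cm <= 0 or dilution_factor <= 0:
        raise ValueError("absorbance must be >= 0; path length and dilution must be > 0")
    return a280 * dilution_factor / (EXTINCTION_COEFF_M_CM * path_length_cm)


def mg_per_ml(molar: float, molecular_mass_da: float) -> float:
    """Convert mol/L to mg/mL: (mol/L) * (g/mol) = g/L, which equals mg/mL."""
    return molar * molecular_mass_da


# Hypothetical reading: A280 = 0.447 measured on a 20-fold diluted sample.
example_molar = molar_concentration(0.447, dilution_factor=20.0)

# Hypothetical molecular mass (placeholder only; not taken from the paper).
example_mg_ml = mg_per_ml(example_molar, molecular_mass_da=23000.0)

print(f"{example_molar * 1e3:.2f} mM, {example_mg_ml:.1f} mg/mL")
```

      Keeping the dilution factor explicit makes it harder to forget the correction when a concentrated stock is diluted to bring its absorbance into the linear range of the instrument.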

      Crystallization and structure determination

      Prior to crystallization, WT, N196K, or G301D U2AF212L variants were mixed in a 1:1.2 molar ratio with purified oligonucleotide (5´-UUUU(dU)U(5BrdU)CC-3´) and incubated on ice for 20 min. The final protein concentration was ∼20 mg ml−1. Diffraction quality crystals were obtained within approximately 1 week from a hanging drop of 1 μl of macromolecule layered with 1 μl of precipitant equilibrated over a 0.7-ml reservoir at 4 °C. The precipitant of the WT complex was 1 M succinic acid, 0.1 M HEPES, pH 7.0, 3% (w/v) PEG monomethyl ether 2000. The precipitant of the N196K or G301D variants was 0.24 M sodium malonate, pH 7.0, 5% sucrose, and 20–25% (w/v) PEG 3350. Additionally, 0.1 μl of 5% (w/v) LDAO detergent (Hampton Research) was added to the N196K or G301D protein-oligonucleotide mixtures immediately prior to the precipitant solution. Each crystal was coated with a mixture of 1:1 (v/v) paratone-N and silicone oil and then flash-cooled in liquid nitrogen before data collection at 100 K. Crystallographic data sets were collected remotely at the Stanford Synchrotron Radiation Light Source Beamline 12-2 (
      • Soltis S.M.
      • Cohen A.E.
      • Deacon A.
      • Eriksson T.
      • González A.
      • McPhillips S.
      • Chui H.
      • Dunten P.
      • Hollenbeck M.
      • Mathews I.
      • Miller M.
      • Moorhead P.
      • Phizackerley R.P.
      • Smith C.
      • Song J.
      • et al.
      New paradigm for macromolecular crystallography experiments at SSRL: automated crystal screening and remote data collection.
      ). The data were processed using the Stanford Synchrotron Radiation Light Source AUTOXDS script (A. Gonzalez and Y. Tsai) implementation of XDS (
      • Kabsch W.
      Integration, scaling, space-group assignment and post-refinement.
      ) and CCP4 packages (
      • Winn M.D.
      • Ballard C.C.
      • Cowtan K.D.
      • Dodson E.J.
      • Emsley P.
      • Evans P.R.
      • Keegan R.M.
      • Krissinel E.B.
      • Leslie A.G.
      • McCoy A.
      • McNicholas S.J.
      • Murshudov G.N.
      • Pannu N.S.
      • Potterton E.A.
      • Powell H.R.
      • et al.
      Overview of the CCP4 suite and current developments.
      ). The structures were determined using the Fourier synthesis method starting from PDB code 5EV3. The models were adjusted using COOT (
      • Emsley P.
      • Lohkamp B.
      • Scott W.G.
      • Cowtan K.
      Features and development of Coot.
      ) and refined using PHENIX (
      • Adams P.D.
      • Afonine P.V.
      • Bunkoczi G.
      • Chen V.B.
      • Davis I.W.
      • Echols N.
      • Headd J.J.
      • Hung L.W.
      • Kapral G.J.
      • Grosse-Kunstleve R.W.
      • McCoy A.J.
      • Moriarty N.W.
      • Oeffner R.
      • Read R.J.
      • Richardson D.C.
      • et al.
      PHENIX: a comprehensive Python-based system for macromolecular structure solution.
      ). The crystallographic data and refinement statistics are given in Table 1.

      Fluorescence anisotropy RNA-binding assays

      Protocols for the RNA-binding experiments were essentially as described (
      • Jenkins J.L.
      • Shen H.
      • Green M.R.
      • Kielkopf C.L.
      Solution conformation and thermodynamic characteristics of RNA binding by the splicing factor U2AF65.
      ). A 5´-fluorescein–labeled, 32-mer RNA oligonucleotide contained a near-consensus 3´ splice site (5´-CCUGUCCCUUUUUUUUUUUUAGGUCCUGGGCA, with the AG consensus underlined). The purified proteins and RNA were diluted separately by >100-fold into a binding buffer comprising 100 mM NaCl, 15 mM HEPES at pH 6.8, 0.2 mM TCEP, 0.1 unit ml−1 Superase-In™ (Invitrogen). The final RNA concentration in the cuvette was 30 nM. The volume changes during addition of the protein were <10% to minimize dilution effects. The fluorescence anisotropy changes during titration were measured using a FluoroMax-3 spectrophotometer and temperature-controlled at 23 °C by a circulating water bath. The samples were excited at 490 nm, and the emission intensities were recorded at 520 nm with a slit width of 5 nm. The fluorescence emission spectra also were monitored for similarity throughout the experiment. Each titration was fit as described (
      • Jenkins J.L.
      • Shen H.
      • Green M.R.
      • Kielkopf C.L.
      Solution conformation and thermodynamic characteristics of RNA binding by the splicing factor U2AF65.
      ) to obtain the apparent equilibrium dissociation constant (KD). These fits and the p values of a two-tailed unpaired t test with Welch's correction were calculated using Prism version 6.0 (GraphPad Software Inc.). The apparent equilibrium affinities (KA) are the reciprocals of each KD. The average KD or KA values and standard deviation among three replicates for each sample were calculated using Microsoft Excel.
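
      The bookkeeping described above (reciprocal affinities, replicate averages, and Welch's unequal-variance comparison) can be sketched as follows. This is an illustrative sketch using only the Python standard library, not the Prism or Excel workflow used in the study, and the KD values are hypothetical placeholders. The sketch returns the Welch t statistic and the Welch–Satterthwaite degrees of freedom; converting these to a p value requires a t-distribution function from a statistics package.

```python
from math import sqrt
from statistics import mean, stdev, variance


def association_constants(kd_values):
    """Apparent equilibrium affinities: K_A is the reciprocal of each K_D."""
    return [1.0 / kd for kd in kd_values]


def summarize(values):
    """Mean and sample standard deviation among replicates."""
    return mean(values), stdev(values)


def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    # Squared standard error of each sample mean.
    se2_a = variance(sample_a) / na
    se2_b = variance(sample_b) / nb
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1))
    return t, df


# Hypothetical triplicate K_D values in molar units (placeholders only).
kd_reference = [1.0e-6, 1.2e-6, 0.8e-6]
kd_variant = [0.5e-6, 0.6e-6, 0.4e-6]

ka_reference = association_constants(kd_reference)
ka_variant = association_constants(kd_variant)

print("reference K_D mean, SD:", summarize(kd_reference))
print("variant K_A mean, SD:", summarize(ka_variant))
print("Welch t, df (K_D):", welch_t(kd_reference, kd_variant))
```

      Note that the mean of the reciprocals is not the reciprocal of the mean, so averaging the KA values computed per replicate, as described in the text, can differ slightly from inverting the average KD.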

      Cell culture and transfections

      Human embryonic kidney epithelial cells (HEK 293T, ATCC® CRL-3216™) were maintained at 37°C in a humidified atmosphere containing 5% CO2. The cells were cultured in Dulbecco's modified Eagle's medium (Gibco, Thermo Fisher; catalog no. 11995-065 for pyPY experiments or catalog no. 11885-084 to study splicing of endogenous transcripts) supplemented with pyruvate, 10% fetal calf serum (R&D Systems Inc.), and penicillin-streptomycin (Gibco). The N196K- and G301D-encoding mutations were introduced by Genscript in the context of an N-terminally HA-tagged, full-length U2AF2 (NCBI RefSeq NP_009210) in a pCMV5 vector. All constructs were fully sequenced for verification. A stable 293T cell line expressing the pyPY transcript was constructed by co-transfecting the pyPY minigene (28) and pBABE-puro (41) at a 100:1 w/w ratio and selecting for puromycin resistance. The unmodified or pyPY-expressing cells were transfected in 6-well plates at 80% confluency with 0.4 μg of plasmids encoding WT, N196K, or G301D human U2AF2 (NCBI RefSeq NP_009210) buffered with 1.6 μg of empty vector, or with 2.0 μg of empty vector alone as a control, using JetPrime (Polyplus transfection) as instructed by the manufacturer. The cells were harvested 24 h after transfection for RNA and protein analysis. For comparison of U2AF2 knockdown, HEK 293T cells were transfected with Stealth™ siRNAs (Thermo Fisher), targeting either U2AF2 (catalog nos. HSS117616 and HSS117617) or a “Lo GC” control (catalog no. 12935200), and harvested 2 and 3 days post-transfection.

      Immunoblotting

      For total protein analysis, the cells were lysed in a buffer containing 50 mM Tris, pH 8.0, 10 mM EDTA, 1% (w/v) SDS, 1 mM DTT, and protease inhibitors. For immunoblotting (Fig. S1 and Fig. S2G), proteins were separated by SDS-PAGE, transferred to a polyvinylidene difluoride membrane, and immunoblotted with antibodies specific for U2AF2 (Millipore-Sigma catalog no. U4758, 1:1000 dilution), GAPDH (Cell Signaling catalog no. 2118S, 1:1000), or HA (Enzo Life Sciences Inc. catalog no. ADI-MSA-106-E, 1:1000 dilution). Secondary antibodies included anti-mouse IgG horseradish peroxidase or anti-rabbit IgG horseradish peroxidase (Cytiva Inc. catalog nos. NA931 and NA934, 1:5000 dilution). Chemiluminescent signal from Clarity™ Western ECL substrate (Bio-Rad) was detected on a Chemidoc™ touch imaging system (Bio-Rad).

      RT-PCR

      For RT-PCR (Figs. 6B and 7, D-F, and Fig. S2, A-C), total RNA was isolated from the harvested cells using the RNeasy Plus mini kit (Qiagen). cDNAs were synthesized from RNA using Moloney murine leukemia virus RT with random primers (Invitrogen). The RT-PCRs of the pyPY minigene were run for 40 cycles of 94°C for 30 s, 55°C for 10 s, and 72°C for 20 s. The RT-PCR of GSK3B was run for 36 cycles of 94°C for 30 s, 55°C for 10 s, and 72°C for 15 s; SAT1 was run for 32 cycles of 94°C for 30 s, 55°C for 10 s, and 72°C for 15 s; and THYN1 was run for 28 cycles of 94°C for 30 s, 57°C for 10 s, and 72°C for 15 s. The RT-PCR products were separated on a 2% TBE-agarose gel, stained with ethidium bromide, and visualized using a Gel Doc XR+ gel documentation system (Bio-Rad). The quantitative real-time RT-PCRs of cDNAs were run with SYBR™ Green in triplicate using a Bio-Rad CFX thermal cycler, quantified by the standard curve method, and normalized to GAPDH. The primer sequences are listed in Table S1.
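
      For readers unfamiliar with standard-curve quantification, the following minimal sketch shows one common way to carry it out: a line is fit to quantification cycle (Cq) versus log10 of the template amount for a dilution series, unknowns are interpolated from that line, and the target amount is divided by the GAPDH amount from the same sample. The dilution series and Cq values are hypothetical placeholders, and the function names are illustrative; the analysis software and settings used in this study are not specified beyond the description above.

```python
from math import log10


def fit_standard_curve(quantities, cq_values):
    """Least-squares fit of Cq = slope * log10(quantity) + intercept."""
    xs = [log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cq_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cq_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_cq(cq, slope, intercept):
    """Interpolate a relative template amount from a measured Cq."""
    return 10 ** ((cq - intercept) / slope)


def normalized_expression(target_cq, reference_cq, target_curve, reference_curve):
    """Target amount divided by reference (e.g. GAPDH) amount for one sample."""
    target = quantity_from_cq(target_cq, *target_curve)
    reference = quantity_from_cq(reference_cq, *reference_curve)
    return target / reference


# Hypothetical 10-fold dilution series (relative units) and measured Cq values.
dilutions = [1.0, 10.0, 100.0, 1000.0]
curve = fit_standard_curve(dilutions, [30.0, 27.0, 24.0, 21.0])

# Hypothetical sample: target isoform Cq = 24, GAPDH Cq = 27, same curve for both.
ratio = normalized_expression(24.0, 27.0, curve, curve)
print(f"slope={curve[0]:.2f}, intercept={curve[1]:.2f}, normalized ratio={ratio:.1f}")
```

      In practice each primer pair would have its own standard curve, and the technical triplicates would be averaged before normalization.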

      Data availability

      The atomic coordinates and structure factors of the WT, N196K, and G301D variants of U2AF212L bound to AdML oligonucleotide (accession codes 6XLV, 6XLW, and 6XLX) have been deposited at the Protein Data Bank.

      Acknowledgments

      We are grateful to Maria Carmo-Fonseca (University of Lisbon, Portugal) for providing the pyPY and wild-type HA-U2AF2 plasmids, Hartmut Land (University of Rochester) for pBABE-puro, Paul Boutz (University of Rochester) for advice with PCR optimization, and Irimpan Mathews (Stanford Synchrotron Radiation Light Source) for assistance with remote data collection.

      Supplementary Material

      References

        • Wang X.
        • Wei X.
        • Thijssen B.
        • Das J.
        • Lipkin S.M.
        • Yu H.
        Three-dimensional reconstruction of protein networks provides insight into human genetic disease.
        Nat. Biotechnol. 2012; 30 (22252508): 159-164
        • Kamburov A.
        • Lawrence M.S.
        • Polak P.
        • Leshchiner I.
        • Lage K.
        • Golub T.R.
        • Lander E.S.
        • Getz G.
        Comprehensive assessment of cancer missense mutation clustering in protein structures.
        Proc. Natl. Acad. Sci. U.S.A. 2015; 112 (26392535): E5486-E5495
        • Gao M.
        • Zhou H.
        • Skolnick J.
        Insights into disease-associated mutations in the human proteome through protein structural analysis.
        Structure. 2015; 23 (26027735): 1362-1369
        • Shao C.
        • Yang B.
        • Wu T.
        • Huang J.
        • Tang P.
        • Zhou Y.
        • Zhou J.
        • Qiu J.
        • Jiang L.
        • Li H.
        • Chen G.
        • Sun H.
        • Zhang Y.
        • Denise A.
        • Zhang D.E.
        • et al.
        Mechanisms for U2AF to define 3′ splice sites and regulate alternative splicing in the human genome.
        Nat. Struct. Mol. Biol. 2014; 21 (25326705): 997-1005
        • Pankow S.
        • Bamberger C.
        • Calzolari D.
        • Martínez-Bartolomé S.
        • Lavallée-Adam M.
        • Balch W.E.
        • Yates 3rd, J.R.
        ΔF508 CFTR interactome remodelling promotes rescue of cystic fibrosis.
        Nature. 2015; 528 (26618866): 510-516
        • Walerych D.
        • Lisek K.
        • Sommaggio R.
        • Piazza S.
        • Ciani Y.
        • Dalla E.
        • Rajkowska K.
        • Gaweda-Walerych K.
        • Ingallina E.
        • Tonelli C.
        • Morelli M.J.
        • Amato A.
        • Eterno V.
        • Zambelli A.
        • Rosato A.
        • et al.
        Proteasome machinery is instrumental in a common gain-of-function program of the p53 missense mutants in cancer.
        Nat. Cell Biol. 2016; 18 (27347849): 897-909
        • Shim J.E.
        • Kim J.H.
        • Shin J.
        • Lee J.E.
        • Lee I.
        Pathway-specific protein domains are predictive for human diseases.
        PLoS Comput. Biol. 2019; 15 (31075101): e1007052
        • Ashford P.
        • Pang C.S.M.
        • Moya-García A.A.
        • Adeyelu T.
        • Orengo C.A.
        A CATH domain functional family based approach to identify putative cancer driver genes and driver mutations.
        Sci. Rep. 2019; 9 (30670742): 263
        • Chen S.
        • Fragoza R.
        • Klei L.
        • Liu Y.
        • Wang J.
        • Roeder K.
        • Devlin B.
        • Yu H.
        An interactome perturbation framework prioritizes damaging missense mutations for developmental disorders.
        Nat. Genet. 2018; 50 (29892012): 1032-1040
        • Dvinge H.
        • Kim E.
        • Abdel-Wahab O.
        • Bradley R.K.
        RNA splicing factors as oncoproteins and tumour suppressors.
        Nat. Rev. Cancer. 2016; 16 (27282250): 413-430
        • Jenkins J.L.
        • Kielkopf C.L.
        Splicing factor mutations in myelodysplasias: Insights from spliceosome structures.
        Trends Genet. 2017; 33 (28372848): 336-348
        • Tang Q.
        • Rodriguez-Santiago S.
        • Wang J.
        • Pu J.
        • Yuste A.
        • Gupta V.
        • Moldón A.
        • Xu Y.Z.
        • Query C.C.
        SF3B1/Hsh155 HEAT motif mutations affect interaction with the spliceosomal ATPase Prp5, resulting in altered branch site selectivity in pre-mRNA splicing.
        Genes Dev. 2016; 30 (28087715): 2710-2723
        • Carrocci T.J.
        • Zoerner D.M.
        • Paulson J.C.
        • Hoskins A.A.
        SF3b1 mutations associated with myelodysplastic syndromes alter the fidelity of branchsite selection in yeast.
        Nucleic Acids Res. 2017; 45 (28062854): 4837-4852
        • Chang M.T.
        • Asthana S.
        • Gao S.P.
        • Lee B.H.
        • Chapman J.S.
        • Kandoth C.
        • Gao J.
        • Socci N.D.
        • Solit D.B.
        • Olshen A.B.
        • Schultz N.
        • Taylor B.S.
        Identifying recurrent mutations in cancer reveals widespread lineage diversity and mutational specificity.
        Nat. Biotechnol. 2016; 34 (26619011): 155-163
        • Glasser E.
        • Agrawal A.A.
        • Jenkins J.L.
        • Kielkopf C.L.
        Cancer-associated mutations mapped on high-resolution structures of the U2AF2 RNA recognition motifs.
        Biochemistry. 2017; 56 (28850223): 4757-4761
        • Singh R.
        • Valcárcel J.
        • Green M.R.
        Distinct binding specificities and functions of higher eukaryotic polypyrimidine tract-binding proteins.
        Science. 1995; 268 (7761834): 1173-1176
        • Singh R.
        • Banerjee H.
        • Green M.R.
        Differential recognition of the polypyrimidine-tract by the general splicing factor U2AF65 and the splicing repressor Sex-lethal.
        RNA. 2000; 6 (10864047): 901-911
        • Mackereth C.D.
        • Madl T.
        • Bonnal S.
        • Simon B.
        • Zanier K.
        • Gasch A.
        • Rybin V.
        • Valcárcel J.
        • Sattler M.
        Multi-domain conformational selection underlies pre-mRNA splicing regulation by U2AF.
        Nature. 2011; 475 (21753750): 408-411
        • Huang J.R.
        • Warner L.R.
        • Sanchez C.
        • Gabel F.
        • Madl T.
        • Mackereth C.D.
        • Sattler M.
        • Blackledge M.
        Transient electrostatic interactions dominate the conformational equilibrium sampled by multidomain splicing factor U2AF65: a combined NMR and SAXS study.
        J. Am. Chem. Soc. 2014; 136 (24734879): 7068-7076
        • Agrawal A.A.
        • Salsi E.
        • Chatrikhi R.
        • Henderson S.
        • Jenkins J.L.
        • Green M.R.
        • Ermolenko D.N.
        • Kielkopf C.L.
        An extended U2AF65-RNA-binding domain recognizes the 3′ splice site signal.
        Nat. Commun. 2016; 7 (26952537): 10950
        • Voith von Voithenberg L.
        • Sánchez-Rico C.
        • Kang H.S.
        • Madl T.
        • Zanier K.
        • Barth A.
        • Warner L.R.
        • Sattler M.
        • Lamb D.C.
        Recognition of the 3′ splice site RNA by the U2AF heterodimer involves a dynamic population shift.
        Proc. Natl. Acad. Sci. U.S.A. 2016; 113 (27799531): E7169-E7175
        • Jenkins J.L.
        • Laird K.M.
        • Kielkopf C.L.
        A broad range of conformations contribute to the solution ensemble of the essential splicing factor U2AF65.
        Biochemistry. 2012; 51 (22702716): 5223-5225
        • Warnasooriya C.
        • Feeney C.F.
        • Laird K.M.
        • Ermolenko D.N.
        • Kielkopf C.L.
        A splice site-sensing conformational switch in U2AF2 is modulated by U2AF1 and its recurrent myelodysplasia-associated mutation.
        Nucleic Acids Res. 2020; 48 (32343311): 5695-5709
        • Afonine P.V.
        • Moriarty N.W.
        • Mustyakimov M.
        • Sobolev O.V.
        • Terwilliger T.C.
        • Turk D.
        • Urzhumtsev A.
        • Adams P.D.
        FEM: feature-enhanced map.
        Acta Crystallogr. 2015; 71 (25760612): 646-666
        • Agrawal A.A.
        • McLaughlin K.J.
        • Jenkins J.L.
        • Kielkopf C.L.
        Structure-guided U2AF65 variant improves recognition and splicing of a defective pre-mRNA.
        Proc. Natl. Acad. Sci. U.S.A. 2014; 111 (25422459): 17420-17425
        • Cancer Genome Atlas Network
        Comprehensive molecular characterization of human colon and rectal cancer.
        Nature. 2012; 487 (22810696): 330-337
        • Chen E.J.
        • Sowalsky A.G.
        • Gao S.
        • Cai C.
        • Voznesensky O.
        • Schaefer R.
        • Loda M.
        • True L.D.
        • Ye H.
        • Troncoso P.
        • Lis R.L.
        • Kantoff P.W.
        • Montgomery R.B.
        • Nelson P.S.
        • Bubley G.J.
        • et al.
        Abiraterone treatment in castration-resistant prostate cancer selects for progesterone responsive mutant androgen receptors.
        Clin. Cancer Res. 2015; 21 (25320358): 1273-1280
        • Pacheco T.R.
        • Coelho M.B.
        • Desterro J.M.
        • Mollet I.
        • Carmo-Fonseca M.
        In vivo requirement of the small subunit of U2AF for recognition of a weak 3′ splice site.
        Mol. Cell. Biol. 2006; 26 (16940179): 8183-8190
        • Lawrence M.S.
        • Stojanov P.
        • Mermel C.H.
        • Robinson J.T.
        • Garraway L.A.
        • Golub T.R.
        • Meyerson M.
        • Gabriel S.B.
        • Lander E.S.
        • Getz G.
        Discovery and saturation analysis of cancer genes across 21 tumour types.
        Nature. 2014; 505 (24390350): 495-501
        • Wang L.
        • Lawrence M.S.
        • Wan Y.
        • Stojanov P.
        • Sougnez C.
        • Stevenson K.
        • Werner L.
        • Sivachenko A.
        • DeLuca D.S.
        • Zhang L.
        • Zhang W.
        • Vartanov A.R.
        • Fernandes S.M.
        • Goldstein N.R.
        • Folco E.G.
        • et al.
        SF3B1 and other novel cancer genes in chronic lymphocytic leukemia.
        N. Engl. J. Med. 2011; 365 (22150006): 2497-2506
        • Oliver S.
        Guilt-by-association goes global.
        Nature. 2000; 403 (10688178): 601-603
        • Sahni N.
        • Yi S.
        • Taipale M.
        • Fuxman Bass J.I.
        • Coulombe-Huntington J.
        • Yang F.
        • Peng J.
        • Weile J.
        • Karras G.I.
        • Wang Y.
        • Kovács I.A.
        • Kamburov A.
        • Krykbaeva I.
        • Lam M.H.
        • Tucker G.
        • et al.
        Widespread macromolecular interaction perturbations in human genetic disorders.
        Cell. 2015; 161 (25910212): 647-660
        • Gao J.
        • Aksoy B.A.
        • Dogrusoz U.
        • Dresdner G.
        • Gross B.
        • Sumer S.O.
        • Sun Y.
        • Jacobsen A.
        • Sinha R.
        • Larsson E.
        • Cerami E.
        • Sander C.
        • Schultz N.
        Integrative analysis of complex cancer genomics and clinical profiles using the cBioPortal.
        Sci. Signal. 2013; 6 (23550210): pl1
        • Cerami E.
        • Gao J.
        • Dogrusoz U.
        • Gross B.E.
        • Sumer S.O.
        • Aksoy B.A.
        • Jacobsen A.
        • Byrne C.J.
        • Heuer M.L.
        • Larsson E.
        • Antipin Y.
        • Reva B.
        • Goldberg A.P.
        • Sander C.
        • Schultz N.
        The cBio cancer genomics portal: an open platform for exploring multidimensional cancer genomics data.
        Cancer Discov. 2012; 2 (22588877): 401-404
        • Soltis S.M.
        • Cohen A.E.
        • Deacon A.
        • Eriksson T.
        • González A.
        • McPhillips S.
        • Chui H.
        • Dunten P.
        • Hollenbeck M.
        • Mathews I.
        • Miller M.
        • Moorhead P.
        • Phizackerley R.P.
        • Smith C.
        • Song J.
        • et al.
        New paradigm for macromolecular crystallography experiments at SSRL: automated crystal screening and remote data collection.
        Acta Crystallogr. D Biol. Crystallogr. 2008; 64 (19018097): 1210-1221
        • Kabsch W.
        Integration, scaling, space-group assignment and post-refinement.
        Acta Crystallogr. D Biol. Crystallogr. 2010; 66 (20124693): 133-144
        • Winn M.D.
        • Ballard C.C.
        • Cowtan K.D.
        • Dodson E.J.
        • Emsley P.
        • Evans P.R.
        • Keegan R.M.
        • Krissinel E.B.
        • Leslie A.G.
        • McCoy A.
        • McNicholas S.J.
        • Murshudov G.N.
        • Pannu N.S.
        • Potterton E.A.
        • Powell H.R.
        • et al.
        Overview of the CCP4 suite and current developments.
        Acta Crystallogr. D Biol. Crystallogr. 2011; 67 (21460441): 235-242
        • Emsley P.
        • Lohkamp B.
        • Scott W.G.
        • Cowtan K.
        Features and development of Coot.
        Acta Crystallogr. D Biol. Crystallogr. 2010; 66 (20383002): 486-501
        • Adams P.D.
        • Afonine P.V.
        • Bunkoczi G.
        • Chen V.B.
        • Davis I.W.
        • Echols N.
        • Headd J.J.
        • Hung L.W.
        • Kapral G.J.
        • Grosse-Kunstleve R.W.
        • McCoy A.J.
        • Moriarty N.W.
        • Oeffner R.
        • Read R.J.
        • Richardson D.C.
        • et al.
        PHENIX: a comprehensive Python-based system for macromolecular structure solution.
        Acta Crystallogr. D Biol. Crystallogr. 2010; 66 (20124702): 213-221
        • Jenkins J.L.
        • Shen H.
        • Green M.R.
        • Kielkopf C.L.
        Solution conformation and thermodynamic characteristics of RNA binding by the splicing factor U2AF65.
        J. Biol. Chem. 2008; 283 (18842594): 33641-33649
        • Karplus P.A.
        • Diederichs K.
        Linking crystallographic model and data quality.
        Science. 2012; 336 (22628654): 1030-1033
        • Chen V.B.
        • Arendall 3rd, W.B.
        • Headd J.J.
        • Keedy D.A.
        • Immormino R.M.
        • Kapral G.J.
        • Murray L.W.
        • Richardson J.S.
        • Richardson D.C.
        MolProbity: all-atom structure validation for macromolecular crystallography.
        Acta Crystallogr. D Biol. Crystallogr. 2010; 66 (20057044): 12-21