Structure of the super-elongation complex subunit AFF4 C-terminal homology domain reveals requirements for AFF homo- and heterodimerization

AF4/FMR2 family member 4 (AFF4) is the scaffold protein of the multisubunit super-elongation complex, which plays key roles in the release of RNA polymerase II from promoter-proximal pausing and in the transactivation of HIV-1 transcription. AFF4 consists of an intrinsically disordered N-terminal region that interacts with other super-elongation complex subunits and a C-terminal homology domain (CHD) that is conserved among AF4/FMR2 family proteins, including AFF1, AFF2, AFF3, and AFF4. Here, we solved the X-ray crystal structure of the CHD in human AFF4 (AFF4-CHD) to 2.2 Å resolution and characterized its biochemical properties. The structure revealed that AFF4-CHD folds into a novel domain that consists of eight helices and is distantly related to tetratricopeptide repeat motifs. Our analyses further revealed that AFF4-CHD mediates the formation of an AFF4 homodimer or an AFF1-AFF4 heterodimer. Results from fluorescence anisotropy experiments suggested that AFF4-CHD interacts with both RNA and DNA in vitro. Furthermore, we identified a surface loop region in AFF4-CHD as a substrate for the P-TEFb kinase cyclin-dependent kinase 9, which triggers release of polymerase II from promoter-proximal pausing sites. In conclusion, the AFF4-CHD structure and biochemical analyses reported here reveal the molecular basis for the homo- and heterodimerization of AFF proteins and implicate the AFF4-CHD in nucleic acid interactions. The high conservation of the CHD among several other proteins suggests that our results are also relevant for understanding other CHD-containing proteins and their dimerization behavior.

Transcription of protein-coding genes in eukaryotic cells by RNA polymerase (Pol) II is regulated not only during the initiation phase but also during elongation of the RNA chain (1-4). Recent biochemical and structural studies revealed how transcription elongation is regulated via the release of promoter-proximal pausing by the kinase complex P-TEFb (5,6). P-TEFb exists in several forms in cells, including an inactive 7SK snRNP particle, a complex with BRD4, and a complex with scaffolding and elongation factors within a multisubunit super elongation complex (SEC) (7-9). SEC was shown to play key roles in the transactivation of HIV-1 gene transcription and the release of Pol II that is paused in promoter-proximal regions of genes (10-14).
SEC consists of an AF4/FMR2 family protein, either AFF1 or AFF4; a Pol II elongation factor, either ELL, ELL2, or ELL3; a partner protein, either EAF1 or EAF2; an ENL family protein, either ENL or AF9; and P-TEFb, a heterodimer of the kinase CDK9 and a cyclin partner, either CycT1, CycT2a, or CycT2b (15). The subunits ELL and ELL2 associate with Pol II and can increase the catalytic rate of transcription elongation (16). Factors ENL and AF9 are thought to bridge between Pol II and SEC via their interaction with the Pol II-binding elongation factor PAF1 complex (PAF1c) (17).
In addition to the N-terminal disordered region, AFF4 shares a highly conserved C-terminal homology domain (CHD) with other AF4/FMR2 family proteins, including AFF1, AFF2, and AFF3. The latter proteins are scaffolds of SEC-L2 and SEC-L3 complexes that contain P-TEFb and AF9/ENL but lack ELL/EAF subunits. The CHD is also found in the Drosophila protein Lilliputian, which is the founding member of the AFF family (10,24).
The subunits of SEC were shown to be frequent translocation partners in mixed lineage leukemia (MLL). In particular, the fusion of AFF4 to MLL leads to up-regulation of the expression level of MLL target genes (15). It was further shown that MLL-AFF1 and MLL-AFF4 can form a heterodimer via their CHDs in vivo and that heterodimerization correlates with the oncogenic potential of MLL-AFF1 and MLL-AFF4 (25). Furthermore, AFF4 and AFF1 form a heterodimer when co-expressed in vivo (26). Secondary structure prediction indicates that the CHD is mainly α-helical and likely to fold into one independent domain (27,28).
To investigate the structure of the CHD in AFF family proteins and to elucidate the mechanism of CHD dimerization, we solved the X-ray crystal structure of the CHD in human AFF4 (AFF4-CHD) and characterized its biochemical properties by in vitro assays. Our results show that the CHD of AFF4 folds into a helical domain that distantly resembles 14-3-3 proteins and can mediate the formation of an AFF4 homodimer and an AFF1-AFF4 heterodimer. In addition, our fluorescence anisotropy data reveal that the AFF4-CHD interacts with RNA and DNA in vitro. Furthermore, we identified a surface loop region in the AFF4-CHD as a substrate for the P-TEFb kinase. The high conservation of the CHD indicates that our results are relevant for understanding other CHD-containing proteins and their dimerization behavior. Our work contributes to the structural characterization of SEC, which is a key player in gene regulation in eukaryotic cells.
Figure 1. ... 2), and AFF2 (NP_001162594.1). The interaction regions of AFF4 with P-TEFb, ELL2, and AF9/ENL are shown as lines under the human AFF4 scheme. The alignment was performed using full-length AFF1, AFF2, AFF3, and AFF4 proteins from human, mouse, chicken, and zebrafish and Lilliputian protein from Drosophila using Jalview (MUSCLE with default settings) (65). The C-terminal sequences of the human AFF proteins are shown here. The C-terminal sequences of the AFF proteins from all above species are shown in Fig. S1. The secondary structure on top of the alignment was assigned based on the crystal structure of AFF4-CHD to the multisequence alignment using ESPript 3 (66). Stars above the sequences indicate residues phosphorylated by P-TEFb, as identified by MS in this study. Black stars show the residues that are not conserved among AFFs. Brown stars show the phosphorylation sites that are conserved among AFFs. The conserved phosphorylation region Lα5-α6 is marked by solid and dashed lines above the sequence. A dashed line indicates residues that are not visible in the crystal structure. Mutated residues on the dimerization interface are marked by orange squares. B, crystal structure of AFF4-CHD in one asymmetric unit. The model is colored from N to C using rainbow colors from blue to red. C, superposition of AFF4-CHD monomer with homodimer of 14-3-3 protein γ (Protein Data Bank code 6GKG-F, B (31)). 14-3-3 protein γ is colored in gray, and AFF4-CHD is colored in dark green as in the figures above. The α-helices of AFF4-CHD are labeled as α1-α8, and the α-helices of 14-3-3 protein γ are labeled as H1-H9. H1, H3, and H4 of the two 14-3-3 protein γ monomers (in light and dark gray) mediate the homodimerization of 14-3-3 protein γ.

Crystal structure of the AFF4-CHD
To study the structure of the AFF4-CHD, we designed a crystallization construct (residues 899-1163) based on secondary structure prediction and multisequence alignments of AFF proteins from different species (Fig. 1A and Fig. S1). Soluble AFF4-CHD was purified and crystallized as indicated under "Experimental procedures," and the structure was determined by single-wavelength anomalous diffraction and refined to a resolution of 2.2 Å.
The crystal structure reveals that AFF4-CHD folds into eight α-helices, with α1-α4, α7, and one-half of α6 forming a structure with a concave side, which is filled by the last helix α8 and covered by the N-terminal tail region that folds onto α8 (Fig. 1B). Interestingly, helices α4 and α5 are almost connected as one long helix that bends toward the long helix α6.
A superposition of each individual structure onto AFF4-CHD was produced with PyMOL (Fig. 1C and Fig. S2, A-C) (29). The root mean square deviations between the structures and AFF4-CHD are 2.7 Å over 452 atoms for 5G05-O, 4.1 Å over 466 atoms for 6GKG-F, 2.2 Å over 424 atoms for 4JHR-B, and 5.4 Å over 468 atoms for 6HEP-B. Based on the superpositions, the arrangement of α1-α4 and half of α6 corresponds to 2.5 TPR motifs. However, AFF4 lacks the conserved amino acids in the featured positions that define the TPR repeats, such that TPRpred could not identify typical TPR motifs in the sequence (34,35).
In AFF4-CHD, the C-terminal residues (α7 and α8), which could potentially take the role of H7-H9, adopt a conformation different from H7-H9 in 14-3-3 protein γ (Fig. 1C). When AFF4-CHD is superposed with 14-3-3 protein γ, the loop between α7 and α8 takes the position of H7 in 14-3-3 protein γ; however, α8 folds into the concave surface and blocks it, instead of taking the open position observed for H9 in 14-3-3 protein γ (Fig. 1C). When AFF4-CHD is superposed with 14-3-3 protein β (6HEP-B) in complex with its ligand, the CFTR R-domain peptide pSer753-pSer768, helix α8 of AFF4-CHD clashes with the ligand peptide (Fig. S2C). However, we do not rule out the possibility that α8 and the N-terminal region of AFF4-CHD could change conformation and move away from the ligand-binding concave surface in the presence of natural substrates. In summary, except for a partial and distant similarity to 14-3-3 proteins and TPR motif-containing proteins, AFF4-CHD adopts a unique fold and does not resemble any known structure in the Protein Data Bank.
The helices α5 and α6 and the loop region that connects them (Lα5-α6) form a hydrophobic core and mediate the interaction between the two monomers (Fig. 2A). In addition to the hydrophobic interactions, residues Ile1085 to Ser1082 of loop Lα4-α5 form hydrogen bonds via main chain interaction with the same region of the partner, which further strengthens the interaction (Fig. 2A, right panel).
During protein preparation, we noticed that AFF4-CHD forms homodimers based on the elution volume of the protein during gel filtration (Fig. 2E), which is consistent with the dimerization observed in the crystal structure. Analytical size-exclusion chromatography showed that AFF4-CHD (theoretical molecular mass = 31.9 kDa as a monomer) elutes slightly after conalbumin (75 kDa) and before ovalbumin (44 kDa), and the calculated molecular mass based on the calibration of the column is 64.6 kDa, which indicates that AFF4-CHD forms a dimer in solution (Fig. 2E and Fig. S3C).
To further validate the dimerization interface observed in the crystal structure, we mutated the hydrophobic residues on α6 that play the main roles in the dimerization interface, including His1090, Tyr1096, Val1097, Phe1103, and Leu1104, to alanine (5MA) or aspartate (5MD) and compared the elution volume of the WT protein and the mutants on analytical gel filtration (Superdex 200 increase 3.2/300; GE Healthcare). The WT AFF4-CHD elution peak was located at 1.53 ml, whereas both AFF4-CHD mutants 5MA and 5MD showed an elution peak at 1.62 ml, which indicates that mutation of the hydrophobic residues on the interface abolished homodimerization of AFF4-CHD (Fig. 2, B and E, and Fig. S2D). Static light scattering data confirmed that the molecular mass of AFF4-CHD-WT is 64.38 kDa, and the molecular mass of the dimerization mutant AFF4-CHD-5MD is 32.2 kDa in solution, which supports our dimerization hypothesis (Fig. S3, D and E).

CHDs of AFF4 and AFF1 form a heterodimer
It was reported that AFF4 and AFF1 are able to form a heterodimer in cells and that the CHD was responsible for heterodimerization (15,26). We therefore investigated whether the CHDs of AFF4 and AFF1 also form a heterodimer in vitro and whether our homodimeric structure would be a good model for AFF4-AFF1 heterodimerization. For this purpose, we designed an AFF1-CHD (residues 938-1210) construct based on alignments and on our crystal structure. Considering the similar size of AFF4-CHD and AFF1-CHD, we used an N-terminal His-MBP-tagged AFF1-CHD (MBP-AFF1-CHD), with a theoretical molecular mass of 72.3 kDa for a monomer.
Our analytical size-exclusion chromatography and static light scattering data showed that the molecular mass of MBP-AFF1-CHD is 147 kDa in solution, and that of the MBP-AFF1-CHD/AFF4-CHD complex is 99 kDa in solution. This indicates that MBP-AFF1-CHD forms a homodimer and that AFF4-CHD was able to displace MBP-AFF1-CHD from the AFF1-CHD homodimer and to form an AFF4-CHD/MBP-AFF1-CHD heterodimer after incubation overnight at 4°C (Fig. 2C and Fig. S3, F and G). Incubation of MBP-AFF1-CHD with AFF4-CHD indicates that AFF4-CHD displaces most of the MBP-AFF1-CHD at a 1:1 molar ratio. An excess of AFF4-CHD could eliminate almost all MBP-AFF1-CHD homodimer (Fig. 2C). In contrast, neither of the AFF4-CHD mutants (5MA and 5MD) was able to disrupt the MBP-AFF1-CHD homodimer (Fig. 2D and Fig. S2E). This implies that the dimerization interface observed in the AFF4-CHD homodimer also mediates the interaction between AFF1 and AFF4.
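The oligomeric-state argument above comes down to comparing each measured mass with the theoretical monomer mass. A minimal sketch of that arithmetic follows, using only the masses quoted in the text; the function and variable names are illustrative and not part of the original analysis.

```python
def subunits_per_particle(measured_kda, monomer_kda):
    """Ratio of measured mass to theoretical monomer mass (about 1 = monomer, about 2 = dimer)."""
    return measured_kda / monomer_kda

AFF4_MONOMER_KDA = 31.9      # theoretical mass of AFF4-CHD
MBP_AFF1_MONOMER_KDA = 72.3  # theoretical mass of MBP-AFF1-CHD

# Masses measured by static light scattering, as quoted in the text
aff4_wt = subunits_per_particle(64.38, AFF4_MONOMER_KDA)
aff4_5md = subunits_per_particle(32.2, AFF4_MONOMER_KDA)
mbp_aff1 = subunits_per_particle(147.0, MBP_AFF1_MONOMER_KDA)

# A 1:1 heterodimer should weigh roughly the sum of the two monomers
hetero_expected_kda = AFF4_MONOMER_KDA + MBP_AFF1_MONOMER_KDA
hetero_ratio = 99.0 / hetero_expected_kda

print(round(aff4_wt, 2), round(aff4_5md, 2), round(mbp_aff1, 2), round(hetero_ratio, 2))
```

Ratios near 2 for AFF4-CHD-WT and MBP-AFF1-CHD, and near 1 for the 5MD mutant, agree with the dimer and monomer assignments. The 99 kDa complex is close to the summed monomer masses expected for a 1:1 heterodimer.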

AFF4-CHD interacts with nucleic acids
Surface electrostatic potential analysis of our crystal structure revealed several large positively charged patches on AFF4-CHD (Fig. 3A), which implies that AFF4-CHD might be able to interact with nucleic acids. To study the nucleic acid-binding ability of AFF4-CHD, we first tested the binding of a previously reported (24) [...] Table S2). The standard deviations reported here and in all following fluorescence anisotropy experiments were calculated from triplicate experiments. The sequences of nucleic acids for fluorescence anisotropy are shown in Table S1.
We tested the interaction of AFF4-CHD with single-stranded TAR RNA, structured TAR RNA with a stem-loop structure, and the G-quadruplex DNA HIV-LTR III, and compared these interactions with that of the N19-FBS 35-mer G-quadruplex RNA in buffer containing 100 mM KCl (Fig. 3C and Table S2). Single-stranded TAR RNA and structured TAR RNA interact with AFF4-CHD with an affinity similar to that of the N19-FBS RNA (Fig. 3C and Table S2). The affinity for HIV-LTR III DNA is lower, with a Kd of 1.647 ± 0.176 µM. Our results indicate that AFF4-CHD binds to RNA regardless of the structure of the RNA and that the interaction with DNA is weaker (Fig. 3C and Table S2).
To investigate whether the phosphorylation of AFF4-CHD by P-TEFb affects the RNA binding and homodimerization of AFF4-CHD, we performed fluorescence anisotropy and analytical gel filtration after incubating AFF4-CHD with WT P-TEFb kinase (P-TEFb-WT) or the catalytically dead mutant of P-TEFb (P-TEFb-N, D149N) (Fig. S3). Our fluorescence anisotropy results indicate that after phosphorylation by P-TEFb, the affinity of AFF4-CHD for the N19-FBS 35-mer sequence is slightly reduced (the Kd changed from 0.274 ± 0.019 to 0.693 ± 0.068 µM) (Fig. 4C and Table S2). Analytical size-exclusion chromatography showed that phosphorylation of AFF4-CHD has no effect on AFF4 homodimerization (Fig. 4D). Taken together, these results indicate that P-TEFb-dependent phosphorylation of AFF4-CHD may slightly modulate nucleic acid interactions but not domain dimerization.

Discussion
Here we report the crystal structure of the CHD of AFF4, the scaffold protein of the SEC. We show that AFF4-CHD folds into a unique structure that distantly resembles 14-3-3 proteins, which can interact with peptides that contain phosphorylated serine residues (29). We further show that AFF4-CHD forms a homodimer in vitro, and mutation of the dimerization interface disrupts both homodimerization and heterodimerization with AFF1-CHD. We further show that AFF4-CHD exhibits positively charged surfaces and can interact with DNA and RNA in vitro. Finally, we show that AFF4-CHD is a substrate of P-TEFb phosphorylation in vitro and identified the phosphorylation sites by MS.
The crystal structure of AFF4-CHD revealed an extensive dimerization interface that includes hydrophobic interactions and hydrogen bonds. Mutagenesis studies show that disruption of the hydrophobic core of the interface abolished formation of both the AFF4-CHD homodimer and the AFF4-CHD/AFF1-CHD heterodimer in vitro. Sequence alignment indicates that the dimerization interface residues are highly conserved among the CHDs of AFF proteins. Thus our results reveal a conserved dimerization mechanism for AFF proteins.
Although our results are consistent with a large body of prior work, there is one discrepancy. Whereas we show that AFF4-CHD can form homodimers when expressed recombinantly, this was not observed by overexpressing AFF4 in vivo (26). It is possible that the protein undergoes additional modifications in vivo, which would prevent homodimerization. It is also possible that in the context of full-length protein, the dimerization interface is shielded by other regions of the protein or by interaction partners such as AFF1.
The heterodimer of AFF1 and AFF4 could be observed in vivo; however, this heterodimer most likely resides outside of the SEC complex, because tandem co-immunoprecipitation experiments showed that AFF1 and AFF4 do not co-exist in the same SEC complex (26). Furthermore, it was shown that the target genes of AFF1 and AFF4 are largely nonoverlapping, with less than 40% of them being regulated by both AFF1 and AFF4 (26). However, how AFF1 and AFF4 may cooperate in regulating the shared target genes remains unclear. In a previous study, it was proposed that during transcription several SEC complexes are recruited to transcribing Pol II, and the recruitment of AFF1-SEC and AFF4-SEC might happen in sequential steps (48). The CHDs might facilitate recruitment of the later SEC to Pol II.
CHD-mediated dimerization is required for the recruitment of intact SEC to MLL targets via the interaction between MLL-fused AFF1 or AFF4 and the full-length AFF4 and AFF1 in the SEC complex, which leads to constitutive activation of MLL target genes (49). Indeed, only MLL-AFF4 fusion proteins that include the CHD can induce up-regulation of Hoxa9 transcription and retain oncogenic potential (25,49). Therefore, our structure of AFF4-CHD might be explored to provide a new target for future therapeutic routes to treat MLL-rearranged leukemias (50).
We further showed that AFF4-CHD harbors large positively charged surfaces and interacts with nucleic acids at micromolar concentrations, with slightly higher affinity for RNA than for DNA substrates. These observations are in line with a previous study showing that the C-terminal region of AFF proteins interacts with N19 RNA. The N19 RNA contains a 35-mer G-quadruplex structure, which was shown to be required for the interaction with AFF proteins (24). However, in our study, we did not observe a preference for the N19 35-mer RNA G-quadruplex over other RNAs, possibly because our AFF4-CHD lacks the unstructured region upstream of the CHD and because the N19-FBS 35-mer is shorter than the construct used in previous literature (24). An interaction of AFF4-CHD with the G-quadruplex structure of N19 was shown to be important for the splicing of this RNA (24). Identification of natural RNA targets would require suitable in vivo approaches such as cross-linking followed by sequencing of the cross-linked RNA regions.
We also showed that P-TEFb phosphorylates AFF4-CHD and identified phosphorylation sites on the domain, some of which were also observed in vivo (40-42, 51). Although the phosphorylation did not affect dimerization of AFF4-CHD and only slightly reduced nucleic acid binding in vitro, it is possible that these modifications influence the interaction with additional binding partners or regulate the function of AFF4 in cells.
In summary, we provide the structure of the conserved AFF protein CHD and the molecular basis for the homo- and heterodimerization of AFF proteins. In addition, our data implicate the AFF4-CHD in nucleic acid interactions and identify a conserved loop region as a phosphorylation site for P-TEFb. The structure provides a missing building block of the super elongation complex and can be used to design future experiments aimed at dissecting the function of AFF4 and SEC.
Figure 4. A, time course of the phosphorylation of AFF4-CHD by P-TEFb. AFF4-CHD and SPT4/SPT5 were incubated with P-TEFb over a time course of 2 h, and samples were taken at 0, 30, 60, and 120 min. SPT5 serves as a control to monitor the activity of P-TEFb and to demonstrate the shift of protein bands after incubation with P-TEFb in the presence of ATP. B, final products of the phosphorylation assay triplicates for the identification of phosphorylation sites using MS. Lanes 1 and 5 are apo proteins before adding P-TEFb. Lanes 2-4 and 6-8 are the triplicates of the phosphorylation assay after 2 h of incubation. C, binding of AFF4-CHD to 10 nM fluorescently labeled FBS 35-mer G-quadruplex in apo and phosphorylated states. The binding assay was performed after incubating AFF4-CHD with WT P-TEFb kinase or the catalytic mutant of P-TEFb kinase in the presence of ATP. The experiments were repeated three times, each time with two pipetting repeats as in Fig. 3C. D, the final products of the three phosphorylation triplicates of the AFF4-CHD incubated with P-TEFb and P-TEFb-N-mut were each injected onto the gel filtration column (Superdex 200 increase 3.2/300; GE Healthcare).

Preparation of the C-terminal dimerization domain of AFF4 (AFF4-CHD) and AFF1 (AFF1-CHD)
The boundaries of the C-terminal domains of AFF4 (AFF4-CHD) and AFF1 (AFF1-CHD) were designed based on the secondary structure prediction using PSIPRED (27,28). DNA containing the AFF4-CHD (residues 899-1163) was amplified from genomic DNA and was cloned into vector 1B using ligation-independent cloning (LIC; MacroLab) (52,53). This results in an ORF containing an N-terminal hexahistidine tag with a TEV protease cleavage site. Similarly, the AFF1-CHD (residues 938-1210) was cloned into vector 1C (MacroLab) using LIC, which results in an ORF containing an N-terminal hexahistidine-MBP tag with a TEV protease cleavage site. The primers used for cloning are listed in Table S1. The His-AFF4-CHD and His-MBP-tagged AFF1-CHD were each expressed overnight at 20°C in Escherichia coli BL21 (DE3) strain cells (Stratagene). For selenomethionine labeling, AFF4-CHD was expressed in minimal medium supplemented with selenomethionine (54).
The cells were harvested and resuspended in buffer containing 50 mM HEPES, pH 7.4, 300 mM NaCl, 10% glycerol, 1 mM DTT, and 30 mM imidazole supplemented with protease inhibitors, DNase I, and lysozyme and lysed by sonication. Cell lysates were subjected to affinity purification on a nickel column (HisTrap HP; GE Healthcare) and eluted with a linear imidazole gradient ranging from 30 to 500 mM. The eluted protein was concentrated and polished by a final gel filtration (Superdex 200 10/300 increase; GE Healthcare). The sample was concentrated to 21 mg/ml and stored at −80°C in buffer containing 20 mM HEPES, pH 7.5, 300 mM NaCl, and 2 mM DTT. His-MBP-tagged AFF1-CHD was expressed and purified in the same way, concentrated to 12 mg/ml, and stored at −80°C.
Mutants of AFF4-CHD (5MA and 5MD) were generated by site-directed mutagenesis PCR (QuikChange; Qiagen) using WT AFF4-CHD plasmid as the template. The primers used for mutagenesis are listed in Table S1. The AFF4-CHD 5MA and 5MD were expressed and purified in the same way as the WT protein. Importantly, in the affinity purification step for the AFF4-CHD mutants, only the fractions of the first peak, which is also the main peak, were collected for further purification. The tailing shoulder peak at the end of the imidazole gradient, which contains roughly less than 5% of the whole eluate, might contain a mixture of dimer and monomers.

Crystal structure determination
The purified native and selenomethionine-labeled His-AFF4-CHD was crystallized by vapor diffusion using 14-16% PEG8000, 40 mM KH2PO4, and 20% glycerol as reservoir solution. Complete diffraction data at 2.2 Å resolution were obtained for crystals of the selenomethionine-labeled protein on a PILATUS 6M detector at the PXII Beamline of the Swiss Light Source at a temperature of 100 K, processed with XDS, and scaled with XSCALE (55). The resolution cutoff was set based on the CC1/2 value (56). Selenium sites were identified with SHELXC, and chain tracing and density modification were performed with SHELXD/E (57). The model was manually built in Coot using the initial polyalanine model from SHELXC/D/E as a starting model and refined with PHENIX (58,59). We further refined the model using Coot and PHENIX iteratively. 99.5% of residues were in the favored regions of the Ramachandran plot without any outliers. The stereochemical properties of the structure were verified with MolProbity (60). The refinement statistics are shown in Table 1.

Analytical gel filtration of AFF4-CHD and MBP-AFF1-CHD
Analytical gel filtration studies with purified WT and mutated AFF4-CHD and MBP-AFF1-CHD proteins were performed with an ÄKTAmicro FPLC using a Superdex 200 3.2/300 increase column (GE Healthcare) in buffer containing 20 mM HEPES, pH 7.4, 300 mM NaCl, 4% glycerol, and 1 mM DTT. For the study of heterodimerization of AFF1 and AFF4, MBP-AFF1-CHD and AFF4-CHD were mixed in different molar ratios (1:0, 1:1, 1:3, and 1:5) in a 50-µl system and incubated overnight at 4°C. To compare the interaction between MBP-AFF1-CHD and WT and mutants of AFF4-CHD, MBP-AFF1-CHD and AFF4-CHD proteins were mixed in 1:3 ratios and incubated at 4°C overnight. The premixed protein solution was injected onto the gel filtration column, and the elution fractions were collected and analyzed on SDS-PAGE.
To estimate the molecular masses of proteins, the analytical Superdex 200 3.2/300 increase column (GE Healthcare) was calibrated with size exclusion standards ferritin (440 kDa), aldolase (158 kDa), conalbumin (75 kDa), ovalbumin (44 kDa), and carbonic anhydrase (29 kDa) (GE Healthcare and Sigma-Aldrich) at a flow rate of 0.03 ml/min on an ÄKTAmicro FPLC (GE Healthcare) in buffer containing 20 mM HEPES, pH 7.4, 300 mM NaCl, 4% glycerol, and 1 mM DTT. The Log molecular mass of each standard was plotted against the elution volume at the very peak of the elution profile using GraphPad Prism. The correlation between molecular mass and elution volume was calculated by fitting the data with the linear regression equation y = −2.3257x + 8.374, in which y is the Log molecular mass, and x is the elution volume. The molecular masses of AFF4-CHD-WT and AFF4-CHD-5MD were calculated based on the elution volume and the equation.
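The calibration line can be applied directly to the peak positions quoted in the Results. The sketch below assumes that "Log" means log10 and that the mass is expressed in Da, which is the reading under which the standards fall at plausible elution volumes; the function name is illustrative.

```python
def mass_kda_from_elution(volume_ml, slope=-2.3257, intercept=8.374):
    """Convert an elution volume (ml) into an apparent molecular mass (kDa).

    Assumes y in the calibration line is log10 of the molecular mass in Da.
    """
    log10_mass_da = slope * volume_ml + intercept
    return 10.0 ** log10_mass_da / 1000.0

wt_kda = mass_kda_from_elution(1.53)      # WT AFF4-CHD peak position
mutant_kda = mass_kda_from_elution(1.62)  # 5MA/5MD peak position

print(f"WT: {wt_kda:.1f} kDa, mutant: {mutant_kda:.1f} kDa")
```

With the rounded peak positions of 1.53 and 1.62 ml, this gives roughly 65 and 40 kDa. The first value is close to the 64.6 kDa reported for the WT protein; the small difference is expected from rounding of the elution volume.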

Static light scattering
The size-exclusion chromatography-coupled static light scattering experiments were performed using a Superdex 200 10/300 increase column (GE Healthcare) and a VISCOTEK 305 TDA detector (Malvern) in buffer containing 20 mM HEPES, pH 7.4, 300 mM NaCl, 4% glycerol, and 1 mM DTT at a flow rate of 0.5 ml/min. The scattering of proteins was measured at 90° (right-angle light scattering) and 7° (low-angle light scattering). The system was first calibrated with BSA. Then samples including AFF4-CHD-WT, AFF4-CHD-5MD, MBP-AFF1-CHD, and preincubated MBP-AFF1-CHD/AFF4-CHD-WT were sequentially measured using the same setup. The data were evaluated with OmniSEC 5.12 software (Malvern) and plotted using GraphPad Prism.

Fluorescence anisotropy
The fluorescence anisotropy experiments were performed using purified AFF4-CHD protein and 5′-FAM-labeled RNA (Table S1) and DNA oligonucleotides in the assay buffer containing 50 mM HEPES, pH 7.4, 1 mM MgCl2, 1 mM EDTA, 1 mM DTT, 10 µg of BSA, 5 µg of yeast tRNA, and 50, 100, or 200 mM KCl. The experimental procedures and buffer conditions were modified from previous literature (24,61). Briefly, the 5′-FAM-labeled nucleotides were diluted to 75 nM in the assay buffer, and the protein was serially diluted by a factor of 2, ranging from 40 nM to 40 µM, with the assay buffer. 7.5 µl of protein and 4 µl of nucleotides were mixed and incubated on ice for 10 min. 17.5 µl of assay buffer was added to the mixture, which was incubated for another 20 min at room temperature. The final assay condition contains 10 nM labeled RNA and 5 nM (Fig. 3C) [...] in GraphPad Prism, where Bmax is the maximum specific binding, L is the DNA or RNA concentration, P is the concentration of AFF4-CHD, and Kd,app is the apparent dissociation constant for AFF4-CHD and DNA or RNA.
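The fitting equation itself is not legible in the text above. The symbols it defines (Bmax, L, P, Kd,app) match a single-site binding model with ligand depletion, so the sketch below uses that quadratic model as an assumption rather than as the equation actually fitted in Prism. It also checks the final probe concentration from the pipetting volumes given above. All function names are illustrative.

```python
import math

def final_concentration(stock, added_volume, total_volume):
    """C1*V1 = C2*V2 dilution; the result has the same unit as `stock`."""
    return stock * added_volume / total_volume

def quadratic_binding(P, L, kd_app, b_max):
    """Signal for single-site binding with ligand depletion (assumed model).

    P, L and kd_app share one concentration unit; b_max is the signal at saturation.
    """
    s = P + L + kd_app
    bound = (s - math.sqrt(s * s - 4.0 * P * L)) / 2.0  # concentration of the complex
    return b_max * bound / L

total_ul = 7.5 + 4.0 + 17.5  # protein + labeled nucleotide + assay buffer
probe_nM = final_concentration(75.0, 4.0, total_ul)        # labeled nucleic acid
top_protein_uM = final_concentration(40.0, 7.5, total_ul)  # highest protein point

# Fraction of maximal signal when the protein concentration equals Kd,app (in nM)
half_point = quadratic_binding(P=274.0, L=probe_nM, kd_app=274.0, b_max=1.0)
```

The computed probe concentration of about 10 nM matches the stated final assay condition. Because the probe concentration is far below the Kd,app values reported here, the quadratic model stays close to a simple hyperbola, with roughly half-maximal signal when P equals Kd,app.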

Phosphorylation assays of AFF4-CHD
The AFF4-CHD was phosphorylated by the kinase complex P-TEFb. To enhance the solubility of P-TEFb, a fusion construct of human CYCT1 (1-272) and HIV-1 TAT (1-72) was made by PCR. A 5-amino acid linker (Gly-Gly-Gly-Ser-Gly) was inserted between CYCT1 and TAT. The resulting gene fusion was inserted into the pSPL vector (62,63) using the XbaI and StuI restriction sites. Human CDK9 (1-372) bearing an N-terminal His8 tag was cloned into pACEBAC1 as described (5). The pSPL and pACEBAC vectors were combined using Cre recombination (62). Clones were screened to select for single recombination events, which result in a P-TEFb-5aa-TAT construct. The WT P-TEFb-5aa-TAT and the catalytic mutant of P-TEFb-5aa-TAT-N (D149N) were expressed and purified as described in a previous study (5).
The phosphorylation assays of AFF4-CHD were performed in a 20-µl system based on a previous study (5). Briefly, 0.5 µl of AFF4-CHD (675 µM stock) or 0.8 µl of DSIF (48.5 µM) was mixed with 0.5 µl of purified P-TEFb (40 µM stock) and 19 µl of phosphorylation buffer (20 mM HEPES, pH 7.4, 100 mM NaCl, 3 mM MgCl2, 1 mM DTT, 4% glycerol) and incubated for 5 min at 30°C. 2 µl of sample was taken prior to ATP addition as the 0-min time point. 0.4 µl of ATP (50 mM stock) was added to the assay solution, which was further incubated for a time course of 2 h, and 2 µl of sample was taken at the 30-min, 60-min, and 2-h time points (5). The samples were loaded onto SDS-PAGE and run with MOPS buffer for 42 min at 200 V (Invitrogen). The time course was repeated three times, and the final products of the triplicates were loaded onto a 10-well SDS-PAGE.
To produce the phosphorylated AFF4-CHD protein for fluorescence anisotropy, 4 µl of AFF4-CHD (675 µM stock) was incubated with either P-TEFb-5aa-TAT or the catalytic mutant of P-TEFb-5aa-TAT (D149N) as described above for 2 h. 0.5 µl of the final product was checked on SDS-PAGE to validate the phosphorylation. The phosphorylated or nonphosphorylated AFF4-CHD was diluted to 40 µM and mixed with nucleic acids as indicated above.

Identification of the phosphorylation sites
As described above, the final products of the three time-course experiments were analyzed on SDS-PAGE. Phosphopeptides obtained from in-gel digest of the protein band were enriched and analyzed as previously described, and the MS/MS spectra were searched against the human database with the Andromeda search engine (5,64). The phosphorylation sites reported in this study were present in all three replicates with a localization probability higher than 0.99.
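The reporting criterion above (present in all three replicates, localization probability above 0.99) can be written as a simple filter. The sketch below is only an illustration of that rule: the data layout, function name, site labels, and probabilities are hypothetical placeholders and not output from the search engine or data from this study.

```python
def confident_sites(replicates, min_probability=0.99):
    """Return sites found in every replicate with localization probability above the cutoff.

    `replicates` is a list of dicts that map a site label to its localization probability.
    """
    passing = [
        {site for site, prob in rep.items() if prob > min_probability}
        for rep in replicates
    ]
    return sorted(set.intersection(*passing)) if passing else []

# Hypothetical placeholder labels and probabilities, not data from this study
example = [
    {"site_A": 0.999, "site_B": 0.995, "site_C": 0.80},
    {"site_A": 0.998, "site_B": 0.97},
    {"site_A": 1.0, "site_B": 0.999, "site_C": 0.999},
]
print(confident_sites(example))
```

In this placeholder input, only the site that passes the cutoff in all three replicates is kept. A site that is missing from, or below the cutoff in, any single replicate is dropped.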
Author contributions-Y. C. and P. C. conceptualization; Y. C. data curation; Y. C. and P. C. formal analysis; Y. C. and P. C. validation; Y. C. and P. C. investigation; Y. C. and P. C. visualization; Y. C. and P. C. methodology; Y. C. writing-original draft; Y. C. and P. C. writing-review and editing; P. C. resources; P. C. software; P. C. supervision; P. C. funding acquisition; P. C. project administration.