Autoinhibitory Interdomain Interactions and Subfamily-specific Extensions Redefine the Catalytic Core of the Human DEAD-box Protein DDX3

DEAD-box proteins utilize ATP to bind and remodel RNA and RNA-protein complexes. All DEAD-box proteins share a conserved core that consists of two RecA-like domains. The core is flanked by subfamily-specific extensions of idiosyncratic function. The Ded1/DDX3 subfamily of DEAD-box proteins is of particular interest as members function during protein translation, are essential for viability, and are frequently altered in human malignancies. Here, we define the function of the subfamily-specific extensions of the human DEAD-box protein DDX3. We describe the crystal structure of the subfamily-specific core of wild-type DDX3 at 2.2 Å resolution, alone and in the presence of AMP or nonhydrolyzable ATP. These structures illustrate a unique interdomain interaction between the two ATPase domains in which the C-terminal domain clashes with the RNA-binding surface. Destabilizing this interaction accelerates RNA duplex unwinding, suggesting that it is present in solution and inhibitory for catalysis. We use this core fragment of DDX3 to test the function of two recurrent medulloblastoma variants of DDX3 and find that both inactivate the protein in vitro and in vivo. Taken together, these results redefine the structural and functional core of the DDX3 subfamily of DEAD-box proteins.

DEAD-box proteins are defined by 12 different motifs that function in ATP binding or hydrolysis and RNA binding, or couple ATP and RNA binding (1). Outside of these conserved motifs, each DEAD-box protein subfamily has unique tails that lie N- or C-terminal to the helicase core and contain elements that define the unique properties of that subfamily. For example, DDX21 has a GUCT domain in its C-terminal extension (26,27), DDX5 has tandem P68HR domains in its C-terminal extension, and DDX43 has a KH1 domain in its N-terminal extension. However, as the tails of each DEAD-box protein subfamily are idiosyncratic, whereas the cores are very similar (28,29), it is essential to study individual subfamilies of DEAD-box proteins in detail to understand the role of subfamily-specific tails.
DDX3 is a member of the Ded1/DDX3 subfamily, along with the Saccharomyces cerevisiae ortholog DED1, and Vasa/DDX4 (5,30). The tails of Ded1/DDX3 subfamily members are thought to be largely unstructured and contain diverse motifs with different functions. For example, the N-terminal tail of DDX3 contains a Crm1-dependent nuclear export sequence (31) and an eIF4E-binding motif (10,11), whereas the C-terminal tail contains conserved sequences of unknown function that are essential for oligomerization (5,32). The tails of DED1 additionally contain assembly domains that modulate translation and alleviate lethality associated with protein overexpression when deleted (10). The minimal functional core of DEAD-box proteins has been defined as the isolated phenylalanine upstream of the Q-motif through roughly 35 residues beyond Motif VI (29). However, this analysis does not consider subfamily-specific extensions. Similarly, prior structural work has truncated one (25) or both (33) of the Ded1/DDX3 subfamily-specific regions down to the boundary of the helicase core, but it is unknown how active these truncations are when compared with full-length DDX3. Moreover, previous structures of DDX3 are of either inactive constructs (33) or point mutants (25), so the relevance of the crystallized conformations is unclear. Therefore, crucial details of the regions uniquely conserved within the Ded1/DDX3 subfamily and the conformational landscape of active, wild-type DDX3 remain incompletely understood.
Here, we assay the function of the tails of human DDX3 by generating a series of truncations and exploring their activity in vitro and in vivo. We define an active but truncated construct of DDX3 and solve its crystal structure at 2.2 Å resolution in a partially closed state, representing the highest resolution structure of active, wild-type DDX3 to date. We find that this partially closed state is autoinhibited and demonstrate that mutations predicted to destabilize this conformation accelerate RNA duplex unwinding by DDX3. Using molecular dynamics simulations, we show that the ATP-binding loop of DDX3 samples transient interactions with ATP and that the partially closed state is stable in solution. Lastly, we test two recurrent medulloblastoma variants of DDX3 and find that they inactivate duplex unwinding by up to 3 log units. Our work defines a functional truncation of DDX3 that purifies to high yield, elucidates the function of the signature tails found in the Ded1/DDX3 subfamily of DEAD-box proteins, presents high-resolution structural information of active DDX3 of utility for molecular modeling and drug design, and demonstrates the consequences of two medulloblastoma-associated DDX3 variants.

Experimental Procedures
Recombinant Protein Purification-The Ded1/DDX3 subfamily core of DDX3 was expressed using a construct containing Escherichia coli codon-optimized, human DDX3X amino acids 132-607 fused to a His6-MBP (maltose-binding protein) tag and expressed in E. coli BL21 Star by induction with isopropyl-1-thio-β-D-galactopyranoside at 16°C for 18 h. Cell pellets were lysed by sonication, clarified by centrifugation at ~30,000 × g, and purified by nickel chromatography including a 1 M NaCl wash to remove bound nucleic acids. The His6-MBP tag was cleaved using tobacco etch virus protease during dialysis into 200 mM NaCl, 10% (v/v) glycerol, 20 mM HEPES, pH 7, and 0.5 mM TCEP. The sample was then purified using heparin affinity chromatography, eluted at 400 mM NaCl, 10% glycerol, 20 mM HEPES, pH 7, and 0.5 mM TCEP, and applied to a Superdex 75 gel filtration column equilibrated in 500 mM NaCl, 10% glycerol, 20 mM HEPES, pH 7.5, and 0.5 mM TCEP. Fractions were then concentrated and supplemented with 20% (v/v) glycerol and flash-frozen for kinetics, or used directly for crystallization. Typical yield was ~10 mg of purified protein per liter. Point mutants were generated by site-directed mutagenesis.
Full-length DDX3X with a His6 tag was cloned into a pET-29a vector and expressed in E. coli BL21 (37°C). Cells were processed as described previously for the purification of Ded1p (34). Lysates were passed through pre-equilibrated nickel-agarose beads and washed with increasing imidazole concentrations (5-60 mM) (34). DDX3X was eluted in 250 mM imidazole. The His6 tag was cleaved using tobacco etch virus protease in 50 mM Tris-Cl (pH 8.0), 0.5 mM EDTA, 1 mM DTT, and 40% (v/v) glycerol. DDX3X was further purified by adsorption to phosphocellulose resin (P11, Whatman) and elution with NaCl, as described for Ded1p (34). Eluted fractions were analyzed by SDS-PAGE and Western blotting using anti-His antibody to confirm removal of the His6 tag. DDX3X fractions were supplemented to 40% (v/v) glycerol, flash-frozen in liquid nitrogen, and stored at −80°C.
X-ray Crystallography-Purified protein was concentrated to ~5 mg ml⁻¹ and mixed 1:1 with precipitant solution containing 6-12% PEG 3000 and 100 mM sodium citrate, pH 5.0. Crystals grew by hanging drop vapor diffusion at 18°C within 24 h. Nucleotide-bound crystals were grown identically but with 10 mM of the nucleotide added to the solution. Homogeneous pieces of branched crystals were harvested for data collection, which was conducted at Beamline 8.3.1 of the Advanced Light Source. Data were indexed, integrated, and scaled using XDS (35), phased by molecular replacement with PHASER (36) using both domains from Protein Data Bank (PDB) 2I4I (33) as independent search models, and refined and built using PHENIX (37) and Coot (38). The high-resolution cutoff was set where CC1/2 ≥ 10% (39). Structures were visualized with PyMOL (40).
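The CC1/2 statistic used for the resolution cutoff is a Pearson correlation between intensities measured in two random halves of the data, evaluated per resolution shell. The following is a minimal sketch of that statistic under stated assumptions, not the XDS implementation: the function names and the idea of passing paired lists of half-dataset mean intensities are illustrative choices that do not come from the text.

```python
import math


def cc_half(half1, half2):
    """Pearson correlation between paired intensities from two half-datasets."""
    if len(half1) != len(half2) or len(half1) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    n = len(half1)
    mean1 = sum(half1) / n
    mean2 = sum(half2) / n
    cov = sum((a - mean1) * (b - mean2) for a, b in zip(half1, half2))
    var1 = sum((a - mean1) ** 2 for a in half1)
    var2 = sum((b - mean2) ** 2 for b in half2)
    if var1 == 0 or var2 == 0:
        raise ValueError("intensities in each half must not be constant")
    return cov / math.sqrt(var1 * var2)


def shell_passes(half1, half2, cutoff=0.10):
    """True if a resolution shell meets the CC1/2 >= 10% criterion."""
    return cc_half(half1, half2) >= cutoff
```

In practice the outermost shell for which `shell_passes` holds would define the reported resolution limit.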
RNA Duplex Unwinding Assays-Assays were performed as described (41) with minor modifications. Briefly, duplex RNAs containing a 3′ overhang were formed by radiolabeling a single-stranded RNA, annealing, and using gel purification. The two RNA sequences are 5′-AGCACCGUAAAGACGC-3′ and 5′-GCGUCUUUACGGUGCUUAAAACAAAACAAAACAAAACAAAA-3′. Reactions contained trace duplex RNA and 1 μM protein and were initiated by the addition of 2 mM MgATP.
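The base pairing of the two strands can be checked directly from their sequences. The sketch below is illustrative only (the function and variable names are not from the text): it takes the reverse complement of the short strand and confirms that it matches the 5′ end of the long strand, so that the remainder of the long strand is the single-stranded 3′ overhang.

```python
# Watson-Crick pairing for RNA
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Reverse complement of an RNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def duplex_and_overhang(short, long_strand):
    """Return (duplex length in bp, 3' overhang length in nt).

    Assumes the short strand pairs with the 5' end of the long strand.
    """
    paired = reverse_complement(short)
    if not long_strand.startswith(paired):
        raise ValueError("short strand does not pair with the 5' end of the long strand")
    return len(paired), len(long_strand) - len(paired)


SHORT = "AGCACCGUAAAGACGC"
LONG = "GCGUCUUUACGGUGCUUAAAACAAAACAAAACAAAACAAAA"

if __name__ == "__main__":
    print(duplex_and_overhang(SHORT, LONG))
```

For these sequences the check gives a 16-bp duplex with a 25-nt 3′ overhang on the longer strand.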
Yeast Complementation Assays-A yeast strain containing the genomic region surrounding DED1 on a centromeric plasmid was used for all experiments (10). Mutations were made in a HIS3-marked plasmid and exchanged for the wild-type allele by plasmid shuffling using counterselection with 5-fluoroorotic acid. Growth assays show 10-fold dilution from an optical density of 1 and were conducted using rich medium (YPD) plates at the temperature indicated.
Molecular Dynamics Simulations-Molecular dynamics simulations were performed using Gromacs version 4.6.5 (45). The apo structure was used as the starting point for all simulations (PDB 5E7I) with missing loops built using UCSF Chimera version 1.8.1 (46) and Modeller version 9.12 (47). A rhombic dodecahedral, periodic simulation box was used with a buffer of 12 Å between the solute and the boundary. Na⁺ and Cl⁻ ions were added for charge neutralization followed by vacuum and solvent equilibration using transferable intermolecular potential 3 point (TIP3P) water (48). The production simulation length was 100 ns, and the step size was 2 fs. To model the ATP-bound, Vasa-like closed state, the domains of the AMPPNP structure (PDB 5E7M) were superposed onto the corresponding domains of the Vasa structure using PyMOL. Then, missing loops were added using UCSF Chimera and Modeller. Initial energies were high, likely reflecting a clash due to imperfect alignment, but quickly equilibrated in vacuum. Interatomic distances along trajectories were calculated using VMD version 1.9.1 (49).
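Two quantities in this procedure are simple arithmetic: the number of integration steps implied by the run length and step size, and the interatomic distance at each trajectory frame. The sketch below illustrates both in plain Python. It is not the Gromacs or VMD workflow used here; the function names are hypothetical, and the distance calculation ignores periodic boundary conditions, which a real analysis of a periodic box would need to handle.

```python
import math

FS_PER_NS = 1_000_000  # 1 ns = 10^6 fs


def n_steps(length_ns, step_fs):
    """Number of integration steps for a run of length_ns at step_fs per step."""
    total_fs = length_ns * FS_PER_NS
    if total_fs % step_fs != 0:
        raise ValueError("run length is not a whole number of steps")
    return total_fs // step_fs


def distance_series(coords_a, coords_b):
    """Per-frame distance between two atoms.

    coords_a and coords_b are equal-length lists of (x, y, z) tuples in Å,
    one entry per frame. Periodic images are NOT considered (assumption).
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("trajectories must have the same number of frames")
    return [math.dist(a, b) for a, b in zip(coords_a, coords_b)]
```

With the stated parameters, `n_steps(100, 2)` corresponds to 50 million steps per production run.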

Results
The Ded1/DDX3 Subfamily Contains Conserved Regions outside the Helicase Core-To define the minimal active construct of human DDX3X, we aligned Ded1/DDX3 family members from diverse species. In addition to the RecA-like core domains, five regions of sequence conservation unique to this family are apparent (Fig. 1A). The N-terminal conserved sequences correspond to a Crm1-dependent nuclear export sequence (31) and an eIF4E-binding site (10,11), and the C terminus contains an RDYR motif and an invariant WW dipeptide motif (Fig. 1) (5). In addition, there are regions adjacent to the helicase core conserved between DDX3, Ded1, and Vasa/DDX4 (Fig. 1A) (5,25). The N-terminal extension (NTE; residues 132-168; Fig. 1B) is predicted by PSIPRED (50) to form a short α-helix from residues 145 to 151, whereas the C-terminal extension (CTE; residues 582-607; Fig. 1B) has no predicted structure but is highly positively charged (pI ~12). Recent structural and biochemical work demonstrated that DDX3 constructs containing the NTE are competent for ATP hydrolysis (25), but it is unknown how this activity compares with full-length human DDX3.
The CTE Is Essential for RNA Duplex Unwinding and Affects Yeast Growth-To compare the activity of full-length DDX3 with truncated variants, we expressed and purified full-length DDX3 and truncations of the NTE, CTE, or both and then measured RNA duplex unwinding activities. Truncation of 131 residues from the N terminus and 55 residues from the C terminus yields a functional core of DDX3 that robustly unwinds RNA duplexes, although this construct has a roughly 5-fold lower functional affinity for RNA when compared with the full-length protein (Fig. 2A). Both full-length and truncated protein show sigmoidal functional binding isotherms, suggesting that DDX3 functions as an oligomer, as does yeast DED1 (32).
Removal of the CTE containing the RDYR motif severely diminishes duplex unwinding, independent of the presence of the NTE (Fig. 2A). It is possible that removal of this positively charged region in the CTE negatively impacts RNA binding, as suggested by weaker binding to heparin resin (data not shown) and by the RNA binding defect caused by deletion of the entire C-terminal tail of Ded1p up to the helicase core (29). Alternatively, or in addition, this region might be critical for oligomerization (32).
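A sigmoidal isotherm is commonly described with a Hill function, in which a Hill coefficient greater than 1 indicates cooperativity consistent with oligomeric function. The sketch below is illustrative only: the text does not state which model was fit to the data, so the Hill form, the function name, and all parameter values are assumptions.

```python
def hill_fraction(protein_conc, k_half, hill_n):
    """Fractional activity from a Hill function.

    protein_conc and k_half share the same concentration units;
    hill_n > 1 gives a sigmoidal curve, hill_n == 1 a hyperbolic one.
    """
    if protein_conc < 0 or k_half <= 0 or hill_n <= 0:
        raise ValueError("concentration must be >= 0; k_half and hill_n must be > 0")
    scaled = protein_conc ** hill_n
    return scaled / (k_half ** hill_n + scaled)
```

At concentrations well below `k_half`, a cooperative curve (`hill_n` of 2, for example) lies below the hyperbolic curve with the same midpoint, which is what produces the sigmoidal shape.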
To assess the biological function of the truncated proteins, we tested the ability of truncated versions of DED1 to support yeast growth. We generated truncations of DED1 in a plasmid and shuffled these into a yeast strain containing the sole copy of DED1 on a plasmid under control of its endogenous promoter (10). Under normal growth conditions, DED1 truncations lacking both the nuclear export sequence and the eIF4E-binding site complement yeast growth fully. Truncation up to the boundary of the NTE is viable but confers a slow growth phenotype (Fig. 2B). It is intriguing that this strain fully complements DED1 because Ded1p is thought to be primarily nuclear localized when the nuclear export sequence is deleted (51). We were unable to generate strains lacking the ATP-binding loop (ABL; DED1 130-604; data not shown). The DED1 30-604 and DED1 85-604 strains are cold-sensitive, indicating a special requirement for the N-terminal tail at cold temperatures. Similarly, truncation up to the boundary of the CTE is tolerated, but deletion of the CTE results in slowed growth (Fig. 2B), whereas further truncation into the CTE exhibits cold sensitivity (10) or very weak complementation (29). These data demonstrate that the activity of the Ded1/DDX3 subfamily of DEAD-box proteins is surprisingly resilient to truncation of the subfamily-specific tails, but only up to conserved regions adjacent to the helicase core. Furthermore, inclusion of the CTE is essential for activity in vitro and in vivo.
The 2.2 Å Crystal Structure of Wild-type DDX3 132-607-To better understand the role of the NTE and CTE, we solved the crystal structure of AMP-bound wild-type DDX3 132-607 to ~2.2 Å resolution (Table 1). The DEAD-domain is oriented uniquely with respect to the HELICc domain when compared with two other structures of DDX3 (25,33) (Fig. 3A). Interestingly, the interdomain orientation of the present structure is more similar to that of DDX3 135-582 (ΔCTE; PDB 4PXA) than to that of DDX3 168-582 (ΔNTE and ΔCTE; PDB 2I4I), suggesting that inclusion of the NTE biases crystallization toward this conformation. However, the interdomain orientation differs between the present structure and 4PXA, likely because PDB 4PXA contains a D354V mutation, which is located at the interdomain interface in both structures. Comparison of the present crystal structure with the structure of Vasa bound to RNA shows that the crystallized conformation of DDX3 is refractory to RNA binding, as the HELICc domain overlaps with the bound RNA in the Vasa structure (Fig. 3B; PDB 2DB3) (52). The CTE is predicted to be disordered by PSIPRED, and we observe no density past residue 584 in any of our structures, despite the obvious requirement for this region for protein function in vitro and in vivo (Fig. 2). Therefore, DDX3 preferentially crystallizes in a partially closed conformation with interdomain interactions between the DEAD and HELICc domains that are refractory to catalysis.
We observe a short α-helix from residues 146-151 (the 150's helix) in the NTE as predicted by PSIPRED and seen in the DDX3 135-582 structure (25). The NTE additionally contains the ABL, which is disordered in our structure and 4PXA (25) but forms a short α-helix in the structure of Vasa bound to RNA (52). Therefore, we crystallized DDX3 132-607 in the presence of no ligand, ADP, or AMPPNP to see whether the conformation of the ABL was altered (Table 1). In all cases, the ABL was disordered and difficult to model robustly. We elected not to build residues 155-165 of the NTE in any of these structures as repeated model building and refinement indicated insufficient density in this region to specify a unique structure. Thus, the ABL is dynamic in the presence of adenosine phosphates, and may fold into an α-helix cooperatively with interdomain closure and RNA binding, as seen in the Vasa structure bound to ssRNA (52).

Mutation of an Interdomain Interacting Residue Accelerates Duplex Unwinding-If the crystallized conformation of DDX3 is present in solution, then it should inhibit duplex unwinding by DDX3 because it is refractory to RNA binding (Fig. 3). The interdomain interface buries ~560 Å², suggesting that it may be stable in solution. Therefore, we hypothesized that mutation of the interdomain interface should accelerate duplex unwinding. Four residues make apparent interdomain contacts: Asp-354 and Glu-388 form a salt bridge with His-527 (separated by 4.0 and 2.7 Å, respectively), and Asp-506 hydrogen-bonds to the backbone of Arg-276 and caps an α-helix (Fig. 4A; 2.8 Å distance). We targeted residues Asp-354 and Glu-388 for mutation because Asp-506 and His-527 are members of conserved motifs Va and VI, respectively (1). Mutation of residue Asp-354 to either alanine or tryptophan (to sterically block formation of the partially closed state) results in decreased activity (Fig. 4B). Although Asp-354 is not part of a pan-DEAD-box motif, it is conserved within the Ded1/DDX3 subfamily, including Vasa. Interestingly, the structure of DDX3 with a D354V mutation shows a different closed state (25), supporting the role of this residue in interdomain interactions. In contrast, mutation of Glu-388 to alanine, arginine, or tryptophan accelerates duplex unwinding by a factor of two (Fig. 4B). In the closed, RNA-bound structure of D. melanogaster Vasa, residue Glu-438 (DDX3 Glu-388) is solvent-exposed and distal to the RNA-binding site, making it unlikely that these mutations increase activity by altered RNA or interdomain interactions. As a comparison, we tested mutation of the conserved GINF motif in the NTE (Fig. 1A) and found that it decreases duplex unwinding by a factor of two, supporting the conclusion that the ABL is necessary for catalysis (Fig. 4B) (25). In sum, mutations predicted to destabilize the crystallized interdomain interface accelerate duplex unwinding by DDX3, consistent with the presence of the partially closed structure in solution and its inhibitory nature.
The Partially Closed Structure Is Stable over 100 ns of Molecular Dynamics Simulation-As the crystal structure was solved at pH 5 and crystals can trap transient structures, we tested the stability of the partially closed structure and the interactions of the ABL with ATP at neutral pH by performing 100-ns molecular dynamics simulations of apo and ATP-bound DDX3 132-607. The ATP-bound state was modeled off the closed form of Vasa bound to RNA to attempt to induce structure formation in the ABL, but instead quickly equilibrated toward an unobserved, alternative closed state, which was not pursued further. In contrast, the partially closed interdomain interface remained stable for the full 100-ns simulation when started from the apo crystal structure (PDB 5E7I) (Fig. 5A). Similarly, in both the partially closed and the ATP-bound states, the 150's helix remained stably docked to the side of the DEAD domain (Fig. 5B). In contrast, the ABL is dynamic. In the ATP-bound simulation, the ABL makes transient interactions with the adenine moiety of ATP (Fig. 5C) involving interactions between Lys-162 and ATP and Phe-160 cation-stacking with Lys-162 (Fig. 5D). In this state, Lys-162 forms a bipartite hydrogen bond with ~3 Å separation from both a ribose oxygen and the adenine N3 of ATP and is separated from the Phe-160 aromatic ring by ~4 Å. These simulations show that the partially closed state of DDX3 (Fig. 3) is stable in silico and suggest that the ABL becomes structured in a cooperative manner with RNA binding.
Medulloblastoma Variants of DDX3 Inactivate the Protein-DDX3 is among the most frequently mutated genes in the highly malignant brain tumor medulloblastoma (21-24). Most variants are predicted to inactivate the catalytic activity of DDX3, and some have been shown to decrease the ATPase activity (25). The truncated construct of DDX3 containing the CTE and NTE is highly active and easy to purify, facilitating analysis of disease-associated variants of DDX3. We therefore selected three recurrent variants found in the DEAD-box motifs Ia and VI, R276K, R276Y, and R534H (Fig. 6A) (21-23), additionally tested alanine substitutions, and made these mutations in the DDX3 132-607 construct. Arg-276 corresponds to Vasa Arg-328, which binds to the RNA backbone, and Arg-534 corresponds to Vasa Arg-582, which interacts with the γ-phosphate of ATP. All mutant proteins purified to high yield (>5 mg l⁻¹) and decreased the rate of duplex unwinding, from 1-fold to 1000-fold (Fig. 6B). In agreement with the in vitro results, mutations R276A and R276K support yeast growth (Fig. 6C), whereas the other three mutations could not complement DED1 (data not shown). Similarly, mutation of the residue corresponding to Arg-534 in S. cerevisiae PRP28 causes dominant negative lethality in vivo (53). In sum, medulloblastoma variants of DDX3 at Arg-276 or Arg-534 are inactivating, further confirming that full-length, inactive DDX3 is selected for by this tumor (21-23,25), as well as demonstrating the utility of DDX3 132-607 in determining the functional consequences of disease-associated variants of DDX3.

Discussion
DEAD-box proteins consist of two RecA-like domains that comprise the "helicase" core surrounded by variable regions that are unique to individual subfamilies (1,29). Here, we studied the role of the N- and C-terminal extensions that are essential for function in the Ded1/DDX3 subfamily of DEAD-box proteins (5). We found that removal of the N-terminal 131 and C-terminal 55 residues yields an active construct of DDX3, but further truncation is deleterious (Fig. 2). We then solved the highest resolution crystal structure of an active construct of DDX3 to date (Fig. 3), illustrating a unique autoinhibited state of the protein (Fig. 4) and providing an excellent starting structure for molecular dynamics simulations (Fig. 5). We find that two untested variants found in the highly malignant brain tumor medulloblastoma inactivate the protein in vitro and are lethal in yeast (Fig. 6). In sum, our data characterize the essential, conserved core of the Ded1/DDX3 subfamily, which will prove useful when interpreting variants found in human malignancies.
DDX3 132-607 crystallized with a unique interdomain interface not seen in previous crystal structures (Fig. 4A). The interface buries ~560 Å² of surface area and overlaps with the RNA-binding surface, suggesting that it is inhibitory to RNA duplex unwinding (Fig. 3B). Indeed, introduction of point mutations predicted to destabilize the interdomain interface accelerated the rate of duplex unwinding, despite being distal to the RNA-binding site, the ATP-binding site, and the closed-state interdomain interface (Fig. 4). In support of the presence of this state in solution at neutral pH, it was stable during our molecular dynamics simulation (Fig. 5). Therefore, DDX3 contains a cryptic second binding site for the HELICc domain on the DEAD-domain, which is present in solution and is inhibitory for function. S. cerevisiae Prp5p was also crystallized in an inhibitory conformation, and destabilization of this "twisted" state accelerated catalysis (54). It is possible that specific proteins may bind and stabilize inhibitory conformations of DEAD-box proteins to negatively regulate catalysis, as opposed to the many MIF4G domains that bind and activate catalysis (55-59).
Other DEAD-box subfamilies have different subfamily-specific NTEs and CTEs. For example, the structure of RNA-bound DDX19 (PDB 3GOH) (60) contains an NTE helix similar to DDX3 and Vasa. However, this region forms part of a β-sheet in the ADP-bound structure of DDX19 (PDB 3EWS) (60). In the crystal structure of S. cerevisiae Prp5p, an NTE helix stabilizes the twisted conformation by interacting with the DEAD and HELICc domains (54). Therefore, structural plasticity at the N-terminal boundary of the DEAD domain may be a feature common to many DEAD-box proteins. Perhaps the most extreme example characterized to date of a CTE is in the mitochondrial DEAD-box protein Mss116p, where the CTE forms an entire domain essential for RNA binding and is a fundamental piece of the helicase core (61,62), unlike most DEAD-box proteins. Thus, to understand the function of an individual DEAD-box subfamily, it is essential to characterize sequences conserved within the subfamily and beyond the DEAD-box helicase core.
Interestingly, the Ded1/DDX3 subfamily of DEAD-box proteins multimerizes both in vitro and in vivo (32). Oligomerization depends on the C-terminal tail of Ded1, and truncation of the C-terminal 69 residues blocks multimerization and hinders duplex unwinding (32). Our data show that removal of this region strongly suppresses duplex unwinding in vitro, but only if the conserved RDYR motif is deleted (Figs. 1A and 2A). Future experiments will test the role of the RDYR motif and C-terminal sequences in oligomerization of the Ded1/DDX3 subfamily.
Human DDX3X is altered in numerous malignancies, and different disorders have unique mutation spectra indicative of their distinct requirements for DDX3 function. For example, nearly all variants found in patients afflicted by the malignant brain tumor medulloblastoma are nonsynonymous single nucleotide variants yielding point mutants predicted to inactivate the protein, yet there are no premature stop codons, frameshifts, or splice variants (21-24). Thus, full-length but inactive protein is selected by this tumor type. In contrast, in blood cancers such as natural-killer/T-cell lymphoma (17), Burkitt lymphoma (63), or chronic lymphocytic leukemia (15,16), DDX3X variants include nonsynonymous single nucleotide variants but also many premature stop codons, frameshifts, and splice variants. The elucidation of the minimal conserved functional core and the new, high-resolution structures of DDX3 presented here are of broad utility for molecular modeling and for predicting the function of truncating variants of DDX3 present in patient samples.