The DEAD-box protein DDX43 (HAGE) is a dual RNA-DNA helicase and has a K-homology domain required for full nucleic acid unwinding activity

The K-homology (KH) domain is a nucleic acid-binding domain present in many proteins but has not been reported in helicases. DDX43, also known as HAGE (helicase antigen gene), is a member of the DEAD-box protein family. It contains a helicase core domain in its C terminus and a potential KH domain in its N terminus. DDX43 is highly expressed in many tumors and is, therefore, considered a potential target for immunotherapy. Despite its potential as a therapeutic target, little is known about its activities. Here, we purified recombinant DDX43 protein to near homogeneity and found that it exists as a monomer in solution. Biochemical assays demonstrated that it is an ATP-dependent RNA and DNA helicase. Although DDX43 was active on duplex RNA regardless of the orientation of the single-stranded RNA tail, it preferred a 5′ to 3′ polarity on RNA and a 3′ to 5′ direction on DNA. Truncation mutations and site-directed mutagenesis confirmed that the KH domain in DDX43 is responsible for nucleic acid binding. Compared with the activity of the full-length protein, the C-terminal helicase domain had no unwinding activity on RNA substrates and had significantly reduced unwinding activity on DNA. Moreover, the full-length DDX43 protein with a single amino acid change in the KH domain had reduced unwinding and binding activities on RNA and DNA substrates. Our results demonstrate that DDX43 is a dual helicase and that the KH domain is required for its full unwinding activity.

Helicases are molecular motors that transduce the chemical energy generated by ATP hydrolysis into oligonucleotide strand separation and protein displacement. They are involved in virtually all aspects of nucleic acid metabolism, including replication, repair, recombination, transcription, chromosome segregation, and telomere maintenance (1)(2)(3). They move directionally along one nucleic acid strand; thus helicases are classified as 5′→3′ or 3′→5′ with respect to the strand they bind to and move along. Based on substrates, helicases can be classified as DNA or RNA helicases, although some can function on both DNA and RNA molecules (4). According to their conserved motifs, helicases are also classified into six superfamilies (5), among which SF2 is the largest superfamily and includes the DEAD-box helicases.
Several subfamilies of DEAD-box helicases have generated particularly high interest because individual enzymes from these subfamilies are involved in multiple processes of RNA metabolism (2), such as transcription and pre-mRNA splicing (6,7), ribosome biogenesis (8), nuclear export (9), translation initiation (10), RNA degradation (11,12), and organelle gene expression (13). Altogether, DEAD-box helicases are the largest group of enzymes involved in eukaryotic RNA metabolism. DEAD-box helicases possess a nonprocessive strand separation activity: most RNA helicases examined to date do not efficiently unwind duplexes of more than one and a half helical turns in vitro, but they efficiently separate shorter duplexes (2). This activity is particularly useful in many RNA metabolic processes in eukaryotic cells, where DEAD-box helicases act as important placeholders or checkpoint proteins, allowing processes to proceed in order and connecting steps in the RNA metabolism machinery (15).
DDX43 (DEAD-box polypeptide 43), also known as HAGE (helicase antigen gene), was first identified as a cancer/testis antigen gene in a human sarcoma cell line (16). The gene is present on chromosome 6 (6q12-q13), as determined by radiation hybrid analysis, and encodes a putative 73-kDa protein that belongs to the DEAD-box family of ATP-dependent RNA helicases. With the exception of testis, which shows a high level of expression, DDX43 mRNA is expressed in a wide range of tumor tissues at levels at least 100-fold those of normal tissues (16). At the protein level, DDX43 is also detected at different levels in a variety of tumor tissues, including bladder, brain, breast, colon, esophagus, kidney, liver, lung, stomach, and small intestine among others, but is absent from normal tissues or present only at very low levels (17). DDX43 is also overexpressed in >50% of chronic myeloid leukemia, 20% of acute myeloid leukemia (18), and >40% of multiple myelomas. Therefore, considering the expression pattern and the diversity of DDX43 expression in different tumors, it may represent a suitable target for immunotherapy.
DDX43 protein contains the signature motifs of SF2 RNA helicases, including Q, I, Ia, Ib, II, III, IV, V, and VI (supplemental Fig. S1A). In addition, DDX43 also has conserved motifs Ic, Va, and Vb that are present in certain DEAD-box proteins. Besides the helicase core domain located in its C terminus, DDX43 also possesses potential K-homology (KH) domains at its N terminus (supplemental Fig. S1B) with their hallmark sequence, GXXG (19). The KH domain was first identified in the human heterogeneous nuclear ribonucleoprotein K (hnRNP K) more than two decades ago (20,21) and subsequently has been found in a wide diversity of organisms spanning archaea, bacteria, and eukaryotes (22). KH domains exist in two different versions: the type 1 KH-fold (KH1) is found in eukaryotes, often in multiple copies, and the type 2 KH-fold (KH2) is found in prokaryotes, often in a single copy (23,24). The KH domain is ~70 amino acids long, with the most conserved consensus sequence, VIGXXGXXI, mapping to the middle of the domain (25). The KH domain is composed of three α-helices packed onto the surface of a central antiparallel β-sheet. KH1 and KH2 domains share a minimal βααβ core, with two additional α and β elements positioned at either the C terminus (KH1) or the N terminus (KH2) of this core motif (26,27). The highly conserved GXXG loop has a specific sequence element that establishes contact with the nucleic acid, which is essential for the biochemical function of the proteins (19). KH domain-containing proteins perform a wide range of cellular functions, and several diseases, including paraneoplastic syndromes and some cancers, are associated with the loss of function of specific KH domains (19). Although KH domains have been found in a variety of proteins, no KH domain has been described in helicases.
Here, we found that DDX43 protein could function as both an RNA and DNA helicase in an ATP-dependent manner. Intriguingly, DDX43 unwound duplex RNA regardless of single-strand tail orientation, but it preferred a 5′ to 3′ polarity on RNA substrates and unwound DNA only in the 3′ to 5′ direction. Site-directed mutagenesis and biochemical assays revealed that there is a KH domain in the N terminus that is responsible for nucleic acid binding. The C-terminal helicase core domain displayed very weak unwinding on DNA substrates and no unwinding on RNA substrates. These results suggest that the KH domain is required for the unwinding activity of DDX43 helicase.

Expression and purification of DDX43 protein
To characterize the DDX43 protein, we cloned the human DDX43 gene into a pET28a vector, overexpressed it in Escherichia coli, and purified the protein by two-step chromatography. First, the His-tagged DDX43 protein was passed through Ni-NTA affinity chromatography, and the retained proteins were eluted with imidazole. The recombinant DDX43 proteins were purified to near homogeneity as judged by their appearance on a Coomassie-stained SDS-polyacrylamide gel (Fig. 1A). The fractions with high protein concentration were pooled and applied to a Sephacryl S-300 size exclusion column. DDX43 protein eluted in two peaks, namely peak 1 and peak 2 (Fig. 1B). Identical migration bands were observed on the Coomassie-stained SDS-PAGE gel for both peaks (Fig. 1C). The identity of DDX43 protein was confirmed by Western blot using anti-DDX43 and anti-His antibodies (Fig. 1D), indicating that both fractions are indeed recombinant DDX43 proteins. According to the molecular mass standards used to calibrate the size exclusion column (supplemental Fig. S2), peak 1 was in the void volume of the column, likely representing aggregated DDX43 proteins, whereas the mass of the peak 2 protein was 67.4 ± 5.1 kDa, which is close to the predicted mass of the monomeric form of DDX43 (72.4 kDa). Additionally, we performed multiangle laser light scattering (MALS) and refractive index measurements to further confirm the oligomeric state of DDX43 protein. The MALS analysis revealed that the majority of the protein (86.4%) existed with a molecular mass of 89.9 kDa (Fig. 1E), in line with the previous observation of monomers from the size exclusion column. A minor fraction (13.6%) was present as a dimer (179.8 kDa), which might be due to the protein concentration step using a membrane cut-off. Because the monomer and dimer peaks overlap, light scattering from the dimer fraction might have influenced the monomer measurement, causing an apparently higher molecular mass for the monomer fraction.
Therefore, we collected the monomer fractions of DDX43 for the following biochemical assays.
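The mass estimates quoted above can be compared with a few lines of arithmetic. The following is a minimal sketch, not part of the original analysis: it only uses the four masses stated in the text, and the helper names (`percent_deviation`, `oligomer_number`) are hypothetical.

```python
# Masses in kDa, taken from the text above.
PREDICTED_MONOMER = 72.4   # predicted monomeric DDX43
SEC_PEAK2 = 67.4           # size exclusion, peak 2
MALS_MAJOR = 89.9          # MALS, major species (86.4%)
MALS_MINOR = 179.8         # MALS, minor species (13.6%)


def percent_deviation(observed: float, expected: float) -> float:
    """Signed deviation of an observed mass from the expected mass, in percent."""
    return (observed - expected) / expected * 100.0


def oligomer_number(observed: float, monomer: float) -> float:
    """Observed mass expressed as a multiple of a reference monomer mass."""
    return observed / monomer


sec_dev = percent_deviation(SEC_PEAK2, PREDICTED_MONOMER)
mals_dev = percent_deviation(MALS_MAJOR, PREDICTED_MONOMER)
minor_vs_major = oligomer_number(MALS_MINOR, MALS_MAJOR)
major_vs_predicted = oligomer_number(MALS_MAJOR, PREDICTED_MONOMER)

print(f"SEC peak 2 vs predicted monomer:  {sec_dev:+.1f}%")
print(f"MALS major vs predicted monomer:  {mals_dev:+.1f}%")
print(f"MALS minor / MALS major:          {minor_vs_major:.2f}")
print(f"MALS major / predicted monomer:   {major_vs_predicted:.2f}")
```

The minor species is twice the mass of the major species, and the major species is closer to one predicted monomer than to two, which is consistent with the monomer/dimer assignment made in the text.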

DDX43 is an ATP-dependent RNA helicase
According to its primary protein sequence, DDX43 belongs to the DEAD-box protein family of RNA helicases (16); thus we started by characterizing its RNA unwinding activity. Using a 5′-tailed 13-bp duplex RNA that is commonly used with DEAD-box proteins (28), we found that DDX43 could efficiently unwind this substrate in the presence of ATP in both a concentration-dependent (0–3 μM) and a time-dependent (0–30 min) manner (Fig. 2, A and B). DDX43 also exhibited some unwinding activity on a 3′-tailed 13-bp duplex RNA (Fig. 2C) and a blunt-end 13-bp duplex RNA substrate (Fig. 2D). For example, at the highest concentration (3 μM), DDX43 unwound ~60% of 5′-tailed dsRNA compared with ~15% of 3′-tailed dsRNA and ~10% of blunt-end dsRNA (Fig. 2E). However, when we increased the duplex length from 13 to 16 bp for the 5′-tailed RNA substrate, DDX43 failed to unwind this substrate under the same reaction conditions (Fig. 2F), even at high protein concentration (9 μM), suggesting that DDX43 has low processivity on duplex RNA.
To confirm that the unwinding activity detected was truly dependent on DDX43 and not due to contaminants in the DDX43 protein preparation, we changed the conserved lysine 292 to alanine (K292A) in motif I, which is essential for ATP hydrolysis, and aspartic acid 396 to alanine (D396A) in the DEAD box of motif II, which helps to bind cations for enzyme activity (supplemental Fig. S1). Using the same methods and conditions as for wild-type DDX43, the mutant proteins were purified to near homogeneity (supplemental Fig. S3A); however, neither K292A nor D396A displayed any unwinding activity on the 5′-tailed 13-bp duplex RNA substrate (supplemental Fig. S3, B and C).
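Percentages such as those quoted for Fig. 2E are typically derived from gel band intensities. The sketch below illustrates one common convention, in which the fraction unwound is the product signal divided by the total signal, optionally normalized against a no-enzyme control lane. This formula is an assumption for illustration; the section above does not state how the values were quantified, and the function names and intensities are hypothetical.

```python
from typing import Optional


def fraction_unwound(substrate: float, product: float) -> float:
    """Fraction of signal in the unwound (single-stranded) product band."""
    total = substrate + product
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return product / total


def percent_unwound(
    substrate: float,
    product: float,
    bg_substrate: Optional[float] = None,
    bg_product: Optional[float] = None,
) -> float:
    """Percent unwound, optionally corrected against a no-enzyme lane.

    Assumed correction: (f - f0) / (1 - f0), where f0 is the fraction of
    product already present without enzyme.
    """
    f = fraction_unwound(substrate, product)
    if bg_substrate is None or bg_product is None:
        return 100.0 * f
    f0 = fraction_unwound(bg_substrate, bg_product)
    if f0 >= 1.0:
        raise ValueError("control lane is fully unwound; cannot normalize")
    return 100.0 * (f - f0) / (1.0 - f0)


# Hypothetical intensities (arbitrary units), not data from the paper.
raw = percent_unwound(substrate=400.0, product=600.0)
corrected = percent_unwound(400.0, 600.0, bg_substrate=950.0, bg_product=50.0)
print(f"raw: {raw:.1f}%  background-corrected: {corrected:.1f}%")
```

The normalization matters most for short duplexes, where some spontaneous melting in the control lane would otherwise be counted as helicase activity.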

DDX43 is an ATP-dependent 3′→5′ DNA helicase
Many helicases have been found to possess dual unwinding activity, acting on both DNA and RNA substrates (4). Thus, we asked whether DDX43 can work on DNA substrates as well. Using a 19-bp forked duplex DNA substrate, we found that DDX43 can efficiently unwind the DNA substrate in the presence of ATP in both a concentration-dependent (0–3 μM) and a time-dependent (0–30 min) manner (Fig. 3, A and B). In contrast, it was less active on a 3′-tailed 19-bp duplex DNA (Fig. 3C) and totally inactive on a 5′-tailed 19-bp duplex DNA substrate (Fig. 3D) and a 19-bp blunt-end duplex DNA substrate (Fig. 3E), indicating that DDX43 unwinds duplex DNA with defined directionality by the conventional mechanism, namely translocating in the 3′→5′ direction coupled to unwinding. At the highest concentration (3 μM), DDX43 unwound ~76% of forked dsDNA and ~25% of 3′-tailed dsDNA (Fig. 3F). Furthermore, we found that DDX43 could unwind 30-, 40-, and 50-bp forked duplex substrates; however, its unwinding activity decreased significantly as the duplex length increased (Fig. 3, G-I). For example, at the highest concentration, DDX43 unwound ~65% of 30-bp dsDNA, ~50% of 40-bp dsDNA, and ~15% of 50-bp dsDNA (Fig. 3J), suggesting that the processivity of DDX43 is moderate on DNA substrates. We also observed that DDX43 could unwind DNA/RNA hybrid substrates (supplemental Fig. S4A); however, it displayed higher efficiency when the tailed strand was RNA (supplemental Fig. S4, B-D).
Again, to confirm that the DNA unwinding activity detected was truly dependent on DDX43 protein, we examined the unwinding activity of two engineered mutants, K292A and D396A, and found both mutations abolished unwinding activity of DDX43 on the DNA substrate (supplemental Fig. S3, D and E). Because many helicases have annealing activity, we also examined the annealing activity of DDX43 in the absence and presence of ATP and found that the DDX43 protein did not exhibit any annealing activity for either RNA (supplemental Fig. S5A) or DNA (supplemental Fig. S5B).
Because DDX43 can separate both dsRNA and dsDNA substrates, we sought to determine whether the single-stranded nucleic acid tail in substrates might influence the ability of DDX43 to unwind the flanking duplex. For RNA substrates, we used 8- and 18-nt 5′-tailed 13-bp duplex RNA substrates and found a slight increase in unwinding with increasing tail length (supplemental Fig. S6, A-C). For DNA substrates, increased unwinding activity was observed when we lengthened the ssDNA tail from 15 to 25 nt (supplemental Fig. S6, D-F). Again, these results suggested that DDX43 unwinds RNA by a non-canonical/local destabilization mechanism while unwinding DNA by the canonical translocation mechanism, and that the single-stranded nucleic acid tail may serve as a loading dock for DDX43 protein on both dsRNA and dsDNA.
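The duplex-length dependence reported in this section for forked DNA can be tabulated directly. The sketch below is illustrative only: it uses the approximate percentages quoted in the text (each at the highest protein concentration), and normalizing to the 19-bp substrate is our own choice for display, not an analysis from the paper.

```python
# Approximate percent unwound for forked duplex DNA, keyed by duplex
# length in bp, as quoted in the text (Fig. 3F and Fig. 3J).
unwound_by_length = {19: 76.0, 30: 65.0, 40: 50.0, 50: 15.0}

lengths = sorted(unwound_by_length)
values = [unwound_by_length[n] for n in lengths]

# Check that activity falls with every increase in duplex length.
is_decreasing = all(a > b for a, b in zip(values, values[1:]))

# Activity relative to the shortest (19-bp) substrate.
relative = {n: unwound_by_length[n] / unwound_by_length[19] for n in lengths}

for n in lengths:
    print(f"{n:>2} bp: {unwound_by_length[n]:5.1f}% unwound, "
          f"{relative[n]:.2f} of the 19-bp value")
print("monotonic decrease:", is_decreasing)
```

On these numbers the steepest drop occurs between 40 and 50 bp, where activity falls to roughly one fifth of the 19-bp value.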

ATP hydrolysis and Mg2+ are required for DDX43 unwinding
Helicases utilize the energy derived from ATP hydrolysis to separate base-paired DNA or RNA. First, we aimed to identify the best NTP for the unwinding activity of DDX43 and found that efficient unwinding was observed with ATP and, to a small extent, dATP on the 5′-tailed 13-bp duplex RNA substrate (Fig. 4A), which is consistent with previous findings for DEAD-box helicases, whereas ATP and dATP were equally effective on the forked duplex DNA (Fig. 4B).
The conformational change caused by ATP binding to DEAD-box helicases might lead to unwinding activity (29,30); therefore, we used the non-hydrolyzable ATP analogs ATPγS and AMP-PNP and found no unwinding activity on DNA and RNA substrates (Fig. 4, C and D), indicating that ATP hydrolysis is essential for the unwinding activity of DDX43. So far the unwinding reactions were performed under multiple turnover conditions; thus, to mimic single turnover conditions, we used the non-hydrolyzable ATP analog ADP-BeFx (30), which mimics the ATP prehydrolysis state, and found that DDX43 could not unwind the DNA substrate (Fig. 4E). Thus, we concluded that ATP hydrolysis is required for DDX43 helicase activity on the substrates. Next, we asked whether DDX43 could function as an ATPase. In the absence of a DNA or RNA stimulator, DDX43 displayed no detectable ATP hydrolysis activity (data not shown). In the presence of circular single-stranded M13 DNA, the hydrolysis activity of DDX43 was greatly stimulated, whereas the engineered mutants DDX43-K292A and -D396A displayed no ATPase activity (Fig. 4F). Using dT30 or rU30 as a stimulator (both at 0.03 mM), DDX43 had greater ATP hydrolysis with dT30 than with rU30 (Fig. 4G); for example, at the 45-min time point, DDX43 hydrolyzed 9.8 ± 1.7 pmol of ATP in the presence of dT30 and 2.1 ± 0.9 pmol of ATP in the presence of rU30. These results indicate that DDX43 possesses an intrinsic DNA- or RNA-dependent ATP hydrolysis activity. Also, we found that Mg2+ was the best cation for the helicase activity of DDX43 (supplemental Fig. S7, A and B), and ATP:Mg2+ ratios of 1:1 and 1:2 were the best for its unwinding activity (supplemental Fig. S7, C and D).
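The ATPase values quoted for the 45-min time point allow a rough comparison of the two stimulators. The sketch below is a back-of-the-envelope illustration under two assumptions that the text does not state: that the ± values are independent standard deviations, and that dividing by 45 min gives a meaningful average rate (that is, hydrolysis is roughly linear over that period).

```python
import math

TIME_MIN = 45.0
DT30 = (9.8, 1.7)   # pmol ATP hydrolyzed with dT30 (mean, ± value)
RU30 = (2.1, 0.9)   # pmol ATP hydrolyzed with rU30 (mean, ± value)


def ratio_with_error(a, b):
    """Ratio a/b with first-order error propagation for independent errors."""
    (a_mean, a_err), (b_mean, b_err) = a, b
    ratio = a_mean / b_mean
    rel_err = math.sqrt((a_err / a_mean) ** 2 + (b_err / b_mean) ** 2)
    return ratio, ratio * rel_err


fold, fold_err = ratio_with_error(DT30, RU30)
rate_dt30 = DT30[0] / TIME_MIN   # average pmol/min over 45 min (assumed linear)
rate_ru30 = RU30[0] / TIME_MIN

print(f"dT30/rU30 stimulation: {fold:.1f} ± {fold_err:.1f} fold")
print(f"average rate with dT30: {rate_dt30:.3f} pmol/min")
print(f"average rate with rU30: {rate_ru30:.3f} pmol/min")
```

Under these assumptions the stimulation by dT30 is several-fold higher than by rU30, but the uncertainty on the ratio is large because the rU30 signal is close to its own error.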

DDX43 has weak translocase activity but no RNP remodeling activity
DDX43 unwinds dsDNA from 3′ to 5′ but not from 5′ to 3′, indicating that it might be a 3′ to 5′ translocase. Triplex displacement experiments have been utilized to monitor the translocase activity of helicases, such as AddAB (31), FANCM (32), and ChlR1 (33). In this assay a triple helix is formed when a third strand forms Hoogsteen base pairs with a DNA duplex. If a translocase proceeds through the triplex, it will displace the third strand. Using a 5′-tailed triplex structure, we found that DDX43 could not displace the third strand, whereas ChlR1 helicase could (Fig. 5A); however, DDX43 could displace the third strand in a 3′-tailed triplex, albeit weakly (Fig. 5B), indicating a 3′ to 5′ directional translocation activity.
Next, we asked whether DDX43 coupled its translocase activity to nucleoprotein remodeling activity. We performed a streptavidin displacement assay in which we monitored the ability of DDX43 to displace a streptavidin molecule bound to a biotinylated DNA. Because DDX43 behaves as a 3′→5′ DNA helicase, we designed a 65-mer oligonucleotide in which the biotin is conjugated 28 nt from the 5′-end and 37 nt from the 3′-end (supplemental Table S1). Briefly, DDX43 protein and the substrate were preincubated before the reaction was initiated by adding an ATP/Mg2+ mixture and an excess of biotin to trap free streptavidin molecules. As shown in Fig. 5, C and D, DDX43 failed to displace streptavidin in either a protein concentration-dependent or a time-dependent fashion. Under the same reaction conditions, FANCJ helicase was able to remove streptavidin from DNA, as we have previously observed (34).

DDX43 has a KH domain in its N terminus
From its primary sequence, DDX43 contains three potential KH domains with three signature GXXG sequences (supplemental Fig. S1). Close inspection of the second GXXG sequence revealed additional nearby conserved amino acid sequences shared with other well known KH domain-containing proteins (supplemental Fig. S8A), which were absent around the first and third sequences (data not shown). Moreover, these amino acids are conserved in DDX43 orthologues across species (supplemental Fig. S8B). To confirm experimentally that the region containing the second GXXG sequence is a KH domain, we cloned these conserved 74 amino acids (69–142 aa, named DDX43 KH) into a pET28 vector and purified this protein to near homogeneity (Fig. 6A). Previous studies have shown that the first glycine residue in the GXXG sequence is essential for the RNA binding function of the KH domain (35)(36)(37); thus, we changed the first glycine in the GXXG sequence to aspartic acid (DDX43 KH-G84D). The far-UV wavelength scan of the protein revealed that DDX43 KH is well folded with predominantly α-helical structure, showing minima at 222 and 208 nm (Fig. 6B). This result is concordant with previous reports on the KH domains of the P-element somatic inhibitor protein (38) and Fragile X mental retardation protein (39,40). Further analysis of the CD data with Selcon 3 (41) revealed that the KH domain of DDX43 has 41% α-helices, 35% β-sheets, and 24% unordered structure. The unordered fraction might be due to the N-terminal and C-terminal His tags derived from the vector. The secondary structure prediction by PSIPRED (42) gave similar results, 40% α-helices, 30% β-sheets, and random coil, and showed that the GRGG motif falls within the predicted βα-folded domain (Fig. 6C). Similar results were obtained for the DDX43 KH-G84D protein (data not shown).
Electrophoretic mobility shift assay (EMSA) revealed that the wild-type KH domain bound ssDNA and forked dsDNA; in contrast, the mutant, DDX43 KH-G84D, showed no binding (Fig. 6, D and E). Similar results were obtained with ssRNA (rU30) and dsRNA (Fig. 6, F and G). Neither protein bound blunt-end dsDNA (Fig. 6H), which is consistent with the structural evidence (43).
To elucidate potential sequence specificity in binding by the KH domain, we labeled a dC30 (oligodeoxycytosine, 30 residues), a dA30 (oligodeoxyadenine, 30 residues), and a random DNA oligo (DNA 30-mer, supplemental Table S1) under the same reaction conditions as dT30
To eliminate the possibility that the two other GXXG sequences were the KH domain motifs, we also cloned the N-terminal region (1–253 aa, named DDX43 NT), which contains all three GXXG sequences, into a pET28a vector for protein expression. We changed the first glycine to aspartic acid in each conserved GXXG sequence, namely G46D, G84D, and G154D (named DDX43 NT-G46D, NT-G84D, and NT-G154D, respectively). The three mutant proteins were purified along with the wild type to near homogeneity (supplemental Fig. S10A). Using the 19-bp forked duplex DNA substrate, EMSA revealed that G46D and G154D bound forked dsDNA comparably to or even better than the wild-type protein (supplemental Fig. S10B); however, the G84D mutation abolished binding activity. Similar results were obtained with ssDNA (supplemental Fig. S10C) and RNA substrates (supplemental Fig. S10D). None of them bound blunt-end dsDNA (supplemental Fig. S10E). Collectively, we concluded that the region containing the second GXXG sequence (residues 84–87) is a functional KH domain responsible for nucleic acid binding.
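Locating candidate GXXG motifs in a protein sequence, and building the glycine-to-aspartate variants used above, is straightforward to script. The sketch below is illustrative only: the sequence is a short made-up peptide, not the DDX43 sequence, and the function names are hypothetical. A lookahead is used so that overlapping motifs are all reported.

```python
import re

# Lookahead so that overlapping GXXG occurrences are all found.
GXXG = re.compile(r"(?=(G[A-Z]{2}G))")


def find_gxxg(sequence: str):
    """Return (1-based position of the first glycine, motif) for every GXXG."""
    return [(m.start() + 1, m.group(1)) for m in GXXG.finditer(sequence.upper())]


def substitute(sequence: str, position: int, old: str, new: str) -> str:
    """Replace residue `old` at a 1-based position with `new` (e.g. G to D)."""
    if sequence[position - 1] != old:
        raise ValueError(f"expected {old} at position {position}, "
                         f"found {sequence[position - 1]}")
    return sequence[: position - 1] + new + sequence[position:]


# Toy peptide for demonstration; NOT the DDX43 sequence.
toy = "MKVIGRGGSAAGAAGK"
motifs = find_gxxg(toy)
mutant = substitute(toy, 5, "G", "D")   # analogous in spirit to G84D
mutant_motifs = find_gxxg(mutant)

print(motifs)
print(mutant, mutant_motifs)
```

Replacing the first glycine removes that motif from the scan while leaving the second one intact, which mirrors the logic of testing each GXXG sequence separately.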

KH domain is required for efficient helicase activity of DDX43 protein
As we observed that DDX43 unwinds RNA duplexes by a non-processive mechanism and DNA duplexes by a processive mechanism, we suspected that the helicase core domain may be responsible for this phenomenon (5). To address this, we purified the C-terminal helicase core domain (254–648 aa, named DDX43 HD) to near homogeneity (Fig. 7A). Using a 19-bp forked duplex DNA substrate, we detected unwinding activity for the helicase domain protein (Fig. 7B), although it was weaker than that of the full-length DDX43 protein. Moreover, it was active on a 3′-tailed 19-bp duplex DNA substrate (Fig. 7C) but not on 5′-tailed (Fig. 7D) or blunt-end 19-bp duplex DNA substrates (Fig. 7E), indicating that the helicase domain, like the full-length protein, unwinds duplex DNA with defined directionality, namely in the 3′ to 5′ direction. Unexpectedly, unwinding activity was not detected on the RNA substrate under our regular reaction conditions (Fig. 7F), even with increased enzyme (9 μM) or reaction time (45 min, data not shown). These results suggest that the helicase core domain is responsible for the processivity on DNA molecules; however, it is not as active as the full-length protein.
Because the helicase core domain does not unwind as efficiently as the full-length protein, we suspected that the KH domain is essential for its full functionality. To address this, we mutated the conserved glycine at position 84 to aspartic acid (named DDX43 FL-G84D) in the context of the full-length DDX43 protein. The mutant protein was purified to near homogeneity (Fig. 8A). Helicase assays revealed that DDX43 FL-G84D had reduced unwinding activity on both RNA (Fig. 8B) and DNA substrates (Fig. 8C), suggesting that the KH domain is required for the full unwinding activity of DDX43 protein. EMSA showed that DDX43 FL-G84D had significantly reduced binding to the forked dsDNA substrate (Fig. 8D), suggesting that the reduced unwinding activity might be due to poor binding between DDX43 protein and substrates. Taken together, these results suggest that the N-terminal KH domain of DDX43 binds to nucleic acids independently of the C-terminal helicase domain, whereas the C-terminal helicase domain requires the N-terminal KH domain for its full activity (Fig. 8E). Our EMSA results also revealed that the N-terminal region has higher affinity for DNA and RNA molecules than the helicase core (supplemental Fig. S11). These results suggest that the presence of an ancillary KH domain is essential for the efficient biochemical activities of the C-terminal helicase domain, likely by tightening the binding between the helicase core and nucleic acid substrates.

Discussion
Although KH domains have been reported in many RNA-binding proteins, the role of a KH domain in a helicase has not been studied. In this study we have demonstrated that DDX43 is an ATP-dependent dual helicase that unwinds both RNA and DNA substrates. It catalyzes the unwinding reaction most efficiently in the presence of Mg2+ and ATP. Intriguingly, DDX43 does not follow a strict translocation mechanism to unwind RNA substrates, but it unwinds DNA in a unidirectional manner. We also discovered that the KH domain in the N-terminal region is involved in nucleic acid binding and is essential for the efficient unwinding activity of DDX43 helicase. Superfamily 2 (SF2) helicases tend to be strictly specific for RNA or DNA, but it is not uncommon for a helicase to unwind both DNA and RNA substrates. Most of the DEAD-box proteins display clear specificity for RNA, whereas very few can unwind both DNA and RNA substrates, e.g. p68 (DDX5) (44) and Dbp9p (DDX56) (45). Some can unwind DNA-RNA chimeric substrates as well, e.g. Ded1 (46) and DDX1 (47). DDX43 is classified as a DEAD-box RNA helicase based on sequence homology to the SF2 superfamily. Indeed, it has been reported that DDX43 unwinds duplex RNA (48). Our biochemical characterization of DDX43 indicated that both DNA and RNA can stimulate its ATP hydrolysis activity and that it can unwind both substrates; therefore, DDX43 is an addition to the dual helicase family.
Generally, RNA helicases have low processivity on the substrate. For example, eIF4A (DDX2), a prototypic member of the DEAD-box family, acts in a non-processive manner to unwind 10–15-bp duplex RNA (49). In fact, almost all members of the human DEAD-box family studied so far appear similar to eIF4A and have limited processivity; for example, 10 bp for DDX1 (50), 20 bp for DDX25 (51), 15 bp for DP103 (DDX20) (52), and 25 bp for RH-II (DDX21) (53). Unlike most helicases, the efficiency of strand separation for DEAD-box proteins is strongly dependent on the stability of the helix; thus, unwinding efficiency decreases as the length and stability of the duplex increase. A few exceptions to this trend exist: DDX3 (50 bp) (54), p68 (162 bp) (55), and p72 (DDX17, 41 bp) (56) were shown to unwind moderately long RNA duplexes. However, it should be noted that reaction conditions, such as the amount of substrate, protein concentration, duration of the reaction, and amount of ATP, may account for the differences reported. Even for DDX43, it has been reported that it can unwind a 40-bp duplex RNA under certain conditions (48,57). Nevertheless, most of the DEAD-box helicases have limited processivity.
Many DEAD-box proteins unwind RNA substrates without any preference for the polarity of single-stranded overhangs and unwind duplexes by interacting with an internal region of an RNA duplex and causing local strand separation; as a result, unwinding occurs without a defined polarity (46). Using a 16-bp duplex RNA substrate, Ded1 and Mss116p have been shown to unwind RNA duplexes without strict polarity, including blunt-end duplexes (46,58). Further additions to this list are p68 (59), eIF4A (60), DDX25 (51), Has1p (61), and SrmB (62). This distinct unwinding mode of DEAD-box proteins appears uniquely suited for the localized separation of short duplexes in the cell. First, the duplex RNAs present in cells are usually short (no longer than 10 bp). Second, the structure of Mss116p with RNA revealed that a single protein monomer could bind along most of the length of U10 RNA, raising the possibility that strand separation by DEAD-box proteins does not involve any translocation at all (63).
In our study we found that DDX43 could efficiently unwind 13-bp dsRNA but not 16-bp dsRNA, suggesting that DDX43 is a low processivity DEAD-box helicase. It should also be noted that exclusive RNA duplex structures exist mainly during viral replication and transcription, and helical elements within structured RNAs are rarely longer than 10 base pairs (64). Thus, there appears to be little need for highly processive RNA helicases. We also found that DDX43 can unwind the blunt-end RNA substrate, suggesting that unwinding could begin internally. However, DDX43 is more active on 5′-tailed than on 3′-tailed dsRNA substrates, the reason for which is unknown. A very recent study reported that the Ski2-like RNA helicase Mtr4p unwinds RNA duplexes by 3′ to 5′ translocation (65), suggesting that some RNA helicases may unwind RNA duplexes by translocation-based mechanisms.
Translocation polarity is a predefined and inherent feature of a particular enzyme and reflects helicase movement in either the 3′→5′ or the 5′→3′ direction (5). Helicase domain 1 (RecA1) interacts with the 3′-end, and helicase domain 2 (RecA2) interacts with the 5′-end of bound single strands, resulting in an interface with a well defined asymmetry that is likely to differentiate the two possible strand orientations within the binding cleft (66). Intriguingly, in our study we found that DDX43 unwinds the DNA substrate by the canonical mechanism, that is, in a defined 3′→5′ direction. It has been found that the helicase core domains HD1 and HD2 in the Mss116p DEAD-box helicase have modular functions enabling a novel mechanism for RNA-duplex recognition and unwinding, where HD1 binds ATP and HD2 contains a nucleic acid-binding pocket that accommodates A-form but not B-form duplexes (67). Also, the crystal structures of Mss116p with ssDNA and ssRNA revealed that the protein forms a closed-state complex with both nucleic acids. Although the interactions are similar, the closed-state complex with ssRNA contains protein contacts to the 2′-OH groups of the RNA that are absent in the closed-state complex with ssDNA (68). To probe the effects of RNA in various parts of the loading strand, a set of chimeric substrates was synthesized according to previous studies on NPH-II helicase (69) (supplemental Table S1). For the same duplex, such as the RNA duplex (supplemental Fig. S12, A and B) and the DNA:RNA hybrid (supplemental Fig. S12, C and D), RNA is the preferred loading strand for DDX43, and even the few nucleotides that are immediately adjacent to the junction with the duplex made a difference (supplemental Fig. S12, E-G). This provides a basis for substrate specificity and might explain the differential unwinding mechanisms on RNA and DNA substrates.
However, additional biochemical and structural studies are required to elucidate the unique polarity of DDX43 helicase on RNA and DNA substrates: for example, site-directed mutagenesis of helicase domain 1 in DDX43 to determine whether its unwinding is directly regulated by helicase domain 1 as reported in SF2 helicase (70). Also co-crystallization of DDX43 protein with nucleic acids may provide evidence of directional translocation.
Most DNA helicases use translocase mechanisms to unwind their substrates. A high degree of processivity is crucial for helicases involved in DNA replication, where millions of base pairs must be replicated quickly. In our study we found that DDX43 is more processive on DNA substrates than on RNA substrates; however, unwinding decreased significantly with increasing duplex length, which is different from classic DNA helicases. A key property of DNA helicases is their ability to unwind long duplexes. Certain DNA helicases (e.g. MCM (71), RecBCD (72), TraI (73)) can processively unwind DNA tracts longer than 500 base pairs. Even on a preferred forked duplex DNA substrate, DDX43 acts inefficiently to unwind a 50-bp duplex, suggesting that DDX43 is specifically tailored to act on short duplex substrates; however, its exact role in DNA metabolism needs further investigation.
Many helicases contain accessory domain(s) in their N-terminal or C-terminal regions that play critical roles in helicase function. It has been reported that DEAD-box helicases bind to helix extensions with low or no specificity, but the ancillary domains help in target recognition and thus unwinding. For example, the Thermus thermophilus protein Hera binds structured RNAs through an RNA recognition motif (RRM)-like domain (74); another DEAD-box helicase, DbpA, uses its C-terminal ancillary domain to recognize a specific hairpin within the 23S rRNA and plays an important role in biogenesis of the large ribosomal subunit (75). Furthermore, deletion of the C-terminal tail, which is involved in RNA recognition in CYT-19 and Mss116p, has been found to affect the RNA chaperone activities of both proteins (76,77). In contrast, negative regulation has been reported in DDX19 helicase, where the N-terminal extension prevents its ATP hydrolysis activity (78).
In our study we found that the KH domain in the N terminus of DDX43 is responsible for substrate binding. Consistent with the literature on DEAD-box helicases, our results showed that only the full-length DDX43 protein could efficiently unwind RNA and DNA duplex substrates. The truncated protein comprising only the C-terminal helicase domain did not show any unwinding activity on RNA and had very weak unwinding activity on DNA, indicating that the KH domain is crucial for DDX43 to unwind RNA substrates, which supports the notion that the KH domain is an RNA-binding domain (79). More studies are required to determine the structural and functional role of the KH domain in DDX43 helicase; for example, whether the KH domain in DDX43 forms a βααβ core with the two additional α and β elements positioned in its C terminus, and whether the KH domain dictates the sequence-specific recognition of substrates for DDX43 helicase and/or stimulates the unwinding activity of the helicase core domain by binding ssRNA/ssDNA to prevent them from reannealing (Fig. 8E).
DDX43 is overexpressed in a number of cancers and thus can serve as a biomarker and immunotherapy target. Recently, it was shown that the expression of DDX43 is a potential prognostic marker and a predictor of response to anthracycline treatment in breast cancer (80,81). Frequent expression of DDX43 was also found in chronic myeloid leukemia and acute myeloid leukemia (18,82,83), and its expression is associated with advanced disease and poor prognosis in chronic myeloid leukemia (83). Therefore, DDX43 is considered an ideal target for vaccine therapy. Vaccines, dendritic cell, peptide, and whole cell-based immunotherapy can be used to target DDX43 in various cancers. Our current biochemical evidence suggests that the KH domain is essential for DDX43's enzymatic activity; as a short functional domain, it may also be a candidate epitope for the development of peptide vaccines for tumor immunotherapy.
In conclusion, our biochemical results demonstrate that DDX43 is a unique dual-function DEAD-box helicase, which points to potentially distinctive cellular functions that might correlate with its overexpression in cancers.

Plasmid DNA
A human DDX43 cDNA clone was purchased from the SPARC BioCentre, the Hospital for Sick Children, Toronto, Canada. The DDX43 gene (full-length, N-terminal region, C-terminal helicase domain, and KH domain) was PCR-amplified and cloned into the NdeI and XhoI sites of a pET28a vector (Novagen). The lysine mutation in motif I (K292A), the aspartic acid mutation in DEAD-box motif II (D396A), and the mutations of the first glycine in the GXXG sequences (G46D, G84D, and G154D) were generated with the QuikChange site-directed mutagenesis kit (Agilent Technologies) using the primers listed in supplemental Table S2. All plasmids were verified by DNA sequencing.

Recombinant protein
The plasmid pET28a-DDX43 was transformed into E. coli Rosetta 2 cells (EMD Millipore). The Rosetta 2 cells were grown at 37°C in LB medium containing 30 µg/ml kanamycin and 34 µg/ml chloramphenicol until the A600 reached 0.6 and then induced by the addition of 0.3 mM IPTG overnight at 15°C. The cells were harvested by centrifugation at 5000 × g for 10 min at 4°C. The periplasmic material was removed from the cells as described (83). Briefly, the cells were suspended in 5 ml/g of cell mass of hypertonic buffer solution (50 mM HEPES, pH 7.4, 20% sucrose, 1 mM EDTA) and centrifuged at 8000 rpm for 10 min at 4°C. The cells were resuspended in 5 ml/g of cell mass of hypotonic solution (5 mM MgSO4) and incubated for 10 min on ice. Cells were then pelleted by centrifugation at 4000 rpm for 10 min at 4°C and stored at −80°C until used. The cell suspension was lysed by sonication in buffer A (25 mM Tris, pH 8.0, 0.15 M NaCl, 100 µM Tween 20, and 10% glycerol) containing 1 mM phenylmethylsulfonyl fluoride (PMSF, final concentration) and protease inhibitor (Roche Applied Science) at 4°C with 5 short bursts of 10 s at intervals of 5 min. The cell debris and inclusion bodies were removed by centrifugation at 45,000 × g for 30 min at 4°C. Recombinant His-tagged proteins were subjected to a two-step purification using nickel affinity beads (Sigma) and a Sephacryl S-300 HR 16/60 gel filtration column (GE Healthcare). The supernatant was applied to the Ni-NTA beads equilibrated with buffer A, washed with 10 column volumes (CV) of buffer B (25 mM Tris, pH 8.0, 0.5 M NaCl, 100 µM Tween 20, and 10% glycerol) containing 25 mM imidazole, and eluted with 5 CV of buffer B containing 250 mM imidazole. The protein fractions were confirmed with SDS-PAGE; the fractions with high protein yield were pooled and subjected to size-exclusion chromatography on a Sephacryl S-300 HR 16/60 (GE Healthcare) equilibrated with buffer A.
The fractions were collected at a flow rate of 0.5 ml/min with the same buffer. The proteins in peaks were confirmed with SDS-PAGE, and the ideal fractions were pooled and concentrated. The N-terminal region, C-terminal helicase domain, and KH domain proteins were purified similarly to the full-length DDX43. FANCJ protein was purified as described previously (34). All proteins were snap-frozen in liquid nitrogen and stored at −80°C. Protein concentration was determined by the Bradford method using bovine serum albumin (BSA) as the standard.

Size exclusion chromatography and multiangle light scattering
Size-exclusion chromatography with multiangle light scattering (SEC-MALS) was performed using a Waters HPLC system coupled to static laser light-scattering and refractive index instruments (Wyatt MiniDAWN Treos and Wyatt OptiLab rEX, respectively). The DDX43 protein, purified by Ni-NTA chromatography, was concentrated using a 30-kDa centrifugal cut-off spin column (Millipore) and was injected (0.8 mg/ml; 500 µl) onto the Wyatt WTC030S5 size exclusion column (300 Å pore size for the elution of 5-1250-kDa proteins) equilibrated with buffer A (without Tween 20). The light scattering and refractive index data were used to calculate the weight-averaged molar mass and the mass fraction of each peak using the Astra package v 6.1.2 (Wyatt Technologies).

Circular dichroism (CD) spectroscopy
The proteins were eluted with buffer C (10 mM phosphate buffer, pH 7.4, 100 mM NaCl) in size-exclusion chromatography. The CD spectra of the KH domain proteins (0.5 mg/ml) were acquired using a Chirascan plus (Applied Photophysics) in 0.5-mm cuvettes, and the spectra were recorded between 190 and 280 nm. The buffer baseline was scanned 10 times and averaged. The protein samples were also scanned 10 times and averaged. The averaged baseline was subtracted from the averaged sample spectrum. The secondary structure content was analyzed using Selcon 3 (41) with the respective corrected spectrum as input.
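The averaging and baseline subtraction described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the instrument software's procedure: the function names `average_scans` and `baseline_corrected` are hypothetical, and each scan is assumed to be a list of CD readings sampled at the same wavelengths.

```python
def average_scans(scans):
    """Average repeated scans point by point (one value per wavelength)."""
    if not scans:
        raise ValueError("at least one scan is required")
    n_points = len(scans[0])
    if any(len(scan) != n_points for scan in scans):
        raise ValueError("all scans must cover the same wavelength grid")
    # zip(*scans) groups the readings taken at the same wavelength.
    return [sum(values) / len(scans) for values in zip(*scans)]


def baseline_corrected(sample_scans, buffer_scans):
    """Subtract the averaged buffer baseline from the averaged sample spectrum."""
    sample = average_scans(sample_scans)
    baseline = average_scans(buffer_scans)
    if len(sample) != len(baseline):
        raise ValueError("sample and baseline must share a wavelength grid")
    return [s - b for s, b in zip(sample, baseline)]
```

In the protocol above, both lists would hold 10 scans each, covering 190 to 280 nm.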

Nucleic acid substrates
PAGE-purified oligonucleotides used for RNA or DNA substrates were purchased from IDT and are listed in supplemental Table S1. The DNA substrates were prepared as described previously (84), and RNA substrates were prepared according to Jankowsky and Putnam (28). Briefly, a single oligonucleotide was 5′-end-labeled with [γ-32P]ATP using T4 polynucleotide kinase (New England BioLabs) at 37°C for 1 h. Unincorporated radionucleotides were removed by a G25 chromatography column (GE Healthcare). Single-stranded DNA or RNA substrates were kept at 4°C and ready to use. For the double-stranded DNA substrates, a [γ-32P]ATP-labeled oligonucleotide was annealed to a 2.5-fold excess of the unlabeled complementary strands in annealing buffer (10 mM Tris-HCl, pH 7.5, 50 mM NaCl) by heating at 95°C for 6 min and then cooling slowly to room temperature. For double-stranded RNA, a [γ-32P]ATP-labeled oligonucleotide was annealed to a 2.5-fold excess of the unlabeled complementary strands in annealing buffer (10 mM MOPS, pH 6.5, 1 mM EDTA, 50 mM KCl) by heating at 95°C for 6 min and then cooling slowly to room temperature. All double-stranded substrates were purified by PAGE isolation, and their concentrations were determined by liquid scintillation counting before use.

Western blot
Equal amounts of protein (150 ng) were denatured at 100°C for 5 min, then resolved on 10% polyacrylamide Tris-glycine SDS gels and transferred to PVDF membranes. The membrane was blocked in PBS containing 5% powdered milk at room temperature for 1 h and then probed with a rabbit polyclonal anti-DDX43 antibody (1:1000, catalog #HPA031381, Sigma) or a mouse anti-His monoclonal antibody (1:5000, catalog #SAB1400618, Sigma). Goat anti-rabbit or goat anti-mouse IgG-horseradish peroxidase conjugates (catalog #sc-2004 and #sc-2005, Santa Cruz Biotechnology) were used as secondary antibodies at a 1:10,000 dilution and detected using ECL Plus (GE Healthcare).

Helicase assays
Helicase assay reaction mixtures (20 µl) contained 40 mM Tris, pH 8.0, 0.5 mM MgCl2, 15 mM NaCl, 0.01% Nonidet P-40, 0.1 mM DTT, 1 mg/ml bovine serum albumin, an equimolar mixture of 2 mM ATP and MgCl2, 0.5 nM concentrations of duplex RNA or DNA substrate, and the indicated concentrations of DDX43 protein. ADP-BeFx was prepared as a mixture of 20 mM ADP (Sigma) with 10 mM of metal fluoride (BeF2, Alfa Aesar) and 50 mM of NaF (BDH Chemicals) and incubated on ice for 4 h (14). Helicase reactions were initiated by the addition of DDX43 and then incubated at 37°C for 15 min unless otherwise indicated. Reactions were quenched with the addition of 20 µl of 2× stop buffer (17.5 mM EDTA, 0.3% SDS, 12.5% glycerol, 0.02% bromphenol blue, 0.02% xylene cyanol). For duplex RNA and DNA substrates, a 10-fold excess of unlabeled oligonucleotide (cold oligo) with the same sequence as the labeled strand was included in the quench to prevent reannealing. The products of the helicase reactions for duplex RNA substrates were resolved on nondenaturing 15% (19:1 acrylamide:bisacrylamide) polyacrylamide gels, and products of DNA unwinding reactions were resolved on nondenaturing 12% (19:1 acrylamide:bisacrylamide) polyacrylamide gels. Radiolabeled DNA or RNA species in polyacrylamide gels were visualized using a PharosFX Imager and quantitated using the Quantity One software (Bio-Rad). The percent helicase substrate unwound was calculated by using the formula, % unwinding = 100 × (P/(S + P)), where P is the product and S is the substrate. The values of P and S were corrected by subtracting background values in the no-enzyme and heat-denatured substrate controls, respectively.
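As a worked illustration of the unwinding formula, the short Python sketch below computes the percent of substrate unwound from band intensities. The function name `percent_unwound` and its parameter names are hypothetical, and clamping negative background-corrected values to zero is an assumption of this sketch rather than a step stated in the protocol.

```python
def percent_unwound(product, substrate, product_bg=0.0, substrate_bg=0.0):
    """Return % unwinding = 100 * P / (S + P) from band intensities.

    product_bg:   product-band signal in the no-enzyme control
    substrate_bg: substrate-band signal in the heat-denatured control
    """
    # Background correction; clamping at zero is an assumption of this sketch.
    p = max(product - product_bg, 0.0)
    s = max(substrate - substrate_bg, 0.0)
    total = s + p
    if total == 0:
        raise ValueError("no signal above background")
    return 100.0 * p / total
```

For example, raw intensities of 350 (product) and 720 (substrate) with backgrounds of 50 and 20 give corrected values P = 300 and S = 700, that is, 30% unwinding.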

Strand annealing assays
Annealed forked 19-bp duplex DNA or 5′-tailed 13-bp duplex RNA was denatured at 95°C for 5 min and kept on ice, and then used in the helicase assays described above in the presence or absence of ATP.

EMSA
Protein/DNA or RNA binding mixtures (20 µl) contained the indicated concentrations of DDX43 and 0.5 nM of the specified 32P-end-labeled DNA substrate in the same reaction buffer as that used for helicase assays (see above) without ATP. The binding mixtures were incubated at room temperature for 30 min after the addition of DDX43 protein. After incubation, 3 µl of loading dye (74% glycerol, 0.01% xylene cyanol, 0.01% bromphenol blue) was added to each mixture, and samples were loaded onto native 5% (19:1 acrylamide/bisacrylamide) polyacrylamide gels and electrophoresed at 200 V for 2 h at 4°C using 1× Tris borate-EDTA as the running buffer. The resolved radiolabeled species were visualized using a PharosFX Imager (Bio-Rad).

ATP hydrolysis assays
ATP hydrolysis was measured using [γ- 32 30 (30 µM) stimulator was incubated at 37°C for 0, 7.5, 15, 30, and 45 min. Reactions were quenched with 5 µl of 50 mM EDTA final concentration. The reaction mixture was spotted onto a PEI-cellulose TLC plate and resolved by using 0.5 M LiCl, 1 M formic acid as the carrier solvent. The TLC plate was exposed to a phosphorimaging cassette for 30 min and visualized using a PharosFX Imager (Bio-Rad).
Author contributions-T. T., V. V., and Y. W. conceived and coordinated the study and wrote the paper. J. Q. helped in the initial protein purification. M. G. prepared triplex DNA. A. K. provided technical assistance in protein purification. Y. L. and R. S. assisted in manuscript preparation. K. E. K. contributed the KH domain characterization. All authors reviewed the results and approved the final version of the manuscript.