Solution Small Angle X-ray Scattering (SAXS) Studies of RecQ from Deinococcus radiodurans and Its Complexes with Junction DNA Substrates*

Background: RecQ enzymes are proteins involved in homologous recombination repair.
Results: Solution structures of full-length DrRecQ protein and its complexes with DNA substrates are defined.
Conclusion: DrRecQ catalyzes dsDNA unwinding and Holliday junction migration in a compact state.
Significance: Our models provide novel structural information about the way DrRecQ participates in homologous recombination repair.

RecQ helicases, essential enzymes for maintaining genome integrity, participate in a wide variety of DNA metabolic processes. They can initiate the homologous recombination repair pathway by unwinding damaged dsDNA and suppress hyper-recombination by promoting Holliday junction (HJ) migration. To learn how Deinococcus radiodurans RecQ (DrRecQ) participates in the homologous recombination repair pathway, solution structures of DrRecQ and its complexes with DNA substrates were investigated by small angle x-ray scattering. We found that the catalytic core and the most N-terminal HRDC (helicase and RNase D C-terminal) domain (HRDC1) undergo a conformational change to a compact state upon binding to a junction DNA. Furthermore, models of DrRecQ in complexes with two kinds of junction DNA (fork junction and HJ) were built based on the small angle x-ray scattering data, and together with the EMSA results, possible binding sites were proposed. We show that two DrRecQ molecules bind to the opposite arms of HJ. This architecture is similar to that of the RuvAB complex and is hypothesized to be highly conserved in other HJ migration proteins. This work provides new clues to the roles DrRecQ plays in the RecFOR pathway.

RecQ proteins are recombination-specific DNA helicases that play critical roles in the maintenance of genome stability across all species (1). Mutations in three of the human RecQ genes have been shown to be associated with cancer susceptibility or premature aging. Defects in the WRN, BLM, and RecQ4 helicases cause Werner syndrome, Bloom syndrome, and Rothmund-Thomson syndrome, respectively (2-4). The severe consequences of RecQ defects emphasize the importance of the enzyme in preserving chromatin integrity.
Normally, DNA damage, including potentially lethal double strand breaks, can be repaired by homologous recombination. RecQ proteins have been shown to take part in recombination and to play at least two roles in cells. First, as ATP-dependent helicases, they translocate in the 3′-5′ direction and initiate the RecFOR pathway together with SSB (single-strand binding protein) and RecA proteins (5,6). Second, they promote ATP-dependent branch migration of Holliday junctions (HJ) through regions of >2 kb of DNA and thus can act as suppressors of illegitimate recombination (7-11).
D. radiodurans is one of the most radiation-resistant species on earth. It can survive 7000 gray of ionizing radiation with only 10% cell death, whereas most other organisms tolerate no more than a few hundred gray. A dose of 7000 gray has been shown to break its 3.28-Mb genome into 20-30-kb fragments by causing 100-150 double strand breaks (12-14). However, RecBCD, the most widespread machinery for initiating homologous recombination repair, is not found in D. radiodurans (15,16). As a result, rather than serving as a backup pathway as it does in Escherichia coli, the RecFOR pathway is a main route of homologous recombination repair in D. radiodurans (17-19).
* This work was supported by grants from the National Basic Research Program of China (2012CB917203) and the National Natural Science Foundation of China (10979005). This article contains supplemental material.

The most conserved structure of the RecQ family proteins is the catalytic core. Composed of one helicase domain and one RQC domain, it is able to catalyze ATP-dependent helicase activity alone (20). HRDC (helicase and RNase D C-terminal) is another typical domain of RecQ and plays a major role in DNA binding (21). DrRecQ is an unusual member of the RecQ family. In addition to the conserved catalytic core, it has three successive HRDC domains at its C terminus, whereas other RecQ homologs usually contain only one HRDC domain (22,23). This domain arrangement may contribute to the high efficiency with which D. radiodurans repairs damaged DNA. High resolution structures of the three HRDC domains of DrRecQ (22) and of the homologous catalytic core from E. coli (20) have been solved separately, providing a nearly complete atomic picture of the enzyme. However, the enzyme works as a whole to achieve its proper function. Thus, it is important to investigate the architecture of the full-length protein.
Moreover, structural knowledge of DrRecQ in complexes with Y structured DNA (Y-DNA) and HJ can provide us crucial information relevant to its functions. In this investigation, SAXS reveals solution architectures of DrRecQ protein and its complexes with DNA substrates. The novel structural information provides us new clues to understand the overall mechanisms of the enzyme.

EXPERIMENTAL PROCEDURES
Sample Preparation-Full-length DrRecQ (DrRecQfull) and its truncation mutant comprising the catalytic core and HRDC1 (DrRecQ610) were amplified from the genomic DNA of D. radiodurans by PCR. After digestion with the corresponding restriction enzymes, they were ligated into pET28a expression vectors. The constructed plasmids were then transformed into E. coli BL21(DE3) Gold cells for protein expression.
The purification procedures for DrRecQfull and DrRecQ610 were identical. Bacterial cells were grown in LB medium to mid-log phase at 37°C in the presence of 50 μg/ml kanamycin. Expression was then induced with 0.1 mM isopropyl β-D-thiogalactoside at 16°C for 20 h. The cell pellet was resuspended in buffer A (25 mM Tris-HCl, 1 M NaCl, 2 mM β-mercaptoethanol, and 1 mM PMSF, pH 7.5) and disrupted using a high pressure cell cracker. The supernatant was then loaded onto a nickel-nitrilotriacetic acid resin column (GE Healthcare) and eluted with buffer B (25 mM Tris-HCl, 100 mM NaCl, and 250 mM imidazole, pH 7.5). Fractions containing DrRecQ were further purified on an ion exchange column (heparin, GE Healthcare), from which the protein eluted as a single peak at 280 mM NaCl. These fractions were finally purified by gel filtration (Superdex 200, GE Healthcare) pre-equilibrated in buffer C (25 mM Tris-HCl, 500 mM NaCl, and 2 mM DTT, pH 7.5). The purified protein was stored at −80°C until it was used for SAXS measurements.
DrRecQ was next purified in complexes with Y-DNA and HJ (DrRecQ-Y and DrRecQ-HJ). Y18-12 (Y-DNA with one 18-bp double strand and two 12-nt single strands) and Y7-6 (Y-DNA with one 7-bp double strand and two 6-nt single strands) were mixed with the purified proteins at a molar ratio of 1:1, whereas HJ was mixed with the proteins at a molar ratio of 1:2. The complexes were then concentrated to 10 mg/ml. Afterward, the mixtures were passed through a size-exclusion column (Superdex 200, GE Healthcare) with a loading volume of 1 ml as the last purification step. Size-exclusion chromatography (SEC) was also used to estimate the molecular masses (MMs) of the complexes.
EMSAs-All substrates used for EMSAs were 5′-FAM fluorescently labeled DNA, and their sequences are listed in supplemental Table S1. The reaction buffer contained 20 mM Tris-HCl, pH 7.5, 100 mM NaCl, 4 mM MgCl2, and 1 mM DTT. To analyze binding affinities, 10 nM DNA was mixed with the indicated concentrations of DrRecQ proteins, whereas 1 μM DNA was used to determine the molar ratios of protein and DNA in the complexes. After incubation for 30 min at 4°C, the reactions were resolved on native polyacrylamide gels in TG buffer (25 mM Tris-HCl and 192 mM glycine, pH 8.3) at 4°C for 30 min and then visualized with a Typhoon FLA 7000 (GE Healthcare). The dissociation constant (KD) was calculated as the DrRecQ concentration at which half of the available DNA was bound and half was unbound.
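The half-saturation definition of KD can be illustrated with a short calculation. The following Python sketch is not the authors' analysis procedure. It assumes that the fraction of bound DNA has already been quantified from the gel bands, that the titration is sorted by increasing protein concentration, and that the DNA concentration is low enough for the total protein concentration to approximate the free concentration. The function name `estimate_kd` and all titration values are hypothetical.

```python
def estimate_kd(protein_nm, fraction_bound):
    """Protein concentration (nM) at which half of the DNA is bound.

    Uses linear interpolation between the two titration points that
    bracket a bound fraction of 0.5.
    """
    points = list(zip(protein_nm, fraction_bound))
    for (c_lo, f_lo), (c_hi, f_hi) in zip(points, points[1:]):
        if f_lo <= 0.5 <= f_hi and f_hi > f_lo:
            return c_lo + (0.5 - f_lo) * (c_hi - c_lo) / (f_hi - f_lo)
    raise ValueError("titration never crosses 50% bound DNA")


# Hypothetical titration for illustration only (not data from this study).
protein_nm = [0, 25, 50, 100, 200, 400]
fraction_bound = [0.0, 0.15, 0.35, 0.55, 0.80, 0.95]
kd_nm = estimate_kd(protein_nm, fraction_bound)
print(f"Estimated KD: {kd_nm:.1f} nM")
```

Linear interpolation is the simplest choice here; fitting a binding isotherm to all titration points would use the data more fully but requires a model of the binding stoichiometry.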
SAXS Data Collection-Synchrotron SAXS measurements were performed at the European Molecular Biology Laboratory on the storage ring DORIS III (DESY, Hamburg, Germany) on the X33 beamline (24) equipped with a robotic sample changer (25) and a PILATUS 1M detector (DECTRIS, Baden, Switzerland). All samples were centrifuged at 13,000 rpm for 20 min immediately before measurement to remove aggregates and sediment. DTT (2 mM) was added to the samples and buffers before measurement to limit radiation damage.
All measurements were carried out in vacuum with exposure times of 2 min in eight 15-s frames to monitor for possible radiation damage (no radiation effects were detected). The scattering intensity I(s) was recorded in the momentum transfer range 0.02 < s < 0.6 Å⁻¹, where s = (4π sin θ)/λ, 2θ is the scattering angle, and λ = 1.5 Å is the x-ray wavelength. Because of the considerable experimental noise at higher scattering angles, only the most informative part of the scattering curves, between 0.02 Å⁻¹ and 0.2 Å⁻¹, was used for structural analysis. To exclude concentration dependence, three different concentrations of each sample were prepared and measured. The concentrations were 2, 4, and 6 mg/ml for the proteins; 1, 2, and 3 mg/ml for the DNA samples; and 1, 2, and 4 mg/ml for the complexes. No concentration dependence or aggregation was observed during the measurements.
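The relation between momentum transfer and scattering angle used during data collection can be checked numerically. The sketch below is an illustration only; the function names are hypothetical, and the default wavelength of 1.5 Å is taken from the text. It converts between the full scattering angle 2θ and s, and reports the nominal real-space distance 2π/s that corresponds to the upper limit of the range used for structural analysis.

```python
import math


def momentum_transfer(two_theta_deg, wavelength_angstrom=1.5):
    """s = 4*pi*sin(theta)/lambda, where two_theta_deg is the full angle."""
    theta = math.radians(two_theta_deg) / 2.0
    return 4.0 * math.pi * math.sin(theta) / wavelength_angstrom


def scattering_angle(s, wavelength_angstrom=1.5):
    """Full scattering angle 2*theta (degrees) for a given s (1/Angstrom)."""
    sin_theta = s * wavelength_angstrom / (4.0 * math.pi)
    return 2.0 * math.degrees(math.asin(sin_theta))


# Limits of the s range used for structural analysis in the text.
for s in (0.02, 0.2):
    angle = scattering_angle(s)
    print(f"s = {s} 1/A  ->  2theta = {angle:.3f} deg, 2*pi/s = {2 * math.pi / s:.1f} A")
```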
FIGURE 2. SAXS analysis of DrRecQ610 and DrRecQfull. A, SAXS scattering data from DrRecQ610 (upper panels) and DrRecQfull (lower panels) in solution: 1, experimental data; 2, scattering pattern computed from the ab initio model; 3, smooth curve back-transformed from the p(r) function and extrapolated to zero scattering vector; 4, scattering pattern computed from the CORAL model; 5, averaged scattering pattern calculated from the optimized models generated by EOM. lg I, relative: relative scattering intensity on a logarithmic scale. B, distance distribution functions for DrRecQ610 (curve 1) and DrRecQfull (curve 2). C and D, ab initio and rigid body reconstructions of DrRecQ610 (C) and DrRecQfull (D). Low-resolution envelopes of typical DAMMIN and GASBOR models are shown both separately (upper panels) and superimposed on atomic models determined by CORAL (lower panels). Views are rotated by 90° about the vertical axis for each model. In this and other figures, domains are colored as described in Fig. 1, and loops are represented as cyan dots.

SAXS Data Processing-All SAXS data were processed with the program package ATSAS (26). The scattering of the buffers was subtracted from that of the samples, and the difference curves were extrapolated to zero concentration using standard procedures and the program PRIMUS (27). The resultant curves were used for all calculations and reconstructions. MMs of the samples were obtained from the extrapolated I(0) values by comparison with a BSA standard. Radii of gyration (Rg) were evaluated within the range of the Guinier approximation, sRg < 1.3, according to Equation 1.

I(s) = I(0) exp(−s²Rg²/3)    (Eq. 1)
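The Guinier analysis and the I(0)-based mass estimate described above can be sketched in a few lines. In the Guinier approximation, ln I(s) is linear in s² with slope −Rg²/3 and intercept ln I(0). The Python sketch below is not the PRIMUS implementation. It assumes an unweighted least-squares fit, data sorted by increasing s, an iterative choice of the fitting range so that sRg < 1.3, and a BSA molecular mass of 66 kDa. All function names and numerical inputs are hypothetical.

```python
import math


def guinier_fit(s_values, intensities, max_srg=1.3, max_iter=20):
    """Fit ln I(s) = ln I(0) - (Rg^2 / 3) * s^2 on the low-s points.

    The fitting range is shrunk iteratively until every point used
    satisfies s * Rg < max_srg. Returns (Rg, I0).
    """
    n_use = len(s_values)
    rg = i0 = None
    for _ in range(max_iter):
        x = [s * s for s in s_values[:n_use]]
        y = [math.log(i) for i in intensities[:n_use]]
        mean_x, mean_y = sum(x) / n_use, sum(y) / n_use
        slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
                 / sum((a - mean_x) ** 2 for a in x))
        if slope >= 0:
            raise ValueError("non-negative Guinier slope; cannot derive Rg")
        rg = math.sqrt(-3.0 * slope)
        i0 = math.exp(mean_y - slope * mean_x)
        n_new = max(sum(1 for s in s_values if s * rg < max_srg), 3)
        if n_new == n_use:
            break
        n_use = n_new
    return rg, i0


def molecular_mass_from_i0(i0_sample, conc_sample, i0_bsa, conc_bsa,
                           mm_bsa_kda=66.0):
    """Scale the concentration-normalized I(0) to that of a BSA standard."""
    return mm_bsa_kda * (i0_sample / conc_sample) / (i0_bsa / conc_bsa)


# Synthetic, noise-free Guinier curve (Rg = 35 A, I(0) = 100), not real data.
s_demo = [0.02 + 0.002 * k for k in range(21)]
i_demo = [100.0 * math.exp(-(s * 35.0) ** 2 / 3.0) for s in s_demo]
rg_fit, i0_fit = guinier_fit(s_demo, i_demo)
print(f"Rg = {rg_fit:.2f} A, I(0) = {i0_fit:.2f}")
```

With real data, the points would normally be weighted by their experimental errors, and the lowest-angle points affected by the beamstop or by aggregation would be excluded before fitting.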
Distance distribution functions p(r) and maximum diameters Dmax of the scattering objects were calculated by indirect Fourier transformation using the program GNOM (28). Three-dimensional reconstructions for DrRecQ610, DrRecQfull, and DrRecQ610-Y7-6 were performed using the programs DAMMIN (29), GASBOR (30), and CORAL (31). Ten independent runs of each program were compared using the program SUPCOMB (32), and those with the lowest normalized spatial discrepancy (NSD; a quantitative measure of similarity between sets of three-dimensional points) were chosen as typical models. To account for the flexibility of the proteins, the program EOM (33) was also used to describe the three specimens as ensembles of different conformers. The Rg and Dmax distributions reflected the status of the specimens in solution. For DrRecQ610-Y18-12 and DrRecQfull-Y18-12, multiphase models were built using the program MONSA (29). The program can read multiple data sets, including not only the scattering from the complex but also the scattering from the DNA and the protein alone, and thereby simulates scattering objects with different electron densities to fit the scattering data from both the individual components and the complex. Several independent runs gave reproducible results, from which the two with the lowest NSD compared with the other models were chosen as typical models. Models of DrRecQ610-HJ and DrRecQfull-HJ were built using the programs DAMMIN and SASREF (34), and the most typical DAMMIN model was superimposed on the protein phase of the SASREF model.
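As a cross-check on the Guinier estimate, Rg can also be computed from the distance distribution function through the relation Rg² = ∫r²p(r)dr / (2∫p(r)dr). The sketch below illustrates this relation only and is not the GNOM algorithm, which obtains p(r) by regularized indirect Fourier transformation. The function names are hypothetical, and the test case is the analytical p(r) of a solid sphere, for which Rg = √(3/5)·R, rather than data from this study.

```python
import math


def trapezoid(x, y):
    """Trapezoidal integral of y over x (x need not be evenly spaced)."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i])
               for i in range(len(x) - 1))


def rg_from_pr(r, p):
    """Radius of gyration from a tabulated p(r): Rg^2 = int(r^2 p) / (2 int(p))."""
    second_moment = trapezoid(r, [rv * rv * pv for rv, pv in zip(r, p)])
    return math.sqrt(second_moment / (2.0 * trapezoid(r, p)))


# Analytical p(r) of a solid sphere of radius 50 A (Dmax = 100 A).
radius = 50.0
d_max = 2.0 * radius
r_grid = [d_max * k / 1000 for k in range(1001)]
p_sphere = [rv ** 2 * (1 - 1.5 * (rv / d_max) + 0.5 * (rv / d_max) ** 3)
            for rv in r_grid]
rg_sphere = rg_from_pr(r_grid, p_sphere)
print(f"Rg from p(r): {rg_sphere:.2f} A, expected {math.sqrt(0.6) * radius:.2f} A")
```

Because p(r) uses the whole scattering curve rather than only the lowest angles, this estimate is often less sensitive to a small amount of aggregation than the Guinier value.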

RESULTS
Experimental Strategy-SAXS data were collected for two protein species, full-length DrRecQ and the construct lacking the HRDC2 and HRDC3 domains (Fig. 1A). Fig. 1B ... Fig. 2A. MMs of the two proteins calculated from the SAXS data are practically identical to the theoretical values calculated from the known sequences (Table 1), indicating that both specimens are well behaved and monodisperse in solution. The distance distribution functions for DrRecQ610 and DrRecQfull are shown in Fig. 2B. The asymmetrical bell-shaped functions are characteristic of elongated shapes with cross-sections of ~40 Å and maximum diameters of ~110 Å for DrRecQ610 and 160 Å for DrRecQfull.
To obtain more specific structural information, ab initio modeling was applied using the programs DAMMIN and GASBOR. Ten independent models generated with both algorithms gave reproducible results (NSDav = 1.25 for DrRecQ610 and NSDav = 1.78 for DrRecQfull, Table 1) and approximated the experimental data well, with discrepancy values χ² = 1.36 for DrRecQ610 and χ² = 1.39 for DrRecQfull. The final models display an ellipsoidal shape for DrRecQ610 (Fig. 2C, upper panels) and an elongated shape for DrRecQfull (Fig. 2D, upper panels), consistent with their p(r) functions.
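The discrepancy values quoted for the model fits can be understood from a small calculation. The sketch below uses one common definition, χ² = (1/(N−1)) Σ [(Iexp(s) − c·Icalc(s))/σ(s)]², where c is the least-squares scale factor between the computed and experimental curves. This definition is an assumption and may differ in detail from what each modeling program reports. The function name and all numerical values are hypothetical.

```python
def chi_square(i_exp, sigma, i_calc):
    """Return (chi2, scale) for a model curve compared with experimental data.

    scale is the error-weighted least-squares factor applied to i_calc.
    """
    weights = [1.0 / (e * e) for e in sigma]
    scale = (sum(w * a * b for w, a, b in zip(weights, i_exp, i_calc))
             / sum(w * b * b for w, b in zip(weights, i_calc)))
    residuals = [((a - scale * b) / e) ** 2
                 for a, b, e in zip(i_exp, i_calc, sigma)]
    return sum(residuals) / (len(i_exp) - 1), scale


# Hypothetical four-point curves for illustration only.
i_calc = [10.0, 8.0, 5.0, 2.0]
i_exp = [20.5, 15.5, 10.5, 3.5]
sigma = [0.5, 0.5, 0.5, 0.5]
chi2, scale = chi_square(i_exp, sigma, i_calc)
print(f"scale = {scale:.3f}, chi2 = {chi2:.2f}")
```

Values near 1 indicate that the model reproduces the data within the experimental errors, whereas values well below 1 usually point to overestimated errors rather than to an unusually good model.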
Furthermore, rigid body modeling was applied using the available high resolution structures and the program CORAL.
The model of DrRecQ610 (Fig. 2C, lower panels) reveals that the HRDC1 domain lies slightly apart from the catalytic core. The model of DrRecQfull (Fig. 2D, lower panels) shows the HRDC2 domain positioned far from the catalytic core and HRDC1, with the long loop between HRDC1 and HRDC2 extended. However, instead of continuing to extend outward, the HRDC3 domain tends to fold back toward the catalytic core. Importantly, the CORAL models are in good agreement with the ab initio reconstructions, as demonstrated by the program SUPCOMB (NSD = 1.15 for DrRecQ610 and NSD = 1.68 for DrRecQfull). The ab initio and rigid body modeling methods gave consistent results, providing reliable averaged overall shapes of the two proteins in solution.
There is a 15-residue linker between the catalytic core and the HRDC1 domain in DrRecQ, which is highly conserved in RecQ homologs. In addition, DrRecQ has a unique architecture with two more HRDC domains, which are connected by a 43-residue linker and a 21-residue linker, respectively. The flexibility of these loops might lead to different conformations of the proteins in solution. We therefore also used an ensemble of conformers to characterize the system. Using the program EOM, a large pool of 10,000 different conformations was generated to analyze the flexibility of the protein, and an optimized ensemble of 50 models that best describes the SAXS data was selected. The selected ensembles fit the experimental data with χ² = 0.56 and χ² = 0.46 for DrRecQ610 and DrRecQfull, respectively. The Rg and Dmax distributions of DrRecQ610 and DrRecQfull calculated from the optimized ensembles are shown in Fig. 3, A and B. Two separate peaks in the distribution functions imply that DrRecQ610 may exist in two distinct conformations in solution: a closed state (Fig. 3C, left) with a smaller Rg and Dmax, and an open state (Fig. 3C, right) with a larger Rg and Dmax. For DrRecQfull, the broadened peak indicates greater flexibility of the full-length protein, which probably undergoes more continuous conformational changes in solution. Moreover, the Rg and Dmax distributions of DrRecQfull span narrower ranges than those of the pool, indicating that the full-length protein has limited flexibility and cannot be fully extended in solution. This result is consistent with the CORAL model.
DrRecQ Forms Stable Complexes with Junction DNA-To characterize how DrRecQ unwinds double-stranded DNA and catalyzes HJ migration, it is essential to purify stable and monodisperse samples of DrRecQ in complexes with Y-DNA and HJ. The DNA binding assays demonstrate that DrRecQfull and DrRecQ610 bind strongly to Y-DNA and HJ (Fig. 4, A and B), with KD values ranging from 79 to 115 nM. We then mixed DrRecQ with Y-DNA and HJ and passed the samples through a Superdex 200 column to check their homogeneity. DrRecQ-Y and DrRecQ-HJ elute as single peaks (Fig. 5, A and B) ... well with the SEC results and is further supported by the MMs calculated from the SAXS data (Table 1).
DrRecQ Forms a Compact State upon Binding to a Junction DNA-To probe the possible conformational change that the catalytic core and the HRDC1 domain undergo when bound to a DNA substrate, SAXS data were collected for DrRecQ610 in complex with Y7-6. The scattering pattern and the p(r) function of the DrRecQ610-Y7-6 complex are shown in Fig. 6A. The Rg and Dmax values of the complex calculated from the SAXS data are noticeably smaller than those of the free protein in solution (Table 1), indicating that DrRecQ610 becomes more compact upon binding to the junction DNA. Because of the small MM of Y7-6 (8.5 kDa), treating the complex as a single phase changes the final volume only insignificantly. Models generated by the programs DAMMIN, GASBOR, and CORAL are in good agreement, with the NSD values listed in Table 2. The reconstructions show the catalytic core and the HRDC1 domain in the closed state (Fig. 6A, insets a-c).
To further validate this interpretation, EOM was used to describe the DrRecQ610-Y7-6 complex. The ensemble selected from the random pool differs from that observed for DrRecQ610. The Rg and Dmax distributions show single peaks at low values (Fig. 6B), reflecting the compact state of DrRecQ610-Y7-6 in solution.
SAXS Reconstructions of DrRecQ in Complex with Fork Junction DNA-To determine the relative positions of protein and DNA in the DrRecQ-Y complex, SAXS data were also collected for DrRecQ610 and DrRecQfull in complexes with Y18-12 (Fig. 7A), i.e. with a larger fork junction substrate. Rg values are ~36.4 Å and 46.1 Å for DrRecQ610-Y18-12 and DrRecQfull-Y18-12, respectively. The p(r) functions are characteristic of elongated shapes (Fig. 7B and Table 1). Multiphase modeling with the program MONSA was used to reconstruct ab initio models that include two phases. Several independent runs gave reproducible results with averaged NSD values of 0.8 for DrRecQ610-Y18-12 and 0.97 for DrRecQfull-Y18-12. Good fits to the experimental profiles of the complexes and of Y-DNA were obtained, with the discrepancies (χ²) listed in Table 2. The resultant models for DrRecQ610-Y18-12 and DrRecQfull-Y18-12 are shown in Fig. 7, C and D, respectively. For DrRecQ610-Y18-12, the protein phase has a Dmax of 100 Å and aligns well with the closed state of DrRecQ610. One part of the DNA phase protrudes into the protein phase, and the other part stretches out into the solution. For DrRecQfull-Y18-12, the protein phase also has an elongated shape. The position and orientation of the DNA phase are similar to those in DrRecQ610-Y18-12. Importantly, the part stretched out into solution makes direct contact with the protein phase (probably with the HRDC3 domain). Our EMSA results demonstrate that DrRecQ has a much greater binding affinity for single-stranded DNA (KD = 10 nM, Fig. 4C) than for dsDNA (KD > 2.5 μM, data not shown). Therefore, the binding site on Y-DNA should reside in its single-stranded regions. From this analysis, we deduced that the part protruding into the protein phase is the single-stranded part of Y18-12, whereas the part stretched out into solution is the double-stranded part of Y18-12.
Moreover, the catalytic core alone or a single HRDC domain was shown to have very low binding affinity for junction DNA (KD > 2.5 μM, Fig. 4). Thus, we assume that the catalytic core and the HRDC1 domain must work together to form a stable complex with the junction DNA, with each domain providing a binding site for the substrate.
SAXS Reconstructions of DrRecQ in Complex with Holliday Junction DNA-We then used SAXS to determine the spatial organization of the DrRecQ-HJ complex (Fig. 8A). The SEC results show DrRecQ610-HJ and DrRecQfull-HJ as single species with elution volumes of 12.8 and 12.6 ml, respectively. The corresponding MMs are ~175 kDa for DrRecQ610-HJ and ~220 kDa for DrRecQfull-HJ. These values are significantly greater than those of the protein monomers. The Rg and Dmax values of the two complexes are also larger than those obtained for the protein monomers (Table 1). Taking the SEC and SAXS results together, we inferred that two DrRecQ molecules bind to the HJ.
Because DrRecQ610 adopts a rigid compact state upon binding to a junction DNA, the method of molecular tectonics and the program SASREF were applied to model the spatial configuration of the complex using DrRecQ610 and HJ DNA as subunits. A typical model selected from 10 independent runs demonstrated that the two DrRecQ610 subunits consistently bind to opposite arms of the HJ (Fig. 8B, upper panels). Furthermore, ab initio models were computed using the programs DAMMIN and GASBOR. The final low resolution ab initio model has good self-consistency and superimposes well on the protein phase of the SASREF model (NSD = 1.37, Fig. 8B, lower panels). For DrRecQfull-HJ, although rigid body modeling is not applicable because of the high flexibility of the full-length protein, ab initio models show elongated shapes similar to that of the DrRecQ610-HJ complex (Fig. 8C), indicating that the two complexes have similar architectures in solution.

DISCUSSION
Here, SAXS was used to obtain solution structures of the full-length DrRecQ enzyme and its complexes with junction DNA substrates. These structures give novel insights into the architecture of the whole enzyme and its binding to DNA substrates. For the first time, we show that the catalytic core and the HRDC1 domain undergo large-scale conformational changes to a closed state upon binding to a junction DNA. Furthermore, the locations of the Y-DNA and HJ DNA were revealed, which provides insights into the structural basis of the helicase and branch migration activities of DrRecQ.
The SAXS results indicate that there might be an equilibrium between open and closed conformations of DrRecQ610 in solution. Upon binding to a junction DNA, the equilibrium shifts almost completely toward the closed state. Although the binding site of the complex is not directly visible because of the low resolution, additional information such as the DNA binding assays provides clues to possible DNA locations. Because the catalytic core and the HRDC1 domain each provide a binding site for the Y-DNA, the open conformation of the protein would expose the binding sites to solution and help the enzyme recognize and bind the DNA substrates. Moreover, the fork junction DNA can be considered a partly unwound dsDNA. Its stable complex with DrRecQ represents a single snapshot of the unwinding process and therefore provides clues to the way DrRecQ initiates the RecFOR pathway. The catalytic core and the HRDC1 domain are highly conserved domains in the RecQ family, and in all family members the two domains are linked by a flexible loop. We assume that the observed binding mode of DrRecQ610 with the Y-DNA substrate, along with the conformational changes that occur during the interaction, may be common to all RecQ family proteins that contain the two domains (E. coli RecQ, Saccharomyces cerevisiae Sgs1, human BLM and WRN, etc.).
It should be noted that although the full-length protein is flexible, it cannot be fully extended in solution. The HRDC3 domain tends to fold back toward the catalytic core and makes direct contact with the double-stranded part of Y-DNA. This domain arrangement is consistent with previous proposals that HRDC3 is important for inter-domain interactions and helps regulate the structure-specific DNA binding of DrRecQ (22,23). HRDC3 is also known to contain large negatively charged areas on its surface, and its presence reduces the binding affinity of DrRecQ for most DNA substrates. Based on these two points and the SAXS models, we deduce that the HRDC3 domain can regulate DNA binding by contacting part of the DNA substrate directly.
The model of the DrRecQ-HJ complex reveals that two molecules of DrRecQ bind to opposite arms of the HJ in a closed state (Fig. 9A). The RuvAB complex is a well known motor machine for HJ migration (35-37). In that complex, RuvA forms an octamer that recognizes and fixes an HJ, whereas two RuvB hexameric rings exert a spiral rotation on each encircled DNA arm (Fig. 9B). Although the specific structures of the two complexes vary greatly, they share a common feature: the ATP-dependent helicases (RecQ and RuvB) are located on opposite arms of the HJ DNA. With such a structure, the helicases unwind the two opposite arms of the HJ, whereas the other two arms anneal spontaneously through their complementary sequences (Fig. 9C). Thus, the symmetrical architecture is highly relevant to function and is hypothesized to be conserved in other HJ migration enzymes. Moreover, the RuvAB complex has a 5′-3′ polarity, whereas RecQ has the reverse, 3′-5′ polarity. Because of this opposite polarity, the two enzymes catalyze HJ migration in different orientations, which makes RuvAB a promoter of recombination and RecQ a suppressor of hyper-recombination.
The SAXS studies presented here give the first depiction of the full-length DrRecQ protein and its interactions with DNA substrates. We suggest that DrRecQ catalyzes dsDNA unwinding and Holliday junction migration in a compact state and that its helicase and branch migration activities are highly consistent with its solution structures. In future studies, it will be interesting to examine other members of the RecQ family, such as E. coli RecQ and human WRN, BLM, and RecQL1, to reveal whether they act in the same manner as DrRecQ in solution.