The Lon protease-like domain in the bacterial RecA paralog RadA is required for DNA binding and repair

Homologous recombination (HR) plays an essential role in the maintenance of genome integrity. RecA/Rad51 paralogs have been recognized as an important factor of HR. Among them, only one bacterial RecA/Rad51 paralog, RadA, is involved in HR as an accessory factor of RecA recombinase. RadA has a unique Lon protease-like domain (LonC) at its C terminus, in addition to a RecA-like ATPase domain. Unlike Lon protease, RadA's LonC domain does not show protease activity but is still essential for RadA-mediated DNA repair. Reconciling these two facts has been difficult because RadA's tertiary structure and molecular function are unknown. Here, we describe the hexameric ring structure of RadA's LonC domain, as determined by X-ray crystallography. The structure revealed the two positively charged regions unique to the LonC domain of RadA are located at the intersubunit cleft and the central hole of a hexameric ring. Surprisingly, a functional domain analysis demonstrated the LonC domain of RadA binds DNA, with site-directed mutagenesis showing that the two positively charged regions are critical for this DNA-binding activity. Interestingly, only the intersubunit cleft was required for the DNA-dependent stimulation of ATPase activity of RadA, and at least the central hole was essential for DNA repair function. Our data provide the structural and functional features of the LonC domain and their function in RadA-mediated DNA repair.

Homologous recombination (HR) is one of the most important biological phenomena for maintaining genome integrity through DNA-processing pathways, such as double-strand break repair and the rescue of stalled replication forks (1-4). HR also generates genetic variation and is recognized as one of the driving forces of evolution through horizontal gene transfer or intrachromosomal recombination (5,6). To fulfill these contradictory roles, numerous proteins engage in each reaction in the HR system. Among these proteins, the RecA/Rad51-family proteins have a central role in HR as a recombinase that searches for a homologous DNA sequence and catalyzes the strand-exchange reaction (7-10). Interestingly, most organisms have one or more RecA/Rad51 paralogs in addition to the main recombinase, RecA/Rad51. In eukaryotes, Rad51 paralogs lack the recombinase activity but are involved in HR as an accessory factor of the recombinase (11). The human Rad51 paralogs are known as oncogenes (12,13).
RadA (also known as Sms) is a highly conserved RecA paralog in bacteria. Genetic studies have shown that radA is involved in DNA repair and genetic recombination in various organisms, such as Escherichia coli, Bacillus subtilis, and Deinococcus radiodurans (14-22). Recently, a biochemical characterization of E. coli RadA protein has shown that RadA stimulates RecA-catalyzed strand exchange in vitro (23). RadA is a multidomain protein consisting of three putative functional regions (Fig. 1A). A zinc-finger (ZF) motif, a RecA-like ATPase domain, and a Lon protease-like domain (LonC) are located at the N-terminal, central, and C-terminal regions, respectively. The ZF and LonC domains are unique to the bacterial RecA paralogs and are not found in eukaryotic and archaeal RecA/Rad51 paralogs.
A LonC domain was originally identified in Lon protease as a catalytic domain of protease activity (24-26). Lon protease is a member of the ATP-dependent serine proteases found in three kingdoms of life and involved in protein degradation upon various stress responses (27,28). It has been thought that a protein substrate binds to an N-terminal putative substrate recognition domain, is unfolded by a central AAA+ motor domain, and then is degraded by the C-terminal LonC domain (Fig. 1A). Structural and mutational analyses of the LonC domain from E. coli Lon protease have shown that the protease activity of the LonC domain is exerted by a catalytic Ser residue (24,25). Interestingly, unlike in Lon proteases, the catalytic Ser is not conserved in the LonC domain of RadA, and it has been shown that RadA has no protease activity (14). As a matter of convenience, we refer to the LonC domain of the Lon protease and that of RadA as pLonC (Lon protease-type) and rLonC (RadA-type), respectively. Genetic studies have indicated that rLonC has an essential role in RadA-mediated DNA repair (16,22). However, the tertiary structure and molecular function of rLonC are still unknown.
In this study, we determined the X-ray crystal structure of rLonC from Thermus thermophilus HB8 and revealed that rLonC possesses the DNA-binding activity exerted by two positively charged regions, namely the intersubunit cleft and the central hole of the hexameric ring structure. We also showed that these two DNA-binding sites have different roles in the DNA-dependent ATPase activity of RadA in vitro and that at least the central hole is essential to the RadA-mediated DNA repair function in vivo. Furthermore, combining these results with a phylogenetic analysis highlighted structural and functional relationships between RadA and another LonC family protein, ComM. Our findings provide not only the structural and functional aspects of the rLonC domain in RadA-mediated DNA repair but also evolutionary aspects of the LonC family proteins.

Overall structure of rLonC
First, we performed structural prediction and multiple sequence alignment analyses to determine the domain organization of RadA homologs (Fig. 1A). Structural prediction of T. thermophilus HB8 RadA (423 residues) using the SWISS-MODEL (29) and the I-TASSER (30) servers showed that a region of amino acid residues 256-275 of RadA from T. thermophilus HB8 could be a linker region between the RecA-like ATPase domain and rLonC. Based on this prediction, we prepared rLonC (amino acid residues 262-423) in N-terminal His6-tagged form for X-ray crystallographic analysis. The crystal structure of rLonC was determined at 2.7 Å resolution by molecular replacement using the structural model predicted by the I-TASSER server (30) as a search model. The statistics for the data collection and model refinement are summarized in Table 1.
The overall structure of rLonC was composed of a core structure, ribosomal protein S5 domain 2-type βββαβα fold, and an extra C-terminal α/β subdomain (Fig. 1B). The structure of rLonC determined in this study was similar to that of pLonC from E. coli (25) as denoted by a root mean square (r.m.s.) deviation of 3.9 Å for Cα atoms. As shown in Fig. 1, B and C, pLonC has two α-helices and one β-sheet in the extra C-terminal α/β subdomain (25), whereas these elements were absent in the rLonC structure. It should be noted that the N-terminal His6 tag and 10 amino acid residues exhibited high B-factor values or were disordered in both molecules in the asymmetric unit. The N-terminal disordered region of one molecule in the asymmetric unit was longer (residues 262-271) than that of the other molecule (residues 262-266) by five residues.
In E. coli Lon protease, Ser-679 is a catalytic residue of the protease activity and forms a catalytic dyad with Lys-722 (25). However, in the rLonC structure, Gly-350 and Arg-393 were located at the positions corresponding to Ser-679 and Lys-722 of E. coli Lon protease, respectively (Fig. 1, B and C). Furthermore, there was no serine residue around Gly-350 or Arg-393. These findings are consistent with a previous report that RadA had no protease activity (14).

Oligomeric structure of rLonC
In a crystal, rLonC forms a hexameric ring structure related by crystallographic 3-fold symmetry of a non-crystallographic dimer with an r.m.s. deviation of 0.16 Å for Cα atoms (Fig. 1D). The subunit arrangement of the hexameric ring structure of rLonC was similar to that of E. coli pLonC (25). In addition, like E. coli pLonC, the dominant interactions in the subunit interface are hydrophilic, including 11 hydrogen bonds and 6-7 salt bridges in each subunit interface. The total buried accessible surface area of rLonC upon hexamer formation was 10,046 Å², comparable with the 9,169 Å² for E. coli pLonC (25). The hexameric ring was predicted to be the most stable oligomer in the crystal structure by the PISA program (31). By contrast, gel-filtration analyses revealed that in solution the rLonC domain and full-length RadA existed as trimers to tetramers and dimers to pentamers, respectively (Fig. 2). In addition, the dissociation of the oligomeric structure was observed in a salt-dependent manner. These behaviors in the oligomer formation of RadA are similar to those of Lon proteases, in that they form various types of oligomers (32-35).
Unlike in E. coli pLonC (Fig. 1E, top right panel), the surface electrostatic potential map showed that rLonC had positively charged regions at one side of the ring (Fig. 1E, top left panel). At this "basic" side of the ring, a cleft-like structure is formed at the subunit interface (hereafter referred to as the "intersubunit cleft") (Fig. 1E, bottom left panel). At the other side of the ring, there was no significant biased distribution of charged residues (Fig. 1E, bottom right panel). In addition, rLonC had another positively charged region at the central hole of the ring (hereafter referred to as the "central hole") (Fig. 1E, top left panel). On the basic side, the inner diameter of the entrance of the central hole is ~29 Å. The hole gradually narrows from the basic side to the other side of the ring, and the inner diameter at the exit formed by the loop containing residues 299-304 (TPFPAP) is ~17 Å (Fig. 1D, left panel). On the basis of these unique features of rLonC's structure, especially the positively charged regions, we hypothesized that rLonC has a DNA-binding ability.

rLonC is a novel DNA-binding module
To verify the hypothesis that rLonC has DNA-binding activity, we constructed and purified various domain-truncated mutants of RadA and performed an electromobility shift assay (EMSA) using 60 bp of double-stranded DNA (dsDNA) (Fig. 3, A and B). As expected, the mutants containing the rLonC domain exhibited DNA-binding activity. In the presence of the rLonC domain, as the protein concentration increased, bands of the DNA-protein complex appeared, and those of free DNA molecules disappeared (Fig. 3A). Binding of the wild type, rLonC (ΔN261), and ΔN53 to DNA showed positive cooperativity, and the dissociation constants of the wild type and rLonC were comparable (Fig. 3B and Table 2). However, the loss of the N-terminal ZF domain (ΔN53) resulted in considerably lower DNA-binding activity than the wild type.
Interestingly, our analysis could not detect the DNA-binding activity in either the N-terminal ZF domain or the RecA-like ATPase domain (Fig. 3, A and B). Structural prediction and multiple sequence alignment analyses support this finding in the RecA-like ATPase domain of RadA (Fig. 3, C and D). The L1 and L2 loops of RecA have been thought to be important for DNA binding (36,37). In the case of RadA, the corresponding loops are expected to be shorter than those of RecA, and the residues that are important for DNA binding are not conserved (Fig. 3, C and D). However, there is still the possibility that the
It has been shown that Lon protease activity is modulated by DNA binding and that the DNA-binding ability resides in the α-subdomain of the N-terminal AAA+ domain but not in the pLonC domain (32,38,39). As mentioned above, rLonC has unique positively charged regions that are not found in pLonC. Thus, we expected that the positively charged regions at the surface of rLonC's structure have an important role in the DNA-binding ability.

Identification of the DNA-binding residues
To identify the residues responsible for DNA binding, we selected 11 basic surface residues (Fig. 4A) and performed alanine-scanning mutagenesis using full-length RadA. Nine single mutants, R286A, R305A, R314A, K345A, R385A, R392A, R395A, R399A, and R404A, and one double mutant, K412A/R413A, were constructed and purified to verify the DNA-binding activity by EMSA. Five mutants, R286A, R305A, R314A, K345A, and R385A, exhibited slightly decreased DNA-binding activities compared with the wild type, whereas the others exhibited DNA-binding activities similar to the wild type (Fig. 4B and Table 2). The dissociation constants of the R286A, R305A, R314A, K345A, and R385A mutants were 1.9-3.4-fold higher than that of the wild type. Two residues, Arg-286 and Arg-385, and three residues, Arg-305, Arg-314, and Lys-345, were located at the unique positively charged regions, namely the intersubunit cleft and the central hole, respectively (Figs. 1E and 4A). The other six residues were located at the outer wall and peripheral regions of the ring. These results suggest that both the intersubunit cleft and the central hole participate in DNA binding and that the outer wall and peripheral regions do not. Interestingly, both of the possible DNA-binding regions are located at the subunit interface and formed by dimerization, suggesting the requirement of dimerization for DNA binding (Fig. 4A).

Figure 1 (legend fragment). ... (25) are shown in a schematic representation, colored in a spectrum from the N terminus (blue) to the C terminus (red). The side chains of the two residues (Ser-679 and Lys-722) forming the catalytic dyad of E. coli pLonC and the corresponding residues (Gly-350 and Arg-393, respectively) of rLonC are represented as magenta sticks. In the E. coli pLonC structure, Ser-679 is replaced with Ala to avoid self-digestion. The segment absent in rLonC is indicated as a dashed oval. C, multiple sequence alignment of the LonC domains with secondary structures. The accession numbers of the sequences are as follows: T. thermophilus (Tt) RadA (YP_143807); D. radiodurans (Dr) RadA (NP_294829); E. coli (Ec) RadA (NP_418806); B. subtilis (Bs) RadA (NP_387968); and E. coli (Ec) Lon (NP_414973). The locations of the secondary structure elements of T. thermophilus rLonC and E. coli pLonC are shown above and below each amino acid sequence, respectively. The cylinders (red) and arrows (blue) represent α-helices and β-strands, respectively. The segment absent in rLonC is indicated as a dashed rectangle. The catalytic residues, Ser-679 and Lys-722, of E. coli Lon protease and the corresponding residues in the RadA orthologs are boxed in magenta. The residues for the site-directed mutagenesis of T. thermophilus RadA and the corresponding residues in other LonC domains are highlighted in color backgrounds as follows (related to Fig. 4): cyan, the residues that showed decreased DNA-binding activity by mutation to Ala at the intersubunit cleft; yellow, the residues that showed decreased DNA-binding activity by mutation to Ala at the central hole; and gray, the residues that showed similar DNA-binding activity to the wild type by mutation to Ala at the outer wall and peripheral regions.
Multiple sequence alignment shows that these possible DNA-binding residues, except for Arg-314 (replaced with Asn in most organisms), are highly conserved in RadA orthologs (Fig. 1C, cyan and yellow backgrounds). By contrast, these basic residues are not conserved in Lon proteases, suggesting that the conservation of these residues is a unique feature of RadA. This finding supports our results that both the intersubunit cleft containing Arg-286 and Arg-385 and the central hole containing Arg-305, Arg-314, and Lys-345 are important for the DNA-binding activity of RadA via rLonC.
Furthermore, we constructed R286A/R385A and R305A/R314A/K345A mutants because each single mutation of these residues exhibited significant but only limited effects on DNA-binding activity. As a result, both R286A/R385A and R305A/R314A/K345A mutants exhibited lower DNA-binding activity than each single mutant (Fig. 4B and Table 2). The dissociation constants of R286A/R385A and R305A/R314A/K345A were 7.3- and 7.7-fold higher than that of the wild type, respectively. These data imply that each mutation has an independent effect on the DNA-binding activity.
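One way to read the independence claim, offered as an illustration rather than as the authors' analysis, is that independent contributions multiply as fold-changes in the dissociation constant (equivalently, they add as free energies, ΔΔG = RT ln(fold)). The sketch below assumes that reading. It uses only the fold-changes reported above and the 25 °C EMSA incubation temperature; the function names are hypothetical, and the individual single-mutant values are not used because only their range is quoted in the text.

```python
import math

R_KCAL = 1.987e-3   # gas constant in kcal/(mol*K)
T_K = 298.15        # 25 degrees C, the EMSA incubation temperature

def ddg_kcal(fold_change, temperature_k=T_K):
    """Apparent free-energy penalty (kcal/mol) for a fold increase in Kd."""
    return R_KCAL * temperature_k * math.log(fold_change)

def multiplicative_range(single_min, single_max, n_sites):
    """Range of combined fold-changes if n single-site effects simply multiply."""
    return single_min ** n_sites, single_max ** n_sites

# Reported range for the five single mutants (fold increase in Kd over wild type).
SINGLE_RANGE = (1.9, 3.4)
# Reported combined mutants: (number of mutated sites, observed fold increase).
OBSERVED = {"R286A/R385A": (2, 7.3), "R305A/R314A/K345A": (3, 7.7)}

for name, (n_sites, fold) in OBSERVED.items():
    low, high = multiplicative_range(*SINGLE_RANGE, n_sites)
    print(f"{name}: observed {fold}-fold "
          f"(ddG ~ {ddg_kcal(fold):.2f} kcal/mol); "
          f"multiplicative expectation {low:.1f}- to {high:.1f}-fold")
```

Under this assumption both combined mutants fall inside the range spanned by the single-mutant values, although the triple mutant sits near the lower bound; a stricter test would need the individual Kd values from Table 2.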

DNA-binding activity of rLonC is essential for the DNA repair function of RadA in vivo
One of the most important questions is whether the DNA-binding activity of rLonC is necessary for the DNA repair function of RadA in vivo. Previous results have shown that rLonC domain-truncated mutants cannot complement the phenotypes of disruptants of the radA genes in E. coli or D. radiodurans (16,22). However, there is no evidence that the DNA-binding activity of rLonC is important for the DNA repair function of RadA. To address the effect of DNA-binding activity of rLonC on RadA-mediated DNA repair in vivo, we performed an in vivo complementation assay using two DNA-binding-deficient mutants, namely R286A/R385A and R305A/R314A/K345A, in T. thermophilus HB8.
First, we constructed a ΔradA strain of T. thermophilus HB8 using a gene-disruption method via natural transformation and HR (Fig. 5, A and B) (40). To verify the involvement of RadA in DNA repair, the phenotype of ΔradA was characterized by measuring the sensitivity to UV light and mitomycin C, which introduce DNA strand breaks via DNA adducts and cross-links (Fig. 5C). ΔradA was ~30-fold more sensitive than the wild type to both UV and mitomycin C treatments. These results imply that RadA functions as a DNA-repair protein in T. thermophilus HB8 as it does in other organisms, such as E. coli and D. radiodurans (16,22).
The DNA-damage sensitivities of ΔradA were recovered to the same level as those of the wild type by introducing the expression plasmid of wild-type RadA (Fig. 5C). This indicates that the RadA protein expressed by the plasmid was fully functional in vivo and that the DNA-damage sensitivities of ΔradA were caused by its loss of RadA. In contrast, overexpression of wild-type RadA in the wild-type strain showed no effect on sensitivity to UV light and to mitomycin C (Fig. 5, B and C). It should be noted that in the Western blotting analysis, the bands of degraded RadA were observed even in the presence of protease inhibitors (Fig. 5B). The His tag-cleaved RadA (labeled with RadA in Fig. 5B) was observed in the strains that were complemented with plasmids encoding the His-tagged RadA.
By contrast, expression of neither the R286A/R385A nor the R305A/R314A/K345A mutant was able to rescue the DNA-damage sensitivities of ΔradA (Fig. 5C). It should be noted that the expression level of the R286A/R385A mutant in ΔradA was apparently lower than that of the wild type and the R305A/R314A/K345A mutant (Fig. 5B), implying the structural instability of the R286A/R385A mutant in vivo. These data suggest that these five residues located at the intersubunit cleft and the central hole are important for the proper function of RadA and that at least the DNA-binding ability of RadA via the central hole of rLonC is essential for RadA-mediated DNA repair in vivo.

Biochemical characterization of the DNA-binding-deficient mutants
The next question is whether there are any functional differences between the two DNA-binding sites. To address this question, we examined the DNA-dependent ATPase activity of wild-type RadA and the DNA-binding-deficient mutants, namely R286A/R385A and R305A/R314A/K345A. In the case of the wild type, the turnover rate was stimulated 4.5- and 2.2-fold in the presence of dsDNA and single-stranded DNA (ssDNA), respectively (Fig. 6A and Table 3). In E. coli, it has been shown that dsDNA and ssDNA stimulate the ATPase activity of RadA at the same level (23). Our data suggest that dsDNA and ssDNA have different actions on the ATPase activity of RadA. By contrast, the Michaelis constant for ATP was not affected by the addition of DNA, and the dissociation constants for dsDNA and ssDNA were comparable (Fig. 6, A and B; Tables 3 and 4).
Interestingly, the ATPase activity of R286A/R385A was not stimulated even at a high concentration of DNA (Fig. 6, A and B; Table 3). Furthermore, R286A/R385A exhibited a higher turnover rate in ATP hydrolysis than the wild type in the absence of DNA, and the turnover rate of R286A/R385A was comparable with that of the wild type in the presence of DNA. These data suggest that Arg-286 and Arg-385 have an important role in the DNA-dependent stimulation of ATP hydrolysis and that the R286A/R385A mutant mimics the stimulated state of wild-type RadA induced by DNA binding. In contrast to R286A/R385A, R305A/R314A/K345A showed similar patterns to the wild type in DNA-dependent ATPase activity, except for a slightly lower turnover rate and binding affinity to DNA than the wild type (Fig. 6, A and B; Tables 3 and 4). Based on these results, we concluded that these two DNA-binding sites have different functions in DNA-dependent ATPase activity.

Table 2 (footnote fragment). The errors are standard errors from the regression analysis. b, h is the Hill coefficient. c, - indicates no binding activity was detected.

Discussion
In this study, we determined the X-ray crystal structure of rLonC, which represents the first report of the tertiary structure of the "non-protease-type" LonC domain (Fig. 1). We further revealed that rLonC possesses DNA-binding activity, which is exerted by two positively charged regions, namely the intersubunit cleft and central hole of the ring (Figs. 1, 3, and 4). Our data suggest a model with two DNA-binding modes: in one, the DNA molecule binds to the intersubunit cleft; in the other, the DNA molecule passes through the central hole (Fig. 6C).
As for the latter mode, it should be noted that the loop containing residues 299-304 (TPFPAP) forms a lid-like structure at the side of the ring opposite the basic side (Fig. 1D, left panel). The inner diameter between the two loops is ~17 Å. Therefore, this loop could sterically hinder DNA if the DNA molecule passes through the central hole of the ring. However, the average B-factors of the Cα atoms of this loop and the overall structure were 92.0 and 51.2 Å², respectively, implying a high flexibility of this loop. These data raise the hypothesis that structural rearrangements in the loop region might occur upon binding to DNA through the central hole of the ring.
It should also be noted that RadA forms various types of oligomers that are smaller than a hexamer in solution (Fig. 2), implying the possibility that there are DNA-binding modes other than the hexameric ring structure. Both of the DNA-binding sites are located at the subunit interface and formed by dimerization (Fig. 4A). We speculate that dimerization, not hexamerization, is necessary and sufficient for DNA binding.
Our biochemical analysis suggests a functional difference between the two DNA-binding sites in the DNA-dependent ATPase activity: the intersubunit cleft is essential to the DNA-dependent stimulation of the ATPase activity, whereas the central hole is not (Fig. 6C). Interestingly, the R286A/R385A mutant exhibited similar ATPase activity to the DNA-bound stimulated state of the wild type even in the absence of DNA. Thus, Arg-286 and/or Arg-385 would be required to regulate ATPase activity in the DNA-free state as well as to stimulate the activity upon DNA binding. We speculate that DNA binding to the intersubunit cleft of rLonC changes the ATPase domain in such a way as to stimulate its ATPase activity.
It should be noted that RadA can bind to DNA in the absence of nucleotides (Fig. 3, A and B). Interestingly, it has been shown that E. coli RadA requires ADP for DNA-binding activity, suggesting the ATPase domain has an important role for DNA binding (23). Further biochemical analysis and structural information about full-length RadA and its DNA complex are necessary for a complete understanding of how the rLonC and ATPase domains coordinate to regulate DNA-binding activity.

Figure 5 (legend fragment). ... Table 5. B, estimation of protein expression levels in T. thermophilus HB8 cells by Western blot analysis using anti-RadA and anti-SSB antisera. C, effects of DNA-binding deficient mutations on UV light (left panel) and mitomycin C (right panel) sensitivity in T. thermophilus HB8 cells. The sensitivity is shown as log10-transformed survival fraction. The mean value (black line) of six independent experiments and each of their values (gray circle) are indicated. The error bar indicates the standard error of the mean. Statistical analysis was performed using Welch's t test, and multiplicity was adjusted using the Holm-Bonferroni method: *, p < 0.05; **, p < 0.01; ***, p < 0.001; ****, p < 0.0001. Statistical comparisons evaluated as "not significant" are not indicated. The cleft *1 and hole *2 show the R286A/R385A *1 and R305A/R314A/K345A *2 mutants, respectively. EV, empty vector; WT, wild type.
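The Holm-Bonferroni adjustment named in the Figure 5 legend can be sketched as follows. This is a minimal illustration of the standard step-down procedure, not the authors' script; the function name and the example p-values are hypothetical, and the Welch's t test that would produce the raw p-values is not reimplemented here.

```python
def holm_adjust(p_values):
    """Return Holm-Bonferroni adjusted p-values, in the original input order.

    The smallest raw p-value is multiplied by m, the next by m - 1, and so on;
    a running maximum keeps the adjusted values monotone, and each is capped at 1.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted

if __name__ == "__main__":
    # HYPOTHETICAL raw p-values for three pairwise comparisons (not from the paper).
    raw = [0.01, 0.04, 0.03]
    for p, q in zip(raw, holm_adjust(raw)):
        print(f"raw p = {p:.3f} -> Holm-adjusted p = {q:.3f}")
```

A comparison would then be starred according to its adjusted value against the thresholds listed in the legend.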
Our in vivo study suggests that the DNA-binding activity of rLonC via the central hole is essential to the DNA-repair function of RadA (Fig. 5). The DNA-binding activity via the intersubunit cleft is likely to be associated with the DNA-binding function of RadA, although it could not be clearly verified that the loss of DNA-binding activity in the mutant at this region directly leads to the loss of DNA repair activity. The HR pathway involves two DNA molecules and produces branched DNA structures as intermediates, such as the D-loop and Holliday junction (1,4). It has been suggested that the two DNA-binding sites of RecA/Rad51 are necessary for efficient recombining of two DNA molecules. For instance, RecA has a unique C-terminal domain containing conserved basic residues that interact with dsDNA as well as the L1 and L2 loops in the ATPase domain (41). In addition, the N-terminal regions of Rad51, Dmc1, and archaeal RadA (not a bacterial RadA ortholog) contain a modified helix-hairpin-helix motif, which has a similar function to the C-terminal domain of RecA (42)(43)(44). In E. coli, genetic and biochemical studies have shown that RadA might be involved in recombination intermediate processing in vivo and that RadA stimulates RecA-catalyzed branch migration in vitro (14,16,23). Thus, the two DNA-binding sites of rLonC might be necessary for efficient recombination intermediate processing, such as branch migration.
Our study revealed the novel LonC subfamily, rLonC, that has a DNA-binding activity instead of a protease activity. We speculate that the LonC family can be divided into "DNA-binding type" and "protease type." In the Pfam database (45), most of the rLonC and pLonC have a ChlI (PF13541) signature as well as a Lon_C (PF05362) signature. This ChlI signature was originally found in an N-terminal domain of ComM, which is a widely conserved bacterial protein involved in genetic recombination (46-48). Interestingly, like RadA and Lon protease, ComM also has an ATPase domain featuring Walker A and B motifs at the C terminus. Most of the ComM proteins also have the Lon_C signature in their N-terminal ChlI domain. Thus, we speculate that ComM is a novel LonC family protein. Phylogenetic analysis suggests that the LonC family can be divided into three subfamilies: RadA, ComM, and Lon protease (data not shown). The analysis also suggests that RadA and ComM are more closely related to each other than either is to Lon protease. Structural prediction and structural alignment analyses showed that highly conserved Arg and Lys residues are located near Arg-305, Arg-314, and Lys-345, the DNA-binding residues of rLonC (data not shown). Considering the putative function of ComM in natural transformation via HR (46,47), ComM might be involved in DNA processing via its LonC domain.

Table 4. The DNA-binding affinities determined from the DNA-dependent ATPase activity. The results are shown in Fig. 6B.
It is noteworthy that among three LonC subfamilies, only the Lon protease has the conserved catalytic Ser residue for the protease activity. Interestingly, the LonC family belongs to the ribosomal protein S5 domain 2-like superfamily, which includes various types of nucleic acid-binding proteins, such as ribosomal proteins S5 and S9, DNA mismatch repair protein MutL, DNA topoisomerases, ribonuclease P, and elongation factor G in the SCOP and CATH databases (49,50). In particular, ribosomal proteins, MutL, and ribonuclease P utilize the domain belonging to this superfamily as a nucleic acid-binding module (51)(52)(53). Therefore, we speculate that the ancestor of the LonC family has a nucleic acid-binding role. Structural and biochemical characterization of the LonC domain of ComM will be necessary to better understand the evolutionary relationships of LonC's subfamilies.
The cells were harvested by centrifugation and suspended in 40 ml of buffer I (50 mM Tris-HCl (pH 7.5) and 500 mM NaCl) containing 20 mM imidazole and 1 mM phenylmethylsulfonyl fluoride, and they were then lysed by ultrasonication on ice. After centrifugation (8,000 × g) for 1 h at 4°C, the supernatant was loaded onto 2 ml of TALON Metal Affinity Resin (Clontech) pre-equilibrated with buffer I containing 20 mM imidazole. The resin was washed with 100 ml of buffer I and then eluted with a 20-ml gradient of 20-300 mM imidazole in buffer I. The fraction containing the His6-tagged RadA was dialyzed against buffer I; then the protein was digested by TEV protease for 6 h at 15°C. After complete cleavage of the His6 tag, the tag-free RadA was precipitated by 30% saturated ammonium sulfate. After centrifugation (8,000 × g) for 30 min at 4°C, the precipitate was dissolved in buffer I, and then the solution was loaded onto a HiLoad 16/60 Superdex 200 pg column (GE Healthcare, Uppsala, Sweden) with the same buffer. The eluted RadA was concentrated using a Vivaspin concentrator (Sartorius AG, Göttingen, Germany). The protein concentration was determined on the basis of the absorbance at 280 nm using the molar extinction coefficients (M⁻¹ cm⁻¹) calculated through the previously described procedure (54). Except for the rLonC used for X-ray crystallography, all of the RadA mutants were overexpressed and purified with the same procedures as mentioned above. For crystallization, rLonC was purified in a His6-tagged form without TEV protease digestion.

X-ray crystal structure determination
A crystal used for X-ray crystallographic analysis was obtained after 6 months of incubation of 4 mg/ml His6-tagged rLonC in 50 mM Tris-HCl (pH 7.5) and 500 mM NaCl, in a glass test tube at 4°C. The crystal was soaked in 25% glycerol as a cryoprotectant and was cryo-cooled in a nitrogen cryostream. Data were collected at the wavelength of 1.00000 Å at BL38B1 of SPring-8 (Hyogo, Japan). The diffraction images were processed with the HKL2000 program suite (55). Using a structural model predicted using the I-TASSER server as a search model (30), the structure of rLonC was solved by molecular replacement. The MOLREP program in the CCP4 suite was used for the rotation and translation searches (56,57). The model was refined using REFMAC5 (58) in the CCP4 suite and COOT (59). The final model was validated by RAMPAGE (60) and PROCHECK (61) in the CCP4 suite. The structure coordinates have been deposited into the Protein Data Bank (PDB) under accession number 5H45. The surface electrostatic potentials were calculated by APBS (62) and PDB2PQR (63). The calculation of the molecular contacts and the prediction of the stable oligomer were carried out using PISA in the CCP4 suite (31). All of the molecular graphics were generated by PyMOL (Schrödinger, New York, NY).

Gel filtration analysis
Gel-filtration analysis was performed using a Superdex 200 10/30 GL column (GE Healthcare) on an ÄKTA explorer system (GE Healthcare) at 25°C. First, 5 μM full-length RadA or rLonC was incubated in a buffer composed of 50 mM Tris-HCl (pH 7.5), 0.2-2 M NaCl at 25°C for 2 h. Then, 100 μl of the protein solution was loaded on the column and eluted at a flow rate of 0.5 ml/min in the same buffer. The elution profile was monitored by recording the absorbance at 230 nm. The apparent molecular mass was estimated by the calibration curve using thyroglobulin (669,000 Da), β-amylase (200,000 Da), alcohol dehydrogenase (150,000 Da), bovine serum albumin (66,000 Da), ovalbumin (44,000 Da), and carbonic anhydrase (29,000 Da).
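A calibration curve of this kind is commonly built by regressing log10(molecular mass) on elution volume and reading unknowns off the fitted line. The sketch below assumes that form, which the text does not spell out. The standard masses are those listed above, whereas the elution volumes and all function names are hypothetical placeholders.

```python
import math

# Masses (Da) of the six standards listed in the text, largest first.
STANDARD_MASSES_DA = [669_000, 200_000, 150_000, 66_000, 44_000, 29_000]
# HYPOTHETICAL elution volumes (ml), one per standard, for illustration only.
HYPOTHETICAL_ELUTION_ML = [9.0, 11.6, 12.2, 14.0, 14.9, 15.8]

def least_squares_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def calibrate(elution_ml, masses_da):
    """Fit log10(mass) against elution volume; returns (slope, intercept)."""
    return least_squares_line(elution_ml, [math.log10(m) for m in masses_da])

def apparent_mass_da(elution_ml, slope, intercept):
    """Apparent molecular mass (Da) for a peak eluting at elution_ml."""
    return 10 ** (slope * elution_ml + intercept)

if __name__ == "__main__":
    slope, intercept = calibrate(HYPOTHETICAL_ELUTION_ML, STANDARD_MASSES_DA)
    peak_ml = 12.0  # hypothetical sample peak
    print(f"Peak at {peak_ml} ml -> ~{apparent_mass_da(peak_ml, slope, intercept):,.0f} Da")
```

Dividing the apparent mass by the monomer mass then gives an apparent oligomeric state, which is how ranges such as "trimers to tetramers" are usually derived.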

EMSA
Chemically synthesized 60-mer ssDNA (Bex, Tokyo, Japan), 5′-GGGTGAACCTGCAGGTGGGCAAAGATGTCCTAGCAATCCATTGGTGATCACTGGTAGCGG-3′, was annealed to complementary 60-mer ssDNA (Bex), 5′-CCGCTACCAGTGATCACCAATGGATTGCTAGGACATCTTTGCCCACCTGCAGGTTCACCC-3′, to obtain 60-bp dsDNA. Then, 3 µM (bp) dsDNA and 1.1 to 20 µM RadA were incubated in a buffer containing 50 mM Tris-HCl (pH 7.5), 100 mM NaCl, 0.1 mg/ml bovine serum albumin, 10% glycerol, and 0.01% bromphenol blue at 25°C for 30 min. The reaction mixtures (5 µl) were loaded onto a 5% polyacrylamide gel containing 10% glycerol and then electrophoresed in 1× TBE buffer (89 mM Tris borate and 2 mM EDTA). The gel was stained with SYBR Gold Nucleic Acid Gel Stain (Molecular Probes, Eugene, OR), and the bands were visualized by a FLA-3000 image analyzer (Fujifilm, Tokyo, Japan). The amounts of free and shifted DNAs were quantified using MultiGauge version 2.1 software (Fujifilm) to determine the fraction bound. Under conditions with a large excess of RadA over DNA, the apparent dissociation constant Kd was determined from three independent experiments by fitting to Equation 1 using the Igor Pro 4.03 software (WaveMetrics, Lake Oswego, OR).
In Equation 1, [P], Kd, and h are the total protein concentration, apparent dissociation constant for DNA, and Hill coefficient, respectively.

Table 5. Sequences of primers used for this study
a The recognition sites of the restriction enzymes used for plasmid construction and the start/terminator codons are indicated by underlining and boldface, respectively.

ATPase assay
To determine the kinetic constants of the ATPase activity, an ATPase assay was performed in reaction mixtures containing 50 mM Tris-HCl (pH 7.5), 150 mM NaCl, 10 mM MgCl2, 0.1 mg/ml bovine serum albumin, 0.5 µM RadA, and 7.8 to 500 µM ATP containing 0.1 µCi of [γ-32P]ATP, in the presence and absence of 30 µM (bp) dsDNA (60 bp) or 30 µM (nucleotides) ssDNA (60-mer). Reactions were initiated by adding ATP and carried out at 25°C. To stop the reactions, aliquots of the mixtures were mixed with EDTA and SDS to final concentrations of 25 mM and 0.1%, respectively; then 0.5 µl of each sample was spotted onto a polyethyleneimine cellulose plastic sheet (Merck Millipore, Darmstadt, Germany) and analyzed by thin-layer chromatography with a development buffer containing 0.1 M formic acid and 0.2 M LiCl at room temperature for 5 min. The plate was dried, placed in contact with an imaging plate, and then analyzed by a FLA-3000 image analyzer. The amounts of ATP and Pi were quantified using MultiGauge version 2.1 software. The apparent rate constant kapp (= v0/[E]0) was estimated from the time-course analysis. The kinetic parameters kcat and Km were determined from three independent experiments by fitting to Equation 2 using the Igor Pro 4.03 software.
The symbols in Equation 2 denote the initial velocity (v0), total enzyme concentration ([E]0), free ATP concentration, catalytic rate constant (kcat), and Michaelis constant (Km).
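Equation 2 itself is not reproduced in this text. Assuming it is the standard Michaelis-Menten rate law, which is consistent with the quantities listed and with the definition of kapp as v0/[E]0 given above, it would read as follows. The symbol [ATP] for the free ATP concentration is a notation chosen here and is not taken from the original.

```latex
k_{\mathrm{app}} \;=\; \frac{v_0}{[\mathrm{E}]_0}
\;=\; \frac{k_{\mathrm{cat}}\,[\mathrm{ATP}]}{K_m + [\mathrm{ATP}]}
```

In this form kapp approaches kcat at saturating ATP and equals kcat/2 when the free ATP concentration equals Km.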
To determine the dissociation constants for DNA through the DNA-dependent ATPase activity, an ATPase assay was performed in reaction mixtures containing 50 mM Tris-HCl (pH 7.5), 150 mM NaCl, 10 mM MgCl2, 0.1 mg/ml bovine serum albumin, 0.1 µM RadA, 500 µM ATP containing 0.1 µCi of [γ-32P]ATP, and 0 to 30 µM (bp) dsDNA (60 bp) or 0 to 30 µM (nucleotides) ssDNA (60-mer). The initiation and termination of the reactions and the analysis by thin-layer chromatography were performed in the same manner as described above. The apparent rate constant kapp was estimated from the time-course analysis. The apparent rate constant of the DNA-free enzyme, kunbound, was determined from six independent experiments in the DNA-free condition. Under conditions with a large excess of DNA over RadA, the kinetic parameters kbound and Kd were determined from three independent experiments by fitting to Equation 3 using the Igor Pro 4.03 software.
In Equation 3, [DNA], kbound, and Kd are the total DNA concentration, apparent rate constant of the DNA-bound enzyme, and dissociation constant for DNA, respectively.
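The way Kd is extracted from the DNA-dependent ATPase activity can be illustrated with a short Python sketch. Equation 3 is not reproduced in this text, so the model below is an assumption: a two-state mixture in which the observed rate is the average of the DNA-free and DNA-bound rates, weighted by a hyperbolic bound fraction [DNA] / (Kd + [DNA]). This form is valid only when DNA is in large excess over RadA, as the text specifies. All numerical values are hypothetical.

```python
def k_app_model(dna, k_unbound, k_bound, kd):
    """Apparent ATPase rate constant at a given total DNA concentration.

    ASSUMED form (Equation 3 is not shown in the source):
        k_app = k_unbound + (k_bound - k_unbound) * [DNA] / (Kd + [DNA])
    """
    bound_fraction = dna / (kd + dna)
    return k_unbound + (k_bound - k_unbound) * bound_fraction


# HYPOTHETICAL parameters, for illustration only (rates in min^-1, Kd in uM).
K_UNBOUND = 0.5
K_BOUND = 2.0
KD_UM = 4.0

# Without DNA the model returns the DNA-free rate.
rate_no_dna = k_app_model(0.0, K_UNBOUND, K_BOUND, KD_UM)
# At [DNA] = Kd the rate lies halfway between the two limiting rates.
rate_at_kd = k_app_model(KD_UM, K_UNBOUND, K_BOUND, KD_UM)
# At the top of the 0 to 30 uM titration range the rate approaches k_bound.
rate_at_30 = k_app_model(30.0, K_UNBOUND, K_BOUND, KD_UM)
```

Fixing kunbound from the DNA-free measurements, as the text describes, leaves only kbound and Kd to be fitted, which makes the fit better conditioned.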

Constructs for gene disruption and complementation
A T. thermophilus HB8 strain lacking radA (ΔradA) was constructed by substituting the target gene with a thermostable kanamycin-resistance gene, htk, through natural transformation and HR as described previously (40). A plasmid used for gene disruption was obtained from RIKEN BioResource Center. The plasmid (clone name TDs03G08) is a derivative of pGEM-T Easy (Promega, Madison, WI) and was constructed by inserting htk flanked by 500-bp upstream and downstream sequences of the target gene. Gene disruption was confirmed by PCR using the isolated genomic DNA as a template and by Western blotting using cell lysates. The primer sequences used to confirm gene disruption are listed in Table 5.
For in vivo complementation analysis, radA derivatives were expressed in T. thermophilus HB8 by polycistronic transcription of the radA gene placed downstream of a thermostable hygromycin B resistance gene under the control of the isocitric acid dehydrogenase promoter from Thermus aquaticus YT1, as described previously (64) with a slight modification. The pMK18::HygΔKm plasmid was constructed by self-ligation after inverse PCR to delete the htk region of pMK18::Hyg. The pET-HisTEV/radA, pET-HisTEV/radA_R286A/R385A, and pET-HisTEV/radA_R305A/R314A/K345A plasmids were digested with XbaI and HindIII and then ligated into the complement site of the pMK18::HygΔKm plasmid to generate pMK18::HygΔKm::radA, pMK18::HygΔKm::radA_R286A/R385A, and pMK18::HygΔKm::radA_R305A/R314A/K345A. These plasmids and pMK18::HygΔKm, as a negative control, were transformed into the T. thermophilus HB8 wild-type and ΔradA strains. Protein expression in each strain was confirmed by Western blotting.

Examination of the sensitivities to DNA-damaging agents
The sensitivity to DNA-damaging agents was measured as follows. To measure UV sensitivity, T. thermophilus HB8 cells in the mid-exponential growth phase were serially diluted in TR medium; then the cells were spread onto TT plates and irradiated with 254-nm UV light at a dose of 54 J m−2 using a SLUV-4 lamp (As One, Osaka, Japan). The numbers of colonies were counted after incubation at 70°C for 18 h. To measure mitomycin C sensitivity, serially diluted T. thermophilus HB8 cells were mixed with a 300 µg/ml mitomycin C solution (Nacalai Tesque) to a final concentration of 3 µg/ml. The cells were further incubated at 37°C for 30 min and spread onto TT plates. The numbers of colonies were counted after incubation at 70°C for 18 h. The surviving fraction was calculated as the number of colony-forming units (CFUs) after UV or mitomycin C treatment relative to the CFUs of untreated cells. Statistical analysis was performed on log10-transformed surviving fractions from six independent experiments using Welch's t test, and multiplicity was adjusted by the Holm-Bonferroni method.
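The survival calculation and the multiplicity adjustment can be sketched in Python as follows. This is a minimal illustration, not the authors' analysis script: the CFU counts and raw p-values are hypothetical, and Welch's t test itself is omitted because it requires a t-distribution routine from a statistics package. The Holm-Bonferroni step follows the standard step-down definition.

```python
import math


def log10_surviving_fraction(cfu_treated, cfu_untreated):
    """log10 of the surviving fraction (treated CFUs / untreated CFUs)."""
    return math.log10(cfu_treated / cfu_untreated)


def holm_bonferroni(p_values):
    """Return Holm-Bonferroni adjusted p-values in the original order.

    The i-th smallest raw p-value (i = 0, 1, ...) is multiplied by (m - i),
    capped at 1, and forced to be non-decreasing along the sorted order.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# HYPOTHETICAL counts: 120 colonies after treatment, 12,000 without.
log_sf = log10_surviving_fraction(120, 12_000)

# HYPOTHETICAL raw p-values from three pairwise comparisons.
raw_p = [0.01, 0.04, 0.03]
adj_p = holm_bonferroni(raw_p)
```

Working on the log10 scale is natural here because surviving fractions span orders of magnitude and their errors are roughly multiplicative.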

Western blotting
T. thermophilus HB8 cells at the late-exponential growth phase were harvested by centrifugation. The cells were lysed by ultrasonication in a buffer containing 50 mM Tris-HCl (pH 7.5), 500 mM NaCl, 10 mM EDTA, and 10× protease inhibitor cocktail (Nacalai Tesque). The cell debris was removed by centrifugation. Total protein concentrations in the cell lysates were measured and corrected by the Bradford method using the protein assay kit from Bio-Rad. The cell lysates containing 50 µg of total protein in each lane were resolved on 11% polyacrylamide gels and electroblotted onto a PVDF membrane (Millipore, Milford, MA). The membrane was incubated in a blocking solution containing 20 mM Tris-HCl (pH 7.5), 500 mM NaCl, and 5% skim milk for 30 min at room temperature. After washing with a wash buffer containing 20 mM Tris-HCl (pH 7.5) and 500 mM NaCl, the membrane was immersed in a blocking solution containing rabbit anti-RadA and anti-SSB antisera and then incubated for 2 h at room temperature. After washing with the wash buffer, the membrane was reacted with goat anti-rabbit IgG-AP conjugate (Bio-Rad) in the blocking solution for 1 h at room temperature. The membrane was washed twice with the wash buffer and then reacted with AP color reagents in a development buffer (Bio-Rad) for 15 min at room temperature. The color development was stopped by washing with deionized water.

Phylogenetic analysis
A multiple sequence alignment was prepared with the MUSCLE program (65) using 64 LonC domain sequences, comprising 26 Lon protease, 21 RadA, and 17 ComM proteins. Then, a phylogenetic tree was constructed with the MEGA program (66) using the neighbor-joining method (67).

Author contributions: M. I., K. F., R. M., and S. K. conceived and designed the work. M. I. constructed expression plasmids, prepared proteins, crystallized the protein, solved the X-ray structure, and performed in vivo and in vitro experiments. K. F. performed the data collection for X-ray crystallography. Y. F. performed gene disruption of radA. M. I. and R. M. wrote the paper with help from K. F., N. N., T. Y., and S. K.