Double-stranded Endonuclease Activity in Bacillus halodurans Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)-associated Cas2 Protein*

Background: Cas2 is universally conserved and essential for new CRISPR spacer acquisition. Results: Bha_Cas2 uses a single metal ion to cleave dsDNA and is likely activated by a pH-dependent conformational change. A method to classify Cas2 into ssRNase and dsDNase is proposed. Conclusion: B. halodurans and T. thermophilus Cas2 are metal-dependent endonucleases. Significance: dsDNase activity is consistent with the direct involvement of Cas2 in new spacer acquisition. The CRISPR (clustered regularly interspaced short palindromic repeats) system is a prokaryotic RNA-based adaptive immune system against extrachromosomal genetic elements. Cas2 is a universally conserved core CRISPR-associated protein required for the acquisition of new spacers for CRISPR adaptation. It was previously characterized as an endoribonuclease with preference for single-stranded (ss)RNA. Here, we show using crystallography, mutagenesis, and isothermal titration calorimetry that the Bacillus halodurans Cas2 (Bha_Cas2) from the subtype I-C/Dvulg CRISPR instead possesses metal-dependent endonuclease activity against double-stranded (ds)DNA. This activity is consistent with its putative function in producing new spacers for insertion into the 5′-end of the CRISPR locus. Mutagenesis and isothermal titration calorimetry studies revealed that a single divalent metal ion (Mg2+ or Mn2+), coordinated by a symmetric Asp pair in the Bha_Cas2 dimer, is involved in the catalysis. We envision that a pH-dependent conformational change switches Cas2 into a metal-binding competent conformation for catalysis. We further propose that the distinct substrate preferences among Cas2 proteins may be determined by the sequence and structure in the β1–α1 loop.

cas (CRISPR-associated) genes encode a set of conserved proteins found in the vicinity of the CRISPR loci (4,6,31). These proteins can be classified into a set of core Cas proteins (Cas1-6) as well as subtype-specific proteins (32-34). Cas proteins support three molecular events: the acquisition of new spacers derived from extrachromosomal elements into the CRISPR loci, the processing of precursor CRISPR RNAs into their mature form, and the degradation of complementary nucleic acids, in most cases DNA, in a CRISPR RNA-specific fashion (2-4,7). cas1 and cas2 are two core cas genes that are universally present in all CRISPR-Cas subtypes and are required in the new spacer acquisition step (7,35). Cas1 proteins from Pseudomonas aeruginosa and E. coli were characterized as metal-dependent endonucleases (26,36). A recent study further classified the cas2 genes among different subtypes into three clades based on phylogenetic analysis (34). The crystal structures of Cas2 from Sulfolobus solfataricus (Sso_Cas2) and Desulfovibrio vulgaris (Dvu_Cas2) revealed that Cas2 contains a ferredoxin domain and assembles into a symmetric dimer (27,37). A conserved N-terminal aspartate residue in the S. solfataricus Cas2 was hypothesized to coordinate the catalytic divalent metal ion(s) in the Cas2 dimer. The S. solfataricus Cas2 was further characterized as a metal-dependent single-stranded (ss) endoribonuclease with a preference for uracil-rich ssRNA (27); however, this activity was not observed in the D. vulgaris Cas2 (37). Furthermore, the ssRNase activity of Cas2, combined with its essential function in spacer acquisition, may imply that new spacers could be derived from transcribed RNAs through a reverse transcription mechanism (4,33,38,39). This is, however, inconsistent with the literature showing that new spacers can be derived from untranscribed regions.
In this study, we report the structural and biochemical characterization of the B. halodurans Cas2 protein (denoted as Bha_Cas2). Our assays revealed that Bha_Cas2 is a metal-dependent double-stranded (ds)DNA endonuclease rather than an RNase. This conclusion was further strengthened by the identification of dsDNase activity in the Thermus thermophilus Cas2 (Tth_Cas2) protein from another CRISPR subtype. We propose a way to classify Cas2 proteins into dsDNases and ssRNases based on sequence and structural features. Mutagenesis combined with isothermal titration calorimetry (ITC) revealed that the two conserved Asp 8 residues in the Bha_Cas2 dimer coordinate one divalent metal ion for catalysis. The distance between these two aspartate residues in the crystal structure is, however, too great for them to chelate one metal ion together. These findings, together with the observation that the dsDNase activity and metal chelation in Bha_Cas2 are strongly correlated and steeply pH-dependent, point to the possibility that a pH-dependent conformational change enables the Bha_Cas2 protein to chelate a divalent metal ion in the active site, which in turn enables the protein to cleave dsDNA.

EXPERIMENTAL PROCEDURES
Protein Expression and Purification-Cas2 (accession no. Q9KFX8; gene name, BH0342) was PCR-amplified from Bacillus halodurans C-125 genomic DNA and cloned into the pET28b vector (Novagen) via NdeI and XhoI sites. The protein containing an N-terminal His 6 tag was expressed in E. coli BL21 Star (Novagen) at 18°C for 18 h in Luria Bertani media after induction with 1 mM isopropyl β-D-1-thiogalactopyranoside. Five grams of cell pellet was resuspended in ice-cold lysis buffer (50 mM Tris-HCl, pH 8.0, 0.1 M NaCl, 2 mM β-mercaptoethanol, and 0.2 mM phenylmethylsulfonyl fluoride) and disrupted by sonication. The supernatant after centrifugation was loaded onto a nickel-nitrilotriacetic acid column (Qiagen) and eluted with the lysis buffer supplemented with 300 mM imidazole. The N-terminal His tag was then removed by thrombin cleavage at 4°C overnight. The protein was further purified on a heparin column (GE Healthcare), followed by a Superdex 200 10/300 column (GE Healthcare) using an elution buffer containing 10 mM Tris-HCl, pH 8.0, 100 mM NaCl, and 2 mM dithiothreitol. Bha_Cas2 mutants (D8N) were generated using a modified site-directed mutagenesis Phusion method (New England Biolabs) and verified using DNA sequencing. The SUMO-Cas2 construct was generated by subcloning the Bha_Cas2 gene into a modified pSUMO vector. The T. thermophilus cas2 gene was cloned into the vector pQE80 via BamHI and XhoI sites. These proteins were expressed and purified following the protocol for His 6 -tagged Bha_Cas2.
Nuclease Activity Assays-All nuclease activity assays were performed at 37°C for 60-90 min, with the exception of the time course study. Fluorescently labeled ssRNA (1 µM), ssDNA (10 µM), or dsDNA (1.19-2.38 µg of a pUC19 plasmid) substrates were incubated with Bha_Cas2 (20 µM) in a reaction buffer containing 25 mM HEPES, pH 7.5, 200 mM KCl, and 2.5 mM MgCl2. Metal dependence was measured in the same buffer, where Mg2+ was replaced with 2.5 mM of either Mn2+, Ca2+, Zn2+, Cu2+, or EDTA. In the pH dependence assays, the HEPES component in the reaction buffer was replaced with 50 mM of either sodium citrate (pH 3.0-5.0), MES (pH 6.0), Tris-HCl (pH 7.0-8.0), or CAPS (pH 9.0-11.0). The optimal salt concentration for the Bha_Cas2 nuclease activity was determined by increasing the KCl or NaCl concentration from 50 to 200 mM. Reaction products were separated by electrophoresis on either 0.7-2.0% (w/v) agarose gels or 15-18% (w/v) 8 M urea-PAGE gels and visualized using ethidium bromide staining or fluorescence scanning, respectively. The latter method involved scanning the urea-PAGE gel on a Typhoon 9400 (GE Healthcare) to detect fluorescent signals.
Crystallization, Data Collection, and Structure Determination-Bha_Cas2 (8 mg/ml) was crystallized using the hanging drop vapor diffusion method at 18°C by mixing 2 µl of protein solution with 2 µl of one of the following three reservoir buffers: (i) data 1 (50 mM MES, pH 6.0, 2% (w/v) PEG MME 2000, and 10 mM MgSO4); (ii) data 2 (50 mM MES, pH 6.0, 2% (w/v) 2-methyl-2,4-pentanediol, and 10 mM MgSO4); (iii) data 3 (50 mM MES, pH 6.0, 4% (w/v) PEG 6000, and 10 mM MgSO4). Crystals were cryo-protected with the addition of 30% (v/v) ethylene glycol and flash-frozen in liquid nitrogen. X-ray diffraction data were collected at 100 K at beamline A1 at MACCHESS. Diffraction data were processed and scaled using the program HKL2000 (40). Initial phases were obtained with the molecular replacement program MOLREP (41) using the deposited T. thermophilus Cas2 structure (gene name TT1823; Protein Data Bank code 1ZPW) as the search model. Interactive manual model building and refinement were carried out using the programs Coot (42) and Refmac5 (43) in the CCP4 package, respectively. Simulated annealing omit maps were systematically generated to check the quality of the model. The structure was analyzed using the programs CNS (44), PDBSUM (45), and MolProbity (46). All figures were prepared using the program PyMOL. The data processing and refinement statistics are shown in Table 1.
Isothermal Titration Calorimetry-All ITC experiments were performed at 25°C using a Nano-ITC instrument (TA Instrument). To remove non-specifically bound metal ions, the N-terminal His 6 tag on Bha_Cas2 was removed by thrombin cleavage during purification, and the proteins were dialyzed first into an EDTA-containing buffer (10 mM Tris-HCl, pH 8.0, 100 mM NaCl, 2 mM DTT, and 2 mM EDTA) and then into the ITC buffer containing 20 mM HEPES, pH 7.5, and 50 mM NaCl.
The pH-dependent ITC titration used a different buffer containing 25 mM MES, pH 6.0, and 50 mM NaCl. To determine the metal-binding stoichiometry, a 20 mM Mg2+ or Mn2+ solution was titrated in 20 injections of 2 µl each into 190 µl of Bha_Cas2 protein (0.8-1 mM) in the same solution. Injections were administered at 240-s intervals with continuous stirring at 300 rpm. The base-line heat was measured by making identical injections in the absence of divalent metal ions. The identity of the monovalent cation (Na+ versus K+) did not produce appreciable differences in ITC experiments. Data analysis was performed using the independent binding model available in the NanoAnalyze software program (version 2.1.13, TA Instrument). Upward peaks in a NanoITC trace signify an exothermic reaction, which is the opposite of the MicroCal convention.
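The titration scheme above fixes the metal:protein molar ratio reached at each injection, which determines whether the binding isotherm is sampled past saturation. The following Python sketch works through that arithmetic using the volumes and concentrations quoted in the text. The function name `cumulative_molar_ratios` is hypothetical, and the calculation assumes, for simplicity, that dilution and overflow of the cell contents can be ignored; it is an illustration rather than the procedure used by the instrument software.

```python
# Cumulative metal:protein molar ratio after each ITC injection.
# Assumption: dilution/overflow of the sample cell is ignored, so the
# amount of protein in the cell is treated as constant.

def cumulative_molar_ratios(inj_ul, n_inj, titrant_mM, cell_ul, protein_mM):
    """Return the metal:protein (monomer) molar ratio after each injection."""
    protein_nmol = cell_ul * protein_mM  # µl x mM = nmol
    return [(i * inj_ul * titrant_mM) / protein_nmol
            for i in range(1, n_inj + 1)]

# Values from the text: 20 injections of 2 µl, 20 mM metal, 190 µl cell.
ratios_1mM = cumulative_molar_ratios(2.0, 20, 20.0, 190.0, 1.0)
ratios_08mM = cumulative_molar_ratios(2.0, 20, 20.0, 190.0, 0.8)

print(f"1.0 mM protein: first {ratios_1mM[0]:.2f}, last {ratios_1mM[-1]:.2f}")
print(f"0.8 mM protein: first {ratios_08mM[0]:.2f}, last {ratios_08mM[-1]:.2f}")
```

Under these assumptions the titration ends at a roughly four- to five-fold molar excess of metal over Cas2 monomer, well beyond the 0.5 to 1.0 range in which the fitted stoichiometries fall.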

RESULTS
Metal-dependent Double-stranded Nuclease Activity in B. halodurans Cas2-Non-homologous recombination in the CRISPR-cas system has been suggested as the mechanism for new spacer acquisition (33). The source of the new spacers was suggested to be either mRNAs, presumably with the assistance of a putative reverse transcriptase that is occasionally present in the vicinity of the cas operon, or, more likely, dsDNA (4,33,38,39). A genetic study revealed that Cas1 and Cas2 proteins are required for new spacer acquisition (8). In this study, we focused on the Bha_Cas2 protein to characterize its enzymatic activities.
Various DNA and RNA substrates were incubated with the Bha_Cas2 protein. We were not able to detect cleavage activity in Bha_Cas2 for ssRNAs or ssDNAs (28-32 nucleotides) (Fig. 1, A and B, and supplemental Table S1). The lack of ssRNase activity in Bha_Cas2 is consistent with the D. vulgaris Cas2 study but not the S. solfataricus Cas2 study (5,27). Instead, robust dsDNase activity was detected from the Bha_Cas2 protein (Fig. 1C). The observation that Bha_Cas2 could actively degrade both circular and linear plasmid DNA pointed to endonuclease rather than exonuclease activity (Fig. 1C). The time course experiment further revealed that the end product after Bha_Cas2 processing was ~120 bp in size (Fig. 1D). Divalent metal ions were required for this dsDNase activity, as addition of EDTA completely inhibited plasmid degradation (Fig. 1E). A survey of the metal dependence revealed that Bha_Cas2 activity could be supported to various extents by different divalent metal ions, in the descending order Mg2+ >> Mn2+ > Fe2+ > Ni2+ > Ca2+ (Fig. 1E). Zn2+, on the other hand, could not support Bha_Cas2 activity. A monovalent cation dependence was also observed: the Bha_Cas2 protein was more active at higher salt concentrations (measured from 50 to 200 mM) and preferred K+ over Na+ under the assay conditions (Fig. 1F). In the pH dependence experiments from pH 3.0 to 11.0, the dsDNase activity was most pronounced between pH 7.0 and 10.0 (Fig. 1G). The activity decreased sharply below pH 6.0 (to less than 10% of that at pH 7.0). Taken together, our data strongly suggest that, unlike S. solfataricus Cas2, the B. halodurans Cas2 protein is a metal-dependent endonuclease that targets dsDNA for degradation.
Crystal Structure of Bha_Cas2-To gain insight into the distinct dsDNase activity of Bha_Cas2, we determined its crystal structure. Bha_Cas2 crystallized in the orthorhombic space group P2₁2₁2, with one molecule occupying the asymmetric unit. Bha_Cas2 forms a symmetric dimer in the crystal lattice (Fig. 2A). This oligomerization state is consistent with the dimer formation observed in size-exclusion chromatography (supplemental Fig. S1). The P2₁2₁2 crystal form could be obtained using PEG MME 2000, PEG 6000, or MPD as a precipitant. To investigate whether different precipitants may influence the Bha_Cas2 conformation, we determined structures from crystals grown from each precipitant at resolutions of 1.10, 1.30, and 1.70 Å, respectively (Table 1). Comparison between these structures revealed minor global and local conformational changes: as much as 1.5° of rigid-body rotation can be detected between the protomers in the Bha_Cas2 dimer (Fig. 2C), the distance between active site residues varies from 10.6 to 11.3 Å, and the conformations of two surface loops also differ among the three Bha_Cas2 structures (supplemental Fig. S2).
[Figure 1 legend, partial] … Table S1. C, Bha_Cas2 can cleave both circular and linearized dsDNA plasmid. Reactions were done in duplicate. D, time course of the nuclease activity. E, the nuclease activity in Bha_Cas2 is activated by divalent metal ions, whereas addition of EDTA or Zn2+ strongly inhibits the activity. F, effect of increasing monovalent metal ion concentration on nuclease activity; Bha_Cas2 is more active in higher concentrations of KCl. 2.5 mM Mg2+ is present in the solution. G, the nuclease activity in Bha_Cas2 is steeply pH-dependent. The substrate used in D-G is a linearized dsDNA plasmid.
OCTOBER 19, 2012 • VOLUME 287 • NUMBER 43
The electron density map allowed tracing of the entire molecule except three flexible regions: between β1 and α1 (Ala 9 -Ala 13 ), between β4 and β5 (Ala 73 -Ala 77 ), and the C-terminal tail (Ala 85 to the C terminus) (supplemental Fig. S2).
The total buried surface area at the Bha_Cas2 dimer interface is ~1430 Å² (~30% of the surface area of the protomer). Closer analysis revealed that the dimer interface was dominated by polar interactions (68% hydrophilic, 32% hydrophobic). Signif… Fig. S3; asterisk signifies residues from the interacting dimer).
The Bha_Cas2 structure aligns well with the S. solfataricus and D. vulgaris Cas2 structures (24 and 51% sequence identity, respectively), with root mean square deviations of 1.61 and 2.57 Å, respectively (Fig. 3A). The conserved structural features include the dimer formation, the β5-strand swapping at the dimer interface, and a conserved aspartate residue pair in the putative active site (Fig. 3A) (27,37). Interestingly, the distance between this invariant Asp pair varied significantly among the three structures, with values of 10.62 Å in Bha_Cas2, 6.5 Å in Sso_Cas2, and 15.42 Å in Dvu_Cas2, pointing to the possibility of substrate-induced conformational changes or diversity in the catalytic mechanism (Fig. 3, B and C). In addition, large conformational differences were observed in the α2-β4 and β1-α1 loops in the Cas2 structures. The long α2-β4 loop in the Sso_Cas2 structure, which was suggested to be responsible for recognizing the RNA substrates (27,47), is much shorter and more rigid in the Bha_Cas2 and Dvu_Cas2 structures (Fig. 3, D and E). By contrast, the β1-α1 loop is larger and more flexible in these two structures. The net result is a deeper and narrower substrate binding groove in Sso_Cas2, more appropriate for ssRNA binding, and a wider and shallower binding groove in Dvu_Cas2 and Bha_Cas2, better tuned for dsDNA binding.
Role of Asp 8 in Coordinating the Catalytic Metal Ion-The Asp 10 pair in the S. solfataricus Cas2 dimer has been shown to be critical for catalysis. Based on the structural and sequence alignments with Sso_Cas2 and other Cas proteins (Fig. 3 and supplemental Fig. S4), the Asp 8 residue in Bha_Cas2 was thought to play an equivalent role, possibly coordinating a catalytic divalent metal ion(s) for dsDNA cleavage. Indeed, the dsDNase activity in the D8N Bha_Cas2 mutant was drastically reduced as compared with the wild type in the presence of Mg2+, Mn2+, or Ca2+ (Fig. 4A). However, none of the three apo-Cas2 structures revealed the binding of divalent metal ion(s) near the invariant Asp pair. Moreover, the distance between the invariant Asp pair varies significantly among the three Cas2 structures (Fig. 3, B and C), leaving open the question of whether one or two divalent metal ions are coordinated by the Asp pair in the Cas2 dimer.
We used ITC to measure the metal-binding stoichiometry in the wild type and D8N mutant Bha_Cas2 proteins (Bha_Cas2-WT and Bha_Cas2-D8N). To prevent prebound metal ions from complicating the analysis, purified Bha_Cas2 proteins were first dialyzed into an EDTA-containing solution to strip residual metal ions before dialyzing into the ITC buffer. The thermodynamic parameters and the metal-binding stoichiometry can be derived from ITC analysis. Interestingly, the thermodynamics of Bha_Cas2 binding to Mg2+ and Mn2+ were very different (Fig. 4 and Table 2). Titration of Mg2+ into the Bha_Cas2-WT protein solution was an endothermic process, with a binding enthalpy (ΔH°) of 18.3 kJ/mol and a binding free energy (ΔG°) of -5.8 ± 0.006 kJ/mol (Fig. 4B). Binding of Mn2+ by Bha_Cas2, on the other hand, was an exothermic process, with a binding enthalpy (ΔH°) of -20.0 kJ/mol and a binding free energy (ΔG°) of -4.2 ± 0.005 kJ/mol (Fig. 4B). The affinity of Bha_Cas2-WT for Mg2+ (Kd = 54 µM) was 14 times higher than that for Mn2+ (Kd = 791 µM).
Titration of Mg2+ and Mn2+ into the Bha_Cas2-D8N mutant followed a similar trend to wild type (Fig. 4C). The binding of Mg2+ was an endothermic binding event, with ΔH° and ΔG° of 17.8 kJ/mol and -7.9 ± 0.004 kJ/mol, respectively. Mn2+ binding was an exothermic binding event, with ΔH° and ΔG° of -18.3 kJ/mol and -4.2 ± 0.002 kJ/mol, respectively. Notably, although the binding affinity of Bha_Cas2-D8N for Mn2+ ions (Kd = 773 µM) remained roughly the same as that of Cas2-WT, its affinity for Mg2+ ions (Kd = 1.68 µM) was 32-fold tighter than that of the wild type protein, suggesting that the Mg2+ coordination is significantly altered in the D8N mutant.
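As a consistency check on the ITC parameters, the standard binding free energy can be recomputed from each reported Kd through ΔG° = RT ln Kd at 25°C. The Python sketch below is illustrative only and is not part of the original analysis; the function name `delta_g_from_kd`, the 1 M standard state, and the use of the quoted Kd values as inputs are assumptions made for this example.

```python
import math

R = 8.314462618      # gas constant, J/(mol*K)
T = 298.15           # 25 °C, the temperature of the ITC experiments
J_PER_KCAL = 4184.0  # conversion factor

def delta_g_from_kd(kd_molar, temperature=T):
    """Standard binding free energy in J/mol (assumes a 1 M standard state)."""
    return R * temperature * math.log(kd_molar)

# Dissociation constants quoted in the text, in micromolar.
kd_uM = {
    "WT  + Mg2+": 54.0,
    "WT  + Mn2+": 791.0,
    "D8N + Mg2+": 1.68,
    "D8N + Mn2+": 773.0,
}

for label, kd in kd_uM.items():
    dg = delta_g_from_kd(kd * 1e-6)
    print(f"{label}: {dg / 1000:.1f} kJ/mol = {dg / J_PER_KCAL:.1f} kcal/mol")

# Fold differences in affinity referred to in the text.
print(f"WT Mn2+/Mg2+ Kd ratio: {791.0 / 54.0:.1f}")
print(f"WT/D8N Mg2+ Kd ratio:  {54.0 / 1.68:.1f}")
```

Under these assumptions the recomputed values match the quoted ΔG° magnitudes (5.8, 4.2, 7.9, and 4.2) when expressed in kcal/mol rather than kJ/mol, so the unit of the free energies may be worth checking against Table 2. The Kd ratios reproduce the roughly 14-fold and 32-fold affinity differences stated above.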
Metal-binding stoichiometry could also be calculated from the same ITC data. The binding stoichiometry of Mg2+:Bha_Cas2-WT was ~1:2 (n = 0.48 ± 0.001), or one Mg2+ per Bha_Cas2-WT dimer, presumably coordinated by the Asp 8 pair (Fig. 4 and Table 2). This coordination would require the Asp 8 pair to move much closer toward each other from the conformation in the crystal structure (10.7 Å apart), which may explain why Mg2+ binding was not captured in our crystal structure. The binding stoichiometry of Mn2+:Bha_Cas2 was ~1.4:2 (n = 0.69 ± 0.001), which is consistent with a 50% mixture of one- and two-metal ion coordination. Assuming that 1:2 stoichiometry leads to productive catalysis, this would explain why the DNase activity of Bha_Cas2 is weaker in the presence of Mn2+. On the other hand, the binding stoichiometry of both Mg2+ and Mn2+ for the Bha_Cas2-D8N mutant was changed to 1:1, suggesting that two metal ions were coordinated by the Asn 8 pair in the D8N mutant. This change of divalent metal coordination could explain why the Cas2-D8N mutant displayed much weaker dsDNA nuclease activity. Taken together, our ITC data suggest that the Bha_Cas2 protein coordinates one metal ion in the active site to catalyze the endonucleolytic cleavage of dsDNA.
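The conversion from the fitted ITC stoichiometry n (metal ions per Cas2 monomer) to metal ions per dimer, and to an apparent mixture of one- and two-metal dimers, can be sketched as follows. This is a simplified two-state illustration, not the authors' fitting procedure: it assumes that every dimer binds either exactly one ion (n = 0.5) or exactly two ions (n = 1.0), and the function names are hypothetical.

```python
def metals_per_dimer(n_per_monomer):
    """Metal ions bound per Cas2 dimer, given n per monomer."""
    return 2.0 * n_per_monomer

def two_metal_fraction(n_per_monomer):
    """Fraction of dimers holding two ions under the two-state assumption.

    Linear interpolation between n = 0.5 (all one-metal dimers) and
    n = 1.0 (all two-metal dimers), clamped to the range [0, 1].
    """
    fraction = (n_per_monomer - 0.5) / 0.5
    return max(0.0, min(1.0, fraction))

# Fitted n values quoted in the text for the wild-type protein,
# plus the 1:1 stoichiometry reported for the D8N mutant.
for label, n in [("WT + Mg2+", 0.48), ("WT + Mn2+", 0.69), ("D8N", 1.0)]:
    print(f"{label}: {metals_per_dimer(n):.2f} ions per dimer, "
          f"two-metal fraction {two_metal_fraction(n):.2f}")
```

Under this assumption, n = 0.48 corresponds to essentially one Mg2+ per dimer, n = 0.69 to about 1.4 Mn2+ per dimer with somewhat under half of the dimers carrying two ions, and n = 1.0 to two ions per dimer for the D8N mutant.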

Endonuclease Activity in CRISPR Cas2 Protein
Binding of the Catalytic Metal Ion Was pH-dependent-A sharp pH dependence in nuclease activity was observed in the nuclease assay (Fig. 1G). In particular, the activity of Bha_Cas2 was shown to decrease steeply when the pH dropped below 6.0. This parallels the ITC observation that when the metal titration was carried out at pH 6.0 instead of the pH 7.5 used above, Bha_Cas2 no longer showed appreciable affinity for Mg2+ and Mn2+ (supplemental Fig. S5). The strong correlation suggests that the loss of dsDNase activity at low pH is very likely the result of the loss of the catalytic metal ion at the active site. This points to the possible existence of a pH-dependent conformational change that switches on Bha_Cas2 to chelate a metal ion for catalysis. Because Bha_Cas2 was only crystallizable at pH 6.0 or lower, we were only able to reveal the "nonproductive" conformation in the crystal structure. Jumping the solution pH to 7.0 before crystal freezing did not result in significant conformational changes, presumably due to crystal lattice trapping (data not shown).
Another piece of evidence pointing to the existence of a different conformational state in Bha_Cas2 during catalysis came from the observation that the N-terminal SUMO-tagged Bha_Cas2, although still capable of dimerization (supplemental Fig. S1), had drastically reduced DNase activity, despite the fact that the ~11-kDa SUMO tag was located opposite from the active site and was therefore unlikely to sterically interfere with DNA binding. Our interpretation is that this distal SUMO tag causes an allosteric inhibition that prevents Bha_Cas2 from adopting the productive conformation for catalysis.

dsDNase Activity in Other Cas2 Proteins-Cas2 protein was previously characterized as a ssRNase (27), whereas here we identified dsDNase activity in the Bha_Cas2 protein. To investigate whether dsDNase activity may be present in other Cas2 proteins, we carried out nuclease activity survey experiments on the T. thermophilus Cas2 (Tth_Cas2) protein (Fig. 5).
The results showed predominant dsDNase activity in Tth_Cas2, along with rather weak ssRNase activity (Fig. 5, A and B). Similar to Bha_Cas2, activity in Tth_Cas2 is metal- and pH-dependent, although Mn2+ is slightly favored over Mg2+, and the final product size is slightly bigger, averaging ~170 bp. These results suggest that dsDNase activity is present in Cas2 proteins from at least two different CRISPR subtypes.

Conserved Active Site Configuration and Envisioned Conformational Changes During Catalysis-Despite sequence variations and distinct enzymatic activities among Cas2 proteins, the …
[Figure 5 legend, partial] … 7). B, Tth_Cas2 also contains weak endoribonuclease activity against ssRNA in a sequence-nonspecific manner. Increasing concentrations of Tth_Cas2 caused increasing degradation of two different ssRNA substrates. The alkaline hydrolysis ladder for the second RNA substrate is shown in lane 5. C, comparison of the proposed substrate recognition loops from Bha_Cas2, Sso_Cas2, and Tth_Cas2, indicating that Tth_Cas2 lacks the extensive substrate recognition loops found in Bha_Cas2 and Sso_Cas2, which may explain its promiscuous enzymatic activity (see "Discussion" for details). D, open substrate binding pocket in Tth_Cas2.
Several observations, including the correlated strong pH dependence of the dsDNase activity and of divalent metal ion binding at the active site of Bha_Cas2, the non-optimal metal binding site in the low-pH Bha_Cas2 crystal structure, and the allosteric inhibition by SUMO tagging, all point to the possible presence of a pH-dependent conformational change that enables Bha_Cas2 to bind the active site metal to cleave dsDNA. The conformational change likely involves a rigid-body hinge motion between the two Cas2 protomers that brings the Asp 8 pair closer to coordinate a single metal ion, and a hint of such motion has been observed when Bha_Cas2 is crystallized under different conditions (Fig. 2C). One of the future goals is to capture the productive Bha_Cas2 conformation from an alternative crystal form in neutral or basic pH conditions and/or in the presence of its dsDNA substrate.
Rationalization of Different Enzymatic Activities among Cas2 Proteins and Their Functional Implications-Although the overall structures of Bha_Cas2 and Sso_Cas2 are quite similar, one is characterized as a dsDNase, and the other is characterized as an ssRNase. Localized structural differences in the α2-β4 and β1-α1 loops above the active site pocket may predetermine their different substrate preferences. The α2-β4 loop in the Sso_Cas2 structure, which contains important residues for ssRNA recognition (27,47), is significantly longer and more flexible than in the Bha_Cas2 structure (Fig. 3D). By contrast, the β1-α1 loop is larger and more flexible in Bha_Cas2 and Dvu_Cas2 than in Sso_Cas2. The net result is a deeper and narrower substrate binding groove in Sso_Cas2, which is more appropriate for ssRNA binding, and a wider and shallower binding groove in Dvu_Cas2 and Bha_Cas2, which is better tuned for dsDNA binding (Fig. 3E) (37). Using the lengths of the α2-β4 and β1-α1 loops as an indicator, the annotated Cas2 proteins can be roughly divided into two groups, presumably with different enzymatic activities. There are, however, some Cas2 proteins that do not belong to either class. For example, the T. thermophilus Cas2 (Tth_Cas2) structure contains an open cleft atop the active site and lacks extensive structures in either of the substrate recognition loops (Fig. 5, C and D). Is Tth_Cas2 an ssRNase, a dsDNase, or both? Our results showed that its biochemical activity agrees more with that of Bha_Cas2 than with that of Sso_Cas2, which is consistent with our observation that the active site conformation in Tth_Cas2 is more similar to that of Bha_Cas2 (Fig. 5A).
The DNase activity in Cas2 agrees well with its essential function in the new spacer acquisition process, which involves the dicing of foreign dsDNA into short fragments (proto-spacers) and the insertion of these into the CRISPR loci as new spacers. This process minimally requires Cas1 and Cas2 proteins (9,13), and both are metal-dependent dsDNases. Neither protein produces dsDNA fragments as short as those found in the CRISPR spacers; it is therefore conceivable that these two proteins function in a concerted fashion to generate proto-spacers of the right size. This scenario agrees well with a recent study showing that the combined action of Cas1 and Cas2 extends the CRISPR region in E. coli (35).