In Vitro Reconstitution of an Escherichia coli RNA-guided Immune System Reveals Unidirectional, ATP-dependent Degradation of DNA Target*

Background: The CRISPR-Cas immune system protects E. coli against invasive DNA. Results: We have reconstituted this immune system in vitro using recombinant proteins. Conclusion: Degradation of invasive DNA is tightly regulated and unidirectional. Significance: Reconstitution provides an invaluable tool for understanding the CRISPR-Cas immune system. Many prokaryotes utilize small RNA transcribed from clustered, regularly interspaced, short palindromic repeats (CRISPRs) to protect themselves from foreign genetic elements, such as phage and plasmids. In Escherichia coli, this small RNA is packaged into a surveillance complex (Cascade) that uses the RNA sequence to direct binding to invasive DNA. Once bound, Cascade recruits the Cas3 nuclease-helicase, which then proceeds to progressively degrade the invading DNA. Here, using individually purified Cascade and Cas3 from E. coli, we reconstitute CRISPR-mediated plasmid degradation in vitro. Analysis of this reconstituted assay suggests that Cascade recruits Cas3 to a single-stranded region of the DNA target exposed by Cascade binding. Cas3 then nicks the exposed DNA. Recruitment and nicking are stimulated by the presence, but not hydrolysis, of ATP. Following nicking and powered by ATP hydrolysis, the concerted actions of the helicase and nuclease domains of Cas3 proceed to unwind and degrade the entire DNA target in a unidirectional manner.

interact with Cascade in the absence of target DNA, and both its nuclease and helicase activities are essential for the interference phase of Type I CRISPR immunity (17).
Biochemical analysis of E. coli Cas3, and consequently the E. coli Type I-E system, has been hindered by the inability to generate recombinant Cas3 (14,16,17). To resolve this, Westra et al. (17) observed that Cas3 and the CasA subunit of Cascade occur as fusion proteins in three species (Streptomyces sp. SPB78, Streptomyces griseus, and Catenulispora acidiphila DSM 44928) and then produced chimeric Cas3 fused to CasA by a linker sequence identical to that of S. griseus. This chimeric protein was expressed and purified with the other subunits as part of a Cascade-Cas3 fusion complex. This complex could degrade target DNA both in vitro and in vivo (17). However, the vast majority of Type I systems contain stand-alone Cas3, and it is unknown if there are mechanistic differences between the fused and stand-alone systems.
Here, we describe the purification of stand-alone E. coli Cas3 and subsequent reconstitution of the bona fide E. coli Type I-E system. We show that transition metal ions, in particular cobalt, stimulate the nuclease activity of Cas3 and that Cas3 is recruited by Cascade to DNA targets that contain both protospacer and PAM sequences. We also find that, unlike the Cascade-Cas3 fusion, the stand-alone proteins degrade a linear DNA target. Finally, we mapped the sites of DNA target cleavage and determined that degradation by Cas3 is unidirectional.

EXPERIMENTAL PROCEDURES
Cloning and Mutagenesis-The genes encoding E. coli Cas3 and high temperature protein G (HtpG) were amplified from genomic DNA (American Type Culture Collection) and directionally cloned into pMAT and pRSFDuet-1 (Novagen), respectively. pMAT was engineered by inserting DNA encoding maltose-binding protein into the SpeI site of pHAT4 (18). QuikChange site-directed mutagenesis (Stratagene) was used to create point mutants. Plasmid targets were prepared by cloning synthetic oligonucleotides carrying the appropriate sequence into pBAT4 (18). Primers and oligonucleotides are listed in Table 1. All clones were verified by DNA sequencing.
Protein Expression and Purification-E. coli Cascade, CasA, and a subcomplex of Cascade lacking the CasA subunit (CasBCDE) were expressed and purified as described previously (12). E. coli Cas3 was overexpressed in the T7Express strain of E. coli (New England Biolabs). Cells were grown at 20°C to an A600 of ~0.3, at which point protein expression was induced with 0.2 mM isopropyl-β-D-thiogalactopyranoside. After overnight growth, the cells were harvested, lysed in buffer L (20 mM Tris-HCl, pH 8.0, 100 mM NaCl, and 10% glycerol), clarified by centrifugation, and loaded onto a 5-ml immobilized metal affinity chromatography column (Bio-Rad). The column was washed consecutively with buffer L supplemented with 5 mM imidazole and then 1 M NaCl. The remaining bound proteins were eluted with buffer L supplemented with 250 mM imidazole. The sample was directly loaded onto a HiLoad 26/60 S200 size exclusion column (GE Healthcare) pre-equilibrated in buffer A (20 mM Tris-HCl, pH 8.0, 200 mM NaCl, and 1 mM dithiothreitol). Fractions containing Cas3 were pooled and desalted into buffer B (20 mM Tris-HCl, pH 8.0, and 200 mM NaCl). The N-terminal His6-maltose-binding protein tag was removed by overnight treatment with tobacco etch virus protease at 4°C. The cleaved sample was then flowed through an immobilized metal affinity chromatography column, concentrated, and loaded onto a HiLoad 26/60 S200 size exclusion column pre-equilibrated with buffer B. Purified Cas3 was concentrated to ~5 μM, flash-frozen, and stored at −80°C. The D75A and D452A Cas3 mutants were co-expressed with HtpG in T7Express cells and purified like the wild-type protein.
Preparation of Synthetic DNA Targets-PAGE-purified oligonucleotides (Table 1) were 5′-labeled with [γ-32P]ATP (PerkinElmer Life Sciences) using T4 polynucleotide kinase (New England Biolabs). Duplexes were formed by mixing the target and non-target strands, heating at 95°C for 2 min, and then cooling to room temperature over 2 h. DNA ladders were prepared using a Sanger sequencing kit (Affymetrix).
Reconstitution Assay-Reactions were performed in buffer containing 5 mM HEPES, pH 7.5, and 60 mM KCl. The indicated amounts of divalent metal ions, target DNA, Cascade, Cas3, and ATP were assembled together and incubated at 37°C for 30 min or the indicated duration. All reactions were terminated by the addition of 20 mM EDTA. The range in divalent metal ion concentrations was chosen based on their estimated cellular concentrations (19-21). Proteins were removed by phenol extraction. Plasmid DNA was analyzed by electrophoresis through 1% agarose gels and ethidium bromide staining. Labeled synthetic DNA was analyzed by electrophoresis through 10% polyacrylamide gels and autoradiography.
ATPase Assay-ATP hydrolysis by Cas3 was monitored using an NADH-coupled ATPase assay as described previously (22). Two reaction mixtures were prepared, each in 10 mM HEPES, pH 7.5, 60 mM KCl, and 10% glycerol. Mixture A contained 0.5 mM NADH, 4 mM ATP, and 20 mM MgCl2 as well as 4 nM DNA where indicated. Mixture B contained 6 mM phosphoenolpyruvate (Sigma-Aldrich) and 0.4 units/μl pyruvate kinase/lactate dehydrogenase (Sigma-Aldrich) as well as 40 nM Cascade and/or 200 nM Cas3 where indicated. Mixtures A and B were incubated separately at 37°C for 10 min before equal volumes of both were mixed to initiate a 100-μl reaction. Absorbance at 340 nm was measured every 30 s for 10 min. The rate of NADH oxidation was calculated from the linear decrease in A340. All reactions were performed at 37°C.
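The conversion from a linear A340 decrease to an ATP hydrolysis rate can be sketched as follows. This is an illustrative calculation, not the authors' analysis script: it assumes the commonly used NADH extinction coefficient of 6220 M−1 cm−1 at 340 nm, a 1-cm path length (plate readers usually need a path-length correction), and one NADH oxidized per ATP hydrolyzed. The function names and the example trace are hypothetical.

```python
"""Sketch: convert a linear A340 decrease into an ATP hydrolysis rate."""

# Molar extinction coefficient of NADH at 340 nm (commonly used value).
NADH_EXTINCTION_M_CM = 6220.0
# Assumption: 1 cm optical path length.
PATH_LENGTH_CM = 1.0


def linear_slope(times, values):
    """Ordinary least-squares slope of values against times."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    return sxy / sxx


def atp_hydrolysis_rate_uM_per_min(times_s, a340):
    """ATP hydrolyzed in uM/min, assuming 1 NADH oxidized per ATP."""
    slope_per_s = linear_slope(times_s, a340)  # negative while NADH is consumed
    molar_per_s = -slope_per_s / (NADH_EXTINCTION_M_CM * PATH_LENGTH_CM)
    return molar_per_s * 1e6 * 60.0


def turnover_per_min(rate_uM_per_min, enzyme_nM):
    """ATP molecules hydrolyzed per enzyme molecule per minute."""
    return rate_uM_per_min * 1000.0 / enzyme_nM


# Hypothetical trace: one reading every 30 s for 10 min (21 points).
times_s = list(range(0, 601, 30))
a340 = [1.555 - 0.00311 * i for i in range(len(times_s))]

rate = atp_hydrolysis_rate_uM_per_min(times_s, a340)
# Mixing equal volumes halves the 200 nM Cas3 of mixture B to 100 nM.
turnover = turnover_per_min(rate, enzyme_nM=100.0)
```

Because equal volumes of mixtures A and B are combined, every component is at half its stated concentration in the final reaction, which is why the enzyme concentration passed to `turnover_per_min` is 100 nM rather than 200 nM.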

Overexpression and Purification of Recombinant E. coli Cas3-To facilitate expression and purification, the gene encoding E. coli Cas3 was cloned with an N-terminal His6-maltose-binding protein tag. The maximum yield of soluble protein (~1 mg of pure protein/liter of culture) was obtained when cultures were grown at 20°C, and expression was induced in early log phase (A600 of 0.3). Cultures grown at higher temperatures or cultures that were induced above an A600 of 0.3 produced little or no soluble Cas3. Tagged protein was purified from clarified cell lysate by nickel affinity and size exclusion chromatographies. Tobacco etch virus protease was added to remove the tag, and untagged protein was isolated by additional nickel affinity and size exclusion steps. Untagged Cas3 eluted from the size exclusion column at the volume expected for a Cas3 monomer and was over 90% pure, as judged by SDS-PAGE and Coomassie staining (Fig. 2A). Mutant variants of Cas3 were produced in a similar manner as wild-type protein, except for co-expression with the chaperone HtpG (23), to compensate for their lower solubility.
Cascade-directed Cleavage of Plasmid DNA by Cas3-The HD domain of Cas3 specifically cleaves ssDNA (13-15). Previously, we have shown that transition metal ions and not magnesium ions activate the nuclease activity of Thermus thermophilus Cas3 (13). Therefore, we tested the nuclease activity of E. coli Cas3 on circular single-stranded DNA (M13 phage) with a selection of divalent metal ions before attempting to reconstitute the activity of the E. coli CRISPR system. Consistent with the results from T. thermophilus, nickel ions stimulated the nuclease activity of the E. coli protein (Fig. 2B). Because magnesium ions are necessary for the ATPase activity of Cas3 (14), magnesium and transition metal ions were included in subsequent reconstitution assays.
To test whether Cascade can direct stand-alone Cas3 to degrade DNA target in a reconstituted system, we incubated Cas3 and Cascade with a plasmid target bearing functional PAM and complementary protospacer sequences. Following incubation, proteins were removed by phenol extraction, and the DNA was analyzed by electrophoresis through agarose gels and ethidium bromide staining. We found that in the presence of ATP, Mg2+, and transition metal ions, in particular Co2+, Cascade directed Cas3 to degrade the plasmid target, as shown by a nonspecific smear of dsDNA products on the agarose gel (Fig. 3A). Reactions containing either Mg2+ or select transition metal ions, but not both, degraded plasmid target to a much lesser extent, consistent with the differing metal ion requirements of the two domains of Cas3 (Fig. 3, A-C) (13,14). Control plasmids lacking a protospacer sequence were not degraded. Target degradation was also ablated when a critical residue in the nuclease active site of Cas3 was mutated (D75A) (Fig. 3B) (13-15, 17).
Previous electrophoretic mobility shift assays have demonstrated that Cascade requires the CasA subunit when binding to dsDNA target (11,12,24). Consistently, a subcomplex of Cascade lacking the CasA subunit (CasBCDE) was unable to direct degradation of the plasmid target. The addition of CasA to the reaction restored this activity (Fig. 3D).
ATP is required for DNA target degradation (17) (Fig. 3). In the absence of ATP, the Cascade-Cas3 fusion was shown to nick the DNA target, and an ATPase-deficient variant of the fusion nicked the target in both the presence and absence of ATP (17). With 50 nM stand-alone Cas3, only 13% of the plasmid target was nicked in the absence of ATP (Fig. 3A). However, when the concentration of Cas3 was varied, nicking activity increased in a concentration-dependent manner (Fig. 3E); 68% of the target was nicked at 300 nM Cas3. When ATP was included in the reaction, target was completely degraded except at the lowest concentrations of Cas3. In addition, the nicking activity of an ATPase-deficient variant (D452A) of Cas3 was stimulated by the presence of ATP (Fig. 3E). In the absence of ATP, the variant Cas3 nicked 55% of the target DNA, but in the presence of ATP, close to 100% of the target DNA was nicked. These data suggest that Cas3 recruitment to target DNA is stimulated by the binding but not the hydrolysis of ATP. To further examine the effects of ATP on Cas3 activity, we monitored target degradation as a function of ATP concentration (Fig. 3F). At high ATP concentrations, we observed a smear on the agarose gel corresponding to degradation products with a wide range of sizes. At lower ATP concentrations, the average product size decreased and spanned a smaller range. These results suggest that the frequency of cutting by the nuclease domain is coupled to the rate of DNA unwinding by the helicase domain.
Cascade Bound to DNA Target Activates the ATPase Activity of Cas3-The helicase domain of S. thermophilus Cas3 harbors both ATP-dependent helicase and ssDNA-dependent ATPase activities (14). Using an NADH-coupled assay (22), we investigated the ATPase activity of E. coli Cas3 by testing the effects of reaction components on the rate of ATP hydrolysis (Fig. 3G). ATPase activity was not stimulated by dsDNA and was stimulated only modestly by ssDNA (~3-fold). The addition of Cascade alone failed to stimulate the ATPase activity, but with the addition of plasmid target, the rate of ATP hydrolysis was stimulated ~44-fold. This stimulation is dependent on base pairing between the crRNA and protospacer sequences because targets lacking a protospacer failed to stimulate the ATPase activity. No ATPase activity was detected with an ATPase-deficient variant (D452A) of Cas3. These results suggest that the ATPase activity of Cas3 is tightly regulated and relies on the recruitment of Cas3 by Cascade to a protospacer.
Degradation of DNA Targets Requires both PAM and Seed Sequences-Mutations in the PAM or seed sequences of DNA targets render cells with an otherwise functional CRISPR system sensitive to phage infection (8). Binding studies revealed that this is a result of the reduced affinity between Cascade and the mutant DNA targets (8,11). To determine if the activity of our reconstituted CRISPR system is also dependent on PAM and seed sequences, we monitored nicking activity on plasmid targets containing point mutations in either the PAM or the protospacer. These reactions were performed in the absence of ATP to avoid smearing of the DNA products on the agarose gels, allowing us to quantify the activity through the ratio of nicked product to negatively supercoiled substrate. Mutations in the PAM sequence abolished target nicking, mutations in the seed sequence reduced nicking activity (particularly at positions 1 and 4), and mutations outside the seed region generally had little to no effect (Fig. 4A). We also tested the ability of these variant targets to activate the ATPase activity of Cas3 (Fig. 4B) and found that the mutations had similar effects on ATPase activation as they had on nicking activity. Altogether, these results establish that the reconstituted assay recapitulates the in vivo dependence on target PAM and seed sequences.
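One way to turn gel band intensities into a nicking readout is sketched below. It is a hypothetical illustration rather than the quantification actually used: it assumes the nicked and supercoiled bands have been integrated by densitometry and expresses activity as the nicked fraction of the two bands, optionally normalized to the wild-type target. The intensity values are invented, and no correction is applied for differences in ethidium bromide staining between topological forms.

```python
"""Sketch: quantify target nicking from two gel band intensities."""


def fraction_nicked(nicked_intensity, supercoiled_intensity):
    """Nicked product as a fraction of nicked plus supercoiled DNA."""
    total = nicked_intensity + supercoiled_intensity
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    return nicked_intensity / total


def relative_activity(variant_fraction, wild_type_fraction, background=0.0):
    """Nicking of a variant target relative to the wild-type target.

    `background` is the nicked fraction in a no-enzyme control lane.
    """
    corrected_wt = wild_type_fraction - background
    if corrected_wt <= 0:
        raise ValueError("wild-type signal must exceed background")
    return (variant_fraction - background) / corrected_wt


# Hypothetical densitometry values in arbitrary units.
wild_type = fraction_nicked(nicked_intensity=680.0, supercoiled_intensity=320.0)
seed_mutant = fraction_nicked(nicked_intensity=170.0, supercoiled_intensity=830.0)
seed_vs_wt = relative_activity(seed_mutant, wild_type)
```

Normalizing to the wild-type lane run on the same gel makes variants comparable across gels with different exposure or staining.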
Cascade Can Direct Cas3 to Degrade Linear DNA, and Degradation Is Unidirectional-To determine if Cascade and stand-alone Cas3 can degrade linear DNA and if degradation proceeds from the protospacer in one or both directions, plasmid targets were linearized using either of two restriction enzymes, KpnI or ScaI. The protospacer is positioned ~3 kb from the 5′-end of the target strand in the KpnI-treated plasmid and ~2 kb away in the ScaI-treated plasmid. After reaction with the reconstituted CRISPR system, the linear KpnI- and ScaI-treated targets were clearly degraded, yielding degradation-resistant products of ~3 and ~2 kb, respectively (Fig. 5A). This pattern of resistance suggests that degradation is unidirectional, initiating in or near the protospacer and proceeding upstream, leaving the downstream DNA intact (Fig. 5A). As observed with negatively supercoiled targets, degradation of linear DNA was also found to be ATP- and Cascade-dependent, and mutation of either the nuclease (D75A) or helicase domain (D452A) of Cas3 ablated this degradation (Fig. 5B).
To investigate if negative supercoiling affects the rate of target degradation by Cascade and stand-alone Cas3, we compared the rates of degradation of negatively supercoiled with linearized plasmid targets (Fig. 5C). Fitting the data to a single-exponential decay yielded observed rate constants (kobs) of 2.92 and 0.66 min−1 for negatively supercoiled and linear target, respectively (Fig. 5C). This suggests that the E. coli CRISPR system prefers negatively supercoiled target to linear target by ~4.5-fold. Consistent with the nuclease assay, both substrates stimulated ATPase activity, but activity with supercoiled target was greater than that of the linearized target by ~2-fold (Fig. 3G).
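A single-exponential fit of this kind can be sketched as follows. The sketch assumes the fraction of intact substrate follows S(t) = exp(−kobs·t) with S(0) = 1 and estimates kobs by least squares on ln S through the origin; the authors' fitting software and any amplitude or offset terms are not specified in the text, so this is a simplified stand-in. The time points are hypothetical, and the synthetic data are generated from the reported rate constants only to show that the fit recovers them.

```python
"""Sketch: estimate k_obs for single-exponential substrate decay."""
import math


def fit_k_obs(times_min, fraction_remaining):
    """Least-squares k for ln S = -k * t, with the line forced through the origin.

    Points with a non-positive fraction are skipped because ln is undefined there.
    """
    pairs = [(t, s) for t, s in zip(times_min, fraction_remaining) if s > 0]
    numerator = sum(t * math.log(s) for t, s in pairs)
    denominator = sum(t * t for t, _ in pairs)
    if denominator == 0:
        raise ValueError("need at least one time point with t > 0")
    return -numerator / denominator


def fold_preference(k_fast, k_slow):
    """Ratio of two observed rate constants."""
    return k_fast / k_slow


# Hypothetical sampling times (min); noise-free data from the reported k_obs values.
times_min = [0.25, 0.5, 1.0, 2.0]
supercoiled = [math.exp(-2.92 * t) for t in times_min]
linear = [math.exp(-0.66 * t) for t in times_min]

k_supercoiled = fit_k_obs(times_min, supercoiled)
k_linear = fit_k_obs(times_min, linear)
preference = fold_preference(k_supercoiled, k_linear)
```

With the reported constants the ratio comes out at roughly 4.4, in line with the approximately 4.5-fold preference stated in the text. A log-linear fit weights late, low-signal time points heavily, so a nonlinear fit on the untransformed data may be preferable for noisy gels.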
Mapping Degradation of Target DNA by Cas3-When Cascade binds to foreign DNA, the crRNA base-pairs to the target strand and displaces the non-target strand. DNA footprinting experiments show that the majority of the protospacer DNA is protected when bound to Cascade except for a 19-base region of the non-target strand (Fig. 1) (6). To determine if Cas3 nicks this accessible region, we performed a reconstitution assay in the absence of ATP, purified the nicked product from an agarose gel, and sequenced it using primers that flanked the protospacer region. A clear interruption in the sequence of the non-target strand was observed, whereas the sequence of the target strand was uninterrupted (Fig. 6A), indicating that nicking occurs in the accessible region of the non-target strand 11 bases from the 3′-end of the PAM. Next, we performed similar experiments sequencing the linear product, enriched in assays containing low concentrations of ATP (Fig. 6A). Again, a clear interruption in the sequence of the non-target strand was observed, 11 bases from the 3′-end of the PAM (Fig. 6A). However, sequence information from the target strand was unreadable in the region of the protospacer, consistent with the presence of multiple cuts in this strand (Fig. 6A).
To map the degradation of target DNA in more detail, we repeated reconstitution assays on synthetic dsDNA targets, one labeled with 32P at the 5′-end of the target strand and the other at the 5′-end of the non-target strand. In the absence of ATP (or in the presence of ATP but using the ATPase-deficient mutant of Cas3, D452A), the target strand was not cleaved, whereas the non-target strand was cut weakly within the protospacer, 7 and 11 bases from the PAM sequence (Fig. 6, B-D). When ATP was included in the reactions, multiple cuts were observed in both strands. In the target strand, cleavage occurred in the region 3′ of the protospacer and in the flanking upstream DNA (Fig. 6, B and D). A similar cleavage pattern was observed for the non-target strand (Fig. 6, C and D). These results reaffirm that degradation of target DNA is unidirectional because we observe no cleavage downstream of the protospacer sequence. The nuclease-deficient mutant (D75A) of Cas3 did not cut the synthetic DNA target. Targets lacking a PAM sequence also failed to be cut by wild-type Cas3.

DISCUSSION
During the interference stage, the E. coli Type I-E system proceeds through the identification and degradation of foreign DNA. Cascade recognizes foreign DNA and then recruits Cas3 for the ATP-dependent degradation of the target. Studies of the E. coli system have greatly increased our understanding of target recognition (6-8, 11, 12, 17, 24). However, the mechanisms underlying Cas3 recruitment and subsequent target degradation are poorly understood. This could be a result of an inability to produce a recombinant form of stand-alone E. coli Cas3 suitable for biochemical analysis. Here, we report the production of stand-alone E. coli Cas3 with which we could reconstitute the E. coli Type I-E system in vitro. Using this in vitro system, we investigate the mechanism of Cas3 recruitment and subsequent target degradation.
Cascade binding to target DNA is a prerequisite for recruitment of Cas3. For Cascade to bind, DNA targets require a protospacer complementary to the crRNA and a PAM (8). Cascade binding generates an R-loop structure in the target DNA that exposes part of the non-target strand (Figs. 1 and 7) (6). Our results indicate that this exposed ssDNA serves as the binding platform for Cas3 and is also the site for the initial nicking of the DNA target (Figs. 6 and 7). Thus, complex formation between Cascade and target DNA provides Cas3 with the ssDNA required both for loading the helicase domain and as the substrate for nicking by the nuclease domain. Additional protein-protein interactions with Cascade, in particular the CasA subunit, may also play a role in recruitment (17). Nicking of target does not require ATP hydrolysis (17) but is stimulated by the presence of ATP (Fig. 3, A and E), probably because ATP binding stimulates recruitment of Cas3. Mutations in the PAM and seed sequence, which reduce the binding affinity of Cascade (8), inhibit the cleavage of DNA target (Fig. 4A). Thus, the nuclease activity of Cas3 is tightly regulated. Only DNA that has been correctly engaged by Cascade and formed an R-loop will be degraded. Similarly, we also find that the ATPase activity is tightly regulated, being significantly activated only in situations where Cascade can form an R-loop with DNA target (Fig. 3G).
Tight regulation is presumably necessary to control the deleterious effects Cas3 could have on the host chromosome or other beneficial DNA within the cell.
Following nicking, further DNA cleavage requires ATP hydrolysis by Cas3 (Fig. 3, A and E), presumably to provide the energy for DNA unwinding, which generates the ssDNA substrate for the nuclease domain (Fig. 7). The coupling of dsDNA unwinding to ssDNA degradation is reminiscent of the mechanism employed by the RecBCD family of enzymes in homologous recombination (25). To further investigate DNA target degradation, we mapped the sites of this ATP-dependent cleavage using labeled synthetic DNA. We found that Cas3 extensively cuts both strands within the protospacer and upstream of the PAM (Fig. 6). This, as well as results from monitoring degradation of linear plasmids (Fig. 5), shows that the progression of target degradation is unidirectional, proceeding only upstream of the protospacer (Fig. 7). Cas3 may also have an active role in recycling Cascade (14) because we also observe cuts in the target strand of the protospacer (Fig. 6), suggesting that the target strand has been unwound from the crRNA (Fig. 7). Consistently, E. coli Cas3 has been shown to harbor ATP-dependent R-loop unwinding activity (16).
FIGURE 7. Binding of Cascade (green) to DNA target displaces the non-target strand of the protospacer. The displaced strand then serves as a binding platform for the recruitment of Cas3 (blue) (i). Once bound, Cas3 nicks the non-target strand in a reaction that is stimulated by the presence of ATP but does not require ATP hydrolysis (ii). ssDNA binding stimulates the ATPase and helicase activity of Cas3, which subsequently translocates in the 3′-5′ direction on the non-target strand, unwinding the DNA target. Unwinding provides the ssDNA substrate for the nuclease domain and probably releases Cascade from the protospacer. Thus, the combined actions of the helicase and nuclease domains of Cas3 degrade the DNA target in a unidirectional manner (iii).
The activities of the two domains of Cas3 are coupled because the helicase domain generates the substrate for the nuclease domain. We monitored degradation of plasmid target as a function of ATP concentration to gain further insight into this coupling (Fig. 3F). The unwinding activity of the helicase domain should increase with ATP concentration. Our results suggest that, under these conditions, the nuclease domain cuts the DNA less frequently, giving rise to products with a wide range of sizes. When the helicase rate is low, as observed with lower ATP concentrations, the nuclease domain makes cuts more frequently, which generates smaller sized products.
Genetic screening and expression experiments have shown that the chaperone HtpG positively modulates E. coli Type I-E resistance by maintaining functional levels of Cas3 (23). Consistent with this, we have shown that R-loop formation by Cascade is sufficient to recruit Cas3 to DNA targets, suggesting that additional factors, such as HtpG, are not essential at this step.
The Cascade-Cas3 fusion has been shown to degrade negatively supercoiled but not relaxed (i.e. nicked or linear) DNA (17). In our reconstituted system, stand-alone Cas3 can degrade both negatively supercoiled and linear DNA (Fig. 5). However, the rate of degradation of negatively supercoiled DNA is greater than that of linear DNA by ~4.5-fold. Negatively supercoiled DNA is probably a better substrate because of the increased energy required to melt the DNA strands over the length of the protospacer in relaxed versus negatively supercoiled DNA (17). Indeed, negative supercoiling stimulates other processes that rely on strand separation, such as RecA-mediated homologous recombination (26). Because the most likely substrate for the Type I CRISPR systems in vivo is negatively supercoiled DNA (17), further analysis of Type I systems, with both fused and stand-alone Cas3, will be needed to understand if there is functional significance to targeting relaxed DNA.
While this manuscript was in preparation, Sinkunas et al. (27) reported the in vitro reconstitution of the Type I-E CRISPR system from S. thermophilus. Like E. coli, this system contains stand-alone Cas3 and Cascade. In agreement with the results reported here, they show that the reconstituted S. thermophilus system is able to cleave linear DNA and that target degradation is unidirectional. They also go on to map the cleavage sites, revealing a pattern similar to that observed in the E. coli CRISPR system. Thus, the molecular mechanisms of the Type I-E CRISPR systems appear conserved.