CRISPR-Cas12a has widespread off-target and dsDNA-nicking effects

Cas12a (Cpf1) is an RNA-guided endonuclease in the bacterial type V-A CRISPR-Cas anti-phage immune system that can be repurposed for genome editing. Cas12a can bind and cut dsDNA targets with high specificity in vivo, making it an ideal candidate for expanding the arsenal of enzymes used in precise genome editing. However, this reported high specificity contradicts Cas12a's natural role as an immune effector against rapidly evolving phages. Here, we employed high-throughput in vitro cleavage assays to determine and compare the native cleavage specificities and activities of three different natural Cas12a orthologs (FnCas12a, LbCas12a, and AsCas12a). Surprisingly, we observed pervasive sequence-specific nicking of randomized target libraries, with strong nicking of DNA sequences containing up to four mismatches in the Cas12a-targeted DNA-RNA hybrid sequences. We also found that these nicking and cleavage activities depend on mismatch type and position and vary with Cas12a ortholog and CRISPR RNA sequence. Our analysis further revealed robust nonspecific nicking of dsDNA when Cas12a is activated by binding to a target DNA. Together, our findings reveal that Cas12a has multiple nicking activities against dsDNA substrates and that these activities vary among different Cas12a orthologs.

sequences called spacers that are flanked by repeat sequences. Spacer sequences are acquired from foreign nucleic acids during infections and serve as memories to defend against future infections (2). The CRISPR locus is transcribed into a long pre-CRISPR RNA (pre-crRNA), which is processed by Cas12a into small mature CRISPR RNA (crRNA) (3). Cas12a uses these crRNAs as guides to bind to complementary "protospacer" sequences within the invading DNA (1). Following DNA binding, Cas12a sequentially cleaves each strand of the DNA using a RuvC nuclease domain, creating a double-strand break that eventually leads to neutralization of the infection (1,4,5). Because the crRNA can be changed to guide Cas12a cleavage at a sequence of interest, Cas12a is easy to repurpose for programmable genome editing and other biotechnological applications (1), similar to the widely used Streptococcus pyogenes Cas9 (SpCas9) (6). Several orthologs, including Francisella novicida Cas12a (FnCas12a), Lachnospiraceae bacterium Cas12a (LbCas12a), and Acidaminococcus sp. Cas12a (AsCas12a), have been used for genome editing (1,7).
The specificity of Cas endonucleases is an important consideration for both CRISPR-Cas immunity and genome editing applications. Whereas SpCas9 has been shown to tolerate mutations in the target sequence (8,9), Cas12a orthologs are reported to be relatively specific (10-12). In searching for potential targets, both Cas9 and Cas12a identify a protospacer-adjacent motif (PAM) (13,14) to initiate R-loop formation with the crRNA-complementary strand of the dsDNA target (15,16). However, they differ in terms of structure and target cleavage mechanisms that may modulate target cleavage specificity (17,18). The first few nucleotides following the PAM sequence (i.e. the PAM-proximal sequence), where DNA-RNA hybrid nucleation begins, are referred to as the "seed" region (19). Mismatches in this region have been reported to be more deleterious for binding and cleavage by Cas12a than by SpCas9 (8,14,19-21). This intrinsic low tolerance for mismatches in the target sequence is a desirable trait for high-fidelity genome editing but raises the question of how Cas12a can provide effective immunity to bacteria against invaders in its native role. Higher mismatch tolerance may limit the ability of phages to escape from CRISPR-Cas immunity via mutations and may provide broader defense against closely related phages (10,22,23).
In addition to its crRNA-guided target cleavage activity, recent reports showed that Cas12a can indiscriminately degrade single-stranded DNA (ssDNA) in trans upon binding and activation by a crRNA-complementary dsDNA or ssDNA (24,25). Structural studies of Cas12a have led to a model for this nonspecific activity in which the RuvC catalytic pocket remains open due to conformational changes following 16-17-nucleotide crRNA-target strand (TS) hybridization and cleavage of at least the nontarget strand (NTS) of the target dsDNA (5,26). The release of PAM-distal products from the complex (21,27) also increases the accessibility of the RuvC domain to nonspecific substrates (5).
In this study, we developed a high-throughput in vitro method to determine the native specificity and cleavage activity of Cas12a orthologs. We show that Cas12a has crRNA-directed sequence-specific nicking activity against target sequences containing up to four mismatches with the guide RNA, where linearization does not always occur. We further show that when activated by target binding, Cas12a displays nonspecific nicking activity in trans against dsDNA, similar to the previously observed activated ssDNA degradation activity (24,25). Activated Cas12a also has weak dsDNA degradation activity for both target and nonspecific DNA. Our results report several cleavage activities of Cas12a, including cis (sequence-dependent or target-dependent) and activated trans (nonspecific or target-independent) dsDNA nicking, and cis and trans dsDNA degradation resulting from extensive nicking activity.

Cleavage activity of Cas12a against a target library
Genome editing activity and specificity of Cas12a have previously been characterized in vivo (10-12). These studies show that Cas12a has low or no tolerance for mismatches in the target sequence in eukaryotic cells. However, the native cleavage specificity of Cas12a remains unclear, given that the eukaryotic genomic structure may sequester potential off-targets (28-30) and that most off-target analyses account only for double-strand breaks (31,32). In vitro analysis reveals that single mismatches in the target sequence slow the rate of R-loop formation and target-strand cleavage by Cas12a (27,33,34), but it remains unclear how multiple types and combinations of mismatches in the target sequence affect the cleavage mechanism.
To directly observe the cleavage activity and specificity of Cas12a, we performed in vitro cleavage assays using a plasmid library followed by high-throughput sequencing analysis (Fig. 1A and Fig. S1). The libraries were designed to contain target sequences with different numbers and combinations of mismatches for a given guide sequence of the crRNA. We used a randomization frequency to generate a pLibrary pool with targets containing two or three mismatches maximally represented in the pool (see "Library creation") (Fig. S2A) (35). The total number of possible sequences is 4^20 (1.1 × 10^12). However, due to the partial randomization, sequences with more than 9 mismatches compose less than 0.1% of the total pool. Therefore, we limited our analysis to target sequences with fewer than 10 mismatches. An extra 0-mismatch "perfect" target was added as an internal control (Fig. S2A). We tested the cleavage activity of three Cas12a orthologs: FnCas12a, LbCas12a, and AsCas12a (collectively referred to as Cas12a hereafter). For each Cas12a ortholog, we used three different crRNA sequences and corresponding negatively supercoiled (nSC) plasmids (see "Plasmid and nucleic acid preparation") containing the perfect target (pTarget) or target libraries (pLibrary) (Fig. S2 (B-D) and Table S1). The three crRNA and library sequences were designed based on the protospacer 4 sequence from the S. pyogenes CRISPR locus (55% G/C), the EMX1 gene target sequence (80% G/C), and the CCR5 gene target sequence (20% G/C), henceforth referred to as pLibrary PS4, EMX1, and CCR5, respectively.
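The library-size figure quoted above can be checked with a short calculation. The sketch below assumes a 20-nucleotide randomized target (implied by the 4^20 figure) and counts, for each mismatch number k, the distinct sequences that differ from the perfect target at exactly k positions. The function and constant names are illustrative and do not come from the original methods; the sketch does not model the randomization frequency, which is not stated in this section.

```python
from math import comb

# Assumption: 20 randomized target positions, as implied by the 4^20 figure.
TARGET_LENGTH = 20


def sequences_with_k_mismatches(k, length=TARGET_LENGTH):
    """Count distinct targets that differ from the perfect target at exactly k positions.

    Choose which k positions are mismatched, then pick one of the three
    non-matching nucleotides at each of those positions.
    """
    return comb(length, k) * 3 ** k


# Summing over every possible mismatch count recovers the full sequence space.
total_sequences = sum(
    sequences_with_k_mismatches(k) for k in range(TARGET_LENGTH + 1)
)

print(f"total sequences: {total_sequences:.1e}")
for k in range(5):
    print(k, sequences_with_k_mismatches(k))
```

The per-k counts describe only how many distinct sequences exist at each mismatch number, not how abundant they are in the pool; abundance depends on the randomization frequency used during library creation.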
The use of negatively supercoiled dsDNA plasmid substrates enables detection of nicked and linear cleavage products resulting from Cas12a cleavage via product migration on agarose gel electrophoresis (36). Cas12a completely linearized all pTarget substrates within the time course tested, indicating complete cleavage of both strands of the dsDNA target (Fig. 1B and Fig. S2E). In contrast, only a fraction of pLibrary was linearized, and some of the plasmid remained supercoiled, indicating the presence of sequences within the pool that could not be cleaved by Cas12a in the time span tested (Fig. 1 (B and C) and Figs. S2E and S3 (A and B)). We also observed a nicked fraction for pLibrary that persisted through the longest time point in the assay (3 h) (Fig. 1 (B and C) and Figs. S2E and S3 (A and B)). To determine whether any cleavage might occur outside of the target region during pLibrary cleavage, we tested the cleavage activity of Cas12a against the empty plasmid backbone with and without the different crRNAs (Fig. S2F). Cas12a orthologs have been reported to have variable crRNA-independent nicking activity (37,38). Similarly, we observed weak, variable amounts of plasmid nicking when incubated with Cas12a in the absence of crRNA (Fig. 1B and Fig. S2E). To prevent this crRNA-independent nicking, we incubated Cas12a with excess crRNA in all cleavage assays (see "In vitro cleavage assay"). For most Cas12a-crRNA pairs, we observed minimal nicking of the empty plasmid backbone until the 3 h time point. However, more nicking occurred at the shorter time points for AsCas12a-CCR5 crRNA. Although some level of background nonspecific nicking was observed for all crRNAs tested, further analysis indicates that pLibrary nicking is, in part, sequence-dependent (see below).
Interestingly, we observed variable cleavage rates and patterns for the three pLibraries and Cas12a orthologs (Fig. 1C and Fig. S3 (A and B)). Both FnCas12a and LbCas12a cleaved most pLibraries with a relatively high efficiency, resulting in substantial depletion of supercoiled DNA to ~10% of the total DNA at the longest time point (3 h). One notable exception is pLibrary EMX1, for which FnCas12a showed relatively weak cleavage compared with LbCas12a (Fig. S3A). AsCas12a showed the least amount of overall cleavage, with a relatively large fraction of pLibrary remaining in the supercoiled pool for all three libraries, suggesting that a large fraction of sequences remained uncleaved by this ortholog.
To determine which sequences were uncleaved and nicked, we extracted the plasmid DNA from gel bands for each of these fractions, PCR-amplified the target region, and performed high-throughput sequencing (HTS) followed by bioinformatic analysis (Fig. 1A and Fig. S1). Although we could not PCR-amplify and analyze target sequences present in the linearized DNA pool due to the double-strand break generated in the target sequence by Cas12a upon cleavage, we assumed that sequences that were absent from both the supercoiled and nicked pools were linearized. PCR amplification of the target region from the plasmid for high-throughput sequencing facilitates analysis of target sequences; however, it also removes our ability to detect the relative abundance of sequences in the nicked versus supercoiled fraction. Whereas we were unable to sequence the linearized pool, the analysis of target sequences in the nicked and supercoiled pool provides an understanding of the mismatch tolerance of Cas12a (see below).
EDITORS' PICK: Cas12a nickase activities
To adjust for the abundance of sequences present in the supercoiled or nicked pool prior to PCR amplification, we normalized counts obtained from our high-throughput sequencing data using the fraction of DNA that was present in a given pool at a given time point (see "HTS analysis"). We normalized abundance at a given time point relative to the abundance in the original library, enabling analysis of the relative depletion of sequences from the supercoiled pool and enrichment of sequences within the nicked pools over time (Fig. 1 (D and E) and Figs. S3C and S4). In addition, we plotted mismatch distribution curves for the normalized abundance of sequences present in the supercoiled and nicked pools to analyze the proportion of sequences with a given number of mismatches in each pool at each time point (Fig. S5).
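The two normalization steps described above can be sketched as follows. This is one plausible reading of the text, not the authors' pipeline: it assumes that each sequence's read frequency within a pool is scaled by the gel-quantified fraction of DNA in that pool and then divided by the sequence's frequency in the original library. The exact procedure is given in "HTS analysis"; all names and the toy counts below are hypothetical.

```python
def normalized_abundance(pool_counts, library_counts, pool_fraction):
    """Return per-sequence abundance in a pool relative to the original library.

    pool_counts    -- read counts per sequence in the supercoiled or nicked pool
    library_counts -- read counts per sequence in the uncleaved original library
    pool_fraction  -- fraction of total DNA in this pool at this time point
                      (assumed to come from gel quantification)
    """
    pool_total = sum(pool_counts.values())
    library_total = sum(library_counts.values())
    if pool_total == 0 or library_total == 0:
        raise ValueError("both pools must contain at least one read")

    result = {}
    for sequence, library_count in library_counts.items():
        if library_count == 0:
            continue  # a sequence absent from the library cannot be normalized
        pool_frequency = pool_counts.get(sequence, 0) / pool_total
        library_frequency = library_count / library_total
        # Scale by how much DNA the pool holds, then compare to the starting library.
        result[sequence] = pool_frequency * pool_fraction / library_frequency
    return result


# Toy example with made-up counts: two sequences equally abundant in the library.
library = {"seqA": 50, "seqB": 50}
supercoiled = {"seqA": 10, "seqB": 90}
values = normalized_abundance(supercoiled, library, pool_fraction=0.5)
print(values)
```

Under this reading, a value that falls over time in the supercoiled pool indicates depletion (cleavage), and a value that rises in the nicked pool indicates enrichment (nicking).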
Depletion of target sequences from the supercoiled pool indicates cleavage, and enrichment in the nicked pool indicates nicking by Cas12a. The decrease of target sequences present in the nicked pool over time implies linearization of these sequences. As expected, the perfect target (0 mismatch) was quickly depleted from the supercoiled pool (Fig. 1D and Fig. S3 (C and D)). Whereas the perfect target sequence was present in the nicked pool at earlier time points, over time it was depleted from the pool, indicating linearization (Fig. 1E and Fig. S3 (E and F)). All Cas12a orthologs also cleaved target sequences with multiple mismatches (i.e. target sequences containing mismatches with the guide RNA) based on the depletion of sequences with 1-4 mismatches from the supercoiled pool over time (Fig. 1D and Fig. S3 (C and D)). Single-mismatch sequences were depleted from the supercoiled pool at similar rates to the perfect target sequence by LbCas12a for pLibrary EMX1 and CCR5 (Fig. S3 (C and D)), but slightly more slowly for other orthologs and pLibraries (Fig. 1D and Fig. S3 (C and D)). Single-mismatch sequences were also generally depleted more slowly from the nicked pool than the perfect target (Fig. 1E and Fig. S3 (E and F)), indicating a decreased rate of linearization of single-mismatch targets by Cas12a. For the A/T-rich pLibrary CCR5, all three orthologs displayed the fastest rate of depletion of sequences with 2-4 mismatches (Fig. S3D). Target sequences with two, and in some cases with three, mismatches entered and were subsequently depleted from the nicked pool, suggesting that Cas12a can linearize these sequences with reduced kinetics for the second cleavage step (Fig. 1E and Fig. S3 (E and F)).
Interestingly, target sequences with five or more mismatches were also depleted from the supercoiled pool by FnCas12a and LbCas12a for all three pLibraries (Fig. 1 (D and E) and Figs. S3 (C and D) and S4). Correspondingly, we observed an increase for these sequences in the nicked pool over time (Fig. 1E and Figs. S3E and S4). In contrast, AsCas12a showed minimal depletion of sequences with 5-10 mismatches from the supercoiled pool (Fig. S4); although these sequences were enriched in the nicked fraction, they remained a relatively small proportion of the overall library even at late time points (Figs. S4 and S5). Importantly, unlike the target sequences with 1-4 mismatches, target sequences with five or more mismatches were generally cleaved at almost the same rate by all Cas12a orthologs (Fig. S4). These results suggest that sequences with large numbers of mismatches may have been nicked through a nonspecific nicking activity that is more pronounced for LbCas12a and FnCas12a than for AsCas12a (see below).

Sequence determinants of Cas12a cleavage activity
We next looked at the sequences that were present in the supercoiled and nicked pools to determine the effects of mismatch position and type (Fig. S1). The heatmaps in Fig. 2 and Figs. S6 and S7 show the relative abundance of target sequences containing 1-6 mismatches (MM) with all possible nucleotides at each position of the sequence in the supercoiled fraction, traced over time (see "HTS analysis"). Although target sequences with a single mismatch (1 MM) were quickly depleted from the supercoiled fraction (Fig. 2 and Figs. S6 and S7), some sequences with mismatches in the PAM-proximal "seed" region were depleted slowly, consistent with the expected importance of this region for R-loop formation and Cas12a-binding affinity. We observed a short seed region of ~6 nucleotides for most pLibraries and Cas12a orthologs, when compared with the ~10-nucleotide length for SpCas9 (8,14,20). This is also in agreement with previously reported in vivo and in vitro specificity studies on Cas12a (10-12,19). A single C substitution in the seed region is highly deleterious for cleavage by Cas12a, whereas G or A mismatches slowed the rate of cleavage or depletion from the supercoiled fractions to a lesser degree (Fig. 2 and Figs. S6 and S7). In contrast, target sequences with a single T mismatch in the seed are generally tolerated for cleavage outside of the first PAM-proximal position for most pLibraries. Outside of the seed, most target sequences containing a single mismatch were rapidly depleted from the supercoiled pool, indicating that any type of single mismatch outside the seed can be similarly tolerated. However, in some cases, single mismatches in the target sequence slowed the rate of cleavage regardless of mismatch position based on a slower decrease in abundance relative to other Cas12a orthologs, particularly for FnCas12a cleavage of pLibrary PS4 and both FnCas12a and LbCas12a cleavage of pLibrary EMX1 (Fig. 2 and Fig. S6).
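Tabulating mismatches by position and substituted nucleotide, as the heatmaps do, can be illustrated with a small helper. The sketch below is a hypothetical illustration rather than the authors' analysis code. It assumes that sequences are written with position 1 at the PAM-proximal end and uses the ~6-nucleotide seed length reported above; the example sequences are arbitrary and are not the PS4, EMX1, or CCR5 targets.

```python
# Assumption: seed spans the first six PAM-proximal positions (the ~6 nt noted in the text).
SEED_LENGTH = 6


def mismatch_profile(perfect, target, seed_length=SEED_LENGTH):
    """List (position, substituted_nucleotide, in_seed) for each mismatch.

    Positions are 1-based, with position 1 assumed to be PAM-proximal.
    """
    if len(perfect) != len(target):
        raise ValueError("sequences must have the same length")
    profile = []
    for position, (expected, observed) in enumerate(zip(perfect, target), start=1):
        if expected != observed:
            profile.append((position, observed, position <= seed_length))
    return profile


# Arbitrary 20-nt example: a C substitution at position 3 (seed)
# and a T substitution at position 15 (outside the seed).
perfect_target = "ACGTACGTACGTACGTACGT"
library_member = "ACCTACGTACGTACTTACGT"
profile = mismatch_profile(perfect_target, library_member)
print(profile)
```

Grouping library members by the length of this profile gives the 1 MM, 2 MM, and higher classes, and grouping by position and nucleotide gives the rows and columns of a heatmap of the kind described.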
A similar seed-dependent cleavage trend was observed for target sequences with 2, 3, and 4 mismatches (2 MM, 3 MM, and 4 MM, respectively) (Fig. 2 and Figs. S6 and S7). In general, C substitution is most deleterious for cleavage at all positions in the target sequence with two and three mismatches. A and G substitutions also slowed the rate of cleavage by Cas12a, particularly in pLibrary CCR5 (Fig. S7). For LbCas12a, most target sequences with two and three mismatches outside the seed region were eventually depleted in all three pLibraries. All three Cas12a orthologs tolerate up to three mismatches in the PAM-distal region for pLibrary CCR5 (Fig. S7), resulting in strong depletion of these sequences from the supercoiled fraction (Fig. S3D). For sequences with five and six mismatches, we observed a steady depletion of all sequences regardless of mismatch position or type for FnCas12a and LbCas12a (Fig. 2 and Figs. S6 and S7). Together with the observation that sequences with 5-10 mismatches are depleted from the supercoiled fraction at the same rate (Fig. S4), these data suggest that Cas12a may cleave highly mismatched sequences present in pLibrary in a sequence-independent manner.

Sequence determinants of Cas12a nickase activity
To determine target sequences that are preferentially nicked by Cas12a, we performed analysis on the nicked pool similar to the supercoiled pool (Fig. S1). The heatmaps in Fig. 3 and Figs. S8 and S9 show the relative abundance of target sequences containing 1-6 MM with all possible nucleotides at each position of the sequence in the nicked fraction, traced over time (see "HTS analysis").
Cas12a linearized most single-mismatch (1 MM) sequences within the time frame tested (3 h) (Fig. 3 and Figs. S8 and S9). Whereas LbCas12a rapidly linearized target sequences with most single mismatches, FnCas12a and AsCas12a displayed slower kinetics for double-stranded cleavage for some single-mismatch sequences in pLibrary PS4 (Fig. 3). Although they were a relatively small proportion of the overall library (Fig. S5), we observed enrichment of single-mismatch sequences when the mismatch was located either in the seed or in the region where the nontarget strand is cleaved (positions 16-18 from the 5′ end of the target on the nontarget strand) (1,33) (Fig. 3 and Figs. S8 and S9). Similarly, target sequences with two and three mismatches (2 MM and 3 MM, respectively) in the seed and PAM-distal region were enriched in the nicked fraction at later time points, especially for FnCas12a and AsCas12a (Fig. 3 and Figs. S8 and S9). For LbCas12a, these two- and three-mismatch targets in all pLibraries were eventually depleted, indicating that these sequences were linearized (Fig. 3 and Figs. S8 and S9). For AsCas12a cleavage of PS4 and EMX1 pLibraries, some target sequences with two and three mismatches in the PAM-distal region remained enriched throughout the time course (Fig. 3 and Fig. S8), indicating that mismatches in the PAM-distal region block the second cleavage step for this Cas12a ortholog.
FnCas12a and LbCas12a can nick target sequences with four or more mismatches, observed as an increase in abundance of these target sequences for all pLibraries tested (Fig. 3 and Figs. S8 and S9). Similar to the supercoiled fraction, target sequences with five and six mismatches were uniformly enriched in the nicked fraction irrespective of the mismatch position or type for most pLibraries and Cas12a orthologs (Fig. 3 and Figs. S8 and S9). However, AsCas12a did not nick target sequences with four or more mismatches for pLibraries PS4 and EMX1 (Fig. 3 and Fig. S8) but exhibited weak nicking of these target sequences in pLibrary CCR5 (Fig. S9).

Effect of mismatch location and separation on Cas12a cleavage activity
The heatmaps presented in Figs. 2 and 3 and Figs. S6-S9 show the overall effects of multiple mismatches in the target sequences. In these heatmaps, the effect of a single mismatch on Cas12a cleavage can be easily identified. We next generated heatmaps to show the effects of two mismatches on Cas12a cleavage as a function of mismatch location and distance between the mismatches (see "HTS analysis"). The total number of unique sequences with two mismatches in our pLibraries is 1,710, but the number of possible position pairs at which two mismatches can occur is 190 (Fig. S10). Cas12a can cleave most targets with two mismatches, based on depletion of these sequences from the supercoiled pool over time (Fig. 4). Whereas target sequences with two mismatches separated by a shorter distance were depleted slowly from the supercoiled pool, sequences with one mismatch in the seed region were also enriched in the supercoiled pool irrespective of the distance from the other mismatch, highlighting the effect of a mismatch in the seed (Fig. 4 and Fig. S10). Target sequences with two mismatches in the G/C-rich EMX1 pLibrary were depleted slowly by all three Cas12a orthologs compared with the other two pLibraries (Fig. 4B). Interestingly, FnCas12a can tolerate most double mismatches in the A/T-rich CCR5 pLibrary irrespective of the distance between the mismatches except when one mismatch is in the seed. In contrast, LbCas12a and AsCas12a are not tolerant of sequences with mismatches separated by a distance of 5 or fewer bases in pLibrary CCR5 (Fig. 4C).
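The two counts quoted above follow from simple combinatorics, which the sketch below reproduces by enumeration. It assumes a 20-nucleotide target with three possible substitutions at each mismatched position, and it uses a 6-nucleotide seed (the approximate length reported earlier) to flag pairs that include a seed position. Variable names are illustrative only.

```python
from collections import Counter
from itertools import combinations

TARGET_LENGTH = 20  # assumption: 20 randomized positions
SEED_LENGTH = 6  # assumption: approximate seed length reported in the text
SUBSTITUTIONS_PER_POSITION = 3  # any nucleotide other than the matching one

# Every way to choose two mismatched positions (1-based, first < second).
position_pairs = list(combinations(range(1, TARGET_LENGTH + 1), 2))

# Each position pair can carry 3 x 3 different substitutions.
unique_sequences = len(position_pairs) * SUBSTITUTIONS_PER_POSITION ** 2

# Number of position pairs at each separation distance.
pairs_by_distance = Counter(second - first for first, second in position_pairs)

# Pairs with at least one mismatch in the seed; because first < second,
# it is enough to check the first position.
pairs_with_seed_mismatch = sum(
    1 for first, _ in position_pairs if first <= SEED_LENGTH
)

print(len(position_pairs), unique_sequences)
print(sorted(pairs_by_distance.items()))
print(pairs_with_seed_mismatch)
```

Because the number of position pairs shrinks as the separation grows (19 pairs at distance 1, a single pair at distance 19), trends at large distances rest on fewer sequences than trends at short distances.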
Cas12a strongly nicks sequences with two mismatches and in some cases linearizes them, as observed by initial enrichment and subsequent depletion of sequences (Fig. 4). Among the three Cas12a orthologs tested, LbCas12a displays the most linearization, whereas AsCas12a has the strongest sequence-specific nicking activity against target sequences with two mismatches. Interestingly, for the nicked pool, we observed less accumulation of two-mismatch sequences containing mismatches in the seed region for AsCas12a than for FnCas12a and LbCas12a (Fig. 3 and Figs. S8 and S9). Consistently, we do not see representation of target sequences with a mismatch in the seed (i.e. when the mismatches are separated by a distance of 16 or more) in the nicked pool (Fig. 4 and Fig. S10). It is possible that these sequences are linearized without accumulation of the nicked intermediate, as these sequences are depleted from the supercoiled pool (Fig. 4).
We observed an initial, overall enrichment of target sequences with two mismatches separated by a distance of 13 or more due to one of the mismatches being in the seed (Fig. 4 and Fig. S10). To further study the bias and effect of a seed mismatch, we generated similar heatmaps for target sequences with two mismatches in the seed or outside the seed region. As expected, most double mismatches in the seed region are deleterious for Cas12a (Fig. S11). These sequences were slowly depleted from the supercoiled pool and appeared in the nicked pool over time. This indicates that Cas12a mostly nicks target sequences with two mismatches in the seed.
Cas12a can tolerate most target sequences with two mismatches outside the seed, seen as a depletion from the supercoiled pool over time for all pLibraries tested (Fig. S11, D-F). We observed that mismatched sequences were more rapidly depleted from the supercoiled pool from A/T-rich pLibrary CCR5 by all three Cas12a orthologs compared with the other two pLibraries (Fig. S11, D-F). Cas12a strongly nicks sequences when the two mismatches outside the seed are closer together, seen as the enrichment of these sequences in the nicked pool. LbCas12a consistently linearizes sequences with two mismatches from all three pLibraries. Double mismatches separated by a distance of 10 or more are more rapidly linearized and depleted from the nicked pool by all three Cas12a orthologs. These observations also agree with sequence-specific nicking activity reported for Cas12a (39).

Cas12a has nonspecific activated nicking and dsDNA degradation activity
Our HTS data suggest that Cas12a can nick, and in some cases linearize, sequences with several mismatches. To validate this observation, we selected sequences from pLibrary PS4 that were relatively enriched in the nicked fraction at the longest time point (3 h). We cloned target sequences containing 2-8 mismatches and individually tested sequence-specific nicking activity of Cas12a against each plasmid. Consistent with our HTS results, we observed varying degrees of sequence-specific nicking and linearization of target sequences containing 2, 3, and 4 MM for different Cas12a orthologs (Fig. 5A). To compare among the Cas12a orthologs, we quantified the supercoiled, linearized, and nicked fractions at the 3 h time point (based on the time frame used for the pLibrary cleavage) for the perfect and mismatched target sequences (Fig. 5B and Fig. S12 (B and C)). FnCas12a and AsCas12a showed higher sequence-specific nicking activity, whereas LbCas12a partially linearized most targets. All three Cas12a orthologs strongly nicked one of the three-mismatch target sequences (3.2 MM) tested, with no or low partial linearization of the mismatched target. The distinct patterns of nicking observed for targets with different mismatches indicate that the nicking is target sequence-specific and is not a result of nonspecific, background nicking of the plasmid backbone by Cas12a-crRNA (Fig. S2F). These data, together with our HTS data, suggest that LbCas12a has strong linearization activity against target sequences with multiple mismatches, whereas FnCas12a and AsCas12a can only nick these target sequences.
Surprisingly, sequences with greater than five mismatches were not nicked by Cas12a even after 5 h of incubation, although these sequences were enriched in the nicked pool of our HTS data (Figs. 1E and 5B and Figs. S3E, S4, and S12). As noted above, the number of mismatches did not affect the rate of cleavage for target sequences with five or more mismatches (Fig. S4), and heatmaps for target sequences with five and six mismatches indicated that enrichment of these sequences in the nicked pool was not sequence-specific (Fig. 3 and Figs. S8 and S9). Although we observed similar levels of nonspecific cleavage of the empty plasmid backbone by all three Cas12a orthologs (Fig. S2F), sequences with five or more mismatches were depleted from the supercoiled pool far faster for FnCas12a and LbCas12a than for AsCas12a (Fig. S5). Together, these observations suggest that an additional nonspecific cleavage activity occurred during pLibrary cleavage. This led us to hypothesize that Cas12a may have nonspecific, target-activated nicking activity against dsDNA substrates with low or no homology to the crRNA. In the pLibrary cleavage assays, the mixed pool of sequences contains the perfect target sequence, which may activate Cas12a for nonspecific nicking activity (24,25). To test this, we used a short dsDNA oligonucleotide activator that was fully complementary to the crRNA to activate Cas12a. We formed a complex containing Cas12a, crRNA, and the dsDNA activator and tested for nonspecific cleavage activity against empty nSC plasmid that lacked a target sequence. Surprisingly, we observed robust nonspecific, trans nicking and partial linearization of the empty negatively supercoiled plasmid by FnCas12a and LbCas12a (Fig. 6, top panels). This nicking activity is readily detected using negatively supercoiled dsDNA substrates, where a single nick in the DNA is sufficient to cause a large shift in mobility of the DNA (36).
The activated nicking activity is comparable with Cas12a ssDNase activity, with dsDNA nicking and ssDNA degradation observed in the same time frame (Fig. 6, bottom panels). Over time, FnCas12a and LbCas12a degraded the ssDNA, and similarly, further nicking events in the dsDNA eventually resulted in linearization. A discrete linear band initially appeared in the agarose gels, indicating nicking of both strands in close proximity. However, over time, multiple cleavage events, either along the length of the plasmid or from the freed ends, lead to degradation. The observation that trans-activated Cas12a could degrade linearized dsDNA suggests that degradation should also be observed for plasmids containing a target that can activate Cas12a in cis. We therefore tested cleavage of pTarget by Cas12a and performed the cleavage assay for longer time points (up to 24 h) to determine whether degradation eventually occurs. Negatively supercoiled pTarget was linearized quickly, likely due to specific cleavage within the target region. Following linearization, pTarget was slowly and processively degraded after 4 h of incubation with LbCas12a or FnCas12a, indicating that these Cas12a orthologs can be activated for dsDNA degradation by a cis activator (Fig. S13A). In contrast, pTarget remained uncleaved by all three Cas12a orthologs programmed with a noncognate crRNA and in the absence of an activator.
We next tested the types of trans activators that can trigger dsDNA nicking by Cas12a. Like the nonspecific ssDNase activity (24,25), Cas12a can be activated by crRNA-complementary ssDNA binding in a PAM-independent, RuvC domain-dependent manner for trans, nonspecific dsDNA nicking and degradation (Figs. S13 (B-D) and S14A). We also tested whether mismatched target sequences that were present in the pLibrary cleavage assays could act as activators for nonspecific nicking activity. Interestingly, Cas12a was activated by some of these mismatched targets as well, especially those that were partially linearized by Cas12a (Fig. S14B). However, mismatched targets that were only nicked by Cas12a, like an activator based on mismatched pTarget 2.1 MM (Fig. 5B and Fig. S12 (B and C)) were weak activators, indicating that double-strand cleavage of the mismatched target activator may be important for nonspecific activated nicking of dsDNA.
Whereas both FnCas12a and LbCas12a have strong nonspecific nicking activity at lower concentrations (20 nM), AsCas12a is not strongly activated as a nickase at this concentration, although it does display activated ssDNase activity (24) (Fig. 6 and Fig. S13). The HTS data indicated that target sequences containing more than four mismatches were modestly enriched in the nicked fraction by AsCas12a (Figs. 1E and 3 and Figs. S4, S8, and S9). Indeed, we observed a stronger nonspecific activated nicking by AsCas12a at higher concentration (100 nM), at which the pLibrary cleavage assays were performed (Fig. S14C) (see "In vitro cleavage assay"). AsCas12a is also reported to have slower rates of PAM-distal product release after cleavage of a target (5,27). The reduced nonspecific activity may be a result of the cleaved target products hindering the RuvC domain from accessing dsDNA substrates. Some conformational changes or lack thereof may also not allow for nonspecific activated nicking of dsDNA substrates by AsCas12a.
The nonspecific nicking observed for plasmid substrates may be due to the partially single-stranded nature of negatively supercoiled DNA. To determine whether negative supercoiling is required for activated cleavage, we tested plasmid substrates in three different forms: nSC, linear, and nicked. Whereas activated Cas12a robustly nicked negatively supercoiled DNA, resulting in linearization and eventual degradation, nicked dsDNA was only weakly linearized and degraded (Fig. S15). This activity was also reproducible with commercially available enzymes and other crRNA-activator pairs, although the activation of the trans-nicking activity varied with the crRNA-activator pair (Fig. S15). These results suggest that nicking depends on the supercoiled state of the DNA, whereas degradation may require exposed termini.

Discussion
Cas12a has become a widely used tool for various biotechnological applications such as genome editing and diagnostics (7,24,40). Several reports show that Cas12a and engineered orthologs are highly specific for RNA-guided dsDNA cleavage activity (10-12). Despite these studies on Cas12a specificity, the cleavage activity and specificity outside of a eukaryotic setting remain unclear. The apparent high specificity of Cas12a in genome-editing studies is at odds with its natural role as an immune system effector. Phages evolve rapidly and can escape from CRISPR-Cas immunity via mutations (22,23). The high specificity of Cas12a may also limit targeting of closely related phages (41).
Here we show that Cas12a has additional sequence-specific and nonspecific activated dsDNA-nicking and degradation activities apart from previously described crRNA-mediated cis cleavage of dsDNA targets (1), activated trans cleavage of nonspecific ssDNA substrates (24,25), and RNA-independent dsDNA nicking and ssDNA degradation (37). Our results demonstrate that Cas12a can nick and, in some cases, create double-strand breaks in targets with up to four mismatches. Similarly, a recent study by Fu et al. (39) demonstrated that Cas12a and Cas9 have sequence-dependent nicking activity against targets with one or two mismatches. We also establish that Cas12a has nonspecific dsDNA-nicking activity upon binding to a crRNA-complementary DNA. While this manuscript was in preparation, a complementary study reported similar observations, demonstrating that these activities are reproducible in vitro (42).
Cas12a has a single active site, and it cleaves the dsDNA target in a sequential order with NTS nicking followed by TS cleavage (5,21). As a result, Cas12a cleaves the NTS faster than the TS (26,33). In contrast, double-stranded cleavage by Cas9 is coordinated simultaneously by two different nuclease domains, although studies suggest that target sequences with mismatches result in differential rates of TS and NTS cleavage that may lead to a nicked intermediate (39,43,44). Similarly, the difference in rates of strand cleavage by Cas12a could potentially be amplified by the presence of mismatches in the target sequence, resulting in a nicked intermediate. Consistent with previous reports, we observed that single and multiple PAM-distal mismatches slow the linearization of the target sequence by Cas12a (33,39). Target sequences with single mismatches reduce binding affinity and slow the rate of R-loop formation by Cas12a, which should result in a reduced rate of NTS cleavage. However, the observation of nicked intermediates suggests that specific types of mismatches may cause slower cleavage of the TS following NTS cleavage, resulting in sequence-specific nicking. In this case, TS cleavage may be the rate-limiting step for double-stranded break formation. It is also possible that the target dissociates following NTS cleavage due to R-loop collapse, as mismatches in the PAM-distal region have previously been shown to accelerate the rate of target dissociation (33). Alternatively, the TS may not be presented to the RuvC domain and/or the conformational change required for TS cleavage may not occur. Biochemical studies have revealed that TS recognition is a prerequisite for NTS and ssDNA cleavage (24).
The release of the PAM-distal products upon cleavage exposes the active RuvC domain that can accept substrates for nonspecific trans cleavage activities (5,21,27). Consistently, we observed that certain mismatched targets can only be nicked by Cas12a, and these natural mismatched substrates are also ineffective activators of the nonspecific nicking activity of Cas12a, presumably due to failed cleavage and release of PAM-distal TS products. Targets with multiple mismatches also cause reduced binding affinity and defects in R-loop formation (21,34), resulting in slower cleavage of both strands.
Cas9 cleavage specificity is not only dictated by position-specific mismatches but is also defined by the target sequence (20,45). Here, by testing three different Cas12a orthologs against three pLibraries, we show the effects of target crRNA sequence and different nucleotide substitutions across the target region in varying mismatch combinations on Cas12a cleavage activity. All three Cas12a orthologs can tolerate most types of PAM-distal mismatches in A/T-rich target sequences in pLibrary CCR5, where these sequences are quickly depleted from the supercoiled pool. However, almost all target sequences with one, two, and three mismatches in G/C-rich pLibrary EMX1 had slower cleavage rates, where we observed accumulation of nicked products. We also observe that LbCas12a tolerates the most mismatches and is likely to produce double-strand breaks in targets with mismatches, whereas FnCas12a and AsCas12a may be more prone to nicking these target sequences.
A recent study demonstrated that topology plays a role in Cas12a-mediated target DNA cleavage (46). Similarly, we observed different rates of nonspecific activated nicking and degradation of different forms of DNA. Nicking by activated Cas12a was easily detected for negatively supercoiled DNA. Cas12a is likely to mainly encounter negatively supercoiled DNA in both prokaryotic and eukaryotic cells, suggesting that the activated trans cleavage could enable dsDNA nicking in vivo. Negatively supercoiled DNA may undergo relaxation and breathing, exposing ssDNA. We also observe slow degradation of linear dsDNA by activated Cas12a, where the linear ends of dsDNA may fray (47,48). In both cases, the exposed ssDNA regions are likely degraded via the ssDNase activity of Cas12a (24,25). With pre-nicked dsDNA, we observed slow and direct degradation by activated-Cas12a rather than an intermediate linear product. This may be attributed to decreased breathing and exposure of ssDNA sites. It remains unclear how negatively supercoiled plasmids are nicked multiple times, leading to linearization; however, multiple events of the trans nicking activity may cause the DNA to eventually fall apart (42). This suggests that a cumulative effect of all of the trans activities of Cas12a eventually leads to complete degradation of nucleic acid substrates (Fig. 7).
Cas12a has been successfully used for gene editing in vivo without any deleterious off-target effects (10-12,49,50). Our high-throughput pLibrary cleavage analysis indicates that Cas12a can bind and cleave sequences with multiple mismatches in vitro. Although we cannot establish with certainty how and what target sequences ended up in the nicked pool, our mismatch and position analysis along with the individual 2-4-mismatch target sequence cleavage assays strongly suggests that target sequences with 1-4 mismatches are nicked in a sequence-dependent manner. In vivo, nicked DNA may lead to genomic changes as DNA repair machinery can be recruited at these sites (51-53). However, chromatin structure may prevent nicks, and nicks may be repaired by error-free DNA repair pathways (54). The cellular context also plays a role in Cas effector binding and cleavage. SpCas9 can bind to targets containing several mismatches, depending on DNA breathing and supercoiling, both in vitro and in vivo (55,56). Cas12a can stably bind to some targets with mismatches in vitro (27,34), but in vivo studies suggest low or no off-target binding (57). This could reflect the inability of Cas12a to unwind and bind DNA in varying topological and cellular contexts, which may result in overall lower off-target editing rates by Cas12a.
Whereas off-target sites can be predicted and avoided by careful design of crRNAs, the robust nonspecific activated nicking activity we observed in vitro may lead to off-target editing as nicked DNA can result in indels as discussed above. Although there are a limited number of target sites in the genome where Cas12a can be activated for nonspecific nicking, it is unclear how long Cas12a remains bound to a target. Prolonged binding could result in nicking of proximal DNA sequences. The unpredictable nature of the nonspecific activated nicking makes it difficult to detect the outcomes of nicking. In addition, the commonly used methods to verify off-target editing do not detect nicks (31,32), meaning that detection of potential off-target effects due to nonspecific nicking would require whole-genome sequencing or specialized detection methods (58,59). Use of Cas12a orthologs, such as AsCas12a, that display reduced nonspecific nickase activity may reduce these unpredictable effects during genome-editing experiments. Notably, our in vitro specificity analysis also suggests that AsCas12a is less prone to creating double-strand breaks at sites with two or three mismatches, suggesting that this ortholog may be less prone to off-target, sequence-specific nicking and cleavage at highly homologous sites.
The activities reported in our study add to the growing number of target sequence-specific and sequence-nonspecific Cas12a cleavage activities (Fig. 7) and may provide a possible explanation of how Cas12a compensates for its highly specific targeted cleavage activities as an immune effector. The sequence-specific nicking, nonspecific activated nicking, and degradation activities, along with previously described dsDNA cleavage (1) and trans ssDNase activity (24,25), could allow Cas12a to mount a strong defense against different types of invading phages. In the event of phage evolution via mutations, Cas12a may tolerate some mutations and still nick or fully cleave phage DNA via the sequence-specific nicking activity. Cas12a could also be activated by the evolved/mutated target region of the phage DNA, enabling nonspecific activated nicking and degradation activities.
Whereas our in vitro results indicate that Cas12a has robust nicking activities, it remains uncertain whether this activity occurs in vivo. During phage infection, multiple copies of phage DNA are injected into the bacterial cell, each carrying targets that can activate Cas12a. The effective activated-Cas12a concentration in the bacterial cell is therefore relative to the level of infection occurring at a given time in the cell. It is also interesting to note that recent studies reported several Cas12 proteins with strong nickase activities, even against perfectly matching target DNA (60,61). Similarly, the recent discovery of the Can1 nickase that is activated by signaling molecules produced by type III Cas effectors upon target recognition suggests that targeted nicking or target-activated nicking may be a common mechanism among CRISPR-Cas systems to slow phage replication (62). The nonspecific nature of the nicking and degradation activities may also be harmful to the host bacteria. In type III and type VI CRISPR-Cas systems, Cas nucleases can be activated for nonspecific cleavage of RNA (4). Perhaps the nonspecific nicking activity is a means to prevent phage proliferation by initiating programmed cell death, abortive infection, or dormancy to save the bacterial population (63,64). Further studies are required to investigate the cost-benefit relation of such nonspecific activities of Cas12a to the bacteria.

Cas12a cloning
The gene sequences for FnCas12a and AsCas12a were synthesized as Escherichia coli codon-optimized gBlocks (purchased from Integrated DNA Technologies) and inserted into pSV272 using Gibson assembly (New England Biolabs) as per the manufacturer's protocol to generate a protein expression construct encoding Cas12a fused with an N-terminal His6 sequence, a maltose-binding protein (MBP), and a tobacco etch virus (TEV) protease cleavage site. The AsCas12a clone inadvertently carried two mutations, P874G and G1292V, which we confirmed did not affect the activity of this protein (data not shown). Experiments shown in Fig. 6 and Fig. S14C were performed with WT AsCas12a. LbCas12a was expressed using expression plasmid pMAL-his-LbCpf1-EC, which was a gift from Jin-Soo Kim (Addgene plasmid 79008; RRID:Addgene_79008) (10). Catalytically inactive (pre-crRNA-processing and DNase-dead) Cas12a mutants were generated via site-directed mutagenesis or "round-the-horn" PCR and verified by Sanger sequencing (Eurofins Genomics, Louisville, KY) (see Table S1 for site-directed mutagenesis primers).

Cas12a expression and purification
All Cas12a proteins were expressed in E. coli BL21 (DE3) cells. 2× TY broth supplemented with corresponding antibiotics was inoculated with overnight cultures of cells in a 1:100 ratio. Cultures were grown to an optical density (600 nm) of 0.5-0.6 at 37°C, and protein expression was induced by the addition of isopropyl 1-thio-β-D-galactopyranoside to a final concentration of 0.2 mM. The incubation was continued at 18°C overnight.
FnCas12a was purified by the following protocol adapted from a previous method (65) to initially test purification of the protein. Cells were resuspended in Lysis Buffer II (20 mM Tris-HCl, pH 8.0, 500 mM NaCl, 10 mM imidazole, and 10% glycerol) supplemented with phenylmethylsulfonyl fluoride. Cells were lysed by sonication or a homogenizer, and the lysate was centrifuged to remove insoluble material. The clarified lysate was applied to a HisPur™ Ni-NTA resin (Thermo Fisher Scientific) column. The column was washed with 10 column volumes of Lysis Buffer, and bound protein was eluted in Elution Buffer II (Lysis Buffer II + 250 mM imidazole final concentration). The elution was concentrated and run on a HiLoad 16/600 Superdex 200 gel filtration column (GE Healthcare) pre-equilibrated with SEC Buffer A (20 mM Tris-HCl, pH 8.0, and 500 mM NaCl). Fractions containing His6-MBP-tagged Cas12a were collected and treated with TEV protease in a 1:100 (w/w) ratio, overnight at 4°C. Samples were reapplied to HisPur™ Ni-NTA resin (Thermo Fisher Scientific) to remove the His-tagged TEV, free His6-MBP, and any remaining tagged protein. The flowthrough was collected, concentrated, and further purified by using a HiLoad 16/600 S200 gel filtration column in SEC Buffer B (20 mM Tris-HCl, pH 8.0, 200 mM KCl, and 1 mM EDTA). Peak fractions were combined, concentrated, flash-frozen in liquid nitrogen, and stored at −80°C until further use.
After further optimization of the purification protocol, LbCas12a, AsCas12a, and all Cas12a mutants were purified using a modified protocol adapted from a previous method (66). Cells were harvested by centrifugation, and the cell pellet was resuspended in Lysis Buffer I (20 mM Tris-HCl, pH 8.0, 500 mM NaCl, 5 mM imidazole), supplemented with protease inhibitors phenylmethylsulfonyl fluoride, cOmplete™ Protease Inhibitor Mixture Tablet, or Halt Protease Inhibitor Mixture. Cells were lysed by sonication or a homogenizer, and the lysate was centrifuged to remove insoluble material. The clarified lysate was applied to a HisPur™ Ni-NTA resin (Thermo Fisher Scientific) column. The column was washed with 10 column volumes of Wash Buffer (Lysis Buffer + 15 mM imidazole final concentration), and bound protein was eluted in Elution Buffer I (20 mM Tris-HCl, pH 8.0, 500 mM NaCl, 250 mM imidazole). Fractions containing Cas12a were pooled, and TEV protease was added in a 1:100 (w/w) ratio. The sample was dialyzed in Dialysis Buffer (10 mM HEPES-KOH, pH 7.5, 200 mM KCl, 1 mM DTT) at 4°C overnight. For further purification, the protein was diluted 1:1 with 20 mM HEPES-KOH (pH 7.5) and loaded on a HiTrap Heparin HP (GE Healthcare) column. The column was washed with Buffer A (20 mM HEPES-KOH, pH 7.5, 100 mM KCl) and eluted with Buffer B (20 mM HEPES-KOH, pH 7.5, 2 M KCl) by applying a gradient from 0 to 50% over a total volume of 60 ml. Peak fractions were analyzed by SDS-PAGE, fractions containing Cas12a were combined, and DTT was added to a final concentration of 1 mM. The protein was fractionated on a HiLoad 16/600 Superdex 200 gel filtration column (GE Healthcare), eluting with SEC buffer (20 mM HEPES-KOH, pH 7.5, 500 mM KCl, 1 mM DTT). Peak fractions were combined, concentrated, flash-frozen in liquid nitrogen, and stored at −80°C until further use.
Commercially available Cas12a orthologs were purchased from New England Biolabs (EnGen Lba Cas12a (Cpf1)) and Integrated DNA Technologies (Alt-R A.s. Cas12a (Cpf1)).
To generate a pool of sequences containing mismatches (i.e. target sequences containing mismatches with the guide RNA), the library was partially randomized (35). The following probability distribution function was used to determine the randomization/doping frequency,
$$P(n, L, f) = \frac{L!}{n!\,(L-n)!}\, f^{n}\,(1-f)^{L-n}$$
where P is the fraction of the population, L is the sequence length, n is the number of mutations/template, and f is the probability of mutation/position (doping level or frequency). The number of different mutation combinations ($MM_c$) for a given n and L, regardless of the doping level/frequency, is determined by the following.
$$MM_c = \frac{L!}{n!\,(L-n)!}\, 3^{n}$$
For example, for a 20-nucleotide target (L = 20), the total number of unique target sequences with a single mismatch is 60, the number with two mismatches is 1,710, and the number with three mismatches is 30,780. The three library sequences tested were a modified protospacer 4 sequence from the S. pyogenes CRISPR locus (55% GC), EMX1 gene target sequence (80% GC), and CCR5 gene target sequence (20% GC) (see Table S1 for target sequence). A randomization/doping frequency (f) of 15% was selected to maximize the fraction of library sequences containing two or three mismatches.
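The library design arithmetic above can be checked with a short Python sketch. It assumes that the doping model is a binomial distribution over positions, that each mismatched position can take any of the three non-target bases, and that the randomized target region is 20 nucleotides long (inferred from the quoted count of 60 single-mismatch sequences). The function names are illustrative and are not taken from the authors' scripts.

```python
from math import comb

def mismatch_fraction(n, L=20, f=0.15):
    """Assumed binomial model: fraction of the library carrying exactly
    n substitutions in a length-L target at doping frequency f."""
    return comb(L, n) * f**n * (1 - f) ** (L - n)

def mismatch_combinations(n, L=20):
    """Number of distinct sequences with exactly n mismatches, assuming
    three alternative bases at each mismatched position."""
    return comb(L, n) * 3**n

# Tabulate the number of unique sequences and the expected library fraction.
for n in range(6):
    print(n, mismatch_combinations(n), round(mismatch_fraction(n), 3))
```

Under these assumptions, the counts for one, two, and three mismatches reproduce the values quoted in the text, and a doping frequency of 15% places the bulk of the library at two or three mismatches.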

Plasmid and nucleic acid preparation
DNA oligonucleotides were synthesized by Integrated DNA Technologies or Thermo Scientific. All RNAs and single-stranded library oligonucleotides with 15% doping frequency in the target region were ordered from Integrated DNA Technologies. Sequences of DNA oligonucleotides and RNA used are listed in Table S1.
The oligonucleotides (for the libraries, target, and mismatched targets) were diluted to 0.2 µM in 1× NEBuffer 2. pUC19 vector was amplified via PCR to insert homology arms, followed by DpnI digestion and PCR cleanup (Promega Wizard SV Gel and PCR Clean-Up System). 30 ng of PCR-amplified pUC19, 5 µl of oligonucleotide (0.2 µM), and double-distilled H2O to bring the volume to 10 µl were mixed with 10 µl of 2× NEBuilder HiFi DNA Assembly Master mix (New England Biolabs) and incubated at 50°C for 1 h. 2 µl of the assembled product was transformed into NEB Stable Competent E. coli as per the manufacturer's protocol. After the recovery step, all of the cells in the outgrowth medium were used to inoculate 50 ml of lysogeny broth supplemented with ampicillin and incubated overnight at 37°C. Cells were cooled on ice before harvesting for the plasmid library (pLibrary) extraction using Qiagen Plasmid Midi Kit. All of the initial steps (lysis to neutralization) for plasmid extractions (pTarget, pLibrary, and empty plasmid) were performed on ice with minimum mechanical stress to ensure that the plasmid retained its supercoiled state following purification. Plasmids were stored as aliquots that were used for up to 10 freeze-thaw cycles. Three different pLibrary assembly reactions and preparations were used for the three replicates of the high-throughput in vitro cleavage assays.
Target plasmids and empty pUC19 were linearized by restriction enzyme digestion using BsaI-HF and nicked using a nicking enzyme Nt.BspQI (New England Biolabs). All restriction digestion and Gibson assembly reactions were carried out as per the manufacturer's protocols. All sequences were verified by Sanger sequencing (Eurofins Genomics). The topology of the extracted and restriction-digested plasmids was verified on an agarose gel before using in cleavage assays.

In vitro cleavage assay
The protocol was adapted from methods described previously (67). Briefly, Cas12a:crRNA complex was formed by incubating Cas12a and crRNA (1:1.5 ratio) in 1× CutSmart buffer (50 mM potassium acetate, 20 mM Tris acetate, 10 mM magnesium acetate, 100 µg/ml BSA, pH 7.9) and 5 mM DTT at 37°C for 10 min. For activator-mediated cleavage assays, Cas12a, crRNA, and dsDNA oligonucleotide activator (1:1.5:1.5) were incubated at 37°C for 10 min. Cleavage reactions were initiated by mixing the Cas12a complex with pTarget, pLibrary, or empty plasmid (150 ng) or M13mp18 ssDNA (250 ng) (New England Biolabs) and incubating at 37°C. 10-µl aliquots were drawn from the reaction at each time point and quenched with phenol-chloroform. The aqueous layer was extracted and separated on a 1% agarose gel via electrophoresis and stained with SYBR safe or RED safe stain for dsDNA and SYBR gold for ssDNA for visualization. Excess crRNA was used in cleavage assays to prevent any crRNA-independent cleavage activity (37,38). For library and mismatched target plasmid cleavage assays, 100 nM Cas12a and 150 nM crRNA were used. For activator-mediated cleavage assays, 20 nM Cas12a, 30 nM crRNA, and 30 nM activator DNA oligonucleotide were used. pLibrary, pTarget, and empty pUC19 were used at 150 ng per 10-µl reaction (8.6 nM), and M13mp18 ssDNA was used at 250 ng per 10 µl (10.6 nM).
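The substrate molarities quoted above can be approximated from mass, volume, and length. The following Python sketch is an illustration only: it assumes average molar masses of about 650 g/mol per base pair for dsDNA and about 325 g/mol per nucleotide for ssDNA, and it uses the lengths of unmodified pUC19 (2,686 bp) and M13mp18 (7,249 nt). The actual pTarget and pLibrary plasmids carry an insert, so their exact molarity would differ slightly.

```python
def molar_concentration_nM(mass_ng, volume_ul, length, g_per_mol_per_unit):
    """Convert a DNA mass in a given volume to a molar concentration (nM).

    length is in bp (dsDNA) or nt (ssDNA); g_per_mol_per_unit is the
    assumed average molar mass per bp or per nt.
    """
    grams = mass_ng * 1e-9
    liters = volume_ul * 1e-6
    moles = grams / (length * g_per_mol_per_unit)
    return moles / liters * 1e9

# Assumed values: pUC19 backbone length and ~650 g/mol per bp.
plasmid_nM = molar_concentration_nM(150, 10, 2686, 650)
# Assumed values: M13mp18 length and ~325 g/mol per nt.
m13_nM = molar_concentration_nM(250, 10, 7249, 325)

print(f"plasmid: {plasmid_nM:.1f} nM, M13mp18 ssDNA: {m13_nM:.1f} nM")
```

With these assumed constants the estimates agree with the 8.6 nM and 10.6 nM values stated in the protocol.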

Library preparation for HTS
The library plasmid cleavage products were run on an agarose gel as described above to separate the cleaved (linear and nicked) and uncleaved (supercoiled) products. The bands from the nicked and supercoiled fractions from various time points were excised and gel-purified using the QIAquick Gel Extraction Kit (Qiagen). PCR was used to add Nextera adapters, followed by another round to add unique indices/barcodes for each sample. Samples were cleaned using the QIAquick PCR Purification Kit (Qiagen). Although the plasmid DNA conformation plays a role in PCR quantification (68), the nicked and supercoiled DNA yielded PCR products when using primers that flanked the randomized target region in the pLibraries (see Table S1 for Nextera adapter primers). PCR product size and quality were verified using the DNA 1000 kit and Agilent 2100 Bioanalyzer. Samples were sent to the Iowa State DNA Facility or Admera Health, LLC (South Plainfield, NJ) for MiSeq or NextSeq sequencing with paired-end reads of 75 or 150 cycles. Because of the low sequence diversity of the pLibraries, 15% PhiX was added for HTS.

HTS analysis
HTS data were obtained as compressed fastq files and were processed with custom bash scripts (see associated GitHub repository; 10.5281/zenodo.3712356). A simple workflow of the analysis is described in Fig. S1. Briefly, the files were renamed based on the sample information (pLibrary name, replicate, and Cas12a ortholog) and stored in separate folders identified by the library, and the target sequences were extracted. Bash scripts were used for obtaining the counts of the extracted target sequences, determining the number of mismatches, calculating the fractions in each replicate, and preparing summary tables for total counts of each mismatched target sequence. Once all of the processing was done on the command line, data were imported into Microsoft Excel for plotting and summarizing.
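The counting step described above can be illustrated with a minimal Python sketch. The authors processed the data with custom bash scripts; the code below is not their implementation, and the reference sequence and reads are hypothetical placeholders. It assumes that extracted target sequences have the same length as the perfect target and that mismatches are scored as a Hamming distance.

```python
from collections import Counter

def count_mismatches(seq, reference):
    """Hamming distance between an extracted target and the perfect target."""
    if len(seq) != len(reference):
        raise ValueError("sequence and reference must have equal length")
    return sum(a != b for a, b in zip(seq, reference))

def mismatch_fractions(reads, reference):
    """Fraction of reads carrying each number of mismatches.

    Reads whose length differs from the reference are skipped
    (an assumption; indels are not modeled here).
    """
    counts = Counter(
        count_mismatches(r, reference) for r in reads if len(r) == len(reference)
    )
    total = sum(counts.values())
    return {n: c / total for n, c in sorted(counts.items())}

# Hypothetical 10-nt reference and four toy reads.
reference = "ACGTACGTAC"
reads = ["ACGTACGTAC", "ACGTACGTAC", "ACGTTCGTAC", "TCGTACGTAA"]
fractions = mismatch_fractions(reads, reference)
print(fractions)
```

The resulting dictionary maps a mismatch number to its fraction of the reads in one sample, which is the per-replicate quantity the summary tables are built from.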
After imaging the agarose gels with the pLibrary cleavage products (Fig. 1B and Fig. S2E), the plasmid bands were quantified with ImageQuant TL (GE Healthcare) or ImageJ (National Institutes of Health). Intensities of the bands (I) in the uncleaved (supercoiled (SC)) and cleaved fractions (nicked (N) and linearized (L)) were measured. Fractions (FR) cleaved and uncleaved were calculated as follows.
$$FR_C = \frac{I_N + I_L}{I_{SC} + I_N + I_L} \qquad FR_{SC} = \frac{I_{SC}}{I_{SC} + I_N + I_L} \qquad FR_N = \frac{I_N}{I_{SC} + I_N + I_L} \qquad FR_L = \frac{I_L}{I_{SC} + I_N + I_L}$$
The fraction cleaved ($FR_C$), fraction supercoiled (uncleaved) ($FR_{SC}$), fraction nicked ($FR_N$), and fraction linearized ($FR_L$) were determined for each of the time points t, where $I_{SC}$, $I_N$, and $I_L$ were the intensities of the supercoiled, nicked, and linearized bands at that time point.
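The band quantification can be sketched in Python as follows. This assumes that each fraction is the intensity of the corresponding band divided by the summed intensity of the supercoiled, nicked, and linearized bands in the same lane, and that the cleaved fraction is the sum of the nicked and linearized fractions. The function name and the example intensities are hypothetical.

```python
def band_fractions(i_sc, i_n, i_l):
    """Fractions of supercoiled, nicked, linearized, and cleaved DNA in one lane.

    Assumes background-subtracted band intensities on a common scale.
    """
    total = i_sc + i_n + i_l
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return {
        "FR_SC": i_sc / total,          # uncleaved (supercoiled)
        "FR_N": i_n / total,            # nicked
        "FR_L": i_l / total,            # linearized
        "FR_C": (i_n + i_l) / total,    # cleaved = nicked + linearized
    }

# Hypothetical intensities for one time point.
fr = band_fractions(i_sc=50.0, i_n=30.0, i_l=20.0)
print(fr)
```

Computing the fractions per lane rather than against a reference lane keeps the values independent of loading differences between time points, which is one reason to normalize this way.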
The fraction of target sequences containing n mismatches (MM) ($F_{n\text{-MM}}$) in the pool was calculated. $F_{n\text{-MM}}$ was normalized to the fraction (FR) of DNA present in the supercoiled or nicked pool at a given time point t to generate an estimated abundance (EA) of a given set of sequences at a given time point.
$$EA_{n\text{-MM}} = (F_{n\text{-MM}}\ \text{of}\ S\ \text{at}\ t) \times (FR\ \text{at}\ t) \qquad \text{(Eq. 8)}$$
These values were plotted against number of mismatches to generate mismatch distribution curves (Fig. S5). The relative abundance (enrichment and/or depletion) (RA) of a sequence S containing n mismatches at each time point t compared with the pLibraries in the negative control (i.e. pLibrary incubated in reaction conditions for the longest time point, labeled as (−) in all of the gels) was calculated and plotted versus time to generate normalized curves shown in Fig. 1 (D and E) and Figs. S3 (C-F) and S4.
$$RA_S = \frac{EA_{n\text{-MM}}\ \text{of}\ S\ \text{at}\ t}{EA_{n\text{-MM}}\ \text{in the pLibrary}} \qquad \text{(Eq. 9)}$$
For the heatmaps in Figs. 2 and 3 and Figs. S6-S9, the EA of sequences containing a particular nucleotide (N = A, G, C, or T) at a particular position (P = 1-20) for target sequences containing n mismatches was calculated as above. RA was calculated by normalizing EA against the fraction of DNA in the original library to eliminate variability in aberrant nicking that may have occurred for individual pLibraries in the negative control.
$$RA_{S\text{-}NP} = \frac{EA_{n\text{-MM}}\ \text{of}\ S\ \text{with}\ N\ \text{at}\ P\ \text{at}\ t}{F_{n\text{-MM}}\ \text{in the pLibrary}} \qquad \text{(Eq. 10)}$$
Each RA value was normalized to the maximum RA value present in either the supercoiled or nicked pool to scale the relative abundance from 0 to 1 for the heatmaps. For the analysis of target sequences with two mismatches, the sequences with two mismatches were extracted. The distance between the two mismatches and the total counts for sequences separated by that distance were determined. The counts were normalized to the number of possible ways the two mismatches can occur (Fig. S10), and the normalized RA was calculated as described above for heatmaps.
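The normalization steps in Eq. 8 and Eq. 9, the 0-to-1 scaling used for the heatmaps, and the two-mismatch distance correction can be sketched in Python. This is a hedged illustration with hypothetical input values, not the authors' scripts. In particular, the distance correction assumes that the distance is the difference between the two mismatch positions and that the number of possible placements at distance d in a 20-nt target is 20 − d.

```python
def estimated_abundance(f_n_mm, fr_pool):
    """Eq. 8: sequencing fraction scaled by the gel fraction of the pool."""
    return f_n_mm * fr_pool

def relative_abundance(ea_at_t, ea_control):
    """Eq. 9: enrichment or depletion relative to the negative control."""
    return ea_at_t / ea_control

def scale_to_unit(values):
    """Divide by the maximum so heatmap values run from 0 to 1."""
    peak = max(values)
    return [v / peak for v in values]

def placements_at_distance(distance, length=20):
    """Assumed number of position pairs separated by `distance`."""
    if not 1 <= distance < length:
        raise ValueError("distance must be between 1 and length - 1")
    return length - distance

# Hypothetical values: 23% of reads carry n mismatches in both samples,
# 40% of the DNA is nicked at time t and 10% in the negative control.
ea_t = estimated_abundance(0.23, 0.40)
ea_ctrl = estimated_abundance(0.23, 0.10)
ra = relative_abundance(ea_t, ea_ctrl)
print(round(ra, 2), scale_to_unit([1.0, 2.0, 4.0]), placements_at_distance(1))
```

Dividing two-mismatch counts by the number of placements prevents short distances, which can occur at more position pairs, from appearing enriched simply because they are more numerous in the library.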

Data availability
HTS data and processed data files from this study have been deposited in the Iowa State University Library's DataShare and can be found at 10.25380/iastate.8178938. HTS data were processed with custom bash scripts, which can be found at the GitHub repository; 10.5281/zenodo.3712356. All other information and data are available from the authors upon request.