Chemistry of Class 1 CRISPR-Cas effectors: binding, editing, and regulation

Among the multiple antiviral defense mechanisms found in prokaryotes, CRISPR-Cas systems stand out as the only known RNA-programmed pathways for detecting and destroying bacteriophages and plasmids. Class 1 CRISPR-Cas systems, the most widespread and diverse of these adaptive immune systems, use an RNA-guided multi-protein complex to find foreign nucleic acids and trigger their destruction. In this review, we describe how these multisubunit complexes target and cleave DNA and RNA, and how regulatory molecules control their activities. We also highlight similarities and differences to Class 2 CRISPR-Cas systems, which use a single-protein effector, as well as other types of bacterial and eukaryotic immune systems. We summarize current applications of the Class 1 CRISPR-Cas systems for DNA/RNA modification, control of gene expression, and nucleic acid detection.


Introduction
All cells must defend against infection by harmful genetic elements, such as viruses or transposons. Prokaryotes use a multitude of different strategies to combat their viruses, which are called phages. These include, but are not limited to, adsorption and injection blocking, abortive infection, toxin-antitoxin, restriction-modification, and CRISPR-Cas (clustered regularly interspaced short palindromic repeats-CRISPR-associated) systems (1). CRISPR-Cas loci constitute the only known adaptive immune system in bacteria and archaea. They typically include an array of repeat sequences (CRISPRs) with intervening "spacers" matching sequences of DNA or RNA from viruses or other mobile genetic elements, and a set of genes encoding CRISPR-associated (Cas) proteins (Fig. 1A). Transcription across the CRISPR array produces a precursor crRNA (pre-crRNA) that is processed by nucleases into small, non-coding CRISPR RNAs (crRNAs) (Fig. 1B). Each crRNA molecule assembles with one or more Cas proteins into an effector complex that binds crRNA-complementary regions in foreign DNA or RNA (Fig. 1C-E). The effector complex then triggers degradation of the targeted DNA or RNA using either an intrinsic nuclease activity or a separate nuclease in trans (Fig. 1A, 1C-E).
CRISPR-Cas systems have been classified into two groups comprising three types each (Class 1 includes Types I, III, and IV; Class 2 includes Types II, V, and VI) (2). Class 1 systems use multisubunit complexes that contain multiple different Cas proteins, while Class 2 effectors contain only a single protein (Fig. 1C-E) (2). To date, much attention has focused on the mechanism of Class 2 effectors, such as Cas9, Cas12 and Cas13, given their practical applications in genome editing and manipulation (3,4). Class 1 systems, though less well studied, are far more abundant in nature, comprising about 90% of CRISPR-Cas systems in bacteria and archaea (2,5,6). They are also present across diverse bacterial and archaeal phyla, and likely evolved earlier than Class 2 systems (2). Class 1 CRISPR-Cas systems harbor a number of different enzymatic activities, including cleavage of double-stranded DNA (dsDNA), single-stranded DNA (ssDNA), and RNA, and synthesis of a second messenger molecule, cyclic oligoadenylate (cOA). These functions could be harnessed for genome or transcriptome manipulation and control of cellular outcomes. Here, we review the interference mechanisms of effector complexes from Class 1 systems and their regulation, focusing on new paradigms of adaptive immunity from recent studies of Types I and III systems, and emerging applications of these systems in genome and transcriptome engineering.
copy of Cas8, the large subunit, and Cas5 at the 5´ end of the crRNA ("foot"), a helical "backbone" filament composed of Cas7 subunits that assembles along the crRNA spacer region, a "belly" filament composed of Cas11 subunits, and a Cas6 subunit that binds to the 3´-end of the crRNA ("head") and caps the backbone (Fig. 1C) (8-10).
Structural studies of Type III effector complexes, "Csm" (Cas subtype Mtube; subtypes III-A/D/E/F) and "Cmr" (Cas module RAMP; subtypes III-B/C), indicate that they adopt a similar architecture, but have a more extended, worm-like shape (11-13). Type III complexes also include Cas10 as the large subunit, instead of Cas8 (Fig. 1D). While Cas10 and Cas8 occupy similar positions in the complex, they are highly divergent in amino acid sequence and play different roles in target interference (2). Another notable difference between Type I and III complexes is that Type III complexes lack Cas6, instead employing specialized Cas7-like subunits (Csm5 in Type III-A and Cmr1 and Cmr6 in Type III-B) to bind the 3´ end of the crRNA (Fig. 1D) (11-14). Type IV CRISPR-Cas complexes, which include three subtypes (IV-A to IV-C), have a unique large subunit, Csf1, as well as subunits homologous to Cas7, Cas6, and Cas5 (Fig. 1E) (15,16). The subunit assembly and architecture of Type IV complexes are not clear, as structures of the entire complex have not yet been determined.
Homology between Cas7, Cas6, and Cas5 and the similarity of complex architectures across Class 1 systems point to a common evolutionary origin for these effector complexes. Both Type I and III systems also have Cas11, known as the "small subunit." Cas11 proteins do not share significant sequence similarity across types, but exhibit structural homology and occupy analogous positions in Type I and III complexes (Fig. 1C, D) (2,7,17). The divergence of large subunits, Cas10 and Cas8, on the other hand, suggests that they may be under greater evolutionary pressure from phage counterdefense strategies. This is consistent with the roles that Cas8 and Cas10 play in activating and regulating immunity, which we will discuss in subsequent sections of this review.
Several subtypes of Type I CRISPR-Cas systems also lack genes encoding the large and/or small subunits (2). In these systems, other subunits typically take over the functional roles and positions of Cas8 and/or Cas11 in the complex. For example, in the Type I-F2 system from Shewanella putrefaciens (S. putrefaciens), Cas5 and Cas7 substitute for the missing Cas8 and Cas11, respectively (18). In addition, many Class 1 complexes include fusions of subunits into a single protein (2). The newly identified Type III-E locus lacks the large subunit, Cas10, and encodes a predicted fusion of Cas7 with Cas11 (2). Whether one of the subunits replaces Cas10, or whether this system has a novel function compared to other Type III systems, is not known. Future biochemical and structural studies of diverse Type I and III variants will likely uncover unexpected functional and structural versatility of Class 1 CRISPR-Cas complexes, and identify minimal complexes that could be more easily expressed and assembled in heterologous systems for genome engineering.
Class 1 CRISPR-Cas systems also encode effector nucleases and/or helicases that cooperate with the surveillance complex for RNA-guided immunity (Fig. 1A). The Type I Cascade has no intrinsic enzymatic activity, and relies on recruitment of Cas3, a helicase-nuclease, to degrade dsDNA in trans (Fig. 1A) (19,20). Type III CRISPR-Cas Csm/Cmr complexes possess intrinsic DNase and RNase activities, but also synthesize a second messenger, cOA, which binds and stimulates RNA cleavage by Csm6/Csx1, a separate nuclease effector (Fig. 1A) (21). Type IV systems are associated with Csf4, a DinG family helicase (Fig. 1A) (2). Csf4 is required for in vivo plasmid interference by Type IV systems, but its role in the process is unclear (22). These "partner" enzymes illustrate how a common RNA-guided surveillance complex architecture can be adapted to perform diverse functions.

crRNA biogenesis
In contrast to the CRISPR-Cas9 effector, which requires both a crRNA and a tracrRNA (trans-activating crRNA) for activity, all Class 1 complexes contain only a single crRNA molecule (2,23). Processing of the crRNA in Class 1 systems typically requires the Cas6 ribonuclease, which cuts pre-crRNA transcripts into individual crRNA molecules containing a repeat-derived 5´-tag, a spacer region, and a 3´ stem-loop hairpin (Fig. 1B). In Type I and IV systems, the crRNA retains the 3´ hairpin structure (Fig. 1B) (15,16). In Type III systems, host nucleases trim the 3´ end to variable lengths corresponding to the number of Cas7 subunits in the complex (Fig. 1B) (24-27). Genetic and biochemical studies of a Type III-A system from S. epidermidis suggest that Csm5, the subunit that caps the Cas7 helical filament, recruits polynucleotide phosphorylase (PNP) to trim the exposed 3´-end of the crRNA (28,29). However, deletion of the pnp gene did not result in complete loss of mature, trimmed crRNAs, suggesting that other host RNases may also contribute to processing (29). A recent study also showed that a Type III-Bv system, which lacks Cas6, uses a host RNase E enzyme for crRNA maturation (30). This highlights the importance of studying different subtypes in order to understand how Class 1 crRNAs are specifically assembled with Cas proteins into RNA-guided effector complexes. Understanding the minimal requirements for guide maturation and complex assembly would also facilitate introduction of these complexes into eukaryotic cells for DNA and RNA detection and editing.
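The Cas6 processing step described above can be illustrated with a toy string model. This is only a sketch: the function names are hypothetical, the 8-nt tag length is an assumed default rather than a value given in the text, and the 3´ trimming that occurs in Type III systems is not modeled.

```python
def build_pre_crrna(repeat, spacers):
    """Concatenate repeat-spacer units into a toy pre-crRNA transcript."""
    return repeat + "".join(spacer + repeat for spacer in spacers)


def cas6_process(repeat, spacers, tag_len=8):
    """Cut once inside every repeat and return the pieces between cuts.

    Assumption: the cut leaves the last `tag_len` nucleotides of each repeat
    as the 5' tag of the downstream crRNA, and the rest of the repeat as the
    3' hairpin-containing end of the upstream crRNA.
    """
    if not 0 < tag_len < len(repeat):
        raise ValueError("tag_len must be shorter than the repeat")
    pre = build_pre_crrna(repeat, spacers)
    cut_offset = len(repeat) - tag_len
    cuts, pos = [], 0
    # Walk along the transcript and record one cut site per repeat.
    for spacer in list(spacers) + [""]:
        cuts.append(pos + cut_offset)
        pos += len(repeat) + len(spacer)
    # Each mature crRNA spans two consecutive cut sites.
    return [pre[a:b] for a, b in zip(cuts, cuts[1:])]


# Toy sequences chosen for readability, not taken from any real locus.
for crrna in cas6_process("GGGGCCCCAAAAUUUU", ["ACGU", "UGCA"]):
    print(crrna)
```

Each returned string has the layout 5´ tag, spacer, 3´ repeat remainder, which mirrors the crRNA organization described for Type I and IV systems.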

Class 1 CRISPR-Cas Interference
Type I and III CRISPR-Cas systems together comprise the most abundant types of CRISPR-Cas systems, and encompass diverse subtypes (2,5). In this section, we review recent advances in our mechanistic understanding of nucleic acid targeting by Type I and III surveillance complexes and their modes of regulation.

DNA targeting and regulation by Type I CRISPR-Cas systems
Type I CRISPR-Cas systems target homologous regions of double-stranded DNA in phages or plasmids for degradation (2). The overall mechanism of targeting involves two major steps: recognition of a complementary target in foreign DNA by the surveillance complex, and cleavage of the target by Cas3, a protein with an SF2 (Superfamily 2) helicase domain and an HD (histidine-aspartate) nuclease domain, that is recruited in trans (31-33). Target recognition requires complementarity between the crRNA and the target, as well as the presence of a PAM (protospacer adjacent motif), which allows the host to avoid self-immunity (34).

RNA-guided DNA binding and cleavage by Type I CRISPR-Cas systems
During an infection, the Type I complex first scans the viral genome for the PAM, a 2-5 base pair (bp) motif flanking the target sequence (Fig. 2A) (34). Since the PAM is not present in the repeat sequences flanking the CRISPR spacers, this protects the host's own DNA from being targeted for degradation (35). The mechanism of PAM recognition has been best studied for the Type I-E Cascade complex from E. coli, which recognizes a trinucleotide 5´-A-(T/C/A)-G-3´ (36). The PAM is recognized in a double-stranded form through minor groove contacts with a lysine finger, glutamine wedge, and glycine loop in Cas8 (also known as Cse1 in Type I-E systems) (36). Local bending of the DNA combined with insertion of the wedge motif into the DNA duplex following the PAM initiates DNA unwinding (36,37). Wedge and loop motifs are functionally and structurally conserved across different Type I systems, but sequence variability enables recognition of distinct PAM sequences (36-39). Some Type I complexes have also evolved to use other subunits for PAM recognition. The Type I-F2 complex, which lacks Cas8, uses Cas5 to recognize a 5´-GG-3´ PAM through major groove interactions (18). Studies of different Type I variants will likely reveal further diversity in the mechanisms and protein motifs used to recognize PAM sequences. This information could be used to engineer more flexible PAM recognition or near-PAM-less variants of Type I effector complexes for genome engineering, similar to those that have been developed for Cas9 (40,41).
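As a rough computational analogy to PAM scanning, the Type I-E motif quoted above can be searched for in a DNA string. This is a minimal sketch, not a model of the protein-DNA contacts: the function name is hypothetical, strand conventions are ignored (the motif is simply read 5´ to 3´ on the strand supplied, with the candidate protospacer taken immediately 3´ of it), and the default protospacer length is a placeholder.

```python
import re

# PAM reported for E. coli Type I-E Cascade: 5'-A-(T/C/A)-G-3'.
# The lookahead lets overlapping motifs be reported as separate hits.
PAM_PATTERN = re.compile(r"(?=(A[TCA]G))")


def find_pam_sites(dna, protospacer_len=32):
    """List (pam_index, pam, candidate_protospacer) for every PAM hit.

    Assumptions (not from the text): the motif is read 5'->3' on the strand
    supplied, the candidate protospacer is the `protospacer_len` bases
    immediately 3' of the PAM, and 32 is only a placeholder default.
    """
    dna = dna.upper()
    sites = []
    for match in PAM_PATTERN.finditer(dna):
        start = match.start()
        proto = dna[start + 3:start + 3 + protospacer_len]
        if len(proto) == protospacer_len:  # skip hits too close to the end
            sites.append((start, match.group(1), proto))
    return sites


print(find_pam_sites("CCAAGTTTTTTTTGG", protospacer_len=4))
# -> [(2, 'AAG', 'TTTT')]
```

Because a three-base motif with one degenerate position occurs frequently in any genome, a scan like this returns many candidate sites, which is consistent with the PAM acting as a first filter before crRNA complementarity is tested.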
Once DNA unwinding initiates at the PAM, crRNA hybridization with the target strand of DNA leads to displacement of the non-target DNA strand, forming a three-stranded nucleic acid structure known as an R-loop (Fig. 2A) (42,43). Structural, biochemical, and single-molecule experiments on purified Type I complexes have led to a detailed understanding of R-loop formation. Complementarity at a seed region (positions 1-5 and 7-8 following the PAM) is required for target binding and interference (37,44). In the Thermobifida fusca (T. fusca) Type I-E Cascade, binding of the PAM first leads to bending of DNA and unwinding of an ~11-nt "seed loop" intermediate (37). Further base-pairing along the crRNA then expands the seed loop into a full R-loop, which is then locked in place by interactions with Cas7 and/or Cas11 (37,45). In Type I-E Cascade, positively charged residues on the surface of Cas8 and Cas5 guide the displaced non-target strand away from the target strand towards the back of the Cas11 subunits, where it is locked (Fig. 2A) (18,36-38). In Type I-F2 complexes, which lack Cas11 and Cas8, the non-target strand winds through a positively charged "trench" formed by Cas5 and Cas7 subunits (18,37). The Cas7 "backbone" and Cas11 "belly" filaments are also involved in "locking" the target DNA once it hybridizes with the crRNA (18,37,38). Single-molecule studies indicate that R-loop formation serves as a step for rejecting off-target DNA, as mismatches between the crRNA and the target increase the likelihood of R-loop collapse before it reaches the locked state (45). Bacterial RNA polymerases, which unwind DNA without energy input, also bend the path of the DNA duplex, suggesting this may be a conserved mechanism to facilitate DNA unwinding (46,47).
Studies of DNA unwinding and R-loop stabilization in subtypes of Type I systems may reveal further similarities with other protein families that unwind or bend DNA, including helicases, RNA polymerases, transcription factors, and RecA.
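The seed requirement described above (positions 1-5 and 7-8 following the PAM) can be expressed as a simple check. This is a sketch under stated assumptions: the function names are hypothetical, the protospacer is supplied as the non-target-strand sequence read from the PAM-proximal end so that it has the same sense as the crRNA spacer, and a single seed mismatch is treated as disqualifying, which simplifies what is in reality a graded effect.

```python
# Seed positions quoted in the text, counted from the PAM-proximal end.
SEED_POSITIONS = (1, 2, 3, 4, 5, 7, 8)


def mismatch_positions(spacer_rna, protospacer_dna):
    """1-based positions where spacer and protospacer differ (U treated as T)."""
    spacer = spacer_rna.upper().replace("U", "T")
    proto = protospacer_dna.upper()
    if len(spacer) != len(proto):
        raise ValueError("spacer and protospacer must have the same length")
    return [i for i, (a, b) in enumerate(zip(spacer, proto), start=1) if a != b]


def seed_intact(spacer_rna, protospacer_dna):
    """True if no mismatch falls on a seed position (simplified all-or-none rule)."""
    mismatches = set(mismatch_positions(spacer_rna, protospacer_dna))
    return mismatches.isdisjoint(SEED_POSITIONS)


spacer = "ACGUACGUAC"                      # toy 10-nt spacer
print(seed_intact(spacer, "ACGTAAGTAC"))   # mismatch at position 6 only
print(seed_intact(spacer, "ACCTACGTAC"))   # mismatch at position 3 (seed)
```

Position 6 is excluded from the seed tuple, so a mismatch there passes this check while a mismatch at position 3 does not.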
Formation of the R-loop induces a conformational change in the complex that enables recruitment of Cas3, a helicase-nuclease protein that is required for target DNA degradation (Fig. 2A) (19,20,48). Current evidence supports a model in which Cas3 nicks the DNA at the R-loop, loads onto the ssDNA, and processively unwinds and degrades DNA in a unidirectional and ATP-dependent manner (Fig. 2A) (19,33,49). Structures of T. fusca Cascade bound to a DNA target and Cas3 revealed that a protruding "bubble" in the non-target strand of the R-loop is required for Cas3 to nick the DNA (Fig. 2A) (50). Cascade bound to a partial R-loop lacking the protruding bubble could recruit Cas3, but did not induce DNA cleavage by Cas3 (50,51). Single-molecule FRET and bulk fluorescence experiments indicate that after ssDNA loading, Cas3 first stays associated with Cascade and cleaves ssDNA by a "reeling" mechanism; in this model, Cas3 uses its SF2 helicase domain to repeatedly pull in and present ssDNA to its HD nuclease domain for cleavage (Fig. 2A) (52-54). Previous studies also showed that Cas3 can break free of Cascade and translocate on its own, but no evidence of DNA degradation during translocation was observed (Fig. 2A) (53). Thus, it remains unclear whether Cas3 degrades DNA during translocation. In addition, while structural and biochemical studies have shed light on how Cas3 degrades the non-target strand, how it nicks and degrades the target strand of DNA is less well understood. One possibility is that once the non-target strand has been degraded, the exposed ssDNA of the target strand becomes a substrate for a second Cas3 molecule to nick and degrade.

Anti-CRISPR inhibition of Type I CRISPR-Cas effector complexes
Several phage and prophage genomes encode "anti-CRISPRs," small proteins (~50-300 amino acids) that inhibit CRISPR-mediated immunity (43,44). Identification and characterization of anti-CRISPRs that inhibit Type I-E or I-F systems show that they block DNA interference using diverse mechanisms (55). Some anti-CRISPRs mimic duplex DNA or induce a conformational change in Cas8 to interfere with PAM binding (38,57). Structural studies of Type I-F anti-CRISPRs indicate that some can bind to the Cas7 backbone or the crRNA to prevent crRNA:ssDNA base-pairing and R-loop formation (38,57). In addition, the Type I-E and I-F anti-CRISPRs AcrIE1 and AcrIF3 bind Cas3 and inhibit its recruitment by Cascade for DNA cleavage (58-60). Structural comparison of AcrIF3 with Cas8 revealed that AcrIF3 resembles a helical bundle in Cas8 that binds Cas3, which indicates that mimicry of host proteins could be a common strategy for phages to evade CRISPR-Cas interference (48). Binding of the Type I-F anti-CRISPR, AcrIF9, to the surveillance complex also induces nonspecific DNA binding, which could sequester the complex away from its target (61,62). These discoveries highlight not only the diversity of anti-CRISPRs, but also the importance of PAM recognition, R-loop formation, and DNA cleavage in Type I CRISPR-Cas immunity. The steps at which anti-CRISPRs inhibit Class 1 complexes are analogous to steps at which anti-CRISPRs inhibit DNA-targeting Class 2 systems, Cas9 and Cas12, illustrating a remarkable evolutionary convergence of counter-defense strategies by phages (63,64). Further discovery and characterization of anti-CRISPRs against other Type I subtypes could reveal new insights into the host-virus evolutionary arms race and lead to new strategies to control Type I CRISPR-Cas effectors for genome manipulation.

RNA-guided DNA transposition by Type I CRISPR-Cas systems
Bioinformatic analyses revealed that some Type I-B and I-F systems lacking Cas3 have been co-opted by mobile genetic elements (65). The transposons in which these systems are found also lack a key protein involved in directing site-specific transposition (65). Thus, it was hypothesized that the Type I effector in these systems uses crRNAs to direct DNA insertion by the transposase to specific sites (65). This concept was recently demonstrated by experiments showing that a transposon-encoded Type I-F effector complex from Vibrio cholerae (V. cholerae) can mediate targeted insertion of cargo DNA sequences when expressed in Escherichia coli (E. coli) (Fig. 2B) (66). Biochemical and genetic experiments indicate that the Type I-F effector interacts directly with TniQ, a transposition protein, and that this interaction is required for RNA-guided transposition (66,67). Cryo-EM structures also showed that the Type I-F complex associates with a dimer of TniQ through contacts with Cas6 and a Cas7 subunit at the 3´ end of the crRNA (67). Interestingly, RNA-guided transposition is sensitive to mismatches not only in the PAM-proximal seed region, but also in a four-nucleotide region near the TniQ binding site (66,67). Further structural and biochemical studies are needed to determine how target DNA binding and unwinding leads to recruitment of the core transposase, comprising TnsA, TnsB, and TnsC, for RNA-guided DNA insertion (Fig. 2B). Such studies could facilitate the targeted insertion of large DNA elements into the genomes of microbes and eukaryotic cells without requiring homologous recombination, which is often inefficient and only occurs in dividing cells.

Dual DNA and RNA targeting by Type III CRISPR-Cas systems
Type III CRISPR-Cas systems are the most evolutionarily ancient CRISPR-Cas systems, and are widespread across bacteria and archaea (68). Their effector complexes include enzymatic domains that cleave RNA and ssDNA, and that synthesize second messenger molecules to activate antiviral nucleases in trans (Fig. 3A). These complexes coordinate a sophisticated, multi-pronged defense against invasive genetic elements, including DNA and RNA phages, plasmids, and jumbo phages (69-73). They are divided into six subtypes (III-A to III-F), but the most common are Type III-A and III-B systems (2,5). The study of these systems has offered insights into the evolutionary origins of CRISPR-Cas immunity and surprising parallels with other prokaryotic and eukaryotic immune systems.

RNA-guided RNA cleavage by Type III CRISPR-Cas complexes
Type III CRISPR-Cas effector complexes recognize RNA through base-pairing interactions with their crRNAs and cleave it using their Cas7-like subunits, Csm3 (III-A/D/E/F) or Cmr4 (III-B/C) (Fig. 3A). As in Type I complexes, the crRNA is presented by the effector complex in discontinuous segments for base-pairing with the target nucleic acid (11-13,74). In addition, each Csm3/Cmr4 inserts a "finger" loop into the duplex, flipping out every 6th base pair (11-14). This closely resembles the mechanism of ssDNA binding in the R-loops of Type I effector complexes (10,38). Type III effector complexes do not have a clear seed region for RNA binding, unlike Type I Cascade and other RNA-guided RNA nucleases, such as Argonaute and CRISPR-Cas13 (75,76). Each segment of target RNA is recognized and cleaved independently by Csm3 or Cmr4; crRNA:target mismatches or deoxynucleotide modifications that disrupt cleavage at one site do not inhibit cleavage at other sites (25,77). Complete guide:target complementarity is not required for RNA cleavage, though reduced base-pairing results in a slower rate of cleavage (70,78). A seed or "target capture" motif has been reported at the 5´ end of the target in Type III-B Cmr, but truncation of this region did not entirely inhibit RNA binding or cleavage (79,80).
Unlike in Type I effectors, the Cas7-like subunits of Type III effector complexes (Csm3/Cmr4) are catalytically active. Cleavage requires a conserved aspartate (Asp) residue in Csm3/Cmr4 and occurs 3´ to every flipped base, resulting in a characteristic six-nucleotide cleavage periodicity (11-14,74). Cleavage requires a 2´-OH in the target RNA, and the resulting termini of the reaction products have a 5´-OH and either a 3´-phosphate or 2´,3´-cyclic phosphate (14,81). This suggests a metal-independent cleavage mechanism, but experiments show that divalent metal ions are required for RNA cleavage (24-26,70,79,81,82). Atomic-resolution structures of Type III complexes show how RNA targets are positioned prior to cleavage, but use of active-site mutants or a non-cleavable ssDNA target in these structures has prevented determination of the cleavage mechanism (12-14). Thus, structural studies of the complex in cleavage-competent and post-cleavage states will be required to understand the catalytic mechanism of RNA cleavage by Type III effectors.
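The six-nucleotide cleavage periodicity described above can be sketched as a small calculation of cut positions and product fragments. The function names and the register (which paired position is flipped first) are illustrative assumptions; in a real complex the number of cuts is set by the number of Csm3/Cmr4 subunits rather than by target length alone.

```python
def cleavage_sites(paired_len, period=6, first_flipped=6):
    """Positions k (1-based) after which the target is cut, i.e. 3' of each flipped base.

    `first_flipped` is an assumed register; only the 6-nt spacing comes from the text.
    A cut after position k needs at least one base beyond k, hence k < paired_len.
    """
    return list(range(first_flipped, paired_len, period))


def fragments(target, sites):
    """Split a target string after each cut position."""
    bounds = [0, *sites, len(target)]
    return [target[a:b] for a, b in zip(bounds, bounds[1:])]


sites = cleavage_sites(30)
print(sites)                                  # -> [6, 12, 18, 24]
print(fragments("ACGUACGUACGUAC", cleavage_sites(14)))
# -> ['ACGUAC', 'GUACGU', 'AC']
```

Because each site is cut independently according to the text, a fuller model would assign a separate rate or probability to each position instead of cutting all of them at once.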

RNA-activated DNA cleavage by Type III CRISPR-Cas effector complexes
Biochemical experiments showed that Csm and Cmr recognize and cleave complementary single-stranded RNA in vitro (Fig. 3A) (83). However, Type III-A and III-B CRISPR-Cas systems exhibit transcription-dependent DNA targeting in vivo (84,85). This was a puzzle until the discovery that recognition of RNA allosterically activates a latent ssDNA endonuclease activity in Cas10, the large subunit (Fig. 3A) (26,77,86-88). DNA cleavage is catalyzed by Cas10's HD nuclease domain and requires RNA binding, but not RNA cleavage by the effector complex (26,77,87-89). The HD domain cleaves random sequences of ssDNA and generally requires transition metals (Ni2+ or Mn2+) for maximal activity, similar to Cas3 (26,74,77,86-88). How Cas10 is activated to bind and cleave ssDNA is not clear, as ssDNA could not be visualized in structures of Csm/Cmr and few conformational changes were observed in the HD domain upon RNA binding (12,13,74). The identification of Cas10 mutations that constitutively activate ssDNA cleavage indicates that RNA binding may relieve an autoinhibited state (13). Elucidation of the conformational changes and dynamics involved in activation is needed to fully understand how crRNA-guided RNA binding activates the Cas10 subunit for ssDNA cleavage.
Due to the ability of Type III effector complexes to cleave both ssDNA and RNA, it was proposed that the Type III complex would cotranscriptionally bind the growing transcript and trigger cleavage of both the RNA and the unwound ssDNA in transcription elongation complexes (26,77,86,87). However, experiments in which Csm was added to in vitro transcription reactions or stalled transcription complexes indicated that Type III-A Csm prefers to cut RNA transcripts, rather than ssDNA (74,90). This preference is likely because the transcript is more accessible than ssDNA bound by RNA polymerase during transcription. This is supported by the finding that Csm can cleave ssDNA in free R-loops that are not bound by RNA polymerase (74). Similarly, the Cas3 nuclease in Type I systems only nicks the ssDNA of the R-loop when it forms an exposed loop that protrudes above the surface of Cascade (50). Further studies are required to identify the DNA target of Type III CRISPR-Cas effector complexes in the cell. Potential targets may include unbound R-loops (Fig. 3A) or DNA replication intermediates (43,91). Type III effectors could also cleave DNA during transcription initiation, when longer lengths of ssDNA are exposed by RNA polymerase during a process known as "scrunching" (92). Thus, future studies may reveal an unanticipated level of coordination between CRISPR-Cas immunity and other DNA processes in the cell.
DNA cleavage is important to help clear a phage infection, but it could also be deleterious to the host's genomic integrity if it persists indefinitely. Thus, Cas10 is gradually inactivated over time by cleavage and dissociation of the RNA from the effector complex (Fig. 3A)(77, 86). Consistent with this, use of a modified, non-cleavable RNA or a cleavage-deficient Csm3/Cmr4 mutant prevented inactivation of ssDNA cleavage (77,86). In the cell, RNA cleavage by the Csm3/Cmr4 subunits in the effector complex is likely important for turning off Cas10's DNase activity once viral transcripts have been cleared.

RNA-guided cOA synthesis in Type III complexes
In addition to DNA cleavage, RNA binding also activates the Cas10 subunit for cyclic oligoadenylate (cOA) synthesis and activation of signaling effectors (Fig. 3A)(93-97). Binding of cOA to accessory nucleases, Csm6 (III-A) or Csx1 (III-B), dramatically stimulates their enzymatic activity (Fig. 3A)(93, 95, 96). This leads to degradation of viral and host transcripts, induces a growth arrest in the host cell, and promotes plasmid clearance (72,98,99). Target RNA cleavage and dissociation from the crRNA-guided effector complex eventually inactivates the Palm domains of Cas10 for cOA synthesis, which would prevent persistent degradation of host transcripts after the phage infection has been cleared (93,97). Genomic analyses of CRISPR-Cas loci suggest that the ancestral function of these systems was a nucleotide-based stress signaling pathway, similar to the Type III cOA signaling pathway (100). Thus, further studies into RNA-guided cOA signaling could reveal unexpected connections between RNA-guided CRISPR-Cas immunity and other nucleotide signaling pathways.
Biochemical and structural studies of Type III-A complexes suggest the following mechanism for cOA synthesis. In addition to the HD nuclease domain, Cas10 also has two Palm domains that form a composite active site for cOA synthesis (12,93,95,101). The catalytic motif for cOA synthesis, GGDD, is present in only one of these domains (93,96). Cooperative binding of two ATP molecules by the Palm domains positions the 3´-OH of one ATP molecule for attack on the 5´-α-phosphate of the second molecule to generate a 3´-5´ phosphodiester bond (12,93,95,101). Specific recognition of ATP is mediated through a network of hydrogen bonding interactions (101). Further reaction of this intermediate with incoming ATP molecules leads to extension of the oligoadenylate chain and eventually ring closure through intramolecular attack on the terminal nucleotide's 5´-α-phosphate by the first nucleotide's 3´-OH (Fig. 3A, bottom inset) (93,101). Release of the cyclic oligoadenylates occurs through a channel formed by Cas10 and Csm4 (101). The cOA species range in size from three to six AMP units per ring (Fig. 3A, bottom inset), but how the size of the ring is determined is not well understood. In addition, how exactly RNA binding allosterically activates Cas10's Palm domains for cOA synthesis is also not understood.
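The overall stoichiometry implied by this mechanism can be written out explicitly. The scheme below is a sketch that assumes each phosphodiester bond forms by canonical nucleotidyl transfer, with pyrophosphate (PP_i) as the leaving group; the pyrophosphate product is not stated in the text above and is included as an assumption.

```latex
% Chain extension: (n-1) intermolecular 3'-5' bonds, one PP_i released per bond
n\,\mathrm{ATP} \;\longrightarrow\; \mathrm{pppA(pA)}_{n-1} + (n-1)\,\mathrm{PP_i}

% Ring closure: one intramolecular 3'-5' bond, releasing the last PP_i
\mathrm{pppA(pA)}_{n-1} \;\longrightarrow\; \mathrm{cOA}_n + \mathrm{PP_i}

% Net reaction for the ring sizes quoted in the text
n\,\mathrm{ATP} \;\longrightarrow\; \mathrm{cOA}_n + n\,\mathrm{PP_i},
\qquad 3 \le n \le 6
```

Under this assumption a ring of n AMP units contains n phosphodiester bonds, so making cOA4 consumes four ATP molecules and cOA6 consumes six.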

RNA-guided cOA signaling in Type III CRISPR-Cas systems
Type III CRISPR-Cas loci are frequently associated with genes encoding Csm6 or Csx1, which contain an N-terminal CARF (CRISPR-associated Rossmann fold) "sensor" domain and a C-terminal HEPN (higher eukaryotes and prokaryotes nucleotide-binding) "effector" domain (5,102,103). The RNase activity of Csm6/Csx1's HEPN domain is allosterically activated by binding of either cOA4 or cOA6 to the CARF domains (Fig. 3A) (93,95,96,104). The HEPN domains of Csm6 and Csx1 exhibit a base cleavage preference that varies by ortholog, with most cleaving adjacent to either purines or cytosines (93,99,102,105). Structures of Csm6 and Csx1 reveal that they typically form dimers, but some orthologs also exhibit the ability to form higher-order oligomers (103,104,106,107). Binding of cOA does not appear to induce large conformational changes in their HEPN active sites, suggesting that conformational activation may involve subtle changes or transient sampling of an activated state (104,106).
In addition to Csm6 and Csx1, several DNases fused to CARF domains also respond to cOA molecules. For example, binding of cOA3 to NucC, an enzyme whose gene is associated with Type III CRISPR-Cas loci and other prokaryotic defense modules, activates it for dsDNA cleavage (108). A recent study also identified can1 (CRISPR associated nuclease 1) in a genome with a Type III-A CRISPR-Cas system (109). Binding of cOA4 to Can1 activates it for nicking at random sequences of dsDNA (109). These cOA-activated nucleases may promote immunity by triggering degradation of viral DNA during replication or induce host death before the phage can replicate and infect other cells. Bioinformatic studies have also identified additional CARF domain proteins with transmembrane or other nuclease domains (110,111). Further characterization of cOA-regulated effectors is needed to determine the full effects of cOA signaling by Type III systems in cells.
Nonspecific RNA or DNA degradation by nucleases can have deleterious effects on the host cell. Thus, cells have evolved dedicated enzymes called "ring nucleases" that degrade cOA and switch off the signaling pathway (112) (Fig. 3B). Ring nucleases degrade cOA4 or cOA6 using a catalytically active CARF domain (Fig. 3B) (112). Some Csm6/Csx1 orthologs also harbor an intrinsic ring nuclease activity in their CARF domains that leads to slow self-inactivation over time (Fig. 3B) (104,107,113). Cleavage proceeds in two steps, with the first step generating a linear oligoadenylate with a 2´,3´-cyclic phosphate, followed by a second step in which it is split into two halves (Fig. 3B) (104,107,112). Several Csm6 orthologs are still activated by linear A4 or A6 with 2´,3´-cyclic phosphates at their 3´-termini, suggesting that the second cleavage event is crucial for complete inactivation (96). Interestingly, anti-CRISPR proteins that inactivate Type III CRISPR-Cas systems are either highly active ring nucleases (e.g. AcrIII-1) or proteins that bind Cas10 and inhibit its cyclic oligoadenylate synthesis activity (e.g. AcrIIIB1) (114,115). This highlights the importance of the cOA signaling pathway in bacterial immunity. How the opposing activities of Csm6/Csx1 and ring nucleases are effectively coordinated to mount a defense against phages is still not well understood. In vitro kinetic studies of substrate binding and cleavage by the effector complex and its associated nucleases have been used to model the dynamics of Type III CRISPR-Cas immunity in cells, and illustrate the distinct effects that host and viral ring nucleases have on immunity (116). Further studies that measure actual cellular concentrations of cOA and both host and viral transcript levels during an infection will reveal whether these kinetic models accurately describe how the Type III cOA signaling pathway protects cells from invaders.
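The two-step ring nuclease reaction described above (cyclic, then linear, then halves) can be illustrated with the textbook closed-form solution for two sequential irreversible first-order steps. This is a generic kinetic sketch, not the published model cited as (116): the function names are hypothetical, the rate constants are placeholders rather than measured values, and enzyme saturation is ignored.

```python
import math


def ring_nuclease_time_course(t, k1, k2, c0=1.0):
    """Concentrations (cyclic, linear intermediate, halves) at time t.

    Sequential irreversible first-order steps: cyclic -> linear -> halves.
    k1, k2 and c0 are hypothetical inputs, not measured values.
    """
    if math.isclose(k1, k2):
        raise ValueError("this closed form assumes k1 != k2")
    cyclic = c0 * math.exp(-k1 * t)
    linear = c0 * k1 / (k2 - k1) * (math.exp(-k1 * t) - math.exp(-k2 * t))
    halves = c0 - cyclic - linear  # mass balance
    return cyclic, linear, halves


def activating_fraction(t, k1, k2):
    """Fraction of the pool that could still activate Csm6 (cyclic + linear).

    Reflects the observation in the text that the linear intermediate with a
    2',3'-cyclic phosphate can still activate several Csm6 orthologs.
    """
    cyclic, linear, _ = ring_nuclease_time_course(t, k1, k2)
    return cyclic + linear


# Placeholder rate constants: fast ring opening, slow second cleavage.
for t in (0.0, 1.0, 5.0, 20.0):
    print(t, round(activating_fraction(t, k1=1.0, k2=0.1), 3))
```

With a slow second step, the activating pool persists long after the ring itself has been opened, which is one way to read the statement that the second cleavage event is crucial for complete inactivation.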

Regulation of Type III Cas10 by tag:anti-tag pairing
All immune systems must distinguish between self versus nonself. Antisense transcription across the CRISPR array produces RNA molecules that are complementary to the crRNA (117,118). To prevent these antisense transcripts from triggering self-immunity, complementarity between the 5´-tag of the crRNA and the "anti-tag" sequence flanking the 3´ end of the target RNA inhibits Cas10's enzymatic activities (Fig. 3C)(26, 71, 93). In Type III-A systems, base-pairing between positions -2 to -5 in the anti-tag (+1 is the first nucleotide of the protospacer) with the corresponding positions in the crRNA 5´ tag is crucial for recognition of self RNA (12,13,71,86,93). In the Type III-B system from Thermotoga maritima (T. maritima), inhibition is similarly mediated by tag:anti-tag complementarity, but is additionally enhanced by the presence of a guanine in position -1 (119). How this guanine promotes Cas10 inhibition is unclear. The Type III-B CRISPR-Cas system from Pyrococcus furiosus (P. furiosus) also recognizes a "protospacer flanking sequence" (PFS) in the first three nucleotides flanking the 3´ end of the target (positions -1 to -3) in order to license ssDNA cleavage and cOA synthesis by Cas10 (87,120). It is unclear how the PFS is recognized by the P. furiosus Cmr, and whether tag:anti-tag complementarity still plays a role. Further analysis of how Type III complexes bind different anti-tag sequences will advance our understanding of how RNA binding regulates both ssDNA cleavage and cOA signaling by Cas10. Interestingly, the inhibition of Type III complexes by complementarity between the crRNA 5´-tag and the anti-tag has also been reported in an RNA-targeting Class 2 system, Type VI Cas13 (121). Thus, further insights into self versus nonself discrimination by Type III systems could reveal concepts that apply to other RNA-guided RNA nucleases.
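The tag:anti-tag rule for Type III-A systems can be sketched as a simple sequence check. The Python snippet below flags a target as "self" when the crRNA 5´-tag and the anti-tag can form Watson-Crick pairs at positions -2 to -5. Several things here are assumptions made for illustration: the function name, the example sequences (which are invented, not taken from any organism), the requirement that all four positions pair, the exclusion of G-U wobble pairs, and the indexing convention in which both strings are listed by flank position moving away from the spacer or protospacer (index 0 is position -1).

```python
# Watson-Crick RNA pairs only; G-U wobble is excluded in this sketch.
WC_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def is_self_target(tag, anti_tag, positions=(-2, -3, -4, -5)):
    """Return True if tag and anti-tag pair at every listed position.

    Both strings are indexed by flank position: index 0 is position -1,
    index 1 is position -2, and so on (an assumed convention).
    """
    for pos in positions:
        i = -pos - 1  # convert position -1, -2, ... to index 0, 1, ...
        if i >= len(tag) or i >= len(anti_tag):
            return False  # flank too short to pair at this position
        if (tag[i].upper(), anti_tag[i].upper()) not in WC_PAIRS:
            return False
    return True


if __name__ == "__main__":
    tag = "CAAGAGCA"          # hypothetical crRNA 5'-tag, positions -1 to -8
    self_flank = "GUUCUCGU"   # hypothetical fully complementary anti-tag
    viral_flank = "AAGGACUU"  # hypothetical non-complementary flank
    print(is_self_target(tag, self_flank))
    print(is_self_target(tag, viral_flank))
```

The all-or-nothing rule is a simplification: the text states only that pairing at positions -2 to -5 is crucial, so how partial complementarity affects Cas10 inhibition is not modeled here, and the additional position -1 guanine effect reported for T. maritima is likewise left out.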

Comparison of Type III CRISPR-Cas systems with other nucleotide-based immune systems in bacteria and eukaryotes
Recent studies show that Type III CRISPR-Cas systems are not the only bacterial immune systems that use cyclic nucleotides for signaling. For instance, a cGAS (cyclic GMP-AMP synthase)-like enzyme in a bacterial defense module synthesizes a cyclic GMP-AMP in response to phage infection, which leads to membrane degradation by a phospholipase and cell death (122). A recent study also reported that a cGAS/DncV-like nucleotidyltransferase (CD-NTase) could synthesize cOA3 , which in turn could bind and activate NucC, a DNA nuclease (123). Thus, insights into the cOA signaling pathway in Type III CRISPR-Cas systems may reveal concepts that are broadly applicable to other cyclic nucleotide-based antiphage signaling systems in prokaryotes.
The Type III cOA signaling pathway also bears similarities to the oligoadenylate synthase (OAS)-RNase L and cGAS-stimulator of interferon genes (STING) pathways in eukaryotes (124). OAS-RNase L constitutes a eukaryotic innate immune system, in which sensing of viral double-stranded RNA activates OAS to synthesize 2´,5´-linked oligoadenylates. These oligoadenylates in turn bind and activate RNase L to cleave viral transcripts. Like Csm6, RNase L also has a HEPN domain that catalyzes RNA cleavage. In the cGAS-STING pathway, cytosolic DNA activates the cGAS enzyme to synthesize a cyclic GMP-AMP, which binds to the STING receptor and ultimately activates transcription of antiviral genes. Thus, studies of the cOA signaling pathway in Type III systems may reveal evolutionary connections between bacterial adaptive immunity and eukaryotic innate immunity.

Editing and applications of Class 1 CRISPR-Cas effectors
While Class 1 systems are less widely used than Class 2 systems in genome editing, they are emerging as tools for genome and transcriptome manipulation in both microbial and eukaryotic cells (Fig. 4A-C). In bacteria, targeting of Type I CRISPR-Cas effectors to DNA sequences in the absence of Cas3 or with a Cas3-inhibiting anti-CRISPR leads to transcriptional repression (Fig. 4A)(58, 125). Transcriptional inhibition is strongest when guide RNAs target the promoter rather than the open reading frame, similar to dCas9, a cleavage-deficient mutant of Cas9 engineered to repress transcription in cells (125,126). Fusion of a transcriptional activator or repressor domain to Type I CRISPR-Cas complex subunits also enables them to modulate gene expression in plant or mammalian cells, illustrating their utility across different kingdoms (Fig. 4A)(127, 128).
Type I systems have also been introduced into various cell types for DNA modification (129-132). Introduction of the Type I effector complex and the Cas3 helicase-nuclease into mammalian cells results in long-range chromosomal deletions (Fig. 4A)(129-131). These deletions are unidirectional, consistent with biochemical studies of Cas3 degradation. Type I-E effectors fused to the FokI nuclease can also be programmed with a pair of guide RNAs to induce dsDNA breaks, triggering both small deletions and templated repair in mammalian cells (Fig. 4A)(131). Endogenous Type I systems have been harnessed for faster genome manipulation of the archaeon Sulfolobus islandicus (132). Lastly, transposase-associated Type I-F systems have been shown to specifically insert synthetic "cargo" DNA up to 10 kb (kilobases) in length in E. coli with high fidelity, and thus hold promise as a technique to knock in genes without requiring homologous recombination (Fig. 4A)(66). Indeed, recent preprints have reported the use of transposase-associated Type I-F systems for targeted insertion of antibiotic resistance genes in members of a microbial community, and multiplexed gene insertion in several medically and industrially important bacterial species (133,134).
Type III CRISPR-Cas effectors have been repurposed for RNA knockdown in archaea and in zebrafish, which lack an RNAi pathway (Fig. 4B)(135-137). They have also been used to assist in phage DNA editing (138). Lastly, the cOA-regulated enzyme, Csm6, has been repurposed for viral diagnostics in conjunction with Cas13, a Type VI CRISPR-Cas effector that is activated to cleave RNA in trans upon crRNA-guided recognition of an RNA target (Fig. 4B) (105). Trans cleavage of an RNA oligonucleotide bearing an A6 at its 5´ end and multiple U's at its 3´ end by Leptotrichia wadei (L. wadei) Cas13 leads to production of a linear A6 with a 2´,3´-cyclic phosphate, which can bind to and activate certain Csm6 orthologs (105). This led to an approximately 3.5-fold boost in RNA detection sensitivity over Cas13 alone (105). Further exploration of Csm6 and Csx1 orthologs from different organisms could lead to improved kinetics and sensitivity, expanded multiplexing, and greater thermostability of RNA diagnostic technologies. There may also be additional opportunities for reprogramming endogenous Type III systems in individual bacteria or bacterial communities by delivery of crRNA guides. The RNA-sensing function of Type III CRISPR-Cas effector complexes coupled with nonspecific RNA degradation by Csm6 could also be used to modify cell state (e.g. induce cell death or inhibit cell growth) in response to transcription of specific genes (Fig. 4C). This may be useful in the context of antimicrobials or developing disease therapeutics that target cells with aberrant gene expression.
A major challenge for the widespread use of these systems in eukaryotic cells has been the delivery of these large complexes to the site of editing in cells. Several methods have now been established for introducing Type I and III complexes into eukaryotic cells, including nucleofection of pre-formed ribonucleoprotein complexes and expression from multiple DNA vectors (127-131). Further discovery of minimal Type I and III complexes and advances in RNA and protein delivery methods will simplify delivery and facilitate the continued development of these systems as tools for DNA and RNA manipulation in diverse cell types.
Class 1 CRISPR-Cas effectors extend the toolbox for genome engineering beyond the capabilities of Class 2 systems. The distinct and flexible PAM sequence requirements of Type I systems, which differ from the PAM sequences recognized by Cas9 and Cas12, broaden the array of DNA targets that can be recognized (34). The generation of long-range deletions by Type I Cascade also contrasts with the smaller deletions ("indels") that result from Cas9 or Cas12 editing (Fig. 4A)(129, 130). The multisubunit composition of Class 1 effectors would also facilitate multiplexed fusion of domains that perform DNA or RNA base editing, epigenetic modifications, visualization of genomic loci or RNA sequences, and/or transcriptional regulation (Fig. 4C). Transposase-associated Type I-F systems also integrate DNA cargo with fewer off-target events than a transposase-associated Type V (Cas12k) system (66,134,139). The unique, cOA-regulated RNases of Type III systems could also be repurposed for control of cellular growth or behavior, in response to an upstream signal generated by a cOA synthetase (Fig. 4C). Further study of Class 1 systems and their mechanism will likely continue to broaden the array of tools available for investigation of genome and transcriptome function in cells and for in vitro nucleic acid detection.

Summary and outlook
Class 1 CRISPR-Cas systems are the most common adaptive immunity pathways in prokaryotes. The diverse functions and activities of Type I and III crRNA-guided complexes and enzymes illustrate their versatility, and recent studies highlight their promise and development as tools for genome and transcriptome manipulation. These complexes resemble each other in subunit assembly and use a conserved mechanism for crRNA-mediated target recognition. Type I and III systems have evolved distinct mechanisms to recognize dsDNA and RNA, respectively. Discrimination between self and non-self occurs through interactions with sequences flanking their DNA or RNA targets; Type I Cascade recognizes a dsDNA PAM, while Type III Csm/Cmr probes for noncomplementarity between an RNA anti-tag and the 5´-tag of its crRNA. DNA recognition and R-loop formation by Type I complexes licenses foreign DNA degradation by Cas3. In Type III systems, RNA recognition triggers nonspecific ssDNA cleavage and initiation of a cOA signaling pathway that activates additional nucleases for DNA or RNA degradation in trans. Type III systems also include a timer for self-inactivation by slow cleavage of the RNA target and degradation of cOA, the second messenger. Critical steps of interference are inhibited by anti-CRISPR proteins against both systems. Fundamental studies of Class 1 enzymes have enabled the application of Type I systems for transcriptional regulation and genome editing, and Type III systems for RNA degradation and diagnostics. Future studies on Class 1 CRISPR-Cas complexes will expand our understanding of the mechanism and evolution of prokaryotic RNA-guided immunity, and reveal unexpected connections with other cellular DNA processes, nucleotide signaling pathways, and eukaryotic innate immunity. Such insights will also open new avenues for using CRISPR-Cas systems to interrogate genome and transcriptome function, control gene expression, and detect DNA or RNA for disease diagnostics. 
Subunits that are analogous between the different types are shown with the same color. Type III- and IV-specific names for Cas7, Cas5, Cas6, and Cas11 subunits are also shown below the canonical subunit names. B. Biogenesis of Class 1 crRNAs. Transcription across a CRISPR array (repeat sequences are gray diamonds, unique spacers are shown as dark and light blue rectangles) leads to production of a pre-crRNA transcript that is then cleaved by Cas6 into individual guide molecules. Each crRNA has a 5´-tag that is derived from the repeat sequence. Individual guides are then directly incorporated into Type I and IV complexes, or trimmed at their 3´ end by host nucleases before assembly with Type III effector subunits. C. Architecture and enzymatic activities of the Type I crRNA-guided effector complex, Cascade. Subunits are shown with the same color scheme as in A. The crRNA is shown with the same color scheme as in B. D. As in C but for the Type III effector complex, Csm (subtypes III-A/D/E/F) or Cmr (subtypes III-B/C). Subunits unique to Type III systems (Cas10, and Csm5 or Cmr1/6) are labeled. E. As in C but for the Type IV effector complex. The subunit unique to the Type IV system (Csf1) is labeled. Type IV complexes contain a crRNA assembled with Cas7, Cas6, and Csf1, but their enzymatic activity, precise stoichiometry, and structure are not yet known.

Figure 2. Type I CRISPR-Cas interference mechanism.
A. Target binding and degradation by the Type I-E Cascade complex. 1) Recognition of the PAM (shown as an orange rectangle) by Cas8 leads to DNA bending and initiation of unwinding; 2) Hybridization of the crRNA with the target strand of DNA leads to displacement of the non-target DNA strand and formation of an R-loop; 3) A conformational change in Cascade that accompanies R-loop formation leads to recruitment of Cas3, a helicase-nuclease protein, to a small bulge in the non-target strand; 4) The non-target strand bulge is cleaved by Cas3 and the ssDNA is loaded into the helicase; 5) Cas3's helicase domain unwinds DNA upstream of the PAM, and "reels in" ssDNA towards its nuclease active site; 6) Cas3 dissociates from Cascade and continues to translocate; whether or not degradation occurs during translocation is unclear. B. Cooperation of Type I-F3 Cascade with a Tn7 transposase to mediate RNA-guided DNA insertion. 1) Recognition of the PAM and 2) base-pairing with a complementary DNA sequence leads to R-loop formation, which activates the TniQ dimer to recruit the core transposase (TnsA, TnsB, TnsC) and transposon DNA with paired ends (violet lines) to the Cascade-bound target site. 3) Integration of the transposon's DNA cargo (pink lines) occurs ~49 bp downstream of the DNA target. In all panels, individual Cas proteins are colored as in Figure 1, but with colors muted for greater clarity of the nucleic acid strands. dsDNA, double-stranded DNA; PAM, protospacer adjacent motif.

Figure 3.
A. Multi-pronged interference mechanism of Type III effector complexes. Type III Csm or Cmr complexes 1) recognize and degrade complementary target RNA (orange lines), 2) cleave ssDNA (red lines) nonspecifically when bound to an RNA target, possibly at exposed R-loops, and 3) synthesize a second messenger molecule, cyclic oligoadenylate (cOA, gray diamonds), when bound to an RNA target. cOA binds to and activates an accessory nuclease, Csm6 or Csx1, to nonspecifically cleave host and viral transcripts. The panel below shows the mechanism of cOA synthesis by Cas10. ATP molecules (gray circles labeled with "A") are polymerized into linear oligoadenylates with 3´,5´-phosphodiester linkages. This is then followed by ring closure to produce cOA with 3-6 AMPs per ring (cOA3-6). cOA4 and cOA6 molecules bind Csm6/Csx1 and activate its RNase activity. B. cOA signaling is regulated by Csm6/Csx1 (top row) and dedicated ring nucleases (middle row), which use their CARF domains to degrade cOA4 or cOA6. Ring nucleases may be derived from the host (e.g. Crn-1) or from phages (e.g. AcrIII-1). The bottom row illustrates how cOA4 is degraded by ring nucleases or Csm6/Csx1 into linear di-adenylates with 2´,3´-cyclic phosphates (A2>P). cOA6 is degraded in a similar manner. C. Self versus non-self discrimination in Type III CRISPR-Cas systems depends on 5´-tag:anti-tag complementarity. Base-pairing at the -2 to -5 positions (+1 is the first nucleotide of the spacer region) inhibits ssDNA cleavage and cOA synthesis, but not complementary RNA degradation by Csm/Cmr. In all panels, Cas proteins are colored as in Figure 1, but with colors muted for clarity of nucleic acid substrates.

Figure 4.
A. Applications of Type I CRISPR-Cas effector complexes include transcriptional repression/activation, generation of long-range genomic deletions, generation of dsDNA breaks, and insertion of large DNA fragments, as discussed in the text. Subunits are colored as in Figure 1. B. Applications of Type III effectors include RNA knockdown by Csm or Cmr (left) and use of the Csm6 RNase in RNA diagnostics (right), as discussed in the text. Subunits are colored as in Figure 1. Cas13 is an RNA-guided RNA nuclease from the Type VI CRISPR-Cas system. Trans cleavage of an RNA substrate containing A's (yellow) and U's (red) by Cas13 leads to release of a linear hexaadenylate with a 2´,3´-cyclic phosphate (A6>P activator). A6>P can bind and stimulate Csm6 to cleave a fluorescent RNA reporter. C. Future applications of Class 1 CRISPR-Cas systems. The multisubunit Type I and III complexes could be fused to diverse functional domains for DNA and RNA editing. Multiple subunits per effector complex could also facilitate multiplexing. The cOA signaling pathway of Type III systems could be harnessed for control of cellular states in prokaryotes or eukaryotes by coupling cOA-binding nucleases to a cOA synthetase. MTase, methyltransferase; GFP, green fluorescent protein; cOA, cyclic oligoadenylate.