Structural basis of cyclic oligoadenylate binding to the transcription factor Csa3 outlines cross talk between type III and type I CRISPR systems

RNA interference by type III CRISPR systems results in the synthesis of cyclic oligoadenylate (cOA) second messengers, which are known to bind and regulate various CARF domain–containing nuclease receptors. The CARF domain–containing Csa3 family of transcriptional factors associated with the DNA-targeting type I CRISPR systems regulates expression of various CRISPR and DNA repair genes in many prokaryotes. In this study, we extend the known receptor repertoire of cOA messengers to include transcriptional factors by demonstrating specific binding of cyclic tetra-adenylate (cA4) to Saccharolobus solfataricus Csa3 (Csa3Sso). Our 2.0-Å resolution X-ray crystal structure of cA4-bound full-length Csa3Sso reveals the binding of its CARF domain to an elongated conformation of cA4. Using cA4 binding affinity analyses of Csa3Sso mutants targeting the observed Csa3Sso•cA4 structural interface, we identified a Csa3-specific cA4 binding motif distinct from a more widely conserved cOA-binding CARF motif. Using a rational surface engineering approach, we increased the cA4 binding affinity of Csa3Sso up to ∼145-fold over the wildtype, which has potential applications for future second messenger-driven CRISPR gene expression and editing systems. Our in-solution Csa3Sso structural analysis identified cA4-induced allosteric and asymmetric conformational rearrangement of its C-terminal winged helix-turn-helix effector domains, which could potentially be incompatible with DNA binding. However, specific in vitro binding of the purified Csa3Sso to its putative promoter (PCas4a) was found to be cA4 independent, suggesting a complex mode of Csa3Sso regulation. Overall, our results support cA4- and Csa3-mediated cross talk between type III and type I CRISPR systems.

The hallmark property of the CRISPR-Cas systems is their ability to adapt, process, and interfere with foreign genetic material. This is accomplished through a three-stage process involving (i) spacer acquisition (adaptation), (ii) crRNA production (processing), and (iii) target interference (8). The adaptation stage involves de novo spacer acquisition by a complex formed by two highly conserved nucleases (Cas1 and Cas2) (9,10). A Cas4 endonuclease further recognizes and removes a short 5′-flanking region of the prespacers called the protospacer adjacent motif (PAM), ensuring integration of only mature spacers into the CRISPR array (11). During the crRNA production stage, Cas6 (or an endogenous protein) processes long CRISPR pre-RNAs into short mature crRNAs that contain a region of the extrachromosomal genetic element, a 5′ tag derived from the preceding repeat, and a 3′-end handle from the downstream repeat (12–17). During the final stage of target interference, a ribonucleoprotein (RNP) complex, comprising the mature crRNA and Cas proteins, specifically cleaves foreign nucleic acids via base pairing with the crRNA (18) and by the recognition of the PAM sequence in the target (19–24).
Based on the complexity of the associated RNP complexes, CRISPR systems are classified into two major classes (class 1 and class 2) that contain a total of six different types (types I, III, and IV for class 1 and types II, V, and VI for class 2) (25,26). The interference RNP complexes of class 2 systems employ a single protein in complex with a crRNA, whereas class 1 systems use multisubunit RNP complexes. Owing to their simplicity and amenability to practical applications such as genome editing, the class 2 systems (containing Cas9, Cas12, and Cas13) are better studied than class 1 systems (27,28). The less studied class 1 systems, however, are more primitive, abundant, and widespread in prokaryotes, comprising about 90% of all CRISPR-Cas systems (25,29).
The most widespread type I and type III class 1 systems coexist in many prokaryotic genomes (30–32). Such a coexistence is well represented in genomes from the crenarchaeal order Sulfolobales, two-thirds of which harbor both systems (4,33,34). For example, Sulfolobus islandicus encodes two CRISPR loci, one subtype I-A adaptation and two subtype III-B interference modules (34). Saccharolobus solfataricus, on the other hand, possesses a more extensive CRISPR system with six different CRISPR loci, which include two type I adaptation modules as well as three type I and four type III interference modules (35,36).
The type I and III interference complexes exhibit a striking functional diversity (37). Recognition of a PAM and a crRNA complementary sequence in the target DNA by the type I interference complex (known as CRISPR-associated complex for antiviral defense or Cascade) recruits the Cas3 endonuclease for degradation of the nontarget strand (Fig. 1B) (38,39). By contrast, the type III interference (cmr-csm) complexes recognize a newly transcribed phage RNA in a PAM-independent fashion by base pairing with a seed motif at the 3′ end of the crRNA (4,30–34). Self-targeting during type III interference is prevented by a mismatch at the 5′ end of the crRNA (Fig. 1A) (40,41). The interference by cmr-csm complexes involves degradation of the target RNA as well as the nontemplate DNA (Fig. 1A) (32,42–47). Owing to its PAM-independent functional mode, the cmr-csm complex exhibits a broad target specificity and provides a unique phage survival advantage to the prokaryotic cells coharboring the type I and III systems (48,49).
Bacteriophage infections drive large global changes in the archaeal transcription. For example, infection by S. islandicus rod-shaped virus 2 has been shown to upregulate the expression of approximately one-third of the S. islandicus genome (55). Although production of cOAs by type III interference complex could conceivably mediate such transcriptional regulation, type III loci do not encode a transcription factor.
The adaptation and interference cassettes of the type I systems, however, encode CRISPR Apern 3 (Csa3) family transcription factors Csa3a and Csa3b, respectively (56). The S. islandicus Csa3b (Csa3bSis) acts as a transcriptional repressor of genes encoding the subtype I-A CRISPR spacer acquisition complex and the subtype I-A target interference complex, as well as a transcriptional activator of genes encoding the subtype III-B cmr interference complex (57,58). S. islandicus Csa3a (Csa3aSis), on the other hand, transcriptionally activates expression of CRISPR arrays, the subtype I-A adaptation complex, and DNA repair proteins (59,60). An atomic structure of the apo form of a Csa3a homolog from S. solfataricus (Csa3Sso) has been previously reported to harbor an N-terminal CARF and a C-terminal MarR-like winged helix-turn-helix (wHTH) domain (61). The Csa3Sso CARF domain exhibited a dimerization-mediated 2-fold symmetric ligand-binding pocket, which was predicted to bind a four-nucleotide-long RNA (61). Consistent with this, Csa3bSis has recently been shown to bind a linear analog of cA4 (5′-CAAAA-3′) in a CARF domain-dependent way (57). However, despite the functional significance of Csa3 transcription factors, their ligand specificity and the structural basis of ligand binding have not been reported. Here, we identify cA4 as the cognate ligand of Csa3Sso and show that Csa3Sso lacks ring nuclease activity in vitro. We determine a 2.0-Å crystal structure of Csa3Sso bound to cA4 and identify Csa3Sso residues important for cA4 binding. Complementary SAXS analysis indicates a cA4-induced conformational change in preformed Csa3Sso dimers, suggesting that allosteric changes within the Csa3Sso dimer may regulate Csa3-mediated signaling.

Results
Figure 1. Synthesis of cyclic oligoadenylates (cOAs) by type III interference complex, and transcriptional activation of CRISPR array and acquisition genes by Csa3a. A, infection by a mutant virus lacking a PAM sequence escapes DNA recognition by Cascade (type I interference complex). Type III interference complexes can use crRNAs produced from type I CRISPR loci for interference against the mutant phage. crRNA end mismatch-mediated binding of the phage transcript by type III interference complex induces ssDNA nuclease and cyclase activities of its Cas10 subunit (yellow), resulting in the synthesis of cAn (n = 3-6, with the more abundant cA3, cA4, and cA6 illustrated in yellow background). Cas10 subunit activities are turned off by Csm3 or Cmr4 subunit (shown as magenta ovals)-mediated cleavage of the phage transcript. Most of the characterized cOA receptors are nucleic acid hydrolases whose activities are regulated by cOA binding to a CARF domain. B, Csa3 (Csa3a-type) transcriptional factor from the type I CRISPR locus activates the transcription of acquisition cas genes and CRISPR arrays to facilitate acquisition of new spacers (step 1) and synthesis of new crRNAs (step 2) for their incorporation into the Cascade complex (step 3) for the eventual recognition and degradation of the phage DNA. Csa3a carries a CARF domain at its N terminus and is investigated as a receptor of cAn in this study. C, gel filtration chromatography analysis of the purified Csa3Sso from S. solfataricus strain P2 (UniProtKB database accession number Q97Y88, MWtheor: 28.1 kDa) shows that Csa3Sso forms dimers (MWexper: 58.31 kDa) in solution. Vertical bars above the absorbance trace indicate the peak positions of the gel filtration standards. The sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) picture shows the purity of Csa3Sso after gel filtration. D, Csa3Sso prefers cA4 (KD of 5.8 ± 0.03 μM) over other cOAs and a linear cA4 analog. The nucleotide preference was determined by microscale thermophoresis-based binding affinity analyses.

Csa3Sso specifically binds cyclic oligoadenylate 4
Based on previous identification of a 2-fold symmetric ligand-binding pocket at the Csa3Sso (a Csa3a homolog, KEGG accession number Sso1445) dimer interface (61), and in vitro binding of a cA4 analog to Csa3bSis (57), we hypothesized that the Csa3 family of transcription factors from S. solfataricus could also be receptors of cOAs. To test the specificity of Csa3Sso binding to cOA nucleotides, we purified an N-terminally His6-tagged fusion of Csa3Sso (His6-Csa3Sso) recombinantly produced in Escherichia coli (Fig. 1C) and performed binding affinity analyses of cA3, cA4, cA6, and the linear 5′-CAAAA-3′ RNA analog using microscale thermophoresis (MST). Despite nonspecific binding exhibited by high concentrations of all the cOAs (250-1000 μM) (Fig. S1), only cA4 exhibited specific low-micromolar binding to His6-Csa3Sso, with an apparent dissociation constant (KD) of 5.8 ± 0.03 μM (Fig. 1D). This is consistent with the previously observed predominance of cA4 among the cOAs produced by the S. solfataricus Csm complex in vitro (50). Of note, the lack of Csa3Sso binding to cA3, a second messenger also produced by CD-NTases of an alternate CBASS antiviral defense system in bacteria (62), suggests specificity of the Csa3 transcription factors to CRISPR-Cas systems. Furthermore, His6-Csa3Sso binding to cA4 is stronger (8-fold) than the previously reported binding of Csa3bSis to the linear "cA4 analog" (5′-CAAAA-3′, KD of 46.10 ± 8.14 μM) that was determined using surface plasmon resonance (57). However, both these Csa3 homologs from Sulfolobus show cA4 binding affinities lower than Treponema succinifaciens Card1 (KD of 15 nM), which is also a cA4 receptor lacking ring nuclease activity (63). Nevertheless, considering the expected high micromolar concentrations of cA4 in the cell upon infection as discussed below (64), we believe that cA4 is the preferred ligand of Csa3Sso.
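To put the dissociation constants quoted in this section into perspective, the following minimal Python sketch compares them and estimates receptor occupancy. It assumes a simple 1:1 binding isotherm with ligand in large excess over receptor (so free ligand is approximately total ligand), which is an illustrative simplification and not the MST fitting model used in this study. The ligand concentrations in the loop are hypothetical values chosen only to span the relevant range.

```python
# Sketch only: simple 1:1 binding isotherm, ligand assumed in excess.

def fraction_bound(ligand_um: float, kd_um: float) -> float:
    """Fraction of receptor occupied at a given free ligand concentration."""
    if ligand_um < 0 or kd_um <= 0:
        raise ValueError("concentration must be >= 0 and KD must be > 0")
    return ligand_um / (kd_um + ligand_um)

# Dissociation constants quoted in the text (micromolar)
KD_CSA3_SSO_CA4_UM = 5.8        # His6-Csa3Sso with cA4 (MST)
KD_CSA3B_SIS_LINEAR_UM = 46.10  # Csa3bSis with linear 5'-CAAAA-3' (SPR, ref. 57)
KD_CARD1_CA4_UM = 0.015         # T. succinifaciens Card1 with cA4 (ref. 63)

# Ratio behind the "8-fold" statement in the text
fold_vs_linear = KD_CSA3B_SIS_LINEAR_UM / KD_CSA3_SSO_CA4_UM
print(f"Csa3Sso/cA4 versus Csa3bSis/linear analog: {fold_vs_linear:.1f}-fold")

# Hypothetical cA4 concentrations (micromolar), for illustration only
for ligand_um in (1.0, 5.8, 50.0, 500.0):
    occ_csa3 = fraction_bound(ligand_um, KD_CSA3_SSO_CA4_UM)
    occ_card1 = fraction_bound(ligand_um, KD_CARD1_CA4_UM)
    print(f"{ligand_um:7.1f} uM cA4: Csa3Sso {occ_csa3:.2f}, Card1 {occ_card1:.2f}")
```

Under this simplified model, occupancy reaches one half when the ligand concentration equals KD, so a receptor with a micromolar KD would approach saturation only at the high micromolar cA4 levels mentioned in the text.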
The 2.0-Å crystal structure of Csa3Sso bound to cA4
To better understand the structural basis for Csa3Sso ligand specificity, we determined a 2.05-Å X-ray crystal structure of His6-Csa3Sso bound to cA4 (Table 1). The overall Csa3Sso•cA4 structure is very similar to that of the previously determined 1.8 Å apo Csa3Sso structure (PDB ID 2WTE, RMSD of 0.735 Å over all atoms, Fig. S2) (61). Like the apo Csa3Sso structure, the Csa3Sso dimer is domain swapped with respect to the N-terminal CARF and the C-terminal wHTH DNA-binding domains of the two protomers A and B (Fig. 2). The N-terminal CARF domain in each protomer is composed of six mixed β-strands flanked by four α-helices. The first five β-strands (βN1-βN5) run parallel, whereas the last one (βN6) runs antiparallel and connects the CARF domain to a C-terminal wHTH DNA-binding domain (residues 145-212) through a linker (residues 133-144). The linker is composed of two turns of α-helix followed by four residues in an elongated conformation. The C-terminal wHTH DNA-binding domain contains a right-handed three-helix bundle formed by three helices (αC1-αC3), where αC2 and αC3 comprise the wHTH motif, with αC3 (residues 173-186) constituting the DNA recognition helix. The wHTH of Csa3Sso belongs to the widespread MarR-like wHTH fold in which the DNA recognition helix is followed by two β-strands to make a "wing" that follows an α-helix. Accordingly, the HTH wing in the His6-Csa3Sso•cA4 structure is composed of the αC3-αC4 loop that is followed by αC4. Although residues 192 to 196 at the tip of this wing are disordered in protomer A, the electron density for all the protomer B wing residues was observed, likely due to their stabilization by contacts with a symmetry-related subunit (Fig. 2A).
Figure 2. X-ray crystal structure of the Csa3Sso•cA4 complex at 2.05 Å. A, a side view of the X-ray crystal structure of the Csa3Sso dimer complexed with cA4 (ball-and-stick model) using its N-terminal CARF domain. Blue and purple cartoons represent the two Csa3Sso protomers in the dimer. The Csa3Sso secondary structure elements are labeled with "type" (α or β), followed by "domain" (N or C terminal), and "number" (1-6 for β-strands and 1-4 for α-helices). Protomer B elements are labeled with an apostrophe symbol (') to differentiate them from protomer A elements. αC3 is connected to αC4 by a flexible linker shown as a dashed blue line. The zoomed-in inset on the top shows an expanded side view of cA4 (ball-and-stick model), highlighting the "planar" and "outward facing" (outer) groups in cA4. B, a top view of the Csa3Sso•cA4 structure showing Csa3Sso in the surface representation (blue and purple). To obtain this view of the Csa3Sso•cA4 model, the structure illustrated in A was rotated 90° in the direction indicated by the black arrow. The zoomed-in inset on the top shows an expanded top view of cA4 (ball-and-stick model), with a composite electron density map (2Fo-Fc, contoured at 2.0 σ) of cA4 in the refined Csa3Sso•cA4 structure. In both A and B insets, groups in cA4 are denoted as A, adenine; R, ribose; P, phosphoryl. To simplify illustration of the bound cA4 in this text, we refer to its AMP moieties as AMP1a, AMP1b, AMP2a, and AMP2b; the corresponding adenine rings as A1a, A1b, A2a, and A2b; its ribose rings as R1a, R1b, R2a, and R2b; and the four phosphoryl groups as P1a (connecting R1a and R2a), P2a (connecting R1b and R2a), P1b (connecting R1b and R2b), and P2b (connecting R2b and R1a). All the cA4 atoms had a clash score of <0.7 except for the C4 and O4 atoms of R2b, which were allowed at 1.15 Å to fit the cA4 molecule into the electron density.

The Csa3Sso•cA4 complex exists as a dimer in vitro
Apo Csa3Sso has previously been shown to exist in a dimeric form (61). However, crystal packing analysis of our Csa3Sso•cA4 crystal structure depicted dimer-dimer interactions that could underlie a higher-order oligomerization. More specifically, a symmetry mate obtained by the operation X − 1/2, −Y + 1/2, −Z − 1 on the dimer in the asymmetric unit showed an increase in buried surface area of 2444.5 Å² per dimer of Csa3Sso using the PISA server (65,66). Such a large crystallographic interface does not exist in the apo Csa3Sso structure. To assess the possibility of cA4-induced Csa3Sso oligomerization in solution, we performed size-exclusion chromatography. Consistent with prior reports, our Csa3Sso preparations eluted as dimers in gel filtration chromatography (60 kDa versus globular standards, Fig. 1C). We further analyzed the oligomeric properties of Csa3Sso in the presence and absence of cA4 using sedimentation velocity analytical ultracentrifugation (SV-AUC) (Figs. 3 and S3 and Table S1). In SV-AUC, His6-Csa3Sso (96 μM) appeared as a single peak at 3.5 S20,w with an estimated mass (Mf) of 52 kDa, consistent with the theoretically calculated S and mass values of 3.8 S and 52.3 kDa from the apo Csa3 structure (PDB 2WTE) (Figs. 3 and S3 and Table S1) (61). The addition of excess amounts of cA4 (Figs. 3 and S3 and Table S1) yielded a very similar dimer profile (3.6 S20,w and 61.2 kDa), consistent with the S and mass values calculated from our Csa3Sso•cA4 structure with cA4 bound (3.86 S20,w and 57.5 kDa). Overall, these data provide evidence for monodisperse dimers of Csa3Sso in solution that persist in the presence and absence of cA4. Consistent with the in-solution dimers observed for Csa3Sso, the asymmetric unit of the Csa3Sso•cA4 complex structure consisted of a dimer of Csa3Sso bound to one molecule of cA4 (Fig. 2B).
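The dimer assignment above can be illustrated with a back-of-the-envelope check that divides each experimental mass estimate by the theoretical monomer mass. The short Python sketch below uses only numbers quoted in the text and in the Fig. 1C caption; it ignores the His6 tag and any disordered residues, so it is a rough plausibility check under those assumptions and not a substitute for the SEDFIT analysis.

```python
# Sketch only: apparent subunit count from quoted mass estimates.

MONOMER_KDA = 28.1  # theoretical Csa3Sso monomer mass (Fig. 1C caption)

# Experimental mass estimates quoted in the text (kDa)
estimates_kda = {
    "gel filtration": 58.31,
    "SV-AUC, apo": 52.0,
    "SV-AUC, with cA4": 61.2,
}

def apparent_subunits(mass_kda: float, monomer_kda: float = MONOMER_KDA) -> float:
    """Ratio of an experimental mass to the monomer mass."""
    return mass_kda / monomer_kda

# Nearest integer oligomeric state for each estimate
nearest_state = {name: round(apparent_subunits(m)) for name, m in estimates_kda.items()}

for name, mass in estimates_kda.items():
    print(f"{name}: {apparent_subunits(mass):.2f} subunits, nearest state {nearest_state[name]}")
```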
Conformation adopted by cA4 in the Csa3Sso•cA4 structure
Comparison of all the cA4-bound structures of CARF domain proteins revealed that cA4 in the Csa3Sso•cA4 complex structure exists in a unique elongated conformation. In this conformation, P1a and P1b (called distal phosphoryl groups) in the cA4 ring are stretched outward (with a P-to-P distance of 11.8 Å), which brings P2a and P2b (called proximal phosphoryl groups) to a shorter P-to-P distance of 4.4 Å (Fig. 2B inset). The difference of 7.4 Å in these P-to-P distances exemplifies the most extended conformation of cA4 observed among known cA4 receptors (Fig. S4). Furthermore, the A2a and A2b adenines adopt a conformation parallel to the plane of the cA4 backbone (hereafter called planar adenines), whereas the A1a and A1b adenines face outside of this plane, pointing away from the binding pocket (hereafter called nonplanar adenines [Fig. 2A inset]). Similarly, the 2′ hydroxyls in R1a and R1b and both unbonded oxygens in P2a face away from the binding pocket (Fig. 2A inset). Three of the phosphoryl groups in cA4 (labeled P1a, P1b, and P2b) are planar, whereas P2a faces away from the binding pocket (hereafter called nonplanar phosphoryl). Furthermore, the 2′ hydroxyl oxygen (O'), phosphorus (P), and 5′-ribosyl oxygen (O'') of R1a, R1b, R2a, and R2b exhibit O'-P-O'' angles of 125°, 141°, 158°, and 94°, respectively (Fig. 4C).
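The geometric descriptors used in this section (P-to-P distances and O'-P-O'' angles) are straightforward to compute from atomic coordinates. The Python sketch below shows one way to do so with the standard library; the helper names and the coordinates are hypothetical placeholders chosen for easy checking, not atoms from the deposited model. For context, an in-line nucleophilic attack by the 2′ hydroxyl is generally favored when this angle approaches 180°.

```python
import math

# Sketch only: distance and angle helpers for 3D coordinates (in angstroms).

def distance(a, b):
    """Euclidean distance between two points."""
    return math.dist(a, b)

def angle_deg(a, vertex, c):
    """Angle a-vertex-c in degrees, e.g. O'-P-O'' with P as the vertex."""
    v1 = [x - y for x, y in zip(a, vertex)]
    v2 = [x - y for x, y in zip(c, vertex)]
    dot = sum(p * q for p, q in zip(v1, v2))
    cos_theta = dot / (math.hypot(*v1) * math.hypot(*v2))
    # Clamp to guard against rounding slightly outside [-1, 1]
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))

# Hypothetical coordinates, chosen so that the answers are obvious
o2_prime = (1.0, 0.0, 0.0)
phosphorus = (0.0, 0.0, 0.0)
o5_bent = (0.0, 1.0, 0.0)      # perpendicular arrangement
o5_inline = (-1.0, 0.0, 0.0)   # collinear, in-line arrangement

bent_angle = angle_deg(o2_prime, phosphorus, o5_bent)
inline_angle = angle_deg(o2_prime, phosphorus, o5_inline)

# Elongation measure from the two P-to-P distances quoted in the text
distal_pp, proximal_pp = 11.8, 4.4
elongation = round(distal_pp - proximal_pp, 1)
print(bent_angle, inline_angle, elongation)
```

Applied to real coordinates, `angle_deg` would take the 2′ oxygen, the phosphorus, and the 5′ oxygen of each phosphodiester linkage in turn.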

cA4 recognition by Csa3Sso
Superimposition of our cA4-bound structure with the apo Csa3Sso structure (PDB 2WTE) revealed that the cA4-binding pocket is largely preformed upon dimerization of the CARF domain (residues 1-132) (Fig. 2). Owing to symmetry in this pocket, overlapping sets of CARF residues from both the Csa3Sso protomers make equivalent interactions with the two halves of cA4. More specifically, CARF residues from protomers A and B of the Csa3Sso dimer (labeled with subscripts A and B in this text) interact almost exclusively with adenine, ribose, and terminal phosphoryl groups labeled "a" (for A1a, A2a, R1a, R2a, and P1a) and "b" (A1b, A2b, R1b, R2b, and P1b), respectively (Fig. 4, A-D). Our structural analysis of the Csa3Sso•cA4 interface as well as sequence alignment of ten archaeal Csa3Sso homologs refined the two previously predicted ligand-binding motifs within the Csa3Sso CARF domain (Figs. 4 and 5, also see Discussion) (61). The first motif (named cA4 binding motif 1) comprises residues 8 to 14 (from the β1-α1 loop and the α1 helix), and the second motif (named cA4 binding motif 2) comprises residues 94 to 99 (located in the β4-α4 loop and α4 helix). Residues from these motifs as well as other regions in Csa3Sso make an extensive network of hydrogen bonding and hydrophobic interactions with different cA4 groups (Fig. 4, A and B) as described below.

Figure 3. cA4-bound Csa3Sso is a dimer in solution. Sedimentation velocity analytical ultracentrifugation data showing c(S) distributions. c(S) values were derived from the fitting of the Lamm equation to the experimental data collected for wildtype Csa3Sso (40 μM) in the absence (red) and presence (blue) of cA4 ligand (50 μM), as implemented in the program SEDFIT. The profile for the Csa3Sso-R98A mutant (64 μM) in the absence of cA4 ligand (green) is also shown. This analysis shows evidence of dimeric species in solution that persist upon the addition of cA4. The emergence of the 1.5S species in the presence of cA4 is interpreted to be mild protein degradation occurring during the time course of the experiment. Parameters derived from these analyses are presented in Table S1, and Lamm equation fits to the primary data are shown in Fig. S3.

Figure 4. [...] The cA4 and interacting Csa3Sso residues are shown in a ball-and-stick representation with green, purple, and blue sticks representing cA4, protomer A, and protomer B, respectively. To depict the symmetric orientation of cA4 and its interactions with Csa3Sso residues with respect to the CARF dimerization interface, the corresponding Csa3Sso protomer regions are shown by blue (protomer A) and purple (protomer B) backgrounds. B, a top view of the Csa3Sso•cA4 complex structure showing interactions of cA4 with CARF domains from the two Csa3Sso protomers. Csa3Sso protomers A and B are depicted in blue surface and purple cartoon representations, respectively. For simplicity of illustration, only the hydrophobic interactions from protomer B (purple residues in ball-and-stick representation) and only the polar interactions from protomer A (blue residues in ball-and-stick representation) are shown. The hydrogen bonds are depicted as solid yellow (B) and dashed green (A) lines with interatomic distances labeled above the green lines. In both A and B, groups in cA4 are denoted as A, adenine; R, ribose; P, phosphoryl. C, a bottom-up view of the polar interactions of Csa3Sso-Arg98 and -E122 residues with P2b phosphoryl oxygens in cA4. D, surface electrostatic potential distribution of the cA4-binding pocket showing electrostatic repulsion of the negatively charged central phosphoryl in cA4 by negative charge from the Csa3Sso-E122 side chain. Calculations of surface electrostatic potential distribution were performed with the APBS electrostatics plugin in the PyMOL program using default parameters. Electrostatic potential values are shown in a scale from red to blue, corresponding to −5.0 and +5.0 kcal/(mol), respectively, at 310 K. E, binding affinities of Csa3Sso mutants targeting the Csa3Sso structural interface with cA4, measured using microscale thermophoresis. The graph displays data from three independent measurements. Error bars represent standard deviation.

Figure 5. Structure-guided sequence alignment of Csa3 homologs. The Csa3Sso residues at the Csa3Sso•cA4 structural interface are identified by circles below the alignment. The Csa3Sso residues marked with stars at the bottom of the alignment identify the cA4 binding motif identified in the nuclease receptors of cOAs containing the CARF domain. Blue rectangles depict cA4 binding motifs 1 and 2 as determined from the Csa3Sso•cA4 structure. Circles filled with green triangles identify conserved (>65% similarity among ten Csa3 homologs) residues at the cA4 interfaces in both the Csa3Sso protomers that showed significant loss or gain of function in the cA4 binding assay (Fig. 4E). Empty circles and those filled with smaller circles represent the interfacial residues present in one or both protomer(s) in the Csa3Sso dimer, respectively, and were not subjected to mutagenesis. Circles filled with red color identify interfacial residues that are conserved but were not subjected to mutagenesis in this study. The numbering is based on the residue positions of Csa3Sso (gene accession number: Sso1445). The conservation of residues at each position is depicted by the size of the letters in the sequence logo on top of the alignment, where the most conserved residues are highlighted by a larger-sized letter and by a black background. Logo letters colored blue, green, red, and black indicate basic, polar, acidic, and hydrophobic residues, respectively. The secondary structure elements are derived from the Csa3Sso•cA4 structure (in which α-helices are shown as magenta cylinders and β-sheets are shown as yellow arrows). The sequence alignments of the Csa3 homologs were performed using the T-Coffee method (107) and were edited using Geneious Prime software (https://www.geneious.com) and Adobe Illustrator (version 25.3). Each homolog is identified by its accession number and bacterial source.

Csa3Sso interactions with cA4 adenine rings
Extensive hydrophobic interactions from residues in cA4 binding motifs 1 and 2 stabilize all four cA4 adenine rings. The nonplanar A1 adenines (A1a and A1b) dock into shallow hydrophobic pockets formed by Thr13 A&B and Phe14 A&B (from cA4 binding motif 1), whereas the planar A2 adenines (A2a and A2b) occupy much more elaborate and deeper hydrophobic pockets. In the planar adenine pocket, one face of the adenine is stabilized by a fully conserved Phe10 A&B (cA4 binding motif 1) and the other face is docked onto the less conserved Met97 A&B (cA4 binding motif 2), Pro35 A&B, Val39 A&B, Thr42 A&B, and Thr13 B (Fig. 4, A and B). The hallmark of all the adenine-binding pockets in Csa3Sso is the π-stacking interactions of Phe14 A&B and Phe10 A&B with the nonplanar (A1a and A1b) and planar (A2a and A2b) adenine rings, respectively. More specifically, Phe14 A and Phe10 B stabilize adenines A1a and A2b by T-shaped π-stacking interactions, and sandwich-type π-stacking interactions from Phe14 B and Phe10 A engage A1b and A2a, respectively (Fig. 4C). Both Phe14 and Phe10 are significantly conserved in Csa3Sso homologs; Phe14 is conserved in six of the ten Csa3Sso homologs, and Phe10 is conserved as a Phe or Tyr in all ten homologs (Fig. 5). Furthermore, structural superimposition of cA4-bound Csa3Sso with apo Csa3Sso shows a linear movement of Phe14 A or a 90° rotation of Phe14 B, likely facilitating interactions with cA4 (data not shown). To analyze the contribution of Phe14 and Phe10 to cA4 binding, we mutagenized these residues to alanines and investigated their ability to bind cA4 by MST. The alanine mutants of both Phe14 and Phe10 showed a complete loss of cA4 binding.
Among the polar interactions of Csa3Sso with the cA4 adenines, side chains of Asn11 A&B hydrogen bond to N7 in the nonplanar adenines (A1a and A1b). Since Asn11 is conserved as an Asn, Asp, or His in most of the Csa3 homologs (Fig. 5), it is conceivable that the side-chain carboxylic oxygen or ring nitrogen atoms in these homologs could instead hydrogen bond to the nonplanar adenines (Fig. 4, A and B). Furthermore, the main-chain oxygen atoms of Met8 A&B (cA4 binding motif 1) and Glu122 A&B (β5-β6 loop) also hydrogen bond to N3 in the planar (A2a and A2b) and nonplanar (A1a and A1b) adenines, respectively (Fig. 4, A and B).
Csa3Sso interactions with phosphate and ribose groups at distal ends of cA4
The P1 phosphoryls (P1a and P1b) and R2 ribosyls (R2a and R2b) at the distal ends of the elongated cA4 are hydrogen bonded by main-chain atoms of many residues in the cA4 binding motifs 1 and 2 (Fig. 4, A and B). More specifically, the main-chain nitrogen atoms of Phe10 A&B hydrogen bond with the 2′ ribosyl oxygens of the R2 groups (R2a and R2b) as well as with the P1 phosphoryl oxygens (P1a and P1b). The main-chain nitrogen atoms of Gly9 A, Asn11 A&B, and Gly96 A&B hydrogen bond with the P1 phosphoryl oxygens (P1a and P1b). Among hydrophobic interactions, main-chain atoms of Gly9 B, Gly96 A&B, and Met95 A&B further stabilize the P1 and R2 groups. Of interest, Gly9 and Gly96 are parts of the β1-α1 and β4-α4 loops in motifs 1 and 2, respectively, and are completely conserved among Csa3 homologs (Fig. 5). It is therefore possible that these glycines add flexibility to these loops to facilitate binding of the P1 phosphoryl groups in an extended cA4 conformation (Fig. S4). Furthermore, Csa3Sso-Met95 is conserved as a glycine in most of the Csa3 homologs, which may further add to the flexibility of the β4-α4 loop to accommodate this cA4 conformation (Fig. 5). To test whether small glycine residues play a role in binding, we mutagenized Gly96 to the slightly bulkier residue alanine. Indeed, the G96A mutant completely lost the ability to bind cA4 in our MST assay, confirming the requirement of a small residue at this position (Fig. 4E).

Csa3 Sso interactions with proximal cA4 phosphoryls
The inward-facing phosphoryl group in the middle of the elongated cA4 (P 2b ) was found to hydrogen bond with highly conserved Arg98 A&B (cA4 binding motif 2) and Glu122 A (β5-α5 loop) residues in Csa3 Sso . The interactions with Arg98 involve hydrogen bonds to P 2b oxygens using either two (for Arg98 B ) or one (for Arg98 A ) of its side-chain nitrogen atoms (Fig. 4, A and B). Arg98 is completely conserved in all the Csa3 Sso homologs (Fig. 5). Accordingly, our alanine mutant of Arg98 completely lost cA4 binding in vitro (Fig. 4E). Since Arg98 also contributes significantly to dimerization interface of Csa3 Sso (buried surface area per Arg = 256.14 Å 2 ), we confirmed that this mutant is still dimeric in solution using SV-AUC (Figs. 3 and S3 and Table S1).
Of interest, the cA4 phosphoryl interacting surface of the Csa3 Sso pocket is largely positively charged except for the conserved Glu122 residue that hydrogen bonds with cA4 central phosphoryl (P 2b ) oxygen via its side-chain carboxyl in the protomer A (Fig. 4, A and B). Such interactions of Glu122 side chains could create an electrostatic repulsion with central phosphoryls (P 2a and P 2b ) likely constraining them close to each other in the elongated cA4 conformation (Fig. 4D). The Glu122 A&B side-chain carboxyl also hydrogen bonds with the main-chain nitrogen of Arg98 B&A in the alternate protomer across the dimerization interface (Fig. 4C). Given the role of Arg98-and Gly96-containing motif 2 in cA4 binding discussed above, we wondered if Glu122 interactions with Arg98 main chain help position these motif 2 residues to affect cA4 binding. We therefore hypothesized that substitution of the Glu122 side chain to an Ala should remove electrostatic Glu122 repulsion to the central phosphoryls and disrupt polar interactions with motif 2, which should facilitate Csa3 Sso binding to cA4. Indeed, a mutation of Glu122 to alanine (E122A) dramatically improved the binding affinity of Csa3 Sso for cA4 (K D of 38.3 nM, 145-fold higher affinity over wildtype) (Fig. 4E). To further dissect whether this cA4 binding gain is due to removal of side chain charge, or to the disruption of interactions with Arg98 main chain, we mutagenized Glu122 to a neutral Gln residue, which should still be able to make polar interactions with Arg98 (and P 2b ). An E122Q mutation was also found to drastically increase the cA4 binding affinity in vitro (K D of 44.5 nM, 125-fold over wildtype) showing a more significant role of electrostatic repulsion by Glu122 as compared with its coordination of Arg98 (Fig. 4E). 
In summary, these results reveal the mechanistic basis underlying the inability of wildtype Csa3 Sso to strongly bind cA4 and further confirm that we have identified a biologically important Csa3 protein surface.
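The fold improvements quoted above can be turned into an implied wildtype K D and an approximate change in binding free energy. The sketch below is only an illustration: the wildtype K D is back-calculated from the reported mutant K D values and fold changes, and the temperature of 25 °C (the MST measurement temperature) is an assumption for the free-energy conversion.

```python
import math

R_KCAL = 1.987e-3   # gas constant in kcal/(mol*K)
T_KELVIN = 298.15   # ASSUMPTION: 25 C, the temperature used for the MST runs


def implied_wildtype_kd_nm(mutant_kd_nm, fold_improvement):
    """Back-calculate the wildtype K_D from a mutant K_D and its fold gain."""
    return mutant_kd_nm * fold_improvement


def ddg_kcal_per_mol(fold_improvement, temperature_k=T_KELVIN):
    """ddG = RT ln(K_D,mut / K_D,wt) = -RT ln(fold); negative means tighter binding."""
    return -R_KCAL * temperature_k * math.log(fold_improvement)


# Mutant K_D (nM) and fold improvement over wildtype, as reported in the text.
mutants = {"E122A": (38.3, 145.0), "E122Q": (44.5, 125.0)}

for name, (kd_nm, fold) in mutants.items():
    wt_um = implied_wildtype_kd_nm(kd_nm, fold) / 1000.0
    print(f"{name}: implied WT K_D ~ {wt_um:.2f} uM, "
          f"ddG ~ {ddg_kcal_per_mol(fold):.2f} kcal/mol")
```

Both mutants imply a wildtype K D of roughly 5.6 μM, and both gains correspond to a stabilization of close to 3 kcal/mol under these assumptions.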

Csa3 Sso lacks ring nuclease activity
In the CARF-containing ring nucleases, motif I is primarily implicated in the catalytic activity (67). Nevertheless, a conserved cOA-binding lysine in motif II of cA4 ring nucleases (Lys168 in Sso1393 and Lys106 in Sso2081) (67) was predicted to participate in cOA catalysis by stabilizing the transition state (68). Given Arg98 A&B coordination with P 2b at the end of the β4-α4 loop in the Csa3 Sso cA4 structure, we tested cA4 ring nuclease activity of Csa3 Sso in two different conditions (Figs. 6 and S5, A and B). A C18 HPLC analysis of the reaction components after removal of Csa3 Sso showed no sign of cA4 hydrolysis products (Fig. 6). Consistent with this, P 2b exhibits an O'-P-O" angle of 94° in the Csa3 Sso cA4 structure, which is not conducive to an in-line nucleophilic attack by the 2′ ribosyl hydroxyl.
In a few CARF-containing ring nucleases, polar interactions of a conserved Glu or Asp residue have been proposed to position their catalytic loop for activity (67). To further determine whether Glu122 A side chain coordination of Arg98 main chain and/or negative charge of Glu122 A side chain around central phosphoryl (P 2b ) prevents ring nuclease activity in Csa3 Sso , we tested cA4 hydrolysis activity of the E122A and E122Q mutants, respectively. However, no ring nuclease activity was observed for these mutants (Fig. S5C). Overall, our Csa3 Sso data are consistent with the proposal that CARF motif 2 is involved in cA4 binding and not catalysis (67).
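The in-line criterion discussed above is a plain three-atom angle: the attacking 2′ oxygen (O'), the phosphorus (P), and the leaving 5′ oxygen (O"). A minimal sketch of that measurement is given below. The coordinates are made-up placeholders, not atoms from the deposited structure, and the function name is hypothetical.

```python
import math


def angle_deg(a, p, b):
    """Return the angle a-p-b in degrees for three 3D points (vertex at p)."""
    v1 = [ai - pi for ai, pi in zip(a, p)]
    v2 = [bi - pi for bi, pi in zip(b, p)]
    dot = sum(x * y for x, y in zip(v1, v2))
    n1 = math.sqrt(sum(x * x for x in v1))
    n2 = math.sqrt(sum(x * x for x in v2))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, dot / (n1 * n2)))
    return math.degrees(math.acos(cos_theta))


# Hypothetical coordinates (in Angstrom) chosen only to illustrate the geometry.
o2_prime = (1.6, 0.0, 0.0)   # attacking 2' oxygen
phosphorus = (0.0, 0.0, 0.0)
o5_prime = (0.0, 1.6, 0.0)   # leaving 5' oxygen

print(f"O'-P-O\" angle: {angle_deg(o2_prime, phosphorus, o5_prime):.1f} degrees")
```

An angle near 180° corresponds to the ideal in-line geometry, whereas a value near 90°, as in this placeholder arrangement, is far from it.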

Solution structure reveals cA4-induced wHTH domain rearrangements in Csa3 Sso
The high overall similarity of the Csa3 Sso cA4 complex structure with the apo Csa3 Sso structure (61) suggested no significant Csa3 Sso conformational changes upon cA4 binding (Fig. S2). This was perplexing to us, especially since conformational differences in Csa3 Sso could be the only other way to understand the functional relevance of cA4 binding, given that cA4 did not change the oligomerization state of Csa3 Sso in our SV-AUC experiments. Also, cA4 binding has been previously shown to induce conformational changes in the Can1 nuclease receptor of cA4 (69). To determine if the Csa3 Sso cA4 crystal structure accurately represents the solution state of this complex, we conducted small-angle X-ray scattering (SAXS) analysis of Csa3 Sso in the presence and absence of cA4. In the presence of excess cA4 (sufficient to saturate all cA4-binding sites), the shape of dimeric Csa3 Sso changed significantly. The data for Csa3 Sso both in the presence and absence of cA4 displayed linearity in the classical Guinier analysis (Fig. S6), with an observed increase in the radius of gyration (R g ) in both the Guinier and inverse Fourier transform (GNOM) analyses (Fig. 7 and Table 2). Mass calculations from these data are consistent with in-solution dimers in both states (Table 2). By P(r) analysis, these differences coincide with increases in R g and D max and a redistribution of interatomic vectors to greater values (Fig. 7A). The numerical values derived from these analyses correlate very well with SAXS measurements made previously for apo Csa3 Sso (61). This suggested that there is a significant conformational difference between the bound and apo states of Csa3 Sso in solution.
Model-independent analyses including Guinier, Kratky, Porod-Debye, and mass calculations indicate that the cA4-induced conformation of the dimer is not due to changes in flexibility and disorder, or mass, but rather discrete differences in the configurations of the structural domains (Figs. 7B and S6 and Table 2). Although low in resolution, SAXS analysis allows for the rigorous testing of atomic models against their solution properties. These model-independent analyses would indicate that single atomistic models (ab initio or atomistic) could be reliably tested against the solution data to discern the nature of these conformational changes.
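For readers unfamiliar with the classical Guinier analysis mentioned above, the following sketch shows the underlying linear fit, ln I(q) = ln I(0) − q²R g ²/3, on a synthetic curve. It is not the procedure implemented in ScÅtter or RAW; the function names and the test values (R g = 25 Å, I(0) = 100) are assumptions chosen for illustration only.

```python
import math


def synthetic_guinier_curve(q_values, rg, i0):
    """Ideal Guinier-regime intensities, I(q) = I0 * exp(-(q*Rg)^2 / 3)."""
    return [i0 * math.exp(-(q * rg) ** 2 / 3.0) for q in q_values]


def guinier_fit(q_values, intensities):
    """Least-squares line through ln I versus q^2; returns (Rg, I0).

    The caller is expected to pass only low-q points (roughly q*Rg < 1.3).
    """
    xs = [q * q for q in q_values]
    ys = [math.log(i) for i in intensities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if slope >= 0:
        raise ValueError("Guinier slope must be negative to define Rg")
    return math.sqrt(-3.0 * slope), math.exp(intercept)


# Hypothetical low-q grid in inverse Angstrom (q*Rg stays below about 1.1).
q_grid = [0.005 + 0.001 * k for k in range(40)]
rg_fit, i0_fit = guinier_fit(q_grid, synthetic_guinier_curve(q_grid, 25.0, 100.0))
print(f"Rg = {rg_fit:.2f} A, I(0) = {i0_fit:.2f}")
```

Linearity of ln I versus q² over this range, as reported for both states, is what justifies reading R g from the slope.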
Crystallographic Csa3 Sso models only provide atomic inventory for 212 of the 248 a.a. in the His 6 -tagged Csa3 Sso construct used in this study, including the disordered C terminus (25 a.a.). To model this missing atomic inventory against its solution data, we employed the CORAL program, which uses coarse-grain beads for missing amino acids in a hybrid bead-atomistic modeling approach (70). When the method was applied, Csa3 Sso in its unbound form could be readily reconciled against the SAXS data (χ Crysol = 1.5). No improvement was observed when allowing for the C-terminal wHTH domain positions to be refined via rigid-body docking, without symmetry constraints (Fig. 7C). In contrast, a significant preference was shown for a large asymmetric wHTH domain positioning over its symmetric crystallographic configuration in the cA4-bound state (χ Crysol = 2.9 versus 1.1) (Fig. 7D). Therefore, the binding of cA4 to the Csa3 Sso dimer was found to induce large asymmetric conformational changes in the position of the two wHTH domains in solution (Fig. 8). These in-solution conformational changes involved asymmetric, but significant, rotations and displacements of the wHTH domains from chains A (wHTH A ) and B (wHTH B ) (Fig. 8). More specifically, the wHTH A domain exhibited an 83.1° rotation and 20.9 Å displacement (left inset in Fig. 8), whereas the wHTH B domain exhibited a 107.6° rotation and 29.8 Å displacement (right inset in Fig. 8). This results in what we refer to as a "closed" Csa3 Sso state. We envisage that Csa3 Sso samples both this closed state and the open state (equivalent to the apo conformation). Although it is possible that apo Csa3 Sso populates a complex population of open and closed states, the closed conformation is stabilized upon cA4 binding. Overall, our data demonstrate significant allosteric repositioning of wHTH A and wHTH B domains upon cA4 binding, which may underlie cA4 regulation of Csa3 signaling.
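Rotations and displacements of a rigid domain such as those quoted above are commonly summarized from the rigid-body transform that maps one pose onto the other. How the reported values were computed is not stated here, so the sketch below is only one plausible way to obtain such numbers: the rotation angle from the trace of a 3×3 rotation matrix, and the displacement as the distance between domain centroids. All function names and coordinates are hypothetical.

```python
import math


def rotation_angle_deg(rot):
    """Rotation angle of a 3x3 rotation matrix, from trace = 1 + 2*cos(theta)."""
    trace = rot[0][0] + rot[1][1] + rot[2][2]
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_theta))


def rotation_about_z(angle_deg_value):
    """Helper that builds a rotation matrix about the z axis (for the demo only)."""
    t = math.radians(angle_deg_value)
    return [[math.cos(t), -math.sin(t), 0.0],
            [math.sin(t), math.cos(t), 0.0],
            [0.0, 0.0, 1.0]]


def centroid(coords):
    """Mean position of a list of (x, y, z) points."""
    n = len(coords)
    return tuple(sum(c[i] for c in coords) / n for i in range(3))


def centroid_displacement(before, after):
    """Distance between the centroids of two poses of the same domain."""
    return math.dist(centroid(before), centroid(after))


# Placeholder poses: two points shifted by 20.9 A along z.
pose_open = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
pose_closed = [(0.0, 0.0, 20.9), (2.0, 0.0, 20.9)]
print(f"{rotation_angle_deg(rotation_about_z(83.1)):.1f} degree rotation, "
      f"{centroid_displacement(pose_open, pose_closed):.1f} A displacement")
```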
In the bound form, a redistribution of interatomic vectors from 38 to 80 Å is observed, consistent with rearrangement of globular domains in response to a ligand. B, dimensionless Kratky Plot analysis (105), where the intensity of scattering is plotted as qR g 2 *I(q)/I(0) versus qR g 2 . R g is the radius of gyration in Å, I is the scattering intensity in arbitrary units, and q is the magnitude of the scattering vector (q = 4π sin(θ)/λ, where λ is the X-ray wavelength and 2θ is the scattering angle). Both Apo (blue line) and cA4-bound (red line) Csa3 Sso dimers show a characteristic bell-shaped peak at low-q that returns to near baseline at wider scattering angles, indicative of a more compact, globular macromolecule. C and D, CORAL analysis (70) of Csa3 Sso dimers, which employs a rigid body approach to optimize crystallographic models against experimental SAXS data. In this approach, missing and flexible atomic inventory are represented as beads in coarse grain fashion and flexibly fit. Shown on the left are the experimental SAXS data for both Apo (C, blue) and cA4-bound (D, red) as gray circles in a log-log plot, where intensity I is plotted as a function of q. In this analysis, two modeling approaches were considered in each state: a "fixed" approach, where the C-terminal wHTH domains were fixed in their crystallographic configuration, and a "refined" calculation, where the C-terminal domains were additionally refined in atomic position. In the Apo state (C), the fixed configuration (cyan dotted line) showed a slightly better agreement (χ 2 = 1.5) than the calculations where wHTH positions were refined (blue solid line, χ 2 = 1.9). Conversely, in the cA4-bound state (D), the fixed configuration (pink dotted line) showed worse agreement with the solution data (χ 2 = 2.9) than the calculations where the wHTH positions were refined (red solid line, χ 2 = 1.1). The corresponding structural models derived are shown to the right for both the apo and cA4-bound states. 
A gallery of representative calculations (n = 10) in each state is provided in Fig. S7.
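The bell-shaped peak described for the dimensionless Kratky plot can be illustrated with an ideal compact particle. The sketch below assumes the commonly used normalization, (qR g )²·I(q)/I(0) plotted against qR g , and a pure Guinier-type intensity; under those assumptions the peak sits at qR g = √3 with a height of 3/e. The function name and the test curve are illustrative and are not taken from the analysis software used in this study.

```python
import math


def dimensionless_kratky(q_values, intensities, rg, i0):
    """Return (q*Rg, (q*Rg)^2 * I(q)/I(0)) pairs for a scattering curve."""
    return [(q * rg, (q * rg) ** 2 * i / i0)
            for q, i in zip(q_values, intensities)]


# Ideal Guinier-type curve with Rg = 1 and I(0) = 1 (arbitrary units).
q_grid = [0.001 * k for k in range(1, 3001)]
curve = [math.exp(-(q ** 2) / 3.0) for q in q_grid]

points = dimensionless_kratky(q_grid, curve, rg=1.0, i0=1.0)
x_peak, y_peak = max(points, key=lambda point: point[1])
print(f"peak at qRg = {x_peak:.3f} with height {y_peak:.3f}")
```

Flexible or unfolded particles depart from this shape by failing to return toward baseline at higher qR g , which is why the plot is used as a check on compactness.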

cA4-induced allostery in Csa3
a pseudopalindromic consensus region upstream of CRISPR leader and spacer acquisition operon (viz., cas1, cas2, cas4, and csa1/cas4a genes) (59,60). We therefore evaluated the effect of the cA4-induced repositioning of wHTHs observed in our cA4-bound solution structure on the DNA binding by Csa3 Sso . Our superpositional docking of the DNA fragments from homologous OhrR-DNA complexes (PDB ID: 1Z9C) (71) onto these apo state structures showed no change in DNA conformations (Fig. 9A left panel) (61). However, the "closed-state" (cA4-bound) models exhibited drastically different docked DNA conformations for both wHTH A and wHTH B . More specifically, the fragment docked onto the wHTH A in the closed state sterically clashed with the CARF B , and the one docked onto wHTH B showed a 90° rotation with respect to its position in the apo state model (Fig. 9A right panel). Since each docked DNA fragment here represents one of the two palindromes Csa3 Sso is expected to bind, we considered experimental evaluation of a possibility of DNA binding disruption by cA4 binding to Csa3 Sso . For this, we performed initial electromobility shift assays (EMSAs) using the cas4a (accession ID: sso1451) promoter fragment (P Cas4a ) originally predicted by Liu et al. (59) in S. solfataricus P2 genome to conserve the Csa3a-binding site (Fig. S8, A and B). Our negative controls lacking predicted Csa3-binding sites included a similarly long leader A as well as a DNA fragment unrelated to CRISPR systems (Fig. S8). Unexpectedly, Csa3 Sso interacted with all these fragments in a sequence-independent way generating protein-DNA precipitates that did not enter the gel (Fig. S8C). Furthermore, crude visualization of these Csa3 Sso -P Cas4a complex precipitates in microcentrifuge tubes showed that presence of excess of cA4 prevented their formation (Fig. 9B). 
To analyze if this cA4-mediated effect is due to a possible alteration in the physicochemical composition of the binding reaction by cA4, we used cA4 binding-deficient Csa3 Sso -R98A mutant in this examination and found that cA4 could not rescue the Csa3 Sso -R98A mutant from precipitating with the DNA (Fig. 9B). To analyze the effect of cA4 addition on the sequence-specific DNA binding by Csa3 Sso , we chose P Cas4a that additionally revealed a small proportion of the shifted probe representing soluble protein-DNA complexes (Fig. S8C). Surprisingly, however, we did not see a significant effect of cA4 on the levels of the minor soluble Csa3 Sso -DNA complex population (faint shifted band), even using a more sensitive EMSA utilizing a Cy5-labeled P Cas4a probe. However, the levels of the insoluble complex population were reduced in a cA4 concentration-dependent manner as indicated by an increase in the amount of the free probe in lanes with excess cA4 added (Fig. 9C).
To further analyze whether binding of cA4 affected specific DNA binding by Csa3 Sso , we performed MST-based binding affinity analysis of Csa3 Sso using P Cas4a as a ligand where we removed precipitated material before the fluorescence measurements. This analysis depicted a K D of 1.62 ± 0.17 μM for binding of Csa3 Sso to P Cas4a , which is moderately better than that observed previously for Csa3b Sis binding to an analogous S. islandicus P Cas4a promoter (57). However, the addition of excess of cA4 (or cA6) did not significantly change the P Cas4a binding affinity (K D s of 3.20 ± 0.36 and 2.4 ± 0.17 μM in the presence of cA4 or cA6, respectively) (Fig. 9D).
In conclusion, these analyses showed that binding of cA4 increases solubility of the Csa3 Sso in the presence of DNA. However, in our assays using the selected binding regions, the specific DNA binding affinity of Csa3 Sso was unaltered in vitro by adding cA4 alone. Although we might have missed the physiologically relevant DNA target of Csa3 Sso , this also brings up possibilities of alternate models of transcriptional regulation discussed below.

(105). These values were determined using the program ScÅtter (https://bl1231.als.lbl.gov/scatter/). b Mass determinations using the Q r invariant (106) were determined using the program RAW. Expected dimeric mass is shown in parentheses. c Errors reported reflect the uncertainty in the value for R g determined using classical Guinier fitting.

Discussion
The type I interference complexes are highly efficient in clearing infections associated with phage genomes containing an intact PAM sequence (72). The phages, on the other hand, have evolved escape strategies like the generation of escape mutations in the PAM sequences (1,48,73,74) and production of anti-CRISPR (Acr) proteins (75,76). During infection by a resilient phage, a coexistent type III interference (csm/cmr) complex works independent of the type I Acr and PAM motif to facilitate phage clearance (Fig. 1) (25,48,49,77,78). However, excessive activity of the csm/cmr complexes and cOA receptors results in nonspecific genomic mutagenesis and RNA hydrolysis, respectively (54,79). The type III interference therefore needs to prevent this aberrant fate during reinfection by the same virus. Consistent with this, the type III interference is bioinformatically predicted to trigger de novo spacer acquisition and crRNA production enabling the type I system for a future reinfection (Fig. 1, A and B) (4, 25). However, the regulatory mechanisms underlying such a reversal to type I interference are not clear.
A CARF domain is a variant of the Rossmann fold lacking the canonical G-X-G-X-(G/A) motif involved in binding NAD(P)H or FADH 2 (80). Instead, CARF domains generally conserve a (D/N)-X-(S/T)-X 3 -(R/K) motif in their βN4-αN4 loop that is known to bind cOAs in many nucleic acid hydrolases (81). The βN4-αN4 loop is conserved in the Csa3 transcription factors (residues marked with white stars in red background in Fig. 5), which, along with the βN1-αN1 and βN5-βN6 loops, creates a potential nucleotide-binding pocket (61). However, biochemical and structural characterization of ligand binding specificity and regulation of Csa3 family transcriptional factors has been missing (4,57,61). Despite the low micromolar binding affinity we observed for cA4 binding to Csa3, we expect it to be physiologically relevant since high micromolar cA4 concentrations have been estimated to be attained in an infected Sulfolobus cell. More specifically, every phage transcript molecule detected by the type III interference complex produces an intracellular cA4 concentration of 6 μM. Thus, a concomitant synthesis of multiple phage RNA molecules could result in an intracellular cA4 concentration equal to 6 μM multiplied by the number of RNA molecules detected (64). Therefore, a relevant biological scenario includes failure of type I CRISPR systems leading to the concomitant detection of several phage transcripts by the type III interference complexes, which then produce cA4 at a rate that significantly exceeds its hydrolysis by the ring nuclease effectors. As a low-affinity cA4 receptor, Csa3 Sso may therefore act only as a last resort to regulate spacer acquisition, crRNA synthesis, and DNA repair. Such a mechanism could help alleviate cellular toxicity known to increase upon extensive spacer acquisition (37). This is further supported by the fact that Csa3 Sso lacks any demonstrable ring nuclease activity in vitro (discussed below), which could reduce the effectiveness of such a system.
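The concentration argument above can be made concrete with a simple one-site occupancy estimate. The sketch below rests on several assumptions that go beyond the text: the wildtype K D of about 5.55 μM is inferred from the reported E122A value (38.3 nM × 145), free cA4 is taken to equal total cA4, cA4 degradation by ring nucleases is ignored, and binding is treated as a single noncooperative site.

```python
CA4_PER_TRANSCRIPT_UM = 6.0  # cA4 produced per detected phage transcript (from the text)
WT_KD_UM = 5.55              # ASSUMPTION: inferred as 38.3 nM x 145-fold


def ca4_concentration_um(n_transcripts):
    """Estimated intracellular cA4 (uM) if n transcripts are detected at once."""
    return CA4_PER_TRANSCRIPT_UM * n_transcripts


def fraction_bound(ligand_um, kd_um):
    """One-site occupancy with ligand in excess: L / (L + K_D)."""
    return ligand_um / (ligand_um + kd_um)


for n in (1, 2, 5, 10):
    ca4 = ca4_concentration_um(n)
    print(f"{n:2d} transcripts: {ca4:5.1f} uM cA4, "
          f"Csa3 occupancy ~ {fraction_bound(ca4, WT_KD_UM):.2f}")
```

Under these simplifications, occupancy climbs from roughly half at a single detected transcript toward saturation only when many transcripts are detected together, which matches the last-resort role proposed for a low-affinity receptor.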

stacking contributions of a Trp residue (Trp42) in a Thermus thermophilus Can1 receptor of cA4 have previously been proposed to stabilize cA4 in an asymmetrical conformation (69), which may also be true for the extended cA4 conformation bound to Csa3 Sso . Furthermore, the essential interactions of the highly conserved Csa3 Sso -Gly96 with the terminal cA4 phosphoryl moieties may additionally contribute to this stabilization (Fig. 4). Finally, our observed contribution of Csa3 Sso -Arg98 interactions with the central phosphoryl moieties in cA4 is consistent with a previously reported loss of the binding of the cA4 analog to a multisite Csa3b Sis mutant encompassing a Csa3 Sso -Arg98-equivalent residue (57). These residues' functional relevance in Csa3a and Csa3b homologs for binding cA4 further confirms the validity of our

Figure 9. cA4 does not significantly affect specific DNA binding by Csa3 Sso in vitro. A, predictive docking of DNA fragments (orange backbone) onto the wHTH domains of SAXS-derived apo (blue protein model on left) and cA4-bound (light brown protein model on right) Csa3 Sso dimers. The DNA fragment docked onto protomer A in the cA4-bound Csa3 Sso dimer sterically clashed with a region in the CARF domain from protomer B in this dimer (encircled by a red dotted circle). The DNA fragment docked onto protomer B in this cA4-bound dimer exhibited a 90° rotation instead. To obtain these models, wHTH domains of B. subtilis OhrR (PDB code: 1Z9C (71), RMSD: 1.7 Å) were individually superimposed onto wHTH domains of the Csa3 Sso dimer and OhrR was omitted from the view for clarity. A single and straight model of B-form DNA with two copies of the palindromes could not be modeled with the apo Csa3 Sso dimer owing to a slight misorientation (not shown) between Csa3 Sso wHTH domains (possibly requiring a bend in the DNA). B, precipitation of Csa3 Sso in the presence of P Cas4a DNA and its solubilization by cA4 binding to the Csa3 Sso CARF domain. Before taking the photograph, 50 μM Csa3 Sso (WT or R98A mutant), 10 μM P Cas4a , and/or 500 μM cA4 were added in the binding buffer without Tween 20 (see Experimental procedures) and incubated for 30 min at 20 °C. C and D, cA4 does not significantly affect the binding of Csa3 Sso to the P Cas4a promoter in vitro. EMSA shows no effect of cA4 on the specific binding of Csa3 Sso to P Cas4a (C) that drives spacer acquisition in S. solfataricus P2. Cy5-labeled P Cas4a , 50 nM, was added to different Csa3 Sso concentrations ranging from 500 nM to 25 μM in the presence or absence of cA4 added in a 50:1 ratio of cA4:Csa3 Sso . D, microscale thermophoresis binding affinity analysis also shows no significant effect of cA4 (as well as cA6) on the Csa3 Sso binding to the P Cas4a . For microscale thermophoresis, Csa3 Sso was held constant at 500 nM, and P Cas4a concentration was varied from 763 nM to 25 μM in the absence or presence of 50-fold excess of cOAs (25 μM) to Csa3 Sso .

A conserved Glu122 from the βN5-βN6 loop (outside of the above two motifs) was found to significantly limit the cA4 binding affinity in Csa3 Sso via electrostatic repulsion to the central cA4 phosphoryls; substituting Glu122 with an Ala, or more conservatively with a Gln residue, drastically increased cA4 binding affinity of Csa3 Sso (145-or 125-fold, respectively) (Figs. 4 and 5). The conservation of Glu122 during evolution further indicates that it might be limiting Csa3 response only to high cellular cA4 concentrations during type III phage interference. This is also supported by the lack of a self-limiting ring nuclease activity in Csa3 Sso . Nevertheless, this represents an interesting example of a rational surface design that increases ligand binding affinity and has potential applications in engineering an amplified Csa3 response to phage infections.
Owing to variation in the identity of the catalytic residues in the active sites of the known ring nucleases, the mechanistic details of ring nuclease activity are still emerging (67). Except for Crn3 that requires metal ions for ring nuclease activity, most of the cOA receptors employ a metal-independent nucleophilic substitution mechanism where a general base deprotonates a ribosyl 2′ hydroxyl (attacking group) for a nucleophilic attack on the scissile phosphorus atom and/or stabilizes a pentacovalent transition state by coordinating scissile phosphoryl oxygens (68,85). Ultimately, these interactions position the 2′ hydroxyl oxygen (O'), phosphorus (P), and 5′-ribosyl oxygen (O") in-line (with an ideal O'-P-O" angle of 180°) for the phosphodiester bond hydrolysis (86,87). Enterococcus italicus cA6 ring nuclease (EiCsm6), however, is an exception where alternate residues in the protein sterically force cA6 O'-P-O" in a compatible in-line conformation (88). Although originally postulated to be involved in catalysis, the motif 2 Arg/Lys residues from CARF-containing ring nucleases have recently been found only to mediate cOA binding (67,85). Instead, residues in another conserved motif GxS/T have been recently implicated in the catalysis, where Ser/Thr and a Trp residue conserved adjacent to this motif participate in the catalysis (68,83–85,88). Furthermore, a Glu or Asp residue conserved in a few CARF domain proteins coordinates the Gly to position the GxS/T motif adjacent to the scissile phosphoryl group, which is hypothesized to facilitate hydrolysis (67). Despite the essential interactions of the central cA4 phosphoryl (P 2b ) with the motif 2 residues in Csa3 Sso (Gly96 and Arg98), it lacks a demonstrable cA4 ring nuclease activity (Fig. 6). This could be due, at least in part, to the absence of the GxS/T motif in Csa3 proteins. 
Also, the active site interactions of Arg98 and Glu122 with the target cA4 phosphoryl (P 2b ) result in an O'-P-O" angle of 94°, a stereochemistry that is not conducive to nucleophilic attack by the 2′ ribosyl hydroxyl (Fig. 4C). By contrast, the nonplanar (P 2a ) and distal phosphates (P 1a and P 1b ), which have O'-P-O" angles of 158°, 125°, and 141°, respectively, lack coordination by the side chain of a basic residue to stabilize the transition state. The lack of ring nuclease activity in the Csa3 Sso proteins, unlike most self-limiting cOA ring nucleases, could allow for a long-term potentiation of the cA4 signal in an infected cell.
It is intriguing why X-ray crystallography data did not depict cA4-induced conformational changes revealed by our SAXS analysis. This could be explained, at least in part, by immobilization of the wHTH B by its symmetry-related crystallographic interactions with CARF A&B from the next Csa3 Sso dimer in our Csa3 Sso cA4 complex crystals. A significantly large interface constituted by these packing interactions could have driven selection of biologically infrequent Csa3 Sso conformer we observed in the crystal structure (Fig. S9).
Csa3b Sis (a Csa3b homolog) regulation of type I interference (cas) genes and CRISPR spacer acquisition complex has been recently studied (57,58). In the absence of an MGE, the subtype I-A interference (cas) genes are kept repressed by Csa3b Sis -mediated recruitment of the Cascade-crRNA complex at the P Cas promoter. During MGE invasion, recognition of a protospacer sequence by the Cascade-crRNA complex facilitates its release from the P Cas resulting in derepression of the cas gene expression (58). Furthermore, a 10-fold increase in Csa3b Sis binding affinity to the P csa1 promoter has been reported in the presence of the cA4 analog in vitro, which suggests a further repression of the adaptation gene expression upon phage transcript clearance by the type III interference complex (57). The cA4-induced conformational changes we observed using SAXS bring the DNA binding face of the wHTH A (specifically helix αC3 that is expected to dock into the major groove of DNA) in proximity to the CARF domain (αN3 and βN2) from chain B (Fig. 8 left panel). Although this could potentially disrupt Csa3 Sso interactions with at least one copy of the two palindromes in its binding site (Fig. 9 right panel), our attempts to analyze the effects of cA4 on the binding of Csa3 Sso to P Cas4a promoter showed no significant change in specific DNA binding by Csa3 Sso . Although this unexpected observation does not fully align with previously published transcriptional regulation of the acquisition operon in S. islandicus by Csa3a Sis , it could simply be because the two Csa3a orthologs utilize alternate regulatory mechanisms. For example, cA4 binding could improve stability of DNA-bound Csa3 Sso or alter its DNA binding mode in vivo, which is supported by our increased Csa3 Sso -DNA complex solubility in vitro. Alternatively, Csa3 Sso may need to interact with other similar ligand(s) or novel protein partner(s) for its transcriptional regulation. 
The latter mechanism, however, would be more intricate and distinct from a more common and straightforward regulatory model where ligand binding alone regulates DNA binding affinity of transcription factors to recruit RNA polymerase. Such an alternate mechanism could further harness different binding partners to differentially regulate a vast variety of CRISPR loci that S. solfataricus genome encodes in comparison with S. islandicus. Need for a Csa3 Sso partner is also indicated by in vitro instability of the Csa3 Sso -DNA complexes we observed in the absence of cA4 in vitro (Figs. 9, B and C and S8C). Consistent with this, Csa3a proteins have been hypothesized to recruit transcription factor B to a noncanonical TATA box coexistent with the Csa3a promoters for spacer acquisition in Sulfolobales (59). Furthermore, such a cooperative interaction of Csa3 Sso with protein partners may also further improve Csa3 Sso binding affinities with cA4 and/or target DNA. Although we anticipate involvement of our observed cA4-induced Csa3 Sso conformations to underlie any sort of functional regulation, an additional possibility of DNA-dependent oligomerization cannot be ruled out with the existing data. Therefore, more work needs to be done to appreciate the binding of cA4 to Csa3 Sso .
Finally, although possibilities always exist for other high-affinity ligands for Csa3 (that could also better regulate DNA binding by Csa3 in vitro), it is unlikely for a Csa3 CARF domain dimer with four binding pockets aptly engaged in recognizing a four-nucleotide ligand to accommodate a completely different ligand structure. In this context, it is of high interest to characterize Csa3a homologs from organisms that are reported to lack Cas10 and CRISPR-Cas loci to identify such possible alternate Csa3a ligands (67). Nonetheless, our data and coevolution of csa3 genes with CRISPR loci in most prokaryotes support a cross talk between the type III and type I CRISPR systems in the form of cA4 binding to Csa3a homologs. This may underlie the regulation of spacer acquisition, crRNA gene expression, and (type I) Cascade-mediated clearance of reinfection by the same virus (Fig. 1B).
The supernatants containing all the wildtype and mutant His 6 -Csa3 Sso preparations were applied to a HisTrap Fast Flow column (Cytiva Life Sciences, Inc) pre-equilibrated in buffer A. Following a wash with five column volumes (CVs) of buffer A, Csa3 Sso was eluted with a three-step gradient of buffer B (20 mM Tris-HCl pH 8.0 and 50 mM NaCl) to buffer C (20 mM Tris-HCl pH 8.0, 50 mM NaCl, and 500 mM Imidazole): (i) 0% to 10% (v/v) buffer C in ten CVs, (ii) 10% to 40% (v/v) buffer C in ten CVs, and (iii) 40% to 100% (v/v) buffer C in 20 CVs. The fractions containing Csa3 Sso were pooled and applied to a Mono-Q anion exchange chromatography column (Cytiva Life Sciences, Inc) equilibrated with buffer D (50 mM Tris-HCl pH 9.0). The flow-through containing Csa3 Sso was collected and subjected to a Cibacron Blue 3GA column (Sigma Aldrich) equilibrated with buffer D. Bound proteins were eluted using a linear gradient of buffer D and buffer E (50 mM Tris-HCl pH 9.0 and 1.5 M NaCl).
His 6 -tagged Csa3 Sso preparations obtained above were concentrated and applied onto a Superdex 200 16/70 gel filtration column pre-equilibrated with buffer F (20 mM Tris-HCl pH 8.0 and 50 mM NaCl). The Csa3 Sso -containing fractions were concentrated using an Amicon Ultra-10 kDa cutoff centrifugal filter (Millipore) and stored at −80 °C.
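The three-step HisTrap elution described above can be expressed as a piecewise function of column volumes, which makes the imidazole concentration at any point easy to read off. The sketch assumes that each step is a linear ramp and that the imidazole comes only from buffer C (500 mM); neither detail is spelled out beyond the step boundaries, and the function names are hypothetical.

```python
# (start CV, end CV, start % buffer C, end % buffer C) for each gradient step.
GRADIENT_STEPS = [
    (0.0, 10.0, 0.0, 10.0),     # step i: 0% to 10% C in ten CVs
    (10.0, 20.0, 10.0, 40.0),   # step ii: 10% to 40% C in ten CVs
    (20.0, 40.0, 40.0, 100.0),  # step iii: 40% to 100% C in 20 CVs
]
BUFFER_C_IMIDAZOLE_MM = 500.0


def percent_buffer_c(cv):
    """Percent (v/v) buffer C after `cv` column volumes, assuming linear ramps."""
    for start_cv, end_cv, start_pct, end_pct in GRADIENT_STEPS:
        if start_cv <= cv <= end_cv:
            progress = (cv - start_cv) / (end_cv - start_cv)
            return start_pct + (end_pct - start_pct) * progress
    raise ValueError("cv must lie within the 0 to 40 CV gradient")


def imidazole_mm(cv):
    """Imidazole concentration (mM) at a given point in the gradient."""
    return BUFFER_C_IMIDAZOLE_MM * percent_buffer_c(cv) / 100.0


for cv in (0, 5, 10, 20, 30, 40):
    print(f"{cv:2d} CV: {percent_buffer_c(cv):5.1f}% C, {imidazole_mm(cv):5.1f} mM imidazole")
```

Under these assumptions the step boundaries correspond to 50, 200, and 500 mM imidazole.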

Microscale thermophoresis
The ligand-binding specificity of Csa3 Sso (WT) and the role of Csa3 Sso residues in ligand binding were determined by microscale thermophoresis (MST). Wildtype or mutant His 6 -Csa3 Sso (200 nM) was fluorescently labeled in 2× binding buffer (10 mM Na/K phosphate, pH 5.8, 10 mM MgCl 2 , 25 mM NaCl, and 0.05% Tween 20) using 100 nM RED-Tris NTA His-Tag labeling dye (Nanotemper Technologies, Inc) and incubated in the dark at room temperature for 30 min. Two millimolar stocks of synthetic ligands including cA3, cA4, and cA6 (Biolog Life Science Institute) or linear ribonucleotide 5′-rCrArArArA-3′ (Bio-Synthesis Inc) were serially diluted in a 1:1 ratio with nuclease-free water. The labeled proteins were then mixed with ligands in a 1:1 ratio. Only cA4 was used in the MST experiments with the Csa3 Sso mutants (2 mM stocks of cA4 were used for F10A, F14A, G96A, and R98A mutants, and 20 μM stocks were used for E122A and E122Q mutants). The mixtures of proteins and ligands were incubated in the dark at room temperature for 30 min. The precipitated material was removed by centrifugation at 6000 rpm for 5 min. The supernatants were then loaded into Monolith NT.115 Series Premium capillaries in triplicate, and the thermophoresis was detected with 40% excitation power and 40% IR-laser power for an on-time of 20 s at 25 °C.
The MST experiment and data analysis for ligand-induced changes in Csa3 Sso binding affinity with P Cas4a were performed as mentioned above except for the following: 1 μM of His 6 -Csa3 Sso (WT) was fluorescently labeled in 2× binding buffer with 40 nM RED-Tris NTA His-Tag labeling dye in the absence or presence of cA4 or cA6. The labeling mixtures were incubated in the dark at room temperature for 30 min. A 50 μM stock of P Cas4a was serially diluted in 1:1 ratios with nuclease-free water. The labeled proteins ± ligands were then mixed with diluted P Cas4a in a 1:1 ratio and incubated in the dark at room temperature for 30 min. The supernatants obtained after removal of precipitated material were then loaded into Monolith NT.115 Series Premium capillaries in triplicate, and the thermophoresis was detected with 100% excitation power and 60% IR-laser power for an on-time of 20 s at 25 °C.
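The ligand titration described above is a twofold serial dilution followed by a further 1:1 mix with labeled protein, so every final concentration is half of the corresponding dilution step. A small sketch of the resulting series is shown below. The number of points (16 here) is an assumption, since the text does not state how many dilution steps were used, and the function names are illustrative.

```python
def serial_dilution(stock, n_points, dilution_factor=2.0):
    """Concentrations of a serial dilution series, starting with the undiluted stock."""
    return [stock / dilution_factor ** i for i in range(n_points)]


def final_concentrations(stock, n_points, mix_factor=2.0):
    """Concentrations after the 1:1 mix with labeled protein (halves each point)."""
    return [c / mix_factor for c in serial_dilution(stock, n_points)]


# 2 mM cA4 stock expressed in uM; 16 points is an ASSUMPTION.
series_um = final_concentrations(stock=2000.0, n_points=16)
print(f"highest final ligand: {series_um[0]:.1f} uM")
print(f"lowest final ligand:  {series_um[-1] * 1000:.2f} nM")

# The same 1:1 mix halves the labeled protein: 200 nM becomes 100 nM in the capillary.
print(f"final protein: {200.0 / 2:.0f} nM")
```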
The binding affinities of oligonucleotides and P Cas4a to Csa3 Sso (WT and mutants) were analyzed according to the law of mass action in a standard fitting mode of MO.Affinity analysis software (version 2.3).
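A K D model derived from the law of mass action for 1:1 binding is usually written as a quadratic in the complex concentration, which accounts for depletion of the titrant by the labeled partner. The sketch below shows that expression in general form. It is not the code of the MO.Affinity software, and the variable names are illustrative; concentrations only need to share one unit.

```python
import math


def fraction_bound(target_total, ligand_total, kd):
    """Fraction of labeled target in complex for 1:1 binding (mass action).

    Solves [TL]^2 - (T + L + Kd)[TL] + T*L = 0 for the physical root.
    """
    s = target_total + ligand_total + kd
    complex_conc = (s - math.sqrt(s * s - 4.0 * target_total * ligand_total)) / 2.0
    return complex_conc / target_total


# Example in uM: 0.5 uM labeled Csa3 titrated with promoter DNA, K_D = 1.62 uM.
for dna_um in (0.1, 1.62, 25.0):
    print(f"{dna_um:6.2f} uM DNA -> fraction bound {fraction_bound(0.5, dna_um, 1.62):.3f}")
```

Fitting the thermophoresis signal then amounts to scaling this fraction between the unbound and bound signal levels and optimizing K D .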
Crystallization, X-ray data collection, data processing, model building, and refinement
Our high-throughput crystallization screen using His 6 -Csa3 Sso in the presence of 2-fold molar excess of cA4 identified a crystallization condition yielding cA4-dependent His 6 -Csa3 Sso crystals. The crystallization condition was optimized to obtain crystals growing up to 400 μm size. Csa3 Sso cA4 complex crystals were produced by the vapor diffusion method at 20 °C using a 1:2 mixture of Csa3 Sso cA4 (300 μM His 6 -Csa3 Sso and 600 μM cA4) in gel filtration buffer (buffer F: 20 mM Tris-HCl pH 8.0 and 50 mM NaCl) and well solution (0.1 M K 2 SO 4 , 0.1 M Na/K 5.8, 16% PEG3350). X-ray diffraction data were collected in NSLS-II using AMX beamline at wavelength 0.92009 Å. The diffraction data were indexed, integrated, and scaled in HKL2000 (90). The initial phase information was obtained by molecular replacement in phenix phaser (91) using the native Csa3 structure (PDB ID: 2WTE) as template. The crystal belongs to P2 1 2 1 2 and two molecules of Csa3 are in an asymmetric unit. A strong difference density was observed on inspection of the F O -F C map at the CARF domain dimeric interface and was identified as bound cA4. The molecular model for the ligand cA4 (CHEBI:142457) was obtained from the ChEBI EMBL database (92). The ligand restraints were generated in Phenix ReadySet, and cA4 has been manually modeled on the difference density map. Iterative rounds of manual model building in Coot and refinement in Phenix refinement generated the final model with R work = 16.0 and R free = 20.5 (93,94). The stereochemical quality of the final structure was verified using Ramachandran plot, and 99.03% of the residues are found to have favorable conformation, whereas only 0.97% of residues have allowed conformation; no outlier was found. The structure is submitted to PDB with a PDB ID 6WXQ. The data processing and refinement statistics are reported in Table 1.

Ring nuclease activity assays
Ring nuclease activity assays with cA4 were performed under two different conditions. In the first condition (used for Fig. S5, A and B), 50 μM Csa3Sso and/or 250 μM cA4 were incubated (at a cA4:Csa3Sso molar ratio of 5:1) at 55 °C for 3 h in either (i) 1× binding buffer or (ii) buffer G (20 mM Tris-HCl, pH 7.5, 50 mM KCl, and 50 mM NaCl), previously used to demonstrate the ring nuclease activity of Csm6 (84). The reaction mixtures were deproteinized by ultrafiltration with an Amicon Ultra 3 kDa cutoff centrifugal filter (Millipore). In the second condition (used for Figs. 6 and S5C), 5 μM Csa3Sso (wildtype or its E122A/E122Q mutants) and/or 500 μM cA4 were incubated (at a cA4:Csa3Sso molar ratio of 100:1) in buffer G or 1× binding buffer at 60 °C for 3 h. The reactions were quenched and deproteinized by phenol-chloroform extraction (84). For both conditions, the products and controls were collected and analyzed on a high-performance liquid chromatography (HPLC) system (UltiMate 3000, Thermo Scientific) equipped with a C18 column: 4.6 × 100 mm, 5 μm particle size (Thermo Scientific) for the first condition and 15 cm × 4.6 mm, 3 μm particle size (Supelco) for the second.

Analytical ultracentrifugation
Sedimentation velocity analytical ultracentrifugation (SV-AUC) experiments were performed at 20 °C with an XL-A analytical ultracentrifuge (Beckman-Coulter) and a TiAn60 rotor with two-channel charcoal-filled Epon centerpieces and quartz windows. Data were collected with detection at 280 nm. Complete sedimentation velocity profiles were recorded every 30 s at 40,000 rpm. Data were fit using the c(S) model based on the Lamm equation, as implemented in the program SEDFIT (95), and corrected to S20,w. Direct fitting of association models was performed using SEDPHAT (96). Calculated hydrodynamic properties for homology models were determined using WinHYDROPRO (97). The partial specific volume (ῡ), solvent density (ρ), and viscosity (η) were derived from chemical composition by SEDNTERP (http://www.rasmb.bbri.org/). Figures were created using the program GUSSI (98). All measurements were performed in 10 mM Na/KPO4 (pH 5.8), 10 mM NaCl, 25 mM MgCl2.
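
The S20,w correction mentioned above can be sketched with the standard textbook relation that rescales an observed sedimentation coefficient to water at 20 °C using the partial specific volume, solvent density, and viscosity. This is an illustrative sketch, not the SEDFIT or SEDNTERP code; the function name `s20w` and the example buffer values are hypothetical, and the partial specific volume is assumed to be the same under both conditions.

```python
# Reference properties of water at 20 degrees C.
RHO_WATER_20C = 0.99823  # density, g/mL
ETA_WATER_20C = 1.002    # viscosity, cP


def s20w(s_obs, vbar, rho_buffer, eta_buffer):
    """Convert an observed sedimentation coefficient to standard conditions.

    s_obs      : sedimentation coefficient measured in the buffer (S)
    vbar       : partial specific volume of the protein (mL/g)
    rho_buffer : buffer density at the experimental temperature (g/mL)
    eta_buffer : buffer viscosity at the experimental temperature (cP)
    """
    buoyancy_water = 1.0 - vbar * RHO_WATER_20C
    buoyancy_buffer = 1.0 - vbar * rho_buffer
    viscosity_ratio = eta_buffer / ETA_WATER_20C
    return s_obs * viscosity_ratio * (buoyancy_water / buoyancy_buffer)


# Hypothetical input values, for illustration only.
print(f"s20,w = {s20w(3.5, 0.73, 1.002, 1.02):.3f} S")
```

A buffer that is denser and more viscous than water slows sedimentation, so the corrected value comes out larger than the observed one.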

Small-angle X-ray scattering data collection
SAXS data were collected at beamline 16-ID (LiX) of the National Synchrotron Light Source II (99). Data were collected at a wavelength of 1.0 Å in a three-camera configuration, yielding an accessible scattering range of 0.013 < q < 3.0 Å⁻¹, where q is the momentum transfer, defined as q = 4π sin(θ)/λ, λ is the X-ray wavelength, and 2θ is the scattering angle. The data to q < 0.5 Å⁻¹ were used in subsequent analyses. Samples were loaded into a 1-mm capillary for ten 1-s X-ray exposures. All measurements were performed in 10 mM Na/KPO4 (pH 5.8), 10 mM NaCl, 25 mM MgCl2.
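
To make the momentum-transfer definition above concrete, the short sketch below converts between the scattering angle 2θ and q at the stated 1.0 Å wavelength, and reports the real-space length scale 2π/q probed at each limit of the q range. The function names are hypothetical and the snippet is for illustration only; it is not part of the beamline processing software.

```python
import math


def q_from_two_theta(two_theta_deg, wavelength_A):
    """Momentum transfer q = 4*pi*sin(theta)/lambda, in inverse angstroms."""
    theta = math.radians(two_theta_deg) / 2.0
    return 4.0 * math.pi * math.sin(theta) / wavelength_A


def two_theta_from_q(q_inv_A, wavelength_A):
    """Scattering angle 2*theta (degrees) corresponding to a given q."""
    return 2.0 * math.degrees(math.asin(q_inv_A * wavelength_A / (4.0 * math.pi)))


def real_space_distance(q_inv_A):
    """Characteristic real-space length d = 2*pi/q, in angstroms."""
    return 2.0 * math.pi / q_inv_A


WAVELENGTH_A = 1.0  # wavelength stated in the text
for q in (0.013, 0.5, 3.0):  # q limits quoted in the text
    angle = two_theta_from_q(q, WAVELENGTH_A)
    print(f"q = {q:5.3f} 1/A  2theta = {angle:7.3f} deg  d = {real_space_distance(q):7.1f} A")
```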

SAXS analysis
Data were analyzed in the program RAW (100). When fitting manually, the maximum diameter of the particle (Dmax) was incrementally adjusted in GNOM (101) to maximize the goodness-of-fit parameter, to minimize the discrepancy between the fit and the experimental data, and to optimize the visual qualities of the distribution profile.
Hybrid bead-atomistic modeling of Csa3 was performed using the program CORAL (70), in which the known structure was fixed in composition and the inventory not resolved by X-ray crystallography was modeled as coarse-grained beads. Ten independent calculations were performed for each protein and yielded comparable results. The final models were assessed using the program CRYSOL. The models were rendered using the program PyMOL (102).

Electromobility shift assays
Top and bottom DNA oligonucleotide strands used for EMSA were purchased from Sigma-Aldrich (Table S2). The oligonucleotide strands were annealed by mixing them in a 1:1 molar ratio, heating for 15 min at 98 °C (unlabeled DNA fragments) or 70 °C (Cy5-labeled DNA fragments), and then slowly cooling to room temperature. Unlabeled probe-containing samples used for Fig. S8C were electrophoresed on a 2% (w/v) agarose Tris-Borate-EDTA (TBE) gel at 100 V for 30 min at room temperature. Probes were stained with ethidium bromide and visualized using a Gel Doc XR+ Molecular Imager (Bio-Rad). EMSAs shown in Figure 9C were performed using the Cy5-labeled PCas4a probe. A 5′ amino-modified PCas4a top strand was obtained from Biosearch Technologies. The Cy5-coupled oligo fraction was purified from unlabeled oligos and loose dye by C18 reverse-phase HPLC using established protocols (103,104). The Cy5-labeled probes were electrophoresed on a 5% acrylamide TBE gel at 100 V for 1 h at 4 °C. DNA bands were visualized using a FluorChem R gel imager (ProteinSimple, Inc.).

Data availability
Coordinates and structure factors for the Csa3Sso:cA4 structure have been deposited in the RCSB Protein Data Bank (http://www.rcsb.org) with the accession code 6WXQ. Strains and plasmids are described in this article, and the raw data for the binding analyses in Figures 1D, 4E and 9D and Fig. S1 are available upon request.