A Ruler Protein in a Complex for Antiviral Defense Determines the Length of Small Interfering CRISPR RNAs

Background: CRISPR immune systems protect prokaryotes from their viruses using small interfering RNAs (crRNAs), which require maturation events during their biogenesis. Results: In Staphylococcus epidermidis, crRNAs undergo maturation in a Cas10·Csm ribonucleoprotein complex; Csm3 modulates the extent of maturation. Conclusion: Csm3 acts as a ruler for crRNAs. Significance: Investigating CRISPR immunity is important to understand prokaryotic ecology and to develop biotechnological applications. Small RNAs undergo maturation events that precisely determine the length and structure required for their function. CRISPRs (clustered regularly interspaced short palindromic repeats) encode small RNAs (crRNAs) that together with CRISPR-associated (cas) genes constitute a sequence-specific prokaryotic immune system for anti-viral and anti-plasmid defense. crRNAs are subject to multiple processing events during their biogenesis, and little is known about the mechanism of the final maturation step. We show that in the Staphylococcus epidermidis type III CRISPR-Cas system, mature crRNAs are measured in a Cas10·Csm ribonucleoprotein complex to yield discrete lengths that differ by 6-nucleotide increments. We looked for mutants that impact this crRNA size pattern and found that an alanine substitution of a conserved aspartate residue of Csm3 eliminates the 6-nucleotide increments in the length of crRNAs. In vitro, recombinant Csm3 binds RNA molecules at multiple sites, producing gel-shift patterns that suggest that each protein binds 6 nucleotides of substrate. In vivo, changes in the levels of Csm3 modulate the crRNA size distribution without disrupting the 6-nucleotide periodicity. Our data support a model in which multiple Csm3 molecules within the Cas10·Csm complex bind the crRNA with a 6-nucleotide periodicity to function as a ruler that measures the extent of crRNA maturation.

CRISPR sequences (clustered regularly interspaced short palindromic repeats) are an essential component of a prokaryotic immune system that protects against phage infection and other invading genetic elements (1-3). CRISPR loci harbor an archive of short sequences (known as spacers) derived from past invaders that provide the specificity for CRISPR immunity.
These sequences encode small CRISPR RNAs (crRNAs) that, together with CRISPR-associated (Cas) proteins, can locate and destroy foreign nucleic acids by an antisense targeting mechanism (4-7). The CRISPR immune system can also build a memory of past infections by incorporating new invader-derived sequences into CRISPR loci (1,8). Found in >40% of bacteria and nearly all archaea (9-12), CRISPR-Cas systems exhibit remarkable mechanistic and functional diversity. They can be classified into three main types (I-III) that are defined by cas gene content and differences in the mechanism of immunity (13).
crRNA biogenesis is the essential first step in the CRISPR immunity pathway. Spacers range from 24 to 48 nucleotides in length and are interrupted by similarly sized repeat sequences (see Fig. 1A). This repeat-spacer array is transcribed as a long precursor that is subsequently processed to liberate mature crRNAs. In all CRISPR-Cas systems, the first step in crRNA biogenesis, known as primary processing, entails endoribonucleolytic cleavage within repeats. In type I and III systems, Cas6 is considered the primary processing endonuclease (2,13-15). One exception appears to be the type I-C system in Bacillus halodurans, where Cas5d was recently shown to catalyze the cleavage of repeat sequences (16). In contrast, primary processing in type II systems relies upon an antisense trans-encoded crRNA and the host RNase III to cleave within repeats (17). Primary processing generates crRNA intermediates that consist of a single spacer flanked on both ends by partial repeats. Whereas no further processing is known to occur in type I CRISPR-Cas systems, in type II and III systems, the crRNA intermediates are subject to a final maturation step that eliminates repeat and spacer sequences at the 5′ or 3′ end of the intermediate, respectively (17-19).
Type III CRISPR-Cas systems have been classified into two subtypes: III-A, containing the csm module of cas genes, and III-B, harboring the cmr module (13). In both subtypes, mature crRNAs display an invariant 5′ end containing 8 nt of repeat sequence (the crRNA tag) but variable 3′ ends that match the targeted sequence in the phage or plasmid genome (14,19). Primary processing cleaves the repeat sequence immediately upstream of the crRNA tag, and maturation occurs at the 3′ end of the intermediate crRNA. Importantly, the extent of maturation at the 3′ end determines the cleavage site within the target sequence (5), yet its mechanism remains poorly understood. The type III-A system of Staphylococcus epidermidis RP62a contains nine cas/csm genes (see Fig. 1A). Previously, we showed that a ruler mechanism anchored at the primary processing site generates mature crRNAs of discrete lengths (37 and 43 nucleotides, see Fig. 1B) and that csm2, csm3, and csm5 are required for crRNA maturation (19). Here, we show that Csm2, Csm3, and Csm5, along with Csm4 and the type III signature protein Cas10, are part of a ribonucleoprotein complex analogous to the Cascade (CRISPR-associated complex for antiviral defense) complex described for Escherichia coli and other organisms (2,5,16,20,21). This complex, here named Cas10·Csm, is enriched with mature crRNAs that range from 31 to 67 nucleotides, measured precisely in 6-nucleotide increments. We show that Csm3 is essential for the formation of this complex and demonstrate that mutating conserved residues or changing overall levels of Csm3 in vivo alters the size distribution of the crRNAs without disrupting their 6-nucleotide periodicity. Furthermore, recombinant Csm3 binds RNA in vitro in a sequence-independent manner, producing gel-shift patterns that suggest that each protein binds 6 nucleotides of substrate.
Our observations support a model in which multiple Csm3 molecules bind the crRNA with a 6-nucleotide periodicity to function as a ruler that measures the extent of crRNA maturation within the S. epidermidis Cas10·Csm complex.
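The size series described above (31 to 67 nucleotides in 6-nucleotide increments) can be enumerated with a short sketch. The function name and its default arguments are illustrative assumptions; only the bounds and the step come from the text.

```python
def crrna_ladder(shortest=31, longest=67, step=6):
    """Return the expected mature crRNA lengths (nt), inclusive of both bounds.

    The defaults restate the range and increment reported in the text;
    the function itself is only an illustration, not part of the study.
    """
    return list(range(shortest, longest + 1, step))


ladder = crrna_ladder()
print(ladder)  # [31, 37, 43, 49, 55, 61, 67]
```

The previously reported 37- and 43-nucleotide species fall on this series, as do the larger species discussed later in the text.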

EXPERIMENTAL PROCEDURES
Bacterial Strains and Growth Conditions-S. epidermidis RP62a (22) and S. aureus OS2 (23) strains were grown in brain heart infusion medium (Difco). When required, the medium was supplemented with antibiotics as follows: neomycin (15 μg/ml) for selection of S. epidermidis LM1680; chloramphenicol (10 μg/ml) for selection of pcrispr and pLM9-based plasmids; and mupirocin (5 μg/ml) for selection of pG0400. E. coli BL21(DE3) Codon Plus cells (EMD Millipore) were grown in LB supplemented with chloramphenicol (34 μg/ml). When appropriate, the medium was also supplemented with kanamycin (50 μg/ml) to select for pET28b-based plasmids or ampicillin (100 μg/ml) to select for pET23a-based plasmids.
Strain Construction-S. epidermidis LM1680 was isolated as a recipient clone that escaped CRISPR interference after the conjugative transfer of pG0400 from a donor strain S. aureus RN4220 to the recipient S. epidermidis RP62a. pG0400 was cured from S. epidermidis LM1680 by multiple passages and replica plating on brain heart infusion agar with and without mupirocin. A large deletion flanking the crispr-cas locus was mapped by PCR and Sanger sequencing: 257,871 bp are missing from coordinates 2,327,546 to 2,585,416 of the S. epidermidis RP62a genome.
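As a quick consistency check on the deletion reported above, the size follows from the two genome coordinates if both endpoints are counted as deleted. This inclusive-coordinate convention is an assumption, and the helper name is hypothetical.

```python
def inclusive_span(start, end):
    """Number of positions from start to end, counting both endpoints."""
    return end - start + 1


# Coordinates of the deletion in the S. epidermidis RP62a genome, as given in the text.
deleted_bp = inclusive_span(2_327_546, 2_585_416)
print(deleted_bp)  # 257871
```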
Plasmid Construction-pcrispr was constructed by fusing plasmids pLM304 (19) and pC194 (24). Briefly, pC194 was amplified with primers W175 and W225 (supplemental table). The PCR product and pLM304 were digested with AatII (New England Biolabs (NEB)) and BamHI (NEB), and both were gel-purified. The purified fragments were combined and ligated with T4 DNA ligase (NEB). The ligated product was transformed into S. aureus OS2, and chloramphenicol-resistant colonies were selected. The crispr-cas insert was confirmed by a functional test for CRISPR interference against the conjugative transfer of pG0400. In-frame deletions, His6 tags, and amino acid substitutions were introduced into pcrispr by inverse PCR using the primers indicated in the supplemental table. Restriction sites were included on the primers for generation of Csm3 deletions (PspOMI and EagI), His6 tags (NheI), and amino acid substitutions (Csm3 D100A, PspOMI; Csm3 E120A,E124A, AgeI) to facilitate ligation. Following inverse PCR, products were purified using a PCR purification kit (Qiagen) and were either 5′-phosphorylated using polynucleotide kinase (for blunt-end ligation) or cleaved with the appropriate restriction enzyme(s) (NEB). Restriction digests were heat-inactivated according to the manufacturer's recommendations. Digested or phosphorylated PCR products were then circularized using T4 DNA ligase (NEB). All constructs were transformed into S. aureus OS2 prior to transformation into S. epidermidis LM1680. At least two chloramphenicol-resistant transformants were selected for each construct, and plasmids were purified. To confirm the intended mutations, plasmids were subjected to PCR amplification using the appropriate primers flanking the mutation site (supplemental table).
Deletion mutants were confirmed by sequencing of the entire CRISPR locus using primers L19, W13, W14, L35, W15, W16, W17, W18, T17, and W19 (supplemental table), and His6 tags and amino acid substitution mutations were confirmed by restriction digest with the appropriate enzyme and/or sequencing of the mutated region. At least two isolates of confirmed mutant plasmids were prepared from S. aureus and transformed into S. epidermidis LM1680.
The csm3-csm3 (tethered) mutant was constructed by the insertion of a region encoding a (GGGGS)3 linker followed by a csm3 sequence downstream of the original copy of csm3 in pcrispr. Plasmid and insert were ligated using Gibson assembly (25). The construct was transformed into S. aureus OS2 prior to transformation into S. epidermidis LM1680. At least two chloramphenicol-resistant transformants were selected, and plasmids were purified. To confirm the intended mutation, plasmids were subjected to PCR amplification and sequencing of the PCR product using the appropriate primers flanking the mutation site (supplemental table).
pET28b-His10Smt3-csm3 was constructed by inserting an S. epidermidis csm3 PCR product into the pET28b-His10Smt3 multiple cloning site. Briefly, csm3 was amplified from pcrisprcas with primers A132 and A133 (supplemental table), and both the PCR product and pET28b-His10Smt3 were digested with BamHI (NEB) and XhoI (NEB). The digested PCR product and linearized vector were gel-purified, combined, and ligated by T4 DNA ligase (NEB) and then transformed into E. coli BL21(DE3) Codon Plus cells (EMD Millipore). The identity of the cloned DNA fragment was confirmed by sequencing using primers T7P and T7T (see supplemental table). pLM9 was constructed by adding the lacI repressor gene and the Pspac promoter region of pMutin-HA (26) into the HindIII site of pC194 (24). pLM9/csm3 and pLM9/csm3 D100A were created by ligating the different csm alleles into the multiple cloning site of pLM9 using Gibson assembly (25).
Conjugation-Conjugation was carried out by filter mating as described previously (3). Confirmation of the presence of the desired plasmids in transconjugants was achieved by extracting DNA of at least two colonies and performing PCR with suitable primers (L70/L71 to confirm pG0400 and various primers as specified in the supplemental table to confirm pcrispr mutations).
Cas10·Csm Purification from S. epidermidis-S. epidermidis LM1680 strains harboring pcrispr with the indicated His6 tag were grown to an A600 of 2 and harvested. Cell pellets were frozen at −80°C overnight and then thawed on ice for ~1 h. Cell pellets were resuspended in 10 ml of lysis buffer (20 mM MgCl2, 35 μg/ml lysostaphin) and incubated at 37°C for 30 min. Cell lysates were then diluted 1:1 with 2× resuspension buffer (600 mM NaCl, 100 mM NaH2PO4, 20 mM imidazole, 0.1% Triton X-100, and 2 complete EDTA-free protease inhibitor tablets (Roche Applied Science)). Diluted lysates were sonicated on ice (large probe, power of 10), with four 30-s pulses and 30 s of rest between pulses. Lysates were centrifuged at 15,000 × g for 20 min (twice) to pellet insoluble cell debris, and cleared lysates were filtered through a 0.2-μm bottle-top filter. A column (1-cm diameter) was packed with nickel-nitrilotriacetic acid resin (Thermo, 1 ml of slurry per 1 liter of starting culture volume) and pre-equilibrated with 10 ml of pre-equilibration buffer (100 mM NaCl, 50 mM NaH2PO4). Cleared lysate was applied to the column, and the column was washed with 10 ml of wash buffer 1 (100 mM NaCl, 50 mM NaH2PO4, 20 mM imidazole) and 10 ml of wash buffer 2 (100 mM NaCl, 50 mM NaH2PO4, 20 mM imidazole, 10% glycerol). Protein was eluted from the column with 4 ml of elution buffer (300 mM NaCl, 50 mM NaH2PO4, 250 mM imidazole, and 10% glycerol), and 500-μl fractions were collected. Fractions containing Cas10·Csm were pooled and either dialyzed in 1 liter of dialysis buffer (50 mM NaCl, 10 mM Tris-HCl, pH 7.5, and 25% glycerol) at 4°C overnight or, where indicated, subjected to a second analytical affinity purification as follows: streptavidin-coated magnetic beads (15 μl per sample, Thermo) were equilibrated by washing three times (100 μl per wash) with wash buffer 3 (50 mM NaCl, 10 mM Tris-HCl, pH 7.5, and 5% glycerol).
Beads were incubated with 2 ng of 5′-biotinylated oligonucleotide antisense to spc1 (see the supplemental table for sequence) for 30 min at room temperature. Beads were washed three times with wash buffer 3 (100 μl per wash) and mixed with 10 μg of eluted Cas10·Csm complex. Complexes were allowed to anneal to the antisense oligonucleotide for 30 min at room temperature, and the beads were washed three times with wash buffer 3 (100 μl per wash) and resuspended in a final volume of 25 μl of wash buffer 3. Beads were boiled in an equal volume of 2× protein loading buffer (4% SDS, 20% glycerol, 120 mM Tris-HCl, pH 7, and 2% bromphenol blue) and resolved by SDS-PAGE on 4-15% gradient gels (Bio-Rad) in 1× protein running buffer (25 mM Tris, 192 mM glycine).

Cas10·Csm and Csm3 Purification from E. coli-Cultures (4 liters) of E. coli BL21(DE3) Codon Plus cells (EMD Millipore) containing pET23a-Cas10/Csm/csm2 H6N or pET28b-His10Smt3-csm3 were grown at 37°C in Luria-Bertani medium containing 34 μg/ml chloramphenicol and either 100 μg/ml ampicillin or 50 μg/ml kanamycin, respectively. When the A600 reached 0.6, the cultures were adjusted to 0.3 mM isopropyl-1-thio-β-D-galactopyranoside, and incubation was continued for 16 h at 17°C with constant shaking. The cells were harvested by centrifugation, and the pellets were stored at −80°C. All subsequent steps were performed at 4°C.
For purification of the Cas10·Csm(Csm2 H6N) complex, thawed bacteria were resuspended in 75 ml of buffer A1 (50 mM Tris-HCl, pH 7.5, 350 mM NaCl, 200 mM Li2SO4, 20% sucrose, and 10 mM imidazole) containing two complete EDTA-free protease inhibitor tablets (Roche Applied Science). Triton X-100 and lysozyme were added to final concentrations of 0.1% and 0.1 mg/ml, respectively. After 1 h, the lysate was sonicated to reduce viscosity. Insoluble material was removed by centrifugation for 30 min at 15,000 rpm in a Beckman JA-3050 rotor. The soluble extract was mixed for 1 h with 5 ml of Ni2+-nitrilotriacetic acid-agarose resin (Qiagen) that had been pre-equilibrated with buffer A1. The resin was recovered by centrifugation and then washed first with 50 ml of buffer A1 and then with 50 ml of IMAC buffer (50 mM Tris-HCl, pH 7.5, 250 mM NaCl, 10% glycerol) containing 15 mM imidazole. The resin was subsequently resuspended in 10 ml of IMAC buffer containing 50 mM imidazole and then poured into a column. The column was then eluted stepwise with 10-ml aliquots of IMAC buffer containing 100, 200, 350, and 500 mM imidazole. The 100 mM imidazole eluates containing the complex were pooled. Subsequently, 0.5 ml of the pooled fraction was analyzed by gel filtration chromatography on a Superdex 200 10/300 GL column (GE Healthcare) using buffer B (50 mM Tris-HCl, pH 7.5, 5% glycerol, 150 mM NaCl). The protein complex elution profile was monitored continuously by UV absorbance, and A280 was plotted as a function of elution volume. The void volume of the column was measured by tracking the elution peak of blue dextran, and the column was calibrated by tracking the elution profiles of marker proteins of known native size (Bio-Rad gel filtration standards).
For purification of Csm3, thawed bacteria were resuspended in 30 ml of buffer A2 (50 mM Tris-HCl, pH 7.5, 1.25 M NaCl, 200 mM Li2SO4, 10% sucrose, 15 mM imidazole) containing one complete EDTA-free protease inhibitor tablet (Roche Applied Science). Triton X-100 and lysozyme were added to final concentrations of 0.1% and 0.1 mg/ml, respectively. After 1 h, the lysate was sonicated to reduce viscosity. Insoluble material was removed by centrifugation for 30 min at 15,000 rpm in a Beckman JA-3050 rotor. The soluble extract was mixed for 1 h with 2 ml of Ni2+-nitrilotriacetic acid-agarose resin (Qiagen) that had been pre-equilibrated with buffer A2. The resin was recovered by centrifugation and then washed first with 40 ml of buffer A2 and then with 5 ml of 3 M KCl solution. The resin was subsequently resuspended in 25 ml of buffer A2 and then poured into a column. The column was then eluted stepwise with 3-ml aliquots of IMAC buffer (50 mM Tris-HCl, pH 7.5, 250 mM NaCl, 10% glycerol) containing 50, 100, 200, and 500 mM imidazole. The 200 mM imidazole eluates containing the His10Smt3-Csm3 polypeptide were pooled and dialyzed for 3 h in dialysis buffer (25 mM Tris-HCl, pH 7.5, 150 mM NaCl, 15 mM imidazole, 5% glycerol) containing 5 units of SUMO protease (27). The dialysate was then incubated with Ni2+-nitrilotriacetic acid-agarose resin pre-equilibrated in the dialysis buffer for 1 h at 4°C. The bead and dialysate mixture was passed through a column, and the flow-through containing purified Csm3 was collected. Protein concentration was determined using the Bio-Rad dye reagent with BSA as the standard and confirmed by densitometry after resolving Csm3 and the BSA standards on a gel.
CRISPR RNA Capture-Spc1 crRNAs were captured from 50 μg of total RNA extract using a 5′-biotinylated PAGE-purified oligonucleotide antisense to spc1 (supplemental table) as described previously (19).
Mass Spectrometry Analysis-Pooled fractions of freshly purified Cas10·Csm complexes (~100 μg) were submitted to the Proteomics Resource Center at Rockefeller University for mass spectrometry, and data analysis was done with a nano-ESI LTQ Orbitrap XL instrument.

Mature crRNAs Are Measured in a Cas10·Csm Ribonucleoprotein Complex-Our previous genetic characterization of the S. epidermidis CRISPR-Cas system showed that isogenic strains lacking csm2, csm3, or csm5 accumulated intermediate crRNAs but lacked the mature species (19). These genes were therefore candidates to encode proteins involved in crRNA maturation. In type I and III CRISPR-Cas systems, different Cas proteins have been found associated with the crRNAs to form different ribonucleoprotein complexes (2,5,16,20,21). Although such a complex for type III-A CRISPR-Cas systems has not yet been described, we suspected that Csm2, Csm3, and/or Csm5 would participate in crRNA maturation in the context of a ribonucleoprotein complex. To test this, we created pcrispr, a plasmid that contains the entire S. epidermidis RP62a CRISPR-Cas system cloned into the multicopy staphylococcal plasmid pC194 (24). We added a His6 tag to each of the Cas and Csm proteins involved in crRNA biogenesis (Cas10, Csm2, Csm3, Csm4, Csm5, and Cas6) (19) and corroborated that the presence of the tag does not disrupt CRISPR interference against the conjugative plasmid pG0400 (Table 1). We used Ni2+ affinity chromatography to isolate each tagged protein. SDS-PAGE of proteins purified from cellular extracts of strains expressing Cas10-, Csm2-, Csm3-, Csm4-, or Csm5-tagged protein showed that these proteins co-purify with each other (Fig. 1C), suggesting the presence of a complex composed of these five proteins, which we termed Cas10·Csm. Western probing for the His6 tag (Fig. 1D) and semi-quantitative mass spectrometry analysis (ESI MS/MS) of the His6·Csm2 complex (Table 2) both confirmed the identity of its members. Mass spectrometry also revealed that Cas6 co-purifies with the Cas10·Csm complex, but at an abundance at least 100-fold lower than that of the other members of the complex. No other proteins were enriched in the elution fraction when compared with a negative control expressing an untagged Cas10·Csm complex.
To further characterize the complex, we overexpressed and purified the His6·Csm2 complex in E. coli and subjected it to size-exclusion chromatography (Fig. 2, A and B). The elution profile demonstrated a strong association between the members of the complex and yielded an estimated molecular mass of 331 kDa. Assuming each subunit of the complex exists in a single copy (the theoretical mass of each protein is shown in Table 2), and allowing for a maximum of 25 kDa additional for the crRNA, the mass of such a complex would add up to 255 kDa (i.e. over 70 kDa of mass remains unaccounted for). The higher molecular weight estimated after gel filtration could be due to a particular stoichiometry or to a non-globular shape that affects the apparent molecular weight of the complex. crRNAs extracted from the E. coli-expressed complex maintain their 6-nucleotide periodicity (Fig. 2C); however, the overall length of mature crRNAs in the complex is shortened by 2 nucleotides when compared with those produced in S. epidermidis. We hypothesize that different host-encoded nucleases might be responsible for the maturation cleavage event.

FIGURE 1. A, organization of the type III-A CRISPR system in S. epidermidis RP62a. This system contains 9 CRISPR-associated (cas and csm) genes, 4 direct repeats (black boxes), and 3 spacers (colored boxes), the first of which targets the nickase gene in staphylococcal conjugative plasmids. B, a ruler mechanism determines the length of mature crRNAs. Transcription of the repeat-spacer array generates a precursor crRNA that is subject to two cleavage events: primary processing within repeats to yield ~71-nt intermediates (filled triangles), and maturation through trimming of the 3′ end of the intermediate (empty triangles). A ruler mechanism anchored at the primary processing site determines the extent of maturation to generate 37- and 43-nt-long mature crRNAs. C, the type III-A Cas10·Csm complex. His6 tags were placed on the indicated (N or C) terminus of each of the genes involved in crRNA biogenesis. Constructs were expressed in S. epidermidis LM1680, and whole cell lysates were subjected to Ni2+ affinity chromatography. Complexes were resolved by SDS-PAGE and visualized using Coomassie G-250 staining. Red asterisks indicate each of the tagged species. D, each tagged Cas10·Csm subunit was visualized by Western blot of cell extracts using Ni2+-HRP. E, RNA was extracted from each of the His6-Cas10·Csm complexes, radiolabeled at the 5′ end, and resolved using denaturing PAGE.

Nonetheless, we tested whether the Cas10·Csm complex itself performs maturation of intermediate crRNAs. We purified complexes from S. epidermidis using a Δcas6 construct (Fig. 3A) because Cas6 is considered the primary processing endonuclease for type I and III CRISPR-Cas systems (2,13-15), and therefore the obtained complexes do not contain any crRNAs (data not shown). The purified complex was incubated with a mix of labeled 71-nt spc1, spc2, and spc3 crRNA substrates (extracted from purified complexes that do not perform maturation) in the presence of different divalent cations (Fig. 3B). No cleavage or degradation was observed, supporting our hypothesis that different nucleases specific to each host might be responsible for the maturation cleavage event. The 6-nt periodicity of crRNAs, however, is inherent to the Cas10·Csm complex, indicating that one or more of its components acts as a ruler to determine the length of the mature crRNAs.
Csm3 Modulates the Length of Mature crRNAs-Complexes carrying the His6 tag in different proteins showed the same crRNA pattern with the exception of the His6·Csm3 complex, which displayed a crRNA size distribution that is shifted to shorter lengths (37 and 31 nt) (Fig. 1E). This observation suggested that the addition of the tag affects Csm3 activity or association with the rest of the members of the complex in a manner that impacts the extent of crRNA maturation. To investigate this possibility, we performed alanine substitutions in conserved residues in Csm3 and looked for mutants that affect the mature crRNA sizes. We mutated a conserved histidine (His-18) suspected to have a catalytic role (28), as well as three highly conserved acidic residues: Asp-100, Glu-120, and Glu-124 (Fig. 4A). Mutation of these residues either individually or in combination had no effect on CRISPR function (Fig. 4B and Table 1). We purified complexes carrying these mutations and examined the crRNA distributions. Whereas the H18A complexes produced wild-type levels and sizes of crRNAs, the E120A,E124A substitutions caused the appearance of crRNA species between 31 and 37 nucleotides, and the D100A substitution caused mature crRNAs to collapse down to the smallest detectable species, 31 nt (Fig. 4C). These results indicate that Csm3 has an important role in the measurement of the length of mature crRNAs.
FIGURE 3. A, Cas10·Csm (Csm2 H6N) purified from a pcrispr/Δcas6 construct in S. epidermidis LM1680. Whole cell lysates were subjected to Ni2+ affinity chromatography. The Δcas6 construct was used to ensure the absence of mature crRNAs preloaded into the complex. B, a 65-nt spc1 crRNA substrate was 5′ end-labeled using polynucleotide kinase, PAGE-purified, and combined with the complex (500 pmol) in the presence of EDTA or various metals as indicated. The reaction mixture was incubated at 37°C for 20 min, and RNAs were resolved by denaturing PAGE.

FIGURE 4. Csm3 modulates the length of mature crRNAs. A, partial Csm3 sequence alignment using BLAST. Highly conserved residues are highlighted in red; mutated residues are indicated with asterisks. B, mutations in conserved Csm3 residues do not affect CRISPR interference against the conjugative plasmid pG0400. S. epidermidis LM1680 strains harboring wild-type and mutant pcrispr plasmids with the indicated alanine substitutions in Csm3 and encoding His6-Csm2 were used as recipients for the transfer of pG0400. Conjugation was carried out in duplicate; the values (in cfu/ml; mean ± S.D.) obtained for recipients and transconjugants are shown. C, wild-type and mutant Cas10·Csm (Csm2 H6N) complexes were purified from S. epidermidis, and crRNAs were extracted, radiolabeled on their 5′ end, and resolved using denaturing PAGE. Whereas most mutations do not severely affect the crRNA size distribution, the D100A substitution leads to the elimination of all crRNAs except the 31-nt species.

Multiple Copies of Csm3 Bind the crRNA with a 6-Nucleotide Periodicity-A direct role of Csm3 in the modulation of crRNA length would require a direct interaction of this protein with crRNAs. To test this, we overexpressed and purified Csm3 from E. coli (Fig. 5A) and used gel shift assays to characterize its binding to end-labeled spc1 crRNAs of different lengths (Fig. 5B). Multiple shifted bands were observed, suggesting the binding of multiple Csm3 molecules to each crRNA (Fig. 5C). Surprisingly, the number of bands correlated with the length of the crRNA substrate, with approximately one band for every 6 nt of RNA. We tested RNA molecules of 12, 18, and 33 nucleotides in length and obtained 2, 3, and 5 shifted bands, respectively. This suggests that each Csm3 protein interacts with 6 nt of the crRNA. This interaction did not depend on the sequence of the RNA substrate, as similar results were obtained for an RNA with a random sequence (Fig. 5, B and D). We were unable to purify sufficient quantities of Csm3 D100A to perform substrate-binding studies. This protein was highly insoluble, suggesting that the D100A mutation might affect the folding and/or stability of Csm3 when overexpressed in E. coli.
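The relationship between substrate length and the number of shifted bands reported above can be checked with a small sketch. It assumes that each bound Csm3 occupies a full 6-nt footprint and that partial footprints do not bind, which is one simple reading of the gel-shift data; the function name is hypothetical.

```python
def expected_bands(rna_length_nt, footprint_nt=6):
    """Number of complete footprints that fit on an RNA of the given length."""
    return rna_length_nt // footprint_nt


# Substrate lengths (nt) and shifted bands observed, as reported in the text.
observed = {12: 2, 18: 3, 33: 5}

for length, bands in observed.items():
    print(length, expected_bands(length), bands)
```

Under this assumption the predicted band counts match the observed ones for all three substrates.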
FIGURE 5. Multiple copies of Csm3 bind crRNAs with a 6-nucleotide periodicity. A, Csm3 was cloned into the pET28b vector in frame with an N-terminal His10 tag and the Smt3 peptide. The construct was expressed in E. coli, and whole cell lysates were subjected to Ni2+ affinity chromatography, followed by dialysis with SUMO protease, which cleaves the His10-Smt3 tag. A second Ni2+ affinity purification was used to separate the tag. Csm3 preps before and after cleavage of the tag (Smt3-Csm3 and Csm3, respectively) were resolved by SDS-PAGE and visualized using Coomassie G-250 staining. B, RNA substrates used for Csm3 binding assays. Specific substrates (S) are identical to spc1 crRNAs, harboring the repeat sequence corresponding to the 5′, 8-nt tag (blue) and various lengths of spacer sequence (yellow). The nonspecific substrate (NS, gray) bears no resemblance to a crRNA. C, Csm3 binding to the spc1 crRNA substrates of different lengths is shown. Trace amounts of end-labeled substrate were incubated with increasing amounts of Csm3 (0, 256, 320, and 384 pmol). Bound and free substrates are indicated. D, Csm3 binding to a nonspecific substrate. Trace amounts of end-labeled substrate were incubated with increasing amounts of Csm3 (0, 8, 16, 32, 64, 128, 185, and 256 pmol). Bound and free substrates are indicated.

FIGURE 6. Csm3 protects 6 nt of the crRNAs during maturation. A, purification of Cas10·Csm complexes containing a Csm3-Csm3 dimer. The complex was purified from a strain expressing His6-Csm2 and a dimer consisting of two Csm3 proteins tethered through a flexible (GGGGS)3 linker (Csm3-Csm3) using Ni2+ affinity chromatography. B and C, total crRNAs were extracted from the indicated Cas10·Csm complexes (B), or spc1 crRNAs were captured from total RNA extracts of indicated strains using a biotinylated probe antisense to spc1 (C, probe marked with a black arrowhead). crRNAs were radiolabeled at the 5′ end and resolved using denaturing PAGE.

The above results led us to hypothesize that Csm3 could bind the crRNAs associated with the Cas10·Csm complex and determine their length. Csm3 could have two possible functions: to protect the 3′ end of the crRNA from nuclease activity or to specify a cleavage site within the crRNA. To distinguish between these possibilities, we engineered complexes carrying a Csm3-Csm3 dimer in which both proteins are tethered by a flexible (GGGGS)3 linker. The presence of a dimer should change the periodicity of crRNA lengths from 6 to 12 nt in the protection scenario, whereas the periodicity should remain 6 nt if Csm3 specifies the cleavage site. We purified complexes containing the dimer (Fig. 6A) and looked for their associated crRNA species (Fig. 6B). Consistent with Csm3 protection of the crRNA, the 31-nt species disappeared, and the smallest mature crRNA species in the Csm3·Csm3 complex was 37 nt. Similar results were obtained by capturing spc1 crRNAs from total RNA extracts (Fig. 6C). According to the protection hypothesis, the next crRNA species in the Csm3·Csm3 complexes should be 49 nt (37 + 12 nt). crRNA species larger than 37 nt, however, were not observed, and complexes containing the dimer were not able to execute full CRISPR immunity against the pG0400 plasmid (Table 1). We believe that although these complexes can be formed and purified, they might adopt an artificial configuration or stoichiometry, precluding an unequivocal interpretation of these experiments. Despite these caveats, the in vitro and in vivo results taken together confirm that each Csm3 molecule interacts with 6 nt of the crRNA and likely protects it from the nuclease involved in maturation.
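The two predictions for the tethered Csm3-Csm3 dimer can be written out as a minimal sketch. It assumes the series starts at the smallest species observed in the dimer complex (37 nt) and is capped at 67 nt, the largest size reported for the wild-type complex; the cap and the function name are illustrative assumptions rather than values tested in the experiment.

```python
def predicted_lengths(smallest, step, longest=67):
    """crRNA lengths (nt) expected for a given periodicity, up to `longest`."""
    return list(range(smallest, longest + 1, step))


# Protection scenario: each tethered dimer shields 12 nt.
protection = predicted_lengths(37, 12)
# Cleavage-site scenario: periodicity stays at 6 nt.
cleavage_site = predicted_lengths(37, 6)

print(protection)     # [37, 49, 61]
print(cleavage_site)  # [37, 43, 49, 55, 61, 67]
```

The second entry of the protection series is the 49-nt species (37 + 12 nt) that the text notes was expected but not observed.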
Csm3 Levels Determine the Extent of crRNA Maturation. Next, we wanted to test the role of Csm3 in the generation of multiple crRNA species. Csm3 is an ortholog of CasC (13), which has been shown to exist in six copies per Cascade complex in the type I-E system of E. coli (29). We hypothesized that multiple copies of Csm3 might exist within the Cas10·Csm complex and could bind the crRNA with a periodicity of 6 nt to protect it from the nuclease that cleaves or degrades the 3′ end of the intermediate crRNA (Fig. 7A). If this is the case, complexes lacking Csm3 should contain very short or no crRNAs, whereas extra copies of Csm3 should result in larger sizes for mature crRNAs. We attempted the purification of Cas10·Csm complexes from strains lacking Csm3. Absence of Csm3 prevented not only the formation of the complex but also the detection of the other members of the Cas10·Csm complex. Regardless of which complex member carried the His6 tag, we were unable either to purify a soluble subcomplex or to detect the tagged protein in cell extracts by Western blot (Fig. 7B). In Pseudomonas aeruginosa, the crRNA is required for the assembly of the Csy complex (20). Therefore, we hypothesized that because Csm3 can bind the crRNA, the lack of Csm3 could prevent the recruitment of the crRNA into the Cas10·Csm complex, thus preventing its assembly. To test whether the crRNA is indeed required for the formation of the complex, we purified it from a strain deleted for the repeat/spacer sequences. The Cas10·Csm complex did form in the absence of crRNAs (Fig. 7, C and D), a result that refuted our hypothesis. These findings suggest that Csm3 is an essential member of the core complex and that in its absence, the Cas10·Csm complex is unstable, and its members are degraded. Alternatively, although unlikely, Csm3 could be required for the expression of the cas/csm genes.
To test the effect of additional copies of Csm3 within the Cas10·Csm complex, we cloned the csm3 and csm3 D100A mutant genes under an isopropyl 1-thio-β-D-galactopyranoside-inducible promoter in the S. epidermidis plasmid pLM9, creating pLM9/csm3 and pLM9/csm3 D100A, respectively. We then induced expression of the different csm3 alleles in wild-type S. epidermidis RP62A (a strain expressing all the cas/csm genes from the chromosomal operon) and captured spc1 crRNAs from cellular extracts with a biotinylated antisense probe. Consistent with our model, overexpression of Csm3 resulted in a shift in spc1 crRNA sizes toward larger species (49, 55, and 61 nt) and in the disappearance of the smallest detectable size in this pulldown assay (37 nt, Fig. 7E). In contrast, overexpression of Csm3 D100A caused no detectable shift and was indistinguishable from an empty vector control. Altogether, these data demonstrate that Csm3 plays a fundamental role in determining the extent of crRNA maturation. Higher levels of this protein result in longer crRNAs, suggesting that Csm3 acts as a ruler that sets the lengths of the Cas10·Csm complex-bound crRNAs. We believe that alterations of Csm3 that perturb the folding and/or result in a reduced ability to bind the crRNA, such as the addition of an N-terminal His tag or the D100A mutation, result in shorter mature crRNA species.
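The 6-nt periodicity of the species reported above can be checked with simple arithmetic. The sketch below is illustrative only: the helper `ruler_steps` and the constant names are hypothetical, and the 31-nt species is taken as the reference point because it is the shortest mature crRNA described in this study. The number of steps is not meant to equal the number of Csm3 copies, which was not measured directly.

```python
BASE_NT = 31  # shortest mature spc1 crRNA reported in this study
STEP_NT = 6   # spacing between mature crRNA species

def ruler_steps(length_nt):
    """Return how many 6-nt steps separate a species from the 31-nt crRNA,
    or None if the length does not fall on the 6-nt ladder."""
    steps, remainder = divmod(length_nt - BASE_NT, STEP_NT)
    return steps if remainder == 0 and steps >= 0 else None

# Species detected in the pulldown assay (37 nt) and after Csm3 overexpression.
for length in (37, 49, 55, 61):
    print(length, ruler_steps(length))
```

All four lengths fall on the ladder (1, 3, 4, and 5 steps above 31 nt, respectively), whereas a length such as 40 nt would return `None`.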

DISCUSSION
Here, we show that a Cas10·Csm ribonucleoprotein complex, similar to the E. coli Cascade (2) and Pyrococcus furiosus and Sulfolobus solfataricus (5, 21) Cmr complexes, mediates type III-A CRISPR immunity in staphylococci. This complex is composed of Cas10, Csm2, Csm3, Csm4, and Csm5 and mature crRNAs that, through base pair interactions with a cognate DNA sequence, guide the complex to its target on the genome of plasmids and bacteriophages.
FIGURE 7. Csm3 levels determine the extent of crRNA maturation. A, a model for crRNA maturation in which Csm3 protects the 3′ end of crRNAs, acting as a 6-nt ruler within the Cas10·Csm complex to dictate the extent of maturation. B, a Western blot is shown using Ni2+-HRP to probe for each of the indicated His-tagged species in a Δcsm3 background. His6 tags were placed in pcrispr/Δcsm3 on each remaining member of the complex (as indicated) and also on Cas6, and constructs were expressed in S. epidermidis LM1680. Whole cell lysates of each strain were subjected to Ni2+ affinity chromatography, and extracts were resolved by SDS-PAGE prior to Western blotting. Although Cas6 can be detected in cell extracts, the members of the Cas10·Csm complex cannot. C, SDS-PAGE of a His6·Csm2 complex expressed from a pcrispr plasmid lacking the repeat/spacer sequences (ΔR/S) and purified using Ni2+ affinity chromatography. D, extraction, labeling, and PAGE separation of RNAs from the ΔR/S Cas10·Csm complex. E, spc1 crRNAs visualized in wild-type S. epidermidis in the presence of pLM9, pLM9/csm3, or pLM9/csm3 D100A. Where indicated, Csm3 overexpression was induced with isopropyl 1-thio-β-D-galactopyranoside (IPTG) for 3 h prior to harvesting total RNA. crRNAs were captured from total RNA extracts using a biotinylated probe antisense to spc1, radiolabeled at the 5′ end, and resolved using denaturing PAGE. oligo, oligonucleotide probe.
Unexpectedly, multiple mature crRNA species differing by 6-nt increments at the 3′ end are present in this complex. Previously, we demonstrated that the final length of mature crRNAs is determined by a ruler mechanism that measures from the 5′ end primary processing cleavage site (19). Here, we demonstrate that Csm3 is the ruler that determines the mature crRNA length. Csm3 is essential for the formation of the complex and seems to be present in multiple copies. In vitro, multiple copies of Csm3 bind the RNA in a sequence-independent manner, one protein every 6 nt of substrate. Alanine substitution of a conserved aspartate residue, Asp-100, impacts Csm3 folding and/or ability to bind the crRNA and prevents the accumulation of longer mature crRNA species. However, overexpression of Csm3 leads to the accumulation of longer species. Altogether, these results allow us to propose that Csm3 binds to the crRNAs in the complex at multiple sites, once every 6 nt, with each additional copy extending the crRNA length by 6 nt.
The nucleases involved in the biogenesis of crRNAs remain to be determined. In other type III and type I CRISPR-Cas systems, primary processing is carried out by Cas6 (2, 15, 30), which cleaves within repeats at the base of a hairpin and defines the 5′ end for all crRNAs. Therefore, S. epidermidis Cas6 is a strong candidate to cleave the crRNA precursor into 71-nt intermediates. The identity of the nuclease responsible for crRNA maturation in type III CRISPR-Cas systems is less clear. The fact that the same type III-A complex, when expressed in S. epidermidis and E. coli, produces mature species of different lengths and that this complex is unable to direct the maturation of a crRNA substrate in vitro suggests that a host-encoded endo- or exoribonuclease might be responsible for the degradation of the 3′ end of crRNA intermediates. Several ribonucleases are annotated in the S. epidermidis RP62A genome (22): RNase H-II and -III, RNase III, RNase II/VacB and RNase R, RNase BN, RNase P, Cbf1, and YjgF. Any of these could participate in crRNA maturation. Future biochemical and genetic experiments will determine the role of host RNases in crRNA maturation.
The three different types of CRISPR-Cas systems can be distinguished according to their crRNA biogenesis pathways. Type II systems display a distinct mechanism for the generation of crRNAs that requires the pairing of the precursor crRNA with an antisense trans-encoded crRNA to ensure RNase III cleavage of the precursor (17). Type I and III systems, however, have comparable crRNA biogenesis pathways, the main difference being the lack of further processing of the intermediate crRNAs in the former. This may be attributed to the different properties of the type I Cas6 homologs. In these systems, the endonuclease is part of the Cascade complex (2, 7, 29) and remains tightly bound to the 3′ end of its product after cleavage (15). The strong and continuous protection of the crRNA by the type I Cas6 homologs may prevent further nucleolysis of the 3′ end. In contrast, Cas6 is not part of the S. epidermidis type III-A ribonucleoprotein complex or of the well characterized type III-B P. furiosus and S. solfataricus Cmr complexes, which also contain mature crRNAs differing by 6-nt increments at the 3′ end (5, 18, 21, 31). Due to its homology to Csm3, we suspect that Cmr6 acts as a ruler of the mature crRNA length in P. furiosus.
It remains unknown whether the precise measurement of crRNA length by Csm3 is important for CRISPR immunity. One possibility is that the extra trimming of crRNAs is caused by rearrangements in the crRNA:Cas·Csm complex that expose these RNAs to host nucleases and has no consequences for CRISPR immunity. Alternatively, crRNA maturation could be necessary for CRISPR-Cas function. In this scenario, it is possible that repeat sequences at the 3′ end of an intermediate crRNA, which do not anneal with the target and form stem-loop structures, would interfere with target recognition and/or cleavage. In fact, in the type III-B system of P. furiosus, the 3′ end of the mature crRNA determines the cleavage site on the target sequence exactly 14 nt upstream of this end (5); such precise cleavage may not be possible with a guide crRNA containing extra sequences at the 3′ end. Equally unknown is the importance, if any, of the presence of multiple crRNA species. If indeed the target cleavage site of type III CRISPR-Cas systems is determined by the mature 3′ end of the crRNA, then multiple mature crRNA sizes could direct multiple target cleavage events. The advantage of this would be that targets cleaved by complexes containing longer crRNAs could be subjected to a second cleavage event directed by shorter crRNAs, thus providing less chance for repair and stronger immunity. Our results argue that this may not be the case because we demonstrated that the Csm3 D100A complex, which produces only the shortest crRNA (31 nt), confers full immunity against plasmid conjugation. In this system, a perfect match exists between the spc1 crRNA and its targeted sequence, the protospacer. The possibility exists that longer crRNAs may be required when there is an imperfect match between the crRNA and the protospacer. In the latter scenario, longer crRNAs would increase the likelihood of establishing sufficient length of complementarity to facilitate interference.
The 31-nt-long crRNA contains 8 nt of repeat sequences at the 5′ end plus 23 nt of spacer sequence at the 3′ end. Because the full-length spacer is 36 nt, this result also shows that the 13 nt at the 3′ end of the spacer sequence are not required to specify the target of CRISPR immunity. This suggests that if there is a "seed sequence" in type III systems, a region of homology between the crRNA and the target absolutely required for CRISPR immunity (20, 32), it must be located at the 5′ end of the crRNA target. Deciphering the mechanisms of crRNA biogenesis and targeting will be important both to understand how this immune system prevents infection in prokaryotes and to exploit it for biotechnological applications (33-37).
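The length bookkeeping for the 31-nt crRNA discussed above can be written out explicitly. This is a minimal sketch using only the lengths stated in the text; the variable names are chosen here for illustration.

```python
CRRNA_NT = 31       # shortest mature spc1 crRNA
REPEAT_TAG_NT = 8   # repeat-derived tag at the 5' end
SPACER_NT = 36      # full-length spacer

# Spacer nucleotides retained in the 31-nt crRNA.
spacer_retained = CRRNA_NT - REPEAT_TAG_NT

# Spacer nucleotides at the 3' end that are trimmed away and are
# therefore dispensable for specifying the target.
spacer_trimmed = SPACER_NT - spacer_retained

print(spacer_retained, spacer_trimmed)  # 23 13
```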