No off-target mutations in functional genome regions of a CRISPR/Cas9-generated monkey model of muscular dystrophy

CRISPR/Cas9 is now widely used in biomedical research and has great potential for clinical applications. However, the safety and efficacy of this gene-editing technique are significant issues. Recent reports on mouse models and human cells have raised concerns that off-target mutations could hamper applying the CRISPR technology in patients. The high similarities of nonhuman primates to humans in genome content and organization, genetic diversity, physiology, and cognitive abilities have made these animals ideal experimental models for understanding human diseases and developing therapeutics. Off-target mutations of CRISPR/Cas9 have been analyzed in previous studies of nonhuman primates, but no report has investigated genome-wide off-target effects in living monkeys. Here, we used rhesus monkeys in which a genetic disorder mimicking Duchenne muscular dystrophy had previously been produced with CRISPR/Cas9. Using whole-genome sequencing to comprehensively assess on- and off-target mutations in these animals, we found that CRISPR/Cas9-based gene editing is active on the expected genomic sites without producing off-target modifications in other functional regions of the genome. These findings suggest that the CRISPR/Cas9 technique could be relatively safe and effective in modeling genetic disease in nonhuman primates and in future therapeutic research of human diseases.

Genome-editing technologies, especially CRISPR systems, have developed rapidly and are extensively used in modeling human diseases. CRISPR/Cas9 can induce precise modification of genomic sequences, but in addition to on-target edits it can produce off-target effects (OTs) because of sequence homology between the sgRNA and other genomic sites (1,2). Previous reports on mouse models and human cells (3)(4)(5) have raised concern about these OTs, which could hamper the application of CRISPR to patients. Recent research has verified that CRISPR/Cas9 can be used to precisely modify the genome of monkeys, demonstrating that CRISPR/Cas9-edited nonhuman primates (NHPs) have promising potential for understanding gene function and disease mechanisms and for exploring gene therapy (6-9). However, uncertain OTs are an inevitable concern for translational medicine research using these NHP models. We had previously generated monkey models of Duchenne muscular dystrophy (DMD) by disrupting the X-linked dystrophin gene via CRISPR/Cas9. The genotypes and predicted OTs were assessed in the placenta and umbilical cord of newborn monkeys or in tissues of stillborn monkeys, and no OT was detected (8), which is the most important requirement for gene therapy research. In this study, we performed OT analysis based on whole-genome sequencing (WGS) in two CRISPR-edited monkey models. The results demonstrated that there was no off-target mutation in functional genome regions of a CRISPR/Cas9-generated monkey model of DMD.

Analysis of mutations and presumptive off-target sites via DNA sequencing
We had previously assessed the genotypes and presumptive OT sites in the placenta and umbilical cord of newborn monkeys or in tissues of stillborn monkeys, and no OT was detected (8). Here, we tested two CRISPR-edited DMD monkeys (both 4 years old) that were identified as mutant. MT-1 is a homozygous biallelic knockout with a 1-bp deletion, and 100% of tested blood, muscle, and skin cells are mutant. MT-2 is a mosaic mutant monkey with 2-bp deletions, containing about 50% mutant cells (Fig. 1A). OTs were analyzed in both mutant monkeys and one age-matched WT sibling control (WT-1). A total of 17 presumptive OT sites were predicted with CCTop (CRISPR/Cas9 target online predictor) software according to PAM loci and similarity to the sgRNA and were examined by Sanger sequencing with specific primers. No OT was found at any of these predicted sites (Fig. 1 (B-E) and Table S1).

WGS off-target analysis
To further validate on- and off-targets at the WGS level, single-nucleotide variants (SNVs) and insertions or deletions (indels) were examined at a depth of 60× to cover variants of lower frequencies, including those in coding regions, RNA splicing sites, up- and downstream regions, introns, intergenic regions, and the 5′-UTR. There are 7,546,817 SNVs and 1,324,952 indels in MT-1 (Fig. 2A) Tables S2-S5. By comparison with the reference genome, a total of 847 sequences homologous to the sgRNA were screened out with the NCBI alignment tool blastn (Table S6). There were 510 general sites in MT-1, including 442 SNVs and 68 indels, and 592 general sites in MT-2, containing 535 SNVs and 57 indels. We used the WGS sequence of the WT sibling first and then an online genomic database as references to filter the SNVs and indels step by step. SNVs were filtered according to their identity with the reference and sibling genomes, leaving 208 SNVs in MT-1 and 227 SNVs in MT-2, all in intronic or intergenic regions; none were in coding regions, splicing sites, or up- and downstream regions in either mutant monkey (Table 1). In contrast with inbred mice, humans and monkeys have rich genome diversity and complex individual differences. We therefore focused mainly on indels, which account for almost all of the mutations introduced by nonhomologous end joining and are the hallmarks of endonuclease-induced damage (9,10). Similarly, indels were filtered by identity with the reference and WT control, and those in repetitive sequences or genomic WGS sequence assemblies (NW_xxx) were excluded. The remaining three indels in MT-1 and two in MT-2 were validated by Sanger sequencing: MT-1 carried one intronic indel and two intergenic indels, and MT-2 carried two intergenic indels. To investigate the potential interaction of these five indels with the dystrophin gene, we searched for them in the Macaca mulatta genome from NCBI. Neither the four intergenic sequences nor the intronic one could be found in known coding genes.
There is a single exonic indel located at the expected  (Table 1). Together, these results indicate that on-target mutations, but no off-target mutations, exist in functional genome regions of the DMD monkeys.

Discussion
The OT effect is an important issue for both animal model construction and gene therapy. Genome-modified monkey models are important for translational medicine research in safety and efficiency assessment because of their high similarity to humans in genome and genetic diversity, physiology, and cognition. Recent literature based on mouse models indicated that OT effects are still an unavoidable issue. Reliable analytical methods and in-depth sequencing are essential to determine whether or not an OT exists. Anderson et al. (3), using next-generation sequencing, analyzed on- and off-target effects in genome-edited mice and rats and detected OTs, whereas in another report, potential OTs were excluded by Sanger sequencing (11). In our work, all potential OTs in aborted tissue (8) and in the live monkeys of this study were excluded by T7EN1 and Sanger sequencing. In exploring the presence of OTs at the WGS level, Anderson's group (3), but not Iyer's group (4), found OTs, including SNVs and indels. In view of genome diversity and complex individual differences in monkeys, we focused on indels and filtered the data with steps similar to those used for rodents. We found that no functional OTs occurred in DMD monkey models induced with a single sgRNA. The above OT analyses indicate that gene targeting in rodents and monkeys can have different consequences, case by case. Therefore, with regard to safety concerns about CRISPR/Cas9 technology, OT analysis should be a regular assessment in model construction and therapeutic research.

efficacy, CRISPR/Cas9 would be better applied in clinical research and disease treatment.

Animals
DMD rhesus monkey models were reported previously (8). For OT assessment, blood, ear tissues (cells), and muscle samples were taken in a minimally invasive manner from three 4-year-old healthy siblings, one of them an untreated WT control and the other two CRISPR/Cas9-mediated mutant monkeys. All animal procedures were approved in advance by the institutional animal care and use committee and were performed in accordance with the guidelines of the Association for Assessment and Accreditation of Laboratory Animal Care International for the ethical treatment of primates.

Identification of mutations and predicted OTs via DNA sequencing
Genomic DNA was extracted with the DNeasy blood and tissue kit (Qiagen) according to the kit instructions. The targeting region was amplified by PCR with an initial incubation at 95°C for 5 min, followed by 20 cycles of 98°C for 10 s, 60°C for 30 s, and 72°C for 40 s and then 10 cycles of 98°C for 10 s, 50°C for 30 s, and 72°C for 40 s. The PCR products were purified, and 200 ng of DNA/sample was digested with 0.3 μl of T7 endonuclease in a 20-μl reaction volume for 30 min at 37°C. The products were then resolved on a 3% agarose gel. To assess mosaic status, purified PCR products were ligated into the TA cloning vector and transformed into competent Escherichia coli cells. Single colonies were picked and sequenced, and the mosaic rate was calculated from the sequencing results. A total of 17 presumptive OT sites were predicted with CCTop software (http://crispr.cos.uni-heidelberg.de/index.html) (14) based on an NGG PAM and similarity to the sgRNA, allowing a maximum of 4 total mismatches. OTs were examined by Sanger sequencing with specific primers (Table S2).
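The prediction criteria stated above (an NGG PAM and at most 4 mismatches to the sgRNA target) can be illustrated with a simplified scan. This is only a sketch and does not reproduce the CCTop algorithm: it searches the forward strand only, applies no position weighting, and the function names, toy protospacer, and toy genome below are hypothetical.

```python
def hamming(a, b):
    """Count mismatched positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def find_candidate_sites(genome, protospacer, max_mismatches=4):
    """Forward-strand scan for sites followed by an NGG PAM.

    Returns (start, site, pam, mismatches) tuples for every window whose
    3' neighbor is NGG and that differs from the protospacer at no more
    than max_mismatches positions.
    """
    n = len(protospacer)
    hits = []
    # The window needs n bases plus a 3-base PAM.
    for i in range(len(genome) - n - 2):
        site = genome[i:i + n]
        pam = genome[i + n:i + n + 3]
        if pam[1:] != "GG":
            continue
        mm = hamming(site, protospacer)
        if mm <= max_mismatches:
            hits.append((i, site, pam, mm))
    return hits


# Toy sequences (not the sgRNA used in the study).
protospacer = "ACGTACGTACGTACGTACGT"
near_match = "ACGTACGTACCTACGTACGA"  # differs at two positions
genome = "TT" + protospacer + "AGG" + "CC" + near_match + "TGG" + "A"
hits = find_candidate_sites(genome, protospacer)
```

A real search would also scan the reverse complement and the whole reference assembly; the sketch only shows how the PAM and mismatch thresholds combine into a candidate list.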

DNA library preparation and WGS analysis
Blood samples of WT-1, MT-1, and MT-2 monkeys were taken, and genomic DNA was subjected to standard whole-genome DNA library preparation for high-throughput sequencing (Illumina platform) with a mean coverage of 60×. For the paired-end sequencing data, we required an average Q30 ratio above 80% and an average error rate below 0.1%. Valid sequencing data were aligned to the reference genome of the rhesus monkey (M. mulatta) (assembly Mmul_8.0.1) by BWA (12), and the alignments were sorted using SAMtools (13). Finally, we marked duplicate reads using Picard (http://sourceforge.net/projects/picard). All polymorphic SNV and indel sites in the genome were extracted, and high-confidence SNV and indel data sets were ultimately obtained and analyzed.
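The two quality thresholds above follow from the standard Phred relation, in which a quality score Q corresponds to an error probability of 10^(-Q/10), so Q30 equals a 0.1% error rate. The sketch below shows how both metrics could be computed from a FASTQ quality string; the function names are hypothetical, and Phred+33 encoding (used by current Illumina output) is assumed.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into Phred scores (Phred+33 assumed)."""
    return [ord(c) - offset for c in quality_string]


def q30_fraction(scores):
    """Fraction of bases with a Phred score of at least 30."""
    return sum(q >= 30 for q in scores) / len(scores)


def mean_error_rate(scores):
    """Mean per-base error probability, using P = 10 ** (-Q / 10)."""
    return sum(10 ** (-q / 10) for q in scores) / len(scores)


def passes_qc(scores, min_q30=0.80, max_error=0.001):
    """Apply the thresholds stated in the text: Q30 > 80%, error < 0.1%."""
    return q30_fraction(scores) > min_q30 and mean_error_rate(scores) < max_error


# Toy read: four bases at Q40 ('I') and one at Q20 ('5').
scores = phred_scores("IIII5")
```

For this toy read, exactly 80% of bases reach Q30 and the mean error rate is about 0.2%, so it would fail both strict thresholds. In practice the metrics are averaged over all reads in a run rather than computed per read.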

Prediction of potential OT sites
The obtained data sets were used to compare the sgRNA sequence with the reference genome. All potential polymorphic SNV and indel sites in the genome were extracted with SAMtools and Picard tools. High-confidence SNV and indel data sets were then obtained, taking repeatability and other factors into account, for further filtering. Potential OT sites were identified with the NCBI alignment tool blastn as sequences containing 1-5 base mismatches with the 20-bp target sequence and at least one mismatch in the PAM-proximal seed region. To find more SNV and indel sites, the sgRNA homologous regions were extended by 150 bp both upstream and downstream. All obtained potential OT sites were classified as exonic (coding region), RNA splicing, up- and downstream, intronic, or intergenic. All SNVs and indels were further filtered according to identity with the reference and WT control, and those in repetitive sequences or WGS shotgun assemblies were excluded.
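The site criteria above (1-5 mismatches with the 20-bp target, at least one mismatch in the PAM-proximal seed region, and a 150-bp extension on each side) can be expressed as a short sketch. The function names are hypothetical, and the seed length of 12 nt is an assumption made for illustration because the text does not state it. Sequences are assumed to be written 5' to 3' with the PAM at the 3' end, so the seed is the last `seed_len` bases.

```python
def mismatch_positions(target, site):
    """Return 0-based positions where site differs from target."""
    return [i for i, (a, b) in enumerate(zip(target, site)) if a != b]


def is_candidate(target, site, seed_len=12, min_mm=1, max_mm=5):
    """Check the mismatch criteria described in the text.

    seed_len=12 is an assumed value; the seed is taken as the
    PAM-proximal (3') end of the target.
    """
    mm = mismatch_positions(target, site)
    seed_start = len(target) - seed_len
    has_seed_mismatch = any(p >= seed_start for p in mm)
    return min_mm <= len(mm) <= max_mm and has_seed_mismatch


def flanked_window(start, end, flank=150, chrom_len=None):
    """Extend a half-open [start, end) interval by flank bp on each side,
    clipped to the chromosome bounds when chrom_len is given."""
    lo = max(0, start - flank)
    hi = end + flank
    if chrom_len is not None:
        hi = min(chrom_len, hi)
    return lo, hi


# Toy target (not the sgRNA used in the study).
target = "A" * 20
seed_hit = "A" * 19 + "C"     # one mismatch, inside the assumed seed
distal_only = "C" + "A" * 19  # one mismatch, outside the assumed seed
```

Variants called inside each extended window would then be passed to the identity and repetitive-sequence filters described in the text.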
Author contributions-Y. C. and W. J. designed and conceived the study. W. J. and Y. N. took part in discussion; S. W., S. R., R. B., P. X., and Q. Z. performed molecular, cell, sequencing, and off-target analyses; Y. Z. and Z. Z. collected monkey materials; Y. C., S. W., and S. R. analyzed data and wrote the manuscript, which was approved by all authors.