Biallelic insertion of a transcriptional terminator via the CRISPR/Cas9 system efficiently silences expression of protein-coding and non-coding RNA genes

The type II bacterial CRISPR/Cas9 system is a simple, convenient, and powerful tool for targeted gene editing. Here, we describe a CRISPR/Cas9-based approach for inserting a poly(A) transcriptional terminator into both alleles of a targeted gene to silence protein-coding and non-protein-coding genes, which often play key roles in gene regulation but are difficult to silence via insertion or deletion of short DNA fragments. The integration of 225 bp of bovine growth hormone poly(A) signals into either the first intron or the first exon or behind the promoter of target genes caused efficient termination of expression of PPP1R12C, NSUN2 (protein-coding genes), and MALAT1 (non-protein-coding gene). Both NeoR and PuroR were used as markers in the selection of clonal cell lines with biallelic integration of a poly(A) signal. Genotyping analysis indicated that the cell lines displayed the desired biallelic silencing after a brief selection period. These combined results indicate that this CRISPR/Cas9-based approach offers an easy, convenient, and efficient novel technique for gene silencing in cell lines, especially for those in which gene integration is difficult because of a low efficiency of homology-directed repair.

Previous studies indicate that non-protein-coding genes, especially those that encode long non-coding RNAs (lncRNAs), play a key role in gene regulation and are involved in many biological processes, e.g. cell growth, epigenetic regulation, cancer development, and human disease (1-4). Silencing long non-protein-coding genes by insertion or deletion of small DNA fragments is more difficult than silencing protein-coding genes because lncRNA function depends primarily on the conformation of the transcript. Recently, a CRISPR/Cas9-based system was developed as a novel tool for gene editing (5,6). As a simple, convenient, and efficient system, it has been used for gene editing in a variety of organisms (7). The genes edited with this system include protein-coding and non-protein-coding genes. Two strategies have been employed for permanently silencing non-protein-coding genes using large genomic deletions with the CRISPR/Cas9 system. One strategy is to completely delete the given gene using multiple guide RNAs (gRNAs) targeting the 5′- and 3′-flanking sequences (8-10). The alternative strategy is to delete core promoter sequences of the given gene (11). The limitation of these strategies is that the deletion of a large genomic fragment may alter the function of the gene of interest by removing potential regulatory elements or other functional genes around the targeted genomic region.
This study was designed to silence three genes, including the lncRNA gene MALAT1, by biallelic integration of a poly(A) signal using the CRISPR/Cas9 system, and thereby to develop an easy, convenient, and efficient approach to gene silencing that combines the advantages of the CRISPR/Cas9 and poly(A) signal approaches. First, a poly(A) signal was inserted into one of three designated sites (immediately behind the promoter, in the first exon, or in the first intron) of the targeted gene via CRISPR/Cas9-induced homology-directed repair (HDR). Second, double marker selection was employed for screening clonal cell lines with successful biallelic integration of the poly(A) signal. Finally, the efficiency of gene silencing was evaluated by qRT-PCR, and biallelic integration was verified by genotyping analysis. Our data showed that the transcription of the given gene was efficiently terminated, demonstrating that CRISPR/Cas9-mediated biallelic integration of a poly(A) signal with double marker selection is an easy, convenient, and efficient novel approach for gene silencing in a cell line.

Downstream transcription was suppressed by transcriptional terminators immediately behind an endogenous promoter
A viral promoter (cmv promoter) was used to drive the transcription of the selection marker gene (GFP) and a poly(A) signal in a previous study that established a zinc finger nuclease-based approach to gene silencing (12). Here, we tested whether an endogenous promoter could serve the same purpose, for two reasons. First, if the integration construct can be driven by the promoter of the targeted gene, no exogenous promoter is needed and the construct can be shortened, which facilitates its transfection into cells. Second, because viral promoters are readily subject to epigenetic modification in eukaryotic cells, an endogenous promoter may allow more persistent transcription of the integration construct.
We also verified whether a poly(A) signal and β-globin terminator could terminate downstream transcription in our system using a cmv promoter. The BGH poly(A) signal was previously used as an RNA-destabilizing element for gene silencing.
In addition, multiple tandem poly(A) signals were shown to enhance transcriptional termination, and a β-globin terminator has also been shown to terminate the transcription driven by eukaryotic RNA polymerase II (13,14). Therefore, we constructed a set of plasmids with the following cassettes: cmv promoter-RFP-IRES-EGFP (for the control plasmid); cmv promoter-RFP-BGH poly(A) signal-IRES-EGFP; cmv promoter-RFP-4× BGH poly(A) signals-IRES-EGFP; and cmv promoter-RFP-β-globin terminator-IRES-EGFP (Fig. 1A). RFP and EGFP were used as selection markers. All plasmids were independently transfected into HEK293 cells. After 48 h of transfection, the transcription of sequences downstream from the BGH poly(A) signal, 4× BGH poly(A) signals, and the β-globin terminator was drastically reduced compared with controls, as indicated by the qRT-PCR analysis (Fig. 1B). Moreover, the effect of the β-globin terminator on gene silencing was stronger than that of the BGH poly(A) signal, and multiple tandem BGH poly(A) signals led to enhanced transcriptional termination.
Figure 1. A, the control plasmid is highlighted with a viral promoter, cmv promoter (blue arrow), RFP (red box), IRES (orange box), and EGFP (green box). The other plasmids are constructed from the control plasmid and differ from it in the silencing signal inserted between RFP and IRES: BGH poly(A) + (black arrow), 4× BGH poly(A) (4 black arrows), or β-globin terminator (long black arrow). B, results of the qRT-PCR analysis with IRES-F/R primers on HEK293 cells transfected with different plasmids. Compared with the control plasmid, transcriptional termination signals in the other plasmids efficiently silence downstream transcription. The IRES transcription is expressed as fold change over the relative abundance of IRES transcripts in the cells transfected with the control plasmid. n = 3. The data are presented as the means ± S.D.
To determine whether the terminators driven by an endogenous promoter could silence the targeted gene, we constructed a set of plasmids similar to the ones described above. The differences of the plasmids are as follows. 1) The cmv promoter was replaced by the endogenous EF-1α promoter. 2) The RFP gene was removed. 3) An EF-1α promoter-reversed BGH poly(A) signal-IRES-EGFP cassette was constructed to test whether the integration of BGH poly(A) signal had any effect on antisense transcription (Fig. 2A). The qRT-PCR analysis with primers specific for IRES showed that, compared with the control, the transcription downstream from the BGH poly(A) signal, 4× BGH poly(A) signals, and the β-globin terminator was drastically inhibited after transfection with the individual plasmid into HEK293 cells. An enhanced inhibition of downstream transcription was observed in the cells transfected with the plasmid containing 4× BGH poly(A) signals. In contrast, transfection with the plasmid containing a reversed BGH poly(A) signal did not cause significant inhibition of its downstream transcription (Fig. 2B), suggesting that the integration of the BGH poly(A) signal had a negligible effect on the antisense transcription. This feature may be very useful for avoiding interference with the transcription of an overlapping antisense gene on the complementary DNA strand, as this may potentially cause a misinterpretation of the function of the targeted gene (15).
Taken together, these findings indicated that the BGH poly(A) signal driven by an endogenous promoter was able to efficiently terminate its downstream transcription. Because BGH poly(A) is a short terminator, its integration into a target gene may be more efficiently mediated by a CRISPR/Cas9-induced HDR approach than that of longer terminators (i.e. 4× BGH poly(A) signals and the β-globin terminator). We thus speculated that a biallelic BGH poly(A) integration mediated by CRISPR/Cas9-induced HDR could efficiently shut down transcription of a target gene.

Transcription of PPP1R12C was terminated by an integrated BGH poly(A) signal
To silence PPP1R12C, a protein-coding gene, by biallelic integration of a poly(A) signal via CRISPR/Cas9-induced HDR, a donor plasmid was constructed for integrating a poly(A) signal into the adeno-associated virus integration site 1 (AAVS1) in the first intron of PPP1R12C based on a gene trapping strategy (Fig. 2A). AAVS1 is a "safe harbor" locus located within the first intron of the human PPP1R12C (16). This plasmid contains a splice adapter (SA), T2A, a puromycin resistance (PuroR) selection marker, and a BGH poly(A) signal, flanked by homologous sequences of the AAVS1 locus (Fig. 3B). The donor plasmid was co-transfected into HEK293 cells with the plasmids expressing the Cas9 protein and the AAVS1 guide RNA (gRNA). A PCR-based genotyping analysis using genomic DNA revealed that 93.5% of clonal cell lines were transgenic after 2 weeks of puromycin selection, with only 3 out of 62 (4.8%) clonal lines having biallelic transgenes (Table 1). Moreover, although the PuroR selection marker lacked a promoter in the donor plasmid, 4 out of 62 (6.5%) clonal lines had the PuroR marker randomly integrated into the genome. Using the cell lines with the poly(A) signal integration, we determined whether transcription of PPP1R12C was prematurely terminated as a result of the integration. The results from the qRT-PCR analysis with two pairs of primers specific for PPP1R12C revealed that, compared with the control, the expression of PPP1R12C was drastically reduced by the integration of the PuroR-BGH-poly(A) (Fig. 2D), suggesting that biallelic integration of the poly(A) signal via CRISPR/Cas9-induced HDR efficiently silenced the transcription of its target gene even if the integration took place in the intron of the gene.

Efficient selection for biallelic silencing by double selection markers
Single marker selection for poly(A) signal integration was not efficient: only 4.8% of the clonal cell lines showed biallelic integration of the poly(A) signal into the PPP1R12C gene (Table 1). This efficiency may be even lower for genes that do not respond well to Cas9/gRNA or HDR (11). We therefore tested whether double selection markers (e.g. neomycin resistance (NeoR) and PuroR for drug selection) could improve the efficiency of the biallelic silencing. The plasmid with the NeoR-BGH poly(A) cassette was constructed in the same way as the plasmid with the PuroR-BGH poly(A) cassette (Fig. 4A). The two plasmids were co-transfected into HEK293 cells with the Cas9/AAVS1-gRNA plasmids. Genotyping analysis showed that, after 2 weeks of double drug selection, 94.2% of clonal cell lines were biallelically transgenic, and only 4 out of 69 (5.8%) clonal cell lines were monoallelically transgenic (Fig. 4C and Table 1). The biallelic integration of the poly(A) signal was mediated by Cas9/gRNA-induced HDR (Fig. 4B). Moreover, qRT-PCR analysis on the clonal lines revealed that the transcription of PPP1R12C was dramatically shut down (Fig. 4D). Together, these findings indicated that the efficiency of biallelic silencing was greatly improved by the double marker selection.

Silencing the lncRNA gene by biallelic integration of a poly(A) signal
MALAT1 is an lncRNA gene that is highly expressed in a variety of cell lines (e.g. A549, HeLa, and HepG2). Because silencing lncRNA genes is considered a challenge, we tested whether the approach used to silence PPP1R12C as described above could also be applied to lncRNA genes. For this purpose, two donor plasmids were constructed containing a 5′-homologous arm (HA), PuroR (or NeoR), the BGH poly(A) signal, and a 3′-HA (Fig. 5A). A plasmid containing a gRNA targeting the sequence immediately behind the MALAT1 promoter was also constructed. These donor plasmids did not include SA-T2A sequences because MALAT1 is a non-protein-coding gene. These plasmids, together with the Cas9-containing plasmid, were co-transfected into HepG2 cells. After 2 weeks of double drug selection, 37 out of 47 (78.7%) clonal cell lines were biallelically transgenic with the insertion of NeoR (or PuroR) and the BGH poly(A) signal sequences (Table 1), as indicated by the genotyping analysis using specific primers (MALAT1-HDR-F/R) for the transgenic fragments (Fig. 5B). The qRT-PCR analysis showed that the transcript abundance of MALAT1 in the cells with biallelic integration of the poly(A) signal was reduced to 0.1% of that in the control cells (Fig. 5C). These findings suggested that biallelic integration of the poly(A) signal via CRISPR/Cas9-induced HDR efficiently silenced the transcription of the long non-protein-coding gene MALAT1.

Silencing NSUN2 by biallelic integration of a poly(A) signal
We also tested whether biallelic integration of a poly(A) signal into the open reading frame of NSUN2, an RNA methyltransferase gene, could disrupt the transcription of this protein-coding gene. Similarly, two donor plasmids were constructed containing a 5′-HA, T2A, PuroR (or NeoR), the BGH poly(A) signal, and a 3′-HA (Fig. 6A). A plasmid containing a gRNA targeting the first exon of NSUN2 was also constructed. These plasmids, together with the Cas9-containing plasmid, were co-transfected into HEK293 cells. After 2 weeks of double drug selection, 17 out of 26 (65.4%) clonal cell lines were biallelically transgenic (Table 1), as indicated by genotyping analysis using specific primers (NSUN2-HDR-F/R) for the transgenic fragments (Fig. 6B). The qRT-PCR analysis showed that the expression of NSUN2 in the cells with biallelic integration of the poly(A) signal was reduced to 1-3% of that in the control cells (Fig. 6C). These results suggested that biallelic integration of a poly(A) signal into the first exon of NSUN2 efficiently terminated the transcription of this protein-coding gene.
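The clone frequencies quoted in the Results can be rechecked from the raw counts. The following is a minimal sketch in Python, not part of the original analysis; the dictionary labels are illustrative, and the count of 65 biallelic PPP1R12C clones under double selection is an inference from the reported 69 clones minus the 4 monoallelic ones.

```python
# Clone counts taken from the text: (biallelic clones, clones genotyped).
# The labels are illustrative; 65 is derived as 69 total minus 4 monoallelic.
counts = {
    "PPP1R12C, single marker": (3, 62),
    "PPP1R12C, double marker": (65, 69),
    "MALAT1, double marker": (37, 47),
    "NSUN2, double marker": (17, 26),
}


def percent(part, total):
    """Return part/total as a percentage rounded to one decimal place."""
    return round(100.0 * part / total, 1)


# Biallelic integration rate for each experiment.
rates = {name: percent(b, n) for name, (b, n) in counts.items()}

for name, rate in rates.items():
    print(f"{name}: {rate}%")
```

Computed this way, the single marker rate is 3/62 and the double marker rates span the three targeted genes, which makes the contrast between the two selection schemes easy to verify.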

Discussion
Gene deletion or knock-out is a useful tool for gene functional studies. However, it is often necessary to remove the whole sequence of the targeted gene or a large DNA fragment surrounding the promoter or the whole gene (8,11). This can cause a misinterpretation of the function of the targeted gene because it is possible that a gene on the complementary DNA strand overlaps with the targeted gene and/or its regulatory sequence or that a partial coding sequence is also removed due to this overlap (e.g. the genomic region that can be transcribed into sense and antisense transcripts). RNA interference is another useful tool commonly used for gene silencing, but it also has several shortcomings, including a high off-targeting rate and insufficient gene silencing (17).
In this study, we developed a novel approach to gene silencing using the biallelic integration of a poly(A) signal mediated by CRISPR/Cas9-induced HDR. The CRISPR/Cas9 system is a tool for gene editing that has been recently developed for gene knock-out and knock-in (5,16). Compared with other gene editing tools, it is simple, easy, and convenient and has the potential to target multiple genes simultaneously. More importantly, the CRISPR/Cas9 system can significantly increase the rate of HDR-mediated biallelic integration (16,18). In this study, we leveraged the power of the CRISPR/Cas9 system and used a BGH poly(A) signal together with two selection markers to successfully integrate constructs into PPP1R12C, NSUN2, and MALAT1 at three different sites (i.e. the first intron of PPP1R12C, the first exon of NSUN2, and directly behind the promoter of MALAT1). Because this approach does not involve either the removal of large DNA fragments or RNA interference, it overcomes some of the shortcomings of the conventional gene editing tools, such as the low efficiency of HDR-mediated biallelic integration and the extensive screening of clonal cells (8,14). In addition, the ability to choose from multiple integration sites helps to avoid side effects of the integrated poly(A) signal on the targeted or overlapping genes. Moreover, as indicated by qRT-PCR analysis, gene silencing by biallelic integration of a poly(A) signal did not completely block the transcription of the targeted gene. This feature makes the novel approach especially useful for functional studies of genes whose deletions are lethal.
The successful integration of the BGH poly(A) signal significantly shut down the transcription of three genes, including two protein-coding genes (PPP1R12C and NSUN2) and a long non-protein-coding gene MALAT1. Although it is not surprising that the protein-coding genes were silenced through the novel approach, the almost complete shutdown of the long non-protein-coding gene implies that this approach can be applied for silencing almost any gene of interest.
In this study, the efficiency of gene silencing by biallelic integration of a poly(A) signal was greatly improved by double marker selection. Although the marker genes were driven by the endogenous promoter of the targeted gene, expression of the marker genes appeared to be sufficient for effective selection. The results from the single marker selection study showed that only 4.8% of clonal cell lines were biallelically transgenic. In contrast, the double marker selection led to 65-94% biallelic integration after 2 weeks of selection. Based on these results, we speculate that double marker selection may be particularly useful for cell lines with low biallelic integration efficiency. Prolonging the antibiotic selection (e.g. from 2 weeks to 3 weeks) might further improve the biallelic transgenic rates. Nevertheless, if the endogenous promoter of the targeted gene is very weak, this approach would not be appropriate because the expression of marker genes may not be sufficient for drug selection. For example, the expression of HOTAIR in HEK293 cells was very low (Fig. 7), and under this promoter the integrated marker gene had a very low expression, leading to the failure of drug selection (data not shown). In this case, an additional promoter (e.g. EF-1α) may be inserted in front of the marker gene to drive its expression (Fig. 8) (12).
In conclusion, we tried three different insertion targets for poly(A) integration, i.e. an integration site within the first intron, within the first exon, or immediately behind the promoter of a target gene. These strategies provide multiple choices for poly(A) integration sites to achieve the best silencing efficiency based on the specificity of a target gene. Using this approach, a cell line with a silenced coding or non-coding gene can be established efficiently and in a short time.

Cell culture and transfection
HEK293 cells were purchased from ATCC (CRL-1573). Cells were maintained in Dulbecco's modified Eagle's medium (DMEM) supplemented with 10% fetal bovine serum (HyClone), 2 mM GlutaMAX (Life Technologies, Inc.), 100 units/ml penicillin, and 100 μg/ml streptomycin at 37°C with 5% CO2. For transfection, cells were seeded into 60-mm dishes (Corning) at a density of 2 × 10^6 cells/dish and cultured in an antibiotic-free medium. When cells were at 80-90% confluency, they were transfected with plasmids using Lipofectamine 2000 (Life Technologies, Inc.) according to the manufacturer's instructions.
For poly(A) testing, cells were transfected with a plasmid mixture (a total of 4 μg of BGH-puroR and BGH-neoR donor plasmids and 4 μg of Cas9/sgRNA plasmid). After 2 days of transfection, cells were treated with 1 μg/ml puromycin (Sigma) for 3 days and then with 400 μg/ml neomycin (Sigma) for 14 days.
Figure legend (luciferase reporter plasmids). The plasmid without any promoter, pGL3-Basic, is used as a negative control. The pGL3-control plasmid, which contains a control promoter sequence, is used as a positive control. The other plasmids, containing the MALAT1 promoter, HDR-MALAT1 promoter, PPP1R12C promoter, or HOTAIR promoter, are constructed from pGL3-control. The HDR sequence in pGL3-HDR-MALAT1 denotes the sequence homologous to MALAT1 that is cloned into the donor plasmid for HDR.

Construction of poly(A)-containing plasmids
The BGH poly(A) signal was amplified from a pcDNA3.1(+) vector by PCR. The 4× poly(A) was constructed from four poly(A) signals using Golden Gate cloning. The β-globin poly(A) was amplified from HEK293 cell genomic DNA following previously published protocols (14). Four different poly(A) signals were inserted into the EcoRI site of pIRES2-EGFP (Clontech), followed by the replacement of the cmv promoter with the EF-1α promoter. The four vectors containing the viral cmv promoters were digested with BglII and ligated with the RFP gene derived from pGBT-RP2-1. All cloning primers are listed in supplemental Table S1.

Construction of vectors
The sgRNA oligonucleotides were synthesized, annealed, and cloned into the Cas9 expression plasmid (Addgene ID 42230) according to a previously published protocol (19). The Neo-BGH poly(A) donor was constructed as described previously (20,21). In brief, NeoR and PuroR were subcloned from the pcDNA3.1(+) vector (Invitrogen) and the pSMPUW-IRES-Puro vector (Cell Biolabs), respectively. NeoR and PuroR were then linked with BGH poly(A) signals through overlapping PCR. The left and right homology sequences were cloned from HEK293 cell genomic DNA.
The pGEM-T easy vector (Promega) was amplified with a pair of oligonucleotide primers containing the BsmBI restriction site and used as a vector backbone. To generate NeoR/PuroR-BGH poly(A) donor vectors, NeoR/PuroR-BGH poly(A) and homology sequences generated by amplification with BbsI site-containing oligonucleotide primers were cloned into the BbsI-digested vector using the Golden Gate cloning method.
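The sgRNA sequences themselves are not given in this section. As an illustration of how candidate target sites near a desired insertion point could be enumerated, the sketch below scans both strands of a sequence for 20-nt protospacers followed by an NGG PAM, the motif recognized by Streptococcus pyogenes Cas9. The function names and the example sequence are hypothetical, and no off-target scoring is attempted; this is not the design procedure used in the study.

```python
import re

# Complement table for unambiguous DNA bases.
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T sequence."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


def find_protospacers(seq, length=20):
    """List (strand, start, protospacer) for every site followed by NGG.

    For the "-" strand, start is the 0-based offset within the
    reverse-complemented sequence, not within the input sequence.
    """
    hits = []
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})[ACGT]GG)" % length)
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for match in pattern.finditer(s):
            hits.append((strand, match.start(), match.group(1)))
    return hits


# Hypothetical example sequence: one 20-nt site followed by the PAM "TGG".
example = "ACGTACGTACGTACGTACGT" + "TGG"
for strand, start, protospacer in find_protospacers(example):
    print(strand, start, protospacer)
```

In practice, candidates found this way would still need to be filtered for proximity to the intended integration site and for predicted off-target activity.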

Reverse transcription and quantitative PCR
Cells were disrupted in TRIzol (Life Technologies, Inc.), and RNA was extracted according to the manufacturer's instructions. Complementary DNAs were synthesized using the PrimeScript™ RT reagent kit plus gDNA Eraser (Takara, Kusatsu, Japan) and random primers. Quantitative PCR was performed using SYBR Premix Ex Taq (Takara, DDR420A). Regular PCR was performed using Taq polymerase (Fermentas, Guangzhou, China) following the manufacturer's recommendations.
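The text reports transcript levels as fold change relative to control cells but does not state how the fold change was computed or which reference gene was used. A common choice is the comparative Ct (2^-ΔΔCt) calculation, sketched below under that assumption; the function name and all Ct values are hypothetical, and the formula assumes roughly 100% amplification efficiency for both primer pairs.

```python
def fold_change(ct_target_sample, ct_ref_sample, ct_target_control, ct_ref_control):
    """Relative expression by the comparative Ct (2^-ddCt) calculation.

    Assumes (this is not stated in the text) that target and reference
    amplicons amplify with about 100% efficiency.
    """
    # Normalize the target gene to the reference gene within each condition.
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    # Compare the edited sample with the control.
    delta_delta = delta_sample - delta_control
    return 2.0 ** (-delta_delta)


# Hypothetical Ct values: the target crosses threshold 10 cycles later in
# edited cells than in control cells, with an unchanged reference gene.
fc = fold_change(30.0, 18.0, 20.0, 18.0)
print(f"fold change = {fc:.6f}")
```

A shift of 10 cycles corresponds to a 2^10-fold (about 1000-fold) reduction, which is the order of magnitude of a residual level near 0.1%.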

Isolation of genomic DNA and PCR-based genotyping analysis
Cells in 96-well plates were lysed with 0.1 ml of lysis buffer containing 10 mM Tris, pH 8, 2 mM EDTA, 0.2% Triton X-100, and 200 μg/ml proteinase K. After incubation for 2 h at 50-56°C, the lysates were heated at 95°C for 5 min to inactivate the proteinase K. Five-μl samples were used for a subsequent integration-oriented PCR with 2× Taq Master Mixture (Vazyme, Nanjing, China) according to the manufacturer's recommendations.

Luciferase reporter assays for promoter activity
The pGL3 reporter vector (Promega, Beijing, China) was used for transfection into HEK293 cells. We first generated several promoter sequences by PCR with the primers listed in supplemental Table S1. The fragments included the MALAT1 large promoter (nucleotides +45 to −1047 bp of the TSS), the HDR-MALAT1 promoter (nucleotides +45 to −729 bp), and the HOTAIR promoter (nucleotides +204 to −904 bp of the TSS in NR_047518.1). These promoter fragments were respectively cloned into the pGL3-promoter vector using two cloning sites (XhoI and HindIII). The constructs were named MALAT1-large, MALAT1-small, HDR-MALAT1, and pGL3-Hotair, respectively. HEK293 cells were seeded at a density of 4 × 10^5 cells per well in 12-well plates 24 h before transfection. Four hundred nanograms of luciferase plasmid containing different promoters were co-transfected with 2 ng of pRL-TK plasmid. The pGL3-control and pGL3-Basic constructs were used as positive and negative controls, respectively. The Lipofectamine 2000 transfection reagent (Invitrogen) was used. Forty-eight hours after transfection, the luciferase activity was measured and normalized.
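The normalization step is not detailed in the text. Because pRL-TK (which encodes Renilla luciferase) was co-transfected, a typical calculation divides the firefly signal by the Renilla signal in each well and then expresses the ratio relative to a reference construct. The sketch below assumes that scheme; the function name and all luminescence readings are hypothetical.

```python
def relative_activity(firefly, renilla, firefly_ref, renilla_ref):
    """Firefly/Renilla ratio of a construct relative to a reference construct.

    The choice of reference (e.g. the promoterless pGL3-Basic well) is an
    assumption for illustration, not a detail given in the text.
    """
    if renilla <= 0 or renilla_ref <= 0 or firefly_ref <= 0:
        raise ValueError("luminescence readings must be positive")
    # Correct each well for transfection efficiency using the Renilla signal.
    ratio = firefly / renilla
    ratio_ref = firefly_ref / renilla_ref
    return ratio / ratio_ref


# Hypothetical raw luminescence readings (arbitrary units).
promoter_activity = relative_activity(5000.0, 100.0, 500.0, 100.0)
print(f"promoter activity relative to reference: {promoter_activity:.1f}")
```

Normalizing to the co-transfected Renilla signal before comparing constructs keeps well-to-well differences in transfection efficiency from being mistaken for differences in promoter strength.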

FACS analysis
HEK293 cells were transfected with the plasmid containing the EGFP and RFP genes. Cells were then analyzed on a FACS Aria II cell sorter (BD Biosciences) after 24 h of incubation.