Leukodystrophy-associated POLR3A mutations down-regulate the RNA polymerase III transcript and important regulatory RNA BC200

RNA polymerase III (Pol III) is an essential enzyme responsible for the synthesis of several small noncoding RNAs, a number of which are involved in mRNA translation. Recessive mutations in POLR3A, encoding the largest subunit of Pol III, cause POLR3-related hypomyelinating leukodystrophy (POLR3–HLD), characterized by deficient central nervous system myelination. Identification of the downstream effectors of pathogenic POLR3A mutations has so far been elusive. Here, we used CRISPR-Cas9 to introduce the POLR3A mutation c.2554A→G (p.M852V) into human cell lines and assessed its impact on Pol III biogenesis, nuclear import, DNA occupancy, transcription, and protein levels. Transcriptomic profiling uncovered a subset of transcripts vulnerable to Pol III hypofunction, including a global reduction in tRNA levels. The brain cytoplasmic BC200 RNA (BCYRN1), involved in translation regulation, was consistently affected in all our cellular models, including patient-derived fibroblasts. Genomic BC200 deletion in an oligodendroglial cell line led to major transcriptomic and proteomic changes, having a larger impact than those of POLR3A mutations. Upon differentiation, mRNA levels of the MBP gene, encoding myelin basic protein, were significantly decreased in POLR3A-mutant cells. Our findings provide the first evidence for impaired Pol III transcription in cellular models of POLR3–HLD and identify several candidate effectors, including BC200 RNA, having a potential role in oligodendrocyte biology and involvement in the disease.

Center) to confirm that the missense mutation and the deletion were on different alleles.
For BC200 KO cell lines, we used a CRISPR-Cas9 approach adapted from ref. (3), with dual sgRNAs targeting sites upstream and downstream of the BC200 gene (Fig. S8a). We tested four different combinations of sgRNAs: g1+g3, g1+g4, g2+g3 and g2+g4 (Table S10). sgRNA sequences were cloned into the plasmid pSpCas9(BB)-2A-Puro (PX459) as described above. Plasmids (2.5 μg) were transfected into MO3.13 cells with Lipofectamine 3000, and puromycin was added to the media at a concentration of 1 μg/mL after 48 hours. Puromycin-resistant clonal cell colonies were screened by PCR for the presence of a band of approximately 200 bp corresponding to the targeted region without the BC200 gene (Fig. S8b).
To confirm complete deletion of the BC200 gene, PCR products were sequenced on an Illumina MiSeq (Fig. S8c). The resulting reads were trimmed with Trimmomatic (4) and aligned to the reference genome hg19 using STAR v2.3.0e (5). Aligned reads overlapping with the BC200 gene were identified and their CIGAR strings were parsed using Pybedtools (6) to determine whether the gene was present (one or more mapped bases in the BC200 gene) or deleted (no mapped bases in the BC200 gene). We identified two clones with deletion of BC200 on all alleles (Fig. S8c) and absent BC200 RNA expression (Fig. S8d). All primers used for sgRNA cloning, PCR and sequencing are indicated in Table S10. For each sgRNA targeting POLR3A or BC200, we screened the top five off-target sites predicted by http://crispr.mit.edu/ by PCR and Sanger sequencing.
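The present/deleted call described above can be illustrated with a minimal sketch. It is written in plain Python rather than with Pybedtools, it is not the authors' script, and the coordinates in the usage note are hypothetical. It assumes 0-based, half-open coordinates and counts only bases in CIGAR operations that align a read base to the reference (M, = and X), so that a read spliced or deleted across the gene (N or D) contributes no mapped base.

```python
import re

# CIGAR operations that advance the reference coordinate (SAM specification).
REF_CONSUMING = set("MDN=X")
# Operations in which a read base is aligned to a reference base.
ALIGNED = set("M=X")

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def mapped_bases_in_region(pos, cigar, region_start, region_end):
    """Count read bases aligned inside [region_start, region_end).

    pos is the 0-based leftmost reference coordinate of the alignment.
    """
    ref = pos
    count = 0
    for length, op in CIGAR_RE.findall(cigar):
        length = int(length)
        if op in ALIGNED:
            overlap = min(ref + length, region_end) - max(ref, region_start)
            count += max(0, overlap)
        if op in REF_CONSUMING:
            ref += length
    return count


def classify_read(pos, cigar, region_start, region_end):
    """Return 'present' if at least one base maps inside the region."""
    n = mapped_bases_in_region(pos, cigar, region_start, region_end)
    return "present" if n >= 1 else "deleted"
```

For example, with a hypothetical gene interval of [160, 340), a read starting at position 100 with CIGAR `50M200N50M` skips the whole interval and is called "deleted", whereas a read with CIGAR `100M` at the same position is called "present".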

Small RNA and tRNA precursor sequencing
Since Pol III transcripts range in size from 70 to 330 nucleotides, we used two complementary RNA-seq approaches: rRNA-depleted RNA-seq for transcripts ≥ 200 nt, and a modified small RNA-seq approach aimed at measuring the levels of Pol III transcripts < 200 nt, with a special focus on tRNA precursors (pre-tRNAs). Because of their short half-lives, pre-tRNAs have been shown to provide a more reliable estimate of Pol III transcription than mature tRNAs (7)(8)(9). They are also easier to quantify by RNA-seq because they have not yet acquired the post-transcriptional modifications or complex secondary structure that can interfere with reverse transcription. However, pre-tRNAs are only present at low coverage in standard RNA-seq data because of their size (~100 nt). Conversely, commercial small RNA-seq kits are often biased towards Dicer- or Drosha-processed small RNAs and do not offer optimal coverage of Pol III transcripts. To overcome these limitations, we enriched total RNA extractions for small RNAs (< 200 nt), directly followed by random priming, cDNA synthesis and next-generation sequencing (Fig. S5a). Small RNA enrichments were performed using the miRNeasy kit (Qiagen) with the modifications outlined in Appendix A of the manufacturer's protocol to allow for separation of RNAs smaller than 200 nucleotides.
To monitor the level of small RNA enrichment in each sample, we synthesized three spike-in RNAs of different sizes (70, 94 and 250 nt, selected from previous publications), and added them at the beginning of the procedure. The two small spike-in RNAs were chosen for their similar size to that of Pol III transcripts: 70 nt for SS-70 in Locati et al. (10) and 94 nt for the synthetic spike-in from Zhong et al. (11). The larger 250 nt spike-in RNA corresponds to ERCC-00051 from the ERCC spike-in set (12), but without the polyA tail. PCR products to be used for in vitro transcription reactions were generated using a G-block (IDT) template and primer pairs corresponding to each spike-in RNA (Table S10). In vitro transcription was performed using PCR products as individual templates with a MAXIscript T7 in vitro Transcription Kit (Ambion). Completed reactions were treated with TurboDNase (Ambion) and subsequently loaded onto mini Quick Spin RNA Columns (Roche) to remove unincorporated nucleotides. RNA was phenol-chloroform extracted and analyzed by 7% denaturing PAGE. Full-length RNA molecules were then eluted from the gel and quantified by Nanodrop. Synthetic spike-in RNAs were mixed at an equimolar concentration of 4 × 10⁻⁹ mol/L and 0.5 μL of this mix was added to Qiazol lysis buffer after cells were homogenized.
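As a quick check on the quantities involved, the amount of spike-in RNA added per sample follows directly from the stated concentration and volume. The sketch below assumes that the 4 × 10⁻⁹ mol/L figure applies to each spike-in individually (the text does not say whether it is per species or total); the function name is illustrative only.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def spike_in_amount(conc_mol_per_l, volume_ul):
    """Return (moles, molecules) contained in volume_ul of a solution.

    Assumption: conc_mol_per_l is the concentration of a single spike-in
    species in the mix.
    """
    moles = conc_mol_per_l * volume_ul * 1e-6  # convert microlitres to litres
    return moles, moles * AVOGADRO


# 0.5 uL of a 4e-9 mol/L mix: about 2 fmol, i.e. roughly 1.2e9 molecules.
moles, molecules = spike_in_amount(4e-9, 0.5)
```

Under that assumption, each sample receives about 2 fmol of each spike-in RNA.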
Small RNA enrichment was confirmed using an Agilent Bioanalyzer (Fig. S5a). Libraries from three HEK293 mutant clones (M1-M3) and three control clones (C1-C3) were prepared with the KAPA stranded RNA-seq library preparation kit and sequenced on an Illumina HiSeq 2500 with 100 bp single-end reads. Since the three mutant clones have slightly different genotypes (Fig. 1a), we also sequenced small RNA-seq libraries from biological triplicates of the mutant clone with the lowest POLR3A expression (M2, see Fig. 1b to 1d) and a control clone (C3), in order to assess the impact of POLR3A hypofunction in the worst-case scenario.
Quality control and trimming were performed as previously described (13). Trimmed reads were aligned to the reference genome hg19 using STAR v2.3.0e (5), including reads mapping to up to 100 locations. […] with only a slight drop in coverage near the 3' end. Expression levels were estimated with featureCounts (15) using exonic reads in three successive runs with different parameters to treat multimapping reads: i) uniquely mapped reads only; ii) all multimapping reads, counting primary alignments only; and iii) all multimapping reads, counting primary and secondary alignments. All subsequent analyses were performed with the three types of counts. Unless otherwise specified, results are reported for option ii), but general agreement of results was verified with the three options. Expression levels of tRNA precursors were estimated by counting reads mapping at least partially to tRNA introns, leader (20 bp upstream) or trailer (20 bp downstream) sequences, using featureCounts (15) and custom scripts. Pre-tRNA reads represented on average 74.9% of the total number of uniquely mapped tRNA reads and 37.8% of all mapped tRNA reads, while the remainder were exonic reads that could not distinguish between mature and pre-tRNAs (Fig. S6c). We used both types of reads for further analyses.
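The pre-tRNA read definition above (at least partial overlap with an intron, the 20 bp leader or the 20 bp trailer) can be sketched as follows. This is an illustration under stated assumptions, not the custom scripts used in the study: coordinates are taken as 0-based and half-open, reads are assumed to already overlap the tRNA locus, and the function names and example coordinates are hypothetical.

```python
FLANK = 20  # leader and trailer length in bp, as stated in the text


def _overlaps(a_start, a_end, b_start, b_end):
    """True if half-open intervals [a_start, a_end) and [b_start, b_end) overlap."""
    return a_start < b_end and b_start < a_end


def pre_trna_features(read_start, read_end, gene_start, gene_end,
                      strand="+", introns=()):
    """Return the set of precursor-specific features a read overlaps."""
    left = (max(0, gene_start - FLANK), gene_start)
    right = (gene_end, gene_end + FLANK)
    # On the minus strand the 5' leader lies to the right of the gene.
    leader, trailer = (left, right) if strand == "+" else (right, left)
    hits = set()
    if _overlaps(read_start, read_end, *leader):
        hits.add("leader")
    if _overlaps(read_start, read_end, *trailer):
        hits.add("trailer")
    if any(_overlaps(read_start, read_end, s, e) for s, e in introns):
        hits.add("intron")
    return hits


def classify_trna_read(read_start, read_end, gene_start, gene_end,
                       strand="+", introns=()):
    """'pre-tRNA' if any precursor-specific feature is hit, else 'exonic'."""
    hits = pre_trna_features(read_start, read_end, gene_start, gene_end,
                             strand, introns)
    return "pre-tRNA" if hits else "exonic"
```

Reads labelled "exonic" here correspond to the reads described above as unable to distinguish mature tRNAs from precursors.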
To assess small RNA enrichment, we calculated the ratio of counts from the small spike-ins over counts from the large spike-in. As a second measure of enrichment, we quantified the library size factors with DESeq2 for small (< 200 nt) and large RNAs (≥ 200 nt), respectively. The ratio of size factors was highly correlated to the ratio of spike-in counts (Fig. S6a), indicating that both measures can be used to assess small RNA enrichment level. Thus, to account for small RNA enrichment variability during subsequent analyses, expression levels of small and large RNAs were normalized with their respective size factors, using DESeq2 (16). For transcripts with multiple isoforms, the maximum isoform size was used. For small RNAs, tRNAs were excluded from the size factor calculation since they represent ~38% of expressed small transcripts and could thus skew the size factors if they are differentially expressed in mutants. After normalization, small and large transcripts were combined for the remainder of the differential expression analysis workflow with DESeq2. Genes were considered significantly differentially expressed if adjusted p-value (FDR) < 0.05 and mean expression > 10.
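The class-specific normalization described above can be sketched with a plain-Python version of the median-of-ratios size factor estimator that DESeq2 uses. This is a simplified reimplementation for illustration, not a call to DESeq2 itself; the function names, the `exclude` argument (standing in for the tRNA exclusion) and the input layout are assumptions.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts maps each gene to a list of raw counts, one per sample.
    Genes with a zero count in any sample are skipped (log undefined).
    """
    n = len(next(iter(counts.values())))
    log_ratios = [[] for _ in range(n)]
    for row in counts.values():
        if any(c <= 0 for c in row):
            continue
        log_geo_mean = sum(math.log(c) for c in row) / n
        for j, c in enumerate(row):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    return [math.exp(median(r)) for r in log_ratios]


def normalize_by_class(counts, lengths, cutoff=200, exclude=()):
    """Normalize small (< cutoff nt) and large (>= cutoff nt) RNAs separately.

    Genes in `exclude` (e.g. tRNAs) do not contribute to the small-RNA
    size factors but are still normalized with them.
    """
    small = {g: c for g, c in counts.items() if lengths[g] < cutoff}
    large = {g: c for g, c in counts.items() if lengths[g] >= cutoff}
    sf_small = size_factors({g: c for g, c in small.items() if g not in exclude})
    sf_large = size_factors(large)
    norm = {}
    for group, sf in ((small, sf_small), (large, sf_large)):
        for gene, row in group.items():
            norm[gene] = [c / s for c, s in zip(row, sf)]
    return norm, sf_small, sf_large
```

The ratio of `sf_small` to `sf_large` for a given sample plays the role of the size factor ratio that was compared with the spike-in count ratio.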

Microarray
Total RNA was extracted from control and patient fibroblasts in triplicate using miRNeasy (Qiagen). LC Sciences (Houston, TX, USA) generated a custom microarray including three different probes (~22 nt) for each known Pol III transcript and pseudogene. Briefly, custom probes were synthesized on Paraflo® microfluidic chips. Total RNA (5 μg) was reverse transcribed and hybridized to the microarray in biological triplicates (LC Sciences). Following cross-array normalization of samples, probes with a mean intensity signal < 100 were excluded from analysis. Of the three probes targeting BC200 RNA, two had very low signal intensity and were excluded. The remaining probe, 5'-CGTAACTTCCCTCAAAGCAACAACCCC-3', targeted the unique 3' region of the transcript and showed a statistically significant difference between patients and controls.
[…] Trypsin/P was defined as enzyme, allowing for two missed cleavages. FDR threshold was set to 0.01 for peptide and protein identifications. Minimal ratio count was set to 2 for protein quantification and the functions "match between runs", "requantify" and "match from and to" were enabled. Normalized MaxQuant ratios were used for subsequent analyses with Perseus v.1.5.6.0. Known protein contaminants were removed from the analysis. Protein groups were kept for analysis if they were detected in at least four out of six biological replicates. Conditions (POLR3A M852V or BC200 KO vs. MO3.13-WT) were compared in pairs using a one-sample t-test on log2 ratios. Multiple testing correction was performed using the Benjamini-Hochberg method. The threshold for statistical significance was set at FDR < 0.05.
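The replicate filter, one-sample t statistic and Benjamini-Hochberg correction described above can be sketched in plain Python. The analysis itself was run in Perseus; the code below is only an illustration, the function names are hypothetical, and missing values are assumed to be encoded as `None`. Converting the t statistic to a p-value requires a t distribution, which is omitted here.

```python
import math
from statistics import mean, stdev


def keep_protein_group(log2_ratios, min_detected=4):
    """Keep a protein group detected in at least min_detected replicates."""
    return sum(r is not None for r in log2_ratios) >= min_detected


def one_sample_t(log2_ratios):
    """t statistic for H0: mean log2 ratio = 0, ignoring missing values."""
    values = [r for r in log2_ratios if r is not None]
    return mean(values) / (stdev(values) / math.sqrt(len(values)))


def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A protein group would then be called significant when its adjusted p-value is below 0.05, matching the threshold stated above.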
Proteins were considered to be significantly differentially abundant when absolute log2 fold change (BC200 KO/WT or POLR3A M852V/WT) > 0.5. Subsequent analyses were performed using custom scripts in R. We used the software GOrilla (18) […] We considered proteins to undergo protein-level changes only if SILAC FDR < 0.05, D was in the upper quantile, the absolute log2FCS was in the upper quantile and the absolute log2FCR was below the median (Fig. S11). We considered proteins to undergo substantially greater protein-level than mRNA-level changes if they did not belong to the above category, SILAC FDR < 0.05, D was in the upper quantile and the absolute log2FCS was in the upper quantile (Fig. S11). Finally, we considered proteins to be regulated at the mRNA level when SILAC FDR < 0.05, RNA-seq adjusted p-value < 0.05, the absolute log2FCR was in the upper quartile and D was smaller than the tercile.

[Figure panel: Pol III score (log2), Control vs. Mutant]

a) Overview of the workflow for the custom small RNA-seq protocol used in this study compared to traditional small RNA-seq and rRNA-depleted RNA-seq approaches. b) Proportions of read counts mapping to small and large transcripts using custom small RNA-seq (protocol 1) compared to rRNA-depleted RNA-seq (protocol 3). c) Proportions of read counts mapping to tRNAs, other small Pol III transcripts or large Pol III transcripts in the two protocols. b) and c) show an enrichment of small transcripts in protocol 1 compared to protocol 3.

Proteins were considered differentially abundant when FDR < 0.05 and absolute log2 fold change > 0.5 or 1. mRNAs were considered differentially expressed when the mean normalized expression across samples > 100, adjusted p-value < 0.05 and log2 fold change > 0.5 or 1. b) Distribution of the log2 fold change for POLR3A M852V/WT and BC200 KO/WT using all expressed mRNAs (mean expression > 100) in RNA-seq data. More mRNAs have high fold changes in BC200 KO (p-value < 2.2 × 10⁻¹⁶, two-sample Kolmogorov-Smirnov test). c) Overlap between mRNAs detected by RNA-seq (normalized expression > 100) and corresponding proteins detected by SILAC (in 4 out of 6 replicates). The 1,200 overlapping mRNA/protein pairs were used for subsequent analyses. d) Distribution of fold change in POLR3A M852V for mRNAs that showed statistically significant differences in RNA-seq in both conditions (adjusted p-value < 0.05) and had a significant fold change in BC200 KO (log2 FC > 0.5). Only mRNAs also detected in SILAC were used for this analysis. ***p < 0.001, Wilcoxon rank-sum test.