H3K36me3-mediated mismatch repair preferentially protects actively transcribed genes from mutation

  • Yaping Huang
    Affiliations
    Department of Basic Medical Sciences, Tsinghua University School of Medicine, 100084 Beijing, China
  • Liya Gu
    Affiliations
    Department of Radiation Oncology, University of Texas Southwestern Medical Center, Dallas, Texas 75390
  • Guo-Min Li
    Correspondence
    Holds the Reece A. Overcash, Jr., Chair for Center for Research on Colon Cancer. To whom correspondence should be addressed. Tel.:214-648-9955; Fax:214-648-5995;
    Affiliations
    Department of Basic Medical Sciences, Tsinghua University School of Medicine, 100084 Beijing, China

    Department of Radiation Oncology, University of Texas Southwestern Medical Center, Dallas, Texas 75390
  • Author Footnotes
    3 Please note that the JBC is not responsible for the long-term archiving and maintenance of this site or any other third party hosted site.
Open Access. Published: April 02, 2018. DOI: https://doi.org/10.1074/jbc.RA118.002839
      Histone H3 trimethylation at lysine 36 (H3K36me3) is an important histone mark involved in both transcription elongation and DNA mismatch repair (MMR). It is known that H3K36me3 recruits the mismatch-recognition protein MutSα to replicating chromatin via its physical interaction with MutSα's PWWP domain, but the exact role of H3K36me3 in transcription is undefined. Using ChIP combined with whole-genome DNA sequencing analysis, we demonstrate here that H3K36me3, together with MutSα, is involved in protecting against mutation, preferentially in actively transcribed genomic regions. We found that H3K36me3 and MutSα are much more co-enriched in exons and actively transcribed regions than in introns and nontranscribed regions. The H3K36me3–MutSα co-enrichment correlated with a much lower mutation frequency in exons and actively transcribed regions than in introns and nontranscribed regions. Correspondingly, depleting H3K36me3 or disrupting the H3K36me3–MutSα interaction elevated the spontaneous mutation frequency in actively transcribed genes, but it had little influence on the mutation frequency in nontranscribed or transcriptionally inactive regions. Similarly, mutations induced by H2O2, which mainly causes base oxidations, preferentially occurred in actively transcribed genes in MMR-deficient cells. The data presented here suggest that H3K36me3-mediated MMR preferentially safeguards actively transcribed genes not only during replication by efficiently correcting mispairs in early replicating chromatin but also during transcription by directly or indirectly removing DNA lesions associated with a persistently open chromatin structure.

      Introduction

      The abbreviations used are: MMR, mismatch repair; SNV, single nucleotide variant; Seq, sequencing; TSG, tumor suppressor gene; WGS, whole-genome sequencing; NER, nucleotide excision repair; BER, base excision repair; indels, insertions/deletions; RPKM, reads per kilobase pair per million; PCNA, proliferating cell nuclear antigen; HMM, hidden Markov model; BWA, Burrows-Wheeler Aligner.

      DNA mismatch repair (MMR) is a highly-conserved DNA repair pathway whose primary role is to correct mispairs arising during DNA replication (
      • Modrich P.
      • Lahue R.
      Mismatch repair in replication fidelity, genetic recombination, and cancer biology.
      • Kolodner R.
      Biochemistry and genetics of eukaryotic mismatch repair.
      ,
      • Kunkel T.A.
      • Erie D.A.
      Eukaryotic mismatch repair in relation to DNA replication.
      • Li G.M.
      Mechanisms and functions of DNA mismatch repair.
      ). Genetic or epigenetic alterations in genes encoding MMR components result in a mutator phenotype and increase susceptibility to tumorigenesis (
      • Modrich P.
      • Lahue R.
      Mismatch repair in replication fidelity, genetic recombination, and cancer biology.
      ,
      • Kolodner R.
      Biochemistry and genetics of eukaryotic mismatch repair.
      ). The minimal essential components required for human MMR have been identified (
      • Constantin N.
      • Dzantiev L.
      • Kadyrov F.A.
      • Modrich P.
      Human mismatch repair: reconstitution of a nick-directed bidirectional reaction.
      ,
      • Zhang Y.
      • Yuan F.
      • Presnell S.R.
      • Tian K.
      • Gao Y.
      • Tomkinson A.E.
      • Gu L.
      • Li G.M.
      Reconstitution of 5′-directed human mismatch repair in a purified system.
      ). The MSH2–MSH6 heterodimer, called MutSα, is one of the most important components in human MMR, and its primary role is to recognize mismatches. MutSα binding to a mismatch triggers the downstream steps of MMR, including recruitment of other MMR factors, mismatch excision and repair DNA synthesis (
      • Kunkel T.A.
      • Erie D.A.
      Eukaryotic mismatch repair in relation to DNA replication.
      ,
      • Li G.M.
      Mechanisms and functions of DNA mismatch repair.
      ).
      Recent studies demonstrate that chromatin structure and epigenetic signals regulate many biological pathways and processes, including DNA repair and carcinogenesis. Our earlier studies show that trimethylated histone H3 lysine 36 (H3K36me3) recruits MutSα to chromatin by interacting specifically with the PWWP domain of MutSα (
      • Li F.
      • Mao G.
      • Tong D.
      • Huang J.
      • Gu L.
      • Yang W.
      • Li G.M.
      The histone mark H3K36me3 regulates human DNA mismatch repair through its interaction with MutSα.
      ). Deleting the MSH6 PWWP domain or inactivating SETD2, the major enzyme that trimethylates H3K36 (
      • Edmunds J.W.
      • Mahadevan L.C.
      • Clayton A.L.
      Dynamic histone H3 methylation during gene induction: HYPB/Setd2 mediates all H3K36 trimethylation.
      ), strongly suppresses MutSα recruitment to chromatin and causes a mutator phenotype (
      • Li F.
      • Mao G.
      • Tong D.
      • Huang J.
      • Gu L.
      • Yang W.
      • Li G.M.
      The histone mark H3K36me3 regulates human DNA mismatch repair through its interaction with MutSα.
      ). Similarly, H3K36M and H3K36I mutations also increase susceptibility to cancer (
      • Fang D.
      • Gan H.
      • Lee J.H.
      • Han J.
      • Wang Z.
      • Riester S.M.
      • Jin L.
      • Chen J.
      • Zhou H.
      • Wang J.
      • Zhang H.
      • Yang N.
      • Bradley E.W.
      • Ho T.H.
      • Rubin B.P.
      • et al.
      The histone H3.3K36M mutation reprograms the epigenome of chondroblastomas.
      ,
      • Lu C.
      • Jain S.U.
      • Hoelper D.
      • Bechet D.
      • Molden R.C.
      • Ran L.
      • Murphy D.
      • Venneti S.
      • Hameed M.
      • Pawel B.R.
      • Wunder J.S.
      • Dickson B.C.
      • Lundgren S.M.
      • Jani K.S.
      • De Jay N.
      • et al.
      Histone H3K36 mutations promote sarcomagenesis through altered histone methylation landscape.
      ).
      In addition to its role in MMR during DNA replication (
      • Li F.
      • Mao G.
      • Tong D.
      • Huang J.
      • Gu L.
      • Yang W.
      • Li G.M.
      The histone mark H3K36me3 regulates human DNA mismatch repair through its interaction with MutSα.
      ), H3K36me3 is highly associated with transcription processes, including transcription elongation and splicing (
      • Wagner E.J.
      • Carpenter P.B.
      Understanding the language of Lys36 methylation at histone H3.
      ). The histone mark is enriched in actively transcribed open chromatin (
      • Wagner E.J.
      • Carpenter P.B.
      Understanding the language of Lys36 methylation at histone H3.
      ) and is more abundant in exonic than in intronic regions. However, the exact role that H3K36me3 plays in transcription is unknown. Conversely, MMR has been reported to participate in transcription-coupled nucleotide excision repair (
      • Mellon I.
      • Rajpal D.K.
      • Koi M.
      • Boland C.R.
      • Champe G.N.
      Transcription-coupled repair deficiency and mutations in human mismatch repair genes.
      ), but the idea was highly controversial (
      • Rochette P.J.
      • Bastien N.
      • McKay B.C.
      • Therrien J.P.
      • Drobetsky E.A.
      • Drouin R.
      Human cells bearing homozygous mutations in the DNA mismatch repair genes hMLH1 or hMSH2 are fully proficient in transcription-coupled nucleotide excision repair.
      ,
      • Sonneveld E.
      • Vrieling H.
      • Mullenders L.H.
      • van Hoffen A.
      Mouse mismatch repair gene Msh2 is not essential for transcription-coupled repair of UV-induced cyclobutane pyrimidine dimers.
      ). Even if MMR is involved in transcription, it is unclear how it works.
      Previous studies have shown that, in addition to recognizing mispairs, MutSα specifically interacts with DNA lesions that are usually processed by base excision repair and nucleotide excision repair pathways, which include 8-oxo-7,8-dihydro-2′-deoxyguanosine (
      • Ni T.T.
      • Marsischky G.T.
      • Kolodner R.D.
      MSH2 and MSH6 are required for removal of adenine misincorporated opposite 8-oxo-guanine in S. cerevisiae.
      ), and many bulky DNA adducts (
      • Li G.M.
      The role of mismatch repair in DNA damage-induced apoptosis.
      ). These studies have established MutSα as a general DNA damage sensor, although the biology of this function is not well understood. It is likely that MutSα's broad lesion recognition function constitutes its role in transcription.
      Based on our earlier demonstration that H3K36me3 interacts specifically with the MSH6 subunit of MutSα (
      • Li F.
      • Mao G.
      • Tong D.
      • Huang J.
      • Gu L.
      • Yang W.
      • Li G.M.
      The histone mark H3K36me3 regulates human DNA mismatch repair through its interaction with MutSα.
      ), it is possible that highly enriched H3K36me3 recruits MutSα to actively transcribed regions to mitigate the risk of DNA damage during transcription and/or actively promote lesion repair in the transcribed regions. To test this hypothesis, we used ChIP combined with sequencing (ChIP-Seq) and whole-genome sequencing (WGS) to characterize the distribution of H3K36me3 and MSH6 on chromatin and to create genome-wide profiles of relative mutation frequency in HeLa cells with or without disruption of the H3K36me3–MutSα interaction. The results presented here demonstrate the co-enrichment of H3K36me3 and MutSα in actively transcribed regions, which is consistent with the idea that H3K36me3 and MutSα play roles in protecting the integrity of DNA in open chromatin during transcription, as disrupting the H3K36me3–MutSα interaction preferentially promotes mutations in actively transcribed regions. This notion is further supported by the fact that H2O2-induced DNA lesions, which are substrates of base excision repair, preferentially cause mutations in actively transcribed genes in MMR-deficient cells. These results suggest that MMR is involved in removing oxidative DNA damage during transcription, which is distinct from its traditional mismatch correction function during DNA replication. This study also provides evidence for a novel function of H3K36me3 in removing transcription-associated lesions via MMR.

      Results

      Genome-wide distribution and co-enrichment of H3K36me3 and MSH6

      We performed whole-genome ChIP-Seq analyses in HeLa cells (referred to as P0, see “Experimental procedures”) arrested at G1/S or S phase using antibodies against H3K36me3 or MSH6 that had been used for ChIP assays previously (see “Experimental procedures”). We used the RSEG pipeline (
      • Song Q.
      • Smith A.D.
      Identifying dispersed epigenomic domains from ChIP-Seq data.
      ) to map the genome-wide distributions of H3K36me3 and MSH6 with high confidence (see “Experimental procedures”). As expected, H3K36me3 was enriched in gene bodies and more enriched in exonic regions than in intronic regions (Fig. 1A). Binding domains were overlaid on transcribed genes in the reference genome Hg19 (http://genome.ucsc.edu/).
      H3K36me3-enriched domains aligned with 15,884 genes, and MSH6-enriched domains aligned with 5100 genes (Fig. 1B). The limited MSH6 enrichment is likely due to competition between MSH6 and other H3K36me3 reader proteins (
      • Vermeulen M.
      • Eberl H.C.
      • Matarese F.
      • Marks H.
      • Denissov S.
      • Butter F.
      • Lee K.K.
      • Olsen J.V.
      • Hyman A.A.
      • Stunnenberg H.G.
      • Mann M.
      Quantitative interaction proteomics and genome-wide profiling of epigenetic histone marks and their readers.
      ) and/or inefficient pulldown of chromatin fragments by the MSH6 antibody used because of a relatively weak interaction between MSH6 and H3K36me3. Nevertheless, ∼97% (4924/5100) of MSH6-enriched genes are enriched for H3K36me3 (Fig. 1B), which indicates MSH6 recruitment to chromatin via H3K36me3.
      To further define the relationship between H3K36me3 and MSH6 and their distribution and abundance in the genome, we classified the whole genome into groups based on H3K36me3 ChIP intensity and then plotted these data against the corresponding MSH6 ChIP intensity. H3K36me3 and MSH6 ChIP intensities correlate with each other in a linear fashion throughout the entire genome (Fig. 1C). Representative loci demonstrating this correlation are shown in Fig. 1D. To determine the specific distributions of H3K36me3 and MSH6 and the differences between these distributions in replicating and nonreplicating cells, we performed ChIP-Seq analyses in HeLa cells arrested in S phase or at the G1/S boundary. We found that 60–80% of H3K36me3-enriched regions were co-bound in both phases (Fig. 1E). Interestingly, the ChIP intensities for H3K36me3 and MSH6 also correlated with each other in both phases (Fig. 1F). Finally, we analyzed the relationship between transcription activity and H3K36me3/MSH6 distribution and demonstrated that regions bound by both H3K36me3 and MSH6 show higher transcription activities than those bound by H3K36me3 alone (Fig. 1G). Taken together, these results suggest that MSH6 is preferentially recruited to actively transcribed regions in both replicating and nonreplicating cells. Given MMR's essential role in correcting replication errors, it is understandable that MSH6 is recruited to chromatin in replicating cells. Why MSH6 is also associated with chromatin in nonreplicating cells, particularly in actively transcribed regions, remains to be investigated.
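      The grouping-and-correlation step behind Fig. 1C can be illustrated with a short sketch. It is not the authors' pipeline: the function names `group_means` and `pearson_r`, the use of equal-sized groups, and the toy values are all assumptions made here for illustration.

```python
from math import sqrt
from statistics import mean

def group_means(h3k36me3, msh6, n_groups):
    """Sort bins by H3K36me3 intensity, split them into n_groups groups of
    (nearly) equal size, and return (mean H3K36me3, mean MSH6) per group.
    Equal-sized grouping is an assumption; the text does not specify it."""
    if len(h3k36me3) != len(msh6) or len(h3k36me3) < n_groups:
        raise ValueError("need paired inputs with at least n_groups bins")
    pairs = sorted(zip(h3k36me3, msh6))  # order bins by H3K36me3 signal
    size = len(pairs) // n_groups
    result = []
    for g in range(n_groups):
        # the last group absorbs any remainder bins
        chunk = pairs[g * size:] if g == n_groups - 1 else pairs[g * size:(g + 1) * size]
        result.append((mean(p[0] for p in chunk), mean(p[1] for p in chunk)))
    return result

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Toy data (hypothetical): MSH6 signal rises linearly with H3K36me3 signal.
h3 = [1, 2, 3, 4, 5, 6]
m6 = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
groups = group_means(h3, m6, 3)
r = pearson_r([g[0] for g in groups], [g[1] for g in groups])
```

      A coefficient close to 1 on the grouped means would correspond to the linear relationship reported in the text; real ChIP-Seq data would of course be noisier than this toy input.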

      H3K36me3–MSH6 co-enrichment is inversely correlated to local mutation frequency

      We used whole-genome sequencing data from HeLa P0 cells to construct a genome-wide map of mutations, i.e. single nucleotide variants (SNVs) and insertions/deletions (indels). These mutations were aligned with the genome-wide distributions of H3K36me3 and MSH6. To determine the impact of H3K36me3/MSH6 enrichment and distribution on local mutations, we classified the whole genome into functional segments, i.e. promoter, exon, intron, and intergenic regions, and mapped H3K36me3/MSH6 enrichments and mutation frequencies to the individual segments. Consistent with the data shown in Fig. 1A, H3K36me3 was more enriched in exonic regions than in intergenic regions, as was MSH6 (Fig. 2A). However, relative mutation frequency was inversely related to the enrichment of both H3K36me3 and MSH6, as exons were less susceptible to mutation, and introns and intergenic regions were more susceptible to mutation (Fig. 2, A and B). In addition, exons with higher H3K36me3 intensities exhibited lower mutation frequencies than those with lower H3K36me3 intensities (Fig. S1A). These results reveal that mutation frequency is inversely correlated to H3K36me3/MSH6 enrichment.
      Figure 2. Analyses of H3K36me3 and MSH6 enrichment and relative mutation frequency associated with various genomic features in HeLa P0 cells. A, observed/expected enrichment for H3K36me3 or MSH6 in ChIP and relative mutation enrichment scores are shown in promoter, exon, intron, and intergenic regions, as described in the legend to A. B, average mutation frequencies in exons (blue) and introns (purple) are shown. C, ChIP reads density for H3K36me3 (blue) and MSH6 (orange) are plotted on 5′ to 3′ gene bodies in the left y axis. Mutation frequency is similarly plotted in the right y axis. D, H3K36me3 ChIP intensity (blue) and mutation frequency (orange) are plotted versus inverse replication-Seq intensity. Early replicating regions are represented on the left side of the x axis.
      Figure 1. Genome-wide ChIP analyses of H3K36me3 and MSH6 on HeLa cell chromatin at G1/S or S phase. A, bar graph showing relative H3K36me3 enrichment scores (observed/expected ChIP-Seq reads densities) in different genomic regions. Bars are color-coded for each region, as indicated in A. Regions shown include 2 kb upstream (Up2K) and 2 kb downstream (Down2K) of gene bodies, 5′-UTR, 3′-UTR, coding sequence, intron, and intergenic regions. Expected ChIP-Seq reads density is calculated as follows: (total length of DNA region/length of whole genome)·(total reads). Observed ChIP-Seq reads density is calculated as normalized read count in the specific DNA region. B, Venn diagram of H3K36me3- and MSH6-enriched genes in HeLa cells arrested at the G1/S boundary. C, correlation between ChIP intensity of H3K36me3 and ChIP intensity of MSH6 in HeLa cells at G1/S. D, representative profiles showing co-enrichment of H3K36me3 and MSH6 on HeLa cell chromatin. E, Venn diagram of H3K36me3-enriched domains represented in G1/S and S phase cells. F, correlation of H3K36me3 and MSH6 ChIP intensities between G1/S and S phase HeLa cells. G, transcription levels of the genes (4924) bound by both H3K36me3 and MSH6 (overlap), and those (10,960) bound by H3K36me3 only, as shown in B.
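      The observed/expected enrichment score defined in the legend to Fig. 1A can be written out as a small sketch. The expected-count formula follows the legend; the function names and the toy numbers are assumptions, and the observed count is simply taken as given because the legend does not spell out its normalization.

```python
def expected_reads(region_length_bp, genome_length_bp, total_reads):
    """Expected read count for a region class, as stated in the legend:
    (total length of region / length of whole genome) * total reads."""
    return region_length_bp / genome_length_bp * total_reads

def enrichment(observed_reads, region_length_bp, genome_length_bp, total_reads):
    """Observed/expected ratio; values above 1 indicate enrichment."""
    return observed_reads / expected_reads(
        region_length_bp, genome_length_bp, total_reads
    )

# Hypothetical example: a region class covering 1% of a 3-Gb genome,
# 10 million mapped reads in total, 250,000 of which fall in that class.
exp_reads = expected_reads(30_000_000, 3_000_000_000, 10_000_000)
score = enrichment(250_000, 30_000_000, 3_000_000_000, 10_000_000)
```

      With these toy numbers the region class would be expected to collect about 100,000 reads by chance, so 250,000 observed reads correspond to an enrichment score of about 2.5.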
      We also found that the H3K36me3–MSH6 interaction is critical for mutation avoidance, as mutations were found in only 18.8% (925/4924) of the H3K36me3–MSH6 co-bound genes but 44.6% (4886/10,960) of the genes bound by H3K36me3 alone. Consistently, the mean mutation frequency in H3K36me3-only genes is higher than in the co-bound ones (Fig. S1B). These data indicate that H3K36me3 executes its genome stability function via MMR.
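      The gene counts quoted in the text can be cross-checked with a few lines of arithmetic. The counts come from the text and Fig. 1B; the helper name `percent` is introduced here only for illustration.

```python
def percent(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole

# Counts reported in the text (Fig. 1B and the paragraph above).
h3k36me3_genes = 15884          # genes aligned with H3K36me3-enriched domains
msh6_genes = 5100               # genes aligned with MSH6-enriched domains
cobound_genes = 4924            # genes enriched for both marks

h3k36me3_only = h3k36me3_genes - cobound_genes             # 10960
msh6_with_h3k36me3 = percent(cobound_genes, msh6_genes)    # about 96.5
mutated_cobound = percent(925, cobound_genes)              # about 18.8
mutated_h3k36me3_only = percent(4886, h3k36me3_only)       # about 44.6
fold_difference = mutated_h3k36me3_only / mutated_cobound  # about 2.4
```

      The numbers are internally consistent: the 10,960 H3K36me3-only genes are exactly the 15,884 H3K36me3-enriched genes minus the 4924 co-bound genes, and a mutated gene is roughly 2.4 times as common among H3K36me3-only genes as among co-bound genes.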
      It is known that H3K36me3 enrichment tends to increase from the 5′ end to the 3′ end in gene bodies (
      • Barski A.
      • Cuddapah S.
      • Cui K.
      • Roh T.Y.
      • Schones D.E.
      • Wang Z.
      • Wei G.
      • Chepelev I.
      • Zhao K.
      High-resolution profiling of histone methylations in the human genome.
      ). To determine this distribution tendency's influence on mutation frequency, we divided all gene bodies in the genome into 10 segments from 5′ to 3′ and analyzed their relative H3K36me3- and MSH6-ChIP intensities and mutation frequencies. As shown in Fig. 2C, H3K36me3-ChIP signals indeed increase from the 5′ end to the 3′ end in the gene bodies, as do the MSH6-ChIP signals. Mapping mutations in the corresponding sequences revealed that mutation frequencies in 5′ sequences are generally higher than those in 3′ sequences (Fig. 2C), suggesting that even within the gene body H3K36me3-dependent MMR determines mutation frequency.
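      Dividing a gene body into 10 segments from 5′ to 3′ requires taking the gene's strand into account, which the following sketch makes explicit. The coordinate convention (0-based, half-open intervals) and the function names are assumptions for illustration and are not taken from the paper.

```python
def gene_segment(pos, start, end, strand, n_segments=10):
    """Return the 0-based segment index of a position inside a gene,
    where 0 is the most 5' segment and n_segments - 1 the most 3'.

    Assumes the gene spans [start, end) on the reference (0-based,
    half-open). For minus-strand genes the 5' end lies at the higher
    reference coordinate, so the offset is measured from the end."""
    if not start <= pos < end:
        raise ValueError("position lies outside the gene body")
    offset = pos - start if strand == "+" else (end - 1) - pos
    return offset * n_segments // (end - start)

def segment_mutation_counts(mutation_positions, start, end, strand, n_segments=10):
    """Count mutations per 5'-to-3' segment of a single gene."""
    counts = [0] * n_segments
    for pos in mutation_positions:
        counts[gene_segment(pos, start, end, strand, n_segments)] += 1
    return counts

# Hypothetical plus-strand gene spanning 1000-2000 with three mutations.
counts = segment_mutation_counts([1000, 1050, 1950], 1000, 2000, "+")
```

      Summing such per-gene counts over all genes, and dividing by the total segment length, would give a per-segment mutation frequency comparable in spirit to the profile plotted in Fig. 2C.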
      Replication timing substantially influences gene mutability (
      • Woo Y.H.
      • Li W.H.
      DNA replication timing and selection shape the landscape of nucleotide variation in cancer genomes.
      ). To determine the involvement of H3K36me3 in this process, we examined the relationship among replication timing, enrichment for H3K36me3, and mutation frequency. The results show that regions relatively enriched for H3K36me3 tend to replicate earlier and have a lower relative mutation frequency than regions with fewer H3K36me3 signals (Fig. 2D). These observations imply that H3K36me3-dependent MMR is critical in replication timing-associated mutability. It is worth mentioning that the replication timing-associated mutation frequency difference observed in HeLa P0 cells is not as large as the differences reported in certain cancer genomes (
      • Woo Y.H.
      • Li W.H.
      DNA replication timing and selection shape the landscape of nucleotide variation in cancer genomes.
      ,
      • Liu L.
      • De S.
      • Michor F.
      DNA replication timing and higher-order nuclear organization determine single-nucleotide substitution patterns in cancer genomes.
      • Sima J.
      • Gilbert D.M.
      Complex correlations: replication timing and mutational landscapes during cancer and genome evolution.
      ). We believe that this may be related to DNA repair capacities. HeLa cells are proficient in all known DNA repair pathways, including MMR (
      • Holmes Jr, J.
      • Clark S.
      • Modrich P.
      Strand-specific mismatch correction in nuclear extracts of human and Drosophila melanogaster cell lines.
      ), but many cancer genome studies reported in the literature used cancer cases that usually carry a defective DNA repair pathway.

      Gene bodies highly enriched for H3K36me3 display high mutation frequencies

      The data shown above strongly suggest that H3K36me3/MSH6 enrichment is inversely associated with local mutation frequency. However, we found that this rule does not apply to a small portion (∼0.4%) of the genome, where both H3K36me3/MSH6 intensity and mutation frequency are high (highlighted in the gray box in Fig. 3, A and B). We divided the whole genome into 18 groups according to their relative H3K36me3/MSH6 ChIP-intensity (i.e. 1–6, 7–12, and 13–18 classified as low, intermediate, and high intensity groups, respectively) and quantified the mutation frequency in each of these groups from the whole-genome sequencing data of HeLa P0 cells. As shown in Fig. 3, A and B, when H3K36me3/MSH6 intensity is less than 9, the number of both SNVs and indels per kb gradually decreases as the abundance of H3K36me3/MSH6 increases; however, when H3K36me3/MSH6 intensity reaches a relatively high level (higher than level 13), we observed increased mutation frequencies in these regions. To determine whether this unusual phenomenon is related to transcription, we analyzed mutation frequencies of gene bodies and intergenic sequences in the three groups. The results show that, unlike in the gene bodies, mutation frequencies in these intergenic sequences were inversely correlated to the abundance of H3K36me3 (Fig. 3C), consistent with the overall inverse correlation between H3K36me3 abundance and mutation frequency (see above). Because the major difference between gene bodies and intergenic sequences is their transcriptional activity, it is possible that the excess mutations identified in H3K36me3-high regions are associated with transcription.
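      The three intensity classes and the per-kb mutation frequency used in this analysis can be sketched as follows. How genomic bins are assigned to the 18 intensity groups is not stated in this section, so the sketch takes the group number as given; the function names and toy bins are assumptions.

```python
from collections import defaultdict

def intensity_class(group):
    """Map an intensity group (1-18) to the classes used in the text:
    1-6 low, 7-12 intermediate, 13-18 high."""
    if not 1 <= group <= 18:
        raise ValueError("intensity group must be between 1 and 18")
    if group <= 6:
        return "low"
    if group <= 12:
        return "intermediate"
    return "high"

def mutations_per_kb(n_mutations, total_bp):
    """Mutation frequency expressed as mutations per kilobase."""
    return n_mutations / (total_bp / 1000)

def frequency_by_class(bins):
    """bins: iterable of (intensity_group, n_mutations, length_bp).
    Returns mutations per kb pooled over each intensity class."""
    muts = defaultdict(int)
    length = defaultdict(int)
    for group, n_mutations, length_bp in bins:
        cls = intensity_class(group)
        muts[cls] += n_mutations
        length[cls] += length_bp
    return {cls: mutations_per_kb(muts[cls], length[cls]) for cls in muts}

# Hypothetical bins: (intensity group, mutation count, bin length in bp).
toy_bins = [(1, 4, 2000), (6, 2, 2000), (9, 1, 1000), (15, 6, 2000)]
freqs = frequency_by_class(toy_bins)
```

      The toy input is chosen to mimic the biphasic pattern described in the text, with the frequency falling from the low to the intermediate class and rising again in the high class; the values themselves are invented.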
      Figure 3. Mutational analysis in H3K36me3/MSH6 highly-enriched regions in HeLa P0 cells. A and B, SNV (A) or indel (B) frequency is plotted against H3K36me3 (blue) or MSH6 (orange) relative ChIP intensity. The SNVs and indels are identified in HeLa P0 cells by comparing the sequencing data with the HeLa genome reference, as described under “Experimental procedures.” H3K36me3-high regions are highlighted in the gray boxes. C, mutation frequencies of gene bodies and intergenic regions are plotted on three groups of DNA sequences with different levels of H3K36me3 intensity, as described in the text. D, percentages of different C→T mutation types are shown on three groups of DNA sequences, as described in C. The 16 kinds of trinucleotides are shown in the x axis. E, mutational strand asymmetry scores of C→T/G→A, T→C/A→G, and C→A/G→T mutations are calculated on H3K36me3-high (green), -intermediate (orange), and -low (purple) regions, as described under “Experimental procedures.”
      Previous studies have demonstrated mutational signatures of base substitutions associated with transcription-coupled damage in humans (
      • Alexandrov L.B.
      • Nik-Zainal S.
      • Wedge D.C.
      • Aparicio S.A.
      • Behjati S.
      • Biankin A.V.
      • Bignell G.R.
      • Bolli N.
      • Borg A.
      • Børresen-Dale A.L.
      • Boyault S.
      • Burkhardt B.
      • Butler A.P.
      • Caldas C.
      • Davies H.R.
      • et al.
      Signatures of mutational processes in human cancer.
      ) and bacteria (
      • Jinks-Robertson S.
      • Bhagwat A.S.
      Transcription-associated mutagenesis.
      ). We therefore analyzed the mutational signature of the base substitutions by incorporating information regarding the 5′ and 3′ neighboring sites of each mutated base in the WGS data from HeLa P0 cells, as described (
      • Alexandrov L.B.
      • Nik-Zainal S.
      • Wedge D.C.
      • Aparicio S.A.
      • Behjati S.
      • Biankin A.V.
      • Bignell G.R.
      • Bolli N.
      • Borg A.
      • Børresen-Dale A.L.
      • Boyault S.
      • Burkhardt B.
      • Butler A.P.
      • Caldas C.
      • Davies H.R.
      • et al.
      Signatures of mutational processes in human cancer.
      ). The results reveal predominant C→T mutations occurring at NpCpG trinucleotides (Fig. S2A), a characteristic of mutational signature 1A/B (
      • Alexandrov L.B.
      • Nik-Zainal S.
      • Wedge D.C.
      • Aparicio S.A.
      • Behjati S.
      • Biankin A.V.
      • Bignell G.R.
      • Bolli N.
      • Borg A.
      • Børresen-Dale A.L.
      • Boyault S.
      • Burkhardt B.
      • Butler A.P.
      • Caldas C.
      • Davies H.R.
      • et al.
      Signatures of mutational processes in human cancer.
      ,
      • Haradhvala N.J.
      • Polak P.
      • Stojanov P.
      • Covington K.R.
      • Shinbrot E.
      • Hess J.M.
      • Rheinbay E.
      • Kim J.
      • Maruvka Y.E.
      • Braunstein L.Z.
      • Kamburov A.
      • Hanawalt P.C.
      • Wheeler D.A.
      • Koren A.
      • Lawrence M.S.
      • Getz G.
      Mutational strand asymmetries in cancer genomes reveal mechanisms of DNA damage and repair.
      ). This mutational signature occurred more frequently as follows: 1) in H3K36me3-high regions than H3K36me3-low regions (Fig. 3D and Fig. S2A) and 2) in H3K36me3-high gene bodies than H3K36me3-high intergenic regions (Fig. S2, B, upper panel). However, the latter does not apply to H3K36me3-low gene bodies and intergenic regions (Fig. S2, B, bottom panel). These results imply that the excess mutations in H3K36me3-high gene bodies are likely related to transcription-associated DNA damage and repair.
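      Assigning a base substitution to its trinucleotide context, using the 5′ and 3′ neighbors of the mutated base, can be sketched as below. The convention of reporting every substitution from the pyrimidine strand is the one commonly used in mutational signature analysis and is assumed here; the function names and toy sequences are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def trinucleotide_context(ref_seq, pos, alt):
    """Return (context, substitution) for a substitution at 0-based index
    pos of ref_seq, with the mutated base expressed as a pyrimidine
    (assumed convention). The context is the mutated base plus its 5'
    and 3' neighbors, e.g. ("ACG", "C>T")."""
    if not 1 <= pos <= len(ref_seq) - 2:
        raise ValueError("position needs a neighbor on both sides")
    ref = ref_seq[pos]
    context = ref_seq[pos - 1:pos + 2]
    if ref in "AG":  # purine: report the event on the opposite strand
        context = revcomp(context)
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return context, f"{ref}>{alt}"

def is_c_to_t_at_npcpg(context, substitution):
    """True for C>T substitutions whose 3' neighbor is G (NpCpG)."""
    return substitution == "C>T" and context[1:] == "CG"

# A G>A change on the reference strand at a CpG is the same event as
# C>T at NpCpG on the opposite strand.
ctx, sub = trinucleotide_context("ACGT", 2, "A")
```

      For a fixed substitution type such as C→T there are 4 × 4 = 16 possible flanking combinations, which matches the 16 trinucleotides mentioned in the legend to Fig. 3D.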
      In addition to mutational signatures, strand bias of C→T, C→A, and T→C mutations has also been linked to transcription-coupled damage and repair (
      • Alexandrov L.B.
      • Nik-Zainal S.
      • Wedge D.C.
      • Aparicio S.A.
      • Behjati S.
      • Biankin A.V.
      • Bignell G.R.
      • Bolli N.
      • Borg A.
      • Børresen-Dale A.L.
      • Boyault S.
      • Burkhardt B.
      • Butler A.P.
      • Caldas C.
      • Davies H.R.
      • et al.
      Signatures of mutational processes in human cancer.
      ,
      • Haradhvala N.J.
      • Polak P.
      • Stojanov P.
      • Covington K.R.
      • Shinbrot E.
      • Hess J.M.
      • Rheinbay E.
      • Kim J.
      • Maruvka Y.E.
      • Braunstein L.Z.
      • Kamburov A.
      • Hanawalt P.C.
      • Wheeler D.A.
      • Koren A.
      • Lawrence M.S.
      • Getz G.
      Mutational strand asymmetries in cancer genomes reveal mechanisms of DNA damage and repair.
      ). To determine whether the excess mutations in H3K36me3-high gene bodies are strand-biased, we designated genomic regions as tx(+) when they code genes on the reference strand and tx(−) when they code on the complementary strand, as described by Haradhvala et al. (
      • Haradhvala N.J.
      • Polak P.
      • Stojanov P.
      • Covington K.R.
      • Shinbrot E.
      • Hess J.M.
      • Rheinbay E.
      • Kim J.
      • Maruvka Y.E.
      • Braunstein L.Z.
      • Kamburov A.
      • Hanawalt P.C.
      • Wheeler D.A.
      • Koren A.
      • Lawrence M.S.
      • Getz G.
      Mutational strand asymmetries in cancer genomes reveal mechanisms of DNA damage and repair.
      ), and we calculated the strand asymmetry scores (see “Experimental procedures”) of each of the three mutation types in H3K36me3-high (intensity groups 13–18), -intermediate (intensity groups 7–12), and -low (intensity groups 1–6) regions. Fig. 3E shows the strand asymmetry analysis of C→T, C→A, and T→C mutations. We detected little strand bias in these mutations over the entire genome (Fig. 3E, upper panel) but a relatively strong strand bias in H3K36me3-low regions (middle and bottom panels). Interestingly, despite excess mutations in H3K36me3-high regions (Fig. 3, A–D), the predominant C→T and T→C mutations did not show the strong strand bias as seen in H3K36me3-low regions (Fig. 3E), which indicates that DNA repair systems (
      • Ikehata H.
      • Ono T.
      Significance of CpG methylation for solar UV-induced mutagenesis and carcinogenesis in skin.
      ,
      • Pfeifer G.P.
      Mutagenesis at methylated CpG sequences.
      • Poulos R.C.
      • Olivier J.
      • Wong J.W.
      The interaction between cytosine methylation and processes of DNA replication and repair shape the mutational landscape of cancer genomes.
      ), including H3K36me3-dependent MMR, repair such transcription-coupled damage in transcribed and nontranscribed DNA strands in a balanced manner. Thus, at least some of the excess mutations in H3K36me3-high regions likely accumulated because of insufficient DNA lesion repair on the transcribed strand during active transcription.
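      The tx(+)/tx(−) bookkeeping behind a strand asymmetry analysis can be sketched as follows. The paper's actual score is defined under “Experimental procedures,” which is not reproduced in this section, so the log2 ratio used below is only an assumed stand-in; the function name, the dictionary layout, and the toy counts are likewise hypothetical.

```python
from math import log2

# Each substitution paired with its complementary form on the other strand.
COMPLEMENT_SUB = {"C>T": "G>A", "T>C": "A>G", "C>A": "G>T"}

def strand_asymmetry(counts, sub):
    """counts maps (region_type, substitution) to a mutation count, where
    region_type is "tx+" (gene on the reference strand) or "tx-" (gene on
    the complementary strand) and substitutions are called on the
    reference strand.

    Returns log2(coding-strand events / template-strand events) for the
    given substitution. This particular score is an assumption, not the
    paper's definition; 0 means no strand bias."""
    comp = COMPLEMENT_SUB[sub]
    # In tx+ regions the reference strand is the coding strand; in tx-
    # regions the same coding-strand event appears as its complement.
    coding = counts.get(("tx+", sub), 0) + counts.get(("tx-", comp), 0)
    template = counts.get(("tx+", comp), 0) + counts.get(("tx-", sub), 0)
    if coding == 0 or template == 0:
        raise ValueError("need nonzero counts on both strands")
    return log2(coding / template)

# Hypothetical counts showing a fourfold excess on the coding strand.
toy_counts = {
    ("tx+", "C>T"): 30, ("tx-", "G>A"): 10,
    ("tx+", "G>A"): 5, ("tx-", "C>T"): 5,
}
score = strand_asymmetry(toy_counts, "C>T")
```

      Under this assumed score, a value near 0 would correspond to the balanced pattern described for H3K36me3-high regions, and a value far from 0 to the stronger bias described for H3K36me3-low regions.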

      Actively transcribed genes are highly susceptible to mutation despite being enriched for H3K36me3

      To determine transcription's association with the excess mutations observed in H3K36me3-high regions, we performed RNA-Seq analysis in HeLa P0 cells, as described under “Experimental procedures.” Nontranscribed genomic regions were defined as bins with zero reads mapped, whereas transcribed genomic regions were identified as bins with one or more reads mapped. This analysis excluded rRNA transcripts, but genomic regions encompassed by spliced reads (e.g. introns) were binned with transcribed genomic regions. The overall fold-enrichment for H3K36me3/MSH6 was significantly higher (Fig. 4A, p < 0.01 and p < 0.05 for H3K36me3 and MSH6, respectively), and relative mutation frequency was significantly lower (Fig. 4B, p < 0.01) in all transcribed regions, including noncoding RNA genes, than in nontranscribed regions. Mutational analysis of the actively transcribed 5S RNA gene clusters and their flanking nontranscribed intergenic spacer regions revealed that only 4 of the 17 gene regions harbored a mutation, but 11 of the 16 nontranscribed intergenic spacer regions contained 1–5 mutations. Although we did not link excess mutations in H3K36me3-high gene bodies to all transcribed genes, these data further suggest that H3K36me3-dependent MMR preferentially protects transcribed genes from mutations.
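      The bin classification rule stated above (zero mapped reads means nontranscribed, one or more reads means transcribed) and the resulting comparison of mutation frequencies can be sketched as below. The bin size, function names, and toy counts are assumptions; the rRNA exclusion and the handling of spliced reads described in the text are not modeled.

```python
def classify_bins(read_counts):
    """Label each genomic bin by its RNA-Seq read count: zero reads is
    nontranscribed, one or more reads is transcribed."""
    return ["transcribed" if n >= 1 else "nontranscribed" for n in read_counts]

def mutation_frequency_by_status(read_counts, mutation_counts, bin_size_bp):
    """Pool mutations per kb over transcribed and nontranscribed bins.
    Assumes all bins share the same size (hypothetical simplification)."""
    totals = {"transcribed": [0, 0], "nontranscribed": [0, 0]}
    for status, n_mut in zip(classify_bins(read_counts), mutation_counts):
        totals[status][0] += n_mut        # mutations in this class
        totals[status][1] += bin_size_bp  # base pairs in this class
    return {
        status: n_mut / (bp / 1000)
        for status, (n_mut, bp) in totals.items()
        if bp > 0
    }

# Hypothetical 1-kb bins: RNA-Seq reads and mutation counts per bin.
reads = [0, 3, 0, 12]
mutations = [4, 1, 2, 1]
freqs = mutation_frequency_by_status(reads, mutations, 1000)
```

      The toy input gives a lower mutation frequency in transcribed than in nontranscribed bins, the direction reported in Fig. 4B; the magnitudes are invented.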
      Figure 4. Relationships between H3K36me3 or MSH6 enrichment, transcriptional activity, and mutation frequency in HeLa P0 cells. A, relative enrichment for H3K36me3 or MSH6 is shown in whole-genome–wide transcribed and nontranscribed regions. B, relative mutation frequency is shown in whole-genome–wide transcribed and nontranscribed regions, as indicated. C, H3K36me3 ChIP intensity is plotted versus mutation frequency in six groups of genomic regions with different transcription levels, as described in the text. Transcription level increases from group 1 to group 6. D, H3K36me3 (blue) and MSH6 (orange) enrichment level (left y axis) and mutation frequency in gene bodies (purple, right y axis) are plotted versus relative transcription level. The regions highlighted in the gray box indicate highly transcribed genes, as described in the text.
      Given that highly transcribed genes are more susceptible to mutation than less active and silent genes (
      • Haradhvala N.J.
      • Polak P.
      • Stojanov P.
      • Covington K.R.
      • Shinbrot E.
      • Hess J.M.
      • Rheinbay E.
      • Kim J.
      • Maruvka Y.E.
      • Braunstein L.Z.
      • Kamburov A.
      • Hanawalt P.C.
      • Wheeler D.A.
      • Koren A.
      • Lawrence M.S.
      • Getz G.
      Mutational strand asymmetries in cancer genomes reveal mechanisms of DNA damage and repair.
      ,
      • Zheng C.L.
      • Wang N.J.
      • Chung J.
      • Moslehi H.
      • Sanborn J.Z.
      • Hur J.S.
      • Collisson E.A.
      • Vemula S.S.
      • Naujokas A.
      • Chiotti K.E.
      • Cheng J.B.
      • Fassihi H.
      • Blumberg A.J.
      • Bailey C.V.
      • Fudem G.M.
      • et al.
      Transcription restores DNA repair to heterochromatin, determining regional mutation rates in cancer genomes.
      ,
      • Shah K.A.
      • McGinty R.J.
      • Egorova V.I.
      • Mirkin S.M.
      Coupling transcriptional state to large-scale repeat expansions in yeast.
      ), we next analyzed transcription activity and its relationship with H3K36me3 abundance and mutation frequency. We classified all regions (including those for noncoding RNAs) in the whole genome into six groups based on their transcription levels, with group 1 and group 6 having the lowest and highest transcription activity, respectively. We found that mutation frequency is inversely correlated with H3K36me3 abundance in genomic regions in transcription groups 1–3 (Fig. 4C), which constitute the vast majority of the regions analyzed. For group 4 regions, mutation frequency does not correlate with the abundance of H3K36me3. However, for regions in groups 5 and 6, which represent a very small portion of all the regions analyzed, mutation frequencies increase gradually as H3K36me3 intensity increases (Fig. 4C, see the regions highlighted in the gray box). Thus, an overall biphasic correlation between mutation frequency and H3K36me3 intensity was observed (Fig. 4C, green solid line), which is consistent with the data shown in Fig. 3, A and B. These results suggest that active transcription contributes to gene mutations.
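The grouping analysis can be illustrated with a short Python sketch. The text does not state how the six groups were delimited, so the equal-sized split by expression rank used here is an assumption, as are the field names (`expression`, `h3k36me3`, `mutation_freq`) and the toy values, which were invented to mimic a biphasic shape and are not the paper's data.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient; NaN if either variable is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)

def split_into_groups(regions, n_groups=6):
    """Sort regions by expression and cut them into near-equal-sized groups."""
    ordered = sorted(regions, key=lambda r: r["expression"])
    size = len(ordered)
    return [ordered[g * size // n_groups:(g + 1) * size // n_groups]
            for g in range(n_groups)]

def correlation_by_group(regions, n_groups=6):
    """Correlate H3K36me3 signal with mutation frequency within each group."""
    result = {}
    for i, grp in enumerate(split_into_groups(regions, n_groups), start=1):
        xs = [r["h3k36me3"] for r in grp]
        ys = [r["mutation_freq"] for r in grp]
        result[i] = pearson(xs, ys)
    return result

# Toy regions (invented): mutation frequency falls with H3K36me3 in groups 1-3,
# is uncorrelated in group 4, and rises in groups 5-6.
group4_freq = {9: 11, 10: 12, 11: 11}
toy_regions = []
for i in range(18):
    if i < 9:
        mut = 20 - i
    elif i < 12:
        mut = group4_freq[i]
    else:
        mut = i
    toy_regions.append({"expression": i, "h3k36me3": i, "mutation_freq": mut})

corr = correlation_by_group(toy_regions)
```

Computing the correlation within each group, rather than across the whole genome, is what exposes a sign change between low and high transcription levels; a single genome-wide coefficient would average the two phases together.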
      We performed a similar analysis on all protein-coding genes, but with more details about their transcription activities. We observed a positive correlation between H3K36me3 or MSH6 ChIP signals and transcription levels and an overall inverse correlation between these variables and relative mutation frequencies. However, when the transcription activity reached a level higher than 12, mutation frequency began to increase as transcription activity increased (Fig. 4D). Interestingly, these highly transcribed genes are heavily loaded with both H3K36me3 and MSH6 (Fig. 4D, see regions highlighted in the gray box), a phenomenon observed in Fig. 3A. We indeed found that the vast majority (∼78%) of these highly transcribed genes are located in the regions with excess mutations shown in Fig. 3A (regions between ChIP intensity 12–18). These unexpected results suggest that highly transcribed genes, although heavily protected by H3K36me3-dependent MMR, are highly susceptible to mutations (see below for details). These mutations are probably derived from DNA damage in open chromatin during transcription (
      • Hamperl S.
      • Cimprich K.A.
      The contribution of co-transcriptional RNA:DNA hybrid structures to DNA damage and genome instability.
      ,
      • Costantino L.
      • Koshland D.
      The Yin and Yang of R-loop biology.
      ).

      Disrupting the H3K36me3–MSH6 interaction preferentially increases mutation frequency in highly transcribed regions

      SETD2 is the major methyltransferase that trimethylates H3K36 (
      • Edmunds J.W.
      • Mahadevan L.C.
      • Clayton A.L.
      Dynamic histone H3 methylation during gene induction: HYPB/Setd2 mediates all H3K36 trimethylation.
      ), and the MSH6 PWWP domain interacts with H3K36me3 (
      • Li F.
      • Mao G.
      • Tong D.
      • Huang J.
      • Gu L.
      • Yang W.
      • Li G.M.
      The histone mark H3K36me3 regulates human DNA mismatch repair through its interaction with MutSα.
      ). To determine the actual involvement of H3K36me3-dependent MMR in transcription-associated damage removal, we examined the relationships between H3K36me3/MSH6 abundance, transcription activity, and de novo mutations in HeLa cell lines (i.e. SETD2-knockout or MSH6 PWWP-domain–disrupted) after 100 passages (P100, see “Experimental procedures”). As expected, we detected very low levels of H3K36me3 and MSH6 on chromatin in SETD2-knockout (SETD2-KO) HeLa cells (Figs. S3, A and B, and S4, A and B). Consistent with previous studies (
      • Li F.
      • Mao G.
      • Tong D.
      • Huang J.
      • Gu L.
      • Yang W.
      • Li G.M.
      The histone mark H3K36me3 regulates human DNA mismatch repair through its interaction with MutSα.
      ), the de novo mutation frequency was elevated in SETD2-KO cells (Fig. S4C). Interestingly, the de novo mutations in SETD2-KO cells but not in control HeLa cells occurred preferentially in exonic regions originally enriched for H3K36me3/MSH6 (Fig. 5A), and the relative mutation frequency in SETD2-KO cells correlates with the original fold-enrichment for H3K36me3 in control HeLa cells (Fig. 5B). In addition, relative mutation frequencies were higher in transcribed genes than in nontranscribed genes in SETD2-KO HeLa cells, and the opposite was observed in control HeLa cells (Fig. 5C). We similarly analyzed the relationship between transcription level and de novo mutation frequency and found that mutation frequency correlates positively with transcription activity in SETD2-KO cells but not in control HeLa cells (Fig. 5D).
      Figure 5. Analyses of de novo mutations in H3K36me3/MSH6-deficient or control HeLa P100 cells. A, H3K36me3/MSH6 and de novo mutation enrichment scores (OE, observed/expected value) in exons, introns, and intergenic regions in SETD2-KO, MSH6-KO, PWWP-MT, and control HeLa cells. B, de novo mutation frequency of WT HeLa and each individual mutant cell line (SETD2-KO, MSH6-KO, and PWWP-MT), as indicated, is plotted versus H3K36me3 intensity in WT cells. C, mutation frequencies in transcribed (blue) and nontranscribed (purple) genes are shown for WT, SETD2-KO, MSH6-KO, and PWWP-MT cells. D, de novo mutation frequency of WT and each individual mutant cell line is plotted versus relative transcription level. E, mutational strand asymmetry scores of C→T/G→A, T→C/A→G, and C→A/G→T mutations are calculated in HeLa cells and each of the three MMR-deficient cells, as described under “Experimental procedures.”
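An observed/expected (OE) enrichment score of the kind named in panel A can be sketched as follows. The caption does not say how the expected value was derived, so this example assumes that mutations are expected to fall in each region class in proportion to its total length. The function name, the region lengths, and the counts are all hypothetical.

```python
def observed_expected(mutation_counts, region_lengths):
    """OE score per region class, assuming a length-proportional expectation.

    Both arguments are dicts keyed by region class (e.g. 'exon').
    OE > 1 means more mutations than expected; OE < 1 means fewer.
    """
    total_mut = sum(mutation_counts.values())
    total_len = sum(region_lengths.values())
    scores = {}
    for region, observed in mutation_counts.items():
        expected = total_mut * region_lengths[region] / total_len
        scores[region] = observed / expected
    return scores

# Toy inputs, invented for illustration (lengths in Mb).
toy_counts = {"exon": 30, "intron": 400, "intergenic": 570}
toy_lengths = {"exon": 50, "intron": 400, "intergenic": 550}
oe = observed_expected(toy_counts, toy_lengths)
```

In this toy case exons carry 3% of the mutations but 5% of the sequence, so their OE is 0.6, the kind of depletion one would read as protection of exonic sequence.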
      To rule out the possibility that H3K36me3 prevents mutations in actively transcribed regions by recruiting other repair proteins, we also performed similar analyses in HeLa cells carrying a complete knockout of MSH6 or an MSH6 defective in the PWWP domain (Fig. S3, C–E), which only disrupted the MMR pathway or disrupted the interaction between H3K36me3 and MSH6 without impairing other DNA repair pathways. We observed similar results to those obtained in the SETD2-KO HeLa cells (Fig. 5, A–D).
      To test whether the de novo mutations observed in H3K36me3–MSH6-deficient HeLa cells are associated with transcription-associated DNA lesions, we analyzed the mutational signatures in all three MMR-deficient cells and their control HeLa cells, as described above (see Fig. S2 and Fig. 3). The overall mutation spectrum is quite similar in these cells (Fig. S4D) and also similar to that observed in HeLa P0 cells (Fig. S2). However, when we analyzed the detailed mutational signatures on different gene groups with different transcription levels, we found an increase in the C→T mutational signature associated with an increase in transcriptional activities in all three MMR-deficient cells (Fig. S4E), consistent with the results shown in Fig. 3D. More strikingly, analysis of mutational strand bias revealed that, compared with wildtype (WT) HeLa P100 cells, the strand bias of C→T mutations was significantly reduced in cells defective in SETD2, MSH6, or the MSH6 PWWP domain (Fig. 5E, middle and bottom panels). Similarly, the degree of mutational strand bias for C→A and T→C mutations was also reduced (Fig. 5E). Given that all cell lines used here are proficient in both base excision repair (BER) and nucleotide excision repair (NER), but differ in MMR, we conclude that the observed mutations and the loss of mutational strand bias in actively transcribed genes in H3K36me3–MSH6-defective cells are due to the loss of MMR function.
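To make the notion of mutational strand bias concrete, the sketch below counts a substitution (here C→T) on the coding strand of each gene against its complement (G→A on the coding strand, i.e. C→T on the transcribed strand). The authors' exact asymmetry score is defined in their “Experimental procedures”; the log2 ratio with a pseudocount used here is an assumed stand-in, and the input tuples and function names are hypothetical.

```python
import math

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def to_coding_strand(ref, alt, gene_strand):
    """Express a reference-strand substitution relative to the gene's coding strand."""
    if gene_strand == "+":
        return ref, alt
    return COMPLEMENT[ref], COMPLEMENT[alt]

def strand_asymmetry(mutations, ref_base="C", alt_base="T", pseudocount=0.5):
    """log2 ratio of ref>alt on the coding strand versus on the transcribed strand.

    `mutations` holds (ref, alt, gene_strand) tuples called on the reference
    strand. A score of 0 means no strand bias.
    """
    complement_change = (COMPLEMENT[ref_base], COMPLEMENT[alt_base])
    on_coding = on_transcribed = 0
    for ref, alt, strand in mutations:
        change = to_coding_strand(ref, alt, strand)
        if change == (ref_base, alt_base):
            on_coding += 1
        elif change == complement_change:
            on_transcribed += 1
    return math.log2((on_coding + pseudocount) / (on_transcribed + pseudocount))

# Toy calls, invented for illustration.
biased = [("C", "T", "+"), ("C", "T", "+"), ("G", "A", "-"), ("G", "A", "+")]
balanced = [("C", "T", "+"), ("G", "A", "+")]
biased_score = strand_asymmetry(biased)
balanced_score = strand_asymmetry(balanced)
```

Under this definition, a loss of strand bias in a repair-deficient line would show up as a score moving toward 0 relative to the wild-type score.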

      Actively transcribed tumor suppressor genes are more susceptible to mutation in MMR-deficient cells

      We selectively examined enrichment for H3K36me3 and relative mutation frequency in 123 tumor suppressor genes annotated in the COSMIC database (
      • Forbes S.A.
      • Bhamra G.
      • Bamford S.
      • Dawson E.
      • Kok C.
      • Clements J.
      • Menzies A.
      • Teague J.W.
      • Futreal P.A.
      • Stratton M.R.
      The catalogue of somatic mutations in cancer (COSMIC).
      ). The results in control HeLa cells revealed the following: 1) H3K36me3 was more enriched in these genes than in control genes (Fig. 6A); 2) relative mutation frequency was lower in these tumor suppressor genes than in control genes (Fig. 6B); and 3) these tumor suppressor genes were transcribed at a higher level than control genes (Fig. 6C), consistent with a previous study (
      • Chen K.
      • Chen Z.
      • Wu D.
      • Zhang L.
      • Lin X.
      • Su J.
      • Rodriguez B.
      • Xi Y.
      • Xia Z.
      • Chen X.
      • Shi X.
      • Wang Q.
      • Li W.
      Broad H3K4me3 is associated with increased transcription elongation and enhancer activity at tumor-suppressor genes.
      ). In H3K36me3-depleted cells (i.e. SETD2-KO), nonsilent mutations occurred preferentially in these tumor suppressor genes. However, not all tumor suppressor genes carried a new mutation in SETD2-KO cells. For example, we detected de novo mutations in BRCA2, DAXX, NOTCH1, TET1, TGFBR2, ARID2, NF2, and FAT1 but not in many other tumor suppressor genes analyzed in this study (Table S1). Interestingly, 5 of the 11 (∼46%) nonsilent point mutations shown in Table S1 are C→T mutations.
      Figure 6. Analyses of H3K36me3 enrichment, transcription activity, and mutation frequency in TSGs. A, H3K36me3 ChIP read densities in TSGs and non-TSGs in control HeLa P0 cells. B, mutation frequency in TSGs and non-TSGs in control HeLa P0 cells. C, transcription levels of TSGs and non-TSGs in control HeLa P0 cells. D, transcription levels of all TSGs and only the mutated genes in SETD2/PWWP-deficient HeLa cells. E, representative mutated or mutation-free TSG loci in SETD2-KO HeLa P100 cells. H3K36me3 ChIP intensities and expression levels are shown.
      We reasoned that the discrepancy in mutation frequencies among these tumor suppressor genes can be explained by their transcription activity, which has been shown to influence the stability of the local sequences (Figs. 4 and 5). We therefore compared the mutagenic activities of tumor suppressor genes that were enriched for H3K36me3/MSH6 prior to SETD2 knockout, but possessed different transcription activities. In general, relatively highly transcribed tumor suppressor genes were more prone to mutation in MMR-deficient cells (Fig. 6D). Fig. 6E shows six representative genes in this analysis. Even though the SETD2 knockout depleted the H3K36me3 signal in all six genes (ARID2, NF2, FAT1, CEBPA, CDH1, and DNMT3A), we detected mutations in the highly transcribed genes ARID2, NF2, and FAT1 but not in the much less transcribed genes CEBPA, CDH1, and DNMT3A. These results indicate that actively transcribed tumor suppressor genes are preferentially safeguarded by H3K36me3-dependent MMR under normal circumstances to avoid mutations during transcription, but they are susceptible to mutation when the MMR system is defective. Our data provide a further molecular explanation for why actively transcribed genes are targeted for mutation (
      • Supek F.
      • Lehner B.
      Differential DNA mismatch repair underlies mutation rate variation across the human genome.
      ,
      • Supek F.
      • Lehner B.
      Clustered mutation signatures reveal that error-prone DNA repair targets mutations to active genes.
      ,
      • Supek F.
      • Miñana B.
      • Valcárcel J.
      • Gabaldón T.
      • Lehner B.
      Synonymous mutations frequently act as driver mutations in human cancers.
      ). This also indicates that chromatin structure and its susceptibility to DNA damage during DNA metabolism, as well as differences in rates of transcription, have a great impact on local mutation frequencies.

      Actively transcribed cancer genes are more susceptible to oxidation-induced mutations in H3K36me3- or MSH6-depleted cells

      To further confirm that the observed mutations in actively transcribed genes are due to loss of MMR activity associated with transcription, we treated MMR-proficient and -deficient HeLa cells with a low concentration of H2O2, which mainly induces oxidative DNA damage, such as 8-oxo-guanine and strand breaks. These oxidative DNA lesions are independent of DNA replication and are not classical substrates for MMR. The H2O2 treatment did not block loading of H3K36me3 and MSH6 to chromatin in the transcription-active G1 phase (Fig. S5, A and B). We analyzed H2O2-induced mutations by DNA sequencing of three actively transcribed cancer genes CALR, MYC, and TP53 and two transcriptionally inactive genes CD79A and GATA1. We observed that the relative mutation frequencies were significantly higher in the actively transcribed genes than in the transcriptionally inactive ones in SETD2-KO, MSH6-KO, or MSH6-PWWP-defective cells (Table 1 and Table S2). However, we did not detect this discrepancy in mutation frequencies between actively transcribed and transcriptionally inactive genes in MMR-competent control HeLa cells (Table 1). It is worth noting that in H3K36me3- and MSH6-defective cells, H2O2-induced mutation frequencies in these genes are essentially proportional to their transcriptional activities, e.g. CALR > MYC > TP53 (Table 1), consistent with the data shown in Fig. 5D. Because all cell lines used here differ only in terms of their MMR activity, these results confirm that H3K36me3-mediated MMR preferentially protects actively transcribed genes from mutation.
      Table 1. Mutation frequency induced by H2O2 in actively transcribed or inactively transcribed cancer genes

      Discussion

      MMR is well known for correcting mismatches generated during DNA replication to ensure high DNA replication fidelity. We have recently shown that H3K36me3 is an important MMR factor, as it recruits MutSα to replicating chromatin in human cells (
      • Li F.
      • Mao G.
      • Tong D.
      • Huang J.
      • Gu L.
      • Yang W.
      • Li G.M.
      The histone mark H3K36me3 regulates human DNA mismatch repair through its interaction with MutSα.
      ). Here, we provide evidence that H3K36me3-dependent MMR preferentially protects actively transcribed genes from mutation. We believe that MMR does this during both replication and transcription. Therefore, this study has identified a new genome maintenance function for H3K36me3 and MMR in repairing transcription-associated DNA lesions.
      Under normal circumstances, both H3K36me3 and MSH6 are more abundant in gene bodies than in other genomic regions and, within genes, more abundant in actively transcribed genes than in less actively transcribed and silent genes (Figs. 2 and 4). Correspondingly, local mutation frequency is inversely correlated with H3K36me3/MSH6 intensity (Figs. 2 and 4). However, in H3K36me3-depleted cells or cells in which the H3K36me3–MSH6 interaction had been disrupted, mutations occurred preferentially in actively transcribed regions (Fig. 5). These observations suggest that H3K36me3-mediated MMR preferentially protects actively transcribed regions from mutation.
      It is known that active protein-coding genes reside in early replicating chromatin (
      • Hiratani I.
      • Ryba T.
      • Itoh M.
      • Rathjen J.
      • Kulik M.
      • Papp B.
      • Fussner E.
      • Bazett-Jones D.P.
      • Plath K.
      • Dalton S.
      • Rathjen P.D.
      • Gilbert D.M.
      Genome-wide dynamics of replication timing revealed by in vitro models of mouse embryogenesis.
      ,
      • Hiratani I.
      • Ryba T.
      • Itoh M.
      • Yokochi T.
      • Schwaiger M.
      • Chang C.W.
      • Lyou Y.
      • Townes T.M.
      • Schübeler D.
      • Gilbert D.M.
      Global reorganization of replication domains during embryonic stem cell differentiation.
      ,
      • Lubelsky Y.
      • Prinz J.A.
      • DeNapoli L.
      • Li Y.
      • Belsky J.A.
      • MacAlpine D.M.
      DNA replication and transcription programs respond to the same chromatin cues.
      ), where H3K36me3/MSH6 are highly enriched (Fig. 2D). Thus, the MMR system can efficiently correct errors generated in actively transcribed genes during early replication, resulting in error-free replication. In addition, the following facts support a role for MMR in transcription-associated DNA lesion removal. We observed that a number of very actively transcribed genes, which are highly enriched for H3K36me3/MSH6 and presumably located in early replicating chromatin, had a mutation frequency higher than that of genes with less transcription activity and less abundance in H3K36me3/MSH6 (Fig. 4, C and D). This phenomenon cannot be explained by replication-associated mutations, as the very actively transcribed genes would have been better protected by H3K36me3-mediated MMR. However, during transcription, very actively transcribed genes are persistently exposed in an open chromatin structure and thus suffer more DNA damage-induced mutations than less actively transcribed genes. This assumption is supported by the fact that actively transcribed genes displayed a higher H2O2-induced mutation frequency than less actively transcribed ones in MMR-deficient cells (Table 1). Given that all cells used are competent in other DNA repair pathways and that H2O2-induced damage is not replication-specific, the simplest explanation is that defects in H3K36me3-mediated MMR are responsible for the observed mutations in actively transcribed genes during transcription. These results agree with recent studies demonstrating that human cells differentially repair DNA lesions or mispairs in euchromatin and heterochromatin (
      • Supek F.
      • Lehner B.
      Differential DNA mismatch repair underlies mutation rate variation across the human genome.
      ) and that mutations preferentially occur in active genes in tumors defective in MMR (
      • Supek F.
      • Lehner B.
      Clustered mutation signatures reveal that error-prone DNA repair targets mutations to active genes.
      ,
      • Frigola J.
      • Sabarinathan R.
      • Mularoni L.
      • Muiños F.
      • Gonzalez-Perez A.
      • López-Bigas N.
      Reduced mutation rate in exons due to differential mismatch repair.
      ).
      Transcription-coupled NER has been well documented to selectively remove some DNA lesions from transcribed strands (
      • Mellon I.
      • Spivak G.
      • Hanawalt P.C.
      Selective removal of transcription-blocking DNA damage from the transcribed strand of the mammalian DHFR gene.
      ,
      • Hanawalt P.C.
      • Spivak G.
      Transcription-coupled DNA repair: two decades of progress and surprises.
      ). However, NER's DNA lesion specificity may limit its role in transcription-coupled repair. In contrast, MutSα has a very broad DNA substrate capacity and is capable of recognizing many DNA lesions that are normally processed by BER and NER pathways, such as G:U mispairs, 8-oxo-guanine, O6-methylguanine, and UV dimers (
      • Ni T.T.
      • Marsischky G.T.
      • Kolodner R.D.
      MSH2 and MSH6 are required for removal of adenine misincorporated opposite 8-oxo-guanine in S. cerevisiae.
      ,
      • Li G.M.
      The role of mismatch repair in DNA damage-induced apoptosis.
      ,
      • Poulos R.C.
      • Olivier J.
      • Wong J.W.
      The interaction between cytosine methylation and processes of DNA replication and repair shape the mutational landscape of cancer genomes.
      ,
      • Mazurek A.
      • Berardini M.
      • Fishel R.
      Activation of human MutS homologs by 8-oxo-guanine DNA damage.
      ,
      • Mu D.
      • Tursun M.
      • Duckett D.R.
      • Drummond J.T.
      • Modrich P.
      • Sancar A.
      Recognition and repair of compound DNA lesions (base damage and mismatch) by human mismatch repair and excision repair systems.
      ,
      • Wang H.
      • Lawrence C.W.
      • Li G.M.
      • Hays J.B.
      Specific binding of human MSH2.MSH6 mismatch-repair protein heterodimers to DNA incorporating thymine- or uracil-containing UV light photoproducts opposite mismatched bases.
      ,
      • Gu L.
      • Wu J.
      • Qiu L.
      • Jennings C.D.
      • Li G.M.
      Involvement of DNA mismatch repair in folate deficiency-induced apoptosis.
      ). In addition, unlike other DNA repair genes, whose expressions are induced in response to DNA damage, the expressions of MMR genes are maintained at a very stable level in all cell cycle phases (
      • Li F.
      • Mao G.
      • Tong D.
      • Huang J.
      • Gu L.
      • Yang W.
      • Li G.M.
      The histone mark H3K36me3 regulates human DNA mismatch repair through its interaction with MutSα.
      ), making them available whenever needed. Thus, both its availability and its broad substrate capability make MMR an ideal genome maintenance system.
      It is known that during DNA replication, MMR corrects biosynthetic errors in a strand-specific manner and that strand specificity is directed either by a pre-existing strand break (e.g. the ends of Okazaki fragments) or by endonucleolytic cleavage by MutLα. How the MMR system removes the strand-specific damage during transcription is unknown. However, previous studies have provided some hints to the answer. MMR has been shown to occur in nondividing cells exposed to high levels of DNA damage, a reaction called noncanonical MMR (
      • Rodriguez G.P.
      • Romanova N.V.
      • Bao G.
      • Rouf N.C.
      • Kow Y.W.
      • Crouse G.F.
      Mismatch repair-dependent mutagenesis in nondividing cells.
      ,
      • Crouse G.F.
      Non-canonical actions of mismatch repair.
      ,
      • Peña-Diaz J.
      • Bregenhorn S.
      • Ghodgaonkar M.
      • Follonier C.
      • Artola-Borán M.
      • Castor D.
      • Lopes M.
      • Sartori A.A.
      • Jiricny J.
      Noncanonical mismatch repair as a source of genomic instability in human cells.
      ). Recent studies by Modrich and co-workers (
      • Kadyrov F.A.
      • Dzantiev L.
      • Constantin N.
      • Modrich P.
      Endonucleolytic function of MutLα in human mismatch repair.
      ,
      • Pluciennik A.
      • Burdett V.
      • Baitinger C.
      • Iyer R.R.
      • Shi K.
      • Modrich P.
      Extrahelical (CAG)/(CTG) triplet repeat elements support proliferating cell nuclear antigen loading and MutLα endonuclease activation.
      ,
      • Pluciennik A.
      • Dzantiev L.
      • Iyer R.R.
      • Constantin N.
      • Kadyrov F.A.
      • Modrich P.
      PCNA function in the activation and strand direction of MutLα endonuclease in mismatch repair.
      ) have revealed that covalently closed circular DNA containing a lesion or a helix perturbation that MutSα recognizes stimulates PCNA loading and subsequently activates the MutLα endonuclease activity. Although the endonucleolytic cleavage by MutLα lacks strand bias in vitro, it has been proposed that in vivo interactions between PCNA and MutSα (
      • Kawasoe Y.
      • Tsurimoto T.
      • Nakagawa T.
      • Masukata H.
      • Takahashi T.
      MutSα maintains the mismatch repair capability by inhibiting PCNA unloading.
      ) and/or unidentified DNA signals confer strand specificity on the reaction. We therefore propose that, after H3K36me3 recruits MutSα to actively transcribed chromatin containing a DNA lesion, MutSα directly promotes repair of the DNA lesion, thereby mitigating the risk of locally higher mutation frequency in actively transcribed genes and ensuring both the stability of the transcribed genes and transcription accuracy. It is also possible that MutSα recruits other DNA repair proteins to DNA lesions in actively transcribed regions, similar to the process known as transcription-coupled nucleotide excision repair (
      • Mellon I.
      • Bohr V.A.
      • Smith C.A.
      • Hanawalt P.C.
      Preferential DNA repair of an active gene in human cells.
      ).
      We also show that mutation frequency is higher in introns than in exons in MMR-proficient HeLa cells (Fig. 2A), but the order of mutation frequencies in these regions is reversed when H3K36me3 or the H3K36me3–MSH6 interaction was depleted or disrupted (Fig. 5A), suggesting a critical role for MMR in maintaining exon stability. These results are in agreement with a recent bioinformatics study that analyzed the whole-genome expression and mutation data of several hundred MMR-proficient and MMR-deficient tumors deposited in the TCGA database. The data suggest that mismatches in exonic DNA are repaired more efficiently than their intronic counterparts in a manner that depends on H3K36me3-dependent MMR (
      • Frigola J.
      • Sabarinathan R.
      • Mularoni L.
      • Muiños F.
      • Gonzalez-Perez A.
      • López-Bigas N.
      Reduced mutation rate in exons due to differential mismatch repair.
      ). However, because many introns are transcribed together with their corresponding exons, how does the MMR system selectively repair DNA lesions in exons? Given that H3K36me3 is highly enriched in exons, Schwartz et al. (
      • Schwartz S.
      • Meshorer E.
      • Ast G.
      Chromatin organization marks exon-intron structure.
      ) have proposed that H3K36me3 levels define exon–intron boundaries during mRNA synthesis and H3K36me3's preferential recruitment of MutSα to exon-containing nucleosomes allows MMR to specifically target exons. Based on their bioinformatics analysis, Frigola et al. (
      • Frigola J.
      • Sabarinathan R.
      • Mularoni L.
      • Muiños F.
      • Gonzalez-Perez A.
      • López-Bigas N.
      Reduced mutation rate in exons due to differential mismatch repair.
      ) suggest that a cross-talk between the RNA splicing machinery and MMR could account for the differential mutation frequencies in exons and introns. However, further studies are required to elucidate the detailed mechanism by which H3K36me3 and MutSα promote stability in actively transcribed genomic regions.

      Experimental procedures

      Cell lines and cell culture

      SETD2 and MSH6 knockout HeLa (SETD2-KO and MSH6-KO) and PWWP-mutant HeLa (PWWP-MT1/2) cells were created using CRISPR–Cas9 genome-editing technology with small guide RNAs targeting the third and second exons of SETD2 and MSH6, respectively, as described previously (
      • Cong L.
      • Ran F.A.
      • Cox D.
      • Lin S.
      • Barretto R.
      • Habib N.
      • Hsu P.D.
      • Wu X.
      • Jiang W.
      • Marraffini L.A.
      • Zhang F.
      Multiplex genome engineering using CRISPR/Cas systems.
      ). The PWWP-mutant cells are by-products of an MSH6 knockout that deleted a fragment (PWWP-MT1, 112–120–amino acid deletion; PWWP-MT2, 122–128–amino acid deletion) in the MSH6 PWWP domain without disrupting the whole protein. Cells were grown in Dulbecco's modified Eagle's medium (Invitrogen) supplemented with 10% fetal bovine serum.

      Cell synchronization and cell cycle analysis

      We synchronized cells as described (
      • Li F.
      • Mao G.
      • Tong D.
      • Huang J.
      • Gu L.
      • Yang W.
      • Li G.M.
      The histone mark H3K36me3 regulates human DNA mismatch repair through its interaction with MutSα.
      ) and arrested cells at G1/S by culturing for 18 h in complete medium with 2 mM thymidine, 10 h in thymidine-free medium, and then 15 h in thymidine-containing medium. We harvested early S phase cells by releasing G1/S cells into thymidine-free medium for 1.5 h and confirmed cell cycle status by flow cytometry.

      Western blotting and chromatin fractionation

      We performed Western blotting analyses using antibodies against SETD2 (A3194, ABclonal), MSH6 (sc-10798, Santa Cruz Biotechnology), H3 (ab12079, Abcam), H3K36me3 (ab9050, Abcam) or β-actin (BE0033, EasyBio), according to the manufacturer's recommendations.
      SETD2-KO, PWWP-MT, and control HeLa cells were fractionated as described (
      • Daikoku T.
      • Kudoh A.
      • Sugaya Y.
      • Iwahori S.
      • Shirata N.
      • Isomura H.
      • Tsurumi T.
      Postreplicative mismatch repair factors are recruited to Epstein-Barr virus replication compartments.
      ) with some modifications. Briefly, cells were lysed for 10 min on ice in 1 ml of ice-cold 0.5% Triton X-100/mCSK buffer containing multiple protease inhibitors and 200 μM Na3VO4. Cell lysates were subjected to centrifugation (2000 × g, 3 min) to obtain nuclear fractions. Nuclei were digested with 250 units/ml DNase I in 0.1% Triton X-100/mCSK containing 2 mM MgCl2 and 1 mM MnCl2 at room temperature for 20 min. The samples were centrifuged (2000 × g, 3 min), and the supernatant fraction, which contains solubilized chromatin, was saved.

      Microscopy and immunofluorescence analysis

      We performed immunofluorescence analysis as described (
      • Li F.
      • Mao G.
      • Tong D.
      • Huang J.
      • Gu L.
      • Yang W.
      • Li G.M.
      The histone mark H3K36me3 regulates human DNA mismatch repair through its interaction with MutSα.
      ) with antibodies against H3K36me3 (ab9050, Abcam) and MSH6 (610968, BD Biosciences). Fluorescence images were obtained using an Axio Observer Z1 microscope (Carl Zeiss MicroImaging), and images were processed and analyzed using ImageJ software.

      ChIP and ChIP-Seq analysis

      We performed ChIP using extracts from 2 × 10⁷ cells and 4 μg of ChIP-grade H3K36me3 (ab9050, Abcam) or MSH6 (sc-10798, Santa Cruz Biotechnology) antibody, according to the native ChIP (N-ChIP) protocol, as described previously (
      • Méndez J.
      • Stillman B.
      Chromatin association of human origin recognition complex, cdc6, and minichromosome maintenance proteins during the cell cycle: assembly of prereplication complexes in late mitosis.
      ). For MSH6 ChIP, we first cross-linked cells with 1% formaldehyde at room temperature for 10 min before micrococcal nuclease digestion. We kept 5% of the chromatin fragments digested by micrococcal nuclease as input. Both the input and ChIP products were deproteinized using phenol/chloroform extraction, followed by sequencing library construction and sequencing in single-end 50-bp (SE50) mode using the Illumina Next-Generation-Sequencing platform.
      We subjected the sequencing reads to the standard quality control pipeline and mapped clean reads to a human reference genome (UCSC hg19) by the short oligonucleotide alignment package (SOAP2.1) with default parameters (
      • Li R.
      • Yu C.
      • Li Y.
      • Lam T.W.
      • Yiu S.M.
      • Kristiansen K.
      • Wang J.
      SOAP2: an improved ultrafast tool for short read alignment.
      ). Reads mapping to more than one position in the genome were filtered out. Multiple reads mapping to the same position were counted once to avoid potential PCR bias. We used the RSEG pipeline (
      • Song Q.
      • Smith A.D.
      Identifying dispersed epigenomic domains from ChIP-Seq data.
      ) to identify H3K36me3- and MSH6-enriched regions and ChIP-Seq intensities. RSEG models the read counts with a negative binomial distribution and subsequently uses a two-state hidden Markov model (HMM) to segment the genome into foreground domains and background domains. To identify H3K36me3- and MSH6-enriched domains, we used sequencing reads from DNA fragments without enrichment by any antibody (input) as background and compared them with reads from ChIP-ed samples under rseg-diff mode-2 with default settings. We computed the bin size based on total read counts and the effective genome size. We considered a bin enriched if its posterior probability obtained by HMM decoding was larger than 0.95 (false discovery rate <5%) and if the mean read count within the region was above the 90th percentile of the foreground emission distribution. Finally, we merged adjacent enriched bins into enriched domains. We used the domain score (≥10), which measures both the quality and size of the domain, as a filter parameter to keep only high-confidence domains.
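
The final merging step can be illustrated with a minimal Python sketch. This is not RSEG's implementation: the function name `merge_enriched_bins`, the `(chrom, start, end)` half-open tuple layout, and the example coordinates are assumptions made purely for illustration. Bins are treated as adjacent when one ends where the next begins on the same chromosome.

```python
def merge_enriched_bins(bins):
    """Merge adjacent (touching or overlapping) enriched bins into domains.

    `bins` is an iterable of (chrom, start, end) half-open intervals that
    have already passed the enrichment filters (hypothetical input layout).
    """
    domains = []
    for chrom, start, end in sorted(bins):
        if domains and domains[-1][0] == chrom and domains[-1][2] >= start:
            # Same chromosome and no gap to the previous domain: extend it.
            last = domains[-1]
            domains[-1] = (chrom, last[1], max(last[2], end))
        else:
            # Gap or new chromosome: start a new domain.
            domains.append((chrom, start, end))
    return domains


# Made-up coordinates, for illustration only.
example_bins = [
    ("chr1", 0, 1000),
    ("chr1", 1000, 2000),
    ("chr1", 5000, 6000),
    ("chr2", 0, 1000),
]
merged = merge_enriched_bins(example_bins)
```

In this toy input the first two bins touch and collapse into one domain, whereas the third bin on chr1 and the bin on chr2 remain separate domains.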

      Whole-genome sequencing analysis

      We performed whole-genome sequencing using the Illumina HiSeq 2500 (San Diego) platform in PE150 (paired-end 150 bp) mode with an average 30× depth for each base. We processed raw image files using the Illumina Pipeline (version 1.3.4) for base-calling with default parameters. The clean reads were aligned to the HeLa genome reference (
      • Landry J.J.
      • Pyl P.T.
      • Rausch T.
      • Zichner T.
      • Tekkedil M.M.
      • Stütz A.M.
      • Jauch A.
      • Aiyar R.S.
      • Pau G.
      • Delhomme N.
      • Gagneur J.
      • Korbel J.O.
      • Huber W.
      • Steinmetz L.M.
      The genomic and transcriptomic landscape of a HeLa cell line.
      ) using the Burrows-Wheeler Aligner (BWA) alignment pipeline with default parameters (
      • Li H.
      • Durbin R.
      Fast and accurate short read alignment with Burrows-Wheeler transform.
      ). We then processed the BAM files derived from BWA alignment by the Genome Analysis ToolKit (GATK version 1.6) (
      • McKenna A.
      • Hanna M.
      • Banks E.
      • Sivachenko A.
      • Cibulskis K.
      • Kernytsky A.
      • Garimella K.
      • Altshuler D.
      • Gabriel S.
      • Daly M.
      • DePristo M.A.
      The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data.
      ) to perform re-alignment around known indel sites. We subjected all aligned reads to GATK count covariates based on known SNVs (dbSNP135) and recalibrated base quality using GATK table recalibration. Both uniquely mapped and multiply mapped reads were used to identify variations. Multiply mapped reads with one mismatch (mismatch = 1) were randomly assigned to one site for each mismatch. We used the GATK unified genotyper to identify genome-wide SNVs and indels with default parameters and filtered variations by quality control.
      To elucidate the accumulated de novo mutations in SETD2-KO, MSH6-KO, PWWP-MT, and WT HeLa cells, we used an experimental evolution strategy, as described previously (
      • Sun L.
      • Zhang Y.
      • Zhang Z.
      • Zheng Y.
      • Du L.
      • Zhu B.
      Preferential protection of genetic fidelity within open chromatin by the mismatch repair machinery.
      ). All cells were cultured for 100 generations (P100) and then subjected to WGS analysis, as described above. We regarded all the cells (SETD2-KO P100, MSH6-KO P100, and PWWP-MT P100 and two replicates of HeLa P100 and HeLa P0) as a uniform population to discard the common mutations identified in all the samples. Then, we compared the P100 data to the HeLa P0 data to identify de novo mutations that occurred only in P100 cells but not in P0 cells. We used mutations in HeLa P0 cells to analyze the H3K36me3/MSH6–mutation relationship under normal circumstances (Figs. 2–4). We used de novo mutations to address mutations that preferentially targeted actively transcribed genes in H3K36me3–MMR-deficient cells (Figs. 5 and 6, D and E). The sequencing data are accessible in NCBI with SRA accession no. SRP128160.
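
The filtering logic for de novo mutations can be sketched with Python sets. This is only an illustration under stated assumptions: variants are represented as hypothetical `(chrom, pos, ref, alt)` tuples, the sample names and calls are made up, and the function `de_novo_mutations` is not part of the pipeline used in this study.

```python
def de_novo_mutations(p100_calls, p0_calls):
    """Return, per P100 sample, the variants that arose after P0.

    Both arguments map a sample name to a set of (chrom, pos, ref, alt)
    tuples (hypothetical layout).
    """
    all_samples = list(p100_calls.values()) + list(p0_calls.values())
    # Variants shared by every sample are treated as common background.
    common = set.intersection(*all_samples)
    # Anything seen in any P0 sample predates the experiment.
    p0_union = set().union(*p0_calls.values())
    # `common` is a subset of `p0_union`; it is subtracted explicitly only
    # to mirror the two steps described in the text.
    return {name: calls - common - p0_union for name, calls in p100_calls.items()}


# Made-up variant calls, for illustration only.
v1 = ("chr1", 100, "C", "T")
v2 = ("chr2", 200, "G", "A")
v3 = ("chr3", 300, "T", "C")
v4 = ("chr4", 400, "A", "G")
p0 = {"HeLa_P0_rep1": {v1, v2}, "HeLa_P0_rep2": {v1, v2}}
p100 = {"SETD2-KO_P100": {v1, v2, v3}, "HeLa_P100": {v1, v4}}
result = de_novo_mutations(p100, p0)
```

Here `v1` is common to all samples and `v2` is already present at P0, so only `v3` and `v4` are reported as de novo in their respective P100 samples.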

      Mutational signature and strand bias analyses

      We performed mutational signature analysis as described (
      • Alexandrov L.B.
      • Nik-Zainal S.
      • Wedge D.C.
      • Aparicio S.A.
      • Behjati S.
      • Biankin A.V.
      • Bignell G.R.
      • Bolli N.
      • Borg A.
      • Børresen-Dale A.L.
      • Boyault S.
      • Burkhardt B.
      • Butler A.P.
      • Caldas C.
      • Davies H.R.
      • et al.
      Signatures of mutational processes in human cancer.
      ). We counted 96 substitution classifications defined by six substitution classes (C→A/G→T, C→G/G→C, C→T/G→A, T→A/A→T, T→C/A→G, and T→G/A→C) and analyzed the sequence context immediately 5′ and 3′ to the mutated nucleotide in the whole genome or in certain genomic regions. Then, we calculated the percentage of each classification and normalized to its corresponding trinucleotide frequency in the whole genome or in certain genomic regions.
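
As a rough illustration of the 96-class counting and normalization, the following Python sketch folds each substitution onto its pyrimidine-centered form and divides the percentage of each class by the relative frequency of its trinucleotide. The function names, the input layout, and the exact form of the normalization are assumptions; the text above does not specify these details, and the numbers are made up.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def classify(trinucleotide, alt):
    """Map a substitution to a pyrimidine-centered class such as 'A[C>T]G'."""
    if trinucleotide[1] in "AG":
        # Purine reference base: describe the event on the opposite strand.
        trinucleotide, alt = revcomp(trinucleotide), revcomp(alt)
    five, ref, three = trinucleotide
    return f"{five}[{ref}>{alt}]{three}"


def normalized_spectrum(mutations, trinucleotide_counts):
    """Percentage of each class divided by its trinucleotide frequency.

    `mutations` is a list of (reference trinucleotide, alternate base);
    `trinucleotide_counts` gives pyrimidine-centered trinucleotide counts
    in the region of interest (assumed to be folded over both strands).
    """
    counts = Counter(classify(tri, alt) for tri, alt in mutations)
    total_mutations = sum(counts.values())
    total_contexts = sum(trinucleotide_counts.values())
    spectrum = {}
    for cls, n in counts.items():
        context = cls[0] + cls[2] + cls[-1]  # e.g. 'A[C>T]G' -> 'ACG'
        percentage = 100.0 * n / total_mutations
        frequency = trinucleotide_counts[context] / total_contexts
        spectrum[cls] = percentage / frequency
    return spectrum


# Made-up input, for illustration only.
toy_mutations = [("ACG", "T"), ("CGT", "A"), ("TTA", "C")]
toy_contexts = {"ACG": 10, "TTA": 30}
toy_spectrum = normalized_spectrum(toy_mutations, toy_contexts)
```

Note that the G→A change in the context CGT is counted as `A[C>T]G`, the same class as the C→T change in ACG, which is how six substitution classes with 16 flanking contexts each yield 96 classes.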
      We analyzed mutational strand bias as described (
      • Haradhvala N.J.
      • Polak P.
      • Stojanov P.
      • Covington K.R.
      • Shinbrot E.
      • Hess J.M.
      • Rheinbay E.
      • Kim J.
      • Maruvka Y.E.
      • Braunstein L.Z.
      • Kamburov A.
      • Hanawalt P.C.
      • Wheeler D.A.
      • Koren A.
      • Lawrence M.S.
      • Getz G.
      Mutational strand asymmetries in cancer genomes reveal mechanisms of DNA damage and repair.
      ). We annotated genomic regions as tx(+) if they encoded genes on the reference strand and as tx(−) if they encoded genes on the complementary strand. For each of the six mutation types, we calculated the mutation frequencies of the pair of complementary substitutions (e.g. C→T and G→A) in both the tx(+) and tx(−) groups. Then, we calculated the strand asymmetry score as the log2 ratio of the mutation frequencies for the complementary substitutions (e.g. log2(C→T/G→A)) in the entire genome or in the tx(+) and tx(−) groups. The degree of strand bias increases as the absolute value of the strand asymmetry score increases.
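
The strand asymmetry score defined above can be written as a short Python sketch. The function name `strand_asymmetry_scores`, the dictionary layout, and the frequencies are illustrative assumptions, not values from this study.

```python
import math

# The six substitution types, each paired with its complementary substitution.
COMPLEMENTARY_PAIRS = [
    ("C>A", "G>T"), ("C>G", "G>C"), ("C>T", "G>A"),
    ("T>A", "A>T"), ("T>C", "A>G"), ("T>G", "A>C"),
]


def strand_asymmetry_scores(frequencies):
    """log2 ratio of mutation frequencies for each complementary pair.

    `frequencies` maps a substitution such as 'C>T' to its mutation
    frequency in one group of regions (e.g. tx(+) or tx(-)). Pairs with a
    missing or zero frequency are skipped because the ratio is undefined.
    """
    scores = {}
    for sub, comp in COMPLEMENTARY_PAIRS:
        f_sub = frequencies.get(sub, 0.0)
        f_comp = frequencies.get(comp, 0.0)
        if f_sub > 0 and f_comp > 0:
            scores[f"{sub}/{comp}"] = math.log2(f_sub / f_comp)
    return scores


# Made-up frequencies, for illustration only.
tx_plus = {"C>T": 4.0e-6, "G>A": 1.0e-6}
tx_minus = {"C>T": 1.0e-6, "G>A": 4.0e-6}
scores_plus = strand_asymmetry_scores(tx_plus)
scores_minus = strand_asymmetry_scores(tx_minus)
```

A transcription-associated bias would be expected to show scores of opposite sign in the tx(+) and tx(−) groups, as in this toy input, whereas a score near zero indicates no asymmetry.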

      Processing Repli-Seq data and HeLa transcriptome data

      We obtained Repli-Seq data from ENCODE for HeLa-S3 cells with accession no. ENCSR647UES (
      • ENCODE Project Consortium
      An integrated encyclopedia of DNA elements in the human genome.
      ). Signal intensities were averaged over 1-kb intervals throughout the genome. We downloaded HeLa-S3 RPKM data for hg19 annotated genes with UCSC accession no. wgEncodeEH000130 (
      • ENCODE Project Consortium
      An integrated encyclopedia of DNA elements in the human genome.
      ). These genes were stratified into groups by RPKM value, ranking from low to high.
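
One way to stratify genes by expression is to rank them by RPKM and cut the ranking into groups of nearly equal size. The sketch below assumes equal-sized groups and a caller-chosen number of groups; neither is specified in the text, and the gene names and RPKM values are placeholders.

```python
def stratify_by_rpkm(rpkm_by_gene, n_groups):
    """Split genes into `n_groups` groups ranked from low to high RPKM.

    Group sizes differ by at most one gene; any remainder goes to the
    lowest-expression groups (an arbitrary choice for this sketch).
    """
    ranked = sorted(rpkm_by_gene, key=rpkm_by_gene.get)
    size, extra = divmod(len(ranked), n_groups)
    groups, start = [], 0
    for i in range(n_groups):
        end = start + size + (1 if i < extra else 0)
        groups.append(ranked[start:end])
        start = end
    return groups


# Placeholder genes and values, for illustration only.
toy_rpkm = {"geneA": 0.0, "geneB": 0.1, "geneC": 20.0, "geneD": 150.0, "geneE": 300.0}
toy_groups = stratify_by_rpkm(toy_rpkm, 2)
```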

      Identifying transcribed and nontranscribed genomic regions in HeLa

      HeLa RNA-Seq reads were processed using a standard filter pipeline and aligned to hg19 using Bowtie2 (
      • Langmead B.
      • Salzberg S.L.
      Fast gapped-read alignment with Bowtie 2.
      ) with default parameters. The aligned reads were then assigned to H3K36me3 bins. Nontranscribed genomic regions were identified as bins with zero mapped reads; transcribed genomic regions were identified as bins with one or more mapped reads. Genomic regions encompassed by spliced reads (e.g. introns) were binned with transcribed genomic regions. As defined by these parameters, 952,358.472 kb of the genome was identified as transcribed, and 1,966,089.658 kb was identified as nontranscribed.
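
The binary classification of bins can be sketched as follows. The `(start, end, n_reads)` layout with half-open coordinates in bp, the function name, and the numbers are assumptions made for illustration. Read counts are assumed to already include regions spanned by spliced reads.

```python
def partition_bins(bin_read_counts):
    """Return (transcribed_kb, nontranscribed_kb) for a list of bins.

    Each bin is (start, end, n_reads) with half-open coordinates in bp.
    A bin with one or more reads counts as transcribed; a bin with zero
    reads counts as nontranscribed.
    """
    transcribed_bp = 0
    nontranscribed_bp = 0
    for start, end, n_reads in bin_read_counts:
        if n_reads >= 1:
            transcribed_bp += end - start
        else:
            nontranscribed_bp += end - start
    return transcribed_bp / 1000, nontranscribed_bp / 1000


# Made-up bins, for illustration only.
toy_bins = [(0, 1000, 5), (1000, 2500, 0), (2500, 3000, 1)]
transcribed_kb, nontranscribed_kb = partition_bins(toy_bins)
```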

      Oxidation-induced mutation analysis

      Cells from each of the four cell lines (SETD2-KO, MSH6-KO, PWWP-MT, and control HeLa cells) were plated onto two 60-mm dishes at 50% confluence. Cells in one dish were treated with 50 μM H2O2 for 6 days, and cells in the other dish were grown in H2O2-free medium. Genomic DNA was isolated and PCR-amplified using primers specific for three actively expressed genes (CALR, MYC, and TP53) and two inactively expressed genes (CD79A and GATA1). The PCR products were ∼700 bp long (Table S2). We subjected PCR products to TA cloning and randomly selected 20 positive colonies from each PCR for DNA-sequencing analysis. We aligned the sequences to the reference sequence to identify H2O2-induced mutations. Mutation percentage was calculated by dividing the number of mutation-containing colonies by the total number of colonies (n = 20), as shown in Table S2. Relative mutation frequency was calculated as shown in Table 1: relative mutation frequency = number of mutations/total sequencing length (20 × PCR length)/generations.
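
The two quantities defined above reduce to simple arithmetic, shown here as a Python sketch. The function names and every input number (mutation count, amplicon length, and number of generations) are hypothetical and are not taken from Table 1 or Table S2.

```python
def mutation_percentage(mutant_colonies, total_colonies=20):
    """Percentage of sequenced colonies that carry at least one mutation."""
    return 100.0 * mutant_colonies / total_colonies


def relative_mutation_frequency(n_mutations, pcr_length_bp, generations, n_colonies=20):
    """Mutations per sequenced base per generation.

    Total sequencing length is n_colonies x PCR length, as in the formula
    given in the text.
    """
    total_length = n_colonies * pcr_length_bp
    return n_mutations / total_length / generations


# Hypothetical inputs, for illustration only.
example_percentage = mutation_percentage(3)
example_frequency = relative_mutation_frequency(
    n_mutations=3, pcr_length_bp=700, generations=6
)
```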

      Statistical analysis

      All statistical analyses were performed with Welch's two-sample t test using R (version 3.1.2). Data were considered statistically significant if p values were less than 0.05 or 0.01, as indicated.
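
For readers unfamiliar with Welch's test, the following standard-library Python sketch computes the t statistic and the Welch–Satterthwaite degrees of freedom. It is an illustration, not the R code used in this study: it does not compute the p value, which additionally requires the t distribution, and the sample values are made up.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic and degrees of freedom for two samples.

    Unlike Student's t test, the two sample variances are not pooled, so
    the groups may have unequal variances and unequal sizes.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each sample mean (sample variance / n).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


# Made-up measurements, for illustration only.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6, 8, 10]
t_stat, dof = welch_t(group_a, group_b)
```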

      Author contributions

      Y. H. and G.-M. L. data curation; Y. H. formal analysis; Y. H. investigation; Y. H. writing-original draft; L. G. and G.-M. L. project administration; L. G. and G.-M. L. writing-review and editing; G.-M. L. conceptualization; G.-M. L. supervision; G.-M. L. funding acquisition.

      Acknowledgments

      We thank Dong Wang and Guangshuo Ou for making their bioinformatics resource and microscopy available for the sequencing data analysis and immunofluorescence experiments, respectively; the NCBI dbGaP (the Database of Genotypes and Phenotypes) committee for allowing us to use the HeLa genome database for this study; and Jonathan Feinberg for helpful comments.

      Supplementary Material

      References

        • Modrich P.
        • Lahue R.
        Mismatch repair in replication fidelity, genetic recombination, and cancer biology.
        Annu. Rev. Biochem. 1996; 65 (8811176): 101-133
        • Kolodner R.
        Biochemistry and genetics of eukaryotic mismatch repair.
        Genes Dev. 1996; 10 (8666228): 1433-1442
        • Kunkel T.A.
        • Erie D.A.
        Eukaryotic mismatch repair in relation to DNA replication.
        Annu. Rev. Genet. 2015; 49 (26436461): 291-313
        • Li G.M.
        Mechanisms and functions of DNA mismatch repair.
        Cell Res. 2008; 18 (18157157): 85-98
        • Constantin N.
        • Dzantiev L.
        • Kadyrov F.A.
        • Modrich P.
        Human mismatch repair: reconstitution of a nick-directed bidirectional reaction.
        J. Biol. Chem. 2005; 280 (16188885): 39752-39761
        • Zhang Y.
        • Yuan F.
        • Presnell S.R.
        • Tian K.
        • Gao Y.
        • Tomkinson A.E.
        • Gu L.
        • Li G.M.
        Reconstitution of 5′-directed human mismatch repair in a purified system.
        Cell. 2005; 122 (16143102): 693-705
        • Li F.
        • Mao G.
        • Tong D.
        • Huang J.
        • Gu L.
        • Yang W.
        • Li G.M.
        The histone mark H3K36me3 regulates human DNA mismatch repair through its interaction with MutSα.
        Cell. 2013; 153 (23622243): 590-600
        • Edmunds J.W.
        • Mahadevan L.C.
        • Clayton A.L.
        Dynamic histone H3 methylation during gene induction: HYPB/Setd2 mediates all H3K36 trimethylation.
        EMBO J. 2008; 27 (18157086): 406-420
        • Fang D.
        • Gan H.
        • Lee J.H.
        • Han J.
        • Wang Z.
        • Riester S.M.
        • Jin L.
        • Chen J.
        • Zhou H.
        • Wang J.
        • Zhang H.
        • Yang N.
        • Bradley E.W.
        • Ho T.H.
        • Rubin B.P.
        • et al.
        The histone H3.3K36M mutation reprograms the epigenome of chondroblastomas.
        Science. 2016; 352 (27229140): 1344-1348
        • Lu C.
        • Jain S.U.
        • Hoelper D.
        • Bechet D.
        • Molden R.C.
        • Ran L.
        • Murphy D.
        • Venneti S.
        • Hameed M.
        • Pawel B.R.
        • Wunder J.S.
        • Dickson B.C.
        • Lundgren S.M.
        • Jani K.S.
        • De Jay N.
        • et al.
        Histone H3K36 mutations promote sarcomagenesis through altered histone methylation landscape.
        Science. 2016; 352 (27174990): 844-849
        • Wagner E.J.
        • Carpenter P.B.
        Understanding the language of Lys36 methylation at histone H3.
        Nat. Rev. Mol. Cell Biol. 2012; 13 (22266761): 115-126
        • Mellon I.
        • Rajpal D.K.
        • Koi M.
        • Boland C.R.
        • Champe G.N.
        Transcription-coupled repair deficiency and mutations in human mismatch repair genes.
        Science. 1996; 272 (8614807): 557-560
        • Rochette P.J.
        • Bastien N.
        • McKay B.C.
        • Therrien J.P.
        • Drobetsky E.A.
        • Drouin R.
        Human cells bearing homozygous mutations in the DNA mismatch repair genes hMLH1 or hMSH2 are fully proficient in transcription-coupled nucleotide excision repair.
        Oncogene. 2002; 21 (12173044): 5743-5752
        • Sonneveld E.
        • Vrieling H.
        • Mullenders L.H.
        • van Hoffen A.
        Mouse mismatch repair gene Msh2 is not essential for transcription-coupled repair of UV-induced cyclobutane pyrimidine dimers.
        Oncogene. 2001; 20 (11313985): 538-541
        • Ni T.T.
        • Marsischky G.T.
        • Kolodner R.D.
        MSH2 and MSH6 are required for removal of adenine misincorporated opposite 8-oxo-guanine in S. cerevisiae.
        Mol. Cell. 1999; 4 (10518225): 439-444
        • Li G.M.
        The role of mismatch repair in DNA damage-induced apoptosis.
        Oncol. Res. 1999; 11 (10821533): 393-400
        • Song Q.
        • Smith A.D.
        Identifying dispersed epigenomic domains from ChIP-Seq data.
        Bioinformatics. 2011; 27 (21325299): 870-871
        • Vermeulen M.
        • Eberl H.C.
        • Matarese F.
        • Marks H.
        • Denissov S.
        • Butter F.
        • Lee K.K.
        • Olsen J.V.
        • Hyman A.A.
        • Stunnenberg H.G.
        • Mann M.
        Quantitative interaction proteomics and genome-wide profiling of epigenetic histone marks and their readers.
        Cell. 2010; 142 (20850016): 967-980
        • Barski A.
        • Cuddapah S.
        • Cui K.
        • Roh T.Y.
        • Schones D.E.
        • Wang Z.
        • Wei G.
        • Chepelev I.
        • Zhao K.
        High-resolution profiling of histone methylations in the human genome.
        Cell. 2007; 129 (17512414): 823-837
        • Woo Y.H.
        • Li W.H.
        DNA replication timing and selection shape the landscape of nucleotide variation in cancer genomes.
        Nat. Commun. 2012; 3 (22893128): 1004
        • Liu L.
        • De S.
        • Michor F.
        DNA replication timing and higher-order nuclear organization determine single-nucleotide substitution patterns in cancer genomes.
        Nat. Commun. 2013; 4 (23422670): 1502
        • Sima J.
        • Gilbert D.M.
        Complex correlations: replication timing and mutational landscapes during cancer and genome evolution.
        Curr. Opin. Genet. Dev. 2014; 25 (24598232): 93-100
        • Holmes Jr, J.
        • Clark S.
        • Modrich P.
        Strand-specific mismatch correction in nuclear extracts of human and Drosophila melanogaster cell lines.
        Proc. Natl. Acad. Sci. U.S.A. 1990; 87 (2116007): 5837-5841
        • Alexandrov L.B.
        • Nik-Zainal S.
        • Wedge D.C.
        • Aparicio S.A.
        • Behjati S.
        • Biankin A.V.
        • Bignell G.R.
        • Bolli N.
        • Borg A.
        • Børresen-Dale A.L.
        • Boyault S.
        • Burkhardt B.
        • Butler A.P.
        • Caldas C.
        • Davies H.R.
        • et al.
        Signatures of mutational processes in human cancer.
        Nature. 2013; 500 (23945592): 415-421
        • Jinks-Robertson S.
        • Bhagwat A.S.
        Transcription-associated mutagenesis.
        Annu. Rev. Genet. 2014; 48 (25251854): 341-359
        • Haradhvala N.J.
        • Polak P.
        • Stojanov P.
        • Covington K.R.
        • Shinbrot E.
        • Hess J.M.
        • Rheinbay E.
        • Kim J.
        • Maruvka Y.E.
        • Braunstein L.Z.
        • Kamburov A.
        • Hanawalt P.C.
        • Wheeler D.A.
        • Koren A.
        • Lawrence M.S.
        • Getz G.
        Mutational strand asymmetries in cancer genomes reveal mechanisms of DNA damage and repair.
        Cell. 2016; 164 (26806129): 538-549
        • Ikehata H.
        • Ono T.
        Significance of CpG methylation for solar UV-induced mutagenesis and carcinogenesis in skin.
        Photochem. Photobiol. 2007; 83 (16620158): 196-204
        • Pfeifer G.P.
        Mutagenesis at methylated CpG sequences.
        Curr. Top. Microbiol. Immunol. 2006; 301 (16570852): 259-281
        • Poulos R.C.
        • Olivier J.
        • Wong J.W.
        The interaction between cytosine methylation and processes of DNA replication and repair shape the mutational landscape of cancer genomes.
        Nucleic Acids Res. 2017; 45 (28531315): 7786-7795
        • Zheng C.L.
        • Wang N.J.
        • Chung J.
        • Moslehi H.
        • Sanborn J.Z.
        • Hur J.S.
        • Collisson E.A.
        • Vemula S.S.
        • Naujokas A.
        • Chiotti K.E.
        • Cheng J.B.
        • Fassihi H.
        • Blumberg A.J.
        • Bailey C.V.
        • Fudem G.M.
        • et al.
        Transcription restores DNA repair to heterochromatin, determining regional mutation rates in cancer genomes.
        Cell Rep. 2014; 9 (25456125): 1228-1234
        • Shah K.A.
        • McGinty R.J.
        • Egorova V.I.
        • Mirkin S.M.
        Coupling transcriptional state to large-scale repeat expansions in yeast.
        Cell Rep. 2014; 9 (25464841): 1594-1602
        • Hamperl S.
        • Cimprich K.A.
        The contribution of co-transcriptional RNA:DNA hybrid structures to DNA damage and genome instability.
        DNA Repair. 2014; 19 (24746923): 84-94
        • Costantino L.
        • Koshland D.
        The Yin and Yang of R-loop biology.
        Curr. Opin. Cell Biol. 2015; 34 (25938907): 39-45
        • Forbes S.A.
        • Bhamra G.
        • Bamford S.
        • Dawson E.
        • Kok C.
        • Clements J.
        • Menzies A.
        • Teague J.W.
        • Futreal P.A.
        • Stratton M.R.
        The catalogue of somatic mutations in cancer (COSMIC).
        Curr. Protoc. Hum. Genet. 2008; Chapter 10 (18428421): Unit 10.11
        • Chen K.
        • Chen Z.
        • Wu D.
        • Zhang L.
        • Lin X.
        • Su J.
        • Rodriguez B.
        • Xi Y.
        • Xia Z.
        • Chen X.
        • Shi X.
        • Wang Q.
        • Li W.
        Broad H3K4me3 is associated with increased transcription elongation and enhancer activity at tumor-suppressor genes.
        Nat. Genet. 2015; 47 (26301496): 1149-1157
        • Supek F.
        • Lehner B.
        Differential DNA mismatch repair underlies mutation rate variation across the human genome.
        Nature. 2015; 521 (25707793): 81-84
        • Supek F.
        • Lehner B.
        Clustered mutation signatures reveal that error-prone DNA repair targets mutations to active genes.
        Cell. 2017; 170 (28753428): 534-547
        • Supek F.
        • Miñana B.
        • Valcárcel J.
        • Gabaldón T.
        • Lehner B.
        Synonymous mutations frequently act as driver mutations in human cancers.
        Cell. 2014; 156 (24630730): 1324-1335
        • Mellon I.
        • Spivak G.
        • Hanawalt P.C.
        Selective removal of transcription-blocking DNA damage from the transcribed strand of the mammalian DHFR gene.
        Cell. 1987; 51 (3664636): 241-249
        • Hanawalt P.C.
        • Spivak G.
        Transcription-coupled DNA repair: two decades of progress and surprises.
        Nat. Rev. Mol. Cell Biol. 2008; 9 (19023283): 958-970
        • Mazurek A.
        • Berardini M.
        • Fishel R.
        Activation of human MutS homologs by 8-oxo-guanine DNA damage.
        J. Biol. Chem. 2002; 277 (11756455): 8260-8266
        • Mu D.
        • Tursun M.
        • Duckett D.R.
        • Drummond J.T.
        • Modrich P.
        • Sancar A.
        Recognition and repair of compound DNA lesions (base damage and mismatch) by human mismatch repair and excision repair systems.
        Mol. Cell. Biol. 1997; 17 (9001230): 760-769
        • Wang H.
        • Lawrence C.W.
        • Li G.M.
        • Hays J.B.
        Specific binding of human MSH2.MSH6 mismatch-repair protein heterodimers to DNA incorporating thymine- or uracil-containing UV light photoproducts opposite mismatched bases.
        J. Biol. Chem. 1999; 274 (10358035): 16894-16900
        • Gu L.
        • Wu J.
        • Qiu L.
        • Jennings C.D.
        • Li G.M.
        Involvement of DNA mismatch repair in folate deficiency-induced apoptosis.
        J. Nutr. Biochem. 2002; 13 (12088801): 355-363
        • Hiratani I.
        • Ryba T.
        • Itoh M.
        • Rathjen J.
        • Kulik M.
        • Papp B.
        • Fussner E.
        • Bazett-Jones D.P.
        • Plath K.
        • Dalton S.
        • Rathjen P.D.
        • Gilbert D.M.
        Genome-wide dynamics of replication timing revealed by in vitro models of mouse embryogenesis.
        Genome Res. 2010; 20 (19952138): 155-169
        • Hiratani I.
        • Ryba T.
        • Itoh M.
        • Yokochi T.
        • Schwaiger M.
        • Chang C.W.
        • Lyou Y.
        • Townes T.M.
        • Schübeler D.
        • Gilbert D.M.
        Global reorganization of replication domains during embryonic stem cell differentiation.
        PLoS Biol. 2008; 6 (18842067): e245
        • Lubelsky Y.
        • Prinz J.A.
        • DeNapoli L.
        • Li Y.
        • Belsky J.A.
        • MacAlpine D.M.
        DNA replication and transcription programs respond to the same chromatin cues.
        Genome Res. 2014; 24 (24985913): 1102-1114
        • Frigola J.
        • Sabarinathan R.
        • Mularoni L.
        • Muiños F.
        • Gonzalez-Perez A.
        • López-Bigas N.
        Reduced mutation rate in exons due to differential mismatch repair.
        Nat. Genet. 2017; 49 (29106418): 1684-1692
        • Rodriguez G.P.
        • Romanova N.V.
        • Bao G.
        • Rouf N.C.
        • Kow Y.W.
        • Crouse G.F.
        Mismatch repair-dependent mutagenesis in nondividing cells.
        Proc. Natl. Acad. Sci. U.S.A. 2012; 109 (22474380): 6153-6158
        • Crouse G.F.
        Non-canonical actions of mismatch repair.
        DNA Repair. 2016; 38 (26698648): 102-109
        • Peña-Diaz J.
        • Bregenhorn S.
        • Ghodgaonkar M.
        • Follonier C.
        • Artola-Borán M.
        • Castor D.
        • Lopes M.
        • Sartori A.A.
        • Jiricny J.
        Noncanonical mismatch repair as a source of genomic instability in human cells.
        Mol. Cell. 2012; 47 (22864113): 669-680
        • Kadyrov F.A.
        • Dzantiev L.
        • Constantin N.
        • Modrich P.
        Endonucleolytic function of MutLα in human mismatch repair.
        Cell. 2006; 126 (16873062): 297-308
        • Pluciennik A.
        • Burdett V.
        • Baitinger C.
        • Iyer R.R.
        • Shi K.
        • Modrich P.
        Extrahelical (CAG)/(CTG) triplet repeat elements support proliferating cell nuclear antigen loading and MutLα endonuclease activation.
        Proc. Natl. Acad. Sci. U.S.A. 2013; 110 (23840062): 12277-12282
        • Pluciennik A.
        • Dzantiev L.
        • Iyer R.R.
        • Constantin N.
        • Kadyrov F.A.
        • Modrich P.
        PCNA function in the activation and strand direction of MutLα endonuclease in mismatch repair.
        Proc. Natl. Acad. Sci. U.S.A. 2010; 107 (20713735): 16066-16071
        • Kawasoe Y.
        • Tsurimoto T.
        • Nakagawa T.
        • Masukata H.
        • Takahashi T.
        MutSα maintains the mismatch repair capability by inhibiting PCNA unloading.
        Elife. 2016; 5 (27402201): e15155
        • Mellon I.
        • Bohr V.A.
        • Smith C.A.
        • Hanawalt P.C.
        Preferential DNA repair of an active gene in human cells.
        Proc. Natl. Acad. Sci. U.S.A. 1986; 83 (3466163): 8878-8882
        • Schwartz S.
        • Meshorer E.
        • Ast G.
        Chromatin organization marks exon-intron structure.
        Nat. Struct. Mol. Biol. 2009; 16 (19684600): 990-995
        • Cong L.
        • Ran F.A.
        • Cox D.
        • Lin S.
        • Barretto R.
        • Habib N.
        • Hsu P.D.
        • Wu X.
        • Jiang W.
        • Marraffini L.A.
        • Zhang F.
        Multiplex genome engineering using CRISPR/Cas systems.
        Science. 2013; 339 (23287718): 819-823
        • Daikoku T.
        • Kudoh A.
        • Sugaya Y.
        • Iwahori S.
        • Shirata N.
        • Isomura H.
        • Tsurumi T.
        Postreplicative mismatch repair factors are recruited to Epstein-Barr virus replication compartments.
        J. Biol. Chem. 2006; 281 (16510450): 11422-11430
        • Méndez J.
        • Stillman B.
        Chromatin association of human origin recognition complex, cdc6, and minichromosome maintenance proteins during the cell cycle: assembly of prereplication complexes in late mitosis.
        Mol. Cell. Biol. 2000; 20 (11046155): 8602-8612
        • Li R.
        • Yu C.
        • Li Y.
        • Lam T.W.
        • Yiu S.M.
        • Kristiansen K.
        • Wang J.
        SOAP2: an improved ultrafast tool for short read alignment.
        Bioinformatics. 2009; 25 (19497933): 1966-1967
        • Landry J.J.
        • Pyl P.T.
        • Rausch T.
        • Zichner T.
        • Tekkedil M.M.
        • Stütz A.M.
        • Jauch A.
        • Aiyar R.S.
        • Pau G.
        • Delhomme N.
        • Gagneur J.
        • Korbel J.O.
        • Huber W.
        • Steinmetz L.M.
        The genomic and transcriptomic landscape of a HeLa cell line.
        G3. 2013; 3 (23550136): 1213-1224
        • Li H.
        • Durbin R.
        Fast and accurate short read alignment with Burrows-Wheeler transform.
        Bioinformatics. 2009; 25 (19451168): 1754-1760
        • McKenna A.
        • Hanna M.
        • Banks E.
        • Sivachenko A.
        • Cibulskis K.
        • Kernytsky A.
        • Garimella K.
        • Altshuler D.
        • Gabriel S.
        • Daly M.
        • DePristo M.A.
        The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data.
        Genome Res. 2010; 20 (20644199): 1297-1303
        • Sun L.
        • Zhang Y.
        • Zhang Z.
        • Zheng Y.
        • Du L.
        • Zhu B.
        Preferential protection of genetic fidelity within open chromatin by the mismatch repair machinery.
        J. Biol. Chem. 2016; 291 (27382058): 17692-17705
        • ENCODE Project Consortium
        An integrated encyclopedia of DNA elements in the human genome.
        Nature. 2012; 489 (22955616): 57-74
        • Langmead B.
        • Salzberg S.L.
        Fast gapped-read alignment with Bowtie 2.
        Nat. Methods. 2012; 9 (22388286): 357-359