Drosophila, which lacks canonical transcription-coupled repair proteins, performs transcription-coupled repair

Previous work using classic T4 endonuclease V digestion of DNA from irradiated Drosophila cells, followed by Southern hybridization, led to the conclusion that Drosophila lacks transcription-coupled repair (TCR). This conclusion was reinforced by the Drosophila Genome Project, which revealed that Drosophila lacks homologs of Cockayne syndrome WD repeat protein (CSA), CSB, and UV-stimulated scaffold protein A (UVSSA), which are present in TCR-proficient eukaryotes ranging from Arabidopsis to humans. A recently developed in vivo excision assay and the excision repair-sequencing (XR-Seq) method have enabled genome-wide analysis of nucleotide excision repair in various organisms at single-nucleotide resolution and in a strand-specific manner. Using these methods, we have discovered that Drosophila S2 cells carry out robust TCR comparable with that observed in mammalian cells. Our findings provide critical new insights into the mechanisms of TCR across species.

Nucleotide excision repair (excision repair) is the main mechanism used by many organisms to remove cyclobutane pyrimidine dimers (CPDs) that are induced in DNA by UV light (1,2). The mechanism of excision repair has been investigated in considerable detail in several model organisms including Escherichia coli, Saccharomyces cerevisiae, mice, humans, and more recently Arabidopsis thaliana (3,4). In contrast, although Drosophila melanogaster is an extensively used model organism that has contributed significantly to the development of modern views on ionizing radiation and chemical mutagenesis and to the development of double-strand break repair and recombination models (5), there is rather limited information on nucleotide excision repair in this organism.
Work on excision repair in Drosophila is of special interest because, of all of the model organisms studied, ranging from E. coli to humans, it is the only one that has been reported to lack transcription-coupled repair (TCR) (6-10). This conclusion was further supported by genomic analyses that revealed that the CSA and CSB proteins, which are necessary for TCR (11) and are present in organisms ranging from yeast (Rad28 and Rad26, respectively) to humans, are missing in Drosophila and other insects in the order Diptera (5,12,13). Considering the importance of TCR in mutation avoidance (14), the apparent absence of TCR in Drosophila is rather surprising. Therefore, we decided to investigate this phenomenon with new biochemical tools that have become available since the previous experimental studies on the subject nearly 3 decades ago.
In transcription-coupled repair, damage in the template (transcribed strand (TS)), but not the nontranscribed strand (NTS), causes arrest of RNA polymerase (RNAP). The arrested RNAP is recognized by Mfd in E. coli (15,16) and by CSB in eukaryotes (17). The bacterial TCR process is well-characterized and involves concerted removal of RNAP by the Mfd translocase and Mfd-mediated delivery of repair proteins, resulting in repair at an accelerated rate compared with repair of the NTS (15). In contrast, TCR is not well-understood in higher organisms. The mechanism is unlike that in E. coli, because the CSB translocase does not remove the stalled RNAP (18). Nevertheless, CSB, along with CSA, a WD repeat-containing protein with ubiquitin ligase activity (19), somehow enables the eukaryotic basal repair proteins (which are not homologous to proteins in E. coli) to produce a commonly 3-10-fold accelerated rate of TS repair, which varies depending upon the level of transcription (14,16,20).
In the classic TCR assay (14), UV-irradiated cells are incubated for various times to allow repair and then harvested. Genomic DNA is isolated and digested with appropriate restriction enzymes, and then duplicate aliquots from each time point are incubated with or without T4 endonuclease V, which incises sites of CPDs. Following separation of the DNA on a denaturing agarose gel, the gel is probed with gene-specific and strand-specific probes. The level of damage at each time point is calculated from the decrease in full-length fragment caused by T4 endonuclease. The loss of damage with time represents repair. Although this method has served to characterize essential properties of strand-specific repair, by its very nature, it is low-resolution and has the potential of missing TCR because it is an indirect method that is based on subtracting two large numbers from one another and because it is limited to analyzing one gene at a time. In contrast, the recently developed XR-Seq (excision repair-sequencing) assay directly captures and identifies the excised oligomers, and thus it directly and simultaneously measures repair throughout the genome (4,20). Using XR-Seq, we find that Drosophila is proficient at TCR despite the previous reports and the absence of CSA and CSB in this organism.

This work was supported by National Institutes of Health Grants GM118102 and ES027255 (to A. S.). The authors declare that they have no conflicts of interest with the contents of this article. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
The raw data and alignment data have been deposited in the Gene Expression Omnibus under accession numbers GSE76391, GSE67941, and GSE138846.
1 Both authors contributed equally to this work.
2 Supported by the Turkish Fulbright Commission.
3 To whom correspondence may be addressed. E-mail: aziz_sancar@med.unc.edu.
4 To whom correspondence may be addressed. E-mail: cselby@med.unc.edu.
5 The abbreviations used are: CPD, cyclobutane pyrimidine dimer; TCR, transcription-coupled repair; XR-Seq, excision repair sequencing; TS, transcribed strand or template strand; NTS, nontranscribed strand or coding strand; NGS, next-generation sequencing; TSS, transcription start site; TES, transcription end site; CSA and CSB, Cockayne Syndrome group A and B, respectively; RPKM, reads per kbp per million total reads; RNAP, RNA polymerase; nt, nucleotide(s); UVSSA, UV-stimulated scaffold protein A.

ACCELERATED COMMUNICATION

Excision by dual incision in Drosophila
Drosophila S2 cells were irradiated with 20 J m⁻² of 254-nm UV light and incubated for various times at 27°C. Then cells were lysed, and the low-molecular weight DNA was separated from genomic DNA by the Hirt procedure (4). After immunoprecipitation with anti-CPD antibodies, the samples were mixed with an internal control 50-mer DNA oligonucleotide, 3′-end-labeled, and separated on a sequencing gel. As seen in Fig. 1A, excision products in the range of 24-30 nt were detected with a median of 27-28 nt.
Irradiated S2 cells were also processed for XR-Seq, in which the excision products are isolated, repaired by CPD photolyase, sequenced by next-generation sequencing (NGS), and aligned to the genome to create a map of the entire genome showing sites of repair in both strands at single-nucleotide resolution. The lengths of the sequenced excision products (reads) recovered after 10 min of repair follow the frequency distribution shown in Fig. 1B, which peaks at 27 and 28 nt, in agreement with the results in Fig. 1A. Reads for products 26, 27, 28, and 29 nt in length were individually analyzed for nucleotide distribution as shown in Fig. 1C. The results show enrichment of pyrimidines 7-9 nt from the 3′-end. Thymines are more common than cytosines, consistent with the prevalence of cyclobutane thymine dimer photoproducts produced by UV. Thus, the location of the dual incisions made by the excision nuclease with respect to the CPD in S2 cells is essentially the same as in humans (3,20). This finding is not surprising, considering that Drosophila possesses the complete set of orthologs to the basal excision repair factors present in mammals (5). It is worth noting that in human cells, it has been shown that the same excision pattern is produced by the transcription-coupled and global repair pathways (20).
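The two read-level summaries described above (the length distribution of Fig. 1B and the per-position nucleotide composition of Fig. 1C) can be sketched as follows. This is a minimal illustration, not the pipeline used in this study: the function names and the toy reads are hypothetical, and the toy reads are shortened to 11-12 nt for readability, whereas real excision products are about 26-29 nt.

```python
from collections import Counter


def length_distribution(reads):
    """Return {read length: fraction of reads} for a list of read sequences."""
    counts = Counter(len(r) for r in reads)
    total = sum(counts.values())
    return {length: n / total for length, n in sorted(counts.items())}


def base_frequency_from_3prime(reads, length):
    """Per-position base frequencies for reads of one given length.

    Element 0 of the returned list describes the 3'-terminal nucleotide,
    element 1 the nucleotide next to it, and so on toward the 5' end.
    """
    selected = [r.upper() for r in reads if len(r) == length]
    if not selected:
        return []
    profile = []
    for pos in range(1, length + 1):
        # r[-pos] is the nucleotide at distance `pos` from the 3' end.
        counts = Counter(r[-pos] for r in selected)
        profile.append({b: counts.get(b, 0) / len(selected) for b in "ACGT"})
    return profile


# Toy reads, made up for illustration only (not real XR-Seq data).
toy_reads = ["ACGATTGCAGCA", "GGCATTACGTGA", "CATTGCAGCAG"]

if __name__ == "__main__":
    print(length_distribution(toy_reads))
    for i, freqs in enumerate(base_frequency_from_3prime(toy_reads, 12), start=1):
        print(f"position {i} from 3' end: {freqs}")
```

In the toy 12-nt reads, the TT dinucleotide sits at positions 7 and 8 from the 3′ end, which is the kind of pyrimidine enrichment the text describes for the real data.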

Processing of the excised oligomers
In humans (20,21), yeast (22), and Arabidopsis (23,24), the CPD-containing oligomers are rapidly processed by exonuclease(s) such that the primary excision product has a half-life of about 1 h. As repair progresses, two populations of excision products, one ranging from 24 to 32 nt and centering at 26-27 nt and a second in the range of 15-20 nt and centering at 17-18 nt, are observed. These products as produced in NHF1 cells are illustrated in Fig. 2A. The 15-20-nt degradation products can be seen even at the earliest time point of 30 min and become prominent at 2 h. Thereafter, both the primary and secondary excision products detectable by radiolabeling decline, presumably due to low recovery and labeling efficiency of smaller degraded CPD-containing oligonucleotides (not shown). A comparison of the fate of excised oligomers in human and Drosophila cells (Fig. 2B) reveals an interesting contrast; in S2 cells, no degradation products are detectable in the first 2 h and, in fact, up to at least 12 h (data not shown). This indicates that Drosophila either lacks a nuclease that is present in organisms ranging from yeast to Arabidopsis to humans and that generates a similar degradation pattern in all of these other species, or this degradation intermediate is not stabilized in Drosophila as it is in other organisms.

Transcription-coupled repair in Drosophila
Previous work using strand-specific Southern hybridization did not detect a difference in the rates of repair of TS and NTS in individual Drosophila genes (6-8), and it was concluded that Drosophila does not perform TCR. This notion was further strengthened by genome analyses, which revealed that Drosophila and other Dipteran insects lacked CSA and CSB (5,12,13) and UVSSA (25) orthologs. However, it was also pointed out that in the absence of TCR, RNAPII stalled at damage sites in the TS would be expected to interfere with excision repair, as has been shown in E. coli (26), resulting in preferential repair of the NTS (5), which was not observed (6-10). To clarify these conflicting observations and theoretical expectations, we analyzed our XR-Seq data from S2 cells to obtain a global view of strand-specific repair of the Drosophila genome.
From our genomic repair maps of Drosophila, we initially focused on the betaTub60D and ade3 genes for TCR analysis because the lack of strand-specific repair in these genes, as determined by Southern blotting, was considered definitive evidence for lack of TCR in Drosophila (6-8). Fig. 3A shows screenshots of repair of the betaTub60D and ade3 genes in Drosophila as well as the human orthologs of these genes (TUBB3 and GART, respectively) in NHF1 and CSB cells, and Fig. 3B shows quantitative analysis of the TS and NTS repair of these genes. As is visually apparent from the screenshots, the TS of betaTub60D is repaired more efficiently than the NTS. A similar result is seen with repair of ade3. The quantitative data for TS and NTS reads in Fig. 3B clearly show that Drosophila performs TCR similar in magnitude to that seen in NHF1 human cells. A gene-by-gene survey of the genomic repair map (not shown) reveals varied levels of TCR in many but not all Drosophila genes, presumably reflecting varied levels or absence of transcription. For comparative purposes, Fig. 3 shows a screenshot and quantitation of repair for a Drosophila gene, CG10348, in which TCR is essentially absent.
Interestingly, in CSB cells (Fig. 3), there is not only a lack of TCR, but a preferential repair of the NTS in both the TUBB3 and GART genes. This result parallels the preferred NTS repair seen in E. coli cells lacking the TCR factor Mfd (27). This result in CSB cells is consistent with inhibition of repair by the stalled polymerase and merits more systematic investigation.
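The per-gene TS versus NTS comparisons discussed in this section can be sketched as below. The sketch assumes the usual convention that an XR-Seq read has the sequence of the excised (damaged) strand, so a read mapping to the strand opposite the annotated gene strand reports repair of the TS, and a read on the same strand reports repair of the NTS. The function names, the coordinate convention (0-based, half-open), and the toy numbers are hypothetical; RPKM is computed as defined in the abbreviations (reads per kbp per million total reads).

```python
def rpkm(read_count, length_bp, total_mapped_reads):
    """Reads per kbp per million total mapped reads."""
    return read_count * 1e9 / (length_bp * total_mapped_reads)


def strand_repair(gene, reads, total_mapped_reads):
    """Return TS and NTS repair (RPKM) for one gene.

    gene:  dict with 'start', 'end' (0-based, half-open) and 'strand' ('+'/'-')
    reads: iterable of (position, strand) pairs on the gene's chromosome
    """
    ts = nts = 0
    for pos, strand in reads:
        if not (gene["start"] <= pos < gene["end"]):
            continue  # read lies outside the gene body
        if strand == gene["strand"]:
            nts += 1  # same strand as the gene: coding (nontranscribed) strand
        else:
            ts += 1   # opposite strand: template (transcribed) strand
    length = gene["end"] - gene["start"]
    return {
        "TS": rpkm(ts, length, total_mapped_reads),
        "NTS": rpkm(nts, length, total_mapped_reads),
    }


# Toy example with made-up coordinates and counts.
toy_gene = {"start": 1000, "end": 3000, "strand": "+"}
toy_reads = [(1500, "-"), (1600, "-"), (2000, "-"), (2500, "+"), (500, "-"), (3000, "-")]

if __name__ == "__main__":
    result = strand_repair(toy_gene, toy_reads, total_mapped_reads=1_000_000)
    print(result, "TS/NTS =", result["TS"] / result["NTS"])
```

A TS/NTS ratio above 1 indicates preferential repair of the transcribed strand, while a ratio below 1 corresponds to the preferential NTS repair described for CSB cells.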

Genome-wide TCR in Drosophila
To examine TCR in a large sample of Drosophila genes, we analyzed the CPD repair of all nonoverlapping genes over 1 kbp (5,706 genes). The results are shown in Fig. 4A, with results for different repair time points for S2 cells in the top row, and corresponding time points for NHF1 cells (10,100 genes over 5 kbp) in the row below. Each panel illustrates the averaged repair from the transcription start site (TSS) to the transcription end site (TES) for all of the genes considered, plus the average repair 2 kbp upstream and downstream. As is clear from this figure, in both species, TCR across the gene bodies appears to have begun by 10 min, and TS repair is dominant at 30-36 min. The magnitudes of TCR in human and Drosophila cell lines are at comparable levels. At later time points, after the TS is cleared of damage, the preference shifts to the NTS. It takes longer to clear TS damage (over 8 h) in human cells, probably due to longer genes in humans. The robust TCR signal exhibited by S2 cells, which lack CS proteins, contrasts starkly with the complete absence of TCR in mutant human CSB cells, which are compared with NHF1 cells in Fig. 4B. In fact, the CSB cells exhibit a trend toward preferential NTS repair, as seen in the individual TUBB3 and GART genes (Fig. 3).
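The gene-selection step behind this analysis (genes above a minimum length, with no overlaps and a minimum distance to neighboring genes, as detailed under Data analysis) can be sketched as follows. This is an illustrative sketch only: the function name, the dictionary layout, and the toy gene list are hypothetical, coordinates are assumed to be 0-based and half-open, and it is assumed that neighbors are taken from the full annotation rather than from the length-filtered set.

```python
from collections import defaultdict


def select_genes(genes, min_length, min_gap):
    """Return names of genes longer than min_length whose distance to every
    other gene on the same chromosome is at least min_gap (so no overlaps)."""
    by_chrom = defaultdict(list)
    for g in genes:
        by_chrom[g["chrom"]].append(g)

    selected = []
    for chrom_genes in by_chrom.values():
        chrom_genes.sort(key=lambda g: g["start"])
        max_end_so_far = None  # furthest end among all genes seen so far
        for i, g in enumerate(chrom_genes):
            left_ok = max_end_so_far is None or g["start"] - max_end_so_far >= min_gap
            is_last = i + 1 == len(chrom_genes)
            right_ok = is_last or chrom_genes[i + 1]["start"] - g["end"] >= min_gap
            if left_ok and right_ok and g["end"] - g["start"] > min_length:
                selected.append(g["name"])
            max_end_so_far = g["end"] if max_end_so_far is None else max(max_end_so_far, g["end"])
    return selected


# Toy annotation with made-up coordinates.
toy_genes = [
    {"name": "a", "chrom": "2R", "start": 0, "end": 2000},
    {"name": "b", "chrom": "2R", "start": 2050, "end": 4000},   # only 50 bp from a
    {"name": "c", "chrom": "2R", "start": 5000, "end": 5500},   # too short
    {"name": "d", "chrom": "2R", "start": 6000, "end": 9000},
    {"name": "e", "chrom": "2R", "start": 9200, "end": 12000},
]

if __name__ == "__main__":
    # Thresholds stated in the text for Drosophila: >1 kbp and at least 100 bp apart.
    print(select_genes(toy_genes, min_length=1000, min_gap=100))
```

With the human thresholds given in the text, the same call would use `min_length=5000` and `min_gap=5000`.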
An interesting difference between the human and Drosophila repair profiles is seen in Fig. 4. As reported previously for NHF1 cells (20), and seen here at the 36-min and 1-h time points, there is a switch in strand preference for repair, from the TS within the gene body to the NTS immediately upstream of the TSS. This switch occurs because, upstream of the mammalian TSS, there is promoter- and enhancer-specific transcription that is "antisense" relative to the gene body (28). These "antisense" transcripts are not found in Arabidopsis (24), and our data suggest that they are absent in Drosophila as well. Instead, a dip in repair of both strands is observed immediately upstream of the TSS in Drosophila, presumably because of interference by transcription factors bound immediately upstream of coding sequences.

Discussion
Contrary to the view held for nearly 3 decades, we show that Drosophila performs TCR. In fact, the magnitude of the TS/NTS repair ratio in transcribed genes is comparable to the ratios of corresponding genes in human cells. We ascribe the previous failure to observe TCR in Drosophila partly to the limitation of the T4 endonuclease V digestion/Southern assay used in previous studies. An important additional consideration is the particular time points at which repair was analyzed. The earliest repair time point commonly used in the previous studies was 4 h post-UV treatment (6-10). TCR may be detected in mammalian cells following 4 h of repair (Fig. 4); however, in yeast, which has much smaller genes, TCR is complete within 1 h (22). It appears that a similar rapid kinetic profile in Drosophila, which also has relatively small genes, likely contributes to the discrepancy between our findings and previous studies.
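To make the indirect nature of the T4 endonuclease V/Southern assay concrete, the sketch below shows one common way such data are quantified. It assumes a Poisson distribution of CPDs among restriction fragments, so that the fraction of fragments left full-length after enzyme treatment estimates the fraction with zero lesions. This formula is an assumption introduced here for illustration and is not stated in the text above; the function names and band intensities are hypothetical.

```python
import math


def cpds_per_fragment(full_length_with_enzyme, full_length_without_enzyme):
    """Average CPDs per fragment under a Poisson assumption.

    The ratio of the full-length band intensity with and without
    T4 endonuclease V estimates the zero-lesion fraction, exp(-mean).
    """
    zero_class = full_length_with_enzyme / full_length_without_enzyme
    return -math.log(zero_class)


def percent_repair(lesions_at_time_zero, lesions_at_time_t):
    """Percent of the initial lesions removed by time t."""
    return 100.0 * (1.0 - lesions_at_time_t / lesions_at_time_zero)


if __name__ == "__main__":
    # Made-up band intensities (arbitrary units) for one strand of one gene.
    initial = cpds_per_fragment(40.0, 100.0)   # immediately after UV
    later = cpds_per_fragment(60.0, 100.0)     # after a repair period
    print(f"CPDs per fragment: {initial:.3f} -> {later:.3f}")
    print(f"repair: {percent_repair(initial, later):.1f}%")
```

Because repair is inferred from the difference between two lesion estimates, each derived from a ratio of band intensities, small measurement errors can mask a modest strand difference, whereas XR-Seq counts the excised oligomers directly.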
Our finding of TCR in Drosophila raises a new question: analysis of the Drosophila genome and the genomes of other Dipteran insects has failed to reveal CSA, CSB, or UVSSA orthologs, which are known to be essential for TCR in organisms ranging from plants to humans. The E. coli TCR factor Mfd, which has a different evolutionary history than CSB, exhibits considerable sequence homology to CSB in the ATPase/translocase domain of the protein. It is possible that another translocase with great sequence divergence from all known TCR translocases couples transcription to repair upon the encounter of RNAP with a transcription-blocking DNA lesion in Drosophila. In fact, in Caenorhabditis elegans, other chromatin-remodeling factors have been suggested to function in place of CSB for TCR during certain developmental stages (29). There is a need for further work to identify such potential transcription-coupled repair factors in Drosophila.

Figure 3. Genes analyzed include the betaTub60D gene from Drosophila and the human ortholog TUBB3, as well as the Drosophila ade3 gene and the human ortholog GART. Preferential repair in the TS as compared with the NTS of these genes is seen in the S2 and NHF1 cells but is absent in CSB cells, which in fact show the opposite preference, perhaps due to inhibition of TS repair by RNAP stalled at the damage. The Drosophila CG10348 gene is also analyzed in the bottom panels of A and B to illustrate an example of the absence of TCR. The y axis in each case is RPKM. The repair signal for Drosophila appears stronger because it has a much smaller genome and thus more reads per kbp per million reads. Time points are 30 min (S2 cells), 36 min (NHF1), and 1 h (CSB).
A passive mechanism for TCR has been suggested in which the stalled RNAPII reveals to the basal repair factors lesions that would otherwise be concealed by nucleosomes. This mechanism is based partly upon observations of inhibition of repair by nucleosomes, seen in vitro (30), and repair of transcription-blocking lesions unimpeded by the presence of the stalled RNAPII, also seen in vitro (11,31-33). This particular model as related to Drosophila is attractive in its independence of CS proteins. Further work is needed to clarify the involvement of CS proteins in TCR as well as in the developmental deficiencies associated with Cockayne syndrome (34).

Excision and XR-Seq assays
S2-DGRC cells were obtained from the Drosophila Genetic Resources Center. Cells were cultured at 27°C in Schneider's medium with heat-inactivated fetal bovine serum at 10% (v/v). For UV treatments, cells were inoculated into R-150 plates and grown to about 25-80% confluence. Medium was gently removed from the semi-adherent monolayer, cells were irradiated with UV-C, and then fresh medium containing sterilized, conditioned medium was added to cells, and cells were incubated at 27°C. Excision assays employed 20 J m⁻² (Fig. 1) or 10 J m⁻² (Fig. 2), and XR-Seq assays employed 20 J m⁻². Following predetermined repair times, plates were placed on ice, and cooled cells were harvested by scraping and rinsing with ice-cold PBS. Cells were pelleted, transferred to Eppendorf tubes, washed with cold PBS, and resuspended in 320-340 µl of cold TE. Samples were processed for excision and XR-Seq assays as described (27), using 5 µl of RNase A and proteinase K and using 9.2% sequencing gels. For both assays, samples were immunoprecipitated with anti-CPD antibody and then either radiolabeled (excision assay) or ligated to adapters and processed for sequencing (XR-Seq). Approximately 13% of the input CPD-containing excision products are recovered using this standard excision assay/XR-Seq immunoprecipitation procedure. One plate of S2 cells per repair time point was sufficient for excision or XR-Seq assay. Drosophila possesses photolyases, so cells were kept in the dark or under dim yellow illumination from the time of irradiation to the time of addition of NaCl to cell lysates. XR-Seq of NHF1 (normal human fibroblast) cells was performed as described previously (4). XR-Seq analysis of CSB cell repair utilized data deposited in a prior study (20).

Figure 4. XR-Seq data from the indicated time points are plotted as average repair reads RPKM (y axis) along the length of a "unit gene." For constructing a unit gene, a Drosophila data set of 5,706 genes was selected to include all genes >1 kbp with no genes that overlap or have a distance of less than 100 bp between adjacent genes. The human gene set includes 10,100 genes of >5 kbp in length with no overlaps and with a distance of at least 5 kbp between genes. For each species, the "unit gene" is 100 bins in length, and values for average repair were obtained by dividing each gene into 100 bins and averaging the repair values for each successive bin from 1 to 100. B, comparison of WT human NHF1 cells with cells from a patient with CSB. Data were analyzed as in A.

Data analysis
At least 6 million unique mapped reads were obtained for each sample. Analysis of sequencing reads and data visualization were as described previously (35). The alignment genome version was dm6_UCSC for S2 cells and hg38_UCSC for human cell lines. For plotting average repair profiles as a unit gene, we chose the genes with length >1 kbp for Drosophila and >5 kbp for Homo sapiens, and the distance between genes was at least 100 bp for Drosophila and 5 kbp for Homo sapiens. With these criteria, the total number of genes selected was 5,706 for Drosophila and 10,100 for Homo sapiens. Each gene was evenly divided into 100 bins from the TSS to the TES, and the 2 kbp upstream of the TSS and the 2 kbp downstream of the TES were each divided into 25 bins. For each bin, from first to last, an average value over all of the selected genes was obtained and plotted. The y axis, average reads per kbp per million total reads (RPKM) for each bin, was plotted with R. The raw data and alignment data have been deposited in the Gene Expression Omnibus under accession numbers GSE76391, GSE67941, and GSE138846.
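The unit-gene binning described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the figures (the plots in this study were made with R): the function names and toy coordinates are hypothetical, coordinates are 0-based and half-open, and for genes on the minus strand the bins are reversed so that bin order always runs from upstream of the TSS to downstream of the TES. The function would be called once with TS reads and once with NTS reads to obtain the two profiles.

```python
def bin_edges(gene, body_bins=100, flank_bp=2000, flank_bins=25):
    """Return (start, end) intervals ordered upstream -> gene body -> downstream."""
    def split(a, b, n):
        return [(a + (b - a) * i / n, a + (b - a) * (i + 1) / n) for i in range(n)]

    start, end = gene["start"], gene["end"]
    bins = (
        split(start - flank_bp, start, flank_bins)
        + split(start, end, body_bins)
        + split(end, end + flank_bp, flank_bins)
    )
    if gene["strand"] == "-":
        bins.reverse()  # the TSS of a minus-strand gene is at its 'end' coordinate
    return bins


def unit_gene_profile(genes, read_positions, total_reads, **bin_kwargs):
    """Average per-bin RPKM over all genes (one chromosome, one strand class)."""
    if not genes:
        return []
    profiles = []
    for gene in genes:
        row = []
        for a, b in bin_edges(gene, **bin_kwargs):
            n = sum(1 for p in read_positions if a <= p < b)
            row.append(n * 1e9 / ((b - a) * total_reads))  # RPKM for this bin
        profiles.append(row)
    n_bins = len(profiles[0])
    return [sum(row[i] for row in profiles) / len(profiles) for i in range(n_bins)]


if __name__ == "__main__":
    # Toy gene and reads with made-up coordinates.
    gene = {"start": 10000, "end": 20000, "strand": "+"}
    profile = unit_gene_profile([gene], [9990, 10050], total_reads=1_000_000)
    print(len(profile), profile[24], profile[25])
```

The nested loops are quadratic in the worst case and are written for clarity; a real analysis over millions of reads would use sorted positions or an interval-based tool instead.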