Mfd translocase is necessary and sufficient for transcription-coupled repair in Escherichia coli

Nucleotide excision repair in Escherichia coli is stimulated by transcription, specifically in the transcribed strand. Previously, it was shown that this transcription-coupled repair (TCR) is mediated by the Mfd translocase. Recently, it was proposed that the majority of TCR in E. coli is in fact catalyzed by a second pathway (“backtracking-mediated TCR”) that is dependent on the UvrD helicase and the guanosine tetraphosphate (ppGpp) alarmone/stringent response regulator. We subsequently reported that, as measured by excision repair–sequencing (XR-seq), UvrD plays no role in TCR genome-wide. Here, we tested the role of ppGpp and UvrD in TCR genome-wide and in the lacZ operon using the XR-seq method, which directly measures repair. We found that the mfd mutation abolishes TCR genome-wide and in the lacZ operon. In contrast, the relA−spoT− mutant deficient in ppGpp synthesis carries out normal TCR. We conclude that UvrD and ppGpp play no role in TCR in E. coli.

highly sensitive XR-seq method (22,23) to analyze TCR in E. coli strains of various genetic backgrounds. Previously, using XR-seq, we found that genome-wide TCR was mediated by Mfd but found no evidence for a role of UvrD in TCR (23). Instead, XR-seq analysis confirmed the role of UvrD in catalytic turnover of the Uvr(A)BC excision nuclease, which had been deduced from in vitro studies (24,25). Here, we investigated the role of ppGpp in TCR both genome-wide and in the inducible lacZ operon. We found that both genome-wide TCR and inducible lacZ TCR are dependent on Mfd but are not affected by the absence or presence of ppGpp.

Genome-wide effects of Mfd, UvrD, and ppGpp on transcription-coupled repair
To determine the contributions of Mfd, UvrD, and ppGpp to TCR on the genomic scale, we conducted XR-seq on wild-type E. coli as well as E. coli strains with mfd−, uvrD−, and relA−spoT− mutations. The results, in the form of histograms, are shown in Fig. 1. As noted previously, although in the majority of genes in the wild type the transcribed strand (TS) is repaired at a faster rate than the nontranscribed strand (NTS), in many genes the NTS is repaired more efficiently. Moreover, this trend is observed in all genetic backgrounds tested. This unexpected phenomenon has been ascribed to the complex transcriptional landscape of the E. coli genome, which encompasses numerous annotated and unannotated antisense transcripts and "transcriptional noise" due to spurious promoter-like sequences capable of initiating transcription (26-28). In addition, UV damage hotspots may contribute to this anomalous repair pattern (23).
From the perspective of the molecular mechanism of TCR and the roles of various factors in this phenomenon, we note the following facts evident in Fig. 1. First, in the mfd− mutant, the TS/NTS repair ratio is drastically reduced and the NTS has become the preferentially repaired strand because of the inhibition of repair by stalled RNAP in the TS, especially in the highly transcribed genes (Fig. 1B). Second, in agreement with a previous finding (23), in the uvrD− strain, either TCR was unchanged or the TS/NTS ratio was amplified, although overall repair was drastically reduced in both strands. Finally, the genome-wide TS/NTS histogram of relA−spoT− (ppGpp⁰) is essentially identical to that of the wild-type strain, indicating that the ppGpp alarmone has no role in TCR.
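The strand comparison underlying these histograms can be illustrated with a minimal Python sketch. It is not the authors' script: the function name, the log2 scale, and the pseudocount are assumptions introduced here for illustration. The only biology it relies on is that an XR-seq read carries the sequence of the damaged strand, so for a gene encoded on the "+" strand the transcribed (template) strand is "−".

```python
import math

def ts_nts_log2_ratio(gene_strand, plus_reads, minus_reads, pseudocount=1.0):
    """Return log2(TS/NTS) repair for one gene (hypothetical helper).

    gene_strand  -- '+' or '-', the strand on which the gene is encoded
    plus_reads   -- XR-seq reads mapping to the '+' strand within the gene
    minus_reads  -- XR-seq reads mapping to the '-' strand within the gene
    pseudocount  -- assumed smoothing term to avoid division by zero
    """
    if gene_strand == "+":
        # Coding strand is '+', so the template (transcribed) strand is '-'.
        ts, nts = minus_reads, plus_reads
    elif gene_strand == "-":
        ts, nts = plus_reads, minus_reads
    else:
        raise ValueError("gene_strand must be '+' or '-'")
    return math.log2((ts + pseudocount) / (nts + pseudocount))
```

A positive value means the TS is preferentially repaired, as in most wild-type genes; a negative value corresponds to the reversal described for the mfd− mutant.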

Analysis of the effect of ppGpp on TCR at gene resolution
Although Fig. 1 shows no indication that ppGpp plays a role in TCR, the confounding factors of genome-wide analysis (antisense transcripts, spurious transcripts, overlapping genes) preclude a definitive conclusion vis à vis the role of ppGpp in TCR. Hence, we decided to analyze the repair pattern of a few sentinel genes with high levels of transcription and no known antisense transcription or other complicating factors to more precisely define the putative role of ppGpp in TCR. We chose RNA polymerase β subunit (rpoB), polyribonucleotide nucleotidyltransferase (pnp), and DNA polymerase I (polA) as representatives of genes expressed at high levels. Fig. 2 shows that there is TCR in wild-type cells in all three genes. In the mfd− mutant, the TCR is abolished in all three operons, as expected. Of significance, in the highly transcribed rpoB gene, the TS/NTS repair ratio is reversed by a factor of about 10 in favor of the NTS because in the absence of Mfd, the high-density RNAPs stalled at damage in the TS inhibit repair in this strand but do not affect repair in the NTS (6). This effect is less obvious or absent in genes with lower transcription rates. The uvrD− mutation, as reported previously (23), reduces the level of overall repair and either enhances TCR (rpoB) or does not significantly affect the TS/NTS repair ratio. Importantly, the TCR profiles of all three genes in the ppGpp⁰ strain are indistinguishable from the wild-type strain, consistent with the conclusion drawn from the genome-wide analyses of TCR.

Effect of ppGpp on transcription-coupled repair on an inducible operon
Traditionally, in vivo TCR experiments in E. coli have been done using the lac operon for readout (4). E. coli cells grown in glucose-containing medium do not express lacZ, but the operon can be turned on by the addition of IPTG to induce transcription to varying degrees and thus study the effects of various levels of transcription in a relatively well-defined system. With these considerations, we then analyzed TCR in the lacZ operon in wild-type, mfd−, relA−spoT−, and uvrD− strains under conditions of no induction and full induction. Fig. 3 shows screen shots of XR-seq results for the relevant strains under conditions of various levels of transcriptional activity. The results with the wild-type strain are consistent with a previous report (6) and with the in vitro data (7,8,11) that under conditions of full induction, the TS is repaired at about a 4.6-fold faster rate than the NTS. This preferential repair is abolished in the mfd− mutant, and the TS/NTS repair ratio is reversed in favor of the NTS, as has been observed in the reconstituted TCR repair system (8,11). In the uvrD− mutant, overall repair is reduced because of the lack of catalytic turnover of Uvr(A)BC excision nuclease. However, the TS/NTS ratio is either unchanged or slightly enhanced because of the retention of the UvrB-UvrC complex on the TS following the dual incision, in agreement with the model that UvrD plays a role in post-incision steps of excision repair but not in the pre-incision steps (24,25), in the absence or presence of transcription. Finally, the ppGpp⁰ (relA−spoT−) strain is not distinguishable from the wild-type strain in lacZ: the TS is repaired about 4.3-fold faster than the NTS. Quantitative data points are shown in Fig. 3. The minor difference between WT (4.6-fold) and ppGpp⁰ (4.3-fold) is likely because of the diminished transcription levels in the absence of ppGpp, as indicated by the β-galactosidase activities (Fig. 3).
We considered the possibility that TCR may have a biphasic character in which the early mode mediated by Mfd is followed by a second repair mode that becomes active upon induction of UvrD protein and the ppGpp alarmone. Hence we conducted a time-course XR-seq experiment in the uvrD− background with and without IPTG induction and analyzed repair in lacZ. The drawback of uvrD− strains is the fact that there is a slow catalytic turnover of the Uvr(A)BC proteins and slow release and degradation of the excised oligonucleotide. Therefore, an XR-seq profile at a certain time essentially represents cumulative rather than ongoing repair. To overcome this drawback, we performed subtractive analysis to determine the incremental change in repair from one time point to the next (see "Experimental procedures"). As seen in Fig. 4, the TS is preferentially repaired only when the operon is actively transcribed, even at later time points. This observation shows that TCR is UvrD-independent at all stages, consistent with the notion that Mfd/TRCF is solely responsible for coupling transcription to repair in E. coli.

Discussion
Following the identification of Mfd as the protein responsible for preferential repair of UV-induced cyclobutane pyrimidine dimers in the anticodon of some tRNA genes and for being essential for transcription-coupled repair in several in vitro assays, it was concluded that Mfd was solely responsible for specifically stimulating excision repair in the transcribed strand (11,12). Thus, it was concluded that Mfd and the biochemically identified transcription-repair coupling factor were identical and hence Mfd and TRCF were used interchangeably. In vivo mutagenesis experiments (29) and structural (16,17) and single-molecule (18) studies confirmed and refined the original model.
However, several studies reported results suggesting either that factors in addition to Mfd and Uvr(A)BC were required for TCR or that there were multiple pathways for TCR. First, in a high-resolution study of the initially transcribed region of lacZ, it was reported that template strand repair stimulated by IPTG was Mfd-independent (30). Our results overall are not consistent with these findings; however, we did observe IPTG-dependent, Mfd-independent repair at one site, located in the operator-binding site (O1) in the initially transcribed region (shown in green in Fig. 3). Thus, binding of the lac repressor inhibits repair. Second, it was reported that mutL− and mutS− mutations abolished TCR of the lacZ operon in a manner identical to the mfd mutation (31). However, in vitro experiments with cell-free extracts revealed that although mfd− extract was deficient in TCR, both mutL− and mutS− extracts performed normal TCR, and thus the in vivo findings implicating these genes were ascribed to the limitations in the in vivo assay (13). Finally, a third series of studies using an indirect in vitro repair assay, and the lac-inducible system in uvrD−, ppGpp⁰, or uvrD− ppGpp⁰ backgrounds, concluded that UvrD with the aid of damage-induced ppGpp led to backtracking of RNAP stalled at damage, allowing the Uvr(A)BC to repair damage and enabling RNAP to elongate the truncated transcript (19). We are unable to confirm these results. We have no explanation for the data in the report that links ppGpp to TCR. Importantly, using the canonical lacZ-inducible operon, we show that TCR in this system is absolutely dependent on Mfd and that its magnitude as defined by the TS/NTS repair ratio is unaffected by ppGpp and either unaffected or enhanced by the absence of UvrD because of the lack of catalytic turnover of Uvr(A)BC excision nuclease. We conclude that Mfd/TRCF is the sole protein that couples transcription to repair in E. coli.
ACCELERATED COMMUNICATION: Mfd, the TCR factor in E. coli

Experimental procedures
Strains are described in supplemental Table S1. Double mutant ΔrelA25GΔspoT203 strains are unable to synthesize ppGpp and are designated ppGpp⁰. Cultures were grown in LB (10 g of Tryptone, 5 g of yeast extract, 10 g of NaCl/liter). For experiments, overnight liquid cultures (MG1655-derived strains) or colonies grown overnight on LB-agar (CF10237-derived strains) were used to inoculate LB supplemented with glucose (0.4%) and/or IPTG (1 mM) where indicated. Experimental cultures were grown at 37°C to an OD of ~0.4, which, at the dilutions used, took over an hour.
For measuring β-galactosidase, 0.2- or 0.3-ml volumes of culture were put on ice and pelleted, and then pellets were frozen at −80°C. β-Galactosidase activity was measured as described (32) using the procedure in which 20 µl of SDS (0.1%) and 20 µl of chloroform were mixed with resuspended cells.
For the XR-seq assay, cells grown to an OD of ~0.36 were cooled to room temperature by swirling in a water bath at about 15°C. 15-ml volumes were transferred to R-150 dishes and irradiated at room temperature with UV at 60 J/m² with agitation. Cell survival under these conditions was 88% (40 J/m²) and 72% (120 J/m²) for MGP cells and 70% (40 J/m²) and 41% (120 J/m²) for CFPK cells. The dose rate was ~2 J/m²/s. One minute after light exposure began, dishes were placed on ice water, and cells were harvested and processed as described previously (23), with exceptions noted below. Timing of the irradiations was such that 10 plates were successively irradiated in a batch, with the irradiation, repair, and placement of plates on ice taking 12 min per batch. By this approach, all cells were irradiated at an OD of ~0.4. Experiments utilized a total of either 600 ml or 300 ml of cells irradiated and repaired in this manner, with the exception of uvrD− cells, which generate sufficient excision product (23) that only 30 ml of cells were needed for XR-seq experiments.
In repair experiments with CF10237 derivatives, dilutions of unirradiated cells at OD ≈ 0.4 were plated on LB and M9 to determine whether substantial numbers of RNA polymerase revertants had arisen, as suggested (33). Only revertants are able to grow on M9, and revertants were present at less than 0.1% in the populations subsequently irradiated.
The XR-seq procedure was modified slightly from prior experiments (23). In an early step of XR-seq, the partially purified product of excision is ligated on both ends to adapters. In these experiments, we used a modified 3′-adapter called "A3b," which includes an 8-base random sequence. The purpose of this sequence was to allow the identification and elimination of duplicate reads introduced in the PCR step. A3b is made by annealing the oligonucleotides A3Fb: SpC3-3′-GGAACCGTGGGCTCTTAAGGTNNNNNNNNGCCGTGAGTATGCTTGACTCGG(P)-5′ and A3Rb: 5′-CGGCACTCATACGAACTGAGCCNNNNN-SpC3-3′. These oligonucleotides were purchased from Integrated DNA Technologies, and are phosphorylated (P), possess random A, G, C, or T (N), or are blocked by spacer-C3 (SpC3-3′) as indicated. The 5′-adapter was as described in Hu et al. (22).
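How a random adapter sequence permits removal of PCR duplicates can be sketched in a few lines of Python. This is an illustration only, not the procedure actually used: it assumes that each read has already been split into its excised-oligonucleotide sequence and its 8-base random tag, and that two reads are duplicates when both parts are identical. The function name is hypothetical.

```python
def collapse_pcr_duplicates(reads):
    """Keep one copy of each (insert, random_tag) pair, preserving order.

    reads -- iterable of (insert_sequence, random_tag) tuples, where
             random_tag is the 8-base random sequence from the adapter.
    Reads with the same insert but different tags are treated as
    independent excision events and are all retained.
    """
    seen = set()
    unique = []
    for insert, tag in reads:
        key = (insert, tag)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique
```

Without the random tag, identical excised oligonucleotides arising from independent repair events at a damage hotspot could not be told apart from PCR copies of a single molecule.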
We used cutadapt (34) to trim adapter sequences with the following parameter: -a GGCTCAGTTCGTATGAGTGCCG. We analyzed the reads of 13-mer length having dithymines at the expected damage site (8th and 9th positions of the 13-mer reads). The reads were aligned to the E. coli reference genome retrieved from the National Center for Biotechnology Information (NCBI, accession number NC_000913.2) using bowtie (35) with the following parameters: -q; --nomaqround; --phred33-quals; -S; --seed 123. Samtools (36) and bedtools (37) were used to process the alignment. Strands were separated by a custom script. The genomic coordinates were normalized and converted to bedGraph and BigWig files using bedtools (37) and ucsctools (38), respectively. Per-gene repair signals were computed using bedtools and normalized to reads per kilobase per million mapped reads with custom scripts. Publicly available RNA sequencing (RNA-seq) data sets (27) were analyzed as we reported before (23).
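The read selection and normalization steps described above can be expressed compactly. The following Python sketch is illustrative and is not the custom script used in this work; the function names are hypothetical. It applies the stated criterion (13-mers with TT at positions 8 and 9, counted from 1) and the standard definition of reads per kilobase per million mapped reads.

```python
def is_tt_13mer(read):
    """True if the read is a 13-mer with TT at 1-based positions 8 and 9."""
    # Python slices are 0-based, so positions 8-9 correspond to read[7:9].
    return len(read) == 13 and read[7:9].upper() == "TT"

def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of gene per million mapped reads."""
    # count / (length / 1e3) / (total / 1e6) == count * 1e9 / (length * total)
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)
```

For example, 50 reads in a 1,000-bp gene from a library of one million mapped reads give an RPKM of 50.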
The time-course subtractive analysis of the uvrD− XR-seq datasets was performed with custom scripts. Random sampling was applied to retrieve 12 million reads from each sample. Each unique genomic location was counted, and the counts of a sample of a certain time point were subtracted from the subsequent one (time points were 1, 3, 10, 30, and 90 min).
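The subtraction can be sketched as follows. This Python fragment is a minimal illustration under stated assumptions, not the custom scripts themselves: each read is represented by a hashable genomic location such as a (strand, position) tuple, the function names and the fixed random seed are hypothetical, and negative differences are returned as-is because their handling is not specified here.

```python
import random
from collections import Counter

def subsample(locations, n, seed=0):
    """Randomly draw n reads (without replacement) from a list of locations."""
    rng = random.Random(seed)  # assumed fixed seed, for reproducibility only
    return rng.sample(locations, n)

def incremental_counts(earlier, later):
    """Per-location read count at the later time point minus the earlier one.

    earlier, later -- lists of read locations from two consecutive time
    points, already subsampled to the same depth.
    """
    early_counts = Counter(earlier)
    late_counts = Counter(later)
    all_locations = set(early_counts) | set(late_counts)
    # Counter returns 0 for locations absent from one of the samples.
    return {loc: late_counts[loc] - early_counts[loc] for loc in all_locations}
```

Applied to consecutive pairs of the time points (1 and 3 min, 3 and 10 min, and so on), the result approximates the repair that occurred within each interval rather than the cumulative signal.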
Author contributions-O. A., A. S., and C. P. S. designed research, analyzed data, and wrote the paper. O. A. and C. P. S. performed research. All authors reviewed the results and approved the final version of the manuscript.