Replication stress at microsatellites causes DNA double-strand breaks and break-induced replication

Short tandemly repeated DNA sequences, termed microsatellites, are abundant in the human genome. These microsatellites exhibit length instability and susceptibility to DNA double-strand breaks (DSBs) due to their tendency to form stable non-B DNA structures. Replication-dependent microsatellite DSBs are linked to genome instability signatures in human developmental diseases and cancers. To probe the causes and consequences of microsatellite DSBs, we designed a dual-fluorescence reporter system to detect DSBs at expanded (CTG/CAG) n and polypurine/polypyrimidine (Pu/Py) mirror repeat structures alongside the c-myc replication origin integrated at a single ectopic chromosomal site. Restriction cleavage near the (CTG/CAG) 100 microsatellite leads to homology-directed single-strand annealing between flanking AluY elements and reporter gene deletion that can be detected by flow cytometry. However, in the absence of restriction cleavage, endogenous and exogenous replication stressors induce DSBs at the (CTG/CAG) 100 and Pu/Py microsatellites. DSBs map to a narrow region at the downstream edge of the (CTG) 100 lagging-strand template. (CTG/CAG) n chromosome fragility is repeat length-dependent, whereas instability at the (Pu/Py) microsatellites depends on replication polarity. Strikingly, restriction-generated DSBs and replication-dependent DSBs are not repaired by the same mechanism. Knockdown of DNA damage response proteins increases (Rad18, polymerase (Pol) η, Pol κ) or decreases (Mus81) the sensitivity of the (CTG/CAG) 100 microsatellites to replication stress. Replication stress and DSBs at the ectopic (CTG/CAG) 100 microsatellite lead to break-induced replication and high-frequency mutagenesis at a flanking thymidine kinase gene. Our results show that non-B structure-prone microsatellites are susceptible to replication-dependent DSBs that cause genome instability.
Here, we focus on replication-dependent DSBs that occur at two types of microsatellite elements, an expanded (CTG/CAG) 100 trinucleotide repeat from the 3′-UTR of the human DMPK locus and the 88-bp asymmetric polypurine/polypyrimidine (Pu/Py) 88 mirror repeat from the PKD1 IVS21 locus. Much work has concentrated on (CTG/CAG) expansions in the DMPK gene, inasmuch as expansions of this microsatellite beyond ~40 repeats promote further enlargement of the tract and genetic anticipation leading to myotonic dystrophy type 1 (DM1, chr19q13.32, MIM no. 160900) (for a recent review, see Ref. 54). Previous work showed that expanded (CTG/CAG) tracts can form hairpin structures in vivo (55,56), and several reports have also shown that (CTG/CAG) n microsatellite repeats contribute to DNA DSBs in bacterial, yeast, and human model systems (29,33,57,58).
The PKD1 (Pu/Py) 88 asymmetric mirror repeats have the potential to form triplex H-DNA and G quadruplex DNA. In vitro, DNA triplex structures are visible in this sequence by atomic force microscopy (59). Mutations in the PKD1 gene are associated with at least 85% of the cases of autosomal dominant polycystic kidney disease (ADPKD) (chr16p13.3, MIM no. 173900). In more than 100 unrelated patients with ADPKD, mutations were at least twice as frequent in the exons flanking the PKD1 (Pu/Py) 88 microsatellite as in 5′ exons 1-8 (60). Surprisingly, the (Pu/Py) 88 microsatellite is not detectable as a hotspot for mutation in blood samples of ADPKD patients (61,62).
We have shown that the PKD1 IVS21 mirror repeat also causes orientation-dependent fork stalling during replication in vitro and in vivo. When integrated alongside the c-myc replicator at an ectopic chromosomal site in the HeLa genome, the (Pu/Py) 88 tract elicited a polar replication fork barrier. When the repeat was in the fork-stalling orientation, the binding of replication checkpoint proteins Rad9, RPA, and ATR near the repeat and the sensitivity of cells to Chk1 inhibition suggested that the DNA damage response is activated by replication fork stalling at this microsatellite (63).
In the present work, we describe a novel system to analyze replication-dependent DNA double-strand breaks in human cells, using fluorescent marker protein genes flanking the (CTG/CAG) 100 or (Pu/Py) 88 microsatellites. We find that the expanded (CTG/CAG) 100 tract is sensitive to breakage following exposure to multiple forms of replication stress. Under nonperturbed conditions as well as after treatment with low-dose hydroxyurea (HU), DSBs occur in a narrow region near the downstream end of the (CTG/CAG) repeats. Moreover, these breaks are not repaired by the same mechanism as a restriction enzyme-generated DSB. Replication-dependent DSBs at the ectopic (CTG/CAG) 100 microsatellite result in BIR and a greatly elevated frequency of mutagenesis of the neighboring thymidine kinase gene. The (Pu/Py) 88 microsatellite is also sensitive to DSBs under unperturbed conditions and is highly vulnerable to DSBs when the purine-rich strand is the lagging-strand template for replication in cells treated with a G4-stabilizing drug.
Our results show that diverse forms of replication stress cause DSBs at microsatellite repeats prone to forming non-B DNA structures. The frequency of DSBs depends on the structure-selective Mus81 endonuclease and translesion polymerases. Invasion of the sister chromatid by the broken DNA results in complex rearrangements and a high rate of base substitutions during break-induced replication.

Results
A dual-fluorescence reporter system for analysis of DNA DSBs in vivo
Double-strand breaks are the most dangerous of DNA lesions because of the potential for error-prone repair, gross chromosomal rearrangement, and loss of heterozygosity. To identify factors affecting microsatellite DSBs and repair, we developed a system in which a DSB between two chromosomal reporter genes could be detected by microscopy or flow cytometry (Fig. 1). In this system, (CTG/CAG) 100 or (Pu/Py) 88 microsatellites were individually integrated at a single-copy FLP recombinase target (FRT) site at chromosome 18p11.22 in HeLa cells (64), bordered by the c-myc replication origin core (55,65,66), an I-Sce1 site, and two fluorescent protein marker genes (dTomato, eGFP) flanked by three identical AluYa5 elements (Fig. 1A) (67). Control cell lines were also constructed that contain the same starting construct except that the dual-fluorescence (DF)/myc cell line is missing the microsatellite sequences (Fig. 1B), and the DF cell line is additionally missing the c-myc origin core (Fig. 1C). The (CTG/CAG) 23 and (CTG/CAG) 100 sequences are pure (CTG/CAG) repeats (55). The sequence of the PKD1 (Pu/Py) 88 microsatellite is shown in Fig. 1D. The cell lines are named to indicate the DNA sequence of the lagging-strand template when replicated from the c-myc origin (55,63).
The DF/myc(CTG) 100 cell line (hereafter referred to as (CTG) 100 ) expresses both the dTomato and eGFP reporter genes and fluoresces yellow (Fig. 1E). Transfection of an I-Sce1 expression vector results in double-strand DNA cleavage 25 bp downstream of the (CTG/CAG) 100 microsatellite. This leads to intrachromosomal homology-directed recombination (single-strand annealing) between the second and third Alu elements (67), which eliminates the eGFP reporter. The half-lives of eGFP and dTomato reporter proteins are ~24 h (68, 69); therefore, I-Sce1 digestion resulted in cells that appear red after allowing 4-8 days for turnover of the reporter proteins present before digestion (Fig. 1F).
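The 4-8 day waiting period can be rationalized with a simple decay calculation. The following is a minimal sketch, assuming first-order turnover with the ~24 h half-life cited above and no new synthesis once the reporter gene is deleted; dilution by cell division is ignored, and the function name is illustrative rather than part of the study.

```python
def fraction_remaining(hours: float, half_life_hours: float = 24.0) -> float:
    """Fraction of pre-existing reporter protein left after `hours`.

    Assumes first-order decay and no new synthesis (illustrative assumption).
    """
    return 0.5 ** (hours / half_life_hours)


for days in (4, 8):
    remaining = fraction_remaining(days * 24)
    print(f"{days} days: {remaining:.4f} of the initial protein remains")
```

Under these assumptions, about 6% of the pre-existing protein remains after 4 days and less than 0.5% after 8 days, which is in line with waiting several days before scoring loss of a reporter.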

Replication-dependent DSBs are not repaired in the same way as I-Sce1 DSBs
To quantitate these observations over the entire cell population, (CTG) 100 cells were analyzed by flow cytometry. The (CTG) 100 cells initially expressed both dTomato (red) and eGFP proteins and appeared in the upper right quadrant (yellow, double-positive) (Fig. 2, A and B). A small percentage of cells (<2%) had spontaneously lost either the green reporter (upper left quadrant, red cells), the red reporter (lower right quadrant, green cells), or both reporters (lower left quadrant, double-negative cells). When these cells were transfected with the I-Sce1 expression vector, more than 40% of the cells lost the green reporter (generating red cells) or both reporters (resulting in double-negative cells) by 4 days after treatment (Fig. 2C). The loss of the eGFP reporter gene is the result of intrachromosomal single-strand annealing between the second and third Alu elements, whereas the double-negative cells result from single-strand annealing between the first and third Alu elements (67).
In striking contrast, approximately half of the (CTG) 100 cells exposed to low-dose hydroxyurea (0.2 mM) for 96 h had lost the dTomato marker after 4 days of recovery (Fig. 2D). This HU treatment quickly arrests replication forks in S phase and induces a low level of the phosphorylated replication stress proteins γH2AX and pChk1 (S345) (Fig. S1), consistent with previous reports that γH2AX marks stalled forks before DSBs are detectable (70).
DSBs were also induced in (CTG) 100 cells by treatment with a low dose of the replication inhibitor aphidicolin (Fig. S2), or by using H2O2 as a source of ROS and replication stress (Fig. S3). In contrast, (CTG) 23 cells did not exhibit these effects. We conclude that replication stress-dependent DSBs occur between the dTomato and eGFP marker genes near the ectopic (CTG) 100 site. Based on the difference in flow cytometry patterns between cells treated with I-Sce1 and HU, we conclude that replication-dependent DSBs in the ectopic (CTG) 100 locus are not repaired in the same way as "clean" restriction enzyme-generated DSBs and that replication-dependent DSBs caused by the (CTG/CAG) 100 repeat are refractory to the most common repair pathways of homology-directed repair and nonhomologous end joining.
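The quadrant readout used throughout this section can be sketched in a few lines. This is only an illustration of the classification logic (dTomato on the vertical axis, eGFP on the horizontal axis); the gate values, function names, and event intensities are hypothetical and are not data or software from the study.

```python
from collections import Counter


def classify(red: float, green: float, red_gate: float, green_gate: float) -> str:
    """Assign one event to a quadrant from its dTomato (red) and eGFP (green) signal."""
    red_pos = red >= red_gate
    green_pos = green >= green_gate
    if red_pos and green_pos:
        return "yellow (double-positive)"   # upper right: both reporters present
    if red_pos:
        return "red (eGFP lost)"            # upper left
    if green_pos:
        return "green (dTomato lost)"       # lower right
    return "double-negative"                # lower left


def quadrant_percentages(events, red_gate: float, green_gate: float) -> dict:
    """Percentage of events in each quadrant; returns {} for an empty sample."""
    counts = Counter(classify(r, g, red_gate, green_gate) for r, g in events)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: 100.0 * n / total for name, n in counts.items()}


# Hypothetical (red, green) intensities in arbitrary units.
events = [(900, 850), (880, 40), (30, 910), (25, 20), (950, 990), (20, 870)]
print(quadrant_percentages(events, red_gate=100, green_gate=100))
```

In this scheme, an I-Sce1-like outcome would enrich the red and double-negative classes, whereas the HU outcome described above would enrich the green class.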

(CTG/CAG) n repeat length-dependent DSBs
To confirm that the HU-induced DSBs were dependent on the (CTG/CAG) 100 repeat, several control cell clones were constructed and tested with HU, namely DF cells missing the c-myc origin and the (CTG/CAG) 100 repeat (Fig. 3, A and B), DF/myc cells missing only the (CTG/CAG) 100 repeat (Fig. 3, C and D), and DF/myc cells containing a shorter (CTG/CAG) 23 repeat.
Figure 1 (legend, partial). … 88 refer to the 88-bp PKD1 microsatellite with the purine-rich strand or pyrimidine-rich strand, respectively, in the lagging-strand template when replicated from the c-myc origin. dTomato and eGFP genes are flanked by three identical AluYa5 repeats. B, DF/myc cells contain the same construct as in A but are missing the microsatellite sequences. C, DF cells contain the same construct as in A but are missing the ectopic c-myc origin and the microsatellite sequences. Hyg, hygromycin phosphotransferase (Hyg r ) gene; Neo, neomycin phosphotransferase (Neo r ) gene; TK, HSV thymidine kinase gene; FRT, S. cerevisiae FLP recombinase target, allowing site-directed integration. D, the PKD1 microsatellite sequence, showing two regions of mirror repeat symmetry; E, DF/myc(CTG) 100 …
In previous work in which the position of the I-Sce1 site was changed, we showed that homology-directed repair that removes either the ectopic dTomato gene or the eGFP gene is not inherently deleterious to cells (67). Therefore, the loss of the dTomato gene raised the possibility that the replication-dependent DSBs that were resistant to recombination and deleted all or part of chromosome 18 containing the dTomato gene were also inimical to cell survival. To test this hypothesis, cells were allowed to recover for 4, 8, or 10 days following HU treatment (Fig. 3, G-I). The abundance of green cells in the culture suggests that an acentric fragment of chromosome 18 including the dTomato reporter gene may have been lost due to DSBs at the (CTG/CAG) 100 microsatellite (see "Discussion").
The progressive loss of the green cells from the population (lower right quadrant) during the 4-10-day time course suggests that unrepaired replication-dependent DSBs had a lethal effect on these cells.

DNA DSBs are localized downstream of the (CTG/CAG) 100 microsatellite
To determine the location of the replication-dependent DSBs, DNA was isolated from (CTG) 100 cells treated with HU or I-Sce1 and subjected to linear amplification ligation-mediated PCR (lamPCR) (71,72). The lamPCR primer was complementary to the single-copy eGFP gene (Fig. 4A) and designed to hybridize to the lagging-strand template DNA and leading-strand nascent DNA relative to the c-myc origin.
The I-Sce1 site 25 bp 3′ to the (CTG/CAG) 100 microsatellite and ~500 bp from the lamPCR primer was used as a landmark. As expected, when I-Sce1 was expressed in (CTG) 100 cells, a major lamPCR band of ~500 bp was observed (Fig. 4B). Lower-molecular weight bands (~200-350 bp) were also observed in the undigested (Fig. 4B, lane 1) and I-Sce1-digested (Fig. 4B, lane 2) reactions, which we attribute to multiple leading-strand initiations near the c-myc origin (73,74) that are not dependent on exogenous replication stress.
A discrete band of approximately the same size as the I-Sce1-generated lamPCR band could also be seen in a longer exposure of the PCR products from untreated cells (Fig. 4B′, lane 1). We propose that this band is due to DSBs close to the 3′ end of the (CTG/CAG) 100 repeat and the I-Sce1 site that are generated during endogenous replication stress. The breadth of this band suggests that DSBs resulting during unperturbed growth are primarily the result of nuclease cleavage within a limited region near a stalled fork and not due to random torsional breakage throughout the (CTG/CAG) 100 microsatellite.
When (CTG) 100 cells were treated with HU, a band of ~500 bp again appeared (Fig. 4C), suggesting that DSBs induced by HU treatment occur at or near the sites of DSBs due to endogenous replication stress and I-Sce1 cleavage. HU treatment also suppressed the 200-350 bp bands caused by nascent strand DNA annealing to the leftward-facing lamPCR primer. We conclude that endogenous replication stress, HU-induced fork stress, and I-Sce1 cleavage all produce DSBs at or near the 3′ end of the (CTG/CAG) 100 microsatellite at the ectopic site in (CTG) 100 cells.

Mus81 knockdown decreases (CTG/CAG) 100 DSBs during replication stress
We have shown that (CTG/CAG) n repeats form hairpin structures in vivo that cause replication fork stalling (55,56,75). Inasmuch as the Mus81 nuclease has been strongly implicated in the cleavage of stalled replication forks (76-84), we wished to test whether knockdown of Mus81 (Fig. 4D) would affect the (CTG/CAG) 100 DSBs. In this experiment, roughly 18% of (CTG) 100 cells had suffered DSBs during unperturbed clonal growth (Fig. 4E). HU treatment of these cells significantly increased the percentage of green (DSB) cells in the population to greater than 40-50% (Fig. 4F and Fig. S4, p = 0.018). Consistent with the cleavage of stalled forks by Mus81, knockdown of the nuclease reproducibly resulted in a significant decrease in the percentage of cells with endogenous DSBs (cf. Fig. 4 (E and G) and Fig. S4, p = 0.049). Additionally, Mus81 knockdown dramatically decreased the percentage of DSBs induced by HU treatment from ~50% to 25% of cells (cf. Fig. 4 (F and H), p = 0.003). Whereas these results are consistent with reports that Mus81 is a structure-selective nuclease that cleaves stalled replication forks (84-88), and our results show that Mus81 is involved in dTomato marker loss, we note that our experiments have not shown that it is specifically the nuclease activity of Mus81 that is responsible for the DSBs.

G quadruplex formation induces DSBs at the PKD1 polypurine/polypyrimidine microsatellite
The polycystic kidney disease type 1 (PKD1) locus harbors a polypurine-polypyrimidine (Pu/Py) 88 tract of 88 base pairs in intron 21 (89), which is capable of forming intramolecular DNA triplex (H-DNA) and G quadruplex structures (90,91). These structures have been strongly implicated in replication fork stalling and collapse (8,92-94). Replication of the polypurine strand is blocked by non-B structure formation in vitro, and the PKD1 (Pu/Py) 88 tract selectively inhibits replication when the polypurine strand is the lagging-strand template in vivo (63). To determine whether G quadruplex formation would sensitize this microsatellite to DSBs, we integrated the PKD1 (Pu/Py) 88 repeat at the ectopic chromosomal site, in either the (Pu) 88 or (Py) 88 lagging-strand orientation when replicated from the c-myc origin (Fig. 5A).
The starting culture of DF/myc/(Pu) 88 cells (referred to as (Pu) 88 cells) showed a significantly higher percentage of cells than the (Py) 88 cells in the lower right (green) quadrant, resulting from DSBs occurring in the absence of exogenous stress (Fig. 5B and Fig. S6, p = 0.0001). Treatment of the (Pu) 88 cells with HU did not significantly increase the percentage of green cells (Fig. 5C, p = 0.354); however, treatment of (Pu) 88 cells with the G quadruplex-stabilizing drug telomestatin (TMS) (95,96) markedly increased chromosome fragility at the ectopic site (Fig. 5D, p = 0.0045), and this effect was enhanced by co-administration of HU (Fig. 5E and Fig. S6, p = 0.024). These data suggest that replication fork slowing acts synergistically when G quadruplex formation is induced by the exogenous ligand. However, it remains to be seen whether endogenous levels of replication stress promote G4 versus H-DNA formation in the (Pu) 88 repeat in vivo.
We showed previously that the PKD1 microsatellite in the (Pu) 88 lagging-strand orientation blocked replication fork progress in vivo from the c-myc origin and elicited a constitutive DNA damage response, which was not observed when the ectopic site repeat was in the (Py) 88 orientation (63). To test the effect of replication polarity on the stability of the PKD1 microsatellite, we also analyzed the stability of the ectopic site when (Py) 88 was in the lagging-strand orientation. As shown in Fig. 5F, a significantly smaller percentage of DF/myc/(Py) 88 cells (referred to as (Py) 88 cells) than (Pu) 88 cells were initially green in the absence of exogenous replication stress (cf. Fig. 5 (B and F) and Fig. S6, p = 0.0001). This result suggests that the ectopic (Pu/Py) 88 tract is more sensitive to endogenous DSBs when the purine-rich strand is replicated as the lagging-strand template (i.e. in the fork-stalling orientation for replication). Nevertheless, compared with the effects of TMS on (Pu) 88 cells, TMS had a reduced but statistically significant effect on the percentage of green (Py) 88 cells in the absence (Fig. 5H, p = 0.006) or presence (Fig. 5I, p = 0.004) of HU, which is likely due to the presence of the NHE III 1 G4 prone sequence in the c-myc replication origin core (97-99). In contrast, HU treatment did not have a significant effect on the flow cytometry profile of (Py) 88 cells in the absence (Fig. 5G and Fig. S6, p = 0.355) or presence of TMS (Fig. 5I and Fig. S6, p = 0.483).
Figure 4 (legend, partial). … 100 insert showing the primer used for lamPCR (see "Experimental procedures"). Cells were transfected with an I-Sce1 expression plasmid or empty vector and incubated for 24 h before DNA isolation. Alternatively, cells were treated with 0.2 mM HU for 48 h. Genomic DNA was isolated and subjected to two rounds of lamPCR using the 5′-biotinylated lamPCR primer indicated. The biotinylated PCR products were captured on streptavidin-tagged magnetic beads and ligated to a 5′-phosphate, 3′-dideoxy adapter oligonucleotide (Circligase). Nested primers were used for exponential PCR amplification of the ligated template followed by gel electrophoresis. B, DF/myc(CTG) 100 cells were treated with I-Sce1, and DSBs were mapped by lamPCR; B′, darker exposure of B showing endogenous DSB; C, DF/myc(CTG) 100 cells were treated with HU, and DSBs were mapped by lamPCR. Arrows, bands indicating DSBs. Asterisks, putative extension products on unbroken leading-strand nascent DNA. These bands are not reproducible. D, DF/myc(CTG) 100 cells were treated with nontargeting siControl siRNA or siRNA targeting Mus81 and analyzed by Western blotting. Flow cytometry was performed on cells treated with siControl (E), siControl plus HU (F), Mus81 siRNA (G), or Mus81 siRNA plus HU (H). Although the starting (CTG) 100 culture had an increased percentage of green cells (cf. Fig. 3), the effects of HU treatment and the rescue by Mus81 knockdown were reproducible in three independent experiments (Fig. S4).

Indirect induction of (CTG/CAG) 100 DSBs
TMS is a highly selective intramolecular G quadruplex ligand (95,96,100), which inhibits telomerase and causes telomere shortening in vivo (101). The results of TMS treatment of (Pu) 88 and (Py) 88 cells imply that DSBs occur preferentially when G4 prone sequences are present on lagging-strand templates and that G quadruplex formation contributes to fork stalling and replication-dependent DSBs.
Therefore, it was surprising that treatment of (CTG) 100 cells with TMS resulted in DSBs between the dTomato and eGFP reporter genes (Fig. 6, A and B). The effect of HU on these cells was not additive to the effect of TMS (Fig. 6C), in contrast to the dramatic effect of HU on (CTG) 100 cells in the absence of TMS (Fig. 2). These results suggest that the induction of DSBs by TMS or HU at the ectopic (CTG/CAG) 100 microsatellite may both be related to replication fork stalling. To confirm that the TMS effect in (CTG) 100 cells was due to the (CTG/CAG) 100 repeat, we treated DF/myc cells with TMS and observed a significantly decreased appearance of green cells (Fig. 6 (D-F), p = 0.011).
The effect of TMS on DF/myc control cells was not statistically significantly different from its effect on (Py) 88 cells (p = 0.149), suggesting that as in (Py) 88 cells, the residual effect of TMS on the DF/myc control cells may be due to the NHE III 1 G quadruplex-forming sequence in the 2.4 kb c-myc replication origin DNA (94, 98-100, 102, 103).
To test the possibility that TMS induces unexpected structural changes in the (CTG/CAG) microsatellite, we used CD to monitor the effect of TMS on CTG and CAG oligonucleotides (Fig. 6G). As anticipated, TMS caused a dramatic shift in the CD spectrum of a 22-mer oligonucleotide derived from the c-myc G quadruplex-forming promoter sequence (100,102,103). However, TMS had no discernible effect on the CD spectra of a scrambled DNA negative control, or (CTG) 12 or (CAG) 12 oligonucleotides, although the (CTG) 12 and (CAG) 12 sequences are known to form hairpins in vitro (104). This is consistent with the lack of stabilization of dsDNA by telomestatin or similar G4 ligands (105,106). We conclude that TMS does not have a direct effect on (CTG/CAG) DNA structure in vitro and, therefore, that the effects of TMS on (CTG/CAG) stability in vivo are more likely to be an indirect result of activation of the DNA stress response (see "Discussion"), consistent with the observation that TMS leads to phosphorylation of the DNA damage response proteins Chk1, Chk2, and H2AX (107) (Fig. S5).
The ectopic (Pu) 88 repeat is not sensitive to HU, whereas the ectopic (CTG) 100 microsatellite is sensitive to multiple forms of replication stress, suggesting that the (Pu) 88 G4 structure is responsible for DSBs. Nevertheless, the effects of TMS on (CTG) 100 cells raise the possibility that TMS may contribute to DSBs at the ectopic G4 sequences in (Pu/Py) 88 cells both in trans through the DNA damage response and directly by binding to G quadruplex-prone DNA.

Replication-dependent microsatellite DSBs cause BIR
(green cells) (cf. Fig. 7 (A and D) and Fig. S7, p = 0.046), and Rad18 knockdown substantially increased the percentage of cells with DSBs when combined with HU treatment (cf. Fig. 7 (B and E) and Fig. S7, p = 0.037). Considered together, these results suggest that Rad18/TLS stabilizes the (CTG/CAG) 100 microsatellite against the fork-slowing effects of endogenous and exogenous replication stress.
TLS polymerases have been implicated in the bypass of DNA hairpin structures (e.g. by Escherichia coli Pol V synthesis across abasic DNA sites (118) and Saccharomyces cerevisiae Pol ζ/Rev1 primer extension by DNA template switching at hairpins (119)). In the current experiments, depletion of Pol η (cf. Fig. 7 (F and I) and Fig. S7) resulted in a statistically significant increase in green cells when compared with siControl (p = 0.041), whereas knockdown of Rev1 did not (not shown).
In addition, the effect of HU treatment was amplified by siRNA depletion of Rad18 (Fig. 7 (B and E) and Fig. S7, p = 0.022) or Pol η (Fig. 7 (G and J) and Fig. S7, p = 0.042), whereas Rev1 knockdown did not increase the effect of HU treatment (not shown). These results suggest that Pol η is one of the TLS polymerases involved in the restart of (CTG) 100 stalled forks.
In contrast to the effects of knockdown of Rad18 or Pol η, depletion of Pol κ in the absence of HU treatment dramatically increased the fraction of cells containing DSBs (cf. Fig. 7 (K and N) and Fig. S7, p = 0.0007). Taken together, the modest effect of Rad18 knockdown versus the strong effect of Pol κ knockdown suggests that Pol κ may also have a fork restart function independent of Rad18 (120-122). Surprisingly, HU treatment did not augment the effect of Pol κ knockdown (Fig. 7O and Fig. S7, p = 0.185). These results indicate that Rad18, Pol η, and Pol κ are involved in resolving non-B DNA (123). In the presence of HU, Pol κ may interact with the stalled fork in a nonproductive manner; thus, when replication is inhibited by HU, fewer structures that lead to DSBs are formed when Pol κ is depleted.

Replication stress causes (CTG/CAG) BIR
Non-B DNA structure-prone repeats can induce mutagenesis at a distance in mammalian cells (9). These mutational events are thought to result from replication fork stalling at microsatellite repeats, fork breakage, and subsequent error-prone repair in a process termed repeat-induced mutagenesis (RIM) or BIR (9,37,124).
Our results have shown that the expanded (CTG/CAG) microsatellite stalls replication forks and induces replication-dependent DSBs. To test for RIM/BIR induced by the (CTG/CAG) 100 microsatellite, we integrated a modified reporter plasmid, (CTG) 100 eGFP/TK, at the ectopic site such that the eGFP, FRT, and TK sequences become fused during FLP-mediated integration (Fig. 8A). We postulated that if BIR occurs following a hydroxyurea-induced, replication-dependent double-strand break at the (CTG/CAG) 100 sequence, invasion of the broken end into the sister chromatid would result in mutagenesis of the neighboring TK gene ~1 kb downstream (Fig. 8B).
Untreated DF/myc(CTG/CAG) 100 eGFP/TK cells produced ganciclovir (GCV)-resistant clones at a frequency of approximately three per 10⁵ cells (Fig. 8C). These data are comparable with the frequency of GCV-resistant clones stemming from DSBs at an ectopic (CGG/CCG) 153 repeat in a clonal population of murine erythroid leukemia cells (124). As in the case of the (CGG/CCG) 153 murine erythroid leukemia cells, GCV resistance likely arose by endogenous replication stress and BIR during extended clonal outgrowth of the dual-fluorescence cell line.
When DF/myc(CTG/CAG) 100 eGFP/TK cells were treated with 0.2 mM HU followed by GCV selection, the frequency of GCV-resistant colonies rose to ~7-8 colonies/10⁵ cells (Fig. 8C, p = 4 × 10⁻⁵). Thus, acute treatment with HU produced a similar number of GCV-resistant cells as extended (>1-year) clonal outgrowth. To confirm that the appearance of GCV-resistant cells is the result of BIR, we knocked down PolD3, which is necessary for BIR in yeast (40) and human cells (44,124), or knocked down BRCA2, which mitigates DSBs under conditions of replication stress and promotes Rad51-dependent BIR in human cells (44,125) (Fig. 8C). When DF/myc(CTG/CAG) 100 eGFP/TK cells were exposed to shRNA, knockdown of PolD3 (Fig. 8C, p = 0.015) or BRCA2 (Fig. 8C, p = 0.002) significantly decreased the frequency of GCV-resistant cells following HU treatment to the background levels of TK mutants accumulated during prolonged clonal outgrowth, supporting the view that DSBs at the (CTG/CAG) repeat lead to break-induced replication.
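The colony frequencies quoted above are simple ratios normalized to 10⁵ plated cells. A minimal sketch of that arithmetic follows; the colony and cell counts are hypothetical values chosen only to reproduce frequencies in the reported range, and no correction for plating efficiency is applied (an assumption of this sketch).

```python
def frequency_per_1e5(resistant_colonies: int, cells_plated: int) -> float:
    """GCV-resistant colonies per 10^5 plated cells (no plating-efficiency correction)."""
    if cells_plated <= 0:
        raise ValueError("cells_plated must be positive")
    return resistant_colonies * 1e5 / cells_plated


# Hypothetical counts, not data from the study.
untreated = frequency_per_1e5(resistant_colonies=6, cells_plated=200_000)
hu_treated = frequency_per_1e5(resistant_colonies=15, cells_plated=200_000)

print(f"untreated:  {untreated:.1f} per 1e5 cells")
print(f"HU-treated: {hu_treated:.1f} per 1e5 cells")
print(f"fold change: {hu_treated / untreated:.1f}")
```

With these illustrative counts, the untreated and HU-treated frequencies are 3.0 and 7.5 per 10⁵ cells, a 2.5-fold increase.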
We previously used inverse PCR (iPCR) to show that replication stress leads to breakage at an ectopic (CTG/CAG) 102 repeat in myc(CTG/CAG) 102 cells (126), as well as DSBs at endogenous microsatellites across the genome (126). The (CTG/CAG) 102 construct differed from the construct in DF/myc(CTG/CAG) 100 eGFP/TK cells in that there was no eGFP/TK fusion or GCV selection for cells undergoing BIR. DNA sequence analysis of iPCR products with nonallelic breakpoint junctions initiated within the ectopic site also showed that the broken site underwent nonrandom chromosomal translocations similar to genome rearrangements attributed to BIR in tumor cells (126).
Although there was no GCV selection for BIR in the myc(CTG/CAG) 102 cells, we reanalyzed the iPCR DNA-sequencing data to focus on the distribution of mutations upstream and downstream of the ectopic repeat that had been repaired by homology-mediated templating of the sister chromatid (Fig. 9). The great majority (>95%) of mutations were single base substitutions. As predicted by BIR models, base substitutions were dramatically greater downstream of the (CTG/CAG) 102 repeat than upstream (Fig. 9), consistent with the rightward replication of the repeat from the c-myc origin and with lamPCR mapping of the DSB at the downstream edge of the (CTG/CAG) 100 repeat. Subtracting the frequency of nucleotide substitutions (PCR and sequencing errors, in vivo mutations) upstream of the repeat as background, the average frequency of nucleotide substitution downstream of the (CTG/CAG) 102 repeat was ~1.5 × 10⁻⁷ substitutions per bp. This value is at least 2-3 orders of magnitude greater than recent estimates of the natural mutation frequency in humans (40,127,128).
Within the (CTG) 102 repeat, we observed expansions, contractions, inversions, and base substitutions. Interestingly, there was a strong third base periodicity of nucleotide substitutions at dG residues in the (CTG) 102 template, which peaked dramatically near the center of the microsatellite (Fig. 9B). These data are consistent with the observation that dG:dC base pairs are preferential targets for single-base substitution mutations in tumor cells (129) and that a loop at the center of a single large (CTG) 102 hairpin is a hotspot for mutagenesis during the process of BIR. Considered together, our data indicate that DSBs at (CTG/CAG) repeats lead to highly mutagenic break-induced replication.
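A third-base periodicity of this kind can be checked by binning substitution positions by their register within the repeat unit. The sketch below illustrates the idea only; it assumes positions are 0-based offsets from the first base of a pure CTG tract, and the positions listed are hypothetical rather than taken from the sequencing data.

```python
from collections import Counter

REPEAT_UNIT = "CTG"


def substitutions_by_register(positions):
    """Count substitutions at each base of the repeat unit.

    `positions` are 0-based offsets from the first base of a pure (CTG)n tract,
    so offset % 3 gives the register: 0 -> C, 1 -> T, 2 -> G.
    """
    counts = Counter(REPEAT_UNIT[p % len(REPEAT_UNIT)] for p in positions)
    return {base: counts.get(base, 0) for base in REPEAT_UNIT}


# Hypothetical substitution positions clustered at G residues near the tract center.
positions = [2, 5, 8, 10, 149, 152, 152, 155, 300]
print(substitutions_by_register(positions))
```

A strong excess in the G register, as in this toy input, is the pattern that a third-base periodicity at dG residues would produce; plotting the same counts against position would show where along the tract they peak.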

(CTG/CAG) microsatellite DSBs detected by flow cytometry
Microsatellite repeats prone to forming non-B DNA structures undergo expansion, contraction, and double-strand breakage in a variety of yeast and mammalian cell systems (7,29,36,124,130-135). Here, we used a dual-fluorescence reporter gene system to analyze DSBs in human cells. We show that DNA double-strand breaks occur at a relatively high frequency in an ectopic (CTG/CAG) 100 microsatellite expanded beyond the WT range of repeats found in the human DMPK gene. These DSBs occur in unstressed cells and were dramatically increased in cells treated with four qualitatively different replication stressors (hydroxyurea (126,136,137), aphidicolin (56), hydrogen peroxide (138), and telomestatin (107,111)), in agreement with the view that endogenous and exogenous replication stress leads to DNA DSBs. The low background level of DSBs at the ectopic site (CTG/CAG) 23 repeat suggests that expanded (CTG/CAG) 100 tracts promote replication-dependent breakage.
Replication-dependent DSBs at repeated sequences have been attributed to the propensity of these repeats to form noncanonical DNA structures (8,35,48,139,140). Consistent with this view, it has been shown that (CTG/CAG) repeats form hairpin structures in vivo (55,56,141). It is intriguing, therefore, that the cellular repair machinery treats a restriction enzyme-generated DSB differently than a replication-dependent DSB. We speculate that localized Mus81-dependent cleavage near the downstream edge of the (CTG/CAG) 100 repeat is due to a noncanonical DNA structure that is refractory to replication and repair. Similar conclusions regarding the breakage and repair of structured ends have been obtained with an AT-rich repeat derived from the FRA16D common fragile site (30, 142).
The abundance of green cells (~50%) after different forms of replication stress is consistent with models in which both ends of a replication-dependent DSB persist in the population (143); based on the abundance of dTomato− eGFP+ cells, we propose that the eGFP side of the DSB is replicated by a leftward moving replication fork (Figs. 8B and 10) (76) that produces two DSBs that are functionally single-ended.
Among other possibilities, the subsequent instability of the green cells may be due to the structure of the non-B DNA end per se, inhibition of the major pathways of repair (homologous recombination and nonhomologous end joining) with the downstream end, a nontelomeric structure of the downstream eGFP DSB end, and loss of DNA from the acentric upstream side of the DSB.
In contrast to the results presented here for the PKD1 microsatellite, Wenger et al. (144) reported the inability to detect fragile sites by cytogenetic G banding in blood cell cultures from congenital DM1 patients containing repeats as large as (CTG/CAG) 1000 after treatment with replication stressors including 0.2 mM aphidicolin. Aside from differences in cell type, Wenger et al. (144) treated cells with bromodeoxyuridine, 5′-deoxy-5-fluorouridine, or aphidicolin for 24 h immediately before chromosome spreading, whereas the present experiments treated cells for 4 days prior to 4-10-day recovery in drug-free medium and flow cytometry. It is therefore possible that in the present experiments, DSBs occur during prolonged replication stress or during replication restart following drug treatment.
We also observed that treatment of cells with telomestatin (TMS) induced DSBs at the (CTG/CAG) 100 ectopic site in vivo. Telomestatin is a known replication stressor of intramolecular G quadruplex-prone sequences, especially telomeres (101,107,145-147); however, no change in (CTG/CAG) oligonucleotide structure due to TMS could be detected in vitro by CD. These results suggest that telomestatin action at telomeres or other G quadruplex-prone sequences (107) can affect (CTG/CAG) stability in trans. We propose that G quadruplex formation elsewhere in the genome causes a diffusible state of replication stress in the cell that promotes fork slowing and non-B structure formation at the ectopic (CTG/CAG) 100 repeat.
Figure 9. High frequency of base substitutions due to BIR. A, base substitution analysis at the ectopic site following sister chromatid templated BIR. DNA was isolated from myc(CTG/CAG) 102 cells treated with 0.2 mM aphidicolin, digested with MseI, and intramolecularly circularized. The circularized DNA was amplified by inverse PCR and analyzed by high-throughput sequencing. Nonallelic recombination junctions have been published previously (126). B, interpretative map of BIR at the ectopic site. C, quantitation of base substitutions at C, T, and G residues within the (CTG) repeat.

microsatellite
During DNA replication, the lagging-strand template is expected to be more susceptible to structure formation than the leading-strand template, due to its relatively prolonged single-strandedness (6,35). Consistent with this model, replication of the PKD1 intron 21 (Pu) 88 sequence as the lagging-strand template leads to replisome stalling; recruitment of RPA, Rad9, and ATR to the stalled fork; and induction of a DNA damage response (63).
In the current work, the (Pu/Py) 88 repeat displayed replication polarity-dependent instability in the absence of exogenous stress. Administration of the G quadruplex-binding ligand telomestatin strongly enhanced the sensitivity of the (Pu/Py) 88 tract to DSBs in the same replication polarity-dependent manner, which we attribute to the stabilization of G quadruplex DNA structures in the lagging-strand template. However, the (Pu/Py) 88 tract also has the ability to form triplex H-DNA structures in the absence of telomestatin (63,89,90,148). Therefore, whereas the present results indicate that induced G quadruplex formation can stall replication forks and cause DSBs, the sensitivity of the lagging-strand (Pu) 88 tract to DSBs in unperturbed cells could also be the result of H-DNA formation. This possibility is currently under investigation.
In contrast to the DSB sensitivity of the ectopic PKD1 IVS21 (Pu/Py) 88 tract, PCR analysis of 57 patients with autosomal dominant polycystic kidney disease (ADPKD1) showed no hotspot for mutation in the PKD1 gene, although mutations were 2-3 times more frequent in the exons surrounding IVS21 than in exons 1-8 (60). Similarly, in samples from 15 tuberous sclerosis (TSC) patients in which deletions in the upstream TSC2 gene extended into PKD1, multiplex ligation-dependent probe amplification did not show clustering of breakpoints near the IVS21 (Pu/Py) 88 tract (62). One explanation for these observations may be a strong selection against PKD1 DSBs, which includes origin choice (149-152) to avoid lagging-strand replication of the PKD1 (Pu) 88 sequence.

Translesion polymerases mediate (CTG/CAG) stability
The TLS polymerases comprise a group of functionally divergent enzymes that can bypass non-B DNA and template lesions that stall replicative polymerases (123). Rad18-dependent ubiquitination of the PCNA scaffold allows TLS polymerase exchange and access to the unreplicated lesion (111,115,153-155). Multiple TLS polymerases can bind to ubiquitinated PCNA (123); thus, in the absence of one TLS polymerase, alternative postreplication repair pathways can be employed (115,123). In the dual-fluorescence assay, siRNA knockdown of Rad18 or Pol η caused a marked increase in ectopic instability when combined with HU treatment. In contrast, depletion of Pol κ led to a large increase in DSBs in otherwise unstressed cells but minimized the effect of HU treatment. We suggest that Rad18 and Pol η assist in the replication of the (CTG/CAG) 100 repeat, particularly during HU-induced fork slowing, whereas depletion of Pol κ in the presence of HU decreases the formation of a subset of difficult-to-replicate structures. Intriguingly, translesion synthesis has also been implicated in break-induced replication (156,157). Our results are consistent with reports that Pol η and Pol κ are involved in the replication of common fragile sites (158) and that knockdown of these polymerases enhances DSBs in HeLa cells transfected with plasmids containing c-myc G4-prone DNA (111).

(CTG/CAG) 100 break-induced replication
Early replicating fragile sites (ERFSs) are detected as DNA breaks in the absence of exogenous replication stress but are increased on release from hydroxyurea treatment or ATR inhibition (159). Tubbs et al. (143) recently showed that a subset of ERFSs close to replication origins containing poly(dA:dT) tracts are highly sensitive to DSBs in lymphocyte cultures treated with HU. The 2:1 ratio of DNA ends on opposite sides of the DNA breaks suggested that both arms of the replication fork are broken into DSBs, in contrast to the single-ended DSB model of RIM and BIR (40,160-163).
The ectopic (CTG/CAG) 100 repeat resembles an ERFS in its proximity to an origin (143), its early replication (164), and its sensitivity to several forms of replication stress (159). However, breakage at the (CTG) 100 may differ from the model for ERFS DSBs on both arms of a fork based on RIM/BIR mutagenesis at the flanking TK sequence upon HU treatment (i.e. the 3′ end of a replication-dependent DSB generated by (CTG) 100 fork collapse can invade and mutagenize the TK gene of an intact sister chromatid).
In the dual-fluorescence system, the frequency of GCV R colonies is comparable with recently reported values for BIR initiated by (CGG/CCG) 153 repeats in murine cells (124) and may reflect the rate of error-prone DNA synthesis during break-induced replication. However, additional factors (efficiency of mismatch repair, incidence of sister chromatid (versus nonallelic) invasion, efficiency of synthesis on the sister chromatid template, frequency of template switching) may also affect the observed frequency of mutagenesis.
A variety of replication stressors, including HU, induced breaks between the dTomato and eGFP reporter genes at the ectopic site. HU also induced a (CTG/CAG) 100 DSB localized to the edge of the repeat downstream of the c-myc replication origin. The induction of GCV-resistant clones by HU treatment and the decrease in TK mutant clones following knockdown of PolD3 or BRCA2 suggest that HU causes BIR in this system. Considered with the strong preferential occurrence of mutations downstream of the ectopic repeat in (CTG/CAG) 102 cells, these data are consistent with a model in which replication stress at the (CTG/CAG) microsatellite leads to DSBs in forks originating at the c-myc origin, resulting in BIR and mutagenesis of the downstream TK gene.
We propose that Mus81 cleavage of the stalled rightward moving fork results in a covalently open or closed hairpin end (165) of the dTomato chromosome fragment (Fig. 10). Alternatively, Mus81 cleavage may occur as the leftward moving fork stalls at a hairpin structure or template-switched/chicken-foot reversed fork structure. The abundance of green cells after HU treatment implies that the eGFP gene downstream of the (CTG/CAG) 100 repeat DSB is replicated by a leftward moving replication fork and is preserved in cells that have lost the dTomato gene (76).
We speculate that the non-B DNA structure of the dTomato end blocks repair and yields two DSBs that are functionally single-ended. The unligated DSB also leads to loss of the acentric dTomato chromosome fragment.
A small fraction of eGFP ends may undergo BIR and generate GCV S cells that are green (mutant dTomato) or yellow, whereas the majority of green cells die, inasmuch as a single unresolved DSB can cause apoptosis (166,167). BIR of the dTomato end gives cells that are GCV R (TK mutant) and red (eGFP mutant) or yellow.
Recently, Mayle et al. (76) showed that knockout of Mus81 in yeast could increase BIR mutagenesis. In this system, a mutant form of the FLP recombinase was used to generate a long-lived DNA nick that could be converted to a seDSB when traversed by a replication fork. The authors concluded that Mus81 cleavage of the BIR D-loop normally reduced BIR mutagenesis. In contrast, the present results suggest that earlier cleavage by Mus81 at stalled forks may also increase DSBs. Further experiments are under way to analyze the structures of the right and left DSB ends and test the effects of enzymes involved in BIR in processing DSBs in this system.

Hydroxyurea, aphidicolin, and telomestatin treatment assays
Hydroxyurea was used at a final concentration of 0.2 mM, aphidicolin at a final concentration of 0.2 mM, and TMS at a final concentration of 0.5 mM. The reagents were added to the medium 24 h after plating the cells and maintained until the start of recovery (2-4 days). Cells were treated with 200 mM H2O2 for 15 min. Treatment and recovery of cells were carried out in Dulbecco's modified Eagle's medium supplemented with 10% fetal bovine serum and 1% penicillin/streptomycin, in 5% CO2 at 37°C.

Flow cytometry
Cells were trypsinized and centrifuged at 300 × g for 3 min. Medium was aspirated, and cells were washed with cold PBS. After a final wash with PBS, cells were analyzed using a C-Flow® Plus Accuri cytometer. All of the results that compare the effect of treatments on a single cell line within a figure were obtained contemporaneously from sister subcultures split from the same cell population.

Statistical analysis
Student's two-tailed t test was used to analyze the statistical significance of the experimental results versus the corresponding paired controls, using the "percent green cells" shown in Figs. 4 and 10 and Figs. S1-S7, and to generate p values (GraphPad Prism 8). A value of p < 0.05 was taken to indicate statistical significance.
Unpaired Student's two-tailed t tests were used to compare (Pu) 88 versus (Py) 88 cells (Fig. 5) and (CTG) 100 versus DF/myc cells treated with TMS with or without HU (Fig. 6), as these cells were derived from separate clonal outgrowths using different integrant constructs.
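The distinction between the paired and unpaired tests can be illustrated with a short standard-library sketch. It computes only the t statistic and degrees of freedom; converting these to a two-tailed p value requires the t distribution, which the authors obtained from GraphPad Prism 8 and which is not reproduced here. The function names and the example values are hypothetical, and the unpaired version assumes equal variances (pooled-variance Student's test).

```python
from math import sqrt
from statistics import mean, stdev, variance


def paired_t(treated, control):
    """Paired Student's t: test the mean of per-pair differences against zero."""
    if len(treated) != len(control) or len(treated) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [t - c for t, c in zip(treated, control)]
    n = len(diffs)
    t_stat = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t_stat, n - 1  # (t statistic, degrees of freedom)


def unpaired_t(sample_a, sample_b):
    """Unpaired Student's t with pooled variance (equal variances assumed)."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least 2 values")
    df = n_a + n_b - 2
    pooled = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / df
    t_stat = (mean(sample_a) - mean(sample_b)) / sqrt(pooled * (1 / n_a + 1 / n_b))
    return t_stat, df


# Hypothetical "percent green cells" values, NOT data from the study.
control = [1.0, 2.0, 3.0]
treated = [2.0, 4.0, 6.0]
paired_result = paired_t(treated, control)
unpaired_result = unpaired_t(treated, control)
print(paired_result, unpaired_result)
```

With the same numbers, the paired test gives a larger t statistic because it removes the between-culture variation shared by each treated/control pair, which is why it suits sister subcultures but not independently derived clones.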

DRAQ-7 flow cytometry
After treatment of cells with a replication stress-inducing agent followed by recovery, cells were centrifuged at 500 × g for 3 min. Medium was aspirated, and cells were washed with cold PBS and spun down at 500 × g for 3 min. Cells were permeabilized with 70% ethanol at −20°C for 20 min or overnight. Cells were centrifuged, washed, resuspended in 1 ml of PBS with RNase A (0.75 mg/ml final concentration), and incubated at 37°C for 20 min. Finally, DRAQ7 dye (170) (Abcam, catalog no. ab109202) was added at a final concentration of 7.5 mM. Cells were incubated in the dark for 25 min and analyzed using a CFlow® Plus Accuri cytometer.

CD
CD spectra were collected using a Jasco J-815 CD spectropolarimeter (Jasco Inc., Easton, MD). Spectra were recorded from 320 to 220 nm with a bandwidth of 1.0 nm, scan rate of 50 nm/min, and time constant of 1 s. All DNA samples were dissolved in 10 mM Tris, 1 mM EDTA, pH 7.4, and diluted in water to a working concentration of 10 mM. Telomestatin was added to DNA samples at a final concentration of 50 mM. The CD spectra represent the average of four scans taken at 25°C and baseline-corrected for buffer. The oligonucleotides used for CD were as follows: c-myc G4, 5′-TGA GGG TGG GGA GGG TGG GTA A, (CTG) 12, and (CAG) 12.
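The scan averaging and buffer baseline correction can be sketched as below. This is an illustrative outline only: the function names are hypothetical, the spectra are assumed to be lists of ellipticity values sampled at identical wavelengths, and the numbers are placeholders rather than measured data.

```python
def average_scans(scans):
    """Point-by-point mean of several scans recorded at the same wavelengths."""
    if not scans:
        raise ValueError("at least one scan is required")
    n_points = len(scans[0])
    if any(len(scan) != n_points for scan in scans):
        raise ValueError("all scans must have the same number of points")
    return [sum(values) / len(scans) for values in zip(*scans)]


def baseline_correct(sample, buffer_baseline):
    """Subtract the buffer-only spectrum from the averaged sample spectrum."""
    if len(sample) != len(buffer_baseline):
        raise ValueError("sample and baseline must have the same length")
    return [s - b for s, b in zip(sample, buffer_baseline)]


# Placeholder ellipticity values for four scans at three wavelengths.
example_scans = [
    [1.0, 2.0, 3.0],
    [1.2, 2.2, 3.2],
    [0.8, 1.8, 2.8],
    [1.0, 2.0, 3.0],
]
example_buffer = [0.1, 0.1, 0.1]
averaged = average_scans(example_scans)
corrected = baseline_correct(averaged, example_buffer)
print(corrected)
```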

Ligation-mediated lamPCR
LamPCR was performed based on previously described conditions (71,72) with the following modifications. Genomic DNA was isolated from untransfected DF2/myc/(CTG/CAG) 100 cells, from cells 24 h after transfection with the I-SceI expression plasmid, or from cells after 4 days of 0.2 mM HU treatment. Linear amplification was performed using 5 µg of DNA and the downstream biotinylated primer 3′ F0-biot ((biotin)5′-GTCAGCTTGCCGTAGGTGG-3′) for 50 cycles. A second aliquot (0.5 µl) of HotStarTaq was added, and amplification was performed for an additional 50 cycles.
The linear amplification products were captured on streptavidin beads, washed, and ligated overnight to the adapter oligonucleotide (5′-pATCGACAACAACTCTCCTCCTCCGTGCGddC-3′) (71). The beads were washed, and the ligated products were amplified with the adapter reverse complement primer and the nested downstream primer 3′F1 (5′-GCTGAACTTGTGGCCGTTTA-3′). Products were electrophoresed on 1.5% agarose gels.

Data availability
All data are contained in the article and supporting information.