Emerging roles for R-loop structures in the management of topological stress

R-loop structures are a prevalent class of alternative non-B DNA structures that form during transcription upon invasion of the DNA template by the nascent RNA. R-loops form universally in the genomes of organisms ranging from bacteriophages, bacteria, and yeasts to plants and animals, including mammals. A growing body of work has linked these structures to both physiological and pathological processes, in particular to genome instability. The rising interest in R-loops is placing new emphasis on understanding the fundamental physicochemical forces driving their formation and stability. Pioneering work in Escherichia coli revealed that DNA topology, in particular negative DNA superhelicity, plays a key role in driving R-loops. A clear role for DNA sequence was later uncovered. Here, we review and synthesize available evidence on the roles of DNA sequence and DNA topology in controlling R-loop formation and stability. Factoring in recent developments in R-loop modeling and single-molecule profiling, we propose a coherent model accounting for the interplay between DNA sequence and DNA topology in driving R-loop structure formation. This model reveals R-loops in a new light as powerful and reversible topological stress relievers, an insight that significantly expands the repertoire of R-loops' potential biological roles under both normal and aberrant conditions.

DNA superhelicity: What is it, and why does it matter?
DNA superhelicity, discovered over 50 years ago (1-3), is an essential physical property of the DNA double helix that can be most easily understood for closed circular duplex DNA molecules, such as plasmids. Each strand in a circular duplex DNA is a circle, and these two circles are interlinked due to the helical nature of DNA (4). The number of times either strand crosses through the closed circle formed by the other strand is a fixed integer called the linking number (Lk). Lk can only be changed by transiently cutting one or both strands, followed by strand passage or rotation and religation. In the absence of any external stress, the "relaxed" linking number value, denoted here Lk₀, will reflect the geometry of the Watson-Crick B-form DNA, with one strand crossing every ~10.5 bp. Molecules with different Lk values (referred to as topoisomers) experience varying degrees of superhelical stress depending on their linking difference, α = Lk − Lk₀, also called superhelicity. Superhelicity can be either positive or negative, reflecting an excess or deficit of strand crossings, respectively. The superhelix density, σ = α/Lk₀, allows comparisons of the levels of superhelicity in molecules of different lengths.
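As a minimal sketch of these definitions, the following Python snippet computes the relaxed linking number, the linking difference, and the superhelix density for a closed circular molecule. The 4,200-bp plasmid and its linking number of 380 are hypothetical values chosen only for illustration; the 10.5 bp per turn figure is the one given in the text.

```python
HELICAL_REPEAT_BP = 10.5  # bp per helical turn of relaxed B-form DNA (from the text)


def relaxed_linking_number(length_bp, helical_repeat=HELICAL_REPEAT_BP):
    """Lk0: strand crossings expected for a relaxed molecule of this length."""
    return length_bp / helical_repeat


def linking_difference(lk, length_bp):
    """Linking difference (superhelicity): Lk - Lk0."""
    return lk - relaxed_linking_number(length_bp)


def superhelix_density(lk, length_bp):
    """Superhelix density: linking difference normalized by Lk0."""
    return linking_difference(lk, length_bp) / relaxed_linking_number(length_bp)


# Hypothetical example: a 4,200-bp plasmid with Lk = 380.
plasmid_bp = 4200
lk = 380
print("Lk0:", relaxed_linking_number(plasmid_bp))
print("linking difference:", linking_difference(lk, plasmid_bp))
print("superhelix density:", superhelix_density(lk, plasmid_bp))
```

Because the density is normalized by the relaxed linking number, the same value can be compared across molecules of different lengths, which is the point made above.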
Superhelicity can also be imposed on noncircular DNA molecules. If a piece of linear DNA is held between semi-rigid attachment points such that the diffusion of superhelical stresses beyond these points is blocked, a topologically constrained domain is formed. Examples include CTCF-anchored topologically associated domains (TADs) and lamin-associated domains in mammalian genomes (5-7). Chromatin templates themselves are also topologically constrained (8,9). Simple sequence-specific DNA-binding factors, such as the lac repressor, can act as topological domain boundaries (10-13). Prokaryotic circular genomes are partitioned into multiple dynamic, negatively supercoiled domains that contribute to genome architecture (14-16). Thus, a topological domain is any portion of a DNA molecule, linear or circular, on which superhelicity can be imposed. Although the linking difference imposed on a domain can only be changed by transiently cutting one or both of the strands, several processes alter how superhelicity is distributed within a domain (see below).
DNA superhelicity is critical to biology for a variety of reasons. First, the three-dimensional (3D) shapes available to a superhelical DNA molecule must satisfy geometrical twisting and writhing constraints, as first elucidated by Vinograd and Lebowitz (1,17). The strain on a negative superhelical domain can be accommodated either as undertwist of the DNA duplex or by the formation of toroidally or plectonemically writhed structures (18) (Fig. 1). These higher-order structures are biologically relevant because they contribute to genome folding in 3D space, increase local proximity of DNA sequences, and affect local concentrations of DNA- and RNA-binding factors involved in gene expression control and genome dynamics (19). When negative superhelicity is expressed as undertwist, the DNA can undergo strand separation, exposing the two strands in a single-stranded bubble (Fig. 1). Superhelical duplex destabilization can be critical for processes that require strand opening, such as the initiation of transcription and of DNA replication. Promoter and origin regions in prokaryotes have evolved AT-rich DNA sequences that can efficiently transition to a strand-separated state (20-24). Hundreds of Escherichia coli genes respond to superhelicity changes, supporting the notion that topology plays an important role in the control of gene expression (25-27). Negative superhelicity is also thought to facilitate transcription elongation, whereas excessive positive supercoiling impedes it (28).
DNA superhelicity also affects protein-DNA interactions. A host of DNA-binding proteins sense the topological state of DNA, with negative superhelicity generally facilitating protein-DNA interactions. Nucleosomes, for instance, preferentially form on negatively supercoiled DNA, with each nucleosome-binding event stabilizing one negative superhelical turn (29,30). The replication initiation proteins of E. coli (DnaA) and of Drosophila (ORC1), along with a number of transcription factors (31-36), also prefer binding to negatively supercoiled DNA. Negatively supercoiled topological domains therefore represent hubs of genome organization and activity in both prokaryotes and eukaryotes (37,38). Negative superhelicity can also induce the formation of alternative non-B DNA structures (see below), and these structures, in turn, can regulate gene expression (26,39,40).

All genomes experience superhelical stresses
All genomes, whether of viral, prokaryotic, archaeal, or eukaryotic origin, experience superhelical stresses as a result of transcription and replication. These processes involve large macromolecular machines that translocate processively along the DNA (41). Impediments to the rotation of these complexes around the DNA axis force the DNA to rotate instead, which imposes large amounts of torque on the fiber (30,42). This leads to a dynamic repartitioning of superhelicity, whereby a diffusive wave of positive superhelicity is pushed ahead of the advancing replication and transcription forks, and a wave of negative superhelicity of equal magnitude is produced behind. During transcription, the main focus of this review, this repartitioning of positive and negative superhelicity is referred to as the "twin supercoiling domain model" (43,44) (Fig. 2, A and B).
Transcription-induced supercoiling is a major source of superhelical stress in all genomes. Direct DNA topology measurements in mammalian cells established that transcription creates a ~1.5-kb negatively supercoiled domain upstream of (i.e. behind) active promoters (45). The superhelix density achieved varied with gene activity, reaching σ = −0.07. Assuming that transcription generates reciprocal superhelical density waves of |σ| = 0.05, then about 150 superhelical turns of each sign will be generated during each round of transcription of an average-length human gene (30 kb). On a larger scale, Naughton et al. (46) showed that the human genome consists of a series of underwound and overwound domains delineated by both GC/AT sequence transitions and binding sites for the domain-organizing CTCF protein. Underwound domains were transcriptionally active and enriched for open chromatin, and their topological states were dynamically responsive to transcription inhibition. Interestingly, transcription and its ability to form topological domains have been linked to the large-scale folding of chromatin domains (46). Independent modeling experiments have suggested that transcription-induced supercoiling facilitates both cis-interactions between loci in TADs, including promoter-enhancer contacts (47-49), and chromatin loop extrusion (19,50).
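The estimate of superhelical turns generated per round of transcription can be reproduced with a short calculation. This is a sketch under stated assumptions: the number of turns is taken as the superhelix density magnitude multiplied by the number of helical turns in the gene, and the helical repeat is left as a parameter because the text does not state which value was used for this estimate (a rounded 10 bp per turn is an assumption here; 10.5 bp per turn is the value given earlier in the text).

```python
def turns_per_transcription_round(gene_bp, abs_sigma, helical_repeat_bp):
    """Superhelical turns of each sign generated while transcribing one gene."""
    helical_turns_in_gene = gene_bp / helical_repeat_bp
    return abs_sigma * helical_turns_in_gene


gene_bp = 30_000   # average-length human gene, from the text
abs_sigma = 0.05   # assumed magnitude of each superhelical density wave, from the text

for repeat in (10.0, 10.5):
    turns = turns_per_transcription_round(gene_bp, abs_sigma, repeat)
    print(f"helical repeat {repeat} bp/turn: about {turns:.0f} turns of each sign")
```

Either choice of helical repeat gives a value close to the figure of about 150 turns quoted above.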
Not surprisingly, mechanisms that manage topological stresses are essential for efficient transcription and replication. DNA topoisomerases, a family of ubiquitous and conserved proteins (for recent reviews, see Refs. 51 and 52), transiently cut, move, and religate DNA strands to relax superhelicity. Type I topoisomerases, which transiently cut one strand of DNA to relax superhelicity, occur in all free-living organisms. Type II topoisomerases make dsDNA cuts and then pass another part of the duplex through the gap before religation, changing Lk by 2. Whereas most type II topoisomerases relax DNA, the prokaryotic DNA gyrases introduce negative supercoils to offset positive superhelical stresses (53). As a result, topological domains in prokaryotes are maintained in a negatively superhelical state with an average superhelix density σ = −0.05 (15,54).
DNA topoisomerases are closely associated with, and necessary for, normal transcription. In mammals, DNA topoisomerase I directly binds to the transcription machinery and becomes catalytically activated upon release into productive elongation (55). This fine-tuned mechanism for DNA topoisomerase I control enables the dynamic removal of positive supercoils ahead of the machinery to facilitate elongation, while preserving promoter-opening negative superhelicity toward the transcription start site. A role for DNA topoisomerase I in favoring elongation was also observed for long genes (56-58). Topo II enzymes, by contrast, have been implicated in the local management of excess supercoils around transcription start sites, particularly for highly expressed genes (45,59). The Top2B enzyme has been implicated in managing the excess promoter-proximal supercoils generated upon heat shock and serum and hormone induction at the expense of the generation of DNA breaks (60-62).

Superhelicity favors transitions to alternative non-B DNA structures
Superhelicity represents a high-energy state for DNA. At equilibrium, the energy associated with DNA superhelicity is quadratic and can be modeled as E(α) = ½Kα², where α is the linking difference (63,64). The twisting and writhing of the molecule in three dimensions reflects this energy as the DNA deforms to accommodate the imposed superhelicity. As negative superhelicity increases, transitions to alternative non-B form DNA structures will become favored. Strand opening, for example, absorbs undertwist, thereby allowing the rest of the domain to relax a corresponding amount. Whereas the transition itself costs energy through the formation of junctions between duplex B-DNA and the alternative structure itself, the accompanying relaxation provides an energy return. If this return is larger than the transition cost, then the transition is favored at equilibrium. Negative superhelicity drives transitions to a wide variety of alternative structures in vitro, including strand-separated (i.e. melted) DNA, Z-form DNA, cruciforms, H-form DNA, and R-loops, the focus of this review (39,40,65). Which transitions occur in a specific domain depends on base sequence and the level of superhelicity it experiences. Importantly, long genomic sequences will often carry multiple regions susceptible to forming various alternative structures.

Figure 2. Transcription generates superhelical stresses that can be mitigated by R-loop formation.
Transcription-driven supercoiling leads to the formation of dual waves of positive (downstream) and negative (upstream) superhelicity depicted here as interlinked plectonemic structures (A) or toroidal structures (B). C, topological disruptions caused by transcription are shown as undertwist (upstream) and overtwist (downstream). As the RNA polymerase (light blue) translocates forward, an R-loop initiates and extends. R-loop formation relaxes the upstream negative superhelical stress by absorbing undertwist within the strand opening that accompanies the formation of long R-loops. In addition, the displaced looped out ssDNA strand may wrap around the RNA:DNA hybrid in a left-handed helical fashion (bottom), further absorbing negative superhelicity.

JBC REVIEWS: R-loops in topological stress relief
These structures exist in a competitive equilibrium because the available superhelicity couples together the transition behaviors of all susceptible sites in the domain. This coupling happens because a transition at any one site will absorb superhelicity, thus decreasing the amount remaining to drive transitions elsewhere and thereby lowering their likelihood (66,67). The existence of many thousands of alternative DNA structures in the genomes of activated B cells in vivo demonstrates the presence of vast stores of negative superhelicity in mammalian genomes (68).

R-loops: Prevalent non-B DNA structures
R-loops are three-stranded nucleic acid structures consisting of an RNA:DNA hybrid and a displaced ssDNA strand (69). R-loops can form in trans as a result of the invasion of an RNA strand into complementary dsDNA. Such invasion typically requires protein-mediated catalysis, either by components of the homology-directed DNA recombination machinery (70,71) or by CRISPR-Cas systems (72,73). R-loops also form in cis during transcription upon hybridization of the nascent RNA with the template DNA strand behind the advancing RNA polymerase (RNAP) (74). The formation of a nascent RNA:DNA hybrid leaves the nontemplate DNA strand unpaired and free to wrap around the hybrid duplex (Fig. 2C). Co-transcriptional R-loops are the primary focus of this review.
Given their prevalence, R-loops represent an important class of non-B DNA structures that is increasingly the subject of investigation. R-loops can be biochemically reconstituted with high efficiency using simple in vitro transcription systems (91-93), and a variety of orthogonal methods have been developed to report their formation in plasmids and chromosomes (88,94,95). Studies have linked R-loops to a range of both positive and negative cellular outcomes, suggesting that these structures not only form in genomes but also are biologically relevant. The main purpose of this review is to highlight our understanding of the physicochemical forces that underlie R-loop formation, focusing on the role of DNA topology. We refer readers to recent reviews that cover the possible biological roles of R-loops in health and disease (69,74,96-99).

Understanding R-loops from first principles
As with other non-B DNA structures, an R-loop will be favored to form if its energy at equilibrium is lower than that of B-form DNA. As with other alternative DNA structures, the largest energy barrier to R-loop formation is the formation of the two junctions between duplex DNA and the R-loop itself. Junction energies have been measured for B/Z transitions and strand separation in the range of 10-11 kcal/mol per pair of junctions (100-103). It is possible that the value for R-loops is even higher, given that three strands must be accommodated instead of two. A recent equilibrium energy-based modeling approach (104) reveals that R-loops can compensate for this high junctional cost using two complementary paths: DNA sequence and DNA topology.
Making an R-loop involves breaking DNA:DNA base pairs and forming RNA:DNA base pairs. If the energy of the RNA:DNA base pairs is lower than that of the DNA:DNA base pairs with the same sequence, the energy of the structure as a whole will be reduced. When analyzed as dinucleotides, seven of the 16 possible combinations favor the RNA:DNA state over the DNA duplex (105). Most of these sequences are G-rich or purine-rich. This analysis suggests that R-loops should prefer to form in G-rich and G/A-rich regions of transcripts. Experimental evidence indeed shows that R-loops form efficiently from G-rich transcripts (92,93) and that G clusters constitute strong initiation points for R-loops (106,107). GC-rich DNA sequences that show strand asymmetry in the distribution of G and C bases (GC skew) are prone to R-loop formation at endogenous loci (88,94,108) and in vitro (88,92-94). Evidence from R-loop mapping studies shows that R-loop hotspots are generally (but not always) enriched for GC-skewed regions and purine-skewed regions (83,86,87,95). These results indicate that R-loop formation in a wide variety of organisms follows at least in part the intrinsic thermodynamic landscape of RNA:DNA versus DNA:DNA base pairing.
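GC skew, the strand asymmetry mentioned above, is commonly quantified as (G − C)/(G + C) over a window of sequence. The sketch below computes it along a sequence with a sliding window. The window and step sizes and the toy sequence are illustrative assumptions, not values from this review, and the sequence is assumed to be given as the nontemplate (RNA-like) strand so that a positive skew corresponds to a G-rich transcript.

```python
def gc_skew(seq):
    """Return (G - C) / (G + C) for a sequence; 0.0 if it has no G or C."""
    seq = seq.upper()
    g = seq.count("G")
    c = seq.count("C")
    if g + c == 0:
        return 0.0
    return (g - c) / (g + c)


def sliding_gc_skew(seq, window=100, step=50):
    """Return (start, skew) pairs for each full window along the sequence."""
    return [
        (start, gc_skew(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


# Toy sequence: a G-rich block followed by a C-rich block.
toy = "GGGAGGGTGG" * 10 + "CCCTCCCACC" * 10
for start, skew in sliding_gc_skew(toy, window=100, step=100):
    print(start, skew)
```

A scan of this kind only reports sequence asymmetry; as noted above, hotspots are generally but not always GC-skewed, so skew alone should not be read as a predictor of R-loop formation.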
DNA topology provides a second way by which R-loops can lower their energy below that of superhelical duplex DNA. A role for negative supercoiling in favoring R-loops has been experimentally established through a strong body of work in E. coli (for a review, see Ref. 109). Strains of E. coli deficient for the DNA-relaxing enzyme DNA topoisomerase I were shown to accumulate R-loops (76), highlighting the key role of negative superhelicity in driving R-loops during transcription. DNA gyrase, with its ability to introduce negative supercoiling into DNA (53), was the primary driver of transcription-associated hypernegative supercoiling and of R-loop formation (76). DNA topoisomerase I activity, by contrast, suppressed this phenomenon (76,91,110,111). Overexpression of RNase H, an enzyme that specifically degrades RNA in the context of RNA:DNA hybrids, partially rescued the defects arising from DNA topoisomerase I deficiency (112) and abolished the formation of DNA gyrase- and R-loop-dependent hypernegatively supercoiled plasmids (91,111). This work suggested that R-loop formation is a common occurrence in prokaryotes that is dynamically regulated by both pro-R-loop (negative supercoiling) and anti-R-loop (DNA topoisomerase I, RNase H) factors.
One interpretation of these early findings is that the underwound state of DNA associated with transcription-driven negative supercoiling (43) produces partially melted regions that permit invasion by the nascent RNA, initiating an R-loop. However, analysis of superhelical duplex destabilization shows that sites of strand separation are confined to the AT-richest regions of a domain (24). By contrast, most R-loops occur in G/C-rich locations. Thus, whereas transient supercoiling-induced strand separation or base unstacking cannot be ruled out as contributing to R-loop initiation, this view may not adequately capture the role of negative superhelicity. We suggest instead that, as described for other alternative non-B DNA structures, negative superhelicity constitutes a high-energy, stressed state of the DNA that is efficiently relieved by R-loop formation and the relaxation it produces (104). The structure of an R-loop lends itself well to relaxing negative superhelicity. In an R-loop, the two DNA strands are separated and no longer twist around each other; this allows surrounding undertwist to migrate into the R-loop bubble, relaxing the rest of the domain. Every time an R-loop grows by the helical pitch of DNA (10.5 bp), it absorbs an additional negative superhelical turn. Assuming that the displaced strand is free (which may not necessarily be the case (113,114)), it can helically wrap around the RNA:DNA hybrid, and, if the wrap is left-handed, this will absorb additional undertwist (Fig. 2C). These two effects together provide significant stress relief to the DNA domain. The ability of R-loops to lower the energy level of the DNA fiber provides a clear alternative explanation for the role of negative superhelicity in their formation and stability.
Altogether, if the combined energy saving resulting from favorable base pairing and topological relaxation exceeds the junction energy cost, then R-loop formation will be favored to occur at equilibrium.
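The balance described here can be illustrated with a deliberately simplified toy calculation; it is not the model of Ref. 104. It sums three terms: the junction cost (10.5 kcal/mol, the midpoint of the range quoted for other transitions), a per-base-pair RNA:DNA versus DNA:DNA energy difference (a hypothetical parameter), and the change in the quadratic superhelical energy when the R-loop absorbs one turn per 10.5 bp. The stiffness coefficient K = 2220·RT/N, with N the domain length in bp, is an assumption borrowed from the superhelical transition literature rather than a value given in this review, and strand wrapping is ignored.

```python
RT_KCAL_PER_MOL = 0.616    # RT at about 37 degrees C
HELICAL_REPEAT_BP = 10.5   # bp per helical turn, from the text


def rloop_free_energy(alpha, domain_bp, rloop_bp,
                      bp_energy_per_bp=0.0, junction_cost=10.5, k_coeff=2220.0):
    """Toy free energy change (kcal/mol) for forming an R-loop of rloop_bp.

    alpha            : linking difference of the domain (negative if underwound)
    bp_energy_per_bp : hypothetical RNA:DNA minus DNA:DNA energy per base pair
    k_coeff          : assumed coefficient in K = k_coeff * RT / domain_bp
    A negative return value means the R-loop is favored at equilibrium.
    """
    stiffness = k_coeff * RT_KCAL_PER_MOL / domain_bp
    # Unwinding rloop_bp of duplex absorbs rloop_bp / 10.5 turns of undertwist.
    alpha_residual = alpha + rloop_bp / HELICAL_REPEAT_BP
    relaxation = 0.5 * stiffness * (alpha_residual ** 2 - alpha ** 2)
    return junction_cost + bp_energy_per_bp * rloop_bp + relaxation


domain_bp = 3500
alpha_supercoiled = -0.05 * domain_bp / HELICAL_REPEAT_BP  # sigma = -0.05

print("supercoiled:", rloop_free_energy(alpha_supercoiled, domain_bp, 120))
print("relaxed closed domain:", rloop_free_energy(0.0, domain_bp, 120))
```

Under these assumptions, the relaxation term alone outweighs the junction cost on the supercoiled domain, whereas on a relaxed closed domain the same R-loop is penalized unless the base-pairing term is sufficiently favorable.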

R-loops are powerful, reversible, superhelical stress relievers
This model reveals R-loops in a new light as nonenzymatic topological "stress relief valves." In support of this notion, R-loop formation upon in vitro transcription of supercoiled circular plasmids is well-known to cause significant plasmid relaxation. We showed that a 3.5-kb plasmid carrying ~18 negative supercoils (assuming a superhelix density of σ = −0.05 for DNA extracted from E. coli) was partially to fully relaxed by an R-loop, indicating that these structures absorb an astounding amount of superhelicity (104). The amount of superhelical stress relief afforded by an R-loop depends primarily on its length. R-loop sizes can be analyzed at the single-molecule level by measuring the single-stranded character of the displaced DNA strand using long-read sequencing (94). Now adapted for PacBio sequencing (115), single-molecule R-loop footprinting (SMRF-Seq) was used to measure the lengths of the in vitro R-loops formed on the plasmid substrate described above. The majority of the structures ranged from 80 to 175 bp, with a median length of 120 bp (104). Such structures are expected to relax ~8-17 supercoils based on the length of the untwisted region. The helical wrapping of the displaced strand is expected to relax an additional 1-2 supercoils. Thus, R-loops of median length 120 bp relax the large majority of the negative superhelicity present in plasmids nearly 30 times their size, demonstrating the ability of R-loops to act as long-range topological relief valves.
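The figures in this paragraph follow from dividing lengths by the helical repeat. As a rough sketch, assuming one supercoil absorbed per 10.5 bp of untwisted DNA and a plasmid supercoil count of |σ| × length / 10.5, the calculation is shown below; the optional `wrap_turns` argument is a hypothetical parameter standing in for the additional 1-2 turns attributed to strand wrapping.

```python
HELICAL_REPEAT_BP = 10.5  # bp per helical turn, from the text


def turns_absorbed(rloop_bp, wrap_turns=0.0):
    """Negative supercoils absorbed by an R-loop, plus optional wrapping turns."""
    return rloop_bp / HELICAL_REPEAT_BP + wrap_turns


def native_supercoils(domain_bp, sigma):
    """Number of supercoils carried by a domain at superhelix density sigma."""
    return abs(sigma) * domain_bp / HELICAL_REPEAT_BP


plasmid_turns = native_supercoils(3500, -0.05)
print(f"plasmid supercoils: {plasmid_turns:.1f}")

for length in (80, 120, 175):
    absorbed = turns_absorbed(length)
    print(f"{length}-bp R-loop: {absorbed:.1f} turns, "
          f"{absorbed / plasmid_turns:.0%} of the plasmid's supercoils")
```

This simple formula gives about 17 supercoils for the plasmid, close to the ~18 quoted above, and reproduces the ~8-17 range for R-loops of 80 to 175 bp.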
Because the relaxation provided by R-loops is expected to grow roughly linearly with their length (104), it is important to understand the distribution of R-loop sizes in genomic sequences. Using SMRF-Seq, we interrogated a number of R-loop hotspots in the human genome at ultradeep coverage. R-loop lengths typically ranged from 200 to 500 bp, a full order of magnitude larger than other non-B DNA structures (68). Strikingly, kilobase structures, while rarer, were not uncommon (115), consistent with prior mapping data on R-loop-prone murine class switch regions (94,108,116,117). R-loops are giants in the world of non-B DNA structures and are uniquely suited to absorb large amounts of negative DNA superhelicity. A 300-bp-long R-loop (the median length of genomic R-loops (115)) is expected to fully relax a 6-7-kb DNA molecule with a superhelix density σ = −0.05. Rare kilobase-size R-loops are expected to relax vast amounts of negative superhelicity, affecting the topology of the DNA fiber over long distances. Importantly, because R-loops can hold vast stores of negative superhelicity, R-loop resolution can release this superhelicity back into the surrounding DNA domain (see Fig. 4). Indeed, R-loop formation does not involve any change in linking number: the total superhelicity of a given domain is only transiently repartitioned by either R-loop formation or resolution.
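The size of the domain that an R-loop can fully relax can be estimated by equating the turns it absorbs with the supercoils the domain carries. If an R-loop of m bp absorbs m/A turns (A being the helical repeat) plus w turns from strand wrapping, and a domain of N bp carries |σ|N/A supercoils, then N = (m + wA)/|σ|. The sketch below applies this relation; the wrapping contribution of 4 turns used for the 300-bp case is a hypothetical value chosen only to show how the estimate shifts.

```python
def domain_fully_relaxed_bp(rloop_bp, sigma, wrap_turns=0.0, helical_repeat_bp=10.5):
    """Length of DNA (bp) at density sigma whose supercoils one R-loop absorbs."""
    if sigma >= 0:
        raise ValueError("sigma must be negative for a negatively supercoiled domain")
    return (rloop_bp + wrap_turns * helical_repeat_bp) / abs(sigma)


no_wrap = domain_fully_relaxed_bp(300, -0.05)
with_wrap = domain_fully_relaxed_bp(300, -0.05, wrap_turns=4.0)  # hypothetical wrap
print(f"untwisting only: about {no_wrap:.0f} bp")
print(f"with 4 wrapping turns: about {with_wrap:.0f} bp")
```

Without wrapping, the helical repeat cancels out of the estimate, which then depends only on the R-loop length and the superhelix density; the two results bracket the 6-7-kb figure given above.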
Whereas these considerations hold true for "naked" DNA, eukaryotic chromosomes carry nucleosome arrays. Each nucleosome binds ~147 bp of DNA and stabilizes one turn of negative superhelicity (29,118,119), indicating that the superhelix density of nucleosomal winding is approximately −0.07. The dissociation of nucleosomes downstream of (i.e. in front of) the RNA polymerase releases this stored negative superhelicity, counteracting the transcription-driven wave of positive supercoiling (Fig. 3A). Similarly, the reassociation of a nucleosome behind the RNAP will stabilize one negative supercoil, lessening the accumulation of negative superhelical stress in the wake of the RNAP complex. The fact that positive supercoiling destabilizes nucleosomes and negative supercoiling facilitates their formation (120-122) nicely agrees with the notion that nucleosomes play important roles in managing supercoiling during transcription (45,46,123,124). Nonetheless, empirical observations show that transcription results in a build-up of upstream negative supercoils (45,46). In this situation, R-loops can play useful roles in relieving this superhelical stress. This, in turn, is expected to impede or prevent nucleosome redeposition behind the RNAP for two reasons. First, R-loops cannot be wrapped around nucleosomes (125), and second, relaxed DNA is a poor substrate for nucleosome formation (120,122). This agrees with observations that R-loops are associated with increased chromatin accessibility under normal conditions (87). Because R-loops are capable of absorbing virtually all of the negative superhelicity in large regions, R-loop formation could reduce the strength of nucleosomal binding over neighboring domains (Fig. 3B). Conversely, the release of the topology stored in an R-loop upon its resolution is also expected to favor rapid nucleosome binding. The details of the energetics and kinetics of these important interactions between nucleosomal binding and R-loops largely remain to be elucidated.
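The superhelix density attributed to nucleosomal winding at the start of this paragraph follows from the same definitions used for plasmids. As a brief sketch, assuming a helical repeat of 10.5 bp per turn, one stabilized negative turn over 147 bp gives:

```python
def nucleosome_sigma(dna_bp=147, turns_stabilized=1.0, helical_repeat_bp=10.5):
    """Superhelix density of DNA bound in one nucleosome."""
    relaxed_turns = dna_bp / helical_repeat_bp  # Lk0 of the bound DNA (14 turns)
    return -turns_stabilized / relaxed_turns


print(f"nucleosomal superhelix density: {nucleosome_sigma():.3f}")
```

The result, one negative turn per 14 helical turns, rounds to the value of −0.07 given in the text.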

The interplay between DNA sequence and DNA topology guides R-loop formation, elongation, and stability
The notion that DNA topology and DNA sequence cooperate to control local R-loop propensity predicts that at one extreme, R-loops could occur over highly favorable sequences without the need for significant superhelical relief. In vitro transcription of highly GC-skewed class switch sequences and CpG island regions results in R-loop formation even on linear templates (104,126). Measurements of R-loop efficiency as a function of substrate topology nonetheless revealed that negatively supercoiled plasmids favored R-loops by at least one order of magnitude, even in the context of an R-loop-prone sequence (104). At the other extreme, very high levels of superhelicity are predicted to support R-loop formation even over unfavorable DNA regions. This was confirmed by bulk in vitro transcription assays (127). Examination of individual in vitro R-loop footprints at deep coverage confirmed that R-loops tend to initiate over unfavorable sequences when plasmids are highly negatively supercoiled (104). Most R-loops observed when negative supercoiling was high were promoter-proximal, likely because these early R-loops formed first, and the relaxation they provide inhibits structure formation over more favorable downstream regions. Thus, increased negative superhelicity can alter both the frequency and the landscape of R-loop formation, at least in vitro.
Under most conditions, R-loop formation is likely to require a balance of contributions from DNA sequence and topology.
Calculations suggest that the vast majority of genomic DNA sequences will require some negative superhelicity to permit R-loop formation. This, in turn, implies that the regions observed to form R-loops in vivo are likely experiencing negative supercoiling. Thus, R-loop maps could inform us, albeit indirectly, about local in vivo levels of superhelical stresses. Regions with the most favorable RNA:DNA energetics are expected to permit R-loops even with low superhelicity (104). As such, the conserved class of highly GC-skewed CpG island promoters (128) may represent sensitive R-loop "reporters" that have evolved to transition into R-loop structures at low levels of superhelical stress. By contrast, SMRF-Seq analysis revealed multiple examples of R-loop hotspots whose sequences are not strongly favored to form R-loops (115). This suggests that these regions, which often are located in gene bodies or terminal genic regions, may experience high levels of local superhelicity that allow R-loop formation despite their weaker sequence favorability. However, we cannot rule out the possibility that R-loops might originate via multiple mechanisms, some of which could be less dependent on fundamental nucleic acid thermodynamic properties.
Whereas R-loop sequence signatures can be used to indirectly infer the possible contribution of DNA topology to their formation, direct evidence for its role also exists in eukaryotes in vivo. In yeast, loss of DNA topoisomerase I leads to increased R-loop levels over the 5′-end of the highly transcribed rDNA region (129). R-loop frequencies were further enhanced upon loss of RNase H activity (129), similar to prior observations in E. coli. In human cells, depletion of Top1 leads to a compensatory accumulation of R-loops in genes with acute topological management needs, namely long, highly transcribed, and physically constrained genes (57). Thus, DNA topology regulates R-loop formation in silico, in vitro, and in vivo, from E. coli to yeast to human cells. Furthermore, the evidence suggests that DNA topoisomerase I and R-loops share topological management duties, with R-loops playing compensatory roles in relieving negative superhelical stress in the absence of Top1.
In addition to regulating R-loop initiation, DNA topology was also predicted to control R-loop extension and therefore length (104). Once initiated, R-loops are likely to extend until the available superhelicity that drives their formation has been (partially or fully) relaxed. Alternatively, R-loops may terminate if the underlying DNA sequence becomes highly unfavorable. Discerning between these two possibilities may be possible by analyzing sequence transitions at the distal edges of individual R-loops. We and others (116) have observed that distal DNA sequence features are often less well-defined than those at proximal edges and that R-loops with the same initiation site can end at multiple downstream locations. Prior observations have shown that, although favorable G stretches are often necessary at the proximal edges of R-loops, these structures can extend through areas that otherwise would not support initiation (106). These results are expected if DNA topology regulates R-loop extension. From this, we can deduce that the length distribution of R-loops may provide information about the level of local superhelical stress that existed in their domains prior to their formation.
Finally, our work predicts that the relative contributions of DNA sequence and DNA topology to the formation of a particular R-loop will also determine the stability of that structure when faced with a change in the topology of the fiber such as might be expected from DNA topoisomerase action or strand breakage. As expected, R-loops that are primarily driven by DNA topology over less favorable sequences are highly susceptible to spontaneous resolution when that topology is lost (104). By contrast, R-loops that form over sequences with strong RNA:DNA base-pairing potential are much more resistant to topological changes. These findings have implications for our understanding of the instances of RNA:DNA hybrid or R-loop formation observed at sites of DNA double-strand breaks (130-132). Because the loss of topology induced by double-strand breaks is detrimental to both R-loop initiation and stability, we favor models in which two-stranded RNA:DNA hybrids are formed at the break, either through rehybridization of the nascent RNA to a resected ssDNA strand or through de novo loading of RNA polymerase II (130). Overall, the sensitivity of R-loops to DNA topology suggests that superhelicity-relaxing enzymes might provide an attractive mechanism for R-loop resolution. We note that D-loops, which are structurally similar to R-loops and form during recombination, are efficiently resolved through the combined action of the SGS1 helicase and Top3A DNA topoisomerase (133). Interestingly, topoisomerase 3B can cleave R-loops and D-loops (134) and reduce R-loop formation in vitro through its DNA relaxation activity (135). Loss of the Top3B-interacting partner TDRD3 causes slight elevation of R-loop levels and genome instability (135). Whether Top3B directly acts on nuclear R-loops in vivo remains to be determined, given that Top3B also plays an important role as an RNA topoisomerase in the cytoplasm (136,137).

Rethinking the potential roles of R-loops under normal and pathological conditions
As discussed above, negative superhelicity is an important and often neglected regulator of gene expression and genomic architecture. The observation that R-loops can transiently absorb and release large stores of negative superhelicity expands the repertoire of potential biological roles of these structures (Fig. 4). First, because negative supercoiling favors local strand opening, it may facilitate promoter and/or replication origin firing. An R-loop formed downstream of a promoter region will sequester local superhelicity, which would negatively impact strand opening and hence promoter activity. By contrast, the wave of supercoiling released upon resolution of this R-loop would facilitate promoter firing. The dynamic formation and resolution of R-loops may thus contribute to long-range regulation of gene expression. A similar logic can be applied to replication origins, which are often located near gene ends where R-loops are particularly prevalent (138,139). Evidence for supercoiling-mediated long-distance communication between RNA polymerases now exists in bacteria (140), and transcription contributes to the long-range mobility of cis-regulatory elements in mammals (141). Second, given that alternative non-B DNA structures compete for negative superhelicity (39,40,65), the sequestration of superhelicity by R-loop formation is expected to dramatically curtail the formation of other non-B DNA structures. Conversely, supercoil release by R-loop resolution will enable the formation of other structures, such as strand-separated bubbles, B/Z transitions, and cruciforms (Fig. 4). Third, negative supercoiling is known to facilitate some protein-DNA interactions, such as nucleosome binding, and to enhance long-range contacts between distant loci, such as promoters and enhancers (48). Transcription-induced supercoiling in particular was proposed to drive the formation of TADs (19,50).
Because the relaxation activity of R-loops is expected to affect large surrounding areas, it is possible that R-loops may exert a dynamic impact on the local structure of chromatin and on its folding in 3D space, affecting long-range contacts between distant loci.
It is interesting to consider these possible roles in light of the fact that dysfunctions of R-loop metabolism have often been invoked as a source of genomic instability. Excess R-loop formation, because it would sequester so much superhelicity, might be expected to cause TAD unfolding and chromatin opening, lower promoter-enhancer contacts, reduce the probability of promoter and origin firing, and lower the frequencies of occurrence of other non-B DNA structures (Fig. 4). This comes in addition to R-loops' documented detrimental effects on transcription elongation (129,142-146) and on potentiating transcription-replication conflicts (69,99). By contrast, lower R-loop levels, resulting, for instance, from enhanced R-loop resolution activity, might be expected to lead to higher levels of negative superhelicity throughout the genome. This, in turn, could favor the formation of other alternative non-B DNA structures, such as cruciforms, triplexes, strand-separated bubbles, and B/Z transitions, which might affect normal DNA replication and create sites susceptible to cleavage and mutagenesis (147,148). Thus, it is possible that altered R-loop homeostasis, whether characterized by increased or decreased R-loop loads, could lead to strong negative consequences for genome function and stability.