Structure-dependent recombination hot spot activity of GAA.TTC sequences from intron 1 of the Friedreich's ataxia gene.

The recombinational properties of long GAA.TTC repeating sequences were analyzed in Escherichia coli to gain further insights into the molecular mechanisms of the genetic instability of this tract as possibly related to the etiology of Friedreich's ataxia. Intramolecular and intermolecular recombination studies showed that the frequency of recombination between the GAA.TTC tracts was as much as 15 times higher than the non-repeating control sequences. Homologous, intramolecular recombination between GAA.TTC tracts and GAAGGA.TCCTTC repeats also occurred with a very high frequency (approximately 0.8%). Biochemical analyses of the recombination products demonstrated the expansions and deletions of the GAA.TTC repeats. These results, together with our previous studies on the CTG.CAG sequences, suggest that the recombinational hot spot characteristics may be a common feature of all triplet repeat sequences. Unexpectedly, we found that the recombination properties of the GAA.TTC tracts were unique, compared with CTG.CAG repeats, because they depended on the DNA secondary structure polymorphism. Increasing the length of the GAA.TTC repeats decreased the intramolecular recombination frequency between these tracts. Also, a correlation was found between the propensity of the GAA.TTC tracts to adopt the sticky DNA conformation and the inhibition of intramolecular recombination. The use of novobiocin to modulate the intracellular DNA topology, i.e. the lowering of the negative superhelical density, repressed the formation of the sticky DNA structure, thereby restoring the expected positive correlation between the length of the GAA.TTC tracts and the frequency of intramolecular recombination. Hence, our results demonstrate that sticky DNA exists and functions in E. coli.

the most ubiquitous short tandem repeats, among 10 possible trinucleotide sequences, in the human genome (3,4). Microsatellites are highly polymorphic and can be found in all chromosomes in a variety of lengths (2). Polypurine·polypyrimidine (poly(R·Y)) sequences, including GAA·TTC repeats, are frequently the longest repeating tracts in the human genome (3,4). GAA·TTC repeats are also very abundant in Alu elements (5).
Poly(R·Y) repetitive DNA tracts have been extensively studied since the discovery of their potential to adopt unusual DNA conformations, primarily triplex structures (reviewed in Ref. 6). These sequences may have several important biological functions in vivo. Polymorphism of the GAA·TTC repeats is responsible for the regulation of gene expression in Mycoplasma (7,8). Recently, large expansions of the GAA·TTC sequence in intron 1 of the Friedreich's ataxia (FRDA) gene were found to be associated with this autosomal recessive disorder (9), which belongs to a large group of trinucleotide repeat expansion diseases (10). The structural properties of the GAA·TTC repeats were related to their pathogenic potential. Intramolecular triplexes (11,12), DNA/RNA hybrid triplexes (11,13,14), and a sticky DNA structure (15-20) influenced the instability of the GAA·TTC tracts as well as the expression of the FRDA gene.
Recombination, together with replication and repair, is one of the major processes responsible for the instability of microsatellite sequences (10). In fact, expansions, deletions, and exchange of point mutations between two CTG·CAG tracts were found in yeast (21-24), mammalian cells (25,26), and in Escherichia coli using an intramolecular genetic assay as well as two-plasmid recombination systems (27-31). In addition, different types of microsatellites, especially poly(R·Y) repeats, were found to be the preferred sites of homologous recombination, presumably by virtue of forming unusual DNA secondary structures (32-36). We demonstrated previously (29, 30) that another trinucleotide repeat, the myotonic dystrophy associated CTG·CAG tract, functions as a recombination hot spot in E. coli.
Here we evaluated the recombination propensity of GAA·TTC tracts in both intramolecular and intermolecular systems in E. coli. This GAA·TTC recombination study is the first demonstration of a structure-dependent hot spot activity of a DNA sequence. In addition, this study showed that the sticky DNA structure can exist and function in living E. coli, in agreement with the conclusions described in the accompanying article (20), which used an entirely different approach.

EXPERIMENTAL PROCEDURES
Plasmids-(GAAGGA·TCCTTC)65 and the (GAA·TTC) repeats were originally cloned into the pSPL3 vector (11,37). pMP142 contains 33 (GAA·TTC) repeats; pRW3804 contains 60 (GAA·TTC) repeats; pRW3829 contains 176 (GAA·TTC) repeats, and pMP193 harbors 65 (GAAGGA·TCCTTC) repeats (11,37). All fragments were prepared by BssHII/HaeIII digestion and cloned into the EcoRI/HindIII and PvuII positions (the X and Y inserts, Fig. 1A) of pBR322 as described previously (18,29). The repeating tracts are flanked by 34 and 54 bp of the human FRDA gene. The short (GAA·TTC)33 insert contains a pure repeating motif without any interruptions. In contrast, both (GAA·TTC)60 and (GAA·TTC)176 contain interruptions (see Refs. 18,19, and this work). The sequence of the (GAA·TTC)60 insert is GAGGA(GAA)50GAAAAAGAAAA(GAA)10, and the sequence of the (GAA·TTC)176 insert is (GAA)170A(GAA)GAG(GAA)6 (18,19). The cloning of the GFPuv gene into the pBR322 derivatives was carried out as described previously (29).
In order to clone TRS tracts into pFW25 (a plasmid harboring the R6K γ origin of replication (30,38)), the BssHII/HaeIII insert containing 176 (GAA·TTC) repeats was ligated to the HincII-digested vector (30). Due to the high instability of the (GAA·TTC) repeats cloned into pFW25, we were unable to isolate plasmids harboring more than 100 (GAA·TTC) repeats. The pFW25 derivatives shown in Fig. 1B were obtained as the deletion products after the transformation of E. coli HB101 with the ligation mixture. All plasmids used here were characterized by restriction mapping (to determine the orientation and length of the cloned TRS) and dideoxy sequencing of both strands as described earlier (29). Control plasmids pRW4819gfp, pRW4873gfp, pRW4870, and pRW4823 were described previously (29,30).
All experiments were conducted in LB media at 37°C. In the case of recombination studies performed in the presence of novobiocin (Sigma), the drug was added, at 5 μM concentration, at all stages of the experiment beginning with the preparation of competent cells. In addition, plasmids used in these experiments were isolated from the E. coli HB101 cells grown with novobiocin. Novobiocin inhibits the assembly of the active gyrase (39,40) and, at the concentrations used in this work, has no influence on cell growth (41). In contrast to the quinolones, novobiocin (as well as other coumarins) does not lead to the formation of DNA double-strand breaks that may influence the results of the recombination studies (42-45).
Detection of RB-The presence of sticky DNA was determined by detection of the retarded band (RB) in 0.8% agarose gels as described earlier (15,18,19). To detect the RB, DNA was incubated for 10 min at 80°C in 50 mM EDTA (pH 8.0). This treatment completely converts the RB to the linear monomeric form (15). In order to evaluate the amount of RB in different plasmids (Fig. 2B), AccI-digested DNA fragments were radioactively labeled with the Klenow fragment of E. coli DNA polymerase I and [α-32P]dATP, and the gels were analyzed using Storm 820 and ImageQuant software (Amersham Biosciences). The statistical analyses were performed using SigmaStat version 2.03.
Polyacrylamide Gel Analyses-The sizes of the TRS tracts as well as complex (GAA·TTC)m(GAAGGA·TCCTTC)n motifs in the recombination products were determined using AatII/NdeI digestion followed by end labeling as described above. The products were resolved on 6% polyacrylamide gels in TAE buffer (40 mM Tris acetate, 1 mM EDTA (pH 8.0)). The lengths of the inserts were calculated as described earlier (46). The primary structures of more than 20 individual recombination products were determined by direct DNA sequencing of one or both DNA strands.

RESULTS
Genetic Assays for Intra- and Intermolecular Recombination-The genetic assays for monitoring the frequencies of intramolecular as well as intermolecular recombination between CTG·CAG sequences were established previously (29, 30). An intramolecular recombination system was developed in which a crossover event between direct repeats leads to the deletion of the intervening sequence, including the GFPuv gene, located between the repeats (Fig. 1C). This deletion results in white colony formation. In contrast, when the plasmid containing the GFPuv gene is established in the host cell, the expression of this gene leads to the formation of a fluorescent green colony. Thus, the frequency of the recombination events was measured as the ratio of the number of white colonies to the total number of viable cells.
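As a minimal sketch of the frequency calculation described above, the ratio can be computed from colony counts. The function name and the example counts are hypothetical, and the sketch assumes that the total number of viable cells is approximated by the sum of white and green colonies on the plate.

```python
def intramolecular_frequency(white: int, green: int) -> float:
    """Fraction of colonies that lost the GFPuv segment (white) among all colonies."""
    # Assumption: total viable cells are approximated by white + green colonies.
    total = white + green
    if total <= 0:
        raise ValueError("no colonies counted")
    return white / total


# Hypothetical counts, chosen only to illustrate the arithmetic (630 / 5000).
freq = intramolecular_frequency(white=630, green=4370)
print(f"{freq:.1%}")
```

With these illustrative counts the ratio corresponds to 12.6% of colonies being white.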
In order to study the frequency of intermolecular recombination between plasmids harboring the (GAA·TTC) repeats, derivatives of pBR322 containing different TRS lengths were transformed into E. coli AB1157 and ECF005 cells. This was followed by a second transformation with the pFW25 derivatives containing varying lengths of the (GAA·TTC) inserts (Fig. 1B). In E. coli ECF005, both plasmids can co-exist without recombining and can give rise to Tet^R and Cm^R colonies. These plasmids could also recombine to form co-integrants at a certain frequency. In order to establish the frequency of these recombination events, the (GAA·TTC)-containing pBR322 and pFW25 derivatives were sequentially transformed into E. coli AB1157. Because this was a two-step transformation, the pBR322 derivative was already established in the cell and could therefore exist independently in the presence of tetracycline. The pFW25 derivatives, however, cannot replicate in this strain because of the absence of the π protein (30). Thus, after the transformation of the Tet^R cells with the pFW25 derivatives, the only way to obtain Tet^R, Cm^R colonies is by recombination between the TRS-harboring plasmids. This could be exclusively due to co-integrants, which not only replicate using the ColE1 replicon but also carry the chloramphenicol resistance gene. Thus, the frequency of recombination between the plasmids was calculated as shown in Equation 1,
$$\text{frequency of recombination} = \frac{N_{\mathrm{AB1157}}}{N_{\mathrm{ECF005}}} \times R \qquad (1)$$
where $N_{\mathrm{AB1157}}$ and $N_{\mathrm{ECF005}}$ are the numbers of Tet^R, Cm^R colonies obtained from E. coli AB1157 and ECF005, respectively. Since two different strains were used in the experiment, it was possible that their transformation efficiencies differed and thus could affect the results. Therefore, R is the ratio of the efficiency of transformation of E. coli ECF005 to that of E. coli AB1157 established using pACYC184 (30). Several additional controls (such as plasmid copy number determination, growth advantage control, and plasmid establishment control) were conducted, as described in detail previously (29, 30).
We determined, within the limits described previously (29, 30), that none of the above-mentioned factors had a significant influence on the frequency of intramolecular as well as intermolecular recombination.
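The intermolecular calculation can be sketched in the same way. This sketch assumes one reading of the description above: the number of Tet^R, Cm^R colonies from AB1157 is divided by the number from ECF005 and scaled by R, the ratio of the transformation efficiency of ECF005 to that of AB1157. The function name and all numbers are hypothetical.

```python
def intermolecular_frequency(n_ab1157: int, n_ecf005: int, r: float) -> float:
    """Co-integrant frequency from two-strain colony counts.

    n_ab1157: Tet^R, Cm^R colonies from AB1157 (only co-integrants survive)
    n_ecf005: Tet^R, Cm^R colonies from ECF005 (both plasmids can co-exist)
    r: transformation efficiency of ECF005 divided by that of AB1157
    """
    if n_ecf005 <= 0:
        raise ValueError("ECF005 colony count must be positive")
    if r <= 0:
        raise ValueError("efficiency ratio R must be positive")
    return (n_ab1157 / n_ecf005) * r


# Hypothetical counts: 54 colonies from AB1157, 100,000 from ECF005, R = 1.0.
print(f"{intermolecular_frequency(54, 100_000, 1.0):.1e}")
```

Scaling by R compensates for the case in which one strain simply takes up DNA more efficiently than the other.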

FIG. 1. Plasmids used in this intramolecular and intermolecular recombination study between GAA·TTC repeats. The sequences of the leading strand templates of the TRS inserts are shown for all plasmids. Thus, GAA·TTC and TTC·GAA inserts correspond to orientation I and orientation II, respectively (11). The approximate positions of the origins of replication (R6K ori and ColE1) and the genes encoding the resistances to ampicillin (Amp^R), chloramphenicol (Cm^R), and tetracycline (Tet^R) as well as the GFP gene are shown (A and B). Control plasmids harboring the non-repeating sequence (354-bp fragment of the human DMPK gene, part of the exon 7 and intron 7) were constructed as described previously (29, 30). C and D, schematic diagrams of the intramolecular and intermolecular recombination studies. The homologous recombination between direct repeats (C) leads to the formation of a smaller plasmid containing only one TRS tract; the DNA segment containing the GFP gene, which originally separated the two TRS tracts, is deleted because it is an inviable recombination product. D, intermolecular recombination between the GAA·TTC tracts leads to the formation of larger co-integrants.

FIG. 2. The recombination frequency data for the CTG·CAG harboring plasmids were determined previously (29). The standard deviations in all experiments were less than 15%. B, agarose gel analyses of retarded band (RB) formation. Plasmids harboring GAA·TTC tracts of different lengths (Fig. 1A) were digested with AccI, end-labeled, and analyzed on a 0.8% agarose gel. To determine the presence of sticky DNA (RB), aliquots of the reaction mixtures were subjected to heat/EDTA treatment (see "Experimental Procedures"; identified by + lanes). The other portions of the DNA samples remained untreated (designated by − lanes). Results of the quantitative analyses of the amounts of RB are shown below. The amount of RB was calculated as a fraction of the radioactivity measured for the RB to the total radioactivity present in each lane of the gel. Slight differences in the mobilities of the retarded bands were found as expected, due to the differences in length of the inserts. The standard deviation for the amount of the RB is less than 5%.

GAA·TTC Tracts Are Preferred Sites for Intramolecular Recombination, but Long Sticky DNA Is Inhibitory-Plasmids pRW4876gfp, pRW4882gfp, and pRW4886gfp harboring pairs of GAA·TTC tracts of 33, 60, and 176 repeats, respectively (Fig. 1A), were transformed into E. coli AB1157, the parental strain. Two controls were used in these experiments as follows: pRW4819gfp containing a pair of (CTG·CAG)67 repeats and pRW4873gfp harboring a pair of non-repeating, 354-bp fragments of the DMPK gene (Fig. 1A). Fig. 2A shows that pRW4876gfp with two (GAA·TTC)33 tracts, oriented as direct repeats, recombined in E. coli AB1157 with a frequency ~10 times higher than the plasmid carrying the non-repeating sequences (pRW4873gfp). In addition, pRW4876gfp had a much higher (1.5-60 times) propensity for intramolecular recombination than plasmids harboring tracts of 17, 67, 98, and 165 CTG·CAG repeats (29) (Fig. 2A). Surprisingly, the studies conducted with pRW4882gfp, carrying longer GAA·TTC tracts (of 60 repeats), showed a decrease in the frequency of intramolecular recombination from 12.6% for the plasmid with a pair of (GAA·TTC)33 repeats to 7.4% for pRW4882gfp. Moreover, a further increase in the length of the recombining sequences, up to 176 repeats, led to an even more dramatic suppression of recombination (Fig. 2A). Thus, increasing the length of the recombining sequences from 33 to 176 GAA·TTC repeats caused an ~3-fold decrease in the recombination frequency. These results were confirmed by three independent experiments performed in E. coli AB1157. The same tendency was observed in another wild type E. coli strain, KMBL1001 (Fig. 3A, filled symbols).
It should be pointed out that, in all cases, regardless of the length of the recombining sequences, the frequency of recombination between the GAA·TTC repeats was several times higher than for the nonrepeating, control DMPK fragments (Fig. 2A). Thus, we conclude that GAA·TTC repeats per se are preferred sites (hot spots) for intramolecular recombination.
In order to determine the cause of the inverse correlation between the frequency of recombination and the length of homologous sequences, a restriction analysis of the plasmids harboring GAA·TTC tracts was performed (Fig. 2B). As expected, when the two GAA·TTC repeat inserts were oriented as direct repeats, a DNA triplex structure between the distantly located TRS tracts, sticky DNA, was formed (15,18,19). The presence of sticky DNA was determined routinely by the detection of the substantially retarded band on agarose gel electrophoresis after restriction digestion of the plasmids. All three plasmids analyzed showed the presence of the retarded band (Fig. 2B). To identify the RB, the cleaved DNA samples were incubated for 10 min at 80°C in 25 mM EDTA (pH 8.0) as described earlier (15-19). Under these conditions, RB is completely converted to the linear restriction fragments (Fig. 2B). The amount of sticky DNA (calculated as a percent of the RB in the total amount of DNA) varied depending on the length of the GAA·TTC tract. In the case of pRW4876gfp, the sticky DNA comprised ~1% of the total DNA. On the other hand, the longest repeats ((GAA·TTC)176) had a very high propensity (36%) to form stable sticky DNA (Fig. 2B). In agreement with previous studies (15,18), there is a length threshold of ~60 GAA·TTC repeats for efficient formation of sticky DNA.
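The percentage of sticky DNA described above is the retarded band signal divided by the total signal in the lane. A minimal sketch follows; the function name and the signal values are hypothetical, and background subtraction is assumed to have been done already.

```python
from typing import Sequence


def rb_percent(rb_signal: float, other_band_signals: Sequence[float]) -> float:
    """Retarded band (RB) signal as a percentage of total lane signal."""
    # Assumption: all inputs are background-corrected band intensities.
    total = rb_signal + sum(other_band_signals)
    if total <= 0:
        raise ValueError("lane contains no signal")
    return 100.0 * rb_signal / total


# Hypothetical band intensities for one lane (arbitrary units).
print(round(rb_percent(364.0, [400.0, 236.0]), 1))
```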
Hence, the inverse relationship between the amount of sticky DNA and the frequency of intramolecular recombination suggests that the in vivo triplex formation between two recombining GAA·TTC tracts influenced the frequency of intramolecular recombination. To test this prediction, we altered the propensity of E. coli to form the sticky DNA structure by modulating the in vivo DNA superhelical density.
Superhelical Density Influences the Frequency of Intramolecular Recombination between GAA·TTC Tracts-In order to further evaluate if sticky DNA formation in vivo is responsible for the unexpected recombination behavior of the GAA·TTC sequences, the in vivo recombination assay was conducted in the presence of novobiocin. Novobiocin, a coumarin antibiotic, inhibits the assembly of an active gyrase (39, 40), causing a relaxation of DNA in vivo. This would result in a decreased propensity of DNA sequences to adopt underwound unusual structures such as triplexes and sticky DNA at GAA·TTC repeat tracts. Our previous studies (41) demonstrated in E. coli KMBL1001 that the population of more highly negative supercoiled topoisomers of pUC19 was reduced in the presence of novobiocin. In addition, the formation of sticky DNA is limited to directly repeating GAA·TTC tracts and depends on negative supercoiling as well as the presence of divalent metal ions (15,18,19). Novobiocin was kept in the media throughout, starting from the preparation of the competent cells. The concentration of novobiocin used did not inhibit E. coli growth (data not shown). The biochemical assay for detection of the sticky DNA (15-19), performed on plasmids isolated from KMBL1001Nov (KMBL1001 cells grown with novobiocin), revealed a complete lack of the RB for pRW4876gfp and pRW4882gfp. In the case of pRW4886gfp, the amount of the RB was reduced from 36.4% (in the absence of novobiocin) to ~15% (in the presence of novobiocin; data not shown).

FIG. 3. For each plasmid, two or more independent experiments were performed, and at least 5,000 colonies were counted. The plasmids employed, which harbored 33, 60, and 176 GAA·TTC repeats, were pRW4876gfp, pRW4882gfp, and pRW4886gfp, respectively. E. coli KMBL1001 cells were used for this study due to their higher transformation efficiency than AB1157. B, model for sticky DNA-mediated modulation of the frequency of intramolecular recombination between GAA·TTC sequences.

Three plasmids, pRW4876gfp, pRW4882gfp, and pRW4886gfp, were introduced into the KMBL1001 strain, and the frequency of intramolecular recombination was calculated. As shown in Fig. 3A (filled symbols), the relative frequencies obtained with KMBL1001 cells parallel the data described earlier for AB1157 (Fig. 2A); lengthening the GAA·TTC tracts decreased the frequency of intramolecular recombination. The same experiment was conducted in KMBL1001 cells cultured in the presence of 5 μM novobiocin (Fig. 3A, open symbols). The change in the DNA negative superhelical density and hence the propensity of the longer GAA·TTC tracts to form sticky DNA had a dramatic effect on the frequency of intramolecular recombination (Fig. 3A, open symbols). pRW4876gfp harboring the shortest GAA·TTC tract recombined in KMBL1001Nov with a significantly lower frequency (p < 0.001) than in the cells grown without the drug. On the contrary, the frequency of recombination between a pair of very long GAA·TTC tracts, capable of forming a stable sticky DNA structure (Fig. 2B), increased significantly (p = 0.003) in the cells treated with novobiocin (Fig. 3A, open symbols). No statistically significant difference was observed in the frequencies of recombination between fragments containing 60 GAA·TTC repeats (pRW4882gfp). Thus, the correlation between the length of the GAA·TTC tracts and the frequency of intramolecular recombination in the presence of novobiocin resembles the data obtained for CTG·CAG tracts (29).
Hence, two opposing effects of novobiocin were found: an inhibition of recombination between the short GAA·TTC tracts and a stimulation of this process for longer repeats. In the control experiments performed with CTG·CAG repeats of different lengths (17, 67, and 98 triplets), the in vivo relaxation of the plasmids in the presence of novobiocin led to a 2-4-fold decrease in the frequency of intramolecular recombination (data not shown). Therefore, the behavior found for the GAA·TTC containing plasmids (Fig. 3A) is specific for this TRS.
All of the data strongly suggest that the properties of sticky DNA are responsible for the unexpected recombination behavior of the GAA·TTC tracts (Fig. 3B, see "Discussion"). Furthermore, we conclude that the sticky DNA structure exists in vivo in E. coli cells, as also proposed in the accompanying paper (20).
DNA Structure Promotes Homologous Recombination between GAA·TTC Triplets and GAAGGA·TCCTTC Hexanucleotide Repeats-The hexanucleotide GAAGGA·TCCTTC repeats were found in intron 1 of the FRDA gene of unaffected individuals (37). It has been shown previously that these non-pathogenic sequences cannot form sticky DNA structures by themselves (16,17). On the other hand, GAAGGA·TCCTTC repeats together with the GAA·TTC tracts adopted the sticky conformation but with a lower efficiency than pure trinucleotide repeats (19).
We used the intramolecular system (Fig. 4A) to study the propensity of these two non-perfectly homologous tracts to recombine. Despite the relatively low level of homology (83%, 1 mismatch every 6 bp), the GAA·TTC tracts recombined with the hexanucleotide repeats in E. coli KMBL1001 with an ~3-fold higher frequency than non-repeating, perfectly homologous control sequences and with only ~2.4 times lower frequency than that observed for the longest (GAA·TTC)176 tracts (Table I). This result suggests a strong influence of the DNA structure formed by the polypurine·polypyrimidine tracts on the recombinational capacity of these sequences.
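The 83% homology figure quoted above follows from aligning two GAA triplets against one GAAGGA hexamer, which differ at one position in six. The short sketch below checks this with an ungapped, position-by-position comparison; the function name is hypothetical, and only the purine-rich strand is compared.

```python
def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity between two equal-length sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


two_triplets = "GAA" * 2   # GAAGAA
one_hexamer = "GAAGGA"
print(round(percent_identity(two_triplets, one_hexamer), 1))  # 83.3
```

The same value is obtained for longer in-register tracts, because the single mismatch recurs once per hexamer unit.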
The repeat sequence lengths (see below) and the DNA sequences of ~30 recombinants were analyzed by restriction mapping and direct sequencing (Fig. 4, B and C). As predicted, more than 80% of the recombinants contained a complex repeating motif composed of a stretch of GAA·TTC repeats followed by GAAGGA·TCCTTC repeats. Despite the inherent difficulty in sequencing long repeating tracts (47), in 12 cases we were able to analyze the entire sequence of the repeating tract including the junction between the triplet and the hexamer repeats (Fig. 4, B and C); ~20% of the clones contained only one repeating unit (either GAA·TTC or GAAGGA·TCCTTC). The shortest recombinant analyzed harbored only 20 GAA·TTC repeats and the longest was composed of 116 triplets followed by 55 hexanucleotide repeats. The extent of deletions during recombination was much higher for the GAA·TTC tract than for the hexanucleotide repeats (Fig. 4C).
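Lengths of mixed tracts are expressed elsewhere in this paper as triplet equivalents. A small sketch of that conversion is given here, assuming that one hexanucleotide repeat counts as two triplet equivalents (6 bp versus 3 bp); the function name is hypothetical.

```python
def triplet_equivalents(triplets: int, hexamers: int) -> int:
    """Total tract length in triplet units for a (GAA)m(GAAGGA)n motif."""
    if triplets < 0 or hexamers < 0:
        raise ValueError("repeat counts must be non-negative")
    # Assumption: each 6-bp hexamer repeat equals two 3-bp triplet units.
    return triplets + 2 * hexamers


# Longest recombinant described above: 116 triplets followed by 55 hexamers.
print(triplet_equivalents(116, 55))  # 226
```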
Recombination-mediated Instability of GAA·TTC and GAAGGA·TCCTTC Repeats-To study the instability of GAA·TTC repeats resulting from intramolecular recombination, plasmids were isolated from a representative number of white colonies and analyzed by restriction mapping and DNA sequencing (Fig. 5A). These analyses showed that the only product of intramolecular recombination between direct repeats is that of intramolecular deletion of the DNA segment between the homologous tracts (Figs. 1C and 4A). Other types of rearrangements were not observed. Between 30 and 60 individual isolates from each recombination experiment were analyzed by AatII/NdeI restriction digestion and polyacrylamide gel electrophoresis (Figs. 4B and 5A).
In the case of pRW4876gfp and pRW4882gfp, expansions, deletions, and retentions of the progenitor length of the GAA·TTC tracts were observed (Fig. 5B). Expansions (~60% of all products) were the predominant product of the intramolecular recombination for the plasmid containing 33 repeats, and the average size of the recombinants was 41 GAA·TTC repeats. For pRW4882gfp, the expansions and deletions were almost equally frequent (Fig. 5B).
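The classification of recombination products into expansions, deletions, and retentions can be sketched as a comparison of each product's repeat count with the progenitor length. The function names and the repeat counts below are hypothetical, and the sketch assumes that repeat counts are known exactly, whereas gel-based sizing has limited resolution.

```python
from collections import Counter
from typing import Dict, Iterable


def classify(product_repeats: int, progenitor_repeats: int) -> str:
    """Label one recombination product relative to its progenitor tract."""
    if product_repeats > progenitor_repeats:
        return "expansion"
    if product_repeats < progenitor_repeats:
        return "deletion"
    return "retention"


def outcome_fractions(products: Iterable[int], progenitor_repeats: int) -> Dict[str, float]:
    """Fraction of products falling into each outcome class."""
    labels = [classify(n, progenitor_repeats) for n in products]
    if not labels:
        raise ValueError("no recombination products supplied")
    counts = Counter(labels)
    return {k: counts[k] / len(labels) for k in ("expansion", "deletion", "retention")}


# Hypothetical repeat counts for products of a 33-repeat progenitor.
print(outcome_fractions([41, 45, 50, 33, 20], progenitor_repeats=33))
```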
Thus, short and medium length GAA·TTC tracts exhibited similar propensity for deletions and expansions, and the average length of the repeating sequence in the recombination products was close to the size of the progenitor sequence (Fig. 5C, open circles). This result is as expected, presuming that the recombination events were random and reciprocal. We obtained similar data during the CTG·CAG recombination studies (29) (Fig. 5C, filled symbols). In the case of the short CTG·CAG repeats (17 and 67 triplets), the average length of the TRS tract in the recombination products was equal to the full-length progenitor sequences. For longer tracts (≥98 CTG·CAG), a very small bias toward deletions was found (Fig. 5C, filled symbols).
However, two plasmids, pRW4886gfp and pRW5000gfp, showed a significant discrepancy from the linear pattern observed for the CTG·CAG tracts and short GAA·TTC repeats. The recombination products, obtained for these parental plasmids (which are capable of adopting the sticky DNA structure), carried a significantly smaller number of repeats than the progenitor sequence (Fig. 5C, open square and triangle). This behavior is especially apparent in the case of pRW4886gfp, which harbored a pair of (GAA·TTC)176 sequences. Restriction analyses revealed that all recombination products (36 total) contained deleted tracts, with no exceptions (Fig. 5, A and B). The average size of the GAA·TTC tract was 64 repeats (Fig. 5C, open triangle) and the longest recombinant plasmid found harbored only 134 repeats.
The analysis of the products of recombination between GAA·TTC and GAAGGA·TCCTTC showed a predominance of deletions with the average length of the tract being 123 triplet equivalents (recombinants usually contained a complex (GAA·TTC)m(GAAGGA·TCCTTC)n motif, see above). These unexpected results can be explained by the influence of a stable sticky DNA structure during the intramolecular recombination processes. We hypothesize that the sticky DNA structure hampers the formation or resolution of the recombination intermediates. Additionally, an intramolecular triplex may decrease the accessibility and/or availability of the GAA·TTC repeat sequences not involved in sticky DNA formation for recombination (see "Discussion"). If the above-stated hypothesis is correct, we would predict that the size of the GAA·TTC inserts in the recombinants formed in the cells grown in the presence and in the absence of novobiocin should differ. Therefore, we analyzed the length of the TRS tracts in the pRW4886gfp recombination products found in E. coli KMBL1001Nov. In the presence of novobiocin, the average size of the GAA·TTC insert was 28 repeats longer than in the cells cultured without the gyrase inhibitor (92 versus 64 repeats; compare the open triangle with the open diamond in Fig. 5C). In addition, three cases of GAA·TTC repeat expansions were detected with the longest recombinant harboring 276 repeats. Thus, sticky DNA not only modulates the frequency of intramolecular recombination but also influences the recombinational outcome, i.e. the size of the repeating tracts in the recombination products.

FIG. 4. Intramolecular recombination between two non-perfectly homologous poly(R·Y) tracts. A, intramolecular recombination between GAA·TTC and GAAGGA·TCCTTC sequences leads to the formation of a plasmid harboring a complex repeating tract composed of both trinucleotide and hexanucleotide repeat sequences. B, restriction analyses of recombination products. Direct sequencing of ~20 clones of the recombination products revealed that 80% of the molecules harbor both the repeating sequences in immediate juxtaposition, demonstrating a recombination event between the two sequence types. The result of the DNA sequencing of clone g (shown on the gel) is presented at the bottom. This recombination product carries 15 (GAA·TTC) repeats that are immediately followed by 22 (GAAGGA·TCCTTC) repeats. The arrowhead indicates the junction point between the GAA·TTC and GAAGGA·TCCTTC repeats. C, results of the sequence analyses of the recombination products. We were able to analyze the entire repeating motif for these 12 recombinants. The total numbers of repeats for each clone are expressed as triplet equivalents. KBL, kbp ladder.

TABLE I Frequency of intramolecular recombination between (GAA·TTC) and (GAAGGA·TCCTTC) in E. coli KMBL1001
E. coli KMBL1001 was transformed with pBR322 derivatives containing a pair of (GAA·TTC)176 tracts (pRW4886gfp), containing both (GAA·TTC)176 and (GAAGGA·TCCTTC)65 tracts oriented as direct repeats (pRW5000gfp) or inverted repeats (pRW5002gfp), and harboring a pair of non-repeating homologous DMPK gene fragments (pRW4873gfp). The transformation mixtures were plated onto plates with ampicillin, and the numbers of white and green fluorescent colonies were counted. The frequency of recombination was calculated as described previously (29).

FIG. 5. The influence of GAA·TTC lengths on intramolecular recombination-mediated instability. A, analyses of the sizes of the TRS tracts in the products of recombination between a pair of (GAA·TTC)60 tracts (pRW4882gfp) and a pair of (GAA·TTC)176 tracts (pRW4886gfp). The plasmids were isolated from white colonies and were digested with AatII/NdeI to release the TRS-containing inserts. Labeled DNA fragments from individual colonies were separated by 6% PAGE to determine the lengths of the GAA·TTC sequences. A 1-kbp ladder (KBL) was used to determine the lengths of the TRS tracts. The arrowheads identify the approximate positions of the original full-length inserts containing 60 (pRW4882gfp) and 176 repeats (pRW4886gfp). B, the frequencies of the GAA·TTC repeat expansions and deletions mediated by intramolecular recombination. The lengths of the TRS-containing fragments (shown in A) were measured, and the numbers of GAA·TTC repeats were calculated as described earlier (46). Filled regions, deletions; cross-hatched regions, retention of size of progenitor sequence; white regions, expansions. Each bar represents the data collected from the analysis of ~40 clones. C, the average size of the TRS tracts in the recombination products. The diagonal interface between the shaded and the unshaded areas represents the original length of the progenitor sequence. Filled symbols, CTG·CAG recombinants; open symbols, GAA·TTC recombinants (for details, see "Results").

Long GAA·TTC Tracts Stimulate Intermolecular Recombination-An overview of the intermolecular recombination system is described under "Genetic Assays for Intra- and Intermolecular Recombination." A set of plasmids (Fig. 1B) harboring GAA·TTC tracts of different lengths and orientations relative to the origins of replication was introduced into E. coli AB1157 and ECF005 by a two-step transformation procedure (see above). The frequencies of intermolecular recombination between these tracts were calculated and compared with the background level of recombination (pBR322 + pFW25) and recombination between non-repeating sequences (pRW4870 + pRW4328, Table II). The background level of recombination in this system was determined to be 6 × 10^-7 by measuring the frequency of recombination between control vectors pBR322 and pFW25, which are essentially non-homologous. Two precautions were taken to avoid the influence of sticky DNA as well as deletions of long GAA·TTC tracts on the results of the recombination experiments. Prior to the transformation with pFW25 derivatives, DNA was isolated from competent cells carrying the pBR322 derivatives. The stabilities of the GAA·TTC sequences were analyzed by restriction digestion and polyacrylamide electrophoresis. A minimum of 90% of the plasmids had to carry the full-length, undeleted progenitor sequence. If the fraction of the deleted plasmids exceeded 10% of the total plasmid population, the entire batch of competent cells was discarded.
Furthermore, the amount of plasmid dimers and higher order multimers (capable of adopting the sticky DNA structure) was determined using agarose gel electrophoresis. Varying amounts of plasmid multimeric forms may change the number of GAA⅐TTC tracts potentially available for recombination (e.g. a dimer represents two TRS inserts potentially accessible for recombination, etc.). Competent cells harboring plasmids with a substantial fraction of dimers and/or multimers (>10%) were eliminated. However, the residual small differences in the plasmid status in the competent cells may contribute to the elevated levels of the standard deviation observed in some experiments (Table II).
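The two acceptance criteria described above can be summarized as a simple check. The following is only an illustrative sketch: the function name and the way the fractions are supplied are assumptions, and only the 90% and 10% thresholds come from the text.

```python
def batch_passes_qc(full_length_fraction, multimer_fraction,
                    min_full_length=0.90, max_multimer=0.10):
    """Return True if a batch of competent cells meets both criteria.

    full_length_fraction: fraction of plasmids carrying the undeleted
        progenitor GAA.TTC tract (from restriction digestion and
        polyacrylamide electrophoresis).
    multimer_fraction: fraction of plasmids present as dimers or higher
        order multimers (from agarose gel electrophoresis).
    The thresholds are those stated in the text; everything else in this
    helper is hypothetical.
    """
    enough_full_length = full_length_fraction >= min_full_length
    few_multimers = multimer_fraction <= max_multimer
    return enough_full_length and few_multimers


# Hypothetical batches, for illustration only.
print(batch_passes_qc(0.95, 0.05))  # True: both criteria met
print(batch_passes_qc(0.85, 0.05))  # False: too many deleted plasmids
print(batch_passes_qc(0.95, 0.15))  # False: too many multimers
```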
The results (Table II) demonstrate that the frequency of intermolecular recombination between GAA⅐TTC tracts is influenced by the following two major factors: first, the length of the recombining sequences; second, the orientation of the GAA⅐TTC tracts relative to the origins of replication. The lengthening of the GAA⅐TTC tracts in both of the recombining plasmids increased the frequency of recombination ~10 times (from 0.6 × 10⁻⁴ for pRW4858 + pRW5104 to 5.4 × 10⁻⁴ for pRW4854 + pRW5106 in orientation I, and from 3.8 × 10⁻⁴ for pRW4859 + pRW5107 to 28.8 × 10⁻⁴ for pRW4855 + pRW5109 in orientation II). Thus, contrary to the intramolecular recombination frequencies (described above), the frequency of intermolecular recombination between the GAA⅐TTC tracts increased with the length of the recombining sequences. These results are similar to the data obtained in the same intermolecular system for CTG⅐CAG tracts (30).
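The arithmetic behind these comparisons can be sketched as follows. The four frequencies are the values quoted in the paragraph above. The helper names and the colony counts in the usage line are hypothetical, and the ratio is a simplified reading of the Table II legend (colonies from E. coli AB1157 divided by colonies from ECF005); any further normalization described under "Experimental Procedures" is not reproduced here.

```python
def recombination_frequency(colonies_ab1157, colonies_ecf005):
    """Ratio of colony counts, as described in the Table II legend.

    The argument names are hypothetical placeholders.
    """
    if colonies_ecf005 <= 0:
        raise ValueError("ECF005 colony count must be positive")
    return colonies_ab1157 / colonies_ecf005


def fold_change(frequency_long, frequency_short):
    """Fold increase in recombination frequency upon tract lengthening."""
    return frequency_long / frequency_short


# Frequencies quoted in the text (short tracts, long tracts).
orientation_I = (0.6e-4, 5.4e-4)    # pRW4858 + pRW5104, pRW4854 + pRW5106
orientation_II = (3.8e-4, 28.8e-4)  # pRW4859 + pRW5107, pRW4855 + pRW5109

for name, (short, long_) in (("I", orientation_I), ("II", orientation_II)):
    print(f"orientation {name}: {fold_change(long_, short):.1f}-fold")

# Hypothetical colony counts, for illustration only.
print(recombination_frequency(54, 100_000))
```

With the quoted values, the length-dependent increase is about 9-fold in orientation I and about 7.6-fold in orientation II, in line with the rounded figure of roughly 10 times.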
Previous in vivo studies demonstrated that the genetic instabilities of various TRS sequences, caused by either replication (48-52) or recombination (30), are determined by the orientation of the repeats relative to the origin of replication. Experiments conducted here with the GAA⅐TTC tracts in orientation I or orientation II (GAA repeats on the leading and lagging strand template, respectively) showed that the frequency of recombination was up to 20 times higher when the GAA⅐TTC repeats in the replicating plasmid (the pBR322 derivative) were in orientation II (Table II). The influence of orientation was statistically significant (p < 0.05) in seven of nine sets of recombination experiments (Table II, orientation I versus orientation II). These findings suggest that events occurring during replication, such as replication fork arrest and DNA polymerase pausing at unusual DNA structures formed by GAA⅐TTC tracts,² could stimulate intermolecular homologous recombination between these sequences.
E. coli AB1157 and ECF005 were transformed with pBR322 derivatives followed by transformation with the pFW25 derivatives harboring different lengths of (GAA⅐TTC)n tracts or the control non-repeating tracts. The insert orientations were described under "Experimental Procedures." The transformation mixtures were plated onto plates with chloramphenicol and tetracycline, and the number of colonies was counted. The frequency of recombination was calculated as a ratio between the number of colonies obtained from E. coli AB1157 and the number of colonies obtained from ECF005 (for details see "Experimental Procedures"). The standard deviations are the average of two or more experiments. NA, not applicable.
GAA⅐TTC repeats are hot spots for genetic recombination as monitored by both inter- and intramolecular recombination assays in vivo in E. coli.
The frequency of homologous recombination between these sequences was up to 15 times higher than between two non-repeating sequences of similar length. In contrast to the previously studied myotonic dystrophy CTG⅐CAG tracts (29), the frequency of intramolecular recombination between GAA⅐TTC repeats decreased with the lengthening of the recombining tracts. This unusual recombinogenic behavior of the FRDA sequence is caused by its capacity to adopt the sticky DNA structure in living cells. Accordingly, this behavior can be modulated by the regulation of DNA superhelical density in vivo. The sticky DNA conformation also influences the genetic instability of the GAA⅐TTC tracts, reducing their propensity to expand during the recombination process. Again, this effect can be reversed by the modulation of the in vivo negative superhelical density of the DNA, using the gyrase inhibitor novobiocin. Thus, we conclude that the sticky DNA structure exists in E. coli and can regulate important intracellular processes such as recombination. In addition, recombination between the GAA⅐TTC tracts is a significant source of the instability (both deletions and expansions) of these sequences.
GAA⅐TTC tracts are structurally polymorphic and may exist in the following distinctly different conformations: (a) an orthodox right-handed B-DNA duplex; (b) a folded back intramolecular triplex composed of one consecutive stretch of R⅐Y repeats (12, 53-55) (as shown in Fig. 6, top left, of Ref. 20); and (c) the sticky DNA conformation, a complex triplex structure formed by two directly repeated GAA⅐TTC tracts that are distantly located in a DNA (as modeled in Fig. 6, bottom, of Ref. 20) (15-19). Our data demonstrate that the recombinational properties of this sequence depend on the type of structure that it adopts.
Namely, short GAA⅐TTC repeats ((GAA⅐TTC)30) that can exist in the cell as a B-DNA duplex or a triplex, but cannot form the sticky DNA structure due to their length (15,18), are excellent substrates for intramolecular recombination and therefore recombine with the highest frequency (Fig. 3B). We believe that folded back intramolecular triplex structures formed by short tracts are responsible for the recombination hot spot activity of GAA⅐TTC repeats, because lowering the in vivo negative superhelical density, using novobiocin, decreases the potential of these tracts to recombine (Fig. 3B). Given that high superhelical density facilitates the formation of unusual DNA structures, especially triplexes, the change in recombination frequency induced by the gyrase inhibitor novobiocin implicates the involvement of supercoiling-dependent structures in the stimulation of recombination. Therefore, these results clearly demonstrate that the higher recombinogenic capacity of the GAA⅐TTC sequences is due to their structural properties rather than their ability to align with each other in multiple frames during the strand exchange reaction.
Repetitive sequences, including poly(R⅐Y) tracts as well as trinucleotide repeats, were shown previously to stimulate recombination (29, 30, 32, 56-60). The formation of unusual DNA secondary structures was proposed in all cases to be responsible for the recombination hot spot activity of these sequences. DNA triplexes are among the sequences showing the greatest recombination capacity (33,61). The (GA⅐TC) triple-stranded DNA was found to be a hot spot for in vivo as well as in vitro recombination (34-36, 62). Also, C⁺(GC) triplexes enhanced recombination in E. coli in a transcription-dependent manner (63). Several mechanisms were proposed to explain the high recombinational potential of the triplexes. Recently, Benet and Azorin (35) demonstrated that formation of triple-stranded DNA at the recombining molecules inhibits branch migration; the arrest of the Holliday junction could account for the recombination hot spot character of DNA triplex structures (35). Also, triple-stranded DNA can be indirectly involved in the recombination process; the single-stranded DNA regions (an inherent part of any triplex structure) can be recombinogenic due to their susceptibility to endonucleolytic cleavage (34). DNA cleavage could lead to the formation of single-stranded nicks or double-stranded breaks that can stimulate homologous recombination (64,65). In addition, GA⅐TC and GAA⅐TTC tracts are known to cause DNA polymerase stalling in vitro (14,66,67) and replication fork arrest in vivo.² This impediment may result in the replication fork collapse followed by activation of the double-strand break repair pathway and an increase in the frequency of recombination (65,68,69).
Previous studies (70) demonstrated that as little as 4% heterology between tandem repeats reduced the frequency of recombination by as much as 2 orders of magnitude. It was also shown that interruptions (polymorphisms) in CTG⅐CAG repeating tracts lowered their recombination frequency (27,28,30). Surprisingly, we found that the 17% sequence divergence between the GAA⅐TTC tracts and the GAAGGA⅐TCCTTC hexamer repeats resulted in only a small decrease in the frequency of intramolecular recombination when compared with a pair of perfectly homologous (GAA⅐TTC)176 tracts. This high efficiency of intramolecular homologous recombination supports the idea that folded back intramolecular triplex structures formed by GAA⅐TTC repeats and/or GAAGGA⅐TCCTTC hexamers strongly stimulate recombination. On the other hand, these two non-perfectly homologous sequences are capable of adopting the sticky DNA conformation (19). Thus, considering the inhibitory influence of sticky DNA on intramolecular recombination (see below), the increase in the frequency of homologous recombination between the GAA⅐TTC tracts and the hexamers could be even more apparent.
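The 17% divergence quoted above is consistent with one mismatch per six nucleotides when two GAA units are aligned against one GAAGGA hexamer. The short sketch below illustrates that arithmetic only; the in-register alignment and the function name are assumptions for illustration, not a description of how the value was derived in the study.

```python
def percent_divergence(seq_a, seq_b):
    """Percentage of mismatched positions between two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    mismatches = sum(a != b for a, b in zip(seq_a, seq_b))
    return 100.0 * mismatches / len(seq_a)


triplet_pair = "GAA" * 2   # two consecutive GAA units: GAAGAA
hexamer_unit = "GAAGGA"    # one unit of the GAAGGA.TCCTTC repeat

# One mismatch (position 5, A versus G) out of six positions.
print(round(percent_divergence(triplet_pair, hexamer_unit), 1))  # 16.7
```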
The frequency of recombination between any two homologous sequences, in the vast majority of cases, increases upon lengthening the recombining DNA fragments (71,72). This behavior was also found for CTG⅐CAG microsatellites in both inter- and intramolecular recombination systems (29, 30). Surprisingly, the GAA⅐TTC sequences showed exactly the reverse relationship between the length and the frequency of intramolecular recombination. Likewise, Kirkpatrick et al. (73,74) recently reported similar findings for a yeast meiotic hot spot harboring 12 copies of the CCGNN motif; however, unexpectedly, the same sequence repeated 48 times repressed recombination. They proposed a model of competitive interactions between recombination hot spots in which adjacent tracts compete for one or more recombination proteins (73,74). Alternatively, we favor a model based on the structural properties of the poly(R⅐Y) repeats (Fig. 3B). It is very likely that the stable sticky DNA structure formed between two homologous GAA⅐TTC tracts of a sufficient length (>60 repeats) (Fig. 3B, bottom panel) hampers one of the important steps involved in homologous recombination. The relaxation of the DNA in vivo using novobiocin increased the frequency of intramolecular recombination between long GAA⅐TTC repeats. Lowering the superhelical density of DNA reduces the amount of sticky DNA in vivo (15,18) and also, as shown in our previous EM studies (18), minimizes the number of GAA⅐TTC repeats involved in this structure. Therefore, in the presence of novobiocin, the sticky DNA structure is not only less abundant but also shorter. As a result, longer stretches of the right-handed duplex GAA⅐TTC repeats are available for the recombination machinery. Thus, recombinants found in E. coli KMBL1001Nov harbor an average of 28 GAA⅐TTC repeats more than the recombinants obtained from the cells grown without novobiocin.
In agreement with our proposed model (Fig. 3B) of the inhibitory role of sticky DNA on recombination, the frequency of intermolecular recombination between two GAA⅐TTC repeat-containing plasmids increased with the length of the homologous tracts. This result is as expected because, in our assay, intermolecular recombination cannot be affected by the formation of sticky DNA (sticky DNA is only formed by an intramolecular, not intermolecular, process (18-20)).
These results may have implications for therapeutic strategies for FRDA. This autosomal recessive disease is caused by the reduced expression of frataxin as a result of the GAA⅐TTC expansion in the first intron of the FRDA gene (9,75). The formation of non-B DNA structures, such as a DNA triplex, sticky DNA, or a DNA/RNA hybrid triplex, by long GAA⅐TTC tracts inhibits the transcription of the FRDA sequence (11-17). Due to the reduction of the amount of the X25 mRNA in FRDA patients (75,76), potential therapies may be aimed at alleviating the inhibition of transcription. The use of specific pharmacological agents, capable of interfering with the formation of these unusual structures, may up-regulate the expression of the FRDA gene, thereby providing an effective therapeutic approach.