GGA*TCC-interrupted triplets in long GAA*TTC repeats inhibit the formation of triplex and sticky DNA structures, alleviate transcription inhibition, and reduce genetic instabilities.

Large expansions of GAA.TTC repeats in the first intron of the frataxin (X25) gene are the principal mutation responsible for Friedreich's ataxia (FRDA). Sticky DNA, based on R.R.Y triplexes, was found at the expanded GAA.TTC repeats from FRDA patients. The (GAAGGA.TCCTTC)(65) repeat occurs in the same frataxin locus but is nonpathogenic and does not form sticky DNA. To elucidate the behavior of sticky DNA, we introduced various extents of GGA.TCC interruptions into the long GAA.TTC repeat. More than 20% of GGA.TCC interruptions abolished the formation of sticky DNA. However, the GAA.TTC repeats with less than 11% of GGA.TCC interruptions formed triplexes and/or sticky DNA similar to the uninterrupted repeat sequence. These triplexes showed different P1 nuclease sensitivities, and the GGA.TCC interruptions were slightly more sensitive than the surrounding GAA.TTC repeats. Furthermore, genetic instability investigations in Escherichia coli revealed that a small number (4%) of interruptions substantially stabilized the long GAA.TTC tracts. Furthermore, the greater the extent of interruptions of the GAA.TTC repeats, the less inhibition of in vitro transcription was observed, as expected, based on the capacity of interruptions to inhibit the formation of sticky DNA. We propose that the interruptions introduce base mismatches into the R.R.Y triplex, which explains the observed chemical and biological properties.

The clinical features as well as the molecular pathology of Friedreich's ataxia (FRDA)¹ are summarized in the accompanying paper (1). The molecular mechanism of the biological effects of the long GAA·TTC repeat was proposed to be the formation of an unusual DNA structure (2-4). The GAA·TTC repeats have been known to form a triplex structure (5-11). Recently, the sticky DNA structure was found specifically in the long GAA·TTC repeats from FRDA patients (12). This structure was based on the R·R·Y type of triplex and was hypothesized to be formed by exchanging the pyrimidine strands between two R·R·Y triplexes (12). The GAA·TTC repeat lengths required for the formation of sticky DNA correlated well with those required for the disease phenotype (12). Furthermore, we found that sticky DNA inhibited T7 and SP6 RNA polymerase transcription effectively by sequestering RNA polymerases (1). Therefore, sticky DNA may be involved in the pathology of FRDA.
On the other hand, the GAAGGA·TCCTTC repeat (65 units in length) was also found in the first intron of the frataxin gene but was demonstrated to be nonpathogenic (13). Unlike the GAA·TTC repeat, this hexamer repeat does not form a triplex and/or sticky DNA, does not inhibit transcription, and is not associated with the FRDA disease state (12,13). This strongly suggests that triplexes and/or sticky DNA may be involved in the pathology of FRDA. However, the reason why the hexamer repeat did not form a triplex or sticky DNA was unclear.
In normal individuals with moderate lengths of GAA·TTC repeats, short GAGGAA·TTCCTC repeat interruptions were found (14,15). This repeat also appeared to be nonpathogenic, and thus it became evident that the length of the uninterrupted GAA·TTC repeat was important for the pathology of FRDA. Therefore, interruptions in the long GAA·TTC repeats may influence the ability of the repeats to form sticky DNA and to inhibit transcription.
These kinds of interruptions were shown to affect the genetic instabilities of other triplet repeat sequences (16-18). Furthermore, the GAAGGA·TCCTTC repeat was more stable genetically than the uninterrupted GAA·TTC repeat (13).
Herein, we describe the preparation and characterization of a family of direct repeat sequences of composition intermediate between uninterrupted GAA·TTC and GAAGGA·TCCTTC repeats. Frequent GGA·TCC interruptions (more than 20%) in the long GAA·TTC repeat interfere with the formation of sticky DNA and triplexes, with the inhibitory effect on transcription of long GAA·TTC repeats, and with genetic instabilities. Mapping of P1 nuclease-cleaved sites revealed that the GAA·TTC repeats with less than 20% of GGA·TCC interruptions adopt R·R·Y triplexes similar to the pure GAA·TTC repeats but that the GGA·TCC interruptions may cause base mismatches in the triplexes. Hence, a high percentage of GGA·TCC interruptions in the GAA·TTC repeats inhibits the formation of sticky DNA by introducing base mismatches into the triplexes, which makes them unstable.
For the analyses of sticky DNA formation, the EcoRI-PstI fragments of the mutated plasmids were subcloned into the same sites in the pSPL3 vector (Life Technologies, Inc.) and transformed into E. coli SURE strain by the CaCl2 method (21). The transformants were cultured in 1 liter of LB medium at 37°C to an A600 of ~0.5 (during logarithmic phase). Plasmids were isolated by the alkaline lysis method and purified by CsCl density gradient centrifugation (21). Three phenol/chloroform extractions were performed, followed by precipitation with ethanol and resuspension in 10 mM Tris-HCl (pH 8.0), 1 mM EDTA.
For transcription studies, the EcoRI-XbaI fragments of the pSPL3-based plasmids were inserted into the EcoRI-NheI site of the pCR3.1 vector (Invitrogen) for the formation of the purine strand (interrupted or uninterrupted rGAA repeat) or into the EcoRI-XbaI site for the formation of the pyrimidine strand (interrupted or uninterrupted rUUC repeat). After the transformation of the E. coli SURE strain, the transformants were cultured in 1 liter of LB medium. Plasmids were isolated by the alkaline lysis method and purified by the CsCl density gradient centrifugation method (21), followed by phenol/chloroform extraction, precipitation with ethanol, and resuspension in 10 mM Tris-HCl (pH 8.0), 1 mM EDTA.
All plasmids were characterized by restriction mapping and DNA sequencing on each complementary strand using the Thermo Sequenase Radiolabeled Terminator Cycle Sequencing Kit (U.S. Biochemical Corp.).
Analysis of Formation of Sticky DNA-The pSPL3-based plasmids (1.5 μg) were linearized by XmnI (New England Biolabs) at 37°C in NEBuffer 1 (10 mM BisTris propane-HCl, 10 mM MgCl2, 1 mM DTT (pH 7.0 at 25°C)) supplemented with 100 μg/ml bovine serum albumin. The linearized plasmids were purified by phenol/chloroform extraction and ethanol precipitation and analyzed in a 0.7% agarose gel in 0.5× TBE buffer.
P1 Nuclease Digestion-The "weak" P1 nuclease digestion and analysis of P1 nuclease sensitivity on each DNA strand were performed as described (12), except that P1 nuclease was purchased from Amersham Pharmacia Biotech. The "strong" P1 nuclease digestion was performed as described by Hanvey et al. (22) with some modifications. Briefly, 1.5 μg of the supercoiled plasmids were treated with 10 units of P1 nuclease (Amersham Pharmacia Biotech) at 37°C for 10 min in 10 mM Tris-HCl (pH 7.6), 10 mM MgCl2, and 50 mM NaCl. The reaction was terminated by phenol/chloroform extraction, and DNA was recovered by ethanol precipitation and analyzed by 1.5% agarose gel electrophoresis in 0.5× TBE buffer. For high resolution mapping of P1 nuclease-cleaved sites, 500 ng of P1 nuclease-cleaved plasmids were denatured in 200 mM NaOH, 0.2 mM EDTA in a volume of 40 μl at 65°C for 10 min in the presence of one of the following primers: KO751 primer, 5′-ACCTGGCCAACATGGTGA-3′; KO753 primer, 5′-GTAGCTGGGATTACAGGC-3′. After neutralization by the addition of 13 μl of 3 M sodium acetate (pH 5.2), DNA was precipitated by ethanol. The DNA was dissolved in 10 μl of Sequenase buffer (40 mM Tris-HCl (pH 7.5), 20 mM MgCl2, 50 mM NaCl) and incubated at 42°C for 10 min to anneal the primer to the denatured template. The primer extension reaction was started by the addition of 1 μl of 0.1 M DTT, 2 μl of labeling mix (1.5 μM each of dGTP, dCTP, and dTTP), 1 μl of [α-32P]dATP (3000 Ci/mmol, 10 mCi/ml) (Amersham Pharmacia Biotech) and 2 μl of Sequenase version 2.0 (1.625 units/μl) (Amersham Pharmacia Biotech) and incubated at room temperature for 5 min. Then 14 μl of 80 μM dNTP mix were added and incubated at 37°C for 5 min to complete the reaction. After purification of DNA, the extended products were analyzed in a 4% denaturing polyacrylamide gel in TBE buffer.
As the marker, the sequencing reaction was performed according to the Sequenase version 2.0 (Amersham Pharmacia Biotech) sequencing protocol in parallel.
In Vitro Transcription-The pCR3.1-based plasmids were linearized by XbaI to enable the run-off of transcription. The pCR3.1 vector was also linearized by XbaI as the internal control. 100 ng of the XbaI-linearized pCR3.1-based plasmid and 100 ng of the XbaI-linearized pCR3.1 vector were mixed and subjected to in vitro transcription using T7 RNA polymerase.
Recultivation Assay for the Analyses of Genetic Instabilities of Repeat Tracts-Plasmids containing homogeneous lengths of undeleted/unexpanded triplet repeat tracts were prepared as described previously (1) and transformed into E. coli AB1157 (thr-1, araC14, leuB6, Δ(gpt-proA)62, lacY1, tsx-33, qsr′-0, glnV44(AS), galK2(Oc), λ−, Rac-0, hisG4(Oc), rfbD1, mgl-51, rpoS396(Am), rpsL31(strR), kdgK51, xylA5, mtl-1, argE3(Oc), thi-1) by electroporation. The transformation mixture was plated on LB plates containing 100 μg/ml of kanamycin and incubated for 16 h at 37°C. Liquid cultures were initiated from the colonies on the plates as follows. Washings from ~100 single colonies were inoculated into 10 ml of L broth containing 10 μg/ml kanamycin. The culture was allowed to grow at 37°C with shaking at 250 rpm. The growth of the culture was monitored periodically by measuring the absorbance at 600 nm. When the absorbance reached 1.0, an aliquot of the culture was inoculated into a fresh 10-ml L broth (with kanamycin as before) tube at a dilution of 10⁻⁶, resulting in a final cell concentration of 10³ cells/ml. The fresh culture was incubated at 37°C with shaking at 250 rpm until the absorbance was 1.0, a duration of ~24 h. The original culture was centrifuged, and the plasmid DNA was isolated as per standard alkaline lysis procedures (2). The triplet repeat-containing fragment was excised by digestion with StuI and XmaI, labeled with [α-32P]dCTP, and analyzed by 6-8% native polyacrylamide gel electrophoresis. The dried gels were exposed to a PhosphorImager screen and scanned using a Molecular Dynamics PhosphorImager. The extents of the genetic instabilities of the repeat tracts were determined by measuring the signal intensity of the band corresponding to the full-length repeat tract as a percentage of the total signal in the lane from all triplet repeat-containing bands.
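The instability measure described above (full-length band signal as a percentage of all repeat-containing signal in a lane) can be illustrated with a minimal sketch. The function names and the band intensities below are hypothetical and are not taken from the study; the use of the sample standard deviation for replicate experiments is also an assumption.

```python
from statistics import mean, stdev


def percent_full_length(band_signals, full_length_index):
    """Return the full-length band signal as a percentage of the lane total.

    band_signals: intensities of all triplet repeat-containing bands in one
    lane (vector bands excluded). full_length_index: position of the
    full-length band in that list.
    """
    total = sum(band_signals)
    if total <= 0:
        raise ValueError("lane contains no signal")
    return 100.0 * band_signals[full_length_index] / total


def summarize_replicates(percentages):
    """Mean and sample standard deviation across independent experiments."""
    return mean(percentages), stdev(percentages)


# Hypothetical lane: full-length band first, then deletion/expansion products.
lane = [850.0, 60.0, 50.0, 40.0]
print(percent_full_length(lane, 0))

# Hypothetical percentages from three independent experiments.
avg, sd = summarize_replicates([85.0, 83.0, 87.0])
print(avg, sd)
```

A lower percentage after a given number of recultivations corresponds to a more unstable repeat tract.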
Quantitations-For the analysis of the formation of sticky DNA, the amount of the retarded band (12) was measured by densitometric analysis of the negatives of the gel picture using ImageQuant (Molecular Dynamics) on a Molecular Dynamics densitometer. The amount of RNA synthesized in the transcription analysis was determined from the autoradiographs using ImageQuant (Molecular Dynamics) on a Molecular Dynamics PhosphorImager.

DNA Sequence and Restriction Enzyme Analyses of FRDA Patient Expanded Repeats-Expanded GAA·TTC repeats from 22 Friedreich ataxia patients, including eight Acadians, were amplified from genomic DNA as described (13). The corresponding DNA fragments were separated on 1% agarose gels and purified on QIAGEN columns. To detect interruptions in GAA·TTC, the purified fragments were digested with the restriction enzymes EarI and MnlI, whose recognition sequences are GAAGAG and GAGG, respectively. In addition, the pyrimidine strand of 11 repeats was directly sequenced on a Licor automatic sequencer using a fluorescently labeled 2500R primer (28) and the SequiTherm EXCEL II sequencing kit (Epicentre Technologies). Readable sequence was obtained for up to 160 triplets.

Introduction of GGA·TCC Interruptions into (GAA·TTC)150 Repeats-To create the intermediate sequences between the uninterrupted GAA·TTC repeat and the GAAGGA·TCCTTC hexamer repeat, various extents of GGA·TCC interruptions were introduced into the (GAA·TTC)150 repeat by a method based on site-directed mutagenesis (19). Fig. 1 shows an overview of the mutagenesis strategy. The (TTC)150-containing single-stranded circular plasmid was produced from pRW3546, which is the pGEM-3Zf(−)-based plasmid containing the (GAA·TTC)150 repeat, by the single-stranded phage system. The mutated repeating GAA oligonucleotides and AatII elimination primers were annealed to the single-stranded (TTC)150 plasmids, and the primer-directed DNA syntheses and the ligations of the newly synthesized DNA fragments were carried out by the addition of T4 DNA polymerase and T4 DNA ligase. After transformation of the E. coli mutS strain, the plasmids were isolated and cleaved by AatII to eliminate the plasmid derived from the original template strand and then used for the transformation of E. coli SURE strain. Only the mutated and uncleaved plasmids should grow.
Using this methodology, we obtained a series of plasmids containing GGA·TCC-interrupted GAA·TTC repeats; the sequences of the inserts are listed in Table I. The mutated sequences (GAA)23GGA, (GAA)14GGA, (GAA)11GGA, (GAA)8GGA, and (GAA)3GGA(GAA)4GGA contain 5 (4%), 9 (7%), 10 (8%), 14 (11%), and 26 (20%) GGA·TCC interruptions, respectively, within the 130-136 repeats of the GAA·TTC sequence (Table I). All mutated sequences had a length of about 130 triplet repeats, probably because of small deletions during the preparation of the single-stranded (TTC)150 plasmid. However, this length is equivalent to the 65 repeats of the GAAGGA·TCCTTC hexamer sequence and thus is appropriate for the comparison between these sequences. Interestingly, the GGA·TCC interruptions were found to be aligned in a systematic order, indicating that many of the mutated oligonucleotides were closely aligned without gaps during the annealing to the single-stranded plasmid.
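The interruption percentages quoted above follow from counting GGA triplets in frame along the purine strand. The sketch below illustrates that arithmetic under stated assumptions: the exact insert sequences of Table I are not reproduced here, so the tracts are idealized (regular repeat units padded with GAA to an assumed total length), and the helper names are hypothetical.

```python
def build_tract(gaa_block_lengths, copies, total_triplets):
    """Build an idealized purine-strand tract.

    Each entry in gaa_block_lengths is a run of GAA triplets followed by one
    GGA interruption; the unit is repeated `copies` times and then padded
    with GAA triplets up to total_triplets (an assumed overall length).
    """
    unit = "".join("GAA" * n + "GGA" for n in gaa_block_lengths)
    tract = unit * copies
    missing = total_triplets - len(tract) // 3
    if missing < 0:
        raise ValueError("unit copies exceed the requested tract length")
    return tract + "GAA" * missing


def interruption_percent(purine_strand):
    """Percentage of in-frame triplets that are GGA."""
    triplets = [purine_strand[i:i + 3] for i in range(0, len(purine_strand), 3)]
    return 100.0 * triplets.count("GGA") / len(triplets)


# Interruption counts (5, 14, 26) are from the text; the total lengths of
# 130-131 triplets are assumptions within the reported 130-136 range.
examples = {
    "(GAA)23GGA": build_tract([23], 5, 131),
    "(GAA)8GGA": build_tract([8], 14, 130),
    "(GAA)3GGA(GAA)4GGA": build_tract([3, 4], 13, 130),
}
for name, tract in examples.items():
    print(f"{name}: {interruption_percent(tract):.1f}% GGA triplets")
```

Counting in frame matters: a plain substring search for "GGA" would also have to be checked against out-of-frame matches, so the tract is split into triplets first.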
Sticky DNA Formation by Interrupted GAA·TTC Repeats-We have recently reported that the uninterrupted long GAA·TTC repeat forms the sticky DNA structure, which is hypothesized to be an intermolecular bitriplex, but the GAAGGA·TCCTTC hexamer repeat of equivalent length is incapable of adopting this conformation (12). To analyze the structural properties of the intermediate sequences, the interrupted GAA·TTC repeats were recloned into the pSPL3 vector. The designations of these plasmids are listed in Table II.
When these supercoiled circular plasmids were analyzed in an agarose gel, all plasmids showed dimers and higher multimers (migrating at the top half of the gel) since they were isolated from the E. coli SURE strain, which is a RecA+ strain (Fig. 2A). These plasmids contained homogeneous inserts of the expected lengths (Fig. 2B). Since all of the sequences have ~254 and ~354 bp of human flanking sequences on the sides, the total length of inserts was ~1.1 kb (Fig. 2B).
Then the plasmids shown in Fig. 2A were linearized by XmnI and analyzed in an agarose gel to test for the formation of the retarded band, which represents the sticky DNA (Fig. 2C). The retarded bands were observed at the same positions on the gel for the DNAs containing 0-11% of GGA·TCC interruptions. The amount of the retarded band was about 17% of the total DNA, although a slight reduction was observed for the plasmid with 11% GGA·TCC interruptions (Fig. 2, C and D). However, when the interruptions were increased up to 20%, the retarded band disappeared completely, and the hexamer repeat (50% GGA·TCC interruptions) did not show the retarded band, as expected (Fig. 2, C and D) (12). These data show that sticky DNA, which is responsible for the retarded band, can be formed in the presence of GGA·TCC interruptions up to 11% but that 20% or more GGA·TCC interruptions abolish the formation of sticky DNA.
Thermostability of Sticky DNA Formed by Interrupted GAA·TTC Repeats-To evaluate the thermostability of the sticky DNA structures formed by the interrupted GAA·TTC repeats, pRW3822 containing uninterrupted (GAA·TTC)150 and pRW2114 containing 8% of GGA·TCC interruptions were linearized by XmnI, and the thermostabilities of the sticky DNA structures were tested in the presence or absence of EDTA. The sticky DNA formed by pRW3822 was extremely thermostable in the absence of EDTA (Fig. 3, open circles), whereas the retarded band of pRW3822 completely disappeared at 60°C in the presence of EDTA as described before (Fig. 3, filled circles) (12). Similarly, the sticky DNA structure adopted by pRW2114 showed similar thermostability to that of pRW3822 (Fig. 3, triangles). Namely, the retarded band disappeared at 60°C in the presence of EDTA but survived up to 80°C in the absence of EDTA despite the existence of 8% of GGA·TCC interruptions. Thus, this result shows that once the sticky DNA structure was formed, it was very stable, and the GGA·TCC interruptions do not have a significant influence on the stability of the sticky DNA.
Triplex Formation by Interrupted GAA·TTC Repeats-To analyze the capacities of these interrupted GAA·TTC repeats (Tables I and II) to form triplex structures, we treated these plasmids with P1 nuclease, a DNA-specific endonuclease that recognizes single-stranded and structurally contorted duplexes, and we mapped the cleaved sites.

TABLE I
GGA·TCC-interrupted GAA·TTC sequences created by site-directed mutagenesis in this study
GGA·TCC interruptions were introduced into GAA·TTC repeat tracts by site-directed mutagenesis (see "Experimental Procedures"). Chain termination sequencing revealed the number and the positions of each GGA·TCC interruption. The sequences obtained by this method are listed below and identified by an abbreviated name. The schematic depiction of these sequences shows the GAA·TTC repeat blocks (black boxes) separated by GGA·TCC triplets (white bars). The number of GAA·TTC repeats within each block is denoted above it. The (GAA)150 has a single block containing 150 triplets, and the GAAGGA repeat contains alternating GAA·TTC and GGA·TCC triplets.

FIG. 2 (partial legend). ... Table I were digested with EcoRI and PstI, and the length of the insert was analyzed in a 1.5% agarose gel. C, XmnI-linearized plasmids. The plasmids listed in Table I were ...

FIG. 4 (partial legend). ... Fig. 2A were cleaved by 1 unit of P1 nuclease at 37°C for 1 min, and P1 nuclease sensitivities on each strand were analyzed as described (22).

First, we performed a "weak" digestion by P1 nuclease (1 unit for 1 min) so that a single cleavage occurred on each plasmid molecule. The cleaved sites were mapped on each strand as described previously (12). For analysis of the pyrimidine strand, the NdeI-EcoNI fragment containing the repeat was isolated, and the NdeI site was 3′-end-labeled, whereas for the purine strand, the BbsI-SapI fragment containing the repeat was isolated, and the BbsI site was 3′-end-labeled (Fig. 4A). In the analysis of the pyrimidine strand, we observed smeared signals within the repeat for plasmids containing 0% (uninterrupted) to 11% GGA·TCC interruptions (Fig. 4B). On the other hand, the purine tract did not show such intense signals (Fig. 4B). These results suggest that the GGA·TCC interruptions up to 11% in the long GAA·TTC repeat do not interfere with the ability of the long GAA·TTC repeat to form triplex structures and that all of these interrupted sequences form similar R·R·Y type of triplexes. This is consistent with the ability of these plasmids to form the sticky DNA structure (Fig. 2).
The GAAGGA·TCCTTC hexamer repeat (50% GGA interruptions) showed an intense signal at the 5′-end of the purine strand where a poly(A·T) tract exists (Fig. 4B). This indicates that the poly(A·T) sequence adjacent to the GAAGGA·TCCTTC repeat on its 5′ side is unpaired when 50% of GAA·TTC triplets are substituted alternately by GGA·TCC triplets.
Next, we carried out a "strong" P1 nuclease digestion (10 units for 10 min) of these mutated plasmids. After P1 nuclease digestion, plasmids were cleaved by NheI and analyzed by agarose gel electrophoresis. If a double-stranded cleavage occurred within the repeats, 1.3-1.75-kb fragments should be observed (Fig. 5A). As seen in the "weak" reaction (Fig. 4), P1 nuclease cleavages were observed within the repeat for plasmids containing from 0% (uninterrupted) up to 11% GGA·TCC interruptions (Fig. 5B).
Interestingly, when these cleaved plasmids were analyzed in a long agarose gel, each cleavage pattern revealed different distinct bands (Fig. 5C). The uninterrupted GAA·TTC repeat showed a smeared pattern, indicating that P1 nuclease cleaved the repeat randomly. However, the GAA·TTC repeats interrupted by 4, 7, 8, and 11% of GGA·TCC triplets showed about 4, 7, 8, and 10 separate bands, respectively, although there were smeared backgrounds in each lane. This suggests that there are favored P1 nuclease cleavage sites in the interrupted GAA·TTC repeats. The distance between the adjacent bands decreased as the extent of the GGA·TCC interruptions increased. For the DNAs with 4% GGA·TCC interruptions, the distance between each band was ~80 bp, similar to the interval between each of the GGA·TCC interruptions, 69 bp. For the DNAs with 7 and 8% GGA·TCC interruptions, the distance between each band was estimated to be ~51 and ~41 bp, respectively. These values are also close to the expected distance between the GGA·TCC interruptions in each sequence, 42 and 33 bp. However, it is not clear if the GGA·TCC interruptions or the short GAA·TTC repeats were cleaved by P1 nuclease.
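The expected intervals quoted above (69, 42, and 33 bp) equal the length of the GAA·TTC block between two neighboring interruptions, that is, three base pairs per GAA triplet. A minimal sketch of that comparison follows; the helper name is hypothetical, and the observed spacings are the approximate gel estimates given in the text.

```python
def gaa_block_length_bp(n_gaa_triplets):
    """Length in base pairs of the GAA block between two GGA interruptions."""
    return 3 * n_gaa_triplets


# Sequence name -> (GAA triplets per block, approximate observed band spacing in bp)
tracts = {
    "(GAA)23GGA": (23, 80),
    "(GAA)14GGA": (14, 51),
    "(GAA)11GGA": (11, 41),
}

for name, (n_gaa, observed_bp) in tracts.items():
    expected_bp = gaa_block_length_bp(n_gaa)
    print(f"{name}: expected {expected_bp} bp, observed ~{observed_bp} bp, "
          f"difference {observed_bp - expected_bp} bp")
```

If the interval is instead measured from one interruption to the next including the GGA triplet itself, each expected value is 3 bp larger (72, 45, and 36 bp), which is slightly closer to the gel estimates.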
Finally, we attempted to precisely map the P1 nuclease-cleaved sites using a primer extension method. After digestion of pRW2116 by 10 units of P1 nuclease, the cleaved sites were analyzed on each strand by the primer extension method. When the interrupted GAA strand was analyzed, we observed a smeared pattern, indicating the absence of specific cleaved sites on the GAA strand (data not shown). However, the analysis on the interrupted TTC strand showed slightly intense signals at the positions of the TCC interruptions, although there was a smeared background signal (Fig. 6). These results suggest that the GGA·TCC-interrupted sites are less fully base-paired than the more perfect neighboring GAA·TTC regions.
GGA·TCC Interruptions Genetically Stabilize Long GAA·TTC Repeat Tracts-Prior studies (3,4) showed that the presence of CTA or AGG triplets interspersed within CTG or CGG repeats drastically reduced the extent of expansions and deletions of these tracts. Therefore, we hypothesized that the introduction of GGA·TCC triplets into long homogeneous GAA·TTC repeat tracts would prevent the rampant expansions and deletions observed for these sequences. To test this hypothesis, we analyzed the genetic instabilities of the pure and interrupted GAA·TTC repeat tracts in pRW4250, pRW2150, pRW2148, and pRW4251 (Table I) in E. coli AB1157 (see "Experimental Procedures"). The plasmids were introduced into the cells and propagated in log phase growth by successive recultivations. The cultures were harvested after each recultivation, and the plasmid DNA was isolated and analyzed by restriction digestion and polyacrylamide gel electrophoresis. The genetic instabilities were measured by determining the amount of full-length repeat tract as a relative percentage of the total amount of repeat tract in the sample. Fig. 7A shows the 7% polyacrylamide gel analysis of the excised StuI-XmaI inserts from the recultivation assays. The quantitative analyses of these gels are shown in Fig. 7B. The data clearly show that the uninterrupted (GAA·TTC)150 repeat is the most unstable of all the tracts investigated. After just one recultivation, as little as 8% of the initial full-length triplet repeat remained (Fig. 7B). However, the presence of just five GGA·TCC interruptions (4%) within 131 GAA·TTCs served to dramatically increase the stability of these sequences as observed for pRW2150. As much as 85% of the full-length repeat remained after three recultivations (Fig. 7B). When the number of GGA·TCC interruptions was increased to 10 (8%), the genetic instability of the repeat tract was almost negligible (82% of full-length repeat tract remaining after three recultivations) (Fig. 7B).
Unexpectedly, the (GAAGGA·TCCTTC)65 hexamer repeat tract in pRW4251 was substantially more unstable than the (GAA)23GGA and the (GAA)11GGA repeat tracts. After three recultivations, 63% of the full-length repeat remained. However, this repeat was still dramatically more stable than the pure (GAA·TTC)150 tract. This result was also verified in other E. coli strains.² We postulate that the homogeneity of the hexamer repeat tract results in the unexpected levels of expansion and deletion (see "Discussion").
Thus, our data show that the presence of even a small number of interruptions significantly stabilizes the (GAA·TTC)n repeats. However, this effect is reversed when the number of GGA·TCC interruptions is so high as to abolish the sequence heterogeneity; the relative homogeneity of the hexamer repeat probably allows the formation of slipped structures, which in turn mediate expansions and deletions of these repeats.
GGA·TCC Interruptions Alleviate Transcription Inhibition by Long GAA·TTC Repeat Tracts-Long GAA·TTC repeat tracts inhibit transcription from an upstream promoter (3,23). We showed that this behavior is due to the formation of sticky DNA by these repeats; when the sticky DNA structure was disrupted by heat or EDTA, there was no appreciable inhibition of transcription (1). Since the formation of sticky DNA is dramatically reduced when the number of GGA·TCC interruptions is increased, we postulated that these interruptions would also alleviate the inhibition of transcription. To test this hypothesis, we performed in vitro transcription studies with the T7 RNA polymerase using the pCR3.1-derived plasmids, which contained pure and interrupted GAA·TTC repeats (Table II) as described under "Experimental Procedures." The plasmids were linearized with XbaI and incubated with T7 RNA polymerase and rNTPs. An equimolar amount of pCR3.1 (which did not contain a triplet repeat insert) was included in each reaction as an internal control. The transcripts generated in the reaction were analyzed on 4% denaturing acrylamide gels. The amount of transcript produced from the insert-containing plasmid was compared with that from the control plasmid. The measured ratios revealed that the relative amount of full-length transcript was the least for the (GAA·TTC)150-containing plasmid. However, increasing amounts of GGA·TCC interruptions resulted in the generation of significantly higher amounts of full-length transcript (data not shown). This effect was observed for both orientations of the triplet repeat inserts, i.e. when the transcript was purine-rich (rGAA) or pyrimidine-rich (rUUC).
Interruptions in FRDA Chromosomes-Whether interruptions in the expanded GAA·TTC repeat sequences of human FRDA chromosomes contribute to phenotypic variability is uncertain. To address this question, we performed DNA sequence analyses and restriction mapping experiments in order to determine the frequency of interruptions in patient materials (Table III). The pyrimidine strands of 11 expanded alleles were sequenced for as long as technically possible, usually up to 200 triplets. Interruptions were found and were clustered at the 3′-ends of the expanded repeats, affecting the last 10-15 triplets. These interruptions usually consisted of stretches of three or four A·T pairs and occasional GGA·TCC triplets. The remainder of the repeats contained only GAA·TTC triplets within the sequencing limitations; these data included eight alleles, which were sequenced in their entirety. These types of interruptions were not significantly more common in chromosomes from Acadian patients, which are known to have milder phenotypes compared with non-Acadian patients carrying repeats of comparable lengths.
In order to further probe the presence of interruptions, restriction enzyme analyses were conducted to attempt to identify GGA·TCC triplets that were buried more deeply into the repeats (Table III); these analyses were positive for only two non-Acadian and two Acadian patients. Thus, neither the DNA sequence analyses nor restriction enzyme determinations enabled an exhaustive scan for interruptions; the former determination suffers from the weakness that it is not possible to sequence an entire expanded repeat tract of the average size found in Friedreich's ataxia patients (700-900 triplets). The restriction enzyme analyses suffer from the lack of specificity, since only a few enzymes are available that recognize the sequence configurations found by exchanging G·C for A·T base pairs in the GAA·TTC triplet repeat sequence (i.e. the interruptions may, in fact, be present in the patient samples but might not have been detected with these probes).

FIG. 7. Effect of GGA·TCC interruptions on the genetic instabilities of GAA·TTC repeats.
A, plasmids pRW4250, pRW2150, pRW2148, and pRW4251 were propagated in E. coli AB1157 in log phase by stepwise recultivation as described under "Experimental Procedures." The triplet repeat-containing StuI-XmaI inserts were labeled and electrophoresed through 6% native polyacrylamide gels. The gels were exposed to x-ray film as well as a PhosphorImager screen for quantitation. Lanes 1-3 contain DNAs isolated after 1-3 recultivations, respectively. The arrow indicates the band that contains the full-length triplet repeat. The boxed regions designate the expansion and deletion products. B, the genetic instabilities of plasmids pRW4250, pRW2150, pRW2148, and pRW4251 were quantitatively determined by scanning the exposed PhosphorImager screens with a Molecular Dynamics PhosphorImager. The signal intensity of the full-length triplet repeat-containing band was measured as a proportion of the total signal intensity in the lane above and below the band but not including the vector bands. The average percentage of full-length insert from three independent experiments was plotted on the y axis against the number of recultivations on the x axis. The error bars represent the S.D.

TABLE III
Summary of studies to identify interruptions in expanded GAA·TTC repeats
DNA samples from all patients were tested for the capacity of EarI and MnlI cleavage. However, not all DNA samples were sequenced; the sequenced samples are underlined. The definition of GAA1 and GAA2 was described (31).

DISCUSSION
To clarify the role of interruptions in GAA·TTC repeats in their capacity to form triplexes and sticky DNA as well as to inhibit transcription and their involvement in genetic instabilities, we conducted the studies described herein. Considering the role of direct repeat sequences in the molecular pathology of FRDA, two extreme cases exist. One is the long, uninterrupted GAA·TTC repeats, which form the sticky DNA structure, inhibit transcription, and are associated with the disease (12). The other is the GAAGGA·TCCTTC hexamer repeat, which does not form sticky DNA, does not inhibit transcription, and does not cause the disease (12). However, the molecular reason for these phenomena was not clear. Several sequences of intermediate composition of these repeats were synthesized; up to 11% of GGA·TCC interruptions in the long GAA·TTC repeat did not influence the formation of sticky DNA. Furthermore, we showed that these GAA·TTC repeats interrupted up to 11% form similar R·R·Y triplexes. However, these structures were slightly different from the structure of pure GAA·TTC, because the GGA·TCC interruptions were more sensitive to P1 nuclease compared with the surrounding GAA·TTC repeats, indicating that the GGA·TCC interruptions may be, at least partially, unpaired.
Thus, we hypothesize that the GAAGGA·TCCTTC repeats cannot form a triplex and/or sticky DNA structure because the interruptions impair base pairing. The R·R·Y type triplexes are based on TA*A and CG*G triads, which consist of normal Watson-Crick base pairs and reverse Hoogsteen pairs (Fig. 8B). In this case, the reverse Hoogsteen base pairs contain two hydrogen bonds (Fig. 8B). When we consider all three possible reading frames of Hoogsteen base pairings, the uninterrupted GAA·TTC repeat can form a stable R·R·Y triplex using only TA*A and CG*G triads in one frame (Fig. 8A, left column). However, in the case of the GAAGGA·TCCTTC hexamer repeat, TA*G and CG*A triads must be included in all six possible frames (Fig. 8A, right column). As shown in Fig. 8C, both TA*G and CG*A triads contain only one hydrogen bond in each reverse Hoogsteen pair. Although the CG*A triad can have two hydrogen bonds in the reverse Hoogsteen base pair if the adenine is protonated at the N-1 position (Fig. 8C, right column), this scheme is unlikely because protonation usually requires an acidic pH (24,25). In the case of the hexamer repeat, at least 33% of bases have CG*A or TA*G mismatched triads (Fig. 8C, left column), and these mismatches probably destabilize the triplex and/or sticky DNA structures. We calculate that 11% of GGA·TCC interruptions give ~7.4% mismatched base triads in the minimum case, i.e. in the most favorable reading frame. Hence, this may be the upper limit of mismatched base triads compatible with maintaining the triplex structure.
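The frame-counting argument above can be reproduced with a short calculation. The following is only a minimal sketch under stated assumptions, not the procedure used for Fig. 8: the repeat is treated as an infinitely periodic purine strand, the third strand is taken to be the same purine sequence running antiparallel, and a triad is scored as matched only when the third-strand base equals the duplex purine (TA*A or CG*G). The function name `min_mismatch_fraction` is hypothetical, and using (GAA)8GGA as the repeat unit for the 11% case is an assumption.

```python
from fractions import Fraction


def min_mismatch_fraction(unit: str) -> Fraction:
    """Smallest fraction of mismatched triads over all cyclic reading frames.

    `unit` is one period of the purine strand written 5'->3'. The third
    strand is assumed to be the same sequence in antiparallel orientation,
    so it is modeled as the reversed unit, cyclically shifted frame by frame.
    """
    n = len(unit)
    third = unit[::-1]  # antiparallel third strand aligned to the duplex purines
    best = n
    for shift in range(n):  # one frame per base of the repeat unit
        frame = third[shift:] + third[:shift]
        # A*A and G*G count as matched; A*G or G*A count as mismatched triads
        mismatches = sum(a != b for a, b in zip(unit, frame))
        best = min(best, mismatches)
    return Fraction(best, n)


for name, unit in [
    ("GAA", "GAA"),
    ("GAAGGA", "GAAGGA"),
    ("(GAA)8GGA", "GAA" * 8 + "GGA"),  # assumed unit for 11% interruptions
]:
    frac = min_mismatch_fraction(unit)
    print(f"{name}: {frac} = {float(frac):.1%}")
# GAA: 0 = 0.0%
# GAAGGA: 1/3 = 33.3%
# (GAA)8GGA: 2/27 = 7.4%
```

Under these assumptions the uninterrupted repeat has one fully matched frame out of three, the hexamer repeat has at least two mismatched triads per six in every one of its six frames, and the 11% case gives 2 of 27, in agreement with the values quoted in the text.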
Sequences with 11 and 20% of GGA·TCC interruptions differed greatly in the formation of sticky DNA and triplexes. We tried to prepare sequences with intermediate content to better understand the effect of the GGA·TCC interruptions. However, our attempts to make (GAA)17GGA were unsuccessful, although the introduction of both more and fewer GGA·TCC interruptions was successful. Three different experiments using different oligonucleotides with GGA interruptions at different positions were conducted. However, all cases resulted in the creation of the (GAA)8GGA sequence (data not shown). The reason for the failure to prepare (GAA)17GGA is unclear but may be related to a DNA secondary structure that is unfavorable for the (GAA)17GGA sequence.
We also evaluated the role of the GGA·TCC interruptions on the genetic instabilities of long GAA·TTC tracts in the Escherichia coli system (2,26,27). A small number of interruptions (4%) served to substantially stabilize the GAA·TTC tract. These results are consistent with previous data (17,27) with the myotonic dystrophy and the fragile X sequences, where a relatively small number of interruptions provided a substantial genetic stabilization. Unexpectedly, the repeating hexamer sequence, GAAGGA·TCCTTC, showed some instability, although this is, in a sense, the most highly interrupted of the sequences tested. However, this sequence is unique compared with the others (Table I), since it has a perfectly repeating interrupted-uninterrupted sequence. The hexamer repeat has a substantial degree of homogeneity, which enables it to slip by six base pairs and remain in perfect register. We propose that this property may result in the capacity of the hexamer repeat to expand and delete as observed (Fig. 7). Further genetic instability studies in methyl-directed mismatch repair strains will be described (footnote 2).
Concerning the transcription studies, we found that the greater the extent of interruption of the GAA·TTC repeats, the less inhibition of transcription was observed. These results were as expected based on the capacity of interruptions to inhibit the formation of sticky DNA (1). The data are consistent with the notion that the formation of sticky DNA by long GAA·TTC repeat tracts is crucial for the inhibition of transcription and consequently may explain the reduction in the amount of the X25 mRNA in Friedreich's ataxia patients (28).
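The slippage argument made above for the hexamer repeat (a slip of six base pairs leaves it in perfect register) can be illustrated with a simple string comparison. This is a hedged sketch only: it treats one strand as a plain character string, ignores all structural and enzymatic factors, and the function names are hypothetical. It is not a model of the instabilities measured in E. coli.

```python
def mismatches_after_slip(tract: str, shift: int) -> int:
    """Count positions where the tract disagrees with itself slipped by `shift` bases."""
    return sum(a != b for a, b in zip(tract, tract[shift:]))


def smallest_in_register_slip(tract: str) -> int:
    """Smallest slip (in bases) that realigns the tract with no mismatches."""
    for shift in range(1, len(tract)):
        if mismatches_after_slip(tract, shift) == 0:
            return shift
    return len(tract)  # no shorter slip restores perfect register


# Tract lengths are illustrative, not those of the plasmid inserts.
tracts = {
    "(GAA)18": "GAA" * 18,
    "(GAAGGA)9": "GAAGGA" * 9,
    "((GAA)8GGA)2": ("GAA" * 8 + "GGA") * 2,
}
for name, tract in tracts.items():
    print(name, smallest_in_register_slip(tract))
# (GAA)18 3
# (GAAGGA)9 6
# ((GAA)8GGA)2 27
```

In this simplified picture the pure repeat realigns after a 3-base slip and the hexamer repeat after a 6-base slip, whereas a tract carrying sparse GGA interruptions realigns perfectly only after slipping a full 27-base unit; shorter slips place an interruption opposite a GAA triplet.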
We attempted to evaluate the role of interruptions in expanded GAA·TTC repeat sequences in FRDA chromosomes in terms of their effect on phenotypic variability. Our current data do not permit the conclusion that interruptions in the expanded GAA·TTC repeat sequence are a common cause of the variability in FRDA; interruptions at the level required to affect sticky DNA formation and in vitro transcription might have been detected in our analyses of patient materials. Alternatively, natural sequence variants, in addition to occurring in nonpathogenic alleles (13), may also play a role in rare patients with atypical phenotypes who have short interrupted repeats (29). One example is offered by two sibs in the series of atypical cases reported by Moseley et al. (30), who had a very slowly evolving disease and carried a (GAA)50GAAAA(GAA)19GAAAA(GGAGAA)4(GAA)12 repeat along with a large expansion. Thus, the potential role of interruptions in FRDA phenotypic variability remains to be clarified.