J. Biol. Chem., Vol. 280, Issue 2, 941-952, January 14, 2005

From the Center for Genome Research Institute of Biosciences and Technology, Texas A & M University System Health Science Center, Texas Medical Center, Houston, Texas 77030
Received for publication, September 10, 2004 , and in revised form, October 13, 2004.
| ABSTRACT |
|---|
| INTRODUCTION |
|---|
1:8000) and shows frequent mortality in affected infants (1). An unstable region on chromosome 19q13.3 was discovered as the genetic basis of DM1. A polymorphic locus was found to be larger in DM1 patients (1–3), because of substantial expansions of a CTG·CAG repeat tract in the 3'-untranslated region of the myotonic dystrophy protein kinase (DMPK) gene (1). As many as 3000 repeats (9000 bp) have been found in some patients, expanded from the normal range of 5–37 repeats. DM1 displays a non-Mendelian inheritance pattern.
The molecular mechanisms responsible for this genetic instability have been extensively investigated in recent years in bacteria, yeast, cell culture, and mouse systems (reviewed in Refs. 1–4). DNA replication (5–9), repair (10–13), and recombination (14–16) are involved, probably acting in concert with other factors/processes, such as single-strand DNA-binding proteins (17) and transcription (9, 18). Also, the long CTG·CAG repeat tract can adopt an unusual flexible and writhed conformation (19), which may promote the formation of slipped structures (9, 20, 21) with a transiently formed, quasistable, long CTG sequence along with an unpaired and unstacked long CAG complementary strand. These types of preferential single-strand stabilities and DNA conformational behaviors are integral to the interpretation of the genetic instability effects of TRS orientation relative to the direction of DNA replication (1–4, 9).
A 2.5-kbp poly(purine·pyrimidine) tract from the human polycystic kidney disease 1 gene (PKD1), which is known to form triplexes, slipped structures, and other non-B DNA conformations (22), induced long deletions and other instabilities in plasmids that were mediated by mismatch repair and, in some cases, transcription. The breakpoints occurred at or near the predicted non-B DNA conformations. Distance measurements also indicated a significant proximity of alternating purine·pyrimidine and oligo(purine·pyrimidine) tracts to breakpoint junctions in 222 gross deletions and translocations, respectively, involved in human diseases. In 11 of these deletions, which were analyzed in detail, the breakpoints were explicable by non-B DNA structure formation. Hence, Bacolla et al. (21) concluded that alternative DNA conformations trigger genomic rearrangements through recombination-repair activities. Also, a substantial literature is growing on the role of non-B DNA conformations involving low copy repeats in genomic rearrangements (deletions, inversions, duplications, translocations, etc.) associated with human diseases (reviewed in Ref. 23).
Here we have substantially extended these studies by exploring the capacity of CTG·CAG and GAA·TTC repeat tracts of various lengths, extents of interruptions (polymorphisms), and orientations to serve as mutagens, as a function of transcription. Most surprisingly, long CTG·CAG repeat tracts promoted the formation of inversions and long deletions (0.6–1.8 kbp) that removed part or all of the repeats as well as the flanking GFP reporter gene. This behavior, if found in humans, implies that the DMPK gene flanking the long CTG·CAG repeats in patients may be subject to deletions and rearrangements, thus labilizing the DMPK protein.
| EXPERIMENTAL PROCEDURES |
|---|
The (GAA·TTC)n repeats were obtained from pMP141 and pRW3808 that contain (GAA·TTC)9 and (GAA·TTC)176, respectively (26). Both repeating tracts are pure (uninterrupted) and have 254 and 354 bp of human sequences, respectively, flanking the repeats (26). pRW3808 is a derivative of pUC18NotI containing the (GAA·TTC)176 fragment inserted into the BamHI site of the vector. pMP141 was a derivative of pUC18NotI containing the (GAA·TTC)9 fragment inserted into the EcoRI and PstI sites of the vector (26).
Cloning of the (CTG·CAG)n and (GAA·TTC)n Sequences into pGFPT. The strategy was to clone the TRS downstream from and in close proximity to the GFP gene. The fragments containing the CTG·CAG and the GAA·TTC sequences were recloned into the pGFPT vector plasmid (Fig. 1). The fragments were prepared as follows: the parental plasmids containing (CTG·CAG)17, (CTG·CAG)98 and (CTG·CAG)175 were digested with EcoRI and NotI (New England Biolabs, Inc.), and these inserts were used to obtain clones with the repeats in orientation I (defined under "Experimental Procedures" and see Refs. 5 and 25). In order to obtain clones in orientation II, the (CTG·CAG)98 insert was excised using HindIII and NotI digestion. The recessed 5' terminus of this fragment was filled in with 0.1 unit of the Klenow fragment of E. coli DNA polymerase I (U. S. Biochemical Corp.) and the four dNTPs (0.1 mM each). To prepare fragments of CTG·CAG repeat (n = 175 and 17) for cloning in orientation II, the parental plasmid pRW3248 was digested with NotI, whereas cleavage of pRW3244 with EcoRI/HindIII was followed by filling in the termini with 0.1 unit of the E. coli DNA polymerase I Klenow fragment for blunt-end ligation. The parental plasmid containing (GAA·TTC)9 was digested with EcoRI and EcoRV, whereas the plasmid harboring (GAA·TTC)176 was treated with EcoRI and XbaI.
For clone preparation with inserts in orientation I, the fragments containing (CTG·CAG)n (where n = 17, 98, and 175) were ligated to pGFPT digested with EcoRI and EagI (EagI and NotI have compatible sticky ends). To prepare the vector for cloning (CTG·CAG)98 in orientation II, the EcoRI/EagI digestion of pGFPT was followed by filling in the recessed 5' termini with 0.1 unit of the E. coli DNA polymerase I Klenow fragment for the blunt-end ligation. For the cloning of (CTG·CAG)n fragments (where n = 17 and 175) in orientation II, the vector was linearized with StuI and EagI, respectively. The pGFPT vector for cloning the fragments containing the (GAA·TTC)n in orientation II was prepared as follows. To ligate the (GAA·TTC)9-containing fragment, the vector was cleaved with EcoRI and StuI, whereas for the ligation of the (GAA·TTC)176 repeat sequence, the pGFPT was cleaved by EcoRI and SpeI (SpeI and XbaI have compatible sticky ends).
The digested DNA was electrophoresed, and the appropriate bands were excised and eluted, as described above. The vector and the insert were mixed in a molar ratio of 1:10 and ligated for 16 h at 16 °C by the addition of 20 units of T4 DNA ligase (U. S. Biochemical Corp.). The ligation mixture was ethanol-precipitated and transformed by electroporation (2.5 kV, cuvette size 0.2 mm) into E. coli HB101 (27) and plated on LB agar plates containing 100 µg/ml ampicillin (Ap). The transformants were screened by using a long wavelength UV lamp and were used to prepare a liquid culture. The plasmids were isolated, purified by CsCl density gradient ultracentrifugation (27), and characterized by restriction mapping to verify the lengths of the cloned TRS. Also, the plasmids were sequenced using a ThermoSequenase Radiolabeled Terminator Cycle Sequencing Kit (U. S. Biochemical Corp.). The sequencing reactions were carried out according to the manufacturer's recommendations using the following primers: AP1 (CGAATTCGAGCTCGGTACCCGGG) and AP2 (GCAGGTCGGCCTCAGCCTGGCCG) (Genosys). The products of the sequencing reactions were analyzed on 6% Long Ranger Gels (FMC BioProducts), containing 7.5 M urea, in the glycerol-tolerant gel buffer (U. S. Biochemical Corp.). The gels were dried and exposed to x-ray film. The superhelical forms of the DNA containing the undeleted CTG·CAG and GAA·TTC tracts in orientations I and II (5, 25) were used for all subsequent experiments. The pGFPT vector was used as a control (Fig. 1).
Transformation of TRS-containing Plasmids into E. coli. pGFPT or its derivatives containing the TRS were transformed into the appropriate E. coli strains by electroporation (27). The transformation mixture was used to inoculate 10 ml of LB media containing ampicillin (100 µg/ml) and isopropyl β-D-thiogalactoside (IPTG) (2 mM). The cultures were incubated at 37 °C and grown until they reached an A600 of 0.8–1.0 (18). An aliquot (10 µl) was inoculated into 10 ml of fresh LB media (with Ap and IPTG, as before). Recultivations of the cultures were repeated through five growth cycles for these population studies. After the 1st, 3rd, and 5th growth cycles, the cell populations were harvested, and DNA was isolated by alkaline lysis using the Wizard Plus Miniprep DNA Purification System (Promega). The plasmids were then digested with EcoRI and EagI to release the CTG·CAG- and GAA·TTC-containing fragments, which were radiolabeled with [α-32P]dATP, and electrophoresed on a 7% native polyacrylamide gel. The gels were then analyzed using a PhosphorImager (Storm 820, Amersham Biosciences).
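The recultivation scheme above implies a rough number of cell generations between sampling points. The following is a minimal sketch of that arithmetic, not a calculation reported by the authors. It assumes that each passage is an approximately 1:1000 dilution (10 µl into 10 ml) and that every culture regrows to about the same density before the next passage; the function names are illustrative only.

```python
import math


def generations_per_passage(inoculum_ul: float, culture_ml: float) -> float:
    """Doublings needed to regrow to the pre-dilution density.

    Assumption: the culture returns to the same cell density each cycle,
    so the number of doublings equals log2 of the dilution factor.
    """
    dilution_factor = (culture_ml * 1000.0) / inoculum_ul  # ml converted to µl
    return math.log2(dilution_factor)


def generations_since_cycle_one(cycle: int, per_passage: float) -> float:
    """Generations accumulated by passaging between cycle 1 and the given cycle."""
    return (cycle - 1) * per_passage


per_passage = generations_per_passage(inoculum_ul=10, culture_ml=10)
for cycle in (1, 3, 5):
    extra = generations_since_cycle_one(cycle, per_passage)
    print(f"growth cycle {cycle}: about {extra:.0f} generations after cycle 1")
```

Under these assumptions each passage corresponds to roughly ten doublings, so sampling at cycles 1, 3, and 5 spaces the populations about twenty generations apart.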
Conditions of Bacterial Growth for the Screening of White Colonies upon Transcription Activation or Repression. Six parental E. coli strains were used to evaluate the capacity of the TRS to induce mutations in the upstream reporter GFP gene. To identify single green and white colonies ("white CFUs"), liquid cultures from growth cycles 1, 3, and 5 were spread on LB agar plates containing Ap (150 µg/ml) and IPTG (2 mM), and the plates were incubated at 37 °C. By using a long wavelength UV lamp, green and white colonies were counted. In each experiment, 100 white colonies from the 1st, 3rd, and 5th growth cycles were re-streaked three times on LB agar plates (Ap and IPTG as before) to be certain that the loss of fluorescence was a permanent phenotype (22). Approximately 30 white colonies were transferred to LB liquid media and grown at 37 °C, and the DNA was isolated. This DNA was used for restriction analyses, DNA sequencing, and re-transformation into new competent E. coli cells to confirm the white phenotype. Parallel experiments were also conducted with the pIQ-kan repressor (28) (gift from Dr. Richard P. Bowater, University of East Anglia, UK) to turn off transcription (21). For the purpose of these studies, the E. coli strains (Table I) were first transformed with pIQ-kan expressing the repressor and then subsequently with the appropriate plasmids harboring the CTG·CAG and GAA·TTC repeats (Fig. 1). The transformation mixture was inoculated into 10 ml of LB media containing ampicillin (100 µg/ml) and kanamycin (50 µg/ml), and the recultivation assay was conducted as described earlier. After the 1st, 3rd, and 5th growth cycles, the cultures were plated on LB agar plates containing Ap (150 µg/ml), kanamycin (50 µg/ml), and IPTG (2 mM) and incubated at 37 °C. After the green and white colonies were counted and the white ones re-streaked (three times), the liquid cultures were prepared, and DNA was isolated as described previously. The utilization of different parental strains enabled a preliminary survey of the potential role of genetic backgrounds on the mutagenic process.
Sequencing Primers. The primers for sequencing the regions upstream of the GFP gene and downstream of the CTG·CAG as well as the GAA·TTC repeat sequences were obtained from MWG Biotec (High Point, NC) or from Sigma Genosys (The Woodlands, TX). The forward primer at position 52 of pRW5301 was GCAGCTGGCACGACAGGTTTCC. The reverse primers at position 1509 or 1846 of the same plasmid were CAAGCTGTGACCGTCTCCG and CAGGGTTATTGTCTCATG, respectively. In some cases, it was necessary to use a primer at the origin of replication. This forward primer at position 3390 was GCTTCCAGGGGGAAACGCCTG. To detect breakpoints of the mutants of pRW5304, the reverse primer GGCGTATCACGAGGCCCTTAAG at position 2459 and the forward primer GCTTCCAGGGGGAAACGCCTG at position 4147 of the plasmid were employed. The primers and the plasmids were used at concentrations of 10 pmol/µl and 200 ng/µl, respectively, for the sequencing reactions.
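As a quick sanity check on the primers listed above, their length, GC content, and an approximate melting temperature can be tabulated. This is a sketch only: the primer sequences are copied from the text, but the Wallace rule used here (2 °C per A/T, 4 °C per G/C) is a generic rule of thumb for short oligonucleotides and is not a method described by the authors; the dictionary labels are illustrative.

```python
# Primer sequences as given in the text; the labels are descriptive only.
PRIMERS = {
    "pRW5301 forward, position 52": "GCAGCTGGCACGACAGGTTTCC",
    "pRW5301 reverse, position 1509": "CAAGCTGTGACCGTCTCCG",
    "pRW5301 reverse, position 1846": "CAGGGTTATTGTCTCATG",
    "origin forward, position 3390": "GCTTCCAGGGGGAAACGCCTG",
    "pRW5304 reverse, position 2459": "GGCGTATCACGAGGCCCTTAAG",
}


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Rough melting temperature in °C by the Wallace rule: 2(A+T) + 4(G+C).

    Assumption: this rule of thumb is only approximate, especially for
    oligonucleotides longer than about 20 nt.
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, GC {gc_fraction(seq):.2f}, Tm ~{wallace_tm(seq)} °C")
```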
DNA Sequencing. The DNA isolated from the 47 mutant white colonies was sequenced from both strands by using the primers described above. The DNA was sequenced in the Molecular and Human Genetics Sequencing Core at the Baylor College of Medicine, Houston. The DNA of the mutants was analyzed by cycle sequencing using a GeneAmp PCR System 9700 and the ABI 3700 Sequencer. Cycle sequencing conditions were as follows: initial denaturation at 96 °C for 10 min, and 25 cycles of heating (96 °C, 10 s), annealing (50 °C, 30 s), and elongation (60 °C, 4 min).
| RESULTS |
|---|
The (CTG·CAG)n-containing fragments (where n = 17, 98, and 175) were cloned either in orientation I or II, relative to the origin of replication; for the plasmids containing (CTG·CAG)n tracts in orientation I, the CTG repeat is in the leading strand template, whereas for the plasmids in orientation II, the CTG repeat is in the lagging strand template (5, 24, 25, 29, 30). The (GAA·TTC)n repeat sequences (where n = 9 and 176) were cloned into the pGFPT vector only in orientation II, where the GAA repeats are in the lagging strand template; orientation II is less genetically stable than orientation I (31, 32). As a control, the pGFPT vector with no repeat sequences was used (Fig. 1). We performed the experiments either with transcription activation or repression and conducted five successive re-cultivation steps. To activate transcription, experiments were performed in the presence of 2 mM IPTG, whereas co-transformation with pIQ-kan ensured that the lacIQ repressor inhibited GFP transcription from the lacZ promoter (22, 28).
Fraction of White CFUs Depends on the Type and Length of the TRS Sequence. The genetic instability of the TRS depends on the length of the repeat tracts (12, 25). To determine whether repeat tracts of different lengths influenced mutations in sequences flanking the repeats, we transformed the E. coli strains (Table I) with the nine plasmids listed in Fig. 1, harboring (CTG·CAG)n or (GAA·TTC)n repeat sequences. To identify single colonies, liquid cultures at the end of the 1st, 3rd, and 5th re-cultivations were spread on IPTG-containing agar plates, and the number of green and white colonies was counted. To verify the fluorescent status of the cells used to start each re-cultivation, the transformation mixture was immediately plated on LB plates, and the green fluorescence of the cells was determined. The cells used for each re-cultivation assay contained plasmids with a functional GFP reporter gene, because all colony-forming units ("CFUs") were fluorescent. Therefore, all white colonies arose during the re-cultivation growths of the cells (Fig. 1). The fraction of white CFUs was calculated as the ratio of the number of white colonies to the total number of viable cells (green and white).
Several "parental" E. coli strains that are genotypically different were studied. This diversity may influence the cellular behavior. To investigate a possible role of DNA repair in TRS-induced mutagenesis, while taking into account the genotypic variability, we examined whether the presence of the TRS increased the mutations in six different parental E. coli strains, all proficient in the four main repair pathways (methyl-directed mismatch repair, nucleotide excision repair, transcription-coupled repair, and base excision repair). The results of screening for white colonies, when plasmids contained (CTG·CAG)n tracts of different lengths (n = 0, 17, 98, and 175) and (GAA·TTC)176, are shown in Table II. The data revealed that the loss of fluorescence of the GFP reporter gene depends on the presence of the CTG·CAG tract, because the fraction of white CFUs increased with the length of the repeat tract. The total number of CFUs analyzed ranged from 18,700 to 68,237, with an average of 34,718 CFUs in each experiment. Table II is a summation of all data for plasmids in both orientations and with and without transcription, in order to present the composite global results.
In contrast, after transformation of E. coli with plasmids containing GAA·TTC sequences in parallel experiments, no effect of the repeat tracts on the loss of fluorescence was detected (Table II). For the longer (GAA·TTC)176, white CFUs were formed only when the re-cultivation was conducted in E. coli UvrA (data not shown), where 76 mutants were found out of a total of 47,585 CFUs (0.0016). Also, no effect was found for the shorter (GAA·TTC)9 repeat in any of the six strains listed in Table II (data not shown).
We conclude, first, that the myotonic dystrophy type 1 repeat tract is much more prone to induce loss of fluorescence from the adjacent GFP gene than the Friedreich's ataxia GAA·TTC repeat sequence. Second, the fraction of white mutants formed in the presence of the CTG·CAG tract increased with the length of the repeat. We speculate that quasi-stable slipped non-B DNA structures formed by the long CTG·CAG repeat tracts elevated the frequency of deletions in a pathway that depended on DNA repair proteins (see "Discussion").
CTG·CAG Repeats Are More Mutagenic in Orientation II than in Orientation I. It is well established that the orientation of the repeat sequences relative to the replication origin plays an important role in their stability (12). This behavior was attributed to a higher propensity of the CTG repeats than the CAG repeats to form stable hairpin structures on the lagging strand template (1, 12). To analyze whether the orientation of the repeat sequences influenced the loss of fluorescence (the fraction of white CFUs), we conducted growth studies in the six E. coli strains (Table I) by using plasmids harboring (CTG·CAG)n (where n = 17, 98, and 175) in both orientations. Table III shows that the mutagenic effect was more pronounced when the (CTG·CAG)n sequence was in orientation II. We found the long and pure (CTG·CAG)98 to be mutagenic in both orientations when re-cultivations were conducted in the KMBL1001, JJC510, and JTT1 strains (Table III). In KMBL1001 cells, there were 11,427 white CFUs out of 24,888 CFUs (0.459) for orientation II and 9,961 out of 43,349 for orientation I (0.229) (Table III). We also found a higher fraction of white CFUs for orientation II when pRW5305 containing 98 repeats was re-cultivated in E. coli JJC510 (0.356 versus 0.179). Most surprisingly, when growth was conducted in JTT1 cells, we detected the reverse effect, i.e. a higher fraction of white CFUs was formed for (CTG·CAG)98 in orientation I (Table III). In this case, there were 7,085 white CFUs out of 22,539 CFUs in orientation I (0.314) and only 10 white mutants out of 7,672 CFUs (0.0013) for orientation II (Table III). This anomalous result is probably linked to the inviability of the cells harboring pRW5305, because a considerably smaller number of CFUs was observed on LB plates. Hence, this low frequency of white CFUs may have been due to plasmid loss (22) or deletion events affecting the ampicillin gene and/or the replication origin.
In summary, these data show that both the orientation and the length of the CTG·CAG sequence are important factors that influence the fraction of white CFUs.
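The fraction of white CFUs used throughout this section is simply the number of white colonies divided by the total number of colonies counted. The sketch below recomputes the fractions for (CTG·CAG)98 from the colony counts quoted in the text; the function and variable names are illustrative and not taken from the paper.

```python
def white_fraction(white: int, total: int) -> float:
    """Fraction of white CFUs: white colonies / all viable colonies (green + white)."""
    if total <= 0:
        raise ValueError("total CFU count must be positive")
    return white / total


# (strain, orientation) -> (white CFUs, total CFUs), as quoted in the text.
COUNTS = {
    ("KMBL1001", "II"): (11_427, 24_888),
    ("KMBL1001", "I"): (9_961, 43_349),
    ("JTT1", "I"): (7_085, 22_539),
    ("JTT1", "II"): (10, 7_672),
}

for (strain, orientation), (white, total) in COUNTS.items():
    fraction = white_fraction(white, total)
    print(f"{strain}, orientation {orientation}: {fraction:.4f}")
```

The recomputed values agree with the fractions quoted in the text to within rounding of the last digit.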
Transcription through the CTG·CAG Sequence Increases the Fraction of White CFUs. Transcription has been shown to induce mutations that invoke DNA repair and recombination (33–38). Induction of transcription in long CTG·CAG repeats contained on plasmids in E. coli revealed an increase in the frequency of deletions within the repeat tract (18). Therefore, we tested whether active transcription through the DM1 TRS influenced the fraction of mutations in sequences flanking the repeats. Table IV shows the composite data on the distribution of white CFUs found for plasmids containing CTG·CAG sequences in orientations I and II, when experiments were conducted in the presence (IPTG) or the absence (no IPTG) of transcription through the repeats.
In the presence of transcription, the CTG·CAG repeat sequences caused a significant elevation in the fraction of the white CFUs in a length-dependent manner (Table IV). For (CTG·CAG)17, a small fraction of white CFUs appeared in the KMBL1001 and KA796 strains, and as the length increased, more strains showed a response (Table IV). The highest fraction of deleted mutants was observed for both the pure and interrupted CTG·CAG tracts in KMBL1001 and JJC510. Plasmids harboring (CTG·CAG)98 showed a substantial fraction of mutants when replicated in JTT1 (Table IV). The lowest fraction of white CFUs for both the 98 and 175 CTG·CAG repeat sequences was found when plasmids were cultivated in KA796. Alternatively, studies on the influence of transcription on the deletion behaviors of (GAA·TTC)176 in orientation II in all six parental E. coli strains revealed no mutagenic response. Indeed, in a total of 232,097 CFUs, no mutants were found. The one exception was in the UvrA strain, where white CFUs were found only in the presence of transcription (data not shown).
Hence, the long CTG·CAG sequences exerted their mutagenic character through a process associated with transcription. The fraction of mutants formed upon transcription activation was considerably greater than in the presence of replication alone. Thus, we conclude that the deletions detected herein were the result of repair-dependent reactions, which were enhanced by transcription.
Types and Locations of Mutations. Restriction mapping and DNA sequence analyses of the white CFUs were performed to evaluate the alterations within the repeat tracts and flanking sequences. Analyses of 47 clones revealed that all mutants contained a nonfunctional GFP gene. These clones were characterized in detail. Twenty-one were from pRW5301, 7 from pRW5305, 12 from pRW5309, and 7 from pRW5304 (Fig. 2). All mutations were large deletions; also derivatives of pRW5305 and pRW5309 contained inversions (Fig. 2, B and C, boxed regions). More than one clone was found with identical mutations from individual transformations of the plasmids harboring the repeat tracts.
All mutant derivatives of pRW5305 and pRW5309 underwent both deletions and inversion reactions (Fig. 2, B and C, boxed regions). All clones had two deletion events, one affecting the GFP gene and the other the repeat sequences. The retained part of the GFP gene was 40 and 115 bp for plasmids containing (CTG·CAG)98 and (CTG·CAG)175, respectively. Also, all clones had only a few CTG·CAG repeats remaining, which were in the inverted orientation. In fact, all mutants had two additional breaks outside the repeat tract within the EcoRI and EagI recognition sites; the repair of these breaks led to the inversion events giving rise to the sequences at positions 1010–1417 of pRW5305 and 1076–1721 of pRW5309 (Fig. 2, B and C). The reason why derivatives of pRW5305 and pRW5309 contain inversions, in addition to deletions, is unclear but may be due to the strategy of their cloning. Even though all clones had the same breakpoint junctions after the terminator cassette and within the GFP gene, they represent independent mutation events because they were found in separate transformations. Hence, these regions must be hot spots for recombination in orientation II.
For the long GAA·TTC repeat sequence, the repaired products of mutations revealed the occurrence of one large deletion of 2.0 kbp (Fig. 2D). Restriction mapping of the DNA from seven white mutants (clones 10–16) showed that one break always occurred within 2 bp from the replication origin region, and the second inside the repeat tract. Therefore, these seven clones had lost the entire GFP reporter gene and a considerable part of the repeat sequence. The number of repeats remaining varied from 15 to 23 (Fig. 2D). Clones 10–13 and 16 had 17 repeats remaining, whereas clones 14 and 15 revealed 15 and 23 repeats, respectively. Furthermore, all repaired products of pRW5304 had an additional small deletion localized downstream of the repeat tract, which removed one copy of a 4-bp GATC tandem repeat. Because these three clones were found from a single transformation, it is conceivable that they were derived from a common event.
Sequence Features at the Breaks. The ability of CTG·CAG repeats to adopt quasi-stable folded secondary structures is well established (1, 6, 19, 39, 40). Non-B DNA structures are susceptible to strand breaks, either single or double, within the repeat tract (41, 42). The breaks appear to be repaired by the RecA-dependent homologous recombination pathway (43). Repair of the breaks can lead to instability of the repeat tract and, moreover, cause deletions of the sequences flanking the repeats (41, 44). The propensity of long CTG·CAG sequences to induce breaks in an adjacent gene, which were subsequently repaired, was determined.
We analyzed the positions of the breaks and the sequences at the junctions for the 47 mutant white CFUs. We also inspected the sequences flanking the breakpoints for any direct, inverted, or mirror repeats capable of forming slipped structures, cruciforms, or triplexes (1, 9, 39), respectively.
Analyses of the sequences at the breaks revealed the existence of short homologies, from one to eight nucleotides (Table V, A–D). Breakpoints did not occur at random positions but were within specific repeat sequences able to adopt unorthodox DNA conformations (21). Moreover, sequences flanking the breaks revealed the existence of repeat motifs. For example, in the vicinity of the break localized at position 4285 in the repaired products of pRW5304, three copies of an eight-nucleotide motif of direct repeats GGCCTTTT were detected (Fig. 3A). This panel shows an example of the non-B DNA structures at breakpoints of deleted pRW5304 (clones 10–16 in Table V, D). A few mutant clones of pRW5301 (clones 5–7) (Table V, A) had breaks mapped at positions 138 and 920 of the vector part of the plasmid. These sites were also near three copies of direct (CAA, TTACC, GGC, TTA), two copies of inverted (CTTTCGAAAG), and mirror (CATT) repeat motifs (data not shown). Fig. 3, B–D, shows examples of the presumptive structures at the breakpoints for clones 47 and 19. The presence of these sequences near the breakpoints may have played a role in the deletion events (21). Characterization of a few additional deletions confirmed the association of deletion breakpoints with sites of non-B DNA structures. The non-B DNA conformations may have served as substrates for the repair machinery that generates long deletions (21).
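The three motif classes mentioned above (direct, inverted, and mirror repeats) can be distinguished with simple string tests. The following is a minimal sketch, not the procedure used by the authors: the motifs GGCCTTTT, CTTTCGAAAG, and CATT are quoted from the text, whereas the flanking sequence `toy_flank` is invented purely for illustration and is not plasmid sequence.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def is_inverted_repeat(seq: str) -> bool:
    """True if the sequence equals its own reverse complement (cruciform-capable)."""
    return seq == reverse_complement(seq)


def is_mirror_repeat(seq: str) -> bool:
    """True if the sequence reads the same in both directions on one strand."""
    return seq == seq[::-1]


def count_direct_copies(seq: str, motif: str) -> int:
    """Number of (possibly overlapping) copies of a motif in a sequence."""
    count, start = 0, seq.find(motif)
    while start != -1:
        count += 1
        start = seq.find(motif, start + 1)
    return count


# Hypothetical flank carrying three copies of the GGCCTTTT motif from the text.
toy_flank = "GGCCTTTTACGGCCTTTTGAGGCCTTTT"

print(count_direct_copies(toy_flank, "GGCCTTTT"))
print(is_inverted_repeat("CTTTCGAAAG"))
print(is_mirror_repeat("CATT" + "CATT"[::-1]))
```

An inverted repeat in this sense is a candidate for cruciform formation, a mirror repeat for triplex formation, and direct repeats for slipped structures, matching the pairing given in the text.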
1.2–1.4 kbp of the plasmids. For deleted derivatives of pRW5309, one mutation event occurred between positions 7 (in the vector DNA) and 902 (in the GFP gene) (Table V, C), which were joined together. The next break mapped at position 1012 within the EcoRI recognition site and was followed by bp 1721 located inside the second EcoRI restriction site of the opposite strand, hence leading to the inversion. Homologous GAATTC sequences were present at both breakpoints (Table V, C). The end point of the inverted fragment (bp 1076) located in the first EagI site continued with bp 1736 mapped inside the second EagI recognition site of the opposite strand. These breakpoints had GCGGCCG homologous tracts. The other breaks, which occurred within the CTG·CAG sequence, caused deletions of the repeat tract. For the mutants of pRW5305, a similar repair behavior was found. A 1.1-kbp deletion occurred between positions 3602 (in the vector) and 977 (in the GFP gene) (Table V, B) that were joined together. A third break was found 33 nucleotides downstream of this site at position 1010, which was followed by bp 1417 of the opposite strand causing the inversion. A homologous GAATT tract was present at the breaks. The end point of the inversion (bp 1010) continued with bp 1425 of the opposite strand, at sites where the inversion revealed a homologous GC pair (Table V, B). Furthermore, the CTG·CAG sequence, which was found in the inverted orientation, had lost a considerable number of the repeats.
In summary, the presence of the long CTG·CAG repeats promotes the formation of multiple breaks in sequences flanking the repeats. Their repair occurred between motifs that shared homology of a few nucleotides. In all cases, the positions of the breakpoints were near or within specific repeat sequences capable of forming non-B DNA structures. Thus, weakened and/or distorted base pairs in the unorthodox DNA conformations (1, 9, 45) probably served as substrates for the generation of large deletions and rearrangements (inversions).
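The "homology of a few nucleotides" at a deletion junction can be made concrete as the number of bases that are identical at both breakpoints, so that the junction cannot be assigned to a single position. The sketch below illustrates one common way to count this for a simple deletion; it is an assumption-laden illustration rather than the authors' analysis, and the toy sequence is invented (only the GAATTC motif is taken from the text).

```python
def junction_microhomology(seq: str, left: int, right: int) -> int:
    """Length of identical sequence shared by two deletion breakpoints.

    A deletion joins seq[:left] to seq[right:] (0-based, left < right).
    Matching bases are counted upstream and downstream of both breakpoints,
    capped at the size of the deleted segment.
    """
    if not 0 <= left < right <= len(seq):
        raise ValueError("breakpoints must satisfy 0 <= left < right <= len(seq)")
    deleted = right - left

    upstream = 0
    while (upstream < deleted and left - upstream - 1 >= 0
           and seq[left - upstream - 1] == seq[right - upstream - 1]):
        upstream += 1

    downstream = 0
    while (upstream + downstream < deleted and right + downstream < len(seq)
           and seq[left + downstream] == seq[right + downstream]):
        downstream += 1

    return upstream + downstream


# Hypothetical sequence with a GAATTC motif at both breakpoints.
toy = "TTACG" + "GAATTC" + "CATCAT" + "GAATTC" + "GGTCA"
left, right = 5, 17
product = toy[:left] + toy[right:]

print(product)
print(junction_microhomology(toy, left, right))
```

In this toy case the six-base GAATTC motif is present at both breakpoints, so the repaired product retains a single copy and the junction homology is six nucleotides, within the one-to-eight nucleotide range reported above.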
| DISCUSSION |
|---|
The length of the CTG·CAG tract has a pronounced effect on the capacity of plasmids to promote gross deletions. If n = 0 or 17, essentially no mutants were observed. Alternatively, for the longer tracts (n = 98 and 175), substantial deletions were found. The role of TRS length on the capacity to adopt non-B DNA structures has been established for CTG·CAG repeats (1–6, 8–16, 18, 22, 24, 25, 29, 30, 39–49, 51–53, 58) and for GAA·TTC repeats (31, 32, 55, 59–61). Whereas the exact role of the non-B conformations adopted by the CTG·CAG repeats in the deletion process remains to be clarified, the distinct effect of repeat length in triggering these mutagenic reactions strongly suggests a role for the overall DNA topology rather than the sequence alone.
Most interestingly, long tracts of GAA·TTC were inert in promoting the formation of deletions in the work described herein. This TRS was shown to adopt triplex as well as sticky DNA conformations (31, 32, 55, 59–61); triplexes were demonstrated to cause the site-specific introduction of DNA damage in eukaryotic cells (62, 63). Thus, the molecular basis of the mutational impotency of long GAA·TTC repeats found in our studies remains to be clarified.2 It is possible that the relatively long sequences flanking the GAA·TTC repeats (see "Experimental Procedures") could contain deletions that were not detected by our assay conditions. However, the large range of deletion lengths promoted by long CTG·CAG tracts makes this possibility unlikely.
Long CTG·CAG repeats in orientation II were much more prone to promote gross deletions and inversions than in orientation I. In fact, for the longest CTG·CAG repeat (n = 175), deletions and inversions were only observed in orientation II but not in orientation I; as found for the effect of length, the host cell strain had an influence. This effect of insert orientation is diagnostic for an involvement of replication repair in the genetic instability behavior. Although this effect was first seen in plasmids harboring CTG·CAG repeats in E. coli (5, 6), it has been repeatedly observed in a wide range of studies in yeast, cell cultures, and mice (1–4, 9, 30, 41, 42, 44, 45, 47, 58, 64). This behavior is due to the preferential capacity of the CTG repeat-containing strand on the lagging strand template to adopt hairpin loop structures (compared with the less stable CAG repeat strand), which serve as an impediment for replication fork progression at the repeats and thereby enable the induction of double-strand breaks at the stalled fork. Also, the (CTG·CAG)175 insert with two G to A interruptions was less mutagenic than the shorter but uninterrupted 98 repeat tract (Table II). Numerous other examples have been found of the highly disruptive effect of interruptions on genetic instabilities (10, 14, 15, 56, 57, 65–67).
Active transcription of the TRS caused an increase in the formation of gross deletions by several orders of magnitude. This dramatic effect reveals the important consequences of transcription as a biological process in mutagenesis, which has been reviewed extensively (9, 33–37, 51, 68–72). Virtually every process that exposes the single strands of DNA also destabilizes triplet repeats, including transcription (18, 51), replication (5, 52), recombination (14–16, 41, 53), and DNA repair (10–12, 54). When transcription occurs on a DNA segment that is simultaneously being replicated or contains lesions, which need to be repaired, transient changes occur in the DNA topology (9, 69, 73). As the negatively supercoiled DNA facilitates strand separation, it is vulnerable to metabolic attacks on the single-stranded regions leading to both mutagenic and recombinogenic lesions (14, 39, 69). Because transcription generates a high level of negatively supercoiled DNA and thereby promotes the formation of underwound non-B conformations, it is possible that the TRS-induced mutations were caused by these conformations at the repeat tracts. We demonstrated that the sequences at the breakpoints of the deletions for all 47 mutants could adopt supercoil-dependent non-B conformations, in agreement with prior studies (21, 23).
Prior investigations revealed (21) that the highly unusual 2.5-kbp poly(purine·pyrimidine) sequence from intron 21 of the human PKD1 gene induced long deletions and other instabilities in plasmids that were mediated by mismatch repair and transcription. Other prior studies showed that this 2.5-kbp R·Y tract forms non-B DNA structures (22). For 11 deletions, which were analyzed in detail, the breakpoints could be explained by the formation of non-B DNA conformations. This work proposed that alternative DNA conformations (but not the sequences per se) promote genomic rearrangements through recombination-repair activities. The work described herein, demonstrating that long CTG·CAG repeat tracts also trigger the formation of large deletions and inversions and are greatly stimulated by transcription, provides a substantial extension of the original observations (21, 22) and establishes a clear role of transcription. Although transcription through the long CTG·CAG tracts (12, 18) is known to enhance its instability (via deletions), the different conditions of bacterial growth and the strains used did not previously allow detection of the gross deletions and inversions.
Fig. 4 presents a model for the mechanisms of formation of the products described in Fig. 2. The DSB, close to or within specific sequences capable of adopting non-B DNA conformations, may induce repair by the single or multiple deletion pathways. Repair of DSBs occurring between sequences with direct or inverted homologies at the breakpoints caused inversions and deletions of part or all of the repeat tracts along with some flanking DNA. For example, four mutants (Fig. 2, clones 1–4), and 11 other DNAs (Fig. 2, clones 20–23 and 25–31) are products of repair events that could have been formed by the left and right side mechanisms, respectively, of the single deletion pathway (see Fig. 4). However, clone 7 is a typical example of a product formed by the multiple deletions pathway. Also, all derivatives of pRW5305 and pRW5309 as well as a few clones derived from pRW5301 (clones 5, 6, 9, 19, and 24), which had retained flanking DNA downstream of the repeats, were also derived by the multiple deletions mechanism.
| FOOTNOTES |
|---|
To whom correspondence should be addressed: Center for Genome Research, Institute of Biosciences and Technology, Texas A & M University System Health Science Center, Texas Medical Center, 2121 W. Holcombe Blvd., Houston, TX 77030. Tel.: 713-677-7651; Fax: 713-677-7689; E-mail: rwells{at}ibt.tamu.edu.
1 The abbreviations used are: DM1, myotonic dystrophy type 1; TRS, triplet repeat sequence; CFUs, colony-forming units; IPTG, isopropyl β-D-thiogalactoside; DSB, double-strand breaks; Ap, ampicillin.
2 M. Wojciechowska and R. D. Wells, unpublished data.
3 B. Wieringa, personal communication.
| REFERENCES |
|---|