J. Biol. Chem., Vol. 279, Issue 40, 41715-41726, October 1, 2004
From the Institute of Biosciences and Technology, Center for Genome Research, Texas A&M University System Health Science Center, Texas Medical Center, Houston, Texas 77030-3303 and the Department of Genetics, Cell Biology and Development, University of Minnesota, Minneapolis, Minnesota 55455
Received for publication, June 9, 2004, and in revised form, July 21, 2004.
| INTRODUCTION |
|---|
At least 15 hereditary neurological diseases (e.g. myotonic dystrophy, fragile X syndrome, and Friedreich's ataxia) are associated with the expansions of (CTG·CAG)n, (CGG·CCG)n, or (GAA·TTC)n repeat tracts, respectively (7–13). Additionally, two other non-triplet repeat neurological diseases, spinocerebellar ataxia type 10 (SCA10) (14) and progressive myoclonus epilepsy of Unverricht-Lundborg type (EPM1) (15), are caused by the expansions of pentanucleotide and dodecanucleotide repeating sequences, respectively.
Replication (9, 10, 16–23), recombination (24–28), and repair (10, 21, 29, 30) were shown to be responsible for the instabilities of triplet repeat sequences. Slippage of the repeats (31–34) as promoted by non-B DNA structures (9–11, 35–37) formed by these repeating sequences causes polymerase to pause during replication, as shown both in vivo and in vitro (17, 20, 38–43), thereby generating instabilities. Furthermore, these structures are also recognized by mismatch repair (MMR) (29, 44–48) and nucleotide excision repair (NER) (49, 50); both pathways have been implicated in the stability of the secondary structures, thus influencing the expansion and deletion processes. Also, double strand breaks caused by replication fork arrest or repair of the non-B DNA structures induce repair-mediated recombination that may participate in the expansions observed in both prokaryotic and eukaryotic model systems (21, 30, 51–56). Triplet repeat sequences are hotspots for recombination, which may account for the massive expansions found in certain diseases (24–28, 57, 58).
Herein, we show that the repeating tetranucleotide genetic instabilities associated with DM2 are caused by a structure-mediated replication-based slippage mechanism. The difference in the propensities of the CAGG and CCTG strands to transiently form quasistable DNA hairpin structures determines the type and level of instability. Furthermore, this instability is influenced by the length, orientation, and position of the repeats with respect to the origin of replication.
| EXPERIMENTAL PROCEDURES |
|---|
Δ(gpt-proA)62, leuB6, glnV44, ara14, galK2, lacY1, Δ(mcrC-mrr), rpsL20 (Strr), xyl5, mtl-1, hsdS20 (rB−, mB−), recA13) and plated on LB plates containing ampicillin (100 µg/ml). Individual colonies were grown in LB broth containing ampicillin (100 µg/ml) for 16 h at 37 °C. Plasmids were isolated using the alkaline lysis procedure (Promega, Wizard Plus Miniprep DNA Purification System).
The plasmids carrying the (CCTG·CAGG)n inserts were characterized using restriction mapping and DNA sequencing. An EcoRI digest was used for repeat tracts cloned proximal (SmaI site) to the SV40 origin of replication and an AflIII/BglII digest was used for inserts cloned into the site distal (MfeI site) to the same origin of replication. The restriction fragments were end-labeled with [α-32P]dATP and 1 unit of the Klenow fragment of E. coli DNA polymerase I and analyzed on 5.5% polyacrylamide gels. Furthermore, the plasmids containing the (CCTG·CAGG)n repeats were dideoxy sequenced on both strands to determine the length, purity, and orientation of the repeats with respect to the origin of replication. The sequencing reactions were performed using the Thermo Sequenase Radiolabeled Terminator Cycle Sequencing Kit (U. S. Biochemical Corp.) according to the manufacturer's recommendations. The products of the sequencing reactions were analyzed on 6% Long Ranger gels (FMC BioProducts) containing 7.5 M urea in the glycerol tolerant gel buffer (1.78 M Tris, 0.57 M taurine, and 0.01 M EDTA) (U. S. Biochemical Corp.). The (CCTG·CAGG)30 is a pure repeat (i.e. it contains no polymorphisms/interruptions) as determined by sequencing of the entire repeat containing tract. The (CCTG·CAGG)114 and the (CCTG·CAGG)200 carried a single base pair interruption 11 repeats into the tract to give the sequence (CCTG)11CCTT(CCTG)n, where n = 102 and 188, respectively. The (CCTG·CAGG)n (where n = 114 or 200) repeats were sequenced using primers located at both ends of the repeating tract. In the case of the (CCTG·CAGG)114, sequencing enabled the determination of 200–300 bp into the repeats from either side; however, on excising and analyzing the repeat containing fragment from the plasmid, an error of approximately ±5 repeats was obtained. Similarly, on analyzing the excised (CCTG·CAGG)200 fragment on polyacrylamide gels, the estimated repeat length varied by ±5 repeats. Thus, the repeat lengths were estimated by both sequencing and fragment analyses. The plasmids carried the (CCTG·CAGG)n repeat tracts in both orientations with respect to the SV40 origin of replication (Fig. 1). However, the SV40 origin is bidirectional (59) and is the operative origin in COS-7 cells (60). Orientation I refers to the orientation in which the CCTG repeats are on the leading strand template with respect to the SV40 origin of replication, whereas the orientation in which the CCTG repeats are on the lagging strand template with respect to the same origin is referred to as orientation II.
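The composition of the interrupted tracts described above can be checked with a short sketch. It assumes only the sequence stated in the text, (CCTG)11CCTT(CCTG)n, and counts the CCTT interruption as one unit of the nominal repeat number. The function names are illustrative and are not part of the original procedures.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def interrupted_tract(n_after, n_before=11, unit="CCTG", interruption="CCTT"):
    """Build (CCTG)11 CCTT (CCTG)n as given in the text."""
    return unit * n_before + interruption + unit * n_after


def nominal_repeats(tract, unit_len=4):
    """Nominal repeat number: tract length in units of four nucleotides."""
    return len(tract) // unit_len


for n in (102, 188):
    tract = interrupted_tract(n)
    print(n, nominal_repeats(tract), len(tract), "nt")

# The complementary strand of a CCTG repeat reads CAGG (5' to 3').
print(reverse_complement("CCTG"))
```

With n = 102 and n = 188 the nominal lengths come out as 114 and 200 units (456 and 800 nucleotides), matching the names used for the two longer inserts.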
Genetic Instabilities by Individual Colony Analyses. The episomal DNA, obtained after transfection in COS-7 cells cultured for either 48 h or 2 weeks, was transformed into E. coli HB101 and plated on LB plates containing ampicillin (100 µg/ml). Individual colonies were then picked and grown in LB broth for 16 h at 37 °C. The analyses of single colonies enabled the detection of individual events during replication of the plasmids in COS-7 cells. The plasmids were isolated using alkaline lysis and the DNA analyzed by restriction mapping. The (CCTG·CAGG)n inserts were excised using an EcoRI digest for plasmids carrying the repeats cloned into the SmaI site, whereas an AflIII/BglII digest was used to excise the inserts cloned into the MfeI site. The restriction fragments were resolved on 5.5% polyacrylamide gels. The genetic instability was measured as a change in the length of the (CCTG·CAGG)n repeat containing fragments when compared with the size of the insert excised from the parental plasmid and the 1-kbp DNA ladder size standard (Invitrogen). The lengths of each insert excised from individual colonies were determined using FluorChem version 3.04 (Alpha Innotech Corp.). Furthermore, 20 random clones containing expansions were sequenced to confirm the lengths and sequences of the repeat containing inserts.
The genetic instability (expansions and deletions) of the (CCTG·CAGG)n repeats generated in COS-7 cells was calculated by subtracting the background instability of the repeat-containing plasmids in E. coli HB101. The genetic instability of the parental plasmids was measured by transforming these plasmids into E. coli HB101 and analyzing individual colonies using restriction mapping. The restriction digests used for these analyses were similar to those used for individual colony analyses of the episomal DNA. The percentage of background instability (average of two experiments for the three tetranucleotide lengths in both orientations (total of six experiments)) varied from 2 to 7% for repeats cloned proximal (SmaI site) and from 0 to 12% for repeats cloned distal (MfeI site) (except for (CCTG·CAGG)200 in orientation I, which was 23%) to the SV40 origin (data not shown). All statistical analyses were performed using SigmaStat version 2.03.
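The background subtraction described above amounts to simple arithmetic on colony counts. The following is a minimal sketch; the colony numbers are hypothetical and chosen only for illustration, and clamping the corrected value at zero is an assumption of this sketch rather than a stated step of the procedure.

```python
def percent_unstable(n_changed, n_total):
    """Percentage of colonies whose insert length differs from the parental insert."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return 100.0 * n_changed / n_total


def background_corrected(cos7_percent, ecoli_background_percent):
    """Instability attributed to COS-7 cells after subtracting the E. coli background.

    Clamping at zero is an assumption made here so that a noisy background
    cannot produce a negative percentage.
    """
    return max(0.0, cos7_percent - ecoli_background_percent)


# Hypothetical example: 30 of 100 colonies changed after COS-7 passage,
# with a 7% background measured for the parental plasmid in E. coli HB101.
observed = percent_unstable(30, 100)
print(background_corrected(observed, 7.0))
```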
Substrate Preparation for Chemical Modification and Enzymatic Probing. To analyze the structural features of the (CCTG·CAGG) repeats, we used "single-stranded" synthetic oligonucleotides as models of unpaired regions of the duplex repeat sequences for chemical and enzymatic modification studies. The individual oligonucleotides (Genosys), d(CCTG)26 and d(CAGG)26, were purified on a 6% denaturing polyacrylamide gel containing 7.5 M urea in glycerol tolerant gel buffer (U. S. Biochemical Corp.). The purified oligonucleotides were labeled at the 5' end with 15 units of T4 polynucleotide kinase (U. S. Biochemical Corp.) and [γ-32P]ATP at 37 °C for 1 h. The labeled oligonucleotides were purified on a 6% denaturing polyacrylamide gel. These purified and labeled oligonucleotides were used as substrates for the chemical and enzymatic probing reactions.
Chemical Modifications and Enzymatic Probing. Three chemical probes, osmium tetraoxide (OsO4) (Aldrich), potassium permanganate (KMnO4) (Fisher), and diethyl pyrocarbonate (DEPC) (Sigma), were each used to modify the d(CCTG)26 oligonucleotide, whereas the latter two chemicals were used to modify the d(CAGG)26 oligonucleotide. The purified and labeled oligonucleotides (4–5 × 10^5 cpm/reaction) in 10 mM Tris, 40 mM NaCl, and 10 mM MgCl2 were denatured by heating at 80 °C for 5 min followed by renaturation by gradually decreasing the temperature (2 °C/min) to the indicated reaction temperature (61). The chemical and enzymatic probes were then added along with the carrier DNA (salmon sperm DNA, 1 µg/µl) (Invitrogen).
The DEPC modification of the oligonucleotides was carried out in a buffer containing 50 mM sodium cacodylate, 40 mM NaCl, and 10 mM MgCl2 (pH 7.0) at 25 °C for 60 min (40, 62–67). The KMnO4 reaction was performed at 20 °C for 30 min in a buffer containing 50 mM sodium cacodylate and 1 mM EDTA (pH 7.0) (65, 68–71). A final concentration of 8% DEPC and 0.1, 0.5, and 1 mM KMnO4 was used. The chemical modification with 1 mM OsO4 in the presence of 1 mM 2,2'-bipyridine (Sigma) and 150 mM sodium phosphate (pH 7.8) was carried out at 5 °C for 30 min (64, 65, 67, 72–74). The modified oligonucleotides were then ethanol precipitated and cleaved at the phosphodiester bonds using 1 M piperidine (67, 72, 73, 75, 76). A series of ethanol precipitations and lyophilization steps then were used to ensure removal of the piperidine.
The enzymatic probes used included S1 nuclease (77, 78) (Invitrogen), P1 nuclease (65, 71, 79–81) (Invitrogen), and mung bean nuclease (82–84) (New England Biolabs Inc.). All of the enzymatic probing reactions were carried out at 5 °C for 60 min in a buffer containing 10 mM Tris, 40 mM NaCl, and 10 mM MgCl2 (pH 7.2). 1 mM ZnCl2 was included in the buffer used for the S1 nuclease reactions (75, 78). Concentrations ranging from 4.3 to 8.7 units of S1 nuclease and 0.25–1.0 units of mung bean nuclease were used in a 10-µl total reaction volume. P1 nuclease was used in concentrations ranging from 0.05 to 0.25 µg/ml. The details of the actual concentrations are in the legends to Figs. 5 and 6. The reactions were stopped by addition of a urea-EDTA dye solution (61) followed by quick freezing on dry ice.
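As a planning aid, the duration of the renaturation ramp (80 °C down to the reaction temperature at 2 °C/min) can be estimated for each of the reaction temperatures given above. This is a minimal sketch that assumes a strictly linear ramp; the function name is illustrative.

```python
def renaturation_minutes(start_c, target_c, rate_c_per_min=2.0):
    """Minutes needed to cool linearly from start_c to target_c."""
    if target_c > start_c:
        raise ValueError("target temperature must not exceed the start temperature")
    if rate_c_per_min <= 0:
        raise ValueError("cooling rate must be positive")
    return (start_c - target_c) / rate_c_per_min


# Reaction temperatures stated in the text (degrees Celsius).
reactions = {
    "DEPC": 25,
    "KMnO4": 20,
    "OsO4": 5,
    "nucleases (S1, P1, mung bean)": 5,
}

for name, temp in reactions.items():
    print(name, renaturation_minutes(80, temp), "min")
```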
| RESULTS |
|---|
To evaluate the effect of the length of the (CCTG·CAGG)n repeat tract on the genetic instability, repeats of 30, 114, or 200 (CCTG·CAGG) units were studied. Previous studies showed a marked effect of orientation of trinucleotide repeats (CTG·CAG, CGG·CCG, and GAA·TTC) with respect to the origin of replication (10, 16, 18, 19, 21, 39, 85, 86). Therefore, we studied plasmids carrying the (CCTG·CAGG) repeats in both orientations. The distance of the (CCTG·CAGG)n repeats from the origin of replication may also play an important role in determining the genetic instabilities of these repeats (22).² Thus, the repeats were cloned either proximal (74 bp from the center of the SV40 core origin (64 bp in length) (87) at the SmaI site (map position 2078)) or distal (1842 bp from the center of the same core origin of replication at the MfeI site (map position 162)) to the SV40 origin of replication.
Genetic Instability of the (CCTG·CAGG)n Repeats. The plasmids carrying 30, 114, or 200 (CCTG·CAGG) repeats cloned in both orientations were transfected into COS-7 cells (Figs. 1 and 2) and the cells were cultured. The episomal DNA was isolated using alkaline lysis and the DNA was digested with DpnI (see "Experimental Procedures"). To determine the genetic instability of the episomal DNA after 48-h cultures, Southern blot analyses were performed with the (CCTG)8 oligonucleotide as a probe (data not shown), which confirmed the presence of the (CCTG·CAGG) repeats in the excised fragments. Because no instability was observed, the more sensitive single colony analyses were performed.
Length-dependent Instability of the (CCTG·CAGG)n Tetranucleotide Repeats. For CTG·CAG, CGG·CCG, and GAA·TTC trinucleotide repeats, the number of repeat units plays an important role in genetic instability (10, 18, 21, 39, 93, 94). To evaluate the effect of the length of the (CCTG·CAGG)n tetranucleotide repeats (where n = 30, 114, or 200) on the genetic instability, three different lengths of the repeats were studied. The percentage of expansions and deletions found for each of the (CCTG·CAGG)n repeats in COS-7 cells for a 2-week period is indicated in Fig. 4. In general, the longer the length of the (CCTG·CAGG) tract, the greater the observed instability. The (CCTG·CAGG)30 repeats were extremely stable (lack of any genetic instability) when the plasmids harboring these repeats were cultured in COS-7 cells for both 48 h and 2 weeks (Fig. 3, A and D). A statistically significant difference in the genetic instabilities was observed between pRW5114 ((CCTG·CAGG)200, orientation II) and pRW5112 ((CCTG·CAGG)114, orientation II) (p = 0.003). Also, similar statistically significant differences in the genetic instability of these repeats were observed on comparing the two pairs of plasmids pRW5121 and pRW5119 (p = 0.012) and pRW5120 and pRW5118 (p = 0.010). However, in the case of the plasmids carrying (CCTG·CAGG)200 cloned in orientation I proximal (SmaI site) to the SV40 origin of replication, a statistically significant length-dependent effect was not observed.
Furthermore, the increase in genetic instability with an increase in the length of the (CCTG·CAGG)n tract was observed for repeats cloned both proximal (SmaI site) and distal (MfeI site) to the SV40 origin of replication. Thus, there is a positive correlation between the length of the (CCTG·CAGG)n tracts and their genetic instabilities.
Effect of Orientation. Prior genetic instability studies on microsatellites revealed the inequality of leading versus lagging strand DNA replication (i.e. the direction of replication) with respect to the types and amount of products (10, 16, 18, 22, 52, 85, 95, 96). To determine whether a similar orientation effect was observed for the tetranucleotide repeats, (CCTG·CAGG)n repeats were cloned in both orientations with respect to the bidirectional SV40 origin of replication (see "Experimental Procedures"). The percentage of instability (both expansions and deletions) obtained from plasmids cultured in COS-7 cells for 2 weeks (Fig. 4) showed that orientation II was more unstable than orientation I, especially when the inserts were cloned in the SmaI site. This effect was most pronounced for plasmids carrying the longest repeats. Thus, a statistically significant difference in instability was observed when comparing pRW5115 and pRW5114 (p < 0.001). In experiments where the plasmids were cultured in COS-7 cells for 48 h, a similar effect was observed (data not shown) for (CCTG·CAGG)200. Thus, once the tetranucleotide repeats reached a threshold of about 200 repeats, an orientation effect, where orientation II was significantly more unstable than orientation I, was observed. This was true of repeats cloned proximal (SmaI site) to the SV40 origin of replication. Alternatively, for repeats cloned distal (MfeI site) to the SV40 origin, the differences were not statistically significant (Fig. 4), even for 48-h cultures (data not shown).
Furthermore, on analyzing the types of genetic instabilities (i.e. expansions and deletions), it was observed that expansions in most cases predominated in orientation II compared with orientation I (Fig. 4); this behavior was especially pronounced for (CCTG·CAGG)200 when cloned into the SmaI site. A 1.5 to at least 35-fold increase in expansions in orientation II compared with orientation I was observed, which was dependent on the length of the (CCTG·CAGG)n repeats. Analyses of the instability products from plasmids cultured in COS-7 cells for 48 h also showed a similar trend where expansions predominated in orientation II, especially in the case of the repeats cloned proximal (SmaI site) to the SV40 origin of replication (data not shown). Deletions were present in both orientations. The percentage of deletions did in most cases increase as a function of length; however, an orientation effect was not observed.
Hence, the genetic instability was greater in the case of the repeats cloned in orientation II than in I and when the repeats were cloned proximal (SmaI site) rather than distal (MfeI site) to the SV40 origin. Furthermore, as the length of the repeats increased, the orientation effect was more pronounced. Expansions predominated in orientation II compared with I, especially at the SmaI site.
Magnitude of Expansions and Deletions. The ranges of expansions and deletions of the (CCTG·CAGG)n repeats cloned proximal (SmaI site) and distal (MfeI site) to the SV40 origin of replication are shown in Table I. For the (CCTG·CAGG)n repeats cloned proximal (SmaI site) to the SV40 origin of replication (Table I), the largest expansion product obtained from (CCTG·CAGG)114 (pRW5112) was a (CCTG·CAGG)210, showing a 1.84-fold increase in the number of repeats. pRW5114 expanded from 200 repeats to a maximum of 300 repeats, indicating a 1.5-fold increase in length. Both of these expansions were obtained for the repeats cloned in orientation II (Table I). The deletions obtained for the (CCTG·CAGG)n repeats cloned proximal to the SV40 origin of replication ranged from a complete loss of the repeats to a reduction of 2 or 3 repeats (Table I). Plasmids harboring the shortest repeats (n = 30) cloned into either of the sites were completely stable in all experiments.
The plasmids cultured in COS-7 cells for a period of 48 h (data not shown) showed expansions ranging between 118 and 144 repeats for (CCTG·CAGG)114 and between 202 and 370 repeats (up to 1.85-fold) for (CCTG·CAGG)200. The deletions were also within a similar range as observed for the plasmids cultured for 2 weeks. This range of expansions and deletions was similar for repeats cloned both proximal (SmaI site) and distal (MfeI site) to the SV40 origin.
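The fold increases quoted for the largest expansion products follow directly from the parental and product repeat numbers. A minimal sketch of that arithmetic is shown below; the helper names are illustrative, and the added length assumes four nucleotides per repeat unit.

```python
def fold_increase(parental_repeats, product_repeats):
    """Ratio of product repeat number to parental repeat number."""
    if parental_repeats <= 0:
        raise ValueError("parental_repeats must be positive")
    return product_repeats / parental_repeats


def added_base_pairs(parental_repeats, product_repeats, unit_len=4):
    """Length gained by the expansion, in base pairs."""
    return (product_repeats - parental_repeats) * unit_len


# (parental, largest product) pairs taken from the text.
cases = {
    "(CCTG.CAGG)114, 2 weeks": (114, 210),
    "(CCTG.CAGG)200, 2 weeks": (200, 300),
    "(CCTG.CAGG)200, 48 h": (200, 370),
}

for label, (parental, product) in cases.items():
    print(label, round(fold_increase(parental, product), 2),
          added_base_pairs(parental, product), "bp added")
```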
In summary, a range of expansions and deletions was found. The larger expanded products were found for the repeats cloned in orientation II compared with I. Furthermore, the magnitude of expansions was much larger for repeats cloned proximal to the origin than for repeats cloned distal.
Oligonucleotide Model Studies: Enzymatic Probing. Two oligonucleotides, d(CAGG)26 and d(CCTG)26, were chemically synthesized to study their structural properties as related to the behavior of unpaired regions of the (CCTG·CAGG) repeats during replication and related processes that unwind the duplex. d(CAGG)26 and d(CCTG)26 were purified and labeled ("Experimental Procedures"). The labeled oligomers were then probed with conformation-sensitive enzymatic agents (S1 nuclease, P1 nuclease, and mung bean nuclease) (65, 71, 77, 78, 80–82, 84, 97). All three enzymes have been used widely to evaluate ordered and H-bonded pseudo-duplex DNA regions within DNA tracts that are otherwise disordered, random-coil structures (65, 71, 78, 79, 97).
S1 nuclease from Aspergillus oryzae, a zinc requiring enzyme with an acidic pH optimum, cleaves single-stranded and partially unpaired DNA and is not base-specific (77, 78, 98). On probing d(CAGG)26 with S1 nuclease (Fig. 5), the greatest reactivity was observed between the cytosine and adenine residues of the 14th CAGG repeat, followed by weaker cleavage between the adenine and guanine of the 14th CAGG repeat, as well as cleavage between the first and second guanine residues of the same repeat. Thus, these residues are more accessible to the enzymatic probe and may form a terminal loop of a hairpin structure. The stem of the hairpin probed with S1 nuclease showed cleavage of the phosphodiester bonds between residues CpA, ApG, and GpG (Fig. 5), but the intensity of cleavage in the proposed stem is weaker than that found in the putative loop (Fig. 5A). Alternatively, on probing the labeled d(CCTG)26 (Fig. 6), an equal intensity of cleavage was observed between the CpT and TpG residues, suggesting the lack of formation of a stable secondary structure.
P1 nuclease from Penicillium citrinum is similar in its behavior to S1 nuclease in preferentially cleaving single-stranded, non-helical DNA tracts or regions that are transiently unpaired (65, 71, 79–81, 98). However, P1 nuclease has a neutral pH optimum and lacks a requirement for zinc. Probing the d(CAGG)26 substrate with P1 nuclease further supported the evidence for formation of a terminal loop as demonstrated by the hypersensitivity of the substrate to this enzyme between the residues ranging from the adenine to the guanine residues of the 13th CAGG repeat and between the adenine and guanine of the 14th CAGG repeat. Furthermore, cleavage was also observed between the cytosine and adenine residues of the 14th CAGG repeat (Fig. 5). Cleavage occurred between ApG, GpG, and CpA with the strongest cleavage between the ApG and GpG residues in the stem (as illustrated for repeat 10 but also found for all other repeats in the stem) (Fig. 5). d(CCTG)26 showed a similar cleavage pattern to that observed with S1 nuclease when probed with P1 nuclease, where the phosphodiester bonds between the CpT and TpG residues were cleaved with an equal intensity (Fig. 6), again indicating that the d(CCTG)26 oligonucleotide did not form a stable secondary structure.
Mung bean nuclease is also highly sensitive to variations in DNA structure and converts single-stranded or unpaired DNA to mono- or oligonucleotides with 5'-phosphates (82–84, 98). The probing of d(CAGG)26 with mung bean nuclease showed predominant cleavage between the first and second guanines of the 13th CAGG repeat, the cytosine and adenine, as well as the first and second guanine residues of the 14th CAGG repeat. Less cleavage was observed between the adenine and guanine residues of the 14th CAGG repeat (Fig. 5). These residues form the proposed terminal hairpin loop. The bonds between the Gs of each of the CAGG repeats (Fig. 5) were more resistant to cleavage than the bonds in the putative loops; thus, we propose that they exist in the stem. Alternatively, the probing of d(CCTG)26 with mung bean nuclease showed cleavage of the bonds between GpC and CpC of each of the CCTG repeats (data not shown), thus indicating the lack of a stable secondary structure in agreement with the results from S1 and P1 nucleases. Hence, the enzymatic probing studies revealed the preferential formation of a hairpin structure by d(CAGG)26 as opposed to d(CCTG)26.
Chemical Probe Determinations. d(CAGG)26 and d(CCTG)26 were individually probed with OsO4, DEPC, or KMnO4. OsO4 specifically reacts with thymines, and to a much lesser extent with cytosines, in single-stranded oligonucleotides and single-stranded or distorted double-stranded regions in DNA to yield mainly cis-thymine glycols (39, 64, 65, 72–74). OsO4/piperidine was used to probe d(CCTG)26 (Fig. 6). An equal intensity of cleavage was observed at each of the thymine residues of the CCTG repeats. This indicates the equal accessibility of the pyrimidines and hence the oligonucleotide did not form any preferential secondary structure. This probe was not used for the complementary oligomer because it lacked thymine residues.
DEPC reacts with the N7 positions of the purines, both adenines and guanines in single-stranded DNA, to form the corresponding ring-opened dicarboxyethoxylated derivatives (39, 40, 62–64, 66, 67). DEPC/piperidine was used to probe both d(CAGG)26 and d(CCTG)26. Similar to the enzymatic probing, the most effective modification occurred at the adenine and the first guanine residues of the 13th CAGG repeat and at the adenine of the 14th CAGG repeat (Fig. 5), indicating the formation of a terminal loop of the putative hairpin structure. The putative stem showed cleavage of the adenine residues to a greater extent than the guanine residues (Fig. 5) as previously stated. In the case of d(CCTG)26, DEPC modified the Gs of each of the CCTG repeats and an equal intensity of cleavage was observed at all of the Gs (Fig. 6).
KMnO4 preferentially oxidizes unpaired or unstacked thymines (and to a much lesser extent cytosines) resulting in strand cleavage upon subsequent treatment with piperidine (65, 68, 69, 71, 76). KMnO4 showed reactivity at all residues in the oligomer but a very prominent cleavage was observed at the cytosine residue in the 14th CAGG repeat of d(CAGG)26 (data not shown). However, in the case of d(CCTG)26, KMnO4 equally modified the thymine residues of each CCTG repeat (data not shown) indicating again the lack of any specific stable secondary structure. Thus, the chemical modification studies further confirmed the results obtained from the enzymatic probing studies that indicate that d(CAGG)26, in contrast to d(CCTG)26, preferentially forms a stable hairpin structure.
| DISCUSSION |
|---|
26 to 11,000 repeats. Using an African green monkey kidney cell system, we demonstrate that replication-based slippage contributes to the genetic instability of these repeats. The instability was dependent on the length of the repeats, their orientation, and their distance from the replication origin. Furthermore, synthetic oligonucleotides representing the unpaired repeat regions during replication were analyzed with chemical and enzymatic probes revealing the preferential formation of hairpin structures by the CAGG oligomer, further supporting our structure-mediated replication-slippage model (Fig. 8).
To study the propensity of the (CCTG·CAGG) repeats to form non-B DNA structures, we analyzed the synthetic oligonucleotides d(CAGG)26 and d(CCTG)26 with chemical and enzymatic probes. The modification patterns obtained for d(CAGG)26 showed the formation of a folded back, hairpin structure. Several different fold-back structures are possible (Fig. 7). d(CAGG)26 can form hairpin structures with 6, 5, 4, or 3 unpaired residues in the terminal loop (Fig. 7, A–D, respectively). This would result in different Watson-Crick and non-Watson-Crick pairing schemes in the putative stems. For the species shown in Fig. 7, B and C, there would be no Watson-Crick pairing making them less stable than the structures shown in Fig. 7, A and D. However, the structure formed with 6 unpaired residues in the terminal loop (Fig. 7A) is more stable than that with 3 residues in the loop (Fig. 7D) because of the two non-Watson-Crick A·G pairs that flank the Watson-Crick G·C pairs (Fig. 7A) compared with the A·A and G·G oppositions that flank the G·C pairs in the structure in Fig. 7D. Although the stability of the mispairs is context dependent, an approximation of the pairing stabilities is G·C > A·T > G·G > G·T ≥ G·A > A·C+ > T·T ≥ A·A ≥ C·C+ > T·C ≥ A·C ≥ C·C (100–106). Previous studies have also indicated the greater stability of the loop of a hairpin with an even number of residues compared with an odd-numbered loop (107, 108), as well as the favored closing of the loop with a 5' pyrimidine and a 3' purine pair (107, 108). Thus, our chemical and enzymatic probe results showed the formation of the structure (Fig. 7A) that the above stated rationale predicts to be most stable. Furthermore, DEPC modified both the adenines and guanines at the N7 positions showing that this position is not involved in hydrogen bonding and that the G·A pairs are of the G(anti)·A(anti) or A(anti)·G(anti) types (102, 104, 109).
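The pairing schemes described for the four fold-back structures can be enumerated with a short sketch. Fig. 7 is not reproduced here, so the geometry is an assumption: the 3' side of the loop is held fixed at the second G of the 14th repeat (0-based index 55) and the loop is shortened from 6 to 3 residues by pairing one more base on the 5' side each time. The sketch only counts oppositions; it does not compute stabilities.

```python
from collections import Counter

WATSON_CRICK = {frozenset("GC"), frozenset("AT")}


def stem_pairs(seq, three_prime_closing, loop_len):
    """List the opposed bases of a simple fold-back hairpin.

    three_prime_closing: 0-based index of the first paired base 3' of the loop.
    loop_len: number of unpaired residues in the terminal loop.
    """
    i = three_prime_closing - loop_len - 1  # last paired base 5' of the loop
    j = three_prime_closing
    pairs = []
    while i >= 0 and j < len(seq):
        pairs.append((seq[i], seq[j]))
        i -= 1
        j += 1
    return pairs


def summarize(pairs):
    """Count each kind of opposition and the number of Watson-Crick pairs."""
    kinds = Counter("·".join(sorted(p)) for p in pairs)
    n_wc = sum(1 for p in pairs if frozenset(p) in WATSON_CRICK)
    return kinds, n_wc


oligo = "CAGG" * 26
closing = 55  # assumption: second G of the 14th repeat closes the loop on the 3' side

for loop_len in (6, 5, 4, 3):
    kinds, n_wc = summarize(stem_pairs(oligo, closing, loop_len))
    print(loop_len, dict(kinds), "Watson-Crick pairs:", n_wc)
```

Under this assumed register, the 6-residue loop gives a stem of alternating C·G and A·G oppositions, the 5- and 4-residue loops give no Watson-Crick pairs, and the 3-residue loop gives C·G pairs flanked by A·A and G·G oppositions, in line with the description in the text.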
A distinct orientation effect was observed for repeats cloned into the SmaI site where orientation II was significantly more unstable than orientation I. For the triplet repeat sequences, instability was defined primarily as the loss of the full-length progenitor fragment. Also, deletions were the predominant products of instability. In the case of the tetranucleotide repeats, however, instability refers to both expansions and deletions. Thus, upon initial consideration, our results appear to be similar to the results with the triplet repeat sequences, including CTG·CAG, CGG·CCG, and GAA·TTC, in which orientation II was shown to be more unstable (9, 10, 16, 18, 19, 21, 22, 39, 52, 85, 86, 95, 96). However, the CCTG sequences are genetically unstable in the orientation prone to expand (orientation II) (Fig. 8) as compared with the triplet repeat sequences that are unstable in the orientation prone to delete (orientation II).
Fig. 8 shows a mechanism for the orientation-dependent instability of the (CCTG·CAGG) sequences. In 1995, Kang et al. (16) proposed the original model to explain the orientation dependence of (CTG·CAG) sequences associated with DM1. For this triplet repeat sequence, the CTG strand forms a more stable secondary structure than the CAG strand. However, for the DM2 sequence, the CAGG strand (on either the nascent lagging strand or the lagging strand template) forms a more stable structure compared with the CCTG strand, generating expansions and deletions, respectively, as shown herein. Thus, the tetranucleotide repeats cloned in orientation II are prone to expand, whereas those cloned in orientation I are prone to delete (Fig. 8). This is the first case of a "reversed" orientation behavior (9, 10). However, similar to the triplet repeats, the capability of one of the two DNA strands to form a quasistable folded structure, in our case the ability of the CAGG strand to adopt a hairpin structure, explains this reversed behavior.
The distance of the triplet repeats from the origin of replication plays an important role in the genetic instabilities of these repeats (16, 19, 22, 111).² In our studies, the (CCTG·CAGG) repeats cloned proximal to the SV40 origin were more unstable than those cloned distal to the same origin. Several hypotheses have been described to explain this effect. If an Okazaki initiation zone is a region of single-stranded template DNA, averaging 135 to 145 nucleotides (in eukaryotes) (22, 112), where priming of the Okazaki fragments occurs, then depending on where the repeats fall within the Okazaki fragments (i.e. at the 5' end or the 3' end), their ability to form secondary structures would differ thereby influencing the amount of instability observed (12, 22, 112). Alternatively, we speculate that because of almost continuous initiation events occurring at the origin, the repeats cloned close to the origin would be rendered single-stranded for a longer period of time thereby giving the repeats a greater opportunity to fold-back and form a stable secondary structure. Once replication has been initiated and the fork progresses away from the origin, the repeats lying distal to the origin do not have as great an opportunity to form these slipped structures on the leading strand. However, because of the single-stranded nature of the Okazaki fragments on the lagging strand, both expansions and deletions can occur depending on the stability of the hairpin structures formed by these repeats. The aberrant processing of the Okazaki fragments involving FEN-1 and DNA ligase has also been hypothesized to play an important role in generating genetic instabilities (21, 113–118).
Furthermore, polymerase switching (PolI/PolIII) has been implicated in E. coli as contributing to the genetic instability of triplet repeats (16, 47). Polymerase switching has also been proposed to occur during eukaryotic replication where following synthesis of the RNA-DNA primer by the polymerase α-primase complex, replication is continued by polymerase δ, both on the leading and lagging strands (87, 119). Thus, polymerase switching may further contribute to the observed instability of the tetranucleotide repeats.
In the present study, we have assayed for replication-based instability; however, transcription through the repeats could also have an influence on the levels of observed instability (49, 93, 120–123).³ The (CCTG·CAGG)n repeats in our experiments, cloned into the SmaI site, fall within the promoter region of the neomycin resistance gene that is transcribed. Because this gene is transcribed to elicit the drug resistance, a further increase in the amount of instability could be obtained for repeats cloned proximal to the SV40 origin. Also, the secondary structures formed by these repeats may cause both the transcription and the replication machinery to pause, as shown both in vitro and in vivo for triplet repeats (17, 20, 23, 122). This in turn could lead to activation of repair (10, 12, 21, 29, 30, 44, 45, 47, 48, 52, 94, 95, 124–126) and recombination (24–28) processes that, in the case of triplet repeats, are known to generate instability.
Thus, a complex interplay of replication, repair, recombination, and transcription may effect the massive expansions observed in DM2. In this study, we have specifically focused on replication and its role in generating the instabilities. However, ongoing work implicates recombination as a powerful mechanism that contributes to the instabilities of the tetranucleotide repeats.⁴ These data provide the first insights into an understanding of the molecular mechanisms of the tetranucleotide instabilities. Our ultimate goal is to understand the molecular processes causing these expansions to develop therapeutic strategies.
| FOOTNOTES |
|---|
¶ To whom correspondence should be addressed: Center for Genome Research, Institute of Biosciences and Technology, Texas A&M University Health Science Center, 2121 W. Holcombe Blvd., Houston, TX 77030-3303. Tel.: 713-677-7651; Fax: 713-677-7689; E-mail: rwells{at}ibt.tamushsc.edu.
1 The abbreviations used are: DM2, myotonic dystrophy type 2; DEPC, diethylpyrocarbonate.
2 R. R. Iyer and R. D. Wells, unpublished data.
3 Mochmann, L., and Wells, R. D. (2004) Nucleic Acids Res. 32, 4469–4479.
4 R. Dere and R. D. Wells, unpublished data.