(CAG)·(CTG) Repeats Associated with Neurodegenerative Diseases Are Stable in the Escherichia coli Chromosome

(CAG)n·(CTG)n expansion is associated with many neurodegenerative diseases. Repeat instability has been extensively studied in bacterial plasmids, where repeats undergo deletion at high rates. We report an assay for (CAG)n·(CTG)n deletion from the chloramphenicol acetyltransferase gene integrated into the Escherichia coli chromosome. In strain AB1157, deletion rates for 25–60 (CAG)·(CTG) repeats integrated in the chromosome ranged from 6.88 × 10⁻⁹ to 1.33 × 10⁻¹⁰, or ∼6,300 to 660,000-fold lower than in plasmid pBR325. In contrast to the situation in plasmids, deletions occurred at a higher rate when (CTG)43, rather than (CAG)43, comprised the leading template strand, and complete rather than partial deletions were the predominant mutation observed. Repeats were also stable during long term growth following multiple passages through exponential and stationary phase. Mutations in priA and recG increased or decreased deletion rates, but repeats were still greatly stabilized in the chromosome. The remarkable stability of (CAG)n·(CTG)n repeats in the E. coli chromosome may result from differences in the mechanisms of replication or from the probability of recombination afforded by a high plasmid copy number. The integration of (CAG)n·(CTG)n repeats into the chromosome provides a model system in which the inherent stability of these repeats more closely reflects that in the human genome.

The expansion of (CAG)n·(CTG)n repeats is associated with many diseases, including myotonic dystrophy type 1 (DM1), Huntington disease, and many spinocerebellar ataxias (1,2). (CAG)·(CTG) tracts generally comprise fewer than 30 repeats in normal individuals. In certain diseases, such as DM1, repeats can expand from fewer than 30 to 30–80 copies, resulting in mild symptoms, and then expand to thousands of copies in severely affected individuals. In the majority of (CAG)·(CTG)-associated diseases, however, full expansion mutations causing the disease are fewer than 80 repeats in length (see Ref. 2 for review). In all of these diseases, repeat expansion is the genetic mutation responsible for the disease, and various approaches for causing repeat deletions or for preventing expansions throughout life may provide therapeutic approaches for disease treatment (3–7).
Much of our understanding of the biochemistry and molecular mechanisms of repeat instability has come from studies involving repeats cloned into plasmids in Escherichia coli. Plasmid-based genetic assays for repeat deletion have shown that (CAG)n·(CTG)n repeats are very unstable when inserted into plasmids in E. coli, where they exhibit rates of deletion as high as 10⁻⁴ to 10⁻² deletions per cell per generation (8–11). The repeats can be almost completely lost from the plasmid following several recultivations of the bacterial culture (12–15). Plasmid-based assays have revealed that mutations in genes involved in DNA mismatch repair (mutS), excision repair (uvrA), recombination (recA and recB), and replication restart (priA, recG, ruvAB, recF, and recO) influence repeat instability in E. coli (8,10,11,16). (CAG)n·(CTG)n repeats can block DNA replication in cells (17), and repeat deletions may predominantly result from attempts to restart DNA replication when the replication fork is stalled at trinucleotide repeats (8,10,16). Repeat instability in plasmids may also result from the high plasmid copy number that may promote recombination between repeats (18,19).
The high rates of deletion and recombination in plasmids are very different from the situation in humans, where intergenerational transmission can lead to expansion from hundreds to thousands of repeats (1,2). In some cases, such as fragile X syndrome, once expanded, the repeat tracts are somatically stable (20). In other diseases, such as DM1 and Huntington disease, long repeat tracts can increase in length throughout life (21–23). Thus, the study of triplet repeats in bacterial plasmids, although useful for studying the mechanism and genetics of deletion, does not model many characteristics of (CAG)n·(CTG)n instability in human cells, where high rates of deletion are not observed.
Here, we have developed a chromosome-based genetic assay for repeat deletion. For this, we placed a chloramphenicol acetyltransferase (cat) gene, disrupted by insertion of (CAG)n·(CTG)n repeats, into the phage λ attB attachment site in the E. coli chromosome using the InCh method (24). In both wild type cells and strains containing mutations affecting replication restart, (CAG)n·(CTG)n tracts were remarkably stable in the E. coli chromosome when compared with the very high rate of instability in plasmids. Thus, this model system more closely mimics the situation for repeats in the human chromosome.

EXPERIMENTAL PROCEDURES
Strains and Media-Parental E. coli strains AB1157, MG1655, and SM551 were used as permissive strains for the generation of InCh1 lysates, which were prepared from parental lysogenic strain DH6521 (Table 1). Derivatives of bacterial strain AB1157, namely AG109 (priA2::kan), RJ1075 (priA4, in which the final six amino acids of the PriA protein are replaced by 12 different amino acid residues, resulting in a partially functional PriA (25)), and N3793 (ΔrecG263), were used as target strains for developing the chromosome-based genetic selection assays. Strains containing the original pBR325 plasmids were grown in broth (10 g of bacto-tryptone and 5 g of NaCl/liter H₂O). For reversion and long term growth experiments, cells were grown at 37 °C in K medium (M9 buffer supplemented with 1% casamino acids, 1% glucose, 0.2 mM CaCl₂, 1 mM MgSO₄, and 0.001% thiamine) containing 100 µg/ml ampicillin (Amp). M9 buffer contained 1 g of NH₄Cl, 5.8 g of Na₂HPO₄, and 3 g of KH₂PO₄/liter. K medium + Amp plates contained 100 µg/ml Amp. K medium + Amp + Cm plates contained 100 µg/ml Amp and 25 µg/ml chloramphenicol (Cm).
Construction of the Chromosome-based Genetic Assay-The cat gene, disrupted by (CAG)n·(CTG)n tracts from pVH plasmids derived from pBR325 (10), was inserted into the E. coli chromosome using InCh vectors as described by Boyd et al. (24). Different plasmids were used that contained either orientation of the cat gene, thus reversing the repeats with respect to the direction of replication. The structure at the insertion site was confirmed by PCR and DNA sequencing as recommended (24).
Repeat Deletion Rate Measurements and Analysis of Revertants-Measurement of the rate of reversion to chloramphenicol resistance (Cmʳ), which occurs by deletion to ≤24 repeats, was as described (10,16). At least six independent reversion assays were performed for each strain. The mutation rate was determined by the method of the median frequency (26). 95% confidence intervals (C.I.) for the individual mutation rates were calculated by the method of the population mean (27). Statistical significance levels (p values) between the mutation rates were determined by the Mann-Whitney two-sample rank test (27). The nature of the deletion events in the Cmʳ revertants was analyzed by PCR as described (10).
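As an illustration of how a rate can be derived from the median of parallel cultures, the sketch below uses the classical Lea-Coulson method of the median. This is an assumption: Ref. 26 is not reproduced here, and the estimator actually used may differ in detail. The function names, the revertant counts, and the culture size are hypothetical.

```python
import math
from statistics import median


def lea_coulson_m(median_mutants):
    """Solve r/m - ln(m) - 1.24 = 0 for m, where r is the median number
    of mutants per culture (Lea-Coulson method of the median; assumed
    here, not confirmed as the estimator of Ref. 26)."""
    if median_mutants <= 0:
        raise ValueError("median number of mutants must be positive")

    def f(m):
        return median_mutants / m - math.log(m) - 1.24

    # f is strictly decreasing in m, so bisection on a wide bracket works.
    lo, hi = 1e-9, 1e9
    for _ in range(200):
        mid = math.sqrt(lo * hi)  # geometric midpoint of the bracket
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


def mutation_rate(mutant_counts, cells_per_culture):
    """Mutations per cell per generation: m divided by final cell number."""
    m = lea_coulson_m(median(mutant_counts))
    return m / cells_per_culture


# Hypothetical revertant counts from six independent cultures.
counts = [8, 10, 12, 9, 11, 10]
rate = mutation_rate(counts, cells_per_culture=1e9)
```

The median is used rather than the mean because occasional early ("jackpot") mutations inflate the mean of a fluctuation assay.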
Long Term Growth Assay and Analysis-A 10-ml culture in K medium containing ampicillin (30 µg/ml) was inoculated from a single colony and grown overnight at 37 °C at 250 rpm. The next day, the cultures were recultivated by a 1:1,000 dilution into 10 ml of K medium containing ampicillin and grown for another 1 or 2 days. This procedure was repeated every day for a total of 7 days and, in a separate experiment, every 2 days for a total of 14 days. Plasmids were purified each time the culture was recultivated. The length of the repeat tract was determined by polyacrylamide gel analysis of PCR products encompassing the repeat region, and the frequency of Cmʳ revertants in the population was determined.
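The number of cell generations covered by such a passaging scheme can be estimated from the dilution factor. The sketch below assumes that each recultivated culture regrows to the density it had before dilution; the function names and the passage count of 7 are illustrative and not taken from the text.

```python
import math


def generations_per_passage(dilution_factor):
    """Doublings needed to regrow to the pre-dilution density."""
    return math.log2(dilution_factor)


def total_generations(dilution_factor, passages):
    """Cumulative doublings over a number of serial passages."""
    return passages * generations_per_passage(dilution_factor)


per_passage = generations_per_passage(1000)  # about 10 generations
seven_passages = total_generations(1000, 7)  # about 70 generations
```

Under this assumption, each 1:1,000 recultivation corresponds to roughly 10 generations.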

RESULTS
A Genetic Assay for Studying the Genomic Instability of DNA Repeats Inserted into the E. coli Chromosome-We have developed and used pBR325 plasmid-based genetic assays to measure rates of deletion of (CAG)n·(CTG)n tracts in E. coli (8–10,16). Here, we have introduced this system, in single copy, into the E. coli chromosome. For this, both orientations of the cat gene in plasmid pBR325, disrupted by (CAG)n·(CTG)n tracts, were inserted into the attB site of the E. coli chromosome (24). The two orientations, with respect to the chromosomal replication origin (oriC), place the (CAG)n or (CTG)n tract in the leading template strand for E. coli chromosome replication (Fig. 1). During the process of introducing the repeats into the chromosome, the (CAG)79·(CTG)79 tract present in the original plasmid was shortened to (CTG)60·(CAG)60 in the cured lysogens SKA793, SKG793, SKR793, and SKN793 and to (CAG)52·(CTG)52 in the cured lysogens SKA794, SKG794, and SKN794. However, cured lysogen SKR794 contained the original length (CAG)79·(CTG)79 (Table 1). The (CAG)25·(CTG)25 tract was duplicated to (CAG)50·(CTG)50 in cured lysogens SKA254, SKG254, SKR254, and SKN254 (Table 1). Presumably, these changes occurred during construction of the primary lysogens. The initial (CAG)43·(CTG)43 tracts were stable upon introduction into the chromosome in all cured lysogens (Table 1).
As expected in this genetic assay, the in-frame insertion of (CAG)n·(CTG)n tracts of 25 repeats or longer in the cat gene resulted in a chloramphenicol-sensitive (Cmˢ) phenotype in either orientation of the cat gene. In plasmids, deletions that reduce the length to fewer than 25 (CAG)·(CTG) repeats restore a chloramphenicol-resistant (Cmʳ) phenotype, allowing measurement of the rates of deletion (10). This was also the case for the cat gene constructs integrated into the chromosome, as Cmʳ revertants containing 23 repeats were obtained from cells containing the (CAG)25·(CTG)25 tract (see below).
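The selection logic of the assay can be summarized in a few lines. This is only a sketch of the threshold stated in the text (Cmˢ at 25 or more repeats, Cmʳ at 24 or fewer); the function names and the "complete"/"partial" labels are illustrative, with "complete" taken to mean that no repeats remain.

```python
CM_RESISTANT_MAX_REPEATS = 24  # tracts of <= 24 repeats give Cm-resistance


def cat_phenotype(n_repeats):
    """Return the expected chloramphenicol phenotype for a tract length."""
    return "Cm-r" if n_repeats <= CM_RESISTANT_MAX_REPEATS else "Cm-s"


def deletion_class(initial_repeats, final_repeats):
    """Classify a revertant by how much of the tract was lost."""
    if final_repeats == initial_repeats:
        return "none"
    if final_repeats == 0:
        return "complete"
    return "partial"


# A (CAG)25 tract that loses two repeats becomes selectable as Cm-r.
before = cat_phenotype(25)
after = cat_phenotype(23)
kind = deletion_class(25, 23)
```

Because only a drop to 24 or fewer repeats is selectable, a (CAG)25 tract can revert through a loss of as few as one repeat, whereas longer tracts require correspondingly larger deletions.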
(CAG)·(CTG) Repeats Were Very Stable in the E. coli Chromosome-To determine (CAG)·(CTG) deletion rates, cultures were grown to mid-log phase in K medium from single colonies from frozen cultures, as described (10). In parental strain AB1157, mutation rates for (CAG)25, (CAG)43, and (CAG)52 tracts in the leading template strand of the E. coli chromosome ranged from 2.05 × 10⁻¹⁰ to 18.02 × 10⁻¹⁰ Cmʳ revertants per cell per generation (Table 2). Mutation rates for (CTG)43 and (CTG)60 tracts in the leading template strand of the E. coli chromosome were 6.88 × 10⁻⁹ and 2.29 × 10⁻⁹, respectively.
Repeat deletion rates were also measured in several derivatives of AB1157 containing mutations in priA and recG, which reduce repeat deletion rates in plasmids (16). In the priA2 strain (AG109), mutation rates for the two orientations of (CTG)43 and (CAG)43 tracts were very similar, and they were 14- and 4-fold lower than the rates in the isogenic parental strain AB1157, respectively (Table 2). The differences in rates comparing the parental and priA2 strain were significant (p < 0.001). In contrast, mutation rates for (CAG)25, (CTG)60, and (CAG)52 tracts in the E. coli chromosome were increased by factors of 2.8, 1.7, and 4.1, respectively (p < 0.01–0.005) when compared with rates in parental strain AB1157. In the priA4 strain (RF1075), mutation rates for most repeats ((CAG)25, (CAG)43, and (CTG)43 tracts in the chromosome) were decreased 1.6–5.5-fold when compared with the parental strain (p < 0.01–0.001). The mutation rate for the (CAG)79 tract was similar to rates for repeat lengths from 43 to 60 in this strain. In the recG263 strain, mutation rates for (CAG)25, (CTG)43, and (CAG)43 tracts were decreased 1.6–25-fold when compared with the parental strain (p < 0.001), whereas mutation rates for (CTG)60 and (CAG)52 were increased 2- and 23.6-fold, respectively, relative to the parental strain (p < 0.001).
Figure legend: The cat gene containing various lengths of (CAG)·(CTG) repeats was inserted into the E. coli chromosome at the attB site as described (24). By using pBR325-based plasmids containing opposite orientations of the cat gene (8–10), both orientations of (CTG)n·(CAG)n tracts, with respect to the chromosomal replication origin (oriC), were introduced into the chromosome.

(CAG)n·(CTG)n Repeats Generally Undergo Complete Deletions in the E. coli Chromosome during Selection in Log Phase Growth-Mutation spectra of Cmʳ revertants were characterized by PCR analyses (Table 3). All of the initial 43–79 (CAG)·(CTG) repeats were completely deleted in the Cmʳ revertants in all strains. When (CAG)25 tracts were placed in the leading template strand of the E. coli chromosome, complete deletions prevailed in the parental, priA4, and recG263 strains, although several Cmʳ revertants in wild type and the recG strain had deletions of only two repeats, leaving 23 repeats in the chromosome. In the priA2 strain, only the deletion of two repeats was observed for (CAG)25.

Stability of (CAG)n·(CTG)n Repeats in the E. coli Chromosome during Repeated Passage through Log and Stationary Phase-In a biochemical assay for repeat instability, long tracts of (CAG)·(CTG) repeats can be almost completely lost from plasmids following only several recultivations of the bacterial culture, in which cells go through many cell divisions and multiple cycles of log and stationary phase (12–15). To test the stability of (CAG)·(CTG) repeats in the E. coli chromosome under these growth conditions, cells were grown for 7 days with recultivation every day or for 14 days with recultivation every 2 days. Cells were grown with ampicillin selection (to maintain the plasmid) but without chloramphenicol selection (which would select for repeat deletions). On analysis of the repeat tracts in the total population by PCR, only the original repeat lengths were detected (data not shown). Cmʳ revertants were detected in the long term growth cultures but only at very low levels, consistent with the rates reported in Table 2.

DISCUSSION
We have used a chromosome-based genetic assay to measure deletion rates of repeat tracts from the E. coli chromosome. For this, the cat gene, disrupted by the in-frame insertion of 25, 43, 52, 60, or 79 (CAG)·(CTG) repeats, was inserted into the phage λ attB attachment site of the E. coli chromosome using the InCh method (24) (Fig. 1). These lengths of repeats encompass the length in normal individuals (<30) as well as the full repeat expansions for the majority of (CAG)·(CTG) repeat expansion diseases (2). We have not yet been able to test the stability of very long repeat tracts associated with a few diseases, such as DM1. In our previous plasmid-based genetic assay, mutation rates for deletions to fewer than 25 repeats, from pBR325-derivative plasmids carrying 25, 43, or 79 (CAG)·(CTG) repeats, ranged from 1.1 × 10⁻⁴ to 2 × 10⁻⁶ in parental strain AB1157 and its derivatives (16). In the chromosome-based genetic assay, the deletion rates ranged from 6.88 × 10⁻⁹ to 1.33 × 10⁻¹⁰ in AB1157 (Fig. 2, A and B). These rates are ∼6,300–660,000-fold lower than the rates in plasmids. Even allowing for a copy number of about 20 for the plasmid (10) compared with a copy number of 2–4 for the chromosome, the differences in mutation rates remain high (630–1,260- to 63,000–126,000-fold lower). In addition, upon long term growth, through multiple cycles of log and stationary phase, the (CAG)·(CTG) repeats were stable, unlike the situation in plasmids, where they undergo rapid deletion generally within 20–40 generations (12–15). Thus, (CAG)·(CTG) repeat tracts are extremely stable in the E. coli chromosome relative to the situation in plasmids. Moreover, the mutation spectrum in the chromosome was quite different from that in the plasmid. In the chromosome, complete deletions predominated, except for (CAG)25 in the priA2 strain. In plasmids, partial deletions were more common (Fig. 2, C and D) (16).
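The copy number adjustment can be written out explicitly. The sketch below simply divides a fold difference by the ratio of plasmid to chromosomal copies; the function name is hypothetical, and the worked values use only the lower bound of the fold range together with the copy numbers quoted in the text.

```python
def per_copy_fold(fold_difference, plasmid_copies, chromosome_copies):
    """Scale a plasmid/chromosome fold difference to a per-copy basis."""
    copy_ratio = plasmid_copies / chromosome_copies
    return fold_difference / copy_ratio


# Lower bound of the fold range (6,300) with 20 plasmid copies and
# either 2 or 4 chromosomal copies of the cat gene.
low_with_2_copies = per_copy_fold(6300, 20, 2)
low_with_4_copies = per_copy_fold(6300, 20, 4)
```

With these inputs the per-copy difference is 630-fold for two chromosomal copies and 1,260-fold for four; the same division applies to the upper end of the range.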
These large differences in stability may be explained by one or more of the following differences. 1) ColE1-based plasmids and the E. coli chromosome initiate DNA replication very differently. In E. coli, the initiation of chromosomal DNA replication occurs at a 254-bp sequence, oriC, in a process controlled by the DnaA protein (28). ColE1 plasmids, on the other hand, initiate replication from a very long RNA primer molecule (RNA II) synthesized by RNA polymerase. Transcription begins 555 nucleotides from the origin (where the transition to DNA synthesis is made). DNA pol I then continues DNA synthesis on the long RNA primer for at least 400 nucleotides before DNA pol III begins synthesis. It is not until this point that a primosome assembles to begin lagging strand synthesis (28). Instability, including aberrant replication events such as intermolecular strand switching during replication, can occur at this pol I to pol III junction region (29). 2) Plasmid replication is unidirectional, whereas replication on the chromosome is bidirectional (28). 3) The copy number of the cat gene in cells is also very different, being about 20 for cells harboring pBR325 derivatives compared with 2–4 under log phase conditions when the cat gene is introduced into the bacterial chromosome (28). The high copy number and the pol I to pol III switch may increase the probability of primer-template misalignment. Moreover, this may also promote the high rate of recombination observed in repeats in plasmids (18,19), whereas recombination may only rarely be involved in repeat instability in human cells (2). 4) ColE1-based plasmids are not maintained in the priA2 strain (30), suggesting a strict requirement for PriA in plasmid replication or stability but not for chromosomal replication. 5) Finally, the size difference of the two genomes may influence stability, given that the repeat may constitute 10% of a plasmid but only a very small proportion of the chromosome. DNA secondary structure transitions, which might occur within the (CAG)·(CTG) tract (31,32), would have very different consequences for supercoiling levels given different topological domain sizes.
Consistent with our results, a large difference between deletion rates of DNA repeats when present on plasmids or on the chromosome was also observed in another genetic assay in E. coli. The rates of deletion between long tandem repeats in the chromosome in E. coli strains carrying mutations in many replication genes ranged from 1.7 × 10⁻⁵ to 3.1 × 10⁻⁹. These rates were ∼10–100,000-fold lower than rates in the plasmid (30). Also, mutation rates of (CTG)n·(CAG)n tracts within yeast chromosomes range from 2 × 10⁻³ to 5 × 10⁻⁵ (33–35), rates that are much higher than those observed in the E. coli chromosome.
In plasmids, (CAG)·(CTG) repeats are sometimes more prone to deletions when (CTG)n tracts are placed in the lagging rather than the leading template strand (10,14). On the other hand, expansions are often favored when the (CTG)n tract comprises the leading template strand (14). However, this strand bias for deletion was not observed for (CTG)·(CAG) repeats in plasmid pBR325 in strain AB1157 grown in K medium (16), and in the present study (CTG)n tracts were more unstable when they comprised the leading template strand of the E. coli chromosome in AB1157 and derivatives (Table 2). In another chromosome mutation assay system, base substitution mutations occurred at a slightly higher rate in the leading template strand (36), whereas this pattern was altered in cells when the SOS repair system was induced (37). Differences in orientation-dependent repeat instability in the plasmids and the chromosome may also result from the differences between these two systems, as discussed above and below.
The progression of replication can be blocked at trinucleotide repeats (17,38,39) and by collisions between replication and transcription complexes (40). To compare different repeat orientations in the plasmid- and chromosome-based cat gene assay systems, the direction of transcription of the cat gene is reversed with respect to movement of the replication fork. Thus, deletion rates may be influenced by the orientation of the cat gene. Indeed, in the previous plasmid-based genetic assays, Cmʳ reversion frequencies were decreased ∼2-fold when the cat gene containing symmetrical repeat inserts was reversed, making the direction of transcription the same as the direction of the replication fork (41). In the chromosome-based genetic assay in parental strain AB1157, the deletion rate of 43 (CAG)·(CTG) repeats was 3.8-fold lower when progression of the replication fork was codirectional with transcription of the cat gene than when replication and transcription collide (Table 2). Our results are consistent with an interpretation that collisions between DNA polymerase and RNA polymerase complexes at the repeats may be responsible for an increased rate of deletion. Indeed, in wild type and mismatch repair-deficient mutL strains containing lacZ frameshift alleles, mutation frequencies were increased 1.1–1.6- and 1–5.4-fold, respectively, when the movement of the fork was opposite to the direction of lac transcription in the E. coli chromosome (42).
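The 3.8-fold orientation effect can be checked against rates quoted in the Results. The sketch below assumes that the (CAG)43 rate in AB1157 is the 18.02 × 10⁻¹⁰ upper bound of the range given for the (CAG)n tracts; the text does not assign that value to a specific tract, so this pairing is an assumption, and the variable names are illustrative.

```python
# Rate quoted in the text for (CTG)43 in the leading template strand.
rate_ctg43 = 6.88e-9

# ASSUMPTION: the (CAG)43 rate is the upper bound of the quoted
# (CAG)n range; Table 2 itself is not reproduced here.
rate_cag43 = 18.02e-10

# Fold difference between the two orientations of the 43-repeat tract.
orientation_fold = rate_ctg43 / rate_cag43
```

Under that assumption the ratio is approximately 3.8, in line with the fold difference stated for the 43-repeat tract.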
In the priA2, priA4, and recG263 strains, deletion rates for strains containing 43 or fewer repeats were reduced relative to the parental strain, consistent with observations in plasmids (16). The influence of mutations in these genes on the rates of deletion is consistent with the involvement of replication restart in instability (16). An exception was the (CAG)25 tract in the priA2 strain, where the Cmʳ reversion rate was increased. Moreover, in this strain, only partial deletions were observed, in strong contrast to all other chromosomal mutation spectra. This is consistent with an increase in replication slippage in this genetic background. Repeat tracts longer than 43 repeats generally showed an increase in deletion rates in priA and recG strains relative to the parental strain. This may result from an increased probability of DNA secondary structure formation leading to replication pausing or DNA secondary structure formation occurring during replication restart pathways (8). 43 repeats may represent the length threshold for replication pausing, perhaps mediated by structure formation. This is in the range of repeat thresholds for pausing in (CGG)·(CCG) and (GAA)·(TTC) tracts in plasmids in yeast (39). However, it is shorter than the threshold observed for (CAG)·(CTG) repeats in plasmids in E. coli and yeast (17,39). The length threshold for structure formation may be different in the E. coli chromosome, especially in the presence of mutations in genes involved in replication restart. In summary, these results show that mutations in priA2, priA4, and recG263 can increase or decrease rates of (CAG)n·(CTG)n deletion from the E. coli chromosome depending on the repeat length and particular mutation.
FIGURE 2. Comparison of (CTG)n·(CAG)n instability in plasmids and the E. coli chromosome. A, deletion rates for (CTG)n·(CAG)n repeats in the cat gene in plasmid pBR325 in strains AB1157 and priA4 and recG derivatives (data from Ref. 16). B, deletion rates for (CTG)n·(CAG)n repeats in the cat gene but integrated in the chromosome (from Table 2). C, frequency of complete deletion of the (CTG)n·(CAG)n tract from the cat gene in plasmid pBR325 in strains AB1157 and priA4 and recG derivatives (data from Ref. 16). D, frequency of complete deletion of the (CTG)n·(CAG)n tract in the cat gene integrated in the chromosome (from Table 3).
Human disease-associated DNA repeats thousands of base pairs in length can be maintained relatively stably in human chromosomes (20–23), whereas they exhibit very high rates of deletion in E. coli plasmids (8,10,18,19). The large differences in rates of instability of triplet repeats comparing plasmids with the bacterial chromosome may be due to differences in genome size, mechanisms of replication, and copy number. This chromosome-based genetic assay should provide a valuable new system for understanding many features of the mechanisms responsible for the instabilities of disease-associated DNA repeats.