Long CTG·CAG Repeats from Myotonic Dystrophy Are Preferred Sites for Intermolecular Recombination*

Homologous recombination was shown to enable the expansion of CTG·CAG repeat sequences. Other prior investigations revealed the involvement of replication and DNA repair in these genetic instabilities. Here we used a genetic assay to measure the frequency of homologous intermolecular recombination between two CTG·CAG tracts. When compared with non-repeating sequences of similar lengths, long (CTG·CAG)ₙ repeats apparently recombine with an ~60-fold higher frequency. Sequence polymorphisms that interrupt the homogeneity of the CTG·CAG repeat tracts reduce the apparent recombination frequency as compared with the pure uninterrupted repeats. The orientation of the repeats relative to the origin of replication strongly influenced the apparent frequency of recombination. This suggests the involvement of DNA replication in the recombination process of triplet repeats. We propose that DNA polymerases stall within the CTG·CAG repeat tracts causing nicks or double-strand breaks that stimulate homologous recombination. The recombination process is RecA-dependent.

Repetitive sequences promote homologous recombination in prokaryotic as well as eukaryotic systems, presumably by virtue of forming unusual DNA secondary structures (15–22). In fact, homologous recombination has been implicated in the instability of repetitive sequences (23–26). Similar findings have also been made with CTG·CAG repeats using a two-plasmid system in Escherichia coli (27, 28). Multiple fold expansions, deletions, and the exchange of point mutations between tracts were found in this system; these events were dependent on the presence of triplet repeat sequence (TRS) tracts on both plasmids, CTG·CAG repeat lengths longer than 30, and a functional recA gene.
Previously (29), it was proposed that CTG·CAG tracts might function as recombination hot spots in the bovine genome. More recently, Young et al. (30) surveyed the genome of Saccharomyces cerevisiae and suggested that these repeats could be recombination hot spots based on the distribution of triplet repeats therein. However, no experimental evidence exists regarding the capacity of triplet repeat sequences to influence the apparent frequency of homologous recombination.
Here we present the first genetic assay for the determination of the apparent frequency of intermolecular homologous recombination between CTG·CAG tracts. We have identified several factors that influence the recombination frequency. In our companion paper (31), we have furthermore established a genetic assay for monitoring the apparent recombination frequency of CTG·CAG repeat tracts in an intramolecular system.

EXPERIMENTAL PROCEDURES
The CTG·CAG-containing sequences were subcloned into pBR322 and pFW25 (34) as follows. Fragments containing (CTG·CAG)ₙ were prepared by digesting the pUC19NotI derivatives with either NotI or EcoRI/HindIII (New England Biolabs, Inc.). The DNA was then blunt-ended by filling in the cohesive ends with 1 unit of the Klenow fragment of E. coli polymerase I (U. S. Biochemical Corp.) and dNTPs and electrophoresed on a 7% polyacrylamide gel in TAE (40 mM Tris acetate, 1 mM EDTA, pH 8) buffer, and the band containing the triplet repeat fragment was excised. The DNA was then eluted from the excised band and purified by phenol extraction (35). The pBR322 and pFW25 vectors were prepared by digesting with PvuII and HincII, respectively. The vector and the insert were mixed and ligated for 16 h at 16°C by the addition of 20 units of T4 DNA ligase (U. S. Biochemical Corp.).
E. coli HB101 was used to maintain the pBR322-derived clones, whereas pFW25 and its derivatives were grown in E. coli ECF003. Strain ECF003 (see below) is a derivative of strain DH10B (F⁻ mcrA Δ(mrr-hsdRMS-mcrBC) φ80dlacZΔM15 ΔlacX74 endA1 recA1 deoR Δ(ara, leu)7697 araD139 galU galK nupG rpsL) into which a "copy-up" allele of the pir gene (π·P106L∧F107S) was integrated into the chromosomal attB site by using Int-dependent recombination (see below). π·P106L∧F107S is a hyperactive (copy-up) variant of the π protein (36, 37).
Thus, for pBR322, the ligation mixture was transformed into E. coli HB101 by electroporation and plated on LB agar plates containing 100 μg/ml ampicillin. In the case of pFW25, the ligation mixture was transformed into E. coli ECF003 and plated on LB agar plates with 35 μg/ml chloramphenicol and 0.1% arabinose. Plasmids were then isolated from individual transformants by the Wizard Plus Miniprep DNA Purification System (Promega) and characterized by restriction mapping. By using this strategy, clones were obtained containing the CTG·CAG tracts in both orientations relative to the origin of replication (5).
The TRS-containing NotI fragments were used to construct pRW4312, pRW4313, pRW4316, pRW4317, pRW4318, and pRW4319, and TRS-harboring EcoRI/HindIII inserts were used to create pRW4323, pRW4324, pRW4331, pRW4332, pRW4333, pRW4334, pRW4827, pRW4828, pRW4898, and pRW4899 (Table II). All clones used herein were characterized by restriction mapping; however, all repeating sequence inserts were derived from plasmids prepared in this laboratory which were characterized (5, 31–33) by DNA sequence analyses.
Two non-repeating sequences were used in this study as controls: a 564-bp HindIII fragment of bacteriophage λ DNA (spanning positions 36,895–37,459 of the λ genome) and a 354-bp segment of the human myotonic dystrophy protein kinase gene (DMPK gene) (part of exon 7 and intron 7). The HindIII fragment was obtained by digesting the λ DNA with HindIII followed by blunt-ending of the cohesive ends. The insert was then cloned into the PvuII site of pBR322 or the HincII site of pFW25 (as described above). The DMPK fragment was prepared by PCR amplification of the sequence from human genomic DNA (31). The DMPK fragment was then ligated with the PvuII-linearized pBR322 or the HincII-linearized pFW25.
E. coli Strain Construction-In order to study the frequency of RecA-dependent homologous recombination, we used the following strains: E. coli AB1157 (F⁻ thi-1 hisG4 Δ(gpt-proA)62 argE3 thr-1 leuB6 kdgK51 rfbD1 ara-14 lacY1 galK2 xyl-5 mtl-1 tsx-33 supE44 rpsL31 rac (38)) and its derivative ECF005, and the isogenic but recA strain E. coli JC10287 (Δ(srl-recA)304) (39) and its derivative ECF006. The E. coli AB1157 and JC10287 strains were obtained from the E. coli Genetic Stock Center, Yale University, New Haven, CT. The E. coli ECF005 and ECF006 strains were constructed as described below. The plasmid pFL814 is a derivative of plasmid pJW344 (kindly provided by Dr. Jadwiga Wild, UW Madison) into which the pir allele that encodes π·P106L∧F107S was inserted downstream of the arabinose (ParaBAD) promoter, rendering (over)expression dependent on externally supplied arabinose and permitting a site-specific integration of the pir allele into attB. Briefly, pFL814 contains two NotI sites, cleavage of which generates two DNA fragments as follows: one fragment contains attP, bla, and ParaBAD pir (π·P106L∧F107S), and the other fragment contains the pBR322 oriV. The former DNA fragment was gel-purified, ligated under dilute conditions, and transformed into strains (DH10B, AB1157, and JC10287) each harboring a temperature-sensitive plasmid, pJW289 (from J. Wild), that produces Int protein.
Chromosomal integrants (Campbell recombinants) were selected on plates supplemented with ampicillin (50 μg per ml). Cells were then cured of the Ts plasmid pJW289 by several passages on plates incubated at 42°C. Integration of the pBR322-oriV-less DNA was assessed by determining the ability of the resulting strains to support replication of plasmid pFW25 (γ ori) and by immunoassay using anti-π antibodies (data not shown). The resulting strains were designated ECF003, ECF005, and ECF006, respectively.
Transformation of TRS-containing Plasmids into E. coli-For the recombination studies, plasmids containing TRS or non-repeating sequences were grown to A₆₀₀ ~0.5–0.6 in 500-ml LB cultures with the appropriate antibiotic, and the DNA was isolated using the Wizard Plus Miniprep DNA Purification System (Promega). The DNA was electrophoresed through a 1% agarose gel. The superhelical form of the DNA containing the undeleted CTG·CAG sequence was excised and electroeluted (35); this homogeneous, undeleted plasmid was used for all the recombination experiments. pBR322 derivatives containing TRS or non-repeating tracts were transformed into the appropriate E. coli strain by electroporation (35). The transformation mixture was inoculated into 10 ml of LB medium (containing tetracycline at 5 μg/ml), and the cultures were grown at 37°C at a shaking rate of 100 rpm. When the cultures reached an absorbance (600 nm) of ~0.6 unit, an aliquot (1 ml) was inoculated into 200 ml of fresh LB (with tetracycline as before). The cultures were allowed to grow until they had reached an A₆₀₀ value of 0.7 unit. Aliquots of the cultures were used for plasmid isolation (as described before) to verify the integrity of the TRS-containing insert. These cultures were then used to prepare competent cells for electroporation.
The pBR322 derivative-containing cells were then transformed with the purified pFW25 derivatives. The transformation mixture was then spread onto plates with tetracycline (5 μg/ml) and chloramphenicol (75 μg/ml). For all experiments, the E. coli cells (ECF005, ECF006, as well as AB1157 and JC10287) were grown in the presence of arabinose (0.1%). The colonies were counted after 14 h of incubation at 37°C (~25 cell generations), and these numbers were used to estimate the apparent frequency of recombination (see "Results").
For the restriction analyses and sequencing, single colonies were inoculated into 10 ml of LB with tetracycline (5 μg/ml) and chloramphenicol (75 μg/ml). When the cultures reached an absorbance (600 nm) of 1.0 unit (20–24 h), DNA was isolated and analyzed for single and double crossover events as well as TRS instability.
As a control for the efficiency of transformation of E. coli AB1157 and ECF005 or JC10287 and ECF006, the cells containing pBR322 derivatives were also transformed with pACYC184 (for an explanation see "Results"). The pACYC184 plasmid contains the chloramphenicol resistance gene and the p15A origin of replication (40). pACYC184 can co-exist in E. coli with pBR322 and can be replicated in the absence of the π protein (27, 28, 41).
Transformation of "Recombinant Molecules" into E. coli Cells-In order to study the recombination frequency, the pFW25 derivatives were transformed into E. coli AB1157 cells containing pBR322 derivatives (see "Results"). The pFW25 derivatives cannot replicate by themselves; the only way to obtain Cmᴿ colonies is when the two plasmids recombine, and the recombinant molecule uses the pBR322 origin of replication. However, the formation of Cmᴿ colonies is contingent not only upon the formation of recombinant plasmids after transformation but also upon their establishment and maintenance, particularly in the face of competition from multiple copies of the resident plasmid, which is incompatible with the newly formed recombinant. Thus, the following questions were raised: is the establishment of the recombinant molecule influenced by the presence of an incompatible plasmid in the cell? Does the efficiency of establishment of the recombinant vary with the sequence composition of the insert?
In order to address these questions, we measured the ability of isolated recombinant molecules (examples shown in Fig. 2) to transform and be maintained in host cells harboring the incompatible resident plasmid in multiple copies. If the answers to the questions raised were in the affirmative, and if these factors influenced the outcome of the recombination frequencies measured, then we would predict that there would be a 1:1 correlation between the transformation efficiencies of the recombinant molecules and the apparent recombination frequencies measured for plasmids harboring the same sequence insert.
Hence, the following experiment was performed to test the effect of the incoming plasmid on the measurements. The recombinant molecules (examples shown in Fig. 2) from experiments done with various sets of monomer plasmids were electroeluted from an agarose gel as described above. The purified molecules containing different lengths of TRS or non-repeating sequences were transformed into plasmid-less AB1157 cells or into cells that carried a resident pBR322 derivative containing a DMPK insert. To control for the transformation efficiency of the host cells, pACYC184 was used. The efficiency of transformation was calculated by dividing the number of Tetᴿ, Cmᴿ colonies obtained from transformation with the recombinant molecules by the number of colonies obtained after transformation with pACYC184. The results presented in Table I (1st and 3rd lines) show that there is no significant difference in the transformation efficiency between plasmid-less and pRW4870-containing cells. Therefore, the first conclusion is that the presence of a resident incompatible plasmid has no significant influence on the efficiency of transformation with the recombinant molecules. The antibiotic selection pressure allows detection of all cells into which the recombinant molecule is introduced, regardless of the presence of the resident plasmid.
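The normalization described above can be sketched in a few lines. This is only an illustration of the stated ratio (colonies from the recombinant molecule divided by colonies from pACYC184); the function name and the colony counts are hypothetical and are not taken from Table I.

```python
def relative_transformation_efficiency(recombinant_colonies, control_colonies):
    """Colonies obtained with the recombinant molecule divided by colonies
    obtained with the control plasmid pACYC184.

    Both counts are assumed to be expressed per microgram of DNA.
    """
    if control_colonies <= 0:
        raise ValueError("control transformation yielded no colonies")
    return recombinant_colonies / control_colonies


# Hypothetical colony counts (per microgram of DNA), for illustration only.
plasmid_less = relative_transformation_efficiency(1200, 4000)
with_resident = relative_transformation_efficiency(1150, 3900)

print(f"plasmid-less host:          {plasmid_less:.2f}")
print(f"host with resident plasmid: {with_resident:.2f}")
```

Comparing the two normalized values, rather than the raw colony counts, separates the effect of the resident plasmid from differences in competence between cell preparations.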
Furthermore, we analyzed the influence of the TRS sequence and its length in the resident plasmid on the transformation efficiency. The AB1157 cells containing pRW4827 or pRW4898 (harboring (CTG·CAG)₆₇ or (CTG·CAG)₉₈, respectively) were transformed with recombinant molecules (derivatives of recombination between pBR322 and pFW25 derivatives harboring (CTG·CAG)₆₇ or (CTG·CAG)₉₈). The efficiency of transformation was calculated (as described above), and the results are shown in Table I (3rd, 5th, and 7th lines). There was no difference in the transformation efficiencies whether the resident plasmids contained the non-repeating DMPK sequence or the (CTG·CAG)₆₇ tract. Surprisingly, we found that there was an increase in transformation efficiency when the resident plasmid contained (CTG·CAG)₉₈. This appeared to indicate that the higher recombination frequencies measured were in fact due to the greater transformability of cells harboring (CTG·CAG)₉₈. However, in our recombination assay, the apparent recombination frequency was measured as a ratio between Tetᴿ, Cmᴿ colonies obtained from AB1157 and from ECF005 (see under "Results"). Thus, if the higher transformability of cells harboring (CTG·CAG)₉₈ in the resident plasmid was also found in the ECF005 strain, these effects would cancel each other out. Hence, parallel experiments to those described above were performed using ECF005 cells. The plasmid-less as well as pBR322 derivative-containing cells were transformed with recombinant molecules or pACYC184. The transformation frequencies were calculated, and the results are shown in Table I (2nd, 4th, 6th, and 8th lines). Interestingly, we observed a similar increase in transformation efficiency for the (CTG·CAG)₉₈ tract in ECF005 as had been observed for AB1157. Also, we noted that the ratios of transformation efficiencies of AB1157 to ECF005 are 2.5, 3.2, and 3.3 for the DMPK sequence, (CTG·CAG)₆₇, and (CTG·CAG)₉₈, respectively.
Thus, the presence of the (CTG·CAG)₉₈ tract in the resident plasmid increases the transformation efficiency in both strains to a similar extent; because the apparent recombination frequency is measured as a ratio between the two strains, this increase does not influence the measurement of the recombination frequencies.
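The cancellation argument can be illustrated numerically. The sketch below assumes that a long resident tract multiplies the transformation efficiency by the same factor in both strains; the baseline efficiencies and the factor `boost` are hypothetical, whereas the three AB1157-to-ECF005 ratios are those quoted in the text.

```python
def strain_ratio(efficiency_ab1157, efficiency_ecf005):
    """Ratio of transformation efficiencies, AB1157 relative to ECF005."""
    return efficiency_ab1157 / efficiency_ecf005


# Hypothetical baseline efficiencies, chosen to give a ratio of 2.5.
baseline_ab1157, baseline_ecf005 = 0.50, 0.20
boost = 4.0  # hypothetical fold increase, assumed identical in both strains

before = strain_ratio(baseline_ab1157, baseline_ecf005)
after = strain_ratio(boost * baseline_ab1157, boost * baseline_ecf005)
print(f"ratio without boost: {before:.2f}, with boost: {after:.2f}")

# Ratios reported in the text for the three resident inserts.
reported = {"DMPK": 2.5, "(CTG.CAG)67": 3.2, "(CTG.CAG)98": 3.3}
spread = max(reported.values()) / min(reported.values())
print(f"largest/smallest reported ratio: {spread:.2f}")
```

A common multiplicative factor drops out of the strain ratio entirely, which is why the similar ratios reported for the three inserts argue against a transformability artifact.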
To verify these conclusions, we did parallel experiments in AB1157 and ECF005 using plasmids harboring (CTG·CAG)₁₇₅ (data not shown). We observed that the presence of (CTG·CAG)₁₇₅ in the resident plasmid resulted in an increase in the transformation efficiency by the incoming recombinant in both strains. It should be noted that the (CTG·CAG)₁₇₅ tract exhibited an anti-recombinogenic behavior in the recombination assay (Table IV). Thus, we conclude that there is not a direct correlation between the transformation efficiencies measured for the recombinant plasmids and the apparent recombination frequencies (shown in Tables III and IV, see "Results") for plasmids containing the same sequence inserts. Therefore, the concept that differences in plasmid maintenance and establishment influenced the measured recombination frequencies was disproven. The reason why the transformation efficiency of cells harboring resident plasmids with longer repeats is higher than for shorter repeats is uncertain and is beyond the scope of this study.
Agarose and Polyacrylamide Gel Analyses of Recombination Products-The NdeI/SnaBI (New England Biolabs, Inc.) digestion of the recombinant DNA, isolated from E. coli AB1157 and ECF005, was used to assay for single crossover events. The digested DNA was labeled by end-filling with the Klenow fragment of E. coli DNA polymerase I and [α-³²P]dTTP. Restriction fragments were then separated on 1% agarose gels in TAE buffer with 1-kbp DNA size markers (Invitrogen). The gels were then dried and exposed to x-ray film. In order to detect double crossover events and to analyze the instability of the repeating sequences, the DNAs were digested with HindIII or NdeI/XmnI and labeled by end-filling with the Klenow fragment of E. coli DNA polymerase I and [α-³²P]dATP or [α-³²P]dTTP, respectively. The restriction fragments were then separated through 7% polyacrylamide gels. The dried gels were exposed to x-ray film. The lengths of the CTG·CAG-containing fragments were measured as described before (42).
Some of the recombination products were sequenced using the Thermo Sequenase Radiolabeled Terminator Cycle Sequencing Kit (U. S. Biochemical Corp.). The sequencing reactions were performed using primer AP1 (CGAATTCGAGCTCGGTACCCGGG), which is homologous to the human flanking sequence from the TRS-containing fragments. The products of the sequencing reactions were analyzed on 10% Long Ranger gels (FMC BioProducts) containing 7.5 M urea in the glycerol-tolerant gel buffer (U. S. Biochemical Corp.). The gels were dried and exposed to x-ray film.

RESULTS
Biological Assay for the Apparent Frequency of Intermolecular Recombination-Two plasmid systems have been used previously to measure the frequency of intermolecular recombination between homologous sequences (43). Here, two plasmids, each containing a specific triplet repeat insert, belonging to different incompatibility groups, were introduced sequentially into an appropriate host strain. The plasmids and the host strains were chosen such that neither plasmid could exist independently; although the replication origin of one plasmid was non-functional in the chosen strain, the presence of the other plasmid by itself was selected against using an appropriate antibiotic. Thus, the selection ensured the survival of only those cells in which the two plasmids had recombined to form a co-integrant that would replicate using the origin of one plasmid and would survive on the appropriate antibiotic using the antibiotic resistance gene from the other plasmid.
Also, experiments in a strain that could support the independent replication of both plasmids served to establish a base line. Determination of the transformation efficiencies of the two strains was accomplished by transforming each strain with a control plasmid and by using these numbers to normalize the efficiencies obtained with the experimental plasmids. The recombination frequency was then calculated by comparing the number of colonies obtained after the two-step transformation of the two different strains normalized for the transformation efficiency differences.
This strategy requires that the following events occur for the antibiotic-resistant colonies to appear. First, the incoming plasmid must integrate into the resident plasmid by homologous recombination to link the antibiotic resistance gene to a functional replicon. Second, the resulting recombinant molecule must get established and be maintained in the cell in the face of competition from the pre-existing incompatible multicopy plasmid. Experiments done previously by Bierne and coworkers (44) had suggested that the measurement of recombination frequencies could be seriously jeopardized by differences in the fitness between the parental and recombinant plasmids. Their studies using a Tus/Ter system showed that facilitated establishment of some recombinant plasmids occurred in part due to the reduction in the copy number of the resident incompatible parental plasmid.
TABLE I … "recombinant molecules"
E. coli AB1157 or ECF005 cells with or without pBR322 derivatives were transformed with recombinant molecules obtained from intermolecular recombination between pBR322 and pFW25 derivatives harboring different lengths of the (CTG·CAG)ₙ tracts or the non-repeating sequences (see "Experimental Procedures"). The transformation mixtures were plated onto plates with Cm and Tet, and the number of colonies was counted. The frequency of transformation with recombinant molecules is presented as the ratio of the number of colonies obtained with recombinant molecules to those obtained from the transformation by pACYC184, which was used to control for the differences in transformation efficiencies between different host cells. All the numbers are expressed per μg of DNA.
To investigate the possibility that similar processes may play a role in our intermolecular recombination system, we analyzed the influence of sequence composition and length on the establishment and maintenance of recombinant molecules in the face of competition from the resident incompatible parental plasmid. This was accomplished by measuring the relative efficiencies with which recombinant plasmids transform cells that already harbor an incompatible parental plasmid (see "Experimental Procedures"). Our results (described under "Experimental Procedures") enabled the conclusion that plasmid establishment and maintenance do not significantly influence the measurement of recombination frequencies for the sequences investigated herein. Nevertheless, we describe the recombination frequencies as "apparent frequencies."
Accordingly, we used a pFW25 vector that contains the R6K γ origin (γ ori) of replication (34). The γ ori is a unidirectional origin, which can function only in the presence of the π protein encoded by the pir gene (45). In E. coli ECF005 and ECF006, the π protein is expressed from a chromosomal pir gene, which is under the control of an arabinose promoter (see "Experimental Procedures"). The pFW25 plasmid also contains a chloramphenicol resistance marker. For the second plasmid, we used pBR322, which contains the π protein-independent ColE1 origin of replication as well as genes conferring resistance to ampicillin and tetracycline.
In order to study the apparent frequency of intermolecular recombination between plasmids harboring triplet repeat sequences, derivatives of pBR322 containing different lengths of TRS were transformed into E. coli AB1157 and ECF005 (AB1157pir) cells. This was followed by a second transformation with pFW25 derivatives containing various TRS inserts (see "Experimental Procedures"). In E. coli ECF005, both plasmids can co-exist without recombining and can give rise to Tetᴿ and Cmᴿ colonies (Fig. 1). These plasmids can also recombine to form co-integrants at a certain frequency. To establish the frequency of these recombination events, the (CTG·CAG)-containing pBR322 and pFW25 derivatives were successively transformed into E. coli AB1157. Because this was a two-step transformation, the pBR322 derivative was already established in the cell and could therefore exist independently in the presence of tetracycline. The pFW25 derivatives cannot replicate in this strain because of the absence of the π protein. Thus, after the transformation of the Tetᴿ cells with the pFW25 derivatives, the only way to obtain Tetᴿ, Cmᴿ colonies is by recombination between the TRS-harboring plasmids. Such colonies could only be due to co-integrants that not only replicate using the ColE1 replicon but also carry the chloramphenicol resistance gene. The apparent frequency of recombination was the fraction of plasmids co-existing in E. coli ECF005 that underwent recombination. Thus, the number of colonies obtained from E. coli AB1157 (Fig. 1B) divided by the number of colonies obtained from E. coli ECF005 (Fig. 1A) gives the apparent frequency of recombination between plasmids containing CTG·CAG tracts.
Because two different strains were used in the experiment, it was possible that their different transformation efficiencies could affect the results. Therefore, the efficiencies of transformation of both E. coli strains were normalized with pACYC184 as a control plasmid. The use of pACYC184 was advantageous because it contains the π-independent p15A origin of replication (40) and therefore can be maintained in both the AB1157 and ECF005 strains. Also, pACYC184 has a chloramphenicol resistance marker gene for selection. The number of Tetᴿ and Cmᴿ colonies obtained from both E. coli strains revealed the ratio of the efficiency of transformation of ECF005 to that of AB1157.
FIG. 1. Diagram of plasmids and scheme of study. The strategy of two-step transformation of pBR322 derivatives (thick circle) and pFW25 derivatives (thin circle) harboring CTG·CAG tracts into E. coli AB1157 or ECF005 (for details see "Results") is presented. The approximate positions of the origins of replication (R6K γ ori and ColE1) and the genes encoding the resistances to ampicillin (Ampᴿ), chloramphenicol (Cmᴿ), and tetracycline (Tetᴿ) are shown. The CTG·CAG tract is designated by the open box, and the non-repeating sequence is represented by the cross-hatched box. A and C, the number of colonies obtained from E. coli ECF005 after transformation with plasmids containing CTG·CAG repeats and non-repeating sequences, respectively. B and D, the number of colonies obtained from E. coli AB1157 after transformation with plasmids containing CTG·CAG repeats and non-repeating sequences, respectively.
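In summary, the apparent frequency of recombination between plasmids containing the TRS or non-repeating sequences was calculated as shown in Equation 1,
\[ \text{apparent frequency of recombination} = \frac{N_{\mathrm{AB1157}}}{N_{\mathrm{ECF005}}} \times R \qquad \text{(Eq. 1)} \]
where N_AB1157 and N_ECF005 are the numbers of Tetᴿ, Cmᴿ colonies obtained from E. coli AB1157 and ECF005, respectively, and R is the ratio of the efficiency of transformation of E. coli ECF005 to that of E. coli AB1157 by the control plasmid pACYC184.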
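A minimal sketch of this calculation follows, assuming Equation 1 has the form (colonies from AB1157 / colonies from ECF005) × R, as the surrounding definitions imply. The function names and all colony counts are hypothetical and serve only to show how the normalization enters.

```python
def transformation_ratio(control_colonies_ecf005, control_colonies_ab1157):
    """R: efficiency of transformation of ECF005 relative to AB1157,
    measured with the control plasmid pACYC184."""
    return control_colonies_ecf005 / control_colonies_ab1157


def apparent_recombination_frequency(colonies_ab1157, colonies_ecf005, r):
    """Colonies that required recombination (AB1157) divided by colonies
    that did not (ECF005), corrected by the transformation ratio R."""
    return (colonies_ab1157 / colonies_ecf005) * r


# Hypothetical colony counts, for illustration only.
r = transformation_ratio(control_colonies_ecf005=2500,
                         control_colonies_ab1157=5000)
frequency = apparent_recombination_frequency(colonies_ab1157=60,
                                             colonies_ecf005=3000,
                                             r=r)
print(f"R = {r:.2f}, apparent frequency = {frequency:.1e}")
```

Multiplying by R compensates for the strain that takes up DNA more efficiently, so that the result reflects recombination rather than competence.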
In order to determine whether the triplet repeats had any influence on the apparent recombination frequency, it was necessary to estimate the capacity of homologous but non-repetitive sequences to recombine in this system. Therefore, pRW4870 and pRW4328 containing a 354-bp segment of the human DMPK gene or pRW4829 and pRW4322 harboring a 564-bp fragment of phage λ DNA (Table II) were transformed into E. coli AB1157 or ECF005. The apparent frequency of recombination between plasmids containing non-repetitive sequences was calculated as the number of Tetᴿ and Cmᴿ colonies obtained from AB1157 (Fig. 1D) versus the number of Tetᴿ and Cmᴿ colonies obtained from ECF005 (Fig. 1C). Thus, a comparison between the frequencies of recombination of plasmids containing triplet repeats and the frequencies of recombination of plasmids containing non-repetitive sequences reveals the influence of triplet repeats on the apparent frequency of homologous intermolecular recombination.
Long (CTG·CAG) Tracts Stimulate Intermolecular Recombination-pRW4899 and pRW4323, which both contain (CTG·CAG)₉₈, were introduced into E. coli AB1157 and ECF005 by a two-step transformation. The apparent frequency of recombination between these tracts was 190 × 10⁻⁴ (Table III). Because the inserts in these plasmids had 410 bp of homology (containing the CTG·CAG repeat tracts, human flanking sequences, and segments of the polylinker), we used a 354-bp non-repeating sequence fragment of the human DMPK gene as a control. pRW4870 and pRW4328, which contain this insert, recombine at a frequency of 3.1 × 10⁻⁴. Thus, the stimulation of recombination by the (CTG·CAG)₉₈ repeats is ~60-fold. Also, the apparent frequency of recombination between (CTG·CAG)₉₈ tracts is ~4 times higher when compared with the longer non-repeating sequence (564 bp of λ DNA) (Table III).
The background level of recombination in this system was determined to be 2 × 10⁻⁶ by measuring the frequency of recombination between the control vectors pBR322 and pFW25, which had no inserts. These vectors are essentially non-homologous; therefore, the recombinants presumably arise by illegitimate non-homologous recombination. Also, because pFW25 cannot replicate in AB1157, the non-replicative circle may integrate into the E. coli chromosome and give rise to Cmᴿ colonies when subjected to Cm selection.
Considering instabilities from the human genetics standpoint, the severity and age of onset of the TRS diseases have been correlated with an increase in the length of the repeats in certain genes in patients (1). The stability of CTG·CAG in plasmids in E. coli depends on the length of the repeats (5, 33). Recombination of the tracts has also been shown to depend on their length (27, 28). Furthermore, the relationship between the length of homology and the apparent frequency of recombination has been well established in various systems (46, 47). Therefore, we investigated the effect of CTG·CAG tract lengths on their recombination frequency.
The apparent frequency of recombination between the (CTG·CAG)₆₇ tracts in pRW4828 and pRW4317 is 100 × 10⁻⁴ (Table III). Thus, these tracts recombine at about half the frequency at which the (CTG·CAG)₉₈ tracts recombine (p = 0.007). Also, the apparent recombination frequency of the (CTG·CAG)₆₇ tracts, which have 317 bp of homology, is ~30-fold higher than that of the 354-bp control DMPK sequence. Thus, we conclude that a lengthening of the CTG·CAG tracts substantially increases the apparent frequency of intermolecular homologous recombination.
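The fold differences quoted above follow directly from the frequencies in Table III. As a minimal sketch (the dictionary keys and the function name `fold_change` are illustrative labels, not part of the original analysis), the ratios can be recomputed as:

```python
# Apparent recombination frequencies quoted in the text (Table III).
FREQ = {
    "(CTG·CAG)98": 190e-4,
    "(CTG·CAG)67": 100e-4,
    "DMPK 354 bp": 3.1e-4,
}


def fold_change(test: float, control: float) -> float:
    """Ratio of two apparent recombination frequencies."""
    if control <= 0:
        raise ValueError("control frequency must be positive")
    return test / control


for tract in ("(CTG·CAG)98", "(CTG·CAG)67"):
    ratio = fold_change(FREQ[tract], FREQ["DMPK 354 bp"])
    print(f"{tract} versus DMPK control: {ratio:.0f}-fold")
```

The two ratios come out near 61 and 32, which the text rounds to ~60-fold and ~30-fold.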
Orientation of the CTG·CAG Tracts Influences Recombination Frequency. Previous in vivo studies demonstrated that the genetic instabilities of (CTG·CAG)ₙ sequences are determined by their orientation relative to the origin of replication (5-7, 33, 42). The expansions and deletions of the TRS due to replication were shown to be dependent on the location of the CTG tracts on the leading (orientation I) or on the lagging (orientation II) strand template (5-7, 33, 42). Also, the orientation of the (CTG·CAG)ₙ repeats strongly influenced the pausing of the replication fork in vivo (48).
To test the effect of triplet repeat orientation on the apparent frequency of recombination, we performed experiments with the two (CTG·CAG)ₙ tracts in orientations I or in orientations II (Table III, 3rd and 4th lines). Interestingly, the (CTG·CAG)₉₈

TABLE II Plasmids used in this study
The two families of plasmids used in this study were derivatives of pBR322 and pFW25. The different lengths of the CTG·CAG tracts as well as two non-repeating sequences were cloned into the PvuII site of pBR322 or into the HincII site of pFW25 (for details see "Experimental Procedures"). Orientations I and II were defined (5-7) by the presence of CTG or CAG repeats, respectively, on the leading strand template for DNA replication. The (CTG·CAG)₆₇ and (CTG·CAG)₉₈ sequences are pure CTG·CAG tracts; however, (CTG·CAG)₁₇₅ contains two G to A interruptions at repeats 28 and 69 and (CTG·CAG)₇₃ contains two G to A interruptions at positions 28 and 59 (5). The right column shows the number of G to A interruptions in each repeat sequence. Orientations A and B (for the 354-bp non-repeating sequence fragment) were defined by the presence of a KpnI recognition site, in the sequence, distal or proximal to the origin of replication, respectively.
67 in orientation II) was 0.41 ± 0.4. In another experiment, the measured ratios for pRW4323 ((CTG·CAG)₉₈ in orientation II) and pRW4324 ((CTG·CAG)₉₈ in orientation I) were 3.0 ± 0.5 and 2.6 ± 1.0, respectively. Thus, there was no significant influence of triplet repeat orientation on the transformation efficiency. Because orientation is defined relative to the origin of replication, the strong influence of orientation on the apparent recombination frequency suggested a role for DNA replication in the process. However, it was also possible that inverting the orientation of any sequence on a plasmid might influence the recombination frequency. To test this idea, we constructed pBR322 and pFW25 derivatives in which the 354-bp human DMPK gene fragments were oriented in the reverse orientations (orientations B) compared with pRW4870 and pRW4328 (orientations A) (defined in Table II). The recombination assay revealed no difference in the recombination frequencies for the two orientations of the DMPK fragment (Table III).
Thus, these data strongly suggest that replication has a role in the increased apparent recombination frequency observed for the repeats in orientations II (see "Discussion").
To elucidate further the role of replication in the homologous recombination frequency, we conducted experiments with the repeats in opposite orientations. We reasoned that because the pBR322 derivatives replicate in AB1157, the orientation of the (CTG·CAG)ₙ tract in this vector would dictate the apparent frequency of recombination. Hence, we performed a recombination assay in E. coli AB1157 with a pBR322 derivative, pRW4899, containing the (CTG·CAG)₉₈ in orientation II and a pFW25 derivative, pRW4324, with the TRS in orientation I (Table II). Also, similar experiments were conducted with pRW4898 and pRW4323 wherein the (CTG·CAG)₉₈ tracts were in orientations I and II, respectively (Table II). We were surprised to find that, in general, the apparent recombination frequency between the plasmids containing repeating tracts in the opposite orientations was lower than between plasmids harboring TRS in the same orientation (Table III). To determine whether this behavior also extended to non-repeating sequences, we measured the apparent recombination frequency between the derivatives of pBR322 and pFW25 that contained the 354-bp DMPK fragment in orientation A and orientation B, respectively. The opposing orientations of the DMPK fragment had no influence on the apparent frequency of homologous recombination. The frequency of recombination between the DMPK fragments in orientation A in pBR322 and in orientation B in pFW25 was 2.8 × 10⁻⁴. For the reciprocal cross, the frequency of recombination between DMPK fragments in orientation B in pBR322 and in orientation A in pFW25 was 4.1 × 10⁻⁴. By using the same two-plasmid system, other experiments were conducted to determine the frequency of intermolecular recombination between two (GAA·TTC)₁₇₆ or two (GAA·TTC)₆₀ tracts. The results revealed that the relative orientations of two GAA·TTC repeat tracts had no influence on the recombination frequency.
Thus, the inhibition of recombination between oppositely oriented tracts appears to be a (CTG·CAG) triplet repeat-dependent phenomenon (see "Discussion"). Nevertheless, when the tracts were in opposite orientations, the apparent recombination frequency was the highest when the (CTG·CAG)₉₈ repeats in the replicating plasmid (the pBR322 derivative) were in orientation II (Table III).
These results clearly demonstrate that the intermolecular apparent recombination frequency is influenced by the orientation of the repeating tracts and is significantly higher when the CAG tract is on the leading strand template for pBR322 derivatives. These findings suggest that events that occur during replication, presumably replication fork arrest and pausing at unusual DNA structures (48-50), could stimulate intermolecular homologous recombination between the triplet repeat tracts.
TABLE III Apparent frequency of intermolecular recombination in E. coli AB1157 (RecA⁺)
E. coli AB1157 and ECF005 were transformed with pBR322 derivatives followed by transformation with pFW25 derivatives harboring different lengths of the (CTG·CAG)ₙ tracts or the non-repeating tracts. The transformation mixtures were plated onto plates with chloramphenicol and tetracycline, and the numbers of colonies were counted. The apparent frequency of recombination was calculated as a ratio between the number of colonies obtained from E. coli AB1157 and the number of colonies obtained from ECF005 (for details see "Experimental Procedures"). The orientations of the TRS were defined in Table II.
Interruptions in the CTG·CAG Tracts Decrease the Recombination Frequency. Previous studies (51) suggested that interruptions stabilized the TRS sequences by interfering with the formation of slipped strand structures. It was also shown that interruptions in the repeating tracts inhibit the recombination between TRSs (27,28). Hence, we postulated that interruptions in the (CTG·CAG)ₙ tracts would reduce their recombination frequency. To test this hypothesis, we assayed the apparent frequency of recombination for CTG·CAG tracts containing G to A interruptions (Table IV). In the case of (CTG·CAG)₁₇₅, the apparent frequency of recombination was ~2 × 10⁻⁴; there was no significant effect of orientation on the frequencies. Because these tracts had 654 bp of homology, we used a 564-bp nonrepetitive sequence from the genome of bacteriophage λ as a control. The control sequence recombined at a frequency of 55 × 10⁻⁴. Thus, the frequency of recombination of the triplet repeat tract was ~25-fold less than observed for the phage fragments. It was possible that the presence of the interruptions caused the 25-fold lower frequency of recombination. However, an examination of the control sequence revealed that it was ~60% A+T-rich (data not shown).
Because A+T-rich regions are known to be highly recombinogenic (52), the lower frequency of recombination might have been, in part, due simply to a higher recombinogenicity of the control sequence. To clarify this, we determined the apparent recombination frequency between the 335-bp (CTG·CAG)₇₃ inserts that contain two G to A interruptions. The frequency was ~6 × 10⁻⁵. This was ~5-fold lower than that of the 354-bp DMPK sequence (Table III). Thus, we conclude that interruptions do, in fact, reduce the recombination frequencies between the triplet repeat tracts.
Interestingly, we also observed that the frequency of recombination between an interrupted (CTG·CAG)₁₇₅ tract and an uninterrupted pure tract containing (CTG·CAG)₆₇ was 5.8 × 10⁻⁴ (Table IV). This was about 2-fold higher than the frequency measured between two interrupted tracts containing (CTG·CAG)₁₇₅. Thus, a G to A interruption in one of the tracts reduces the recombination frequency but to a lesser extent than when both recombining tracts have the interruption.
We postulate that the G to A interruptions generate G·T and A·C mismatches during the formation of heteroduplex recombination intermediates. These intermediates may be destabilized by the mismatch repair system, thereby reducing the recombination frequency (see "Discussion").
Intermolecular Recombination Is a RecA-dependent Process. It was demonstrated previously (53) that intermolecular recombination between two compatible plasmids is diminished in RecA-deficient cells. This stands in contrast to intramolecular recombination that was demonstrated to occur by both RecA-dependent and -independent processes (26,53,54). In order to determine whether intermolecular homologous recombination between TRSs involved the RecA protein, we measured the apparent recombination frequency between CTG·CAG tracts in RecA⁻ cells (E. coli JC10287). The plasmids containing (CTG·CAG)₆₇ or (CTG·CAG)₁₇₅ were sequentially transformed into E. coli JC10287 as well as into ECF006 (Fig. 1, for details see "Experimental Procedures"). Similar experiments were performed using plasmids harboring non-repeating sequences (the 564-bp λ DNA fragment and the 354-bp DMPK fragment). For repeating as well as non-repeating sequences, we did not observe any recombination events (data not shown). Our assay allows the selection for recombination events that take place with a frequency of 10⁻⁶ or higher. Therefore, we cannot rule out the possibility that recombination does take place in the recA cells but with a frequency of <10⁻⁶.
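The apparent frequency used throughout is a ratio of colony counts (restrictive host over permissive host, as defined in the legend to Table III), so an experiment that yields no colonies can only place an upper bound on the frequency. The sketch below illustrates that reasoning; the colony counts and the function names are hypothetical, chosen only to reproduce a bound of 10⁻⁶, and are not data from this study.

```python
from typing import Tuple


def apparent_frequency(restrictive: int, permissive: int) -> float:
    """Colonies from the restrictive host divided by colonies from the permissive host."""
    if permissive <= 0:
        raise ValueError("permissive colony count must be positive")
    if restrictive < 0:
        raise ValueError("colony counts cannot be negative")
    return restrictive / permissive


def report(restrictive: int, permissive: int) -> Tuple[str, float]:
    """Return ('=', frequency), or ('<', bound) when no recombinants were recovered."""
    if restrictive == 0:
        # With zero colonies, the frequency is only known to be below
        # one event per permissive colony scored.
        return "<", 1.0 / permissive
    return "=", apparent_frequency(restrictive, permissive)


# Hypothetical counts: no recombinants among one million permissive colonies.
sign, value = report(0, 1_000_000)
print(f"apparent frequency {sign} {value:.0e}")
```

Under these assumed counts the result is reported as a bound rather than as zero, which matches the cautious wording used for the recA cells.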
In summary, our results agree with previous observations (27,28,53) demonstrating that intermolecular recombination between homologous sequences is greatly reduced in recA cells.
A Single Crossover Event between Two Plasmids Occurs through CTG·CAG Tracts. In order to characterize the products of recombination, we analyzed the plasmids recovered from E. coli AB1157 and ECF005 cells that had been transformed with the pBR322 and pFW25 derivatives. Thus, pRW4312 containing (CTG·CAG)₁₇₅ and pRW4316 harboring (CTG·CAG)₆₇ were transformed sequentially into E. coli AB1157 and ECF005 (e.g. E. coli AB1157 and ECF005 were each transformed with pRW4312; subsequently, these cells harboring this plasmid were also transformed with pRW4316). The plasmids were isolated from the Tetᴿ and Cmᴿ colonies obtained from both E. coli strains.
Agarose gel electrophoresis of the supercoiled DNA revealed the presence of monomers of pRW4312 and pRW4316 as well as their multimeric forms when the plasmid DNAs were isolated from E. coli ECF005 (Fig. 2, lanes 7-12). This result was as expected because both pBR322-derived constructs as well as the pFW25 derivatives can co-exist and replicate independently in ECF005. Restriction analysis (see below) of these plasmids revealed that recombination (single crossover events) had occurred between the two plasmids (data not shown). These events were observed for plasmids of all lengths. Thus, plasmids that can co-exist independently in a cell also undergo homologous recombination.
When the DNAs were isolated from E. coli AB1157, the monomeric form of pRW4316 (pFW25 derivative) was not present (Fig. 2, lanes 1-6). This was in agreement with previous findings that the π protein is essential for the replication and maintenance of R6K plasmids. Interestingly, the monomer of pRW4312 was able to co-exist (Fig. 2, lanes 1-6) with the recombinant (co-integrant) species (composed of pRW4316 and pRW4312). The co-existence of pRW4312 and the recombinant plasmid, despite both being replicated by the ColE1 replicon, may be attributed to the ability of antibiotic selective pressure to overcome the statistical plasmid incompatibility observed in the absence of selection (55,56).
It should be noted that plasmid multimers were observed in both AB1157 and ECF005. The pBR322 constructs, the pFW25 derivatives, as well as the recombinant co-integrants were all able to form multimers efficiently (Fig. 2 and data not shown). Therefore, we can rule out the possibility that some selective advantage accrued to some of the constructs that might have skewed the data.
" and Table II), were successively transformed into E. coli AB1157 and ECF005 cells. The transformation mixtures were plated onto plates with chloramphenicol and tetracycline, and the numbers of colonies were counted. The apparent frequency of recombination between repeating tracts was calculated as described in Table III.
To analyze the products of the recombination process, plasmids obtained from E. coli AB1157 were digested with SnaBI and NdeI that have unique recognition sites on pRW4316 and pRW4312, respectively (Fig. 3). Digestion of the individual plasmids with these enzymes resulted in the expected linear products (Fig. 3, lanes 9 and 10). For the recombination products, those plasmids that underwent a single crossover would be expected to release a "short fragment" containing one copy of recombined triplet repeat tract as well as a "long fragment" harboring the second copy of the TRS with the remaining vector sequence (Fig. 3). If the crossover took place within the triplet repeats, the short fragment would contain a CTG·CAG tract flanked on one side by 294 bp of non-repeating human sequence plus a segment of pBR322, and on the other side by 106 bp consisting of the non-repetitive human sequence and a fragment of pFW25. The size of this short fragment would be 601 bp if the fragment contained 67 repeats of CTG·CAG tract or 925 bp if the fragment harbored the (CTG·CAG)₁₇₅ tract. In all cases, the sizes of the bands were in this range, suggesting that the fragments did indeed contain the TRS (Fig. 3). (For a detailed analysis of the triplet repeat lengths and instabilities, see Fig. 6 (discussed below).) The size of the long fragment was ~7500 bp. In all cases, the band corresponding to the linear pRW4312 was present, as expected. Because the transformations were done in a two-step manner, pRW4312 could continue to exist in the cell as a monomer and give rise to the linear product after digestion.
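The expected sizes of the short fragment follow from the two flanking lengths given above plus three base pairs per repeat. A minimal sketch of that arithmetic (the constant and function names are illustrative, not from the original work):

```python
LEFT_FLANK_BP = 294   # human sequence plus pBR322 segment, from the text
RIGHT_FLANK_BP = 106  # human sequence plus pFW25 segment, from the text
BP_PER_REPEAT = 3     # one CTG·CAG triplet


def short_fragment_bp(n_repeats: int) -> int:
    """Expected size of the SnaBI/NdeI short fragment for a tract of n repeats."""
    if n_repeats < 0:
        raise ValueError("repeat number cannot be negative")
    return LEFT_FLANK_BP + RIGHT_FLANK_BP + BP_PER_REPEAT * n_repeats


def repeats_from_fragment(fragment_bp: float) -> float:
    """Inverse relation: estimate the repeat number from a measured fragment size."""
    return (fragment_bp - LEFT_FLANK_BP - RIGHT_FLANK_BP) / BP_PER_REPEAT


for n in (67, 175):
    print(f"(CTG·CAG){n}: {short_fragment_bp(n)} bp")
```

This reproduces the 601-bp and 925-bp values stated in the text; the inverse function shows how a measured band size maps back to a repeat number under the same flank lengths.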
FIG. 2. Products of recombination between pRW4312 and pRW4316. The lanes numbered 1-6 contain DNAs isolated from individual colonies from E. coli AB1157. The lanes numbered 7-12 show DNAs isolated from individual colonies from E. coli ECF005. Lanes 13 and 14 correspond to the control plasmids pRW4312 and pRW4316, respectively.
FIG. 3. Assay for a single crossover event. The thick circle represents the pBR322 derivative containing a (CTG·CAG)ₙ tract (cross-hatched box) and the thin circle shows the pFW25 derivative harboring the (CTG·CAG)ₙ tract. NdeI and SnaBI are restriction sites unique to pBR322 and pFW25, respectively. The figure eight-shaped molecule (top left) represents the recombinant molecule (co-integrant) formed after a single crossover that has taken place between the two TRSs. The molecules represented on the lower left of the figure show the products of digestion by NdeI/SnaBI of the recombinant molecules. The gel presents the analysis of the plasmid DNAs isolated from the single colonies from E. coli AB1157 (after transformation with pRW4312 and pRW4316) digested with NdeI/SnaBI. The restriction fragments were labeled with [α-³²P]dTTP and separated on a 1% agarose gel. The dried gel was exposed to an Amersham Biosciences PhosphorImager screen followed by scanning. The lanes numbered 1-8 correspond to DNAs isolated from the single colonies. Lanes 9 and 10 represent linear pRW4312 after NdeI and linear pRW4316 after SnaBI digestion, respectively. Both linear plasmids were labeled with [α-³²P]dTTP (pRW4316 appears as a smear because digestion with SnaBI gives blunt ends that cannot be labeled; however, a small amount of 3′→5′-exonucleolytic digestion generates some appropriate ends for labeling). The 1-kbp ladder was purchased from Invitrogen, and the sizes of these bands are shown at the right.
In order to conclusively establish that the "short bands" contain CTG·CAG tracts, the restriction fragments were isolated from the gel and sequenced. The DNA sequence analysis confirmed that all the fragments contained the CTG·CAG tracts, as expected (data not shown).
These results clearly demonstrate that pRW4312 and pRW4316 recombine through the CTG·CAG tracts by a single crossover event and give rise to the Tetᴿ and Cmᴿ colonies. The lack of additional unique recognition sites on both plasmids did not allow us to release the second TRS-containing recombinant fragment present in the ~7500-bp linear molecule (Fig. 3).
In this assay, ~5% of colonies of AB1157 after the two-step transformation contained only trace amounts of the co-integrant plasmids, which were isolated and characterized (data not shown). The restriction analysis for single crossover events performed on these DNAs revealed the existence of little or no TRS-containing recombinant fragment. It is possible that the Cmᴿ colonies arose due to the integration of the pFW25 derivatives into the chromosome in the presence of Cm selection.
Assay for a Second Crossover Event. The co-integrants obtained as a result of a single crossover between pRW4312 and pRW4316 can replicate in the cell using the ColE1 origin of replication. These recombinant molecules are able to recombine further via a second intermolecular crossover event (as shown in Fig. 4). If the second crossover takes place, the sequences flanking the TRS for the short bands should both be derived from either pBR322 or from pFW25. This would also be the case for 4, 6, and higher even-numbered crossover events. To identify even numbers of crossovers, HindIII or NdeI/XmnI digestions were performed. The NdeI/XmnI restriction releases the TRS-containing insert from pRW4312, whereas the HindIII digestion excises the CTG·CAG-containing tracts from pRW4316. Because pRW4316 cannot exist in E. coli AB1157 by itself, the only way to possibly excise a TRS-containing insert with HindIII is from products of recombination between cointegrants as shown in Fig. 4. In seven of eight cases (Fig. 4), we observed that the second crossover did indeed take place for pRW4312 and pRW4316. These co-integrants can also undergo intramolecular recombination because each molecule contains two copies of the TRS (31). We were unable to assay for these events due to the lack of selection for the products formed.
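The logic of distinguishing odd from even numbers of crossovers by the origin of the flanking sequences can be illustrated with a toy model. The sketch below assumes, purely for illustration, that every crossover falls within the repeat tract and swaps the downstream flank to the other vector; the function names and this simplification are ours and are not a description of the actual molecular mechanism.

```python
from typing import Tuple

VECTORS = ("pBR322", "pFW25")


def flank_origins(n_crossovers: int, start: str = "pBR322") -> Tuple[str, str]:
    """Origins of the (upstream, downstream) flanks after n crossovers in the tract.

    Toy assumption: the tract starts with both flanks from `start`, and each
    crossover exchanges the downstream flank for that of the other vector.
    """
    if n_crossovers < 0:
        raise ValueError("number of crossovers cannot be negative")
    if start not in VECTORS:
        raise ValueError(f"start must be one of {VECTORS}")
    other = VECTORS[1] if start == VECTORS[0] else VECTORS[0]
    downstream = start if n_crossovers % 2 == 0 else other
    return start, downstream


def same_vector_flanks(flanks: Tuple[str, str]) -> bool:
    """True when both flanks derive from one vector (even number of crossovers)."""
    return flanks[0] == flanks[1]


for n in range(1, 5):
    flanks = flank_origins(n)
    print(n, flanks, "even class" if same_vector_flanks(flanks) else "odd class")
```

In this model a tract released by a single-vector digest (flanks from the same plasmid) corresponds to 2, 4, 6, or more crossovers, whereas mixed flanks correspond to an odd number, in line with the reasoning above.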
The assay for an even number of crossover events was performed for all lengths of triplet repeat inserts. For (CTG·CAG)₆₇, in 3 of 30 analyzed colonies, even-numbered crossover events were observed. Similarly, 4 of 35 colonies were scored for such events in the case of the (CTG·CAG)₉₈ tract. Thus, these results demonstrate that the co-integrants can further recombine through subsequent intermolecular crossover events.
Instability of the CTG·CAG Tracts in the Recombination Products. In order to study the instability of (CTG·CAG)ₙ tracts, we analyzed ~30 colonies obtained from E. coli AB1157 after successive transformation with pBR322 containing (CTG·CAG)ₙ tracts and pFW25 harboring (CTG·CAG)ₙ sequence. The plasmids were assayed for an odd number of crossover events with SnaBI/NdeI digestion. Fig. 5A shows the analyses of products (co-integrants) after single or odd numbers of crossover events, which took place between pRW4827 and pRW4316 (both contain (CTG·CAG)₆₇ tracts). The length of the products after SnaBI/NdeI digestion (short fragments) should be 601 bp if the band contains 67 repeats of CTG·CAG tract or 802 bp for the band harboring 134 repeats (i.e. twice the length of (CTG·CAG)₆₇). The latter would be possible if the crossover took place on the distal ends of the repeating tracts on both plasmids and the recombinant products contained both TRSs. The increase in the length of TRS could also be explained by the expansions of repeating tracts during the recombination process. In all cases, the sizes of the recombination products were in this length range. These products (after single or odd numbers of crossovers) differed in length because the crossover could take place in different parts (registers) of the repeating tracts. Thus, the products of recombination between two (CTG·CAG)₆₇ tracts ranged from (CTG·CAG)₂₇ (deletion of 40 repeats) to (CTG·CAG)₁₀₄ (expansion by 37 repeats) (Fig. 6A).
FIG. 4. Assay for a second crossover event. The molecules formed by the first recombinational event (Fig. 3) can undergo a second crossover. The figure eight-shaped molecule (top right) represents the co-integrant formed by recombination between the TRSs (depicted as the crosshatched boxes). The right and left parts of the figure show the products of digestion of the recombinant molecules with HindIII and NdeI/XmnI, respectively. The gel presents the analysis of the plasmids isolated from E. coli AB1157 (after transformation with pRW4312 and pRW4316). DNA was digested with HindIII or NdeI/XmnI, labeled with [α-³²P]dATP or [α-³²P]dTTP, respectively, and separated on a 7% polyacrylamide gel. The dried gel was exposed to an Amersham Biosciences PhosphorImager screen followed by scanning. The lanes numbered 1-8 correspond to DNAs isolated from the single colonies after HindIII restriction. The lanes numbered 1′-8′ represent the same DNA samples as numbered 1-8 but after NdeI/XmnI restriction. The sizes of the 1-kbp ladder are shown at the left.
Similar data were obtained from co-integrants between pRW4898 and pRW4324 (both harbor (CTG·CAG)₉₈ tracts). The length of the TRS-containing recombinant products after SnaBI/NdeI digestion should be 694 bp if the fragments contained 98 repeats of CTG·CAG or 988 bp if the fragments harbored 196 repeats (two copies of the CTG·CAG tracts) (Fig. 5B). The products obtained after a single crossover varied in length from (CTG·CAG)₁₅ (deletion of 83 repeats) to (CTG·CAG)₁₄₃ (expansion by 45 repeats) (Fig. 6A). Furthermore, the products of the recombination between pRW4316 (containing (CTG·CAG)₆₇) and pRW4312 (carrying (CTG·CAG)₁₇₅) (Fig. 3) were characterized. The lengths of these products ranged from 16 to 251 repeats. Because the parental molecules were of different lengths, the products could have arisen from either parent.
Hence, the product lengths are represented as changes in the length of the (CTG·CAG)₆₇ as well as in that of (CTG·CAG)₁₇₅ (Fig. 6A). Thus, it could be argued that (CTG·CAG)₆₇ gave rise to deletion products that had lost up to 51 repeats as well as products that had expanded by as many as 184 repeats. Alternatively, the (CTG·CAG)₁₇₅ tract could have been deleted by as many as 159 repeats or expanded by up to 76 repeats. Fig. 6B shows the quantitation of the instabilities of the (CTG·CAG) tracts after an odd number of crossovers. Only a small number of recombinant products contained the original starting lengths. Thus, the process of intermolecular recombination between (CTG·CAG) repeat tracts results in substantial length changes.
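Because the products of the 67 versus 175 cross can be referred to either parent, each product length maps to two possible changes. A minimal sketch of that bookkeeping, using only the extreme product lengths quoted in the text (16 and 251 repeats); the function names are illustrative:

```python
def length_change(product_repeats: int, parent_repeats: int) -> int:
    """Signed change in repeat number relative to a parental tract."""
    return product_repeats - parent_repeats


def classify(change: int) -> str:
    """Label a signed change as an expansion, a deletion, or no change."""
    if change > 0:
        return "expansion"
    if change < 0:
        return "deletion"
    return "unchanged"


PARENTS = (67, 175)          # repeat numbers of the two parental tracts
PRODUCT_EXTREMES = (16, 251)  # shortest and longest products reported

for product in PRODUCT_EXTREMES:
    for parent in PARENTS:
        change = length_change(product, parent)
        print(f"product {product} vs parent {parent}: {change:+d} ({classify(change)})")
```

The four values (-51, -159, +184, +76) match the maximal deletions and expansions stated for the two parental tracts.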
Parallel analyses were performed for plasmids isolated from E. coli ECF005 (pBR322 derivatives and pFW25 derivatives can co-exist and replicate independently in this strain). The lengths of the TRS-containing recombinant products were analyzed for molecules that underwent odd numbers of crossover events (after SnaBI/NdeI digestion). We did not observe any differences in the instabilities of the TRS from co-integrants isolated from ECF005 compared with the co-integrants obtained from AB1157 (data not shown).
The co-integrant molecules that were analyzed for the odd-numbered crossover events were also assayed for even-numbered crossover events. The products were then analyzed for length changes of the fragments containing the TRS. The results did not reveal any major deletion or expansion products (data not shown).
Thus, we conclude that the instability (expansions and deletions) of triplet repeat sequences can occur by reciprocal crossovers. However, the sum of the lengths of the two triplet repeat tracts does not increase.
DISCUSSION
We have shown by using a biological assay that the apparent frequency of homologous intermolecular recombination between two CTG·CAG tracts is up to 60-fold higher than between two non-repeating sequences of similar length. The data also reveal the following. First, the apparent frequency of recombination increases with the length of the repeating tract; the longer the tracts, the higher the recombination frequency. Second, the frequency depends on the orientation of the CTG·CAG sequences; recombination is more frequent when the TRS are in orientation II relative to the origin of replication (CTG tracts on the lagging strand template). Third, sequence polymorphisms that interrupt the homogeneity of the CTG·CAG repeat tracts reduce the apparent recombination frequency when compared with pure uninterrupted CTG·CAG repeats. Fourth, the recombination process can genetically destabilize the CTG·CAG tracts and result in both expansions and deletions. However, expansion products that are longer than the sum of the lengths of the two individual tracts were not observed (discussed below).
Repetitive sequences have been observed previously to stimulate homologous recombination (15-22, 57). The formation of unusual secondary structures such as left-handed Z-DNA and intramolecular triplexes was proposed to be responsible for the association of these sequences with recombination hot spots (16,17,58). In fact, triplexes can stabilize branch migration intermediates in vitro, suggesting a stimulatory role for these structures in recombination (59). Also, perfect inverted repeat sequences (which form cruciform structures (60)) stimulate recombination in bacteria (61, 62) and yeast (63). Triplet repeat sequences can form hairpin-loop (5-13, 64-66), tetraplex (67), and slipped (51) structures. The CTG·CAG repeats also exist in a highly flexible and writhed configuration (68-70), a property proposed to serve as a sink for localized negative supercoil density at these sequences and thereby promote duplex unpairing and strand slippage (32,71). We hypothesize that the adoption of unusual secondary structures by the CTG·CAG tracts stimulates intermolecular recombination between homologous tracts. This is in contrast to the lower apparent recombination frequency observed for non-repeating sequences of similar length, which are unlikely to form unusual secondary structures. We favor the idea that the high negative supercoil density at the CTG·CAG repeats causes them to be transiently unpaired, where RecA-mediated strand exchange may occur more frequently. However, we cannot rule out the alternate possibility that the higher recombinogenicity of the CTG·CAG repeats is because repetitive sequences can align with each other in multiple frames during the strand exchange reaction.
FIG. 5. The instability of (CTG·CAG) tracts after an intermolecular single crossover event. A, the gel represents the analysis of the plasmid DNAs isolated from the single colonies from E. coli AB1157 after transformation with pRW4827 and pRW4316 and digested with NdeI/SnaBI. The restriction fragments were labeled with [α-³²P]dTTP and separated on a 1% agarose gel. The dried gel was exposed to an Amersham Biosciences PhosphorImager screen followed by scanning. The lanes numbered 1-12 correspond to DNAs isolated from the single colonies. Lane 13 represents linear pRW4827 after NdeI digestion. B, the analysis of the plasmid DNAs isolated from the single colonies from E. coli AB1157 after transformation with pRW4898 and pRW4324 and digested with NdeI/SnaBI. The restriction fragments were labeled with [α-³²P]dTTP and separated on a 1% agarose gel. The dried gel was exposed to an Amersham Biosciences PhosphorImager screen followed by scanning. The lanes numbered 1-12 correspond to DNAs isolated from the single colonies. Lane 13 represents linear pRW4898 after NdeI digestion. The sizes of the 1-kbp ladder are shown at the right.
Surprisingly, the frequencies of recombination between nonrepeating sequences with a pair of 354-bp DMPK fragments and a pair of 564-bp bacteriophage λ DNA fragments were not the same. The frequency was ~17-fold lower for the 354-bp DMPK fragments. The reason for this difference is uncertain, but an examination of the compositions of these sequences revealed that the phage fragment is 60% A+T-rich. In contrast, the DMPK sequence is 60% G+C-rich, and the (CTG·CAG)ₙ repeats are 67% G+C-rich. Several prior studies (52, 72-74) have suggested that A+T-rich sequences are recombinogenic. Therefore, it is possible that the ~60% A+T richness of the phage fragments (contrasted to the ~40% A+T content of the DMPK sequence) is at least partly responsible for the higher apparent recombination frequency for the 564-bp fragments.
Furthermore, the frequency of homologous recombination strongly corresponds to the length of the homology between the recombining sequences (46,47,75,76). The phage fragment is 60% (210 bp) longer than the DMPK sequence. These two differences may be sufficient to increase the recombination frequency of the phage fragments. Neither the DMPK sequence nor the phage fragment contains χ sites.
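Two of the figures used in this comparison can be checked directly: the G+C content of a pure CTG·CAG tract and the relative length difference between the two control fragments. The sketch below is illustrative only; the function names are ours, and the control sequences themselves are not reproduced here, so only the repeat tract composition is computed from sequence.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C residues in one strand of a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence is empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


def relative_length_excess(longer_bp: int, shorter_bp: int) -> float:
    """How much longer one fragment is than another, as a fraction of the shorter."""
    return (longer_bp - shorter_bp) / shorter_bp


# A pure (CTG)98 strand; the complementary CAG strand has the same G+C content.
print(f"G+C of (CTG)98: {gc_fraction('CTG' * 98):.0%}")

# Fragment lengths from the text: 564-bp phage fragment, 354-bp DMPK fragment.
print(f"length difference: {564 - 354} bp")
print(f"phage fragment longer by {relative_length_excess(564, 354):.0%}")
```

Two of every three residues in a CTG repeat are G or C, giving the 67% figure, and the 210-bp difference corresponds to roughly 59% of the DMPK fragment length, which the text rounds to 60%.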
Our data show that the apparent recombination frequency of the CTG·CAG tracts depends on their orientation relative to the origin of replication. In orientation II (when the CTG tract is on the lagging strand template), the apparent recombination frequency was substantially higher. Because the two plasmids were introduced successively into the cell, we propose that the replication of the CTG·CAG repeats in the pBR322 derivatives (which were introduced in the first step) influences the recombination frequency. The CTG·CAG tracts arrest replication fork progression in vitro and in vivo, presumably due to the formation of unusual secondary structures (48-50, 67); this occurs predominantly when the CTG tract is located on the lagging strand template (orientation II) (48,77). Hence, we propose a model wherein the stalling of the replication fork at the secondary structures leads to nicks and/or double-strand breaks in the repeating tract. Discontinuities in the duplex right-handed B-DNA structure inhibit replication fork progression or cause its collapse (78,79), and stalled replication forks induce DNA repair and recombination (78,80,81). The recombination proteins may then be recruited to the TRS loci due to the affinity of these proteins for unusual secondary structures or for the strand discontinuities. These events could result in the higher apparent recombination frequency for CTG·CAG tracts in orientation II. This model is supported by the observation of double-strand breaks in CTG·CAG repeats in yeast (82,83).
FIG. 6. Distribution of the expansion and deletion products of different lengths of (CTG·CAG) after an odd number of crossovers. Several individual clones containing different lengths of the (CTG·CAG) tracts were isolated after an odd number of crossover events. The lengths of the (CTG·CAG)-containing fragments (as shown in Fig. 5) were measured, and the numbers of triplets were calculated as described (42). A, 26 individual clones after recombination involving (CTG·CAG)67 versus (CTG·CAG)67 and 24 isolates of (CTG·CAG)98 versus (CTG·CAG)98 were analyzed. The measured lengths of the recombination products obtained after an odd number of crossover events are represented on the y axis as an increase or decrease in the number of repeats compared with the original starting length. Also, 23 clones of (CTG·CAG)67 versus (CTG·CAG)175 were characterized, and the lengths of the products are represented as changes in the length of (CTG·CAG)67 or in that of (CTG·CAG)175. The x axis depicts the numerical names that were arbitrarily assigned to the individual clones isolated from experiments involving each set of molecules. B, a quantitative description of the instability of the triplet repeat tracts after recombination was made by calculating the percentage of individual recombination products from A that harbored expanded, deleted, or unchanged TRS tracts for each set of recombining molecules.
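The percentage calculation described for Fig. 6B can be illustrated with a minimal sketch. It assumes that each clone is summarized by its change in repeat number relative to the starting tract; the function names and the example values are hypothetical and are not data from this study.

```python
from collections import Counter


def classify_change(delta_repeats: int) -> str:
    """Label a clone by the sign of its change in repeat number."""
    if delta_repeats > 0:
        return "expanded"
    if delta_repeats < 0:
        return "deleted"
    return "unchanged"


def instability_percentages(deltas):
    """Return the percentage of clones in each class (expanded, deleted, unchanged)."""
    if not deltas:
        raise ValueError("at least one clone is required")
    counts = Counter(classify_change(d) for d in deltas)
    total = len(deltas)
    return {
        label: 100.0 * counts[label] / total
        for label in ("expanded", "deleted", "unchanged")
    }


# Hypothetical repeat-number changes for eight clones (NOT data from this study).
example_deltas = [12, -5, 0, 30, -18, 0, 7, -2]
percentages = instability_percentages(example_deltas)
```

In practice, a measurement tolerance would probably be needed before calling a tract "unchanged", because fragment lengths read from a gel are approximate; the sketch omits this for brevity.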
A surprising observation was that when two inserts were oriented oppositely with respect to each other, the recombination frequency was substantially lower than when the inserts were in the same orientation. This effect was exclusively limited to inserts containing CTG·CAG repeats; oppositely oriented homologous control sequences from the DMPK locus as well as the GAA·TTC sequences recombined with frequencies similar to each other. The reason why the relative orientations of the CTG·CAG repeats have such a dramatic effect on their recombinogenicity is uncertain but may be due to a residual amount of replication from the R6K origin. However, literature exists (45,84) that argues against this possibility. Another possibility, albeit remote, is that the secondary structure of the triplet repeat tract is somehow different in orientation I as compared with orientation II and that these differences may present a barrier to the recombination machinery. Because in vivo determinations of DNA conformations are exceedingly difficult (reviewed in Refs. 60 and 85), substantial new experimental strategies may need to be developed to address this question.
The apparent recombination frequency between the CTG·CAG repeats containing CTA·TAG interruptions was lower than the frequency with pure uninterrupted tracts. We postulate that the presence of interruptions results in the formation of imperfectly aligned heteroduplex recombination intermediates that contain G·T and A·C mismatches at the sites of the interruptions. These mismatches may attract the mismatch repair proteins MutS and MutL (42), which can inhibit RecA-mediated strand transfer (86). Thus, the recombination intermediates may be destabilized by the mismatch repair system, thereby diminishing the recombination frequency. This model is consistent with previous work (87,88) that showed that recombination between homologous sequences was stimulated by up to 2 orders of magnitude in strains deficient in mismatch repair functions. In addition, the interruptions can reduce the propensity of these sequences to form secondary structures that may impede the progression of the replication fork (51). A combination of these two effects could account for the lower apparent recombination frequency of the interrupted CTG·CAG tracts.
Previous studies (27,28) demonstrated that recombination between CTG·CAG tracts in a two-plasmid system enhanced their genetic instability; severalfold expansions were reported. In the present work, we have also utilized a two-plasmid system to measure the apparent frequency of intermolecular homologous recombination between the TRS tracts. In this system, we observed expansions and deletions of the TRS after single (or odd-numbered) crossover events as well as after double (or even-numbered) crossover events. However, it should be noted that the total length of the expanded products was never higher than the sum of the lengths of the two triplet repeats. In a recent study, La Spada and co-workers (89) also used a somewhat different two-plasmid CTG·CAG recombination system but did not observe massive expansions. The apparent lack of agreement between our results, those of La Spada et al. (89), and those from previous investigations from this laboratory (27,28) may be due to differences in the experimental strategies (especially the vectors, replication origins, and transcription properties).
Our experiments demonstrate that CTG·CAG repeats are preferred sites for recombination in vivo. This property of these sequences suggests a number of important roles. First, frequent recombination events might occur within CTG·CAG repeats to provide ample opportunity for expansion events, as suggested previously (27,28) to be responsible for a variety of human neurological disorders. A second possibility is that recombination between CTG·CAG repeats may promote genetic exchange and speciation in a wide variety of organisms. Finally, the high apparent frequency of recombination between triplet repeats may provide a strong driving force for the evolution of microsatellite sequences (see accompanying article (31)).