Genetic Instabilities in (CTG·CAG) Repeats Occur by Recombination*

The expansion of triplet repeat sequences (TRS) associated with hereditary neurological diseases is believed from prior studies to be due to DNA replication. This report demonstrates that the expansion of (CTG·CAG)n in vivo also occurs by homologous recombination as shown by biochemical and genetic studies. A two-plasmid recombination system was established in Escherichia coli with derivatives of pUC19 (harboring the ampicillin resistance gene) and pACYC184 (harboring the tetracycline resistance gene). The derivatives contained various triplet repeat inserts ((CTG·CAG), (CGG·CCG), (GAA·TTC), (GTC·GAC), and (GTG·CAC)) of different lengths, orientations, and extents of interruptions and a control non-repetitive sequence. The availability of the two drug resistance genes and of several unique restriction sites on the plasmids enabled rigorous genetic and biochemical analyses. The requirements for recombination at the TRS include repeat lengths >30, the presence of CTG·CAG on both plasmids, and recA and recBC. Sequence analyses on a number of DNA products isolated from individual colonies directly demonstrated the crossing-over and expansion of the homologous CTG·CAG regions. Furthermore, inversion products of the type [(CTG)13(CAG)67]·[(CTG)67(CAG)13] were isolated as the apparent result of “illegitimate” recombination events on intrahelical pseudoknots. This work establishes the relationships between CTG·CAG sequences, multiple fold expansions, genetic recombination, formation of new recombinant DNA products, and the presence of both drug resistance genes. Thus, if these reactions occur in humans, unequal crossing-over or gene conversion may also contribute to the expansions responsible for anticipation associated with several hereditary neurological syndromes.

Genetic instabilities (expansions and deletions) of triplet repeat sequences (TRS) ((CTG·CAG), (CGG·CCG), or (GAA·TTC)) are a hallmark of certain hereditary neurological diseases (1,2). Numerous workers in human genetics have proposed DNA replication, gene conversion, recombination, and related processes as the mechanism(s) responsible for these alterations in repeat sequence lengths. Subsequent in vivo studies in genetically tractable systems (1,3) and in vitro investigations (4) have demonstrated expansions and deletions during DNA replication, probably by slipped strand misalignment due to preferential hairpin formation of TRS. Similar molecular studies in vivo on gene conversion and recombination are lacking.
Several human genetic studies on patient materials reported haplotype analyses, especially related to myotonic dystrophy (DM) and the fragile X syndrome, which implicated gene conversion and/or unequal crossing-over (types of recombination) in genetic instabilities. In the first report, Korneluk and coworkers (5,6) proposed unequal crossing-over (5) and gene conversion (6) as the mechanisms responsible for the expansions and deletions observed in the (CTG·CAG) mutation during DM transmission. This conclusion was derived from haplotype analyses of six polymorphic markers in the DM region. The TRS was presumed to be the site of the discontinuous gene conversion events.
Second, Nelson and collaborators (7) investigated the loss of mutation at the FMR1 locus through multiple exchanges between maternal X chromosomes. They described a daughter of a female carrier who inherited the fragile X premutation chromosome based on haplotype analyses using flanking markers. The (CGG·CCG) repeat sequence and the intragenic polymorphic marker FMRb showed the normal maternal alleles, whereas two other intragenic markers showed the risk haplotype. Since the other intragenic markers are located between the markers (CGG·CCG) and FMRb, this results in patches of normal and fragile X sequence in the FMR1 gene of the daughter and was explained on the basis of gene conversion. Likewise, prenatal diagnosis of the fragile X syndrome showed a loss of mutation owing to a double recombinant or gene conversion at the FMR1 locus (8).
Third, Brown et al. (9) investigated reverse mutations in the fragile X syndrome as well as founder effects. Based on these haplotype analyses of nearby markers to the (CGG·CCG) repeats, revertants were discovered in a small percentage of the premutation carrier offspring. Gene conversion (9) and recombination (10) were proposed as the responsible mechanisms. Fourth, homologous recombination involving unequal pairing of sister chromatids leading to the formation of a four-stranded synaptic structure was suggested by Wieringa and co-workers (11) as the mechanism to explain the (CTG·CAG) expansion responsible for gonosomal mosaicism in DM patients. Fifth, Warren (12) interpreted the work of Olsen and co-workers (13) on the polyalanine expansion in synpolydactyly to result from unequal crossing-over in the HOXD13 protein gene and the authors agreed (12). Sixth, Kidd and co-workers (14) studied haplotype analyses of the DM locus on a worldwide basis with emphasis on the implications for the evolution of modern humans and the origin of the DM mutations. Several patterns of haplotype variation and linkage disequilibria were explained on the basis of gene conversion events such as unequal sister chromatid recombination.
For these six cases, it was presumed that the TRS were the sites for the recombination (or gene conversion) events. However, for the majority of the instabilities in TRS involved in hereditary neurological diseases, evidence was not available to specifically identify the molecular mechanism(s) of the processes (i.e. replication, recombination, or repair) (reviewed in Ref. 1). Some workers (15-17) have supported the concept that recombination is not involved in expansion.
During the mapping of the genes for these neurological disorders, substantial linkage analyses were performed utilizing flanking markers, some quite near the repeats. Since the exchange of flanking markers was not generally found (15,16), simple homologous recombination, as a general mechanism for expansion, has not been favored. Recently, experiments in yeast have tried to elucidate the independent roles of the RAD50-55 family of genes, which are responsible for various pathways of recombination. CTG·CAG sequences were shown (18) to be susceptible to strand breaks, and the occurrence of double-strand breaks was length-dependent. Also, yeast rad27 strains showed augmented instability of the TRS; RAD27 encodes a nuclease involved in Okazaki fragment processing. It is known that the majority of errors that accumulate in rad27 strains are processed via single-strand annealing as well as double-strand break repair (types of recombination) (19).
Herein, we demonstrate that gene conversion or unequal crossing-over with or without exchange of flanking sequences is a powerful mechanism for (CTG·CAG) expansion in Escherichia coli.

EXPERIMENTAL PROCEDURES
Plasmids-The plasmids used in these experiments are derivatives of the unidirectionally replicating pUC19 and pACYC184. The cloning and characterization of plasmids containing CTG·CAG, CGG·CCG, and GTC·GAC repeating sequences were described (23-26, 28-30). pRW4100 and pRW4110 were constructed by digesting pRW3481 and pRW3463, respectively, with PvuII and recloning the fragments containing the triplet repeats into the PvuII sites of pACYC184. pRW4115 was constructed by digesting pRW3822 with EcoRI and EcoRV and recloning the fragment containing the triplet repeat into the EcoRI and Bstz17 sites of pACYC184. pRW4105 and pRW4106 were constructed by digesting bacteriophage DNA with HindIII and recloning the 564-bp fragment into the HindIII site of pUC19 or by end-filling to give blunt ends and then recloning into the PvuII sites of pACYC184. All of the CTG·CAG sequences cloned into pUC19 contain no interruptions, but the (CTG·CAG)175 cloned in pACYC184 (pRW3239) contains two G-to-A polymorphisms at the 28th and at the 69th repeat as well as 16 bp of human sequence at the proximal end relative to the interruptions and 43 bp distal to the interruptions. pRW3041 containing (CGG·CCG)81 with two interruptions was constructed as follows: a DNA fragment containing (CGG·CCG)81 was isolated from RN2 (46) by digestion with SmaI and HincII and was then inserted into the PvuII sites of pACYC184. pACT-2 was purchased from CLONTECH Laboratories, Inc.
Standard Genetic Techniques-Plasmid preparation, agarose gel, and polyacrylamide gel electrophoreses were carried out according to standard laboratory protocols (20). Transformations were performed by electroporation (48,49). For the cotransformation experiments, each strain was cotransformed with a mixture of the two appropriate plasmids; the plasmids were originally grown in AB10289 which is the recA⁻ strain. Forty microliters of washed cells of each strain (5 × 10⁷ cells/ml) was initially transformed with 1 μl of the supercoiled DNA (0.5 μg/ml) listed in Fig. 1. For the experiments involving cotransformation, cells were prepared for electrotransformation, and the transformation mixture contained various combinations of the pUC19 and pACYC184 derivatives. The supercoiled plasmid volume was equally divided between the two test plasmids. A voltage of 2000 V was delivered for 4.1 to 5.8 ms. The cuvette size was 0.2 mm.
Cotransformants were selected on LB agar plates containing ampicillin (amp) and tetracycline (tet) since pUC19 and pACYC184, respectively, harbor these drug resistance genes. The cells were allowed to recover in 800 μl of SOC media (20) and kept at 37°C for 1 h or longer. The cells were plated on LB agar that contained ampicillin (75 μg/ml) and tetracycline (12 μg/ml) and grown for 4-16 h at 37°C. Individual colonies were selected for culture and were grown to mid-logarithmic phase (A600 = 0.3-0.9, 4-16 h) at 37°C in LB media containing ampicillin (75 μg/ml) and tetracycline (12 μg/ml) under aerobic conditions. Plasmid purification and gel electrophoresis and analysis were conducted as described (23-26, 28-30). The plasmid products obtained from recA⁺ strains were quantitated by staining the agarose gels with ethidium bromide and photographing them. The amount of DNA in the RB region (defined in Fig. 2) versus the total DNA in the gel lane was determined by quantitating the areas of the negative with a computer densitometer (Molecular Dynamics 400S). The same areas of the gel for the experiments involving recA⁻ and recB⁻C⁻ strains were also quantitated and used as background and thus were subtracted from the analyses involving recA⁺ strains. The plasmid inserts and flanking sequences were characterized by dideoxy sequencing on both strands with Sequenase (version 2.0). The pACYC184 primers, purchased from Genosys Inc., were the following: primer 4244 (ACGGTCTTTAAAAAGGCCG) which 3′-terminates at map position 95; primer 4245 (CGTCAGTAGCTGAACAGGAGGG) which 3′-terminates at map position 522. The pACT-2 primer was purchased from CLONTECH Laboratories, Inc., and was primer GAL4 AD (ACCACTACAATGGATG) which 3′-terminates at map location 5155. Restriction mapping reactions and ligase reactions were conducted as described (25).
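The background-corrected densitometric quantitation described above can be sketched in a few lines. This is only an illustration, assuming that the correction is applied to the RB fraction of each lane; the function name `rb_percent` and all area values are hypothetical and are not data from this study.

```python
def rb_percent(rb_area, lane_area, bg_rb_area, bg_lane_area):
    """Percent of lane DNA in the RB region, background-corrected.

    Assumption: the RB-region fraction measured in a lane from a
    recombination-deficient strain is subtracted from the fraction
    measured in the lane from the recombination-proficient strain.
    """
    if lane_area <= 0 or bg_lane_area <= 0:
        raise ValueError("total lane areas must be positive")
    signal = rb_area / lane_area
    background = bg_rb_area / bg_lane_area
    # Clamp at zero so a noisy background lane cannot give a negative yield.
    return 100.0 * max(signal - background, 0.0)


# Hypothetical densitometer areas (arbitrary units), not measured values.
example = rb_percent(rb_area=600.0, lane_area=1000.0,
                     bg_rb_area=50.0, bg_lane_area=1000.0)
print(f"{example:.1f}% of the lane DNA is in the RB region")
```

Expressing both lanes as fractions before subtracting keeps the correction independent of how much total DNA was loaded in each lane.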
Genetic Analyses-The frequency of the survival of amp-tetʳ colonies due to the presence of the TRS was calculated by counting the number of colonies on LB agar plates containing ampicillin and tetracycline using standard microbiological techniques. pUC19 derivatives and pACYC184 derivatives were transformed into E. coli strains and grown at 37°C. The number of viable cells was determined by growth in the presence of streptomycin (20 μg/ml). The number of mutant cells was determined by growth in the presence of ampicillin and tetracycline. E. coli AB1157 and JC10289 were washed and diluted to a concentration of 5 × 10⁷ cells/ml and cotransformed with combinations of plasmids as indicated: for example, pUC19 + pRW3239; pRW3080 + pACYC184; or pRW3080 + pRW3239.
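The colony-count arithmetic implied by this procedure can be sketched as follows. This is a minimal illustration, assuming that the survival frequency is the ratio of doubly resistant colonies to viable cells and that enhancement is expressed relative to the mean of control plasmid pairs; the function names and all counts are hypothetical.

```python
def survival_frequency(resistant_colonies, viable_cells):
    """Doubly resistant (amp + tet) colonies per viable cell."""
    if viable_cells <= 0:
        raise ValueError("viable cell count must be positive")
    return resistant_colonies / viable_cells


def fold_enhancement(test_frequency, control_frequencies):
    """Test frequency divided by the mean of the control frequencies."""
    if not control_frequencies:
        raise ValueError("at least one control frequency is required")
    mean_control = sum(control_frequencies) / len(control_frequencies)
    if mean_control <= 0:
        raise ValueError("mean control frequency must be positive")
    return test_frequency / mean_control


# Hypothetical plate counts, not data from this study.
test = survival_frequency(200, 1_000_000)
controls = [survival_frequency(8, 1_000_000),
            survival_frequency(12, 1_000_000)]
print(f"fold enhancement: {fold_enhancement(test, controls):.1f}")
```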

RESULTS
Interplasmid Recombination-To evaluate the potential role of homologous recombination in the expansion of TRS, a two-plasmid system was established in E. coli. (The term recombination is used in a general sense and includes gene conversion, unequal crossing-over, and sister chromatid exchange.) For our study, one family of plasmids (Fig. 1) was a derivative of pUC19 (a diminutive form of pBR322) that contains the unidirectional ColE1 origin of replication and harbors the ampicillin resistance gene (20). The other family of plasmids was derived from pACYC184 (21) that harbors the tetracycline resistance gene. A computer search revealed that little or no sequence identity exists between these two vectors; only single copies of identical tracts that were 48, 30, 23, 20, 16, and 12 bp in length were found along with several copies of 9- to 5-bp segments. Thus, the non-identical sequences of these two vectors enabled our focus on the potential effects of cloned tracts of different TRS on recombination. Prior investigations (22) revealed the stable cotransformation of derivatives of these two plasmids.
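A search for identical tracts shared by two sequences, of the kind mentioned above, can be illustrated with a standard longest-common-substring routine. This is a sketch only and is not the program used in this work; it compares one strand of each sequence, and the two short sequences below are invented stand-ins rather than the vector sequences.

```python
def longest_shared_tract(a, b):
    """Return the longest contiguous stretch present in both a and b."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the match that ended at the previous positions.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


# Invented sequences: unrelated flanks around repeat tracts of 5 and 8 units.
seq_a = "GATTACA" + "CTG" * 5 + "TTAGC"
seq_b = "CCATG" + "CTG" * 8 + "AGGTC"
tract = longest_shared_tract(seq_a, seq_b)
print(len(tract), tract)
```

With otherwise unrelated flanks, the only long shared tract is the repeat itself, which mirrors the rationale for using two non-homologous vectors.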
To analyze the recombination behavior of these plasmids (Fig. 1), experiments were conducted in three E. coli strains that are isogenic but differ in their recombination capacity. To test the recombination capacity of CTG·CAG repeat sequences, compared with controls, each strain (AB1157, JC10289, and JC5519) was transformed with the parental control vectors pUC19 and pACYC184 that lack TRS. As a second control, each strain was cotransformed with the vector pACYC184 along with various pUC19 derivatives that contained different lengths and types of TRS. Third, the vector pUC19 was cotransformed along with pRW3239 (which contains (CTG·CAG)175). Finally, the three strains were cotransformed with various pUC19 derivatives that contained different lengths and types of TRS along with pRW3239. The only segments of the nonhomologous plasmids that had identical sequences were the repeating tracts of (CTG·CAG), (GTC·GAC), (GTG·CAC), (GAA·TTC), (CGG·CCG), and a fragment of phage DNA.
A strength of this experimental approach is the capacity to select for cells harboring recombinant DNA products that contain both drug resistances and to assay rigorously these products for expansions by restriction mapping and/or DNA sequencing since they contain unique recognition sites. Since the two vectors have different replication origins and copy numbers, exhaustive control studies, described above, were performed followed by analyses of either the supercoiled or linearized DNA which provided confidence in the products formed by recombination. For the 11 control experiments, described above, the CTG·CAG tracts were unchanged in length and/or deletions occurred, as expected (28-32). Thus, the differences in the origins of replication and the copy numbers per se did not affect the TRS stability.
FIG. 1 (legend, in part). For example, pRW3036 is a pUC19 derivative that contains a pure insert of (CTG·CAG)36. All CTG·CAG inserts cloned into pUC19 are completely homogeneous (i.e. no interruptions). All other plasmids have TRS that contain interruptions as described previously (23-26, 28-30), except for pRW3017 which has a pure stretch of (CGG·CCG)17. The longest stretch of uninterrupted repeating sequences is listed under column n. All TRS are in orientation I (23-26, 28-30). All plasmids have SacI (S) and EcoRI (E) sites in common, but the pUC19 derivatives also contain a PstI (P) site and a NotI site on each side of the polylinker, whereas all pACYC184 plasmids contain a Bstz17 (B) and a SacII (S″) site. The ampicillin resistance gene (Amp) is designated as a checkered box, and the filled box on pACYC184 designates the tetracycline resistance gene (Tet). The origins of replication that are unique for the two different plasmids are shown by vertically and diagonally striped boxes. The shaded box is the polylinker region. pUC19 is a high copy number plasmid (20) with ~500 copies per cell, whereas pACYC184 has a copy number of ~10 per cell. The plasmids are not drawn to scale.

Effects of TRS Sequences on Recombination as Analyzed with Supercoiled DNAs-Fig. 2 shows a typical gel electropherogram of supercoiled plasmids as isolated from E. coli AB1157, which is recA⁺. The plasmids used for transformation and cotransformations are listed above each lane and are depicted on the right. The first five lanes are controls, and the last lane contains a new product and complete loss of the original transforming DNAs.
Considering the singly transformed cells, lane a shows the 1.8-kbp band (monomer supercoiled DNA), whereas the other bands are dimers, trimers, and other multimeric forms of pRW3080. The formation of multimers of plasmids in recA⁺ cells was reported previously (47). Since each plasmid is entirely homologous with itself, intraplasmid recombination in a recA⁺ background is expected. Lane b shows the monomer supercoiled form of pRW3239. Its lower copy number precluded the observation of dimers and higher multimers. Note that the size relationship between these control plasmids is as expected.
Considering the cotransformed cells, lane c shows the DNA isolated from E. coli AB1157 cotransformed with the vectors pUC19 and pACYC184. This lane shows the supercoiled starting materials, as expected; monomer supercoiled pUC19, at 1.7 kbp, migrated slightly faster than the larger pUC19 derivative, pRW3080 (which contains (CTG·CAG)80, shown in lane a). Likewise, the dimer of pUC19, the band at 3.9 kbp, migrated slightly faster than the dimer of pRW3080 (at 4.1 kbp). Since the supercoiled vectors were recovered unchanged in size, no recombination occurred between these two DNAs. pACYC184, although not visible on this gel due to its lower copy number, is present since our media contained tetracycline (as well as ampicillin). Lane d shows the results of the cotransformation of pUC19 and pRW3239; the supercoiled monomer of pUC19 is the band at 1.7 kbp (identical to lane c). The next largest band in lane d is at 3.6 kbp which is the monomer of pRW3239 and comigrates with the molecules found in lane b, as expected. Thus, when one plasmid harbors a TRS, the interaction between the two plasmids is no different than when both vectors, which both lack TRS tracts, are cotransformed into recA⁺ cells. The identity of the other bands found in lane d is interpreted as for the data in lanes a and c. Lane e shows the results of cotransformation of pRW3080 and pACYC184. The 1.8-kbp band comigrates with the DNA in lane a, as expected. The DNAs found above 1.8 kbp comigrate with the DNAs found in lanes a-d, as described above. Hence, even though the recA system was intact, no new products were detected that could be ascribed to recombination. These results are as expected and are identical to the data obtained from the two-plasmid system when conducted in the 11 control studies with isogenic, but recombination-deficient, strains.
Lane f shows the extraordinary result of the formation of large recombinant bands from recA⁺ cells cotransformed with pRW3080 and pRW3239. A loss of the monomer forms of both plasmids and the formation of DNA (~55% of the total DNA) that migrates in the recombinant band (RB) region is revealed. Hence, comparison of the products in lanes c-e with the products found in lane f (the cotransformation with both plasmids containing the TRS) clearly demonstrates the loss of monomer forms of the starting plasmids and formation via recombination of new recombinant products, which contain both plasmids in various ratios. In addition, homodimers and other homomultimers are visible, as expected, and in some cases represent up to ~45% of the total DNA. The same experiment conducted in isogenic recombination-deficient strains failed to yield RB and the loss of the plasmid monomers (see "Requirement for Recombination Genes"). The general features revealed in Fig. 2 have been found in at least 100 other similar determinations.
Not only is the formation of the very long RB products (~18-40 kbp) extraordinary, but the complete loss of the pRW3080 (and probably also the not visible pRW3239) starting material was dramatic. This result was found even for very short culture times (less than 4 h). Hence, the reaction in vivo to form the RB from the appropriate (CTG·CAG)-containing plasmids was very powerful. Furthermore, lane f does not contain a smear of DNA products, suggesting that the formation of a discrete series of plasmids may have been triggered by secondary structures within the triplet repeat sequences themselves, as observed for other instabilities (23,30,50).
Whereas it is not possible to propagate the pUC19 derivative pRW3080 alone, nor the pACYC184 derivative pRW3239 alone, in the presence of both ampicillin and tetracycline (since they do not possess both drug resistances), we found that isolation of the recombinant product (Fig. 2, lane f) and subsequent retransformation into recA⁻ cells (JC10289) was effective, and the RB was stable. This result was expected since the RB (18-40 kbp) contains both drug resistances. Since the same number of colonies (±10%) was found when RB, retransformed into JC10289, was grown on only amp, or only tet, or both amp and tet, we conclude that the transformants are not due to a minority form. Parenthetically, whereas JC10289 contains Tn10:srl, and hence expresses a low level of tetracycline resistance, our studies were performed in the presence of 12 μg/ml tetracycline; JC10289 will not grow at tetracycline concentrations >2 μg/ml.
FIG. 2. Agarose gel electrophoretic analysis of transformation and cotransformation products in the supercoiled form. E. coli AB1157 (which is recA⁺) was transformed with purified supercoiled DNA. The cells were plated and grown until mid-log phase (A600 of 0.5-0.8) in LB media containing ampicillin (75 μg/ml) and/or tetracycline (12 μg/ml), depending on the presence of the drug resistance genes. The DNA was isolated via the alkali lysis method (20) and subsequently electrophoresed through a 1.2% agarose gel in TAE buffer (20). Lane a shows the migration of supercoiled pRW3080 (which contains (CTG·CAG)80). Lane b shows the migration of supercoiled pRW3239 (which contains (CTG·CAG)175). Lane c contains supercoiled DNAs from AB1157 cotransformed with pUC19 and pACYC184. Lane d contains supercoiled DNAs from AB1157 cotransformed with pUC19 and pRW3239 (which contains (CTG·CAG)175). Lane e contains supercoiled DNAs from AB1157 cotransformed with pRW3080 (which contains (CTG·CAG)80) and pACYC184. Finally, lane f shows the gel mobility of products from cells cotransformed with pRW3080 (containing (CTG·CAG)80) and pRW3239 (which contains (CTG·CAG)175). The band at ~12 kbp, found to varying extents between different isolations, is attributed to fragments of bacterial chromosomal DNA. RB designates the bands due to recombination. The circular RB depicted, containing single copies of pRW3080 and pRW3239, is a simple case, whereas multiple forms of RB with different numbers of each plasmid likely exist (Fig. 3B, lower left).
FIG. 3. Analysis of transformation and cotransformation products in the linear form. A, agarose gel electrophoretic analysis of the linearized DNA products. The DNAs described in Fig. 2 were linearized as follows and analyzed on 0.8% agarose gels. Lane a shows pUC19
Effects of TRS Sequences on Recombination as Analyzed with Linearized DNAs-Since analyses of the supercoiled DNA products revealed the apparent recombination between plasmids containing CTG·CAG tracts, the material in the RB region was characterized further. Fig. 3A shows a typical gel electropherogram of linearized plasmid products as isolated from E. coli AB1157 transformed or cotransformed with the designated plasmids. Lanes a and b contain only pUC19 derivatives; lanes c and d contain only pACYC184 derivatives, and lanes e and f contain the mixture of two plasmids. Lane a shows the result of cells transformed with pUC19 and the product digested with PstI. The appearance of a single band at ~2700 bp is as expected. When pRW3080 (which contains (CTG·CAG)80) was grown in AB1157 and digested with PstI, the products migrated at approximately 2950 bp (Fig. 3A, lane b); the products migrated slightly slower than pUC19 (lane a) due to the presence of the TRS insert. Interestingly, some small deletions were observed in the (CTG·CAG)80 tract and appear as a smear below 2950 bp, as expected (23).
Unlike the results found in recA⁻ strains that yielded monomer forms of the plasmid (23-25, 30), the use of recA⁺ strains yielded monomer and multimer forms of plasmids. When multimers of a single type of plasmid were linearized, the result was the presence of only one band, which migrates to its known plasmid length, as expected. Hence, changes in the plasmid sequence would be distinguishable from the original plasmid. However, no such changes were detected when using AB1157 cells. Obviously, both plasmids (pUC19 and pRW3080) were cultured in the presence of ampicillin since the absence of the gene encoding tet resistance would preclude their propagation in the presence of tetracycline.
Considering the pACYC184 derivatives, lanes c and d show the SacII-linearized products of pACYC184 and of pRW3239; distinct bands were observed at 4244 and 4450 bp, respectively. The sharpness of these two bands is due to the fact that larger plasmids stabilize TRS (1). Also, the faintness of the bands reflects the lower copy number of pACYC184. Obviously, these plasmids were grown only in the presence of tetracycline since the absence of the gene encoding ampicillin resistance would preclude the use of both antibiotics.
For the cotransformation experiments, pRW3080 and pRW3239 (both contain (CTG·CAG) sequences) were cultured in recombination-proficient cells in the presence of both ampicillin and tetracycline. After linearization of the recombinant products with PstI (Fig. 3B) (which only cleaves the pUC19 derivative), the products are the 3.1-kbp unit length for pRW3080 cleaved out of the recombinant band (Fig. 3A, lane e).
Due to recombination initiated in the CTG·CAG tracts between the two otherwise non-homologous plasmids, expansion products are detected since this 3.1-kbp DNA is slightly larger than pRW3080 (lane b). pRW3080 cleaved out of RB by PstI is the major product in Fig. 3A, lane e, due to its high copy number, even in RB. In addition, two faint products are observed at 11 kbp, which represents 1 unit of pRW3080 linked to 2 units of pRW3239, and at 19-40 kbp, which represents 1 unit of pRW3080 linked with multiple units of pRW3239 as diagrammed in Fig. 3B. The 7-kbp band representing 1 unit of pRW3080 and 1 unit of pRW3239 is not visible on this gel but was repeatedly observed on other analyses. Interestingly, the supercoiled monomer form of pRW3239 starting material, which cannot be linearized by PstI, became fully integrated into the recombinant bands.
When the DNA from the cotransformation of AB1157 with pRW3080 and pRW3239 was linearized with a restriction enzyme (SacII) that solely cuts the pACYC184 derivative (Fig. 3B), several distinct products that differed in size from those products obtained by PstI linearization were observed (Fig. 3A, lane f). This is consistent with the result of one type of plasmid recombined with one or more plasmids of the second type (Fig. 3B). The DNA at 4.4 kbp is linearized pRW3239 and the product at 10 kbp represents 1 pRW3239 unit linked to 2 pRW3080 units. In addition, broad bands in the range of 19-40 kbp were observed which represent recombinant bands consisting of 1 unit of pRW3239 with multiple units of pRW3080. Furthermore, an interesting facet of the analysis of lane f is the total absence of the high copy number supercoiled monomeric (or nicked/linear) pRW3080 starting material that cannot be linearized by SacII (migrating at 1.8 kbp, Fig. 2, lane a); hence, this DNA became fully integrated into the recombinant bands. We presume that the same is true for pRW3239, but its low copy number precludes this observation. The 4.4-kbp band is assigned as linear pRW3239 because SacII only linearizes pRW3239 and PstI only linearizes the 3.1-kbp pRW3080. SacII will not linearize pRW3080 so the supercoiled form will be present in the various products depicted in Fig. 3B. PstI will not linearize pRW3239 so its supercoiled form will be present in the various products depicted in Fig. 3B.
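The band assignments above follow from simple addition of plasmid unit lengths. The sketch below assumes nominal unit lengths of 3.1 kbp for pRW3080 and 4.4 kbp for pRW3239, taken from the linear bands described in the text; the function name is hypothetical, and the computed sums are nominal values that only approximate the gel-estimated sizes (7, 10, and 11 kbp).

```python
# Nominal linear unit lengths in kbp, as read from the gel analyses above.
UNIT_KBP = {"pRW3080": 3.1, "pRW3239": 4.4}


def linear_product_kbp(n_prw3080, n_prw3239):
    """Nominal size of a linear product built from whole plasmid units."""
    if n_prw3080 < 0 or n_prw3239 < 0:
        raise ValueError("unit counts must be non-negative")
    size = n_prw3080 * UNIT_KBP["pRW3080"] + n_prw3239 * UNIT_KBP["pRW3239"]
    return round(size, 1)


# (pRW3080 units, pRW3239 units) for the products discussed in the text.
for units in [(1, 1), (1, 2), (2, 1)]:
    print(units, linear_product_kbp(*units), "kbp")
```

Larger products in the 19-40 kbp range correspond to higher unit counts under the same additive assumption.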
Hence, this analysis also reveals the robust formation of RB from the appropriate (CTG·CAG)-containing plasmids. The general features revealed in Fig. 3A were found in at least 100 other similar analyses. Whereas recombination may occur at any time during the growth of the cultures, "jackpots" have not been observed in our studies.
RB Is Not a Catenane-The possibility was considered that the recombinant band, which is presumably circular, in Fig. 2, lane f, might exist as catenated DNA (interlinked plasmids) rather than a recombined product, with the catenated DNA migrating slower than either of the substrate plasmid monomers. To address this question, studies were conducted on the DNA products using single site restriction enzymes. The products in Fig. 3A, lanes e and f, showed that the DNA did not exist as catenated products. Digestion with unique restriction enzymes resulted in the absence of circular monomer plasmids and in the appearance of 7 kbp and larger products that represented one digested plasmid connected to one or more undigested plasmids. If the DNA had been catenated, then use of the unique restriction enzymes (PstI and SacII as well as three other enzymes) (Fig. 3B) would have released the monomer supercoiled pRW3080 that would have migrated as a 1.8-kbp product such as that found in Fig. 2, lane a. Hence, our data show that the TRS are involved in genetic recombination events rather than in the formation of catenanes.
FIG. 3 (legend, continued). The lower left structures depict two possible recombined plasmids, but a large number of multimers containing different numbers of the two plasmid units in different orientations can also exist. No restriction was placed on the amount of recombination or the extent of multimer formation that can occur. If the recombined plasmid population is linearized at a restriction site (i.e. PstI) unique for the pUC19 derivatives, linear forms will be generated where the two ends of one pUC19 derivative flank one or more copies of pRW3239. Alternatively, if the RB is linearized with SacII which is unique for pRW3239, one or more copies of the pUC19 derivative will be flanked by portions of a single copy of pRW3239.
Survival of Amp-Tetʳ Colonies-The frequency of occurrence of amp-tetʳ colonies was determined genetically when E. coli AB1157 was cotransformed with plasmids that harbored various TRS as well as non-repetitive DNA. More than 41 prior studies revealed a direct correlation between the number of amp-tet-resistant colonies observed in a cotransformation experiment, the biochemical presence of RB in agarose gel electrophoretic determinations, and the 1-5-fold expansions of the CTG·CAG sequences. Cells that were cotransformed with pRW3080 (which contains (CTG·CAG)80) and with pRW3239 (which contains (CTG·CAG)175) had an ~90-fold enhanced survival compared with the average of the other pairs of plasmids. Alternatively, for the control studies, we found only background levels of RB, no expansion of the TRS, and low numbers of colonies. The types of sequences investigated were as follows: the DM (CTG·CAG) repeat; the sequence isomer (GTC·GAC) (24); a TRS that is not known to be associated with a disease (GTG·CAC) (24); the Friedreich's ataxia sequence (GAA·TTC) (24,51); the fragile X sequence (CGG·CCG) (25); a 564-bp HindIII bacteriophage fragment; as well as two nonhomologous sequence mixtures.
Interestingly, as an infrequent event, we did observe the recombination of the fragile X sequences as measured by a loss of monomer plasmid and the formation of RB. However, the lengths of CGG·CCG tracts cloned into the plasmids were shorter than those of CTG·CAG, and this may contribute to the lower frequencies. Unfortunately, (CGG·CCG) tracts are extremely unstable in E. coli (25), rendering their study in our two-plasmid recombination system less clear to interpret than the results with the (CTG·CAG) sequences as well as the other six control sequences. Interestingly, prior replication-based studies (29) revealed the facile expansion of CTG·CAG compared with the nine other TRS. In summary, the robust recombination observed in this two-plasmid system is dependent on the presence of CTG·CAG in both DNAs.
Effect of TRS Sequence-Biochemical studies were also conducted to test the requirement for identical TRS sequences for homologous recombination events. Table I shows the results of replacing the (CTG·CAG) 80 tract in pRW3080 with various lengths of the sequence isomer (GTC·GAC) (24) or (CGG·CCG) (25,26) in cotransformation experiments with pRW3239. The amount of DNA observed in the recombinant band region was at background level in all of these cases. Hence, the TRS tracts in the recombining plasmids must contain the same sequences. The combination of (CTG·CAG) with the sequence isomer (GTC·GAC) could, in principle, form parallel-stranded structures (27), but this is ineffective in this system. Also, in principle, CTG·CAG and CGG·CCG could form paired structures with 2/3 Watson-Crick pairs and 1/3 incorrect pairs; this is also ineffective in recombination. Thus, 33% of AC and GT oppositions create a non-recombinogenic pair of plasmids.
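The 2/3 figure can be checked by aligning two repeat strands in antiparallel orientation and counting Watson-Crick pairs. The following is a minimal sketch, not part of the reported analysis: the function name is hypothetical, the tracts are assumed to be perfect uninterrupted repeats, and the register between the two strands is assumed to be the one that maximizes Watson-Crick pairing.

```python
from fractions import Fraction

WATSON_CRICK = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}


def best_antiparallel_pairing(unit_a, unit_b, repeats=30):
    """Pair (unit_a)n with (unit_b)n antiparallel; return the best register.

    Both units are written 5'->3'. Returns (fraction of Watson-Crick pairs,
    set of mismatched oppositions) for the register with the most pairs.
    """
    strand_a = unit_a * repeats
    best = None
    for shift in range(len(unit_b)):
        # Rotating the repeat unit is equivalent to sliding the register.
        rotated = unit_b[shift:] + unit_b[:shift]
        strand_b_3to5 = (rotated * repeats)[::-1]
        pairs = list(zip(strand_a, strand_b_3to5))
        matched = sum(pair in WATSON_CRICK for pair in pairs)
        mismatches = {"".join(sorted(pair)) for pair in pairs
                      if pair not in WATSON_CRICK}
        result = (Fraction(matched, len(pairs)), mismatches)
        if best is None or result[0] > best[0]:
            best = result
    return best


# Homologous control: the two strands of CTG·CAG pair perfectly.
print(best_antiparallel_pairing("CTG", "CAG"))
# CTG strand opposite the CGG strand: G·T oppositions at every third position.
print(best_antiparallel_pairing("CTG", "CGG"))
# CAG strand opposite the CCG strand: A·C oppositions at every third position.
print(best_antiparallel_pairing("CAG", "CCG"))
```

Under these assumptions, the mixed CTG·CAG/CGG·CCG duplexes reach two-thirds Watson-Crick pairing, with the remaining third made up of G·T and A·C oppositions.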
Furthermore, similar investigations were conducted with pUC19 and pACYC184 derivatives that both contain tracts of the fragile X sequence CGG·CCG. When pRW3311 (which contains (CGG·CCG) 81) was cotransformed with pRW3041 (a pACYC184 derivative that contains (CGG·CCG) 81), less RB was formed than with CTG·CAG. Further experiments are in progress to determine whether this is an effect of the length of the TRS, the lower capacity of CGG·CCG, compared with CTG·CAG, to form looped hairpin structures (25,45), or other factors.
Effect of Length of TRS-The effect of TRS length on the formation of recombinant products was determined by holding (CTG·CAG) 175 in pRW3239 constant while varying the length of the same sequence in the pUC19 derivative. Fig. 2, lane f, presents typical data showing that up to 63% of the DNA was found in the RB region of an agarose gel when pRW3080 and pRW3239 were cotransformed into recombination-proficient cells. The data collected using other lengths of (CTG·CAG) in the pUC19 vector are summarized in Fig. 4. pRW3080 was replaced with pUC19 derivatives that contain different lengths of (CTG·CAG), and these were cotransformed with pRW3239. The data revealed a sigmoidal curve with a threshold for the appearance of RB at 30 repeats. When the repeat lengths were 30 or shorter, monomer supercoiled DNA was recovered. However, when the pUC19 plasmids containing 36, 47, 80, or 100 repeats were cotransformed with pRW3239, no monomer supercoiled DNA was found; rather, the majority of the DNA was in the RB region (Fig. 2, lane f).
Hence, these results (Fig. 4) correlate strongly with the phenomenon observed in myotonic dystrophy patients, in whom no symptoms are observed below the break point of 30 repeats, whereas progressively more severe symptoms occur above the threshold (1, 2). By analogy in E. coli, repeat lengths of 30 and less were stable and transmitted from one cell division to the next with high fidelity, whereas lengths of 36 repeats and more were unstably transmitted (expanded up to 5-fold their original lengths (see below)). This threshold is similar to the length observed in humans for the transmission of expanded alleles (11).
Requirement for Recombination Genes-All prior investigations from this laboratory (23-25, 28-30) on genetic instabilities of TRS were in recA− E. coli, usually strain HB101. The work reported herein is the first description of genetic instabilities in recA+ E. coli. Table II shows that little or no recombination was observed between pRW3080 and pRW3239 in recA− or in recB−C− strains. No RB was observed; only the starting plasmids were detected on the gels. Hence, the presence of these gene products (33) is required for the formation of the recombinant molecules (Fig. 2, lane f). Also, genetic studies, in part described above, agree with this conclusion.

TABLE I
The percentage of the DNA in the RB region (Fig. 2) as compared with the total amount of DNA on the agarose gel electrophoretic analyses was determined for the plasmids (Fig. 1) with different types and lengths of TRS inserts. E. coli AB1157 was cotransformed with pRW3239 (containing (CTG·CAG) 175) and with one of the following plasmids: pRW3080 (which contains (CTG·CAG) 80); pUC19, which does not contain any TRS; pRW3415 (containing (GTC·GAC) 34); pRW3462 (containing (GTC·GAC) 47); pRW3463 (containing (GTC·GAC) 54); pRW3017 (containing (CGG·CCG) 17); pRW3687 (containing (CGG·CCG) 60); and pRW3311 (containing (CGG·CCG) 81). n is the longest length of uninterrupted TRS. The isolated DNA was electrophoresed through 1.1% agarose gels in TAE buffer. The gels were photographed, and the negatives were scanned to quantitate the DNA. The standard deviation for (CTG·CAG) 80 …

Analyses of Recombinant Products-DNA sequence analyses were performed on the recombinant products formed by the cotransformation of pRW3036 (which contains (CTG·CAG) 36) and pRW3239 (which contains (CTG·CAG) 175) in recombination-proficient cells (AB1157) using the pUC19 primers 1211 and 1233. The transformations were repeated numerous times on separate days, and individual colonies were picked and analyzed.
DNA sequencing and restriction mapping were conducted on these individual clones. Table III shows the analyses on 19 colonies; the extents of expansions were from an approximate doubling to 5-fold. Parallel experiments were also conducted with pRW3047 (which contains 47 CTG·CAG repeats) and pRW3080 (which contains 80 CTG·CAG repeats). Again, substantial expansions were observed in the 15 individual isolates, and the expansions ranged from a doubling to 2.75-fold. The length of the expansions was as great as 140 repeats or 420 bp. In all cases, the expansions occurred in the CTG·CAG regions without introduction of interruptions. Resolution on the sequencing gels prohibited the counting of distinct repeats beyond 80 due to the lack of markers in the uninterrupted repeat tracts.
As a control, when recA− cells were cotransformed, no expansion was observed by sequence analysis. Therefore, the presence of RecA directly affects the genetic stability of (CTG·CAG). As a further control, if pRW3036 was propagated in AB1157, without pRW3239 and only in the presence of ampicillin (Table III), no expansion of the TRS was found by DNA sequence analysis, as expected. In this case, homodimers, homotrimers, etc., are formed (as for pUC19 and pRW3080 in Fig. 2) due to recombination of pRW3036 with itself, but the TRS was not expanded. Similar results (no formation of RB and no expansions of the TRS) were found by restriction mapping for pRW3080 alone and for pRW2180 by sequence analysis. It is well known (23-25, 28-30) that short repeats are transmitted accurately and do not readily undergo alterations in repeat length. In all sequence determinations, the vector sequences (pUC19 and pACYC184) were unchanged.
In summary, numerous DNA sequence analyses and restriction mapping studies demonstrated the presence of expansions of 1-5-fold in length.
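The fold values and base pair lengths quoted for individual isolates follow directly from the repeat counts. As a minimal sketch (the function and variable names are hypothetical, and only the product lengths listed in Table III for pRW3047 and pRW3080 are used), the arithmetic can be reproduced as follows:

```python
BP_PER_REPEAT = 3  # one triplet repeat unit spans 3 bp

# Starting length and product lengths (in repeats) of individual isolates,
# as listed in Table III for two of the pUC19 derivatives.
TABLE_III = {
    "pRW3047": (47, [117, 79, 67, 67, 56]),
    "pRW3080": (80, [220, 150, 150, 150, 120, 120, 100, 100, 100, 100]),
}


def summarize_expansion(start, product):
    """Return the gain in repeats, the gain in bp, and the fold change."""
    gained = product - start
    return {
        "gained_repeats": gained,
        "gained_bp": gained * BP_PER_REPEAT,
        "fold": round(product / start, 2),
    }


for plasmid, (start, products) in TABLE_III.items():
    for product in products:
        print(plasmid, start, product, summarize_expansion(start, product))
```

For the largest pRW3080 product (220 repeats), this gives a gain of 140 repeats, or 420 bp, and a 2.75-fold change, matching the values stated in the text.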
FIG. 4. The percentage of the DNA in the RB region (Fig. 2) is plotted in relation to the increasing lengths of (CTG·CAG) n inserts in pUC19 derivatives. E. coli AB1157 was cotransformed with pRW3239 (which contains (CTG·CAG) 175) and with one of the following plasmids: pUC19 (which contains (CTG·CAG) 0); pRW2163 (which contains (CTG·CAG) 13); pRW2180 (which contains (CTG·CAG) 30); pRW3036 (which contains (CTG·CAG) 36); pRW3047 (which contains (CTG·CAG) 47); pRW3080 (which contains (CTG·CAG) 80); and pRW3216 (which contains (CTG·CAG) 100). The isolated DNA was electrophoresed through a 1.1% agarose gel in TAE buffer. Each point and its respective standard deviation represents the average of six experiments.

TABLE II
Effect of recA and recBC on recombination
The percentage of the DNA in the RB region as compared with the total amount of DNA on the agarose gel electrophoretic analyses was determined for different lengths of CTG·CAG inserts in pUC19 for the three strains of E. coli. The cotransformation experiments, as performed in the wild type strain (AB1157), were repeated in the recombination-deficient strains JC10289 (recA−) and JC5519 (recB−C−). The E. coli strains were cotransformed with pRW3239 (containing (CTG·CAG) 175) and with one of the following plasmids: pUC19 (containing (CTG·CAG) 0); pRW2163 (containing (CTG·CAG) 13); pRW2180 (containing (CTG·CAG) 30); pRW3036 (containing (CTG·CAG) 36); pRW3047 (containing (CTG·CAG) 47); pRW3080 (containing (CTG·CAG) 80); or pRW3216 (containing (CTG·CAG) 100). The DNA was isolated and electrophoresed through 1.1% agarose gels in TAE buffer. The gels were photographed, and the negatives were scanned to quantitate the amounts of DNA. The standard deviation for E. coli recA+ was ±17% and was ±2% for the other 14 experiments.

Fate of Point Mutations after Recombination Events-pRW3239 contains two point mutations that serve as useful molecular markers to follow the fate of individual TRS through the recombination events. If these interruptions in the TRS are directly involved in the exchange (or used as repair templates (Fig. 5)) of CTG·CAG sequences, they might move from one plasmid location to the other. pRW3080 (which contains an uninterrupted repeating sequence of (CTG·CAG) 80) was cotransformed with pRW3239 in E. coli AB1157, and the recombinant product was linearized with PstI and gel-purified to give the 7-kbp linearized band (not visible in Fig. 3A), which was ligated and retransformed into recA+ cells (AB1157). Individual colonies were selected, and the recombinant DNA was purified and sequenced using a pUC19 primer (1211) (Fig. 5).
Note that the CTG·CAG sequence in pRW3239 will not be analyzed when the pUC19 primers are used. DNA sequence analyses revealed the presence of one of the G-to-A interruptions (which occur at positions 28 and 69 of the (CTG·CAG) 175 tract of pRW3239) in the TRS flanking the pUC19 derivative. Interestingly, the G-to-A interruption from the pACYC184-derived sequence was observed at approximately repeat number 127. Thus, this result could have occurred only by recombination between the CTG·CAG originally in pRW3239 and the pUC19-derived pRW3080 (data not shown). A second case of transfer of a point mutation was found by sequencing the product from a cotransformation of pRW3216 (containing (CTG·CAG) 100) and pRW3239 using the pACYC184 primer (4245). This primer hybridizes to the end distal to the original location of the two G-to-A interruptions. This product revealed a G-to-A interruption at position 26 at the distal end. Since sequence analyses of pRW3216 done prior to its use in the two-plasmid recombination system showed the absence of G-to-A mutations, this "new" interruption must be the result of two or more crossover events (gene conversion). Hence, this clone must be derived from a multimer of the recombinant structure (Fig. 3B, lower left) that contains several tandem copies of the pUC19 plasmids. These two cases of the exchange of point mutations are rare.
Interestingly, these results directly demonstrate recombination-based expansion without the exchange of flanking sequences. Thus, the expansion and the presence of an interruption in the TRS flanking the pUC19 vector sequence are the result of gene conversion or crossing over (Fig. 5).
Inversions within CTG·CAG-In addition to the length polymorphisms and the exchange of point mutations, a plasmid (pRW4444) was isolated that had a switch in the type of repeating sequence in the two complementary strands. This unique deletion product of a (CTG·CAG) 100 sequence must have undergone a recombination event to yield [(CTG) 13 (CAG) 67]·[(CTG) 67 (CAG) 13] as the repeat sequence in the isolated product. SURE cells were transformed with pRW4404 (which contains (CTG·CAG) 100 cloned into the SmaI-EcoRI sites of pACT2). DNA sequence analyses of the isolated product plasmid (pRW4444) revealed [(CTG) 13 (CAG) 67]·[(CTG) 67 (CAG) 13] as the repeat sequence, and restriction mapping confirmed the presence of a PstI site (CTGCAG) at the center of the sequence inversion. Hence, this unusual result directly demonstrates the occurrence of an inversion event, that is, a change of orientation of the sequence relative to outside markers. The inversion is likely due to the formation of a slipped-strand structure (52) with staggered single-stranded loops that rehybridized to form intrahelical pseudoknots: theta-shaped, figure-eight, and bow-shaped structures (Fig. 6) (34). These unorthodox conformations may exist in vivo due to the facile slippage of the CTG and CAG complementary strands relative to each other. A type of "illegitimate" recombination event must have occurred across the four-stranded intersection of one of these rehybridized structures to generate the [(CTG) 13 (CAG) 67]·[(CTG) 67 (CAG) 13] product (Fig. 6, bottom).
A second example of this type of recombination event that elicited a switch in the repeating sequence of the two complementary strands was found in another colony isolated from the recultivation of pRW4404 in SURE cells (as described above). The DNA sequence of the isolated products (pRW4445) revealed [(CTG) 15 15 ] as the undeleted but rearranged TRS insert. We presume that this product was formed by the mechanisms described above (Fig. 6) for pRW4444.
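Two features of the pRW4444 insert can be verified from the sequence alone: the strand complementary to (CTG)13(CAG)67 reads (CTG)67(CAG)13, and the CTG-to-CAG junction creates a single PstI recognition site (CTGCAG) that is absent from the parental (CTG)100 tract. The sketch below is illustrative only; the helper name is hypothetical, and flanking vector sequence is ignored.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
PSTI_SITE = "CTGCAG"


def reverse_complement(seq):
    """Return the complementary strand, written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


parent_top = "CTG" * 100                  # (CTG)100 tract in pRW4404
inverted_top = "CTG" * 13 + "CAG" * 67    # top strand of the pRW4444 insert
inverted_bottom = reverse_complement(inverted_top)

# The bottom strand of the product reads (CTG)67(CAG)13.
print(inverted_bottom == "CTG" * 67 + "CAG" * 13)

# The inversion junction creates exactly one PstI site; the parent has none.
print(parent_top.count(PSTI_SITE), inverted_top.count(PSTI_SITE))

# The product is 80 repeats (240 bp), a deletion relative to the 300-bp parent.
print(len(parent_top), len(inverted_top))
```

The PstI site spans the last CTG unit and the first CAG unit, which is why restriction mapping can locate the point of inversion.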
In summary, these DNA sequence analyses provide direct biochemical evidence for recombination between the TRS, confirming the genetic observations.

TABLE III
The plasmids (Fig. 1) were cotransformed into recombination-proficient E. coli, and DNA was isolated from individual colonies grown as described under "Experimental Procedures" and the legends to Figs. 2 and 3. The lengths of the repeat expansions in the pUC19 derivatives were determined by DNA sequencing for (CTG·CAG) products up to ~80 repeats; the error for these determinations is ±2%. Due to the lack of resolution for longer (CTG·CAG) products, restriction mapping (usually SacI and PstI) was employed; the error in lengths is ±4%. Each line in the table represents a (CTG·CAG) length determination on the recombinant DNA product from an individual clone from a cotransformation experiment. The last six lines show the results from control studies where the pUC19 derivatives (pRW2180, -3036, and -3080) were transformed in the absence of the pACYC184 derivative pRW3239. As expected (23, 28-30), no expansion was observed.

Plasmid    Original length   Length after recombination   Expansion
           (repeats)         (repeats)                    (repeats)
pRW3036    36                175                          139
                             85                           49
                             85                           49
                             85                           49
                             85                           49
                             85                           49
                             85                           49
                             85                           49
                             85                           49
                             85                           49
                             85                           49
                             80                           44
                             80                           44
                             80                           44
                             80                           44
                             76                           40
                             76                           40
                             59                           23
                             59                           23
pRW3047    47                117                          70
                             79                           32
                             67                           20
                             67                           20
                             56                           9
pRW3080    80                220                          140
                             150                          70
                             150                          70
                             150                          70
                             120                          40
                             120                          40
                             100                          20
                             100                          20
                             100                          20
                             100                          20
Controls (pUC19 derivative alone)
pRW2180    30                30                           0
pRW3036    36                36                           0
           36                36                           0
           36                36                           0
pRW3080    80                80                           0

DISCUSSION
This report describes the direct demonstration that recombination mediates expansion and contraction of CTG·CAG repeats. The expansion events are dependent on the presence of long CTG·CAG sequences in the two-plasmid recombination system and require recombination-proficient cells to give frequent, severalfold expansions. Recombination was proven genetically and biochemically by the following: (a) the presence of … (23, 28-32). For recombination-proficient cells, the ratio of expansions to deletions is as high as 100:1.
This conclusion is derived from more than 50 cotransformation experiments of pRW3239 with any one of four pUC derivatives where the lengths of the PstI-SacI fragments containing the TRS were analyzed for expansions and deletions by polyacrylamide gel electrophoresis. When a plasmid (pRW3036) containing (CTG·CAG) 36 was propagated alone in recA+ cells, multimers were observed as expected (47,53), but the TRS length was unaltered and no RB was observed. However, when this plasmid was cotransformed with a pACYC184 derivative containing (CTG·CAG) 175, RB was observed by gel electrophoresis and substantial expansion of the 36-repeat tract was found in the amp-tet r colonies, indicating the involvement of recombination in expansion. Similar results were found when pRW3080 or pRW3047 replaced pRW3036. The reason why no recombination-based expansions and no RB were observed when the pUC19 derivatives containing (CTG·CAG) sequences were propagated alone is uncertain but may be due to some property of a second non-homologous plasmid for initiating recombination in the presence of both tet and amp. Neither vector contains a site.
The plasmid copy number in the single and dual transformations was analyzed. The ratio of copy numbers of the singly transformed DNAs was ~50 to 1 for pRW3080 and pRW3239, respectively. However, when the two DNAs were cotransformed and selected for both tet and amp resistance, RB was found as expected, and the ratio of pRW3080 to pRW3239 in RB was ~20 to 1. These data are the average of five experiments on the EcoRI-linearized DNAs after agarose gel electrophoretic fractionation (both DNAs contain a single EcoRI site). Hence, the replication of recombinant multimers in RB (Fig. 3B, lower left) may increase the copy number of the pRW3239 component due to the dominance of the ColE1 origin in pRW3080. It is noteworthy that cotransformation of pRW3080 (or pRW3036 or pRW3047) with pRW3239 into recombination-deficient cells (JC10289 or JC5519) results in no formation of RB and no expansion of the CTG·CAG tracts in the pUC derivatives as revealed by restriction mapping.
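The shift in copy number ratio can be expressed as a relative change in the abundance of pRW3239. The following is a minimal sketch that treats the approximate ratios quoted above as exact values, which is an assumption made only for illustration; the variable names are hypothetical.

```python
from fractions import Fraction

# Abundance of pRW3239 relative to pRW3080, from the approximate ratios above.
separate = Fraction(1, 50)   # singly transformed DNAs: ~50 to 1
in_rb = Fraction(1, 20)      # cotransformed, in the recombinant band: ~20 to 1

relative_increase = in_rb / separate
print(float(relative_increase))
```

Under this reading, the pRW3239 component is about 2.5-fold more abundant relative to pRW3080 in RB than when the plasmids are propagated separately.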
FIG. 5. The dots in the insert in pACYC184 represent the G-to-A mutations at positions 28 and 69. The heavy solid lines represent the pUC19 vector, and the thinner cross-hatched vector is pACYC184. Two possible mechanisms are proposed. Homologous recombination between the two TRS on the left side gives an exchange of the G-to-A interruption from one plasmid to the other sequence with exchange of flanking sequences. An alternative mechanism (right side) shows that a double-strand break occurs within the CTG·CAG tract (18,19,35,54,55) and exchange at the broken ends forms two Holliday-like junctions separated by the distance "k". DNA repair synthesis (dashed lines) restores the sequence with (CTG·CAG). Synthesis occurs on both strands, resulting in the expansion of pRW3080. Depending on the extent of branch migration, different size expansions will be formed. Resolution of the junctions leaves the flanking sequences unaltered. The structures are processed as in unequal crossing-over between sister strands. Misalignments between the strands as well as other intermediates involving single-stranded loops that are displaced, melted, and/or slip-paired could lead to expansion and explain the formation of new products. Adapted from Ref. 42.

The extent of expansions observed in this system was severalfold the original lengths. The point of recombination initiation lay within the CTG·CAG repeat sequences since no recombination occurred in other regions of the plasmids as observed by restriction mapping. DNA sequencing revealed that when changes in the size of the plasmids did occur, the changes were within the TRS, not in the vectors. Although alterations in repeat length might have occurred by replication-based events including repair, all such events were controlled because the strains were equally competent to carry out replication-based instability reactions (23, 28-30).
Hence, this two-plasmid recombination system may be considered as a primitive model of two eucaryotic chromosomes that harbor various alleles of CTG·CAG.
The expansion of CTG·CAG by recombination without the exchange of flanking markers can be explained by the double-strand break repair model (Fig. 5). The simple repeating nature of the CTG·CAG sequence in the presence of RecA may cause the pairing of sites for the alignment of the double-strand break gap into a homologous template. The ends of the break may be displaced, followed by strand invasion and subsequent DNA synthesis (separated by length k) to extend the chains, followed by resolution to generate an expanded CTG·CAG. This model directly explains our results found for the recombinant product of the cotransformation of pRW3239 (containing two G-to-A interruptions) and pRW3080 (containing an uninterrupted tract of (CTG·CAG) 80); the product had a G-to-A interruption transferred to the CTG·CAG tract in pRW3080. Thus, the template for DNA repair or a portion of the exchanged DNA tract contained the G-to-A interruption, and the expansion occurred without the exchange of flanking sequences (gene conversion).
The DNA replication fork stalls when it encounters a CTG·CAG sequence (3,4), which can result in double-stranded breaks (18,19,35,54,55). These gaps provide binding sites for RecA, which processes the DNA ends via recombinational repair. Premature termination and replication fork collapse require recombinational repair to continue. Thus, it is likely that a complex interrelationship exists between replication and recombination functions in vivo. Considering the data described herein, examples were observed of expansions both with and without exchange of flanking sequences. Hence, both simple homologous recombination and gene conversion (Fig. 5) were observed in E. coli as found in patients (see Introduction). CTG·CAG tracts longer than 30 units are effective sites for homologous recombination since the recovery of both drug resistance markers occurred at only ~1% of the frequency in the absence of the TRS. The length of the CTG·CAG tract of approximately 90 bp as a minimum for efficient recombination is reminiscent of the threshold observed for expansion from the normal to the premutation stage (1, 2) in DM. Also, this length is in good agreement with prior determinations on the required extent of sequence identity (50-75 bp) for homologous recombination in E. coli (36,37), especially considering that substantially different types of sequences were studied in rather disparate systems. Longer CTG·CAG tracts that are flexible and writhed (26) have a propensity to slip (38,52), which may initiate recombination. Prior work (39-43) revealed the recombinogenic properties of simple, direct repeat sequences.
FIG. 6. Cartoon of possible mechanism to form insert in pRW4444 by TRS loop rehybridization. The box at the top of the figure represents a 300-bp sequence of (CTG·CAG) 100 within pRW4404. The CTG·CAG tract adopts slipped structures by misalignment of the complementary strands by slippage (38,45,52). The staggered, single-stranded loops may rehybridize and, depending on the alignment of the loops, generate three forms of intrahelical pseudoknots: theta shape, figure eight, and bow-shaped structures. A type of illegitimate recombination (redrawn from Ref. 34) within these structures can result in the formation of the inversion found in pRW4444 (bottom).

These studies also revealed the requirement for a perfect homologous CTG·CAG sequence in an antiparallel arrangement on both vectors. Attempts to achieve recombination between CTG·CAG and GTC·GAC failed. Hence, parallel DNA (27) is not a substrate for this recombination system. Likewise, correct Watson-Crick pairing to the extent of 66% is ineffective since recombination between CTG·CAG and CGG·CCG failed also. Furthermore, the expansions of the CTG·CAG repeats in the two-plasmid system were independent of the orientations of the TRS in the vectors.²

TRS instability by gene conversion (unequal crossing-over) is a robust process, and thus, this mechanism along with DNA replication may contribute to the length polymorphisms observed in human diseases. CTG·CAG seems to have special properties for recombination. Whereas the reason for this behavior is uncertain, prior investigations of nucleosome assembly (44), expansion by replication (29), conformational flexibility and writhing (26,38), capacity for adopting hairpin loops (23, 28-32, 52), and susceptibility for double-strand breaks in vivo (18) revealed its unorthodox character. A prior review has summarized the molecular similarities between studies in humans and E. coli related to hereditary neurological diseases (45).
We have no evidence on the extent to which, if at all, recombination is responsible for TRS expansions in humans. However, E. coli has been a useful model to investigate the molecular processes responsible for other events related to these instabilities (45,56).
Since CTG·CAG repeats are recombinogenic and since genetic recombination utilizes enzyme systems different from replication, new avenues for therapeutic intervention strategies in human hereditary neurological diseases may be developed using somatic cell gene therapy.