J Biol Chem, Vol. 274, Issue 33, 23468-23479, August 13, 1999


Genetic Instabilities in (CTG·CAG) Repeats Occur by Recombination*

John P. Jakupciak‡ and Robert D. Wells§

From the Institute of Biosciences and Technology, Center for Genome Research, Texas A&M University, Texas Medical Center, Houston, Texas 77030

    ABSTRACT

The expansion of triplet repeat sequences (TRS) associated with hereditary neurological diseases is believed from prior studies to be due to DNA replication. This report demonstrates that the expansion of (CTG·CAG)n in vivo also occurs by homologous recombination as shown by biochemical and genetic studies. A two-plasmid recombination system was established in Escherichia coli with derivatives of pUC19 (harboring the ampicillin resistance gene) and pACYC184 (harboring the tetracycline resistance gene). The derivatives contained various triplet repeat inserts ((CTG·CAG), (CGG·CCG), (GAA·TTC), (GTC·GAC), and (GTG·CAC)) of different lengths, orientations, and extents of interruptions and a control non-repetitive sequence. The availability of the two drug resistance genes and of several unique restriction sites on the plasmids enabled rigorous genetic and biochemical analyses. The requirements for recombination at the TRS include repeat lengths >30, the presence of CTG·CAG on both plasmids, and recA and recBC. Sequence analyses on a number of DNA products isolated from individual colonies directly demonstrated the crossing-over and expansion of the homologous CTG·CAG regions. Furthermore, inversion products of the type [(CTG)13(CAG)67]·[(CTG)67(CAG)13] were isolated as the apparent result of "illegitimate" recombination events on intrahelical pseudoknots. This work establishes the relationships between CTG·CAG sequences, multiple fold expansions, genetic recombination, formation of new recombinant DNA products, and the presence of both drug resistance genes. Thus, if these reactions occur in humans, unequal crossing-over or gene conversion may also contribute to the expansions responsible for anticipation associated with several hereditary neurological syndromes.

    INTRODUCTION

Genetic instabilities (expansions and deletions) of triplet repeat sequences (TRS) ((CTG·CAG), (CGG·CCG), or (GAA·TTC)) are a hallmark of certain hereditary neurological diseases (1, 2). Numerous workers in human genetics have proposed DNA replication, gene conversion, recombination, and related processes as the mechanism(s) responsible for these alterations in repeat sequence lengths. Subsequent in vivo studies in genetically tractable systems (1, 3) and in vitro investigations (4) have demonstrated expansions and deletions during DNA replication, probably by slipped strand misalignment due to preferential hairpin formation of TRS. Similar molecular studies in vivo on gene conversion and recombination are lacking.

Several human genetic studies on patient materials reported haplotype analyses, especially related to myotonic dystrophy (DM) and the fragile X syndrome, which implicated gene conversion and/or unequal crossing-over (types of recombination) to genetic instabilities. In the first report, Korneluk and co-workers (5, 6) proposed unequal crossing-over (5) and gene conversion (6) as the mechanisms responsible for the expansions and deletions observed in the (CTG·CAG) mutation during DM transmission. This conclusion was derived from haplotype analyses of six polymorphic markers in the DM region. The TRS was presumed to be the site of the discontinuous gene conversion events.

Second, Nelson and collaborators (7) investigated the loss of mutation at the FMR1 locus through multiple exchanges between maternal X chromosomes. They described a daughter of a female carrier who inherited the fragile X premutation chromosome based on haplotype analyses using flanking markers. The (CGG·CCG) repeat sequence and the intragenic polymorphic marker FMRb showed the normal maternal alleles, whereas two other intragenic markers showed the risk haplotype. Since the other intragenic markers are located between the markers (CGG·CCG) and FMRb, this results in patches of normal and fragile X sequence in the FMR1 gene of the daughter and was explained on the basis of gene conversion. Likewise, prenatal diagnosis of the fragile X syndrome showed a loss of mutation owing to a double recombinant or gene conversion at the FMR1 locus (8).

Third, Brown et al. (9) investigated reverse mutations in the fragile X syndrome as well as founder effects. Based on these haplotype analyses of nearby markers to the (CGG·CCG) repeats, revertants were discovered in a small percentage of the premutation carrier offspring. Gene conversion (9) and recombination (10) were proposed as the responsible mechanisms. Fourth, homologous recombination involving unequal pairing of sister chromatids leading to the formation of a four-stranded synaptic structure was suggested by Wieringa and co-workers (11) as the mechanism to explain the (CTG·CAG) expansion responsible for gonosomal mosaicism in DM patients. Fifth, Warren (12) interpreted the work of Olsen and co-workers (13) on the polyalanine expansion in synpolydactyly to result from unequal crossing-over in the HOXD13 protein gene and the authors agreed (12). Sixth, Kidd and co-workers (14) studied haplotype analyses of the DM locus on a worldwide basis with emphasis on the implications for the evolution of modern humans and the origin of the DM mutations. Several patterns of haplotype variation and linkage disequilibria were explained on the basis of gene conversion events such as unequal sister-chromatid recombination.

For these six cases, it was presumed that the TRS were the sites for the recombination (or gene conversion) events. However, for the majority of the instabilities in TRS involved in hereditary neurological diseases, evidence was not available to specifically identify the molecular mechanism(s) of the processes (i.e. replication, recombination, or repair) (reviewed in Ref. 1). Some workers (15-17) have supported the concept that recombination is not involved in expansion.

During the mapping of the genes for these neurological disorders, substantial linkage analyses were performed utilizing flanking markers, some quite near the repeats. Since the exchange of flanking markers was not generally found (15, 16), simple homologous recombination, as a general mechanism for expansion, has not been favored. Recently, experiments in yeast have tried to elucidate the independent roles of the RAD50-55 family of genes, which are responsible for various pathways of recombination. CTG·CAG sequences were shown (18) to be susceptible to strand breaks, and the occurrence of double-strand breaks was length-dependent. Also, experiments with yeast rad27 strains had augmented instability of the TRS; RAD27 encodes a nuclease involved in Okazaki fragment processing. It is known that the majority of errors that accumulate in rad27 strains are processed via single-strand annealing as well as double-strand break repair (types of recombination) (19).

Herein, we demonstrate that gene conversion or unequal crossing-over with or without exchange of flanking sequences is a powerful mechanism for (CTG·CAG) expansion in Escherichia coli.

    EXPERIMENTAL PROCEDURES

Plasmids-- The plasmids used in these experiments are derivatives of the unidirectionally replicating pUC19 and pACYC184. The cloning and characterization of plasmids containing CTG·CAG, CGG·CCG, and GTC·GAC repeating sequences were described (23-26, 28-30). pRW4100 and pRW4110 were constructed by digesting pRW3481 and pRW3463, respectively, with PvuII and recloning the fragments containing the triplet repeats into the PvuII sites of pACYC184. pRW4115 was constructed by digesting pRW3822 with EcoRI and EcoRV and recloning the fragment containing the triplet repeat into the EcoRI and Bstz17 sites of pACYC184. pRW4105 and pRW4106 were constructed by digesting bacteriophage λ DNA with HindIII and recloning the 564-bp fragment into the HindIII site of pUC19 or by end-filling to give blunt ends and then recloning into the PvuII sites of pACYC184. All of the CTG·CAG sequences cloned into pUC19 contain no interruptions, but the (CTG·CAG)175 cloned in pACYC184 (pRW3239) contains two G-to-A polymorphisms at the 28th and at the 69th repeat as well as 16 bp of human sequence at the proximal end relative to the interruptions and 43 bp distal to the interruptions. pRW3041 containing (CGG·CCG)81 with two interruptions was constructed as follows: a DNA fragment containing (CGG·CCG)81 was isolated from RN2 (46) by digestion with SmaI and HincII and was then inserted into the PvuII sites of pACYC184. pACT-2 was purchased from CLONTECH Laboratories, Inc.

Bacterial Strains-- The following E. coli strains were used: AB1157 (47) as a parent of the recombination-deficient strains; JC10289 (thr-1, ara-14, leuB6, Δ(gpt-proA)62, lacY1, tsx-33, glnV44(AS), galK2, λ-, rac-, hisG4(Oc), rfbD1, mgl-51, Δ(recA-srl)306, srlR301::Tn10, rpsL31(strR), kdgK51, xylA5, mtl-1, argE3(Oc), thi-1); and JC5519 (thr-1, ara-14, leuB6, Δ(gpt-proA)62, lacY1, tsx-33, qsr'-, glnV44(AS), galK2, λ-, rac-, hisG4(Oc), rfbD1, recC22, recB21, rpsL31(strR), kdgK51, xylA5, mtl-1, argE3(Oc), thi-1). All strains were obtained from the E. coli Genetic Stock Center, Yale University, New Haven, CT.

Standard Genetic Techniques-- Plasmid preparation, agarose gel, and polyacrylamide gel electrophoreses were carried out according to standard laboratory protocols (20). Transformations were performed by electroporation (48, 49). For the cotransformation experiments, each strain was cotransformed with a mixture of the two appropriate plasmids; the plasmids were originally grown in JC10289, which is the recA- strain. Forty microliters of washed cells of each strain (5 × 10^7 cells/ml) was initially transformed with 1 µl of the supercoiled DNA (0.5 µg/ml) listed in Fig. 1. For the experiments involving cotransformation, cells were prepared for electrotransformation, and the transformation mixture contained various combinations of the pUC19 and pACYC184 derivatives. The supercoiled plasmid volume was equally divided between the two test plasmids. A voltage of 2000 V was delivered for 4.1 to 5.8 ms. The cuvette size was 0.2 mm.

Cotransformants were selected on LB agar plates containing ampicillin (amp) and tetracycline (tet) since pUC19 and pACYC184, respectively, harbor these drug resistance genes. The cells were allowed to recover in 800 µl of SOC media (20) and kept at 37 °C for 1 h or longer. The cells were plated on LB agar that contained ampicillin (75 µg/ml) and tetracycline (12 µg/ml) and grown for 4-16 h at 37 °C. Individual colonies were selected for culture and were grown to mid-logarithmic phase (A600 = 0.3-0.9, 4-16 h) at 37 °C in LB media containing ampicillin (75 µg/ml) and tetracycline (12 µg/ml) under aerobic conditions.

Plasmid purification and gel electrophoresis and analysis were conducted as described (23-26, 28-30). The plasmid products obtained from recA+ strains were quantitated by staining the agarose gels with ethidium bromide and photographed. The amount of DNA in the RB region (defined in Fig. 2) versus the total DNA in the gel lane was determined by quantitating the areas of the negative with a computer densitometer (Molecular Dynamics 400S). The same areas of the gel for the experiments involving recA- and recB-C- strains were also quantitated and used as background and thus were subtracted from the analyses involving recA+ strains. The plasmid inserts and flanking sequences were characterized by dideoxy sequencing on both strands with Sequenase (version 2.0). The pACYC184 primers, purchased from Genosys Inc., were the following: primer 4244 (ACGGTCTTTAAAAAGGCCG) which 3'-terminates at map position 95; primer 4245 (CGTCAGTAGCTGAACAGGAGGG) which 3'-terminates at map position 522. The pACT-2 primer was purchased from CLONTECH Laboratories, Inc., and was primer GAL4 AD (ACCACTACAATGGATG) which 3'-terminates at map location 5155. Restriction mapping reactions and ligase reactions were conducted as described (25).
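The densitometric quantitation described above can be illustrated with a minimal Python sketch. It assumes that the background is handled as the mean RB-region fraction measured in the recombination-deficient lanes and is subtracted from the fraction measured in the recA+ lane; the text does not specify the exact arithmetic. The function names and all numerical values are hypothetical and are not data from this study.

```python
from statistics import mean


def region_fraction(region_area, lane_area):
    """Return the share of a lane's integrated signal that falls in one region."""
    if lane_area <= 0:
        raise ValueError("lane_area must be positive")
    return region_area / lane_area


def corrected_rb_fraction(test_lane, background_lanes):
    """Background-corrected RB fraction for a recA+ lane.

    test_lane and each entry of background_lanes are (rb_area, lane_area)
    pairs taken from the same gel regions. The background is the mean RB
    fraction of the recombination-deficient lanes (an assumption).
    """
    background = mean(region_fraction(rb, total) for rb, total in background_lanes)
    return max(region_fraction(*test_lane) - background, 0.0)


# Hypothetical densitometer areas (arbitrary units), for illustration only.
example = corrected_rb_fraction((600.0, 1000.0), [(40.0, 1000.0), (60.0, 1000.0)])
print(f"RB fraction after background subtraction: {example:.2f}")
```

Clamping the result at zero keeps a lane whose RB region is at or below background from reporting a negative fraction.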

Genetic Analyses-- The survival frequency of amp-tet-resistant colonies due to the presence of the TRS was calculated by counting the number of colonies on LB agar plates containing ampicillin and tetracycline using standard microbiological techniques. pUC19 derivatives and pACYC184 derivatives were transformed into E. coli strains and grown at 37 °C. The number of viable cells was determined by growth in the presence of streptomycin (20 µg/ml). The number of mutant cells was determined by growth in the presence of ampicillin and tetracycline. E. coli AB1157 and JC10289 were washed and diluted to a concentration of 5 × 10^7 cells/ml and cotransformed with combinations of plasmids as indicated: for example, pUC19 + pRW3239; pRW3080 + pACYC184; or pRW3080 + pRW3239.
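The frequency calculation outlined above can be sketched in Python as follows. The sketch assumes that the frequency is the ratio of colonies on ampicillin plus tetracycline to viable cells counted on streptomycin, and that a test pair of plasmids is compared with the mean of the control pairs. The function names and colony counts are hypothetical and are not results from this study.

```python
def survival_frequency(resistant_colonies, viable_cells):
    """Colonies on amp + tet plates divided by viable cells (streptomycin plates)."""
    if viable_cells <= 0:
        raise ValueError("viable_cells must be positive")
    return resistant_colonies / viable_cells


def fold_enhancement(test_frequency, control_frequencies):
    """Ratio of a test frequency to the mean frequency of the control pairs."""
    if not control_frequencies:
        raise ValueError("at least one control frequency is required")
    mean_control = sum(control_frequencies) / len(control_frequencies)
    if mean_control <= 0:
        raise ValueError("mean control frequency must be positive")
    return test_frequency / mean_control


# Hypothetical colony counts, chosen only to illustrate the arithmetic.
test = survival_frequency(200, 1_000_000)
controls = [survival_frequency(4, 1_000_000), survival_frequency(6, 1_000_000)]
fold = fold_enhancement(test, controls)
print(f"fold enhancement over controls: {fold:.1f}")
```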

    RESULTS

Interplasmid Recombination-- To evaluate the potential role of homologous recombination in the expansion of TRS, a two-plasmid system was established in E. coli. (The term recombination is used in a general sense and includes gene conversion, unequal crossing-over, and sister chromatid exchange.) For our study, one family of plasmids (Fig. 1) was a derivative of pUC19 (a diminutive form of pBR322) that contains the unidirectional ColE1 origin of replication and harbors the ampicillin resistance gene (20). The other family of plasmids was derived from pACYC184 (21) that harbors the tetracycline resistance gene. A computer search revealed that little or no sequence identity exists between these two vectors; only single copies of identical tracts that were 48, 30, 23, 20, 16, and 12 bp in length were found along with several copies of 9- to 5-bp segments. Thus, the non-identical sequences of these two vectors enabled our focus on the potential effects of cloned tracts of different TRS on recombination. Prior investigations (22) revealed the stable cotransformation of derivatives of these two plasmids.
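The kind of identity search mentioned above can be illustrated with a short Python sketch. The program actually used for the computer search is not named in the text, so this is only one possible approach: a dynamic-programming search for the longest tract shared by two sequences, checked on both strands. The two sequences are short invented examples, not the pUC19 or pACYC184 sequences.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def longest_shared_tract(a, b):
    """Longest substring present in both a and b (same strand)."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the match that ended at the previous position in both.
                cur[j] = prev[j - 1] + 1
                if cur[j] > best_len:
                    best_len, best_end = cur[j], i
        prev = cur
    return a[best_end - best_len:best_end]


def longest_shared_tract_both_strands(a, b):
    """Longest tract shared by a and either strand of b."""
    forward = longest_shared_tract(a, b)
    reverse = longest_shared_tract(a, reverse_complement(b))
    return forward if len(forward) >= len(reverse) else reverse


# Invented toy sequences that share only a (CTG)5 tract of any length.
toy_a = "GATTACC" + "CTG" * 5 + "TTGACC"
toy_b = "CCGTA" + "CTG" * 5 + "AGGTC"
shared = longest_shared_tract_both_strands(toy_a, toy_b)
print(shared, len(shared))
```

The quadratic table is adequate for plasmid-sized sequences of a few kilobase pairs; much longer inputs would call for an index-based method instead.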


Fig. 1.   Plasmids used for transformation of E. coli. Plasmids in the left column are derivatives of pUC19 and in the right column are derivatives of pACYC184. The types of sequences and lengths of the TRS are listed under column n. For example, pRW3036 is a pUC19 derivative that contains a pure insert of (CTG·CAG)36. All CTG·CAG inserts cloned into pUC19 are completely homogeneous (i.e. no interruptions). All other plasmids have TRS that contain interruptions as described previously (23-26, 28-30), except for pRW3017 which has a pure stretch of (CGG·CCG)17. The longest stretch of uninterrupted repeating sequences is listed under column n. All TRS are in orientation I (23-26, 28-30). All plasmids have SacI (S) and EcoRI (E) sites in common, but the pUC19 derivatives also contain a PstI(P) site and a NotI site on each side of the polylinker, whereas all pACYC184 plasmids contain a Bstz17 (B) and a SacII (S") site. The ampicillin resistance gene (Amp) is designated as a checkered box, and the filled box on pACYC184 designates the tetracycline resistance gene (Tet). The origins of replication that are unique for the two different plasmids are shown by vertically and diagonally striped boxes. The shaded box is the polylinker region. pUC19 is a high copy number plasmid (20) with ~500 copies per cell, whereas pACYC184 has a copy number of ~10 per cell. The plasmids are not drawn to scale.

To analyze the recombination behavior of these plasmids (Fig. 1), experiments were conducted in three E. coli strains that are isogenic but differ in their recombination capacity. To test the recombination capacity of CTG·CAG repeat sequences, compared with controls, each strain (AB1157, JC10289, and JC5519) was transformed with the parental control vectors pUC19 and pACYC184 that lack TRS. As a second control, each strain was cotransformed with the vector pACYC184 along with various pUC19 derivatives that contained different lengths and types of TRS. Third, the vector pUC19 was cotransformed along with pRW3239 (which contains (CTG·CAG)175). Finally, the three strains were cotransformed with various pUC19 derivatives that contained different lengths and types of TRS along with pRW3239. The only segments of the non-homologous plasmids that had identical sequences were the repeating tracts of (CTG·CAG), (GTC·GAC), (GTG·CAC), (GAA·TTC), (CGG·CCG), and a fragment of phage λ DNA.

A strength of this experimental approach is the capacity to select for cells harboring recombinant DNA products that contain both drug resistances and to assay rigorously these products for expansions by restriction mapping and/or DNA sequencing since they contain unique recognition sites. Since the two vectors have different replication origins and copy numbers, exhaustive control studies, described above, were performed followed by analyses of either the supercoiled or linearized DNA which provided confidence of the products formed by recombination. For the 11 control experiments, described above, the CTG·CAG tracts were unchanged in length and/or deletions occurred, as expected (28-32). Thus, the differences in the origins of replication and the copy numbers per se did not affect the TRS stability.

Effects of TRS Sequences on Recombination as Analyzed with Supercoiled DNAs-- Fig. 2 shows a typical gel electropherogram of supercoiled plasmids as isolated from E. coli AB1157, which is recA+. The plasmids used for transformation and cotransformations are listed above each lane and are depicted on the right. The first five lanes are controls, and the last lane contains a new product and shows complete loss of the original transforming DNAs.


Fig. 2.   Agarose gel electrophoretic analysis of transformation and cotransformation products in the supercoiled form. E. coli AB1157 (which is recA+) was transformed with purified supercoiled DNA. The cells were plated and grown until mid-log phase (A600 of 0.5-0.8) in LB media containing ampicillin (75 µg/ml) and/or tetracycline (12 µg/ml), depending on the presence of the drug resistance genes. The DNA was isolated via the alkali lysis method (20) and subsequently electrophoresed through a 1.2% agarose gel in TAE buffer (20). Lane a shows the migration of supercoiled pRW3080 (which contains (CTG·CAG)80). Lane b shows the migration of supercoiled pRW3239 (which contains (CTG·CAG)175). Lane c contains supercoiled DNAs from AB1157 cotransformed with pUC19 and pACYC184. Lane d contains supercoiled DNAs from AB1157 cotransformed with pUC19 and pRW3239 (which contains (CTG·CAG)175). Lane e contains supercoiled DNAs from AB1157 cotransformed with pRW3080 (which contains (CTG·CAG)80) and pACYC184. Finally, lane f shows the gel mobility of products from cells cotransformed with pRW3080 (containing (CTG·CAG)80) and pRW3239 (which contains (CTG·CAG)175). The band at ~12 kbp, found to varying extents between different isolations, is attributed to fragments of bacterial chromosomal DNA. RB designates the bands due to recombination. The circular RB depicted, containing single copies of pRW3080 and 3239, is a simple case, whereas multiple forms of RB with different numbers of each plasmid likely exist (Fig. 3B, lower left).

Considering the singly transformed cells, lane a shows the 1.8-kbp band (monomer supercoiled DNA), whereas the other bands are dimers, trimers, and other multimeric forms of pRW3080. The formation of multimers of plasmids in recA+ cells was reported previously (47). Since each plasmid is entirely homologous with itself, intraplasmid recombination in a recA+ background is expected. Lane b shows the monomer supercoiled form of pRW3239. Its lower copy number precluded the observation of dimers and higher multimers. Note that the size relationship between these control plasmids is as expected.

Considering the cotransformed cells, lane c shows the DNA isolated from E. coli AB1157 cotransformed with the vectors pUC19 and pACYC184. This lane shows the supercoiled starting materials, as expected; monomer supercoiled pUC19, at 1.7 kbp, migrated slightly faster than the larger pUC19 derivative, pRW3080 (which contains (CTG·CAG)80 shown in lane a). Likewise, the dimer of pUC19, the band at 3.9 kbp, migrated slightly faster than the dimer of pRW3080 (at 4.1 kbp). Since the supercoiled vectors were recovered unchanged in size, no recombination occurred between these two DNAs. pACYC184, although not visible on this gel due to its lower copy number, is present since our media contained tetracycline (as well as ampicillin). Lane d shows the results of the cotransformation of pUC19 and pRW3239; the supercoiled monomer of pUC19 is the band at 1.7 kbp (identical to lane c). The next largest band in lane d is at 3.6 kbp which is the monomer of pRW3239 and comigrates with the molecules found in lane b, as expected. Thus, when one plasmid harbors a TRS, the interaction between the two plasmids is no different than when both vectors, which both lack TRS tracts, are cotransformed into recA+ cells. The identity of the other bands found in lane d is interpreted as for the data in lanes a and c. Lane e shows the results of cotransformation of pRW3080 and pACYC184. The 1.8-kbp band comigrates with the DNA in lane a, as expected. The DNAs found above 1.8 kbp comigrate with the DNAs found in lanes a-d, as described above. Hence, even though the recA system was intact, no new products were detected that could be ascribed to recombination. These results are as expected and are identical to the data obtained from the two-plasmid system when conducted in the 11 control studies with isogenic, but recombination-deficient, strains.

Lane f shows the extraordinary result of the formation of large recombinant bands from recA+ cells cotransformed with pRW3080 and pRW3239. A loss of the monomer forms of both plasmids and the formation of DNA (~55% of the total DNA) that migrates in the recombinant band (RB) region is revealed. Hence, comparison of the products in lanes c-e with the products found in lane f (the cotransformation with both plasmids containing the TRS) clearly demonstrates the loss of monomer forms of the starting plasmids and formation via recombination of new recombinant products, which contain both plasmids in various ratios. In addition, homodimers and other homomultimers are visible, as expected, and in some cases represent up to ~45% of the total DNA. The same experiment conducted in isogenic recombination-deficient strains failed to yield RB and the loss of the plasmid monomers (see "Requirement for Recombination Genes"). The general features revealed in Fig. 2 have been found in at least 100 other similar determinations.

Not only is the formation of the very long RB products (~18-40 kbp) extraordinary, but the complete loss of the pRW3080 (and probably also the not visible pRW3239) starting material was dramatic. This result was found even for very short culture times (less than 4 h). Hence, the reaction in vivo to form the RB from the appropriate (CTG·CAG) containing plasmids was very powerful. Furthermore, lane f does not contain a smear of DNA products suggesting that the formation of a discrete series of plasmids may have been triggered by secondary structures within the triplet repeat sequences themselves, as observed for other instabilities (23, 30, 50).

Whereas it is not possible to propagate the pUC19 derivative pRW3080 alone, nor the pACYC184 derivative pRW3239 alone, in the presence of both ampicillin and tetracycline (since they do not possess both drug resistances), we found that isolation of the recombinant product (Fig. 2, lane f) and subsequent retransformation into recA- cells (JC10289) was effective, and the RB was stable. This result was expected since the RB (18-40 kbp) contains both drug resistances. Since the same number of colonies (±10%) was found when RB, retransformed into JC10289, was grown on only amp, or only tet, or both amp and tet, we conclude that the transformants are not due to a minority form. Parenthetically, whereas JC10289 contains Tn10:srl, and hence expresses a low level of tetracycline resistance, our studies were performed in the presence of 12 µg/ml tetracycline; JC10289 will not grow at tetracycline concentration >2 µg/ml.

Effects of TRS Sequences on Recombination as Analyzed with Linearized DNAs-- Since analyses of the supercoiled DNA products revealed the apparent recombination between plasmids containing CTG·CAG tracts, the material in the RB region was characterized further. Fig. 3A shows a typical gel electropherogram of linearized plasmid products as isolated from E. coli AB1157 transformed or cotransformed with the designated plasmids. Lanes a and b contain only pUC19 derivatives; lanes c and d contain only pACYC184 derivatives, and lanes e and f contain the mixture of two plasmids. Lane a shows the result of cells transformed with pUC19 and the product digested with PstI. The appearance of a single band at ~2700 bp is as expected. When pRW3080 (which contains (CTG·CAG)80) was grown in AB1157 and digested with PstI, the products migrated at approximately 2950 bp (Fig. 3A, lane b); the products migrated slightly slower than pUC19 (lane a) due to the presence of the TRS insert. Interestingly, some small deletions were observed in the (CTG·CAG)80 tract and appear as a smear below 2950 bp, as expected (23).


Fig. 3.   Analysis of transformation and cotransformation products in the linear form. A, agarose gel electrophoretic analysis of the linearized DNA products. The DNAs described in Fig. 2 were linearized as follows and analyzed on 0.8% agarose gels. Lane a shows pUC19 linearized with PstI; lane b shows the PstI-linearized products of pRW3080 (containing (CTG·CAG)80); lane c shows the SacII-linearized product of pACYC184; lane d shows the SacII-linearized product of pRW3239 (which contains (CTG·CAG)175); lane e shows the PstI-linearized products of AB1157 cotransformed with pRW3080 and pRW3239; and lane f shows the SacII-linearized products of AB1157 cotransformed with pRW3080 and pRW3239. The DNA size markers were a 1-kbp ladder (Life Technologies, Inc.); the sizes of these bands are shown at the side of the figure. The figures on the right of the gel represent the linearized molecular structures; the top two figures represent linearized cointegrants containing n units of pUC19 (lane f) or m units of pACYC184 (lane e). The open boxes represent the CTG·CAG tracts. B, illustration of interplasmid recombination events. The thick lines represent a pUC19 derivative, and the cross-hatched segments represent pRW3239. The boxes on each plasmid represent the homologous TRS. The unique PstI and SacII sites are shown; a single EcoRI(E) site is present on both vectors. The arrows designate the locations of the primers used for sequencing. The figure is not drawn to scale. When E. coli AB1157 was cotransformed with the plasmids, the TRS align for strand invasion and exchange. A single crossover event is depicted (upper right). The lower left structures depict two possible recombined plasmids, but a large number of multimers containing different numbers of the two plasmid units in different orientations can also exist. No restriction was placed on the amount of recombination or the extent of multimer formation that can occur. 
If the recombined plasmid population is linearized at a restriction site (i.e. PstI) unique for the pUC19 derivatives, linear forms will be generated where the two ends of one pUC19 derivative flank one or more copies of pRW3239. Alternatively, if the RB is linearized with SacII which is unique for pRW3239, one or more copies of the pUC19 derivative will be flanked by portions of a single copy of pRW3239.

Unlike the results found in recA- strains that yielded monomer forms of the plasmid (23-25, 30), the use of recA+ strains yielded monomer and multimer forms of plasmids. When multimers of a single type of plasmid were linearized, the result was the presence of only one band, which migrates to its known plasmid length, as expected. Hence, changes in the plasmid sequence would be distinguishable from the original plasmid. However, no such changes were detected when using AB1157 cells. Obviously, both plasmids (pUC19 and pRW3080) were cultured in the presence of ampicillin since the absence of the gene encoding tet resistance would preclude their propagation in the presence of tetracycline.

Considering the pACYC184 derivatives, lanes c and d show the SacII-linearized products of pACYC184 and of pRW3239; distinct bands were observed at 4244 and 4450 bp, respectively. The sharpness of these two bands is due to the fact that larger plasmids stabilize TRS (1). Also, the faintness of the bands reflects the lower copy number of pACYC184. Obviously, these plasmids were grown only in the presence of tetracycline since the absence of the gene encoding ampicillin resistance would preclude the use of both antibiotics.

For the cotransformation experiments, pRW3080 and pRW3239 (both contain (CTG·CAG) sequences) were cultured in recombination-proficient cells in the presence of both ampicillin and tetracycline. After linearization of the recombinant products with PstI (Fig. 3B) (which only cleaves the pUC19 derivative), the products are the 3.1-kbp unit length for pRW3080 cleaved out of the recombinant band (Fig. 3A, lane e). Due to recombination initiated in the CTG·CAG tracts between the two otherwise non-homologous plasmids, expansion products are detected since this 3.1-kbp DNA is slightly larger than pRW3080 (lane b). pRW3080 cleaved out of RB by PstI is the major product in Fig. 3A, lane e, due to its high copy number, even in RB. In addition, two faint products are observed at 11 kbp, which represents 1 unit of pRW3080 linked to 2 units of pRW3239, and at 19-40 kbp which represents 1 unit of pRW3080 linked with multiple units of pRW3239 as diagrammed in Fig. 3B. The 7-kbp band representing 1 unit of pRW3080 and 1 unit of pRW3239 is not visible on this gel but was repeatedly observed on other analyses. Interestingly, the supercoiled monomer form of pRW3239 starting material which cannot be linearized by PstI became fully integrated into the recombinant bands.

When the cotransformation of AB1157 with pRW3080 and pRW3239 was linearized with a restriction enzyme (SacII) that solely cuts the pACYC184 derivative (Fig. 3B), several distinct products that differed in size from those products obtained by PstI linearization were observed (Fig. 3A, lane f). This is consistent with the result of one type of plasmid recombined with one or more plasmids of the second type (Fig. 3B). The DNA at 4.4 kbp is linearized pRW3239 and the product at 10 kbp represents 1 pRW3239 unit linked to 2 pRW3080 units. In addition, broad bands in the range of 19-40 kbp were observed which represent recombinant bands consisting of 1 unit of pRW3239 with multiple units of pRW3080. Furthermore, an interesting facet of the analysis of lane f is the total absence of the high copy number supercoiled monomeric (or nicked/linear) pRW3080 starting material that cannot be linearized by SacII (migrating at 1.8 kbp, Fig. 2, lane a); hence, this DNA became fully integrated into the recombinant bands. We presume that the same is true for pRW3239, but its low copy number precludes this observation. The 4.4-kbp band is assigned as linear pRW3239 because SacII only linearizes pRW3239 and PstI only linearizes the 3.1-kbp pRW3080. SacII will not linearize pRW3080 so the supercoiled form will be present in the various products depicted in Fig. 3B. PstI will not linearize pRW3239 so its supercoiled form will be present in the various products depicted in Fig. 3B.
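The band assignments above rest on simple additive arithmetic, which the following Python sketch makes explicit. It assumes that a linearized cointegrant is the sum of whole plasmid units, and it uses the unit lengths quoted in the text (3.1 kbp for pRW3080 cleaved out of RB and 4450 bp for SacII-linearized pRW3239). The function name is hypothetical, and the sums are nominal lengths that need not match the apparent sizes read from a gel.

```python
PRW3080_BP = 3100  # unit length quoted for pRW3080 cleaved out of RB
PRW3239_BP = 4450  # SacII-linearized pRW3239


def cointegrant_length(n_prw3080, n_prw3239):
    """Nominal length (bp) of a cointegrant built from whole plasmid units."""
    if n_prw3080 < 0 or n_prw3239 < 0:
        raise ValueError("unit counts must be non-negative")
    return n_prw3080 * PRW3080_BP + n_prw3239 * PRW3239_BP


# Unit combinations discussed for the PstI and SacII digests.
for units in [(1, 1), (1, 2), (2, 1)]:
    print(units, cointegrant_length(*units), "bp")
```

Larger unit counts extend the same sum toward the broad 19-40 kbp region, where individual species are no longer resolved on the gel.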

Hence, this analysis also reveals the robust formation of RB from the appropriate (CTG·CAG) containing plasmids. The general features revealed in Fig. 3A were found in at least 100 other similar analyses. Whereas recombination may occur at any time during the growth of the cultures, "jackpots" have not been observed in our studies.

RB Is Not a Catenane-- The possibility was considered that the recombinant band, which is presumably circular, in Fig. 2, lane f, might exist as catenated DNA (interlinked plasmids) rather than a recombined product with the catenated DNA migrating slower than either of the substrate plasmid monomers. To address this question, studies were conducted on the DNA products using single site restriction enzymes. The products in Fig. 3A, lanes e and f, showed that the DNA did not exist as catenated products. Digestion with unique restriction enzymes resulted in the absence of circular monomer plasmids and in the appearance of 7 kbp and larger products that represented one digested plasmid connected to one or more undigested plasmids. If the DNA had been catenated, then use of the unique restriction enzymes (PstI and SacII as well as three other enzymes) (Fig. 3B) would have released the monomer supercoiled pRW3080 that would have migrated as a 1.8-kbp product such as that found in Fig. 2, lane a. Hence, our data show that the TRS are involved in genetic recombination events rather than in the formation of catenanes.

Survival of Amp-Tetr Colonies-- The frequency of amp-tetr colonies was determined genetically when E. coli AB1157 was cotransformed with plasmids that harbored various TRS as well as non-repetitive DNA. More than 41 prior studies revealed a direct correlation between the number of amp-tet-resistant colonies observed in a cotransformation experiment, the biochemical presence of RB in agarose gel electrophoretic determinations, and the 1-5-fold expansions of the CTG·CAG sequences. Cells that were cotransformed with pRW3080 (which contains (CTG·CAG)80) and with pRW3239 (which contains (CTG·CAG)175) had an ~90-fold enhanced survival compared with the average of the other pairs of plasmids. In contrast, for the control studies, we found only background levels of RB, no expansion of the TRS, and low numbers of colonies. The types of sequences investigated were as follows: the DM (CTG·CAG) repeat; the sequence isomer (GTC·GAC) (24); a TRS that is not known to be associated with a disease (GTG·CAC) (24); the Friedreich's ataxia sequence (GAA·TTC) (24, 51); the fragile X sequence (CGG·CCG) (25); a 564-bp HindIII bacteriophage λ fragment; as well as two non-homologous sequence mixtures.

Interestingly, it may be noted that, as an infrequent event, we did observe the recombination of the fragile X sequences as measured by a loss of monomer plasmid and the formation of RB. However, the lengths of CGG·CCG tracts cloned into the plasmids were shorter than those of CTG·CAG, and this may contribute to the lower frequencies. Unfortunately, (CGG·CCG) tracts are extremely unstable in E. coli (25) rendering their study in our two-plasmid recombination system less clear to interpret than the results with the (CTG·CAG) sequences as well as the other six control sequences. Interestingly, prior replication-based studies (29) revealed the facile expansion of CTG·CAG compared with the nine other TRS. In summary, the robust recombination observed in this two-plasmid system is dependent on the presence of CTG·CAG in both DNAs.

Effect of TRS Sequence-- Biochemical studies were also conducted to test the requirement for identical TRS sequences for homologous recombination events. Table I shows the results of replacing the (CTG·CAG)80 tract in pRW3080 with various lengths of the sequence isomer (GTC·GAC) (24) or (CGG·CCG) (25, 26) in cotransformation experiments with pRW3239. The amount of DNA observed in the recombinant band region was at background level in all of these cases. Hence, the TRS tracts in the recombining plasmids must contain the same sequences. The combination of (CTG·CAG) with the sequence isomer (GTC·GAC) could, in principle, form parallel-stranded structures (27), but this is ineffective in this system. Also, in principle, GTC·GAC and CGG·CCG could form paired structures with 2/3 Watson-Crick pairs and 1/3 incorrect pairs; this is also ineffective in recombination. Thus, 33% of AC and GT oppositions create a non-recombinogenic pair of plasmids.
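The 2/3 pairing figure for the CTG·CAG and CGG·CCG combination can be checked by counting Watson-Crick pairs between the repeat strands. The following is a minimal sketch, assuming idealized uninterrupted tracts paired antiparallel and taking the best of the three possible registers; the function name and the tract length of 30 repeats are illustrative assumptions.

```python
from fractions import Fraction

WC_PARTNER = {"A": "T", "T": "A", "G": "C", "C": "G"}

def best_wc_fraction(unit_a, unit_b, n_repeats=30):
    """Best fraction of Watson-Crick pairs when a tract of unit_a is paired
    antiparallel with a tract of unit_b, over the three possible registers."""
    strand_a = unit_a * n_repeats                    # read 5'->3'
    strand_b_rev = (unit_b * (n_repeats + 1))[::-1]  # read 3'->5', one spare unit for shifting
    best = Fraction(0)
    for shift in range(len(unit_b)):
        pairs = sum(
            WC_PARTNER[base] == strand_b_rev[i + shift]
            for i, base in enumerate(strand_a)
        )
        best = max(best, Fraction(pairs, len(strand_a)))
    return best

print(best_wc_fraction("CTG", "CAG"))  # fully complementary strands
print(best_wc_fraction("CTG", "CGG"))  # leaves a T opposite G in every triplet
print(best_wc_fraction("CAG", "CCG"))  # leaves an A opposite C in every triplet
```

In this idealized model the homologous pair gives a fraction of 1, whereas the CTG/CGG and CAG/CCG pairings each give 2/3, with the remaining third being the GT and AC oppositions mentioned above.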

                              
Table I
Effect of the sequence of the triplet repeats on recombination
The percentage of the DNA in the RB region (defined in Fig. 2) as compared with the total amount of DNA on the agarose gel electrophoretic analyses was determined for the plasmids (Fig. 1) with different types and lengths of TRS inserts. E. coli AB1157 was cotransformed with pRW3239 (containing (CTG·CAG)175) and with one of the following plasmids: pRW3080 (which contains (CTG·CAG)80); pUC19 which does not contain any TRS; pRW3415 (containing (GTC·GAC)34); pRW3462 (containing (GTC·GAC)47); pRW3463 (containing (GTC·GAC)54); pRW3017 (containing (CGG·CCG)17); pRW3687 (containing (CGG·CCG)60); and pRW3311 (containing (CGG·CCG)81). n is the longest length of uninterrupted TRS. The isolated DNA was electrophoresed through 1.1% agarose gels in TAE buffer. The gels were photographed, and the negatives were scanned to quantitate the DNA. The standard deviation for (CTG·CAG)80 was ±17% and was ±1.5% for the other eight experiments.

Furthermore, similar investigations were conducted with pUC19 and pACYC184 derivatives which both contain tracts of the fragile X sequence CGG·CCG. When pRW3311 (which contains (CGG·CCG)81) was cotransformed with pRW3041 (a pACYC184 derivative that contains (CGG·CCG)81), a less pronounced amount of RB was formed compared with CTG·CAG. Further experiments are in progress to determine if this is an effect of the length of the TRS, the lower capacity of CGG·CCG, compared with CTG·CAG, to form looped hairpin structures (25, 45) or other factors.

Effect of Length of TRS-- The effect of TRS length on the formation of recombinant products was determined by keeping (CTG·CAG)175 in pRW3239 constant while the length of the same sequence in the pUC19 derivative was varied. Fig. 2, lane f, presents typical data that show that up to 63% of the DNA was found in the RB region of an agarose gel when pRW3080 and pRW3239 were cotransformed into recombination-proficient cells. The data collected using other lengths of (CTG·CAG) in the pUC19 vector are summarized in Fig. 4. pRW3080 was replaced with pUC19 derivatives that contain different lengths of (CTG·CAG) and were cotransformed with pRW3239. The data revealed a sinusoidal curve with a threshold for the appearance of RB at 30 repeats. When the repeat lengths were 30 or shorter, monomer supercoiled DNA was recovered. However, when the pUC19 plasmids containing 36, 47, 80, or 100 repeats were cotransformed with pRW3239, no monomer supercoiled DNA was found, but rather the majority of the DNA was in the RB region (Fig. 2, lane f).


Fig. 4.   Effect of length of CTG·CAG on recombination. The percentage of DNA in the RB regions (as defined in Fig. 2) is plotted in relation to the increasing lengths of (CTG·CAG)n inserts in pUC19 derivatives. E. coli AB1157 was cotransformed with pRW3239 (which contains (CTG·CAG)175) and with one of the following plasmids: pUC19 (which contains (CTG·CAG)0); pRW2163 (which contains (CTG·CAG)13); pRW2180 (which contains (CTG·CAG)30); pRW3036 (which contains (CTG·CAG)36); pRW3047 (which contains (CTG·CAG)47); pRW3080 (which contains (CTG·CAG)80); and pRW3216 (which contains (CTG·CAG)100). The isolated DNA was electrophoresed through a 1.1% agarose gel in TAE buffer. Each point and its respective standard deviation represents the average of six experiments.

Hence, these results (Fig. 4) correlate strongly with the phenomenon observed in myotonic dystrophy patients, in which no symptoms are observed below the break point of 30 repeats, whereas progressively more severe symptoms occur above the threshold (1, 2). By analogy in E. coli, repeat lengths of 30 and less were stable and transmitted from one cell division to the next with high fidelity, whereas lengths of 36 repeats and above were unstably transmitted (expanded up to 5-fold their original lengths (see below)). This threshold is similar to the length observed in humans for the transmission of expanded alleles (11).

Requirement for Recombination Genes-- All prior investigations from this laboratory (23-25, 28-30) on genetic instabilities of TRS were in recA- E. coli, usually strain HB101. The work reported herein is the first description of genetic instabilities in recA+ E. coli. Table II shows that little or no recombination was observed between pRW3080 and pRW3239 in recA- or in recB-C- strains. No RB was observed but only the starting plasmids were detected on the gels. Hence, the presence of these gene products (33) is required for the formation of the recombinant molecules (Fig. 2, lane f). Also, genetic studies, in part described above, agree with this conclusion.

                              
Table II
Effect of recA and recBC on recombination
The percentage of the DNA in the RB region as compared with the total amount of DNA on the agarose gel electrophoretic analyses was determined for different lengths of CTG·CAG inserts in pUC19 for the three strains of E. coli. The cotransformation experiments, as performed in the wild type strain (AB1157), were repeated in the recombination-deficient strains JC10289 (recA-) and JC5519 (recB-C-). The E. coli strains were cotransformed with pRW3239 (containing (CTG·CAG)175) and with one of the following plasmids: pUC19 (containing (CTG·CAG)0); pRW2163 (containing (CTG·CAG)13); pRW2180 (containing (CTG·CAG)30); pRW3036 (containing (CTG·CAG)36); pRW3047 (containing (CTG·CAG)47); pRW3080 (containing (CTG·CAG)80); or pRW3216 (containing (CTG·CAG)100). The DNA was isolated and electrophoresed through 1.1% agarose gels in TAE buffer. The gels were photographed, and the negatives were scanned to quantitate the amounts of DNA. The standard deviation for E. coli recA+ was ±17% and was ±2% for the other 14 experiments.

Analyses of Recombinant Products-- DNA sequence analyses were performed on the recombinant products formed by the cotransformation of pRW3036 (which contains (CTG·CAG)36) and pRW3239 (which contains (CTG·CAG)175) in recombination-proficient cells (AB1157) using the pUC19 primers 1211 and 1233. The transformations were repeated numerous times on separate days and individual colonies were picked and analyzed. DNA sequencing and restriction mapping were conducted on these individual clones. Table III shows the analyses on 19 colonies; the extents of expansions were from an approximate doubling to 5-fold. Parallel experiments were also conducted with pRW3047 (which contains 47 CTG·CAG repeats) and pRW3080 (which contains 80 CTG·CAG repeats). Again, substantial expansions were observed in the 15 individual isolates and the expansions ranged from a doubling to 2.75-fold. The length of the expansions was as great as 140 repeats or 420 bp. In all cases, the expansions occurred in the CTG·CAG regions without introduction of interruptions. Resolution on the sequencing gels prohibited the counting of distinct repeats beyond 80 due to the lack of markers in the uninterrupted repeat tracts.
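The fold expansions and the repeat and base pair gains quoted above are related by simple arithmetic, since each repeat unit is 3 bp. The following is a minimal sketch; pairing the 2.75-fold expansion with the (CTG·CAG)80 tract is an assumption made for illustration, because the text does not state which clone gave the 140-repeat gain, and the function name is hypothetical.

```python
BP_PER_REPEAT = 3  # each CTG·CAG unit is one triplet

def expansion_summary(start_repeats, fold):
    """Return (final repeats, repeats gained, bp gained) for a given fold expansion."""
    final_repeats = start_repeats * fold
    gained = final_repeats - start_repeats
    return final_repeats, gained, gained * BP_PER_REPEAT

print(expansion_summary(80, 2.75))  # assumed pairing: (CTG·CAG)80 expanded 2.75-fold
print(expansion_summary(36, 5))     # (CTG·CAG)36 expanded 5-fold
```

Under that assumption, an 80-repeat tract expanded 2.75-fold reaches 220 repeats, a gain of 140 repeats or 420 bp, which matches the largest expansion reported in the text.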

                              
Table III
Summary of extent of expansions of (CTG·CAG) tracts
Plasmids (Fig. 1) were cotransformed into recombination-proficient E. coli, and DNA was isolated from individual colonies grown as described under "Experimental Procedures" and the legends to Figs. 2 and 3. The lengths of the repeat expansions in the pUC19 derivatives were determined by DNA sequencing for (CTG·CAG) products up to ~80 repeats; the error for these determinations is ±2%. Due to the lack of resolution for longer (CTG·CAG) products, restriction mapping (usually SacI and PstI) was employed; the error in lengths is ±4%. Each line in the table represents a (CTG·CAG) length determination on the recombinant DNA product from an individual clone from a cotransformation experiment. The last six lines show the results from control studies where the pUC19 derivatives (pRW2180, -3036, and -3080) were transformed in the absence of the pACYC184 derivative pRW3239. As expected (23, 28-30), no expansion was observed.

As a control, when recA- cells were cotransformed, no expansion was observed by sequence analysis. Therefore, the presence of RecA directly affects the genetic stability of (CTG·CAG). As a further control, if pRW3036 was propagated in AB1157, without pRW3239 and only in the presence of ampicillin (Table III), no expansion of the TRS was found by DNA sequence analysis, as expected. In this case, homodimers, homotrimers, etc., are formed (as for pUC19 and pRW3080 in Fig. 2) due to recombination of pRW3036 with itself, but the TRS was not expanded. Similar results (no formation of RB and no expansions of the TRS) were found by restriction mapping for pRW3080 alone and for pRW2180 by sequence analysis. It is well known (23-25, 28-30) that short repeats are transmitted accurately and do not readily undergo alterations in repeat length. In all sequence determinations, the vector sequences (pUC19 and pACYC184) were unchanged.

In summary, numerous DNA sequence analyses and restriction mapping studies demonstrated the presence of expansions of 1-5-fold in length.

Fate of Point Mutations after Recombination Events-- pRW3239 contains two point mutations that serve as useful molecular markers to follow the fate of individual TRS through the recombination events. If these interruptions in the TRS are directly involved in the exchange (or used as repair templates (Fig. 5)) of CTG·CAG sequences, they might move from one plasmid location to the other. pRW3080 (which contains an uninterrupted repeating sequence of (CTG·CAG)80) was cotransformed with pRW3239 in E. coli AB1157, and the recombinant product was linearized with PstI and gel-purified to give the 7-kbp linearized band (not visible in Fig. 3A but repeatedly detected in other similar experiments), and the eluted band was ligated and retransformed into recA+ cells (AB1157). Individual colonies were selected, and the recombinant DNA was purified and sequenced using a pUC19 primer (1211) (Fig. 5). Note that the CTG·CAG sequence in pRW3239 will not be analyzed when the pUC19 primers are used. DNA sequence analyses revealed the presence of one of the G-to-A interruptions (which occur at positions 28 and 69 of the (CTG·CAG)175 tract of pRW3239) in the TRS flanking the pUC19 derivative. Interestingly, the G-to-A interruption from the pACYC184-derived sequence was observed at approximately repeat number 127. Thus, the only way this result could have occurred was for recombination to take place between the CTG·CAG originally in pRW3239 with the pUC19-derived pRW3080 (data not shown).


Fig. 5.   A model for expansion of CTG·CAG repeats mediated by recombination. Each strand of the CTG·CAG repeat participating in recombination is shown (the TRS in pUC19 and pACYC are open boxes). The dots in the insert in pACYC184 represent the G-to-A mutations at positions 28 and 69. The heavy solid lines represent the pUC19 vector, and the thinner cross-hatched vector is pACYC184. Two possible mechanisms are proposed. Homologous recombination between the two TRS on the left side gives an exchange of the G-to-A interruption from one plasmid to the other sequence with exchange of flanking sequences. An alternative mechanism (right side) shows that a double-strand break occurs within the CTG·CAG tract (18, 19, 35, 54, 55) and exchange at the broken ends forms two Holliday-like junctions separated by the distance "k". DNA repair synthesis (dashed lines) restores the sequence with (CTG·CAG). Synthesis occurs on both strands resulting in the expansion of pRW3080. Depending on the extent of branch migration, different size expansions will be formed. Resolution of the junctions leaves the flanking sequences unaltered. The structures are processed as in unequal crossing-over between sister strands. Misalignments between the strands as well as other intermediates involving single-stranded loops that are displaced, melted, and/or slip-paired could lead to expansion and explain the formation of new products. Adapted from Ref. 42.

A second case of transfer of a point mutation was found by sequencing the product from a cotransformation of pRW3216 (containing (CTG·CAG)100) and pRW3239 using the pACYC184 primer (4245). This primer hybridizes to the distal end from the original location of the two G-to-A interruptions. This product revealed a G-to-A interruption at position 26 at the distal end. Since sequence analyses of pRW3216 done prior to its use in the two-plasmid recombination system showed the absence of G-to-A mutations, this "new" interruption must be the result of two or more crossover events (gene conversion). Hence, this clone must be derived from a multimer of the recombinant structure (Fig. 3B, lower left) that contains several tandem copies of the pUC19 plasmids. These two cases of the exchange of point mutations are rare.

Interestingly, these results directly demonstrate the recombination-based expansion without the exchange of flanking sequences. Thus, the expansion and the presence of an interruption in the TRS flanking the pUC19 vector sequence is the result of gene conversion or crossing over (Fig. 5).

Inversions within CTG·CAG-- In addition to the length polymorphisms and the exchange of point mutations, a plasmid (pRW4444) was isolated that had a switch in the type of repeating sequence in the two complementary strands. This unique deletion product of a (CTG·CAG)100 sequence must have undergone a recombination event to yield [(CTG)13 (CAG)67]·[(CTG)67(CAG)13] as the repeat sequence in the isolated product. SURE cells were transformed with pRW4404 (which contains (CTG·CAG)100 cloned into the SmaI-EcoRI sites of pACT2). DNA sequence analyses of the isolated product plasmid (pRW4444) revealed [(CTG)13(CAG)67]·[(CTG)67 (CAG)13] as the repeat sequence and restriction mapping confirmed the presence of a PstI site (CTGCAG) at the center of the sequence inversion. Hence, this unusual result directly demonstrates the occurrence of an inversion event, the change of orientation of the sequence relative to outside markers, which is likely due to the formation of a slipped-strand structure (52) with staggered single-stranded loops which became rehybridized to form intrahelical pseudoknots, theta shape, figure eight, and bow-shaped structures (Fig. 6) (34). These unorthodox conformations may exist in vivo due to the facile slippage of the CTG and CAG complementary strands relative to each other. A type of "illegitimate" recombination event must have occurred across the four-stranded intersection of one of these rehybridized structures to generate the [(CTG)13 (CAG)67]·[(CTG)67(CAG)13] product (Fig. 6, bottom).


Fig. 6.   Cartoon of possible mechanism to form insert in pRW4444 by TRS loop rehybridization. The box at the top of the figure represents a 300-bp sequence of (CTG·CAG)100 within pRW4404. The CTG·CAG tract adopts slipped structures by misalignment of the complementary strands by slippage (38, 45, 52). The staggered, single-stranded loops may rehybridize and, depending on the alignment of the loops, generate three forms of intrahelical pseudoknots: theta shape, figure eight, and bow-shaped structures. A type of illegitimate recombination (redrawn from Ref. 34) within these structures can result in the formation of the inversion found in pRW4444 (bottom).

A second example of this type of recombination event that elicited a switch in the repeating sequence of the two complementary strands was found in another colony isolated from the recultivation of pRW4404 in SURE cells (as described above). The DNA sequence of the isolated products (pRW4445) revealed [(CTG)15(CAG)85]·[(CTG)85(CAG)15] as the undeleted but rearranged TRS insert. We presume that this product was formed by the mechanisms described above (Fig. 6) for pRW4444.
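The strand notation used for these inversion products, and the PstI site at the center of the inversion, can be verified directly from the sequences. The following is a minimal sketch that builds the idealized inserts as plain strings; the helper names are illustrative, and flanking vector sequence is not modeled.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
PSTI_SITE = "CTGCAG"

def reverse_complement(seq):
    """Return the complementary strand, read 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def inverted_tract(n_ctg, n_cag):
    """Idealized strand of the form (CTG)n_ctg(CAG)n_cag."""
    return "CTG" * n_ctg + "CAG" * n_cag

for n_ctg, n_cag in [(13, 67), (15, 85)]:  # inserts reported for pRW4444 and pRW4445
    top = inverted_tract(n_ctg, n_cag)
    bottom = reverse_complement(top)
    print(
        f"(CTG){n_ctg}(CAG){n_cag}:",
        "complement is (CTG)%d(CAG)%d:" % (n_cag, n_ctg),
        bottom == inverted_tract(n_cag, n_ctg),
        "| PstI sites:", top.count(PSTI_SITE),
        "at 0-based offset", top.find(PSTI_SITE),
    )

# An uninterrupted (CTG)100 strand contains no PstI site.
print(("CTG" * 100).count(PSTI_SITE))
```

In this idealized model each inverted insert carries exactly one CTGCAG site, located at the CTG-to-CAG junction, whereas the uninterrupted parental tract has none. This is consistent with the restriction mapping result reported for pRW4444.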

In summary, these DNA sequence analyses provide direct biochemical evidence for recombination between the TRS, confirming the genetic observations.

    DISCUSSION

This report describes the direct demonstration that recombination mediates expansion and contraction of CTG·CAG repeats. The expansion events are dependent on the presence of long CTG·CAG sequences in the two-plasmid recombination system and require recombination-proficient cells to give frequent, severalfold expansions. Recombination was proven genetically and biochemically by the following: (a) the presence of both amp and tet resistances in the recombinant products; (b) the formation of long cointegrant DNAs; (c) the expansion of (CTG·CAG) tracts by DNA sequencing and by restriction mapping for the longest tracts; (d) the transfer of G-to-A polymorphisms from the TRS in the pACYC184 derivative to the TRS in the pUC19 derivatives; and (e) inserts with strand inversions (i.e. [(CTG)n(CAG)m]·[(CTG)m(CAG)n]). These results are in stark contrast to prior investigations in recombination-deficient E. coli and in yeast where expansions were substantially less frequent than deletions by a ratio of approximately 1:100 (23, 28-32). For recombination-proficient cells, the ratio of expansions to deletions is as high as 100:1. This conclusion is derived from more than 50 cotransformation experiments of pRW3239 with any one of four pUC derivatives where the lengths of the PstI-SacI fragments containing the TRS were analyzed for expansions and deletions by polyacrylamide gel electrophoresis.

When a plasmid (pRW3036) containing (CTG·CAG)36 was propagated alone in recA+ cells, multimers were observed as expected (47, 53), but the TRS length was unaltered and no RB was observed. However, when this plasmid was cotransformed with a pACYC184 derivative containing (CTG·CAG)175, RB was observed by gel electrophoresis and substantial expansion of the 36 repeat tract was found in the amp-tetr colonies indicating the involvement of recombination in expansion. Similar results were found when pRW3080 or pRW3047 replaced pRW3036. The reason why no recombination-based expansions and no RB were observed when the pUC19 derivatives containing (CTG·CAG) sequences were propagated alone is uncertain but may be due to some property of a second non-homologous plasmid for initiating recombination in the presence of both tet and amp. Neither vector contains a χ site.

The plasmid copy number in the single and dual transformations was analyzed. The number of copies of the singly transformed DNAs was ~50 to 1 for pRW3080 and pRW3239, respectively. However, when the two DNAs were cotransformed and selected for both tet and amp resistance, RB was found as expected, and the ratio of the pRW3080 and 3239 in RB was ~20 to 1. These data are the average of five experiments on the EcoRI-linearized DNAs after agarose gel electrophoretic fractionation (both DNAs contain a single EcoRI site). Hence, the replication of recombinant multimers in RB (Fig. 3B, lower left) may increase the copy number of the pRW3239 component due to the dominance of the ColE1 origin in pRW3080. It is noteworthy that cotransformation of pRW3080 (or pRW3036 or 3047) with pRW3239 into recombination-deficient cells (JC10289 or JC5519) results in no formation of RB and no expansion of the CTG·CAG tracts in the pUC derivatives as revealed by restriction mapping.

The extent of expansions observed in this system was severalfold the original lengths. The point of recombination initiation lay within the CTG·CAG repeat sequences since no recombination occurred in other regions of the plasmids as observed by restriction mapping. DNA sequencing revealed that when changes in the size of the plasmids did occur, the changes were within the TRS, not in the vectors. Although alterations in repeat length might have occurred by replication-based events including repair, all such events were controlled because the strains were equally competent to carry out replication-based instability reactions (23, 28-30). Hence, this two-plasmid recombination system may be considered as a primitive model of two eucaryotic chromosomes that harbor various alleles of CTG·CAG.

The expansion of CTG·CAG by recombination without the exchange of flanking markers can be explained by the double-strand break repair model (Fig. 5). The simple repeating nature of the CTG·CAG sequence in the presence of RecA may cause the pairing of sites for the alignment of the double-strand break gap into a homologous template. The ends of the break may be displaced followed by strand invasion and subsequent DNA synthesis (separated by length k) to extend the chains followed by resolution to generate an expanded CTG·CAG. This model directly explains our results found for the recombinant product of the cotransformation of pRW3239 (containing two G-to-A interruptions) and pRW3080 (containing an uninterrupted tract of (CTG·CAG)80); the product had a G-to-A interruption transferred to the CTG·CAG tract in pRW3080. Thus, the template for DNA repair or a portion of the exchanged DNA tract contained the G-to-A interruption and the expansion occurred without the exchange of flanking sequences (gene conversion).

The DNA replication fork stalls when it encounters a CTG·CAG sequence (3, 4) which can result in double-stranded breaks (18, 19, 35, 54, 55). These gaps provide binding sites for RecA which processes the DNA ends via recombinational repair. Premature termination and replication fork collapse require recombinational repair for replication to continue. Thus, it is likely that a complex interrelationship exists between replication and recombination functions in vivo. Considering the data described herein, examples were observed of expansions both with and without exchange of flanking sequences. Hence, both simple homologous recombination and gene conversion (Fig. 5) were observed in E. coli as found in patients (see Introduction).

CTG·CAG tracts longer than 30 units are effective sites for homologous recombination since the recovery of both drug resistance markers occurred at only ~1% of the frequency in the absence of the TRS. The length of the CTG·CAG tract of approximately 90 bp as a minimum for efficient recombination is reminiscent of the threshold observed for expansion from the normal to the premutation stage (1, 2) in DM. Also, this length is in good agreement with prior determinations on the required extent of sequence identity (50-75 bp) for homologous recombination in E. coli (36, 37), especially considering that substantially different types of sequences were studied in rather disparate systems. Longer CTG·CAG tracts that are flexible and writhed (26) have a propensity to slip (38, 52) which may initiate recombination. Prior work (39-43) revealed the recombinogenic properties of simple, direct repeat sequences.

These studies also revealed the requirement for a perfect homologous CTG·CAG sequence in an antiparallel arrangement on both vectors. Attempts to achieve recombination between CTG·CAG and GTC·GAC failed. Hence, parallel DNA (27) is not a substrate for this recombination system. Likewise, correct Watson-Crick pairing to the extent of 66% is ineffective since recombination between CTG·CAG and CGG·CCG failed also. Furthermore, the expansions of the CTG·CAG repeats in the two-plasmid system were independent of the orientations of the TRS in the vectors (Footnote 2).

TRS instability by gene conversion (unequal crossing-over) is a robust process, and thus, this mechanism along with DNA replication may contribute to the length polymorphisms observed in human diseases. CTG·CAG seems to have special properties for recombination. Whereas the reason for this behavior is uncertain, prior investigations of nucleosome assembly (44), expansion by replication (29), conformational flexibility and writhing (26, 38), capacity for adopting hairpin loops (23, 28-32, 52), and susceptibility for double-strand breaks in vivo (18) revealed its unorthodox character. A prior review has summarized the molecular similarities between studies in humans and E. coli related to hereditary neurological diseases (45). We have no evidence on the extent to which, if at all, recombination is responsible for TRS expansions in humans. However, E. coli has been a useful model to investigate the molecular processes responsible for other events related to these instabilities (45, 56).

Since CTG·CAG repeats are recombinogenic and since genetic recombination utilizes enzyme systems different from replication, new avenues for therapeutic intervention strategies in human hereditary neurological diseases may be developed using somatic cell gene therapy.

    ACKNOWLEDGEMENTS

We thank Dr. Robert Gellibolian for the preparation and characterization of pRW3041, Dr. Adam Jaworski for cloning pRW4404 and -4444, and Dr. John Wilson for advice.

    FOOTNOTES

* This work was supported by National Institutes of Health Grants GM52982 and NS37554 and the Robert A. Welch Foundation. The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

Dagger Part of the Genetics Graduate Program at Texas A&M University.

§ To whom correspondence should be addressed: Institute of Biosciences and Technology, Center for Genome Research, Texas A&M University, Texas Medical Center, 2121 W. Holcombe Blvd., Houston, TX 77030. Tel.: 713-677-7651; Fax: 713-677-7689; E-mail: rwells@ibt.tamu.edu.

2 J. P. Jakupciak and R. D. Wells, manuscript in preparation.

    ABBREVIATIONS

The abbreviations used are: TRS, triplet repeat sequences; DM, myotonic dystrophy; bp, base pair(s); kbp, kilobase pair(s); amp, ampicillin; tet, tetracycline; RB, recombinant band.

    REFERENCES

1. Wells, R. D., and Warren, S. T. (eds) (1998) Genetic Instabilities and Hereditary Neurological Diseases, Academic Press, Inc., San Diego
2. Paulson, H. L., and Fischbeck, K. H. (1996) Annu. Rev. Neurosci. 19, 79-107
3. Samadashwily, G. M., Raca, G., and Mirkin, S. M. (1997) Nat. Genet. 17, 298-304
4. Ohshima, K., and Wells, R. D. (1997) J. Biol. Chem. 272, 16798-16806
5. Tsilfidis, C., MacKenzie, A. E., Mettler, G., Barcelo, J., and Korneluk, R. G. (1992) Nat. Genet. 1, 192-195
6. O'Hoy, K. L., Tsilfidis, C., Mahadevan, M. S., Neville, C. E., Barcelo, J., Hunter, A. G. W., and Korneluk, R. G. (1993) Science 259, 809-810
7. Van den Ouweland, A. M. W., Deelen, W. H., Kunst, C. B., Uzielli, M.-L. G., Nelson, D. L., Warren, S. T., Oostra, B. A., and Halley, J. J. (1994) Hum. Mol. Genet. 3, 1823-1827
8. Losekoot, M., Hoogendoorn, E., Olmer, R., Jansen, C. C. A. M., Oosterwijk, J. C., Van den Ouweland, A. M. W., Halley, D. J. J., Warren, S. T., Willemsen, R., Oostra, B. A., and Bakker, E. (1997) J. Med. Genet. 34, 924-926
9. Brown, W. T., Houck, G. E., Jr., Ding, X., Zhong, N., Nolin, S., Glicksman, A., Dobkin, C., and Jenkins, E. C. (1996) Am. J. Med. Genet. 64, 287-292
10. Zhong, N., Kajanoja, E., Smiths, B., Pietrofesa, J., Curley, D., Wang, D., Ju, W., Nolin, S., Dobkin, C., Ryynanen, M., and Brown, W. T. (1996) Am. J. Med. Genet. 64, 226-233
11. Jansen, G., Willems, P., Coerwinkel, M., Nillesen, W., Smeets, H., Vits, L., Howeler, C., Brunner, H., and Wieringa, B. (1994) Am. J. Hum. Genet. 54, 575-585
12. Warren, S. T. (1997) Science 275, 408-409
13. Muragaki, Y., Mundlos, S., Upton, J., and Olsen, B. R. (1996) Science 272, 548-551
14. Tishkoff, S. A., Goldman, A., Calafell, F., Speed, W. C., Deinard, A. S., Bonne-Tamir, B., Kidd, J. R., Pakstis, A. J., Jenkins, T., and Kidd, K. K. (1998) Am. J. Hum. Genet. 62, 1389-1402
15. Richards, R. I., Holman, K., Kozman, H., Kremer, E., Lynch, M., Pritchard, M., Yu, S., Mulley, J., and Sutherland, G. R. (1991) J. Med. Genet. 28, 818-823