INTRODUCTION
Genetic instabilities (expansions and deletions) of simple
repeating sequences are important in the life cycles of both
prokaryotic (1) and eukaryotic (2) cells. This fundamental mechanism of
mutagenesis has been found in mycoplasma, bacteria, yeast, mammalian
cell cultures, and in humans. In mycoplasma and bacteria, these genetic
polymorphisms are the basis for phase variations, which control the
expression of genes (3-7). In humans, the expansions and deletions of
simple repeating sequences are closely tied to the etiologies of
cancers (8-12) as well as hereditary neurological diseases (reviewed
in Ref. 2).
The general mechanism accepted for all of these instabilities is
slipped strand mispairing, which allows mismatching of neighboring repeats and, depending on the strand orientation, enables the insertion
or deletion of repeats during DNA polymerase-mediated duplication
(reviewed in Refs. 2 and 13). The enzymatic machineries involved
include DNA replication and repair (nucleotide excision repair,
methyl-directed mismatch repair, and DNA polymerase III proofreading)
(2, 13). Biochemical and genetic studies also showed that expansions
and deletions of triplet repeat sequences (TRS)
occur in vivo by homologous recombination (14,
15). These investigations, carried out in a two-plasmid system,
demonstrated that the expansion mechanism is principally gene
conversion rather than unequal crossing-over (15).
In the present work, we cloned two triplet repeat tracts in the same
plasmid and used an intramolecular assay to study the recombinational
properties of the CTG·CAG sequences (Fig. 1). Intramolecular
recombination systems have been widely used to investigate the
mechanism of the recombination processes (16-24) as well as to
establish the recombinational properties of different DNA sequences,
including microsatellites (25).
It has been shown that recombination between TRS tracts can lead to
repeat expansion (14, 15, 26). Evaluation of the frequency of
recombination between CTG·CAG tracts provides important information
about the cellular mechanisms of instability, relative to replication
and repair.
Here we have developed the first genetic assay for monitoring the
frequencies of intramolecular recombination between CTG·CAG tracts in
Escherichia coli. We show that long CTG·CAG repeat sequences associated with myotonic dystrophy are preferred sites for
intramolecular recombination. In our companion paper (27), we have
established a genetic assay for monitoring the recombination frequency
of the CTG·CAG repeat tracts in an intermolecular system.
EXPERIMENTAL PROCEDURES
Parent Plasmids--
pRW3244, pRW4026, pRW3246, and pRW3248 were
the parent plasmids containing (CTG·CAG)n tracts used for
these experiments; these pUC19NotI derivatives contain the
(CTG·CAG)n tracts cloned into the HincII site of
the polylinker (28-30). For nomenclature of the TRS, CTG·CAG
designates a duplex sequence of repeating CTG, which may also be
written TGC or GCT; CAG, the complementary strand, may also be written
as AGC or GCA. The orientation is 5' to 3' for both designations of the
antiparallel strands. pRW3244 contains (CTG·CAG)17,
pRW4026 contains (CTG·CAG)67, pRW3246 contains
(CTG·CAG)98, and pRW3248 contains
(CTG·CAG)175. The (CTG·CAG)175
sequence is not a pure CTG·CAG tract but contains two G to A
interruptions at repeats 28 and 69 (30); all other TRS are pure (not
interrupted). All of these sequences have non-repeating human flanking
sequences (19 and 41 bp) outside the repeated tract. pRW38152 is a
pUC18NotI derivative and contains the
(GTC·GAC)79 tract. These plasmids were maintained in
E. coli HB101 (Invitrogen) (mcrB, mmr, hsdS20
(rB−, mB−), recA1, supE44, ara14,
galK2, lacY1, proA2, rplS20 (SmR),
xyl5, λ−, leuB6, mtl-1). The
(CTG·CAG)n and (GTC·GAC)n sequences were subcloned
into pBR322.
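The nomenclature described above (CTG may also be written TGC or GCT, and the complementary strand CAG as AGC or GCA) can be illustrated with a short sketch. The helper names below are hypothetical and are not part of the original work; the code only shows that the alternative spellings are cyclic rotations of the same repeat unit read 5' to 3', and that the isomeric GTC·GAC duplex belongs to a different repeat family.

```python
# Illustrative helpers for the repeat nomenclature; all function names are hypothetical.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def equivalent_registers(unit: str) -> list[str]:
    """All cyclic rotations of a repeat unit (the same tract read in each register)."""
    unit = unit.upper()
    return [unit[i:] + unit[:i] for i in range(len(unit))]


def same_duplex_repeat(unit_a: str, unit_b: str) -> bool:
    """True if unit_b names the same duplex repeat as unit_a (either strand, any register)."""
    family = set(equivalent_registers(unit_a))
    family |= set(equivalent_registers(reverse_complement(unit_a)))
    return unit_b.upper() in family


print(equivalent_registers("CTG"))                      # ['CTG', 'TGC', 'GCT']
print(equivalent_registers(reverse_complement("CTG")))  # ['CAG', 'AGC', 'GCA']
print(same_duplex_repeat("CTG", "GCA"))                 # True
print(same_duplex_repeat("CTG", "GTC"))                 # False
```

The last line reflects why GTC·GAC can serve as an isomeric but non-homologous control: it has the same base composition as CTG·CAG but cannot be obtained from it by changing register or strand.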
Cloning of (CTG·CAG)n and (GTC·GAC)n
Sequences into pBR322--
The general strategy of this investigation
involved recloning of the (CTG·CAG)n and (GTC·GAC)n
sequences from pUC19NotI and pUC18NotI
derivatives (28), respectively, into pBR322. Fragments containing the
CTG·CAG and the GTC·GAC TRS were prepared from these plasmids by
digesting the pUC19NotI or pUC18NotI derivatives
with EcoRI and HindIII (New England Biolabs, Inc.) followed by filling-in the recessed 3' termini with 0.1 unit of
the Klenow fragment of E. coli DNA polymerase I (U. S. Biochemical Corp.) and the four dNTPs (0.1 mM each). In the
case of pRW4806 (a pUC19NotI derivative harboring a tract of
165 uninterrupted CTG·CAG repeats), the insert was prepared by
AluI digestion. The blunt-ended DNA fragments were used for
cloning to obtain plasmids containing the TRS tracts in both
orientations relative to the unidirectional ColE1 origin of
replication. The digested DNA was electrophoresed in a 7%
polyacrylamide gel and stained with ethidium bromide, and the bands
containing the triplet repeat fragment were excised. The DNA was eluted
from the excised bands, purified by phenol-chloroform extraction, and
precipitated with ethanol (31). The vector was prepared by digesting
pBR322 with EcoRI and HindIII followed by filling
in the recessed 3' termini as described earlier. The vector and the
insert were mixed at a molar ratio of ~1:10 and ligated for 14 h
at 16 °C by the addition of 20 units of T4 DNA ligase (U. S.
Biochemical Corp.). The ligation mixture was ethanol-precipitated and
transformed into E. coli HB101 by electroporation (2.5 kV,
0.2-cm cuvette) and plated on LB agar plates containing 100 µg/ml ampicillin. Plasmid DNA was isolated from individual
transformants by the Wizard Plus Miniprep DNA Purification System
(Promega). Clones containing the CTG·CAG repeats in orientations I
and II (defined in Refs. 28 and 30) were obtained and characterized by
restriction mapping. The inserts cloned into the
EcoRI/HindIII site of pBR322 are referred to as
"X inserts" (Fig. 2). The pBR322 derivatives containing a single
CTG·CAG sequence (the "X insert") were subsequently used to clone
the second TRS tract (CTG·CAG or GTC·GAC) into the PvuII
site at position 2064 of the pBR322 backbone (Fig. 2, Y insert). The same experimental approach was used to clone the second TRS insert, except that after ligation the reaction mixture was
subjected to PvuII digestion to eliminate plasmids lacking the insert. This strategy enabled the construction of a family of
plasmids harboring two homologous TRS tracts oriented as direct repeats
or inverted repeats as well as plasmids containing non-homologous repeats (Fig. 2).
All plasmids were characterized by restriction mapping (to determine
the orientation and length of the cloned TRS) and dideoxy sequencing of
both strands with ThermoSequenase Radiolabeled Terminator Cycle
Sequencing Kit (U. S. Biochemical Corp.). The sequencing reactions
were carried out according to the manufacturer's recommendations using
the following pBR322 specific primers: pBR322EcoRI, GTATCACGAGGCCCT which 3'-terminates at the pBR322 map position 4347 (New England Biolabs, Inc.); pBRHR, GCGTTAGCAATTTAACTGTGAT which 3'-terminates at
the pBR322 map position 49 (Genosys Inc.); pBRPF, GCTTCACGACCACGCTGAT which 3'-terminates at the pBR322 map position 2052 (Genosys Inc.); pBRPR, GTCAGAGGTTTTCACCGTCAT which 3'-terminates at the pBR322 map
position 2087 (Genosys Inc.). The products of the sequencing reactions
were analyzed on 6% Long Ranger gels (FMC BioProducts) containing 7.5 M urea in the glycerol tolerant gel buffer (U. S.
Biochemical Corp.). The gels were dried and exposed to x-ray film.
Cloning of Non-repeating DNA Sequences into pBR322--
Two
different non-repeating sequences were used as controls in this study:
the 564-bp fragment of λ phage DNA (HindIII fragment from
nucleotide position 36895 to 37459) and the 354-bp fragment of the
human DMPK gene (part of the exon 7 and intron 7) (32-35). pRW4804 and
pRW4805 were constructed by digestion of λ phage DNA with
HindIII and cloning one of the released restriction
fragments (564 bp) into the HindIII and PvuII
sites of pBR322. Thus, the two plasmids, pRW4804 and pRW4805, harbor
direct and inverted repeats, respectively (Fig. 2).
The exon 7/intron 7 fragment of the human DMPK gene used for
construction of pRW4871 and pRW4873 was obtained by PCR amplification of the sequence from the human genomic DNA. The PCR was carried out in
a volume of 20 µl containing 50 ng of genomic DNA, 1.5 mM
MgCl2, 50 mM KCl, 10 mM Tris/HCl,
pH 8.3, 200 µM of each dNTP, and 0.2 units of
Pfu Turbo DNA polymerase (Stratagene). The PCR primers DM7F,
GGCTCGAGACTTCATTCAGC, and DM7R, TAGATGGGCACAGAGCAGGT, were used at the
concentration 1 µM. Amplification on a PCR System 9700 (Applied Biosystems) involved 35 cycles: 20 s/95 °C, 20 s/58 °C,
and 40 s/72 °C. PAGE-purified PCR product was phosphorylated using 2 mM ATP and 5 units of T4 polynucleotide kinase (New England Biolabs, Inc.) and cloned into the HindIII and
PvuII sites of pBR322. pRW4871 as well as pRW4873 contain
homologous sequences oriented as direct repeats; however, the
orientations of the pairs of inserts are opposite in these two plasmids.
Cloning of the Green Fluorescent Protein Gene (GFP) into pBR322
Derivatives--
pBR322 and pBR322 derivatives containing direct,
inverted, non-homologous repeats and non-repeating DNA sequences were
digested with EcoRV and EagI (positions 185 and
939 on the pBR322 map, respectively) to remove the 754-bp DNA fragment
of the vector backbone. The digested plasmids were purified by 5%
acrylamide gel electrophoresis as described earlier and ligated to the
GFPuv gene (36). The GFPuv gene was
obtained by digestion of the pGFPuv (CLONTECH Laboratories, Inc.) with
PvuII and EagI (positions 56 and 1078 on the
pGFPuv map, respectively). After ligation and transformation into
E. coli HB101, transformants were screened using a long-wavelength
UV lamp. The cells carrying plasmids with the
GFPuv gene emitted a strong green fluorescence.
The GFP cassette from pGFPuv contains the GFPuv variant of the green
fluorescent protein gene inserted in-frame with the lacZ initiation codon from pUC19 so that a
β-galactosidase-GFPuv fusion protein is expressed from the lac promoter in E. coli.
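The fragment sizes implied by the map positions quoted above can be checked with simple arithmetic. The sketch below is illustrative only: the helper name is hypothetical, and it approximates the length of a fragment as the difference between the two quoted map positions, ignoring the exact cut offsets within each recognition site.

```python
def fragment_length(start_pos: int, end_pos: int) -> int:
    """Approximate length (bp) of the fragment between two map positions."""
    if end_pos <= start_pos:
        raise ValueError("end_pos must be greater than start_pos")
    return end_pos - start_pos


# Map positions quoted in the text.
removed_from_vector = fragment_length(185, 939)  # EcoRV-EagI fragment of pBR322
gfp_insert = fragment_length(56, 1078)           # PvuII-EagI fragment of pGFPuv

# Approximate net change in plasmid size after exchanging the two fragments.
net_change = gfp_insert - removed_from_vector

print(removed_from_vector, gfp_insert, net_change)  # 754 1022 268
```

Under this approximation, the first value reproduces the 754-bp vector fragment stated in the text; the insert size and net size change are derived values, not figures reported by the authors.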
Conditions of Bacterial Growth for Recombination
Studies--
For determinations of recombination properties, plasmids
containing TRS tracts were electrophoresed in 1% agarose gels, and bands corresponding to the supercoiled plasmids were excised from the
gels, transferred into dialysis tubes, and electroeluted (31). To avoid
DNA damage, plasmid purifications were performed without ethidium
bromide staining and UV irradiation of DNA. In all experiments, only
gel-purified, supercoiled plasmid DNA was used for transformation of
the appropriate E. coli strains. To ensure identical
conditions for experiments with all plasmids studied, a large batch of
the competent cells was prepared for each set of experiments, and the
transformations were always done in parallel. The transformants were
cultured, harvested, and analyzed under the same conditions. The
following E. coli strains were used: AB1157 (37) and its recombination-deficient derivative JC10289 (thr-1, ara-14, leuB6,
Δ(gpt-proA)62, lacY1, tsx-33,
glnV44(AS), galK2, λ−, rac−,
hisG4(Oc), rfbD1, mgl-51,
Δ(recA-srl)306,
srlR301::Tn10, rpsL31(strR),
kdgK51, xylA5, mtl-1, argE3(Oc), thi-1).
Strains were obtained from the E. coli Genetic Stock Center, Yale University, New Haven, CT. In the population experiments, the
transformation mixture was inoculated into 10-ml LB tubes containing
100 µg/ml ampicillin at a cell density of 10^2 cells/ml.
The cultures were grown at 37 °C with shaking at 250 rpm. At late
log phase (A600 ~1.0 units), the cells were
harvested, and the plasmid DNA was isolated as described above and
analyzed by restriction digestion.
To determine the frequency of recombination, plasmids harboring the
GFPuv gene were transformed into the appropriate
E. coli strain, plated onto LB plates containing 100 µg/ml
ampicillin, and incubated for 16 h at 37 °C. The frequency of
recombination was measured as the ratio of the number of white colonies
to the total number of viable cells. The white as well as a
representative number of fluorescent colonies were inoculated into 10 ml of LB medium (containing ampicillin at 100 µg/ml). After overnight
growth, the plasmids were isolated and subjected to the restriction and DNA sequencing analyses. The statistical analyses were performed using
SigmaStat version 2.03.
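The frequency defined above (white colonies divided by the total number of viable cells) can be written as a small calculation. This is a minimal sketch: the function name and the colony counts in the example are hypothetical and do not come from the experiments reported here.

```python
def recombination_frequency(white_colonies: int, total_colonies: int) -> float:
    """Ratio of white (recombinant) colonies to all viable colonies scored."""
    if total_colonies <= 0:
        raise ValueError("total_colonies must be positive")
    if not 0 <= white_colonies <= total_colonies:
        raise ValueError("white_colonies must lie between 0 and total_colonies")
    return white_colonies / total_colonies


# Hypothetical plate counts, for illustration only.
example = recombination_frequency(white_colonies=12, total_colonies=4000)
print(f"{example:.2e}")  # 3.00e-03
```

Counts from replicate plates would normally be pooled or averaged before comparison between plasmids; the statistical treatment used in the study is not reproduced here.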
This genetic assay enabled the detection and quantitation of the
recombination events that occurred directly after transformation of the
parental plasmids into the host cells. For recombination events that
took place at a later stage of colony formation to be detected,
the recombination product would have to outgrow the parental
plasmid molecules (which are present in large excess at the moment of
the recombination event). The recombinant plasmid would therefore need
a substantial replication advantage over the parental plasmids.
This possibility is ruled out by the results of the copy number
analyses (see "Results").
In addition, the white and the fluorescent colonies are stable.
Randomly selected fluorescent colonies (350 total) were inoculated into
one bulk culture, mixed, and then plated on plates containing ampicillin. After overnight growth, no white colonies were observed among ~2 × 10^5 colonies screened. The same
experiment was repeated for the white colonies, which revealed no
fluorescent colonies among ~10^5 white colonies
analyzed. These results indicate that the "color of the colony"
(i.e. recombination status of the plasmid) is established at
the earliest stage of the colony formation, and masking or overgrowing
of the cells to alter the apparent color (e.g. fluorescent cells by the white ones or vice versa) is highly unlikely.
Determination of E. coli Growth Rates and Plasmid Copy
Numbers--
To ensure that results of the population experiments are
not biased by the growth advantage of cells containing recombination products over the cells containing parental plasmids, the
doubling time of E. coli cells harboring either
recombination substrates or the recombination products with CTG·CAG
tracts of different lengths and orientations was established. The
determination of the doubling time and plasmid copy numbers as well as
the recombination studies were carried out under identical conditions
of bacterial growth (10-ml LB tubes containing 100 µg/ml ampicillin,
37 °C with shaking at 250 rpm). In each case,
~10^2-10^3 cells/ml were used to start the
cultures. Aliquots of 10 µl were withdrawn every 30-60 min for
~8 h, diluted in LB, and subsequently plated on agar plates without
ampicillin. The growth curves were prepared using SigmaPlot 2000 version 6.10, and the doubling time was calculated as described
previously (38).
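The exact procedure of Ref. 38 is not reproduced here. As a hedged illustration, the sketch below assumes simple exponential growth and estimates the doubling time from a least-squares fit of ln(viable count) against time; the function name and the synthetic data are hypothetical.

```python
import math


def doubling_time(times_min: list[float], cfu_per_ml: list[float]) -> float:
    """Doubling time (min) from a log-linear fit of viable counts vs. time.

    Assumes all points lie in the exponential phase of growth.
    """
    if len(times_min) != len(cfu_per_ml) or len(times_min) < 2:
        raise ValueError("need at least two paired measurements")
    logs = [math.log(count) for count in cfu_per_ml]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, logs))
    growth_rate = sxy / sxx  # slope of ln(CFU/ml) per minute
    return math.log(2) / growth_rate


# Synthetic counts generated with a 22-min doubling time (illustration only).
times = [0, 30, 60, 90, 120]
counts = [1e3 * 2 ** (t / 22) for t in times]
print(round(doubling_time(times, counts), 2))  # 22.0
```

With real plate counts, only time points from the exponential phase should be included in the fit, since lag-phase or stationary-phase points would bias the slope.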
To exclude the possibility of the replicative advantage of
recombination products over the parental plasmids, the copy numbers of
these plasmids were determined as described earlier (39, 40); the size
of the E. coli genome of 4,639 kbp (41) was used for these
calculations. The quantitative analyses of plasmid and genomic DNAs
separated by agarose gels were performed using FluorChem version 3.04 (Alpha Innotech Corp.).
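The details of the published procedure (39, 40) are not given in this section. As a rough sketch, the calculation below assumes that the band intensities are proportional to DNA mass, so that the copy number per genome equals the plasmid-to-genomic mass ratio scaled by the ratio of genome size to plasmid size. The function name and the example intensities are hypothetical.

```python
E_COLI_GENOME_BP = 4_639_000  # genome size used in the text (4,639 kbp)


def plasmid_copy_number(
    plasmid_signal: float,
    genomic_signal: float,
    plasmid_size_bp: int,
    genome_size_bp: int = E_COLI_GENOME_BP,
) -> float:
    """Plasmid copies per genome, assuming signal is proportional to DNA mass."""
    if genomic_signal <= 0 or plasmid_size_bp <= 0:
        raise ValueError("genomic_signal and plasmid_size_bp must be positive")
    mass_ratio = plasmid_signal / genomic_signal
    return mass_ratio * genome_size_bp / plasmid_size_bp


# Hypothetical band intensities for a 5,000-bp plasmid (illustration only).
copies = plasmid_copy_number(plasmid_signal=0.15, genomic_signal=1.0,
                             plasmid_size_bp=5000)
print(round(copies))
```

Because the result scales inversely with plasmid size, parental plasmids and their shorter recombination products must each be evaluated with their own length.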
Agarose and Polyacrylamide Gel Analyses of Recombination
Products--
In order to analyze the products of intramolecular
recombination between repeating sequences, the isolated DNAs were
linearized with AflIII and labeled by end-filling with the
Klenow fragment of E. coli DNA polymerase I and
[α-32P]dATP. The labeled DNAs were separated on 1%
agarose gels in TAE (40 mM Tris acetate, 1 mM
EDTA, pH 8) buffer, and the gels were dried and exposed to x-ray film.
The instabilities of the TRS tracts of the recombination products were
determined using SphI/BamHI digestion followed by
end labeling as described above. The products were resolved in 5-7%
polyacrylamide gels in TAE buffer. The lengths of the CTG·CAG inserts
were calculated as described earlier (29). The primary structures of
more than 35 individual recombination products were determined by
direct DNA sequencing of one or both DNA strands.
RESULTS
Intramolecular System to Study Recombination between
(CTG·CAG)n Sequences--
We used an intramolecular plasmid
system to study recombination between TRS tracts, where two homologous
repetitive sequences are located on the same plasmid molecule and are
separated by non-homologous intervening sequences.
Two homologous TRS tracts present on the same replicon can be oriented
relative to each other as direct or inverted repeats (Fig.
1). The term "orientation" is used in
this study to define the relative directionality between two
recombining homologous sequences (direct and inverted repeats). The
terms "orientation I" and "orientation II" refer to the
orientation of the TRS sequences relative to the origin of replication;
for example, for the plasmids containing
(CTG·CAG)n tracts in orientation I, the CTG repeat is in the
leading strand template, whereas for the plasmids harboring
(CAG·CTG)n tracts, in orientation II, the CTG repeat is in
the lagging strand template (28-30, 42, 43).

Fig. 1.
The products of intramolecular recombination
between CTG·CAG tracts oriented as direct repeats or as inverted
repeats. The homologous recombination between direct repeats
(left panel) leads to the formation of a smaller plasmid
containing only one CTG·CAG tract; the DNA fragment which originally
separated the two TRS tracts (shown as a dotted area) is
deleted. This deleted fragment is inviable due to the absence of an
origin of replication and the ampicillin resistance gene and will
therefore be lost. In the case of the inverted repeats (right
panel), the recombination event between two homologous sequences
leads to the inversion of the sequence separating the repeats shown
here as an inversion of the direction of the replication origin and the
ampicillin resistance gene. The ampicillin resistance gene
(Amp) is designated as a white arrow. The
gray arrow shows the orientation of the unidirectional
origin of replication (ori). A portion of the CTG·CAG
tracts is black and a second portion has a white
background to illustrate the location and the consequences of the
recombination events.
The recombination frequency as well as the types of final products of
the intramolecular recombination event strongly depend on the relative
orientation of the recombining sequences (Fig. 1) (18, 21, 24, 44). The
recombination event between direct repeats may lead to the deletion of
one of the homologous tracts and any intervening sequences between the
repeats (20, 21, 44-47). In the case of homologous TRS tracts, the
intervening sequence separating the repeats will also be deleted.
However, due to their repetitive nature, two homologous CTG·CAG
tracts can align and hybridize with each other in several different
frames (the number of frames equals the tract length in base pairs
divided by 3, i.e. the number of repeats). As a result of possible different alignments of the
CTG·CAG sequences, the length of TRS tracts in the recombination
products may vary from the minimum length required for recombination to
occur to the maximum length determined by the size of both recombining
homologous sequences. The intervening sequence separating the two TRS
tracts is inviable because it lacks an origin of replication as well as
the ampicillin resistance gene and therefore will be lost during cell division.
The recombination event between inverted repeats (Fig. 1, right
panel) can lead to the inversion of the intervening sequence between the homologous repeats (44, 48). This will result in an
inversion of the direction of the ampicillin resistance gene and an
inversion of the origin of replication. However, other types of
products of intramolecular recombination between inverted repeats such
as head-to-head inverted dimers have also been described previously
(18, 24).
Plasmids Containing Direct and Inverted
Repeats--
Intramolecular plasmid systems have been used widely for
investigating the mechanisms of recombination and the influence of different factors on this process (16-19, 21-24). We used this system to investigate recombination between CTG·CAG repeats in E. coli. For this study, we constructed and characterized a family of
pBR322 derivatives (Fig. 2). Various
lengths of CTG·CAG repeats (17, 67, 98, 165 and 175) were cloned into
the EcoRI/HindIII and PvuII sites of
pBR322. Two homologous TRS tracts inserted in both orientations (I and
II) as direct and inverted repeats (Fig. 2, left and
center columns) were separated by ~2,000 bp of the
intervening sequence (Fig. 2, dotted region) and 2,300 bp of
the intervening sequence harboring the unidirectional replication
origin and the ampicillin resistance gene. Introduction of the X TRS
insert into the EcoRI/HindIII site of pBR322
inactivated the tetracycline resistance gene (49). The cloning of the Y
TRS insert into the PvuII site destroyed the rop
gene of pBR322, which mediates the activity of RNA I. The latter
resulted in an elevated copy number of the plasmids (50).

Fig. 2.
Plasmids used in this study. All
plasmids are derivatives of pBR322 and contain two inserts
(X and Y) oriented as direct repeats (left
column) or inverted repeats (central column). Control
plasmids harboring two non-homologous TRS are shown in the right
column. TRS inserts as well as non-repeating DNA sequences were
cloned into the HindIII/EcoRI (X TRS
insert) or PvuII (Y TRS insert) sites of
pBR322 (for details, see "Experimental Procedures"). The inserts
containing 17, 67, 98, and 165 CTG·CAG repeats as well as the
(GTC·GAC)79 insert are homogeneous (i.e. are
perfect repeating sequences and contain no interruptions). The
(CTG·CAG)175 sequence present in pSF3 and pSF4 is not a
pure CTG·CAG tract but contains two G to A interruptions at repeats
28 and 69. The actual sequences of the leading strand templates of the
TRS inserts are shown for all plasmids. Thus, CTG·CAG and CAG·CTG
inserts correspond to orientation I and orientation II, respectively
(28-30). The ampicillin resistance gene (Amp) is designated
as a white arrow. The gray arrow shows the
approximate position and direction of the origin of replication
(ori).
As controls, plasmids with non-repeating homologous sequences instead
of the CTG·CAG repeats were constructed (Fig. 2). Two different
non-repeating DNA fragments were used: the 564-bp fragment of
bacteriophage λ DNA and the 354-bp fragment of exon 7/intron 7 of the
human DMPK gene (see "Experimental Procedures" for details). In
addition, pBR322 derivatives containing one CTG·CAG tract (the X
insert) and its isomeric GTC·GAC sequence (51) (the Y insert) were
constructed (Fig. 2, right column) as controls for
non-homologous TRS tracts in one plasmid.
All plasmids were maintained in E. coli HB101, which is
RecA−. Previous studies (18-20, 23) showed that
intramolecular plasmid recombination is not dependent on the function
of the recA gene product. Thus, even the propagation of the
plasmids in E. coli HB101 to obtain working stocks of
plasmids can cause DNA rearrangements due to RecA-independent
recombination. In addition to the recombination events, cultivation of
E. coli harboring plasmids with TRS tracts leads to the
genetic instability of repeating sequences manifested predominantly as
deletion products (28-30, 51, 52). This applies mainly to the long
uninterrupted CTG·CAG sequences such as (CTG·CAG)67 or
longer (28, 30, 52). To eliminate the possibility of transformation by
plasmids containing either large rearrangements caused by recombination
(e.g. dimers, substantial deletions, duplications) or
smaller deletions within the TRS tracts due to replication errors, all
plasmids were subjected to extensive agarose gel purification. The
purity and sequence integrity of DNA was determined before transformation using restriction analyses and DNA sequencing. Only
plasmids that met the above-mentioned criteria were subsequently used
for transformation experiments.
Recombination between Direct Repeats--
The recombination
behavior of the plasmids shown in Fig. 2 was studied in two
E. coli strains that differed in their
recombination capacity: AB1157 (parent) and JC10289
(RecA−). For all plasmids, both single colony analyses and
population experiments were performed (see "Experimental
Procedures"). The plasmids isolated from E. coli AB1157
and JC10289 (Fig. 3, + and − lanes, respectively) were analyzed by
AflIII digestion. The unique AflIII recognition
site is located between the origin of replication and the Y TRS insert,
about 60 bp from the origin of replication. Thus, large rearrangements
such as dimerization or deletion of the DNA segment between homologous
sequences, which may result from recombination, can be detected.
Restriction analyses of plasmids containing direct repeats (pRW4815,
pRW4817, pRW4819, pRW4821, pRW4823, pRW4825, and pRW4804) isolated from
RecA+ and RecA− E. coli showed
bands of 4,500-5,500 bp in size, corresponding to the starting DNA
(co-migrating on agarose gel with the plasmids used for transformation;
Fig. 3A, lanes C) and shorter DNA fragments at
~2,500-3,000 bp. As revealed by restriction analyses of plasmids isolated from single colonies and DNA sequencing of several clones, the
shorter fragments (at 2,500-3,000 bp) correspond to the recombination products between direct repeats, which harbor only one stretch of
CTG·CAG repeats and lack the intervening sequence separating the two
homologous TRS tracts. The same type of recombination products was
observed in the case of pRW4804 containing the homologous non-repeating
sequences (Fig. 3A).

Fig. 3.
Restriction analyses of products of the
intramolecular recombination events between direct or inverted repeats
or non-homologous sequences. Plasmids were isolated from E. coli AB1157 (RecA+) and JC10289 (RecA−)
cultures that were grown until the late log phase. The DNA was
linearized with AflIII, end-labeled, and electrophoretically
separated through 1% agarose gels in TAE buffer. The starting material
(the plasmids used for transformation of the E. coli
strains) is indicated as C. Recombination proficient and
deficient strains are shown as + or −, respectively. The 1-kbp DNA
ladder (Invitrogen) was used as a size marker, and the sizes of these
bands are indicated (left sides of gels). A,
the results of AflIII digestion of plasmids containing
direct repeats. Brackets designate the full-length plasmids
(~5 kbp) as well as the products of intramolecular deletion due to
recombination between the repeated tracts (at ~3 kbp).
B, AflIII digestion products of plasmids
containing the inverted repeats. C,
AflIII digestion products of control plasmids containing
non-homologous TRS.
On the other hand, the products of intramolecular deletion between
homologous sequences have never been detected for plasmids harboring
inverted repeats (pRW4816, pRW4818, pRW4820, pRW4822, pRW4824, and
pRW4826) nor for plasmids containing CTG·CAG tracts and their
isomeric, non-homologous sequence (pRW4830, pRW4831, pRW4832 and
pRW4833; Fig. 3, B and C).
Thus, we conclude that the predominant products of recombination
between directly repeated CTG·CAG tracts are intramolecular deletions. Moreover, quantitative analyses of the data presented in
Fig. 3A, obtained using PhosphorImager scanning of
radioactively labeled restriction fragments, showed that the amount of
the recombination products was strongly dependent on the length of the
CTG·CAG sequence. In the case of pRW4815 and pRW4817, both containing
(CTG·CAG)17, the recombination product constituted ~1%
of the total DNA isolated. For plasmids harboring
(CTG·CAG)67 (pRW4819 and pRW4821) and
(CTG·CAG)98 (pRW4823 and pRW4825), the recombination
products accounted for ~15 and ~30% of the total DNA,
respectively. The plasmids containing 98 CTG·CAG repeats (410 bp of
homologous sequence including human myotonic dystrophy flanking
sequences and a fragment of the pUC19 polylinker) showed ~20-30
times higher propensity of recombination product formation when
compared with the non-repeating 564-bp phage λ DNA (Fig.
3A, compare pRW4823 and pRW4825 with
pRW4804).
Quantitative analyses of the data presented in Fig. 3A also
revealed that the amount of the recombination products depends on the
orientation of the CTG·CAG sequence relative to the origin of
replication, with orientation II being more recombination-prone than
orientation I (Fig. 3A). This was later confirmed using a genetic assay to study the frequency of recombination between
the direct repeats (see below). The influence of the CTG·CAG orientation on the recombination frequency suggests an important involvement of replication mechanisms, such as polymerase pausing and
induction of DNA nicks, in intramolecular recombination between the
CTG·CAG sequences.
The products of intramolecular deletion between the direct repeats were
detected in both RecA+ and RecA− strains;
however, E. coli JC10289 (RecA−) exhibited a
lower recombination propensity than the isogenic RecA+
cells (Fig. 3A, compare lanes + with −).
These results are in agreement with previous studies
(19, 45) showing that intramolecular plasmid recombination does occur
efficiently independent of the recA gene function, although
the presence of RecA increases the frequency of this process.
Furthermore, the spectrum of recombination products is
different for plasmids containing short stretches of
CTG·CAG as compared with the long tracts. Digestion of the
recombination products from pRW4815 and pRW4817 (containing
(CTG·CAG)17) with AflIII showed a single band
(within the resolution of the agarose gel), but recombination between
the homologous sequences harboring 67 and 98 CTG·CAG repeats gave a
set of products spanning a distance of at least 500 bp (Fig.
3A). This effect might be due to the higher instability of
the longer TRS tracts present in the recombination products; however,
it is more likely that the size variability of the CTG·CAG tracts in
the recombination products increases with the length of the recombining
homologous sequences.
CTG·CAG Length, Orientation, and Number of Tracts Do Not Affect
the Doubling Time of E. coli or the Plasmid Copy Number--
The
quantitation of the data obtained from the population experiments could
be strongly biased, at least in principle, by different
physiologies of bacterial cells containing recombination products versus cells harboring parental plasmids. Two major
processes might cause enrichment of the recombinant plasmids in the
cells and therefore influence the outcome of the experiments. First, the cells containing smaller plasmids with one CTG·CAG tract may have
a growth advantage over cells harboring the larger parental plasmids
with two TRS tracts. Second, the differences in size and the number of
TRS present in the replicon could influence the replicative advantage
(copy number) of one type of plasmid over the other. Thus, studies were
conducted to evaluate the magnitude of these potential influences.
The growth curves of E. coli AB1157 and JC10289 host strains
harboring plasmids with two of the longest, uninterrupted TRS tracts
studied (pRW4863 and pRW4865, with two (CTG·CAG)165
tracts in orientations I and II, respectively) were compared with the growth curves of bacteria harboring recombination products with single
TRS tracts of 21, 58, and ~92 CTG·CAG repeats. The doubling time
(t2) calculated during the exponential phase of
growth was almost identical for E. coli AB1157 harboring
pRW4863 (t2AB = 21.7 ± 1.3 min), pRW4865
(t2AB = 22.0 ± 1.4 min), and for bacteria that harbored recombination products (22.7 ± 1.4, 21.8 ± 1.8, and 23.7 ± 2.1 min, for plasmids containing the single
tracts of 21, 58, and 92 repeats, respectively). The doubling time of E. coli JC10289 was lengthened by 15-25% for all plasmids
studied. Approximately 20% difference in t2
between AB1157 and JC10289 was observed regardless of the presence of
the plasmid. The doubling time of bacteria harboring non-repeating DNA
sequences (pRW4804) was ~5-10% shorter than for E. coli
harboring plasmids with CTG·CAG repeats. There was no statistical
difference in t2 between pRW4804 (19.7 ± 1.8 min) and the recombination product of this plasmid (20.9 ± 0.8 min). Thus, under our experimental conditions, no growth advantage
of cells harboring the recombination products over cells harboring the
recombination substrates was observed. These results are in agreement
with the previous findings (30) that cells harboring plasmids with a
shorter TRS tract ((CTG·CAG)17) do not have a growth
advantage over cells containing plasmids with
(CTG·CAG)175 tract, so long as the cultures were
maintained in the exponential phase of growth (even for several
generations). In contrast, a growth advantage was pronounced
after E. coli passed through the stationary phase
(30), conditions that were never employed in our studies.
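The doubling-time comparison above can be illustrated with a minimal sketch. It assumes that t2 is obtained from a least-squares fit of ln(OD) against time during exponential growth; the fitting procedure, the function name, and the optical density readings are hypothetical and are not taken from the experiments described here.

```python
import math

def doubling_time(times_min, od_values):
    """Estimate the doubling time (min) from exponential-phase readings.

    Fits ln(OD) = a + slope * t by ordinary least squares and returns
    ln(2) / slope. Assumes all readings lie in the exponential phase.
    """
    n = len(times_min)
    logs = [math.log(od) for od in od_values]
    mean_t = sum(times_min) / n
    mean_y = sum(logs) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, logs))
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    slope = sxy / sxx
    return math.log(2) / slope

# Hypothetical readings generated for an ideal culture doubling every 22 min.
times = [0, 11, 22, 33, 44]
ods = [0.05 * 2 ** (t / 22) for t in times]
t2 = doubling_time(times, ods)
```

With real measurements, the fit would be restricted to the time window in which ln(OD) is linear, and the reported ± values would come from replicate cultures.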
In order to determine the difference in the replication propensities of
plasmids with one or two TRS tracts, the copy numbers of pRW4865 (with
two (CTG·CAG)165 tracts), pRW4815 (with two
(CTG·CAG)17 tracts), and the recombination product (with
one CTG·CAG sequence of 21 repeats) were analyzed. By using the
detergent lysis method (39, 40), we observed that the plasmid copy
number is ~10% higher for the recombinant plasmid (148 ± 9 copies per genome) than for pRW4865 (134 ± 7 copies per genome)
or pRW4815 (129 ± 7 copies per genome) in E. coli
AB1157. Thus, the copy number of the plasmid harboring a pair of short
(CTG·CAG)17 tracts is very similar to the copy number of
the plasmid carrying two long 165-repeat tracts. We can also conclude
that the difference in copy numbers between recombination substrates
and products is negligible.
It should be noted that recombination products (~3 kbp plasmids) do
not form multimeric forms (dimers, trimers, etc.) with a high
efficiency while maintained in E. coli AB1157
(RecA+). However, recombination substrates are capable of
forming large amounts of multimeric forms in the recombination
proficient cells. Considering the oligomeric states of the plasmids,
134 copies of pRW4865 and 129 copies of pRW4815 account for 283 and 277 monomer equivalents, respectively. Thus, the copy number as well as the number of monomer equivalents does not depend on the length of CTG·CAG tracts (in the range of 17-165 repeats). Determination of
the copy number of the same plasmids in JC10289 (RecA−)
revealed that both recombination substrates and products are maintained
in E. coli at almost the same copy number.
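The relation between copy number and monomer equivalents can be sketched as follows. The sketch assumes that the reported "copies" count plasmid molecules and that each oligomer of k units contributes k monomer equivalents; the function names and the oligomer distribution are hypothetical illustrations, and only the values 134/283 and 129/277 come from the text.

```python
def mean_oligomeric_state(plasmid_copies, monomer_equivalents):
    """Average number of monomer units per plasmid molecule."""
    return monomer_equivalents / plasmid_copies

def monomer_equivalents(plasmid_copies, oligomer_fractions):
    """Convert a copy number into monomer equivalents.

    oligomer_fractions maps the unit count (1 = monomer, 2 = dimer, ...)
    to the fraction of plasmid molecules in that form.
    """
    if abs(sum(oligomer_fractions.values()) - 1.0) > 1e-9:
        raise ValueError("oligomer fractions must sum to 1")
    return plasmid_copies * sum(k * f for k, f in oligomer_fractions.items())

# Values reported in the text for E. coli AB1157.
state_4865 = mean_oligomeric_state(134, 283)
state_4815 = mean_oligomeric_state(129, 277)

# Hypothetical distribution: 25% monomers, 50% dimers, 25% trimers.
equivalents = monomer_equivalents(100, {1: 0.25, 2: 0.5, 3: 0.25})
```

Under these assumptions both substrates average slightly more than two monomer units per molecule, which is consistent with the roughly twofold excess of monomer equivalents discussed in the following paragraph.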
These experiments showed clearly that parental plasmids with long
repeat tracts do not have a replication disadvantage compared with the
recombination products. Moreover, due to the approximately two times
higher amount of monomer equivalents present in the recombination
proficient cells carrying plasmids with two TRS tracts, the frequency
of recombination events leading to the formation of the smaller
plasmids (calculated from the data presented in Fig. 3A) may
be underestimated rather than overestimated.
In summary, these data show only negligible effects of both the
replicative advantage of recombinant plasmids over parental plasmids
and the growth advantage of cells containing smaller plasmids with one
TRS tract on the outcome of the population experiments shown in Fig. 3.
On the other hand, the data obtained from the biochemical approach must
be interpreted cautiously due to their lower precision, sensitivity,
and statistical significance compared with the genetic assay.
Therefore, we established a genetic assay for determining the frequency
of intramolecular recombination between two TRS tracts (see below).
Recombination between Inverted Repeats--
Experiments with
plasmids containing the inverted CTG·CAG repeats were carried out
under identical conditions as described above for plasmids harboring
direct repeats. In contrast to the plasmids containing direct repeats,
when the two TRS tracts were oriented as inverted repeats, no products
of intramolecular deletions were ever observed (Fig. 3B).
This result was expected (18, 24, 44).
The predicted product of recombination between inverted repeats is a
simple intramolecular inversion as shown in Fig. 1 (right panel) (44, 48). More than two hundred colonies from recombination studies with pRW4816, pRW4818, pRW4820, pRW4822, pRW4824, pRW4826, and
pRW4805 along with DNAs isolated from the population experiments were
analyzed in both RecA+ and RecA− strains.
NheI/AatII digestion was used to identify the
inversion products. These two sites flank the X TRS, and the analysis
of the starting DNA should give rise to two fragments that are
~550-800 and 4100-4350 bp long (the size depends on the number of
CTG·CAG repeats present). If the products of intramolecular inversion due to recombination are formed, two new bands should be detected on
agarose gels (~2450-2700 and 2200-2450 bp in length). Unexpectedly (44, 48), intramolecular inversions were never detected (data not shown).
Intramolecular recombination between inverted repeats can also lead to
the formation of a head-to-tail dimer with complex DNA rearrangements
(18, 24). This kind of recombination product would be easily detected
by AflIII digestion as well as by electrophoresis of
supercoiled plasmid DNA on agarose gels. Neither of these two approaches
showed formation of such recombination products.
Several factors can explain the failure to detect recombination
between inverted repeats. The lower frequency of recombination between
inverted repeats in comparison to direct repeats was reported previously (44); therefore, this process may not be detectable by our
radiolabeling methods. Even in the case of non-repeating sequences
(pRW4804 and pRW4805), the products of recombination between direct
repeats were detected in contrast to those of inverted repeats. By
using biochemical methods (restriction digestion and radioactive
labeling), we were able to detect products of the recombination events
occurring with a frequency of 10⁻⁴. Furthermore, it is
possible that the intrinsic properties of the TRS sequences
(e.g. the capacity to form stable DNA structures (2), pause DNA
polymerases (51, 53, 54), or cause double-strand breaks (55-58))
favor a specific recombination/repair pathway, which in our
experimental conditions strongly promotes recombination between direct
repeats and/or inhibits inverted repeat recombination.
Frequency of Intramolecular Recombination between Direct Repeats
Depends on the Length of the CTG·CAG Tracts--
The GFP gene was
cloned into the region of the plasmids that underwent deletion during
recombination (for details see "Experimental Procedures"). The GFP
cassette contains the GFPuv variant of the green fluorescent
protein, which is expressed in E. coli under the control of
the lac promoter (36). The GFP emits strong green fluorescence when irradiated with long wavelength UV light. The detection of the fluorescence does not require exogenous substrates or
cofactors and is completely independent of the genetic background of
bacterial host cells (59, 60). Plasmids containing the GFP gene
separating two TRS tracts were transformed into E. coli AB1157 and JC10289. In all experiments, the transformations were performed with a large excess of cells over DNA molecules so that each transformant should have arisen from a single plasmid molecule (61).
The white colonies are formed only when the incoming plasmid undergoes
recombination immediately after the transformation, leading
to the loss of the GFP gene located between the direct repeats. When
the incoming plasmid is established in the host cell and replicates
several times, the expression of the GFP gene leads to fluorescent
colony formation. The frequency of recombination was measured as the
ratio of the number of white colonies to the total number of viable
cells (Fig. 4A).
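The frequency calculation described above can be sketched in a few lines. The colony counts below are hypothetical, and averaging the per-experiment frequencies (rather than pooling all colonies) is an assumption about how the mean was taken.

```python
def recombination_frequency(white, fluorescent):
    """Ratio of white colonies to all viable colonies (white + fluorescent)."""
    total = white + fluorescent
    if total == 0:
        raise ValueError("no colonies counted")
    return white / total

def mean_frequency(experiments):
    """Mean of per-experiment frequencies; experiments is a list of
    (white, fluorescent) colony counts."""
    freqs = [recombination_frequency(w, f) for w, f in experiments]
    return sum(freqs) / len(freqs)

def relative_frequency(frequency, reference_frequency):
    """Frequency expressed relative to a reference plasmid (R)."""
    return frequency / reference_frequency

# Hypothetical counts from three independent transformations.
counts = [(30, 2970), (25, 2475), (20, 1980)]
freq = mean_frequency(counts)
r_value = relative_frequency(freq, 0.002)  # hypothetical reference frequency
```

Here each hypothetical experiment yields 1% white colonies, so the mean frequency is 10 × 10⁻³ and R is 5 relative to the hypothetical reference.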

Fig. 4.
Frequency of recombination between direct
repeats. The strategy used for determining the recombination
frequencies is shown in A. The plasmids containing the TRS
sequences shown in Fig. 2 were modified as described under
"Experimental Procedures." Briefly, a 0.8-kbp fragment of the
vector sequence separating the repeats was replaced by the green
fluorescent protein gene. The designations of the plasmids are
identical to those shown in Fig. 2 except for the gfp
suffix. After transformation, the E. coli cells were plated
onto LB plates containing ampicillin (100 µg/ml) and incubated for
16 h at 37 °C. The fluorescence of the colonies harboring the
nonrecombined plasmids was detected by exposing the colonies to a
long-wave UV lamp. B, the recombination frequency was
measured as the ratio of the number of white colonies to the total
number of viable cells (both fluorescent green and white colonies). For
each plasmid, three or more independent experiments were performed, and
at least 7,000 colonies were counted, except for pRW4863gfp and
pRW4865gfp where ~2000 colonies were counted. The frequency was
calculated as the mean of the data collected from all experiments.
R represents relative frequency of recombination and is
calculated relative to the frequency of recombination observed for
pRW4815gfp harboring a pair of (CTG·CAG)17 inserts.
C, the frequency of white colony formation in
experiments conducted with control plasmids in E. coli
AB1157.
Sixteen plasmids containing the GFP cassette were constructed and used
in our experiments (Fig. 4, B and C). The two
major factors found to influence the recombination frequency between direct repeats were the length of the TRS tract and the orientation of
the CTG·CAG sequence relative to the origin of replication.
Fig. 4B shows that plasmids containing short
(CTG·CAG)17 tracts in orientation I recombined in
E. coli AB1157 with a frequency ~5 times lower than
plasmids harboring non-repeating sequences. However, simple inversion
of the orientation (pRW4817gfp) caused a statistically significant
(p < 0.001) 4-fold increase in recombination propensity of the (CTG·CAG)17 sequences. Thus, the
plasmids containing the (CTG·CAG)17 tracts in orientation
II recombined with a frequency similar to non-repeating, homologous DNA
fragments that were 200-400 bp longer.
The effect of length of the homologous TRS regions on the recombination
frequency was dramatic. For CTG·CAG tracts in orientation I,
lengthening the recombining sequences to 67, 98, and 165 repeats increased the rate of recombination 6.5, 8, and 60 times, respectively, in comparison to (CTG·CAG)17. A similar effect of the
length of the recombining sequences was observed also for plasmids
harboring TRS tracts in orientation II (Figs. 4B and
5). Hence, the long CTG·CAG tracts have
a much higher propensity for recombination than shorter tracts;
moreover, the frequency of recombination between the longest
(CTG·CAG)165 sequences studied (pRW4863gfp and
pRW4865gfp) was ~7-10 times higher (p < 0.001) than
the frequency observed for non-repeating sequences of comparable length
(pRW4804gfp). In addition, the level of recombination between the
564-bp long λ phage DNA fragments was only slightly higher (11.9 × 10⁻³)
than for the 354-bp DMPK gene fragments (9.2-10.4 × 10⁻³).
This statistically insignificant
difference (p = 0.07) suggests that the frequency of
recombination does not depend on the length of the recombining
fragments in the case of non-repeating DNA sequences. These results are
in agreement with previous studies (21) showing that the frequency of
recombination between non-repeating DNA sequences (fragments of the
tetracycline resistance gene) oriented as direct repeats increases as
the length of the homologous sequences increases from 14 to 100 bp.
Further lengthening of the repeats (up to 854 bp) had little or no
effect on the recombination frequency (21).
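The fold differences quoted in this section can be checked directly from the frequencies given in the text. The sketch below assumes that the 11.9 × 10⁻³ value for the 564-bp phage DNA fragments is the appropriate non-repeating reference of comparable length; that pairing is an assumption made for illustration.

```python
def fold_change(frequency, reference):
    """Ratio of a recombination frequency to a reference frequency."""
    return frequency / reference

# Frequencies reported in the text (AB1157).
freq_165_orientation_i = 126e-3   # pRW4863gfp
freq_165_orientation_ii = 81e-3   # pRW4865gfp
freq_nonrepeating_564bp = 11.9e-3  # assumed reference control

fold_i = fold_change(freq_165_orientation_i, freq_nonrepeating_564bp)
fold_ii = fold_change(freq_165_orientation_ii, freq_nonrepeating_564bp)
```

Under this assumption the ratios are about 10.6 and 6.8, in line with the ~7-10-fold stimulation stated for the (CTG·CAG)165 tracts.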

Fig. 5.
The effect of length and orientation on
frequency of intramolecular recombination between direct repeat
tracts. Circles represent the plasmids containing the
CTG·CAG cloned in orientation I (pRW4815gfp, pRW4819gfp, pRW4823gfp,
and pRW4863gfp); diamonds represent the plasmids containing
the CTG·CAG cloned in orientation II (pRW4817gfp, pRW4821gfp,
pRW4825gfp, and pRW4865gfp); the triangles
represent the plasmids containing the non-repeating DNA sequences
(pRW4804gfp, pRW4871gfp, and pRW4873gfp). Data from experiments
performed in E. coli AB1157 are shown on the main
graph; the inset shows the data from E. coli
JC10289 (RecA−). The standard deviations are shown by the
error bars. The homologous sequences are composed of the
CTG·CAG repeats as well as the human flanking sequences and segments
of the polylinker.
It should be pointed out that the rate of intramolecular recombination
between long CTG·CAG tracts was extraordinarily high; recombinants
were found with a frequency of 1.5-12.6% (for plasmids containing 98 and
165 CTG·CAG repeats). Therefore, the recombination products could be
easily detected and visualized in plasmids isolated from population
experiments, as shown in Fig. 3A.
CTG·CAG Tracts in Orientation II Are More Susceptible to
Recombination--
Although CTG·CAG tracts stimulate recombination
in both orientations, a pronounced orientation dependence of the
frequency was observed. The frequencies of recombination were 4, 2, and 3.5 times higher for plasmids containing (CTG·CAG)17,
(CTG·CAG)67, and (CTG·CAG)98 in orientation
II, respectively, than for plasmids harboring repeats of the same
length but in orientation I (p < 0.001). Surprisingly,
in the case of the longest tracts studied ((CTG·CAG)165),
the orientation dependence was found to be the opposite. However, the
frequency of recombination for both pRW4863gfp and pRW4865gfp was much
higher than for plasmids harboring shorter tracts. The plasmid
containing (CTG·CAG)165 in orientation I showed a higher
recombination propensity than plasmids with the
(CTG·CAG)165 in orientation II (126 × 10⁻³ versus 81 × 10⁻³). The
reason for this finding is uncertain, but we believe that the
instability of the CTG·CAG tracts contributes to this behavior. Long,
uninterrupted CTG·CAG sequences (even containing 67 or 98 repeats)
are extremely unstable in plasmids cultivated in E. coli. In
addition, the CTG·CAG repeats in orientation II undergo deletions with a much higher rate than those in orientation I. Both pRW4863gfp and pRW4865gfp harbor the very long (CTG·CAG)165
sequences; thus, it is essentially impossible to stably maintain them
in E. coli. Although the preparation of pRW4863gfp
(orientation I) used in these experiments contained only 5-10%
deletions, the pRW4865gfp preparation (orientation II) contained
~30-35% deletions (estimated by restriction digestion followed by
DNA labeling and calculated as the total amount of deletions from both
TRS inserts). This difference in the TRS stability is likely
responsible for the apparent lower frequency of recombination observed
between the two (CTG·CAG)165 inserts in orientation II.
Thus, if this extreme level of instability had not been encountered for the
DNAs with orientation II, we anticipate that a frequency of
~180 × 10⁻³ would have been observed, amounting to
a 36-fold enhancement compared with the shortest CTG·CAG tracts.
As expected, the formation of white colonies due to recombination was
not detected for plasmids harboring inverted repeats (pRW4820gfp and
pRW4822gfp) or for plasmids containing a CTG·CAG tract in the
same plasmid as the isomeric GTC·GAC repeats (pRW4830gfp and
pRW4831gfp, Fig. 4C). In the case of pBR322gfp, which
contains no homologous repeating sequences, a single white colony was
found (~30,000 colonies were screened), and the restriction analysis
of DNA isolated from that colony showed the existence of a point
mutation (1 or 2 nucleotide deletion) within the GFP gene, which was
obviously a sporadic event.
Intramolecular Recombination Is Independent of
RecA--
Intramolecular recombination experiments with plasmids
containing CTG·CAG repeats were done in both recombination proficient and deficient E. coli strains. In contrast to the
significant reduction of the intermolecular recombination frequency by
recA gene knockout (14, 15), intramolecular plasmid
recombination is known to proceed efficiently in
recA-deficient strains (19, 45).
Similar to the results obtained with E. coli AB1157, the
recombination frequency in JC10289 (recA) was strongly
dependent on the length and orientation of the recombining CTG·CAG
tracts (Figs. 4B and 5). The recA gene
inactivation reduced the overall rates of intramolecular recombination
by 2-11-fold in comparison to the isogenic recombination proficient
E. coli cells. In the case of plasmids containing shorter
CTG·CAG tracts (17 and 67 repeats), the effect of recA
deletion was modest (2-4-fold decrease in recombination frequency).
However, a stronger, 3-11-fold reduction in the recombination rate was
detected for plasmids harboring (CTG·CAG)98 and
(CTG·CAG)165. These results are in agreement with the
previous studies, where the effect of recA mutation on
intramolecular recombination varied from no effect to a 40-fold decrease in
frequency and was predominantly dependent on the length of the
recombining sequences (reviewed in Ref. 17).
Instability of the CTG·CAG Sequences in the Recombination
Products--
To study the instability of the CTG·CAG tract
resulting from intramolecular recombination, plasmids were isolated
from white colonies and analyzed by agarose gel electrophoresis. The
electrophoretic migration of recombinants showed that all plasmids
(~400 DNA samples) isolated from white colonies lost a significant
portion (~2 kbp) of the vector backbone (Fig.
6A). Other types of
rearrangements were not observed. In order to characterize further the
structure of the recombinants, a total of 37 individual recombination
products, representing plasmids harboring TRS of different lengths,
were subjected to DNA sequencing. The sequence analyses revealed that all recombination products studied harbored only one TRS tract
flanked by the human myotonic dystrophy sequences and fragments of the
polylinker (Fig. 6B). The smallest recombination product analyzed carried only 13 triplet repeats (Fig. 6B), whereas
the largest entirely sequenced TRS tract contained 165 CTG·CAG
repeats. This expansion product was created by a recombination event
between two (CTG·CAG)98 inserts. In the case of
recombinants containing very short CTG·CAG inserts, we were able to
analyze, using a single sequencing reaction, the entire TRS tract along
with the human flanking sequences and segments of the polylinker (Fig.
6B). In addition, pBR322 sequences that were originally
separated in the parental plasmids by a distance exceeding 2 kbp (Fig.
6B, E and P sites) could also be
detected. Thus, these results proved that recombination occurred
between two homologous TRS inserts resulting in the deletion of the
intervening sequence.

Fig. 6.
Analysis of products of intramolecular
recombination between CTG·CAG repeats. A,
agarose gel analysis of the plasmid DNAs isolated from the single
colonies from E. coli AB1157 after transformation with
pRW4823gfp. Lanes 1-6, DNA isolated from white colonies;
lanes 7-12, plasmids isolated from fluorescent colonies.
The sizes of bands of the supercoiled DNA ladder (Sc) are
shown on the left. B, sequence analysis of an
intramolecular recombination product harboring 13 CTG·CAG repeats. A
schematic diagram of the recombinant plasmid is shown on the
right. The recognition sites for BamHI and
SphI used to determine the sizes of TRS tracts in the
recombination products (Fig. 7) are indicated. E and
P indicate the cloning sites for the X TRS insert
(EcoRI) and the Y TRS insert (PvuII),
respectively. The numbers in parentheses
represent the original positions of these restriction sites on the
pBR322 map.
In order to analyze the size of the TRS inserts in a large number of
recombinants, plasmids were subjected to
SphI/BamHI restriction digestion followed by
end-labeling and polyacrylamide gel electrophoresis (Fig.
7). Although the parental plasmids
contain seven recognition sites for SphI/BamHI,
the recombinants harbor unique SphI and BamHI
restriction sites (the remaining five are lost during recombination along with the 2 kbp of intervening sequence (Fig. 6B)).
Therefore, digestion by these restriction enzymes splits the
recombinant plasmids into an ~2.3-kbp pBR322 fragment (identical for
all plasmids studied containing the origin of replication and the
ampicillin resistance gene; Fig. 6B) and the
CTG·CAG-containing inserts (Fig. 7, A-D).

Fig. 7.
The influence of CTG·CAG length
and sequence interruptions on recombination-mediated instability.
The plasmids were isolated from white colonies and were digested with
BamHI/SphI to release the TRS-containing inserts.
Labeled DNA fragments were separated by 5-7% PAGE to determine the
lengths of the CTG·CAG sequences. Each panel shows the analysis of
~30 individual colonies. A 1-kbp ladder and TRS size standard
(M) were used to determine the lengths of the CTG·CAG
tracts. The TRS size standard (bands identified by arrows)
contains four BamHI/SphI fragments containing 17, 67, 98, and 175 CTG·CAG repeats. A, products of
recombination between two (CTG·CAG)17 tracts
(pRW4815gfp). B, products of recombination between two
(CTG·CAG)67 tracts (pRW4819gfp). C,
products of recombination between two (CTG·CAG)98 tracts
(pRW4823gfp). D, products of recombination between two
TRS tracts containing (CTG·CAG)175 and
(CTG·CAG)98 (pSF3gfp).
At least 40 individual isolates from each recombination experiment were
analyzed by SphI/BamHI restriction digestion.
Fig. 7 shows the polyacrylamide gel analyses of typical data obtained for pRW4815gfp, pRW4819gfp, pRW4823gfp, and pSF3gfp (all plasmids contain the CTG·CAG tracts in orientation I). There were no
significant differences in the overall TRS size distributions in the
recombination products from plasmids harboring CTG·CAG sequences in
orientations II and I.
In the case of pRW4815gfp and pRW4817gfp, the progenitor length
((CTG·CAG)17) was retained in more than 80% of the recombination products analyzed (Figs. 7A and 8). Only 5 isolates harbored longer tracts, with (CTG·CAG)23 being
the longest (Fig. 7A, 12th lane) and 7 clones
contained deleted products (12 repeats was the shortest). These data
imply that the homologous, non-repeating flanking sequences may be an
important factor for recombination between short repeats (particularly
(CTG·CAG)17).
The most interesting recombinational behavior was found for the
plasmids harboring (CTG·CAG)67 sequences (Figs.
7B and 8). Analyses of the pRW4819gfp and pRW4821gfp
recombination products (Fig. 7B) revealed that more than
30% contained expanded CTG·CAG tracts. Deletions and retentions of
the progenitor insert length were detected in ~30 and 40% of
analyzed clones, respectively (Fig. 8).
The smallest CTG·CAG tract found among the recombination products had
15 repeats, and the longest, expanded CTG·CAG tract contained 130 repeats.

Fig. 8.
Frequency of the CTG·CAG repeat expansions
and deletions mediated by intramolecular recombination. The
lengths of the TRS-containing fragments (shown in Fig. 7) were
measured, and the numbers of CTG·CAG repeats were calculated as
described earlier (29). Black bars, deletions;
gray bars, retention of size of progenitor sequence;
white bars, expansions. The orientation of the
CTG·CAG tracts relative to the origin of replication in the parental
plasmids had no influence on TRS size distribution of the recombination
products. Therefore, data obtained for plasmids harboring the same
number of triplet repeats but present in different orientations were
combined and plotted together in a single bar. Each bar represents the
data collected from the analysis of ~90 clones.
For plasmids containing 98 or 165 CTG·CAG repeats (pRW4823gfp,
pRW4825gfp, pRW4863gfp, and pRW4865gfp), only ~5% of the isolates maintained the size of the progenitor sequence; the majority of the
clones analyzed (70-80%) harbored deleted CTG·CAG sequences (Figs.
7C and 8). Thus, these sequences were very prone to
deletions, and expansions were observed infrequently. The longest TRS
tracts found in the recombination products had 172 and 289 CTG·CAG
repeats for pRW4823gfp and pRW4863gfp, respectively.
Hence, we conclude that the intramolecular recombination between
CTG·CAG repeats results in the genetic instabilities of the TRS tracts.
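The classification of recombination products into deletions, retentions, and expansions (as plotted in Fig. 8) can be sketched as follows. The list of repeat numbers is hypothetical, and the optional tolerance parameter is an assumption introduced to allow for sizing uncertainty on gels; it is not a parameter described in this work.

```python
from collections import Counter

def classify(repeats, progenitor, tolerance=0):
    """Label one recombinant relative to the progenitor repeat number."""
    if repeats < progenitor - tolerance:
        return "deletion"
    if repeats > progenitor + tolerance:
        return "expansion"
    return "retention"

def size_distribution(repeat_counts, progenitor, tolerance=0):
    """Fraction of clones in each class for a set of sized recombinants."""
    counts = Counter(classify(n, progenitor, tolerance) for n in repeat_counts)
    total = len(repeat_counts)
    return {label: counts[label] / total
            for label in ("deletion", "retention", "expansion")}

# Hypothetical repeat numbers for products of two (CTG·CAG)67 tracts.
observed = [15, 40, 52, 67, 67, 67, 67, 80, 101, 130]
distribution = size_distribution(observed, progenitor=67)
```

For this invented data set the distribution is 30% deletions, 40% retentions, and 30% expansions, which mirrors the approximate proportions described above for the (CTG·CAG)67 plasmids.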
Effect of Interruptions--
It was demonstrated previously (24)
that as little as 2.8% of heterology might reduce the frequency of
recombination more than 1000-fold between non-repeating sequences in an
E. coli plasmid system. The GFP gene-containing derivative
of pSF3 (Fig. 2) was used to analyze the influence of sequence
interruptions on the frequency of intramolecular recombination between
TRS tracts. pSF3gfp harbors two CTG·CAG inserts in orientation I, 98 and 175 repeats in length. The longer tract contains two G to A
interruptions at repeats 28 and 69 (see "Experimental Procedures"),
but the other tract has no interruptions. Note that all TRS tracts
studied herein (Fig. 2) containing 165 or fewer repeats were
uninterrupted, whereas (CTG·CAG)175 was interrupted (see
"Experimental Procedures"). The frequency of recombination for
pSF3gfp was ~2-fold higher than for pRW4823gfp containing two
(CTG·CAG)98 tracts in orientation I (both with no
interruptions) and ~4-fold lower than the recombination frequency
observed in the case of pRW4863gfp ((CTG·CAG)165). In fact, the insert harboring 175 repeats with two interruptions contains
a tract of 106 pure CTG·CAG repeats which is long enough to
efficiently recombine with the (CTG·CAG)98 sequence. This
result showed that the presence of these two interruptions has no
influence on frequency of intramolecular recombination between the
CTG·CAG repeats.
Although the presence of these two interruptions had no influence on
the frequency of recombination, their effect on the length of the
CTG·CAG tracts in the recombination products was pronounced (Figs.
7D and 8). Restriction analyses of the recombination
products revealed that ~60% of clones contained CTG·CAG inserts of
175 repeats or longer, whereas for pRW4823gfp (containing two
uninterrupted (CTG·CAG)98 inserts), less than 30% of the
recombinants had 98 or more CTG·CAG repeats (Fig. 7, C and
D).
Similar findings regarding the influence of interruptions on
intermolecular recombination between CTG·CAG sequences were described recently (14, 15). The results obtained using a two-plasmid system
demonstrated that both multiple-fold expansions and an increased
frequency of recombination were observed when one of the recombining sequences (usually cloned into pACYC) contained an interrupted CTG·CAG tract and the other recombining plasmid harbored an
uninterrupted CTG·CAG tract. The presence of two interruptions in
each of the recombining CTG·CAG sequences reduced the frequency of recombination and inhibited the formation of
recombination-mediated TRS expansions (14, 15).
 |
DISCUSSION |
Previous studies showed that repetitive sequences, such as
interspersed Alu repeats (62) as well as tandemly repeated
minisatellites (63, 64) and microsatellites (25, 65-68), exhibit a
higher recombination capacity than non-repeating DNA tracts. Also, it has been postulated that TRSs, such as CTG·CAG repeats, may be recombination hot spots (69, 70).
The principal conclusions from our studies are the following. First,
long CTG·CAG microsatellites are preferred sites of intramolecular recombination in E. coli. The frequency of recombination
between two directly repeated CTG·CAG tracts is up to ~10 times
higher than between two non-repeating sequences (λ DNA and DMPK DNA) of similar length in recombination proficient cells. Second, when the
TRS tracts are oriented as inverted repeats, no products of homologous
recombination were observed. Third, the effect of length of the
homologous CTG·CAG tracts on the recombination frequency is dramatic.
We found that increasing the length of the homologous sequences from 17 to 165 CTG·CAG repeats produced a 60-fold increase in the recombination
frequency between direct repeats. This effect was similar for TRS
tracts in orientation I and in orientation II. Fourth, a pronounced
orientation dependence in the frequency was observed, although directly
repeated CTG·CAG tracts recombined efficiently in both orientations
relative to the origin of replication. TRS tracts present in
orientation II (CTG repeats on the lagging strand template) are more
susceptible to recombination. Fifth, intramolecular recombination
between CTG·CAG tracts was observed in both parental and
recA E. coli strains, but the frequency of this
process was elevated by 2-11-fold in the recombination proficient cells. This effect is dependent on the length of the recombining sequences. Sixth, intramolecular recombination between CTG·CAG tracts
led to high genetic instability (deletions and expansions) of the
repeating sequences.
Several features of the TRS tracts may contribute to their
recombinogenic behavior. During the recombination process, two CTG·CAG repeat tracts can hybridize with each other in many
registers; in contrast, the two control, non-repeating sequences (564-bp
λ DNA and 354-bp DMPK DNA) can align in only one frame. The number of possible alignments between two homologous TRS tracts increases with
the number of repeats present in the recombining sequences, which can
have an influence on the kinetics of the synapsis step of homologous recombination.
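The dependence of the number of alignments on tract length can be illustrated with a simple counting model. The model is a deliberate simplification and an assumption of this sketch: it treats each tract as a string of identical repeat units, counts only in-register offsets, and requires a minimum number of overlapping repeats; it ignores flanking sequences and any structural effects.

```python
def count_registers(repeats_a, repeats_b, min_overlap=1):
    """Count the offsets (in repeat units) at which two pure repeat tracts
    can pair with at least min_overlap repeats aligned."""
    count = 0
    for offset in range(-(repeats_b - 1), repeats_a):
        overlap = min(repeats_a, offset + repeats_b) - max(0, offset)
        if overlap >= min_overlap:
            count += 1
    return count

short_tracts = count_registers(17, 17)     # two (CTG·CAG)17 tracts
long_tracts = count_registers(165, 165)    # two (CTG·CAG)165 tracts
full_length_only = count_registers(17, 17, min_overlap=17)
```

In this toy model the count grows linearly with tract length (n + m − 1 registers for tracts of n and m repeats), whereas requiring full-length pairing leaves a single register, analogous to the single frame available to non-repeating sequences.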
In contrast to the extremely frequent intramolecular events between
direct repeats, we were unable to detect any products of recombination
between the inverted repeats. Thus, recombination between
head-to-head-oriented TRS was reduced by at least 100-fold compared
with the head-to-tail-oriented CTG·CAG inserts. These results are
consistent with previous data from studies on plasmid (44) and
chromosome (71, 72) recombination in prokaryota, yeast (73), mammalian
chromosomes (74), and mammalian extrachromosomal elements (75). A
dramatic difference in the recombination frequency between direct and
inverted repeats was also observed for site-specific recombination
systems such as the Tn3 resolvase (76-78). For site-specific recombination systems, the orientation dependence was attributed to the
geometry of the DNA (78). Studies on recombination between inverted
repeats in the Salmonella typhimurium chromosome showed that
the inversion process depended predominantly on the chromosomal localization of the head-to-head-oriented repeats (71). Some DNA
sequences separating the recombining repeats were shown to be
permissive for recombination while others did not stimulate recombination (nonpermissive) (71, 79). The most comprehensive studies
in E. coli plasmid systems to resolve the orientation dependence revealed that homologous recombination between inverted repeats occurs predominantly via nonconservative pathways (44, 80).
Thus, homologous recombination leads to the formation of linear,
inviable recombinants from circular plasmid substrates harboring
inverted repeats (44). We favor the idea, suggested earlier for
recombination between non-repeating sequences in yeast (73), that there
is more than one efficient recombination pathway leading to the high
frequency of intramolecular deletions observed for directly repeated
CTG·CAG sequences. In contrast, only a completely conservative event
(reciprocal exchange) can cause the formation of the predicted
inversion between two homologous TRS tracts in the head-to-head orientation.
Intramolecular recombination depends on the orientation of the
recombining TRS inserts relative to each other (direct or inverted tracts) as
well as on the orientation of the pair of the direct repeats relative
to the origin of replication. A significantly higher frequency of
recombination was observed for the pair of CTG·CAG repeats in
orientation II (when the CTG repeats are on the lagging strand
template). These results confirm a tight connection between formation
of the stable secondary structures by CTG·CAG repeats, replication
arrest, and recombination. A higher propensity for recombination
between CTG·CAG tracts in orientation II is in agreement with the
formation of more stable hairpin structures by CTG repeats.
Furthermore, the arrest of the replication fork progression occurred
in vivo, predominantly when the CTG sequence was present on
the lagging strand template (in orientation II) (54).
Intramolecular recombination between directly repeated CTG·CAG tracts
also occurs efficiently in RecA⁻ E. coli.
However, the presence of the recA gene product increases, in
a length-dependent manner, the rate of intramolecular
deletion by 2- to 11-fold in comparison to the isogenic RecA⁻
cells. Previous studies (17) revealed the capacity of
RecA⁻ cells to affect intramolecular recombination. Hence,
we conclude that at least two types of recombination pathways,
RecA-dependent as well as RecA-independent, are responsible
for recombination between CTG·CAG tracts in E. coli.
Intramolecular deletions between direct repeats can occur by a
RecA-independent single-strand annealing pathway (21, 47, 81, 82).
Previous studies (22, 47, 82) also showed that these types of
recombination products could result from slippage during DNA
replication. Therefore, we cannot exclude the possibility that replication misalignment
may be partially responsible for intramolecular deletions between
direct repeats observed in our study. In addition,
RecA-dependent mechanisms such as crossing-over and
half-crossing-over (44, 80) may contribute significantly to the high
frequency of recombination observed in the case of long CTG·CAG sequences.
Recent data (14, 15, 27) showed that recombination between TRS tracts
leads to large scale deletions and expansions within tandem repeat
sequences at high frequency. We also found that intramolecular
recombination between long, uninterrupted CTG·CAG tracts is a source
of great instability of the TRS tracts. As expected (14, 15),
recombination had no significant influence on the stability of short
inserts, containing 17 CTG·CAG repeats. In addition, the presence of
the G to A interruptions in one of the recombining sequences reduced the
length variability of the CTG·CAG tracts observed in recombination
products. We suggest that the interruptions disturb the homogeneity in
the CTG·CAG repeat units and thus decrease the number of possible
alignments between the two recombining TRS tracts, therefore leading to
the genetic stabilization of the repeating sequence.
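
The proposed effect of interruptions on the number of alignments can be sketched with a brute-force count. This is an illustration under stated assumptions rather than a model used in this study: only perfectly matched overlaps are counted, the helper name `perfect_registers` and the minimum overlap of 9 nucleotides are hypothetical, and the tract lengths and the position of the single G to A interruption (CTG changed to CTA) are arbitrary.

```python
def perfect_registers(a: str, b: str, min_overlap: int) -> int:
    """Count offsets at which the overlapping parts of a and b are identical.

    min_overlap is the assumed minimum overlap in nucleotides (hypothetical).
    """
    count = 0
    # offset is the position in `a` where b[0] sits; negative values mean
    # that b starts before the beginning of a.
    for offset in range(-(len(b) - min_overlap), len(a) - min_overlap + 1):
        start_a = max(offset, 0)
        start_b = max(-offset, 0)
        length = min(len(a) - start_a, len(b) - start_b)
        if length >= min_overlap and (
            a[start_a:start_a + length] == b[start_b:start_b + length]
        ):
            count += 1
    return count


# Arbitrary example tracts of 10 repeat units each.
pure = "CTG" * 10
interrupted = "CTG" * 4 + "CTA" + "CTG" * 5  # one G to A interruption

if __name__ == "__main__":
    print(perfect_registers(pure, pure, 9))         # 15
    print(perfect_registers(pure, interrupted, 9))  # 5
```

In this example a single interruption removes every register whose overlap spans the altered unit, reducing the count from 15 to 5. Real pairing tolerates some mismatches, so the sketch overstates the effect, but it shows the direction of the change suggested above.
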
This work, as well as the accompanying article (27) on the frequency of
intermolecular recombination between long CTG·CAG sequences, shows
their very high recombination potential in E. coli. Their
recombination hot spot characteristics and their capacity to expand by
recombination (14, 15, 26, 27) may be responsible for the genome
instabilities observed in humans. Several cases of the involvement of
recombination processes in TRS expansions in humans have been described
(reviewed in Refs. 14 and 83). Also, by taking into account the
statistical overrepresentation of trinucleotide microsatellites in
eukaryota (84), the frequent recombination events between TRS tracts
may be a source of mutations (deletions and inversions) leading to
genetic diseases. In addition, recombination between TRS may have an
important evolutionary role (70) by promoting rearrangements of genetic
information within different loci leading to the formation
of novel genes.