(Received for publication, August 2, 1996, and in revised form, December 4, 1996)
From the Section on Genomic Structure and Function, Laboratory of
Molecular and Cellular Biology, NIDDK, National Institutes of Health,
Bethesda, Maryland 20892-0830
Tandem repeats are ubiquitous in nature and
constitute a major source of genetic variability in populations. This
variability is associated with a number of genetic disorders in humans
including triplet expansion diseases such as Fragile X syndrome and
Huntington's disease. The mechanism responsible for the
variability/instability of these tandem arrays remains contentious. We
show here that formation of secondary structures, in particular
intrastrand tetraplexes, is an intrinsic property of some of the more
unstable arrays. Tetraplexes block DNA polymerase progression and may
promote instability of tandem arrays by increasing the likelihood of
reiterative strand slippage. In the course of doing this work we have
shown that some of these tetraplexes involve unusual base interactions.
These interactions not only generate tetraplexes with novel properties but also lead us to conclude that the number of sequences that can form
stable tetraplexes might be much larger than previously thought.
Tandemly repeated DNA sequences are distributed widely in nature
and may constitute as much as 10% of the human genome (1). They are
sometimes referred to as satellites, minisatellites, or
microsatellites, depending on their repeat size or array length. Polymorphic tandem repeats are also sometimes referred to as
hypervariable repeats (HVRs)1 or variable
number of tandem repeats. Instability of some of these tandem arrays
has been implicated in a number of disease states including the
so-called triplet expansion diseases (2) such as Fragile X syndrome,
one of the most frequent single gene disorders and the second most
common genetic cause of mental retardation (3).
The nature of the evolutionary forces that act to create and maintain
these tandem arrays has been the subject of much debate (1, 4-12).
Processes such as unequal crossing over during recombination (13) and
strand slippage during replication (14, 15) have been invoked as
potential mechanisms for both the generation of these tandem arrays and
for the variability that is sometimes associated with these sequences.
This variability is of two sorts. Tandem arrays can show length changes
due to the gain and loss of repeat units. These changes tend to occur
at one end of the array, and for this reason are said to show polarity.
Tandem arrays are also prone to the acquisition of point mutations, and
the distribution of these mutations shows a similar polarity (9, 12,
16, 17). This has led to the suggestion that either flanking sequences
are important in imparting polarity to an otherwise non-polar process
(12) or a mechanism that has an inherent polarity such as replication
slippage (16) is involved. However, many of the most hypervariable
arrays show a many-fold increase in repeat number that is thought to
take place within the space of only a few cell divisions (18). Such a
large increase in repeat number cannot be accomplished by a single
strand slippage or recombinational event, and it has been suggested
that in such cases some specialized mutational mechanism must be active
(19, 20).
Many hypervariable sequences that have been described are G + C-rich
and show a strand asymmetry in that one strand is predominantly G-rich
and the other C-rich (21). It had been suggested that these sequences
contained a
Hypervariable sequences used in this study
Clone Construction Oligonucleotides containing
hypervariable repeat units were synthesized on an ABI 381A
oligonucleotide synthesizer using standard phosphoramidite chemistry
and cloned into the plasmid pMS189
Hypervariable sequences were tested for the ability to block DNA
synthesis reactions as follows (25). Sequencing primer was
phosphorylated with [
Templates containing guanine or 7-deazaguanine were prepared by PCR
amplification of plasmids containing the HVR of interest using the
primers AMP2 (5
Dimethyl sulfate (DMS) protection assays were performed on
gel-purified oligonucleotides using the method of Williamson et
al. (35) with slight modifications.
End-labeled oligonucleotide (1-5 ng per reaction) was resuspended in
18 µl of TE buffer and heated for 1 min at 90 °C. Potassium
chloride (1 µl) was added to appropriate tubes to a final
concentration of 50 mM. Reactions were then heated for
30 s at 95 °C, 30 s at 55 °C, and 30 s at 72 °C, cooled to room temperature, and reacted for 1 min with 1 µl
of DMS (diluted 1:5 in water). Reactions were terminated by addition of
20 µl of 2 M pyrrolidine (diluted in cold water) and
cleavage effected at 90 °C for 10 min. Samples were precipitated twice with 1.2 ml of butan-1-ol. The samples were dried under vacuum,
redissolved in 20 µl of 42.5% (v/v) formamide, 5 mM
EDTA, pH 9.5, 5 mM NaOH, 0.05% xylene cyanol, 0.05%
bromphenol blue, denatured for 5 min at 90 °C, and run on a 20%
sequencing gel. Gels were covered with plastic wrap and exposed to
x-ray film overnight at
Intrastrand tetraplexes form when four G-rich motifs on a single
strand interact to form a series of tetrads (36-39). A series of
stacked tetrads creates a hollow stem or cylinder. This stem is bounded
by three loops formed by bases between the G-rich regions (L1, L2, and L3 in Fig.
1). We have recently developed a highly sensitive and
specific technique for the identification of sequences that can form
intrastrand DNA tetraplexes (25, 34, 40). This assay, illustrated in
Fig. 1, is based on the ability of such sequences to block DNA
polymerase progression in the presence of K+ but not in the
absence of monovalent cations or in the presence of cations such as
Li+, NH4+, Rb+,
or Cs+. The specificity of this reaction for K+
is probably related to the fact that its ionic radius is small enough
for the ion to fit inside the tetraplex cavity but is still large
enough for it to interact with the keto oxygens of guanines in adjacent
tetrads (41). This K+ specificity parallels the
K+-dependent anomalous mobility of
tetraplex-forming oligonucleotides that is considered a diagnostic
feature of tetraplex formation (35, 42, 43). Our assay is simple to use
and has the advantage of allowing multiple tetraplexes to be discerned
in a mixture of such structures or for tetraplexes to be identified
even when they are formed by only a small fraction of molecules in the
solution.
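As a rough illustration of the structural definition used here (four G-rich motifs on a single strand, separated by three loops), the following Python sketch scans a sequence for candidate intrastrand tetraplex-forming stretches. It is not part of the original study: the function name, the minimum G-run length of 2, and the maximum loop length of 7 are assumptions chosen only for illustration. A sequence match does not show that a tetraplex actually forms; that is what the polymerase arrest assay tests experimentally.

```python
import re


def find_putative_tetraplex_motifs(seq, min_g=2, max_loop=7):
    """Return (start, end) spans of four G-runs separated by three loops.

    min_g and max_loop are illustrative assumptions, not values taken
    from the paper. Spans are 0-based, end-exclusive, non-overlapping.
    """
    seq = seq.upper()
    g_run = "G{%d,}" % min_g                 # one G-rich motif
    loop = "[ACGT]{1,%d}?" % max_loop        # one loop (L1, L2, or L3), kept short
    pattern = re.compile(g_run + loop + g_run + loop + g_run + loop + g_run)
    return [(m.start(), m.end()) for m in pattern.finditer(seq)]


if __name__ == "__main__":
    # (TGG)20 is one of the arrays tested in this study (Fig. 5A).
    for span in find_putative_tetraplex_motifs("TGG" * 20):
        print(span)
```

With these assumed parameters, `"TGG" * 20` yields five non-overlapping candidates, the first spanning positions 1 to 12, whereas the complementary C-rich strand yields none.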
Fig. 1. The K+-dependent block to DNA synthesis assay for tetraplex formation. Diagrammatic representation of the tetraplex arrest assay on a template containing a generic intrastrand tetraplex containing five G4 tetrads (shown as gray parallelograms). The loops L1, L2, and L3 each contain three unspecified bases (N). DNA synthesis starts 3′ of the tetraplex-forming region
and proceeds in a 5′ to 3′ direction toward the tetraplex. The front
end of the polymerase is represented by the diagonally striped
bullet and the nascent DNA strand by the dashed line.
The site of premature chain termination that would result from the
formation of the tetraplex on the template strand is indicated by the
filled arrow. Inset, a G4 tetrad with
a K+ ion situated within the tetrad cavity (not to
scale).
One of the most unstable loci thus far identified in any organism is
the mouse minisatellite locus Ms6-hm, which has a germ line
mutation rate of 2.5% per gamete and which shows frequent intergenerational changes of a kilobase or more (44). This locus contains from 200 to >1000 repeats of the pentamer 5
Fig. 2. Tetraplex assay of the Ms6-hm locus. The assay was carried out on the G-rich and the C-rich strands, in the presence of the indicated monovalent cations as described under "Materials and Methods." The lane markers T, C, G, and A indicate the bases on the template strand. The bracket on the left side of the figure indicates the tandem repeat, and the open and filled arrows mark the positions of the major monovalent cation-dependent sites of premature chain termination.
The properties of both the Na+- and the
K+-dependent DNA synthesis arrest sites
including the position of the blocks to DNA synthesis, the template
concentration independence, and the strand specificity, are most
consistent with intrastrand tetraplex formation. The major stop
reflects the most stable tetraplex(es) involving the maximum number of
repeats. The less prominent stops at subsequent repeats reflect a
series of tetraplexes that presumably involve a smaller number of
repeats. In addition to these monovalent cation-dependent stops, a smaller amount of cation-independent premature chain termination is seen at the second G of every repeat. These stops are
even more marked in both guanine and 7-deazaguanine containing linear
templates (Fig. 3), and this is paralleled by a
hypersensitivity of that G to methylation by DMS (see Fig.
4). We hypothesize that these phenomena may be related
to a conformational peculiarity of the DNA backbone of this region.
Fig. 3. Tetraplex assay of the Ms6-hm locus on templates containing 7-deazaguanine. The Ms6-hm HVR was assayed for tetraplex formation using PCR-generated templates containing either guanine or 7-deazaguanine as described under "Materials and Methods." The assay was conducted in the absence of added monovalent cation (0), in the presence of 50 mM K+ (KCl), or in the presence of 50 mM Na+ (NaCl). The lane markers T, C, G, and A indicate the bases on the template strand. The brackets alongside the gel indicate the extent of each tandem array. The filled arrow marks the first major K+-dependent block to DNA synthesis, with the second stop marked by an open arrow.
Fig. 4. Dimethyl sulfate protection assay of the Ms6-hm HVR. Oligonucleotides containing the Ms6-hm HVR were treated with DMS in the presence and absence of K+ as described under "Materials and Methods." The bracket demarcates the HVR. The solid vertical line on the left indicates the region of DMS protection. Solid arrows represent DMS-reactive bases. The asterisk indicates the G located outside the HVR that serves as a reference base for comparison of the DMS reactivity of bases within the HVR with and without K+.
To confirm that polymerase arrest in the presence of K+ and Na+ is related to tetraplex formation, the polymerase chain reaction (PCR) was used to generate templates containing either guanine or 7-deazaguanine. These templates were then tested for the ability to cause K+/Na+-dependent DNA synthesis arrest. Since 7-deazaguanine lacks the N7 atom needed for the hydrogen bonding that forms G tetrads, substitution of all guanine residues with 7-deazaguanine should abolish the K+/Na+-dependent polymerase blocks. As can be seen in Fig. 3, this is precisely what happens. The PCR template in which all the Gs have been replaced by 7-deazaguanine has lost all the K+/Na+-dependent blocks to DNA synthesis, whereas the PCR template containing guanines produced the same blocks to DNA synthesis seen on the circular templates (Fig. 3).
DMS treatment of an oligonucleotide containing the HVR was also carried out. Since Gs involved in tetrads do not have their N7 positions exposed, they are protected from modification by DMS. In theory, Gs in tetrads are completely protected from DMS, whereas Gs in the loops of the tetraplex that are not involved in intraloop or interloop interactions should be DMS-reactive (24, 48). In practice, the picture is not always so clear, and this represents a very real limitation on the value of this technique. For example, if a tetraplex is not very stable and is formed by only a small fraction of the molecules in the population, this may produce a pattern of DMS modification in which only partial protection of Gs is apparent. In addition, many tetraplex-forming sequences show conformational complexity that can complicate DMS data interpretation, since a base protected in one structure may be exposed in another. Since the fraction of molecules in the population that form a K+-dependent block to DNA synthesis in the case of the mouse Ms6-hm HVR is small, we would expect to see some DMS protection, but this protection would not be complete. This is in fact the case (Fig. 4).
After normalizing the K+ and K+-free reactions to a G outside of the HVR (indicated by an asterisk in Fig. 4) we can see that Gs within the HVR show less DMS reactivity when K+ is present than when it is absent. While not definitive, these data are consistent with our other data and support the idea that the mouse Ms6-hm HVR is capable of tetraplex formation.
Why a Na+-induced polymerase block is seen only with this
sequence and not other tetraplexes we have tested (24, 25, 34, 47) is
not clear, but preliminary evidence suggests that it is related to the
involvement of adenines in the structure since the sequence
(CTGGG)12 shows K+-dependent but
not Na+-dependent DNA polymerase arrest (data
not shown). However, the mere presence of adenines is not sufficient to
elicit a Na+ stop since not all A-containing templates show
such stops (Fig. 5). Rather we believe the
Na+ effect is related to a specific hydrogen bonding
interaction in which As are involved. The molecular basis of the
Na+ effect is currently under investigation.
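The normalization used above to compare DMS reactivity with and without K+ (each lane scaled to a reference G outside the HVR) can be sketched as follows. This is only an illustrative calculation under stated assumptions: the function name, the dictionary layout, and the idea of using quantified band intensities are not taken from the original methods, and the numbers in the example are hypothetical placeholders, not data from Fig. 4.

```python
def relative_dms_reactivity(with_k, without_k, reference):
    """Compare DMS reactivity of each G with and without K+.

    with_k / without_k map a band label to a band intensity (arbitrary
    units) for the +K+ and K+-free lanes. Each lane is first divided by
    the intensity of the reference G outside the repeat, then the +K+
    value is divided by the K+-free value. A ratio below 1 indicates
    less DMS reactivity (more protection) in the presence of K+.
    """
    ref_plus = with_k[reference]
    ref_minus = without_k[reference]
    ratios = {}
    for label, intensity in with_k.items():
        if label == reference:
            continue
        plus = intensity / ref_plus              # normalized +K+ signal
        minus = without_k[label] / ref_minus     # normalized K+-free signal
        ratios[label] = plus / minus
    return ratios


if __name__ == "__main__":
    # Hypothetical intensities, for illustration only.
    plus_k = {"ref": 100.0, "G1": 40.0, "G2": 100.0}
    minus_k = {"ref": 50.0, "G1": 50.0, "G2": 50.0}
    print(relative_dms_reactivity(plus_k, minus_k, "ref"))
```

In this hypothetical example, G1 would be read as partially protected in K+ (ratio 0.4) and G2 as unprotected (ratio 1.0). Partial protection of this kind is what would be expected when only a fraction of the molecules form the structure.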
Fig. 5. Tetraplex formation by different HVRs. Plasmids containing (TGG)20 (A), 2.5 copies of the sequence 5′-GGGGAGGGGGAAGA-3′ from the human D4S43 locus
(B), and the sequence (ACAGGGGTGTGGGG)4 from the
human insulin-linked HVR (C) were tested for tetraplex formation as described under "Materials and Methods." The
lane markers T, C, G, and A indicate the bases on
the template strand. The brackets indicate the extent of
each tandem array. The positions of the major
K+-dependent blocks to DNA synthesis are
indicated by black lines. The dashed line in the
case of the D4S43 HVR marks the position of a monovalent
cation-independent arrest site. The open circle in
A marks the position of a monovalent cation-independent
arrest site seen only on linear templates.
Tandem arrays of the repeat 5
Fig. 6. Dimethyl sulfate protection assays of the (TGG)20, the D4S43 HVR, and the insulin HVR. Oligonucleotides containing repeats of (TGG)20, the D4S43 HVR, and the insulin HVR were treated with DMS in the presence and absence of K+ as described under "Materials and Methods." The brackets on the right side of each panel demarcate each HVR. The solid vertical lines on the left side of each panel represent regions of strong DMS protection. Solid arrows mark DMS-reactive bases. The asterisks mark reference Gs outside the HVRs that are used in comparisons between reactions carried out in the presence and absence of K+.
We have previously shown that a (CGG)20 tract blocks DNA
synthesis in a similar manner producing eight premature chain
termination products opposite C residues at the 3
We also tested repeats with the sequence 5
Substitution of guanines in the template with 7-deazaguanine eliminates the K+-dependent blocks to DNA synthesis (Fig. 5B). The K+-independent polymerase arrest observed midway through the sequence is also eliminated, supporting the hypothesis that this stop may represent a purine:purine:pyrimidine triplex formed between the template and the nascent strand produced in the assay. This HVR shows a pattern of DMS modification with alternating regions of DMS protection and DMS reactivity in the presence of K+ (Fig. 6). This contrasts with the almost uniform reactivity of Gs in the absence of K+. Some of the most protected bases show a DMS reactivity indistinguishable from background. Both the 7-deazaguanine substitution data and the DMS protection data are thus consistent with tetraplex formation.
Four repeats from the type I diabetes-linked hypervariable region in the human insulin promoter also produce a number of K+-dependent blocks to DNA synthesis consistent with an array of different tetraplexes (Fig. 5C). These blocks are eliminated by substitution of guanine with 7-deazaguanine and are not observed on the complementary pyrimidine-rich strand. A number of Gs in the HVR are as reactive with DMS as a reference base outside the repeat (indicated with an asterisk in Fig. 6, right panel). These Gs are separated by regions of protected Gs in which no reactivity can be seen above background. Based on indirect evidence from gel electrophoretic mobility assays, and using enzymatic and chemical probes, it had been suggested that this region is able to form a series of intramolecular tetraplexes (43, 53, 54). Our data support this claim.
Our observations suggest that the ability to form an intrastrand tetraplex in vitro is a common feature of a number of hypervariable sequences including the mouse minisatellite at the Ms6-hm locus which is one of the most hypervariable sequences thus far described (44). The tetraplex formed by the repeats in the Ms6-hm tandem array is unusual in that it can be stabilized by Na+ as well as K+, albeit with lower efficacy. This contrasts with our observations that all other tetraplexes that we have tested are seen only in the presence of K+ (24, 25, 34, 40, 47). Since the ionic radius of Na+ is smaller than that of K+, it may be that the Ms6-hm tetraplex has smaller internal dimensions than the other previously described tetraplexes. This interpretation is consistent with the fact that other monovalent cations such as Rb+, Cs+, and NH4+ do not result in a block to DNA synthesis in our assay, since these ions have radii that are all larger than that of K+. Li+, on the other hand, is much smaller than Na+ and may still be too small to form the coordination complex that is important in stabilizing these types of structures (41). Our assay might thus be useful in distinguishing between different kinds of tetraplexes such as those that are K+-specific and that correspond to previously described G4 tetrad containing tetraplexes and those that are also seen in the presence of other cations, specifically Na+, that may represent a novel class of tetraplex with different base interactions and thus different properties. Since we have shown previously that the amount of K+ used in this assay represents saturating amounts of cation for tetraplex formation (24), it is likely therefore that the same pattern of polymerase pausing/tetraplex formation would be seen at physiological [K+] which typically is around 150 mM in mammalian cells (55). Tetraplex formation in vivo would require these regions to be transiently unpaired at some time. 
This might occur during DNA replication or on extrusion from otherwise duplex molecules (53, 56) any time during the cell cycle. In eukaryotic cells it is thought that only relatively small regions of DNA are unpaired during replication, although it has been suggested that many hundreds of bases can be unpaired under certain circumstances (57). Direct evidence for an altered structure in vivo has been obtained for one of these sequences, that of the human insulin HVR (58), suggesting that formation of DNA tetraplexes by the hypervariable sequences described here might in fact be possible. The fact that a variety of tetraplex-binding proteins have been isolated from eukaryote cells (59-65) supports the idea that tetraplexes can form in vivo. The HVRs we have tested are much shorter than those actually found at their specified loci on chromosomes. Therefore not only could the number of potential tetraplexes at these loci be much larger, but the stability of these tetraplexes would be significantly higher as well. A variety of other tandem repeats have been shown to form fold-back
structures. These include the 5
In the strand slippage models for the generation and evolution of
tandem arrays, the nascent strand dissociates from the template, allowing the two strands to slip relative to one another. Successful priming from the slipped position results in a change in repeat number.
Factors that favor strand dissociation over polymerization or that
stabilize a slipped nascent strand-template complex would be expected
to affect the frequency with which repeat units are added to or lost
from the array. Blocks to DNA synthesis, such as those resulting from
tetraplex formation, would be expected to increase the likelihood that
strand slippage would occur. Since the strongest blocks to DNA
synthesis are encountered at the 3
One model that attempts to explain the large scale increase in repeat number seen in some tandem arrays invokes a long-lived block to DNA synthesis that induces repeated strand slippage during replication (20). Tetraplexes make compelling candidates for this long-lived block since they form strong, stable blocks to DNA synthesis under physiological conditions (24, 25, 34). We have shown that even very long hairpins are not effective barriers to DNA polymerase in our assay (see Ref. 47 and Woodford et al.2), which suggests that sequences that are only able to form hairpins may not arrest DNA synthesis. This would be consistent with in vivo observations (69). However, both tetraplexes and hairpins may act to increase the frequency of successful strand slippage by stabilizing the strand slippage intermediate, thus increasing the likelihood that reinitiation of the polymerase would occur from the slipped position. In addition, we would expect that the intramolecular tetraplex-forming tandem arrays are also likely to form intermolecular tetraplexes involving either one or three other DNA strands (70). Formation of such structures may facilitate synapsis of the DNA strands prior to crossing over during recombination. A combination of enhanced pausing at intrastrand tetraplexes, and enhanced synapsis between strands from different chromosomes or chromatids, may promote instability by facilitating strand switching. It is possible that the formation of secondary structures in general may contribute to the generation and evolution of tandem arrays.
In this regard, we would expect that the likelihood of structure formation would be affected by a variety of factors including the nature of the flanking sequences, the local chromatin structure, the transcriptional activity of a region, the rate of replication through the tandem array, the size of individual nucleotide pools, and whether or not the secondary structure-forming sequence is in the leading or lagging strand of DNA synthesis (71).
* The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
These authors have contributed equally to this work.
§ To whom correspondence should be addressed: Bldg. 8, Rm. 202, National Institutes of Health, 8 Center Dr. MSC 0830, Bethesda, MD 20892-0830. Tel.: 301-496-2189; Fax: 301-402-0240; E-mail: ku @helix.nih.gov.
1 The abbreviations used are: HVRs, hypervariable repeats; PCR, polymerase chain reaction; DMS, dimethyl sulfate; dd, dideoxy.
2 K. J. Woodford, M. N. Weitzmann, and K. Usdin, unpublished observations.
We thank Drs. Anthony Furano and Herbert Tabor for critical reading of this manuscript and for their advice and support.
©1997 by The American Society for Biochemistry and Molecular Biology, Inc.