DNA secondary structures and the evolution of hypervariable tandem arrays.

Tandem repeats are ubiquitous in nature and constitute a major source of genetic variability in populations. This variability is associated with a number of genetic disorders in humans including triplet expansion diseases such as Fragile X syndrome and Huntington's disease. The mechanism responsible for the variability/instability of these tandem arrays remains contentious. We show here that formation of secondary structures, in particular intrastrand tetraplexes, is an intrinsic property of some of the more unstable arrays. Tetraplexes block DNA polymerase progression and may promote instability of tandem arrays by increasing the likelihood of reiterative strand slippage. In the course of doing this work we have shown that some of these tetraplexes involve unusual base interactions. These interactions not only generate tetraplexes with novel properties but also lead us to conclude that the number of sequences that can form stable tetraplexes might be much larger than previously thought.

Tandemly repeated DNA sequences are distributed widely in nature and may constitute as much as 10% of the human genome (1). They are sometimes referred to as satellites, minisatellites, or microsatellites, depending on their repeat size or array length. Polymorphic tandem repeats are also sometimes referred to as hypervariable repeats (HVRs) or variable number of tandem repeats. Instability of some of these tandem arrays has been implicated in a number of disease states including the so-called triplet expansion diseases (2) such as Fragile X syndrome, one of the most frequent single gene disorders and the second most common genetic cause of mental retardation (3).
The nature of the evolutionary forces that act to create and maintain these tandem arrays has been the subject of much debate (1, 4-12). Processes such as unequal crossing over during recombination (13) and strand slippage during replication (14, 15) have been invoked as potential mechanisms both for the generation of these tandem arrays and for the variability that is sometimes associated with these sequences. This variability is of two sorts. Tandem arrays can show length changes due to the gain and loss of repeat units. These changes tend to occur at one end of the array, and for this reason are said to show polarity. Tandem arrays are also prone to the acquisition of point mutations, and the distribution of these mutations shows a similar polarity (9, 12, 16, 17). This has led to the suggestion that either flanking sequences are important in imparting polarity to an otherwise non-polar process (12) or a mechanism that has an inherent polarity, such as replication slippage (16), is involved. However, many of the most hypervariable arrays show a many-fold increase in repeat number that is thought to take place within the space of only a few cell divisions (18). Such a large increase in repeat number cannot be accomplished by a single strand slippage or recombinational event, and it has been suggested that in such cases some specialized mutational mechanism must be active (19, 20).
Many hypervariable sequences that have been described are G+C-rich and show a strand asymmetry in that one strand is predominantly G-rich and the other C-rich (21). It had been suggested that these sequences contained a χ-like sequence that could account for the observed variability by promoting recombination (10). However, many of the more recently identified hypervariable sequences lack a discernible χ-like motif. We had previously found that a hypervariable sequence, the CGG repeat in the human FMR1 gene that undergoes triplet expansion to result in Fragile X syndrome (22, 23), forms a series of intrastrand tetraplexes at physiological temperatures, pH, and ionic strengths (24). This occurs despite the fact that one-third of the bases in this sequence are Cs, and this C-richness would be expected to reduce tetraplex stability. We have now tested a series of other highly hypervariable tandem repeats (Table I) for the ability to form intrastrand tetraplexes using a K+-dependent arrest of DNA synthesis assay that we have recently developed (25). These sequences are also G+C-rich but, like the CGG repeat at the FMR1 locus, contain a number of non-G bases. We have found that the ability to form intrastrand tetraplexes is a shared property of all of these sequences. This, together with the observation that other hypervariable tandem arrays form hairpins (24, 26-32) or triplexes (33), supports the idea that DNA secondary structure may play a major role in the generation and evolution of tandem arrays.
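The G+C-richness and strand asymmetry described above can be quantified for the repeat units quoted in this paper. The following is a minimal sketch, not part of the original analysis: the function name, the dictionary keys, and the definition of skew as (G - C)/(G + C) are assumptions made for illustration.

```python
def base_composition(repeat_unit: str) -> dict:
    """Return G+C fraction, G fraction, and G/C strand skew for one repeat unit."""
    unit = repeat_unit.upper()
    n = len(unit)
    if n == 0:
        raise ValueError("repeat unit must not be empty")
    g = unit.count("G")
    c = unit.count("C")
    gc = g + c
    return {
        "gc_fraction": gc / n,
        "g_fraction": g / n,
        # Skew runs from +1 (all G on this strand) to -1 (all C on this strand).
        "gc_skew": (g - c) / gc if gc else 0.0,
    }


# Repeat units quoted in the text (G-rich strand, written 5' to 3').
# The labels are informal names chosen for this sketch.
REPEAT_UNITS = {
    "Ms6-hm": "CAGGG",
    "TGG repeat": "TGG",
    "FMR1 CGG repeat": "CGG",
    "Huntington's disease upstream repeat": "GGGGAGGGGGAAGA",
}

if __name__ == "__main__":
    for name, unit in REPEAT_UNITS.items():
        stats = base_composition(unit)
        print(f"{name:38s} {unit:15s} "
              f"G+C={stats['gc_fraction']:.2f} skew={stats['gc_skew']:+.2f}")
```

A positive skew for every unit reflects the asymmetry noted in the text: the strand written here is G-rich, so its complement is C-rich.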

MATERIALS AND METHODS
Clone Construction. Oligonucleotides containing hypervariable repeat units were synthesized on an ABI 381A oligonucleotide synthesizer using standard phosphoramidite chemistry and cloned into the plasmid pMS189Δ as described previously (24, 34). Plasmids were replicated in Escherichia coli MBM7070, isolated by alkaline lysis, and purified by CsCl gradient centrifugation according to standard procedures.
Intrastrand Tetraplex Assay. Hypervariable sequences were tested for the ability to block DNA synthesis reactions as follows (25). Sequencing primer was phosphorylated with [γ-32P]ATP (DuPont NEN, 3000-6000 Ci/mmol) using T4 polynucleotide kinase (Epicentre Technologies, Inc.) and a buffer containing 50 mM Tris-HCl, pH 8.0, and 10 mM MgCl2. Reaction mixtures (total volume 6 µl) contained 0.2-2 nM template, 0.16 nM of the primer SupFR4 (5′-ATGCTTTTACTGGCCTGCT-3′), 10 µM dNTPs, one of the following dideoxynucleotides at the concentration indicated in parentheses: ddATP (0.3 mM), ddGTP (0.017 mM), ddCTP (0.2 mM), ddTTP (0.6 mM), 50 mM Tris-HCl, pH 9.3, 2.5 mM MgCl2, 5 units of Taq polymerase (Life Technologies, Inc.), and where indicated 50 mM monovalent cation. Reaction mixtures were subjected to 30 rounds of heating and cooling in a Perkin-Elmer PCR machine for 30 s at 95°C, 30 s at 55°C, and 30 s at 72°C. The reaction was terminated by the addition of one-half volume of stop buffer containing 95% (v/v) formamide, 10 mM EDTA, pH 9.5, 10 mM NaOH, 0.1% xylene cyanol, and 0.1% bromphenol blue, and the mixtures were heated at 90°C for 5 min prior to electrophoresis on a 6.5% polyacrylamide sequencing gel. The sequence located between the sequencing primer and the repeat on the template strand is 5′-CTCGAGTCAACGTAACACTTTACAGCGGCGCGTCATTTGATATGATGCGCCCCGCTTCCCGATAAGGG-3′.
Preparation of 7-Deazaguanine Templates by PCR. Templates containing guanine or 7-deazaguanine were prepared by PCR amplification of plasmids containing the HVR of interest using the primers AMP2 (5′-GGCGACACGGAAATGTTGAA-3′) and supFR1 (5′-GATCGAATTCGTCGACATGGTGGTGGGGGAA-3′), which flank the HVR. The primer binding sites are located about 500 bases apart, the precise distance depending on the template, with the repeat being located about halfway between the two primer binding sites. Reaction mixtures contained 10 ng of plasmid template DNA containing the repeat of interest; 1 µM each of AMP2 and supFR1; 2.5-5 units of Taq polymerase (Life Technologies, Inc.); 50 mM Tris-HCl, pH 8.0; 10 mM MgCl2; and 100 or 160 µM each of dATP, dTTP, dCTP, and either dGTP or 7-deaza-dGTP. They were then overlaid with a drop of mineral oil and subjected to 30 rounds of heating and cooling in a Perkin-Elmer PCR machine for 30 s at 95°C, 30 s at 55°C, and 30 s at 72°C. The PCR products were purified on a 5% polyacrylamide gel and used as templates in the tetraplex assay described above.
DMS Protection Assays. Dimethyl sulfate (DMS) protection assays were performed on gel-purified oligonucleotides using the method of Williamson et al. (35) with slight modifications. End-labeled oligonucleotide (1-5 ng per reaction) was resuspended in 18 µl of TE buffer and heated for 1 min at 90°C. Potassium chloride (1 µl) was added to appropriate tubes to a final concentration of 50 mM. Reactions were then heated for 30 s at 95°C, 30 s at 55°C, and 30 s at 72°C, cooled to room temperature, and reacted for 1 min with 1 µl of DMS (diluted 1:5 in water). Reactions were terminated by addition of 20 µl of 2 M pyrrolidine (diluted in cold water) and cleavage effected at 90°C for 10 min. Samples were precipitated twice with 1.2 ml of butan-1-ol. The samples were dried under vacuum, redissolved in 20 µl of 42.5% (v/v) formamide, 5 mM EDTA, pH 9.5, 5 mM NaOH, 0.05% xylene cyanol, 0.05% bromphenol blue, denatured for 5 min at 90°C, and run on a 20% sequencing gel. Gels were covered with plastic wrap and exposed to x-ray film overnight at −20°C.

RESULTS
Intrastrand tetraplexes form when four G-rich motifs on a single strand interact to form a series of tetrads (36-39). A series of stacked tetrads creates a hollow stem or cylinder. This stem is bounded by three loops formed by bases between the G-rich regions (L1, L2, and L3 in Fig. 1). We have recently developed a highly sensitive and specific technique for the identification of sequences that can form intrastrand DNA tetraplexes (25, 34, 40). This assay, illustrated in Fig. 1, is based on the ability of such sequences to block DNA polymerase progression in the presence of K+ but not in the absence of monovalent cations or in the presence of cations such as Li+, NH4+, Rb+, or Cs+. The specificity of this reaction for K+ is probably related to the fact that its ionic radius is small enough for the ion to fit inside the tetraplex cavity but is still large enough for it to interact with the keto oxygens of guanines in adjacent tetrads (41). This K+ specificity parallels the K+-dependent anomalous mobility of tetraplex-forming oligonucleotides that is considered a diagnostic feature of tetraplex formation (35, 42, 43). Our assay is simple to use and has the advantage of allowing multiple tetraplexes to be discerned in a mixture of such structures, or tetraplexes to be identified even when they are formed by only a small fraction of molecules in the solution.
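The requirement for four G-rich motifs separated by three loops on one strand can be expressed as a simple sequence scan. The sketch below is a computational heuristic only and is no substitute for the polymerase arrest assay. The function names, the minimum G-run length of 2 (chosen so that TGG and CGG repeats qualify), and the maximum loop length of 7 are all assumptions for illustration and do not come from this paper.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_putative_tetraplex_motifs(seq: str, min_g_run: int = 2, max_loop: int = 7):
    """Find non-overlapping stretches with four G-runs joined by three loops.

    Returns a list of (start, end, matched_sequence) tuples using 0-based,
    end-exclusive coordinates. Thresholds are illustrative assumptions.
    """
    g_run = "G{%d,}" % min_g_run
    loop = "[ACGT]{1,%d}?" % max_loop  # lazy, so the shortest loops are tried first
    pattern = re.compile("%s(?:%s%s){3}" % (g_run, loop, g_run))
    return [(m.start(), m.end(), m.group()) for m in pattern.finditer(seq.upper())]


if __name__ == "__main__":
    g_rich = "TGG" * 4
    print(find_putative_tetraplex_motifs(g_rich))
    # The complementary pyrimidine-rich strand has no G-runs to scan.
    print(find_putative_tetraplex_motifs(reverse_complement(g_rich)))
```

Scanning both a strand and its reverse complement mirrors the strand specificity of the assay: only the G-rich strand can satisfy the four-motif requirement.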
One of the most unstable loci thus far identified in any organism is the mouse minisatellite locus Ms6-hm, which has a germ line mutation rate of 2.5% per gamete and which shows frequent intergenerational changes of a kilobase or more (44). This locus contains from 200 to >1000 repeats of the pentamer 5′-CAGGG-3′. A template containing eight CAGGG repeats was tested for the ability to form a K+-dependent block to DNA synthesis. Two distinct non-dideoxynucleotide-mediated chain termination products are seen at the 3′ end of the repeat tract in the presence of 50 mM KCl when the G-rich strand is used as a template (Fig. 2). The more prominent of the two products (filled arrow) corresponds to a block to DNA synthesis just 3′ of the first G residue in the first 5′-CAGGG-3′ repeat on the template. The second product (open arrow) corresponds to premature chain termination one base 3′ of this one. A series of weaker stops is seen at corresponding positions in the next four repeats. A smaller amount of premature termination is also observed in the presence of 50 mM NaCl, but none is observed in the absence of cation or in the presence of LiCl, [...] (45), the cation specificity is inconsistent with the blocks being due to hairpin formation. No block to DNA synthesis is seen when the complementary strand is used as a template (Fig. 2, right panel) or when single-stranded phage DNA is used as a template (data not shown), ruling out structures, such as triplexes, that involve interactions between the template and its complementary strand (46). Arrest of DNA synthesis is seen when these repeats are cloned into other vectors (data not shown), indicating that flanking sequences are not involved. Blockage is also independent of template concentration over a wide range (data not shown), indicating that the blocks do not involve interactions between two or more template strands but are due to the formation of intrastrand structures. The properties of both the Na+- and the K+-dependent DNA synthesis arrest sites, including the position of the blocks to DNA synthesis, the template concentration independence, and the strand specificity, are most consistent with intrastrand tetraplex formation. The major stop reflects the most stable tetraplex(es) involving the maximum number of repeats. The less prominent stops at subsequent repeats reflect a series of tetraplexes that presumably involve a smaller number of repeats. In addition to these monovalent cation-dependent stops, a smaller amount of cation-independent premature chain termination is seen at the second G of every repeat. These stops are even more marked in both guanine- and 7-deazaguanine-containing linear templates (Fig. 3), and this is paralleled by a hypersensitivity of that G to methylation by DMS (see Fig. 4). We hypothesize that these phenomena may be related to a conformational peculiarity of the DNA backbone of this region.

FIG. 1. The K+-dependent block to DNA synthesis assay for tetraplex formation. Diagrammatic representation of the tetraplex arrest assay on a template containing a generic intrastrand tetraplex containing five G4 tetrads (shown as gray parallelograms). The loops L1, L2, and L3 each contain three unspecified bases (N). DNA synthesis starts 3′ of the tetraplex-forming region and proceeds in a 5′ to 3′ direction toward the tetraplex. The front end of the polymerase is represented by the diagonally striped bullet and the nascent DNA strand by the dashed line. The site of premature chain termination that would result from the formation of the tetraplex on the template strand is indicated by the filled arrow. Inset, a G4 tetrad with a K+ ion situated within the tetrad cavity (not to scale).
To confirm that polymerase arrest in the presence of K+ and Na+ is related to tetraplex formation, the polymerase chain reaction (PCR) was used to generate templates containing either guanine or 7-deazaguanine. These templates were then tested for the ability to cause K+/Na+-dependent DNA synthesis arrest. Since 7-deazaguanine lacks the N7 needed to form G tetrads, substitution of all guanine residues with 7-deazaguanine should abolish the K+/Na+-dependent polymerase blocks. As can be seen in Fig. 3, this is precisely what happens. The PCR template in which all the Gs have been replaced by 7-deazaguanine has lost all the K+/Na+-dependent blocks to DNA synthesis, whereas the PCR template containing guanines produced the same blocks to DNA synthesis seen on the circular templates (Fig. 3).
DMS treatment of an oligonucleotide containing the HVR was also carried out. Since Gs involved in tetrads do not have their N7 positions exposed, they are protected from modification by DMS. In theory, Gs in tetrads are completely protected from DMS, whereas Gs in the loops of the tetraplex that are not involved in intraloop or interloop interactions should be DMS-reactive (24, 48). In practice, the picture is not always so clear, and this represents a very real limitation on the value of this technique. For example, if a tetraplex is not very stable and is formed by only a small fraction of the molecules in the population, this may produce a pattern of DMS modification in which only partial protection of Gs is apparent. In addition, many tetraplex-forming sequences show conformational complexity that can complicate DMS data interpretation, since a base protected in one structure may be exposed in another. Since the fraction of molecules in the population that form a K+-dependent block to DNA synthesis in the case of the mouse Ms6-hm HVR is small, we would expect to see some DMS protection, but this protection would not be complete. This is in fact the case (Fig. 4). After normalizing the K+ and K+-free reactions to a G outside of the HVR (indicated by an asterisk in Fig. 4), we can see that Gs within the HVR show less DMS reactivity when K+ is present than when it is absent. While not definitive, these data are consistent with our other data and support the idea that the mouse Ms6-hm HVR is capable of tetraplex formation.

FIG. 3. Tetraplex assay of the Ms6-hm locus on templates containing 7-deazaguanine. The Ms6-hm HVR was assayed for tetraplex formation using PCR-generated templates containing either guanine or 7-deazaguanine as described under "Materials and Methods." The assay was conducted in the absence of added monovalent cation (0), in the presence of 50 mM K+ (KCl), or in the presence of 50 mM Na+ (NaCl). The lane markers T, C, G, and A indicate the bases on the template strand. The brackets alongside the gel indicate the extent of each tandem array. The filled arrow marks the first major K+-dependent block to DNA synthesis, with the second stop marked by an open arrow.
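The normalization step described above, in which each lane is scaled to a reference G outside the HVR before the K+ and K+-free lanes are compared, can be written out as a short calculation. This is a sketch under stated assumptions: the function names are hypothetical, band intensities are assumed to be already quantified as positive numbers, and the values in the usage example are invented for illustration and are not data from this study.

```python
def normalize_to_reference(band_intensities, reference_index):
    """Scale every band in one lane to the reference G outside the HVR."""
    reference = band_intensities[reference_index]
    if reference <= 0:
        raise ValueError("reference band intensity must be positive")
    return [value / reference for value in band_intensities]


def relative_dms_reactivity(plus_k, minus_k, reference_index):
    """Per-G ratio of normalized reactivity with K+ to that without K+.

    A ratio near 1 means no K+-dependent protection; a ratio below 1
    means the G is less DMS-reactive when K+ is present.
    """
    if len(plus_k) != len(minus_k):
        raise ValueError("both lanes must list the same G positions")
    norm_plus = normalize_to_reference(plus_k, reference_index)
    norm_minus = normalize_to_reference(minus_k, reference_index)
    ratios = []
    for p, m in zip(norm_plus, norm_minus):
        if m <= 0:
            raise ValueError("K+-free band intensity must be positive")
        ratios.append(p / m)
    return ratios


if __name__ == "__main__":
    # Hypothetical intensities: index 0 is the reference G outside the HVR.
    with_k = [100.0, 50.0, 40.0]
    without_k = [200.0, 200.0, 200.0]
    print(relative_dms_reactivity(with_k, without_k, reference_index=0))
```

Dividing by the reference band first removes lane-to-lane differences in loading, so a ratio below 1 reflects protection rather than simply less material in the K+ lane.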
Why a Na+-induced polymerase block is seen only with this sequence and not other tetraplexes we have tested (24, 25, 34, 47) is not clear, but preliminary evidence suggests that it is related to the involvement of adenines in the structure, since the sequence (CTGGG)12 shows K+-dependent but not Na+-dependent DNA polymerase arrest (data not shown). However, the mere presence of adenines is not sufficient to elicit a Na+ stop, since not all A-containing templates show such stops (Fig. 5). Rather, we believe the Na+ effect is related to a specific hydrogen bonding interaction in which As are involved. The molecular basis of the Na+ effect is currently under investigation.
Tandem arrays of the repeat 5′-TGG-3′ are polymorphic (49), as are a mixture of the triplets AGG and TGG (50). As with the mouse Ms6-hm minisatellite, we found that a template containing (TGG)20 blocked DNA synthesis in a K+-dependent manner (Fig. 5A), producing a series of premature chain termination products corresponding to arrest opposite the T residues of repeats 13-20 in the (TGG)20 tract. No blocks are seen when the complementary pyrimidine-rich strand is used as template (Fig. 5A). The blocks to DNA synthesis disappear when 7-deazaguanine is incorporated into the template strand (Fig. 5A). A single novel weak stop (open circle) is observed at the second guanine base in repeat 20 on PCR templates containing 7-deazaguanine. This stop is also seen in PCR templates containing guanines and is not dependent on monovalent cation since it is seen in the absence of KCl (data not shown). Since this stop is unique to the PCR templates, is not affected by substitution of Gs by 7-deazaguanine, and is not related to the presence of K+, we presume that it reflects some aspect of the linear templates that is not related to tetraplex formation. Most of the guanines in the TGG repeat are also either fully or partially protected from methylation by DMS (Fig. 6, left panel), consistent with tetraplex formation.
We have previously shown that a (CGG)20 tract blocks DNA synthesis in a similar manner, producing eight premature chain termination products opposite C residues at the 3′ end of the CGG tract (24). The similarity in both the pattern of polymerase arrest and DMS protection leads us to think that the tetraplexes formed by these sequences could be very similar. Such tetraplexes may contain G4 tetrads interspersed with pyrimidines or a smaller number of G4 tetrads interspersed with a mixture of Gs and either T or C. We have previously shown that an AGG triplet does not destabilize a CGG-containing tetraplex (24). It is therefore reasonable to assume that a mixture of AGGs and TGGs would also form a tetraplex.
We also tested repeats with the sequence 5′-GGGGAGGGGGAAGA-3′. Between 1 and 22 repeats of this unit are found upstream of the Huntington's disease gene in humans (51). A template containing 2.5 repeats of this sequence produces a complex pattern of premature chain terminations. There is at least one strong strand-specific K+-dependent block to DNA synthesis and a number of other more minor ones. A small amount of monovalent cation-independent polymerase arrest is seen at the 3′ end of the D4S43 tract. This may be due either to the formation of a small amount of tetraplex in the absence of monovalent cation or to the formation of another structure, such as a hairpin, that forms independently of added monovalent cation. A significant amount of monovalent cation-independent arrest is seen in the middle of this tract (indicated by the dashed line in Fig. 5B). This block is consistent with triplex formation between the G-rich template and the nascent strand (52). Any or all of these blocks to DNA synthesis could explain the difficulties reported in amplifying this region by PCR and the observation that incorporation of 7-deaza-dGTP is able to correct this problem (51). Once again, the K+-dependent blocks disappear when other monovalent cations are substituted for K+, or when K+ is omitted, and no K+-dependent stops are seen when the complementary pyrimidine-rich strand is used as a template.
Substitution of guanines in the template with 7-deazaguanine eliminates the K+-dependent blocks to DNA synthesis (Fig. 5B). The K+-independent polymerase arrest observed midway through the sequence is also eliminated, supporting the hypothesis that this stop may represent a purine:purine:pyrimidine triplex formed between the template and the nascent strand produced in the assay. This HVR shows a pattern of DMS modification with alternating regions of DMS protection and DMS reactivity in the presence of K+ (Fig. 6). This contrasts with the almost uniform reactivity of Gs in the absence of K+. Some of the most protected bases show a DMS reactivity indistinguishable from background. Both the 7-deazaguanine substitution data and the DMS protection data are thus consistent with tetraplex formation.
Four repeats from the type I diabetes-linked hypervariable region in the human insulin promoter also produce a number of K+-dependent blocks to DNA synthesis consistent with an array of different tetraplexes (Fig. 5C). These blocks are eliminated by substitution of guanine with 7-deazaguanine and are not observed on the complementary pyrimidine-rich strand. A number of Gs in the HVR are as reactive with DMS as a reference base outside the repeat (indicated with an asterisk in Fig. 6, right panel). These Gs are separated by regions of protected Gs in which no reactivity can be seen above background. Based on indirect evidence from gel electrophoretic mobility assays, and using enzymatic and chemical probes, it had been suggested that this region is able to form a series of intramolecular tetraplexes (43, 53, 54). Our data support this claim.

DISCUSSION
Our observations suggest that the ability to form an intrastrand tetraplex in vitro is a common feature of a number of hypervariable sequences, including the mouse minisatellite at the Ms6-hm locus, which is one of the most hypervariable sequences thus far described (44). The tetraplex formed by the repeats in the Ms6-hm tandem array is unusual in that it can be stabilized by Na+ as well as K+, albeit with lower efficacy. This contrasts with our observations that all other tetraplexes that we have tested are seen only in the presence of K+ (24, 25, 34, 40, 47). Since the ionic radius of Na+ is smaller than that of K+, it may be that the Ms6-hm tetraplex has smaller internal dimensions than the other previously described tetraplexes. This interpretation is consistent with the fact that other monovalent cations such as Rb+, Cs+, and NH4+ do not result in a block to DNA synthesis in our assay, since these ions have radii that are all larger than that of K+. Li+, on the other hand, is much smaller than Na+ and may still be too small to form the coordination complex that is important in stabilizing these types of structures (41). Our assay might thus be useful in distinguishing between different kinds of tetraplexes, such as those that are K+-specific and that correspond to previously described G4 tetrad-containing tetraplexes, and those that are also seen in the presence of other cations, specifically Na+, that may represent a novel class of tetraplex with different base interactions and thus different properties.
Since we have shown previously that the amount of K+ used in this assay represents saturating amounts of cation for tetraplex formation (24), it is likely that the same pattern of polymerase pausing/tetraplex formation would be seen at physiological [K+], which typically is around 150 mM in mammalian cells (55). Tetraplex formation in vivo would require these regions to be transiently unpaired at some time. This might occur during DNA replication or on extrusion from otherwise duplex molecules (53, 56) at any time during the cell cycle. In eukaryotic cells it is thought that only relatively small regions of DNA are unpaired during replication, although it has been suggested that many hundreds of bases can be unpaired under certain circumstances (57). Direct evidence for an altered structure in vivo has been obtained for one of these sequences, that of the human insulin HVR (58), suggesting that formation of DNA tetraplexes by the hypervariable sequences described here might in fact be possible. The fact that a variety of tetraplex-binding proteins have been isolated from eukaryotic cells (59-65) supports the idea that tetraplexes can form in vivo. The HVRs we have tested are much shorter than those actually found at their specified loci on chromosomes. Therefore not only could the number of potential tetraplexes at these loci be much larger, but the stability of these tetraplexes could be significantly higher as well.
A variety of other tandem repeats have been shown to form fold-back structures. These include the 5′-CAG-3′ repeat that is unstable in triplet expansion diseases such as Huntington's disease and myotonic dystrophy (26, 28, 29, 31) and the centromeric satellite sequence (27). Other simple satellites such as the A+T-rich hypervariable sequence in the 3′ region of the human apolipoprotein B gene (66) also have the potential to form cruciforms and hairpins. Some G+C-rich repeats may also form other unusual DNA structures such as triplexes (33).
In the strand slippage models for the generation and evolution of tandem arrays, the nascent strand dissociates from the template, allowing the two strands to slip relative to one another. Successful priming from the slipped position results in a change in repeat number. Factors that favor strand dissociation over polymerization or that stabilize a slipped nascent strand-template complex would be expected to affect the frequency with which repeat units are added to or lost from the array. Blocks to DNA synthesis, such as those resulting from tetraplex formation, would be expected to increase the likelihood that strand slippage would occur. Since the strongest blocks to DNA synthesis are encountered at the 3′ end of such an array, these structures would account for the polarity observed for the gain and loss of repeat units from tandem arrays (12, 16, 67). In addition, since polymerase pause sites are known to be hotspots for nucleotide misinsertions (68), such blocks could also explain the clustering of point mutations at one end of the array (12, 16, 67).
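The qualitative argument above, that a block to synthesis raises the chance of slippage and therefore the rate of repeat number change, can be illustrated with a toy simulation. This is a deliberately simplified sketch and not a model proposed in this paper: the function name, the one-repeat step size, and every probability used are hypothetical parameters chosen only to show the direction of the effect.

```python
import random


def simulate_repeat_number(start_repeats, rounds, pause_prob,
                           slip_prob_given_pause, rng, expansion_bias=0.5):
    """Toy model: each replication round may pause, then slip by one repeat.

    pause_prob            chance the polymerase stalls in the array (hypothetical)
    slip_prob_given_pause chance a stall leads to successful slipped priming
    expansion_bias        chance a slip adds rather than removes a repeat
    """
    repeats = start_repeats
    for _ in range(rounds):
        if rng.random() < pause_prob and rng.random() < slip_prob_given_pause:
            repeats += 1 if rng.random() < expansion_bias else -1
            repeats = max(repeats, 1)  # an array cannot drop below one repeat
    return repeats


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    for pause_prob in (0.0, 0.05, 0.5):
        changed = sum(
            simulate_repeat_number(20, 100, pause_prob, 0.1, rng) != 20
            for _ in range(1000)
        )
        print(f"pause_prob={pause_prob:.2f}: {changed} of 1000 arrays changed length")
```

In this sketch a sequence that never stalls the polymerase never changes length, while raising the stall probability raises the fraction of arrays that change, which is the dependence the slippage models predict.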
One model that attempts to explain the large scale increase in repeat number seen in some tandem arrays invokes a long-lived block to DNA synthesis that induces repeated strand slippage during replication (20). Tetraplexes make compelling candidates for this long-lived block since they form strong, stable blocks to DNA synthesis under physiological conditions (24, 25, 34). We have shown that even very long hairpins are not effective barriers to DNA polymerase in our assay (see Ref. 47 and Woodford et al.²), which suggests that sequences that are only able to form hairpins may not arrest DNA synthesis. This would be consistent with in vivo observations (69). However, both tetraplexes and hairpins may act to increase the frequency of successful strand slippage by stabilizing the strand slippage intermediate, thus increasing the likelihood that reinitiation of the polymerase would occur from the slipped position.
In addition, we would expect that the intramolecular tetraplex-forming tandem arrays are also likely to form intermolecular tetraplexes involving either one or three other DNA strands (70). Formation of such structures may facilitate synapsis of the DNA strands prior to crossing over during recombination. A combination of enhanced pausing at intrastrand tetraplexes, and enhanced synapsis between strands from different chromosomes or chromatids, may promote instability by facilitating strand switching.
It is possible that the formation of secondary structures in general may contribute to the generation and evolution of tandem arrays. In this regard, we would expect that the likelihood of structure formation would be affected by a variety of factors including the nature of the flanking sequences, the local chromatin structure, the transcriptional activity of a region, the rate of replication through the tandem array, the size of individual nucleotide pools, and whether or not the secondary structure-forming sequence is in the leading or lagging strand of DNA synthesis (71).