The development and use of a DNA polymerase arrest assay for the evaluation of parameters affecting intrastrand tetraplex formation.

We show here that a K+-dependent block to DNA synthesis is a sensitive and specific indicator of intrastrand tetraplex formation that can be used both to identify sequences with tetraplex-forming potential and to examine parameters that affect tetraplex formation. We show that tetraplex formation is determined by a complex combination of factors, including the size and base composition of its constituent loops and stems. In the process of carrying out this study we have found that the number of sequences with the ability to form tetraplexes is larger than previously thought, and that such sequences are ubiquitous in eukaryote genomes.

A number of recent studies have shown that certain guanine-rich DNA sequences form interstrand and intrastrand DNA tetraplexes. Interstrand tetraplexes are formed between G-rich tracts located on 2 or 4 different DNA strands. Intrastrand tetraplexes are formed by a single strand of DNA that contains at least 4 G tracts (Fig. 1). Gs from each of the G tracts interact to form a planar array of 4 guanines (a tetrad) held together by Hoogsteen bonds (1). Recently it has been shown that (G + C)-rich tracts can also form tetrads of the form G·C·G·C (2,3). A series of successive tetrads forms a hollow stem. In the case of intrastrand tetraplexes, this stem is bounded by three loops formed by the bases between the G tracts.
Interstrand tetraplexes have been implicated in the pairing of homologous chromosomes during meiosis (4) and switch recombination in immunoglobulin genes (4), and they may play important roles in the functioning of telomeres (4,5). Recently, proteins that bind specifically to tetraplexes (4-6) or that catalyze tetraplex formation (6,7) have been described. These results, together with the demonstration of intrastrand tetraplex formation in vivo (8), suggest that tetraplexes may have a biological function. While a large number of studies have focused on interstrand tetraplexes formed by telomere or telomere-derived sequences (9-14), relatively little is known about the sequence constraints governing the formation and stability of intrastrand tetraplexes.
We have recently shown that a G-rich sequence from the chicken β-globin promoter provides a K+-dependent block to DNA synthesis, whose properties are consistent with the formation of an intrastrand tetrahelical structure (15). Analysis of the K+-dependent structure showed it to be an unusual "cinched" tetraplex containing a number of non-guanine bases in the tetraplex stem (2 Ts and a C) and stabilized by interactions between one of the tetraplex loops and bases in the flanking sequence (16). (CGG)n tracts also form K+-dependent blocks to DNA synthesis consistent with tetraplex formation (17). This tetraplex may contain a mixture of G tetrads and G·C·G·C tetrads (2,3), or G tetrads and C·C+ base pairs (17,18).
Using a series of known tetraplex-forming model sequences, we show that the ability to block DNA synthesis in a K+-dependent manner is not a peculiar property of these unusual tetraplexes, but rather is diagnostic of intrastrand tetraplexes in general. This property provides a rapid and simple assay for these structures and for studying the parameters that affect tetraplex formation. This assay can be carried out under physiological conditions and can detect tetraplexes at concentrations that are orders of magnitude lower than those required by classical analytical techniques such as CD spectroscopy, x-ray crystallography, and NMR, even when tetraplexes constitute only a small fraction of DNA molecules in the population. We have used this assay to identify a number of tetraplex-forming sequences in eukaryote genomes.

MATERIALS AND METHODS
Clone Construction-Oligonucleotides were synthesized on an ABI 381A oligonucleotide synthesizer and cloned into either the plasmid pMS189Δ or M13mp18 as described previously (15,17). Plasmids were replicated in Escherichia coli MBM7070, isolated by alkaline lysis, and purified by CsCl gradient centrifugation according to standard procedures. Single-stranded phage DNA was purified as described elsewhere (17). The plasmid (C)1 was a gift of R. Howell.
Methylation of Plasmids-DNA was methylated with the SssI methylase (New England Biolabs), using the buffers and reaction conditions recommended by the supplier. After phenol extraction, the methylated DNA was ethanol-precipitated and resuspended in 10 mM Tris-HCl, pH 8.0, 1 mM Na2EDTA (TE) at a final concentration of approximately 1 µg/µl. Methylation was confirmed by testing for protection from HpaII digestion.
DNA Synthesis Arrest Assays-Arrest of DNA synthesis was monitored in a modified linear amplification sequencing reaction. Primers for these reactions were end-labeled with [γ-32P]ATP (DuPont NEN, 3000-6000 Ci/mmol) using T4 polynucleotide kinase (Epicentre Technologies, Inc.) and a K+-free kinase buffer (50 mM Tris-HCl, pH 8.0, 10 mM MgCl2). Reaction conditions for this assay are based on those recommended by Epicentre Technologies for use in their SequiTherm™ sequencing kits. Each reaction contained 2 nM of template, 0.16 nM of primer, 10 µM each of dATP, dCTP, dTTP, and 7-deaza-dGTP, 50 mM Tris-HCl (pH 9.3), 2.5 mM MgCl2, 5 units of Taq DNA polymerase (Life Technologies, Inc.), and one of the following dideoxynucleotides at the concentration indicated in parentheses: ddATP (0.3 mM), ddGTP (0.017 mM), ddCTP (0.2 mM), ddTTP (0.6 mM), in a total volume of 6 µl. Cations were added to this mix at the indicated concentrations. Reactions were subjected to 30 cycles of heating and cooling (30 s at 95°C, 30 s at 55°C, and 30 s at 72°C). One half volume (3 µl) of stop buffer (95% (v/v) formamide, 10 mM EDTA (pH 9.5), 10 mM NaOH, 0.1% xylene cyanol, 0.1% bromphenol blue) was then added and reactions were heated at 90°C for 5 min prior to loading on a 6% polyacrylamide sequencing gel.
The effect of acidic pH on tetraplex formation was tested by comparing the extent of polymerase arrest in buffers containing 50 mM MES¹ and 10 mM MgCl2 at pH 7.5 and 6.5.
Dimethyl Sulfate (DMS) Protection Assays-DMS protection assays were performed on gel-purified oligonucleotides using the method of Williamson et al. (5) with slight modifications. End-labeled oligonucleotide (1 ng/reaction) was resuspended in 18 µl of TE buffer and heated for 1 min at 90°C. Potassium chloride (1 µl) was added to appropriate tubes to a final concentration of 50 mM. Reactions were then heated for 30 s at 95°C, 30 s at 55°C, and 30 s at 72°C, cooled to room temperature or 55°C, and reacted for 1 min with 1 µl of DMS (diluted 1:10 in water). Reactions were terminated by addition of 20 µl of 2 M pyrrolidine (diluted in cold water) and cleavage effected at 90°C for 15 min. Carrier DNA (E. coli genomic DNA; 2 µg) was added, and samples were precipitated with 1.2 ml of butan-1-ol. The samples were dried under vacuum, redissolved in 10 µl of 42.5% (v/v) formamide, 5 mM EDTA (pH 9.5), 5 mM NaOH, 0.05% xylene cyanol, 0.05% bromphenol blue, denatured for 5 min at 90°C, and run on a 20% polyacrylamide sequencing gel.

RESULTS
K+-dependent Blocks to DNA Synthesis Are a General Property of Tetraplex-forming Sequences-When the known tetraplex-forming sequence (T2G4)4 (5) is used as a template in in vitro DNA synthesis reactions, a set of premature polynucleotide chain termination products is observed at the beginning of the tetraplex-forming sequence (Fig. 2A). The major chain termination product comigrates with the base immediately preceding the 3′-most G in the (T2G4)4 tract (indicated by an arrow, Fig. 2A). The premature chain termination products or arrests are seen only when KCl is included in the reaction. No arrest is observed in the absence of monovalent cation (Fig. 2A), or when LiCl, NaCl, RbCl, CsCl, or NH4Cl is substituted for KCl (data not shown). Potassium glutamate also produces the same premature chain termination products, confirming that it is the cation rather than the anion that is responsible for the formation of the block to DNA synthesis (data not shown). K+-dependent arrest of DNA synthesis occurs with all DNA polymerases tested, including Taq polymerase, T4 DNA polymerase, Sequenase™ (a modified T7 DNA polymerase), avian myeloblastosis virus reverse transcriptase, and SequiTherm™ polymerase (data not shown), indicating that arrest of DNA synthesis is not polymerase-specific. Since DNA synthesis in the region preceding the tetraplex-forming sequence is similar in the presence or absence of K+, it is apparent that the arrest of DNA synthesis is not due to an adverse effect of K+ on the polymerase. The effect of K+ is not ameliorated by the use of polymerase accessory proteins such as E. coli single-stranded binding protein or gene 32 protein, or the inclusion of glycerol, formamide, or dimethyl sulfoxide (data not shown), reagents which have been shown to reduce premature chain termination (19).
The K+-dependent block to DNA synthesis is strand-specific since no block to DNA synthesis is observed when the pyrimidine-rich strand is used as a template (Fig. 2B). Furthermore, these blocks to DNA synthesis are seen even when single-stranded phage DNA containing these sequences is used as the template (data not shown). This rules out incomplete melting of the duplex or the formation of an interstrand structure involving the complementary strand, e.g. a triplex. Interstrand triplexes have been previously shown to block DNA synthesis, but blockage is independent of K+, requires the presence of both DNA strands, and is seen only when the pyrimidine-rich strand is used as a template (20). The same blocks to DNA synthesis are seen when the (T2G4)4 sequence is cloned in different orientations or in different vectors, indicating that the (T2G4)4 sequence itself is sufficient to block DNA synthesis (data not shown).
Formation of these blocks to DNA synthesis is rapid since they are seen even when templates are incubated for less than 10 min in the presence of K+, and the amount of DNA synthesis arrest does not increase with longer incubations before polymerase addition (data not shown). These blocks are also very stable since they are still observed when templates are incubated at 85°C before addition of the polymerase. These data suggest that the extent of premature chain termination observed reflects the amount of underlying structure formed under any given set of circumstances, rather than their stability or rate of formation.
¹ The abbreviations used are: MES, 2-(N-morpholino)ethanesulfonic acid; DMS, dimethyl sulfate.
The sequence G13C13, which is 2 bases longer than (T2G4)4 and is able to form a fully Watson-Crick base-paired hairpin, produces very little DNA synthesis arrest, and the small amount of arrest seen is K+-independent (Fig. 2C). These data suggest that a hairpin is not responsible for the K+-dependent arrest of DNA synthesis seen on the (T2G4)4 templates.
Both the number of different premature chain termination products at the beginning of the (T2Gn)4 sequence (S1 in Fig. 3A) and the amount of each product increase as the number of Gs in each repeat tract increases (Fig. 3A). This is consistent with both an increase in the number of potential structures and an increase in the proportion of molecules that form such structures as the number of Gs increases. Some K+-dependent polymerase blockage is seen at the beginning of the second G tract when the number of Gs in each tract is 5 or more (S2 in Fig. 3A). These blocks are orientation-dependent (data not shown), suggesting that they represent interactions between the G tracts and 5′-flanking bases. A small amount of K+-independent DNA synthesis arrest is also seen at the beginning of the first guanine tract (S1) when the number of Gs in each tract is 6 or more (Fig. 3B). This might be due either to the presence of a structure such as a hairpin which is unaffected by K+, or to a small amount of tetraplex formed in the absence of K+. Some monovalent cation-independent DNA synthesis arrest is seen in the middle of the (T2Gn)4 sequence where n ≥ 6 (S3 in Fig. 3, A and B) and is characteristic of triplexes formed between the template and the nascent strand (21).
K+-dependent arrest of DNA synthesis is seen over the template concentration range 0.02-40 mM, with a greater proportion of plasmid template molecules blocking DNA synthesis as the template concentration decreases (data not shown). This suggests that arrest of DNA synthesis involves the formation of intrastrand rather than interstrand structures, and that their formation is favored when reannealing of the template strand with its complement is reduced.
These data are consistent with the underlying physical basis of the K+-dependent arrest of DNA synthesis on (T2Gn)4 templates being intrastrand tetraplex formation. To check that these are simple tetraplexes rather than "cinched tetraplexes" (16) or some other unusual tetraplex, we probed an oligonucleotide containing the sequence (T2G7)4 with DMS. The N-7 position of guanines in tetraplexes is involved in hydrogen bonding and is thus protected from modification by DMS, a reagent which reacts specifically with guanines at the N-7 position. As Fig. 4A shows, the sequence (T2G7)4 has distinctly different patterns of DMS reactivity in the presence or absence of K+. In the absence of K+ strong uniform DMS modification is seen throughout the sequence; however, in its presence the extent of DMS modification is considerably reduced. The bases at the ends of the G tracts are modified about 50% less than they are in the absence of K+ as judged by densitometric analysis, while bases in the middle of the G tracts are modified only 10-25% as much. The pattern and extent of DMS reactivity is most consistent with the formation of a number of simple tetraplexes in which the inner bases in each G tract are more often involved in tetrads than the outer Gs, which are more likely to reside at least part of the time within the loops. The simplest interpretation of the data is that the most stable tetraplexes formed are two conformers, both of which have at least 3 bases in each loop. Two possible structures are shown in Fig. 4, B and C.
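The densitometric comparison described above amounts to a per-base ratio of band intensity with and without K+. The following Python sketch illustrates that calculation only; the band intensities are hypothetical values chosen to resemble the reported pattern (about 50% at the ends of a G tract, 10-25% in the middle), and the function name and the 0.25 cutoff are assumptions made for the example, not part of the published analysis.

```python
def relative_reactivity(plus_k, minus_k):
    """Return the per-base DMS signal with K+ as a fraction of the K+-free signal."""
    if len(plus_k) != len(minus_k):
        raise ValueError("band lists must have the same length")
    return [p / m for p, m in zip(plus_k, minus_k)]


# Hypothetical band intensities (arbitrary units) for the seven Gs of one G7 tract.
minus_k = [100.0] * 7
plus_k = [50.0, 25.0, 15.0, 10.0, 15.0, 25.0, 50.0]

ratios = relative_reactivity(plus_k, minus_k)
for position, ratio in enumerate(ratios, start=1):
    # The 0.25 cutoff is illustrative, not a published threshold.
    label = "strongly protected" if ratio <= 0.25 else "partly reactive"
    print(f"G{position}: {ratio:.2f} of the K+-free signal ({label})")
```

In this toy data set the inner Gs fall at or below the cutoff while the outer Gs do not, which mirrors the interpretation that inner bases are more often involved in tetrads.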
Parameters Affecting Tetraplex Formation-These experiments demonstrate that K+-dependent arrest of DNA synthesis is not a unique feature of the previously described "cinched" tetraplex (16), but rather is a property of tetraplexes in general. We have used this property to examine the effects of the length and composition of the stem and spacer regions, as well as the effect of pH and methylation status on tetraplex formation.
[Fig. 8 legend, partial: ... (A)1, which contains a single A substitution (template sequence: (T2G5T2GGAGGT2G5T2G5)), and (C)1, which contains a single C substitution (template sequence: (T2G5T2GGCGGT2G5T2G5)). (T2G5)4 is shown for comparison. Panel B shows the effect of methylation on tetraplex formation by the (C)1 template. Methylation status is indicated by the − and + signs. The lane markers T, C, G, and A indicate bases on the template strand and the arrows indicate the beginning of the first G-rich repeat.]
Loop Parameters-The shortest sequence capable of producing a block to DNA synthesis in this assay was (T2G3)4 (Fig. 3A). In experiments not shown here, neither single-stranded templates containing (T2G2)4 nor the thrombin-binding aptamer, G2T2G2TGTG2T2G2 (22), blocked DNA synthesis even after prolonged incubation in the presence of K+. The thrombin-binding aptamer has recently been shown by NMR to form an intrastrand tetraplex in which loops 1 and 3 each contain 2 bases, while loop 2 has 3 bases (22). However, the NMR observations of tetraplex formation were made at low temperature (20°C or below) or in the presence of more than 100 mM KCl. Our DMS modification data indicate that, for (T2G7)4, the most stable structures formed have at least 3 bases in each loop. The negative results with (T2G2)4 and the thrombin-binding aptamer suggest that at least 2 tetrads and 3 loop bases are essential for stable tetraplex formation under these conditions. This would be consistent with the observations of Balagurumoorthy and Brahmachari (23) for the human telomeric sequence (T2AG3)4.
TABLE I. List of potential tetraplex-forming sequences. A number of potential tetraplex-forming sequences found in a search of GenBank™ are shown divided into groups according to their location in the genes in which they are found. These sequences were found using Findpatterns, a GCG subroutine (Wisconsin Package, Version 8, Genetics Computer Group, Madison, WI), using as query sequences some of the sequence motifs defined in this study. N was substituted for specific bases in the query sequences in those instances where we have found that the identity of the base is unimportant. GenBank™ accession numbers and descriptions are given for each sequence. The strand on which the purine-rich run is located is indicated as coding strand (+) or noncoding strand (−).
Sequences of the form (TxGy)4 were used to examine the effect of loop size and base composition on the amount of tetraplex formed. In initial experiments, a template containing only a single T residue in each repeat, i.e. x = 1 and y = 5, was compared to templates where x = 2 and y = 4 or 5. In these experiments equally strong tetraplex formation was observed for all templates where y = 5 (Fig. 5). Assuming that 3 bases are required for each loop, (TG5)4 must have 2 Gs and a T in each loop, with 3 tetrads in the stem, while (T2G5)4 must have 2 Ts and a G, with 4 tetrads in the stem. The fact that (TG5)4 blocked DNA synthesis to a similar extent as (T2G5)4, in spite of having fewer potential tetrads, suggests that Gs in the loop stabilize tetraplexes, perhaps by means of interloop interactions or stacking energy contributions. This is consistent with our previous observation that the sequence G26 forms a much stronger block to DNA synthesis than any of these templates, completely blocking chain extension (16).
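The loop and stem bookkeeping in this comparison can be written out explicitly. The Python sketch below is an illustration, not part of the study: it assumes, as the text does, that every loop needs 3 bases, that the Ts of each repeat go into the loop first, and that any shortfall is made up by Gs taken from the G tract. The function names and the 2-tetrad stability cutoff (taken from the negative result with (T2G2)4) are assumptions of the sketch.

```python
def stem_and_loop_composition(x, y, min_loop=3):
    """For a (TxGy)4 repeat, return (tetrads, loop_Ts, loop_Gs).

    Simplified model: each loop must contain at least `min_loop` bases;
    the x Ts are used first and the remainder is borrowed from the G tract.
    """
    loop_gs = max(0, min_loop - x)   # Gs diverted from the tract into the loop
    tetrads = max(0, y - loop_gs)    # Gs left in each tract to build the stem
    return tetrads, x, loop_gs


def predicted_to_form_tetraplex(x, y, min_tetrads=2):
    """Apply the 'at least 2 tetrads' rule of thumb suggested by the data."""
    tetrads, _, _ = stem_and_loop_composition(x, y)
    return tetrads >= min_tetrads


for x, y in [(1, 5), (2, 5), (2, 4), (2, 3), (2, 2)]:
    tetrads, loop_ts, loop_gs = stem_and_loop_composition(x, y)
    print(f"(T{x}G{y})4: {tetrads} tetrads, loop = {loop_ts} T + {loop_gs} G, "
          f"stable: {predicted_to_form_tetraplex(x, y)}")
```

Under these assumptions the sketch reproduces the counts given in the text for (TG5)4 and (T2G5)4, and it separates (T2G3)4, which blocks synthesis, from (T2G2)4, which does not. It says nothing about the relative amount of tetraplex formed.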
In the next set of experiments, the size and base composition of the central spacer region (spacer 2 in Fig. 1A) of the sequence (T2G4T2G4XnG4T2G4) was altered. Xn represents the altered bases which would fall into loop 2 of the tetraplex. There was a small but reproducible increase in the amount of tetraplex formed when loop 2 contained a G residue (Fig. 6A). This is consistent with our earlier observation that Gs in the loop stabilize tetraplexes. The base composition of loop 2 had little effect for spacer lengths of 3 where guanine was not one of the loop bases (Fig. 6A). For spacer regions consisting of TTTT, TTTA, CCCC, or AAAA, the pure C and A tracts were more destabilizing than pure T or T-rich tracts. The greater stability of T-rich loops may be the result of base stacking or interactions between loop bases. The potential for T:T base pairing in the loops has been proposed from NMR data (22) and model building studies (24).
Tetraplex formation occurred even when spacer 2 consisted of 10 Ts (Fig. 6B). The optimal spacer length was found to be between 4 and 6 bases, consistent with studies of DNA hairpins, where a loop size of 4-5 nucleotides has been found to be optimal (25).
Templates with 6 Ts in loop 1 (T2T2T6 in Fig. 7) formed a similar amount of tetraplex as did templates with 6 Ts in loop 2 (T2T6T2 in Fig. 7). Increasing the number of bases to 6 in both spacer regions 1 and 3, or in all 3 spacers, had little effect on the amount of polymerase arrest seen. Two pathways for tetraplex folding are currently favored (24). The first involves the formation of two hairpins (i.e. between G tracts 1 and 2, and between G tracts 3 and 4, Fig. 1A) that then interact to form the tetraplex. In this model the loops of the two hairpins become loops 1 and 3 of the tetraplex, while the bases between the two hairpins form loop 2 of the tetraplex. The second model involves the formation of a single large hairpin that becomes folded over to form a tetraplex. The loop of this hairpin forms loop 2 of the tetraplex. The fact that tetraplex formation is not affected by the addition of extra bases to loops 1 or 2 is most consistent with the first model, since these bases would be expected to adversely affect the stability of the folding intermediate in the second model.

A Limited Number of Non-G Bases May Be Tolerated in the Stem-The cinched tetraplex formed in the chicken β-globin gene promoter (15,16) is able to incorporate a number of non-G bases into the stem of the tetraplex. We believe that much of the destabilizing effect expected of these bases is offset by stability provided by the "cinch" (16). To test whether non-G bases could be incorporated into a normal tetraplex, and to examine the effect of different non-G bases on tetraplex stability, we altered a model tetraplex-forming sequence so that it would contain a single non-G base in the stem.
The introduction of either a single A or C into the stem of the sequence (T2G5)4 greatly reduces the extent of polymerase arrest (Fig. 8A, panels (A)1 and (C)1), consistent with the effect of disruption of a tetrad. The effect of the C residue is added proof that hairpin formation is not involved in K+-dependent DNA synthesis arrest, since the substitution of a G-C pair for a G-G one should not reduce the stability of such a hairpin. Cytosine methylation had no effect on DNA synthesis arrest (Fig. 8B), nor did lowering the pH to 6.5 (data not shown).
Tetraplex Formation at Physiological Temperature and K+ Concentrations-To see whether tetraplex formation was possible at physiological temperatures, we carried out DNA synthesis arrest assays on templates containing the sequence (T2Gn)4, where n = 2-4. Templates were heated and cooled in the absence of K+, and then K+ was added after the reaction had equilibrated at 37°C for 5 min. K+-dependent tetraplex formation was seen for templates in which n ≥ 3 even under these conditions (data not shown).
Arrest of DNA synthesis is seen with as little as 5 mM K+ and increases as the K+ concentration increases (data not shown). Since intracellular K+ concentrations range from 100 to 150 mM in mammalian cells (26), these results suggest that tetraplexes could be formed in vivo by relatively short G-rich sequences whenever DNA is single-stranded for short periods of time.

DISCUSSION
We have shown here that the formation of a K+-dependent block to DNA synthesis is a general diagnostic feature of intrastrand tetraplexes. The K+ ion specificity is probably related to the fact that K+ is unique among the monovalent cations in being both small enough to enter the stem cavity and large enough to stabilize tetraplexes by forming a coordination complex with Gs in adjacent tetrads (4). The K+ effect on polynucleotide chain extension makes a simple, rapid, and specific assay for sequences able to form such structures that is easy to visualize and very sensitive. Structures can be resolved at single base resolution on sequencing gels, and it is possible to detect multiple structures simultaneously even when only a small fraction of the molecules have formed a tetraplex. Tetraplex formation can be observed under conditions that might reasonably be expected to occur in cells, and the assay is amenable to studies of factors that affect tetraplex stability.
We have used this assay to explore some of the sequence parameters that affect tetraplex formation and stability. We show that at least 3 bases in each loop are needed, and that Gs in the loops have a stabilizing effect. Ts, Cs, and As all have similar effects on tetraplex stability for loop sizes of 3, but when the loop size is increased, Ts are slightly more stabilizing than either As or Cs. At least 10 Ts can be tolerated in the central loop, with 4-6 being optimal. Our data indicate that a folding mechanism that involves the interaction of two small hairpin intermediates to form the tetraplex is a significant pathway for the formation of these structures. We also show that even a single non-G base in the putative tetraplex stem severely destabilizes the structure, although some arrest is still observed. These structures are, however, relatively insensitive to differing numbers of bases in the constituent loops. This is interesting since it indicates that sequences lacking a recognizable 4 G-repeat motif may still be able to form tetraplexes. This vastly increases the number of sequences that can potentially form such structures.
From the experiments described in this report we were able to identify a number of sequences in GenBank™ that might form strong intrastrand tetraplexes (Table I). In addition to DNA tetraplexes, it is probable that transcripts containing these sequences also form tetraplexes, and an effect of these structures at the RNA level is thus also possible.
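A pattern search of this kind can be approximated with a regular expression. The Python sketch below is not the Findpatterns query used for Table I, which is not given in the text; it is a hypothetical stand-in that looks for four G tracts separated by three short spacers on either strand. The default thresholds (G tracts of at least 3, spacers of 1-10 bases of any composition) are assumptions loosely based on the results above, and strand labels refer to the input sequence as given, not to the coding strand.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def build_pattern(min_g=3, min_loop=1, max_loop=10):
    """Four G tracts separated by three spacers; spacers may contain any base."""
    tract = "G{%d,}" % min_g
    spacer = "[ACGT]{%d,%d}?" % (min_loop, max_loop)
    return re.compile(tract + (spacer + tract) * 3)


def find_candidates(seq, **thresholds):
    """Return (strand, start, end, match) tuples for non-overlapping hits.

    '+' hits use coordinates on the input sequence; '-' hits use
    coordinates on its reverse complement.
    """
    pattern = build_pattern(**thresholds)
    hits = []
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for m in pattern.finditer(s):
            hits.append((strand, m.start(), m.end(), m.group()))
    return hits


# Example: (T2G4)4 is reported, (T2G2)4 is not.
print(find_candidates("TTGGGG" * 4))
print(find_candidates("TTGG" * 4))
```

Because spacers may themselves contain G, an uninterrupted G run long enough to supply four tracts and three one-base spacers is also reported, in keeping with the strong block observed for G26. A hit is only a candidate; the arrest assay remains the test of whether a tetraplex actually forms.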
A number of the sequences listed in Table I are located in the 5′-regulatory region of a variety of eukaryote genes. It is possible that these structures have a regulatory role by affecting protein binding to the promoter. For example, the tetraplex in the chicken β-globin promoter overlaps with the binding site of an erythroid-specific DNA binding protein (27). We speculate that tetraplex formation may block binding of this protein and thus represents one level at which regulation of the β-globin gene could be accomplished. Tetraplex-forming sequences are also located in the 5′-untranslated regions, introns, and 3′-untranslated regions of a number of interesting genes, including the human genes encoding argininosuccinate lyase (28), erythropoietin (29), and acrosin (30). A number of nontranslated RNAs also have tetraplex-forming potential. Examples of this class of sequences include the 28 S ribosomal RNA gene in Xenopus (31), the human X (inactive)-specific transcript (32), and the product of the mouse H19 gene, a developmentally regulated gene under the control of raf and Rif (33). We have confirmed that at least one of the sequences shown in Table I, the one found in the 3′-untranslated region of the human retrotransposable element known as L1 (34), does indeed form a tetraplex.² This structure may play a role in L1 retroposition because it has been conserved for the ~100 million years since the mammalian radiation in the absence of sequence conservation.² These structures may act in other contexts as cell cycle checkpoints by slowing replication through these regions. We would also expect that such structures would block transcription and translation. Transcriptional blocks may function to prevent unwanted transcriptional read-through from an upstream promoter, while translational blocks may prevent inappropriate protein synthesis by blocking scanning by the 40 S ribosomal subunit.
The biological significance of intrastrand DNA tetraplexes is obviously contingent on their ability to form in vivo. In order for such structures to form, the DNA must be single-stranded. In the assay described here, single strands were obtained by heat denaturation. In eukaryotic cells it is thought that only relatively small regions of DNA are unpaired during replication, although it has been suggested that many hundreds of bases would be unpaired under some circumstances (35). However, even if this were so, much of this DNA would probably be bound by single-stranded binding proteins. On the other hand, tetraplex formation has been demonstrated in supercoiled plasmids in vitro (36,37), suggesting that under some conditions these structures may be formed in otherwise duplex DNA. Direct evidence for tetraplex formation in vivo has been obtained for at least one sequence, that of the human insulin promoter (8), suggesting that formation of DNA tetraplexes by sequences such as those shown in Table I might in fact be possible. Our exploration of sequence variations in tetraplex-forming sequences is clearly not exhaustive. It does, however, illustrate the flexibility of the requirements for tetraplex formation, and greatly expands the category of known tetraplex-forming sequences.