An intramolecular triplex in the human gamma-globin 5'-flanking region is altered by point mutations associated with hereditary persistence of fetal hemoglobin.

The properties of an intramolecular triplex formed in vitro at the 5′-flanking region of the human γ-globin genes were studied by chemical and physical probes. Chemical modifications performed with osmium tetroxide, chloroacetaldehyde, and diethyl pyrocarbonate revealed the presence of non-paired nucleotides on the “coding strand” at positions −209 through −217. These reactivities were induced by negative supercoiling, low pH, and magnesium ions. Downstream point mutations associated with hereditary persistence of fetal hemoglobin (HPFH) altered the extent of the modifications and some of the patterns. Specifically, C−202 → G and C−202 → T significantly decreased the reactivities, whereas the reactivities were increased and their patterns altered in T−198 → C. C−196 → T and C−195 → G caused local decreases in reactivity. Modifications at the upstream flanking duplex were modulated by the composition of the vector sequence. In summary, our data indicate the formation of an intramolecular triplex between nucleotides −209 to −217 of the “non-coding strand” and the downstream sequence containing the HPFH mutations. All of the HPFH point mutations altered the structure. More than one sequence alignment is possible for each of the triplexes. In addition, a consequence of some of the point mutations may be to facilitate slippage of the third strand relative to the Watson-Crick duplex.

Human hemoglobin is synthesized from two sets of clustered genes designated the α-cluster (ζ-ψζ-ψα-α2-α1) located on chromosome 16 and the β-cluster (ε-Gγ-Aγ-ψβ-δ-β) on chromosome 11. Expression of the genes follows a developmentally regulated, tissue-specific program that allows ζ and ε to be transcribed during early embryonic life in placental yolk sac-derived red cells. At subsequent stages of development, globin expression shifts to the α and γ genes in the red cells of hepatic origin. At the time of birth, β-chains are predominantly expressed, and erythropoiesis shifts to the bone marrow (1,2). This pattern of expression can be altered by mutations affecting any of the transcribed genes (1-3), but a condition that has attracted particular attention is the hereditary persistence of fetal hemoglobin (HPFH) caused by point mutations at the 5′-flanking region of either one of the γ-genes. These single nucleotide changes cause the affected allele to permanently express high levels of γ-chains in adult red cells.
Although the molecular mechanisms responsible for this protracted expression are not known, alterations in the recognition of cis-acting elements by regulatory factors and/or in the supramolecular assembly of chromatin have been invoked (4-7 and references therein). Ulrich et al. suggested that selected mutations in the −200 region destabilize a non-B DNA structure formed during the course of the γ-to-β switching (4). This hypothesis was supported by the finding that some of the mutations abolished an S1 nuclease-hypersensitive site (S1-HSS) located just upstream, which suggested the formation of an intramolecular triplex (I.T.). This occurrence is interesting for several reasons. First, it is becoming increasingly evident that various types of sequence motifs, including simple repeating defined-order-sequences, have the potential of adopting non-B conformations and that these structural transitions occur in biological systems (8-11). Second, the isolation of DNA-binding proteins specific for pyrimidine-rich or purine-rich motifs is intriguing (12-16 and references cited therein), since these sequences are known to undergo conformational polymorphisms. Finally, diseases inherited by non-Mendelian genetic mechanisms have been recently associated with the expansion of tandemly repeated DNA motifs (17-23). The mechanism(s) through which such aberrant expansions are carried out is unknown, but the propensity of defined-order-sequences to adopt multiple conformations suggests that these properties may be directly involved in the process (24-26).
Intramolecular triplexes have been well characterized in recent years and are known to form at mirror repeat oligopurine-oligopyrimidine tracts under the influence of negative supercoiling and low pH. Sequences of this type, but usually with imperfect mirror repeat symmetries, are often found at regulatory regions in eukaryotic genomes and have been proposed to participate in the regulation of physiological processes such as transcription and recombination. Additionally, some of these sequences have been shown to adopt I.T. structures under appropriate conditions (27-33).
Here we extend the previous studies on the human γ-globin 5′-flanking sequence, which identified the formation of an I.T. based upon S1 nuclease and oligomer binding assays. By applying chemical probe analyses and two-dimensional agarose gel electrophoresis, we now identify the bases associated with the Hoogsteen-paired third strand and describe the structural alterations introduced by the HPFH point mutations.

EXPERIMENTAL PROCEDURES
Plasmid DNA. Plasmid pγ-200 contains the sequence of the 5′-flanking region of human γ-globin genes from bp −228 to −189 inserted at the HincII site of pUC9. Plasmids −202G, −202T, −198C, −196T, and −195G harbor the same γ-globin fragment with the sequence diverging at the indicated position to reproduce the HPFH point mutations. The mutated inserts were cloned at the HincII site in −202G and −202T, and at SmaI in −198C, −196T, and −195G. The cloning was described previously (4). Plasmid pγ-200S is as pγ-200, but the γ-globin insert was cloned at the SmaI site of pUC9. This was performed using synthetic oligonucleotides and standard procedures (34). Plasmid DNA was amplified in the Escherichia coli strain DH5α and purified twice through CsCl banding. Na+ ions were exchanged for Cs+ by dialysis in 10 mM Tris·HCl, pH 8.0, 50 mM NaCl, 1 mM EDTA. DNA was stored in 10 mM Tris·HCl, pH 8.0, 10 mM NaCl, 1 mM EDTA.
Preparation of Topoisomers at Defined Superhelical Density. 12 μg of plasmid DNA was incubated with various concentrations of ethidium bromide from 0.1 to 4.0 μg/ml for 90 min at 37°C in 300 μl of J-1 buffer (10 mM Tris·HCl, pH 7.6, 50 mM KCl, 1 mM EDTA, 10 mM dithiothreitol) in the presence of 26 units of topoisomerase I prepared from chicken erythrocytes according to the procedure of Germond et al. (35). Ethidium bromide was allowed to intercalate into the DNA for 30 min at room temperature before topoisomerase I was added. DNA isomers differing in their linking numbers were resolved on 1.2% agarose gels containing increasing concentrations of chloroquine from 0.1 to 400 μM (36). The average number of superhelical turns (y) introduced at each concentration of ethidium bromide (x) was described by the function y = 0.05 + 10.25x − 0.32x² (r² = 0.9990). Mean superhelical densities (−σ) were derived from −σ = yh/d, where h is the DNA helical repeat (10.5 bp) and d is the size of the plasmid (2,750 bp).
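As a quick numerical check of the two relations above, the following minimal Python sketch evaluates the fitted quadratic and converts the result to a mean superhelical density. The function names are ours and purely illustrative; the coefficients, the helical repeat, and the plasmid size are the values quoted in the text.

```python
def superhelical_turns(etbr_ug_per_ml):
    """Average number of superhelical turns (y) from the fitted quadratic."""
    x = etbr_ug_per_ml
    return 0.05 + 10.25 * x - 0.32 * x ** 2


def superhelical_density(etbr_ug_per_ml, helical_repeat_bp=10.5, plasmid_bp=2750):
    """Mean superhelical density, -sigma = y * h / d."""
    return superhelical_turns(etbr_ug_per_ml) * helical_repeat_bp / plasmid_bp


if __name__ == "__main__":
    # Ethidium bromide concentrations within the 0.1-4.0 ug/ml range used.
    for x in (0.1, 1.0, 2.0, 4.0):
        y = superhelical_turns(x)
        print(f"{x:4.1f} ug/ml  y = {y:6.2f} turns  -sigma = {superhelical_density(x):.3f}")
```

At the highest ethidium bromide concentration (4.0 μg/ml) the sketch gives about 35.9 turns and −σ ≈ 0.137, which agrees with the superhelical density quoted for the supercoiled samples.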
Chemical Modifications. 2.4 μg of plasmid DNA, corresponding to 72 μM of phosphate-DNA, was reacted with chemical probes in 100 μl of 50 mM Tris acetate, pH 4.5, 0.1 mM EDTA, 2 mM MgCl2, 150 mM NaCl. DNA was allowed to equilibrate at 22°C for 30 min before the modifications were started. OsO4 was used at 1.8 mM for 30 min at 22°C, chloroacetaldehyde (CAA) at 4% (v/v) for 3 h at 25°C, and diethyl pyrocarbonate (DEPC) at 1.5% (v/v) for 15 min at 22°C. The modified DNA was separated from the unreacted chemical by filtration through spin columns of Sephadex G-50 (Pharmacia Biotech Inc.) equilibrated in 10 mM Tris·HCl, pH 8.0, 50 mM NaCl, 1 mM EDTA, 5.4 μg tRNA/100 μl. To visualize the modifications on the coding strand (top strand in Fig. 1A), DNA was cleaved with EcoRI, 3′-end labeled with the Klenow fragment of E. coli DNA polymerase I and [32P]ddATP (3,000 Ci/mmol; 1 Ci = 37 GBq), and then cleaved with HindIII. For the modification on the non-coding strand, DNA was cleaved with BsrBI (located about 40 bp 5′ of the HindIII site), 3′-end labeled with terminal deoxynucleotidyl transferase and [32P]ddATP, and then cleaved with EcoRI. The restriction fragments containing the γ-globin insert were separated and purified through 10% polyacrylamide gels. Sites of modifications were cleaved by 1 M piperidine for 30 min at 90°C and resolved on 8% sequencing gels. Gels were fixed, dried, exposed to Fuji RX films, and quantitated on a PhosphorImager (Molecular Dynamics, Sunnyvale, CA) using the ImageQuant software. Data were processed using the SigmaPlot program (Jandel Scientific, San Rafael, CA).
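The DNA phosphate concentration quoted above can be reproduced with simple arithmetic. The sketch below assumes an average mass of about 330 g/mol per nucleotide, a common rule of thumb that is not stated in the text; the function name is hypothetical.

```python
# ASSUMPTION: average nucleotide mass of ~330 g/mol (rule of thumb, not from the text).
AVG_NUCLEOTIDE_G_PER_MOL = 330.0


def phosphate_concentration_uM(dna_ug, volume_ul, nt_mass=AVG_NUCLEOTIDE_G_PER_MOL):
    """Micromolar concentration of DNA phosphate (one phosphate per nucleotide)."""
    grams_per_liter = (dna_ug * 1e-6) / (volume_ul * 1e-6)  # ug/ul equals g/l
    return grams_per_liter / nt_mass * 1e6


if __name__ == "__main__":
    conc = phosphate_concentration_uM(dna_ug=2.4, volume_ul=100.0)
    print(f"DNA phosphate: {conc:.1f} uM")
    # Molar excess of 1.8 mM OsO4 over DNA phosphate under these conditions.
    print(f"OsO4 excess: {1800.0 / conc:.0f}-fold")
```

Under this assumption, 2.4 μg in 100 μl corresponds to roughly 73 μM phosphate, close to the 72 μM given in the text, so each probe is present in large molar excess over the DNA.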
Two-dimensional Agarose Gel Electrophoresis. Two-dimensional agarose gel electrophoresis was conducted as reported previously (38) in the buffer described for the chemical modifications. Chloroquine at 4 or 30 μM was used in the second dimension.

RESULTS
Fig. 1 shows the sequences analyzed in this study, the location of the S1-HSS, and a schematic model for the I.T. based on previous data (4). We have now extended these data using chemical probe analyses (OsO4, CAA, and DEPC) in order to detect perturbations at the bp level (27-29, 37-39).
Chemical Modifications on the Coding Strand. We examined the reactivities on the coding strand of the wild type sequence (pγ-200) by using OsO4, which reacts with the C5-C6 double bond of unpaired thymines (39). Lanes 1 and 2 of Fig. 2A show a typical result. While relaxed (R) pγ-200, lane 1, did not show any strong reactivity, introduction of supercoiling (S, lane 2, −σ = 0.137) induced strong cleavages at thymines −209, −210, −212, −215, and −217. The extent of modification ranged from ~7 to 14% (Fig. 2B). Weaker bands were observed corresponding to thymines −227 and −228 (less than 1%). Modifications were pH dependent, being observed only at pH 4.5. At pH 5.0 or higher (up to 7.5 was tested), no appreciable signals were detected. In addition, reactivities were increased by the addition of up to 10 mM Mg2+ ions in a concentration-dependent manner (not shown).
CAA forms ε-etheno adducts with unpaired cytosines, adenines, and, to a lesser extent, guanines (37). Supercoiling-induced cleavages were found at C−211, C−213, C−214, and A−216 (Fig. 2A, lanes 3 and 4). However, these bands accounted only for 0.5-1.5% of the total radioactivity (Fig. 2B). Acid treatment of the samples, before piperidine cleavage, did not improve the signal-to-noise ratio. The sites of modification complemented those detected by OsO4 and suggested that the 5′ end of the single-stranded region was 3′ of A−219.
The character of the flanking nt was determined by DEPC, a probe specific for unpaired adenines and guanines (37). As shown in lanes 5 and 6 of Fig. 2A, supercoiling-induced reactivities extended from A−216 to A−226, the strongest band corresponding to A−216 (0.94 ± 0.05% of the total radioactivity). Taken together, these data define two sets of accessible nt: a major site that extends from T−209 to T−217, which we interpret as single-stranded, and a minor one, from nt −218 to −228, which we consider to be weakly bonded (filled and open circles in Figs. 2A and 4C). These structural transitions are influenced by supercoiling, protonation at specific residues, and Mg2+ ions.
FIG. 1. Sequence of the human γ-globin 5′-flanking sequence and model for I.T. formation. A, the sequences of human γ-globin 5′-flanking regions from bp −228 to −189 were cloned in pUC9 as described; single point mutations leading to HPFH are shown in boldface and indicate the bp change on the top (coding) strand as well as the name of the plasmids carrying the respective mutations. The S1 nuclease hypersensitive site is indicated by S1-HSS. A stretch of bp containing two adjacent purine-rich motifs centered on S1-HSS and HPFH is underlined. B, schematic representation of the I.T. structure proposed to be adopted by the −200 region. The structure forms under conditions of negative supercoiling and low pH and is stabilized by Hoogsteen-type hydrogen bonds between the two adjacent purine-rich motifs, represented by a thicker line. The structure leaves unpaired pyrimidine residues on the top strand which become a substrate for S1 nuclease (S1-HSS). The model shows how residues affected by the HPFH point mutations may destabilize the I.T. structure. The 5′ terminus on each of the DNA strands is indicated by a filled circle.
Reactions with OsO4, CAA, and DEPC were conducted and visualized on the coding strand of plasmids containing point mutations associated with HPFH (Fig. 1A). Relaxed and supercoiled DNA (−σ = 0.137) were treated under the same conditions as pγ-200. The results demonstrated that the modifications occurred at the same residues as the wild type (not shown) but that there were quantitative differences, which are summarized in Fig. 3. Here the columns represent the percentages of the signals from OsO4 and CAA normalized to the wild type sequence. The major changes were caused by the G−202 and T−202 mutations, which produced a general reduction in modification; the consequences of the G−202 mutation were more severe than those of T−202. −198C exhibited a 2-fold increase in modification at T−209 and T−210, whereas −196T and −195G displayed a modest reduction in modification at the middle residues. The normalization at A−216 with DEPC gave the following values: 0.00 for −202G, 0.38 for −202T, 1.19 for −198C, 0.91 for −196T, and 1.32 for −195G.
Chemical Modifications on the Non-coding Strand. Plasmids carrying the wild type sequence or the HPFH point mutations were reacted with OsO4, CAA, or DEPC, and the sites of cleavage were monitored on the non-coding strand. In response to supercoiling (−σ = 0.137), CAA modified C−205 and C−206 in the wild type sequence (Fig. 4A, lanes 1 and 2). A second set of reactive nt was present from A−217 to C−224. Since modifications at these latter positions were also detected on the coding strand (Fig. 4C), they support the interpretation of a locally distorted duplex DNA. However, in contrast to the coding strand, no cleavages were seen between purines −209 and −215. This strand-selective pattern of reactivity, together with the modifications at C−205 and C−206, is indicative of an I.T. Accordingly, the protected purines constitute the third strand of the structure, whereas C−205 and C−206 would be located in the loop (Figs. 1B and 7) (40,41).
The data with DEPC showed the accessibility of A−217, G−220, and G−223 (not shown). OsO4 modified the thymines spanning positions −216 to −226 to a moderate extent (Fig. 4B), confirming the chemical accessibility of the I.T. flanking sequences as well as that of the triplex-duplex junction (nt −217 and −218).
−202G and −202T showed a marked reduction in reactivities (Fig. 4B) relative to the wild type sequence, confirming their destabilizing effect. −196T and −195G displayed patterns of modifications with CAA and DEPC qualitatively identical to that of the wild type. These mutants also showed a considerable reduction in reactivity at T−216 (Fig. 4, A and B). These results, together with those on the coding strand (Fig. 3), suggest that these two mutations perturb the overall geometry of the triplex structure, but do not change the sequence alignment. In contrast, the CAA-induced cleavages at C−205 and C−206 were not observed in the −198C mutant. Instead, signals were detected at C−204, G−203, and G−202 (denoted by asterisks in Fig. 4, A and C), indicating an alteration in the loop structure. This may be accomplished by a slippage of the purine-rich third strand relative to the Watson-Crick duplex or by multiple adjustments in the interactions among nt that retain their sequence alignment. In either case, it is clear that this mutation has profound consequences on the triplex structure.
The increase in OsO4 modification at T−225 and T−226 in mutants −198C, −196T, and −195G relative to pγ-200 (Fig. 4B) was quite significant (3-4-fold). However, this behavior is not due to differences in the I.T. structures. A detailed analysis of these reactivities will be given in the last section under "Results."
Supercoiling-dependent Titration of OsO4 Modification. Since the previous experiments were conducted at very high superhelical densities, they left open the question of whether mutations C−198, T−196, and G−195 had a significant influence on I.T. formation. To address this issue, OsO4 was reacted with a set of topoisomers spanning superhelical densities (−σ) from 0 to 0.137. At each superhelical density, the percentages of cleavage on the coding strand (from T−209 to T−217) were totaled. Fig. 5A shows the results with pγ-200, −202G, and −202T; a substantial amount of free energy from supercoiling was required to form the I.T. even in the wild type sequence. In fact, at a superhelical density (−σ) of 0.06, typical of plasmid DNA isolated from E. coli, only about 10% reactivity was detected. The results from −202G and −202T confirm their strong inhibitory effect. Panel B shows the results with −198C and pγ-200 (as a dotted line). Whereas the increase in modification for −198C may be attributed to the stronger reactivity at T−209 and T−210 (Fig. 3), the significant shift in −σ at 50% modification (0.068 ± 0.001 versus 0.078 ± 0.001) indicates that this mutation decreases the amount of free energy required for the duplex to triplex transition. On the other hand, the data for −196T and −195G, reported in panels C and D, respectively, do not show dramatic perturbations.
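One simple way to obtain a midpoint such as the −σ at 50% modification is to interpolate between the two topoisomer samples that bracket the half-maximal signal. The Python sketch below illustrates this under two assumptions: that "50% modification" refers to half of the plateau signal, and that linear interpolation is adequate. The interpolation actually used for Fig. 5 is not described in this text, and the titration values in the example are invented solely to exercise the function; they are not data from this study.

```python
def sigma_at_half_max(sigmas, signals):
    """Linearly interpolate -sigma where the signal first reaches half its maximum.

    sigmas  : superhelical densities (-sigma), in increasing order
    signals : summed percentage of cleavage measured at each density
    """
    half = max(signals) / 2.0
    points = list(zip(sigmas, signals))
    for (s0, y0), (s1, y1) in zip(points, points[1:]):
        if y0 < half <= y1:
            return s0 + (half - y0) * (s1 - s0) / (y1 - y0)
    raise ValueError("signal never crosses half of its maximum")


if __name__ == "__main__":
    # HYPOTHETICAL titration, for illustration only (not measured values).
    example_sigmas = [0.00, 0.02, 0.04, 0.06, 0.08, 0.10, 0.12, 0.137]
    example_signals = [0.0, 0.0, 1.0, 5.0, 30.0, 45.0, 50.0, 50.0]
    print(f"-sigma at half-maximal signal: {sigma_at_half_max(example_sigmas, example_signals):.3f}")
```

Comparing such midpoints between plasmids separates a change in the supercoiling requirement from a change in the plateau height, which is the distinction drawn above for −198C.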
We also conducted two-dimensional gel electrophoresis to assay for the relaxation associated with the DNA structural transition. Topoisomers of pγ-200 and all five mutant plasmids were separated on agarose gels in the same buffer solution used for the chemical modification experiments. Chloroquine concentrations of 4 and 30 μM were employed in the second dimension, which afforded the resolution of topoisomers up to 23 negative superhelical turns. A transition centered at 15 negative superhelical turns was observed; however, this was attributed to the formation of a non-B DNA structure in the vector. No transitions due to the duplex-triplex conversion were observed. Two explanations are possible: first, the transition may be too small to be detected in this range of topoisomers or, second, the I.T. structure may be too unstable to survive the electrophoretic conditions.
Kinetics of Modifications at A·T Base Pairs Support the I.T. Model. Reactivities to OsO4 on the non-coding strand and DEPC at the corresponding adenines were interpreted from a dynamic standpoint.
Scheme 1 represents a bp in conventional duplex DNA. As bp opening depends on k1 and k−1, the two constants regulate the extent of modification at both residues. However, if Scheme 2 occurs, in which T may interact with a second partner (X) once in an open conformation, the reactivities will depend on k2 and k−2, in addition to k1 and k−1. This is expected to selectively increase modification at A, since this residue remains in an open conformation for a longer time than its partner T. If the percentages of cleavage are expressed as a ratio of T/A, values for A·T pairs involved in a type 2 process will be smaller than those progressing through Scheme 1. Table I shows the results for A·T bp −226, −225, −222, −219, and −216 for pγ-200 and mutant plasmids. The values varied considerably, from 0.7 to 66, but, with the exception of −216, they were relatively homogeneous at a given locus. The variations observed may be interpreted in terms of modulation in the accessibility to the reactants. In fact, not shown in the schemes are intermediate states in which a given bp may adopt distorted conformations and/or alterations in the stacking interactions with neighbor residues. These changes, which are favored by high levels of supercoiling and flanking non-B DNA structures such as the I.T., not only facilitate bp opening, but also increase the rate of chemical attack on partially unpaired conformations (42).
FIG. 4. Chemical modifications on the non-coding strand. A, DNAs were prepared and reacted with CAA as described ("Experimental Procedures" and Fig. 2). pγ-200S was used in this case instead of pγ-200 in order to align the wild type sequence with that of −198C, −196T, and −195G. The nt changes carried by the mutant plasmids are shown on the left. Open circles indicate reactivities common to more than one DNA, whereas asterisks identify sites of modification specific for −198C. B, DNAs were reacted with OsO4 as described. Signals above background levels were quantitated and shown as the percentage of the total radioactivity. C, summary of the CAA (C, A, and G residues), OsO4 (T residues), and DEPC (A and G residues) modifications.
Locus −216 appears to be a different case. Here the low values, which spanned a greater range (0.7-6.3), were determined by an increased modification at A(open) (A−216) associated with normal or low cleavages at T (Figs. 4B and 6A) and are appropriately accounted for by a type 2 process. Thus, the ratio of T/A modifications is a measure of the relative chemical accessibility and hence the extent to which the A or the T residue has reassociated with another partner.
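The qualitative argument about T/A ratios can be made concrete with a small equilibrium model. The Python sketch below is our own simplification, not the schemes as drawn in the paper: it assumes a closed A·T pair opens with constants k1 and k−1 and that the open T can then be captured by a partner X with pseudo-first-order constants k2 and k−2. Only relative accessibilities are computed. The observed ratios additionally reflect the different intrinsic reactivities of OsO4 and DEPC, which this sketch ignores.

```python
def state_fractions(k1, k_1, k2=0.0, k_2=1.0):
    """Equilibrium fractions of closed, open, and T-captured states.

    ASSUMED scheme: closed <-> open (k1, k_1); open <-> T bound to X (k2, k_2).
    Setting k2 = 0 reduces the model to simple base pair opening (type 1).
    """
    K1 = k1 / k_1
    K2 = k2 / k_2
    total = 1.0 + K1 + K1 * K2
    return {
        "closed": 1.0 / total,
        "open": K1 / total,
        "T_bound_to_X": K1 * K2 / total,
    }


def t_over_a(k1, k_1, k2=0.0, k_2=1.0):
    """Relative accessibility of T versus A in this model."""
    f = state_fractions(k1, k_1, k2, k_2)
    t_accessible = f["open"]                      # T is exposed only while free
    a_accessible = f["open"] + f["T_bound_to_X"]  # A stays exposed while T is captured
    return t_accessible / a_accessible


if __name__ == "__main__":
    print("type 1 process:", t_over_a(k1=0.01, k_1=1.0))
    print("type 2 process:", t_over_a(k1=0.01, k_1=1.0, k2=3.0, k_2=1.0))
```

In this model the ratio reduces to 1/(1 + k2/k−2), so any capture of T by a second partner lowers T/A, in line with the low values observed at locus −216.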
Reactivities at the I.T. Flanking Sequence Are Affected by the Adjacent Cloning Site. We observed variations in chemical modifications among the mutants that did not correlate with the stability of the respective triplex structures. For example, the OsO4 modifications at T−225 and T−226 were ~3-4-fold higher in −198C, −196T, and −195G relative to pγ-200 (Fig. 4B), whereas an increase in stability was apparent only for −198C (Fig. 5). Since this behavior reflected the differences in the cloning sites for mutants, a comparison of OsO4, CAA, and DEPC reactivities between pγ-200 and pγ-200S was performed to resolve this issue. Fig. 6A shows the results from DEPC treatment of the coding strand. The tracings represent the signal from the supercoiled plasmids (−σ = 0.137) less the modification found for the relaxed DNA. Quantitation of the peak areas revealed that A−226 and A−225 acquired a ~2-fold increase when subcloned at the SmaI site (pγ-200S), whereas smaller differences were associated with the other residues. Also, the supercoiling-dependent modifications with OsO4 at T−227 and T−228 (Fig. 6B) revealed that greater signals were associated with the plasmids containing the γ-globin insert at the SmaI site. Hence, these data show that the flanking vector sequences are responsible for the reactivities at these positions, rather than the HPFH point mutations.

DISCUSSION
These chemical probe analyses on the 5′-flanking region of the human γ-globin genes enable a molecular description of the I.Ts. at a level of detail previously not possible. The duplex to triplex transition characteristic of oligopurine-oligopyrimidine sequences is accomplished by the purine residues simultaneously engaging in Watson-Crick and Hoogsteen hydrogen bonds (27-30). In general, the bound third strand may occupy a parallel, or antiparallel, orientation relative to the purine residues, depending on its sequence composition. Accordingly, a purine-rich third strand will be accommodated in the major groove in an antiparallel orientation, whereas a pyrimidine-rich third strand will occupy the reverse position. Therefore, stable hydrogen bonds may form between G:G, A:A, and A:T in the former case, and G:C+ and A:T in the latter (30). Low pH is required in order to stabilize a pyrimidine-rich strand containing cytosine residues in this arrangement.² The structures formed at the 5′-flanking region of the human γ-globin genes, both in the wild type as well as the HPFH point mutations, deviate considerably from this general scheme. In fact, these and previous data (4) indicate that the third strand is purine-rich, yet low pH is required for stabilization. Our results show that the third strand −209 AAGAGGATA −217 is hybridized in an antiparallel orientation to the downstream −194 GGGGAAGGGG −203 containing the sites of mutations. Since these two sequences are 9 and 10 bases long, their interaction leads to two possible alignments (Fig. 7). In no case can homogeneous G:G, A:A, or A:T Hoogsteen base pairing take place. Rather, mismatches of the G:A, A:G, and G:T type must also be considered. In both of the reported models, the most abundant triplet is C·G:A+, a combination that has recently been observed in other I.Ts. (43,44). The stabilization induced by the protonated reversed-Hoogsteen-bound adenines agrees well with our observations. The Hoogsteen G:T pair has been described by NMR only in the parallel orientation (45), where T(H3) shares one hydrogen bond with G(N7). Since parallel and antiparallel thymine displays a 2-fold symmetry about N3 (30), it is possible that the antiparallel G:T pair maintains this type of hydrogen bond. Antiparallel A:G has not been documented; however, close interactions are possible.
² A colon is used to designate the association between two bases by Hoogsteen or reversed Hoogsteen pairs, whereas a center dot designates the interaction between bases in a Watson-Crick pairing motif.
Overall, the paucity of stable C·G:G and T·A:A triplets, together with the short length of the I.T. stem, accounts for the observed requirement of high levels of supercoiling and the inability to detect a duplex to triplex transition by two-dimensional agarose gel electrophoresis. Finally, both models are consistent with the data which locate C−205 and C−206 in the loop.
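The composition of the third-strand pairs in the two possible alignments can be enumerated directly from the sequences quoted above. The Python sketch below assumes that the two registers are the two ways of laying the 9-base third strand along the 10-base purine tract after fold-back, with −209 facing either −203 or −202; the registers actually drawn in Fig. 7 are not reproduced in this text and may be defined differently. Pairs are written as duplex purine:third-strand base, following the colon notation used in the text.

```python
from collections import Counter

# Sequences quoted in the text (non-coding strand).
THIRD = "AAGAGGATA"     # third strand, listed from -209 to -217
DUPLEX = "GGGGAAGGGG"   # purine tract, listed from -203 to -194 (reads the same from -194 to -203)
THIRD_START = -209
DUPLEX_START = -203


def register(shift):
    """List (duplex position, third-strand position, pair) for one alignment.

    ASSUMPTION: third-strand base -209-i faces duplex purine (-203+shift)+i,
    i.e. the strand folds back on itself in an antiparallel orientation.
    """
    pairs = []
    for i, third_base in enumerate(THIRD):
        duplex_pos = DUPLEX_START + shift + i
        duplex_base = DUPLEX[duplex_pos - DUPLEX_START]
        pairs.append((duplex_pos, THIRD_START - i, f"{duplex_base}:{third_base}"))
    return pairs


def pair_counts(shift):
    """Count each duplex:third-strand pair type in the chosen alignment."""
    return Counter(pair for _, _, pair in register(shift))


if __name__ == "__main__":
    for shift in (0, 1):
        print(f"alignment with -209 facing {DUPLEX_START + shift}: {dict(pair_counts(shift))}")
```

Under this assumption G:A is the most frequent pair in both registers, G:G occurs only once or twice, and A:G and G:T pairs are unavoidable, which matches the description of the models given above.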
All of the HPFH point mutations alter the normal I.T. structure, some slightly (T−196 and G−195), others profoundly (C−198, G−202, and T−202). The destabilizing effects of G−202 and T−202, observed previously (4), are confirmed. Both of these mutations disrupt a GGGCCC motif, a sequence that has been shown to acquire an induced bend upon complexation of Mg2+ ions (46). The effect of these nt changes may be either to abolish this induced bend, which may represent the nucleation step for I.T. formation, or to disrupt critical Hoogsteen hydrogen bonds. Our results favor the first interpretation.
FIG. 6. Effect of the cloning site on chemical modification at the I.T. flanking sequence. A, plasmids pγ-200 and pγ-200S contain the wild type γ-globin insert cloned at the HincII or SmaI site of pUC9, respectively. The DNAs were reacted with DEPC at −σ of 0 (R) and 0.137 (S) and processed as described for the coding strand. Quantitations were performed as follows. Pixel values for each line graph were converted to percentage of the total signal (sum of pixel values for each lane). Percentages of the R sample were then subtracted from those of S after R and S were aligned on their highest pixel value. The range of the x axes was adjusted so as to align the two relevant sets of peaks. Areas were calculated by cutting and weighing the peaks. Values (mg × 10) at selected locations are reported. Since the amount of material in A−216 for both DNAs was identical (±9%), this reinforces the quantitative methodology. B, supercoiling-dependent OsO4 modification at T−227 and T−228. The percentages of modification at T−227 and T−228 were added and expressed as a single (y) value. Interpolation was conducted as explained in the legend to …
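The first steps of the quantitation described in the legend to Fig. 6 (conversion of each lane to percent of total signal, alignment of the relaxed and supercoiled traces on their highest value, and subtraction) can be sketched in a few lines of Python. This is an illustration under our own simplifying assumptions: traces are plain lists of pixel values, alignment is a whole-pixel shift, and peak areas are not computed (the original areas were obtained by cutting and weighing). The function names and the example traces are hypothetical.

```python
def to_percent(trace):
    """Convert pixel values of one lane to percent of the lane's total signal."""
    total = sum(trace)
    return [100.0 * value / total for value in trace]


def supercoil_dependent_signal(relaxed, supercoiled):
    """Subtract the relaxed (R) profile from the supercoiled (S) profile.

    Both lanes are first normalized to percent of total signal, then R is
    shifted so that its highest point coincides with the highest point of S.
    Positions of S with no R counterpart after the shift are left unchanged.
    """
    r = to_percent(relaxed)
    s = to_percent(supercoiled)
    shift = s.index(max(s)) - r.index(max(r))
    difference = []
    for i, s_value in enumerate(s):
        j = i - shift
        r_value = r[j] if 0 <= j < len(r) else 0.0
        difference.append(s_value - r_value)
    return difference


if __name__ == "__main__":
    # HYPOTHETICAL pixel traces, for illustration only.
    relaxed_lane = [1, 2, 10, 2, 1, 0]
    supercoiled_lane = [0, 1, 2, 10, 6, 1]
    print(supercoil_dependent_signal(relaxed_lane, supercoiled_lane))
```

Normalizing each lane before subtraction compensates for differences in loading, so that only the redistribution of signal caused by supercoiling remains in the difference trace.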
The stabilization mediated by C−198 was suspected from an earlier work (47). However, the previous studies (4, 47) did not anticipate that this mutation may alter the sequence alignment of the triplex by inducing a slippage of the third strand relative to the Watson-Crick duplex.
The results for T−196 and G−195 are unexpected. Here, we find subtle changes in the overall I.T. structures, whereas substantial destabilizations were predicted from former assays (4). It is likely that these discrepancies originate from the experimental conditions used. In fact, we found no modifications at pH 5.0, whereas S1 nuclease cleavages were previously detected at this pH. Since I.T. formation occurs in the pH range of ~4.5-5.0, slight variations are likely to affect the stabilities greatly. Also, the results of the oligomer binding assays may have been influenced by the distortion of the DNA flanking the I.T., as well as by the difference in the cloning site between pγ-200 and mutant plasmids (Fig. 6). The two sets of schemes in Fig. 7 predict stable triplexes for C−198, T−196, and G−195 (48,49); our data do not permit a delineation between these alternatives.
From a physiological standpoint, the combination of low pH and elevated superhelical density required to induce these structures in vitro may raise concerns about their stability in a cellular environment. However, base protonation may occur, and be maintained, in polynucleotides at several pH units above the pKa of the free base (50-53). Also, divalent metal ions, polyamines (54-58), and the aforementioned single-strand-specific binding proteins may cooperate in lowering the activation energy needed for the I.T. transition.
In vivo, the chromatin in the 5′-flanking region of the human γ-globin genes has been shown to be hypersensitive to DNase I digestion or restriction enzyme cleavage in cells where the γ-globin genes are actively transcribed (59,60). This behavior, which is also observed in other systems, is likely to be correlated with the loss of positioned nucleosomes along the DNA, and the acquisition of new interactions between cis regulatory sequences and cognate transcription factors (61,62). Indeed, γ-globin regulatory elements such as the CACCC and CCAAT boxes appear to be selectively occupied in K562 cells, which express these genes (63,64).
In vivo, no protein complex has been identified to date that interacts with the upstream region that contains the HPFH point mutations. In addition, experiments conducted in transgenic mice have demonstrated that, at least in the case of G−202, a strong correlation exists between this point mutation and the HPFH phenotype (7). Therefore, a macromolecular complex may assemble at −200; this complex could be involved in γ-globin gene silencing. A polypeptide that binds and stabilizes the I.T. structure induced at this location in vitro has been found (14). The interaction of this protein with the I.T. might be altered by any of the HPFH point mutations. It remains to be established whether such an interaction reflects the formation of an I.T. complex that operates in vivo to temporally regulate the expression of the human γ-globin genes.