Intramolecular Higher Order Packing of Parallel Quadruplexes Comprising a G:G:G:G Tetrad and a G(:A):G(:A):G(:A):G Heptad of GGA Triplet Repeat DNA*

GGA triplet repeats are widely dispersed throughout eukaryotic genomes and are frequently located within biologically important regions such as gene regulatory regions and recombination hot spot sites. We previously determined the structure of d(GGA)4 (12-mer) under physiological conditions and found, for the first time, the formation of an intramolecular parallel quadruplex. Later, a similar architecture was found for telomere DNA in the crystalline state. Here, we have determined the structure of d(GGA)8 (24-mer) under physiological conditions. Two intramolecular parallel quadruplexes, each comprising a G:G:G:G tetrad and a G(:A):G(:A):G(:A):G heptad, are formed in d(GGA)8. These quadruplexes are packed in a tail-to-tail manner. This is the first demonstration of the intramolecular higher order packing of quadruplexes at atomic resolution. K+ ions, but not Na+ ions, are critically required for the formation of this unique structure. The elucidated structure suggests the mechanisms underlying the biological events related to the GGA triplet repeat.

NMR spectra were recorded with Bruker DRX600 and DRX800 spectrometers equipped with a quadruple-resonance probe with X, Y, and Z gradients. The following NMR experiments were used to assign the resonances and to obtain distance and dihedral angle constraints: NOESY, TOCSY, ³¹P-decoupled DQF-COSY, ¹H-¹³C HSQC, ¹H-¹⁵N HSQC, and ¹H-³¹P HetCor. Spectra were processed with XWIN-NMR (Bruker), NMRPipe (21), and Capp/Pipp/Stapp (22).

Several kinds of triplet repeats are found in the human genome. For some of these repeats, a link to the occurrence of a particular disease has been established. The CCG, CTG, CAG, and GAA repeats are linked to fragile X syndrome, myotonic dystrophy, Huntington's disease, and Friedreich ataxia, respectively (1-4). It has been suggested that these repeats adopt unusual structures and that these structures cause the extraordinary expansion of the repeats that is associated with the diseases (5,6). The GGA triplet repeat is widely dispersed throughout eukaryotic genomes (7). The GGA repeat has been identified in portions of human and mouse cellular DNA that cross-hybridize with the internal direct repeat (IR3) repetitive region of Epstein-Barr virus (8). The GGA repeat, together with the GAA repeat, has also been found in microsatellite DNA belonging to the rat polymeric immunoglobulin receptor gene (9). A fragment of the microsatellite DNA containing both repeats was suggested to attenuate gene expression at the transcriptional and post-transcriptional levels (10). Moreover, the GGA repeat has been identified in various sequences ranging from that of the mouse WASP gene (11), which is a homologue of the gene mutated in Wiskott-Aldrich syndrome, to regulatory elements governing cell type-specific expression of neural cell adhesion molecule genes (12). The GGA repeat is frequently located within gene regulatory regions and recombination hot spot sites (13). Thus, the biological significance of the GGA repeat is widely recognized, although a link to a particular disease has not yet been established. The GGA repeat is capable of forming a variety of structures (14-18).
We previously determined the structure of d(GGAGGAGGAGGA) (d(GGA)4), composed of four tandem GGA units, under physiological conditions (19). d(GGA)4 folds into an intramolecular quadruplex composed of a G:G:G:G tetrad and a G(:A):G(:A):G(:A):G heptad. The four G-G segments of d(GGA)4 are aligned parallel to each other due to seven successive turns of the main chain at each of the GGA and GAGG segments. This was the first demonstration that DNA can form an intramolecular parallel quadruplex. Later, a similar architecture was found for telomere DNA in the crystalline state (20). We also showed that two quadruplexes of d(GGA)4 form a dimer stabilized through the stacking interaction between the heptads of the two quadruplexes.
Our findings on GGA triplet repeat DNA, together with the crystallographic result for telomere DNA, indicated the possibility of intramolecular higher order packing of quadruplexes in longer DNA with certain repeating units. To address this point and to elucidate the character of naturally occurring GGA repeat DNA, we studied the structure of d(GGAGGAGGAGGAGGAGGAGGAGGA) (d(GGA)8) under physiological conditions. Here, we present its unique structure. This is the first demonstration of higher order packing of quadruplexes at atomic resolution. The biological events related to GGA triplet repeat DNA can be rationalized in the light of the elucidated structure. Furthermore, our findings provide support for the hypothetical higher order packing of quadruplexes of telomere DNA and indicate its mode of packing.

EXPERIMENTAL PROCEDURES
Sample Preparation-d(GGA)8 and mutant oligomers in which each single G residue was replaced by an I residue were prepared as described previously (19). DNA was dissolved in 10 mM sodium phosphate buffer (pH 6.7) containing 3 mM NaN3 and either 0 or 1-50 mM KCl. The DNA concentrations were 1-40 μM for CD and 0.1-1 mM for NMR. 2,2-Dimethyl-2-silapentane-5-sulfonate was used as an internal chemical shift reference. Samples were heated at 95°C for 5 min, followed by gradual cooling to room temperature, prior to the measurements.
* The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
CD and NMR Spectroscopies-CD spectra and thermal CD melting curves were recorded with a Jasco J-720 spectropolarimeter. The temperature of the solution was raised from 1 to 95°C at the rate of 1°C/min. Changes in CD intensity were monitored at 263 nm. The melting temperature was determined by use of the derivative of the melting curve.
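As an illustration of the derivative method mentioned above, the following Python sketch takes the melting temperature to be the temperature at which the slope of the melting curve is steepest. The function name, the central-difference scheme, and the synthetic sigmoidal curve are assumptions made for this example; the text does not describe how the derivative was computed, and no data from the study are used.

```python
import math


def melting_temperature(temps, signal):
    """Return the temperature at which |d(signal)/dT| is largest.

    Central differences are used on the interior points. `temps` must be
    strictly increasing and both lists must have the same length (>= 3).
    """
    if len(temps) != len(signal) or len(temps) < 3:
        raise ValueError("need two equally long series of at least 3 points")
    best_t, best_slope = None, -1.0
    for i in range(1, len(temps) - 1):
        slope = (signal[i + 1] - signal[i - 1]) / (temps[i + 1] - temps[i - 1])
        if abs(slope) > best_slope:
            best_t, best_slope = temps[i], abs(slope)
    return best_t


# Synthetic two-state curve sampled every 1 degree from 1 to 95 (illustrative only).
temps = [float(t) for t in range(1, 96)]
assumed_midpoint = 60.0  # hypothetical value chosen for the demonstration
signal = [1.0 / (1.0 + math.exp((t - assumed_midpoint) / 3.0)) for t in temps]

tm = melting_temperature(temps, signal)
print(tm)
```

With real data, smoothing the curve before differentiation is usually advisable, because numerical derivatives amplify noise.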
Distance and Dihedral Angle Constraints-Interproton distances were calculated from NOESY spectra with mixing times of 50 and 200 ms as described previously (19). In total, 750 distance constraints were obtained.
Dihedral angle constraints for the δ and endocyclic ν0, ν1, ν2, ν3, and ν4 torsion angles were derived from ³J(H1′-H2′), ³J(H2″-H3′), and ³J(H3′-H4′) couplings as described previously (19,23-25). The sugar pucker was determined as follows from the intensities of the H1′-H2′, H2″-H3′, and H3′-H4′ cross-peaks in the DQF-COSY spectrum: all G residues except G13, together with A12, C1′-exo conformation; all A residues except A12 and A24, C3′-endo conformation; A24, O4′-endo conformation; G13, not determined due to overlapping of resonances. The sugar conformations were confirmed independently by the C3′ and C4′ chemical shift values (23,24,26). The δ and endocyclic ν0-ν4 torsion angles of all G residues except G13, and of A12, were moderately constrained, leaving the sugar free to take any conformation between O4′-endo and C2′-endo, including C1′-exo, in the pseudorotation cycle without an energy penalty. In the same way, those of all A residues except A12 and A24 were constrained between C2′-exo and C4′-exo, including C3′-endo, and those of A24 between C4′-exo and C1′-exo, including O4′-endo. Those of G13 were not constrained.
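The allowed sugar-pucker windows described above can be pictured as arcs on the pseudorotation cycle. The Python sketch below is only an illustration: the phase angles are the conventional textbook values for the named envelope conformations (they are not given in the text), the residue-class labels and function names are hypothetical, and the actual constraints were applied to the δ and ν0-ν4 torsion angles rather than to the phase angle directly. ASCII apostrophes stand in for primes in the pucker names.

```python
# Conventional pseudorotation phase angles P (degrees) for envelope puckers.
# These numbers are standard reference values, assumed here for illustration.
PHASE = {
    "C3'-endo": 18, "C4'-exo": 54, "O4'-endo": 90, "C1'-exo": 126,
    "C2'-endo": 162, "C3'-exo": 198, "C4'-endo": 234, "O4'-exo": 270,
    "C1'-endo": 306, "C2'-exo": 342,
}

# Allowed arcs taken from the text: (start pucker, end pucker), traversed in
# the direction of increasing P. The dictionary keys are hypothetical labels.
ALLOWED = {
    "G_except_G13_and_A12": ("O4'-endo", "C2'-endo"),   # includes C1'-exo
    "A_except_A12_A24": ("C2'-exo", "C4'-exo"),         # includes C3'-endo
    "A24": ("C4'-exo", "C1'-exo"),                      # includes O4'-endo
}


def in_arc(p, start, end):
    """True if phase angle p lies on the arc from start to end (increasing P)."""
    p, start, end = p % 360, start % 360, end % 360
    if start <= end:
        return start <= p <= end
    return p >= start or p <= end  # arc wraps through 0 degrees


def pucker_allowed(residue_class, p):
    """Check a phase angle against the window for a residue class."""
    first, last = ALLOWED[residue_class]
    return in_arc(p, PHASE[first], PHASE[last])


print(pucker_allowed("G_except_G13_and_A12", PHASE["C1'-exo"]))
print(pucker_allowed("A_except_A12_A24", PHASE["C2'-endo"]))
```

The wrap-around branch matters for the A residues, whose window runs from C2′-exo (P near 342°) through 0° to C4′-exo (P near 54°).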
Dihedral angle constraints for the ε torsion angle were derived from the ³J(H3′-P) coupling as described previously (19,23,24). Thus, the ε torsion angles of all residues, except for A24, were constrained to −120 ± 45°. For A24, ε was left unconstrained. No dihedral angle constraints were used for the α and torsion angles.
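To make the meaning of a constraint such as −120 ± 45° concrete, the following Python sketch computes by how many degrees a torsion angle falls outside such a window, taking the 360° periodicity of angles into account. The function names are hypothetical, and the flat-bottomed window is an assumption made for illustration; the functional form used in the actual calculation is not stated in the text.

```python
def angular_difference(a, b):
    """Smallest signed difference a - b in degrees, in the range [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0


def dihedral_violation(angle, center=-120.0, half_width=45.0):
    """Degrees by which `angle` lies outside center +/- half_width (0 if inside)."""
    excess = abs(angular_difference(angle, center)) - half_width
    return max(0.0, excess)


print(dihedral_violation(-120.0))  # at the center of the window
print(dihedral_violation(-70.0))   # 50 degrees from the center
print(dihedral_violation(240.0))   # same angle as -120 degrees
```

Handling periodicity explicitly avoids reporting a large spurious violation for an angle written as 240° instead of −120°.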
Structure Calculation-Structure calculations were carried out using the distance and dihedral angle constraints with a simulated annealing protocol supplied with X-PLOR, version 3.8 (27). Neither hydrogen bonding constraints nor planarity constraints were included. Twenty final structures were selected from 100 calculations on the basis of the smallest residual energy values. None of them violated the distance constraints by more than 0.3 Å or the dihedral angle constraints by more than 6°. The structures were viewed with Insight II (MSI).
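The selection step described above can be sketched as follows. This is a minimal illustration, not the procedure or file format of X-PLOR: the `Structure` record, its field names, and the synthetic ensemble are hypothetical, and only the numbers 20, 100, 0.3 Å, and 6° come from the text.

```python
from dataclasses import dataclass


@dataclass
class Structure:
    name: str
    energy: float          # residual energy (arbitrary units)
    max_dist_viol: float   # largest distance-constraint violation, in angstroms
    max_dihe_viol: float   # largest dihedral-constraint violation, in degrees


def select_final(structures, n_keep=20, dist_tol=0.3, dihe_tol=6.0):
    """Keep the n_keep lowest-energy structures and report whether all of
    them stay within the given violation tolerances."""
    ranked = sorted(structures, key=lambda s: s.energy)
    chosen = ranked[:n_keep]
    all_within_tolerance = all(
        s.max_dist_viol <= dist_tol and s.max_dihe_viol <= dihe_tol
        for s in chosen
    )
    return chosen, all_within_tolerance


# Synthetic ensemble of 100 calculated structures (made-up numbers).
ensemble = [
    Structure(name=f"model_{i:03d}", energy=float((i * 37) % 100),
              max_dist_viol=0.1, max_dihe_viol=2.0)
    for i in range(100)
]

final, ok = select_final(ensemble)
print(len(final), ok)
```

Here the violation check is reported after ranking by energy, mirroring the order in which the two criteria are stated in the text.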

RESULTS
K+-induced Ordered Structure of d(GGA)8-In the absence of K+, several broad imino proton signals were observed for d(GGA)8 (Fig. 1A). This indicates that although d(GGA)8 seems to form some structure in the absence of K+, the structure is rather unstable. In contrast, 16 sharp signals appeared in the presence of K+ (Fig. 1B), indicating the formation of a stable ordered structure. The chemical shift range of 10.7-11.9 ppm for these signals suggests that the structure formed is a G:G:G:G quartet-based quadruplex. The sharp signals did not appear when Na+ was added instead of K+ (Fig. 1C). Thus, K+ is critically required for the formation of this stable structure. This is notable from a physiological point of view, because the concentration of K+, but not that of Na+, is high (over 100 mM) in the nucleus.
It is known that a parallel quadruplex gives a positive CD band at 260 nm, while an antiparallel one gives a positive CD band at 295 nm, either with or without a positive CD band at 260 nm (28,29). The CD spectrum of d(GGA)8 in the presence of K+ had a positive CD band at around 260 nm without a positive CD band at around 295 nm (Fig. 1D). This suggests that d(GGA)8 forms a parallel quadruplex. The melting temperature of the structure formed by d(GGA)8 in the presence of K+ was determined to be rather high, 86°C. This again suggests the formation of a quadruplex, because the melting temperature of a quadruplex is generally very high.
It was also found that the melting temperature does not depend on the concentration of d(GGA)8. This indicates that the structure formed is unimolecular. This conclusion was supported by analysis of the line widths of the NMR resonances, which are sensitive to molecular weight. The line widths for d(GGA)8 (24-mer) were the same as those for d(GGA)4 (12-mer), which exists as a dimer in solution (19); a monomer of the 24-mer has the same number of residues as a dimer of the 12-mer.
1 The abbreviations used are: NOESY, nuclear Overhauser effect spectroscopy; NOE, nuclear Overhauser effect; TOCSY, total correlation spectroscopy; DQF, double quantum filtered; HSQC, heteronuclear single-quantum coherence spectroscopy; r.m.s.d., root mean square deviation.
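The reasoning that a concentration-independent melting temperature points to a unimolecular structure can be made explicit with a simple two-state model. The relations below are a textbook sketch under stated assumptions (two-state transition, temperature-independent ΔH° and ΔS°, total strand concentration C_T); they are not taken from the text and were not necessarily used by the authors.

```latex
% Unimolecular folding, U <=> F: the equilibrium constant is 1 at T_m,
% so the strand concentration does not appear.
\[
  T_m = \frac{\Delta H^\circ}{\Delta S^\circ}
\]
% Association of two identical strands, 2M <=> D: at T_m, [M] = C_T/2 and
% [D] = C_T/4, so K = [D]/[M]^2 = 1/C_T, and T_m depends on C_T.
\[
  \frac{1}{T_m} = \frac{R}{\Delta H^\circ}\,\ln C_T + \frac{\Delta S^\circ}{\Delta H^\circ}
\]
```

In the second case a plot of 1/T_m against ln C_T has a non-zero slope, whereas in the first case it is flat, which is the behavior reported for d(GGA)8.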
Resonance Assignments-The non-exchangeable ¹H, ¹³C, and ³¹P resonances of d(GGA)8 were assigned as described previously (19) using standard methods (23,30). As an example, Fig. 2A shows an expansion of the NOESY spectrum allowing the sequential assignments of H1′ and H6/H8 through H1′(i − 1)-H6/H8(i)-H1′(i) connectivities. The H2′, H2″, H3′, H4′, and H5′/H5″ resonances were assigned by means of TOCSY and DQF-COSY spectra. These assignments were confirmed by the sequential H3′(i − 1)-P(i)-H4′/H5′/H5″(i) connectivities in the ¹H-³¹P HetCor spectrum, the assignments of ³¹P resonances being made at the same time (data not shown). The chemical shifts of ¹³C resonances were obtained from the ¹H-¹³C HSQC spectrum. Exchangeable ¹H resonances were assigned as described previously with the use of mutant oligomers in which each single G residue was replaced by an I residue (19).  (Fig. 2B), ANH2-GH1′ (Fig. 2C), GNH2-ANH2, and GH8-AH8 NOEs, in addition to GNH/NH2-GH8 and GNH-GNH ones; the NOEs commonly observed for both heptads are shown in Fig. 3B. A12 and A24 are not involved in either the tetrads or the heptads.
The arrangement of the two heptads was determined on the basis of the strong AH2-AH1′/2′/2″ NOEs between A3 and A21, A6 and A18, and A9 and A15, and the medium to weak GH8-GH1′ NOEs between G1 and G22, G4 and G19, G7 and G16, and G10 and G13 (Fig. 2A). Thus, the overall structure of d(GGA)8 was concluded to be as shown in Fig. 3C.
Structure of d(GGA)8-The applied constraints and the structural statistics for the 20 final structures are summarized in Table I. The root mean square deviation (r.m.s.d.) of the 20 final structures versus the mean structure for all heavy atoms was 0.35 ± 0.13 Å. Fig. 4A shows a stereo view of the superposition of the 20 final structures. A representative structure with the lowest energy is shown in Fig. 4B, with a trace of the sugar-phosphate main chain indicated by a tube for clarity. Two tetrads and two heptads can be seen (Fig. 4, C-E), as already discussed qualitatively. A12 is located close to the upper heptad, although it is not involved in the heptad. A24 is stacked on G23.
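For readers unfamiliar with the statistic, the following Python sketch shows one way an r.m.s.d. versus the mean structure can be computed for an ensemble. It assumes that the models have already been superimposed and that the reported spread is a sample standard deviation; neither detail is stated in the text, and the function names and toy coordinates are hypothetical.

```python
import math


def mean_structure(models):
    """Average the coordinates of each atom over all models."""
    n_models = len(models)
    n_atoms = len(models[0])
    return [
        tuple(sum(m[a][k] for m in models) / n_models for k in range(3))
        for a in range(n_atoms)
    ]


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally long coordinate lists."""
    total = sum(
        (pa[k] - pb[k]) ** 2
        for pa, pb in zip(coords_a, coords_b)
        for k in range(3)
    )
    return math.sqrt(total / len(coords_a))


def rmsd_to_mean(models):
    """Mean and sample standard deviation of each model's r.m.s.d. to the mean."""
    mean = mean_structure(models)
    values = [rmsd(m, mean) for m in models]
    avg = sum(values) / len(values)
    sd = math.sqrt(sum((v - avg) ** 2 for v in values) / (len(values) - 1))
    return avg, sd


# Toy ensemble: two models of a single atom (made-up coordinates).
toy = [[(0.0, 0.0, 0.0)], [(2.0, 0.0, 0.0)]]
avg, sd = rmsd_to_mean(toy)
print(avg, sd)
```

In practice each model would first be fitted to a reference by least-squares superposition, a step omitted here for brevity.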
Deviation of was found for almost all G residues. The combination of these deviated dihedral angles results in the successive turns of the backbone.
These successive turns cause the four G-G segments, i.e. the G1-G2, G4-G5, G7-G8, and G10-G11 segments, to be aligned parallel to each other, and the G1-G11 portion forms an intramolecular parallel quadruplex. Similarly, the G13-G14, G16-G17, G19-G20, and G22-G23 segments are aligned parallel to each other, and the G13-G23 portion forms another intramolecular parallel quadruplex. The two quadruplexes are packed in a tail-to-tail manner through stacking between the two heptad planes, the orientation of the G-G segments of one quadruplex being opposite to that of the other quadruplex (Figs. 3C and 4B). The stacking between the tetrad and heptad planes is shown in Fig. 4, C and D. The five-membered ring of one guanine base is stacked on the six-membered ring of the other guanine base for each G-G segment. The A3, A6, A9, A15, A18, and A21 bases are stacked on the G2, G5, G8, G14, G17, and G20 sugars, respectively, which is characteristic of the structure of a GNA trinucleotide loop and consistent with the extreme upfield shift of H4′ (2.62-2.84 ppm) and the moderate upfield shift of H3′ (4.56-4.66 ppm) of these G residues (32,33). Stacking of the two heptads is shown in Fig. 4E. The G1, A3, G4, A6, G7, A9, and G10 bases are stacked on the G22, A21, G19, A18, G16, A15, and G13 bases, respectively.
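The residue bookkeeping in this description can be reproduced with a short Python sketch. The function and variable names are hypothetical, and the index rule used for the heptad-heptad stacking partners (partner = 23 − i for G, 24 − i for A) is simply a pattern that fits the pairs listed in the text, not a rule stated by the authors.

```python
def gga_repeat(n_units):
    """Residue labels (for example 'G1', 'G2', 'A3') of d(GGA)n."""
    return [f"{base}{i}" for i, base in enumerate("GGA" * n_units, start=1)]


def planes(start):
    """Residue numbers of the heptad and tetrad in the 12-residue block that
    begins at residue `start`, plus the A residue left out of both planes."""
    g_first = [start + 3 * k for k in range(4)]       # first G of each G-G segment
    g_second = [i + 1 for i in g_first]               # second G of each segment
    a_loop = [start + 2 + 3 * k for k in range(3)]    # A residues in the heptad
    return {
        "heptad": sorted(g_first + a_loop),
        "tetrad": g_second,
        "unpaired": start + 11,
    }


residues = gga_repeat(8)
first = planes(1)     # residues 1-12
second = planes(13)   # residues 13-24

# Heptad-heptad stacking partners, using the empirical index pattern above.
stacking = {
    i: (23 - i if residues[i - 1].startswith("G") else 24 - i)
    for i in first["heptad"]
}

print(first)
print(second)
print(stacking)
```

Every partner produced this way belongs to the second heptad, consistent with the tail-to-tail packing through the two heptad planes.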
All G residues take on the anti conformation with respect to the glycosidic bond, while all A residues, except for A12 and A24, take on the high anti conformation. All G residues take on the B-form sugar conformation (around C1′-exo), while all A residues, except for A12 and A24, take on the A-form sugar conformation (around C3′-endo). The high anti and C3′-endo conformations of the six A residues are presumed to be favorable for the series of turns and/or for good stacking interactions.
Comparison of the Structures of d(GGA)8

DISCUSSION
From the viewpoint of symmetry, the formation of an octad composed of four G and four A bases may seem natural. A12 of d(GGA)4 (12-mer) does not fold back to the heptad plane, and thus the octad is not formed (Fig. 5B). This was explained by the fact that the terminal A12 of d(GGA)4 does not have a following G residue that would pull it to the heptad plane (19). In the case of d(GGA)8 (24-mer) (Fig. 5A), it is conceivable that A12 may associate with the upper heptad, resulting in the formation of an octad. However, all NMR data for d(GGA)8 argued against the existence of the sheared G10:A12 base pair that is needed for the formation of an octad, although the position of A12 was determined to be close to the upper heptad (Fig. 4B). A12 must bridge the upper and lower quadruplexes. It seems that, because of the structural restriction imposed by this role, A12 cannot be involved in an octad. The terminal A24 does not fold back to the heptad either, for the same reason as for d(GGA)4. Thus, an octad is not formed for d(GGA)8 either.
The relative arrangement of the two heptads of d(GGA)8 can be rationalized from the viewpoint of achieving maximum stacking interactions. For example, if the lower heptad is rotated by either 90° or 180°, one A base of each heptad cannot be involved in the stacking interaction, which is energetically less stable and thus unfavorable.
Each monomer of d(GGA)4 forms an intramolecular parallel quadruplex (Fig. 5B). Matsugami et al. (19) first demonstrated that DNA can form an intramolecular parallel quadruplex. Later, an intramolecular parallel quadruplex was found for telomere DNA in the crystalline state (20). These structures suggested the possibility of intramolecular higher order packing of quadruplexes. In this study, we have demonstrated for the first time that intramolecular higher order packing of quadruplexes actually occurs, and we have described the mode of packing at atomic resolution. The intramolecular packing of the two quadruplexes of d(GGA)8 is achieved through the stacking interaction between the heptads of the two quadruplexes (Fig. 5A). As a result, the two quadruplexes are arranged in a tail-to-tail manner. Alternatively, if the G13(:A15):G16(:A18):G19(:A21):G22 heptad were stacked on the G2:G5:G8:G11 tetrad, with A12 bulging out, the two quadruplexes would be arranged in a head-to-tail manner. However, this does not occur for d(GGA)8, because heptad-heptad stacking in a tail-to-tail manner is energetically much more stable and favorable than heptad-tetrad stacking in a head-to-tail manner.
When the GGA unit is further repeated, formation of the structure shown in Fig. 5A is expected for the other 24-mer region, too. The structures formed for each 24-mer region may further pack through the stacking interaction between the tetrads of each structure.
The GGA triplet repeat is abundant in eukaryotic genomes (7) and is frequently located within biologically important regions such as gene regulatory regions and recombination hot spot sites (11-13). Thus, the biological significance of the GGA repeat is widely recognized. The GGA repeat shows considerable genetic polymorphism due to genetic instability (7,34). This genetic instability is thought to originate from DNA slippage during DNA replication and/or frequent recombination of the repeats (35). The structure elucidated in this study can be used to rationalize the occurrence of these events. In the course of replication, double-stranded DNA is locally melted to yield single-stranded DNA for use as a template for DNA synthesis. Given the high concentration of K+ in the nucleus (over 100 mM), the GGA strand is then expected to form the unique intramolecular structure we found. The structure is very stable, as revealed by its high melting temperature of 86°C. It was also found for d(GGA)4 that once the unique structure is formed, the addition of the complementary strand, d(TCC)4, does not lead to the formation of a duplex (19); the unique structure remained even in the presence of the complementary strand. The formed structure could cause slippage during DNA replication, which results in a gain or loss of repeats and thus genetic instability.
Alternatively, intermolecular association of two double-stranded DNAs through the packing of two quadruplexes formed in each DNA, as shown in Fig. 5B, may also occur and facilitate recombination. This idea is consistent with the fact that the GGA repeat is frequently located in recombination hot spot sites. The facilitated recombination may also be responsible for the genetic instability of the GGA repeat. Furthermore, the intermolecular association at the GGA repeat may play a role in the pairing of homologous chromosomes during meiosis. The idea of pairing through a quadruplex has already been proposed for the G stretch (37-40). It is known that synapsis formation between homologous chromosomes in meiosis usually begins at telomere DNA, but that it sometimes starts at loci inside a chromosome.
It is expected that the unique intramolecular structure shown in Fig. 5A is formed in the course of transcription as well, because DNA is locally melted in this case, too. The formed structure is presumed to obstruct the progress of RNA polymerase, which could explain the attenuation of gene expression at the transcriptional level (10). We have revealed that RNA with the related sequence can also form a similar intramolecular parallel quadruplex (36). This RNA structure is very stable, too, with a melting temperature of 86°C (36). Therefore, it is expected that the unique structure in the transcribed mRNA hinders the translation machinery from associating with and/or proceeding along the mRNA in the course of translation, which may lead to the attenuation of gene expression at the post-transcriptional level (10).
The structure elucidated in this study can also provide a clue as to the higher order packing of telomere DNA. A quadruplex of telomere DNA contains only tetrads and no heptad (Fig. 5C). Therefore, the arrangement of the two quadruplexes in a head-to-tail manner, in which the G26:G32:G38:G44 tetrad of the second quadruplex is stacked on the G4:G10:G16:G22 tetrad of the first quadruplex, with the T23-T24-A25 segment serving as a linker, seems likely for d[AGGG(TTAGGG)7], as proposed by Parkinson et al. (20). Because of the lack of a heptad, the tail-to-tail arrangement of the two quadruplexes, in which the G26:G32:G38:G44 tetrad is stacked on the G2:G8:G14:G20 tetrad, with folding back of the T23-T24-A25 segment, seems less likely, although it is not impossible.