Originally published In Press as doi:10.1074/jbc.M202235200 on June 19, 2002

J. Biol. Chem., Vol. 277, Issue 38, 35183-35190, September 20, 2002

Structures of CUG Repeats in RNA

POTENTIAL IMPLICATIONS FOR HUMAN GENETIC DISEASES*

Philip Pinheiro‡, Garry Scarlett‡, Alison Rodger§, P. Mark Rodger§, Anna Murray||, Tom Brown**, Sarah F. Newbury‡‡, and James A. McClellan‡

From the ‡Biophysics Laboratories, School of Biological Sciences, University of Portsmouth, St. Michael's Building, White Swan Road, Portsmouth, PO1 2DT, United Kingdom, the ||Wessex Regional Genetics Laboratory, Salisbury Health Care National Health Service Trust, Salisbury District Hospital, Salisbury, Wiltshire SP2 8BJ, United Kingdom, the **Department of Chemistry, University of Southampton, Highfield, Southampton SO17 1BJ, United Kingdom, the §Department of Chemistry, University of Warwick, Coventry CV4 7AL, United Kingdom, and the ‡‡Department of Biochemistry, South Parks Road, Oxford OX1 3QU, United Kingdom

Received for publication, March 7, 2002, and in revised form, June 17, 2002

    ABSTRACT

Triplet repeats that cause human genetic diseases have been shown to exhibit unusual compact structures in DNA, and in this paper we show that similar structures exist in shorter "normal length" CNG RNA. CUG and control RNAs were made chemically and by in vitro transcription. We find that "normal" short CUG RNAs migrate anomalously fast on non-denaturing gels, compared with control oligos of similar base composition. By contrast, longer tracts approaching clinically relevant lengths appear to form higher order structures. The CD spectrum of shorter tracts is similar to triplex and pseudoknot nucleic acid structures and different from classical hairpin spectra. A model is outlined that enables the base stacking features of poly(r(G-C))2·poly(r(U)) or poly(d(G-C))2·poly(d(T)) triplexes to be achieved, even by a single 15-mer.

    INTRODUCTION

Trinucleotide repeat expansion diseases (TREDs)1 are genetic disorders that are caused by expanded tracts of repeated sequences (usually CNG) (reviewed in Ref. 1). Naturally variable lengths of these tracts occur in several genes involved in neurological disorders. Up to about 30 trinucleotide repeats do not cause neurological defects, but once a certain critical length is exceeded, disease ensues. The disease-causing alleles are usually dominant. The TREDs are complex syndromes that may exhibit anticipation (genetic instability leading to longer tract lengths with each generation). In several cases there is evidence that the longer the tract, the more severe or the earlier the onset of disease. Proof that long tracts cause disease rather than merely being correlated with it was obtained by artificially introducing long CNG tracts into mice, thereby inducing a version of spinocerebellar ataxia type I (2). In those experiments some, but not all, features of the syndrome were reproduced in the mouse model, and in fact it is known that even in humans, instability and other features of the syndromes depend on poorly understood contributions of the genetic background (3). Although long tracts cause disease, several lines of evidence suggest that the shorter tracts perform some useful function; for example, CNG tracts in neurodevelopmental genes seem to have grown during primate evolution (4).

There are three main outstanding questions regarding the TREDs: (i) what is the normal function of the CNG tracts?, (ii) why and how does the long tract cause disease?, and (iii) is it correct to talk of the TREDs as a unitary phenomenon, or should we be distinguishing Fragile X and myotonic dystrophy (where the CGG or CTG repeats are normally transcribed but not translated) from Huntington's Chorea, SCA1, and other diseases where the CAG repeats are translated into polyglutamine, resulting in proteins that differ from wild-type protein (5-8)? Answers to these questions require molecular analysis of the differences between the long and short tracts. Accordingly, work in various laboratories has investigated the effects of the triplet repeats on various macromolecules as discussed below.

Even when the triplet repeats are translated, the protein by itself cannot explain even the CAG TREDs, because if it did there would be a class of CA(Purine) diseases (CAA is another codon for glutamine). Furthermore, in the case of Fragile X it appears that absence of the protein and/or RNA in the cytoplasm causes the main features of the disease (9). There are several mechanisms that can lead to this result. One possibility is that expanded-tract alleles of FMR-1 are hypermethylated (perhaps because CGG repeats are particularly good substrates for methyltransferases; (10)), and this is associated with genetic shutdown. That absence of gene function can explain the condition is shown by the existence of rare deletions in FMR-1, leading to symptoms of Fragile X (11-13). Interestingly, intermediate-sized alleles (41-60 repeats) of the tract in FMR1 are associated with learning difficulties in boys, despite the fact that other symptoms of Fragile X are absent and that protein levels are normal (14). However, the totality of the syndrome also includes a DNA effect: the expanded-tract chromosomes themselves are physically fragile and genetically hypervariable. This may be a general property of CNG tracts, since these are known to mediate genetic instability not only in eukaryotes but also in Escherichia coli (16-18).2 Similar behavior of other repeated sequences was previously shown to reflect cellular reaction to the in vivo formation of non-B DNA secondary structures (19, 20). Less attention has been paid to the role of the RNA. Nevertheless, since in all cases the CNG tracts are transcribed but in the loci associated with the two commonest diseases (Fragile X and myotonic dystrophy), they are not translated, it is likely that effects at this level are of central importance, at least for the normal function of the tracts (21, 22). 

As for how the expanded tracts actually cause disease, the complex nature of the syndromes suggests that this is likely to involve all these aspects. For example, in cases where the tracts are translated, the production of mutant protein may have effects on the cell; if the expanded tracts form a novel structure in DNA, this may act to stimulate recombination or to compromise replication, thus accounting for anticipation and chromosomal fragility; and formation of a new structure in RNA may lead to abnormal stability, splicing, localization, or translation.

The clearest candidate for a CNG disease mediated largely by effects at the RNA level is myotonic dystrophy (21). The shortest repeats that are known from the general population are about 15 nucleotides in length (five repeats). The shorter abnormal alleles are associated with adult-onset myotonic dystrophy and the longer ones with the congenital form. In this case a CTG repeat (CUG in the RNA) is untranslated and located in a locus involved in neuromuscular development. The disease-causing allele is dominant. Long RNA containing the expanded tracts has been shown to be transcribed effectively and to accumulate in foci within the nucleus (23).

Our working hypothesis therefore is that there are special features of CNG RNA structure or biochemistry. These special features mediate some useful function in the short, wild-type alleles and possibly also a deleterious function in the long, disease-causing alleles. In this paper we have used in vitro transcription and chemical synthesis to produce CUG RNA of various sizes and compared it with control RNAs, such as GUC (different polarity) and a randomized but isobasic sequence, using gel electrophoresis, circular dichroism (CD) spectroscopy, and thermal melting of the RNA.

    EXPERIMENTAL PROCEDURES

Transcription of Long CNG Tracts and RNA Markers-- Plasmids were propagated in Escherichia coli DH5α, and supercoiled DNA was prepared by alkaline lysis and purified by ultracentrifugation in caesium chloride (Invitrogen) gradients that contained ethidium bromide (Sigma). Long CNG tracts and RNA markers were produced by in vitro transcription of linearized plasmids pGEM1, pGEM2, and Bluescript KS+ (all originally from Promega). Plasmid Bluescript KS+ was linearized with restriction enzymes Tsp501I, MwoI, HaeIII, and XbaI (all New England Biolabs) to produce markers that were 9, 24, 30, and 39 nucleotides long, respectively. Plasmids pGEM2 and pGEM1 were digested with MwoI and AluI, respectively. Transcription of these templates produced markers that were 16 and 18 nucleotides long. All transcriptions were performed on ~1 µg of appropriately linearized plasmid in a 20-µl volume (5× T7 buffer (200 mM Tris-HCl, pH 7.9, 30 mM MgCl2, 10 mM spermidine) and 100 mM dithiothreitol (Promega), RNAguard (Amersham Biosciences), 2.5 mM cold nucleotides A, G, and U, and 100 µM cold CTP (Amersham Biosciences), T7 RNA polymerase (Promega), RNase free water (Sigma)) containing 1 µl (1 µCi; 0.33 nmol) of radioactive [α-32P]CTP (3000 Ci/mmol: ICN-FLOW). To remove the template, the transcription reactions were treated with 2 µl of 0.1 mg/ml DNase I (RNase-free; Amersham Biosciences) and finally purified by phenol extraction and ethanol precipitation.

Design of Oligonucleotides for Transcription of Triplet Repeat RNA-- DNA templates for transcription were constructed using oligonucleotides synthesized on an Applied Biosystems machine using phosphoramidite chemistry. For efficient transcription it turned out to be crucial that each RNA start with a G, and to aid annealing a G + C-rich clamp was required at the "upstream" end of the duplex promoter region. In each case a top strand comprising the T7 promoter sequence was annealed to a bottom strand of which the 3' end was its complement, and the 5' end provided a template for the transcription of the following RNAs: G(CUG)5, G(CUG)6, and G(CUG)7. Control templates were constructed similarly, and they encoded G(GUC)5 or an isobasic control RNA 5'-GUGGUUGGCCUCCGUC-3'. DNase I treatment, phenol extraction, and ethanol precipitation were performed after transcription to remove the template.

The sequences used are as follows: top strand of T7 promoter, 5'-GCCGGTAATACGACTCACATA-3'; template for transcribing G(CUG)5, 5'-(CAG)5CTATAGTGAGTCGTATTACCGGC-3'; template for transcribing G(CUG)6, 5'-(CAG)6CTATAGTGAGTCGTATTACCGGC-3'; template for transcribing G(CUG)7, 5'-(CAG)7CTATAGTGAGTCGTATTACCGGC-3'; template for transcribing G(GUC)5, 5'-(GAC)5CTATAGTGAGTCGTATTACCGGC-3'; template for transcribing isobasic control RNA, 5'-GACGGAGGCCAACCACTATAGTGAGTCGTATTACCGGC-3'; synthetic RNA for control purposes, 5'-(r(CUG)6)-3'.
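
The relationship between each bottom-strand template and its expected transcript can be checked with a short sketch. It assumes, for illustration only, that the constant 3' stretch shared by all the templates (written here as the hypothetical constant PROMOTER_COMPLEMENT) is the promoter-complementary region, and that everything 5' of it on the template is transcribed, starting with the C that templates the initiating G. The function name expected_transcript is likewise hypothetical.

```python
from collections import Counter

# Assumption: the constant 3' stretch shared by every template is treated as
# the promoter-complementary region; the bases 5' of it are transcribed.
PROMOTER_COMPLEMENT = "TATAGTGAGTCGTATTACCGGC"

_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def expected_transcript(template: str) -> str:
    """Return the RNA expected from a 5'->3' bottom-strand DNA template."""
    if not template.endswith(PROMOTER_COMPLEMENT):
        raise ValueError("template does not end with the promoter complement")
    transcribed = template[: -len(PROMOTER_COMPLEMENT)]
    # Reverse-complement the transcribed region, then write T as U.
    return transcribed.translate(_DNA_COMPLEMENT)[::-1].replace("T", "U")


# Templates as listed in the text, with the repeat notation expanded.
templates = {
    "G(CUG)5": "CAG" * 5 + "CTATAGTGAGTCGTATTACCGGC",
    "G(CUG)6": "CAG" * 6 + "CTATAGTGAGTCGTATTACCGGC",
    "G(CUG)7": "CAG" * 7 + "CTATAGTGAGTCGTATTACCGGC",
    "G(GUC)5": "GAC" * 5 + "CTATAGTGAGTCGTATTACCGGC",
    "isobasic control": "GACGGAGGCCAACCACTATAGTGAGTCGTATTACCGGC",
}

for name, template in templates.items():
    rna = expected_transcript(template)
    # Print the transcript, its length, and its base composition.
    print(name, rna, len(rna), dict(Counter(rna)))
```

Under these assumptions the three CUG templates give transcripts of 16, 19, and 22 nucleotides, and the isobasic control has the same base composition (C5, G6, U5) as G(CUG)5.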

Electrophoresis of Triplet Repeat RNA-- RNA markers and triplet repeat RNA transcripts were run on 10% non-denaturing and denaturing (7 M urea (Sigma)) acrylamide (29:1 acrylamide to bisacrylamide (National Diagnostics)) gels. In all cases the electrophoresis buffer was 1× TBE (90 mM Tris, pH 8.0 (Sigma), 90 mM boric acid (Sigma), 2 mM EDTA (Sigma)). Gels were fixed in 10% acetic acid and dried under vacuum onto Whatman 3MM paper, before autoradiography using Kodak film.

End Labeling of DNA Markers and of Synthetic RNA-- DNA oligonucleotide markers 8-32 were purchased from Amersham Biosciences and labeled using T4 polynucleotide kinase (New England Biolabs) and [γ-32P]ATP (3000 Ci/mmol; PerkinElmer Life Sciences). Synthetic RNA was labeled in a similar manner.

Circular Dichroism and UV Melting-- RNA and DNA oligonucleotides for both CD and UV melting were dissolved to a concentration of 4 µM (for DNA) or 8 µM (for RNA) in 5 mM sodium phosphate buffer, pH 7.5, unless otherwise stated. All solutions for both the CD and UV melting experiments were filter-purified using a 0.2-µm nylon filter (Sigma). CD spectra were gathered on a Jasco J-715 CD spectropolarimeter using a 5-mm path length cell. Data were stored using the supplied software and then exported to Kaleidagraph for manipulation. Data were collected over the range 350-200 nm with a 0.5 nm resolution and 1 nm bandwidth at a speed of 100 nm/min; spectra were averaged over 16 scans. For comparison purposes the RNA CD data have been normalized to 1.2 OD260, while the DNA CD spectra have been normalized to 0.6 OD260.
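
The normalization step can be illustrated with a minimal sketch. It assumes that normalization means scaling each CD value linearly by the ratio of the target OD260 to the measured OD260; the text does not spell out the procedure, so this is an interpretation. The function name normalize_cd and the numbers in the example are hypothetical.

```python
def normalize_cd(signal, a260_measured, a260_target):
    """Scale CD values so the spectrum corresponds to a260_target.

    Assumes the CD signal is proportional to absorbance at 260 nm.
    """
    if a260_measured <= 0:
        raise ValueError("measured A260 must be positive")
    scale = a260_target / a260_measured
    return [value * scale for value in signal]


# Hypothetical RNA spectrum fragment measured at OD260 = 0.6,
# rescaled to the OD260 of 1.2 used for the RNA data.
print(normalize_cd([2.0, -4.0, 1.0], a260_measured=0.6, a260_target=1.2))
```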

UV melting was conducted on a Cary 1 spectrophotometer ramped at 1 degree/min and the data collected with the supplied software, before exporting to Kaleidagraph for manipulation. Prior to the melting curve determinations, the samples were heated to 95 °C and cooled slowly (24). Melting curve measurements were repeated at least three times, and no significant differences were found between each set of data. The data were analyzed following Marky and Breslauer (25). The fraction of unfolded RNA in solution was determined by calculating,
$$\alpha = \frac{x}{x + y} \quad \text{(Eq. 1)}$$
where x is the difference between the final absorbance of the unfolded RNA and the absorbance at a given temperature, and y is the difference in absorbance between the temperature-dependent absorbance and the base line corresponding to the temperature dependence of the unmelted RNA absorbance. The midpoint of the plot of α versus 1/T, where T is the absolute temperature (in Kelvin), gives the melting temperature (by definition the temperature at which half of the RNA is melted). The van't Hoff transition enthalpy of the unimolecular transition is,
$$\Delta H_{\mathrm{VH}} = \frac{B'}{(1/T_{\mathrm{max}}) - (1/T_{2})} \quad \text{(Eq. 2)}$$
where B' = 3.50 cal K⁻¹ mol⁻¹ (25), Tmax is the absolute temperature at the maximum of the derivative of the α versus 1/T plot, and T2 is the absolute temperature at the high temperature half-height of that derivative curve. The entropy of the transition is given as follows.
$$\Delta S = \frac{\Delta H_{\mathrm{VH}}}{T_{m}} \quad \text{(Eq. 3)}$$
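
Eq. 1 and the midpoint definition of the melting temperature can be sketched as follows. The sketch follows the definitions of x and y literally, so α evaluates to 1 on the low-temperature base line and 0 at the final absorbance. The function names and the absorbance values are hypothetical, and linear interpolation in 1/T is an assumption about how the midpoint is located.

```python
def fraction_alpha(absorbance, baseline, final_absorbance):
    """Eq. 1 at one temperature: alpha = x / (x + y)."""
    x = final_absorbance - absorbance  # distance below the final absorbance
    y = absorbance - baseline          # distance above the unmelted base line
    return x / (x + y)


def melting_temperature(temps_k, alphas):
    """Temperature (K) where alpha crosses 0.5, interpolating linearly in 1/T."""
    inv_t = [1.0 / t for t in temps_k]
    for i in range(len(alphas) - 1):
        a0, a1 = alphas[i], alphas[i + 1]
        if a0 != a1 and (a0 - 0.5) * (a1 - 0.5) <= 0:
            u_mid = inv_t[i] + (0.5 - a0) * (inv_t[i + 1] - inv_t[i]) / (a1 - a0)
            return 1.0 / u_mid
    raise ValueError("alpha never crosses 0.5")


# Hypothetical melting data: temperature (K), absorbance, base line value.
temps_k = [300.0, 320.0, 330.0, 350.0]
absorbances = [0.500, 0.525, 0.575, 0.600]
baselines = [0.500, 0.500, 0.500, 0.500]
final_absorbance = 0.600

alphas = [fraction_alpha(a, b, final_absorbance) for a, b in zip(absorbances, baselines)]
print(alphas)
print(melting_temperature(temps_k, alphas))
```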

Molecular Modeling-- A variety of techniques were used to perform the conformational search within CHARMm version 25.2. These methods included a grid search based on the backbone torsion angles, a Boltzmann jump algorithm, and molecular dynamics simulations at temperatures of both 300 and 600 K. NOE constraints were used in some of the calculations to improve the chance of locating suitable Watson-Crick C-G base pairings within each triplet and also to locate possible stacked triplets; other calculations were performed without constraints. In all cases, the selected conformations were energy minimized at the end of the conformational searches; where NOE constraints were imposed the minimization was initially done with constraints imposed, but then subsequently with the constraints removed.

    RESULTS

Electrophoretic Mobility of CUG RNAs-- The mobility of end-labeled chemically synthesized (CUG)6 electrophoresed on denaturing gels together with RNA markers (Fig. 1A) shows that the RNA migrates as expected for an 18-mer. Electrophoresis of CUG repeat RNAs and control RNAs on denaturing gels shows that in vitro transcription generates bands of the expected sizes for all the samples (Fig. 1C). When these CUG repeat RNAs are electrophoresed on non-denaturing gels (Fig. 1, B and D), they migrate significantly faster than the marker RNAs. Randomized and GUC repeat RNAs migrate similarly to controls. The overall conclusion to be drawn from Fig. 1, A-D, is that the CUG RNAs are all unusually compact, whether they are made by chemical synthesis or in vitro transcription and that over this size-range their degree of compaction increases with tract length. This is in accord with the behavior previously observed for single-stranded DNA (26).


Fig. 1.   A, synthetic r(CUG)6 electrophoresed on a denaturing gel together with RNA markers. B, synthetic r(CUG)6 electrophoresed on a non-denaturing gel together with RNA markers. C, migration of short CUG tract RNA products transcribed from DNA oligonucleotides electrophoresed on a 10% denaturing gel. Tracts of length 16, 19, and 22 bases with RNA markers are shown. Full-length RNAs are indicated together with prematurely terminated products. Transcriptions from random oligonucleotides (R, randomized C5,G6,U5, right hand lane) produce very few prematurely terminated products. D, RNA products as in C electrophoresed on a 10% non-denaturing gel. E, gel electrophoresis of in vitro transcripts of a range of plasmids containing 24-51 CUG repeats on a denaturing gel together with RNA markers. A double-stranded DNA marker (M) is shown, and the numbers correspond to numbers of base-pairs. F, transcribed RNA products as in E electrophoresed on a 10% non-denaturing gel. G, DNA (CTG)5 oligonucleotides at concentrations of 1-100 µM on a 15% non-denaturing polyacrylamide gel.

For all the CUG and GUC repeats, it is interesting to note that extra bands at low molecular weight are observed. These extra small bands are not observed for the randomized control and probably reflect abortive transcription. DNA polymerase also appears to pause at CNG tracts (27).3 In addition to bands that move faster than the appropriate markers, some that move more slowly, especially in the six-repeat (19 nucleotides) and seven-repeat (22 nucleotides) tracks, are observed. These are present neither in the denaturing gel nor in the chemically synthesized sample, so they are probably higher order complexes of RNA, possibly DNA/RNA hybrids that resist DNase I treatment. Multistrand complexes of CNG DNA have been observed previously by other workers (28).

Mobility of Long RNAs from in Vitro Transcription of Plasmids-- Structural work on triplet repeats has so far been confined to short tracts that cannot be directly compared with those from TRED patients. Recently we have managed to create RNAs of lengths approaching clinical relevance by in vitro transcription of a range of plasmids containing 24-51 CTG repeats. These plasmids were obtained by cloning CNG tracts into bacterial plasmids under conditions where phenomena similar to anticipation may be observed.2 The gel mobility of these RNAs is shown in Fig. 1, E and F. These in vitro transcriptions are not marked by the high levels of premature termination seen for the oligonucleotide templates. On the non-denaturing gel (Fig. 1F) the CNG RNAs run as two bands that migrate anomalously slowly (in contrast to the anomalously fast migration of the smaller triplet repeats). Comparison with the mobilities of the markers shows this to be consistent with the formation of duplex or higher order structures. Some sort of structural transition appears to occur between the 5-7 repeat range and the 36-50 repeat range. Since this structural transition occurs at a length close to the threshold for myotonic dystrophy, this result is potentially of some significance.

CD Spectra of Triplet Repeat RNA and DNA-- To investigate the nature of the unusual compact structures found in the gels, CD spectra were collected for the chemically synthesized (CTG)5 and (CUG)5 samples and randomized controls. The CD spectrum of (CTG)5 in phosphate buffer consists of a positive peak at 285 nm and two negative peaks at 258 and 208 nm (Fig. 2A). The 285 nm maximum is at too long a wavelength for the molecule to be adopting an A-DNA conformation, and the spectra do not contain a positive peak at around 205 nm, which would be expected if the tract were adopting duplex B-DNA structure (29-31). The negative peak at 208 nm suggests that the tract has triplex character, as a negative peak between 200 and 220 nm is considered a hallmark of triplex DNA (29, 30, 32). Spectra for triplexes can usually be approximated as the sum of the CD for the component duplex and single-stranded DNA plus some extra intensity due to the more rigid structure (33). In accord with this, the observed spectrum closely resembles that expected for the sum of poly(d(G-C))2 and poly(d(T)) (34). Addition of 20 mM NaCl or 20 mM MgCl2 increases the 285 nm and 258 nm bands to the same intensity and increases the 208 nm band. The 208 nm band is particularly enhanced by MgCl2, which probably correlates with a stabilization of base stacking and more effective reduction of phosphate-phosphate repulsion (35-37).


Fig. 2.   A, CD of (CTG)5 DNA (4 µM) at 20 °C in 5-mm path length cells in 5 mM sodium phosphate (---), plus 20 mM NaCl (· · · ·), and plus 20 mM MgCl2 (- - -). B, CD spectra of chemically synthesized (CUG)5 RNA (8 µM) (---) and an isobasic randomized control (- - -) in 5 mM sodium phosphate at 20 °C in 5-mm path length cells. C, effect of temperature on the (CUG)5 RNA of Fig. 2B: ---, 5 °C; - - -, 25 °C; --- · ---· ---, 45 °C; - ·---·-, 65 °C; · · · ·, 80 °C. D, melting curve of (CUG)5 RNA (8 µM) in 5 mM sodium phosphate, 5-mm path length cell. Ramp rate = 1 degree/min. E, fraction of melted RNA versus 1/T. F, derivative of E with respect to 1/T as a function of temperature. G, CD spectrum of a transcribed 50-repeat CUG RNA sequence in 5 mM phosphate at 20 °C in 5-mm path length cells. The data set has been normalized to OD260 = 1.2. H, UV melting curve of (CUG)50 RNA in 5 mM sodium phosphate, 5-mm path length cell. Ramp rate = 1 degree/min.

To investigate whether the observed triplex-like structure was inter- or intramolecular, labeled (CTG)5 was titrated with unlabeled (CTG)5. Gel electrophoresis showed no evidence for intermolecular structures (Fig. 1G), which would be expected to shift the band up the gel due to the increased mass of the complex. There was also no evidence of concentration dependence of spectra other than that required by the Beer-Lambert Law. We therefore conclude that (CTG)5 forms a monomolecular base stacking structure that has spectroscopic triplex characteristics.

The CD spectra of (CUG)5 RNA and an isobasic randomized control at 20 °C are shown in Fig. 2B. The (CUG)5 spectrum has a positive band at 269 nm, a small negative band at 237 nm, and a large negative band at 208 nm. The 208 nm band, which is indicative of RNA base pairing and stacking (38), is not present in the randomized control, suggesting that (CUG)5 RNA is significantly more base-paired and stacked than the control. The long wavelength peak (at 269 nm rather than the 285 nm of the corresponding DNA and 282 nm for the RNA control) is very sensitive to base composition and RNA conformation (38, 39). The overall spectrum is very similar to those obtained for RNAs containing pseudoknots with a small but significant negative band at ~237 nm (40, 41) (naturally occurring A-form RNAs generally lack any marked band at this wavelength (42-44)). The spectrum of the intermolecular triplex formed by poly(A)-poly(G)-poly(C) is also similar to that of the (CUG) repeat, except that the small negative band for this intermolecular triplex is at 245 nm (33). Since an intramolecular pseudoknot might be described as a pseudo-triplex, these two descriptions are equally valid.

The Effect of Temperature on (CUG)5 RNA-- To analyze the effect of temperature on (CUG)5 RNA, we measured the CD spectra at a range of temperatures between 5 and 80 °C (Fig. 2C). The negative band at 208 nm decreases markedly with increased temperature as the base pairs melt apart (44). At 65 °C, the 208 nm band is absent and the 267 nm band decreases and shifts to 275 nm resembling the randomized control, indicating that the (CUG)5 structure is completely melted. The series of spectra obtained are almost identical to CD spectra of a 59-nucleotide flavivirus pseudoknot at various temperatures (41).

The melting curve for (CUG)5 (Fig. 2D) was analyzed following Ref. 25 to give a single transition with melting temperature of 54.8 °C. The van't Hoff transition enthalpy of the transition was determined from Fig. 2F (Tmax = 328.1 K and T2 = 336.6 K) to be ΔHVH = (190 ± 30) kJ mol⁻¹ and ΔS = (580 ± 90) J K⁻¹ mol⁻¹. A single transition is not inconsistent with the possibility that the structure may resemble a pseudoknot; detailed studies of the thermal melting of pseudoknots show that they can melt with one apparent transition (45, 46). However, Tm and ΔH values are high for a 15-mer of any kind, especially as this usually means <8 base pairs. For example, the Tm for a duplex with a similar ratio of G/C and A/U in 1 M NaCl (GUCUAGAC) (versus 5 mM sodium phosphate in our experiments) is 56.2 °C (47) and the Tm and ΔH values for a 23-mer RNA hairpin (with 7 Watson-Crick base pairs) in 10 mM sodium phosphate are 50.9 °C and 193 kJ mol⁻¹, respectively (48). Thus whatever structure is adopted is very stable and is unlikely to be a hairpin.
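
As a check on the arithmetic, Eqs. 2 and 3 can be evaluated with the values quoted above. This is a minimal sketch: the function names are hypothetical, and the conversion factor of 4.184 J per calorie (the thermochemical calorie) is an assumption about the unit convention used.

```python
CAL_TO_J = 4.184  # assumption: thermochemical calorie


def vant_hoff_enthalpy(t_max_k, t2_k, b_prime=3.50):
    """Eq. 2: van't Hoff enthalpy in cal/mol from Tmax and T2 (both in K)."""
    return b_prime / (1.0 / t_max_k - 1.0 / t2_k)


def transition_entropy(delta_h, t_m_k):
    """Eq. 3: entropy of the transition, in the energy units of delta_h per K."""
    return delta_h / t_m_k


# Values reported in the text: Tmax = 328.1 K, T2 = 336.6 K, Tm = 54.8 degrees C.
delta_h_j = vant_hoff_enthalpy(328.1, 336.6) * CAL_TO_J
t_m_k = 54.8 + 273.15
delta_s = transition_entropy(delta_h_j, t_m_k)

print(f"dH_VH = {delta_h_j / 1000:.0f} kJ/mol")
print(f"dS = {delta_s:.0f} J/(K mol)")
```

With these inputs the sketch gives about 190 kJ mol⁻¹ and 580 J K⁻¹ mol⁻¹, in line with the central values reported.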

The RNA containing (CUG)50 by way of contrast has two thermal transitions occurring at ~40 °C and 76 °C (Fig. 2H) and a CD spectrum resembling that of A-form RNA (Fig. 2G). It is interesting to note that there are two bands when this sequence is run on a non-denaturing gel (Fig. 1F). It may therefore be that the two transitions reflect two distinct species melting rather than one species undergoing a two stage transition.

Molecular Modeling-- A conformational study of a single strand of RNA was undertaken to assess the viability of the proposed structure (see below). This was not intended to be a definitive modeling study; indeed, the scope of this study would not allow for such a major computational undertaking. Rather, the aim was to ensure that the proposed structure was reasonable; in particular it should be thermally accessible and represent at least a local free energy minimum structure.

Calculations were performed on both 5'-CUG and 5'-CUGCUG sequences. In general, low energy conformations involving the formation of three C-G H-bonds within each triplet were found to occur readily and with distances in the range 2-2.2 Å between the "heavy" (non-hydrogen) atoms. A stable stacking of these Watson-Crick base pairs involving the hairpin turn of the backbone envisaged in Fig. 3A was less easy to locate, but was found to occur with energies comparable with those found in the more conventional helical stacking patterns. An example of such a conformation is shown in Fig. 3B. The distance between the two Watson-Crick base pairs is about 4 Å, and the H-bonds within each C-G pair are all in the range 2-2.3 Å. There is some tilting of the base pairs relative to each other, but it is likely that this would be reduced by π-stacking interactions in an extended oligomer. It is also interesting to note that the uracils tend to orient parallel to each other and outside the C-G stack. The distance between the uracils is about 5 Å, which would be close enough to allow for favorable π-stacking interactions and spectroscopic interactions.


Fig. 3.   A, schematic of the toblerone structure. B, alternate views of a toblerone energy-minimized structure for 5'-CUGCUG resulting from a conformational search where NOE constraints were initially imposed then subsequently removed. Color coding of the atoms is: medium gray, carbon; black, oxygen; dark gray, nitrogen; light gray, hydrogen.


    DISCUSSION

In this paper, we have used circular dichroism, UV absorption as a function of temperature, and gel electrophoresis to make some progress in understanding the structure(s) adopted by CNG tracts under our conditions. The spectroscopic data show that both (CUG)5 and (CTG)5 adopt highly ordered structures, which are different from those of the control randomized nucleic acids, in accord with the gel mobility data. The CD spectrum for (CUG)5, for example, is predominantly A-form but resembles spectra obtained for RNA containing a pseudoknot or triplex structure. The CD spectra for (CTG)5 DNA show that it does not adopt a duplex B-form conformation. The structure that is formed is intramolecular and more stable than a random sequence.

We have also shown that CNG RNA forms compact structures and that the structure becomes relatively more compact as tract length increases for short to medium length repeats. We suggest that the compact structure seen for medium length tracts is biologically important, is probably required for normal gene function, and that the inability of very short tracts to adopt it may be a factor in maintaining them above a certain minimum length. In support of this hypothesis, in the locus associated with myotonic dystrophy, for example, tracts shorter than 5 repeats are not observed in the normal population (1). Proteins that bind CNG repeats have been isolated (49, 50), and it is possible that the compact structure formed by these repeats binds to proteins that are involved in nuclear export (23). The gel mobilities of long CNG RNA are also consistent with high order structures, if in addition to being more compact they are also significantly more rigid than duplex RNA (since rigidity reduces gel mobility).

But what are the compact structures associated with CNG tracts in nucleic acids? Although all laboratories working in the area agree that short-to-medium length CNG tracts adopt compact structures, there is significant disagreement about the nature of these structures. The structure adopted probably depends on tract length, sequence context, DNA/RNA concentration, and environmental conditions such as temperature and ionic strength (26, 28, 51). The simplest structural model for a CNG repeat is a mismatched hairpin (52-54), and it is possible that under some conditions such a structure is indeed adopted (16, 55). However, a mismatched hairpin does not readily explain the following facts: (i) CNG tracts readily adopt a highly compact structure but GNC tracts do not, (ii) the structure becomes more compact as its length increases (26), (iii) much better hairpin-forming sequences exist in eukaryotic genomes, e.g. Ref. 56, but are not to our knowledge associated with genetic instability or disease.

We therefore propose that the CNG repeats fold to form a triangular motif, and the linked triangular motifs then stack on top of each other, probably by rotation relative to the 5' neighbor. It is possible to build such a model with the C and G of each triplet hydrogen-bonded together and the thymine/uracil held approximately coplanar with the base pair at the apex of the triangle. Steric constraints mean that the T/U is almost certainly required to extend out into solution and will hydrogen bond with water rather than with the base pair as is usually observed for a triplex. When the trimers stack on top of one another the net effect will be a duplex analogous to poly(d(G-C))2 or poly(r(G-C))2 plus a single-stranded poly(d(T)) or poly(r(U)). The single strand will be held in place by the requirement of the triplet conformation. Addition of salt (such as Mg2+) would further stabilize the T/U stack. Overall we envisage this structure as resembling the stacked triangles of the confection "toblerone", but with each triangular segment rotated relative to the previous one (Fig. 3A). Such a "twisted-toblerone" nucleic acid structure is sterically constrained, but preliminary molecular modeling calculations have indicated that it is energetically viable and a possible molecular structure is illustrated in Fig. 3B. The toblerone model is not at all similar to the "triad DNA" model (57), since the former is a way of folding a single strand of DNA or RNA into a compact structure, whereas the latter is a way of rearranging two Watson-Crick paired strands to give a slipped structure, in which base triplets rather than base pairs link the strands.

Taken together, these results suggest that "normal" medium-sized (CUG)5 repeats may form a novel structure in vitro that is specific to CNG sequences. It is possible that the structure formed is specifically recognized by proteins involved in nuclear export or in other essential functions such as stability or translation. The stability of the CUG repeats, as shown by our spectroscopic experiments, suggests that "long" CUG repeats that are associated with neurodegenerative disease may form a higher order multiplex structure. Perhaps the twisted toblerone bends back upon itself, allowing H-bonds between the extended Ts to occur. This type of higher order structure may give rise to previously reported tetraplex conformations (15) and so the appearance of an A-form CD spectrum between the extended Ts. Higher order structures may mask the binding sites for specific binding proteins, and the inherent stability of the repeats may cause toxic build-up of the repeat-containing RNAs in the nucleus, particularly in neural cells where cell division and nuclear breakdown rarely occur.

    ACKNOWLEDGEMENTS

We thank Professor Pat Jacobs for inspiration and constructive criticism and Colin Derrick for expert photographic assistance.

    FOOTNOTES

* This work was supported by the National Health Service South West Region Research and Development Directorate and the Engineering & Physical Sciences Research Council Life Sciences Interface: GR/M91105. The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

To whom correspondence should be addressed: Dept. of Chemistry, University of Warwick, Coventry CV4 7AL, UK. Tel.: 44-24-76523234; Fax: 44-24-76524112; E-mail: A.Rodger@warwick.ac.uk.

Published, JBC Papers in Press, June 19, 2002, DOI 10.1074/jbc.M202235200

2 G. Scarlett and P. Pinheiro, manuscript in preparation.

3 A. Murray, unpublished observations.

    ABBREVIATIONS

The abbreviations used are: TRED, trinucleotide repeat expansion disease; NOE, nuclear Overhauser effect.

    REFERENCES

1. Warren, S. T., and Nelson, D. L. (1993) Curr. Opin. Neurobiol. 3, 752-759
2. Burright, E., Clark, H., Servadio, A., Matilla, T., Feddersen, R., Yunis, W., Duvick, L., Zoghbi, H., and Orr, H. (1995) Cell 82, 937-948
3. Goldberg, Y., McMurray, C., Zeisler, J., Almqvist, E., Sillence, D., Richards, F., Gacy, A., Buchanan, J., Telenius, H., and Hayden, M. R. (1995) Hum. Mol. Genet. 4, 1911-1918
4. Djian, P., Hancock, P., and Chana, H. (1996) Proc. Natl. Acad. Sci. U. S. A. 93, 417-421
5. Davies, S. W., Turmaine, M., Cozens, B. A., DiFiglia, M., Sharp, A. H., Ross, C. A., Scherzinger, E., Wanker, E. E., Mangiarini, L., and Bates, G. P. (1997) Cell 90, 537-548
6. Li, X.-J., Li, S.-H., Sharp, A. H., Nucifora, F. C. J., Schilling, G., Lanahan, A., Worley, P., Snyder, S. H., and Ross, C. A. (1995) Nature 378, 398-402
7. Scherzinger, E., Lurz, R., Turmaine, M., Mangiarini, L., Holenbach, B., Hasenbank, R., Bates, G. P., Davies, S. W., Lehrach, H., and Wanker, E. E. (1997) Cell 90, 549-558
8. Trottier, Y., Lutz, Y., Stevanin, G., Imbert, G., Devys, D., Cancel, G., Saudou, F., Weber, C., David, G., Tora, L., Agid, Y., Brice, A., and Mandel, J.-L. (1995) Nature 378, 403-406
9. Kooy, R. F., Willemsen, R., and Oostra, B. A. (2000) Mol. Med. Today 6, 193-198
10. Smith, S. S., Laayoun, A., Lingeman, R. G., Baker, D. J., and Riley, J. (1994) J. Mol. Biol. 243, 143-151
11. Gu, Y., Lugenbeel, K. A., Vockley, J. G., Grody, W. W., and Nelson, D. L. (1994) Hum. Mol. Genet. 3, 1705-1706
12. Lugenbeel, K. A., Peier, A. M., Carson, N. L., Chudley, A. E., and Nelson, D. L. (1995) Nat. Genet. 10, 483-485
13. Trottier, Y., Imbert, G., Poustka, A., Fryns, J.-P., and Mandel, J.-L. (1994) Am. J. Med. Genet. 51, 454-457
14. Murray, A., Youings, S., Dennis, N., Latsky, L., Linehan, P., McKechnie, N., MacPherson, J., Pound, M., and Jacobs, P. (1996) Hum. Mol. Genet. 5, 727-735
15. Fry, M., and Loeb, L. (1994) Proc. Natl. Acad. Sci. U. S. A. 91, 4950-4954
16. Darlow, J., and Leach, D. (1995) Genetics 141, 825-832
17. Jaworski, A., Rosche, W. A., Gellibolian, R., Kang, S., Shimizu, M., Bowater, R. P., Sinden, R. R., and Wells, R. D. (1995) Proc. Natl. Acad. Sci. U. S. A. 92, 11019-11023
18. Ohshima, K., Kang, S., and Wells, R. D. (1996) J. Biol. Chem. 271, 1853-1856
19. Greaves, D., Patient, R., and Lilley, D. (1985) J. Mol. Biol. 185, 461-478
20. McClellan, J. A., Boublikova, P., Palecek, E., and Lilley, D. M. J. (1990) Proc. Natl. Acad. Sci. U. S. A. 87, 8373-8377
21. Hamshere, M. G., and Brook, J. D. (1996) Trends Genet. 12, 332-334
22. McClellan, J. A., and Newbury, S. F. (1996) Nature 379, 396
23. Taneja, K. L., McCurrach, M., Schalling, M., Housman, D., and Singer, R. H. (1995) J. Cell Biol. 128, 995-1002
24. Puglisi, J. D., and Tinoco, I. J. (1989) Methods Enzymol. 180, 304-325
25. Marky, L. A., and Breslauer, K. J. (1987) Biopolymers 26, 1601-1620
26. Mitchell, J. E., Newbury, S. F., and McClellan, J. A. (1995) Nucleic Acids Res. 23, 1876-1881
27. Kang, S., Ohshima, K., Shimizu, M., Amirhaeri, S., and Wells, R. D. (1995) J. Biol. Chem. 270, 27014-27021
28. Smith, G. K., Jie, J., Fox, G. E., and Gao, X. L. (1995) Nucleic Acids Res. 23, 4303-4311
29. Park, Y. W., and Breslauer, K. J. (1992) Proc. Natl. Acad. Sci. U. S. A. 89, 6653-6657
30. Gray, D. M., Ratcliff, R. L., and Vaughan, M. R. (1992) Methods Enzymol. 211, 389-405
31. Scaria, P. V., and Shafer, R. H. (1991) J. Biol. Chem. 266, 5417-5423
32. Radhakrishnan, I., and Patel, D. J. (1994) Biochemistry 33, 11405-11416
33. Chastain, M., and Tinoco, I. J. (1992) Nucleic Acids Res. 20, 315-318
34. Johnson, K. H., Gray, D. M., Morris, P. M., and Sutherland, J. C. (1990) Biopolymers 29, 325-333
35. Schweisguth, D. C., Chelladurai, B. S., Nicholson, A. W., and Moore, P. B. (1994) Nucleic Acids Res. 22, 604-612
36. Laing, L., and Draper, D. E. (1994) J. Mol. Biol. 237, 560-576
37. Cole, P. E., Yang, S. K., and Crothers, D. M. (1972) Biochemistry 11, 4358-4368
38. Gray, D., Liu, J., Ratcliff, R. L., and Allen, F. S. (1981) Biopolymers 20, 1337-1382
39. Causley, G. C., and Johnson, W. C. J. (1982) Biopolymers 21, 1763-1780
40. Johnson, K. H., and Gray, D. M. (1992) J. Biomol. Struct. Dyn. 9, 733-746
41. Shi, P.-Y., Brinton, M. A., Veal, J. M., Zhong, Y. Y., and Wilson, W. D. (1996) Biochemistry 35, 4222-4230
42. Johnson, K. H., and Gray, D. M. (1991) Biopolymers 31, 385-395
43. Loret, E. P., Georgel, P., Johnson, W. C. J., and Ho, P. S. (1992) Proc. Natl. Acad. Sci. U. S. A. 89, 9734-9738
44. Newbury, S. F., McClellan, J. A., and Rodger, A. (1996) Anal. Commun. 33, 117-121
45. Spedding, G., Gluick, T. C., and Draper, D. E. (1992) J. Mol. Biol. 229, 609-622
46. Gluick, T., and Draper, D. (1994) J. Mol. Biol. 241, 246-262
47. Freier, S. M., Kierzek, R., Jaeger, J. A., Sugimoto, N., Caruthers, M. H., Neilson, T., and Turner, D. H. (1986) Proc. Natl. Acad. Sci. U. S. A. 83, 9373-9377
48. Sarkar, M., Sigurdsson, S., Tomac, S., Sen, S., Rozners, E., Sjoberg, B.-M., Stromberg, R., and Graslund, A. (1996) Biochemistry 35, 4678-4688
49. Yano-Yanagisawa, H., Li, Y., Wang, H., and Kohwi, Y. (1995) Nucleic Acids Res. 23, 2654-2660
50. Timchenko, L. T., Miller, J. W., Timchenko, N. A., DeVore, D. R., Datar, K. V., Lin, L., Roberts, R., Caskey, C. T., and Swanson, M. S. (1996) Nucleic Acids Res. 24, 4407-4414
51. Kohwi, Y., Wang, H., and Kohwi-Shigematsu, T. (1993) Nucleic Acids Res. 21, 5651-5655
52. Gacy, A., Goellner, G., Juranic, N., Macura, S., and McMurray, C. (1995) Cell 81, 533-540
53. Mariappan, S. V., Catasti, P., Chen, X., Ratliff, R., Moyzis, R. K., Bradbury, E. M., and Gupta, G. (1996) Nucleic Acids Res. 24, 784-792
54. Mitas, M., Yu, A., Dill, J., and Haworth, I. S. (1995) Biochemistry 34, 12803-12811
55. Mitas, M., Yu, A., Dill, J., Kamp, T. J., Chambers, E. J., and Haworth, I. S. (1995a) Nucleic Acids Res. 23, 1050-1059
56. McClellan, J. A., Palecek, E., and Lilley, D. M. J. (1986) Nucleic Acids Res. 14, 9291-9309
57. Kuryavyi, V. V., and Jovin, T. M. (1995) Nat. Genet. 9, 339-341


Copyright © 2002 by The American Society for Biochemistry and Molecular Biology, Inc.