CUG Repeats Present in Myotonin Kinase RNA Form Metastable “Slippery” Hairpins*

We show that CUG repeats form “slippery” hairpins in their natural sequence context of the myotonin kinase gene transcript. This novel type of RNA structure is characterized by strong S1 and T1 nuclease and lead cleavages in the terminal loop and by mild lead cleavages in the hairpin stem. The latter effect indicates a relaxed metastable structure of the stem. (CUG)5 repeats do not form any detectable secondary structure, whereas hairpins of increasing stability are formed by (CUG)11, (CUG)21, and (CUG)49. The potential role of the RNA hairpin structure in the pathogenesis of myotonic dystrophy is discussed.

Eleven human diseases associated with the expansion of trinucleotide repeats have been identified so far. The progress of research in this new area has been discussed in several recent reviews (1-5). One of these diseases is myotonic dystrophy (dystrophia myotonica (DM)), the most prevalent form of muscular dystrophy in adults, with a global incidence of 1 in 8000. The molecular basis of this multisystemic disease, with a complex clinical picture, is the expanded CTG repeat in the myotonic dystrophy protein kinase (DMPK) gene. The repeat expansions that cause DM range from 50 repeats in mildly affected patients to >2000 repeats in the most severe congenital cases (6-8).
Despite the fact that the nature of the underlying DM mutation has been known for the past 6 years, the molecular pathology of this disease is not understood. In particular, it has been difficult to find the molecular mechanism by which this specific mutation, located in the 3′-UTR of the DMPK gene, causes the dominantly inherited disease. One of several recent proposals links the inheritance pattern with the molecular effects observed at the RNA level. It takes advantage of the observation that dramatic decreases in both mutant and normal DMPK poly(A)+ RNAs, as compared with the primary transcripts, occur in the DM tissue (9). This suggests that the expanded CUG repeat may have a dominant effect either on the processing of both the normal and expanded transcripts or on their transport to the cytoplasm. Other authors observed only a decrease in the expanded poly(A)+ transcript (10). The specific (CUG)n-binding proteins, discovered recently (11-13), may be involved in normal DMPK RNA processing, transport, and/or translation. These RNA-protein interactions may be impaired by the expanded DMPK transcript.
Understanding the molecular basis of the DM disease is hampered by the lack of knowledge of the RNA structure formed by the CUG repeats. To fill this gap and to provide a background for further studies on the role of RNA level effects in the pathogenesis of myotonic dystrophy, we have analyzed the structure of the repeat region of DMPK RNA. In this paper, we describe the properties of an unusual "slippery" RNA hairpin that contains a metastable stem. The stability of the hairpin increases with the repeat length, and we postulate that long stable hairpins are important factors in DM pathogenesis.

MATERIALS AND METHODS
DNA Templates for in Vitro Transcription-A plasmid containing 49 CTG repeats was obtained from Robert D. Wells. The alleles of the human DMPK gene containing 5, 11, and 21 CTG repeats were selected from our laboratory genomic DNA collection. Control DNA samples from healthy individuals were screened by the polymerase chain reaction with primers described earlier (8). The length of radiolabeled polymerase chain reaction products was compared with the M13 sequencing ladder on a polyacrylamide gel to determine the number of CTG repeats present. Bands corresponding to the selected alleles were cut out from the gel, eluted with 200 μl of H2O, reamplified, and sequenced using the cycle sequencing protocol (Life Technologies, Inc.). The polymerase chain reaction products with confirmed nucleotide sequence were reamplified with the same reverse primer and a modified forward primer containing the T7 RNA polymerase promoter: 5′-TAATACGACTCACTATAGGGCTCGAAGGGTCCTTGTAGCCGGGA. All amplifications and reamplifications were done under the same conditions: total volume of 5 μl, 3 pmol of each primer, 200 μM each dNTP, and 0.125 units of AmpliTaq polymerase (Perkin-Elmer) in a buffer containing 10 mM Tris (pH 8.3), 50 mM KCl, and 1.5 mM MgCl2. Cycling conditions were 95°C for 3 min followed by 30 cycles of denaturation at 95°C for 20 s, annealing at 57°C for 20 s, and extension at 72°C for 45 s.
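As a quick consistency check on the primer design, the following minimal sketch derives the RNA that transcription from the modified forward primer would be expected to start with and compares it with the 5′ flanking sequence reported under Results. It assumes the standard T7 promoter, in which transcription begins at the G immediately after TATAGTGAGTCGTATTA's complement, i.e. after ...CACTATA on the primer strand; the function name `transcript_start` is hypothetical and the primer is read as one continuous sequence.

```python
# Minimal T7 promoter up to position -1 (assumption: standard T7 promoter;
# T7 RNA polymerase starts transcription at the G that follows this motif).
T7_PROMOTER = "TAATACGACTCACTATA"

# Forward primer as quoted in the text, read as one continuous sequence.
FORWARD_PRIMER = "TAATACGACTCACTATAGGGCTCGAAGGGTCCTTGTAGCCGGGA"

# 5' flanking sequence of the model transcripts as quoted under Results.
FLANK_5_RNA = "GGGCUCGAAGGGUCCUUGUAGCCGGGAAUG"


def transcript_start(primer: str, promoter: str = T7_PROMOTER) -> str:
    """Return the RNA encoded by the primer downstream of the promoter."""
    if not primer.startswith(promoter):
        raise ValueError("primer does not begin with the promoter sequence")
    # The primer is the sense strand, so the RNA is the same sequence with U for T.
    return primer[len(promoter):].replace("T", "U")


start = transcript_start(FORWARD_PRIMER)
print(start, len(start))
print("matches 5' flank:", FLANK_5_RNA.startswith(start))
```

The check only confirms that the primer-encoded start agrees with the stated flank; it says nothing about transcription efficiency or 5′-end heterogeneity.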
In Vitro Transcription and RNA Labeling-The transcription reaction was performed as described earlier (14) in the presence of 1000 units/ml RNasin (Promega). The reaction was stopped by a phenol/chloroform extraction followed by ethanol precipitation. Transcripts were then purified on a denaturing 10% polyacrylamide gel. The RNA bands corresponding to full-size transcripts were identified by staining with Stains-all (Serva), excised, eluted from the gel (with 0.3 M potassium acetate (pH 5.1), 1 mM EDTA, and 0.1% SDS), and precipitated with ethanol. They were stored at −80°C under ethanol before use. In vitro transcripts were phosphorylated with [γ-32P]ATP (5000 Ci/mmol; Amersham Corp.) and T4 polynucleotide kinase (New England Biolabs Inc.). The labeled RNAs were purified by electrophoresis on a denaturing 10% polyacrylamide gel, localized on the gel by autoradiography, and recovered as described above. As revealed by polyacrylamide gel electrophoresis under nondenaturing conditions, (CUG)5, (CUG)11, (CUG)21, and (CUG)49 migrate as monomeric species under the conditions used for structure probing experiments.
Nuclease Digestions and Lead Cleavages-Prior to structure probing reactions, the labeled transcripts were supplemented with an unlabeled RNA carrier to obtain a final RNA concentration of 8 μM. Limited digestions with lead ions and S1 and T1 nucleases (Pharmacia Biotech Inc.) were carried out in the buffer and under the conditions specified in the legend to Fig. 1. All reactions were stopped by adding 7 M urea and 20 mM EDTA.
Structure Melting Monitored by Lead Cleavage-Thermal melting of RNA hairpins monitored by the Pb(II)-induced cleavage was performed in the same buffer and salt conditions as described for structure probing experiments conducted at a temperature of 20°C. The concentration of lead ions was a constant 0.5 mM. The reaction temperature and incubation time are specified in the legend to Fig. 2. Two components of the reaction mixture (the RNA solution and the lead acetate solution) present in separate tubes were preheated for 1 min in the DNA Thermal Cycler model 480 (Perkin-Elmer) to reach the desired temperature. Then equal volumes (5 μl) of the RNA solution and the lead solution were mixed together and allowed to react for the specified period of time. All reactions were stopped as described above.
Analysis of Reaction Products-To assign the cleavage sites, the products of lead-induced hydrolysis and nuclease digestion were electrophoresed along with the products of alkaline hydrolysis and limited T1 nuclease digestion of the same RNA molecule. The alkaline hydrolysis ladder was generated by incubation of the labeled RNA in formamide containing 0.5 mM MgCl2 at 100°C for 15 min. Partial T1 nuclease digestion of RNAs was performed under semi-denaturing conditions (10 mM sodium citrate (pH 5.0) and 3.5 M urea) with 0.2 units of the enzyme. The reaction mixture was incubated at 55°C for 15 min. The products of lead cleavages and nuclease digestions were subjected to short, medium, and long electrophoretic runs on denaturing polyacrylamide gels (6, 10, 12, or 15%) containing 7 M urea to visualize different sections of the studied molecules at high resolution. Autoradiograms from the short runs only are shown in Figs. 1, 2, 4, and 5. As the relative cleavage intensities were analyzed, much attention was paid to the standardization of autoradiography and gel electrophoresis conditions. Wherever possible, the reactions were carried out in parallel, and their products were analyzed on the same gel. All experiments were repeated several times.
Secondary Structure Modeling and Free Energy Calculations-The secondary structure of the DMPK mRNA 3′-UTR was modeled using the MFold program (Wisconsin Package Version 9.0, Genetics Computer Group, Madison, WI). This program is designed to determine optimal and suboptimal secondary structures for RNA molecules and contains free energy contributions for various secondary structure motifs. The free energy calculations were done using Vienna RNA Package Version 1.1, which includes RNAfold and RNAheat used by the RNAdraw program for calculations.

RESULTS
RNA Models and Structure Probes-The number of CTG repeats in normal alleles of the DMPK gene varies between 5 and 37. In the Caucasian population, the great majority of chromosomes show three ranges of the repeat size (15). The most frequent allele contains five repeats; the next between 10 and 15; and the third from 20 to 37 repeats. The third group gives rise to the expanded pathogenic alleles (16). We have selected one representative of each group for detailed RNA structure analysis: (CUG)5, (CUG)11, (CUG)21, and the expanded allele containing 49 repeats. In all four model molecules, the same natural sequences flanking the repeat region are present: 30 nucleotides at the 5′-side, GGGCUCGAAGGGUCCUUGUAGCCGGGAAUG; and 35 nucleotides at the 3′-side of the CUG repeat, GGGGGAUCACAGACCAUUUCUUUCUUUCGGCCAGG.
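The composition of the four model molecules can be written out explicitly. The sketch below simply concatenates the two flanking sequences quoted above with the stated number of CUG repeats; the helper names `model_transcript` and `repeat_region` are hypothetical, and the coordinates are plain 1-based positions in the assembled sequence rather than the G-based numbering used in the figures.

```python
# Flanking sequences as quoted in the text (30 nt and 35 nt).
FLANK_5 = "GGGCUCGAAGGGUCCUUGUAGCCGGGAAUG"
FLANK_3 = "GGGGGAUCACAGACCAUUUCUUUCUUUCGGCCAGG"

# Repeat numbers of the four model molecules analyzed in this study.
REPEAT_COUNTS = (5, 11, 21, 49)


def model_transcript(n_repeats: int) -> str:
    """Assemble 5' flank + (CUG)n + 3' flank as one RNA sequence."""
    return FLANK_5 + "CUG" * n_repeats + FLANK_3


def repeat_region(n_repeats: int) -> tuple[int, int]:
    """Return 1-based inclusive coordinates of the CUG tract."""
    start = len(FLANK_5) + 1
    return start, start + 3 * n_repeats - 1


for n in REPEAT_COUNTS:
    rna = model_transcript(n)
    print(f"(CUG){n}: {len(rna)} nt, repeats at {repeat_region(n)}")
```

Under these assumptions each model molecule is 65 + 3n nucleotides long, so the repeat tract grows from a small fraction of the (CUG)5 molecule to most of the (CUG)49 molecule.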
All RNA structure probing reagents used in this study are well characterized. Lead ions, in the absence of strong binding sites from which highly specific and efficient cleavages of phosphodiester bonds may be performed (17-21), differentiate between rigid and flexible sugar phosphate backbones (22,23). Rigid double-stranded RNA structures and single-stranded regions involved in strong stacking or hydrogen bond interactions are not cleaved. Flexible single-stranded regions are susceptible to cleavage because the reaction mechanism, which begins with the activation of the ribose 2′-OH group by the lead ion hydrate followed by the attack of the 2′-O− nucleophile on the adjacent phosphate P atom, requires a conformational change of the sugar phosphate backbone (17,24). The structure and sequence specificity of S1 and T1 nucleases is well established (25). Lead ions and nucleases have been widely used in the structural analysis of a variety of natural RNA molecules and their variants obtained by transcription in vitro. Here they are used for the first time in the analysis of RNA structures formed by the same trinucleotide motif repeated many times. This gives a unique opportunity to assess the structure specificity of the probing reagents, which is not affected by the ambiguities caused by irregular variations of the nucleotide sequence. All RNA structure probing reactions described in this study were carried out under very similar solution conditions, and important conclusions were based on comparisons of the results obtained with different model substrates and structure probes.
FIG. 1. [...] correspond to 675, 1350, 2700, and 5400 units/ml S1 nuclease, respectively. Panel T1, lanes 1-3 correspond to 30, 60, and 120 units/ml T1 ribonuclease, respectively. All reactions were performed at 20°C for 20 min in a buffer composed of 10 mM Tris-HCl (pH 7.2), 40 mM NaCl, and 10 mM MgCl2 that contained also 1 mM ZnCl2 in the S1 nuclease reactions. Lanes C, incubation controls (without a probe); lanes L, formamide ladder; lanes T, limited T1 ribonuclease digest under semi-denaturing conditions. The positions of the G nucleotides from the central and terminal repeats are indicated. Note that 5′ to the first CUG repeat (see flanking sequence under "Results") is the G residue that in two out of the three equivalent frames ((UGC)n, (GCU)n, and G(CUG)n) belongs to another first repeat. Therefore, this G residue is assigned as G1 in this figure and in Figs. 2-5.
(CUG)49 Forms a Metastable Slippery Hairpin-All three probes give similar patterns of hydrolysis in the repeat region of the DMPK mRNA fragment containing 49 CUG repeats (Fig. 1). The strong cuts induced by the lead ions, S1 nuclease, and T1 ribonuclease are observed in the central part of the repeated sequence. Repeats at both the 5′- and 3′-sides of the highly reactive region are not digested by the nucleases and are very poorly cleaved by lead ions (Figs. 1 and 2a). Several 3′-terminal repeats and the first two repeats from the 5′-side are cleaved by lead ions more easily. This pattern of hydrolysis seems consistent with the presence of a hairpin structure that contains a large terminal loop spanning at least four repeats and, in agreement with the results of lead cleavages, a stem "breathing" at its base. This model, however, does not explain the observed asymmetry in the number of lead cleavages induced at both termini of the repeated sequence (Figs. 1 and 2a).
A more likely explanation of these results is the presence of several alternatively aligned variants of the (CUG)49 hairpin in which different combinations of the neighboring central repeats are involved in the formation of smaller loops, and different terminal repeats form protruding ends at the base of the stem. In favor of this alternatively aligned, slippery RNA model are the following facts. 1) Both the enzymatic and lead-induced cuts in the hairpin loop tend to expand more toward the 5′-end of the repeated sequence from the central, most reactive 25th repeat. 2) More repeats located at the 3′-end of the repeated sequence show enhanced reactivity with lead ions. 3) Computer modeling (26) of the (CUG)49 structure predicts the loops of four or seven nucleotides as the most energetically favorable. The fact that only the lead ions, but not the nucleases, cleave the protruding terminal repeats is not surprising. These single-stranded repeats, adjacent to the hairpin stem, are very likely inaccessible to the S1 and T1 nucleases. The proposed structures of the alternatively aligned hairpins formed by (CUG)49 are shown in Fig. 3.
The characteristic feature of the (CUG)49 hairpin stem is its metastability. Melting of this structure monitored by lead cleavages is revealed by the disappearance of the more intense loop cuts and by the increase of cleavage intensity in the stem (Fig. 2a). The loop cuts become weaker at 65°C and are only slightly stronger than the background cleavages at 75°C, at which the stem cuts also remain weak. On the other hand, a dramatic increase in the intensity of the stem cuts is observed at 85°C. These observations could be explained by the formation of a structure intermediate between the hairpin and the highly reactive single-stranded form, e.g. an intermolecular duplex formed by antiparallel single strands held together by interactions similar to those present in the hairpin stem.
At a low temperature, all weak cuts in the stem are of nearly equal intensity, with only a minor preference for cleavages at UpG (Fig. 2a). This preference tends to disappear at a temperature above 45°C. Therefore, this stem should not be considered as composed of units in which two base pairs, C-G and G-C, are followed by the internal loop formed by the U·U mispair. In light of our data, the stem structure seems to be better described as a well balanced system, a result of a compromise between the stabilizing effects of the G-C and C-G interactions and the destabilizing effect of the U·U interactions. For this reason, all base-base interactions present in the hairpin stem are shown by dots in Fig. 3 to indicate their specific nature, which results in the decreased stability of the stem structure.
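The register of the pairing in such a hairpin, and the effect of sliding the loop along the repeat tract, can be illustrated with a small counting sketch. This is not the computer modeling cited in the text (26) and uses no energy parameters; the function name `fold_hairpin` and its 0-based `loop_start` and `loop_len` arguments are hypothetical. It folds (CUG)n back on itself around a chosen loop, tallies C-G/G-C pairs and U·U oppositions in the stem, and reports how many nucleotides are left unpaired at each end.

```python
REPEAT = "CUG"
WATSON_CRICK = {("C", "G"), ("G", "C")}


def fold_hairpin(n_repeats: int, loop_start: int, loop_len: int) -> dict:
    """Fold (CUG)n into a single hairpin around a given loop.

    loop_start is the 0-based index of the first loop nucleotide and
    loop_len the number of loop nucleotides. Bases are opposed outward
    from the loop until one end of the repeat tract is reached.
    """
    seq = REPEAT * n_repeats
    i = loop_start - 1            # last nucleotide on the 5' side of the loop
    j = loop_start + loop_len     # first nucleotide on the 3' side of the loop
    counts = {"gc": 0, "uu": 0, "other": 0}
    while i >= 0 and j < len(seq):
        pair = (seq[i], seq[j])
        if pair in WATSON_CRICK:
            counts["gc"] += 1
        elif pair == ("U", "U"):
            counts["uu"] += 1
        else:
            # Anything else (including G-U) is lumped together here.
            counts["other"] += 1
        i -= 1
        j += 1
    counts["overhang_5"] = i + 1          # unpaired nucleotides at the 5' end
    counts["overhang_3"] = len(seq) - j   # unpaired nucleotides at the 3' end
    return counts


# Four-nucleotide loops closed by a C-G pair, shifted by one repeat each time.
for start in (70, 73, 76):
    print(start, fold_hairpin(49, start, 4))
```

In this toy model a four-nucleotide loop closed by C-G keeps the whole stem in the register of two G-C-type pairs followed by one U·U opposition, and moving the loop by one repeat transfers the unpaired overhang from one end of the tract to the other, which is the kind of alternative alignment the slippery hairpin model proposes.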
(CUG)21 and (CUG)11 Hairpins Are Less Stable-Slippery hairpins are also present in DMPK mRNA fragments containing 21 and 11 CUG repeats. Here again, several terminal repeats that form protruding ends at the base of the stem show enhanced reactivity. Highly reactive are the centrally located repeats that form a loop. The loops include repeats 9-12 in hairpin variants of (CUG)21 and repeats 4-6 in the (CUG)11 hairpins. A major difference among the (CUG)11, (CUG)21, and (CUG)49 hairpins is the rigidity of their stem structure. This is clearly seen as a difference in the intensity of lead-induced cleavages in their loops and stems (compare Figs. 1 and 4). The stem cuts in (CUG)11 are the strongest, whereas those in (CUG)21 are more efficient than those in (CUG)49. Probably both the extended stacking interactions and the increased number of hydrogen bond interactions contribute to the higher rigidity of the sugar phosphate backbone in longer hairpins. A difference is also apparent when the intensities of lead-induced cleavages in (CUG)21 are compared with those of S1 nuclease cuts in the same model molecule. The S1 nuclease is not capable of detecting the higher degree of stem structure relaxation in (CUG)21, and only very weak S1 nuclease cuts can be detected after extended autoradiography (data not shown). However, in (CUG)11, where the relaxation of the stem structure is still more profound and the lead cleavages become fairly strong, the S1 nuclease cleaves the stem poorly, with a preference for CpU > GpC. The same preference is observed among the strong S1 nuclease cuts within the hairpin loop. The preference of the lead cleavages generated in the stem is different. In (CUG)11, the order of reactivity is CpU > UpG > GpC. In the more stable stems of (CUG)21 and (CUG)49, this cleavage specificity

FIG. 2. Melting of the hairpin structures (CUG)49 (a) and (CUG)11 (b) analyzed by lead cleavages at different temperatures.
The incubation temperature is shown above each lane. The Pb(II) concentration was 0.5 mM in each reaction, and the incubation time was adjusted for each temperature to give the same ratio of the unreacted substrate and cleavage products: 20 min, 10 min, 5 min, 3 min, 90 s, 30 s, and 15 s, respectively, for reactions at 25-85°C. Lanes C, incubation controls (without a probe); lanes L, formamide ladder; lanes T, limited T1 ribonuclease digest under semi-denaturing conditions. In (CUG)11, only the G residues from the terminal repeats are indicated.
[Figure legend] The reaction conditions were the same as specified in the legend to Fig. 1. Lanes C, incubation controls (without a probe); lanes L, formamide ladder; lanes T, limited T1 ribonuclease digest under semi-denaturing conditions. Only the G nucleotides corresponding to terminal CUG repeats are indicated. Note that the end groups in the T1 nuclease and the lead cleavage products are 5′-OH and 2′,3′-cyclic phosphate. The same are the end groups in the formamide ladder. The S1 nuclease leaves the 5′-phosphate and 3′-OH at its cleavage site; therefore, the shorter products of S1 nuclease cleavages migrate more slowly on polyacrylamide gel.
(CUG)5 and Flanking Sequences Are Single-stranded-The repeat region of (CUG)5 is cleaved by lead ions with the order of reactivity CpU > UpG > GpC, which is characteristic for the single-stranded looped repeats in all hairpin-forming repeats analyzed here and for stem cuts in the least stable hairpin, (CUG)11. In sequences flanking the (CUG)5 repeat, all G nucleotides are good targets for T1 ribonuclease, and only a few sites are resistant to both S1 nuclease and lead ions (Fig. 5). This means that the flanking sequences are mostly if not entirely single-stranded. It should also be pointed out that the pattern of reactivity of both short sequences flanking the CUG repeats is practically identical in all model molecules analyzed in this study. This indicates that independent of the length of the repeated sequence (5, 11, 21, or 49 units), the 30 nucleotides at their 5′-end and the 35 nucleotides at their 3′-end all form the same structural environment for the repeats.

DISCUSSION
Features of New Hairpin Structure-The molecular mechanism of trinucleotide repeat genetic instability could be explained by the hairpins formed by DNA during replication (27). In several recent biophysical studies, the presence of hairpin structures was confirmed in (CTG)n oligomers (28-32), and the existence of similar RNA structures was postulated (30,31). In this paper, we show that the (CUG)n hairpins indeed exist in the DMPK RNA fragments and that the stability of these structures increases with the repeat length. We conclude that the entire hairpin stem forms a novel relaxed type of duplex structure, rather than a mosaic of the C-G and G-C base pairs interrupted by the U·U mispairs. The results of the lead cleavage experiments show that all internucleotide bonds in the stems of (CUG)49 and (CUG)21 undergo mild cleavages, which are much weaker than those that occur in the terminal loops and in the single-stranded CUG repeats. Also, the order of reactivity of internucleotide bonds within the repeated sequence is different in different model molecules and in their distinct structural motifs (Fig. 6). The cleavages at CpU predominate in the single-stranded regions, whereas the cuts at UpG are the strongest among the weak cleavages in the stem. The stabilization of stem structure with the increased repeat length in (CUG)11, (CUG)21, and (CUG)49 is accompanied not only by the weakening of all cleavages within the stem, but also by the reversal in the order of internucleotide bond reactivity (from CpU > UpG to UpG > CpU). This implies that the rigidity of the phosphate residues present in CpU increases most dramatically in the stems of (CUG)49 and (CUG)21 as compared with that in the single-stranded structure. Also, the phosphate residues present at the 3′-side of each U nucleotide remain the most flexible in the metastable stem.
At an elevated temperature (above 45°C), the stem structure seems to convert to another form, in which the intensity of the UpG cleavages becomes similar to that of the CpU and GpC cleavages (Fig. 6, structure 3). Thus, at higher temperature, the sugar phosphate backbone of the interacting U·U, C-G, and G-C residues shows approximately the same degree of flexibility.
Our results do not reveal the nature of the U·U interactions, which make the stem structure metastable. Earlier biophysical studies showed that tandem U·U mismatches in short RNA duplexes are unexpectedly stable (33). Recent crystallographic studies of two different model oligoribonucleotides (34,35) and NMR studies of a conserved ribosomal RNA hairpin (36) give some details about possible U·U pairing schemes. Of the crystal structures, that of the r(UUCGCG) oligomer (34) seems to be more relevant to the (CUG)n hairpin. This hexamer forms a duplex in which two U·U pairs are formed adjacent to the base-paired CGCG system. The latter could be considered a model for the (CUG)n hairpin stem deprived of the intervening U·U pair. Very interestingly, the two U·U pairs adopt an unusual trans-arrangement about their glycosidic bonds. This U·U pairing occurs through the Watson-Crick face of one U nucleotide and the Hoogsteen face of the other. On the other hand, in the crystal structure of the dodecamer r(GGACUUUGGUCC), which also forms a duplex structure, the U·U pairs are in the standard cis-arrangement and form wobble U·U pairing (35). The NMR study of a single U·U mismatch within an asymmetrical, biologically relevant context also showed that the mismatch is base-paired (36). It forms the NH-3-O-4 hydrogen bond, which forces one of the two U nucleotides to adopt an extended trans-conformation of the backbone to accommodate the mismatch.
It has been postulated that the U·U pairs, which are exceptionally flexible, can interconvert between different pairing modes. They could also be the loci for the fraying of the double helix (34). However, in the model hexamer, the CGCG blocks appear largely unperturbed and rigid within the A form of RNA structure (34). We show here that the presence of the U·U pair after every two pairs of C-G and G-C in the (CUG)n hairpins perturbs their stem structure. According to our data, the entire (CUG)n hairpin stem is a relaxed metastable structure, and the postulated interconversion between different U·U pairing modes could be reflected by the distinct lead cleavage patterns (2, 3) observed in the stem (Fig. 6).
From the perspective of RNA structure probing methods, our results show a high performance of lead-induced cleavages in determining the morphology of a previously unknown RNA structure. The lead cleavage reaction, as easy to handle as nuclease digestion, has an advantage over the enzymes: its active species (the deprotonated lead ion hydrates (18)) are much smaller and can penetrate folded RNA structures more easily. The hairpin formed by the repeated sequence turned out to be an excellent model to show a clear picture of the structure specificity of the probing reagents. Repeats involved in structures of different types or of different strengths give distinct lead cleavage patterns and/or cleavage intensity.
Relevance to DM Pathogenesis-If we assume that DM pathogenesis can be explained by post-transcriptional RNA level effects and that the hairpins described here also exist in vivo, we can ask the following question. What is the special property of the transcript with 50 or more CUG repeats that could not be shown by transcripts containing 37 repeats or less? Two differences in the structure of the normal and expanded transcripts are obvious; one is the CUG hairpin length and the other is its stability. How could these factors possibly affect the post-transcriptional activities?
The CUG repeat is located in the 3′-UTR of DMPK mRNA, which is coded by the last exon (exon 15) of the gene (37). The repeat is 223 nucleotides from the translation stop codon and 485 nucleotides from the polyadenylation site (6). A secondary structure model of this portion of the DMPK RNA containing 120 CUG repeats is shown in Fig. 7. In the GC-rich (64%) 3′-UTR, the CUG hairpin can impose structural perturbations over a long distance. This could include the polyadenylation site, the splice sites, and the coding sequence. It is also worth noting that in the case of long repeats, the computer program (26) draws either a single long hairpin or two or more shorter hairpin structures with practically unchanged calculated free energy. It remains to be determined experimentally whether or not such multiple hairpins in fact exist in transcripts from the highly expanded alleles.
As far as the specific proteins that bind to the CUG repeats are concerned, it remains to be seen whether these proteins bind preferentially to the single-stranded or double-stranded repeated motifs. Usually, the loop sequences in RNA hairpins define the binding site for RNA-binding proteins, whereas the stem may help to correctly position the protein by sequence-independent contacts with the backbone (38). In (CUG)n hairpins, the problem is less clear, as both the loop and the stem contain the same repeated sequence motif. The specific CUG-binding proteins have been isolated based on their affinity for (CUG)8 (11,12) and (CUG)10 (13), which, in light of our results, do not form stable hairpins. If the proteins bind to the single-stranded regions, the long hairpins would not titrate them more effectively than the short hairpins unless there are multiple hairpins or the stability of the stem structure makes a significant contribution to the protein binding efficiency. If the proteins bind to the double-stranded repeats, their titration by the long CUG hairpins could be more easily explained. With the CUG hairpin structure determined in vitro, experiments can now be designed to demonstrate that the hairpins exist in vivo, to establish the way they bind the proteins, and to show how the protein binding changes with the repeat expansions.
FIG. 7. Computer-predicted secondary structure of the DMPK mRNA 3′-UTR. Shown is the secondary structure of the entire 3′-UTR sequence of DMPK mRNA containing 120 CUG repeats as modeled by the MFold program. One of several nearly isoenergetic structures is shown (energy, −303.1 kcal/mol), in which the repeat section is split into two CUG hairpins marked by shaded bars. Indicated are the coding sequence border (5′) and the poly(A) attachment site (3′).