Role of Mismatch Repair Enzymes in GAA·TTC Triplet-repeat Expansion in Friedreich Ataxia Induced Pluripotent Stem Cells*

Background: Friedreich ataxia is caused by a GAA·TTC triplet-repeat expansion in the first intron of the FXN gene.
Results: Expansion of the repeats is observed in induced pluripotent stem cells (iPSCs) and can be blocked with either shRNAs to mismatch repair enzymes or small molecules targeting the repeats.
Conclusion: MutSα and MutSβ are involved in repeat expansion.
Significance: iPSCs provide a model system for studying triplet-repeat expansion.

The genetic mutation in Friedreich ataxia (FRDA) is a hyperexpansion of the triplet-repeat sequence GAA·TTC within the first intron of the FXN gene. Although yeast and reporter construct models for GAA·TTC triplet-repeat expansion have been reported, studies on FRDA pathogenesis and therapeutic development are limited by the availability of an appropriate cell model in which to study the mechanism of instability of the GAA·TTC triplet repeats in the human genome. Herein, induced pluripotent stem cells (iPSCs) were generated from FRDA patient fibroblasts after transduction with the four transcription factors Oct4, Sox2, Klf4, and c-Myc. These cells were differentiated into neurospheres and neuronal precursors in vitro, providing a valuable cell model for FRDA. During propagation of the iPSCs, GAA·TTC triplet repeats expanded at a rate of about two GAA·TTC triplet repeats/replication. However, GAA·TTC triplet repeats were stable in FRDA fibroblasts and neuronal stem cells. The mismatch repair enzymes MSH2, MSH3, and MSH6, implicated in repeat instability in other triplet-repeat diseases, were highly expressed in pluripotent stem cells compared with fibroblasts and neuronal stem cells and occupied FXN intron 1. In addition, shRNA silencing of MSH2 and MSH6 impeded GAA·TTC triplet-repeat expansion. A specific pyrrole-imidazole polyamide targeting GAA·TTC triplet-repeat DNA partially blocked repeat expansion by displacing MSH2 from FXN intron 1 in FRDA iPSCs. These studies suggest that in FRDA, GAA·TTC triplet-repeat instability occurs in embryonic cells and involves the highly active mismatch repair system.

Nearly 30 hereditary diseases in humans are caused by expansion of a triplet-repeat sequence in genomic DNA (1,2). These expanded sequences are unstable and frequently change in length during intergenerational transmission and within somatic cells. However, the mechanisms of expanded triplet-repeat sequence generation remain enigmatic (3).
Friedreich ataxia (FRDA) is caused by heterochromatin-mediated silencing of the FXN gene, encoding the essential mitochondrial protein frataxin (4). The genetic mutation in FRDA is a GAA·TTC triplet-repeat expansion in the first intron of FXN, with unaffected alleles having 6-34 repeats in contrast to 66-1700 repeats in patient alleles. Longer repeats are associated with more severe gene repression, lower frataxin protein levels, earlier onset, and increased disease severity (5,6). Since the GAA·TTC triplet-repeat expansion mutation was identified (5), several models have been established to explain the mechanism of repeat expansion. Using Escherichia coli (7) and yeast (8,9) models, GAA·TTC triplet-repeat expansion has been observed during cell culture. Mammalian cell models using a plasmid construct with GAA·TTC triplet repeats have shown that the repeat expansions are transcription-, replication-, and position-dependent (10-12). In addition, mouse models with expanded GAA·TTC triplet repeats show somatic instability in different tissues (13-15). These systems differ substantially from humans and from each other in terms of DNA replication rate, the cell type in which GAA·TTC triplet-repeat expansion occurs, and chromatin structure. Therefore, studies on FRDA pathogenesis and therapeutic development are still limited by the availability of an appropriate cell model in which to study the mechanism of GAA·TTC triplet-repeat generation in the human genome.
In our previous work (16), we showed that GAA·TTC triplet repeats are highly unstable in FRDA induced pluripotent stem cells (iPSCs). Liu et al. (17) also reported that GAA·TTC triplet repeats are unstable during the generation of iPSCs from patient fibroblasts. Importantly, asymptomatic heterozygous carriers show GAA·TTC triplet-repeat expansion on the pathogenic allele, but not the normal allele. The mismatch repair (MMR) enzyme MSH2, implicated in repeat instability in other triplet-repeat diseases, is highly expressed in pluripotent stem cells and occupies FXN intron 1. In addition, shRNA silencing of MSH2 impedes GAA·TTC triplet-repeat expansion (16). Our findings provide the first human cell model to study the mechanisms of GAA·TTC triplet-repeat expansion in the context of the endogenous cellular FXN gene. Herein, we found that GAA·TTC triplet-repeat expansion occurred only in the iPSCs, but not in FRDA fibroblasts or differentiated neurospheres. Repeat expansion involved the MMR complexes MutSα and MutSβ. These studies suggest that the generation and expansion of GAA·TTC triplet repeats in FRDA most likely occur in embryonic cells and involve the highly active MMR system.

EXPERIMENTAL PROCEDURES
Derivation of iPSCs-Unaffected and FRDA iPSC derivation followed previous methods with minor deviations (18). These normal and FRDA iPSCs have been characterized by standard methods (16).
For immunocytochemistry, neuronal cells were fixed in 4% paraformaldehyde for 10 min and permeabilized with 0.5% Triton X-100 in PBS for 10 min at room temperature. Anti-MAP2 (AB5622, Millipore) and anti-Tuj1 primary antibodies in 10% goat serum in 0.5% Triton X-100 in PBS were incubated overnight at 4°C, and Alexa Fluor 488 goat anti-rabbit IgG (H + L), Alexa Fluor 488 goat anti-mouse IgG (H + L), and Alexa Fluor 594 goat anti-rabbit IgG (H + L; A11012, Invitrogen) secondary antibodies were incubated at 1:1000 dilution for 1 h at room temperature, followed by nuclear staining with DAPI.
Nucleic Acid Purification-Total RNA was purified with an RNeasy Plus minikit (Qiagen) according to the manufacturer's instructions. Genomic DNA was purified by isopropyl alcohol precipitation of cell lysate prepared in total cell lysis buffer (100 mM Tris-HCl (pH 8.5), 5 mM EDTA, 0.2% SDS, 0.2 M NaCl, and 200 μg/ml proteinase K (Roche Applied Science)) (23). Generally, cells were collected and washed once with PBS. Cell pellets were incubated in 0.2 ml of lysis buffer overnight at room temperature. To the lysate was added 1.0 μl of 20 mg/ml glycogen (Roche Applied Science) and 0.2 ml of isopropyl alcohol. After incubation for 2 h, the precipitated genomic DNA was collected by centrifugation and washed twice with 0.5 ml of 70% isopropyl alcohol in TE buffer (10 mM Tris-HCl (pH 7.5) and 1 mM EDTA). The genomic DNA pellet was heated at 55°C for 10 min and then resuspended in 40 μl of TE buffer overnight at 55°C.
Conventional PCR, Small-pool PCR (SP-PCR), and Quantitative RT-PCR (qRT-PCR)-For GAA·TTC triplet-repeat length conventional PCRs, Phusion polymerase (New England Biolabs, Ipswich, MA) was used according to the manufacturer's instructions. 20 ng of genomic DNA (~6600 genomic equivalents) and 0.1 μM primers GAA-104F and GAA-629R were used (12) in 20-μl reactions cycled through the following conditions: denaturation at 98°C for 5 s, annealing at 70°C for 15 s, and extension at 72°C for 90 s for 40 cycles with a 5-min initial denaturation and a 5-min final extension. Quantitation of PCR band size was performed using an inverse power function directly correlating gel migration of a molecular weight ladder to its known sizes (ImageJ software) (24). PCR products from the FXN locus contain 499 bp of non-repeat sequences, so GAA·TTC triplet-repeat number estimations were adjusted accordingly.
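The band-size calibration and repeat-number adjustment described above can be sketched as follows. This is only an illustration under stated assumptions, not the ImageJ workflow used in the study: the ladder migration distances are hypothetical, the calibration is assumed to take the form size = a · distance^(−b) (one reading of "inverse power function"), and the repeat number is taken as (product size − 499 bp) / 3.

```python
import math

FLANK_BP = 499  # non-repeat sequence in the FXN PCR product (from the text)


def fit_power_law(distances, sizes_bp):
    """Least-squares fit of size = a * distance**(-b), done in log-log space."""
    xs = [math.log(d) for d in distances]
    ys = [math.log(s) for s in sizes_bp]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope  # (a, b)


def band_size_bp(distance, a, b):
    """Convert a migration distance to an estimated product size in bp."""
    return a * distance ** (-b)


def repeat_number(product_bp):
    """Estimated GAA·TTC triplets after subtracting the 499-bp flank."""
    return (product_bp - FLANK_BP) / 3


# Hypothetical ladder: known sizes (bp) and measured migration distances (mm).
ladder_sizes = [3000, 2000, 1500, 1000, 500]
ladder_distances = [12.0, 16.5, 20.5, 27.5, 44.0]

a, b = fit_power_law(ladder_distances, ladder_sizes)
sample_size = band_size_bp(15.0, a, b)  # hypothetical sample band at 15 mm
print(f"estimated product: {sample_size:.0f} bp, "
      f"repeats: {repeat_number(sample_size):.0f}")
```

Fitting in log-log space keeps the calibration a simple linear regression. A real gel may need a different functional form outside the ladder's range.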
SP-PCR was performed using a modification of a previously published protocol (25). Serial dilutions of genomic DNA ranging from 75 to 750 pg (25-250 genomic equivalents) were prepared in microcentrifuge tubes with 2 μM primers GAA-104F and GAA-629R. Other PCR conditions were as for conventional PCR, but with 35 cycles. PCR products were resolved by electrophoresis on 0.8% agarose gels. The products were analyzed by Southern blotting using an end-labeled (TTC)9 oligonucleotide probe that specifically hybridizes to the GAA triplet-repeat sequence.
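The conversion between DNA mass and genomic equivalents used for these dilutions can be sketched as below. The factor of about 3 pg per genomic equivalent is an assumption inferred from the figures quoted in the text (75 pg for 25 equivalents, 750 pg for 250); it is not stated explicitly in the source.

```python
# Assumption inferred from the text: 75 pg of genomic DNA ~ 25 genomic equivalents.
PG_PER_GENOMIC_EQUIVALENT = 3.0


def genomic_equivalents(mass_pg, pg_per_equivalent=PG_PER_GENOMIC_EQUIVALENT):
    """Approximate number of genomic equivalents in a given DNA mass (pg)."""
    return mass_pg / pg_per_equivalent


# SP-PCR dilution range and the conventional PCR input (20 ng = 20,000 pg).
for mass_pg in (75, 750, 20_000):
    print(f"{mass_pg} pg -> about {genomic_equivalents(mass_pg):.0f} genomic equivalents")
```

With this factor, the 20-ng conventional PCR input comes out close to the ~6600 genomic equivalents quoted for that reaction.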
qRT-PCR analysis was done with a qScript One-Step SYBR Green qRT-PCR kit (Bio-Rad) according to the manufacturer's instructions. Primers for pluripotent markers were as described (18). MSH2, MSH3, and GAPDH primers were as described (16). Other primers used for qRT-PCR were as follows: , and nestin_R2 (5′-AGCAAAGATCCAAGACGCC-3′). Analysis of relative qRT-PCR data was performed via the ΔΔCt method (26). All qRT-PCR signals were normalized to GAPDH. Reverse transcription was performed at 50°C for 20 min, followed by PCR cycling at 95°C for 30 s, 55°C for 30 s, and 72°C for 30 s for 40 cycles.
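For readers unfamiliar with the ΔΔCt calculation, a minimal sketch follows. It assumes roughly 100% amplification efficiency (a doubling per cycle) and uses hypothetical Ct values; GAPDH is the reference gene, as stated above.

```python
def fold_change_ddct(ct_target_sample, ct_ref_sample,
                     ct_target_control, ct_ref_control):
    """Relative expression by the ΔΔCt method, assuming doubling per cycle."""
    delta_sample = ct_target_sample - ct_ref_sample      # ΔCt in the test sample
    delta_control = ct_target_control - ct_ref_control   # ΔCt in the calibrator
    delta_delta = delta_sample - delta_control           # ΔΔCt
    return 2.0 ** (-delta_delta)


# Hypothetical Ct values: a target gene in iPSCs versus fibroblasts,
# each normalized to GAPDH measured in the same sample.
print(fold_change_ddct(ct_target_sample=22.0, ct_ref_sample=18.0,
                       ct_target_control=25.0, ct_ref_control=18.0))
```

A lower ΔCt in the sample than in the calibrator gives a fold change above 1, that is, higher relative expression.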
Lentiviral shRNA Transduction-Lentiviral shRNA for targeting MSH2 has been described (16). Lentiviral shRNA constructs for targeting MSH6 were generated by cloning an MSH6-targeted oligonucleotide (5′-CCGGGCCAGAAGAATACGAGTTGAACTCGAGTTCAACTCGTATTCTTCTGGCTTTTTG-3′) into the vector pLKO.1-puro (Sigma-Aldrich) at AgeI and EcoRI sites downstream of the U6 promoter. A vector containing a scrambled shRNA (plasmid 1864) was obtained from Addgene (Cambridge, MA). Lentiviral particles were then packaged by cotransfecting human HEK293T cells using FuGENE 6 reagent (Roche Applied Science) with shRNA constructs along with psPAX2 (plasmid 12260) and pMD2.G (plasmid 12259) helper plasmids (Addgene) in HEK293T medium. Transfected cells were grown at 37°C for a total of 12 h, and the cell medium was changed to human ESC medium. Lentivirus-containing supernatants were collected 24 and 48 h later. Lentivirus-containing supernatant was filtered through a 0.45-μm cellulose acetate mesh to remove any contaminating cells or cell debris. FRDA iPSCs were subjected to two lentiviral transductions with lentivirus for 12 h at 37°C with 5 μg/ml Polybrene. Cells were then expanded and subjected to 6 days of puromycin selection (0.4 μg/ml) on drug-resistant DR4 mouse embryonic fibroblasts (GlobalStem).
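The structure of the MSH6-targeted oligonucleotide can be checked with a short script. The layout assumed here (CCGG, a 21-nt sense stem, a CTCGAG loop, a 21-nt antisense stem, then TTTTTG) is the common pLKO.1 hairpin design; the source gives only the sequence, so this layout is an assumption. The function name `parse_plko_oligo` is hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def parse_plko_oligo(oligo, stem_len=21, loop="CTCGAG"):
    """Split an oligo assumed to be CCGG + sense + loop + antisense + TTTTTG."""
    prefix, suffix = "CCGG", "TTTTTG"
    if not (oligo.startswith(prefix) and oligo.endswith(suffix)):
        raise ValueError("oligo lacks the expected CCGG/TTTTTG flanks")
    core = oligo[len(prefix):-len(suffix)]
    sense = core[:stem_len]
    found_loop = core[stem_len:stem_len + len(loop)]
    antisense = core[stem_len + len(loop):]
    if found_loop != loop:
        raise ValueError("loop sequence not found at the expected position")
    return sense, antisense


# Sequence from the text, with the line-break hyphens removed.
MSH6_OLIGO = "CCGGGCCAGAAGAATACGAGTTGAACTCGAGTTCAACTCGTATTCTTCTGGCTTTTTG"

sense, antisense = parse_plko_oligo(MSH6_OLIGO)
print("sense stem:    ", sense)
print("antisense stem:", antisense)
print("stems pair as a hairpin:", antisense == reverse_complement(sense))
```

If the two stems are reverse complements, the transcribed RNA can fold into the intended hairpin; a mismatch would point to a transcription error in the sequence.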
Pyrrole-Imidazole Polyamide Treatment-FRDA iPSCs were incubated with polyamide FA1 (5 μM) (27) or HIV-1 (5 μM) (28,29) or in the absence of polyamide for 6 weeks, and genomic DNA was purified after each passage. For chromatin immunoprecipitation, FRDA iPSCs were incubated with polyamide FA1 (5 μM) or in the absence of polyamide for 72 h before iPSCs were collected. Polyamides were synthesized as described previously (27,28) and were verified for purity and identity by analytical HPLC and MALDI-TOF-MS. Characterization of the DNA-binding properties of the molecules by quantitative DNase I footprinting has been described (27,28). A BODIPY FL-polyamide conjugate was also used for nuclear uptake studies (27).
Chromatin Immunoprecipitation-Cells were cross-linked first with 1.5 mM dithiobis(succinimidyl propionate), followed by 1% formaldehyde. Subsequent ChIP procedures were as described (4,16) with anti-MSH2 antibody (sc-494, Santa Cruz Biotechnology, Inc.). Analysis by qPCR with primers for the FXN promoter, the region upstream of the GAA·TTC triplet repeats, and the region downstream of the repeats was as described (4,16).

RESULTS

Generation of Expanded GAA·TTC Triplet Repeats in FRDA iPSCs-It is known that expanded GAA·TTC triplet repeats in FRDA-associated alleles are genetically highly unstable and undergo changes in repeat length in both the germ line and soma (30). Our previous work has shown that GAA·TTC triplet repeats in FXN in iPSCs exhibit repeat instability similar to that in patient families, where the repeats expand with discrete changes in length between generations (16). Liu et al. (17) also reported that reprogramming of FRDA fibroblasts to iPSCs results in both expansions and contractions of the GAA·TTC triplet repeats within the FXN locus. In these studies, conventional PCR (typically using >20 ng of genomic DNA, which represents 6600 genomic equivalents) was used to assess GAA·TTC triplet-repeat length variability in the FRDA fibroblasts and iPSCs. This method cannot detect rarer molecules comprising <10% of the total population.
It has been reported that SP-PCR can detect these rarer mutant molecules. This method constitutes PCR amplification of the GAA·TTC repeat sequence in multiple small pools of input genomic DNA, containing on the order of 0.5-200 genomic equivalents (31). To differentiate whether the GAA·TTC triplet-repeat instability observed in FRDA iPSCs is similar to somatic or intergenerational instability, SP-PCR was used to detect rarer molecules with discrete lengths of GAA·TTC triplet repeats from FRDA fibroblasts (line GM03816 from the NIGMS Coriell Depository, homozygous for expanded GAA·TTC triplet alleles) and iPSC line 23-3 (passage 22, derived from GM03816 cells) (16). Conventional PCR (20 ng, ~6600 genomic equivalent molecules/reaction) showed that fibroblasts and iPSCs carried a distinct spectrum of expanded alleles (Fig. 1A). The expanded GAA·TTC triplet alleles, seen as double bands by conventional PCR, were thus resolved into similar bands from the respective constitutional GAA·TTC allele sizes (Fig. 1B) in SP-PCR with 75 genomic equivalent molecules/reaction. Lower DNA template concentrations (25 genomic equivalent molecules/reaction) frequently showed single-allele discrimination, in which only one of two homologous expanded alleles was amplified (Fig. 1C). We believe that these bands reflect the allele length in individual FRDA genes (haploid genomes). The repeat instability we observed in FRDA iPSCs resembles the intergenerational repeat expansions seen in patient families rather than the repeat instability seen in the somatic cells that are affected in the disease (6). In this latter case, heterogeneous expansions and contractions of the repeats are observed, rather than the discrete changes in repeat length observed on maternal transmission of pathogenic alleles. Our findings are therefore more similar to this intergenerational repeat instability; however, this conclusion must be tempered by the fact that derivation of iPSCs from fibroblasts does not involve meiosis, as does passage through the maternal germ line.
GAA·TTC Triplet-repeat Expansion Rate in FRDA iPSCs-Upon propagation of the FRDA iPSCs, we found that the GAA·TTC triplet repeats continued to expand (Fig. 2A). This makes FRDA iPSCs an ideal system in which to study the accumulation rate of the GAA·TTC triplet repeats. We found that both alleles expanded on passage in culture (from passages 6 to 19). PCR band sizes were quantified by an inverse power function directly correlating gel migration of a molecular weight ladder to its component lengths. PCR products from the FXN locus contain 499 bp of non-repeat sequences, so GAA·TTC triplet-repeat numbers were estimated accordingly (Fig. 2B). Because FRDA is a recessive disease, both alleles are affected, and both alleles expand during iPSC propagation. The rate of GAA·TTC triplet-repeat expansion in the longer expanded allele was greater than in the shorter expanded allele (Fig. 2C). At each iPSC passage, the longer expanded allele accumulated 13.3 GAA·TTC triplet repeats, whereas the shorter expanded allele accumulated 9.6 GAA·TTC triplet repeats.
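One way to obtain a per-passage expansion rate from repeat numbers measured at several passages is the slope of a straight-line fit, sketched below. The source does not state how the rates were derived, so the linear fit is an assumption, and the passage numbers and repeat counts shown are hypothetical.

```python
def expansion_rate_per_passage(passages, repeat_counts):
    """Ordinary least-squares slope of repeat number against passage number."""
    n = len(passages)
    mean_p = sum(passages) / n
    mean_r = sum(repeat_counts) / n
    sxx = sum((p - mean_p) ** 2 for p in passages)
    sxy = sum((p - mean_p) * (r - mean_r)
              for p, r in zip(passages, repeat_counts))
    return sxy / sxx


# Hypothetical repeat numbers for one expanded allele at five passages.
passages = [6, 9, 12, 15, 19]
repeat_counts = [545, 583, 626, 662, 718]

rate = expansion_rate_per_passage(passages, repeat_counts)
print(f"about {rate:.1f} repeats gained per passage")
```

A slope summarizes the whole time course and is less sensitive to measurement error in any single band than a difference between two passages.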
Potential links between GAA·TTC triplet-repeat expansion and DNA replication have been studied in a mammalian cell model using a plasmid with GAA·TTC triplet repeats. In this plasmid model, GAA·TTC triplet-repeat expansions have been found to be replication-dependent (10). In the iPSCs, the accumulation of repeats in each replication could be estimated. Generally, iPSCs were passaged about every 7 days. It has been reported that iPSC doubling time is ~29 h (32). Therefore, it can be calculated that GAA·TTC triplet repeats accumulated at a rate of ~2.3 (longer expanded allele) and ~1.7 (shorter expanded allele) repeats/replication cycle.
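The arithmetic behind these per-replication figures can be written out as a short sketch. It assumes one round of replication per population doubling and uses the values given in the text (7 days per passage, a doubling time of about 29 h, and 13.3 or 9.6 repeats gained per passage).

```python
def repeats_per_replication(repeats_per_passage,
                            passage_days=7.0, doubling_time_h=29.0):
    """Repeats gained per replication, assuming one replication per doubling."""
    doublings_per_passage = passage_days * 24.0 / doubling_time_h
    return repeats_per_passage / doublings_per_passage


for allele, per_passage in (("longer", 13.3), ("shorter", 9.6)):
    rate = repeats_per_replication(per_passage)
    print(f"{allele} expanded allele: about {rate:.1f} repeats/replication")
```

Seven days at a 29-h doubling time is a little under six doublings per passage, which is why the per-replication rates are roughly one-sixth of the per-passage rates.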
GAA·TTC Triplet-repeat Expansion Occurs in FRDA iPSCs, but Not in Differentiated Cells-iPSCs are a type of pluripotent stem cell artificially derived from a non-pluripotent patient fibroblast cell by inducing the "forced" expression of Oct4, Sox2, Klf4, and c-Myc. The generated iPSCs are remarkably similar to naturally isolated pluripotent stem cells (16). We were curious to determine whether GAA·TTC triplet-repeat expansion also occurs in FRDA fibroblasts and neurospheres (neural stem cells), which can be differentiated from FRDA iPSCs.
In vitro differentiation of FRDA iPSCs to neurospheres was performed as described previously (19-22). Neurospheres showed higher expression of the genes MAP2 and nestin compared with the original fibroblasts and iPSCs (Fig. 3A). Flow cytometry showed that neurosphere populations were >90% positive for the marker protein nestin (supplemental Fig. S1). In addition, these neurospheres could be further differentiated to neuronal precursor cells (17). The FRDA neuronal precursor cells expressed neural marker proteins Tuj1 and MAP2, with a cell population of 93% (Tuj1) and 84% (MAP2), respectively (supplemental Fig. S1, C-E, and Fig. S2).
FRDA fibroblasts (GM03816 cells) were cultured and passaged every week and collected every 3 weeks for genomic DNA (passages 11-23). GAA·TTC triplet repeats were stable in FRDA fibroblasts during passage for 12 weeks (Fig. 3B). GAA·TTC triplet repeats expanded in FRDA iPSCs (Fig. 3C) and stopped expanding once the cells were differentiated into neurospheres, showing stability at weeks 0, 2, and 4 post-neurosphere formation (Fig. 3C). Interestingly, GAA·TTC triplet repeats continued expanding during the 2 weeks of noggin induction (passages 9-11*). Moreover, we also observed GAA·TTC triplet-repeat expansion between passage 11* and week 0 prior to complete neurosphere formation (2 weeks). One possibility is that intermediate cells may also express some of the pluripotency genes, which maintain GAA·TTC triplet-repeat expansion. GAA·TTC triplet repeats were also stable in neurospheres during a 10-week observation period (Fig. 3D).
Recently, several studies have reported triplet-repeat expansion and contraction during iPSC derivation (17,33). To confirm whether GAA·TTC triplet-repeat expansion occurred during the reprogramming, FRDA iPSCs were collected after reprogramming (first collection from the reprogramming dishes, passage 0). Two independent clones both showed a clear expansion at passage 0 and at later passages (Fig. 3E). One explanation for contractions observed in other studies (17) is that these iPSC clones came from different fibroblasts, each with different GAA·TTC triplet-repeat lengths. Another explanation is that expansion was not due to the cell reprogramming but was a consequence of iPSC formation after reprogramming.
GAA·TTC triplet repeats accumulate during propagation of the FRDA iPSCs in a manner analogous to the expansion observed in FRDA patient families (intergenerational repeat instability). Importantly, asymptomatic heterozygous carriers show GAA·TTC triplet-repeat expansion on the pathogenic allele, but not the normal allele (16). These observations open a new avenue for the investigation of the molecular mechanisms that underlie GAA·TTC triplet-repeat expansion.

MMR Proteins Are Highly Expressed in iPSCs-It has been reported that the MMR system is involved in triplet-repeat instability. In DM1 (myotonic dystrophy 1) transgenic mice, MSH3 is a limiting factor in the formation of intergenerational CTG·CAG triplet-repeat expansions, whereas MSH6 is not (34). Our previous work also showed that the MMR enzyme MSH2 is highly expressed in pluripotent cells and occupies FXN intron 1, and shRNA silencing of MSH2 impedes GAA·TTC triplet-repeat expansion in the FRDA iPSCs (16).
Because the protein levels of MSH2, MSH3, and MSH6 do not always reflect the level of their respective mRNA transcripts but are dependent upon MutSα and MutSβ complex stability (38), we monitored the levels of MSH2, MSH3, and MSH6 by Western blotting using extracts from the FRDA fibroblasts, iPSCs, and neurospheres. Indeed, expression of MSH2, MSH3, and MSH6 proteins was up-regulated in FRDA iPSCs compared with the parent fibroblasts (Fig. 4, A and B). These increased protein levels are not related to a FRDA phenotype, as unaffected GM08333 iPSCs showed similar increases in MSH2, MSH3, and MSH6 protein levels compared with fibroblasts and similar to the expression in H1 ESCs (Fig. 4, C and D). Once iPSCs were differentiated into neurospheres, MSH2, MSH3, and MSH6 protein levels were reduced (Fig. 4, A and B).
MMR Enzymes MSH2 and MSH6 Are Involved in GAA·TTC Triplet-repeat Expansion-It has been reported that MutSα (MSH2/MSH6) preferentially recognizes base-base mismatches and small insertion/deletion mispaired nucleotides, whereas MutSβ (MSH2/MSH3) preferentially recognizes larger insertion/deletion mispairs (35). Several studies have pointed to the involvement of MutSβ in triplet-repeat instability (8,39), and MutSα (MSH2/MSH6) may also be involved in the formation of CTG·CAG triplet expansions during maternal transmission (34). Here, we investigated the role of MutSα and MutSβ in GAA·TTC triplet-repeat expansion in the FRDA iPSCs.
It has been shown that shRNA silencing of MSH2 impedes GAA·TTC triplet-repeat expansion in the FRDA iPSCs (16); however, that study did not determine the effect of MSH2 knockdown on the rate of repeat expansion or the role of the other MutSα and MutSβ subunits (MSH3 and MSH6) in repeat expansion. We therefore performed shRNA-mediated knockdown of the mRNAs encoding each of these proteins in FRDA iPSCs and monitored the effects of MMR knockdown on repeat expansion. shRNA knockdown of MSH2 reduced the level of MSH3 protein, leaving MSH6 protein unaffected (Fig. 5A). Due to the sensitivity of our PCR assay, where amplification of the shorter allele is more efficient than that of the longer allele and where changes in repeat length are easier to quantify, we measured changes in repeat length on the shorter allele. The rate of GAA·TTC triplet-repeat expansion in the shorter expanded allele was analyzed after MSH2 gene knockdown (Fig. 5A). iPSC genomic DNA was collected after four, six, and eight passages of MSH2 knockdown. PCR analysis showed that the shorter expanded alleles accumulated 11.4 GAA·TTC triplet repeats at every passage in scrambled control iPSCs, but only 3.6 GAA·TTC triplet repeats in MSH2 knockdown iPSCs (Fig. 5A). These data further implicate MSH2 in GAA·TTC triplet-repeat instability. MSH6 gene knockdown was also achieved (Fig. 5B), whereas MSH3 gene knockdown was unsuccessful (data not shown). MSH6 gene knockdown was analyzed at the protein level for all three MMR subunits, showing that MSH6 protein knockdown did not affect the MSH2 or MSH3 protein levels significantly (Fig. 5B). PCR analysis for repeat lengths showed that the MMR enzyme MutSα (MSH2/MSH6) is involved in GAA·TTC triplet-repeat expansion. The shorter expanded allele accumulated 6.5 GAA·TTC triplet repeats at every passage in scrambled control iPSCs, but only 3.4 GAA·TTC repeats in MSH6 knockdown iPSCs (Fig. 5B). The difference between the scrambled control iPSCs in Fig. 5 (A and B) may come from a passage number difference between lines during the shRNA gene knockdown.
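Because the two scrambled controls expand at different rates, it can help to express each knockdown relative to its own control. The sketch below does this using the per-passage rates quoted above. It is a simple ratio for illustration only, not a statistical analysis from the study.

```python
def fractional_reduction(control_rate, knockdown_rate):
    """Fraction by which the expansion rate drops relative to its own control."""
    return 1.0 - knockdown_rate / control_rate


# Per-passage expansion rates of the shorter allele, as given in the text.
experiments = {
    "MSH2 knockdown": (11.4, 3.6),
    "MSH6 knockdown": (6.5, 3.4),
}

for name, (control, knockdown) in experiments.items():
    reduction = fractional_reduction(control, knockdown)
    print(f"{name}: expansion rate reduced by about {reduction:.0%}")
```

On these numbers, MSH2 knockdown reduces the expansion rate by roughly two-thirds and MSH6 knockdown by roughly one-half, each relative to its matched scrambled control.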
GAA·TTC Triplet Repeat-targeted Pyrrole-Imidazole Polyamide Impedes Repeat Expansion-Biochemical studies have documented that expanded GAA·TTC triplet repeats adopt unusual non-B DNA structures in vitro, such as triplexes, containing two purine GAA strands along with one pyrimidine TTC strand, flanking a single-stranded pyrimidine region (40), as well as intramolecular "sticky" DNA (41). Structural studies indicate that β-alanine-linked polyamides bind the minor groove of canonical B DNA (42). Given the high affinity of β-alanine-linked polyamides for purine tracts (43), these molecules might act as a thermodynamic "sink" and lock GAA·TTC triplet repeats into double-stranded B DNA. Such an event would disfavor duplex unpairing, which is necessary for formation of FRDA triplexes and sticky DNA, which may lead to GAA·TTC triplet-repeat expansion in FRDA iPSCs. It has been shown that β-alanine-linked pyrrole-imidazole polyamide FA1 (ImPy-β-ImPy-β-Im-β-Dp, where Py is pyrrole, Im is imidazole, β is β-alanine, and Dp is dimethylaminopropylamine) binds GAA·TTC triplet-repeat DNA in its B-type conformation, shifts the equilibrium from sticky DNA back to B-type DNA, and relieves transcriptional repression of the FXN gene in FRDA cells (Fig. 6A) (27).
Quantitative DNase I footprinting experiments showed that polyamide FA1 binds the 9-bp site 5′-AAGAAGAAG-3′ (as in GAA·TTC triplet-repeat DNA) with an apparent dissociation constant (Kd) of 0.1 nM (27). Quantitative footprint titration experiments also revealed that polyamide HIV-1 (ImPy-β-ImPy-γ-ImPy-β-ImPy-β-Dp, where γ is γ-aminobutyric acid) binds to sites adjacent to and overlapping the HIV-1 TATA element (5′-(A/T)GC(A/T)GC(A/T)-3′) with a Kd of 0.05 nM (28), and subsequent studies showed that this same molecule targets CAG·CTG triplet-repeat DNA with high affinity (Fig. 6B) (29). Both molecules have been found to be cell-permeable, and fluorescent conjugates localize in the cell nucleus (supplemental Fig. S3) (27,44). Thus, polyamide HIV-1 was used as a control in the following experiments. We asked whether these polyamides would have an effect on GAA·TTC triplet-repeat expansion in FRDA iPSCs. At 5 μM, FA1 impeded GAA·TTC triplet-repeat expansion in FRDA iPSCs, whereas the control polyamide HIV-1 did not (Fig. 6, C and D).
We postulate that polyamide FA1 reverses a non-B DNA conformation, causing displacement of MMR enzymes and thereby preventing repeat expansion. ChIP assays showed that FA1 removed the MMR enzyme MSH2 from the region downstream of the repeats (Fig. 6E). This result was consistent with the MSH2 shRNA knockdown that removed the MMR enzyme MSH2 from the region downstream of the repeats (Fig. 6F).

DISCUSSION
Accumulating data from E. coli, yeast, mammalian cell, and mouse models reveal that somatic gene mosaicism and triplet-repeat expansion occur as the result of multiple small changes in length-dependent, transcription-dependent, and DNA MMR-dependent processes. The most parsimonious model that can account for these dynamics involves the DNA MMR machinery (45).
Generation of expanded GAA·TTC triplet repeats in FRDA iPSCs can be explained using a MMR model. In this model, possibly due to transcription and RNA·DNA hybrid formation (46), the GAA·TTC triplet repeats adopt an unusual non-B DNA structure. The loop outs are recognized by MutSβ (MSH2/MSH3) and MutSα (MSH2/MSH6), which initiate a DNA MMR reaction by recruiting a MutL heterodimer. If the complementary loop outs are far enough apart, they will be repaired as independent events, and if repair is directed to incorporate the loop out on the opposite strand, a net gain of repeats will result that is equivalent to the size of the original misalignment. Multiple rounds of misalignment and MMR could lead to the accumulation of large GAA·TTC triplet-length changes.
Whether the mechanism of GAA·TTC triplet-repeat generation is really as simple as inappropriate DNA MMR remains to be elucidated. However, progress achieved in understanding the important factors mediating repeat instability will be extended through the use of an impressive array of model systems now established to test competing models. FRDA iPSCs would be a valuable model to study repeat instability in the human genome.
Is there any relationship between triplet-repeat instability and reprogramming? Recently, several studies have reported triplet-repeat expansions and contractions during iPSC derivation (17,33). There are three ways to explain these repeat expansions and contractions. First, iPSC clones come from a fibroblast cell with shorter or longer repeats. Second, repeat expansion occurs during the reprogramming stage. The repeat size difference between fibroblasts and iPSCs comes from the propagation of iPSCs. The very early iPSCs (passage 0) showing expansion in this study resulted after 3 weeks of propagation, in which transduced fibroblasts grew and formed iPSC clones. The repeat expansion rate between fibroblasts and the very early iPSCs (passage 0) is similar to expansion during later iPSC propagation (Fig. 3E). Third, there is a true expansion and contraction during cell reprogramming. However, this possibility needs more experimental support. For example, one could differentiate iPSCs back to fibroblasts and check the repeat lengths. These new fibroblasts will have less somatic instability because the iPSCs come from one fibroblast cell. Then, reprogram the newly generated fibroblasts back into iPSCs and check the length of the repeats. Although this process will not fully answer the question, it can further confirm expansion and contraction during the reprogramming.
Is MSH6 involved in GAA·TTC triplet-repeat expansion? Several studies have pointed to the involvement of MutSβ (MSH2/MSH3) in triplet-repeat instability in transgenic mice (8,15,39). Our previous ChIP assays (16) also show that at a resolution of ~1 kb, there is an increased occupancy of MSH2 and MSH3 downstream of the GAA·TTC triplet repeats in FRDA iPSCs compared with an unaffected iPSC line, but, in contrast, not 1254 bp upstream of the FXN transcriptional start site or directly upstream of the GAA·TTC triplet repeats. In addition, no differences in MSH6 occupancy were found at any of the regions probed. However, these results cannot rule out the involvement of MSH6 in GAA·TTC triplet-repeat expansion. One possibility is that MSH6 may have an indirect effect on GAA·TTC triplet-repeat expansion in iPSCs. Because the mechanism of MMR in mammalian cells is not well understood, future biochemical studies could investigate the function of MSH6 in MMR. iPSCs clearly provide a mammalian cell model to study the mechanisms of MMR.
Are other factors involved in GAA·TTC triplet-repeat expansion? Higher expression of mismatch enzymes in iPSCs could be one key factor that promotes the GAA·TTC triplet-repeat expansion. However, two results suggest that there may be other factors contributing to GAA·TTC triplet-repeat expansion in iPSCs. First, asymptomatic heterozygous carriers show GAA·TTC triplet-repeat expansion only on the pathogenic allele, but not on the normal allele in FRDA iPSCs (16). This result strongly supports models in which expanded alleles adopt non-B DNA structures, which in turn promote further expansion by recruitment of the MMR enzymes. However, the exact mechanisms involved in MMR recruitment remain to be identified. Second, in FRDA patients, somatic GAA·TTC triplet-repeat instability is tissue-specific. This finding could implicate other, as yet unidentified factors in repeat expansion. Because we cannot target FXN transcription, DNA replication, or active MMR enzymes to block GAA·TTC triplet-repeat instability, factors that are specifically related to this instability could represent a potential drug target to treat FRDA.
FIGURE 6. Polyamide FA1, which targets GAA·TTC triplet-repeat DNA, partially blocks repeat expansion in FRDA iPSCs. A, polyamide FA1 sequence and binding model. B, polyamide HIV-1 sequence and binding model. Filled and open circles are imidazole (Im) and pyrrole (Py) rings, respectively; diamonds are β-alanine (β); the curved line is γ-aminobutyric acid (γ); and the semicircle with a plus sign is dimethylaminopropylamine (Dp). C and D, PCR of GAA·TTC triplet repeats and GAA·TTC triplet-repeat expansion rate (over passage (P)), respectively, in untreated, FA1-treated, and HIV-1-treated iPSCs. The HIV-1 polyamide targets CTG·CAG triplet-repeat DNA. E, polyamide FA1 targeting GAA·TTC triplet-repeat DNA displaces MSH2 from FXN intron 1 in FRDA iPSCs. F, MSH2-targeted shRNA (shMSH2) displaces MSH2 from FXN intron 1 in FRDA iPSCs. ChIP experiments were performed with anti-MSH2 antibody in FRDA iPSCs using PCR primers for a region on the FXN gene 1254 bp upstream of the transcriptional start site (−1254) and upstream (up GAA) and downstream (down GAA) of the GAA·TTC triplet-repeat expansion. Immunoprecipitation recovery is relative to intron 2.
Specific β-alanine-linked pyrrole-imidazole polyamides could be one way to impede the GAA·TTC triplet-repeat expansion. Although the existence of GAA·TTC and other triplet-repeat sequences has been known for some time, there is still no effective method or compound to prevent repeat instability in vivo. In this study, we have shown that a specific polyamide (FA1) can impede the GAA·TTC triplet-repeat expansion in FRDA iPSCs, likely by binding and displacing MMR enzymes from the repeats. This observation opens a new avenue for investigating and designing new compounds that have the ability to target GAA·TTC triplet repeats, as well as other repeats, to block repeat instability. The ability to impede expansion raises the hope that the severity of pathophysiology might be reduced or its onset delayed, thereby widening the therapeutic window for these deadly triplet-repeat diseases (3).