Interaction of the Transcription Factors USF1, USF2, and α-Pal/Nrf-1 with the FMR1 Promoter: IMPLICATIONS FOR FRAGILE X MENTAL RETARDATION SYNDROME

Hypermethylation of the FMR1 promoter reduces its transcriptional activity, resulting in the mental retardation and macroorchidism characteristic of Fragile X syndrome. How exactly methylation causes transcriptional silencing is not known but is relevant if current attempts to reactivate the gene are to be successful. Understanding the effect of methylation requires a better understanding of the factors responsible for FMR1 gene expression. To this end we have identified five evolutionarily conserved transcription factor binding sites in this promoter and shown that four of them are important for transcriptional activity in neuronally derived cells. We have also shown that USF1, USF2, and α-Pal/Nrf-1 are the major transcription factors that bind the promoter in brain and testis extracts and suggest that elevated levels of these factors account in part for elevated FMR1 expression in these organs. We also show that methylation abolishes α-Pal/Nrf-1 binding to the promoter and affects binding of USF1 and USF2 to a lesser degree. Methylation may therefore inhibit FMR1 transcription not only by recruiting histone deacetylases but also by blocking transcription factor binding. This suggests that for efficient reactivation of the FMR1 promoter, significant demethylation must occur and that current approaches to gene reactivation using histone deacetylase inhibitors alone may therefore have limited effect.
Fragile X syndrome is caused by the expansion of a CGG repeat in the 5′-untranslated region of the fragile X mental retardation (FMR1) gene (1,2). This results in hypermethylation of the promoter and transcriptional silencing (3). The major symptoms of fragile X syndrome, mental retardation and macroorchidism, are consistent with the observation that high levels of FMR1 expression occur in specific cells in brain and testis (4). The GC-rich human FMR1 promoter lacks a typical TATA-box and contains several potential Sp1 binding sites as well as an E-box and putative binding sites for the transcription factors α-Pal/Nrf-1, AP2, AGP/EBP, and Zeste (5). Four regions of protein binding in the unmethylated promoter of the FMR1 gene have been described by in vivo dimethyl sulfate footprinting analysis in human fibroblasts, peripheral lymphocytes, and lymphoblastoid cell lines (5,6). These footprints correspond to the α-Pal/Nrf-1 site, 2 GC-boxes, and the E-box. However, the transcription factors that interact with these sites have not yet been identified. Moreover, whereas these footprints do reflect in vivo interactions, their relevance to the regulation of FMR1 transcription in brain and testis, the two major affected organs, is unknown.
Because FMR1 knockout mice demonstrate learning deficits and macroorchidism similar to those seen in fragile X patients (7), and mice show patterns of temporal and tissue-specific FMR1 expression similar to humans (8), many of the control elements important for FMR1 gene regulation are also likely to be evolutionarily conserved. To identify conserved promoter elements, we compared the sequence of the human FMR1 promoter with that from two other primates: Pan troglodytes (chimpanzee) and Macaca arctoides (stump-tailed macaque) as well as two more evolutionarily distant species: Mus domesticus (mouse) and Canis familiaris (dog). We also examined the activity of promoter mutations in transient expression assays using a neuronally derived cell line, PC12, and identified the major transcription factors that bind the FMR1 promoter in nuclear extracts of brain and testis. We also demonstrate that binding of one of these factors is abolished by methylation, and binding of the other two factors is also affected. This may have implications for therapeutic strategies aimed at reactivating the gene, since it indicates that methylation of the FMR1 promoter in Fragile X patients does not simply inhibit transcription via the formation of transcriptionally inactive chromatin, as suggested from in vivo dimethyl sulfate footprinting in lymphoblasts (6) and from the presence of deacetylated histones on the FMR1 promoter in individuals with Fragile X syndrome (9).

EXPERIMENTAL PROCEDURES
Isolation of the Mouse fmr1 Promoter-A 129/SVJ mouse (M. domesticus) BAC library was screened by polymerase chain reaction using two primers from exon 1 of the mouse FMR1 gene (Genome Systems, St. Louis, MO). The resultant clone was subcloned and mapped using standard procedures. Exon 1 of the mouse FMR1 gene was localized to a 3.1-kilobase EcoRI fragment that was subcloned into pZero (Invitrogen, Carlsbad, CA). This clone, designated pEco3.3, was then sequenced using standard procedures. Sequence comparison was done using the GCG package (Wisconsin Package Version 10.0, Genetics Computer Group (GCG), Madison, WI). The mouse sequence was scanned against the GenBank™ data base. The only significant matches were to the human ...
The nucleotide sequence(s) reported in this paper has been submitted to GenBank™ (accession number AF251347).
Amplification, Sequencing, and Computer Analysis of the Promoter Region from Different Species-Polymerase chain reaction amplification of the FMR1 promoter from the genomic DNA of chimpanzee (P. troglodytes), macaque (M. arctoides), and dog (C. familiaris) was carried out using the Expand™ HiFidelity polymerase chain reaction system (Roche Molecular Biochemicals) and the primers Fraxa f (5′-dAGCCCCGCACTTCCACCACCAGCTCCTCCA-3′) from exon 1 and FMRUP2 (5′-dGCNTTCCCGCCNTNCACCAAG-3′), which is homologous to the 5′ end of the promoter that is conserved in mice and humans. The polymerase chain reaction product was directly sequenced using the Thermosequenase radiolabeled terminator cycle sequencing kit (U. S. Biochemicals Corp.) according to the manufacturer's recommendations. The sequences obtained were aligned initially using MacVector™ version 5.0.2 (Oxford Molecular Group, Inc., Campbell, CA), followed by visual inspection. The transcription factor binding sites in the human sequence were analyzed using TESS (Transcription Element Search Software, available on the World Wide Web). The sequences were submitted to GenBank™ (accession numbers AF251349, AF251350, and AF251348 for chimpanzee, macaque, and dog, respectively).
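The FMRUP2 primer contains three N positions, so it is a pool of related sequences rather than a single one. The following minimal Python sketch illustrates how such a degenerate primer can be expanded into a pattern and located in a template. The helper names and the template sequence are hypothetical and are not part of the original procedures; only the primer sequence itself comes from the text.

```python
import re

# IUPAC nucleotide ambiguity codes expressed as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

# Number of bases each code can stand for.
CODE_SIZE = {code: len(cls.strip("[]")) for code, cls in IUPAC.items()}

FMRUP2 = "GCNTTCCCGCCNTNCACCAAG"  # primer sequence as given in the text


def primer_to_regex(primer):
    """Compile a degenerate primer into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in primer.upper()))


def degeneracy(primer):
    """Count how many distinct sequences the degenerate primer represents."""
    total = 1
    for base in primer.upper():
        total *= CODE_SIZE[base]
    return total


def find_sites(template, primer):
    """Return 0-based start positions where the primer matches the template."""
    pattern = primer_to_regex(primer)
    template = template.upper()
    return [i for i in range(len(template)) if pattern.match(template, i)]


if __name__ == "__main__":
    # Hypothetical template, used only to demonstrate the search.
    template = "TTT" + "GCATTCCCGCCGTTCACCAAG" + "AA"
    print(degeneracy(FMRUP2), find_sites(template, FMRUP2))
```

This searches one strand only; a practical primer check would also scan the reverse complement of the template.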
Generation of Reporter Constructs-A single base insertion was introduced into the middle of the E-box site in p32.9 (10) using the QuickChange Mutagenesis™ protocol (Stratagene, La Jolla, CA) and the primer pair 5′-GAACAGCGTTGATCACTGTGACGTGGTTTCAGTGTTTAC-3′ and 5′-dGTAAACACTGAAACCACGTCACAGTGATCAACGCTGTTC-3′. The introduction of the single base change to the resultant plasmid (pUSFmut) was confirmed by sequencing. A 909-bp fragment from the original FMR1 promoter in p32.9 and a 910-bp fragment from the mutated promoter from pUSFmut, containing 869 and 870 bases from the human FMR1 promoter, respectively, were cloned into KpnI-NheI-digested plasmid pGL3-basic (Promega, Madison, WI), which contains the firefly (Photinus pyralis) luciferase reporter gene, to make pGL-FMR and pGL-USFmut. The 909-bp wild-type FMR1 promoter was also cloned into HindIII-SpeI-digested plasmid pRL-null (Promega), which contains the sea pansy (Renilla reniformis) luciferase-coding sequence, to make the control plasmid pRL-FMR. Deletions in the promoter were created either by Exonuclease III and S1 nuclease treatment or by digestion with a combination of restriction enzymes followed by T4 DNA polymerase. Specifically, plasmids pΔ−88/+244, pΔ+55/+244, and pΔ−447/+244 were generated by digesting pGL-FMR with BglII and filling in the recessed 3′ ends with α-phosphorothioate deoxyribonucleotides followed by digestion with NheI, Exonuclease III, and S1 nuclease according to standard procedures. Plasmid pΔ−131/−123 was made by digesting pGL-FMR with BssHII followed by ligation. pΔ−131/−123/USFmut was generated from pGL-USFmut in the same way. Plasmids pΔ−149/−116 and pΔ−151/−81 were generated by SphI digestion followed by T4 DNA polymerase treatment and ligation. Plasmid pΔ−85/−9 was obtained by digestion of pGL-FMR with PmlI and treatment with Exonuclease III and S1 nuclease.
A 6-base pair deletion was made in the Initiator (Inr)-like element in the FMR1 promoter using the QuickChange Mutagenesis™ protocol and the primer pair 5′-dGGCCGGGGGTTCGGCCAGGCGCTCAGCTCC-3′ and 5′-dGGAGCTGAGCGCCTGGCCGAACCCCCGGCC-3′. This resulted in plasmid pΔ+5/+10. The deletion was confirmed by sequencing. Similarly, a four-base pair change was made in each of the two GC-boxes by the QuickChange Mutagenesis™ protocol, and the mutation was confirmed by sequencing. Specifically, the Sp1 site was mutated using primers 5′-dCACTTGAAGAGAGAGGATCTGGCCGAGGGGCTGAGC-3′ and 5′-dGCTCAGCCCCTCGGCCAGATCCTCTCTCTTCAAGTG-3′ to get plasmid pS1mut, and the Sp1-like site was mutated using primers 5′-dGCTGAGCCCGCGGGGGATCTGAACAGCGTTGATCAC-3′ and 5′-dGTGATCAACGCTGTTCAGATCCCCCGCGGGCTCAGC-3′ to get plasmid pS2mut. The plasmids pS1mut and pS2mut were digested with BssHII and ligated to get pΔ−131/−123/S1mut and pΔ−131/−123/S2mut, respectively.
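Each mutagenic primer pair listed above consists of two fully complementary oligonucleotides, so one strand should equal the reverse complement of the other. The short Python sketch below is an illustrative consistency check on the printed sequences, not part of the original methods; the function and constant names are hypothetical.

```python
# Translation table for complementing DNA bases (N stays N).
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_complementary_pair(top, bottom):
    """True when the bottom primer is the exact reverse complement of the top."""
    return reverse_complement(top) == bottom.upper()


# Primer pairs copied from the text (5' to 3', without the 'd' prefix).
EBOX_PAIR = (
    "GAACAGCGTTGATCACTGTGACGTGGTTTCAGTGTTTAC",
    "GTAAACACTGAAACCACGTCACAGTGATCAACGCTGTTC",
)
INR_PAIR = (
    "GGCCGGGGGTTCGGCCAGGCGCTCAGCTCC",
    "GGAGCTGAGCGCCTGGCCGAACCCCCGGCC",
)

if __name__ == "__main__":
    for name, pair in (("E-box", EBOX_PAIR), ("Inr", INR_PAIR)):
        print(name, is_complementary_pair(*pair))
```

A failed check on a transcribed primer pair usually points to a copying error in one of the two strands.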
Methylation of Reporter Constructs-Specific methylation of the E-box and α-Pal/Nrf-1 sites was achieved by in vivo methylation with the PmlI and BssHII methylases, respectively. This was accomplished using pLG339 as the vector for the methylase genes. pLG339 containing the BssHIIM gene (pLG339/BssHIIM) was a gift of New England Biolabs, Beverly, MA. A similar clone was constructed for the PmlI methylase by subcloning the 3.5-kilobase EcoRI-SalI fragment from pEco72 M (a gift of MBI Fermentas Inc.) into pLG339 to generate pLG339/PmlIM. pGL-FMR was cotransformed into Escherichia coli along with either pLG339/BssHIIM or pLG339/PmlIM. Methylation in all cases was confirmed by digestion with PmlI or BssHII. Since the plasmids obtained by in vivo methylation are contaminated with the low copy number pLG339 derivative, an unmethylated control was prepared by cotransforming pGL-FMR with pLG339.
Cell Culture, Transient Transfections, and Promoter Assays-Cells were grown in Dulbecco's modified Eagle's medium (Life Technologies, Inc.) supplemented with 5% fetal calf serum (Life Technologies, Inc.), 10% heat-inactivated horse serum (Sigma), and 1× penicillin-streptomycin (Sigma) at 37°C and 5% CO₂ to ~70% confluence. Culture medium was replaced 18-24 h before the transfection. Ten micrograms of test plasmid DNA together with 10 μg of the control plasmid pRL-FMR were introduced into ~10⁷ cells by electroporation (300 V, 1180 μF capacitance; Cell-Porator, Life Technologies). Transfected cells were plated in duplicate on 6-well plates. After 16 h, the culture medium was replaced with fresh medium. Cells were collected 42-44 h after transfection and assayed for luciferase using the Dual-Luciferase® Reporter assay system (Promega) and a MicroLumat LB 96 P luminometer (Berthold Systems, Inc., Aliquippa, PA). At least three independent transfections were performed for each plasmid. The mean of the luciferase activities was plotted after adjusting for the activity of pRL-FMR.
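The text does not state exactly how the firefly signal was adjusted for the pRL-FMR control. The sketch below assumes the common approach of dividing each firefly reading by the Renilla reading from the same transfection, averaging across independent transfections, and expressing the result as a percentage of the full-length construct. All readings and function names are hypothetical illustrations.

```python
from statistics import mean


def normalized_ratios(firefly, renilla):
    """Firefly/Renilla ratio for each independent transfection.

    Assumes the two lists are paired readings from the same lysates.
    """
    if len(firefly) != len(renilla):
        raise ValueError("firefly and Renilla readings must be paired")
    return [f / r for f, r in zip(firefly, renilla)]


def percent_of_reference(test_ratios, reference_ratios):
    """Mean normalized activity as a percentage of the reference construct."""
    return 100.0 * mean(test_ratios) / mean(reference_ratios)


if __name__ == "__main__":
    # Hypothetical luminometer readings from three transfections each.
    full_length = normalized_ratios([1000, 1200, 800], [100, 100, 100])
    mutant = normalized_ratios([200, 220, 180], [100, 110, 90])
    print(percent_of_reference(mutant, full_length))
```

With these made-up readings the mutant construct has one-fifth of the full-length activity, which is the same kind of comparison reported as a 5-fold reduction.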
Transcription Factors, Competitor Oligonucleotides, and Antibodies-Oligonucleotides used in the electrophoretic mobility shift assay are listed in Table I. The consensus binding sites for their cognate transcription factors are underlined. Double-stranded oligonucleotides containing the consensus binding sites for AP1, AP2, Sp1, OCT1, TFIID, and CREB were obtained from Promega. The top and bottom strands of two variants of the α-Pal/Nrf-1 site and that of consensus E-box and SREBP-1 sites were synthesized by Life Technologies. The complementary strands were annealed and used without further purification. Antibodies against USF1, USF2, Max, c-Myc, Sp1, Sp3, Sp4, and Egr-1 were from Santa Cruz Biotechnology, Inc. (Santa Cruz, CA). CREB antibody was from New England BioLabs. α-Pal/Nrf-1 antibodies were a gift from Dr. Brian Safer (NHLBI, NIH, Bethesda, MD).
Preparation of Nuclear Extracts-Nuclear extracts were prepared from the brain, testis, and liver of FVB/N mice (The Jackson Laboratory, Bar Harbor, ME) and from human lymphoblastoid cell lines (Coriell Cell Repositories, Camden, NJ) by a modification of the method of Dignam et al. (11). Briefly, the tissues and lymphoblastoid cells were washed with phosphate-buffered saline and resuspended in two volumes of ice-cold buffer A (10 mM HEPES, pH 7.9, 1.5 mM MgCl₂, 10 mM NaCl, 0.5 mM dithiothreitol, 0.5 mM phenylmethylsulfonyl fluoride, 7 μg/ml calpain inhibitor II) containing 1 protease inhibitor mixture tablet (Complete™ mini, EDTA-free, Roche Molecular Biochemicals, Indianapolis, IN) per 10 ml of buffer A. The cells were lysed by 10 strokes of a Dounce homogenizer, and the tissues were homogenized using a VIRTIS 45 homogenizer (Virtis Company Inc., Gardiner, NY). The lysed cells and tissues were then centrifuged at 3500 × g for 15 min to pellet the nuclei. The pellet was spun again at 25,000 × g for 20 min to remove the residual cytosolic material. The nuclei were resuspended in 3 ml of buffer C per 10⁹ cells (20 mM HEPES, pH 7.9, 25% glycerol, 420 mM NaCl, 1.5 mM MgCl₂, 0.2 mM EDTA, 0.5 mM phenylmethylsulfonyl fluoride, 0.5 mM dithiothreitol, 7 μg/ml calpain inhibitor II, and the same protease inhibitor mixture tablet used previously) and stirred at 4°C for 30 min. The nuclear debris was removed by centrifugation at 25,000 × g for 30 min. The supernatant was dialyzed against 50 volumes of buffer D (20 mM HEPES, pH 7.9, 20% glycerol, 100 mM NaCl, 0.2 mM EDTA, 0.5 mM phenylmethylsulfonyl fluoride, 0.5 mM dithiothreitol) at 4°C overnight using a Slide-A-Lyzer® cassette (Pierce). The dialysate was centrifuged at 25,000 × g for 20 min. The supernatant was quick-frozen in a dry ice/ethanol bath and stored in aliquots at −80°C. Protein concentrations were determined using the Bio-Rad protein assay reagent. Nuclear extracts from PC12 cell lines were purchased from Promega.
Binding was carried out at 30°C or 4°C in 30 μl of reaction buffer containing 25 mM HEPES, pH 7.5, 5 mM MgCl₂, 2 mM dithiothreitol, 100 mM NaCl, 0.25 ng of probe, 5 μg of protein, and 1 μg of poly[dA-dT][dA-dT] for 30 min. A 1000-fold excess of nonspecific DNA and transcription factor binding oligodeoxyribonucleotides were included in the reactions as nonspecific or specific competitors. For antibody supershift assays, 4 μg of antibody or BSA was incubated with the protein before the addition of the probe, and reactions were carried out for 40 min. The reactions were stopped by the addition of 30 μl of 2× gel loading buffer (200 mM Tris, pH 8.8, 10.5% glycerol, 0.002% bromphenol blue). Reactions were subjected to electrophoresis on a 4% polyacrylamide gel (60:1, acrylamide:bis) containing 1.6% glycerol in 1× Tris-glycine-EDTA buffer (50 mM Tris, 380 mM glycine, 2.1 mM EDTA, pH 8.5). Gels were dried and exposed to x-ray film.
DNase I Footprinting-For DNase I footprinting, a PstI/EcoNI fragment of p32.9 was end-labeled at the EcoNI site by [α-³²P]dGTP and Klenow DNA polymerase (Life Technologies). Binding reactions were carried out with 1 ng of 3′ end-labeled probe and 20 μg of protein in a 100-μl reaction volume containing poly[dA-dT][dA-dT]. The reactions were treated with 0.25 units of DNase I in the presence of 10 mM MgCl₂ and 1 mM CaCl₂ for 10 min at 30°C. The reactions were phenol-extracted and ethanol-precipitated with tRNA as carrier. The sample was dissolved in 100 μl of TE (10 mM Tris-HCl, pH 8.0, 1 mM Na₂EDTA) and butanol-precipitated. The pellet was washed with 70% ethanol, dried, and resuspended in 10 μl of formamide stop buffer (95% formamide, 20 mM EDTA, 0.05% bromphenol blue, 0.05% xylene cyanol FF). A 2-μl sample was electrophoresed on a 6% sequencing gel at 1600 V until the bromphenol blue dye reached the bottom of the gel. The gel was dried and exposed to x-ray film.
Western Blot Analysis-The nuclear extracts were subjected to electrophoresis on 10% polyacrylamide gels containing SDS and electroblotted to the NitroPure membrane (MSI, Westboro, MA) using standard procedures. Hybridization with primary antibody was carried out as specified by the supplier. Hybridization with the secondary antibody and signal detection were done using ECL Western blotting kit (Amersham Pharmacia Biotech) as per the manufacturer's recommendation.

RESULTS
Phylogenetic Footprinting of the 5′ End of FMR1 Gene Identifies Five Evolutionarily Conserved Regions-A 466-bp fragment including 272 bp of sequence upstream of the transcription start site of the human FMR1 gene has previously been shown to contain all the elements necessary for the appropriate tissue-specific expression of the FMR1 gene in transgenic mice (12). To define the minimal promoter region more closely and to identify those transcription factor binding sites that are evolutionarily conserved and that may therefore be important for regulation of this gene, we compared the sequences of the 5′ end of the FMR1 gene that we obtained from a number of different mammals. Fig. 1 shows a dot matrix comparison of the exon 1-containing portion of the mouse fmr1 gene with the corresponding region of the human FMR1 sequence (GenBank™/EBI locus: HUMFMR1S). The homology between the two sequences is highest in the first exon, but islands of homology both 3′ and 5′ of exon 1 are also seen. The 5′ regions fall within the 272-bp previously defined minimal promoter fragment (12), the upstream border of which is marked by the black arrow in Fig. 1.
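A dot matrix comparison marks every pair of positions at which a short window of one sequence matches a window of the other, so that conserved regions appear as diagonal runs. The Python sketch below illustrates the general idea only; it is not the GCG implementation used for Fig. 1, and the window size, threshold, and toy sequences are assumptions chosen for the example.

```python
def dot_matrix(seq_a, seq_b, window=5, min_matches=5):
    """Return (i, j) pairs where windows of seq_a and seq_b are similar.

    A dot is recorded when at least `min_matches` of the `window` aligned
    bases starting at seq_a[i] and seq_b[j] are identical.
    """
    a, b = seq_a.upper(), seq_b.upper()
    dots = []
    for i in range(len(a) - window + 1):
        for j in range(len(b) - window + 1):
            matches = sum(x == y for x, y in zip(a[i:i + window], b[j:j + window]))
            if matches >= min_matches:
                dots.append((i, j))
    return dots


def render(seq_a, seq_b, dots, window):
    """Draw the matrix as text: rows follow seq_a, columns follow seq_b."""
    hits = set(dots)
    rows = []
    for i in range(len(seq_a) - window + 1):
        rows.append("".join("*" if (i, j) in hits else "."
                            for j in range(len(seq_b) - window + 1)))
    return "\n".join(rows)


if __name__ == "__main__":
    # Toy sequences, not taken from the FMR1 locus.
    a, b = "ACGTACGGCC", "TTACGTAC"
    dots = dot_matrix(a, b, window=4, min_matches=4)
    print(dots)
    print(render(a, b, dots, window=4))
```

Lowering `min_matches` below `window` tolerates mismatches, which is how dot plots reveal diverged but still related regions.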
No transcription factor binding sites were conserved between mouse and human 5′ of the α-Pal/Nrf-1 site, 131 bases 5′ of the start of transcription. This suggests that the minimal FMR1 promoter may only be 131 bp long. The conserved 3′ regions may represent regulatory elements in the first intron, but they have not been studied in any detail to date.
We used a primer derived from the conserved 5′ boundary of the promoter and a primer from exon 1 to amplify the FMR1 5′ region from chimpanzee, stump-tailed macaque, and dog. The sequences were then aligned with the human and mouse promoter sequences to identify those regions that are conserved in all five species (Fig. 2). The four previously described in vivo protein binding sites, α-Pal/Nrf-1, the 2 Sp1 or GC boxes, and the E-box (sometimes referred to as the c-Myc binding site) (5,6), are conserved in all five species (shown in the dark gray boxes in Fig. 2). None of these sequences had a good TATA-box. However, they all contained a conserved motif close to the reported start of transcription (13) that resembles an Initiator (Inr) element. Such elements direct transcription initiation in certain TATA-less promoters (14).
The region containing the dimethyl sulfate hyperreactive G residue close to the reported transcriptional start site (5) is also well conserved (the complementary C is marked by an asterisk in Fig. 2). This region does not show a protein binding footprint in vivo, at least in cells with low FMR1 activity (5). This hyperreactivity is absent in individuals with fragile X syndrome and is therefore thought to reflect some DNA structure that is associated with transcription (5). The transcription factors Zeste (5), AP2 (15), and CREB (13) have all been suggested to be important for FMR1 regulation. However, the binding sites for these factors were not conserved (shown in the open boxes in Fig. 2).
Mutational Analysis Confirms That the Evolutionarily Conserved Regions Are Important for Transcriptional Activity-Mutated versions of the human FMR1 promoter were assayed in PC12 cells, a neuronally derived cell line. Fig. 3 shows that single base insertion in the E-box (pGL-USFmut) or the deletion of 9 bases from the α-Pal/Nrf-1 site (pΔ−131/−123) each reduced the expression of the reporter gene about 5-fold. Both mutations together (pΔ−131/−123/USFmut) reduced activity almost to that of a construct containing a deletion of almost all of the FMR1 sequence (pΔ−447/+244). Therefore the upstream boundary of the minimal promoter region is probably close to the 5′ end of the α-Pal/Nrf-1 site, as suggested by the phylogenetic footprinting. A deletion that included the α-Pal/Nrf-1 site and one of the GC-boxes (pΔ−151/−81) led to an increase in activity over that seen with the nine-base deletion in the α-Pal/Nrf-1 site alone (pΔ−131/−123). Similarly, a deletion that removes both the E-box sequence and its adjacent Sp1 site (pΔ−85/−9) produced a higher activity than the mutant with the single base insertion in the E-box. The activity of this construct was in fact higher than the full-length promoter (pGL-FMR). However, when a 4-base substitution mutation was made in either of the two GC boxes, the activity was reduced to 75 and 50%, respectively (pS1mut and pS2mut). This suggests that these two regions have a positive role in the control of gene expression. The increase in activity seen for deletion constructs pΔ−151/−81 and pΔ−85/−9 that lack these sites could be due to decreased distance between the α-Pal/Nrf-1 site and the transcription initiation site or to the deletion of additional sequences that contain negative regulators of FMR1 activity. Constructs containing both the α-Pal/Nrf-1 deletion and the point mutation in the first GC box (pΔ−131/−123/S1mut) had an activity similar to the α-Pal/Nrf-1 deletion alone, suggesting that there might be some interaction between factors that bind these two sites. However, a −131/−123:S2mut double mutant showed a slightly higher activity than the −131/−123 deletion by itself. Why this should be is not clear at this time but may be related to the uncovering of secondary transcription factor binding sites that can now be used. The pΔ−85/−9 construct lacks the TATA-like sequence upstream of the transcriptional start site. Since deletion of this sequence did not negatively affect promoter activity, it is possible that the TATA-like sequence is dispensable. However, although the initiator-like element is evolutionarily conserved, its deletion had no effect on the promoter activity in PC12 cells (construct pΔ+5/+10 in Fig. 3). Since the region that includes the putative transcriptional start site is not conserved, it may be that transcription initiation in the FMR1 promoter occurs via a novel mechanism that involves as yet undefined signals.

FIG. 1. ... (12). The circled region indicates the region of homology between the two sequences due to the CGG repeat in the 5′-untranslated region of exon 1. The numbering on the right-hand side of the matrix refers to the sequence of a mouse 3.1-kilobase EcoRI fragment that includes exon 1. The numbering at the top of the matrix refers to that of the human sequence in the GenBank™ entry GB:HUMFMR1S. The numbering on the left-hand side and along the bottom of the matrix corresponds to the numbering used for the alignment of sequences shown in Fig. 2, where +1 corresponds to the first transcribed base.

FIG. 2. Sequence alignment of the 5′ end of the FMR1 promoter from five different mammalian species. The human sequence is shown in its entirety with gaps in the alignment shown as dashes. The bases in the sequence of the remaining four species that are identical to the human sequence are shown as dots, with only those bases that differ from the human sequence shown. The previously identified putative factor binding sites that are conserved are shown in the dark gray boxes. These sites correspond to the four protein binding sites that have been identified in fibroblasts, lymphocytes, and lymphoblastoid cells. The C complementary to the G at +14 that is dimethyl sulfate-hyperreactive in vivo is marked by an asterisk. Those previously identified putative factor binding sites that are not conserved are shown as the open boxes. The newly identified conserved Inr element binding site downstream of the start of transcription is shown in the light gray box. The sequence of the mouse and human fragments used in all the EMSA experiments described in this manuscript is shown in bold.

FIG. 3. Functional analysis of the human FMR1 promoter in PC12 cells. PC12 cells were electroporated with the various constructs shown on the left-hand side of the figure as described under "Experimental Procedures." Base substitution mutations are indicated by the ball and stick. The amounts of luciferase produced were plotted as percentages of the luciferase activity of the full-length promoter construct.
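The deletion construct names encode the first and last deleted positions relative to the transcription start site. The sketch below shows how the deleted length follows from such a name, assuming inclusive endpoints and the usual convention that +1 is the first transcribed base and there is no position 0. Under these assumptions the results agree with the 9-base and 6-base deletions stated in the text; the function name is hypothetical.

```python
def deleted_length(start, end):
    """Number of bases removed by a deletion spanning start..end inclusive.

    Coordinates are relative to the transcription start site (+1), and
    position 0 does not exist in this numbering scheme.
    """
    if start == 0 or end == 0:
        raise ValueError("position 0 is not defined in promoter numbering")
    if start > end:
        raise ValueError("start must not lie downstream of end")
    length = end - start + 1
    if start < 0 < end:
        length -= 1  # the span crosses -1/+1, so skip the missing position 0
    return length


if __name__ == "__main__":
    for name, span in {
        "pΔ-131/-123": (-131, -123),
        "pΔ+5/+10": (5, 10),
        "pΔ-85/-9": (-85, -9),
        "pΔ-447/+244": (-447, 244),
    }.items():
        print(name, deleted_length(*span))
```

The lengths for pΔ−85/−9 and pΔ−447/+244 are derived only from the construct names and are not stated in the text.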
Deletion of the region that included the CGG repeats in the first exon (pΔ+53/+244) produced a small increase in reporter gene activity. Since this sequence is located within the transcript, it is possible that this effect is mediated at the level of transcription, mRNA stability, or translation. A negative effect of CGG repeats on translation has been suggested based on the observation that the levels of FMRP, the protein product of the FMR1 gene, are reduced in individuals with long premutation alleles (16). However, this effect is probably only significant at high repeat numbers. A 20-kDa protein has been shown to bind to CGG repeats and inhibit FMR1 transcription (17). Deletion of the CGG repeat tract may simply eliminate the effect of these or similar proteins.
USF1, USF2, and α-Pal/Nrf-1 Bind to the FMR1 Promoter-The 200-bp region of both the mouse and human FMR1 promoters, which contains all the evolutionarily conserved putative transcription factor binding sites (shown in bold in Fig. 2), was used as a probe for binding factors in nuclear extracts from mouse brain, testis, and liver and human lymphoblastoid and PC12 cell lines. EMSA using brain nuclear extracts and the mouse promoter produced a pattern of three major retarded bands (complexes I, II, and III, Fig. 4, lane 4). In addition, three minor shifted products with mobilities slightly faster than complex I were also seen. All of the bands were specific, since they could be eliminated by an excess of nonradioactive probe (data not shown) but not with a variety of other sequences (Fig. 4, lanes 6, 7, and 8). The same pattern of gel-shifted bands was produced when the human promoter was used (data not shown).
An excess of unlabeled oligonucleotide containing the E-box consensus sequence eliminated complexes I and III (Fig. 4, lane 5) as well as the minor shifted products. An oligonucleotide containing the consensus sequence for the α-Pal/Nrf-1 binding site eliminated complexes II and III (Fig. 4, lane 9). Complexes II and III were also eliminated when poly[dI-dC] was present, presumably due to the GC richness of the α-Pal/Nrf-1 binding site (data not shown). Antibodies to the E-box-binding proteins USF1 or USF2 supershifted both complex I and III as well as the minor products, indicating that these complexes contained both USF1 and USF2 (Fig. 4, lanes 14 and 15). Antibodies to c-Myc or Max, two other E-box binding proteins, had no effect on the gel mobility profiles (Fig. 4, lanes 12 and 13). This is not due to the absence of c-Myc in the extracts since Western blots with c-Myc antibody showed the presence of large amounts of full-length c-Myc protein (Fig. 4, inset). Antibodies to α-Pal/Nrf-1 shifted both complex II and complex III (Fig. 4, lane 20), indicating that α-Pal/Nrf-1 was involved in both of these complexes.
USF1 and USF2 were unable to bind the E-box variant with the single base insertion that was used in the promoter activity studies described above. Moreover, whereas an oligonucleotide containing the unmutated E-box was able to eliminate USF binding to the FMR1 promoter when present in a 10-fold excess, an oligonucleotide containing the mutated E-box did not eliminate binding completely even when a 1000-fold excess was used (data not shown).
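The loss of USF binding is consistent with the insertion disrupting the E-box core. The sketch below assumes the canonical E-box hexamer CACGTG, which is general knowledge rather than something stated in the text, and scans the top-strand mutagenic primer for it. It then asks which single-base deletion would recreate the hexamer, as an inference about where the inserted base sits. The wild-type sequence itself is not given in the text, and the function name is hypothetical.

```python
EBOX_CORE = "CACGTG"  # canonical E-box hexamer (assumed consensus)

# Top-strand mutagenic primer carrying the single base insertion (from the text).
MUTANT_PRIMER = "GAACAGCGTTGATCACTGTGACGTGGTTTCAGTGTTTAC"


def restoring_deletions(seq, motif=EBOX_CORE):
    """Return 0-based indices whose single-base deletion creates the motif."""
    return [i for i in range(len(seq)) if motif in seq[:i] + seq[i + 1:]]


if __name__ == "__main__":
    # The intact hexamer is absent from the mutant primer.
    print(EBOX_CORE in MUTANT_PRIMER)
    # Removing one base brings the hexamer back; report where and which base.
    for index in restoring_deletions(MUTANT_PRIMER):
        print(index, MUTANT_PRIMER[index])
```

On this reading the mutant site is CACTGTG, that is, the hexamer with a T inserted at its center, which matches the description of an insertion into the middle of the E-box.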
Sp3 Binds an FMR1 Promoter Subfragment-Factor binding to the 2 GC boxes in the human promoter has been reported in fibroblasts and lymphoid cells (5,6) and in vitro on a 71-bp promoter-containing fragment in extracts of the neuronally derived cell line SK-N-SH (15). However, GC-box-specific factors did not bind the full-length promoter in any of our extracts under a variety of reaction conditions; unlabeled GC-box oligonucleotides did not alter the EMSA profile (Fig. 4, lane 6), and antibodies to members of the Sp1 transcription factor family Sp1, Sp3, and Sp4 did not supershift complex I, II, or III (Fig. 4, lanes 16, 18, and 19). Antibodies to Egr-1, a related transcription factor that binds to some GC boxes, also had no effect (Fig. 4, lane 17). However, an 82-bp SphI-PmlI fragment that contained both the conserved GC-boxes but that lacked both the E-box and the α-Pal/Nrf-1 site did produce gel shifted products, one of which could be eliminated by competition with Sp1 oligonucleotide and was supershifted by Sp3 antibody (data not shown). It is possible that lack of Sp3 binding to the larger promoter fragment is due to binding of USF and/or α-Pal/Nrf-1. The in vivo binding of Sp factors in lymphoid cells and fibroblasts might reflect the high concentrations of Sp factors in these cells relative to brain, testis, and liver (19).
CREB and AP2 Do Not Bind the FMR1 Promoter-A CREB binding site that overlaps with the E-box sequence has also been suggested to be important for FMR1 regulation (13). However, the E-box consensus oligonucleotide that was able to abolish complex I lacked sequences corresponding to a CREB site (Table I). Moreover, oligonucleotides containing the consensus sequence for CREB binding did not compete out any of the retarded bands (Fig. 4, lane 7). This is not due to the absence of CREB, because our extracts contained high levels of full-length CREB (Fig. 4, inset). Moreover, when a CREB consensus oligonucleotide is used as a probe with our extracts, a shifted product is seen (data not shown). We found no evidence for binding of any other factor, including AP2 (data not shown), which has also been reported to be involved in FMR1 regulation (15). The differences between our data and those previously reported might reflect differences in the source of the material used to prepare the nuclear extracts and in other experimental conditions. However, the lack of evolutionary conservation of the CREB and AP2 binding sites and the failure of these factors to bind in brain and testis extracts suggest that these sites may be less important for FMR1 expression than previously thought.
The EMSA was carried out as described under "Experimental Procedures" using mouse brain nuclear extracts unless otherwise indicated. Specific competitor DNA (see Table I for the sequences; lanes 5-9) or antibody to specific transcription factors (lanes 11-20) or bovine serum albumin was added where indicated. The location of the free probe, complexes I, II, and III, and the supershifted complexes are indicated by the arrows on the right-hand side of the figure. Inset, Western blot analysis of the levels of various transcription factors in brain (B), testis (T), and liver (L). SDS-polyacrylamide gel electrophoresis and Western blotting of brain, testis, and liver nuclear extracts were carried out as described under "Experimental Procedures" using antibodies to c-Myc, CREB, USF1, and USF2 as indicated.
A Similar Pattern of Protein Binding Is Seen in Extracts of Liver, Lymphoblasts, and PC12 Cells-A similar pattern of three shifted bands was observed with nuclear extracts from liver, human lymphoblastoid, and PC12 cell lines (Fig. 4, lanes 2 and 3, and data not shown). However, the two smaller of the three minor bands indicated by the bracket in the brain and liver extracts were not seen with the extracts prepared from the cell lines. These bands may be due to proteolysis during the preparation of the nuclear extracts from the mouse tissues or specific USF variants that are only seen in certain cells. Complexes I, II, and III in liver extracts could be eliminated by competition with E-box and α-Pal/Nrf-1 oligonucleotides as in brain extracts (results not shown), suggesting that they are similar in composition to those seen in brain extracts.
Testis Extracts Have a Different Pattern of Protein Binding-Although a similar pattern of gel mobility shift was seen for brain, liver, lymphoblasts, and PC12 cells, the pattern of mobility shift in testis extracts was very different. Complex I was absent, and complex II and a large amount of heterodisperse complex, "complex III," were seen together with a small amount of a novel band (complex IV) with faster mobility (Fig. 4, lane 1). Unlabeled α-Pal/Nrf-1 oligonucleotide eliminates complexes II and III, and anti-α-Pal/Nrf-1 antibody supershifts both complexes II and III (data not shown). Unlabeled E-box oligonucleotide eliminates both complex III and complex IV, and both of these complexes are supershifted with antibodies to USF1 or USF2 (data not shown). Thus, complexes II and III in testis are apparently similar to complexes II and III in brain and liver, being composed of α-Pal/Nrf-1 alone and of α-Pal/Nrf-1 plus USF proteins, respectively. Complex IV contains USF1 and USF2 proteins since it is supershifted by antibodies to both USF1 and USF2. However, since there are no unique testis-specific USF1 or USF2 polypeptides visible on Western blots of SDS-polyacrylamide gel electrophoresis gels of these extracts (Fig. 4, inset, and data not shown), the rapid mobility of USF proteins in complex IV may be due either to charge differences or to the association of USF with a different subset of proteins in this organ.
When the formation of complex III is prevented by the addition of α-Pal/Nrf-1 oligonucleotide, there is little if any increase in complex I in brain (Fig. 4, compare lanes 1 and 6) and liver or complex IV in testis (data not shown). No new bands corresponding to complex I are seen in testis extracts either (data not shown). In contrast, there is an increase in the amount of complex II seen in brain, testis, and liver extracts when excess E-box oligonucleotide is present (Fig. 4, compare lanes 4 and 5). One interpretation of these data is that the USF proteins in complex III do not bind the promoter effectively in the absence of α-Pal/Nrf-1, perhaps because they are complexed with other proteins or modified in some way and that in testis most of the USF is of this form.
Methylation of FMR1 Promoter Prevents α-Pal/Nrf-1 Binding-Methylation of the FMR1 promoter in fragile X patients (20) and in reporter constructs (21) greatly reduces the ability of the promoter to drive transcription. Methylation of the C residues in all the CpG dinucleotides in the promoter by SssI methylase eliminated complexes II and III in gel shift assays done with the human FMR1 promoter and brain extracts but had only a small effect on the formation of complex I (Fig. 5, panel A). Consistent with this observation, methylation of the three CpGs in the α-Pal/Nrf-1 site eliminated the DNase I footprint seen at the α-Pal/Nrf-1 site (Fig. 5, panel B, lane 4), whereas methylation of the single CpG in the E-box reduced but did not eliminate the DNase I protection at that site (Fig. 5, panel B, lane 6). Moreover, transfection experiments showed that methylation of the promoter at the E-box reduced promoter activity by about 20%, whereas methylation of the α-Pal/Nrf-1 site reduced activity by 55% (Fig. 5, panel C). This suggests that binding of α-Pal/Nrf-1 is sensitive to methylation and that USF1/USF2 binding is also sensitive, but somewhat less so than that of α-Pal/Nrf-1. The fact that a deletion in the α-Pal/Nrf-1 site that prevents α-Pal/Nrf-1 binding causes a 75% drop in promoter activity suggests that methylation of this site reduces the ability of α-Pal/Nrf-1 to drive transcription by more than 70%.
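The "more than 70%" estimate can be reproduced with a rough calculation. This is only a sketch under a simplifying assumption introduced here for illustration and not stated in the text: that the 75% loss seen on deleting the site represents the entire α-Pal/Nrf-1-dependent contribution to promoter activity, and that methylating the site affects only that contribution.

```latex
\[
\frac{\text{activity lost on methylation of the site}}
     {\text{activity lost on deletion of the site}}
  = \frac{55\%}{75\%} \approx 0.73
\]
```

On this reading, methylation removes roughly 73% of the α-Pal/Nrf-1-dependent activity, in line with the figure quoted above.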
Methylation of the E-box also affects the footprint at the α-Pal site (Fig. 5B, lane 6). This suggests that the USF proteins bind to both the E-box and the α-Pal/Nrf-1 site and/or that there is some interaction between these factors. This together with the observations that abolishing complex III by the addition of excess α-Pal/Nrf-1 oligodeoxyribonucleotide does not increase the amount of complex I that is seen and that mutations in the E-box or the α-Pal/Nrf-1 site both reduce promoter activity to 25% that of the wild-type promoter supports this idea. We are currently examining these possibilities in more detail.
DISCUSSION
We have identified five evolutionarily conserved sites in the promoter of the FMR1 gene (Fig. 2). These include the four in vivo protein binding sites found in fibroblasts and lymphoid cells (5,6), namely the E-box, two GC-boxes, and an α-Pal/Nrf-1 binding site. The conservation of these sites among species as evolutionarily divergent as primates, canids, and rodents, which last shared a common ancestor 80-120 million years ago, indicates that they are important for regulation of this gene. That the E-box and the α-Pal/Nrf-1 sites together can account for almost all of the transcriptional activity of the FMR1 promoter in the neuronally derived PC12 cells supports this contention (Fig. 3). Removing the TATA-like sequence just upstream of the transcription start site does not abolish gene activity nor does a deletion of an Initiator-like element close to the reported transcription start site. Initiation of transcription of the FMR1 gene may thus involve a novel mechanism whose molecular details remain to be elucidated.
We have identified three major transcription factors that seem to positively affect transcription. They are USF1 and USF2, two similar basic helix-loop-helix/leucine zipper (b/HLH/Z) type transcription factors that are ubiquitously expressed but have nonetheless been implicated in the regulation of several tissue-specific as well as developmentally or metabolically regulated genes (e.g. Refs. 22 and 23). The third positive regulator is a putative bZip factor known as α-Pal or Nrf-1 (nuclear respiratory factor-1). The term α-Pal/Nrf-1 is used here for consistency with the literature and to avoid confusion with NF-E2-related factor-1, which is also abbreviated to Nrf-1. α-Pal/Nrf-1 has been implicated in the regulation of metabolic genes in response to cellular proliferation (24). However, it has strong homology to the Drosophila erect wing gene (ewg) that is required for normal central nervous system development (24), which may be relevant given the role of FMR1 in learning and memory. Deletion of the binding sites for these factors reduces transcriptional activity in a neuronally derived cell line. We have also shown that Sp3, a member of the Sp family of ubiquitously expressed zinc finger factors that bind GC-boxes, binds to the 82-bp SphI-PmlI mouse promoter subfragment containing the conserved GC-boxes. Deletions that include one or the other of these GC-boxes do decrease promoter activity (Fig. 3), suggesting some contribution of these regions to optimal gene activity.
In liver, the amount of USF protein that binds to the promoter in gel shift or DNase I footprinting assays is reduced relative to that in brain and testis, and very little α-Pal/Nrf-1 binding is seen at all (Fig. 4, and data not shown). The high level of USF1 and USF2 in adult brain (Fig. 4 and Ref. 25) and the high levels of α-Pal/Nrf-1 (26) and the α-Pal/Nrf-1-dependent USF variant in adult testis relative to other organs may account for the relatively high levels of FMR1 mRNA in these two organs.
Methylation of the FMR1 promoter in individuals with fragile X syndrome is correlated with reduced transcription of this gene. In such individuals no transcription factor binding to the promoter in vivo is seen (5,6). This may reflect the binding of proteins such as MECP2 that recognize methylated CpGs and bind the promoter in a sequence-nonspecific manner (27). One consequence of this binding is thought to be the recruitment of histone deacetylases to the region and subsequent formation of transcriptionally silent chromatin containing a large proportion of deacetylated histones (27). The FMR1 promoter in cells from fragile X patients is associated with deacetylated histones, and reactivation of the promoter using the demethylating agent 5-azacytidine increases transcription (9,28) and the proportion of acetylated histones (9). However, treatment of cells from fragile X patients with the deacetylase inhibitor trichostatin A, which increases the level of acetylated histones directly, has little (29) or no effect on reactivation of the gene (9). This has led to the suggestion that there might be an additional component to the methylation-dependent silencing of this gene (9). We have shown that methylation significantly reduces α-Pal/Nrf-1 binding and also affects binding of USF1 and USF2. Failure of these factors to bind the methylated promoter could well be this additional component. If this is true, it would have important implications for the development of therapies aimed at reactivating FMR1 transcription in patients with fragile X syndrome.
FIG. 5. The effect of methylation on the human FMR1 promoter. A, the effect of methylation on the formation of complex I, II, and III. The EMSA was carried out as described under "Experimental Procedures" using mouse brain nuclear extracts and either unmethylated probe (0) or a probe that has been completely methylated using SssI methylase (SssIM). The location of the free probe and the gel-shifted complexes I, II, and III are indicated by the arrows on the right-hand side of the figure. B, the effect of methylation of individual transcription factor binding sites on DNase I footprinting. Footprinting was carried out as described under "Experimental Procedures" using mouse brain extracts and either an unmethylated promoter fragment (lanes 1 and 2, 0) or promoter fragments that had been methylated with BssHII methylase (BssHIIM) (lanes 3 and 4) or PmlI methylase (PmlIM) (lanes 5 and 6). The α-Pal/Nrf-1 site and the E-box site are indicated by the square brackets. The numbering corresponds to the numbering used in Fig. 2. C, the effect of methylation on promoter activity. The promoter activity of unmethylated pGL-FMR (0) or the pGL-FMR methylated by PmlIM or BssHIIM was determined as described under "Experimental Procedures." The activity is shown in panel B as a percentage of the activity of the unmethylated plasmid.