The Human 20-kDa 5′-(CGG)n-3′-binding Protein Is Targeted to the Nucleus and Affects the Activity of the FMR1 Promoter

Previous reports have described the human DNA CGG repeat-binding protein (CGGBP1 or p20), which binds specifically to nonmethylated, but not to methylated, 5′-(CGG)n-3′ repeats in the promoter of the fragile X mental retardation 1 (FMR1) gene. The results of transfection experiments into human HeLa cells using a p20-green fluorescent protein fusion construct indicate that the p20 protein is targeted to the nucleus. By deletion analyses, a nuclear localization signal has been found between amino acids 80 and 84. Deletions between amino acids 69 and 71 and between 95 and 167 interfere with 5′-(CGG)n-3′ binding. The results of electrophoretic mobility shift assays using DNA with 5′-(CGG)n-3′ repeats of different lengths render it likely that oligomers of the p20 protein bind to the repeat. In cotransfection experiments, the activity of the FMR1 promoter is reduced by the presence of p20. Upon transfection of the p20 cDNA construct into HeLa cells, transcription of the endogenous FMR1 gene is decreased. The green fluorescent protein-p20 fusion protein associates preferentially with the telomeres of the short arms of human chromosomes 13, 14, 15, 21, and 22. These telomeric regions carry the genes for the 28 S rRNA, which contain 5′-(CGG)n-3′ repeats. The translated region of the p20 gene from three healthy, five fragile X syndrome, and five premutation-carrying individuals showed no mutations.

Repetitive DNA constitutes a major part of the human genome. Unstable triplet repeat sequences have been implicated in the causation of several, mainly neurodegenerative, diseases (for reviews, see Refs. 1-3). The amplification of repeats can abolish the expression of neighboring genes or directly affect their amino acid composition when the repeat is part of the coding sequence (4-7). The fragile X syndrome (FraX) is a frequent cause of mental retardation in human males. A naturally occurring 5′-(CGG)n-3′ repeat on Xq27.3 with normal n values between 6 and 50 can expand to values between 50 and 200 in nonsyndromic premutation carriers and, upon passage through a female germ line, to values of n > 200. These large expansions lead to the disease when they become 5′-CG-3′ methylated (for reviews, see Refs. 8-11). The repeat is located in the 5′-untranslated region (UTR) of the fragile X mental retardation 1 (FMR1) gene (8). Expansions with n > 200 and methylation inactivate the gene, the product of which is normally found in many human tissues, above all in the nervous system (12). The cause of repeat expansions is unknown. Slippage of the polymerase system during replication has been implicated (13-15), but additional factors, including structural elements in the sequence, play a role.
In recent work, we have purified and characterized a 20-kDa 5′-(CGG)n-3′-binding protein termed CGGBP1 or p20 from the nuclei of human HeLa cells (16, 17). The amino acid sequence of this 5′-(CGG)n-3′ repeat-binding protein has been determined, an expressed sequence tag cDNA clone with a complete open reading frame has been identified in GenBank, and the p20 protein has been synthesized from a recombinant clone in Escherichia coli (17). The recombinant protein has 5′-(CGG)n-3′ binding properties akin to those of the protein purified from HeLa cells. The p20 coding sequence is highly conserved among mammals and is also found in chicken. The protein data base lacks sequences with homology to p20. There is a putative nuclear localization signal (NLS) (18, 19) in the sequence.
In this communication, we demonstrate that removal of the NLS from the p20 cDNA abolishes the ability of the p20 protein to localize exclusively to the nucleus of HeLa cells. We present evidence that p20 can affect the activity of the FMR1 promoter, both at the endogenous gene and when the promoter is located on a transfected construct. A fusion product of the p20 protein with the green fluorescent protein (GFP) can bind specifically to the telomeric regions on the short arms of human chromosomes 13, 14, 15, 21, and 22. The translated sequence of the p20 gene is not mutated in several individuals who carry premutation or full expansion 5′-(CGG)n-3′ repeats in their genomes.

EXPERIMENTAL PROCEDURES
Plasmid Constructs-In the pFMR1-luc (luciferase) construct, the 2.7-kilobase pair SmaI-NheI fragment of pE5.1 (5) was inserted into pGL2-Basic (Promega). pFMR1-ΔRep-luc was obtained by replacing the segment of the FMR1 5′-UTR between the XhoI and NarI restriction endonuclease sites, which contains the 5′-(CGG)16-3′ repeat, with a synthetic oligodeoxyribonucleotide lacking the repeat. pFMR1-NH08-luc was constructed by inserting a cloned 5′-(CGG)n-3′ repeat, derived from a natural premutation, into the FMR1 promoter (20). For this purpose, pRW3308 was cleaved with the XbaI and HindIII endonucleases, and the 500-base pair fragment harboring an AGG/CAG-interrupted 5′-(CGG)160-3′ sequence was inserted into the NheI and HindIII endonuclease sites of pFMR1-luc, 56 base pairs downstream of the endogenous 5′-(CGG)16-3′ repeat in the 5′-UTR. pAd2E2AL-luc was described earlier (21). pEGFP-p20, which expresses the p20 protein as a C-terminal fusion to enhanced green fluorescent protein (EGFP) from Aequorea victoria, was obtained by cloning a PCR fragment from the p20 coding region flanked by a BamHI and a KpnI site into the BglII and KpnI sites of pEGFP-C1 (CLONTECH). For the construction of pCMV-p20, the plasmid pEGFP-p20 was cut with AgeI and XmnI endonucleases to remove the EGFP gene, and the cytomegalovirus (CMV) promoter was ligated via a synthetic double-stranded oligodeoxyribonucleotide to the p20 cDNA coding sequence.
Construction of Mutants in the Putative NLS of the p20 Protein-Single nucleotide exchanges and a deletion were introduced into the coding sequence of p20 by PCR as described elsewhere (22). Briefly, pairs of overlapping PCR products (5′-primary and 3′-primary products) containing the same mutations in the overlapping regions were synthesized by PCR with cloned Pfu DNA polymerase (Stratagene, La Jolla, CA), purified, and denatured. The reannealed heteroduplexes carrying recessed 3′-ends were filled in with Pfu DNA polymerase (2 cycles of 30 s at 95°C, 45 s at 50°C, and 2 min at 72°C), and the complete, mutated coding sequence was amplified with primers p20-5′-AgeI (5′-AGAACCGGTATGGAGCGATTTGTAACAG-3′) or p20-5′-BamHI (5′-CGCGGATCCATGGAGCGATTTGTAGTAACA-3′) and p20-3′-KpnI (5′-GGGGTACCTCAACAATCTTGTGAGTTGAG-3′). PCR amplification was performed as described by the manufacturer using expressed sequence tag clone ID269133 as template and an annealing temperature of 50°C. The 5′-primary PCR products were amplified with either primer p20-5′-AgeI or p20-5′-BamHI and a primer carrying a deletion or point mutation. The 3′-primary PCR products were amplified with p20-3′-KpnI and the corresponding mutated primer. Sequences of the mutated primers were (for p20-Δ(69-84)) 5′-AAGAGATGCAGTTAGG|GGGGTATGAGCTTTGACTTGAG-3′ and 5′-CTCAAGTCAAAGACTCATACCCC|CCTAACTGCATCTCTT-3′. Deletions are marked by a vertical line. Several point mutations were introduced in a similar way. Purified PCR products were cleaved with AgeI/KpnI (for fusions without the EGFP sequence) or BamHI/KpnI (with the EGFP sequence) and ligated into the appropriately precleaved pEGFP-C1-terminal protein fusion vector (Promega). E. coli DH5α clones were isolated, and the nucleotide sequences of the p20 mutants were verified by DNA sequencing.
Analyses of Mutant p20 Proteins from HeLa Cells-HeLa cells were grown to semiconfluence in Dulbecco's modified Eagle's medium (23). About 10^7 cells were suspended in 0.5 ml of RPMI medium without glutamine or dye (Life Technologies, Inc.), and 5 µg of different plasmid constructs with 100 µg of autoclaved salmon sperm DNA were added and electroporated in 4-mm cuvettes with a Bio-Rad gene pulser at 210 V and 960 µF at room temperature. The cells were then incubated at 37°C in Dulbecco's modified Eagle's medium on plates for 24 h. After washing in phosphate-buffered saline without Mg2+ and Ca2+, the cells were either analyzed by UV microscopy or lysed for 10 min in 25 mM Tris-phosphate, pH 8.0, 2 mM EDTA, 2 mM dithiothreitol, 10% glycerol, 1% Triton X-100. After centrifugation at 13,000 × g for 10 min, the lysates were used for electrophoretic mobility shift assays (EMSAs) as described (16).
Estimation of the Apparent Molecular Size of the 5′-(CGG)17-3′ Oligodeoxyribonucleotide Complex with p20-A published method (24) uses the dependence of the mobility of a complex on gel concentration to estimate the molecular size of the complex. Under native conditions, the mobility of larger complexes is more sensitive to an increase in gel concentration than that of smaller ones. We used ovalbumin, catalase, and bovine serum albumin monomers, dimers, and trimers as markers in 4-10% polyacrylamide gels in 0.5× TBE (45 mM Tris, 45 mM borate, 1 mM EDTA, pH 8.0) and determined the relative electrophoretic mobility of the p20-5′-(CGG)17-3′ complex in the same gel. For Rf determination, the part of the gel with the marker proteins was stained with Coomassie Blue, and the EMSA part with the radiolabeled oligodeoxyribonucleotide was dried and exposed to x-ray film (Eastman Kodak Co., X-OMAT).
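The relative mobility approach described above can be sketched numerically. The following is a minimal illustration and not the authors' analysis: it assumes a Ferguson-type treatment in which log10(Rf) falls linearly with gel concentration, takes the negative slope as a retardation coefficient, and calibrates that coefficient linearly against marker size. The marker sizes, the synthetic mobility values, and all function names are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def retardation_coefficient(gel_percents, rf_values):
    """Negative slope of log10(Rf) versus gel concentration (assumed model)."""
    slope, _ = fit_line(gel_percents, [math.log10(rf) for rf in rf_values])
    return -slope


def estimate_size_kda(gel_percents, markers, unknown_rf):
    """Calibrate retardation coefficient against marker size, then invert."""
    sizes = [size for size, _ in markers]
    coeffs = [retardation_coefficient(gel_percents, rf) for _, rf in markers]
    slope, intercept = fit_line(sizes, coeffs)
    unknown_coeff = retardation_coefficient(gel_percents, unknown_rf)
    return (unknown_coeff - intercept) / slope


def synthetic_rf(size_kda, gel_percents):
    """Hypothetical mobilities generated from an assumed linear calibration."""
    coeff = 0.02 + 0.0005 * size_kda
    return [10 ** (-coeff * t) for t in gel_percents]


# Gel concentrations span the 4-10% range named in the text.
GELS = [4, 6, 8, 10]
# Approximate marker sizes in kDa, for illustration only.
MARKER_SIZES = [45, 66, 132, 198, 240]
MARKERS = [(size, synthetic_rf(size, GELS)) for size in MARKER_SIZES]
```

Calling `estimate_size_kda(GELS, MARKERS, rf_of_complex)` with measured Rf values of the complex at each gel concentration would return its apparent size in kDa under these assumptions; real data would scatter around the fitted lines.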
Electrophoretic Mobility Shift Assay-In vitro binding of proteins to radiolabeled double-stranded DNA was performed as described previously (16). Fragment 823 from the human 28 S rRNA gene, which contains an interrupted 5′-(CGG)n-3′ repeat, was used (5′-GCG CGG GTC GGG GCG GCG GCG GCG GCG GCG GTG GCG GCG GCG GCG GGG GCG GCG GGA CCG-3′, double-stranded; nucleotides 8758-8817, GenBank accession number U13369; Ref. 25), together with a mutated version in which the CGG triplet tracts were interrupted by AGGs (5′-GCG CGG GTC GGG GCG GAG GCG GAG GCG GAG GTG GCG GAG GCG GAG GGG GCG GAG GGA CCG-3′, double-stranded). Five fmol of labeled DNA were incubated with 1 µg of protein from cellular extracts. In control experiments, 0.5 pmol of competitor DNA was used. In supershift experiments, 1:10 or 1:100 dilutions of rabbit anti-p20 polyclonal antiserum were added. The reaction products were resolved on a 5% polyacrylamide gel in 0.5× TBE buffer.
Binding of proteins to DNA containing longer repeats was investigated under similar conditions. Five fmol of the radiolabeled XbaI-HindIII fragment of plasmid pRW3316 (20) containing 81 interrupted repeats ((CGG)11AGG(CGG)60CAG(CGG)8) were incubated with either 0.5 or 3 µg of total protein from crude cellular extracts. Competitor DNA was used as described. The products were separated on a 5% polyacrylamide gel and exposed to x-ray film.
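The difference between the probes used above can be made concrete by counting CGG units in the quoted sequences. The sketch below is only an illustration: the sequences are copied from the text, whereas the function names and the definition of a "tandem run" (uninterrupted, directly adjacent CGG units on the listed strand) are assumptions introduced here.

```python
import re

# Sequences as printed in the text (top strand, 5' to 3'), spaces removed.
FRAGMENT_823 = (
    "GCG CGG GTC GGG GCG GCG GCG GCG GCG GCG "
    "GTG GCG GCG GCG GCG GGG GCG GCG GGA CCG"
).replace(" ", "")
FRAGMENT_823_MUT = (
    "GCG CGG GTC GGG GCG GAG GCG GAG GCG GAG "
    "GTG GCG GAG GCG GAG GGG GCG GAG GGA CCG"
).replace(" ", "")
# Repeat tract of the pRW3316 fragment: (CGG)11 AGG (CGG)60 CAG (CGG)8.
PRW3316_TRACT = "CGG" * 11 + "AGG" + "CGG" * 60 + "CAG" + "CGG" * 8


def count_unit(seq, unit="CGG"):
    """Number of occurrences of the unit in any reading frame."""
    return seq.count(unit)


def longest_tandem_run(seq, unit="CGG"):
    """Length, in units, of the longest uninterrupted tandem run."""
    runs = re.findall(f"(?:{re.escape(unit)})+", seq)
    return max((len(run) // len(unit) for run in runs), default=0)
```

Under this definition the wild type 823 fragment contains a longest uninterrupted run of six CGG units, whereas the AGG-interrupted version contains no two adjacent CGG units. The pRW3316 tract comprises 81 triplets with a longest uninterrupted run of 60.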
Quantitative RT-PCR-A total of 10^7 HeLa cells was electrotransfected with 5 µg of pCMV-p20, expressing the wild type p20 from the CMV immediate early promoter, or with 5 µg of pCMV-p20-Δ(69-84), an amino acid 69-84 deletion mutant of p20 that lacks DNA binding activity. Total RNA was isolated from the cells 24 h posttransfection (RNeasy Mini Preparation, Qiagen), and 100 ng of RNA were analyzed by quantitative RT-PCR using primers for the FMR1 mRNA 5′-UTR (5′-AGCGCCCGCAGCCCACCTCTCGG-3′, sense) and exon 5 (5′-TCATTGTAAGTTGCATCACATGCTGC-3′, antisense). Amounts of 2.5-20 fg of an artificial, nonhomologous DNA fragment that is amplified by the same primer pair were used as internal standard. This protocol was reported to minimize errors (26). RT-PCR was performed using AMV reverse transcriptase and Tfl DNA polymerase in the same tube (Access, Promega). In parallel, expression of the β-actin gene was analyzed as a control of RNA quantity and quality by amplification with a second primer pair (5′-CTCCCTGGAGAAGAGCTACGAG-3′, sense, exon 4; 5′-GATGGAGTTGAAGGTAGTTTCGTG-3′, antisense, exon 5). RT-PCR was carried out with 150 ng of each primer, 1 mM Mg2+, 0.2 mM of the four dNTPs in the presence of 1 µCi of [α-32P]dCTP and 2.5 units of both enzymes in AMV/Tfl buffer (Promega) in a volume of 25 µl by using a DNA Thermo Cycler (Perkin-Elmer Cetus). Temperature cycling was as follows: 48°C for 30 min; 35 cycles at 94°C for 30 s, 63°C for 1 min, and 68°C for 45 s; and 68°C for 5 min, allowing sequential reverse transcription and PCR without interruption. Aliquots of 6 µl of the PCR reaction products were separated on a 4% polyacrylamide gel and analyzed by using a phosphorimager.
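Quantitation against a co-amplified internal standard can be illustrated as follows. This is a hedged sketch of one common way to evaluate such a dilution series, not necessarily the procedure of Ref. 26: it assumes that the logarithm of the standard-to-target signal ratio is linear in the logarithm of the standard amount, and it reads off the amount at which both signals would be equal. All signal values and function names are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def equivalence_amount(standard_fg, standard_signal, target_signal):
    """Standard amount (fg) at which standard and target signals are equal."""
    xs = [math.log10(amount) for amount in standard_fg]
    ys = [math.log10(s / t) for s, t in zip(standard_signal, target_signal)]
    slope, intercept = fit_line(xs, ys)
    # log10(ratio) = 0 at the equivalence point.
    return 10 ** (-intercept / slope)


def fold_reduction(control_equivalent, test_equivalent):
    """Ratio of target abundance in control versus test sample."""
    return control_equivalent / test_equivalent


# Standard added in 2-fold steps over the 2.5-20 fg range given in the text.
STANDARD_FG = [2.5, 5.0, 10.0, 20.0]
# Hypothetical phosphorimager signals in arbitrary units.
STANDARD_SIGNAL = [250.0, 500.0, 1000.0, 2000.0]
TARGET_SIGNAL_MUTANT = [1000.0] * 4
TARGET_SIGNAL_WILD_TYPE = [400.0] * 4
```

Comparing `equivalence_amount` for RNA from mutant-transfected and wild type-transfected cells with `fold_reduction` gives the relative change in target abundance. The β-actin signal would serve as a separate check that equal amounts of intact RNA entered each reaction.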
Binding of the p20-GFP Fusion Product to Human Chromosomes-Binding targets of p20 on human chromosomes were identified by expressing the p20 protein as a C-terminal fusion product to the EGFP in HeLa cells: 20 µg of pEGFP-p20 were electrotransfected into 10^7 cells. Cells were harvested 24 h later and lysed in 50 µl of lysis buffer (see above). Crude lysates were directly incubated with spreads of human chromosomes that were prepared as described (27). Freshly prepared chromosome spreads on microscope slides were incubated with 20 µl of crude extract diluted 10-fold in gel shift reaction buffer (16) for 30 min at room temperature. Metaphase spreads were analyzed in an Olympus BH-2 fluorescence microscope. Individual human chromosomes were identified by Giemsa trypsin G-banding following fluorescent in situ binding.
Analyses of the p20 Sequence in Fragile X and Healthy Individuals-Genomic DNA was isolated from whole blood of FraX, premutation, and unaffected individuals, and the translated p20 region was PCR-amplified by using the primers 5′-CAAGGAGTGGACCTGATTGCGA-3′ (nucleotides 10,143-10,164 in the genomic p20 sequence; GenBank accession number AF094481) and 5′-CTCCACATTTATCTTGATCACAA-3′ (nucleotides 10,926-10,904). Reaction conditions were as follows: 0.5 units of Taq polymerase (Promega), 0.24 mM each of the dNTPs, 1.8 mM MgCl2, 10 pmol of each primer, 200 ng of genomic DNA. The mixture was denatured at 95°C for 5 min followed by 30 cycles of 30 s at 95°C, 30 s at 55°C, and 45 s at 72°C in a Perkin-Elmer Cetus Gene Amp PCR system 9600. Products were elongated for 4 min at 72°C, gel-purified, and cloned into the pGEM-T vector (Promega). After clone isolation from E. coli strain XL1-blue, the p20 coding sequence was determined in several clones.
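The primer coordinates and cycling conditions quoted above allow two simple consistency checks, sketched below. The primer sequences, coordinates, and step times are taken from the text; the helper names are hypothetical, and the time calculation deliberately ignores ramping between temperatures, which depends on the instrument.

```python
# (sequence, first coordinate, last coordinate) as given in the text.
FORWARD = ("CAAGGAGTGGACCTGATTGCGA", 10143, 10164)
REVERSE = ("CTCCACATTTATCTTGATCACAA", 10926, 10904)


def span(start, end):
    """Inclusive number of nucleotides between two coordinates."""
    return abs(end - start) + 1


def amplicon_length(forward, reverse):
    """Product length from the 5' coordinate of each primer."""
    return span(forward[1], reverse[1])


def hold_time_s(initial_s, cycles, steps_s, final_s):
    """Programmed hold time in seconds, excluding temperature ramping."""
    return initial_s + cycles * sum(steps_s) + final_s


# 95 C for 5 min; 30 cycles of 30 s, 30 s, 45 s; final elongation of 4 min.
GENOMIC_PCR_SECONDS = hold_time_s(5 * 60, 30, (30, 30, 45), 4 * 60)
```

Each primer length matches its stated coordinate range (22 and 23 nucleotides), the coordinates imply a 784-base pair product, and the listed steps add up to 61.5 min of programmed hold time.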

RESULTS
Nuclear Localization and 5′-(CGG)n-3′ Binding of the p20 Protein-Because the p20 protein had been isolated from nuclear extracts of human cells (16, 17), it was likely a nuclear protein. Moreover, a putative NLS was detected in the amino acid sequence of p20. More direct experiments were devised to investigate the subcellular localization of p20. Amino acid comparisons between the p20 protein and the SV40 large T antigen, nucleoplasmin, or the nuclear protein N1 (18, 19) render the sequence domain between amino acids 80 and about 84 a candidate for an NLS (Fig. 1a). Several p20 mutants were created in fusion constructs containing the gene for GFP with enhanced activity from A. victoria (pEGFP). These pEGFP-p20 variants with deletions (Δ) or point mutations in the p20 coding region were electroporated into HeLa cells in culture. At 24 h after transfection, GFP fluorescence was determined by UV light microscopy (Fig. 1b). Deletion mutants (Δ) impairing the domain of amino acids 80-84 in the p20 protein eliminated the transport to the nucleus that is seen with the wild type protein, as evidenced by homogeneous fluorescence throughout the cell. The deletion of amino acids 95-167 had a weak effect on nuclear transport. Selected point mutations within the critical amino acid 69-84 region had no influence on nuclear transport (data not shown).
EMSAs using crude extracts of wild type p20- or mutant-transfected HeLa cells and 32P-labeled 5′-(CGG)17-3′ oligomer as the probe (Fig. 1c) revealed that deletions of amino acids 14-94, 69-84, 69-71, or 95-167 in the p20 protein led to very poor binding, if any at all, to the specific oligomer. Deletion of amino acids 80-84 reduced oligomer binding of the p20 protein. Weak background binding in the "negative" controls was probably due to endogenously synthesized p20 in HeLa cells. The formation of the cI complex (16) was competed by unlabeled 5′-(CGG)17-3′ competitor. The data presented in this section demonstrate that the segment between amino acids 80 and 84 is essential and possibly sufficient for the nuclear localization of the p20 protein. The domain between amino acids 95 and 167 could also be part of the NLS. DNA binding of the p20 protein requires at least two of its domains, one between amino acids 69 and 71 and another one between amino acids 95 and 167. Thus, the two activities of the p20 protein, nuclear localization and DNA binding, reside in two different, possibly partly overlapping, regions of the amino acid sequence.
Specific Inhibition of the FMR1 Promoter by Overexpression of the p20 Protein in HeLa Cells-The possibility exists that the p20 protein, due to its specific binding to the 5′-(CGG)n-3′ repeat, exhibits a direct effect on the FMR1 promoter. When the p20 protein is overexpressed under the control of the immediate early CMV promoter after cotransfection of the pCMV-p20 construct with the pFMR1-luc plasmid into HeLa cells, the FMR1 promoter is inhibited (Fig. 2, pFMR1-luc). Upon removal of the endogenous 5′-(CGG)16-3′ repeat from the 5′-UTR of the FMR1 promoter, the 5-fold repression by p20 overexpression is decreased to less than 2-fold (pFMR1-ΔRep-luc). The residual inhibition may be due to the more than 15 5′-CGG-3′ triplets in the 100 base pairs flanking the deleted 5′-(CGG)16-3′ repeat. In contrast, when a 5′-(AGG/CAG)-3′-interrupted 5′-(CGG)n-3′ repeat of n = 160 is inserted 56 base pairs downstream of the endogenous 5′-(CGG)16-3′ repeat, inhibition by overexpression of p20 is about 10-fold (pFMR1-NH08-luc). The activity of a luciferase gene construct under the control of the late E2A promoter of adenovirus type 2 (Fig. 2, pAd2E2AL-luc), which lacks 5′-(CGG)n-3′ repeats, is not affected by the overexpression of the p20 protein in HeLa cells. These results demonstrate that the activity of the FMR1 promoter can be inhibited by the p20 protein in human HeLa cells. The extent of inhibition may depend on the length of the 5′-(CGG)n-3′ repeat.
FIG. 1. Analysis of DNA-protein binding and subcellular localization of the p20 protein and its mutants. a, alignment of amino acid sequences in NLSs from the SV40 large T antigen, nucleoplasmin, nuclear protein N1, and p20 reveals a possible NLS region. Residues that have been shown to be essential for nuclear localization (18, 19) are shaded in gray. b, the intracellular localization of p20 variants was investigated by electrotransfection and transient expression of pEGFP-p20 vector variants harboring deletions (-Δ) as indicated in the p20 coding region fused to EGFP. Transfected HeLa cells were fixed, and EGFP fluorescence and phase contrast were analyzed in a Zeiss laser scan microscope. wt, wild type p20CGGBP. Sequence coordinates were those published previously (16). c, the DNA binding capacity of p20 mutants to a 5′-(CGG)17-3′ double-stranded ((CGG)17 ds) oligomer was analyzed by mobility shift assays. Wild type p20 or mutants were transiently overexpressed in HeLa cells using pCMV-p20 constructs, and crude cell extracts were incubated with the 32P-end-labeled DNA probe prior to analyses by electrophoresis on a 5% polyacrylamide gel. Note the weak background binding (CI, complex I) of endogenous p20 in the control lane (EGFP), which contains crude extracts from cells overexpressing only EGFP.
The p20 Protein Decreases the Activity of the Endogenous FMR1 Gene-Because we found repression of the FMR1 promoter by p20 overexpression in cotransfection experiments, we also investigated whether elevated levels of p20 protein had an effect on the transcription of the endogenous FMR1 gene. HeLa cells were transfected with plasmids overexpressing the p20 wild type protein or, as a control, a mutant form that could no longer bind to DNA. By functional analyses using gel shift assays, we estimated the p20 binding levels in transiently pCMV-p20-transfected cells to be about 4-5-fold higher than in mutant-transfected or nontransfected cells that only express the endogenous p20 protein (data not shown). Subsequently, total RNA was isolated from the cells, and the FMR1 mRNA levels were analyzed by radioactive quantitative RT-PCR followed by polyacrylamide gel electrophoresis (Fig. 3). Different amounts of an artificial internal standard (Fig. 3, R) were added and amplified by the same primer pair as the FMR1 cDNA (FMR1). As a further control, β-actin (Fig. 3, β-act) mRNA was amplified in the same reaction.
Two alternatively spliced FMR1 mRNA products spanning exons 1-5 were detected in HeLa cells. When RNA from the lymphoblastoid cell line TR91.0042 (TR42), which had been derived from a fragile X patient, was analyzed, no FMR1 expression was detectable. Because the relative quantities of RT-PCR products reflect the representation of transcripts in the initial RNA pool (26), phosphorimager quantitations of the FMR1 products compared with the product from the internal standards provided an estimate of the relative amounts of FMR1 mRNA in the cells. Our data indicate that the FMR1 mRNA is reproducibly underrepresented by a factor of 2-3 in cells that overexpress wild type p20 when compared with FMR1 mRNA levels in mutant p20-transfected cells. These data support the notion that p20 can play a role in the regulation of the transcription of the endogenous FMR1 gene.
Evidence for the Formation of Oligomers by the p20 Protein with 5′-(CGG)n-3′-The degree of inhibition of the FMR1 promoter by the p20 protein seems to be dependent on the length of the repeat. Therefore, we have applied the relative mobility method (24) to test whether more than one p20 molecule can bind to a 5′-(CGG)n-3′ stretch at the same time. The estimated molecular size of the cI complex (16) of 105 kDa (Fig. 4a) suggests that the 36-kDa 5′-(CGG)17-3′ DNA is indeed occupied by several p20 molecules. Moreover, EMSA experiments with longer 5′-(CGG)81-3′ repeats (20) have revealed a ladder of retarded bands supporting this hypothesis (Fig. 4b).
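The size estimate translates into an approximate number of bound p20 molecules by simple arithmetic. The sketch below assumes that the mass of the complex is the sum of the DNA and a whole number of p20 monomers of nominally 20 kDa each. The function name is hypothetical, and the calculation is an illustration rather than part of the reported analysis.

```python
def bound_protein_copies(complex_kda, dna_kda, protein_kda):
    """Protein copies per complex, assuming simple additivity of masses."""
    return (complex_kda - dna_kda) / protein_kda


# Values from the text: 105-kDa complex, 36-kDa DNA, 20-kDa p20.
P20_COPIES = bound_protein_copies(105.0, 36.0, 20.0)
```

With the values from the text this gives about 3.45, that is, roughly three to four p20 molecules per 5′-(CGG)17-3′ duplex, subject to the uncertainty of the gel-based size estimate.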
FIG. 2. Inhibition of FMR1 promoter activity by the overexpression of p20 in human cells. One or 3 µg of expression vector pCMV-p20 per 10^7 HeLa cells were cotransfected with pFMR1-luc expressing luciferase under the control of an FMR1 promoter fragment harboring a 5′-(CGG)16-3′ repeat in the 5′-UTR. Relative luciferase activities were measured after lysing the cells. The values were standardized to the activity obtained using pCMV-EGFP as a control and normalized per µg of pFMR1-luc DNA construct transfected. Moreover, constructs were transfected in which the 5′-(CGG)16-3′ repeat was deleted (pFMR1-ΔRep-luc). In another construct, a second repeat, 5′-(CGG)160-3′, was inserted downstream of the endogenous repeat in the 5′-UTR (pFMR1-NH08-luc). As a control, pAd2E2AL-luc expressing luciferase under the control of the Ad2 E2A late promoter was transfected.
FIG. 3. Inhibition of the expression of the endogenous FMR1 gene by p20. HeLa cells were transfected with either pCMV-p20 or pCMV-p20-Δ(69-84), a mutant devoid of DNA binding capacity. Total RNA from these cells (100 ng) was subsequently used for quantitative RT-PCR analysis. Primers were designed to amplify a part of the FMR1 mRNA spanning exons 1-5 and an artificial reference probe (R). Amounts of 2.5-20 fg of reference probe were added before RT-PCR in 2-fold dilution steps. The two FMR1 products were due to alternative splicing. With a second primer pair, a fragment of the β-actin mRNA was amplified as an internal standard for RNA quantities and quality (bottom panel). Quantitation using a phosphorimager revealed a 2-3-fold reduction of FMR1 mRNA in wild type p20-transfected HeLa cells. In TR42 cells from a fragile X individual, no FMR1 transcription was detectable. In that lane (TR 42), no reference DNA was used.
FIG. 4. Several p20 molecules can bind to longer stretches of 5′-(CGG)n-3′ repeats. a, approximate determination of the molecular size of the p20 complex cI with 5′-(CGG)17-3′ double-stranded oligonucleotide by mobility shift assay. The inverse correlation of the mobility of native molecules with acrylamide concentration (24) was exploited to quantitate repeat-bound p20 molecules. An approximately linear correlation with the molecular size of different reference proteins (filled circles) was found. The molecular size of complex cI was estimated by fitting the linear regression curve to the observed mobility (cross). Free 5′-(CGG)16-3′ double-stranded oligonucleotide mobility dependence on gel concentration is also plotted versus its known molecular size (open circle). b, mobility shift assay with cloned repeat DNA XH16 from the human FMR1 promoter containing 81 interrupted repeats; *, (CGG)11AGG(CGG)60CAG(CGG)8 (20). Amounts of 0.5 or 3 µg of crude HeLa extracts were added, and specificity of binding was shown by the addition of unlabeled competitor 5′-(CGG)17-3′ repeat DNA.
Specific Binding of the p20-EGFP Fusion Protein to the Telomeric Regions on the Short Arms of Human Chromosomes 13, 14, 15, 21, and 22-Does the EGFP-p20 fusion protein overexpressed in HeLa cells bind directly to other 5′-(CGG)n-3′-rich regions on human chromosomes? As described under "Experimental Procedures," extracts from HeLa cells that overexpress this fusion protein were directly added to and reacted with spreads of human chromosomes that were subsequently photographed under UV light. Green fluorescence reveals a specific weak chromosome banding. Most notably, the EGFP-p20 fusion protein binds specifically to the short arms of chromosomes 13, 14, 15, and 21 and more weakly to that of chromosome 22 (Fig. 5, a and c). These chromosomes have been identified by their acrocentric morphology and their specific Giemsa banding patterns (Fig. 5, b and c). Binding to specific chromosomes, although less well identified, has also been observed with metaphase spreads from Syrian hamster and fish cells and more weakly with those from mouse cells (data not shown). The specificity of fluorescent in situ binding is supported by successful competition experiments. The simultaneous addition of 5′-(CGG)17-3′ oligomers resulted in the complete loss of detectable binding, whereas mutated oligomers did not act as competitors (data not shown). The fluorescent in situ binding signals were eliminated by DNase but were resistant to RNase treatment. Thus, DNA appears to be the target for p20 protein binding on the chromosomes.
The genes for the large ribosomal RNA are located in large tandem repeats in the acrocentric regions of the mentioned human chromosomes (28, 29). As documented by EMSA experiments using oligodeoxyribonucleotides from the 28 S rDNA (Fig. 5d), p20 can bind specifically to the 5′-(CGG)n-3′ repeats in this gene flanked by 28 S structural sequences (25).
In FraX Patients, the p20 Sequence Is Not Mutated-From the peripheral white blood cells of three healthy and five full expansion FraX males and their premutation mothers, part of the 5′-UTR and the translated region of the p20 gene were PCR-amplified and cloned, and the nucleotide sequences of these PCR products were determined (see under "Experimental Procedures"). Mutations in the p20 coding region of FraX or premutation individuals were not detected. These results render a direct role of mutations in the coding part of the p20 gene in the causation of the FraX syndrome unlikely.
In addition, we prepared nuclear extracts from activated lymphocytes from FraX, premutation, or unaffected individuals. The results of EMSA experiments revealed that proteins from each group of probands had the same binding activity to the 5′-(CGG)17-3′ oligodeoxyribonucleotide (16, 17). Apparently, active p20 is expressed in the lymphocytes of FraX patients and premutation carriers.

DISCUSSION
The p20 protein is a nuclear protein that was originally isolated from human HeLa cells and is abundant in mammalian species (16, 17). The p20 protein binds specifically to 5′-(CGG)n-3′ repeats in the (human) genome. The p20 protein shows no homology to yeast proteins; thus, informative biochemical analogies are not available. We have started to investigate possible functions of the p20 protein: it recognizes specific trinucleotide repeats and binds at such sites in the genome. The data adduced in this series of investigations support the following additional conclusions.
The p20 protein is located in the nucleus and carries a nuclear localization signal between amino acids 80 and 84, the removal of which leads to the homogeneous distribution of the p20 protein over the entire cell. The segment between amino acids 95 and 167 might weakly contribute to the NLS activity. The p20 regions between amino acids 69 and 71 and between amino acids 95 and 167 are responsible for 5′-(CGG)n-3′ binding.
Apparently due to its capacity to bind to the 5′-(CGG)n-3′ repeat in the promoter of the human FMR1 gene, the p20 protein, at least when overexpressed in human cells, is capable of reducing the activity of this promoter, particularly when the repeats are expanded. The p20 protein probably forms oligomers in its association with 5′-(CGG)n-3′ repeats. In this way, the effect of repeat length on p20 protein binding and repression might be explained.
We have also tested the effect of p20 overexpression on the amount of endogenous FMR1 mRNA present in human HeLa cells. The observed reduction in the transcription of the FMR1 gene further supports the notion that p20 acts as a regulator of the FMR1 and other promoters due to its capability to bind to 5′-(CGG)n-3′ repeats.
In FraXA patients or in premutation individuals, mutations in the translated sequence of the p20 gene have not been found. Thus, it is unlikely that alterations in the p20 gene could play a direct role in the causation of the FraXA phenotype.
FIG. 5. Specific binding of the p20-EGFP fusion product in HeLa cells to the short arms of human chromosomes 13, 14, 15, 21, and 22. a, fluorescent in situ binding experiments were performed with chromosome preparations from primary human lymphocytes and crude extracts from cells transiently expressing the pEGFP-p20 construct. b, Giemsa trypsin G-banding of the same metaphase spread as shown in a. c, fluorescent and G-banded acrocentric chromosomes from the same experiment are presented separately to identify the chromosomes that reveal specific binding of EGFP-p20. d, demonstration of p20 binding by EMSA to a fragment (823) of the 28 S rRNA gene harboring 5′-(CGG)n-3′ repeats. Incubation of the 823 fragment (left panel) with crude extracts prepared from HeLa cells resulted in the formation of several complexes, one of which (arrowhead) was enhanced when extracts from cells that overexpressed p20 were used (pCMV-p20). This complex was supershifted by anti-p20 antiserum (α-p20) and was specifically competed by 5′-(CGG)17-3′ double-stranded DNA. The control DNA, in which CGG triplet tracts were interrupted by AGGs (823 mut), did not form specific complexes that can be recognized by antibodies against p20 or be competed by a 5′-(CGG)17-3′ oligodeoxyribonucleotide (right panel).
The p20 protein can bind specifically to the short arms of the acrocentric human chromosomes 13, 14, 15, 21, and 22, presumably to a 5′-(CGG)n-3′-rich region associated with the 28 S ribosomal RNA genes. It will be interesting to investigate which other promoters can be regulated by p20.
Our data imply a role of p20 in the inactivation of the FMR1 gene that occurs following 5′-(CGG)n-3′ repeat expansion. A stepwise amplification of the repeat might result in the binding of an increasing number of p20 molecules to the 5′-region of the FMR1 gene, which tunes down gene activity progressively and finally results in the silencing of FMR1 expression. This hypothesis predicts that functional p20 is found in FraX patients, as demonstrated in this report. We have not observed differences in expression levels of p20 between healthy individuals and those affected by the fragile X syndrome.