Identification and Characterization of Nucleolin as a c-myc G-quadruplex-binding Protein*

c-myc is a proto-oncogene that plays an important role in the promotion of cellular growth and proliferation. Understanding the regulation of c-myc is important in cancer biology, as it is overexpressed in a wide variety of human cancers, including most gynecological, breast, and colon cancers. We previously demonstrated that a guanine-rich region upstream of the P1 promoter of c-myc that controls 85–90% of the transcriptional activation of this gene can form an intramolecular G-quadruplex (G4) that functions as a transcriptional repressor element. In this study, we used an affinity column to purify proteins that selectively bind to the human c-myc G-quadruplex. We found that nucleolin, a multifunctional phosphoprotein, binds in vitro to the c-myc G-quadruplex structure with high affinity and selectivity when compared with other known quadruplex structures. In addition, we demonstrate that upon binding, nucleolin facilitates the formation and increases the stability of the c-myc G-quadruplex structure. Furthermore, we provide evidence that nucleolin overexpression reduces the activity of a plasmid-borne c-myc promoter, presumably by inducing and stabilizing the formation of the c-myc G-quadruplex. Finally, we show that nucleolin binds to the c-myc promoter in HeLa cells, which indicates that this interaction occurs in vivo. In summary, nucleolin may induce c-myc G4 formation in vivo.

The c-myc proto-oncogene is a key component of normal cell growth and differentiation. Normally, this gene is subject to tight transcriptional regulation; however, aberrant c-myc expression is a common feature in a number of human malignancies (1)(2)(3). In fact, it is estimated that one-seventh of cancer deaths in the United States are associated with alterations in the c-myc gene or its expression (4). Dysregulation of the c-myc proto-oncogene can arise through a variety of mechanisms, including chromosomal translocation (5), gene amplification (6), and increased transcription (7)(8)(9)(10), as well as a higher rate of translation and enhanced protein stability (11)(12)(13). However, most often c-myc is activated indirectly through alterations in cell signaling that lead to an increase in c-myc transcription (14).
The mechanisms governing c-myc transcription are complex and involve the interaction of regulatory DNA elements with DNA-binding proteins (15,16). One important DNA element that has been shown to regulate c-myc expression is located -142 to -115 bp upstream of the P1 promoter (Fig. 1A), and it has been shown to control up to 90% of the total c-myc transcription (17,18). This DNA segment is highly sensitive to DNase I and S1 nucleases and is referred to as the nuclease-hypersensitive element (NHE) III 1 (19). The NHE III 1 consists of a purine-rich sequence that can equilibrate between transcriptionally active forms (duplex and single-stranded DNA) and a silenced four-stranded structure under physiological conditions in vitro (20). At the core of these four-stranded structures are guanine tetrads in which each guanine interacts with two other guanines by Hoogsteen hydrogen bonds to form cyclic arrangements of four guanines (Fig. 1B). In addition, monovalent cations such as K+ and Na+ stabilize these structures by intercalating between the G-tetrads and forming coordinative bonds with the guanine carbonyl groups. Furthermore, the structural transition of a G-rich region from typical B-DNA to an atypical G-quadruplex structure can be facilitated by negative superhelical stress in DNA (21).
There is evidence to suggest that G-quadruplexes exist in vivo, and it is believed that these structures act as signaling elements (22)(23)(24); however, direct evidence for the formation of these structures in vivo is still emerging. For example, the identification of antibodies and proteins that preferentially bind to, stabilize, unwind, or cleave G-quadruplexes provides evidence for their existence in vivo (25,26). In addition, recent reports have demonstrated that putative G-quadruplex motifs are highly prevalent in human promoter regions, as >40% of human gene promoters may contain at least one of these elements (27). G-quadruplex-containing promoters have been found to associate with nuclease-hypersensitive sites, suggesting that the formation of these structures may be favored in sequences dynamically equilibrating between duplex and G-quadruplex chromatin conformations in vivo (27). Furthermore, Quarfloxin, a ribosome biogenesis inhibitor currently in phase II clinical trials (Cylene Pharmaceuticals, San Diego), has been suggested to exert its cancer cell-specific apoptotic effects by directly binding to ribosomal DNA G-quadruplex structures to displace nucleolin (28). In this study, we sought to find a mammalian cell protein that specifically interacts with and modulates the function of the intramolecular c-myc G-quadruplex structure.
* This work was supported, in whole or in part, by National Institutes of Health
Nucleolin is a nucleolar phosphoprotein that is highly expressed in proliferating cells, known mainly for its role in ribosome biogenesis (29); however, nucleolin also functions in chromatin remodeling (30,31), transcription (32)(33)(34)(35)(36), G-quadruplex binding (36-38), and apoptosis (39). For example, LR1, a nucleolin-hnRNP D heterodimer transcription factor, has been reported previously to regulate c-myc transcription in B cell lymphomas by binding to a double-stranded DNA element upstream of the NHE III 1 region (35). Interestingly, LR1 has also been reported to bind to G-quadruplex structures (36,37); however, the interaction of this transcription factor or its components with the c-myc G-quadruplex structure has not been characterized previously. Here we describe the identification, purification, and characterization of nucleolin as a c-myc G-quadruplex-binding protein and report its effects on the formation of the c-myc G-quadruplex structure in vitro. In addition, our luciferase assay results show that overexpression of nucleolin can significantly inhibit c-myc promoter-driven transcription as measured by luciferase activity in MCF-10A cells. Furthermore, we provide evidence that nucleolin binds to the c-myc promoter in vivo.

EXPERIMENTAL PROCEDURES
Synthetic Oligonucleotides-The oligonucleotides used in the design of the G-quadruplex affinity columns were Biotin-Pu77-WT, 5′-Biotin-TTTTCTTTTCCCCCACGCCCTCTGCTTTGGGAACCCGGGAGGGGCGCTTATGGGGAGGGTGGGGAGGGTGGGGAAGG-3′, and Biotin-Pu77-MT, 5′-Biotin-TTTTCTTTTCCCCCACGCCCTCTGCTTTGGGAACCCGGGAGGGGCGCTTATGGGGAGGGTGAGGAGGGTGGGGAAGG-3′. Biotin-Pu77-WT is a biotinylated 77-mer containing the wild-type NHE III 1 sequence capable of forming the most biologically relevant intramolecular c-myc G-quadruplex structure. The Biotin-Pu77-MT sequence is identical to that of Biotin-Pu77-WT with the exception of a G-to-A substitution at position 62. Pu27 is a purine-rich 27-mer corresponding to the c-myc NHE III 1 sequence 5′-TGGGGAGGGTGGGGAGGGTGGGGAAGG-3′, and Pu47, 5′-CTATGTATACTGGGGAGGGTGGGGAGGGTGGGGAAGGTTAGCGGCAC-3′, is a purine-rich 47-mer encompassing the c-myc Pu27 sequence plus 10 nucleotides on each of its flanking sides. Its complement, the pyrimidine-rich Py47, has the sequence 5′-GTGCCGCTAACCTTCCCCACCCTC-
FIGURE 1. Promoter structure of the c-myc gene and scheme of its G-quadruplex structure. A, location of the NHE III 1 region within the c-myc promoter. Runs of guanines that can participate in G-quadruplex formation are underlined. B, scheme of a guanine-tetrad and a schematic of the c-myc G-quadruplex structure. Left, H-bonding pattern in a G-tetrad; center, schematic diagram of a G-tetrad; right, schematic representing a G-quadruplex structure that is found in the c-myc promoter region.
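The stated properties of these oligonucleotides (77-mer length, a single G-to-A substitution at position 62, and 10-nucleotide flanks around Pu27 in Pu47) can be checked directly from the sequences. The following is a minimal sketch; the helper names `differences` and `flank_lengths` are illustrative and not part of the original work, and the sequences are transcribed from the text with line-break hyphens removed.

```python
# Sequences transcribed from the text; each literal chunk follows the
# line breaks of the printed sequence so it can be compared by eye.
PU77_WT = ("TTTTCTTTTCCCCCACGCCCTCTGCTT"
           "TGGGAACCCGGGAGGGGCGCTTATGGGGAGGGTGG"
           "GGAGGGTGGGGAAGG")
PU77_MT = ("TTTTCTTTTCCCCCACGCCCTCTGCTTTGGGAAC"
           "CCGGGAGGGGCGCTTATGGGGAGGGTGAGGAGGGT"
           "GGGGAAGG")
PU27 = ("TGGGGAGGGTGGG"
        "GAGGGTGGGGAAGG")
PU47 = ("CTATGTAT"
        "ACTGGGGAGGGTGGGGAGGGTGGGGAAGGTTAGCG"
        "GCAC")


def differences(a, b):
    """Return (1-based position, base in a, base in b) for every mismatch."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


def flank_lengths(inner, outer):
    """Return the number of bases 5' and 3' of `inner` within `outer`."""
    start = outer.find(inner)
    if start < 0:
        raise ValueError("inner sequence not found in outer sequence")
    return start, len(outer) - start - len(inner)


if __name__ == "__main__":
    print(len(PU77_WT), len(PU77_MT))       # lengths of the two 77-mers
    print(differences(PU77_WT, PU77_MT))    # position and identity of the mutation
    print(flank_lengths(PU27, PU47))        # 5' and 3' flank lengths around Pu27
```

The same check also shows that Pu27 corresponds to the 3′-terminal 27 nucleotides of Pu77-WT.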
Plasmids-The reporter plasmid containing the NHE III 1 region of the human c-myc promoter (Del-4) linked to the firefly luciferase gene was kindly provided by Dr. Bert Vogelstein (The Johns Hopkins University) (39). The Renilla plasmid (pRL-TK) was used as an internal control (Promega, Madison, WI). The human enhanced green fluorescent protein-nucleolin plasmid (EGFP-Nuc) was kindly provided by Dr. Michael B. Kastan (St. Jude Children's Research Hospital) (40). The EGFP-control plasmid was constructed by deleting the nucleolin protein expression sequence from the EGFP-Nuc plasmid using the restriction enzymes EcoRI and KpnI. The HA-Sp1 plasmid was kindly provided by Dr. Scot W. Ebbinghaus (University of Arizona).
Purification of G-quadruplex-binding Proteins-Potential c-Myc G-quadruplex-binding proteins were purified from a 20-ml wet volume of HeLa cells obtained from the National Cell Culture Center. The HeLa whole-cell protein extract was prepared by resuspending the cell pellet in ice-cold lysis buffer containing 10 mM Tris·HCl, pH 7.5, 1 mM EDTA, 0.1 mM phenylmethylsulfonyl fluoride, 5 mM β-mercaptoethanol, 1 mM dithiothreitol, 0.5% CHAPS, and 10% glycerol. The cell suspension was then incubated on ice for 30 min, sonicated, and centrifuged for 30 min at 20,000 rpm at 4°C to remove the insoluble cell matter. The resulting whole-cell extract was applied to a Sephacryl S-400 gel (GE Healthcare) and eluted with Buffer B (25 mM Tris·HCl, pH 7.4, 50 mM NaCl, 0.5 mM MgCl2, 1 mM EDTA, 5 mM β-mercaptoethanol, 1 mM dithiothreitol in 10% glycerol) containing 100 mM NaCl. Protein-containing fractions were identified by the Bradford protein assay (Bio-Rad), pooled together, and applied to a heparin-agarose resin (GE Healthcare) to enrich for DNA-binding proteins. To identify proteins that bind nonspecifically rather than to the wild-type c-myc G-quadruplex, we first applied the heparin-purified proteins to the mutant affinity column Pu77-MT. This mutant affinity column contains a mutation that destabilizes the c-myc G-quadruplex structure (41). The Pu77-MT flow-through protein mixture was collected and subsequently applied to the wild-type affinity column Pu77-WT. Proteins that bound to either the Pu77-WT or the Pu77-MT column were eluted with a 0.1-2.0 M NaCl gradient in Buffer B. The protein mixtures corresponding to either the Pu77-WT- or the Pu77-MT-binding proteins were submitted to LC-MS/MS sequencing analysis at the University of Arizona Mass Spectrometry Consortium. Proteins that bound to Pu77-WT but not to Pu77-MT were identified for further investigation. This procedure was repeated to determine those proteins that reproducibly bound only to the wild-type sequence.
Purification of Nucleolin from HeLa Cells-A 20-ml wet volume of HeLa cells corresponding to ~2.4 g of total protein was used to obtain the HeLa whole-cell protein extract for the purification of nucleolin. Five chromatographic steps were used for the purification of nucleolin in the following order: Sephacryl S400 gel filtration, heparin-agarose resin, Q-Sepharose resin, SP-Sepharose resin, and Pu77-WT DNA affinity column. Proteins were eluted with a 0.1-2.0 M NaCl gradient in Buffer B, except for protein applied to the S400 gel column, which was eluted with Buffer B containing 100 mM NaCl. Protein fractions containing nucleolin were identified by Western blot analysis after each chromatographic fractionation, pooled together, and applied to the subsequent column. The purity of the protein fractions collected from the Pu77-WT affinity column was determined by running the nucleolin-containing protein fractions on a 10% SDS-PAGE gel, staining the protein bands on the gel with Coomassie Blue, excising each protein band, and sequencing them by LC-MS/MS.
Expression and Purification of Recombinant Nucleolin-Although we were successful in purifying nucleolin from HeLa cells, the purification process was time-consuming, and the total protein yield was low. Consequently, we resorted to using recombinant nucleolin, which conserves its RNA and DNA binding activity. Recombinant nucleolin was produced using the pNuc-1,2,3,4-RGG9 plasmid, which was generously provided by Dr. L. A. Hanakahi (42). pNuc-1,2,3,4-RGG9 carries human nucleolin residues 284-709, including all four RNA binding domains (RBDs) and the C-terminal domain, fused at the N terminus to Escherichia coli maltose-binding protein.
The maltose-binding protein-fused protein was purified on an amylose column (New England Biolabs), following the manufacturer's protocol. The protein was then dialyzed and concentrated in dialysis buffer (20 mM Tris·HCl, pH 7.4, 5 mM NaCl, 1 mM EDTA, in 50% glycerol). Protein concentration was determined by the Bradford protein assay (Bio-Rad).
telomeric      TTTAAGGGTTAGGGTTAGGGTTAGGG
c-myc 1:2:1    GGGGCGCTTATGGGGAGGGTGGGTAGGGTGGGTAAGGTGGGGAGGAG
c-myc 1:2:2    GGGGCGCTTATGGGGAGGGTGGGTAGGGTTGGGAAGGTGGGGAGGAG
c-myc 2:1:2    GGGGCGCTTATGGGGAGGGTTGGGAGGGTTGGGAAGGTGGGGAGGAG
c-myc 1:6:1    GGGGCGCTTATGGGGAGGGTTTTTAGGGTGGGGAAGGTGGGGAGGAG

Electrophoretic Mobility Shift Assay (EMSA)-Labeled oligonucleotides were generated by incubating the DNA or RNA oligos with [γ-32P]dATP and T4 polynucleotide kinase (Fermentas) and purified by electrophoresis on a 16% denaturing polyacrylamide gel. Binding of G-quadruplex DNA or RNA was carried out in 20-µl reactions containing 10 mM Tris·HCl, pH 7.4, 1 mM EDTA, 50 ng/µl poly(dI-dC), and RNase inhibitor (Ribolock, Fermentas). Glycerol (5%) was added to each EMSA reaction immediately before loading onto a 17 × 15-cm, 1.5-mm thick 4% nondenaturing polyacrylamide gel containing 0.5× TBE. Protein complexes were resolved by running the gel at 10 mA for 1 h at room temperature. Figure legends specify the amount or concentration of DNA/RNA, KCl, and protein used in each experiment. Complex formation was quantified by PhosphorImager analysis (Storm 820, GE Healthcare), and dissociation constant (Kd) values were calculated by plotting the fraction of bound DNA or RNA at each protein concentration. Kd values represent the protein concentration required to bind 50% of the total labeled DNA or RNA.
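As an illustration of the Kd definition used here (the protein concentration at which 50% of the labeled probe is bound), the sketch below computes the fraction bound from band intensities and reads off the 50% point by linear interpolation between the two bracketing titration points. The interpolation scheme, the function names, and the numbers in the usage example are assumptions made for illustration; they are not the authors' data or their exact procedure.

```python
def fraction_bound(bound_signal, free_signal):
    """Fraction of labeled probe in the shifted (protein-bound) band."""
    total = bound_signal + free_signal
    if total <= 0:
        raise ValueError("total signal must be positive")
    return bound_signal / total


def kd_at_half_bound(conc_nM, fractions):
    """Protein concentration (nM) at which half of the probe is bound.

    Uses linear interpolation between the two titration points that
    bracket a bound fraction of 0.5.
    """
    points = sorted(zip(conc_nM, fractions))
    for (c0, f0), (c1, f1) in zip(points, points[1:]):
        if f0 <= 0.5 <= f1 and f1 > f0:
            return c0 + (0.5 - f0) * (c1 - c0) / (f1 - f0)
    raise ValueError("50% binding is not bracketed by the data")


if __name__ == "__main__":
    # Hypothetical titration, for illustration only.
    conc = [0, 50, 100, 200, 400]
    frac = [0.0, 0.2, 0.4, 0.6, 0.8]
    print(kd_at_half_bound(conc, frac))
```

Interpolation is only meaningful when the titration actually spans 50% binding, which is why the function raises an error otherwise rather than extrapolating.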
DMS Footprinting-DMS footprinting of free Pu27 or recombinant nucleolin-bound Pu27 was performed. Nucleolin-bound Pu27 was first visualized by EMSA. Bands containing either the bound or unbound Pu27 DNA were excised and soaked in dimethyl sulfate solution (DMS, 1% in 50% ethanol) for 5 min at room temperature. All reactions were quenched with stop buffer (3 M β-mercaptoethanol:water:NaOAc; 1:6:7, v/v). DNA samples treated in-gel were extracted from the gel by crushing the gel and soaking the samples in 10 mM Tris·HCl, pH 7.4, overnight. Supernatant from these samples was transferred to a fresh Eppendorf tube to be ethanol-precipitated. After ethanol precipitation and piperidine cleavage, the reactions were separated on an analytical gel (16%) and visualized on a PhosphorImager. Maxam-Gilbert sequencing G + A and T + C reactions were carried out as described previously (43).
Filter Binding Assay-Filter binding assays were performed as described previously (44). In brief, a DEAE membrane was placed directly below the nitrocellulose membrane to trap any DNA not retained by the nitrocellulose. The two membranes were positioned on a 96-well dot-blot apparatus. The nitrocellulose membrane was treated with 0.5 M KOH for 10 min at 4°C and washed with 1× binding buffer (10 mM Tris·HCl, pH 7.4) prior to use. Increasing concentrations of the recombinant nucleolin protein and the indicated DNA substrate were mixed in 20 µl of buffer (10 mM Tris·HCl, pH 7.4) and incubated for 30 min at room temperature. A 15-µl aliquot was applied to a nitrocellulose filter under vacuum and washed twice with 200 µl of 1× binding buffer. The nitrocellulose and DEAE filters were dried, and the bound and unbound radioactivity were quantified by PhosphorImager analysis. The data were then analyzed by nonlinear regression using GraphPad Prism 4 software. The protein concentration at half-maximal binding yielded the apparent Kd value for the various DNA substrates.
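The nonlinear regression step can be sketched as a least-squares fit of a one-site hyperbola, bound = Bmax × [P] / (Kd + [P]), in which the apparent Kd is the protein concentration at half-maximal binding. The code below is a stand-in written in plain Python, not the GraphPad Prism routine used in the study; the function names, the log-spaced grid search over Kd, and the synthetic data in the usage example are all assumptions made for illustration.

```python
import math


def one_site(p, bmax, kd):
    """Binding predicted by a one-site hyperbola at protein concentration p."""
    return bmax * p / (kd + p)


def fit_one_site(conc, bound, rounds=8, points=51):
    """Least-squares fit of bound = bmax * P / (Kd + P).

    For a fixed Kd the best bmax has a closed form, so only Kd is
    searched, on a log-spaced grid that is narrowed in each round.
    Returns (bmax, kd).
    """
    def best_bmax(kd):
        s = [p / (kd + p) for p in conc]
        return sum(y * si for y, si in zip(bound, s)) / sum(si * si for si in s)

    def sse(kd):
        bmax = best_bmax(kd)
        return sum((y - one_site(p, bmax, kd)) ** 2 for p, y in zip(conc, bound))

    positive = [p for p in conc if p > 0]
    lo = math.log(min(positive) / 100.0)
    hi = math.log(max(positive) * 100.0)
    for _ in range(rounds):
        step = (hi - lo) / (points - 1)
        grid = [lo + i * step for i in range(points)]
        best = min(range(points), key=lambda i: sse(math.exp(grid[i])))
        lo = grid[max(best - 1, 0)]
        hi = grid[min(best + 1, points - 1)]
    kd = math.exp((lo + hi) / 2.0)
    return best_bmax(kd), kd


if __name__ == "__main__":
    # Synthetic, noise-free data generated from assumed parameters.
    conc = [10, 30, 100, 300, 1000, 3000]
    bound = [one_site(p, 0.9, 120.0) for p in conc]
    bmax, kd = fit_one_site(conc, bound)
    print(round(bmax, 3), round(kd, 1))
```

Searching Kd on a logarithmic scale keeps the fit stable when titrations span several orders of magnitude in protein concentration.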
CD Spectroscopy-Oligonucleotide stocks were diluted to 5 µM in 50 mM Tris·HCl, pH 7.4. Pu47 ss (single-stranded/unstructured c-myc NHE III 1 oligo) samples were incubated with either Tris·HCl, pH 7.4 buffer, protein dialysis buffer, or the specified amounts of recombinant nucleolin protein at room temperature for 30 min to reach equilibrium prior to CD spectroscopy. To assemble the Pu47 oligo into the c-myc G-quadruplex conformation, the oligo was heated at 95°C for 10 min and left to cool gradually to room temperature. The assembled G-quadruplex was then incubated with either protein dialysis buffer or recombinant protein at room temperature for 30 min prior to CD spectroscopy. CD spectra were recorded on a Jasco-810 spectropolarimeter (Easton, MD) at room temperature, using a quartz cell of 1-mm optical path length, an instrument scanning speed of 100 nm/min, and a response time of 1 s, over a wavelength range of 200-325 nm. The reported spectrum of each sample represents the average of three scans. The spectral contribution of buffers and proteins was subtracted as appropriate by using the software supplied with the spectrometer.
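The spectral processing described above (averaging three scans and subtracting the buffer or protein contribution) amounts to simple point-by-point arithmetic on a common wavelength grid. The sketch below illustrates that arithmetic only; it is not the instrument software, and the function names and the numbers in the usage example are hypothetical.

```python
def average_scans(scans):
    """Point-by-point mean of repeated scans on the same wavelength grid."""
    if len({len(s) for s in scans}) != 1:
        raise ValueError("all scans must have the same number of points")
    return [sum(values) / len(values) for values in zip(*scans)]


def subtract_baseline(sample, baseline):
    """Remove the buffer or protein contribution from a sample spectrum."""
    if len(sample) != len(baseline):
        raise ValueError("sample and baseline must have the same length")
    return [s - b for s, b in zip(sample, baseline)]


def peak_wavelength(wavelengths, signal):
    """Wavelength at which the signal is largest."""
    index = max(range(len(signal)), key=lambda k: signal[k])
    return wavelengths[index]


if __name__ == "__main__":
    # Hypothetical three-point spectra, for illustration only.
    wavelengths = [258, 260, 262]
    scans = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
    baseline = [0.5, 0.5, 0.5]
    corrected = subtract_baseline(average_scans(scans), baseline)
    print(corrected, peak_wavelength(wavelengths, corrected))
```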
Polymerase Stop Assay-Pu77-PS is the wild-type c-myc template for the polymerase stop assay. The P28 primer was synthesized, 5′-end-labeled, and purified as described previously (45). The primer was annealed to the template DNA by incubating them together in 1× annealing buffer (50 mM Tris·HCl, pH 7.4, 10 mM NaCl), heating to 95°C, and then slowly cooling to room temperature. DNA formed by annealing the primer to the template sequence was purified using gel electrophoresis on a 12% native polyacrylamide gel. The specific activity of the purified DNA was determined by TopCount (Packard Instrument Co.). The polymerase stop reaction contained annealed template, reaction buffer (10 mM MgCl2, 0.5 mM dithiothreitol, 0.1 mM EDTA, 1.5 µg/µl bovine serum albumin), 0.1 mM dNTP, and Taq DNA polymerase. The final mixture was incubated at 37°C for 1 h. The polymerase stop reaction was stopped by adding 2× stop buffer (10 mM EDTA, 10 mM NaOH, 0.1% xylene cyanole, 0.1% bromphenol blue in formamide solution) and loaded onto a 16% denaturing gel.
Imaging and Quantification-The dried gels were exposed on a phosphor screen. Imaging and quantification were performed using a PhosphorImager and ImageQuant 5.1 software from Amersham Biosciences.
Chromatin Immunoprecipitation-Chromatin immunoprecipitation (ChIP) assays were performed by using the chromatin immunoprecipitation kit as recommended by the manufacturer (Upstate). Briefly, HeLa cells were grown to confluency in 10-cm dishes and fixed with 1% formaldehyde at 22°C for 10 min. Fixation was stopped by adding glycine to the media to a final concentration of 0.125 M. Cells were scraped and pelleted. The HeLa cell pellet was then resuspended in SDS lysis buffer and incubated for 10 min on ice. The samples were sonicated on ice with a BioRuptor sonicator at high intensity 20 times for 15-s pulses with 1-min intervals between each pulse to an average DNA length of ~500 bp, which was confirmed by agarose gel electrophoresis and ethidium bromide staining. To reduce nonspecific binding, the chromatin solution was pre-cleared by the addition of salmon sperm DNA/protein A-agarose-50% slurry for 30 min. Pre-cleared chromatin was incubated with 4 µg of corresponding antibody or no antibody and rotated at 4°C for ~12-16 h. Antibodies used included nucleolin (NB600-241, Novus Biologicals), Sp1 (SC-59, Santa Cruz Biotechnology), and IgG (SC 2025, Santa Cruz Biotechnology).
Immunoprecipitation of the protein-DNA complex was performed as recommended by the manufacturer. The eluted DNA was dried and resuspended in 50 µl of TE buffer. Standard PCRs were performed using 2 µl of the immunoprecipitated DNA or 2 µl of 1:20 diluted input. After 25 cycles of amplification, PCR products were separated by electrophoresis through 1% agarose gels and visualized by ethidium bromide intercalation. The PCR primers used to amplify the region of the c-myc promoter containing the NHE III 1 were NHE-FW (5′-CTGCGATGATTTATACTCAC-3′) and NHE-BW (5′-CCAGACCCTCGCATT-3′).
Luciferase Reporter Assay-MCF-10A cells were seeded at a density of 7.5 × 10^5 cells per well in a 6-well plate in 2 ml of medium at 37°C with 5% CO2. Cells were grown overnight, and the medium was exchanged for fresh medium immediately before transfection. Cells were transfected with the corresponding reporter construct by using Lipofectamine 2000 as recommended by the manufacturer (Invitrogen). Luciferase activity was measured 24 h after transfection by using the luciferase assay system as recommended by the manufacturer (Promega). The results of the luciferase assay are presented as the average ± S.D. from three independent experiments after normalization against Renilla luciferase activities. The data are expressed as a percent of luciferase activity in control cells (100%).
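The normalization described above can be sketched as follows: each firefly reading is divided by the Renilla reading from the same well, the ratio is expressed as a percent of the control ratio, and the mean and standard deviation are taken across independent experiments. The function names and readings below are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the text does not state which form was used.

```python
from statistics import mean, stdev


def percent_of_control(firefly, renilla, ctrl_firefly, ctrl_renilla):
    """Renilla-normalized firefly activity as a percent of the control."""
    ratio = firefly / renilla
    control_ratio = ctrl_firefly / ctrl_renilla
    return 100.0 * ratio / control_ratio


def summarize(experiments):
    """Mean and sample S.D. of percent-of-control across experiments.

    Each experiment is a tuple:
    (firefly, renilla, control firefly, control renilla).
    """
    values = [percent_of_control(*e) for e in experiments]
    return mean(values), stdev(values)


if __name__ == "__main__":
    # Hypothetical luminometer readings, for illustration only.
    experiments = [
        (500.0, 100.0, 1000.0, 100.0),
        (600.0, 100.0, 1000.0, 100.0),
        (400.0, 100.0, 1000.0, 100.0),
    ]
    print(summarize(experiments))
```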

RESULTS
Identification of Nucleolin as a Potential c-Myc G-quadruplex-binding Protein-Potential c-Myc G-quadruplex-binding proteins were purified by using c-myc NHE III 1 DNA affinity column chromatography and identified by LC-MS/MS sequencing analysis. These proteins are listed in Table 2 and can be classified into four main categories as follows: 1) proteins involved in chromatin remodeling and transcription; 2) proteins involved in telomere maintenance such as hnRNP A1; 3) proteins that play a role in RNA splicing or translation; and 4) ribosomal proteins. MYH9, a protein that plays an important role in platelet structure (59), was also found to bind to the c-myc G-quadruplex reproducibly. We became interested in nucleolin because it is an important regulator of cell proliferation that has been previously reported to play a role in c-myc gene regulation in B cells (35,42). In addition, nucleolin has been shown to bind to a number of intramolecular and intermolecular G-quadruplex structures in vitro (35)(36)(37)(38). Furthermore, it is believed that Quarfloxin inhibits rRNA biogenesis by disrupting the interaction between nucleolin and ribosomal DNA G-quadruplex structures (28). However, to our knowledge, no biologically relevant intramolecular G-quadruplex structure has been identified as a substrate of nucleolin. Moreover, little is known about the relationship between G-quadruplex topology and nucleolin-G-quadruplex binding activity. To investigate possible interactions of nucleolin with the c-myc G-quadruplex, we first purified nucleolin from human cells and then characterized its interactions with the c-myc G-quadruplex structure both in vitro and in vivo.
Both Native Human and Recombinant Nucleolin Bind to the G-rich Sequence of the c-Myc NHE III 1 Region in Vitro-Full-length nucleolin has a molecular mass of ~100 kDa, and it is the predominant species present in rapidly dividing cells (60,61). Here we show that nucleolin was purified to homogeneity by column chromatography as established by the LC-MS/MS sequencing analysis and as observed by Coomassie staining of the SDS-PAGE-resolved affinity column-purified proteins (Fig. 2A). The proteins eluted in the 0.8 and 1 M NaCl column fractions, which had molecular masses of 100 and 80 kDa, were both identified by LC-MS/MS as nucleolin (Fig. 2A, lanes 6 and 7).
To determine whether the binding of both human and recombinant nucleolin to the c-myc G-quadruplex was specific, we performed an EMSA by incubating the highly purified human nucleolin protein with radiolabeled Pu27-mer in the presence of 100ϫ excess nonradiolabeled poly(dI-dC) competitor DNA (Fig. 2B). Our results show that native nucleolin can bind to the c-myc sequence with high specificity. These results further support our finding that the c-myc G-quadruplex could serve as a binding target of nucleolin.
In addition, EMSA studies of protein fractions containing both the full-length and truncated nucleolin proteins resulted in two retardation bands corresponding to the full-length nucleolin (major, slower running band) and the proteolytic product of nucleolin (minor, faster running band) as confirmed by Western blot and LC-MS/MS (data not shown). Furthermore, additional EMSA studies of Pu27-protein complexes using nucleolin-containing fractions obtained from Sephacryl-400 and SP-Sepharose, or the highly purified nucleolin obtained after Pu77-WT affinity chromatography purification, showed that these protein samples all produced a single retardation band of similar mobility shift, which suggests that nucleolin bound the c-myc NHE III 1 as a monomer, not a dimer or larger macromolecular complex (data not shown).

Telomerase binding and telomeric G-quadruplex unwinding (50,51). Translation initiation control (52) and mRNA G-quadruplex unwinding (53)
hnRNP A2/B1    Pre-mRNA splicing (54) and mRNA G-quadruplex unwinding (55)
Eef1A          Translation elongation and G-rich DNA binding (56, 57)
RPS20          Ribosomal protein component of the 40 S ribosome subunit (58)
RPL15          Ribosomal protein component of the 60 S ribosome subunit (58)
RPL21          Ribosomal protein component of the 60 S ribosome subunit (58)
MYH9           Maintenance of platelet structure and cytolytic granule exocytosis (59)

DMS Footprinting to Investigate the Molecular Interaction of Nucleolin with the c-Myc G-quadruplex-To better understand the way in which nucleolin interacts with the c-myc NHE III 1 sequence, we performed a DMS footprinting analysis. DMS specifically methylates guanines and adenines at the N-7 position, and subsequent treatment with piperidine breaks the DNA backbone of the radiolabeled oligonucleotide at the methylated sites and produces a cleavage pattern that can help us identify the bases that are protected from DMS methylation (41).
In G-quadruplex structures, N-7 of the tetrad-forming guanines is involved in Hoogsteen hydrogen bonding and thus is protected from DMS attack. DMS footprinting is also used to identify the specific DNA-binding site of DNA-binding proteins, because DNA-protein interactions also protect DNA from DMS methylation and subsequent cleavage. In our study, we used DMS footprinting to determine whether nucleolin binds to the c-myc G-quadruplex structure or, alternatively, if it binds to a linear flanking region of the c-myc NHE III 1 sequence. To perform DMS footprinting of the c-myc G-quadruplex-nucleolin complex, we first incubated radiolabeled Pu27 in the absence (Fig. 2C, lane 1) or presence (Fig. 2C, lane 2) of 1500 nM (3 µg) recombinant nucleolin. The protein-DNA complex was separated from the free DNA on a native polyacrylamide gel. The radiolabeled DNA was visualized by radiography, and the bands corresponding to the protein-DNA complex (Fig. 2C, lane 2, top band) and free DNA (Fig. 2C, lane 2, bottom band) were excised from the gel for subsequent in-gel DMS treatment, DNA extraction, and analytical gel analysis.
Our DMS footprinting results show that when the c-myc Pu27-mer is bound to nucleolin (Fig. 2D, lane 4), the four 3′-runs of guanines are strongly protected against DMS attack, as compared with the free Pu27 oligo (lane 5). In addition, we observed a weak protection of G4 and G5 when nucleolin was bound to Pu27 (Fig. 2D, lane 4), as compared with the enhanced cleavage of the same guanines in the unbound Pu27 (lane 5). These results suggest that nucleolin binds to the c-myc G-quadruplex structure comprising the four guanine runs on the 3′-end of the NHE III 1. In summary, these results provide evidence that nucleolin binds with high affinity and selectivity to the c-myc G-quadruplex structure.
Nucleolin Binds with Higher Affinity to the c-Myc G-quadruplex Over Its Consensus RNA Substrate-We compared the binding affinity of nucleolin for the c-myc G-quadruplex structure to that of a hairpin RNA structure containing the NRE, a specific binding target of nucleolin (62)(63)(64)(65). Our results show that the binding affinity of nucleolin is significantly higher for the c-myc G-quadruplex structure (Fig. 3A, left panel) than for the NRE-RNA substrate (right panel). For example, at a concentration of 150 nM (0.3 µg in a 20-µl reaction) of recombinant nucleolin, only 4% NRE-RNA was bound to the protein compared with a 50% binding of the c-myc G-quadruplex (Fig. 3A, right panel, lane 3 versus left panel, lane 3). In addition, mobility shift competition assays show that excess NRE-RNA cannot displace the bound c-myc G-quadruplex structure (Pu47) (Fig. 3B, left panel, lane 3), whereas the c-myc G-quadruplex can effectively displace the NRE-RNA bound to nucleolin (Fig. 3B, right panel, lane 4).
Furthermore, when both the radiolabeled c-myc G-quadruplex structure and the NRE-RNA substrate are incubated at a 1:1 ratio in the binding reaction, nucleolin preferentially binds to the c-myc G-quadruplex but not to the NRE-RNA substrate, as observed by the nucleolin concentration-dependent disappearance of the free c-myc G-quadruplex, whereas no change in the free NRE-RNA substrate was observed (Fig. 3C). Taken together, our results suggest that the binding affinity of nucleolin for the c-myc G-quadruplex is significantly higher than that for the NRE-RNA.
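The protein amounts quoted for the binding reactions pair a molar concentration with a mass (150 nM given as 0.3 µg in a 20-µl reaction). As a worked check of that arithmetic, the sketch below computes the molar mass implied by such a pair. The function name is illustrative, and applying the same 20-µl volume to the 1500 nM (3 µg) footprinting reaction is an assumption based on the EMSA reaction volume given in the methods.

```python
def implied_molar_mass_kda(mass_ug, conc_nM, volume_ul):
    """Molar mass (kDa) implied by a mass, a molar concentration, and a volume."""
    moles = (conc_nM * 1e-9) * (volume_ul * 1e-6)   # mol = (mol/L) x L
    grams = mass_ug * 1e-6
    return grams / moles / 1000.0                    # g/mol converted to kDa


if __name__ == "__main__":
    # 0.3 ug at 150 nM in a 20-ul reaction, as stated in the text.
    print(implied_molar_mass_kda(0.3, 150.0, 20.0))
    # 3 ug at 1500 nM; the 20-ul volume is assumed here.
    print(implied_molar_mass_kda(3.0, 1500.0, 20.0))
```

Both pairs correspond to a molar mass of about 100 kDa, so the two quoted amounts are mutually consistent under the assumed reaction volume.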

Nucleolin Binds to the Single-stranded and G-quadruplex Conformations in the G-rich c-Myc NHE III 1 Region-In vitro studies have shown that nucleolin can bind to a number of RNA and DNA structures, including hairpin RNA, double-stranded DNA, and G-quadruplex structures (32,34,37,38,42,62,66). To better understand the potential interactions of nucleolin with the NHE III 1 region of the c-myc promoter, we investigated the binding affinity of nucleolin to the different DNA conformations in this region by using a filter binding assay, which allowed us to investigate the binding of our protein to a large number of DNA samples simultaneously. In one set of experiments, we used single-stranded Pu47 DNA (Pu47 ss) incubated in binding buffer in the absence of KCl or NaCl (5 mM KCl was added upon addition of the protein to the binding reaction) with increasing concentrations of recombinant nucleolin (Fig. 4A). In addition, we incubated the assembled Pu47 G-quadruplex (Pu47 G4), the double-stranded Pu47 DNA, or the C-rich Pu47 complementary strand (Py47 ss) in binding buffer containing 100 mM KCl and various concentrations of nucleolin (Fig. 4A). Quantification of the relative amount of protein-DNA complex in each sample was performed using ImageQuant software, and the results were plotted as a function of protein concentration (Fig. 4B). The solid lines in Fig. 4B represent a fit of the data to a one-binding site hyperbolic equation.
These results demonstrate that the affinity of nucleolin for the c-myc G-quadruplex structure is higher than that for any of the other conformations of the c-myc NHE III 1 region. For instance, the binding of nucleolin to either the c-myc NHE III 1 C-rich strand or the double-stranded conformation was very weak and did not seem to be specific. Nucleolin, however, is able to bind to the unstructured c-myc NHE III 1 G-rich strand. This result is consistent with our c-myc polymerase stop assay results and suggests that nucleolin has the ability to bind to the G-rich single-stranded NHE III 1 region and induce the formation of the c-myc G-quadruplex.
Recombinant Nucleolin Induces the Formation of the c-Myc G-quadruplex Structure in Vitro-Having found that nucleolin has the ability to bind to both the single-stranded and G-quadruplex conformations of the c-myc NHE III 1 region, we next asked whether nucleolin was able to modulate the unwinding or formation of these structures in vitro. To determine whether the protein CD spectrum would significantly affect the CD spectra of either the c-myc single-stranded or G-quadruplex oligos, we performed control CD scans of recombinant nucleolin, Pu47 ss, and Pu47 G4. Our results show that the CD signal of nucleolin at a concentration of 15 µM is not significant in the wavelength range where the Pu47 ss or Pu47 G4 spectra appear (Fig. 5A). Incubation of nucleolin with the preformed c-myc G-quadruplex did not result in any change in the spectra of the c-myc G-quadruplex (Fig. 5B), suggesting that nucleolin binds to but does not unwind or destabilize the c-myc G-quadruplex. However, incubation of nucleolin with the unstructured c-myc NHE III 1 sequence resulted in a very strong shift of the single-stranded positive peak at 258 nm to the G-quadruplex signature peak at 262 nm (Fig. 5C), thus providing strong evidence that nucleolin can induce the formation of the c-myc G-quadruplex structure from single-stranded DNA.
To further confirm the induction of G-quadruplex formation in the c-myc NHE III 1 region by nucleolin, we performed a DNA polymerase stop assay by using a template containing the c-myc sequence annealed with 32P-labeled primers, as described previously (45). The DNA polymerase stop assay provides a simple and rapid way to identify DNA secondary structures in vitro, based on the principle that the DNA polymerase enzyme is incapable of traversing these structures unless they are unwound. Thus, the DNA polymerase, traversing toward the 5′-end of the template and unable to efficiently resolve quadruplex DNA, stops 3′ to the first guanine involved in a stable G-quadruplex.
In the absence of KCl, no stable G-quadruplex structure is formed, and the DNA polymerase can extend through the DNA template containing the c-myc G-quadruplex-forming region, as observed by the presence of full-length product and the absence of stop product (Fig. 5D, lane 2). Conversely, when the same DNA template is incubated with KCl, a potassium-dependent stop of the DNA polymerase extension can be observed (Fig. 5D, lanes 3 and 4). Similarly, recombinant nucleolin results in a concentration-dependent stabilization of the c-myc G-quadruplex and, consequently, an increase in polymerase stop product (Fig. 5D, lanes 5-7).
We next examined the capacity of nucleolin to bind to the widely studied human telomeric intermolecular G-quadruplex structure (67)(68)(69). Telomestatin is a natural product (70) that has been shown to interact specifically with a number of G-quadruplex structures (71)(72)(73)(74)(75). In our study, we used telomestatin as a positive control that stabilizes the human telomeric G-quadruplex structure and, as a result, induces a concentration-dependent increase in the amount of polymerase stop product (Fig. 5E, lanes 3-5). Incubation of nucleolin with the telomeric DNA template did not significantly induce the formation of the human telomeric G-quadruplex structure (Fig. 5E, lanes 6-8).
Nucleolin Binds Preferentially to G-quadruplex Structures with Parallel Topology-Nucleolin has been shown to have binding activity to a number of G-rich oligonucleotides (36,38,66). The binding of nucleolin to these G-rich oligomers, although sequence-independent, correlates with the ability of these sequences to form stable G-quadruplex structures (38). Nevertheless, the relationship between G-quadruplex topology and binding activity of the nucleolin has not been studied. To compare the binding affinity of nucleolin to different G-quadruplex structures, we performed a filter binding analysis of nucleolin with a diverse group of G-quadruplex structures whose topologies have been previously determined by CD spectropolarimetric studies, chemical footprinting, and/or NMR spectroscopy (41, 67, 76-83). The relative amount of protein-DNA complex in each sample was quantified using ImageQuant software, and the data were plotted as a function of protein concentration. Prism4 software was used to fit the data to a one-binding site hyperbolic equation, and a graph of the fitted data is shown in Fig. 6, top. The protein concentration at half-maximal binding yielded the Kd value for the various DNA substrates (Table 3). The results in Fig. 6 and Table 3 show that nucleolin binds to the parallel-stranded quadruplexes (c-myc, VEGF, RET, PDGF-A, and HIF-1α) with higher affinity than to the nonparallel-stranded quadruplexes (BCL-2 and human telomeric) and the heptad:tetrad c-myb.
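The identification of Kd with the protein concentration at half-maximal binding follows directly from the one-site hyperbolic model. As a short worked step, with θ denoting the bound signal, [P] the nucleolin concentration, and B_max the plateau (these symbols are ours, and the derivation assumes that protein is in excess over DNA so that free and total protein concentrations are approximately equal):

```latex
\[
\theta([P]) \;=\; \frac{B_{\max}\,[P]}{K_d + [P]}
\qquad\Longrightarrow\qquad
\theta(K_d) \;=\; \frac{B_{\max}\,K_d}{K_d + K_d} \;=\; \frac{B_{\max}}{2}.
\]
```

A lower Kd therefore corresponds to a curve that reaches half of its plateau at a lower protein concentration, which is the sense in which the parallel-stranded quadruplexes are bound with higher affinity.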
Mutations on the c-Myc NHE III 1 Sequence Resulting in Alternative G-quadruplex Loop Isomers Negatively Affect Nucleolin Affinity for the c-Myc G-quadruplex-To further investigate the binding selectivity of nucleolin for the c-myc G-quadruplex, we analyzed the effect of mutations to the c-myc NHE III 1 sequence that selectively stabilize different loop isomers of the c-myc G-quadruplex. Our results show that these mutations cause a significant change in the binding affinity of nucleolin (Fig. 7, A-D). The most favorable mutation for binding was that which resulted in the formation of the major quadruplex with 1:2:1 loops (Fig. 7A), whereas the most unfavorable mutation was that which resulted in the 1:6:1 loop isomer (Fig. 7D). Taken together, these data suggest that nucleolin binds preferentially to quadruplexes with shorter loops. In the case of c-myc, nucleolin appears to selectively recognize the 1:2:1 loop conformation.
Nucleolin Binds to the c-Myc NHE III 1 Region in Vivo-The ChIP technique is a valuable tool used to identify endogenous interactions of transcription factors and the cis-elements of DNA to which they bind. The ChIP assay relies on the ability of specific antibodies to immunoprecipitate DNA-binding proteins along with the associated genomic DNA. In our study, we performed a ChIP assay on HeLa cells to investigate whether the nucleolin protein can selectively bind to the c-myc promoter in vivo. We immunoprecipitated DNA-protein complexes by using antibodies against nucleolin, Sp1, and IgG on formaldehyde cross-linked nuclear extract from HeLa cells. The abundance of c-myc promoter sequences within the immunoprecipitate was determined by PCR (Fig. 8A). As a positive control, we analyzed the interaction of Sp1 with the c-myc promoter. Both nucleolin and Sp1 proteins immunoprecipitated with the DNA fragment containing the NHE III 1 region within the c-myc promoter (Fig. 8A).
FIGURE 5 (legend fragment, panels D and E). [...] 3 and 4) or nucleolin (lanes 5-7). Lane 1 represents the dideoxy-C sequencing reaction of the c-myc template. Lane 2 corresponds to primer control. E, polymerase stop assay using the wild-type human telomeric sequence-containing template with increasing concentrations of telomestatin (lanes 3-5) or nucleolin (lanes 6-8). Lane 1 represents the dideoxy-C sequencing reaction of the human telomeric template. Lane 2 corresponds to primer control.
Recombinant Nucleolin Strongly Represses the Activity of the c-Myc Promoter in Vitro-Because the c-Myc G-quadruplex has previously been shown to function as a silencer of c-myc transcription, we hypothesized that nucleolin overexpression would result in the stabilization of the c-myc G-quadruplex and the inhibition of c-myc promoter activity. To test this hypothesis, we performed luciferase assay studies by using a c-myc luciferase reporter plasmid and a nucleolin protein expression plasmid in MCF10A cells (Fig. 8B). This cell line was used because it expresses lower basal levels of nucleolin compared with other cell lines. Co-transfection of these two plasmids in the presence of increasing concentrations of nucleolin plasmid allowed us to determine whether nucleolin could modulate c-myc promoter activity in a dose-dependent manner (Fig. 8B). We confirmed GFP-nucleolin fusion protein overexpression by Western blot analysis using anti-GFP antibody (data not shown). Our luciferase assay results (Fig. 8B) strongly suggest that nucleolin has the ability to repress c-myc promoter activity, presumably by inducing c-myc G-quadruplex formation and stabilization. In addition, nucleolin was able to effectively prevent the positive effect that Sp1 protein exerted on c-myc promoter activity as measured by luciferase activity (Fig. 8C).
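The Fig. 8 legend reports luciferase activity as the mean ± S.D. of n = 3 experiments, expressed as a percent of the activity in control cells (100%). A minimal sketch of that normalization is given below. It assumes that each reading is divided by the mean control reading and that the sample standard deviation is used; neither detail is stated in the text. The function names and the raw luminescence values are hypothetical.

```python
from statistics import mean, stdev

def percent_of_control(readings, control_readings):
    """Express each reading as a percentage of the mean control reading."""
    control_mean = mean(control_readings)
    return [100.0 * r / control_mean for r in readings]

def summarize(readings, control_readings):
    """Return (mean, sample S.D.) of the readings, both as percent of control."""
    pct = percent_of_control(readings, control_readings)
    return mean(pct), stdev(pct)

if __name__ == "__main__":
    # Hypothetical raw luminescence values (n = 3 each); not data from the paper.
    control = [980.0, 1000.0, 1020.0]
    with_nucleolin = [480.0, 500.0, 520.0]
    for label, values in (("Control", control), ("Nuc", with_nucleolin)):
        m, sd = summarize(values, control)
        print(f"{label}: {m:.1f} ± {sd:.1f} % of control")
```

Normalizing to the control mean places the control at 100% by construction, so repression appears directly as a value below 100%.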

DISCUSSION
It is well established that G-rich nucleic acid sequences can readily assemble into stable G-quadruplex structures in vitro under physiological conditions. Furthermore, the identification of a number of prokaryotic and eukaryotic proteins that interact with and alter the properties of G-quadruplex structures provides further evidence for the formation of such structures in vivo. The proteins that bind to G-quadruplex structures with high affinity and selectivity have been shown to preferentially perform various functions on those structures over double-stranded DNA, including G-quadruplex structure stabilization, unwinding, and cleaving (25,26). G-quadruplexes and other atypical noncoding DNA structures are believed to perform roles in cellular physiology that go beyond the storage of genetic information done by coding double-stranded DNA, including telomere maintenance (84), telomere capping (85), replication fork progression (86), chromosome organization (87), gene expression regulation (20), and DNA recombination (36). Still, how those processes are so carefully orchestrated through these atypical DNA elements remains unclear. The G-quadruplex found in the c-myc promoter has been shown to function as a silencer element (20). Therefore, proteins that unwind or induce its formation are likely to affect c-myc gene expression. Here we describe how affinity column chromatography was used to identify 11 potential c-myc G-quadruplex-binding proteins, which can be classified into the four main groups described before (see Table 2).
FIGURE 6. Differential binding affinity of nucleolin for various G-quadruplex structures. A nitrocellulose filter binding assay was used to determine the affinity of nucleolin for a diverse group of G-quadruplex structures. Each G-quadruplex-forming oligonucleotide was induced to form a stable G-quadruplex structure before incubating with nucleolin. The relative proportions of nucleolin-bound DNA to unbound DNA were quantified by phosphorimaging analysis and plotted. Models of the conformations of some of the G-quadruplex structures used in this study are shown for comparison. Protein concentrations in the reactions ranged from 5 to 2500 nM, and the dissociation constants (Kd) were calculated as described under "Experimental Procedures" and as listed in Table 3.
Among the telomere-binding proteins that bind to the c-myc G-quadruplex are the hnRNP A1 and hnRNP A2/B1. These two proteins form a macromolecular complex with telomere-maintaining factors, which regulate telomere length. hnRNP A1 has also been shown to unwind telomeric G-quadruplex structures. For this reason, hnRNP A1 has been suggested to stimulate telomere elongation through unwinding of G-quadruplexes at the end of telomeres (51). RNA sequences that are guanine-rich have also been shown to assemble into thermally stable G-quadruplexes under physiological conditions. Several reports demonstrate that RNA quadruplexes can modulate pre-RNA alternative splicing (88) and mRNA translation (89, 90) ex vivo. Interestingly, hnRNP A2 and modified hnRNP A1 have been previously reported to destabilize G-quadruplexes in both DNA and RNA (53). The G-quadruplex-disrupting ability of hnRNP A2 has been suggested to alleviate the translational block that results from G-quadruplex formation on mRNA (55). However, what role this protein plays in binding DNA G-quadruplex structures remains unclear.
The translation elongation factor eEF1A was also identified as a potential c-myc G-quadruplex-binding protein. Interestingly, a recent report suggests that eEF1A binds to G-rich oligonucleotides independent of the formation of a G-quadruplex structure (57). Our studies and those reported by Manzini and co-workers (57) differ in that their studies used 150 mM NaCl rather than KCl. It is well known that sodium and potassium can have different effects on both the ease of formation and the topology of the G-quadruplex formed. Furthermore, our results provide strong evidence that G-quadruplex topology is a determining factor for protein binding. It is possible that one of the functions of eEF1A in RNA is to bind to and unwind RNA G-quadruplexes, thus facilitating translation elongation. However, the significance of the binding of this protein to DNA G-quadruplexes has yet to be determined.
We identified four ribosomal proteins as potential c-myc G-quadruplex-binding proteins. Ribosomal proteins are essential components of the cellular machinery involved in protein synthesis. These proteins have been generally regarded as collectively playing an important role in synthesizing proteins; however, there is increasing evidence that certain ribosomal proteins can themselves act directly to regulate cell growth patterns independent of their role as components of the translational apparatus.
In addition, some of the identified proteins have been shown to remodel chromatin structure. For example, TTF-I has been suggested to regulate chromatin structure by recruiting different chromatin remodeling proteins to the rDNA, which, depending on the molecular context, can either activate or repress RNA polymerase I transcription. TTF-I has been shown to induce chromatin remodeling and relieve transcriptional repression of rDNA (46,47), whereas recruitment of NoRC to the rDNA promoter by TTF-I has been shown to silence rDNA transcription (91). Interestingly, c-Myc activation appears to induce the binding of TTF-I to rDNA, leading to the rDNA chromatin conformation changes that are associated with growth stimulation (92). Therefore, one may speculate that under certain conditions TTF-I may act in a feedback loop to regulate c-Myc transcription by modulating the formation or stability of the c-Myc G-quadruplex.
FIGURE 7. Effect of G-quadruplex loop length on G-quadruplex binding affinity of nucleolin. An electromobility shift assay was performed to determine the affinity of nucleolin for the 1:2:1, 1:2:2, 2:1:2, and 1:6:1 c-myc G-quadruplex loop isomers. Each isomer was induced to form a G-quadruplex before incubating with nucleolin. Nucleolin concentrations in the reactions ranged from 50 to 500 nM, corresponding to 0.1, 0.3, 0.5, and 1 μg in a 20-μl reaction.
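The Fig. 7 legend pairs protein masses with molar concentrations (0.1 to 1 in a 20-unit reaction for 50 to 500 nM), which allows a quick consistency check. The sketch below reads the legend's units as μg and μl, which is an assumption about characters lost in extraction, and back-calculates the molecular mass implied by the two end points. That mass is not stated in the text, and the function names are hypothetical.

```python
def grams_per_liter(mass_ug, volume_ul):
    """Mass concentration in g/L for mass_ug micrograms in volume_ul microliters."""
    return (mass_ug * 1e-6) / (volume_ul * 1e-6)

def implied_mw(mass_ug, volume_ul, conc_nM):
    """Molecular mass (g/mol) implied by a stated mass, volume, and molarity."""
    return grams_per_liter(mass_ug, volume_ul) / (conc_nM * 1e-9)

def molar_nM(mass_ug, volume_ul, mw_g_per_mol):
    """Molar concentration in nM for a given mass, volume, and molecular mass."""
    return grams_per_liter(mass_ug, volume_ul) / mw_g_per_mol * 1e9

if __name__ == "__main__":
    # End points taken from the legend: 0.1 ug -> 50 nM and 1 ug -> 500 nM in 20 ul.
    mw = implied_mw(0.1, 20.0, 50.0)
    print(f"implied molecular mass: {mw:.0f} g/mol")
    for mass in (0.1, 0.3, 0.5, 1.0):
        print(f"{mass} ug in 20 ul -> {molar_nM(mass, 20.0, mw):.0f} nM")
```

Both end points imply the same molecular mass, so under this reading the four lanes correspond to an evenly interpretable series between 50 and 500 nM.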
Similarly, nucleolin has been shown to greatly enhance the chromatin remodeling of nucleosomes by SWI/SNF and ACF (30). Nucleolin destabilizes the histone octamer by helping the dissociation of the histone dimer H2A-H2B, thus facilitating the passage of the polymerase and inducing the deposition of histone and the formation of nucleosomal particles (30). As a result, it has been described as a chromatin co-remodeler and a histone chaperone with functional similarity to the Facilitates Chromatin Transcription complex.
Nucleolin has also been reported to function as a transcription factor. For instance, LR1, a nucleolin-hnRNP D heterodimer, has been reported to regulate c-myc transcription in B cell lymphomas by binding to a double-stranded DNA element upstream of the c-myc NHE III 1 region (35,42,93). Significantly, LR1 has been reported to bind to G-quadruplex structures (36,37); however, the binding of this c-myc transcription factor to the c-myc G-quadruplex has not been investigated.
Here we provide evidence that nucleolin, the most abundant nucleolar phosphoprotein, binds to the c-myc G-quadruplex structure with high affinity and selectivity. Nucleolin is a modular protein that can be structurally divided into three different domains as follows: the N-terminal domain, the central domain that includes the four RBDs, and the C-terminal domain (29). By existing in multiple copies, the RBDs may allow the protein to bind with higher affinity and specificity than would be possible with an individual RBD. Furthermore, the four central RBDs are less conserved within the same protein when compared with the same RBDs in divergent species, thus suggesting that these proteins bind to specific nucleic acid sequences in a manner that has been evolutionarily conserved (29). In the case of the binding of nucleolin to the NRE-RNA substrate, nucleolin utilizes RBD1 and RBD2 to interact with a short stem (5 bp) and a 7-10-nucleotide loop containing the sequence UCGA (63, 94-96). To date, the RBD1 and RBD2 of nucleolin are the only domains proven to interact with the pre-rRNA. Our studies show that nucleolin binds with higher affinity to the c-myc G-quadruplex structure than to its consensus NRE-RNA substrate. In addition, the ability of the c-myc G-quadruplex to displace the NRE-RNA from binding to nucleolin suggests that RBD1 and RBD2 preferentially bind to the c-myc G-quadruplex structure.
The C-terminal domain of nucleolin is defined by spaced Arg-Gly-Gly (RGG) repeats interspersed with amino acids that are often aromatic. The RGG motif has been found frequently in nucleolar proteins (29), suggesting an important role for this domain in processes specific to the nucleolus, such as ribosome biogenesis. CD and homology studies of the C-terminal domain of nucleolin suggest that this domain adopts a helical conformation made of repeated β-turns (97). It has been suggested that the regularity of arginine and phenylalanine side chains projecting outside the central core of the spiral structure will create electrostatic and hydrophobic ridges that are prone to interact nonspecifically with RNA and DNA (60,95). Our EMSA and filter binding assays contradict that hypothesis, as we clearly see differential binding specificities. This observation was made previously by Bates et al. (38), who showed that nucleolin binds differentially to several G-rich DNA oligos, although the basis of the differential binding was unclear. Our studies suggest that nucleolin binds more avidly to parallel G-quadruplex structures containing short loops, such as the c-myc G-quadruplex. In addition, our CD spectropolarimetry studies and our polymerase stop assays strongly suggest that nucleolin induces the formation of the c-myc G-quadruplex structure from a single-stranded template, which may be necessary to modulate c-myc transcription. Furthermore, our ChIP and luciferase assay results suggest that nucleolin may be involved in c-myc transcriptional down-regulation in vivo, presumably by inducing the formation and stabilization of the c-myc G-quadruplex structure.
FIGURE 8 (panels B and C). B, effect of transient overexpression of nucleolin on the activity of a c-myc-luciferase reporter plasmid. MCF10A cells (7.5 × 10^5) were transfected with 1 μg of Del-4 plasmid per well in 6-well plates with transfection mix alone (Lipofectamine 2000) or the indicated concentrations of GFP-nucleolin plasmid for 24 h before preparing cell extracts for luciferase assay. C, effect of transient overexpression of Sp1 and/or nucleolin on the activity of a c-myc-luciferase reporter plasmid. MCF10A cells (7.5 × 10^5) were transfected with 1 μg of Del-4 plasmid per well in 6-well plates with transfection mix alone (Control); 100 ng of GFP-nucleolin (Nuc); 500 ng of Sp1 plasmid (Sp1); or 500 ng of Sp1 plasmid and 100 ng or 500 ng of GFP-nucleolin for 24 h before preparing cell extracts for luciferase assay. The data shown are the mean luciferase activity ± S.D. of n = 3 experiments. The data are expressed as a percent of luciferase activity in control cells (100%).
In summary, our findings strongly support the hypothesis that the c-myc G-quadruplex structure is a biologically relevant substrate of nucleolin and suggest a new mechanism for the regulation of c-myc transcription. If this is true, nucleolin may be mediating the cancer cell-specific apoptotic response of Quarfloxin by acting as a repressor of c-myc transcription after being displaced from the nucleolus to the nucleoplasm by Quarfloxin.
Our laboratory continues to investigate the relevance of the interaction of nucleolin with the c-myc promoter both in vitro and ex vivo. We believe that G-quadruplex-binding proteins such as nucleolin may play a key role in the regulation of the expression of several oncogenes that, like c-myc, have the ability to form G-quadruplex structures in their promoter. Understanding the role these proteins play in the regulation of the expression of genes required for tumor formation, maintenance, and metastasis may eventually lead to the development of novel cancer therapies through the ability to deliberately and specifically alter their expression.