J. Biol. Chem., Vol. 280, Issue 14, 13606-13615, April 8, 2005






From the Dipartimento di Biologia Animale, Università di Modena e Reggio, Via Campi 213/d, 41100 Modena, Italy, ¶Dipartimento di Scienze Biomolecolari e Biotecnologie, Università di Milano, Via Celoria 26, 20133 Milano, Italy, and the ||Division of Human Cancer Genetics, Department of Molecular Virology, Immunology, and Medical Genetics, Comprehensive Cancer Center, Ohio State University, Columbus, Ohio 43210
Received for publication, December 14, 2004, and in revised form, January 10, 2005.
| ABSTRACT |
|---|
|
|
|---|
| INTRODUCTION |
|---|
|
|
|---|
NF-Y is considered a general promoter organizer: thanks to its histone-like nature, it presets chromatin structure locally (8), interfaces well with nucleosomes (9), helps the binding of neighboring factors (reviewed in Refs. 4 and 5), and attracts coactivators, such as p300/CREB-binding protein (8, 10). The location of the CCAAT box is far from random, being positioned between -60 and -100 relative to the transcriptional start site in the vast majority of the promoters analyzed. In general, our knowledge of the anatomy of NF-Y-binding sites in terms of flanking sequences, position with respect to transcriptional start sites, and promoter context (6, 11, 12) enables us to make predictions as to whether a gene will be regulated by NF-Y.
Chromatin immunoprecipitation (ChIP)1 experiments determined that NF-Y is bound in vivo before gene activation (10-13); NF-Y is bound to a transcribing cyclin B1 promoter during mitosis in HeLa cells (14). Indeed, binding to cell cycle-regulated promoters is not constitutive but is time-regulated, being found before activation and displaced when promoters are repressed (10). Furthermore, conditional knock-out experiments of CBF-B (NF-YA) unambiguously determined that the protein is required for cell proliferation of mouse embryo fibroblasts and mouse development (15).
The analysis of 130 mammalian CCAAT-containing promoters suggests a prevalence in genes that are active in a tissue- or development-specific way and in inducible genes, either by external stimuli or during the cell cycle (7). Whereas this is certainly informative, very little information exists as to the binding to other regions. Finding all genes targeted by a particular transcription factor is crucial to reconstruct its transcriptional network. To expand our knowledge of NF-Y binding in vivo, a valuable approach is to use DNA derived from ChIPs to probe microarrays. DNA arrays have been developed in which clones derived from a CpG island library have been spotted (16); CpG islands have long been known to be associated with regulatory elements in promoters (17) and also elsewhere in the genome. They are believed to be mainly associated with "housekeeping" genes (i.e. genes active in all cells), albeit at different levels (reviewed in Ref. 18). To gain a wider understanding of the NF-Y transcriptional circuitry, we took a high throughput genomic approach by screening two CpG island arrays with anti-YB chromatin-immunoprecipitated DNA.
| EXPERIMENTAL PROCEDURES |
|---|
|
|
|---|
β-mercaptoethanol, were treated by adding formaldehyde directly to tissue culture medium to a final concentration of 1% and incubated for 10 min at room temperature. Approximately 5 × 10⁶ cells were used for each immunoprecipitation. Cross-linking reactions were stopped by the addition of phosphate-buffered saline-glycine to a final concentration of 0.125 M. Cells were washed twice with ice-cold phosphate-buffered saline, scraped, and centrifuged at 2000 rpm for 2 min. Cells were then resuspended in cell lysis buffer (5 mM Pipes, pH 8.0, 85 mM KCl, and 0.5% Nonidet P-40) containing protease inhibitors (100 ng/ml aprotinin and 100 ng/ml leupeptin) and 0.5 mM PMSF and kept on ice for 15 min. Cells were homogenized with several strokes of a Dounce homogenizer (B pestle), and the resultant homogenates were centrifuged at 5000 rpm for 5 min at 4 °C to pellet the nuclei. The pellets were resuspended in nuclei lysis buffer (50 mM Tris-HCl, pH 8.1, 10 mM EDTA, 0.1% SDS, and 0.5% deoxycholic acid) containing protease inhibitors and PMSF and kept on ice for 20 min. The nuclear lysates were sonicated on ice to an average chromatin length of 2-2.5 kb and then centrifuged at 12,000 rpm for 10 min at 4 °C. The supernatants were incubated in IP buffer (50 mM Tris-HCl, pH 8.1, 10 mM EDTA, 0.1% SDS, 0.5% deoxycholic acid, and 500 mM LiCl) containing protease inhibitors and PMSF, with Protein G-agarose (KPL) for 2 h at 4 °C in rotation. After removal of Protein G-agarose, the precleared lysates were used as soluble chromatin for ChIP. Chromatin was incubated at 4 °C overnight with 4 µg of anti-NF-YB or anti-NF-YC antibodies. No antibody and anti-FLAG (Sigma) control samples were included. Immunoprecipitates were recovered by incubation for 2 h at 4 °C with Protein G-agarose previously precleared in IP buffer (1 µg/µl bovine serum albumin, 1 µg/µl salmon testis DNA, protease inhibitors, and PMSF).
To perform a second immunoprecipitation, 30 µl of elution buffer (50 mM NaHCO3, 1% SDS) were added, and the recovered material was diluted with 270 µl of IP buffer. 2 µg of the second antibody were added and incubated at 4 °C overnight. The recovery proceeded as in the first IP reaction. Reversal of formaldehyde cross-linking, RNase A, and Proteinase K treatments were performed as previously described (19). Data validation was performed with conventional ChIPs (10), with chromatin of 0.8 kb and with anti-YB as well as anti-YC purified polyclonal antibodies. The sequences of the PCR primers used to analyze the genes reported in Fig. 2 are shown in Supplemental Table I.
Amplicon Generation and Labeling
The generation of amplicons from individual ChIPs was performed following the protocols of LM-PCR described in Refs. 20 and 21. Briefly, two unidirectional linkers (oligonucleotide JW102, 5'-GCGGTGACCCGGGAGATCTGAATTC-3'; oligonucleotide JW103, 5'-GAATTCAGATC-3') were annealed and ligated to the chromatin IPs, previously blunted by T4 DNA polymerase. The first amplicons were generated by PCR (one cycle at 55 °C for 2 min, 72 °C for 5 min, 95 °C for 2 min, followed by 15 cycles at 95 °C for 30 min, 55 °C for min, 72 °C for 1 min, and a final extension of 4 min at 72 °C). The reaction was purified using the Qiaquick PCR purification kit (Qiagen) or the GFX PCR purification kit (Amersham Biosciences) according to the manufacturer's instructions. One-tenth of each initial reaction was used to generate more amplicons, using the same PCR program for a subsequent 30 cycles. After purification of these last rounds of amplification, the DNA was quantified and examined by gene-specific PCR to ensure that the initial enrichment was maintained. 5 µg of amplicons for anti-NF-YB, anti-FLAG, and input DNA (subjected to the same number of PCR manipulations as the IPs) were labeled using the LabelIT Cy5/Cy3 nucleic acid labeling kit (Mirus), following the manufacturer's instructions, with a reagent/DNA ratio of 2.5 for Cy5 (IPs) and 1.5 for Cy3 (input).
CpG Microarray Hybridization
7776 CpG Array. The development of the 7776 CpG island array was described previously (21-23). Prior to hybridization, spotted CpG island slides were incubated with a solution of 3x SSC, 0.25% SDS, and 1.5 µg/µl salmon testis DNA under a glass coverslip at 37 °C for 30 min to block nonspecific binding. Slides were washed twice with water and dried for 5 min at 600 rpm in a centrifuge. Labeled DNAs were added to hybridization buffer (0.25 M NaPO4, 4.5% SDS, 1 mM EDTA, and 1x SSC), denatured at 95 °C for 2 min, cooled to 60 °C, and dropped onto slides placed in prewarmed hybridization chambers. Incubation was performed at 60 °C overnight. After hybridization, the slides were washed successively at 50 °C with 1x SSC, 0.1% SDS; at room temperature with 1x SSC (0.1%); and at room temperature with 0.2x SSC, for 5 min each, and then dried. Hybridized slides were scanned with the GenePix 4000A scanner (Axon), and the acquired images were analyzed with the software GenePix Pro, Version 3.0. A global normalization factor was determined for each replica, evaluating the anti-NF-YB ChIP Cy5/control ChIP Cy5 ratio relative to control repetitive elements. Data were normalized prior to comparison. After normalization, positive loci were defined by hybridization intensities at least 2 times greater than that of the control.
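The normalization and 2-fold call described for the 7776 array can be sketched as follows. This is a minimal illustration, not the authors' GenePix workflow: the function name, the dictionary inputs, and the use of the median ratio over the repetitive-element spots as the global factor are all assumptions made for the example.

```python
from statistics import median


def call_positive_spots(ip_signal, control_signal, repeat_ids, min_fold=2.0):
    """Return the ids of spots whose normalized IP/control ratio is >= min_fold.

    ip_signal, control_signal: dict mapping spot id -> Cy5 intensity
        (anti-NF-YB ChIP and control ChIP, respectively).
    repeat_ids: ids of the control repetitive-element spots (hypothetical ids).
    """
    # Assumed global factor: median IP/control ratio over the repeat spots.
    factor = median(ip_signal[s] / control_signal[s] for s in repeat_ids)
    positives = []
    for spot, ip in ip_signal.items():
        normalized_ratio = (ip / control_signal[spot]) / factor
        if normalized_ratio >= min_fold:
            positives.append(spot)
    return sorted(positives)


# Toy intensities, invented for illustration only.
ip = {"rep1": 300, "rep2": 330, "a": 900, "b": 400}
ctrl = {"rep1": 200, "rep2": 200, "a": 200, "b": 200}
positive_spots = call_positive_spots(ip, ctrl, ["rep1", "rep2"])
```

Dividing by a factor derived from the repeat spots lowers every ratio when repetitive clones give systematically higher anti-YB signals, which is why this choice makes the cut-off more stringent.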
12K Array. The Cy5- and Cy3-labeled DNA were each resuspended in 10 µl of 1 µg/µl Cot-1 DNA (Invitrogen) and mixed together in order to have the same amount of input Cy3-labeled DNA for each IP Cy5-labeled DNA. The hybridization solution was then added to a final composition of 43% formamide, 4.3x SSPE, 0.42% SDS, 42 µg of salmon sperm DNA, 0.2 µg of tRNA, heated for 2 min at 95 °C and cooled down to 37 °C over 30 min. 95 µl of each mixture solution was applied to two human CpG 12K slides (University Health Network, The Microarray Center, Toronto, Canada) and hybridized at 37 °C for >18 h. The slides were prehybridized for 1 h at 42 °C with 25% formamide, 5x SSC, 0.1% SDS, and 10 µg/µl bovine serum albumin.
The slides were washed at room temperature for 5 min twice in 2x SSC, 0.1% SDS; once in 1x SSC, 0.1% SDS; and one final time in 0.1x SSC; dried; and immediately scanned using a ScanArray 4000 scanner (Packard). The hybridized microarrays were analyzed using the Quantarray microarray analysis software (Packard). Features of poor intensity (<500) and those that did not meet the quality control criteria (visual inspection, spot circularity, spot uniformity, and background uniformity for both channels) were discarded. After the background subtraction for each spot, the data were normalized to the median (i.e. the median value of all spots in the Cy5 channel (IP DNA) was normalized to the median value of the control channel (Cy3 = input)). From a direct comparison of the arrays hybridized with the DNA of the anti-NF-YB IP and the anti-FLAG IP, only the spots that showed an enrichment >2-fold in the YB samples were further analyzed. Two independent experiments were performed, each consisting of one anti-NF-YB IP and one control anti-FLAG IP slide, normalized to the same input DNA, and the commonly enriched spots were considered.
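The 12K analysis steps (median normalization of Cy5 against Cy3, a >2-fold call of anti-NF-YB over anti-FLAG, and retention of the spots common to both experiments) could be sketched as below. The function names and data layout are hypothetical, and the intensity and quality filters described above are assumed to have been applied already.

```python
from statistics import median


def median_normalized_ratios(cy5, cy3):
    """Scale Cy5 (IP) so its median equals the Cy3 (input) median; return Cy5/Cy3 per spot."""
    scale = median(cy3.values()) / median(cy5.values())
    return {spot: cy5[spot] * scale / cy3[spot] for spot in cy5}


def enriched_spots(yb_ratios, flag_ratios, min_fold=2.0):
    """Spots whose anti-NF-YB ratio exceeds the anti-FLAG ratio by more than min_fold."""
    return {
        spot
        for spot, yb in yb_ratios.items()
        if spot in flag_ratios and yb / flag_ratios[spot] > min_fold
    }


def reproducible_targets(experiments, min_fold=2.0):
    """Intersect the enriched spots of several (yb_ratios, flag_ratios) experiments."""
    sets = [enriched_spots(yb, flag, min_fold) for yb, flag in experiments]
    return set.intersection(*sets) if sets else set()


# Toy values, invented for illustration only.
ratios = median_normalized_ratios(
    {"a": 200, "b": 400, "c": 800}, {"a": 100, "b": 100, "c": 100}
)
flag = {"s1": 1.0, "s2": 1.0, "s3": 1.0}
experiment_1 = ({"s1": 4.0, "s2": 1.0, "s3": 3.0}, flag)
experiment_2 = ({"s1": 2.5, "s2": 0.9, "s3": 1.5}, flag)
common = reproducible_targets([experiment_1, experiment_2])
```

Requiring a spot to pass in every experiment trades sensitivity for reproducibility: a spot enriched in only one replicate, like `s3` here, is dropped.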
Data Analysis
Positive clones were sequenced and mapped with BLAT. CCAAT sequences were searched for within 2 kb of flanking sequence for clones of the 7776 CpG island array and within 500 bp for clones of the 12K array, and annotated in individual files corresponding to the genomic loci identified. The criteria for classification are described below. Mouse orthologs were retrieved using BLAT. The annotated genes were classified according to functional categories, and the classification was compared with those performed on the MYC and E2F4 targets.
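A search for the core pentanucleotide on both strands can be sketched as follows. This is an illustrative helper, not the tool used in the study; the function name is hypothetical, and a CCAAT on the reverse strand is detected as ATTGG on the forward strand.

```python
def find_ccaat(sequence):
    """Return sorted (0-based start, strand) tuples for every CCAAT on either strand.

    A reverse-strand CCAAT reads ATTGG on the forward strand, so both
    motifs are scanned on the given sequence.
    """
    seq = sequence.upper()
    hits = []
    for motif, strand in (("CCAAT", "+"), ("ATTGG", "-")):
        start = seq.find(motif)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(motif, start + 1)
    return sorted(hits)


# Toy sequence, invented for illustration only.
hits = find_ccaat("ggCCAATcgATTGGtt")
```

Scoring a hit as a consensus high affinity site would additionally require checking the flanking nucleotides, which this sketch does not attempt.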
Expression Analysis of NF-Y-targeted Genes
HepG2 cells were infected with control green fluorescent protein, wild type NF-YA, or dominant negative YAm29 adenovirus.2 Adenovirus vectors to express NF-YA or the YAm29 dominant negative mutant were generated with the AdEasy system: inserts were excised with HindIII and XbaI from the corresponding pcDNA3-based vectors and introduced into the same sites of the shuttle vector pAdTrack-CMV. This plasmid was recombined with the vector pAdEasy1, followed by treatment with PacI and transfection into an E1-complementing cell line. We infected exponentially growing cells for 7 h in the absence of serum. Fetal calf serum was then added, and cells were incubated for 48 h. RNA was extracted using an RNeasy kit (Qiagen), according to the manufacturer's protocol. For cDNA synthesis, 4 µg of RNA were used with the M-MLV-RT kit (Invitrogen). Semiquantitative PCR analysis was performed with oligonucleotides detailed in Supplemental Table II.
Electrophoretic Mobility Shift Analysis of NF-Y Binding
EMSA analyses of Fig. 3 were performed under standard NF-Y conditions (6, 11, 22, 23), with anti-YB supershift antibodies and recombinant NF-Y and the indicated oligonucleotides. 32P-Labeled oligonucleotides were incubated in 20 mM Tris-HCl, pH 7.8, 50 mM NaCl, 1 mM dithiothreitol, 3% glycerol, 5 mM MgCl2 for 30 min at 20 °C with 5 ng of recombinant NF-Y trimer or with 5 µg of HepG2 nuclear extracts together with 200 ng of poly(dI-dC) (Sigma). The samples were loaded on a 4.5% polyacrylamide gel, run for 2 h, dried, and exposed. To produce recombinant NF-Y, Escherichia coli BL21 DE3LysS was induced at an A600 value of 0.6 by the addition of isopropyl-β-D-thiogalactopyranoside to a final concentration of 1 mM for 3 h. Bacterial pellets were resuspended and sonicated in sonication buffer (150 mM KCl, 20 mM Tris-HCl, pH 7.8, 0.05% Nonidet P-40, 0.1 mM EDTA, 5 mM 2-mercaptoethanol, 1 mM PMSF (Sigma), and protease inhibitors) and centrifuged at 23,000 x g in a Beckman SW 27Ti rotor for 30 min at 4 °C. The inclusion body pellet was resuspended in sonication buffer, sonicated, and centrifuged again. Inclusion bodies were finally resuspended in 6 M guanidinium chloride, 20 mM sodium acetate (pH 5.2), 5 mM 2-mercaptoethanol, and 1 mM PMSF. The three subunits were mixed to a final concentration of 0.5 mg/ml and dialyzed against a 100-fold excess of BC300 (300 mM KCl, 20 mM Tris-HCl, pH 7.8, 0.05% Nonidet P-40, 5 mM 2-mercaptoethanol, 1 mM PMSF); glycerol concentration was adjusted to 20%, and proteins were loaded on a nickel-nitrilotriacetic acid-agarose column, washed with BC300, and eluted with 0.25 M imidazole. The proteins were finally dialyzed against BC100, the purity being routinely >80%.
|
| RESULTS |
|---|
|
|
|---|
2-2.5 kb) than the one used in conventional ChIPs (0.5-1 kb). Because of the modifications of our routine ChIPs with extended chromatin, we first verified whether immunoprecipitated DNAs were indeed enriched in NF-Y-targeted fragments. We used oligonucleotides amplifying several CCAAT-containing promoters in semiquantitative PCRs. Fig. 1A shows that essentially all of the promoters tested were clearly positive in the anti-YB ChIP, compared with the FLAG and no antibody controls: the liver-specific genes GA, MVK, OAT, and mATP synthase and the ubiquitous HnRNPA1, NP95, PPP1R7, HMGB2, ABL, CDC25A, -actin, and OGG1. Note that only the last two genes were previously known to be regulated by NF-Y (31, 32), whereas all of the others were derived from a CCAAT-containing promoter data set.3 In parallel ChIP analysis, CCAAT-less promoters, p107, -tubulin, RPS19, and YBL1, were negative (Fig. 1A, lower panel).
For the 12K hybridization, we took a different approach, by PCR-amplifying chromatin from Nalm-6 cells after ligation of linker DNA. The advantage is that a very limited amount of ChIP material is required to yield enough DNA for hybridization. We also checked that the successive rounds of PCR amplifications would not decrease the enrichment of bona fide NF-Y targets in the amplicons. Indeed, Fig. 1B shows that the NF-YA promoter amplicon is no less, and in fact probably more, enriched in the final LM-PCR chromatin compared with the initial starting material. Therefore, we conclude that both of these procedures yield sufficiently enriched DNA for further genomic analysis.
Results of the 7776 Array. We used DNAs from 20-30 individual ChIPs to generate probes for the 7776 array screening. We identified at least 230 spots in which the corrected signal obtained with the NF-YB chromatin was at least 2-fold higher than the anti-FLAG signal. We sequenced all positive clones and derived their chromosomal localizations. A positive clone will indicate that a bound NF-Y site lies somewhere within 2.5 kb of the CpG island. The genomic sequences surrounding the CpG island were therefore scrutinized for the presence of CCAAT sequences for a length of 2 kb on either side. Table I shows a list of the positive clones. Several criteria helped us to classify them as follows.
We singled out the clones with a location appropriate for a "promoter" definition (i.e. whenever a mapped known gene or multiple clustered expressed sequence tags generated from a localized area were nearby). This is because the CCAAT position is quite constant, 60-100 bp from the transcriptional start site within the promoters analyzed (7), and exceptions to this rule are sporadic (32-34). In all cases in which multiple CCAAT boxes were detected throughout the locus, the clone was classified as "canonical" if one of them was present in the promoter, within 200 bp from the transcriptional start sites.
We further separated the promoters into two categories, based upon the type of transcriptional unit. CpG islands are abundant not only in simple promoters but also in divergent, convergent, and tandemly linked promoters as well (18, 35); we collectively classified them as complex transcriptional units (CTUs).
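The classification rules just described can be summarized in a small decision function. This is a simplified sketch under stated assumptions: the function name and the label strings are hypothetical, "within 200 bp" is taken as an absolute distance on either side of the transcriptional start site, and any locus with more than one nearby start site is treated as a CTU without distinguishing divergent, convergent, and tandem arrangements.

```python
def classify_clone(ccaat_positions, tss_positions, promoter_window=200):
    """Classify a positive clone from CCAAT and TSS genomic coordinates.

    ccaat_positions: coordinates of CCAAT boxes found in the locus.
    tss_positions: coordinates of transcriptional start sites near the clone.
    Returns one of "distant", "noncanonical", "canonical", or "CTU"
    (illustrative labels, not the exact column names of Table I).
    """
    if not tss_positions:
        # No mapped gene or EST cluster nearby: not a promoter location.
        return "distant"
    ccaat_in_promoter = any(
        abs(ccaat - tss) <= promoter_window
        for ccaat in ccaat_positions
        for tss in tss_positions
    )
    if not ccaat_in_promoter:
        return "noncanonical"
    # More than one start site sharing the region: complex transcriptional unit.
    return "CTU" if len(tss_positions) > 1 else "canonical"


# Toy coordinates, invented for illustration only.
simple_promoter = classify_clone([950], [1000])
bidirectional = classify_clone([950], [1000, 800])
```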
Species conservation of TF target sites or regulatory regions in general (and of CCAAT boxes in particular) is a hallmark of functional importance, as detailed in transfection experiments and phylogenetic footprints. We thus retrieved information on the mouse orthologous genes and analyzed them for the presence of a CCAAT sequence at the corresponding position. This was only possible, with a good degree of confidence, for the promoter (canonical and CTUs) data set, by taking the transcriptional start site as the pivotal point. The sequences of all of the loci are individually provided as Supplemental Table III.
In all clones retrieved, at least one CCAAT pentanucleotide could be found. This is well expected, given the 4-5 kb of DNA analyzed on both sides of the CpG clones and the average frequency of the core CCAAT (or ATTGG) pentanucleotide, one every 0.5 kb. However, a consensus high affinity NF-Y site (+++ in Table I) is theoretically present every 16 kb (7). Given the overall length of DNA analyzed in all of the loci (750 kb), the total number of CCAAT boxes expected would be 1500, with 46 high affinity ones. Indeed, 1135 CCAAT were scored, with 252 of these matching the NF-Y consensus; thus, although there is a slight negative skewing for the pentanucleotide around the CpG island regions analyzed, the NF-Y optimal sites were 6-fold overrepresented.
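The expected and observed counts quoted above can be checked with a few lines of arithmetic. All numbers are taken from the text; only the variable names are introduced here. Note that the computed enrichment of consensus sites comes out at about 5.4, which the text reports as roughly 6-fold.

```python
# Figures quoted in the text.
total_kb = 750                # overall length of DNA analyzed
core_spacing_kb = 0.5         # one core CCAAT/ATTGG every 0.5 kb
consensus_spacing_kb = 16     # one consensus high affinity site every 16 kb
observed_core = 1135
observed_consensus = 252

# Expected counts if sites occurred at the average genomic frequency.
expected_core = total_kb / core_spacing_kb            # 1500 core pentanucleotides
expected_consensus = total_kb / consensus_spacing_kb  # about 46.9 consensus sites

# Observed/expected ratios: <1 means depletion, >1 means enrichment.
core_ratio = observed_core / expected_core
consensus_ratio = observed_consensus / expected_consensus

print(f"core CCAAT observed/expected: {core_ratio:.2f}")
print(f"consensus sites observed/expected: {consensus_ratio:.2f}")
```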
To validate our analysis, we performed conventional ChIPs, with 1 kb of chromatin. Selections of the identified targets in each of the different classes were probed with anti-NF-YB and NF-YC antibodies. Furthermore, we also performed sequential immunoprecipitations of chromatin with both antibodies (re-ChIP). The results of these experiments are shown in Fig. 2. All targets tested scored positive, further confirming that clones emerging from the ChIP on chip analysis are indeed positive for NF-Y binding in vivo.
Results of the 12K Array. Table II contains the genes that emerged from the 12K array screenings with the anti-YB probes. The criteria mentioned above for the classification were also applied here, except that the flanking DNA considered was shorter (1 kb) due to the restricted length of the probe. Clones showing >2-fold higher signals with respect to the FLAG control were 1205 and 783 on 5121 and 4371 spots analyzed, respectively, corresponding to 23 and 18% of positivity. 119 clones were in common; of these, 65 clones were mappable based on the sequences retrievable from the Sanger Centre. Core promoters were 10%, and noncanonical CCAAT were nearly 50%. Several CTUs were also present. Overall, the distribution was highly reminiscent of the 7776 array. Here, again, we validated selected clones by conventional ChIP; all showed a substantial enrichment with respect to the FLAG control (Fig. 2A, right panels).
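The positivity figures for the two 12K experiments follow directly from the spot counts given above, as this short check shows. The helper name is hypothetical; the overlap fraction relative to the smaller positive set is an extra quantity computed here for illustration and is not reported in the text.

```python
def percent(part, whole):
    """Return part/whole as a percentage."""
    return 100 * part / whole


# Counts quoted in the text for the two independent 12K experiments.
positivity_1 = percent(1205, 5121)   # first experiment
positivity_2 = percent(783, 4371)    # second experiment

# 119 clones were positive in both experiments; 65 of them were mappable.
overlap_vs_smaller_set = percent(119, 783)
mappable_fraction = percent(65, 119)

print(f"positivity: {positivity_1:.1f}% and {positivity_2:.1f}%")
print(f"overlap relative to the smaller positive set: {overlap_vs_smaller_set:.1f}%")
print(f"mappable among common clones: {mappable_fraction:.1f}%")
```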
To further check whether the predicted CCAAT boxes were correctly evaluated, we performed ChIP scanning experiments on three loci. We immunoprecipitated chromatin from HepG2 with anti-YB and anti-YC antibodies and amplified three different regions of the CDKN2A-MTAP and EMX2-EMX2OS CTUs and the canonical PSMC6 gene. Results shown in Fig. 4 indicate that only one amplicon of the CDKN2A-MTAP locus was positive with both NF-Y antibodies, corresponding to the +++ CCAAT box indicated in Table I, despite the presence of other CCAAT elements in the proximity of the negative amplicons. In the case of the EMX2-EMX2OS locus, amplicons 2 and 3 were positive, corresponding to the core promoter regions of both genes, whereas an amplicon in the proximity of two high affinity sites in an intronic region of EMX2OS was not enriched compared with the control. In the PSMC6 locus, only the high affinity core promoter CCAAT was bound in vivo. Collectively, these experiments support the classification of Tables I and II and suggest that the genes are indeed under NF-Y control.
|
|
| DISCUSSION |
|---|
|
|
|---|
The number of NF-Y-regulated genes found in our analysis is more in line with the 7.6% figure recently obtained by Fitzgerald et al. (2); in fact, were NF-Y indeed involved in the majority (67%) of promoters, as suggested by Suzuki et al. (1), we would expect a much larger number of positive clones. However, several considerations can be put forward to explain the relative paucity of isolated targets. (i) In the 7776 CpG experiments, we applied a stringent cut-off by normalizing for the higher signals observed with anti-YB DNA in clones containing repetitive sequences; recent reports, however, suggested that CCAAT boxes are present and conserved in some families of repetitive DNA of retroviral origin (36). This finding matches the well known importance of NF-Y sites in many (actually most) retroviral long terminal repeats (reviewed in Ref. 7).4 Thus, our normalization is likely to have obscured a larger set of targets. (ii) It is likely that only a minority of genes are expressed at high levels in all cells and hence activated by NF-Y at all times. Many of the ubiquitous genes, in fact, are only active under specific circumstances (stress, apoptotic signals, a specific cell cycle phase, or environmental stimulus). Cell cycle promoters, for example, to which NF-Y association fluctuates considerably (10), are potentially underrepresented; indeed, other anti-YB positives, such as cyclin B1, that scored between 1.5 and 2 in fluorescence intensity above the FLAG control in the 7776 array are bona fide NF-Y targets.5 (iii) In similar ChIP on chip experiments, an equivalent number of clones were retrieved for MYC (28) and fewer for E2F4, E2F6, and methyl CpG binding domain proteins (20, 27, 30). Alternative approaches indicate that MYC high affinity sites are only a part of the overall binding strategy (37). Thus, it is likely that our data constitute a fraction of all of the potential NF-Y targets. 
(iv) Most importantly, only clones that showed positivity for multiple hybridizations were considered. In the case of the 12K array, positivity was scored in 15-20% of clones of the individual experiments, yet only 119 of them overlapped. We believe that suboptimal hybridization conditions prevent the successive and reproducible identification of the same set of targets, precluding the possibility of calculating the exact number of NF-Y-targeted loci. These shortcomings notwithstanding, our data lead to several interesting considerations.
Conservation of NF-Y Sites. Among the identified genes, only four had previously been established as NF-Y targets through mutagenesis of the CCAAT box, but not by ChIP analysis: (i) the UNG2-UNG1 tandemly linked genes were functionally dissected, and CCAAT boxes were found to be of importance for both genes (38); (ii) TIMP2 (and the related TIMP1/3) are clearly under NF-Y control (39-42); (iii) proliferating cell nuclear antigen, a CTU in which the CCAAT box is found in the first intron (43); and (iv) MTAP, a gene in which two separated suboptimal CCAAT boxes are important (44). For others, NF-Y binding was more than suspected. (i) The divergent promoters of the H2B-H3 and of H2A-H4 loci belong to the wide family of histone genes; detailed mutational analysis of other histone promoters clearly evidenced the importance of NF-Y (45, 46). (ii) Functional analysis of the NKX6.1 promoter pointed to a double CCAAT region as essential (47). (iii) PAX2 and TLP19 belong to gene families for which formal proof of NF-Y involvement was obtained with other members: PAX3/7/8 (48-50) and other endoplasmic reticulum stress-inducible genes, respectively (51-53). The analysis of the conservation between human and mouse promoters represents a good example of phylogenetic footprint, since 52 of 72 (74%) mouse orthologous promoters do contain CCAAT at the expected position; this percentage increases to 86% if we consider the optimal NF-Y sites (Tables I and II). Thus, the notion that conservation of the CCAAT is an integral part of the expression strategy within gene families and across species is reinforced.
The CCAAT Box and Complex Transcriptional Units. It is somewhat surprising to find a high frequency of CTUs in our analysis; 24% of the loci analyzed contained bidirectional promoters, and 15-17% contained tandem promoters. Most bidirectional promoters are divergent (60%), and the rest are convergent, generating partially overlapping transcripts. This result was not anticipated; previous data identified only a minute number (essentially histones, UNG1-UNG2, and AIRC-GPAT) actually containing such units (7).6 An unexpected abundance of bidirectional promoters in the human genome has recently been documented; as many as 11% of the total are divergent, either with overlapping or nonoverlapping transcripts (35, 54). Furthermore, the bidirectional arrangement is often conserved among mouse orthologs and important for expression of both transcripts. We analyzed the ChIP on chip experiments previously performed on the 7776 array, obtaining figures of 15% for E2F4 and 23% for MYC targets as bidirectional promoters (27, 29). This suggests that (i) MYC and NF-Y sites are enriched in bidirectional promoters and/or (ii) that CpG islands are indeed specifically abundant in such units. CCAAT-less bidirectional promoters do exist (e.g. the YBL1 promoter analyzed in Figs. 1 and 5); NF-Y, therefore, cannot be considered as a hallmark for such units. Nevertheless, in all systems tested so far, centrally located CCAAT boxes are important for the expression of divergent genes (45); the data obtained with the dominant negative YAm29 presented in Fig. 5 on the TYMS and TLP19 loci confirm this assumption. The biological significance of the higher frequency of CTUs regulated by NF-Y as well as the molecular details of divergent co-regulation require further dissection.
CCAAT at Distant Locations. A second unanticipated result is the abundance (40-50%) of sites away from promoters, with almost half of them located in introns. This clearly means that NF-Y is not a promoter-specific factor. It is important to emphasize that this finding would have been completely obscured had we used a promoter array chip, as available information on CCAAT locations would have suggested. Of course, the assumption that the CCAAT box was almost exclusively a promoter element was based upon standard promoter-driven analysis, thus merely reflecting the fact that far greater information had been gathered from such sequences. Only a handful of cases of distant locations were previously described. (i) In the major histocompatibility complex class II genes, upstream enhancers were shown to be dependent upon Y-boxes and neighboring RFX-binding sites (34). (ii) Sequences were found in the HOXB4 gene that contain a highly conserved NF-Y site in a crucial intronic enhancer (in fact, it is not even a perfect pentanucleotide, CCATT or GCAAT), and similar deviant sequences were noticed at corresponding locations in other introns of HOX gene clusters (33). Interestingly, CCAAT boxes exist in HOX gene promoters as well (55-57), one of which (HOXB13) was identified here; they are perfect CCAAT, whereas the intronic ones are modified, most likely to accommodate the binding of additional cooperating factors, as shown for YY1 in the case of HOXB4 (33). This suggests that there might be a plethora of specialized CCAAT versions, slightly deviating from optimal sites. It is even possible that we are largely underestimating the number of binding sites by focusing on the perfect pentanucleotide. NF-Y binding has been so far invariably associated with regulatory regions, which is confirmed by the expression analysis with the dominant negative YAm29 shown here.
An important implication of our data, therefore, is that new enhancers or regulatory regions could be uncovered via this strategy. In vivo functional dissection of the distant regions isolated here with enhancer-based assays is necessary to establish this point.
Functional Classification of NF-Y Targets. Fig. 6 shows the functional classification of the annotated genes. In both HepG2 and Nalm-6, prominent classes are (i) DNA-binding and transcription factors in general, which represent >25% of the total, and (ii) genes coding for membrane/extracellular matrix proteins and signal transduction genes. Far fewer genes code for structural proteins or for proteins involved in mRNA processing and in vesicular and nuclear trafficking. This could be due to particular skewing of the CpG library (16), but we note that many of the genes identified in both HepG2 and Nalm-6 are indeed important for cell growth.
In conclusion, although we are still far from having a complete map of NF-Y targets on hand, the criteria employed here reveal new twists in the genomic strategy, mainly concerning its role in complex units and at nonpromoter locations. To build a complete understanding of the transcriptional networks in which the trimer takes part, it will be important to widen the analysis to lower affinity sites, in different cell types, under various growth conditions and with various partner activators.
| FOOTNOTES |
|---|
The on-line version of this article (available at http://www.jbc.org) contains three additional tables.
Recipient of a Università di Modena fellowship.
** Recipient of a FIRB "Giovani Ricercatori" contract.

To whom correspondence should be addressed: Dipartimento di Scienze Biomolecolari e Biotecnologie, Via Celoria 26, 20133 Milano, Italy. Tel.: 39-02-50315005; Fax: 39-02-50315044; E-mail: mantor{at}unimi.it.
1 The abbreviations used are: ChIP, chromatin immunoprecipitation; Pipes, 1,4-piperazinediethanesulfonic acid; PMSF, phenylmethylsulfonyl fluoride; IP, immunoprecipitation; CTU, complex transcriptional unit; EMSA, electrophoretic mobility shift assay; CREB, cAMP-response element-binding protein.
2 Imbriano, C., Gurtner, A., Cocchiarella, F., Di Agostino, S., Basile, V., Gostissa, M., Dobbelstein, M., Del Sal, G., Piaggio, G., and Mantovani, R. (2005) Mol. Cell. Biol., in press.
3 A. Testa and R. Mantovani, manuscript in preparation.
4 G. Donati and R. Mantovani, unpublished results.
5 A. Testa and R. Mantovani, unpublished results.
6 R. Mantovani, unpublished results.
| ACKNOWLEDGMENTS |
|---|
| REFERENCES |
|---|
|
|
|---|