Genomic Analysis Reveals a Novel Nuclear Factor-κB (NF-κB)-binding Site in Alu-repetitive Elements*

Background: The transcription factor NF-κB is a critical regulator of immune response pathways.
Results: 11% of total NF-κB binding events in humans occur at the Alu-κB element present in Alu repeats.
Conclusion: NF-κB has a primate-specific function and a role in human evolution.
Significance: Repetitive elements expand the repertoire of binding sites to engage new genes into regulatory networks.

The transcription factor NF-κB is a critical regulator of immune responses. To determine how NF-κB builds transcriptional control networks, we need to obtain a topographic map of the factor bound to the genome and correlate it with global gene expression. We used a ChIP cloning technique and identified novel NF-κB target genes in response to virus infection. We discovered that most of the NF-κB-bound genomic sites deviate from the consensus and are located away from conventional promoter regions. Remarkably, we identified a novel abundant NF-κB-binding site residing in specialized Alu-repetitive elements having the potential for long range transcription regulation, thus suggesting that in addition to its known role, NF-κB has a primate-specific function and a role in human evolution. By combining these data with global gene expression profiling of virus-infected cells, we found that most of the sites bound by NF-κB in the human genome do not correlate with changes in gene expression of the nearby genes and they do not appear to function in the context of synthetic promoters. These results demonstrate that repetitive elements interspersed in the human genome function as common target sites for transcription factors and may play an important role in expanding the repertoire of binding sites to engage new genes into regulatory networks.

The transcription factor NF-κB orchestrates gene expression programs controlling the response to a variety of stimuli, including bacterial and viral infections, and diverse biological functions, such as inflammation, cell proliferation, cellular homeostatic mechanisms, and carcinogenesis (1). In vertebrates, NF-κB is a mixture of homo- and heterodimers of a family of five sequence-specific DNA-binding protein subunits that include p50 (NF-κB1), p52 (NF-κB2), p65 (RelA), RelB, and c-Rel (2). Access of NF-κB to DNA elements is inhibited by a family of inhibitory molecules called IκBs (IκBα, IκBβ, IκBγ, IκBε, IκBζ, p105, and p100) (3,4). Two major pathways of NF-κB activation have been characterized (5). The canonical pathway culminates with the nuclear translocation of the p50/p65 heterodimer, whereas the non-canonical pathway depends primarily on the activity of the p52/RelB heterodimer. The canonical pathway plays an important role in innate and adaptive immune responses, whereas the non-canonical pathway is critical for lymphoid tissue development, organization, and function (1,2,5,6).
NF-κB proteins work as transcriptional activators or repressors by binding to the consensus DNA sequence 5′-GGGRNYYYCC-3′, known as the κB site. There are ∼10^6 consensus κB sites in the human genome, and ∼500 genes are regulated by NF-κB (a list of all known NF-κB target genes has been compiled by T. Gilmore; see the Boston University NF-κB Transcription Factors Web site) (reviewed in Refs. 7 and 8). However, the actual number of putative κB sites in the human genome is much greater because variations that deviate from the consensus motif are also bound by NF-κB and have also been shown to play critical roles in gene transcription despite the wide variation in NF-κB DNA binding affinity (9-15).
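The notion of a site "deviating from the consensus" can be made concrete with a short sketch. The code below is an illustration only, not the procedure used in this study: the function names `mismatches` and `scan` are hypothetical, only one strand is scanned, and the IUPAC codes are limited to those appearing in the consensus quoted above (R = A/G, Y = C/T, N = any base).

```python
# Illustrative sketch (hypothetical helpers, not the authors' pipeline):
# count how many positions of a candidate site disagree with an IUPAC consensus.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purine
    "Y": "CT",    # pyrimidine
    "N": "ACGT",  # any base
}

def mismatches(site: str, consensus: str = "GGGRNYYYCC") -> int:
    """Number of positions where `site` is not allowed by `consensus`."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    return sum(base not in IUPAC[code]
               for base, code in zip(site.upper(), consensus))

def scan(sequence: str, consensus: str = "GGGRNYYYCC", max_mismatches: int = 0):
    """Return (offset, site, mismatch_count) for every window on the given
    strand that has at most `max_mismatches` mismatches to the consensus."""
    k = len(consensus)
    hits = []
    for i in range(len(sequence) - k + 1):
        window = sequence[i:i + k].upper()
        m = mismatches(window, consensus)
        if m <= max_mismatches:
            hits.append((i, window, m))
    return hits

if __name__ == "__main__":
    print(mismatches("GGGAATTTCC"))   # a site that fits the consensus
    print(mismatches("GGGTTTCACC"))   # a site with two deviating positions
    print(scan("TTGGGAATTTCCAA"))
```

Relaxing `max_mismatches` quickly enlarges the set of candidate sites, which is why the number of putative κB sites is much greater than the number of strict consensus matches.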
Several studies have provided evidence that transcription factors, like NF-κB, can bind to many non-consensus sites with affinities similar to the ones displayed for consensus sites (reviewed in Ref. 16) and that most of the binding events in the nucleus occur in extragenic regions located far away from enhancers and promoters (17-21). However, these previous studies did not test for any significance of NF-κB binding and have raised a number of additional questions. For example, what proportion of the κB sites is functional? How do the κB sites that regulate gene expression compete with those that are not involved in gene regulation? Are there κB sites without functional importance?
Recently, we have established the biological significance of NF-κB interaction with three κB sites localized on specialized Alu-repetitive elements and showed that this binding is required for the subsequent delivery of NF-κB to target enhancers (e.g. IFN-β) via interchromosomal interactions (22). However, we do not know whether NF-κB binds selectively to only these three Alu repeats among the thousands existing in the human genome; nor do we know the fraction of the activated NF-κB molecules bound to the Alu repeats. These early observations raised additional questions regarding the property of NF-κB and possibly of other transcription factors to bind repetitive DNA sequences under physiological conditions. Repetitive DNA represents a major component of non-coding sequences in higher eukaryotes and comprises ∼60% of the human genome (23-26). Alu repeats are part of the short interspersed nuclear elements with an average size of 300 bp and exist in ∼1,500,000 copies in the cell. Several previous studies have implicated short interspersed nuclear elements in regulation of gene expression (22, 24-26). How do transcription factors select specific sites for DNA binding among the many similar ones in the genome, especially the ones occurring on repetitive DNA elements? Do transcription factors, like NF-κB, bind preferentially to the higher affinity consensus sites compared with the larger number of lower affinity sites? What is the role of the chromatin state in making transcription factor binding sites accessible throughout the genome? Does binding of transcription factors in remote sites cause local chromatin remodeling and/or recruitment of the basal transcription machinery? Does activation of a transcription factor like NF-κB by different signals induce selective binding to some sites, or is site selection signal-independent? Are all sites occupied by NF-κB capable of supporting transcriptional activation?
Put simply, is NF-κB binding restricted to only the sites that control gene expression? If not, what is the functional significance of the other sites? These are obvious questions given the relatively small number of NF-κB-regulated genes (∼500) and the huge number of putative κB sites in the human genome (∼2 × 10^6).
In this study, we have addressed many of the above questions using the NF-κB family of transcription factors as a model system to understand basic regulatory principles of transcription factor-dependent gene expression networks. It has been previously hypothesized that specificity in NF-κB-mediated transcriptional response can, in part, be provided through the combinatorial effects of the five different Rel members that are selectively regulated by IκB proteins and activated by distinct signals in different cell types (reviewed in Ref. 7) and by the promoter context of the κB sites (27-29). Here, we identified a novel NF-κB-binding site residing in specialized Alu-repetitive elements that have the potential for long range transcription regulation, thus suggesting that in addition to its known role, NF-κB has a primate-specific function and a role in human evolution. We found that ∼11% of the total NF-κB binding events in human cells occur at the Alu-κB site. By combining these data with global gene expression profiling of virus-infected cells, we found that most of the sites bound by NF-κB in the human genome, including the Alu-κB sites, do not obviously correlate with changes in gene expression of the nearby genes. These results demonstrate that repetitive elements interspersed in the human genome are common target sites for transcription factors and may play an important role in expanding the repertoire of binding sites to engage new genes into regulatory networks.

EXPERIMENTAL PROCEDURES
Cell Cultures, Transfections, and Plasmid Construction-HeLa cells were maintained in DMEM containing 10% FBS, transfected using the calcium phosphate method, and infected with Sendai virus as described previously (29). IB4 cells are Epstein-Barr virus (EBV)-transformed human B cells, and they were maintained in RPMI 1640 medium containing 10% FBS. For the Western blot of Fig. 7B, HeLa cells were transfected on 6-well plates, harvested in SDS-PAGE sample buffer, and probed with antibodies against p65 (sc-109) and β-actin (sc-130301) from Santa Cruz Biotechnology, Inc. (Santa Cruz, CA).
The thymidine kinase-luciferase constructs were generated by subcloning the genomic fragments from the p65 library. The plasmids (pCRII-TOPO vector) were digested with SacI/XhoI and cloned upstream of the thymidine kinase promoter driving the expression of the luciferase reporter gene (30). The −40IFN-βluc constructs were generated by subcloning tandem repeats of oligonucleotides bearing the Alu-κB-binding site (one, two, or four copies) taken from the p65 library clones and placed upstream of the IFN-β core promoter driving the expression of the luciferase reporter gene (29). The DNA FISH experiments were carried out as described previously (22).
To construct the p65 ChIP library, we immunoprecipitated chromatin prepared from HeLa cells infected with Sendai virus for 6 h, and the purified DNA fragments were processed with the tagged random PCR method, as described previously (32). Following transformation, we picked 450 clones and obtained sequence from 366 clones. Supplemental Fig. 1 shows the sequence of each clone along with its genomic location and the sequence of the putative κB site. The sequences from the 366 clones were analyzed by BLAST to locate the genomic position and by TRANSFAC to identify putative NF-κB-binding sites.
The library containing the p52-binding sites was prepared by immunoprecipitating chromatin from IB4 cells with the p52 antibody, and the purified DNA fragments were treated with terminal deoxynucleotidyltransferase (New England Biolabs) to add poly(A) tails to the 3′ ends. The isolated DNA fragments were enriched by PCR using oligo(dT) primers containing BamHI flanking overhangs. This DNA mix was subsequently digested with BamHI and cloned into the BamHI site of the pBluescript II vector.
In Vitro Transcription and EMSAs-Recombinant NF-κB (p50/p65 heterodimer) was expressed and purified as described previously (33). Similarly, EMSAs and in vitro transcription reactions were carried out as described previously (33).
DNA Microarray Experiments-Total RNA was harvested from 7 × 10^6 mock- or virus-infected cells by TRIzol extraction in biological triplicates. Total RNA was labeled according to the Affymetrix protocols and used to hybridize Affymetrix HG133plus2.0 microarray chips. The data were analyzed initially by using Affymetrix Microarray Suite. The RMA algorithm (34) was used for normalization of the data. Median chip normalization and log2 transformation of the data were performed. Permutation tests (35) and significance analysis of microarrays (false discovery rate <5%) were performed. Annotation of the probe sets was performed with the DAVID Bioinformatics Database (36). After the statistical analysis of the microarrays, 1,262 genes were considered up-regulated at either 6 or 12 h after Sendai virus infection. More than 400 genes were annotated to be regulated by NF-κB from the Boston University NF-κB Transcription Factors Web site. χ² tests were performed in order to check the correlation between the previously known NF-κB target genes and the 36 novel gene targets with the magnitude of activated gene expression in response to virus infection in HeLa cells. Genes were categorized into four categories according to their -fold change 12 h after virus infection. Levels of gene expression were categorized as follows: 1) low expression (-fold change < 2); 2) medium expression (2 ≤ -fold change < 4); 3) high expression (4 ≤ -fold change < 8); 4) very high expression (-fold change ≥ 8).
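The four expression categories can be expressed as a small binning function. This is a minimal sketch, assuming half-open bins with inclusive lower bounds (2 ≤ x < 4, and so on); the function names and the example -fold changes are hypothetical and are not data from this study.

```python
from collections import Counter

# Category labels in increasing order of induction.
CATEGORIES = ("low", "medium", "high", "very high")

def expression_category(fold_change: float) -> str:
    """Bin a -fold change into one of four categories.
    Assumption: lower bounds are inclusive (2 <= x < 4 is 'medium')."""
    if fold_change < 2:
        return "low"
    if fold_change < 4:
        return "medium"
    if fold_change < 8:
        return "high"
    return "very high"

def category_counts(fold_changes):
    """Count genes per category; the result is one row of a
    contingency table that a chi-square test could be run on."""
    counts = Counter(expression_category(fc) for fc in fold_changes)
    return [counts.get(name, 0) for name in CATEGORIES]

if __name__ == "__main__":
    # Hypothetical -fold changes, for illustration only.
    example = [1.3, 2.0, 3.9, 4.0, 7.5, 8.0, 25.0]
    print(category_counts(example))
```

With four categories, a test of one gene set against expected proportions has three degrees of freedom.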
ChIP-seq Data Analysis-The file containing the NF-κB (p65) significant ChIP-seq peaks obtained from ChIP-seq experiments using GM12878 lymphoblastoid cells after TNF-α treatment was downloaded from ENCODE (freely available from the University of California Santa Cruz Web site as a BED file describing the coordinates and the intensity of the peaks in hg18). The data were processed using the CEAS (Cis-regulatory Element Annotation System) software (available on the World Wide Web) to identify the genome-wide distribution of NF-κB (p65). Peaks overlapping with Alu repeats were determined by intersecting the intervals of NF-κB (p65) or the intervals of an INPUT-seq from HeLa cells with each of the Alu repeat intervals as obtained from RepeatMasker version 3.2.7 (available on the World Wide Web). Overlapping intervals were counted, and their total number was normalized by the size of their coverage. In order to compare percentages or proportions between the ChIP-seq data or the ChIP-cloning data and the INPUT-seq data, the binomial distribution was used with a null hypothesis for difference of proportions equal to 0.0 and α = 0.05. p values of less than 1% were considered statistically significant (p < 0.01). We recovered the sequences corresponding to the peaks of the ChIP-seq data and submitted them to de novo motif identification using the Seqpos motif tool, which can find motifs enriched in a set of regions and align them to the known motifs published in JASPAR and TRANSFAC (available on the Harvard Galaxy/Cistrome Web site).
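The interval intersection and the comparison of proportions can be sketched as follows. This is an illustration under stated assumptions, not the software used here: intervals are half-open (start, end) pairs on a single chromosome, as in BED files; the function names are hypothetical; and the significance test shown is a two-sided two-proportion z-test (normal approximation to the binomial), which stands in for the binomial comparison described above. The coverage normalization step is not reproduced.

```python
import math
from bisect import bisect_left

def count_overlapping(peaks, repeats):
    """Count peaks that overlap at least one repeat interval.
    Intervals are half-open (start, end) pairs on the same chromosome."""
    repeats = sorted(repeats)
    starts = [s for s, _ in repeats]
    # Running maximum of interval ends, so nested repeats are handled.
    max_end, current = [], float("-inf")
    for _, end in repeats:
        current = max(current, end)
        max_end.append(current)
    n = 0
    for p_start, p_end in peaks:
        i = bisect_left(starts, p_end)      # repeats[:i] start before the peak ends
        if i > 0 and max_end[i - 1] > p_start:
            n += 1
    return n

def two_proportion_p(x1, n1, x2, n2):
    """Two-sided p value for H0: p1 - p2 = 0 (pooled z-test).
    Assumes 0 < pooled proportion < 1."""
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (x1 / n1 - x2 / n2) / se
    return math.erfc(abs(z) / math.sqrt(2))

if __name__ == "__main__":
    # Hypothetical coordinates and counts, for illustration only.
    peaks = [(100, 200), (300, 400), (500, 600)]
    alus = [(150, 160), (400, 450), (0, 50)]
    print(count_overlapping(peaks, alus))
    print(two_proportion_p(40, 366, 10, 366) < 0.01)
```

In practice, the intersection would be run per chromosome, and a dedicated interval tool would normally replace the hand-written loop.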

Virus Infection Induces a Wide Distribution of NF-κB DNA Binding into the Human Genome-To investigate the localization of NF-κB into the human genome following virus infection, we applied a chromatin immunoprecipitation-based cloning technique to isolate genomic DNA sequences bound by NF-κB dimers using specific antibodies against the p65 (RELA) subunit. The ChIP-PCR-cloning procedure was optimized to avoid nonspecific interactions and bias in DNA cloning, as verified by IgG control immunoprecipitation experiments and by applying limited PCR amplification to limit background (see "Experimental Procedures"). HeLa cells were mock- or Sendai virus-infected for 6 h, followed by chromatin preparation and immunoprecipitation with the anti-p65 antibody (31). The precipitated DNA was processed as described previously (32). Four hundred fifty colonies were randomly picked, and the inserts were sequenced, producing 366 DNA sequences that were characterized in detail. The lengths of the cloned DNA fragments ranged from 100 to 850 bp, averaging about 430 bp (data not shown).
We searched the human genomic data base by BLAST to determine the location of the p65-bound DNA fragments in relation to known or predicted genes (supplemental Fig. 1). Fig. 1A shows the distribution of the identified NF-κB-binding sites relative to the closest transcription start site of nearby genes. Interestingly, the largest percentage of NF-κB-binding sites (49.5%) was found in regions located more than 10 kb away from known or predicted regulatory enhancer/promoter regions (Fig. 1A). Furthermore, ∼34% of NF-κB-binding sites were identified in introns, whereas smaller percentages ranging from 1.5 to 9% corresponded to κB sites located in exons, 5′- and 3′-untranslated regions, and 5′-proximal regions (less than 10 kb upstream from known genes). Notably, in ∼38% of the cloned NF-κB-bound fragments, a consensus or a putative NF-κB-binding site could not be identified (Fig. 1B). Overall, these findings are consistent with several previous reports in which transcription factor-bound DNA sites were mapped into mammalian genomes using ChIP-cloning, ChIP-PET, and ChIP-seq techniques (reviewed in Ref. 16). Our results are in agreement with previous studies reporting binding of NF-κB into the human genome in response to LPS activation of macrophages or monocytes (10,14,36) or treatment of HeLa cells with TNF-α and mapping of NF-κB binding across human chromosome 22 (12). Furthermore, these data may also reflect the property of NF-κB to control transcription primarily through distal enhancers or simply indicate that promoter elements are not well defined on a genome-wide basis or that the majority of the NF-κB-bound sites in vivo do not correlate with gene expression of the nearby genes (see below).
In Vivo NF-κB Binding Sequence Preference Reveals a Novel κB Element-We examined our data set to determine whether the NF-κB consensus motif, 5′-GGGRNNYYCC-3′, is enriched in the isolated p65-bound genomic sequences. Fig. 1B shows that only 6.5% of the identified κB sites fit the NF-κB consensus, whereas 38% of the bound DNA fragments lack a recognizable NF-κB site (i.e. they bear more than three mismatches from the consensus κB site). We scanned the DNA sequence of the 366 clones for κB-binding sites, and this search revealed an enrichment of the sequence 5′-GGGR(T/A)TYY(C/A)Y-3′ (Fig. 1D). This sequence is similar to the current NF-κB consensus motif (5′-GGGRNNYYCC-3′), differing at only one nucleotide (underlined), an observation that underscores the robustness of our ChIP-cloning approach to identify in an unbiased manner genome-wide binding sites for NF-κB proteins. Of note, we discovered the frequent presence of a T instead of N at position 6 of the consensus (Fig. 1D).
Remarkably, we discovered that 11% of our κB sites contained the specific GGGTTTCACC κB sequence (Fig. 1B) that deviates from the consensus at two nucleotides (underlined), thus increasing the total percentage of sequences bearing two mismatches to 35% (24 + 11%). The GGGTTTCACC DNA site (named the Alu-κB site from this point onwards) is present in specialized Alu-repetitive elements (38,39). We did not identify Alu-κB sites outside of Alu repeats. Thus, the Alu-κB site is not just another NF-κB DNA-binding site but the most frequent human DNA site on which NF-κB binds in vivo. None of the previous studies identified the Alu-κB site, probably because of the inherent bias of the methods used (10, 12, 14, 36). We have previously identified three Alu-κB sites implicated in the NF-κB-dependent virus-induced expression of the human IFN-β gene via interchromosomal interactions with the IFN-β locus (22).

FIGURE 1. Statistical analysis of p65-bound genomic loci. A, the 366 clones were classified by locations relative to the nearest neighbor gene. The pie chart shows the distribution of p65-binding sites located in promoters (−10 kb to the transcription start site), exons, introns, 5′- and 3′-untranslated regions, remote sites (−10 to −100 kb), and intergenic sites (>100 kb away from genes). Putative NF-κB-binding sites were detected using the TESS and TRANSFAC 7.0 platform. B, pie chart illustrating the number of mismatches of the identified κB sites relative to the consensus κB site. The GGGTTTCACC element, although it bears two mismatches, was classified separately. C, shown is the frequency of the GGGTTTCACC element in human and mouse chromosomes. The distribution of the mutant control sequence GGGTATCACC is also shown for both mouse and human chromosomes. D, position weight matrix of the κB sites derived from the ChIP cloning experiments using THEME.

Bioinformatics analysis indicated that the GGGTTTCACC κB site occurs 170,755 times in the human genome, whereas only 1,425 copies exist in the mouse genome, a result consistent with the absence of Alu-repetitive elements in mice (Fig. 1C). The Alu-κB site is present in the AluSx, AluSg, and AluY elements and not in the other types of Alu repeats, such as AluSq, AluJo, and AluJb (Fig. 2A). To verify in vivo binding of NF-κB to some of the κB site-containing specific Alu repeats (AluSx, AluSg, and AluY), we carried out ChIP experiments using anti-p65-immunoprecipitated chromatin prepared from mock- and Sendai virus-infected HeLa cells and PCR primers flanking each one of the Alu-repetitive elements. Fig. 2B (lanes 1 and 2) shows that indeed NF-κB associates only with those Alu elements that bear the GGGTTTCACC site (AluSx, AluSg, and AluY) and not with AluSq, AluJo, and AluJb repeats lacking the Alu-κB site. Fig. 1C shows the distribution of the GGGTTTCACC Alu-κB element in the human chromosomes in parallel with the distribution of the GGGTATCACC sequence bearing a single base substitution (underlined), used as a control in the bioinformatics analysis. We found that the frequency of the GGGTTTCACC element follows the distribution of the Alu repeats into the human genome, and it depends on the size of chromosomes, thus implying no preference for specific chromosomes (Fig. 1C). Furthermore, this comparison verified the presence of the GGGTTTCACC motif in repetitive elements and that the mutant control sequence occurs in the human genome at the expected abundance for any random 10-mer (2,634 times versus the theoretically expected 3,000 times for the 3 × 10^9-bp human genome). Finally, the distribution of the Alu-κB-containing loci in the genome relative to known genes revealed no preference for specific genomic positions, thus correlating with the overall distribution of Alu elements (Fig. 2C).
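The expected abundance of a random 10-mer quoted above can be checked with a short calculation. The sketch below assumes equal and independent base frequencies and counts a single strand, a simplification that is not stated in the text; the function names are hypothetical.

```python
def expected_kmer_count(genome_size: int, k: int) -> float:
    """Expected occurrences of one specific k-mer on one strand,
    assuming the four bases are equally likely and independent."""
    return genome_size / 4 ** k

def count_occurrences(sequence: str, kmer: str) -> int:
    """Count (possibly overlapping) occurrences of `kmer` in `sequence`."""
    k = len(kmer)
    return sum(sequence[i:i + k] == kmer
               for i in range(len(sequence) - k + 1))

if __name__ == "__main__":
    expected = expected_kmer_count(3_000_000_000, 10)
    print(round(expected))                 # on the order of 3,000
    # Fold enrichment of the Alu element over this simple expectation,
    # using the genome-wide count reported in the text.
    print(round(170_755 / expected, 1))
    print(count_occurrences("AAGGGTTTCACCTTGGGTTTCACC", "GGGTTTCACC"))
```

Under these assumptions the expectation is close to the roughly 3,000 copies cited for the control sequence, whereas the GGGTTTCACC element is far more abundant, as expected for a motif carried by repetitive elements.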
The Alu-κB site was also highly represented among the DNA fragments that were identified by chromatin immunoprecipitation and cloning of p52/NF-κB2-binding sites in the genome of human B lymphocytes latently infected with EBV (data not shown).
To examine whether our conclusions reflect the general binding properties of NF-κB and are not the result of the relatively small data sets of the ChIP-cloning experiment, we took advantage of the public availability of recent raw data of a ChIP-seq experiment examining p65 binding to the human genome following TNF-α activation in the lymphoblastoid cell line GM12878 from the ENCODE consortium (available from the University of California Santa Cruz Web site). The data were generated by the Snyder, Gerstein, and Weissman laboratories (Yale University), the Farnham laboratory (University of California, Davis), and the Struhl laboratory (Harvard University). Our analysis of these p65-ChIP-seq data is shown in supplemental Fig. 2, and the procedures for the analysis are described under "Experimental Procedures." In remarkable agreement with our ChIP-cloning analysis (Fig. 1), we found that only a small fraction (17.8%) of the total NF-κB binding events occur at promoter regions (<10 kb from the start site of transcription), a result in agreement with a recent publication examining the relationship of NF-κB and Bcl-6 binding in the mouse genome in which only 10% of the p65-bound regions were found at promoters (37). The distribution of NF-κB between intergenic regions, introns, exons, and 5′- and 3′-UTRs is very similar to our ChIP-cloning experiments (compare Fig. 1A with supplemental Fig. 2A). Furthermore, the consensus NF-κB DNA binding motif derived from the ChIP-cloning analysis is similar to the consensus motif derived from the analysis of the ENCODE-derived ChIP-seq data (compare Fig. 1D with supplemental Fig. 2B). The small differences in the consensus sequences are most likely due to the inherent bias of the ChIP-seq analysis toward the higher affinity binding sites (the ChIP-seq-derived motif is more similar to the consensus κB site than the consensus derived from the ChIP-cloning data).
This is consistent with previous biochemical studies using permutations of the κB motif in in vitro binding assays (11). Finally, we discovered that a significant number of peaks do not contain a recognizable NF-κB binding motif (data not shown). More specifically, 67% of all ChIP-seq peaks contain sequences that deviate from the consensus by more than two mismatches, a result in agreement with our ChIP-cloning analysis (73%).
Importantly, we found that a significant percentage of p65 peaks coincide with Alu repeats at frequencies very similar to the ones we described with the ChIP-cloning approach (for a comparison, see supplemental Fig. 2). Taken together, these data strongly suggest that our major finding of an extensive interaction of NF-κB with Alu repeats in the human genome is not the result of the smaller data sets used and analyzed by the ChIP-cloning method. In summary, our combined ChIP-cloning and ChIP-seq analyses revealed the existence of the previously unknown Alu-κB site, which is the most frequent binding site targeted by NF-κB in human cells, and all Alu-κB sites are part of Alu repeats.
Recruitment of NF-κB Proteins to Multiple Genomic DNA Sites in Vivo and in Vitro-To examine DNA binding of NF-κB and local histone modifications to the wide variety of identified putative target sites, we carried out chromatin immunoprecipitation experiments using pairs of primers corresponding to a randomly selected set of clones bearing different κB motifs. Fig. 3A shows that all selected genomic clones are bound (with varying affinities) in vivo by the p65 subunit of NF-κB in a virus infection-inducible manner (lanes 1 and 2), a result supporting the validity and specificity of our ChIP-cloning approach. Thus, NF-κB binds to a wide variety of DNA sites, located at many different positions relative to genes, including DNA binding to the GGGTTTCACC Alu-κB element. Interestingly, a similar pattern of NF-κB association was obtained when the cells were treated with TNF-α to induce the NF-κB DNA binding activity (Fig. 3A, lanes 7 and 8). Importantly, we tested three of our clones (clones 48, 138, and 224) lacking a recognizable κB site (supplemental Fig. 1) and verified that they are indeed bound by NF-κB, thus supporting further our initial conclusion that NF-κB is recruited to genomic sites lacking κB elements. A subset of the NF-κB-bound loci can also associate with HMGI(Y) (Fig. 3A), suggesting a function of the protein in facilitating NF-κB DNA binding activity at genomic loci scattered throughout the genome.
To examine whether different NF-κB proteins could occupy the same genomic sites, we carried out side-by-side chromatin immunoprecipitation experiments using antibodies against the p65, p52, and RelB proteins and chromatin prepared from human B cells latently infected with EBV, in which different heterodimeric forms of NF-κB proteins are constitutively nuclear. As seen in Fig. 3B, all tested sites can be occupied by all three NF-κB family members. Thus, in vivo, κB sites are occupied by multiple NF-κB family members stimulated by different inducers. This REL protein dimer exchange is also consistent with a report examining the Rel subunit composition on the promoters of five NF-κB-regulated genes during dendritic cell maturation (13).
To test whether binding of NF-κB at genomic sites located away from known enhancers or promoters can induce local chromatin modifications (e.g. acetylation) and the recruitment of general transcription factors, we carried out ChIP experiments using antibodies against acH4, acH3, TBP, and PolII. Fig. 4 shows that in ∼50% of the tested genomic sites, the local chromatin is either preacetylated at various levels (clones 8, 31, 77, 87, 95, 175, 366, 315, 216, 223, 217, 307, and 310), or it becomes further acetylated upon NF-κB binding (clones 11, 77, 87, 313, and 366), or the acetylation marks are decreased upon virus infection and NF-κB binding (probably due to nucleosome loss). These histone acetylation patterns do not correlate with the presence or absence of Alu repeats (Fig. 4). Furthermore, increased histone acetylation in these loci does not always correlate with transcriptional activation in response to virus infection (see Figs. 6 and 7A) and/or the recruitment of TBP and PolII (Fig. 4). In summary, NF-κB can induce distinct histone acetylation patterns and recruitment programs upon binding to different Alu-κB sites, thus underscoring the role of the κB context in NF-κB function. It should be mentioned, however, that the observed chromatin modifications and recruitment of general transcription factors could actually be occurring at remote sites with which these κB elements physically associate via DNA looping or interchromosomal interactions and are captured by the ChIP assay (22,40,41). Finally, these results are in agreement with a study in yeast showing that binding of GCN4 to non-promoter regions can induce local histone acetylation and recruitment of basal transcription factors (42).
The ability of NF-κB to interact with the Alu-κB site was evaluated by carrying out electrophoretic mobility shift assays (EMSAs) using oligonucleotides encompassing Alu-κB sites taken from various clones and nuclear extracts from mock-, virus-, and TNF-α-treated cells or recombinant NF-κB proteins (p50/p65 heterodimer). Fig. 5A shows that nuclear activated (lanes 1-20) or recombinant NF-κB (lanes 21-30) binds to the Alu-κB element derived from different genomic sites. Interestingly, the relative affinities of the in vitro DNA binding experiments are in agreement with the in vivo occupancy data presented in Fig. 3. For example, clone 77 is a weak binding site for NF-κB both in vivo and in vitro (compare Fig. 3A with Fig. 5A). These DNA-protein interactions are specific because they can be competed away by the WT and not by the mutant oligonucleotides (Fig. 5C, compare lane 2 with lanes 3-6). The protein-DNA complexes formed in nuclear extracts prepared from TNF-α-treated cells on the Alu-κB element contain the p50 and p65 subunits of NF-κB because the complexes were abrogated by the addition of the corresponding antibodies (Fig. 5C). Finally, the titration competition experiment of Fig. 5D suggests that the affinity of NF-κB for the Alu-κB site is ∼5 times less than the affinity for the typical PRDII (positive regulatory domain II) site taken from the IFN-β gene.

Identification of Virus Infection-inducible Genes Regulated by NF-κB-To identify p65-bound genes that are responsive to virus infection, we carried out transcriptomics analysis in HeLa cells using the Affymetrix platform to identify genes that are differentially expressed after virus infection for 6 and 12 h. We identified 1,262 genes that are significantly up-regulated upon virus infection (supplemental Fig. 3). Of these 1,262 virus-inducible genes, we found that only 36 are located near the p65-bound loci identified by the ChIP-cloning approach (Fig. 6A and supplemental Fig. 4), and only 64 virus-inducible genes represent previously known and well characterized NF-κB targets (Fig. 6B and supplemental Fig. 5). Interestingly, of the 462 previously known NF-κB targets, only 11 were identified in our ChIP-cloning assay (Fig. 6C and supplemental Fig. 4). Thus, the 36 novel virus-inducible genes that are associated with our list of identified sequences represent a reliable list of NF-κB target genes in virus-infected human cells. The 36 genes we identified as novel putative direct NF-κB targets encode transcription factors (ZFR, ZNF704, and PSGF5), signaling proteins (CDKN2C, RIPK2, MAFB, MAP4K4, SKP2, etc.), and others (see supplemental Fig. 4).

FIGURE 3. Recruitment of NF-κB at various non-promoter locations in the genome validates the ChIP-cloning analysis. A, individual ChIP assays of NF-κB binding to randomly selected clones derived from the ChIP-cloning method using chromatin prepared from HeLa cells that were either mock- or virus-infected for 6 h (lanes 1-4) or TNF-α-treated for 1 h (lanes 7 and 8). The chromatin was immunoprecipitated with either the p65 (lanes 1, 2, 7, and 8) or the HMGI(Y) antibody (lanes 3 and 4). The IFN-β promoter was used as a positive control for virus infection and the IL-8 promoter for TNF-α induction. The relative location of the selected clones and the identity of their κB element are indicated at the left. B, same as in A except that the chromatin used was prepared from IB4 lymphoblastoid cells, and the antibodies used were against p52, RelB, and p65. The ELC promoter was used as a positive control for the anti-p52 and anti-RelB ChIPs, and the IL-8 promoter was used as a positive control for the anti-p65 ChIP. Clones have been selected randomly from the anti-p52-ChIP library.

It is important to mention that only a small number of NF-κB-bound genes respond to virus infection (Fig. 6A). Thus, only 36 of the 374 nearest genes we identified are responsive to virus infection, and only 64 of the previously known 462 verified and/or validated NF-κB targets respond to virus infection (Fig. 6B). Fig. 6D (left) demonstrates that by using χ² tests, we found a close association between the 64 previously known NF-κB target genes and the magnitude of activated gene expression in response to virus infection in HeLa cells (χ² statistic = 31.7, degrees of freedom = 3, p < 0.001). Remarkably, no such correlation (χ² statistic = 5.8, degrees of freedom = 3, p < 0.10) was found with the novel 36 putative gene targets (Fig. 6D, right). Thus, a random sampling of p65-bound genomic loci correlates with gene expression only at a 10% frequency (36 of 366), and none of these gene expression patterns correlates with the magnitude of the transcriptional activation in response to virus infection. Taken together, these data strongly suggest that most of the NF-κB-bound genomic sites are not capable of promoting transcriptional activation, and/or NF-κB is in general insufficient to trigger gene expression.
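The two χ² statistics quoted above can be turned into tail probabilities with a few lines of code. The following is a minimal sketch, not the analysis pipeline used in the study: the function name `chi2_sf_df3` is illustrative, and the closed-form expression it uses holds only for 3 degrees of freedom. Under these assumptions, the statistic of 31.7 falls far below the 0.001 threshold, whereas the statistic of 5.8 gives a tail probability of roughly 0.12, which is consistent with the absence of a detectable association for the 36 novel targets.

```python
import math


def chi2_sf_df3(x: float) -> float:
    """Upper-tail probability P(X >= x) for a chi-square variable with 3 d.f.

    Uses the closed form of the regularized upper incomplete gamma function
    Q(3/2, x/2) = erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2).
    """
    if x <= 0:
        return 1.0
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)


# Statistics reported in the text (both with 3 degrees of freedom).
p_known_targets = chi2_sf_df3(31.7)  # 64 previously known NF-kB targets
p_novel_targets = chi2_sf_df3(5.8)   # 36 novel putative targets

print(f"known targets: p = {p_known_targets:.2e}")
print(f"novel targets: p = {p_novel_targets:.3f}")
```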
Transcriptional Activation Properties of Genomic NF-κB Sites-To investigate the transcriptional properties of the novel κB sites, we generated reporter constructs in which randomly selected genomic fragments bearing the κB elements were placed upstream or downstream from the thymidine kinase promoter linked to the luciferase gene. These reporters were cotransfected into HeLa cells along with an empty vector or a vector expressing the p65 subunit of NF-κB, and the luciferase activity was determined in mock- and virus-infected cells. Remarkably, we found that only 30% (7 of 21) of the tested elements activated the luciferase reporter in response to p65 overexpression, and of these, only two (clones 77 and 216) could mediate Sendai virus-induced transactivation of the luciferase reporter (Fig. 7A). Western blotting experiments verified that the p65 protein was expressed at similar levels in all samples (Fig. 7B). Interestingly, the majority of the genomic sites bearing the Alu-κB elements did not respond to virus infection or p65-dependent transactivation, except clones 15, 16, 23, and 71, whereas the non-Alu-κB clone 216 was activated at the highest levels by overexpressed p65 (Fig. 7A). We do not yet know why clone 216 functions as a strong p65 responder, but we can assume that high level p65 expression could lead to binding at very low affinity non-consensus sites or to the indirect recruitment of p65 to the DNA by unknown transcription factors bound to clone 216. Surprisingly, the experiment of Fig. 7C shows that the oligomerized Alu-κB element responded only weakly to high level expression of p65. More specifically, the amounts of activated transcription were ~30 times lower than those obtained with the PRDII NF-κB site taken from the IFN-β enhancer/promoter, under conditions of saturating NF-κB DNA binding (also see below).
To determine why NF-κB bound to the Alu-κB element fails to promote high levels of transcription, we carried out in vitro transcription experiments allowing either single or multiple rounds of transcription reinitiation using recombinant NF-κB (p50/p65 heterodimer) and HeLa nuclear extracts. Our DNA templates contained four copies of the PRDII element (GGGAAATTCC) or four copies of the Alu-κB element (GGGTTTCACC) taken from two different genomic clones (bearing different flanking sequences) placed immediately upstream of the IFN-β core promoter (33). The DNA templates were incubated with saturating amounts of recombinant NF-κB, followed by incubation with HeLa nuclear extracts for 30 min to allow the assembly of preinitiation complexes. Next, NTPs were added, followed by the addition of Sarkosyl, which prevents the reassembly of initiation complexes, or buffer, and the samples were incubated for 30 min (33). Correctly initiated transcripts were detected by primer extension. Fig. 7D shows that when multiple rounds of transcription were allowed, NF-κB activated transcription 36-fold from the PRDII site and only 7-fold from each of the Alu-κB sites (compare lane 2 with lanes 6 and 10). These results are in general agreement with the in vivo transfection experiments of Fig. 7C. Remarkably, in the presence of Sarkosyl, NF-κB activated transcription ~4-fold from both the PRDII and the Alu-κB sites, implying that NF-κB supports ~9 rounds of transcription from the PRDII element and ~1.7 rounds from the Alu-κB element. Thus, the distinct transcriptional strength of NF-κB on PRDII versus the Alu-κB site is due to the inability of NF-κB to promote multiple
Red or black represents increase or no change in gene expression relative to uninfected cells, respectively. Only genes displaying significant changes in expression levels (>1.5-fold or 1/1.5-fold; false discovery rate <5%) as compared with the uninfected control in three biological replicates are shown. The moving average of genes in a window of 60 genes that are known to be regulated by NF-κB is shown on the left. The right part illustrates a similar analysis performed with the 64 and 36 genes identified by our study to contain an NF-κB site and to be activated by virus infection. The straight line represents the expected percentage of NF-κB-regulated genes. The group of 64 genes corresponds to previously known NF-κB targets that were identified in our DNA microarray analysis to be induced by virus infection, whereas the group of 36 genes corresponds to genes identified by our ChIP-cloning analysis and shown to be induced by virus infection.
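The estimate of the number of transcription rounds quoted for the in vitro assay follows from dividing the fold activation measured when reinitiation is allowed by the fold activation measured in the presence of Sarkosyl (a single round). A minimal sketch of that arithmetic is given below; the function name `rounds_of_transcription` is illustrative, and the inputs are the rounded fold values stated in the text rather than the quantified gel data.

```python
def rounds_of_transcription(multi_round_fold: float, single_round_fold: float) -> float:
    """Estimate rounds of transcription as the ratio of multi-round to
    single-round (Sarkosyl-limited) fold activation."""
    if single_round_fold <= 0:
        raise ValueError("single-round fold activation must be positive")
    return multi_round_fold / single_round_fold


# Fold activations reported in the text for Fig. 7D.
prdii_rounds = rounds_of_transcription(36, 4)   # PRDII site
alu_kb_rounds = rounds_of_transcription(7, 4)   # Alu-kB site

print(f"PRDII:  ~{prdii_rounds:.1f} rounds")
print(f"Alu-kB: ~{alu_kb_rounds:.2f} rounds")
```

With these inputs the ratio is 9 for PRDII and 1.75 for the Alu-κB site, matching the ~9 and ~1.7 rounds given in the text.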
expression. Because the Alu-κB site is incapable of supporting by itself high levels of transcription, we hypothesized that the Alu-κB elements may play a more general role in interchromosomal interactions. We have tested three randomly picked clones derived from the ChIP-cloning library (clones 11, 17, and 39) for their ability to interact with the IFN-β gene side-by-side with the previously characterized 4C-ChIP clones (22). Here we show representative results for clone 11, but similar results were obtained with clones 17 and 39. Fig. 8A shows the maximum alignment of the two Alu repeat sequences surrounding the Alu-κB site from the ChIP-cloning clone 11 and the 4C-ChIP clone 21, shown previously to interact with the IFN-β locus in an NF-κB-dependent manner. As seen in Fig. 8, the two Alu-κB clones share 80% identity. To determine whether binding of NF-κB to different Alu-κB clones can increase IFN-β gene expression by promoting interchromosomal interactions, we carried out three-dimensional DNA FISH analysis using HeLa cells that were either mock- or virus-infected for 4 h. The IFN-β bacterial artificial chromosome probe (green) detects 6 or 7 IFN-β genes, and the clone 21 bacterial artificial chromosome probe detects three copies of clone 21, because HeLa cells are polyploid. In agreement with our previous work (22), virus infection induces interchromosomal interactions between a single allele of clone 21 and one allele of IFN-β in ~9% of the cells. In sharp contrast, no such interaction was detected between the ChIP-cloning clone 11 and the IFN-β locus (Fig. 8B). To investigate whether clone 11 could potentiate IFN-β expression in the absence of physical interactions with the IFN-β locus, we generated stable HeLa cells bearing multiple copies (35-40) of clones 21 and 11. Both stable cell lines bear the inserted DNA at a single integration site (see below). The stable cell lines bearing the empty vector and the Alu-κB clones 11 and 21 were infected with virus for different amounts of time, and IFN-β gene expression was determined by RT-PCR. Fig. 8C shows that the cells bearing multiple copies of clone 21 express the IFN-β transcript at much higher levels compared with control cells (compare lanes 1-8 with 13-16). Interestingly, no effect on IFN-β expression was detected in the cells bearing multiple copies of clone 11. The increased levels of IFN-β expression in clone 21 cells correlate with interchromosomal interactions between the IFN-β gene and clone 21, although clone 21 has been inserted at a different site from its natural position. Interestingly, no such interactions were observed in cells bearing clone 11. We conclude that different Alu-κB repeats are functionally distinct despite their significant DNA homology. It is important to mention that different Alu-κB repeats may interact with different target genes. Furthermore, these data also demonstrate that the property of some Alu-κB repeats to participate in interchromosomal interactions is not determined by their genomic context, as revealed by the ability of the stable clone 21 to interact with the IFN-β gene.
FIGURE 7. Transcriptional properties of genomic p65-bound loci. A, HeLa cells were cotransfected with reporter constructs bearing randomly selected p65-bound loci upstream of the thymidine kinase (TK) minimal promoter directing the expression of luciferase with or without a p65 expression vector. The cells were either mock- or virus-infected for 12 h, and the luciferase activity was determined and plotted after normalization with lacZ. B, Western blotting using some of the extracts generated in Fig. 7A reacted with the anti-p65 antibody (sc-109, Santa Cruz Biotechnology, Inc.). An anti-β-actin antibody (sc-1616, Santa Cruz Biotechnology, Inc.) was used to detect β-actin as an internal control. C, HeLa cells were cotransfected with reporter constructs bearing one, two, or four copies of the GGGTTTCACC or PRDII (GGGAAATTCC) element cloned immediately upstream of the IFN-β minimal promoter in the presence or the absence of the p65 expression vector. The cells were either mock- or virus-infected for 12 h before the luciferase activity was determined. D, shown is an in vitro transcription assay using HeLa nuclear extracts and templates bearing four copies of the PRDII element (PRDIIx4LUC) or four copies of the Alu-κB element (−40LUC #11 ×4 and −40LUC #39 ×4). The sequence of the κB oligonucleotides used to generate the templates is shown in Fig. 5B. The in vitro transcription reactions were carried out in the absence or the presence of saturating amounts of recombinant NF-κB according to the scheme shown at the left of the gel. The NF-κB-dependent transcripts generated in the absence or presence of Sarkosyl (an inhibitor of the reinitiation of transcription) are indicated by an arrow (top) and were quantified by the ImageQuant software (Molecular Dynamics) to calculate the rounds of transcription that took place (bottom). A representative experiment of two is shown. Error bars, S.D.

DISCUSSION
Here we have addressed the interaction of the transcription factor NF-κB with the human genome and found that ~11% of the total NF-κB DNA binding events occur at a specialized type of Alu repeat κB element that could function by controlling the activity of NF-κB. Furthermore, we demonstrate that the vast majority of the NF-κB DNA binding events do not directly correlate with gene regulation, a conclusion in agreement with earlier preliminary observations (10,12,14). All tested novel NF-κB-bound genomic elements are true κB-binding sites despite the fact that many of them lack a recognizable κB motif or a κB consensus. NF-κB may interact with these elements via another DNA-binding protein, and/or these loci may interact with distant NF-κB sites via chromatin looping or interchromosomal interactions.
The ChIP-cloning approach provided novel information regarding the genome-wide distribution of NF-κB (10,12,14,37). The main advantage of our ChIP-cloning approach is the unbiased identification of NF-κB-binding DNA sites of variable affinities (low, medium, and high), whereas in all previous studies, most of the identified κB sites were of relatively high affinity. The second advantage concerns data analysis. In all high throughput methods (ChIP-cloning PET, ChIP-chip, and ChIP-seq), the analysis relies on specific algorithms to identify the peaks of tagged sequences, to map the short sequences derived from high throughput sequencing to the genome, or to measure the relative hybridization intensity. It is not surprising that different algorithms can produce different results using the same set of data. In sharp contrast, although our approach is of medium scale, we have manually curated the sequencing data, thus increasing the confidence of the results. Finally, high throughput approaches based on hybridization (ChIP-chip) and sequencing (ChIP-seq) may mask putative transcription factor-binding sites at repetitive sequences. This is the main reason that the Alu-κB site remained elusive until we carried out the ChIP-cloning experiment. On the other hand, the disadvantage of our ChIP-cloning approach is that it identifies only a small number of binding sites when compared with the other high throughput approaches.
The fact that of the 100 NF-κB-bound genomic loci identified in our analysis, only a few correlate with changes in expression of the associated genes suggests that most NF-κB binding events may not have functional consequences under the specific conditions tested here (virus infection), but they could be functional under another condition. These conclusions are also in agreement with a recent study examining genome-wide binding of liver transcription factors in five different vertebrate species (43). Thus, these unexpected findings raise the question of why there would be so many "nonfunctional" binding sites. We can imagine the following possibilities. 1) The assignment of the binding site to the nearby gene may not be accurate because the bound factor may regulate the expression of a different gene via the formation of long range cis loops or interchromosomal interactions (reviewed in Ref. 46). 2) Binding sites located away from genes may correspond to docking DNA elements used by remote enhancers to reach their target promoters through the progressive formation of transient loops along the DNA (41). 3) Transcription factors may bind to these "irrelevant" sites only transiently (44), suggesting that the "snapshot" of transcription factor DNA binding taken by ChIP-based approaches most likely reflects factors on the "move" (i.e. the continuous association and dissociation from non-relevant sites until they encounter a promoter). 4) Some NF-κB-binding sites may be used as a shuttle receiving NF-κB and "delivering" it to target genes via DNA looping and/or interchromosomal interactions. This notion is supported by our recent discovery of three distally located Alu-κB elements promoting NF-κB-dependent expression of the interferon-β gene during the early phase of virus infection (22).
Specifically, upon nuclear translocation, NF-κB initially associates with three Alu-κB elements and subsequently is transferred to the IFN-β gene via interchromosomal interactions, resulting in the initiation of transcription.
The main question arising from our study regards the role of NF-κB when bound to the Alu-κB sites. The human genome contains over one million Alu repeat elements (reviewed in Refs. 23 and 24), and ~20% of those contain the non-consensus Alu-κB site GGGTTTCACC. Our ChIP-cloning and ChIP-seq data showed that the Alu-κB site is the most frequently recognized κB element in the human genome because ~11% of the total NF-κB DNA binding events occur at the Alu-κB site. The Alu-κB site is distinct from all other κB sites because it cannot support direct transcriptional activation due to its inability to promote reinitiation of transcription. We speculate that this is due to allosteric changes induced on the Rel proteins by the specific Alu-κB DNA sequence, a hypothesis consistent with previous studies demonstrating DNA site-induced conformational changes on DNA-binding proteins, thus affecting their ability to activate or repress transcription (27,45). However, depending on the context and the nearby transcription factors, the Alu-κB element can work as a virus-inducible enhancer (Fig. 7A). Is there a general function for the Alu-κB sites, or could they have different functions depending on their genomic context, cell type, environmental cues, developmental stage, etc.? We favor the latter possibility because we showed that, for example, their ability to participate in interchromosomal interactions is not general and that some of them, in the appropriate context, can work as enhancers. Below we summarize some possible functions, and we emphasize that our current technical inability to define the whole spectrum of specific roles for the Alu-κB elements does not necessarily mean that they are without function. We hypothesize the following. First, based on the fact that proteins in solution exist as a population of different conformers (46), we propose that nuclear NF-κB could exist in many conformers, some of which are transcriptionally inert but can bind to the Alu-κB site.
Thus, binding of this specific NF-κB conformer to the Alu-κB site could dramatically reduce its share in the population of NF-κB ensembles, thus allowing the rest of the conformers, which are transcriptionally competent, to bind and activate target genes. According to this model, the Alu-κB site functions as a sink "removing" transcriptionally inert NF-κB molecules, with the subsequent optimization of the interactions taking place via induced fit and/or DNA-induced allostery.
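Because the Alu-κB site is defined by a fixed 10-bp sequence (GGGTTTCACC) that differs from the PRDII element (GGGAAATTCC), its occurrences in a DNA sequence can be located with a simple exact-match scan of both strands. The sketch below is only an illustration of that idea and is not the procedure used in this study; the helper names `reverse_complement` and `find_sites` and the toy sequence are assumptions introduced for the example, and a realistic survey of Alu repeats would also need to tolerate mismatches and use annotated repeat coordinates.

```python
ALU_KB = "GGGTTTCACC"   # Alu-kB site given in the text
PRDII = "GGGAAATTCC"    # PRDII element of the IFN-beta enhancer

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper- or lower-case DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_sites(sequence: str, motif: str) -> list[tuple[int, str]]:
    """Return (0-based start, strand) for every exact match of the motif.

    The '-' strand is searched by looking for the motif's reverse complement
    in the given (forward) sequence; overlapping matches are reported.
    """
    seq = sequence.upper()
    hits = []
    for strand, pattern in (("+", motif.upper()), ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)
    return sorted(hits)


# Toy sequence (hypothetical) with one site on each strand.
toy = "ACGT" + ALU_KB + "TTTT" + reverse_complement(ALU_KB) + "ACGT"
print(find_sites(toy, ALU_KB))
print(find_sites(toy, PRDII))
```

On the toy sequence, the scan reports one Alu-κB site on each strand and no PRDII match.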
Second, Alu repeats have been found to contain several potential functional transcription factor-binding sites that could modulate gene expression through their presence in enhancers, promoters, and/or introns (34). In agreement with this idea is the recent observation that repetitive DNA elements spread throughout the genome, including regulatory regions, contain bona fide DNA sites for transcription factors (OCT4, NANOG, and CTCF), thus engaging the associated genes in the core regulatory network of stem cells (25). This study, together with the results presented herein, suggests that binding of transcription factors to repetitive DNA elements could be associated with regulatory expansions, thus playing an important role in expanding the repertoire of binding sites (see also Ref. 47). Third, Alu-κB elements could also act as transcriptionally inert docking sites by sequestering the NF-κB subunits immediately after nuclear entry, thus preventing excessive targeting and superactivation of promoters. This is supported by our finding that the Alu-κB element is nucleosome-free and therefore fully accessible for factor interactions (not shown).
The wide distribution of the Alu-κB elements combined with other nearby cis-acting elements may expand the NF-κB activity by engaging more genes into specific regulatory networks. This level of NF-κB regulation may be a characteristic feature of primate phylogeny because the Alu-κB element is not present in other species. Furthermore, the larger number (~3 times) of species-specific insertions of Alu-repetitive elements in the human genome, compared with the chimpanzee genome, suggests a mechanism for differential regulation of NF-κB, even between primates (7), thus possibly contributing to human evolution. In addition, the high density of Alu elements in the genome, when taken together with their direct involvement in interchromosomal interactions (2), suggests that Alu repeats together with their bound transcription factors could play an important role in recombination events triggering genomic instability and/or genetic disorders (8). The latter point is further supported by a recent study showing that genome-wide binding of NF-κB differs by 25% between individuals, and most of these differences are localized in intergenic regions and are due to SNPs (48). Remarkably, and in agreement with data presented herein, most of these binding differences between individuals are not associated with differences in gene expression. In summary, these findings, combined with our data and the observation that Alu repeats are polymorphic in nature, support the idea that differences in gene expression between individuals and between humans and other primates could be due to long range effects of the polymorphic Alu repeats and/or other genomic sites like some of those identified in this study.