Determination and Functional Analysis of the Consensus Binding site for TFII-I Family Member BEN, implicated in Williams-Beuren syndrome

The ubiquitously expressed TFII-I family of multifunctional transcription factors is involved in gene regulation as well as signaling. Although they share significant sequence homology, these factors exhibit varied and distinct functions. The lack of knowledge about its binding sites and physiological target genes makes it difficult to assign a definitive function to the TFII-I-related protein, BEN. We set out to determine its optimal binding site with the aim of predicting its physiological target genes. Here we report the identification of an optimal binding sequence for BEN by SELEX (systematic evolution of ligands by exponential enrichment) and confirm the relevance of this sequence by functional assays. We further performed a database search to identify genes that contain this consensus site and validated several candidate genes by quantitative PCR upon stable silencing of BEN and by chromatin immunoprecipitation assay upon stable expression of BEN. Given that haploinsufficiency of BEN is causative for Williams-Beuren syndrome, these results may lead to the identification of a set of physiologically relevant target genes for BEN and may help identify molecular determinants of Williams-Beuren syndrome.

Several groups concurrently reported the presence of a TFII-I related gene that was called WBSCR11 (1), GTF2IRD1 (2) or GTF3 (3). This gene is located just centromeric to the TFII-I gene (GTF2I) on human chromosome 7 (and on the syntenic murine chromosome 5) in the same transcriptional orientation (4). It is a single copy gene that is deleted in all Williams-Beuren syndrome (WBS) patients, who display craniofacial dysmorphology, cognitive impairments and muscle fatigue (3,5,6). The human gene contains 27 exons and there are two known alternatively spliced isoforms (7,8). The Ruddle group isolated the mouse ortholog BEN (binding factor for early enhancer) in a one-hybrid screen (9). BEN encodes a 1072 amino acid protein, which is structurally similar to the human TFII-I (9). It exhibits six helix-loop-helix domains (I-repeats), an N-terminal hydrophobic leucine zipper-like motif, and a serine-rich repeat (9). For the sake of simplicity, we will refer to this protein henceforth as BEN.
Although knowledge regarding its expression pattern in developing embryos is available (10), the mechanism of action of BEN is poorly understood. For instance, BEN has been shown to be a specific repressor of TFII-I function (11). Competition between TFII-I and BEN has also been shown in the TGF-β signaling pathway. Upon TGF-β/activin signaling TFII-I interacts with Smad2 and is recruited to the goosecoid promoter, resulting in its transcriptional activation. In contrast, over-expressed BEN displaces this complex from the promoter, leading to transcriptional repression (12).
The Xenopus homolog of MusTRD1/BEN (as a VP16 fusion protein) behaves as a transcriptional activator of the goosecoid gene by interacting with Smad2 and Smad3 in an activin/nodal dependent fashion (13). It has also been reported that BEN is involved in the confinement of Troponin I slow gene expression to slow-twitch fibers (14,15). On the other hand, using ectopic expression systems and silencing of BEN, it was shown that this protein could behave both as a transcriptional activator and as a repressor (16). These conflicting results, indicating that BEN functions both as an activator and as a repressor of transcription, clearly warrant a systematic study to definitively assign biochemical functions to this protein and identify physiologically relevant target genes. Toward this goal, we set out to determine the consensus binding site for human BEN (MusTRD1) using an unbiased randomized screening approach, SELEX. Using this method, we derived a consensus sequence. We further show by electrophoretic mobility shift assay (EMSA) and in vivo reporter assays that this is a functional consensus sequence and that, under our assay conditions, BEN functions as a repressor via this site. We also scanned a database to identify a number of potential target genes that exhibit this BEN consensus site in their promoters. Because BEN is implicated in WBS and because several of these genes could have a potential role in WBS pathology, our analysis might lead to the future identification of molecular parameters for this disease.

Experimental Procedures

SELEX assay
The SELEX procedure was performed as described (17). An oligonucleotide library was designed in which a random 19-mer sequence is flanked by two 18-mer fixed sequences that can be amplified by PCR. The forward primer P1: 5'-CAGGGTCGCTGGTAC-3' and the reverse primer P2: 5'-CGCCAGTCGATAGCC-3' were used for amplification. The double-stranded dsSel0R library was generated by a primer extension reaction. The extended product was purified on a 12% native polyacrylamide gel. The band corresponding to 55bp was excised from the wet gel, and the DNA was eluted. DNA was concentrated using an Ultrafree-10K concentrator (Millipore) and used as probe in EMSA.
The expression and purification of BEN was carried out as described (18). A double-stranded oligonucleotide containing the DE sequence, 5'-GGGTCGAGATCCATTAATCAGATTAACGGTGAGCAATTAG-3', was used for EMSA. The complexes corresponding to monomer and dimer bands were cut out of the wet gel. The DNA pools were eluted, purified and concentrated as described (17).
To amplify the sequences selected in the binding assay, PCR was performed as described (17). The PCR products were purified and used in the next round of SELEX; this cycle was repeated for 5 rounds. The PCR products from rounds 2, 4 and 5 were ligated into the pGEM-T vector. Plasmid DNA from several clones from rounds 2, 4, and 5 was extracted and purified using the QIAprep Spin Miniprep Kit (Qiagen) and sequenced. To derive the consensus site for SELEX round 4 and SELEX round 5, clones were aligned both manually and with Pictogram software (19), and the occurrence of each base at each position was calculated.

Bioinformatics Analysis
Pictogram software (19) (http://genes.mit.edu/pictogram.html) was used to align SELEX clones and to derive the consensus site. TESS software (20) at http://www.cbil.upenn.edu/cgibin/EPConDB/TESS/tess.pl?mode=SearchForm (EpConDB) was utilized to search for possible target genes that contain the consensus site CWGCGAYA.
The following parameters were used: the search was performed for either the mouse or the human genome; the search was set to return the first 100 hits per chromosome; the location of the site was restricted to the region up to -1000 upstream of the putative transcription start site; 'string 1' was checked and set to CWGCGAYA; and the reverse orientation was also considered.
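The string search described above can be illustrated with a minimal sketch. It is not the TESS implementation; it only assumes the standard IUPAC meaning of the degenerate codes (W = A or T, Y = C or T) and scans both orientations of a sequence. The function names `reverse_complement` and `find_sites` are hypothetical, and the two test sequences are the wild type and mutant reporter oligonucleotides given under Luciferase Reporter Assays.

```python
import re

# Standard IUPAC nucleotide codes expanded to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
# Complement table that also handles degenerate codes (e.g. Y <-> R).
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Reverse complement of a DNA string that may contain IUPAC codes."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_sites(sequence, consensus="CWGCGAYA"):
    """Return sorted (start, strand, matched_text) tuples for hits in either orientation.

    'start' is a 0-based index on the given sequence; '-' means the
    reverse complement of the consensus matched at that position.
    """
    sequence = sequence.upper()
    hits = []
    for strand, motif in (("+", consensus), ("-", reverse_complement(consensus))):
        # A lookahead is used so that overlapping matches are also reported.
        pattern = re.compile("(?=(" + "".join(IUPAC[b] for b in motif) + "))")
        for match in pattern.finditer(sequence):
            hits.append((match.start(), strand, match.group(1)))
    return sorted(hits)


wild_type = "GGGGGCAGCGACAGCCCCC"  # wild type reporter oligonucleotide
mutant = "GGGGGCACTACCAGCCCCC"     # mutant reporter oligonucleotide
print(find_sites(wild_type))
print(find_sites(mutant))
```

In a promoter scan of the kind described, the same function would be applied to the 1000 bp upstream of each putative transcription start site; how those sequences are retrieved is outside the scope of this sketch.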

EMSA
Binding reactions were performed as described above for SELEX assay. The reactions were resolved on a 5% PAGE and visualized by PhosphorImager (Amersham) using Image Quant 5.2 software (Molecular Dynamics).

Luciferase Reporter Assays
Single stranded oligonucleotides containing either the wild type consensus site (5'-GGGGGCAGCGACAGCCCCC-3') or the mutant consensus site (5'-GGGGGCACTACCAGCCCCC-3'), multimerized in triplicate with XhoI and KpnI restriction sites, were annealed and subcloned into the pTK81luc luciferase vector (ATCC). The resulting constructs were designated WT3X and Mut3X, respectively. COS7 cells were transiently transfected in triplicate using polyfectamine (Qiagen). Western blotting was performed with anti-GST antibody (Sigma) (16).

Generation of BEN knockdown in C2C12 cell lines
BEN knockdown clones were established essentially as described for TFII-I (18). The shRNA target sequences are provided in supplemental table S1. Two target sequences were found, at positions 1074bp and 1957bp respectively. The clone infected with both shRNA1074 and shRNA1957 exhibited the best silencing (~8-fold) and was used further to analyze endogenous gene expression. The clone infected with shRNA1074 alone exhibited a ~2-fold reduction in BEN and was used as control.

Generation of stable GFP-hBEN tet-repressible C2C12 cell line
Tet-regulated stable expression of GFP-BEN in C2C12 cells was established as described before for TFII-I (18) with some modifications. A BspHI site was introduced next to the SpeI site in pEBB-GFP-hBEN (11) by PCR. A BspHI-NotI fragment containing GFP-hBEN was then cloned into the pSFG vector digested with NcoI-NotI.

Semi-Quantitative and quantitative RT-PCR
Semi-quantitative PCR was carried out to measure BEN mRNA expression (21), with β-actin as an internal control (18), using GoTaq Green Master Mix (Promega). Relative gene expression by quantitative RT-PCR was measured with SYBR Green Dye (ABI) and 18S as an internal control. The data were analyzed using GeneAmp 7300 SDS software (ABI). The fold change for each gene was calculated relative to wild type using 2^(-ΔΔCt) as described in (22). The sequences of the primers are provided in supplemental data table S1.
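The 2^(-ΔΔCt) calculation can be written out as a short worked example. This is a minimal sketch of the standard formula, with 18S as the reference gene and wild type cells as the calibrator; the function name `fold_change` and all Ct values below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from statistics import mean


def fold_change(target_sample, ref_sample, target_control, ref_control):
    """Relative expression by the 2^(-ddCt) method.

    Each argument is a list of replicate Ct values:
      target_sample  - gene of interest in the test cells (e.g. knockdown)
      ref_sample     - reference gene (e.g. 18S) in the test cells
      target_control - gene of interest in the calibrator (e.g. wild type)
      ref_control    - reference gene in the calibrator
    """
    d_ct_sample = mean(target_sample) - mean(ref_sample)     # dCt, test cells
    d_ct_control = mean(target_control) - mean(ref_control)  # dCt, calibrator
    dd_ct = d_ct_sample - d_ct_control                       # ddCt
    return 2.0 ** (-dd_ct)


# Hypothetical triplicate Ct values, for illustration only.
knockdown_gene = [24.0, 24.0, 24.0]
knockdown_18s = [10.0, 10.0, 10.0]
wild_type_gene = [27.0, 27.0, 27.0]
wild_type_18s = [10.0, 10.0, 10.0]

print(fold_change(knockdown_gene, knockdown_18s, wild_type_gene, wild_type_18s))
```

With these illustrative values ΔΔCt is -3, which corresponds to an 8-fold higher level in the test cells than in the calibrator; a positive ΔΔCt gives a fold change below 1.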

Quantitative ChIP
C2C12 cells stably expressing GFP-hBEN were left untreated or treated with 20µg/ml of tetracycline for 2 days. ChIP assays were performed according to the manufacturer's protocol (Upstate) and as described (18,23). DNA samples from ChIP were analyzed by quantitative PCR in triplicate using Taqman Gene Expression Master Mix (ABI). The sequence of the primers and probes is provided in supplemental data table, S1.

Results: DNA binding by BEN
Purified BEN was used in EMSA to monitor its binding to the distal enhancer (DE) sequence element from the goosecoid promoter (13). BEN gave rise to two bands, presumably corresponding to the monomer and dimer, in a concentration dependent fashion (Fig 1). A model of cooperative binding of BEN to the DE sequence is supported by quantitation (data not shown) and is in agreement with previous results (16).

BEN binding site selection by SELEX
Given that binding of BEN was clearly detected, we employed a binding site selection method called SELEX (24). We adapted the methodology previously described (17) with some modifications. The flanking sequences were modified to exclude an E-box consensus site (CANNTG) known to bind TFII-I family members. After the fourth round of SELEX, the selected oligonucleotides were subcloned into a TA-cloning vector (pGEM-T), and the resulting clones were randomly picked and sequenced (Fig. 2A). We found that BEN also displayed monomer and dimer binding to the SELEX library, which was nearly identical to its binding to the DE probe. However, there appears to be no significant difference between sequences isolated from monomer and dimer bands in any round, indicating that they bind to the same sequence (Fig. 2A).
To derive the consensus motif, we calculated the number of sequences containing each of the 4 nucleotides (G, A, T or C) at each of the 19 positions (Fig. 2C), and found the following 8bp core consensus C A G C/G G C/A G A, surrounded by G- and C-rich sequences. While the precise role of these "mirror" G- and C-rich sequences is not clear, it is possible that, since BEN shows cooperative binding to DNA, these G- and C-rich sequences together with the core consensus form two half-site consensus sites. We further manually aligned our sequences against this core consensus motif and used the Pictogram software to align our 30 sequences. The exact same results were obtained (Fig. 2D). To determine whether our selection increased the specificity from early to later rounds, we sequenced five clones from SELEX round 2 and eight clones from SELEX round 5 (Fig. 2A), and found that the core consensus indeed became more specific. While in round 2 there is no clear core consensus, in round 5 it becomes CWGCGAYA.
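The per-position base counting described above can be sketched as follows. This is not the Pictogram algorithm; it simply tallies bases in pre-aligned sequences and collapses each position to an IUPAC code. The function names, the 25% inclusion threshold, and the four aligned sequences are hypothetical and serve only to illustrate how a degenerate consensus such as CWGCGAYA arises from counts; they are not clones from this study.

```python
from collections import Counter

# Map each set of bases to its standard IUPAC code.
IUPAC_CODE = {frozenset(bases): code for bases, code in [
    ("A", "A"), ("C", "C"), ("G", "G"), ("T", "T"),
    ("AG", "R"), ("CT", "Y"), ("CG", "S"), ("AT", "W"), ("GT", "K"), ("AC", "M"),
    ("CGT", "B"), ("AGT", "D"), ("ACT", "H"), ("ACG", "V"), ("ACGT", "N"),
]}


def position_counts(aligned):
    """Count the occurrence of each base at each position of aligned sequences."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    return [Counter(seq[i] for seq in aligned) for i in range(length)]


def degenerate_consensus(aligned, min_fraction=0.25):
    """Collapse aligned sequences to an IUPAC consensus string.

    A base is kept at a position if it occurs in at least min_fraction of
    the sequences (an assumed, adjustable threshold); positions where no
    base reaches the threshold are reported as N.
    """
    total = len(aligned)
    consensus = []
    for counts in position_counts(aligned):
        kept = frozenset(b for b, n in counts.items() if n / total >= min_fraction)
        consensus.append(IUPAC_CODE.get(kept, "N"))
    return "".join(consensus)


# Hypothetical aligned core sequences, for illustration only.
clones = ["CAGCGACA", "CTGCGATA", "CAGCGATA", "CTGCGACA"]
for position, counts in enumerate(position_counts(clones), start=1):
    print(position, dict(counts))
print(degenerate_consensus(clones))
```

Raising `min_fraction` makes the consensus stricter, so that minority bases at a position are ignored rather than encoded as a degenerate code.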

SELEX yields a functional BEN binding site
To test whether BEN functions transcriptionally through this site, three copies of either the wild type or a mutated sequence were cloned upstream of a TK promoter driving the expression of a luciferase reporter gene. While BEN did not have any significant effect on the control TK promoter, it repressed the test promoter nearly 6-fold in a dose dependent fashion (Fig 3A, top panel). The dose dependent ectopic expression of BEN was confirmed under the assay conditions (Fig 3A, bottom panel). While the wild type promoter (WT3X) was repressed by BEN, the mutant (Mut3X) was unaffected by it (Fig 3B). To further demonstrate that the transcriptional effects are indeed due to BEN, a nuclear localization deficient mutant form of BEN (BEN-NLS) was used. Interestingly, while the wild type BEN repressed the test promoter, BEN-NLS actually enhanced the transcriptional activity of this reporter. The expression levels of both the wild type and the mutant proteins were very similar (data not shown).

Candidate BEN Target Genes
We used the EPConDB database to identify potential occurrences of our consensus BEN binding site in the Mus musculus and Homo sapiens genomes.
The scan yielded 68 and 75 hits for the mouse and human genomes respectively. Table 1 (supplemental data S2) shows a partial list of the genes of interest.

Validation of Candidate Genes
In order to validate the potential BEN target genes, we stably silenced BEN by shRNA in C2C12 cells and analyzed several of the candidate genes by quantitative PCR. This analysis revealed that among the candidate genes tested, ALK6/Bmpr1b and Fgf15 are dramatically upregulated upon knockdown of BEN (Fig 3D). Note that the clone infected with both shRNA1074 and shRNA1957 exhibited the best silencing (~8-fold) and was used further to analyze endogenous gene expression, whereas the clone infected with shRNA1074 alone exhibited a ~2-fold reduction in BEN and was used as control. We believe this is why Fgf15 is expressed at higher levels in these control cells than in parental (WT) cells. In addition to these genes, Sox4 and En1 are also significantly upregulated upon BEN knockdown (data not shown). However, GATA3 and Bmp8b appear to be largely unaffected under these conditions (data not shown).
To further establish whether BEN is recruited to our consensus site in vivo, we performed a quantitative ChIP assay with the Fgf15 gene. We used the TESS software tool at different websites to search for the BEN consensus in the murine genome. Because Fgf15 gave more consistent bioinformatics (in silico) data, we decided to use it for ChIP rather than ALK6/Bmpr1b. For this assay, we utilized a C2C12 cell line stably expressing ectopic GFP-BEN under the control of a tetracycline (tet)-repressible promoter. The cells were either untreated or treated with tet, and harvested for the ChIP assay. The immunoprecipitation was performed with either the anti-GFP antibody or the negative control mIgG. For quantitative PCR, we designed primers and a probe encompassing the region containing our consensus site, located at -190 to -197 bp relative to the transcription start site (25). We also checked the selected region of Fgf15 to make sure it does not harbor the previously reported binding site for GTF2I-like repeats (26). In the absence, but not in the presence, of tetracycline there is a ~2-fold enrichment of Fgf15 in the sample immunoprecipitated with anti-GFP antibody versus the IgG control. No significant enrichment was observed for cyclin D1. This indicates that BEN is recruited to the region of the Fgf15 promoter containing our consensus site, and not to the cyclin D1 promoter.

Discussion
We employed a random binding site selection method (SELEX), based on the selection of specific protein binding sites from a pool of randomized DNA sequences, and arrived at a potential consensus site for the transcription factor BEN. The sequence we derived using SELEX and the wild type full length BEN appears to be different from that derived by Vullhorst and Buonanno using a fragment of BEN (26). It is likely that the difference between the two consensus sites is due to differences in the reagents and methodologies used.
For instance, a recombinant, bacterially purified fragment of BEN was used in the earlier study (26), whereas we used full length BEN expressed in and purified from mammalian COS7 cells. Moreover, the site selection methods employed are also somewhat different. However, because the TFII-I family proteins appear to bind to multiple sequence elements and thus do not exhibit a high degree of sequence specificity, it is very likely that BEN binds to both experimentally derived sequences.
Most importantly, the significance of our selected site was demonstrated by validating it in functional assays. This was achieved both by transient transfection experiments and by stable silencing of BEN followed by analysis of potential target genes in their native environment via quantitative RT-PCR analysis.
WBS is a neurodevelopmental disease with characteristic physical and behavioral traits that is caused by a microdeletion of the 7q11.23 region containing several genes (5,6,27). Indeed, a recent report identifies an atypical WBS patient with a deletion in Gtf2ird1 (encoding BEN) who exhibits craniofacial defects (27). Furthermore, a transgenic mouse model also strongly indicates that deletion or mutations in BEN are causal to the craniofacial defects (27). However, given the involvement of these transcription factors in WBS (5,6,27), a rigorous biochemical approach to identifying their function and physiologically relevant target genes is essential.
Our search returned several genes involved in the TGFβ/BMP pathway, such as Bmp8b, ALK6/Bmpr1b, BMP4 and ACVR1/ALK2, which correlate with the microarray data (28). Among these, ALK6/Bmpr1b was validated as a BEN target gene in our in vivo analysis (Fig 3D). Our search also revealed several olfactory receptor genes. Interestingly, BEN binds to a regulatory region that controls the expression of olfactory receptors (29). The search further revealed several fibroblast growth factor genes: Fgf14, Fgf15 (Mouse) and FGF5, FGF14, FGFR2 (Human), several of which were downregulated in cells overexpressing BEN (28). Our experimental analysis of Fgf15 concurs with the microarray data (28). That Fgf15 is a bona fide BEN target gene is also borne out by the fact that BEN was recruited to this promoter in vivo. FGFR2 is another interesting candidate gene because mutations of FGFR2 are associated with craniofacial dysmorphology (30). Our analysis also reveals several genes implicated in vertebrate development such as CDX1 (31), Neurogenin 1 (32), and Sox4 (33). It is thus gratifying to observe an experimental validation of some of these as potential BEN target genes in vivo.

Quantitative ChIP assay of stably expressed GFP-BEN in C2C12 cells in the absence or in the presence of tetracycline (Tet, 20µg/ml), with anti-GFP antibody (Ab) or its isotype matched mIgG (IgG) as a negative control. The Mock lane is without lysate or antibody and serves as a negative control for cross-contamination. The quantitative PCR was performed in triplicate with primers and probe for the Fgf15 promoter, while the cyclin D1 promoter served as a negative control, and both were normalized to GAPDH. The experiment was repeated three times and a representative one is shown. Western blot with anti-GFP antibody shows expression of GFP-BEN in the whole cell lysate (lane 1), and in ChIP lysates in the absence (lane 2) but not in the presence (lane 3) of Tet. β-actin served as a loading control.