Originally published In Press as doi:10.1074/jbc.M409176200 on March 24, 2005
J. Biol. Chem., Vol. 280, Issue 22, 21491-21497, June 3, 2005
Identification of Estrogen-responsive Genes Using a Genome-wide Analysis of Promoter Elements for Transcription Factor Binding Sites*
Sitharthan Kamalakaran¶, Senthil K. Radhakrishnan||, and William T. Beck**
From the Departments of Biochemistry and Molecular Genetics, ||Medicine, Microbiology and Immunology, and **Biopharmaceutical Sciences and Cancer Center, University of Illinois at Chicago, Chicago, Illinois 60612
Received for publication, August 11, 2004, and in revised form, February 28, 2005.
ABSTRACT
We developed a pipeline to identify novel genes regulated by the steroid hormone-dependent transcription factor, estrogen receptor, through a systematic analysis of upstream regions of all human and mouse genes. We built a data base of putative promoter regions for 23,077 human and 19,984 mouse transcripts from National Center for Biotechnology Information annotation and 8793 human and 6785 mouse promoters from the Data Base of Transcriptional Start Sites. We used this data base of putative promoters to identify potential targets of estrogen receptor by identifying estrogen response elements (EREs) in their promoters. Our program correctly identified EREs in genes known to be regulated by estrogen in addition to several new genes whose putative promoters contained EREs. We validated six genes (KIAA1243, NRIP1, MADH9, NME3, TPD52L, and ABCG2) to be estrogen-responsive in MCF7 cells using reverse transcription PCR. To allow for extensibility of our program in identifying targets of other transcription factors, we have built a Web interface to access our data base and programs. Our Web-based program for Promoter Analysis of Genome, PAGen@UIC, allows a user to identify putative target genes for vertebrate transcription factors through the analysis of their upstream sequences. The interface allows the user to search the human and mouse promoter data bases for potential target genes containing one or more listed transcription factor binding sites (TFBSs) in their upstream elements, using either regular expression-based consensus or position weight matrices. The data base can also be searched for promoters harboring user-defined TFBSs given as a consensus or a position weight matrix. Furthermore, the user can retrieve putative promoter sequences for any given gene together with identified TFBSs located on its promoter. Orthologous promoters are also analyzed to determine conserved elements.
INTRODUCTION
Estrogens play a critical role in vertebrate reproduction, especially in the development of female sex organs (1). The bulk of estrogen signaling is controlled by estrogen receptors α and β (ERα¹ and ERβ), members of the nuclear receptor superfamily (2) of ligand-inducible transcription factors. ERα and ERβ bind to the estrogen response element (ERE) on the promoters of the genes they regulate. The ERE is a palindrome of RGGTCA motifs separated by 3 bp (3). These elements are bound by ER dimers, with one receptor binding each motif. The sequence and spatial organization of the ERE are important for the specificity of binding. Although a crystal structure of the DNA-bound ER is available (4), the unexpected diversity of DNA response elements mediating transcriptional regulation by ERs has made finding EREs in the genome very difficult. In fact, none of the typical estrogen-responsive genes have consensus EREs.
The sequencing of the human genome (5, 6) has opened a new way to study gene regulation through bioinformatic analysis of transcription factor binding sites (TFBSs) on gene promoters. There now exist on the Web both free and subscription-based resources for such analysis of specific promoter regions to identify putative TFBSs (7–12). Given a user-defined DNA sequence, these programs can predict the potential TFBSs. These resources are valuable for researchers studying regulation of individual genes. When it comes to studying global changes in gene expression following various treatments (drugs, hormones, heat shock) of cells or animals, DNA microarrays are the most popular tool. Most of these changes in gene expression are mediated through the activation of one or more transcriptional regulators (for example, p53 in DNA damage response and steroid hormone receptors like ER and retinoic acid receptor following estrogen and retinoic acid treatment, respectively). It is possible, using bioinformatics, to predict genes that are regulated by these transcriptional regulators by analyzing the promoter regions for binding sites for these transcription factors. In one such study, Hoh et al. (10) developed a computer algorithm, p53MH, to perform a genome-wide scan for p53 binding sites and identified 2,583 genes with putative binding sites for p53. Others have used similar approaches to identify p53 target genes in promoter regions and introns and c-Myc target genes in 5'-untranslated regions, respectively (11, 13). Although these studies have shown that these approaches are worthwhile and identify potential regulated genes of transcription factors of interest, they are limited in scope because they were applied to a single transcription factor. There is thus a need for a public resource for investigators to identify genes that are potentially regulated by specific transcription factor(s) through genome-wide bioinformatic analysis of promoter regions.

FIG. 1. Strategy used to predict potential targets of transcription factors. Human and mouse putative promoter regions from both NCBI annotation and DBTSS data bases were analyzed by regular expression and weight matrix methods for transcription factor binding sites using profiles from TRANSFAC. A browser interface was built to display the potential target genes.
In this study we developed a pipeline to identify genes controlled by the steroid hormone transcription factor, estrogen receptor. We have also shown that our method is easily extensible to the study of all transcription factors with a characterized binding site. Our Web-based software, PAGen@UIC, described herein, allows a user to identify from a set of all annotated human and mouse promoters the ones that may be regulated by the transcription factor of interest.
EXPERIMENTAL PROCEDURES
Programs. All programming languages and software used were Open Source, supplied under a general public license. The programs were written in Python (www.python.org). We used the MySQL data base server software (www.mysql.com). The application runs on an Apache 2.0 HTTP server (www.apache.org). All graphs were generated using the SciPy package for Python. Programs from the EMBOSS software suite (14) were used for some functions: cpgplot for generating CpG island maps and profit for weight matrix searches.
Generation of Putative Promoter Regions and Control Random Regions. Python programs were written to generate putative promoter regions (PPRs), each a sequence spanning 2000 nt upstream and 250 nt downstream of the transcription start site (TSS) in the GenBank™ annotation of the genome data (NCBI Genome Build 34 Ver3, March, 2004 for human; NCBI Genome Build 32 Ver1, September, 2003 for mouse), and to write them into a MySQL relational data base. For control random sequences, a similar program was written to obtain 2250-nt sequences from random positions in the genome. The number of random regions per chromosome matched the number of genes on each chromosome. Ortholog information for the mouse and human genes was compiled from the ENSEMBL project (15, 16).
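The window extraction described above can be sketched as follows. This is a minimal illustration and not the authors' program: the function names, the 0-based TSS index, the strand handling, and the clipping at chromosome ends are all assumptions made for the example. Only the window sizes (2000 nt upstream, 250 nt downstream, 2250-nt random controls) come from the text.

```python
import random

UPSTREAM = 2000    # nt upstream of the TSS, as stated in the text
DOWNSTREAM = 250   # nt downstream of the TSS, as stated in the text

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_ppr(chrom_seq, tss, strand):
    """Return the putative promoter region around a 0-based TSS index.

    Hypothetical helper: for '+' genes the window is [tss - 2000, tss + 250);
    for '-' genes the window is mirrored and reverse-complemented so that the
    result always reads 5'->3' in the direction of transcription. In both
    cases the TSS nucleotide lands at index 2000 of an unclipped window.
    Windows are clipped at the chromosome ends.
    """
    if strand == "+":
        start, end = tss - UPSTREAM, tss + DOWNSTREAM
    else:
        start, end = tss - DOWNSTREAM + 1, tss + UPSTREAM + 1
    start, end = max(start, 0), min(end, len(chrom_seq))
    window = chrom_seq[start:end]
    return window if strand == "+" else reverse_complement(window)


def random_region(chrom_seq, rng, length=UPSTREAM + DOWNSTREAM):
    """Return a control fragment of `length` nt starting at a random position."""
    start = rng.randrange(0, len(chrom_seq) - length + 1)
    return chrom_seq[start:start + length]
```

To mirror the control design in the text, `random_region` would be called once per annotated gene on each chromosome, so that the control set matches the per-chromosome gene counts.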
Cell Lines and Reagents. The MCF7 cell line was obtained from ATCC (Manassas, VA). 17β-Estradiol was purchased from Calbiochem. Unless indicated otherwise, all other reagents and supplies were purchased from standard commercial sources.
RT-PCR. MCF7 cells were grown in phenol red-free medium supplemented with 5% charcoal dextran-stripped serum (low estrogen) for 72 h and then treated with the same medium supplemented with 0, 1.0, or 10 nM 17β-estradiol (E2) for another 72 h. Total RNA was isolated by the TRIzol method as recommended by the manufacturer (Invitrogen); 3 µg were used for cDNA synthesis using the Thermoscript RT-PCR system (Invitrogen) and oligo(dT) primers. Custom synthesized primers were purchased from Sigma Genosys (The Woodlands, TX) and used to amplify the cDNA. PCR cycles were optimized for each gene for log phase amplification. Experiments were repeated independently at least three times.
RESULTS
The annotation of the genome by the NCBI maps the TSS of each gene in the genome (15, 17). Using this annotation, we have created a data base of sequences containing PPRs spanning 2000 nt upstream and 250 nt downstream of the TSS of each gene. The data base comprises PPRs for 23,077 human genes and 19,984 mouse genes. In addition, we obtained 8793 human and 6785 mouse promoter sequences from the Data Base of Transcriptional Start Sites (DBTSS) resource. We used the TRANSFAC data base (18, 19) of DNA binding profiles of eukaryotic transcription factors to build a regular expression (RegEx)-based consensus binding set and a position weight matrix profile for various vertebrate transcription factors. These were then run against the data base of human and mouse promoters. We thus established a "PAGen data base" whereby each annotated gene is defined by a set of putative TFBSs located in its PPR. A flowchart of our approach is shown in Fig. 1.
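One way to picture a data base in which each gene is defined by the TFBSs found in its PPR is a pair of relational tables, one for promoters and one for hits. The sketch below is an assumption-laden illustration: it uses Python's built-in sqlite3 module in place of the MySQL server named in the text, and the table names, column names, and gene names are hypothetical rather than taken from PAGen.

```python
import sqlite3


def create_schema(conn):
    """Create two hypothetical tables: promoter sequences and TFBS hits."""
    conn.executescript("""
        CREATE TABLE promoter (
            gene TEXT, species TEXT, source TEXT, sequence TEXT,
            PRIMARY KEY (gene, species, source)
        );
        CREATE TABLE tfbs_hit (
            gene TEXT, species TEXT, source TEXT,
            factor TEXT, position INTEGER, site TEXT
        );
    """)


def genes_with_factor(conn, factor, species="human", source="NCBI",
                      max_upstream=None):
    """Return sorted gene names having at least one hit for `factor`.

    `position` is the offset of the hit relative to the TSS (negative values
    are upstream). If `max_upstream` is given, only hits at or after
    -max_upstream are considered.
    """
    query = ("SELECT DISTINCT gene FROM tfbs_hit "
             "WHERE factor = ? AND species = ? AND source = ?")
    params = [factor, species, source]
    if max_upstream is not None:
        query += " AND position >= ?"
        params.append(-max_upstream)
    query += " ORDER BY gene"
    return [row[0] for row in conn.execute(query, params)]
```

With such a layout, a query like `genes_with_factor(conn, "ER", max_upstream=1000)` would return only genes whose stored ER hit lies within 1000 nt upstream of the TSS.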
The accuracy of prediction of a given TFBS is dependent on having a good model of transcription factor binding profile. Two major approaches, consensus set and PWMs, have been used to generate a transcription factor binding profile (see Ref. 20 for a review). The consensus method has the major disadvantage of being too rigorous (many false negatives) or too degenerate (many false positives). To overcome this problem, we have adopted a regular expression-based consensus approach (Fig. 2A). Regular expression search is an established standard for text and pattern matching (21), an important advantage being that it allows for all Boolean operations to be performed. This allows for more complex pattern searches than the use of a degenerate consensus set. It also allows for easy adoption for users familiar with some bioinformatics programming and ease of learning for others with less experience. A user can not only search for promoters with binding sites for one or more specific transcription factors but can also exclude any TFBS that the promoter set should not have.
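A regular expression search of this kind can be illustrated with Python's standard `re` module. The sketch below is not the PAGen code: the helper names are hypothetical, and the ERE pattern is simply the textbook palindrome of RGGTCA half-sites separated by 3 bp written in IUPAC codes, which is an assumption about how such a consensus might be encoded. It also shows the include/exclude logic described above.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def iupac_to_regex(consensus):
    """Translate an IUPAC consensus string into a compiled regex."""
    return re.compile("".join(IUPAC[base] for base in consensus.upper()))


def find_sites(pattern, promoter, tss_index=2000):
    """Return (offset from the TSS index, matched sequence) for every hit.

    An offset of 0 is the TSS nucleotide itself; negative offsets are
    upstream. A zero-width lookahead is used so overlapping sites are
    reported too.
    """
    lookahead = re.compile("(?=(%s))" % pattern.pattern)
    return [(m.start() - tss_index, m.group(1))
            for m in lookahead.finditer(promoter.upper())]


def select_promoters(promoters, require, exclude=()):
    """Keep genes whose promoter matches all `require` and no `exclude` patterns."""
    return [name for name, seq in promoters.items()
            if all(p.search(seq.upper()) for p in require)
            and not any(p.search(seq.upper()) for p in exclude)]
```

For example, `iupac_to_regex("RGGTCANNNTGACCY")` compiles the palindromic ERE, and passing a second pattern in `exclude` removes promoters that also carry an unwanted site.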
The second approach we have taken to build the transcription factor binding profile involves generation of PWMs (Fig. 2B). The elements of a PWM correspond to scores reflecting the probability of a given nucleotide occurring at a particular position of the TFBS (see Ref. 22 for a review). PWMs provide the best approximation for determining TFBSs and have been used successfully to identify many biologically relevant targets of transcription factors (8, 10, 23).
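As a rough illustration of PWM-based scanning, the sketch below builds a log-odds matrix from a handful of aligned sites and slides it along a sequence. It is not the EMBOSS profit program used in the study: the pseudocount, the uniform background of 0.25, the log-odds scoring, and the threshold handling are assumptions chosen to keep the example short.

```python
import math

BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds position weight matrix from equal-length sites."""
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + 4 * pseudocount
    pwm = []
    for i in range(length):
        column = [s[i].upper() for s in sites]
        pwm.append({
            b: math.log2(((column.count(b) + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm


def score(pwm, window):
    """Sum the per-position log-odds scores for one window."""
    return sum(col[base] for col, base in zip(pwm, window))


def scan(pwm, seq, threshold):
    """Return (start index, score) for every window scoring >= threshold.

    Windows containing characters other than A, C, G, T are skipped.
    """
    seq = seq.upper()
    width = len(pwm)
    hits = []
    for i in range(len(seq) - width + 1):
        window = seq[i:i + width]
        if set(window) <= set(BASES):
            s = score(pwm, window)
            if s >= threshold:
                hits.append((i, s))
    return hits
```

Lowering the threshold admits more degenerate sites, which is the usual trade-off between sensitivity and false positives in matrix searches.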

FIG. 2. Methods of generation of transcription factor binding profile. A, RegEx-based consensus is compared with Rigorous and Degenerate consensus. Note that the RegEx descriptors are more efficient than the other consensus methods in retrieving potential hits while minimizing false positives. B, an example of a position weight matrix is shown. Each number in the matrix denotes the frequency with which the corresponding nucleotide occurs in a given transcription factor binding site.
Validation of the Promoter Set. Another factor that determines the relevance of predicted genes with the queried TFBSs is the accuracy of the TSS, because our PPRs are generated based on this information. We needed to validate the set of putative promoters to determine its usefulness, and to this end we used several tests. On a macro level, we plotted the GC content of the promoter set against a set of 2250-nt fragments randomly generated in the genome with matching gene frequencies for each chromosome. Because promoters on average are known to be GC-rich, an increase in GC content for the promoter set over a randomly generated control set demonstrates validity. Not only did our set of promoters have higher GC content, but a plot of the distribution of putative CpG islands also showed a striking bias toward the TSS in the promoter sets, whereas the islands were distributed randomly in the random set (Fig. 3A). We used the default EMBOSS values in the cpgplot program for predicting CpG islands, defined as having at least one 50-nt window where the observed to expected ratio is >0.6 and the GC content >50%. The numbers of CpG islands in our promoter sets are included in supplemental Table I. We also plotted the number of TFBSs for a given transcription factor against a given position relative to the TSS (-2000 to +1) for the random control set as well as our experimental promoter set. The clustering of a number of TFBSs (CREB1, CAAT, SP1, GC1) in the -500 to +1 region of the TSS in the promoter set, in contrast to their random distribution in the control set, demonstrated validity (Fig. 3B).
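The two sequence statistics used in this validation can be sketched in a few lines. The code below is illustrative only and does not reproduce cpgplot: the observed/expected ratio uses a commonly cited formula, count(CG) × N / (count(C) × count(G)), and whether cpgplot computes it in exactly this way is an assumption. The thresholds (50-nt window, ratio > 0.6, GC content > 50%) are the ones quoted in the text.

```python
def gc_content(seq):
    """Fraction of G and C nucleotides in a sequence (0.0 for an empty one)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: count(CG) * N / (count(C) * count(G))."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)


def cpg_windows(seq, window=50, min_oe=0.6, min_gc=0.5):
    """Return start indices of windows that pass both thresholds."""
    return [i for i in range(len(seq) - window + 1)
            if gc_content(seq[i:i + window]) > min_gc
            and cpg_obs_exp(seq[i:i + window]) > min_oe]
```

Comparing the mean `gc_content` of the promoter set with that of the random control set, and plotting the positions returned by `cpg_windows` relative to the TSS, would give plots analogous to the ones described above.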
There have been reports about wide inaccuracies in the NCBI genome annotations (24), leading the authors of one study to identify TSS by mapping mRNA and expressed sequence tag sequences to the genome and to build a dataset of proximal promoters (PromoSer; biowulf.bu.edu/zlab/PromoSer/) for human, mouse, and rat genes. Pairwise sequence analysis of our set of PPRs with the corresponding regions obtained from PromoSer revealed that >90% of our promoter set had >95% identity. The DBTSS project also aims to map the exact genomic position and TSS of all human and mouse genes (12, 25). The current version, however, has TSS for only 8793 human and 6785 mouse genes. A comparison of the promoter sequences that we derived from NCBI annotation with this limited set of promoters provided in the DBTSS project showed that >85% of our promoter sequences share at least 90% identity with their respective DBTSS promoters. Of the remaining 15%, 13% of the NCBI promoters were, on average, 4986 bases downstream of their corresponding DBTSS promoters (8% were found within 1 kb, 1% between 1 and 5 kb, and the rest over 10 kb). The remaining 2% were extremely far apart. To accommodate these discrepancies in promoters, we decided to use the DBTSS promoter set as an additional option.
We then searched our data base for putative estrogen-regulated genes by looking for the presence of EREs in their corresponding promoters. We were able to identify several known ER-regulated genes (EBAG9, c-fos, OXT, F12, TFF1, LTF, CTSD, PFDN2, TGF-
, AGT, GREB1). These genes and their EREs as published in the literature and confirmed using PAGen@UIC are shown in Table I. We also identified several novel candidate genes with EREs in their promoters. We tested some of these genes for estrogen regulation using RT-PCR. For our analysis we selected only those genes with an ERE occurring within 1000 nt upstream of the TSS. We chose 16 such genes, designed gene-specific primers, and examined them for regulation by estrogen in MCF7 cells following a 72-h treatment. Of the 16 genes assayed, 6 were not expressed in MCF7 cells and 4 were unresponsive to treatment. However, as shown in Fig. 4, six genes with an ERE in their upstream sequence regions were, in fact, regulated by estrogen in MCF7 cells. Five genes (KIAA1243, NRIP1, MADH9, NME3, and TPD52L) were up-regulated upon treatment, whereas one (BCRP/ABCG2) was down-regulated. These genes, their predicted EREs, and position relative to their TSS are listed in Table II. The down-regulation of ABCG2/BCRP seems to be cell line-specific because it is, in fact, up-regulated following estrogen treatment in ER-positive T47D:A18 cells, an effect that is mediated by the ERE identified by PAGen@UIC (26). Thus, 6 of the 10 candidate genes that were expressed in MCF7 cells proved to be estrogen-responsive, indicating that PAGen@UIC can predict novel estrogen-responsive genes with good accuracy.
TABLE I. Known estrogen-responsive genes identified. The PAGen promoter set was searched using EREs from previously published estrogen-regulated genes. Genes shown below could be identified in this way. The bases that deviate from the consensus are shown in lower case.
TABLE II. Novel estrogen-responsive genes identified. Novel estrogen-responsive genes validated by RT-PCR and their predicted EREs are shown.
Finally, we wanted our pipeline to be accessible to other researchers studying transcriptional regulation. The tools we developed can be easily extended for the study of any transcription factor whose binding site has been characterized. We built a wrapper for our data base and data mining tools so that they can be freely accessible through a browser interface on the World Wide Web.
Description of PAGen@UIC. Our program interface (available at www.uic.edu/pharmacy/depts/pmpcpd/pagen/) allows the user to perform the following operations: (a) search either mouse or human promoter data base (NCBI or DBTSS promoter set) for potential target genes containing one or more listed TFBSs in their upstream regions, using either RegEx-based consensus or PWMs; (b) search the data bases for a user-defined sequence given in the form of a regular expression for a RegEx-based consensus search or a FASTA-formatted list of TFBSs for a PWM-based search; (c) retrieve putative promoter sequences for any given gene together with identified TFBSs located on its promoter. The promoter of the orthologous gene, along with CpG plots indicating putative CpG islands in the PPR, is also provided. The typical result of a query lists a set of genes with putative binding sites for the selected transcription factor together with the sequence and position of the identified binding site. Each identified gene is also linked to the NCBI data base. For example, the user can look up the mRNA transcript of a gene, search PubMed for publications, retrieve the chromosomal location of the gene using NCBI Mapview, and obtain the orthologous promoter. The numbers of potential target genes predicted by our program for some well known transcription factors are included in Table III. The full list of our transcription factors is presented in supplemental Table II.
TABLE III. Number of predicted target genes for some transcription factors. Numbers of potential target genes for some well known transcription factors using different prediction methods from both the PAGen and DBTSS sets of human (h) and mouse (m) promoters are shown.

FIG. 3. Analysis of the promoter sets. A, the occurrence of CpG islands against their relative position with respect to the transcriptional start site (TSS) in the PAGen and DBTSS promoter sets is contrasted with a random set. A vast majority of the promoters in the PAGen and DBTSS sets have CpG islands clustered near the TSS, whereas the islands are distributed throughout the length of the sequence for the random set. B, the position bias for binding sites of transcription factors CREB1, GC1, SP1, and CAAT in the PAGen set of promoters is compared with a random set of sequences. Transcription factors with GC-rich response elements (CREB1, GC1, and SP1) as well as non-GC-rich response elements (CAAT) tend to cluster near the TSS.
DISCUSSION
In this study we have built a freely available online data base of upstream elements that can be queried for binding sites for one or more transcription factors and then used it to identify and validate six novel estrogen-responsive genes. The size of the data base is larger than any previously available resource. Importantly, our resource can be queried by any user-determined regular expression or PWM, allowing users to determine the expression that best describes their binding site. There are a few data bases of promoters available on the Web. The most accurate resource, the Eukaryotic Promoter data base (27–29), contains annotated eukaryotic POL II promoters with an experimentally determined TSS. This data base contains only 1871 human promoters and 196 mouse promoters. This constitutes less than 10% of human genes and less than 1% of mouse genes. The two data bases similar in size to ours are PromoSer (24) and mPromDb (30). However, these tools are primarily to retrieve promoters of interest. The PromoSer and Eukaryotic Promoter data bases do not provide the capability to search for TFBSs. With mPromDb, it is only possible to retrieve a set of experimentally verified target genes of a transcription factor. By contrast, our tool, in addition to retrieving promoters of interest, also allows a user to identify a set of putative genes that may be targets of a given transcription factor through the identification of TFBSs in their promoters.
There are a few reports on genome-wide analysis of regulation by specific genes (8, 10, 11, 13, 31). These efforts concentrate on defining a specific matrix for the gene of interest. In addition, some of these programs are proprietary and have not been released in the public domain. Our effort, PAGen@UIC, although applied here to one specific response element, can be adapted easily to any user-defined criteria. The freely available browser-based input gives any user the ability to search TFBSs readily for the transcription factor of interest. Moreover, our program provides the user the ability to tailor the search as needed. An investigator can make subtle changes in the consensus site based on the hypothesis being tested. Instead of defining a consensus for a given transcription factor, the user can also form a regular expression using only experimentally verified binding sites for that particular transcription factor and can find new genes that are potentially regulated by the transcription factor. For example, the tumor suppressor PTEN has been shown to be regulated by p53 (32). However, the promoter for PTEN does not contain the consensus binding site for p53 (33), which is defined as two copies of the 10-bp motif 5'-RRRC(A/T)(T/A)GYYY-3' separated by a spacer of 0–13 bp. If the spacer is varied by one base, i.e. 14 nucleotides are allowed instead of 13, PTEN is identified in these searches. In fact, we have identified many EREs in this manner. The promoter for the ABC half transporter, BCRP/ABCG2, has an ERE that is similar to the ERE of c-fos, but not the consensus ERE that has been widely published. Similarly, the MADH9 promoter has an ERE that is similar to the ERE of F12.
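The effect of relaxing the spacer can be illustrated with a parameterized regular expression. In the sketch below, the half-site pattern follows the p53 consensus quoted in the text, but the function name is hypothetical and the test sequence in the usage note is synthetic; it is not the PTEN promoter.

```python
import re

# One 10-bp half-site, RRRC(A/T)(T/A)GYYY, written as a character-class regex.
HALF_SITE = "[AG]{3}C[AT][AT]G[CT]{3}"


def p53_pattern(max_spacer=13):
    """Compile a regex for two half-sites separated by 0..max_spacer bases."""
    return re.compile("%s[ACGT]{0,%d}%s" % (HALF_SITE, max_spacer, HALF_SITE))
```

For a synthetic sequence such as `"AGACATGCCC" + "T" * 14 + "GGGCATGTCT"`, `p53_pattern(13).search(...)` finds nothing, whereas `p53_pattern(14).search(...)` matches, which is the kind of one-base adjustment a user could make to the consensus.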
The RegEx-based consensus provides significantly better results than traditionally employed consensus approaches. In our case (ER binding sites), a degenerate consensus would yield 220 hits, and a rigorous consensus would yield only 3, whereas our RegEx method gave us 119 hits. The main strength of this method is that it produces a small set of hits and can reduce the number of spurious hits. However, this also means that many real hits may be omitted. PWMs, on the other hand, are the established method for identifying TFBSs. They generate many putative targets that allow one to identify genes that may not be found using the RegEx method. We believe these two approaches complement each other in fulfilling the potential needs of users.
Another important feature of PAGen@UIC is the integration of the human and mouse promoter data bases. This provides an extra level of confidence that a promoter with a transcription factor binding site is biologically relevant. Thus, if the user searches for human promoters with binding sites for a given transcription factor, in addition to providing a list of all the human promoters that have a binding site, the program also looks for binding sites for the same transcription factor in the orthologous gene promoters in the mouse data base, and the result table will have an additional link wherever a binding site is present in the orthologous mouse promoter. For example, NRIP1 is an estrogen-responsive gene we identified using this program, with a consensus ERE at position -721; by comparison the program also identified that the mouse ortholog of NRIP1 has an ERE at position -651. This conservation of the ERE increases the probability that it is a relevant hit. This functionality in PAGen@UIC is especially relevant if the transcription factor binding site is poorly defined (e.g. c-Myc has a consensus binding site, CACGTG) and the search generates many hits. In such cases, the user can focus only on those hits where the promoters of both the mouse and human genes are predicted to have the binding site for the desired transcription factor.
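The ortholog filter described above amounts to intersecting hits across two promoter sets. The following sketch assumes the promoters are held in dictionaries keyed by gene name and that a human-to-mouse ortholog mapping is available; the function name, the data layout, and the gene names in the usage note are hypothetical.

```python
import re


def conserved_hits(pattern, human_promoters, mouse_promoters, orthologs):
    """Return human genes whose own promoter and mouse ortholog promoter both match.

    `orthologs` maps a human gene name to its mouse gene name. The result maps
    each conserved human gene to (human hit start, mouse gene, mouse hit start),
    using the first match found in each promoter.
    """
    conserved = {}
    for human_gene, human_seq in human_promoters.items():
        mouse_gene = orthologs.get(human_gene)
        mouse_seq = mouse_promoters.get(mouse_gene)
        if mouse_seq is None:
            continue  # no ortholog or no mouse promoter available
        human_hit = pattern.search(human_seq)
        mouse_hit = pattern.search(mouse_seq)
        if human_hit and mouse_hit:
            conserved[human_gene] = (human_hit.start(), mouse_gene,
                                     mouse_hit.start())
    return conserved
```

With a short, frequently occurring motif such as `re.compile("CACGTG")`, this filter keeps only the genes in which the site is also present in the mouse ortholog, which can shrink a long hit list considerably.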
Although we project that PAGen@UIC will be a useful tool, we are also aware that there are some limitations. First, our data base of upstream sequence regions is based on the NCBI annotation of the transcription start site for each gene. The tests we have performed show that the promoter set based on NCBI annotation is reliable and compares well with other promoter data bases. However, it is possible that there are a few errors in the annotation that will negatively impact our data set. Second, we have not undertaken any efforts to determine the exact regulatory region for each gene. Our data base of -2000 to +250 base sequences contains only those potential regulatory regions within this region. Although this will hold true in most cases, there are notable exceptions. For example, we know that the p53 binding site for p21 is more than 2 kb upstream of the promoter. We have chosen not to include sequences further upstream because their inclusion will add greatly to the numbers of false positives while detecting few relevant binding sites. PAGen@UIC cannot identify genes with non-canonical transcription factor binding sites or genes that are indirectly regulated by transcription factors (with no binding site in the regulatory regions). There have also been several reports of transcription factors binding introns and 5'-untranslated regions. The 5'-untranslated regions have not been included in this data base. However, a separate data base of human and mouse 5'- and 3'-untranslated regions with all the functionalities described here for such searches is currently under development.
Our tool provides an integrated approach to the study of gene regulation through the analysis of upstream promoter elements. The number of promoters in our data base compares favorably to the other available resources. A significant addition provided by our resource enables the user to identify putative targets of various transcription factors. The ability to perform both regular expression searches and weight matrix searches on a large set of putative promoters is unique to our tool. The integration of the mouse and human promoters in every search operation to identify conservation of the binding sites adds greatly to the usefulness of our tool. User-defined searches of our promoters will allow researchers to modify TFBSs where the TRANSFAC profile is not deemed to be accurate.
Further, this approach can potentially provide functional characterization for previously unknown or predicted genes. For example, the KIAA1243 gene is a predicted gene with some expressed sequence tag evidence, but has no characterized function. We were able to identify it as an estrogen-responsive gene using our application, thus providing a direct approach to characterize its function. We thus expect our application to be quite useful for the study of gene regulation and development pathways associated with one or more transcription factors.
FOOTNOTES
* This work was supported in part by NCI, National Institutes of Health Grants CA40570 and CA30103, by Department of Defense Grant DAMD17-02-1-0412, and by the University of Illinois at Chicago. This investigation was conducted in a facility constructed with support from Grant C06RR15482 from the National Institutes of Health National Center for Research Resources. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact. 
The on-line version of this article (available at http://www.jbc.org) contains two supplemental tables. 
Both authors contributed equally to this work. 
¶ Present address: Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724. 

To whom correspondence should be addressed: Dept. of Biopharmaceutical Sciences (MC 865), University of Illinois at Chicago, 833 S. Wood St., Chicago, IL 60612. Tel.: 312-996-0888; Fax: 312-996-0098; E-mail: wtbeck@uic.edu.
1 The abbreviations used are: ER, estrogen receptor; PAGen, Promoter Analysis of Genome; TFBS, transcription factor binding site; ERE, estrogen response element; TSS, transcriptional start site; PWM, position weight matrix; PPR, putative promoter region; NCBI, National Center for Biotechnology Information; DBTSS, Data Base of Transcriptional Start Sites; RegEx, regular expression; CREB, cAMP-response element-binding protein; RT, reverse transcription. 
ACKNOWLEDGMENTS
We thank Drs. Yin Yuan Mo, Xiaolong He, Rachel Ee, and Martina Vaskova for critical comments and suggestions.
REFERENCES
- Rosenfeld, C. S., Wagner, J. S., Roberts, R. M., and Lubahn, D. B. (2001) Reproduction 122, 215-226
- Schwabe, J. W., and Teichmann, S. A. (2004) Science's STKE 2004, pe4
- Sanchez, R., Nguyen, D., Rocha, W., White, J. H., and Mader, S. (2002) BioEssays 24, 244-254
- Schwabe, J. W., Chapman, L., Finch, J. T., and Rhodes, D. (1993) Cell 75, 567-578
- Lander, E. S., Linton, L. M., Birren, B., Nusbaum, C., Zody, M. C., Baldwin, J., Devon, K., Dewar, K., Doyle, M., FitzHugh, W., Funke, R., Gage, D., Harris, K., Heaford, A., Howland, J., Kann, L., Lehoczky, J., LeVine, R., McEwan, P., McKernan, K., Meldrim, J., Mesirov, J. P., Miranda, C., Morris, W., Naylor, J., Raymond, C., Rosetti, M., Santos, R., Sheridan, A., Sougnez, C., Stange-Thomann, N., Stojanovic, N., Subramanian, A., Wyman, D., Rogers, J., Sulston, J., Ainscough, R., Beck, S., Bentley, D., Burton, J., Clee, C., Carter, N., Coulson, A., Deadman, R., Deloukas, P., Dunham, A., Dunham, I., Durbin, R., French, L., Grafham, D., Gregory, S., Hubbard, T., Humphray, S., Hunt, A., Jones, M., Lloyd, C., McMurray, A., Matthews, L., Mercer, S., Milne, S., Mullikin, J. C., Mungall, A., Plumb, R., Ross, M., Shownkeen, R., Sims, S., Waterston, R. H., Wilson, R. K., Hillier, L. W., McPherson, J. D., Marra, M. A., Mardis, E. R., Fulton, L. A., Chinwalla, A. T., Pepin, K. H., Gish, W. R., Chissoe, S. L., Wendl, M. C., Delehaunty, K. D., Miner, T. L., Delehaunty, A., Kramer, J. B., Cook, L. L., Fulton, R. S., Johnson, D. L., Minx, P. J., Clifton, S. W., Hawkins, T., Branscomb, E., Predki, P., Richardson, P., Wenning, S., Slezak, T., Doggett, N., Cheng, J. F., Olsen, A., Lucas, S., Elkin, C., Uberbacher, E., Frazier, M., Gibbs, R. A., Muzny, D. M., Scherer, S. E., Bouck, J. B., Sodergren, E. J., Worley, K. C., Rives, C. M., Gorrell, J. H., Metzker, M. L., Naylor, S. L., Kucherlapati, R. S., Nelson, D. L., Weinstock, G. M., Sakaki, Y., Fujiyama, A., Hattori, M., Yada, T., Toyoda, A., Itoh, T., Kawagoe, C., Watanabe, H., Totoki, Y., Taylor, T., Weissenbach, J., Heilig, R., Saurin, W., Artiguenave, F., Brottier, P., Bruls, T., Pelletier, E., Robert, C., Wincker, P., Smith, D. R., Doucette-Stamm, L., Rubenfield, M., Weinstock, K., Lee, H. M., Dubois, J., Rosenthal, A., Platzer, M., Nyakatura, G., Taudien, S., Rump, A., Yang, H., Yu, J., Wang, J., Huang, G., Gu, J., Hood, L., Rowen, L., Madan, A., Qin, S., Davis, R. W., Federspiel, N. A., Abola, A. P., Proctor, M. J., Myers, R. M., Schmutz, J., Dickson, M., Grimwood, J., Cox, D. R., Olson, M. V., Kaul, R., Shimizu, N., Kawasaki, K., Minoshima, S., Evans, G. A., Athanasiou, M., Schultz, R., Roe, B. A., Chen, F., Pan, H., Ramser, J., Lehrach, H., Reinhardt, R., McCombie, W. R., de la Bastide, M., Dedhia, N., Blocker, H., Hornischer, K., Nordsiek, G., Agarwala, R., Aravind, L., Bailey, J. A., Bateman, A., Batzoglou, S., Birney, E., Bork, P., Brown, D. G., Burge, C. B., Cerutti, L., Chen, H. C., Church, D., Clamp, M., Copley, R. R., Doerks, T., Eddy, S. R., Eichler, E. E., Furey, T. S., Galagan, J., Gilbert, J. G., Harmon, C., Hayashizaki, Y., Haussler, D., Hermjakob, H., Hokamp, K., Jang, W., Johnson, L. S., Jones, T. A., Kasif, S., Kaspryzk, A., Kennedy, S., Kent, W. J., Kitts, P., Koonin, E. V., Korf, I., Kulp, D., Lancet, D., Lowe, T. M., McLysaght, A., Mikkelsen, T., Moran, J. V., Mulder, N., Pollara, V. J., Ponting, C. P., Schuler, G., Schultz, J., Slater, G., Smit, A. F., Stupka, E., Szustakowski, J., Thierry-Mieg, D., Thierry-Mieg, J., Wagner, L., Wallis, J., Wheeler, R., Williams, A., Wolf, Y. I., Wolfe, K. H., Yang, S. P., Yeh, R. F., Collins, F., Guyer, M. S., Peterson, J., Felsenfeld, A., Wetterstrand, K. A., Patrinos, A., Morgan, M. J., Szustakowki, J., de Jong, P., Catanese, J. J., Osoegawa, K., Shizuya, H., Choi, S., and Chen, Y. J. (2001) Nature 409, 860-921
- Venter, J. C., Adams, M. D., Myers, E. W., Li, P. W., Mural, R. J., Sutton, G. G., Smith, H. O., Yandell, M., Evans, C. A., Holt, R. A., Gocayne, J. D., Amanatides, P., Ballew, R. M., Huson, D. H., Wortman, J. R., Zhang, Q., Kodira, C. D., Zheng, X. H., Chen, L., Skupski, M., Subramanian, G., Thomas, P. D., Zhang, J., Gabor Miklos, G. L., Nelson, C., Broder, S., Clark, A. G., Nadeau, J., McKusick, V. A., Zinder, N., Levine, A. J., Roberts, R. J., Simon, M., Slayman, C., Hunkapiller, M., Bolanos, R., Delcher, A., Dew, I., Fasulo, D., Flanigan, M., Florea, L., Halpern, A., Hannenhalli, S., Kravitz, S., Levy, S., Mobarry, C., Reinert, K., Remington, K., Abu-Threideh, J., Beasley, E., Biddick, K., Bonazzi, V., Brandon, R., Cargill, M., Chandramouliswaran, I., Charlab, R., Chaturvedi, K., Deng, Z., Di Francesco, V., Dunn, P., Eilbeck, K., Evangelista, C., Gabrielian, A. E., Gan, W., Ge, W., Gong, F., Gu, Z., Guan, P., Heiman, T. J., Higgins, M. E., Ji, R. R., Ke, Z., Ketchum, K. A., Lai, Z., Lei, Y., Li, Z., Li, J., Liang, Y., Lin, X., Lu, F., Merkulov, G. V., Milshina, N., Moore, H. M., Naik, A. K., Narayan, V. A., Neelam, B., Nusskern, D., Rusch, D. B., Salzberg, S., Shao, W., Shue, B., Sun, J., Wang, Z., Wang, A., Wang, X., Wang, J., Wei, M., Wides, R., Xiao, C., Yan, C., Yao, A., Ye, J., Zhan, M., Zhang, W., Zhang, H., Zhao, Q., Zheng, L., Zhong, F., Zhong, W., Zhu, S., Zhao, S., Gilbert, D., Baumhueter, S., Spier, G., Carter, C., Cravchik, A., Woodage, T., Ali, F., An, H., Awe, A., Baldwin, D., Baden, H., Barnstead, M., Barrow, I., Beeson, K., Busam, D., Carver, A., Center, A., Cheng, M. L., Curry, L., Danaher, S., Davenport, L., Desilets, R., Dietz, S., Dodson, K., Doup, L., Ferriera, S., Garg, N., Gluecksmann, A., Hart, B., Haynes, J., Haynes, C., Heiner, C., Hladun, S., Hostin, D., Houck, J., Howland, T., Ibegwam, C., Johnson, J., Kalush, F., Kline, L., Koduru, S., Love, A., Mann, F., May, D., McCawley, S., McIntosh, T., McMullen, I., Moy, M., Moy, L., Murphy, B., Nelson, K., Pfannkoch, C., Pratts, E., Puri, V., Qureshi, H., Reardon, M., Rodriguez, R., Rogers, Y. H., Romblad, D., Ruhfel, B., Scott, R., Sitter, C., Smallwood, M., Stewart, E., Strong, R., Suh, E., Thomas, R., Tint, N. N., Tse, S., Vech, C., Wang, G., Wetter, J., Williams, S., Williams, M., Windsor, S., Winn-Deen, E., Wolfe, K., Zaveri, J., Zaveri, K., Abril, J. F., Guigo, R., Campbell, M. J., Sjolander, K. V., Karlak, B., Kejariwal, A., Mi, H., Lazareva, B., Hatton, T., Narechania, A., Diemer, K., Muruganujan, A., Guo, N., Sato, S., Bafna, V., Istrail, S., Lippert, R., Schwartz, R., Walenz, B., Yooseph, S., Allen, D., Basu, A., Baxendale, J., Blick, L., Caminha, M., Carnes-Stine, J., Caulk, P., Chiang, Y. H., Coyne, M., Dahlke, C., Mays, A., Dombroski, M., Donnelly, M., Ely, D., Esparham, S., Fosler, C., Gire, H., Glanowski, S., Glasser, K., Glodek, A., Gorokhov, M., Graham, K., Gropman, B., Harris, M., Heil, J., Henderson, S., Hoover, J., Jennings, D., Jordan, C., Jordan, J., Kasha, J., Kagan, L., Kraft, C., Levitsky, A., Lewis, M., Liu, X., Lopez, J., Ma, D., Majoros, W., McDaniel, J., Murphy, S., Newman, M., Nguyen, T., Nguyen, N., Nodell, M., Pan, S., Peck, J., Peterson, M., Rowe, W., Sanders, R., Scott, J., Simpson, M., Smith, T., Sprague, A., Stockwell, T., Turner, R., Venter, E., Wang, M., Wen, M., Wu, D., Wu, M., Xia, A., Zandieh, A., and Zhu, X. (2001) Science 291, 1304-1351
- Quandt, K., Frech, K., Karas, H., Wingender, E., and Werner, T. (1995) Nucleic Acids Res. 23, 4878-4884
- Bajic, V. B., Tan, S. L., Chong, A., Tang, S., Strom, A., Gustafsson, J. A., Lin, C. Y., and Liu, E. T. (2003) Nucleic Acids Res. 31, 3605-3607
- Kel, A. E., Gossling, E., Reuter, I., Cheremushkin, E., Kel-Margoulis, O. V., and Wingender, E. (2003) Nucleic Acids Res. 31, 3576-3579
- Hoh, J., Jin, S., Parrado, T., Edington, J., Levine, A. J., and Ott, J. (2002) Proc. Natl. Acad. Sci. U. S. A. 99, 8467-8472
- Wang, L., Wu, Q., Qiu, P., Mirza, A., McGuirk, M., Kirschmeier, P., Greene, J. R., Wang, Y., Pickett, C. B., and Liu, S. (2001) J. Biol. Chem. 276, 43604-43610
- Suzuki, Y., Yamashita, R., Sugano, S., and Nakai, K. (2004) Nucleic Acids Res. 32 (Data base issue), D78-D81
- Schuldiner, O., Shor, S., and Benvenisty, N. (2002) Gene 292, 91-99
- Rice, P., Longden, I., and Bleasby, A. (2000) Trends Genet. 16, 276-277
- Hubbard, T., Barker, D., Birney, E., Cameron, G., Chen, Y., Clark, L., Cox, T., Cuff, J., Curwen, V., Down, T., Durbin, R., Eyras, E., Gilbert, J., Hammond, M., Huminiecki, L., Kasprzyk, A., Lehvaslaiho, H., Lijnzaad, P., Melsopp, C., Mongin, E., Pettett, R., Pocock, M., Potter, S., Rust, A., Schmidt, E., Searle, S., Slater, G., Smith, J., Spooner, W., Stabenau, A., Stalker, J., Stupka, E., Ureta-Vidal, A., Vastrik, I., and Clamp, M. (2002) Nucleic Acids Res. 30, 38-41
- Kasprzyk, A., Keefe, D., Smedley, D., London, D., Spooner, W., Melsopp, C., Hammond, M., Rocca-Serra, P., Cox, T., and Birney, E. (2004) Genome Res. 14, 160-169
- Wheeler, D. L., Church, D. M., Federhen, S., Lash, A. E., Madden, T. L., Pontius, J. U., Schuler, G. D., Schriml, L. M., Sequeira, E., Tatusova, T. A., and Wagner, L. (2003) Nucleic Acids Res. 31, 28-33
- Wingender, E., Dietze, P., Karas, H., and Knuppel, R. (1996) Nucleic Acids Res. 24, 238-241
- Wingender, E., Chen, X., Fricke, E., Geffers, R., Hehl, R., Liebich, I., Krull, M., Matys, V., Michael, H., Ohnhauser, R., Pruss, M., Schacherer, F., Thiele, S., and Urbach, S. (2001) Nucleic Acids Res. 29, 281-283
- Bulyk, M. L. (2003) Genome Biol. 5, 201
- Friedl, J. E. F. (2002) Mastering Regular Expressions, 2nd Ed., O'Reilly and Associates, Sebastopol, CA
- Stormo, G. D. (2000) Bioinformatics 16, 16-23
- Benos, P. V., Bulyk, M. L., and Stormo, G. D. (2002) Nucleic Acids Res. 30, 4442-4451
- Halees, A. S., Leyfer, D., and Weng, Z. (2003) Nucleic Acids Res. 31, 3554-3559
- Suzuki, Y., Yamashita, R., Nakai, K., and Sugano, S. (2002) Nucleic Acids Res. 30, 328-331
- Ee, P. L., Kamalakaran, S., Tonetti, D., He, X., Ross, D. D., and Beck, W. T. (2004) Cancer Res. 64, 1247-1251
- Cavin Perier, R., Junier, T., and Bucher, P. (1998) Nucleic Acids Res. 26, 353-357
- Perier, R. C., Praz, V., Junier, T., Bonnard, C., and Bucher, P. (2000) Nucleic Acids Res. 28, 302-303
- Schmid, C. D., Praz, V., Delorenzi, M., Perier, R., and Bucher, P. (2004) Nucleic Acids Res. 32, D82-D85
- Sun, H., and Davuluri, R. V. (2004) Bioinformatics 20, 727-734
- Bourdeau, V., Deschenes, J., Metivier, R., Nagai, Y., Nguyen, D., Bretschneider, N., Gannon, F., White, J. H., and Mader, S. (2004) Mol. Endocrinol. 18, 1411-1427
- Stambolic, V., MacPherson, D., Sas, D., Lin, Y., Snow, B., Jang, Y., Benchimol, S., and Mak, T. W. (2001) Mol. Cell 8, 317-325
- el-Deiry, W. S., Kern, S. E., Pietenpol, J. A., Kinzler, K. W., and Vogelstein, B. (1992) Nat. Genet. 1, 45-49
