Genomic Structure of the Promoters of the Human Estrogen Receptor-α Gene Demonstrate Changes in Chromatin Structure Induced by AP2γ

Expression of human estrogen receptor-α (ERα) involves the activity of several promoters that give rise to alternate untranslated 5′ exons. However, the genomic locations of the alternate 5′ exons have not been reported previously. We have developed a contig map of the human ERα gene that includes all of the known alternate 5′ exons. By using S1 nuclease and 5′-rapid amplification of cDNA ends, the cap sites for the alternate ERα transcripts E and H were identified. DNase I-hypersensitive sites specific to ERα-positive cells were associated with each of the cap sites. A DNase I-hypersensitive site, HS1, was localized to binding sites for AP2 in the untranslated region of exon 1 and was invariably present in the chromatin structure of ERα-positive cells. Overexpression of AP2γ in human mammary epithelial cells generated the HS1-hypersensitive site. The ERα promoter was induced by AP2γ in mammary epithelial cells, and trans-activation was dependent upon the region of the promoter containing the HS1 site. These results demonstrate that AP2γ trans-activates the ERα gene in hormone-responsive tumors by inducing changes in the chromatin structure of the ERα promoter. These data are further evidence for a critical role for AP2 in the oncogenesis of hormone-responsive breast cancers.

There are at least two nuclear receptors for estrogen, ERα (1,2) and ERβ (3). Most breast cancers that occur in post-menopausal women overexpress ERα (4). Patients with breast cancers that express ERα are more likely to respond to hormonal therapy (4,5) and have an improved prognosis compared with patients with ERα-negative tumors (4,6,7). Studies of breast cancer cell lines (8) and primary tumors (9,10) have indicated that transcription of the ERα gene plays an important role in regulating the expression of ERα. Thus, understanding transcriptional regulation of the ERα gene will likely provide critical insights into the pathogenesis of hormone-responsive breast cancers.
Transcription of the ERα gene is complex and involves the activity of several distinct promoters (11-14). Functional promoter studies have concluded that ERα expression in breast cancer cell lines and various tissues is likely to involve trans-acting factors that have a specific cell or tissue distribution pattern (15-19). A variety of factors appear to interact with the ERα promoter with trans-activating (15-17) or trans-repressing (20) functions. There is also evidence that ERα can autoregulate its transcription (21,22). Other studies suggest that the lack of ERα expression in ERα-negative breast cancer cell lines and tumors may be controlled by methylation of CpG islands in the 5′ end of the ERα gene (23,24).
The main ERα promoter, P1, initiates transcription at a cap site previously mapped at the start of exon 1 (1). Exon 1 has a 233-base 5′-untranslated region preceding the AUG codon that initiates translation of the ERα protein. Studies in ERα-positive breast cancer cell lines have shown that transcription initiated at exon 1 accounts for 50-90% of all ERα mRNAs (18,25). A functional analysis of the main ERα promoter identified a factor, ERF-1, that binds to high affinity sites in the untranslated region of exon 1 and can trans-activate the cloned ERα promoter (15,26). ERF-1 was found to be a member of the AP2 family of transcription factors and has been renamed AP2γ (27).
All other ERα transcripts initiate at cap sites upstream of exon 1 and splice into a splice acceptor site at +163 in exon 1 (14). Although some of these upstream exons have open reading frames, there is no evidence that these are translated, and it appears that all alternate 5′ exons are non-coding. Exon 1′ has been reported to have two main cap sites giving rise to alternate 5′ exons of 110 bases or 1206 bases (12). The short and long forms of exon 1′ both utilize the splice donor site at −1884 (location relative to the cap site of P1). We had previously described two additional alternate ERα transcripts called E and H (14). Both the E and H transcripts are expressed in ERα-positive breast cancer cell lines, primary breast cancers, ERα-positive endometrial carcinoma cell lines, and normal endometrium. The existence of the E and H transcripts of ERα was subsequently confirmed by other investigators, and these exons were reported to be expressed in a variety of tissues (25). The splice donor site of exon E was found to be at −169. The H transcript was found to utilize two upstream exons, Ha and Hb, separated by an intron of 9 kbp (14). The genomic location of the H exons was not determined but was concluded to be at least 20 kbp 5′ to exon 1 (14). An additional liver-specific exon, called exon C, has also been described (13). Exon C was reported to be spliced to an exon with sequence matching exon Hb, and it was concluded that exon C is farther 5′ than exon Ha (14).
In order to define the location of the ERα promoters active in breast cancer, a detailed analysis of the genomic structure of the 5′ end of the ERα gene including the alternate ERα transcripts E and H was performed. Overlapping BAC clones have been isolated that generated a contig map that includes all known ERα exons. The exon Ha was found to be 124 kbp upstream of exon 1, and exon C was located 30-40 kbp 5′ to exon Ha. Together with the previously known exons of the ERα gene that span a region of over 160 kb, the ERα locus was found to encompass a genomic region of ~300 kbp. By using S1 nuclease and 5′-RACE, the cap sites for exons E and H have been mapped. Each of the alternate upstream exons was found to be associated with DNase I-hypersensitive sites specific to cells expressing ERα. The DNase I-hypersensitive site, HS1, was mapped to the binding sites for AP2γ in the untranslated leader of exon 1 and was invariably found in the chromatin structure of ERα-positive cells. In human mammary epithelial cells (HMECs), AP2γ expression induced the HS1-hypersensitive site and trans-activated the ERα promoter, which was dependent upon the region of the promoter containing the HS1 site. These findings provide additional evidence for a critical role for AP2γ in the oncogenesis of hormone-responsive breast cancer.

* This work was supported in part by National Institutes of Health Grant R01 CA77350. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
BAC Library Screening. BAC clones containing genomic DNA from the 5′ region of the human ERα gene were identified by hybridization of DNA arrays with probes corresponding to the human ERα exons 1′, Ha, and Hb. Nylon membranes arrayed with DNA from a human BAC library were purchased from Research Genetics (Huntsville, AL). Gel-purified insert DNA from clones of ERα exons Ha (pHa2.5) and Hb (pHb4.1) was used for hybridization. Exon 1′ sequences between −3090 and −2670 were PCR-amplified from MCF-7 genomic DNA using primers ERSEQ1 (TCTAGAGCATGGGTGGCCAT) and ERSEQ2 (GTGCTCCTAGAGTGCCCACG) and Taq polymerase. The cycle profile included an initial denaturation step of 94°C for 2 min followed by 25 cycles of 94°C for 30 s, 56°C for 15 s, and 72°C for 2 min and terminated with a final extension step of 72°C for 5 min. All DNAs used for hybridization were gel-purified and labeled by random priming with [α-³²P]dCTP to specific activities greater than 7 × 10⁸ dpm/µg. Library membranes were prehybridized in 50% formamide, 5× SSC, 7% SDS, 1% polyethylene glycol, 25 mM sodium phosphate buffer, pH 6.7, and 0.5% non-fat dried milk at 42°C for 1 h. Twenty ml of hybridization solution were used per membrane. For hybridization, the volume of hybridization solution was reduced to 6 ml per membrane, and each probe was added to 5 × 10⁵ dpm/ml and hybridized for 12-18 h at 42°C. Following hybridization the membranes were washed twice in 2× SSC, 1% SDS at 42°C and then twice in 0.1× SSC, 0.1% SDS at 65°C. Positive signals were identified by overnight exposure to film.
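The cycling profile given for the exon 1′ amplification can be written out as a small data structure, which makes the programmed run time easy to check. The following is only an illustrative sketch: the variable and function names are hypothetical, and the total ignores the time a thermocycler spends ramping between temperatures.

```python
# Hold times taken from the cycle profile stated in the text (seconds).
INITIAL_DENATURATION_S = 2 * 60          # 94 C for 2 min
CYCLE_STEPS_S = {
    "denature_94C": 30,                  # 94 C for 30 s
    "anneal_56C": 15,                    # 56 C for 15 s
    "extend_72C": 2 * 60,                # 72 C for 2 min
}
N_CYCLES = 25
FINAL_EXTENSION_S = 5 * 60               # 72 C for 5 min


def total_hold_time_s(initial_s, cycle_steps_s, n_cycles, final_s):
    """Sum of programmed hold times; ramp times are not included."""
    return initial_s + n_cycles * sum(cycle_steps_s.values()) + final_s


TOTAL_S = total_hold_time_s(
    INITIAL_DENATURATION_S, CYCLE_STEPS_S, N_CYCLES, FINAL_EXTENSION_S
)
print(f"Programmed hold time: {TOTAL_S} s ({TOTAL_S / 60:.2f} min)")
```

Under these assumptions the programmed hold time is 4545 s, or about 76 min.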
Cultures of Escherichia coli harboring the BACs identified in the initial screen were obtained from Research Genetics. A small amount of DNA was obtained from each culture using the vendor's protocols for secondary screening by PCR. DNA from 19 BACs was PCR-amplified with primers oEXON0INT and oEXON0-3′ (14) for the presence of ERα exon Hb sequences and with primers ERSEQ1 and ERSEQ2 for the presence of ERα exon 1′ sequences as described above.
Mapping BAC Clones. DNA from BACs containing portions of the ERα upstream region was prepared from 1-liter cultures using a Maxiprep kit from Qiagen (Valencia, CA) as described by the manufacturer. For restriction enzyme digest analysis, 1-2 µg of BAC DNA was digested in 20 µl with the appropriate enzyme and then subjected to pulse field gel electrophoresis in 0.5× TBE at 140 V with field switches increasing from 1 to 12 s over 20 h. Bands were visualized by ethidium bromide staining and photographed. Band sizes were calculated from standard curves constructed from molecular weight markers.
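As an illustration of how band sizes can be read off a marker standard curve, the sketch below fits log10(size) against migration distance by least squares and interpolates an unknown band. The log-linear model, the marker values, and the function names are assumptions made for illustration; they are not taken from the methods above, and pulse field gels in particular may depart from a simple log-linear relationship.

```python
import math


def fit_log_linear(markers):
    """Least-squares fit of log10(size_kb) = slope * distance_mm + intercept.

    `markers` is a list of (distance_mm, size_kb) pairs from the marker lane.
    """
    xs = [distance for distance, _ in markers]
    ys = [math.log10(size) for _, size in markers]
    n = len(markers)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_size_kb(distance_mm, slope, intercept):
    """Interpolate a fragment size from its migration distance."""
    return 10 ** (slope * distance_mm + intercept)


# Hypothetical marker ladder: (migration distance in mm, size in kb).
MARKERS = [(10.0, 100.0), (20.0, 10.0), (30.0, 1.0)]
SLOPE, INTERCEPT = fit_log_linear(MARKERS)
print(f"Band at 15 mm: {estimate_size_kb(15.0, SLOPE, INTERCEPT):.1f} kb")
```

In practice the curve would be fitted only over the range in which the markers resolve well, and bands outside that range would be treated with caution.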
For Southern blot analysis, DNAs were transferred to positively charged nylon membranes (Hybond N+, Amersham Pharmacia Biotech) using protocols provided by the manufacturer. The blots were hybridized with various probes to identify the locations of the ERα exons relative to the restriction sites mapped in each BAC. Probes used were the ERα exon 1′, Ha, and Hb probes described above. In addition, oligonucleotides corresponding to ERα exon C (TTCACAATCAAAAGGATTGG) (13) and to the ends of the BAC genomic DNA inserts were used. The sequences of the ends of the BACs were determined by direct sequencing of BAC DNA.
S1 Nuclease Analysis. S1 nuclease analyses were performed essentially as described (18) with modifications of the protocols for probe synthesis. Messenger RNA was isolated from MCF-7 cells using a Fast Track mRNA Isolation System (Invitrogen, Carlsbad, CA) according to the manufacturer's instructions. The probe for analysis of the ERα transcripts originating at the exon Ha promoter was synthesized by single-sided PCR from a template that encompassed exon Ha plus ~850 bases of upstream sequences. The template was PCR-amplified from MCF-7 genomic DNA using the primer H199.4 (GAGAAGATTATCACTCAGAGAC) as the 3′ primer and H853.3 (CGCCCCATTCTACCATTTTC) as the 5′ primer. PCR conditions were as described above except that the annealing temperature was 50°C. To synthesize a single-stranded probe complementary to the expected mRNA sequence, PCR was performed using only H199.4 in the presence of [α-³²P]dCTP. After PCR the probe DNA was desalted on a spin column to remove unincorporated label.
The single-stranded probe for analysis of the ERα transcripts originating from the promoter associated with exon E was synthesized by primer extension using Klenow enzyme. The primer used was ERPRO30 (GTGCAGACCGTGTCCCCGCA), which was annealed to the denatured ER724-210LUC plasmid (15) as a template. The 3′ end of the probe was defined by cleavage with PvuII, and the single-stranded probe was isolated from the template by electrophoresis through a 1.5% alkaline agarose gel. The purified probe was eluted from the gel using a Millipore Ultrafree DA spin cartridge (Millipore Corp., New Bedford, MA).
Either the Ha or E probe (5 × 10⁵ cpm) was hybridized to 2 µg of MCF-7 mRNA or yeast tRNA overnight at 45°C. Nuclease S1 (500-2000 units/ml; Amersham Pharmacia Biotech) was added, and the samples were incubated at 37°C for 1 h. The reactions were stopped, and the samples were extracted once with phenol/chloroform (1:1) before analysis on a 6% acrylamide sequencing gel. Signals were detected by overnight autoradiography.
5′-RACE Analysis. Messenger RNA from MCF-7 cells was analyzed by 5′-RACE using the 5′-RACE System from Life Technologies, Inc., as recommended by the manufacturer. Initial reverse transcription was primed with ERPRO22 (CCCTTGGATCTGATGCAGTA), which is located in exon 1 at approximately +300. The gene-specific primer for the first round of PCR was ERPRO30, which anneals to sequences in exon 1 upstream of ERPRO22. A second round of PCR amplification was performed using the gene-specific primers ERPRO94 (GCTGGATAGAGGCTGAGTTT) and H199.4 for exons E and Ha, respectively. RACE PCR products were cloned into pCR2.1 using a TA Cloning Kit (Invitrogen) as recommended. The location of the 5′ end of each clone was determined by sequencing.
DNase I-hypersensitive Site Assay. DNase I hypersensitivity was analyzed in a variety of ERα-positive and ERα-negative cell lines. Cells were harvested during exponential growth and then washed once with cold phosphate-buffered saline (PBS). The cells were then washed once with cold buffer A (15 mM Tris, pH 7.4, 60 mM KCl, 15 mM NaCl, 0.2 mM EGTA, 0.2 mM EDTA, 0.25 M sucrose, 1 mM dithiothreitol, 0.15 mM spermine, 0.5 mM spermidine) and then resuspended in 5 ml of cold buffer A with 0.2% Nonidet P-40. The cells were lysed in a Dounce homogenizer using 10 strokes of a B pestle on ice; lysis was checked by trypan blue exclusion. Nuclei were pelleted and resuspended in 2.5 ml of buffer B (buffer A minus sucrose). To 0.5-ml aliquots of nuclei, varying quantities of DNase I (0, 200, 400, 600, 800, 1200, 1600 units/ml; Roche Molecular Biochemicals) were added, and then MgCl₂ was added to 5 mM, and the samples were incubated on ice for 15 min. The reactions were stopped by addition of EDTA to 50 mM. Following addition of SDS to 0.5% and proteinase K to 1 mg/ml, the samples were incubated overnight at 37°C. Residual protein was removed by extraction once each with phenol, phenol/chloroform (1:1), and chloroform. Following ethanol precipitation the nucleic acid pellets were dissolved in 10 mM Tris, pH 8.0, 0.5 mM EDTA, and the absorbance at 260 nm (A₂₆₀) was measured.
Ten micrograms of each sample were cleaved with the appropriate restriction enzyme overnight at 37°C. For analysis of hypersensitive sites near exon 1, the samples were then electrophoresed through 1% agarose (SeaKem GTG, BioWhittaker Molecular Applications, Rockland, ME) using 0.5× Tris/acetate/EDTA buffer and transferred to nylon membranes for hybridization as described above. The resulting blots were hybridized with the exon 1′ probe. For analysis of hypersensitive sites near exons Ha/Hb, samples were subjected to pulse field gel electrophoresis following restriction enzyme cleavage. DNA was immobilized on nylon membranes and hybridized with the same exon Ha probe used to screen the BAC library.
AP2γ Antibody Production. Antigen for the production of an AP2γ-specific polyclonal antibody (AP) was generated by cloning a fragment of AP2γ in frame with glutathione S-transferase (GST) to create a fusion protein. A fragment corresponding to nucleotides 474-607, which encodes amino acids 150-187, was generated by PCR and cloned in frame in the pGEX-4T3 vector (Amersham Pharmacia Biotech). The identity of the clone was confirmed by sequencing the entire insert from both directions. The clone was transformed into XL-1 Blue cells (Stratagene), and the production of a fusion protein of the proper size was confirmed by SDS-polyacrylamide gel electrophoresis. Large scale production of the fusion protein was induced by growth of the transformed cells in the presence of isopropyl-1-thio-β-D-galactopyranoside. Bacterial lysates were incubated with glutathione-agarose beads and washed with PBS, and the fusion protein was eluted with the addition of 5 mM glutathione. The fusion protein was injected into rabbits, and antisera were generated by CalTag Laboratory (Healdsburg, CA). Following production of polyclonal antisera, affinity purification was performed by passing the antisera first over a GST affinity column to selectively bind antibodies directed to the GST portion of the fusion protein, and then over a GST-AP2γ affinity column with subsequent elution of the affinity-purified antibody. Affinity columns were prepared using Affi-Gel 10 supports (Bio-Rad). Gel shift assays were performed as described previously (15). In supershift assays, 2 µl of AP antisera was used. The rabbit polyclonal antibody to AP2, SC-184, was obtained from Santa Cruz Biotechnology, Santa Cruz, CA.
Construction of AP2γ/pAdTrack-CMV and AP2α/pAdTrack-CMV Shuttle Vectors. In order to generate AP2γ and AP2α adenoviral constructs, the AP2γ and AP2α cDNAs were first cloned into a shuttle vector. By using a previously described AP2γ clone retrieved from an MCF7 expression library (27) as template, AP2γ cDNA was PCR-amplified from the translation start site at +167 to the stop codon at +1519 using a 5′ primer (GACGCAGATCTCCATGTTGTGGAAAATAAC) containing a BglII site and a 3′ primer (TCGTTCTCGAGTTATTTCCTGTGTTTCTCC) containing an XhoI site. AP2α cDNA was PCR-amplified from the translational start site to the stop codon using an AP2α cDNA clone (26) as template. PCR was performed using a 5′ primer, AP2α5′ SalI (CGATCCGTCGACATGCTTTGGAAATTGACG), containing a SalI site and a 3′ primer, AP2α3′ XbaI (GGGAGGTCTAGATCACTTTCTGTGCTTCTC), containing an XbaI site. The AP2γ and AP2α cDNA fragments were ligated into the pAdTrack-CMV shuttle vector (gift of Dr. Burt Vogelstein, The Johns Hopkins University) (29) after digestion with appropriate restriction enzymes to create AP2γ/pAdTrack-CMV and AP2α/pAdTrack-CMV. pAdTrack-CMV also encodes the GFP protein so that viral production can be monitored by fluorescence.
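A quick consistency check on the cloning primers is to confirm that each one actually carries the recognition sequence of the enzyme named for it. The sketch below does this with plain string matching; the primer sequences are those listed above (with line-break hyphens removed), the recognition sequences are the standard ones for these enzymes, and the dictionary keys and function name are hypothetical labels.

```python
# Standard recognition sequences (5'->3') for the enzymes named in the text.
RECOGNITION_SITES = {
    "BglII": "AGATCT",
    "XhoI": "CTCGAG",
    "SalI": "GTCGAC",
    "XbaI": "TCTAGA",
}

# Primer label -> (sequence as printed in the text, enzyme whose site it should carry).
CLONING_PRIMERS = {
    "AP2gamma_5prime": ("GACGCAGATCTCCATGTTGTGGAAAATAAC", "BglII"),
    "AP2gamma_3prime": ("TCGTTCTCGAGTTATTTCCTGTGTTTCTCC", "XhoI"),
    "AP2alpha_5prime_SalI": ("CGATCCGTCGACATGCTTTGGAAATTGACG", "SalI"),
    "AP2alpha_3prime_XbaI": ("GGGAGGTCTAGATCACTTTCTGTGCTTCTC", "XbaI"),
}


def site_offset(primer, enzyme):
    """Return the 0-based offset of the enzyme's site in the primer, or -1 if absent."""
    return primer.upper().find(RECOGNITION_SITES[enzyme])


SITE_OFFSETS = {
    name: site_offset(seq, enzyme) for name, (seq, enzyme) in CLONING_PRIMERS.items()
}

for name, offset in SITE_OFFSETS.items():
    enzyme = CLONING_PRIMERS[name][1]
    status = f"found at offset {offset}" if offset >= 0 else "NOT FOUND"
    print(f"{name}: {enzyme} site {status}")
```

Each site sits a few bases in from the 5′ end of its primer, which leaves flanking bases for the enzyme to cut efficiently near the end of the PCR product.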
Production of AdAP2γ, AdAP2α, and AdWT Adenoviruses in 293 Cells. AdAP2γ, AdAP2α, and AdWT adenoviruses were produced as described previously (29) with several modifications. Briefly, 2 × 10⁶ 293 cells were plated in 25-cm² flasks 24 h before transfection in MEM with 10% FCS so that cells had reached 70-80% confluence by 24 h. On the day of transfection, 4 µg each of AdAP2γ, AdAP2α, and AdWT DNA were linearized by PacI digestion. Each linearized DNA was mixed with 20 µl of LipofectAMINE (Life Technologies, Inc.) in 500 µl of Opti-MEM (Life Technologies, Inc.) and incubated at room temperature for 15-30 min according to the manufacturer's protocol. Meanwhile, the cells were washed once with 3 ml of Opti-MEM. After incubation, the lipid/DNA mixes were brought up to 2 ml with Opti-MEM and overlaid onto the 293 cells. After incubation at 37°C, 5% CO₂ for 4 h, the transfection mix was removed and replaced with 6 ml of MEM + 10% FCS, and the cells were returned to the incubator. Cells were monitored by fluorescence microscopy for GFP expression over 7-9 days, at which time most of the cells were fluorescent and had detached from the flask. Cells were collected, pelleted, resuspended in 2 ml of 1× PBS buffer, and subjected to 4 cycles of freeze/thaw/vortex (dry ice/37°C). AdAP2γ, AdAP2α, and AdWT adenoviruses were then plaque-purified by infecting 5 × 10⁵ 293 cells in 35-mm plates with 100 µl of serial dilutions of viral supernatants from 10⁻¹ to 10⁻⁴ made in Opti-MEM. After a 1-h incubation at 37°C, 5% CO₂, cells were overlaid with 3 ml of 0.8% agarose in MEM + 10% FCS and returned to the incubator. Plates were monitored for plaque formation and GFP expression over 9 days, at which time plaques were isolated as agarose plugs into 200 µl of MEM + 10% FCS and subjected to 3 freeze/thaw (dry ice/37°C) cycles. Fifty µl of viral lysate was used to infect 2 × 10⁶ 293 cells in a 25-cm² flask, and the cells were harvested as described above at 3-4 days when the cells were at least 50% detached.
Virus was then titered by GFP expression, and 3 more rounds
Confirmation of AdAP2γ, AdAP2α, and AdWT Identity. In order to confirm the identities of AdAP2γ, AdAP2α, and AdWT viruses, DNA was isolated from the viruses using the DNeasy Tissue Kit (Qiagen) according to the manufacturer's protocol for non-nucleated blood. Viral DNA was subjected to PCR amplification using 3 sets of primer pairs for AP2γ and AP2α and one set for AdWT. Primer pairs for AdAP2γ were composed either of one primer derived from the pAdTrack-CMV vector sequence and one primer from AP2γ (primer pair 1, GFP-AdTrack/APseq6; primer pair 2, Rt. Arm-AdTrack/Apseq2) or of two primers from the pAdTrack-CMV sequence (primer pair 3, GFP-AdTrack/Rt. Arm-AdTrack). For AdAP2α, primer pair 1 was composed of GFP-AdTrack and AP2α5′ SalI, and primer pair 2 was composed of Rt. Arm-AdTrack and AP2α3′ XbaI. Primer pair 3 remained the same. Viral DNA for AdWT was amplified using only primer pair 3. Primer sequences are as follows: GFP-AdTrack (GCCGTCCTCGATGTTGTGGCGGATC); APseq6 (CATCAAAGAAGCCCTGATTG); Rt. Arm-AdTrack (CATCAAACGAGTTGGTGCTCATGGC); and Apseq2 (GTGCTGCCCGGCGGAGGAGA).
Infection of HMEC with Adenoviruses. The day before infection, a total of 3 × 10⁷ HMEC cells for each virus was plated in three 175-cm² flasks. The next day, media were removed from the flasks, and the cells were washed with 5 ml of Opti-MEM. Cells were infected at an m.o.i. of 10 for 24 h, at which time they were analyzed for DNase I hypersensitivity as described above.
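The amount of virus implied by a given multiplicity of infection follows directly from the cell number. The sketch below works through that arithmetic for the plating density stated above. It assumes that the cell number at infection equals the number plated, and the stock titer is a purely hypothetical value chosen for illustration; no titer is reported in the text.

```python
def infectious_units_needed(n_cells, moi):
    """Infectious units required to reach the target multiplicity of infection."""
    return n_cells * moi


def inoculum_volume_ml(n_cells, moi, titer_per_ml):
    """Volume of viral stock that delivers the required infectious units."""
    return infectious_units_needed(n_cells, moi) / titer_per_ml


TOTAL_CELLS = 3e7            # plated per virus, from the text
N_FLASKS = 3                 # 175-cm2 flasks, from the text
MOI = 10                     # from the text
ASSUMED_TITER_PER_ML = 1e9   # hypothetical stock titer (infectious units/ml)

UNITS_TOTAL = infectious_units_needed(TOTAL_CELLS, MOI)
VOLUME_PER_FLASK_ML = inoculum_volume_ml(
    TOTAL_CELLS / N_FLASKS, MOI, ASSUMED_TITER_PER_ML
)
print(f"Infectious units needed per virus: {UNITS_TOTAL:.1e}")
print(f"Stock volume per flask at the assumed titer: {VOLUME_PER_FLASK_ML:.2f} ml")
```

With these inputs, each virus requires 3 × 10⁸ infectious units in total; the per-flask volume scales inversely with whatever titer is actually measured.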
Trans-activation of the ERα Promoter. The ERα promoter constructs used have been described previously (15). The constructs ER3794-230LUC and ER3794-0LUC were previously called ER3500-230LUC and ER3500-0LUC, respectively. However, sequence analysis has shown that the 5′ end of these constructs is at −3794 bp relative to the P1 cap site. Transfections in HMECs were performed in triplicate using FuGene 6, and luciferase expression was normalized using β-galactosidase expression as described previously (26).
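Normalizing a reporter to a co-transfected control is usually done well by well before replicates are averaged. The following minimal sketch shows one common way to do this for triplicate transfections and to express the result as fold induction. All readings are invented for illustration, the function names are hypothetical, and the exact normalization procedure of reference 26 may differ.

```python
from statistics import mean


def normalize(luciferase, beta_gal):
    """Divide each well's luciferase reading by its beta-galactosidase reading."""
    return [luc / gal for luc, gal in zip(luciferase, beta_gal)]


def fold_induction(treated_norm, control_norm):
    """Ratio of mean normalized activity, treated over control."""
    return mean(treated_norm) / mean(control_norm)


# Hypothetical triplicate readings (arbitrary units).
CONTROL_LUC = [100.0, 200.0, 50.0]
CONTROL_GAL = [1.0, 2.0, 0.5]
TREATED_LUC = [300.0, 600.0, 150.0]
TREATED_GAL = [1.0, 2.0, 0.5]

CONTROL_NORM = normalize(CONTROL_LUC, CONTROL_GAL)
TREATED_NORM = normalize(TREATED_LUC, TREATED_GAL)
FOLD = fold_induction(TREATED_NORM, CONTROL_NORM)
print(f"Fold induction: {FOLD:.1f}")
```

Because the β-galactosidase signal tracks transfection efficiency, wells that differ several-fold in raw luciferase signal can still agree after normalization, as in this made-up example.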

RESULTS
Genomic Mapping of the Alternate ERα Exons. We had previously identified alternate 5′ exons of the human ERα gene that were expressed in breast cancer cell lines and primary breast tumors (14). However, genomic mapping with genomic lambda clones failed to provide a contig spanning the entire region. It remained to be determined how large the intron was between exon Hb and the splice acceptor site in exon 1. The location of the liver-specific exon C (13) also had not previously been determined. In order to locate the 5′ alternative exons Ha, Hb, and C, a BAC library was screened using probes representing exons 1′, Ha, and Hb. Of the BAC clones identified in the initial screen, only BAC 542K7 encompassed sequences from both exon 1′ and exon Hb and therefore was likely to span the intron separating these two exons. This BAC clone also hybridized to a probe specific for exon Ha. Three other BACs including BAC 295K8 were isolated that contained sequences from exons Ha and Hb but not exon 1′. BAC 295K8 also hybridized to a probe for the liver-specific exon C.
Southern blot analysis was used to generate a restriction map, to locate the positions of the ERα exons, and to determine the overlap in the two BAC clones (data summarized in Fig. 1). Previous results using lambda genomic clones had demonstrated that exon Hb was at least 20 kb upstream of exon 1′ (14). The results from mapping BAC 542K7 and 295K8 revealed that exon Hb was more than 110 kb upstream of exon 1′ and that the distance between the cap site for exon Ha and the cap site for exon 1 was ~124 kb (Fig. 1). Southern blot analysis revealed that the liver-specific exon C was located on a SmaI/PacI fragment between 30 and 40 kb upstream of exon Ha.
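The distances reported here can be combined on a single coordinate axis relative to the P1 cap site. The short sketch below does that arithmetic for exon C, using only the approximate values stated in the text; the names are hypothetical, and the result inherits the uncertainty of the restriction mapping.

```python
# Positions in kb relative to the P1 cap site of exon 1 (negative = upstream).
EXON_HA_CAP_KB = -124.0                   # approximate, from the BAC mapping
EXON_C_UPSTREAM_OF_HA_KB = (30.0, 40.0)   # exon C lies 30-40 kb 5' of exon Ha


def upstream_range_kb(reference_kb, offsets_kb):
    """Return (most distal, most proximal) positions for a feature lying
    `offsets_kb` = (near, far) kb upstream of `reference_kb`."""
    near, far = offsets_kb
    return (reference_kb - far, reference_kb - near)


EXON_C_RANGE_KB = upstream_range_kb(EXON_HA_CAP_KB, EXON_C_UPSTREAM_OF_HA_KB)
print(f"Exon C lies between {EXON_C_RANGE_KB[0]:.0f} and "
      f"{EXON_C_RANGE_KB[1]:.0f} kb relative to the P1 cap site")
```

On these figures exon C falls roughly 154-164 kb upstream of the exon 1 cap site.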
Mapping the Cap Sites for the Alternate ERα Transcripts. The location of the alternate 5′ exons of the human ERα gene suggests that there are separate ERα promoters that may be independently regulated. Functional promoter studies have been performed for promoters initiating transcription at exons 1 (15) and 1′ (17), but no studies have been performed on promoters that may be involved in expression of exons E or Ha. In order to further localize possible promoters controlling expression of these two alternate exons, experiments to locate the cap sites of the E and H transcripts were performed using S1 nuclease and 5′-RACE. The results are shown in Fig. 2. For the S1 analysis, MCF-7 mRNA was hybridized to a probe extending several hundred bases upstream of the end of the longest cDNA previously identified for ERα transcripts initiating with exon Ha (14). The analysis revealed multiple species of protected fragments with 5′ ends between 15 bases (Fig. 2A, fragment J) and 80 bases (Fig. 2A, fragment A) upstream of the 5′ end of the longest cDNA clone previously obtained for Ha.
To validate that the cap sites of Ha had been correctly identified by S1 nuclease analysis, 5′-RACE was performed on MCF-7 mRNA. A clear single PCR product was seen after the second round of PCR using a gene-specific primer in exon Ha (data not shown). Twelve clones with inserts were sequenced to determine the 5′-most base, and the results were correlated with the S1 nuclease analysis data (see Fig. 2A). The circles plotted on the right side of the vertical sequence represent the locations of the 5′-most bases from these 12 clones. Similar to the S1 analysis, RACE clones displayed a 66-base range in length, which coincided with the mRNA ends identified by S1 analysis. Two clusters of clones were observed that terminated ~45 bases and 57 bases upstream of the previous end noted from ER cDNA clones. Two clones mapped to within 2 bases of the longest S1 product (fragment A), and one clone corresponded to the smallest S1 product (fragment J). These results demonstrate that exon Ha represents a genuine ERα cap site with multiple start sites scattered over ~60 bp.
Exon E was previously identified as an alternative 5′ exon from screening a cDNA library and was localized between exon 1′ and exon 1 (14). The 3′ end of exon E was identified 169 bp upstream of the main transcriptional start site for the ERα gene. In order to confirm that ERα transcripts have a cap site associated with transcription initiated at exon E, a similar analysis to that for exon Ha was undertaken. A single strand DNA probe extending from the 3′ end of exon E sequences to several hundred bases upstream was hybridized to MCF-7 mRNA. S1 nuclease analysis of the hybrids revealed several protected species as was observed for exon Ha (Fig. 2B). A prominent protected band was observed that corresponded to an mRNA end ~120 bases upstream of the end previously identified from the longest cDNA clone for exon E (Fig. 2B, fragment A). Four additional protected species were identified, the 5′-most of which (fragment B) was 8 bp from the previously identified cDNA 5′ end. The other three protected species (fragments C-E) corresponded to shorter mRNAs.

FIG. 2. S1 nuclease and 5′-RACE for exons Ha and E.
A, mapping for exon Ha. Results of S1 nuclease are shown that demonstrate 10 start sites over a 66-base region. The main S1 fragments are labeled A-J. RNA used in the experiments is from MCF-7 cells (lanes 1 and 2) or tRNA (lanes 3 and 4). Increasing concentration of S1 nuclease was used as indicated. To the right of the gel is the sequence locating the approximate position of the S1 nuclease products. Closed circles represent the 5′-terminal nucleotide from the 5′-RACE clones. Arrow indicates the position of the longest cDNA previously reported for ERα H transcripts. B, mapping for exon E. S1 nuclease results shown for exon E presented in a manner similar to exon Ha. Five main S1 nuclease fragments are labeled A-E. Fragment A is ~120 bp upstream of the cluster represented by B-E. Closed circles represent the 5′-terminal nucleotide of 5′-RACE clones. Arrow indicates the position of the longest cDNA previously reported for ERα E transcripts.
The 5′-RACE analysis of ER mRNA from MCF-7 cells using exon E-specific primers produced two prominent bands of ~200 and 300 bp (data not shown). Fourteen of the RACE clones were sequenced, and the 5′-most base correlated with the S1 analysis data (Fig. 2B). One clone extended several bases beyond the length of S1 fragment A, and the rest extended over an 80-base range with lengths similar to fragments B-E identified by S1 analysis. A cluster of 7 clones had 5′ ends within an 11-base range just 3′ to the fragment corresponding to S1 product B. Since the S1 nuclease fragment sizes are estimated from their mobility in gel electrophoresis, the S1 nuclease results and 5′-RACE results are in excellent agreement.
DNase I-hypersensitive Sites. The location of DNase I-hypersensitive sites has been shown to correspond to the position of binding by transcription factors involved in transcriptional regulation of a gene (30). Therefore, mapping DNase I-hypersensitive sites can provide important information about the location of the transcriptional regulatory regions. Previously published data (31) had identified three DNase I-hypersensitive sites near exon 1 that were specific to ERα-positive cells: HS1 located near the cap site for exon 1, HS2 at approximately −350 bp, and HS3 at approximately −2000 bp (positions relative to the cap site for exon 1). A repeat of this experiment confirmed these results and localized the sites with more precision (Fig. 3A). The first hypersensitive site (HS1) was observed within the 5′-untranslated region of exon 1 at approximately +200 bp. A second hypersensitive site was observed at −800 bp (HS2), and a third hypersensitive site (HS3) was seen at −2000 bp. In agreement with previous work, each of these three hypersensitive sites was specific to ERα-positive cells, and we have adopted the earlier nomenclature. The location of HS1 coincides with binding sites for AP2γ, which were previously identified in a functional promoter analysis of the main ERα promoter (15). HS3 appears to be associated with exon 1′ and may be functionally related to the P0 promoter (also known as the B promoter) (17) for exon 1′. Exon 1′ may be transcribed as a 110-base or 1206-base exon, both of which end at a splice donor site at −1884 (12), indicating that there may be two alternate transcriptional start sites. If the smaller exon is expressed, HS3 would be mapped just upstream of the start of exon 1′. HS2 is located adjacent to exon E and may relate to the promoter controlling transcription starting at this cap site.
DNase I-hypersensitive site analysis was performed using an exon Ha probe on Southern blots of DNA from MCF-7 and MDA-MB-231 cells to locate potential regulatory elements involved in transcription of the H transcript (Fig. 3B). Southern analysis of DNA digested with SwaI revealed a 32-kb fragment in both MCF-7 and MDA-MB-231 DNA, which is consistent with the size of the SwaI fragment determined from the BAC mapping experiments. An additional band migrating at 15 kb was observed in MCF-7 DNA treated with DNase I (Fig. 3B, top panel, HS4). This band mapped a hypersensitive site that is specific to ERα-positive cells to a location ~5 kb downstream of exon Hb. Southern analysis of DNA digested with XhoI dem- , HS5 and HS6) and are, therefore, not specific to ERα-positive cells. These bands mapped hypersensitive sites to the locations indicated on the diagram below the autoradiographs. HS5 mapped to a location ~2 kb upstream of exon Hb, and HS6 mapped to a location near or within exon Ha.
The HS1-hypersensitive Site Is Present in ERα-positive Cells. The location of the HS1-hypersensitive site suggests that this alteration of chromatin structure is required for transcription from the main ERα promoter. If this hypothesis were correct, the HS1 site should be found invariably in cells that express ERα. The chromatin structure of a panel of ERα-positive and ERα-negative cell lines was previously examined for the presence of the HS1 site (31). Fig. 4 shows the results of an analysis of the HS1 site in an additional panel of cell lines. The ERα-positive breast carcinoma cell lines T47-D, ZR75-1, BT20, and MDA-MB-361 all demonstrated the HS1 site. The ERα-positive endometrial cancer cell line ECC-1 also demonstrated the HS1-hypersensitive site. In distinct contrast, the HS1 site was not found in an analysis of the chromatin structure of the ERα-negative cell lines HEC1A, HBL-100, or HeLa (see Fig. 4). Together with the results in Fig. 3A, this analysis of the HS1 site in 10 cell lines demonstrated a striking correlation between the expression of ERα and the presence of the HS1 site.

FIG. 3. DNase I-hypersensitive sites in region of ERα cap sites.
A, hypersensitive sites in region of exons 1, E, and 1′. Hypersensitive sites were determined using chromatin extracted from MCF-7 (ERα-positive) and MDA-MB-231 (ERα-negative) cells. Chromatin was untreated (0) or treated with increasing amounts of DNase I as indicated. Hypersensitive sites HS1, HS2, and HS3 correspond to sites specific to ERα-positive cells as reported previously (30). Map below shows schematic location of exons 1, E, and 1′ relative to location of hypersensitive sites and probe used in the hybridization. Arrows indicate major cap sites identified. B, hypersensitive sites in region of exons Ha and Hb. Results for mapping DNase I-hypersensitive sites are presented in a manner similar to exons 1, E, and 1′. Data demonstrate hypersensitive site HS4, which was specific to ERα-positive cells. Location of site was confirmed in two experiments with digestion with SwaI or XhoI. Hypersensitive sites HS5 and HS6 were identified in both cell lines.
Previous studies have shown that each of the cell lines that demonstrate the HS1 site (MCF-7, T47-D, ZR75-1, BT20, MDA-MB-361, and ECC-1) overexpresses AP2γ, whereas the cell lines that lack the HS1 site (MDA-MB-231, HEC1A, HBL-100, and HeLa) express negligible amounts of the AP2 factors (15,32). This result suggests that the HS1 site may be generated by AP2 binding to chromatin in the ERα promoter.
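The concordance described above can be tallied directly from the cell lines named in the text. The following is only an illustrative sketch, not an analysis performed in this study: the function names are hypothetical, and the chance calculation assumes that the six lines bearing HS1 could in principle have been any six of the ten lines examined.

```python
from math import comb

# (ER-alpha status, HS1 site observed) for each cell line named in the text
# (Fig. 3A and Fig. 4). Names of functions below are illustrative only.
CELL_LINES = {
    "MCF-7": ("positive", True),
    "T47-D": ("positive", True),
    "ZR75-1": ("positive", True),
    "BT20": ("positive", True),
    "MDA-MB-361": ("positive", True),
    "ECC-1": ("positive", True),
    "MDA-MB-231": ("negative", False),
    "HEC1A": ("negative", False),
    "HBL-100": ("negative", False),
    "HeLa": ("negative", False),
}


def tally(lines):
    """Count cell lines in each (ER status, HS1 present) category."""
    table = {
        ("positive", True): 0,
        ("positive", False): 0,
        ("negative", True): 0,
        ("negative", False): 0,
    }
    for status, has_hs1 in lines.values():
        table[(status, has_hs1)] += 1
    return table


def chance_of_exact_match(n_lines, n_with_site):
    """Probability that a randomly chosen set of n_with_site lines coincides
    with one specified set of the same size (here, the ER-positive lines)."""
    return 1 / comb(n_lines, n_with_site)


table = tally(CELL_LINES)
n_with_hs1 = table[("positive", True)] + table[("negative", True)]
p = chance_of_exact_match(len(CELL_LINES), n_with_hs1)
print(table)
print(f"chance of a perfect match under random assignment: {p:.4f}")
```

With 10 lines and 6 bearing HS1, the chance of a perfect match under this simple model is 1 in 210, which is one way to express how striking the correlation is.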
AP2γ Induced the HS1-hypersensitive Site in HMECs-The location of HS1 mapped to the untranslated region of exon 1 and coincides with the location of high affinity binding sites for the AP2 transcription factors (15). Functional analysis of the main ERα promoter indicated that AP2γ as well as AP2α is able to induce transcription from the main ERα promoter by binding to the AP2 sites found in the untranslated leader of exon 1 (15,26). To define better which AP2 proteins may be involved in regulation of the ERα gene in MCF-7 cells, an AP2γ-specific antiserum was generated. As seen in Fig. 5, the commercially available polyclonal antibody, SC-184, is able to supershift purified AP2α and AP2γ. The antiserum, AP, is specific for AP2γ and does not supershift purified AP2α (Fig. 5). An analysis of MCF-7 nuclear extracts indicates that the majority of AP2 activity in MCF-7 is supershifted with the AP antiserum. Densitometer analysis indicated that ~75% of the AP2 activity in MCF-7 nuclear extract is supershifted by the AP2γ-specific antiserum. Assuming that homo- and heterodimers of AP2γ will supershift, it is estimated that ~50% of the AP2 activity in MCF-7 cells is AP2γ.
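The step from ~75% supershifted activity to ~50% AP2γ is not spelled out in the text. One reading that reproduces the numbers is sketched below, under assumptions that are ours rather than the authors': AP2 subunits pair at random, and any dimer containing at least one AP2γ subunit is supershifted. If p is the fraction of subunits that are AP2γ, the supershifted fraction is then 1 - (1 - p)^2. The function names are hypothetical.

```python
from math import sqrt


def shifted_fraction(p_gamma: float) -> float:
    """Fraction of dimers containing at least one AP2-gamma subunit,
    ASSUMING random pairing of subunits (not stated in the text)."""
    return 1.0 - (1.0 - p_gamma) ** 2


def gamma_fraction(shifted: float) -> float:
    """Invert shifted_fraction: subunit fraction of AP2-gamma implied by the
    observed supershifted fraction, under the same random-pairing assumption."""
    if not 0.0 <= shifted <= 1.0:
        raise ValueError("shifted fraction must lie between 0 and 1")
    return 1.0 - sqrt(1.0 - shifted)


# Densitometry value reported in the text: ~75% of AP2 activity supershifted.
estimate = gamma_fraction(0.75)
print(f"implied AP2-gamma fraction: {estimate:.2f}")
```

Under this model a supershifted fraction of 0.75 corresponds to an AP2γ subunit fraction of 0.5, in line with the estimate given above. A different dimerization preference would give a different value.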
Functional promoter analysis established an important role for AP2γ in regulating the expression of the main ERα promoter in ERα-positive cells (15). HS1 was specific to the chromatin of ERα-positive cells, and the location of HS1 coincided with the location of the AP2-binding sites. These two findings led us to hypothesize that binding of AP2γ to high affinity sites in the ERα promoter induces the HS1 site. HMECs express minimal amounts of ERα mRNA or protein and are generally considered to be ERα-negative. An adenovirus was engineered that expressed AP2γ (AdAP2γ) and was used to induce overexpression of AP2γ in HMECs. Infection of HMECs with AdAP2γ induced high levels of AP2 expression, which was supershifted with the AP2 polyclonal antibody SC-184 (Fig. 6A). However, no AP2 activity was detected in HMECs infected with wild-type adenovirus (AdWT). AP2γ expression was able to induce the HS1-hypersensitive site in ERα exon 1 (Fig. 6B). However, HS2 and HS3 were not induced by AP2γ expression. Infection of HMECs with AdWT did not induce formation of any hypersensitive sites in the ERα exon 1 region (Fig. 6B). These experiments were repeated using an adenoviral construct that expressed the AP2α protein. Identical results were obtained (data not shown), indicating that binding of either AP2α or AP2γ is capable of altering the chromatin structure of the ERα promoter and inducing the HS1 site, which is a characteristic of the chromatin in ERα-positive cells.
FIG. 5. Gel shift with AP2γ-specific antiserum. Gel shift using AP2-binding site probe and purified AP2α, AP2γ-GST fusion protein (26), or MCF-7 nuclear extract. Antibody used in supershift is either SC-184, which shifts both AP2α and AP2γ, or antiserum AP, which is AP2γ-specific. Free probe is not shown. The panel on the right shows that the majority of AP2 activity in MCF-7 cells is supershifted with AP antiserum.
FIG. 6. HS1 DNase I-hypersensitive site is induced in HMECs by AP2γ. A, AP2 activity. A demonstrates that infection of HMECs with AdAP2γ virus generated AP2 activity that co-migrates with activity in MCF-7 nuclear extract and is supershifted with the AP2 antibody SC-184. B, hypersensitive site in HMECs. B demonstrates that the AdAP2γ virus induced the HS1-hypersensitive site in HMECs. Infection of HMECs with AdWT had no effect.
AP2γ Induces Transcription of the ERα Promoter in HMECs-Expression of the endogenous ERα gene was examined in HMECs in which AP2 expression was induced by viral infection. After viral infection with AdAP2α, AdAP2γ, or AdWT, no changes in the level of endogenous ERα expression were detected in HMECs (data not shown). This result suggests that either additional factors or additional changes in chromatin structure are required to induce endogenous ERα expression in mammary epithelial cells. To investigate these possibilities further, HMECs were transfected with a reporter construct in which the ERα promoter was inserted upstream of luciferase. Fig. 7 shows the results of experiments in which an AP2 expression construct was co-transfected into HMECs with ERα promoter reporter plasmids. As seen in Fig. 7, the pGL2-Basic reporter has relatively low basal activity in HMECs. The activity of pGL2-Basic was not altered by co-transfection of an AP2γ expression construct. An ERα promoter reporter containing the untranslated leader to +230 (ER3794-230LUC) had a low level of basal expression in HMECs that was identical to that of the promoterless pGL2-Basic construct. However, AP2γ was able to trans-activate the ERα promoter in HMECs and induced expression from the promoter by approximately 10-fold. An ERα promoter truncation that deletes the untranslated leader (ER3794-0LUC) lacks the region of the promoter containing the HS1-hypersensitive site. ER3794-0LUC has the same basal expression in HMECs as the larger construct containing the untranslated leader. However, this construct is profoundly reduced in its ability to be trans-activated by AP2γ. These results clearly demonstrate that AP2γ can trans-activate the ERα promoter in HMECs by interacting with the region of the promoter encompassing the HS1-hypersensitive site.
DISCUSSION
Breast cancers that express ERα are more likely to occur in postmenopausal women (4), are associated with a better prognosis (6,7), and are more likely to respond to hormonal therapy (4,5) than tumors that do not express the receptor. ERα-positive breast cancers overexpress ERα protein and have 10-100-fold more ERα than normal mammary epithelial cells (33). Studies of breast cancer cell lines (8) and primary carcinomas (9,10) indicate that transcriptional regulation is a critical level of control of ERα expression. The genomic map of the 5′ region of the ERα locus described in this study provides a physical map of the ERα cap sites and the location of regulatory regions involved in transcriptional control. Thus, these data provide the basis for a functional analysis of the alternate ERα promoters in breast cancer. The hypersensitive site, HS1, is invariably found to be a feature of chromatin in cell lines that express ERα. The location of the HS1 site corresponds to binding sites for the AP2 transcription factors in the ERα promoter, and the HS1 site can be generated in HMECs by expression of the AP2γ transcription factor. We have further shown that AP2γ can trans-activate the ERα promoter in HMECs by interacting with the region of the promoter encompassing the HS1 site. Since ERα expression is necessary for hormone response, these results provide further evidence that overexpression of AP2γ is a critical mechanism in the oncogenesis of hormone-responsive breast cancer.
Exons 1-8 of the ERα gene encode the ERα protein and span a region of ~160 kbp of genomic DNA (34). We have developed a contig map that includes all of the known 5′ noncoding exons of the ERα gene and adds an additional 160-170 kbp to the ERα gene locus, which brings the total size of the ERα locus to ~300 kbp. We were surprised to find that exon Ha is ~124 kbp upstream of the coding region of the ERα gene and that the intron between exon Hb and the splice acceptor site in exon 1 is over 110 kbp. Exon C has been described as a liver-specific ERα exon (13). Exon C has been reported to splice to exon Hb (14), and it was fortuitous that this exon was also in the contig. Exon C is located ~30-40 kbp upstream of exon Ha, contrary to the genomic location for exon C in a previous report (25).
The two alternate ERα transcripts, E and H, are both initiated with genuine cap sites, indicating that there are separate ERα promoters controlling expression of the ERα gene. In addition, there are multiple cap sites for exons Ha and E, as demonstrated by the S1 nuclease analysis and 5′-RACE results (Fig. 2). In the case of exon Ha, these cap sites are clustered over ~70 bp, whereas for exon E, there appear to be two separate clusters of cap sites separated by 120 bp. Multiple cap sites are a common feature of TATA-less promoters, which is consistent with the lack of clear TATA elements associated with either the exon Ha or E 5′-flanking sequence.
It is interesting that the ERα gene is controlled by several different promoters. The biologic basis for a gene having multiple promoters that encode the same protein is not entirely clear. A recent paper (35) described the expression of an isoform of ERα that lacked the amino terminus of the protein. The truncated ERα isoform repressed trans-activation by the full-length ERα protein. This isoform was encoded by an H-type transcript that skipped exon 1 and spliced directly to exon 2. It seems likely that one purpose of these alternate cap sites may be to regulate expression of ERα proteins with alternate function. There may be additional factors involving tissue-specific expression that require the existence of alternate cap sites. Defining the genomic organization, cap sites, and hypersensitive sites associated with these exons is an important step toward determining the regulation of these alternate ERα promoters.
DNase I-hypersensitive sites are regions of chromatin that are open and accessible to DNase digestion (30). These sites correspond to the location of important regulatory regions of eukaryotic gene promoters. Indeed, hypersensitive sites specific to active transcription of ERα are located near each of the ERα cap sites. However, eukaryotic regulatory signals can be far removed from a cap site, and often a hypersensitive site may identify the location of a regulatory sequence relevant to regulation of one or more promoters. Our results are in agreement with other studies that have described hypersensitive sites of the ERα gene near exon 1 (31). In addition, a new hypersensitive site, HS4, has been identified that is specific to ERα-positive cells. The HS4 site is just downstream of exon Hb and may represent a regulatory element controlling expression of ERα transcription initiated at exon Ha.
FIG. 7. ERα promoter analysis in HMECs. HMECs were transfected with the reporter plasmids pGL2-Basic, ER3794-230LUC, or ER3794-0LUC. The AP2γ expression plasmid, AP2γ/pcDNA3.1 (26), or vector alone, pcDNA3.1, was co-transfected. Luciferase expression was normalized with a β-galactosidase control vector. Results are the average and standard deviation from three transfections.
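The reporter quantitation described in the Fig. 7 legend (luciferase normalized to a β-galactosidase control, averaged over three transfections) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, and the readings are invented numbers chosen only to echo the approximately 10-fold induction reported in the text. They are not data from this study.

```python
from statistics import mean, stdev


def normalize(luciferase, beta_gal):
    """Divide each luciferase reading by the beta-galactosidase reading from
    the same transfection to correct for transfection efficiency."""
    if len(luciferase) != len(beta_gal):
        raise ValueError("each luciferase reading needs a paired beta-gal reading")
    return [luc / gal for luc, gal in zip(luciferase, beta_gal)]


def summarize(values):
    """Return (average, sample standard deviation) of replicate values."""
    return mean(values), stdev(values)


def fold_induction(with_ap2, vector_only):
    """Ratio of mean normalized activity with AP2 to that with empty vector."""
    return mean(with_ap2) / mean(vector_only)


# HYPOTHETICAL raw readings for three transfections (not data from the paper).
beta_gal = [1.0, 2.0, 0.5]
vector_only = normalize([100.0, 220.0, 45.0], beta_gal)
with_ap2 = normalize([900.0, 2200.0, 500.0], beta_gal)

avg, sd = summarize(with_ap2)
fold = fold_induction(with_ap2, vector_only)
print(f"AP2 co-transfection: {avg:.1f} +/- {sd:.1f}")
print(f"fold induction: {fold:.1f}")
```

Normalizing each well before averaging keeps a single poorly transfected replicate from dominating the mean. Whether the authors normalized per well or per condition is not stated in the legend.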
The HS1 site has been localized to the untranslated leader of the main ERα promoter in exon 1 and is found in all ERα-positive breast and endometrial carcinoma cell lines examined (Ref. 31 and Figs. 3 and 4) but is not a characteristic of the chromatin structure of HMECs (Fig. 6) or ERα-negative cell lines (Ref. 31 and Figs. 3 and 4). The location of the HS1 site corresponds to the region of the ERα promoter that was defined in a functional assay to be responsible for ERα-specific transcription (15,26). This region of DNA bound a factor, initially called ERF-1 (15), that was found to be overexpressed in ERα-positive breast and ERα-positive endometrial cancer cell lines. The ERF-1 factor was cloned and found to be identical to AP2γ (27). ERF-1 in MCF-7 nuclear extract is composed of AP2γ but contains other AP2 factors as well, presumably AP2α (Fig. 5). These two AP2 factors have identical binding specificity, and both are able to induce expression from the cloned ERα promoter (26). The data herein show that AP2α and AP2γ are able to generate the HS1-hypersensitive site in HMECs, indicating that these factors are able to induce changes in chromatin structure that are known to be associated with transcription of the ERα gene. We have further shown that AP2γ can trans-activate the ERα promoter in HMECs and that trans-activation requires the region of the promoter that contains the HS1 site. These results provide compelling evidence that AP2γ has an important role in regulating transcription of the ERα gene by altering chromatin structure of the promoter.
Comparing the genomic structure and expression of the ERα gene in HMECs with that in hormone-responsive breast cancer can be a useful cell culture model to dissect the oncogenic processes that lead to a hormone-responsive tumor. Overexpression of AP2 factors appears to be one important step. This conclusion is supported by data that demonstrated overexpression of AP2γ in breast cancer compared with normal mammary epithelial cells (36). This report also demonstrated a significant correlation between AP2α expression and the ERα-positive breast cancer phenotype (36). However, we were not able to detect an increase in ERα mRNA in HMECs infected with the AdAP2γ or AdAP2α viruses (data not shown). This result is not surprising since other investigators (20) have reported the existence of factors in ERα-negative cells that repress expression of ERα. In addition, other hypersensitive sites are present within the promoter region of the ERα gene that were not generated by expression of AP2 factors alone (Fig. 6), indicating that other trans-activating factors may also be necessary to induce overexpression of the ERα gene in HMECs. Identifying other factors involved in the regulation of ERα gene transcription will be an area for further investigation. In addition, there may be other changes of chromatin structure that are needed to induce overexpression of ERα. Determining what other changes, in addition to AP2, are required to induce overexpression of ERα will help to define the mechanisms of oncogenesis of hormone-responsive breast cancer.