Characterization of the mouse epidermal growth factor promoter and 5'-flanking region. Role for an atypical TATA sequence.

As a step toward delineating mechanisms that regulate its activity, we have characterized the mouse epidermal growth factor (EGF) promoter. Primer extension and S1 nuclease analyses identified prominent (+1/+2) and minor (+28) transcription start sites, with the dominant +1/+2 site located 33 bases downstream from a TTTAAA sequence. A restriction fragment that spanned these start sites and contained 390 base pairs of 5′-flanking sequence directed transcription from the +1/+2 site in vitro in the presence of HeLa cell nuclear extracts. Additionally, it promoted expression of a coupled luciferase reporter gene in transfected cell lines. The inclusion of additional 5′-flanking sequence either stimulated or inhibited luciferase expression depending on the cell line. Approximately 2 kilobases of EGF 5′-flanking sequence was determined and found to contain several motifs with partial homology to steroid hormone response elements. Despite this fact and evidence that EGF expression might be regulated by androgens in vivo, EGF promoter-luciferase constructs were not steroid-responsive in cells cotransfected with steroid receptor expression vectors. An oligonucleotide containing the aforementioned TTTAAA sequence specifically bound TATA-binding protein and TFIIA in gel shift assays, and an EGF promoter-luciferase construct in which the core TA dinucleotide was mutated to CG was not active in transfected cells. These data suggest that the TTTAAA sequence functions as an atypical TATA box.

Epidermal growth factor (EGF) was first identified in mouse salivary gland extracts as an activity that induced premature eyelid opening and tooth eruption when injected into newborn pups (1). It was subsequently and independently characterized as urogastrone, a component of human urine that inhibited gastric acid secretion (2). Although its precise physiological roles are still not known, EGF is a potent mitogen for many epithelial and mesenchymal cells, and it regulates cellular migration and differentiation in vitro (3). Its numerous actions are a result of high-affinity binding to the EGF receptor, a type I protein tyrosine kinase (4). EGF is the prototype of the EGF receptor ligand family, which includes the mammalian cellular proteins transforming growth factor-α (5, 6), amphiregulin (7), heparin-binding EGF (8), betacellulin (9, 10), and epiregulin (11), as well as several proteins encoded by Shope family viruses (12, 13). Characteristics of this family include a conserved three-loop structure and the proteolytic processing of soluble ligands from the ectodomains of bioactive, integral membrane precursors (reviewed in Ref. 14). The 53-amino acid mature EGF is derived from a precursor protein (prepro-EGF) of approximately 1200 amino acids (15, 16).
The EGF gene is prominently expressed in the granular convoluted tubules of the submaxillary salivary gland (SG), the distal convoluted tubules of the kidney, Brunner's glands of the duodenum, and alveolar epithelial cells of the mammary gland (17-20). Other sites of expression have been identified, although some remain controversial. Various findings suggest that expression of the EGF gene in SGs may be regulated by androgens. The male mouse SG contains higher levels of EGF mRNA than its female counterpart, and EGF-positive cells contain androgen receptors (17, 21). Moreover, EGF mRNA levels are increased in SGs of female mice given androgen and, conversely, are decreased in SGs of castrated males (22). Other studies suggest that the EGF gene could also be regulated by estrogens (23) and by the lactogenic hormones prolactin and glucocorticoids (24). Finally, deregulation of EGF expression may be a component of neoplastic progression, since EGF mRNA is markedly elevated in some human tumors compared with their normal tissue counterparts (25-30).
The molecular mechanisms that regulate transcription of the EGF gene have not been delineated. As an initial step, we have begun to characterize the EGF promoter. In the present study, we have refined the mapping of transcription start sites, shown the putative promoter to be active in vitro and in vivo, investigated possible regulation by androgens and glucocorticoids, and assessed the role of a TATA-like sequence.

EXPERIMENTAL PROCEDURES
Materials-Luciferin, dexamethasone, and dihydrotestosterone were from Sigma; the 129SV mouse genomic library and pBluescript SK+ vector were from Stratagene (La Jolla, CA); and radionucleotides were from DuPont NEN. The HeLaScribe nuclear extract in vitro transcription system, Altered Sites II in vitro mutagenesis system, pGL2-basic luciferase vector, S1 nuclease, RNasin, and avian myeloblastosis virus reverse transcriptase were from Promega Corp. (Madison, WI). Tissue culture reagents, LipofectAMINE, oligo(dT)-cellulose, Taq polymerase, and dNTPs were from Life Technologies, Inc. Human anti-TFIID antibody was obtained from Santa Cruz Biotechnology (La Jolla, CA). Oligonucleotides were synthesized by the University of North Carolina Nucleic Acids Core Facility.
Recombinant human TATA-binding protein (TBP) and TFIIA were gifts from Robert Roeder (Rockefeller University, New York, NY); the mouse glucocorticoid receptor expression vector was donated by Keith Yamamoto (University of California, San Francisco, CA); and the mouse mammary tumor virus (MMTV)-luciferase construct was obtained from Ron Evans (Salk Institute, San Diego, CA). EGF cDNA probes were gifts from Graeme Bell (University of Chicago, Chicago, IL), and the rat androgen receptor expression vector was provided by Elizabeth Wilson (University of North Carolina at Chapel Hill).
DNA Sequence Analysis-The nucleotide sequence of the EGF HindIII-XhoI fragment was determined from both strands by the University of North Carolina Automated DNA Sequencing Facility. Manual dideoxy DNA sequencing (31) was used to resolve ambiguous regions.
Analysis of EGF mRNA 5′-Ends-Cytoplasmic RNA was isolated from cultured cells as described by Gough (32), and poly(A)+ RNA was isolated by oligo(dT)-cellulose chromatography (33). Total RNA was purified from mouse tissues using the guanidinium-cesium chloride method (34). The integrity and concentration of RNAs were verified by gel electrophoresis.
Primer extension and S1 nuclease analyses were performed with 10-20 μg of total RNA as previously reported (35), unless otherwise specified. End-labeled primers were complementary to EGF sequences +131 to +170 (primer 1), +43 to +80 (primer 2), −37 to −76 (primer 3), and −260 to −221 (primer 4) (see Fig. 2A). S1 nuclease probes were generated as previously detailed (35). Primers 1 or 2 were annealed to denatured HX-luc (see Fig. 4A) and extended with the Klenow fragment of Escherichia coli DNA polymerase I. The resulting radiolabeled double-stranded products were digested with BanI (−150 bp), and the single-stranded probes were isolated from alkaline denaturing gels. Intensities of +1/+2 and +28 primer extension and S1 nuclease products were determined by laser scanning densitometry (UltraScan XL; LKB Produkter, Bromma, Sweden).
EGF-Luciferase Constructs-EGF promoter and 5′-flanking restriction fragments possessing a common 3′-end at +314 (XhoI site) were cloned upstream of the luciferase gene. The EGF XX and SX fragments were cloned directly into the corresponding pGL2-basic polylinker sites, whereas the EGF DX fragment was inserted into SmaI-XhoI-cleaved pGL2-basic. The EGF HX fragment was excised from XX-luc using a 3′-HindIII site in the polylinker and cloned into the HindIII site of pGL2-basic in the correct orientation (Fig. 4A).
The EGF TTTAAA sequence (−33 to −28 bp) was mutated to a BstBI site (TTCGAA) in SX-luc using the Altered Sites II in vitro mutagenesis system as described by the manufacturer. Briefly, the SX fragment was subcloned into the ampicillin-sensitive pALTER-1 vector, and a single-stranded template was prepared. The first mutant strand was synthesized by annealing the TTCGAA and ampicillin repair oligonucleotides to the single-stranded DNA template in the presence of T4 DNA polymerase and ligase. The resulting ligation products were transformed into repair-minus BMH cells for second-strand synthesis of the mutant template. Mutant DNA was isolated and transformed into DH5α, and clones corresponding to the double-stranded mutated sequence were selected on ampicillin-containing plates. The mutation was confirmed by cleavage with BstBI and dideoxy sequencing.
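The TA-to-CG substitution can be illustrated in a few lines of Python. This is only a sketch: it uses the EGF promoter oligonucleotide sequence quoted under "Electrophoretic Mobility Shift Assays" as a stand-in for the SX-luc template, and the function and variable names are hypothetical, introduced here for illustration.

```python
# Illustrative sketch only; names are hypothetical and not taken from the study.
WILD_TYPE_OLIGO = "TCGACAGAGCTTTAAAAAGGAGAG"  # EGF probe sequence quoted in the text
TATA_LIKE = "TTTAAA"                          # atypical TATA-like motif
BSTBI_SITE = "TTCGAA"                         # BstBI recognition sequence

def mutate_motif(seq, motif=TATA_LIKE, replacement=BSTBI_SITE):
    """Replace the first occurrence of motif with replacement."""
    idx = seq.find(motif)
    if idx == -1:
        raise ValueError("motif not found in sequence")
    return seq[:idx] + replacement + seq[idx + len(motif):]

def differing_positions(a, b):
    """Return the 0-based positions at which two equal-length sequences differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]

mutant = mutate_motif(WILD_TYPE_OLIGO)
print(mutant)
print(differing_positions(WILD_TYPE_OLIGO, mutant))
```

Only the two central bases of the motif change, and the substitution creates a BstBI site that is absent from the wild-type sequence, which is what makes restriction digestion a convenient screen for the mutation.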
In Vitro Transcription-EGF-luciferase constructs were cleaved within the luciferase gene at unique XbaI or BglII sites, and linear templates were isolated from agarose gels prior to use in runoff transcription assays. Transcriptions in vitro were performed using the HeLaScribe nuclear extract in vitro transcription system, with a linear cytomegalovirus (CMV) template included as a positive control. Reactions were carried out in the presence of [α-32P]dUTP (3,000 Ci/mmol, 10 mCi/ml, 50 μCi/reaction) as described by the manufacturer. Alternate reactions were performed in the absence of radioactivity, and the resulting RNAs were purified by phenol-chloroform extraction and ethanol precipitation prior to use as templates in primer extension assays.
Cell Culture and Transfection-CHO, NRK-52E, and COS cells (American Type Culture Collection, Rockville, MD) were maintained in Dulbecco's modified Eagle's medium, 10% fetal bovine serum, 50 μg/ml gentamicin, and 0.1 mM nonessential amino acids (maintenance media). For transfection, cells were grown to 50-60% confluence, and complete media were replaced with serum-free media. At time 0, DNAs (1 μg of pGL2-basic or the molar equivalent of larger constructs) were introduced via LipofectAMINE treatment (6 μl of LipofectAMINE/35-mm culture dish) as instructed by the manufacturer. Cells were rescued 6-8 h later by the addition of 1 volume of complete media containing 20% serum. After 24 h, transfection media were removed and replaced with maintenance media or, for experiments shown in Fig. 6, media containing steroid hormones. After 42-48 h (or 18-24 h of hormone treatment), cells were harvested for assay. All transfections were performed in duplicate.
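Dosing larger constructs at the "molar equivalent" of 1 μg of pGL2-basic amounts to scaling DNA mass in proportion to plasmid length. A minimal Python sketch of that arithmetic follows. The vector size of 5597 bp is an assumption that should be checked against the vector map, and the 2.3-kb insert in the example is a hypothetical value chosen only for illustration.

```python
# Assumed size of the promoterless pGL2-basic vector in bp (check the vector map).
PGL2_BASIC_BP = 5597

def molar_equivalent_mass(reference_mass_ug, reference_bp, construct_bp):
    """Mass (in micrograms) of a construct that contains the same number of
    molecules as reference_mass_ug of the reference plasmid.
    For double-stranded DNA, mass per molecule scales with length."""
    if reference_bp <= 0 or construct_bp <= 0:
        raise ValueError("plasmid sizes must be positive")
    return reference_mass_ug * construct_bp / reference_bp

# Hypothetical example: a construct carrying a 2.3-kb insert in pGL2-basic.
example_construct_bp = PGL2_BASIC_BP + 2300
print(round(molar_equivalent_mass(1.0, PGL2_BASIC_BP, example_construct_bp), 2))
```

Equalizing molecule number rather than mass keeps the number of promoter copies per dish comparable across constructs of different size.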
Luciferase Expression-Plated cells were resuspended in 1 ml of phosphate-buffered saline (4°C), pelleted at 14,000 rpm, and resuspended in 300 μl of 100 mM K2HPO4 (pH 7.8). Cells were then lysed with three successive freeze-thaw cycles, and the luciferase activity of individual samples was measured in duplicate. Lysate (100 μl) was placed in the luminometer, and reactions were initiated by automatic injection of 200 μl each of luciferin reagent (200 μM luciferin in 25 mM glycylglycine, pH 7.8) and assay buffer (25 mM glycylglycine, pH 7.8, 15 mM K2HPO4, pH 7.8, 15 mM MgSO4, 4 mM EGTA, 2 mM ATP, and 1 mM dithiothreitol). Luciferase activity was measured for 15 s at ambient temperature immediately following the addition of reagents (AutoLumat LB953 luminometer; Berthold Analytical Instruments, Inc., Nashua, NH). Relative light units were corrected for lysate protein content.
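The fold-stimulation values reported for the reporter constructs can be thought of as protein-corrected light units divided by the corresponding value for the promoterless vector. The Python sketch below shows one way to do that calculation. The exact normalization used in the study is not specified beyond correction for protein content, so the averaging scheme, the function names, and all numbers here are assumptions for illustration.

```python
def normalized_activity(rlu, protein_ug):
    """Relative light units corrected for lysate protein content."""
    if protein_ug <= 0:
        raise ValueError("protein content must be positive")
    return rlu / protein_ug

def mean_normalized(replicates):
    """Mean protein-corrected activity over (rlu, protein_ug) replicates."""
    values = [normalized_activity(rlu, protein) for rlu, protein in replicates]
    return sum(values) / len(values)

def fold_over_background(sample_replicates, background_replicates):
    """Fold stimulation of a construct over the promoterless vector."""
    return mean_normalized(sample_replicates) / mean_normalized(background_replicates)

# Hypothetical duplicate measurements: (relative light units, micrograms of protein).
promoterless = [(1000, 50), (1200, 60)]
construct = [(5000, 50), (6000, 60)]
print(fold_over_background(construct, promoterless))
```

Expressing activity relative to the promoterless vector within each cell line makes the constructs comparable across lines that differ in transfection efficiency, although it does not by itself correct for that difference.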
Electrophoretic Mobility Shift Assays-Double-stranded DNA probes encoding the EGF promoter TTTAAA sequence (5′-TCGACAGAGCTTTAAAAAGGAGAG-3′) and the adenovirus major late promoter TATAAA sequence (5′-GGGCTATAAAAGGC-3′) were radiolabeled with 32P and separated from free radionucleotide using a G-50 column. Purified recombinant TBP (His6-human TBP, 7.1 ng) and TFIIA (8 ng) were added to 20-μl binding reactions containing 20 mM HEPES, pH 7.5, 17 mM KCl, 1 mM dithiothreitol, 0.1 mM EDTA, 4% Ficoll, 0.5 μg poly(dI-dC), 5 mM spermidine, 0.0125% Nonidet P-40, and 75 μg/ml bovine serum albumin. For competition reactions, a 25-fold molar excess of unlabeled competitor DNA was added prior to the addition of the probe, and samples were incubated for 15 min on ice. The probe (0.06-0.12 ng, 40,000 cpm) was then added, and the reactions were incubated for an additional 20 min at 25°C. Resulting products were separated on native 5% polyacrylamide gels containing 1× Tris/glycine/EDTA, 0.05% Nonidet P-40, and 2.5% glycerol. For supershift reactions, the probe, TBP, and TFIIA were first incubated for 15 min on ice, and then 3 μg of human anti-TBP antibody was added for 20 min at room temperature prior to electrophoresis.

Isolation and Characterization of the Mouse EGF Promoter and 5′-Flanking Region-To evaluate the hormonal and tissue-specific regulation of the mouse EGF gene, we isolated genomic sequences containing the EGF promoter and 5′-flanking region from a 129SV mouse liver genomic library. The probe was a 603-bp fragment generated from mouse kidney genomic DNA via polymerase chain reaction amplification and primers encompassing the putative EGF promoter (36). Eight positive λ-FIX II clones were characterized by restriction enzyme cleavage; the largest (7B) contained approximately 17 kb of sequence 5′ to exon 1 of the EGF gene as well as approximately 4.5 kb of intron 1 sequence (15, 16). The 22-kb NotI fragment from clone 7B was shuttled into pBluescript SK+ vector, and a partial restriction enzyme map was determined by Southern blotting using the aforementioned 603-bp probe (Fig. 1). Predicted restriction fragments were confirmed by Southern analysis of mouse genomic liver DNA to exclude possible gene rearrangements or splicing events that might have occurred during the cloning process.
Identification of the EGF Transcription Initiation Sites-Using S1 nuclease analysis, a single primer, and mouse SG and kidney RNAs, Pascall and Brown (36) previously identified a single prominent and two minor downstream EGF mRNA 5′-ends. By comparison with a molecular weight ladder, they associated the prominent band with a cytosine residue. Since the mouse EGF promoter contains multiple TA-rich elements that could function as atypical TATA boxes (see Fig. 2A), we wanted to exclude possible transcription initiation at other sites in the flanking regions. Additionally, we wanted to more accurately map transcription initiation sites by comparison with sequence ladders derived from the EGF promoter region itself. Accordingly, we synthesized a set of four oligonucleotide primers, each of which corresponded to sequences downstream of three putative TATA box-like sequences as well as the previously reported EGF 5′-end (Fig. 2A). Primers 1-4 were used in primer extension assays, whereas radiolabeled primers 1 and 2 were used to make S1 nuclease probes with uniform 5′-ends produced by BanI cleavage (see Fig. 2A). Since EGF mRNA is expressed at particularly high levels in adult mouse SG and kidney but is present at very low or undetectable levels in brain (17; S. E. Fenton, unpublished observation), we used total RNA from these three tissues as templates in the primer extension and S1 nuclease assays. Fig. 2B shows that when radiolabeled primer 1 was used in primer extension assays with SG and kidney RNAs, a prominent cluster of two or more bands of roughly 170 bases in length was observed. Primer 2 confirmed this result and resolved the cluster to two principal bands; by comparison with an EGF promoter sequencing ladder generated from the same primer, these two bands corresponded to adjacent adenosine residues located immediately 3′ to the cytosine previously identified by Pascall and Brown (36). We hereafter refer to the most 5′ adenosine residue as +1.
Using RNAs from SG and kidney, primer 1 also detected a less prominent 5′-end corresponding to an adenosine at +28 and occasionally other minor extension products as well. The +28 site likely corresponds to a minor 5′-end previously identified by Pascall and Brown (36) in SG samples, which they associated with an adenosine residue located two bases further upstream. The +28 product could not be confirmed with primer 2, since the latter corresponds to sequences from +43 to +80. However, S1 nuclease assays performed with SG and kidney RNAs and probes generated from primers 1 (Fig. 2B) and 2 (data not shown) yielded prominent products corresponding to both the +1/+2 and +28 sites. In contrast, primer extension reactions with primers 3 and 4 (Fig. 2A) did not yield products with any of the mouse RNAs tested, even when higher concentrations of RNA were used (20 versus 10 μg). The fact that +1/+2 and +28 primer extension and S1 nuclease products were more abundant in SG than in kidney RNA and were not detected with brain samples is consistent with the relative EGF mRNA abundance in these tissues as judged by Northern blot analysis (17; S. E. Fenton, unpublished observation). Collectively, these data confirm and refine the previously reported EGF mRNA 5′-ends (36). Specifically, they indicate that transcription of the EGF gene in SG and kidney initiates at two sites, with the +1/+2 site being dominant. Interestingly, densitometric analysis of the autoradiographs shown in Fig. 2B revealed that the +1/+2 site is used 7-fold more frequently than the +28 site in SG, but only 2-fold more frequently in kidney. This suggests that transcription is selectively enhanced via the +1/+2 site in SG.

FIG. 1. The mouse EGF promoter and 5′-flanking region. A 22-kb genomic DNA fragment encompassing the mouse EGF promoter was isolated from a 129SV library using a polymerase chain reaction-derived probe corresponding to bases +60 to −528 (36). The clone was restriction enzyme mapped by comparison with λ-HindIII molecular weight markers in Southern analysis. The location of the major (+1/+2) transcription start site is indicated (arrow

FIG. 2. Identification of transcriptional start sites. A, derivation of primer extension and S1 nuclease probes. The approximate locations of primers are indicated with bold underlines, and the structures of BanI-cleaved S1 nuclease probes generated with primers 1 and 2 are shown. The relative locations of three putative TATA boxes are illustrated. B, EGF mRNA 5′-end analyses. Total RNAs from adult mouse SG (male, 10 μg), kidney (K, female, 10 μg), and brain (B, male, 20 μg) were used as templates in S1 nuclease (lanes 1-3) and primer extension (lanes 4-7) assays. Arrowheads, extension products corresponding to major (+1/+2) and minor (+28) sites that were identified using primer 1. Arrows, doublet of extension products generated with primer 2 that correspond to the +1/+2 start sites. M, T lane from a set of dideoxy sequencing reactions of HX-luc (see Fig. 4A) generated with primers 1 or 2. Note that primers 3 and 4 did not yield primer extension products with either SG or kidney RNA.
Transcriptional Activity of the EGF Promoter and 5′-Flanking Region-Functional activity of the putative EGF promoter has not been previously demonstrated. Accordingly, we tested its activity both in vitro and in vivo. To assay its ability to direct transcription in vitro in the presence of crude HeLa cell nuclear extracts (Promega), an XhoI fragment containing 6.7 kb of sequence 5′ of the start site was cloned upstream of the firefly luciferase gene (XX-luc), and runoff templates were produced by cleavage at unique sites (BglII or XbaI) within the luciferase gene. The dihydrofolate reductase and CMV promoter constructs were also linearized and used in conjunction with radiolabeled molecular weight markers for size comparisons. As expected, the ScaI-linearized dihydrofolate reductase template produced two bands of 780 and 736 bases (37), whereas the linearized CMV template yielded a product of 363 bases (Fig. 3). Transcription of the XbaI-cleaved EGF-luciferase template produced two closely spaced products not observed with the parental luciferase vector. The larger product was a diffuse band(s) of approximately 401-410 bases; the smaller, more distinct product had an estimated length of 385 bases. These sizes are similar to those expected on the basis of nucleotide sequence for runoff products initiated at the +1/+2 (404 bases) and +28 sites (376 bases), respectively (Fig. 3). Consistent with these results, transcription of an EGF-luciferase template that had been cleaved at a BglII site closer to the EGF promoter produced two similarly spaced bands of appropriately reduced size (data not shown). Additionally, although transcription of alternate EGF-luciferase templates containing 390 (SX-luc) or 2000 (HX-luc) bases of 5′-flanking sequence produced comparable products, a template containing only 30 bases of 5′-flanking sequence (DX-luc) did not yield identifiable transcripts (data not shown). The latter result suggests that sequences upstream of −30 are required for EGF promoter activity in vitro.
To verify correct initiation in vitro, transcripts derived from the XhoI-XhoI EGF-luciferase construct were assayed by primer extension using primer 1. Fig. 3 shows that a dominant extension product corresponding to the +1/+2 site was obtained together with minor, smaller products. Notably, despite the generation of an in vitro transcript the size of which appeared consistent with initiation at +28 (Fig. 2B), no corresponding extension product was observed. However, aberrant minor products were evident at bases +41 and +55.
Transcriptional activity in vivo was established via transient transfection of EGF-luciferase constructs into cultured cell lines. EGF fragments possessing a common 3′-end (XhoI; +314) but containing 6.7 kb (XhoI-XhoI; XX), 2.0 kb (HindIII-XhoI; HX), 0.4 kb (SacI-XhoI; SX), and 0.03 kb (DraI-XhoI; DX) of sequence 5′ to the transcriptional start site were cloned upstream of the luciferase reporter gene in pGL2 (Fig. 4A). The final vectors were then transiently transfected into CHO and NRK-52E cells, and the resulting luciferase activity was measured after 48 h. As shown in Fig. 4B, EGF promoter activity was confirmed in both cell types, although the effect on luciferase activity of increasing amounts of EGF 5′-flanking sequence differed considerably between the two cell lines. Thus, in CHO cells, optimal activity was obtained with SX-luc, which yielded a 5-fold increase in luciferase activity relative to the promoterless vector; the inclusion of additional 5′-flanking sequence decreased luciferase expression from HX- and XX-luc to only 2.2- and 1.3-fold above background, respectively. In contrast, in NRK-52E cells, the larger HX- and XX-luc constructs were most active, yielding 60- and 62-fold increases in luciferase activity over background, respectively. In both CHO and NRK-52E cells, the DX-luc construct, which contained only 30 bases of 5′-flanking sequence, was inactive, consistent with the in vitro transcription results described above. Primer extension assays performed with primer 1 and poly(A)+ RNA from cells transfected with XX-, HX-, and SX-luc yielded extension products that by comparison with an EGF promoter sequence ladder corresponded to initiation at the +1/+2 site (data not shown).

FIG. 3. Transcription in vitro of EGF-luciferase vectors. Left, the XX-luc plasmid was linearized with XbaI and transcribed in the presence of crude HeLa cell nuclear extract. Resulting products were resolved on a 6% acrylamide-urea gel as described under "Experimental Procedures." Arrowheads, specific transcripts derived from the EGF promoter template. Products resulting from transcription of linearized dihydrofolate reductase and CMV promoters are shown for comparison. Markers (M) are radiolabeled HaeIII φX174 fragments. Assays were repeated three times. Right, RNA (20 μg) derived by in vitro transcription of the XX-luc construct was annealed to primer 1 and analyzed by primer extension as in Fig. 2B. Arrowhead, prominent extension product(s) corresponding to initiation in vitro at the +1/+2 site.
Although in NRK-52E cells, SX-luc was less active than larger templates containing more 5′-flanking sequence, it nevertheless produced a greater fold stimulation over the promoterless vector in this cell type than in CHO cells (20- versus 5-fold). In fact, relative to the pGL2-basic background or to levels of activity produced by CMV- and SV40-luc templates, the EGF promoter was dramatically more active in NRK-52E cells (and in kidney-derived COS cells; see Fig. 6) than in CHO cells, even though CHO cells transfect with considerably greater efficiency than NRK-52E cells. Whether this cell type enhancement reflects tissue-specific regulation of the EGF promoter is presently unknown.
Sequence Analysis of the Proximal EGF Promoter Region-The nucleotide sequence of the 2.3-kb HindIII-XhoI fragment, which contains approximately 2 kb of sequence 5′ to the dominant +1/+2 start site, is shown in Fig. 5. The translational start site is at +352 bp (15, 16) and is not shown. The sequence from −897 to +314 bp is identical to that previously reported (36). In addition to a putative atypical TATA box (TTTAAA) at −33 bp (see below), the EGF promoter contains several polypurine-rich motifs and consensus binding sequences for the transcription factors NF-κB, GAS, AP-1, AP-2, AP-3, Sp1, p53, and C/EBP (Fig. 5), as defined by the transcription factor data set in the Genetics Computer Group program.
The EGF-Luciferase Constructs Are Not Androgen Responsive-As mentioned above, studies in vivo suggest that the EGF gene may be responsive to androgens. Our sequence revealed that the EGF promoter 5′-flanking sequence from −648 to +314 contains two six-base sequences that are identical to the 3′-portion of the 15-base consensus steroid hormone response element (HRE, GGTACANNNTGTTCT; Ref. 38), and the additional 5′-flanking region from −2048 to −649 includes several other potential half-sites. Furthermore, a 15-base sequence from +226 to +240 is 73% identical to the consensus HRE. To determine whether any of the putative HREs confer direct androgen responsiveness on the promoter, EGF-luciferase constructs were transiently transfected into COS, CHO, and NRK-52E cells either alone or in conjunction with mouse androgen receptor expression vector (provided by Elizabeth Wilson). Following transfection, cells were maintained in serum-free media and after 24 h were exposed to 0.1 nM dihydrotestosterone. An additional 24 h later, cells were harvested, and luciferase expression was measured. For comparison, cells were transfected with an MMTV-luc expression vector (provided by Ron Evans). Fig. 6 shows that activity from MMTV-luc was induced by the combination of androgen receptor expression and dihydrotestosterone treatment in all three cell lines. Relative to expression in nontreated control cells, MMTV-luc activity was increased 4-, 6-, and 9-fold in NRK-52E, CHO, and COS cells, respectively. In contrast, HX-luc activity was decreased in CHO and NRK-52E cells in response to hormone treatment. And although the overall EGF-luc activity was higher in COS cells, it was insignificantly increased in hormone-treated samples (Fig. 6). A similar lack of induction by androgens was observed with the EGF XX- and SX-luc constructs (data not shown). We also tested the EGF promoter for glucocorticoid responsiveness. Whereas MMTV-luc was induced more than 25-fold in the presence of dexamethasone and glucocorticoid receptor (provided by Keith Yamamoto), the activity of the EGF XX-, HX-, and SX-luc constructs was unchanged compared with expression in nontreated control cells (data not shown). Interestingly, we note that these experiments revealed the EGF promoter to be dramatically less active in the absence of serum in all three cell lines (compare Figs. 4 and 6).
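The kind of sequence comparison described above, scoring a 15-base candidate against the consensus HRE and locating TGTTCT half-sites, can be sketched in Python as follows. This is an illustrative sketch and not the procedure used in the study: treating N positions as matching any base is an assumption (the text does not say how identity was counted), the function names are hypothetical, and both test sequences are invented for the example rather than taken from the EGF promoter.

```python
CONSENSUS_HRE = "GGTACANNNTGTTCT"  # consensus quoted in the text (Ref. 38)
HALF_SITE = "TGTTCT"               # 3' half-site quoted in the text

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]

def percent_identity(candidate, consensus=CONSENSUS_HRE):
    """Percent of positions agreeing with the consensus.
    Assumption: an N in the consensus counts as a match to any base."""
    if len(candidate) != len(consensus):
        raise ValueError("candidate and consensus must have equal length")
    matches = sum(
        1 for base, ref in zip(candidate.upper(), consensus)
        if ref == "N" or base == ref
    )
    return 100.0 * matches / len(consensus)

def find_half_sites(seq, motif=HALF_SITE):
    """Return sorted (0-based start, strand) pairs for the motif on either strand."""
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)
    return sorted(hits)

# Invented sequences, for illustration only.
toy_candidate = "AGTTCAGCATGATCA"    # differs from the consensus at 4 defined positions
toy_region = "AAATGTTCTCCCAGAACAGG"  # one half-site on each strand
print(round(percent_identity(toy_candidate), 1))
print(find_half_sites(toy_region))
```

With this counting rule, a 15-base element with four mismatches scores 11 of 15 positions, or about 73%. Sequence similarity of this kind identifies candidates only, which is why the functional transfection test described above was needed.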
The Atypical TTTAAA Sequence at −33 bp Is Required for Maximal EGF Promoter Activity-As previously noted, the EGF promoter contains a TTTAAA sequence located from −33 to −28 bp upstream of the +1/+2 start site. The finding that the DX-luc construct, which deletes 5′-sequences to −30 bp, had negligible activity in vitro and in vivo is consistent with a possible role for the TTTAAA sequence as an atypical TATA box. To specifically assess the role of the TTTAAA motif, we both examined its ability to bind TBP in vitro and tested its requirement for efficient EGF promoter activity in vivo. To test binding via electrophoretic mobility shift assay, we used a combination of TBP and TFIIA (provided by Robert Roeder), since the binding of TBP to TATA box sequences is facilitated in the presence of TFIIA (39). Fig. 7 shows that a 14-bp probe corresponding to the TATA sequence of the adenovirus major late promoter (AdMLP) displayed the expected mobility shift in the presence of TBP·TFIIA, and that the formation of the product was competitively inhibited in the presence of a 25-fold excess of unlabeled AdMLP double-stranded oligonucleotide. A 20-bp double-stranded probe encompassing the EGF promoter TTTAAA sequence displayed a similar mobility shift in the presence of TBP·TFIIA, and this binding was specifically inhibited in the presence of a 25-fold molar excess of unlabeled TTTAAA oligonucleotides. Importantly, the mobility shift was also blocked in the presence of a 25-fold molar excess of the unlabeled AdMLP oligonucleotide, and conversely, the mobility shift of the AdMLP probe was inhibited in the presence of the EGF TTTAAA sequence. In contrast, an otherwise identical EGF oligonucleotide in which the TTTAAA sequence was altered to TTCGAA only weakly inhibited the binding of TBP·TFIIA to either the EGF TTTAAA or the AdMLP TATA probes. Finally, the addition of anti-TBP antibody (Santa Cruz Biotechnology) to reactions containing the EGF TTTAAA probe and TBP·TFIIA caused a partial supershift of the bound probe (Fig. 7). These data establish the ability of the TTTAAA sequence to bind the TBP·TFIIA complex in vitro.
To test the requirement for the TTTAAA sequence in vivo, we compared the activity of a wild-type SX-luc construct with that of a mutant SX-luc in which the TTTAAA sequence had been converted to TTCGAA by site-directed mutagenesis. Compared with background (pGL2-basic) levels, SX-luc in this experiment yielded 3.7- and 10-fold increases in luciferase expression in CHO and NRK-52E cells, respectively (Fig. 8). In contrast, expression of luciferase from the SX-luc mutant construct was comparable with that of the promoterless vector control in both cell lines. These data confirm that the TTTAAA sequence is required for efficient expression of the EGF promoter in vivo.

DISCUSSION
Our results show that transcription of the EGF gene principally initiates at adjacent adenosine residues located approximately 30 bp downstream from the TTTAAA sequence. Our data further suggest that transcription initiates less frequently at a single adenosine located downstream of the primary site, at +28. However, although 5′-ends corresponding to this +28 start site were detected in both SG and kidney RNAs by complementary primer extension and S1 nuclease analyses, we could not confirm that this site was used in in vitro transcription assays or in transfected cells. Conceivably, transcription from this site could be dependent on cell-specific or labile transcription factors. Nevertheless, our data provide important confirmation of EGF promoter activity both in vitro and in vivo. In this regard, it is interesting to note that the EGF promoter-luciferase construct was considerably more active in the two kidney-derived cell lines, NRK-52E and COS, than in CHO cells. Since the kidney is one of two sites of marked EGF expression, it is tempting to ascribe these cell type differences in activity to tissue-specific regulation of the EGF promoter. However, the relationship of the two cell lines to EGF-expressing cells in the distal convoluted tubules in the kidney is uncertain, and the underlying basis of this phenomenon requires further investigation.

FIG. 5. Nucleotide sequence of the EGF promoter region. The complete nucleotide sequence of the HindIII-XhoI fragment was determined for both strands by the University of North Carolina Automated DNA Sequencing Facility. Restriction enzyme sites used to make EGF-luciferase constructs are identified. Transcription start sites at +1/+2 and +28 are boxed, and the atypical TATA motif, TTTAAA, is bracketed below. Two 6-bp steroid hormone response element half-sites and a 15-base pair element that is 73% homologous to the consensus HRE are underlined.
Our data indicate that the EGF promoter and 5′-flanking region are not directly responsive to either androgens or glucocorticoids. Work from a number of laboratories suggests that EGF expression can be influenced by these hormones, particularly androgens, in vivo. For example, the SGs of sexually mature male mice were found to contain markedly higher levels of EGF mRNA than those of counterpart females, and the treatment of adult female mice with testosterone resulted in an average 16-fold increase in SG levels of EGF mRNA over a period of several days (22). Similar observations have been made at the protein level. Thus, SGs of male mice contained up to 400 pmol of EGF/mg of protein, whereas corresponding concentrations in female mice were only 5-20 pmol of EGF/mg of protein (40). Moreover, EGF protein levels were increased 4- to 40-fold in SGs of normal female mice 6 days after administration of testosterone (41), and the corresponding concentrations in androgen-insensitive tfm/y male mice were as low as those of untreated females (42). These various findings have been supported by surgical manipulations; castration at 8 weeks of age resulted in a marked reduction of SG EGF mRNA and protein levels, whereas ovariectomy produced a 100-fold increase in SG EGF mRNA levels (40). Finally, administration of testosterone to hypophysectomized mice induced SG EGF levels nearly 40-fold, with co-administration of testosterone and thyroid hormone producing a synergistic response (43).
In light of the aforementioned observations, we determined the nucleotide sequence of nearly 2500 bp of DNA flanking the transcription start sites to identify potential androgen-responsive elements. The consensus androgen response element GGTACANNNTGTTCT (38) is similar or identical to the glucocorticoid, progesterone, and mineralocorticoid response elements, and hence the universal term HRE is used. Although the EGF 5′-flanking region does not contain consensus HREs, a 15-bp sequence displaying 73% homology is located approximately 230 bp downstream from the +1/+2 site, and several TGTTCT motifs corresponding to the 3′ 6-bp portion of the HRE are present upstream of the start site. Studies of the probasin gene promoter indicate that functional androgen-responsive elements can diverge considerably from the 15-bp consensus androgen-responsive element, and that when reiterated, the 3′-TGTTCT sequence can function in the absence of significant homology to the 5′-portion of the HRE (44). Other studies suggest that sequences flanking the putative androgen-responsive element can exert significant influence on hormone responsiveness (45,46). Hence, it was important to directly test the androgen responsiveness of the EGF promoter. In fact, our data indicate that genomic fragments containing the EGF promoter and up to 7 kb of 5′-flanking sequence are not androgen sensitive. It is still possible that sequences located either far upstream or downstream of the proximal promoter confer androgen responsiveness on the EGF gene. For example, sequences responsible for androgen regulation of the mouse β-glucuronidase gene have been mapped to intron 9 (47). Alternatively, since androgen-induced increases in EGF expression have only been demonstrated in vivo, it is possible that they are not direct responses, particularly since they are typically measured after several days of hormone treatment and are accompanied by changes in the size and morphology of the EGF-expressing SG cells (22,40,43). The finding that kidney EGF mRNA levels are not altered following castration, ovariectomy, or administration of androgen (40) is consistent with this explanation.
FIG. 6. The EGF promoter is not androgen responsive. EGF-luc (EGF-HX) or MMTV-luc constructs were transfected alone or in conjunction with an androgen receptor expression vector (AR) into CHO, NRK-52E, and COS cells. Following transfection, cells were maintained in serum-free media and after 24 h were treated with 0.1 nM dihydrotestosterone for an additional 20-24 h, after which they were harvested for luciferase determination. The data are shown as mean (minus background) ± S.E. (bars). For NRK-52E cells, n = 3; CHO cells, n = 5; and COS cells, n = 6.
FIG. 7. The EGF promoter TTTAAA sequence binds TBP. A radiolabeled, double-stranded oligonucleotide probe (−33) corresponding to EGF promoter sequences from −38 to −21 and encompassing the TTTAAA sequence was tested for binding to recombinant TBP·TFIIA as described under "Experimental Procedures." Shown for comparison are analyses with a 14-bp probe containing the AdMLP TATAA sequence. For competitions, a 25-fold excess of unlabeled probe was added to TBP·TFIIA on ice for 15 min prior to addition of the labeled probe. A double-stranded oligonucleotide (mut), which is identical to −33, except that the TTTAAA sequence was altered to TTCGAA, was also used in competitions. Assays were repeated three times. Exposure times were 15 h for lanes 1-10 and 40 h for lanes 11-13.
Importantly, our results support a role for the TTTAAA sequence as an atypical TATA box. The TTTAAA sequence is positioned a conventional distance upstream from the +1/+2 initiation site, and the degree to which mutation of the sequence impairs EGF promoter activity strongly argues that it normally influences transcription from this predominant start site. Atypical, but apparently functional, TATA motifs have been implicated in the transcription of other genes. Interestingly, a TTTAAA sequence is found at comparable distances upstream of initiation sites in a number of promoters, including those for the P-450c27/25 (48), herpes simplex virus UL38 (49), bovine and porcine outer dense fibers (50), prostatic arginine esterase (51), and rat somatostatin (52) genes.
Efficient transcription in vitro from the P-450c27/25 promoter requires the intact TTTAAA motif, suggesting that it functions as a cryptic TATA box (48). The presence of a functional TATA-like element in the EGF promoter may, at least in part, account for its dramatic expression in mouse SG and kidney. This tissue-specific pattern contrasts with that of the related transforming growth factor α, which is more broadly expressed and at levels significantly lower than those of EGF mRNA in kidney and SG. Interestingly, the transforming growth factor α promoter differs in having a much higher G+C content (>80% versus 45% for the EGF promoter) and multiple binding sites for the transcription factor Sp1 and in not possessing a recognizable TATA-like motif. These are all characteristics of so-called housekeeping gene promoters (reviewed in Ref. 14). A T₅C₅ sequence in the transforming growth factor α promoter is reported to bind TBP in electrophoretic mobility shift assays (53), but the functional significance of this observation has not been established. Finally, it is interesting to note that the EGF promoter contains an additional TTTAAA (−311 to −306), as well as a consensus TATATA (+25 to +30). However, our data indicate that neither of these sequences is associated with detectable transcription start sites.
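As an illustration of how motif positions such as those quoted above can be tabulated, the following minimal Python sketch scans a sequence for a motif and reports each occurrence in promoter numbering, in which +1 marks the start site and there is no position 0. The function names, the toy sequence, and its spacing are assumptions made for illustration; they are not the EGF sequence of Fig. 5 and were not part of the analysis reported here.

```python
import re


def promoter_coord(index, tss_index):
    """Convert a 0-based string index to promoter numbering.

    The base at tss_index is +1; the base immediately upstream is -1
    (promoter numbering has no position 0).
    """
    offset = index - tss_index
    return offset + 1 if offset >= 0 else offset


def find_motif(seq, motif, tss_index):
    """Return (start, end) promoter coordinates for every motif occurrence.

    A zero-width lookahead is used so that overlapping occurrences are
    also reported.
    """
    pattern = re.compile(f"(?=({re.escape(motif.upper())}))")
    hits = []
    for match in pattern.finditer(seq.upper()):
        start = match.start()
        end = start + len(motif) - 1
        hits.append((promoter_coord(start, tss_index),
                     promoter_coord(end, tss_index)))
    return hits


# Hypothetical toy sequence (NOT the EGF promoter): 10 bp of filler,
# a TTTAAA motif, 24 bp of spacer, then the +1/+2 adenosines.
toy_seq = "GC" * 5 + "TTTAAA" + "G" * 24 + "AA" + "CCCC"
toy_tss = 40  # 0-based index of the first A after the spacer

for start, end in find_motif(toy_seq, "TTTAAA", toy_tss):
    print(f"TTTAAA at {start:+d} to {end:+d}")
```

Given the sequence of Fig. 5 and the index of the +1 adenosine, the same function would be expected to list every TTTAAA or TATATA occurrence; whether a given occurrence is functional can only be established experimentally, as in the mutagenesis and gel shift assays described here.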
Given that EGF was discovered more than 30 years ago, it is surprising that the molecular regulation of EGF transcription has not been characterized. The studies described here are a first step toward understanding tissue-specific and hormonal regulation of EGF production, as well as the mechanisms by which EGF expression is deregulated in neoplastic progression.