A conserved downstream element defines a new class of RNA polymerase II promoters.

Although many TATA-less promoters transcribed by RNA polymerase II initiate transcription at multiple sites, the regulation of multiple start site utilization is not understood. Beginning with the prediction that multiple start site promoters may share regulatory features and using the P-glycoprotein promoter (which can utilize either a single or multiple transcription start site(s)) as a model, several promoters with analogous transcription windows were grouped and searched for the presence of a common DNA element. A downstream protein-binding sequence, MED-1 (Multiple start site Element Downstream), was found in the majority of promoters analyzed. Mutation of this element within the P-glycoprotein promoter reduced transcription by selectively decreasing utilization of downstream start sites. We propose that a new class of RNA polymerase II promoters, those that can utilize a distinctive window of multiple start sites, is defined by the presence of a downstream MED-1 element.

Promoters transcribed by RNA polymerase II are divided into two classes: those that contain a canonical TATA box and those that do not. TATA-containing promoters usually direct transcription from a single initiation point, the location of which is determined by the position of TATA (1). In promoters that lack a TATA box, start site selection is not as well understood and has been investigated primarily in genes that use a single transcription start site, where the presence of an "initiator" element at or near the start site appears to be responsible for localizing the preinitiation complex (2). However, despite the fact that many TATA-less promoters utilize multiple start sites, there is little information as to how this multiple selection process occurs. One hypothesis is that the utilization of many start sites is a random or default response to the lack of a strong "selector" such as the TATA box (3). Another possibility is that each site is independently regulated by a separate initiator-type element; apropos of this, the role of initiators in multiple start site selection within the thymidylate synthase promoter was investigated, but multiple initiators were not identified (4).
We have previously shown that transcription from the TATA-less P-glycoprotein (pgp1) promoter can either begin at a single site (+1) or can include multiple downstream start sites within a ~70-nucleotide window (5), suggesting that +1 and the downstream sites are independently regulated. In our efforts to understand the activation of the additional downstream start sites within the pgp1 promoter, we have investigated the possibility that the utilization of multiple start sites in TATA-less promoters is neither random nor mediated by independent initiator elements but rather that TATA-less promoters with a similar "window" of start sites may share a common element that regulates their selection and/or activation. In this report we show that: 1) as opposed to being "random," the size and arrangement of the multiple start site "window" is quite similar in many promoters; 2) multiple start sites can be regulated as a cassette, rather than individually; 3) a conserved sequence motif (MED-1) can be found in the majority of these promoters downstream of the initiation window; and 4) mutation of this motif within the pgp1 promoter decreases transcription from the downstream start site cassette. We therefore propose that the P-glycoprotein gene is a member of a subclass of TATA-less promoters, which can be classified according to a characteristic transcription "window" and the presence of a common downstream regulatory element.

MATERIALS AND METHODS
Computer Search for the MED-1 Element. The following criteria were imposed upon selection of promoters to be included in this study: 1) they had to contain multiple start sites with a distribution similar to that found in the pgp1 promoter (i.e. unclustered and spanning less than ~100 bp); and 2) the authenticity of the start sites had to be determined by both nuclease protection and primer extension assays. Approximately 410 bp of each promoter sequence, extending to ~70 nucleotides downstream of the initiation window, were aligned using the CLUSTAL program (PC/GENE, Intelligenetics) with the following parameters: k-tuple value, 3; gap penalty, 10; window size, 40; filtering level, 5; open gap cost, 10; unit gap cost, 10, with transitions weighted twice as likely as transversions (Fig. 1A).
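The two selection criteria above can be expressed as a simple filter. The following is a minimal sketch, not the procedure actually used in the study: the record layout, the assay labels, and the example candidates are all hypothetical, and the "unclustered" requirement is not modeled because the text does not define it quantitatively.

```python
from dataclasses import dataclass

# Approximate upper bound on the initiation window, taken from the text.
MAX_WINDOW_BP = 100
# Criterion 2 requires both mapping assays. These string labels are hypothetical.
REQUIRED_ASSAYS = frozenset({"nuclease_protection", "primer_extension"})


@dataclass
class PromoterRecord:
    name: str
    start_sites: list   # start site positions in bp; most upstream site is +1
    assays: frozenset   # methods used to map the start sites


def window_size(record):
    """Span in bp from the most upstream to the most downstream start site."""
    return max(record.start_sites) - min(record.start_sites) + 1


def meets_criteria(record):
    """Criterion 1: multiple sites within a window under ~100 bp.
    Criterion 2: start sites confirmed by both required assays."""
    has_multiple_sites = len(set(record.start_sites)) > 1
    return (
        has_multiple_sites
        and window_size(record) < MAX_WINDOW_BP
        and REQUIRED_ASSAYS <= record.assays
    )


# Hypothetical candidates for illustration only; these are not data from the study.
both = frozenset({"nuclease_protection", "primer_extension"})
candidates = [
    PromoterRecord("example_A", [1, 18, 35, 62], both),
    PromoterRecord("example_B", [1, 40], frozenset({"primer_extension"})),
    PromoterRecord("example_C", [1], both),
    PromoterRecord("example_D", [1, 150], both),
]
selected = [c.name for c in candidates if meets_criteria(c)]
print(selected)
```

Only the first hypothetical record passes: the second lacks one assay, the third has a single start site, and the fourth spans more than 100 bp.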
pgp1 Reporter Constructs and Transfections. A 433-bp pgp1 promoter fragment between −256 and +177 was subcloned into Bluescript KS II(+) (Stratagene) at the BamHI and SalI sites (5). The promoter insert was released with SacI and XhoI, subcloned into the luciferase vector pGL2-Basic (Promega), and designated pgpLuc-B. The mutant MED-1 construct (pgpLuc-Bm) was created by site-directed mutagenesis, using the mutant MED-1 oligonucleotide described for gel shift analyses.
To construct pgp1/globin reporters, the unique BamHI site of the pgpLUC-B plasmid was first converted into an ApaI site by site-directed mutagenesis; the resulting plasmid was designated pgpLUC-Ba. pgp1GL was created by cloning a β-globin insert (isolated from PTAG-1 (8) by ApaI/HindIII digestion) into pgpLUC-Ba, replacing the luciferase gene and the SV40 3′-untranslated region. pgpGLm was created by the same approach, using pgpLuc-Bm as vector.
6 × 10⁵ cells were co-transfected by the calcium phosphate method with 12 μg of reporter plasmid and 0.25 μg of the neomycin resistance plasmid, p308 (ATCC). After 36 h, cells were split into dishes containing medium supplemented with 400 μg/ml G-418 (Life Technologies, Inc.). A typical experiment yielded 300-400 neomycin-resistant clones. Individual clones were isolated after 15-18 days. The presence and integrity of the luciferase constructs were confirmed by Southern blot analyses (data not shown). Luciferase assays were performed using the Promega luciferase reporter assay system, as recommended by the vendor. Protein concentrations were determined in microtiter plates with the bicinchoninic acid assay kit (Pierce) (9). For analysis of pgp1/globin constructs, resulting clones were pooled prior to RNA isolation.
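Because both luciferase activity and protein concentration were measured for each clone, reporter expression can be compared between constructs on a per-protein basis. The sketch below assumes that normalization, which the text implies but does not state. The function names and all numeric values are hypothetical placeholders, not measurements from this study.

```python
from statistics import mean


def normalized_activity(light_units, protein_ug):
    """Luciferase light units per microgram of protein for one clone."""
    if protein_ug <= 0:
        raise ValueError("protein amount must be positive")
    return light_units / protein_ug


def percent_of_wild_type(mutant_values, wild_type_values):
    """Mean normalized mutant activity as a percentage of the wild-type mean."""
    return 100.0 * mean(mutant_values) / mean(wild_type_values)


# Placeholder (light units, protein in micrograms) pairs, one pair per clone.
# These numbers are invented for illustration and are not experimental data.
wild_type_clones = [(900.0, 3.0), (1500.0, 5.0)]
mutant_clones = [(300.0, 2.0), (600.0, 4.0)]

wild_type = [normalized_activity(lu, ug) for lu, ug in wild_type_clones]
mutant = [normalized_activity(lu, ug) for lu, ug in mutant_clones]
relative_expression = percent_of_wild_type(mutant, wild_type)
print(relative_expression)
```

Averaging over independent clones, as done here, mirrors the use of multiple stable transfectants per construct; any statistical test applied to such values is outside the scope of this sketch.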
Riboprobe Plasmids and Nuclease Protection Assays. pgp1 promoter/globin inserts were released from pgpGL and pgpGLm by digestion with SpeI and BamHI and cloned into pGEM7Zf(+) (Promega) between the XbaI and BamHI sites. Constructs were digested with BspMI and Eco72I to remove intron I and exon II of β-globin, blunt ended with T4 DNA polymerase, and recircularized. Resulting plasmids (R-GL and R-GLm) were used to generate riboprobes for nuclease protection assays (5). Protected fragments were quantitated using a Fuji PhosphoImager.

Fig. 1. A, promoters were chosen and aligned as described under "Materials and Methods." Only the 3′-half of the alignment is shown. Identity among all five promoters is indicated by closed circles. The MED-1 element is outlined. Arrows above the sequence indicate pgp1 transcription start sites. The most upstream start site in each promoter is numbered +1. PGP1, P-glycoprotein (5); HMGCOA, HMG-CoA reductase (7); TS, thymidylate synthase (10); TK, thymidine kinase (11); HPRT, hypoxanthine phosphoribosyltransferase (12). B, transcription initiation window size and position relative to MED-1. Arrows indicate transcription start sites, as described in references noted (5, 7, 10-21). The size of the arrow does not necessarily correspond to the relative strength of the particular transcription start site. Asterisk indicates values for Gαo were approximated from the data available. Due to inherent artifacts in assays used to identify mRNA 5′-ends, genes in which start sites were not confirmed by at least two independent methods were excluded from these comparisons. Only promoters with complete identity to the MED-1 consensus are shown. ACBP, bovine acyl-CoA-binding protein (13); WT1, human Wilms' tumor (14); MHC-A, human nonmuscle myosin heavy chain (15); N-RAS, mouse N-ras (16); CATL, rat catalase (17); Gαo, mouse G protein (18); AK2, bovine adenylate kinase isozyme (19); GHRH, rat growth hormone-releasing hormone (20); AAT, rat aspartate aminotransferase (21).

Transcriptional Regulation of Multiple Start Site Promoters 30250

RESULTS
A search of the literature identified 14 promoters (7, 10-22) similar to the pgp1 promoter with respect to the distribution of multiple start sites within the transcription initiation window (see "Materials and Methods" for search strategy). We began with the assumption that a DNA element involved in multiple start site selection would be common to all these promoters and, analogous to a TATA box, would lie within a conserved distance from the window. In order to test this hypothesis, several of the promoters were aligned and analyzed for such an element (Fig. 1A). A hexanucleotide sequence, GCTCC(C/G), which we have designated MED-1 (Multiple start site Element Downstream), was identified as the only element common to these promoters (Fig. 1A, outlined). The relationship of this element to the initiation window is shown in Fig. 1B. MED-1 was present in 14 out of 15 promoters (it was not found in the human Ha-ras promoter (22)) and lies 20-45 bp downstream of the 3′-end and a maximum of ~110 bp downstream of the 5′-end of the transcription initiation window.
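The MED-1 consensus GCTCC(C/G) and its reported position relative to the initiation window can be checked in any sequence with a regular expression. The following is a minimal sketch rather than the alignment-based procedure used here: it scans only the strand supplied, measures distance from the last start site to the first base of the motif (an assumed convention), and uses an invented toy sequence.

```python
import re

# MED-1 consensus from the text: GCTCC followed by C or G.
# The lookahead reports every match position, including overlapping ones.
MED1_PATTERN = re.compile(r"(?=GCTCC[CG])")


def find_med1(sequence):
    """Return the 0-based offsets of all MED-1 consensus matches on the given strand."""
    return [match.start() for match in MED1_PATTERN.finditer(sequence.upper())]


def is_in_reported_range(motif_start, window_3prime_end):
    """True if the motif begins 20-45 bp downstream of the 3'-end of the window.

    Measuring to the first base of the motif is an assumption of this sketch.
    """
    distance = motif_start - window_3prime_end
    return 20 <= distance <= 45


# Toy sequence invented for illustration; it is not a real promoter.
toy_sequence = "ATATATATAT" + "GCTCCC" + "TTTT" + "GCTCCG" + "AA" + "GCTCCA"
hits = find_med1(toy_sequence)
print(hits)

# Treat offset 0 as the most downstream start site of a hypothetical window.
print([is_in_reported_range(hit, 0) for hit in hits])
```

The final hexamer in the toy sequence, GCTCCA, is not reported because its sixth base is neither C nor G.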
The striking conservation of MED-1 in multiple start site promoters suggested a role for this element in multiple start site selection and/or activation. In order to test the possibility that MED-1 was a site for protein binding, gel shift assays were performed using an oligonucleotide containing the pgp1 MED-1 sequence (Fig. 2). Two specific DNA-protein complexes were identified (Fig. 2A, lane 1). While both complexes were specifically competed with an excess of wild-type pgp1 oligonucleotide (lanes 2-4), a mutation that converted the MED-1 site from GCTCCC to CCAAGG significantly impaired competition for binding of both complexes (lanes 5-7); moreover, when used as a probe, the mutant oligonucleotide was greatly reduced in its ability to form both complexes (data not shown). We do not yet know whether the two specific complexes contain different proteins or multimers of the same protein.
In order to determine whether the sixth base of the MED-1 consensus could be either a G or C as suggested by the computer alignment (Fig. 1A) and to substantiate the importance of this binding site in other promoters, an oligonucleotide representing a comparable region of the HMG-CoA reductase gene (7) was used as competitor and found to compete for both complexes (Fig. 2B, lanes 2-3). These results are consistent with the notion that the same protein factor(s) are binding to this promoter as well as to the others identified in Fig. 1B.
The functional role of MED-1 in pgp1 transcription was assayed in DC-3F/ADII cells, in which the endogenous pgp1 is transcribed from multiple sites (5). In the first set of experiments, cells were stably transfected with one of three constructs: a wild-type pgp1 promoter/luciferase construct, a MED-1 mutant/luciferase construct containing the mutation previously shown to reduce DNA-protein complex formation, or luciferase vector alone. A minimum of 11 independent transfectants was isolated and analyzed for each construct. The results presented in Fig. 3 indicate that mutation of the MED-1 element reduced expression from the pgp1 promoter to ~25% that of wild type (p = 0.0001). In light of the remarkable conservation of MED-1 in multiple start site promoters, we predicted that this reduction in expression might be due to a specific effect on the downstream start sites. In order to investigate this possibility, similar experiments were performed using pgp1/globin reporter constructs (luciferase RNA was undetectable in the previous experiments). Following stable transfection of these reporter constructs into DC-3F/ADII cells, two significant observations were made. First, the endogenous multiple start site pattern was recapitulated (Fig. 4B, lane 1), confirming our initial observation (5) that the selection of multiple start sites is not simply a result of a mutation in the endogenous promoter. Second, mutation of the MED-1 element resulted in a ~3-fold reduction in utilization of the downstream start sites relative to +1 (Fig. 4, B and C), indicating that the downstream cassette can be regulated independently.

DISCUSSION
Previous efforts to understand the regulation of promoters containing multiple start sites have focused on individual genes, and the results have been largely inconclusive (4).
We therefore began with the assumption that multiple start site promoters share common regulatory features and that any sequence that is involved in start site selection would be at a conserved position relative to the transcription initiation window. It is important to emphasize that the alignment shown in Fig. 1 preceded the functional evaluation of the MED-1 element, thereby reducing the bias that can be associated with database searches for a DNA element following its identification in a single gene. Therefore, our analysis of the role of MED-1 in pgp1 transcription has strong predictive value relative to its function in the other genes in which it has been identified. Apropos of this, it is interesting to note that in earlier studies deletion of downstream sequences in both MHC-A (15) and N-ras (16) promoters significantly reduced expression from these genes; we now know that these deletions included the MED-1 element. Whether MED-1 and its cognate binding proteins act as selectors or activators of multiple start sites is not yet known. However, it is clear that the mere presence of MED-1 is not sufficient for activation of multiple start sites since 1) we have already shown that the same pgp1 promoter that supports multiple start sites in some cells uses only the +1 site in others (5) and 2) the protein binding activity shown in Fig. 2 is also present in cells that utilize only +1 (data not shown). Therefore, we suggest that MED-1 is necessary but not sufficient for multiple start site utilization and that other, likely trans-acting, factor(s) impose a higher order of regulation on the recognition of this element.
In conclusion, we propose that a new class of RNA polymerase II promoters can be defined by 1) the size of the transcription initiation window and the arrangements of start sites therein and 2) the presence of a downstream MED-1 element.
Since the criteria imposed upon selection of the promoters included in Fig. 1B were quite stringent (requiring verification of start site position by both nuclease protection and primer extension analyses, complete identity to the MED-1 element defined in Fig. 1A, as well as the spatial restrictions suggested by the initial alignment), we predict that as more is known about the spatial and sequence requirements for the MED-1 element, additional promoters will be included in this class.