A functional initiator element in the human beta-globin promoter.

Core promoters are defined by the presence of either a TATA box at approximately 30 base pairs upstream of the transcriptional start site (+1) and/or an initiator element centered around the +1 site. The prevalence, function, and significance of the various combinations of core promoter elements are as yet unclear. We describe here the identification and characterization of an initiator element in the TATA-containing human beta-globin promoter. Mutagenesis of the beta-globin initiator element at positions +2/+3 and +4/+5 abrogates transcription in a heterologous construct. Interestingly, we have found a beta-globin initiator binding activity in nuclear extracts whose presence or absence correlates with function of the beta-globin initiator. Accordingly, this binding activity may be part of the machinery required for beta-globin initiator-dependent transcription. Our analysis further describes a previously uncharacterized beta-thalassemia mutation at the +1 site as a mutation that decreases beta-globin initiator activity. Finally, consistent with other initiator elements, the beta-globin initiator requires a TFIID-containing fraction for in vitro activity. Thus, the human beta-globin promoter contains an initiator element whose function, as revealed by a beta-thalassemia mutation, is of physiological relevance.

Study of RNA polymerase II transcription necessitates characterization of both the cis and trans elements involved in promoter function. Viral and minimal promoters, as well as those promoters directing highly regulated tissue-specific expression, contain a variety of upstream elements and exhibit heterogeneity in their core promoter elements. In this context, it is important to examine the relative roles of both the upstream elements and the core promoter, and their mutual interactions, in order to understand the regulation of complex tissue-specific promoters.
Accurate transcriptional initiation has been classically thought to require a TATA box. However, the finding of numerous promoters that do not contain a TATA box and yet accurately initiate transcription led to the discovery of elements centered around the start site as components of the core promoter (1,2). These initiator (Inr) elements direct accurate transcription from artificial constructs containing only upstream Sp1 sites (1,2). Mutation of Inr elements in several promoters decreased or abolished transcription (see Ref. 3 for review), and in heterologous constructs Inr elements stimulated transcription in the presence of a TATA box (4-6). Experiments performed with promoters containing both a TATA box and Inr suggest that the TATA box is the predominant selector of the site of initiation and that the Inr contributes to the magnitude of the initiation (2).
Several models have been put forward to describe how specific proteins initiate transcription through Inr elements. One suggests that factors binding to the Inr, and necessary for its function, are present in the TFIID complex (4,5,7-11). A second model suggests that Inr-dependent transcription is mediated by initiator-binding proteins such as YY1 and TFII-I (6,12,13), as both can substitute for members of the basal machinery in reconstituted systems in vitro (TBP and TFIIA, respectively) (14,15). In a third model, recognition of the Inr by RNA polymerase II serves as the nucleation event, analogous to the role of TBP in TATA-containing promoters (3,16). Finally, an alternate model suggests that TBP provides a nucleation function through its ability to recognize the −30 regions of TATA-less promoters (17).
These additional complexities have prompted us to reevaluate the role of a core promoter in the expression of the human β-globin gene, a paradigm for developmentally regulated genes. Early studies defined several sequences contributing to the activity of the β-globin promoter. Internal deletion/substitution and point mutation analysis assigned the TATA box, a CCAAT box at approximately −75, and a CACC box at approximately −90 as the major determinants of transcriptional regulation (18,19). Mutations around the +1 site decreased transcription by approximately 50% (18,19). In a more recent study a C → T mutation at −1 was shown to reduce β-globin promoter expression to about 80% of wild-type activity in MEL cells (20).
Human β-thalassemia disease is a disorder characterized by reduced or absent β-globin expression. The resulting globin chain imbalance due to unimpeded α-globin expression leads to precipitation of globin polypeptide chains in developing erythroid cells, and the ensuing anemia. Study of these naturally occurring β-thalassemia mutations has proven useful in revealing the in vivo relevance of specific cis elements in the β-globin promoter (21). Wong et al. (22) reported a patient with mild asymptomatic β-thalassemia whose DNA was homozygous for an A → C transversion at +1 of the β-globin promoter. In this report we describe the characterization of the +1 region of the human β-globin promoter as a functional initiator element and demonstrate that the +1 β-thalassemia mutation is a mutation in the β-globin Inr element (βInr). Furthermore, we show that in vitro transcription from the βInr is dependent on partially purified TFIID and that a βInr DNA binding activity exists whose binding correlates with βInr functional activity.

MATERIALS AND METHODS
TFIID Purification-MEL cells (7 liters; approximately 10^10 cells) were harvested, and a nuclear extract was prepared as described by Dignam et al. (23,24) and modified by Briggs et al. (25). The pellet from the ammonium sulfate precipitation was suspended in H.1 (20 mM Hepes-KOH, pH 7.9, 1 mM EDTA, 1 mM DTT, 20% glycerol, 100 mM KCl). The extract was run over a P-11 column (Whatman), essentially as described by Roeder and colleagues (23,24,26). The 0.1, 0.3, 0.5, and 0.85 M fractions were assayed for TFIID activity by testing for rescue of transcription from heat-inactivated MEL crude nuclear extracts using an adenovirus major late promoter template (26).
In Vitro Transcriptions-MEL crude nuclear extracts and in vitro transcriptions for all templates were prepared as described (1,5). 0.3 μg of the Sp1 templates (1) and 0.1 μg of the βGH and MLP templates were used in 50-μl reactions. After incubation at 30°C for 1 h, the tran- Reactions were terminated by ethanol precipitation. Pellets were suspended in formamide loading buffer, run on 8% polyacrylamide/8 M urea gels, dried, and exposed at −70°C to Kodak XAR-5 film. Quantitation was done by PhosphorImager analysis (Molecular Dynamics). Heat inactivation of MEL crude nuclear extracts was done as described (26).
Gel Shift Assays-Gel shift binding reaction conditions were identical to the in vitro transcription conditions (10 mM Hepes-OH, pH 7.9, 50 mM KCl, 0.5 mM DTT, 10% glycerol, 6.25 mM MgCl2) with the addition of 1 μg of poly(dI-dC) (Pharmacia Biotech Inc.), 50,000 cpm of the appropriate 32P-labeled probe, and 1 μl of crude MEL nuclear extract (5-10 mg/ml). Reactions were incubated at room temperature for 20 min and run on a 4% polyacrylamide (30:1) gel in 0.25× TBE. Gels were dried and autoradiographed.

RESULTS
The Human β-Globin +1 Region Contains an Inr Element-We employed an in vitro transcription assay to determine whether the +1 region of the β-globin promoter can correctly initiate transcription from an artificial construct. In vitro transcriptions using MEL nuclear extracts and the parent template construct containing only Sp1 sites showed only minor, low level initiation (Fig. 1A, lane 1) (1). Sp1/TdT, containing the TdT Inr downstream of the Sp1 sites, initiated high levels of transcription (Fig. 1A, lane 2) (1). A similar construct, Sp1/β+1, containing the β-globin +1 region resulted in transcription from two regions (lane 3). Mapping of the initiation sites indicated that the site marked by the arrow in Fig. 1B correctly maps to the previously observed +1 site for the β-globin promoter (27). The second site (bracket in Fig. 1B) maps to the junction of the vector and insert. Consistent with previous in vitro transcriptions using the TdT Inr (1), an Sp1 construct containing the β+1 region in the reverse orientation (Sp1/β+1R) did not initiate transcription (lane 4). Finally, transcription from the Sp1/β+1 template is polymerase II-dependent, as shown by sensitivity to 2 μg/ml α-amanitin (Fig. 1A, lane 6); moreover, transcription is TFIID-dependent, as revealed by heat inactivation of a MEL nuclear extract (Fig. 1A, lane 7) (26). As a control, we showed that transcription of an MLP template is sensitive to both α-amanitin and a 47°C heat treatment (Fig. 1A, lanes 8-10). These data show that the +1 region of the β-globin promoter is able to function as an initiator element in vitro.
Fig. 1 (legend, beginning missing): ... the β-globin +1 region (from −8 to +13) (Sp1/β+1; lanes 3 and 5), or the β-globin +1 region in the reverse orientation (Sp1/β+1R; lane 4). Lane 6 is a transcription reaction treated with 2 μg/ml α-amanitin (+α-aman). Heat-inactivated nuclear extract (HINE, 47°C for 15 min) was used in the transcription reaction in lane 7 (26). Lanes 8-10 are identical to lanes 5-7 except that the adenovirus MLP was used as template. Arrows indicate the primer extension product representing the correctly initiated transcript. B, the primer extension product resulting from an in vitro transcription reaction using the Sp1/β+1 template was electrophoresed next to the sequence of the Sp1/β+1 template. Both primer extension and sequencing reactions used the same primer.

Point Mutations in the β-Globin Inr Element Abolish Transcription in Vitro-To further delineate the sequences necessary for the function of the β-globin Inr element (βInr), we introduced double point mutations in the β-globin +1 region and assayed these mutants in the Sp1 construct using in vitro transcriptions. With the wild-type βInr, transcription initiated predominantly at the A at +1; a minor transcript initiated at the C at +2 (Fig. 2, lane 1; Fig. 1B). The −1,−2 double mutant showed reduced transcription from the major initiation site, and additional downstream initiation sites (Fig. 2, lane 2). Conversion of the CA at 2/3 and the TT at 4/5 to GG abolished transcription from the βInr element (Fig. 2, lanes 3 and 4). Mutation of positions 7/8 did not affect initiation from the major site (Fig. 2, lane 5), although one downstream initiation site was observed, similar to that seen with the −1,−2 double mutant. The 9/10 mutant displayed correct initiation (Fig. 2, lane 6). These data indicate that mutations in the +1 region specifically abolish βInr-dependent transcription and define the boundaries of the βInr element as approximately nucleotides −2 to +5 (TTACATT).
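The relationship between the mapped element (TTACATT, −2 to +5) and the loose Inr consensus cited in this paper (YYA at +1, then N, T/A, YY) can be illustrated with a short script. This is a minimal sketch, not part of the original analysis: the function names, the IUPAC-to-regex table, and the position-indexing helper are assumptions introduced here, and a consensus match is only a yes/no check that says nothing about the quantitative level of transcription.

```python
import re

# IUPAC codes needed for the loose Inr consensus (hypothetical helper table).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "Y": "[CT]",    # pyrimidine
    "W": "[AT]",    # A or T (written T/A in the text)
    "N": "[ACGT]",  # any base
}

# Loose consensus YYA(+1)N(T/A)YY, covering positions -2 to +5.
INR_CONSENSUS = "YYANWYY"
INR_PATTERN = re.compile("".join(IUPAC[code] for code in INR_CONSENSUS))


def matches_inr(seq):
    """Return True if a 7-nt sequence (-2 to +5) fits the loose consensus."""
    return INR_PATTERN.fullmatch(seq.upper()) is not None


def index_of(pos, first_pos=-2):
    """Map a promoter position to a string index (there is no position 0).

    Assumes first_pos is negative, as for a window starting at -2.
    """
    if pos == 0:
        raise ValueError("promoter numbering has no position 0")
    offset = pos - first_pos
    return offset if pos < 0 else offset - 1


def mutate(seq, start_pos, replacement, first_pos=-2):
    """Replace bases beginning at promoter position start_pos."""
    i = index_of(start_pos, first_pos)
    return seq[:i] + replacement + seq[i + len(replacement):]


wild_type = "TTACATT"                  # beta-globin Inr, -2 to +5 (from the text)
mut_2_3 = mutate(wild_type, 2, "GG")   # CA at +2/+3 converted to GG
mut_4_5 = mutate(wild_type, 4, "GG")   # TT at +4/+5 converted to GG
thal_1 = mutate(wild_type, 1, "C")     # +1 A -> C transversion

for name, seq in [("wild type", wild_type), ("2,3", mut_2_3),
                  ("4,5", mut_4_5), ("+1 A->C", thal_1)]:
    print(name, seq, matches_inr(seq))
```

Under these assumptions the wild-type element fits the consensus, whereas the 2,3 and 4,5 mutants and the +1 A → C substitution do not.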
Protein Binding to the βInr Element Correlates with βInr Functional Activity-Our finding that the human β-globin promoter contains a functional Inr element, and the apparent complexities of Inr-dependent transcription (see Introduction), led us to assay MEL cell nuclear extracts for a specific βInr binding activity. We performed gel shift assays using 32P-labeled double-stranded oligonucleotides containing the wild-type βInr sequence or double point mutants. As shown in Fig. 3 (indicated by the arrow), a gel shift complex formed with the wild-type βInr and the 7,8 and 9,10 double mutant probes. Binding was not detected with the 2,3 or 4,5 mutant probes, which were inactive as initiator sequences in vitro. Interestingly, the −1,−2 mutant template, which revealed reduced initiation from the +1 site, showed an intermediate level of complex formation. Thus, protein binding and transcription initiation were strictly correlated in this series of mutants. Competitions using the various cold wild-type or mutant binding sites against a labeled wild-type βInr probe yielded similar results (data not shown). The βInr binding activity does not appear to be restricted to MEL cells, as a comigrating activity was detected in HeLa nuclear extracts, nor does it comigrate with the YY1 initiator protein in gel shift assays (data not shown).
A Naturally Occurring β-Thalassemia Is a Mutation in the β-Globin Initiator Element-Wong et al. (22) described an Asian-Indian with a mild asymptomatic β-thalassemia in which the only base substitution detected was an A → C transversion at the +1 site of the β-globin promoter. We surmised that this observation might suggest the presence of an initiator element in the +1 region. Although the +1 region appeared to function as an Inr element in an artificial promoter, we next sought to examine this region in the context of the β-globin promoter. Constructs were assembled in which mutations were introduced into the β-globin promoter (Fig. 4A).
Representative in vitro transcriptions from these constructs are shown in Fig. 4A. Fig. 4B provides quantitation of the results of three to four experiments. The incorporation of the A → C transversion at the +1 site into the βGH (wild-type) template resulted in transcriptional activity ~75% of wild-type (Fig. 4A, lane 2). We also observed a slight shift in the pattern of initiation from three predominant sites to two major and two minor sites (compare lanes 1 and 2 in Fig. 4). Introduction of two known β-thalassemia mutations into the β-globin TATA box (−30βGH: CATA to CACA; −31βGH: CATA to CGTA) (28,29) resulted in transcription levels ~40% of wild-type (Fig. 4, A and B), consistent with transient expression data of TATA box β-thalassemia mutations and mutagenesis studies (19,20,30,31). The TATA box mutations, however, did not alter the pattern of initiation (Fig. 4A, lanes 2 and 3). The double mutant (−30βTHALGH), which contained both the −30 TATA box T → C transition and the +1 A → C transversion, further reduced transcription to ~20% of wild-type activity.
Fig. 3. Nuclear extract contains a βInr binding activity whose presence correlates with βInr functional activity. 32P-Labeled double-stranded wild-type (lane 6) and mutant (lanes 1-5) oligomers were used in binding assays and run on a 4% polyacrylamide gel. The arrow indicates an activity whose presence correlates with the functional analysis of the βInr mutations (Fig. 2). The mutant designations are identical to those in Fig. 2.

Replacement of the β-globin TATA box sequence (CATA) with the adenovirus MLP TATA box (TATA) provided a second promoter background into which we incorporated the +1 A → C transversion and the 2,3 mutant. In the context of a stronger TATA box (compare lanes 1 and 8 in Fig. 4, A and B) (32) the A → C +1 mutation reduced expression to ~40% of the parent βMLPGH template, compared to ~75% in the wild-type background (compare βMLPGH and βMLPTHALGH to βGH and βTHALGH in Fig. 4, A and B). Here again the initiation pattern was altered (compare lanes 10, 11, and 12). Interestingly, the 2,3 mutant, which abolished transcription in the Sp1 assay (Fig. 2, lane 3), reduced transcription in the βMLP2,3GH template to ~20% of βMLPGH levels (Fig. 4, panel A, lane 12 and panel B). This similar reduction in initiation by the 2,3 and the +1 β-thalassemia mutations in the βMLPGH construct suggests that both mutations affect Inr-dependent transcription and that results using the Sp1-based templates can be reproduced in the context of a natural promoter.
Transcription from the βInr Is Dependent on a Fraction Containing TFIID Activity-To begin to address the possible mechanisms of transcription from the βInr element, we asked whether transcription of the Sp1/β+1 template could be rescued by addition of a fraction containing TFIID to heat-inactivated nuclear extracts. This fraction was isolated from a MEL cell nuclear extract by passage over a phosphocellulose P-11 column and elution with 0.85 M KCl (see "Materials and Methods"). Addition of the pooled peak 0.85 M fraction to MEL nuclear extracts that were heat-inactivated by incubation at 47°C for 15 min restored transcriptional activity to both a control MLP template and the Sp1/β+1 template (Fig. 5) (26). Thus, consistent with previous data on other initiator elements, the βInr element requires an activity that copurifies with TFIID (4,5,10,11,33).

DISCUSSION
In this report we describe the identification and characterization of an initiator element in the TATA-containing human β-globin promoter. In so doing, we provide evidence that a protein fraction containing TFIID is required for Inr-dependent transcription and detect a DNA binding activity whose binding to the βInr correlates with its functional activity. Finally, we demonstrate that a base substitution at the β-globin +1 site, found in association with a human β-thalassemia, impairs the activity of the initiator element, thereby implicating the β-globin Inr as a functional element in vivo.
Consistent with studies of other initiator elements (1,2,6), the β-globin Inr functions in a heterologous context and in an orientation-dependent manner (Fig. 1). Comparison of transcription from the Sp1/TdT Inr and Sp1/βInr templates indicates that within these contexts the βInr is weaker than the TdT Inr, a finding consistent with observations suggesting that deviations from the loose Inr consensus sequence element (YYA(+1)N(T/A)YY) decrease Inr activity (7). Accordingly, mutation of the β-globin Inr with double point mutations replacing nucleotides −2 through +8 with purines reveals that positions −1,−2 (YY), 2,3 (NT/A), and 4,5 are necessary for βInr activity (Fig. 2).

Fig. 4 (legend, fragment): Error bars indicate standard deviations from the mean transcription level relative to βGH (all βGH templates) or βMLPGH (all templates with the MLP designation). Note that the error bars all show 10-15% deviation, which represents the intrinsic error in the assay.

Gel shift analysis of MEL cell nuclear extracts with the βInr sequence reveals a protein binding activity (Fig. 3) that strictly correlates with transcriptional activity in vitro (Fig. 2). Previous reports proposed TFII-I and YY1 transcription factors as candidates for mediating initiator activity (6,12,14,15). However, others have reported that the functional activities of YY1 mutant binding sites do not correlate precisely with YY1 binding activities over the same mutant sites (7). This discrepancy is complicated by the assays used to define the activities of YY1. The mutational analysis performed by Javahery et al. (7) employed Sp1 templates containing a YY1 site and used crude nuclear extracts for in vitro transcriptions, whereas reconstituted in vitro systems were used to define YY1 as a functional initiator protein (15). It is formally possible that these two functional assays do not assay similar activities. Further experiments are required to ascertain whether results obtained with systems using reconstituted factors are in accord with those using crude nuclear extracts. A second caveat is the possibility that the context of the Inr may influence the functional assays (7). Nonetheless, our analysis of a panel of βInr mutants provides an example of a correlation between Inr functional activity and Inr DNA binding activity.
A report by Wong et al. (22) described an Asian-Indian with a mild, asymptomatic β-thalassemia. Their analysis of the patient's β-globin promoter indicated that he was homozygous for an A → C transversion at +1. Our analysis (Fig. 4) indicated that this transversion is a mutation in the initiator element, as levels were reduced to 75% of wild-type levels (Fig. 4, panel A, lane 2 and panel B). Previously described TATA box β-thalassemia mutants express at approximately 25% of the wild-type levels in transient assays in HeLa cells (31). Consistent with these results, templates containing a −30 β-thalassemia mutation (T → C) (−30βGH) and a −31 β-thalassemia mutation (A → G) (−31βGH) were expressed in vitro at 30% of wild-type levels (Fig. 4, A and B). These results indicate that the in vitro system data can accurately reflect the in vivo environment. Incorporation of the +1 A → C β-thalassemia mutation into the −30βGH template resulted in a further reduction in transcription, again indicating that the +1 mutation affects the function of the initiator element. Finally, the conversion of the β-globin TATA box (CATA) to the adenovirus major late promoter TATA box (TATA) supplied a second template (βMLPGH) with which to study the effects of the +1 β-thalassemia mutation. Curiously, the incorporation of the +1 mutation into the βMLPGH background resulted in transcription levels that were 35% of wild-type (Fig. 4B, compare βMLPGH and βMLPTHALGH). This effect is 2-fold greater than that seen with the wild-type β-globin promoter background (Fig. 4B, compare βGH with βTHALGH). This difference may be due simply to the higher expression from the βMLPGH template or may indicate a cooperativity between the TATA box and Inr element (34). We observe that, despite the lack of transcription from the Sp1/2,3 double mutant (Fig. 2), the same mutation in the βMLP2,3GH template results in transcription 20% of wild-type (Fig. 4). 
From these data we conclude that other elements are able to compensate for the mutation within the βTHALGH, βMLPTHALGH, and βMLP2,3GH templates. Although other mechanisms are possible, these data are compatible with the requirement for TFIID, whose footprint on the adenovirus major late promoter extends from approximately −45 to +35 (26). If so, the mutations in the βInr and TATA box may result in destabilization, or partial disruption, of the interaction of TFIID with the core promoter, and thereby lead to a decrease in transcription. Mutation of the Inr in the Sp1 template results in a complete loss of transcription (Fig. 2), as the template contains only one site for TFIID, whereas the partial loss of transcription in the β-globin templates through either TATA box or Inr mutations (Fig. 4) is due to the destabilization/weakening of TFIID binding in the core promoter. Our results, therefore, are not mutually exclusive but may reflect the ability of TFIID to bind to both the TATA box and the Inr.
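The comparison of the +1 A → C mutation in the two promoter backgrounds can be restated as simple arithmetic. This is only an illustrative reading of the percentages quoted in this Discussion (75% and 35% residual activity); the symbols R_β and R_MLP are introduced here for convenience and do not appear in the original analysis.

```latex
\[
R_{\beta} = 0.75, \qquad R_{\mathrm{MLP}} = 0.35, \qquad
\frac{R_{\beta}}{R_{\mathrm{MLP}}} = \frac{0.75}{0.35} \approx 2.1
\]
\[
\frac{1 - R_{\mathrm{MLP}}}{1 - R_{\beta}} = \frac{0.65}{0.25} = 2.6
\]
```

On either measure, residual activity or fractional loss, the +1 mutation has a roughly 2-fold or greater effect in the βMLPGH background than in the wild-type β-globin background.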
Consistent with this interpretation, our data reveal that a phosphocellulose fraction containing TFIID is required for βInr activity (Fig. 5). These observations are consistent with other data and models proposing that the machinery for Inr-dependent transcription is contained in the TFIID complex (5,8-10,35). Validation of this hypothesis will require further experiments to demonstrate a correlation between TFIID binding and the functional activity of the βInr mutants (8). If this is the case, the Inr binding activity we detect may reside within the TFIID complex. Alternatively, the existence of an independent DNA binding activity might suggest that Inr-dependent transcription requires distinct mechanisms which differ from those used by other promoters (see Introduction). Finally, the rescue of Inr-dependent transcription from heat-inactivated nuclear extracts by the addition of partially purified TFIID is incomplete, despite the complete rescue of adenovirus major late promoter transcription (Fig. 5). These data suggest the existence of an additional heat-sensitive factor that is required for optimal βInr activity. These findings resemble those described for the TdT Inr (4,10,33).
Several groups have produced data that suggest there are distinct mechanisms of initiation mediated by Inr elements. For example, Zenzie-Gregory et al. (36) have reported that Inr-dependent transcription in vitro does not show the lag period found using TATA-dependent templates. Moreover, increasing amounts of nuclear extract reduced activity of a TATA-dependent template, but not that of an Inr-dependent template. In addition, transient assays suggest that overexpression of TBP inhibits expression from TATA-containing, but not TATA-less, promoters (37). With an upstream activator site, the combination of a TATA box and an Inr significantly increased promoter activity, suggesting that the Inr element may cooperate with a TATA box and also significantly enhance a promoter's response to activators (34). Additional evidence of a lack of a TBP rate-limiting step in Inr-containing promoters is provided by Ham et al. (38), who showed that overexpression of TBP can relieve a block on expression of minimal promoters containing a papillomavirus E2 activator binding site and a TATA box. However, the same promoter containing a TATA box and Inr is activated by E2 without the requirement for TBP overexpression. Consistent with this model are data suggesting that the interaction of c-Fos and TBP is required for TATA box-mediated, but not for Inr-dependent, transcription (39). Finally, Lescure et al. (40) provide data that part of the N terminus of TBP is required for the assembly of preinitiation complexes in TATA-containing, but not TATA-less, promoters. Collectively, these data argue for distinct mechanisms of transcriptional initiation mediated by Inr elements and suggest that Inrs contribute to the response of promoters to upstream activators. The study of Inr-dependent promoters in TATA-containing and TATA-less contexts will thus further our understanding of the mechanisms of transcriptional initiation.