Chromatin Structure and Transcriptional Control Elements of the Erythroid Krüppel-like Factor (EKLF) Gene

Erythroid Krüppel-like factor (EKLF) is a red cell-specific transcription factor whose activity is critical for the switch in expression from fetal to adult β-globin during erythroid ontogeny. We have examined its own regulation using a number of approaches. First, the EKLF transcription unit is in an open chromatin configuration in erythroid cells. Second, in vivo transfection assays demonstrate that the more distal of the two erythroid-specific DNase-hypersensitive sites behaves as an enhancer. Although this conserved element imparts high level transcription to a heterologous promoter in all lines examined, erythroid specificity is retained only when it is fused to the proximal EKLF promoter, which contains an important GATA site. Third, extensive mutagenesis of this enhancer element has delimited its in vivo activity to a core region of 49 base pairs. Finally, in vitro footprint and gel shift assays demonstrate that three distinct DNA binding activities in erythroid cell extracts individually interact with three short sequences within this core enhancer element. These analyses reveal that high level erythroid expression of EKLF relies on the interplay between conserved proximal and distal promoter elements that alter chromatin structure and likely provide a target for genetic control via extracellular induction pathways.

Erythroid Krüppel-like factor (EKLF) is a red cell-specific transcription factor that interacts via its three zinc fingers with the essential CACCC element sequence of the mammalian β-globin promoter (1). Similar sequences are present in numerous other erythroid promoters and also within the locus control region located upstream of the β-like globin cluster (2). However, EKLF discriminates between the various CACCC elements, forming a particularly strong interaction with the 9-bp sequence (5′-CCACACCCT-3′) at the adult β-globin promoter (2,3). Single point mutations within this CACCC sequence give rise to β-thalassemia in humans and drastically decrease its binding affinity for EKLF (4). The lower affinity of EKLF for the murine embryonic (y)/human fetal (γ) CACCC elements implied that EKLF might play a role in the y-/γ- to β-globin transcriptional switch (3). This prediction was verified by the genetic ablation of EKLF in mice, which leads to a profound β-thalassemia, incomplete definitive erythropoiesis, and embryonic death at the time of the switch (5-7). Not only is primitive erythropoiesis unaffected in EKLF-null mice, but embryonic/fetal globin expression is 5-fold higher and remains on longer than in wild type mice (8,9). As a result, EKLF is thought to play a major role in consolidation of the switch from embryonic/fetal to adult globin expression.
EKLF expression is exquisitely erythroid-specific, as monitored both in cell lines and tissues from adult mice (1). Most vividly, its developmental profile reveals that its onset of expression is strictly limited to the blood islands of the yolk sac at the early head fold stage (day E7.5) and switches by day E9.5 to expression within the hepatic primordia, which becomes the sole source of definitive erythropoiesis by day E12.5 (10). EKLF expression remains erythroid-specific in the adult, being localized to the red pulp of the spleen and the bone marrow (10).
The crucial role of EKLF in erythropoiesis makes it of interest to investigate its own regulation for a number of reasons. First, EKLF expression remains strictly tissue-specific throughout early development and in the adult. Second, its dual pattern of embryonic expression (i.e. within the yolk sac and fetal liver) makes it an attractive target for these experiments, as these two tissues, although erythropoietic, may regulate expression of their tissue-specific genes by distinct mechanisms. Such considerations arise from experiments that disrupt critically important erythroid genes, yet do not equally affect both primitive and definitive red cell populations (e.g. c-myb (11), AML1/CBFA2 (12,13), CBFB (14-16), and the erythropoietin receptor (17,18)). Finally, EKLF is expressed early in hematopoietic differentiation (19,20), and deciphering its regulation may illuminate the initial events in establishment of the erythroid pathway.
To understand the regulation of EKLF, we have initiated studies to localize cis-acting sequences important for EKLF expression. Previous analyses had focused on the proximal EKLF promoter and revealed that the GATA site at −60 and the CCAAT element at −45 are important for activity in erythroid cells (21). The importance of the GATA site was also shown by demonstrating that forced expression of GATA1 could activate the proximal EKLF promoter in non-erythroid cells. However, the present studies use chromatin accessibility as a more global approach to find distal cis-elements that play a role in EKLF transcriptional activation. By using a combination of in vivo transfection and in vitro protein/DNA binding approaches, the present studies have directed us to a distal 49-bp region within an erythroid-specific hypersensitivity domain that, in conjunction with the tissue-specific proximal EKLF promoter, plays a critical role in generating high levels of expression.

EXPERIMENTAL PROCEDURES
The largest EKLF construct (−3 kb) contains the promoter sequence from −2900 to +71 (BamHI to StuI murine genomic fragment (23)) in pBLCAT6 (24). The −950 construct was made by deleting the BamHI-SacI fragment from the −3-kb construct. The −691 construct was made by deleting the XbaI-ApaI fragment from the −950 construct. The −573 construct was made by digesting the −950 construct with PstI to remove the PstI fragment. The −155 construct was made by deleting the PstI-BstBI fragment from the −573 construct. The Δ−573/−155 construct was made by digesting the −950 construct to completion with BstBI, but only partially with PstI. The second largest band was then isolated (GeneClean) and self-ligated. The Δ−691/−573 construct was made by deleting the ApaI-EcoNI fragment from the −950 construct. The Δ−666/−573 construct was made by deleting the SacII-EcoNI fragment from the −950/LBPmut construct (which has the LBP-1 site mutated to a SacII site (see below)).
Site-directed mutants were generated using the Transformer Site-directed Mutagenesis Kit (CLONTECH) as recommended by the manufacturer. For GATA (−684), the mutation primer was 5′-CCCTACCTGTCGACGGCCTGAAAC-3′, which changed the GATA site (GATAGC) to a SalI site (GTCGAC). For LBP-1 (−666), the mutation primer was 5′-CGGCCTGAAACCCGCGGTGTGTCTGAT-3′, which changed the LBP-1 site (ATCTGG) to a SacII site (CCGCGG). The core mutant mutation primer was 5′-CAGAGCTATGGGGTACCTGGGCCCCCGGATCCATAGCGGAATTCAACATCTGGTG-3′; the changes are apparent on comparison with the wild type sequence in Fig. 10.
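As a quick consistency check on the two single-site mutation primers, the sketch below confirms that each primer carries the new restriction site and no longer carries the original motif. The primer and site sequences are taken from the text (with line-break hyphens removed); the function and variable names are illustrative only and are not part of the original procedure.

```python
# Sketch: check that each mutagenic primer carries the introduced restriction
# site and has lost the original motif. Sequences are copied from the text;
# all names below are hypothetical helpers for illustration.

MUTATION_PRIMERS = {
    # name: (primer 5'->3', original motif, introduced restriction site)
    "GATA (-684)": ("CCCTACCTGTCGACGGCCTGAAAC", "GATAGC", "GTCGAC"),      # SalI
    "LBP-1 (-666)": ("CGGCCTGAAACCCGCGGTGTGTCTGAT", "ATCTGG", "CCGCGG"),  # SacII
}


def check_primer(primer, original_motif, new_site):
    """Return (offset of new site or -1, True if the original motif is gone)."""
    return primer.find(new_site), original_motif not in primer


RESULTS = {
    name: check_primer(primer, motif, site)
    for name, (primer, motif, site) in MUTATION_PRIMERS.items()
}

for name, (offset, motif_removed) in RESULTS.items():
    print(f"{name}: new site at offset {offset}, original motif removed: {motif_removed}")
```

The same check could be extended to the longer core mutant primer once the wild type sequence from Fig. 10 is available for comparison.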
Constructs HS-B to -J were generated by polymerase chain reaction-directed deletion mutagenesis. The oligo primers used for polymerase chain reaction contained a PstI site at the 5′-end and were designed approximately every 20 base pairs beginning at −950 of the EKLF gene: for HS-B, the oligo primer was 5′-AACTGCAGCTTCAGCTCCATGCAGTAGCC-3′; for HS-C, 5′-AACTGCAGCAGTGAGGGTCCTCAGAGCC-3′; for HS-D, 5′-AACTGCAGCCAGAGTGGGCTGATTTGAGG-3′; for HS-E, 5′-AACTGCAGGGGACTCCTTTTGCTAAACAGC-3′; for HS-F, 5′-AACTGCAGGCTCAGACCTCAACACAACAGA-3′; for HS-G, 5′-AACTGCAGGAGCAATTCAAAGCTAAAATAATTTG-3′; for HS-H, 5′-AACTGCAGAATTTGTGGCACCAACCCCCAG-3′; for HS-I, 5′-AACTGCAGAGCTCTTCTGCTCAAGGAGGAA-3′; and for HS-J, 5′-AACTGCAGACAGAGCTATGGTTGTTCTGGG-3′. The oligo primer for the opposite strand was 5′-CACCCCACCTCCCACAGGACT-3′. The vector was prepared by cutting the −950 construct with PstI and isolating the larger vector band (which corresponds to a linearized −573 construct). This vector fragment was then ligated with the desired PstI-digested polymerase chain reaction product.
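The primer design stated above (a PstI site, CTGCAG, near the 5′-end of every deletion primer) can be verified with a short sketch. The sequences are those listed in the text with line-break hyphens removed; the helper names are illustrative, and the reverse-complement function is included only to show the opposite-strand primer in top-strand orientation.

```python
# Sketch: confirm that every HS deletion primer carries a PstI site right
# after its two leading bases. Helper names are hypothetical.

PSTI_SITE = "CTGCAG"

FORWARD_PRIMERS = {
    "HS-B": "AACTGCAGCTTCAGCTCCATGCAGTAGCC",
    "HS-C": "AACTGCAGCAGTGAGGGTCCTCAGAGCC",
    "HS-D": "AACTGCAGCCAGAGTGGGCTGATTTGAGG",
    "HS-E": "AACTGCAGGGGACTCCTTTTGCTAAACAGC",
    "HS-F": "AACTGCAGGCTCAGACCTCAACACAACAGA",
    "HS-G": "AACTGCAGGAGCAATTCAAAGCTAAAATAATTTG",
    "HS-H": "AACTGCAGAATTTGTGGCACCAACCCCCAG",
    "HS-I": "AACTGCAGAGCTCTTCTGCTCAAGGAGGAA",
    "HS-J": "AACTGCAGACAGAGCTATGGTTGTTCTGGG",
}
REVERSE_PRIMER = "CACCCCACCTCCCACAGGACT"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


# Offset of the first PstI site in each primer (0-based).
PSTI_OFFSETS = {name: seq.find(PSTI_SITE) for name, seq in FORWARD_PRIMERS.items()}

for name, offset in PSTI_OFFSETS.items():
    print(f"{name}: PstI site at offset {offset}")
print("opposite-strand primer, top-strand orientation:",
      reverse_complement(REVERSE_PRIMER))
```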
DNase I Hypersensitivity Assay-About 10^8 32DEpo1 cells were harvested for the preparation of nuclei. After washing with phosphate-buffered saline, cells were incubated with lysis buffer (10 mM Tris, pH 7.4, 10 mM NaCl, 5 mM MgCl2, 0.1 mM EGTA, 0.1% Nonidet P-40) for 5 min on ice. The solution was then layered onto lysis buffer that contained 10% sucrose and centrifuged to pellet the nuclei. After one rinse the nuclei were again resuspended in lysis buffer.
DNase I digestion was carried out with 1.5 × 10^7 nuclei in DNase I buffer (20 mM Tris, pH 7.4, 20 mM NaCl, 10 mM MgCl2, 0.1 mM CaCl2). The amount of DNase I was 0, 0.1, 0.2, 1, or 2 units. After incubation at 37°C for 5 min, stop solution (2% SDS, 25 mM EDTA, 20 µl/ml of 20 mg/ml proteinase K) was added and incubated at 55°C overnight. After extraction with phenol/chloroform, 300 µg/ml RNase A was added, and the mixture was incubated at 37°C for 30 min. The DNA was precipitated after another phenol/chloroform extraction. The DNA samples were then finally dissolved in TE buffer. The DNA samples for liver and spleen were prepared as described (25).
The DNA samples were digested with SmaI and were blotted to nitrocellulose (15 µg/lane) using standard procedures (26). The 5′-end probe was the SmaI-NcoI fragment, and the 3′-end probe was the SacI-SacII fragment from EKLF cDNA (1).
Transient Transfection and CAT Assay-10 µg of CAT reporter and 2 µg of control growth hormone constructs were cotransfected into 1 × 10^7 32DEpo1 cells by the DEAE-dextran method as described (27). CV1 and NIH3T3 cells at 40-50% confluence in 100-mm dishes were transfected with 10 µg of CAT reporter and 2 µg of growth hormone control constructs by calcium precipitation (1). Calcium-DNA precipitates were added in media and incubated with cells for 7-8 h in the presence of 0.1 mM chloroquine. After washing, cells were continuously grown for another 35-40 h. CAT assays were carried out as described (27). For 32DEpo1 cells 80 µg of protein was used, and the incubation time was 2 h at 37°C. For CV1 and NIH3T3 cells 120 µg of protein was used, and the incubation time was 1 h at 37°C. The data from multiple experiments were averaged after normalization of CAT activity to growth hormone levels (1) and are presented as "normalized CAT activity."
Nuclear Protein Extraction and in Vitro Footprint Assay-Nuclear extracts were prepared from 1 × 10^9 32DEpo1 cells. All buffers contained 40 µM phenylmethylsulfonyl fluoride, 0.5 mM dithiothreitol, 10 µg/ml leupeptin, and 10 µg/ml antipain. After washing with phosphate-buffered saline + 0.35 M sucrose, cells were incubated with buffer A (10 mM Hepes, pH 7.9, 1.5 mM MgCl2, 10 mM KCl) on ice for 5 min. The cell pellets were then incubated in buffer A + 0.5% Triton on ice for another 5 min and then transferred to TH buffer (15 mM Tris, pH 7.4, 0.35 mM sucrose, 60 mM KCl, 15 mM NaCl, 0.2 mM EDTA, 0.2 mM EGTA, 0.15 mM spermidine and 0.5 mM spermine). After spinning, the cell pellet was resuspended and incubated for 30 min on a shaker at 4°C in buffer C (20 mM Hepes, pH 7.9, 420 mM NaCl, 1.5 mM MgCl2, 0.7 mM EDTA, 25% glycerol). The extracted nuclei were then removed by spinning at 10,000 × g, and the supernatant was dialyzed against buffer D (20 mM Hepes, pH 7.9, 40 mM KCl, 0.2 mM EDTA, and 20% glycerol), aliquoted, frozen, and stored at −80°C.
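The normalization described for the CAT assays above (CAT activity divided by the cotransfected growth hormone level, then averaged over experiments) can be sketched as follows. This is only an illustration of the arithmetic: the readings are invented placeholder numbers, not data from this study, and the function name is hypothetical.

```python
# Sketch of "normalized CAT activity": per-experiment CAT/GH ratios, averaged.
# All numeric readings below are made-up placeholders for illustration.
from statistics import mean


def normalized_cat_activity(experiments):
    """Average the CAT / growth hormone ratio over independent experiments.

    `experiments` is a list of (cat_signal, growth_hormone_level) pairs.
    """
    return mean(cat / gh for cat, gh in experiments)


# Hypothetical readings from three independent transfections.
reporter_readings = [(40.0, 2.0), (33.0, 1.5), (18.0, 1.0)]
reference_readings = [(2.0, 2.0), (1.0, 1.0), (1.5, 1.5)]

reporter = normalized_cat_activity(reporter_readings)
reference = normalized_cat_activity(reference_readings)

# Activity of the reporter relative to the reference construct.
relative_activity = reporter / reference
print(f"relative normalized CAT activity: {relative_activity:.1f}")
```

Dividing by the growth hormone signal corrects each experiment for transfection efficiency before the averages are compared.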
The template for the footprint assay was generated by digesting construct HS-G with HindIII, labeling the non-coding strand with Klenow fragment DNA polymerase, and performing a secondary digestion with EcoNI. The released labeled template was isolated from a 6% polyacrylamide gel by elution, phenol/chloroform extraction, and ethanol precipitation. 20,000 cpm of labeled DNA was mixed with 7.5 or 14 µg of 32DEpo1 nuclear protein extract in buffer containing 40 mM Hepes, pH 7.9, 110 mM KCl, 10 mM MgCl2, 5 mM dithiothreitol, 0.05% Nonidet P-40, 17% glycerol, and 1 µg of poly(dI)/poly(dC) for each binding reaction in 50 µl. After a 1-h incubation on ice, 50 µl of Mg-Ca buffer (10 mM MgCl2 and 5 mM CaCl2) and 0.4 or 0.8 units of DNase I (for the bovine serum albumin-negative control, 0.005 units of DNase I was used) were added to the reaction and incubated for an additional 2 min on ice. The reaction was then stopped by adding 90 µl of stop buffer (1% SDS, 20 mM EDTA, 200 mM NaCl, and 250 µg/ml yeast tRNA). After phenol/chloroform extraction and ethanol precipitation, samples were dissolved in 6 µl of loading buffer (80% formamide, 45 mM Tris base, 45 mM boric acid, 1 mM EDTA, 0.05% bromphenol blue, and 0.05% xylene cyanol) and analyzed on a 6% sequencing gel.
Oligonucleotides and Gel Shift Assay-The oligonucleotides used for gel shift assays contained HindIII sites at each end and were as follows ("top" strand only is shown): oligo-1, 5′-AGCTACAGAGCT-3′; oligo-2, 5′-AGCTTATGGTTGTTCTGGGCCC-3′; oligo-3, 5′-AGCTCCCTACCTGA-3′; oligo-4, 5′-AGCTTAGCGGCCTGAAACA-3′. The gel shift assay was performed as described (27). 2 × 10^4 cpm of labeled double-stranded oligo and 0.7 µg of nuclear extract were incubated in the same buffer as used for the in vitro footprint assay, then separated on a 6% native polyacrylamide gel.

RESULTS
Localization and Functional Testing of Distal EKLF Promoter Regions-Previously it had been shown that GATA and CP1 sites, located within 90 base pairs of the EKLF transcription initiation site, are important for activity in transient assays (21). We wished to ascertain whether more distal elements could augment the level of transcriptional activity seen with this minimal promoter. Our approach was to search for DNase-hypersensitive sites surrounding the EKLF transcription unit. Such open structures within chromatin can reveal sites important for enhancing transcription (28,29). We performed these experiments using nuclei from 32DEpo1 cells (22,30), a murine erythropotential cell line that expresses EKLF, GATA1, and β-globin. As shown in Fig. 1A, four hypersensitive sites were found, located at approximately −8.0, −0.7, −0.3, and +5.5 kb relative to the transcription initiation site. All four sites were present whether the cells were grown in interleukin 3 or erythropoietin (data not shown). To address which of these were erythroid-specific, we compared the hypersensitive status of the EKLF gene in spleens and livers from an anemic adult mouse. Earlier DNase analysis of these same samples demonstrated that only the spleen exhibits hypersensitive sites at the β-globin locus control region (25). The analyses at the EKLF promoter (Fig. 1B) demonstrate that only the two sites closest to the initiation site (at −0.7 and −0.3 kb) are erythroid-specific; the other sites (at −8.0 and +5.5 kb) are present in both tissues. The two erythroid-specific sites will be referred to as EKLF-hypersensitive sites 1 and 2 (EHS1 at −0.7 kb, and EHS2 at −0.3 kb). These sites allow us to conclude that the EKLF transcription unit resides in a more open chromatin structure in erythroid cells and also direct us to two distal locations within the promoter that may account for its tissue-specific expression.
To localize important EKLF promoter regions, we compared the upstream sequences of the murine (23) and human (31) EKLF genomic transcription units via a matrix analysis. Analysis of conserved regions has been helpful in localizing functional elements, for example the mouse and human locus control regions (32,33). Analysis of ~800 nucleotides from the human promoter compared with ~1 kb of the murine promoter (Fig. 2) reveals that two discrete regions show extensive sequence conservation. The proximal region extends to approximately −90 and includes the GATA and CP1 sites described earlier (21). Interestingly, the other region retains significant similarity for ~300 nucleotides and precisely overlaps the upstream erythroid-specific EKLF-hypersensitive site, EHS1.
FIG. 1. DNase hypersensitivity analysis of the EKLF transcription unit. Genomic DNA was isolated from nuclei that had been incubated with increasing amounts of DNase I as indicated, digested with SmaI, electrophoresed, blotted, and hybridized with region-specific probes. Expected SmaI-digested segments are based on the EKLF murine genomic sequence (23). Undigested samples in each case are shown in lane 1. A, DNA from 32DEpo1 cells was processed and hybridized with 5′- (left) or 3′-specific (right) probes. Arrows denote the location of the undigested SmaI fragments whose expected sizes are indicated below the figure. Arrowheads indicate the four hypersensitive sites that appear upon increasing DNase I digestion; their approximate locations are also shown in the schematic diagram below the figure. B, DNA from adult spleen or liver was processed and hybridized with 5′- (left) or 3′-specific (right) probes. Arrowheads with asterisks indicate the two erythroid-specific hypersensitive sites not observed in the liver sample.
We began the functional analyses by addressing whether a murine EKLF promoter construct, which contains 3 kb of upstream sequence, imparts high levels of expression upon a simple CAT reporter gene that contains no other basal promoter sequence (pBLCAT6 (24)). As both of the erythroid-specific hypersensitive sites are within this fragment, we started our analyses with this construct and tested derived deletions via transient assays into 32DEpo1 cells (Fig. 3). Deletion to −950 has no effect, indicating that sequences beyond the hypersensitive sites are not important for promoter activity. However, deletion to −573, which removes EHS1 alone, results in a drastic (~20-fold) decrease in CAT activity. Removal of EHS2 alone (Δ−573/−155) yields a slight (~50%) increase in activity. Removal of both EHS1 and EHS2 (the −155 construct) decreases activity to a level similar to that of the single EHS1 deletion but, consistent with earlier experiments (21), does not abolish it. We conclude that EHS1, a conserved locus that exhibits erythroid-specific hypersensitivity, is critical for imparting high level transcription upon the adjacent EKLF gene.
Determinants of Erythroid Specificity-Given that expression of the EKLF promoter is erythroid-specific, we tested whether the EHS1 element would impart this property upon a heterologous promoter by fusing it to the thymidine kinase (tk) promoter in ptkCAT (pBLCAT5 (24)). The tk promoter is already active (normalized to a value of 1 in each cell line for comparison across cell lines), but its level is boosted 20-30-fold when fused to EHS1, regardless of cell type (Fig. 4A) and orientation (data not shown). The EHS1 fragment thus appears to behave as a non-tissue-specific enhancer element. However, fusion of this element to the minimal EKLF promoter, which bears no sequence similarity to the tk promoter, was sufficient to regenerate the tissue-specific expression exhibited by the complete 3-kb construct and the −950 construct (Fig. 4B). We conclude that tissue-specific reporter expression requires the proximal EKLF promoter but that its high level expression additionally requires the EHS1 element.
The proximal EKLF element is highly conserved in mammalian cells (Fig. 2), including the GATA and CP1 elements. We asked how alteration of these sequences would affect the ability of EHS1 to boost transcription. By using the minimal EKLF promoter construct (−77 nucleotides), we found that mutation of either site disrupted reporter levels (Fig. 5A), as previously observed (21). When EHS1 was fused to each of these constructs, all were significantly boosted (Fig. 5A). However, when normalized to their respective non-EHS1-containing values (Fig. 5B), it became clear that the GATA mutation had the most severe effect on enhancement, being boosted only ~6-fold compared with a boost of ~20-fold for the wild type and CP1 mutant. We conclude that the proximal GATA site is essential not only for minimal promoter activity but also for optimal EHS1 activity. On the other hand, the CP1 site, although important for EKLF promoter activity, is not essential for EHS1 activation.
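The distinction drawn above, between the absolute reporter level of each promoter variant and its fold enhancement by EHS1, comes down to dividing each EHS1-containing value by its own EHS1-free baseline. A minimal sketch follows; the activity values are invented placeholders chosen only to reproduce the approximate fold values quoted in the text (~20, ~6, ~20), and are not the measurements plotted in Fig. 5.

```python
# Sketch: fold enhancement by EHS1 = activity with EHS1 / activity without it,
# computed separately for each proximal promoter variant.
# The activity numbers are hypothetical placeholders, not data from Fig. 5.

ACTIVITY = {
    # variant: (normalized CAT activity without EHS1, with EHS1)
    "wild type": (1.0, 20.0),
    "GATA mutant": (0.25, 1.5),
    "CP1 mutant": (0.5, 10.0),
}


def fold_enhancement(without_ehs1, with_ehs1):
    """Return the EHS1-dependent boost relative to the variant's own baseline."""
    return with_ehs1 / without_ehs1


FOLD = {name: fold_enhancement(*values) for name, values in ACTIVITY.items()}

for name, fold in FOLD.items():
    print(f"{name}: {fold:.0f}-fold boost by EHS1")
```

Normalizing each variant to its own baseline is what separates a defect in basal promoter activity (seen for both mutants) from a defect in responsiveness to the enhancer (seen only for the GATA mutant).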
In Vivo Analyses of EHS1-We next focused our attention on the distal EHS1 enhancer element. Inspection of the sequence surrounding the −0.7-kb EHS1 indicated that it contains conserved GATA- (at −684) and LBP- (at −666) (34) binding sites. The importance of these sites was tested by directed mutagenesis, and the results indicate that their disruption had no effect on the activity of EHS1 (Fig. 6A). We then tested an extensive series of deletion mutants, starting at the 5′-end with the fully active −950 construct (Fig. 6B). Sequential deletions of approximately 20 base pairs to −715 had no discernible effect (constructs HS-B to -J), but deletion to −691 significantly disrupted transactivation, approaching the low level seen with the −573 deletion. Two internal deletions then allowed us to localize the 3′-boundary of this element (Fig. 6C). Deletion of sequences from the LBP site to the −573 boundary (Δ−666/−573) had no effect on transactivation. However, a slightly larger deletion up to −691 (Δ−691/−573) decreased activity to the level seen with the 5′-directed −691 deletion. These analyses suggest that full EHS1 activity is localized to the 49-bp sequence between −715 and −666 of the EKLF promoter.
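The reasoning that delimits the core element can be made explicit with a small model: each construct is described by the promoter segments it retains, and a construct is predicted to be fully active only if it retains the whole interval from −715 to −666. The coordinates and the qualitative outcomes are taken from the text (the +71 3′-end comes from the parent fragment); reducing activity to a yes/no value and the names used below are simplifying assumptions for illustration.

```python
# Sketch: does each deletion construct retain the whole -715..-666 core, and
# does that prediction match the qualitative activity reported in the text?
# Activity is simplified to True (full) / False (reduced).

CORE = (-715, -666)  # boundaries reported for full EHS1 activity

CONSTRUCTS = {
    # name: (retained segments as (5' end, 3' end), reported full activity)
    "-950": ([(-950, 71)], True),
    "HS-J (to -715)": ([(-715, 71)], True),
    "-691": ([(-691, 71)], False),
    "-573": ([(-573, 71)], False),
    "delta -666/-573": ([(-950, -666), (-573, 71)], True),
    "delta -691/-573": ([(-950, -691), (-573, 71)], False),
}


def retains_core(segments, core=CORE):
    """True if a single retained segment spans the entire core interval."""
    core_start, core_end = core
    return any(start <= core_start and core_end <= end for start, end in segments)


core_length = CORE[1] - CORE[0]
print(f"core length: {core_length} bp")

for name, (segments, reported_active) in CONSTRUCTS.items():
    predicted = retains_core(segments)
    print(f"{name}: predicted active={predicted}, reported active={reported_active}")
```

Under this simplification every construct's predicted activity agrees with the reported outcome, which is the logic behind assigning full EHS1 activity to the 49-bp interval.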
In Vitro Analyses of EHS1-We used in vitro assays to determine which region(s) within the 49-bp sequence interact with DNA-binding proteins in extracts from 32DEpo1 cells. DNase footprint analysis with a 250-bp fragment from EHS1 revealed that DNase protection overlaps the 49-bp sequence of interest (Fig. 7), giving rise to regions of both decreased access and hypersensitivity to the nuclease. These were grouped into four footprints (FP1, FP2, FP3, and FP4; Fig. 7) bounded by regions of nuclease insensitivity. Note that FP1 extends beyond the 5′-boundary (−715) of the 49-base pair sequence and that the unimportant GATA site (−684) separates FP3 and FP4.
By juxtaposing the in vitro nuclease protection and in vivo deletion data (in Figs. 6 and 7), we designed four specific oligonucleotides and tested their ability to bind proteins in the 32DEpo1 extract (Fig. 8). The 5′-most sequence (oligo 1) did not bind any protein; this was not surprising as this sequence contained only half of FP1. However, the adjacent three sequences (oligos 2-4) yielded a simple pattern of interaction, each with its own single species of protein. These data led us to conclude that three DNA binding activities in erythroid cell extracts interact with the core 49-bp region of EKLF promoter DNA at approximately −700 that enhances transcriptional activation and exhibits an erythroid cell-specific open chromatin conformation in vivo.
FIG. 4. In vivo tests of erythroid specificity by the EKLF distal EHS1 element and the proximal promoter. A, transient transfection assays of the indicated cell lines were performed with pEHS1tkCAT or ptkCAT (pBLCAT5). Each CAT activity is expressed relative to that seen with pBLCAT5 within that particular cell line (and given an arbitrary value of 1) after normalization of data from multiple experiments. The autoradiograph of the thin layer plate from one experiment is also shown. B, transient transfection assays of the indicated cell lines were performed with the indicated expression constructs. CAT6 denotes pBLCAT6 empty vector levels and CAT5 denotes ptkCAT levels. As in A, each CAT activity is expressed relative to that seen with pBLCAT5 within that particular cell line (and given an arbitrary value of 1) after normalization of data from multiple experiments.
2) were tested and compared with the wild type CAT activity level. B, a series of 5′-deletion mutants beginning at −950 and proceeding approximately every 20 base pairs (constructs B-J; J is deleted to −715) before ending at −691 was tested along with the full EHS1 deletion (−573). C, levels of activity of internal deletion mutants of EHS1 were compared with 5′-deletion mutants as indicated.
In Vitro and in Vivo Tests of Putative Transcription Factor Sites within EHS1-A search for potential transcription factor sites within oligo 2 reveals that it contains overlapping glucocorticoid receptor and LBP-1 sites. However, binding to oligo 2 remains unaffected in the presence of dexamethasone or RU 486 (data not shown), rendering it less likely that the shift is due to glucocorticoid receptor. LBP (also known as LSF or CP2) binds to the long terminal repeat of human immunodeficiency virus and to the SV40 and α-globin promoters and is one of a family of related transcription factors (34-36). This protein is also unlikely to account for the oligo 2 gel shift, as anti-LSF antibodies (37) do not give rise to any supershift (data not shown). Oligo 3 contains a cytokine 2 site. NF-GMb is a cold shock domain repressor protein that binds to the cytokine 2 element in the granulocyte/macrophage colony-stimulating factor gene (38); however, it prefers binding to single-stranded DNA, a preference not seen in our studies of oligo 3 (data not shown). Oligo 4 contains sites for the UBF1 protein (an RNA polymerase I transcription factor (39)) and for a yeast α-factor responsive element (40). However, oligonucleotides that are mutated within these sites still compete as well as wild type for binding (data not shown), indicating that another protein must be responsible for the shift with oligo 4.
These data enabled us to design one final direct transactivation test of the core 49-bp EHS1. Localized mutagenesis of the three protein binding regions was performed within the context of the fully active −950 construct, and the resultant construct was tested in transfection assays. The data (Fig. 9) show that the core mutant construct is crippled for transactivation, as expected from the deletion studies. This verifies that the three protein binding regions identified in the present study, located between −715 and −666, are critical for optimal EKLF promoter activity.

DISCUSSION
EKLF is an important regulator of gene switching and β-globin expression in red blood cells. Recent studies have begun to address how the expression of such regulators is itself controlled in erythroid cells (reviewed in Ref. 41). The present studies, summarized in Fig. 10, greatly increase our knowledge of EKLF regulation.
The EKLF Transcription Unit Resides within an Open Chromatin Structure-Eukaryotic transcription results from the synergistic interplay between proximal and distal promoter elements (42). These interactions are thought to result in the appropriate stereospecific structure that leads to high levels of gene activity (43). As these events occur within the confines of histones and other chromatin-associated proteins, the detection of open domains has usually correlated with actively transcribing (or "primed" (44)) areas of the genome (28,29). Current models implicate protein-protein interactions as being of critical importance in looping such disparate regions of DNA together (45-48). In the present case, EKLF resides within a partially open structure even in cell types that are not actively transcribing it. However, nuclease accessibility increases further only within the erythroid cell, giving rise to two specific hypersensitive sites within less than 1 kb of the start of EKLF transcription (EHS1 and EHS2; Fig. 10).
Removal of EHS2 yields a slight increase in EHS1-driven reporter activity, implicating it as a negative element that potentially targets EHS1. Regulation of an upstream activation sequence by a downstream negative regulatory site has been observed in the yeast YOR1 gene (49).
The other site (EHS1) behaves as an enhancer in transient in vivo assays and binds a limited number of proteins in erythroid extracts in vitro. However, high levels of erythroid-specific transcription require both this EHS1 upstream activator element and the proximal EKLF promoter, with its important GATA site. Although forced GATA1 expression could activate the EKLF proximal promoter fragment in non-erythroid cells (21), another GATA-related factor may be the actual in vivo effector of EKLF transcription, as EKLF is still expressed in the absence of GATA1 (50), and its onset of expression in development and erythropoiesis appears coincident with that of GATA1 (10,51). In this scenario, GATA2 becomes a reasonable candidate as an EKLF effector, as it binds the same site and is suitably expressed quite early in hematopoiesis (51-53).
Although we have focused our attention on the erythroid-specific hypersensitive sites, the two constitutive sites (at −8.0 kb and +5.5 kb) may be functionally important within chromatin, possibly by providing a boundary or insulator in which the EKLF transcription unit can reside (46,54). Studies in transgenic mice should shed light on whether these sites, in addition to the tissue-specific ones, are required together for formation of an open chromatin architecture.
The structure of the EKLF promoter is reminiscent of that of the chicken lysozyme gene (55). The chromatin surrounding this gene also contains a constitutive hypersensitive site in non-expressing cells. However, additional sites are apparent only in the chromatin within myeloblasts, which changes and becomes reorganized as the myeloid cells differentiate. Interestingly, formation of these hypersensitive sites is completely abolished in the absence of the lysozyme proximal promoter (55), a property also observed within the β-globin locus in erythroid cells (25) and in the rearranged immunoglobulin gene (56).
Three DNA Binding Activities Interact with the Core EHS1 Element-EHS1 gives rise to high levels of erythroid-specific transcriptional activity only in combination with the EKLF proximal promoter. The EKLF promoter contains GATA and CCAAT boxes, although it contains neither a TATA box nor an initiator element. EHS1 (bounded by −950 to −573) contains a large number of putative sites for DNA-binding proteins. Because the functional data reveal that deletions within EHS1 from either the 5′- or 3′-direction to −691 (an ApaI site) disrupt high level transcription in vivo by 50% compared with its complete deletion (to −573), sequences on either side of the ApaI site must both be required for optimal expression. At the same time, the in vitro studies indicate that there are three DNA binding activities in cell-free extracts that bind to the important 49-bp core region within EHS1 (Fig. 10). Together, these data reveal that one activity (that binds to oligo 2) binds to the 5′-side of ApaI, and two activities (that bind to oligos 3 and 4) bind to the 3′-side of ApaI. These DNA binding activities may all cooperate to synergistically activate EKLF promoter transcription, similar to that seen in other systems. For example, c-myb and CBF sites are both required for proper activity of the myeloperoxidase enhancer element (57). Similarly, Pit-1 and GATA-2 functionally cooperate within a 50-bp region to activate the thyrotropin β promoter (58). These interactions may affect the rate of EKLF transcription or the probability of forming an open chromatin structure, as in "binary" models of transcriptional control (59).
Our results have excluded a number of potential DNA binding participants in EHS1. Clearly, identifying the proteins that interact with EHS1 and are responsible for its enhancer properties will be of interest in elucidating the details of EKLF genetic regulation. Although this fragment behaved as an enhancer in all lines examined, this does not necessarily indicate whether it binds ubiquitous or tissue-specific factors. For example, although numerous DNA binding factors bind the β-globin CACCC element in all cells (e.g. Sp1), only the erythroid-specific EKLF plays a role in β-globin expression.
Regulation of EKLF Expression-The approach taken in the present studies begins at the gene locus to identify the immediate cis and trans causal components of EKLF transcription. We have utilized erythropotential cells in which EKLF is already abundantly expressed. However, it is clear that EKLF expression is induced at two very specific locations during early development (10): in the mesodermal blood islands of the yolk sac by day 7.5, and in the hepatic primordia by day 9.5. Our studies have not addressed induction mechanisms, but importantly they set the stage for future studies that will address whether the presently identified components play a role in the initial establishment of EKLF-producing cells within the early embryo, and what extracellular effectors and signal transduction pathways are involved. Of particular interest is the potential role of specific cytokine inducers that transduce via tyrosine (60,61) or serine-threonine (62) kinase receptors.
A related issue is whether the genetic control of EKLF will differ between primitive and definitive erythroid populations. Although genetic ablation of a number of erythroid transcription factors disrupts both red cell compartments (reviewed in Ref. 63), this is not exclusively so (11)(12)(13)(14)(15)(16). In addition, results of disruption of the erythropoietin receptor (17,18) and growth responsiveness during embryonic stem cell differentiation in culture (50,64) indicate that these two erythroid populations are regulated differently by extracellular signals. The only non-globin erythroid gene that has been analyzed at this level is that of GATA1 (reviewed in Ref. 41). In that case, transgenic studies indicate that promoter elements responsible for primitive or definitive expression are separable. Clearly it will be of interest to establish whether this is also true for EKLF and whether there are common promoter elements and cognate binding factors controlling these two genes. Extending these studies to determine the ultimate causal components (i.e. extracellular to intracellular pathways) in directing the onset of transcription of these important factors will be an important challenge for future studies.
FIG. 9. Localized mutagenesis of core EHS1 region. Transient transfection assays of 32DEpo1 cells were performed with the indicated expression constructs. Core mut refers to a derivative of the −950 construct generated by directed mutagenesis (described under "Experimental Procedures") of the three protein binding domains within core EHS1 (as determined in Fig. 8). Normalized results from multiple experiments are shown along with an autoradiograph of the thin layer plate from one experiment.
Interactions between EHS1 and the EKLF proximal promoter give rise to high level, red cell-specific transcription. Locations of the GATA and CP1 sites within the proximal promoter are shown.
The detailed view of the EHS1 region shows the colocalization in structure, sequence, and function from the present data. First, DNase hypersensitivity assays indicate that EHS1 maps to −0.7 kb relative to the EKLF transcription initiation site. Second, unlike the surrounding area, this region is highly conserved between the murine and human EKLF promoter sequences. Third, the functional in vivo data indicate that the core enhancer maps to the −715/−666 region, whose complete sequence is shown (80% identical between murine and human). Fourth, in vitro footprint and electrophoretic mobility shift assay data indicate that this region can be divided into three oligonucleotide segments that each interact with a single species of DNA binding activity in extracts from 32DEpo1 cells.