Multiple Proteins Binding to a GATA-E Box-GATA Motif Regulate the Erythroid Krüppel-like Factor (EKLF) Gene*

Erythroid Krüppel-like factor (EKLF) is a zinc finger transcription factor required for β-globin gene expression and is implicated as one of the key factors necessary for the fetal to adult switch in globin gene expression. In an effort to identify factors involved in the expression of this important erythroid-specific regulatory protein, we have isolated the mouse EKLF gene and systematically analyzed the promoter region. Initially, a reporter construct with 1150 base pairs of the EKLF 5′-region was introduced into transgenic mice and shown to direct erythroid-specific expression. We continued the expression studies in erythroid cells and have identified a sequence element consisting of two GATA sites flanking an E box motif. The three sites act in concert to elevate the transcriptional activity of the EKLF promoter. Each site is essential for EKLF expression indicating that the three binding sites do not work additively, but rather function as a unit. We further show that GATA-1 binds to the two GATA sites and present evidence for binding of another factor from erythroid cell nuclear extracts to the E box motif. These results are consistent with the formation of a quaternary complex composed of an E box dimer and two GATA-1 proteins binding at a combined GATA-E box-GATA activator element in the distal EKLF promoter.

Over the past few years, several relatively specific erythroid transcription factors have been isolated and their functional activity defined. These transcription factors regulate the expression of globin genes as well as many other erythroid-specific genes (for reviews see Refs. 1 and 2). One of these erythroid-specific factors has been termed erythroid Krüppel-like factor (EKLF) (3). EKLF is a zinc finger DNA-binding protein that recognizes the CACCC motif in the human β-globin promoter. EKLF expression is restricted to the erythroid cell lineage, with initial expression in the yolk sac (4) and predominant expression occurring later in erythroid development. As a transcription factor, EKLF appears to be specifically involved in adult β-globin gene expression (4-6). Mice deficient in this gene exhibit lethal β-thalassemia (7, 8). This disease is very similar to the human β-thalassemia caused by point mutations in the CACCC sequence, lending further support to the importance of the EKLF-CACCC interaction in vivo (9).
Other critical genes involved in hematopoiesis include GATA-1, Tal1, and Lmo2/rbtn2. While targeted disruption of the mouse EKLF gene results in a failure in adult erythropoiesis, inactivation of any one of these three genes, GATA-1, Tal1, and Lmo2, produces a similar phenotype characterized by a block in hematopoiesis at an earlier yolk sac stage (10-12). GATA-1 is a zinc finger transcription factor, and its expression is generally confined to erythroid cells. GATA-1-binding sites are present in all erythroid-specific genes examined to date (13). Tal1 is a basic helix-loop-helix (bHLH) transcription factor whose name is derived from its isolation at a common translocation site occurring in T-cell acute lymphoblastic leukemia (14). The protein is primarily produced in the same hematopoietic cells that also produce GATA-1. As a class B type of bHLH factor, Tal1 does not readily homodimerize but rather interacts with other HLH proteins, principally the E2A proteins, E12 and E47 (15). These heterodimers are then able to bind DNA. The general consensus site for bHLH complexes is CANNTG and is referred to as an E box. The final erythroid-specific protein of this set, Lmo2, contains a LIM domain believed to be involved in protein-protein interactions (16, 17). Whereas this particular class of proteins also has a zinc finger motif, no evidence for direct DNA binding has been obtained.
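The E box consensus quoted above (CANNTG) can be expressed as a simple pattern search. The following is a minimal sketch, not a method used in this study: the 4-bp GATA core pattern is a simplifying assumption (searched on both strands as GATA or TATC), and the demo sequence is hypothetical and is not taken from the EKLF promoter.

```python
import re

# E box consensus for bHLH complexes as stated in the text: CANNTG.
E_BOX = re.compile(r"CA[ACGT]{2}TG")

# Assumption: a bare 4-bp GATA core, matched on either strand
# (GATA, or TATC as its reverse complement). The lookahead allows
# overlapping hits to be reported.
GATA_CORE = re.compile(r"(?=(GATA|TATC))")


def find_sites(seq):
    """Return 0-based positions of E box and GATA core matches in seq."""
    seq = seq.upper()
    e_boxes = [(m.start(), m.group()) for m in E_BOX.finditer(seq)]
    gata = [(m.start(), m.group(1)) for m in GATA_CORE.finditer(seq)]
    return {"e_box": e_boxes, "gata": gata}


# Hypothetical demo sequence with a GATA core on each side of an E box.
demo = "TTGATAAGGCCCAGCTGTTCCTTATCAA"
sites = find_sites(demo)
for kind, hits in sites.items():
    print(kind, hits)
```

Both Tal1-type E boxes mentioned later in the text (CAGATG and CAGGTG) satisfy the same CANNTG pattern, so a consensus match alone does not identify which bHLH factor binds.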
The similar failure in erythropoiesis observed with the null mutations in each of these four genes, EKLF, GATA-1, Tal1, and Lmo2, suggests some common role or regulatory interplay in hematopoietic development. Several associations and interactions between these proteins have indeed been demonstrated. GATA-1, for example, has been shown to physically associate with both EKLF and the ubiquitous SP1 protein (18). GATA-1 is also involved in the regulation of EKLF expression through binding to a critical proximal promoter element (19). Recent developments concerning the assembly of an erythroid-specific complex of transcription factors are particularly interesting. Lmo2 and Tal1 can be found as a complex in erythroid cells (20). Subsequently it was noted that Lmo2 will also assemble with GATA-1, whereas efforts to demonstrate a stable association between GATA-1 and Tal1 were unsuccessful (21). Evidence for a model in which Lmo2 interacts with both GATA-1 and Tal1, serving as a protein link between these two DNA-binding proteins, has now been provided (22). One caveat with these experiments, however, is the absence of a target gene for which these proteins, excluding GATA-1, alone or as
* This work was supported by National Institutes of Health Grant DK39585. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EBI Data Bank with accession number(s) AF033102.
In our studies of the regulatory elements and associated binding factors important for temporal and tissue-specific expression of the EKLF gene, we have identified an activator element in the distal promoter region. The element consists of two GATA sites flanking an E box motif. We have shown through mutational analyses that all three sites are required for the functional activity. Such a combined site and the data suggesting the binding of multiple proteins raise interesting issues concerning the mechanisms involved in erythroid-specific gene regulation.

EXPERIMENTAL PROCEDURES
Construction of Reporter Plasmids-The basic constructs −1150EKLFCAT and −124EKLFCAT were prepared using PCR. Primers 5′-EKLFKpn, GCTTTCTCGAGGCCTGACTAGGTACC, and EKLFH3, GGAATTCAAGCTTGGCTGGCTGGTGTCCACC, were used to amplify the −1150-bp region. The EKLFH3 primer and EKLF124Kpn, CCTGGTACCGCACACCATACACATATCG, were utilized to amplify the −124-bp promoter region. For transgenic animal studies, the −1150-bp KpnI-HindIII fragment was ligated 5′ of the lacZ gene containing the SV40 poly(A) addition site. The KpnI-HindIII fragments of −1150EKLF and −124EKLF were ligated 5′ of a CAT reporter in pGEM7Zf (Promega, Madison, WI) to yield −1150EKLFCAT and −124EKLFCAT, respectively. The −1150EKLFCAT construct was then digested with PstI followed by religation, creating Δ−928/−575EKLFCAT. A KpnI-SacI fragment was eliminated to yield −952EKLFCAT. −1150EKLFCAT was cut in the upstream polylinker at the SphI site and at KpnI, and a 436-bp fragment was cloned in to yield −1596EKLFCAT. This construct was subsequently cut with AvrII and SphI to produce −1385EKLFCAT.
To create mutations in the −1150 construct, a double-stranded oligomer, SpMyb, spanning −647 to −567 and containing a 5′ ApaI overhang adjacent to an XbaI site and a 3′ NcoI overhang, was ligated to the ApaI and NcoI sites of −1150EKLFCAT, resulting in a −1150EKLFCAT without the GATA-E box-GATA site. This construct was restricted with ApaI and XbaI, and oligomers GEG, GᴹEG, GEGᴹ, GEᴹG, and GᴹEᴹGᴹ (see Fig. 5) were inserted to make the −1150EKLFCAT mutants. To produce the −60GATA mutation, primer H114D, CCTATGCATCTTTTGCTAAACAGCTCAG, and the −60GATAB mutation oligomer, GTCTTCCTCTAGAAGCACCCAGGC, were used to amplify −810 to −60, mutating the −60 GATA site. −60GATAT, GCCTGGGTGCTTCTAGAGGAAGAC, and EKLFH3 primers amplified the −60 to +62 region with the complementary GATA mutation. These two −60 GATA mutation products were cut and ligated at the newly created XbaI site, then cut 5′ with Tth111I and 3′ with HindIII and ligated into the Tth111I-HindIII sites of −1150EKLFCAT, resulting in −1150(−60GATAᴹ)EKLFCAT.
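The statement that the −60GATAT primer carries the complementary GATA mutation can be checked directly from the two oligomer sequences. The sketch below is an illustrative sanity check, not part of the published protocol; the sequences are copied from the paragraph above with line-break hyphens removed, and the XbaI recognition sequence (TCTAGA) is the standard one.

```python
# Map each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Mutagenic oligomers as given in the text (hyphens from line breaks removed).
GATAB = "GTCTTCCTCTAGAAGCACCCAGGC"  # -60GATAB
GATAT = "GCCTGGGTGCTTCTAGAGGAAGAC"  # -60GATAT
XBAI_SITE = "TCTAGA"                # XbaI recognition sequence

are_complementary = reverse_complement(GATAB) == GATAT
both_carry_xbai = XBAI_SITE in GATAB and XBAI_SITE in GATAT

print("complementary:", are_complementary)
print("XbaI site in both:", both_carry_xbai)
```

The two oligomers are exact reverse complements and each contains the XbaI site, which is consistent with the two PCR products being joined at the newly created XbaI site.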
Additions to the −124EKLFCAT construct included isolated fragments, PCR products, and double-stranded oligomers. PstI-HinfI (−928 to −814) and HinfI-PstI (−814 to −575) fragments were blunt end-ligated 5′ of −124EKLFCAT at the KpnI site, yielding (114)−124EKLFCAT and (239)−124EKLFCAT, respectively. A fragment from −810 to −692 was amplified using primers 5′-239K, CTGGTACCTTTTGCTAAACAGCTCAG, and 5′-239KA, CTGGTACCGGGCCCAGAACAACCATGG, and was ligated to the 5′ KpnI site of −124EKLFCAT to make (5′-239)−124EKLFCAT. Similarly, the −696 to −574 region was amplified with primers 3′-239KA, CTGGTACCGGGCCCCTACCTGATAG, and 3′-239KB, CTGGTACCAGATCTGCAGTTCTTACTCTCCC, and also ligated to the 5′ KpnI site of −124EKLFCAT to give (3′-239)−124EKLFCAT. This construct has a 5′ ApaI site and a 3′ BglII site designed into the primers. These sites were cut, and the double-stranded oligomers GEG, GᴹEG, GEGᴹ, GEᴹG, GᴹEᴹGᴹ, and Sp (see Fig. 5) were ligated. All constructs were produced in or transferred to a modified pBKCMV vector (Stratagene, Inc., La Jolla, CA) that had been cut with NsiI and NheI, blunted, and self-ligated to eliminate the cytomegalovirus promoter. The constructs were linearized at the MluI site for use in stable transfections.
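The restriction sites said to be designed into the PCR primers can be located by a plain substring search, since all of the enzymes involved recognize palindromic hexamers. This is a small illustrative check under that assumption; the primer sequences are copied from the paragraph above with line-break hyphens removed, and the helper name `sites_in` is hypothetical.

```python
# Standard recognition sequences for the enzymes named in the text.
RECOGNITION_SITES = {
    "KpnI": "GGTACC",
    "ApaI": "GGGCCC",
    "BglII": "AGATCT",
    "PstI": "CTGCAG",
    "NcoI": "CCATGG",
}

# Primers from the text (hyphens introduced by line breaks removed).
PRIMERS = {
    "5'-239K": "CTGGTACCTTTTGCTAAACAGCTCAG",
    "5'-239KA": "CTGGTACCGGGCCCAGAACAACCATGG",
    "3'-239KA": "CTGGTACCGGGCCCCTACCTGATAG",
    "3'-239KB": "CTGGTACCAGATCTGCAGTTCTTACTCTCCC",
}


def sites_in(seq):
    """Return the sorted names of enzymes whose site occurs in seq."""
    return sorted(name for name, site in RECOGNITION_SITES.items() if site in seq)


primer_sites = {name: sites_in(seq) for name, seq in PRIMERS.items()}
for name, found in primer_sites.items():
    print(name, found)
```

All four primers carry a KpnI site for cloning; 3′-239KA additionally carries ApaI and 3′-239KB carries BglII, in agreement with the 5′ ApaI and 3′ BglII sites described for the (3′-239)−124EKLFCAT construct.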
Nucleotide Sequence Analysis-Genomic clones were sequenced with a model 377 DNA Sequenator and Taq Dye Deoxy sequencing protocol (Applied Biosystems, Inc., Foster City, CA). The nucleotide sequence of all subclones was also confirmed in a similar manner. Primers for this sequence analysis and other applications in this study were produced with a model 394 synthesizer (Applied Biosystems, Inc., Foster City, CA).
Library Screen-A strain 129 mouse genomic library was obtained (23) that had been constructed using Lambda DASH vector (Stratagene, Inc., La Jolla, CA). The library was then screened with a reverse transcriptase-PCR product from nucleotides 852 to 1234 of the EKLF cDNA using MEL BB88 cell RNA as a template. Filters were washed with 0.5× SSC and 0.1% SDS at 65°C. Sixty-seven positive clones were selected, of which 29 contained EKLF sequence.
Transgenic Mice-Transgenic mice were generated in strain FVB/N by injection of purified (Qiagen, Inc., Santa Clarita, CA) −1150EKLFβgal fragment into the male pronucleus (24). The putative transgenic animals were screened by PCR. Southern blot analysis was also performed with genomic DNA from F1 animals of each line to determine the transgene copy number and to ensure the transgene was intact and free of rearrangements.
β-Galactosidase Assays-Tissue extracts from adult transgenic animals were prepared by homogenizing with a Brinkman Polytron model PT3000 (Brinkman Instruments, Westbury, NY) followed by three cycles of freezing on dry ice and thawing in a 37°C water bath. β-Galactosidase assays were then carried out as described previously (25). For the in situ analysis, embryos were isolated at day 11.5 post-coitus, and whole-mount staining was performed as described (26).
Transfections and CAT Assays-Mouse erythroleukemia (MEL) BB88 cells were transfected using a Bio-Rad GenePulser (Bio-Rad) set at 350 V and 960 microfarads. Approximately 5 × 10⁷ cells per 800 µl of serum-free RPMI media were used per transfection with a total of 45 µg of DNA. In transient assays, this 45 µg was composed of 30 µg of the CAT reporter plasmid and 15 µg of an SV40-luciferase control plasmid (Promega Corp., Madison, WI). Stable transfections were split into three pools and selected with G418 (Life Technologies, Inc.) at 1200 µg/ml for 4 days, then maintained on 400 µg/ml. CAT assays from stables were performed with 25 µg of protein for 1 h at 37°C as described previously (27). Transient CAT assays utilized the entire transfection for 3 h at 37°C and were normalized with luciferase activity. Protein concentrations were determined using a bicinchoninic acid assay (Pierce).
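The normalization of transient CAT activity to the co-transfected luciferase control can be illustrated with a short calculation. The exact arithmetic is not spelled out in the text, so this sketch assumes a simple ratio of CAT conversion to luciferase activity, rescaled so that a chosen reference construct equals 100 (as done for the figures). The function name and all numerical readings are hypothetical.

```python
def relative_cat_activity(readings, reference):
    """Normalize CAT conversion to luciferase, then scale reference to 100.

    readings maps construct name -> (percent CAT conversion, luciferase units).
    Assumption: normalization is a plain CAT/luciferase ratio.
    """
    ratios = {}
    for name, (cat, luciferase) in readings.items():
        if luciferase <= 0:
            raise ValueError(f"luciferase reading for {name} must be positive")
        ratios[name] = cat / luciferase
    ref = ratios[reference]
    return {name: 100.0 * ratio / ref for name, ratio in ratios.items()}


# Hypothetical readings, for illustration only (not data from this study).
readings = {
    "-1150EKLFCAT": (40.0, 2.0),
    "-124EKLFCAT": (12.0, 3.0),
}
relative = relative_cat_activity(readings, "-1150EKLFCAT")
for name, value in relative.items():
    print(f"{name}: {value:.1f}")
```

Dividing by the luciferase signal corrects each sample for transfection efficiency before constructs are compared, which is why a reporter with a higher raw conversion can still score lower after normalization.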
Electrophoretic Mobility Shift Assays-Nuclear extracts were prepared as described (28). All EMSAs were carried out in a total volume of 30 µl containing 4 µg of poly(dI-dC), 2 × 10⁵ cpm end-labeled probe, and 10 µg of nuclear extract. The GATA-1 binding was carried out under ionic conditions described by Lam and Bresnick (29), and the E box binding was performed under ionic conditions described by Prasad et al. (30). Reactions were carried out at room temperature by incubating 15 min without probe and 15 min after the addition of the probe. The supershifts were carried out under the conditions described above for GATA-1 binding with 0.1 µg of GATA-1 rat monoclonal antibody (Santa Cruz Biotechnology, Santa Cruz, CA) added at the time of the probe addition. The DNA-protein complexes were separated on 5% polyacrylamide gels at 120 V for 3-4 h.
Nucleotide Sequence Accession Number-The EKLF genomic sequence diagrammed in Fig. 1 has been deposited in GenBank™.

The EKLF Promoter Directs Erythroid-specific Expression in Transgenic Mice-The mouse EKLF gene was isolated from a genomic phage library using a fragment from the zinc finger region as a hybridization probe. Approximately 5.6 kb of genomic DNA was subjected to nucleotide sequence analysis; this includes 1.6 kb of 5′-flanking sequence, the 3.3-kb transcription unit, and approximately 0.7 kb of 3′-flanking region. A schematic diagram of the EKLF genomic organization is shown in Fig. 1, panel A. The gene consists of 3 exons, and a summary of the important structural features of the EKLF gene, along with the intron/exon boundary sequences, is included in panel B of Fig. 1.
For the initial characterization of the promoter activity, we sought to demonstrate erythroid-specific expression of a heterologous reporter gene driven by EKLF promoter sequences. A construct was prepared using 1150 bp 5′ of the EKLF transcription start site attached to a β-galactosidase gene. Lines of transgenic mice were established expressing this construct. Tissues, including blood, brain, lung, kidney, spleen, and testis, were collected from adult animals and assayed for β-galactosidase activity. Expression of the reporter gene was only detected in the blood samples from these mice. Bone marrow is a more active hematopoietic organ in mice (31) compared with the spleen, accounting for the lack of expression in young adult spleen. The 1150-bp region of the mouse EKLF promoter is thus sufficient to confer erythroid tissue-specific expression to a heterologous gene. Moreover, we mated these F1 transgenic mice to produce timed pregnancies and collected embryos at 11.5 days of gestation. The embryos were fixed and stained for β-galactosidase activity. Representative non-transgenic and transgenic embryos and yolk sac sections are shown in Fig. 2. β-Galactosidase activity is prominent throughout the embryonic circulation and in the fetal liver. Although we did observe background staining of yolk sac tissue in the non-transgenic animals, the circulating blood is clearly red in these yolk sacs and blue in the yolk sacs from transgenic embryos. Thus the 1150-bp EKLF promoter directs both tissue and developmentally specific expression of the reporter gene.
Identification of an Enhancer Element in the Distal EKLF Promoter-To investigate further the regulatory elements in the EKLF promoter important for erythroid-specific expression, varying lengths of the 5′-flanking sequence have been attached to a CAT reporter gene. These constructs were introduced into MEL BB88 cells. As shown in Fig. 3, an approximately 5-fold increase in CAT activity is observed with the −1150 construct compared with our minimal promoter. We chose −124 as our basal element since Crossley et al. (19) had previously shown the presence of a critical GATA site in this proximal promoter region and indicated that this was the principal cis element within 353 bp of the transcription start site. The fold enhancement we are reporting therefore corresponds to elevated expression compared with this 124-bp promoter with an active GATA-binding site. Comparison of the −1150 construct containing this newly described distal enhancer activity with a −77 EKLF promoter reported previously (19) in which the GATA site has been mutated results in an approximately 80-fold enhancement over this background value. We will return to a direct comparison with this proximal GATA site in a later experiment.
In addition, a non-erythroid cell line, mouse 3T3 fibroblasts, was also transfected with this series of EKLF-CAT constructs as a control to distinguish basal versus erythroid-specific expression. No expression has been observed with any EKLF-CAT construct in this non-erythroid cell line, indicating the tissue specificity evident in the transgenic mouse studies was recapitulated in our in vitro tissue culture system. Importantly, an internal 353-bp deletion in the −1150 construct severely compromises expression from the EKLF promoter. The region missing in the Δ−928/−575 construct corresponds to a 353-bp PstI fragment. The nucleotide sequence for this region is shown in Fig. 4, panel A. We performed a computer search of this region against a data base of transcription factor consensus binding sites and discovered three Sp1 sites, two GATA sites, an E box motif, a c-myb site, and an EKLF consensus site. In order to test the functional significance of these putative transcription factor binding sites, we prepared a series of CAT constructs with subfragments from this region attached to our basal −124 EKLF promoter. The designation of these clones is outlined in Fig. 4, panel B. This assay was developed as a complement to the loss-of-function analysis carried out with the −1150 internal PstI deletion. An upregulation of activity from the EKLF −124 promoter upon inclusion of these sequences would indicate that the activator function can be added back to this basal promoter. As shown in Fig. 4, panel C, the presence of the 239-bp fragment results in a 9-fold increase in CAT activity, whereas the 114-bp fragment has no effect. This indicates that the 239-bp fragment accounts for all the activity in the element defined by deletion analysis and that this element can be moved relative to the transcription start site and still maintain its functional activity. The 239-bp fragment was subsequently further divided at the ApaI site and analyzed in a similar manner. The majority of the enhancer activity resides in the 3′-half of this fragment (Fig. 4, panel C).
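The fragment names used here follow from the promoter coordinates. As a small worked check, assuming that fragment length is taken as the difference between the two endpoint coordinates (the convention implied by the 353-bp PstI fragment spanning −928 to −575), the coordinates given under "Experimental Procedures" reproduce the 114-bp and 239-bp designations:

```python
def fragment_length(upstream, downstream):
    """Length in bp between two coordinates relative to the start site.

    Assumption: length is the simple coordinate difference.
    """
    return downstream - upstream


# Coordinates taken from the text.
pst_fragment = fragment_length(-928, -575)   # internal PstI deletion
pst_hinf = fragment_length(-928, -814)       # PstI-HinfI subfragment
hinf_pst = fragment_length(-814, -575)       # HinfI-PstI subfragment

print(pst_fragment, pst_hinf, hinf_pst)
print("subfragments sum to deletion:", pst_hinf + hinf_pst == pst_fragment)
```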
All Three Sites in the GATA-E Box-GATA Motif Define the EKLF Enhancer Element-We noted with interest the clustering of the GATA and E box motifs and considered that alone or in combination these elements may be responsible for the erythroid-specific activation of the EKLF-CAT constructs. Double-stranded oligonucleotides flanked by restriction enzyme sites to allow easy insertion 5′ of the −124 EKLF minimal promoter were therefore synthesized. Initially, two sequences, designated GEG and Sp and shown in Fig. 5, panel A, were tested. GEG is a 49-bp oligomer and includes the GATA-E box-GATA configuration of sites. The Sp oligomer corresponds to the next 37 nucleotides of sequence and includes the Sp1 consensus site. As illustrated in Fig. 5, panel B, the GEG oligomer completely replicated the 5-8-fold enhancement observed with the 3′-239 control construct, whereas the Sp oligomer had no effect. Oligomers with point mutations in each of the GATA sites alone or in combination were subsequently synthesized. Additionally, double-stranded oligomers were prepared in which the core E box sequence was replaced with a restriction enzyme site. Finally, an oligomer was constructed with all three sites disrupted. Mutation of either GATA site or the E box abrogated the enhancer activity of this element. Moreover, we did not observe an additive effect whereby mutation of the GATA sites might have eliminated half the activity of this distal element, and alteration of all three sites would drop the activity to base line. Rather, disruption of either of the GATA sites or the E box was sufficient to abolish the activator effect. Multiple binding sites, including the E box motif and at least one of the GATA sites, are therefore necessary for the activity of this erythroid-specific enhancer element, which appears to function as a unit.
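The distinction drawn here between additive sites and sites that function as a unit can be made concrete with two toy models. This is only an illustrative sketch: the basal and full fold values are hypothetical placeholders (the text reports a 5-8-fold enhancement), and the equal-share additive model is an assumption chosen for contrast, not something fitted to the data.

```python
def additive_fold(intact, basal=1.0, full=6.0):
    """Each intact site contributes an equal share of the enhancement."""
    share = (full - basal) / len(intact)
    return basal + share * sum(intact)


def unit_fold(intact, basal=1.0, full=6.0):
    """Enhancement appears only when every site is intact."""
    return full if all(intact) else basal


# Site order: (5' GATA, E box, 3' GATA); True means the site is intact.
variants = {
    "GEG": (True, True, True),
    "GmEG": (False, True, True),
    "GEmG": (True, False, True),
    "GEGm": (True, True, False),
    "GmEmGm": (False, False, False),
}

predictions = {
    name: (additive_fold(sites), unit_fold(sites))
    for name, sites in variants.items()
}
for name, (additive, unit) in predictions.items():
    print(f"{name}: additive={additive:.2f} unit={unit:.2f}")
```

Under the additive model a single-site mutant would retain about two-thirds of the enhancement, whereas the unit model predicts a drop to the basal level for every mutant. The reported result, in which any single mutation abolishes the activator effect, corresponds to the second pattern.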
These experiments demonstrated that the activator could be removed from its normal position and still elevate expression from our basal EKLF promoter and that both a GATA and an E box site are important for this enhancer activity. It is possible, however, that placing this 49-bp element in the proximal promoter region allowed expression of an activity that was normally regulated by surrounding sequences, and thus our constructs did not accurately represent the impact of this element in the typical expression from the EKLF promoter. The GATA point mutations and E box replacement mutation were therefore moved into the −1150 EKLF-CAT construct to test the effect of these small alterations in the wild-type context. In this series of experiments, the effect of mutating each site individually and in combination was examined. The analysis of CAT activity from stably transfected MEL cells is summarized in Fig. 6. A mutation in any single site, either the 5′- or 3′-GATA sites (GᴹEG or GEGᴹ) or the E box motif (GEᴹG), is sufficient to reduce the EKLF promoter activity to the level observed with the −928 to −575 deletion that originally defined this distal enhancer. Combinations including alterations in both GATA sites (GᴹEGᴹ) or all three factor binding sites (GᴹEᴹGᴹ) gave levels comparable to a single site mutation. All three sites thus appear to be required for the function of this distal activator.
Although the significance of these GATA-binding sites in the regulation of expression from the EKLF promoter is a novel observation, an additional GATA site in the proximal promoter region has also been shown to be important for EKLF promoter activity, as mentioned previously (19). Since the studies addressing this proximal GATA site had been carried out using constructs with only 77 bp of 5′-sequence, we wished to test the contribution of this site in the context of our −1150 EKLF-CAT reporter. A point mutation was therefore made in this GATA site, and the activity of this −1150(−60Gᴹ) construct was compared with the single GATA site mutations, −1150(GᴹEG) and −1150(GEGᴹ). The normalized CAT activity from cellular extracts after stable transfection in MEL cells is shown in Table I. Clearly, all three GATA sites are critical for expression of the EKLF promoter when considered in the context of this extended 5′-promoter sequence. That is, even in the presence of an intact GATA-E box-GATA distal activator, only minimal transcriptional activity is observed if the proximal GATA site is altered. Similarly, in the framework of the 1150-bp fragment, this proximal site by itself is insufficient to direct expression of the EKLF gene.

FIG. 2. Transgenic mice carrying the EKLF promoter driving the β-galactosidase gene express β-galactosidase activity in hematopoietic tissue in a developmentally specific manner. A construct consisting of 1150 bp of the EKLF promoter attached to a β-galactosidase reporter gene was used to establish lines of transgenic mice. Embryos and yolk sacs were collected at 11.5 days of gestation, fixed, and stained for transgene activity. Photographs of the non-transgenic and transgenic embryos are shown. β-Galactosidase activity was observed in the fetal liver and throughout the circulation. Arrows highlight the color change in the yolk sac circulation.

FIG. 3. Deletion constructs of the EKLF promoter region driving CAT expression in erythroid cells identify a distal activator at −928 to −575. Constructs were transfected into MEL cells by electroporation as described under "Experimental Procedures." The percent conversion of substrate to product was normalized for a minimum of three experiments per construct. Activity of the 1150-bp construct was arbitrarily set at 100 for comparison purposes.

Multiple DNA-Protein Complexes Form on the Intact GATA-E Box-GATA Motif-The previous experiments establish the functional importance of the GATA-E box-GATA motif as a regulatory element in the EKLF promoter. The fact that a mutation in any single site eliminated the enhancer effect suggested that several factors could be involved in the formation of a larger protein complex, each potentially contributing to the stability through contact with its DNA-binding site. We have begun investigating the character of this putative complex with electrophoretic mobility shift assays (EMSAs) using a nuclear extract from MEL cells. With the wild-type GEG double-stranded oligomer as a probe, we regularly observe three specific complexes, labeled A, B, and C in panel A of Fig. 7. Occasionally, slower mobility complexes are also detected, but not consistently under the particular buffer and temperature conditions used for this assay. The three major complexes can be specifically competed by the wild-type oligomer (GEG) or by any of the single mutation oligomers as shown in Fig. 7, panel A. The common factor with all these productive competitors is the presence of at least one intact GATA site.
Although a single GATA site may be all that is required to disrupt the complexes from the GEG oligomer, the formation of the A and B complexes relies on the presence of multiple sites. In panel B of Fig. 7, the binding of MEL cell nuclear factors to the wild-type GEG oligonucleotide is compared with complex formation on the probes with mutations in either the 5′- or 3′-GATA sites. The particular G → A/C mutations at these putative GATA-binding sites were prepared because previous studies (32) had shown that these nucleotide changes would prevent binding by the GATA class of transcription factors.
Although all three complexes, A-C, are observed with the GEG oligomer, only the C complex is formed when the probe carries a point mutation in one of the GATA sites. One interpretation of this result is that the C complex represents binding by a single GATA protein, and the A and B complexes contain multiple factors. The binding at each GATA site is not necessarily specific to the particular context of that GATA sequence, however. That is, binding to the GᴹEG oligomer can be competed with the GEGᴹ oligomer and vice versa. Nevertheless, we have noted that the GEGᴹ oligomer appears to be less efficient in this binding reaction than either the GᴹEG or the wild-type GEG oligonucleotide.
Whereas the results from these binding assays confirmed the importance of the GATA sites in this element, the role of the E box was not addressed. A mutation in the E box motif abolished the enhancer activity, yet a specific protein-DNA complex could not be assigned to this site since the GATA sites appeared to be driving the binding activity in our assays. Therefore, as a means of focusing on the potential E box-binding protein, the GᴹEGᴹ oligonucleotide was labeled and used as probe in the EMSA shown in Fig. 8. With mutations in the flanking sites precluding binding by GATA factors, a complex was formed with the central intact E box. Specificity was demonstrated by competition with the wild-type but not the mutated E box in the GATA-E box-GATA oligomers. These experiments taken together provide evidence for the formation of DNA-protein complexes at all three sites in the GATA-E box-GATA motif.

FIG. 6. Mutation of any single site in the GATA-E box-GATA enhancer eliminates the increased transcriptional activity observed with the −1150 EKLF promoter construct. The double-stranded oligonucleotides depicted in Fig. 5, panel A, were used to replace the wild-type sequence in the −1150 EKLF-CAT construct. The plasmids were stably introduced into MEL cells, and pools of transfected cells were assayed for CAT activity. The values are normalized with the activity of the wild-type 1150-bp construct arbitrarily set at 100.

FIG. 8. A nuclear MEL cell factor binds the E box motif in the GATA-E box-GATA distal element. A mutant oligomer in which both GATA sites were altered to eliminate binding by the GATA family of factors was labeled and used as a probe in a mobility shift assay. A protein-DNA complex, indicated by the arrow, was observed. The complex was specifically competed by the addition of unlabeled probe but not by the inclusion of an oligomer containing a mutant E box sequence. Two nonspecific bands (NS) were also noted.

GATA-1 Is a Component in the Complexes That Bind the EKLF Distal Enhancer-With the important features of this cis element defined, our studies have now shifted to the identification and characterization of the trans-acting factors that interact with this interesting configuration of binding sites. We have carried out supershift assays using antibodies to GATA-1 and two E box-binding proteins. Concerning this latter site, we were unable to demonstrate a reproducible effect using antibodies to either the ubiquitous E2A proteins or antibodies generated against the Tal1 protein. Our studies using GATA-1 antibodies were more instructive, and the results are illustrated in Fig. 9. In EMSA reactions with the wild-type GEG oligomer used as the probe, addition of the GATA-1 antibody resulted in the formation of several slower mobility complexes with a concomitant disappearance of the A-C complexes evident in the 1st lane. Addition of a control, non-immune antibody or an antibody directed against another, unrelated protein did not affect the mobility of any of these bands. Furthermore, with a probe containing only one intact GATA site (e.g. GᴹEG), the single C complex is also shifted by the GATA-1 antibody. Therefore, all three DNA-protein complexes observed with the GATA-E box-GATA motif include GATA-1 as a component.

DISCUSSION
The EKLF Promoter Specifies Adult, Erythroid-specific Expression-A 1150-bp genomic fragment containing the EKLF promoter was shown to drive erythroid-specific and developmentally correct expression as analyzed in transgenic mice. In addition, we have transfected K562 cells with these EKLF-CAT constructs. The K562 cell line represents an earlier stage in erythroid development as compared with MEL cells. The reporter constructs are expressed in the K562 transfectants, but the levels are 40-50-fold lower than the MEL cell expression when the results are corrected for transfection efficiency between the two cell lines.² This is consistent with the developmental pattern both of the reporter construct in transgenic mice and the endogenous EKLF gene. This promoter may therefore represent an avenue for producing adult, erythroid-specific expression in mice.
A GATA-E Box-GATA Enhancer Motif Resides in the Distal EKLF Promoter-This study describes the identification of an interesting configuration of binding sites in the EKLF distal promoter region that functions as a unit to elevate the transcriptional activity. The 49-bp element consists of a GATA-E box-GATA arrangement of consensus binding sites. The distal activator was functionally defined by deletion constructs in which the presence of the element consistently produced a 5-8-fold increase in reporter gene activity. Mutational analyses were carried out to demonstrate the requirement for all three binding sites, i.e. both GATA sites and the E box-binding motif. A mutation at any individual site abolishes the transcriptional activation.
Potential Assembly of a Multimeric Complex at the GATA-E Box-GATA Enhancer Element-Our results suggest the potential formation of, at the minimum, a quaternary protein complex containing two GATA-1 transcription factors and an E box dimer. These data can be considered in light of a recent paper from Wadman et al. (22) describing a large multiple protein, erythroid-specific complex in MEL nuclear extracts. These authors utilized a CASTing (cyclic amplification and selection of targets) procedure (33) to screen for preferred nucleotide sequences binding the protein of interest, Lmo2/rbtn2, in a complex with other factors. This experiment yielded a consensus site consisting of an E box and a GATA-binding site separated by 8-10 bp. This arrangement exactly matches the 3′ two-thirds of our distal activator element. The authors propose a model involving a complex of GATA-1 and a heterodimer of Tal1 and E2A bridged by Lmo2, with the newly described Ldb1 also included through an association with Lmo2. This model is based in part on previous studies demonstrating a physical interaction between GATA-1 and Lmo2 (21) and Lmo2 and Tal1 (20). If we are detecting the same type of multimeric protein structure in our studies, this would be an indication of a functional role for this complex as a transcriptional activator for erythroid-specific expression. Although the similarity in sequence configuration is compelling, there are some differences that suggest the EKLF complex may be comparable but not identical to that described by Wadman et al. (22).
One principal difference is the requirement for all three sites in our element. The 5′-GATA-binding site contributes to the functional activity of this enhancer to the same degree as the 3′-GATA- or E box-binding sites. The failure to obtain a CASTing sequence with all three sites may simply be due to the fact that the oligonucleotide strings used in the procedure were composed of 26 random nucleotides, whereas the EKLF element we have described spans 49 bp.
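The spacing constraint discussed above (an E box followed by a GATA-binding site 8-10 bp downstream) can be illustrated with a short motif scan. This is only a minimal sketch and not the procedure used in this study or by Wadman et al. (22): it assumes the generic E box consensus CANNTG, reduces the GATA site to its core (GATA, or TATC on the opposite strand), and measures the gap from the end of the E box to the start of the GATA core. The function name `find_ebox_gata` and the test sequence are hypothetical and are not taken from the EKLF promoter.

```python
import re

# Assumptions (not from the paper): generic E box consensus CANNTG, and a
# GATA site reduced to its core on either strand (GATA or TATC).
EBOX = "CA[ACGT]{2}TG"
GATA = "(?:GATA|TATC)"


def find_ebox_gata(seq, min_gap=8, max_gap=10):
    """Return (start, end, gap) for each E box that is followed by a GATA
    core min_gap..max_gap bp downstream. Coordinates are 0-based and end is
    exclusive; gap is counted from the end of the E box to the start of the
    GATA core. For each E box the shortest allowed gap is reported."""
    seq = seq.upper()
    # A lookahead lets overlapping candidates be found; the lazy quantifier
    # makes the spacer as short as possible.
    pattern = re.compile(
        rf"(?=({EBOX})([ACGT]{{{min_gap},{max_gap}}}?)({GATA}))"
    )
    return [(m.start(1), m.end(3), len(m.group(2)))
            for m in pattern.finditer(seq)]


# Made-up sequence: E box (CAGCTG), a 9-bp spacer, then a GATA core.
example = "TT" + "CAGCTG" + "A" * 9 + "GATA" + "CC"
hits = find_ebox_gata(example)
```

A hit of this kind spans roughly 20 bp, so it fits within a 26-nucleotide random window, whereas the full 49-bp GATA-E box-GATA element described here could not.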
The identity of the factor binding to the E box in the EKLF activator element is still under investigation. Based on recent evidence indicating interactions between GATA-1, Lmo2/rbtn2, and Tal1 (20-22), and the fact that Tal1 is an erythroid-specific E box factor, we have tested for the presence of Tal1 in our electrophoretic mobility shift assays. The inclusion of antibodies against either Tal1 or E2A, a known heterodimer partner for Tal1 (15), did not produce a supershift in any of the DNA-protein bands in our assays. Whereas we can detect binding at the E box site (see Fig. 8), we have noted that the nucleotide sequence matches neither the consensus Tal1-binding site (CAGATG) (34) nor the non-standard Tal1 site (CAGGTG) described by Wadman et al. (22). One possibility is, therefore, that an additional E box binding factor is involved. Alternatively, in this in vitro system, the binding conditions that favor binding of one protein or complex may be unsuitable for other constituents. Additional variations in buffer, temperature, and gel conditions may therefore be necessary to form and stabilize what is potentially a very large protein-DNA complex. Nonetheless, the functional activity observed in the transfected cells and the evidence for nuclear factor binding at all three sites indicate that the characterization of these binding factors will be an important avenue of continuing investigation.
FIG. 9. GATA-1 is a component in the complexes that bind the GATA-E box-GATA motif in the EKLF distal activator. GATA-1-specific antibodies were included in a binding assay with either the wild-type or the 5′-GATA mutant oligonucleotide probe. In each instance, the specific complexes were supershifted in the presence of the antibody. In the case of the wild-type oligomer, GEG, three complexes (A, B, and C) are affected; a single DNA-protein complex (C) is shifted in the mutant GᴹEG assays. A nonspecific band (NS) is evident in all lanes and is unaffected by the antibody.
2 K. P. Anderson, unpublished results.
Finally, an analysis of the complex formation at this distal site must consider the likelihood of protein-protein associations in addition to direct DNA binding interactions. The establishment of GATA-1 binding at this site presents intriguing possibilities in light of the expanding list of factors shown to physically associate with GATA-1. Previous studies have demonstrated interactions between GATA family members (35,36) and between GATA-1 and EKLF or Sp1 (18). For example, a factor termed FOG was recently isolated based on its ability to bind GATA-1 (37). Although a functional role has not yet been assigned to this protein, its expression pattern mirrors that of GATA-1 and it may generally be associated with GATA in vivo.
Thus, the view of transcriptional activation is moving away from simple models in which factors bind to their cognate sites and individually activate transcription through an interaction with a component of the basal transcription machinery. More intricate mechanisms are emerging in which either a protein binds to DNA and subsequently recruits additional factors into a multiprotein complex or, alternatively, the complex assembles in the nucleus and binds en bloc to a series of binding sites. With the identification of many important transcription factors active in this system and the recent evidence for multiple associations between these factors, studies of erythroid-specific gene regulation are expected to continue to contribute significantly to our general understanding of transcriptional mechanisms.