Multiple Regulatory Elements in the 5′-Flanking Sequence of the Human ε-Globin Gene*

We have previously reported, on the basis of transfection experiments, the existence of a silencer element in the 5′-flanking region of the human embryonic (ε) globin gene, located at −270 base pairs 5′ to the cap site, which provides negative regulation for this gene. Experiments in transgenic mice suggest the physiological importance of this ε-globin silencer, but also suggest that down-regulation of ε-globin gene expression may involve other negative elements flanking the ε-globin gene. We have now extended the analysis of ε-globin gene regulation to include the flanking region spanning up to 6 kilobase pairs 5′ to the locus control region using reporter gene constructs with deletion mutations and transient transfection assays. We have identified and characterized other strong negative regulatory regions, as well as several positive regions that affect transcription activation. The negative regulatory regions at −3 kilobase pairs (εNRA-I and εNRA-II), flanked by a positive control element, have a strong effect on the ε-globin promoter in both erythroid K562 and nonerythroid HeLa cells and contain several binding sites for transcription factor GATA-1, as evidenced by DNA-protein binding assays. The GATA-1 sites within εNRA-II are directly needed for negative control. Both εNRA-I and εNRA-II are active on a heterologous promoter and hence appear to act as transcription silencers. Another negative control region located at −1.7 kilobase pairs (εNRB) does not exhibit general silencer activity, as εNRB does not affect transcription activity when used in conjunction with an ε-globin minimal promoter. The negative effect of εNRB is erythroid-specific but not stage-specific, as it can repress transcription activity both in K562 erythroid cells and in primary cultures of adult erythroid cells.
Phylogenetic DNA sequence comparisons with other primate and other mammalian species show an unusual degree of flanking sequence homology for the ε-globin gene, including in several of the regions identified in these functional and DNA-protein binding analyses, providing alternate evidence for their potential importance. We suggest that the down-regulation of ε-globin gene expression as development progresses involves complex, cooperative interactions of these negative regulatory elements, εNRA-I/εNRA-II, εNRB, the ε-globin silencer, and probably other negative and positive elements in the 5′-flanking region of the ε-globin gene.

The expression of the individual genes of the human β-globin cluster is regulated in both a developmental and a tissue-dependent manner. The developmental "switches" in expression follow the sequential arrangement of the globin genes, beginning at the 5′ region of the gene cluster and including the five active ε-, Gγ-, Aγ-, δ-, and β-globin genes (1). The effort to understand the mechanism of hemoglobin switching has focused on localizing the cis-acting DNA sequence elements that are involved in regulating globin gene expression, and on identifying and characterizing the transcription factors or proteins that bind to those DNA motifs, or related proteins (2,3). Each globin gene and its immediate flanking region appear to contain sufficient information for developmentally correct expression, as suggested by transgenic mouse experiments (4-7). Phylogenetic footprinting has been used to identify evolutionarily conserved regions and other potential protein binding sites in the globin gene cluster (8-10). Located at the distal 5′ region of the β-globin cluster, upstream of the embryonic ε-globin gene, are the DNase I hypersensitive sites (HS 1 to HS 5) of the locus control region (LCR) (6-13 kb 5′) that are important in controlling transcription and replication of the β-globin cluster. The proposed role of the LCR in developmental regulation is controversial. Studies in transgenic mice show that linkage of the LCR to individual globin genes results in much higher expression in vivo, and an apparent alteration in the developmental specificity of the γ- and β-globin genes, depending on proximity and arrangement of the transgene (11-13). In contrast, developmental specificity of expression of the human ε-globin gene appears to be more autonomous and does not require a particular arrangement with respect to the fetal γ- or adult β-globin genes.
DNA constructs lacking the LCR show developmental switching of globin genes in transgenic mice, showing that the LCR is dispensable for developmental regulation, at least in this assay.
We have previously identified an ε-globin gene silencer (εGS), using reporter gene transfection assays, in vitro transcription, and DNA-protein binding assays, located in the region between −300 bp and −250 bp 5′ to the ε-globin gene cap site (14-16). The potential biological significance of the silencing activity of εGS was supported by in vivo studies using transgenic mice (7,17,18). Additional studies have revealed other cis-acting regulatory elements further 5′ to the ε-globin gene (9,20,21), including a positive regulatory element located at −700 bp and a negative regulatory element located at about −400 bp. In general, the 5′ region of the ε-globin gene provides much of the activity for developmental regulation of ε-globin gene expression, as evidenced from transgenic mouse studies (7). However, the expression of limited levels of the human ε-gene (5-10% of the mouse εy or β) with constructs in which the silencer has been mutated (18) suggests that other important negative regulatory elements may exist around the ε-globin gene.
In the present study, we have investigated the functional role of the ε-globin gene 5′-flanking region up to −6 kb, which includes HS 1, and have identified several functionally important cis-elements that markedly affect expression driven by the ε-globin promoter. Construction of serially deleted mutants enabled us to systematically study the positive and negative cis-acting elements involved in ε-globin control. We observed multiple regulatory sequences in this region and focused on several strong negative elements located in the regions around −1.7 and −3.0 kb. In all cases, the negative elements are flanked by positive regulatory regions. These elements contain several DNA-protein binding motifs, including motifs for the erythroid-specific transcription factor GATA-1. DNA sequences in the regulatory region located at −1.7 kb are conserved in all mammals examined, whereas the DNA sequences located at −3.0 kb are present only in the primates orangutan, galago, and human. Our data suggest that, in addition to the εGS and the stage-specific positive element located more proximal to the ε-promoter, expression of the ε-globin gene, including specifically its down-regulation during development, involves multiple positive and negative elements.

MATERIALS AND METHODS
Plasmid Constructions-An ε-globin promoter/reporter gene construct was made by linking human ε-globin gene 5′ sequences, from +46 in the promoter to −6073 bp 5′ of the cap site, to a luciferase reporter gene (LUC)-coding plasmid pGL-Basic (Promega), generating a parent construct pε6073 that includes DNase I HS 1 at about −5 kb. A series of 5′-deletion mutants was made by linearizing pε6073 with SacI and SpeI followed by exonuclease III digestion at 1-min intervals. The ends of the deleted mutants were filled in with the Klenow fragment of DNA polymerase I and self-ligated. A second series of 5′ deletions was made from pε3028 to generate smaller deletion mutants. The 5′ ends of the deletion mutants were determined by dideoxy sequencing.
Cell Culture-The human erythroleukemia K562 and HeLa cells were grown in RPMI 1640 or αMEM medium (Biofluid, Rockville, MD), respectively, supplemented with 10% fetal bovine serum, L-glutamine, and penicillin/streptomycin. Primary human adult erythroid cells (hAEC) were grown in a two-phase liquid culture system as described previously (20). Briefly, mononuclear cells from the peripheral blood of normal donors, isolated on a Ficoll-Hypaque gradient, were grown in α-minimal essential medium with 10% fetal calf serum and 10% conditioned medium collected from 5637 human bladder carcinoma cells (phase I). After 7 days the cells were washed and recultured in liquid medium supplemented with 1 unit/ml recombinant erythropoietin (phase II).
Transient Transfection Assays-Both K562 and HeLa cells were transfected by electroporation with a Gene Pulser (Bio-Rad) at 250 V (220 V for HeLa) and 960 μF with a plasmid DNA amount ranging from 10 to 40 μg. Transfections with hAEC were carried out after 10-11 days of incubation by combining phase II cultured cells from different donors. Transfected cells were collected and lysed after 48 h of incubation, and 20 μl of the cell lysate was used to determine luciferase activity with a Monolight 2010 luminometer (Analytical Luminescence Laboratory, San Diego, CA), in which the substrate D-luciferin was automatically injected. The results are expressed as the average of at least three experiments, with the activity of luciferase normalized to the amount of protein used in each experiment. A construct containing the LUC reporter gene under control of the SV40 promoter was used separately as the positive control to establish a value for promoter activity of 1.0.
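The normalization described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the (light units, μg protein) input format, and the choice to average protein-normalized readings before dividing by the SV40 control are not specified in the text and are hypothetical.

```python
from statistics import mean

def relative_activity(construct_runs, sv40_runs, min_runs=3):
    """Return construct activity relative to the SV40 control (SV40 = 1.0).

    Each argument is a list of (light_units, protein_ug) pairs, one pair per
    independent experiment. The input format and the order of averaging are
    assumptions made for this sketch, not details taken from the paper.
    """
    def per_protein(runs):
        if len(runs) < min_runs:
            raise ValueError("at least three experiments are required")
        # Normalize each luciferase reading to the protein assayed, then average.
        return mean(light / protein for light, protein in runs)

    return per_protein(construct_runs) / per_protein(sv40_runs)
```

With placeholder readings in which every SV40 run gives 100 light units per μg and every construct run gives 50, the function returns 0.5, that is, half the activity of the SV40 promoter.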
In Vitro DNA Footprinting-DNA probes were made by labeling sense primers with [γ-32P]dATP followed by polymerase chain reaction amplification to generate DNA fragments. The probes range from −3198 to −2898 bp 5′ for εNRA-I/εNRA-II and from −1838 to −1588 bp 5′ for εNRB. The labeled probes were purified by SpinBind (FMC, Rockland, ME). The mixtures of probe (20,000 cpm) and nuclear extract (50-100 μg) were incubated for 30 min on ice followed by the addition of DNase I (0.25-0.5 unit) and incubation for 4 min at room temperature. Equal volumes of stop solution containing 400 μg/ml proteinase K were added, and samples were incubated for 30 min at 37°C and 2 min at 70°C. After phenol/chloroform extraction and ethanol precipitation, the DNA samples were dissolved in loading buffer and analyzed on 6% polyacrylamide sequencing gels.
Electrophoretic Mobility Shift Assays-Gel shift studies were carried out by annealing a pair of oligonucleotides labeled with [γ-32P]dATP, followed by SpinBind (FMC, Rockland, ME) gel purification. The reactions were carried out on ice for 30 min in a 15-μl total volume and loaded onto a 4% polyacrylamide gel. In competition experiments, an unlabeled probe or the same fragment carrying the mutation was included in the reactions at a 12.5- to 100-fold molar excess, as indicated. Oligonucleotide sequences for gel shift are as follows (the mutant differs from the wild type by the substitution TATCT to GCGCC): εNRA-II-1G: 5′-CCCAG AGCTG TATCT TAATTGT; εNRA-II-Δ1G: 5′-CCCAG AGCTG GCGCC TAATTGT.
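As a quick check on the two oligonucleotides listed above, the following Python sketch scans both strands for the core GATA tetranucleotide and lists the positions at which the sequences differ. The function and constant names are hypothetical, and matching only the four-base core (rather than a longer GATA-1 consensus) is a simplifying assumption of this sketch.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def find_core_sites(seq, core="GATA"):
    """Return (strand, start) for each core match; start is 0-based on the given strand.

    A "-" hit means the core lies on the complementary strand, i.e. its
    reverse complement (TATC for GATA) appears in the given sequence.
    """
    seq = seq.replace(" ", "").upper()
    rc_core = reverse_complement(core)
    hits = []
    for i in range(len(seq) - len(core) + 1):
        window = seq[i:i + len(core)]
        if window == core:
            hits.append(("+", i))
        if window == rc_core:
            hits.append(("-", i))
    return hits

def differing_positions(seq_a, seq_b):
    """Return 0-based positions at which two equal-length sequences differ."""
    a, b = seq_a.replace(" ", ""), seq_b.replace(" ", "")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]

# Oligonucleotide sequences as listed in the text (5' to 3').
NRA_II_1G = "CCCAG AGCTG TATCT TAATTGT"
NRA_II_D1G = "CCCAG AGCTG GCGCC TAATTGT"
```

Applied to these sequences, `find_core_sites(NRA_II_1G)` reports a single core on the complementary strand, and `find_core_sites(NRA_II_D1G)` reports none, consistent with the mutation removing the GATA motif.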
DNA Sequence Analysis-Pairwise alignments of the DNA sequences from the β-globin gene clusters of human, galago, rabbit, and mouse were computed using the program SIM (21) and displayed as percent identity plots (22). In a percent identity plot, all the gap-free aligning segments in the region of interest are automatically plotted as a series of horizontal lines (each between the coordinates of the human sequence present in a gap-free alignment) placed along the y axis according to the percent identity in each aligning segment. Notable features in the human sequence are also placed along the x axis. The simultaneous alignment of these four DNA sequences was obtained from the Globin Gene Server (http://globin.cse.psu.edu) (23). The region encompassing εNRA in human and the homologous regions from orangutan (EMBL accession no. X05035) and galago (GenBank accession no. U60902) were aligned simultaneously using the program YAMA2 (24). In the displays of the multiple alignments, boxes are drawn around blocks of at least six columns where each column has an identical nucleotide in at least 75% of the positions; this is equivalent to requiring invariant columns for alignments of three sequences.
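The quantity plotted in a percent identity plot can be illustrated with a short Python sketch. It assumes that a pairwise alignment is already available as two equal-length strings with "-" marking gaps; it does not reimplement SIM or the plotting software, and the function name and output format are hypothetical.

```python
def gap_free_segments(aligned_ref, aligned_other):
    """Split a pairwise alignment into gap-free segments.

    Returns a list of (ref_start, ref_end, percent_identity) tuples, where the
    coordinates are 1-based, inclusive positions in the ungapped reference
    (here, the human sequence).
    """
    if len(aligned_ref) != len(aligned_other):
        raise ValueError("aligned sequences must have equal length")

    segments = []
    ref_pos = 0          # reference bases consumed so far
    start = None         # reference coordinate where the current segment began
    matches = length = 0

    def flush():
        if length:
            segments.append((start, start + length - 1, 100.0 * matches / length))

    for a, b in zip(aligned_ref, aligned_other):
        if a != "-":
            ref_pos += 1
        if a == "-" or b == "-":
            flush()                      # a gap in either row ends the segment
            start, matches, length = None, 0, 0
            continue
        if length == 0:
            start = ref_pos
        length += 1
        matches += (a == b)
    flush()
    return segments
```

For the toy alignment `ACGT-ACGT` against `ACGAAAC-T`, the sketch yields three segments: reference positions 1-4 at 75% identity, 5-6 at 100%, and 8-8 at 100%. Each tuple corresponds to one horizontal line in the plot.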

RESULTS
The Presence of Negative Element(s) in the 5′-Flanking Sequences of the Human ε-Globin Gene-The human embryonic epsilon globin (ε) 5′-flanking sequence was linked to the luciferase reporter gene and tested by transient transfection in K562 cells, a human erythroleukemia cell line that expresses embryonic and fetal globin genes. As shown in Fig. 1A, the transcription activity of the ε-promoter in transfected cells, measured as luciferase reporter gene activity, varies greatly with different lengths of 5′-flanking sequence. A high level of activity, 2.5-fold greater than that of the SV40 promoter, was observed for the minimal ε-promoter construct pε177, as expected given the active transcription of the endogenous ε-globin gene in K562 cells. The εGS in the region of −300 to −250 bp (14) and other negative elements located at −419 bp (25) contribute to the lowered reporter gene activity of pε883 when compared with that of the minimal ε-promoter construct (pε177). Extending the 5′ region to encompass HS 1, we find that the transcription activity of pε6073 is 10-fold lower than that of pε883, suggesting the existence of one or more strong negative element(s) in the region from −800 to −6000 bp.
Transcriptional Activity Profile of the ε-Globin Gene Promoter-We have studied the transcriptional activity profile of this region of the ε-globin gene-flanking sequences in detail by constructing a series of deletion mutants extending up to 6 kb 5′ of the human ε-globin gene linked to the luciferase reporter gene. The transcriptional activities of these reporter gene constructs were tested in transient transfection assays in embryonic/fetal erythroid K562 and nonerythroid HeLa cells (Fig. 1A). In K562 cells, transcription activity of the ε-globin gene minimal promoter was comparable with that of SV40, in contrast to HeLa cells, in which the ε-globin minimal promoter activity is only 10% of that of SV40. Analysis of the deletion mutants in these cells revealed several regulatory regions flanking the ε-globin gene 5′, extending from −883 bp to HS 1. A striking feature of the behavior of the reporter gene constructs is that positive regulatory regions are generally flanked by negative regulatory regions, i.e. certain constructs appear as "spikes" in the graph. The two most striking combinations of this type are a pair of positive (εPRA) and negative (εNRA-I/εNRA-II) regions located between −2.8 and −3.1 kb that are active in both K562 cells and HeLa cells, and a pair of positive (εPRB) and negative (εNRB) regions located around −1.7 kb that function only in K562 cells. Another, less potent regulatory pair includes the positive regulatory region between −1995 bp and −1747 bp, flanked on the 5′ side by a negative regulatory region that functions in both K562 and HeLa cells. The positive region between −1084 and −1135 bp and an overall negative region between −1135 and −1460 bp are active only in K562 cells. Additional positive regulatory regions (Fig. 1A) are localized between −2385 and −2772 bp and between −3199 and −3329 bp, which increase transcription activity by about 3-fold in K562 cells, and between −3329 and −3986 bp, which increases transcription activity in HeLa cells.
Other negative regulatory regions that reduce transcription activity are localized between −883 and −1084 bp, −2000 and −2385 bp, and −3986 and −4442 bp, and are active in both K562 cells and HeLa cells. Extending the 5′ region from −4442 to −6073 bp further decreases reporter gene activity in K562 cells.
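The logic used to read a 5′-deletion series (a region is called positive if including it raises reporter activity, and negative if including it lowers activity) can be sketched as follows. The function name, the 1.5-fold cutoff, and the activity values in the usage note are hypothetical placeholders chosen for illustration; they are not measurements from Fig. 1A.

```python
def classify_intervals(constructs, min_fold=1.5):
    """Classify the region between consecutive 5' deletion endpoints.

    `constructs` maps each 5' endpoint (bp upstream of the cap site, as a
    positive integer) to the relative reporter activity of that construct.
    The added region is called "positive" if the longer construct is at least
    `min_fold` more active, "negative" if it is at least `min_fold` less
    active, and "neutral" otherwise. The cutoff is an arbitrary assumption.
    """
    ordered = sorted(constructs.items())
    calls = []
    for (near, act_near), (far, act_far) in zip(ordered, ordered[1:]):
        fold = act_far / act_near
        if fold >= min_fold:
            effect = "positive"
        elif fold <= 1.0 / min_fold:
            effect = "negative"
        else:
            effect = "neutral"
        calls.append((near, far, effect, fold))
    return calls

# Placeholder activities only (NOT data from this study); the endpoints are
# those of the constructs named in the text.
EXAMPLE = {2807: 0.5, 2902: 4.0, 3028: 0.4, 3071: 0.8, 3127: 0.2}
```

With these placeholder values, `classify_intervals(EXAMPLE)` labels the four successive intervals positive, negative, positive, and negative, reproducing the alternating "spike" pattern described above.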
The greatest changes in transcription activity observed in these transient assays are the increases associated with the regions εPRA and εPRB, and the decreases associated with the regions εNRA-I/εNRA-II and εNRB. To further understand the negative regulation of the ε-globin gene, we have focused on the two regions that exhibited a marked decrease in transcription activity in K562 cells, localized at −3 kb (εNRA-I/εNRA-II) and −1.7 kb (εNRB). εNRA-I/εNRA-II are active in both K562 and HeLa cells, while the activity of εNRB is absent in HeLa cells, suggesting that the negative activity of this region is erythroid-specific.
Conserved DNA Sequences in the 5′-Flanking Region of Mammalian ε-Globin Genes-A summary of the results of the deletion series is shown in Fig. 1B (top panel), aligned with graphs of the sequence matches observed in pairwise comparisons of the human sequence with that of other mammals. In these percent identity plots, the percent identity (from 50 to 100%) for each gap-free aligning segment is plotted using the coordinates of the human sequence, and notable features such as exons and interspersed repeats are placed along the horizontal axis (22). Fig. 1B shows the percent identity plots for alignments of the human sequence with that from the prosimian primate galago, from rabbit, and from mouse as three panels, including the region from HS 1 of the LCR through the ε-globin-coding sequence. In general, almost all of the galago sequence aligns with a high similarity to the human sequence. Extensive matches are also seen for comparisons of the human sequence with rabbit and mouse, although a roughly 1.6-kb segment between HS 1 and the ε-globin gene does not match (corresponding to about −4 to −2.4 kb in the human). Matching sequences extending this far 5′ to the gene are not characteristic of all mammalian globin genes. For instance, the 5′-flanking region of the human β-globin gene matches with that of galago to about −3000 bp, and with mouse to about −770 bp (23). The regions delineated in the results of the deletion series as εNRA-I/εNRA-II and εNRB show significant regions of matching in those comparisons. Thus the simultaneous alignment of these sequences is helpful in analyzing this region in more detail, as described below. However, regions comparable to human εNRA-I/εNRA-II and εPRA are found only in orangutan and galago, and only this pairwise alignment is informative, in contrast to greater cross-species matching more proximal to the ε-globin gene itself.
Characterization of εNRB-The tissue specificity of εNRB was further examined by comparison of the two constructs, pε1747 and pε1707, in human adult erythroid primary cells (hAEC) as well as in the K562 and HeLa cell lines (data not shown). The decrease in transcription activity of pε1747 compared with pε1707 is observed in both K562 and hAEC cells but not in HeLa cells, suggesting that εNRB is erythroid-specific. Protein binding to εNRB was studied by in vitro DNase I footprinting with nuclear extracts from both K562 and HeLa cells. Two strongly protected regions were detected only with K562 nuclear extracts (Fig. 2). These footprints are located around −1752 to −1735 bp and −1718 to −1710 bp and overlap with regions that are conserved in the 5′ region of corresponding embryonic globin genes in mouse, rabbit, and galago (Fig. 2, bottom). Interestingly, no significant negative activity is observed when εNRB is linked directly to the ε minimal promoter and tested in either K562 or HeLa cells, whereas transcription activity is again reduced when εNRB is linked to a heterologous promoter (Fig. 3). This suggests that εNRB alone may exhibit negative regulation depending on the promoter, but does not act as a true silencer.
Characterization of εNRA-I and εNRA-II-The region between −3127 and −2902 bp, which is active in both K562 cells and HeLa cells, has a much stronger negative effect in the erythroid cells (Fig. 1A), perhaps related to GATA-1 binding (Fig. 4). This region contains two negative control regions, εNRA-I (−3127 to −3071 bp) and εNRA-II (−3028 to −2902 bp), each associated with a decrease in reporter gene activity. In K562 cells, the region separating these two motifs (between −3071 and −3028 bp) exhibits a modest positive effect (Fig. 1A). The combined effect of εNRA-I and εNRA-II in the 225-bp region reduces transcription activity 20-fold when added back to construct pε2902 to create pε3127. The negative effects of εNRA-I and εNRA-II were also observed in HeLa cells, with about a 13-fold decrease in transcription activity from pε2902 to pε3127. The activity of pε3127 is 3- to 4-fold lower than that of the ε-globin minimal promoter construct, pε177.
The εNRA-I and εNRA-II regions were combined with a heterologous SV40 promoter in reporter gene constructs pεNRA-I/SV40 and pεNRA-II/SV40, respectively. The activities of these reporter genes were assayed and compared with that of SV40 alone (Fig. 5). The region εNRA-I decreases SV40 transcription activity by about 50% in K562 cells and more than 60% in HeLa cells. A similar decrease in transcription activity is observed when εNRA-I is combined with the ε minimal promoter (pεNRA-I/ε177) (data not shown). εNRA-II has an even greater effect on SV40 promoter activity. The decrease in SV40 promoter activity by εNRA-II is almost 20-fold in K562 cells and about 10-fold in HeLa cells. The ability of εNRA-I and εNRA-II to decrease SV40 promoter activity is consistent with the decreases observed when these subregions are examined in the series of deletion mutants for the ε-globin 5′ region (Fig. 1A).
Multiple Protein-binding Sites Identified in εNRA-I and εNRA-II-To attempt to identify the sequence motif responsible for the negative effect of εNRA-I and εNRA-II, we carried out DNase I footprint analysis and correlated the results with aligned DNA sequences from this region. Since the sequence corresponding to εNRA is not present in mouse or rabbit, we reasoned that it would be informative to look at additional primate species. The only other primate species for which sequence data extend this far is the orangutan, and a simultaneous alignment of human, orangutan, and galago sequences is shown in Fig. 6B. Fig. 6A shows the DNase I footprinting assay of region εNRA. The probe was generated by polymerase chain reaction with a 32P-labeled primer, and the nuclear extract from K562 cells was used in the reactions. Several regions, designated FP1-FP5, are protected from DNase I digestion. These include a conserved progesterone receptor binding motif (FP1) and a GATA-1 binding motif (FP2). A major footprinted region (FP3) appears within the region between −3071 and −3028 bp, which exhibits a small positive effect on transcription activity when comparing the constructs pε3028 and pε3071 in K562 cells. This footprinted region (FP3) is included within a block of sequence that is invariant among human, orangutan, and galago. Two minor footprinted regions (denoted FP4 and FP5) are at potential GATA-1 binding motifs in εNRA-II at about −2976 and −2949 bp, respectively. An inverted AGATAG sequence appears in the region corresponding to FP4 in the galago ε-globin 5′-flanking region, and the region corresponding to FP5 is only partially conserved in this comparison. Although two of the GATA-1 binding sites have mismatches in galago that would be expected to decrease binding affinity, these binding sites are identical between orangutan and human.
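The boxing rule given under "DNA Sequence Analysis" (blocks of at least six columns, each with an identical nucleotide in at least 75% of the rows) can be expressed as a short Python sketch. The function name, the treatment of gap characters as non-matching, and the toy alignment in the usage note are assumptions; this is not the YAMA2 program or its display code.

```python
from collections import Counter

def conserved_blocks(rows, min_cols=6, min_fraction=0.75):
    """Return (start, end) column ranges (0-based, inclusive) of conserved blocks.

    `rows` is a multiple alignment given as equal-length strings. A column
    qualifies when its most common nucleotide occurs in at least
    `min_fraction` of the rows; gaps ("-") never count as a nucleotide.
    """
    if not rows or len({len(r) for r in rows}) != 1:
        raise ValueError("rows must be non-empty and of equal length")

    def column_ok(j):
        counts = Counter(r[j] for r in rows if r[j] != "-")
        return bool(counts) and max(counts.values()) / len(rows) >= min_fraction

    blocks, run_start = [], None
    for j in range(len(rows[0]) + 1):
        ok = j < len(rows[0]) and column_ok(j)
        if ok and run_start is None:
            run_start = j
        elif not ok and run_start is not None:
            if j - run_start >= min_cols:
                blocks.append((run_start, j - 1))
            run_start = None
    return blocks
```

With three rows, two matching rows give only 67% agreement, so a column qualifies only when all three agree, which is why the rule reduces to invariant columns for the human-orangutan-galago alignment. For the toy rows `ACGTACGTAC`, `ACGTACGAAC`, and `ACGTACGTAC`, the only block returned is columns 0-6.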
To assess the role of the GATA-1 binding motifs in εNRA-II in decreasing transcription activity, site-directed mutagenesis was used to mutate the GATA-1 binding motifs at positions −2976 and −2951 bp in pεNRA-II/SV40 to create pεNRA-II-Δ1G/SV40 and pεNRA-II-Δ2G/SV40, respectively (Fig. 5). The construct pεNRA-II-Δ1Δ2G/SV40 contained mutations at both sites. Mutation of the GATA-1 binding motif at −2976 (pεNRA-II-Δ1G/SV40) resulted in an increase of transcription activity by about 15-fold and restored transcription activity to more than 85% of that of the SV40 promoter alone. Mutation of the GATA-1 binding motif at −2949 (pεNRA-II-Δ2G/SV40) resulted in an increase in transcription activity by 4- to 5-fold, to about 25% of the activity obtained with the SV40 promoter alone. The construct containing the double mutation, pεNRA-II-Δ1Δ2G/SV40, also restored almost 90% of the SV40 promoter activity. While GATA-1 binding motifs often provide positive regulation of transcription, these data suggest that, as with the ε-globin silencer motif (εGSM) located around −275 bp, the GATA-1 binding sites in εNRA-II provide much of the negative regulation associated with that region, and that the motif at −2976 bp is particularly important in this regard.
Gel mobility shift assays, therefore, were carried out to characterize the ability of the GATA-1 motif at −2976 bp to form a DNA-protein complex in vitro. Fig. 4 shows that there are two complexes (A and B) formed between εNRA-II-1G located at −2976 bp and nuclear extract of K562 cells, while there is only one complex (A′) formed with HeLa cell nuclear extract. Complex B appears to be specific binding and probably GATA protein-related, as evidenced from the fact that an increasing amount of cold εNRA-II-1G diminished the band (Fig. 4A, lanes 3-5), while addition of competitor with the GATA-1 site mutated (εNRA-II-Δ1G) increased the formation of complex B.
FIG. 3. Transcription effects of εNRB on the ε-minimal promoter and a heterologous promoter (SV40). Luciferase activity of the ε-minimal promoter construct with and without εNRB was measured in transfection assays in K562 and HeLa cells. An SV40 promoter construct with and without εNRB was also analyzed in K562 cells.
FIG. 4. Gel mobility shift assay of εNRA-II-1G with K562 (A) and HeLa (B) cell nuclear extracts. Probe was generated as described under "Materials and Methods." The molar excesses of cold εNRA-II-1G or εNRA-II-Δ1G were 12.5×, 25×, 50×, 12.5×, and 25× for A, lanes 3-7; 12.5×, 50×, 150×, 12.5×, and 50× for B, lanes 4-8.

DISCUSSION
It has been noted for some time that the ε-globin gene and its flanking regions are more conserved among mammals than are the β- or γ-globin genes (26,27). Additional DNA sequences and development of new sequence alignment software have continued to show homology throughout much of the 5′-flanking region, extending to HS 1 of the LCR. This homology is highly suggestive of extensive regulatory sequences. Previous studies have revealed multiple, conserved regulatory elements in the 800 bp proximal to the cap site of the human ε-globin gene. Conserved CCAAT and CACC motifs are needed for function of the proximal promoter (28), a highly conserved GATA motif at −160 bp is needed for response to the HS 2 enhancer (29), and the ε-globin silencer (εGS) (14) between −300 and −250 bp contains conserved binding sites for GATA-1 and YY1 (8,15,16). Additional regulatory elements are observed further 5′, such as the negative element located at −419 (25,30). Multiple positive regulatory elements have also been identified within the first 800 bp 5′ to the ε-globin gene, and at least two of them function in a synergistic manner (25,31). Each of these additional cis-acting regulatory sequences between −800 and −300 bp corresponds to evolutionarily conserved sequences (8,9,23,32). The assumption that the sequence conservation results from selection for a common regulatory function was verified by observing a similar pattern of positive and negative regulatory elements 5′ to the rabbit ε-globin gene (9).
Data in this report from the transient transfection assay of a series of deletion mutants show that multiple negative and positive cis-acting regulatory elements are found even more distally to the ε-globin gene, extending to HS 1 of the LCR. As illustrated in Fig. 1B, DNA sequences corresponding to many but not all of these regulatory elements are conserved in other mammals. Two prominent pairs of negative and positive regulatory elements in the −6000- to −800-bp region, A and B, were studied in more detail. The highest level of reporter gene activity was observed for pε2902, in contrast to the low level of activity observed for pε2807, pε3028, and pε3127. The activities of these constructs localized a strong positive regulatory region (εPRA) between −2807 and −2902 and a negative regulatory region (εNRA) consisting of two subregions between −3127 and −3071 (εNRA-I) and between −3028 and −2902 (εNRA-II). Both εNRA-I and εNRA-II also function when combined with a heterologous (SV40) promoter, with εNRA-II exhibiting a stronger negative regulatory effect (Fig. 5).
Our work shows the importance of the erythroid transcription factor GATA-1 in these distal sites. GATA-1 has been found to be a repressor of the ε-globin gene in vivo (33) and appears to be involved in negative regulation of the erythropoietin gene (34). We have found it to be involved in the activity of εGS (15). Site-directed mutagenesis of each of the two potential GATA-1 binding sites located in εNRA-II decreases its negative effect, and mutation of both sites restored most of the SV40 promoter activity (Fig. 5). These results demonstrate that the negative regulation of εNRA-II is directly related to the two GATA-1 binding sites. The fact that εNRA-II is active in both K562 and HeLa cells suggests that GATA-1 (expressed in K562 cells) and possibly other GATA factors (expressed in HeLa cells) can suppress transcription of the ε-globin gene. Whether this would be necessary in nonerythroid cells, in which the globin chromatin is in a closed conformation, is not clear. Mutation of the GATA-1 site located in εNRA-I does not change the negative effect (data not shown).
FIG. 5. Transcription effects of εNRA-I and εNRA-II on a heterologous promoter (SV40). The regions from −3127 to −3071 bp (εNRA-I) and −3028 to −2902 bp (εNRA-II) of the ε-globin gene were placed 5′ of the SV40 promoter driving expression of the luciferase reporter gene. Relative luciferase activity was measured and normalized to that of the SV40 promoter in K562 (left) and HeLa (right) cells. The two GATA-1 sites in εNRA-II located at −2976 (1G) and −2946 (2G), which were mutated separately or jointly, are indicated by triangles. Luciferase activities of these mutant constructs were also measured in K562 cells.

Unlike the other cis-regulatory elements in the 5′-flanking region of the ε-globin gene, the DNA sequences of the human εNRA and εPRA regions are not conserved in non-primate mammals, and are found only in the primates human, orangutan, and galago (Fig. 6B). Since mutations in this region have a strong phenotype in transfected cells, it appears that the function of this region is limited to primates. A complex array of positive and negative cis-regulatory elements is revealed by the deletion/transfection analysis. Likewise, the in vitro footprinting shows multiple binding sites. One of the long strings of invariant nucleotides in the human-orangutan-galago alignment (11 bp long) corresponds to FP3 (Fig. 6A), which is in a region implicated in positive regulation (between −3071 and −3028). In other cases the correspondence between the footprints and the invariant strings of nucleotides is not as strong. For instance, two of the three GATA binding sites in εNRA contain mismatches between human and galago, suggesting that some of the function observed for εNRA may be specific to higher primates. Regulation of the γ- and ε-globin genes is distinctive in higher primates, with considerably more expression of the ε-globin gene compared with that of the γ-globin gene in primitive erythroid cells but abundant expression of the γ-globin gene in fetal definitive erythroid cells. In most other mammals (including galago), the γ-globin gene ortholog is expressed at an equal or higher level than the ε-globin gene ortholog in primitive cells, and neither is expressed in definitive cells (fetal or adult). Thus some but not all of the regulatory elements in the εNRA/εPRA may be distinctive to higher primates. Consistent with this hypothesis, we find that the GATA-1 binding sites are identical between orangutan and human. However, the orangutan sequence is very similar to human overall, and investigation of the sequence of more distantly related simian species would provide a clear test of the hypothesized function in higher primates. The GATA-1 binding site at −208, implicated in silencing of the ε-globin gene (17), is also found in the human sequence but not in prosimian mammals or representatives of other mammalian orders, again consistent with a function only in higher primates.
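The comparison of footprints with "long strings of invariant nucleotides" can be illustrated with a minimal sketch. This is not the alignment procedure used in this study. It assumes pre-aligned sequences of equal length with "-" marking gaps, and the function name `invariant_runs` and the toy sequences are hypothetical, chosen only for illustration.

```python
def invariant_runs(aligned, min_len=1):
    """Return (start, length) for each run of alignment columns that are
    identical in every sequence. Columns with a gap in the first sequence
    are never counted as invariant."""
    if not aligned:
        return []
    width = len(aligned[0])
    if any(len(s) != width for s in aligned):
        raise ValueError("aligned sequences must have equal length")

    runs = []
    start = None
    # Iterate one position past the end so a run touching the last column is closed.
    for i in range(width + 1):
        conserved = (
            i < width
            and aligned[0][i] != "-"
            and all(s[i] == aligned[0][i] for s in aligned)
        )
        if conserved:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, i - start))
            start = None
    return runs


# Toy alignment (hypothetical sequences, not taken from the globin locus).
toy = [
    "ACGTTGATAAGC",  # stands in for human
    "ACGTTGATAAGC",  # stands in for orangutan
    "ACCTTGATTAGC",  # stands in for galago
]
print(invariant_runs(toy, min_len=3))
```

Positions are 0-based column indices in the alignment. Mapping them to coordinates relative to the cap site would require the offset of the aligned region, which is not assumed here.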
The second prominent pair of positive and negative regulatory elements is εNRB/εPRB. The negative regulation exhibited by εNRB is seen only in erythroid cells (data not shown).
The strong negative effect of εNRB on the ε-globin gene promoter occurs only when it is in its natural position (Figs. 1 and 3); it does not act alone on the proximal promoter (to −177) of the ε-globin gene or on a heterologous promoter such as SV40. This suggests that the negative effect of εNRB may require interaction with downstream sequences in the 5′-flanking region or with other negative elements. A similar cooperative mechanism has also been proposed for the several positive elements located within −800 bp of the ε-globin gene, which do not function in isolation (20). DNA-protein binding assays reveal two footprinted regions in εNRB with K562 cell nuclear extracts, which are absent with HeLa cell nuclear extract (Fig. 2). Both protected regions correspond to blocks of sequences, or phylogenetic footprints, conserved in human, galago, rabbit, and mouse. Thus in the case of εNRB, three independent lines of investigation, i.e. functional analyses of deletion constructs, in vitro DNA-protein binding data, and analyses of DNA sequence conservation, generate congruent results, all showing that this is an important regulatory region in many and possibly all orders of mammals.
It is interesting to note that this type of deletion analysis indicates that positive and negative elements frequently lie close to each other, essentially in a tandem arrangement along the ε-globin gene 5′-flanking sequences. In addition to εNRA/εPRA and εNRB/εPRB, we have also localized pairs of positive and negative elements generating smaller effects from −2385 to −1747 bp and from −1460 to −1084 bp (Fig. 1A). Several of these regulatory regions contain conserved sequences previously identified as phylogenetic footprints (8). The positive region from −1707 to −1511 bp with erythroid specificity identified in this study has been shown to contain a conserved YY1 binding site and can bind YY1 very strongly (8), as well as GATA-1. YY1 is a ubiquitous transcription factor with dual action (35). The negative regions from −1460 to −1135 bp (active in K562 cells) and −1084 to −883 bp (active in both K562 and HeLa cells) identified in this study have binding motifs for YY1 and GATA-1. The positive region from −1153 to −1084 bp (active in K562 cells) contains a potential GATA-1 binding site (8). The previously characterized εGS element from −300 to −250 bp also contains binding sites for both YY1 and GATA-1. The manner in which YY1 and GATA-1 function in both positive and negative regulation of the ε-globin gene is an important matter for further study. The detection of GATA-1 binding proteins, such as FOG (36), may point to complex protein assembly mechanisms mediating these effects.
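Locating candidate GATA-1 binding motifs within a regulatory region can be sketched as a simple consensus scan. The sketch below assumes the commonly cited consensus (A/T)GATA(A/G), which is not stated in this paper, and scans both strands by also matching its reverse complement. The function name `find_gata_sites` and the test sequence are hypothetical. A motif match only suggests a potential site; binding must be shown experimentally, as with the footprinting assays described here.

```python
import re

# Assumed consensus WGATAR = (A/T)GATA(A/G); lookaheads allow overlapping hits.
GATA_FWD = re.compile(r"(?=([AT]GATA[AG]))")
# Reverse complement of WGATAR is YTATCW = (C/T)TATC(A/T).
GATA_REV = re.compile(r"(?=([CT]TATC[AT]))")


def find_gata_sites(seq):
    """Return sorted (start, strand, matched_text) tuples for candidate
    GATA motifs; start is the 0-based index on the given (top) strand."""
    seq = seq.upper()
    hits = [(m.start(), "+", m.group(1)) for m in GATA_FWD.finditer(seq)]
    hits += [(m.start(), "-", m.group(1)) for m in GATA_REV.finditer(seq)]
    return sorted(hits)


# Hypothetical test sequence, not taken from the ε-globin flank.
print(find_gata_sites("CCAGATAAGGTTATCTCC"))
```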
We suggest that the down-regulation of ε-globin gene expression as development progresses involves cooperative interactions of the negative regulatory elements located around −4.5, −3, −1.7, and −0.3 kb (εGS), plus specific motifs located in the other general negative regions identified in the 5′-flanking region examined in this study (Fig. 1A). In particular, the reporter activity of construct pε6073, which contains about 6 kb of 5′-flanking sequences, is only 3% of that for the proximal ε-globin promoter, pε177 (Fig. 1A). This suggests that, even though there are several positive as well as negative control elements along the 6 kb of 5′-flanking sequences, the net effect on the ε-globin gene promoter is negative, despite the fact that this construct contains HS 1. This could be the reason that when the ε-globin silencer around −275 is deleted or mutated, the expression in adult transgenic mice of the human ε-globin transgene linked to an LCR is only 5-10% of the level of the endogenous mouse εy or β gene (18). Additional aspects of the silencing process may be apparent when the ε-globin gene is linked with the LCR and other genes within the β-globin gene cluster. Other experiments in transgenic mice suggest that control of ε-globin gene expression may not be strictly autonomous and that, in addition to the LCR, other regulatory elements flanking the 5′ region of the ε-globin gene may affect expression of the genes located more 3′ in the cluster. Studies using human YAC constructs containing the β-globin gene cluster with the LCR showed that deletion of the ε-globin silencer region also affected γ-globin gene expression (19). Our new results identifying even more cis-acting regulatory elements in the 5′ flank of the ε-globin gene illustrate the complexity of the mechanisms of ε-globin gene silencing, and they are a further step in improving understanding of the joint regulation of the entire β-globin gene cluster.