Intron 1 Elements Promote Erythroid-specific GATA-1 Gene Expression*

The zinc finger protein GATA-1 functions in a concentration-dependent fashion to activate the transcription of erythroid and megakaryocytic genes. Less is understood, however, regarding factors that regulate the GATA-1 gene. Presently, elements within intron 1 are shown to markedly affect its erythroid-restricted transcription. Within a full-length 6.8-kilobase GATA-1 gene construct (G6.8-Luc) the deletion of a central subdomain of intron 1 inhibited transcription ≥10-fold in transiently transfected erythroid SKT6 cells, and likewise inhibited high-level transcription in erythroid FDCW2ER-GATA1 cells. In parental myeloid FDCER cells, however, low-level transcription was largely unaffected by intron 1 deletions. Within intron 1, repeated GATA and Ap1 consensus elements in a central region are described which, when linked directly to reporter cassettes, promote transcription in erythroid SKT6 and FDCER-GATA1 cells at high rates. Moreover, GATA-1 activated transcription from this subdomain in 293 cells, and in SKT6 cells this subdomain footprinted in vivo. For stably integrated GFP reporter constructs in erythroid SKT6 cells, corroborating results were obtained. Deletion of intronic GATA and Ap1 motifs abrogated the activity of G6.8-pEGFP; activity was decreased by 43 and 56%, respectively, by the deletion of either motif; and the above 1800-base pair region of intron 1 per se was transcribed at rates uniformly greater than G6.8-pEGFP. Also described is the differential utilization of exons 1a and 1b among primary erythromegakaryocytic and myeloid cells.

Less well defined are regulators of GATA gene expression. The GATA-1 gene is the best studied. First, a 900-bp proximal promoter region 5′ to exon 1a has been described which in isolation directs transcription in murine erythroleukemic cells (but not 3T3 fibroblasts) in part via two inverted GATA-1 elements (20,21). Unlike a yeast artificial chromosome construct containing the GATA-1 gene locus, however, this proximal promoter failed to support the expression of a linked GATA-1 cDNA at levels sufficient to rescue the erythroid differentiation of GATA-1-deficient embryonic stem cells (22). Second, extended GATA-1 gene β-galactosidase reporter constructs recently have been tested in vivo, and an upstream activating element (UAE) positioned approximately 2600 bp upstream of exon 1 was discovered to be required for efficient expression in erythroid and megakaryocytic cells (23-25). Interestingly, expression in definitive erythroid cells also was inhibited by the deletion of a downstream 3900-bp region (including intron 1 and exon 1b) (22), and within this general region an erythroid-specific hypersensitive site previously has been mapped (26). Presently, possible contributions of intron 1 subdomains to high-level erythroid-restricted GATA-1 gene transcription have been investigated further. For transiently as well as stably transfected reporter constructs, a central subdomain of intron 1 is shown to be necessary (as well as sufficient) for high-level transcription in erythroid cells, and to contain repeat consensus GATA and Ap1 elements which footprint in vivo. Also examined are rates of transcriptional initiation from exon 1a versus exon 1b in primary hematopoietic cells. These novel regulatory features of the GATA-1 gene likely exert important effects on its differential levels of expression among developmental stages and hematopoietic lineages.
In Vivo Footprinting-Exponentially growing SKT6 cells (4 × 10⁸) were collected and resuspended in 4 ml of 7% fetal bovine serum in Opti-MEM I media. Methylation was achieved by incubation for 4 min at 20°C with either 7 or 21 µl of (10.57 M) DMS (Sigma) per 2 ml of cells, and was terminated by washing cells twice with 0°C phosphate-buffered saline. Genomic DNA was extracted, and methylated guanine (G) residues were hydrolyzed by incubating 20 µg of DMS-treated genomic DNA with 0.1 M piperidine for 10 min at 90°C (31). Control genomic DNA was extracted and exposed to DMS and piperidine. 1 µg of DNA was used as a template for ligation-mediated PCR (34). Complementary strand synthesis was with primer 1A (5′-GTCTCTCCCTCCATTTCC-3′, Tm = 60°C). Linker ligation was as described using a modified Mueller and Wold linker (32). PCR amplification of the double-stranded population was with primer 2A (5′-ACTGTGTTTCTGTGTTTTTCCTACC-3′, Tm = 63°C). A third primer (3A: 5′-CTGTGTTTCTCCTACCTTTCTGTGCTTTACC-3′, Tm = 67°C) was end-labeled using T4 polynucleotide kinase (New England Biolabs, Beverly, MA) and [γ-³²P]ATP (Amersham Pharmacia Biotech). The radiolabeled primer was elongated via five PCR cycles. Products were resolved in 5% gels and exposed to film. The sequencing ladder was created using Sequenase (U. S. Biochemicals, Cleveland, OH) and pCRScript-G (7)/Ap1(4) as a template.
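As a quick check on the two methylation conditions, the approximate final DMS concentrations can be worked out from the stated stock molarity and volumes. The sketch below is illustrative only: the helper name `final_molarity` is hypothetical, and it assumes simple volume additivity on mixing.

```python
def final_molarity(stock_molar, added_ul, sample_ul):
    """Final concentration (M) after adding a stock to a sample; volumes in microliters."""
    # Assumes volumes are additive; adequate for a rough bench estimate.
    return stock_molar * added_ul / (sample_ul + added_ul)

# 10.57 M DMS stock, 7 or 21 microliters added per 2 ml (2000 microliters) of cells.
for dms_ul in (7, 21):
    conc_mm = 1000 * final_molarity(10.57, dms_ul, 2000)
    print(f"{dms_ul} ul DMS per 2 ml cells: about {conc_mm:.0f} mM")
```

Under these assumptions the two conditions correspond to roughly 37 mM and 110 mM DMS, that is, a 3-fold range.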
DNA Sequence Analyses-Cycle sequencing reactions were performed using 3′ BigDye-labeled dideoxynucleotide triphosphates and an ABI PRISM 377 DNA Sequencer (Perkin-Elmer ABI, Foster City, CA). Putative elements for transcription factor binding were defined using Sequence Interpretation Tools software.
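The kind of consensus-element search described above can be sketched in a few lines. This is not the Sequence Interpretation Tools software; it is a minimal illustration that assumes the commonly cited consensus patterns WGATAR for GATA sites and TGA(C/G)TCA for Ap1 sites, and the function names are hypothetical.

```python
import re

# Assumed consensus patterns (not taken from the software named in the text):
# GATA = WGATAR (W = A/T, R = A/G); Ap1 = TGA(C/G)TCA.
MOTIFS = {
    "GATA": "[AT]GATA[AG]",
    "Ap1": "TGA[CG]TCA",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def scan_motifs(seq, motifs=MOTIFS):
    """Return sorted (start, strand, name, match) hits on both strands.

    Start positions are 0-based and always refer to the forward strand.
    """
    seq = seq.upper()
    rc = reverse_complement(seq)
    hits = []
    for name, pattern in motifs.items():
        # A lookahead wrapper lets overlapping sites be reported individually.
        regex = re.compile("(?=(%s))" % pattern)
        for m in regex.finditer(seq):
            hits.append((m.start(), "+", name, m.group(1)))
        for m in regex.finditer(rc):
            # Map the reverse-strand coordinate back onto the forward strand.
            start = len(seq) - m.start() - len(m.group(1))
            hits.append((start, "-", name, m.group(1)))
    return sorted(hits)


if __name__ == "__main__":
    # Toy sequence for illustration only; not from the GATA-1 gene.
    for hit in scan_motifs("CCAGATAAGGTGACTCACC"):
        print(hit)
```

Because the Ap1 consensus is nearly palindromic, a single site is typically reported on both strands; clustered repeats such as those described for intron 1 would appear as runs of closely spaced hits.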

RESULTS
In primary analyses of GATA-1 gene subdomains that affect erythroid-specific transcription, the deletion constructs illustrated in Fig. 1 were prepared from a parent construct, G6.8-Luc, and assayed for activity in erythroid SKT6 cells. Within G6.8-Luc, 6800 bp 5′ to exon 1a are retained, including the UAE and additional upstream sequences. In G4.5-Luc and G2.5-Luc, upstream sequences of 2300 and 4300 bp are deleted, respectively. In G6.8-ESIΔ900, upstream regions are intact but a central 900-bp subdomain of intron 1 (bounded by exons 1a and 1b and designated ESI, erythroid-specific intron) is deleted. Finally, in ESI 3700-Luc, ESI 1800-Luc, and ESI 900-Luc, subdomains of intron 1 alone are represented, and are fused to a luciferase cassette immediately upstream from a unique translational start codon in exon 2 of the GATA-1 gene (as are all luciferase and GFP reporter constructs). Activities of the above constructs in transiently transfected SKT6 cells are illustrated in Fig. 2, upper panel. For G4.5-Luc, activity reproducibly was observed to be increased 2-fold over G6.8-Luc upon the deletion of an apparently repressing 2300-bp region upstream of the UAE. Consistent with the results of in vivo studies (21, 22), deletion (from G6.8-Luc) of the UAE in construct G2.5-Luc inhibited transcription in SKT6 cells by approximately 10-fold (as compared with G4.5). More remarkably, in G6.8-ESIΔ900 the deletion of a central subdomain of intron 1 inhibited transcription as severely as the deletion of the UAE (i.e. ≥10-fold inhibition). In addition, when fused directly to a luciferase reporter cassette, a proximal 1800-bp region of this intron (construct ESI 1800-Luc) promoted transcription in transiently transfected SKT6 cells at rates ≥10-fold above those observed for G6.8-Luc and G4.5.
Within ESI 1800-Luc, the deletion of a distal 900-bp region (yielding the construct ESI 900) abrogated activity, further delineating strong positively acting elements within intron 1 to a central subdomain. Finally, in the construct ESI 3700 inclusion of a further 5′ region of intron 1 decreased activity in SKT6 cells, indicating the possible presence of inhibitory elements within the far upstream region of intron 1.
To determine the extent to which the above delineated GATA-1 gene subdomains might regulate transcription selectively in erythroid cells, reporter constructs also were assayed in a uniquely advantageous pair of cell lines, i.e. FDCW2ER and FDCW2ER-GATA-1 cells. FDCW2ER cells are a well characterized myeloid progenitor cell line in which no expression of GATA-1, EKLF, Epo receptor, or globin gene transcripts is detectable, while FDCW2ER-GATA-1 cells are a derived subline in which the stable expression of exogenous GATA-1 activates endogenous erythroid gene expression, including GATA-1, EKLF, and βmaj globin (19). In erythroid FDCW2ER-GATA-1 cells, those GATA-1 gene constructs which possessed high transcriptional activities in erythroid SKT6 cells (i.e. G6.8-, G4.5-, and ESI 1800-Luc) likewise were transcribed at high rates (Fig. 2, center panels). In myeloid FDCW2ER cells, however, transcription of these constructs was uniformly low (Fig. 2, lower panels) while activities of constructs G2.5 and ESI 900 were elevated. These results reinforced the above findings in SKT6 cells, and indicate that a central 900-bp domain of intron 1 contains elements that strongly promote erythroid-restricted GATA-1 gene transcription. In addition, the elevated activities of constructs G2.5 and ESI 900 in myeloid FDCW2ER cells at least suggest that domains −4500 to −2500 and +1800 to +900, while activating in red cells, might also repress transcription in other lineages. Activities of the above GATA-1 promoter constructs also were assayed in erythroid B6SUt.Ep cells (28), and activity profiles were observed to closely parallel those observed in SKT6 and FDCW2ER-GATA-1 cells.
Based on the above characterized role for a central subdomain of intron 1 in erythroid-specific transcriptional activity, this region of the murine GATA-1 gene next was sequenced (Fig. 3). Sequencing revealed first the occurrence of several GATA elements as well as a CACC element within a proximal 900-bp region.
In addition, positioned in an immediately upstream region were clusters of seven contiguous consensus GATA elements (bold and underlined) and four adjacent Ap1 elements (italicized and underlined). The functional significance of these latter repeated elements next was tested in three ways. The ability of GATA-1, FOG, or GATA-1 plus FOG to activate the above described transcriptional reporter constructs in 293 fibroblasts was assessed; repeat domains were assessed for trans-factor occupancy in erythroid SKT6 cells by in vivo footprinting; and relative rates of transcription from promoters positioned upstream of exon 1a versus 1b were assayed.

FIG. 4. Transcriptional activation of intron 1-derived constructs by GATA-1 in stably transfected 293 cells.
Panel A, to test the ability of GATA-1 (and/or FOG) to activate transcription from intron 1 subdomains, select G6.8 and ESI luciferase reporter constructs were assayed for activity in 293 fibroblasts transfected stably with these factors. In 293-GATA-1 cells, transcription from the intron 1 construct ESI-1800 was stimulated ≥5-fold due to GATA-1 alone (see panel 293-G1, and lower summary panel) and this was inhibited by deletion of GATA and Ap1 repeat elements (in the truncated construct ESI-900). By comparison, transcription of the full-length construct G6.8 was stimulated by GATA-1 only 2-fold, and only in the presence of FOG (i.e. in 293-G1-FOG cells). As a positive control, a murine αIIb promoter-reporter construct also was assayed in parallel, and was activated approximately 2.5-fold by GATA-1, and 8-fold by GATA-1 plus FOG (rightmost panels). In the lower subpanel GATA-1-induced increases in the transcription of the above constructs are summarized. Panels B and C, shown are levels of GATA-1 protein (B) and FOG transcript (C) expression in the stably transfected 293 cell lines used in the above analyses.

In 293-GATA-1-FOG cells, the co-expression of FOG with GATA-1 did not significantly affect the transcription of the intron 1 constructs ESI 3700, 1800, or 900, but did reproducibly increase transcription from the GATA-1 gene constructs G6.8-Luc and G4.5-Luc approximately 2-fold (in 293-FOG cells, FOG alone did not exert this effect). As a further positive control, 293 sublines also were transiently transfected with a reporter that contains a 545-bp promoter region of the murine αIIb gene. As shown in Fig. 4A (right panels) GATA-1, as well as GATA-1 plus FOG, efficiently activated this megakaryocytic promoter. Thus, the central subdomain of intron 1 which contains repeated GATA and Ap1 elements maps as a functional target for GATA-1-dependent transcriptional activation (see Fig. 4, lower summary panel). In Fig. 4, B and C, levels of GATA-1 and FOG expression in stably transfected 293 cells are illustrated.
To further assess the functional importance of cis-elements within intron 1, in vivo footprinting of a region bounding GATA and Ap1 repeats also was performed in SKT6 cells. As shown in Fig. 5, footprinting against DMS-dependent hydrolysis was observed for a region extending from nine (of 14) GATA repeats through eight GACA repeats, and extended 3′ through at least four flanking nucleotides (GCAG). This was observed at two concentrations of DMS (left panel) and in independently repeated experiments. Dideoxy dGTP and dATP sequence reaction products from a cloned intron 1 fragment (primed with an internal oligo used in genomic footprinting reactions) were used to index the register of footprinted products. These data indicate stable occupancy of this intron 1 subdomain in erythroid cells.
Relative rates of transcription from initiation sites within exon 1a versus exon 1b of the endogenous GATA-1 gene were next assayed, initially in SKT6 cells and erythroid splenocytes. Assays were by ³²P-reverse transcriptase-PCR using primers specific for exon 1a versus 1b (for schematics of exons within the GATA-1, -2, and -5 genes, see Fig. 6, lower panel). In both SKT6 cells and erythroid splenocytes, transcription was primarily from exon 1a yet also initiated at appreciable rates from exon 1b (Fig. 6, upper panels). Next, exon 1a and 1b-derived transcripts were assayed in marrow cells expanded ex vivo under culture conditions selective for the propagation of CFUe, CFU-Meg, or granulocytes/monocytes (Fig. 6, center panel). In marrow-derived erythroid and megakaryocytic cells, relative frequencies of exon 1a versus 1b-derived transcripts were highly similar to those observed in SKT6 cells and erythroid splenocytes. In primary granulocyte/monocytic cells, however, transcripts from exon 1b somewhat unexpectedly were detected at levels comparable to those in erythroid and megakaryocytic cells (while no exon 1a-derived transcripts were detected). Sequencing confirmed the identity of this PCR product. Thus, transcription from initiation sites in exon 1a occurs at high levels in cells that develop within erythroid or megakaryocytic lineages. In addition, initiation from this exon may be selectively repressed in alternate hematopoietic progenitor cells while initiation from exon 1b does not appear to be subject to this differential regulation.

FIG. 6. RNA was reverse transcribed, and products whose transcription initiated at exon 1a (220-bp PCR product) versus exon 1b (490-bp PCR product) were assayed by PCR using 5′ primers specific to each exon. Diagrammed in the lower panel are the related exon structures of GATA-1, -2, and -5 genes.
Structure-function features of intron 1 that merit attention are first, a 900-bp region immediately 5′ to exon 1b which alone supported low-level transcription in erythroid cells, yet somewhat higher level transcription in myeloid FDCER cells (see Figs. 1-3). Based on the occurrence in this functional ESI-900 promoter of several consensus GATA elements as well as a CACC box (which likewise occur in the previously described promoter 5′ to exon 1a) (20,21), it was anticipated that its activity in erythroid cells might be higher. Erythroid activity, however, proved to require linkage to the central region of intron 1. This GATA and Ap1 repeat-containing region (in ESI-1800-Luc and -EGFP constructs) interestingly also strongly repressed transcription in myeloid FDCER cells. These findings at least suggest roles for this central region in recruiting not only erythroid activators, but possibly repressors in non-erythroid cells. Finally, the 1900-bp distal region of intron 1 (as linked to ESI-1800 in ESI-3700-Luc) repressed transcription specifically in erythroid SKT6 and FDCER-GATA1 cells. Inspection of this distal region, however, revealed essentially no consensus elements for trans-factor binding. In contrast is the striking occurrence of the above central GATA and Ap1 repeats, and data also indicate that this region is transactivated by GATA-1 in 293 cells and occupied in erythroid SKT6 cells in vivo. Specific activities exerted by subsets of repeat elements merit further study, but by analogy it is interesting to consider that essential clustered GATA elements also recently have been described within the divergently transcribed niiA and niaD genes of Aspergillus (34). In this system, disruption of these elements (or of the GATA gene, Area) blocks chromatin remodeling at these sites, and severely inhibits niiA and niaD transcription.
In addition, within the short arm of chromosome Y, GATA repeats within a p17 subregion of an M34 repeat (including 12 contiguous elements in as many as 300 repeated M34 domains) also have been speculated to regulate testes-specific decondensation (35).
As investigated within the context of full-length and maximally active reporter constructs (and as analyzed in both transient and stable transfections), an important role for the central subregion of intron 1 in directing erythroid-specific GATA-1 gene expression is also presently described. Here dramatic losses in activity due to the deletion of both GATA and Ap1 elements simply but importantly reveal requirements for this intronic region in erythroid transcription. Mechanistically, this might involve effects on transcription initiated at exon 1b, 1a, or both. The deletion of this intron region inhibited overall luciferase and GFP expression from exon 2 ≥10-fold, while levels of transcripts derived from exon 1b in erythroid cells accounted for only 10% (or less) of total levels. It therefore is concluded that transcription from both exon 1a and 1b depends upon the intactness of intron 1. Such effects of intron elements on transcription are not common, yet have been well documented in other systems. Examples include a tissue-specific reduction in mRNA stability upon the deletion of intron 1 of the α1-collagen gene (36), attenuated nuclear export of mitogen-regulated proliferin transcripts (37), and roles for intron 1-3 of the mouse thymidine kinase gene in markedly modulating transcript initiation (38). In the narrowed context of erythroid genes, several examples also exist wherein introns can regulate lineage-specific transcription. In the Epo receptor gene, an erythroid-specific DNase I hypersensitive site has been mapped to intron 1, and this intron was shown to enhance transcription 4-fold and to contain two key GATA-1-binding sites (39). In the human adult β- and α-globin genes, important roles for introns in transcriptional regulation also have been described.
In the adult β-globin gene, two DNase I-hypersensitive sites have been mapped within intron II (β IVS2) that contain four GATA-1 binding sites (40) and integrity of β IVS2 has been shown to be important for transcription (possibly via interactions with elements in the locus control region) (41). Finally, in the human α-globin gene cluster, a key upstream hypersensitive site, HS-40, interestingly lies within an intron of an anonymous but widely expressed gene (42) and integrity of this site likewise is essential for high-level α-globin gene expression in erythroid cells (43). Within the presently studied GATA-1 gene, whether integrity of intron 1 supports transcript stability, transport, and/or initiation is under active investigation (as is the possible lineage specificity of this effect).
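The argument above regarding exon 1b-derived transcripts (10% or less of total levels) versus the ≥10-fold loss of reporter expression can be checked with simple arithmetic. The sketch below is only illustrative: the function name is hypothetical, and it assumes that reporter signal scales with total transcript level.

```python
def fold_inhibition(lost_fraction):
    """Fold inhibition of total expression when a given fraction of transcripts is lost."""
    remaining = 1.0 - lost_fraction
    if remaining <= 0:
        return float("inf")
    return 1.0 / remaining

# Upper bound from the text: exon 1b-derived transcripts are at most ~10% of the total.
max_exon_1b_fraction = 0.10

# If the intron deletion abolished only exon 1b-initiated transcripts:
print(round(fold_inhibition(max_exon_1b_fraction), 2))  # 1.11
```

Loss of exon 1b-initiated transcripts alone would thus lower expression by only about 1.1-fold, far short of the observed ≥10-fold inhibition, so transcripts initiated at exon 1a must also be affected by the deletion.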
A final interesting feature of GATA-1 gene transcription is the observed utilization of exon 1b (but not 1a) in primary granulocytic/monocytic cells (see Fig. 6). In these granulocytic/monocytic cells, levels of exon 1b-derived transcripts were shown to approximate those in primary erythroid and megakaryocytic cells, while little to no exon 1a-derived transcripts were detected. This observation at least suggests that exon 1b might be selectively utilized in non-erythromegakaryocytic cells to provide for low-level GATA-1 expression. By comparison, in at least two related GATA genes, GATA-2 and -5, differential expression from similarly distributed exons recently has been reported. In the murine GATA-2 gene Minegishi et al. (13) have demonstrated the occurrence of two functional promoters upstream from exons 1S and 1G which are positioned in close parallel to exons 1a and 1b of the murine GATA-1 gene (see Fig. 6). Transcription from exon 1S was observed only in Sca-1+/c-Kit+ hematopoietic progenitor cells, and repeat elements for GATA binding were discovered in the promoter flanking this exon. In the chicken GATA-5 gene, transcription also is initiated from two distinct exons, and in embryonic heart, transcripts initiate from both exons while in adult heart initiation is only from 1b (44). Thus, whether transcription of the GATA-1 gene from exons 1a versus 1b might also be regulated differentially during erythroid development likely merits further investigation.