Loss of the Gata1 Gene IE Exon Leads to Variant Transcript Expression and the Production of a GATA1 Protein Lacking the N-terminal Domain

GATA1 is essential for the differentiation of erythroid cells and megakaryocytes. The Gata1 gene is composed of multiple untranslated first exons and five common coding exons. The erythroid first exon (IE exon) is important for Gata1 gene expression in hematopoietic lineages. Because previous IE exon knockdown analyses resulted in embryonic lethality, less is understood about the contribution of the IE exon to adult hematopoiesis. Here, we achieved specific deletion of the floxed IE exon in adulthood using an inducible Cre expression system. In this conditional knock-out mouse line, the Gata1 mRNA level was significantly down-regulated in the megakaryocyte lineage, resulting in thrombocytopenia with a marked proliferation of megakaryocytes. By contrast, in the erythroid lineage, Gata1 mRNA was expressed abundantly utilizing alternative first exons. In particular, the IEb/c and newly identified IEd exons were transcribed at a level comparable with that of the IE exon in control mice. Surprisingly, in the IE-null mouse, these transcripts failed to produce full-length GATA1 protein and instead yielded, inefficiently, GATA1 lacking the N-terminal domain. With low level expression of the short form of GATA1, IE-null mice showed severe anemia with skewed erythroid maturation. Notably, the hematological phenotypes of adult IE-null mice differ substantially from those observed in mice harboring conditional ablation of the entire Gata1 gene. The present study demonstrates that the IE exon is instrumental to adult erythropoiesis by regulating the proper level of transcription and selecting the correct transcription start site of the Gata1 gene.

Transcription factor GATA1 is critical for erythroid and megakaryocytic cell differentiation through its regulation of specific target genes (1,2). Germ line mutation of the Gata1 gene results in death around embryonic day 11.5 (E11.5) due to malfunctioning primitive hematopoiesis (3). By contrast, disruption of the Gata1 gene in adult mice using tamoxifen-inducible Cre recombinase expression results in the loss of erythroid progenitors and gives rise to a phenotype resembling human red cell aplasia (4). In addition, accumulating lines of evidence suggest that dysmegakaryopoiesis is linked to a reduction in GATA1 protein (5-8).
The Gata1 gene consists of five common exons coding for the GATA1 protein and multiple first exons encoding the 5′-untranslated region (UTR). Of the first exons, the erythroid first exon (IE exon) is located 3.9 kb upstream of the second exon and is mainly utilized for transcription of the Gata1 gene in hematopoietic tissues (9,10). The distal testis first exon (IT exon) plays a role in Gata1 gene expression in mouse testis (11,12). In addition, a minor first exon called the IEb exon was identified in an erythroid cell line and is located within the Gata1 first intron. Another minor first exon, the IEc, was found just downstream of the IEb exon in cultured bone marrow-derived cells from eosinophil-specific GATA1-deficient mice (2,9,13). The IEb and IEc were seldom used during homeostasis, indicating that these exons do not retain the potential to promote Gata1 gene expression physiologically in vivo. In addition, as these exons are difficult to separate, we refer to them collectively as the IEb/c exon.
We previously established a Gata1.05 knockdown allele by inserting a neo-cassette in front of the hematopoietic IE exon of the Gata1 gene (14). Gata1 gene expression from the IE exon of the Gata1.05 allele decreased to 5% of the endogenous level (14). Because the Gata1 gene is localized to the X chromosome, hemizygous male mice (Gata1 G1.05/Y) harboring the allele markedly lack Gata1 gene expression. The hemizygotes die by embryonic day 12.5 (E12.5) because of the impairment of embryonic (primitive) hematopoiesis. This is in good agreement with the Gata1 knock-out mouse phenotype.
By contrast, the function of the IE exon during adult hematopoiesis is less well understood, as mice lacking the entire Gata1 gene or the IE exon die in utero due to insufficient erythropoiesis (Ref. 3 and this study). Thus, we carried out stage-specific deletion of the IE exon in adult mouse hematopoietic cells using mice harboring an allele in which the IE exon is flanked by a pair of loxP sequences, together with an interferon-inducible Cre recombinase expression system.
We found that upon Cre-mediated deletion of the IE exon, Gata1 gene expression was completely abolished in the megakaryocytic lineage. By contrast, in the erythroid lineage Gata1 mRNA was produced at a level comparable with that in control mice because of employment of the alternative first exon IEb/c and the newly identified IEd first exon. We found that mRNAs transcribed from these alternative first exons tend to produce variant GATA1 protein lacking the N-terminal domain, which is well observed in Down syndrome-related megakaryoblastic leukemia (15-17). Importantly, in mice with conditional knock-out of the entire Gata1 gene, erythroid progenitors underwent maturation arrest (4), whereas in IE-null mice we observed dyserythropoiesis with skewed erythroid maturation. These results thus demonstrate that the IE exon is essential for adult erythropoiesis in terms of proper regulation of Gata1 gene expression and the appropriate production of GATA1 protein.

EXPERIMENTAL PROCEDURES
Animals-All experimental procedures were approved by the Institutional Animal Experiment Committee and experiments were performed in accordance with the Regulation for Animal Experiments of Tohoku University. Homologous recombination gene targeting of Gata1 (Fig. 1A) was performed as described previously (14). Mx1-Cre and Ayu1-Cre transgenic mouse lines were kindly provided by Drs. Hisamaru Hirai and Mineo Kurokawa (University of Tokyo) and Drs. Kenichi Yamamura and Kimi Araki (Kumamoto University), respectively. Peripheral blood from individual mice was collected and analyzed with a hemocytometer (Nihon Koden). For induction of hemolytic anemia, mice were subcutaneously administered phenylhydrazine (48 mg/kg of body weight; Sigma) for 2 consecutive days. Mice were analyzed 3 days after the final administration.
Southern Blot Analysis-Genomic DNA samples from the spleen and bone marrow of mice were digested with EcoRI and EcoRV and loaded on an agarose gel for electrophoresis, followed by blotting onto a nylon membrane. Probes were labeled with [α-32P]dCTP in a random priming reaction using a RediPrime II DNA labeling system (Amersham Biosciences).
Histological and Immunohistological Analyses-Hematoxylin-eosin and acetylcholine esterase staining were performed as previously described (18,19). For immunohistological analyses, tissue samples were fixed in 4% paraformaldehyde overnight at 4°C, embedded in Tissue-Tek OCT compound (Sakura Finetechnical), and rapidly frozen. The tissue sections were immunostained with rat anti-GATA1 monoclonal antibody N6 or goat anti-human GATA1 polyclonal antibody C20 (Santa Cruz) for 1 h at room temperature and subsequently incubated with a biotinylated secondary antibody at room temperature.
Flow Cytometry and Cell Sorting-Cells were incubated with fluorescein isothiocyanate-conjugated anti-CD71 and -CD41, phycoerythrin-conjugated anti-Ter119 and -CD61, and allophycocyanin-conjugated anti-c-Kit antibodies. Cells were analyzed with a FACSCalibur (BD Biosciences). For morphological analyses, cells were sorted with a FACSAria (BD Biosciences). Positive selection for CD71, CD41, or B220 expression was carried out using a MACS cell separation system (Miltenyi Biotec) according to the manufacturer's procedure. All antibodies were purchased from BD Pharmingen.
Quantitative and Semi-quantitative Reverse Transcriptase (RT)-PCR-Total RNA was isolated using ISOGEN (Nippon Gene) and transcribed into cDNA using SuperScript II and III Reverse Transcriptase (Invitrogen). Quantitative RT-PCR (qRT-PCR) analysis was carried out on an Applied Biosystems 7500 real time PCR system. qPCR Mastermix Plus and qPCR Mastermix Plus for SYBR Green I (EUROGENTEC) were used to detect total Gata1 transcript and specific Gata1 transcripts containing alternative first exons, respectively. Data were normalized to glyceraldehyde-3-phosphate dehydrogenase (Gapdh) expression. The Gata1 probe/primers used in the qRT-PCR analyses are described in supplemental Table S1, and Gapdh probe/primers were purchased from Applied Biosystems. Semi-quantitative RT-PCR was performed with Advantage 2 Taq (Clontech) using the primers described in supplemental Table S1.
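The normalization to Gapdh and the scaling of the control mean to one (as used in the figure legends) can be illustrated with a minimal sketch. The text does not state which quantification model was applied, so the 2^-ΔCt calculation below, which assumes 100% amplification efficiency, is an assumption; the function names and Ct values are hypothetical and are not data from this study.

```python
from statistics import mean

def relative_expression(ct_target, ct_reference):
    """Target abundance relative to the reference gene, assuming the
    2^-dCt model (100% amplification efficiency). This model is an
    assumption; the text only states that data were normalized to Gapdh."""
    return 2.0 ** -(ct_target - ct_reference)

def normalize_to_control(sample_values, control_values):
    """Scale values so that the mean of the control group equals one."""
    control_mean = mean(control_values)
    return [value / control_mean for value in sample_values]

# Hypothetical (Gata1 Ct, Gapdh Ct) pairs, for illustration only.
control_ct = [(24.0, 18.0), (24.5, 18.5), (23.5, 17.5)]
mutant_ct = [(24.0, 18.0), (25.0, 18.0)]

control_rel = [relative_expression(t, r) for t, r in control_ct]
mutant_rel = [relative_expression(t, r) for t, r in mutant_ct]

control_norm = normalize_to_control(control_rel, control_rel)
mutant_norm = normalize_to_control(mutant_rel, control_rel)
print(control_norm, mutant_norm)
```

With these illustrative inputs, a sample whose ΔCt is one cycle larger than that of the controls is reported at half the control level.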
5′ Rapid Amplification of cDNA Ends (5′-RACE)-Reverse transcription was carried out with a 5′-Full RACE Core Set (Takara) using the primers described in supplemental Table S2. The products were subcloned into pGEM-T easy vector (Promega) and sequenced using BigDye Terminator (Applied Biosystems).
Chromatin Immunoprecipitation (ChIP)-Bone marrow-derived cells were cross-linked with 1% formaldehyde and neutralized with glycine. After lysing the nuclei, samples were sonicated to produce DNA fragments of ~300-500 bp. An aliquot of each sample was kept as an input DNA control. The remaining samples were subjected to overnight pre-clearing twice using a mixture of protein A- and protein G-agarose beads (GE Healthcare) in combination with normal rabbit IgG (Santa Cruz). Samples were immunoprecipitated with antibodies against the C terminus of histone H3, di-methylated lysine 4 of histone H3, and tri-methylated lysine 4 of histone H3 (Upstate). Immunoprecipitation with rabbit IgG only was performed as a negative control for immunoprecipitation with specific antibodies. Cross-links were reversed by incubation at 65°C with 0.2 M NaCl. DNA was purified using QIA Quick spin columns (Qiagen) and subsequently subjected to qPCR using qPCR Mastermix Plus for SYBR Green I. Values were normalized against the histone H3 occupancy level of each promoter. An irrelevant locus (14.4 kb upstream of the second exon) was amplified as a negative control. The primers utilized in the ChIP experiment are described in supplemental Table S3. For detection of the IE exon, primers were set outside of the two loxP sites, so that the primers amplified the genomic fragment even from the IE exon-deleted mice.
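The normalization of the methylation signal against histone H3 occupancy can be sketched as follows. The text states only that values were normalized against H3 occupancy at each promoter; the percent-input calculation, the input fraction, and all Ct values below are assumptions introduced for illustration, and the function names are hypothetical.

```python
import math

def percent_input(ct_ip, ct_input, input_fraction):
    """Percent of input recovered by an immunoprecipitation.
    The input Ct is first adjusted for the fraction of chromatin
    saved as input (for example 0.0625 for 1/16 of the sample)."""
    adjusted_input_ct = ct_input - math.log2(1.0 / input_fraction)
    return 100.0 * 2.0 ** (adjusted_input_ct - ct_ip)

def h3_normalized_signal(ct_mark_ip, ct_h3_ip, ct_input, input_fraction):
    """Methylation-mark enrichment divided by total H3 occupancy at the
    same locus, so that changes in nucleosome density are factored out."""
    mark = percent_input(ct_mark_ip, ct_input, input_fraction)
    h3 = percent_input(ct_h3_ip, ct_input, input_fraction)
    return mark / h3

# Hypothetical Ct values for one promoter region, for illustration only.
h3_recovery = percent_input(ct_ip=22.0, ct_input=24.0, input_fraction=0.0625)
k4me3_over_h3 = h3_normalized_signal(
    ct_mark_ip=24.0, ct_h3_ip=22.0, ct_input=24.0, input_fraction=0.0625
)
print(h3_recovery, k4me3_over_h3)
```

Because the mark and total H3 are compared at the same locus and against the same input, the input term cancels in the ratio; it is kept here so that each recovery can also be inspected on its own.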

Gata1 IE Exon Is Indispensable for Embryonic Hematopoiesis-We generated a mouse line harboring a neo-cassette sandwiched between the first and second loxP sequences in the 5′-region of the IE exon (Fig. 1A, Gata1.05flox). We also inserted a third loxP sequence into the 3′-region of the IE exon. We found that hemizygous Gata1.05flox male embryos (Gata1 G1.05F/Y) died by E12.5 (data not shown), indicating that Gata1 gene expression from the Gata1.05flox allele was insufficient to sustain embryonic erythropoiesis. This is in good agreement with the observation that hemizygous Gata1-null male mice (Gata1 −/Y) and Gata1.05 mutant male mice died by E12.5 (3,14).
To achieve complete deletion of the IE exon (ΔIE allele) from the Gata1 gene, we crossed female mice harboring the Gata1.05flox allele (i.e. Gata1 G1.05F/X) with Ayu-1-Cre male mice, which carry the Ayu-1-Cre transgene for ubiquitous deletion of loxP-flanked gene segments (20). Among the offspring of this mating, we did not find any female mice harboring the Gata1.05flox allele concomitant with the Ayu-1-Cre transgene, suggesting that complete recombination at the Gata1 locus had occurred. Indeed, all female mice harboring the mutant Gata1 allele and the Ayu-1-Cre transgene exhibited deletion of the IE exon (Gata1-ΔIE). In subsequent generations, male mice were born only when the mice harbored a wild-type Gata1 allele (data not shown).
We also conducted serial embryo analyses and found that hemizygous Gata1-ΔIE male embryos showed severe anemia (Fig. 1B) and died by E12.5 (data not shown). These results thus indicate that the level of GATA1 in hemizygous Gata1-ΔIE males was insufficient to support embryonic hematopoiesis in the yolk sac, as is the case for Gata1 G1.05F/Y embryos.
Deletion of the IE Exon in Adulthood Results in the Accumulation of Erythroid Progenitors-As the neo-cassette located 5′ of the IE exon (Fig. 1A) attenuates expression of the Gata1 gene, we removed the neo-cassette from the Gata1.05flox allele and generated two independent mouse lines harboring a conditional knock-out (IEflox) allele (supplemental Fig. S1). We confirmed that hemizygous male mice harboring the IEflox allele (Gata1 IEF/Y) but not harboring the neo-cassette were born according to Mendelian inheritance and showed no abnormal hematopoietic parameter (Table 1 and data not shown). We then crossed the IEflox heterozygous female mice (referred to as Gata1 IEF/X mice) with Mx1-Cre male mice, in which Cre recombinase was expressed under regulation of the interferon-responsive promoter (21). Gata1 IEF/Y male mice with or without the Mx1-Cre transgene were utilized for further examination.
We first examined the hematological indices of Gata1 IEF/Y males with or without the Mx1-Cre transgene using a hemocytometer. As shown in Table 1, Gata1 IEF/Y male mice with the Mx1-Cre transgene displayed mild thrombocytopenia, but not anemia, before treatment with inducers. We surmise that this may be caused by partial recombination of the allele due to leaky expression of Cre recombinase, as previously described (22). We induced Cre recombinase expression with pI-pC (polyinosinic-polycytidylic acid) in adult mice. Three- to 4-week-old mice received seven consecutive intraperitoneal pI-pC injections (30 mg/kg of body weight) every 3 days, and the day of the final pI-pC treatment was designated day 0. For simplicity, we refer to pI-pC-treated Gata1 IEF/Y :Mx1-Cre mice as Gata1 IECKO/Y mice. We used pI-pC-treated Gata1 IEF/Y mice without Mx1-Cre as control mice.
Southern blot analysis revealed that the ΔIE allele appeared exclusively in genomic DNA recovered from the bone marrows and spleens of Gata1 IECKO/Y mice on day 5, whereas only a trace level of the IEflox allele was detected in Gata1 IECKO/Y mice (Fig. 2A). We subsequently examined the hematological indices of the mice after day 20. As shown in Table 1, the Gata1 IECKO/Y mice suffered from severe thrombocytopenia, whereas the severity of anemia varied among individuals. For instance, 14 of 93 Gata1 IECKO/Y mice showed mild anemia with hematocrit (Hct) values more than 40%. When we evaluated recombination efficiency, we found that less than 10% of non-recombined IEflox allele remained in the spleen and bone marrow of Gata1 IECKO/Y mice whose Hct values were below 32.5%. By contrast, ~50% (spleens) and 20% (bone marrows) of the Gata1 gene was the IEflox allele in mutant mice with mild anemia (Hct > 42.9%). We surmise that the residual hematopoietic cells that escaped the Cre recombinase activity might have contributed to expansion of the erythroid lineage in spleens, which consequently modified the phenotype of the mice. Therefore, we employed Gata1 IECKO/Y mice whose Hct values were less than 40% after day 20 for further analysis.
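The two quantities used above, the fraction of non-recombined IEflox allele and the hematocrit cutoff for inclusion, can be expressed as a short sketch. The text does not describe how recombination efficiency was computed; estimating it from the relative signal of the IEflox and ΔIE bands is an assumption, and the function names and band signals below are hypothetical. Only the 40% Hct cutoff is taken from the text.

```python
def unrecombined_fraction(flox_signal, delta_signal):
    """Fraction of the Gata1 allele that remains IEflox, estimated from
    band signals. Using band signal ratios is an assumption; the text
    does not specify how recombination efficiency was measured."""
    total = flox_signal + delta_signal
    if total <= 0:
        raise ValueError("band signals must sum to a positive value")
    return flox_signal / total

def eligible_for_analysis(hct_percent, cutoff=40.0):
    """Inclusion rule stated in the text: Hct below 40% after day 20."""
    return hct_percent < cutoff

# Hypothetical band signals (arbitrary units), for illustration only.
severe = unrecombined_fraction(flox_signal=10.0, delta_signal=90.0)
mild = unrecombined_fraction(flox_signal=50.0, delta_signal=50.0)
print(severe, eligible_for_analysis(29.9))
print(mild, eligible_for_analysis(42.9))
```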
Although we noticed significant leukocytosis in Gata1 IECKO/Y mice, this might be due to stray nucleated erythrocytes in the circulating blood being miscounted by the hemocytometer. In fact, a considerable number of nucleated erythrocytes appeared on the peripheral blood film of Gata1 IECKO/Y mice (Fig. 2B). These hematological abnormalities persisted during the observation period (up to 10 months; data not shown). In addition, the spleens of Gata1 IECKO/Y mice were significantly enlarged (Fig. 2, C and D), with apparent destruction of the splenic architecture. Due to expansion of the red pulp, we could not find a demarcation between white and red pulp in Gata1 IECKO/Y mouse spleens (Fig. 2E). These results thus indicate that loss of the IE exon provokes erythroid hyperplasia in mice.
Skewed Erythroid Differentiation in Gata1 IECKO/Y Mice-To further study the erythroid phenotype of Gata1 IECKO/Y mice, we examined the expression of erythroid surface antigens by flow cytometry. It has been reported that by using CD71 and Ter119, erythroblasts can be categorized into several fractions (23). As described, bone marrow cells from control mice sorted into CD71 high Ter119 low (region 2), CD71 high Ter119 high (region 3), and CD71 med Ter119 high (region 5) fractions showed morphological characteristics that correspond to proerythroblasts, basophilic erythroblasts, and polychromatophilic erythroblasts, respectively (Fig. 2, F and H). Importantly, whereas only a small number of cells from control mice were observed in region 2 (Fig. 2F), hematopoietic cells from Gata1 IECKO/Y mice were abundant in region 2 (Fig. 2G). This suggests that the IE exon was responsible for erythroid development toward the stage when Ter119 expression was up-regulated. Surprisingly, a considerable number of bone marrow cells from Gata1 IECKO/Y mice were sorted into the CD71 low Ter119 low (region 4) fraction (Fig. 2G).
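The region assignments described above can be summarized as a lookup table. This is only a restatement of the gating scheme in the text for control mice; the categorical levels ("high", "med", "low") stand in for fluorescence gates whose numerical boundaries are not given here, and the names REGIONS and classify are hypothetical.

```python
# (CD71 level, Ter119 level) -> (region number, morphology in control mice).
# Morphology is None where the text does not assign one for control cells.
REGIONS = {
    ("high", "low"): (2, "proerythroblast"),
    ("high", "high"): (3, "basophilic erythroblast"),
    ("med", "high"): (5, "polychromatophilic erythroblast"),
    ("low", "low"): (4, None),
}

def classify(cd71_level, ter119_level):
    """Return (region, control morphology) for a CD71/Ter119 phenotype,
    or None if the combination is not one of the regions in the text."""
    return REGIONS.get((cd71_level, ter119_level))

print(classify("high", "low"))
print(classify("low", "low"))
```

Cells from Gata1 IECKO/Y mice fall into the same regions but, as described in the text, do not necessarily show the morphology that control cells show in that region, which is why the table records control morphology only.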
Morphological analyses revealed that the cells from Gata1 IECKO/Y mice in regions 2 and 4 showed features of polychromatic and orthochromatic erythroblasts, respectively. The latter cells were small sized, mature erythroblasts with dense nuclear chromatin. The morphologies of the cells from Gata1 IECKO/Y mice in regions 2 and 4 resembled those of cells from control mice in regions 3 and 5, respectively (Fig. 2, F-I).
These results collectively showed that the phenotypes observed in erythroid cells of Gata1 IECKO/Y mice are quite different from those of previously established Gata1-deficient mice. A complete loss of GATA1 provokes apoptotic cell death and differentiation arrest of erythroid progenitors. By contrast, Gata1 gene expression at 5% of the endogenous level protects erythroid cells from apoptosis, but does not promote erythroid differentiation (19,24). Two preceding conditional knock-out experiments further revealed that ablation of the entire Gata1 gene in adulthood either blocks differentiation at the proerythroblast stage or leads to a phenotype resembling human pure red cell aplasia (4).
Loss of the IE Exon Results in Thrombocytopenia with Megakaryocytic Hyperplasia-Several GATA1-deficient mouse lines reported to date suffer from thrombocytopenia with progenitor hyperplasia (4-7, 19). Therefore, we investigated the megakaryocytic phenotype of Gata1 IECKO/Y mice. Whereas Gata1 IEF/Y :Mx1-Cre mice showed mild thrombocytopenia, deletion of the IE exon markedly worsened the megakaryocytic phenotype. The Gata1 IECKO/Y mice showed severe thrombocytopenia (see Table 1).
Flow cytometric analysis revealed that CD41+ CD61+ megakaryocytes were markedly increased in the bone marrow of Gata1 IECKO/Y mice (Fig. 3A, from 9.1 to 14.0%). We found an accumulation of variously sized acetylcholine esterase-positive cells in Gata1 IECKO/Y spleen sections (Fig. 3B, brown staining). These megakaryocytic phenotypes resemble those of previously reported Gata1-deficient mice (4, 6, 7). It has been reported that GATA1 supports megakaryopoiesis at multiple steps. For instance, the N-terminal (NT) domain is important to regulate proliferation of immature megakaryocytes, whereas the N-finger and interaction with FOG1 are necessary for terminal maturation of megakaryocytes (18,25). Therefore, we deduced that the IE exon deletion per se disrupted megakaryopoiesis, leading to accumulation of megakaryocytes with interrupted terminal maturation.
FIGURE 2. Conditional deletion of the IE exon causes dyserythropoiesis. A, Southern blotting analysis of Gata1 IECKO/Y and control mice. Genomic DNAs were digested with EcoRI and EcoRV restriction enzymes, followed by hybridization with the probe corresponding to the upstream region of the IE exon (shown in Fig. 1A, black bar). Arrowheads indicate the bands corresponding to the IEflox allele (2.9 kb, white arrowhead) and the ΔIE allele (2.5 kb, black arrowhead), respectively. BM, bone marrow. Two independent mice from day 5 were utilized for each genotype. B, Wright-Giemsa staining of peripheral blood smears from day 40. Note that for Gata1 IECKO/Y mice a considerable number of erythroblasts appeared on the film (right panel, arrowheads). An erythroblast at high magnification is shown in the inset. Scale bars represent 10 μm. C and D, Gata1 IECKO/Y mice displayed massive splenomegaly. A representative example is shown in C. The weights of Gata1 IECKO/Y spleens were ~6-fold greater than those of control mice (Gata1 IECKO/Y mice, n = 10; control mice, n = 9; p < 0.0001). E, histological analysis of spleens. Note that the splenic architecture was destroyed due to the massive expansion of erythroid and megakaryocytic cells.
Gata1 mRNA Expression Is Preserved in Erythroid Cells but Not in Megakaryocytic Cells in Gata1 IECKO/Y Mice-We found mature erythroblasts in Gata1 IECKO/Y mouse bone marrow, which were beyond the stage of critical differentiation arrest set by GATA1 deficiency. This phenotype differs largely from phenotypes reported for several GATA1-deficient mouse lines, including a similar conditional knock-out of the Gata1 gene in adult mice. One plausible explanation for this discrepancy is that there may be a certain compensatory expression of GATA1 in Gata1 IECKO/Y mice. To test this possibility, we determined the expression level of Gata1.
We conducted RT-PCR analysis using a primer set designed for the amplification of sequences corresponding to the whole open reading frame of Gata1 mRNA.
To our surprise, semi-quantitative RT-PCR revealed that the expression level of Gata1 mRNA in Gata1 IECKO/Y mouse bone marrow is similar to that in control and wild-type mice (Fig. 4A). Because this was an unexpected result, we repeated the analysis and also executed real time qRT-PCR analysis that quantifies Gata1 gene expression corresponding to the N- and C-finger domains. The semi-quantitative and real time qRT-PCR analyses were consistent and verified that Gata1 mRNA was expressed in Gata1 IECKO/Y and control mice at equivalent levels (Fig. 4B).
We further examined Gata1 expression in the isolated erythroid and megakaryocytic cell fractions. Consistent with the results from the whole bone marrow cell analysis (Fig. 4, A and B), Gata1 mRNA expression in the CD71-positive erythroid cells of Gata1 IECKO/Y mice was virtually equivalent to that in wild-type cells and Mx1-Cre-negative control cells (Fig. 4C). By contrast, Gata1 expression in CD41-positive megakaryocytes from Gata1 IECKO/Y mice was markedly reduced. The expression level was close to that in B220+ lymphoid cells that were used as a negative control. These findings demonstrate that Gata1 mRNA expression in Gata1 IECKO/Y mice is preserved in erythroid cells, but not in megakaryocytic cells.

Alternative First Exons (IEb/c and IEd) Compensate in Transcription of the Gata1 Gene-To discover how Gata1 mRNA is expressed at the wild-type level without the IE exon/promoter, we studied the mouse Gata1 gene structure. We previously identified an alternative first exon/promoter of the Gata1 gene essential for Gata1 gene expression in the testis and called it the IT exon (11,12). Additionally, IEb and IEc first exons were also found (2,9,13) and, due to their substantial overlap, we refer to them together as the IEb/c exon. The IEb/c exon is located in the first intron and was reported to function occasionally in erythroid cells (2,9,13). The locations of the alternative first exons of Gata1, i.e. the IT and IEb/c, are depicted in Fig. 5A.
To delineate how loss of the IE exon was compensated for in Gata1 IECKO/Y mice, we performed 5′-RACE analysis using the antisense primer for the Gata1 third exon sequence and RNA samples extracted from bone marrow cells. As shown in Fig. 5B, the Gata1 cDNA fragment starting from the IE exon was the expected size and was detected only in the control mouse sample. By contrast, two novel bands of slightly smaller (gray arrowhead) and larger (black arrowhead) sizes were observed in the Gata1 IECKO/Y mouse sample.
These 5′-RACE products were cloned and their sequences were determined. In this analysis, the first nucleotide of the second exon was set to +1 based on GenBank sequence data (accession number NM_008089.1). In 12 of 18 clones from control mice, the transcription start (TS) site was identified at position −74, which is 25 bp upstream of the previously reported first nucleotide of the IE exon (Fig. 5C). Although multiple TS sites were identified upstream of position −74 in control mice, they were all clustered near the IE exon. We also sequenced 22 clones and 28 clones from the slow and fast migrating bands, respectively, from Gata1 IECKO/Y mice (Fig. 5C). Sequence information from the slow migrating band revealed that 18 of 22 TS sites are located in or near the IEb/c exon (2,13). The main TS site is located at position −169, which is close to the first nucleotide of the IEc exon (13). The remaining 4 clones corresponded to the IT exon.
Upon sequencing the fast migrating band, we found that most TS sites (24 of 28) are located in the second intron. Because no first exon had been identified in this region, we named this novel exon the IEd exon (Fig. 5, A and C). The remaining 4 TS sites corresponded to IT and IEb/c exons, perhaps due to minor technical problems. In good agreement with this finding, we found that variant transcripts derived from the IEb/c and IEd exons were abundantly expressed in Gata1 IECKO/Y mice, whereas in control mice the IE exon is predominantly utilized and these variants were not even detectable (Fig. 5D).
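The clone counts reported for the 5′-RACE products can be tabulated to show the share of clones that start at the predominant site in each band. The counts below are those stated in the text; the remainder of the control clones (6 of 18) is derived by subtraction, and the dictionary layout and the function name major_site are hypothetical.

```python
# Clone counts per 5'-RACE band, as reported in the text.
race_clones = {
    "control": {"IE (position -74)": 12, "IE (other nearby TS sites)": 6},
    "IECKO slow band": {"IEb/c": 18, "IT": 4},
    "IECKO fast band": {"IEd": 24, "IT or IEb/c": 4},
}

def major_site(band):
    """Return the most frequent start site in a band and its share of clones."""
    counts = race_clones[band]
    site = max(counts, key=counts.get)
    return site, counts[site] / sum(counts.values())

for band in race_clones:
    site, share = major_site(band)
    print(f"{band}: {site} accounts for {share:.1%} of clones")
```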
Consistent with previous qRT-PCR results, variant transcripts from the IEb/c and IEd exons were observed predominantly in CD71-positive erythroid cells, but not in CD41-positive megakaryocytic cells from Gata1 IECKO/Y mice (Fig. 5E). This implies that these alternative first exons/promoters are utilized in a lineage-specific manner.
FIGURE 4 (partial legend). B, Gata1 gene expression in bone marrow cells analyzed by qRT-PCR using a TaqMan probe that recognizes the boundary sequence corresponding to the fourth and fifth exons of the Gata1 gene. Four control and six Gata1 IECKO/Y mice were utilized for this experiment. The mean value of control mice was set to one. C, erythroid and megakaryocytic Gata1 gene expression was assessed by qRT-PCR using the same TaqMan probe as in B. Cells possessing the indicated lineage markers were independently isolated by MACS from three mice for each genotype. The mean value of wild-type (WT) CD71+ cells was set to one. Note that Gata1 expression in CD41+ cells was significantly reduced in Gata1 IECKO/Y bone marrow, whereas that in CD71+ cells was preserved.
FIGURE 5 (partial legend). ... Fig. 4C were used in this experiment. Note that variant transcripts were abundantly expressed in CD71+ cells from Gata1 IECKO/Y mice. The mean value of the transcripts in Gata1 IECKO/Y CD71+ cells was set to one. F, ChIP-qPCR analysis of the histone H3 lysine 4 (H3K4) methylation level in the Gata1 first exon/promoter regions. An aliquot of each sample shown in the figure was used for immunoprecipitation. The values were normalized against the histone H3 occupancy on each promoter. White, gray, and black bars represent control mice, phenylhydrazine-treated control mice with hematocrit values less than 33%, and Gata1 IECKO/Y mice, respectively. The values exhibiting a significant change from those of control mice are indicated by asterisks (*, p < 0.05).

IEb/c and IEd Exons Are Activated in the Absence of the IE Exon-To examine changes in the chromatin configuration and epigenetic modifications in Gata1 IECKO/Y mice, we performed ChIP assays. To exclude the possibility that alternative promoters might be activated in response to anemic stress, we used phenylhydrazine-treated anemic control mice with hematocrit values less than 33%. We first examined the histone H3 level at the regions containing the IE, IEb/c, or IEd exons. As shown in supplemental Fig. S2, no conspicuous difference in histone H3 occupancy was observed in control mouse groups despite anemic stress, whereas the levels were changed in Gata1 IECKO/Y mice, probably due to an unexpected instability of the genome structure.
We next utilized specific antibodies directed against dimethylated or trimethylated histone H3 lysine 4 (H3K4). Dimethylation of H3K4 marks the region permissive for transcription, and trimethylation of H3K4 normally marks the 5′ end of highly transcribed genes (26). Upon normalizing the ChIP-qPCR data against the total histone H3 level, we noticed that anti-dimethylated and -trimethylated H3K4 antibodies precipitated the IE, IEb/c, and IEd exon regions differentially (Fig. 5F).
Anti-dimethylated and -trimethylated H3K4 antibodies precipitated genomic regions containing the IE exon from the Gata1 IECKO/Y mice (black bars in the left panels) to nearly the same extent as from anemic control mice (gray bars). By contrast, these antibodies precipitated much less of this genomic region from normal non-anemic control mice (white bars), indicating that the transcriptional potential of the IE exon region is activated in response to anemic stress and that this activation is independent of the presence of the IE exon. These antibodies precipitated slightly more of the IEb/c and IEd exon regions from the anemic control mice than from the non-anemic control mice (compare gray and white bars in the middle and right panels).
Importantly, the anti-trimethylated H3K4 antibody precipitated a greater amount of the IEb/c and IEd exon regions from Gata1 IECKO/Y mice than from the anemic control mice (Fig. 5F, lower middle and right panels), indicating that the IEb/c and IEd exons are highly activated in the absence of the IE exon. Similarly, anti-dimethylated H3K4 antibody precipitated markedly more of the IEd exon region from Gata1 IECKO/Y mice than from the anemic control mice (upper right panel). These results are consistent with the results of semi-quantitative RT-PCR and qRT-PCR analyses of Gata1 gene expression (Fig. 5, D and E). We speculate that the IE exon is required to prevent irrelevant expression of variant transcripts from the alternative transcription start sites.
IE Exon Is Indispensable for Production of Full-length GATA1 Protein in Vivo-It should be noted that, although there was a comparable level of Gata1 mRNA in Gata1 IECKO/Y mice, they exhibited severe anemia (Hct 29.9%; see Table 1) in contrast to wild-type mice. This finding prompted us to speculate that ablation of the IE exon may influence GATA1 protein production, for example through the efficiency of translation and/or mRNA stability. To address this hypothesis, we conducted immunoblotting analyses using two different anti-GATA1 antibodies: C20 recognizes the C-terminal region of GATA1 and N6 recognizes the NT region of GATA1.
In this analysis, we discovered that a 34-kDa short form of GATA1 is predominantly expressed in Gata1 IECKO/Y mice (Fig. 6A). This size is practically identical to that of the NT domain-truncated (ΔNT) type of GATA1 (Fig. 6A), which is translated from the second methionine codon located in the third exon of the Gata1 gene (27). The C20 antibody detected this short form of GATA1, whereas the N6 antibody that recognizes the N-terminal domain did not. Importantly, we could not detect this short form GATA1 in either wild-type mice or control mice (Fig. 6A). These outcomes therefore suggest that the short form of the GATA1 protein was produced from transcripts containing either the IEb/c or IEd exons, both of which only provide the 5′-UTR.
An immunohistochemical analysis using the N6 antibody confirmed that full-length GATA1 protein did not accumulate efficiently in erythroid and megakaryocytic cells of the Gata1 IECKO/Y mouse spleen (Fig. 6B). Notably, we could not detect a conspicuous level of GATA1 protein expression in erythroid cells of Gata1 IECKO/Y mice, even with the C20 antibody (Fig. 6B). We surmise that the Gata1 mRNAs transcribed from the IEb/c or IEd exons may be translated inefficiently. Taken together, these analyses suggest that the alternative first exons of the Gata1 gene do not have the ability to produce full-length GATA1 in vivo.
FIGURE 6. In Gata1 IECKO/Y mice, GATA1 with a truncated N terminus is expressed alternatively to full-length GATA1. A, immunoblotting analysis of GATA1 from bone marrow. A bone marrow sample recovered from Gata1 G1.05/Y mice rescued by transgenic expression of ΔNT-GATA1 (indicated as ΔNTR) (30) and the cell lysate of murine erythroleukemia cells (MEL) were employed as markers of ΔNT-GATA1 (black arrowhead) and full-length GATA1 (white arrowheads), respectively. Of great interest, in Gata1 IECKO/Y bone marrow, bands of a molecular weight equivalent to that of ΔNT-GATA1 were detected with C20 antibody, whereas the N6 antibody did not recognize full-length GATA1 protein. B, immunohistochemical analysis of spleen sections from control mice (left panels) and Gata1 IECKO/Y mice (right panels) using N6 (bottom panels) and C20 (top panels) anti-GATA1 antibodies.
The Gata1 transcripts containing the IE, IT, or IEb/c exons retained open reading frames originating from the first methionine codon. On the contrary, Gata1 mRNA harboring the IEd exon lacks the first methionine codon, so that the second methionine codon (Met-84) has to be used as the translation initiation codon. We carried out an in vitro translation assay using a reticulocyte lysate system to examine how each 5′-UTR of the Gata1 transcript affects production of GATA1. We found that not only full-length GATA1 but also short forms of GATA1 were abundantly expressed from the Gata1 transcripts originating from the IE, IT, or IEb/c exons in this in vitro assay (supplemental Fig. S3). Furthermore, an additional product was found to be translated from the IEb/c exon (supplemental Fig. S3), which might be derived from a methionine codon in the untranslated sequence of the IEb/c exon. These results thus support the contention that translation of Gata1 mRNAs is finely regulated in vivo and that artificial in vitro translation experiments cannot recapitulate this regulation precisely.
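The dependence of the protein N terminus on which methionine codon is present in the transcript can be illustrated with a minimal sketch. It assumes a strict first-AUG scanning model and uses short toy sequences that are purely hypothetical (they are not Gata1 sequences); the function name `first_orf` is likewise introduced here only for illustration. The sketch covers only the simple case in which the first methionine codon is absent, as described for the IEd transcript, and does not attempt to model the in vivo regulation of initiation codon selection.

```python
# Toy illustration (hypothetical sequences, not Gata1): under a strict
# first-ATG scanning model, a transcript lacking the first start codon
# yields an N-terminally truncated open reading frame.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_orf(mrna: str) -> str:
    """Return the ORF that starts at the first ATG found scanning 5'->3'.

    The returned string runs from that ATG through the first in-frame
    stop codon (inclusive). An empty string means no ATG was found.
    """
    seq = mrna.upper()
    start = seq.find("ATG")
    if start == -1:
        return ""
    # Step codon by codon in the frame set by the start codon.
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return seq[start:i + 3]
    return seq[start:]  # no in-frame stop codon before the 3' end


# Hypothetical 5'-UTR and coding region: Met-Glu-Phe-Pro-Met-Gly-Ser-stop.
UTR = "GGCACC"
CODING = "ATGGAATTCCCTATGGGCAGCTAA"

full_transcript = UTR + CODING           # first ATG present
truncated_transcript = UTR + CODING[3:]  # first ATG missing

orf_full = first_orf(full_transcript)
orf_short = first_orf(truncated_transcript)

print(len(orf_full) // 3 - 1, "residues from the first ATG")
print(len(orf_short) // 3 - 1, "residues from the downstream in-frame ATG")
```

In this toy case the shorter reading frame is the 3′ portion of the longer one, so the two products share a C terminus and differ only at the N terminus, which is the pattern that an antibody pair such as C20 and N6 is able to distinguish.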

DISCUSSION
The IE exon of Gata1 is mainly employed for gene expression in erythroid and megakaryocytic lineages, whereas the other first exons, i.e. the IT, IEb/c, and IEd, and their affiliated promoters are rarely used in these cell lineages (9,10,13). To further delineate the in vivo contribution of the IE exon to the maintenance of erythroid homeostasis, we conditionally deleted the IE exon in adulthood. Specific loss of the IE exon in adult mice led to skewed erythropoiesis with aberrant maturation of erythroid lineage cells. The primary reason for this dyserythropoiesis appears to be the production of variant Gata1 mRNAs that fail to be translated into full-length GATA1. The mRNAs from the IEb/c and IEd exons tend to be translated into a short form of GATA1 that lacks the NT domain. The accumulation of abnormal erythroid cells in the conditional IE-deficient mouse is in sharp contrast to conditional deletion of the entire Gata1 gene in adults, as the latter mice display bone marrow resembling the human disease known as pure red cell aplasia (4). In this study, two important revelations were made relating to the function of the IE exon in adult hematopoiesis. First, the 5′-UTR sequence encoded by the IE exon appears to be critical for in vivo production of the full-length GATA1 protein. Second, the IE exon is essential for Gata1 gene expression in megakaryocytes. These results thus demonstrate that the Gata1 gene is under complex regulation by the multiple first exon system and that the IE exon is critical in defining the proper production of Gata1 mRNA and GATA1 protein.
Exploiting a transgenic reporter mouse assay, we previously demonstrated that the IE exon is critical for Gata1 gene expression in hematopoietic cells, but not in Sertoli cells in the testis (28). A genomic region of ~8.5 kb including the 3.9-kb upstream sequence, the IE exon, and the first intron is referred to as the GATA1 hematopoietic regulatory domain (G1HRD). The G1HRD can recapitulate Gata1 gene expression sufficiently to sustain embryonic and adult hematopoiesis (29,30). This can also be achieved by a 1-kb artificial mini-gene composed of the IE exon and four discrete cis-regulatory elements within the G1HRD (31). Substantiating these findings, our study showed that hemizygous males with congenital deletion of the IE exon die in utero. Collectively, these results indicate that the IE exon is indispensable for the development of embryonic hematopoiesis.
In the presence of the IE exon, trimethylated H3K4 was hardly detectable around the alternative first exons (IEb/c and IEd) in erythroid cells. In contrast, these genomic regions of Gata1 IECKO/Y mice showed marked precipitation with the trimethylated H3K4 antibody, indicating that the IEb/c and IEd exons are activated in the absence of the IE exon. The IE promoter lacks specific recognition sites for basal transcription factors, such as an initiator sequence, a TATA-box, a purine-rich transcription factor B recognition element, and a downstream promoter element (32). Initiator sequences, however, are located close to the transcription start sites of the IEb/c and IEd exons. By contrast, strong cis-acting motifs, such as the double GATA motif, the CP2-binding motif, and the CACCC motif, are located proximal to the IE exon (9,10,33). We speculate that these might contribute to the high-level transcription of the IE exon in erythroid cells. A variant transcript from the IEb/c exon appeared to be activated in mice harboring a Gata1 mutant allele that lacks the double GATA motif (13). Based on these results we conclude that proper regulation by erythroid transcription factors is important for expression of the Gata1 gene from the IE exon.
An intriguing observation is that the 5′-UTR sequence encoded by the IE exon is required for in vivo production of full-length GATA1 protein. It should be noted that although GATA1 has several aspartic acids carrying caspase recognition motifs (34), the predicted molecular masses of the cleaved products were different from those of the products recognized in Gata1 IECKO/Y mice. In addition, the first translation initiation codon in exon 2 is preserved in the transcript derived from the IEb/c exon. Therefore, we envisage that the IEb/c transcript loses the potential to select the first translation initiation codon or to produce full-length GATA1 protein in vivo. In our in vitro translation analysis, this preferential use of the second translation initiation codon was not reproducible. This indicates that the 5′-UTR sequence encoded by the IE exon affects selection of the translation initiation codon of Gata1 mRNA and that this is supported by certain machinery that specifically operates in vivo.
A multiple first exon system has been observed in a substantial number of genes. More than 50% of human genes are predicted to retain alternative first exons/promoters (35). Indeed, a considerable number of erythroid genes possess alternative first exons. These include the heme biosynthesis enzymes 5-aminolevulinate dehydratase (36), porphobilinogen deaminase (37) and uroporphyrinogen III synthase (38), the transcription factors GATA1 (11), GATA2 (39,40), and NF-E2 p45 (41,42), and also membrane protein Band 3 (43). The first exons and their affiliated promoters are often utilized to support spatiotemporal gene expression. In this study, we found that the choice of erythroid-specific first exons influences the translation efficiency of the Gata1 transcripts because of sequence variations of the 5′-UTR. The choice further affects the N-terminal structure of GATA1 and consequently GATA1 function.
It should be noted that aberrant use of the alternative first exons might potentially lead to human diseases. In fact, deregulated c-myc gene expression due to shifting to an alternative first exon/promoter that harbors a less efficient transcription elongation block may be related to the oncogenesis of Burkitt lymphoma (44). Overproduction of aromatase, a key enzyme in estrogen biosynthesis, occurs in the surrounding tissues of breast tumors due to transcriptional activation caused by switching from a weak physiological promoter to a strong promoter (45). Consequently, the local concentration of estrogen increases and promotes tumor growth and progression. Furthermore, an exon residing in the intron between the third and fourth exons of the TP73 gene is preferentially utilized as the first exon in a variety of primary human cancer cells, but not in normal tissues. As a result, an aberrant TP73 protein lacking the N-terminal transactivation domain is expressed specifically in tumor cells. Because this mutant TP73 suppresses the function of TP53 and wild-type TP73, the promoter switch from the first exon to the intronic first exon of the TP73 gene is predicted to contribute to oncogenesis (46).
Two types of GATA1-related leukemias have been reported that are caused by GATA1 dysfunction (47). One is erythroleukemia in mice caused by the insufficient expression of GATA1 (19), whereas the other is the preleukemic-stage transient myeloproliferative disorder and acute megakaryoblastic leukemia specific to cases of Down syndrome and caused by truncation of the N-terminal domain of GATA1 (identical to ΔNT-GATA1 in mice). In the former case, leukemogenesis appears to be linked to residual GATA1 activity, because decreased expression of Gata1 to 5% of the endogenous level is sufficient to protect erythroid progenitors from apoptosis, but insufficient to promote their terminal differentiation (19,24). We recently noted that heterozygous female mice carrying the ΔIE allele are prone to develop erythroblastic leukemia, similar to Gata1 G1.05/X mice.⁴ This supports our contention that GATA1 products that adequately inhibit apoptosis might be expressed through alternative first exons in erythroid cells of Gata1 ΔIE/X mice. Although leukemia has not been observed in Gata1 IECKO/Y mice, it seems plausible that the acquired mutation in the IE exon may lead to erythroblastic leukemia in mice and may also be one of the causes of human leukemias.