Alternative Promoter Identified between a Hypermethylated Upstream Region of Repetitive Elements and a CpG Island in Human ABO Histo-blood Group Genes*

We have studied the expression of human histo-blood group ABO genes during erythroid differentiation, using an ex vivo culture of AC133−CD34+ cells obtained from peripheral blood. 5′-Rapid amplification of cDNA ends analysis of RNA from those cells revealed a novel transcription start site, which appeared to mark an alternative starting exon (1a) comprising 27 bp at the 5′-end of a CpG island in ABO genes. Results from reverse transcription-PCR specific to exon 1a indicated that the cells of both erythroid and epithelial lineages utilize this exon as the transcription starting exon. Transient transfection experiments showed that the region just upstream from the transcription start site possesses promoter activity in a cell type-specific manner when placed 5′ adjacent to the reporter luciferase gene. Results from bisulfite genomic sequencing and reverse transcription-PCR analysis indicated that hypermethylation of the distal promoter region correlated with the absence of transcripts containing exon 1a, whereas hypermethylation in the interspersed repeats 5′ adjacent to the distal promoter was commonly observed in all of the cell lines examined. These results suggest that a functional alternative promoter is located between the hypermethylated region of repetitive elements and the CpG island in the ABO genes.

In 1900 Karl Landsteiner discovered the ABO blood group system, which is important in blood transfusion and in personal identification in criminal investigations (1). Two carbohydrate antigens, A and B, and their antibodies constitute this system. The functional A and B alleles at the ABO genetic locus encode the glycosyltransferases α1→3GalNAc transferase (A-transferase) and α1→3Gal transferase (B-transferase), respectively. A-transferase transfers a GalNAc residue from UDP-GalNAc to the precursor H substrate, producing A antigens as defined by the trisaccharide determinant structure GalNAcα1→3(Fucα1→2)Galβ1→R. Similarly, B-transferase catalyzes the transfer of a Gal from UDP-Gal to the same H substrate, producing B antigens defined by Galα1→3(Fucα1→2)Galβ1→R (2-5). Molecular genetic studies of human ABO genes have demonstrated that ABO genes consist of at least seven exons spanning over 18 kb of genomic DNA and that two critical single base substitutions in the last coding exon result in amino acid substitutions responsible for the different donor nucleotide sugar substrate specificity between A- and B-transferases. A single base deletion in exon 6 was shown to shift the reading frame of codons and to abolish the transferase activity of A-transferase in most O alleles (6-9).
The ABO antigens are expressed in a cell type-specific manner; the isoantigens A, B, and H of blood groups A, B, and O are not confined to red cells but are also found in most secretions and on some epithelial cells. However, they are absent in connective tissues, muscles, and the central nervous system (10). Moreover, ABH antigens are known to undergo drastic changes during development, differentiation, and maturation of cells in epithelial lineage as well as erythroid lineage. Immature cells in the basal layers in nonkeratinized stratified squamous epithelia, for example, were characterized by the expression of sialylated or unsubstituted precursor peripheral cores, whereas differentiated and mature cells in the upper layers sequentially expressed α1→2 fucosylated H structures and A/B antigens, depending on the ABO genotype of the individual (11). Similar to ABH antigen expression in epithelia, studies of A antigen expression during maturation of erythroid progenitors in a two phase liquid culture system showed that A-positive cells gradually increased during erythroid maturation (12,13). Fluorescence-activated cell sorter (FACS) analysis using monoclonal antibodies demonstrated the expression of A antigens on colony cells derived from BFU-E and CFU-E (14). The changes of ABH antigen expression have also been documented in pathological processes such as tumorigenesis. Reduction or complete deletion of A/B antigen expression in carcinomas has been reported (15-17). In addition, the loss of ABH antigens has been correlated with tumor progression of various carcinomas, including those in lung and bladder (18-21).
To elucidate the molecular basis of how ABO genes are controlled in cell type-specific expression, during normal cell differentiation, and in cancer cells lacking A/B antigens with invasive and metastatic potential, it is essential to understand the regulatory mechanism of ABO gene transcription. Previously we determined two transcription start sites just upstream from the initiation codon by the 5′-rapid amplification of cDNA ends (5′-RACE) technique using human pancreatic cDNA as a template (8). Transient transfection experiments demonstrated that the ABO gene promoter was located between −117 and +31, relative to the upstream transcription start site in cells of both epithelial and erythroid lineages (22,23). Like many housekeeping genes, the ABO gene contains a typical CpG island that extends from 0.7 kb upstream to 0.6 kb downstream from the transcription start site in exon 1 (24). Expression of the ABO genes was shown to be repressed upon DNA methylation of the CpG island in the promoter region (24,25). Neoplastic cells simultaneously harbor widespread genomic hypomethylation and regional areas of hypermethylation (26). The CpG island located in the gene promoter region is usually the main target of this regional hypermethylation. DNA methylation of CpG islands spanning the promoter is often strongly associated with transcriptional silencing. A group of the methyl-CpG binding domain proteins (MBDs) bind preferentially to methylated CpGs (27-30), and several of these MBD proteins associate with histone deacetylases or Mi-2, a member of the SWI2/SNF2 family of ATP-dependent chromatin-remodeling proteins (31-34). The direct linkages among the MBDs, histone deacetylases, and chromatin remodeling machinery have provided a basis for understanding how DNA methylation can be related to a transcriptionally incompetent chromatin state.
Further characterization of the upstream region of the ABO gene is necessary to obtain a better understanding of the mechanisms underlying the cell type-specific expression of the ABO genes and the changes in ABO gene expression during cell differentiation. It will also be informative for elucidating the molecular basis of DNA methylation in the ABO gene promoter in cancer cells.
Here we report the identification of an alternative promoter of the ABO genes.
AC133−CD34+ cells were prepared as reported previously with some modifications (35). Mononuclear cells isolated from peripheral blood with Ficoll-Paque PLUS (Amersham Biosciences) were first subjected to immunomagnetic separation using a MACS AC133 Cell Isolation Kit (Miltenyi Biotech, Auburn, CA). The cells in the flow-through fraction were then incubated with AC133 microbeads prior to application to the second column. The cells in the flow-through fraction were subjected to immunomagnetic separation using a MACS CD34 Progenitor Cell Isolation Kit (Miltenyi Biotech). The trapped cells were collected and designated as AC133−CD34+ cells. When the mononuclear cells isolated from blood were applied to the CD34 immunomagnetic separation, the cells in the flow-through fraction from the column were also collected and designated as the flow-through fraction.
Reverse Transcription-PCR (RT-PCR)-Total RNA was isolated from cultured cells using the acid guanidine thiocyanate/acid phenol method. cDNA was synthesized from total RNA (2 μg) using random hexamers and Superscript II (Invitrogen). One and 2 μl of the 20 μl of resulting single strand cDNA reaction were used as templates for RT-PCR and quantitative real time RT-PCR, respectively. Five μl of human bone marrow Marathon-Ready cDNA (Clontech, Palo Alto, CA) was also used as a template for RT-PCR.
Amplification of the ABO gene message was performed using primers ABO+116 and ABO+802 corresponding to the sequences within exons 3 and 7 of the ABO gene, respectively, as described previously (24). The starting exon-specific RT-PCR was carried out using each distinct starting exon-specific primer and the reverse primer ABO+802. The sequences of the starting exon-specific primers were 5′-GGCCGAGGTGTTGCGGACGCT-3′ (ABO+3) and 5′-GAGCTTCCTCGAGCGGACGCCA-3′ (ABOU-678), corresponding to the sequence in exon 1 which was reported previously (8,9) and the sequence in the alternative starting exon 1a reported in this paper, respectively. Conditions for the amplification were 95°C for 9 min, 35 cycles of 94°C for 30 s, and 72°C for 1 min, followed by incubation at 72°C for 10 min. Another starting exon-specific RT-PCR was performed using either one of the starting exon-specific primers and the reverse primer ABO+98, of which the sequence was 5′-CCAAACAAGACCAAGACAAGCATTATTAGG-3′, complementary to exon 2 of the ABO gene. Conditions for these amplifications were 95°C for 9 min, 35 cycles of 94°C for 1 min, 67°C for 1 min, and 72°C for 2 min. These combinations of primers were also used in quantitative real time RT-PCR. RT-PCR was performed to amplify the GATA-1, β-globin, and glyceraldehyde-3-phosphate dehydrogenase (G3PDH) gene messages according to the methods of Kosugi et al. (36).
PCR amplifications were performed in a 100-μl reaction mixture containing 20 pmol of each primer, 2.5 units of AmpliTaq Gold (Applied Biosystems, Foster City, CA), 1.5 mM MgCl2, 150 μM dNTP, and 1× buffer (Applied Biosystems). The products were resolved on a 2% agarose gel. After cloning the PCR-amplified products into a pCR2.1 plasmid vector (Invitrogen), the nucleotide sequences of the amplified fragments were determined using the ABI PRISM dRhodamine Terminator Cycle Sequencing Ready Reaction Kit with AmpliTaq DNA polymerase FS (Applied Biosystems) with both M13 forward and reverse primers.
5′-RACE-5′-RACE was performed using the 5′-RACE system, version 2.0 (Invitrogen) according to the manufacturer's instructions. The first strand cDNA was synthesized using 5 μg of total RNA and the ABO gene-specific primer, the sequence of which was 5′-TGGCCCACCATGAAGTGCTT-3′ (primer ABO+433). After homopolymeric dC tails were added to the 3′-ends of the cDNA, the second strand DNA was synthesized using the 5′-RACE Abridged Universal Anchor Primer provided by the supplier. The products were purified, followed by PCR amplification using the Abridged Universal Amplification Primer provided by the supplier and the nested primer fy-123 that was reported previously (8). Conditions for the second strand DNA amplification were 95°C for 9 min, 40 cycles of 94°C for 1 min, 55°C for 1 min, 72°C for 2 min, followed by incubation at 72°C for 10 min. PCR products were electrophoresed through a 3% agarose gel, and DNA fragments were extracted using a MERmaid kit (BIO 101, Inc., Carlsbad, CA). The DNA fragments were then ligated with a pCR2.1 plasmid vector, and the sequences were determined as described above.
Quantitative Real Time RT-PCR-Quantitative real time RT-PCR was performed using the ABI PRISM 7700 Sequence Detector System and the QuantiTect SYBR Green PCR Kit (Qiagen GmbH, Hilden, Germany). Specific amplifications of individual starting exon-specific variants of the ABO gene were performed using either starting exon-specific forward primer ABO+3 or ABOU-678 and the common reverse primer ABO+98. Conditions for both amplifications were 95°C for 15 min, 40 cycles of 94°C for 30 s, 62°C for 30 s, and 72°C for 1 min. Amplification of the G3PDH gene message was performed using the primers reported by Yin et al. (37). Conditions for these amplifications were 95°C for 15 min, 40 cycles of 94°C for 30 s, 60°C for 30 s, and 72°C for 1 min. Quantitative PCR was performed in a 50-μl reaction mixture containing 2.5 pmol of each primer, 25 μl of 2× QuantiTect SYBR Green PCR buffer, and 2 μl of the 20-μl single strand cDNA reaction. To determine the absolute copy number of the target transcripts in individual cDNA reaction mixtures, plasmids A(1) and A(1a) containing the entire ABO cDNA starting from exon 1 and exon 1a, respectively, and plasmid pG3PDH containing the entire G3PDH cDNA were used to generate a calibration curve. Construction of these plasmids is described below. Amplification of a control plasmid with an unrelated primer set did not generate an increase in reporter fluorescence, and PCR products were barely detected in gel electrophoresis after ethidium bromide staining (data not shown). The plasmid templates were measured using a spectrophotometer, and copy numbers were calculated from the absorbance at 260 nm. For each assay, a standard curve was prepared using serial dilutions of template plasmid DNA with known copy numbers in log steps from 10^7 copies down to 1 copy in a 2-μl volume. All samples to be compared were run in the same assay.
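The conversion from absorbance at 260 nm to plasmid copy number is not spelled out in the text. The following is a minimal sketch of one common way to do it, not the authors' documented calculation. It assumes the usual approximations that 1 A260 unit corresponds to about 50 ng/μl of double strand DNA and that one base pair has an average molar mass of about 650 g/mol; the function names and the example plasmid length are hypothetical.

```python
# Sketch only: constants are standard approximations, not values from the paper.
AVOGADRO = 6.022e23               # molecules per mole
GRAMS_PER_MOL_PER_BP = 650.0      # assumed average molar mass of one dsDNA base pair
NG_PER_UL_PER_A260 = 50.0         # assumed: 1 A260 unit ~ 50 ng/ul dsDNA


def dsdna_ng_per_ul(a260, dilution_factor=1.0):
    """Estimate dsDNA concentration (ng/ul) from an A260 reading."""
    return a260 * NG_PER_UL_PER_A260 * dilution_factor


def copies_per_ul(conc_ng_per_ul, plasmid_length_bp):
    """Convert a dsDNA concentration to plasmid copies per microliter."""
    grams_per_ul = conc_ng_per_ul * 1e-9
    moles_per_ul = grams_per_ul / (plasmid_length_bp * GRAMS_PER_MOL_PER_BP)
    return moles_per_ul * AVOGADRO


def log_dilution_series(top_copies=1e7, points=8):
    """Copy numbers in log steps, e.g. 10^7 down to 1 copy for 8 points."""
    return [top_copies / 10 ** i for i in range(points)]


if __name__ == "__main__":
    # Hypothetical 6.5-kb plasmid measured at A260 = 0.2 after a 10-fold dilution.
    conc = dsdna_ng_per_ul(0.2, dilution_factor=10)
    print(copies_per_ul(conc, 6500))
    print(log_dilution_series())
```

With eight points, the series runs from 10^7 copies to 1 copy, which matches the range of standards described above.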
After completion of the PCR amplification, the data were analyzed with the Sequence Detector Systems version 1.7a software (Applied Biosystems). To maintain consistency, the baseline was set automatically by the software using data collected from cycle 3 to cycle 15 in most experiments. The fluorescence of the reporter dye was plotted against the number of cycles. The threshold cycle was calculated by the sequence detection software as the cycle number at which the fluorescence of the reporter dye crossed the threshold in the log-linear range of PCR. The copy numbers of the respective ABO splice variants or G3PDH cDNA were quantified by interpolating the threshold cycles on the standard curve.
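Interpolating copy numbers from threshold cycles can be illustrated with a short sketch. It assumes the usual linear relationship between threshold cycle (Ct) and log10 of the starting copy number, fitted by ordinary least squares. The instrument software performs this step internally; the function names and the example values below are hypothetical and are not data from this study.

```python
import math


def fit_standard_curve(copies, cts):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Interpolate the starting copy number of a sample from its Ct."""
    return 10 ** ((ct - intercept) / slope)


def amplification_efficiency(slope):
    """Efficiency implied by the slope (1.0 means doubling every cycle)."""
    return 10 ** (-1.0 / slope) - 1.0


if __name__ == "__main__":
    # Hypothetical standards from 10^7 copies down to 1 copy.
    standards = [10 ** k for k in range(7, -1, -1)]
    standard_cts = [38.0 - 3.32 * math.log10(c) for c in standards]
    slope, intercept = fit_standard_curve(standards, standard_cts)
    print(slope, intercept, amplification_efficiency(slope))
    print(copies_from_ct(25.0, slope, intercept))
```

A slope near −3.32 corresponds to an efficiency near 100%, which is a convenient sanity check on a standard curve before interpolating unknowns.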
Plasmids-The plasmid pRc/AAAA was constructed by digesting the human A-transferase expression construct pAAAA (7) with EcoRI, modifying both ends with XbaI linker, and directionally ligating the fragment into the XbaI-digested pRc/CMV vector (Invitrogen). The HindIII site in the pRc/CMV vector was located downstream from the enhancer/promoter sequences from the immediate early gene of human cytomegalovirus and upstream from the A-transferase cDNA sequence. The SacII site was located in the last coding exon of the human A-transferase cDNA. These HindIII/SacII sites were used to facilitate the subcloning of PCR-amplified fragments for construction of A-transferase expression vectors. Using fragment L obtained from exon 1a-specific RT-PCR as template (see Fig. 3A), the PCR amplification was carried out with the 5′-primer including the nucleotides corresponding to the transcription initiation site in exon 1a plus the HindIII restriction enzyme recognition site and the 3′-primer ABO+802, followed by digestion with HindIII and SacII, and ligation with the HindIII/SacII-digested pRc/AAAA to generate plasmid A(1a) for expression of the alternative starting exon 1a-containing cDNA. Because the expression plasmid pRc/AAAA contained human A-transferase cDNA with intron 6 and because the PpuMI site was located in the third exon of the human A-transferase cDNA, the expression plasmid A(1) for the entire A-transferase cDNA starting from exon 1 without intron 6 was constructed by replacing the PpuMI/SacII fragment from plasmid pRc/AAAA with the PpuMI/SacII fragment from plasmid A(1a).
We have recently found that a few nucleotides were missing in the upstream sequence of the ABO gene published (22) and deposited in GenBank, and the revised sequence has been submitted (accession number U22302). Nomenclature used for the various reporter constructs in this paper is based on the nature of the inserted fragments. Letter symbols reflect the restriction enzyme cleavage sites used for the construction of these plasmids, whereas numerals indicate the end points of the primers used for PCR. For example, the BpN construct contains the BpnI/NcoI fragment (between −409 and +31), and the −832Xh construct contains the fragment bordered with PCR primer sequence starting at −832 on one end and the XhoI site on the other. The DNA fragments that were generated by either restriction endonuclease digestion or PCR were subcloned into a luciferase (luc) reporter vector, the pGL3-basic vector (Promega, Madison, WI), of which the SmaI site was converted to the EcoRI site to facilitate the subcloning of the genomic fragments or PCR-amplified fragments into the EcoRI/NcoI sites just upstream from the luciferase gene (22). Luciferase reporter vector plasmids XhN, KN, and SN have been described previously (22). Plasmid XhNΔ−335/−118 was prepared by digestion of plasmid XhN with SmaI and SacII, followed by blunt end modification and self-ligation. Plasmid XhNΔ−275/−118 was prepared by directional ligation of the double strand oligonucleotide, corresponding to the −335 to −276 sequence, with the SmaI/SacII-digested, blunt ended XhN. Orientation and 5′- and 3′-boundaries of the insert of all of the constructs used in this study were verified by restriction enzyme mapping and by DNA sequencing. For all of the constructs containing PCR-amplified fragments and chemically synthesized oligonucleotides, sequencing was performed over the entire region of the amplified sequences and the oligonucleotides.
Plasmid DNA was purified by applying alkaline lysed samples onto two successive CsCl-ethidium bromide gradients.
Transfection and Luciferase Assay-Transient transfection experiments into KATOIII cells, MKN28 cells, and HEL cells were performed as reported previously (22-24). KATOIII cells were cotransfected by electroporation with 10 μg of luciferase reporter plasmids and 4 μg of the control plasmid containing the Rous sarcoma virus long terminal repeat directing Escherichia coli β-galactosidase expression. MKN28 cells were transfected with Lipofectin reagent (Invitrogen); 1.4 μg of luciferase reporter and 0.6 μg of β-galactosidase control vector were used for each analysis. HEL cells were transfected with DMRIE-C reagent (Invitrogen); 4 μg of luciferase reporter and 2 μg of β-galactosidase control vector were used for each analysis. MKN1 cells were transfected with Lipofectin according to the same protocol used in the transfection of MKN28 cells. Human embryonal lung fibroblasts were transfected with Lipofectin; 1.4 μg of luciferase reporter and 0.6 μg of β-galactosidase control vector were used for each analysis. The fibroblasts were split, 18-24 h prior to transfection, into a six-well tissue culture plate (BD Biosciences) at 1 × 10^5/ml. At the time of transfection, cells were washed once with MEM containing neither fetal bovine serum nor L-glutamine. Two μg of supercoiled plasmid DNA were suspended in 100 μl of Opti-MEM I reduced serum medium (Invitrogen). Ten μl of Lipofectin reagent was diluted in 100 μl of Opti-MEM I reduced serum medium. The two solutions were combined at room temperature for 10-15 min followed by the addition of 0.8 ml of Opti-MEM I reduced serum medium. The mixture was then overlaid onto the cells. The cells were incubated for 8 h before the DNA-containing medium was replaced with 2 ml of growth medium containing serum. The cells were harvested for luciferase and β-galactosidase assays at 48 h after replacement of the medium. Cell lysis and luciferase assays were performed using the Luciferase Assay System (Promega).
Light emission was measured by the model Luminous CT-9000D luminometer (Dia-Iatron, Tokyo, Japan). The values were obtained in relative light units. Variations in transfection efficiency were normalized to the activities of β-galactosidase expressed from the cotransfected β-galactosidase control vector. β-Galactosidase activities were measured as described elsewhere (22). Activity of the pGL3 promoter vector containing the SV40 promoter was assigned an arbitrary value of 1.0.
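The two-step normalization described here (luciferase signal divided by β-galactosidase activity, then expressed relative to the SV40 promoter control set to 1.0) can be sketched as follows. This is an illustrative calculation under those stated assumptions, not the authors' analysis script; the function names and all numeric readings are hypothetical.

```python
def normalized_activity(luciferase_rlu, bgal_activity):
    """Luciferase signal corrected for transfection efficiency."""
    if bgal_activity <= 0:
        raise ValueError("beta-galactosidase activity must be positive")
    return luciferase_rlu / bgal_activity


def relative_activity(sample, reference):
    """Activity of a construct relative to a reference construct (set to 1.0).

    Both arguments are (luciferase_rlu, bgal_activity) tuples.
    """
    return normalized_activity(*sample) / normalized_activity(*reference)


if __name__ == "__main__":
    # Hypothetical readings: (relative light units, beta-galactosidase activity).
    sv40_control = (10000.0, 1.0)   # pGL3 promoter vector, defined as 1.0
    test_construct = (5000.0, 2.0)
    print(relative_activity(test_construct, sv40_control))
```

Dividing by the cotransfected control before comparing constructs keeps differences in DNA uptake between wells from being mistaken for differences in promoter strength.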
Bisulfite Modification and Genomic Sequencing-Bisulfite reactions were performed as described by Clark et al. (38) under conditions that allowed for conversion of cytosine, but not 5-methylcytosine, to uracil. In brief, genomic DNA was digested with EcoRI followed by phenol/chloroform extraction and ethanol precipitation. DNA modification and purification were carried out as described previously (24). Because methylation of the cytosine residue in a CpG dinucleotide appears symmetrical on either strand of DNA, the upper strand of the bisulfite-modified ABO upstream region was amplified with ABO gene-specific primers for the modified sequence, as shown in Table I. The conditions used for PCR I were 95°C for 9 min, 40 cycles of 94°C for 2 min, 60°C for 2 min, 72°C for 3 min, and finally 10 min at 72°C. The conditions for other PCRs were the same as those for PCR I except that the annealing temperatures were 60°C, 60°C, 61°C, 59°C, and 54°C in PCR II, I+II, III, IV, and V, respectively. PCR amplification was performed in a 100-μl reaction mixture containing 100 pmol of each primer, 2.5 units of AmpliTaq Gold, 1.5 mM MgCl2, 150 μM dNTP, and 1× buffer. Amplified DNA was electrophoresed, gel purified, and ligated with pCR2.1. Sequencing was then performed with double strand plasmid templates.
The numbers of clones sequenced at individual PCR targets in each cell culture are indicated in Table II.
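The logic of reading methylation from bisulfite-sequenced clones can be illustrated in silico. After bisulfite treatment and PCR, an unmethylated cytosine is read as T and a methylated cytosine remains C, so each CpG in a clone can be scored against the unconverted reference. The sketch below assumes upper strand sequences that are already aligned to the reference without gaps; the function names and the toy sequence are hypothetical and are not taken from the ABO locus.

```python
def bisulfite_convert(seq, methylated_positions=frozenset()):
    """Simulate bisulfite conversion plus PCR: unmethylated C becomes T."""
    out = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated_positions:
            out.append("T")
        else:
            out.append(base)
    return "".join(out)


def cpg_positions(seq):
    """0-based positions of the C in every CpG dinucleotide."""
    s = seq.upper()
    return [i for i in range(len(s) - 1) if s[i:i + 2] == "CG"]


def call_methylation(reference, clone):
    """Score each reference CpG in an aligned clone.

    True = methylated (C retained), False = unmethylated (read as T),
    None = uninformative base at that position.
    """
    calls = {}
    for i in cpg_positions(reference):
        base = clone[i].upper()
        if base == "C":
            calls[i] = True
        elif base == "T":
            calls[i] = False
        else:
            calls[i] = None
    return calls


if __name__ == "__main__":
    reference = "ACGTCCGA"                         # toy sequence
    clone = bisulfite_convert(reference, {1})      # only the first CpG methylated
    print(clone, call_methylation(reference, clone))
```

Scoring several clones per PCR target in this way gives the per-CpG methylation frequencies that clone counts such as those in Table II would support.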

Expression of the ABO Genes during Maturation of Peripheral Blood-derived AC133−CD34+ Cells in ex Vivo Culture-We investigated the expression of the ABO genes during erythroid differentiation using AC133−CD34+ cells isolated from peripheral blood. A selective two phase liquid culture system, based on a well defined culture medium supplemented with recombinant growth factors, was utilized for the maturation of erythroid progenitors. The AC133−CD34+ cells are rich in erythroid-committed progenitors, and more than 90% of the colonies produced from those cells are pure erythroid colonies (35). The AC133−CD34+ cells were cultured primarily in serum-free medium supplemented with thrombopoietin, Flt3 ligand, and stem cell factor for 7 days followed by a secondary culture with the addition of erythropoietin for the next 7 days. The FACS profiles of AC133 and CD34 antigen expression revealed that more than 95% of the cells that we isolated from peripheral blood mononuclear cells were AC133−CD34+ (data not shown). The AC133−CD34+ cells proliferated and differentiated into a large number of blasts, and enrichment of BFU-E occurred during the first phase of the culture. After stimulation with erythropoietin, erythroid progenitors proliferated and matured into orthochromatic erythroblasts. The time course profile of erythropoietin receptor during the culture showed a progressive increase of the receptor on erythroid progenitor cells.² We initially examined by RT-PCR the expression pattern of genes encoding GATA-1 and β-globin in AC133−CD34+ cell differentiation. The important regulatory elements of the β-globin gene were demonstrated to have GATA-1 sites. Neither transcript was detectable in the cells of the flow-through fraction from CD34 immunomagnetic separation or in the freshly purified AC133−CD34+ cells, although both transcripts became apparent at days 7 and 14 of culture (Fig. 1).
The validity of these RT-PCR analyses was confirmed by the presence of both transcripts in the bone marrow and in the erythroleukemia cell line K562 and the absence of the transcripts in the T cell line Jurkat. These results indicated that the AC133−CD34+ peripheral blood mononuclear cells differentiated into erythroid cells during the ex vivo culture. To monitor the expression of the ABO genes during differentiation of the erythroid progenitors, RT-PCR analysis was carried out with ABO transcripts. The ABO gene transcript was barely detectable in the freshly purified AC133−CD34+ cells and the cells in the flow-through fraction. However, the ABO transcripts were apparent at day 7 (Fig. 1) but obscure at day 14. The control G3PDH transcripts were detectable before and after the culture. Nucleotide sequence determination using the dRhodamine Terminator Cycle Sequencing Ready Reaction Kit demonstrated that the PCR-amplified products of different sizes in the cells at day 7 consisted of a major full-length transcript and a minor transcript with removal of exon 6, similar to the RT-PCR products observed in human bone marrow (24). Because those transcripts were demonstrated to be derived from the A gene by nucleotide sequence determination, expression of A antigens was assessed by FACS analysis. The A antigens were detected on more than 30% of the cells at day 7, whereas almost all of the cells became strongly positive for A antigens at day 14 (data not shown). Taken together, it seems likely that the ABO gene is expressed at an early stage during differentiation of the erythroid progenitors. Consistent with these results, Bony et al. (13) reported that blood group A antigens appeared at an earlier stage during erythroid cell differentiation, based on the two phase liquid culture of human peripheral blood mononuclear cells.
Identification of an Alternative Transcription Initiation Site at the 5′-End of CpG Island in the ABO Genes-The human histo-blood group ABO gene consists of at least seven exons (8,9). Two transcription initiation sites have been mapped previously just upstream from the translation start site in the ABO genes by the 5′-RACE technique using human pancreatic cDNA as a template (8). To examine the transcription start site(s) of the ABO genes in human erythroid cells, 5′-RACE was performed using cDNA synthesized from RNA of the AC133−CD34+ cells cultured at day 7. Agarose gel electrophoresis of the 5′-RACE products showed a major band migrating slower and several faint bands migrating faster. Considering the alternative splicing of the ABO gene transcripts, DNA fragments were purified from the major band and cloned into a sequencing vector.
² K. Matsumoto and K. Yasui, unpublished data.
FIG. 1. Expression of the ABO genes during ex vivo culture of the AC133−CD34+ cells prepared from peripheral blood mononuclear cells. Using RT-PCR, the expression of four genes (ABO, GATA-1, β-globin, and G3PDH) was determined in the flow-through cells from CD34 immunomagnetic separation (FT), the AC133−CD34+ cells freshly isolated from peripheral blood, ex vivo culture of the AC133−CD34+ cells at days 7 and 14, erythroleukemia cell line K562 cells, T lymphocyte Jurkat cells, and human bone marrow. RT-PCR analysis for the ABO gene expression was performed using primers ABO+116 and ABO+802 corresponding to the sequences within exons 3 and 7 of the ABO gene, respectively. PCR products were electrophoresed through a 2% agarose gel followed by ethidium bromide staining. The PCR products of different sizes (A-F) observed in K562 cells and bone marrow are the result of alternative splicing as reported previously (24). Sizes of the PCR products are 334 bp for GATA-1, 244 bp for β-globin, and 452 bp for G3PDH. A 1Kb PLUS DNA ladder was used as a molecular size marker.
DNA sequences were determined for 20 transformant clones. Except for one clone, the 5′-ends of the 5′-RACE products were located around the transcription initiation sites, as determined previously using human pancreas cDNA. That single clone contained a 371-bp 5′-RACE DNA product that appeared to be a hybrid between exon 2 and the upstream genomic DNA of the ABO gene. That product was 100% identical to exons 2-6 from the 5′-end of the reverse primer fy-123 to the 5′-end of the invariant region of exon 2. Beyond that point, however, the sequence showed 100% identity with the upstream genomic DNA starting at position −656 and running to position −682 relative to the transcription start site in exon 1 (the underlined sequence in Fig. 2). More importantly, the product lacked exon 1. This comparison with the upstream genomic sequence of the ABO gene suggested the presence of an alternative exon, which we named exon 1a. The donor splice site between exon 1a and the subsequent intron had GC, whereas the acceptor site between the subsequent intron and exon 2 had AG. The noncanonical GC-AG splice site pair was reported at a ratio of 0.56% in mammalian splice site sequences (40). Therefore, the 5′-splice site seems to be compatible with a functional splice junction.
Inspection of the human ABO gene indicated that repetitive elements, including two Alu repeats and long interspersed nuclear elements, are located between 0.87 and 2.18 kb upstream from the transcription start site in exon 1 in addition to a typical CpG island extending from 0.7 kb upstream to 0.6 kb downstream (see Fig. 5). The newly identified transcription initiation site seems to be located at the 5′-end of the CpG island of the ABO gene.
To confirm the splicing junction between exon 2 and the upstream DNA and to examine whether exon 1a is also used as a starting exon in the gastric cancer cell lines MKN45 and KATOIII and the erythroid progenitors, RT-PCR was carried out using primers specific for each of the two distinct starting exons and a common reverse primer complementary to the sequence in exon 7. DNA fragments of different sizes were amplified from RNA of the cells examined (Fig. 3A). Determination of the nucleotide sequences of the RT-PCR products demonstrated that exon 2 was ligated with the genomic DNA between −678 and −656 in those cells examined. This indicated that both exon 1a and exon 1 are utilized as starting exons in the cells of both erythroid and epithelial cell lineages. Furthermore, the RT-PCR products of different sizes observed in those cells seemed to be the result of alternative splicing. The complex patterns of spliced products are represented schematically in Fig. 3B. Because the splicing patterns of the ABO transcripts were complicated among variants, the abundance of the exon 1-containing transcripts relative to that of the exon 1a-containing transcripts was difficult to determine with this method. Two kinds of exon 1a-containing transcript were recognized: full-length transcript L and transcript M, which lacked exon 6. Because these transcripts contained exon 2, as shown in Fig. 3B, the quantity of the transcripts starting from exon 1a could be compared with that of the transcripts containing both exons 1 and 2 in each cell culture. Quantitative real time RT-PCR was performed using each distinct starting exon-specific primer and a common reverse primer complementary to exon 2. The relative abundance was determined by dividing the copy number of exon 1a-containing transcripts by that of exon 1-containing transcripts. Table III shows that the transcripts containing exon 1a amount to 0.2 to 6.2% of the exon 1-containing transcripts in the cells examined.
Considering that the amounts of exon 1-containing transcripts were represented by those of the transcripts containing both exons 1 and 2 in the comparison, the levels of the transcripts starting from exon 1 should be much higher than those of the transcripts from exon 1a in the cells examined. Therefore, it is likely that only a small portion of the ABO transcripts starts from alternative starting exon 1a.
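The ratio reported in Table III follows from two simple steps: each target copy number is expressed per 10^7 copies of G3PDH, and the exon 1a value is then divided by the exon 1 value. The sketch below illustrates that arithmetic only; the function names and the copy numbers are hypothetical and are not values from Table III.

```python
def normalize_to_g3pdh(target_copies, g3pdh_copies, per=1e7):
    """Target copies expressed per `per` copies of G3PDH in the same cDNA."""
    if g3pdh_copies <= 0:
        raise ValueError("G3PDH copy number must be positive")
    return target_copies / g3pdh_copies * per


def exon1a_percent(exon1a_normalized, exon1_normalized):
    """Exon 1a-containing transcripts as a percentage of exon 1-containing ones."""
    return 100.0 * exon1a_normalized / exon1_normalized


if __name__ == "__main__":
    # Hypothetical measurements from one cDNA reaction.
    g3pdh = 2.0e6
    exon1 = normalize_to_g3pdh(2500.0, g3pdh)
    exon1a = normalize_to_g3pdh(50.0, g3pdh)
    print(exon1, exon1a, exon1a_percent(exon1a, exon1))
```

Because both targets are normalized to the same G3PDH value within a sample, the percentage depends only on the two ABO copy numbers; the normalization matters when comparing absolute levels between cell cultures.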
Analysis of Promoter Activity in the 5′-Flanking Region of Exon 1a in the ABO Gene-Although alternative exon splicing of the ABO gene transcripts has been demonstrated (8, 9, 24), alternative promoter usage has never been reported with the ABO genes. Distinct promoters and alternative 5′-ends have been reported with some other genes encoding glycosyltransferases, such as α1,2-fucosyltransferase and α1,3-fucosyltransferase (41,42). Therefore, it is important to characterize the promoter that regulates the transcription of the ABO messages containing exon 1a.
FIG. 2. Nucleotide sequence of the 5′-flanking region in the human ABO blood group gene. The sequence is given in full, from position −1000 to +50, relative to the transcription start site in exon 1 of the human ABO genes. Two thick arrows above the sequence indicate the transcription initiation sites that were determined previously from 5′-RACE using human pancreas cDNA (8). Previous population genetic analysis indicated a polymorphism in regard to the presence or absence of the sequence between −978 and −943, which is indicated by a rectangle (44). Open circles indicate locations of 5′-ends of the ABO transcripts, determined by 5′-RACE using cDNA obtained from ex vivo culture of the AC133−CD34+ cells at day 7. Exon 1a (nucleotides −682 to −656) is underlined. The uppercase letters denote the coding sequence of exon 1, and the lowercase letters indicate a noncoding genomic sequence. Several putative transcription factor binding sites were found by the Transfac software and are indicated by overbars.
... with those containing exon 1a in cultured cells. All data represent means of triplicates, presented as copy numbers of target transcript/10⁷ copy numbers of G3PDH. The standard deviation of copy numbers is given in parentheses. The ratios were calculated by dividing the copy numbers of the ABO exon 1a-containing splice variant cDNA by those of exon 1-containing cDNA.

FIG. 3. RT-PCR analysis for detection of the ABO gene starting exon in human gastric cancer cell lines MKN45 and KATOIII and ex vivo culture of the AC133−CD34+ cells at day 7. A, RT-PCR analysis. Total RNA prepared from MKN45 cells, KATOIII cells, and ex vivo culture of the AC133−CD34+ cells at day 7 was reverse transcribed with random primer, and the resulting single strand cDNA was used as a template for PCR analysis. The ABO gene amplification was performed using either distinct starting exon-specific primer ABO+3 or ABOU-678 and a common reverse primer ABO+802 complementary to exon 7 of the ABO gene. PCR products were electrophoresed through a 2% agarose gel and were stained by ethidium bromide. The amplified fragments were named G to M. A 1Kb PLUS DNA ladder was used as a molecular size marker. B, splicing patterns of the amplified fragments G-M. Nucleotide sequences of these fragments were determined and then compared. Schematically represented ABO genes were aligned with the RT-PCR products amplified, using a set of each starting exon-specific primer (ABOU-678 or ABO+3) and the ABO+802 primer.

Inspection of the sequence around the distal transcription initiation site revealed putative binding sites for several transcription factors as shown in Fig. 2. As a means to examine the promoter activity in the 5′-flanking region of exon 1a in the ABO gene, we first obtained the −832Xh construct by introducing the −832 to −667 upstream region fragment of the ABO gene into the promoterless pGL3-basic vector upstream from the luciferase coding sequence. The −832 upstream terminus was chosen because the vast majority of CpG dinucleotides are methylated and reside within repetitive elements (43), and the locations of the recognition sites for the putative transcription factors were taken into account. The reporter plasmid was transfected into the KATOIII cells. 48 h after transfection, the cells were harvested, and the luciferase activities in cell extracts were analyzed (Fig. 4). The pGL3-promoter vector containing the SV40 promoter and pGL3-basic vector without the promoter sequence were used as positive and negative controls, respectively. The relative luciferase activity of the −832Xh construct was at least 8-fold higher than that of pGL3-basic vector and was 6-fold lower than that of pGL3-promoter vector. This indicated the promoter activity of the 5′-flanking region of exon 1a in the ABO gene.
The luciferase activity of the construct −832Xh was similar to that of the construct SN containing the −117/+31 proximal promoter sequence. Deletion of the upstream end of the 5′-flanking region of exon 1a from position −832 to −781 resulted in a loss of one-third of the activity, suggesting that the important elements for the distal promoter function were contained within the deleted region. Moreover, the presence of an additional sequence between −666 and −336 in construct −832Sma yielded an elevated activity, which implicated the presence of positive regulatory element(s) in the sequence. These results suggest that the region just upstream from the distal transcription start site acts as a promoter.
Thus, it appears that the alternative promoter is present at the 5′-end of the CpG island in the ABO gene and that exon 1a is transcribed from this promoter.
To compare the proximal and distal promoters, we prepared the −832N construct by introducing the −832 to +31 sequence, which incorporates both promoter regions of the ABO gene, into the promoterless pGL3-basic vector. The luciferase activity of the plasmid −832N was less than 2-fold that of the control pGL3-basic vector, suggesting that negative element(s) are located between the two promoters. Deletion of the upstream end from position −832 to −670, −548, −409, or −275 did not result in any significant changes. Thus, the proximal promoter does not seem to be affected by the distal promoter activity in transient transfection experiments. Deletion of the sequence from −275 to either −202 or −117 elicited an increase in the luciferase activity. In addition, an internal deletion of the sequence between −275 and −118 in construct XhNΔ−275/−118 resulted in a 7-fold or 2-fold increase in luciferase activity compared with the construct XhN or SN, respectively. Moreover, another internal deletion of the −335 to −118 sequence in construct XhNΔ−335/−118 resulted in similar increases. These results suggested that negative element(s) for the ABO gene transcription were present in the region between −275 and −118 and that positive regulatory element(s) were present in the sequence between −670 and −336.
Methylation Profile of the Upstream Region in the ABO Gene from the Repetitive Elements through the Distal Promoter to the Proximal Promoter-Because the vast majority of CpG dinucleotides are methylated and reside within repetitive elements (43), we examined the methylation status of the upstream region in the ABO gene to define a boundary between methylated and unmethylated domains. The methylation status in the region between −1469 and +26 was analyzed by bisulfite genomic sequencing. This region spanned 128 or 127 CpG sites and was analyzed by PCR amplification of five regions (PCR I-V) followed by subcloning of the PCR-amplified fragments into a sequencing vector. PCR fragments IV and V contained polymorphisms, which included variation in the number of (AAAAAT)3–4 repeats in the downstream Alu sequence and the presence or absence of 36 bp in a long interspersed nuclear element (44), respectively. Data obtained from genomic sequencing of bisulfite-treated DNA were compiled for human gastric cancer cell lines KATOIII, MKN45, MKN1, and MKN28, human erythroleukemia cell lines K562 and HEL, and the culture of AC133−CD34+ cells at days 7 and 14. The percentage of methylated cytosine residues in each CpG dinucleotide was calculated for the individual sites in the −1469 to +26 region using more than eight clones. The number of clones used for calculation of the percentage of methylated cytosine in each CpG dinucleotide is shown in Table II. Fig. 5 shows the methylation profiles in the −1469 to +26 region in each cell line and the ex vivo culture. Analysis of methylation in KATOIII cells, K562 cells, and the culture of AC133−CD34+ cells at days 7 and 14 demonstrated hypermethylation in the region of repetitive elements and hypomethylation in the region downstream of position −832, except for methylation at low ratios in a few CpG sites. The distal promoter region appeared to be located 3′ adjacent to the methylated region of repetitive elements.
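The per-site percentage described above can be sketched in Python as follows. This is an illustrative outline only, not the procedure used in the study: it assumes that each clone read is already aligned to the unconverted reference with no gaps, that reads report the top strand, and that a retained C at a CpG position indicates methylation whereas a T indicates an unmethylated cytosine. The function names and the toy sequences are hypothetical.

```python
def cpg_positions(reference):
    """Return 0-based indices of the C in every CpG dinucleotide of the reference."""
    return [i for i in range(len(reference) - 1) if reference[i:i + 2] == "CG"]


def methylation_percent(reference, clones):
    """Percentage of clones carrying a methylated C at each CpG site.

    reference: unconverted genomic sequence (top strand).
    clones: bisulfite-converted reads aligned to the reference, same length.
    At a CpG position, 'C' is scored as methylated and 'T' as unmethylated;
    any other base is ignored for that site.
    """
    if any(len(clone) != len(reference) for clone in clones):
        raise ValueError("every clone must be aligned to the full reference length")
    result = {}
    for pos in cpg_positions(reference):
        calls = [clone[pos] for clone in clones if clone[pos] in "CT"]
        if calls:
            result[pos] = 100.0 * calls.count("C") / len(calls)
    return result


# Toy data (hypothetical): two CpG sites, four clones.
toy_reference = "ACGTTCGA"
toy_clones = ["ACGTTTGA", "ACGTTCGA", "ATGTTTGA", "ACGTTTGA"]
toy_profile = methylation_percent(toy_reference, toy_clones)
print(toy_profile)
```

In the toy data the first CpG is methylated in three of four clones and the second in one of four, so the two sites would be plotted as vertical lines of different heights in a profile such as Fig. 5.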
In contrast, the methylation profile in MKN28 cells demonstrated that hypermethylation extended from the region of repetitive elements through the distal promoter region to the proximal promoter region. Because methylation of CpG islands spanning the promoter regions is strongly associated with transcriptional silencing, starting exon-specific RT-PCR was performed using each distinct starting exon-specific primer and a common reverse primer complementary to exon 2 to examine transcripts from both exons 1a and 1 (Fig. 6). These exon-specific RT-PCR analyses showed that neither exon 1a- nor exon 1-containing transcript was detectable in MKN28 cells, although the control G3PDH transcript was detectable in MKN28 cells at a level similar to that in the expressing cells, in which the ABO transcripts from both exons were detected. The lack of any transcript in MKN28 cells by the starting exon-specific amplifications agreed with the previous result that the ABO transcript was not found by RT-PCR using the primers corresponding to the sequences in exons 3 and 7 (24). Thus, hypermethylation of both promoter regions in MKN28 cells might be responsible for the absence of any transcripts from either promoter. The methylation profile in MKN1 cells showed an intermediate pattern between KATOIII cells and MKN28 cells. Hypermethylation extended from the region of repetitive elements through the distal promoter region to around 0.4 kb upstream relative to the transcription start site of the ABO gene exon 1. The distal promoter was hypermethylated, whereas the proximal promoter was hypomethylated. These methylation profiles appeared to correspond to the results of expression analysis, in which the ABO transcript was barely detectable by exon 1a-specific RT-PCR, whereas the transcript was detected by exon 1-specific RT-PCR.
As was observed in MKN28 cells, the methylation profile of HEL cells showed hypermethylation extending from the region of repetitive elements to the proximal promoter region, whereas methylation ratios decreased in the proximal promoter region. Our previous study indicated cellular heterogeneity in DNA methylation of the −117 to +26 sequence in HEL cells (24). To define the cellular heterogeneity in methylation of the −434 to +47 sequence, additional PCR amplification was performed with a distinctive combination of primers in PCR I+II. Fig. 7 shows the methylation profile obtained from individual clones generated from the PCR I+II fragment. The methylation patterns were heterogeneous. Hypomethylation was found in the −117 to +26 sequence in some clones, whereas hypermethylation was found in other clones. These methylation profiles seemed to be consistent with the results that the transcript starting from exon 1a was not detectable by exon 1a-specific RT-PCR, although the transcript starting from exon 1 was detected by exon 1-specific RT-PCR in HEL cells. In MKN45 cells, hypermethylation was demonstrated in the region of repetitive elements, whereas the proximal promoter region was unmethylated. However, clusters of methylated CpG sites were found at low or moderate ratios in an interval of around 0.2 kb in the region between the hypermethylated region of repetitive elements and the hypomethylated proximal promoter region. With regard to the methylation status of the region within a radius of 100 bp around the distal transcription start site, one PCR product showed hypomethylation, and the others demonstrated methylation at a few CpG sites. Although the starting exon-specific RT-PCR analyses demonstrated the presence of transcripts from both exon 1a and exon 1, the relative ratio of the exon 1a-containing transcripts to the exon 1-containing transcripts was smaller than those observed with KATOIII cells and the ex vivo culture of erythroid cells (Table III). Because transcriptional repression by DNA methylation was reported to appear when the density of methyl-CpGs approaches approximately 1 in 100 bp (45,46), the ABO gene distal promoter could be repressed only moderately by DNA methylation in MKN45 cells.

... long interspersed nuclear elements, which were found by computer analysis using RepeatMasker. The second exon is not indicated in the figure because it is located 13 kb downstream from exon 1. The third diagram from the top represents the distribution of CpG dinucleotides, and the vertical lines indicate the position of each CpG dinucleotide in the DNA sequence from −1469 to +26 relative to the transcription start site of the ABO gene exon 1. Dense clustering of CpG sites is shown in the CpG island in which CpG density is 12.9%. Below the CpG dinucleotide plot, eight panels show the percentage of DNA methylation at individual CpG sites in the −1469 to +26 sequence of the ABO genes. The ratios of DNA methylation at each CpG site are represented as the length of the vertical lines at the relative positions of the CpG dinucleotide in human gastric cancer cell lines KATOIII, MKN45, MKN1, MKN28, human erythroleukemia cell lines K562 and HEL, and the ex vivo culture of AC133−CD34+ cells at days 7 and 14. Inspection of the −1469 to +26 sequence reveals 128 or 127 CpG dinucleotides because a polymorphism was reported with regard to the presence or absence of the sequence between −978 and −943, which contains one CpG site (44). The V-shaped lines in MKN45 cells, MKN28 cells, and the ex vivo culture of AC133−CD34+ cells indicate the regions that were deleted in the ABO gene upstream region. +1 represents the transcription start site in exon 1, which was determined previously.
Both promoter regions were hypomethylated in the culture of AC133−CD34+ cells at day 14. However, the transcript starting from exon 1a was barely detectable, whereas the transcript starting from exon 1 was detectable (Fig. 6). These results suggest that negative regulatory mechanism(s) other than DNA methylation might play a role in the down-regulation of transcription from the distal promoter during differentiation of erythroid progenitors in the ex vivo culture.
FIG. 6. RT-PCR analysis for detection of the ABO gene starting exons in various tumor cells. ABO amplification was performed using distinct starting exon-specific primers ABOU-678 and ABO+3 in combination with a common reverse primer ABO+98 complementary to the sequence in exon 2 of the ABO gene. PCR products were electrophoresed through a 2% agarose gel and were stained with ethidium bromide. Sizes of the PCR products are 96 bp for starting exon 1-specific PCR and 93 bp for starting exon 1a-specific PCR. A 1Kb PLUS DNA ladder was used as a molecular size marker.

Effects of DNA Demethylation on Expression of the ABO Gene Exon 1a-containing Transcripts-To address the question of whether methylation of the distal promoter region is itself inhibiting expression from the distal promoter, we treated MKN1 cells, MKN28 cells, and HEL cells with 5-aza-2′-deoxycytidine, an inhibitor of DNA methyltransferase which causes the demethylation of DNA. We then monitored the expression of the ABO transcripts by the starting exon-specific RT-PCR. Results are shown in Fig. 8. DNA fragments of the expected size were amplified by the exon 1a-specific RT-PCR of RNA from MKN1 cells and HEL cells treated with 5 µM concentrations of 5-aza-2′-deoxycytidine for 3 days. These bands were confirmed to be derived from the ABO gene message by nucleotide sequencing. Demethylation analysis of the distal promoter in those cells after the 5-aza-2′-deoxycytidine treatment was carried out by PCR of the bisulfite-treated DNA using the primers for the PCR III target corresponding to the sequence from −789 to −414. PCR products were then digested with TaqI, a restriction enzyme that distinguishes methylated DNA from unmethylated DNA. Only the methylated sequences were cleaved by the enzyme, yielding bands at 255 and 121 bp.
By comparing the bands derived from the methylated (255 bp) and unmethylated (376 bp) alleles before and after the treatment, the proportions of the demethylated sequences seemed to increase in both cell lines after the DNA methyltransferase inhibitor treatment. Because demethylation of the promoter could reactivate the distal promoter, these results strongly support the idea that DNA methylation is responsible for the absence of the transcripts from the distal promoter in MKN1 cells and HEL cells.
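The logic of the TaqI assay can be illustrated with a small in-silico sketch. TaqI recognizes TCGA and cuts after the T. After bisulfite treatment and PCR, an unmethylated cytosine is read as T, so an unmethylated TCGA becomes TTGA and is no longer cut, whereas a methylated site is retained and cleaved. The amplicon below is a synthetic 376-bp placeholder, not the PCR III sequence; the position of the TaqI site is an assumption chosen only so that cleavage reproduces the 255- and 121-bp fragment sizes, and which fragment lies upstream is not stated in the text. Function names are hypothetical.

```python
def bisulfite_convert(seq, methylated):
    """Read-out of a top-strand sequence after bisulfite treatment and PCR.

    Every C is converted to T unless its 0-based index is in `methylated`.
    """
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )


def taqi_fragment_lengths(seq):
    """Fragment lengths after complete TaqI digestion (site T^CGA)."""
    cuts = []
    start = seq.find("TCGA")
    while start != -1:
        cuts.append(start + 1)  # the cut falls between T and C
        start = seq.find("TCGA", start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


# Synthetic 376-bp amplicon with a single TaqI site (assumption for illustration).
toy_amplicon = "A" * 254 + "TCGA" + "A" * 118
cpg_c_index = 255  # index of the C within the TCGA site

methylated_allele = bisulfite_convert(toy_amplicon, {cpg_c_index})
unmethylated_allele = bisulfite_convert(toy_amplicon, set())

print(taqi_fragment_lengths(methylated_allele))    # cleaved allele
print(taqi_fragment_lengths(unmethylated_allele))  # uncut allele
```

Under these assumptions the methylated allele yields two fragments and the unmethylated allele remains at full length, which is the basis for comparing band intensities before and after 5-aza-2′-deoxycytidine treatment.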
No amplification of the exon 1a-specific transcripts was observed by RT-PCR of RNA from MKN28 cells treated with 5-aza-2′-deoxycytidine despite repeated attempts, although the demethylation of the distal promoter region was demonstrated by PCR of bisulfite-treated DNA, followed by TaqI digestion. In contrast, the exon 1-specific RT-PCR confirmed the appearance of transcripts containing exon 1. These results suggest that negative regulatory mechanisms other than DNA methylation might play a role in the down-regulation of transcription from the distal promoter in MKN28 cells.
Cell Type-specific Activity of Transfected ABO Gene Distal Promoter-By performing transient transfection experiments using KATOIII cells, the sequence located between −832 and −667 was found to be responsible for transcriptional activity. To examine whether the ABO gene distal promoter exhibited cell type-specific activity correlating with endogenous expression of the ABO genes, transient transfection studies were performed using MKN1 cells, MKN28 cells, HEL cells, and fibroblasts. In fibroblasts, the transcripts starting from the ABO gene exon 1 and exon 1a were barely detectable by starting exon-specific RT-PCR, as shown in Fig. 6. The absence of ABO gene transcripts in fibroblasts seemed to coincide with the absence of ABH antigens in connective tissue (10). Because the SV40 promoter showed activity independent of cell types, the promoter activities of constructs −832Xh and SN relative to the activity of the pGL3-promoter vector containing the SV40 promoter were calculated and compared with one another (Table IV). In all the cell lines tested, the proximal promoter construct SN was 7-21-fold more active than the control pGL3-basic vector, indicating that the exogenously introduced ABO gene proximal promoter was constitutively active in the nonexpressing cells including MKN28 cells and fibroblasts. The −832Xh construct was 3-fold to 8-fold more active than the control pGL3-basic vector. When the ratios of promoter activities between constructs SN and −832Xh were calculated in those cell lines used, similar ratios were obtained for KATOIII cells, MKN1 cells, and HEL cells. Demonstration of the distal promoter activity in MKN1 cells and HEL cells supported the possibility that DNA methylation, a generalized negative regulatory mechanism, could repress the distal promoter of the ABO gene in these cells.
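The comparison of constructs across cell types can be sketched as a short calculation. The outline below expresses each construct relative to the SV40-containing pGL3-promoter vector and then takes the SN to −832Xh ratio, as described in the text. Dividing luciferase by the cotransfected β-galactosidase signal is an assumption about how transfection efficiency was corrected, and all readings are hypothetical values, not data from Table IV.

```python
def normalized_activity(luciferase, beta_gal):
    """Luciferase signal corrected for transfection efficiency.

    Assumption: correction by dividing by the beta-galactosidase control signal.
    """
    if beta_gal <= 0:
        raise ValueError("beta-galactosidase signal must be positive")
    return luciferase / beta_gal


def relative_to_reference(activity, reference_activity):
    """Activity of a construct expressed relative to a reference vector."""
    return activity / reference_activity


# Hypothetical (luciferase, beta-galactosidase) readings for one cell type.
readings = {
    "pGL3-promoter": (9000.0, 3.0),
    "SN": (1500.0, 2.5),
    "-832Xh": (1000.0, 2.5),
}

normalized = {name: normalized_activity(*pair) for name, pair in readings.items()}
relative = {
    name: relative_to_reference(value, normalized["pGL3-promoter"])
    for name, value in normalized.items()
}
sn_to_distal_ratio = relative["SN"] / relative["-832Xh"]
print(relative, sn_to_distal_ratio)
```

Repeating the calculation per cell type gives one SN to −832Xh ratio for each, and a larger ratio in a given cell type would indicate a relatively weaker distal promoter there.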
In contrast to these expressing cells, the construct −832Xh showed weaker activity compared with the construct SN in fibroblasts, in which the ABO gene expression was barely detectable, suggesting that the distal promoter functioned more efficiently in ABO gene-expressing cells than in fibroblasts. It is likely that the distal promoter activity is dependent upon cell types and that the cell type-specific promoter is located 3′ adjacent to the hypermethylated region of repetitive elements that can be recognized by methyl-CpG-binding domain proteins (MBDs) involved in repressive complexes in association with histone deacetylases. In MKN28 cells, the promoter construct −832Xh revealed half the activity of the construct SN. Comparison of the ratio of promoter activities between constructs SN and −832Xh in MKN28 cells with the ratios in the expressing cells suggested a reduction of the distal promoter activity in MKN28 cells. Thus, the decreased promoter activity may have resulted in the inhibition of transcription from the distal promoter in MKN28 cells. The reduction of the promoter activity was incomplete in the transient transfection experiments, and the exon 1a-containing transcript was barely detectable by the starting exon-specific RT-PCR from MKN28 cells treated with 5-aza-2′-deoxycytidine although demethylation of the distal promoter region was found. Therefore, other generalized mechanisms such as methylation of histone tails may have also participated in the inhibition of transcription from the distal promoter. Moreover, a negative control other than DNA methylation has been suggested to play a role in the down-regulation of transcription from the distal promoter during differentiation of the erythroid progenitors. Although hypermethylation was found in the upstream region, further investigation is required to elucidate other negative regulatory mechanisms in MKN28 cells.

a The ABO proximal promoter used sequences located between −117 and +31.
b The ABO distal promoter used sequences located between −832 and −667.
c The results are expressed as an average relative activity compared with that observed for the pGL3-promoter vector containing the SV40 promoter. Standard deviations are indicated for a minimum of three repetitions.
d KATOIII cells were transfected with electroporation; 10 µg of luciferase reporter and 4 µg of β-galactosidase control vector were used for each analysis.
e MKN1 cells, MKN28 cells, and fibroblasts were transfected with Lipofectin; 1.4 µg of luciferase reporter and 0.6 µg of β-galactosidase control vector were used for each analysis.
f HEL cells were transfected with DMRIE-C reagent; 4 µg of luciferase reporter and 2 µg of β-galactosidase control vector were used for each analysis.

FIG. 8. Expression of the ABO genes after treatment with 5-aza-2′-deoxycytidine. MKN1 cells and HEL cells were treated with 5 µM 5-aza-2′-deoxycytidine for 3 days, and MKN28 cells were treated for 5 days. ABO gene expression from the proximal and distal promoter was analyzed by starting exon-specific RT-PCR, performed as in Fig. 6. Demethylation analysis of these cultured cells after treatment was performed using TaqI digestion of PCR III fragments, corresponding to the sequence from −789 to −414. The PCR product (376 bp) was digested with TaqI and electrophoresed through a 2% agarose gel. TaqI cleaved only the methylated allele, yielding bands at 255 and 121 bp (the band at 121 bp is not shown).

DISCUSSION

We have studied the transcriptional regulatory mechanism of the human ABO genes. Our previous characterization of the 5′-upstream sequence of the ABO gene demonstrated that the region between −117 and +31 had promoter activity in both epithelial and erythroid lineages (22,23). Furthermore, expression of the ABO genes was found to be dependent upon DNA methylation of the proximal constitutive promoter (24).
In this paper, we have identified a novel transcription start site at the 5′-end of the CpG island containing the proximal promoter in the ABO gene. This start site appears to mark an alternative starting exon (1a), which is utilized as a transcription starting exon in both erythroid and epithelial lineage cells examined. The region just upstream from the transcription start site is sufficient for the expression of a reporter gene in a cell type-specific manner when placed 5′ adjacent to the luciferase gene. Hypermethylation is commonly observed in the region of repetitive elements 5′ adjacent to the distal promoter region in all of the cell lines examined. These results indicate that a cell type-specific promoter is located between the hypermethylated region of repetitive elements and the CpG island containing the constitutive promoter in the ABO genes.
The utilization of multiple promoters and transcription start sites is a frequently used mechanism to create diversity and flexibility in the regulation of gene expression (47). The level of transcription initiation can vary between alternative promoters: the turnover or translation efficiency of mRNA isoforms with different leader exons can differ, alternative promoters can have different tissue specificity and can react differently to growth signals, and alternative promoter usage can lead to the generation of protein isoforms differing in amino acid sequence. The use of an alternative exon 1a seems to be common in the ABO genes because the ABO blood group genotypes of the AC133−CD34+ cells used in this study were inferred to be AA based on A allele-specific nucleotide substitutions (6), whereas those of KATOIII cells, MKN45 cells, and K562 cells were BO, AA, and OO, respectively (24). Although the nucleotide sequence of exon 1a does not contain an ATG codon, transfection experiments of MKN28 cells with the expression plasmid A(1a) containing the entire cDNA starting from exon 1a demonstrated the appearance of large amounts of A antigens, suggesting that the functional protein involved in expression of A antigens could be synthesized by the usage of an internal preferential translation start site in the transmembrane domain.³ However, quantitative real-time RT-PCR showed that most of the ABO gene transcripts start from exon 1. The low level of the transcripts starting from exon 1a may be caused by different turnover efficiencies of mRNA isoforms with different leader exons or by reduction of the distal promoter activity conferred by corepressor complexes recruited by MBDs that can bind to the hypermethylated interspersed repeats in chromatin. On the other hand, the short open reading frames upstream from the coding sequence may serve to control expression at the translational level, as reported with other genes (49).
In the murine acetylcholinesterase gene, an alternative promoter was identified 5′-upstream from the proximal promoter (50). Our analyses of the G+C density and CpG frequency in the nucleotide sequence of the upstream region of the murine acetylcholinesterase gene (GenBank accession no. AF148849) demonstrated that the alternative exon is located 5′ adjacent to a CpG island. Based on the histone code hypothesis that distinct histone amino-terminal modifications can generate synergistic or antagonistic interaction affinities for chromatin-associated proteins, which in turn dictate dynamic transitions between transcriptionally active and silent chromatin states (51), it is possible that some activities of the cell type-specific promoter located at the 5′-end of the ABO CpG island protect histone tails from modifications, such as deacetylation and methylation, that lead to silenced chromatin domains. Further studies are needed to define the relative roles of the distal cell type-specific promoter and the proximal constitutive promoter in the regulation of ABO gene expression. Moreover, elucidation of an association among the hypermethylated region of repetitive elements, the distal promoter, the negative element from position −275 to −118, and the proximal promoter may lead to more precise resolution of a molecular basis for the ABO gene expression in a cell type-specific fashion and of the changes during cell differentiation.
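An analysis of G+C density and CpG frequency of the kind mentioned above can be sketched as follows. The text does not state which measures were computed, so this example uses two widely used ones as an assumption: the G+C fraction, and the observed/expected CpG ratio defined as (number of CpG × sequence length) / (number of C × number of G). The function names and the toy sequence are hypothetical; in practice the measures are usually evaluated in sliding windows along the sequence.

```python
def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_observed_expected(seq):
    """Observed/expected CpG ratio: (#CpG * N) / (#C * #G)."""
    seq = seq.upper()
    c_count, g_count = seq.count("C"), seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c_count * g_count)


def sliding_windows(seq, window, step):
    """Yield (start, G+C fraction, CpG obs/exp) for each full window."""
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        yield start, gc_fraction(chunk), cpg_observed_expected(chunk)


# Toy sequence (hypothetical): a CpG-rich stretch followed by an AT-rich stretch.
toy_sequence = "CGCGATAT"
toy_windows = list(sliding_windows(toy_sequence, window=4, step=4))
print(gc_fraction(toy_sequence), cpg_observed_expected(toy_sequence), toy_windows)
```

A window with a high G+C fraction and an observed/expected ratio near or above 1 is the kind of region usually described as CpG island-like, whereas bulk genomic sequence shows a depleted ratio.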
CpG islands are almost always maintained in unmethylated states, unlike the CpG sites in the remainder of the genome. However, methylation of CpG islands can occur on an inactive X chromosome, in promoters of imprinted genes, with oncogenesis, and during aging. In all of these cases, methylation of CpG islands spanning the promoter regions is strongly associated with transcriptional silencing. DNA methylation of the distal and proximal promoters in the ABO gene is correlated with the absence of transcripts from the distal and proximal promoter, respectively, in MKN1 cells, MKN28 cells, and HEL cells. However, the possibility still exists that additional factor(s) could function in the down-regulation of ABO gene expression.

As of now, there is no clear understanding of what leads to the aberrant methylation of CpG islands commonly seen in cancer. Moreover, it is unclear whether hypermethylation is initiated from the ends of the CpG island or at "hypermethylation center(s)" within the CpG island in tumor cells. It has been suggested that maintenance of the unmethylated CpG islands is dependent on continued active transcription and/or the binding of specific proteins that may protect them from being methylated (52). Studies of the adenine phosphoribosyltransferase gene locus have identified a potential mechanism of how CpG islands remain free of methylation in embryonic cells (53)(54)(55)(56). Sp1 binding sites and presumably trans-acting factor(s) that bind to those sites apparently protect the CpG island of this gene from methylation. Thus, the loss of specific proteins or interference with their binding and/or the lack of active transcription may contribute to CpG island promoter methylation in cancer. It has also been proposed that repetitive elements such as Alu repeats and long interspersed nuclear elements might serve as foci for de novo methylation and that methylation may spread from such attractors of modification (57,58). In particular, Graff et al. (59) showed that Alu sequences upstream from the E-cadherin and von Hippel-Lindau (VHL) genes are methylated in normal tissues, whereas adjacent CpG island sequences are not. In addition, from their studies of transfected DNA they have suggested that methylation may progressively encroach from the methylated Alu sequence regions flanking CpG islands. However, others have suggested that methylation is initiated at "centers" within the islands and progressively spreads (60). Recently, Millar et al. (39) demonstrated that a marked boundary of the methylated and unmethylated domains correlated with an (ATAAA)19–24 repeated sequence at the 3′-end of the hypermethylated Alu sequence of the upstream region in the glutathione S-transferase gene in normal tissues, including prostate tissue. It was also discovered that DNA methylation was present in the core CpG-rich promoter region but did not extend through the 5′-flanking region in two prostate cancer cell lines, whereas the extensive high level of DNA methylation observed in the core promoter region was found to spread through the upstream region to and beyond the boundary in another prostate cancer cell line. Supporting the latter hypothesis, the leukemia-promoting promyelocytic leukemia-retinoic acid receptor (PML-RAR) fusion protein has been reported to induce gene hypermethylation and silencing by recruiting DNA methyltransferases to target promoters (48). These authors suggested that the newly methylated CpGs worked as docking sites for methyl-binding proteins, which in turn interacted with both histone deacetylase complexes and DNA methyltransferases, leading to the spreading of hypermethylation to the neighboring regions. In the present study, we found that the region consisting of repetitive elements was methylated in all of the cells examined, although a definite boundary between the methylated and unmethylated domains could not be determined in the ABO gene upstream region.
A similar methylation pattern, in which the methylated domain does not extend into the distal promoter region, was observed in KATOIII cells, K562 cells, and the erythroid cells in ex vivo culture. However, distinctive methylation patterns of the ABO gene upstream region were found among MKN45 cells, MKN1 cells, MKN28 cells, and HEL cells, in which hypermethylation extends from the region of repetitive elements through the distal promoter region. The distal promoter region seems to be preferred for methylation compared with the proximal promoter region in the ABO genes. This preference of the distal promoter region for methylation may be simply the result of different distances between the promoter region and the methylated interspersed repeats. Alternatively, additional factors binding to the sequence downstream of the distal promoter may protect the proximal promoter region from methylation. Elucidation of such a phenomenon may provide a clear understanding of the molecular mechanism for the extension of aberrant methylation associated with oncogenesis.