Analysis of the megakaryocyte glycoprotein IX promoter identifies positive and negative regulatory domains and functional GATA and Ets sites.

The glycoprotein (GP) Ib-V-IX multisubunit complex binds to von Willebrand factor and mediates the adhesion of platelets to the subendothelium of damaged blood vessels. Expression of the GPIX subunit is required for stability of the complex, and its absence in platelets is associated with the rare bleeding disorder Bernard-Soulier syndrome. Comparative analyses indicate that the four GPIb-V-IX subunits are members of the leucine-rich repeat family and suggest that GPIX resembles a possible primitive progenitor of this group. To characterize GPIX transcriptional regulation, a series of 5′ deletion constructs was made linking the GPIX upstream flanking sequence to the luciferase marker gene, and promoter activity was measured in transiently transfected human erythroleukemia cells. This analysis identified two negative regulatory domains between −686 to −423 and −311 to −203 and two positive regulatory domains at −323 to −311 and −151 to −100 relative to the GPIX transcription start site. In addition, site-directed mutagenesis experiments and in vitro gel retardation assays identified Ets and GATA elements at −42 and −65, which positively regulate GPIX promoter activity and specifically bind nuclear factors derived from human erythroleukemia cells. DNase I protection experiments identified a protein-dependent “footprint” and hypersensitive site within the GPIX Ets sequence. These results provide a framework for comparison of the GPIX promoter with others of the GPIb-V-IX system, other megakaryocyte-specific genes, and other members of the leucine-rich repeat family.

Platelets play an essential role in thrombosis and hemostasis (1). Initial adhesion of platelets to the subendothelium of a damaged blood vessel is mediated via interaction of von Willebrand factor with the platelet glycoprotein (GP) Ib-V-IX multisubunit complex (2). GPIb-V-IX comprises four subunits: GPIbα, GPIbβ, GPIX, and GPV (3). GPIbα, the subunit that directly binds von Willebrand factor (4), is covalently linked to GPIbβ via a disulfide bridge (5). GPIbβ may be involved in platelet signal transduction (6,7). GPIX is tightly, but noncovalently (8), bound, and GPV is loosely associated with the complex (9). Bernard-Soulier syndrome is a human bleeding disorder associated with absent or significantly reduced GPIb-V-IX receptor on the platelet surface (10). Several kindreds have been identified that have defects in genes encoding the GPIbα (11,12) and GPIX (13) subunits and the GPIbβ promoter (14). Reconstitution studies have demonstrated that GPIbα, GPIbβ, and GPIX are all necessary for expression of the complex on the cell surface (15). GPV is not absolutely required but does contribute to complex stability (16,17). Structural analyses of the GPIb-V-IX genes identify a variable number of leucine-rich repeat motifs in all four subunits that are believed to be important for protein-protein interaction (18-22). Comparative studies suggest that GPIX might be a primitive progenitor of these members of the leucine-rich repeat family (23).
The cellular precursors of platelets, megakaryocytes, are large polyploid cells that comprise approximately 0.05% of bone marrow cells (24). Time course studies of megakaryocyte development identify a specific pattern of gene expression that coincides with megakaryocytic differentiation (25). The earliest known megakaryocyte marker is rodent acetylcholinesterase (26), which is followed by GPIIb, GPIIIa and platelet factor 4 (PF-4) and subsequently by the GPIb-V-IX complex (25,27). Detailed analysis of megakaryocyte-specific gene regulation will allow comparative studies that correlate promoter activity with the temporal program of megakaryocyte gene expression.
Characterization of GPIX expression is important for several reasons. Since GPIX expression is critical to platelet function and necessary for assembly of the GPIb-V-IX complex (15), comparative studies of transcriptional regulation of the four subunits of the GPIb-V-IX complex can potentially identify which subunit is rate-limiting during assembly of the complex. Furthermore, characterization of GPIX regulation at the transcriptional level can provide a basis for comparison with regulation of other megakaryocyte genes to identify elements that modulate differences in the timing of developmental expression. In addition, the GPIX promoter can serve as a paradigm for other tissue-specific promoters and for promoters of other genes of the leucine-rich repeat family of proteins (32). The GPIX gene and upstream flanking regions have been previously sequenced, and the intron/exon structure and transcription start site have been characterized (38). The present study uses deletion and site-directed mutagenesis to characterize the GPIX promoter in transiently transfected human erythroleukemia (HEL) cells and identifies several positive and negative regulatory domains in the GPIX upstream flanking sequence. In addition, positive regulatory Ets and GATA sites were shown to function in vivo and specifically bind factors in vitro.
Plasmid Constructs-H/SGPIX was constructed by subcloning a HindIII-SacI fragment of a genomic phage containing sequence from the human GPIX gene, including the 5′-flanking sequence (38), into pBluescript II SK(+) (Stratagene). To generate the GPIX 5′ deletion constructs, H/SGPIX template was amplified using polymerase chain reaction (PCR) with a 3′-antisense primer that bound at +4 relative to the GPIX transcription start site in conjunction with different 5′-sense primers that bound at varying distances within the GPIX upstream flanking sequence. Each primer contained restriction site adaptors, BglII in the 3′-antisense primer and Acc65I in the 5′-sense primers, to facilitate subcloning of the amplified product. The PCR fragments were digested with BglII and Acc65I and inserted into the multiple cloning region of the luciferase expression construct pXP2 (39) previously digested with the same enzymes. The nomenclature used for each deletion construct shown in Fig. 1 indicates the number of bases of upstream 5′-flanking sequence relative to the GPIX start site (38).
Construction of the GPIX-Ets-mut and GPIX-GATA-mut substitution mutants used the PCR mutagenesis technique of Michael (40). Briefly, this protocol employed PCR amplification of the GPIX5′-203Luc template by using the same forward and reverse primers that were used to generate the original deletion construct in conjunction with 24-base mutagenic oligonucleotides that had been previously phosphorylated with polynucleotide kinase and ATP. This technique introduced specific mutations into either the Ets (GPIX-Ets-mut) or GATA (GPIX-GATA-mut) sites. The mutagenic oligonucleotides were designed to introduce EcoRI sites into the constructs to facilitate screening. Specific changes in the GPIX promoter sequence are shown in Fig. 2. To screen for possible PCR errors, the fidelity of both the 5′ deletion and substitution mutants was confirmed by sequencing prior to testing for promoter activity. No PCR errors were identified in the luciferase constructs; however, the original GPIX sequence has two errors, a two-base omission (5′-CG-3′) between positions −107 and −108 and a G residue between −166 and −167. These corrections have been submitted to GenBank (accession no. M80478).
Transient Transfection Assays-For each sample, a cationic lipid:plasmid DNA suspension was prepared by mixing 5 µg of luciferase reporter plasmid and 5 µg of the internal control plasmid CMVβgal (CLONTECH) in 0.25 ml of Hepes-buffered saline (20 mM Hepes, 150 mM NaCl, pH 7.4) with 0.25 ml of a solution consisting of a premix of 1:4 N-[1-(2,3-dioleoyloxy)propyl]-N,N,N-trimethylammonium methylsulfate (DOTAP) (Boehringer Mannheim):Hepes-buffered saline. Following incubation (10 min, 22°C), the 0.5 ml of DNA:lipid solution was added to 10⁶ HEL cells in 10 ml of RPMI 1640 medium with 2% fetal bovine serum and placed in 100-mm Corning tissue culture dishes. Cells were incubated (37°C, 5 h), and then the transfection medium was replaced with standard tissue culture medium. Approximately 24 h after the initiation of transfection, the cells were washed in phosphate-buffered saline followed by lysis in 200 µl of a solution containing 100 mM KHPO4, pH 7.8, 17 mM MgSO4, 1 mM dithiothreitol, and 0.1% Triton X-100 for 10 min on ice. The cellular debris was then removed by centrifuging at 12,000 × g for 2 min, and 20 µl of the supernatant was assayed for luciferase activity by using the Promega luciferase assay kit according to the manufacturer's instructions. Activity was assayed by using a Turner TD 20e luminometer. Galactosidase activity was measured with 50 µl of the lysate by using the colorimetric assay as described by the manufacturer (Promega) (41). The luciferase activities shown in Figs. 1 and 2 were corrected for minor variations in transfection efficiency by dividing the luciferase activity generated by the promoter constructs by the galactosidase activity generated by the CMVβgal internal control plasmid. There were no differences in cell mortality when transfected with the different constructs (not shown).
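The normalization described above (luciferase activity divided by the galactosidase activity of the cotransfected CMVβgal control) can be sketched as follows. This is a minimal illustration only: the function name, construct labels, and all readings are hypothetical and are not data from this study.

```python
def normalize_luciferase(luciferase_units, galactosidase_units):
    """Return luciferase activity per unit of galactosidase activity.

    Dividing by the internal-control signal corrects for differences in
    transfection efficiency between dishes.
    """
    if galactosidase_units <= 0:
        raise ValueError("galactosidase activity must be positive")
    return luciferase_units / galactosidase_units


# Hypothetical raw readings: (luciferase light units, galactosidase units).
raw_readings = {
    "construct_A": (1200.0, 0.5),
    "construct_B": (300.0, 0.25),
}

# Normalized activity for each construct.
normalized = {
    name: normalize_luciferase(luc, gal)
    for name, (luc, gal) in raw_readings.items()
}
```

In this invented example the two constructs differ 4-fold in raw luciferase signal but only 2-fold after normalization, which is the kind of distortion the internal control is meant to remove.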
Gel Mobility Retardation Assays-Oligonucleotides that include the GPIX GATA site (CTGCACTGGGGGGATAAGCCAGGC), GATA mutant (CTGCACTGGGGGAATTCGCCAGGC), Ets site (ATTTTCATCACTTCCTTCCGC), and Ets mutant (ATTTTCATCACTGAATTCCGC) were synthesized on an Applied Biosystems DNA synthesizer. Each oligonucleotide was end-labeled with polynucleotide kinase and [γ-32P]ATP (42), annealed with an excess of complementary synthetic strand, and purified on a nondenaturing polyacrylamide gel. To test for factor binding, a mixture was made of 4 µg of HEL nuclear extracts prepared as described (42). Briefly, a solution was made containing 10 mM Tris, pH 7.5, 50 mM KCl, 5 mM MgCl2, 1 mM dithiothreitol, 1 mM EDTA, 12.5% glycerol, 0.1% Triton X-100, and 2 µg of poly(dI-dC) as a nonspecific competitor with or without double-stranded wild-type or mutant unlabeled competitor as indicated in the experiment. The 15-µl reactions were preincubated for 20 min on ice prior to the addition of probe (20,000 cpm, 0.24 pmol) and then incubated at room temperature for an additional 20 min. The samples were separated on a 5% polyacrylamide gel containing 50 mM Tris/glycine running buffer. The gels were fixed in 10% methanol, 10% acetic acid, 5% glycerol, dried, and exposed to Kodak X-OMAT AR film at −80°C with an intensifying screen.
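The relationship between the wild-type and mutant probes can be checked directly from the four oligonucleotide sequences listed above. The sketch below locates the substituted positions and confirms that each mutant carries an EcoRI recognition sequence (GAATTC) in place of the GATAA or ACTTCCT motif. The dictionary keys and the helper name are illustrative choices, not part of the original protocol.

```python
# Probe sequences as given in the text (top strand, 5' to 3').
OLIGOS = {
    "GATA_wt": "CTGCACTGGGGGGATAAGCCAGGC",
    "GATA_mut": "CTGCACTGGGGGAATTCGCCAGGC",
    "Ets_wt": "ATTTTCATCACTTCCTTCCGC",
    "Ets_mut": "ATTTTCATCACTGAATTCCGC",
}

ECORI_SITE = "GAATTC"     # EcoRI recognition sequence
GATA_MOTIF = "GATAA"      # GATA binding motif cited in the text
ETS_MOTIF = "ACTTCCT"     # heptanucleotide consensus cited in the text


def mismatch_positions(wild_type, mutant):
    """Return 0-based positions at which two equal-length sequences differ."""
    if len(wild_type) != len(mutant):
        raise ValueError("sequences must have the same length")
    return [i for i, (a, b) in enumerate(zip(wild_type, mutant)) if a != b]


gata_changes = mismatch_positions(OLIGOS["GATA_wt"], OLIGOS["GATA_mut"])
ets_changes = mismatch_positions(OLIGOS["Ets_wt"], OLIGOS["Ets_mut"])

# Motif presence in each probe (True/False).
summary = {
    name: {
        "has_EcoRI": ECORI_SITE in seq,
        "has_GATA": GATA_MOTIF in seq,
        "has_Ets": ETS_MOTIF in seq,
    }
    for name, seq in OLIGOS.items()
}
```

Each mutant differs from its wild-type counterpart at three positions, which is enough both to destroy the binding motif and to create the EcoRI screening site.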
DNase I Protection Assay-A double-stranded 32P-end-labeled probe was generated by PCR amplification of the template GPIX5′-686Luc by using an unlabeled forward primer that bound at −423 and a 32P-end-labeled reverse primer that hybridized at +19 relative to the GPIX transcription start site. The resulting probe was mixed with increasing amounts of crude HEL cell nuclear extract and digested with 4 units of DNase I at 22°C for 2 min. The samples were analyzed by separation of the fragments on an 8 M urea, 10% polyacrylamide sequencing gel. An M13 sequencing ladder was run in parallel as a mobility marker. The gel was fixed in 12.5% methanol and 12.5% acetic acid, dried, and analyzed on a Bio-Rad Molecular Imager System phosphoimager using Bio-Rad Molecular Analyst software.

GPIX Upstream Flanking Sequence Contains Cell-specific Promoter Activity-The GPIX upstream flanking sequence was linked to the luciferase marker gene and tested for its ability to generate luciferase activity in transiently transfected cells. Table I compares promoter activity in the hematopoietic HEL cell line, nonhematopoietic bovine aortic endothelial (BAE) cells, baby hamster kidney fibroblasts, and human epithelial HeLa cells. The three luciferase expression constructs tested were GPIX5′-686, which has the GPIX upstream flanking sequence extending to −686 relative to the start of transcription; pXP1, which lacks the upstream promoter sequence but is otherwise identical to GPIX5′-686; and GL2C, which encodes the SV40 promoter and enhancer. Plasmid pXP1 expressed negligible luciferase activity in all cell lines tested. Construct GL2C showed variable expression levels in the cell lines tested, which probably reflects a combination of differing transfection efficiencies and variable capacity to support SV40 promoter/enhancer activities. GPIX5′-686Luc showed detectable luciferase activity only in the hematopoietic HEL cell line, which indicates that the GPIX upstream flanking sequence contains cell-specific promoter activity and suggests that this activity is selective for hematopoietic cells.
Deletion Analysis of the GPIX Promoter-Fig. 1 compares promoter activity in HEL cells generated by luciferase constructs containing 5′ truncations derived from the 5′-flanking region of the GPIX gene. All of the GPIX constructs show activity that is clearly differentiable from the background in cells transfected with pXP1. Comparison of activities generated by the GPIX 5′ deletion constructs shows a complex pattern of expression identifying several positive and negative regulatory elements. Elimination of the GPIX sequence between −686 and −423 increases expression approximately 3-fold, identifying one or more negative regulatory elements in this region. Truncation to −311 shows a diminution of activity by approximately 3-fold, indicating the presence of a positive regulatory element between −323 and −311. Deletion of the GPIX promoter region between −311 and −203 increases GPIX promoter activity approximately 10-fold, identifying a strong negative regulatory domain. Comparison of promoter activities generated by constructs GPIX5′-151Luc and GPIX5′-100Luc identifies another positive regulatory domain between −151 and −100. GPIX5′-100Luc and GPIX5′-69Luc, which gave similar amounts of luciferase activity, contain both the Ets and GATA consensus sites. These results indicate that, although the GATA and Ets sites regulate GPIX promoter activity (see below), there are other important regulatory elements within upstream flanking regions, the elimination of which can increase or decrease expression of the attached marker gene. The specific mechanism by which luciferase activity is altered is unclear. It is likely that activity correlates directly with luciferase message levels, although it is possible that changes in luciferase activity in the truncated promoter constructs are caused by alterations in transcription start site utilization.
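The inference used in this deletion analysis can be made explicit: if removing an interval raises activity, the interval is scored as a negative regulatory domain, and if removing it lowers activity, it is scored as a positive domain. The sketch below applies that rule to adjacent constructs. The activity values are invented so that they reproduce the approximate fold changes stated in the text; they are not the measurements plotted in Fig. 1, and the 2-fold threshold is an arbitrary assumption.

```python
def classify_intervals(constructs, threshold=2.0):
    """Classify the interval removed between consecutive 5' deletion constructs.

    `constructs` is a list of (upstream_end, relative_activity) pairs ordered
    from the longest to the shortest construct. An interval is "negative" if
    deleting it raises activity at least `threshold`-fold, "positive" if
    deleting it lowers activity at least `threshold`-fold, else "neutral".
    """
    results = []
    for (longer, act_long), (shorter, act_short) in zip(constructs, constructs[1:]):
        ratio = act_short / act_long
        if ratio >= threshold:
            label = "negative"
        elif ratio <= 1.0 / threshold:
            label = "positive"
        else:
            label = "neutral"
        results.append((longer, shorter, label))
    return results


# Hypothetical relative activities (arbitrary units), chosen only to mimic
# the fold changes described in the text.
hypothetical_constructs = [
    (-686, 1.0),
    (-423, 3.0),
    (-323, 3.0),
    (-311, 1.0),
    (-203, 10.0),
    (-151, 10.0),
    (-100, 2.5),
    (-69, 2.5),
]

domains = classify_intervals(hypothetical_constructs)
negative_domains = [(a, b) for a, b, label in domains if label == "negative"]
positive_domains = [(a, b) for a, b, label in domains if label == "positive"]
```

With these invented values the rule recovers the two negative and two positive intervals named in the text. Note that this pairwise comparison cannot distinguish a single element from several elements within one interval.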
Functional Activity of GPIX GATA and Ets Sites in Transiently Transfected HEL Cells-To directly test the function of the GPIX GATA and Ets sites, we performed site-directed mutagenesis of these sites within the context of the strongly expressing construct GPIX5′-203. Fig. 2 shows activities generated by the GATA and Ets mutant constructs in comparison with the wild-type GPIX5′-203 in transiently transfected HEL cells. The data show an approximately 4-fold diminution of promoter activity in the GPIX-Ets mutant and 7-fold diminution in the GPIX-GATA mutant in comparison with activity generated in HEL cells transfected with the wild-type plasmid GPIX5′-203. This experiment indicates that both the GATA and Ets sites are positive regulatory elements necessary for full GPIX promoter activity.
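Fold reductions and percent reductions are two views of the same quantity: an n-fold diminution leaves 1/n of the wild-type activity, so the percent reduction is (1 − 1/n) × 100. The short sketch below applies this conversion to the approximate fold values given above; the function name is an illustrative choice.

```python
def percent_reduction(fold_decrease):
    """Convert an n-fold decrease in activity into a percent reduction."""
    if fold_decrease < 1:
        raise ValueError("fold decrease must be at least 1")
    return (1.0 - 1.0 / fold_decrease) * 100.0


ets_reduction = percent_reduction(4)    # Ets mutant, about 4-fold lower
gata_reduction = percent_reduction(7)   # GATA mutant, about 7-fold lower
```

A 4-fold decrease corresponds to a 75% reduction and a 7-fold decrease to roughly 86%, which agrees with the approximate figures of 75 and 85% quoted for the Ets and GATA mutants in the Discussion.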

GPIX GATA and Ets Sites Specifically Bind to Proteins in HEL Cell Nuclear Extracts in Vitro-To test whether the GPIX GATA and Ets cis-acting elements bind to protein factors that might regulate GPIX promoter activity, we performed gel retardation experiments comparing in vitro binding activity in HEL cell nuclear extracts in the presence of oligonucleotides that encode 32P-end-labeled wild-type or mutant GPIX Ets or GATA sites. The open arrows in Fig. 3 identify two protein complexes when using the Ets probe and one protein complex when using the GATA probe. All three complexes are competable with homologous wild-type competitor but poorly competable with heterologous mutant competitor, although modest competition can be seen with very high levels (400-fold) of the mutant Ets competitor (Fig. 3, A and C, lanes 3-6). It is not clear whether the presence of a second Ets complex represents binding by a distinct protein or identifies a post-translationally modified or degraded Ets site-binding protein. Closed arrows (Fig. 3, lane 2 of each panel) identify nonspecific DNA-protein complexes that obscure the presence of specific complexes. Nonspecific binding is eliminated in the presence of poly(dI-dC) (Fig. 3, lanes 3-6).

FIG. 1. Analysis of GPIX promoter deletion constructs in transiently transfected HEL cells.
On the left is a schematic representation of GPIX upstream flanking regions that are linked to the luciferase marker gene. The names of the constructs indicate the number of bases extending upstream of the GPIX transcription start site. Construct GL2C contains the enhancer and promoter derived from SV40, and pXP1 is a luciferase construct lacking any known promoter elements. Rectangular boxes diagrammed on the left indicate the locations of possible cis-acting elements that were identified using the TFD Transcription Factor Database program, which is part of the GCG sequence analysis package. The sites indicated are of known functional significance in the regulation of other hematopoietic genes. The graph on the right shows luciferase activity (light units ± S.D.) generated by the constructs divided by galactosidase activity constitutively expressed by the CMVβgal plasmid (see "Experimental Procedures") to normalize for variations in transfection efficiency. Bars represent average activity from a representative experiment. Each construct was tested for promoter activity in at least four separate experiments.
Specific binding to both the Ets and GATA sites is shown by the absence of complex formation by the mutant Ets probe and greatly reduced binding of the mutant GATA probe (Fig. 3, B and D) and by the inefficient competition of the mutant relative to wild-type unlabeled competitors.
DNase I Protection Identifies a Discernible Footprint and a Prominent Hypersensitive Site in the GPIX Ets Sequence-32P-End-labeled GPIX promoter probe was mixed with increasing amounts of HEL cell nuclear extract and treated with DNase I prior to separation of digested fragments on a denaturing sequencing gel. Analysis of the autoradiographic pattern shown in Fig. 4 identifies a region of decreased band intensity extending from −49 to −35 relative to the GPIX transcription start site. This region includes a GPIX ACTTCCT consensus sequence extending from −45 to −39, which is a functionally important motif found in several megakaryocytic promoters (see "Discussion"). In addition to the "footprint," protein binding induces the formation of a DNase I hypersensitive site that maps to the first 5′-cytosine nucleotide located within the consensus ACTTCCT at −44.

proven to be key regulators in hematopoietic differentiation. Examples include NFE2 in megakaryocyte and erythrocyte development (44); PU.1, which is expressed in several hematopoietic lineages (45); and the IKAROS family, which is involved in lymphocyte differentiation (46). Of the four GPIb-V-IX subunit genes, only the GPIbα promoter has been well characterized (36). The work described here defines two negative regulatory domains, two positive regulatory domains, and positive GATA and Ets cis-acting elements in the GPIX upstream 5′-flanking region.
Targeted disruption of the GPIX GATA and Ets sites reduced promoter activity by approximately 85 and 75%, respectively. GATA and Ets cis-acting sites regulate expression of several megakaryocyte-specific genes including GPIIb (32-34), GPIbα (36), and β-thromboglobulin (47). However, GATA and Ets factors are not found exclusively in the megakaryocyte lineage (48,49). In addition to megakaryocytic promoters, GATA and Ets consensus sites are also important for the regulation of genes with a wider expression such as PECAM-1 (50) and P-selectin (51), which are expressed in both megakaryocytes and endothelial cells.
Members of the GATA family of transcription factors are defined by a similar binding motif, 5′-GATAA-3′, and a high degree of sequence similarity in their zinc finger DNA binding domains (52,53). GATA-1 is primarily expressed in erythrocytes, megakaryocytes, and mast cells (48,54); GATA-2 is expressed in a variety of cell types including megakaryocytes (48,55) and endothelial cells (56); and GATA-3 is expressed in T lymphocytes, erythrocytes (57,58), and developing brain (59). Which protein binds to the GPIX GATA site is unclear. Experiments have demonstrated in vitro GATA-1 binding in the GATA sites of the GPIIb promoter (60). However, the GPIb-V-IX complex is expressed later than GPIIb during megakaryocyte development (25); and since GATA-1 is only expressed early in megakaryocyte maturation while GATA-2 expression continues unabated (52,61,62), differential utilization of GATA-1 and GATA-2 might account for the temporal differences in protein expression. Fig. 4 shows a region of DNase I protection in the GPIX promoter extending from −49 to −35 relative to the GPIX transcription start site. This region of binding includes an ACTTCCT heptanucleotide consensus motif similar to the sequences found in several megakaryocyte promoters (38) including GPIIb (two copies) (55), PF-4 (35), PF-4 variant (63), β-thromboglobulin (47), and GPIbα (36). Which proteins bind the GPIX Ets site is unclear. The Ets family of transcription factors share similar amino-terminal DNA binding domains that bind to cis-acting elements with a core consensus binding sequence of GGAA (64-67). In vitro binding studies have identified Ets-1 and Ets-2 binding in the GPIIb promoter, and it is possible that one or both of these factors regulate expression of GPIX.
Deletion mutagenesis identifies two positive regulatory domains at −323 to −311 and −151 to −100 relative to the GPIX transcription start site (Fig. 1). Computer analysis done by utilizing the Transcription Factor Database program (68) identified tandem GAGGAA sequences at −320 to −315 and −302 to −297, which are potential sites for PU.1 binding (69,70). Elimination of the GPIX upstream flanking region between −323 and −311 reduces promoter activity 3-fold and identifies a relatively well circumscribed regulatory domain that includes the distal potential PU.1 site. PU.1 is a member of the Ets transcription factor family (69,70). The protein has been identified immunohistochemically in megakaryocytes, and PU.1 mRNA has been detected in HEL cells (71). Although tandem repeats of PU.1 binding sites are known to regulate expression of several myeloid- and lymphoid-specific genes (72,73), regulation of megakaryocyte-specific genes by PU.1 has not been described. Thus, although it is possible that PU.1 or another transcription factor is important for GPIX promoter activity in this region, this is only speculation. Elimination of promoter sequences between −151 and −100 reduced promoter activity approximately 3- to 5-fold. Homology searches using the Transcription Factor Database program (68) did not identify any sites that are known to regulate other hematopoietic genes. However, the program did identify a 5′-GAGGCGCT-3′ at −113 to −106, which regulates expression of sea urchin actin (74), and a potential steroid-responsive element (75,76), 5′-TGTGCCC-3′ at −135 to −129. The significance of these potential sites in GPIX regulation is unknown.
The GPIX promoter has a weak negative regulatory domain between −686 and −423 and a strong negative domain between −311 and −203. Negative regulation of promoter activity is emerging as a common theme in megakaryocyte-specific promoters. Transcriptional silencer domains have been identified upstream of the human PF-4 gene (77) and within the rat PF-4 (35) and human GPIIb (78,79) promoters. Prandini et al. (80) described two sites in the GPIIb promoter that mediate transcriptional repression in transiently transfected cells at −120 to −116 (5′-TGAGT-3′) and −102 to −93 (5′-CCCTTTGCTC-3′) relative to the GPIIb transcription start site. Sequence analysis of the GPIX promoter identified an exact duplicate of the GPIIb 5′-TGAGT-3′ pentamer at −455 to −451 within the −686 to −423 negative regulatory domain of the GPIX promoter. In addition, within the strong −311 to −203 GPIX negative regulatory domain, there are two sequences at −282 to −273 and −263 to −254 that are similar to the GPIIb CCCTTTGCTC sequence. However, the similarity is highly degenerate with three mismatches in each GPIX site.
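A comparison of this kind, an exact match to one motif and degenerate matches to another, amounts to a sliding-window search that tolerates a fixed number of mismatches. The sketch below shows one simple way to perform such a search. The function name is hypothetical and the test sequence is a toy string invented for illustration; it is not the GPIX promoter sequence, and this is not necessarily the method used by the Transcription Factor Database program.

```python
def find_approximate(sequence, motif, max_mismatches):
    """Return (start, mismatches) for every window matching `motif`.

    `start` is the 0-based index of the window in `sequence`; a window is
    reported when it differs from `motif` at no more than `max_mismatches`
    positions.
    """
    width = len(motif)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        mismatches = sum(1 for a, b in zip(window, motif) if a != b)
        if mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits


# Toy sequence (invented): a 10-base window at index 7 differs from the
# GPIIb repressor motif at three positions.
toy_sequence = "GATTACA" + "CCGTTAGCTG" + "TTTT"
degenerate_hits = find_approximate(toy_sequence, "CCCTTTGCTC", max_mismatches=3)

# Exact search (zero mismatches) for the TGAGT motif in another toy string.
exact_hits = find_approximate("AATGAGTAA", "TGAGT", max_mismatches=0)
```

Allowing three mismatches in a 10-base motif admits many chance matches in a sequence of several hundred bases, which is one reason the degenerate GPIX sites are treated cautiously in the text.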
In summary, molecular dissection of the GPIX promoter has identified several positive and negative regulatory elements. In addition, two of these cis-acting elements, Ets and GATA consensus sites, show functional activity that correlates with in vitro binding activity. Further studies are in progress to better define the precise locations of cis-acting sites within the positive and negative regulatory domains and to characterize the factors that bind them.