The mouse p44 mitogen-activated protein kinase (extracellular signal-regulated kinase 1) gene. Genomic organization and structure of the 5'-flanking regulatory region.

Mitogen-activated protein kinases (MAPKs), or extracellular signal-regulated kinases, are ubiquitous kinases conserved from fungi to mammals. Their activity is regulated by phosphorylation on both threonine and tyrosine, and they play a crucial role in the regulation of proliferation and differentiation. We report here the cloning of the murine p44 MAP kinase (extracellular signal-regulated kinase 1) gene, the determination of its intron/exon boundaries, and the characterization of its promoter. The gene spans approximately eight kilobases (kb) and can be divided into nine exons and eight introns, each coding region exon containing from one to three of the highly conserved protein kinase domains. Primer extension analysis reveals the existence of two major start sites of transcription located at −183 and −186 base pairs (bp) as well as four discrete start sites for transcription located at −178, −192, −273, and −292 bp relative to the initiation of translation. The start site region lacks TATA-like sequences but does contain initiator-like sequences proximal to the major start sites obtained by primer extension. One kilobase of the promoter region has been sequenced. It contains three putative TATA boxes far upstream of the main start sites region, one AP-1 box, one AP-2 box, one Malt box, one GAGA box, one half serum-responsive element, and putative binding sites for Sp1 (five), GC-rich binding factor (five), CTF-NF1 (one), Myb (one), p53 (two), Ets-1 (one), NF-IL6 (two), MyoD (two), Zeste (one), and hepatocyte nuclear factor-5 (one). To determine the sites critical for the function of the p44 MAPK promoter, we constructed a series of chimeric genes containing variable regions of the 5′-flanking sequence of the p44 MAPK gene and the coding region for luciferase. Activity of the promoter, measured by its capacity to direct expression of a luciferase reporter gene, is strong, being comparable with the activity of the Rous sarcoma virus promoter.
Progressive deletions of the 1 kb (−1200/−78) promoter region allowed us to define a minimal region of 186 bp (−284/−78) that has maximal promoter activity. Within this context, deletion of the AP-2 binding site reduces by 30-40% the activity of the promoter. Further deletion of this minimal promoter that removes the major start sites (−167/−78) surprisingly preserves promoter activity. This result implicates a major role of this region that contains the Sp1 sites. Finally, removal of the major start sites of transcription as well as the Sp1 sites reveals additional promoter activity at the upstream transcription minor start sites (−240/−167), an activity that is enhanced by the upstream cis-acting elements. In summary, our findings reveal a complex pattern of transcriptional regulation of the mouse p44 MAPK promoter.

Mitogen-activated protein kinases (MAPKs) or extracellular signal-regulated kinases were first described as two proteins of 42 and 44 kDa that were phosphorylated on both tyrosine and threonine residues following stimulation of 3T3-L1 adipocytes with insulin (1, 2). These same phosphoproteins had been visualized previously by two-dimensional gel electrophoresis (3-5). MAPKs are ubiquitously expressed, being found in all cell systems studied including yeast, worms, flies, frogs, plants, and mammals (6). They are activated by a wide variety of extracellular signals, and their activation requires the phosphorylation of the highly conserved TEY motif present in almost all described MAPKs. An increasing body of data, in particular in yeast, suggests that MAPKs belong to a multigene family. In yeast, each reported isoform has been implicated in a different signaling pathway, leading to mating, cell wall synthesis, or regulation of osmotic pressure. Studies with the mammalian homologues of yeast MAPK suggest that they play equivalent roles in different processes, including proliferation, differentiation, and response to environmental stress (7-11).
As far as the p42 and p44 MAPK are concerned, two approaches have demonstrated their role in controlling fibroblast cell growth. First, we showed that overexpression of either a dominant-negative p44 MAPK mutant or an antisense construct prevented growth factor-induced cell cycle entry. Second, we (12) and others (13,14) demonstrated that expression of a constitutively active form of a MAPK activator (MEK1) led to the constitutive activation of p42 and p44 MAPK, an action sufficient to promote cell cycle entry and oncogenicity in fibroblasts. So, at this stage, both MAPK isoforms, which are coordinately regulated and capable of phosphorylating identical substrates in vitro, appear to be redundant. Alternatively, they might serve different functions as a consequence of alternatively spliced isoforms that display distinct subcellular localization, as recently reported (15). To resolve this issue we isolated genomic MAPK clones in order to study their regulation and to subsequently disrupt each corresponding mouse gene. Here we describe the detailed structure of the murine p44 MAPK (extracellular signal-regulated kinase 1) gene. We have also investigated its promoter by deletion analysis to identify cis elements important for driving transcription.

EXPERIMENTAL PROCEDURES
Materials-Restriction and DNA modifying enzymes were obtained from New England Biolabs (Ozyme, France) or from Eurogentec (Liege, Belgium). [α-32P]dCTP, [α-35S]dATP, and [γ-32P]dATP were from Amersham Corp. or ICN. Synthetic oligonucleotides were from Eurogentec. The genomic DNA library was made with SV 129 D3 embryonic stem cell DNA and constructed in λGEM12. This library was kindly provided by J-M. Garnier (laboratory of P. Chambon, Strasbourg, France).
Genomic Clones-The genomic library was screened with a 0.73-kb KpnI/KpnI fragment from the hamster p44 MAPK cDNA (16). Six phage plaques were isolated from the library with the first screening. Among these different clones two clones (clones 14 and 15) hybridized to the greatest extent to the probe at high stringency and were subcloned for a further analysis. Clone 15 was found to correspond to p44 MAPK, and clone 14 corresponded to another isoform of p44 MAPK with approximately 80% homology at the nucleotide level. Two SacI fragments of 4.3 and 4.8 kb adjacent in clone 15 were subcloned into Bluescript KS. The partial sequence of these subclones was obtained using Universal M13, T3, T7, SK, and KS primers as well as oligonucleotide primers derived from the coding sequence of the hamster p44 MAPK.
DNA Sequence Analysis-Sequencing was performed by the double-stranded dideoxy chain termination technique using the Pharmacia kit. Restriction analysis and determination of overlapping sequence were done using the Mac Vector program for Macintosh (IBI, New Haven, CT).
Primer Extension Analysis-Three oligonucleotides, ERS 2 corresponding to bases 80-61 of the cDNA (base 1 = A of ATG), ERS 5 corresponding to bases 20-1, and GP9 corresponding to bases 208-189, were designed to hybridize with total RNA isolated from the same ES cells used to construct the genomic library and were used to prime reverse transcription. The oligonucleotides were end-labeled with [γ-32P]dATP and T4 polynucleotide kinase and purified by ethanol precipitation. Labeled oligonucleotides were co-precipitated with 30 μg of total RNA and resuspended in hybridization buffer (5 mM PIPES, pH 6.4, 0.5 mM NaCl, 1.0 mM EDTA, and 80% formamide). The mixture was heated to 95°C for 10 min and annealed at 50°C for 16 h to avoid the generation of secondary structures. The RNA and annealed oligonucleotides were extracted with phenol/chloroform and ethanol-precipitated. The pellet was resuspended in reverse transcriptase buffer with 5 mM dNTPs, 25 units of RNase inhibitor, and 50 units of Moloney murine leukemia virus reverse transcriptase. Elongation was carried out for 2 h at 42°C. The reaction was stopped by incubation at 65°C for 10 min, and the products were incubated with 10 units of RNase H for 30 min at 37°C. The reaction products were phenol/chloroform-extracted, ethanol-precipitated, and resuspended in formamide loading buffer. Half of the primer-extended products were electrophoresed on a 6% acrylamide, 7 M urea sequencing gel in parallel with the products of a double-stranded sequencing reaction.
Construction of Chimeric Luciferase Plasmids-A BglII/HindIII fragment corresponding to the promoter region was cloned in front of the luciferase reporter gene as follows. First, a 5′ 2-kb SacII/SacII fragment derived from a 4.3-kb subclone of genomic lambda phage 15 was introduced into Bluescript KS so that the 5′ SacII site was in the Bluescript polylinker. Then a BglII (internal site)/HindIII (polylinker site) fragment was subcloned in the PxP1 luciferase vector (17) to obtain the BH construct. The BN construct was obtained by cutting the BH plasmid with NheI (internal site) and HindIII (polylinker site), blunt-ending both extremities with the Klenow enzyme, and religating the vector on itself. The P3′ vector was obtained as follows. A 700-bp PstI/PstI fragment of the promoter was first subcloned in Bluescript KS, and then using the BamHI and HindIII sites of Bluescript this fragment was introduced at the corresponding BamHI and HindIII sites of the PxP1 vector. The PH vector was obtained by cutting the P3′ vector with BamHI (polylinker site) and NheI (internal site) and ligating this fragment in the BH vector from which a BamHI/NheI fragment had been removed. The NH vector was obtained by cutting the BH vector with NheI (internal site) and BglII (corresponding to the BglII described above), blunt-ending the extremities with Klenow enzyme, and religating the vector on itself. The SH construct was obtained by completely digesting the BH vector with BglII and partially digesting with StyI, blunt-ending with Klenow, and religating the vector on itself. The BsH vector was obtained by digesting the BH vector with BglII and BssHII, blunt-ending with Klenow, and religating the vector on itself. The NP vector was obtained by cutting the P3′ vector with BamHI (polylinker site) and NheI (see above), blunt-ending the extremities with Klenow enzyme, and religating the vector on itself.
The antisense construct was obtained by introducing in the PxP1 vector a KpnI/KpnI fragment of 1.8 kb isolated from the SacII construct described above. The +AP2 and −AP2 vectors were constructed by PCR using, respectively, oligo 1 (CGGGATCCCTTAGCATTACTGAG) + 4 (GAAAGCTTGATATCGAATTCCTGC) and oligo 2 (CGGGATCCCTCTTGGCAGACTAAAG) + 4. The +AP2/Bs and −AP2/Bs constructs were obtained by cutting, respectively, the +AP2 and −AP2 constructs with BssHII (internal site) and HindIII (polylinker site), blunt-ending with Klenow, and religating the vector on itself. The S/Bs construct was obtained by cutting the SH vector with BssHII (internal site) and HindIII (polylinker site), blunt-ending with Klenow, and religating the vector on itself. The Rous sarcoma virus (RSV) luciferase gene was as described previously (18).
Transient Transfection and Luciferase Assay-CCL39 cells in 12-well dishes (10^5/well) were transiently transfected by CaPO4 precipitation with the indicated plasmids (2 μg/well). Sixteen hours after addition of DNA, the cells were washed twice with phosphate-buffered saline and incubated with Dulbecco's modified Eagle's medium with 7.5% fetal calf serum. Two days later, the cells were washed with cold phosphate-buffered saline, and luciferase assays were performed as follows (Promega protocols and applications guide). Cells were lysed in lysis buffer (25 mM Tris-phosphate, pH 7.8, 2 mM dithiothreitol, 2 mM 1,2-diaminocyclohexane-N,N,N′,N′-tetraacetic acid, 10% glycerol, and 1% Triton X-100) for 15 min at room temperature and the lysate was cleared by centrifugation (5 min, 12,000 × g). The assay of luciferase activity was performed in a chemoluminometer in a buffer containing 20 mM Tricine,
Preparation of RNA-Cells were washed in ice-cold phosphate-buffered saline and lysed in RNA Insta-Pure buffer from Eurogentec (Liege, Belgium). The supernatant was cleared by centrifugation and ethanol-precipitated. RNA was resuspended in sterile water.

RESULTS
Given the existence of a large MAPK family and the presence of various pseudogenes, it was crucial to characterize with certainty the genomic clones that hybridize with the entire hamster p44 MAPK cDNA. Of the two phages, 14 and 15, that hybridized at high stringency to the p44 MAPK probe, only phage 15 was assigned to the mouse p44 MAPK (extracellular signal-regulated kinase 1) gene. This identification was confirmed by sequencing all of the exons. In contrast, phage 14 corresponds to a close family member of p44 MAPK. From phage 15, we estimated that the transcription unit of the mouse gene for p44 MAPK (extracellular signal-regulated kinase 1) spans approximately 8 kb. Fig. 1 shows the relationship of the gene to its corresponding mRNA/cDNA and protein. Sequencing of the plasmid subclones that hybridize to the hamster p44 MAPK probe allowed the determination of the position of the intron/exon junctions. We found nine exons (Fig. 1 and Table I), and all of the splice acceptor and donor sequences agree with the "GT-AG" rule (19). Each exon of the gene encodes one or more of the conserved subdomains previously identified in protein kinases (20). The first exon contains all of the 5′-untranslated region and also contains the region coding for the GXGXXG domain that determines ATP binding. The possibility of alternative splicing with the presence of an additional intron in that region is discussed later.
Exon 2 contains lysine 72 of subdomain II implicated in phosphate transfer, subdomain III, and subdomain IV with conservation of glutamic acid 89 and hydrophobic residues; exon 3 encodes the subdomains V and VI with the HRD motif; exon 4 encodes subdomain VII with the invariant DFG motif and subdomain VIII containing the APE triplet; exon 5 encodes the subdomain IX where aspartic acid 228 is conserved; exon 6 encodes subdomain X; exon 7 encodes subdomain XI; exon 8 encodes a region of the protein apparently implicated in the specificity of substrate recognition of the MAP kinase family plus 5.4% of the 3′-untranslated region; and exon 9 encodes the remaining 3′-untranslated region.
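The "GT-AG" rule cited above states that an intron begins with the dinucleotide GT and ends with AG. As a minimal sketch of how exon/intron junctions could be checked against this rule, the following Python function tests a single intron sequence. The function name and the example sequences are hypothetical and are given for illustration only; they are not taken from the p44 MAPK gene.

```python
def follows_gt_ag_rule(intron: str) -> bool:
    """Return True if the intron starts with GT and ends with AG."""
    s = intron.upper()
    # An intron must at least hold both dinucleotides without overlap.
    return len(s) >= 4 and s.startswith("GT") and s.endswith("AG")


# Hypothetical intron sequences (illustration only, not from the gene).
examples = {
    "canonical": "GTAAGTCTTTCCCTTTCAG",
    "noncanonical": "GCAAGTCTTTCCCTTTCAG",
}

results = {name: follows_gt_ag_rule(seq) for name, seq in examples.items()}
for name, ok in results.items():
    print(name, "agrees with GT-AG" if ok else "does not agree with GT-AG")
```

In practice each of the eight introns would be passed to the function with its sequence as read from the sequenced subclones.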
Identification of the Transcription Start Site-The 5′ end of the mRNA was determined by primer extension using different 20-base oligonucleotides (see "Experimental Procedures") derived from the sequence of the mouse p44 MAPK. ERS 2 and GP9 are specific for p44 MAPK, while ERS 5 can hybridize to p42 MAPK as well as p44 MAPK. The ERS 5 oligo was used to determine the exact position of the start sites because the extended fragments obtained with GP9 or ERS 2 were too long for their size to be determined accurately on classical sequencing gels. The experiment performed with ERS 5 reveals the existence of two major start sites (−183 and −186) as well as four minor initiation sites at positions −178, −192, −273, and −292 bp relative to the ATG (A of ATG = +1) (Fig. 2). Because of the high GC content upstream of and including the ERS 5 primer, it was difficult to obtain reliable sequence within this region, and therefore an unrelated template and primer were used to calibrate the primer extension analysis. When the primer extension analysis was performed with the GP9 oligonucleotide, we detected a major band of 44 bp, indicating the existence of another possible transcript shorter than the classical p44 MAPK transcript (data not shown). Interestingly, the sequence shown in Fig. 2 also reveals the presence of one splice donor (CGGGTGGGT at −293) and three splice acceptor (CCGCGCAGG at −138, TGGTGAAGG at +92, and GGGCAGC at +100) consensus sites within the promoter, the 5′-untranslated sequence, and the beginning of the coding region. Fig. 2 only shows the positions of the start sites of transcription in the absence of alternative splicing.
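The relationship between the length of a primer-extended product and the start-site coordinate can be illustrated with a short Python sketch. It assumes that the extended product includes the primer, that the 5′ end of ERS 5 is complementary to base +20, and that the numbering has no position 0 (A of ATG = +1, the base immediately upstream = −1). The function names are hypothetical, and the product lengths computed here are derived from the reported coordinates, not measured values from the gels.

```python
def start_site_from_product(primer_5prime_pos: int, product_len: int) -> int:
    """Map a primer-extension product length to a start-site coordinate.

    Coordinates follow the convention A of ATG = +1 with no position 0.
    primer_5prime_pos is the cDNA base complementary to the primer 5' end.
    """
    if primer_5prime_pos < 1 or product_len < 1:
        raise ValueError("positions and lengths must be positive")
    pos = primer_5prime_pos - product_len + 1  # scale that includes 0
    return pos if pos >= 1 else pos - 1        # skip the missing 0


def product_len_for_start(primer_5prime_pos: int, start: int) -> int:
    """Inverse mapping: expected product length for a given start site."""
    if start == 0:
        raise ValueError("there is no position 0 in this numbering")
    shifted = start if start > 0 else start + 1
    return primer_5prime_pos - shifted + 1


ERS5_5PRIME = 20  # ERS 5 corresponds to bases 20-1 of the cDNA
reported_sites = [-178, -183, -186, -192, -273, -292]
expected_lengths = {s: product_len_for_start(ERS5_5PRIME, s) for s in reported_sites}
for site, length in expected_lengths.items():
    print(f"start site {site}: expected product of {length} nucleotides")
```

The same functions apply to ERS 2 or GP9 by changing the primer position to 80 or 208, which shows why their products are longer and harder to size on a sequencing gel.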
Sequence of the 5′-Regulatory Region-To obtain the sequence of the ATG 5′-flanking region, we subcloned a 2-kb SacII/SacII fragment from the original 4.3-kb SacI/SacI fragment. This SacII/SacII fragment was digested with BstEII, AccI, NheI, and BglII in order to obtain smaller fragments. The sequence obtained is shown in Fig. 3. It was searched for reported consensus sequences that are recognized by DNA-binding proteins. The sequence reveals three putative TATA boxes (TATAAAA at −846; GATACATA at −638; CATAGAGA at −384), one hepatocyte nuclear factor-5 site (TATTTGT at −1321) (21), two p53 sites (GGGCTTGCTT at −1193 and GGGCTAGCCT at −369)   (32), and five GC-rich binding factor (GCF) binding sites ((G/C/T)(G/C)CG(C/G)(C/G)(C/G)C(G/C/T)) overlapping the Sp1 sites (33). All of the binding sites cited above mediate positive regulation, whereas GCF exhibits transcriptional repressor activity. To study the relevance of these sequences, a comprehensive structure-function analysis of the promoter was performed.
Identification of the Promoter Region-The promoter activity of a 1128-bp BglII/SacII (BH) fragment (see "Experimental Procedures" and Fig. 3) was analyzed by measuring its ability to direct production of a luciferase reporter enzyme in transient transfection assays in CCL39 lung fibroblasts. Positive and negative controls included RSV luciferase, which contains the RSV promoter, PxP1 (EV), which has no promoter, and a KpnI/KpnI (AS) fragment of 1.8 kb introduced into the vector in a reverse orientation. Deletions of the BH fragment from the 5′ or the 3′ end (see "Experimental Procedures" and Figs. 3 and 4) were also tested. The BH fragment showed high promoter activity comparable with that of the control RSV promoter. 5′ deletion of this construct to the distal PstI (PH) or NheI (NH) sites did not strongly affect the activity of the promoter.
Positioned within segment −367/−178 (proximal start site) are sequence motifs that are perfect or single base mismatches to the consensus binding sites for transcription factors Myb (28), AP-2 (29), AP-1 (30), and CTF-NF1 (31). Interestingly, the +AP-2 construct shows a small increase in promoter activity when compared with the BH construct, suggesting removal of upstream inhibitory sequences. Thus, we can equate the −284/−78 region with the maximally active promoter. To determine whether or not any of these consensus sequences were genuine and important AP-2, AP-1, or CTF-NF1 binding sites, we performed a finer deletion analysis. As shown in Fig. 4, deletion of the AP-2 binding site reduced the activity of the +AP-2 construct by 30-40%. A deeper deletion to the StyI site, deleting the AP-1 and CTF-NF1 sites, decreased the activity of the +AP-2 construct by 66%. Therefore, these results indicate that the AP-2, AP-1, and CTF-NF1 potential binding sites contribute to the strength of the promoter.
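The search for motifs that are "perfect or single base mismatches" to a consensus can be sketched in Python as follows. This is only an illustration under stated assumptions, not the procedure used to analyze the promoter: it parses the GCF consensus exactly as written in the text, scans the forward strand only, and reports every window with at most a chosen number of mismatches. The function names and the two test sequences are hypothetical.

```python
import re


def parse_consensus(text: str) -> list[set[str]]:
    """Parse a consensus such as '(G/C/T)(G/C)CG' into allowed-base sets."""
    sites = []
    for group, single in re.findall(r"\(([ACGT/]+)\)|([ACGT])", text.upper()):
        sites.append(set(group.split("/")) if group else {single})
    return sites


def scan(seq: str, sites: list[set[str]], max_mismatches: int = 0):
    """Return (0-based index, mismatch count) for each acceptable window."""
    s = seq.upper()
    hits = []
    for i in range(len(s) - len(sites) + 1):
        window = s[i:i + len(sites)]
        mismatches = sum(base not in allowed for base, allowed in zip(window, sites))
        if mismatches <= max_mismatches:
            hits.append((i, mismatches))
    return hits


# GCF consensus as given in the text.
GCF = parse_consensus("(G/C/T)(G/C)CG(C/G)(C/G)(C/G)C(G/C/T)")

# Hypothetical sequences (illustration only, not from the promoter).
exact_hits = scan("AAGCCGCCCCGTT", GCF)                    # perfect match only
near_hits = scan("GCCGCCCCA", GCF, max_mismatches=1)       # one mismatch allowed
print(exact_hits, near_hits)
```

A complete search would also scan the reverse complement and convert the 0-based indices to coordinates relative to the ATG.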
Transcription Can Be Initiated from All of the Start Sites Determined by Primer Extension-An intriguing observation is that deletion of the major start sites of transcription (BsH construct) decreased but did not abolish transcription, suggesting intervention of initiator-like sequences (34) within the −167/−78 region or specific initiation of transcription in fibroblasts. The quite strong promoter activity (50% of maximum) is probably driven by the few remaining sequences containing the three Sp1 binding sites. However, these Sp1 sites are dispensable since their removal (+AP-2/Bs or −AP-2/Bs constructs) preserves promoter activity (12 or 8% of maximal). On the other hand, a 3′ deletion to the PstI proximal site (NP construct) that deletes Sp1 sites as well as the major transcription start sites still possesses 5-10% of promoter activity. The addition to this construct of the upstream cis-acting elements (P3′ construct) enhanced this activity 10-fold. This result shows that transcription can be initiated from the discrete start sites (−273, −292) and that there exist positive cis-active elements in the −939/−367 region. Indeed, deletion of these discrete start sites (BN construct) abolished promoter activity.
[Figure legend fragment: ... (178, 183, 186, 192, 273, and 292). The lowercase letters represent the position of the intron. In the case of Sp1 or GCF the number of each site in the underlined sequence is given.]

DISCUSSION
MAPK belongs to a multigene family, and previous reports have shown that expression of dominant negative mutants or antisense constructs of p44 MAPK was able to inhibit fibroblast proliferation (7). Because of their potential importance in growth control (7,35) and differentiation (36-38) the genes for human p44 MAPK (extracellular signal-regulated kinase 1), p42 MAPK (extracellular signal-regulated kinase 2), and p63 MAPK (extracellular signal-regulated kinase 3) have been mapped (39).
However, it has not been possible to attribute specific biological roles to the individual isoforms even if some data describe differential activation of p42 MAPK versus p44 MAPK in platelets (40). As a first step in such an analysis, we have isolated and partially characterized several different mouse MAPK genomic clones and characterized the gene for mouse p44 MAPK (extracellular signal-regulated kinase 1) and a portion of its 5′-flanking regulatory region in detail.
The p44 MAPK (extracellular signal-regulated kinase 1) gene spans approximately 8 kb and is divided into nine exons. An interesting aspect of the gene's structure is that one or more of the domains highly conserved among protein kinases are contained within individual exons. This is the first example of such a distribution, and it is strikingly different from what is observed in related kinases, such as mammalian cdc2 (41), which is divided into four exons only, without precise division of the protein kinase subdomains among them. This unusual subdivision could result from the evolution of an ancestral gene that has progressively acquired specific characteristics. The first 7 exons encode the protein kinase domains. An additional exon, exon 8, encodes the carboxyl terminus of MAPK. The C-terminal domain it encodes can be considered to be specific for p44 MAPK because it is one of the most divergent domains among the MAPK-related kinases, the other variable domain being subdomain X. The ninth exon directly encodes 95% of the 3′-untranslated region of p44 MAPK mRNA.
We also describe the presence of two predominant start sites of transcription located at −183 and −186 bp upstream from the ATG (A = 1). However, overexposure of the gels allowed us to see additional discrete start sites. The presence of multiple sites of transcription initiation is open to interpretation. First, the oligonucleotides used in primer extension analysis could have hybridized to mRNA not yet described. This possibility has to be considered because during the screening of the genomic library five other clones, each apparently encoding a different gene, were shown to hybridize to the p44 MAPK probe at high stringency. The phenomenon could also be explained by the absence of a real consensus sequence for a TATA box. For SV 40 and histone H2A genes, removal or mutation of the canonical TATA box results in the initiation of transcription at many sites within the promoter (42-44). A third interpretation of the detection of minor transcripts is the possibility of alternative splicing suggested by the presence of one splice donor and three splice acceptor consensus sequences. While it is possible that the splice donor site at −293 is used, it is not likely to be used with the splice acceptor site at position −138 since (a) the major start sites of transcription would be located at positions −338, −343, −346, and −352 in Fig. 2, too close to the ATG at position −338 and (b) the sequence located upstream of this ATG does not match the consensus sequence of Kozak. If splicing occurs between the donor site at position −293 and the splice acceptor sites at +92 or +100 the downstream ATG at position +166 could be used. However, the sequence upstream of this ATG also does not match the consensus sequence of Kozak, and if translation did start at this ATG the conserved GXGXXG ATP binding domain would be deleted. If such alternative splicings did occur, then shorter or longer mRNA could be transcribed from the same gene.
Detection of a shorter mRNA has already been described for p42 MAPK/extracellular signal-regulated kinase 2, apparently as a result of alternative splicing of the gene (45,46). For the reasons outlined above and because it is impossible to detect, by high resolution polyacrylamide gel electrophoresis and Western blot, proteins with higher or lower molecular weight in fibroblasts or ES cell extracts (data not shown), we believe that it is unlikely that such alternatively spliced transcripts exist in these cells.
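The comparison of an ATG with the Kozak consensus used in the argument above can be sketched in Python. The sketch relies on the two positions generally regarded as most important in the consensus (gccRccATGG): a purine at −3 and a G at +4, with the A of the ATG as +1. The three-level labels follow a common convention and are an assumption here, as are the function name and the example sequences, which are not taken from the p44 MAPK gene.

```python
def kozak_context(seq: str, atg_index: int) -> str:
    """Classify the context of an ATG from positions -3 and +4.

    atg_index is the 0-based index of the A of the ATG in seq.
    Returns 'strong' (both positions match), 'adequate' (one matches),
    or 'weak' (neither matches).
    """
    s = seq.upper()
    if s[atg_index:atg_index + 3] != "ATG":
        raise ValueError("no ATG at the given index")
    if atg_index < 3 or atg_index + 3 >= len(s):
        raise ValueError("need 3 bases upstream and 1 base downstream of ATG")
    purine_at_minus3 = s[atg_index - 3] in "AG"  # -3 is three bases before A
    g_at_plus4 = s[atg_index + 3] == "G"         # +4 follows the G of ATG
    score = purine_at_minus3 + g_at_plus4
    return ("weak", "adequate", "strong")[score]


# Hypothetical sequences (illustration only).
contexts = {
    seq: kozak_context(seq, 6)
    for seq in ("GCCACCATGG", "GCCACCATGT", "GCCTCCATGT")
}
for seq, label in contexts.items():
    print(seq, label)
```

Applied to the ATGs at −338 and +166, such a check would require the flanking bases from Fig. 2, which are not reproduced in this text.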
We have shown that a 1128-bp BglII/SacII fragment was sufficient to drive transcription of the luciferase gene. Activity of this promoter is high because it is comparable with the activity induced by the RSV promoter, which is considered to be a strong promoter. We have also shown that transcription can be initiated from each start site of transcription determined by primer extension and probably from initiator-like sequences (34) or fibroblast-specific initiation of transcription, at least in vitro. First, transcription can be initiated from the minor start sites of transcription, as in the NP construct. The basal transcriptional activity detected with this construct is strongly enhanced in the P3′ construct, suggesting positive intervening sequences within the −939/−367 region. However, deletion of an NheI/SacII fragment on the 3′ region of the BH fragment (BN) results in complete loss of activity, demonstrating that the TATA box located upstream of the NheI site does not have a relevant activity. Second, the +AP-2/Bs and −AP-2/Bs constructs containing the major start sites of transcription can also drive transcription but at a low level when compared with the BH construct. Presence of a small amount of mRNA in lung (47) and CCL39 fibroblasts (16) suggests that these sites are predominantly used in CCL39 cells. Third, the −167/−78 region where no start sites of transcription are detected is responsible for high basal promoter activity. Different interpretations could account for this result. We can suspect intervention of initiator-like sequences (34) or start sites undetectable in ES cells that can be alternatively used when the major ones are deleted. We can imagine that these different possibilities of initiation reflect what happens in vivo when tissue-specific cis-acting elements are used.
The MAPK promoter contains consensus binding sites for many transcription factors. Their presence, however, does not prove their involvement in the regulation of p44 MAPK promoter activity. In fact, it is difficult to attribute a role to the binding sites located upstream of the NheI restriction site. The fact that the luciferase activity of the P3′ construct is higher than that of the NP construct proves that this region contains positive regulatory elements. However, the role of the AP-2, AP-1, and CTF-NF1 sequences is clearer because their deletions decrease the p44 MAPK promoter activity. The fact that Jun, a partner of the AP-1 complex, is phosphorylated by a Jun kinase, a member of the MAP kinase family, makes it tempting to speculate that MAPK could regulate its own transcription (8). However, cotransfection of the BH construct with expression vectors for Fos and Jun or a constitutively active form of MAP kinase kinase (12) only shows a small increase in the MAPK promoter activity (data not shown). The role of Sp1 sites is unclear because they are all situated before the major start sites of transcription. Sp1 sites have been shown to play a role in transcription of housekeeping genes such as the hprt (48,49) or dhfr (50,51) genes, in the regulation of genes specific to or maximally expressed in the central nervous system such as nicotinic acetylcholine receptors (52) and plasminogen activator (53), as well as in the regulation of transcription of growth control-regulated genes such as c-myc (54), epidermal growth factor receptor (55), and Ha-ras (56). In each of these genes many start sites of transcription have been documented, as in the p44 MAPK gene.
In the case of the epidermal growth factor receptor promoter, transcriptional activity can be detected in chloramphenicol acetyltransferase transient transfection assays even in the absence of the major start sites of transcription (57), and the proximal bases in front of the initiation of translation can bind nuclear proteins in gel retardation assays, suggesting a major role of this region in the initiation of transcription. This region can function as a promoter and mediates inductive response to epidermal growth factor, phorbol 12-myristate 13-acetate, and cAMP (58). This is also the case for the BsH construct, which shows high transcriptional activity. We can suppose that three of the five Sp1 sites in the construct (stop at the SacII site) are implicated in this activity. However, it is difficult to say if they play the same role in vivo. In fact, Sp1 belongs to a multigene family whose expression varies in different tissues. Expression of Sp1 is high in lung and thymus (59), but p44 MAPK is low in lung and expressed to near undetectable levels in thymus (47). However, the very high amount of MAPK mRNA in brain suggests that previously described brain-specific Sp1-like factors (53, 60) could regulate transcription of MAPK. The presence of binding sites for GCF overlapping the Sp1 sites suggests a balance between the activities of these two transcription factors in different physiological conditions. A recent report also describes phosphorylation of Sp1 by a DNA-dependent protein kinase (61). This phosphorylation could be the result of activation of the kinase pathway activated by UV light or by a signaling pathway leading to apoptosis. We can also imagine Sp1-dependent activation of the transcription of p44 MAPK after such stress.
The variation of p44 MAPK mRNA levels in different organs could implicate tissue specific elements of the promoter (47). Thus, the presence of a binding site for hepatocyte nuclear factor-5, which is involved in gene expression in the liver, two binding sites for MyoD, which is implicated in myocyte differentiation, and a GAGA box, which has been shown to be important in Drosophila development would suggest developmental expression and tissue-specific regulation of the p44 MAPK gene. However, the p44 MAPK mRNA and protein levels are not influenced by growth factors or by the position in the cell cycle in a given cell line, suggesting that the p44 MAPK gene, like many housekeeping gene products, is not submitted to acute regulation. In contrast, the complexity or "plasticity" of the promoter region, here defined, might reflect its ubiquitous expression from embryonic stem cells (ES cells, data not shown) to most differentiated tissues.
In summary, the work reported here on the cloning and characterization of the p44 MAPK gene is the first step toward the inactivation of the gene by homologous recombination in mouse embryonic stem cells. Parallel studies with the p42 MAPK gene will be necessary to determine whether each MAP kinase serves a specific function or whether the two are totally redundant and can entirely substitute for each other.