A Minimal Murine Msx-1 Gene Promoter

To dissect the cis-regulatory elements of the murine Msx-1 promoter, which lacks a conventional TATA element, a putative Msx-1 promoter DNA fragment (from −1282 to +106 base pairs (bp)) or its congeners containing site-specific alterations were fused to a luciferase reporter and introduced into NIH3T3 and C2C12 cells, and the expression of luciferase was assessed in transient expression assays. The functional consequences of sequential 5′ deletions of the promoter revealed that multiple positive and negative regulatory elements participate in regulating transcription of the Msx-1 gene. Surprisingly, however, optimal expression of the Msx-1 promoter in either NIH3T3 or C2C12 cells required only 165 bp of the upstream sequence, which warranted a detailed examination of its structure. Therefore, the functional consequences of site-specific deletions and point mutations of the cis-acting elements of the minimal Msx-1 promoter were systematically examined. Concomitantly, potential transcriptional factor(s) interacting with the cis-acting elements of the minimal promoter were also studied by gel electrophoretic mobility shift assays and DNase I footprinting. Combined analyses of the minimal promoter by DNase I footprinting, electrophoretic mobility shift assays, and supershift assays with specific antibodies revealed that the 5′-flanking regions from −161 to −154 and from −26 to −13 of the Msx-1 promoter contain, respectively, an authentic E box (proximal E box), capable of binding a protein immunologically related to the upstream stimulating factor 1 (USF-1), and a GC-rich sequence motif that can bind Sp1 (proximal Sp1). Additionally, we observed that promoter activation was seriously hampered if the proximal E box was removed or mutated, and promoter activity was eliminated completely if the proximal Sp1 site was similarly altered. 
Absolute dependence of the Msx-1 minimal promoter on Sp1 could be demonstrated by transient expression assays in the Sp1-deficient Drosophila cell line cotransfected with Msx-1-luciferase and an Sp1 expression vector, pPacSp1. Transgenic mouse embryos containing the −165/+106-bp Msx-1 promoter-LacZ DNA in their genomes abundantly expressed β-galactosidase in the maxillae and mandibles and in the cellular primordia involved in the formation of the meninges and the bones of the skull. Thus, the truncated murine Msx-1 promoter can target expression of a heterologous gene to the craniofacial tissues of transgenic embryos, tissues that are known for high levels of expression of the endogenous Msx-1 gene and that are severely defective in Msx-1 knock-out mice.

Homeobox (Hox) genes of vertebrates are closely related in sequence and genomic organization to the homeotic genes of Drosophila. Most vertebrate Hox genes are located in four unique clusters in the genome (e.g. HoxA, HoxB, HoxC, and HoxD complexes), each cluster consisting of about 10 genes; there is striking correlation between the linear order of Hox genes on the chromosome and their regional expression in the developing embryo (1, 2). In contrast, the members of the Msx class of Hox genes, which also share remarkable homology to the msh gene of Drosophila, are found physically unlinked in the vertebrate genome (3, 4). Although Hox genes encode transcription factors, characterized by the presence of a highly conserved 60-amino acid-long helix-turn-helix DNA binding domain, the homeodomain, the downstream genetic targets of their regulation and the underlying molecular mechanisms of their action are only beginning to be unraveled (5-7).
In the developing embryo, Hox genes play a central role in positional specification, pattern formation, and organogenesis; it is thought that inductive interactions among the various cell layers, mediated through the action of intercellular ligands with their receptors, and a cascade of signaling events regulate the temporal and spatial expression of Hox genes (4, 8-15). Inappropriate ectopic expression of Hox genes or their elimination by genetic "knock-out" leads to severe developmental anomalies (16-18).
Hox genes Msx-1 and Msx-2, the best studied members of the Msx family, have been shown to be expressed most conspicuously in the areas of epithelial-mesenchymal interactions (4). High levels of Msx-1 gene expression observed in the developing limb bud (19-26), regenerating limbs (27) or fins (28), developing eyes (29, 30), or molar teeth (31, 32) imply that Msx-1 plays a critical role during organogenesis. Defective expression of Msx-1 in the limb bud mesenchyme of the chicken mutants limbless and talpid has been reported; apparently the embryos of limbless mutants failed to assemble an active apical ectodermal ridge, and the underlying mesoderm expressed little or no Msx-1 transcripts (33, 34). Implantation of apical ectodermal ridge from a wild type embryo above the limbless mesoderm restored Msx-1 gene expression (33). Therefore, it appears that the cells of the apical ectodermal ridge, either through cell-cell contact or through diffusible factors, regulate Msx-1 gene transcription (23, 24, 29-36).
Concomitant alterations of Msx-1 gene expression and mirror image duplications of digits in response to 9-cis-retinoic acid (37) or fibroblast growth factor-2 or -4 (38-42) suggest that these phenomena may be causally related to each other, and therefore, the molecular mechanisms of Msx-1 gene regulation warrant further investigation. Earlier we described the structural organization of the coding and noncoding sequences of the Msx-1 gene and reported data that suggested that Msx-1 gene expression may be subject to autoregulation (43). We carried out a detailed functional analysis of ~5 kb¹ of 5′-flanking genomic DNA of Msx-1 with an aim to elucidate the putative cis-acting elements which mediate Msx-1 gene transcription in NIH3T3 and C2C12 cells. We report that a −165/+106-bp minimal Msx-1 promoter, containing sequence motifs capable of interacting with helix-loop-helix proteins (proximal E box) and a ubiquitous transcriptional modulator, Sp1 (proximal Sp1), is sufficiently active in driving the expression of luciferase in cells in culture. Furthermore, our analysis of the bacterial LacZ expression driven by the minimal Msx-1 promoter in transgenic mice suggests that the minimal Msx-1 promoter is exquisitely activated in the structures derived from the interactions between epithelial and mesenchymal cell layers during craniofacial morphogenesis.

EXPERIMENTAL PROCEDURES
Cell Culture-NIH3T3 cells (ATCC, CRL1658) and C2C12 cells (ATCC, CRL1772) were purchased from the American Type Culture Collection, Bethesda, MD; cells were cultured in Dulbecco's minimal essential medium (DMEM) supplemented with 10% fetal bovine serum in a humidified 37°C incubator with 5% CO2. C2C12 cells are capable of differentiation into multinucleated myotubes when cultivated in DMEM with 0.2% fetal bovine serum. Drosophila Schneider line 2 (SL2) cells, provided by Dr. Carl Wu, National Institutes of Health, Bethesda, MD, were grown in Schneider medium (Life Technologies, Inc.) supplemented with 10% heat-inactivated fetal bovine serum, penicillin, streptomycin, and fungizone at 25°C in an incubator without CO2 (44).
Plasmid Vectors-The Msx-1 promoter-luciferase plasmids for the transfection experiments were constructed by cloning DNA fragments from an Msx-1 genomic clone (43) into the pGL2-Basic Vector (Promega). A 1.4-kb EcoRI-BamHI Msx-1 genomic DNA fragment was cloned into pBluescript-II SK+ (pBEB) and was used as the source of DNA for all other promoter-luciferase or promoter-LacZ constructs. DNA fragments, prepared either by digestion with restriction enzymes or by polymerase chain reaction (PCR) amplification with oligonucleotides designed according to the sequence of the genomic DNA and containing desirable restriction sites, were cloned into pGL2-Basic. Thus, the −1282/+106-bp promoter was constructed by inserting a HincII-BamHI fragment (the HincII site was derived from the polylinker of pBluescript, and the BamHI site came from Msx-1 genomic DNA), encompassing 1282 bp upstream and 106 bp downstream of the transcription start site (43), into the SmaI-BglII sites of the pGL2-Basic Vector. PCR-amplified promoter fragments −1168/+106, −1042/+106, −886/+106, −811/+106, −726/+106, −588/+106, −509/+106, −268/+106, or −165/+106 with SstI-BglII termini were cloned into the SstI and BglII sites of pGL2-Basic. The −127/+106-bp promoter was generated by digesting pBEB with KpnI and BamHI and by cloning the DNA fragment into the homologous restriction sites of pGL2-Basic. The promoter fragments −91/+106, −52/+106, −32/+106, +10/+106, and +33/+106 were prepared by PCR and cloned into the KpnI-BglII sites of pGL2-Basic. Fragments with 5′ or 3′ site deletions, −886/−33, −886/−166, −268/−33, −268/−166, and −165/−33, were created by a PCR-based strategy using oligonucleotides with SstI-BamHI ends. The nucleotide sequences of all PCR-amplified DNA fragments inserted in reporter plasmids were verified by the dideoxynucleotide method of DNA sequencing (45).
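The construct names above encode fragment boundaries relative to the transcription start site, so insert sizes can be checked arithmetically. The following is a minimal sketch, assuming the common convention that the start site is +1 and that there is no position 0; the function name `fragment_length` is illustrative and not part of the original methods.

```python
def fragment_length(start: int, end: int) -> int:
    """Return the length in bp of a fragment spanning start..end (inclusive).

    Assumption: coordinates are relative to a transcription start site at +1
    and there is no position 0 (-1 is immediately followed by +1).
    """
    if start == 0 or end == 0:
        raise ValueError("position 0 does not exist in this numbering")
    if start > end:
        raise ValueError("start must not lie downstream of end")
    length = end - start + 1
    if start < 0 < end:
        length -= 1  # skip the non-existent position 0
    return length


# A few of the constructs named in the text.
constructs = {
    "-1282/+106": (-1282, 106),
    "-165/+106": (-165, 106),
    "-165/-33": (-165, -33),
    "+10/+106": (10, 106),
}
lengths = {name: fragment_length(*coords) for name, coords in constructs.items()}
```

Under this convention the −1282/+106 fragment is 1388 bp (1282 bp upstream plus 106 bp downstream), which agrees with the 1.4-kb genomic fragment from which it was derived.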
Transient Transfections-NIH3T3 and C2C12 cells were seeded (10⁵ cells per 35-mm diameter well) in 6-well tissue culture dishes 1 day prior to transfection. Both cell lines were transfected using LipofectAMINE™ (Life Technologies, Inc.) according to the manufacturer's recommendations. Four µl of LipofectAMINE™ and 1 µg of plasmid DNA were diluted individually in 100-µl aliquots of OptiMEM™ I Reduced-Serum Medium (Life Technologies, Inc.). Cells were incubated with DNA-lipid complexes for 5 h and then fed DMEM; 24-30 h after transfection, cells were rinsed and harvested in phosphate-buffered saline and lysed in 150 µl of 1× Cell Culture Lysis Reagent (Promega). Aliquots of cell extracts were mixed with 100 µl of 470 mM luciferin, and light intensity was measured in a Turner Designs Luminometer Model 20. Cells transfected with the pGL2-Basic Vector, which lacks a eukaryotic promoter and enhancer, and cells transfected with the pGL2-Control Vector (Promega), which contains the SV40 promoter and enhancer, served as negative and positive controls, respectively (46). Cells were transfected with a given construct in triplicate, and expression of the cotransfected pSV-β-galactosidase plasmid (Promega) was used to correct for variable transfection efficiencies. The protein content of cell extracts was quantitated by the Bradford method (Bio-Rad Protein Assay System). The luciferase activities were expressed as arbitrary units of light intensity per µg of protein.
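The normalization described above can be sketched as follows. This is a minimal illustration, assuming that the correction for transfection efficiency is a simple division by the β-galactosidase activity measured in the same extract (the text does not spell out the exact formula); all function names and numerical values are hypothetical.

```python
from statistics import mean


def normalized_luciferase(light_units: float, protein_ug: float, beta_gal: float) -> float:
    """Light units per µg of protein, divided by β-galactosidase activity.

    Assumption: transfection efficiency is corrected by a simple ratio to the
    β-galactosidase activity of the same extract.
    """
    if protein_ug <= 0 or beta_gal <= 0:
        raise ValueError("protein and beta-gal readings must be positive")
    return light_units / protein_ug / beta_gal


def construct_activity(wells):
    """Average the normalized activity over replicate wells (e.g. a triplicate)."""
    return mean(normalized_luciferase(*well) for well in wells)


# Hypothetical triplicate: (light units, µg protein, β-gal activity) per well.
wells = [(1200.0, 10.0, 1.0), (1500.0, 10.0, 1.25), (900.0, 10.0, 0.75)]
activity = construct_activity(wells)
```

In this made-up triplicate the raw light units differ between wells, but the normalized values coincide, which is the behavior the β-galactosidase correction is meant to produce when wells differ only in transfection efficiency.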
To examine transactivation of the Msx-1 promoter by Sp1, SL-2 cells were cotransfected with Msx-1 promoter-luciferase and Sp1 expression constructs. Twenty-four h before transfection, Drosophila SL-2 cells were transferred to 35-mm well plates at a density of 1.0 × 10⁵ cells per well. Cells were transfected with 0.5 µg of Msx-1 promoter-luciferase plasmid mixed with 0.05 µg of the Sp1 expression vector, pPacSp1 (47, 48), using 2 µl of Cellfectin (Life Technologies, Inc.). Parallel aliquots of SL-2 cells were also cotransfected with Msx-1 luciferase constructs mixed with 0.05 µg of pPacSp1 in antisense orientation and were used as negative controls. Luciferase assays were performed 48 h after transfection as outlined above.
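Results of such cotransfections are conveniently summarized as a fold activation over the antisense control and as the percent reduction caused by a mutated Sp1 site. A minimal sketch follows; the function names and luciferase readings are hypothetical, chosen only to illustrate the arithmetic.

```python
def fold_activation(with_sp1: float, control: float) -> float:
    """Ratio of reporter activity with sense pPacSp1 to the control activity."""
    if control <= 0:
        raise ValueError("control activity must be positive")
    return with_sp1 / control


def percent_reduction(wild_type: float, mutant: float) -> float:
    """Percent loss of activity of a mutant reporter relative to wild type."""
    if wild_type <= 0:
        raise ValueError("wild-type activity must be positive")
    return 100.0 * (1.0 - mutant / wild_type)


# Hypothetical luciferase readings (arbitrary units per µg of protein).
antisense_control = 2.0   # wild-type reporter + antisense pPacSp1
sense_wild_type = 250.0   # wild-type reporter + sense pPacSp1
sense_mutant = 20.0       # Sp1-site mutant reporter + sense pPacSp1

fold = fold_activation(sense_wild_type, antisense_control)
loss = percent_reduction(sense_wild_type, sense_mutant)
```

With these illustrative numbers the wild-type reporter is activated 125-fold and the mutation removes about 92% of the Sp1-dependent signal.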
Preparation of Nuclear Extracts-Nuclear extracts from NIH3T3 or C2C12 cells were prepared according to Dignam et al. (49) and as described in detail previously (50). Cells were rinsed and scraped in phosphate-buffered saline, resuspended in hypotonic buffer (10 mM HEPES, 1.5 mM MgCl2, 10 mM KCl, 0.2 mM phenylmethylsulfonyl fluoride, 0.5 mM dithiothreitol), and homogenized. The nuclei were collected by centrifugation and resuspended in low-salt buffer (20 mM HEPES, 25% glycerol, 1.5 mM MgCl2, 20 mM KCl, 0.2 mM EDTA, 0.2 mM phenylmethylsulfonyl fluoride, 0.5 mM dithiothreitol). The salt concentration of the nuclear suspension was adjusted to 0.3 M KCl, which released soluble nuclear proteins. Nuclei were then pelleted by centrifugation, and the protein extracts were dialyzed against a buffer containing 100 mM KCl. The precipitated protein was removed by centrifugation, and the supernatants were stored in aliquots at −80°C.
Electrophoretic Mobility Shift Assays (EMSA)-Complementary single-stranded oligonucleotides (Table I) with 3-4-nucleotide-long 5′ overhangs were annealed and radiolabeled by end-filling with the Klenow fragment of Escherichia coli DNA polymerase, using [α-32P]dCTP. Radiolabeled DNA probes (10,000 cpm/µl) were incubated with nuclear extracts in the presence or absence of competitor oligonucleotides. To each tube, 17 µl of the premixed incubation buffer (Stratagene) and 1 µl of radiolabeled probe were added, and the mixture was incubated at room temperature for 20-30 min. For supershift assays, nuclear extracts were preincubated with polyclonal antibodies against MyoD, myogenin, c-Myc, Max, USF-1, USF-2, or Sp1 for 2 h at 4°C prior to initiation of the binding reaction. The contents of the binding reactions were electrophoresed at 4°C on a 4% nondenaturing polyacrylamide gel in 1× TBE (135 mM Tris, 45 mM boric acid, and 2.5 mM Na2EDTA, pH 8.9) and fluorographed; we have described EMSA methods in detail previously (50, 51). All antibodies to transcriptional factors and the oligonucleotides containing the consensus recognition motifs used in EMSA were purchased from Santa Cruz Biotechnology, Inc.
DNase I Footprinting-The protocols for DNase I footprinting were used as described previously with minor modifications (50, 51). DNA fragments encompassing −91/+106 bp and −268/+106 bp, cloned into the SstI-BglII sites of pGL2-Basic, were linearized with XmaI and end-labeled with [α-32P]dCTP and E. coli DNA polymerase. After labeling, DNA polymerase was inactivated by incubation at 75°C for 15 min. The 3′ end of the insert was cut with HindIII and purified. Nuclear extracts or recombinant human Sp1 (Promega) were incubated with radiolabeled DNA (10,000 cpm). For competition, radiolabeled probes were mixed with 50 ng of Sp1 consensus oligonucleotide (Stratagene) before initiating binding; the binding reactions were allowed to proceed for 15 min on ice and incubated for an additional 2 min at room temperature. After addition of DNase I, the mixture was incubated for exactly 2 min and was combined with 100 µl of a stop solution. The digested probe was extracted with phenol/chloroform, precipitated with ethanol, and electrophoresed in 8% polyacrylamide containing 7 M urea, alongside a nucleotide sequence ladder.
Generation of Transgenic Mice and Analysis of LacZ Expression during Embryogenesis-The incrementally truncated murine Msx-1 promoter fragments were cloned in front of the LacZ gene in the plasmid pLacF (52). The detailed experimental strategies used to generate transgenic mice which contain the full-length (5.0 kb) or serially truncated variants of Msx-1 promoter-LacZ vectors in their genome will be described elsewhere. The minimal −165/+106-bp Msx-1 promoter DNA fragment containing XbaI recognition termini was ligated in the XbaI site of pLacF (52). The BglII-linearized plasmid DNA was microinjected into fertilized eggs obtained from FVB/NHsd females, and embryos were implanted in pseudopregnant mice; the transgenic founders were identified by analyzing their tail DNA by Southern hybridization and PCR methods as detailed previously (53). Four independent lines of transgenic founders containing the −165/+106-bp Msx-1-LacZ DNA were studied extensively. Founders were back-crossed, and timed-mated FVB females were sacrificed by cervical dislocation. The embryos were partially fixed in 2% paraformaldehyde at 4°C and stained with X-gal at 37°C overnight (53). The stained embryos were submerged in 70% ethanol, illuminated uniformly by scattered light, and photographed under a dissecting microscope.
Wholemount in Situ Hybridization of Embryos with Msx-1-specific Antisense RNA Probes-To assess expression of the endogenous Msx-1 gene, normal FVB/NHsd mouse embryos were obtained at different stages of development and processed for wholemount in situ hybridization by previously published protocols (54, 55). A 700-bp SstI-EcoRI DNA fragment containing the 5′ half of the Msx-1 cDNA was cloned in pGEM+ vector (Boehringer Mannheim). Antisense or sense RNAs were transcribed by T7 or Sp6 polymerases, respectively, according to the directions provided by the manufacturer. RNA was synthesized to incorporate digoxigenin-UTP and purified. Fixed embryos were subjected to wholemount in situ hybridization with digoxigenin-labeled RNAs according to the published protocols (54, 55). Stained embryos were clarified by incubation in a glycerol/phosphate-buffered saline mixture (50:50) overnight and photographed using Kodak Elite II film (100 ASA).

Selection of Cells to Study Msx-1 Promoter Function by Transient or Stable Expression Assays-Since Msx-1 shows a complex pattern of expression elicited at many sites of epithelial and mesenchymal cell interactions during embryogenesis (4), ideally, Msx-1 promoter activation should be studied in vitro under conditions which mimic the cell-cell interactions leading to organogenesis. While a number of in vitro organ cultures have been developed for such studies (e.g. cultured limb buds), it is not possible to efficiently transfect Msx-1 promoter-reporter constructs into all cells in such an organ culture system. Therefore, we surveyed a number of established cell lines for high levels of Msx-1 expression, since such cells are likely to contain the necessary trans-acting factors to activate the Msx-1 promoter. As shown in Fig. 1, the steady state levels of Msx-1 transcripts in the NIH3T3 fibroblasts, C2C12 myoblasts, or a rhabdomyosarcoma cell line, Rh28, were similar to the levels seen in E10 stage mouse embryos (a time of maximum Msx-1 gene expression; Ref. 4). The Rh28 and C2C12 cells undergo myogenic differentiation and form myotubes when grown in media lacking serum. The levels of Msx-1 transcripts declined precipitously in Rh28 and C2C12 cells under conditions of differentiation (see below). Curiously, the rhabdomyosarcoma cells undergo spontaneous differentiation at very high rates after 30 passages in culture.² Since the phenotype of Rh28 was somewhat unpredictable, we chose to examine Msx-1 promoter-reporter constructs in C2C12 and NIH3T3 cells. Both NIH3T3 and C2C12 cells are readily transfectable by the LipofectAMINE™ method used in our studies. Although we tested a subset of the Msx-1 promoter-luc or Msx-1 promoter-LacZ constructs in stably transfected clones of C2C12 cells, we will restrict our discussion to transient expression assays, since no major discrepancy was noted in the promoter activation studied with the two types of assays.

Serial Truncations of the 5′-Flanking Sequences Reveal Positive and Negative cis-Acting Elements and Delimit a Minimal Msx-1 Promoter-Our previously published computer-based homology analysis of the Msx-1 promoter revealed several putative cis-acting elements (43). To experimentally dissect the functional Msx-1 promoter, serially truncated 5′-flanking Msx-1 genomic DNA fragments were ligated in front of the luciferase reporter in the pGL2-Basic Vector (Fig. 2); promoter activities were deduced from quantitation of luciferase assays in transiently transfected NIH3T3 and C2C12 cells. Both of these cell lines express moderate levels of Msx-1 and are therefore expected to contain adequate levels of trans-acting factors to support the activation of Msx-1 promoter-luciferase constructs. Although we sequentially analyzed 5 kb of 5′-flanking Msx-1 DNA for promoter activity, the sequences between −5 kb and −1282 bp produced no detectable enhancement over the −1282-bp promoter in our assays.³ Therefore, the expression of the −1282/+106-bp promoter (full-length promoter) was arbitrarily fixed as 100%, and the activities of all other constructs were compared with the full-length promoter. Compared with the full-length promoter, some deletions in the 5′-flanking DNA of Msx-1 caused a modest to severe decline in luciferase activity (e.g. −1168/+106-, −509/+106-, and −127/+106-bp constructs), whereas the removal of other DNA sequences (e.g. −886/+106-, −811/+106-, −268/+106-, and −165/+106-bp constructs) led to enhanced expression of the reporter gene (Fig. 3). It should be stressed, however, that although the apparent positive or negative modulatory consequences of particular deletions were rather modest, the overall quantitative patterns were strikingly similar and reproducible in both NIH3T3 and C2C12 cells (Fig. 3). Unlike NIH3T3 cells, C2C12 cells are capable of undergoing myogenic differentiation when they are cultivated in serum-free DMEM for many hours. Thus, it is conceivable that there are additional factors in C2C12 cells which may preferentially interact with some cis-regulatory sequences of the Msx-1 promoter. The activity of the −165/+106-bp promoter compares favorably with the longer constructs, and further shortening of the −165/+106-bp promoter (e.g. +10/+106- or +33/+106-bp promoter-luciferase constructs) abolished its activity completely in both cell lines (Fig. 3). Therefore, we have tentatively named the −165/+106-bp promoter a "minimal" Msx-1 promoter.

² C. Guron and R. Raghow, unpublished observations.
³ T. Takahashi and R. Raghow, unpublished data.
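Because every truncated construct is reported relative to the full-length promoter set at 100%, the comparison reduces to a ratio plus a standard error across independent experiments. The sketch below illustrates that calculation; the function name and all activity values are hypothetical and do not reproduce the data of Fig. 3.

```python
from math import sqrt
from statistics import mean, stdev


def percent_of_reference(values, reference_values):
    """Express activities as percent of the mean reference activity.

    Returns (mean percent, standard error of the mean). The reference is
    meant to be the full-length -1282/+106 construct, fixed at 100%.
    """
    reference = mean(reference_values)
    if reference <= 0:
        raise ValueError("reference activity must be positive")
    percents = [100.0 * value / reference for value in values]
    sem = stdev(percents) / sqrt(len(percents)) if len(percents) > 1 else 0.0
    return mean(percents), sem


# Hypothetical normalized activities from three independent experiments.
full_length = [100.0, 120.0, 80.0]
truncated = [150.0, 200.0, 250.0]

relative_mean, relative_sem = percent_of_reference(truncated, full_length)
```

Here the hypothetical truncated construct reaches 200% of the full-length activity; a value above 100% would be read as removal of a negative element, a value below 100% as removal of a positive one.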
The Proximal E Box and Sp1 Motifs Are Essential for Optimal Activity of the Minimal Msx-1 Promoter-Careful and detailed analysis of the expression of sequentially deleted Msx-1 promoter-luciferase constructs in transient expression assays convinced us to examine the minimal promoter more rigorously to assess the contribution of the putative trans-acting factor(s) which may interact within the minimal promoter (−165/+106 bp). The location in the truncated promoter of one of the three consensus E box elements (the proximal E box) and one of the three consensus GC boxes known to bind Sp1 (the proximal Sp1), predicted theoretically, prompted us to experimentally test the function of these two DNA elements individually. Msx-1 promoter-luciferase constructs, from which either the proximal E box or Sp1 site, or both, had been deleted, were tested for promoter activities. We observed that deleting either one of these elements, regardless of the presence or absence of additional 5′-flanking DNA, caused nearly complete loss of Msx-1 promoter activity (Fig. 4). Another, somewhat intriguing, observation came out of this analysis: while the terminal deletions in the Msx-1 upstream sequences had less severe effects on the promoter, the internal deletions almost completely abolished promoter function in both cell lines. We extended these data by introducing 4-bp block mutations in the proximal E box or Sp1 motifs individually and tested the effects of these perturbations in the context of the −886/+106-bp Msx-1 promoter. As depicted in Fig. 5, mutations in either the proximal E box or the proximal GC box singly caused severe reduction in the expression of luciferase (1.2% and 4.1% activities remaining, respectively). Curiously, however, in contrast to what occurred with the block mutations at single sequence motifs, when both mutations were combined in the same construct a substantial level of luciferase activity was restored (Fig. 5). 
At present we can only speculate as to the cause of this phenomenon; conceivably, binding site(s) for an additional factor(s) were created as a result of combining both mutated E box and GC box sites on a contiguous fragment of DNA.
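The theoretical prediction of E box and GC box motifs mentioned above amounts to scanning the promoter sequence for short consensus patterns. A minimal sketch follows, assuming the generic E box consensus CANNTG and the GC box core GGGCGG; the test sequence is invented for illustration and is not the Msx-1 promoter sequence, and `find_motif` is a hypothetical helper.

```python
import re

# Map each motif letter to a regular-expression fragment (N = any base).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}


def find_motif(sequence: str, motif: str):
    """Return 0-based start positions of every (possibly overlapping) match."""
    pattern = "".join(IUPAC[base] for base in motif.upper())
    # A zero-width lookahead lets overlapping occurrences be reported.
    return [m.start() for m in re.finditer(f"(?=({pattern}))", sequence.upper())]


# Invented 20-nt sequence containing one E box and one GC box.
sequence = "TTCACGTGAAGGGGCGGGGC"

e_boxes = find_motif(sequence, "CANNTG")   # generic E box consensus
gc_boxes = find_motif(sequence, "GGGCGG")  # GC box core recognized by Sp1
```

A consensus match of this kind only nominates candidate sites; whether a factor actually binds has to be established experimentally, as the footprinting and EMSA analyses in the following sections do.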
Nuclear Proteins from NIH3T3 and C2C12 Cells Bind to the Proximal E Box of the Msx-1 Promoter-Our transient expression data indicated that the proximal E box and Sp1 sites were critical for Msx-1 promoter activity. To explore directly whether putative transcription factors bound to the sequences predicted by the deletion assays, we performed DNase I footprinting and EMSA experiments with radiolabeled DNA fragments encompassing the potential cis-regulatory sites incubated with nuclear extracts from NIH3T3 or C2C12 cells. When radiolabeled −91 to +106- and −268 to +106-bp DNA fragments were subjected to footprinting analyses, one clearly discernible area of protection from DNase I digestion, Fp-1, could be consistently seen (Fig. 6); the identity of Fp-1, encompassing nucleotides numbered −26 to −13, was established to be an Sp1-like motif by more extensive analyses. The footprinted area marked Fp-2 was obtained less consistently; Fp-2 encompassed the sequence motif identified as the proximal E box by rigorous EMSA and site-specific mutagenesis experiments. Both putative motifs were recognized by a nuclear protein(s) from C2C12 cells (Fig. 6) and by nuclear extracts prepared from NIH3T3 cells.³ We extended the DNase footprinting data by testing the specificity of binding of the putative factors with two types of EMSA experiments. First, the −165/−128-bp region was subdivided into three parts, and the individual oligonucleotides were used to competitively inhibit binding of the putative E box proteins to the −165/−128 oligonucleotide; while the −165/−147 oligonucleotide inhibited binding very effectively (Fig. 7A, lane 3), the two truncated oligonucleotides, −156/−138 and −146/−128, did not displace the protein-bound DNA (Fig. 7A, lanes 4 and 5, respectively). Second, direct EMSA narrowed the DNA binding region to the −165/−147 oligonucleotide (Fig. 7B, lane 1), since neither the −156/−138 (Fig. 7B, lane 2) nor the −146/−128 (Fig. 7B, lane 3) oligonucleotide bound to trans-acting factor(s) from C2C12 extracts. To extend these observations, more specific mutations were created within the putative E box motif or immediately contiguous sequences, and the mutant double-stranded oligonucleotides were used as competitors to displace binding of the radioactive −165/−147 oligonucleotide. As shown in Fig. 7C, while M−165/−162 efficiently competed out binding of proteins to the proximal E box (lane 3), neither the M−161/−158 nor the M−157/−154 oligonucleotide did so (Fig. 7C, lanes 4 and 5, respectively). Based on these data we conclude that the region of DNA encompassing nucleotides from −161 to −154 of the Msx-1 promoter is essential for binding to the putative E box-recognizing factor(s). The identity of the proximal E box-binding factor was further investigated by competition with oligonucleotides containing previously well-defined E box motifs; additionally, we allowed DNA-protein complexes to form in the presence of antibodies to the basic helix-loop-helix proteins known to recognize the core E box motif in the context of additional contiguous sequences. As shown in Fig. 8, a consensus upstream stimulating factor-1 (USF-1) oligonucleotide competed with the Msx-1 oligonucleotide (Fig. 8A, lane 4); a 50-fold excess of the consensus MEF-1 oligonucleotide was unable to dislodge this complex under identical conditions (Fig. 8A, lane 3). For unknown reasons, the apparent reduction of DNA-protein complex formation seen with the MEF-1 oligonucleotide did not occur consistently.⁴ Furthermore, Msx-1 proximal E box-protein complexes could be readily supershifted with antibodies to USF-1 (Fig. 8B, lane 3); similar incubation of these complexes with antibodies to USF-2 resulted in a weakly supershifted band (Fig. 8B, lane 4).⁴ We should also mention here that polyclonal antibodies to a number of other E box-binding proteins, which include MyoD, Myf5, Myf6, and myogenin, failed to interact with the proximal E box DNA-protein complexes.⁵

As shown in an earlier experiment (Figs. 2 and 3), the deletion of sequences encompassing the distal E box element had a positive effect on the promoter activity in both NIH3T3 and C2C12 cells. When promoter constructs containing point mutations in the distal E box were quantitatively assessed for their activity in transiently transfected C2C12 cells, the results corroborated the data obtained with terminal deletions; the presence in the Msx-1 promoter of a mutated distal E box element (disabled to form DNA-protein complexes) substantially boosted the Msx-1 promoter activity in a reproducible manner (Fig. 9). Thus, we were curious to compare the putative nuclear factor(s) binding to the distal E box (−1164 to −1159) with the factor(s) that recognized the proximal E box (−159 to −154). We discovered that the distal E box oligonucleotide, which formed three prominent protein-DNA complexes as revealed by EMSA (Fig. 10, lane 2), competed effectively with itself (lane 3) but failed to dislodge trans-acting factor(s) bound to the proximal E box motif (Fig. 10, lane 11). The proximal E box oligonucleotide bound to protein formed one shifted band (Fig. 10, lane 9) and was an excellent self-competitor (Fig. 10, lane 10) but failed to inhibit DNA-protein complex formation with the distal E box oligonucleotide (Fig. 10, lane 5). As expected, a mutated distal E box oligonucleotide did not dislodge DNA-protein complexes formed with either the distal (lane 4) or the proximal (Fig. 10, lane 12) E box oligonucleotides. The lack of cross-competition displayed in these experiments suggests that distinct protein factor(s) bind to the distal and proximal E box motifs. The

FIG. 9. Expression of luciferase driven by the full-length (−1282/+106), wild type distal E box-containing truncated (−1168/+106), or a mutated E box-containing truncated (−1168M) promoter in transiently transfected C2C12 cells. The sequence of the wild type distal E box and its mutated counterpart is shown in A. The percent luciferase activities, relative to the full-length promoter, indicate that mutation of the distal E box leads to enhancement of the truncated promoter as shown in B. The bars represent S.E. of means of five independent experiments and have been corrected for the experimental variations in the transfection efficiency.

scription start site was sufficiently active to warrant a more thorough characterization of the "core promoter." Therefore, the core promoter was analyzed in detail by EMSA for binding of the putative transcription factor(s). Analyses using EMSA and an antibody-mediated electrophoretic mobility supershift revealed that both C2C12 and NIH3T3 nuclear extracts contained readily detectable levels of Sp1 proteins which interacted with the Msx-1 oligonucleotide encompassing −32/+2 (Fig. 11A, lanes 2 and 4, respectively). Binding could be competed with cold, wild type oligonucleotide (Fig. 11A, lanes 3 and 5, respectively) but was not competed by the mutant oligonucleotides M−22/−19, M−18/−15, and M−14/−11 (Table I; data not shown). We extended the competitive binding assays by incubating radiolabeled −32/+2 oligonucleotides with either C2C12 nuclear extracts or purified Sp1 and carried out binding reactions in the presence of a polyclonal antibody specific for human Sp1 and known to cross-react with murine, rat, and human Sp1. The radiolabeled bands, representing Sp1-bound DNA complexes, could be readily supershifted, regardless of whether the source of Sp1 was nuclear extracts from C2C12 cells (Fig. 11B, lanes 2 and 3), nuclear extracts from NIH3T3 cells (Fig. 11B, lanes 4 and 5), or purified Sp1 (Fig. 11B, lanes 6 and 7). It appears that the proximal Sp1 oligonucleotide formed multiple complexes when incubated with nuclear extracts; this is in contrast to what occurred with purified Sp1. 
We believe that this result is not unexpected since Sp1 in the cells may be present in a variety of posttranslationally modified states or may be bound to other factors. Based on these data we conclude that the core promoter binds to a trans-acting factor(s) which is immunologically related to the authentic Sp1.

Cells Cultured in Serum-deprived Media Contain Diminished Levels/Activities of trans-Acting Factors
FIG. 11. EMSA shows that nuclear extracts from C2C12 and NIH3T3 cells bind to the proximal Sp1 motif of the Msx-1 promoter. A, the proximal Sp1 oligonucleotide (−32/+2) was end-labeled and incubated with C2C12 or NIH3T3 nuclear extracts in the absence or presence of unlabeled double-stranded competitor. Lane 1, probe alone; lanes 2 and 3, C2C12 nuclear extract; lanes 4 and 5, NIH3T3 nuclear extract; lanes 3 and 5, 40-fold molar excess of unlabeled −32/+2 oligonucleotide. The arrows indicate specific DNA-protein complexes, and SS denotes the supershifted band. B, electrophoretic mobility supershift experiments using the labeled proximal Sp1 site. Labeled oligonucleotides containing the proximal Sp1 motif were incubated with C2C12 or NIH3T3 nuclear extracts in the presence of anti-Sp1 polyclonal antibody. Lane 1, radiolabeled −32/+2 oligonucleotide alone; lanes 2 and 3, radiolabeled oligonucleotide incubated with 11 µg of C2C12 nuclear extract; lanes 4 and 5, 11 µg of NIH3T3 nuclear extract; lanes 6 and 7, 5 footprint units of purified human Sp1 protein; lanes 3, 5, and 7, 2 µg of anti-Sp1 antibody.
repressed transcription of MyoD (an essential prerequisite for myogenesis) by binding directly to the MyoD enhancer (74). Because C2C12 cells differentiate into myotubes when grown in serum-free media, we were curious as to the status of Msx-1 gene expression in C2C12 cells undergoing myogenesis. We found that the steady state levels of Msx-1 transcripts declined substantially in C2C12 cells grown for 24 h in serum-deprived medium (Fig. 13A), and less than 5% of the transcripts remained in cells grown under serum-deprived conditions for 72 h. We also saw a concomitant increase in the levels of a smaller molecular weight species of RNA which was detected with Msx-1 cDNA probes; we think that this represents a breakdown product of Msx-1 transcripts in the cells undergoing myogenesis. Unlike C2C12 cells, which undergo myogenesis in vitro, NIH3T3 cells showed no noticeable change in morphology when grown in low serum. Nevertheless, Msx-1 transcripts were similarly reduced in both NIH3T3 and C2C12 cells; the transient expression assays also revealed that the −165/+106-bp Msx-1 promoter-luciferase activity was reduced by more than 10-fold in both cell lines grown under serum-deprived conditions after transfection. 3 Therefore, we examined the levels/activities of the trans-acting factors capable of binding to the minimal Msx-1 promoter in the nuclear extracts prepared from NIH3T3 and C2C12 cells grown in serum-free media. The EMSA revealed a drastic reduction in the amount/activity of the trans-acting factors which associated with the minimal promoter (Fig. 13B). Thus, regardless of the morphological transformation of the two cell lines grown under low-serum conditions, the factor(s) which activate the minimal Msx-1 promoter decline in both cell lines, concomitant with a decline in the steady state levels of Msx-1 transcripts.

Freshly seeded SL-2 cells were co-transfected with 0.5 µg of one of the denoted reporter plasmids with or without 50 ng of pPacSp1, in either sense (S) or antisense (AS) orientation, as described under "Experimental Procedures." Reporter constructs are pGL2-Basic (B), −32/+106 core Msx-1-luciferase with wild type Sp1 motif (−32W), or −32/+106 Msx-1-luciferase construct containing mutated Sp1 site (−32M). Luciferase activity obtained with cotransfection of the reporter plasmid pGL2-Basic with the sense or antisense Sp1 expression vector is shown for comparison. Cotransfection of SL-2 cells with −32W and pPacSp1(S) activated the Msx-1 promoter more than 100-fold; mutation in the Sp1 motif reduced this transactivation substantially (greater than 90%). Cotransfection of reporter constructs with pPacSp1(AS) was inconsequential under all conditions.

The steady-state levels of Msx-1 declined substantially in cells grown in the serum-deprived conditions; there was no significant change in the level of the glyceraldehyde-3-phosphate dehydrogenase RNA. A 5′-end-labeled Sst-BglII fragment of Msx-1 DNA (−165/+106 bp) was incubated with nuclear protein extracts prepared from C2C12 or NIH3T3 cells grown in serum-containing (+) or serum-deprived (−) conditions. Radiolabeled probe (P) without nuclear extracts was electrophoresed. The putative DNA-protein complexes are denoted with arrowheads.
The Minimal −165/+106-bp Msx-1 Promoter Drives Heterologous β-Galactosidase Gene Expression in the Craniofacial Tissues of Transgenic Mice
Based on the transgenic analysis of 13 kb of DNA around the Msx-1 locus, MacKenzie et al. (73) surmised that the pattern of Msx-1 gene expression during embryogenesis was determined by a complex set of cis-acting elements, including two tissue-specific enhancers located ~2 kb apart from each other. Since none of the putative promoter DNAs could drive LacZ gene expression in absolute concordance with the endogenous Msx-1, they concluded that disparate sequence motifs, which act both independently and in concert, determine the complex pattern of Msx-1 gene expression in the embryo (73). With the long term goal of elucidating the mechanisms regulating Msx-1 gene activation in the developing embryo, we have begun a systematic analysis of a number of Msx-1 promoter-LacZ constructs in transgenic mice, concomitant with promoter dissection studies using transient expression in cells in culture. We have analyzed a number of transgenic mouse lines harboring incrementally truncated Msx-1 promoters ligated to LacZ DNA in their genomes. Our data revealed that the ~5-kb Msx-1-LacZ embryos expressed β-galactosidase at many sites reminiscent of expression of the endogenous gene; the serially truncated variants of the full-length promoter exhibited wide variations in their patterns of developmental stage-specific activation. 6 In light of our data showing that a truncated Msx-1 promoter was highly active in both C2C12 and NIH3T3 cells, we tested the activity of the minimal promoter in transgenic embryos. A comparison of the endogenous Msx-1 expression, as judged by whole-mount in situ hybridization of 11-12-day-old mouse embryos with Msx-1-specific antisense probes, with β-galactosidase gene expression driven by the −165/+106-bp promoter in embryos of the corresponding stage is illustrated in Fig. 14.
As has been reported previously (31,32,73), the endogenous Msx-1 gene is highly expressed in the dorsal neural tube, choroid plexi of the third and lateral ventricles, meninges, and skull bone precursors; significant expression is also seen in the developing nasal, mandibular, and maxillary prominences and in the limbs of the embryos (Fig. 14, A and B). In contrast, the −165/+106-bp minimal Msx-1 promoter dictates LacZ expression in a highly restricted manner in the craniofacial structures. In particular, the cellular primordia which are destined to generate the upper and lower jaws, teeth, nose, and bones of the skull are positive for LacZ gene expression driven by the truncated Msx-1 promoter (Fig. 14, C and D). The craniofacial pattern of expression of the transgene is remarkably similar to the craniofacial pattern displayed by the endogenous Msx-1 gene (Fig. 14, compare A and B with C and D). Out of the four lines of transgenic mice containing the −165/+106-bp Msx-1-LacZ construct, we have analyzed two in great detail; transgenic embryos from both of these lines show remarkable similarity of LacZ expression in the craniofacial primordia, as exemplified in Fig. 14. Our data strongly suggest that neither the site(s) of integration nor the copy number of the transgene in the genome significantly alters the specificity of the minimal Msx-1 promoter activation in the craniofacial tissues of the transgenic mice. Finally, it is significant that the expression of the transgene driven by the minimal promoter was conspicuously absent from the limb primordia and the dorsal neural tube, the two locations well known for high endogenous Msx-1 gene expression. 
6 Therefore, we conclude that the minimal Msx-1 promoter, consisting of an E box and a GC-rich motif, can target expression of a heterologous gene into specific craniofacial tissues; interestingly, the Msx-1-ablated mice also showed consistent and severe abnormalities in the same craniofacial locations which are preferentially targeted by the minimal Msx-1 promoter (72).

(panels A and B), and the transgenic embryos (E12.5) were stained with X-gal (panels B and C) as detailed under "Experimental Procedures." The endogenous Msx-1 gene is highly expressed in the choroid plexus of the 3rd (cp3) and lateral ventricles (cplv), lateral nasal prominences (lnp), maxillae (max), and mandibles (man); the additional sites of high Msx-1 expression include the midline of the dorsal neural tube (nt) and limb buds (lb). The craniofacial tissues of −165/+106-bp Msx-1-LacZ transgenic embryos (B and C) prominently stained with X-gal include the choroid plexus of the ventricles, maxillary, mandibular and nasal prominences, and eyes (arrows). The limb buds and the dorsal neural tubes of the transgenic embryos containing the minimal Msx-1 promoter-LacZ in their genome do not show detectable β-galactosidase expression (B and C).

DISCUSSION
With the objective of extending our previous analysis of the murine Msx-1 promoter (43), we carried out systematic deletion and mutagenesis studies on a 5-kb 5′-flanking DNA of the Msx-1 gene. The longest and the truncated variants of the putative promoter DNA fragments were used to drive the expression of reporter genes, luciferase, and bacterial LacZ. Using transiently transfected NIH3T3 and C2C12 cells, both of which express Msx-1 constitutively, we tested the functional consequences of targeted alterations in the potential cis-acting sequences of the Msx-1 gene on the expression of the reporter genes. Our initial studies showed that deletions between −5 kb and −1.4 kb did not significantly affect Msx-1 promoter activity in either cell line. Therefore, we analyzed the −1.4-kb promoter in much greater detail. Our studies revealed that although a number of positive and negative cis-regulatory regions could be readily demonstrated through site-specific modifications of the −1.4-kb promoter (considered the full-length Msx-1 promoter), most of these perturbations had modest effects on the activity of the Msx-1 promoter. In fact, a relatively short DNA fragment containing 165 bp upstream of the transcription start point (TSP) and 106 bp downstream of the TSP retained strong promoter function in both NIH3T3 and C2C12 cells. We tentatively termed this the minimal Msx-1 promoter. Guided by these observations, we examined the minimal promoter in detail for the cis-regulatory sequence motifs and the potential transcription factor(s) that bind to them. Transient expression assays with wild type or mutated Msx-1 promoter-luciferase were combined with DNase I footprinting, EMSA, and supershift analysis with specific antibodies to unravel the potential interactions between cis-acting motifs and their trans-acting factor(s) which modulate the activity of the minimal promoter. These data reveal that cis-regulatory motifs located at −159 to −154 (the proximal E box) and at −26 to −13 (the proximal Sp1 site) were critical for activation of the minimal Msx-1 promoter in the NIH3T3 and C2C12 cells.
E box motifs are known to bind to a diverse group of basic helix-loop-helix DNA-binding proteins; these include the myogenic transcription factors MyoD, Myf5, Myf6, and myogenin (56-59) and members of the Myc/Max gene family of transcription factors, which includes USF, TEF3, TEFB, Mxi1, Ap4, and FIP (60-69). Since a wide variety of transcription factor(s) bind to E box motifs, we explored the attributes of the two additional E boxes located upstream of the proximal E box. We observed that in contrast to the proximal E box, which contributed positively to the Msx-1 promoter activity, the distal E box (−1164 to −1159) acted as a negative element. Conspicuously, the middle E box (−882 to −877) neither affected promoter activity nor appeared to bind to transcription factor(s) by EMSA (data not shown). Two other observations with regard to the proximal and distal E boxes are also pertinent. First, the trans-acting factor(s) binding to the two sequence motifs are apparently unique, as judged by the number of DNA-protein complexes formed by the proximal (one) and distal (three) E boxes and by the failure of oligonucleotides representing the proximal and distal E box motifs to cross-compete with each other in DNA-protein interactions unraveled by EMSA. Second, we have observed that the trans-acting factor(s) associating with the proximal E box are inducible with serum; this finding is significant since Msx-1 gene transcription declined precipitously in C2C12 and NIH3T3 cells grown in serum-free media. Therefore, we speculate that a serum-modulatable proximal E box-binding trans-acting protein(s) may be involved in proliferation versus differentiation signaling and Msx-1 gene expression.
The Msx-1 core promoter, with only 32 bp of sequence upstream of the TSP, contains an authentic Sp1 recognition element. The truncated Msx-1 promoter lacks a TATA element, but the sequence around its TSP is CCGCTGC, which is 86% homologous to a recently discovered, modified initiator element (Inr) in the promoter of Ha-ras (70). The location of the proximal Sp1 site in the Msx-1 promoter, 16 bp upstream of the Inr, is reminiscent of the situation in the human muscle phosphofructokinase P1 promoter, which also lacks a TATA box but contains an Sp1 site (between +12 and +21) immediately adjacent to the TSP (70). Promoters containing Sp1 binding sites and Inr elements and lacking a TATA box are thought to be activated through a TBP-mediated mechanism (70). Of the three putative Sp1 recognition GC motifs located on the Msx-1 promoter (at nucleotides −671 to −663, −490 to −485, and −25 to −17), only the proximal GC box (located at −25 to −17) was found to be obligatory for transcriptional activation of the minimal promoter in either cell line. It has been proposed that Sp1 may interact with the basal transcriptional apparatus through coactivators and is involved in binding with TATA-binding protein TFIID. In TATA-less promoters, Sp1 is thought to recruit the basal transcription factors through a novel tethering activity, distinct from coactivators. The tethering factor(s) physically associates with TBP and functions to anchor the initiation complex to the promoter through binding to Sp1 (70,71). Based on the current data, we cannot be certain whether Msx-1 promoter activation involves an Inr-mediated or Sp1-mediated mechanism.
Expression of the Msx-1 gene can be readily detected at many well-defined locations in embryos from 9.5 to 12.5 days of development; these include areas of active organogenesis mediated through interactions between the epithelial and mesenchymal cell layers (4). The Msx-1 knock-out mice have revealed an important paradox between the sites of Msx-1 expression during embryogenesis and the phenotype of the Msx-1 null mice. Although Msx-1 gene expression occurs rather widely in the embryo, Msx-1-deficient mice exhibit very specific defects, primarily restricted to craniofacial structures (72). Currently, it is not known whether the characteristic craniofacial dysmorphology seen in the Msx-1 null mice reflects a failure of the compensatory mechanisms (e.g. expression of Msx-2) which rescue other locations of organogenesis in the embryo. The cis-acting elements which regulate Msx-1 gene expression at various sites in the developing embryo are poorly understood. Recent studies of MacKenzie et al. (73) have revealed that multiple positive and negative tissue-specific elements, including two enhancer sequences located far apart from each other in the Msx-1 promoter, dictate the complex spatiotemporal expression of Msx-1 during embryogenesis. However, these authors failed to obtain absolute concordance between the endogenous Msx-1 expression and the transgene expression dictated by several variants of the Msx-1 promoter designed from 13 kb of the genomic DNA (73). Based on these elegant analyses, it was concluded that the Msx-1 promoter is made up of cis-acting elements that act both independently and in concert with each other to generate the complex pattern of expression of Msx-1 seen during embryogenesis (73). 
A systematic analysis of a number of Msx-1 promoter-LacZ constructs in transgenic mice in our laboratory suggests that the pattern of the endogenous Msx-1 gene expression during embryogenesis is only partially reproduced by most of the promoter constructs; therefore, our observations fully corroborate the conclusion of MacKenzie and co-workers. 2 In light of the extensive observations regarding the widespread Msx-1 expression during development, it is extremely significant that the minimal Msx-1 promoter, encompassing −165/+106 bp, is activated with remarkable precision in the craniofacial structures found to be defective in the Msx-1 knock-out mice (72). The minimal Msx-1 promoter appears to be extraordinarily simple; it consists of two sequence motifs commonly found in eukaryotic promoters, the proximal E box and the GC box. The proximal E box binds to a protein factor which is immunologically related to USF-1. The GC box binds to a transcription factor which is immunologically related to Sp1. Conceivably, additional factor(s) may also recognize the minimal promoter and participate in Msx-1 gene activation in vivo; such interactions may be mediated not only by direct DNA-protein complex formation but also by protein-protein interactions. Msx-1 itself is a transcription factor which inhibits MyoD expression in fibroblast × 10T1/2 cell hybrids (74). It is not known if some myogenic transcription factor(s) also reciprocally regulates Msx-1. Although Msx-1 protein has been shown to bind to the consensus sequence 5′-(C/G)TAATTG-3′, Msx-1 can repress transcription of some target genes lacking DNA-binding sites for the Msx-1 homeodomain (75). 7 The Msx-1 promoter itself contains two Msx-1 consensus binding sites, and the binding of the Msx-1 homeodomain polypeptide to the predicted Msx-1 motif was previously demonstrated (43). 
The murine Msx-1 promoter may therefore be subject to autoregulation by DNA-protein and protein-protein interactions, which would further complicate the regulatory feedback loops orchestrating early development.
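
The consensus motifs discussed above lend themselves to a simple computational scan. The following Python sketch is illustrative only and is not part of the methods used in this study: it converts the Msx-1 homeodomain consensus 5′-(C/G)TAATTG-3′ (written with the IUPAC code S for C/G) and the canonical E box consensus CANNTG into regular expressions and reports matches on both strands. The function names and the short test sequence are hypothetical; the test sequence is a made-up string, not Msx-1 promoter sequence.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "S": "[CG]", "W": "[AT]", "R": "[AG]", "Y": "[CT]",
    "K": "[GT]", "M": "[AC]", "N": "[ACGT]",
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_motif(seq, motif):
    """Return sorted (start, strand) hits; start is 0-based on the forward strand."""
    seq = seq.upper()
    core = "".join(IUPAC[base] for base in motif.upper())
    # A lookahead is used so that overlapping occurrences are also reported.
    pattern = re.compile("(?=(" + core + "))")
    hits = [(m.start(), "+") for m in pattern.finditer(seq)]
    n, k = len(seq), len(motif)
    for m in pattern.finditer(reverse_complement(seq)):
        # Map the position on the reverse strand back to forward coordinates.
        hits.append((n - m.start() - k, "-"))
    return sorted(set(hits))


MSX1_SITE = "STAATTG"  # 5'-(C/G)TAATTG-3', as quoted in the text
E_BOX = "CANNTG"       # canonical E box consensus

# Hypothetical toy sequence for illustration (not genomic sequence).
toy = "AAGTAATTGCCCACGTGTTCAATTAGTT"

if __name__ == "__main__":
    print("Msx-1 sites:", find_motif(toy, MSX1_SITE))
    print("E boxes:", find_motif(toy, E_BOX))
```

Because CANNTG is its own reverse complement as a consensus, every E box is reported once per strand at the same coordinate; a consensus match of this kind only nominates candidate sites, and binding would still have to be established experimentally (e.g. by EMSA or footprinting, as done here for the proximal E box and Sp1 site).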