An evolutionary and molecular analysis of Bmp2 expression.

The coding regions of many metazoan genes are highly similar. For example, homologs of the key developmental factor bone morphogenetic protein (BMP) 2 have been cloned by sequence identity from arthropods, mollusks, cnidarians, and nematodes. Wide conservation of protein sequences suggests that differential gene expression explains many of the vast morphological differences between species. To test the hypothesis that the regulatory mechanisms controlling this evolutionarily ancient and critical gene are conserved, we compared sequences flanking Bmp2 genes of several species. We identified numerous conserved noncoding sequences, including some retained since the fish lineage separated 450 million years ago. We tested the function of some of these sequences in the F9 cell model system of Bmp2 expression. We demonstrated that both mouse and primate Bmp2 promoters drive a reporter gene in an expression pattern resembling that of the endogenous transcript in F9 cells. A conserved Sp1 site contributes to the retinoic acid responsiveness of the Bmp2 promoter, which lacks a classical retinoic acid response element. We have also discovered a sequence downstream of the stop codon whose conservation between humans, rodents, deer, chickens, frogs, and fish is striking. A fragment containing this region influences reporter gene expression in F9 cells. The conserved region contains elements that may mediate the half-life of the Bmp2 transcript. Together, our molecular and evolutionary analysis has identified new regulatory elements controlling Bmp2 expression.

The evolutionary conservation of the BMP2 and BMP4 proteins at both the functional and sequence levels is remarkable. Easily recognizable Bmp genes have been cloned from distantly related phyla such as arthropods, mollusks, cnidarians, and nematodes. Except for the nematode BMPs, the amino acid sequences of nonvertebrate BMPs are 70-87% identical to human BMP2 (Ref. 22 and references therein). The Drosophila protein DPP, which is 71% identical to human BMP2, is functionally interchangeable with mammalian BMPs in a mammalian bone induction assay (23). Conversely, the closely related mammalian BMP4 can rescue the dorsal-ventral axis defects of Drosophila lacking DPP function (24). Indeed, entire signaling pathways involving BMPs are conserved. For example, BMPs and their antagonists appear to play analogous, although inverted, roles in dorsal-ventral axis formation in vertebrates and invertebrates (25).
Both Bmp2 in mouse and dpp in Drosophila are expressed in a highly tissue- and stage-specific pattern. Multiple promoters and alternative splicing produce a variety of dpp transcripts in Drosophila (26). Like Bmp2, the dpp mRNA has an unusually long 3′-untranslated region with portions highly conserved between flies (27,28). Our work, and that of others, suggests that mammalian Bmp2 may be regulated in an equally complex manner (29-34).
The first indication that the vitamin A-derivative retinoic acid (RA) is involved in regulating the Bmp2 gene was our discovery of its strong induction in F9 embryonal carcinoma cells stimulated to differentiate with RA (35). F9 cells lacking retinoic acid receptor (RAR) γ fail to express Bmp2 in response to RA (36). RA also induces the Bmp2 gene in the developing chick limb (37). Many Bmp2-expressing tissues develop abnormally in vitamin A-deficient embryos or after exposure to teratogenic levels of vitamin A or other retinoids. These include the heart and cardiovasculature, limbs, central nervous system, craniofacial structures, and vertebrae (4,38).
F9 cells are widely used as a model of cellular differentiation and early embryonic development (39,40). The morphology, pattern of gene expression, and cell cycle of undifferentiated embryonal carcinoma cells resemble those of the totipotent stem cells of the blastocyst inner cell mass. F9 cells differentiate rapidly and synchronously into primitive extraembryonic endoderm upon treatment with RA and into parietal extraembryonic endoderm upon treatment with RA and cAMP analogs (41). This model system has been used to identify genetic elements controlling the expression of many important developmental genes.
Bmp2 is expressed at three distinct levels in F9 cells (35,42). Undetectable in undifferentiated stem cells, the Bmp2 transcript is readily detectable in RA-treated cells. The combination of elevated cAMP levels and RA increases the abundance of the Bmp2 transcript 5-6-fold more than RA alone. Cyclic AMP elevation alone induces neither differentiation nor expression of Bmp2. This indicates that maximally expressing Bmp2 requires a synergistic interaction between RA and cAMP.
Reasoning that the regulatory elements controlling this ancient and critical gene were conserved (43), we compared sequences flanking the Bmp2 genes from several species. We tested the function of several conserved noncoding sequences in F9 cells representing three embryonic cell types: stem cells, primitive endoderm, and parietal endoderm. The results suggest that, like other RA-induced genes lacking classical retinoic acid response elements (RAREs), a protein-protein interaction between retinoid receptors and Sp1 contributes to RA responsiveness. In addition, a portion of the 3′-untranslated region that has been conserved for 450 million years influences Bmp2 expression.

DNA Isolation
All plasmids were purified using Qiagen® plasmid purification kits. Genomic DNA was isolated from cells grown on 150-mm dishes, lysed with ligation-mediated PCR lysis buffer, incubated overnight at 37°C, and purified by phenol/chloroform extraction (44).

Genomic PCR
Genomic PCR conditions were adapted from the ligation-mediated PCR protocol described by Garrity and Wold (44) using these degenerate primers: 5′-ATTNNCCTCTGTGGGTTGCTAATCCG-3′ (nt −1217 to −1192) and 5′-NNGGGGNCACGTCCATTGAAAGAG-3′ (nt 1094 to 1071). Unless otherwise indicated, all nucleotide positions are provided with respect to the murine Bmp2 distal promoter that is 2,201 nt upstream of the initiator codon (ATG). Any one of the four deoxynucleotides, indicated by "N," was placed in positions that differed between the mouse and human sequences. Reactions contained 0.01 unit/μl of Vent DNA polymerase (New England Biolabs), 2.5-10.0 ng/μl of genomic DNA, 0.1 pmol/μl of primers, 0.8 mM dNTPs, 5% Me2SO, 40 mM NaCl, 20 mM MgSO4, 0.01% gelatin, and 0.1% Triton X-100. Samples were covered with mineral oil and thermocycling was performed in a MasterCycler gradient thermocycler (Eppendorf). The cycling conditions were as follows: 2 cycles of denaturation at 97°C for 3 min, annealing at 60°C for 30 min, and extension at 76°C for 10 min, followed by 29 cycles of 97°C for 1 min, 60°C for 3 min, and 76°C for 3 min. The ramp rate between denaturation at 97°C and annealing at 60°C was set to 1°C/s. The reactions were separated on a 0.7% agarose gel and visualized by ethidium bromide staining.
Following the initial sequencing of the squirrel monkey and African green monkey Bmp2 sequences, the following degenerate primers were designed using ClustalW alignments of the rodent, primate, and deer Bmp2 sequences: nt −140 to −116, 5′-CGTCCACACCCCTGCGCGCNGCTCC-3′; nt 645 to 619, 5′-ATGGCTCGGGGCAGCCATC(C/G)TGGGCGA-3′. Genomic PCR using these primers was performed as described above.
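The degenerate-primer rule described above (an "N" wherever the mouse and human sequences differ) can be illustrated with a minimal Python sketch. The two input sequences are made-up placeholders rather than actual Bmp2 sequence, and the function names are hypothetical; the sketch assumes the two sequences are already aligned, gap-free, and of equal length.

```python
def degenerate_primer(seq_a: str, seq_b: str) -> str:
    """Return a primer with 'N' at every position where two aligned,
    gap-free, equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return "".join(
        a if a == b else "N" for a, b in zip(seq_a.upper(), seq_b.upper())
    )


def degeneracy(primer: str) -> int:
    """Number of distinct oligonucleotides in the pool (4 per 'N')."""
    return 4 ** primer.upper().count("N")


# Placeholder sequences for illustration only (not real Bmp2 sequence).
mouse_like = "ATTGACCTCTGTGG"
human_like = "ATTCTCCTCTGTGG"
primer = degenerate_primer(mouse_like, human_like)
print(primer, degeneracy(primer))
```

By the same counting, a primer with two N positions, such as the upstream primer listed above, corresponds to a pool of 16 oligonucleotides.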

Southern Analysis
Following visualization of the genomic PCR products, Southern analysis was performed by standard methods (45). Probes for Southern blot analysis were made by primer extension using Bmp2-specific primers or random primers and [α-32P]dCTP (45).

Cloning and Sequencing the African Green Monkey Bmp2 Promoter Region
A 1696-nt fragment was amplified from African green monkey genomic DNA and sequenced using the degenerate primers 5′-cgcggatccATTNNCCTCTGTGGGTTGCTAATCCG-3′ (mouse nt −1217 to −1192, lowercase letters indicate an added BamHI site) and 5′-GGGCNCATTCTCgaGCGAGTCGAGC-3′ (mouse nt 479 to 455, RevBmp2XhoI, lowercase letters indicate an added XhoI site) as described above. This PCR product also was digested with BamHI and XhoI and cloned into the BamHI and XhoI sites in pBluescript II SK+ to make pBS-AGM599 (analogous to mouse −104 to 465 nt) and sequenced with T7 and Sp6 primers.

Sequence Analysis
Contiguous squirrel monkey sequences were assembled using the Phrap interface in CuraTools.² ABI trace viewing, ClustalW alignment/editing, and percent identity calculations were performed using BioEdit.³ Promoter predictions were performed by neural network analysis.⁴ Putative transcription factor binding sites were determined by visual inspection and by TRANSFAC and TFD analysis at SIGNALSCAN.⁵ Multiple alignments for figures and percent identity plots were performed using MultiPipMaker⁶ and VISTA.⁷

Luciferase Reporter Constructs
All nucleotide positions are provided with respect to the murine distal Bmp2 promoter that is 2,201 nt upstream of the initiator codon (ATG).
pGL2BasicΔBam-To remove the BamHI site from the 3′ cloning site, pGL2Basic (Promega Corp.) was digested with BamHI and the ends were made blunt using Klenow (New England Biolabs) in the presence of dNTPs and religated.
B575-LUC (nt −104 to 471, pGL575pBX)-pGL716dBX was cut with BamHI and NheI. The ends were filled in with Klenow and religated.
SqM-LUC (Analogous to Mouse nt −104 to 465, pGLSQM600)-A 1.6-kb fragment was PCR amplified from pCRIISQM2.3KB with the primers described above. The PCR product was digested with BamHI, made blunt with Klenow, digested with XhoI, and then cloned into pGL2BasicΔBam digested with SmaI and XhoI.

F9 Cell Culture and Differentiation
F9 embryonal carcinoma cells were plated on dishes precoated with 1% gelatin and incubated at 37°C with 10% CO2. The culture medium consisted of Dulbecco's modified Eagle's medium supplemented with 10% heat-inactivated calf serum and 2 mM glutamine. The cells were induced to differentiate into parietal endoderm by adding 1 μM all-trans-retinoic acid, 250 μM dibutyryl cAMP, and 500 μM theophylline (RACT). Undifferentiated control cells were treated with 250 μM dibutyryl cAMP and 500 μM theophylline (CT).

F9 Cell Transfection by Calcium Phosphate Precipitation
Transfections were performed essentially as described by Vasios et al. (46). Briefly, for 96-h drug treatments, F9 cells were plated at 1 × 10⁶ or 0.3 × 10⁶ (CT only) cells per 100-mm dish (Nunc) for 12 h, drugged for 48 h with CT or RACT, transfected by overnight calcium phosphate precipitation, and then cultured for an additional 48 h with drugs. Each 100-mm dish was cotransfected with 10 μg of reporter plasmid and 3 μg of pβAclacZ (46) containing the β-galactosidase coding region driven by the constitutive β-actin promoter.

Luciferase Assays
Cells were extracted and luciferase activity was determined using the Promega Luciferase Assay system and a Monolight 2010 luminometer (Analytic Luminescence Laboratory). Luciferase activity was normalized for transfection efficiency by dividing the raw luciferase value by the units of β-galactosidase activity (1 unit = A420 · μl⁻¹ · h⁻¹).
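The normalization described above is a simple ratio, and the figure legends additionally express each construct relative to the full-length murine construct set to 100. A minimal sketch of both steps follows; the function names and all numeric values are hypothetical and chosen only for illustration.

```python
def normalize_luciferase(raw_luciferase: float, beta_gal_units: float) -> float:
    """Divide the raw luciferase reading by beta-galactosidase units
    to correct for transfection efficiency."""
    if beta_gal_units <= 0:
        raise ValueError("beta-galactosidase activity must be positive")
    return raw_luciferase / beta_gal_units


def relative_to_reference(normalized: dict, reference: str) -> dict:
    """Rescale normalized activities so the reference construct equals 100."""
    ref_value = normalized[reference]
    return {name: 100.0 * value / ref_value for name, value in normalized.items()}


# Hypothetical readings: (raw luciferase, beta-galactosidase units).
readings = {"Bmp-LUC": (5000.0, 2.5), "B575-LUC": (2800.0, 2.0)}
normalized = {
    name: normalize_luciferase(luc, bgal) for name, (luc, bgal) in readings.items()
}
print(relative_to_reference(normalized, "Bmp-LUC"))
```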

Electrophoretic Mobility Shift Assays
Electrophoretic mobility shift assays were performed essentially as described in Glozak et al. (47) except poly(dA-dT) was used to block nonspecific binding. Expression vectors encoding human RARβ and murine RXRγ (obtained from R. Evans) were linearized with BamHI and in vitro transcribed and translated using the TNT T7-coupled wheat germ extract system (Promega Corp.) according to the manufacturer's instructions. Dr. F. Kashanchi provided human recombinant Sp1 from Promega Corp.

Noncoding Sequences Upstream of the Bmp2 Gene Are Widely Conserved in Mammals and Chick-The Bmp2 coding sequences are highly conserved in distantly related animal phyla (3,22). Bmp2 expression also is conserved in several vertebrate tissues during early development. For example, Bmp2 transcripts occur in the posterior regions of developing mouse and chick limb buds and zebrafish fin buds (4,49). Like others, we observed that the regions upstream of the Bmp2 gene are highly conserved between humans and mice (32). Similarly, the untranslated regions (UTRs) of Bmp2 transcripts are conserved between many species. This suggests that critical regulatory elements mediating the expression of Bmp2 have been retained throughout evolution.
To test the hypothesis that Bmp2 regulatory elements are conserved, we developed a PCR strategy to amplify Bmp2 noncoding regions from other species. We identified conserved regions 1.2 kb upstream of the distal promoter (30) and at the 3′ end of exon 1. A degenerate upstream PCR primer was designed using a ClustalW alignment of the mouse and human Bmp2 genomic sequences. The degenerate exon primer was designed by comparing the human, mouse, rat, deer, and rabbit Bmp2 cDNA sequences. The locations of these primers and sequence accession numbers are shown in Fig. 1. Each primer sequence was compared with GenBank™ using BLAST,⁸ and no significant similarity to other sequences was found.
The ability of the primers to amplify fragments indicates that the primer sequence occurs in each genome. However, to confirm that the amplified fragments were the Bmp2 sequence, the PCR products were hybridized to a 1.7-kb Bmp2 fragment (−1237 to +471 nt relative to the distal promoter). This probe hybridized to the amplified fragments under normal stringency conditions (6× SSC, 50% formamide, 42°C, Fig. 1B). These results demonstrated that the PCR products were indeed Bmp2 fragments.
Sequences Downstream of the Bmp2 Translated Region Are Conserved between Mammals, Birds, Amphibians, and Fish-The genomes of several vertebrates have been completed or are nearing completion. Consequently, we could compare sequences within and flanking the Bmp2 gene of several different species that could not be analyzed easily using PCR. We used two different programs (PipMaker and VISTA) that were designed to provide global sequence alignments between extremely long sequences. Each program provides a useful visualization of the many possible alignments.
PipMaker and MultiPipMaker compare the percent identity of all alignments between gaps. The results are returned as a Percent Identity Plot (PIP) showing the percent identity of any gap-free segments within an alignment (50,51). Unbroken lines indicate the alignment length. The height of the line indicates percent identity over that alignment. For example, the mature Bmp2 peptide sequences in mice and humans are nearly identical and only one-third base differences occur in the DNA sequence. Consequently, an unbroken line near 90% results from comparing the mouse and human coding sequences within exon 3 (Fig. 2A).
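The idea behind a percent identity plot, one bar per gap-free segment with its length and percent identity, can be sketched in a few lines of Python. This is an illustration of the concept only, not PipMaker's own implementation; the function name and the toy alignment are hypothetical, and the input is assumed to be a precomputed pairwise alignment with "-" marking gaps.

```python
def gap_free_segments(aln_a: str, aln_b: str) -> list:
    """Return (start_column, length, percent_identity) for each
    gap-free run of a pairwise alignment (columns are 0-based)."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    segments = []
    start, matches = None, 0
    for i, (a, b) in enumerate(zip(aln_a.upper(), aln_b.upper())):
        if a == "-" or b == "-":
            # A gap closes the current segment, if one is open.
            if start is not None:
                length = i - start
                segments.append((start, length, 100.0 * matches / length))
                start = None
            continue
        if start is None:
            start, matches = i, 0
        if a == b:
            matches += 1
    if start is not None:
        length = len(aln_a) - start
        segments.append((start, length, 100.0 * matches / length))
    return segments


# Toy alignment for illustration only.
print(gap_free_segments("ACGT-ACGTAC", "ACGAAAC-TAC"))
```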
VISTA measures percent identity within a user-specified window beginning at each base pair of the entire sequence (52). The results are returned as a continuous curve representing identity level. For example, aligning the exon 3 coding regions from mouse and human produces a high plateau in VISTA (Fig. 2B). Whereas PipMaker data is anchored by comparing all sequences to one species, VISTA performs pairwise sequence alignments between 3 species (e.g. mouse-human, mouse-pufferfish, and human-pufferfish (53)). A statistical analysis of transcription factor binding sites identified by comparing bacterial genomes suggests that three species are sufficient for meaningful comparisons (54). Each program enables annotation of the results with functional information such as coding regions and CpG islands and is publicly available.⁶,⁷ We have assumed that sequences identified in three or more species with both programs are likely to be functionally significant. Using these tools and this criterion, we identified noncoding sequences that are conserved between rodents, humans, deer, birds, frogs, and fish. Fig. 2, A and B, shows MultiPipMaker and VISTA analyses of the Bmp2 genes from mouse (M. musculus, strain C57Bl/6), human, and pufferfish (Fugu rubripes). Conservation upstream and surrounding the distal promoter region is readily apparent between the mouse and human sequences. Rodents and primates diverged ~91 million years ago (55).
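A windowed identity curve of the kind described for VISTA can be approximated with a sliding sum over alignment columns. The sketch below is a simplified illustration rather than VISTA's actual code: the function name is hypothetical, the input is assumed to be a precomputed pairwise alignment, and columns containing a gap are counted as mismatches.

```python
def window_identity(aln_a: str, aln_b: str, window: int) -> list:
    """Percent identity within a fixed-size window starting at each
    alignment column; gap columns count as mismatches."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    if window <= 0 or window > len(aln_a):
        raise ValueError("window must be between 1 and the alignment length")
    match = [
        1 if a == b and a != "-" else 0
        for a, b in zip(aln_a.upper(), aln_b.upper())
    ]
    current = sum(match[:window])
    curve = [100.0 * current / window]
    for i in range(window, len(match)):
        # Slide the window one column to the right.
        current += match[i] - match[i - window]
        curve.append(100.0 * current / window)
    return curve


# Toy alignment for illustration only.
print(window_identity("ACGTACGT", "ACGTTCGA", window=4))
```

A conserved region then appears as a run of consecutive high values, which corresponds to the plateau or peak seen in the plotted curve.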
An upstream conserved region corresponds to a high identity PIP bar and a VISTA peak (Fig. 2, A and B). This region includes the degenerate PCR primer site used to amplify Bmp2 sequences from diverse species (Fig. 1). Together, the computer and experimental analyses indicate that the sequence is conserved between 11 mammals and 1 bird. This evolutionary conservation suggests that elements controlling mammalian and perhaps avian Bmp2 expression reside in this region.
Both the distal (29-31, 56) and proximal (29,57) promoter regions are conserved between rodents and primates. The distal promoter region exhibits slightly higher identity (compare the length and height of the PIP bars near each promoter). For example, seven mismatches occur in the 100 nt leading up to the mouse and human distal promoters, whereas the proximal promoters have 19 mismatches.
Conservation in the 5′ end of the gene was not evident in the amphibian or teleost sequences (the short overlapping PIP bars and VISTA peak upstream of the promoter result from a simple sequence repeat found in mammals and fish). Thus, the sequences near the promoters may interact with mammalian and possibly avian-specific factors. Pufferfish diverged ~450 million years ago (55). Although mainly coding sequence conservation occurred over this period, functional conserved noncoding sequences have been identified. For example, many enhancers required for mammalian Hox gene expression, including retinoic acid response elements, are conserved between mammals and fish (Refs. 43 and 58, and references therein).
Both MultiPipMaker and VISTA analyses detected a highly conserved region at the beginning of the 3′-UTR of the Bmp2 transcript from mouse, human, and pufferfish (Fig. 2, A and B). This region is also conserved in rat, rabbit (Oryctolagus cuniculus), deer (Dama dama), chick, frogs (Xenopus laevis and X. tropicalis), and zebrafish (Bmp2b, Danio rerio). The Bmp2 conserved element is absent in the pufferfish Bmp4 gene (Fig. 2C), thus implying a function specifically in Bmp2 regulation. Conservation over a large phylogenetic distance suggests the sequence must interact with conserved cellular factors in a

Mouse and Monkey Bmp2 Sequences Drive Reporter Gene Expression Similarly in F9 Cells Treated with RA or RACT-Interspecies sequence comparisons indicate that many sequences outside of the Bmp2 coding regions are conserved. These regions may contain regulatory elements subject to functional constraints. We have begun to experimentally test whether or not these conserved noncoding sequences are functionally relevant using differentiating F9 cells. Untreated or CT-treated F9 cells are undifferentiated and closely resemble cells of the blastocyst inner cell mass. RA or RACT treatment stimulates F9 cells to differentiate into primitive or parietal extraembryonic endoderm, respectively (41). Whereas undifferentiated cells express no detectable Bmp2, RA induces expression. Bmp2 expression is further elevated by RACT treatment (Fig. 3A) (35). We previously demonstrated that a 1.7-kb fragment containing the mouse Bmp2 distal promoter drove RA-dependent chloramphenicol acetyltransferase reporter gene expression in F9 cells (30). The 1.7-kb fragment contains 1237 nt of upstream sequence, the distal promoter, and 471 nt of exon 1, and includes the conserved regions described above. As shown in Fig. 3A, like the endogenous Bmp2 transcript, luciferase gene expression driven by this region is induced in RA-treated cells and is elevated further in RACT-treated cells.
To test the hypothesis that the regulatory elements required to express Bmp2 during extraembryonic endoderm differentiation are conserved between mouse and primates, we cloned amplified Bmp2 DNA from squirrel monkey (S. sciureus) and African green monkey (C. aethiops). 2322 nt of squirrel monkey and 1046 nt of African green monkey sequences have been deposited into GenBank™ (accession numbers AY494188, AY494189, and AY494190). As shown in Fig. 3B, RACT induced transcription from monkey sequence-containing reporter genes by about 10-fold relative to undifferentiated cells (CT). The level of transactivation resembled that mediated by the mouse sequence. This indicates that the primate sequence contains a promoter and interacts appropriately with mouse regulatory factors during differentiation.
Deletion and Mutation Analysis of Conserved Bmp2 Upstream Sequences-To further delineate the elements involved in regulating Bmp2, various sequences were mutated or deleted from the full-length murine fragment (nt −1237 to +471). We tested each construct for its ability to drive reporter expression relative to the full-length fragment in F9 cells drugged for 96 h.
MultiPipMaker and VISTA analyses detected a conserved sequence between −1237 and −1145 nt (Fig. 2). We also showed that primers designed from human and mouse sequences within this conserved sequence could PCR amplify fragments from many non-rodent and non-primate species (Fig. 1). This evolutionary conservation suggests the region interacts with specific cellular factors. A perfect binding site (ATTTTA) for the F2F transcriptional repressor is located between nt −1146 and −1141 of the mouse sequence and at the homologous region of the rat, human, African green monkey, and squirrel monkey genes (Fig. 4A). F2F represses prolactin expression in some contexts (59). Mutating this site elevated reporter gene expression by 30% (Fig. 4C, p < 0.05). This suggests that F2F may function in F9 cells.
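Because a binding site such as the F2F site can be written in either orientation (ATTTTA on one strand reads TAAAAT on the other), a motif scan needs to check the reverse complement as well. The following minimal sketch shows one way to do that; the function names are hypothetical and the test sequence is a made-up placeholder, not Bmp2 sequence.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_motif_both_strands(seq: str, motif: str) -> list:
    """Return sorted (position, strand) hits for an exact motif.
    '+' means the motif itself occurs at that 0-based position;
    '-' means its reverse complement occurs there."""
    seq, motif = seq.upper(), motif.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        pos = seq.find(pattern)
        while pos != -1:
            hits.append((pos, strand))
            pos = seq.find(pattern, pos + 1)  # allow overlapping hits
    return sorted(hits)


# Placeholder sequence for illustration only.
print(find_motif_both_strands("GGCATTTTACCGTAAAATGG", "TAAAAT"))
```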
Deletion of the region between −1237 and −245 did not significantly affect expression in RACT-treated cells (Fig. 4C). However, removing the −245 to −104 segment (Fig. 4, B and C) reduced expression by 30% (p < 0.001). We also deleted a NotI fragment between −193 and −160 from the plasmids containing 1237 or 245 nt of sequence upstream of the promoter. Luciferase activities generated from the plasmids with the small internal deletion were indistinguishable from that of the plasmid controlled by only 104 nt of upstream sequence. The requirement for this NotI fragment was also observed previously using a chloramphenicol acetyltransferase reporter gene (30). Thus, an element within or overlapping a 32-nt fragment upstream of the distal promoter accounts for one-third of the transactivation controlling the Bmp2 promoter.

FIG. 3. Murine and primate Bmp2 reporter genes mimic the endogenous pattern of Bmp2 expression in F9 cells. A, F9 cells were untreated (no RA) or treated with RA or RA and RACT for 48 h and transfected in duplicate with a luciferase reporter plasmid containing nt −1237 to 471 of the mouse Bmp2 gene (Bmp-LUC) and a plasmid constitutively expressing β-galactosidase (pβAclacZ). After 2 days further culture with drugs, luciferase activity was measured and normalized for transfection efficiency as determined by β-galactosidase activity. Average luciferase activity is shown ± S.E. of measurement; the inset panel shows an autoradiograph of a Northern blot of RNA isolated from F9 cells treated as described and hybridized to Bmp2 probe as described in Ref. 35. B, diagrams of the murine Bmp2 distal promoter region (full-length Bmp-LUC, nt −1237 to 471; B575-LUC, nt −104 to 471) in the pGL2Basic luciferase reporter vector (LUC). C, F9 cells were treated with dibutyryl cAMP and theophylline (CT) with or without RA (RACT), transfected with the murine constructs shown in B or constructs containing African green monkey (AGM, 607 nt) or squirrel monkey (SqM, 606 nt) DNA, and analyzed for luciferase activity as described for A. Based on sequence similarity, the African green monkey and squirrel monkey sequences are homologous to the murine B575-LUC construct. The luciferase activity of the full-length murine construct in RACT-treated F9 cells (Bmp-LUC) has been set to 100 and all other constructs were normalized to that value (n = 4). A reporter construct containing 1.9 kb of squirrel monkey sequence generated similar results (n = 1, not shown).

X. tropicalis (X. tropic.), AJ315160; Zebrafish Bmp2b gene, D. rerio, AL928549; Pufferfish, F. rubripes, CAAB01002368. B, a VISTA plot comparing the mouse (M) to human (H), mouse to pufferfish (F), and human to pufferfish sequences. 30-100% identity is shown. C, PipMaker plots comparing the mouse strain 129 Bmp2 sequence to the pufferfish Bmp2 and Bmp4 sequences. Note that significant sequence identity between mouse Bmp2 and pufferfish Bmp4 occurs in the coding sequence, but not in the untranslated region. Strain 129 sequence was compiled from L25602 and AF074942.

FIG. 4. Conserved noncoding sequences upstream of the distal promoter. A, the mouse upstream sequence between −1191 and −1144 is aligned with rat, human, squirrel monkey, and African green monkey sequences. The conserved F2F site consensus sequence (TAAAAT) in the reverse orientation is in bold face and shaded. The F2F site nucleotides mutated in the reporter construct BmpMtF2F-LUC are indicated in lowercase below the sequence. Species and sequence accession numbers are: mouse, M. musculus strain C57Bl/6, NW_000178; rat, R. norvegicus, AABR02024700; human, H. sapiens, AF040249; squirrel monkey (Sq Monk), S. sciureus, (accession number AY494188); African green (Af Gr) monkey, C. aethiops, (accession numbers AY494189 and AY494190). Independently sequenced mouse strain 129 sequence compiled from L25602 and AF074942 was identical to the C57Bl/6 sequence. B, the BamHI fragment containing the GC-rich region with Sp1 binding sites (nt −246 to −98) is aligned with rat, human, and squirrel monkey sequences. Classical Sp1 sites (GGGCGG) in the reverse orientation are shaded. The two NotI restriction endonuclease sites are bold face and underlined. Species and sequence accession numbers are as in A. C, F9 cells were transfected, drug-treated, and analyzed for luciferase activity as described for Fig. 3 (n = 3-21). The diagram to the left of the bar graph illustrates the DNAs driving the reporter gene activity. The ends of the constructs are indicated below each bar. The black box and ∧ mark the deleted 32-base pair NotI fragment. A Student's t test (two-sample assuming unequal variances) was used to compare the activity of each reporter construct to the full-length murine reporter (Bmp-LUC). The following constructs differed significantly: the full-length construct with the mutated F2F site, BmpMtF2F-LUC, p < 0.05 (*), n = 5; the −104 to 471 construct, B575-LUC, p < 0.001 (***), n = 17; the full-length construct lacking the NotI fragment between −193 and −160, BmpΔNot-LUC, p < 0.01 (**), n = 3; the −245 to 471 construct lacking the NotI fragment, B716ΔNot-LUC, p < 0.01 (**), n = 3.
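The two-sample t test assuming unequal variances that was used for the reporter comparisons in Fig. 4C can be sketched as follows. This is a minimal illustration using only the Python standard library; the function name and the sample values are hypothetical and are not data from this study. It returns the t statistic and the Welch-Satterthwaite degrees of freedom; converting these to a p value requires a t distribution, which the standard library does not provide.

```python
import math
from statistics import mean, variance


def welch_t(sample_a: list, sample_b: list) -> tuple:
    """Return (t, degrees_of_freedom) for a two-sample t test that
    does not assume equal variances."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two observations")
    # Squared standard error of each sample mean.
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    if se2_a + se2_b == 0:
        raise ValueError("both samples have zero variance")
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df


# Hypothetical normalized luciferase activities (full length = ~100).
full_length = [100.0, 110.0, 90.0]
deletion = [70.0, 75.0, 65.0]
print(welch_t(full_length, deletion))
```

If SciPy is available, `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the same test and also reports the p value.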
Deleting the NotI fragment removes a consensus Sp1 sequence (GGCGGG) on the reverse strand that is conserved between mice and rats and alters the sequence context of an Sp1 site embedded in the downstream NotI site (Fig. 4B). Two perfect Sp1 sites occur in this region in humans and one in squirrel monkeys. Depending on species, one or two more Sp1 sites are located further downstream. Like Bmp2, Sp1-influenced genes typically contain more than one binding site. Although Sp1 binding itself is not cooperative, multiple sites are often required for maximal transactivation (60,61).

Sp1 and RARβ Interact Upstream of Bmp2-Electrophoretic mobility shift assays (Fig. 5) (30) showed that annealed, labeled oligomers containing the site within the NotI fragment bind to Sp1 protein in vitro. Cold competitor oligomers with the same sequence or an unrelated sequence containing a consensus Sp1 binding site reduced binding (Fig. 5A). Oligomers containing the three downstream Sp1 sites did not bind in vitro (data not shown). The sequence context surrounding Sp1 sites can influence binding. For example, five or more purines or pyrimidines on the 3′ side of an Sp1 consensus site can reduce its affinity for Sp1 relative to sites with mixed purines and pyrimidines (62). The 3′ (upstream relative to the Bmp2 promoter) side of the site that binds Sp1 in vitro contains more mixed purine and pyrimidine bases compared with the three non-binding sites. Thus, the context of the Sp1 site within the NotI fragment appears to promote Sp1 binding.

... −193 to −143, Fig. 4B). Oligomers containing only the Sp1 site embedded in the downstream NotI site (nt −168 to −143) or the two sites downstream of the NotI fragment (nt −148 to −98) did not bind (data not shown). Radiolabeled oligomers were incubated with 500 pg of recombinant human Sp1 protein and 1 μl each of in vitro transcribed and translated RARβ and RXRγ, and electrophoresed through nondenaturing gels. 400-fold molar excess of unlabeled Bmp2 or Sp1 consensus oligomers (ATTCGATCGGGGCGGGGCGAGC) were added as indicated. B, labeled Bmp2 oligomers were incubated with the indicated mass of Sp1 protein with 2 μl of a 1:1 mixture of RARβ and RXRγ (+) or with an equal volume of unprogrammed wheat germ extract (−). C, labeled Bmp2 oligomers were incubated with 500 pg of Sp1 with the indicated volume of RARβ and RXRγ mixture. The volume and total protein concentration of each binding reaction was equalized with unprogrammed wheat germ extract.

FIG. 6. RARβ and Sp1 act upstream of the distal promoter in CT-treated cells. A, PCR was used to amplify the Sp1 binding region upstream of the distal promoter (nt −286 to −24) from genomic DNA or DNA isolated from chromatin that was immunoprecipitated with Sp1 or RARβ antibodies (ChIP). The approximate locations of the forward (thick right arrow, −286 to −258) and reverse (thick left arrow, −24 to −44) PCR primers are indicated. B, chromatin was prepared from F9 cells treated with RA or dibutyryl cAMP and theophylline (CT) or both (RACT) for 3 days. DNA was extracted directly (Input) or from chromatin immunoprecipitated (IP) with Sp1 (α-Sp1) or RARβ (α-RARβ) antibodies. The PCR products were electrophoresed, Southern blotted, and hybridized to Bmp2-specific radiolabeled probes. The numbers below the ChIP blots indicate the relative signals in each lane after normalization to the signals in the Input lanes. As negative controls, PCR targeted to non-promoter regions such as the Bmp2 3′-UTR were performed on ChIP DNA. No signals were observed. C, F9 cells were CT-treated and transfected with the full-length murine Bmp2 reporter construct (nt −1237 to 471, Bmp-LUC) as described for Fig. 3 except that cells were exposed to mithramycin (MTM) at the time of transfection. The average luciferase activity ± S.E. (n = 2 to 4) was plotted against the MTM concentration.
Despite strong activation of the endogenous Bmp2 gene and Bmp2 promoter-driven reporter genes by RA in F9 cells, a classical RARE cannot be identified near the Bmp2 promoter. Classical RAREs contain two directly repeated "PuG(G/T)TCA" motifs separated by 5 nucleotides (63). Although the spacing and relative position of these motifs may vary, at least two usually occur (Ref. 47 and references therein). Only three distantly spaced motifs occur within the full-length Bmp2 construct (nt −1236 to +471). None of these lies between −245 and −104. The regulatory regions controlling several RA-inducible genes lacking classical RAREs have been shown to contain Sp1-binding sites whose activity is modulated by retinoid receptors (see Refs. 62 and 64, and references therein). Thus, retinoid-activated receptors may control Bmp2 transcription by modulating Sp1 activity at this element.
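The motif search described above can be sketched in a few lines. This is a minimal illustration, not the procedure used in this study: the function names, the use of regular expressions, and the toy sequence are assumptions introduced here. It scans both strands for the "PuG(G/T)TCA" half-site and for two direct repeats separated by exactly 5 nucleotides.

```python
import re

# Half-site "PuG(G/T)TCA", where Pu is a purine (A or G).
HALF_SITE = "[AG]G[GT]TCA"
# Two direct repeats separated by 5 nt, as described in the text.
# The lookahead lets overlapping candidates be reported.
DR5 = re.compile(rf"(?=({HALF_SITE}[ACGT]{{5}}{HALF_SITE}))")
HALF = re.compile(rf"(?=({HALF_SITE}))")


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_sites(pattern, seq):
    """Return (strand, 0-based start on the forward strand, matched text)."""
    seq = seq.upper()
    hits = [("+", m.start(), m.group(1)) for m in pattern.finditer(seq)]
    for m in pattern.finditer(revcomp(seq)):
        # Convert the reverse-strand coordinate back to the forward strand.
        start = len(seq) - m.start() - len(m.group(1))
        hits.append(("-", start, m.group(1)))
    return hits


# Hypothetical toy sequence (not Bmp2 sequence): two half-sites, 5-nt spacer.
toy = "TT" + "AGGTCA" + "ACGTA" + "GGGTCA" + "TT"
print(find_sites(DR5, toy))
print(find_sites(HALF, toy))
```

A search of this kind only flags candidate motifs; because spacing and orientation of functional half-sites can vary, isolated half-site hits would still need to be evaluated by eye and by experiment.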
We tested the hypothesis that retinoid receptors potentiate Sp1 binding at low Sp1 levels by mixing purified Sp1 protein with in vitro transcribed and translated RARβ and RXRγ. As shown in Fig. 5B, adding RARβ and RXRγ enhanced Sp1 binding to the oligomers containing the Sp1 site within the NotI fragment at low levels of Sp1. In contrast, we never observed binding to oligomers containing the three downstream Sp1 sites (not shown). The Sp1 to receptor ratio appeared to be critical because adding excess receptor protein reduced Sp1 binding (Fig. 5C).
Electrophoretic mobility shift assays can test whether or not a protein can bind a sequence in vitro. However, cellular and molecular context greatly modifies protein activity. For example, depending on post-translational modification or concentration of Sp1, or on binding site context, Sp1 may activate or repress genes (65-70). Similarly, coactivator and corepressor levels alter nuclear receptor activity (71-73). Reporter genes permit mutational analysis of putative elements, but multiple copies of transiently transfected genes may disrupt the normal balance of regulatory factors. Therefore, we used ChIP to test whether or not Sp1 binds the conserved Sp1 sites in nontransfected cells.
We isolated chromatin from undifferentiated, CT-treated cells that express no Bmp2 mRNA, RA-treated cells that express some Bmp2 mRNA, and RACT-treated cells that express the highest level of Bmp2 mRNA (Fig. 3) (35, 42). This chromatin was immunoprecipitated with antibodies specific to Sp1, RARα and -β, and RXRβ and -γ. After extracting DNA, we attempted to amplify the region containing the conserved Sp1 sites (Fig. 6A). As Fig. 6B shows, a fragment was readily amplified from the CT-treated cell chromatin immunoprecipitated with anti-Sp1 and anti-RARβ antibodies. A faint band was also visible in the DNA from RA-treated cells. We could not amplify fragments from DNA from RACT-treated cells nor from chromatin precipitated with other antibodies (data not shown). Likewise, we were unable to detect Sp1 or RARβ bound to other regions of the Bmp2 gene such as the 3′-UTR (data not shown). We normalized the intensity of the ChIP signals to the amount of total DNA (input) in each chromatin sample. The relative signal levels indicate that Sp1 and RARβ binding correlates inversely with the abundance of the Bmp2 transcript.
A pharmacological approach also can assess the activity of a regulatory protein binding the Bmp2 promoter. Mithramycin prevents the binding of Sp1 to GC-rich DNA and can inhibit Sp1-induced genes (62,74,75). We treated cells transfected with a Bmp2 reporter gene with mithramycin. Mithramycin induced the reporter gene in CT-treated cells (Fig. 6C) but did not affect expression in RACT-treated cells (data not shown). Thus, mithramycin induced Bmp2 reporter genes and Sp1 bound upstream of Bmp2 only in undifferentiated cells. Because these cells do not express Bmp2, this implies that mithramycin interferes with Sp1-mediated repression.
Conserved Noncoding Sequences in the Bmp2 3′-UTR-The sequence comparisons shown in Fig. 2 identified a sequence downstream of the Bmp2 stop codon that is conserved between mammals, birds, frogs, and fish. Fig. 7 shows the MultiPipMaker sequence alignment of the highly conserved region. The 96-nt sequence (mouse) immediately following the stop codon (TAG) varies even between mammals. In contrast, the next 280 nt shown in Fig. 8 is 96% identical between all mammals and is 93% identical between mouse and chick. With the exception of a gap between 9,712 and 9,728, even the two Xenopus sequences are 84% identical to the mouse sequence. Remarkably, large stretches within this region are identical between mouse and the two fish sequences (shaded gray). Indeed, identity over the 230-nt region conserved between mouse and pufferfish is about 69%.
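Percent identity values of the kind quoted above can be computed from a pairwise alignment as sketched below. This is an illustration only: the function name is hypothetical, and skipping gapped columns is an assumption made here, since the text does not state how gaps were counted.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two rows of a pairwise alignment.

    Assumption (not stated in the text): columns in which either row has a
    gap are skipped rather than scored as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # ignore gapped columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment rows (not Bmp2 sequence).
print(percent_identity("ACGT-ACGTA", "ACGTTACCTA"))
```

Scoring gaps as mismatches instead would lower the reported identity, so the convention matters when comparing values across alignments.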
A Reporter Gene Containing the Bmp2 Downstream Region Is Induced More Strongly by RA-To test the hypothesis that the region conserved between mammals and other vertebrates influenced Bmp2 expression, we inserted the entire 3′-UTR downstream of luciferase in the 1.7-kb mouse reporter construct previously described (Bmp-LUC, Figs. 3 and 4). Fig. 8A shows the construct (Bmp-LUC-Bmp). The inserted fragment includes the Bmp2 stop codon, the conserved noncoding sequence, and four putative polyadenylation sites (57,76,77). The actual luciferase activity generated in RACT-treated cells from the reporter construct with the 3′-UTR was more than twice that observed for the 3′-UTR lacking construct (Fig. 8B). Furthermore, luciferase activity was not significantly elevated in CT-treated cells transfected with the 3′-UTR construct. Consequently, whereas RA induced Bmp-LUC expression by 9.5 ± 2.3-fold relative to non-RA-treated cells, inserting the 3′-UTR enabled RA to induce expression by 13.6 ± 1.5-fold. These results indicate that key regulatory elements occur within this region.
The pGL2-Basic (black) luciferase gene and SV40 intron (LUC) and SV40 polyadenylation (pA) signal are indicated relative to the Bmp2 gene segments. B, F9 cells were transfected, drug-treated and analyzed for luciferase activity as described for Fig. 3 (n = 11). The luciferase activity generated by the Bmp-LUC and Bmp-LUC-Bmp constructs in RACT-treated cells differed significantly (p < 0.005). Luciferase activity in CT-treated cells did not differ significantly. C, F9 cells were transfected and drug-treated as described for Fig. 3. Subsequently, cells were exposed to 5,6-dichloro-1-β-D-ribofuranosylbenzimidazole (DRB) for up to 6 h, followed by extraction and luciferase analyses. The average luciferase activities and ranges (n = 2) were plotted as a percentage of the luciferase activity in cells not treated with inhibitor (time 0).

The Bmp2 3′-UTR Increases the Stability of the Luciferase mRNA-The overall AU content of the conserved region in mouse is about 60% (Fig. 7). AU-rich sequences are associated with post-transcriptional regulation. A specific "AUUUA" motif known as an AU-rich element (ARE) controls transcript stability (78). Seven AREs are conserved in the mammalian 3′-UTRs. The conserved AREs are embedded exclusively within the region shown in Fig. 7 and none occur in the remaining 1.6 kb of sequence included in the reporter gene (Fig. 8). Of these, four also occur in fish.
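The two sequence properties discussed here, AU content and the occurrence of "AUUUA" motifs, are simple to compute. The sketch below is illustrative only; the function names and the toy sequence are assumptions introduced here, and overlapping motifs are counted separately, which may differ from the convention used for the counts reported above.

```python
def to_rna(seq):
    """Upper-case the sequence and write T as U."""
    return seq.upper().replace("T", "U")


def au_content(seq):
    """Percentage of A and U (or T) residues in a sequence."""
    rna = to_rna(seq)
    if not rna:
        raise ValueError("empty sequence")
    return 100.0 * sum(base in "AU" for base in rna) / len(rna)


def are_positions(seq, motif="AUUUA"):
    """0-based start positions of every motif occurrence, overlaps included."""
    rna = to_rna(seq)
    positions = []
    start = rna.find(motif)
    while start != -1:
        positions.append(start)
        start = rna.find(motif, start + 1)  # step by 1 to allow overlaps
    return positions


# Hypothetical toy sequence (not Bmp2 sequence) with two overlapping motifs.
toy = "GCATTTATTTAGC"
print(au_content(toy))
print(are_positions(toy))
```

Running the same functions over a sliding window along a 3′-UTR would show whether motifs cluster in one region, as described for the conserved sequence.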
We plotted the data as a percentage of the luciferase activity in cells not treated with inhibitor (time 0). Within 2 h of 5,6-dichloro-1-β-D-ribofuranosylbenzimidazole addition, luciferase levels in cells transfected with the "promoter only" plasmid (Bmp-LUC) fell to 10% of those observed at time 0 (Fig. 8C). In contrast, luciferase levels in cells transfected with the Bmp2 3′-UTR-containing construct (Bmp-LUC-Bmp) were nearly 80% of time 0 levels. Similarly, the Bmp-LUC-Bmp message was more stable in cells treated with actinomycin D (not shown). Thus the Bmp2 3′-UTR contains an element that stabilizes a heterologous reporter message.
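For a rough sense of scale, the fraction of signal remaining after transcriptional inhibition can be converted to an apparent half-life. The sketch below is not an analysis performed in this study. It assumes simple first-order decay, that luciferase activity tracks the reporter transcript, and that both percentages refer to the 2-h time point; the function name is hypothetical.

```python
import math


def apparent_half_life(fraction_remaining, elapsed_hours):
    """Apparent half-life (h) under an assumed first-order decay model.

    N(t) = N(0) * 2 ** (-t / t_half), so t_half = t * ln 2 / ln(1 / fraction).
    """
    if not 0.0 < fraction_remaining < 1.0:
        raise ValueError("fraction_remaining must be between 0 and 1")
    if elapsed_hours <= 0:
        raise ValueError("elapsed_hours must be positive")
    return elapsed_hours * math.log(2) / math.log(1.0 / fraction_remaining)


# Fractions taken from the text; treating "nearly 80%" as 0.80 is an assumption.
print(round(apparent_half_life(0.10, 2.0), 2))  # Bmp-LUC
print(round(apparent_half_life(0.80, 2.0), 2))  # Bmp-LUC-Bmp
```

Under these assumptions the two constructs differ roughly tenfold in apparent half-life. Luciferase protein turnover also contributes to the measured activity, so these figures are illustrative and are not values reported by the authors.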

DISCUSSION
Upstream Conserved Noncoding Sequences-BMP2 is an indispensable growth factor required throughout vertebrate and invertebrate embryogenesis. Disruptions in BMP2 signaling influence human diseases, including cancers and pulmonary hypertension (80). Indeed, the human BMP2 gene has been linked to osteoporosis (81). Thus, understanding the mechanisms controlling BMP2 synthesis is relevant to congenital disorders and adult disease. BMP2 is an ancient and conserved member of the transforming growth factor β family. Related proteins from animals as distantly related as flies and humans are functionally interchangeable in some assays. Gene expression patterns also appear partially conserved. For example, Bmp2 mRNA is detectable in the homologous regions of the developing mouse and chick limb buds and the zebrafish fin buds (4,49). We have identified and characterized conserved genomic regulatory elements controlling the induction of the Bmp2 gene in F9 embryonal carcinoma cells during parietal endoderm differentiation.
Genomic sequences containing the entire Bmp2 transcription unit are available from mice, rats, humans, two frogs (X. laevis and X. tropicalis), zebrafish, and pufferfish. MultiPipMaker and VISTA alignment of the mouse promoter region to the frog and fish sequences failed to detect significant conservation. In contrast, the rodent and primate sequences contain obvious regions of high identity (Figs. 2 and 4). Reporter gene activity (Fig. 3) indicates that upstream regulatory elements required for inducing Bmp2 in F9 cells treated by RACT are functionally conserved between rodents and primates whose lineages separated about 91 million years ago (55). Rodents and primates are evolutionary sister groups (82); however, our ability to amplify sequences from other mammals and chick suggests that Bmp2 regulatory element conservation extends considerably further (Fig. 1). Because only limited chick sequence is available, directly comparing the putative chick promoter to mouse is not possible. However, the fragment amplified from chick DNA hybridized easily to the mouse promoter region under normal stringency conditions. This suggests that, despite ~310 million years of independent evolution (55), many aspects of Bmp2 promoter region function have been retained.
Sp1 Regulation of Bmp2 Transcription-A cluster of Sp1 consensus binding sites occurs in a GC-rich region upstream of the mouse, rat, human, and squirrel monkey promoters (Fig. 4). In addition to sequence identity, four distinct experimental approaches support the hypothesis that one or more conserved Sp1 sites regulate Bmp2 expression. First, deleting the Sp1 site between nt −192 and −161 significantly reduced reporter gene activity (Fig. 4C). Second, this sequence bound Sp1 in electrophoretic mobility shift assays (Fig. 5). Third, ChIP analyses showed that Sp1 bound this region in F9 cells (Fig. 6B). Fourth, mithramycin, an inhibitor of Sp1/chromatin interactions, altered Bmp2 reporter gene activity (Fig. 6C). Thus, Sp1 may influence Bmp2 transcription.
We also found that the RARβ/RXRγ heterodimer potentiates Sp1 binding to this site in vitro (Fig. 5, B and C). Furthermore, ChIP analyses indicate that RARβ binds the same region as Sp1 in F9 cells (Fig. 6B). Retinoid receptor and Sp1 interaction may help explain the absence of a classical RARE near the Bmp2 promoter. Indeed, several RA-inducible genes, including the Bmp2 relative TGF-β1, lack classical RAREs, but contain Sp1-binding sites required for induction by RA (62,64). As with Bmp2, RARs potentiate Sp1 binding to these sites. Moreover, both RA and mithramycin influence the activity of reporter genes derived from these genes (62).
Several lines of evidence point to a direct physical interaction between Sp1 and retinoid receptors. These include coimmunoprecipitation of purified proteins, the fact that glutathione S-transferase-chimeric proteins pull down these factors from cell extracts, and mammalian two-hybrid assays (62,83,84). Unexpectedly, however, ChIP analyses of proteins binding the Bmp2 upstream region in F9 cells indicated an inverse correlation between RARβ and Sp1 binding and Bmp2 expression. RARβ and Sp1 strongly bound the region in undifferentiated, CT-treated cells that do not express Bmp2; binding was intermediate in RA-treated cells expressing an intermediate level of mRNA; and binding was undetectable in RACT-treated cells expressing the highest mRNA level (compare Figs. 3A and 6B). Additionally, mithramycin, which interferes with Sp1 binding to chromatin (62,74,75), induced a Bmp2 reporter gene specifically in undifferentiated cells (Fig. 6C). These results suggest that a mithramycin-sensitive RARβ-Sp1 complex may repress Bmp2 transcription in undifferentiated, CT-treated cells.
Many nuclear hormone receptors repress transcription in the absence of ligand by recruiting transcriptional corepressors (71,72). However, the simplest hypothesis that RARβ directly represses Bmp2 is unlikely. First, no classical retinoic acid response elements (63) occur within this region. In addition, in vitro synthesized receptors cannot bind this region alone (Fig. 5B and data not shown). Thus, a cofactor probably facilitates binding to this region. Second, although RARα interacts with the corepressor SMRT (silencing mediator of retinoic acid and thyroid hormone receptor), RARβ does not (85,86). We only detected RARβ binding upstream of Bmp2 (Fig. 6B). Our results are mirrored by the observation that RARβ binds an Sp1 element upstream of the folate receptor (FR) type β gene only in untreated cells (87). Loss of binding occurs after RA induces the FRβ gene. For these reasons, we favor the hypotheses that Sp1 recruits RARβ to Bmp2 and that an RARβ-Sp1 complex represses Bmp2.
Consistent with this hypothesis, RARβ mRNA abundance is greatly reduced in RACT-treated cells compared with CT-treated cells (35,88,89). In contrast, Sp1 is a ubiquitously expressed regulatory protein that recognizes GC-rich sequences upstream of many genes (70). Extracts from untreated F9 cells and F9 cells treated with RA and cAMP analogs have similar Sp1 binding activities (64). Sp1 itself may also act as either a repressor or activator depending on post-translational modification, cellular concentration, and binding site context (65-70). Thus, it is possible that binding to RARβ promotes the repressive nature of Sp1.
We recognize that if Sp1 represses specifically in CT-treated cells then the 32-bp NotI deletion (Fig. 4C) also must disrupt a sequence that activates the reporter gene in RACT-treated cells. Other potential regulatory protein-binding sites do occur within or overlap the NotI fragment. Furthermore, Sp1 often silences by competing with activating factors whose binding sites overlap that of Sp1 (65,66,68,69).
Downstream Conserved Noncoding Sequences-Whereas interspecies sequence identity in the upstream region may be restricted to mammals and birds, a striking degree of conservation occurs downstream of the stop codon in the 3′-UTR. Indeed, downstream elements are retained in amphibians and fish, which diverged from mammals about 360 and 450 million years ago, respectively (55). Decapentaplegic (dpp) is the Drosophila ortholog to Bmp2 and its close relative, Bmp4. Curiously, a 110-bp invariant region is also immediately downstream of the dpp coding sequence from several fly species that diverged from each other 40 to 80 million years ago (27,28,90). Although the fly sequence does not easily align with the vertebrate sequence, it is both AU-rich and positioned similarly in the 3′-UTR. This is consistent with the involvement of the region in a very ancient mechanism retained since the invertebrate-vertebrate split.
The conserved region contains putative RNA stability elements that may mediate the half-life of the Bmp2 transcript. These "AREs" are found in the 3′-UTRs of many highly regulated genes involved in critical signaling processes, e.g. c-fos, c-myc, and various cytokines (78). AREs can influence RNA turnover, translation, and subcellular localization of the mRNA (78). The Bmp2 AREs are clustered after the translation termination codon in all species. In mouse, ARE motifs are absent in the 1.6-kb sequence downstream of the conserved region. This suggests that the elements were actively selected during evolution and that their location is constrained within the transcript.
AREs appear to play dual roles in RNA turnover, in that under different conditions they may stabilize or destabilize mRNAs. Our observation that the luciferase activity and the half-life of transcripts containing the conserved AREs are greater than those of transcripts lacking the AREs is consistent with a role in RNA turnover. Undifferentiated, CT-treated cells expressed the reporter constructs containing or lacking the 3′-UTR equally. This suggests that mRNA turnover machinery in undifferentiated cells does not interact with the Bmp2 3′-UTR. In contrast, the 3′-UTR markedly stimulated expression and transcript half-life in differentiated, RACT-treated cells. Future experiments are directed toward determining whether a stabilizing protein binding the Bmp2 transcript is induced in differentiated cells.
In summary, genomic comparisons of transcription units and flanking DNAs from different species can reveal functional sequences that would be difficult to detect using purely experimental methods. Using both wet bench and computer-based methods, we have identified and tested the function of conserved sequences controlling Bmp2 expression. One conspicuously conserved region is downstream of the stop codon. The obvious conservation of this sequence in the Bmp2 3′-UTRs of mammals, birds, amphibians, and fish suggests high functional significance. The AU-rich nature of the element suggests a specific post-transcriptional role that may be empirically tested.