Twist-mediated Activation of the NK-4 Homeobox Gene in the Visceral Mesoderm of Drosophila Requires Two Distinct Clusters of E-box Regulatory Elements

NK-4, also called msh-2 and tinman, encodes a homeodomain transcription factor that is required for the development of the dorsal mesoderm and its derivatives in the Drosophila embryo. Genetic analyses indicate that NK-4 resides downstream of the mesodermal determinant twist, which encodes a basic helix-loop-helix-type transcription factor. However, the regulation of NK-4 by twist remains poorly understood. Using expression assays in cultured cells and transgenic flies, we show that two distinct clusters of E-box regulatory sequences, present upstream of the NK-4 gene, mediate NK-4 expression in the visceral mesoderm. These elements are conserved between the Drosophila melanogaster and Drosophila virilis NK-4 genes and serve as binding sites for Twist (E1 cluster) and NK-4 (E2 cluster) proteins. In cultured cells, Twist and NK-4 binding results in activation of NK-4 gene expression. In transgenic animals, the E1 and E2 clusters are functionally connected, and both elements are required for NK-4 activation in cells of the visceral mesoderm and also for NK-4 repression in cells of the somatic musculature. These results demonstrate that NK-4 is a direct transcriptional target for Twist and its own gene product in visceral mesodermal cells, supporting the idea that twist and NK-4 function in the subdivision of the mesoderm during Drosophila embryogenesis.

In Drosophila, the mesoderm develops from cells in the ventral-most part of the embryo at the cellular blastoderm stage (reviewed in Ref. 1). After gastrulation, the mesoderm is further subdivided into three different mesodermal layers including the somatic mesoderm, visceral mesoderm, and progenitors of the heart. Although the mechanism of mesoderm partitioning into more specialized cells is unclear, it is generally believed that a cascade of transcriptional regulators and inductive signals from other germ layers are involved (2-9).
Previously, it was shown that twist, which encodes a basic helix-loop-helix (bHLH) transcription factor, is essential for the early establishment of the mesoderm (10,11). twist is initially activated by dorsal in the presumptive mesoderm of the cellular blastoderm embryo (12-16), where the highest concentration of the dorsal morphogen exists. Subsequently, Twist is distributed in a graded manner where it regulates downstream genes (17,18). Possible targets of twist include msh-2 (19), PS2 (18), Zfh-1 (20,21), DFR1 (22), and D-mef2 (23-26), since mesodermal expression of these genes is disrupted in twist mutant embryos. Recently, it was demonstrated that Twist also functions in the subdivision of the mesoderm, presumably selecting different targets in mesodermal derivatives depending on its concentration (27). Although it was suggested that Twist is required for the activation of those mesodermally expressed genes, little evidence of their direct activation by Twist in specialized mesodermal cells has been published to date. NK-4, also named msh-2 and tinman, is a mesodermal gene that belongs to the cluster of homeobox genes (lbe, lbl (nkch4), NK-4 (msh-2, tinman), NK-3 (bagpipe), 93Bal, and NK-1 (S59)) that are located in the 93D/E region of Drosophila chromosome III (28-34). NK-4 is initially expressed in the presumptive mesoderm at the cellular blastoderm stage shortly after twist, and it continues to be expressed in all mesodermal cells during germband elongation (19). Eventually, its expression is restricted to dorsal mesodermal cells including precursor cells of the heart (19,30). As revealed by the analysis of the tinman mutant embryos, the gene is required for visceral mesodermal cell differentiation and development of the dorsal vessel that is functionally comparable to the mammalian heart (30,31).
Recently, Gajewski and co-workers (35) showed that NK-4 activates the D-mef2 transcription factor gene in cardial cells during heart morphogenesis. These studies demonstrated that D-mef2 resides directly downstream of NK-4 in the genetic hierarchy controlling heart formation in Drosophila. NK-4-related genes have been identified in vertebrates, and their patterns of expression were shown to be confined to the developing heart and gut tissue (36-42). Furthermore, targeted disruption of the mouse Nkx-2.5 gene results in abnormal heart formation during embryogenesis, suggesting that Nkx-2.5 is essential for normal heart morphogenesis (43). These results raise the possibility that the genetic pathways required for heart morphogenesis may be conserved among different species (44-46).
As an initial effort to understand the molecular mechanism by which NK-4 functions in specifying the regional subdivision of mesoderm and heart development, we have investigated the transcriptional control of NK-4 in cultured cells and transgenic flies. We show that NK-4 is a direct target for Twist in visceral mesodermal cells. Moreover, we show that this regulation is mediated by two distinct clusters of E-box regulatory elements and involves autoregulation by the NK-4 protein.

EXPERIMENTAL PROCEDURES
Cloning and Nucleotide Sequencing-A Drosophila melanogaster genomic DNA library (cosmid library, CLONTECH) was screened with a 2.3-kb EcoRI-BamHI fragment corresponding to the NK-4 first exon and part of the first intron (clone 9 from Ref. 28) to isolate clones containing the 5′ upstream region of the NK-4 promoter. One of the cosmid clones obtained (CL 3; 23-kb insert) was used for subcloning and restriction mapping. For sequencing, two subclones (1.62-kb EcoRI insert and 2.3-kb EcoRI-BamHI insert) were serially deleted with ExoIII-S1 nuclease (Promega), and serial deletion mutant plasmids were used as template DNA for DNA sequencing with Sequenase (U. S. Biochemical Corp.). By aligning the overlapping sequences, the total 3.92-kb sequence was obtained. For cloning of the twist cDNA, we synthesized oligonucleotides (primer I, 5′ CTGGAATTCCACAAATTCTAACGTGAAGAAGG 3′; primer II, 5′ GGCTTAGACATCTTAGAATCATCT 3′) based on the published sequence (11) and used them for polymerase chain reaction (PCR) (35 cycles of 1 min at 95°C, 1 min at 55°C, 2 min at 72°C) with Drosophila embryonic (3-12 h) cDNA templates. Amplified DNA was digested with EcoRI and subcloned into the pBluescript KSII vector (Stratagene). Cloned cDNAs (pKS-Twist, 1.6-kb EcoRI insert from nucleotide 108 to 1823 without intron) were confirmed by sequencing and used for construction of the expression vectors. For the cloning of the DvNK-4 homeobox gene, a λEXlox Drosophila virilis genomic DNA library (Novagen) was screened with a 32P-labeled NK-4 cDNA.
Construction of Plasmids-For construction of the F-series reporter plasmids, DNA fragments (F23, −1186 to −887; F33, −1036 to −737; F43, −888 to −587; F53, −738 to −438; F3, −888 to −737) containing various 5′ upstream regions of NK-4 were amplified by PCR (30 cycles of 1 min at 95°C, 1 min at 55°C, 1 min at 72°C) with specific primers. For the construction of F23E1m and F33E2m, P1E1mE2m plasmid DNA (see below) was used for PCR as template. Primers (27-mer including restriction site) for the 5′ to 3′ direction have a HindIII site in addition to the corresponding sequence, and those for the 3′ to 5′ direction have a BamHI site at the end. Amplified DNA fragments were cut with HindIII and BamHI and subcloned into pBLCAT2 (48). To construct the P-series reporters (P1, P3, and P5) containing the NK-4 promoter and upstream region, a common primer for the 3′ to 5′ direction containing the SalI restriction site at the end (primer 750 from −137 to −154, 5′ CTGGTCGACAACCGTTAGCGCAACCGT 3′) and specific primers containing HindIII sites at the end (primer 711 for P1, 5′ CTAAAGCTTGAATTCATTATAACTCTG 3′; primer 713 for P3, 5′ CTAAAGCTTATTATTAAAAATGTTGCT 3′; primer 715 for P5, 5′ CTAAAGCTTTCAAGTAGCGAAACAAAA 3′) were synthesized and used for PCR. Amplified DNA fragments were digested with HindIII and SalI and subcloned into the pCAT-basic vector (Promega) cut with HindIII and SalI. To construct E-series reporters (E1, E1m, E2, and E2m), oligonucleotides were synthesized, annealed, and subcloned into pBLCAT2. Correct clones were selected by DNA sequencing.
Oligonucleotides used were as follows: for the E1 construct, primer E153 (5′ AGCTTTATGTACATATGCACTACATATGCAATTATATACATATGTGAACAG 3′) and primer E135 (5′ GATCCTGTTCACATATGTATATAATTGCATATGTAGTGCATATGTACATAA 3′); for the E1m construct, primer E1m53 (5′ AGCTTGTAGATATCCACTAGATATCCAATAACCT 3′) and primer E1m35 (5′ CTAGAGGTTATTGGATATCTAGTGGATATCTACA 3′); for the E2 construct, primer E253 (5′ GATCCTTAAAATCAAGTGTGCGAAAATCTGCACTTGAGCGCCACTTGACAACAG 3′) and primer E235 (5′ GATCCTGTTGTCAAGTGGCGCTCAAGTGCAGATTTTCGCACACTTGATTTTAAG 3′); and for the E2m construct, primer E2m53 (5′ GATCCTTAAAATGAAGTCTGCGAAAATCTGGACTTCAGCGCGACTTCACAACAG 3′) and primer E2m35 (5′ GATCCTGTTGTGAAGTCGCGCTGAAGTCCAGATTTTCGCAGACTTCATTTTAAG 3′). For the construction of the twist expression vector pRC/CMV-Twist, pKS-Twist was digested with NotI and ApaI, and the gel-eluted DNA fragment (1.6 kb) was subcloned into the pRC/CMV vector (Invitrogen). To construct the truncated form of the twist expression vector CMV-TwiH, twist cDNA (0.57-kb EcoRI DNA fragment; amino acids 324-490) was obtained from pGBT-TwiH and subcloned into the EcoRI site of the NK-4N expression vector that contains 109 bp of the 5′-untranslated region of NK-4 and the initiator codon. To construct the NK-4 expression vector pRC/CMV-NK4, specific primers were synthesized (primer 466, 5′ ACGGCGGCCGCCGAGATTCCAATTCAAGT 3′; primer 465, 5′ CTGGGGCCCTTAATCGTCGTCCTTGTAGTCAGCCATGTGCTGCATCTGTTGC 3′) and used for PCR. Primer 465 has a coding sequence for the Flag peptide (IBI) so that the expressed NK-4 protein can be tagged with the Flag peptide. Amplified DNA fragments were digested with NotI and ApaI and subcloned into the corresponding sites of pRC/CMV.
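As an illustrative check on the annealed-oligonucleotide design described above, the following Python sketch tests whether the two E1 oligonucleotides (primers E153 and E135, as printed) pair into a duplex with 4-nucleotide 5′ single-stranded ends (AGCT and GATC, the ends compatible with HindIII- and BamHI-cut vector). The helper names and the 4-nucleotide overhang length are assumptions made for this sketch and are not part of the published protocol.

```python
# Illustrative sketch only: helper names and the 4-nt overhang are assumptions.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def duplex_core(top: str, bottom: str, overhang: int = 4):
    """Return the double-stranded core if the two oligos anneal so that each
    keeps an `overhang`-nt single-stranded 5' end; otherwise return None."""
    top_core = top[overhang:]
    bottom_core = bottom[overhang:]
    if top_core == reverse_complement(bottom_core):
        return top_core
    return None


# Sequences copied from the E1 construct oligonucleotides (5' to 3').
E153 = "AGCTTTATGTACATATGCACTACATATGCAATTATATACATATGTGAACAG"
E135 = "GATCCTGTTCACATATGTATATAATTGCATATGTAGTGCATATGTACATAA"

core = duplex_core(E153, E135)
if core is None:
    print("oligos do not anneal with 4-nt 5' overhangs")
else:
    print(f"core length: {len(core)} nt; CATATG copies: {core.count('CATATG')}")
```

The same check could be applied to the E2 and E2m oligonucleotide pairs by changing the two input strings; the overhang length would need adjusting for pairs designed with different restriction-site ends.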
Site-directed Mutagenesis-A PCR-based method was used to generate mutations within the E1 or E2 cluster for the construction of P1E1m, P1E2m, and P1E1mE2m (all of the three E-box sequences are mutated). For the P1E1m construct, two separate DNA fragments (223 bp, from −1336 to −1114; 986 bp, from −1122 to −137) were amplified by PCR with specific primers (primer 711; primer 921, 5′ TTGCTCGAGTAGTGCGTACGTACATACAGTACACA 3′; primer 923, 5′ CTACTCGAGCAATTATATACGTACGTGAACACGTTTTTGGT 3′; primer 724, 5′ TAGGGATCCAACCGTTAGCGCTTCCGT 3′). Amplified DNAs were digested with HindIII-XhoI and XhoI-BamHI, respectively, and ligated into the pBluescript vector cut with HindIII-BamHI. Subclones were sequenced, and selected plasmids containing the mutated E-box sequences within the E1 cluster were digested with HindIII-XbaI. DNA fragments were eluted from an agarose gel and subcloned into the HindIII-XbaI sites of the pCAT-basic vector. To construct P1E2m and P1E1mE2m reporters, the pKS-P4E2m plasmid containing mutations within the E2 cluster was constructed first. To construct pKS-P4E2m, primers were synthesized (primer 925, 5′ GGTAAGCTTGCTTAGTACACTCTTAAAATGAAGGTCTGCGAAAATCTGGACTTCAGCGCGACTTCACAACCGTTTAATACACA 3′; primer 724, 5′ CAGGGATCCAACCGTTAGCGCAACCGT 3′) and used for PCR. Amplified DNA fragments were digested with HindIII and BamHI and subcloned into a pBluescript vector. Mutations within the E2 cluster were confirmed by DNA sequencing. From this construct fragment C (771 bp, from −907 to −137) containing mutations within the E2 cluster was amplified with specific primers (primer 927, 5′ CCACGATTTATTTATTTGTTAGCTTAGTACACTCTTAAAATG 3′; primer 750).
Two additional DNA fragments (471 bp, from −1336 to −866) containing either the wild type (fragment A, from P1 template DNA) or the mutated E-box sequence (fragment B, from P1E1m DNA) within the E1 cluster were amplified from different template DNAs with specific primers (primer 711 and primer 928, 5′ CATTTTAAGAGTGTACTAAGCTAACAAATAAATAAATCGTGG 3′). Mixed DNA fragments (A and C for P1E2m; B and C for P1E1mE2m) were denatured and re-annealed. DNAs (41-nucleotide overlap) were gap-filled with Vent DNA polymerase (New England Biolabs). Gap-filled DNAs were subjected to PCR (30 cycles of 1 min at 95°C, 1 min at 55°C, 2 min at 72°C) with primers 711 and 724 to generate full-length DNA fragments (1.2 kb, from −1336 to −137). Amplified DNAs were gel-eluted and subcloned into the pBluescript vector. Mutated regions of subclones were sequenced again with specific primers. Selected clones were digested with HindIII and XbaI, and gel-eluted DNA fragments were subcloned into the pCAT-basic vector.
In Vitro DNA Binding Assays-Gel-shift assays were performed with partially purified fusion proteins (155 ng) and 32P-labeled probes (5 × 10^4 cpm, 5-10 fmol) in binding buffer A containing 25 mM HEPES (pH 7.5), 3 mM MgCl2, 1 mM EDTA, 0.5% Nonidet P-40, 10% glycerol, and 1 μg of poly[d(I-C)]. Reactions were incubated at room temperature for 15 min and analyzed on 4% polyacrylamide gels in 0.25 × Tris borate buffer. To prepare oligonucleotide probes, equimolar amounts of oligonucleotides were annealed and end-labeled with T4 kinase or Klenow fragment. To make the F35 probe, DNA amplified by PCR with specific primers was subcloned into the HindIII-BamHI sites of the pBLCAT2 vector. From this construct the F35 DNA fragment (78 bp from −886 to −809) was gel-eluted, dephosphorylated, and end-labeled by T4 kinase. Footprinting assays were performed as described previously (49). The F23 (for Twist footprinting) or F33 (for NK-4 footprinting) plasmid DNA was digested with HindIII (in the case of minus-strand labeling, BamHI was used), and digested DNAs were dephosphorylated with alkaline phosphatase. Dephosphorylated DNAs were redigested with BamHI (in the case of minus-strand labeling, HindIII was used). DNA fragments were eluted from a gel, and the concentration of DNA was measured. End labeling was done by T4 kinase. Binding reactions were performed with fusion proteins and labeled DNA (12.5 fmol) in 20 μl of binding buffer A. After incubation for 15 min at room temperature, CaCl2 (final 0.5 mM) and DNase I (0.05 unit) were added to the binding reactions for DNase I digestion. Reactions were stopped after 1 min of incubation by adding stop buffer and were analyzed on an 8% denaturing polyacrylamide gel.
Cell Transfections and CAT Assays-CV-1 cells were grown in minimal essential medium (Life Technologies, Inc.) supplemented with 10% fetal bovine serum, and Drosophila S2 cells were grown in M-3 medium (Quality Biologicals Inc.) supplemented with 10% fetal bovine serum. Cells (2 × 10^5 per 60-mm dish) were transfected with plasmid DNAs (total 12 μg per dish; amount of DNA was adjusted with salmon sperm DNA) by the calcium phosphate precipitation method. Usually, 3 μg of reporter plasmid, 1 μg of expression vector, and 1 μg of β-galactosidase expression vector (pCMVβ; CLONTECH) were used unless indicated. For cotransfections with two different expression vectors, the pRC/CMV empty vector was used to adjust the total amount of expression vector. Cells were harvested 48 h after transfection, washed, and finally solubilized in 60 μl of lysis buffer (250 mM Tris-HCl (pH 8.0)). Cells were lysed with three cycles of freezing and thawing, and cell extracts (supernatants) were collected after centrifugation and used for β-galactosidase activity and CAT assays. Twenty μl of the extracts were used to measure CAT activities with a CAT enzyme-linked immunosorbent assay kit (Boehringer Mannheim) according to the manufacturer's protocol. Transfection efficiencies were normalized with β-galactosidase activity.
Germ Line Transformation and Immunohistochemistry-Standard procedures were used for P-element-mediated germ line transformation (50). The y w67c23 strain was used for embryo injections with P-elements P1LacZ (wild type), P1E1mLacZ (E1 mutation), P1E2mLacZ (E2 mutation), and P1E1mE2mLacZ (E1 and E2 mutations), respectively. P-elements were constructed by inserting the corresponding NK-4 promoter and upstream regions (EcoRI and BamHI double-digested DNA fragments) from the P1, P1E1m, P1E2m, and P1E1mE2m reporters that were used for transient expression assays in cultured cells into the CaSpeR β-galactosidase vector. A minimum of four lines was established for each construct. For the characterization of reporter gene expression in transgenic flies, embryos from each fly stock were collected, dechorionated, and fixed as described previously (51) and subjected to incubation with the anti-β-galactosidase primary antibody (Cappel; 1:5000 dilution). Signals (brown precipitate) were developed with peroxidase (horseradish peroxidase)-conjugated secondary antibody (Life Technologies, Inc.) and diaminobenzidine.

The 5′ Upstream Region of the NK-4 Promoter Contains Two Clusters of E-box Sequences That Are Conserved in the D. virilis Gene-Because the potential hierarchical relationship between twist and NK-4 was suggested previously (19,26), the 5′ upstream region of the NK-4 promoter was sequenced and found to contain nucleotide motifs such as E-boxes. Using two subcloned plasmids (a 1.62-kb EcoRI insert and a 2.3-kb EcoRI-BamHI insert, respectively) which contain the 5′ upstream region, the first exon, and part of the first intron, serial deletion mutant plasmids were generated and used as template DNAs for DNA sequencing. A total of 3 kb of sequence from the upstream region was obtained by aligning the overlapping sequences. The portion of the upstream sequence that was determined is shown in Fig. 1A. We reasoned that this region might contain functionally important regulatory sequences for the expression of NK-4, since two deletion alleles within this region result in lethality (30). In the region proximal to the initiator codon (from −337 to −137), we detected basal promoter activity by an analysis of transient expression of reporter constructs in Drosophila S2 cells (data not shown). In addition to many putative regulatory sites, we found 11 E-box (CANNTG) sequences that are putative binding sites for bHLH-type transcription factors such as MyoD and Twist. Of interest are two clusters of E-box sequences (see Fig. 1A; E1 cluster, 3 copies of ACATATG from −1134 to −1101; E2 cluster, 3 copies of CACTTGA from −868 to −831). If these two clusters of E-box sequences have a regulatory function in NK-4 expression, then they may be conserved in other species during evolution. To test this, we cloned the NK-4 homologue of D. virilis (DvNK-4). Interestingly, several clones contained both the NK-4 and NK-3 homeobox genes, suggesting that the D. virilis genome also has the same homeobox gene cluster as D. melanogaster (28). The deduced amino acid sequences of the DvNK-4 and DvNK-3 homeodomains are nearly identical to those of D. melanogaster, and the gene structure is also similar. In addition, we found that the E-box clusters are also conserved in D. virilis (Fig. 1B). Conservation of the E-box clusters during evolution suggests that these sequences may have an important regulatory function for NK-4 expression.

FIG. 1. Nucleotide sequence of the 5′ upstream region of the NK-4 homeobox gene (A) and sequence comparison of E-box clusters between D. melanogaster NK-4 (DmNK-4) and D. virilis NK-4 (DvNK-4) (B). Both strands were sequenced by the dideoxy chain termination method. The nucleotide sequence of the 5′ upstream region of the promoter and the 5′-untranslated region is shown. A, A of the initiator codon (ATG, Met) is numbered as +1. The E-box (CANNTG) sequences are indicated by E. Locations of the two clusters of the E-box sequences (E1 and E2) are indicated and labeled. The basal promoter region is underlined. An asterisk indicates the 5′ end of the cDNA. EcoRI, EcoRI restriction site. B, the E-box clusters that are located in the 5′ upstream region of the DvNK-4 promoter are compared with those of DmNK-4.
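The E-box consensus (CANNTG) used to annotate the upstream region can be expressed as a simple pattern search. The Python sketch below is an illustration rather than the procedure used in the study: the function name is hypothetical, and the three input strings are taken from the oligonucleotides listed under "Experimental Procedures" (the E1 duplex core, and primers E253 and E2m53), not from the full genomic sequence.

```python
import re

# CANNTG, wrapped in a lookahead so that overlapping motifs are also reported.
EBOX = re.compile(r"(?=(CA[ACGT]{2}TG))")


def find_eboxes(seq: str):
    """Return (0-based offset, motif) for every CANNTG match on the given strand."""
    return [(m.start(), m.group(1)) for m in EBOX.finditer(seq.upper())]


# Sequences copied from the E-series oligonucleotides (5' to 3').
E1_CORE = "TTATGTACATATGCACTACATATGCAATTATATACATATGTGAACAG"
E2_OLIGO = "GATCCTTAAAATCAAGTGTGCGAAAATCTGCACTTGAGCGCCACTTGACAACAG"
E2M_OLIGO = "GATCCTTAAAATGAAGTCTGCGAAAATCTGGACTTCAGCGCGACTTCACAACAG"

for name, seq in [("E1", E1_CORE), ("E2", E2_OLIGO), ("E2m", E2M_OLIGO)]:
    motifs = [motif for _, motif in find_eboxes(seq)]
    print(name, len(motifs), motifs)
```

Because CANNTG is its own reverse-complement pattern, scanning one strand is enough to locate every E-box; the motif reported for a non-palindromic site (for example CAAGTG) reads as its reverse complement (CACTTG) on the opposite strand.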
Twist Binds to the E1 Cluster, but not to the E2 Cluster DNA-Various bHLH transcription factors can bind to the E-box sequence as homo- or heterodimers (52,53). We observed that nuclear extracts from the Drosophila embryos contain DNA binding proteins for both E1 and E2 cluster DNA, indicating that regulatory proteins such as bHLH transcription factors might bind to these E-box clusters during embryogenesis (data not shown). If the homodimer of the Twist protein is functional in DNA binding, we should be able to detect binding activity of the Twist protein to the E-box sequence. As an initial step toward determining whether Twist activates NK-4 directly, we first examined whether Twist could bind to the E-box clusters located in the 5′ upstream region of the NK-4 promoter. The Twist protein used in the DNA binding studies was prepared by expressing twist cDNA in E. coli as a GST:Twist fusion protein, which was partially purified using a glutathione-Sepharose column. The GST:Twist fusion protein was analyzed for its DNA binding properties by gel-shift assays using the labeled E1 or E2 DNA containing two copies of the E-box sequence within the E1 or E2 cluster region as a probe. We found that the GST:Twist fusion protein bound to the E1 cluster specifically, presumably as a homodimer (Fig. 2A, lanes E1 PROBE). However, we could not detect any binding activity to the E2 cluster DNA (Fig. 2A, lanes E2 PROBE). We also performed DNase I footprinting assays with a labeled F23 DNA fragment (300 bp from −1186 to −887) containing the E1 cluster. Results shown in Fig. 2B demonstrate that the Twist protein leads to specific footprints on the E1 cluster that are consistent with the results of the gel-shift experiments shown in Fig. 2A. These results demonstrate that Twist binds to the E1 cluster DNA but not to the E2 cluster DNA.

FIG. 2. The E-box sequences (CATATG) within the E1 cluster are binding sites for Twist. A, binding of the Twist protein to the E1 cluster but not to the E2 cluster DNA. Gel-shift assays using a purified GST:Twist fusion protein were performed with labeled E1 or E2 probes that contain two copies of E-box sequence from either the E1 or E2 cluster and a 100-fold excess of cold E1, E2, or nonspecific (NS) oligonucleotides as competitor. The absence (−) or presence (E1, E2, NS) of a competitor and Twist is indicated above the gel. Arrowheads indicate shifted bands. The E1 and E2 sequences used in the gel-shift assays are shown below the autoradiogram. B, DNase I footprinting assays. Either strand of the F23 DNA fragment (300 bp from −1186 to −887) was end-labeled and subjected to DNase I footprinting assays with a GST:Twist fusion protein as described under "Experimental Procedures." Only the E1 cluster region was protected (closed box to the right of gel). Numbers indicate corresponding regions of the protected sequences. Lanes: G + A, G + A sequence of the F23 DNA fragment; None, reaction in the absence of protein; Twist, reaction using a GST:Twist fusion protein; BSA, reaction using bovine serum albumin.
Dependence of NK-4 Activation by Twist on the E1 Cluster-We next asked whether the specific binding of a GST:Twist fusion protein to the E1 cluster DNA in vitro is functionally related to NK-4 regulation by twist in cultured cells. For this purpose, we employed transient expression assays in cultured CV-1 cells, since previously we found that the Drosophila S2 cells contain endogenous twist and NK-4 activities. We constructed a Twist expression vector, pRC/CMV-Twist, containing the complete twist cDNA fragment driven by a strong CMV promoter. For the construction of the F23 and F33 reporter plasmids containing either the E1 (F23 DNA fragment) or the E2 cluster region (F33 DNA fragment, 300 bp from −1036 to −737), PCR-amplified DNAs were subcloned into the pBLCAT2 vector, in which the chloramphenicol acetyltransferase (CAT) gene is driven by the heterologous thymidine kinase promoter. When we measured CAT activities from cell extracts cotransfected with reporter plasmids and with the Twist expression vector, increased CAT activities were observed only in cell extracts cotransfected with the F23 reporter (Fig. 3A). Cotransfection with the F33 reporter containing the E2 cluster resulted in no increase in CAT activity. To investigate whether the E-box sequences of the E1 cluster within the F23 construct are responsible for the activation of this reporter, we constructed the F23E1m reporter containing a mutated sequence of the E1 cluster and measured CAT activity after cotransfection. Indeed, mutations within the E1 cluster abolished the increase in CAT activity that was seen in the F23 transfection following twist expression, suggesting that the E-box sequences (CATATG) within the E1 cluster are necessary for the activation of the reporter gene by Twist (Fig. 3A). Similarly, we constructed reporters (E1 and E1m) in which the E1 cluster alone (from −1139 to −1095) or the mutated E1 cluster is attached to pBLCAT2 and tested transactivation of these reporter genes by twist expression. The results showed that the wild type E1 cluster sequence was sufficient to induce an increased CAT activity when cells were cotransfected with the twist expression vector, whereas the mutated E1m sequence was not (Fig. 3A). These results demonstrate that the sequence requirements for Twist binding to the E1 cluster closely parallel those necessary for NK-4 activation by twist.

FIG. 3. NK-4 is a direct transcriptional target for twist. CAT activities (averages of three sets of independent experiments) were measured in extracts from CV-1 cells cotransfected with various CAT reporter plasmids (3 μg per transfection) together with the twist expression vector (1 μg of pRC/CMV-Twist; hatched bar) or with an empty vector (1 μg of pRC/CMV; closed bar) as described under "Experimental Procedures." In this series of experiments, the normalized CAT activity (CAT activity was corrected for transfection efficiency by β-galactosidase assays), obtained from transfection with a test reporter, was divided either by the corresponding value obtained with pBLCAT2 and pRC/CMV (A) or by that obtained with P1 and pRC/CMV (B and C) and is shown as relative CAT activity. A, effect of twist on the E1 cluster-dependent gene activation in cultured cells. F23E1m has the same upstream region as the F23 construct except for a mutated sequence within the E1 cluster. The E1 construct contains the wild type sequence of the E1 cluster region, whereas the E1m construct has a mutated E-box sequence within the E1 cluster region. B, direct activation of NK-4 by twist. The P1E1m construct contains the same NK-4 promoter and upstream region as the P1 construct but has mutations within the E1 cluster. C, cotransfections were performed with the truncated form of the twist expression vector pRC/CMV-TwiH (TwiH), and CAT activities were measured. Reporter constructs and expression vectors used are shown. Numberings and the NK-4 map are based on the nucleotide sequence shown in Fig. 1. M indicates the initiation codon (Met). Names of constructs are indicated to the right. Solid bars and numbers show corresponding upstream regions of the NK-4 promoter. Two clusters of E-box sequences are indicated as E1 (closed circle) and E2 (closed box). X on the solid bar indicates mutations within the E1 or E2 cluster. In the case of P1E1mE2m, both the E1 and E2 clusters were mutated. P-series reporter constructs contain the NK-4 promoter, whereas F- and E-series reporters have the heterologous thymidine kinase (TK) promoter. The 3′ end of the NK-4 promoter region of the P-series constructs is numbered as −137. The arrow represents the transcription start site. RI, EcoRI; CAT, chloramphenicol acetyltransferase reporter gene.
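The relative CAT activity reported for Fig. 3 is a two-step ratio: each CAT reading is first corrected by the β-galactosidase reading from the same extract, and the corrected value is then divided by the corrected value of the reference transfection. The Python sketch below illustrates that arithmetic under stated assumptions: the function names are hypothetical, and the numbers are invented placeholders chosen only to make the calculation easy to follow, not measurements from this study.

```python
from statistics import mean


def normalized_cat(cat: float, beta_gal: float) -> float:
    """Correct a CAT reading for transfection efficiency using beta-galactosidase."""
    if beta_gal <= 0:
        raise ValueError("beta-galactosidase reading must be positive")
    return cat / beta_gal


def relative_cat(test, reference) -> float:
    """Divide the normalized test value by the normalized reference value.
    Each argument is a (CAT, beta-galactosidase) pair from one extract."""
    return normalized_cat(*test) / normalized_cat(*reference)


# Hypothetical readings from three independent experiments:
# (test reporter + expression vector) versus (reference reporter + empty vector).
experiments = [
    ((12.0, 2.0), (3.0, 1.5)),
    ((10.0, 2.0), (2.0, 1.0)),
    ((9.0, 1.5), (4.0, 2.0)),
]

fold_changes = [relative_cat(test, ref) for test, ref in experiments]
print("relative CAT activity per experiment:", fold_changes)
print("average of three experiments:", mean(fold_changes))
```

Averaging the per-experiment ratios, as done here, is one reasonable reading of "averages of three sets of independent experiments"; averaging the normalized values before taking the ratio would be an alternative and can give a slightly different number.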

NK-4 Is a Direct Transcriptional Target for Twist in Cultured Cells-Finally, we tested whether Twist is able to activate the NK-4 homeobox gene directly in cultured cells. To this end, we used the NK-4 promoter and the 5′ upstream region for the reporter constructs. The wild type P1 construct was generated by inserting 1.2 kb of DNA (from −1342 to −137) from the NK-4 promoter and the 5′ upstream region into the pCAT-basic vector. For the construction of the mutant P1E1m, we introduced mutations within the E1 cluster. Following cotransfection of cells, a 6-fold increase in CAT activity was observed with the P1 construct which was dependent upon twist expression (Fig. 3B). In contrast, the mutant reporters (P3 and P1E1m) did not show a significant increase in CAT activities compared with that of the P1 reporter construct (Fig. 3B). In addition, transfections with a truncated form of the twist expression vector, CMV-TwiH, that contains a bHLH domain but lacks a transcriptional activation domain failed to show NK-4 activation, indicating the direct involvement of the Twist protein in transactivation of NK-4 (Fig. 3C).
The residual CAT activity (see Fig. 3B; 1.8-fold increase) in P3 and P1E1m prompted us to search for other putative twist regulatory sites. We tested potential Twist binding to E-box sequences, including those that might occur in the 5′ upstream region (up to 3 kb) of the NK-4 promoter, by gel-shift assays (Fig. 4). We found that the Twist protein could bind to the E-box sequences with differential affinities under our experimental conditions (strong binding, Fig. 4, lanes 1 and 7; moderate binding, lanes 8 and 10; weak binding, lanes 5 and 9; no binding, lanes 2-4 and 6). These strong binding sites (lanes 1 and 7) also exist in the rhomboid neuroectodermal element (47), and one binding site with medium affinity (lane 10) was found in the snail proximal enhancer element (54). As far as the Twist binding site in the NK-4 promoter and the 5′ upstream region (up to 3 kb) is concerned, we did not find any strong binding sites except for the E-boxes (CATATG) within the E1 cluster (Fig. 4, lanes 1-5). This E-box sequence was found in two other locations (TACATATGC from −1657 to −1649, data not shown; AGCATATGA, from −444 to −436), and a weak binding site (AGCAGCTGG from −675 to −667) was also found. These two binding sites within the F53 DNA fragment (from −738 to −438) may explain the residual CAT activities seen in P1E1m and P3 transfections. Indeed, coexpression of F53 containing these two sites (300 bp, from −736 to −436) and twist showed a mild increase in CAT activity (data not shown). Taken together, these results strongly suggest that NK-4 is a direct transcriptional target of twist in cultured cells.
The E2 Cluster Is Responsible for the Autoregulation of NK-4 -Because autoregulation of homeobox genes has been described previously, we examined whether NK-4 could also autoregulate its own gene in cultured cells. To this end, various overlapping upstream regions (300-bp DNA fragment each) of the NK-4 promoter were amplified and subcloned into the pBLCAT2 reporter (Fig. 5, F23-F53), and the effect of NK-4 on CAT activity was measured. We observed that CAT activities from cell extracts cotransfected with the F33 or F43 reporters were increased in the presence of the NK-4 expression vector pRC/CMV-NK4 (Fig. 5A). Additionally, we observed increased CAT activity in cells cotransfected with F3 and the NK-4 expression vector. These results indicate that the overlapping region between the two constructs (F3 DNA fragment, 150 bp from Ϫ888 to Ϫ737) contained cis-acting DNA elements responsible for the autoregulation by NK-4. To determine the NK-4 binding sites within this region, we expressed a GST: NK-4 fusion protein in E. coli and used it for DNA binding studies. Using gel-shift assays we found strong binding activities to the F35 DNA fragment (from Ϫ886 to Ϫ809) in which the E2 cluster sequence is located (Fig. 6A). This binding was eliminated by competition with oligonucleotides containing two copies of the E-box sequence within the E2 cluster region (from Ϫ852 to Ϫ827). DNase I footprinting assays also showed protection of the E2 cluster region in the presence of the NK-4 protein (Fig. 6B), suggesting that the E-box sequences within the F35 region are binding sites for the NK-4 protein. This finding was again confirmed by competition assays with oligonucleotides containing a mutated sequence within the E-box. None of the tested sequences could compete with the wild type E-box sequence except MUT3 (CACTTAA) under our experimental condition (Fig. 6C). 
In fact, we also observed protection of an additional E-box sequence (TCAAGTG, from −963 to −957), which has the same sequence as that in the E2 cluster (Fig. 6B). These results demonstrate that the E-box sequences in the E2 cluster are strong binding sites for the NK-4 homeodomain.
Having demonstrated NK-4 protein binding to the E2 cluster, we sought to examine the functional relevance of this binding to NK-4 autoregulation. We tested three reporters (Fig. 5): F3E2m, which contains the F3 region with a mutated E-box sequence within the E2 cluster; E2, which contains the wild type E2 sequence alone; and E2m, which contains the mutated E2 sequence alone. We found that mutations within the E2 cluster completely abolished the CAT activity seen in the F3 transfection (Fig. 5B). Similarly, whereas transfection with E2 resulted in increased CAT activity, transfection with E2m did not show any increase in reporter gene expression. Consistent with these results, the wild type P1 reporter showed increased CAT activity in the presence of NK-4. In contrast, the deletion mutant (P5) and mutations within the E2 cluster (P1E2m) resulted in decreased CAT activities (Fig. 5C). These results demonstrate that the E2 cluster is responsible for NK-4 autoregulation.
Both the E1 and the E2 Cluster Are Required for NK-4 Activation in Visceral Mesodermal Cells in Vivo-To address the in vivo function of the two clusters of E-box sequences, we established transgenic flies containing P-elements (wild type, P1LacZ; E1 mutation, P1E1mLacZ; E2 mutation, P1E2mLacZ; E1 and E2 mutations, P1E1mE2mLacZ). To construct the P-elements, DNA fragments containing the NK-4 promoter and the 5′ upstream regions from P1, P1E1m, P1E2m, and P1E1mE2m, respectively, were subcloned into the CaSpeR P-element vector. In these constructs, the lacZ reporter gene is driven by the same NK-4 promoter and enhancer region that were characterized in transient expression assays in cultured cells. Expression of the β-galactosidase marker was monitored by immunohistochemistry during embryogenesis. As shown in Fig. 7, the wild type NK-4 reporter gene is expressed in visceral mesodermal cells in late stage 11 embryos (arrow, A and B), which is consistent with tinman expression in those cells. These results indicate that the enhancer region that we have characterized in cultured cells is sufficient for NK-4 activation in visceral mesodermal cells in vivo. When we mutated either the E1 or E2 cluster, β-galactosidase activity was abolished in these cells (Fig. 7, D-F), demonstrating that both the E1 and E2 clusters are required for NK-4 activation in visceral mesodermal cells. Furthermore, together with the data showing that Twist and NK-4 bind to these clusters in vitro and that this binding results in reporter gene activation in cultured cells, these in vivo results suggest that NK-4 is a direct transcriptional target for Twist in visceral mesodermal cells and support the notion that twist function is also required for the subdivision of mesoderm during Drosophila embryogenesis. Ectopic β-galactosidase expression in midline cells was also observed (Fig. 7, A-C, arrowhead), and the β-galactosidase protein, presumably because of its stability, remained in visceral muscle cells in stage 13 embryos (Fig. 7C, arrow). Interestingly, in embryos from transgenic flies containing P-elements with mutations in either the E1 or E2 cluster (Fig. 7, D-F), ectopic reporter gene expression was observed in somatic muscle cells. Ectopic expression of the lacZ reporter in the P1E1mLacZ embryos was weaker than in embryos from other transgenic lines (P1E2mLacZ). These results suggest another function for the E-box clusters in NK-4 regulation, namely that both enhancer elements are required for NK-4 repression in cells that do not normally express NK-4. In addition, because reporter gene activation in visceral mesodermal cells disappeared in these embryos, both elements are functionally connected and are required for NK-4 activation in visceral mesodermal cells.
FIG. 5. NK-4 autoregulation. CAT reporter plasmids (3 μg per transfection) were cotransfected with the NK-4 expression vector (1 μg of pRC/CMV-NK4; hatched bar) or with an empty vector (1 μg of pRC/CMV; closed bar) into CV-1 cells, and CAT activities (averages of three sets of independent experiments) were measured. Relative CAT activity (for A and C) was calculated as described in the Fig. 3 legend. Reporters and expression vectors used for cotransfection are shown. A, localization of sequences responsible for the NK-4 autoregulation. Various upstream regions of the NK-4 promoter were subcloned into pBLCAT2, and the resulting plasmids were used for transient expression assays. B, effect of NK-4 on the E2 cluster-dependent gene activation. In these experiments, the normalized CAT activity, obtained from transfection with a test reporter, was divided by the corresponding value obtained with F3 and pRC/CMV and is shown as relative CAT activity. C, autoregulation of the NK-4 gene promoter by NK-4. The P1E2m construct contains the same upstream region as P1 except for mutations within the E2 cluster (see "Experimental Procedures").
The idea that both elements are required for NK-4 activation was tested in transient expression assays (Fig. 8). Indeed, with the wild type P1 construct, cotransfection of both NK-4 and twist expression vectors increased CAT activity, whereas mutation of the E1 cluster, the E2 cluster, or both (P1E1m, P1E2m, and P1E1mE2m) eliminated CAT activation despite the presence of the NK-4 and twist expression vectors (Fig. 8). Taken together, these results suggest multiple functions of the E-box clusters in NK-4 regulation (Fig. 9).
DISCUSSION
In the present study, we examined transcriptional control of NK-4 and show that two distinct clusters of E-box sequences mediate gene activation in visceral mesodermal cells. Several lines of evidence support the idea that the direct activation of NK-4 by twist also occurs during embryogenesis. First, the temporal and spatial expression patterns of NK-4 and twist show that both genes are expressed in the presumptive mesoderm in cellular blastoderm stage embryos and in the mesodermal layers after gastrulation (11,19,30). Moreover, the onset of NK-4 expression follows the appearance of the Twist protein, and cells that express NK-4 are included within regions that express twist. Second, in the absence of twist, NK-4 is not expressed (19), suggesting that Twist is responsible for its activation, either directly or indirectly. Third, we show that ectopic expression of Twist in cultured cells induces activation of reporter genes driven by the 5′ upstream region and the NK-4 promoter (Fig. 3). We also demonstrate that, in addition to in vitro binding of Twist protein to E-box sequences within the E1 cluster of the 5′ upstream region of the NK-4 promoter (Fig. 2), mutations within the E1 cluster abolish activation of NK-4 by Twist (Fig. 3B). Moreover, the expression of a truncated form of Twist does not activate the reporter gene (Fig. 3C), indicating that direct binding of Twist protein to the E1 cluster DNA is required for NK-4 activation. Finally, transgenic animal analysis showed that both the E1 and E2 clusters are required for the expression of NK-4 in visceral mesodermal cells, in which relatively low concentrations of Twist exist during embryogenesis (Fig. 7). Taken together with the genetic data, the results shown here establish a functional role for Twist in the direct activation of NK-4 in visceral mesodermal cells.
twist encodes a bHLH transcription factor (11) that binds to E-box (CANNTG) sequences (47,54). Domain analysis of the Twist protein using GAL4:Twist chimeras in yeast also indicated that the glutamine-rich regions contain a transcriptional activation domain (55). In vivo, twist is activated by dorsal to give a graded distribution that tails off laterally at the cellular blastoderm stage (13)(14)(15)(16). Later, during the subdivision of the mesoderm, high levels of Twist expression are maintained in cells that will give rise to somatic muscles, whereas relatively lower amounts are expressed in progenitors of other derivatives (27). Therefore, Twist may regulate different target genes through differential DNA binding affinities, depending on the concentration of the Twist protein. We showed that Twist can bind to target DNAs with different affinities (Figs. 2 and 4) and that Twist protein binds to the E1 cluster and activates NK-4 in visceral mesodermal cells (Figs. 3 and 7). Because the E1 cluster element contains strong binding sites for Twist and is required for NK-4 activation in visceral mesodermal cells, we propose that a relatively low concentration of the Twist protein may be sufficient for the recognition of the NK-4 promoter and visceral mesoderm enhancer elements such as the E1 cluster. Therefore, our results provide, for the first time, direct evidence supporting the notion that twist function is also required for the subdivision of the mesoderm during Drosophila embryogenesis (27).
FIG. 8. Both the E1 and E2 cluster are required for the activation of NK-4. Reporter plasmids containing mutations in the E1 cluster, the E2 cluster, or both (P1E1m, P1E2m, P1E1mE2m) were constructed (see "Experimental Procedures") and used for cotransfections with both NK-4 and twist expression vectors. The normalized CAT activity, obtained from cotransfection with a test reporter, was divided by the corresponding value obtained with P1 and a pRC/CMV empty vector and is shown as relative CAT activity. Reporters used are shown below the diagram.
Is the concentration gradient of Twist protein sufficient for the selection of target genes during mesodermal cell specification? Heterodimerization and post-translational modification may be other important factors for Twist function, although little is known about modification of Twist protein, such as phosphorylation, and about Twist's partner for heterodimerization. So far, our data suggest that the Twist homodimer is sufficient for DNA binding (Figs. 2 and 4) and for the activation of NK-4 that was seen following the wild type twist expression (Fig. 3B). Nevertheless, our transgenic animal data showed that another E-box cluster (the E2 cluster), which does not bind Twist homodimer, is absolutely required for NK-4 activation in visceral mesodermal cells and is functionally linked with the E1 cluster (Figs. 7 and 8). Therefore, although a certain concentration of Twist protein itself is important to select target genes (27), we prefer the possibility that other transcription factors such as NK-4, at least in the case of the visceral mesodermal cell specification, may cooperate to regulate target genes, thereby specifying the subdivision of the mesoderm.
Yet it remains to be determined whether the Twist homodimer may directly or indirectly interact with other transcription factors that bind to the E2 cluster or whether Twist may form heterodimers with unknown bHLH partners to find correct target genes during mesodermal cell differentiation. It was shown previously that dorsal-bHLH interactions are important for initiation of the embryonic mesoderm (54,56,57).
We demonstrate that the NK-4 protein binds to the E-box sequence (TCAAGTG) within the E2 cluster (Fig. 6) and autoactivates NK-4 (Fig. 5). It is worth noting that the NK-4 protein strongly binds to this target sequence rather than one containing the 5′-TAAT-3′ core. Recently, it was shown that similar binding sites were recognized by NK-2 class homeodomain transcription factors such as NK-2 (TNAAGTGG (58)), TTF-1 (TCAAGTGT (59)), CEH-22 (CGCTAAAGTG (60)), and NKx-2.5 (TNAAGTG (61)). Interestingly, all members of the NK-2 family of homeodomains have a tyrosine residue at position 54 (46,62,63), suggesting that this residue may have an important function in recognizing target DNA sequences (64). Positive autoregulation is seen in many homeobox genes (65)(66)(67)(68)(69)(70), and tissue-specific negative autoregulation is also seen in Ubx (71). Likewise, we demonstrate that NK-4 up-regulates its own gene by binding to the E2 cluster. The NK-4 protein has both activator and repressor domains,⁴ suggesting that NK-4 can act as either a transcriptional activator or repressor molecule. Indeed, NK-4 can down-regulate a specific reporter gene (Fig. 5C; P1E2m), suggesting that, depending on chromatin context, it can act as a transcriptional repressor. Yet it remains to be seen whether NK-4 down-regulates its own gene or other unknown target genes in a tissue-specific manner.
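The similarity among the NK-2 class binding sites listed above can be checked with a simple degenerate-consensus match. The sketch below is an illustration only and is not part of the analysis reported here: it takes the NKx-2.5 site TNAAGTG as a working consensus (an assumption made for this example), treats N as any base, and searches both strands. The helper names `revcomp`, `fits`, and `contains_site` are hypothetical, as is the control sequence GGTAATGG, which was invented only to show a TAAT core that does not match.

```python
# Only the IUPAC codes needed here; N matches any base.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def fits(consensus, window):
    """True if window has the consensus length and fits it base by base."""
    return len(window) == len(consensus) and all(
        base in IUPAC[code] for code, base in zip(consensus, window)
    )


def contains_site(consensus, seq):
    """True if the consensus occurs anywhere in seq on either strand."""
    seq = seq.upper()
    k = len(consensus)
    for strand in (seq, revcomp(seq)):
        for i in range(len(strand) - k + 1):
            if fits(consensus, strand[i:i + k]):
                return True
    return False


CONSENSUS = "TNAAGTG"  # NKx-2.5 site from the text, used as the working consensus

# Fully specified sites quoted in the text, plus one made-up TAAT control.
for name, site in [
    ("E2 cluster E-box", "TCAAGTG"),
    ("TTF-1", "TCAAGTGT"),
    ("CEH-22", "CGCTAAAGTG"),
    ("TAAT control (hypothetical)", "GGTAATGG"),
]:
    print(name, site, contains_site(CONSENSUS, site))
```

A match to the consensus indicates sequence similarity only; the binding of NK-4 to the E2 cluster was established experimentally by the gel-shift and footprinting assays.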
As discussed above, one function of the E2 cluster is to serve as an enhancer element for NK-4 activation in visceral mesodermal cells, as shown by the transgenic animal data (Fig. 7). Because the E2 cluster that is recognized by NK-4 also contains E-box sequences, it is conceivable that in cells that do not express NK-4, this cluster may serve as a negative regulatory element for unknown bHLH proteins. We have shown that mutation in either the E1 or E2 cluster abolished the expression of the reporter gene both in visceral mesodermal cells and in cultured cells, indicating that the two distinct clusters of E-box elements are indispensable for NK-4 activation and are functionally connected. Furthermore, in embryos carrying mutant reporters, ectopic expression was observed in a subset of somatic muscle cells (Fig. 7, D-F). These results provide a third function for the E-box clusters, namely that they are actively involved in NK-4 repression in cells that do not normally express NK-4 (Fig. 9). It is of note that the NK-3 protein acts as a transcriptional repressor and is also able to bind to the same DNA sequences as NK-4.⁴ Therefore, it is probable that the E2 cluster is responsible for NK-4 repression by other transcription factors such as NK-3 in visceral mesodermal cells.
Dorsoventral axis formation in the Drosophila embryo is controlled by a cascade of transcription factors (1, 2). Downstream target genes may use separable, but sometimes tightly linked, cis-acting regulatory elements in combination with a homologous core promoter to be expressed in a temporally and spatially regulated manner (Figs. 8 and 9; Refs. 72 and 73). Perhaps gene interactions among mesodermal genes, once activated by upstream genes such as twist, are important for these tight regulations during mesodermal cell specification (35). The recent finding that twist function is also required for the subdivision of the mesoderm (27) strengthens the importance of our results which provide the first evidence that twist function is directly required for target gene activation in visceral mesodermal cells. Inductive signals such as Dpp and Wingless from other germ layers also have pivotal roles for the subdivision of the mesoderm (3,5,(7)(8)(9)74). Since the two clusters of E-box sequences that we have characterized here only explain NK-4 activation in visceral mesodermal cells, other regulatory elements that are involved in the activation of NK-4 in other tissues such as the dorsal vessel remain to be characterized. Further characterizations of NK-4 regulation should offer insights into the complex mechanisms controlling mesodermal cell specification.