GATA-4 activates transcription via two novel domains that are conserved within the GATA-4/5/6 subfamily.

GATA-4 is one of the earliest developmental markers of the precardiac mesoderm, heart, and gut and has been shown to activate regulatory elements controlling transcription of genes encoding cardiac-specific proteins. To elucidate the molecular mechanisms underlying the transcriptional activity of the GATA-4 protein, structure-function analyses were performed. These analyses revealed that the C-terminal zinc finger and adjacent basic domain of GATA-4 is bifunctional, modulating both DNA-binding and nuclear localization activities. The N terminus of the protein encodes two independent transcriptional Activation Domains (amino acids 1-74 and amino acids 130-177). Amino acid residues were identified within each domain that are required for transcriptional activation. Finally, we have shown that regions of Xenopus GATA-5 and −6 corresponding to Activation Domains I and II, respectively, function as potent transcriptional activators. The identification and functional characterization of two evolutionarily conserved transcriptional Activation Domains within the GATA-4/5/6 subfamily suggests that each of these domains modulates critical functions in the transcriptional regulatory program(s) encoded by GATA-4, −5, and −6 during vertebrate development. As such these data provide novel insights into the molecular mechanisms that control development of the heart.

The GATA family of zinc finger transcription factors plays an important role in transducing nuclear events that modulate cell lineage differentiation during vertebrate development (for review see Refs. 1-3). Six GATA family members have been identified in vertebrate species, each of which is expressed in a developmentally regulated, lineage-restricted pattern (4-18). Within the GATA family, members of the GATA-4/5/6 subfamily of transcription factors are expressed in an overlapping pattern in the extra-embryonic endoderm, precardiac mesoderm, embryonic and adult heart, and gut epithelium (7, 12, 14, 19, 20). Functionally important GATA-binding sites have been identified in multiple cardiac-specific transcriptional regulatory regions, and overexpression of GATA-4 has been shown to transactivate these cardiac-specific transcriptional regulatory elements in noncardiac cells (21-23). In addition, expression of antisense GATA-4 transcripts in pluripotent P19 embryonal carcinoma cells blocks retinoic acid-inducible expression of genes encoding cardiac-specific myofibrillar proteins (24), whereas injection of GATA-4 mRNA into Xenopus oocytes results in the premature expression of genes encoding cardiac-specific contractile proteins (20). Moreover, GATA-4 has been implicated in regulating the differentiation of the extra-embryonic endoderm, as cystic embryoid bodies derived from GATA-4 −/− ES cells exhibit gross defects in formation of visceral and parietal endoderm (25). Thus, GATA-4 may play an important role in the transcriptional program(s) that control the differentiation of embryonic tissues (heart and possibly gut) as well as extra-embryonic tissues.
Despite their critical role in regulating lineage-specific gene expression, relatively little is currently understood about the molecular mechanisms that regulate the transcriptional activity of each GATA family member. Each vertebrate GATA factor contains two conserved type IV Cys-X2-Cys-X17-Cys-X2-Cys zinc fingers that recognize, and bind to, a related sequence motif (WGATAR) that is present in the transcriptional regulatory regions of multiple lineage-specific genes (8, 26-28). NMR analyses of the chicken GATA-1 C-terminal zinc finger and adjacent basic domain bound to DNA have revealed that the C-terminal zinc finger interacts with the major groove of DNA, and the basic domain lies in physical contact with the minor groove of the target sequence (29). Structure-function analyses of the mouse GATA-1 and human GATA-3 proteins, which control critical steps in erythroid and lymphoid development, respectively (30, 31), revealed single independent transcriptional Activation Domains within the N-terminal regions of both proteins (28, 32, 33). The GATA-1 Activation Domain (aa 1-66) is an acidic, serine-rich domain with sequence homology to other acidic Activation Domains such as that present in the herpes virus transcriptional activating protein, VP16 (32-34). In contrast, the transcriptional Activation Domain identified within GATA-3 (aa 30-74) has a neutral charge and shares little amino acid sequence homology to previously described transcriptional Activation Domains including those within the mouse and chicken GATA-1 proteins (28, 32, 33). Of note, computer-based sequence alignment algorithms fail to detect even low level amino acid sequence homology between the transcriptional Activation Domains located within GATA-1 and -3 and any region of the murine GATA-4 protein (4).
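The two sequence patterns named above, the WGATAR DNA motif and the Cys-X2-Cys-X17-Cys-X2-Cys zinc finger spacing, can be written as regular expressions. The following is a minimal sketch and not a method used in this report: the function names are hypothetical, W is read as A or T and R as A or G (the standard IUPAC codes), and the test sequences are invented for illustration.

```python
import re

# WGATAR with IUPAC codes expanded: W = A or T, R = A or G.
# The lookahead lets overlapping sites be reported.
WGATAR = re.compile(r"(?=([AT]GATA[AG]))")

# Type IV zinc finger spacing: Cys-X2-Cys-X17-Cys-X2-Cys.
ZINC_FINGER = re.compile(r"C.{2}C.{17}C.{2}C")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_gata_sites(seq):
    """List (strand, start, site) for every WGATAR match.

    Starts are 0-based. Minus-strand starts are positions in the
    reverse-complemented sequence, not in the input sequence.
    """
    seq = seq.upper()
    hits = [("+", m.start(), m.group(1)) for m in WGATAR.finditer(seq)]
    rc = reverse_complement(seq)
    hits += [("-", m.start(), m.group(1)) for m in WGATAR.finditer(rc)]
    return hits


def has_zinc_finger(protein):
    """True if the protein sequence contains the C-X2-C-X17-C-X2-C spacing."""
    return ZINC_FINGER.search(protein.upper()) is not None
```

A pattern match of this kind only flags candidate sites; whether a site is bound or functional still has to be shown experimentally, for example by EMSA.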
Given its unique pattern of expression and highly divergent structure from the previously examined GATA-1/2/3 subfamily of proteins, it was of interest to examine the molecular mechanisms that control the transcriptional activity of the GATA-4 protein. In this report, structure-function analyses were employed to map the domains responsible for transcriptional activation of the murine GATA-4 protein. These studies revealed that the C-terminal zinc finger and basic domain is bifunctional, modulating both DNA-binding and nuclear localization activities. In addition, two N-terminal transcriptional Activation Domains, which are not conserved in the GATA-1/2/3 subfamily of transcription factors, were identified. Mutational analyses revealed specific amino acid residues within each domain that are required for transcriptional activity. Finally, we have shown that both Activation Domains I and II are functionally conserved within the recently identified Xenopus GATA-5 and -6 proteins (20). Elucidation of conserved transcriptional Activation Domains within the GATA-4/5/6 subfamily provides new insights into the molecular program regulating cardiac-specific gene expression and further clarifies the molecular evolution of the GATA family of transcription factors in vertebrates.

EXPERIMENTAL PROCEDURES
Plasmids and Site-directed Mutagenesis-The pMT2-GATA-4 expression plasmid encoding the murine GATA-4 protein has been described previously (4). Of note, DNA sequence analyses revealed several sequencing errors within the previously reported 5′-end of the murine GATA-4 cDNA (4). The corrected GATA-4 cDNA has an initiation codon located 18 base pairs 5′ of the previously reported initiation codon and a shift in reading frame. The corrected deduced amino acid sequence has been recently reported (14). The p-124cTnCGH reporter plasmid contains the 124-base pair cardiac-specific cardiac troponin C (cTnC) promoter-enhancer (35) subcloned immediately 5′ of the human growth hormone (GH) reporter gene as described previously (14). The pGAL4GH reporter plasmid contains five copies of the yeast GAL4 DNA-binding consensus sequence (36) subcloned immediately 5′ of a minimal TATA-containing promoter (32) in the p0GH plasmid backbone (Nichols Institute). To identify the DNA-binding domain and nuclear localization signals within the murine GATA-4 protein, the following expression plasmids were generated via the polymerase chain reaction (PCR) and subcloned into the EcoRI site of the pMT2 vector (4) as described above: pG4/199-323 encodes a recombinant protein that contains both murine GATA-4 zinc fingers and the C-terminal basic domain, pG4/199-302 encodes a recombinant protein that contains both murine GATA-4 zinc fingers (but lacks the basic domain), pG4/251-323 encodes a recombinant protein that includes only the murine GATA-4 C-terminal zinc finger and the basic domain, and pG4/Δ199-302 encodes a GATA-4 mutant protein containing an in-frame deletion of the amino acids within the N- and C-terminal zinc fingers (see schematic representations in Fig. 1).
To determine the function of specific amino acid residues within each Activation Domain of the murine GATA-4 protein on transcriptional activation, a series of point mutations were introduced via PCR into either the cDNA encoding GATA-4 Activation Domain I (amino acids 1-74) or II (amino acids 130-177), and each of the mutated cDNAs was subcloned into the pGAL4 plasmid. The series of plasmids generated included the following: pGAL4/Y2A, pGAL4/Q3A, pGAL4/S4A, pGAL4/F26A, pGAL4/H28A, pGAL4/S29A, pGAL4/P36A, pGAL4/Y38A, pGAL4/P40A, pGAL4/Y53A, pGAL4/Q55A, pGAL4/E147A, pGAL4/Q149A, pGAL4/S157A, pGAL4/Y158A, pGAL4/Y162A, pGAL4/Y165A, and pGAL4/W172A (the suffix of each name corresponds to the point mutation of the encoded murine GATA-4 Activation Domain).
Cells, Transfections, and Growth Hormone Assays-NIH 3T3 and COS-7 cells were grown as described previously (22). NIH 3T3 cells were co-transfected with 25 μg of the indicated expression plasmid, 2.5 μg of the indicated GH reporter plasmid, and 1 μg of the pMSVβgal reference plasmid using Lipofectin reagent as described previously (22). Cells were harvested 48 h following transfection, and the medium from each plate was assayed for growth hormone using a commercially available radioimmunoassay kit (Nichols Institute). In addition, to assess transfection efficiencies, β-galactosidase assays were performed on cell lysates as described previously (37). Each co-transfection experiment was repeated in duplicate at least three times. Results are expressed as normalized growth hormone ± S.E.
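The arithmetic behind "normalized growth hormone ± S.E." can be sketched as follows. The text does not spell out the normalization formula, so this sketch assumes the common convention of dividing each GH reading by the β-galactosidase reading from the same plate and then expressing the result relative to a control transfection. All function names and the example numbers are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def normalize(gh, bgal):
    """Correct one GH reading for transfection efficiency.

    Assumption: efficiency correction is a simple ratio of the GH reading
    to the beta-galactosidase reading from the same plate.
    """
    return gh / bgal


def mean_and_sem(values):
    """Return (mean, standard error of the mean) for replicate values."""
    n = len(values)
    sem = stdev(values) / sqrt(n) if n > 1 else 0.0
    return mean(values), sem


def fold_induction(sample, control):
    """Mean normalized activity of a sample relative to the control mean."""
    return mean(sample) / mean(control)
```

For example, replicate normalized readings of 100, 110, and 90 against control readings of 1, 1, and 1 give a 100-fold induction under these assumptions.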
Electrophoretic Mobility Shift Assays (EMSAs)-Nuclear extracts were prepared 48 h following transient transfection of COS-7 or NIH 3T3 cells with 15 μg of expression plasmids encoding the indicated protein according to the procedure of Andrews and Faller (38). EMSAs were performed as described previously (35). Equal amounts of protein were loaded in each lane. The following radiolabeled double-stranded synthetic oligonucleotides were utilized: CEF-1, 5′ CCAGCCTGAGATTACAGGGAG 3′ (the GATA-binding site is underlined); GAL4, 5′ GAGCGGAGTACTGTCCTCCGAG 3′. To confirm the specificity of each nuclear protein complex, specific and nonspecific unlabeled competitor oligonucleotides were included in each binding reaction as described previously (35).
Immunohistochemistry-To identify nuclear localization signal(s) within the murine GATA-4 protein, NIH 3T3 cells plated on glass coverslips were transiently transfected with 15 μg of the indicated expression plasmid. At 48 h post-transfection, cells were fixed with 3.7% formaldehyde and incubated with a rabbit polyclonal antiserum raised against the murine GATA-4 protein (the gift of D. Wilson, Washington University). Previous experiments have demonstrated that this polyclonal antiserum recognizes the murine GATA-4 protein but not the GATA-1, -2, or -3 proteins (12). GATA-4 protein was detected with a fluorescein isothiocyanate-conjugated goat anti-rabbit IgG secondary antibody, and the slides were viewed with a Zeiss Axiophot microscope.

Localization of a Bifunctional Domain That Mediates DNA-binding Activity and Encodes a Nuclear Localization Signal-To define the minimal domain within the GATA-4 protein that confers DNA-binding activity in vitro, EMSAs were performed using nuclear extracts prepared from COS-7 cells transiently transfected with expression plasmids encoding GATA-4 deletion mutants and a radiolabeled oligonucleotide corresponding to the functionally important CEF-1/GATA motif from the murine cardiac troponin C (cTnC) transcriptional enhancer. Consistent with our previous reports (22, 35), recombinant full-length GATA-4 protein bound specifically to the radiolabeled CEF-1 oligonucleotide (Fig. 1B, lane 3). We have demonstrated previously that the lowest mobility complex (arrow) is super-shifted with GATA-4 antiserum and is abolished when unlabeled specific competitor CEF-1 oligonucleotide is added to the binding reactions (22). Both the pG4/199-323 deletion mutant (Fig. 1A), containing both zinc fingers and the adjacent C-terminal basic domain, and the pG4/251-323 deletion mutant (Fig. 1A), containing only the C-terminal zinc finger and basic domain, bound the radiolabeled CEF-1 oligonucleotide (Fig. 1B, lanes 4 and 6, arrows). In contrast, the pG4/199-302 GATA-4 deletion mutant (Fig. 1A), containing both zinc fingers, but lacking the C-terminal basic domain, failed to bind DNA (Fig. 1B, lane 5). Of note, Western blot analyses performed with GATA-4 specific antiserum and nuclear extracts prepared from the transfected COS-7 cells confirmed that the 199-302 deletion mutant was produced (data not shown). These results demonstrate that the C-terminal zinc finger and the adjacent basic domain are necessary and sufficient to confer DNA-binding activity to the GATA-4 protein.
To identify the amino acid residues that target the murine GATA-4 protein to the nucleus, a series of expression plasmids encoding GATA-4 deletion mutants were transfected into NIH 3T3 cells (which do not express GATA-4 protein), and the recombinant protein was immunocytochemically localized using rabbit polyclonal GATA-4 antiserum. Consistent with the intracellular localization of GATA-4 in cardiac myocytes (22), expression of the recombinant full-length GATA-4 protein was localized to the nuclei of 3T3 cells (Fig. 2A, yellow staining). Similarly, the pG4/199-323 deletion mutant protein, which contains both zinc fingers and the basic domain, and the pG4/251-323 GATA-4 deletion mutant protein, containing only the C-terminal zinc finger and the adjacent basic domain (the minimal DNA-binding domain), were localized exclusively to the nuclei of transfected 3T3 cells (Fig. 2, B and C, yellow staining). In contrast, diffuse localization throughout the cytoplasm and nucleus was observed in cells transfected with the pG4/Δ199-302 plasmid, which encodes a GATA-4 deletion mutant lacking both zinc fingers (Fig. 2D). Specific staining was not demonstrated in mock-transfected cells (data not shown). Taken together with the EMSAs shown in Fig. 1, these data demonstrate that the C-terminal zinc finger and basic domain are bifunctional, modulating both DNA-binding and nuclear localization activities. Of note, the finding that GATA-4 deletion mutants lacking both zinc fingers were detectable in both the nucleus and cytoplasm of transfected cells suggests that additional nuclear localization signal(s) outside the minimal DNA-binding domain may also play a role in targeting GATA-4 to the nucleus.
Deletion Analyses of the Murine GATA-4 Protein-We have reported previously that GATA-4 transactivates the 124-base pair cardiac-specific troponin C (cTnC) transcriptional promoter-enhancer in noncardiac muscle cell lines (22). To identify regions of the murine GATA-4 protein that are involved in activating transcription, a series of expression plasmids encoding GATA-4 deletion mutants were co-transfected with the p-124cTnCGH reporter plasmid into NIH 3T3 cells. Consistent with our previous studies (22), co-expression of GATA-4 with the cTnC reporter plasmid resulted in an approximately 100-fold induction in GH activity as compared with co-transfection of the cTnC reporter plasmid with the negative control plasmid, pMT2 (Fig. 3A, row 1). Similarly, co-transfection with the pG4/Δ19 expression plasmid, which lacks the N-terminal 19 amino acids of GATA-4, transactivated the GH reporter plasmid approximately 100-fold (Fig. 3A, row 2). However, co-expression of deletion mutants lacking 36 or 49 N-terminal amino acids with the cTnC GH reporter plasmid resulted in 60 and 90% reductions, respectively, in GH activity (Fig. 3A, rows 3 and 4). Moreover, a GATA-4 deletion mutant lacking 76 N-terminal amino acids (pG4/Δ76) failed to activate the p-124GH reporter plasmid above basal levels (Fig. 3A, row 5). Deletion analysis of the C terminus of the murine GATA-4 protein revealed that both the pG4/Δ420C and pG4/Δ405C expression plasmids, encoding proteins harboring C-terminal truncations of 20 and 35 amino acids, respectively, transactivated the cTnC GH reporter plasmid to levels that were comparable with those obtained with the full-length GATA-4 expression plasmid (compare Fig. 3A, rows 1, 8, and 9). In contrast, co-transfection with the pG4/Δ390 plasmid, encoding a protein with a deletion of 50 C-terminal amino acids, resulted in an 80% reduction in transcriptional activity (Fig. 3A, row 7). These data suggested that both N- and C-terminal regions of the murine GATA-4 protein may be involved in regulating its transcriptional activity.
To more finely map regions of the murine GATA-4 protein that are required to transactivate the cardiac-specific cTnC promoter, a series of expression plasmids encoding GATA-4 in-frame deletion mutants were co-transfected with the p-124GH reporter plasmid into NIH 3T3 cells. Given the lack of activity demonstrated by the pG4Δ76 plasmid (Fig. 3A, row 5), it was surprising that co-transfection with either the pG4Δ20-35 or pG4/Δ33-60 plasmids encoding GATA-4 in-frame N-terminal deletion mutants induced comparable levels of GH activity to those obtained with wild-type GATA-4 (Fig. 3B, rows 1-3). These data suggest either that in the context of the native protein aa 1-19 are absolutely required for transcriptional activity or, alternatively, that alterations in the secondary structure of the Δ76 deletion mutant mask the function of other transcriptional Activation Domains located within the murine GATA-4 protein (see below). Co-transfection with the pG4/Δ74-110 expression plasmid resulted in only a 60% reduction of transcriptional activity as compared with that observed with the pMT2-GATA-4 plasmid (Fig. 3B, row 4). Taken together, deletion analyses suggested that both the N-terminal and C-terminal regions of the GATA-4 protein may harbor transcriptional Activation Domains and that the N terminus of GATA-4 might contain multiple transcriptional Activation Domains.
To confirm that each of the GATA-4 N-terminal, C-terminal, and in-frame deletion mutants maintained the ability to bind to the CEF-1/GATA nuclear protein binding sites located within the cTnC promoter-enhancer (22, 35), EMSAs were performed with nuclear extracts prepared from COS-7 cells transiently transfected with expression plasmids encoding each GATA-4 deletion mutant (Δ19N, Δ36N, Δ49N, Δ76N, Δ323C, Δ390C, Δ420C, Δ19-36, Δ32-61, and Δ74-110) and a radiolabeled synthetic oligonucleotide corresponding to the cTnC CEF-1/GATA nuclear protein binding site. The CEF-1 oligonucleotide bound at least one nuclear protein complex present in nuclear extracts prepared from cells expressing each GATA-4 deletion mutant (Fig. 3C, lanes 4-13, bracketed) that was not present in nuclear extracts prepared from COS-7 cells transiently transfected with the pMT2 negative control plasmid (Fig. 3C, lane 2). Of note, each of these nuclear protein complexes (brackets) migrated more rapidly than the full-length GATA-4 complex (Fig. 3C, lane 3, arrow). These analyses demonstrate that each GATA-4 deletion mutant tested in the co-transfection assays (see Fig. 3, A and B) maintained the ability to bind to the cTnC transcriptional enhancer in vitro.
Identification of Independent Transcriptional Activation Domains-To identify independent transcriptional Activation Domains within the murine GATA-4 protein, a series of expression plasmids encoding chimeric proteins containing the 147-amino acid yeast GAL4 DNA-binding domain (GAL4-DBD) fused in-frame to regions of the GATA-4 protein were co-transfected into NIH 3T3 cells with the pGAL4GH reporter plasmid. Consistent with previous analyses of both the GATA-1 and -3 proteins (28, 32), an expression plasmid encoding the GAL4-DBD fused to the full-length GATA-4 protein (including its DBD) failed to transactivate the GAL4 GH reporter plasmid (data not shown). However, a 320-fold increase in GH activity was demonstrated when the pGAL4/1-204 expression plasmid, containing the N-terminal 204 amino acids of GATA-4 fused in-frame to the GAL4-DBD, was co-transfected with the GH reporter plasmid. This experiment confirmed the deletion analyses (see Fig. 3, A and B) demonstrating that the N terminus of the murine GATA-4 protein contains one or more transcriptional Activation Domains (Fig. 4A, row 2). Similarly, high levels of GH activity were demonstrated when the pGAL4/1-93 (150-fold) and pGAL4/93-204 (220-fold) expression plasmids were co-transfected with the pGAL4GH reporter plasmid (Fig. 4A, rows 3 and 5). In contrast, expression plasmids encoding GAL4-DBD/GATA-4 fusion proteins that included GATA-4 aa 59-119 or aa 334-440 (the entire C terminus of the protein beyond the zinc fingers and basic domain) failed to increase GH activity above basal levels (Fig. 4A, rows 4 and 6). Of note, the lack of GH activity demonstrated with the pGAL4/334-440 expression plasmid was not anticipated, as a 90% reduction in GH activity was demonstrated with the pG4Δ323C plasmid (Fig. 3A, row 6). This suggested either that the C terminus of the murine GATA-4 protein is necessary but is not sufficient for transcriptional activation within the context of the native GATA-4 protein or that the secondary structure of one or both of the C-terminal deletion mutants was altered from that of the native GATA-4 protein. Taken together, these data demonstrated that GATA-4 contains at least two independent transcriptional Activation Domains, both of which are located in the N terminus of the protein. In addition, they suggested that while the C terminus of the protein (aa 334-440) may be required for transcriptional activity of the native protein, it does not contain an independent transcriptional Activation Domain.
To more precisely map the number of independent Activation Domains located within the N terminus of the murine GATA-4 protein, a second series of expression plasmids encoding GAL4-DBD/GATA-4 fusion proteins were transiently co-transfected into NIH 3T3 cells with the pGAL4GH reporter plasmid. Analysis of the domain spanning aa 1-93 (see Fig. 4A, row 3) revealed that co-expression of a fusion protein containing GATA-4 aa 32-61, which spans an evolutionarily conserved serine- and proline-rich region (see Fig. 5, upper panel), or a fusion protein spanning GATA-4 aa 19-85 failed to increase GH activity above basal levels (Fig. 4B, rows 2 and 3). In fact, deletion of the N-terminal 10 amino acids of GATA-4 completely abolished the ability of the GAL4-DBD/GATA-4 fusion protein to transactivate the GAL4GH reporter plasmid, demonstrating that these 10 amino acid residues are required for functional activity of this N-terminal domain (Fig. 4B, row 4). In contrast, co-expression of a fusion protein containing GATA-4 aa 1-74 with the pGAL4GH reporter plasmid resulted in an approximately 250-fold increase in GH activity (Fig. 4B, row 5). However, further C-terminal truncation of GATA-4 to aa 1-70 resulted in an approximately 90% decrease (12-fold residual activity) in the ability of the GAL4-DBD/GATA-4 fusion protein to transactivate the GH reporter plasmid (Fig. 4B, row 6). Once again, EMSAs revealed that each of the chimeric proteins that failed to activate transcription retained the capacity to bind to the GAL4 motif (Fig. 4D and data not shown). These analyses demonstrate that amino acids 1-74, hereafter designated Activation Domain I, are necessary and sufficient to independently activate transcription in NIH 3T3 cells.
Activation Domain I (aa 1-74) has a calculated pI of 7.16 (Fig. 5, upper panel). It contains eight proline residues, eight serine residues, four tyrosine residues, and three threonine residues. This domain has been evolutionarily conserved in GATA-4 proteins from the human, mouse, and frog (Fig. 5, upper panel). Its amino acid sequence is 82 and 50% identical to human and Xenopus GATA-4 proteins, respectively (Fig. 5, upper panel). Moreover, high level amino acid sequence identity was identified between this region of the murine GATA-4 protein and regions within the recently identified chicken and Xenopus GATA-5 and -6 transcription factors (12, 20) (Fig. 5, upper panel). Within Activation Domain I, subdomains spanning amino acids 1-11, 35-47, and 52-57 demonstrate high level sequence identity with regions located in the N terminus of the GATA-5 and -6 proteins, respectively (Fig. 5, upper panel, gray boxes). Interestingly, each of these regions contains a conserved tyrosine residue (Fig. 5, upper panel, arrowheads), as well as conserved serine and proline residues. Given the deletion analysis presented in Fig. 4B, which suggested that aa 70-74 are critical to the function of this domain, it is noteworthy that both aa residues 70 and 71 are serine residues. Overall, the proline-rich nature and neutral charge of this domain suggested that this region of the GATA-4 protein may be subclassified into the proline-rich family of transcriptional Activation Domains (34). Of note, protein sequence analysis software failed to detect even low level sequence homology between this domain of the GATA-4 protein and the GATA-1, -2, or -3 proteins including the previously characterized transcriptional Activation Domains of GATA-1 and -3 (28, 32, 33) (data not shown).
Fig. 3 (legend; beginning not recovered). ... expression plasmids encoding GATA-4 N- and C-terminal deletion mutants (schematically represented on the left) were transiently co-transfected with 2.5 μg of p-124cTnCGH reporter plasmid and 1 μg of the pMSVβgal reference plasmid into NIH 3T3 cells. 48 h following transfection the cell media were assayed for growth hormone activity, and the growth hormone activity was normalized to that obtained in cells co-transfected with the pMT2-GATA-4 expression plasmid. Data are expressed as mean ± S.E. B, a series of expression plasmids encoding GATA-4 in-frame deletion mutants (schematically represented on the left) were transiently co-transfected with the p-124cTnCGH plasmid as described above. C, confirmation that each GATA-4 deletion mutant retained the capacity to bind to the previously described CEF-1/GATA-binding site in the cTnC transcriptional enhancer (22). EMSAs were performed using nuclear extracts prepared from COS-7 cells that were transiently transfected with expression plasmids encoding each GATA-4 deletion mutant (indicated above each lane) and a radiolabeled double-stranded synthetic oligonucleotide corresponding to the cTnC CEF-1 nuclear protein binding site. The nuclear protein complex that corresponds to the full-length GATA-4 protein in lane 3 is indicated by an arrow to the left of the autoradiographic image. The portion of the gel containing each deletion mutant is bracketed.
Fig. 4 (legend; beginning not recovered). ... Transient co-transfections were performed as described above. The minimal activation domain identified was located between aa 1 and 74. C, fine mapping of transcriptional activation domains located in GATA-4 between aa 94 and 204. Transient co-transfections were performed as described above. The minimal activation domain identified was located between aa 130 and 177. D, DNA-binding activities of GAL4-DBD/GATA-4 fusion proteins. To confirm that each chimeric GAL4/GATA-4 fusion protein retained the capacity to bind to the yeast GAL4 nuclear protein binding site, EMSAs were performed using nuclear extracts prepared from NIH 3T3 cells transiently transfected with selected expression plasmids encoding GAL4/GATA-4 fusion proteins (indicated above each lane) and a radiolabeled double-stranded synthetic oligonucleotide corresponding to the consensus GAL4 binding site. Each recombinant fusion protein described retained the capacity to bind the GAL4 DNA-binding site.
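Percent identity values such as the 82 and 50% quoted for Activation Domain I are computed from a pairwise alignment. The sketch below shows one common convention: identical positions divided by the number of aligned, ungapped positions. The alignment software and the exact denominator used in this report are not stated, so both the convention and the function name are assumptions, and the example strings in the usage note are invented.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where either sequence has a gap are excluded from the
    denominator (one convention among several in use).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns where both sequences have a residue.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)
```

With the toy inputs `"ABCD"` and `"ABXX"` the function returns 50.0. Counting gapped columns in the denominator instead would lower the value for alignments that contain gaps.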
Further analysis of the domain spanning aa 93-204 of the murine GATA-4 protein (see Fig. 4C, row 1) revealed that a fusion protein including GATA-4 aa 93-155 failed to increase GH activity above basal levels (Fig. 4C, row 2). Similarly, co-expression of a fusion protein containing GATA-4 aa 151-204 resulted in a 90% reduction (21-fold versus 225-fold induction) in GH activity compared with that obtained following co-transfection of the pGAL4/93-204 plasmid (Fig. 4C, rows 1 and 3). In contrast, the pGAL4/130-177 plasmid transactivated the GH reporter plasmid approximately 125-fold above levels obtained with the pGAL4 negative control plasmid (Fig. 4C, row 4). Once again, EMSAs confirmed that each of these fusion proteins retained the capacity to bind to the GAL4 motif (Fig. 4D and data not shown). Thus, amino acids 130-177, hereafter designated Activation Domain II, are sufficient to produce high level transcriptional activity in this heterologous system. Activation Domain II (aa 130-177) has a calculated pI of 6.20 (Fig. 5, lower panel). It contains three proline, five serine, four tyrosine, and one glutamine residues. It is conserved across species, demonstrating 43 and 38% sequence identity to the chicken and Xenopus GATA-4 proteins, respectively (10, 12) (Fig. 5, lower panel). High grade amino acid sequence identity was demonstrated between GATA-4 Activation Domain II and three regions (aa 145-150, 156-166, and 172-176) located at the N terminus of the Xenopus and chicken GATA-5 and -6 proteins (Fig. 5, lower panel, conserved subdomains are shown in gray). The conserved GREDQYG motif (aa 145-150) bears no identifiable sequence homology to Activation Domain I. In contrast, the conserved proline- and serine-rich GSYSSPYPAYM motif (aa 156-166) bears low level sequence homology to the SSPVYVPTP subdomain (aa 34-42) identified within Activation Domain I. Once again, within each of these subdomains proline, serine, and tyrosine residues (Fig. 5, lower panel, arrowheads) are conserved throughout the GATA-4/5/6 subfamily of transcription factors. As with Activation Domain I, sequence homology could not be detected between GATA-4 Activation Domain II and any region of the GATA-1, -2, or -3 proteins (data not shown).
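The percent reductions quoted for the fusion constructs follow directly from the fold-induction values. As a worked sketch (the function name is hypothetical), the 21-fold versus 225-fold comparison reported above corresponds to roughly a 90% reduction:

```python
def percent_reduction(reference_fold, test_fold):
    """Percent loss of activity of a test construct relative to a reference.

    Both arguments are fold inductions over the same negative control.
    """
    return 100.0 * (1.0 - test_fold / reference_fold)


# pGAL4/93-204 (225-fold) versus the aa 151-204 fusion (21-fold).
reduction = percent_reduction(225.0, 21.0)
print(f"{reduction:.1f}% reduction")
```

A construct with no activity above the reference gives 0, and a construct at zero fold induction gives 100.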
Mutational Analyses of Transcriptional Activation Domains I and II-The demonstration that specific amino acid residues within Activation Domains I and II were conserved across species and within each member of the GATA-4/5/6 subfamily of transcription factors suggested that some, or all, of these conserved amino acid residues might mediate critical functions required for transcriptional activity of the GATA-4, -5, and -6 proteins. To determine whether these conserved amino acid residues affected the function of Activation Domains I and II, a series of expression plasmids was generated encoding chimeric proteins in which the yeast GAL4-DBD was fused in-frame to the murine GATA-4 Activation Domains I or II, respectively, containing single point mutations. Each expression plasmid was transiently co-transfected with the pGAL4GH reporter plasmid into NIH 3T3 cells, and the observed GH activity was compared with that obtained with the pGAL4/1-74 (encoding Activation Domain I) or pGAL4/130-177 (encoding Activation Domain II) expression plasmids, respectively. Functional analysis of Activation Domain I revealed that multiple discrete point mutations within Activation Domain I, including Q3A, F26A, H28A, Y38A, P40A, Y53A, and Q55A, resulted in greater than an 80% reduction in transcriptional activity compared with that obtained with the native murine GATA-4 Activation Domain I (Fig. 6A, lanes 3, 5, 6, and 9-12). In contrast, the Y2A and S4A mutations had relatively little effect on transcriptional activity (Fig. 6A, lanes 2 and 4), whereas the P36A mutation resulted in a 65% reduction in transcriptional activity (Fig. 6A, lane 8). EMSAs revealed that each of the encoded fusion proteins was expressed and retained the capacity to bind to the yeast GAL4-DBD probe (data not shown). Thus, most, but not all, of the evolutionarily conserved amino acid residues in Activation Domain I are required for transcriptional activation in this heterologous system.
Functional analysis of Activation Domain II revealed that only one point mutant (W172A) decreased transcriptional activity greater than 90% compared with the control Activation Domain II expression plasmid (Fig. 6B, lane 9). In addition, the Y158A and Y162A mutations decreased transcriptional activity by greater than 60% (Fig. 6B, lanes 6 and 7), whereas the E147A and Y149A mutants had a modest effect on transcriptional activity, decreasing GH activity by 45 and 35%, respectively (Fig. 6B, lanes 2 and 4). Finally, the Q148A, S157A, and Y165A mutations failed to decrease transcriptional activity (Fig. 6B, lanes 3, 5, and 8). Once again, EMSAs revealed that each of the encoded fusion proteins was expressed and retained the capacity to bind to the yeast GAL4-DBD (data not shown). Taken together, mutational analyses revealed that each evolutionarily conserved glutamine residue (Q3A and Q55A), two tyrosine residues (Y38A and Y53A), one phenylalanine residue (F26A), and one histidine residue (H28A) in Activation Domain I were required for high level transcriptional activity. In contrast, only mutation of the evolutionarily conserved tryptophan residue at position 172 in Activation Domain II decreased transcriptional activity by greater than 90%, whereas mutations of the conserved tyrosine residues at positions 158 and 162 attenuated the observed GH activity by approximately 60%.
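The point-mutant labels used throughout (for example Y2A or W172A) follow the usual convention of wild-type residue, 1-based position, and replacement residue. A minimal sketch of parsing such a label and applying it to a sequence is given below. The helper names are hypothetical, and the four-residue sequence in the usage note is a toy string chosen only to agree with the labels Y2A, Q3A, and S4A under the assumption that residue 1 is methionine.

```python
import re

# One-letter wild-type residue, 1-based position, one-letter replacement.
_MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(label):
    """Split a label such as 'W172A' into ('W', 172, 'A')."""
    match = _MUTATION.match(label)
    if match is None:
        raise ValueError(f"not a point-mutation label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def apply_mutation(sequence, label):
    """Return the sequence with the substitution applied.

    Raises ValueError if the position is out of range or the wild-type
    residue in the label does not match the sequence.
    """
    wild_type, position, replacement = parse_mutation(label)
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[position - 1] != wild_type:
        raise ValueError(f"expected {wild_type} at position {position}")
    return sequence[:position - 1] + replacement + sequence[position:]
```

For instance, `apply_mutation("MYQS", "Y2A")` returns `"MAQS"`. Checking the wild-type residue before substituting catches numbering errors, which matter here because the corrected GATA-4 cDNA shifted the initiation codon.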
Activation Domains I and II Are Conserved within the GATA-4/5/6 Subfamily-The demonstration of amino acid sequence homology to Activation Domains I and II within the recently identified GATA-5 and -6 proteins suggested the hypothesis that each of these domains would be conserved within each member of the GATA-4/5/6 subfamily of proteins throughout vertebrate evolution. To determine whether regions in the Xenopus GATA-5 and -6 proteins corresponding to murine GATA-4 Activation Domains I and II (see Fig. 5) function in vivo as bona fide transcriptional activators, chimeric proteins composed of the yeast GAL4-DBD fused in-frame to regions of the Xenopus GATA-5 and -6 proteins corresponding to mouse GATA-4 Activation Domains I and II, respectively, were coexpressed with the pGAL4GH reporter plasmid in NIH 3T3 cells. As shown in Fig. 7, the pGAL4/XG5Act I (Xenopus GATA-5 aa 1-80) and pGAL4/XG5Act II (Xenopus GATA-5 aa 100-160) expression plasmids encoding chimeric proteins spanning the putative Activation Domains I and II of the Xenopus GATA-5 protein transactivated the GH reporter plasmid 165- and 184-fold, respectively (Fig. 7, rows 3 and 4). Similarly, the pGAL4/XG6Act I (Xenopus GATA-6 aa 1-103) and pGAL4/XG6Act II (Xenopus GATA-6 aa 100-160) expression plasmids encoding chimeric proteins spanning the putative Activation Domains I and II of the Xenopus GATA-6 protein transactivated the GH reporter 220- and 66-fold, respectively (Fig. 7, rows 5 and 6). This level of GH activity was comparable with that obtained following co-transfection of the pGAL4/1-204 GATA-4 expression plasmid in this experiment (Fig. 7, lane 2). Taken together, these data demonstrate that the N-terminal regions of the GATA-4, -5, and -6 proteins contain two independent, evolutionarily conserved transcriptional activation domains.
DISCUSSION
GATA-4 appears to play an important role in directing cell lineage-specific gene expression during development of the vertebrate heart (20-23, 25, 39). In this report, we have performed structure-function analysis of the murine GATA-4 protein and mapped the regions of the protein that control DNA binding, nuclear localization, and transcriptional activation. These studies revealed that the C-terminal zinc finger and adjacent basic domain of GATA-4 is bifunctional, modulating both DNA-binding and nuclear localization activities. In addition, the N terminus of the GATA-4 protein contains two independent transcriptional activation domains that share no identifiable amino acid sequence homology to the transcriptional activation domains identified previously in GATA-1 and -3, respectively (28,32,33). Mutational analyses defined specific amino acid residues within each of these domains that are required for transcriptional activation. Finally, we have shown that both Activation Domains I and II of the murine GATA-4 protein have been conserved across species and within each member of the recently identified GATA-4/5/6 subfamily of zinc finger transcription factors.
Current paradigms suggest that transcription factors are modular proteins with specific domains encoding distinct functions (28,32,33). The demonstration that the conserved C-terminal zinc finger and basic domain of GATA-4 is necessary and sufficient to confer both DNA-binding activity and nuclear localization has not been recognized previously. Conservation of a single domain encoding both DNA-binding and nuclear localization activities represents an efficient mechanism whereby a single modular domain may encode a specific function (DNA-binding) and target its subcellular location (the nucleus). The fact that each of these domains is conserved in yeast single finger GATA proteins demonstrates that this important bifunctional domain has been conserved throughout ancient evolution.
Each point mutant is shown to the left of the bar graph. Point mutations were introduced into the indicated conserved amino acid residues within Activation Domain I by PCR-mediated site-directed mutagenesis. Each GATA-4 Activation Domain I point mutant was subcloned into the pGAL4 expression plasmid, and the resulting plasmid was cotransfected with the pGAL4 GH reporter plasmid into NIH 3T3 cells, and the relative GH activity observed was compared with that obtained with the pGAL4Act I (aa 1-74) plasmid, encoding Activation Domain I. The data are expressed as relative GH activity ± S.E. B, mutational analyses of Activation Domain II. Mutations were introduced into the indicated conserved amino acid residues in Activation Domain II. Each GATA-4 Activation Domain II point mutant was subcloned into the pGAL4 expression plasmid, and the resulting plasmid was co-transfected with the pGAL4 GH reporter plasmid into NIH 3T3 cells, and the relative GH activity observed was compared with that obtained with the pGAL4Act II (aa 130-177) plasmid, encoding Activation Domain II. The data are expressed as relative GH activity ± S.E.
In contrast, computer homology searches revealed that the two N-terminal transcriptional activation domains identified within the murine GATA-4 protein only share amino acid sequence homology with the recently identified GATA-5 and -6 proteins and not with other GATA factors (data not shown). As discussed below, these data strongly suggest that the N terminus of GATA-4 encodes a novel function that is shared only with the closely related GATA-5 and -6 proteins. Moreover, the structural organization of GATA-4, -5, and -6 suggests that this subfamily of zinc finger transcription factors evolved via duplication and subsequent diversification of a common ancestral gene that displayed a structure similar to GATA-4. In support of this theory, preliminary characterization of the murine GATA-4, -5, and -6 genes has revealed common intron-exon boundaries.²
Although a great deal is currently understood about the function of the zinc finger DNA-binding domains of each GATA family member (28,29,32,33), relatively little is currently understood about the molecular mechanisms by which each GATA family member activates transcription. In fact, it remains unclear whether the transcriptional activation domains in each GATA factor function as interchangeable modules or, alternatively, direct the unique cell lineage-specific developmental program encoded by each vertebrate GATA factor. With respect to this question, two transcriptional activation domains were identified within the murine GATA-4 protein as assessed by transient transfection analyses of GAL4-DBD/GATA-4 fusion proteins (see Fig. 4). In most cases these data confirmed transient transfection analyses of GATA-4 deletion mutants.
However, the precise function of the C terminus of GATA-4 remains unclear as deletion analyses of the native protein suggested that this region of the protein is required for transcriptional activation (Fig. 3A, lane 6), whereas analyses of GAL4/GATA-4 fusion proteins revealed that aa 334-440 do not contain an independent transcriptional activation domain (Fig. 4A, lane 6). This apparent inconsistency underscores the inherent limitation of these analyses as the lack of transcriptional activity demonstrated by either the GAL4/GATA-4 fusion protein and/or the GATA-4 deletion mutant could result from an alteration(s) in the secondary structure of either recombinant protein. However, each recombinant protein retained the capacity to bind DNA as assessed by EMSA suggesting that gross alterations in the secondary structure of the C terminus of the proteins did not occur. An alternative explanation for these findings is that the C terminus of the murine GATA-4 protein may be necessary, but is not sufficient, for transcriptional activation within the context of the native GATA-4 protein.
It is noteworthy that Activation Domains I and II share no identifiable amino acid sequence homology with the transcriptional activation domains identified previously within GATA-1 and -3 (28,32), strongly suggesting that each of these domains is functionally distinct from those identified previously within the GATA-1/2/3 subfamily of transcription factors. Conversely, a high degree of amino acid sequence homology was identified between both Activation Domains I and II and subdomains located within the N terminus of the recently identified Xenopus GATA-5 and -6 proteins (20). Furthermore, N-terminal domains corresponding to Activation Domains I and II of the Xenopus GATA-5 and -6 proteins, respectively, function as potent transcriptional activators in vivo. As such, these data provide a potential mechanism for the functional differences between GATA-4, -5, and -6 and the previously characterized members of the GATA-1/2/3 subfamily of proteins. In this regard it is noteworthy that (i) forced expression of GATA-4 is not sufficient to activate the endogenous program directing cardiac-specific gene expression in noncardiac muscle cells (22), and (ii) other cardiac-restricted transcription factors including Nkx2.5 (40-43) and MEF2 (44-51) are necessary, but not sufficient, to activate cardiac-specific transcriptional regulatory elements. Taken together, these data are consistent with a model wherein Activation Domains I and/or II bind directly to cardiac-specific transcriptional co-activators and in concert coordinate the developmentally regulated pattern of cardiac muscle-specific gene expression. Such a combinatorial model involving both protein-protein and protein-DNA interactions has been demonstrated in other cell lineages. For example, GATA-1 binds to, and functionally synergizes with, the erythroid-specific LIM domain protein RBTN2 to activate expression of erythroid-specific genes (52). Similarly, the MADS box transcription factor MEF2 binds to, and synergistically activates, skeletal muscle-specific transcription with members of the basic helix-loop-helix family of myogenic transcriptional activators (48). Thus, identification of the transcription factors that bind to Activation Domains I and II should provide fundamental insights into transcriptional programs that regulate cardiac development.
² H. Ip, E. Morrisey, and M. Parmacek, unpublished observations.
FIG. 7. Activation Domains I and II are conserved through evolution within the GATA-4/5/6 subfamily of transcription factors. 25 μg of expression plasmid encoding chimeric GAL4-DBD/GATA-4 fusion protein (schematically represented to the left), 2.5 μg of the pGAL4GH reporter plasmid, and 1 μg of the pMSVβgal reference plasmid were transiently co-transfected into NIH 3T3 cells. The GAL4-DBD is shown as a hatched box. The regions of the Xenopus GATA factor (XG-5 and XG-6) or murine GATA-4 protein corresponding to Activation Domain I (Act I) or Activation Domain II (Act II) are shown as black boxes with the corresponding amino acid residues indicated above each box. 48 h post-transfection, cell media were assayed for GH and β-galactosidase activities. The data are reported as normalized GH activity ± S.E.
Many studies have demonstrated that transcription factors are post-translationally modified and activated (or suppressed) in response to specific intracellular signals. For example, phosphorylation of the MADS box transcription factors, MEF2C and SRF, has been shown to enhance the DNA-binding and transcriptional activity of each of these proteins (53-55). In this regard it is noteworthy that GATA-1 is phosphorylated in vivo and that a single serine residue is differentially phosphorylated during the dimethyl sulfoxide-induced differentiation of MEL cells (56). Thus, it is striking that both Activation Domains I and II of the murine GATA-4 protein are centered around evolutionarily identical tyrosine and serine residues (Fig. 3). In fact, Activation Domain I contains eight serine and four tyrosine residues, and Activation Domain II contains five serine and four tyrosine residues. Moreover, mutations of some of these tyrosine residues (Tyr-38, Tyr-53, Tyr-158, and Tyr-162), or deletion of serine 171, severely decreased transcriptional activity of each domain (Figs. 2B and 4). These data are consistent with the hypothesis that phosphorylation of one or more of these residues in the GATA-4, -5, and -6 proteins may play an important role in regulating their respective activities. Thus, it will be important to carefully characterize the role of phosphorylation in regulating the DNA-binding and transcriptional activity of the GATA-4/5/6 subfamily of transcription factors. The determination of whether GATA-4 is post-translationally modified in vivo may provide novel insights into the signal transduction pathways that regulate cardiac development.
The observation that the recently identified GATA-4/5/6 subfamily of proteins shares both conserved DNA-binding and transcriptional activation domains raises the question of whether (unlike the GATA-1, -2, and -3 proteins) this subfamily of transcription factors mediates redundant functions in the vertebrate embryo. However, several recent reports argue strongly that this is not the case. For example, Soudais and co-workers (25) observed that GATA-4 −/− ES cells exhibit gross defects in the formation of visceral and parietal endoderm. Moreover, we have observed that while GATA-4 and -6 are developmentally co-expressed in the precardiac mesoderm and embryonic heart (14), GATA-5 has a temporally and spatially distinct pattern of expression from that of GATA-4 and -6 during embryonic cardiac development (19). In addition, we have reported that the genes encoding the murine GATA-4, -5, and -6 proteins are each expressed in a unique cell lineage-restricted pattern in tissues including the lung, bladder, and vascular smooth muscle cells (14,19). Taken together, these data strongly suggest that while GATA-4 and -6 may subserve partially, or completely, redundant functions in the developing vertebrate heart, each member of the GATA-4/5/6 subfamily of transcription factors performs a unique function during vertebrate development. As such, elucidation of the molecular mechanisms by which Activation Domains I and II function in different cellular contexts should provide novel insights into the transcriptional regulatory programs mediated by each member of this recently identified subfamily of zinc finger transcription factors.