DTEF-1, a novel member of the transcription enhancer factor-1 (TEF-1) multigene family.

M-CAT motifs mediate muscle-specific transcriptional activity via interaction with binding factors that are antigenically and biochemically related to vertebrate transcription enhancer factor-1 (TEF-1), a member of the TEA/ATTS domain family of transcription factors. M-CAT binding activities present in cardiac and skeletal muscle tissues cannot be fully accounted for by existing cloned isoforms of TEF-1. TEF-1-related cDNAs isolated from heart libraries indicate that at least three classes of TEF-1-related cDNAs are expressed in these and other tissues. One class are homologues of the human TEF-1 originally cloned from HeLa cells (Xiao, J. H., Davidson, I., Matthes, H., Garnier, J. M., and Chambon, P. (1991) Cell 65, 551-568). A second class represents homologues of the avian TEF-1-related gene previously isolated (Stewart, A. F., Larkin, S. B., Farrance, I. K., Mar, J. H., Hall, D. E., and Ordahl, C. P. (1994) J. Biol. Chem. 269, 3147-3150). The third class consists of a novel, divergent TEF-1 cDNA, named DTEF-1, and its preliminary characterization is described here. Two isoforms of DTEF-1 (DTEF-1A and DTEF-1B) were isolated as 1.9-kilobase pair clones with putative open reading frames of 433 and 432 amino acids whose differences are attributable to alternative splicing at the C terminus of the TEA DNA binding domain. Cardiac muscle contains high levels of DTEF-1 transcripts, but unexpectedly low levels are detected in skeletal muscle. DTEF-1 transcripts are present at intermediate levels in gizzard and lung, and at low levels in kidney. DTEF-1A is a sequence-specific M-CAT-binding factor. The distinct spatial pattern of expression, and unusual amino acid sequence in its DNA binding domain, may indicate a particular role for DTEF-1 in cell-specific gene regulation. Recent work also suggests that at least one more TEF-1-related gene exists in vertebrates. 
We propose a naming system for the four TEF-1 gene family members identified to date that preserves existing nomenclature and provides a means for extending that nomenclature as additional family members may be identified.

Upon differentiation, the cytoplasm of myogenic progenitor cells is converted to a highly organized sarcomeric array. This process is dependent upon the activation and expression of genes in a specific temporal sequence that is coordinated by transcription factors. In developing skeletal muscle, for example, the expression of most skeletal muscle-specific genes is dependent upon the myoD family of transcription factors (also known as myogenic determination factors or MDFs¹; for review see Refs. 1-3). MDFs are muscle-specific proteins of the basic helix-loop-helix (bHLH) superfamily that form heterodimers with ubiquitous bHLH nuclear proteins, and thereby bind E boxes in muscle gene promoters. MDFs can also activate the skeletal myogenic program in many non-muscle cells, converting them to the skeletal muscle phenotype (4-6). MDFs are not present in cardiac muscle (7).
Many cardiac gene promoters, on the other hand, are regulated through DNA sequences known as M-CAT motifs (5'-CATTCCT-3'). M-CAT-dependent promoter activity was first described in the chicken cardiac troponin T gene (8,9,31). M-CAT sites have been shown to be involved in the regulation of the β-myosin heavy chain, α-myosin heavy chain, cardiac troponin C, skeletal α-actin, and β-acetylcholine receptor genes (8-13). The M-CAT motif is bound by M-CAT binding factor(s), or MCBF, which is enriched in the nuclei of striated muscle, but is also present in non-muscle tissues (9,14). All muscle MCBFs have been shown to be antigenically and biochemically related to TEF-1, transcription enhancer factor-1 (14,32).
TEF-1 was first cloned from HeLa cells where it binds the M-CAT-related GTIIC and Sph elements in the SV40 enhancer (15-17), whose sequences are variations of the canonical M-CAT motif (8,14). TEF-1 is a member of the TEA/ATTS domain family of transcription factors that are characterized by a structurally conserved DNA binding domain (18,19). The TEA domain has been found in the amino-terminal regions of a class of regulatory proteins and is conserved across many species. The yeast regulatory protein TEC1 is involved in the activation of the Ty1 retrotransposon (20), ABAA regulates conidiation in Aspergillus nidulans and terminates vegetative growth (21), and the Drosophila gene scalloped plays an important role in neurodifferentiation (22). Functional mammalian TEF-1 is found in the 2-8-cell stage in early mouse development (23), and insertional knockout of the murine homologue of the human TEF-1 gene results in embryonic lethality accompanied by hypoplasia and/or degeneration of the ventricular myocardium (24).
Previously, we cloned a class of TEF-1-related cDNAs from heart and skeletal muscle that includes at least four alternatively spliced cDNA products (designated TEF-1A, -1B, -1C, and -1D) of the same gene. TEF-1A mRNAs are expressed in many tissues but are enriched in both cardiac and skeletal muscle. In addition, isoproteins encoded by TEF-1A and -1B cDNAs are bona fide M-CAT binding factors. TEF-1B, at least, can activate transcription when linked to a heterologous DNA binding domain (25). Muscle tissues contain at least three protein·M-CAT complexes on electrophoretic mobility shift assays, one of which appears to be muscle-specific and up-regulated upon differentiation (32). All muscle protein·M-CAT complexes contain TEF-1 proteins, but the latter muscle-enriched complex contains TEF-1A and not other TEF-1 related proteins (32).
Additional TEF-1-related cDNAs might, therefore, account for the multiple protein·M-CAT complexes found in muscle and non-muscle tissues. We have identified a new, divergent class of TEA domain gene that we name DTEF-1, for divergent TEF-1. DTEF-1 is a sequence-specific M-CAT binding factor whose mRNA is highly enriched in cardiac muscle. Thus, DTEF-1 may be involved in the cardiac-specific regulation of M-CAT-dependent promoters.

EXPERIMENTAL PROCEDURES
Polymerase Chain Reaction Cloning and cDNA Library Screening-Complementary DNA synthesized from chick retina RNA was used for polymerase chain reaction amplification of TEF-1 cDNA products using the following primers: sense, 5'-TGG AG(T/C) CCN GA(T/C) AT(T/C) AGA (A/G)CA-3'; antisense, 5'-ATC AT(A/G) TA(T/C) TC(A/G) CAC AT(G/T/C) GG-3'. The polymerase chain reaction was conducted in an automated thermal cycler for 35 cycles under the following conditions: 1 min at 94°C, 1 min 30 s at 55°C, and 1 min 30 s at 72°C. A 900-bp fragment was isolated and used to screen an adult chick heart cDNA library (Clontech). Five TEF-1 cDNAs were isolated, and one of these was used to screen a seven week chick heart cDNA library (1 × 10⁶ recombinants, Stratagene) under reduced stringency conditions (40% formamide, 40°C, 5× SSC, 5× Denhardt's solution, 0.1% SDS, 100 μg/ml salmon sperm DNA). Clones were sequenced on both strands using the Sequenase kit (U. S. Biochemical Corp.) or a dideoxy Taq DNA sequencer (Applied Biosystems). Sequence comparisons were performed using the GAP program of the Wisconsin Sequence Analysis Package (Genetics Computer Group).
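As an illustrative aside, the degeneracy of the two primers (the number of distinct oligonucleotides in each mixture) can be counted directly from the notation used above. The following is a minimal Python sketch and not part of the original protocol: the function names are our own, and it assumes that N denotes any of the four bases and that parenthesized alternatives are separated by slashes.

```python
import re
from math import prod

# Primer strings as written in the text (spaces separate codons only).
SENSE = "TGG AG(T/C) CCN GA(T/C) AT(T/C) AGA (A/G)CA"
ANTISENSE = "ATC AT(A/G) TA(T/C) TC(A/G) CAC AT(G/T/C) GG"


def parse_degenerate_primer(primer):
    """Return one set of allowed bases per primer position.

    Assumes "(X/Y)" lists alternatives and "N" means A, C, G, or T.
    """
    seq = primer.replace(" ", "").upper()
    positions = []
    for alternatives, single in re.findall(r"\(([ACGT/]+)\)|([ACGTN])", seq):
        if alternatives:
            positions.append(set(alternatives.split("/")))
        elif single == "N":
            positions.append(set("ACGT"))
        else:
            positions.append({single})
    return positions


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    return prod(len(bases) for bases in parse_degenerate_primer(primer))


if __name__ == "__main__":
    for name, primer in (("sense", SENSE), ("antisense", ANTISENSE)):
        length = len(parse_degenerate_primer(primer))
        print(f"{name}: {length}-mer, {degeneracy(primer)}-fold degenerate")
```

Under these assumptions the sense primer is a 21-mer representing 64 sequences and the antisense primer is a 20-mer representing 24 sequences.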
Northern Blot Analysis-Total RNA from embryonic day 12 chick tissues was isolated using standard methods (26). Twenty-five μg of total RNA was electrophoresed on a 0.8% agarose, 1.1 M formaldehyde gel, and transferred to a Hybond N membrane (Amersham Corp.). Cross-linking, prehybridization, and hybridization were as described by the manufacturer. A DTEF-1 specific probe was used for hybridization consisting of a 366-bp EcoRI-PvuII fragment (nucleotides 1-366) that was labeled by random priming (Rediprime, Amersham) with [³²P]dCTP (Amersham, 6000 Ci/mmol) to a specific activity of 10⁹ cpm/μg. Membranes were hybridized for 18 h and then washed at 42°C with 2× SSC, 0.1% SDS, and at 65°C with 0.1× SSC, 0.1% SDS prior to autoradiography. Ethidium bromide staining of the gel prior to transfer was used to verify the equivalence of total RNA loads.
In Vitro Transcription/Translation and Gel Mobility Shift Assay-The SmaI/EcoRI (nucleotides 388 to 3' end) fragment of DTEF-1A was ligated in frame into PvuII/EcoRI digested and phosphatase-treated pRSET vector (Invitrogen). Linearized plasmids were transcribed with T7 RNA polymerase according to the manufacturer (Promega). One microgram of capped RNA was then translated in 20 μl of reticulocyte lysates (Promega). Gel shifts were performed using 5 μl of in vitro translated protein, as described previously (25).

RESULTS
Identification of Multiple Classes of TEF-1-related cDNAs-An adult chicken heart cDNA library was screened at low stringency using a TEF-1 cDNA probe. Four cDNA clones were isolated whose nucleotide sequence was determined to represent overlapping cDNA segments derived from one gene with higher amino acid homology to human TEF-1 (97% identity with human TEF-1) than to TEF-1-related cDNAs previously isolated from avian heart and skeletal muscle (25 and data not shown). Because the nucleotide differences between these two classes of cDNA were found throughout their length, we concluded that there are at least two TEA domain genes within the avian genome. To preserve but extend the extant TEF nomenclature (Table I), we designated as NTEF-1 (Nominal TEF-1; that from which the name is derived) those genes from mouse, rat, and chick most homologous to the human TEF-1 cloned from HeLa cells² (10,15,27). Since avian NTEF-1 was more related to human NTEF-1 than to our previously identified chick TEF-1-related cDNAs (96.5% versus 76%; 25), we redesignated these latter cDNAs and any homologous genes/cDNAs from other species as RTEF-1 (Related to TEF-1).
To identify further TEF-1-related genes, and in particular those that might be preferentially expressed during cardiac development, a partial avian NTEF-1 cDNA was used to screen a seven week chicken heart cDNA library under reduced stringency conditions. Of nine cDNA clones isolated, four had nucleotide sequences corresponding to RTEF-1. The nucleotide sequence of five other clones were divergent from that of either NTEF-1 or RTEF-1 ( Fig. 1 and data not shown) indicating that a third TEA-domain gene is present in the avian genome. We designate this new TEF-1-related gene DTEF-1 (Divergent TEF-1) to indicate its higher degree of divergence from avian NTEF-1 (see "Discussion").
All five apparently full-length (1.9-kb) DTEF-1 cDNAs were identical except for a specific segment that suggests alternative splicing of the primary transcript of the DTEF-1 gene (Fig. 1, A and D), generating predicted isoforms designated DTEF-1A and DTEF-1B. The deduced amino acid sequence of DTEF-1A (Fig. 1B) yields a 433-amino acid polypeptide that is 72% identical to both avian NTEF-1 and RTEF-1A.
The TEA domain is highly conserved throughout evolution (18,19), and there is only a single amino acid change within this domain between the Drosophila TEA domain gene scalloped and the human-, mouse-, or chick-derived NTEF-1s (22) (Fig. 1C). NTEF-1 and RTEF-1 are identical throughout their TEA domains. However, DTEF-1A, which represented four of the five cDNA clones, has two amino acid differences within the TEA domain as compared to NTEF-1 (Fig. 1C). Amino acid residue 87 of DTEF-1A contains a lysine instead of an arginine, and residue 94 contains a leucine in place of an isoleucine, preserving the basic and aliphatic natures, respectively, of the residues at these two positions.
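The sense in which these two substitutions are conservative can be made concrete with a small Python sketch. This is only an illustration: the grouping of residues into classes below is one common simplified scheme that we assume for the example, not a classification taken from this study, and the function names are our own.

```python
# Simplified physicochemical classes (an assumed grouping for illustration;
# other schemes place some residues, e.g. H, G, or M, differently).
RESIDUE_GROUPS = {
    "basic": "KRH",
    "acidic": "DE",
    "aliphatic": "AVLI",
    "aromatic": "FWY",
    "polar": "STNQ",
}

# Invert the table: one-letter residue code -> class name.
CLASS_OF = {aa: name for name, members in RESIDUE_GROUPS.items() for aa in members}


def residue_class(aa):
    """Return the class of a one-letter residue code, or "other" if ungrouped."""
    return CLASS_OF.get(aa.upper(), "other")


def is_conservative(original, substitute):
    """True when both residues fall in the same named class."""
    cls = residue_class(original)
    return cls != "other" and cls == residue_class(substitute)


if __name__ == "__main__":
    # The two TEA domain differences described in the text (NTEF-1 -> DTEF-1A).
    for position, old, new in ((87, "R", "K"), (94, "I", "L")):
        print(position, old, new, residue_class(old), is_conservative(old, new))
```

With this grouping, both the arginine-to-lysine change at residue 87 and the isoleucine-to-leucine change at residue 94 are classified as conservative, whereas a hypothetical arginine-to-leucine change would not be.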
The DTEF-1B isoform is identical to DTEF-1A at the nucleotide and amino acid levels except for an alternatively spliced segment (Fig. 1C).
² A. Azakie, unpublished observation.
Distribution of DTEF-1 Transcripts in Embryonic Chick Tissues-The distribution of DTEF-1 transcripts in embryonic chick tissues was examined by Northern blot analysis of total RNA using a DTEF-1 specific probe (Fig. 2). Two predominant transcripts were detected: a major 1.9-kb transcript and a minor 3.6-kb transcript. The length of the 1.9-kb transcript corresponds to that of the cDNAs, and so it is presumed to be the fully processed mRNA. The 1.9-kb transcript is highly enriched in cardiac tissue and present at lower levels in lung, gizzard, and kidney (Fig. 2). Skeletal muscle, liver, and brain contain trace levels or none of the 1.9-kb transcript. The minor 3.6-kb transcript is present at low but equivalent levels in heart, skeletal muscle, lung, and gizzard.
Cloned DTEF-1 Binds M-CAT Sites in a Sequence-specific Fashion-Because of the considerable sequence conservation between the TEA domains of DTEF-1, RTEF-1, and NTEF-1 we predicted that DTEF-1 would retain the ability to form sequence-specific interactions with M-CAT elements. We tested the DNA binding activity of in vitro translated DTEF-1A protein using gel-retardation assays. Full-length DTEF-1A fusion protein bound to radiolabeled M-CAT probe producing a gel shift complex in the appropriate mobility range (Fig. 3, lanes 1 and 4). Moreover, this binding is sequence-specific since the presence of excess wild type M-CAT DNA (Fig. 3, lane 2) competed effectively for binding, while mutant M-CAT competitor (Fig. 3, lane 3) did not. Thus, DTEF-1A is a bona fide M-CAT binding factor.
DISCUSSION
In this study we report evidence that vertebrate TEF-1 comprises a multigene family. Three classes of TEF-1-related genes, NTEF-1, RTEF-1, and DTEF-1, have been tentatively implicated in muscle gene regulation. NTEF-1 is expressed in various tissues including heart, skeletal muscle, lung, kidney, brain, and gizzard, and produces a major 1.6-1.8-kb transcript (15, 27) (data not shown). RTEF-1 is also present in muscle tissue and one isoform can potentially transactivate muscle promoters through binding to M-CAT motifs (25). DTEF-1 is the newest and most divergent class of TEF-1-related cDNAs. Unlike that of NTEF-1 and RTEF-1, the expression of DTEF-1 is restricted to a few tissues and is highest in heart muscle.
Recently, another vertebrate TEF-1 family member was reported (ETF-1) (30). ETF-1 is expressed specifically in a subset of embryonic tissues, including the cerebellum, testis and distal portions of the limb and tail buds, but is essentially absent from adult tissues. Sequence comparison of ETF-1 to avian TEF-1 family members shows that ETF-1 is not any more closely related to any one TEF-1 family member than another, indicating that it is not a true homologue of N-, R-, or DTEF-1. Although ETF-1 shares sequence features with both RTEF-1 and DTEF-1, and is 100% identical to NTEF-1 in the TEA domain, ETF-1 probably represents a fourth class of TEF-1-related genes. We suggest ETF-1 should be renamed ETEF-1 in order to reflect its membership of the TEF-1 family (see below and Table I).
We propose a system of nomenclature for the known TEF-1 related genes in vertebrates (Table I), taking into account the cloning of ETF-1 in the mouse (30). This nomenclature is based on the first cloned vertebrate TEF-1 family member, human TEF-1 (15). The avian, rat, and murine homologues of human TEF-1 are grouped as NTEF-1 type (Nominal TEF-1, from which the name is derived)³ (15,27) based on their high degree of homology (97% at the amino acid level, see Table I). Another closely related class of TEF-1 cDNAs/genes, RTEF-1 (Related to NTEF-1) has been identified in chick (25) and mouse,⁴ sharing higher amino acid sequence homology to each other (89%) than to the NTEF-1 class (74%). The DTEF-1 (Divergent TEF-1) class of TEF-1 cDNAs reported here constitutes a fourth class that is 72% identical to human NTEF-1 with novel changes in the TEA domain (see below). By contrast, all other vertebrate family members cloned to date show 100% identity in the TEA domain.
FIG. 1 (legend, partial). DTEF-1B alternative splicing domain results in predicted polypeptide sequence one amino acid shorter than that predicted for DTEF-1A. E, alignment of derived amino acid sequences of avian NTEF-1, RTEF-1, and DTEF-1, identified in the figure as chickn, chickr, and chickd. RTEF-1 is identical to NTEF-1 in the TEA domain, while DTEF-1A is 97% identical. Amino-terminal to the TEA domain RTEF-1 is 43% identical to NTEF-1, and DTEF-1 is 45% identical to NTEF-1. Carboxyl-terminal to the TEA domain RTEF-1 has 72% identity with NTEF-1 and DTEF-1 has 70% identity. Black background indicates identity, gray background indicates similarity, and white background indicates difference.
Vertebrate TEF-1-related gene family members characterized to date are believed to initiate translation at an isoleucine codon that lies upstream of the first methionine codon (15,25,27). The size of endogenous and in vivo produced human NTEF-1 corresponds to initiation at an AUU codon and shorter NTEF-1 polypeptides that might initiate at the downstream AUG codon are not detected by anti-TEF-1 antisera (15). Similarly, the major product of in vitro translated NTEF-1 mRNA also corresponds to an isoleucine initiated protein. Engineering of a perfect Kozak consensus around the first methionine codon does not affect that translation pattern. On the other hand, introduction of a perfect Kozak sequence around the isoleucine codon results in more efficient initiation at that site, while mutation of the isoleucine codon abolishes it (15).
The first potential AUG (residue 428) initiator codon in the DTEF-1 cDNA is surrounded by a poor Kozak consensus sequence (7/13 nucleotides, underlined in Fig. 1B), but a reasonably favorable Kozak consensus (10/13 nucleotide homology with the Kozak consensus sequence; underlined in Fig. 1B) surrounds the isoleucine codon (AUA) at nucleotide 352. That isoleucine 352 codon corresponds to the one identified as the initiator for human NTEF-1 as outlined above (15) and initiates a conceptual open reading frame of 433 amino acids whose sequence is closely related to that of other vertebrate TEF-1-related genes (see Table I) and consists of a serine-rich N terminus, followed by a highly conserved, basic TEA domain, and then a proline-rich region. We tentatively conclude, therefore, that DTEF-1 translation probably initiates at an isoleucine codon and that the DNA binding and transactivation motif pattern is grossly conserved through all classes of vertebrate TEF-1 cDNAs consistent with general conservation of function for this protein.
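Match counts of the kind quoted above (7/13 and 10/13) can be obtained by aligning a 13-nucleotide window with a Kozak consensus and counting agreeing positions. The Python sketch below is illustrative only. It assumes the commonly cited 13-nucleotide consensus GCCGCC(A/G)CCAUGG, because the exact consensus string used for the counts in the text is not stated here; the function name and the example windows are hypothetical and are not the DTEF-1 sequence.

```python
# Assumed 13-nt Kozak consensus, with R standing for A or G at position -3.
# The text does not specify which consensus string underlies its counts.
KOZAK_CONSENSUS = "GCCGCCRCCAUGG"

# Bases accepted at each consensus symbol (only the symbols used above).
ACCEPTED = {"A": "A", "C": "C", "G": "G", "U": "U", "R": "AG"}


def kozak_matches(window):
    """Count positions of a 13-nt window that agree with the consensus.

    DNA input is accepted; T is treated as U.
    """
    rna = window.upper().replace("T", "U")
    if len(rna) != len(KOZAK_CONSENSUS):
        raise ValueError("window must be 13 nucleotides long")
    return sum(base in ACCEPTED[symbol] for base, symbol in zip(rna, KOZAK_CONSENSUS))


if __name__ == "__main__":
    # Hypothetical windows for illustration only.
    for window in ("GCCGCCACCAUGG", "GCCGCCACCAUAG", "UUUAUUACAAUGC"):
        print(window, f"{kozak_matches(window)}/13")
```

Note that with this consensus a window centered on a non-AUG codon such as AUA necessarily loses the match at the third codon position, so a score near 10/13 for an isoleucine codon still reflects a favorable flanking context.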
Two isoforms of the DTEF-1 cDNA were isolated. The nucleotide sequence of the DTEF-1A cDNA isoform differs from that of DTEF-1B over a 69 nucleotide segment that includes the C-terminal portion of the TEA domain (Fig. 1D). In all other regions these isoforms are identical in nucleotide sequence indicating that they are products of a single alternatively spliced gene. The DTEF-1B TEA domain is identical to all other published vertebrate TEA domains except that of DTEF-1A (Fig. 1C). DTEF-1A, on the other hand, represents the first divergent vertebrate TEA domain protein due to the presence of a lysine at position 87 where an arginine is in human-N, mouse-N, chick-N, chick-R, Drosophila, and Aspergillus TEA domains. Similarly all known TEA domain genes encode an isoleucine at position 94 except for chick DTEF-1A and Aspergillus abaA which code for a similarly aliphatic leucine at that position.
The divergent region of DTEF-1A encompasses the predicted third helix of the TEA DNA binding domain. This helix was shown to be important in sequence-specific recognition of SV40 M-CAT-like elements by human NTEF-1, but complete disruption of its tertiary structure by proline substitutions, where the rest of the protein was intact, did not completely abolish binding capacity (16). The first helix (where Drosophila scalloped differs from N, R, and DTEF-1) appeared to be the most important for sequence-specific DNA recognition (16). Further deletion analysis of human NTEF-1 established that portions of the C terminus of the protein are also involved in modulation of DNA binding (16). Thus, it is impossible to predict, at present, how the amino acid substitutions and the sequence differences in the C terminus of the DTEF-1 TEA domain might affect the sequence-specificity of DNA binding. We show here that DTEF-1A is capable of sequence-specific binding to a canonical M-CAT motif, but its relative affinity for the wide range of M-CAT element variants found in cardiac and other promoters, compared to other TEF-1 family members, remains to be tested.
NTEF-1 transcripts are widely expressed in human, mouse and chick tissues. RTEF-1 transcripts are also widely expressed, being enriched in skeletal and cardiac muscle and less abundant in gizzard, brain, and liver. By contrast, DTEF-1 transcripts are most abundant in heart muscle, present at intermediate levels in lung and gizzard, and present at minimal or undetectable levels in skeletal muscle, kidney, liver and brain. The selective enrichment of DTEF-1 mRNA in heart and the demonstration that DTEF-1 is a sequence-specific M-CAT binding factor suggest a potential role for DTEF-1 in cardiac-specific muscle gene regulation that may differ from that played by other TEF-1 family members.
M-CAT elements were first described in the cardiac-specific cTNT promoter (8) and since then most characterized cardiac promoters have been found to contain one or more functional M-CAT elements (9-13). M-CAT elements in the promoters of the β-myosin heavy chain and α-skeletal actin genes have been shown to be required for induction by α1-adrenergic agonists and protein kinase C (28,29). Several candidate protein kinase C sites can be identified in the derived amino acid sequence of DTEF-1, including residues 81, 86, 219, 329, and 398. These potential protein kinase C sites are conserved in avian NTEF-1 and RTEF-1 and mammalian NTEF-1, consistent with evolutionary conservation of potentially important regulatory sites. Since DTEF-1 is abundantly expressed in heart it may be a target for this protein kinase C regulation and a potential mediator of the α1-adrenergic hypertrophy response.
The fact that this multigene family consists of at least four genes with each gene encoding multiple, alternatively spliced isoforms implies complex variability in the regulatory functions associated with the TEF-1 family. This is reminiscent of the MEF-2 and bHLH families of transcription factors whose members exhibit distinct spatiotemporal patterns of expression and functional redundancy, and contribute differentially to specific gene regulatory mechanisms. Members of the TEF-1 gene family may also show functional redundancy. Disruption of the murine NTEF-1 gene, for example, results in embryonic lethality by day 11.5 (24) by which time the forming heart tube is grossly normal. Histological analysis of mutant embryo hearts shows hypoplastic ventricles with reduced trabeculation and evidence of myocyte degeneration subjacent to the endocardium. These findings suggest a possible role for NTEF-1 in the growth and maintenance of the cardiac phenotype, but are less consistent with a role in the initiation of cardiogenesis. However, the expression of at least two other TEF-1 gene family members in the heart (RTEF-1 and DTEF-1) may be sufficient to compensate for the absence of the NTEF-1 gene products. Furthermore, the expression of other members of the TEF-1 multigene family may explain why M-CAT-dependent transcription of genes for cardiac proteins (cTNT, cTNC, myosins) is maintained in the TEF-1 knockout mutant embryos (24). It will be interesting to see what effects transgenic knockout of the mouse DTEF-1 and RTEF-1 homologues have upon cardiac and skeletal muscle development.
sharing unpublished sequence information about TEF-1 variants. Nina Kostanian and Monique Benoualid provided expert technical assistance for this project.