Characterization of general transcription factor 3, a transcription factor involved in slow muscle-specific gene expression.

General transcription factor 3 (GTF3) binds specifically to the bicoid-like motif of the troponin I(slow) upstream enhancer. This motif is part of a sequence that restricts enhancer activity to slow muscle fibers. GTF3 contains multiple helix-loop-helix domains and an amino-terminal leucine zipper motif. Here we show that helix-loop-helix domain 4 is necessary and sufficient for binding the bicoid-like motif. Moreover, the affinity of this interaction is enhanced upon removal of amino-terminal sequences including domains 1 and 2, suggesting that an unmasking of the DNA binding surface may be a precondition for GTF3 to bind DNA in vivo. We have also investigated the interactions of six GTF3 splice variants of the mouse, three of which were identified in this study, with the troponin enhancer. The gamma-isoform lacking exon 23, and exons 26-28 that encode domain 6, interacted most avidly with the bicoid-like motif; the alpha- and beta- isoforms that include these exons fail to bind in gel retardation assays. We also show that GTF3 polypeptides associate with each other via the leucine zipper. We speculate that cells can generate a large number of GTF3 proteins with distinct DNA binding properties by alternative splicing and combinatorial association of GTF3 polypeptides.

The establishment and maintenance of mature fast- and slow-twitch muscle fibers require the expression of distinct sets of genes for contractile proteins, metabolic enzymes, and ion channels. These expression patterns are mainly controlled at the level of transcription. Fiber type specificity of muscle genes can be recapitulated in transgenic reporter mouse models or in vivo transfection assays by using respective transcription control regions (see, for example, Refs. 1-8). Over the past years, signaling proteins such as calcineurin and Ras, and transcription factors GTF3¹/MusTRD1, MEF-3, MEF-2, NFAT, and PGC-1α were implicated in the regulation of fiber type-specific expression in adult muscle (9-17). In birds and lower vertebrates, the sonic hedgehog signaling pathway was shown to be involved in the specification of primary slow myofibers (18-20). However, the transcription factors and signaling pathways that control the establishment of slow and fast fiber phenotypes during mammalian muscle development are not known.
The troponin I slow gene (TnIs) is activated during terminal myogenic differentiation in all skeletal muscles regardless of their future fiber type. Its expression is then confined to prospective slow fibers during fetal development (11,21). The enhancer that confers slow fiber specificity to TnIs expression is located ~800 bp upstream of the gene and was termed SURE (for slow upstream regulatory element; Refs. 3 and 22). Using a transgenic approach, we showed that the downstream half of the 128-bp SURE, including binding sites for myogenic regulatory factors (i.e. MyoD and myogenin) and MEF-2, is necessary for general muscle-specific activity, but not sufficient to restrict transcription to specific fiber types (11). Rather, a 36-bp upstream region of the SURE is required in addition to downstream sequences to re-establish slow fiber-specific reporter expression. Within this sequence, a bicoid-like motif (BLM; CGGATTAAC) was found in a yeast one-hybrid screen to interact with the general transcription factor GTF3. In a similar approach, the corresponding sequence of the human TnIs upstream enhancer (equivalent to SURE) was used to isolate a cDNA encoding MusTRD1 (12). GTF3, MusTRD1, GTF2ird1, WBSCR11, CREAM, and the mouse ortholog BEN are synonyms for proteins encoded by the same gene (12,23-26). GTF3 is ubiquitous in rodent tissues (11,27), whereas MusTRD1 has been suggested to be muscle-specific in humans (Ref. 12; but see Ref. 24). As we showed previously, the highest expression of GTF3 in rodent muscle occurs during fetal development, after which it is downregulated to very low levels in mature muscle fibers. In transfected rat muscle, GTF3 significantly reduces the transcriptional activity from the SURE. This is consistent with the idea that a repressive mechanism establishes slow fiber-specific TnIs expression during myogenic development. We therefore proposed that GTF3 is involved in the confinement of TnIs expression to slow-twitch fibers (11).
A characteristic feature of GTF3, and its paralog TFII-I, is the presence of reiterated helix-loop-helix (HLH) domains, so-called I-repeats (R1-R5/R6; for review, see Ref. 28). Most of these repeats are believed to function as protein-protein interaction surfaces because they lack a basic domain. In TFII-I, a basic motif precedes R2, and its deletion abrogates binding to Vβ Inr and c-Fos promoter sequences (29,30). Largely based on protein sequence, MusTRD1 has been suggested to bind to the USE B1 enhancer element of the human TnIs gene via a domain located in its amino-terminal half (12). However, two lines of evidence indicate that the first two HLH domains of GTF3 are dispensable for binding to the TnIs BLM: (a) most of the GTF3 clones we obtained from the yeast one-hybrid screen lack the sequences encoding R1 and R2, and (b) a partial GTF3 protein containing the carboxyl-terminal half including R3-R5 forms a complex with a SURE-derived oligonucleotide probe in electrophoretic mobility shift assays (EMSAs), whereas a mutant protein encompassing the amino-terminal half including R1 and R2 does not (11). In that study, GTF3 cDNAs conforming to either one of two reported human GTF3 transcripts (containing either the long or the short form of exon 19) were isolated, indicating that sequence variability in the region between R3 and R4 encoded by this exon does not appreciably affect DNA binding. However, mice express at least six GTF3 isoforms that differ more extensively in sequence (Ref. 31 and this paper). Most notably, both α- and β-GTF3 isoforms contain a sixth HLH domain located between R5 and the carboxyl terminus, plus an additional 27 amino acids between R4 and R5 encoded by mouse exon 23. Their expression pattern and functional properties are not known.

* The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EBI Data Bank with accession number(s) AY149688 (GTF3α2), AY149689 (GTF3α3), and AY149690 (GTF3γ2).
‡ To whom correspondence should be addressed. Tel.: 301-496-3298; Fax: 301-496-9939; E-mail: buonanno@helix.nih.gov.
1 The abbreviations used are: GTF3, general transcription factor 3; SURE, slow upstream regulatory element; FBS, fetal bovine serum; EDL, extensor digitorum longus; HLH, helix-loop-helix; bHLH, basic helix-loop-helix; BLM, bicoid-like motif; TFII-I, general transcription factor II-I; WS, Williams syndrome; LZ, leucine zipper; EMSA, electrophoretic mobility shift assay; DIV, days in vitro; RT, reverse transcription.
GTF3 and TFII-I were mapped along with at least 21 other genes as part of a 1.5-Mb microdeletion in persons with Williams syndrome (WS) (32,33). Given the potential importance of GTF3 and TFII-I for the pathology of WS, and the suggested role of GTF3 in regulating slow fiber-specific gene expression, a better understanding of the biochemical properties of GTF3 and its functional relation to TFII-I is necessary. Therefore, the goal of this study was to characterize the interactions between GTF3 and the TnI SURE. We mapped the DNA binding domain of GTF3 to HLH domain 4 that lacks a consensus basic region. Interestingly, affinity of GTF3 for the BLM was dramatically augmented upon removal of NH2-terminal sequences. We also show that rodent skeletal muscles express at least five different GTF3 isoforms that exhibit distinct DNA binding properties in EMSAs; three variants (α2, α3, and γ2) are reported here for the first time. We furthermore demonstrate that GTF3 proteins can form dimers via the NH2-terminal leucine zipper (LZ) motif, suggesting that cells can generate a large number of GTF3 transcription factor complexes with potentially different properties and functions.

Cell Culture
C2C8 myogenic cells were propagated in low glucose Dulbecco's modified Eagle's medium (Invitrogen) supplemented with 20% fetal bovine serum (FBS; Invitrogen) and 2 mM L-glutamine. Cells were maintained at 37°C in an 8% CO2 environment. Cell density was kept between 20 and 80% confluence to prevent terminal myogenic differentiation. HEK293 cells were propagated in minimal essential medium (Invitrogen) supplemented with 10% FBS and maintained in a 37°C, 10% CO2 environment. Primary myotube cultures were prepared from rat embryonic day 19 hindlimbs and grown for 10 days in 10% FBS/Dulbecco's modified Eagle's medium at 37°C and 8% CO2 (34). Cytosine β-D-arabinofuranoside was added to the cultures at day 4 to enrich for myotubes.
hGTF3Δ125-Coding oligonucleotide sequence was 5′-ACC AGG CCT TTC CAA GGA CTC ATC GCA GAA ATC TGC AAT GAT GCC AAG GTG-3′; antisense primer was T7. Following ligation to pGemT, a StuI fragment was released and used to replace the corresponding StuI fragment in hGTF3Δ12.
hGTF3Δ4-A SapI fragment including the sequences flanking the deletion was released from hGTF3Δ124 and used to replace the corresponding sequence in full-length hGTF3.
Mouse GTF3 Deletion Constructs-Full-length and truncated mouse GTF3 expression plasmids were based on I.M.A.G.E. clone 555547 (see above). The generation of carboxyl-terminal deletion mutant mGTF3β.Δ3-6 is described elsewhere (11). The amino-terminal deletion mGTF3β.ΔLZ was made by PCR-amplifying the GTF3β cDNA with a coding oligonucleotide that is located downstream of the leucine zipper motif (5′-T CCA GTC GAC GCC ACC ATG CAG TCA GAC TTC CTC AGG TTC TGC-3′) and includes a SalI linker, Kozak sequence, and translation initiation codon (underlined). T7 was used as antisense oligonucleotide. The PCR product was digested with SalI and EagI and inserted between the SalI and NotI sites of pCMV-Sport2. A 229-bp SacII fragment including the new translation start site was then excised from this intermediate and used to replace the corresponding sequence in GTF3β. The Pk epitope (Ref. 37; also referred to as "V5") was added on to full-length mGTF3β, mGTF3β.Δ3-6, and mGTF3β.ΔLZ to generate respective affinity-tagged derivatives. Corresponding constructs were made by inserting a double-stranded oligonucleotide coding for this epitope into the NcoI site that overlaps with the translation initiation codon (CCATGG) in GTF3. The sequence of this oligonucleotide (excluding NcoI linker arms) was 5′-GAA GGT AAG CCT ATC CCT AAC CCT CTC CTC GGT CTC GAT TCT ACG AG-3′. Mouse versions of human GTF3Δ1-3 were made by amplifying the coding sequence of the various GTF3 splice variants downstream of HLH repeat 3 using coding oligonucleotide 5′-GTC GAC GCC ACC ATG AAG AGA CAG GGC CTT CAA G-3′ (SalI linker, Kozak sequence, and translation initiation codon are underlined) and T7 as the noncoding oligonucleotide. The resulting PCR products were digested with SalI-BstEII and inserted between the corresponding sites of mGTF3 full-length isoform cDNAs in pCMV-Sport2.
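The design features of the two coding oligonucleotides described above (SalI linker, Kozak sequence, translation initiation codon) can be checked with a short script. The following is a minimal sketch, not part of the original procedures; the helper name `primer_features` is hypothetical, and it assumes the SalI recognition sequence GTCGAC and the Kozak core GCCACC placed directly upstream of the ATG.

```python
# Hypothetical helper for illustration: locate cloning features in a coding primer.
SALI_SITE = "GTCGAC"    # SalI recognition sequence
KOZAK_CORE = "GCCACC"   # Kozak core assumed to sit directly upstream of the ATG
START_CODON = "ATG"


def primer_features(primer: str) -> dict:
    """Return 0-based positions of the SalI site, Kozak core, and start codon."""
    seq = primer.replace(" ", "").upper()
    sali = seq.find(SALI_SITE)
    kozak = seq.find(KOZAK_CORE + START_CODON)
    return {
        "length": len(seq),
        "sali_pos": sali,
        "kozak_pos": kozak,
        "start_codon_pos": kozak + len(KOZAK_CORE) if kozak >= 0 else -1,
    }


# Coding primers as written in the text (triplet spacing is ignored).
dlz = primer_features("T CCA GTC GAC GCC ACC ATG CAG TCA GAC TTC CTC AGG TTC TGC")
d13 = primer_features("GTC GAC GCC ACC ATG AAG AGA CAG GGC CTT CAA G")

for name, feats in (("mGTF3beta.dLZ", dlz), ("delta1-3", d13)):
    print(name, feats)
```

In both primers the SalI site is followed immediately by the Kozak core and the ATG, which is consistent with the description in the text.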

Quantitative Analysis of mGTF3α and -γ Isoform Expression
Relative abundance of mouse GTF3α/γ splice variants was determined by a combination of RT-PCR and subsequent hybridization analysis. Mouse total RNA (2 μg) from adult soleus and EDL muscles, as well as from whole brain, was reverse transcribed and PCR-amplified with oligonucleotides mGTF3+2346s and mGTF3+3446r using the conditions described in the previous section. Parallel PCR reactions were subjected to 18, 19, and 20 cycles to ensure that PCR product growth was in the logarithmic phase. PCR products were electrophoresed on a 1.5% agarose gel and blotted onto a charged nylon membrane. For GTF3 isoforms α, α2, and α3, membranes were hybridized at 50°C in 6× SSC, 1% SDS, 1 mM EDTA (1× SSC = 150 mM NaCl, 15 mM sodium citrate, pH 7.0) with the following [γ-32P]-labeled oligonucleotide probes: α, 5′-CC AAA GCC TG/A AAC CAA ATT-3′; α2, 5′-CC AAA GCC TG/G ACA TGA AGC-3′; α3, 5′-CC AAA GCC TG/A TGA GGA TGA-3′ ("/" indicates the boundary between exon 22 common to all isoforms and the isoform-specific adjacent sequences). Following a stringent wash (2× SSC, 1% SDS at 45°C for α/α3 and 40°C for α2), the membrane was exposed to a Storm® phosphorimager screen (Amersham Biosciences). Specificity was monitored, and hybridization signals were normalized, by including known quantities of BstEII-HindIII cDNA fragments of α-α3 isoforms on the same membrane. For GTF3 isoforms γ and γ2, and to quantify total α- and γ-PCR products, membranes were hybridized with a [α-32P]dCTP-labeled BstEII-HindIII restriction fragment of mouse GTF3γ2 excised from pCMV-Sport2. To normalize for variability related to RNA input and reverse transcription, aliquots of RT samples were amplified in parallel with oligonucleotides specific for a 352-bp fragment of the mouse ribosomal protein L7 transcript (L7s, 5′-AGA TGT ACC GCA CTG AGA TCC-3′; L7a, 5′-ACT TAC CAA GAG ACC GAG CAA-3′; Ref. 38). L7 PCR products were electrophoresed and blotted as described above and hybridized against a [α-32P]dCTP-labeled cDNA probe derived from the L7 PCR product. Signals were quantified with the phosphorimager and used to normalize corresponding values for GTF3 splice isoforms.
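The two normalization steps described above (calibration against co-blotted standards of known quantity, then correction for RNA input using the L7 signal) can be written out as simple arithmetic. The sketch below is one plausible reading of that procedure, not the authors' actual calculation; the function names and all numbers are illustrative assumptions.

```python
def calibrate(signal: float, standard_signal: float, standard_amount: float) -> float:
    """Convert a band signal to an amount using a standard of known quantity
    hybridized on the same membrane."""
    return signal * standard_amount / standard_signal


def normalize_to_l7(amount: float, l7_signal: float) -> float:
    """Correct for RNA input and RT efficiency using the L7 signal of the same RT sample."""
    return amount / l7_signal


def isoform_fractions(amounts: dict) -> dict:
    """Express each isoform as a fraction of the summed isoform amounts."""
    total = sum(amounts.values())
    return {name: value / total for name, value in amounts.items()}


# Illustrative numbers only (not data from this study).
signals = {"alpha": 300.0, "alpha3": 500.0}                    # phosphorimager counts
standards = {"alpha": (150.0, 1.0), "alpha3": (250.0, 1.0)}    # (signal, known amount)

amounts = {k: calibrate(signals[k], *standards[k]) for k in signals}
per_l7 = {k: normalize_to_l7(v, l7_signal=4.0) for k, v in amounts.items()}
fractions = isoform_fractions(amounts)
print(amounts, per_l7, fractions)
```

Note that the L7 correction cancels out when isoforms are compared within one sample; it matters only for comparisons of total abundance between tissues.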

Electrophoretic Mobility Shift Assays
Full-length and partial GTF3 proteins used for EMSAs were generated in vitro from cDNAs subcloned into pCMV-Sport2 (see above). Proteins were synthesized from 0.5 μg of plasmid DNA using 14 μl of TNT® SP6 reticulocyte coupled transcription-translation system (Promega). Relative efficiency of translation was monitored in parallel reactions by the addition of [35S]methionine. Radiolabeled proteins were fractionated on 4-20% gradient SDS-polyacrylamide gels (SDS-PAGE), and gels were dried and exposed to autoradiographic film for visualization of proteins. The relative levels of translated protein were determined by quantification using the phosphorimager and normalization for the number of methionines in each GTF3 construct.
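Because each construct incorporates a different number of labeled methionines, a raw [35S] band signal overstates the molar amount of methionine-rich proteins. A minimal sketch of the correction described above follows; the function name, the signal values, and the methionine counts are hypothetical and are not taken from the GTF3 constructs.

```python
def relative_molar_yield(band_signal: float, n_met: int) -> float:
    """Divide the [35S] band signal by the number of methionines in the construct
    to obtain a value proportional to the molar amount of protein."""
    if n_met <= 0:
        raise ValueError("construct must contain at least one methionine")
    return band_signal / n_met


# Illustrative values only: a long construct with many methionines and a
# short construct with few.
long_construct = relative_molar_yield(band_signal=1200.0, n_met=24)
short_construct = relative_molar_yield(band_signal=600.0, n_met=8)

# The short construct gives half the raw signal but is more abundant in molar terms.
print(long_construct, short_construct, short_construct / long_construct)
```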

Antibody Production
A cDNA fragment encoding the amino-terminal 130 amino acids of mouse GTF3 was amplified from I.M.A.G.E. clone 555547 using the following primers: GTF3-N.Cod (5′-CAC TAG GAA TTC GGA TCC GCC TTG CTG GGG AAG CAC TGT TGA C-3′) and GTF3-N.NCod (5′-TGG TAC GAA TTC ATC TTC TGC AGC AGG TAC ACA TCC-3′). The PCR product was digested with BamHI and EcoRI and inserted between the corresponding sites of pGEX-2T (Amersham Biosciences). Glutathione S-transferase fusion protein was expressed in Escherichia coli BL21 cells and extracted from lysates on glutathione-Sepharose (Amersham Biosciences). The immunogen was further purified by preparative SDS-PAGE followed by electroelution of the specific 40-kDa band. This preparation was used to immunize rabbits. Immunoglobulins were purified from whole antiserum on Protein A-Sepharose columns (Pierce). Antibody specificity was confirmed by Western blotting of whole extracts from HEK293 and C2C8 cells transfected with expression constructs for human and mouse GTF3 as well as mouse TFII-I (see above). No cross-reactivity of anti-GTF3 antibodies toward TFII-I was observed.

Location of DNA Binding Domain in Human GTF3-We have previously demonstrated that GTF3 interacts specifically with the BLM of the TnIs upstream enhancer (11). We concluded that the DNA binding domain of GTF3 must be located downstream of R2 because many GTF3 clones obtained from our yeast one-hybrid screen lacked sequences upstream of R3, and EMSA experiments confirmed that the carboxyl-terminal half of GTF3 (including R3-R5) bound to the BLM, but not the amino-terminal half (including R1 and R2). To map the location of its DNA binding domain, we have generated a series of truncated human GTF3 expression plasmids (Fig. 1). The ability to interact with the BLM was tested in EMSAs using in vitro translated proteins and an oligonucleotide probe that encompasses the sequence between −842 and −815 of the rat TnI SURE (11). To ensure that all GTF3 proteins were properly translated, parallel reactions spiked with [35S]methionine were loaded on a 4-20% gradient SDS-PAGE and autoradiographed (data not shown). EMSAs were performed with unlabeled proteins synthesized from the different GTF3 constructs shown in Fig. 1.

FIG. 1. Human GTF3 constructs used in this study. Figure is a schematic representation of full-length and partial human GTF3 expression constructs used for mapping the DNA binding domain. All constructs were derived from a cDNA corresponding to a GTF3 isoform (GenBank™ accession no. AAF19786) that contains the long form of exon 19 (amino acids 656-671). Boxes represent HLH domains (denoted as R1-R5). The leucine zipper motif (LZ, amino acids 32-55) at the NH2 terminus (hexagon) and a nuclear localization signal (N, amino acids 884-889) at the carboxyl terminus (circle) are also shown. Numbers in italics before and after constructs or construct segments indicate amino acid positions of start and stop signals as well as internal junction sites relative to full-length GTF3.
As shown in Fig. 2A, full-length GTF3 produced a relatively weak specific shift (lanes 2 and 13) similar to that observed previously (11). GTF3 proteins lacking the NH2 terminus plus the first two (hGTF3Δ12, lane 3) or three HLH domains (hGTF3Δ1-3, lane 4) interacted strongly with the probe. In contrast, no shift was observed when R4 was removed in addition to the first three repeats (hGTF3Δ1-4; lanes 5 and 14). Thus, a region between R3 and R4 is necessary for GTF3 to bind to the BLM. Next, we deleted specific regions in the carboxyl-terminal half of GTF3 within the context of hGTF3Δ12. A mutant protein that lacks the carboxyl terminus downstream of R5 efficiently formed a complex with the probe (hGTF3Δ12C, lane 6), indicating that this region, which includes a serine-rich stretch and the nuclear localization signal (26), is not required for DNA binding. Likewise, a weaker but significant signal was observed with a protein that lacks HLH domain 5 (hGTF3Δ125, lane 7). Next, we removed R4 both in the context of hGTF3Δ12 as well as the full-length protein (lanes 8 and 9). No specific shift was detected in either one of these reactions, demonstrating that R4 is required for DNA binding. Finally, we tested the ability of R4 to bind the BLM by itself, and indeed a strong shift was obtained with this GTF3 protein fragment. In conclusion, we have identified HLH domain 4 as being necessary and sufficient to mediate the interaction between GTF3 and the BLM of the TnI SURE.
The weak binding of hGTF3Δ125 and the −842/−815 probe may represent an artifact because a stronger shift is obtained with the same protein when a shorter probe encompassing the sequences between −844 and −827 of the TnI SURE (lane 11) is used. It is therefore conceivable that the proximity between the carboxyl terminus and R3-R4 in this mutant negatively affects binding to the long probe. We also noticed that GTF3 proteins encoded by constructs Δ12, Δ1-3, and Δ125 gave rise to additional weak low mobility complexes. The carboxyl terminus of GTF3 is the only region present in these proteins but absent from Δ12C and R4, both of which gave rise to a single band. Sequences within this region could therefore mediate the formation of these higher order complexes that may or may not include other lysate proteins in addition to GTF3.
We then asked whether the apparent augmentation of protein-DNA interaction upon removal of amino-terminal sequences represents a true increase in the affinity of GTF3 for the BLM, or an artifact caused by a concomitant increase in translation efficiency for the smaller GTF3 proteins. We expressed the affinities of the respective full-length and truncated proteins for the BLM as the fraction of bound probe relative to the total probe count (free plus bound probe) per arbitrary protein unit and defined full-length GTF3 as 1. As shown in Fig. 2B, affinities were 6-fold higher for hGTF3Δ12 compared with full-length GTF3 and ~10-fold higher for all other binding-competent GTF3 proteins. hGTF3Δ125 was not included for reasons discussed above. In conclusion, we find that the avidity of DNA-protein interaction increases significantly in the absence of NH2-terminal sequences, suggesting an inhibitory effect of this region on DNA binding.
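The affinity measure used here (fraction of bound probe per arbitrary protein unit, with full-length GTF3 set to 1) can be made explicit. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the counts are illustrative rather than measured, and the result is a comparative index under the assay conditions rather than an equilibrium binding constant.

```python
def fraction_bound(bound: float, free: float) -> float:
    """Bound probe as a fraction of total probe (bound plus free)."""
    return bound / (bound + free)


def affinity_index(bound: float, free: float, protein_units: float) -> float:
    """Fraction of bound probe per arbitrary unit of protein in the reaction."""
    return fraction_bound(bound, free) / protein_units


def relative_affinity(sample: tuple, reference: tuple) -> float:
    """Affinity index of a sample normalized to a reference (full-length = 1).

    Each tuple is (bound counts, free counts, protein units)."""
    return affinity_index(*sample) / affinity_index(*reference)


# Illustrative counts only (not data from Fig. 2B).
full_length = (10.0, 990.0, 1.0)   # weak shift, one protein unit
truncated = (20.0, 980.0, 0.5)     # stronger shift obtained with half as much protein

print(relative_affinity(full_length, full_length))
print(relative_affinity(truncated, full_length))
```

Dividing by protein units is what separates a genuine gain in binding from a gain in translation efficiency: a truncated protein that is simply made in larger amounts would show more bound probe but an unchanged index.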
FIG. 2. ... 3, 4, 7, and 11. This autoradiogram was exposed for 16 h. Lanes 12-14 show longer exposures (2 days) of lanes 1, 2, and 5 to demonstrate specific binding of full-length GTF3. B, relative affinities of GTF3 proteins for the BLM. Values were obtained as follows: affinities were expressed as the percentage of bound probe relative to total probe count. Resulting numbers were corrected for protein molarity using [35S]methionine proteins translated in parallel (see "Experimental Procedures") and normalized to full-length GTF3.

Cloning of Mouse GTF3α and -γ Splice Variants from Muscle and Non-muscle Tissues-In initial attempts the β-isoform of mouse GTF3, and truncated versions thereof, failed to produce a complex with the BLM in EMSAs. We therefore explored the possibility that additional GTF3 isoforms are expressed in rodent skeletal muscle that might interact with the SURE. The mouse GTF3 gene has been reported to give rise to two additional (α- and γ-) isoforms, both of which differ from GTF3β, and from each other, in sequences carboxyl-terminal of the DNA-binding domain R4 (31). Human GTF3 most closely resembles the mouse γ-isoform. To determine whether α- and γ-isoforms are expressed in skeletal muscle tissue, we employed RT-PCR to amplify the 3′ end of the coding region of mouse GTF3. Although all previously known splice variants were reported to share sequences upstream of the BstEII site we used to clone the PCR fragments into the parental mGTF3β expression vector, we cannot rule out that splice events upstream of the analyzed sequence contribute to the variability of GTF3 transcripts. We used a coding primer that recognizes all three known GTF3 splice isoforms and an antisense primer that binds to a sequence downstream of the translation termination codon of α- and γ-isoforms. A representative gel is shown in Fig. 3A.
GTF3 cDNA fragments were amplified from RNA extracted from DIV10 rat primary myotube cultures (PM), pooled RNA from embryonic and early postnatal mouse hindlimbs (E15, E18, P7; HL), as well as RNA isolated from various adult slow and fast muscles (S-D). RNAs from heart, brain, and testis were included as non-skeletal muscle tissues (H-T).
All reactions produced at least four discrete fragments that were confirmed by Southern blotting using an internal probe to represent specific GTF3 sequences (data not shown). Sizes of predominant fragments were 1100 and 1019 bp, and the weaker bands were 791 and 710 bp, respectively. RT-PCR reactions from perinatal hindlimbs and testis were used for shotgun subcloning. At least two clones were sequenced for each of the four size categories. We identified cDNA fragments representing GTF3α and -γ isoforms (1100- and 710-bp PCR fragments, respectively), as well as sequences for three previously unknown splice variants (1079/1019/791 bp). The exon organization of GTF3 splice isoforms, deduced from comparison of cDNA and genomic sequences, is illustrated in Fig. 3B. Like GTF3α, the 1079- and the 1019-bp fragments included exons 26-28 that code for R6 and were therefore designated GTF3α2 and -α3, respectively. GTF3 isoforms α and α2 did not resolve because of their similar size. The 791-bp PCR product lacked exons 26-28 and was named GTF3γ2 to indicate its similarity with GTF3γ. The difference between GTF3α and GTF3α2/α3, and between GTF3γ and GTF3γ2, can be attributed to differential usage of exon 23. A shortened form of this exon is generated in GTF3α2 by splicing of exon 22 to a cryptic site in exon 23, 21 nucleotides downstream of its 5′-boundary. The alternative splice junction (ACGACCACGAAG) has the consensus AG dinucleotide but lacks the pyrimidine-rich upstream sequence commonly found in obligatory 3′ splice sites. Like the γ-isoform, GTF3α3 lacks exon 23 altogether, whereas it is included in GTF3γ2.
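The reported fragment sizes are internally consistent with the exon usage described above, which can be verified with simple arithmetic. The sketch below is illustrative and rests on two assumptions: that exon 23 contributes 81 nucleotides (27 amino acids, as stated in the introduction), and that exons 26-28 together contribute 309 nucleotides, a value inferred here from the size difference between the α (1100 bp) and γ2 (791 bp) fragments rather than stated in the text.

```python
BASE_NT = 710                 # gamma fragment: no exon 23, no exons 26-28
EXON23_NT = 27 * 3            # 27 amino acids encoded by exon 23 -> 81 nt
CRYPTIC_OFFSET_NT = 21        # alpha2 splices to a cryptic site 21 nt into exon 23
EXONS_26_28_NT = 1100 - 791   # assumption: inferred from alpha vs gamma2 (309 nt)


def fragment_size(exon23: str, has_exons_26_28: bool) -> int:
    """Predicted RT-PCR fragment size for a given exon composition.

    exon23 is one of "full", "short" (cryptic 3' splice site), or "none"."""
    size = BASE_NT
    if exon23 == "full":
        size += EXON23_NT
    elif exon23 == "short":
        size += EXON23_NT - CRYPTIC_OFFSET_NT
    elif exon23 != "none":
        raise ValueError(f"unknown exon 23 form: {exon23}")
    if has_exons_26_28:
        size += EXONS_26_28_NT
    return size


isoforms = {
    "alpha": ("full", True),
    "alpha2": ("short", True),
    "alpha3": ("none", True),
    "gamma": ("none", False),
    "gamma2": ("full", False),
}
predicted = {name: fragment_size(*parts) for name, parts in isoforms.items()}
print(predicted)
```

The α2 and α3 predictions (1079 and 1019 bp) are independent checks, since neither size was used to set the constants. The 21-nucleotide difference between α and α2 also explains why these two products did not resolve on the gel.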
The relative intensity of GTF3α/α2-derived signals compared with GTF3α3 appeared to be variable in different tissues. GTF3α/α2 fragments were more abundant in testis, whereas a fairly even distribution or a moderate bias toward GTF3α3 was seen with RNA from cultured myotubes and muscle tissues including heart, as well as with RNA from whole brain. We then asked whether certain isoforms might be differentially expressed between muscles that transcribe the TnIs gene and those that do not. We tested RNA from the slow soleus and fast EDL muscles, and included RNA from brain as a non-muscle tissue. In the mouse, ~60% of all muscle fibers in the soleus are slow-twitch, whereas the EDL almost exclusively comprises fast-twitch fibers in which the TnIs gene is not expressed. We used semiquantitative RT-PCR to test for a correlation between the relative abundance of α- and γ-isoform-derived PCR products and the slow fiber type-specific expression of the TnIs gene. Because of the similarities in size and sequence of α- and γ-splice variants, we employed the conditions used above for RT and subsequent PCR, assuming that all isoforms were amplified with comparable efficiencies. Reactions were subjected to a low number of PCR cycles to ensure that product growth was captured in the logarithmic phase, and quantified by radioactive hybridization (for details, see "Experimental Procedures"). The result of this analysis is summarized in Table I.
The overall abundance of α- and γ-isoforms was similar between soleus and EDL muscles, whereas an ~5-fold higher expression level was observed with RNA from whole brain tissue. This is in agreement with the finding that GTF3 transcript levels in the mouse are low in adult skeletal muscle compared with other tissues, and with the lack of a bias toward either slow- or fast-twitch muscles (11). In skeletal muscle and brain, GTF3α3 was the most abundant isoform, ranging from 46 to 62% of total GTF3α/γ transcripts, followed by GTF3α (27-39%). In all three tissues, expression of GTF3α2 and of both γ-isoforms was low, and together accounted for less than 15% of the total α/γ transcripts. The distribution of isoforms in soleus and EDL was not significantly different. Taken together, we conclude that GTF3α3 and GTF3α isoforms predominate in skeletal muscle, and that the uniformity of GTF3α/γ isoform expression pattern in slow and fast muscles does not support the idea that the expression of any particular isoform is correlated with the expression pattern of the TnIs gene.

TABLE I (footnotes). b, Data were obtained by hybridizing PCR reactions to a pan-GTF3 cDNA probe. The value for the soleus was arbitrarily defined as 1. c, Values are based on normalized hybridization data for distinct isoforms using isoform-specific oligonucleotides (α, α2, α3) or a pan-GTF3 cDNA probe (γ, γ2). For quantitation and normalization procedures, see "Experimental Procedures."
DNA Binding Properties of Mouse GTF3 Splice Variants-Next, we asked how these differently spliced sequences affect the interaction between GTF3 and the BLM. Cloned fragments of GTF3 splice variants were excised from the PCR vector and used to replace the corresponding sequence of GTF3β. We used in vitro translated proteins and the −842/−815 oligonucleotide probe to assess their DNA binding properties in EMSAs. To allow for cross-comparison of binding affinities, reactions were normalized for protein molarity (see "Experimental Procedures"). As shown in Fig. 4 (left half), no specific shifts were detected after a 36-h exposure in reactions that used full-length GTF3 proteins, except for GTF3γ that produced a very weak specific shift (lanes 2-7). This result is in agreement with the poor binding of full-length human GTF3 to the BLM probe (see Fig. 2). Next, NH2-terminally truncated versions that lacked the first three HLH domains (Δ1-3) were generated from all variants to determine whether the removal of these sequences enhances the affinity of GTF3 proteins for the BLM (Fig. 4, right half). Δ1-3 versions of GTF3γ and -γ2 isoforms interacted avidly with the probe (lanes 12 and 13). A modest shift was obtained with GTF3α3 (lane 10). Very weak shifts (gray arrowheads) not present in the reticulocyte lysate control (lane 1) were visible with GTF3α, -α2, and -β isoforms (lanes 8, 9, and 11). Their mobilities, compared with the predominant complexes obtained with γ-, γ2-, and α3-isoforms, were aberrant and therefore are not likely to represent bona fide interactions. The different signal intensities of specific shifts obtained with the various isoforms reflect intrinsic differences in the affinity of GTF3 proteins for the BLM, and are not the result of variability in translation efficiency because reactions were normalized for protein molarity (see above).
The common feature distinguishing strongly binding GTF3γ and -γ2 isoforms from modestly binding GTF3α3 is the absence of exons 26-28 (see Fig. 3B), suggesting that the presence of R6 encoded by these exons interferes with DNA binding. A second variable appears to be the presence or absence of amino acids encoded by exon 23. The GTF3γ isoform lacks this exon and forms a complex that is stronger than that obtained with its exon 23-containing counterpart GTF3γ2. Likewise, the presence of the long or the short form of exon 23 in amino-terminal truncations of GTF3α and -α2 significantly reduces their affinity for the BLM compared with GTF3α3. We conclude that multiple GTF3 isoforms are expressed in muscle that exhibit differential binding properties for the TnI SURE enhancer. In particular, HLH domain 6 encoded by exons 26-28 and, to a lesser extent, sequences encoded by exon 23 modulate their affinity for the BLM.
Homomerization of GTF3 Polypeptides-The amino-terminal LZ domain in TFII-I is required for homomeric interactions (30). Given the conservation of this domain in GTF3 and TFII-I, we asked whether the LZ in GTF3 functions as a dimerization motif as well. To this end, we tested the ability of affinity-tagged GTF3 baits to pull down untagged GTF3 proteins in co-immunoprecipitation assays. We co-transfected HEK293 cells with constructs expressing carboxyl-terminally truncated GTF3 (mGTF3β.Δ3-6) and either full-length mGTF3β or a mutant merely lacking the LZ motif (Fig. 5A). The epitope-tagged bait protein was either mGTF3β.Δ3-6 or full-length mGTF3 (Fig. 5B). The expression of bait and prey proteins was confirmed in direct Western blots of nuclear protein extracts prepared from transfected cells. Bait proteins were immunoprecipitated with an antibody directed against the Pk epitope (37), and pull-downs were probed with antibodies against GTF3 (top row) and Pk (bottom row). As shown in Fig. 5B, the untagged full-length mGTF3β was readily co-immunoprecipitated along with the mGTF3β.Δ3-6(Pk) bait protein, indicating that they formed a stable complex. Conversely, when the affinity tag was added on to full-length GTF3, this protein effectively co-precipitated untagged mGTF3β.Δ3-6 (right). In contrast, the ΔLZ mutant was not pulled down using the mGTF3β.Δ3-6(Pk) bait protein. Taken together, these results demonstrate that GTF3β polypeptides interact with each other via the leucine zipper domain.
We next sought to confirm this interaction in myoblasts using immunofluorescence cytochemistry. We speculated that the bait protein mGTF3β.Δ3-6, which lacks a bona fide nuclear localization signal located near the very carboxyl terminus of GTF3 (26), would remain in the cytosol unless co-expressed with full-length GTF3. The underlying assumption was that the truncated bait protein would be translocated to the nuclear compartment through association with full-length GTF3. As shown in Fig. 5C, double labeling with Hoechst 33258 demonstrated that both full-length GTF3β and GTF3β.ΔLZ predominantly located to the nucleus, although on occasion cells were found that showed cytoplasmic staining (a and a′; b and b′). In contrast, mGTF3β.Δ3-6 protein distributed evenly between cytosol and the nucleus (c and c′). Next, cells were co-transfected with Pk-tagged mGTF3β.Δ3-6 and either untagged full-length mGTF3β or mGTF3β.ΔLZ (panels d and e). In agreement with results from co-immunoprecipitation experiments (see above), mGTF3β.Δ3-6(Pk) was predominantly located in the nucleus of most transfected cells when co-expressed with full-length GTF3β (d and d′), but not with GTF3β.ΔLZ (e and e′). Substituting mGTF3α and -γ isoforms for mGTF3β in this assay, as well as in the pull-down assay described above, yielded similar results (data not shown). In conclusion, both biochemical and in situ evidence indicate that GTF3 polypeptides can interact with each other, and that this homomeric affinity can be attributed to the amino-terminal leucine zipper motif.

FIG. 4. ... 2-7) as well as amino-terminal deletion mutants (Δ1-3; lanes 8-13) were added as indicated. To allow for direct comparison of binding avidities, the amount of protein added to each reaction was normalized among full-length proteins and among truncated mutants, respectively. The black arrowhead indicates weak but specific complexes obtained with full-length GTF3γ. Gray arrowheads indicate complexes in lanes 8, 9, and 11 that run aberrantly compared with adjacent lanes. Open arrowheads indicate complexes that resulted from nonspecific interactions between the probe and components of the reticulocyte lysate.

FIG. 5. The leucine zipper motif is a homomerization domain. A, schematic overview of mouse GTF3 proteins used to study the role of the leucine zipper. Respective expression constructs were based on the cDNA for GTF3β. Functional domains are depicted as in Fig. 1. An amino-terminal Pk epitope was added on to full-length mGTF3β and mGTF3β.Δ3-6 constructs to allow for co-immunoprecipitation of GTF3 protein complexes and for immunofluorescence detection using the Pk antibody (see below). B, co-immunoprecipitation of full-length and partial mouse GTF3 proteins expressed in HEK293 cells. Affinity-tagged proteins (Bait) were either mGTF3β.Δ3-6 (left and center) or full-length mGTF3β
Analysis of GTF3 Interactions with TFII-I-The conservation of the LZ domain in GTF3 and TFII-I raises the question whether these two proteins can associate to form heteromeric complexes. Again, we utilized a co-immunoprecipitation assay to test whether the bait protein mGTF3β.Δ3-6(Pk) can pull down TFII-I (Δ-isoform) in HEK293 cells co-transfected with constructs expressing these proteins (Fig. 6). In contrast to previous experiments showing interaction between the bait and GTF3 proteins, we were unable to immunoprecipitate TFII-I with mGTF3β.Δ3-6(Pk). This result indicates that GTF3 and TFII-I do not form stable complexes that can be immunologically purified.
Based on experiments that utilized heterologous expression of both proteins in COS-7 cells, it was recently proposed that the presence of GTF3 excludes TFII-I from the nucleus (39). The lack of a detectable interaction between GTF3 and TFII-I in co-immunoprecipitation experiments would be in general agreement with this notion. To test whether nuclear exclusion of TFII-I occurs in myogenic cells, we performed double immunofluorescence cytochemistry for endogenous GTF3 and TFII-I in C2C8 myocytes. Fig. 7 shows that both proteins predominantly locate to the nucleus. Nuclear residency of TFII-I (b) appeared to be more pronounced than that of GTF3 (a). However, it is unclear whether cytoplasmic staining seen with the GTF3 antibody represents background because heterologous epitope-tagged GTF3 detected with the Pk antibody is confined to the nuclear compartment (see Fig. 5C). The co-localization of GTF3 and TFII-I is demonstrated by overlaying both images (c). Similar results were obtained with terminally differentiated multinuclear C2C8 myotubes, and with non-myogenic cell lines such as HEK293 and 3T3 (data not shown). In conclusion, our data indicate that GTF3 and TFII-I represent separate entities that, in the cells and conditions tested, do not exhibit a strong affinity for each other, and that both factors possess distinct DNA binding properties (see above). However, this statement does not preclude the possibility that GTF3 and TFII-I indirectly interact with each other to synergistically regulate gene transcription.

DISCUSSION
Because of the potential importance of GTF3 and TFII-I for the pathology of WS, and the suggested role of GTF3 in regulating slow fiber-specific gene expression, a better understanding of the biochemical properties of GTF3 and its functional relation to TFII-I is necessary. This study focused on three aspects of GTF3 biochemistry: (i) location of the DNA binding domain, (ii) expression of GTF3 isoforms in muscle and analysis of their DNA binding properties, and (iii) assessment of potential homo- and heteromeric interactions between GTF3 proteins and TFII-I.
(right). Untagged GTF3 proteins (Prey) were full-length mGTF3β (left), mGTF3β.ΔLZ (center), or mGTF3β.Δ3-6 (right). 100 μg of nuclear proteins were immunoprecipitated using the Pk antibody and probed with a polyclonal GTF3 antibody (top panel). A second identical blot was probed with anti-Pk antibody (bottom panel). The presence of bait and prey proteins in nuclear extracts was confirmed by direct Western blotting (W) using 10 μg of nuclear protein. Immunoprecipitated GTF3 proteins are shown as indicated (IP). C, nuclear translocation in C2C8 myoblasts of nuclear localization signal-deficient mGTF3β.Δ3-6 protein by co-expression of full-length mGTF3. Cellular distribution of affinity-tagged proteins mGTF3β (a, a′), mGTF3β.ΔLZ (b, b′) or mGTF3β.Δ3-6 (c, c′) is shown in top panel. Cells co-transfected with constructs expressing affinity-tagged mGTF3β.Δ3-6 and untagged mGTF3β (d, d′) or mGTF3β.ΔLZ (e, e′) are shown in bottom panel. All cells were treated with anti-Pk antibody, detected with an Alexa 488 secondary antibody (a-e) and stained with nuclear dye Hoechst 33258 to visualize nuclei (overlay, a′-e′).

FIG. 6. TFII-I does not co-immunoprecipitate with GTF3. HEK293 cells were co-transfected with constructs expressing human TFII-I as prey and Pk-tagged mouse mGTF3β.Δ3-6 as bait. Nuclear extracts were prepared and immunoprecipitated as described in

GTF3 Binds to the BLM via HLH Domain 4-We have mapped the region in GTF3 that interacts with the TnIs BLM to HLH domain 4. R4 is necessary and sufficient for DNA binding because mutant GTF3 proteins that lack this sequence fail to bind the BLM, and a protein fragment that merely encompasses the R4 domain is sufficient to mediate this interaction. This result is consistent with data from our previous work demonstrating that GTF3 binds to the TnIs enhancer through an area located in its carboxyl-terminal half (11). Interestingly, R4 is not preceded by a sequence that would conform to the consensus for a basic domain as delineated by Atchley et al. (K/R_88% - K/R_94% - (X)_4 - E_93% - K/R_95% - X - R_91% - X, where the subscripted percentages give the frequency of the indicated residue at that position; Ref. 40). Thus, it seems unlikely that the DNA binding domain represents a classical bHLH motif. Because the BLM does not resemble an E-box element (CANNTG) either, we speculate that the mode of GTF3 binding to the BLM does not conform to classical bHLH/E-box interactions. Site-directed mutagenesis of R4 and resolution of the three-dimensional structure of the protein-DNA complex will be required to understand the structural basis of this interaction. Unlike GTF3, TFII-I binds to target sequences via a bHLH motif located in R2 (30). The corresponding repeat in GTF3 lacks a basic domain and is not required for interaction with the BLM (see above). Dot matrix analyses revealed that R2 in TFII-I is most homologous to R3 in GTF3 and not R4 (data not shown), arguing against the idea that DNA binding domains in both proteins were swapped during evolution. It therefore appears that GTF3 and TFII-I bind distinct DNA elements via different domains, despite their close structural relationship. In support of this notion, we were unable to produce a gel shift with GTF3 proteins and the initiator element of the adenovirus major late promoter, a well characterized target sequence for TFII-I (41), or to use the adenovirus major late promoter oligonucleotide to compete for complex formation between hGTF3Δ1-3 and the BLM (data not shown). Additionally, our previous yeast one-hybrid screens of human skeletal muscle cDNA libraries that used the TnIs BLM as bait never yielded TFII-I encoding clones, even though TFII-I is clearly expressed in skeletal muscle tissues (29).
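The two sequence criteria invoked in this argument, the E-box core (CANNTG) and the basic-region consensus preceding a classical HLH domain, can be written as simple pattern checks. The following is a minimal sketch, not the analysis used in this study: the function names are hypothetical, the basic-region consensus is probabilistic (roughly 88-95% per position) and is reduced here to a strict pattern purely for illustration, and the example sequences are invented rather than taken from GTF3 or the TnIs enhancer.

```python
import re

# E-box core as given in the text: CANNTG (N = any nucleotide).
# The core is its own reverse complement, so scanning one strand suffices.
EBOX = re.compile(r"CA[ACGT]{2}TG")

# Strict simplification (an assumption) of the basic-region consensus
# K/R - K/R - (X)4 - E - K/R - X - R - X. The published consensus gives
# per-position frequencies, which this pattern ignores.
BASIC_REGION = re.compile(r"[KR][KR].{4}E[KR].R.")


def has_ebox(dna: str) -> bool:
    """Return True if the DNA sequence contains a CANNTG core."""
    return EBOX.search(dna.upper()) is not None


def has_basic_region(peptide: str) -> bool:
    """Return True if the peptide contains the simplified basic-region pattern."""
    return BASIC_REGION.search(peptide.upper()) is not None


if __name__ == "__main__":
    # Invented sequences, used only to exercise the patterns.
    print(has_ebox("ttCAGCTGaa"))
    print(has_ebox("TTAATCCC"))
    print(has_basic_region("AARRAAAAEKARAQQ"))
```

A strict pattern of this kind will miss genuine basic regions that deviate at a single position, so a negative result is only suggestive; a position-weighted score would be closer to how the consensus was derived.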
Mapping of the DNA binding domain in GTF3 also revealed a pronounced increase (~10-fold) in avidity of protein-DNA interaction upon removal of amino-terminal sequences, suggesting that this region somehow impedes the ability of R4 to bind DNA. In agreement with this observation, none of the six independent GTF3 clones we isolated previously from the yeast one-hybrid screen contained the entire open reading frame but rather lacked 300 bp or more from the 5′ coding region (11). In addition, Bayarsaihan et al. (27) isolated a 5′-truncated GTF3 clone that interacts with the early enhancer of the Hoxc8 gene from a yeast one-hybrid screen of a mouse embryo library. In vitro translated full-length TFII-I did not shift the adenovirus major late promoter oligonucleotide in our conditions (data not shown). Other groups have used highly purified proteins from transfected eukaryotic cells or E. coli to obtain complexes between TFII-I and its target sequences (see, for example, Ref. 29). Thus, it is possible that both GTF3 and TFII-I per se exhibit modest affinities for their target sequences, and that concentrations of in vitro translated proteins in the reticulocyte lysate are too low to drive efficient protein-DNA complex formation. It is therefore conceivable that DNA binding of GTF3 is conditional and requires a conformational modification of the protein (e.g. post-translational modification or binding of other proteins), or the disruption of GTF3 protein complexes associated via the leucine zipper (see below). NH2- and COOH-terminal domains with autoinhibitory properties have been identified in other transcription factors such as Ets-1, Smad2/Smad4, and Nkx2.5 (42-44). They regulate transcription factor function through inhibition of DNA binding (as in Ets-1) or transactivation (as in Smad2/Smad4), or both (as in Nkx2.5). A more detailed analysis of the interaction between amino-terminal sequences and the DNA binding domain R4 in GTF3 will be necessary to test these different scenarios.
Muscle and Non-muscle Cells Express Multiple GTF3 Splice Isoforms-The existence of multiple GTF3 isoforms in rodent tissue had been reported previously (31), but their expression in different tissues and during development was not investigated. Because our interest in GTF3 stems from its proposed role in fiber type-specific gene expression, we explored by RT-PCR whether isoforms other than GTF3β, which in preliminary EMSA experiments bound very poorly if at all to the TnIs BLM, were expressed in skeletal muscle. Using oligonucleotides that flank the sequences that are alternatively spliced in GTF3α and -γ isoforms, we demonstrated their expression in all muscles analyzed, including fast- and slow-twitch muscles like the EDL and the soleus, respectively. In addition, we also identified three novel splice isoforms, GTF3α2, -α3, and -γ2, that arise from differential usage of exon 23 and exons 26-28. Because our screen was not designed to exhaustively identify all splice variants generated from the GTF3 locus, it is likely that additional splice variants will be identified in the future. Moreover, although all previously known splice variants were reported to share sequences upstream of the BstEII site we used to clone the PCR fragments into the parental mGTF3β expression vector, we cannot rule out that splice events upstream of the analyzed sequence contribute to GTF3 isoform variability. The banding patterns of PCR products generated with our primer set revealed a quite stereotypical isoform expression profile in developing and adult skeletal muscles, and in non-muscle tissues. A quantitative analysis of the relative abundance of GTF3α/γ splice isoforms in soleus and the EDL muscles corroborated that their distribution is quite similar in slow and fast fiber types. This finding argues against the idea that the differential expression of distinct isoforms in different fiber types may contribute to the slow fiber type-specific expression of the TnIs gene. However, as discussed above, this statement does not preclude the possibility that other yet unidentified isoforms may exert this function.
No human splice isoforms have been identified to date that contain sequences corresponding to mouse exons 23 and 26-28. By aligning the region of the mouse GTF3 gene that harbors the alternatively spliced exon 23 with the corresponding region of the human locus, we were unable to detect sequences that resemble mouse exon 23 or flanking intronic segments. Similarly, no evidence was found for a duplication of human GTF3 exons 23-25 (encoding R5) that would suggest the existence of human splice variants that contain a sixth HLH domain. We therefore speculate that these exons were fairly recently added to the mouse gene or lost from the human locus.
GTF3 Splice Isoforms Exhibit Distinct DNA Binding Properties-In EMSAs, full-length mouse GTF3 splice variants bound poorly if at all to the BLM. This result is in agreement with the low affinity of human full-length GTF3 for the BLM in vitro. A surprising variability in their affinity for the BLM was revealed after ablation of amino-terminal sequences. GTF3γ, lacking both exon 23 and exons 26-28, binds most avidly. It is followed by GTF3γ2, which lacks exons 26-28 but includes exon 23, and GTF3α3, which contains exons 26-28 but lacks exon 23. No detectable binding to the BLM was obtained with GTF3α, -α2, and -β isoforms, which contain both exon 23 and exons 26-28. Human GTF3 proteins are most closely resembled by mouse GTF3γ, which is in agreement with the fact that both Δ1-3 variants derived from these proteins bound avidly to the BLM. Mouse exons 26-28 encode HLH domain 6, which is identical in sequence to HLH domain 5 (27). It is therefore possible that the close proximity of a second HLH domain to the DNA binding domain R4 destabilizes DNA-protein interaction or masks the DNA-binding surface. Exon 23 codes for 27 amino acids that are located COOH-terminal to R4. Because 75% of these residues are charged or polar, it is likely that this region is exposed on the surface of the protein. No recognizable motifs were detected in a ScanProsite protein pattern search. One possible effect of these residues may be that they destabilize protein-DNA interaction by altering the relative orientation of flanking domains including R4.
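The relationship between exon usage and BLM binding described above can be summarized as a small lookup table. The sketch below is illustrative only: the class and function names are hypothetical, the binding labels are qualitative paraphrases of the EMSA results for the amino-terminally truncated proteins (not measured affinities), and the relative order of GTF3γ2 and GTF3α3 is not stated precisely in the text, so both are simply labeled as weaker than GTF3γ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Isoform:
    name: str
    has_exon23: bool
    has_exons26_28: bool
    binding: str  # qualitative EMSA outcome for the truncated protein


# Exon composition and binding outcome as described in the text.
ISOFORMS = [
    Isoform("gamma", False, False, "strongest"),
    Isoform("gamma2", True, False, "weaker"),
    Isoform("alpha3", False, True, "weaker"),
    Isoform("alpha", True, True, "none detected"),
    Isoform("alpha2", True, True, "none detected"),
    Isoform("beta", True, True, "none detected"),
]


def detectable_binders() -> list:
    """Isoforms for which any BLM binding was reported."""
    return [i for i in ISOFORMS if i.binding != "none detected"]


def non_binders_have_both_exon_sets() -> bool:
    """Check that every non-binding isoform includes exon 23 and exons 26-28."""
    return all(
        i.has_exon23 and i.has_exons26_28
        for i in ISOFORMS
        if i.binding == "none detected"
    )


if __name__ == "__main__":
    for iso in ISOFORMS:
        print(iso.name, iso.has_exon23, iso.has_exons26_28, iso.binding)
```

Laid out this way, the pattern in the data is that detectable binding was lost only when both exon 23 and exons 26-28 were present, whereas inclusion of either one alone reduced but did not abolish binding.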
The presence of multiple exon configurations in the vicinity of the DNA binding domain could conceivably generate GTF3 proteins with different DNA sequence preferences, and splice variants that interact poorly with the BLM may bind other DNA targets more efficiently. In this context, it will be interesting to compare the relative binding affinities of mouse GTF3 splice isoforms between the TnI SURE and different bona fide GTF3 target sequences, such as the early enhancer of the Hoxc8 gene (27). Alternatively, as discussed for the role of amino-terminal GTF3 sequences, access of weakly binding isoforms to DNA may be regulated and may require prior association of other factors or post-translational modification.
The Leucine Zipper Is Important for GTF3 Homomerization-We have demonstrated both in vitro using co-immunoprecipitation and in vivo using immunofluorescence cytochemistry that the amino-terminal LZ domain functions as a homomerization motif. This result is in agreement with the demonstration that the LZ in TFII-I is required for dimer formation as well (30). Interestingly, we observed no stable interaction between GTF3 and TFII-I, suggesting that either their respective LZ domains are structurally too diverse or that sequences outside the LZ interfere with heteromerization. It is also possible that TFII-I splice variants other than the one we used in our experiments (TFII-IΔ) can interact with GTF3, although the immediate area flanking the LZ is invariable in these isoforms (45). The lack of a direct interaction between GTF3 and TFII-I would be in general agreement with a recently proposed model in which GTF3 expression negatively regulates the nuclear residency of TFII-I by competing for a limiting factor that is required by both GTF3 and TFII-I to translocate to the nucleus (39). However, in C2C8 myoblasts and myotubes, endogenous GTF3 and TFII-I proteins co-reside in the nucleus. In contrast, mature rat primary myotube cultures show a nuclear exclusion of GTF3 protein (data not shown). This phenomenon is currently under investigation because it might provide insights into how GTF3 functions as a transcription factor in skeletal muscle. Taken together, we do not find evidence in myogenic cells that supports the notion of a competition between GTF3 and TFII-I for nuclear residency.
In conclusion, GTF3 emerges as a transcription factor with interesting properties including the location and sequence features of the DNA binding domain, as well as the potential autoinhibition by NH2-terminal and possibly COOH-terminal domains. Future studies will have to focus on the regulation of DNA binding and transcription effector functions of GTF3 in skeletal muscle and other tissues to better understand its role in fiber type-specific gene expression and its possible involvement in the pathology of Williams syndrome.