Identification of a Highly Conserved Module in E Proteins Required for in Vivo Helix-loop-helix Dimerization*

Basic helix-loop-helix (bHLH) transcription factors often function as heterodimeric complexes consisting of a tissue-specific factor such as SCL/tal or MyoD bound to a broadly expressed E protein. bHLH dimerization therefore appears to represent a key regulatory step in cell lineage determination and oncogenesis. Previous functional and structural studies have indicated that the well defined HLH domain is both necessary and sufficient for dimerization. Most of these studies, however, have employed in vitro systems for analysis of HLH dimerization, and their implications for the requirements for in vivo dimerization remain unclear.

Using multiple approaches, we have analyzed bHLH dimerization in intact, living cells and have identified a novel domain in E proteins, domain C, which is required for in vivo dimerization. Domain C, which lies just carboxyl-terminal to helix 2 of the HLH domain, represents the most highly conserved region within E proteins and appears to influence the in vivo conformation of the adjacent HLH domain. These results suggest that HLH dimerization in vivo may represent a complex, regulated process that is distinct from HLH dimerization in vitro.
For several classes of transcription factors, such as members of the basic helix-loop-helix, leucine zipper, and nuclear receptor families, dimerization represents a key, obligatory step prior to DNA binding and transcriptional activation. This dimerization permits the mixing and matching of factors with different DNA half-site binding specificities and expands the repertoire of potential target sequences that may be recognized. Basic helix-loop-helix (bHLH) factors in metazoan organisms often regulate the expression of target genes as heterodimeric complexes between tissue-specific factors, such as SCL/tal or MyoD, and broadly expressed E proteins (1-3). The composition of bHLH complexes is dictated in part by the dimerization specificities of the constituents; in particular, tissue-specific bHLH factors tend not to interact with one another but rather to bind universally to E protein partners (4, 5). Likewise, the dominant negative HLH proteins of the Id family exert their inhibitory effect through preferential binding to E proteins (6-8). Thus a progenitor cell during embryogenesis may contain an array of different cell lineage-specific bHLH factors all competing with one another, as well as with inhibitory HLH proteins, for dimerization with a common E protein partner. Such competition would allow the progenitor cell to make mutually exclusive, binary decisions with regard to lineage commitment, proliferation, and terminal differentiation (2).
The structural basis for bHLH dimerization, according to x-ray crystallography, resides in the formation of a parallel, four-helix bundle in which dimerization contacts derive from conserved hydrophobic residues clustered within a shielded core (9-11). However, the crystallographic structures have all been obtained with pre-formed, DNA-bound complexes and provide no information on the transition states that occur in the process of dimerization in solution. Furthermore, the data from the crystal structures provide no satisfactory explanation for dimerization specificities: all HLH factors possess similar hydrophobic dimerization contact residues, and yet tissue-specific bHLH factors show highly restricted dimerization specificities while E proteins demonstrate considerable promiscuity in dimerization. In vitro biochemical studies suggest that nonconserved hydrophilic residues may somehow contribute to HLH dimerization specificity (12). As an additional complication, in vitro dimerization of bHLH factors does not appear to reflect accurately the dimerization process in vivo. In most analyses, in vitro bHLH dimerization, whether with crude extracts or with purified proteins, is a highly inefficient process which requires subphysiologic temperatures and displays affinities in the micromolar range (13-16). Studies of in vivo dimerization portray a process that appears to be extremely efficient, with rapid heterodimerization occurring just prior to nuclear localization (17, 18).
To study factors influencing dimerization in vivo, we have exploited a number of established systems applicable to bHLH dimerization, including yeast two-hybrid, mammalian two-hybrid, nuclear redirection assays, and coimmunoprecipitation (4, 17-20). We found that, contrary to findings in vitro, bHLH domains are not sufficient for dimerization in vivo. In particular, E proteins require an additional highly conserved domain, domain C, located carboxyl-terminal to helix 2. Tissue-specific bHLH factors, by contrast, require only the bHLH domain for heterodimerization with E proteins. Therefore, intrinsic structural differences exist between E proteins and tissue-specific bHLH factors, differences which may explain their different dimerization specificities. In particular, the role of domain C appears to be as an in vivo conformational determinant, maintaining the bHLH domain of E proteins in a "receptive" conformation for heterodimerization with tissue-specific bHLH proteins.

MATERIALS AND METHODS
Plasmid Constructions-Expression of LexA fusion proteins in yeast employed the vector pEG202, kindly provided by Dr. Roger Brent (Massachusetts General Hospital, Boston, MA) (21). Expression of B42 activation domain fusion proteins in yeast employed the vector pJG45, also kindly provided by Dr. Roger Brent. PCR upon plasmid templates was used to generate DNA fragments encoding the following: the bHLH domain of SCL/tal (amino acids 186-242), the 20-kDa naturally occurring isoform of SCL/tal (amino acids 176-331) (22), the bHLH domain of MyoD (amino acids 108-163), the full-length coding region of MyoD, and the various truncation mutants of E2-2 (encoding amino acids 467-588, 484-541, 467-541, 484-588, 484-563, and 484-548). PCR fragments were cloned in-frame into pEG202 and pJG45 as EcoRI-XhoI fragments. pJG-Id1 and pJG-Id2, yeast expression plasmids encoding Id1 and Id2 as B42 fusion proteins, were previously isolated from a HeLa cDNA library in the vector pJG45 (library provided by Brent laboratory). For mammalian expression of the VP16 activation domain alone or fused to SCL/tal (amino acids 176-331), we used the plasmids pVP-HA1 and pVP16-TAL1, respectively, both generously provided by Dr. Richard Baer (University of Texas Southwestern Medical Center, Dallas, TX) (4). For mammalian expression of the GAL4 DNA-binding domain fusion proteins, our laboratory generated a parent vector pCMV-DB which contains the GAL4 DNA-binding domain downstream of the CMV immediate early promoter. The starting vector consisted of pCMV5 (23) from which the EcoRI site had been eliminated, yielding pCMV5 R-. A HindIII-PstI fragment encoding the GAL4 DNA-binding domain was released from the yeast expression vector pGBT9 (CLONTECH, Palo Alto, CA) and ligated into the corresponding sites in pCMV5 R-.
E2-2 fragments (encoding amino acids 484-541 and 484-563) with EcoRI-XhoI ends were cloned into EcoRI-SalI sites of pCMV-DB, yielding plasmids with in-frame GAL4-E2-2 fusions, pCMV-DB-E2-2 bHLH, and pCMV-DB-E2-2 C. The G5E1bLUC reporter plasmid, with 5 GAL4-binding sites upstream of the E1b TATA sequence followed by the luciferase reporter gene, was generously provided by Dr. Richard Baer and has been described elsewhere (4). pEMSV-MyoD, kindly provided by the laboratory of Dr. Harold Weintraub (Fred Hutchinson Cancer Research Institute, Seattle, WA), was used for mammalian expression of the full-length MyoD protein with its own activation domain. The mammalian expression plasmid for the nuclear localization-deficient mutant of SCL/tal, pCMB, has been previously described (24). For bacterial expression of MBP and GST fusion proteins, SCL/tal (encoding amino acids 176-331) and E2-2 (encoding amino acids 484-541 or 484-563) were cloned as EcoRI-XhoI fragments into pMAL-c2 (New England Biolabs, Beverly, MA) and pGEX4T-1 (Pharmacia Biotech, Piscataway, NJ).
Yeast Two-hybrid Techniques-The yeast two-hybrid system developed in the laboratory of Roger Brent was employed as we have previously described (21, 25). The yeast strains (EGY48 and YPH499), mating protocols, library screening protocols, and β-galactosidase assays have all been described in an earlier publication (25). For Western blot analysis of yeast expression of LexA fusion proteins, equivalent quantities of yeast grown to mid-log phase in CM-URA, -HIS, and -TRP media with 2% galactose, 1% raffinose were resuspended in SDS-PAGE loading buffer and boiled. Resultant Western blot membranes were probed with a rabbit polyclonal antibody to LexA, provided by Dr. Erica Golemis (Fox Chase Cancer Center, Philadelphia, PA).
Library of Randomly Mutated SCL/tal-Error-prone PCR amplification of the bHLH encoding region of SCL/tal included 0.25 mM MnCl₂ in a standard PCR reaction (with 2.5 mM MgCl₂ and 200 µM dNTPs). After 30 cycles of mutagenic amplification, 0.1 µl out of a 100-µl reaction was subjected to a second 30 cycles of mutagenic amplification. Similarly, 0.1 µl of the latter reaction was subjected to a third round of 30 cycles of mutagenic amplification. Equal quantities of PCR products resulting from 30, 60, and 90 cycles of mutagenic amplification were pooled and cloned into the EcoRI-XhoI sites of pJG45. A library of 2 × 10⁶ primary bacterial colonies was thereby generated. A plasmid preparation of this library was then transformed into the yeast strain YPH499, yielding 1 × 10⁶ primary yeast colonies.
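The arithmetic of the serial mutagenesis scheme can be sketched as follows. This is only an illustrative calculation using the volumes, cycle numbers, and colony counts stated above; the function names are hypothetical and nothing here models the mutation rate itself.

```python
# Bookkeeping for the three-round error-prone PCR scheme described in the text.
# All numeric constants come from the Methods; function names are illustrative.
CYCLES_PER_ROUND = 30
ROUNDS = 3
CARRYOVER_UL = 0.1    # volume carried into the next round (µl)
REACTION_UL = 100.0   # volume of each PCR reaction (µl)


def cumulative_cycles(rounds=ROUNDS, per_round=CYCLES_PER_ROUND):
    """Total mutagenic cycles experienced by the product of each round."""
    return [per_round * (i + 1) for i in range(rounds)]


def carryover_fraction(carry_ul=CARRYOVER_UL, reaction_ul=REACTION_UL):
    """Fraction of one reaction used to seed the next round."""
    return carry_ul / reaction_ul


def yeast_to_bacterial_ratio(bacterial=2e6, yeast=1e6):
    """Primary yeast colonies obtained per primary bacterial colony."""
    return yeast / bacterial


if __name__ == "__main__":
    print("cycles per pooled product:", cumulative_cycles())
    print("carryover fraction per round:", carryover_fraction())
    print("yeast/bacterial colony ratio:", yeast_to_bacterial_ratio())
```

Each round is therefore seeded with one-thousandth of the previous reaction, and the pooled products have undergone 30, 60, and 90 cycles.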
Mammalian Two-hybrid Assays-K562 cells in mid-log phase were resuspended at a concentration of 1 × 10⁶ cells/ml in RPMI 1640 media with 5% fetal bovine serum. For each transfection, 1.6 × 10⁶ cells were combined with 6 µg of DNA and 30 µl of DOTAP (Boehringer Mannheim, Indianapolis, IN) premixed in 400 µl of Hepes-buffered saline. The 6 µg of DNA consisted of 2 µg of GAL4 expression plasmid (pCMV-DB, pCMV-DB-E2-2 bHLH, or pCMV-DB-E2-2 C), 2 µg of activation domain expression plasmid (pVP-HA1, pVP16-TAL1, or pEMSV-MyoD), and 2 µg of the G5E1bLUC luciferase reporter plasmid. After overnight incubation, the cells were resuspended in fresh RPMI 1640 with 10% fetal bovine serum and cultured for an additional 24 h. To assay cells for luciferase activity, the Luciferase Assay System kit (Promega, Madison, WI) was used, following the manufacturer's recommendations. In all cases, equivalent numbers of cells were harvested for luciferase assays.
Immunofluorescent Nuclear Redirection Assay-COS 7 cells were seeded on glass coverslips in RPMI 1640 with 5% fetal bovine serum at a density of 4 × 10⁵ cells per 22-mm² coverslip. Transfections were carried out overnight with 5 µg of plasmids and 25 µl of DOTAP (Boehringer Mannheim) per coverslip. The plasmids consisted of 2.5 µg of pCMB plus 2.5 µg of GAL4 expression plasmid (pCMV-DB, pCMV-DB-E2-2 bHLH, or pCMV-DB-E2-2 C). Cells were incubated in fresh RPMI 1640 with 10% fetal bovine serum for 72 h prior to fixation. The protocols for cell fixation and indirect immunofluorescent staining for SCL/tal have been previously described (17). The cells were visualized on an MRC-600 confocal laser scanning imaging system (Bio-Rad Molecular Bioscience Group, Hercules, CA).
Coimmunoprecipitation Assay-COS 7 cells grown to ~60% confluency in 75-cm² flasks were transfected with 5 µg of pCMV-SCL/tal, an expression vector for full-length SCL/tal protein (amino acids 1-331) (24). In addition, the cells received 5 µg of either pCMV-DB-E2-2 bHLH or pCMV-DB-E2-2 C. Transfections were accomplished overnight in Dulbecco's modified Eagle's medium with 5% neonatal calf serum using 50 µg of DOTAP (Boehringer Mannheim) plus 10 µg of plasmid per flask. After supplying the cells with fresh media consisting of Dulbecco's modified Eagle's medium with 10% neonatal calf serum, the cells were incubated for an additional 4 days prior to harvesting. For harvesting, cells were gently scraped in room temperature phosphate-buffered saline with 5 mM EDTA. Cell pellets were then resuspended in 300 µl of ice-cold NETN (20 mM Tris-HCl, pH 8.0, 100 mM NaCl, 1 mM EDTA, 0.5% Nonidet P-40, 2 µg/ml leupeptin, 2 µg/ml pepstatin, 0.2% aprotinin, and 200 µM phenylmethylsulfonyl fluoride). After a 10-min incubation on ice with intermittent inversion, insoluble cellular debris was eliminated by pelleting. 50-µl portions of the extracts were saved for direct immunoblot analysis. To the remaining 250 µl of extracts, 0.5 µg of rabbit anti-GAL4 DNA-binding domain antibody (number sc-577, Santa Cruz Biotechnology, Santa Cruz, CA) was added, followed by incubation on ice for 45 min with intermittent inversion. A 10-µl packed volume of protein A-agarose beads, prewashed in phosphate-buffered saline with 1% bovine serum albumin, was then added to each tube, followed by rotation at 4°C for 30 min. The protein A-agarose beads were then washed 4 times with ice-cold NETN and resuspended in 50 µl of SDS-PAGE loading buffer.
Immunoprecipitates and crude cellular extracts were then subjected to SDS-PAGE on 12% gels followed by electrotransfer to nitrocellulose membranes. Immunoblots were probed with the BTL-73 mouse monoclonal anti-SCL/tal antibody that was kindly provided by Karen Pulford (Oxford, United Kingdom) (26). BTL-73 was used as a one-half dilution of a tissue culture supernatant. In addition, crude extracts from COS cell transfectants were probed in parallel with the rabbit anti-GAL4 antibody at a 1/500 dilution (0.2 ng/ml). Western blots were otherwise carried out as described previously (24).

RESULTS
E2-2, but Not SCL/tal or MyoD, Requires Sequence Outside the bHLH Domain for in Vivo Dimerization-The minimal bHLH domains of E2-2, SCL/tal, and MyoD have been well defined by alignment analysis of a broad range of bHLH proteins from a wide variety of organisms (28). To assay interaction of minimal bHLH domains in the yeast two-hybrid system, the amino acid sequences indicated in Fig. 1 were expressed in yeast as fusions with either the LexA DNA-binding domain or the B42 transcription activation domain. Interaction within yeast between coexpressed LexA fusions and B42 fusions was reflected by activation of a β-galactosidase reporter gene containing upstream LexA-binding sites. As shown in Fig. 2, LexA fusions with the minimal bHLH domains of MyoD and SCL/tal (LexA-MyoD bHLH and LexA-SCL/tal bHLH) displayed specific interaction with a larger fragment of E2-2 encompassing the carboxyl-terminal 121 amino acids (E2-2 467-588) fused to the B42 activation domain. Surprisingly, LexA-MyoD bHLH and LexA-SCL/tal bHLH manifested no interaction with the minimal bHLH domain of E2-2 (amino acids 484-541) fused to B42. As will be shown below, similar results were obtained with E2-2 as the LexA fusion component and were not attributable to poor expression of E2-2 bHLH fusion proteins in yeast. These data suggest that the structural requirements for dimerization in the yeast two-hybrid system differ for tissue-specific bHLH proteins as compared with E proteins, the former requiring only the minimal bHLH domain and the latter requiring additional sequence.
Requirement for Domain C for in Vivo Heterodimerization of E2-2-To identify additional sequence requirements for E2-2 heterodimerization, a number of E2-2 truncations were expressed in yeast as LexA fusion proteins (Fig. 3A). These LexA-E2-2 truncations were analyzed for interaction with B42 fusions containing the following HLH proteins: Id1, Id2, SCL/tal, and MyoD. As shown in Fig. 3B, domain C, a stretch of 22 amino acids carboxyl-terminal to the HLH domain (amino acids 541-563), is both sufficient and necessary for heterodimerization of the E2-2 bHLH domain with an array of HLH partners. Interestingly, domain A, an acidic region upstream of the bHLH domain, which has been implicated in selective heterodimerization of E12 with MyoD (29), had no influence on the heterodimerization of E2-2 with the various HLH partners.
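As a bookkeeping aid, the relationship between the E2-2 truncation constructs listed in Materials and Methods and the two regions of interest can be tabulated with a short script. This is a minimal sketch: the coordinates are those given in the text (minimal bHLH domain 484-541, domain C 541-563), while the counting convention (residues beyond position 541, which reproduces the stated length of 22) and all function names are assumptions made for illustration.

```python
# Coordinates taken from the text; names and counting convention are illustrative.
BHLH = (484, 541)       # minimal bHLH domain of E2-2
DOMAIN_C = (541, 563)   # domain C, described as 22 amino acids

# E2-2 truncation constructs listed in Materials and Methods (start, end).
CONSTRUCTS = [(467, 588), (484, 541), (467, 541),
              (484, 588), (484, 563), (484, 548)]


def spans_bhlh(start, end):
    """True if the construct covers the whole minimal bHLH domain."""
    return start <= BHLH[0] and end >= BHLH[1]


def domain_c_residues(start, end):
    """Number of domain C residues (beyond position 541) in the construct."""
    lo = max(start, DOMAIN_C[0])
    hi = min(end, DOMAIN_C[1])
    return max(0, hi - lo)


def summarize(constructs=CONSTRUCTS):
    """Map 'start-end' to (covers bHLH, domain C residues included)."""
    return {f"{s}-{e}": (spans_bhlh(s, e), domain_c_residues(s, e))
            for s, e in constructs}


if __name__ == "__main__":
    for name, (bhlh, c_len) in summarize().items():
        print(f"E2-2 {name}: bHLH={bhlh}, domain C residues={c_len}")
```

Under this convention, constructs ending at residue 541 contain none of domain C, those ending at 563 or 588 contain all 22 residues, and the 484-548 construct retains 7.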
Domain C Influences the Conformation of the bHLH Domain in E2-2-To study further the role of domain C in the heterodimerization of E2-2, a number of control experiments were performed. First, Western blot analysis of LexA-E2-2 fusion expression in yeast demonstrated insufficient differences in expression levels of the various truncation mutants to account for the differences in interaction patterns (Fig. 4A). For example, LexA-E2-2 467-541 (Fig. 4A, lane 3) showed the same levels of expression as LexA-E2-2 484 -563 (Fig. 4A, lane 5), but only the latter LexA fusion demonstrated heterodimerization with the various HLH partners (Fig. 3B). Second, we isolated two independent altered-specificity mutants of SCL/tal, SE1 and SS1, that exclusively recognized forms of E2-2 lacking domain C (Fig. 4B). These mutants were isolated from a library of randomly mutated SCL/tal bHLH domains using a yeast two-hybrid screen for interaction with the minimal bHLH domain of E2-2 fused to LexA. The selectivity of SCL/tal mutants SE1 and SS1 for E2-2 lacking domain C rules out any trivial explanations for the role of domain C, i.e. nuclear localization or permitting DNA binding by LexA. In fact, these data indicate that domain C influences the conformation of the E2-2 bHLH domain such that the presence of domain C permits exclusive heterodimerization with wild type HLH partners and the absence of domain C permits exclusive heterodimerization with the SCL/tal mutants SE1 and SS1.
Domain C Is Required for in Vivo Dimerization in Mammalian Cells-To extend the findings in the yeast two-hybrid system, three independent assays were employed to analyze in vivo HLH dimerization in mammalian cells. In the mammalian two-hybrid system, K562 cells were transiently co-transfected with vectors expressing fusions with the GAL4 DNA-binding domain and with the VP16 activation domain. Also included in the transfection was the G5E1bLUC reporter plasmid which contains GAL4-binding sites upstream of the luciferase gene. As has been previously described, interaction of GAL4 fusions with VP16 fusions activates expression of the luciferase reporter gene (30). As shown in Fig. 5A, a GAL4 fusion with the minimal E2-2 bHLH domain (E2-2 amino acids 484-541) showed no interaction above background with a VP16-SCL/tal fusion. However, inclusion of domain C in the GAL4-E2-2 fusion (E2-2 amino acids 484-563) permitted interaction with VP16-SCL/tal. In Fig. 5B, GAL4 fusions were coexpressed with full-length MyoD. Because MyoD has its own potent activation domains, it was not necessary to express MyoD as a VP16 fusion. As with VP16-SCL/tal, MyoD failed to interact with a GAL4 fusion with the minimal E2-2 bHLH domain (E2-2 amino acids 484-541). As predicted, inclusion of domain C in the GAL4-E2-2 fusion (E2-2 amino acids 484-563) restored the in vivo interaction with MyoD.
In the nuclear redirection assay, which has been previously described (17, 18), COS cells were transfected with an expression plasmid for a mutant SCL/tal which lacks a nuclear localization signal. In addition, the COS cells were cotransfected with expression plasmids for either E2-2 bHLH only or E2-2 bHLH with domain C, each fused to the GAL4 nuclear localization signal. Cells were then analyzed for subcellular localization of SCL/tal by indirect immunofluorescence with confocal microscopy. When coexpressed with only the GAL4 nuclear localization signal, the SCL/tal mutant showed predominantly cytoplasmic localization with perinuclear accumulation (Fig. 6A). When coexpressed with E2-2 bHLH with domain C (E2-2 amino acids 484-563), the SCL/tal mutant showed efficient nuclear localization as a result of its heterodimerization with E2-2, a process referred to as nuclear redirection (Fig. 6B). By contrast, no nuclear redirection of mutant SCL/tal was observed with coexpression of the minimal E2-2 bHLH domain (E2-2 amino acids 484-541), indicating an absence of heterodimerization (Fig. 6C).
Figure legend (fragment): … Fig. 1), and LexA fused to the minimal bHLH domain of SCL/tal (as depicted in Fig. 1). B42 fusions include: B42 fused to E2-2 467-588 which includes the bHLH domain as well as flanking amino acids, and B42 fused to the minimal bHLH domain of E2-2 (as depicted in Fig. 1). Liquid β-galactosidase assays were performed on three separate occasions. Results shown are mean ± S.E.
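The liquid β-galactosidase results are reported as the mean ± S.E. of three separate assays. The following is a minimal sketch of how such a summary is typically computed; it assumes the standard error is the sample standard deviation (n - 1 denominator) divided by the square root of n, which the text does not specify, and the input values are hypothetical placeholders rather than data from this study.

```python
import math
import statistics


def mean_and_se(values):
    """Return (mean, standard error of the mean) for replicate measurements.

    Assumes S.E. = sample standard deviation / sqrt(n); the source does not
    state which estimator was used.
    """
    n = len(values)
    if n < 2:
        raise ValueError("at least two replicates are needed for a standard error")
    mean = statistics.fmean(values)
    se = statistics.stdev(values) / math.sqrt(n)
    return mean, se


if __name__ == "__main__":
    # Hypothetical β-galactosidase units from three separate assays.
    example_units = [90.0, 100.0, 110.0]
    mean, se = mean_and_se(example_units)
    print(f"{mean:.1f} ± {se:.1f}")
```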
In coimmunoprecipitation assays (Fig. 7), COS cells were cotransfected with expression vectors for full-length SCL/tal (pCMV-SCL/tal) and either GAL4-E2-2 bHLH (pCMV-DB-E2-2 bHLH) or GAL4-E2-2 C (pCMV-DB-E2-2 C). Transfectants were subjected to low stringency immunoprecipitation with rabbit anti-GAL4 antibodies. Immune complexes were then analyzed by immunoblot with the BTL-73 monoclonal antibody specific for SCL/tal (26). As shown in Fig. 7, while no SCL/tal protein could be detected in complex with GAL4-E2-2 bHLH (lane 1), the 42-kDa full-length isoform of SCL/tal could be detected in complex with GAL4-E2-2 C (lane 2). As shown in the immunoblots in the lower panels of Fig. 7, crude extracts from both COS cell transfectants contained similar quantities of SCL/tal and GAL4-E2-2 proteins. The doublet observed on immunoblotting for SCL/tal has been previously described (24). Thus, three independent assay systems, mammalian two-hybrid, nuclear redirection, and coimmunoprecipitation, all confirm the requirement for domain C for in vivo heterodimerization of E2-2 in mammalian cells.
Domain C Does Not Enhance in Vitro Dimerization-To examine the role of domain C in a highly controlled system of in vitro dimerization, we employed surface plasmon resonance analysis of bacterially expressed, purified proteins. In the data shown in Fig. 8, GST-E2-2 bHLH (E2-2 amino acids 484-541) and GST-E2-2 C (E2-2 amino acids 484-563) immobilized at similar densities were employed as the solid phase ligands. The soluble analyte consisted of MBP-SCL/tal (SCL/tal amino acids 176-331) at a concentration of 4 µM. As shown in Fig. 8, a weak association (>1 µM) was detectable between MBP-SCL/tal and GST-E2-2 bHLH. Surprisingly, virtually no association was detectable between MBP-SCL/tal and GST-E2-2 C. Similar findings were obtained with SCL/tal as the immobilized ligand and the E2-2 proteins as the soluble analyte (data not shown). Therefore, for in vitro heterodimerization, domain C, rather than a requirement, appears to behave in an inhibitory fashion.
Figure legend (fragment): … Fig. 3A). The amino acids contributed by E2-2 are indicated above each lane. Strains used for two-hybrid assays were lysed and subjected to standard Western blotting using a rabbit anti-LexA antibody. Panel B, yeast two-hybrid analysis of interactions between LexA-E2-2 truncation mutants and B42 fusions with the SCL/tal bHLH mutants SE1 and SS1. Interactions were quantitated by liquid β-galactosidase assays. The amino acids contributed by E2-2 are indicated in bold. The SCL/tal bHLH mutants SE1 and SS1 are altered specificity mutants obtained from screening a library of random mutants for yeast two-hybrid interaction with the E2-2 minimal bHLH domain.
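To give a sense of what a micromolar-range affinity implies at the 4 µM analyte concentration used in the surface plasmon resonance experiments, the sketch below computes equilibrium fractional occupancy. It assumes a simple 1:1 binding model, θ = [A] / ([A] + Kd), which is an illustrative simplification and not the analysis performed in the study; the Kd values are hypothetical, chosen only to span the "weaker than 1 µM" range mentioned in the text.

```python
def fraction_bound(analyte_um, kd_um):
    """Equilibrium fractional occupancy of immobilized ligand, 1:1 binding.

    theta = [A] / ([A] + Kd), with both quantities in micromolar.
    """
    if analyte_um < 0 or kd_um <= 0:
        raise ValueError("analyte must be >= 0 and Kd must be > 0")
    return analyte_um / (analyte_um + kd_um)


if __name__ == "__main__":
    ANALYTE_UM = 4.0  # MBP-SCL/tal concentration stated in the text
    for kd in (1.0, 4.0, 10.0):  # hypothetical dissociation constants (µM)
        theta = fraction_bound(ANALYTE_UM, kd)
        print(f"Kd = {kd:4.1f} µM -> fraction bound = {theta:.2f}")
```

Under this model, even at 4 µM analyte a substantial fraction of ligand remains unoccupied once Kd reaches the low micromolar range.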

Domain C Is the Most Highly Conserved Module within E Proteins-An alignment of the human E proteins, E2-2, HEB, E12, and E47, with E proteins from a range of organisms including Drosophila and zebrafish shows the extremely high phylogenetic conservation of domain C. Fig. 9 highlights blocks of phylogenetic identity, i.e. regions of 100% conservation, within E proteins. Domain C represents by far the largest block of phylogenetic identity within E proteins, with 100% conservation over 20 amino acids in organisms ranging from fly to human. To determine if this entire conserved block was required for in vivo dimerization, domain C was subjected to carboxyl-terminal truncation at the proline residue highlighted in Fig. 9. As shown in Fig. 10, inclusion of only the first 7 residues of domain C (E2-2 amino acids 484-548) sufficed for heterodimerization of E2-2 with SCL/tal in the yeast two-hybrid system. Thus, despite its striking evolutionary conservation, the entirety of domain C appears not to be required for in vivo heterodimerization.

DISCUSSION
The basic helix-loop-helix domain has been well defined both by functional studies and by multiple sequence alignment analysis (28). The most compelling functional studies have shown sufficiency of a minimal bHLH domain of MyoD, encompassing 68 amino acids, for myogenic conversion of fibroblasts (31). Thus, for MyoD the HLH domain appears to suffice for both in vitro and in vivo heterodimerization. Our data similarly show that the minimal bHLH domains of MyoD and SCL/tal can support in vivo heterodimerization (Fig. 2). For E proteins a discrepancy exists between requirements for in vitro and in vivo dimerization. Our own in vitro studies on E protein dimerization, as well as those of other laboratories, have clearly demonstrated sufficiency of the minimal bHLH domain sequence shown in Fig. 1 (see surface plasmon resonance data in Fig. 8) (10, 16).
By contrast, our findings for in vivo heterodimerization of E2-2 indicate a requirement for additional sequence outside the bHLH domain. Most likely this discrepancy is due to fundamental differences between in vitro and in vivo HLH dimerization. In particular, in vitro HLH dimerization, as studied by surface plasmon resonance, appears to be a highly inefficient process with slow association rates and rapid dissociation rates (see Fig. 8). Recent NMR studies of E47 homodimers showed a high degree of dynamics within the HLH domain and poor definition of helix 1, both findings suggestive of a high degree of dimer instability (32). The low affinity and relative instability observed in HLH dimers in vitro are incompatible with previous observations of high efficiency HLH dimerization in vivo (17, 18).
Domain C appears to function in vivo as a cis-acting confor…
The concept that regions outside the HLH domain can influence dimerization has support from several earlier studies. Klein et al. (33) identified a dimerization-defective isoform of the rat E protein, REB, generated by alternative splicing. The dimerization-defective REBβ differs from the dimerization-competent isoform REBα only in that REBβ possesses an additional 24-amino acid ankyrin-like domain located 180 amino acids amino-terminal of the bHLH domain. Presumably the ankyrin-like domain of REBβ makes contacts in cis with the bHLH domain, thereby masking its dimerization potential. Shirakata and Paterson (29) have demonstrated an influence of domain A in E12, located just upstream of the bHLH domain, in preventing homodimerization and promoting heterodimerization with myogenic bHLH proteins. Wright et al. (34) have shown that anti-myogenin monoclonal antibodies recognizing epitopes amino-terminal of the bHLH domain inhibit the heterodimerization of myogenin and E12. Hara et al. (8) have shown that Cdk2-dependent phosphorylation of Id2 at a serine residue in the amino terminus, approximately 30 amino acids upstream from the HLH domain, completely blocks the ability of Id2 to heterodimerize with E12. Thus ample precedent exists for the presence of external cis-acting domains which influence the accessibility and/or the conformation of the HLH domain.
The means by which domain C influences the conformation of the adjacent HLH domain remains undefined. One possibility is that domain C simply represents an extension of helix 2. This possibility is unlikely for two reasons. First, multiple algorithms for the prediction of protein secondary structure suggest that helix 2 of E2-2 ends amino-terminal to domain C (35). Second, domain C does not promote, and in fact inhibits, in vitro dimerization (see Fig. 8); as a simple helical extension, domain C would be predicted to promote in vitro dimerization. Another possibility is that domain C, at its central proline residue, bends back on top of helix 2 and induces a conformational change. This cis-acting model for domain C is unlikely for two reasons. First, carboxyl-terminal truncation of domain C at the central proline residue does not eliminate its function (Fig. 10). Second, the cis-acting model also predicts that do… (Fig. 8). A third possibility is that domain C serves as a docking site for a chaperonin which induces the conformational change in the HLH domain. This chaperonin-docking model is appealing for three reasons: 1) the unique requirement of domain C for in vivo but not for in vitro dimerization; 2) the exquisite evolutionary conservation of domain C; and 3) the significant precedent for regulation of transcription factors by chaperonins. With regard to the last point, heat shock protein 90 has been shown to regulate the DNA binding functions of E12 and MyoD (36) as well as the ligand-inducibility of the bHLH dioxin receptor (37). Using the potent and specific inhibitory compound macbecin I (38), we have ruled out a major role for heat shock protein 90 in HLH dimerization in yeast (data not shown). However, numerous other chaperonin systems remain to be tested.
Analyses of bHLH structures in solution indicate that prior to dimerization, monomeric bHLH domains are either completely disordered (14) or in an antiparallel hairpin-like conformation (39, 40). Thus the transition from monomeric to dimeric bHLH molecules requires major structural alterations from either a disordered or antiparallel conformation into a highly structured, well organized parallel conformation. This transition may occur slowly and inefficiently in vitro, permitting some observable dimerization. However, for efficient in vivo dimerization, there is most likely active refolding of bHLH domains into conformations receptive for dimerization. Our observations suggest that domain C is a cis-acting determinant which permits the refolding of E proteins into dimerization-competent monomers. In such a conformation, the E proteins may then be capable of heterodimerizing with a wide array of tissue-specific HLH factors. Thus domain C may contribute to the unique ability of E proteins to bind a diverse array of HLH partners.