J. Biol. Chem., Vol. 277, Issue 31, 27593-27605, August 2, 2002
From the Department of Surgery, University of Pennsylvania Medical
System, Philadelphia, Pennsylvania 19104
Received for publication, April 2, 2002, and in revised form, April 22, 2002
The mammalian skeletal myosin heavy
chain locus is composed of a six-membered family of tandemly linked
genes whose complex regulation plays a central role in striated muscle
development and diversification. We have used publicly available
genomic DNA sequences to provide a theoretical foundation for an
experimental analysis of transcriptional regulation among the six
promoters at this locus. After reconstruction of annotated drafts of
the human and murine loci from fragmented DNA sequences, phylogenetic footprint analysis of each of the six promoters using standard and
Bayesian alignment algorithms revealed unexpected patterns of DNA
sequence conservation among orthologous and paralogous gene pairs. The
conserved domains within 2.0 kilobases of each transcriptional start
site are rich in putative muscle-specific transcription factor binding
sites. Experiments based on plasmid transfection in vitro
and electroporation in vivo validated several predictions
of the bioinformatic analysis, yielding a picture of synergistic
interaction between proximal and distal promoter elements in
controlling developmental stage-specific gene activation. Of particular
interest for future studies of heterologous gene expression is a
650-base pair construct containing modules from the proximal and distal
human embryonic myosin heavy chain promoter that drives extraordinarily
powerful transcription during muscle differentiation in
vitro.
Throughout metazoan evolution, the range of molecular mechanisms
used to achieve functional diversity in muscle partially reflects the
anatomic complexity of the species under consideration. In vertebrates,
the maximal contractile speed and energy utilization rate of individual
muscle fibers is largely controlled by the structure of the motor
protein myosin (reviewed in Ref. 1). The conventional myosins of
striated muscle are hexameric proteins composed of paired trimers of
myosin heavy chain (which contains the ATP-splitting motor domain) and
one each of the essential and regulatory light chains. The human genome
has at least 11 distinct striated myosin heavy chain
(MyHC)1 genes (2), six of
which are abundantly expressed in skeletal muscle and are tandemly
linked at a single locus on chromosome 17 (3-5). The members of the
human skeletal MyHC locus have 38 coding exons and two 5' noncoding
exons. Unlike striated MyHC isoform diversification in
Drosophila, which reflects a complex pattern of alternative
exon splicing (6, 7), the process in vertebrates relies on sequential
activation of a family of distinct genes, each encoding a single MyHC
(8). By virtue of the extraordinary abundance of these proteins (35%
of muscle protein), a large body of experimental data on MyHC isoform
switching was assimilated before isolation of the encoding genes
(9).
During embryonic myogenesis in mammals, the transcriptionally 5'-most
gene in the skeletal MyHC locus is activated first. During fetal muscle
development, the perinatal gene is selectively activated in a large
proportion of the muscle cells with progressive down-regulation of the
embryonic gene until birth. During subsequent postnatal growth, the
perinatal MyHC is progressively replaced by the three "type II"
(adult fast twitch) MyHCs. Throughout adult life, reversible
transitions are possible in response to cycles of degeneration and
regeneration or alterations in the hormonal environment, innervation,
and pattern of locomotive recruitment of individual muscle fibers (1,
10). A sixth gene at the skeletal MyHC locus is expressed only in
selected extraocular and laryngeal muscle fibers (11). The order of
sequential activation during development contrasts with the physical
order of the linked genes: 5'-embryonic, IIa, IId/x, IIb, perinatal,
extraocular-3'. A seventh MyHC abundantly expressed in type I (slow
twitch) skeletal muscle fibers is encoded by the physically unlinked
cardiac
Two exceptional attributes of the mammalian skeletal MyHC locus, size
and tandem gene number, have limited the study of transcriptional regulatory mechanisms within this gene family. Predating the modern era
of general information about transcription factor binding sites,
studies of a 1.4-kb region of genomic DNA upstream from the rat
embryonic MyHC gene suggested a complex pattern of regulation by
counteracting cis-elements (12). Studies of a large number of other
single copy genes expressed during myogenesis have more recently
defined roles and binding specificities for several transcription factors, providing a basis for general models of promoter and enhancer function during commitment and terminal differentiation. Transcription factors of the MyoD/MRF and MEF-2 families
play pivotal and synergistic roles in the activation and maintenance of
the myogenic differentiation pathway (13, 14). Although the precise
mechanisms by which these factors achieve transcriptional activation
remain unclear, there is increasing evidence for a central role of
histone modification and chromatin remodeling. In addition to MRF and
MEF-2, numerous ubiquitous and muscle-specific transcription factors
have been found to play important roles in gene up-regulation during
terminal myogenesis. A partial list reflecting the extent of
characterization includes AP1, AP2, GR, Oct-1, SRF, Sp1, TEF, and TR
(15-20). Factors traditionally associated with other cell lineages
(e.g. GATA and hematopoiesis) have also been found to play
important roles in the expression of some muscle-specific genes (21).
For all of the factors listed above, information about binding
specificities has been incorporated into publicly available nucleotide
weight matrices (22).
The sequence specificity of transcription factors and the recruitment
of these proteins into multisubunit complexes dictate intriguing
patterns of evolutionary sequence conservation in noncoding DNA. In
recent years, a number of World Wide Web-based tools have been
developed to facilitate the comparative analysis of DNA sequence emerging from the various genome projects. Although the publicly available murine genome sequence currently consists of highly fragmented files in draft form, islands of synteny with the human genome allow reconstruction of orthologous gene alignments that expedite the prospective identification of transcriptional control regions. As currently implemented, "percentage identity plot" analysis provides a low resolution snapshot of large genomic regions (23), whereas "Bayesian phylogenetic footprinting" allows a detailed look at regions of up to 2 kb (24). The latter approach has
been applied to other muscle-specific promoters with the finding that
the majority of relevant transcription factor binding sites are
concentrated within the small percentage of total sequence that has the
highest probability of alignment (i.e. meets the most
stringent criteria for conservation). The complementary use of
transcription factor matrices to screen phylogenetically conserved DNA
sequence for high affinity binding sites has the capacity to yield a
detailed annotation of putative cis-acting regulatory domains.
Hypotheses generated on the basis of this annotation can provide a
useful basis for the experimental study of gene regulation in
vitro and in vivo.
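The screening step described above, scoring sequence against a transcription factor matrix and keeping only matches that fall in phylogenetically conserved sequence, can be sketched as follows. This is a minimal illustration, not the TESS or Transfac implementation: the count matrix, the uniform background, the pseudocount, and the function names are all assumptions made for the example.

```python
import math

# Hypothetical 4-position count matrix (one dict per position).
# The counts are invented for illustration and come from no database.
COUNTS = [
    {"A": 8, "C": 1, "G": 0, "T": 1},
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 9, "C": 0, "G": 0, "T": 1},
    {"A": 1, "C": 0, "G": 0, "T": 9},
]
BACKGROUND = 0.25   # assumed uniform base composition
PSEUDOCOUNT = 0.5   # assumed smoothing value


def log_odds_matrix(counts):
    """Convert per-position base counts into log2 odds against background."""
    matrix = []
    for column in counts:
        total = sum(column.values()) + 4 * PSEUDOCOUNT
        matrix.append({
            base: math.log2(((column[base] + PSEUDOCOUNT) / total) / BACKGROUND)
            for base in "ACGT"
        })
    return matrix


def scan(sequence, matrix, conserved, threshold):
    """Return (start, score) for windows scoring >= threshold that lie
    entirely inside positions flagged True in the conserved mask."""
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip ambiguous bases
        if not all(conserved[start:start + width]):
            continue  # skip windows touching nonconserved sequence
        score = sum(matrix[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, round(score, 2)))
    return hits
```

Here the conserved mask is a list of booleans, one per base, which could be derived from any alignment-based conservation measure. The threshold is a free parameter, comparable to the default limits mentioned for the matrix searches in this paper.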
In the present study, we use draft murine genome sequence from several
fragmented draft files, initially identified on the basis of BLAST (25)
comparisons with the human genome, to reconstruct a working model for
the complete murine skeletal MyHC locus. Each of the six human MyHC
promoter regions is analyzed in detail using bioinformatic tools, as
exemplified by those cited above, to yield annotated maps reflecting
both the level of sequence conservation and the precise location of
matches to transcription factor matrices. Regions of strong
conservation between the orthologous human and murine MyHC promoters
occur in patches scattered throughout at least 2 kb immediately
upstream of the transcriptional start sites. Paralogous promoter
comparisons reveal surprisingly strong homology within 300 bp of the
transcriptional start sites, providing evidence for a conserved
proximal "core" promoter domain. Transfection of myotubes in
vitro and electroporation of myofibers in vivo are used
to further dissect the relevant domains of four of these promoters with
the findings that 1) isolated proximal promoter regions can retain
developmental stage specificity despite homology to a core consensus
sequence and 2) an E box-rich distal region of the human
embryonic MyHC promoter dramatically amplifies the in vitro
activity of all of the proximal core promoters tested in
cis. These findings suggest several testable hypotheses
regarding potential short and long range interactions between
transcriptional control elements within the skeletal MyHC tandem gene locus.
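As an illustration of what "E box-rich" means in practice, the sketch below locates matches to the CANNTG E box consensus and counts them per window. It assumes plain-text DNA input; the function names and the 100-bp default window are illustrative choices, not parameters taken from this study.

```python
import re

# CANNTG with N = any base; the lookahead reports overlapping matches
# (e.g. CACATGTG contains two E boxes offset by 2 bp).
E_BOX = re.compile(r"(?=(CA[ACGT]{2}TG))")


def find_e_boxes(sequence):
    """Return (0-based start, matched hexamer) for every CANNTG in sequence."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in E_BOX.finditer(seq)]


def e_box_density(sequence, window=100):
    """Count E boxes starting in each consecutive, non-overlapping window."""
    counts = [0] * ((len(sequence) + window - 1) // window)
    for start, _ in find_e_boxes(sequence):
        counts[start // window] += 1
    return counts
```

Because the reverse complement of CANNTG is itself CANNTG, scanning one strand is enough to find every occurrence of the consensus on either strand.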
Computer Analysis--
Deduced full-length cDNA sequences
derived from an annotated version of the human skeletal myosin heavy
chain locus (2) were used to query the high throughput genomic DNA
sequence databases accessible through the National Center for
Biotechnology Information Homepage (www.ncbi.nlm.nih.gov/BLAST/).
Species-specific repeats in the reconstructed human locus were
identified using the REPEATMASKER algorithm as implemented on the
Washington University server
(ftp.genome.washington.edu/cgi-bin/RepeatMasker). As soon as draft
sequences appeared online to confer complete coverage of all six of the
genes at the murine locus, we ordered the draft sequence fragments to
correspond to the order of the orthologous human sequences. A
reconstructed murine sequence draft spanning the entire locus was
annotated, initially by using GENSCAN as implemented on the MIT server
(genes.mit.edu/GENSCAN) and later by cross-checking with the human
annotated sequence to verify exon splice site predictions. We next used
sequence from our full-length cDNA clones to prospectively identify
the transcriptional start sites for four of the genes at the human
locus. These predictions were confirmed and extended to include all six
genes by using the Promoter Prediction by Neural Network program as
implemented on the Berkeley Drosophila Genome server
(www.fruitfly.org/seq_tools/promoter.html). Orthologous regions were
easily identified in the annotated murine MyHC locus. Regions upstream
of the predicted human and murine transcriptional start sites were
analyzed with several programs in the MacVector package (version
6.0.1; Oxford Molecular Group) as implemented on the Mac OS platform
and as cited throughout. In addition, these regions were analyzed using
the TESS and Bayesian Phylogenetic Footprint Homepage programs, as
implemented on the University of Pennsylvania
(www.cbil.upenn.edu/cgi-bin/tess/tess?SEA-FR-Query) and Wadsworth
servers (bayesweb.wadsworth.org/cgi-bin/bayes_align12.pl), respectively. The outputs from the latter analyses were merged by
importing the results into Microsoft Word files and reproducing the
color coding of the nucleotide sequence to identify phylogenetically conserved regions. The complete files are available on request from the
corresponding author. Only the general results have been integrated
into the figures in this paper.
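The ordering of draft murine fragments against the human locus can be sketched in a few lines. This is an assumed, simplified procedure rather than the authors' workflow: it supposes one best alignment per fragment has already been extracted from a BLAST comparison, and the record fields, function names, and spacer length are hypothetical.

```python
from collections import namedtuple

# One best alignment per draft fragment against the reference locus.
# Field names are hypothetical; they are not BLAST output column names.
Hit = namedtuple("Hit", "fragment ref_start ref_end strand")


def order_fragments(hits):
    """Order fragments by where their best hit starts on the reference and
    report whether each must be reverse-complemented."""
    ordered = sorted(hits, key=lambda h: min(h.ref_start, h.ref_end))
    return [(h.fragment, h.strand == "-") for h in ordered]


def reverse_complement(seq):
    """Reverse-complement a DNA string, leaving N unchanged."""
    table = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    return seq.translate(table)[::-1]


def assemble(fragments, hits, spacer="N" * 100):
    """Join fragment sequences in reference order, separated by a run of N
    that marks a gap of unknown size (the spacer length is arbitrary)."""
    parts = []
    for name, flip in order_fragments(hits):
        seq = fragments[name]
        parts.append(reverse_complement(seq) if flip else seq)
    return spacer.join(parts)
```

A reconstruction of this kind is only a working draft: it inherits the order and orientation of the reference and cannot detect rearrangements specific to the other species.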
Plasmids--
All of the promoter constructions were made in the
pGL3-basic plasmid, which carries a promoterless gene encoding firefly luciferase (Promega). The different promoters were cloned using the
NheI and BglII sites of the pGL3-basic plasmid.
For this purpose, all of the promoters were amplified by 30 cycles of
PCR using the proofreading Pwo polymerase (Roche Molecular
Biochemicals) according to the manufacturer's protocol. All primers
were designed to have a 58 °C annealing temperature and to contain
the NheI site at the 5' end or the BglII site at the 3' end, as
shown in Table I, to allow
unidirectional cloning. The plasmid CMV-
Cell Culture and Transfection--
The C2C12 cell line
proliferates at low density in Dulbecco's modified Eagle's medium
plus 20% fetal calf serum. Differentiation into myotubes was
induced by replacing the proliferation medium with Dulbecco's modified
Eagle's medium + 0.5% fetal calf serum + ITS (insulin, transferrin,
and selenium from Roche Molecular Biochemicals at 5 µg/ml each).
These cells were transfected using FuGENE 6 (Roche Molecular
Biochemicals). Briefly, on day 1, myoblasts were plated at 200,000 cells/well in a six-well plate. On day 2, 2 h prior to
transfection, the medium was removed and replaced with differentiation
medium. Transfection was achieved by incubating 6 µl of FuGENE 6 in 94 µl of Dulbecco's modified Eagle's medium at room temperature. After
5 min, 2 µg of the luciferase reporter plasmid and 1 µg of
CMV-
Rat in Vivo Electroporation--
Electroporation of rat tibialis
anterior muscles was performed as per Ref. 30. All electroporation and
subsequent euthanizing and procurement procedures were performed in
accordance with University of Pennsylvania IACUC protocols. Liquid
nitrogen frozen muscles were pulverized to powder and solubilized with
1× lysis buffer (luciferase assay system; Promega). After a 5-min
incubation and one freeze/thaw cycle, samples were centrifuged, and 20 µl of supernatant were processed for luciferase and
Normalization of Reporter Gene Activity and Statistical Analysis--
The luciferase results are expressed as the ratio
between the luciferase activity and the
To provide a basis for phylogenetic analysis of putative
transcriptional control domains within the skeletal MyHC locus, we periodically screened high throughput DNA sequence databases until a
representation of the entire murine locus could be identified in draft
form. Three overlapping bacterial artificial chromosomes spanned the
entire locus as depicted in Fig.
1A, although the data
contained in the relevant accession files was highly fragmented at the
time of this analysis. Based on synteny with the human locus, we
reconstructed a single file that contained a properly ordered and
annotated draft of the murine locus. As expected, coding regions
exhibit greater than 90% sequence homology, without gaps. Although the
orthologous genes occupy approximately the same relative proportion of
the length of the entire locus, the murine locus is only 80% of the
size of the human locus. From the standpoint of transcriptional
regulation, the regions upstream of the first coding exons of the
orthologous and paralogous genes are of greatest interest. We
identified transcriptional start sites by comparison with full-length
cDNA sequences or by analyzing homologous intergenic regions for
TBP binding site and initiator consensus sequences using Promoter
Prediction by Neural Network (NNPP; Lawrence Berkeley National
Laboratory).
An initial screen of 8 kb upstream of each transcriptional start site
revealed a pattern exemplified by the orthologous embryonic and IIb
promoters in which there is strong conservation interrupted by blocks
of nonconserved sequence, which in most cases precisely coincide with
the boundaries of species-specific repeats as annotated previously
using the REPEATMASKER algorithm. There are blocks of >100-bp
DNA sequence that attain 60% conservation (note peak height in the
raised relief image) out to at least 8 kb upstream of both the human embryonic and IIb MyHC transcriptional
start sites (Fig. 1, B and C, embryonic MyHC
upstream 8 × 8 kb and IIb MyHC upstream 8 × 8 kb). The
conserved regions identified as conserved human embryonic MyHC element
(ChemE)-1, -2, and -3 are defined below. With the sole exception of the
extraocular MyHC promoter pair, the pattern of conservation within 2 kb
of the start sites provided striking evidence for selection against
mutation by insertion or deletion. This pattern suggests that the order
of and/or the physical distance between transcription factor binding
sites spread over at least 2 kb is crucial to the proper developmental
regulation of MyHC gene activation/repression. To address the
possibility of a shared core promoter structure, we extended the
pairwise comparison to include paralogous human MyHC promoters (Fig.
1D). Homology among promoter paralogs is uniquely detected
within the most proximal 300 bp, as shown for the IIa, IId/x, IIb, and
perinatal genes.
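The window-based measure of conservation used in this kind of pairwise comparison can be sketched as below. The sketch assumes the two sequences have already been aligned (equal-length strings with '-' for gaps) and treats gap columns as mismatches. The function names, the scoring choice, and the default 100-bp window and 60% cutoff (chosen to echo the block size and conservation level quoted above) are illustrative assumptions, not the settings of the programs used in this study.

```python
def window_identity(aligned_a, aligned_b, window=100, step=1):
    """Percent identity in each window of a pairwise alignment.

    Both inputs are equal-length aligned strings with '-' for gaps.
    Gap columns count as mismatches (an assumed scoring choice).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = [
        1 if a == b and a != "-" else 0
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
    ]
    results = []
    for start in range(0, len(matches) - window + 1, step):
        identity = 100.0 * sum(matches[start:start + window]) / window
        results.append((start, identity))
    return results


def conserved_blocks(identities, window, cutoff=60.0):
    """Merge overlapping or adjacent windows at or above the cutoff
    into (start, end) blocks, with end exclusive."""
    blocks = []
    for start, identity in identities:
        if identity < cutoff:
            continue
        end = start + window
        if blocks and start <= blocks[-1][1]:
            blocks[-1] = (blocks[-1][0], max(blocks[-1][1], end))
        else:
            blocks.append((start, end))
    return blocks
```

Changing the window size or the cutoff changes which blocks are reported, so the output depends directly on these user-chosen parameters.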
A limitation of this analysis for the identification of transcriptional
regulatory elements is its dependence on user-defined parameters such
as window size, DNA scoring matrix, and gap penalty. These biases have
been largely eliminated through the recent introduction of a Bayesian
statistical method based on a Gibbs sampling algorithm (24, 26). The
relationship between level of sequence conservation and distance
upstream from the transcriptional start sites is quantitatively
displayed in Bayesian "phylogenetic footprint" plots for five of
the promoter pairs (Fig. 2). The
z axis in each of the three-dimensional plots indicates the
local probability of mouse-human DNA sequence alignment based on a
modification of the Bayes block aligner algorithm (24). The overall
pattern of gene conservation as a function of distance from the
transcriptional start site is similar for the perinatal, IIa, IId/x,
and IIb genes. Relative to these four genes, the embryonic gene has a
comparatively larger block of conserved sequence in the interval
A position weight matrix-based search for binding sites in a transcription factor database was filtered to identify candidate protein-DNA interactions in the phylogenetically conserved portion of each sequence pair, as shown schematically above the footprints (Fig. 2, lettering above individual plots). Three groups of transcription factors known to participate in the regulation of muscle-specific gene expression are identified as having high concentrations of sites within these conserved domains: MRFs, MEF-2, and SRF. Additional transcription factors identified by this approach include AP-2, AhR, CCAAT enhancer-binding protein, GR, HEB, Oct-1, RREB-1, Sp1, STAT4, TEF-1, and YY1. As previously demonstrated for other muscle-specific promoters, the concentration of predicted sites is higher in the conserved than in the divergent sequence domains (24). The overall distribution of these sites is not appreciably conserved among the paralogous promoters except in the most proximal 300 bp of sequence.
We used ClustalW (27) to align all of the proximal promoters for each
orthologous gene pair (Fig. 3). This
alignment draws attention to four core domains and provides a consensus
sequence for these and the intervening regions. A computational
analysis of each of the individual sequences and the consensus sequence identifies several "high affinity" matches to the Transfac matrices for MEF-2, MyoD, Oct-1, CCAAT enhancer-binding protein, and TBP, the
former of which is outlined on the sequence alignment. In this
alignment, the embryonic promoter pair is distinguished from the other
four pairs in having the highest affinity MEF-2 site (28)
nearest to the transcriptional start site (proximal domain 2 as
compared with proximal domain 3 for the others). In fact, this site
achieves the highest binding score of any predicted MEF-2 site within 2 kb of an MyHC promoter. Three MEF-2 consensus sequences are identified
in the IIa, IId/x, and perinatal promoters. MEF-2 binding is predicted
from position weight matrices for the corresponding regions of the
other two promoters, but the binding strength falls below the default
limits at the most distal site for the embryonic promoter and at the
most proximal site for the IIb promoter.
Recapitulating the bioinformatic analysis, five of the six tandemly
linked skeletal MyHC promoters exhibit similar overall structures in
which focally conserved sequences, rich in putative muscle-specific
transcription factor binding sites, are distributed throughout 1.5-2
kilobases upstream of a broadly conserved core promoter domain of
250-300 bp. We chose a convenient nomenclature to describe the three
larger blocks of conserved sequence in the embryonic promoter:
conserved human embryonic MyHC elements 1, 2, and 3 (ChemE-1, -2, and
-3). These observations provide a basis for interpreting the results of
an experimental dissection of proximal and distal promoter elements
in vitro and in vivo. As an initial screen for
transcriptional activity in vitro, we used large constructs
based on cosmid-cloned fragments of three human skeletal MyHC genes
(Fig. 4). As expected, the SV40 promoter
drove high level marker gene expression in C2C12 myoblasts and myotubes (612- and 146-fold increases over background, respectively). Cis-acting sequences spanning at least 4 kb 5' of the embryonic, IId/x, and IIb
MyHC gene translational start sites drove marker gene expression in
myoblasts at levels barely above background. In myotubes, however, the
embryonic MyHC gene construct uniquely increased marker gene expression
to a level 160-fold higher than background, exceeding even that of the
SV40 control. In contrast, marker gene expression was the same in
myoblasts and myotubes for the IId/x and IIb MyHC gene constructs
(i.e. just above background). The surprisingly high marker
gene expression from the embryonic construct suggested that the 30-kb
fragment encompassed most of the elements necessary for developmental
stage-specific transcriptional activation. The relative inactivity of
the adult MyHC promoters in this assay recapitulates the activities of
the endogenous genes (29) and may reflect the absence of developmental
stage-specific activators, the presence of repressors, or a combination
thereof.
To address the possibility that repressive elements distal to the
broadly conserved core domains were responsible for the inactivation of
the adult MyHC promoters, we assayed marker gene expression in C2C12
myoblasts and myotubes using plasmids containing the proximal 300 bp of
the embryonic (ChemE-1), perinatal, IId/x, and IIb promoters. The
results of this series of experiments (Fig. 5) again showed strong activation in
myotubes of the proximal 300-bp portion of the embryonic MyHC promoter
and very low (near background) activity of the nonembryonic promoters.
Further truncation of the embryonic promoter resulted in an abrupt loss
of activity between
To further localize the critical elements in transcriptional control of
the embryonic MyHC gene, we compared the activities of five additional
truncation constructs in vitro. As shown in Fig.
6, there is progressive and dramatic loss
of activity with truncation below
To study the role of this phylogenetically conserved region in the
muscle-specific activation of this promoter, we constructed a series of
internal deletion mutants as shown in Fig.
7. Marker gene expression in a series of
C2C12 transfections was only slightly reduced when regions distal to
To further characterize the pattern of transcriptional activation from
the ChemE-2 subfragment, we tested its ability to synergize with
heterologous minimal promoters (Fig. 8).
The SV40 minimal promoter, activated in cis by a wide range
of enhancers, was minimally affected by juxtaposition to the ChemE-2
subfragment in C2C12 myotube transfections. In contrast, all of the
MyHC proximal promoters previously tested were further activated by
ChemE-2. The relative magnitude of this cis-activation was stronger for
the IId/x and perinatal promoters than for the embryonic promoter
(37.3-, 31.8-, and 14.1-fold, respectively), and was similar for
the IIb (12.3-fold) and embryonic promoters. Nevertheless, the absolute
activity of the ChemE-2 subfragment fused to the embryonic minimal
promoter was highest by almost 10-fold in this assay. We
interpret this result to indicate that the overall developmental stage-
and fiber type-specific activity of each of the endogenous MyHC
promoters reflects the input of multiple DNA-transcription factor
interactions in both the highly conserved proximal promoter and the
isogene-specific distal promoter regions. Strong activation by distal
elements can partially override weak activation or repression from more proximal elements.
To further dissect the proximal promoter regions, we took advantage of
the low level homology between the embryonic and IIb promoters in the
construction of chimeric reporter plasmids (Fig. 9). The ChemE-2 subfragment was used in
juxtaposition to a series of progressively truncated embryonic proximal
promoter fragments with the finding that these serial deletions were
associated with the progressive loss of promoter activity. When the
deleted regions were replaced by the homologous domains from the IIb
promoter, there was neither further loss nor compensatory restoration
of activity. The absence of developmental stage-specific
transcriptional repression associated with the IIb-for-embryonic
chimeric substitutions suggests the complete absence of inhibitory
DNA-protein interactions in this region. In other words, relative to
the embryonic deletion constructs, insertional replacement with IIb
sequences had essentially no effect on activity, such that the IIb DNA
served essentially as neutral spacer sequence in the context of the
C2C12 transcriptional milieu. The noninterchangeability of homologous
domains from the IIb and embryonic promoters reveals the potential
complexity of activating DNA-protein and protein-protein interactions
in the proximal promoters.
In vivo, the adult isoforms are highly expressed in the fast
twitch muscles of the rodent hind limb. As a simple test of the activity of these promoter constructs in vivo, we used an
electroporation protocol to partially augment the efficiency of somatic
gene transfer. While less efficient than the use of recombinant vectors
that allow binding of the myocyte surface by viral capsid proteins, the
plasmid electroporation approach (30) avoids the complicating use of
viral DNA sequences in cis with the MyHC promoter sequences. The results of this series of experiments are shown in Fig.
10. We tested the embryonic and IIb
constructs with and without the ChemE-2 subfragment at 4 and 7 days,
and without ChemE-2 at 21 days with the expectations that
(a) localized regeneration after electrical injury (31)
would initially activate both of the endogenous genes (embryonic and
IIb), (b) activation of the endogenous embryonic MyHC gene
would be transient as regeneration subsided, and (c)
augmentation from the ChemE-2 subfragment might be required to achieve
detectable signal above background. This experiment showed that the
embryonic and IIb core promoters were comparably active in
vivo and that they were both up-regulated by the ChemE-2 subfragment (although to a lesser extent than seen in
vitro). The comparative activity of the IIb promoter constructs at
4, 7, and 21 days probably reflects the role of trans-acting factors present in vivo but absent from the C2C12 system in
vitro. The trends observed also suggest that any repression
attributable to elements within the ChemE-2 subfragment is outweighed
by activation from elements within the IIb core domain. At 21 days,
however, the embryonic and IIb core promoters showed minimal and
maximal activity, respectively, a result that contrasts sharply with
the initial relative activity of these constructs in vitro
but parallels that of the endogenous genes in vivo. Thus,
the core promoter constructs qualitatively recapitulate the expression
patterns of the cognate genes, providing tools for use in a detailed
dissection of the comparative roles of trans-activation and
trans-repression during development and regeneration.
Despite the critical importance of the sarcomeric MyHC in skeletal
muscle function and the extraordinary abundance of this protein, the
size and complexity of the human skeletal MyHC locus has delayed
detailed investigation of the transcriptional control of its six
tandemly linked genes. During the preparation of this manuscript, we
became aware of a report on some recent in vitro studies
using plasmid constructs based on the murine IIa, IId/x, and IIb
proximal promoters (32). Our data on early developmental isoform
switching complement these studies (which instead focused on possible
mechanisms for fiber-type regulation of adult MyHC isoform switching).
We have used newly available draft DNA sequence for the human and
murine genomes as a basis for modeling important transcription factor
binding sites in the cis-acting regulatory sequences. Analysis of
paralogous skeletal MyHC genes reveals striking homology in the most
proximal 300 bp of the embryonic, IIa, IId/x, IIb, and perinatal
promoters, centered around three predicted MEF-2 sites. Our previous
analysis of the evolutionary relationships among these genes, based
wholly on comparisons of coding sequence for the "molecular
yardstick-like" rod domain of the protein, suggests that this MEF-2
site-rich "core" promoter structure dates back to a common
ancestral gene that existed during early vertebrate evolution, before
the human-avian species split (2). Consistent with this prediction is
the result of a phylogenetic footprint analysis of the human IId/x
promoter with that from the most closely related chicken MyHC gene
(Fig. 2H). The high probability regions match the core
consensus at all three of the MEF-2 sites identified in the IIa, IId/x,
and perinatal promoters. In contrast, neither the embryonic nor the IId/x promoters contain E boxes within 300 bp of the transcriptional start sites, and CANNTG consensus sequences identified in the other
promoters are located at sporadic positions (IIa
Our experimental findings with the set of core promoter constructs revealed that reporter expression roughly paralleled the relative levels of the corresponding isomyosins in myotubes in vitro (29) and in myofibers in vivo (33). In a detailed analysis of chimeric substitutions, none of the IIb sequence motifs identified in the sequence alignments could substitute for the homologous portions of the embryonic promoter without a significant loss of activity in vitro. This implies that the divergent sequences intervening between the consensus-matching portions of the core contribute to the process of promoter switching during development. Some of the divergent sequence falls within the three MEF-2 sites predicted in this region, raising the possibility that subtle differences in MEF-2 binding affinity could play a role in this process. Mutation of the E box at
The sequences between
The skeletal myosin heavy chains are among the most abundant proteins in the bodies of vertebrates, and they must accumulate with other contractile proteins in precisely controlled stoichiometric ratios. This gene dosage effect is best documented in Drosophila, where compensatory changes in actin gene number improve muscle structure by restoring proper stoichiometry (40). We speculate that throughout vertebrate evolution, the potential advantage of MyHC isoform diversification through gene duplication was offset by the attendant mutational disruption of established patterns of developmental regulation. The fact that these six genes remain tandemly linked suggests that their proper regulation requires proximity to shared transcriptional control elements (39). A teleological advantage of centralized control, exerted across the entire locus by single copy regulatory elements, is that the individual promoters are free to compete for varying proportions of the total MyHC mRNA mix without disrupting overall protein stoichiometry.
Alternatively, if all of the genes are autonomously regulated by their own upstream regulatory sequences, down-regulation of one gene must be precisely offset by up-regulation of another. Although it is premature to suggest detailed models for long range interaction between transcription factors binding the endogenous MyHC genes, two candidate control regions are identified in the current studies (ChemE-2 and -3). One of these is incorporated in the only minimal length skeletal MyHC promoter construct to show robust autonomous activity in vitro, perhaps providing a cassette powerful enough for functional expression of recombinant MyHCs. Further studies will establish whether the other promoter regions identified at this locus can autonomously drive heterologous protein expression in adult tissues at levels sufficient for assembly of the contractile apparatus, a sine qua non for isolation of the entire control region.
In summary, the foregoing studies illustrate the complementarity of
readily available tools for in vitro, in vivo,
and in silico analysis, as applied to the
dissection of the transcriptional regulatory circuitry controlling the
developmental program for the six-membered human skeletal MyHC gene family.
* This work was supported in part by grants (to H. S.) from the Association Française contre les Myopathies (AFM), the Muscular Dystrophy Association of America, NIAMS (National Institutes of Health (NIH)), NINDS (NIH), and the Department of Veterans Affairs. The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
§ To whom all correspondence should be addressed: Rm. 608, Biomedical Research Bldg. II/III, 421 Curie Blvd., Philadelphia, PA 19104-6160. Tel.: 215-898-1432; Fax: 215-573-8606; E-mail: hstedman@mail.med.upenn.edu.
Published, JBC Papers in Press, April 23, 2002, DOI 10.1074/jbc.M203162200
The abbreviations used are: MyHC, myosin heavy chain; ANOVA, analysis of variance; contig, group of overlapping clones; ChemE, conserved human embryonic MyHC element; MRF, muscle regulatory factor; HSD, honestly significant difference.
Copyright © 2002 by The American Society for Biochemistry and Molecular Biology, Inc.