(Received for publication, October 6, 1994) From the
We report the isolation and the organization of the gene
encoding human tryptophan hydroxylase (TPH) and an analysis of the
corresponding mRNAs. The gene spans a region of 29 kilobases, which
contains at least 11 exons and a variably spliced 5`-untranslated
region (5`-UTR). The sequence of the coding region and the majority of
the positions of the intron-exon boundaries of human TPH gene are very
similar to those encoding human tyrosine hydroxylase and phenylalanine
hydroxylase, the other members of the aromatic amino acid hydroxylase
family. Phylogenetic analysis provides evidence for the early divergence and the
independent evolution of the three hydroxylase types. TPH cDNA cloning
and anchored polymerase chain reaction revealed a diversity of the TPH
mRNA, which is restricted to the 5`-UTR. Four TPH mRNA species were
detected by Northern blot with pineal gland and carcinoid tumor RNAs.
These messengers are transcribed from a single transcriptional
initiation site, and their diversity results from differential splicing
of three intron-like regions and of three exons located in the 5`-UTR.
Analysis by S1 nuclease protection revealed that the intron-like
regions in the 5`-UTR are mostly unspliced and that TPH mRNA species
where the three intron-like regions are eliminated are present at low
level in pineal gland and not detectable in carcinoid tumors.
TPH
is a member of the aromatic amino acid hydroxylase family, which also
includes tyrosine hydroxylase and phenylalanine hydroxylase. These
three enzymes require a reduced pterin cofactor to hydroxylate their
amino acid substrate and interact with Fe
Recently, cDNAs encoding rabbit and rat TPH have been cloned from
pineal gland, and this was followed by the isolation of homologous
human and mouse cDNAs from carcinoid tumor and mastocytoma cDNA
libraries,
respectively(13, 14, 15, 16) . The
coding region of human TPH cDNA extends over 1332 bp, and the deduced
amino acid sequence is very similar all along the sequence to those of
rat (91.2%(18) ), rabbit (94.6%(13) ), and mouse
(90.1%(16) ). The rat TPH gene is characterized by a diversity
restricted to the 5`- and 3`-UTR of the TPH mRNAs, which could provide
the basis for post-transcriptional regulation of TPH gene
expression(18, 19, 20, 21) . We
report the isolation of the human TPH gene and show by a combination of
cDNA cloning, polymerase chain reaction (PCR) analysis, and S1 nuclease
protection an unexpected diversity in the 5`-UTR of TPH mRNAs. These
messengers are transcribed from a single promoter, and their diversity
results from the conservation of one or several intron-like sequences
and from the differential splicing of three exons in the 5`-UTR of the
TPH mRNAs.
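The percentage identities quoted above for the deduced amino acid sequences can be illustrated with a minimal sketch. It assumes two sequences that have already been aligned to the same length and skips any column containing a gap; this gap convention, the function name `percent_identity`, and the toy fragments are assumptions made for illustration and are not taken from the analysis reported here.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percentage of ungapped aligned columns that carry identical residues.

    Assumption: columns where either sequence has a gap are skipped.
    Other conventions count such columns as mismatches.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    identical = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Hypothetical toy fragments, not real TPH sequences.
toy_first = "MIEDNKENKD"
toy_second = "MIEDNKENKE"
print(round(percent_identity(toy_first, toy_second), 1))
```

Applied to full-length aligned sequences, the same calculation would give figures comparable to those quoted in the text, provided the same alignment and gap convention are used.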
The coding and noncoding sequences of the human TPH mRNA
were aligned with those of the other aromatic amino acid hydroxylases
from different animal species. Computer-aided sequence comparisons and
analysis were performed with the GCG program(23) . After
sequences were optimally aligned, phylogenetic distance trees were
constructed by the Neighbor Joining Method of Saitou and
Nei(24) , and bootstrap analysis was performed by using the
MUST package(25) .
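As an illustration of the Neighbor Joining Method, the sketch below carries out only the first joining step on a small distance matrix: it computes the standard Q criterion for every pair of taxa, selects the pair with the lowest value, and derives the two branch lengths to the new node. The function name `nj_first_join` and the toy distances are hypothetical; this is not the GCG or MUST implementation used in this work, and a full tree would require repeating the step on the reduced matrix.

```python
from itertools import combinations


def nj_first_join(dist):
    """Return ((i, j), (len_i, len_j)) for the first neighbor-joining step.

    dist is a symmetric matrix (list of lists) of pairwise distances.
    """
    n = len(dist)
    if n < 3:
        raise ValueError("need at least three taxa")
    row_sums = [sum(row) for row in dist]

    # Q(i, j) = (n - 2) * d(i, j) - sum_k d(i, k) - sum_k d(j, k)
    best_pair, best_q = None, None
    for i, j in combinations(range(n), 2):
        q = (n - 2) * dist[i][j] - row_sums[i] - row_sums[j]
        if best_q is None or q < best_q:
            best_pair, best_q = (i, j), q

    # Branch lengths from each joined taxon to the new internal node.
    i, j = best_pair
    len_i = dist[i][j] / 2 + (row_sums[i] - row_sums[j]) / (2 * (n - 2))
    len_j = dist[i][j] - len_i
    return best_pair, (len_i, len_j)


# Hypothetical toy distances between five taxa (not hydroxylase data).
toy = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
pair, lengths = nj_first_join(toy)
print(pair, lengths)
```

Because the Q criterion corrects each pairwise distance by the total distance of both taxa to all others, the method does not assume a constant rate of change along every branch.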
The human TPH cDNA clones fell into two categories, which
differed by the organization of their 5`-leader sequences and their
abundance. The majority of TPH cDNA clones, referred to hereafter as
type 1 cDNAs, all had long 5`-UTR of variable size and probably
corresponded to incompletely reverse-transcribed mRNAs (the longest
type 1 cDNA was 6 kb long with a 5`-UTR of 2.5 kb). In contrast, the
type 2 cDNA clones were about 3.6 kb long with a short 5`-UTR (the
longest type 2 cDNA has a 5`-UTR of 310 bp). They differed from the
type 1 cDNA clones by the absence of a 1.7-kb sequence (named
I
Figure 1:
Schematic representation of the various
5`-cDNA clones for human TPH and organization of TPH 5`-UTR on genomic
DNA. A, two types (1 and 2) of TPH cDNA clones were obtained
by cDNA cloning. The two longest species of these different cDNAs were
6 kb and 3.5 kb, respectively. Open and shaded boxes indicate the 5`-noncoding and coding exons, respectively. Thick horizontal lines represent the 3`-UTR and the
intron-like sequences in the 5`-leader sequence. The broken lines indicate the elimination of the intron-like region. B,
two 5`-noncoding extremities of TPH cDNA clones were isolated by
anchored PCR: Slic type 1 and Slic type 3. C, restriction map
and organization of the TPH 5`-noncoding region in genomic DNA. The arrow shows the position of the transcription initiation site.
Some of the restriction sites present in the TPH gene are
shown.
To determine the size and the number of human TPH mRNAs, Northern
blots were performed in denaturing conditions with RNA extracted from
various tissues that do or do not produce TPH. No hybridization signal
was detected with RNA purified from liver and dorsal raphe nuclei. In
contrast, the cDNA probe labeled four major transcripts with apparent
sizes of 5, 6, 7.5, and 9 kb in mRNA from pineal gland and carcinoid
tumors (Fig. 2A). These results agreed with the known
distribution of the enzyme in tissues, with the notable exception of
the raphe nuclei area of the brainstem, which, as found in the rat, did
not show any hybridization signal(20) . The 5-kb species
appeared to be the most abundant and was the only signal detected in
RNA prepared from the intestine. The high molecular weight TPH mRNA
forms were more abundant in carcinoid tumor than in pineal gland RNA.
Figure 2:
Northern blot analysis of TPH mRNA
expression. A, tissue distribution of TPH transcripts. Two
µg of poly(A)
To characterize better the diversity of TPH mRNA, the whole coding
sequence and the 3`-UTR were analyzed by PCR amplification from pineal
gland and carcinoid tumor cDNAs. A single fragment was detected for
each of the four overlapping subregions (defined by the primers)
spanning these domains, in agreement with the length of the cloned TPH
cDNA sequences (data not shown). In this respect, human TPH mRNA
differs from rat and mouse TPH mRNAs, which possess two different
3`-untranslated regions generated by alternative polyadenylation sites.
Thus, the diversity of TPH mRNAs may arise from RNA splicing in the
5`-noncoding region. The size of the cDNA clones cannot easily be
reconciled with that of the TPH mRNAs detected on Northern blots. This
could be because the cDNA clones are incomplete at the 5`-end. We
therefore cloned the entire human TPH gene to facilitate the analysis
of the diversity of its mRNAs.
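The size comparison made above can be written out as simple arithmetic. The sketch below adds the component lengths reported in this paper (a coding region of 1332 bp, a 3`-noncoding sequence of about 2 kb including the poly(A) tail, and 5`-UTRs of 2.5 kb and 310 bp for the longest type 1 and type 2 clones) and, conversely, estimates the 5`-UTR length implied by each Northern band. The 3`-UTR value is approximate and the Northern sizes are apparent sizes, so the results are rough estimates only; the variable and function names are illustrative.

```python
CODING_BP = 1332      # open reading frame, as reported in the text
THREE_UTR_BP = 2000   # "about 2 kb", poly(A) tail included (approximate)

# 5'-UTR lengths of the longest clone of each type, as reported in the text.
five_utr_bp = {"type 1": 2500, "type 2": 310}

# Apparent transcript sizes seen on Northern blots, in kb.
northern_kb = [5, 6, 7.5, 9]


def cdna_length_bp(five_utr: int) -> int:
    """Expected clone length: 5'-UTR + coding region + 3'-UTR."""
    return five_utr + CODING_BP + THREE_UTR_BP


def implied_five_utr_bp(transcript_kb: float) -> int:
    """5'-UTR length implied by a transcript of the given apparent size."""
    return round(transcript_kb * 1000) - CODING_BP - THREE_UTR_BP


for name, utr in five_utr_bp.items():
    total = cdna_length_bp(utr)
    print(f"{name}: {total} bp (~{total / 1000:.1f} kb)")

for kb in northern_kb:
    print(f"{kb} kb band: implied 5'-UTR of about {implied_five_utr_bp(kb)} bp")
```

Under these assumptions the type 2 clone is shorter than the smallest band detected, which is consistent with the difficulty of reconciling clone and transcript sizes noted in the text.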
Figure 3:
Structural organization and exon-intron
junctions of the human tryptophan hydroxylase gene. A,
restriction map of the TPH gene. Filled and open boxes indicate the noncoding and coding exons, respectively. Open and filled circles represent HindIII and EcoRI sites, respectively. Three different phage clones
(
Figure 4:
Conservation of intron-exon junction
positions between the human TPH, tyrosine hydroxylase (TH),
and phenylalanine hydroxylase (PAH) genes. A, the
three genes are aligned to maximize amino acid identity. Boxes and thick lines indicate the exons and the introns,
respectively. Numbers in the boxes indicate the percentages of
identity in the exons between the hydroxylase genes. The colors white, gray, and black provide a visual
representation of the degree of identity: 0-35%, 36-59%,
and 60-100%, respectively. B, phylogenetic distance tree
based on the protein sequences of the aromatic amino acid hydroxylases.
After sequence alignment, the distance tree was constructed by the
Neighbor Joining Method (left; (24) ) and
bootstrap analysis (right) as described under ``Materials
and Methods.'' The tree was arbitrarily rooted on the tyrosine
hydroxylase Drosophila sequence. Note the heterogeneity of the
molecular clock between the three subfamilies of hydroxylases, which
does not allow the sequence duplications to be dated with
confidence.
A phylogenetic distance
analysis was performed after having aligned all of the available
sequences of the three aromatic amino acid hydroxylases (AAAH) isolated
from the different animal species. The topologies of the trees obtained
from the nucleotide and the amino acid sequences were identical.
Moreover, tree branching was also unchanged by using the whole
sequences or only the C-terminal two-thirds of the molecules where the
sequences aligned without any gap or deletion. Thus, all the parts of
the sequences have presumably evolved at roughly the same relative
rates. Each of the three vertebrate AAAH, i.e. tyrosine
hydroxylase, phenylalanine hydroxylase, and tryptophan hydroxylase,
clearly constitutes a monophyletic group, a contention unambiguously
supported by the bootstrap analysis (Fig. 4B). The
phylogeny of animal species is correctly reproduced by the AAAH
sequences, but the rate of sequence divergence varies significantly
among the hydroxylases (note branch lengths for human, rat, and mouse
in each of the three groups; Fig. 4B). Thus, there is
no satisfactory molecular clock to date with confidence the divergence
of the three AAAH. Similarly, it is not possible to determine which of
the three AAAH diverged first, although the absence of a bona fide TPH
in Drosophila(32) suggests an early divergence of
tyrosine hydroxylase before that of TPH and phenylalanine hydroxylase
from a common ancestral gene. One of the genomic clones,
The analysis
of the 5`-ends of TPH mRNA obtained by the anchored PCR technique
confirmed the predominance of the TPH mRNA containing the intron-like
sequence of 1.7 kb (I
Figure 5:
Determination of the transcription
initiation site by S1 nuclease protection. Poly(A)
The S1-mapping experiment using probe A,
which spans the cap site, also revealed an additional protected
fragment of 29 bases in pineal gland RNA (Fig. 5A). The
presence on the genomic sequence of a 5`-splice donor site 29
nucleotides downstream of the transcription initiation site suggested
that this fragment could correspond to a small exon located at the cap
site. This was confirmed by PCR experiments with a primer bordering the
cap site and another primer localized inside the 5`-end of the type 3
SLIC clone. Sequence analysis of the PCR fragment confirmed the
existence of the 29-base exon (called exon E
Figure 6:
PCR analysis of the 5`-UTR of human TPH
mRNA. PCR was performed with specific primers after first-strand cDNA
synthesis using poly(A)
Figure 7:
S1 nuclease analysis of the 5`-UTR exons (C), 5`-intron-like regions (A, B), and
coding region (D) of TPH mRNA. Poly(A)
The low abundance of
the spliced TPH mRNA 5`-UTR also was confirmed with two other probes (I
and J). Probe I corresponds to 187 bases of the 5`-noncoding extremity
of TPH mRNA and contains the first noncoding exon linked to the exon
E
Figure 8:
S1 nuclease analysis of the TPH 5`-UTR
region. Poly(A)
Finally, to determine which of the three
intron-like regions was retained in the 5`-UTR of the major species of
TPH transcripts, probes specific to these three regions were generated
by PCR amplification and hybridized to different Northern blots (Fig. 2B). One intron, the I
The study of the human TPH mRNA by cDNA cloning, anchored PCR
amplification, and S1 nuclease protection assays revealed a very
unusual organization of its 5`-UTR. Human TPH mRNA exhibited a large
diversity in the 5`-leader sequence, whereas the coding region was
identical in all of the tissues studied. Four TPH transcripts were
visualized by Northern blotting of both pineal gland and carcinoid
tumor RNA. To unravel this complex organization, the corresponding gene
was isolated and mapped. The sequence and the locations of the
intron-exon junction of the human TPH gene revealed very strong
similarity to those of the genes encoding other aromatic amino acid
hydroxylases (AAAH). The human TPH locus spans 29 kb and contains at
least 11 exons, and its mRNA appears to undergo differential splicing
in the 5`-UTR. The locations of intron-exon junctions of the mammalian
AAAH genes are very similar, particularly in the region corresponding
to the catalytic core of the enzyme (see Fig. 4). The only
exceptions are one intron specific to tyrosine hydroxylase (I-6), one
specific to phenylalanine hydroxylase (I-12), and one common to
tyrosine hydroxylase and phenylalanine hydroxylase, which is absent
from TPH and could therefore have been lost during the course of
evolution. The 5`-extremity of the fourth exon is 4 amino acids longer
in the TPH gene than in the corresponding exon of the other
hydroxylases. The N-terminal region is less well conserved and encoded
by a number of exons that varies from one enzyme to another, and even
among mammalian species, as in the case of tyrosine
hydroxylase(7) . Interestingly, the junction between the
regulatory and catalytic domains of the proteins corresponds to an
intron-exon junction in all the genes of the family. Among the
hydroxylase genes, the mouse and human TPH genes alone have introns in
the 5`-UTR (one and three introns, respectively). It is also very
likely that an intron is present in the 5`-UTR of the rabbit TPH mRNA.
Indeed, the 5`-leader sequence of the rabbit TPH cDNA shares 74.4%
identity and 85.7% identity, respectively, with the human TPH exons
E
Figure 9:
Comparison of human and rabbit TPH gene
5`-UTR sequences. A, nucleotide sequences of human and rabbit
TPH are numbered with respect to the transcription start site and to
the first nucleotide of the cDNA, respectively. =, nucleotides
conserved between the two species. B, schematic representation
and alignment of the rabbit TPH 5`-UTR with the exons (E
Phylogenetic distance analysis, using either the amino acid
or the nucleotide sequences of this gene family, only partly supports
the conclusions drawn by Woo and colleagues (12, 35) about the evolution of the AAAH genes. These
authors proposed that two major gene duplications have occurred; the
first one separated tyrosine hydroxylase from the common ancestor, and
the second one gave birth to phenylalanine hydroxylase and TPH.
However, the uncertainty about the regularity of the molecular clock in
this protein family (Fig. 4B) and the small number of
sequences available from animal species belonging to different phyla do
not allow the duplications to be dated with confidence. Nevertheless,
it was recently proposed that Drosophila melanogaster possesses only two aromatic amino acid hydroxylase genes, one
being tyrosine hydroxylase-homologous and the other having both
phenylalanine hydroxylase and TPH activities (32) . If one
could rule out the possibility that one of the hydroxylases was
eliminated as redundant in the Drosophila lineage, one could
propose that the first duplication occurred before, and the
second after, the divergence of arthropods from the other taxa (600
million years ago).
Human TPH mRNAs are characterized by large
diversity within, and restricted to, the 5`-noncoding region. This
diversity results from the conservation of one or more intron-like
regions in the 5`-leader sequence of the TPH mRNA and from differential
splicing of three exons in the 5`-UTR of the spliced TPH mRNAs. Generally,
mRNAs that display 5`-UTR diversity, for example those encoding
mouse choline acetyltransferase(36) , human insulin-like growth
factor II(37) , and aldolase A(38) , are
transcribed from alternative promoters, followed by the splicing of
intervening sequences in the 5`-UTR. An extreme example is the
hydroxymethylglutaryl-CoA reductase gene, where a more complex
mechanism involves the combination of multiple transcription initiation
sites and various 5`-splice donor sites for one intron(39) .
This diversity within the mRNA 5`-leader sequences is therefore
associated with the use of alternative promoters, which could be
preferentially activated in particular tissues or stages of
development(37, 40, 41) . In the case of the
human TPH gene, however, the multiple mRNA species are transcribed from
a single promoter, and the variety of TPH messengers is the result only
of the differential splicing of three intron-like regions and of the
three exons located in the 5`-UTR. It is surprising that the three
intron-like sequences in the 5`-UTR of TPH mRNAs are in many cases
retained when the introns of the coding region are eliminated. Northern
blotting clearly identified high molecular weight TPH transcripts,
which may result from differential splicing, generating unusually long
5`-noncoding sequences. These transcripts are more abundant than would
be expected of processing intermediates. This led to their isolation
directly from the screening of the cDNA library, a rather uncommon
event. PCR experiments only allowed the cloning and the
characterization of several rare, differentially spliced TPH mRNA
species where the three intron-like regions are eliminated. Generally,
mRNAs with long 5`-leader sequences correspond to precursors. In this
latter case, the abundance of long 5`-UTR in TPH mRNAs would imply
that the excision of the three intron-like regions is sufficiently slow
and that these messengers are sufficiently stable to allow their
accumulation. The limiting step of the TPH mRNA processing could be the
splicing of the 5`-leader sequence rather than nuclear RNA degradation.
Therefore, there appear to be two steps in the processing of human TPH
mRNA. The first is rapid, eliminating the introns of the coding region.
The second is slower, leading to a complex pattern of 5`-UTR
maturation. The mRNAs not containing region I
It is generally thought that the 5`-UTR contributes to
the stability (43) and to the regulation of translation of the
messengers(44, 45) . However, the presence of long
5`-noncoding sequences in TPH mRNAs poses many problems with regard to
translation mechanisms. The initiation of translation in higher
eukaryotes is modulated by several structural features in the
5`-untranslated region of mRNA. They include the m7G cap, the position
of the AUG codon, the length of the leader sequence, and secondary
structures(46) . mRNAs with long 5`-UTR could correspond to
precursors or to otherwise nonfunctional transcripts. Indeed,
translation initiation optimally requires a short 5`-noncoding region
and no AUG codon upstream of that used to initiate
translation(47) . The introns that are retained in the
5`-leader sequence considerably impair translation efficiency because
they often contain an AUG-burdened leader sequence. Several AUG codons
upstream from the translation initiator AUG are present in the type 1
and type 2 human TPH cDNA clones. All of these upstream AUG codons are
followed by short open reading frames that could potentially encode
peptides. According to the scanning model of translation, these
AUG-burdened RNA sequences corresponding to the high molecular weight
TPH transcripts are expected to be poorly translated, a characteristic
that could be compensated for by the abundance of these mRNAs. In this
case, these mRNAs would be translated without additional maturation. In
addition, there have been several reports of abundant, incompletely
spliced transcripts that enter the cytoplasm (48) and that
have also been found on polysomes (49) . This observation
suggests that the introns, when they are maintained in the transcripts,
could play a role in the regulation of gene expression. In contrast,
it has recently been shown that precursor RNAs can be synthesized and
stored for later processing(50) . In this model, the large TPH
RNAs would be precursors, to be translated only after a maturation
step, a mechanism that easily accounts for the abundance of TPH mRNAs
bearing long 5`-UTR relative to those with short 5`-UTR. Only the TPH mRNAs
with a short 5`-leader sequence would be effectively translated, and
the low abundance of these transcripts could result from their rapid
degradation. The conversion of a stable, untranslatable precursor to a
functional mRNA generates a supplementary step in the regulation of
gene expression. Finally, internal translation initiation as
described for some viral and eukaryotic genes (51, 52, 53) could also explain the abundance
of long TPH mRNAs. This translation initiation mechanism allows
messengers with long 5`-leader sequences to be efficiently translated.
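The upstream AUG codons and short open reading frames discussed above in connection with the scanning model can be located in a leader sequence with a simple scan. The sketch below is a minimal illustration on a hypothetical toy leader, not on the TPH 5`-UTR itself; the function name `upstream_orfs` is an assumption, and the scan reports only those AUG codons that are followed by an in-frame stop codon within the leader.

```python
def upstream_orfs(five_utr: str):
    """List (start_index, length_in_codons) for each upstream AUG that is
    followed by an in-frame stop codon within the leader.

    The length counts the AUG codon itself and excludes the stop codon.
    RNA input (with U) is accepted and converted to DNA letters.
    """
    seq = five_utr.upper().replace("U", "T")
    stops = {"TAA", "TAG", "TGA"}
    found = []
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk codon by codon from the AUG until the first in-frame stop.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in stops:
                found.append((start, (pos - start) // 3))
                break
    return found


# Hypothetical toy leader with two short upstream open reading frames.
toy_leader = "GGATGAAACCCTGAGGCATGTTTTAAGG"
print(upstream_orfs(toy_leader))
```

An upstream AUG with no in-frame stop inside the leader would run into the downstream sequence and is deliberately ignored here; handling that case would require the coding sequence as well.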
Each of these three models could account for the large difference
between the amounts of human TPH mRNAs with long and short 5`-UTR, and
the intracellular localization of the high molecular weight TPH mRNAs
may indicate whether or not they can be translated. In any case, the
diversity exhibited by the 5`-UTR of the human TPH mRNAs may play a
physiological role in the production of TPH enzyme. It increases the
possibility of modulation of TPH gene expression at
post-transcriptional and translational levels. Another particularity
of TPH mRNA expression is the discrepancy between the tissues in which
the TPH enzyme and mRNA are found. Northern blot analysis detected TPH
transcripts in the pineal gland, intestine, and carcinoid tumor but not
in the brainstem raphe nuclei, which nevertheless contain TPH. There
have been similar observations in rat, rabbit, and
mouse(13, 16, 18) . The discrepancy between
TPH mRNA and protein levels in the brainstem could be explained by (i)
the existence of another TPH gene expressed specifically in the raphe
nuclei, (ii) better translation efficiency of very small amounts of TPH
mRNA, or (iii) enhanced stability of the TPH protein. Measurements of
the TPH gene transcription rate have shown that the level of gene
expression was similar in the pineal gland and in the brainstem,
suggesting post-transcriptional or translational regulation of the TPH
mRNA(21) . It is possible that no high molecular weight TPH
mRNAs are transcribed in the human brainstem raphe nuclei and that only TPH
transcripts with spliced 5`-UTR are synthesized. These short mRNAs may
be efficiently translated and then rapidly degraded. In the carcinoid
tumors and pineal gland, large amounts of TPH mRNAs are produced.
Surprisingly, no short 5`-leader sequences of TPH messengers are
detected in the carcinoid tumors by S1 nuclease protection, although
they are in the pineal gland. The abundance of these high molecular
weight TPH mRNAs in carcinoid tumors could reflect a high transcription
rate or RNA stability peculiar to the mitotic character of this tissue.
Although these tumors synthesize and secrete very high levels of
serotonin, it is not known if the pathological cells produce more
active TPH than healthy enterochromaffin cells. In conclusion, the
cells expressing the TPH gene contain a large and complex variety of
TPH mRNA forms differing in the 5`-UTR. Although the functional
consequences of this phenomenon are only beginning to be investigated,
it provides interesting clues to novel mechanisms of regulation of gene
expression. An important aspect of TPH expression in the pineal gland
is its rhythmicity. In rat, TPH activity and the mRNA levels have been
shown to vary during the circadian rhythm(42, 54) .
The nucleotide sequence reported in this paper has been submitted to the
GenBank(TM)/EMBL Data Bank with accession number X83213.
Volume 270, Number 8, Issue of February 24, 1995, pp. 3748-3756
©1995 by The American Society for Biochemistry and Molecular Biology, Inc.
AN UNUSUAL SPLICING COMPLEXITY IN THE 5`-UNTRANSLATED REGION (*)
Tryptophan hydroxylase (TPH) is the key enzyme in
the biosynthetic pathway of the neurotransmitter serotonin. In mammals,
TPH is only expressed by a small number of tissues, namely the
brainstem raphe nuclei, the pineal gland in the central nervous system,
and the pancreatic and intestinal enterochromaffin cells in the
periphery. The serotonergic neurons of the raphe nuclei form a highly
divergent neuronal system controlling the basic activity of many target
regions distributed throughout the forebrain, the cerebellum, and the
spinal cord. This neuronal system modulates a variety of psychological
and physiological processes including thirst and appetite, sleep and
memory, and reproduction(1) . In the pineal gland, the
concentration of serotonin, which is an intermediate in melatonin
synthesis, is higher than in any other region of the brain or in any
other organ analyzed. The production of melatonin is characterized by a
dark-light circadian rhythm, which synchronizes the circadian and
ultradian cycles involved in a variety of functions including sleep,
sexual behavior, and body temperature(2, 3) . However,
the mechanisms involved in regulating these diverse activities remain
uncharacterized. At the periphery, TPH is also present in the
autonomic nervous system, in the intestinal and pancreatic
enterochromaffin cells, and in carcinoid tumors. These tumors develop
from intestinal enterochromaffin cells, produce large amounts of TPH,
and secrete large amounts of serotonin, accounting for most of the
symptoms associated with this pathology(4, 5) .
in their
tetrameric quaternary structure to coordinate oxygen
molecules(6) . The human TPH amino acid sequence is similar to
those of tyrosine hydroxylase (7) and phenylalanine
hydroxylase(8) , and the most conserved region comprises the
340 C-terminal amino acids of the proteins. This region shows 49%
identity among the mammalian hydroxylases without any gaps and with
mostly conservative substitutions (the TPH enzyme contains an extra 5
amino acids at the C-terminal end not found in the other enzymes).
Biochemical (9, 10, 11) and sequence analyses (12) suggest that this domain corresponds to the catalytic part
of the enzyme. The N-terminal domain differs more in size and sequence
between the enzymes and has been proposed to modulate the enzyme
activity. Relatively little is known about the regulation and the
biochemistry of TPH because it is not abundant and is extremely
unstable in vitro. Therefore, cloning the gene could lead to a
much better understanding of the biochemistry and regulation of TPH.
Screening a Carcinoid Tumor cDNA Library and Sequence
Analysis
A cDNA library was constructed from poly(A)
RNA extracted from a human intestinal carcinoid tumor (kindly
provided by J. M. Launay, Paris) using the method developed by Gubler
and Hoffman(22) . Double-stranded cDNA was size-fractionated on
a 10-30% sucrose gradient (containing 1 M NaCl) and
cloned into the EcoRI-digested
ZapII vector
(Stratagene). 4 10
recombinant plaques of this
human cDNA library were screened with a rat TPH cDNA probe at low
stringency. Positive clones were isolated and analyzed by restriction
mapping and Southern blotting. The EcoRI DNA fragments of the
longest type 1 and type 2 TPH cDNA clones were sequenced on both
strands.
Screening a Human Genomic Library
A human genomic
library constructed in the
EMBL3 vector was obtained from Clontech
and screened by standard procedures with a probe containing the human
TPH coding sequence and part of the 5`-flanking region. 6
10
bacteriophages corresponding to about 3 times the
equivalent of the size of the human genome were screened, and the
positive clones were purified by four sequential rounds of plating and
hybridization. Bacteriophage DNA was prepared from selected plaques
according to the method described in Sambrook et al.(26) and analyzed by restriction mapping and Southern blotting.
Genomic DNA fragments were subcloned into pBluescript vectors, and the
intron-exon junctions were sequenced using TPH-specific primers.
Northern Blot Analysis
Total RNA was extracted
from human pineal gland, carcinoid tumor, liver, dorsal raphe nuclei,
and colon (kindly provided by F. Javoy-Agid, Paris) according to the
technique described by Civelli et al.(27) . The
poly(A)
RNA of human pineal gland, carcinoid tumor,
and liver were purified by oligo(dT)-cellulose chromatography. Northern
blots were performed as described by Faucon Biguet et
al.(28) .
S1 Nuclease Protection Analysis
S1 nuclease
mapping was performed according to Sambrook et
al.(26) . Aliquots of the labeled probes and 0.25 µg
of poly(A)
mRNA from human carcinoid tumor, pineal
gland, and liver were hybridized for 16 h at 45 °C. The samples
were digested with S1 nuclease (100-400 units/ml) at 37 °C
for 75 min. The specifically protected DNA fragments were separated on
a 4-5% denaturing polyacrylamide gel and visualized by
autoradiography.
PCR Amplification
Single-stranded cDNAs were
synthesized from 0.2 µg of poly(A)
mRNA prepared
from carcinoid tumor and pineal with 10 µM of pd(N)
primer and avian myeloblastosis virus reverse transcriptase. The
amplification was carried out for 30 cycles with 0.3 µM of
forward and reverse primers in 1.5 mM MgCl
.
Thermal cycling was as follows: denaturation at 93 °C for 40 s;
annealing at 53 °C or 58 °C (depending on the T
of the primers) for 40 s; elongation at 72 °C for 120 s. The
primers O
, O
, and O
corresponded
to parts of exons E
, E
, and E
,
respectively. Primers O
and O
overlapped exons
E
,E
and E
,E
,
respectively, to amplify transcripts containing these two sets of
exons. PCR amplification with the primers O
and O
was performed with an annealing temperature of 50 °C to
prevent the annealing of these primers to only one exon. The primer
sequences were as follows: O
, CGACCCAGCCTGCACCTAC;
O
, TACTGGCGCCCGAGGTGAG; O
,
TCCCCTTTCTAAGGAATGGTCTTTG; O
, GCACCTACTGGCGCCCGAGTGGTA;
O
, TTTGGAGTAATTCTCTAAAACCATT. This technique is hereafter
referred to as RT-PCR.
Cloning by Anchored PCR of the 5`-Ends of the mRNA
Anchored PCR was performed as described by Dumas-Milne-Edwards et al.(29) . Briefly, the first cDNA strand was
synthesized by avian myeloblastosis virus reverse transcriptase
initiated from 0.5 µM of a TPH-specific primer (SBPE1),
which corresponds to a sequence 168 bases downstream of the translation
initiation codon and with 0.2 µg of poly(A)
RNA
from carcinoid tumor and pineal gland. Oligonucleotide BM-5` (29) was ligated to the 3`-end of the cDNA with T4-RNA ligase.
Half of the ligation product was amplified by PCR with 0.15 µM primers (BM-5`-1 (29) and a TPH oligonucleotide (SBSLIC)
overlapping the SBPE1 extension primer) in 1.5 mM MgCl
. Two nested PCR amplifications were performed on
the previously amplified products with primers BM-5`-2, BM-5`-3 (29) and with a TPH-specific primer, O
, which
corresponds to a sequence 80 nucleotides upstream of the extension
oligonucleotide (SBPE1). PCR products were analyzed on 3% agarose gels,
blotted onto nylon membranes, and hybridized with a primer
complementary to a sequence of the first coding exon (SBCRIB). The
bands hybridizing with SBCRIB were eluted from the gel, subcloned into
M13 mp8/SmaI and sequenced on both strands. The sequences of
the primers were as follows: SBPE1, 5`-TCTTCTTTTTGATTTTCGGGAC; SBSLIC,
5`-TTTTCGGGACTCGATATGTAACAGATTC; SBCRIB,
5`-TTTTCGGGACTCGATATGTAACAGATTC.
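Basic properties of the primers listed in this section can be tabulated with a short script. The sketch below reports length, GC content, and a rough melting temperature from the Wallace rule, Tm = 2(A+T) + 4(G+C) in °C. The Wallace rule is a rule of thumb for short oligonucleotides and is used here as an assumption; the text does not state how the primer T values were estimated. The function name `primer_stats` and the label given to the first O-series sequence are illustrative only.

```python
def primer_stats(primer: str) -> dict:
    """Length, GC percentage, and Wallace-rule Tm for a DNA primer."""
    seq = primer.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at != len(seq) or not seq:
        raise ValueError("primer must be a non-empty string of A, C, G, T")
    return {
        "length": len(seq),
        "gc_percent": 100.0 * gc / len(seq),
        # Wallace rule (assumption): 2 degrees per A/T, 4 degrees per G/C.
        "wallace_tm_c": 2 * at + 4 * gc,
    }


# Sequences copied from the text; the first label is illustrative because
# the subscripts of the O-series primer names are not legible in this copy.
primers = {
    "first O-series primer": "CGACCCAGCCTGCACCTAC",
    "SBPE1": "TCTTCTTTTTGATTTTCGGGAC",
    "SBSLIC": "TTTTCGGGACTCGATATGTAACAGATTC",
}

for name, seq in primers.items():
    stats = primer_stats(seq)
    print(f"{name}: {stats['length']} nt, "
          f"{stats['gc_percent']:.1f}% GC, "
          f"Tm ~{stats['wallace_tm_c']} C")
```

Estimates of this kind are mainly useful for comparing primers with each other, for example when choosing between the two annealing temperatures mentioned for the RT-PCR.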
Multiple Human TPH mRNAs
Approximately 4
10
bacteriophages from a human carcinoid tumor cDNA library
were screened at low stringency with a rat TPH cDNA probe. Twenty
positive clones were isolated and analyzed by restriction mapping,
Southern blotting, and sequencing. The coding regions of these clones
and most of their 3`-untranslated regions (UTR) were identical; they
differed, however, by the length of their 5`-ends. The open reading
frame of 1332 bp (previously published; (15) ) was very similar
to that of the rat, rabbit, and mouse TPH cDNA. The 3`-noncoding
sequence, including the poly(A) tail, was about 2 kb long and also
exhibited a high degree of identity with part of the corresponding rat
and rabbit sequences. However, the size and sequence of the human TPH
5`-UTR differed from those of the rodents and resembled the rabbit
TPH cDNA only over the 50 bases immediately upstream of the AUG
codon.
) 26 nucleotides upstream of the translation initiation
codon (Fig. 1A). The 5`- and 3`-extremities of the
I
region showed intron splice site sequences, suggesting
that I
may be recognized as an intron in some TPH
transcripts. Therefore, the two types of cDNA clones may result from
alternative splicing in the 5`-UTR; alternatively, the type 1 cDNA clones
may derive from TPH precursor RNA, whereas the type 2 clones derive from the mature transcript.
RNA from carcinoid tumor (Carcinoid T.), pineal gland (Pineal and Pineal
(5d)), and 20 µg of total RNA from raphe nuclei, colon, and
liver were subjected to gel electrophoresis, blotted onto a nylon
filter, and hybridized to
P-labeled TPH probe. Pineal and
Pineal (5d) are identical samples with different exposure times
(Pineal, 18-h exposure; Pineal (5d), 5-day exposure). The arrows show the TPH mRNAs and indicate their molecular
weights. B, Northern blot analysis of TPH transcripts with
intron and exon probes. Carcinoid tumor (right panel) and
pineal gland RNAs (left panel) were hybridized with a
1-kb-long TPH coding sequence (lane 1), I
intron-like region (lane 2), I
intron-like
region (lane 3), or I
intron-like region probes (lane 4).
Cloning and Mapping of the Human TPH Gene and
Phylogenetic Analysis
A human genomic DNA library in EMBL3
was screened with the TPH cDNA probe, and six independent positive
clones were isolated. The restriction maps of the clones were
determined, and three clones that span the entire coding region of the
TPH gene were selected for further analysis (Fig. 3A).
The coding region, parts of the introns, and the 5`- and 3`-noncoding
flanking regions were sequenced. The individual coding exons were
mapped, and the sequences of each exon-intron boundary were determined
with TPH-specific oligonucleotides (Fig. 3B). TPH mRNA
is transcribed from a minimum of 11 exons, contained within a region of
genomic DNA of 29 kb. The intron locations in the three paralogous
hydroxylase genes are strongly conserved (Fig. 4A).
Interestingly, the number, position, and size of introns in the coding
region are conserved between the human TPH gene and the mouse TPH gene
whose organization has been previously reported(17) . Moreover,
the nucleotide sequences in the vicinity of the splice sites of the
human introns are similar to the corresponding regions in the mouse.
The exon sizes are between 63 and 197 nucleotides, and all of the donor
and acceptor splicing sites conform to the consensus sequences for
eukaryotic genes(30) . The exon that spans the 3`-part of the
open reading frame and the whole 3`-untranslated sequence is about 2 kb
long and contains 172 coding nucleotides. The length of the 3`-UTR is
conserved among the human, rat, and mouse TPH
genes(16, 18) , and their sequences are very similar
for the first 200 nucleotides downstream of the stop codon. The
sequence at the 3`-end of the human gene contained a weak
polyadenylation signal, AAUAGA, as compared with the canonical AAUAAA
polyadenylation signal(31) .
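The difference between the hexamer found at the 3`-end of the human gene and the canonical signal can be made explicit with a position-by-position comparison. This is a minimal sketch; the function name `mismatches` is a hypothetical helper, and the comparison says nothing about how efficiently either signal is used.

```python
CANONICAL = "AAUAAA"  # canonical polyadenylation signal
OBSERVED = "AAUAGA"   # hexamer reported at the 3'-end of the human TPH gene


def mismatches(observed: str, reference: str) -> list[tuple[int, str, str]]:
    """List (1-based position, observed base, reference base) for each difference."""
    if len(observed) != len(reference):
        raise ValueError("sequences must have the same length")
    return [
        (i + 1, o, r)
        for i, (o, r) in enumerate(zip(observed, reference))
        if o != r
    ]


print(mismatches(OBSERVED, CANONICAL))  # prints: [(5, 'G', 'A')]
```

The two hexamers differ at a single position (position 5, G instead of A).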
12,
13,
15) cover the whole TPH gene. B,
exon-intron structure of the human TPH gene. The exon sequences are
denoted in uppercase letters; intron sequences are in lowercase and are given for each junction. The amino acids and
their corresponding positions on the cDNA sequence are also shown underneath each exon-intron
boundary.
12,
which contained both the 5`-noncoding region of TPH cDNA and upstream
sequences, was analyzed by detailed restriction mapping (Fig. 1C). The sequence of the region upstream of the
translation initiation site on the genomic DNA was identical to the
entire 5`-UTR of the type 1 cDNA clones. It was therefore clear that
the domain I
characterizing the type 1 cDNAs corresponded
to an unspliced, intron-like sequence.
Cloning of the 5`-UTR of the Human TPH mRNA by Anchored PCR
The TPH mRNA diversity revealed by cDNA cloning and Northern
blotting was likely to be confined to the 5`-UTR. We therefore used the
anchored PCR technique (SLIC, (29) ) to isolate other 5`-leader
sequences for the TPH mRNA. Several 5`-directed cDNAs were synthesized
and cloned from human pineal gland and carcinoid tumor, both tissues
that strongly express the TPH gene. Only two classes of transcripts
were obtained, as shown by restriction mapping and sequence analysis (Fig. 1B). In both tissues, most of the isolated clones
were about 0.3 kb long and contained the 3`-end of the I
region. They corresponded to the type 1 TPH cDNA clone. This
result agreed with the cDNA library screening, where the majority of
cDNA clones isolated carried the I
sequence. The second
type of SLIC clone was about 0.35 kb long and characterized by the
absence of the I
region, as were the type 2 cDNA clones.
However, it differed from type 2 cDNA 5`-UTR in that the first 224
bases were replaced by a previously unidentified sequence of 210 bases (Fig. 1B). This clone was called type 3 SLIC cDNA. The
comparison of the type 3 SLIC clone sequence to that of the genomic DNA
showed that the 210-nucleotide stretch is located 2.4 kb upstream of
the identified 5`-UTR sequence. Therefore, this 2.4-kb region could be
recognized as an intron, which, when spliced together with the I
intron-like region, generates an additional exon of 63
nucleotides in the 5`-leader sequence (named exon Ec; Fig. 1C). The nucleotide sequence, upstream of exon
E
, found in type 1 and type 2 cDNA also corresponded to the
nonspliced 3`-end of this 2.4-kb intervening sequence.
) and also showed the existence of a
new 5`-noncoding extremity. In short, cloning TPH cDNAs revealed a
diversity in the 5`-UTR of the human TPH mRNAs resulting from the
conservation or the elimination of two intron-like regions, I
and I
.
Determination of the mRNA Cap Site by S1 Nuclease Protection
We mapped the transcriptional initiation site of the
TPH gene as follows. The nucleotide sequence of the genomic DNA
encompassing the 5`-end of the type 3 SLIC clone and extending 1.5 kb
upstream was completely analyzed by S1 nuclease protection assays (Fig. 5). The labeled fragment of 0.57 kb (probe D) spanning the
5`-end of the type 3 SLIC clone was fully protected in pineal gland and
carcinoid tumor mRNAs, indicating that the TPH mRNA extended upstream
of probe D (Fig. 5D). This result, also confirmed
by PCR analysis (data not shown), showed that the cloning techniques we
used did not allow us to reach the transcription start site. A more
extensive S1-mapping study was undertaken with three TPH genomic probes
(probes A, B, and C) upstream of the 5`-end of the type 3 SLIC clone
and covering 1.05 kb of genomic DNA. The labeled fragments B and C were
completely protected from S1 nuclease by RNAs extracted from both
tissues. In contrast, hybridization of the RNAs with probe A yielded a
major protected fragment (126 bases long) much smaller than the probe,
suggesting divergence between the genomic and the mRNA sequences (Fig. 5A). The genomic sequence did not contain any
acceptor splice site that coincided with the S1 nuclease cleavage.
Moreover, numerous putative cis-acting DNA elements, commonly found in
eukaryote gene promoters (TATA box, inverted CCAAT box, CCACCC box,
GC-rich sequences, AP-2 and AP-4 binding sites), were present
immediately upstream of the S1 nuclease cleavage site. This site,
therefore, probably corresponded to the TPH transcription initiation
site. Primer extension, using an oligonucleotide hybridizing to the
3`-end of probe A, yielded a single extension product consistent with
transcription beginning at the residue predicted by S1 nuclease
protection (data not shown). Thus, the TPH RNA start site
is located 5.5 kb upstream of the ATG codon in genomic DNA. In
addition, S1 nuclease protection analysis revealed that the region of
1.4 kb (named I
), localized between the cap site and the
5`-end of the type 3 SLIC clone, is present at the 5`-extremity of many
of the human TPH mRNAs.
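The reasoning behind the S1 mapping can be summarized as a comparison between probe length and protected fragment length: a fully protected probe lies entirely within the transcript, whereas a shorter protected fragment marks a point where mRNA and genomic sequence diverge (here interpreted as the transcription start). The sketch below is illustrative only. Probe lengths are taken from the Fig. 5 legend; for probes B, C, and D the protected length is set equal to the probe length because they are described as completely protected, and the `tolerance` parameter is a hypothetical allowance for gel sizing error.

```python
from dataclasses import dataclass


@dataclass
class S1Result:
    probe: str
    probe_len: int      # probe length in nucleotides
    protected_len: int  # length of the major protected fragment


def interpret(result: S1Result, tolerance: int = 5) -> str:
    """Classify a probe as fully or partially protected from S1 nuclease."""
    if result.protected_len >= result.probe_len - tolerance:
        return "full"     # transcript is colinear with the whole probe
    return "partial"      # transcript and genomic sequence diverge within the probe


results = [
    S1Result("A", 420, 126),  # major protected fragment of 126 bases
    S1Result("B", 347, 347),  # assumed equal to probe length (fully protected)
    S1Result("C", 287, 287),  # assumed equal to probe length (fully protected)
    S1Result("D", 574, 574),  # assumed equal to probe length (fully protected)
]

for r in results:
    print(r.probe, interpret(r))
```

Only probe A is classified as partially protected, which is the observation that places the putative start site within that probe.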
RNA (0.25 µg) from pineal (Pi), carcinoid tumor (CT),
and liver (L) was subjected to S1 nuclease protection
analysis. Pr-S1, probe without S1 nuclease; Pr+S1, probe with S1 nuclease. The arrows show
the protected fragments and their corresponding sizes. A,
probe A was a 0.42-kb SmaI-PstI fragment. B,
probe B was a 0.347-kb HindIII-SmaI fragment. C, probe C was a 0.287-kb EcoRI-HindIII
fragment. D, probe D was a 0.574-kb KpnI-EcoRI fragment. The bottom panel represents the position of probes A, B, C, and D in genomic DNA (horizontal arrows). The vertical arrows indicate the
position of restriction enzyme sites, and the broken arrow indicates the transcription initiation site. Open boxes indicate the 5`-noncoding exons.
) and showed
the splicing of the 1.4-kb region (data not shown). In this fragment,
exon E
was joined to a 179-base sequence (named exon
E
) within the 210-bp stretch of the type 3 SLIC clone (Fig. 1C). Therefore, type 3 SLIC clone contained a
part of the region I
, which, like the I
and
I
regions, could be recognized as an intron but is present
in most TPH mRNAs extracted from the pineal gland and carcinoid tumors
(see above).
Characterization of a Novel TPH mRNA Species with Short 5`-UTR by RT-PCR Analysis
To help decipher the complete
organization of the 5`-untranslated region of TPH mRNAs, both pineal
gland and carcinoid tumor mRNAs were exhaustively analyzed by RT-PCR.
Two forward primer sequences were selected downstream of the
transcription initiation site. One was specific for the 29-bp exon
E
(O
) and was chosen to reveal the possible
presence of short TPH mRNA 5`-UTR, which would result from the splicing
of the three intron-like regions (I
, I
, and
I
). The second was specific for the I
sequence (O
) and was used to confirm the presence of
long 5`-extremities in TPH mRNAs. The sequence of the reverse primer
(O
) corresponds to a region downstream of the translation
initiation site. Several very short PCR products were obtained with
each of the two sets of primers, demonstrating the existence of short
5`-UTRs in TPH mRNAs in the two tissues studied. Surprisingly, no long
PCR product was amplified. However, when fragments of very different
sizes are amplified simultaneously by RT-PCR, amplification of
the shortest products is strongly favored over that of the longer ones,
whatever the initial abundance of their respective
mRNAs(33, 34). Three fragments (103, 166, and 282 bp)
were amplified by PCR with primers O
-O
(Fig. 6). Their sequences were determined and showed that
the 103-bp band corresponded to exon E
joined to the first
coding exon E
(containing 26 noncoding nucleotides at its
5`-end). The 166- and 282-bp products resulted from the insertion
between the exon E
and the exon E
of the exon
E
and the exon E
, respectively (Fig. 6A). Only two fragments (187 and 250 bp) were
generated with primers O
-O
and
corresponded to an extension of 100 bp from the first 29-nucleotide
exon E
linked to either exon E
or exon E
and exon E
(Fig. 6B). Thus, there
appeared to be two 5`-splice donor sites in the first noncoding exon,
both of which are used in the pineal gland and carcinoid tumor (the
resulting fragments were named E
and E
). An
additional 5`-untranslated extremity containing all of the exons
E
, E
, E
, and E
joined
together was also identified but only by using a set of primers
covering the junction E
-E
and the junction
E
-E
, suggesting that this transcript is
of very low abundance (Fig. 6C). Thus, in TPH mRNAs
containing exon E
, all combinations of exons were found
upstream of exon E
. However, when the exon E
was present, exon E
was never found between E
and E
. This suggests that there is a variety of
splicing patterns, all of which generate TPH mRNAs of about 3.5 kb
with short 5`-UTRs.
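The sizes of the three products obtained with the first primer pair are arithmetically consistent with the exon lengths given earlier: the 103-bp product plus the 63-nucleotide exon, or plus the 179-base exon. The following sketch only checks that arithmetic. It assumes that the primer positions are the same in all products and that each inserted exon is included in full; the names `exon_63nt` and `exon_179nt` are placeholder labels, not the authors' exon names.

```python
BASE_PRODUCT = 103  # bp: first noncoding exon joined directly to the first coding exon

# Lengths of the 5'-noncoding exons reported in the text (placeholder labels).
INSERTED_EXONS = {
    "exon_63nt": 63,
    "exon_179nt": 179,
}


def product_size(base: int, inserted: list[str]) -> int:
    """Expected amplicon size when the named exons are inserted between the primers."""
    return base + sum(INSERTED_EXONS[name] for name in inserted)


expected = {
    "no insertion": product_size(BASE_PRODUCT, []),
    "with exon_63nt": product_size(BASE_PRODUCT, ["exon_63nt"]),
    "with exon_179nt": product_size(BASE_PRODUCT, ["exon_179nt"]),
}
print(expected)
```

The computed sizes (103, 166, and 282 bp) match the three fragments reported for this primer pair.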
mRNA isolated from carcinoid
tumor (CT) and pineal gland (Pi). Open and hatched boxes represent the noncoding and coding exons,
respectively. The organization of the 5`-noncoding extremity for each
PCR fragment is schematized. A, radiolabeled PCR with specific
TPH primers (O
and O
). B, radiolabeled
PCR with specific TPH primers (O
and O
). The
products of the two sets of primers were separated on a 5% denaturing
polyacrylamide gel and visualized by autoradiography. C, PCR
with primers overlapping exons E
,E
and
E
,E
. The products of PCR were separated on a 3%
agarose gel, transferred to a membrane, and hybridized with a labeled
oligonucleotide (SBCRIB).
Determination of the Abundance of the TPH mRNA with Short
5`-UTR as Compared with That of Long 5`-UTR TPH mRNAs
To study
the relative abundance of the various 5`-UTR of the TPH mRNAs in pineal
gland and carcinoid tumor, additional S1 nuclease protection
experiments were performed. TPH probes hybridizing with sequences in
either the introns or overlapping noncoding and coding exon-intron
junctions were used (Fig. 5 and Fig. 7). Part of the
coding region was analyzed with the cDNA probe H, which overlaps three
exons (E
, E
, and E
). It was
entirely protected by both pineal gland and carcinoid tumor mRNAs,
indicating that intervening introns were eliminated in this domain (Fig. 7D). In contrast, probes C, E, and F located in
the intron-like regions (I
I
, and
I
) of the 5`-UTR, each were fully protected by the same
RNA (Fig. 5C and Fig. 7, A and B). Therefore, the three intron-like regions in the 5`-UTR
(I
, I
, and I
) were conserved
in most of the TPH mRNAs, whereas the introns interrupting the coding
region were usually eliminated. The splicing of the intron-exon
junctions E
-I
and I
-E
was investigated with probes A and G, respectively. Probe G
yielded two fragments, which corresponded to the spliced and to the
nonspliced intron-like sequence I
, thereby showing the
presence of two classes of transcripts, one retaining and one lacking the 1.7-kb region
(Fig. 7C). The TPH mRNAs in which unspliced
intron-like regions are maintained in the 5`-UTR are much more abundant
in carcinoid tumor than in pineal gland. Similar results were obtained
with probe A, which covered the junction of the exon E
and
the intron-like region I
(Fig. 5A). No
signal corresponding to the transcript from which intron-like region
I
had been eliminated was detected in the tumoral tissue.
Therefore, the relative abundance of the spliced transcript was higher
in pineal gland than in carcinoid tumor RNAs.
RNA (0.25 µg) from pineal (Pi), carcinoid tumor (CT),
and liver (L) was subjected to S1 nuclease protection. Pr-S1, probe without S1 nuclease; Pr+S1, probe
with S1 nuclease. A and B, S1 nuclease mapping with
the E and F probes containing 290 and 362 nucleotides from the
intron-like regions. C, probe G was 250 nucleotides long and
contains a part of the exon E
(117 nucleotides). D, probe H was 330 nucleotides long and covered three coding
exons. The arrows show the protected fragments for each probe
used. The bottom panel shows the positions of probes (E, F, G,
H) in genomic DNA (horizontal arrows). Open and shaded boxes indicate the 5`-noncoding and coding exons,
respectively.
(Fig. 8A). The entire probe was fully
protected by the pineal gland RNA, showing that short mRNA
5`-extremities are present in normal tissue. Only very weak
protection was obtained in the carcinoid tumor, where splicing of
intron-like regions in the TPH 5`-UTR is rare. In contrast, probe J
(349 bases), complementary to the sequence of the type 3 SLIC clone,
was poorly protected in the two tissues (Fig. 8B).
Therefore, the type 3 SLIC clone isolated by anchored PCR is a minor
form of TPH mRNA.
RNA (0.25 µg) from pineal gland (Pi), carcinoid tumor (CT), and liver (L)
was subjected to S1 nuclease protection. Pr-S1, probe without
S1 nuclease; Pr+S1, probe with S1 nuclease. A,
S1 nuclease mapping with probe I of PCR fragment (187 bases)
corresponding to one of the TPH 5`-noncoding extremities. B, probe J is 349 bases long and corresponds to the type 3 SLIC
clone. The arrows show the fragments protected by each probe.
The schematic organization of each protected fragment is represented. Open boxes indicate the 5`-noncoding exons and horizontal
lines indicate the intron-like region.
, contained
three antisense Alu sequences separated by up to a few hundred bases of
non-Alu DNA. Thus, we chose an intronic probe in a region located
outside of the repetitive elements. However, no simple conclusion can
be drawn from these hybridizations. In both carcinoid tumor and pineal
gland, the major 5-kb band was recognized by each of three probes,
suggesting that it may correspond to several species of TPH mRNA with
different 5`-UTR but similar size. In practice, it was very difficult
to quantify the relative abundance of the various other transcripts
labeled by the intronic probes from the RNA material available.
Therefore, although each of the mRNA bands detected on Northern blot
may correspond to a different and complex exon-intron arrangement in
the 5`-UTR, it was impossible to unravel the organization of these TPH
transcripts by simple hybridizations to the blots. Nevertheless,
Northern analysis supported the main conclusion of the extensive and
more accurate nuclease protection experiments, namely that TPH
transcripts from which the three intron-like regions in the
5`-UTR have been eliminated generally represent a minor population compared with the
partially spliced TPH mRNAs, a situation that is much more pronounced
in carcinoid tumor than in pineal gland.
and E
, which border the 5`-UTR and are
separated by introns (Fig. 9). However, this suggestion awaits
experimental confirmation. In addition, the good conservation of the
TPH 5`-UTR sequence between human and rabbit is not found in the other
known mammalian species, indicating major sequence shuffling in this
region.
and E
) and part of the I
intron of
human TPH 5`-UTR. Hatched boxes represent the human or rabbit
exons of the TPH gene 5`-UTR. Thick lines indicate the introns
of the TPH gene 5`-UTR.
(such as
the type 2 TPH cDNA), are characterized by the presence of a
supplementary in-phase AUG codon, 27 bases upstream of the presumed
translation start site. The putative use of this initiator codon would
generate a longer N-terminal sequence. Nevertheless, it remains to be
determined whether or not this protein is produced and whether the two
resulting proteins possess the same characteristics (i.e. stability or activity). The recent cloning of Xenopus laevis TPH cDNA has shown that it potentially encodes a TPH protein with
37 extra amino acids at the N terminus as compared with TPH in other
species(42) . To date, this extension has no known functional
consequences.
) This type of variation could imply an integrated
regulation of TPH gene expression. An attractive hypothesis is that it
arises from differential splicing events leading to mRNAs that
differ only in their 5`-leader sequence.
We thank E. Jean-Gilles, I. Brunet, and D. Samolyk
for technical assistance; P. Ravassard, N. Faucon Biguet, and J. F.
Julien for critical comments; and P. Vernier for the phylogenetic
analysis and critical reading of this manuscript.
©1995 by The American Society for Biochemistry and Molecular Biology, Inc.