INTRODUCTION
Subjected to an increasingly severe thermal environment as the
Southern Ocean began to cool approximately 25-40 million years ago
(mya)¹ (1), the coastal
fishes of the Antarctic diverged from temperate fishes (2) and evolved
compensatory molecular, cellular, and physiological adaptations that
maintain metabolic efficiency and preserve macromolecular function in
their now chronically cold marine environment (−1.86 to approximately
+1 °C). The translational machinery of these fishes, for example,
shows clear evidence of cold adaptation (3, 4), with rates of
polypeptide chain elongation more than 10-fold greater than those
measured in temperate fishes cooled to comparable temperatures.
Similarly, the polymerization energetics of the actins of Antarctic
fishes (5) and the ATPase activities of their skeletal myosins (6)
support efficient myofibrillar assembly and function at their low
habitat temperatures. Our goal is to determine the molecular
adaptations, both qualitative and quantitative, that maintain the
efficient expression of the tubulin genes and the polymerization
capacity of the tubulin polypeptides of these extreme psychrophiles.
We and others (7, 8) have shown previously that the critical
concentration for microtubule formation by the brain tubulins of
Antarctic fishes (~1 mg/ml) is comparable to those of temperate poikilotherms and homeotherms at their much higher body temperatures. Conservation of the critical concentration by Antarctic fishes probably
results from structural changes, both in primary sequences and in
posttranslational modifications, intrinsic to their α- and β-tubulin
subunits (9-12). The primary sequence of class II β-tubulin from the
yellowbelly rockcod Notothenia
coriiceps, for example, contains unique residue substitutions that
increase both the hydrophobicity and the flexibility of the polypeptide
chain (12, 13), two factors that should favor microtubule formation in
an energy-poor environment. Similar alterations have been observed in
other α- and β-tubulin chains of this species.² A second, related
challenge confronting Antarctic fishes is the synthesis of
sufficient quantities of the α- and β-tubulins to attain the
critical concentration of tubulin dimers in their cells.
The abundant expression of tubulin in the brains of Antarctic fishes is
likely to require compensatory adjustments in gene transcription to
offset the rate-depressing effects of low temperature. Potential
adaptations include increases in tubulin gene number, organization of
tubulin genes into efficient transcription units, evolution of more
efficient gene promoters, enhancers, transcription factors, and/or RNA
polymerases, and enhancement of mRNA stabilization. To evaluate
these possibilities, we have initiated analysis of the structure,
genomic organization, and expression of the tubulin genes of N. coriiceps. Our results suggest that several of these modes of
adaptation may be exploited by these cold-living vertebrates.
In higher vertebrates, the
α- and β-tubulins are encoded by small
gene families (~6-7 functional genes for α and a similar number
for β), each member of which yields a structurally distinct polypeptide (14-16). These genes are generally thought to be unlinked and dispersed throughout the genome (17). In a study of the chicken
α-tubulin gene family, for example, Pratt and Cleveland (18) found
that four of five genomic clones contained single α-tubulin genes;
the fifth contained two α-tubulin genes, one functional and the
second a pseudogene. The genomes of lower, nonvertebrate eukaryotes, by
contrast, frequently contain tightly linked tubulin genes. Protozoan
parasites possess tubulin gene ensembles, either as separate tandem
groupings of α- or β-tubulin genes (Leishmania spp. (19,
20)) or as linked α/β tandem repeats (Trypanosoma brucei
(21, 22)). Similarly, some of the tubulin genes of the sea urchin
Lytechinus pictus are organized in distinct α or β clusters (23).
The regulation of tubulin gene expression occurs at both
transcriptional and translational levels. The tissue-specific and hormonally regulated expression of the
β-tubulin genes of
Drosophila is controlled both by upstream promoter elements
and by negative and positive regulatory elements (silencers and
enhancers) generally located within the first introns (24-28). Less is
known regarding regulation of tubulin gene expression in vertebrates.
TATA boxes are generally present in vertebrate
α- and β-tubulin
promoters (29, 30), and high-level expression of a Xenopus
α-tubulin gene, XαT14, in oocytes is regulated by three
CCAAT boxes, a "heat-shock-like" element (all located 60-200 bp
upstream of the transcription start site), and their corresponding
transcription factors (31). Cotranslational regulation of tubulin
mRNA stability also contributes to control of cellular tubulin
levels (32). When the pool of tubulin dimers is high,
β-tubulin
mRNAs are targeted for degradation by binding of a cellular factor
to the ribosome-bound amino-terminal β-tubulin tetrapeptide (33,
34).
Here we report the first example of clustered tubulin genes in a
vertebrate, the Antarctic rockcod N. coriiceps. Three
α-tubulin genes, designated NcGTbαa, NcGTbαb, and NcGTbαc, are
tightly linked in an ~10-kb segment of DNA, with αa and αb
linked head-to-head (5' to 5') and αa and αc
tail-to-tail (3' to 3'). The similarity of the nucleotide sequences of
these genes, strikingly illustrated by an ~480-bp palindrome linking
αa and αb, suggests that the cluster evolved
approximately 7-31 million years ago by duplication, inversion, and
divergence of a common ancestral α-tubulin gene. The neurally
restricted expression of the αa/αb gene pair
and the widespread expression of αc appear to be governed
by distinct sets of promoter and enhancer elements. We have also
identified a 285-bp element from the αa/αc
intergenic region that is distributed widely in notothenioid genomes.
We propose that expansion of the number of α-tubulin genes in the
N. coriiceps genome facilitates the synthesis of α-tubulin
chains at low temperature by providing additional templates for
mRNA synthesis. The selective pressure favoring this expansion was
probably the cooling of the Southern Ocean beginning ~25-40 mya. A
preliminary report of some of this work has appeared (35).
EXPERIMENTAL PROCEDURES
Collection of Fish Tissues--
Specimens of the Antarctic
yellowbelly rockcod, N. coriiceps, were collected by bottom
trawling from the R/V Hero or from the R/V Polar
Duke near Low and Brabant Islands in the Palmer Archipelago. They
were transported alive to Palmer Station, Antarctica, where they were
maintained in seawater aquaria at −1.5 to 1 °C. Tissues (testis,
brain, gill, liver, spleen, blood, and muscle) were dissected, frozen
in liquid nitrogen, and maintained at −70 °C until use.
Frozen testis tissue from the New Zealand black cod, Notothenia
angustata, was generously provided by Dr. Arthur DeVries
(University of Illinois, Urbana).
Southern Analysis of Genomic DNA--
High molecular weight
genomic DNA was purified (36) from the testis tissue of one N. coriiceps male, and Southern blots (37) of restriction
endonuclease-digested DNA samples were prepared as described previously
(12). The Southern replicas were probed for α-tubulin gene sequences
by hybridization to 32P-labeled (12, 38) α-tubulin
cDNAs from the chicken (cα1; Ref. 39) or from Chlamydomonas
reinhardtii (α10-2; Ref. 40). Prehybridization and hybridization
of the membranes were performed at moderate stringency (3× SSC (1×
SSC = 0.15 M NaCl, 0.015 M trisodium
citrate), 5× Denhardt's solution (41), 50 µg/ml sonicated, denatured Escherichia coli DNA, 0.5% (w/v) SDS, 1 mM EDTA, 68 °C) for 1 and 18-20 h, respectively, and
the membranes were washed to high stringency (0.1× SSC, 25 °C,
1 h). The membranes were exposed to Kodak XAR-5 X-Omat film at
−70 °C with intensification (DuPont Cronex Lightning Plus screens).
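The salt concentrations implied by the SSC multiples quoted above follow directly from the stated 1× definition. As a quick arithmetic check, the sketch below scales that definition; the function name `ssc_molarities` is illustrative only and is not part of the original methods.

```python
# 1x SSC as defined in the text: 0.15 M NaCl, 0.015 M trisodium citrate.
SSC_1X = {"NaCl": 0.15, "trisodium citrate": 0.015}


def ssc_molarities(fold):
    """Return component molarities (mol/L) for an n-fold SSC solution."""
    return {name: conc * fold for name, conc in SSC_1X.items()}


if __name__ == "__main__":
    # Hybridization buffer (3x) and final wash (0.1x) from the protocol.
    for label, fold in (("hybridization, 3x SSC", 3.0),
                        ("final wash, 0.1x SSC", 0.1)):
        parts = ", ".join(f"{conc:.4f} M {name}"
                          for name, conc in ssc_molarities(fold).items())
        print(f"{label}: {parts}")
```

Under these definitions the 3× hybridization buffer contains about 0.45 M NaCl and the 0.1× wash about 0.015 M NaCl, a 30-fold drop in ionic strength between hybridization and the final wash.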
Genomic Library Construction and Screening--
A genomic
library of N. coriiceps testicular DNA was constructed in
the λ vector Charon 35 (42). High molecular weight DNA was digested
partially with MboI, and fragments of 15-20 kb, obtained by
sucrose gradient centrifugation, were ligated to the BamHI sites of the vector arms. Recombinant phage DNA was packaged in vitro (Packagene; Promega). The unamplified library was screened for clones containing α-tubulin genes by hybridization (38) of nitrocellulose replicas of bacteriophage plaque DNA to the
32P-labeled chicken cDNA. Prehybridization and
hybridization of the membranes were performed at moderate stringency
(see "Southern Analysis of Genomic DNA") for 1 and 18-20 h,
respectively. Positive plaques were detected autoradiographically as
described above. One hundred twenty candidate α-tubulin genomic
isolates, obtained from a primary screen of 500,000 recombinant phage,
were classified as strongly, moderately, or weakly hybridizing. One
strongly hybridizing isolate, designated S2, was carried through two
additional rounds of plaque purification and screening, and single
plaques were picked for clone stock preparation.
By using testicular DNA from N. angustata, we constructed a
genomic library of 15-20-kb fragments in the vector LambdaGEM-11 (Promega). DNA was digested partially with MboI; fragments
were ligated to phage arms containing XhoI half-sites, and
recombinant phage DNA was packaged in vitro (Packagene;
Promega). The library (titer = 1 × 10⁶) was
screened for candidate α-tubulin clones by hybridization to an
N. coriiceps α-tubulin cDNA, NcTbα1, essentially as
described for
globin genes by Zhao et al. (43). Twelve
candidate α-tubulin genomic isolates were obtained from a primary
screen of 200,000 recombinant phage. Six of these isolates were carried
through two additional rounds of plaque purification and screening, and single plaques were picked for clone stock preparation.
cDNA Library Construction and Screening--
Total RNA was
isolated from brain tissues of N. coriiceps (38, 44), and
poly(A)+ RNA was selected by oligo(dT)-cellulose affinity
chromatography (45). Two different libraries were made.
Oligo(dT)-primed cDNA synthesis and construction of the first
library in λgt10 followed the procedures described by Huynh et
al. (46). The second library was constructed in λZAP II
(Stratagene); cDNA synthesis was primed with a mixture of random
hexanucleotides (75%) and oligo(dT) (25%). The libraries were
screened for recombinant clones bearing α-tubulin coding sequences by
hybridization of nitrocellulose or nylon (MagnaLift, MSI, Westboro, MA)
replicas of bacteriophage plaque DNA to the probe,
32P-labeled by nick translation (38) or by random priming
(47). In early screens cα1 was used as probe, whereas in later
screens α-tubulin cDNAs from N. coriiceps were
employed. Hybridization and washing of the membranes were performed as
described (12), and positive plaques were detected
autoradiographically. A total of 159 candidate α-tubulin cDNA
isolates were obtained from three screens (632,000 total recombinant
phage) of the two libraries, and 80 of these were carried through
tertiary plaque purification/screening. Three cDNA clones
(designated NcTbα2, NcTbα7, and NcTbα8) from the second library
that corresponded to the three α-tubulin genes (αb, αa, and αc,
respectively) of the genomic
cluster (Fig. 1) were sequenced (see below). The nucleotide sequence of
NcTbα2 downstream of codon 168 was used to complete the sequence of
the partial αb gene. Two cDNAs (NcTbα1 and NcTbα3)
from the first library were also characterized.
Subcloning and DNA Sequence Analysis--
Parental clones and
restriction fragment or deletion (48) subclones were sequenced manually
on both strands by use of the dideoxynucleotide chain termination
method (49) and T4 DNA polymerase (Sequenase II; U. S. Biochemical
Corp.). Portions of the sequence were established by use of the PRISM
Ready Reaction Dye Deoxy Termination Cycle Sequencing Kit (Applied
Biosystems), and the products were electrophoresed on an Applied
Biosystems 373A automated DNA sequencer (University of Maine DNA
Sequencing Facility).
Nucleotide and amino acid sequence analyses of the N. coriiceps
α-tubulin genes, cDNAs, and their encoded proteins
were performed by use of the Clustal method provided by DNASTAR
MegAlign. DNA sequence relatedness was calculated as the similarity
index of Dayhoff (50) as implemented by DNASTAR Align.
GenBank Accession Numbers--
The sequence of the N. coriiceps
α-tubulin gene cluster reported in this paper has been
deposited in the GenBank™ database under the accession
number AF082027. The sequence of the cluster has been scanned against
the GenBank™ database using the BLASTN program (National
Center for Biotechnology Information) to identify sequences with
significant relatedness. Related sequences, and their accession
numbers, are presented under "Results."
Northern Analysis of α-Tubulin Gene Expression--
Total RNAs
from testis, brain, gill, liver, spleen, blood, and muscle were
isolated from tissues by a modification (44) of the acid guanidinium
thiocyanate/phenol/chloroform method (51). RNAs (5 µg/slot) were
applied to nylon membranes (MagnaGraph, MSI) by vacuum aspiration using
a Bio-Rad Bio-Dot slot-blot apparatus. Sets of seven RNA samples were
hybridized to PCR-generated, 32P-labeled probes specific
for the 3'-UTRs of the cDNAs NcTbα2 (αb gene),
NcTbα3, NcTbα7 (αa gene), or NcTbα8 (αc
gene). To estimate the total α-tubulin mRNA in each tissue, a
control set of RNA samples was hybridized to a fragment of NcTbα1
encoding amino acid residues 1-430. Prehybridization and hybridization
of the membranes were performed in 5× SSPE (1× SSPE = 0.18 M NaCl, 0.01 M
Na2HPO4·7H2O, 0.001 M
EDTA), 5× Denhardt's solution (41), 50% formamide, 0.2% SDS at
42 °C for 2 and 18-20 h, respectively, after which the membranes
were washed sequentially with buffers of increasing stringency (final
wash conditions = 0.1× SSPE, 42 °C, 15 min). The membranes
were exposed to Fuji RX x-ray film at −70 °C with intensification.
PCR-based Gene Linkage Analysis--
To determine the potential
linkage of α-tubulin genes in the N. angustata genome, we
employed a PCR-based strategy using as template phage DNAs purified
from the six tertiary genomic clones (see "Genomic Library
Construction and Screening"). Nondegenerate primers corresponding to
highly conserved regions of the primary sequence of the N. coriiceps
α-tubulins were synthesized as follows: 1) sense
primer, 5' CAGTTTGTGGACTGGTGC 3' (residues 341-347,
N-Gln-Phe-Val-Asp-Trp-Cys-C); 2) antisense primer, 5'
AGCTCCAGTCTCACTGAAG 3' (reverse complement of coding sequence for
residues 53-58; N-Phe-Ser-Glu-Thr-Gly-Ala-C). The primers were used in
three combinations as follows: 1) sense alone to amplify
tail-to-tail-linked genes; 2) antisense alone to amplify
head-to-head-linked genes; and 3) sense plus antisense to establish
head-to-tail linkage by difference (i.e. PCR products not
shared with sense alone and antisense alone reactions). Each PCR
reaction contained 3-5 ng of template DNA, 1.6 µM
primers (0.8 µM of each primer when different), and
CLONTECH Advantage™ KlenTaq polymerase
mix (optimized for long distance PCR) (52). Touchdown PCR (53) was
performed for 29 cycles using the following parameters: 1) denaturation
steps, 94 °C, 30 s; 2) annealing steps, first 9 cycles ramping
the temperature from 70 to 62 °C in 1° increments followed by 20 cycles at 62 °C; and 3) extension steps, 68 °C, 6 min. PCR
products were analyzed on 1% agarose gels containing 1× TBE (0.089 M Tris borate, 2 mM EDTA, pH 8.0) and 0.0005%
ethidium bromide. The ends of the PCR products were sequenced by the
automated procedure to establish α-tubulin gene orientation.
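The primer sequences and cycling program described above can be checked computationally. The following is a minimal sketch, not the authors' procedure: it translates the two primers with the standard genetic code to confirm that they encode the quoted peptides, and it lists the annealing temperature used in each touchdown cycle. All function names are hypothetical, and the one-nucleotide frame offset applied to the antisense primer is an inference from its 19-nt length.

```python
# Standard genetic code, indexed in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

SENSE_PRIMER = "CAGTTTGTGGACTGGTGC"       # quoted 5' to 3' in the text
ANTISENSE_PRIMER = "AGCTCCAGTCTCACTGAAG"  # quoted 5' to 3' in the text


def reverse_complement(dna):
    """Return the reverse complement of an unambiguous DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna, frame=0):
    """Translate complete codons of dna, starting at offset frame."""
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


def touchdown_annealing(start=70, stop=62, step=1, plateau_cycles=20):
    """Annealing temperature (deg C) per cycle: ramp, then plateau."""
    ramp = list(range(start, stop - 1, -step))  # 70, 69, ..., 62
    return ramp + [stop] * plateau_cycles


if __name__ == "__main__":
    print(translate(SENSE_PRIMER))  # Gln-Phe-Val-Asp-Trp-Cys
    # The antisense primer begins one base into a codon on the coding strand.
    print(translate(reverse_complement(ANTISENSE_PRIMER), frame=1))
    schedule = touchdown_annealing()
    print(len(schedule), schedule)
```

The ramp from 70 to 62 °C in 1° steps occupies nine cycles, which together with the 20 plateau cycles gives the 29 cycles stated in the protocol.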
Genomic Southern Analysis of a Repetitive DNA
Element--
During characterization of the N. coriiceps
α-tubulin gene cluster, we discovered a 285-bp repetitive element. To
determine the abundance, organization, and species distribution of this fragment, we hybridized it to Southern replicas of
HindIII-digested genomic DNAs from Antarctic and temperate
notothenioids, other temperate fishes, an amphibian, and a reptile.
Restriction endonuclease digestion, electrophoresis, and transfer of
DNAs were performed as described previously (12). Prehybridization of
the membrane and subsequent hybridization to the 285-bp probe (labeled
with 32P by random priming (47)) were performed as
described by Detrich and Parker (12) with the following exceptions: 1)
prehybridization was for 2 h; 2) the
prehybridization/hybridization temperature was 63 °C; and 3) the
membranes were washed to final stringencies of 0.1-1× SSC, 63 °C,
for 15-40 min. The membranes were exposed to Fuji RX x-ray film as
described above.
Genomic DNA from the zebrafish (Danio rerio) was prepared
from total body tissues (36). Samples of genomic DNAs from the African
lungfish (Protopterus aethiopicus), the clearnose skate (Raja eglanteria), the goldfish (Carassius
auratus), the horned shark (Heterodontus francisci), the
sea lamprey (Petromyzon marinus), the spotted ratfish
(Hydrolagus colliei), the sturgeon (Acipenser fulvescens), the clawed frog (Xenopus mulleri), and the
snapping turtle (Chelydra serpentina) were generously
provided by Dr. Chris Amemiya (Boston University School of Medicine).
RESULTS
Estimation of α-Tubulin Gene Number in N. coriiceps--
To estimate the number of α-tubulin genes possessed by N. coriiceps, we probed its genome for sequences complementary to
α-tubulin cDNAs from the chicken (cα1) and from
Chlamydomonas (α10-2). Fig. 1 shows that the α-tubulin probes
hybridized to 10-15 different fragments in each restriction digest of
the fish DNA. Furthermore, the hybridization patterns generated by the
two heterologous cDNAs were virtually identical. These results
suggest that the α-tubulins of N. coriiceps, like its
β-tubulins (12), are encoded by a multigene family that is larger
than those of higher vertebrates (14, 16, 39). Of particular note, the
strong hybridization signals observed for some of the fragments raised
the possibility that they contain multiple, linked α-tubulin
genes.

Fig. 1. Hybridization of α-tubulin cDNA probes to genomic DNA from N. coriiceps. Southern blots of restriction endonuclease-digested testis DNA were hybridized to 32P-labeled recombinant plasmids pT1 (A; chicken cα1 cDNA insert) or pcf10-2 (B; C. reinhardtii α10-2 cDNA insert). Lanes B, H, P, and E contain genomic DNA digested with BamHI, HindIII, PstI, and EcoRI, respectively. Lanes U contain undigested genomic DNA. The molecular weights of DNA standards (in kb) are indicated on the vertical axes.
Organization of an N. coriiceps α-Tubulin Gene Complex--
To investigate the organization of the α-tubulin genes of N. coriiceps, we selected a strongly hybridizing clone, S2, that carried an insert of ~13.8 kb. Preliminary restriction mapping and
Southern hybridization analysis suggested that the insert contained two
or more α-tubulin genes in a segment of ~10 kb. Subsequent sequence
analysis revealed that S2 contains two complete α-tubulin genes,
designated NcGTbαa and NcGTbαc, and one
partial gene, NcGTbαb, that abuts one end of the genomic
fragment. Fig. 2 presents the
organization and salient features of this gene complex. Two of the
genes, αa and αb, are linked in head-to-head, or 5' to 5', orientation with ~500 bp separating their start codons. By contrast, the
αa and αc genes are linked
tail-to-tail (3' to 3') with ~2 kb between their poly(A) signal
sequences. The approximately 4 kb of sequence to the left of the
αc gene is devoid of α-tubulin coding sequences.

Fig. 2. Organization of the N. coriiceps α-tubulin gene cluster. The genomic clone S2 contains two complete α-tubulin genes, NcGTbαa and NcGTbαc, and one partial gene, NcGTbαb. Each gene contains three introns (yellow rectangles, numbered) and four exons (dark blue rectangles), the first of which consists only of the ATG start codon. The positions of the start codons (ATG) were mapped by comparison of the gene sequences to the 5'-UTRs of the corresponding cDNAs. 3'-UTRs are indicated in gray. The direction of transcription (5'→3') is indicated for each gene. Shown expanded beneath the cluster map are the intergenic palindrome (red and green) and introns 1 (yellow) of the αa/αb gene pair (see Fig. 6A) and a 285-bp region (light blue) between αa and αc that is highly similar to a portion of intron 4 of the D. mawsoni trypsinogen gene (see Fig. 9) (74). Within introns 1, M and N show the positions of putative maternal and neural enhancers (see "Results" and "Discussion").
N. angustata Is a Temperate Congener of N. coriiceps--
To
determine whether this mesophilic species shares a similar organization
of its α-tubulin genes, we probed its genome by Southern blot
hybridization to the N. coriiceps NcTbα1 cDNA. The
α-tubulin fragment patterns observed for HindIII-digested
N. angustata and N. coriiceps DNAs shared some
similarities in the low molecular weight region, but the temperate
species contained few of the strongly hybridizing fragments of high
molecular weight (>5 kb) that are suggestive of gene clustering in the
cold-living fish (data not shown). We also examined six genomic
α-tubulin clones from N. angustata for gene clustering by
use of PCR-based linkage analysis. Two of the six clones gave the same
3-kb amplification product, which corresponded to head-to-tail linkage
of a pair of α-tubulin genes with approximately 2.5 kb separating
their respective coding sequences (data not shown). The remaining four clones apparently contained single α-tubulin genes. Although these surveys were not exhaustive, they do suggest that the extent of
α-tubulin gene clustering in N. angustata is smaller than
that of the Antarctic fish.
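The logic by which the three primer combinations report gene orientation (see "PCR-based Gene Linkage Analysis") can be written out as a small decision rule. The sketch below is illustrative only: the function name is hypothetical, products are matched by exact size for simplicity, and the example call assumes that the single-primer reactions for the two positive N. angustata clones yielded no product, as implied by a head-to-tail call made "by difference."

```python
def infer_linkage(sense_only, antisense_only, both_primers):
    """Assign PCR product sizes (kb) to gene-pair orientations.

    sense_only     -- products from the sense primer alone
    antisense_only -- products from the antisense primer alone
    both_primers   -- products from the sense plus antisense reaction
    """
    sense = set(sense_only)
    antisense = set(antisense_only)
    both = set(both_primers)
    return {
        # Sense primers converge only when two genes lie 3' to 3'.
        "tail-to-tail": sorted(sense),
        # Antisense primers converge only when two genes lie 5' to 5'.
        "head-to-head": sorted(antisense),
        # Products unique to the two-primer reaction: same-strand genes.
        "head-to-tail": sorted(both - sense - antisense),
    }


if __name__ == "__main__":
    # The 3-kb product reported for two of the six N. angustata clones.
    print(infer_linkage(sense_only=[], antisense_only=[], both_primers=[3.0]))
```

In practice, product sizes estimated from agarose gels would need a tolerance rather than exact matching, and the end-sequencing step described in the methods remains the definitive test of orientation.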
Three α-Tubulin Genes and Their Encoded Polypeptides--
Fig. 3, A-C, shows the nucleotide
sequences and translations of the NcGTbαa,
NcGTbαb, and NcGTbαc genes, respectively.
(For comparative purposes, the sequence of the αb gene
downstream of Glu168 has been completed from its cognate
cDNA NcTbα2.) Table I gives estimates of the sequence similarities of these genes and subregions thereof. The αa, αb, and αc
genes are quite similar to each other (80-83%), which suggests that
they may have arisen by duplication and divergence of a common
ancestral gene. Each gene contains three introns that interrupt the
nucleotide sequence after codon 1, within codon 76, and after codon
125, positions that are highly conserved in other vertebrate
α-tubulin genes (54). In general, αa appears to be most
closely related to αb, except that its small introns 2 and
3 are quite similar to those of αc.

Fig. 3. Nucleotide sequences of the NcGTbαa, NcGTbαb, and NcGTbαc genes and primary sequences of the encoded α-tubulins. A, NcGTbαa; B, NcGTbαb; and C, NcGTbαc. Coding sequences are indicated by uppercase roman text, and the encoded amino acid residues are presented below in the three-letter code. Introns are shown in lowercase roman. 5'-UTR and -upstream sequences are presented in uppercase italics, and 3'-UTR and -downstream sequences are given in lowercase italics. Potential TATA boxes and maternal and neural enhancer motifs are indicated by reversed text, double underlining, and open boxes, respectively. The Kozak sequences for translation initiation (57, 58) are given in underlined boldface, and potential polyadenylation signal sequences in the 3'-UTRs are shown in boldfaced, underlined, italic text. Gene-specific probes derived from the 3'-UTRs, used to assess the expression of these genes in major tissues (Fig. 5), are shown underlined.
Table I. Sequence comparison of regions of the N. coriiceps αa-, αb-, and αc-tubulin genes
Percentage similarities were calculated as the similarity index of Dayhoff (50). For the alignments, the K-tuple was set at 3, the gap penalty at 1, and the range at 40.
The three genes encode distinct, but closely related, α-tubulin
polypeptides (Fig. 4). Compared pairwise,
the αa-, αb-, and αc-tubulin chains are 98.4-98.9% identical to
each other. With respect to other vertebrate α-tubulins, the three
fish chains are very closely related to the αT6-tubulin of the ray
Torpedo marmorata (97.6-97.8% sequence similarity;
GenBank™ accession number P36220) and to two mammalian
chains, α1-tubulin of Chinese hamster (97.3-97.6% sequence
similarity; accession number P05209) and the Mα2 isotype of mouse
(97.1-97.3% sequence similarity; accession number P05213). Somewhat
surprisingly, the N. coriiceps α-tubulins are only ~94%
similar to two α-tubulin polypeptides from salmonid teleosts, the
rainbow trout (Oncorhynchus mykiss testis-specific α
chain, accession number P18288) and the chum salmon (O. keta α
chain, accession number P30436). However, this apparent
discrepancy most likely reflects the multiplicity of α-tubulin
isotypes in vertebrates and the paucity of fish tubulin sequences
available for comparison.

Fig. 4. Primary sequences of the αa-, αb-, and αc-tubulins. Amino acid residues that differ among the three polypeptides are shown by shaded rectangles. Residues of the αb- and αc-tubulins that are identical to αa are indicated by periods. The sequence of αb-tubulin beyond Glu168 was deduced from the cognate cDNA, NcTbα2, of the αb gene.
Like the Ncnβ1 β-tubulin cDNA of N. coriiceps (12),
the NcGTbαa, αb, and αc genes
show a strong preference for codons containing G or C in the third
position. Although 57 (αa) to 58 (αb, αc) codons are used, the frequency of codons ending in G
or C is 2.05 (αa) to 2.07 (αb, αc) times that of codons with third position A or T. The
codon bias of the three α-tubulin genes stands in striking contrast
to their A + T-rich introns (see below) and to the G + C content
(39-43%) of the genomes of closely related Antarctic nototheniid
fishes (55). α-Tubulin genes from the chum salmon (accession number
X66973) and the rainbow trout (accession number M36623) are similarly
biased to third position G or C ((G + C)/(A + T) = 2.22 and 2.39, respectively), those of mammals are slightly less so (mouse α3 and
α6 tubulins = 2.15 and 1.81, human = 1.72; accession
numbers M13442, M13441, and K00558, respectively), but those of
Xenopus (accession number X07046) and the electric ray show
little codon bias (third position (G + C)/(A + T) = 0.96 and 1.13, respectively). Overall, the pattern of codon usage in the N. coriiceps
α-tubulin genes is strongly reminiscent of that found
for a set of 22 genes of the Atlantic salmon Salmo salar L. (56), which suggests that mutational bias is a major factor influencing
choice of synonymous codons in both fishes.
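The third-position statistic used in this comparison, (G + C)/(A + T) counted over the final nucleotide of each codon, is simple to compute from any in-frame coding sequence. The following is a minimal sketch with hypothetical function names; whether the stop codon was included in the published counts is not stated, so the sketch simply counts every codon it is given. The demonstration input is the 18-nt sense primer from "Experimental Procedures," used only because it is an in-frame fragment quoted in the text; it is far too short to estimate the bias of a whole gene.

```python
def split_codons(cds):
    """Split an in-frame coding sequence into a list of codons."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def third_position_ratio(cds):
    """Return (G + C)/(A + T) counted at codon third positions."""
    thirds = [codon[2] for codon in split_codons(cds)]
    gc = sum(base in "GC" for base in thirds)
    at = sum(base in "AT" for base in thirds)
    if at == 0:
        raise ValueError("no codons end in A or T; ratio is undefined")
    return gc / at


def codons_used(cds):
    """Return the number of distinct codons present in the sequence."""
    return len(set(split_codons(cds)))


if __name__ == "__main__":
    fragment = "CAGTTTGTGGACTGGTGC"  # codons for Gln-Phe-Val-Asp-Trp-Cys
    print(third_position_ratio(fragment))  # 5 G/C-ending : 1 A/T-ending
    print(codons_used(fragment))
```

Applied to a full coding sequence, the first function yields the kind of ratio quoted above (for example, 2.05-2.07 for the N. coriiceps genes), and the second yields the number of distinct codons used.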
Expression of the α-Tubulin Gene Cluster--
To determine whether the α-tubulin genes of the N. coriiceps cluster
are functional, we used gene-specific probes complementary to their
3'-UTRs (see Fig. 3) to measure steady-state mRNA levels in seven
tissues of N. coriiceps. Fig. 5 shows that the αc gene is
expressed most widely (all tissues except liver), whereas
αa and αb expression is restricted primarily
to brain. The mRNAs for all three genes accumulate significantly,
and to comparable levels, in neural tissues. αc mRNAs
are also prominent in red blood cells and testis. A fourth α-tubulin
gene that is not part of this cluster (represented by the NcTbα3
cDNA) also shows widespread expression. We conclude that each of
the three α-tubulin genes of this cluster is functional and that
regulation of the αa/αb gene pair differs
from that of the αc gene.

Fig. 5. Expression of the NcGTbαa, NcGTbαb, and NcGTbαc tubulin genes in tissues of N. coriiceps. Steady-state levels of mRNAs transcribed from the αa (ALA), αb (ALB), and αc (ALC) genes were assessed in seven tissues (right axis) by hybridization of slot blots of total RNA preparations (5 µg per tissue) to the gene-specific 3'-UTR probes shown in Fig. 3. Probes were generated from cDNAs corresponding to the genes (see "Experimental Procedures") by PCR. For comparison, expression of a fourth α-tubulin gene (using the 3'-UTR of the NcTbα3 cDNA) that is not part of this cluster was also evaluated. Total α-tubulin mRNA in each tissue was revealed by hybridization to a coding fragment of the NcTbα1 cDNA (encoding amino acid residues 1-430).
Structural Features and Potential Regulatory Motifs of the Three
α-Tubulin Genes--
The striking similarity of the αa, αb, and αc genes, together with their unusual
organization and differential expression, prompted a detailed
comparison of their coding and noncoding regions.
5'-Promoter and -Untranslated Regions--
The organization of the
αa and αb genes as divergent transcription
units with potentially overlapping promoters, and their probable
evolution by gene duplication, inversion, and divergence, suggest that
the two genes may share structural features in their 5'-noncoding
sequences (i.e. promoters and untranslated sequences). Indeed, Fig. 6A shows that the
479-bp DNA segment that separates the start codons of the
αa and αb genes possesses an axis of 2-fold rotational symmetry. Thus, this intergenic region is substantially palindromic (overall similarity index for the two halves = 78%), and the 5'-promoter and -untranslated regions of the two genes are
strongly related. It is not surprising, then, that the two genes show
an identical pattern of expression (Fig. 5). The initiator codons of
αa and αb occur in contexts
(CAAGCAATCATGG and CGAGCAATCATGG, respectively;
cf. Fig. 3, A and B) that approximate
the consensus signal for translation initiation in vertebrates,
(GCC)GCC(A/G)CCATGG (57, 58).
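How closely the two initiator contexts approximate the vertebrate consensus can be tallied position by position. The sketch below is one way to do so, under two assumptions that are not from the original analysis: the optional (GCC) of the consensus is written out so that the consensus and the 13-nt contexts align without gaps, and every position is weighted equally. The function names are hypothetical.

```python
# IUPAC codes needed for the consensus (R = purine, A or G).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG"}

# (GCC)GCC(A/G)CCATGG with the optional leading GCC included.
KOZAK = "GCCGCCRCCATGG"

# Initiator contexts quoted in the text for the two genes.
CONTEXTS = {
    "alpha-a": "CAAGCAATCATGG",
    "alpha-b": "CGAGCAATCATGG",
}


def kozak_matches(context, consensus=KOZAK):
    """Return a per-position list of True/False consensus matches."""
    if len(context) != len(consensus):
        raise ValueError("context and consensus must be the same length")
    return [base in IUPAC[code] for base, code in zip(context, consensus)]


def summarize(context, consensus=KOZAK):
    """Count matches and check the -3 and +4 positions around the ATG."""
    hits = kozak_matches(context, consensus)
    atg = consensus.index("ATG")  # the A of ATG is position +1
    return {
        "matches": sum(hits),
        "length": len(hits),
        "minus3_purine": hits[atg - 3],
        "plus4_G": hits[atg + 3],
    }


if __name__ == "__main__":
    for gene, context in CONTEXTS.items():
        print(gene, summarize(context))
```

Both contexts match the consensus at 8 of 13 positions, including the purine at −3 and the G at +4, the two positions generally regarded as most important for initiation efficiency.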

Fig. 6. 5'-Noncoding sequences upstream of the NcGTbαa, NcGTbαb, and NcGTbαc tubulin genes. A, palindromic nature of the 479-bp intergenic region linking the start codons of the αa and αb genes. The 2-fold rotational axis is indicated by the vertical line and attached arrows, and palindromic sequences are shown by light and dark shading. 5'-UTRs, deduced from the NcTbα7 and NcTbα2 cDNAs that correspond to the αa and αb genes, are indicated by underlining. (Due to the high degree of similarity of the palindromic 5' sequences, it has proven impossible to design gene-specific oligonucleotides for precise mapping of the transcription start (capping) sites by primer extension (85) or by S1-nuclease protection (86).) Because transcriptional start sites have not been mapped, sequences are numbered with respect to the translational start codons (+1; noncoding nucleotides begin at −1) of the two genes. B, potential promoter and enhancer elements of the αa/αb intergenic region. TATA, C/EBP, initiation response element, and Sp1 elements are shown in reversed, boldface, boxed, and underlined text, respectively. C, promoter and enhancer elements within the 5'-noncoding sequences of the αc gene. Potential TATA, CCAAT, GATA, CACCC, Sp1, c-myb, Hox-1, and octamer motifs are shown in reversed, boldface, shadowed, double-underlined, single-underlined, bold underlined italic, bold italic, and underlined italic text, respectively.
Despite the symmetry of the αa/αb intergenic
region, we have found it difficult to identify basal and
tissue-specific promoter elements that would explain the neural
expression of the two genes. A potential, but noncanonical, TATA box
(consensus TATAAA) (59) found upstream of the αa start
codon and untranslated region (Fig. 6B; reverse complement
shown in reversed text starting at −105) is not
present in a corresponding location upstream of αb.
Rather, the αb gene possesses a possible, but corrupt,
TATA motif that begins at position −187. True CCAAT boxes (59, 60) are
absent. The αa/αb intergenic region contains
initiation response elements (consensus WWYACTYYY) (61) and C/EBP
motifs (consensus TKNNGYAAK) (62), but most of these (the two
initiation response elements and the two proximal C/EBP sites) map
within the 5'-UTRs of the gene transcripts. Potential Sp1
sites (63) are also present. The apparent paucity of "legitimate"
upstream promoter elements in the αa/αb
intergenic region might indicate that the noncanonical, and irregularly
located, motifs that we have described are functional. Alternatively,
signals present in the first introns of the two genes may function as
transcriptional regulators (see "Introns" below). Thus,
determination of the actual promoter (and enhancer) elements of this
gene pair will require deletion analysis of the
αa/αb intergenic region and introns 1 using
an appropriate reporter vector and host cell system (see
"Discussion"). The neural expression of αa,
αb, and αc, for example, may be conferred by
enhancer elements located within the first introns of these genes (see below).
Fig. 6C shows the 5'-noncoding sequences of the
αc gene. The sequence of this region differs substantially
from those upstream of the αa and αb genes
(Table I). Consistent with its pattern of expression, the
αc gene contains promoter elements characteristic of
hematopoietic, neural, and testicular genes (63, 64). A consensus TATA
box begins 450 bp upstream of the start codon, and noncanonical TATA
motifs are located at -119 and -470.³ Two bona fide CCAAT
elements (60) begin at positions -159 and -210. Two GATA sites
(consensus WGATAR) (64), the targets of GATA-binding
transcriptional activators in subsets of blood, neural, and testicular
cells (64-66), are found at position -411 and downstream of the
proximal TATA box. One CACCC element (64) is located upstream of the
distal CCAAT box, and a c-Myb (consensus ATTGAC) (63) site is present
downstream of the proximal TATA box. Other sites that may contribute to
expression of αc include single Sp1, Hox-1, and octamer
motifs (63, 67, 68).
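The motif searches described above can be illustrated with a minimal sketch. It assumes that each consensus quoted in the text (for example WGATAR or WWYACTYYY) is expanded from IUPAC codes into a regular expression and matched on both strands. The function names are hypothetical, the search tool actually used for this analysis is not stated in the text, and offsets are 0-based positions in the input string rather than the start-codon-relative numbering of Fig. 6.

```python
import re

# IUPAC nucleotide codes that occur in the consensus motifs quoted in the text.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "[AT]", "Y": "[CT]", "R": "[AG]", "K": "[GT]", "N": "[ACGT]",
}


def consensus_to_regex(consensus):
    """Expand an IUPAC consensus such as 'WGATAR' into a regex fragment."""
    return "".join(IUPAC[base] for base in consensus.upper())


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_motif(seq, consensus):
    """Return sorted (offset, strand, match) tuples, overlapping hits included.

    Offsets are 0-based positions of the leftmost base of the hit on the
    forward strand, whichever strand carries the motif.
    """
    seq = seq.upper()
    # A lookahead with a capture group reports overlapping matches.
    pattern = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
    hits = [(m.start(), "+", m.group(1)) for m in pattern.finditer(seq)]
    for m in pattern.finditer(reverse_complement(seq)):
        # Map the reverse-strand offset back to forward-strand coordinates.
        start = len(seq) - m.start() - len(m.group(1))
        hits.append((start, "-", m.group(1)))
    return sorted(hits)


# Toy sequences (not taken from the N. coriiceps genes).
print(find_motif("GGTGATAAGCC", "WGATAR"))
print(find_motif("CCTTATCAGG", "WGATAR"))
```

A hit reported on the "-" strand corresponds to what the text calls an inverse (reverse complementary) element.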
Introns--
The introns of the three N. coriiceps
α-tubulin genes are noteworthy for their generally
small sizes (986-1149, 83-102, and 102-103 bp for introns 1, 2, and
3, respectively) and their uniformly high contents of A + T residues
(62-71%; length-weighted mean = 65.5%). Corresponding
introns in human and frog α-tubulin genes (accession numbers X01703
and X07045, respectively) are considerably larger (intron 1 = 1527-3499 bp,
intron 2 = 147-1024 bp, intron 3 = 183-303
bp), and their A + T contents range from 36 to 73% (weighted mean = 59.5%).
In contrast to their introns, the coding sequences of the
N. coriiceps αa, αb, and αc
genes are relatively A + T-poor (45, 43, and 45% A + T, respectively),
due in part to their biased usage of codons (see above). The
intervening sequences of the αa, αb, and αc genes,
considered separately, are also more divergent
than their coding sequences (Table I).
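The A + T comparison above rests on simple arithmetic, sketched below under the assumption that "length-weighted mean" means each intron's A + T fraction weighted by its length in base pairs. The function names and toy sequences are illustrative only and are not taken from the original analysis.

```python
def at_fraction(seq):
    """Fraction of A + T residues in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("A") + seq.count("T")) / len(seq)


def length_weighted_at(seqs):
    """Length-weighted mean A + T fraction over several sequences.

    Weighting each sequence by its length is equivalent to pooling all
    residues and computing a single A + T fraction.
    """
    total_length = sum(len(s) for s in seqs)
    return sum(at_fraction(s) * len(s) for s in seqs) / total_length


# Toy "introns": a 4-bp all-A/T sequence and a 2-bp all-G/C sequence.
toy_introns = ["AATT", "GC"]
print(f"{100 * length_weighted_at(toy_introns):.1f}% A + T")
```

Weighting by length keeps the long first introns from being outvoted by the much shorter introns 2 and 3.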
Fig. 7 shows the exon-intron boundaries
of the three α-tubulin genes. The donor exon triplets located
immediately to the 5' sides of the splice sites are unusual: ATG for
the first junction, T(C/T)G for the second, and CTG for the third
versus the vertebrate consensus (C/A)AG (69) and the tubulin
consensus ATG (54). Similarly, the 5' nucleotide of the downstream
acceptor exon rarely matches the vertebrate consensus residue, G. By
contrast, intron sequences adjacent to the donor and acceptor junctions
conform well to the vertebrate consensus. In particular, each intron
contains the triplet GT(A/G) at its 5' end and a pyrimidine-rich tract
immediately upstream of the CAG triplet at its 3' end (see also Fig. 3,
A-C).

|
Fig. 7.
Exon-intron boundaries of the three
α-tubulin genes. Donor and acceptor splice junctions
for the three introns present in each of the NcGTbαa,
NcGTbαb, and NcGTbαc genes are compared with
the vertebrate (69) and tubulin (54) consensus sequences. Residues of
the αa, αb, and αc junctions
that match the tubulin and/or vertebrate donor and the vertebrate
acceptor consensus sequences are shown in boldface.
|
|
The absence of definitive basal promoter elements in the compact
αa/αb intergenic region raises the
possibility that each gene might be governed by promoter elements
located in the first intron of the other. Two perfect inverse
(i.e. reverse complementary) TATA elements reside in intron
1 of αa, the first 239 bp from its 5' end, or 721 bp
upstream of the start codon for the αb gene (Fig.
3A). Similarly, a near-perfect inverse TATA box is located 392 bp
from the 5' end of intron 1 of αb, or 874 bp before
the start codon for the αa gene (Fig. 3B).
A striking feature of tubulin gene expression in Drosophila
is the occurrence of cis-acting regulatory sequences (often
enhancers) in the first intron of several of the α- and β-tubulin
genes that confer tissue-specific expression (70-73). We scanned the
intronic sequences of the three α-tubulin genes of the
N. coriiceps cluster for comparable elements and found two, the
neural-specific enhancer CAAAAT and the maternal-specific enhancer
CAAAAAT originally defined for the β1-tubulin gene of
Drosophila (70). Fig. 3, A-C, shows that the
αa gene contains three copies of the neural element and two of the maternal,
αb one and zero, respectively, and
αc four and two, respectively. These observations support
our hypothesis that cis-acting sequences within the first
introns of the clustered α-tubulin genes (particularly in the
αa and αb genes) may contribute to regulation
of their expression.
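Counting these two elements in an intron can be sketched as follows. The sketch assumes exact, forward-strand matches only, which the text does not specify, and the helper name and toy sequence are hypothetical. Because CAAAAAT carries five consecutive A residues, it does not contain CAAAAT, so the two counts are independent.

```python
NEURAL = "CAAAAT"     # neural-specific element quoted in the text
MATERNAL = "CAAAAAT"  # maternal-specific element quoted in the text


def count_elements(intron):
    """Count exact forward-strand copies of each element in one intron.

    Neither motif can overlap a second copy of itself (each begins with C
    and ends with T), so non-overlapping str.count() misses nothing.
    """
    intron = intron.upper()
    return {
        "neural": intron.count(NEURAL),
        "maternal": intron.count(MATERNAL),
    }


# Toy sequence with one copy of each element (not a real intron).
print(count_elements("GGCAAAATCCCAAAAATGG"))
```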
3'-Coding and -Untranslated Sequences--
Comparison of the
carboxyl-terminal coding sequences and the 3'-UTRs of the
αa, αb, and αc genes reveals a
remarkable degree of similarity (Table I and Fig.
8). The αa and αb UTRs are the most closely related, with 88% identity
in the first 90 bp, and an overall similarity of 78%. However, the
αa 3'-UTR is considerably shorter (127 bp) than are those
of αb and αc (270 and 309 bp, respectively).

|
Fig. 8.
3'-Coding and -untranslated regions of the
NcGTbαa, NcGTbαb, and NcGTbαc
tubulin genes. Beginning with their codons for residue 438 and
terminating with their probable polyadenylation signals, the 3'
sequences of the NcGTbαa, NcGTbαb, and
NcGTbαc genes were aligned using the Clustal method
provided by DNASTAR MegAlign. Regions of sequence identity are
indicated by the shaded boxes. Gaps introduced to establish
optimal alignment are shown by dashes. Residues 438-451 are
given in the single letter code.
|
|
A Repetitive Sequence Element in the Notothenioid
Genome--
During characterization of the N. coriiceps
α-tubulin gene complex, we also scanned the ~2-kb intergenic region
located between the αa and αc genes against
the GenBank™ data base to determine whether it shared
significant sequence features with other genes. We found that a 285-bp
fragment (Figs. 1 and 9) of the
αa/αc intergenic region is ~90% similar to
a bipartite element from intron 4 of the trypsinogen gene (accession
number U58835) (74) of the Antarctic toothfish, Dissostichus
mawsoni. No other significant matches were detected. The lone
match to the Dissostichus intronic fragment is striking and
raises the possibility that this shared sequence might constitute a
repetitive element of notothenioid fishes. To determine the abundance
and species distribution of this fragment, we hybridized it to Southern
replicas of a panel of HindIII-digested genomic DNAs from
Antarctic and temperate notothenioids, other temperate fishes, an
amphibian, and a reptile (Table II).
Among the notothenioid fishes, ~40-50 discrete bands were detected
against a smeared background of positive DNA fragments, consistent with
the partially structured dispersal of many copies of this element
throughout their genomes (data not shown). This pattern is reminiscent
of the distribution of two short interspersed nuclear elements in a
subgroup of salmonid fishes (75). By contrast, the 285-bp fragment did
not hybridize at all to the genomic DNAs of non-notothenioid fishes and
more distantly related vertebrates. Given the apparent restriction of
this repetitive element to the notothenioid suborder, we provisionally
designate it Noto1. We are currently investigating the possibility that
Noto1 is a mobile genetic element.

|
Fig. 9.
A sequence element shared with the
notothenioid trypsinogen gene. BLASTN (National Center for
Biotechnology Information) comparison of the ~2-kb
αa/αc intergenic region of the N. coriiceps gene complex
to GenBank™ data base files
detected significant sequence homology of a 285-bp fragment (Fig. 1) to
the 5' and 3' ends of a 522-bp fragment from intron 4 of the
trypsinogen gene (accession number U58835) (74) of the Antarctic
toothfish, D. mawsoni. The sequences were aligned using the
Clustal method provided by DNASTAR MegAlign. Regions of
sequence identity are indicated by the shaded boxes. Gaps
introduced to establish optimal alignment are shown by
dashes.
|
|
Evolutionary Divergence Times for the
α-Tubulin Genes--
Using as a metric the nuclear gene divergence rate (0.12-0.33%/million
years) recently determined for the nonfunctional globin gene remnants
of Antarctic icefishes (43), we can estimate the time of α-tubulin
gene duplication. We considered substitutions at positions of 4-fold
degeneracy in the coding sequences, which minimizes the influence of
selection on molecular differences (76). Furthermore, only transversions
were analyzed because they accumulate linearly with respect to time
(77). Taken pairwise, the 2.3-3.7% transversion frequency observed
for the NcGTbαa, NcGTbαb, and
NcGTbαc genes at 4-fold degenerate codons yields an
estimated divergence time of ~7-31 million years. Thus, the cluster
apparently evolved as the Southern Ocean cooled (1, 2).
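The ~7-31 million year range can be reproduced with a short calculation. This is a sketch that assumes the estimate is simply the observed transversion frequency divided by the divergence rate, with the extremes of the two ranges paired to give the youngest and oldest bounds. The text does not spell out the formula, so the function below is illustrative rather than the authors' stated method.

```python
def divergence_time_range(tv_min_pct, tv_max_pct, rate_min, rate_max):
    """Bracket the divergence time in millions of years.

    tv_*_pct  : observed transversion frequencies (%)
    rate_*    : divergence rates (% per million years)
    The youngest bound pairs the smallest difference with the fastest
    rate; the oldest pairs the largest difference with the slowest rate.
    """
    youngest = tv_min_pct / rate_max
    oldest = tv_max_pct / rate_min
    return youngest, oldest


# Values quoted in the text: 2.3-3.7% transversions, 0.12-0.33%/million years.
youngest, oldest = divergence_time_range(2.3, 3.7, 0.12, 0.33)
print(f"{youngest:.1f}-{oldest:.1f} million years")  # 7.0-30.8 million years
```

Rounded to whole numbers, the bounds agree with the ~7-31 million years given above.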
 |
DISCUSSION |
In this report we describe the first example of a vertebrate
tubulin gene cluster, a complex of three tightly linked
α-tubulin genes from the Antarctic yellowbelly rockcod, N. coriiceps.
The αa, αb, and αc genes probably evolved
by duplication, inversion, and divergence of an ancestral gene during
the period when the Southern Ocean was cooling. We propose that cold
adaptation of microtubule assembly in Antarctic fishes entails both the
expression of numerically large α- and β-tubulin gene families and
the unique sequence features of the encoded tubulin polypeptides.
Evolution of a Vertebrate
α-Tubulin Gene Cluster by Gene
Duplication--
The striking similarity of the three α-tubulin
genes that make up the N. coriiceps cluster (97-98%
coding sequence similarity, 80-83% overall similarity), and the
clearly palindromic structure of the αa and
αb genes, suggest that they evolved relatively recently
from a common ancestral gene. Given the apparently large number of
α-tubulin genes possessed by this fish, the identity of the ancestral
gene is unclear. Nevertheless, we consider it likely that
αa is the direct ancestor of αb (or
vice versa) and gave rise to the latter gene through a
recent duplication/inversion event that preserved neural-specific
expression. Subsequent conversion (78) of the segment of the
αa gene containing introns 2 and 3 to that of
αc (or of the corresponding region of the αb
gene to that of a fourth α-tubulin gene) would explain the regional
similarities and dissimilarities within the cluster. Determination of
the most plausible evolutionary scenario that explains the origin of
the entire cluster will depend on analysis of other members of the
α-tubulin gene family of N. coriiceps.
It is intriguing to speculate that other α-tubulin genes may be
linked to the αa-αc cluster, upstream of αc
and/or downstream of αb, in orientations that create
additional divergent transcription units. We plan to evaluate these
possibilities by analysis of genomic clones that overlap S2 and by
PCR-based linkage studies.
Adaptational Expansion of Tubulin Gene Templates--
Based on the
divergence rate (0.12-0.33%/million years) recently determined for
the nonfunctional nuclear globin gene remnants of Antarctic icefishes
(43), we estimate that the N. coriiceps α-tubulin gene
cluster arose ~7-31 mya. Thus, duplication and divergence of members
of the α-tubulin gene family apparently occurred in concert with, and
probably were adaptive changes selected by, cooling of the Southern
Ocean, which began ~38 mya and reached freezing temperatures during
the mid-late Miocene (5-14 mya) (79). This conclusion must be
qualified by recognition that gene conversion events within the
α-tubulin cluster may have reduced the sequence heterogeneity of the
individual genes (80), which would lead to underestimation of the true
divergence time. However, it is noteworthy that the antifreeze
glycoprotein genes of notothenioid fishes apparently evolved from a
pre-existing pancreatic trypsinogen gene in a time frame similar to
that which we have estimated for