Originally published In Press as doi:10.1074/jbc.M105850200 on September 11, 2001
J. Biol. Chem., Vol. 276, Issue 45, 41963-41968, November 9, 2001
A Novel Hybrid Open Reading Frame Formed by Multiple Cellular
Gene Transductions by a Plant Long Terminal Repeat
Retroelement*
Nabil Elrouby and Thomas E. Bureau
From the Department of Biology, McGill University,
Montreal, Quebec H3A 1B1, Canada
Received for publication, June 22, 2001, and in revised form, September 10, 2001
ABSTRACT
The discovery that vertebrate retroviruses could
transduce cellular sequences was central to cancer etiology and
research. Although not well documented, transduction of cellular
sequences by retroelements has been suggested to modify cellular
functions. The maize Bs1 transposon was the first
non-vertebrate retroelement reported to have transduced a portion of a
cellular gene (c-pma). We show that Bs1 has, in
addition, transduced portions of at least two more maize cellular
genes, namely for 1,3-β-glucanase (c-bg) and
1,4-β-xylan endohydrolase (c-xe). We also show that
Bs1 has maintained a truncated gag
domain with similarity to the magellan gypsy-like long
terminal repeat retrotransposon and a region that may correspond to an
env-like domain. Our findings suggest that, like oncogenic
retroviruses, the three transduced gene fragments and the Bs1
gag domain encode a fusion protein that has the potential to be
expressed. We suggest that transduction by retroelements may facilitate
the formation of novel hybrid genes in plants.
INTRODUCTION
Retroelements comprise a diverse array of mobile elements that all
share a common property, the copying of RNA into DNA during a step in
their life cycle. Retroelements include endogenous retroviruses and
class I transposable elements (i.e.
LTR retrotransposons, long
interspersed nuclear elements, short interspersed nuclear elements, and
processed pseudogenes). Infectious retroviruses have been suggested to
be highly evolved types of retroelements that acquired the ability to
infect with the acquisition of an envelope (env) gene (1,
2). Alternatively, LTR retrotransposons may have evolved from an
ancestral retrovirus through the loss of an env gene and
hence the loss of infectivity (3). copia-like LTR
retrotransposons have recently been reported to be horizontally transmitted (4), and some copia-like (5) and
gypsy-like (6-8) elements have been found to contain an
env-like gene. Retroviruses and LTR retrotransposons thus
share conserved structural, functional, and mechanistic features (9).
The structural and sequence similarities between gypsy-like
LTR retrotransposons and retroviruses led to the postulation that they
are related evolutionarily (2, 10).
Retroviruses have the potential to capture cellular genes, a process
commonly known as cellular gene transduction. Cellular gene
transduction by retroviruses has been fundamental in studying neoplastic transformation in animals (11). Cellular gene-containing retroviruses were found to induce tumors in animals and transform cells
in culture. This was central to the discovery of cellular genes with
oncogenic potential (proto-oncogenes) (11). Cellular gene transduction
is not limited to retroviruses, however. For example, the maize
Bs1 retroelement was reported to have acquired a segment of
a plasma membrane proton ATPase gene (pma) (12-14). This
raised the question as to the significance of transduction for genomes
in the absence of neoplastic transformation.
The impact of retrotransposons on genome organization and function has
been suggested (15, 16). L1 and Alu sequences were found in
many known proteins (16), and L1-mediated exon shuffling (by
transduction) was shown to occur in an experimental system with
cultured human cells and suggested to represent a general mechanism for
the evolution of new genes (17). The characterization of the human
genome then revealed that transduction of 3'-flanking sequences is a
common feature of L1 retrotransposition (18). Although the human
PMCH1 gene is believed to have evolved by exon shuffling
through retrotransposition of an antisense transcript of the
MCH gene (19), the evolution of a new gene by transduction of genomic sequences adjacent to an LTR retrotransposon has not been documented.
Bs1 was identified as an insertion that inactivated the
maize Adh1-S allele (20, 21). Subsequent cloning and
characterization revealed that Bs1 is 3203 bp in length, has
302-bp identical LTRs that are terminated by the retroviral consensus
terminal sequences TG ... CA, and is immediately flanked by a 5-bp
target site duplication (22, 23). Other features that are common to
retroelements include a canonical primer binding site that immediately
follows the 5'-LTR and shares similarity with plant initiator
methionyl-tRNA and a polypurine tract that precedes the 3'-LTR.
Although the internal sequence of Bs1 potentially encodes
two overlapping open reading frames (ORF1 and ORF2), convincing
similarity to typical retroelement-encoded peptides has not been
demonstrated (22). The finding that Bs1 has transduced a
portion of a cellular gene (pma) suggested either that, like
retroviruses, an LTR retrotransposon can transduce cellular genes or
that Bs1 is actually a defective retrovirus (12, 13).
We report that the maize Bs1 retroelement has transduced
segments from at least three different cellular genes. We show that most of the Bs1 internal sequence was replaced by the
cellular gene sequences, which explains the lack of similarity to known retroelement proteins. Bs1, thus, remains the only clearly
documented transduction event outside of the vertebrate retroviruses.
Furthermore, Bs1 seems to have retained a truncated
gag domain with similarity to the gypsy-like
magellan LTR retrotransposon and a region that may
correspond to an env-like gene. The transduction of multiple cellular gene segments by Bs1 seems to have generated a
hybrid ORF that has the potential to be expressed. We discuss the possibility that this may be a general mechanism for the evolution of new genes.
EXPERIMENTAL PROCEDURES
Data Base Searches and Sequence Analysis--
The entire
Bs1 nucleotide sequence (GI 168648) and putative ORFs, ORF1
(GI 806301), ORF2 (GI 806303), and a short internal ORF (ORF3, GI
806302) that is spanned by ORF1, were used as queries in
BLAST searches against GenBank™ (24) (version
2.0, www.ncbi.nlm.nih.gov/blast/). Exon and coding sequence predictions
of c-bg and c-xe were performed using GENSCAN (genes.mit.edu/GENSCAN.html) developed at the Massachusetts Institute of Technology. Nucleotide sequence was translated into amino acid sequence using a modified version of the GDE translate program (bimas.dcrt.nih.gov/molbio/translate/) developed at the BioInformatics and Molecular Analysis Section of the National Institutes of Health. Sequence alignments were generated using the PILEUP program
as part of the University of Wisconsin Genetics Computer Group suite of
programs (version 10.0) and were further manipulated using GENEDOC (25).
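As a rough illustration of the conceptual translation step (this is not the GDE translate program itself), the following Python sketch translates a nucleotide sequence in one of the three forward reading frames using the standard genetic code. The function name `translate` and the toy input sequence are assumptions introduced for illustration; they are not taken from the Bs1 sequence.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(seq, frame=1):
    """Translate a DNA sequence in forward frame 1, 2, or 3.

    Stop codons are written as '*'; codons containing anything other
    than A, C, G, or T are written as 'X'.
    """
    if frame not in (1, 2, 3):
        raise ValueError("frame must be 1, 2, or 3")
    seq = seq.upper()
    start = frame - 1
    codons = (seq[i:i + 3] for i in range(start, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


# Toy example (hypothetical sequence, not from Bs1).
toy = "AATGGCCTAA"
print(translate(toy, frame=2))
```

A frame such as "+2" in a BLASTX report corresponds to `frame=2` here; reverse-strand frames would additionally require reverse-complementing the input.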
Cloning of the Maize Cellular Genes--
Zea mays
ssp. mays CV W22 germplasm was obtained from Susan Wessler
(University of Georgia). Seedlings were grown under greenhouse conditions for 3 weeks, and genomic DNA was extracted as described previously (26).
To isolate the maize cellular genes for 1,3-β-glucanase
(c-bg) and 1,4-β-xylan endohydrolase (c-xe), we
made use of homologous sequences available in the data bases. For
c-bg, the Bs1 sequence (GI 168648, position
917-1270) was conceptually translated and used as a query in
BLASTP searches. Similar proteins were used in multiple
alignments, and degenerate oligonucleotides were then designed
following the codon usage for Z. mays
(www.kazusa.or.jp/codon/). These primers were then used in polymerase
chain reactions. c-bg was amplified using a sense primer
(BG21, 5'-GGTGAAGCT(G/C)TTCGAGGC(C/G)G-3') that starts at codon number
21 of a hypothetical Arabidopsis BG protein (GI 5042412) and
an antisense primer (BG1A, 5'-GGATCTGTATGGTGAAGTTGC-3') that
corresponds to the 3' end of r-bg (GI 168648, position
1156-1176). Compared with the barley xe (GI 1813594),
r-xe corresponds to a part of the 5'-untranslated region and
a part of the amino terminus of the coding region. Thus, we chose a
conserved region that spans the active site of the enzyme (27) and that
is located approximately in the middle of the protein to design the
downstream primer (XEIV-1A, 5'-CATCTCGTTGTT(G/C/A)ACGTCGTAGTG-3'). For
the upstream primer, we aligned the amino acid sequences of barley (GI
1718238) (27), wheat (GI 5306060) (28), and two putative
Arabidopsis XE proteins (GI 12321045 and GI 8979937) with
the region corresponding to the predicted protein of r-xe
(GI 168648, position 1463-1664) and designed primers that span the
initiation codon (RXE1478S, 5'-CA(A/T)GGGCG(T/C)GTTCCG(G/C)C-3'). All
polymerase chain reaction conditions and cloning of amplification
products were as described previously (29). Sequencing was performed
using a SequiTherm EXCEL II DNA sequencing kit (Epicentre Technologies;
Madison, WI) with a Li-Cor LongReader IR 4200 DNA sequencer (Li-Cor,
Lincoln, NE).
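The degenerate primers above are written with alternatives in parentheses, e.g. (G/C). A minimal Python sketch of how such a primer expands into the pool of individual oligonucleotides is given below. The function name `expand_degenerate` is an assumption made for illustration; the primer strings are copied from the text with the 5'- and -3' labels removed.

```python
import re
from itertools import product


def expand_degenerate(primer):
    """Expand a primer written with (X/Y/...) alternatives into all variants."""
    # Each match is either a parenthesised group of alternatives or a fixed run.
    parts = re.findall(r"\(([ACGT/]+)\)|([ACGT]+)", primer)
    choices = [group.split("/") if group else [fixed] for group, fixed in parts]
    return ["".join(combo) for combo in product(*choices)]


primers = {
    "BG21": "GGTGAAGCT(G/C)TTCGAGGC(C/G)G",
    "XEIV-1A": "CATCTCGTTGTT(G/C/A)ACGTCGTAGTG",
    "RXE1478S": "CA(A/T)GGGCG(T/C)GTTCCG(G/C)C",
}

for name, primer in primers.items():
    variants = expand_degenerate(primer)
    print(name, len(variants), "variants, length", len(variants[0]))
```

The number of variants is the product of the number of alternatives at each degenerate position, so BG21 (two positions with two choices each) represents a pool of four oligonucleotides.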
RESULTS
Bs1 Sequence Re-analysis--
Previous characterization of
Bs1 established lack of similarity to any known retroelement
sequences (22). However, the wealth of new sequence information
currently available prompted us to re-examine Bs1. We used
the Bs1 nucleotide sequence (GI 168648) and previously
identified ORFs (GI 806301-3) in extensive sequence similarity searches
of the GenBank™ data base (May, 2001) (24). This revealed
that Bs1 contains several domains, each of which shares
similarity to multiple entries in GenBank™. The results
are represented schematically in Fig. 1.

Fig. 1.
Schematic representation of the different
domains in Bs1. Rectangles containing
arrowheads represent the LTRs. PBS and
PPT are the characteristic retroviral primer-binding site
and polypurine tract, respectively. The domains that make up most of
the internal sequence are gag, a region that may correspond
to an env-like domain, and the transduced sequences
r-bg, r-xe, and r-pma. The thick
lines below the Bs1 structure indicate the
position of open reading frames.
The region immediately downstream from the Bs1 5'-LTR
(position 303-890) corresponds to a truncated gag domain.
TBLASTN searches using the Bs1 ORF1 as a query
revealed similarity with the maize gypsy-like LTR
retrotransposon magellan (Fig. 2) (GI 2343274; 30% identity and 58%
similarity). BLASTX searches using the Bs1 sequence as a query also revealed significant similarity (35% identity and 46% similarity) to the GAG protein of a Bombyx mori
non-LTR retrotransposon (GI 4521268). Interestingly, the similarity of the Bs1 GAG domain extends to several viral capsid proteins
(data not shown). For example, when the amino-terminal-most 180 residues of ORF1 were used in Position-specific Iterated
BLAST searches (30), sequence similarity with turnip
crinkle virus coat protein (GI 335196; 24% identity and 40%
similarity) was observed. The Bs1 GAG region also shares
similarity (29% identity and 46% similarity) with the 55-kDa protein
of human adenovirus type 41 (GI 209892). In addition, the
Bs1 GAG domain shares amino acid sequence similarity with
many animal tropomyosins including the Xenopus non-muscle tropomyosin (GI 530992; 44% identity and 60% similarity). The significance of this result is unclear.

Fig. 2.
Amino acid sequence alignment of the
Bs1 and magellan GAG domains.
Residues shaded in black are identical, and those
shaded in gray are similar. Amino acid positions in ORF1 are
numbered.
BLASTX searches also revealed similarity between the
Bs1 sequence (position 917-1270; frame +2) and several
basic 1,3-β-glucanases (bg) (41-47% identity; 44-60%
similarity). This observation suggests that this region corresponds to
a portion of a maize 1,3-β-glucanase cellular gene (c-bg)
that has been transduced by Bs1. The region corresponding to
bg within Bs1 is referred to as retroelement bg (r-bg) (12). r-bg corresponds to
the carboxyl-terminal amino acid residues 336-419 of a hypothetical
Arabidopsis BG protein (GI 5042412) (Fig. 3). Although genes encoding
1,3-β-glucanases have been cloned from a wide range of plant and
fungal species (including a maize cDNA that encodes an acidic
1,3-β-glucanase (31) but shares no similarity to r-bg), a
maize c-bg that contributed r-bg has not been
characterized. In addition, several expressed sequence tags (ESTs) from
both monocots (e.g. maize, rice, sorghum, and barley) and
dicots (e.g. Arabidopsis and tomato)
corresponding to 1,3-β-glucanases share no significant sequence
similarity to r-bg. However, two sorghum ESTs (GI 6675471 and 9303089) were found to overlap and, when assembled, reconstitute a
cDNA with 83% nucleotide sequence identity to r-bg.

Fig. 3.
Amino acid sequence alignment of the
Bs1 ORF1 region spanning r-bg and
r-xe. The alignment shows that although
r-xe integration into ORF1 maintained the XE frame,
r-bg integration did not maintain the BG frame.
r-bg and r-xe nucleotide sequences were
translated in the frames that encode β-1,3-glucanase and
β-1,4-xylan endohydrolase, respectively. The sorghum EST
(SoEST, an assembly of GI 6675471 and 9303089) and a
hypothetical Arabidopsis protein (Atbg, GI
5042412) coding for β-1,3-glucanase are shown. HvX-II is
the barley xylan endohydrolase isoform X-II (GI 1718238). Amino acid
positions in ORF1 are numbered.
In addition, BLASTX searches uncovered similarity between
Bs1 (position 1310-1664; frame +3) and 1,4-β-xylan
endohydrolase (xe) from barley, wheat,
Arabidopsis, and many fungi and bacteria. Relative to the
barley xe isoenzyme X-II (GI 1718238), the transduced region
of xe (r-xe) corresponds to amino acid residues
1-35 fused to residues 86-118, which suggested that r-xe
has sustained a deletion of sequences coding for 51 amino acids (Fig.
3). As in the case of bg, a maize cellular gene
(c-xe) that contributed r-xe was not found in our searches.
Bs1 (position 1865-2521) shares similarity to a maize
plasma membrane proton ATPase. A detailed description of
r-pma and the identification of the maize c-pma
have been reported previously (12-14).
In retroviruses and LTR retrotransposons containing env-like
genes, the env is always located at the 3'-most region of
the internal sequence, i.e. downstream from pol
and just upstream from the 3'-LTR. BLASTX searches (position
2541-2853, frame +2) revealed similarity between the 3'-most
region of the Bs1 internal sequence and members of a very
large gene family that constitutes ~1% of the Arabidopsis
genome (>200 members) (32, 33). The first gene of this family to be
characterized encodes a membrane-associated salt-inducible protein (GI
473873) (34). Proteins encoded by this family do not share significant
overall amino acid sequence similarity. However, they do share the
presence of a degenerate 35-amino acid repeat called the "PPR
repeat" (33). Careful examination of the Bs1
sequence revealed that the similarity with the PPR proteins
is restricted to two tandem repeats of the PPR motif (Fig.
4).

Fig. 4.
The Bs1 env-like
domain contains tandem PPR repeats. Amino acid sequence
alignments of the PPR motif consensus (PPR), those of
Bs1 (-a and -b), and of three
Arabidopsis proteins (GI 2462828, 4335729, and 4038037) are
shown. Amino acid residue positions within the Arabidopsis
proteins and the Bs1 sequence (frame +2 translation of
nucleotide positions 2541-2853) are indicated.
Data base searches using the Bs1 nucleotide sequence as a
query revealed a maize EST (GI 7238199) with similarity to the
Bs1 LTRs. The EST is 595 bp in length and was isolated from
a Z. mays CV Ohio43 cDNA library of mixed stages of
anther and pollen. The 5'-most 366 bp of this EST is nearly identical
to the Bs1 sequence (99.5%; position 2834-3199).
Specifically, the region of similarity extends from 68 bp upstream of the
start of the 3'-LTR to 4 bp upstream of the end of the 3'-LTR (Fig.
5). The corresponding region within the
EST is followed by a poly(A) tract (18 bp). The putative Bs1 polyadenylation signal is located 18 bp upstream of the poly(A) tract
of the EST. The remaining portion of the EST (immediately downstream
from the poly(A) tract) is identical to a part of another maize EST (GI
5005895) that corresponds to a gene encoding aminotransferase. Similarity with an EST (GI 7238199) suggests that Bs1 is
possibly expressed.
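The coordinates quoted for the EST match can be cross-checked against the element dimensions given in the Introduction (a 3203-bp element with 302-bp LTRs). The short Python sketch below does this arithmetic; it assumes 1-based, inclusive coordinates and that the 3'-LTR occupies the last 302 bp of the element. The variable names are illustrative only.

```python
# Values taken from the text.
BS1_LENGTH = 3203   # total length of Bs1 in bp
LTR_LENGTH = 302    # length of each LTR in bp

# Assumption: 1-based inclusive coordinates, 3'-LTR at the very end of Bs1.
ltr3_start = BS1_LENGTH - LTR_LENGTH + 1   # first base of the 3'-LTR
ltr3_end = BS1_LENGTH                      # last base of the 3'-LTR

match_start = ltr3_start - 68   # 68 bp upstream of the start of the 3'-LTR
match_end = ltr3_end - 4        # 4 bp upstream of the end of the 3'-LTR
match_length = match_end - match_start + 1

print(f"3'-LTR: {ltr3_start}-{ltr3_end}")
print(f"EST match: {match_start}-{match_end} ({match_length} bp)")
```

Under these assumptions the computed interval is 2834-3199 and spans 366 bp, in agreement with the positions and length reported above.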

Fig. 5.
An EST corresponds to
Bs1. Nucleotide sequence alignment of
Bs1 and a maize EST (GI 7238199). The LTR sequence is
boxed, and the polyadenylation signal is
underlined. The poly(A) tract in the EST is shown followed
by a region of similarity to an EST for aminotransferase (see
text).
Maize c-bg and c-xe--
To establish direct evidence that the
r-bg and r-xe domains of Bs1
correspond to transduced portions of maize cellular genes, we cloned
the maize c-bg and c-xe. Nucleotide sequence
analysis suggests that c-bg contains two introns and three
exons (Fig. 6). Sequence alignment of
r-bg and c-bg (Fig. 6) indicates that they share
~81% sequence identity and that r-bg corresponds to a
portion of the third exon of c-bg (position 1091-1350).
Because r-bg shows slightly higher identity with the sorghum
ESTs (83%, GI 6675471 and 9303089), it is possible that a different
maize cellular gene contributed to r-bg. Regardless, the
high similarity between the Bs1 sequence and both of the
sorghum ESTs and c-bg confirms that r-bg
corresponds to a transduced bg gene. Maize c-xe,
on the other hand, appears to contain one intron and two exons.
Compared with c-xe, r-xe corresponds to a part of
the first exon (108 bp downstream from the ATG) fused to a part of the
second exon (position 495-588 relative to the ATG) (Fig.
7). This suggests that r-xe
has sustained a deletion that removed 44 bp from the first exon, all of
the intervening intron, and 82 bp of the second exon (Fig. 7). This
deletion thus eliminated 42 amino acid residues in r-xe
compared with c-xe (Fig. 3). Overall, r-xe and
c-xe share ~86% nucleotide sequence identity.
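The relationship between the deleted exon sequence and the number of missing residues can be checked with simple arithmetic. The Python sketch below is illustrative only: the helper name `codons_removed` is an assumption, and the intron length is inferred here from the 385-bp total deletion reported for r-xe in this section, not stated directly in the text.

```python
def codons_removed(exon_bp_deleted):
    """Return the number of codons lost for a list of deleted exonic lengths.

    Raises ValueError if the total is not a multiple of 3, because such a
    deletion would shift the downstream reading frame.
    """
    total = sum(exon_bp_deleted)
    if total % 3 != 0:
        raise ValueError("deletion of %d bp would shift the reading frame" % total)
    return total // 3


exon1_deleted = 44   # bp removed from the first exon (from the text)
exon2_deleted = 82   # bp removed from the second exon (from the text)

residues_lost = codons_removed([exon1_deleted, exon2_deleted])
print("amino acid residues eliminated:", residues_lost)

# Inference only: if the 385-bp deletion covers both exon segments and the
# whole intron, the intron would be 385 - 44 - 82 bp long.
total_deletion = 385
inferred_intron = total_deletion - exon1_deleted - exon2_deleted
print("inferred intron length (bp):", inferred_intron)
```

Because 44 + 82 = 126 is a multiple of 3, the deletion removes 42 whole codons and leaves the XE reading frame intact downstream of the junction.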

Fig. 6.
Acquisition of a portion of c-bg
by Bs1. Schematic representation (A) and
sequence alignment (B) of c-bg, r-bg,
and sorghum ESTs (SoEST, an assembly of GI 6675471 and
9303089). Exons are represented by black rectangles, and
introns are represented by a thick black line. The antisense
primer used to amplify c-bg is located at the end of the
c-bg sequence. Nucleotide sequence positions within
Bs1 are indicated.
Fig. 7.
Acquisition of a portion of c-xe
by Bs1. Schematic representation
(A) and sequence alignment (B) of c-xe
and r-xe. Exons are represented by black
rectangles, and introns are represented by a thick black
line. The white line dividing the r-xe
sequence in A corresponds to the point where a portion of
exon I and a portion of exon II are fused. Sequence in lowercase
letters corresponds to the intron sequence. Nucleotide sequence
positions within Bs1 are indicated.
Relative to their cellular counterparts, the mutation patterns of
r-bg and r-xe are different. Whereas
r-xe has sustained a large deletion of 385 bp,
r-bg has only a single nucleotide insertion. Base
substitution patterns also differ: whereas r-xe has
sustained 34 nucleotide substitutions (23 transitions and 11 transversions),
r-bg bears 59 nucleotide substitutions relative to c-bg (27 transitions and 32 transversions). Although base substitutions in
r-xe have resulted in 19 amino acid changes (of the 67 amino
acid residues contributed to ORF1 by r-xe), 9 of these
mutations are changes into residues conserved in the barley XE protein
(Fig. 3). Overall, integration of the c-xe sequence into
Bs1 and subsequent mutations in r-xe seem to have
occurred in a non-random fashion and conserved the XE frame within
Bs1 ORF1 (Fig. 3). On the other hand, r-bg
integration into Bs1 did not maintain the BG frame within
ORF1 (Fig. 3). Furthermore, compared with c-bg, base
substitutions have resulted in 43 amino acid changes (of 85), two
of them being stop codons (Fig. 3).
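The transition and transversion counts above come from pairwise comparison of aligned sequences. A minimal Python sketch of that classification follows, assuming two pre-aligned sequences of equal length in which gaps are written as "-". The function name `count_substitutions` and the toy sequences are assumptions for illustration, not the actual r-xe/c-xe or r-bg/c-bg alignments.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_substitutions(ref, query):
    """Count (transitions, transversions) between two aligned sequences.

    Columns containing a gap or an ambiguous base are skipped.
    """
    if len(ref) != len(query):
        raise ValueError("aligned sequences must have the same length")
    transitions = transversions = 0
    for a, b in zip(ref.upper(), query.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1    # purine<->pyrimidine
    return transitions, transversions


# Toy alignment (hypothetical): one transition, one transversion, one gap.
ts, tv = count_substitutions("ACGT-A", "GCTTAA")
print("transitions:", ts, "transversions:", tv)
```

With the counts reported in the text, the transition:transversion ratios are 23:11 for r-xe and 27:32 for r-bg, which is one way to express the difference in substitution pattern between the two segments.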
DISCUSSION
Bs1 gag and env Domains--
Our results reveal significant
similarity between the amino-terminal-most 100 residues of ORF1 and GAG
of the maize gypsy-like LTR retrotransposon
magellan (Fig. 2), GAG of a non-LTR retrotransposon from
B. mori (35), and the capsid protein of turnip crinkle virus
(36). magellan was isolated as a recent insertion in the wx-M allele (37) and as an independent insertion in a maize gene that encodes the pl transcription factor (38) (GI 2343274). The recent insertion of magellan into
wx and pl suggests it retains the coding capacity
for all proteins necessary for retrotransposition, including GAG.
magellan was established as a member of the
gypsy/Ty3 LTR retrotransposons by a comparison of
its integrase and reverse transcriptase domains with those of the yeast
Ty3 and Lilium del gypsy-type LTR
retrotransposons (37). Taken together, this suggests that a functional
copy of Bs1 may belong to the gypsy/Ty3 class of
LTR retrotransposons. The similarity between Bs1 and
magellan GAG proteins is intriguing because homology between
GAG proteins of different retrotransposons is usually low or not
significant (39). This may indicate that Bs1 and
magellan are, in fact, closely related.
Retroviruses and some LTR retrotransposons have an env
domain located in the 3' region of their internal sequence (5-9). env domains are highly variable and may have originated from
independent transduction events (1). Examination of the deduced amino
acid sequence of the Bs1 3'-most region reveals similarity
to a large family of proteins that contain tandem PPR repeats. Given
its location and shared structural similarities with the env
domains of retroviruses and LTR retrotransposons (8), we suggest that this region may correspond to an env-like domain. For
example, ENV proteins are typically plasma membrane-associated
glycoproteins. The putative Bs1 ENV-like protein, like
PPR proteins in general, contains several potential N- and
O-glycosylation sites and has a hydrophobicity profile
consistent with a membrane-associated protein; at least some PPR
proteins are targeted to the plasma membrane
(34).2 In addition, like ENV
proteins of some LTR retrotransposons (e.g. SIRE-1) (5), the
PPR motif is predicted to form transmembrane domains with α-helices
and coiled coils (33). Furthermore, members of some retrovirus groups
exhibit considerable diversity in receptor usage. This is because of
the presence of stretches of variable sequence within otherwise
conserved sequence (40-43). Variation has been suggested to result
from both amino acid changes within and the recombination of these
regions (41, 42). Variation in the sequence, number, and organization
of the PPR repeats has likewise been suggested to reflect an ongoing
mechanism generating diversity (32).
Maize c-bg and c-xe--
Our finding that Bs1 may have
transduced portions of genes that encode 1,3-β-glucanase and
1,4-β-xylan endohydrolase prompted us to clone these genes from
maize. β-Glucanases are enzymes that hydrolyze β-linked glucans and
are implicated in various physiological processes that involve the
structure and function of plant cell walls. Plant β-glucanases fall
into two types, 1,3-β-glucanases and 1,3;1,4-β-glucanases. Based on
their isoelectric properties, 1,3-β-glucanases may be further divided
into two classes, namely acidic and basic. Although genes for both
classes have been isolated from different plant species, only one maize
cDNA encoding an acidic 1,3-β-glucanase has been identified (31).
A comparison of the deduced translation product of c-bg with
plant β-glucanases reveals that it is clearly a basic
1,3-β-glucanase (Fig. 3; data not shown).
Genomic and cDNA sequences encoding 1,4-β-xylan endohydrolase
have been isolated from barley and wheat (27, 28, 44). The barley
cDNAs (GI 1718235 and GI 1718237) encode two isoenzymes (X-I
and X-II) that share 87% identity at the amino acid level (27). Both
the barley and wheat genomic sequences correspond to isoenzyme X-I (28,
44). The maize c-xe is most related to the barley isoenzyme
X-II (Fig. 3). Barley and wheat 1,4-β-xylan endohydrolases are
involved in endosperm cell wall degradation through the hydrolysis of
β-linked xylans (45).
Multiple Cellular Gene Transduction by Bs1--
The capture of
cellular genes by retroviruses is a consequence of the different
features of the retroviral life cycle. As a result of inefficient
transcript termination and polyadenylation at the 3'-LTR, ~15% of
proviral transcripts are readthrough, spanning genomic sequences
located downstream from the provirus (46, 47). Joining the viral and
genomic sequences is achieved at the RNA level through abnormal
splicing between a donor site in the viral sequence and an acceptor
site in the genomic sequence, or at the DNA level by a deletion
event(s) followed by transcription from the 5'-LTR (48). A chimeric RNA
molecule containing the cellular gene sequences and a normal viral RNA
molecule are co-packaged into one viral particle (49). Strand switching
by reverse transcriptase during reverse transcription may then result
in non-homologous recombination events leading to the incorporation of
the cellular gene sequences into the retroviral genome (47, 50).
Bs1 represents the only clearly documented transduction
event other than by vertebrate retroviruses and human L1 elements. We
provide evidence that Bs1 has captured segments from at
least three different cellular genes. Presumably, a chimeric RNA
molecule must have been packaged together with a normal
(i.e. functional) Bs1 RNA molecule into a
virus-like particle. Evidence that LTR retrotransposons assemble into
virus-like particles that are structurally and functionally analogous
to retroviral cores has been clearly demonstrated (51-53). That a
functional copy of Bs1 may exist in maize has also been
proposed previously (13); this is suggested by the fact that
Bs1 was first isolated as a recent insertion in the maize
Adh1 gene (21) and indicated in our Southern hybridization results (see the Supplemental Material). However, lacking the sequence of a functional Bs1 element, we still cannot
distinguish whether Bs1 is an LTR retrotransposon (perhaps
gypsy-like) or a bona fide retrovirus.
Regardless of the true identity of Bs1, ORF1 is reminiscent
of many oncogenes in several respects (11). First, like transduced cellular proto-oncogenes (oncogenes), the Bs1-transduced
genes lack their native promoters and, if transcribed, would presumably be under the control of the Bs1 promoter located within its
5'-LTR. Second, like viral oncogenes, the Bs1-transduced
genes lack introns and have sustained deletions. For example,
r-bg corresponds only to a portion of the c-bg
third exon, and the 5' end (first and second exons and the intervening
introns) and 3' end of the gene were deleted (Fig. 6). Relative to
c-xe, r-xe is composed of a portion of the first
exon fused in-frame to a portion of the second exon, whereas the
intervening intron and parts of the first and second exons are deleted
(Fig. 7). The deletion is flanked by a 4-bp direct repeat (CACC).
Direct repeats have been implicated in the formation of simple
deletions during reverse transcription (54). Furthermore,
r-pma corresponds to the first codon of exon 4 through the
first 72 bp of exon 10 with all intervening introns spliced out and an
internal deletion of 183 bp that removed most of exon 6 (12, 13).
Third, the transduction of cellular genes by Bs1 may result
in a fusion protein with the Bs1 GAG sequence. Most viral
oncogenes encode GAG fusion proteins that differ significantly in
structure and, potentially, in function from the corresponding proto-oncogenes. Fourth, relative to their cellular gene counterparts, the Bs1-transduced genes have accumulated multiple point
mutations, a consequence of reverse transcriptase being
error-prone.
A novel aspect of Bs1 is that it has captured more than two
cellular gene segments. Possible mechanistic scenarios include the
following: (i) the additive acquisition of different gene segments in
multiple successive events by one Bs1 element, (ii) separate
transduction events by different Bs1 elements and subsequent recombination, or (iii) transduction involving the generation of a very
long readthrough transcript spanning all three genes arranged in
tandem. The second scenario is unlikely because Southern hybridization
experiments do not reveal intermediate Bs1 molecules with
only one or two of the transduced genes.2 The third
scenario also seems unlikely because readthrough of several transcript
termination and polyadenylation signals would have to occur and because
our polymerase chain reaction results do not suggest that the three
genes are in close physical association (data not shown). Several
acutely transforming retroviruses have been shown to harbor sequences
from two cellular genes (55-58). In these cases, the chimeric viral
genome arose by independent recombination of the viral sequence with
two distinct cellular loci (57). In Bs1, the sequence identity
between the transduced segments and their cellular counterparts (81, 86, and 88% for r-bg/c-bg, r-xe/c-xe, and r-pma/c-pma,
respectively) suggests that transduction probably started with
r-bg, followed by r-xe and then r-pma.
Hence, integration of the transduced sequences may have taken place at the 3'-most region of the internal sequence with subsequent
transductions displacing previous ones upstream. Additionally, because
the distance between r-bg and r-xe on the
Bs1 sequence is relatively short (Fig. 3), it is possible
that they may actually correspond to the transduced portion of a
composite/chimeric gene that bears similarity to both glucanases and
xylan endohydrolases. However, we could not find any examples of such a
gene in GenBank™, and more importantly, we could not
amplify gene sequences that span the two domains. The transducing
efficiency of a retroelement may be regarded as a function of the
efficiency of its polyadenylation and transcript termination. In the
case of Bs1, the polyadenylation signal is non-canonical
(5'-AATACA-3') and may result in higher transducing efficiency.
Regardless of the scenario, the transduction events must have occurred
within very close evolutionary time windows because the divergence
between the transduced gene segments and their cellular counterparts is
similar for the three genes.
We identified an EST (GI 7238199) that shares identity over its 5'
two-thirds with the Bs1 3'-LTR and a part of the internal sequence. The 3' one-third of this EST shares identity with another EST
(GI 5005895) that encodes an aminotransferase. The two regions of the EST
are separated by a poly(A) tract and a sequence that may correspond to
the XhoI restriction site used for cloning (Fig. 5).
Although it is possible that this EST represents a readthrough transcript of Bs1 into a 3'-flanking gene (i.e.
aminotransferase), it is likely that the generation of this chimeric
structure is merely a cloning artifact. Regardless, this EST represents
evidence that Bs1 is expressed. Interestingly, this EST was
isolated from a cDNA library of mixed stages of anther and pollen,
indicating that Bs1 is expressed in the germ line. Together
with the fact that maize is an outcrosser, germ line expression may
have facilitated the fixation of Bs1 transduction and
insertion events.
ORF1 Is a Novel Hybrid--
The Bs1 ORF1 was predicted from sequence and
has been shown to be translated in vitro (22, 23).
In vitro translation experiments also indicate that a longer
polypeptide predicted for the frameshift fusion of ORF1 and ORF2 (which
would span the region with similarity to the PPR-containing proteins)
can be generated (22). This is striking because incorporation of
r-bg, r-xe, and r-pma into Bs1 involved a complex pattern of
alterations, including
abnormal splicing of transcripts, deletions, and numerous point
mutations, any of which could have disrupted the reading frame. A
possible explanation is that there may be selective
pressure to maintain this ORF. Unlike oncogenes, Bs1 does
not seem to be associated with any disease phenotype because it is
present in normal "wild type" maize cultivars and also in wild
relatives of maize, the teosintes (20). Taken together, the observations
that selective
constraints may have maintained ORF1, that several developmentally
important cellular genes have been integrated, and that
r-pma retains the ATPase domain of c-pma (12,
13) suggest a possible function for Bs1 in normal maize
development. In humans, modification of cell function by L1-mediated
transduction has been proposed (18). L1 retrotransposition has also
been implicated in the mobilization of cellular sequences such as exons
or promoters into other locations in the genome and hence was suggested
to represent a general mechanism for the evolution of new genes by exon
shuffling (17, 19). In light of our findings, it is tempting to
speculate that transduction events associated with Bs1 gave
rise to a novel hybrid ORF and, therefore, a new gene.
 |
ACKNOWLEDGEMENTS |
We thank Quang Hien Le and Drs. Barid
Mukherjee and Candace Waddell for critical comments on our manuscript.
 |
FOOTNOTES |
* This work was supported by a National Science and
Engineering Research Council of Canada grant (to T. E. B.). The costs of
publication of this article were defrayed in part by the payment of page
charges. The article must therefore be hereby marked "advertisement" in
accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
The on-line version of this article (available at
http://www.jbc.org) contains Fig. 1.
To whom reprint requests should be addressed. Tel: 514-398-6472;
Fax: 514-398-5069; E-mail: thomas_bureau@maclan.mcgill.ca.
Published, JBC Papers in Press, September 11, 2001, DOI 10.1074/jbc.M105850200
2 N. Elrouby and T. E. Bureau, unpublished data.
 |
ABBREVIATIONS |
The abbreviations used are: LTR, long terminal repeat; ORF, open reading frame; bp, base pair.
 |
REFERENCES |
1. Malik, H. S., Henikoff, S., and Eickbush, T. H. (2000) Genome Res. 10, 1307-1318
2. Temin, H. M. (1980) Cell 21, 599-600
3. Finnegan, D. J. (1983) Nature 302, 105-106
4. Jordan, I. K., Matyunina, L. V., and McDonald, J. F. (1999) Proc. Natl. Acad. Sci. U. S. A. 96, 12621-12625
5. Laten, H. M., Majumdar, A., and Gaucher, E. (1998) Proc. Natl. Acad. Sci. U. S. A. 95, 6897-6902
6. Britten, R. J. (1995) Proc. Natl. Acad. Sci. U. S. A. 92, 599-601
7. Kim, A., Terzian, C., Santamaria, P., Pelisson, A., Prud'Homme, N., and Bucheton, A. (1994) Proc. Natl. Acad. Sci. U. S. A. 91, 1285-1289
8. Wright, D. A., and Voytas, D. F. (1998) Genetics 149, 703-715
9. Boeke, J. D., and Stoye, J. P. (1997) in Retroviruses (Coffin, J. M., Hughes, S. H., and Varmus, H. E., eds), pp. 343-435, Cold Spring Harbor Laboratory Press, New York
10. Xiong, Y., and Eickbush, T. H. (1990) EMBO J. 9, 3353-3362
11. Cooper, G. M. (1995) Oncogenes, 2nd Ed., pp. 21-65, Jones and Bartlett Publishers, Sudbury, MA
12. Bureau, T. E., White, S. E., and Wessler, S. R. (1994) Cell 77, 479-480
13. Jin, Y.-K., and Bennetzen, J. L. (1994) Plant Cell 6, 1177-1186
14. Palmgren, M. G. (1994) Plant Mol. Biol. 25, 137-140
15. Kazazian, H. H. J., and Moran, J. V. (1998) Nat. Genet. 19, 19-24
16. Li, W.-H., Gu, Z., Wang, H., and Nekrutenko, A. (2001) Nature 409, 847-849
17. Moran, J. V., DeBerardinis, R. J., and Kazazian, H. H. J. (1999) Science 283, 1530-1534
18. Goodier, J. L., Ostertag, E. M., and Kazazian, H. H. (2000) Hum. Mol. Genet. 9, 653-657
19. Courseaux, A., and Nahon, J.-L. (2001) Science 291, 1293-1297
20. Johns, M. A., Mottinger, J., and Freeling, M. (1985) EMBO J. 4, 1093-1102
21. Mottinger, J. P., Johns, M. A., and Freeling, M. (1984) Mol. Gen. Genet. 195, 367-369
22. Jin, Y.-K., and Bennetzen, J. L. (1989) Proc. Natl. Acad. Sci. U. S. A. 86, 6235-6239
23. Johns, M. A., Babcock, M. S., Fuerstenberg, S. M., Fuerstenberg, S. I., Freeling, M., and Simpson, R. B. (1989) Plant Mol. Biol. 12, 633-642
24. Altschul, S. F., Gish, W., Miller, W., Myers, E. W., and Lipman, D. J. (1990) J. Mol. Biol. 215, 403-410
25. Nicholas, K. B., Nicholas, H. B. J., and Deerfield, D. W. I. (1997) EMBNEW. News 4, 14
26. Dellaporta, S. L., Wood, J., and Hicks, J. B. (1985) in Molecular Biology of Plants: A Laboratory Manual (Malmberg, R., Messing, J., and Sussex, I., eds), pp. 36-37, Cold Spring Harbor Press, New York
27. Banik, M., Garrett, T. P. J., and Fincher, G. B. (1996) Plant Mol. Biol. 31, 1163-1172
28. Sidhu, P. K., and Fincher, G. B. (1999) Plant Physiol. 121, 685-686
29. Elrouby, N., and Bureau, T. E. (2000) Plant Physiol. 124, 369-377
30. Altschul, S. F., Madden, T. L., Schäffer, A. A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D. J. (1997) Nucleic Acids Res. 25, 3389-3402
31. Wu, S., Kriz, A. L., and Widholm, J. M. (1994) Plant Physiol. 106, 1709-1710
32. Aubourg, S., Boudet, N., Kreis, M., and Lecharny, A. (2000) Plant Mol. Biol. 42, 603-613
33. Small, I. D., and Peeters, N. (2000) Trends Biochem. Sci. 25, 46-47
34. Chang, P.-F., Damsz, B., Kononowicz, A. K., Reuveni, M., Chen, Z., Xu, Y., Hedges, K., Tseng, C. C., Singh, N. K., Binzel, M. L., Narasimhan, M. L., Hasegawa, P. M., and Bressan, R. A. (1996) Physiol. Plant. 98, 505-516
35. Abe, H., Ohbayashi, F., Shimada, T., Sugasaki, T., Kawai, S., and Oshiki, T. (1998) Genes Genet. Syst. 73, 353-358
36. Carrington, J. C., Heaton, L. A., Zuidema, D., Hillman, B. I., and Morris, T. J. (1989) Virology 170, 219-226
37. Purugganan, M. D., and Wessler, S. R. (1994) Proc. Natl. Acad. Sci. U. S. A. 91, 11674-11678
38. Marillonnet, S., and Wessler, S. R. (1998) Genetics 150, 1245-1256
39. Eickbush, T. H. (1994) in The Evolutionary Biology of Viruses (Morse, S. S., ed), pp. 121-155, Raven Press, New York
40. Battini, J. L., Heard, J. M., and Danos, O. (1992) J. Virol. 66, 1468-1475
41. Bova, C. A., Olsen, J. C., and Swanstrom, R. (1988) J. Virol. 62, 75-83
42. Dorner, A. J., and Coffin, J. M. (1986) Cell 45, 365-374
43. Srinivasan, A., Anand, R., York, D., Ranganathan, P., Feorino, P., Schochetman, G., Gurran, J., Kalyanaraman, V. S., Luciw, P. A., and Sanchez-Pescador, R. (1987) Gene (Amst.) 52, 71-82
44. Banik, M., Li, C.-D., Langridge, P., and Fincher, G. B. (1997) Mol. Gen. Genet. 253, 599-608
45. Fincher, G. B. (1989) Annu. Rev. Plant Physiol. Plant Mol. Biol. 40, 305-346
46. Herman, S. A., and Coffin, J. M. (1987) Science 236, 845-848
47. Swain, A., and Coffin, J. M. (1992) Science 255, 841-845
48. Swanstrom, R., Parker, R. C., Varmus, H. E., and Bishop, J. M. (1983) Proc. Natl. Acad. Sci. U. S. A. 80, 2519-2523
49. Hu, W.-S., and Temin, H. M. (1990) Science 250, 1227-1232
50. Zhang, J., and Temin, H. M. (1993) Science 259, 234-238
51. Al-Khayat, H., Bhella, D., Kenny, J. M., Roth, J.-F., Kingsman, A. J., Martin-Rendon, E., and Saibil, H. R. (1999) J. Mol. Biol. 292, 65-73
52. Garfinkel, D. J., Boeke, J. D., and Fink, G. R. (1985) Cell 42, 507-517
53. Mellor, J., Fulton, S. M., Dobson, M. J., Wilson, W., Kingsman, S. M., and Kingsman, A. J. (1985) Nature 313, 243-246
54. Omer, C. A., Pogue-Geile, K., Guntaka, R., Staskus, K. A., and Faras, A. J. (1983) J. Virol. 47, 380-382
55. Ellis, R. W., DeFeo, D., Maryak, J. M., Young, H. A., Shih, T. Y., Chang, E. H., Lowy, D. R., and Scolnick, E. M. (1980) J. Virol. 36, 408-420
56. Leprince, D., Gegonne, A., Coll, J., de Taisne, C., Schneeberger, A., Lagrou, C., and Stehelin, D. (1983) Nature 306, 395-397
57. Naharro, G., Robbins, K. C., and Reddy, E. P. (1984) Science 223, 63-66
58. Nunn, M. F., Seeburg, P. H., Moscovici, C., and Duesberg, P. H. (1983) Nature 306, 391-395
Copyright © 2001 by The American Society for Biochemistry and Molecular Biology, Inc.
