J. Biol. Chem., Vol. 280, Issue 21, 20740-20751, May 27, 2005
Pyrrolysine and Selenocysteine Use Dissimilar Decoding Strategies
From the
Received for publication, February 8, 2005, and in revised form, March 22, 2005.
Selenocysteine (Sec) and pyrrolysine (Pyl) are known as the 21st and 22nd amino acids in protein. Both are encoded by codons that normally function as stop signals. Sec specification by UGA codons requires the presence of a cis-acting selenocysteine insertion sequence (SECIS) element. Similarly, it is thought that Pyl is inserted by UAG codons with the help of a putative pyrrolysine insertion sequence (PYLIS) element. Herein, we analyzed the occurrence of Pyl-utilizing organisms, Pyl-associated genes, and Pyl-containing proteins. The Pyl trait is restricted to several microbes, and only one organism has both Pyl and Sec. We found that methanogenic archaea that utilize Pyl have few genes that contain in-frame UAG codons, and many of these are followed with nearby UAA or UGA codons. In addition, unambiguous UAG stop signals could not be identified. This bias was not observed in Sec-utilizing organisms and non-Pyl-utilizing archaea, as well as with other stop codons. These observations as well as analyses of the coding potential of UAG codons, overlapping genes, and release factor sequences suggest that UAG is not a typical stop signal in Pyl-utilizing archaea. On the other hand, searches for conserved Pyl-containing proteins revealed only four protein families, including methylamine methyltransferases and transposases. Only methylamine methyltransferases matched the Pyl trait and had conserved Pyl, suggesting that this amino acid is used primarily by these enzymes. These findings are best explained by a model wherein UAG codons may have ambiguous meaning and Pyl insertion can effectively compete with translation termination for UAG codons obviating the need for a specific PYLIS structure. Thus, Sec and Pyl follow dissimilar decoding and evolutionary strategies.
Pyrrolysine (Pyl) has recently been identified in the active site of monomethylamine methyltransferase (MtmB) from Methanosarcina barkeri, and sequences encoding Pyl-containing homologs of this protein were found in several other methanogenic archaea, including Methanosarcina acetivorans, Methanosarcina mazei, and Methanosarcina thermophila (1-3). Methylamine methyltransferase genes from these organisms contain in-frame UAG codons, which do not halt translation but encode Pyl. Following this discovery, additional Pyl-containing methyltransferases have been identified in Methanosarcina, and to date three classes of Pyl-containing methylamine methyltransferase genes are known: mtmB, dimethylamine methyltransferase (mtbB), and trimethylamine methyltransferase (mttB) (1, 2). Some Methanosarcina contain several paralogs of each methyltransferase family. Using this information, various genome sequences were scanned for genes encoding homologous Pyl-containing proteins. This search identified an mttB homolog in a Gram-positive bacterium, Desulfitobacterium hafniense (2). More recently, an Antarctic archaeon, Methanococcoides burtonii, has also been reported to utilize Pyl (4). In contrast, no Pyl-containing methyltransferases have been reported in eukaryotes. It is also not known whether the utilization of Pyl is restricted to methyltransferases or whether other Pyl-containing proteins exist.
Although the mechanism of Pyl biosynthesis and incorporation into protein is not fully understood, the presence of a Methanosarcina tRNAPyl gene (pylT) with the CUA anticodon and of a class II aminoacyl-tRNA synthetase gene (pylS) argued for cotranslational incorporation of Pyl (2). A recent study suggested that pylT and pylS are the only foreign genes necessary for translating UAG as Pyl in Escherichia coli, when the cells are supplemented with exogenous Pyl (5). In addition, it was reported that PylS could activate and ligate Pyl directly onto tRNAPyl (5, 6) and that tRNAPyl is directly recognized by the standard elongation factor EF-Tu (7). Analysis of the genomic context of pylT and pylS identified pylB, pylC, and pylD, which were suggested to participate in Pyl biosynthesis or insertion into protein (2). The pylT, -S, -B, -C, and -D genes constitute a Pyl gene cluster (or Pyl operon), and the pylT and pylS genes are considered the Pyl utilization signature. Because Pyl is inserted in response to a codon that in most organisms functions as a terminator, there are three distinct possibilities for how Pyl insertion can be achieved: (i) redefinition of a subset of UAG stop codons by a cis-acting mRNA signal to encode Pyl; (ii) reassignment of all UAG codons to encode Pyl; and (iii) ambiguous meaning of UAG codons, e.g. a competition between read-through and termination such that a fraction of ribosomes translating the UAG codon incorporate Pyl, whereas the rest support termination (8). However, the attention of researchers has previously focused on the first possibility, because of the analogy between Pyl and selenocysteine (Sec) (9). Both Pyl and Sec are encoded by "termination" codons and are the only known additions to the pool of 20 universal, directly encoded amino acids. Therefore, Sec and Pyl are known as the 21st and 22nd amino acids. The mechanism of Sec insertion is known in much detail (10-13).
Incorporation of Sec requires the presence of a selenocysteine insertion sequence (SECIS) element, a hairpin structure residing in the 3'-untranslated regions (3'-UTRs) of selenoprotein mRNAs in eukaryota and archaea, or immediately downstream of Sec UGA codons in eubacteria (13-15). SECIS is essential for Sec insertion, whereas in its absence UGA serves as a terminator (16). Several attempts have been made to search for analogous stem-loop structures in mRNAs encoding Pyl-containing proteins. A putative secondary structure was predicted 56 nucleotides downstream of the Pyl-encoding UAG codon in mtmB mRNAs and designated as the pyrrolysine insertion sequence (PYLIS) element (9, 17). This predicted structure has not been tested experimentally for functional relevance. Identification of genes encoding Sec- and Pyl-containing proteins in genomic sequences is challenging, because standard annotation tools interpret UGA and UAG as stop signals. For example, most methylamine methyltransferases in Methanosarcina are incorrectly annotated. At present, no tools are available for prediction of Pyl-containing proteins, and previous in silico approaches were limited to manual analyses and BLAST searches (2). In the case of Sec, tools have been developed and successfully used to identify selenoprotein genes by searching for SECIS elements (18-20) and Sec/Cys pairs in homologous sequences (21, 22). In this study, we used bioinformatics approaches to analyze Pyl-utilizing organisms and Pyl-containing proteins, and to examine possible mechanisms of Pyl insertion. Our data suggest that indiscriminate Pyl insertion at UAG may be tolerated in Pyl-utilizing archaea and that Pyl decoding processes are different from those of Sec.
Sequence Databases and Resources
A total of 260 completely sequenced prokaryotic genomes were downloaded from the NCBI ftp server (ftp.ncbi.nih.gov/genomes/Bacteria). To analyze incompletely sequenced genomes, we used partial genomic sequences (contigs) from the NCBI database of microbial genomes as well as a non-redundant nucleotide database. Both web-based and local BLAST programs (23) were used for sequence analysis (available at ftp.ncbi.nih.gov/blast and www.ncbi.nlm.nih.gov/BLAST).
Identification of Pyl Gene Cluster Homologs and Known Pyl-containing Proteins
pylT and pylS sequences from M. barkeri (GenBank accession number AY064401) were used as queries to search genomic databases for possible homologs with an e value below 0.01. Candidate tRNAPyl sequences were further analyzed to identify structural features associated with known tRNAPyl, such as a 6-bp acceptor stem and a base between the D and acceptor stems (2). Other genes in the Pyl gene cluster (pylB, -C, and -D) were similarly analyzed by comparative sequence analyses. We further examined whether these genes were organized in clusters. A tblastn program with default parameters was used to search for Pyl-containing methylamine methyltransferases in different organisms. Open reading frames (ORFs) and conservation of UAG-flanking regions were then examined manually. Multiple alignments and phylogenetic trees were generated with ClustalW (24).
Analysis of Candidate PYLIS Elements in Methylamine Methyltransferase Genes
Sequences either downstream of in-frame UAG codons or in the putative 3'-UTR of methylamine methyltransferase mRNAs were analyzed manually to search for possible conserved structures and sequence features within these structures. RNA secondary structures were predicted with RNAfold 1.4, which is part of the Vienna RNA package (available at www.tbi.univie.ac.at/~ivo/RNA/) (25).
Analyses of UAG Codon Function
To characterize functions of UAG codons, a homology-based approach was developed and used to analyze UAG-flanking regions in four Pyl-utilizing organisms, M. acetivorans, M. mazei, M. burtonii, and D. hafniense. This procedure was implemented using simple Perl scripts (available upon request). First, genes terminating with UAG were extracted from the original annotation files and extended until the next non-UAG stop signal (UAA/UGA). ORFs translated from the elongated genes were analyzed by tblastn against non-redundant and microbial genome databases. We also screened for conservation of UAG codons in nucleotide sequences and of UAG-flanking regions in protein sequences. This procedure assigned each UAG codon to one of three categories as follows: (i) A UAG was interpreted as a terminator if the elongated sequence was sufficiently long (>30 nucleotides), and all of its homologs had a true stop signal (either a non-UAG terminator in Pyl-utilizing organisms or any termination signal in other organisms) that corresponded to the UAG codon. (ii) A UAG was interpreted as a candidate Pyl codon if the elongation was >30 nt, and >50% of homologs extended beyond the UAG and terminated near the termination site of the elongated sequence. All identified sequences were then analyzed for conservation of UAG in Pyl-utilizing organisms with blastn and blastp. (iii) A UAG was not assigned a function if neither of the two conditions above was satisfied (for example, if we observed short elongations beyond UAG codons, lack of sequence similarity between homologs in regions flanking UAG, or a small number of homologs extending beyond the UAG). A non-Pyl-utilizing archaeon, Methanococcus jannaschii, was also analyzed using the same approach. It served as the control in searches involving Pyl-utilizing archaea.
Analysis of Overlaps between Elongated UAG-containing Genes and Downstream Genes
Overlaps between genes are common in prokaryotic genomes (26).
To examine how extensions of UAG-containing genes relate to the extent of the overlap, we analyzed overlapping genes before and after sequence elongation downstream of predicted stop codons in M. acetivorans and M. mazei. A simple Perl script was developed for this analysis (available upon request). We first identified overlapping genes in the original genome annotations, determined the number of overlaps in each genome, and measured overlap lengths. The longest overlap in a genome was defined as the overlap threshold. We then extended genes terminated at UAG until the next non-UAG stop signal using the approach described above and repeated the overlap analysis procedure. We reasoned that if no significant increase in the number of genes whose overlap was longer than the threshold were observed, the situation would be consistent with the use of UAG as either a terminator or a Pyl codon. However, if the sequence extension procedure generated many genes with large overlaps with the downstream genes, the situation would be consistent with the use of the UAG codon as a terminator. In addition, the gene overlaps involving genes terminated at UAA and UGA codons were analyzed using the same strategy (i.e. before and after extension). These served as controls.
Identification of Genes Associated with Pyl Utilization
All predicted ORFs in M. acetivorans and M. mazei genomes were searched for exclusive occurrence in genomes that utilize Pyl. The tblastn program was used to search these sequences against 260 completely sequenced prokaryotic genomes, the non-redundant nucleotide database, and unfinished microbial contigs with an e value below 0.05. A simple script was developed to parse the tblastn output and examine presence/absence of homologs in analyzed genomes. A pairwise alignment tool, bl2seq, was then used with an e value cutoff set to 0.001 to cluster protein sequences into different families. The occurrence of these proteins in D. hafniense was then analyzed.
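The presence/absence step of this search can be illustrated with a minimal Python sketch. It is not the authors' script (theirs parsed tblastn output and is not reproduced here); the function name `pyl_exclusive_genes`, the input layout, and the toy gene and genome labels are assumptions made for illustration only.

```python
def pyl_exclusive_genes(hits_by_gene, pyl_genomes):
    """Return genes whose homologs occur only in Pyl-utilizing genomes.

    hits_by_gene: dict mapping a query gene to the set of genomes in which
                  a homolog passed the e-value cutoff (hypothetical layout).
    pyl_genomes:  set of genomes that carry the Pyl trait.
    """
    exclusive = []
    for gene, genomes in hits_by_gene.items():
        # Require at least one hit, and no hit outside the Pyl-utilizing set.
        if genomes and genomes <= pyl_genomes:
            exclusive.append(gene)
    return sorted(exclusive)


# Toy input: gene and genome labels are placeholders, not real search results.
pyl_genomes = {"pyl_genome_A", "pyl_genome_B"}
hits = {
    "gene1": {"pyl_genome_A", "pyl_genome_B"},    # only in Pyl-utilizing genomes
    "gene2": {"pyl_genome_A", "other_genome_X"},  # also found elsewhere
    "gene3": set(),                               # no homologs detected
}
exclusive = pyl_exclusive_genes(hits, pyl_genomes)
```

In this toy input only `gene1` is retained, because `gene2` also has a hit outside the Pyl-utilizing set and `gene3` has no hits at all.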
Distribution of Pyl-utilizing Organisms and Pyl-containing Proteins
Available completely and incompletely sequenced prokaryotic genomes were screened for tRNAPyl (pylT) and pyrrolysyl-tRNA synthetase (pylS) sequences, and their patterns of occurrence were compared with those of other Pyl genes (pylB, -C, and -D). We found that the products of pylB (biotin synthase homolog) and pylC (carbamoyl-phosphate synthetase homolog) have close homologs in a wide variety of organisms. In contrast, pylS, pylT, and pylD (nucleoside-diphosphate sugar epimerase homolog) are specific for methanogenic archaea and D. hafniense (Fig. 1). In D. hafniense, pylSn and pylSc encode the N- and C-terminal parts of PylS (2), and small overlaps occur between pylT and pylSc (4 nt) and between the pylB and pylC genes (52 nt). pylS, pylT, and pylD always cluster with pylB and pylC, and the overall Pyl gene cluster has identical mutual organization of these sequences, except that D. hafniense pylSn is located at the end of the cluster (Fig. 1). Thus, the five Pyl genes define the Pyl gene cluster, but only pylT and pylS (and perhaps pylD) sequences can be used as the signature for the Pyl trait.
A search of completely and incompletely sequenced prokaryotic genomes for Pyl genes revealed only six organisms that could utilize Pyl: four members of the genus Methanosarcina, M. burtonii, and D. hafniense. Methanosarcina species and M. burtonii belong to Methanosarcinales, suggesting that Pyl is encoded by a UAG codon in a restricted group of phylogenetically related organisms that occupy a specific environmental niche. High conservation of the Pyl gene cluster and the small number of organisms that utilize Pyl suggest its relatively recent origin. In the six Pyl-utilizing organisms, a total of 29 Pyl-containing methylamine methyltransferase genes were identified (Table I). They are distributed in three enzyme families that do not share significant sequence similarity. Fig. 2 shows the occurrence of these genes in genomes and contigs. Only mtmB genes cluster with the Pyl operon genes (in three Methanosarcina organisms). In M. mazei, two distant duplicate mtmB genes are present. In M. barkeri, two duplicate mtmB1 genes cluster together and are on the opposite strand from the Pyl cluster.
Further analyses of the three methylamine methyltransferase protein families revealed conservation of Pyl in MtmB and MtbB (i.e. no MtmB and MtbB homologs were detected, in which Pyl is replaced with other residues or in which the Pyl-encoding UAG codon is replaced with a non-UAG stop signal). On the other hand, multiple MttB homologs were detected, in which Pyl is not conserved and replaced with various amino acids (Fig. 3). This situation is in contrast to Sec, which is highly conserved. In addition, most selenoproteins have homologs, in which Sec is replaced with cysteine (Cys). In fact, the Sec/Cys pair in homologous sequences is a feature that is used for identification of selenoproteins in genomic databases (21, 22). Phylogenetic analyses of the three methylamine methyltransferase families as well as of the pylT and pylS genes typically placed D. hafniense genes as outliers (Fig. 4). In Pyl-utilizing archaea, all mtmB, mtbB, and mttB genes encode Pyl-containing proteins. Conversely, D. hafniense possessed mttB genes encoding proteins with and without Pyl. MttB homologs that did not have Pyl were broadly distributed in other bacteria. Although the use of Pyl appears to be prevalent in methanogenic archaea, from our data it could not be established with certainty whether the Pyl trait evolved in these organisms or in bacteria. Both Pyl operon sequences and mttB have optimal codon usage in organisms in which they are present (data not shown), arguing against a recent (traceable) lateral transfer of the Pyl trait between methanogenic archaea and D. hafniense.
Analyses of Candidate PYLIS Elements
Functions of UAG Codons in Pyl-utilizing Organisms
In contrast, the proportion of genes that are predicted to terminate at UAG in Pyl-containing archaea is <5.0%. For example, only 126 (including all incorrectly annotated methylamine methyltransferases) of 3,371 genes are predicted to use UAG terminator in M. mazei. This value is much lower than the proportion of UAA or UGA terminators, suggesting that UAG might be functionally distinct from UAA and UGA in these archaea.
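As a rough illustration of how such terminator proportions can be tallied, the following Python sketch counts the annotated stop codon of each coding sequence and recomputes the M. mazei figure quoted above (126 of 3,371 genes). The helper names and the toy sequences are hypothetical; only the two M. mazei counts come from the text.

```python
from collections import Counter

STOP_CODONS = ("TAA", "TAG", "TGA")  # UAA, UAG, UGA in the DNA alphabet


def stop_codon_usage(cds_list):
    """Count the terminal codon of each coding sequence (stop included)."""
    counts = Counter()
    for cds in cds_list:
        last = cds[-3:].upper().replace("U", "T")
        if last in STOP_CODONS:
            counts[last] += 1
    return counts


def fraction(count, total):
    """Share of genes using a given terminator; 0.0 for an empty genome."""
    return count / total if total else 0.0


# Toy coding sequences (invented), each ending in its annotated stop codon.
toy_cds = ["ATGAAATAG", "ATGCCCTAA", "ATGGGGTGA", "ATGTTTTAA"]
usage = stop_codon_usage(toy_cds)

# Counts reported in the text for M. mazei: 126 UAG-terminated genes of 3,371.
mazei_uag_fraction = fraction(126, 3371)  # roughly 3.7%, i.e. below 5%
```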
We further used a homology-based search strategy to examine the coding potential of UAG codons as an additional test to characterize the function of these codons in archaea. Our approach was to extend the reading frames of all genes predicted to contain UAG codons until the next non-UAG stop signal. For each elongated UAG-containing gene, the tblastn program was used to identify candidate homologs in other organisms and to examine the conservation of UAG-flanking regions within translated sequences. We reasoned that if a sequence is sufficiently extended (>30 nt) beyond the UAG, and all of its homologs in other organisms have true stop signals that correspond to the UAG codon (that is, only the sequence upstream of UAG is conserved, whereas sequence similarity is absent downstream of UAG), the UAG should be a terminator (Fig. 5A). Using this strategy, we could reliably identify UGA stop signals in Sec-utilizing organisms and distinguish them from UGA codons for Sec (data not shown). Conversely, if the extension is long (>30 nt), and most candidate homologs extend beyond the UAG to end near (or after) the site corresponding to the non-UAG stop codon in the elongated sequence, the UAG codon in the sequence of interest is considered a Pyl codon candidate. In addition to testing UAG function, this strategy could also be used for identification, in Pyl-encoding organisms, of candidate Pyl-containing proteins. To avoid the possibility of dealing with a sequencing error or a pseudogene, we required the presence of sequences encoding a candidate Pyl-containing protein in two or more genomes of Pyl-utilizing organisms (Fig. 5B). In this case, methylamine methyltransferases served as true positives, because they can be extended beyond their Pyl UAG codons, share homology with other proteins in sequences downstream of their UAG codons, and occur in at least four Pyl-utilizing organisms as Pyl-containing forms (Fig. 3). In other situations (see Fig. 5C for specific examples), we could not distinguish between Pyl-encoding functions and stop signals. However, if the UAG was followed by a nearby stop codon, either Pyl insertion or translation termination could presumably be tolerated. Surprisingly, among all the genes with predicted in-frame UAG codons in the Pyl-utilizing archaea, we could not detect a single unambiguous candidate containing UAG as its terminator (Table III). Instead, we found that in most genes either UAG codons are followed by additional nearby non-UAG stop signals (40.3% of UAGs), or UAG-containing genes can be extended to generate conserved sequence alignments downstream of UAGs (24.3%). The former situations cannot distinguish between termination and Pyl insertion, whereas the latter correspond to candidate Pyl-containing proteins.
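The decision rule described above can be sketched in a few lines of Python. This is an illustrative reconstruction, not the authors' Perl implementation: the function names, the boolean summary of each homolog, and the reading of "most" as more than half (the cutoff given in the methods) are assumptions, and the homology evidence that would come from tblastn is reduced here to a simple list.

```python
NON_UAG_STOPS = ("TAA", "TGA")  # UAA and UGA in the DNA alphabet


def extension_past_uag(seq, uag_pos):
    """Nucleotides read in frame after the UAG at uag_pos (0-based) before
    the next UAA/UGA. Later UAG codons are read through. Returns None if
    no non-UAG stop is found in frame."""
    seq = seq.upper().replace("U", "T")
    start = uag_pos + 3
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in NON_UAG_STOPS:
            return i - start
    return None


def classify_uag(extension_nt, homolog_extends, n_pyl_genomes):
    """Assign a UAG to 'terminator', 'candidate_pyl', or 'unassigned'.

    extension_nt:    length of the extension beyond the UAG (nt), or None.
    homolog_extends: one bool per homolog, True if that homolog continues
                     past the UAG position (assumed summary of alignments).
    n_pyl_genomes:   number of Pyl-utilizing genomes encoding the candidate.
    """
    if extension_nt is None or extension_nt <= 30 or not homolog_extends:
        return "unassigned"  # short extension or no homology evidence
    extending = sum(homolog_extends)
    if extending == 0:
        return "terminator"  # every homolog stops where the UAG is
    if extending / len(homolog_extends) > 0.5 and n_pyl_genomes >= 2:
        return "candidate_pyl"
    return "unassigned"


# Toy sequence: ATG AAA TAG GGG TAG CCC TGA (the second UAG is read through).
toy_extension = extension_past_uag("ATGAAATAGGGGTAGCCCTGA", 6)
toy_call = classify_uag(45, [True, True, False], 2)
```

Collapsing each homolog to a single boolean hides alignment quality and the exact position at which a homolog ends, so a real analysis would need richer per-homolog information than this sketch carries.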
Interestingly, these searches revealed only four protein families, including three known methylamine methyltransferase families and a family of transposases, as candidate Pyl-containing proteins encoded in at least two genomes. UAG-containing forms of transposases were identified in M. acetivorans (4 sequences) and M. mazei (16 sequences), and all had a single UAG codon at the same position. These data suggest that transposases are a family of novel Pyl-containing proteins (Fig. 5B). However, only methylamine methyltransferases were found in three or more genomes, and these enzymes provided the best match to the Pyl trait. This observation suggests that Pyl utilization was conserved during evolution for its use by methylamine methyltransferases. At least in MtmB, Pyl is located at the enzyme active site, where it was suggested to be directly involved in catalysis serving as a strong electrophile (3). It cannot be excluded that additional, species-specific proteins that use catalytic Pyl residue occur in organisms capable of Pyl insertion. However, our searches suggest that the Pyl trait is associated exclusively with methylamine methyltransferases and not with any other family of Pyl-containing proteins. The bias against unambiguous assignment of a fraction of the occurrences of a codeword as terminator that we observed in Pyl-utilizing archaea, was not seen in either Sec-utilizing organisms (data not shown) or the Pyl-decoding bacterium D. hafniense. In D. hafniense, we detected a number of "true" UAG stop signals (221 hits, 19.6% of all UAG codons). In addition, 33 D. hafniense sequences, including one incorrectly annotated mttB gene, have a predicted in-frame UAG codon. However, except for mttB, these are present in single copies and so are likely false positives. Overall, these findings are consistent with the hypothesis that UAG has a dual function in D. hafniense, but no evidence for such a dual function was observed in archaea. 
Using the same approach, we analyzed a non-Pyl-utilizing archaeon, Methanococcus jannaschii, in which UAG is known to have unambiguous stop codon meaning. Like Methanosarcina, M. jannaschii has a small number of UAG codons (UAG:UGA:UAA = 164:222:1343). However, we detected 52 genes (31.7% of all UAGs) in which these UAGs could be classified as true stop signals (Fig. 5A). Only 5 genes showed sequence homology downstream of their UAG, but these were represented by single UAG-containing sequences. Thus, there is a clear difference in the use of UAG codons between Pyl-utilizing and non-Pyl-utilizing archaea (Table III).
Partially overlapping genes are common in prokaryotic genomes (26, 27); however, most overlaps are short. We reasoned that if UAG codons function as terminators, extensions of these sequences downstream of UAGs would result in some overlaps that are abnormally long. In contrast, if UAG codons function in Pyl insertion, the extended sequences would not result in overlaps or would produce mostly short overlaps with the downstream genes.
To examine these possibilities, we analyzed overlaps involving previously annotated genes and genes whose ORFs were extended beyond the annotated stop signals for all three stop codons in M. acetivorans and M. mazei (Table IV). The annotated genes showed a maximum of a 98 nucleotide overlap in M. acetivorans and a 241 nucleotide overlap in M. mazei, and these values were set as thresholds to evaluate the overlaps obtained after the genes were extended to the next stop signal.
We found that, among all UAG-extended genes, 27 M. acetivorans and 43 M. mazei genes overlap with downstream genes that are in different reading frames (genes that overlapped in the same reading frame were excluded as these produced gene fusions rather than overlaps and could correspond to Pyl-containing proteins). However, in M. mazei, the lengths of all overlaps are below the threshold (241 nt). In M. acetivorans, only 3 genes have overlaps longer than the threshold (98 nucleotides), but these (including the longest overlap of 172 nucleotides) are still shorter than the M. mazei threshold. Thus, extension of UAG-containing genes until the next stop signal does not modify the overall composition of the overlapping genes. In contrast, extension of UAA- and UGA-containing genes generated many genes that overlap above the thresholds (e.g. 117 genes for UAA and 195 genes for UGA in M. acetivorans), with some overlaps being above 1000 nt (the longest overlap is 1934 nt for UGA in M. acetivorans, Table IV). Thus, both the number of abnormal gene extensions and the longest observed overlaps involving UGA- and UAA-containing sequences are much higher than those of the UAG-containing sequences, suggesting that the usage of UAA and UGA is functionally distinct from that of UAG in Pyl-utilizing archaea and that read-through of UAG codons should be tolerated better than read-through of UAA or UGA stop codons.
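The threshold comparison used in this analysis can be sketched as follows. The Python below is a simplified illustration, not the authors' Perl script: it assumes inclusive coordinates on a single strand, and apart from the two thresholds (98 and 241 nt) and the longest M. acetivorans overlap (172 nt), which are taken from the text, the overlap lengths are invented placeholders.

```python
def overlap_length(upstream_end, downstream_start):
    """Nucleotides shared by an extended gene and the next gene downstream,
    using inclusive coordinates on the same strand (0 if they do not overlap)."""
    return max(0, upstream_end - downstream_start + 1)


def count_above_threshold(overlaps, threshold):
    """Number of overlaps strictly longer than the genome's threshold."""
    return sum(1 for length in overlaps if length > threshold)


# Thresholds reported in the text (longest overlap among annotated genes).
ACETIVORANS_THRESHOLD = 98
MAZEI_THRESHOLD = 241

# Hypothetical overlaps after UAG extension; only 172 nt comes from the text.
toy_overlaps = [172, 120, 105, 40, 12]

above_acetivorans = count_above_threshold(toy_overlaps, ACETIVORANS_THRESHOLD)
above_mazei = count_above_threshold(toy_overlaps, MAZEI_THRESHOLD)
example_overlap = overlap_length(upstream_end=1500, downstream_start=1451)
```

With these placeholder values, three overlaps exceed the M. acetivorans threshold and none exceeds the M. mazei threshold, which mirrors the pattern reported for the UAG-extended genes.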
Analysis of Class I Release Factors in Archaea
Several organisms are known, in which stop codons are reassigned to sense codons (34, 35). Although in bacteria (in which there are two semi-specific class I release factors) stop codon reassignment involves loss of one release factor, in eukaryota stop codon reassignment involves changes in the specificity of release factors. For example, in the ciliate Euplotes, UGA was reassigned to encode Cys (36), and it has been shown that its RF1 does not recognize UGA stop codons (37). Interestingly, there are also two non-identical RF1s in Euplotes (38, 39). An attractive hypothesis is that codon reassignment involves extensive mutational alteration of release factors, including duplication of their corresponding genes. If so, the fact that there are two release factors in some Pyl-encoding archaea can be also used to support the proposition that UAG function might be partially or fully reassigned from stop to Pyl.
Putative Mechanisms for Discrimination between Pyl Decoding and Stop Signal Functions of UAG Codons
The several lines of evidence described above imply that, at least in Pyl-decoding archaea, UAG codons do not function as standard terminators. Instead, there appears to be either a complete reassignment of UAG codons from stop to Pyl, or a competition between Pyl insertion and translation termination that favors Pyl insertion. Reassignment of codon meaning has been documented for genetic codes of several prokaryotic and eukaryotic organisms, and it is highly common in genetic codes of organelles (34, 40, 41). Most frequently, stop rather than sense codons are reassigned. This is probably less deleterious for a cell, because stop codon usage is lower than the usage of sense codons. In addition, terminators generally do not correspond to crucial parts of the protein, as is the case with many sense codons. In the case of UAG, a complete reassignment of this terminator to a Pyl codon does not seem to result in adverse functional consequences, as there are only a few UAG codons in Pyl-utilizing archaea and many of the encoded proteins would simply be slightly extended at the C terminus if UAG codes for Pyl.
The second possibility is that the meaning of UAG is ambiguous and specifies both termination and Pyl, making UAG the polysemous codon in methanogenic archaea. There is one known example of such ambiguous meaning of a codon. In several members of the Candida species a standard leucine (Leu) CUG codon is translated as serine (Ser).
An attractive possibility is that the competition between termination and Pyl insertion for UAG codons is influenced by physiological stimuli, i.e. this is a regulated competition or a partial reassignment of UAG codons. Considering that only one new family of candidate Pyl-containing proteins (transposases) could be identified, and it occurred only in two archaea, methylamine methyltransferases emerged as the key enzymes that utilize Pyl. MtmB and MtbB families of methylamine methyltransferases strictly conserve Pyl, whereas the occurrence of Pyl-containing mttB methylamine methyltransferases strictly matches that of the Pyl trait. Thus, it seems likely that Pyl insertion is maintained because it is required for use by these enzymes. It is possible that the efficiency with which UAG is decoded as Pyl depends on specific environmental conditions, such as growth on methylamines (Fig. 8). In this case, methylamines may activate Pyl biosynthesis and expression of methylamine methyltransferase genes. Under these conditions, Pyl insertion may out-compete translation termination at many or all UAG codons. However, because methylamine methyltransferases are extremely abundant enzymes (13), Pyl will primarily serve the methyltransferase UAG codons. It is also possible that the rate with which Pyl is inserted at UAG codons is gene-specific, e.g. it depends on the strength or leakiness of the UAG codon as a terminator. It is known that stop signals are extended elements with upstream and downstream sequences contributing to efficiency of decoding, terminating, and read-through functions (45-47). These stop-codon-flanking sequences were reported to cross-link with release factors and influence termination efficiency (48, 49). Thus, the context of UAG codons may be an important feature that influences the outcome of UAG decoding. However, the small number of true Pyl-containing proteins (e.g. methyltransferases) was insufficient for us to make definitive conclusions regarding the presence of sequence signals that flank UAG and discriminate between the two coding functions of UAG codons. If metabolites influence competition between Pyl insertion and termination, in the absence of methylamines, Pyl might not be synthesized, and termination at UAG codons may prevail, because there would be no Pyl-tRNAPyl to compete with termination. If so, many UAG codons could serve as stop signals under certain growth conditions (e.g. no methylamines), but will insert Pyl when methylamines are present. This hypothesis is also consistent with the identical patterns of occurrence of methylamine methyltransferase genes (mtmB and mtbB) and the Pyl cluster.
In any case, the data suggest that there appears to be no need for a highly specific PYLIS element in archaeal Pyl-containing mRNAs. For example, if UAG codons are primarily decoded as Pyl, there might be only a small fraction of truncated forms of methyltransferases due to competition with termination. In addition, insertion of Pyl at other UAG codons should be well tolerated, because this would result in only slight increases in protein masses. At the same time, the small number of UAG-containing genes in Pyl-decoding archaea suggests that Pyl is not a common amino acid in protein in these organisms. Presumably, either utilization of Pyl could affect functions of some proteins, or the flux through the Pyl biosynthetic pathway could not satisfy the demand for this amino acid, if it is to be commonly used. Further research is needed to address these hypotheses.
Searches for Genes Associated with the Pyl Trait
Parallels and Differences in Sec and Pyl Insertion Systems
A hypothesis of a parallel between Sec and Pyl insertion systems was suggested (3, 9, 50) based on similarities at various steps in the two translation pathways. However, our analyses described above argue against this possibility. We further analyzed and compared the known features of Pyl and Sec biosynthesis and insertion (Table VI).
Distribution of Sec and Pyl Traits
Sec is used by approximately a quarter of the prokaryotic organisms that have been characterized by genome sequencing, and by many eukaryotic organisms, with the exception of yeasts and higher plants (1, 13, 21). At least 26 eukaryotic and 25 prokaryotic selenoprotein families are known; most are present in proteins with distinct structures and functions (20, 21). This relatively widespread use of Sec contrasts with the rare occurrence of Pyl. We compared the Sec and Pyl decoding traits and found that only one organism, D. hafniense, utilizes both rare amino acids. Searches against all known selenoproteins identified only one selenoprotein, a formate dehydrogenase subunit, in D. hafniense (data not shown). Formate dehydrogenases contain one Sec, which is present in the active site, coordinates molybdenum, and is directly involved in oxidation of formate to carbon dioxide (51, 52). Thus, it appears that D. hafniense has only one Sec residue and one Pyl residue in its set of proteins. Although the use of Pyl and Sec is limited in D. hafniense, this organism appears to use all 22 natural amino acids currently known. The use of both Sec and Pyl by this organism is clearly supported by the presence of the corresponding biosynthetic and insertion machineries.
Comparison of Sec and Pyl Biosynthetic Pathways
One distinctive feature of Sec biosynthesis is that it occurs on tRNASec (the selC gene product) (53, 54). The tRNASec has features that distinguish it from canonical tRNAs, including its length (typically 90 nucleotides; longer than any other tRNA), few post-transcriptional modifications, an unusually long variable arm, and the presence of 13 nucleotides in the acceptor and T [...]
Comparison of Sec and Pyl Insertion Pathways
The mechanism of Sec insertion in response to UGA has been most thoroughly elucidated in E. coli (53, 61, 62). Sec-tRNASec forms a complex with the Sec-specific elongation factor SelB and GTP, and subsequently binds the SECIS element within ribosome-bound mRNAs (10). The resulting quaternary complex directs the insertion of Sec at in-frame UGA codons (54, 63, 64). The mechanism of Sec insertion in archaea and eukaryotes differs from that in bacteria in two respects: (i) in bacteria, the SECIS element is located immediately downstream of the in-frame UGA codon, but it is present in 3'-UTRs in archaea and eukaryotes; (ii) SelB in bacteria binds both Sec-tRNASec and SECIS, whereas in eukaryotes (and possibly archaea), SelB (EFSec) binds Sec-tRNASec and is associated with SECIS via SECIS-binding protein 2 (SBP2) (13, 65, 66). Release factor RF2 can recognize UGA efficiently if any component of the Sec incorporation machinery is missing (45). Understanding of the mechanism of Pyl insertion has lagged behind that of Sec. As discussed above, we could neither identify true UAG stop codons nor find conserved stem-loop structures (PYLIS) immediately downstream of the in-frame UAG codons or in UTRs of methyltransferase genes in methanogenic archaea. Our data, viewed as a whole, argue for differences between the Sec and Pyl insertion pathways.
The nature of the processes that extend the universal genetic code is still being debated. In recent years, many non-standard amino acids in proteins have been identified (67–69), but almost all are formed by post-translational modifications. Sec and Pyl are the only two exceptions identified to date. Both are encoded by canonical stop codons using specific tRNAs. These properties may have evolutionary significance in regard to the extension of the genetic code (8, 50). In this study, various genomic sequences were scanned to identify and scrutinize Pyl-utilizing organisms and Pyl-containing proteins. Analysis of the distribution of stop codons revealed that UAG is a rare codon in Pyl-encoding archaea, but common in the bacterium D. hafniense. It appears that in D. hafniense UAG codons must serve two functions, stop and Pyl insertion. However, having only a single Pyl-containing protein in this organism precludes computational searches for possible cis-elements that modify the function of the UAG codon. In contrast, in Pyl-utilizing archaea, the lack of unambiguous candidates for UAG stop codons, the relatively large proportion of genes that could be extended only slightly beyond UAG, and the acceptable overlaps for extended UAG-containing genes suggest that UAG codons might be decoded as Pyl in many or all UAG-containing genes. These features also suggest that insertion of Pyl could be well tolerated in these organisms, even though the number of proteins that require Pyl for their function is small and might be limited to methylamine methyltransferases. Thus, although both Sec and Pyl insertion may compete with termination, there is a difference between the decoding strategies employed by Pyl and Sec, with Pyl resembling the standard amino acids and Sec having its own biosynthetic and decoding mechanisms.
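The stop codon distribution referred to above can be tallied along the following lines. This is a minimal sketch, assuming annotated coding sequences are supplied as DNA strings that end with their terminal codon; `stop_codon_usage` is a hypothetical helper, not the software used for the analyses reported here.

```python
from typing import Dict, Iterable

def stop_codon_usage(cds_list: Iterable[str]) -> Dict[str, float]:
    """Return the fraction of coding sequences ending in TAA, TAG, or TGA.

    Sequences whose length is not a multiple of three, or that do not end
    in a canonical stop codon, are skipped.
    """
    counts = {"TAA": 0, "TAG": 0, "TGA": 0}
    for cds in cds_list:
        last = cds.upper()[-3:]
        if len(cds) % 3 == 0 and last in counts:
            counts[last] += 1
    total = sum(counts.values())
    # Report fractions so genomes of different sizes can be compared.
    return {codon: (n / total if total else 0.0) for codon, n in counts.items()}

usage = stop_codon_usage(["ATGTAA", "ATGGCTTGA", "ATGTAA", "ATGTAG"])
print(usage)
```

A markedly low TAG fraction in one genome relative to others would correspond to the rare-UAG pattern described for Pyl-utilizing archaea.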
Our findings are also consistent with analyses of release factor sequences in archaea as well as with the unsuccessful search for PYLIS elements in methylamine methyltransferase genes. If some discrimination between Pyl decoding and stop signal functions of UAG codons is needed, it could potentially be provided by the context of the UAG codons or structural elements that inhibit termination thus allowing Pyl insertion. Alternatively, UAG codons could serve as Pyl codons under conditions that stimulate Pyl synthesis (e.g. during growth of cells on methylamines) by out-competing the weak terminator function of UAG. However, in situations where methyltransferase genes are not required, Pyl biosynthesis may be inhibited resulting in depletion of the pool of Pyl-tRNApyl, which would favor termination at UAG codons. For most genes, Pyl insertion or termination may be nearly equivalent, as gene products would only slightly differ in C-terminal regions. Combined with the fact that only a small number of genes utilize UAG codons in archaea, this change in UAG coding function could be well tolerated by these organisms.
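The competition between Pyl insertion and termination can be made concrete with a toy calculation. This sketch is an illustration only, not a model fitted to data: it assumes that each UAG codon is decoded independently, and that the chance of Pyl insertion equals the ratio of a Pyl insertion rate to the sum of the insertion and termination rates. The function name and both rate parameters are hypothetical.

```python
def full_length_fraction(rate_pyl: float, rate_term: float, n_uag: int = 1) -> float:
    """Fraction of translation events that read through all n_uag UAG codons.

    Assumes (for illustration) independent competition at each UAG, with
    read-through probability rate_pyl / (rate_pyl + rate_term).
    """
    if rate_pyl < 0 or rate_term < 0 or rate_pyl + rate_term == 0:
        raise ValueError("rates must be non-negative and not both zero")
    p_readthrough = rate_pyl / (rate_pyl + rate_term)
    return p_readthrough ** n_uag

# Abundant Pyl-tRNA versus a depleted pool, in arbitrary rate units.
print(full_length_fraction(9.0, 1.0))
print(full_length_fraction(1.0, 9.0))
```

Under these assumptions, lowering `rate_pyl` (for example, when Pyl biosynthesis is inhibited) shifts the outcome toward termination, which is the qualitative behavior proposed in the text.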
* This work was supported in part by National Institutes of Health (NIH) Grant GM061603 (to V. N. G.). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
|| Supported by NIH Grant GM48152 and an award from Science Foundation Ireland. ** To whom correspondence should be addressed: Dept. of Biochemistry, University of Nebraska, Lincoln, NE 68588-0664. Tel.: 402-472-4948; Fax: 402-472-7842; E-mail: vgladyshev1{at}unl.edu.
1 The abbreviations used are: Pyl, pyrrolysine; Sec, selenocysteine; MtmB, monomethylamine methyltransferase; MtbB, dimethylamine methyltransferase; MttB, trimethylamine methyltransferase; pylT, tRNApyl gene; PylS, pyrrolysyl-tRNA synthetase; SECIS, selenocysteine insertion sequence; PYLIS, pyrrolysine insertion sequence; ORF, open reading frame; UTR, untranslated region; RF1, class I release factor; RF2, release factor 2; selA, Sec synthase gene; SelB, Sec-specific elongation factor; EFSec, eukaryotic Sec-specific elongation factor; selC, tRNASec gene; SelD, selenophosphate synthetase; SBP2, SECIS-binding protein 2; nt, nucleotide(s).
We thank the Research Computing Facility of the University of Nebraska-Lincoln for the use of the Prairiefire Beowulf cluster supercomputer.