Translation initiation from a naturally occurring non-AUG codon in Saccharomyces cerevisiae.

Although previous studies have shown that both the cytoplasmic and mitochondrial activities of glycyl-tRNA synthetase are provided by a single gene, GRS1, in the yeast Saccharomyces cerevisiae, the mechanism by which this occurs remains unclear. Evidence presented here indicates that this bifunctional property results from two distinct translational products alternatively generated from a single transcript of this gene. Except for an amino-terminal 23-amino acid extension, the two isoforms have the same polypeptide sequence and function exclusively in their respective compartments under normal conditions. Reporter gene assays further suggest that this leader peptide can function independently as a mitochondrial targeting signal and plays the major role in the subcellular localization of the isoforms. Additionally, whereas the short protein is translationally initiated from a traditional AUG triplet, the longer isoform is generated from an upstream in-frame UUG codon. To our knowledge, GRS1 is the first example in yeast wherein a functional protein isoform is initiated from a naturally occurring non-AUG codon. The results suggest that non-AUG initiation might be a mechanism existing throughout all kingdoms.

In lower eukaryotes such as Saccharomyces cerevisiae, AUG is the only codon recognized as the translational initiator, and the AUG codon closest to the 5′-end of mRNA preferentially serves as the start site for translation (1,2). If the first AUG triplet is mutated, initiation can begin at the next available AUG on the message. The same rules apply to virtually all eukaryotes. However, there are many examples of cellular and viral mRNAs that can use "non-AUG" codons differing from AUG by a single nucleotide as the translation start site (3). The relatively weak base pairing between a non-AUG codon and the anticodon of initiator tRNA appears to be compensated for by interactions with nearby nucleotides, in particular a purine in position −3 and a G in position +4 (4,5). Stated differently, mutations in the sequence surrounding the first AUG can lead to its inefficient utilization as initiator and subsequent use of the next available AUG in a better context. Additionally, a stable hairpin structure with a predicted stability of around −19 kcal/mol located 12-15 nucleotides downstream from the initiator can also facilitate recognition of a weak start site (e.g. a non-AUG codon or an AUG in a suboptimal context) by the 40 S ribosomal subunit (6). In contrast, there seems to be little sequence context effect in yeast (7), and the positive benefits of downstream secondary structures have not yet been clearly addressed. Perhaps for these reasons, yeast cannot efficiently use non-AUG codons as translation start sites (8). So far, examples of native non-AUG initiation have not been reported in the model yeast S. cerevisiae.
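The context rule mentioned above (a purine at position −3 and a G at position +4, counting the first base of the candidate codon as +1) can be illustrated with a minimal sketch. The function name `start_context` and the two toy mRNA strings are hypothetical and are not taken from GRS1; the sketch only reports whether the two positions match the rule and makes no claim about actual initiation efficiency.

```python
PURINES = {"A", "G"}


def start_context(mrna: str, start: int) -> dict:
    """Describe the sequence context of a candidate start codon.

    `start` is the 0-based index of the first base of the codon (position +1).
    Position -3 is three bases upstream of it; position +4 is the base
    immediately following the codon.
    """
    minus3 = mrna[start - 3] if start >= 3 else None
    plus4 = mrna[start + 3] if start + 3 < len(mrna) else None
    return {
        "codon": mrna[start:start + 3],
        "purine_at_minus3": minus3 in PURINES,
        "g_at_plus4": plus4 == "G",
    }


# Toy examples (hypothetical sequences, for illustration only).
favorable = start_context("GCCACCAUGG", 6)    # AUG with A at -3 and G at +4
unfavorable = start_context("GCCUCCCUGU", 6)  # CUG with U at -3 and U at +4
```

In this sketch a missing upstream or downstream base is simply reported as not matching, which is a simplifying choice rather than a biological statement.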
In prokaryotes, there are typically 20 aminoacyl-tRNA synthetases, one for each amino acid (9-12). In eukaryotes, protein synthesis occurs not only in the cytoplasm, but also in organelles, such as mitochondria and chloroplasts (13). Compartmentalization of the protein synthesis machinery within the cytoplasm and organelles of eukaryotes leads to isoaccepting tRNA species that are distinguished by nucleotide sequence, subcellular location, and enzyme specificity. Thus, most eukaryotes, such as yeast, commonly have two genes that encode distinct sets of proteins for each aminoacylation activity; one functions in the cytoplasm and the other in the mitochondria. Each set aminoacylates the isoaccepting tRNAs within its respective cell compartment, and is sequestered from the isoacceptors in other compartments. As exceptions to this rule, two S. cerevisiae genes, HTS1 (coding for histidyl-tRNA synthetase) (14) and VAS1 (coding for valyl-tRNA synthetase) (15), specify both the cytoplasmic and mitochondrial forms of their respective enzymes. mRNAs with distinct 5′-ends are alternatively transcribed from each of these genes. The mitochondrial form of the enzyme is translated from the first initiation AUG on the "long" message, whereas the cytosolic form is translated from the second in-frame AUG on the "short" message. These two isoforms cannot substitute for each other because of their different localization (15). Similar observations have been made for the Arabidopsis thaliana genes that encode alanyl-tRNA synthetase, threonyl-tRNA synthetase, and valyl-tRNA synthetase (16). It should be noted that, except in some algae, all aminoacyl-tRNA synthetases are encoded by nuclear genes and imported post-translationally into their respective compartments.
As with most yeast tRNA synthetases, two homologous nuclear genes encoding glycyl-tRNA synthetase (GlyRS), GRS1 and GRS2, have been identified in S. cerevisiae. Paradoxically, GRS1 provides both cytoplasmic and mitochondrial functions, whereas GRS2 appears to be nonfunctional (17). However, unlike HTS1 and VAS1, only one appropriate AUG triplet was identified in the 5′ region of its open reading frame, leading to the assumption that the translational product initiated from this start site was a bifunctional enzyme (17). A similar example in which a single translation product is responsible for both mitochondrial and cytoplasmic functions is the yeast fumarase gene FUM1 (18,19). In this instance, all fumarase molecules are targeted and processed in mitochondria before part of the products are sent by retrograde movement back to the cytosol (20). We wondered if GRS1 followed a similar pathway, or if a different mechanism was involved. In the work described here, we cloned the GRS1 gene and investigated its bifunctional phenotype in relation to the mechanism of its translational control. To our surprise, two protein isoforms of different size are alternatively translated from this gene. Even more remarkable is the fact that one of the two protein isoforms is actually initiated from a UUG triplet, a codon that was believed inappropriate for initiation in yeast. The implications of this novel mechanism of translation initiation, particularly in yeast, are further discussed.

EXPERIMENTAL PROCEDURES
Construction of Plasmids-Cloning of GRS1 from S. cerevisiae followed standard protocols. The wild-type GRS1 (designated as wt GRS1) sequence was amplified by PCR as a 2.4-kb EagI-SalI fragment (a 2.0-kb open reading frame plus 0.4 kb of upstream DNA) with appropriate oligonucleotides. The PCR products were digested with EagI and SalI prior to ligation into EagI/SalI-digested pRS315, a low-copy number yeast shuttle vector (21). Various point mutations, such as M1T (ATG 1 to ACT), OF (out-of-frame mutation of AAA −7 to TAAA), I(−18)L (ATC −18 to CTC), L(−23)L (TTG −23 to TTA), and E(−25)stop (GAA −25 to TAA), were subsequently introduced into the wild-type clone following standard protocols. The numbers "−7," "−18," "−23," and "−25" in these constructs refer to the −7, −18, −23, and −25 codon triplets that precede ATG 1, respectively. ADH/GRS1Δ2-12 was constructed as previously described (17). For construction of ADH/GRS1, the open reading frame (extending from base pair −3 to the stop codon) of GRS1 was amplified by PCR as an EagI-SalI fragment and cloned in a high-copy number plasmid pADH1 (22). Cloning of ADH/GRS1 −88 Δ2-12 (extending from base pair −88 to the stop codon but lacking codons 2-12) in pADH1 followed a similar protocol.
To construct H-GRS1, a short DNA duplex coding for a putative RNA hairpin structure was inserted into position −88 of a wt GRS1 construct. Briefly, a GRS1 coding sequence extending from base pair −88 to the stop codon was amplified by PCR as an EagI-SalI fragment and cloned into pRS315, yielding GRS1 −88. Two partially complementary oligonucleotides, 5′-GGCCAGCGTGCGGGCATCTAGCCCGCACGCATATGC-3′ and 5′-GGCCGCATATGCGTGCGGGCTAGATGCCCGCACGCT-3′, were annealed at equal molar concentration and then inserted into the EagI site in GRS1 −88. Only fusions that carry an EagI site at the 5′ junction of the duplex were selected as templates for the subsequent construction. A DNA sequence containing base pairs −300 to −88 of GRS1 was PCR amplified and inserted into the EagI site located at the 5′-end of the duplex to give H-GRS1. The putative stem-loop structure deduced from the duplex, where 5′-GCGUGCGGGC-3′ pairs with 5′-GCCCGCACGC-3′, has a predicted stability of about −24 kcal/mol.
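The complementarity of the two oligonucleotides and of the two stem arms can be checked with a short sketch. It assumes that the hyphens inside the printed oligonucleotide sequences are line-break artifacts and that the leading GGCC of each oligonucleotide is left single-stranded as a cohesive end; the helper name `reverse_complement` is illustrative.

```python
def reverse_complement(seq: str, rna: bool = False) -> str:
    """Return the reverse complement of a DNA (default) or RNA sequence."""
    table = str.maketrans("ACGU", "UGCA") if rna else str.maketrans("ACGT", "TGCA")
    return seq.translate(table)[::-1]


# Oligonucleotides as given in the text, written 5'->3' without hyphens.
OLIGO_1 = "GGCCAGCGTGCGGGCATCTAGCCCGCACGCATATGC"
OLIGO_2 = "GGCCGCATATGCGTGCGGGCTAGATGCCCGCACGCT"
OVERHANG = "GGCC"  # assumed single-stranded 5' end on each strand

# The 32 nucleotides following each overhang should base pair with each other.
duplex_ok = OLIGO_1[len(OVERHANG):] == reverse_complement(OLIGO_2[len(OVERHANG):])

# The two arms of the putative RNA stem should be reverse complements.
stem_ok = reverse_complement("GCGUGCGGGC", rna=True) == "GCCCGCACGC"
```

Both checks evaluate to True for the sequences above, which is consistent with a 10-bp stem. The sketch says nothing about the predicted folding energy.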
For construction of various preVAS1-GRS1 fusions, a GRS1 sequence extending from base pair −12 to the stop codon was amplified by PCR, digested, and then ligated into pRS315. Subsequently, a DNA sequence containing base pairs −300 to +138 relative to ATG 1 of VAS1 was amplified by PCR as a SacI-EagI fragment with appropriate oligonucleotides, and then cloned into the SacI-EagI site preceding the GRS1 sequence, yielding preVAS1-GRS1. A similar approach was used to construct preVAS1 M1A -GRS1 and preVAS1-GRS1 M1T, which contain a mutation (ATG 1 to GCG) in the preVAS1 portion and a mutation (ATG 1 to ACT) in the GRS1 portion, respectively.
Construction of wild-type VAS1 (designated as wt VAS1) and VAS1c has been described previously (22). For construction of preGRS1-VAS1c, a DNA sequence containing base pairs −300 to −1 relative to ATG 1 of GRS1 was amplified by PCR and used to substitute for the corresponding EagI-NdeI fragment in VAS1c, yielding preGRS1-VAS1c. preGRS1 L(−23)L -VAS1c, which contains a silent mutation (TTG −23 to TTA) in the preGRS1 portion, was constructed following a similar protocol.
Mapping the 5′-Ends of GRS1 Transcripts-Identification of the 5′-ends of GRS1 transcripts was carried out with 5′-rapid amplification of cDNA ends (RACE, Invitrogen). Briefly, the 5′-terminal sequences of GRS1 mRNAs were transcribed with Moloney murine leukemia virus reverse transcriptase into first strand cDNAs using an "antisense" GRS1-specific primer that was annealed to a region 400 bp downstream of ATG 1. The first strand cDNA products were purified and then tailed at their 3′-ends with dCTP using terminal deoxynucleotidyl transferase. The tailed cDNAs were then amplified via PCR using Taq DNA polymerase with a deoxyinosine-containing anchor primer (provided by the manufacturer) annealed to the 5′-end of the cDNA, and a nested GRS1-specific primer annealed 360 bp downstream of ATG 1. Following PCR-driven amplification, the double-stranded cDNA products were cloned and sequenced.
Complementation Assays for the Cytoplasmic Function of GRS1-Complementation assays were performed using a yeast GRS1 knockout strain, RJT3/II-1 (17), which carries a maintenance plasmid that provides both cytoplasmic and mitochondrial GlyRS functions. Assays for the cytoplasmic function of plasmid-borne GRS1 and derivatives were carried out by introducing the test plasmid into RJT3/II-1 and determining the ability of transformants to grow in the presence of 5-fluoroorotic acid (5-FOA). The cultures were incubated at 30°C for 3-5 days or until colonies appeared. (Photos for the complementation assays were taken at day 3 following incubation.) The transformants evicted the maintenance plasmid with a URA3 marker in the presence of 5-FOA. Thus, only an enzyme with the cytoplasmic GlyRS activity encoded by the test plasmid (carrying a LEU2 marker) could rescue the growth defect of RJT3/II-1 on 5-FOA.
Complementation Assays for the Mitochondrial Function of GRS1-To test the mitochondrial function of plasmid-borne GRS1 derivatives, RJT3/II-1 was cotransformed with a test plasmid (carrying a LEU2 marker) and a second maintenance plasmid (carrying a TRP1 marker) that expresses a functional cytoplasmic GlyRS, but is defective in mitochondrial GlyRS activity. (The second maintenance plasmid used contained GRS1 OF cloned in pRS314.) In the presence of 5-FOA, the first maintenance plasmid (containing a URA3 marker) was evicted from the cotransformants, whereas the second maintenance plasmid was retained. Thus, all cotransformants survived 5-FOA selections because of the presence of cytoplasmic GlyRS derived from the second maintenance plasmid. The cotransformants were further tested on YPG plates for their mitochondrial phenotypes at 30°C, with results documented at day 3 following plating. Because a yeast cell cannot survive on glycerol without functional mitochondria, the cotransformants do not grow on YPG plates unless a functional mitochondrial GlyRS is present. Complementation assays for various VAS1 constructs were conducted essentially the same way as those for GRS1 constructs, except that a VAS1 knockout strain (designated as CW1) (22) was used as the test strain.
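The readout logic of the two complementation assays can be summarized in a small sketch. The class and function names are hypothetical, and the model is deliberately simplified: each construct is reduced to two booleans, and growth is treated as all-or-none. The three example constructs use the phenotypes reported for them in the Results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Construct:
    """A test plasmid reduced to the two GlyRS functions it can supply."""
    name: str
    cytoplasmic: bool
    mitochondrial: bool


def grows_on_5foa(test: Construct) -> bool:
    """Cytoplasmic assay: the URA3 maintenance plasmid is evicted on 5-FOA,
    so survival depends on cytoplasmic GlyRS from the test plasmid."""
    return test.cytoplasmic


def grows_on_ypg(test: Construct, second_plasmid_cytoplasmic: bool = True) -> bool:
    """Mitochondrial assay: a second maintenance plasmid supplies cytoplasmic
    GlyRS only, so growth on glycerol depends on mitochondrial GlyRS from the
    test plasmid."""
    has_cytoplasmic = test.cytoplasmic or second_plasmid_cytoplasmic
    return has_cytoplasmic and test.mitochondrial


WT = Construct("wt GRS1", cytoplasmic=True, mitochondrial=True)
M1T = Construct("GRS1 M1T", cytoplasmic=False, mitochondrial=True)
OF = Construct("GRS1 OF", cytoplasmic=True, mitochondrial=False)

phenotypes = {c.name: (grows_on_5foa(c), grows_on_ypg(c)) for c in (WT, M1T, OF)}
```

The resulting dictionary maps each construct to a (5-FOA, YPG) growth pair, which is how the schematic summaries in the figures can be read.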

RESULTS
Mapping the 5′-Ends of GRS1 Transcripts-It was previously shown that a single GRS1 gene encodes both cytoplasmic and mitochondrial glycyl-tRNA synthetase activities in S. cerevisiae (17). However, unlike HTS1 and VAS1, no alternative in-frame AUG codon was found in the sequence upstream of the putative open reading frame (as diagramed in Fig. 1A). Interestingly, a long stretch of uninterrupted coding sequence was found between the nearest upstream stop codon, UAA −57, and the putative translation initiation codon, AUG 1. (The number "−57" in UAA −57 refers to the −57 codon that precedes AUG 1.) To gain insight into the mechanism of GRS1 expression, the in vivo transcription profiles of this gene were examined. The transcription start sites and 5′-terminal nucleotide sequences of GRS1 transcripts were determined using a technique known as 5′-RACE. Following reverse transcription and PCR-driven amplification, the double-stranded cDNA products were cloned and sequenced. Remarkably, only one cDNA product was identified on a 10% polyacrylamide gel (Fig. 1B), with its 5′-end mapped at position −88 relative to the A of AUG 1 (Fig. 1A). In addition, the sequence of the cDNA (~400 bp determined) was identical to that of the corresponding segment in genomic DNA. Thus, the possibility of splicing occurring at the 5′-end of GRS1 mRNA could be ruled out.
A Second Isoform of GlyRS Generated from GRS1-A previous study suggested that an amino-terminal segment of GlyRS might direct its mitochondrial localization, because deletion of codons 2-12 from this gene selectively impaired its mitochondrial activity (17) (ADH/GRS1Δ2-12 in Fig. 2). However, these phenotypes were observed with protein expression under the control of a constitutive ADH promoter in a high-copy number vector, and might not faithfully reflect normal conditions in vivo. To mimic physiological conditions, various GRS1 sequences with the native promoter were cloned into a low-copy number yeast shuttle vector, pRS315, for further studies. (Normal conditions are referred to here as protein expression from a low-copy number vector under the control of its native promoter.) We reasoned that if AUG 1 is the only initiator for GRS1, a missense mutation at this codon would impair the synthesis (and thus function) of both the cytoplasmic and mitochondrial enzymes, whereas a nonsense or an out-of-frame mutation placed upstream of AUG 1 would have little effect on either function. As shown in Fig. 2, wt GRS1 allowed colony formation on both 5-FOA (Fig. 2B) and YPG (Fig. 2C) plates, indicating the production of functional cytoplasmic and mitochondrial GlyRSs. When AUG 1 was mutated to ACU, the resultant construct, GRS1 M1T, lost the cytoplasmic function (Fig. 2B) but retained the mitochondrial function (Fig. 2C), suggesting that AUG 1 serves as the translation initiator for the cytoplasmic form, but not the mitochondrial form. In contrast, when an out-of-frame mutation (AAA −7 to UAAA) was introduced 5′ to AUG 1, the resultant construct, GRS1 OF, lost the mitochondrial activity (Fig. 2C) but retained the cytoplasmic function (Fig. 2B), suggesting that the mitochondrial precursor is translationally initiated at a site upstream of AAA −7. (Note that this out-of-frame mutation not only caused an out-of-frame reading, but also led to a stop codon at this site.)
Thus, the mitochondrial function of GlyRS is provided by a distinct isoform initiated at a site 5Ј to AUG 1 .
To test this assumption further, we inserted the presequence back into construct ADH/GRS1Δ2-12 and tested for its effect. Consistent with the data mentioned above, the resultant construct, ADH/GRS1 −88 Δ2-12, restored the mitochondrial function of ADH/GRS1Δ2-12. However, when an intact open reading frame of GRS1 without the presequence was expressed from the ADH promoter, the overexpressed GlyRS rescued both the cytoplasmic and mitochondrial defects of the grs1 − strain (Fig. 2), suggesting that codons 2-12 are important for the mitochondrial phenotype of the "overexpressed" GlyRS molecules (compare ADH/GRS1 and ADH/GRS1Δ2-12 in Fig. 2, B and C). Thus, residues 2-12 appear to constitute a weak, cryptic transit signal that normally does not play a role in mitochondrial import, but can be recruited to do so when overexpressed.
FIG. 1. Mapping the 5′-ends of GRS1 transcripts. A, the 5′-end DNA sequence of GRS1, including base pairs −180 to +120 or codons −60 to +40 relative to ATG 1. The transcription start site of GRS1 is labeled on top of the sequence. ATG 1 and the nearest upstream stop codon TAA −57 (designated as *) are in bold. The amino acid residues deduced from the presequence between TTT −56 and AGA −1 are underlined. Diverging arrows highlight the potential formation of a hairpin structure upstream from the ATG start site. The numbers "−57, −56, and −1" in TAA −57, TTT −56, and AGA −1 refer to the −57, −56, and −1 codons that precede ATG 1.

FIG. 2. Complementation of the cytoplasmic and mitochondrial phenotypes of the grs1 − strain was shown by the ability to grow on 5-FOA and YPG plates, respectively. A, a schematic summary of the various GRS1 constructs and their complementation activities. B, complementation assays on a 5-FOA plate. C, complementation assays on a YPG plate. ADH/GRS1Δ2-12 and ADH/GRS1 −88 Δ2-12 each have a deletion in the sequence coding for residues 2-12 and are expressed from an ADH promoter. GRS1 M1T and GRS1 OF have a missense mutation (ATG 1 to ACT) and an out-of-frame mutation (AAA −7 to TAAA), respectively. M1 denotes the native initiator methionine, whereas M* represents an artificial initiator methionine introduced at the 5′-side of codon 13. Deletion is shown as a dashed line. The open and solid boxes denote the ADH and GRS1 sequences, respectively.

Initiation of the Mitochondrial Isoform from a Native Non-AUG Codon of GRS1-Because there is no other in-frame AUG triplet in the sequence 5′ to AUG 1 (Fig. 1), the possibility that a non-AUG codon differing from AUG by one nucleotide serves as the translation start site for the mitochondrial form was investigated. There are 9 such triplets between UAA −57 and AUG 1, including UUG −54, AUC −48, AUU −45, AUU −31, UUG −23, AUC −18, AUU −10, AAG −4, and AUU −3 (Fig. 1).
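The search for in-frame triplets that differ from AUG by one nucleotide can be sketched as follows. The function names are hypothetical, and the short input sequence is a toy example rather than the GRS1 presequence, which is not reproduced in the text. Codon numbers follow the convention used here: the codon immediately preceding AUG 1 is −1.

```python
from typing import List, Tuple


def near_cognate_codons(reference: str = "AUG") -> List[str]:
    """List all triplets that differ from `reference` at exactly one position."""
    codons = []
    for i, ref_base in enumerate(reference):
        for base in "ACGU":
            if base != ref_base:
                codons.append(reference[:i] + base + reference[i + 1:])
    return codons


def in_frame_near_cognates(upstream: str) -> List[Tuple[int, str]]:
    """Scan an in-frame RNA segment that ends just before AUG 1.

    Returns (codon_number, codon) pairs, with codon numbers counted
    backwards from AUG 1 (the last codon of `upstream` is -1).
    """
    if len(upstream) % 3 != 0:
        raise ValueError("upstream segment must contain whole codons")
    candidates = set(near_cognate_codons())
    n_codons = len(upstream) // 3
    hits = []
    for k in range(n_codons):
        codon = upstream[3 * k:3 * k + 3]
        if codon in candidates:
            hits.append((k - n_codons, codon))
    return hits


# Toy five-codon segment (hypothetical, not the GRS1 sequence).
toy_hits = in_frame_near_cognates("UUGGAAAUCAAAAUU")
```

There are nine possible single-nucleotide variants of AUG; this should not be confused with the nine occurrences found in the GRS1 presequence, several of which are the same triplet at different positions.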
To determine whether any of these serves as the initiator, a nonsense mutation was introduced into this region to map its approximate location. As shown in Fig. 3, a nonsense mutation at GAA −25 (resulting in GRS1 E(−25)stop) did not compromise the mitochondrial function of this gene (Fig. 3C), suggesting that the alternative initiator was located downstream of GAA −25. Together with the results shown in Fig. 2, the alternative initiation site must be between GAA −25 and AAA −7. There are three candidates, UUG −23, AUC −18, and AUU −10, located within this region. Among these three codons, AUU −10 is the least likely candidate, because translation initiated at this site would lead to a protein with a very short leader peptide. For this reason, we focused on AUC −18 and UUG −23. Mutation at AUC −18 (resulting in GRS1 I(−18)L) had little effect on the mitochondrial activity (Fig. 3C), whereas mutation at UUG −23 (GRS1 L(−23)L) abolished the mitochondrial activity. To provide further evidence that UUG −23 is the initiator, a stop codon, UAA, was inserted between codons −23 and −22, and the resultant construct (GRS1 −23(stop)−22) was tested for its mitochondrial activity. Consistent with our hypothesis, insertion of a stop codon immediately downstream of UUG −23 specifically knocked out its mitochondrial activity. In contrast, insertion of a stop codon, UAA, between codons −24 and −23 (resulting in GRS1 −24(stop)−23) had no obvious effect on the mitochondrial function. Together, our results suggest that UUG −23 is the translation initiator for the mitochondrial isoform of GlyRS. Note that none of these mutations in the presequence compromised the cytoplasmic function of GRS1 (Fig. 3B).
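The mapping argument is an interval-narrowing exercise: a tolerated stop codon places the initiator downstream of it, and a disruptive stop or frameshift places the initiator upstream of it. A minimal sketch, using the candidate positions listed in the text and a hypothetical helper name, is given below. Stop codons inserted between two codons are represented by half-integer positions, which is a convention of this sketch only.

```python
# Near-cognate triplets between UAA -57 and AUG 1, as listed in the text.
CANDIDATES = {
    -54: "UUG", -48: "AUC", -45: "AUU", -31: "AUU", -23: "UUG",
    -18: "AUC", -10: "AUU", -4: "AAG", -3: "AUU",
}


def candidates_in_window(candidates, tolerated, disruptive):
    """Return candidate codon positions lying strictly between a tolerated
    upstream lesion and a disruptive downstream lesion."""
    return sorted(pos for pos in candidates if tolerated < pos < disruptive)


# Step 1: stop at codon -25 is tolerated; frameshift at codon -7 is disruptive.
first_pass = candidates_in_window(CANDIDATES, tolerated=-25, disruptive=-7)

# Step 2: a stop inserted between -24 and -23 is tolerated, whereas a stop
# inserted between -23 and -22 abolishes mitochondrial function.
second_pass = candidates_in_window(CANDIDATES, tolerated=-23.5, disruptive=-22.5)
```

The first pass leaves the three candidates named above, and the second pass leaves only codon −23, in line with the conclusion drawn from the point mutations.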
To rule out the possibility that mutation at UUG −23 (such as UUG −23 to UUA) caused an unpredicted change in the secondary structure of GRS1 mRNAs, UUG −23 was mutated to a classical AUG translation initiation codon and then assayed for its effect on complementation activity. As expected, the mutation did not compromise the mitochondrial activity of GRS1 (GRS1 L(−23)M in Fig. 3C), suggesting that it is the initiator activity, but not the RNA secondary structure, that was damaged by the mutation of UUG −23 to UUA. In addition, the relative mRNA levels produced from each mutant were assessed by RT-PCR and found to be similar, suggesting that the mutations did not alter mRNA stability.
Partitioning of GlyRS Isoforms Determined by the Signal Peptide-To provide further insight into the mechanism that governs the subcellular localization of the GlyRS isoforms, we asked whether the leader peptide of the precursor form functions independently as a mitochondrial targeting signal, and if the short form of GlyRS could be converted to a mitochondrial enzyme by fusion of a heterologous signal peptide without overexpression. As previously shown, VAS1c, which has its mitochondrial targeting peptide (residues 1-46) deleted, encodes only the cytoplasmic form of valyl-tRNA synthetase and thus cannot rescue the mitochondrial defect of the VAS1 knockout strain CW1 (22). Substitution of the native VAS1 promoter in VAS1c with the GRS1 presequence enabled the fusion to provide both the cytoplasmic and mitochondrial activities (preGRS1-VAS1c in Fig. 4), whereas mutation of the non-AUG initiator in the fusion construct (preGRS1 L(−23)L -VAS1c in Fig. 4) selectively impaired its mitochondrial activity. Thus, the leader peptide is in itself a fully functional mitochondrial targeting signal.
We next tested whether the short form of GlyRS can be converted into a mitochondrial enzyme by a well characterized mitochondrial targeting signal (22). Fig. 5 shows that the leader peptide of valyl-tRNA synthetase could direct the otherwise cytoplasmic glycine enzyme into mitochondria. Further mutagenesis assays confirmed that the mitochondrial and the cytoplasmic functions are provided by proteins initiated from the AUG initiators in the VAS1 and GRS1 portions, respectively (compare preVAS1 M1A -GRS1 and preVAS1-GRS1 M1T in Fig. 5). Thus, the subcellular locations of the GlyRS isoforms depend solely on the presence or absence of an appropriate mitochondrial targeting signal.
Translation of the Isoforms from a Single mRNA-To provide additional evidence that there are no alternative transcripts available solely for translation of the cytoplasmic isoform, and that the two isoforms are indeed generated from the same transcript identified by 5′-RACE, a sequence coding for a stable RNA secondary structure was inserted at the transcription initiation site and tested for its effect on protein synthesis. As shown in Fig. 6, cytoplasmic and mitochondrial functions were simultaneously affected by the putative hairpin structure, suggesting that no alternative mRNAs are available solely for the translation of the cytoplasmic isoform. (A negligible number of small colonies were observed in H-GRS1 on 5-FOA and YPG, probably because of incomplete inhibition of translation.) These data suggested that the GlyRS isoforms are indeed translated from the same transcript.
Western Blot Analyses of Non-AUG-initiated Protein Translation-To confirm our results obtained by functional tests, we investigated the initiating capacity of UUG −23 with a different approach, Western blot. Various GRS1 presequences were fused in-frame to lexA and the expression profiles of the fusion constructs were tested. (These fusions were expressed under the control of a constitutive ADH promoter.) As shown in Fig. 7, the LexA protein produced from the lexA construct was specifically recognized by anti-LexA antibody (compare lanes 1 and 2 in Fig. 7B). Unexpectedly, when the wild-type GRS1 presequence was fused to lexA (resulting in preGRS1-lexA), we observed only one protein band with a molecular mass similar to that of the LexA protein but could not detect the larger fusion protein (lane 3 in Fig. 7B). A similar profile was seen with the initiator mutant preGRS1 L(−23)L -lexA (lane 4 in Fig. 7B). We wondered whether the non-AUG-initiated protein was actually produced under these conditions or if the LexA fusion generated had been processed by matrix processing peptidase in the mitochondria. To distinguish between the two possibilities, the non-AUG initiator in the GRS1 portion (UUG −23) was substituted with a traditional AUG initiator (resulting in preGRS1 L(−23)M -lexA) and tested to see if the fusion protein could be detected. Because this newly introduced AUG was located 5′ to the native AUG initiator of lexA, we expected that this triplet would become the primary translation start site for the fusion gene. Contrary to our anticipation, we did not observe a larger protein band for preGRS1 L(−23)M -lexA (lane 5 in Fig. 7B). Thus, the processed LexA fusions appear to have a molecular mass similar to that of LexA. To eliminate the interference caused by AUG 1 in the lexA portion, we mutated this AUG in some of the constructs used in the following experiment.
FIG. 4. Testing the mitochondrial targeting potential of the non-AUG-initiated leader peptide of GlyRS. A, a schematic summary of the various VAS1 constructs and their complementation activities. B, complementation assays on a 5-FOA plate. C, complementation assays on a YPG plate. preGRS1-VAS1c and preGRS1 L(−23)L -VAS1c contain a wild-type and a mutant (TTG −23 to TTA) GRS1 presequence substituted for the native VAS1 promoter of VAS1c, respectively. Deletion is shown as a dashed line. The striped and solid boxes represent the VAS1 and GRS1 sequences, respectively. Cyt, cytoplasmic; Mit, mitochondrial.

FIG. 5. Converting the cytoplasmic (Cyt) GlyRS into a mitochondrial (Mit) enzyme by a heterologous signal peptide. A, a schematic summary of the various GRS1 constructs and their complementation activities. B, complementation assays on a 5-FOA plate. C, complementation assays on a YPG plate. preVAS1-GRS1 is a fusion construct that has a VAS1 sequence substituted for the sequence 5′ to ATG 1 of GRS1. preVAS1 M1A -GRS1 has a mutation (ATG 1 to GCG) in the VAS1 portion, whereas preVAS1-GRS1 M1T has a mutation (ATG 1 to ACT) in the GRS1 portion. The solid and striped boxes indicate the GRS1 and VAS1 sequences, respectively.

To compare the efficiency of initiation from various codons more accurately, we used a chemiluminescence-based method. As expected, a protein band with a molecular mass identical to that of LexA was produced from preGRS1-lexA M1T in which the AUG 1 in the lexA portion was mutated (AUG to AGU) (lane 3 in Fig. 7C), suggesting that the GRS1-LexA fusion protein has been efficiently processed. The protein band became much stronger when the UUG was changed to AUG (preGRS1 L(−23)M -lexA M1T, lane 5 in Fig. 7C). In contrast, when UUG −23 was mutated to a codon unsuitable for initiation (preGRS1 L(−23)L -lexA M1T, lane 4 in Fig. 7C), the protein band was reduced dramatically. We surmised that the faint protein band that appeared in lane 4 probably came from one of the non-AUG codons (that differ from AUG by one nucleotide) between UUG −23 and AUG 1. However, these non-AUG initiators are probably too inefficient to serve as functional initiators in complementation assays. To obtain a more accurate estimation of the relative initiation efficiencies, various amounts of total proteins were loaded and compared. As shown in Fig. 7D, the frequency of initiation at UUG −23 is around 5% relative to those at AUG 1 and AUG −23 (compare lanes 1, 5, and 9). Thus, AUG −23 appears to be as efficient as AUG 1 in initiation (compare lanes 5 and 9 in Fig. 7D). In addition, the relative amounts of mRNA expressed from preGRS1-lexA M1T, preGRS1 L(−23)M -lexA M1T, and preGRS1 L(−23)L -lexA were similar (Fig. 7E), suggesting that it is not the stability of the mRNAs that was affected by the mutations.
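One way to turn a loading series into a relative initiation frequency is to normalize each band signal by the amount of total protein loaded and then take the ratio. The sketch below illustrates only this arithmetic. The function name and all numbers are hypothetical, no densitometry values are given in the text, and the calculation assumes that both bands fall within the linear range of detection.

```python
def relative_efficiency(signal_test: float, load_test: float,
                        signal_ref: float, load_ref: float) -> float:
    """Ratio of per-unit-load band signals (test construct / reference)."""
    if load_test <= 0 or load_ref <= 0:
        raise ValueError("loaded amounts must be positive")
    if signal_ref <= 0:
        raise ValueError("reference signal must be positive")
    return (signal_test / load_test) / (signal_ref / load_ref)


# Hypothetical example: bands of equal intensity are obtained when 20 times
# more total protein is loaded for the UUG construct than for the AUG one.
example_ratio = relative_efficiency(signal_test=100.0, load_test=40.0,
                                    signal_ref=100.0, load_ref=2.0)
```

With these illustrative inputs the ratio is 0.05, that is, 5%. Matching band intensities across different loads, as in this example, avoids relying on a single lane pair with very different signal strengths.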
Consistent with our complementation assays (Fig. 6), when the duplex described in the previous section was inserted at the 5′-end of the preGRS1-lexA fusion, synthesis of both the LexA fusion and LexA was impaired (lane 8 in Fig. 7C), suggesting that these two protein species were indeed produced from the same transcript. To gain insight into the mechanisms of UUG initiation, we introduced point mutations at nucleotides −39 and −38 (UA to GC) (resulting in preGRS1 HD -lexA) to destabilize the proposed hairpin structure (5′-AGUAGA-3′ paired with 5′-UUUACU-3′) (Fig. 1A) and tested for their effect on initiation. Contrary to our expectations, the mutations had little effect on protein expression (compare lanes 3 and 7 in Fig. 7C). A GRS1 construct with the same mutations was shown to effectively rescue the growth phenotype of the GRS1 knockout strain on a YPG plate (data not shown). These observations suggest that the putative structure might not be critical for recognition of UUG −23 or the sequence downstream of UUG −23 might assume a structure in vivo different from what we predicted. In addition, when the nucleotide at position −3 relative to UUG −23 was mutated (A to U or C), the mitochondrial function of the resultant GRS1 constructs appeared to be normal (data not shown). We did not change the nucleotide at position +4 relative to UUG −23, because it is already a less favorable nucleotide, T.

DISCUSSION
In the present work, we have discovered a non-AUG translation initiator occurring naturally in the yeast S. cerevisiae. Many cellular and viral messages have been shown to use non-AUG codons, such as ACG, CUG, and AUU, as translation start sites (23). In some cases, non-AUG codons serve as exclusive translation initiators, such as in the mRNAs for the Arabidopsis AGAMOUS gene (24) and the latently expressed kaposin locus of Kaposi's sarcoma-associated herpesvirus (25). However, in most cases, non-AUG codons act as alternative translation initiators.
Recognition of these initiators by mammalian ribosomes is often facilitated by particular sequences around the initiators, in particular the nucleotides at positions −3 and +4 (5). In contrast, sequence context has little effect on translation initiation in yeast. As a result, yeast genes were not previously expected to be capable of using non-AUG triplets as translation start sites (8). In this regard, our discovery of a non-AUG initiator occurring naturally in the yeast GRS1 gene is particularly striking (Fig. 3). The question arose as to why UUG −23 can serve as the alternative initiator if sequence context is irrelevant. Analysis of the nucleotide sequence 5′ to AUG 1 revealed a weak secondary structure (predicted stability, ΔG of around −2 kcal/mol) 15 nucleotides downstream of UUG −23 (Fig. 1), an optimal position to stimulate recognition of the weak upstream start site (6). However, mutations that destabilize the already weak structure did not affect the initiation ability of UUG −23 (Fig. 7C). Perhaps the sequence downstream of the UUG initiator does not assume the predicted structure in vivo. More detailed experiments are necessary to map the sequences or structural elements required for efficient non-AUG initiation.
In mammalian cells, the 40 S ribosomal subunit often skips a weak upstream initiator, frequently a non-AUG initiator or an AUG codon with suboptimal sequence context, and continues scanning toward the 3′-end until it encounters an AUG triplet within a favorable sequence context. As a result of this "leaky" scanning, two protein isoforms can be alternatively translated from a single transcript (26-29). Although leaky scanning has been observed frequently in mammals, there are so far only two known examples in yeast: MOD5 (coding for isopentenyl pyrophosphate:tRNA isopentenyl transferase) (30) and CCA1 (coding for ATP(CTP):tRNA nucleotidyltransferase) (31). In these two instances, leaky scanning occurs because the first AUG codon in the two genes is positioned too close to the 5′-end of the mRNA, which makes it inaccessible to the initiating ribosome. Because the first initiator (UUG −23) in GRS1 is relatively inefficient, it is possible that the downstream AUG initiator is also recognized through leaky scanning. It is interesting to note that when UUG −23 was mutated to AUG, the resultant construct rescued both the cytoplasmic and mitochondrial defects of the GRS1 knockout strain (Fig. 3). We surmised that the cytoplasmic function of this construct probably results from the single translation product initiated from AUG −23, which is processed in the mitochondria and then leaks back to the cytoplasm. Upon insertion of a stable hairpin at the beginning of the transcript, the production of both isoforms was simultaneously inhibited (Figs. 6 and 7), lending further support to the hypothesis that recognition of the alternative initiators follows the traditional mechanism of cap-dependent ribosomal scanning and ruling out the possibility of internal ribosomal entry, a mechanism previously discovered in picornavirus (32) and recently shown in S. cerevisiae (33).
Contrary to a previous report, which suggested that a single translation product of GRS1 provides both cytoplasmic and mitochondrial GlyRS functions (17), we present strong evidence here that this translational product (initiated from AUG 1) is solely a cytoplasmic enzyme when expressed at normal levels. Localization of the GlyRS molecule into mitochondria requires the presence of an additional leader sequence that serves as a transit signal (Fig. 2). Like many other mitochondrial targeting signals (34), this signal peptide is rich in positively charged (~30%) and hydroxylated (~17%) residues and devoid of acidic residues. The reporter gene assays confirmed that the leader peptide alone carries all the relevant information required for mitochondrial targeting and thus can successfully convert a cytoplasmic passenger into a mitochondrial player (Fig. 4). On the other hand, the cytoplasmic form, because it lacks the leader peptide, functions strictly in the cytoplasm when expressed from a low-copy number vector under the control of its native promoter (Fig. 2) or an analogous promoter (Fig. 4). Despite this, the leaderless enzyme can be forced into mitochondria when overexpressed (Fig. 2). These observations suggest that the GlyRS preprotein contains two adjacent mitochondrial transit signals, one in the leader peptide and the other in the amino-terminal part of its catalytic body. The former appears to play a predominant role in mitochondrial localization of the preprotein, whereas the latter is a weak cryptic signal, which normally does not take part in protein import (Fig. 2) but can be recruited to do so when the protein is overexpressed (Fig. 2). A similar scenario has been observed in COXVa, the yeast gene coding for the precursor form of the mitochondrial cytochrome c oxidase subunit Va.
Although the leader peptide is required for mitochondrial import of this enzyme under normal conditions, overexpression of a leaderless form enabled it to overcome this requisite (35). In contrast to these observations, although VAS1 also encodes two distinct protein isoforms through alternative transcription and translation, the cytoplasmic form cannot complement its mitochondrial function even when it is overexpressed (22).
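The residue composition quoted above for the 23-residue leader peptide (about 30% positively charged, about 17% hydroxylated, no acidic residues) is a simple count over the sequence. The Python sketch below shows one way to compute such percentages. It rests on assumptions that the text does not state: lysine and arginine are counted as positively charged, and serine and threonine as hydroxylated. The example peptide is invented for illustration, built to contain 7 K/R and 4 S/T residues out of 23, and is not the GRS1 leader.

```python
POSITIVE = set("KR")      # assumption: lysine and arginine only
HYDROXYLATED = set("ST")  # assumption: serine and threonine only
ACIDIC = set("DE")        # aspartate and glutamate


def composition(peptide):
    """Return the percentage of residues in each class for a peptide
    given in one-letter amino acid code."""
    if not peptide:
        raise ValueError("peptide must not be empty")
    total = len(peptide)

    def percent(group):
        return 100.0 * sum(residue in group for residue in peptide) / total

    return {
        "positive": percent(POSITIVE),
        "hydroxylated": percent(HYDROXYLATED),
        "acidic": percent(ACIDIC),
    }


# Hypothetical 23-residue peptide (not the GRS1 leader sequence).
example = "MLRKALSRFLKTAIRRLSKTALA"
for name, value in composition(example).items():
    print(f"{name}: {value:.1f}%")
```

With 23 residues, 7 positively charged residues correspond to 30.4% and 4 hydroxylated residues to 17.4%, which is one combination of counts compatible with the rounded figures in the text.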
Because yeast ribosomes have now been found to recognize a naturally occurring non-AUG start site, it is conceivable that more examples of non-AUG-mediated protein translation will soon be identified in yeast. Consistent with our viewpoint, a very recent report argued that a GUG codon may serve as the exclusive translation initiator for the gene encoding acidic ribosomal P2A protein in the yeast Candida albicans (36). In addition, preliminary screening of genomic sequences for GlyRS genes in other lower eukaryotes revealed the presence of only one such gene in both C. albicans and Schizosaccharomyces pombe and, more strikingly, only one suitable AUG triplet in each of the two genes. Protein sequence alignment suggested that the primary translation products initiated from their putative AUG initiators share high homology with the cytoplasmic GlyRS enzyme of S. cerevisiae, in particular at their amino termini (data not shown), implying that they serve only a cytoplasmic function. It is thus likely that, as with S. cerevisiae, the production of a second, mitochondrial, isoform might involve the utilization of an upstream in-frame non-AUG initiator. Analysis of their upstream sequences with PSORTII (37) showed high mitochondrial localization potential, lending further support to our assumptions. These observations suggest that non-AUG initiation might be a mechanism more widespread than previously thought. In addition, recent investigations have discovered that tRNA synthetases are capable of functions other than protein synthesis, including roles in cellular fidelity, tRNA processing, RNA splicing, RNA trafficking, apoptosis, and transcriptional and translational regulation (38). These non-canonical functions might require additional protein interaction domains and may often take place in locations other than the cytoplasm, such as mitochondria or the nucleus (39-41).
In this sense, alternative translation at in-frame non-AUG codons might contribute to the multifunctional phenotype of a gene by leading to the production of distinct protein isoforms with such signal peptides or additional protein interaction domains. Cumulatively, our discovery has opened a new avenue to the identification of new open reading frames that start exclusively or alternatively with non-AUG initiators, and novel gene functions that might have been previously overlooked in the yeast S. cerevisiae.
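A search for open reading frames that might start at an upstream in-frame non-AUG codon, as proposed above, can be sketched as a walk upstream from an annotated AUG, one codon at a time, until an in-frame stop codon closes the frame. The Python example below is a minimal illustration of that idea and not the screening procedure used in this work. The function name, the restriction to single-mismatch near-cognate codons, and the test sequence are all assumptions made for the example.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}
# Codons that differ from AUG at exactly one position.
NEAR_COGNATE = {"CUG", "GUG", "UUG", "AAG", "ACG", "AGG", "AUA", "AUC", "AUU"}


def upstream_initiator_candidates(mrna, aug_index):
    """Collect in-frame AUG or near-cognate codons upstream of an AUG.

    `aug_index` is the 0-based position of the A of the annotated AUG.
    The walk stops at the first in-frame stop codon or when fewer than
    three nucleotides remain. Each hit is (offset, codon), where offset
    is the codon position relative to the annotated AUG (-1 is the codon
    immediately upstream).
    """
    if mrna[aug_index:aug_index + 3] != "AUG":
        raise ValueError("aug_index must point at an AUG codon")
    hits = []
    pos = aug_index - 3
    offset = -1
    while pos >= 0:
        codon = mrna[pos:pos + 3]
        if codon in STOP_CODONS:
            break  # the reading frame cannot extend further upstream
        if codon == "AUG" or codon in NEAR_COGNATE:
            hits.append((offset, codon))
        pos -= 3
        offset -= 1
    return hits


# Hypothetical transcript fragment for illustration only.
mrna = "CCUAAGCAUUGAAACGUAUGGGC"
print(upstream_initiator_candidates(mrna, mrna.find("AUG")))
```

Candidates found this way would still need to be checked for mitochondrial targeting potential of the encoded extension and, ultimately, for initiation in vivo.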