Yeast Positive-stranded Virus-like RNA Replicons

Saccharomyces cerevisiae strains carry single-stranded RNAs called 20 S RNA and 23 S RNA. These RNAs and their double-stranded counterparts, W and T dsRNAs, have been cloned and sequenced. A few nucleotides at both ends, however, remained unknown. These RNAs do not encode coat proteins but do encode their own RNA-dependent RNA polymerases, which are highly conserved with respect to each other. The polymerases are also similar to the replicases of RNA coliphages, such as Qβ. Here we have determined the nucleotide sequences of W and T dsRNAs at both ends using reverse transcriptase polymerase chain reaction-generated cDNA clones. We confirmed the terminal sequences by primer extension and RNase protection experiments. Furthermore, these analyses demonstrated that W and T dsRNAs and their single-stranded RNA counterparts (i) are linear molecules, (ii) have identical nucleotide sequences at their ends, and (iii) have no poly(A) tails at their 3′ ends. Both 20 S and 23 S RNAs have GGGGC at the 5′ ends and the complementary 5-nucleotide sequence, GCCCC-OH, at their 3′ ends. S1 and V1 secondary structure mapping of the 3′ ends of 20 S and 23 S RNAs shows the presence of a stem-loop structure that partially overlaps with the conserved 3′ end sequence. Nucleotide sequences and stem-loop structures similar to those described here have been found at the 3′ ends of RNA coliphages. These data, together with the similarity of the RNA-dependent RNA polymerases encoded among these RNAs and RNA coliphages, suggest that 20 S and 23 S RNAs are plus-strand single-stranded virus-like RNA replicons in yeast.

of them are intracellular parasites with no extracellular stage. Transmission is mainly vertical or through mating or hyphal anastomosis.
Yeast strains of Saccharomyces cerevisiae have been reported to carry at least five types of double-stranded RNAs: L-A, L-BC, M, W, and T (1). L-A, L-BC, and M are encapsidated into isometric viral particles. W and T are not encapsidated into viral coats (2). W (2.5 kilobases) and T (2.9 kilobases) have been cloned and sequenced almost entirely (3,4). Both RNAs code for proteins with domains conserved among RNA-dependent RNA polymerases (RDRPs) of RNA viruses (5-8). The protein encoded by W (+) strands (p91) and the protein encoded by T (+) strands (p104) share a high degree of homology that extends beyond the RDRP consensus motifs, indicating a close evolutionary relationship between these RNAs (Fig. 1). Comparison with other RDRPs suggests that these polymerases are more similar to the RNA coliphage replicases than to RDRPs from dsRNA viruses, including those present in the same host, namely L-A and L-BC viruses (3, 4, 9-11).
All strains carrying W dsRNA also carry a single-stranded RNA called 20 S RNA, and all strains carrying T also have a single-stranded RNA called 23 S RNA. 20 S RNA and 23 S RNA have been proposed to be identical to the W and T (+) strands, respectively (3,4,12). The copy numbers of 20 S RNA and 23 S RNA are highly induced under stress conditions such as growth under nitrogen starvation (4,13), reaching up to 100,000 copies/cell. 20 S and 23 S RNAs are not encapsidated into viral particles (14,15) but are associated with their own RNA polymerases, forming ribonucleoprotein complexes (15,16). Recently we have shown that the p91/20 S RNA complexes have in vitro RNA polymerase activity that synthesizes 20 S RNA (17). p104/23 S RNA complexes have similar activity. Since cis-acting signals at the ends of RNA viral genomes often play critical roles in the template specificity of viral RNA polymerases (18-21), we decided to determine the nucleotide sequences at the ends of W and T dsRNAs.
Here we report the cloning and analysis of the nucleotide sequences at the 5′ and 3′ ends of W and T dsRNAs. Both (+) strands have conserved 5′ end GGGGC and 3′ end GCCCC-OH sequences. Primer extension analysis and RNase protection experiments confirmed that the single-stranded forms (20 S RNA and 23 S RNA) are identical to the (+) strands of the corresponding double-stranded forms (W and T) and that all these RNAs are linear molecules. S1 and V1 secondary structure mapping of the 3′ ends confirms that 20 S RNA and 23 S RNA share not only similar sequences at their ends but also similar 3′ end secondary structures. These sequences and secondary structures are similar to those found at the 3′ ends of the genomic RNAs in (+) strand single-stranded RNA coliphages. Based on the available data we believe that 20 S and 23 S RNAs are similar to positive-stranded RNA viruses.

* This work was supported by Grant PB94-0076 from the Dirección General de Investigación Científica y Técnica (DGICYT) (Spain). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EBI Data Bank with accession number(s) M63893 (W dsRNA)
Purification of RNAs-W and T dsRNAs from strain 37-4C were purified by CF-11 cellulose chromatography as described previously (22). Then, W and T dsRNAs were separated on an agarose gel, electroeluted from the gel, and further passed through Elutip columns (Schleicher & Schuell). 20 S RNA and 23 S RNA were purified from strain 37-4C grown under induction conditions as described (3). Briefly, cells were grown for 48 h to stationary phase, washed, and incubated in the presence of 1% potassium acetate for 14-16 h to achieve induction of 20 S RNA and 23 S RNA (23). Total nucleic acids were separated on 1.5% agarose gels, and 23 S RNA or 20 S RNA was electroeluted from the gel, extracted once with phenol:chloroform, and precipitated with ethanol. To isolate the (+) and (−) strands of W dsRNA, the dsRNA was first denatured in the presence of 7 M urea at 90°C for 1 min and then loaded onto a 5% polyacrylamide strand separation gel (24). Both strands were located by ethidium bromide staining, excised from the gel, and purified.
Plasmids-pALI17 contained the complete cDNA sequence of 20 S RNA fused to the T7 RNA polymerase promoter and the SmaI site of pBluescript-KS+ vector. Run-off transcription of SmaI-digested pALI17 by T7 RNA polymerase, therefore, gave transcripts that have the entire sequence of 20 S RNA with the correct 5′ and 3′ ends. Plasmid pALI22 contained 20 S RNA cDNA sequences from nt 2288 to 2514 cloned between the HindIII and SmaI sites of pBluescript-SK+ vector. T7 run-off transcription of SmaI-digested pALI22 gave the 273-nt RNA transcript WHindIII, which contains 227 nt from the 20 S RNA 3′ end with an upstream 46-nt vector sequence. Plasmid pNR27 contained the entire 23 S RNA cDNA nucleotide sequence (2891 base pairs) cloned into the unique SmaI site of pBluescript-SK+ vector. Plasmid pRE443 contained sequences of 23 S RNA cDNA from nt 2750 to 2891 cloned into the SmaI site of pBluescript-SK+ vector. T7 run-off transcription of SmaI-digested pRE443 gave the 215-nt RNA transcript TSpeI, which contained 142 nt from the 23 S RNA 3′ end and a 73-nt upstream sequence derived from the vector. Plasmids pW3-2 and pT3-8 are reverse transcription-PCR-generated cDNA clones containing the last 227 nt of W or the last 143 nt of T (+) strand 3′ ends followed by a poly(A) tract.
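The transcript sizes quoted above follow from inclusive, 1-based nucleotide coordinates plus the stated vector-derived lengths. The short Python sketch below only rechecks that arithmetic; the helper name span_length is introduced here for illustration and is not part of the original methods.

```python
def span_length(start: int, end: int) -> int:
    """Number of nucleotides in an inclusive, 1-based interval [start, end]."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


# Insert coordinates and vector-derived lengths as quoted in the text.
w_insert = span_length(2288, 2514)  # 20 S RNA 3' region cloned in pALI22
t_insert = span_length(2750, 2891)  # 23 S RNA 3' region cloned in pRE443

w_transcript = w_insert + 46  # WHindIII: insert plus upstream vector sequence
t_transcript = t_insert + 73  # TSpeI: insert plus upstream vector sequence

print(w_insert, w_transcript)  # 227 273
print(t_insert, t_transcript)  # 142 215
```

Both totals agree with the 273-nt and 215-nt transcript lengths given for WHindIII and TSpeI.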
RNase Protection Experiments-For RNase protection experiments, we used the RPA II™ ribonuclease protection assay kit from Ambion. 32P-labeled RNA probes with sequences complementary to the 3′ ends of W and T (+) strands were generated by run-off transcription with T3 RNA polymerase from PvuII-digested pW3-2 and EagI-digested pT3-8, respectively. The 32P-labeled probes were separated in a denaturing acrylamide gel and purified from the gel as described above. The probe was then annealed with 20 S RNA or W (+) strands or with 23 S RNA or T (+) strands, depending on the probe used. The RNA hybrids were digested with a mixture of RNase A and RNase T1 under high-salt conditions, and the protected RNA fragments were separated on a 7 M urea 6% polyacrylamide gel and detected by autoradiography.
Secondary Structure Analysis-W dsRNA or small RNA transcripts corresponding to the 3′ ends of W (+) strand (20 S RNA) or T (+) strand (23 S RNA) were 3′ end-labeled with [32P]pCp (3000 Ci/mmol, Amersham) and T4 RNA ligase (Life Technologies, Inc.) in a 30-µl reaction mixture. The conditions were as suggested by the enzyme supplier. The labeled RNA was denatured and separated in 5% polyacrylamide strand separation gels (24). The 3′ end-labeled transcripts or W (+) strands were excised from the gel and extracted with 0.5 ml of 0.5 M ammonium acetate, 1 mM EDTA overnight at room temperature. The samples were then filtered through glass wool to remove polyacrylamide and precipitated with ethanol. The labeled RNA (5000-20000 cpm) was first preincubated in the reaction buffer for 10 min at 37°C and then digested with nuclease S1 (Life Technologies, Inc.) or RNase V1 (Amersham). Nuclease S1 digestion was performed in a reaction mixture (6 µl) containing 30 mM sodium acetate, pH 4.6, 1 mM zinc acetate, 5% glycerol, 80 mM NaCl, 0.5 µg of tRNA, and 0.2, 2, or 10 units of S1 nuclease (28). RNase V1 digestion was done in a buffer (6 µl) that contained 25 mM Tris-HCl, pH 7.2, 10 mM MgCl2, 200 mM NaCl, 0.5 µg of tRNA, and 0.009 or 0.018 units of RNase V1. Nuclease treatments were done at 37°C for 10 min, and the reactions were stopped by the addition of 1 µl of 100 mM EDTA and the same volume of loading buffer (10 M urea, 1.5 mM EDTA, 0.05% xylene cyanol, and 0.05% bromphenol blue). To generate a sequence ladder, alkaline hydrolysis was carried out at 90°C for 7 min in a 6-µl reaction volume that contained 50 mM sodium bicarbonate/carbonate, pH 9.2, 3 µg of tRNA, and twice the amount of labeled RNA used for the enzymatic digestions. The cleaved products were analyzed on 7 M urea 20% or 10% polyacrylamide gels.
General Procedures-Plasmid DNA was sequenced by the dideoxy chain termination method (29) using T7 DNA polymerase (Amersham). In vitro run-off RNA transcription by T7 or T3 RNA polymerases (Promega) was carried out using plasmids linearized with appropriate restriction enzymes. Then the DNA template was digested with 1 µg of DNase I (Promega) for 15 min at 37°C. RNA secondary structure prediction was done using the RNAFOLD program (30). RNA coliphage nucleotide sequences were retrieved from the EMBL data bank.

RESULTS
Cloning of W and T dsRNAs Ends-Previously we cloned and sequenced random primer-generated cDNAs from W and T dsRNAs. In that work we obtained 2505- and 2871-base pair nucleotide sequences for W and T dsRNAs, respectively. Judging from the mobilities in denaturing acrylamide gels of transcripts made in vitro with these sequences, we estimated that our cDNA sequences lacked only a few nucleotides at the ends of these RNAs. To understand the replication mechanism of these RNAs, however, it is essential to know the exact nucleotide sequences of these molecules, especially at both ends.
To clone the ends of both RNAs, we used the method called 3′-RACE (rapid amplification of cDNA ends) (25). W or T dsRNAs were first poly(A)-tailed at the 3′ ends using poly(A) polymerase. The poly(A)-tailed RNA was denatured and annealed with the deoxyoligonucleotide primer Bam-dT16, which could hybridize to the poly(A) tail with its 3′ end oligo(dT) sequence. The cDNA was then synthesized with reverse transcriptase. The cDNA was amplified by PCR using Bam-dT16 and a second primer. We used four sets of second primers. Each of them had a nucleotide sequence identical to an internal region close to the 3′ end of either strand of W or T dsRNA, thus providing the specificity in cDNA amplification. Finally the amplified cDNAs were cloned into Bluescript vectors. We sequenced 5-10 clones derived from each amplification. Table I summarizes the results of sequencing, which are shown as the (+) strand sequences. The majority of sequences derived from each amplified end had a cluster of four C residues followed by a downstream poly(A) tract. If we assume that the fourth C adjacent to the poly(A) tract represents the end of each dsRNA, the total lengths of the nucleotide sequences of W and T dsRNAs are 2514 and 2891 base pairs, respectively.
Confirmation of the Sequences of W and T dsRNA Ends-We added poly(A) tails at the 3′ ends of W and T dsRNAs during cloning. If W or T dsRNA had extra A residues downstream of the four successive C residues at the 3′ ends, these A residues could not be distinguished from the poly(A) tail attached by the poly(A) polymerase. This is the intrinsic problem associated with the 3′-RACE method. Therefore we needed to evaluate whether these four successive C residues were the true ends of these RNAs. For this purpose, two approaches were undertaken: primer extension analysis and RNase protection experiments.
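The ambiguity described above can be illustrated with a toy Python sketch. It assumes an arbitrary tail length of 20 residues and a hypothetical RNA carrying two native 3′ A residues; neither value comes from the experiments, and the function names are introduced here for illustration only.

```python
def poly_a_tail(rna: str, n: int = 20) -> str:
    """Append n A residues, mimicking tailing by poly(A) polymerase."""
    return rna + "A" * n


def inferred_3prime_end(tailed: str) -> str:
    """Strip the terminal A run, as one does when reading a 3'-RACE clone."""
    return tailed.rstrip("A")


true_end = "ACGGCCCC"           # ends in the four C residues seen in the clones
hypothetical = true_end + "AA"  # hypothetical RNA with two native 3' A residues

# Both molecules give the same apparent end after tailing and trimming.
print(inferred_3prime_end(poly_a_tail(true_end)))      # ACGGCCCC
print(inferred_3prime_end(poly_a_tail(hypothetical)))  # ACGGCCCC
```

Because the two inputs cannot be told apart from the clone sequence alone, an independent assay of the native ends is needed, which is what the primer extension and RNase protection experiments provide.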
In the primer extension experiments, purified (+) or (−) strands of W dsRNA or denatured T dsRNA were mixed with a 5′ end-labeled oligonucleotide primer that was complementary to the 5′ end region of the RNA. Then the labeled primer was extended toward the 5′ end of the template by reverse transcription. As shown in Fig. 2A, lane 1, the primer complementary to the 5′ end region of W (+) strand was fully extended and terminated as a single band at the position corresponding to the last 3′ end C of the W (−) strand. This result clearly indicates that the 5′ end sequence of W (+) strand shown in Table I is correct and has no preceding extra Ts. When the T (+) strands were examined similarly by primer extension, we again obtained a single band of the extended primer terminating at the position corresponding to the last C of the four clustered C residues at the 3′ end of T (−) strands (Fig. 2B, lane 1). This result again indicates that the 5′ end sequence of T (+) strands shown in Table I is correct and has no extra nucleotide sequences at the 5′ end. We also examined the (−) strands of W and T dsRNAs and obtained the same results; that is, the first nucleotides of the 5′ ends of W and T (−) strands are G and there are no extra Ts attached to them (not shown). It should be pointed out that although the cloning of each end of W and T dsRNA by 3′-RACE involved manipulation of their 3′ ends, the primer extension experiments shown in Fig. 2 directly analyzed the 5′ end of each RNA strand. The fact that these two independent but complementary experiments gave consistent results strongly suggests that our cloned sequences are correct and represent the real W and T end sequences.
FIG. 2. Primer extension analysis of W and T (+) strand 5′ ends. A, purified W (+) strands (lane 1) or 20 S RNA (lane 2) were annealed with 32P-labeled primer NR2, and the primer was extended with reverse transcriptase. The extended products were separated on a 7 M urea 6% polyacrylamide gel and autoradiographed. For comparison, a sequence ladder was made from plasmid pNR24 using the same primer. pNR24 is a reverse transcription-PCR-generated clone containing an upstream T18 tract attached to the W (+) strand 5′ end sequence (GGGGC). Lane C + 2 contains an aliquot of sequencing ladder C plus the primer-extended products analyzed in lane 2. B, denatured T dsRNA (lane 1) or 23 S RNA (lane 2) were annealed with primer NR22, and the primer was extended with reverse transcriptase. The extended products were analyzed and detected as in A. A sequencing ladder was made from plasmid pNR27 that contains the complete T (+) strand cDNA sequence cloned into the SmaI site of Bluescript vector. In lane C + 2, an aliquot of sequencing ladder C and the primer-extended products shown in lane 2 were mixed and analyzed. Arrows indicate the mobilities of the extended products. Nucleotide sequences derived from the sequencing ladders are shown on the left of the panels, and the sequences complementary to W and T (+) strands are boxed.

TABLE I (fragment; only the entries below survive in this copy, with the number of clones in parentheses)
Entry preceding the T rows, 3′ end: ...UGAGGCCACGGCCCC (5)
T (+) strands/23 S RNA, 5′ end: GGGGCCAUG... (6); GGGCCAUG... (2); GGCCAUG... (1)
T (+) strands/23 S RNA, 3′ end: ...CCGGGCCUGAGCCCC (5); ...CCGGGCCUGAGCC (1); ...CCGGGCCU (1)

Logically, however, the possibility still remains that there exists a non-base-pairing poly(A) tail at the 3′ end of W or T (+) strands. To rule out this possibility, we undertook a second experimental approach: RNase protection experiments. We made uniformly labeled RNA in vitro that had the nucleotide sequence from base 2514 to 2288 (numbering refers to the (+) strand sequence) of the W (−) strand attached to the 5′ upstream poly(T) sequence (Fig. 3A). This probe therefore can hybridize to the 3′ end region of W (+) strand.
If W (+) strands have poly(A) tails at their 3′ ends, a part of the poly(T) sequence of the probe complementary to the poly(A) tail should be protected from RNase digestion. As shown in Fig. 3A, a part of the probe corresponding to W (−) strand from base 2514 to 2288 was fully protected, but the 5′ end poly(T) sequence was completely digested with the RNases (lanes 4 and 5). When the 3′ end of the T (+) strand was examined using a similar probe, a portion of the probe corresponding to the T (−) strand sequence from base 2891 to 2812 was fully protected, but again, the adjacent upstream poly(T) sequence was completely digested (Fig. 3B, lanes 4 and 5). These results, therefore, (i) confirm the correctness of our 3′ end nucleotide sequences of W and T (+) strands and (ii) clearly rule out the possibility that the W and T (+) strands have non-base-pairing poly(A) tails at their 3′ ends. Altogether, the results from primer extension analysis and RNase protection experiments indicate that the 3′ end sequences of W and T dsRNAs obtained by 3′-RACE are genuine, and that there are no non-base-pairing poly(A) tails at their 3′ ends. Thus we have now established the complete nucleotide sequences of W and T dsRNAs.

Since we have obtained the complete sequences of W and T dsRNAs, we asked whether 20 S and 23 S RNA have the same corresponding sequences at their ends. The 5′ end of 20 S RNA was examined by primer extension with the same oligonucleotide used for W (+) strands. As shown in Fig. 2A, lane 2, the primer was extended and terminated as a single band at the same position where the 5′ end of the W (+) strand terminates. This result indicates that 20 S RNA has the same primer binding site at the same distance from the 5′ end as the W (+) strand.
When the 5′ end of 23 S RNA was analyzed similarly with the primer used for the T (+) strands, we obtained the same result; that is, the primer was terminated as a single band at the same position corresponding to the 5′ terminus of the T (+) strand (Fig. 2B, lane 2). Therefore, the primer extension analysis indicates that the 5′ ends of 20 S RNA and 23 S RNA are indistinguishable from those of W and T (+) strands, respectively. The 3′ end regions of 20 S and 23 S RNAs were analyzed by RNase protection experiments with the same RNA probes used for the 3′ ends of W and T (+) strands, respectively. As shown in Fig. 3A, lane 3, the W (+) strand-specific probe was protected by 20 S RNA from RNase digestion in the region complementary to the 3′ end of the W (+) strand. This result clearly indicates that the 3′ end region of 20 S RNA is identical to that of the W (+) strand. Since the upstream poly(T) sequence of the probe was, however, completely digested by RNase, it also indicates that 20 S RNA has no poly(A) tail at its 3′ end. When the 3′ end of 23 S RNA was examined similarly with the T (+) strand-specific probe, the probe was protected by 23 S RNA in the region complementary to the 3′ end of the T (+) strand (Fig. 3B, lane 3). Again, the upstream poly(T) sequence of the probe was completely digested. Therefore this result indicates that the 3′ end of 23 S RNA is identical to that of the T (+) strand and has no poly(A) tail attached to it. All the results from primer extension and RNase protection experiments, together with our previous observations, thus indicate that 20 S and 23 S RNAs are identical to the (+) strands of W and T dsRNAs, respectively.

[Fragment of a figure legend; its beginning is missing in this copy.] The cleaved products were separated on a 7 M urea 20% acrylamide gel and detected by autoradiography. For comparison, we analyzed the same samples digested with 2 units of nuclease S1 in the same gel (lanes 6 and 12). Arrows indicate positions highly sensitive to V1 digestion (numbered from the 3′ end). The positions of loop I and loop II from the S1 digestion are also indicated.
W and T dsRNAs and 20 S and 23 S RNAs Are Linear-W and T dsRNAs can be labeled stoichiometrically at their 3′ ends with [32P]pCp and T4 RNA ligase, indicating that these molecules have free OH groups at their 3′ termini (24, 31). In addition, as shown in Fig. 2, all the reverse-transcribed products from W and T dsRNA by primer extension terminated at the positions corresponding to the 5′ termini of W and T dsRNA, and no synthesis proceeded beyond them. These results clearly indicate that W and T dsRNAs are linear molecules.
The linearity of 20 S RNA was demonstrated previously by a site-directed single cleavage with RNase H, which produced two fragments as expected from its linear nature (24). As demonstrated in Figs. 2 and 3, 20 S RNA and 23 S RNA have 5′ and 3′ termini identical to those of the W and T (+) strands, respectively. Therefore, these results indicate that not only 20 S RNA but also 23 S RNA is a linear molecule.
Characteristics of 20 S and 23 S RNA Termini-20 S and 23 S RNAs share the 5-nt sequence GGGGC at the 5′ ends and the 5-nt sequence GCCCC-OH at the 3′ ends, thus confirming their close relationship. Interestingly, these 5-nt sequences are complementary. They can potentially form panhandle structures. It also means that the template strands for (+) and (−) strand synthesis have the same 5-nt sequence (GCCCC-OH) at their 3′ ends, from which the new strand synthesis begins. Computer-generated secondary structure prediction suggests that 20 S and 23 S RNAs have very similar secondary structures at their 5′ ends (Fig. 4). In both RNAs, the leader sequences upstream of the first AUG initiation codons are very short (12 nt in 20 S RNA and only 6 nt in 23 S RNA) and rich in GC content, and these AUG codons are located within a region of strong secondary structure (Fig. 4). Both RNAs share nucleotides at the −2 (C), −1 (C), and +5 (A) positions with respect to the AUG codon. Host mRNAs have much longer leader sequences (usually about 50 nt long), and these sequences are rich in adenine and devoid of significant secondary structure (32). The consensus sequence around the initiation codons is AA/UAAUGUCU, and less expressed mRNAs have more G-rich and shorter leader sequences. Therefore, the leader sequences of 20 S and 23 S RNAs have characteristics of less expressed yeast mRNAs, and these features may contribute to the low abundance of the encoded p91 and p104 proteins in the host cell. In contrast, the 3′ end noncoding regions of 20 S and 23 S RNAs are quite different. 20 S RNA has only 12 nt after the termination codon of p91, whereas 23 S RNA has 59 nt at the 3′ end. At the 3′ termini, however, both RNAs possess nucleotide sequences that can potentially form strong secondary structures. These structures partially overlap with the above-mentioned 3′ end GCCCC-OH sequences (Fig. 7).
Interestingly, similar 3′-terminal structures and sequences can be found among the genomic RNAs of positive-stranded RNA coliphages (Fig. 8).
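The complementarity of the two terminal pentanucleotides, and the consequence that both template strands end in GCCCC-OH, can be checked with a few lines of Python. This is a minimal sketch using only the two sequences given in the text; the function name reverse_complement is introduced here for illustration.

```python
# RNA base-pairing table used for Watson-Crick complements.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5' to 3'."""
    return rna.translate(COMPLEMENT)[::-1]


plus_5prime = "GGGGC"  # 5' end of the (+) strand
plus_3prime = "GCCCC"  # 3' end of the (+) strand

# The two ends of one strand can pair with each other (panhandle).
print(reverse_complement(plus_5prime) == plus_3prime)  # True

# The 3' end of the (-) strand is the reverse complement of the (+) 5' end,
# so both templates present the same 3'-terminal sequence.
minus_3prime = reverse_complement(plus_5prime)
print(minus_3prime)  # GCCCC
```

The same function applied to GCCCC returns GGGGC, so the (−) strand also begins with GGGGC at its 5′ end.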
Secondary Structures at the 3′ Ends of 20 S and 23 S RNAs-To analyze the 3′ end secondary structure of 20 S RNA, we tried to label purified 20 S RNA with [32P]pCp and T4 RNA ligase. We also made in vitro T7 transcripts that contained the complete 20 S RNA sequence with the correct termini. Unfortunately we were unable to label either RNA efficiently. To overcome this problem we decided to label W dsRNA and then to separate and purify the (+) strands in an acrylamide strand separation gel (24). The purified, 3′ end-labeled W (+) strand (20 S RNA) was treated with different concentrations of nuclease S1, and the products were analyzed in a 7 M urea acrylamide sequencing gel (Fig. 5A, lanes 6 to 10). We also analyzed the small transcript WHindIII by S1 nuclease (Fig. 5A, lanes 1 to 5). This transcript contains the 20 S RNA nucleotide sequence from nt 2288 to 2514 with the correct 3′ end. Labeling of the transcript with T4 RNA ligase was more efficient than labeling of 20 S RNA because of its smaller size. There are two regions close to the 3′ end clearly sensitive to S1 digestion, namely the loop parts of stem-loops I and II. Loop I extends from nt 18 to 21, and loop II extends from nt 62 to 66 from the 3′ end. These two stem-loops are present in the most optimal conformations for 20 S RNA and the WHindIII transcript predicted with the RNAFOLD program (30) (Fig. 7). Furthermore, there is a third region sensitive to S1 nuclease, located around nt 92, that corresponds to a single-stranded region between stem-loops II and III (Fig. 5A, upper right panel and Fig. 7). We confirmed the presence of these stem-loop structures by V1 nuclease digestion (Fig. 5B). The preference of V1 to digest double-stranded regions gives a pattern of partially digested products essentially complementary to that of S1 digestion (compare lane 4 and lane 6 in Fig. 5B; summarized in Fig. 7).
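Because the loop positions above are counted from the 3′ end, it can help to convert them to conventional 5′-based coordinates on the 2514-nt 20 S RNA. The Python sketch below assumes that position 1 is the 3′-terminal nucleotide, a numbering convention the text does not state explicitly; the helper name from_3prime is hypothetical.

```python
def from_3prime(pos_from_3prime: int, length: int) -> int:
    """Convert a position counted from the 3' end (1 = last nt) to a 5' coordinate."""
    if not 1 <= pos_from_3prime <= length:
        raise ValueError("position lies outside the molecule")
    return length - pos_from_3prime + 1


W_LENGTH = 2514  # complete 20 S RNA / W (+) strand length given in the text

# Loop boundaries as reported, counted from the 3' end.
loops = {"loop I": (18, 21), "loop II": (62, 66)}

for name, (near, far) in loops.items():
    start = from_3prime(far, W_LENGTH)   # boundary farther from the 3' end
    end = from_3prime(near, W_LENGTH)    # boundary nearer to the 3' end
    print(f"{name}: nt {start}-{end}")
# loop I: nt 2494-2497
# loop II: nt 2449-2453
```

Under this assumption both loops, as well as the S1-sensitive region around nt 92, fall within the last 227 nt covered by the WHindIII transcript.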
FIG. 6. S1 and V1 mapping of T (+) strand 3′ end. A, S1 mapping. The 3′ end-labeled TSpeI transcript was digested with various amounts of nuclease S1 (0.2, 2, or 10 units (U), lanes 3 to 5) or mock-treated (lane 2) and analyzed on a 7 M urea 20% acrylamide gel. Arrows indicate the two regions (loop I and loop II) highly sensitive to S1 digestion. Lane 1 shows an RNA ladder generated from the transcript by alkaline hydrolysis. B, V1 mapping. The same sample as in A was digested with 0.009 or 0.018 units of nuclease V1 (lanes 3 and 4) or mock-treated (lane C) and analyzed as in A. For comparison the transcripts digested with 2 units of nuclease S1 (lane 6) were analyzed in the same gel. Arrows indicate positions highly sensitive to V1 digestion and are numbered from the 3′ end. The positions of loop I and loop II from the S1 digestions are also indicated. Bottom, diagrams of 23 S RNA (or T (+) strand) and the TSpeI transcript. TSpeI consists of an upstream vector-derived sequence (open square) and 23 S RNA 3′ end sequence from nt 2749 to 2891.

To determine the secondary structure of the 23 S RNA 3′ end region, we tried to label purified 23 S RNA with [32P]pCp and T4 RNA ligase. Again we observed poor labeling of 23 S RNA. We also tried to isolate labeled T (+) strands from pCp-labeled T dsRNA in a strand separation acrylamide gel. Because of poor separation of the two strands in the gel we could not obtain labeled (+) strands pure enough to be analyzed. Therefore we expressed in vitro a small fragment (TSpeI) of 23 S RNA with the correct 3′ end and labeled its 3′ end with [32P]pCp and T4 RNA ligase. This fragment, according to the RNAFOLD program prediction, contained the 23 S RNA 3′-terminal domain that can form three stem-loop structures (Fig. 7) similar to those present in the 20 S RNA 3′ end. The same prediction was obtained with the entire 23 S RNA nucleotide sequence.
The labeled fragment was partially digested with S1 or V1 nucleases and analyzed in a sequencing gel. As shown in Fig. 6 the patterns of digestion clearly identified two 3′ end stem-loop structures (I and II), consistent with the computer-predicted secondary structure of the 23 S RNA 3′ end (Fig. 7).

DISCUSSION
In this paper we have cloned and sequenced both ends of W and T dsRNA. Together with the previously known sequences, we have now established the complete nucleotide sequences of these RNAs. The genuineness of the end sequences obtained was demonstrated by primer extension and RNase protection experiments. Furthermore, these experiments clearly indicated that both ends of 20 S and 23 S RNAs are identical to those of W and T (+) strands, respectively, thus confirming our proposal that 20 S and 23 S RNAs are single-stranded (+) forms of W and T dsRNA.
Interestingly, 20 S and 23 S RNA have no poly(A) tails at their 3′ ends, in contrast to some minus strand RNA viruses that synthesize poly(A)-tailed mRNAs distinct from their genomic complementary strand RNAs (33). We do not know whether 20 S and 23 S RNAs are 5′ end-capped. It is extremely difficult to label the 5′ end of 20 S and 23 S RNA with [γ-32P]ATP and T4 polynucleotide kinase. Since we have encountered a similar difficulty in labeling the 5′ end of 20 S RNA, even when made in vitro by T7 RNA polymerase with the correct termini,² this difficulty apparently comes from the inaccessibility of the 5′ ends to the enzyme, perhaps because of the large RNA size, rather than from a capped structure at the 5′ end. So far we have not noticed any consensus motifs in p91 and p104 implicated in viral capping enzymes (34). Considering the cytoplasmic localization of 20 S and 23 S RNAs within the cell,³ it is likely that these RNAs are not capped at their 5′ ends. At any rate, this point should be clarified.

20 S and 23 S RNAs apparently belong to the same RNA family of viral origin. They share many similarities, such as their genome organization, RNA polymerase sequences, and inducibility under nitrogen starvation conditions, among others. When the nucleotide sequences at both ends are compared, their resemblance is again quite evident. Both RNAs share the same 5-nt sequence (GGGGC) at their 5′ ends and also its complementary 5-nt sequence (GCCCC-OH) at their 3′ ends. The latter 3′ end sequence partially overlaps with a strong secondary structure, stem-loop I in 20 S RNA (Fig. 7), and probably also in 23 S RNA (Fig. 7). When we searched the 3′ end sequences of viral RNAs in data banks, we found that (+) strand single-stranded RNA coliphages have similar sequences and secondary structures at the 3′ ends of their genomic RNAs (Fig. 8).
Previously we had noticed that the consensus sequences for RNA-dependent RNA polymerases found in p91 and p104 are most closely related to those of the replicases of the RNA coliphages (3,4,9,10). These resemblances, especially in their 3′ end RNA structures, might have been conserved during their evolution, possibly from the beginning of the RNA world as suggested by Maizels and Weiner (35) in their genomic tag hypothesis. Alternatively, these polymerases might require similar chemical natures in their RNA templates. The polymerase senses the stem-loop structure as a signal of the 3′ end of the template, and the successive C residues at the 3′ end provide chemical stability in the initiation complex. At any rate, these resemblances strongly suggest that 20 S and 23 S RNAs are similar to (+) strand single-stranded RNA viruses. The available data that fit this framework are: (i) 20 S and 23 S RNAs are of positive sense; (ii) the (−) strands are much less abundant, usually less than 1% of the (+) strands; (iii) both 20 S and 23 S RNAs are linear molecules and have no poly(A) tails. Therefore their reproduction proceeds through end-to-end (−) strand synthesis, and then from these (−) strands the (+) strand RNAs are regenerated by end-to-end transcription. This replication scheme is consistent with our previous observation of in vitro 20 S RNA synthesis. Crude extracts prepared from induced cells synthesized mostly 20 S RNA and a small amount of the (−) strands. In time course measurements, we detected nascent molecules that grew in size to full unit-length 20 S RNA during the incubation (17). W and T dsRNAs can be referred to as replicative forms of 20 S and 23 S RNAs. However, it should be mentioned that they could be formed artifactually during their isolation or generated in vivo as dead-end products. In the Qβ system, it is well documented that the double-stranded forms are inactive as templates (36).
Another feature of 20 S and 23 S RNAs shared with coliphages is the existence of a single nontemplated A residue at their 3′ ends (Fig. 8). Based on the analysis of the 3′-terminal nucleotides done by thin layer chromatography, we found that about 80-90% of the 3′-terminal nucleotides of W or T (+) strands are C residues, and 10-20% are A residues (24, 31). The same results were obtained when the 20 S and 23 S RNA 3′-terminal nucleotides were analyzed,² thus confirming the identity of W (+) strands with 20 S RNA and T (+) strands with 23 S RNA.
It may seem controversial to refer to 20 S and 23 S RNAs as (+) strand RNA viruses, since these RNAs do not encode coat proteins and are not encapsidated into viral particles. However, we believe that this controversy could be reconciled if we take their biological context into consideration. So far all known mycoviruses have no extracellular pathway of transmission, except for some opportunistic events (37). This means that these viruses do not need elaborate machinery for exit from and re-entry into new host cells. For example, yeast Ty retroelements have a much simpler genome organization than retroviruses (38), and their genomic RNAs are encapsidated into particles that are equivalent to the inner cores of their higher eukaryote counterparts. Yeast totivirus L-A encodes only two proteins (coat protein and RNA polymerase), and the RNA genome is encapsidated into a particle that corresponds to the inner core of reoviruses. In both cases they lack an envelope and outer capsid. Therefore, if we placed the RNA coliphages into yeast, the gene necessary for cell lysis and the functions of coat protein that may be involved in exit and re-entry would become dispensable. In the cases of Ty and L-A, however, coat (or capsid) proteins also provide other functions. They still protect the encapsidated genome from the nuclease-rich intracellular environment (39). They also provide an environment where their genome and replication machinery are intimately concentrated. This compartmentalization may promote efficient and faithful reproduction of their progeny (or provirus). Concerning those functions, we believe 20 S and 23 S RNAs have evolved uniquely. As mentioned, 20 S and 23 S RNAs are not encapsidated into particles, but they form ribonucleoprotein complexes with their cognate RNA polymerases. This complex formation thus substitutes for compartmentalization by coat proteins.
The polymerase in the complex sees a high local concentration of the 3′ ends of the bound template. The RNAs themselves might have evolved in such a way that they can form stable secondary or tertiary structures relatively resistant to cellular RNases, even without a protective protein coat. Such extreme cases may be seen in the viroids, small RNA plant pathogens that form highly organized rod-shaped structures by extensive intramolecular hydrogen bonding. The high GC contents of 20 S RNA (58.3%) and 23 S RNA (59.0%), which are equivalent to those of viroids (53-60%) (40), surely stabilize such structures. On the other hand, the cellular mRNAs and L-A dsRNA virus in the same host have much lower GC contents (38-40 and 45.7%, respectively). In addition, the ribonucleoprotein complex formation would further stabilize and protect the RNA, thus substituting for a protein coat in protecting the genomic RNA. Our preliminary results indicate that the complex formation greatly enhances the RNA stability in vitro.⁴ The lack of an infectious cycle, therefore, allowed 20 S and 23 S RNAs to simplify their genomes to the extreme. They encode only their RNA polymerases. Their simple appearance in genome organization is, however, deceptive. They have to reproduce their progeny like other viruses and live peacefully in the host. All the information necessary for these activities is now concentrated into only one gene product and perhaps the RNA genome structure itself.
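The GC contents compared above are simple base-composition percentages. As a minimal Python sketch of how such a value is computed, the function below is applied to the nine 5′-terminal nucleotides of 23 S RNA listed in Table I; the full-length sequences are not reproduced here, so the printed value describes only that short stretch, not the 59.0% reported for the whole RNA. The function name gc_content is introduced here for illustration.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C residues in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    gc = sum(seq.count(base) for base in "GC")
    return 100.0 * gc / len(seq)


# First nine nucleotides of the T (+) strand / 23 S RNA from Table I.
five_prime_stretch = "GGGGCCAUG"

print(round(gc_content(five_prime_stretch), 1))  # 77.8
```

Applied to the complete sequences deposited in the data bank, the same calculation should reproduce the whole-genome percentages quoted in the text.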