Molecular Characterization of Two Endogenous Double-stranded RNAs in Rice and Their Inheritance by Interspecific Hybrids*

We completely sequenced 13,936 nucleotides (nt) of a double-stranded RNA (dsRNA) of wild rice (W-dsRNA). A single long open reading frame (13,719 nt) containing the conserved motifs of RNA-dependent RNA polymerase and RNA helicase was located in the coding strand. The identity between the entire nucleotide sequence of W-dsRNA and that of the dsRNA of temperate japonica rice (J-dsRNA, 13,952 nt) was 75.5%. A site-specific discontinuity (nick) was identified at nt 1,197 from the 5′ end of the coding strand of W-dsRNA. The corresponding nick in J-dsRNA is located at nt 1,211 from the 5′ end of the coding strand. The dsRNA copy number increased more than 10-fold in pollen grains of both rice plants. This remarkable increase may be responsible for the highly efficient transmission of J-dsRNA via pollen that we reported previously. J-dsRNA and W-dsRNA were also efficiently transmitted to interspecific F1 hybrids. Seed-mediated dsRNA transmission to F2 plants was also highly efficient when the maternal parent was wild rice. The efficiency of dsRNA transmission to F2 plants was reduced when the maternal parent was temperate japonica rice; however, the reduced rates in F2 plants returned to high levels in F3 plants.

Large linear (more than 10 kb) double-stranded RNAs (dsRNAs) have frequently been identified in healthy plants such as alfalfa (1), barley (2,3), broad bean (Vicia faba (4,5)), cassava (6), common bean (Phaseolus vulgaris (7,8)), pepper (9,10), and rice (11-13). Most of these dsRNAs have no obvious effect on the phenotype of the host plant, with the exception of the dsRNA in V. faba, which is associated with cytoplasmic male sterility (4,14). These large dsRNAs are not associated with distinct virus-like particles (2,13). They are present at a low, constant concentration in host plants (15) and replicate using their own RNA-dependent RNA polymerase (5). Although they are usually found in the cytoplasm of host plant cells, they are transmitted to progeny plants as effectively via pollen as via ova (2,15). Because all attempts to transmit dsRNA to dsRNA-free plants by means other than seeds (mechanical inoculation, graft transmission, and aphid transmission) failed, dsRNA is probably transmitted to progeny plants via seeds alone (2,9,15). Thus, these endogenous dsRNAs have some intriguing plasmid-like properties that differ from those of conventional plant RNA viruses (16).
Double-stranded RNA (about 14 kb) is a feature of many strains of temperate and tropical japonica rice (cultivated rice, Oryza sativa) and of one strain of wild rice (Oryza rufipogon W-1714, an ancestor of O. sativa). It is not found in any strains of indica rice (cultivated rice, O. sativa), which rarely hybridizes with japonica rice in the field (13). These dsRNAs occur in every tissue as well as at every developmental stage, and they are transmitted very efficiently (more than 98%) to progeny plants via seeds (15). The dsRNAs are maintained at an almost constant concentration (100 copies/cell) by host plants from generation to generation (15). However, dsRNA copy number increases about 10-fold when host cells are grown in suspension culture (15,17). The entire sequence (13,952 nucleotides (nt)) of the dsRNA from the temperate japonica rice (cv. Nipponbare, J-dsRNA) has been determined. It consists of a long open reading frame (ORF; 13,716 nt, 4,572 amino acid residues) containing the conserved motifs of RNA-dependent RNA polymerase (RdRp) and RNA helicase (18). The coding strand of J-dsRNA contains a site-specific discontinuity (nick) at position 1,211 nt from the 5′ end (19). This nick divides not only the coding strand of the dsRNA molecule into 1,211-nt 5′ and 12,741-nt 3′ fragments but also the long ORF into 1,045-nt (348 amino acids) 5′ and 12,671-nt (4,224 amino acids) 3′ segments.
The nucleotides of the dsRNA of tropical japonica rice (cv. Gendjah Gempel, T-dsRNA) and the dsRNA of wild rice (O. rufipogon W-1714, W-dsRNA) have been partially sequenced (17). A comparison of the nucleotide and deduced amino acid sequences of the core regions of the RdRp domains found in these three dsRNAs indicates that J-dsRNA (temperate japonica rice) is more similar to T-dsRNA (tropical japonica rice) than to W-dsRNA (wild rice). The cytoplasmic inheritance of these dsRNAs is unusual in some F1 hybrids when they are introduced into F1 hybrids by crossing japonica rice (O. sativa) and wild rice (O. rufipogon). The evolutionarily related dsRNAs were incompatible, and the resident dsRNA of an egg cell from cultivated rice was excluded by the incoming dsRNA of a pollen cell from wild rice in some F1 plants. Coexisting dsRNAs in the F1 hybrids segregated from each other in the F2 plants. However, the total amounts of these dsRNAs in the host cells remained constant (approximately 100 copies/cell) even in interspecific hybrid rice (17). Stringent regulation of the dsRNA copy number might be responsible for this unusual type of inheritance.
Here we determined the entire nucleotide sequence of the second dsRNA from wild rice (W-dsRNA). We detected a site-specific discontinuity (nick) in the coding strand, then compared it with the dsRNAs of temperate (J-dsRNA) and tropical (T-dsRNA) japonica rice. Furthermore, we discovered remarkable increases in the dsRNA copy number of pollen grains and an unusual mode of dsRNA inheritance in interspecific F2 and F3 hybrids between O. sativa and O. rufipogon.

EXPERIMENTAL PROCEDURES
cDNA Cloning of dsRNA-The dsRNAs were extracted from 14-day-old seedlings or mature leaves using SDS-phenol, fractionated by column chromatography on CF-11 cellulose (Whatman, Maidstone, UK) as described by Morris and Dodds (20), then incubated with DNase I. A series of overlapping cDNA clones that covered the entire sequence of the dsRNA of wild rice (W-1714) was obtained by the method of Gubler and Hoffman (21) using oligonucleotide primers synthesized with reference to the J-dsRNA sequence (18). Complementary DNA clones corresponding to the terminal regions of W-dsRNA and T-dsRNA were generated by 5′ rapid amplification of cDNA ends (22).
Reverse Transcriptase-Polymerase Chain Reaction (PCR)-The strand containing the site-specific nick was determined by reverse transcriptase-PCR. Oligonucleotides WN161 (5′-CCTTGGAGGTGTGGTGTATGT-3′), located at nt 1,067-1,087, WN184 (5′-GTCCTTAAATCTAGGGACAACA-3′), complementary to nt 1,702-1,723, and W184 (5′-CTGTGCCAGTGTTATCCCTGA-3′), complementary to nt 1,734-1,754 from the 5′ end of the coding strand of W-dsRNA, were used as PCR primers. When the coding strand was the template for cDNA synthesis, W184 was used as the primer for cDNA synthesis, then WN161 and WN184 were used as PCR primers. When the noncoding strand was the template, WN161 was used as the primer for cDNA synthesis, then WN161 and WN184 were used as PCR primers. Products of PCR reactions were analyzed by agarose gel electrophoresis.
DNA Sequencing-DNA sequences were determined by dideoxynucleotide chain termination using a 7-deaza Sequenase Ver. 2.0 DNA sequencing kit (U. S. Biochemicals Corp.) or the BigDye™ Cycle Primer Sequencing FS Ready reaction kit (Applied Biosystems, Foster City, CA).
Analysis of Nucleotide Sequences-Nucleotide and amino acid sequences were analyzed using the SDC-GENETYX genetic information-processing program (Software Development Co., Ltd., Tokyo, Japan). Secondary structures of the 5′- and 3′-noncoding regions of the three dsRNAs were predicted using the RNAdraw program (23).
Copy Number of dsRNA-Seedlings and pollen grains were thoroughly ground in liquid nitrogen using a mortar and pestle, then total nucleic acids were extracted using SDS-phenol. Total nucleic acids were resolved by agarose gel electrophoresis, then the gels were stained with ethidium bromide (500 ng/ml). Band intensities of dsRNA and DNA were analyzed using the NIH image program. The dsRNA copy number was estimated from a comparison of the band intensity and genome size of dsRNA (14 × 10³ nt) with those of rice DNA (4.3 × 10⁸ nt/haploid genome) (15,24).
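The copy number estimate described above can be illustrated with a minimal Python sketch. It assumes that ethidium bromide band intensity is proportional to nucleic acid mass and that dsRNA and DNA stain equally; neither assumption is stated in the text. The function name and the example intensities are hypothetical, and the result is expressed per haploid genome equivalent rather than per cell.

```python
def estimate_copies_per_haploid_genome(
    dsrna_intensity: float,
    dna_intensity: float,
    dsrna_size_nt: float = 14e3,    # size of the rice dsRNA (from the text)
    genome_size_nt: float = 4.3e8,  # rice haploid genome size (from the text)
) -> float:
    """Estimate dsRNA copies per haploid genome from gel band intensities.

    Assumption (not from the source): band intensity is proportional to
    nucleic acid mass, with equal staining of dsRNA and DNA.
    """
    if dna_intensity <= 0:
        raise ValueError("dna_intensity must be positive")
    # Mass ratio of dsRNA to genomic DNA in the same lane.
    mass_ratio = dsrna_intensity / dna_intensity
    # Convert the mass ratio to a molar (copy) ratio using molecule sizes.
    return mass_ratio * genome_size_nt / dsrna_size_nt


# Hypothetical intensities (arbitrary units), chosen only for illustration.
leaf_copies = estimate_copies_per_haploid_genome(0.33, 100.0)
pollen_copies = estimate_copies_per_haploid_genome(3.3, 100.0)
print(f"leaf: {leaf_copies:.0f} copies, pollen: {pollen_copies:.0f} copies")
```

Because the conversion is linear, a 10-fold increase in the dsRNA-to-DNA intensity ratio corresponds to a 10-fold increase in the estimated copy number.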
Inheritance of dsRNA-Cultivars Nipponbare (temperate japonica rice, O. sativa) and W-1714 (wild rice, O. rufipogon) were crossed as described (15,17). The dsRNAs (J-dsRNA or W-dsRNA) in F1, F2, and F3 plants were identified by Northern hybridization using J-dsRNA- or W-dsRNA-specific probes under high stringency as described previously (17). The specific probes did not cross-hybridize under these conditions.

RESULTS
Nucleotide Sequence of W-dsRNA-The entire sequence (13,936 nt) of the dsRNA of wild rice (O. rufipogon W-1714; W-dsRNA) was determined from a series of independent and overlapping cDNA and rapid amplification of cDNA end clones. A single long ORF of 13,719 nt (4,573 amino acids) was found in one (coding) strand of W-dsRNA. The noncoding region preceding the ORF was 166 nt long, and an AUG codon was located at nt 167-169 from the 5′ end of the coding strand. The termination codon at nucleotide positions 13,885-13,887 from the 5′ end of the coding strand was followed by a noncoding region 48 nt long (Fig. 1). W-dsRNA did not have a poly(A) tail, and no ORF of significant size was found in the other (noncoding) strand. The AU content of W-dsRNA was high (64.95%), and analysis of the codon usage in the ORF indicated a significant bias toward codons with A or U as the third letter (data not shown).
FIG. 1. A long ORF of 13,719 base pairs (bp), a 166-base pair 5′-noncoding leader sequence, and a 48-base pair 3′-noncoding sequence were found in the coding strand. Two cDNA clones (W161 and W149) used as probes to detect nicks (see Fig. 2) are represented by thick horizontal lines. Nucleotide sequence data reported here will appear in the DDBJ, EMBL, and GenBank™ nucleotide sequence data bases under the accession number AB014344. kbp, kilobase pairs; aa, amino acids.
Detection of the Nick in the Coding Strand of W-dsRNA-Purified W-dsRNA was resolved by electrophoresis on a denaturing agarose gel, then hybridized using two cDNA probes located between nt 770 and 1,723 (W161) and between nt 6,554 and 8,215 (W149) from the 5′ end of the coding strand of dsRNA (Fig. 1). These probes detected both (coding and noncoding) strands that were separated by denaturing agarose gel electrophoresis. A 14-kb band was detected when either W161 or W149 was the probe (Fig. 2A, lanes 1 and 2). In addition, a band of about 1.2 kb was detected when probed with W161 (Fig. 2A, lane 1) but not by either Northern hybridization using W149 as a probe (Fig. 2A, lane 2) or by electrophoresis on nondenaturing agarose gels stained with ethidium bromide (data not shown). Because J-dsRNA contains a site-specific nick in the coding strand (19), we expected that two bands (14 and 12.8 kb) would be detected when probed with W149. However, these two bands could not be completely separated by electrophoresis because of their similar molecular masses (Fig. 2A, lane 2). Indeed, two bands (14 and 12.8 kb) were detected using W149 when electrophoresis was prolonged (data not shown). Any probe located in the 5′ region upstream of nt 1,107 detected both the 1.2- and 14.0-kb bands but not the 12.8-kb band. Conversely, any probe located in the 3′ region downstream of nt 1,685 detected both the 14.0- and 12.8-kb bands but not the 1.2-kb band (data not shown). These experimental results were very similar to the findings for J-dsRNA, which has a site-specific nick on the coding strand at nt 1,211 from the 5′ end (19). The results indicated that W-dsRNA has a site-specific nick around nt 1,200 from the 5′ end.
Reverse transcriptase-PCR experiments were carried out to determine which strand contained the site-specific nick (see "Experimental Procedures"). When the template for cDNA synthesis was the noncoding strand, PCR products 687 nt long between nt 1,067 and 1,754 were amplified. In contrast, no band was amplified when the coding strand was used as the template (data not shown). These results indicated that cDNA could be synthesized across the nick position when the noncoding strand was the template but not when the coding strand was the template. Therefore, the coding strand of W-dsRNA contained a site-specific discontinuity (nick).
To define the exact site of the nick on the coding strand of W-dsRNA, we synthesized a 25-mer oligonucleotide primer that was complementary to positions between nt 1,396 and 1,420 from the 5′ end of the coding strand and performed primer extension (see "Experimental Procedures"). The main products of the reaction are shown in lane P in Fig. 2B. This result indicates that the nick was located between U at nt 1,197 and G at nt 1,198 from the 5′ end of the coding strand. This nick divides not only the coding strand of W-dsRNA into 1,197-nt 5′ and 12,739-nt 3′ fragments but also the long ORF into 1,031-nt (343 amino acids) 5′ and 12,688-nt (4,230 amino acids) 3′ segments.
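The fragment and ORF segment sizes quoted for the nick follow from simple coordinate arithmetic, sketched below in Python as a worked check. The function name is hypothetical. The inputs are the values given in the text (total length, nick position, 166-nt 5′-noncoding leader, ORF length); the sketch counts complete codons upstream of the nick and assigns the codon spanning the nick to the 3′ segment, which is an assumption that reproduces the reported amino acid counts.

```python
def split_at_nick(total_nt: int, nick_nt: int, leader_nt: int, orf_nt: int) -> dict:
    """Return strand fragment and ORF segment sizes on either side of a nick.

    nick_nt is the last nucleotide of the 5' fragment, counted from the 5' end.
    """
    orf_5p_nt = nick_nt - leader_nt   # ORF nucleotides upstream of the nick
    orf_3p_nt = orf_nt - orf_5p_nt    # ORF nucleotides downstream of the nick
    codons_5p = orf_5p_nt // 3        # complete codons before the nick
    codons_3p = orf_nt // 3 - codons_5p  # remaining codons (incl. the split codon)
    return {
        "fragment_5p_nt": nick_nt,
        "fragment_3p_nt": total_nt - nick_nt,
        "orf_5p_nt": orf_5p_nt,
        "orf_3p_nt": orf_3p_nt,
        "aa_5p": codons_5p,
        "aa_3p": codons_3p,
    }


# Values taken from the text for W-dsRNA and J-dsRNA.
w_dsrna = split_at_nick(total_nt=13_936, nick_nt=1_197, leader_nt=166, orf_nt=13_719)
j_dsrna = split_at_nick(total_nt=13_952, nick_nt=1_211, leader_nt=166, orf_nt=13_716)
print("W-dsRNA:", w_dsrna)
print("J-dsRNA:", j_dsrna)
```

Under these assumptions the sketch returns 1,197/12,739-nt fragments and 1,031/12,688-nt ORF segments (343 and 4,230 amino acids) for W-dsRNA, and 1,211/12,741-nt fragments and 1,045/12,671-nt ORF segments (348 and 4,224 amino acids) for J-dsRNA, matching the figures in the text.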
Comparison between W-dsRNA and J-dsRNA-The entire sequence of W-dsRNA (13,936 nt) and the deduced amino acid sequence of the ORF in W-dsRNA (13,719 nt, 4,573 amino acids) were compared with those of J-dsRNA (13,952 nt) and of the ORF in J-dsRNA (13,716 nt, 4,572 amino acids) (Fig. 3). The identities of the nucleotide sequences of these dsRNAs and of the deduced amino acid sequences of the ORFs were 75.5 and 79.6% (similarity: 96.4%), respectively. The consensus motifs of the RNA helicase and RdRp domains were found between amino acids 1,505 and 1,748 (nt 4,679 and 5,410) and between amino acids 4,234 and 4,476 (nt 12,866 and 13,594) from the N-terminal end of the ORF, respectively. These locations were almost identical to those of the RNA helicase and RdRp domains in J-dsRNA. Identities of the amino acid sequences of the RdRp domain (94.7%) and the RNA helicase domain (83.1%) were much higher than the average of the entire ORF (79.6%). We divided the entire nucleotide sequences of the rice dsRNAs into 120-nt windows from the 5′ end, then compared the nucleotide and amino acid sequences of J-dsRNA and W-dsRNA window by window (120 nt, 40 amino acids; Fig. 3). The amino acid sequence in the RdRp region was more highly conserved between J-dsRNA and W-dsRNA than that of other regions, except for the region around the nick.
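The window-by-window comparison can be sketched in Python as follows. This is only an illustration of the general idea: it assumes the two sequences have already been aligned (with "-" for gaps), whereas the text does not describe how GENETYX handled alignment or gaps. The function name and the toy sequences are hypothetical.

```python
def windowed_identity(aligned_a: str, aligned_b: str, window: int = 120):
    """Percent identity of two pre-aligned sequences in consecutive windows.

    Returns a list of (1-based window start, percent identity) tuples.
    Columns where both sequences have a gap are ignored; a gap opposite a
    residue counts as a mismatch (an assumption, not taken from the source).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    profile = []
    for start in range(0, len(aligned_a), window):
        pairs = [
            (x, y)
            for x, y in zip(aligned_a[start:start + window],
                            aligned_b[start:start + window])
            if not (x == "-" and y == "-")
        ]
        if not pairs:
            continue  # window consisted only of gap-gap columns
        matches = sum(1 for x, y in pairs if x == y)
        profile.append((start + 1, 100.0 * matches / len(pairs)))
    return profile


# Toy aligned sequences with a small window, for illustration only.
toy_profile = windowed_identity("AUGGCUAAGG", "AUGCCU-AGG", window=4)
print(toy_profile)
```

With real data, `window=120` would be applied to the aligned nucleotide sequences and `window=40` to the aligned amino acid sequences, giving one identity value per segment as plotted in a profile such as Fig. 3.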
The coding strands of J-dsRNA and W-dsRNA had site-specific nicks at nt 1,211 and nt 1,197 from their 5′ ends, respectively (Fig. 3). As shown in Fig. 4A, the amino acid sequences around each nick were highly conserved. We investigated the conservation of the nucleotide and amino acid sequences around the nick in detail by comparing the surrounding regions (nt 840-1,560 from the 5′ end, amino acids 226-466 from the N-terminal end), which were equivalent in size to the RdRp or RNA helicase domain, between J-dsRNA and W-dsRNA. The nucleotide and amino acid sequences were 82.9 and 95.4% identical, respectively. These values were similar to those of the RdRp domain and higher than those of the RNA helicase domain, and the highly conserved regions continued across the nick (Fig. 4A).
By contrast, two less conserved regions were found at nt 2,760-3,240 and nt 8,040-8,520 from the 5′ end. The nucleotide (deduced amino acid) sequences were 58.4% (43.1%) and 51.6% (26.9%) identical, respectively (Fig. 3). In these regions, not only nonsynonymous base substitutions but also deletions and insertions of nucleotides were found, which caused changes in the amino acid sequences of the ORF (Table I). Although many base substitutions and several deletions and insertions of nucleotides were found, no termination codon was generated within either ORF.
Secondary Structures of Noncoding Regions in Rice-The sequences of the 5′-noncoding regions (166 nt) and the location of the start codon (AUG) were conserved among the three dsRNAs (J-dsRNA, W-dsRNA, and T-dsRNA; Fig. 4B). Five stem and loop structures (SL1-SL5) in the 5′-noncoding region and one additional stem and loop structure containing a start codon (SL6) of the three dsRNAs are shown in Fig. 5. SL1 and SL2 were not well conserved among the three dsRNAs, and SL2 was not found in J-dsRNA. SL3, SL4, SL5, and SL6 were highly conserved among the three dsRNAs. In particular, SL3 (nt 55-82), which has a remarkably GC-rich stem, was identical among the three dsRNAs. SL3 is presumably the most stable of the six stem and loop structures, because the hydrogen bonds of the GC-rich stem are very strong. The 3′-noncoding region of W-dsRNA (48 nt) was shorter than that of J-dsRNA (70 nt) (18). A comparison of the 3′-noncoding sequences of J-dsRNA and W-dsRNA revealed a long deletion (18 nt) in the 3′-noncoding region of W-dsRNA (Fig. 4C). No conserved secondary structure was found in the 3′-noncoding regions.
Increase of dsRNA Copy Number in Pollen Grains-Rice dsRNA (J-dsRNA) is efficiently transmitted to progeny plants via pollen although it is localized in the cytoplasm (15). We hypothesized that the dsRNA copy number in pollen grains is responsible for the paternal inheritance of J-dsRNA. A comparison of the amounts of dsRNA and rice DNA showed that the estimated copy numbers of the three dsRNAs (J-, T-, and W-dsRNA) in leaves, roots, or seedlings (Fig. 6, lanes 1-3) were about 100 copies/cell (15). However, the copy numbers of these dsRNAs were increased in pollen grains (Fig. 6, lanes 4-6). The band intensity of dsRNA was compared with that of DNA in each sample (lanes 4-6 in Fig. 6). From the genome sizes of rice DNA (4.3 × 10⁸ nt) (24) and dsRNA (14 × 10³ nt), the estimated copy numbers of these dsRNAs in pollen grains were more than 1,000 copies/cell. The concentration of dsRNA was similarly high in samples extracted from the three rice cultivars.
Inheritance of the dsRNAs in Interspecific Hybrids between     (Table II). To investigate the inheritance of each dsRNA in hybrids from generation to generation, the dsRNAs in F2 and F3 plants, which were self-pollinated progenies of F1 hybrids, were subjected to Northern analysis under high stringency conditions. When the recipient (maternal parent) was Nipponbare and the pollen donor (paternal parent) was W-1714, the efficiency of seed-mediated transmission of dsRNAs from F1 plants was low in some F2 plants; the rates of dsRNA transmission from individual F1 plants to F2 plants varied from 0 to 100% (Table II). When dsRNA-free plants of Nipponbare were used as recipients, the efficiency of transmission was also low in some F2 plants (Table II). Once the F2 plants harbored the dsRNAs, the efficiency of dsRNA transmission to F3 plants increased again (78/84 = 93%). When the recipient was W-1714 and the pollen donor was Nipponbare, only W-dsRNA was detected in F1 plants, and the efficiency of seed-mediated transmission of dsRNA from these F1 plants was high in all F2 plants (100%, Table II).

DISCUSSION
W-dsRNA is very similar to J-dsRNA with respect to the sizes of their genomes and their ORFs, the locations of the RNA helicase and RdRp domains, and the sites of the specific nicks (Fig. 3). Furthermore, the stem and loop structures in the 5′-noncoding regions were conserved in both dsRNAs (Fig. 5). Several characteristic stem and loop structures found in the 5′-noncoding region may function as cis-acting elements for their replication, transcription, and/or translation. By contrast, the 3′-noncoding region of W-dsRNA has a long deletion in comparison with J-dsRNA (18 nt, Fig. 4B). In our recent study of J-dsRNA and W-dsRNA inheritance, W-dsRNA appeared to replicate more effectively than J-dsRNA in F1 hybrids between temperate japonica rice and wild rice (17).
The 3′-noncoding regions of J-dsRNA and W-dsRNA might be responsible for the difference in their ability to replicate, because noncoding (−) strand synthesis must start at the 3′-terminal region of the coding (+) strand. The two dsRNAs differ at the nucleotide level by 3,418 nucleotides (24.5%) of the entire sequence. Because cultivated rice (O. sativa) and wild rice (O. rufipogon) diverged several thousand years ago (24), about one nucleotide substitution each year has accumulated during the evolution of these dsRNAs.
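The order-of-magnitude estimate above can be reproduced with a short Python sketch. The sequence length and identity are taken from the text; the divergence times are hypothetical values standing in for "several thousand years", which the text does not specify further, and the function names are illustrative only.

```python
J_DSRNA_LENGTH_NT = 13_952    # length of J-dsRNA (from the text)
NUCLEOTIDE_IDENTITY = 0.755   # J-dsRNA vs. W-dsRNA identity (from the text)


def nucleotide_differences(length_nt: int, identity: float) -> int:
    """Number of differing positions implied by a percent identity."""
    return round(length_nt * (1.0 - identity))


def substitutions_per_year(differences: int, divergence_years: float) -> float:
    """Average accumulated differences per year since divergence."""
    if divergence_years <= 0:
        raise ValueError("divergence_years must be positive")
    return differences / divergence_years


differences = nucleotide_differences(J_DSRNA_LENGTH_NT, NUCLEOTIDE_IDENTITY)
print(f"differences: {differences} nt")

# Hypothetical divergence times, chosen only to bracket "several thousand years".
for years in (2_000, 3_500, 5_000):
    rate = substitutions_per_year(differences, years)
    print(f"{years} years -> {rate:.2f} differences per year")
```

For any divergence time in this assumed range, the average comes out within a factor of about two of one difference per year, consistent with the rough figure given in the text.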
Because the amino acid sequences across the nick (nt 840-1,560 from the 5′ end, amino acids 226-466 from the N-terminal end) were highly conserved, they might be a functional   (Figs. 3 and 4A), an unknown translational product containing this amino acid sequence may be important as well as an RdRp. Once the nick occurs in the coding strand of the dsRNA, the divided coding strand can no longer be used as a template for noncoding (−) strand synthesis in its replication cycle or as an mRNA for the full-length form of its putative functional protein.
Some endogenous RNA replicons, such as the yeast 20 S and 23 S RNAs (= yeast T and W dsRNAs), encode single ORFs containing RdRps (25). Although many base substitutions and several deletions and insertions of nucleotides were found in the two rice dsRNAs, no termination codon was generated within either ORF (Table I and Fig. 3). This suggests that a single long ORF is also important for the strategy of gene expression (translation) in rice endogenous dsRNAs.
The dsRNAs isolated from mature leaves, roots, or seedlings (Fig. 6, lanes 1-3) were detected at a concentration of approximately 100 copies/cell (15). The dsRNA copy number increased more than 10-fold in pollen grains of the three rice plants (Fig. 6, lanes 4-6). This remarkable increase may explain the high efficiency of J-dsRNA transmission via pollen that we reported (15). There is no evidence for the horizontal transmission of rice dsRNAs, so their propagation seems to depend on steady replication before every host cell division (not only mitosis but also meiosis) and their steady (efficient) transmission to the next generation via not only eggs/ova but also pollen. Mating of host plants must be an opportunity for dsRNA propagation. An increase in the dsRNA copy number only in pollen grains must be a reasonable strategy for their efficient transmission to progenies despite their cytoplasmic localization (15). This phenomenon may be reminiscent of the meiotic derepression of copy numbers seen in the 20 S and 23 S RNA replicons of yeast (26,27). These rice dsRNAs apparently lack the capacity for horizontal transmission that almost all viruses have; instead, they are transmitted vertically via both eggs and pollen, a capacity that almost all viruses lack.
The two endogenous dsRNAs in rice (J-dsRNA and W-dsRNA) described here have unique properties, such as the single large ORF and the site-specific nick in the coding strand. The 16.7-kb dsRNA found in V. faba contains the single long ORF and the site-specific nick (28). Furthermore, the deduced amino acid sequence of the ORF is similar to those of the ORFs in rice dsRNAs (29). These three large dsRNAs might constitute a novel virus family. Furthermore, large endogenous dsRNAs with unknown sequences identified in other plants (see the Introduction) may also belong to this family.