Cloning and characterization of three new murine genes encoding short homologues of RNase P RNA.

Three novel genes encoding small RNAs homologous to human and mouse RNase P RNA have been isolated from a mouse genomic library. As assessed by Northern blot analysis and nuclease protection assays, transcripts derived from one or more of these genes are expressed in murine cells and tissues. The RNA products of these RNase P RNA-homologous genes are smaller in size (238-248 nucleotides) than the 305-nucleotide transcript previously identified. These smaller transcripts are uniformly less abundant than the larger RNase P RNA, but their expression varies severalfold among different mouse tissues. Similar short homologues of RNase P RNA also are expressed in rat, rabbit, and human cells. We conclude that higher vertebrates express multiple isoforms of RNase P RNA.

RNase P is a site-specific endoribonuclease that cleaves tRNA precursor molecules to generate the 5′ termini of mature tRNAs in both prokaryotic and eukaryotic cells (1-3). The holoenzyme is a ribonucleoprotein, the RNA subunit of which, termed RNase P RNA (RPR), exhibits considerable variability in size among different species, ranging from 140 to 490 nucleotides in length (4-6). The RNA moiety alone, as isolated from Escherichia coli or Bacillus subtilis, is capable of catalyzing the site-specific cleavage reaction in vitro in the absence of its apoprotein (7, 8). Eukaryotic RNase P, in contrast, requires assembly of the holoenzyme for activity (9-12).
In yeasts, RNase P is compartmentalized; both the apoprotein and RNA components of mitochondrial RNase P are distinct and arise from different genes than those encoding components of the nuclear enzyme (4, 13-16). In Saccharomyces cerevisiae, the protein subunit of mitochondrial RNase P is derived from a nuclear gene (15, 16), whereas the mitochondrial RPR is encoded within the mitochondrial genome (4). In mammalian cells, RNase P activity is present in both nuclear and mitochondrial fractions (11, 17), but only a single form of RPR has been identified (18, 19), and no RPR-homologous sequences can be identified within the more compact mammalian mitochondrial genome.
The present study was designed to test the hypothesis that mammals express multiple isoforms of RPRs. We screened a mouse genomic library using a PCR-amplified mouse RPR sequence as the probe. Three new genes encoding homologues of RNase P RNA (RPRH) were isolated and sequenced. All of these encode RNA molecules with a high degree of sequence identity to the human and mouse RPR genes published previously (18,19) but are smaller in size. Transcripts derived from one or more of these novel genes are expressed in murine cells, and multiple sizes of RPR-related transcripts are expressed in other mammalian species.

EXPERIMENTAL PROCEDURES
Cloning-The 305-base pair mouse RPR coding region was PCR-amplified with two primers (RP5′, ATAGGGCGGAGGAAGCTCATCA, and RP3′, ACCAAAAATGGGCGGAGGAGAGTAGTC) synthesized from the human RPR sequence (18). A 129SV mouse genomic library (Stratagene, La Jolla, CA) was screened using the primer-extended RPR probe (20). Duplicate nylon membrane filters were lifted from the phage library plates, hybridized to the mouse RPR probe in 2 × PIPES/0.5% SDS/100 µg of denatured salmon sperm DNA/ml, and washed at low and high stringencies (i.e. 0.1 × SSC (0.15 M NaCl, 0.015 M sodium citrate)/0.1% SDS at 57 or 62°C). The purified clones were further evaluated by a PCR-based assay using the RPR primers, RP5′ and RP3′. Clones that were positive by plaque hybridization but negative in the PCR assay (no amplification of a 305-base pair RPR fragment) were chosen for detailed analysis. DNA was prepared from those clones (21) and characterized by restriction mapping and Southern blot hybridization. DNA fragments containing RPRH sequences were subcloned into pBluescript or pUC18 vectors and sequenced using an automatic sequencing system (Applied Biosystems, Inc., Foster City, CA). The sequencing data were compiled using DNASTAR (DNASTAR, Inc., Madison, WI) and GCG programs (Genetic Computer Group, Inc., Madison, WI).
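The screening criterion above (positive by hybridization, negative by PCR) can be illustrated with a minimal in silico sketch. It assumes that RP3′ anneals as the reverse primer, so that its reverse complement appears on the sense strand, and that a clone fails to amplify when one primer-binding site is absent. The templates are toy sequences built only for illustration, and the function names are hypothetical; none of this is taken from the cloned genes.

```python
# Toy in silico PCR check; templates below are invented for illustration only.

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))

def amplicon_length(template, fwd, rev):
    """Length of the product expected from exact primer matches, or None.

    Assumes fwd matches the sense strand and rev anneals to it, so the
    reverse complement of rev is searched downstream of fwd.
    """
    start = template.find(fwd)
    if start == -1:
        return None
    rev_site = reverse_complement(rev)
    end = template.find(rev_site, start + len(fwd))
    if end == -1:
        return None
    return end + len(rev_site) - start

RP5 = "ATAGGGCGGAGGAAGCTCATCA"        # primer sequences as given in the text
RP3 = "ACCAAAAATGGGCGGAGGAGAGTAGTC"

# Hypothetical full-length template: both primer sites, 305 bp apart end to end.
filler = "G" * (305 - len(RP5) - len(RP3))
toy_full = "CC" + RP5 + filler + reverse_complement(RP3) + "AA"

# Hypothetical truncated template: the 3' primer site is missing.
toy_short = "CC" + RP5 + "G" * 200 + "TTTT"

full_product = amplicon_length(toy_full, RP5, RP3)
short_product = amplicon_length(toy_short, RP5, RP3)
print(full_product, short_product)
```

Under these assumptions the full-length toy template yields a 305-base pair product and the truncated one yields none, which mirrors the logic used to pick candidate clones.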
Northern Blot Hybridization-Total RNAs were isolated from cultured cells or animal tissues by CsCl gradient precipitation following direct lysis in 4 M guanidinium thiocyanate and 10 mM Tris-HCl, pH 8.0 (22). The RNA samples were denatured in 50% formamide at 85°C, electrophoresed through 6% denaturing polyacrylamide gels containing 1 × TBE buffer (0.89 M Tris boric acid, 0.02 M EDTA, pH 8.3) and 8 M urea, electroblotted onto nylon membranes in 0.5 × TBE at 25 volts for 1 h using an electrotransferring apparatus (PROTEAN II, Bio-Rad, Hercules, CA), and hybridized with RPR or RPRH cDNA probes as described previously (23).
Southern Blot Hybridization-Genomic DNA was isolated from mouse Sol 8 myogenic cells, rat and rabbit kidney, and human WI-38 cells (a diploid cell line), digested with restriction enzymes, and electrophoresed through 0.8% agarose gels. Hybridization was performed in the same buffer as Northern blot hybridization. Filters were washed in 0.1 × SSC/0.1% SDS at various temperatures.
S1 Nuclease Mapping-Single-stranded DNA probes were prepared from the RPRH4 gene using a unique EcoRI site for end labeling. The probe for 5′ end mapping was made from a PCR fragment encompassing the RPRH4 gene, which was digested with EcoRI and labeled at the 5′ end by T4 kinase. The antisense DNA strand extending from nt 207 to nt 417 (numbered as in Fig. 2) was purified through a denaturing polyacrylamide gel. The probe for 3′ end mapping was made from a 950-base pair PvuII fragment containing the RPRH4 gene, which was used as the template for unidirectional PCR with an M13 forward primer. The antisense strand of the DNA fragment was replaced by the newly synthesized PCR strand that is 70 nucleotides shorter than the
The nucleotide sequences reported in this paper have been submitted to the GenBank™/EMBL Data Bank with accession numbers U31003 (RPRH2), U31227 (RPRH3), and U31228 (RPRH4).

RESULTS
Identification and Cloning of RPR-related Genes-To assess the possibility that the mouse genome includes multiple genes homologous to RNase P RNA, we performed genomic Southern blot hybridization under various stringencies. Following digestion with each of several restriction enzymes, multiple bands remained after washing under high stringency conditions (Fig. 1). Because the cleavage sites of the restriction enzymes used in Fig. 1 are not present in the mouse 305-nt RPR sequence (19), we estimated, based on the number and intensity of the bands, that the mouse genome was likely to contain at least four RPR isogenes.
Eight phage clones were isolated by screening a mouse genomic library. Restriction mapping and Southern blot hybridization analysis indicated that these clones contained murine DNA inserts that could be grouped into four nonoverlapping genomic sequences. Three of these clones included BamHI fragments of 0.95, 2.3, and 5.2 kilobase pairs, respectively (data not shown), identical in size to genomic BamHI fragments seen in the original Southern blots (Fig. 1). Approximately 1 kilobase pair from each of these three clones (termed RPRH2, RPRH3, and RPRH4) was sequenced in both orientations. A portion of the sequence data is shown in Fig. 2.
Sequence Analysis of Three RPRH Genes-As shown in Fig. 2, all three RPRH genes include a putative RNA coding region that is closely related to human and mouse RNase P RNA (18, 19). The putative RPRH transcripts start immediately following a TATA-like motif and end at a putative termination signal (TTTT) for transcription by RNA polymerase III. The predicted sizes of the transcribed products of these genes are 239 nt for RPRH2 and 238 nt for RPRH3 and RPRH4, as compared with 305 nt for the published RPR gene. Within the regions predicted to be transcribed, the three RPRH genes and RNase P RNA share 88-92% homology to each other. RPRH3 and RPRH4 are very closely related, not only throughout the RNA coding region but in the 5′ and 3′ flanking sequences as well.
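A pairwise figure such as the 88-92% quoted above is typically a percent identity computed over an alignment. The sketch below shows one common convention, assumed here for illustration: identical positions divided by aligned positions, with columns containing a gap in either sequence skipped. The text does not state which convention was used for the reported values, and the toy sequences are not taken from the RPR or RPRH genes.

```python
# Percent identity over a pre-aligned pair of sequences (gaps written as "-").
# The gap-skipping convention is an assumption, not the paper's stated method.

def percent_identity(aligned_a, aligned_b):
    """Return identical positions as a percentage of gap-free aligned columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if x.upper() == y.upper():
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared

# Toy example: one mismatch in ten aligned positions.
toy_identity = percent_identity("ACGTACGTAC", "ACGTTCGTAC")
print(toy_identity)
```

Counting gap columns as mismatches instead would lower the value, so any comparison between reported identities should use the same convention throughout.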
The RPRH2 gene only partially resembles RPRH3 and RPRH4 in the 5′ flanking region; the sequence similarity collapses entirely at positions more than 261 nt upstream from the putative transcriptional start site (Fig. 2). Proximal and distal enhancer elements characteristic of small RNA genes transcribed by RNA polymerase III (25, 26) and present in the human RPR gene (18, 27) could not be identified in any of the three RPRH genes, suggesting that expression of these genes may be controlled in a distinctive manner.
Characterization of RPRH Transcripts-Northern blot hybridization of total RNAs isolated from mouse tissues showed three major RNA species estimated to be 305, 248, and 238 nucleotides in length, respectively (Fig. 3). The largest RNA species matches the size of the published mouse RNase P RNA and uniformly gave the strongest hybridization signal. The 238-nt RNA species coincides with the size predicted for transcript(s) encoded by one or more of the novel RPRH genes we have identified. The third RNA species (248 nt) is slightly larger than that predicted for RPRH gene transcripts but may arise from variations in the transcriptional start or termination sites. Other, smaller, RNA bands are inconsistently present and are likely to represent degradation products. Quantitation of the hybridization signals indicates that the 238- and 248-nt RNAs are at least 3-fold more abundant in kidney and liver than in heart and skeletal muscle, whereas the 305-nt RNA is expressed at relatively high levels in all tissues.
Putative TATA boxes and transcription termination signals are underlined. Asterisks indicate the 5′ and 3′ termini of the RPRH4 transcript, as mapped by S1 nuclease protection assays. Nucleotide positions in the RPRH genes that differ from the published mouse RPR sequence are shown in lowercase, and nucleotides within the 5′ and 3′ flanking regions at which complete divergence begins are italicized.
To verify that the 238- and 248-nt RNAs are indeed transcribed from RPRH gene(s) and not generated by partial degradation of the 305-nt RNase P RNA, we performed S1 nuclease protection mapping using two single-stranded antisense DNA probes prepared from the RPRH4 gene (Fig. 4). The 5′ end protection probe, extending 210 nucleotides upstream from the EcoRI site, detected two transcription start sites (Fig. 4A). One of the start sites is located 127/126 nt upstream from the EcoRI site, a position that corresponds to the 5′ end of the 238-nt RNA transcript predicted from the RPRH4 gene. The other start site is located 135-138 nt from the EcoRI site, corresponding to the 5′ end of a 248-nt RNA transcript predicted from the RPRH4 gene. The 3′ end protection probe, extending 347 nucleotides downstream from the EcoRI site, detected an RNA transcript ending 112 nt downstream from the EcoRI site (Fig. 4B), consistent with the predicted 3′ end of the RPRH4 transcript.
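The mapped distances can be checked against the transcript sizes seen on Northern blots with simple arithmetic. The sketch below assumes that the 5′ and 3′ distances are both measured from a single reference point at the EcoRI site and can be summed directly; the text does not say how the bases of the recognition site itself are counted, so the totals are approximate to within a nucleotide or so. The function name is hypothetical.

```python
# Sum 5' and 3' protected distances around the EcoRI site to get implied
# transcript lengths. Direct summation is an assumption about how the
# distances in the text were measured.

def implied_lengths(five_prime_distances, three_prime_distance):
    """Return one implied transcript length per mapped 5' distance."""
    return [d + three_prime_distance for d in five_prime_distances]

THREE_PRIME_NT = 112  # 3' end mapped 112 nt downstream of the EcoRI site

# Proximal start site, mapped at 126-127 nt upstream.
proximal = implied_lengths([126, 127], THREE_PRIME_NT)

# Distal start site, mapped at 135-138 nt upstream.
distal = implied_lengths(range(135, 139), THREE_PRIME_NT)

print(proximal)  # lengths implied by the proximal start site
print(distal)    # lengths implied by the distal start site
```

Under this assumption the proximal start gives 238-239 nt and the distal start gives 247-250 nt, which brackets the 238- and 248-nt species observed on the blots.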
These fragments of probes based on the RPRH4 gene, when bound to their complementary RNA sequences from murine cells, were resistant to high concentrations of S1 nuclease, providing evidence that the 238- and 248-nt RNA species are not degradation products of the 305-nt RPR transcript but bona fide products of the RPRH4 gene. The 3′ end analysis further supports this conclusion in that the predicted hybridization product formed between the RPRH4 probe and the 305-nt RPR transcript (115 nt) is present and abundant at low concentrations of S1 nuclease (Fig. 4B) but disappears at the higher concentrations of S1 nuclease necessary to digest single base mismatches with the probe. Spatial relationships between these probes and transcripts derived from the RPR and RPRH4 genes are summarized in Fig. 4C.
Multiple Isoforms of RPR-related Genes and RNAs in Other Mammals-To investigate the distribution of the RPR-homologous genes in other mammalian species, we performed genomic Southern blot hybridization by using restriction enzymes that do not cut the published RPR sequences (18, 19). Southern blot analyses demonstrated that multiple bands are present in rat and rabbit DNAs digested by various enzymes (Fig. 5, A and B). We estimate that rat, like mouse, has four RPR isogenes and that rabbit has at least two RPR isogenes. The human RPR cDNA detected at least two bands (Fig. 5C): a strong band corresponding to the RNase P RNA gene and a weak band that may represent an RPR-related gene.
FIG. 4. S1 nuclease protection mapping of the RPRH4 transcript. A, mapping of the 5′ end. Lane 1, probe only; lane 2, no RNA added but with 400 units of S1 nuclease; lanes 3-5, each reaction contained 30 µg of mouse liver RNA, 10^5 cpm of the 5′ end protection probe, and increasing amounts of S1 nuclease, as indicated. The sizes (nucleotide bases) of the protected fragments are indicated. B, mapping of the 3′ end. Lane 1, probe only; lane 2, no RNA added but with 400 units of S1 nuclease; lanes 3-6, each reaction contained 40 µg of mouse liver RNA, 10^5 cpm of the 3′ end protection probe, and increasing amounts of S1 nuclease, as indicated. C, schematic presentation showing the S1 nuclease protection probes and the 5′ and 3′ ends of RPRH4 transcripts protected from S1 nuclease digestion. The sizes of the probes and the predicted 5′ and 3′ ends are indicated as nucleotide bases. Cleavage sites at the termini or at sites of sequences mismatched between the 305-nt RPR transcript and the probes based on the RPRH4 sequence are indicated (open triangles).
Northern blot hybridization using the mouse and human RPR sequence to probe RNAs isolated from rat, rabbit, and human cells also revealed at least two major RPR isoforms in each of these mammalian species: a larger RPR that corresponds to the 305-nt form in the mouse, and a smaller and less abundant form that corresponds to the RPRH transcripts we have identified in murine cells (Fig. 6). When human RNA and genomic DNA were probed with the mouse, instead of human, RPR sequence, only the RNase P RNA but not the RPR homologues was detected (data not shown), suggesting that the human RPRH is more diverged from the RNase P RNA than those in mouse, rat, and rabbit. In summary, these data indicate that most, if not all, mammalian species contain and express multiple genes encoding isoforms of RNase P RNA.

DISCUSSION
Genes encoding an RNase P RNA were cloned previously from human and mouse and encode transcripts ranging from 305 nt in mice to 341 nt in humans (18, 19). We have cloned three new murine genes homologous to the published RPR sequence but predicted to generate shorter RNA transcripts. Each of these RPRH genes includes a segment almost identical (88-92%) to the 5′ region of the previously described mouse RPR sequence but truncated by the presence of a transcriptional termination signal at a position corresponding approximately to nt 240 of the 305-nt RPR.
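How a termination signal sets the predicted transcript size can be sketched as a scan for the first run of four T residues downstream of a chosen start position. This is a simplification made for illustration: it assumes exact TTTT matching and counts the whole T run as part of the transcript, neither of which is specified in the text. The toy gene below is invented and is not an RPRH sequence, and the function name is hypothetical.

```python
# Predict a transcript by scanning for the first TTTT run after the start.
# Including the full T run in the transcript is an assumption.

def predicted_transcript(gene, start, terminator="TTTT"):
    """Return the sequence from start through the first terminator, or None."""
    stop = gene.find(terminator, start)
    if stop == -1:
        return None  # no terminator found downstream of the start
    return gene[start:stop + len(terminator)]

# Toy gene: a TATA-like motif, 234 nt of filler, a TTTT run, then flank.
toy_gene = "TATAAA" + "G" * 234 + "TTTT" + "ACGT"
toy_start = len("TATAAA")  # transcription assumed to begin right after the motif

toy_transcript = predicted_transcript(toy_gene, toy_start)
print(len(toy_transcript))
```

In this toy case the scan returns a 238-nt product; an earlier terminator in a real sequence would shorten the prediction in the same way that the RPRH genes are described as truncated relative to the 305-nt RPR.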
Two lines of evidence rule out the possibility that these genes are derived from a cloning artifact. First, BamHI fragments of clones that contain RPRH genes are matched in size to BamHI fragments detected in a mouse genomic Southern blot. Second, two of the novel RPRH gene sequences (RPRH2 and RPRH4) were present in two or more independent clones containing overlapping mouse genomic DNA fragments and an identical RPRH sequence. Other data demonstrate that the RPRH sequences that we have identified are not pseudogenes. Transcripts corresponding in size to the products predicted from RPRH genes are present in murine cells, and nuclease protection experiments confirm that the shorter forms of RPR-homologous RNA observed in Northern blots are not degradation products of the 305-nt RPR. Short forms of RPR-homologous RNA also are present in rat, rabbit, and human cells.
The structure and function of RNase P RNAs have been studied by computer modeling (28), chemical cleavage (29), nuclease protection (30), and mutational or phylogenetic analyses (31-34). Although the size and sequence of RNase P RNAs have diverged considerably during evolution, a similar three-dimensional structure appears to be conserved (1-3, 19, 28, 32, 33). The 305-nt mouse RPR is predicted to have the core structure common to other RNase P RNAs, which includes three major rings formed by internal base pairing (19). The tertiary structure is established through base pairing between the rings (19). Because of their shorter length, mouse RPRH transcripts may form only two of the three rings of the common RNase P RNA core structure.
The functional properties of RPRH transcripts in mammalian cells have not yet been determined. Truncated forms of human RPRs retain enzymatic activity in reconstitution assays in vitro (35), suggesting that RPRH transcripts potentially serve as functional components of holoenzyme complexes. An interesting alternative possibility is that these naturally occurring truncated forms of RNase P RNA may be capable of binding to the apoprotein constituents of RNase P ribonucleoprotein complexes but are enzymatically inactive due to the disruption of the RNA tertiary structure. In this way, RPRH gene products could function as negative regulators of RNase P. It should be possible, in future studies, to assess the functional characteristics of the RPRH gene products that we have identified, as well as their protein subunit and subcellular distribution. Because yeast expresses different forms of RNase P RNA in mitochondria and the nucleus (4,13,14), perhaps one or more of the RPRH transcripts will have a similarly compartmentalized function.
Comparison of the primary sequences of the previously defined mouse RPR gene and the new RPRH genes also suggests certain interesting possibilities concerning their evolutionary origins. An analysis using a Jotun Hein alignment method (DNASTAR), as shown in Table I, indicates that all three RPRH genes are clearly related to each other but diverge considerably from the RPR gene downstream of their 3′ transcriptional termination signals. Within the group of RPRH genes, RPRH2 has diverged from the RPRH3 and RPRH4 genes both within the transcribed regions and within flanking sequences, whereas the latter two genes are almost identical throughout the entire 1-kilobase pair region we sequenced. These comparisons suggest that the present diversity within this gene family has arisen from sequential gene duplication events.
FIG. 5. Southern blot of genomic DNAs isolated from rat or rabbit kidney or human WI-38 cells. The rat (A) and rabbit (B) DNAs were hybridized with the mouse RPR cDNA probe. The human (C) DNA was hybridized with the human RPR cDNA probe. kb, kilobase pairs.
FIG. 6. Northern blot of RNAs isolated from mouse, rat, or rabbit kidney or human HeLa cells. The mouse, rat, and rabbit RNAs were hybridized with the mouse RPR cDNA probe. The human RNA was hybridized with the human RPR cDNA probe.
In summary, three novel genes encoding sequences homologous to RNase P RNA have been isolated from a mouse genomic library. Transcripts derived from one or more of these RPRH genes are expressed in murine cells, and multiple sizes of RPR-related transcripts are present in other mammalian species as well. We conclude that higher vertebrates express multiple isoforms of RNase P RNA.