The RNA subunit of ribonuclease P from the zebrafish, Danio rerio.

A simple strategy has been devised to identify the gene encoding the RNA subunit of RNase P from the zebrafish, Danio rerio. The sequence obtained by amplification of genomic DNA with primers based on sequences common to two other vertebrates was confirmed by reverse transcription and amplification of RNA from a partially purified preparation of the holoenzyme. The 5' and 3' ends were determined by cyclizing the RNA, followed by reverse transcription and sequencing across the ligated RNA junction. The zebrafish sequence is 63% identical to that of Xenopus laevis nuclear RNase P RNA and 69% identical to the human RNase P RNA. A consensus secondary structure was constructed based on these nucleotide identities and on the many compensatory base changes in several regions among these three RNAs. The strategy used to obtain the zebrafish sequence should be useful in deriving analogous gene sequences from diverse classes of eukaryotes.

Ribonuclease P is an essential ribonucleoprotein enzyme that catalyzes the endonucleolytic hydrolysis of 5'-leader sequences from precursor transfer RNA molecules (1,2). In Escherichia coli, the holoenzyme consists of one RNA molecule (M1 RNA) of 377 nucleotides and one protein subunit (C5 protein) of 119 amino acids. In all eubacteria so far examined, the RNA component is catalytic in vitro in the absence of its protein cofactor in buffers that contain high Mg2+ or monovalent cation concentrations (3). There is evidence that the protein both serves as an electrostatic shield between the substrate and the RNA enzyme (4) and increases the versatility of the holoenzyme by decreasing the Km and increasing kcat/Km for substrates other than precursor tRNAs (5). Thus, in eubacteria, the protein subunit plays the role of a cofactor for the catalytic RNA subunit. By contrast, no eukaryotic or archaeal RNase P RNA tested has yet demonstrated catalytic capability in the absence of protein in vitro.
Understanding the differences in the mechanism of action between eukaryotic RNase P RNAs and the catalytically active eubacterial analogues requires, in part, accurate models of the secondary and tertiary structures of these RNAs. Secondary structures can be inferred from a comparative phylogenetic analysis of several RNase P RNA sequences from related species (6) and can be confirmed by chemical and enzymatic mapping analyses (7). For eubacterial RNase P RNAs, several secondary structural domains have been assigned based on these strategies (2). The extensive work done in determining the secondary structure of eubacterial RNase P RNAs has led to two models for the three-dimensional structure of the E. coli homologue (8,9).
In order to develop three-dimensional models for various eukaryotic RNase P RNAs, secondary structural models must be validated. Among lower eukaryotes, a comparative structural analysis has been performed using several fungal (yeast) nuclear sequences (10). For higher eukaryotes, however, a phylogenetic analysis is not yet possible, since the RPR genes (genes coding for the RNA subunit of RNase P) from only a few classes have been described. These include Xenopus laevis (11), mouse and rat (12), human (13,14), and non-human primates (12). A reliable consensus secondary structure based on phylogenetic analysis requires a larger number of vertebrate sequences in order to identify true covariation of residues with confidence. Inferred structures can then be confirmed by detailed chemical and enzymatic mapping.
Here, we propose a simple strategy to identify, clone, and sequence full-length vertebrate RNase P RNA genes. The method exploits the few short regions of identical sequence within the known RPR genes of two distantly related vertebrates. Complementary oligonucleotide primers directed to these regions in other vertebrate genomic DNAs can be used to amplify new sequences between these consensus regions. The corresponding RNA sequence can be determined by reverse transcription of RNA isolated from partially purified RNase P holoenzyme, followed by amplification of the resulting cDNA. The remaining unidentified sequence at the 5' and 3' termini can be obtained by cyclizing the RNA isolated from the purified holoenzyme, followed by RT-PCR across the ligated ends. Finally, primer extension of the linear RNA can be used to determine the nucleotide at the +1 position of the mature RNA subunit. We have used this strategy to identify, clone, and sequence the full-length RNase P RNA gene, RPR1, from the zebrafish, Danio rerio, a small, freshwater vertebrate. This simple strategy, which is based on the few short regions of sequence identity among different vertebrate species, should be applicable to the identification of RPR sequences from a number of other vertebrate organisms.

EXPERIMENTAL PROCEDURES
Materials-Taq polymerase and deoxynucleoside triphosphates for PCR, Pefabloc protease inhibitor, and bacterial alkaline phosphatase were purchased from Boehringer Mannheim. AMV reverse transcriptase, nucleoside triphosphates, and SP6 RNA polymerase were
The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™
Amplification of RPR1 from Zebrafish Genomic DNA-Zebrafish genomic DNA was isolated from a single adult fish as described previously (15). Each amplification reaction of 50 μl contained 80 ng of genomic DNA as target, 30 pmol each primer, 200 μM each dNTP, and 5 units of Taq DNA polymerase. Each of the 30 cycles of amplification included denaturation at 94°C, 1 min; annealing at 42°C, 0.5 min; and extension at 72°C, 1 min.
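The reaction composition above can be restated as final concentrations with a few lines of arithmetic. The following is only an illustrative sketch: the variable and function names are invented here, and the cycling time counts the programmed hold steps only (ramping between temperatures is an unknown and is ignored).

```python
REACTION_VOLUME_UL = 50.0  # reaction volume stated in the text


def micromolar(amount_pmol: float, volume_ul: float) -> float:
    """Convert pmol in a given volume (µl) to µM; 1 pmol/µl equals 1 µM."""
    return amount_pmol / volume_ul


# 30 pmol of each primer in 50 µl
primer_uM = micromolar(30.0, REACTION_VOLUME_UL)

# 80 ng of genomic DNA in 50 µl
template_ng_per_ul = 80.0 / REACTION_VOLUME_UL

# 200 µM of each dNTP is 200 pmol/µl; divide by 1000 to express nmol
dntp_nmol_each = 200.0 * REACTION_VOLUME_UL / 1000.0

# Hold times per cycle: denaturation, annealing, extension (minutes)
minutes_per_cycle = 1.0 + 0.5 + 1.0
total_hold_minutes = 30 * minutes_per_cycle

print(f"primer: {primer_uM} µM each")
print(f"template: {template_ng_per_ul} ng/µl")
print(f"dNTP: {dntp_nmol_each} nmol each per reaction")
print(f"programmed hold time: {total_hold_minutes} min")
```

Under these assumptions each primer is present at 0.6 μM and the 30 cycles account for 75 min of programmed hold time.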
Cloning of PCR Products-DNA fragments were purified in nondenaturing polyacrylamide gels (5-8%), and visualized by UV shadowing. The excised fragment was crushed and soaked in TE buffer (10 mM Tris-HCl, pH 8.0; 1 mM EDTA) at 37°C for 2 h. Following centrifugation to remove the gel fragments, the DNA was recovered by ethanol precipitation. The 5' ends were phosphorylated by resuspending the fragment in 25 μl of water, 2 μl of 10× polynucleotide kinase buffer (New England Biolabs), 1 μl of 10 mM ATP, and 1 μl (10 units) of T4 polynucleotide kinase. The reaction mixture was incubated for 1 h at 37°C, then terminated by denaturation of the enzyme at 65°C for 20 min. An aliquot was removed for blunt-end ligation into pBluescript II SK(+) (Stratagene), which had been linearized with EcoRV restriction enzyme and dephosphorylated using bacterial alkaline phosphatase. The ligation reaction was carried out overnight at 16°C. An aliquot of 5 μl was removed to transform XL1-Blue competent cells (Stratagene). Plasmids containing inserts were analyzed by DNA sequencing of both strands using Sequenase (Amersham-U. S. Biochemical Corp.).
Purification of Zebrafish RNase P-Zebrafish PAC2 fibroblast cells were cultured at 27°C in L15 medium supplemented with 15% fetal bovine serum without added CO2 (16). Cells from 100 10-cm dishes were collected from culture by centrifugation. The pellet was resuspended in 30 ml of lysis buffer (10 mM Tris-HCl, pH 7.8; 2.5 mM MgCl2; 10 mM KCl; 0.1 mM Pefabloc protease inhibitor) and incubated on ice for 30 min. The suspension was homogenized with a Dounce homogenizer for 10 strokes. The cellular debris was pelleted by centrifugation for 30 min at 6500 × g at 4°C. The supernatant was centrifuged at 100,000 × g for 1 h at 4°C, and the supernatant from this second spin was applied to a 2.5-ml DEAE-Sepharose Fast Flow column (Sigma), in equilibration buffer (10 mM Tris-HCl, pH 7.8; 2.5 mM MgCl2; 1 mM dithiothreitol; 0.1 mM Pefabloc) that contained 100 mM KCl. Following a wash of 10 column volumes, the RNase P was eluted from the column with a linear gradient of 100-500 mM KCl in equilibration buffer (50 ml total).
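For orientation, the relation between eluted volume and KCl concentration in such a gradient can be sketched as follows. This assumes an ideal linear gradient with no dead volume or mixing delay, which a real column will not match exactly; the function names are illustrative only.

```python
def kcl_mM(volume_ml: float, start_mM: float = 100.0,
           end_mM: float = 500.0, total_ml: float = 50.0) -> float:
    """KCl concentration after volume_ml of an ideal linear gradient."""
    if not 0.0 <= volume_ml <= total_ml:
        raise ValueError("volume lies outside the gradient")
    return start_mM + (end_mM - start_mM) * volume_ml / total_ml


def gradient_volume_ml(conc_mM: float, start_mM: float = 100.0,
                       end_mM: float = 500.0, total_ml: float = 50.0) -> float:
    """Inverse of kcl_mM: gradient volume at which conc_mM is reached."""
    if not start_mM <= conc_mM <= end_mM:
        raise ValueError("concentration lies outside the gradient")
    return (conc_mM - start_mM) / (end_mM - start_mM) * total_ml


# Wash volume: 10 column volumes of a 2.5-ml column
wash_ml = 10 * 2.5

print(wash_ml)                    # 25.0
print(kcl_mM(25.0))               # 300.0
print(gradient_volume_ml(300.0))  # 25.0
```

With these idealized numbers, an activity that elutes near 300 mM KCl would be expected around the midpoint of the 50-ml gradient.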
Assay for RNase P Activity-In these experiments, the substrate was the serine suppressor tRNA precursor, supS1, from Schizosaccharomyces pombe (17). The supS1 gene was cloned into the plasmid pSP64 (Promega), and transcribed in the presence of [α-32P]GTP (18). Each RNase P reaction of 10 μl contained 10 mM Tris-HCl, pH 7.8; 10 mM MgCl2; 200 mM KCl; 1 mM dithiothreitol; and 1 pmol of precursor tRNA substrate, radiolabeled to a specific activity of 1,000 cpm/pmol. The reaction mixture was incubated for 30 min at 37°C, and the reaction was terminated by addition of an equal volume of 40 mM EDTA, 8 M urea. To facilitate migration into denaturing polyacrylamide gels, 1 μl of phenol was added to each reaction tube, and the samples were electrophoresed at 15 watts for 2 h. Reaction products were detected by autoradiography of the polyacrylamide gels.
Amplification of Zebrafish RNase P RNA-The pooled fractions of peak RNase P activity were concentrated in a Centricon C-10 apparatus and stored at -20°C in 50% glycerol. Aliquots of 10 μl were removed and diluted to 50 μl with water. Phenol (2 μl equilibrated with 0.1 M Tris-HCl, pH 8) was added to denature and solubilize protein. Ethanol was added to precipitate the RNA, and the resuspended pellet was subsequently used for the reverse transcription reaction. The reaction contained the lyophilized RNA pellet, 1 mM of each dNTP, 1× AMV reverse transcriptase buffer (Boehringer Mannheim), 40 units of RNasin RNase inhibitor, and 30 pmol of antisense primer (P6, see Table I).
The solution was heated to 80°C for 5 min and then cooled to 42°C. AMV reverse transcriptase (5 units) was added, and the reaction mixture was incubated at 42°C for 1 h. Following reverse transcription, the sample was added to 80 μl of a solution that contained 8 μl of 10× PCR amplification buffer (Boehringer Mannheim), 30 pmol of sense-strand primer (P5, see Table I), and 2.5 units of Taq DNA polymerase. Amplification was carried out for 30 cycles, and each cycle included denaturation at 94°C for 1 min, annealing at 60°C for 1 min, and extension at 72°C for 1 min.
Identification of Zebrafish RNase P RNA Termini-An aliquot of 25 μl from the purified RNase P preparation was deproteinated with phenol as described above. Prior to cyclization, the RNA population was incubated with tobacco acid pyrophosphatase to hydrolyze the terminal phosphoanhydride bonds (19), leaving donor 5'-monophosphoryl termini. The RNA was then treated with T4 RNA ligase in 1 mM ATP to foster cyclization of the linear RNAs (20). The preparation of cyclized RNAs was then reverse transcribed with a zebrafish RPR-specific, antisense primer, P7 (see Table I). Reverse transcription was followed by amplification of the minus-strand cDNA with a sense-strand PCR primer, P8 (see Table I). The products of the reaction were cloned and sequenced as described above.
Primer Extension Analysis-RNA from partially purified RNase P was reverse transcribed with primer P9 (see Table I), which had been 5'-phosphorylated (as described above) with [γ-32P]ATP. The product of the reaction was run in an 8% polyacrylamide gel under denaturing conditions and aligned with the DNA sequence of the RPR1 clone derived from cyclized RPR1 RNA, in order to identify the nucleotide at the +1 position.

RESULTS
Identification of the Zebrafish RPR1 Gene-We assumed that regions of sequence conservation between the human and X. laevis RNase P RNAs (11) would also be conserved in the zebrafish. Therefore, four short regions of identity between the human and X. laevis RNase P RNA genes (see Fig. 1) were identified and, using these sequences, short, nested PCR primers for amplification of zebrafish genomic DNA were designed. The initial strategy was to amplify genomic DNA between the outermost primers (P1 and P4, see Table I), to be followed, if necessary, by a second amplification with the nested primers (P2 and P3, see Table I). We expected that the low annealing temperature that was required for these outer primers (42°C) would necessitate a second, nested amplification. However, a one-step amplification using only primers P1 and P4 was sufficient to yield a single major product of approximately 300 bp. This PCR fragment was cloned and sequenced (Fig. 1A, nucleotides 5-297). The sequence obtained does not match any known DNA sequences in the GenBank™ database. Nevertheless, it is very similar to the few known vertebrate RPR sequences, with identities ranging from 60 to 69% (see below).
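The idea of locating short regions of identity shared by two known sequences, as candidate primer sites, can be illustrated with a small sketch. The two sequences below are invented toy strings, not the human or X. laevis RPR sequences, and the function name and the 7-base threshold are assumptions made for the example.

```python
def identical_blocks(seq_a: str, seq_b: str, min_len: int):
    """Return (start_a, start_b, block) for each maximal run of identity
    of at least min_len bases shared by the two sequences (0-based starts)."""
    # Index every min_len-mer of seq_b by its start position.
    index_b = {}
    for j in range(len(seq_b) - min_len + 1):
        index_b.setdefault(seq_b[j:j + min_len], []).append(j)

    # Record every pair of positions whose min_len-mers are identical.
    hits = set()
    for i in range(len(seq_a) - min_len + 1):
        for j in index_b.get(seq_a[i:i + min_len], []):
            hits.add((i, j))

    # Merge consecutive hits on the same diagonal into maximal blocks.
    blocks = []
    for i, j in sorted(hits):
        if (i - 1, j - 1) in hits:
            continue  # interior of a run that started earlier
        length = min_len
        while (i + length - min_len + 1, j + length - min_len + 1) in hits:
            length += 1
        blocks.append((i, j, seq_a[i:i + length]))
    return blocks


# Hypothetical toy sequences that share two conserved blocks.
BLOCK_1 = "GGCTGAC"
BLOCK_2 = "CCCGTAGGAG"
species_a = "TTTT" + BLOCK_1 + "AAAAAAA" + BLOCK_2 + "TTTT"
species_b = "CACA" + BLOCK_1 + "GCGCGCG" + BLOCK_2 + "CACA"

for start_a, start_b, block in identical_blocks(species_a, species_b, 7):
    print(start_a, start_b, block)
```

In this toy case the two planted blocks are recovered at the same offsets in both sequences. In real sequences the blocks will generally sit at different positions, and a primer for the downstream block would be synthesized as its reverse complement.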
To show that the DNA fragment cloned from zebrafish genomic DNA indeed could encode part of an RNA molecule that copurifies with zebrafish RNase P holoenzyme, a whole-cell extract from zebrafish was fractionated on a DEAE-Sepharose column (see "Experimental Procedures"). RNase P activity was monitored in the eluted fractions by incubation with a uniformly radiolabeled precursor tRNA substrate, and by comparison of the migration of the products of the cleavage reaction with those generated by human RNase P. The zebrafish holoenzyme eluted at a KCl concentration of approximately 300 mM (Fig. 2). This is consistent with the elution profile of nuclear RNase P holoenzyme from other higher eukaryotic species (11,13,21).
An aliquot from the concentrated peak fractions was subjected to RT-PCR. To ensure specificity in the RT-PCR reaction, we chose for the reverse transcription primer P6, a derivative of P4. Primer P6 contained at its 3' end sequence that was actually determined from the zebrafish genomic PCR reaction, instead of sequence merely predicted from the consensus vertebrate sequences. The second primer used in the PCR reaction was primer P5, a derivative of P2. P5 also contained empirically determined sequence at its 3' end to enhance the fidelity of the reaction. The RT-PCR reaction yielded a single major product of greater than 250 bp. This product was cloned, and its sequence matched that for the genomic RPR1 DNA, over the region amplified (Fig. 1A, nucleotides 11-294).
To test the possibility that amplification in the RT-PCR experiment could have come from contaminating genomic DNA instead of holoenzyme RNA, we performed the reverse transcription in the absence of either reverse transcriptase or primer P6. Following PCR under these conditions, no product was obtained. Furthermore, when the partially purified holoenzyme was first treated with pancreatic ribonuclease, no PCR product was obtained following RT-PCR. The product of the RT-PCR reaction, therefore, was derived from an RNA species in the purified preparation. Together, these data strongly suggest that the PCR product from the genomic DNA was amplified from a gene that encodes zebrafish RNase P RNA.
Mapping the 5' and 3' Termini of the Zebrafish RNase P RNA Sequence-The RPR1 sequence obtained from the genomic PCR reaction and from the RT-PCR reaction was incomplete since the primers used in these amplifications were targeted to internal regions in the molecule. Moreover, the sequences of the primers themselves had to be confirmed experimentally since they were only predicted to match zebrafish sequences.
To obtain the authentic sequence at the 5' and 3' termini, we cyclized the RNAs from the partially purified RNase P preparation with T4 RNA ligase. Self-ligation, followed by RT-PCR, ultimately yielded clones that contained 15 nucleotides of new sequence that appeared between the 3' end and 5' end of the previously determined genomic sequence (nucleotides 298-308 and 1-4). In some of the six clones sequenced, the 11th and 12th nucleotides in the 15-nucleotide stretch gave TT, instead of AA. The rest were consistently as shown in Fig. 1A. The 15 new nucleotides of (mostly) 3'-terminal sequence can form a base-paired stem with the 5' end of the previously determined RNA sequence, as would be predicted from analogous sequences of all other known RNase P RNAs (2). This method also revealed that the predicted identities of two nucleotides in the genomic-PCR primers (P1 and P4) were incorrect, since amplification by RT-PCR after cyclization generated the actual sequence that covered the regions complementary to P1 and P4 primer sequences. However, priming of the original genomic amplification reaction obviously had not been hindered, probably because these nucleotide differences occurred near the 5' ends of the PCR primers.
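The logic of reading across the ligation junction can be sketched with a toy molecule. Everything below is hypothetical: the sequences are invented, the primer positions are arbitrary, the 6-base anchors are assumed to be unique in the known fragment, and RNA and cDNA are both written with DNA letters for simplicity. The sketch shows that the junction read yields the unknown 3' tail followed directly by the unknown 5' head, so that the read alone does not mark where one ends and the other begins.

```python
def junction_read(linear: str, sense_start: int, antisense_end: int) -> str:
    """Sequence amplified across the junction of a circularized molecule:
    from the sense primer site through the 3' end, then from the 5' end
    to the end of the antisense primer site (0-based, end exclusive)."""
    return linear[sense_start:] + linear[:antisense_end]


def novel_junction_sequence(read: str, known: str, anchor: int = 6) -> str:
    """Remove from both ends of the read the bases already present in the
    known internal fragment; what remains is 3' tail + 5' head."""
    left = known.find(read[:anchor])     # where the read starts in 'known'
    right = known.find(read[-anchor:])   # where the read's last anchor lies
    if left < 0 or right < 0:
        raise ValueError("read does not begin and end inside the known fragment")
    known_tail_len = len(known) - left   # known bases at the start of the read
    known_head_len = right + anchor      # known bases at the end of the read
    return read[known_tail_len:len(read) - known_head_len]


# Hypothetical toy molecule: 4 unknown 5' bases, 30 known, 11 unknown 3' bases.
five_prime_unknown = "GAGC"
known_internal = "ACGTTGCAAGGCCTTAGATCCGGTATCGAC"
three_prime_unknown = "CTCAAGTTCTC"
linear = five_prime_unknown + known_internal + three_prime_unknown

# Sense primer starts 10 bases before the end of the known fragment;
# antisense primer target ends 10 bases into the known fragment.
read = junction_read(linear, sense_start=4 + 20, antisense_end=4 + 10)
novel = novel_junction_sequence(read, known_internal)
print(novel, len(novel))
```

Here the 15 recovered bases are the 11-base 3' tail followed by the 4-base 5' head. Assigning the boundary between them requires independent information about the 5' terminus, such as a primer extension experiment.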
To determine which nucleotide corresponds to the +1 position in the RNA molecule, we performed primer extension of the zebrafish RNase P RNA, and the extended minus-strand cDNA product was aligned with the DNA sequence obtained after amplification of the cyclized RNA (Fig. 3). The reverse-transcribed product terminated at the last nucleotide in the sequence: ...CGCTACT...-3'. This corresponds to an RNase P

FIG. 1. DNA sequence of the RPR1 gene from zebrafish and alignment with the RPR1 genes from X. laevis and H. sapiens. A, nucleotides 5-297 were obtained from genomic DNA (see text). The matching RT-PCR product from purified RNase P RNA covered nucleotides 11-294 (see text). Nucleotides 298-308 and 1-4 were identified as described under "Experimental Procedures." Nucleotide heterogeneities from the 14 clones sequenced are given below the consensus sequence. Heterogeneities in lower case occurred in either one, two, or three of the 14 clones sequenced. Those in upper case occurred in either 4 or 5 of the 14 clones sequenced. Never more than two different bases appeared in any of the positions of heterogeneity. Oligonucleotide primers used to obtain these sequences are described in Table I. B, alignments were performed using the program GeneWorks, release 2.2 (IntelliGenetics).
Heterogeneity in the Zebrafish RPR1 Gene Sequence-To arrive at a consensus RPR1 RNA sequence (Fig. 1), a total of 14 clones were sequenced, including three clones from each of two independent PCR amplifications of zebrafish genomic DNA, six other clones derived from the RT-PCR reaction of RNA from partially purified holoenzyme, and, finally, two clones from the RT-PCR reaction using cyclized RNA from the same enzyme preparation. In nearly all of the 308 positions, the nucleotide agreement was 100% among the 14 clones. Of the few positions where nucleotide differences did occur among the 14 clones, the number of differences ranged from one to five. In none of the positions where heterogeneity occurred did more than two different bases appear. The most common heterogeneous pair was C and T (Fig. 1A).
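Deriving a consensus from several aligned clones, and listing the positions at which they disagree, amounts to a per-column tally. The sketch below uses invented 8-base toy clones and assumes the clones are already aligned and of equal length; it is not the procedure used in the paper. Ties are resolved in favor of the base seen first, which is a property of `Counter.most_common`.

```python
from collections import Counter


def consensus_and_heterogeneity(clones):
    """Return (consensus, variable) where variable maps each 1-based
    position with more than one base to its base counts."""
    if len({len(clone) for clone in clones}) != 1:
        raise ValueError("clones must be aligned to the same length")
    consensus = []
    variable = {}
    for position, column in enumerate(zip(*clones), start=1):
        counts = Counter(column)
        consensus.append(counts.most_common(1)[0][0])  # majority base
        if len(counts) > 1:
            variable[position] = dict(counts)
    return "".join(consensus), variable


# Hypothetical toy data: 14 clones, 3 of which carry T instead of C at position 5.
toy_clones = ["ACGTCTGA"] * 11 + ["ACGTTTGA"] * 3
consensus, variable = consensus_and_heterogeneity(toy_clones)
print(consensus)
print(variable)
```

In this toy set the consensus equals the majority sequence and a single C/T position is flagged, with counts of 11 and 3.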
Sequence and Structural Similarities between the Zebrafish RPR1 Gene and Other Eukaryotic RPR Genes-The sequence of zebrafish genomic RPR1 was aligned to that for human RPR1 (H1 RNA gene) and X. laevis RPR1 (XL RNA gene). In a pairwise analysis, the human and zebrafish sequences share 69% identity over the regions aligned (Fig. 1B). The zebrafish RPR1 gene also shares 63% identity with the X. laevis RPR1 gene (Fig. 1B), and 60% identity with the mouse gene (data not shown). In a three-way alignment of zebrafish, human, and X. laevis sequences, the identity decreases to 48% (Fig. 1B). The similarity decreases even more significantly when the comparison is with a lower eukaryote, as the identity with RPR1 from Saccharomyces cerevisiae is only 30%. It is interesting, however, that the first 126 nucleotides of the zebrafish sequence are more than 50% identical to nucleotides 236-338 in the S. cerevisiae sequence.
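Percent identity over aligned regions can be computed column by column. The sketch below is one possible convention, assumed here for illustration: columns containing a gap are skipped, and a column counts as identical only if every sequence has the same base. How the alignment program used in the paper treats gaps is not stated, and the toy sequences are invented.

```python
def percent_identity(*aligned: str, gap: str = "-") -> float:
    """Percent of ungapped alignment columns in which all sequences agree."""
    if len(aligned) < 2:
        raise ValueError("need at least two aligned sequences")
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for column in zip(*aligned):
        if gap in column:
            continue  # skip columns where any sequence has a gap
        compared += 1
        if len(set(column)) == 1:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment of three sequences.
toy_a = "ACGT-ACGTA"
toy_b = "ACGTTACCTA"
toy_c = "ACCTTACGTA"

print(round(percent_identity(toy_a, toy_b), 1))
print(round(percent_identity(toy_a, toy_b, toy_c), 1))
```

Because a three-way match requires agreement in every sequence, the three-way value can never exceed the pairwise values computed over the same columns, which is in line with the drop in identity reported for the three-way alignment.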
In addition to the overall sequence similarity among the X. laevis, human, and zebrafish RPR genes, there are several short contiguous regions of identity. These stretches can, in principle, form similar secondary structures, and provide the first steps toward building an overall secondary structure for the zebrafish RNase P RNA in the absence of more sequence data (Fig. 4). One such potential secondary structure in the zebrafish RNA is formed by long range base pairing, which forms part of a pseudoknot, between nucleotides 71-76 and 287-292, i.e. GTCTGA/TCAGAC. These same sequences are found in the X. laevis and human RNAs in the same relative position. Among all RNase P RNAs, this secondary structural feature is absolutely conserved (2). Another potential secondary structure based on sequence identity and relative location among the three RNAs is a four-base pair "bridge," CUCA/UGAG (nucleotides 16-19 and 257-260), that, through formation of a stable stem, joins the left-hand and middle single-stranded domains of the structure drawn in Fig. 4. Finally, a second putative stem in the zebrafish RNA (nucleotides 82-85 and 176-179) that forms a bridge between this middle single-stranded region and the right-hand single-stranded region is identical in X. laevis and co-variant in human.
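The complementarity of the two strand pairs quoted above can be checked directly. The sketch assumes that both strands of each pair are written 5' to 3', so that a perfect Watson-Crick stem requires one strand to be the reverse complement of the other; the function names are illustrative.

```python
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(seq: str, rna: bool = False) -> str:
    """Reverse complement of a DNA (default) or RNA sequence."""
    table = RNA_COMPLEMENT if rna else DNA_COMPLEMENT
    return seq.translate(table)[::-1]


def can_form_stem(strand_5p: str, strand_3p: str, rna: bool = False) -> bool:
    """True if the two strands, both written 5' to 3', are fully
    Watson-Crick complementary in antiparallel orientation."""
    return reverse_complement(strand_5p, rna) == strand_3p


# Pseudoknot strands (nucleotides 71-76 and 287-292), written as DNA.
print(can_form_stem("GTCTGA", "TCAGAC"))
# Four-base-pair bridge (nucleotides 16-19 and 257-260), written as RNA.
print(can_form_stem("CUCA", "UGAG", rna=True))
```

Both checks return True, so each quoted pair can form an uninterrupted antiparallel helix. The check covers Watson-Crick pairs only and would need extending to allow G/U pairs.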
Proposed secondary structures in Fig. 4 that contain the identical sequence in human RPR1 RNA, but that co-vary in the X. laevis RNA, include the stem that joins the 5' and 3' termini (in the zebrafish, nucleotides 2-13 and 297-307). In the zebrafish RNA, this stem extends for 10 base pairs, with two single-nucleotide bulges. This structure is universally conserved among all known RNase P RNAs, and, although the sequences are often divergent, it is remarkable that between human and zebrafish, 8 of the 10 base pairs contain identical sequences. Also, the sequences in the proposed stems formed by the following nucleotides are nearly identical to those in the human and covariant in X. laevis: 38-42 and 48-52, 93-98 and 102-107, and 227-232 and 244-249.
Other regions in zebrafish RPR1 RNA can form secondary structures similar to those proposed for the X. laevis and human RNAs based only on nucleotide co-variation, that is, compensatory base changes that maintain a potential Watson-Crick or noncanonical G/U base pair. These include nucleotides 111-114 and 132-135, 141-145 and 154-158, and 190-195 and 201-206. Finally, the RNA of each species contains a long "stem-bulge-stem-loop" in the region just downstream from the short, conserved stem that bridges the left-hand and middle single-stranded regions of the molecule. As discussed below, this structure, despite no apparent sequence similarity, is highly conserved even among the prokaryotic RNase P RNAs.

DISCUSSION
We have identified the full-length, zebrafish RNase P RNA subunit by first isolating a fragment of the corresponding gene (RPR1) from zebrafish genomic DNA by PCR. The genomic DNA fragment is identical in sequence to that obtained from the RNA isolated from partially purified zebrafish RNase P holoenzyme. This confirms that the RPR1 gene that we initially identified is not a pseudogene, but one that is actively transcribed into RNA in order to form the RNase P holoenzyme.
The primer sequences chosen for amplification of zebrafish genomic DNA were based on DNA sequence identity in two short regions of the RPR1 genes from two other distantly related vertebrates, human (nucleotides 7-19 and 326-309) (11) and X. laevis (nucleotides 7-19 and 304-287) (11). The zebrafish RPR1 RNA, which is 308 nucleotides long, has a sequence similarity greater than 60% to these other vertebrate RNase P RNAs. Moreover, it shares with these other vertebrate RNAs sequence identity in several short regions, including the putative pseudoknot, the relative location of which is absolutely conserved in the RNase P RNAs from all organisms so far examined.
The sequence heterogeneity at some of the 308 positions from 14 clones sequenced has several possible origins. First, Taq polymerase has a slightly reduced fidelity due to its lack of a 3'-5' repair mechanism compared to other DNA polymerases (22). Additionally, AMV reverse transcriptase might have inserted the occasional incorrect nucleotide (23); however, most of the nucleotide differences seem to be limited to a subset of the genomic DNA-derived clones. Finally, there could be more than one homologue of the RPR1 gene in the zebrafish, which could account for the nucleotide polymorphisms we observed. In a recent study, several homologues of the mouse RPR gene have been identified, all of which are transcribed (24). In the same study, Northern analysis suggests that there could be multiple, expressed homologues of the RPR1 genes in other vertebrates, including rat, rabbit, and human.

FIG. 3. RNA from phenol-solubilized RNase P holoenzyme was reverse transcribed with primer P9 (see Table I), which was 5'-phosphorylated with [γ-32P]ATP. P, aliquots from the primer extension reaction loaded in lanes 1, 5, and 11. T, G, C, and A are the [α-35S]dATP-labeled sequencing products in lanes 2-5 and 7-10 that were obtained by sequencing the cloned PCR fragment from cyclized RNase P RNA using primer P9 that was 5'-phosphorylated with unlabeled ATP (see "Experimental Procedures" for all methods). Primer P9 is complementary to nucleotides 77-60. The arrow shows the alignment of the primer extension product with the corresponding nucleotide in the DNA sequence.
The secondary structure of the full-length zebrafish RPR1 RNA sequence drawn in Fig. 4 is based on comparison of the currently available sequences from zebrafish, human, and X. laevis. Alignments in Fig. 1B were used to identify potential stems in Fig. 4. The rationale for the proposed base pairing in the zebrafish secondary structure is based either on sequence identity in analogous regions among the three RNAs or on co-varying base changes in these same regions that maintain the potential stems. Many more sequences will be required to validate this latest secondary structure prediction. Moreover, chemical and enzymatic mapping analyses that identify secondary structure can be used to confirm the predictions that are based on a phylogenetic analysis. Nevertheless, a sequence comparison of the three vertebrate RNAs can provide much information toward developing a consensus secondary structure.
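The two kinds of support described above, sequence identity and co-varying changes that preserve pairing, can be distinguished for a single proposed base pair with a simple rule. The sketch is a simplification made for illustration: it treats any set of differing but pairable combinations as co-variation, considers only Watson-Crick and G/U pairs, and uses invented example bases rather than positions from the actual alignment.

```python
# Watson-Crick pairs plus the noncanonical G/U pair, in RNA letters.
ALLOWED_PAIRS = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}


def classify_pair(bases_i, bases_j) -> str:
    """Classify one proposed base pair across several aligned species.

    bases_i and bases_j hold the base found in each species at the two
    alignment columns that are proposed to pair."""
    pairs = list(zip(bases_i, bases_j))
    if not all(pair in ALLOWED_PAIRS for pair in pairs):
        return "unsupported"   # at least one species cannot form the pair
    if len(set(pairs)) == 1:
        return "conserved"     # same pair in every species
    return "covariant"         # pairing kept despite base changes


# Hypothetical columns for three species.
print(classify_pair("GGG", "CCC"))  # conserved
print(classify_pair("GGA", "CCU"))  # covariant
print(classify_pair("GAG", "CCC"))  # unsupported
```

A proposed stem would then be scored by applying the rule to each of its base pairs. With only three sequences the evidence at any one position remains weak, which is consistent with the need for more sequences noted above.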
Despite the low sequence identity with the known fungal nuclear RPR sequences (10), the zebrafish sequence contains several structural features that are shared with these lower eukaryotic sequences. For example, of the six known nuclear Saccharomyces sequences (whose lengths range between 336 and 369 nucleotides) and two Schizosaccharomyces sequences (285 or 286 nucleotides) whose secondary structures have been previously described (10), all contain the long range base-pairing stem of the pseudoknot, as does the zebrafish sequence. Furthermore, in the Saccharomyces RNAs, the locations of the constituent 6-nucleotide pseudoknot strands are all between nucleotides 88-98 and 312-350. In the much shorter Schizosaccharomyces RNAs, the 6-nucleotide strands are located between nucleotides 83-90 and 251-257. In addition, the zebrafish and fungal molecules have long base-paired stems that join the 5' and 3' termini. Although the sequences of these terminal stems are highly divergent from one another, the presence of this structure is universal.
Another interesting secondary structure conserved between the fungal and zebrafish RNAs is located between nucleotides 25 and 63 in the zebrafish molecule. This universally conserved stem-bulge-stem-loop exists even in the consensus eubacterial conserved core and in the same relative position (2). Experimental evidence has suggested that this analogous structure in human RNase P RNA is the binding site for the 40-kDa protein that co-immunoprecipitates with human RNase P holoenzyme in the presence of antisera from patients with certain autoimmune diseases (25,26).
Because vertebrates that are as distantly related as frog, human, and zebrafish maintain the short regions of identity near the 5' and 3' termini of their respective RPR1 genes, PCR primers targeted to these regions should be useful in obtaining RPR gene sequence from many other vertebrates. These primers might also be useful in obtaining the RPR sequences from some invertebrate classes of eukaryote, including insects and nematodes. Although the vertebrate regions of identity are not conserved in the S. cerevisiae nuclear or mitochondrial sequences, higher invertebrates might have more nearly identical sequences to those of the vertebrates in these regions. Degenerate primers, combined with nested amplifications that target short regions of vertebrate identity even further internally from the termini of the RNA sequences (Figs. 1 and 4), could yield fragments to be used as probes to screen libraries for full-length genes from these higher invertebrates. Recent studies have exploited human RNase P as a vehicle for targeting inhibition of deleterious gene expression in vitro and ex vivo (27). This method could also be used in zebrafish to understand the role in development of certain targeted genes following ablation of their expression by zebrafish RNase P. Targeted inhibition of gene expression by this method will require a more extensive characterization of the zebrafish holoenzyme substrate specificity.