Purification of the Spliced Leader Ribonucleoprotein Particle from Leptomonas collosoma Revealed the Existence of an Sm Protein in Trypanosomes

Trans-splicing in trypanosomes involves the addition of a common spliced leader (SL) sequence, which is derived from a small RNA, the SL RNA, to all mRNA precursors. The SL RNA is present in the cell in the form of a ribonucleoprotein, the SL RNP. Using conventional chromatography and affinity selection with 2′-O-methylated RNA oligonucleotides at high ionic strength, five proteins of 70, 16, 13, 12, and 8 kDa were co-selected with the SL RNA from Leptomonas collosoma, representing the SL RNP core particle. Under conditions of lower ionic strength, additional proteins of 28 and 20 kDa were revealed. On the basis of peptide sequences, the gene coding for a protein with a predicted molecular weight of 11.9 kDa was cloned and identified as a homologue of the cis-spliceosomal SmE. The protein carries the Sm motifs 1 and 2 characteristic of Sm antigens that bind to all known cis-spliceosomal uridylic acid-rich small nuclear RNAs (U snRNAs), suggesting the existence of Sm proteins in trypanosomes. This finding is of special interest because trypanosome snRNPs are the only snRNPs examined to date that are not recognized by anti-Sm antibodies. Because of the early divergence of trypanosomes from the eukaryotic lineage, the trypanosome SmE protein represents one of the primordial Sm proteins in nature.

In trypanosomes all mRNAs are processed by trans-splicing, which joins a common spliced leader (SL) derived from the SL RNA to polycistronic pre-mRNAs (1,2). The SL RNA fulfills a dual function, both recruiting splicing co-factors and serving as a substrate for splicing. In analogy to cis-splicing, trans-splicing requires the participation of U snRNPs. Initial studies identified U2, U4, and U6 snRNAs in trypanosomes (3), and their essential role in trans-splicing was demonstrated (4). Only recently, the trypanosome U5 homologue was identified (5,6), but no U1 homologue has been revealed so far (3).
The trypanosome U snRNAs are generally smaller and show differences from their cis-spliceosomal counterparts (3,7). Another unique property of the trypanosomatid snRNAs is that they possess a divergent Sm site. In other eukaryotes, the Sm site constitutes the binding site for the common Sm proteins (8) that are recognized by anti-Sm antibodies present in autoimmune patients (9). Surprisingly, so far, the trypanosome snRNPs represent the only exception in that they are not recognized by these antibodies, suggesting unique properties of the trypanosome core proteins (10,11). Five core snRNP proteins were identified in Trypanosoma brucei ranging in size from 15 to 8.5 kDa, which are shared among all trans-spliceosomal snRNPs, including the SL, U2, U4/U6 (11), and U5 RNP (12).
Recent studies in both mammalian and yeast systems have demonstrated that all the known Sm proteins possess homology in two regions that were termed Sm motifs 1 and 2 (13-15). The Sm motifs are also conserved in all putative homologues, including those from plants and yeast (13-15). This suggests that Sm proteins may have arisen from a single common ancestor. Recently, Sm-like proteins that do not belong to the canonical core proteins were also revealed, and one of them was identified as a U6-specific protein (14).
Very little is known about snRNP proteins in trypanosomes. The two trypanosome proteins that have been identified so far are the U2-specific 40-kDa protein, the homologue of human U2A′ (16), and the U5-specific PRP8/p220 homologue (12). Interestingly, not all the amino acid positions critical for PRP8 function are conserved in the trypanosome protein p277, suggesting that certain differences exist between PRP8 functions in cis- and trans-splicing (12).
In this study, we have purified the SL RNP particle from the trypanosomatid Leptomonas collosoma using a combination of conventional chromatography and affinity selection with a biotinylated 2′-O-methyl (2′-OMe) antisense RNA oligonucleotide, resulting in the identification of a subset of core proteins and loosely bound proteins that were selected with the SL RNA. The L. collosoma proteins show an immunological relationship to the T. brucei core proteins. Based on internal peptide sequences, one of the core proteins was cloned and sequenced; the analysis identified it as a bona fide Sm protein harboring the conserved Sm motifs 1 and 2, most closely related to the SmE protein. In sum, although snRNP proteins are not recognized by anti-Sm sera in trypanosomes and the Sm site is degenerate, trypanosomes do possess Sm proteins.

EXPERIMENTAL PROCEDURES
Extract Preparation and Antisense 2′-OMe RNA Affinity Purification of SL RNP-L. collosoma cell culture and extract preparation (using 0.3 or 0.4 M KCl) have been described previously (17), as have postribosomal supernatant preparation and DEAE chromatography (10). Extracts were prepared from 10 liters of cell culture (5 × 10^7 cells/ml). The extract was fractionated on a DEAE-Sephacel column, and the material eluted at 0.4 M KCl was concentrated to 0.5-1 ml by chroma-
* This work was supported by the German-Israeli Foundation.
Cloning and Sequencing-500 ng of degenerate oligonucleotides 24279, 24280, and 24281 were each labeled with 250 μCi of [γ-32P]ATP (7000 Ci/mmol) and polynucleotide kinase. Each labeled oligonucleotide (10^8 Cerenkov counts) was used to screen a genomic library (50,000 plaques) of L. collosoma (17,19). Two plaques gave positive signals with all three probes. Pure DNA was digested with HpaII and AluI, and DNA fragments of 0.6 and 1.6 kilobases were cloned into pBluescript SK and sequenced using T3 and T7 primers.
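As an illustration of why only certain stretches of a peptide make practical probe targets, the degeneracy of a back-translated oligonucleotide can be estimated from the number of synonymous codons per residue. The following is a minimal sketch, not the procedure used to design oligonucleotides 24279, 24280, and 24281, whose sequences are not given here. It assumes the standard nuclear genetic code; the function names and the six-residue window are hypothetical choices made for illustration, and peptide 1 is the sequence reported under Results.

```python
# Number of synonymous codons per amino acid in the standard genetic code
# (assumption: the standard nuclear code applies; the counts sum to 61 sense codons).
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def degeneracy(peptide):
    """Return the number of distinct DNA sequences that can encode the peptide."""
    total = 1
    for residue in peptide.upper():
        total *= CODON_COUNTS[residue]
    return total


def best_window(peptide, size):
    """Return (subpeptide, degeneracy) for the least degenerate window of `size` residues.

    If several windows tie, the first one along the peptide is returned.
    """
    windows = [peptide[i:i + size] for i in range(len(peptide) - size + 1)]
    best = min(windows, key=degeneracy)
    return best, degeneracy(best)


if __name__ == "__main__":
    peptide_1 = "IEGNLLGYDEFMNVVL"  # peptide 1 of the 12-kDa protein (see Results)
    window, n_sequences = best_window(peptide_1, 6)
    print(window, n_sequences)
```

Under these assumptions, the least degenerate six-residue window of peptide 1 is YDEFMN, with 32 possible coding sequences, because Met has a single codon and each of the other five residues has two.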
Native Gels-RNPs from the different purification steps were separated on a 5% nondenaturing gel buffered with 50 mM Tris-glycine, pH 8.8.

RESULTS AND DISCUSSION
Purification of the L. collosoma SL RNP Core Particle-Previously, using affinity selection with biotinylated RNA oligonucleotide, the SL RNP of T. brucei was purified and a subset of five low molecular mass proteins of 15 to 8.5 kDa was identified. Only a single protein of 14.4 kDa appeared to be specific to SL RNP, whereas other proteins are shared with U2 and U4/U6 RNPs (11). Because of the experimental advantage of performing biochemical analysis with the monogenetic trypanosomatids, we have set up the purification of the SL RNP from L. collosoma extracts. Purification steps were scaled up such that microgram amounts of proteins sufficient for microsequencing could be prepared. SL RNP was purified from high-salt extract because only under these conditions was it possible to recover about 80% of the particles. The SL RNP was highly enriched by two purification steps: depletion of the ribosomes and fractionation on a DEAE-Sephacel column. RNA was extracted from the different purification steps. Results presented in Fig. 1A, lanes 1-3, demonstrate that this fractionation procedure efficiently enriches for the major snRNAs, U2, U4, SL, and U6, but that other small RNAs such as 7SL RNA and tRNAs were still present in these fractions. To investigate the status of the SL RNP during purification, we analyzed the RNPs by native gel electrophoresis and Northern hybridization. The results presented in Fig. 1B indicate that in the initial steps of purification, the SL RNP migrated as a slow form (lane 2) and was converted to a fast migrating form during fractionation on a DEAE column (lane 3). Fractionation of the DEAE eluate by CsCl gradient centrifugation suggests that these particles withstand fractionation in high salt gradients and band at a density of ~1.4 g/ml, characteristic of core SL RNP particles (10, 20) (data not shown).
It is currently unknown whether, during the conversion from the slowly to the fast migrating form of the SL RNP, only particle-specific proteins are dislodged or whether the SL RNP dissociates from a multi-snRNP complex. We have previously demonstrated that, when RNPs extracted at low ionic strength are fractionated on sucrose gradients, at least 50% of the SL RNP migrates at S values higher than 11 S (core SL RNP), suggesting that the SL RNP associates with other spliceosomal components (6). Affinity selection of the SL RNP and U2 RNP from the DEAE eluate demonstrates the high specificity and efficiency of the selection because, in both selections, a single RNA band of the expected size was observed (Fig. 1A, lanes 4 and 5).
Identification of the SL RNP Proteins from L. collosoma-To analyze the protein composition of the SL RNP, a concentrated DEAE column fraction prepared from 10^11 cells was used in affinity selection with biotinylated antisense SL-1 RNA oligonucleotide. Proteins were analyzed on a 12% Tris-Tricine gel and stained with Coomassie Blue. Results presented in Fig. 2A show the enrichment of five proteins of 70, 15, 13, 12, and 8 kDa (lanes 1, 2, 3, and 4). The proteins of ~55 kDa are the most abundant proteins in the DEAE fraction (lanes 4 and 5) and most likely represent α and β tubulins. These proteins were nonspecifically selected because they were present in the control selection without the SL oligonucleotide (lane 5). The pattern of the selected small proteins of the L. collosoma SL RNP seems to differ only slightly from the pattern of the proteins selected in T. brucei. In addition, the L. collosoma SL RNP contains a protein of 70 kDa. Because this SL RNP purification was done at 0.4 M KCl, we will refer to this RNP as the core SL RNP.
To investigate whether purification under less stringent conditions reveals additional SL RNP-associated proteins, particles were extracted at 0.2 M KCl and were purified by DEAE chromatography. The selected proteins were stained with Coomassie Blue (Fig. 2B). The results indicate the presence of proteins of 28, 20, and 18 kDa in addition to the four low molecular mass proteins characteristic of the core SL RNP. However, the 18-kDa protein was bound nonspecifically to the beads, as this protein was also present in the control selection in the absence of oligonucleotide (Fig. 2B, lane 5). The 70-kDa protein enriched in the core SL RNP (Fig. 2A, lane 4) was not clearly seen under low salt conditions (Fig. 2B, lane 4), most probably because of the nonspecific staining at the upper part of the gel. The results presented in Fig. 2 demonstrate the specificity of selection under high salt conditions but also the disadvantage of losing additional, weakly associated proteins.
Immunological Relationship of L. collosoma and T. brucei snRNP Core Proteins-To examine the relatedness of the core proteins, we used the antibodies raised against the T. brucei core proteins (anti-CP) in immunoprecipitation and immunoblotting experiments with the L. collosoma DEAE fraction. The results presented in Fig. 3A demonstrate that anti-CP antibodies specifically immunoprecipitated the SL RNP because only very faint background levels of the SL RNA were observed when the nonimmune serum was used in the experiment (compare lanes 1 and 2). To identify which of the L. collosoma SL RNP proteins are related to the T. brucei proteins, affinity-selected L. collosoma proteins were immunoblotted and reacted with anti-CP antiserum alongside T. brucei affinity-selected proteins from SL and U2 RNPs. The result indicates strong reactivity with the 12-kDa protein. The anti-CP antibodies show the strongest reactivity with the T. brucei 14- and 12.5-kDa proteins (21) migrating close to the immunoreactive proteins of the L. collosoma SL RNP (Fig. 3B, lanes 2 and 3). In a control experiment with nonimmune serum, no reactivity was observed with the L. collosoma proteins (data not shown). The reactivity with the 12-kDa protein indicates that the snRNP core proteins from L. collosoma and T. brucei are immunologically related.
The 12-kDa Protein Is an Sm Protein Homologous to SmE-To clone the genes coding for core SL RNP proteins, the low molecular mass proteins were eluted from the gel and digested with LysC protease. Several internal peptide sequences were obtained from the four core proteins. In particular, two long peptides (peptide 1, IEGNLLGYDEFMNVVL; and peptide 2, ILLRSDNVGVVHAI) could be determined from the 12-kDa protein. Using degenerate oligonucleotides as probes, a genomic fragment of 1220 base pairs was cloned and sequenced (Fig. 4). An open reading frame was identified coding for an 11.9-kDa protein of 94 amino acids. The spliced leader addition site (AG 3′ splice site) was mapped 16 nucleotides upstream of the first ATG (data not shown), and 22 consecutive pyrimidines were found upstream (Fig. 4, underlined in the sequence). These pyrimidines resemble the polypyrimidine tract found in intergenic regions of trypanosomatid genes that was shown to be essential for accurate trans-splicing (22). The polyadenylation site was mapped by reverse transcriptase polymerase chain reaction and is located 417 nucleotides downstream of the stop codon (results not shown). Southern analysis indicated that the gene is present in a single copy in the L. collosoma genome (data not shown). A multiple alignment of the L. collosoma protein with its homologues is presented in Fig. 5. A difference between the L. collosoma protein and its homologues is position 7 of the Sm2 motif (Arg instead of Lys in most of the other species). This is especially interesting because Arg is conserved in all Sm2 motifs, but in all E proteins this Arg was changed to Lys, except for the L. collosoma protein, where the canonical Arg is kept. Recently, functional analysis was performed on the yeast SmE protein (23,24). The results pointed out the importance of certain hydrophobic residues for cell viability; these positions are essential for interactions of the protein with SmF and SmG (24). 
Interestingly, of these seven important residues, all but one are either similar or identical in L. collosoma, suggesting that their function may be conserved in the trypanosome protein.
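A polypyrimidine tract such as the one described above can be located computationally by scanning a sequence for its longest uninterrupted run of C and T residues. The sketch below is illustrative only: the function name is hypothetical, and the toy sequence is invented for the example and is not the L. collosoma sequence of Fig. 4.

```python
import re


def longest_pyrimidine_run(seq):
    """Return (start, length) of the longest run of pyrimidines (C, T or U).

    `start` is a 0-based index into `seq`. Returns None when the sequence
    contains no pyrimidine. On ties, the first (most 5') run is returned.
    """
    runs = [(m.start(), len(m.group()))
            for m in re.finditer(r"[CTU]+", seq.upper())]
    return max(runs, key=lambda run: run[1]) if runs else None


if __name__ == "__main__":
    # Hypothetical toy sequence: a pyrimidine-rich stretch, then an AG
    # dinucleotide, then an ATG start codon.
    toy = "AAGCTTCTCTTTCCGAGATG"
    print(longest_pyrimidine_run(toy))
```

Applied to a real genomic fragment, the same scan could be restricted to the region upstream of the mapped AG splice site, so that runs inside the coding region are not reported.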
One of the most intriguing questions regarding the trypanosome core proteins is the lack of recognition by anti-Sm sera. This finding is even more surprising given the presence of Sm motifs in the L. collosoma protein. It was previously suggested that the conserved Sm motifs are important domains for Sm autoimmunization (13). The current study suggests that the Sm motifs are not the only determinants involved in B-cell epitope recognition. Alternatively, the trypanosome Sm motifs may differ from the canonical motifs in those positions that are essential for recognition by the anti-Sm sera. The number of core proteins described in this study is smaller than that observed in T. brucei (11). Either the L. collosoma SL RNP composition is less complex than that of T. brucei, or more than a single polypeptide migrates at the same position of the protein gel. The latter possibility is more likely, based on the differential staining intensity of the protein bands (Fig. 2, A and B). Cloning of the entire subset of the L. collosoma Sm proteins should clarify this issue.
Core proteins assembled on the Sm site of snRNAs were shown to serve as the recognition signal for the m3G cap methyltransferase (25). Similarly, it was shown that accurate modification of the SL RNA cap structure in vitro requires the SL RNA to be assembled in a core particle (26). The degeneracy of the Sm site in trypanosomes may reflect that each U snRNA, including the SL RNA, binds different subsets of core proteins. Indeed, the presence of unique proteins in the low molecular mass range has been observed for the SL and U4/U6 snRNPs (11). This suggests that the different combinations of Sm proteins determined by the different Sm sites and surrounding sequences may influence the binding of the different methyltransferases and hence dictate the particular type of cap modification on the trypanosome snRNA.