Mutational Analysis of 3′ Splice Site Selection during trans-Splicing

trans-Splicing is essential for mRNA maturation in trypanosomatids. A conserved AG dinucleotide serves as the 3′ splice acceptor site, and analysis of native processing sites suggests that selection of this site is determined according to a 5′-3′ scanning model. A series of stable gene replacement lines were generated that carried point mutations at or near the 3′ splice site within the intergenic region separating CUB2.65, the calmodulin-ubiquitin associated gene, and FUS1, the ubiquitin fusion gene of Trypanosoma cruzi. In one stable line, the elimination of the native 3′ splice acceptor site led to the accumulation of Y-branched splicing intermediates, which served as templates for mapping the first trans-splicing branch points in T. cruzi. In other lines, point mutations shifted the position of the first consensus AG dinucleotide either upstream or downstream of the wild-type 3′ splice acceptor site in this intergenic region. Consistent with the scanning model, the first AG dinucleotide downstream of the branch points was used as the predominant 3′ splice acceptor site. In all of the stable lines, the point mutations affected splicing efficiency in this region.

trans-Splicing is an RNA processing event that was first discovered in trypanosomatids and later observed in nematodes, trematodes, and Euglena (1,2). In all of these organisms, this intermolecular process results in a spliced leader sequence at the 5′-end of mature transcripts. In trypanosomes, its importance has been recognized for two reasons. First, the spliced leader supplies an identical 5′-terminal cap for every known messenger RNA (3), and second, trans-splicing provides a means to convert polycistronic pre-mRNA to monogenic mRNA (4).
Because trans-splicing joins segments of two independently transcribed RNA molecules, it is distinct from cis-splicing, an intramolecular process that removes introns separating protein-coding sequences on primary transcripts. However, trans-splicing does have a number of characteristics that suggest that it is related to the more widely studied cis-splicing. Mechanistically, both types of splicing occur through similar steps, which are catalyzed by a spliceosome (5-8). Also, some of the cis-acting sequences in trans-splicing are identical to those observed in cis-splicing (9,10).
The trans-splicing reaction essentially occurs through two steps and requires two RNA molecules: the spliced leader RNA, carrying the spliced leader sequence and spliced leader intron, and the pre-mRNA, carrying a 5′-untranslated region followed by the protein coding sequence (7,8,11). The first step of the reaction is a cleavage at the 5′ splice donor site on the spliced leader RNA and the simultaneous formation of a Y-branched intermediate. A 2′-5′ phosphodiester linkage between the first nucleotide at the 5′-end of the spliced leader intron, a guanosine, and an internal adenosine residue (the branch point) upstream of the 3′ splice site of the pre-mRNA characterizes this branched intermediate. During the second step of trans-splicing, the Y-branched intermediate is removed when the spliced leader sequence is ligated to the pre-mRNA at the 3′ splice site. Under most circumstances, this reaction is rapid, perhaps occurring co-transcriptionally. The precursors and intermediates are transient, and the functional end products, mature mRNAs each carrying the capped spliced leader sequence, are the only steady-state RNA detected.
In trypanosomes, the disruption of open reading frames during trans-splicing is avoided through the use of specific 3′ splice sites upstream of the translation initiation site of a transcript. The locations of these splice sites as well as the branch point sites are determined by the cis-acting sequences on the pre-mRNA. Although these sequences have not been thoroughly investigated, a few important elements have been recognized. The 3′ splice acceptor site always occurs at an AG dinucleotide, the consensus sequence also used during cis-splicing (1). Polypyrimidine tracts play an important role, since the elimination or successive deletion of the polypyrimidine tracts near native 3′ splice acceptor sites often leads to activation of cryptic splice sites or diminished splicing within an intergenic region (12-15). Very little is known about the selection of the branch point site during trans-splicing. The only branch points that have been mapped are those for the highly expressed α- and β-tubulin transcripts of Trypanosoma brucei (16). Although branching was shown to occur at one or more A nucleotides upstream of the polypyrimidine tracts associated with the tubulin 3′ splice acceptor sites, no consensus branch site sequence was determined.
The present study was initiated to develop a better understanding of the cis-acting sequences near the 3′ splice acceptor site that are involved in trans-splicing. The noncoding sequence separating the 2.65 calmodulin-ubiquitin associated (CUB2.65) and FUS1 ubiquitin fusion genes of the calmodulin-ubiquitin complex of Trypanosoma cruzi presented an ideal environment to analyze the effects of 3′ splice acceptor site mutations (see Fig. 1 and Ref. 17). trans-Splicing of the FUS1 transcript occurs quantitatively at the first and only AG dinucleotide between the polypyrimidine tract and the FUS1 translation initiation codon (18).
Electroporation, Stable Transformation, and Cloning of Trypanosomes-T. cruzi epimastigote electroporation was performed as described by Hariharan et al. (20) with minor modifications. In all cases, one or two rounds of antibiotic selection were carried out with a recovery period in between. Clonal lines were obtained by serial dilution in the absence of antibiotic selection, and tandem or single gene replacement lines were verified by Southern analysis (18). For TcCL:TR3, approximately 275 pmol of the gel-purified 2.7-kb MluI-EcoRV fragment from the pBS:CH6N3 plasmid (described below and in the legend to Fig. 2A) was used during electroporation of 3 × 10⁸ epimastigotes. A transformed hygromycin-resistant (Hygʳ) population was selected by application of 500 μg/ml hygromycin B sulfate at 48 h post-electroporation. For the TcCL:FN3 line, the gel-purified 1.2-kb AflII-EcoRV fragment from the plasmid pBS:FN3 (Fig. 2B) was used, and for the TcCL:CnFc and TcCL:SAM stable lines, the 2.5-kb MluI-EcoRV "CnFc" fragment from pBS:CnFcII and pBS:SAM plasmids (described below and in the legend to Fig. 2C) was used. Selection was carried out by application of 250 μg/ml G418.
Oligonucleotides-All DNA oligonucleotides used in the experiments are listed and described in Table I.
PCR Conditions-Unless stated otherwise, polymerase chain reaction amplifications (21) were carried out in a volume of 100 μl containing 50 mM KCl, 10 mM Tris-Cl, pH 9.0, 1.5 mM MgCl₂, 0.1% Triton X-100, 0.2 mM dNTPs, 2.5 units of Taq polymerase, 0.1 μg of each primer, and either 10 ng of genomic DNA or 1 ng of plasmid DNA as template. Standard amplification conditions were either 2 min at 94°C, 1 min at 55°C, and 2 min at 72°C for 30 cycles or 2 min at 94°C, 1 min at 50°C, and 2 min at 70°C for 30 cycles.
RNA Isolation and Northern Hybridizations-Total cellular RNA was isolated by the guanidinium/cesium chloride method of Sambrook et al. (22) or by using TRIzol® (Life Technologies, Inc.) according to the manufacturer's suggestions.
Blotting, hybridization, and washing conditions used in Northern analyses were as described in Hariharan et al. (20), and Northern hybridizations were carried out on RNA that was size-fractionated on 1.1% agarose gels containing 2.2 M formaldehyde (22). Probes for the ubiquitin, calmodulin, CUB, Neoʳ, and CAT genes used in Northern analysis were generated by PCR amplification as described previously (18,20,23). The Hygʳ probe was generated using Hyg1 and Hyg2 primers (Table I) to amplify the protein coding sequence template. The tubulin probe was generated using the Tub1 and Tub2 primers (Table I) to amplify the tubulin coding sequence from the p164 plasmid (24).
³²P Labeling of Oligonucleotide Primers-Splint labeling of the primers used during 5′ extension and S1 nuclease protection analyses was performed using a procedure based on the method of Hausner et al. (25). Splint labeling was performed by PCR amplification of a 2:1 ratio of biotin-conjugated CAT9 or Neo8 primer to their complementary primers CAT10 or Neo7 in a 50-μl volume under the PCR conditions described above. Five cycles of 1 min at 95°C, 1 min at 50°C, and 1 min at 70°C were carried out followed by binding to avidin-agarose beads (Pierce) in PBS, 500 mM NaCl for 10 min at ambient temperature with periodic vortexing. Binding was followed by one wash step with an equal volume of PBS, 500 mM NaCl. To isolate the labeled primer, the avidin-agarose-bound primers were resuspended in distilled H₂O, boiled for 3 min, and pelleted. The labeled primer was released into the supernatant.
Primer Extension Analysis-Primer extensions were done essentially as described by Sambrook et al. (22) with the following modifications and conditions. Unless stated otherwise, 100 μg of total cellular RNA and approximately 33-50 ng of splint-labeled primer were used. Primer annealing was carried out at 42°C, and primer extensions using RNase H⁻ Moloney murine leukemia virus reverse transcriptase (Superscript II, Life Technologies, Inc.) were carried out at either 37 or 42°C for 1 h. Quantitative primer extensions were performed by carrying out simultaneous primer extensions of FUS1/CAT and tubulin transcripts in a single reaction. During the quantitative 5′ extensions, 4 ng of 5′ terminus-labeled Tub10 primer was used in each reaction.
PhosphorImager Analysis-The disintegrations/min of the radioactive signals from the primer extension gels and Northern blots were detected using a Molecular Dynamics Storm PhosphorImager, model 860. Analysis of the signals was carried out using Image QuaNT and FragmeNT Analysis software (Molecular Dynamics).
S1 Nuclease Protection Analysis-S1 nuclease protection analyses were performed as described by Sambrook et al. (22). The single-stranded DNA probe was generated by linear PCR amplification of AflII-digested pBS:CH6N3 plasmid with splint-labeled Neo8 primer (Fig. 4C). The 410-nt probe, isolated from a 5% denaturing polyacrylamide gel, was annealed to 100 μg of total RNA overnight at 37°C. S1 nuclease treatment was performed at 37°C for 1 h with 2 units of enzyme/μl of reaction.
Following digestion, all samples were precipitated, resuspended in 3 μl of formamide sample buffer, and separated on 5% polyacrylamide DNA sequencing gels.
Debranching of RNA-HeLa cell S100 cytoplasmic extracts with debranching activity were generously provided by Drs. Kenneth Watkins and Nina Agabian (University of California, San Francisco) (7). Debranching of total cellular RNA was carried out as described previously except that 100 μg of RNA was treated for 2 h (7). After debranching, the RNA was extracted with phenol and precipitated prior to annealing with the labeled oligonucleotide and primer extension.
DNA Sequencing-The DNA sequencing ladders included in the S1 nuclease protection and primer extension analyses were Sequenase version 2.0 (U.S. Biochemical Corp.)-catalyzed dideoxy sequences of plasmid DNAs with primers chosen to match the 5′-end of the primer used in the S1 protection or 5′ extension experiment. The template plasmid was also matched to the experiment.
RNA Stability Analysis-Midlog epimastigotes were treated with actinomycin D at a concentration of 10 μg/ml to inhibit transcription. Trypanosomes were collected at the reported time points (0, 30, 60, 90, 120, and 180 min). RNA was isolated with TRIzol® (Life Technologies, Inc.) and analyzed by Northern blot for FUS1/CAT mRNA. Blots were stripped by boiling for 30 min in 1× SSC, 1.0% SDS and reprobed for tubulin.
pBS:CH6N3 and pBS:FN3 Constructions-The plasmids used for all experiments were constructed with the Bluescribe plasmid (pBS+/−) from Stratagene, Inc., and sequenced to verify that unintentional mutations were not introduced. Unless otherwise stated, all PCR fragments and restriction sites were treated with T4 DNA polymerase to generate blunt ends prior to ligation.
The plasmids pBS:CH6N3 and pBS:FN3 are shown in Fig. 2, A and B, which shows the alignment of genomic T. cruzi sequences in the plasmid with portions of the wild-type 2.65 calmodulin-ubiquitin locus. The "C" and "300" fragments and the Neoʳ coding sequence were generated by PCR amplification as described previously (18,20). The C region consists of the 525-bp sequence immediately upstream of the initiation codon of CUB2.65, including the 3′ splice acceptor site (23). In pBS:CH6N3, the C fragment is fused to the Hygʳ gene, generated by PCR amplification of the protein coding sequence with Hyg1 and Hyg2 oligonucleotides. This coding sequence was followed by the 362-bp "SAM6-F" region that consists of the entire sequence between the coding sequences of the CUB2.65 and the downstream FUS1 genes. The "SAM6-F" fragment was generated by PCR amplification of a wild-type "F" sequence with the oligonucleotides 5′F and SAM6, which carried two point mutations (Table I). The Neoʳ gene is fused to the F region and is flanked at its 3′-end with the 300 intergenic sequence, which separates FUS1 from PUB12.5 (17,18). The plasmid pBS:FN3 contains an F sequence of 525 bp that extends into CUB2.65 coding sequence, which was generated by PCR amplification with primers Ub-18 and Fuspro1 and carries no point mutations. This fragment is ligated to the Neoʳ coding sequence, which is in turn followed by the 300 fragment described above.
pBS:CnFc and pBS:CnFcSAM Constructions-For these experiments pBS:CnFc (18) was modified to eliminate the sup4 gene, flanked by XhoI restriction sites, by digestion with XhoI and religation. The modified plasmid was named pBS:CnFcII. Fig. 2C diagrams the pBS:CnFcII construct and shows the alignment of genomic T. cruzi sequences in the plasmid with the wild-type 2.65 calmodulin-ubiquitin locus. Subsequent pBS:SAM constructs were derived from pBS:CnFcII and contain the point mutations described in Table II.
To generate the pBS:CnFcSAM plasmids, pBS:CnFcII was mutagenized by site-directed mutagenesis using splice acceptor site mutation (SAM) oligonucleotides: SAM1, SAM2, SAM3, SAM4, CATM1, and CATM2. In all of the plasmids, the F region was sequenced to confirm that no spurious mutations were introduced.

Deletion of a Native 3′ Splice Acceptor Site Yields Multiple Transcripts-The first set of experiments was designed to better understand trans-splicing of the FUS1 transcript by eliminating its wild-type 3′ splice acceptor site in the native locus (18). Because CUB2.65 and FUS1 belong to multicopy gene families, stable transformation was used to replace these genes with Hyg and Neoʳ, respectively, while simultaneously introducing targeted point mutations in the F intergenic region separating them (Fig. 1 and Ref. 18). Two stable lines were generated (see "Materials and Methods"). TcCL:TR3 carried two mutations and both gene replacements. One mutation eliminated the native AG dinucleotide 3′ splice acceptor site by creating an A to T substitution at position −14 relative to the FUS1 translation initiation codon (Fig. 1). The second point mutation in this line eliminated an alternative ATG translation initiation codon by creating a G to C substitution at position −24 (Fig. 1). TcCL:FN3 was the control stable line and carried the single FUS1/Neoʳ gene replacement and no mutations within the F intergenic region.

Fig. 2 legend: T7 and T3 refer to the locations of the T7 and T3 primers in the vector, Bluescribe (Stratagene). The thick black line indicates polylinker sequences from the vector. Restriction sites are given. Details of the plasmid construction are described under "Materials and Methods."
TcCL:TR3 was one of several clonal lines isolated from a hygromycin-resistant population. Selection for expression of the FUS1/Neoʳ gene replacement was never imposed so that mutations that may have blocked expression of the Neoʳ gene could be represented in the original transformed parasite population. TcCL:TR3, like all tandem replacement lines analyzed, was found to lack neomycin phosphotransferase II activity despite the presence of the FUS1/Neoʳ gene replacement (data not shown). This suggested that if the FUS1/Neoʳ gene was transcribed, cryptic 3′ splice acceptor sites within the F region were not used, since this would have led to neomycin phosphotransferase II production. By comparison, the clonal line TcCL:FN3, which also carried the FUS1/Neoʳ gene replacement but retained the native 3′ splice acceptor site, expressed neomycin phosphotransferase II activity that was over 1 × 10⁵-fold above background levels (data not shown). Taken together, the lack of neomycin phosphotransferase II activity and verification of the successful gene replacement indicated that although TcCL:TR3 carried the FUS1/Neoʳ gene replacement, the protein product was not expressed, presumably because eliminating the 3′ splice acceptor site blocked maturation of the FUS1/Neoʳ mRNA. To determine how completely FUS1/Neoʳ gene expression was blocked and whether spontaneous mutations restoring expression could be isolated, the TcCL:TR3 line was subjected to G418 selection (data not shown). Despite repeated attempts, G418-resistant parasites were never isolated, which indicated that the introduced mutation was stable.
Although the analyses indicated that the FUS1/Neoʳ gene product of TcCL:TR3 was not expressed, they did not address whether the gene was transcribed and, if so, how the transcript was processed. To address the first of these questions, the Northern blot analysis shown in Fig. 3 was carried out. Five replicate blots of total RNA isolated from either nontransformed T. cruzi or TcCL:TR3 parasites were hybridized with calmodulin, CUB, ubiquitin, Hyg, and Neoʳ coding sequence probes (see "Materials and Methods"). As expected, TcCL:TR3 parasites expressed the calmodulin and ubiquitin genes but lacked the 1.1-kb CUB2.65 mRNA (Fig. 3, panel CUB, lane 2). TcCL:TR3 also expressed the CUB2.65/Hygʳ mRNA (Fig. 3, panel Hyg, lane 2). The Neo probe recognized two RNAs in TcCL:TR3 of approximately 1.2 and 1.4 kb (Fig. 3, panel Neo, lane 2), raising the possibility that these RNAs were either alternatively spliced nontranslatable products, nonspliced RNA processing intermediates, or both. In addition, the extended exposure necessary to detect the FUS1/Neoʳ transcripts in TcCL:TR3 suggested that the RNAs were expressed at low levels relative to the native FUS1 mRNA. In contrast, the expected single transcript was detected by the Neo probe in TcCL:FN3 (data not shown; see Ref. 18).

Table I. Sequences of DNA oligonucleotides that were used in the constructions and/or analyses. All sequences are in 5′ to 3′ orientation. Numbers indicate nucleotide positions with respect to the indicated source. Primers that are the complement of coding strand sequences are referred to as "reverse primers." Lowercase letters indicate mutations or noncoding sequences, such as added restriction sites, and are explained in the description of the oligonucleotide. A reference or GenBank™ accession number is given for each oligonucleotide where possible.
GGAACGCGGAACCATGGtGAAg: nucleotides 13-1 upstream of FUS1 (XO7452) fused to nucleotides 1-9 of CAT coding sequence (VB0065) with two of the SAM+10 point mutations, lowercase "t" and "g".
CATM2, GGAACGCGGAACCATGGtGAAAAAAg: nucleotides 13-1 upstream of FUS1 (XO7452) fused to nucleotides 1-13 of CAT coding sequence (VB0065) with two of the SAM+14 point mutations, lowercase "t" and "g".

Table II footnotes: a, the positions of the introduced mutations correspond to the numbering in Fig. 1. b, the position of the first AG dinucleotide 3′ of the branch points for splicing of the FUS1/CAT transcript.
Altered Processing of the FUS1/Neoʳ Transcripts of TcCL:TR3-To further characterize the two FUS1/Neoʳ transcripts observed in TcCL:TR3, primer extension and S1 nuclease protection analyses were carried out (see "Materials and Methods"). These experiments demonstrated that a subset of the FUS1/Neoʳ transcripts were trans-spliced within the FUS1/Neoʳ protein coding sequence, and the remainder were nonspliced RNAs with defined 5′ termini.
Primer extension analyses were carried out using an oligonucleotide complementary to the Neoʳ coding strand to prime reverse transcription of total RNA isolated from TcCL:TR3, wild-type, or TcCL:FN3 parasites (Fig. 4A). RNA from TcCL:FN3 and wild-type nontransformed parasites was included as positive and negative controls, respectively. No reverse transcription products were generated from wild-type RNA as expected (Fig. 4A, lane 3). The TcCL:FN3 primer extension product (Fig. 4A, lane 2) indicated the FUS1/Neoʳ transcript was trans-spliced at position −12, the previously identified native 3′ splice acceptor site for FUS1 (Fig. 1 and Ref. 18). The 47-nt reverse transcription product included 12 nt of the FUS1/Neoʳ 5′-untranslated region and 35 nt of the 39-nt spliced leader sequence. As previously noted, modification of the four 5′-terminal bases of the spliced leader blocked further reverse transcription (3). It can also be seen that in TcCL:FN3 splicing at position −12 was quantitative with no splicing detected at the next AG dinucleotide downstream.
The results of the 5′ extension analysis from TcCL:TR3 were more complex. Multiple extension products were detected including the product expected from a template RNA that was trans-spliced at position +14, the first AG dinucleotide downstream of the polypyrimidine tract in TcCL:TR3 (see +14 in Fig. 4A, lane 1). A comparison of the intensities of the TcCL:FN3 (−12) primer extension product and the TcCL:TR3 (+14) product suggested that splicing at the +14 site may be less efficient than splicing at the native site in TcCL:FN3. Three longer products at positions −112, −113, and −116 were also reproducibly observed in TcCL:TR3 (Fig. 4A, lane 1). These products were unlikely to represent additional trans-spliced FUS1/Neoʳ mRNAs for two reasons. First, this region contains no AG dinucleotides at the positions where they would be expected if these were trans-spliced RNAs, and second, trans-splicing at these sites would lead to expression of neomycin phosphotransferase II. Closer examination revealed that these three products mapped to three A nucleotides immediately upstream of an uninterrupted polypyrimidine tract (Fig. 1), suggesting that they represented reverse transcription products in which extension was blocked by a Y-branched structure. The other products of intermediate length seen in TcCL:TR3 may have been due to secondary structures predicted to form within this region of the F sequence (data not shown; see Ref. 27), which blocked reverse transcriptase. At this point, they have not been investigated further.

Fig. 4. Primer extension and S1 nuclease protection analysis of TcCL:TR3 and TcCL:FN3 Neoʳ transcript 5′-ends. A and B are autoradiographs of primer extension and S1 nuclease analysis of total RNA, respectively. The hash-marked bands and numbers indicated are described under "Results." Sequencing ladders are pBS:CH6N3 primed with the Neo8 oligonucleotide (Table I), thus allowing direct determination of the size and position of fragments. A, primer extension analysis with splint-labeled Neo8 oligonucleotide (Table I). Lane 1, TcCL:TR3; lane 2, TcCL:FN3; lane 3, wild type (WT). Film exposure was for 11 days at ambient temperature. B, S1 nuclease analysis. Lane 1, TcCL:TR3; lane 2, TcCL:FN3; lane 3, wild type (WT); lane P, 1:100 dilution of probe. Film exposure was for 2.5 days with one intensifying screen. C, diagram of the S1 nuclease protection probe generation. D, diagram of the effect of the branched intermediates on primer extension (PE) and S1 nuclease protection (S1) analysis.
To confirm the identity of the 3′ splice acceptor sites and support the identification of potential branch points within the F intergenic region, S1 nuclease protection was carried out on the same RNA samples (see "Materials and Methods" and Fig. 4B). Since the single-stranded end-labeled DNA probe lacked sequence complementary to the spliced leader sequence (see Fig. 4C), protection by trans-spliced mRNAs would generate products 35 nucleotides shorter than the corresponding 5′ extension products. In TcCL:FN3, a single protection product mapping to the −12 native 3′ splice acceptor site was obtained (Fig. 4B, lane 2), confirming the results of the primer extension analysis. In contrast, multiple products were generated from the TcCL:TR3 sample (Fig. 4B, lane 1). The shortest protection fragment represented FUS1/Neoʳ mRNA spliced at +14 and corresponded to the product mapped by 5′ extension. The protected fragments at −160/−162 did not correspond to any previously observed primer extension products (Fig. 4A, lane 1) and these products were substantially longer than any of the 5′ extension products. As shown below, the templates for these protection products were Y-branched FUS1/Neoʳ RNAs, which carried a defined 5′ terminus. Fig. 4D illustrates the probable sources of the products of S1 nuclease protection and primer extension of FUS1/Neoʳ RNA of TcCL:TR3.
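The length relationship between the two assays can be written out as a small worked calculation. The sketch below uses only the figures quoted in the text (a 39-nt spliced leader, four blocked 5′-terminal bases, and a 12-nt pre-mRNA-derived portion in the TcCL:FN3 example); the function and constant names are hypothetical and introduced purely for illustration.

```python
# Worked arithmetic for the expected product lengths from a trans-spliced mRNA.
# All names are illustrative; the numbers come from the text.

SL_LENGTH = 39          # length of the spliced leader sequence (nt)
BLOCKED_5P_BASES = 4    # modified 5'-terminal bases that stop reverse transcriptase
SL_READ_THROUGH = SL_LENGTH - BLOCKED_5P_BASES  # 35 nt of spliced leader copied


def primer_extension_length(pre_mrna_nt: int) -> int:
    """Primer extension product: pre-mRNA-derived part plus readable spliced leader."""
    return pre_mrna_nt + SL_READ_THROUGH


def s1_protected_length(pre_mrna_nt: int) -> int:
    """S1 product: the probe has no spliced leader complement, so only the
    pre-mRNA-derived part is protected."""
    return pre_mrna_nt


if __name__ == "__main__":
    pe = primer_extension_length(12)
    s1 = s1_protected_length(12)
    print(pe, s1, pe - s1)
```

For a branched, non-spliced RNA the two assays diverge in the opposite direction, because reverse transcriptase stops at the branch while S1 protection extends to the 5′ terminus of the RNA.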
Identification of Branched FUS1/Neoʳ RNAs in TcCL:TR3-To confirm the identification of the Y-branched intermediates, the TcCL:TR3 RNA sample was treated with a HeLa cell debranching extract prior to 5′ extension analysis (see "Materials and Methods" and Ref. 7). If a Y-branched structure was causing premature termination of reverse transcriptase, then eliminating the branch structure would result in 5′ extension and S1 nuclease protection products of equal length.
TcCL:TR3 RNA was subjected to three treatments. One sample was analyzed by 5′ extension (Fig. 5A, lane PE), a second sample was debranched prior to 5′ extension analysis (Fig. 5A, lane D), and a third sample was subjected to S1 nuclease protection (Fig. 5A, lane S1). The 5′ extension products corresponding to the putative branch points were evident in the untreated RNA sample (Fig. 5A, lane PE) but were not seen in the debranched RNA sample (Fig. 5A, lane D). Rather, a new product was generated, which corresponded in length to the longer S1 nuclease protection product (Fig. 5A, lane S1). As expected, the debranching extract had no effect on the mature trans-spliced mRNA as indicated by the +14 extension product in both samples (Fig. 5A, lanes PE and D). The same treatments and analyses performed on TcCL:FN3 RNA showed that the debranching extract had no effect on the trans-spliced TcCL:FN3 FUS1/Neoʳ mRNA (Fig. 5B). Thus, these results confirmed that the template RNAs carried Y-branched structures with a defined 5′ terminus, which mapped to position −160/−162.
Generation of SAM Stable Lines-The analysis of the FUS1/Neoʳ transcripts of TcCL:TR3 supported the scanning hypothesis for the selection of the 3′ splice acceptor site during trans-splicing. This hypothesis suggests that during the second step of splicing, the spliceosome scans the RNA downstream of the branch point in a 5′-3′ direction and splices at the first AG dinucleotide it encounters (10,28). To further test this hypothesis, additional mutant stable lines were generated, which carried mutations in the F region such that the first AG dinucleotide in each line occurred at a different position downstream of the branch points (Table II and Fig. 6). For these experiments, the inserts from the plasmids pBS:CnFcII and pBS:SAM(s) (see "Materials and Methods" and Fig. 2C) were used to generate the stable transformants carrying CUB2.65/Neoʳ and FUS1/CAT gene replacements. pBS:CnFcII carried the native F intergenic region (Fig. 1). pBS:SAM plasmids each carried point mutations in the F intergenic and CAT coding sequences, which placed the consensus AG dinucleotide at various positions (Table II and Fig. 6). For pBS:SAM−59, −35, and −23, one point mutation was introduced into the F sequence that placed the first consensus splice acceptor site at −59, −35, and −23, respectively, relative to the first nucleotide of FUS1/CAT coding sequence. The wild-type site at −12 was retained in these constructs. For pBS:SAM+7, the wild-type splice acceptor site was eliminated by deleting the A residue at −14, which left the AG dinucleotide at position +7 within the CAT coding sequence as the first potential 3′ splice acceptor site. For pBS:SAM+10 and +14, the wild-type splice acceptor site was eliminated, and point mutations were introduced within the CAT sequence such that an AG dinucleotide occurred at a position that would direct splicing at +10 or +14, respectively.

Fig. 5 legend: For clarity, the numbers are also classified with a technique abbreviation in parentheses: PE for primer extension and S1 for S1 nuclease protection. Lane PE, splint-labeled Neo8 primer (Table I) extension; lane D, splint-labeled Neo8 primer extension of debranched total RNA; lane S1, S1 nuclease protection; lane P, S1 nuclease protection probe control; lane 1/100, 1:100 dilution of S1 nuclease protection probe. B marks the branch sites at −112, −113, and −116 in A. Numbers are described under "Results." Sequencing ladders are pBS:CH6N3 primed with the Neo8 oligonucleotide, thus allowing direct determination of the size and position of fragments. Autoradiograph exposures were for 2.5 days with one intensifying screen for A and 2 days at ambient temperature for B.
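The prediction made by the scanning hypothesis can be illustrated with a minimal sketch: given a branch point, report the first AG dinucleotide found 5′ to 3′ downstream of it, and see how a point mutation moves that site. The sequence below is an invented toy string, not the F intergenic region, and the helper names are hypothetical. The sketch also ignores the polypyrimidine tract and any minimum branch-to-AG distance, which the text does not quantify.

```python
from typing import Optional


def first_ag_after(seq: str, branch_index: int) -> Optional[int]:
    """Return the 0-based index of the A of the first AG dinucleotide located
    strictly downstream (3') of branch_index, or None if there is none."""
    hit = seq.upper().find("AG", branch_index + 1)
    return hit if hit != -1 else None


def substitute(seq: str, index: int, base: str) -> str:
    """Return a copy of seq carrying a single point substitution at index."""
    return seq[:index] + base + seq[index + 1:]


# Toy pre-mRNA (assumption, for illustration only): branch A at index 2,
# a pyrimidine-rich stretch, a "native" AG at index 20, and a second AG at 28.
TOY = "CCACTTTTCTTTCCTTCCGCAGCCATGGAGAAA"
BRANCH = 2

if __name__ == "__main__":
    print(first_ag_after(TOY, BRANCH))                        # native site
    print(first_ag_after(substitute(TOY, 20, "T"), BRANCH))   # native AG destroyed
    print(first_ag_after(substitute(TOY, 17, "A"), BRANCH))   # new upstream AG
```

Destroying the native AG moves the predicted site to the next AG downstream, and creating an AG upstream pre-empts the native site, which mirrors the logic of the SAM+ and SAM− constructs respectively.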
TcCL:SAM Stable Lines Support Scanning Model for the Selection of the 3′ Splice Acceptor Site-Northern blot analysis indicated that each TcCL:SAM line produced FUS1/CAT mRNAs of approximately 1.0 kb as expected (Fig. 7A) (18), although the detected level of FUS1/CAT mRNA varied from line to line. TcCL:CnFc, which carried the native F region, had the highest level of FUS1/CAT mRNA (Fig. 7A, lane 2), suggesting that the mutations carried by the TcCL:SAM lines had variably adverse effects. TcCL:SAM−59, −35, and +7 (Fig. 7A, lanes 3, 4, and 6) exhibited the most dramatic decreases in FUS1/CAT mRNA levels, since in these lines transcripts could only be detected with extended exposure of the autoradiograph (Fig. 7A). Overall, the Northern blot analysis was consistent with the notion that the mutations introduced in the TcCL:SAM lines had adversely affected maturation of the FUS1/CAT transcripts.
Quantitative primer extension analyses were carried out to assess how the SAM mutations affected trans-splicing of the FUS1/CAT transcripts. The position of the splice acceptor site in each line was determined using an oligonucleotide that was complementary to the CAT coding sequence to prime reverse transcription (see Fig. 7B and "Materials and Methods"). The quantity of FUS1/CAT product in each parasite line was normalized to the α-tubulin extension product so that the amounts of spliced FUS1/CAT mRNA could be compared between lines (Table III).
The results of the primer extensions are shown in Fig. 7B and Table III, which includes a description of the 3′ splice acceptor site used and the relative quantity of FUS1/CAT mRNA for each line. In all of the TcCL:SAM lines, the levels of trans-spliced FUS1/CAT mRNA were reduced to 8-28% of that detected in TcCL:CnFc (Table III). For example, in TcCL:SAM+14, the intensity of the CAT primer extension product detected at position +14 is approximately 28% of that detected in TcCL:CnFc (Fig. 7B, lanes 2 and 7). In TcCL:SAM−35, the intensity of the CAT primer extension product was only 8% of that detected in TcCL:CnFc (Fig. 7B, lanes 2 and 4).
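The normalization used for these percentages (stated in the Table III footnote as the tubulin-normalized CAT signal of a SAM line divided by the tubulin-normalized CAT signal of TcCL:CnFc, times 100) can be sketched as follows. The function name and the phosphor unit values in the usage example are hypothetical and are not measurements from this study.

```python
def relative_spliced_mrna(sam_cat: float, sam_tubulin: float,
                          cnfc_cat: float, cnfc_tubulin: float) -> float:
    """Percent spliced FUS1/CAT mRNA in a SAM line relative to TcCL:CnFc.

    Each CAT signal is first divided by the tubulin signal from the same
    reaction (loading control); the ratio of the two normalized values is
    then expressed as a percentage.
    """
    sam_normalized = sam_cat / sam_tubulin
    cnfc_normalized = cnfc_cat / cnfc_tubulin
    return sam_normalized / cnfc_normalized * 100.0


if __name__ == "__main__":
    # Invented phosphor units chosen only to illustrate the arithmetic.
    percent = relative_spliced_mrna(sam_cat=140.0, sam_tubulin=1000.0,
                                    cnfc_cat=500.0, cnfc_tubulin=1000.0)
    print(round(percent, 1))
```

Dividing by the tubulin signal from the same lane means that differences in RNA input or extension efficiency between reactions largely cancel out before the lines are compared.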
Supporting the scanning model for 3′ splice acceptor site selection, splicing in TcCL:CnFc occurred specifically at the wild-type site at −12 (Fig. 7B, lane 2) as expected. The results from the TcCL:SAM stable lines further support this model, since splicing in each stable line occurred predominantly at the first AG dinucleotide downstream of the branch points and polypyrimidine tract. Two of the stable lines, TcCL:SAM−59 and −23 (Fig. 7B, lanes 3 and 5), exhibited a small degree of splicing at the wild-type site. In these two lines, both the amount of spliced FUS1/CAT mRNA and the precision of splicing were affected by the mutations in the F region. In contrast, no splicing at the wild-type site could be detected in TcCL:SAM−35, although the quantity of FUS1/CAT mRNA was decreased by 5-fold, indicating that although precision was maintained, efficiency was decreased (Fig. 7B, lane 4, and Table III). For TcCL:SAM+7, +10, and +14 (Fig. 7B, lanes 6-8), splicing occurred within the CAT coding sequence at positions +7, +10, and +14, respectively. Minor primer extension products were also detected near the position of the wild-type splice acceptor site despite the absence of an AG at −12 in these clones. Preliminary experiments indicate that these products represent FUS1/CAT transcripts that were spliced at non-AG dinucleotides (data not shown). Longer exposure of the autoradiograph in Fig. 7B also revealed primer extension products in TcCL:SAM+7, +10, and +14 that mapped to the three A residues at positions −112, −113, and −116, which have been previously identified as branch points for the FUS1 transcript.

Table II). The polypyrimidine tract (−109 to −91) is represented by a hatched rectangle, and the branch point hatch marks are marked by B. Δ, deletion of one nucleotide at the native 3′ splice site. The translation initiation codon is marked by a hatch mark at +1. The first AG dinucleotide downstream of the branch point is denoted by an underline. The blot was exposed with one intensifying screen for 14 days at −70°C. B, quantitative primer extension of TcCL:CnFc and TcCL:SAM stable transformants. Autoradiograph showing the results of the primer extension analysis. One hundred μg of total cellular RNA from TcCL:CnFc and each of the TcCL:SAM stable lines was analyzed using primer extension analysis (see "Materials and Methods"). Splint-labeled CAT10 primer (Table I) was used to analyze the position of the splice acceptor site of the FUS1/CAT message in each line, and 5′ terminus-labeled Tub10 primer (Table I)  The 76-nt tubulin extension product is denoted. The autoradiograph was exposed for 2 days at ambient temperature.

a The position of the primer extension products detected for each stable line is given (Figs. 1 and 7B).
b Primer extension bands (Fig. 7B) were quantitated by PhosphorImager analysis, and the relative amount of each product is given. The formula used to calculate the relative amounts of spliced FUS1/CAT mRNA for each clone is (SAM CAT phosphor units/SAM tubulin phosphor units)/(CnFc CAT phosphor units/CnFc tubulin phosphor units) × 100.
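The normalization given in footnote b of Table III (CAT signal divided by tubulin signal in a SAM line, expressed relative to the same ratio in TcCL:CnFc) can be written out as a short calculation. The following is a minimal sketch only: the function name, argument names, and the phosphor unit values in the example are hypothetical and are not taken from the paper.

```python
def relative_spliced_mrna(sam_cat, sam_tub, cnfc_cat, cnfc_tub):
    """Spliced FUS1/CAT mRNA in a SAM line as a percentage of the CnFc control.

    Implements (SAM CAT / SAM tubulin) / (CnFc CAT / CnFc tubulin) x 100,
    where every argument is a PhosphorImager reading in phosphor units.
    """
    # The three denominators must be positive for the ratio to be meaningful.
    if min(sam_tub, cnfc_cat, cnfc_tub) <= 0:
        raise ValueError("tubulin and control CAT signals must be positive")
    sam_ratio = sam_cat / sam_tub      # loading-corrected signal, mutant line
    cnfc_ratio = cnfc_cat / cnfc_tub   # loading-corrected signal, control line
    return sam_ratio / cnfc_ratio * 100.0


# Hypothetical readings, chosen only to illustrate the arithmetic.
percent = relative_spliced_mrna(sam_cat=140.0, sam_tub=1000.0,
                                cnfc_cat=500.0, cnfc_tub=1000.0)
print(f"{percent:.1f}% of control")
```

Dividing by the tubulin signal first corrects each lane for differences in RNA loading, so the final percentage reflects only the change in FUS1/CAT signal relative to the control line.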
SAM Mutations Decrease the Efficiency of trans-Splicing of the FUS1/Neoʳ Transcripts-The diminished levels of FUS1/CAT transcripts detected in the TcCL:SAM lines could have been the result of one or more mechanisms including alterations in transcription, RNA processing, or mRNA stability. Since transcription in trypanosomes is polycistronic (4,29) and constitutive transcription across the 2.65 calmodulin-ubiquitin locus has been demonstrated (30), it is unlikely that the point mutations introduced in the TcCL:SAM stable lines affected transcription of the FUS1/CAT gene. To distinguish between the other two possibilities, the stability of the FUS1/CAT mRNA expressed in three different lines was assessed (see "Materials and Methods"). TcCL:CnFc served as the control, since splicing occurs exclusively at the wild-type site in this line. TcCL:SAM−23 and +7 were analyzed, because the point mutations in these clones had respectively smaller and larger effects on expression from the FUS1 locus when compared with expression in the other TcCL:SAM lines (Table III). Fig. 8 shows Northern blots that are representative of the results obtained in the RNA stability analysis. The same Northern blots stripped and probed for tubulin transcripts are included as a control, since tubulin mRNA has a half-life that is significantly longer than that of FUS1/CAT mRNA (data not shown). Analysis of the RNA isolated at various time points after inhibition of transcription revealed that the half-life of FUS1/CAT mRNA for all of the clones analyzed was approximately the same (Table IV). Importantly, the point mutations introduced in these TcCL:SAM lines had no effect on the stability of the FUS1/CAT transcripts. Hence, coupling these results with the results of the quantitative primer extension analysis suggests that decreased efficiency of splicing in the F region is responsible for reduced levels of FUS1/CAT transcripts in the TcCL:SAM lines.
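As an illustration of how a half-life might be extracted from such a time course, the sketch below fits a straight line to the logarithm of the signal against time. This is an assumption made for illustration: the paper states only that half-lives were obtained by plotting time versus detected FUS1/CAT mRNA, and does not say that first-order decay or a least-squares fit was used. The function name and the signal values are hypothetical; the time points are those listed in the legend of Fig. 8.

```python
import math


def half_life_minutes(times_min, signals):
    """Estimate an mRNA half-life, ASSUMING first-order (exponential) decay.

    Fits ln(signal) = ln(s0) - k * t by ordinary least squares and
    returns ln(2) / k in the same time unit as times_min.
    """
    if len(times_min) != len(signals) or len(times_min) < 2:
        raise ValueError("need at least two matching time/signal points")
    if any(s <= 0 for s in signals):
        raise ValueError("signals must be positive to take logarithms")

    logs = [math.log(s) for s in signals]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, logs))

    slope = sxy / sxx  # equals -k for a decaying signal
    if slope >= 0:
        raise ValueError("signal does not decay over the time course")
    return math.log(2) / -slope


# Time points from the Fig. 8 legend; signal values are synthetic.
times = [0, 30, 60, 90, 120, 180]
signals = [100.0 * 0.5 ** (t / 60.0) for t in times]
print(f"estimated half-life: {half_life_minutes(times, signals):.1f} min")
```

In practice each FUS1/CAT reading could first be divided by the tubulin reading from the same lane, and the estimate averaged over independent experiments, as the footnote to Table IV indicates was done.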
DISCUSSION

trans-Splicing is a process in trypanosomes that is required for the maturation of mRNA. To understand this process, it is important to identify the cis-acting sequences involved and, further, to understand the interplay between these sequences that leads to productive trans-splicing. The AG dinucleotide at the 3′ splice acceptor site and its associated upstream polypyrimidine tract(s) are the most prominent and highly conserved cis-acting sequences of the trypanosomal pre-mRNA (9,14). This report centered on studying the effects of subtle point mutations that deleted the native AG 3′ splice acceptor site and/or introduced new AG dinucleotides in the region surrounding the native 3′ splice site of the FUS1 transcript.
In TcCL:TR3, the elimination of the native 3′ splice acceptor site within the F region led to splicing at the next AG dinucleotide downstream and the accumulation of Y-branched intermediates. These intermediates facilitated the mapping of the first bona fide branch points in T. cruzi, only the second identified in trypanosomatids (16). Branching on the FUS1/Neoʳ pre-mRNA occurred at the three A residues directly upstream of the only polypyrimidine tract. Furthermore, analysis of TcCL:TR3 also provides evidence for the scanning model for 3′ splice site selection (10,28) during trans-splicing as illustrated through two observations. The first is that the native 3′ splice acceptor site for the FUS1 transcript occurs at the first AG dinucleotide downstream of the branch points and polypyrimidine tract (18). Second, in the absence of the native 3′ splice acceptor site in TcCL:TR3, splicing occurred at the next AG dinucleotide downstream at +14.
FIG. 8. Stability of FUS1/CAT mRNA is unaltered in TcCL:SAM stable transformants. The lines TcCL:CnFc, TcCL:SAM−23, and TcCL:SAM+7 were analyzed. Total RNA was isolated at various times after inhibition of transcription (0, 30, 60, 90, 120, and 180 min) and analyzed by Northern blot (10 μg/lane). Each blot was hybridized first with the CAT probe and then stripped and rehybridized with the tubulin probe (see "Materials and Methods"). The autoradiographs for FUS1/CAT Northern blots were exposed for the following lengths of time: TcCL:CnFc, 2 days with one intensifying screen at −70°C; TcCL:SAM−23, 6 days with two intensifying screens at −70°C; TcCL:SAM+7, 10 days with two intensifying screens at −70°C. The autoradiographs for tubulin Northern blots were exposed for 1 day at ambient temperature.

The scanning model for 3′ splice site selection suggests that the pre-mRNA is scanned by some component(s) of the spliceosome and splicing occurs at the first AG dinucleotide downstream of the branch point (28,32). This model has been supported in cis-splicing by carefully designed experiments done in vitro and in vivo using yeast and mammalian transcripts (33)(34)(35)(36). The experiments demonstrated that under most circumstances, when the position of the AG was altered by mutation, splicing occurred at the downstream AG dinucleotide closest to the branch point. Analyses of mapped native splice sites from cis-spliced RNAs also suggest that splicing occurs at the first AG downstream of the predicted branch point (10). Furthermore, bimolecular splicing experiments in which the 5′ splice site, branch point, and polypyrimidine tract are located on an RNA separate from that RNA carrying the 3′ splice site suggest that these RNAs can be spliced together (37). These experiments demonstrate that a 5′ to 3′ scanning mechanism may be operational, since splicing occurs at the AG dinucleotide closest to the 5′-end of the target RNA.
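The prediction made by the scanning model can be stated as a simple search: starting just 3′ of the branch point, take the first AG dinucleotide encountered. The sketch below is a deliberately simplified illustration. It ignores the polypyrimidine tract, sequence context, and the minor off-target splicing reported in this study, and the function name and toy sequence are hypothetical rather than the FUS1 intergenic sequence.

```python
def first_ag_downstream(pre_mrna, branch_index):
    """Return the 0-based index of the A in the first AG 3' of the branch point.

    Scans 5' to 3', starting at the nucleotide after branch_index.
    Returns None if no AG dinucleotide is found.
    """
    seq = pre_mrna.upper()
    pos = seq.find("AG", branch_index + 1)
    return pos if pos != -1 else None


# Toy sequence: branch-point A at index 2, a pyrimidine-rich stretch,
# then AG dinucleotides at indices 15 and 24.
wild_type = "GGACTTTTTTCTTTCAGGTCATGGAGAAA"
branch = 2

# Point mutation AG -> AC at the first acceptor site (G at index 16 -> C).
mutant = wild_type[:16] + "C" + wild_type[17:]

print(first_ag_downstream(wild_type, branch))  # predicted site, wild type
print(first_ag_downstream(mutant, branch))     # predicted site, mutant
```

Destroying the first AG moves the predicted acceptor site to the next AG downstream, which is the behavior the model predicts when the position of the first AG is altered by mutation.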
The analysis of the TcCL:SAM stable lines provides the first set of experiments designed specifically to address the scanning hypothesis for 3′ splice acceptor site selection (10,28) during trans-splicing. These lines each carried a unique set of mutations that placed the first AG dinucleotide at various positions downstream of the branch point. The data observed from each of the lines supported the scanning model, since splicing occurred predominantly at the first AG dinucleotide downstream of the branch points. These analyses provide experimental support for the evolutionary conservation of a scanning mechanism used during RNA splicing in eukaryotic organisms. Additional support for the scanning model during trans-splicing comes from Patzelt et al., who identified the branch points of T. brucei α- and β-tubulin transcripts in which splicing also occurred at the first AG dinucleotide downstream (16). In trypanosomatids, it has been difficult to assess whether other native 3′ splice sites occur at the first AG downstream of the branch site, because no consensus sequence for branching has been determined, and other branch points have not been experimentally mapped in trypanosomatids.
In addition to these findings, the analysis of the SAM mutations suggests that splicing at the first AG dinucleotide downstream from the branch point may not always be precise. For example, in TcCL:SAM−59 and −23, although the predominant splicing event occurred at the first AG dinucleotide downstream of the branch point, a small amount of splicing was also detected at the intact wild-type splice site at −12. This contrasts with the observations from TcCL:CnFc and TcCL:SAM−35 in which splicing at one site is quantitative. Perhaps quantitative splicing at these sites was not observed because the positions of these AG dinucleotides along the primary transcript or nucleotide context surrounding the new sites in TcCL:SAM−59 and −23 are not optimal. cis-Splicing in yeast and metazoans can be affected by distance between the branch point and the splice acceptor site, quality and length of the polypyrimidine tract, context of the splice site, or any combination of these factors (33,35,38,39,41,42). Another possibility for the low level of promiscuous splicing observed in the TcCL:SAM stable lines is that the secondary structure of the precursor RNA molecule may affect the availability of a consensus site to be recognized by the splicing machinery (43,44). A more detailed analysis will have to be undertaken to better understand how other cis-acting factors affect splice acceptor site selection in trans-splicing and how the combination of different cis-acting factors influences trans-splicing.
Presently it is unclear why splicing is directed to the first AG downstream of the branch point of the pre-mRNA. Recent cross-linking studies using extracts from HeLa cell nuclei and Caenorhabditis elegans suggest that the small subunit of the U2 auxiliary factor, U2AF35, directly contacts and binds to the 3′ splice acceptor site and exon sequences located immediately downstream in a sequence-dependent manner (45,46). Coupled with evidence that the large subunit, U2AF65, is important for recognizing and binding to the polypyrimidine tract and branch point (47,48), splicing at the first AG downstream of the branch point may be a consequence of these combined interactions. A similar model could be proposed in trans-splicing, since U2AF35 was demonstrated to cross-link to the 3′ splice site of trans-splicing pre-mRNAs in HeLa nuclei extracts (46), and homologues of both U2AF subunits have been identified through sequence analysis in the trypanosomatid Leishmania major (GenBank™ accession numbers AC005836 and AC005893).
Additionally, there are a number of other factors that may also influence 3′ splice site selection. For example, Slu7p is a spliceosomal component that was originally identified in yeast from a screen of mutants that were synthetically lethal with mutations of U5 snRNA (49). In vitro experiments have demonstrated that the absence of this splicing factor is coupled with a loss of fidelity of splicing at the 3′ splice site (50). The importance of its involvement in splicing becomes more evident when the 3′ splice site is distal to the branch point (51). The characterization of U5 small nuclear ribonucleoprotein-associated proteins suggests that they also may determine 3′ splice site choice. Prp8p is one of the most highly conserved of these proteins (52) and binds to both the 5′ and 3′ splice sites (53)(54)(55). Recent studies show that mutants of Prp8 suppress mutations at both splice sites, thus suggesting that it plays a role during 3′ splice site selection (53,56). In addition, it is associated with other U5 small nuclear ribonucleoprotein proteins that carry RNA helicase domains, RNA unwindase activity, or homology to the translational elongation factor, EF2 (see Ref. 57 and references therein). Together, the U5 small nuclear ribonucleoprotein factors may work in a concerted fashion to effect a scanning mechanism and select and position the 3′ splice site during catalysis. Some of these interactions may be conserved during trans-splicing, since studies indicate that a homologue of Prp8p identified in T. brucei associates with SLA2, the trypanosome U5 snRNA homologue (58). More studies are necessary to determine which trans-acting factors and what mechanism(s) dictate selection of the 3′ splice site.
The need for a mechanism operating during trans-splicing that would reduce the likelihood of disruption of an open reading frame is illustrated by results from TcCL:SAM+7, +10, and +14 and TcCL:TR3, lines that in the absence of the native 3′ splice site spliced at the first AG dinucleotide within the protein coding sequence. As a result, translation of the truncated mRNA would be effectively blocked. A scanning mechanism (59) or the interaction of the U2AF could furnish such a device, since they direct splicing to a site near a branch point, thus decreasing the possibility of splicing within an open reading frame.
Analysis of both the TcCL:SAM and TcCL:TR3 lines suggests that, in addition to determining the splice site, the position of the AG may affect the efficiency of trans-splicing. The reduced efficiency of splicing observed in the TcCL:SAM lines and the appearance of stabilized Y-branched intermediates in TcCL:TR3 suggest that the position of the 3′ splice acceptor site could directly affect the level of expression of various proteins. Furthermore, this analysis, particularly the results of TcCL:SAM−35, suggests that splicing can be inefficient but remain precise. Inefficient but precise trans-splicing may be one means for reducing the amount of translatable mRNA. These results, taken together, support the theory that cis-acting factors influence how efficiently transcripts are trans-spliced and may explain how mature transcripts originating from one polycistronic unit are present at various levels (26,31,40).

a PhosphorImager analysis was used to determine the quantity of FUS1/CAT mRNA in each lane of Northern blots as shown in Fig. 8. The half-lives of FUS1/CAT mRNA for each stable line were determined by plotting time versus detected FUS1/CAT mRNA. Half-lives are given in minutes and were determined from an average of three independent experiments.