Structure and expression of novel spliced leader RNA genes in Caenorhabditis elegans.

Approximately 25% of Caenorhabditis elegans genes are organized as operons. Polycistronic transcripts are converted to monocistronic mRNAs by 3′ cleavage/polyadenylation and 5′ trans-splicing with untranslated, 5′-terminal exons called spliced leaders (SLs). The 5′ termini of mRNAs encoded by downstream genes in operons are acceptors for ≥7 recently discovered “novel” SLs and a classical SL (SL2). Diversity in SL exons is now partly explained by the discovery and characterization of five novel genes that encode C. elegans SL RNAs. These novel SL RNAs contain a 22- or 23-nucleotide SL followed by conserved splice donor and downstream sequences that are essential for catalysis of trans-splicing reactions. The SL3α, SL4, and SL5 RNA genes are tightly clustered on chromosome III; their 114-nucleotide transcripts deliver three distinct SLs to mRNAs. The SL3β and SL3γ RNA genes are on chromosome I, but are not tightly linked. SL RNAs 3α, 3β, and 3γ provide identical 5′ leader exons, although their 3′ sequences diverge. Transcription of SL 3-5 RNA genes appears to be driven by flanking DNA elements that are homologous with segments of promoters for the C. elegans SL2 RNA and small nuclear RNA genes. RNase protection assays demonstrated that novel SL RNAs are transcribed in vivo and accumulate in the poly(A−) RNA pool. SL3 exons are transferred to mRNAs as frequently as SL2 exons. In contrast, SL4 is appended to mRNAs 10% as frequently as SL3. The abundance of SL4 RNA increased 6-fold during postembryonic development, and the SL4 RNA gene promoter is active principally in hypodermal cells.

The free-living nematode Caenorhabditis elegans excises noncoding segments of gene transcripts by three processes. Internal introns are apparently removed by a conserved classical cis-splicing mechanism that involves snRNPs (U1-U6), a branched intermediate and auxiliary protein factors (1)(2)(3). Approximately 70% of C. elegans mRNAs are covalently modified at their 5' termini by the addition of a 22-nt leader sequence (4) named the "spliced leader" or SL. SLs are attached in trans-splicing reactions (5)(6)(7). The 5' SL that appears most frequently on C. elegans mRNAs corresponds to nucleotides 1-22 of the product of the SL1 RNA gene (5,8). This gene is reiterated in tandem ~100 times on chromosome V and encodes an RNA composed of ~95 nt (8). C. elegans transcripts that (a) contain a consensus splice acceptor sequence (UUUCAG) upstream from the coding region and (b) lack a corresponding 5' splice donor sequence are targets for trans-splicing (9). Most of the spliceosomal components required for cis-splicing are also essential for trans-splicing (10). Key differences are as follows: (a) an SL1 (or SL2; see below) RNA complexed with proteins is included in the splicing complex; (b) SL RNA is consumed in each round of trans-splicing, thereby necessitating reloading or reassembly of the spliceosome for subsequent catalysis; and (c) SL RNA provides a trimethylguanosine cap for mature mRNAs (11,12). The 22-nt leader derived from SL1 RNA is usually appended to the 5' end of mRNAs derived from typical genes, in which transcribed sequences are preceded by contiguous promoter/enhancer regions. Unlike many other eukaryotes, C. elegans contains numerous genes that are organized as operons (4,13). Thus, transcription of two (or more) structural genes is sometimes driven by a unique 5' promoter/enhancer region. Polycistronic transcripts are converted to monocistronic mRNAs by a combination of cleavage, polyadenylation, and trans-splicing (13)(14)(15).
Messenger RNAs encoded by downstream genes in operons receive leader sequences that differ from SL1 (13,16). Initially, two highly homologous SL RNA genes (SL2α and SL2β), which encode identical 22-nt SLs, were identified as sources of these 5' termini (16).
An early compilation of trans-spliced cDNA sequences suggested that C. elegans mRNAs received SL1 or SL2 in a mutually exclusive manner (16). However, recent applications of a reverse transcriptase-anchored PCR procedure (17) revealed that the 5' ends of mRNAs encoding a sex determination factor (TRA-2; Ref. 18), protein kinase C1A (19), and the β subunit of casein kinase II (20) are modified with multiple "novel" SLs. Thus, our knowledge of SL RNA genes is incomplete, and multiple pertinent questions regarding the origin and utilization of the novel SLs can be asked. Are the novel SLs more closely related to SL1 or SL2? Are the structures of the novel SL RNA genes similar to the previously described SL1 and SL2 RNA genes or different? Are the novel SL RNA genes clustered or scattered in the genome? What are the sizes of the pools of novel SL RNA transcripts relative to the levels of SL1 and SL2 RNAs? To what degree are the novel SLs utilized on mRNAs in vivo? Is the expression of any of the novel SL RNA genes regulated?
To address these questions we have cloned and characterized five novel SL RNA genes. The genes can be either clustered or isolated, and they encode homologous RNA sequences of 114 nt. The novel, 5' SL sequences are more closely related to SL2α and SL2β than SL1. Although the relative abundance of the novel SL RNAs is low when compared with steady-state SL1 and SL2 RNA concentrations, the utilization of certain novel 5' SL sequences on mRNAs occurs with a substantial frequency. Finally, expression of the novel SL4 RNA gene is developmentally regulated, and SL4 RNA transcripts are detected only in hypodermal cells.

EXPERIMENTAL PROCEDURES
Growth of C. elegans -The Bristol N2 strain of C. elegans was grown, synchronized, and harvested as described in previous publications (21,22).
Southern Gel Analysis -C. elegans genomic DNA was prepared by the procedure of Yesner and Emmons (23). Samples of high molecular weight DNA (10 µg) were digested with restriction endonucleases, and the resulting fragments were fractionated on a 1% agarose gel. To optimize hybridization of 32P-labeled oligonucleotides with DNA fragments, "dry" gels were prepared as described by Thein and Wallace (24). DNA fragments in the gels were denatured by incubation in 0.5 M NaOH, 1.5 M NaCl for 10 min at 22 °C. After neutralization in 0.5 M Tris-HCl, pH 8.0, containing 1.5 M NaCl (10 min at 22 °C), the gel was placed in 5 x SSPE (1 x SSPE = 10 mM sodium phosphate, pH 6.5, containing 0.15 M NaCl and 1.3 mM EDTA) that was supplemented with 0.1% sodium pyrophosphate, 0.1% SDS, and 0.1 mg/ml tRNA. Oligonucleotides complementary to SL2 (16) and SLb (see Fig. 1), a novel spliced leader, were 5'-end-labeled (10^9 cpm/µg) as described previously (21). Probes (2 x 10^6 cpm/ml) were hybridized with the gel for 8 h at 35 °C. Subsequently, the gel was washed twice with 6 X SSC (1 X SSC = 0.15 M NaCl, 15 mM sodium citrate, pH 7.0) containing 0.1% SDS for 15 min at 45 °C. After three rinses with the same solution at 22 °C, the gels were exposed to x-ray film (Kodak XAR-5) at -70 °C.
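The hybridization and wash buffers above are specified as multiples of 1 x stocks, so the working concentrations follow by simple scaling. The sketch below is only an illustration of that arithmetic; the dictionary and function names are hypothetical, and the minor additives (SDS, pyrophosphate, tRNA) are omitted.

```python
# Working-strength buffer components scaled from the 1x definitions quoted
# in the text. Concentrations are in mM. Names are illustrative only.

SSPE_1X = {"sodium phosphate": 10.0, "NaCl": 150.0, "EDTA": 1.3}
SSC_1X = {"NaCl": 150.0, "sodium citrate": 15.0}


def scale_buffer(stock_1x, fold):
    """Return component concentrations (mM) of a buffer at `fold` x strength."""
    return {name: round(conc * fold, 3) for name, conc in stock_1x.items()}


sspe_5x = scale_buffer(SSPE_1X, 5)  # base of the hybridization solution
ssc_6x = scale_buffer(SSC_1X, 6)    # base of the wash solution

for label, recipe in (("5 x SSPE", sspe_5x), ("6 x SSC", ssc_6x)):
    parts = ", ".join(f"{conc:g} mM {name}" for name, conc in recipe.items())
    print(f"{label}: {parts}")
```

Under these definitions, 5 x SSPE corresponds to 50 mM sodium phosphate, 0.75 M NaCl, and 6.5 mM EDTA, and 6 x SSC to 0.9 M NaCl and 90 mM sodium citrate.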
Isolation of Novel SL RNA Genes -A C. elegans genomic library in bacteriophage λEMBL4 was provided by Dr. Chris Link, University of Denver (Denver, CO). This library was screened with oligonucleotides complementary to novel leader sequences SLb and SLa (see Fig. 1), which appear at the 5' ends of PKC1A mRNAs (19). Duplicate nitrocellulose filter lifts of recombinant λEMBL4 phage were hybridized at 35 °C for 18 h in a mixture of 5 x SSPE containing 5 X Denhardt's solution, 0.1% sodium pyrophosphate, 0.1% SDS, 0.1 mg/ml tRNA, and 32P-labeled probe (4 x 10^6 cpm/ml). Subsequently, filters were washed twice (15 min) at 45 °C with 6 X SSC, 0.1% sodium pyrophosphate, 0.1% SDS and twice (15 min) at room temperature with the same solution before drying and autoradiography at -70 °C. Three positive phage clones were purified to homogeneity. The DNA insert from clone λSLA hybridized with the antisense SLa and SLb oligonucleotides; the other DNA inserts (clones λSLB and λSLC) hybridized only with the SLb antisense oligonucleotide. When λSLA was digested with HpaII, two fragments of similar size (~2.2 kbp) hybridized with the 32P-labeled SLb probe. These segments of genomic DNA were cloned into pGEM7Z for DNA sequence analysis. Digestion of λSLC with HindIII yielded a 1-kbp DNA fragment that hybridized with the SLb probe. This fragment was cloned into pGEM7Z and sequenced as described below. λSLB was digested with SpeI, and a 5.6-kbp fragment that hybridized with the SLb oligonucleotide was subcloned into pGEM5Z. The amplified 5.6-kbp insert was digested sequentially with AatII, XbaI, and ClaI. An AatII/XbaI fragment (2.5 kbp) hybridized weakly with the SLb probe and was subcloned into pGEM7Z for DNA sequencing. A 2.3-kbp ClaI/XbaI fragment hybridized strongly with the probe and was subcloned into pGEM7Z for further restriction enzyme analysis. Upon digestion with SacI, this recombinant plasmid yielded a 1.3-kbp genomic DNA fragment and linearized vector (3.0 kbp) fused with 1.0 kbp of C. elegans DNA. Both species of DNA hybridized with the SLb probe. The 1.3-kbp SacI fragment was cloned into pGEM5Z, the 4-kbp DNA segment was religated, and both inserts were sequenced as described below. Each recombinant pGEM5Z or 7Z plasmid that was sequenced contained only one SL RNA gene.
DNA Sequence Analysis -DNA inserts were sequenced by the dideoxynucleotide chain termination procedure of Sanger et al. (25) using SP6, T7, and synthetic oligonucleotide primers as described previously (21).
Computer Analysis -Analysis of sequence data, sequence comparisons, and data base searches were performed using PCGENE-IntelliGenetics software (IntelliGenetics, Mountain View, CA) and BLAST programs (26) provided by the NCBI server at the National Library of Medicine/National Institutes of Health.
Mapping SL RNA Genes to Chromosomes -A nitrocellulose filter that contains an array of 958 recombinant yeast artificial chromosomes (YACs) was obtained from Dr. Alan Coulson at the Medical Research Council Laboratory of Molecular Biology, Cambridge, United Kingdom. Overlapping DNA inserts (200-250 kbp) in the YACs account for >90% of the C. elegans genome (27). DNA inserts from plasmids that contain individual SL RNA genes and their contiguous 5' and 3' flanking sequences (see above) were used as templates to generate randomly primed, 32P-labeled probes (21). A plasmid containing the SL2α RNA gene was generously provided by Dr. David Hirsh (Department of Biochemistry, Columbia University). Filter grids were hybridized and washed under high stringency conditions as described previously for Southern blot analysis (21). Signals were visualized by autoradiography. Physical map positions for the hybridizing inserts were determined from the C. elegans electronic map data base (Harvard Medical School). Map positions of novel SL RNA genes were verified by the fingerprinting method of Coulson et al. (28) in a collaborative experiment with A. Coulson and colleagues (MRC Laboratory of Molecular Biology, Cambridge, United Kingdom).
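As a rough plausibility check on the gridded filter, the total amount of cloned DNA can be compared with the genome size. The sketch below assumes a genome of about 100 Mbp, a commonly cited figure that is not stated in this section, and it treats every YAC as carrying an insert in the quoted size range; variable names are illustrative.

```python
# Back-of-the-envelope redundancy of the YAC grid.
N_YACS = 958                 # number of YACs on the filter (from the text)
INSERT_KBP = (200, 250)      # insert size range in kbp (from the text)
GENOME_MBP = 100.0           # ASSUMPTION: approximate C. elegans genome size

# Total cloned DNA at the low and high ends of the insert size range (Mbp).
total_mbp = tuple(N_YACS * kbp / 1000 for kbp in INSERT_KBP)

# Average number of times each genomic position would be represented.
fold_coverage = tuple(t / GENOME_MBP for t in total_mbp)

print(f"Total insert DNA: {total_mbp[0]:.1f}-{total_mbp[1]:.1f} Mbp")
print(f"Approximate redundancy: {fold_coverage[0]:.1f}- to {fold_coverage[1]:.1f}-fold")
```

Roughly two-fold redundancy would be compatible with overlapping inserts that still leave a small fraction of the genome unrepresented, although the actual coverage depends on how the YACs were selected.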
Isolation of RNA and Northern Gel Analysis -Total RNA was extracted from C. elegans as described by Hu and Rubin (21). Poly(A+) RNA was isolated by the procedure of Sambrook et al. (29). Northern blot analysis was performed as described previously (30). RNA samples used in these studies were normalized to contain the same amounts of myosin light chain mRNAs as described in Hu and Rubin (20). Levels of myosin light chain mRNAs are constant during C. elegans development (31).
RNase Protection Analysis: Detection and Quantification of SL RNAs -Primers that direct the synthesis of ~200-bp segments of C. elegans DNA, which include an SL RNA gene and short 5'- and 3'-flanking sequences, were designed. Recombinant pGEM5Z or 7Z plasmids that contain SL RNA genes (see above) served as templates for DNA synthesis catalyzed by Pfu DNA polymerase (Stratagene). NotI and NsiI restriction sites were introduced via the 5' and 3' PCR primers, respectively. PCR products were digested with NotI and NsiI and cloned into pGEM5Z plasmid that was cleaved with the same enzymes. Constructs were verified by sequencing both DNA strands.
Plasmids were linearized by digestion with SpeI, and 32P-labeled antisense SL RNA probes (~200 nt) were synthesized via SP6 RNA polymerase and purified as described previously (20,22). Control (sense) probes were generated by T7 RNA polymerase after cleaving with PvuII. Hybridization of 32P-labeled antisense (or sense) probes (10^6 cpm) with C. elegans total RNA and subsequent digestion of single-stranded RNA with RNase T1 and RNase A was performed as previously reported (20). Protected duplex RNA was denatured and subjected to electrophoresis in a 6% polyacrylamide gel containing 7 M urea. Protected, 32P-labeled complementary SL RNA (~115 nt) was visualized by autoradiography on XAR-5 film and quantified by PhosphorImager (Molecular Dynamics) analysis.
Detection and Quantification of Spliced Leader Sequences in C. elegans mRNAs and Poly(A−) RNA -Levels of specific SL sequences in poly(A+) RNA were measured by RNase protection analysis, using full-length 32P-labeled antisense probes (see above). Assays were carried out as described above except for two modifications: probe and poly(A+) RNA were heated at 85 °C for 10 min and then incubated at 37 °C for 14 h, and RNase digestion was performed at 35 °C. Protected 32P-labeled fragments were characterized by electrophoresis (in a 15 or 18% polyacrylamide gel that contained 7 M urea) and autoradiography.
The abundance of various SL sequences in poly(A−) RNA was determined by RNase protection analysis as described above for poly(A+) RNA. 32P-labeled antisense RNA probes (5 x 10^5 cpm) that complement only the SL sequence were employed. Probes were generated by the method of Milligan et al. (32) using T7 RNA polymerase (>1000 units/ml) provided by Heike Pelka (Department of Molecular Developmental Biology, Albert Einstein College of Medicine). Reaction products (22- or 23-nt SL plus 12 or 13 irrelevant nt) were analyzed on a 10% polyacrylamide, 7 M urea gel to verify their lengths and purity. The 35-nt probes were readily distinguished from the 22/23-nt protected fragments in RNase protection assays.
Preparation of Transgenic C. elegans -Cloned genomic DNA that corresponds to the first 40 bp of the coding region of the SL4 RNA gene and 800 bp of contiguous 5'-flanking DNA (Fig. 3) was amplified by PCR (20,22). A separate set of primers was used to amplify a 197-bp DNA fragment that encompassed 162 bp of 5'-flanking sequence and 35 bp of the SL4 RNA structural gene. SphI and SalI restriction sites were introduced via the 5' and 3' primers, respectively. Amplified DNAs were digested with SphI and SalI and cloned into the C. elegans expression vector pPD16.51 (33), which was cleaved with the same enzymes. This places the putative SL4 RNA gene promoter upstream from a nuclear localization signal and the lacZ reporter gene. Recombinant SL4 RNA:lacZ chimeric genes and the rol-6 gene, which provides a selectable marker phenotype (34), were co-injected into the gonadal syncytium of C. elegans, and stable lines of transgenic nematodes were established as described previously (35).
Detection of SL RNA Gene Promoter Activity by in Situ Hybridization Analysis -C. elegans were fixed with 4% formaldehyde in 0.1 M HEPES-NaOH buffer, pH 6.9, containing 2 mM MgSO4 and 1 mM EGTA for 4.5 h at 20 °C. The specimens were dehydrated by successive incubations (10 min) with solutions containing increasing amounts (20% increments) of methanol and corresponding decrements in 10 mM sodium phosphate buffer, pH 7.4, 0.15 M NaCl (PBS). Nematodes were stored in 100% methanol at -70 °C. Prior to hybridization, C. elegans were incrementally rehydrated at 22 °C with PBS containing 0.1% (v/v) Tween 20 and 0.2% (w/v) SDS (buffer A) and washed twice with 5% (v/v) 2-mercaptoethanol in buffer A. The nematodes were then incubated with 50 µg/ml Proteinase K in PBS for 15 min at 37 °C to permeabilize the external collagenous coat. Partial digestion was terminated by washing with 10 volumes of 25 mM glycine in buffer A. After two washes with buffer A, C. elegans were incubated in PBS containing 4% formaldehyde for 20 min at 22 °C, washed three times with PBS containing 0.1% Tween 20, and equilibrated with buffer B (40% (v/v) formamide, 5 X SSC, 0.1 mg/ml sonicated salmon sperm DNA, 50 µg/ml heparin, and 0.1% Tween 20). Prehybridization in buffer B was carried out at 48 °C for 1 h.
The pPD16.51 expression plasmid (33) was cleaved at a unique SacI site that precedes the lacZ gene. An oligonucleotide primer was designed to initiate Taq DNA polymerase-catalyzed DNA synthesis 2 kbp downstream from the lacZ transcription initiation site. DNA synthesis progressed toward the 5' end of the lacZ gene, yielding a 2-kilobase single-stranded, antisense DNA probe. Primer (30 ng) and 200 ng of template DNA were added to a reaction mixture containing 0.5 M KCl, 0.01 M Tris-HCl, pH 8.3, 1.5 mM MgCl2, 0.001% gelatin, 0.2 mM dATP, 0.2 mM dCTP, 0.2 mM dGTP, and 0.13 mM dTTP. Digoxigenin-11-dUTP was added to a final concentration of 70 µM. The reaction mixture (25 µl) was heated at 100 °C for 3 min and then was incubated in a thermal cycler under the following conditions: denaturation at 95 °C for 45 s; annealing at 55 °C for 30 s; and synthesis at 72 °C for 90 s. After 35 cycles, single-stranded DNA was precipitated with ethanol and dissolved in 0.3 ml of buffer B. The probe was boiled for 10 min and an aliquot (6 µl) was added to a suspension (20 µl) of fixed nematodes in buffer B. Hybridization of probe with lacZ RNA derived from the chimeric reporter genes (see above) was carried out at 48 °C for 16 h. Subsequently, C. elegans were washed serially with 1-ml aliquots of PBS, 0.1% Tween 20 containing 80, 60, 40, 20, and 0% buffer B at 22 °C. C. elegans were then washed with 0.1% (w/v) albumin and 0.1% Triton X-100 in PBS (buffer C) and incubated 16 h at 4 °C with a 1:2500 dilution of antibody directed against digoxigenin (Boehringer Mannheim) in buffer C. The IgG was coupled to alkaline phosphatase. After washing C. elegans four times with 10 volumes of buffer C, antigen-antibody complexes were detected by incubation with a solution containing 0.5% (w/v) 4-nitro blue tetrazolium chloride and 0.33 mg/ml 5-bromo-4-chloro-3-indolyl phosphate, 0.1 M NaCl, 5 mM MgCl2, 0.1 M Tris-HCl pH 9.5, 1 mM levamisole, and 0.1% Tween 20 for 20 min at 37 °C. Alkaline phosphatase catalyzes the synthesis of an insoluble blue reaction product. Procedures for fixation of C. elegans and in situ hybridization of RNA are modifications of methods described by Tautz and Pfeifle (36) and Patel and Goodman (37), respectively. Modifications were introduced by Seydoux and Fire (Dept. of Embryology, Carnegie Institute of Washington).
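The nucleotide concentrations quoted for the probe-synthesis reaction suggest that digoxigenin-11-dUTP partially replaces dTTP so that the combined "T" pool matches the other dNTPs. The sketch below only restates that arithmetic; the variable names are illustrative, and the fraction computed is the fraction of the nucleotide pool, not a measured incorporation rate (which depends on how the polymerase handles the modified nucleotide).

```python
# Nucleotide concentrations quoted for the labeling reaction, in µM.
dntp_um = {"dATP": 200.0, "dCTP": 200.0, "dGTP": 200.0, "dTTP": 130.0}
dig_dutp_um = 70.0

# dTTP and digoxigenin-11-dUTP compete for the same template positions.
t_pool_um = dntp_um["dTTP"] + dig_dutp_um

# Share of the combined pool that carries the digoxigenin label.
dig_fraction = dig_dutp_um / t_pool_um

print(f"Combined dTTP + DIG-dUTP pool: {t_pool_um:g} µM")
print(f"Labeled share of the pool: {dig_fraction:.0%}")
```

With the stated values, the combined pool is 200 µM, equal to each of the other three dNTPs, and about one third of it is labeled.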

Formulation of Strategy and Cloning of Novel SL RNA Genes from C. elegans -Nonclassical SLs that appear at the 5' ends of cDNAs encoding three distinct proteins are shown in Fig. 1. Three of these SLs (Fig. 1, asterisks) appear upstream from cDNA sequences encoding both protein kinase C1A (19) and TRA-2 (18). The remaining SLs were attached to only a single type of cDNA. Nonclassical SLs are collectively designated "novel SLs." Oligonucleotides complementary to SL2 and the novel SL named SLb (Fig. 1) were used to establish hybridization conditions suitable for differentially identifying C. elegans DNA fragments encoding novel SL RNA genes (see "Experimental Procedures"). When C. elegans DNA was digested with ClaI and fractionated in an agarose gel, the SLb probe hybridized exclusively with a 15-kbp fragment (Fig. 2A, lane 1). In contrast, the SL2 probe generated a prominent signal with a 5-kbp segment of ClaI-digested DNA and also hybridized with five other restriction fragments (Fig. 2B, lane 1). None of these fragments corresponded to the band observed with the SLb probe. The pattern of four principal bands obtained with the SL2 probe (Fig. 2B, lanes 1 and 2) is in agreement with the copy number for SL2 genes reported by Huang and Hirsh (16).
FIG. 1. Novel SLs at the 5' termini of certain C. elegans cDNAs. Primary structures of nonclassical SLs were determined by the direct sequencing of amplified cDNAs (17)(18)(19). The cDNAs contained complete copies of the 5' termini of C. elegans mRNAs encoding protein kinase C1A, TRA-2, and casein kinase IIβ (18)(19)(20). Only the SL sequences are shown. SLs a-d were appended to protein kinase C1A mRNA; SLs b-f were present at the 5' end of tra-2 mRNA; SLg was found at the 5' terminus of cDNA encoding the β subunit of casein kinase II. The novel SLs are aligned with the classical C. elegans spliced leader sequences SL1 and SL2 (5,16). Novel SLs marked with an asterisk are evident in both protein kinase C1A and tra-2 cDNAs. A distinctive dinucleotide sequence that is conserved in all of the novel SLs is shown in boldface type. The lengths (in nucleotides) of the SLs are given in parentheses.
Subsequently, a C. elegans genomic DNA library in bacteriophage λEMBL4 was screened with 32P-labeled oligonucleotides complementary to the SLb and SLa leader sequences (Fig. 1). Approximately 200 candidate clones were obtained from 1.6 X 10^5 recombinant phage. Several recombinants were plaque purified, and the DNA inserts were characterized by digestion with restriction endonucleases and Southern gel analysis. Each genomic insert contained multiple copies of novel SL RNA genes. Representative examples are shown in Fig. 2C.
Terminology -SL is defined as a 22- or 23-nucleotide RNA segment (untranslated exon) that is transferred from the 5' end of a spliced leader RNA to the 5' terminus of a messenger RNA via transesterification. SL RNA is a 114-nt (or 95-nt for SL1 RNA) RNA transcript encoded by a spliced leader RNA gene. SL RNA contains an SL, a contiguous donor splice site, and downstream sequences essential for catalysis of the trans-splicing reaction. SL RNA genes are segments of DNA that encode SL RNAs. Expression of these genes is driven (in part) by a characteristic promoter element that lies 40-65 bp upstream from the transcription start site (nucleotide 1 in the SL sequence).
FIG. 2. ... and PstI (lane 2) and processed as described under "Experimental Procedures." Dried agarose gels were probed with 32P-labeled oligonucleotides corresponding to SLb (A) and SL2 (B) (see Fig. 1 and "Experimental Procedures"). Autoradiograms are shown. C, genomic DNA inserts derived from clones (see "Experimental Procedures") of recombinant bacteriophage EMBL4 designated λSLA (lane 1) and λSLB (lane 2) were digested with AluI and MspI, respectively, and subjected to Southern gel analysis as described under "Experimental Procedures." Dried agarose gels were probed with a 32P-labeled oligonucleotide corresponding to SLb. Autoradiograms are presented. Gels were calibrated with DNA markers that were electrophoresed in parallel lanes and stained with ethidium bromide. Sizes of the markers are given in kbp.
Organization and Sequences of SL RNA Genes -Genomic DNA inserts from recombinant phage clones designated λSLA, λSLB, and λSLC (see "Experimental Procedures") were characterized. Restriction fragments of the insert in λSLA that hybridized with the 32P-labeled oligonucleotides (see "Experimental Procedures") were subcloned and sequenced. Three SL RNA genes were discovered within 2990 bp of contiguous sequence (Fig. 3). Novel SL sequences were initially identified by homology with SLs presented in Fig. 1. Alignment of eight derived RNA sequences revealed that "typical" C. elegans SL RNA genes comprise ~114 bp and contain conserved sequences that are essential for spliceosome assembly and function (see Fig. 5, and see below). Genes and cognate SL RNAs were named according to their 5' leader sequence. SL3α RNA has a 22-nt leader sequence that differs from SL2 by the substitution of AC for CU at nucleotides 17 and 18. SL5 is identical with SL3α except for the insertion of A at position 13. The SL3α and SL5 RNA genes are oriented head to head, so that transcription is likely to be driven by promoter elements located in the intervening 835 bp (Fig. 3). SL4 differs from SL2 at multiple positions. It contains UA in place of GU as nucleotides 13 and 14, as well as the AC dinucleotide at positions 17 and 18. The SL5 and SL4 RNA genes are positioned in a head to tail fashion and are separated by 900 bp of intervening sequence (Fig. 3). The three-gene cluster is flanked by 152-bp inverted repeats (151 identities) at its 5' and 3' ends. Repeated sequences begin at +85 (taking the first nucleotide of the SL as +1) in both the SL3α and SL4 RNA genes and extend to +237 in each instance. Other SL RNA gene sequences diverge after nucleotide 114.
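The relationships among SL2, SL3α, and SL5 described above can be traced with a short script. This is a minimal sketch under stated assumptions: the SL2 sequence is taken from the general C. elegans literature and is not printed in this section, the helper functions are hypothetical, and SL4 is not derived because the text notes that it differs from SL2 at multiple positions without listing all of them.

```python
# ASSUMPTION: 22-nt SL2 sequence as commonly reported in the literature;
# it is not given in this section of the text.
SL2 = "GGUUUUAACCCAGUUACUCAAG"


def substitute(seq, start, new):
    """Replace len(new) nucleotides beginning at 1-based position `start`."""
    i = start - 1
    return seq[:i] + new + seq[i + len(new):]


def insert_at(seq, position, new):
    """Insert `new` so that its first nucleotide occupies 1-based `position`."""
    i = position - 1
    return seq[:i] + new + seq[i:]


def percent_identity(a, b):
    """Ungapped percent identity of two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# The text states that nucleotides 17 and 18 of SL2 are CU.
assert SL2[16:18] == "CU"

SL3 = substitute(SL2, 17, "AC")   # AC replaces CU at positions 17-18
SL5 = insert_at(SL3, 13, "A")     # extra A at position 13 gives a 23-nt leader

print("SL2:", SL2, len(SL2))
print("SL3:", SL3, len(SL3))
print("SL5:", SL5, len(SL5))
print(f"SL2 vs SL3 identity: {percent_identity(SL2, SL3):.1f}%")
```

If the assumed SL2 sequence is correct, the derived SL3 leader differs from SL2 at two of 22 positions, and the derived SL5 leader is 23 nt long, in line with the 22- or 23-nt lengths quoted for the novel SLs.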
The SL3β and SL3γ RNA genes were found in the DNA inserts of λSLB and λSLC (Fig. 4). Spliced leader sequences encoded by these genes are identical with nucleotides 1-22 of SL3α RNA, but their downstream sequences differ significantly.
Derived novel SL RNA sequences are aligned with SL1, SL2α, and SL2β RNAs in Fig. 5A. The novel SL RNA genes encode spliced leaders (nucleotides 1-22 or 1-23) that are more homologous with SL2 (at least 80% identity) than SL1 (64% identity). A feature that distinguishes novel SLs from both SL1 and SL2 is the AC dinucleotide that lies 6 nt upstream from the donor splice site. In contrast, two segments of SL sequences that correspond to nucleotides 1-5 and 8-12 in SL2 are invariant.
Approximately 80 nt at the 3' end of an SL RNA transcript are essential for catalyzing the joining of an SL with a target mRNA (38,39). Included in this region are a donor splice site and an Sm antigen-binding segment (39,40). Antibodies directed against "Sm" proteins that are common to all snRNPs (41) also bind with RNPs containing C. elegans SL RNAs (42)(43)(44). Proteins that interact with the Sm sequence are essential for the formation of active spliceosomes (39,40). These key functional regions are evident in the novel SL RNAs (Fig. 5). The splice donor site (AG GU) is conserved throughout the SL RNA family. However, the site is followed by U in the novel SL RNAs as compared with A in SL1 and SL2 RNAs. Three versions of the consensus Sm antigen binding site (PuA(U)nGPu) (45) are employed in C. elegans SL RNAs: SL2, SL3β, SL3γ and SL5 RNAs share the sequence AA(U)5GG (nucleotides 70-78 in SL2); SL1 and SL3α RNAs contain AA(U)4GG; and the corresponding region of SL4 RNA is AA(U)4GA. In Ascaris lumbricoides, a parasitic nematode, short sequences immediately downstream (AAC) and ~15 nt upstream (GUGGC) from the Sm site (Fig. 5B) play central roles in the first step of the trans-splicing reaction (39,40). Modified versions of these sequences (Fig. 5A) are present at the appropriate locations in C. elegans SL RNAs. Sequences of functional regions in C. elegans SL RNAs are compared with the analogous segments of SL RNA from Ascaris in Fig. 5B.
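The Sm-binding consensus quoted above, a purine followed by A, a run of uridines, G, and a purine, can be written as a regular expression and checked against the three variants listed in the text. In this sketch the bounds on the uridine run (3 to 6) are an assumption made for illustration, since the text shows only runs of 4 and 5; the pattern and variable names are hypothetical.

```python
import re

# Pu = purine (A or G). ASSUMPTION: the uridine run is allowed to span
# 3-6 residues; the text itself only documents n = 4 and n = 5.
SM_SITE = re.compile(r"[AG]AU{3,6}G[AG]")

# The three versions of the site described in the text.
variants = {
    "SL2, SL3beta, SL3gamma, SL5": "AAUUUUUGG",  # AA(U)5GG
    "SL1, SL3alpha": "AAUUUUGG",                 # AA(U)4GG
    "SL4": "AAUUUUGA",                           # AA(U)4GA
}

results = {rnas: SM_SITE.fullmatch(site) is not None
           for rnas, site in variants.items()}

for rnas, ok in results.items():
    print(f"{rnas}: {variants[rnas]} fits consensus: {ok}")
```

All three variants fit the pattern, which is consistent with the statement that they are versions of a single consensus; a sequence with a pyrimidine in the final position, such as AAUUUUGC, would not.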
The novel 114-nt SL RNAs are more homologous with SL2α and β RNAs (70% identity) than SL1 RNA (60% identity). The degree of similarity with SL2 RNA declines according to the following pattern: SL3γ RNA > SL3β RNA > SL3α RNA > SL4 RNA > SL5 RNA. Thus, the closely linked SL3α, SL4, and SL5 RNA genes (Fig. 3) exhibit maximal homology among themselves and diverge from SL RNA genes located on different chromosomes (see below). Several permutations of 5' SLs and 3' catalytic regions are engaged in trans-splicing (Fig. 5A). For example, SL3α and SL4 RNAs are 90% identical, but unlike the highly conserved SL2α and SL2β RNAs, they donate different SLs. In contrast, SL3 is provided by discrete transcripts (SL3α and SL3β RNAs) that share only 80% overall identity.
Characterization of DNA Sequences That Flank the 5' Ends of SL RNA Structural Genes -DNA flanking the 5' ends of C. elegans snRNA genes (U2-U6) contains a 22-24-nt proximal site element (PSE) that lies between nucleotides -40 and -65 (Fig. 6 and Ref. 47). The snRNA PSE is thought to be an important component of the promoter, and PSE-like sequences were also observed in DNA upstream from the SL1 and SL2 RNA genes (47). Putative PSEs for the SL3α, -3γ, -4, and -5 RNA genes are homologous and appear to be "composite" elements that contain short DNA sequences corresponding to segments of PSEs preceding the SL1 and SL2 RNA genes (Fig. 6). In contrast, the PSE for the SL3β RNA gene is divergent and distinct. Differential utilization of PSE, 5' splice donor, and 3' catalytic modules may generate diversity in both the levels and sequences of SL RNAs.
Copy Number and Chromosomal Location of Novel SL RNA Genes -DNA fragments that account for >90% of the C. elegans genome were amplified in YAC vectors and immobilized on a gridded filter (27). Novel SL RNA genes were subcloned in the plasmid pGEM7Z, amplified, and then excised to generate templates for the preparation of random-primed 32P-labeled DNA probes (see "Experimental Procedures"). High stringency hybridization and washing of the recombinant YAC grids revealed that the SL3β and SL3γ RNA genes are present in YACs Y43C3 and Y40B9. Overlap between the YACs places the SL3β and SL3γ RNA genes in the center of chromosome I (between the unc-29 and mei-26 genes) on the physical map (27). Using the same approach, the clustered SL3α, SL4, and SL5 RNA genes were mapped to the central portion of chromosome III.
Although perfectly matched probe and target DNAs yielded the most intense signals, hybridization experiments also identified distinct but homologous SL RNA genes. This is illustrated by results obtained for SL2 RNA genes. Radiolabeled probes derived from the full-length SL2α RNA gene hybridized (Fig. 7A) with YACs (Y74A11, Y56D12, Y48F5) that place the gene at its previously established locus (on chromosome I) in the physical map. However, two other strong signals (Fig. 7B) and five weaker signals (not shown) were observed on the grid after longer exposures to x-ray film. One strong signal corresponds to YAC Y39H2, which contains the 93% identical SL2β RNA gene (16). Another highly conserved SL2-like gene is evidently present in YAC Y50H4. Assuming that each YAC clone contains 1 or 2 copies of SL RNA genes, the results suggest that there are 3-6 SL2 RNA genes and 5-10 closely related genes in the C. elegans genome.
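The gene-number estimate above follows from counting hybridizing loci and allowing 1 or 2 gene copies per locus. The sketch below reproduces that reasoning; it reads the three overlapping YACs at the established SL2α position as a single locus, which is an interpretation of the text, and the function name is hypothetical.

```python
def copy_range(n_loci, min_per_locus=1, max_per_locus=2):
    """Return (low, high) gene counts for `n_loci` hybridizing loci."""
    return n_loci * min_per_locus, n_loci * max_per_locus


# Strong signals: the established SL2alpha locus plus two additional loci
# (YACs Y39H2 and Y50H4). Interpretation: overlapping YACs count once.
strong_loci = 1 + 2
# Weaker signals seen after longer exposures.
weak_loci = 5

sl2_genes = copy_range(strong_loci)
related_genes = copy_range(weak_loci)

print(f"SL2 RNA genes: {sl2_genes[0]}-{sl2_genes[1]}")
print(f"Closely related genes: {related_genes[0]}-{related_genes[1]}")
```

The two ranges, 3-6 and 5-10, match the figures given in the text, and they scale directly with the assumed number of copies per YAC.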
When similar analyses were performed with novel SL RNA gene probes, ~20 homologous DNA fragments were identified in recombinant YACs. The DNA inserts mapped to all six chromosomes, but none hybridized with probes derived from the SL2α RNA gene. Thus, ~30 genes direct the synthesis of RNAs that can donate SL2-related and novel spliced leader exons to mRNAs. These results, the identification of SLs for which genes have yet to be cloned (Fig. 1), and the possible occurrence of SL RNA genes that do not hybridize with the SL gene probes used in these studies suggest that the total numbers of SL1 RNA genes (~100 tandem copies on chromosome V) and dispersed non-SL1 SL RNA genes may be similar.
Novel SL RNA Genes Are Transcribed in C. elegans -A sensitive RNase protection assay was used to determine whether the novel SL RNA genes direct the synthesis of full-length SL RNAs in vivo. 32P-labeled, antisense RNA probes that are complementary to nucleotides 1-115 in the predicted transcripts were incubated with total C. elegans RNA prior to digestion with RNases A and T1. A representative autoradiogram (Fig. 8) shows that each novel gene encodes an authentic 115-nt SL RNA. Neither tRNA (Fig. 8, lane 3) nor excess nonradioactive SL2α RNA (Fig. 9A) protected the labeled probes for the novel SL RNAs. Excess nonradioactive SL3α RNA failed to protect the 32P-labeled SL4 RNA probe, although the two sequences are 93% identical (Fig. 9B). Thus, the assay discriminates between closely related gene products.
The major protected species obtained with the SL2α and SL3β probes were closely spaced doublets (Fig. 8, lanes 2 and  4). This may result from "breathing" in the RNA duplexes. Alternatively, the two species may reflect heterogeneity at the 3' ends of the poly(A -) SL RNAs or differentially modified 5'  caps (48). As expected, the protected fragment of antisense SL1 RNA is ~95 nt in length (5). In addition to the principal protected species, unique patterns of smaller fragments are observed with antisense probes for SL RNAs 2α, 3β, 3α, and 3γ (Fig. 8, lanes 2, 4, 5, and 8). These fragments probably arise from partial protection of the probes by homologous but non-identical SL RNAs.
Relative levels of novel SL RNAs were measured by PhosphorImager analysis. Transcripts derived from the SL 3α, 3β, 3γ, 4, and 5 RNA genes are collectively 0.2 ± 0.02% (mean ± S.E., n = 4) as abundant as SL1 RNA and 0.9 ± 0.07% (n = 4) as abundant as SL2α RNA. Thus, the steady-state levels of the transcripts of the 5 newly discovered genes account for only a small fraction of SLs that are available for modifying the 5' ends of C. elegans mRNAs.
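The two percentages above also imply a ratio between SL2α RNA and SL1 RNA. The short Python sketch below works through that arithmetic; it assumes that both percentages refer to the same pooled signal for the five novel transcripts and it ignores the reported standard errors. The variable names are illustrative only.

```python
# Reported means: the five novel SL RNAs, pooled, relative to each reference RNA.
novel_vs_sl1 = 0.2 / 100    # novel SL RNAs are 0.2% as abundant as SL1 RNA
novel_vs_sl2a = 0.9 / 100   # novel SL RNAs are 0.9% as abundant as SL2α RNA

# If novel = 0.002 * SL1 and novel = 0.009 * SL2α, then SL2α / SL1 = 0.002 / 0.009.
sl2a_vs_sl1 = novel_vs_sl1 / novel_vs_sl2a
print(f"Implied SL2α RNA level: {100 * sl2a_vs_sl1:.0f}% of SL1 RNA")
```

Under these assumptions the SL2α transcript would be present at roughly one-fifth the level of SL1 RNA.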
Determination of the Relative Abundance of Various SLs in C. elegans-Poly(A -) RNA-The concentrations of selected individual SL RNAs may not be reliable indicators of the total amounts of SLs that are available for modifying the 5' ends of mRNAs. A minimum of 30 gene loci encode SL RNAs identical with or closely related to the products of the SL 2-5 RNA genes (Fig. 7, and see above). Levels of the individual SL RNAs may vary significantly as functions of their rates of transcription and/or half-lives. Moreover, multiple SL RNAs may donate identical or nearly identical 21-23-nt SLs to mRNAs.
An assay was designed to estimate the levels of specific SLs on intact SL RNA molecules. 32 [...] (Fig. 10) are much higher than the levels of the individual full-length transcripts reported above. The observations that excess SL2 and SL3 RNA sequences do not protect 32P-labeled SL3 and SL2 antisense probes, respectively (data not shown), verified the specificity of the assays. Quantification of the results with a PhosphorImager revealed that SLs 2-5 are each 5-15% as abundant as SL1 in poly(A -) RNA. Collectively, the size of the donor pool of SLs 2-5 is similar to the size of the SL1 donor pool. Evidently, multiple SL RNA genes encode identical 5' SL (2, 3, 4, or 5) sequences.
[...] 3 and 4), and SL3α, SL5, SL4, and SL3γ RNAs (lanes 5-8, respectively). The sample applied to lane 3 was hybridized with 40 µg of tRNA instead of total C. elegans RNA. A composite autoradiogram is presented. Signals from lanes 1 and 2 were obtained after exposing x-ray film for 3 h, whereas the time of exposure was increased to 48 h for lanes 3-8. This experiment was replicated four times. Similar data were obtained in each instance. Typical results are shown.
FIG. 9. RNase protection analysis distinguishes among individual SL RNA transcripts. A, RNase protection assays were performed as described under "Experimental Procedures" and in Fig. 8 with one modification. Nonradioactive SL2α RNA (0.5 µg) was substituted for C. elegans total RNA. The 32P-labeled antisense RNA probes used were complementary to SL1, SL2α, SL3β, SL3α, SL5, SL4, and SL3γ RNAs (lanes 1-7, respectively). An autoradiogram is shown. Panel B shows autoradiographic signals obtained when [...] An autoradiogram is presented. The gel was calibrated with DNA oligonucleotides of the indicated sizes that were fractionated in parallel lanes. This experiment was replicated three times and the results were essentially the same in each instance.
Under standard conditions of electrophoresis the protected, 32 P-labeled antisense SLs exhibit mobilities similar to the mobility of a DNA marker comprising 26 nucleotides (Fig. 10). Moreover, the SL bands are somewhat diffuse. Several factors may account for these properties: RNA molecules migrate 5-10% more slowly than DNA fragments in this gel system; the 3' ends of the protected antisense RNAs are heterogeneous because irrelevant nucleotides immediately downstream from the SL sequence in the probe can be protected when they match (by chance) downstream nucleotides in intact SL RNAs; the high resolution gel system can separate molecules of the same length on the basis of nucleotide composition; and RNAs lacking the trimethyl guanosine cap migrate faster than capped RNAs. 4 Protected fragments that migrate more rapidly than the 22-nt DNA marker may be due to the hybridization of the probe with highly homologous but distinct SLs and/or "breathing" in the AU-rich region of RNA duplexes that corresponds to the conserved 5' ends of the SL sequences (Fig. 5A).
Novel Spliced Leaders Are Incorporated into trans-Spliced mRNAs in Vivo-Poly(A + ) RNA from an asynchronous population of C. elegans was probed with antisense SL1-SL5 RNAs in RNase protection assays (Fig. 11). Since poly(A -) RNA is removed by oligo(dT) cellulose chromatography, only the 22- or 23-nt segments complementary to SL sequences are protected in RNA-RNA duplexes. Slight differences in mobilities of the protected fragments (Fig. 11) may be ascribed to factors listed above and the larger size (23 nt) of SL5 (lane 4). Collectively, non-SL1 exons are incorporated into mRNAs ~8% as frequently as SL1 (Fig. 11, Table I). SL2 and SL3 are attached to mRNAs with similar frequencies (Table I). In contrast, SL4 is used only 10% as often as SL2 or SL3.
[...] 6). Protected fragments from the radiolabeled probes were fractionated by electrophoresis in a 15% polyacrylamide, 7 M urea gel. Signals were visualized by autoradiography. Autoradiographic signals were quantified as described under "Experimental Procedures," and the data are presented in Table I. Size markers were oligodeoxynucleotides of the indicated size (in nt) that were end-labeled with [γ-32P]ATP and T4 polynucleotide kinase.
Northern blots of poly(A + ) RNA were incubated with radiolabeled antisense DNA corresponding to either SL3 or SL4. Both probes hybridized with a heterogeneous array of mRNAs that ranged from several hundred to >3000 nt in length (Fig.  12). Although high intensity bands of 0.5 and 1.1 kilobases were observed with the SL3-specific probe, it appears that both SLs are incorporated into large constellations of mRNAs. Signal intensities from the Northern blots were measured in a PhosphorImager. Messenger RNAs containing SL3 are ~ 8-fold more abundant than mRNAs with a 5' SL4 exon. Thus, relative frequencies of SL3 and SL4 utilization determined by RNase protection and Northern analyses are in agreement.
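The agreement between the two methods can be checked with simple arithmetic. The Python sketch below converts the RNase protection result (SL4 used 10% as often as SL3) to a fold difference and compares it with the ~8-fold estimate from the Northern blots. The variable names are illustrative, and the approximate values are treated as exact for the purpose of the comparison.

```python
# RNase protection: SL4 is used 10% as often as SL3, i.e. SL3 is ~10-fold more frequent.
rnase_fold = 1 / 0.10

# Northern analysis: mRNAs containing SL3 are ~8-fold more abundant than SL4 mRNAs.
northern_fold = 8.0

# Ratio of the two independent estimates; 1.0 would indicate identical results.
ratio = rnase_fold / northern_fold
print(f"RNase protection: {rnase_fold:.0f}-fold; Northern: {northern_fold:.0f}-fold; ratio {ratio:.2f}")
```

The two estimates differ by a factor of about 1.25, which is consistent with the statement that the methods are in agreement.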
Levels of Certain SL RNAs May Be Regulated during Postembryonic Development-Levels of SL1, SL2α, and SL3α transcripts are essentially invariant throughout C. elegans development (Fig. 13). In contrast, SL4 RNA is minimally expressed in L1 larvae, and the content of the transcript increases during later development. Adult nematodes have a 6-fold higher level of SL4 RNA than L1 animals. The developmentally controlled limitation of SL4 RNA content in early development may account for its rather infrequent utilization on mRNAs derived from mixed populations of C. elegans (see Figs. 11 and 12, Table I).
The SL4 RNA Gene Promoter Is Active in Hypodermal Cells-The developmental stage-specific accumulation of SL4 RNA may be due to its restricted expression in a limited number of highly differentiated cells. This possibility was investigated by generating transgenic nematodes, in which the first 40 bp of the SL4 RNA structural gene and 800 bp of contiguous 5'-flanking DNA drive expression of the E. coli lacZ reporter gene. Transgene expression was monitored in individual cells of fixed C. elegans via in situ hybridization (see "Experimental Procedures" and Fig. 14). SL4 RNA gene promoter activity was evident only in hypodermal cells of C. elegans. Similar results were obtained when the 5'-flanking DNA was limited to 162 bp immediately adjacent to the SL4 RNA structural gene.
DISCUSSION
Seminal studies by Hirsh and co-workers (5,6,8,16) and Blumenthal and colleagues (9,13,49) demonstrated that 5' ends of C. elegans mRNAs are often covalently modified via trans-splicing. Early reports suggested that targeted mRNAs received either of two 22-nt leader sequences (SL1 or SL2) in a mutually exclusive manner (16). However, recent determinations of sequences at the extreme 5' ends of several C. elegans cDNAs suggested that a larger family of SL RNA genes may be present in the C. elegans genome (Fig. 1 and Refs. 18-20). Heretofore, direct evidence for the occurrence of novel SL RNA genes, the transcription of such genes and the utilization of their 5' SL exons for trans-splicing mRNAs in vivo was lacking. We have now cloned and characterized genes that encode five novel C. elegans SL RNAs. Each gene yields a transcript of 114 nt, which contains several consensus sequences that are predicted to be essential for biological activity (Figs. 3-6). Three of these genes (encoding SL RNAs 3α, 4, and 5) are clustered within 2 kbp of genomic DNA near the center of chromosome III. Despite the tight gene linkage, the corresponding SL RNAs donate three distinct spliced leaders.
Head to tail and head to head orientations of SL RNA genes are evident in the cluster, suggesting that transcription may proceed from promoters located in both DNA strands. The observations that composite SL RNA/snRNA promoter elements (PSEs) lie ~50 bp upstream from nucleotide 1 in the SL RNA coding sequences (i.e. the predicted transcription start site) support this idea (Figs. 3, 4, and 6).
FIG. 14. Expression of SL4 RNA gene promoter activity in C. elegans hypodermal cells. Transgenic C. elegans, which contain a lacZ reporter gene downstream from 800 bp of DNA that flanks the 5' end of the SL4 RNA gene, were generated as described under "Experimental Procedures." RNA encoding β-galactosidase was detected by in situ hybridization with digoxigenin-labeled, antisense DNA as described under "Experimental Procedures." RNA-DNA complexes were visualized by incubating the specimens serially with anti-digoxigenin IgGs coupled to alkaline phosphatase and a chromogenic substrate. IgG-coupled alkaline phosphatase catalyzes the synthesis of an insoluble blue reaction product in cells transcribing the SL4 RNA gene. Nomarski interference microscopy revealed that the histochemical stain appeared in the nuclei of hypodermal cells. A photograph of a stained adult animal (taken with a Zeiss Axioscope microscope at a magnification of ×100) is presented. More than 90% of the transgenic C. elegans exhibited similar staining patterns.
Two other novel SL RNA genes (3β, 3γ) and associated PSEs were sequenced and mapped to chromosome I. SL RNA 3β and 3γ transcripts provide identical 5' leader exons, although their 3' sequences differ significantly (Figs. 4 and 5A).
The derived 114-nt SL 3α-γ, SL4, and SL5 RNAs are more homologous with SL2 RNAs than the SL1 transcript (Fig. 5A). Likewise, the novel 22- or 23-nt spliced leader sequences (SLs 3-5) are more closely related to SL2 than SL1. These homologies and the ability of multiple SL RNA genes to donate identical leader sequences (see Figs. 3-5 and Ref. 16) suggested that SL2-5 RNA genes may be members of a larger subfamily. A minimum estimate of the size of this group of genes was obtained by hybridizing filters, which contain an ordered, overlapping array of DNA fragments comprising most of the C. elegans genome, with 32P-labeled probes for the SL2, SL3, SL4, and SL5 RNA genes. Approximately 30 loci that are identical or closely related to the SL2-5 RNA genes were identified on the autosomal and X chromosomes. Thus, this gene family is both large and dispersed over a substantial portion of the total genome.
During the preparation of this paper, sequences of overlapping cosmid inserts derived from C. elegans chromosome III were deposited in the GenBank™ data base, in the context of the C. elegans genome project (50). The cosmid designated CEL B0280 (accession number U10438) contains the clustered SL3α, SL4, and SL5 RNA genes. The sequence we determined for the genomic DNA insert in γSLA (Fig. 3) is identical with that reported for cosmid CEL B0280. However, the presence of three SL RNA genes and their associated PSEs was not detected by the methods employed in the genome project. Analysis of the data from the C. elegans genome project with the Genefinder program predicts that the three SL RNA genes in the cluster lie within introns of a gene that encodes a putative glutamate receptor subunit (50). Placement of small SL RNA genes within introns of a larger gene may reflect the efficient utilization of C. elegans' relatively small genome (~2 × 10^8 bp). The nematode genome, which is only ~5% as large as the human genome, contains many introns and intergenic DNA sequences that are considerably smaller than their counterparts in mammalian systems (51). As additional data on the C. elegans genome are generated it will be possible to determine whether dispersed SL RNA genes are typically positioned within other genes or are distributed in a more randomized manner along the chromosomes.
Sharp (52), Nilsen (39), and Blumenthal and Thomas (51) suggested SL RNAs are functional chimeras composed of a 5' exon that is fused to an snRNA-like downstream sequence. By performing in vitro mutagenesis and assaying trans-splicing in extracts of Ascaris embryos, Nilsen and colleagues (38,39) demonstrated that the intronic, snRNA-like region of an SL RNA can deliver a variety of natural and synthetic 5' exons to acceptor mRNAs. SL RNA sequence elements that are essential for trans-splicing are the Sm protein(s) binding site, a trinucleotide sequence immediately downstream from the Sm site, a pentanucleotide sequence ~15 nt upstream from the Sm site and a 5' splice site (Fig. 5B). Consensus sequences for each of these elements appear in the novel C. elegans SL RNA transcripts as well as the classical SL2 and SL1 RNAs, thereby suggesting that the newly discovered genes encode components of SL RNPs and trans-spliceosomes.
In the closely related nematode Ascaris, the 22-nt SL DNA sequence functions as a promoter element that is essential for transcription by RNA polymerase II (53). The high degree of sequence conservation observed at the 5' ends (nucleotides 1-12) of C. elegans SLs, the repetitive utilization of certain spliced leader exons (e.g. SL3 and SL2) on multiple SL RNA gene transcripts, and the discovery of composite SL1/SL2 PSEs 50 bp upstream from the novel SL RNA structural genes (Fig. 6) raise the possibility that these sequence elements and their corresponding trans-acting proteins coordinately drive expression of RNAs that deliver multiple "isoforms" of 5'-untranslated exons to C. elegans mRNAs. SL RNA transcripts encoded by the C. elegans SL3α-γ, SL4, and SL5 RNA genes are produced in vivo (Fig. 8). Although the levels of these individual transcripts are low (~1% of the abundance of SL2 RNA), the total pools of novel SLs in poly(A -) RNA (full-length SL RNAs) and the amounts of novel SLs transferred to mRNAs are similar to the levels of SL2 in the poly(A -) and poly(A + ) fractions of C. elegans RNA (Figs. 10 and 11, Table I). This is probably due to the occurrence of a limited subset of shared SL exons that are components of a substantially larger group of related but distinct SL RNA genes. Differences in the activities of individual SL RNA gene promoters and/or the stabilities of SL RNAs may also contribute to the net accumulation and utilization of novel SL RNAs.
In instances where reverse transcription-PCR has been employed to determine sequences at the 5' termini of specific mRNAs, it appears that the novel SL exons listed in Table I are used in trans-splicing reactions with frequencies that match the utilization rate of the classical SL2 sequence (18,19). These results provide direct experimental evidence for the in vivo functionality of novel SL RNAs. The appearance of SL3 on mRNA encoding the sex determination factor TRA-2 (18) documents a linkage among a novel SL RNA gene(s), its transcription in vivo, and the use of the 5' SL exon of the transcript to modify a specific mRNA via trans-splicing.
Many C. elegans mRNAs receive a 5' leader exon from transcripts of the repeated SL1 RNA genes. The precise physiological significance of trans-splicing with SL1 is not known. Since the SL usually terminates within 1-50 nt of the initiator AUG, it seems probable that trans-splicing eliminates upstream, out-of-frame AUG codons and deletes long 5'-untranslated sequences capable of folding into secondary structures that inhibit translation. In addition to optimizing translation efficiency, modification with the SL may alter the stability of the target mRNA.
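The argument about upstream AUG codons can be illustrated with a minimal Python sketch. It is not a model of the splicing machinery: the pre-mRNA sequence, the position of the initiator codon, and the function names are all hypothetical, the acceptor is taken to be the UUUCAG consensus cited in the introduction, and the SL itself is not appended. The sketch only shows that cutting at the acceptor removes AUG codons located upstream of it.

```python
ACCEPTOR = "UUUCAG"  # consensus trans-splice acceptor cited in the introduction


def upstream_augs(rna: str, start: int) -> list[tuple[int, bool]]:
    """List (position, in_frame) for every AUG located 5' of the initiator at `start`."""
    hits = []
    pos = rna.find("AUG")
    while pos != -1 and pos < start:
        hits.append((pos, (start - pos) % 3 == 0))
        pos = rna.find("AUG", pos + 1)
    return hits


def trans_splice(rna: str, start: int) -> tuple[str, int]:
    """Drop everything up to and including the last acceptor 5' of the initiator.

    Simplification (assumption): the SL exon that would replace the removed
    segment is not added to the returned sequence.
    """
    site = rna.rfind(ACCEPTOR, 0, start)
    if site == -1:
        return rna, start  # no acceptor found; transcript left unchanged
    cut = site + len(ACCEPTOR)
    return rna[cut:], start - cut


# Hypothetical pre-mRNA (assumption): upstream segment with two AUGs, short UTR, coding region.
upstream = "GCAUGCAAAUGGCUUUUUCAG"
utr = "AAAC"
coding = "AUGGCUGAAUAA"
pre_mrna = upstream + utr + coding
start = len(upstream) + len(utr)  # index of the initiator AUG

mature, mature_start = trans_splice(pre_mrna, start)
print("upstream AUGs before:", upstream_augs(pre_mrna, start))
print("upstream AUGs after: ", upstream_augs(mature, mature_start))
```

In this toy transcript both upstream AUGs are out of frame with the initiator and both lie 5' of the acceptor, so neither remains in the trimmed sequence.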
Messenger RNAs that contain SL1 are encoded by conventional structural genes that are preceded by contiguous promoter sequences. However, ~25% of C. elegans genes are organized as operons, in which transcription of 2-5 mRNAs is driven by a single 5' promoter/enhancer region (4,13). Exact sequences of 5' ends of mRNAs derived from downstream genes in C. elegans operons have been determined only in a few instances (reviewed in Refs. 4 and 13). The data indicate that SL2 is appended only to mRNAs encoded by downstream genes. Moreover, polycistronic mRNAs are not cleaved and trans-spliced if upstream polyadenylation signals are eliminated by mutation (13). On the basis of these considerations, Spieth et al. (13) proposed that SL2 RNA plays a specialized role in processing polycistronic mRNAs. A plausible suggestion is that RNPs containing SL2 RNA bind to proteins that catalyze 3'-end processing of pre-mRNAs (13). Such complexes would mediate the 3' cleavage and polyadenylation of an upstream mRNA and simultaneously place the SL2 RNP in proximity with the trans-splice acceptor site at the 5' end of the neighboring downstream mRNA.
The SL3 exon, which appears at the 5' ends of three SL RNA structural genes described in this paper, and additional novel SLs (Table I) are incorporated at the 5' termini of mRNAs for TRA-2 and/or protein kinase C1A and casein kinase IIβ (18,19). Each of these mRNAs is derived from a gene that occupies a downstream position in an independent operon (13). These observations and a conserved structural relationship with SL2 RNA (see above) strongly suggest that SL3α-γ, SL4, and SL5 RNAs play important roles in the generation of efficiently translated, monocistronic RNAs from the internal segments of polycistronic transcripts. The molecular basis for the exclusion of SL1 RNA and the utilization of a family of non-SL1 RNAs for this mode of pre-mRNA processing is unknown. One speculative suggestion is that a partially conserved sequence near the 3' ends of SL RNAs 2-5 (nt 94-114) provides a binding site for a protein(s) involved in the 3' cleavage and/or polyadenylation of mRNAs. The smaller SL1 RNA transcript lacks this sequence.
The relative abundance of SL4 RNA increases 6-fold during postembryonic development (Fig. 13). Moreover, the putative promoter for the SL4 RNA gene is active principally in hypodermal cells (Fig. 14). These results suggest that the SL4 leader might selectively modulate mRNA translation, stability, etc., in a subset of hypodermal transcripts during late development. Furthermore, novel upstream promoter/enhancer elements may control the level and cell-specific expression of SL4 RNA transcripts. These possibilities must be regarded with caution in the absence of (a) knowledge of the properties of specific mRNAs that are trans-spliced with SL4 and (b) systematic analysis of the gene promoter by mutagenesis. Moreover, patterns of promoter activity must be established for other SL RNA genes and compared with that observed for the SL4 RNA gene. Nevertheless, the observations indicate that potential regulatory elements in the SL4 sequence and SL4 RNA gene promoter/enhancer merit further study.