Identification of an Alternatively Spliced Seprase mRNA That Encodes a Novel Intracellular Isoform*

Seprase is a homodimeric 170-kDa integral membrane gelatinase that is related to the ectoenzyme dipeptidyl peptidase IV. We have identified an alternatively spliced seprase messenger from the human melanoma cell line LOX that encodes a novel truncated isoform, seprase-s. The splice variant mRNA is generated by an out-of-frame deletion of a 1223-base pair exonic region that encodes part of the cytoplasmic tail, the transmembrane region, and the membrane proximal-central regions of the extracellular domain (Val5 through Ser412) of the seprase 97-kDa subunit (seprase-l). The seprase-s mRNA has an elongated 5′ leader (548 nucleotides) that harbors at least two upstream open reading frames that inhibit seprase-s expression from a downstream major open reading frame. Deletion mutagenesis of the wild type splice variant cDNA confirms that initiation of the seprase-s coding sequence begins with an ATG codon that corresponds to Met522 of seprase-l. The seprase-s open reading frame encodes a 239-amino acid polypeptide with an Mr of ∼27,000 that precisely overlaps the carboxyl-terminal catalytic region of seprase-l.

Proteolytic degradation of the extracellular matrix is a fundamental property of normal tissue remodeling and repair as well as the pathological processes of tumor invasion and metastasis. In addition to the various families of proteolytic enzymes that serve as the major collagenases and gelatinases, such as the matrix metalloproteases (1), a subfamily of membrane-bound nonclassical serine proteases, including seprase and dipeptidyl peptidase IV (CD26), is implicated in matrix degradation and invasiveness of migratory cells (2-6). Seprase is a homodimeric 170-kDa integral membrane gelatinase whose expression appears to correlate with the levels of invasiveness manifested by the human melanoma cell line LOX in an in vitro extracellular matrix degradation/invasion assay (7). The deduced amino acid sequence of its 97-kDa subunit (seprase-l, GenBank™ accession number U76833) predicts a type II membrane topology with a short cytoplasmic tail (6 amino acids) followed by a transmembrane region (20 amino acids) and a large extracellular domain (734 amino acids) (8). Its catalytic triad of residues Ser624, Asp702, and His734 is contained within a ∼200-amino acid region located in the carboxyl terminus of each subunit. However, seprase requires the dimerization of its inactive subunits for activity (8,9). Comparisons of their deduced amino acid sequences indicate that seprase is essentially identical to human fibroblast activation protein α (FAPα; GenBank™ accession number U09278), which is expressed on reactive stromal fibroblasts of various carcinomas and on fibroblasts of healing wounds (10,11). Additionally, seprase exhibits a striking sequence homology (52%) to the ectoenzyme dipeptidyl peptidase IV (GenBank™ accession number M74777), which increases to a 68% amino acid identity between their catalytic regions (8).
Alternative RNA splicing allows for the diversification of the protein products of a single gene not only in terms of their structure but possibly also their function and/or cellular localization. Interestingly, several genes that encode proteases associated with tumor invasion and metastasis undergo post-transcriptional RNA splicing. For example, splice variants with altered 5′- and/or 3′-untranslated regions have been reported for cathepsin B (12) and cathepsin L (13), and a variant has been described that encodes a truncated cytoplasmic isoform of cathepsin B (14). Transcription variants have also been identified that encode meprin β′ (15) and a soluble form of membrane type 3-matrix metalloprotease (16,17). Also, the gene that encodes the murine homolog of FAPα, mFAP (GenBank™ accession number Y10007), is reported to generate two splice variants that encode altered isoforms of the membrane-bound protease (18).
Functional eukaryotic mRNAs that have one or more AUG codons within their 5′ leader sequences are relatively rare in nature (19,20). Nevertheless, some proto-oncogenes, genes that control cellular growth and differentiation, and viral genes give rise to mRNAs that possess one or more short upstream open reading frames (uORFs) or minicistrons in their 5′ leaders that do not overlap the downstream major ORF (20,21). There have been numerous reports that uORFs can function as cis-acting regulatory elements that significantly inhibit the expression of their cognate downstream major ORFs.
Here, we report the identification of an alternatively spliced seprase mRNA from LOX cells that is generated by the utilization of suboptimal exonic 5′ and 3′ splice sites in its pre-mRNA. The resulting messenger is polycistronic; it harbors at least two uORFs in its 5′ leader region that inhibit the expression from a downstream ORF of seprase-s, a truncated isoform of seprase that is identical to the catalytic region of seprase-l.

EXPERIMENTAL PROCEDURES
The human breast carcinoma cell line MDA-MB-436, the human melanotic melanoma cell line SKMEL28, the human embryonic lung fibroblast line WI-38, and the monkey kidney cell line COS-7 were all purchased from American Type Culture Collection. Human umbilical vein smooth muscle cells (HUVSMC) and total RNA were obtained from Dr. S. Steve Okada (Georgetown University). Total RNA from the melanotic melanoma cell line RPMI7951 was obtained from Dr. H. Nakahara (Georgetown University). Superscript II RNase H− reverse transcriptase and recombinant Taq polymerase were from Life Technologies, Inc. Premixed deoxynucleotides were obtained from Roche Molecular Biochemicals. The mammalian expression vector pCR3.1 was purchased from Invitrogen, and the expression plasmid pCAT3-control vector was purchased from Promega. Unconjugated rabbit anti-CAT polyclonal Ab was purchased from 5 Prime → 3 Prime, Inc. Alkaline phosphatase-conjugated anti-rabbit polyclonal Ab was from Rockland. Immun-Star chemiluminescent substrate was obtained from Bio-Rad. Amplify, Hyperfilm, and L-[4,5-³H]leucine (136 Ci/mmol) were obtained from Amersham Pharmacia Biotech. Immobilon polyvinylidene difluoride transfer membranes were from Millipore. The human glyceraldehyde-3-phosphate dehydrogenase amplimer set was from CLONTECH.
DNA Cloning-cDNA amplicons that encode seprase-s were obtained by RT-PCR from LOX, MDA-MB-436, or HUVSMC RNA utilizing either the primer pair FAP 1 + FAP 6 or FAP 13 + FAP 12. PCR was carried out either as described above or with the Expand Long Template PCR System (Roche Molecular Biochemicals) utilizing buffer 1, an annealing temperature of 55°C, and an elongation time of 2.5 min for 30 cycles. Amplicons were either isolated from a 1% agarose gel using a QIAquick gel extraction kit or purified directly from the PCR reactions using QIAquick spin columns (Qiagen). Purified cDNAs were ligated into the pCR3.1 vector. Ligation, transformation, and selection of recombinant clones were carried out using the eukaryotic TA cloning kit (Bidirectional; Invitrogen).
DNA Sequencing and Analysis-The DNA sequence of seprase-s clones was obtained using the ABI Prism dye terminator cycle sequencing kit and an ABI Prism 377 DNA sequencer (Perkin-Elmer). The cDNA insert of clone pA12 was sequenced on both strands using primers that generated overlapping sequence data. Primers utilized for the sense strand were: T7 (5′-TAATACGACTCACTATAGGG-3′ (vector)); FAP 8 (5′-TCCAAGCAAGAAGTGTGTTA-3′ (nt 1507-1526)); and FAP 5 (5′-TGACAAACTCCTCTATGCAG-3′ (nt 1951-1970)). Primers utilized for the antisense strand were: RP-1 (5′-TAGAAGGCACAGTCGAGG-3′ (vector)) and FAP 7 (5′-CTGCATAGAGGAGTTTGTCA-3′ (nt 1970-1951)). Only the sense strand was sequenced for all other seprase-s clones. Sequence analysis was performed using Lasergene software (DNASTAR, Inc.).

Competitive PCR-The relative levels of seprase-l and seprase-s mRNAs in the human melanoma cell line LOX were determined by competitive PCR of seprase-l and seprase-s first-strand cDNAs generated from 3 LOX RNA preparations obtained over a 2-year period. First-strand cDNA synthesis was carried out as described above. Quantitation of seprase-l cDNA was obtained using a homologous competitive fragment that contains primer template sequences for the primers FAP 10 and 11 (see above). This fragment was produced by overlap extension using PCR (44). The oligonucleotide pair utilized to generate its 159-bp deletion (FAPα cDNA sequence from nt 908 through nt 1066) is FAP L (5′-GATACGGATATACCAGTTGCCTCAAGTGATTATTAT-3′ (sense)) and FAP M (5′-ATAATAATCACTTGAGGCAACTGGTATATCCGTATC-3′ (antisense)). The seprase-l target amplicon generated with the FAP 11+10 primers is 929 bp, whereas the mimic amplicon is 770 bp. Quantitation of seprase-s cDNA was carried out using a homologous DNA mimic that overlaps the truncated region of the seprase-s cDNA sequence and that contains primer template sequences for the primers FAP 1 and FAP 6 (see above).
This competitive fragment was also generated by overlap extension using PCR (44). The oligonucleotide pair used to produce its 248-bp deletion (FAPα cDNA sequence from nt 1863 through nt 2110) is FAP N (5′-GTCAGAGTGTAAGGTCTGGCATCTGGAACTGGTCTT-3′ (sense)) and FAP O (5′-AAGACCAGTTCCAGATGCCAGACCTTACACTCTGAC-3′ (antisense)). The seprase-s target amplicon produced with the FAP 1+6 primers is 1143 bp, and its competitive mimic is 895 bp. PCR was carried out with Taq polymerase (see above) using the cycle profile described under "DNA Cloning." A 25-µl sample of each competitive PCR reaction was resolved on a 1.2% agarose gel containing ethidium bromide. The equivalence point for target and mimic amplicon intensities was determined visually. The initial endogenous level of seprase-l and/or seprase-s cDNA (in 1 µl of the first-strand cDNA reaction) for each RNA preparation represents the average value of input mimic DNA that generates the target-mimic equivalence point in three distinct titrations. The endogenous levels (average value, value range; in attomoles, where 1 attomole is 1.0 × 10⁻¹⁸ mole) of seprase-l and -s cDNAs synthesized from each of the 3 LOX RNA preparations are: preparation A, seprase-l 0.067, 0.050-0.075 and seprase-s 0.008, 0.006-0.010; preparation B, seprase-l 0.300, 0.300 and seprase-s 0.013, 0.010-0.016; and preparation C, seprase-l 0.135, 0.125-0.150 and seprase-s 0.004, 0.003-0.005.
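The design of the two competitive mimics can be checked arithmetically. The following is a minimal sketch, not part of the original procedure: it assumes only the primer sequences, deleted spans, and amplicon sizes quoted above, and the helper names (reverse_complement, span_length, mimic_size) are hypothetical. It verifies that each overlap-extension primer pair is mutually complementary and that each mimic is shorter than its target by exactly the deleted span.

```python
# Illustrative consistency check for the competitive PCR mimics.
# All helper names are hypothetical; the numbers and sequences are those quoted in the text.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def span_length(first_nt: int, last_nt: int) -> int:
    """Inclusive length of a span given as 'nt first through nt last'."""
    return last_nt - first_nt + 1


def mimic_size(target_bp: int, deleted_span: tuple) -> int:
    """Expected mimic amplicon size: target size minus the deleted span."""
    return target_bp - span_length(*deleted_span)


# Overlap-extension primer pairs (sense, antisense)
FAP_L = "GATACGGATATACCAGTTGCCTCAAGTGATTATTAT"
FAP_M = "ATAATAATCACTTGAGGCAACTGGTATATCCGTATC"
FAP_N = "GTCAGAGTGTAAGGTCTGGCATCTGGAACTGGTCTT"
FAP_O = "AAGACCAGTTCCAGATGCCAGACCTTACACTCTGAC"

MIMICS = {
    "seprase-l": {"target_bp": 929, "deleted": (908, 1066), "primers": (FAP_L, FAP_M)},
    "seprase-s": {"target_bp": 1143, "deleted": (1863, 2110), "primers": (FAP_N, FAP_O)},
}

for name, m in MIMICS.items():
    sense, antisense = m["primers"]
    print(
        name,
        "deletion:", span_length(*m["deleted"]), "bp;",
        "mimic:", mimic_size(m["target_bp"], m["deleted"]), "bp;",
        "primers complementary:", reverse_complement(sense) == antisense,
    )
```

Because target and mimic share the same flanking primer sites and differ only by an internal deletion, they should amplify with similar efficiency while remaining resolvable on an agarose gel, which is the premise of reading the equivalence point visually.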
In Vitro Expression-Seprase-s cDNA and its deletion mutant homologs were expressed in vitro from plasmids (0.5 µg/25 µl) using both the TNT T7-coupled rabbit reticulocyte lysate and wheat germ extract systems (Promega). Plasmids were not linearized for the wheat germ extract system. Expression was also carried out using uncoupled in vitro transcription and translation. Amplicons (250 ng) containing the T7 promoter and seprase-s cDNA were transcribed using the T7 Cap-Scribe kit (Roche Molecular Biochemicals), and RNA transcripts (1 µl) were translated using wheat germ extract (Promega) adjusted to 73 mM K+ and 2.1 mM Mg2+. In vitro translations were carried out in the presence of [³H]leucine. The extent of [³H]leucine incorporation was determined by trichloroacetic acid precipitation of 5 µl of the reaction followed by liquid scintillation counting. Each trichloroacetic acid precipitation value is the average for a 5-µl aliquot from 3 identical reactions. Translation products were resolved by SDS-PAGE on 12% gels. The gels were impregnated with Amplify and dried before undergoing autoradiography.
Fusion Protein Constructs-Utilizing overlap extension mutagenesis, a fusion protein construct, p14SC, was generated that linked the cDNA insert of pA12 with one that encodes CAT. This was accomplished using a primer pair that encodes the carboxyl-terminal residues (Cys234 to Asp239) of seprase-s and the amino-terminal residues (Ile5 to Thr10) of CAT. The primer pair had the following sequences: sepCAT-F (5′-TTCTCTTTGTCAGACATCACTGGATATACCACC-3′ (forward)) and sepCAT-R (5′-GGTATATCCAGTGATGTCTGACAAAGAGAAACA-3′ (reverse)). PCR was carried out as described under "Deletion Mutagenesis" using the primer pair FAP 1 and sepCAT-R with pA12 and the primer pair sepCAT-F and CAT-R (5′-TGTATCTTATCATGTCTGCTC-3′ (nt 1210-1190; pCAT-3)) with the pCAT-3 control vector. Fusion-amplification using the Expand Long Template PCR System and subcloning were as described above. A deletion mutant, p33ΔM-1SC, in which an ATG triplet (nt 1 to 3) was deleted, was derived from p14SC using the primer pair FAP F + FAP G with FAP 1 + CAT-R as the flanking primers. Also, a CAT construct, pCAT, was produced by incorporating the CAT-coding region, obtained by PCR using the primers CAT-F (5′-AGCTCTTAAGCGGCCGCAAGC-3′ (nt 451-471; pCAT-3)) and CAT-R, into the pCR3.1 vector. DNA sequence analysis verified all CAT constructs.

COS-7 Cell Transfection-Transient transfection of COS-7 cells was carried out by electroporation (0.3 kV; 950 µF). The electroporated cells were harvested after 72 h and lysed in a detergent extraction buffer (9).
Immunoblotting-COS-7 detergent lysates (55 µg) were resolved by SDS-PAGE on 10% gels. Proteins were transferred to Immobilon polyvinylidene difluoride membranes. Blots were probed with a commercially available polyclonal rabbit anti-CAT Ab diluted 1:500. Primary Ab was detected with an anti-rabbit polyclonal Ab conjugated to alkaline phosphatase and diluted 1:20,000. Immunoreactive proteins were visualized using the Immun-Star substrate. Fig. 6 was obtained by exposing Hyperfilm to the Immun-Star-treated immunoblot for 15 s. This exposure time enhances the sepCAT fusion protein band relative to the CAT band.

RESULTS
Reverse transcription-PCR of LOX RNA using the primers FAP 1 and FAP 6, which correspond to nucleotide sequences within the 5′- and 3′-untranslated regions, respectively, of the seprase mRNA(s), yields two major amplicons at ∼2.4 kb and at ∼1.2 kb (Fig. 1). The ∼2.4-kb amplicon was previously shown (8) to contain the entire coding sequence for the seprase 97-kDa subunit (seprase-l). DNA sequence analysis (Fig. 2) of the clone pA12 (GenBank™ accession number AF007822), whose cDNA insert contains the entire ∼1.2-kb amplicon (1143 bp), revealed a 1223-bp deletion of the region extending from nt 61 through nt 1283 of the seprase-l cDNA sequence. Otherwise, it is essentially identical to the reported seprase cDNA sequence. To confirm the existence of a truncated seprase mRNA that gives rise to the ∼1.2-kb amplicon, we carried out RT-PCR on LOX RNA using primer pairs that generate nested fragments along the length of the seprase mRNA(s) (Fig. 1). Those pairs that correspond to nt sequences outside the predicted deleted region exhibit 2 major bands (i.e. seprase-l and -s mRNAs) with a size differential of ∼1.2 kb. However, those pairs that utilize a primer lying within the deleted region show only one band, which corresponds in length to the full-length messenger (i.e. seprase-l mRNA). An additional low intensity, intermediate size band was observed with all primer pairs that generate the two major amplicons (Fig. 1B). We have isolated and sequenced the intermediate band (∼1 kb) produced by the FAP 1+4 primer pair; it is an artifact of PCR (data not shown). Also, three additional truncated cDNA clones obtained by RT-PCR of LOX RNA have been sequenced, and all exhibit precisely the same deletion region as pA12. We estimated the relative abundance of the seprase-l and seprase-s mRNAs in each of three LOX RNA preparations by utilizing competitive PCR of their first-strand cDNAs ("Experimental Procedures").
The seprase-l/seprase-s ratios for the 3 RNA preparations are 8.9, 22.7 (this preparation was used in Fig. 1), and 34.6 (this preparation was used in Fig. 3, lanes 1-4). In addition, we found that both the seprase-l and -s mRNA levels appear to fluctuate among the three RNA preparations. The preparation that generated the intermediate mRNA ratio of ∼23 has the highest levels of both seprase-l and -s ("Experimental Procedures").
The existence of the truncated seprase mRNA is not unique to LOX cells. Reverse transcription-PCR analyses using the primer pair FAP 1+6 of RNAs from the cell lines RPMI7951 (melanoma), WI-38 (fibroblast), and MDA-MB-436 (carcinoma) and from HUVSMC all exhibit amplicons corresponding to both the seprase-l and the seprase-s mRNAs (Fig. 3). The noninvasive melanoma cell line SKMEL28, which does not express seprase, was negative for the presence of the seprase mRNAs (Fig. 3). Additionally, two truncated seprase cDNA clones were sequenced: one from the breast carcinoma line MDA-MB-436 and the other from HUVSMC. Both are essentially identical to pA12 ("Experimental Procedures"; data not shown).

Analysis of the pA12 cDNA sequence (Fig. 2) predicts that the 1223-bp deletion between nt −329 and −328 is out of phase with respect to the seprase-l ORF, which begins 4 codons upstream at the ATG triplet represented by nt −340 to −338. The exonic deletion produces a uORF or minicistron (distal uORF) encoding the pentapeptide MKTWQ followed by an in-frame TGA codon at nt −325 to −323. Downstream, the cDNA sequence predicts the existence of a second uORF (proximal uORF), which encompasses nt −179 to −72 and encodes a 36-amino acid polypeptide that is not homologous to other reported uORF-encoded proteins. A putative third uORF (central) extends from nt −326 to −237 and would encode a 30-amino acid polypeptide. The initiation ATG triplet (nt −326 to −324) for this centrally located uORF is overlapped by the termination codon (nt −325 to −323) for the distal uORF (nt −340 to −326), and therefore it would be expected to initiate or reinitiate protein synthesis poorly (Ref. 48; see inhibition by uORFs below). Nevertheless, functional uORFs with this structural organization have been reported (31-33). The pA12 cDNA lacks 160 nt that are present at the 5′ end of the seprase/FAPα mRNA (10). We analyzed the 5′-untranslated region of the FAPα cDNA sequence for uORFs; none were found.
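The coordinate arithmetic behind these statements can be reproduced in a few lines. The sketch below is illustrative only: it uses the nucleotide positions quoted in this paper, the helper names (span_length, codon_count) are hypothetical, and it assumes that the quoted uORF spans cover the coding codons but not the stop codon, as the distal uORF (nt −340 to −326, followed by TGA at nt −325 to −323) suggests. Spans that would cross from negative to positive numbering are deliberately rejected, because whether the numbering contains a nt 0 is not stated.

```python
# Illustrative check of the deletion phase and uORF sizes.
# Helper names are hypothetical; positions are those quoted in the text.


def span_length(first_nt: int, last_nt: int) -> int:
    """Inclusive length of a span whose ends lie on the same side of nt 1."""
    if (first_nt < 0) != (last_nt < 0):
        raise ValueError("spans crossing the -1/+1 boundary are not handled")
    return last_nt - first_nt + 1


def codon_count(first_nt: int, last_nt: int) -> int:
    """Number of whole codons in a span; raises if the span is not a multiple of 3."""
    length = span_length(first_nt, last_nt)
    if length % 3 != 0:
        raise ValueError(f"span of {length} nt is not a whole number of codons")
    return length // 3


# The exonic deletion runs from nt 61 through nt 1283 of the seprase-l cDNA.
deletion_nt = span_length(61, 1283)
print("deletion:", deletion_nt, "bp; remainder mod 3:", deletion_nt % 3)

# uORF spans as quoted (assumed here to exclude the stop codon).
UORFS = {
    "distal": (-340, -326),    # encodes the pentapeptide MKTWQ
    "central": (-326, -237),   # putative; 30 amino acids
    "proximal": (-179, -72),   # 36 amino acids
}
for name, (first_nt, last_nt) in UORFS.items():
    print(name, "uORF:", codon_count(first_nt, last_nt), "codons")

# The pA12 5' leader (388 nt) plus the 160 nt missing from its 5' end
# gives the projected leader of the cognate messenger.
print("projected 5' leader:", 388 + 160, "nt")
```

A deletion whose length leaves a remainder on division by 3 shifts the reading frame, which is why the seprase-l ORF terminates shortly after the splice junction and becomes the short distal uORF.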
The scanning model for initiation of protein synthesis (49) predicts that the first ATG triplet (nt 1 to 3) in adequate sequence context downstream of the proximal (nt −179 to −72) uORF can initiate polypeptide synthesis (Fig. 2). This ATG triplet corresponds to Met522 in full-length seprase-l (8). It thus delimits an ORF that encodes a polypeptide of 239 amino acids with an Mr of 26,956 (seprase-s). To determine if this ATG codon initiates protein synthesis, we carried out in vitro transcription and translation of the pA12 cDNA in both coupled and uncoupled systems using rabbit reticulocyte lysate and wheat germ extract ("Experimental Procedures"). In Fig. 4, lanes 2 and 7 show translation products generated by pA12 in the coupled rabbit reticulocyte and wheat germ systems, respectively. Both lanes exhibit a single major band under the 30-kDa marker.
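The correspondence between seprase-l and seprase-s residue numbers follows directly from the initiation site. The sketch below is an illustration under one stated assumption, namely that seprase-s is colinear with the carboxyl-terminal portion of seprase-l from Met522 onward; the function name to_seprase_s is hypothetical. It reproduces the 239-residue length and the seprase-s positions of the catalytic triad given in the legend to Fig. 2.

```python
# Illustrative residue renumbering between seprase-l and seprase-s.
# Assumes seprase-s is colinear with seprase-l from Met522 to the carboxyl terminus.

# Cytoplasmic tail + transmembrane region + extracellular domain of seprase-l
SEPRASE_L_LENGTH = 6 + 20 + 734
SEPRASE_S_START_IN_L = 522  # Met522 of seprase-l is residue 1 of seprase-s


def to_seprase_s(position_in_l: int) -> int:
    """Convert a seprase-l residue number to seprase-s numbering."""
    if not SEPRASE_S_START_IN_L <= position_in_l <= SEPRASE_L_LENGTH:
        raise ValueError("residue lies outside the region shared with seprase-s")
    return position_in_l - SEPRASE_S_START_IN_L + 1


print("seprase-s length:", to_seprase_s(SEPRASE_L_LENGTH), "residues")
for residue, position in (("Ser", 624), ("Asp", 702), ("His", 734)):
    print(f"{residue}{position} (seprase-l) is {residue}{to_seprase_s(position)} (seprase-s)")
```

Residues upstream of Met522, such as Val5 or Ser412 at the boundaries of the deleted region, have no seprase-s counterpart, and the function raises an error for them rather than returning a misleading number.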
Uncoupled in vitro transcription followed by in vitro translation of capped RNA transcripts in wheat germ extract duplicated the results in lane 7 ("Experimental Procedures"; data not shown). To determine if the major band in lanes 2 and 7 initiates at the ATG codon corresponding to nt 1 to 3, we constructed a deletion mutant p11ΔM-1 ("Experimental Procedures") in which nt 1 to 3 are deleted from the parental plasmid pA12. Lanes 3 and 8 show that indeed the major translation product generated by pA12 initiates at this ATG triplet. The next potential initiation ATG triplet is located at nt 136 to 138 (Fig. 2). In Fig. 4, lanes 4 and 9, a deletion mutant p24ΔM-2 in which nt 136 to 138 have been deleted expresses only the major translation product seen in lanes 2 and 7. This confirms that the upstream ATG codon (nt 1 to 3) is the primary site of initiation. In lane 5 we utilized a double mutant construct p14ΔM-1+2 in which both ATG triplets (nt 1 to 3 and nt 136 to 138) have been deleted. The translation products (between 21 and 30 kDa) generated by wild type pA12 (lane 2) and the two deletion mutants p11ΔM-1 (lane 3) and p24ΔM-2 (lane 4) are not detected. The same result was obtained with wheat germ extract (data not shown). These results indicate that the downstream ATG codon at nt 136 to 138 is initiation-capable (p11ΔM-1 initiates at this site), but it is the upstream ATG triplet (nt 1 to 3) that serves as the major seprase-s initiation site.

FIG. 2. Nucleotide sequence of the pA12 cDNA and the deduced amino acid sequences of its uORFs and its major ORF.
Nucleotide and amino acid sequence numbers are to the left. The number 1 nucleotide and the first amino acid residue correspond to the major ORF. The deduced amino acid sequences of the distal (nt −340 through −326) and proximal (nt −179 through −72) uORFs are shown. Initiation and putative initiation ATG codons and the nucleotides in the −3 and +4 positions relative to these ATG codons (A = +1) are represented by bold characters. The alternative splice junction between nt −329 and −328 is separated by 24 underlined nucleotides that represent the extreme 5′ (6 nucleotides) and 3′ (18 nucleotides) ends, respectively, of the deleted 1223-bp exonic region present in the full-length seprase cDNA. Putative exonic splicing enhancer-like motifs are represented by bold italicized characters. Initiation methionine residues are denoted by the bold character M, whereas amino acid residues in the seprase-s ORF that correspond to the catalytic triad (Ser103, Asp181, His213) and the serine protease consensus motif (Gly101, Trp102, Ser103, Tyr104, Gly105) of seprase-l are represented by bold underlined characters. Arrows denote nt positions at which the uORF deletion mutants p8+6-3 (nt −265) and p16+6-11 (nt −61) begin their 5′ leader regions (Fig. 5).

FIG. 3. Seprase mRNA profiles of melanoma, carcinoma, and fibroblast cell lines and HUVSMC.
Reverse transcription-PCR was performed on total RNA from the human cell lines LOX (amelanotic melanoma), SKMEL28 (melanotic melanoma), RPMI7951 (melanotic melanoma), WI-38 (lung embryonic fibroblast), and MDA-MB-436 (breast carcinoma) and from HUVSMC utilizing the seprase/FAPα primers FAP 1+6 (lanes 1, 3, 5, 7, 9, 11, and 13), which correspond to nucleotide sequences within the 5′- and 3′-untranslated regions, respectively, of the seprase mRNAs. Reverse transcription-PCR was also carried out using a glyceraldehyde-3-phosphate dehydrogenase amplimer set (lanes 2, 4, 6, 8, 10, 12, and 14). The ∼2.4-kb and ∼1.2-kb amplicons (indicated by arrows) generated in lanes 3, 7, 9, 11, and 13 correspond to seprase-l and seprase-s mRNAs, respectively. Lanes 1 and 2 represent RT-PCR of LOX RNA in the absence of reverse transcriptase. Lanes 5 and 6 represent RT-PCR of a noninvasive cell line (SKMEL28) that is negative for seprase expression. Lane 15 contains size markers.

FIG. 4. In vitro expression of seprase-s and its deletion mutant homologs.
Parental plasmid pA12 that encodes seprase-s and the ATG codon deletion mutant constructs of its downstream ORF, p11ΔM-1 (nt 1 to 3), p24ΔM-2 (nt 136 to 138), and p14ΔM-1+2 (nt 1 to 3 and nt 136 to 138), were expressed in coupled in vitro transcription and translation systems (Fig. 2). Lanes 1 to 5 represent the rabbit reticulocyte lysate system, whereas lanes 6 to 9 utilize the wheat germ extract system. Plasmid pA11 is the vector control. The ∼17-kDa translation product present in lanes 2-5 and lanes 7 (weak), 8, and 9 initiates from the ATG codon at nt 265 to 267 (Fig. 2; data not shown). Translation products were labeled by [³H]leucine incorporation and resolved by SDS-PAGE on a 12% gel followed by fluorography.
The pA12 cDNA sequence predicts that there are two unambiguous uORFs located at nt −340 to −326 (distal) and at nt −179 to −72 (proximal) and a putative third uORF (central), which extends from nt −326 to −237. We analyzed the contribution the uORFs made to the translational efficiency of the downstream ORF by generating 5′ leader deletion mutants: p8+6-3, which lacks the distal uORF and a large portion (nt −326 through −266) of the putative central uORF, and p16+6-11, which lacks all uORFs ("Experimental Procedures"; Fig. 2). As can be seen in Fig. 5, there is a marked difference in the translation levels of the downstream ORF between the wild type pA12 and the deletion mutant p16+6-11. Removal of the distal uORF and a majority (76%) of the putative central uORF produces a small but noticeable increase in downstream ORF expression. Quantitation of [³H]leucine incorporation revealed that p8+6-3 increased [³H]leucine incorporation by 1.3-fold and p16+6-11 by 3.7-fold over parental pA12 ("Experimental Procedures").
Our panel of anti-seprase monoclonal Abs (9) does not recognize the pA12 primary translation product generated in the coupled in vitro transcription and translation system. To confirm that the downstream ORF of pA12 is expressed in vivo, we carried out transient transfection of COS-7 cells with a fusion protein construct, p14SC, that links the cDNA insert of pA12 to one that encodes CAT ("Experimental Procedures"). We also made a deletion mutant of this construct, p33ΔM-1SC, which deletes the initiation ATG triplet (nt 1 to 3) of the downstream ORF. The CAT ORF, which begins at Ile5, is in the same reading frame as seprase-s. Thus, detection of the fusion protein (sepCAT) by an anti-CAT polyclonal Ab can only occur if seprase-s is expressed. Also, neither fusion protein construct can express native full-length CAT, since the initiation ATG triplet for the CAT ORF (and the three succeeding codons) is deleted. Fig. 6 shows the results of an immunoblot that was carried out on detergent extracts of transiently transfected COS-7 cells using the positive control construct pCAT (encodes full-length CAT), p14SC, p33ΔM-1SC, and pA11 (vector control). The p14SC lane shows the expression of the sepCAT fusion protein band co-migrating with the 46-kDa marker (predicted Mr ∼53,000) that is not present in the other lanes.

DISCUSSION
Our results confirm the existence of an alternatively spliced seprase mRNA that encodes a novel truncated isoform, seprase-s. The pA12 cDNA sequence (Fig. 2) indicates that deletion of an exonic 1223-bp region from the seprase pre-mRNA is the result of alternative exon splicing, which obeys the GT-AG rule (50) but utilizes suboptimal exonic donor and acceptor splice sites (51,52). The deleted region encodes part of the cytoplasmic tail, the transmembrane region, and the membrane proximal (N-glycosylation) and central (cysteine-rich) regions of the extracellular domain of seprase-l (8). Interestingly, within the nucleotide sequence just downstream of the splice junction (Fig. 2; nt −329 to −328) is a purine-rich region (nt −318 to −297), 5′-GAAGAATACCCTGGAAGAAGAAA-3′, which resembles exonic splicing enhancer motifs that facilitate the removal of proximal upstream introns (with weak 5′ and/or 3′ splice sites) from pre-mRNA (53-57). Based on the genomic organization of the human FAP gene (58), the splice sites utilized in the alternative splicing of the seprase pre-mRNA are located within exons 2 and 15, respectively. Alternative processing of human dipeptidyl peptidase IV pre-mRNA has not been reported (59,60).
Determination of the relative abundance of the seprase-l and -s mRNAs in LOX cells ("Results") suggests that the seprase mRNA ratio is in a dynamic state and that a complex set of cellular factors (SR proteins, transcription factors, etc.) affects the relative abundance of the seprase mRNAs.
FIG. 6. In vivo expression of seprase-s. A fusion protein construct, p14SC, that links the cDNA insert of plasmid pA12 (encodes seprase-s from its downstream ORF) to one that encodes CAT and a p14SC deletion mutant construct, p33ΔM-1SC, in which the initiation ATG codon (nt 1 to 3) of the downstream fusion protein (sepCAT) ORF is deleted, and two additional plasmids, pCAT, which encodes native CAT, and pA11, the vector control, were transiently transfected into COS-7 cells. Detergent extracts (∼55 µg) of the transfected COS-7 cells were resolved by SDS-PAGE on a 10% gel, transferred to an Immobilon polyvinylidene difluoride membrane, and immunoblotted with a polyclonal anti-CAT Ab. Arrows indicate the positions of the CAT and sepCAT bands in their respective lanes.

Deletion mutagenesis of the pA12 cDNA 5′ leader sequence (Fig. 5) confirms that this uORF(s)-containing region inhibits the translation of the seprase-s downstream ORF. The 5′ leader (388 nt) of the pA12 cDNA as well as the projected 5′ leader (548 nt) of its cognate messenger has a G+C content of ∼40%. This observation suggests that it is the uORFs and not extensive secondary structure of the 5′ leader that inhibit the expression of seprase-s. Because the initiation ATG codons for the distal (nt −340 to −326) and proximal (nt −179 to −72) uORFs have an adequate sequence context (49), and because the proximal uORF also possesses an in-frame ATG triplet (nt −131 to −129) in a strong context (A at −134; G at −128), it is probable that 40 S subunits reaching the downstream ORF must translate one or both of these uORFs. This strongly suggests that the expression of the downstream coding sequence (seprase-s) involves 40 S subunit reinitiation, which is consistent with the decreased translational efficiency exhibited by the pA12 transcripts (49,61).
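The notion of sequence context around an ATG codon can be made concrete with a short sketch. The classification rule used below (a purine at position −3 and a G at position +4, with A of the ATG as +1, giving a "strong" context; one of the two giving an "adequate" context; neither giving a "weak" context) is the commonly cited reading of the scanning-model literature and is an assumption here, not a criterion stated in this paper. The function name kozak_context and the test sequences are hypothetical and are not taken from pA12.

```python
# Illustrative classification of the sequence context around an ATG codon.
# The rule and all sequences below are assumptions for illustration only.

PURINES = {"A", "G"}
LABELS = ("weak", "adequate", "strong")


def kozak_context(seq: str, atg_index: int) -> str:
    """Classify the context of the ATG starting at 0-based index atg_index.

    Position -3 is three nucleotides upstream of the A of the ATG;
    position +4 is the nucleotide immediately following the G of the ATG.
    """
    seq = seq.upper()
    if seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("no ATG at the given index")
    if atg_index < 3 or atg_index + 3 >= len(seq):
        raise ValueError("flanking positions -3 and +4 must lie within the sequence")
    minus3 = seq[atg_index - 3]
    plus4 = seq[atg_index + 3]
    matches = int(minus3 in PURINES) + int(plus4 == "G")
    return LABELS[matches]


# Hypothetical sequences, each with its ATG at index 6
for example in ("GCCACCATGG", "TTTATTATGT", "TTTTTTATGT"):
    print(example, "->", kozak_context(example, 6))
```

Applied to the in-frame ATG at nt −131 to −129 of the proximal uORF, the same positions are nt −134 (A) and nt −128 (G), which is why that triplet is described as lying in a strong context.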
In vitro transcription and translation coupled with deletion mutagenesis of clone pA12 (Fig. 4) confirms that the seprase-s ORF encodes only the carboxyl-terminal region (Fig. 2) of the integral membrane isoform, seprase-l. It is the carboxyl-terminal region of seprase-l that is responsible for its proteolytic activity (8,9). However, whether seprase-s retains proteolytic activity is still not known. Dimerization of seprase-l (9) and dipeptidyl peptidase IV (62) monomeric subunits is required for their proteolytic activity. Additionally, it has been reported that the dimerization of dipeptidyl peptidase IV subunits occurs in the Golgi apparatus (63). Analysis of the seprase-s sequence by the PSORT II program for prediction of protein subcellular localization sites (Swiss Institute of Bioinformatics) indicates a 70% probability that seprase-s is a cytoplasmic protein with no ability to insert itself into an organelle or plasma membrane or to be targeted to an organelle. Recently, it was reported that invasive ductal carcinoma cells of human breast cancers exhibit polyclonal Ab staining against seprase not only on the cell surface but also throughout the cytoplasm (4). Whether the cytoplasmic staining of vesicle-associated intracellular seprase in these cells was in some part due to seprase-s remains to be determined. Clearly, the structure of seprase-s dictates that its role in the biology of seprase will be quite different from that of its integral membrane counterpart, seprase-l.