Targeting of Endopeptidase 24.16 to Different Subcellular Compartments by Alternative Promoter Usage*

Endopeptidase 24.16 or mitochondrial oligopeptidase, abbreviated here as EP 24.16 (MOP), is a thiol- and metal-dependent oligopeptidase that is found in multiple intracellular compartments in mammalian cells. From an analysis of the corresponding gene, we found that the distribution of the enzyme to appropriate subcellular locations is achieved by the use of alternative sites for the initiation of transcription. The pig EP 24.16 (MOP) gene spans over 100 kilobases and is organized into 16 exons. The core protein sequence is encoded by exons 5–16 which match perfectly with exons 2–13 of the gene for endopeptidase 24.15, another member of the thimet oligopeptidase family. These two sets of 11 exons share the same splice sites, suggesting a common ancestor. Multiple species of mRNA for EP 24.16 (MOP) were detected by the 5′-rapid amplification of cDNA ends and they were shown to have been generated from a single gene by alternative choices of sites for the initiation of transcription and splicing. Two types of transcript were prepared, corresponding to transcription from distal and proximal sites. Their expression in vitro in COS-1 cells indicated that they encoded two isoforms (long and short) which differed only at their amino termini: the long form contained a cleavable mitochondrial targeting sequence and was directed to mitochondria; the short form, lacking such a signal sequence, remained in the cytosol. The complex structure of the EP 24.16 (MOP) gene thus allows, by alternative promoter usage, a fine transcriptional regulation of coordinate expression, in the different subcellular compartments, of the two isoforms arising from a single gene.

Metalloendopeptidases form a large family of peptidases that have a His-Glu-X-X-His (HEXXH) zinc-binding motif and preferentially cleave short substrates. For example, endopeptidase 24.15 (EP 24.15), a member of this family, acts on peptides of 6-18 amino acid residues and exhibits no or only very weak proteolytic activity against proteins (1-3). Among the members of this family, thimet oligopeptidase (TOP or EP 24.15) and oligopeptidase M (MOP or EP 24.16) are unique in their sensitivities to thiol reagents and they constitute a subfamily, the thimet (thiol- and metal-dependent) oligopeptidase subfamily. Recent molecular cloning revealed the presence of a cysteine residue unique to members of this subfamily near position 483. This residue is absent from the other members that exhibit no thiol dependence (4,5). In addition to the members of this family of mammalian origin, certain oligopeptidases of microbial origin that belong to this family have also been identified, including oligopeptidase A (OpdA) and dipeptidyl carboxypeptidase (Dcp) of Escherichia coli and Salmonella typhimurium (6), peptidase F of Lactococcus lactis (7), mitochondrial intermediate peptidase of rat and yeast (8,9), and saccharolysin (YCL57w or proteinase yscD) of yeast (10). This report deals with the two best characterized mammalian enzymes, namely, EP 24.15 (TOP) and EP 24.16 (MOP), which are members of the thimet oligopeptidase family. This family has also been called the M3 family of metalloendopeptidases in the classification of Rawlings and Barrett (11,12). EP 24.15 (TOP) was first identified as a collagenase-like peptidase or Pz-peptidase in experiments with the Pz-peptide that was originally designed by Wünsch and Heidrich (13) as a substrate for collagenase. Although the Pz-peptide was a good substrate for clostridial collagenase, it turned out not to be a substrate for avian and mammalian collagenases (14).
* This work was supported in part by a Grant-in-Aid for Scientific Research on Priority Areas (Intracellular Proteolysis) from the Ministry of Education, Science, Sports and Culture of Japan and by research grants from SRF, the Cosmetology Research Foundation, and the Life Insurance Association of Japan. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact. The nucleotide sequence(s) reported in this paper has been submitted to the GenBank TM

The Pz-peptide hydrolyzing activities found in avian and mammalian tissues have, therefore, been designated collagenase-like peptidases or simply Pz-peptidases. Independent studies on the metabolism of brain peptides led to the discovery of two enzymes: one was described by Camargo et al. (15) in 1972 and was named neutral endopeptidase and, later, endo-oligopeptidase A; and the other, first described by Orlowski et al. (16) in 1983, was initially named soluble metalloendopeptidase and subsequently endopeptidase 24.15. All these enzymes turned out to be the same and are now known as thimet oligopeptidase (17). In this report we use the abbreviated designation EP 24.15 (TOP). cDNA sequences for the mammalian enzyme are now available for the rat (4,18,19), pig (20), and human (21). EP 24.16 (MOP) was also discovered independently in several different laboratories. 1) Heidrich et al. (22) demonstrated a Pz-peptide hydrolyzing activity in a mitochondrial fraction of rat liver, which was later shown to be distinct from EP 24.15 (TOP) by both biochemical characterization (23) and partial amino acid sequencing of the purified enzyme; it was named oligopeptidase M (24). 2) We (25) and Kiron and Soffer (26) identified a soluble angiotensin-binding protein in pig and rabbit liver during the course of studies aimed at identifying hepatic receptors for angiotensin II. After our publication of the cDNA sequence of the binding protein from pig (27), McKie et al. (18) pointed out the strong similarity between our sequence and that of rat EP 24.15 (TOP), which had been determined by Pierotti et al. (4,19). We then obtained a second cDNA clone that was very similar to, but clearly different from, the cDNA for the binding protein, and we showed that the second clone represented the pig homolog of rat EP 24.15 (TOP) (20). The angiotensin-binding protein, although originally identified as a binding protein, did indeed have thiol- and metal-dependent oligopeptidase activity (20). At that time, therefore, the binding protein appeared to represent a new member of the thimet oligopeptidase family, since the amino acid sequence of oligopeptidase M or EP 24.16 (MOP) had not yet been determined for any mammalian species. 3) Kawabata et al. (28,29) isolated an endopeptidase and the corresponding cDNA clone as a candidate for an enzyme responsible for the post-translational processing of γ-carboxyglutamic acid-containing blood coagulation factors. They failed to notice the strong similarity to our binding protein, which was later pointed out by McKie et al. (18). 4) Checler et al. (30,31) demonstrated the presence of a novel proteolytic activity capable of inactivating neurotensin. They purified the peptidase from rat brain synaptic membranes and characterized it (32).
The enzyme, termed neurolysin or endopeptidase 24.16, was shown to be distinct from EP 24.15 (TOP) and neprilysin (also known as enkephalinase or endopeptidase 3.4.24.11) and to have a relatively broad substrate specificity and tissue distribution. Recent determination of its amino acid sequence by cDNA cloning clearly indicated that neurolysin is identical to the three enzymes mentioned above (33). Thus, four separate lines of research have converged in the discovery of a single new member of the thimet oligopeptidase family. In this report we use the abbreviation EP 24.16 (MOP) for this protein, whose identity has only recently been established. EP 24.15 (TOP) and EP 24.16 (MOP) are very similar in terms of size and enzymatic properties: both are intracellular proteins of 78-80 kDa, consisting of about 680-700 amino acids, and their sequences are 65% homologous (20). They are, however, clearly distinguishable in several respects. For example, they have different specificities for inhibitors, different immunoreactivity, and different cleavage-site specificities. EP 24.15 (TOP) hydrolyzes neurotensin exclusively at the Arg-Arg bond whereas EP 24.16 (MOP) cleaves it at the Pro-Tyr bond (16,24,32). Another difference is found in the subcellular localizations of these enzymes. EP 24.15 (TOP) is found in the cytosol while EP 24.16 (MOP) is found in both the cytosolic and mitochondrial compartments. How can the product of a single gene be localized to more than one intracellular compartment? To answer this question and to characterize evolutionary relationships among the members of the thimet oligopeptidase family, we investigated the structural organization of the pig genes for EP 24.15 (TOP) and EP 24.16 (MOP) and of their 5′-proximal flanking regions. We discovered six species of mRNA for EP 24.16 (MOP) that are generated from a single gene as a result of the utilization of alternative sites for the initiation of transcription.
The six species of mRNA can be classified into two categories: those containing an additional sequence that encodes a mitochondrial targeting sequence and those that lack such a sequence. The use of different promoters for the eventual targeting of proteins to appropriate subcellular compartments appears to be a useful mechanism for adjustment of local concentrations of proteins that function at different intracellular sites in response to the physiological requirements of the cell.

Identification of 5′-Terminal Sequences of cDNAs for EP 24.16 (MOP)-The 5′-ends of cDNAs for EP 24.16 (MOP) were cloned with the 5′-RACE (rapid amplification of cDNA ends) system (CLONTECH, Palo Alto, CA). Two μg of poly(A)+ RNA, isolated from pig liver (27), were reverse-transcribed with a specific primer for the cDNA for pig EP 24.16 (MOP), 5RA-1 (5′-GTCTAGCATGGTTCGTTCC-3′), and avian myeloblastosis virus reverse transcriptase. The first-strand cDNA was ligated at the 3′-end with an anchor (5′-CACGAATTCACTATCGATTCTGGAACCTTCAGAGG-3′) by T4 RNA ligase. A nested specific primer for the cDNA for EP 24.16 (MOP), 5RA-2 (5′-CCGTCTACACCTTCACTTC-3′), was used with an anchor primer (5′-CTGGTTCGGCCCACCTCTGAAGGTTCCAGAATCGATAG-3′) for amplification of the 5′-ends of the cDNAs by polymerase chain reaction. The products of polymerase chain reaction were fractionated on a 3% agarose gel, and fragments of 300-650 bp were isolated and cloned into pBluescript II (Stratagene, La Jolla, CA). Positive clones were identified by colony hybridization, with the 32P-labeled EcoRI-EcoRV 592-bp fragment of PAB-L1 (27) as probe, and sequenced.
Sequencing of DNA-DNA was sequenced by the dideoxy chain termination method of Sanger et al. (34) with double-stranded plasmids as templates. Termination reactions were performed with SequiTherm DNA polymerase (Epicentre Technologies, Madison, WI) and IRD41-labeled M13 universal or reverse primer (LI-COR, Lincoln, NE). The products were analyzed with a DNA sequencer (model 4000; LI-COR). Sequences were organized and analyzed with the GENETYX-MAC program (Software Development, Tokyo, Japan).
Restriction Mapping of Phages-Positions of the EcoRI, SacI, and XbaI restriction sites in genomic clones were determined by complete or partial digestion with restriction enzymes and subsequent Southern blot analysis. UV irradiation and formation of pyrimidine dimers were used for preparation of incompletely digested genomic clones. EMBL3 SP6/T7 contains two unique SfiI or SalI sites and bacteriophage promoters (SP6 and T7) that flank the insert. Arms were separated with SfiI or SalI from the inserts, which still contained promoter sequences at both ends. DNA samples were UV-irradiated for 0 or 20 min with a UV Stratalinker 2400 (Stratagene) in 10 mM Tris, pH 7.5, 10 mM MgCl2, and 1 mM dithiothreitol. UV-irradiated samples (500 ng) were digested incompletely with EcoRI, SacI, or XbaI (10 units) for 1 h at 37°C, fractionated on a 0.7% agarose gel, and transferred to nylon membranes (Magnagraph; MSI, Westboro, MA). A set of filters was prepared and allowed to hybridize with end-labeled oligoprobes for the T7 or SP6 promoter sequence for 14 h at 37°C in 6× SSC, 5× Denhardt's solution, 0.5% SDS, and 100 μg/ml herring sperm DNA. The filters were washed twice in 1× SSC, 0.1% SDS at 42°C for 30 min, exposed to imaging plates, and analyzed with a Bioimage Analyzer (model BAS 2000; Fuji Film, Tokyo, Japan).
S1 Nuclease Protection Assay-Three primers, namely, 108L (CAAGCCTTGCGGCGGCCTAGCAAAGGAGGCAACAG) for exon 1; 107L (GGTGTCCCTCGGGGTAGACCATGTGGGCTGTAGAA) for exon 2; and 106b (GTCTCTCCATGAGAATGCTCCT) for exon 3, were designed for the synthesis of single-stranded antisense DNA probes that would protect the 5′-ends of pig mRNAs for EP 24.16 (MOP). Ten pmol of each primer were labeled with [γ-32P]ATP (Amersham) by polynucleotide kinase (Takara) and used for the synthesis of probes. End-labeled primers were annealed with 5 μg of plasmid DNA that contained genomic fragments of the pig gene for EP 24.16 (MOP) (ApaI-XhoI 837-bp fragment of PAB-G33 for exons 1 and 2; BglII-EcoRI 923-bp fragment of PAB-G32 for exon 3), and antisense probes were synthesized with T7 DNA polymerase (Pharmacia, Uppsala, Sweden). The 3′-ends of the probes were digested with restriction enzymes (SmaI for exon 1, BssHII for exon 2, and BglII for exon 3), fractionated by electrophoresis on a 5% polyacrylamide gel that contained 7 M urea, and exposed to x-ray film. Probes were detected as bands of the expected mobility and extracted in 0.5 M ammonium acetate, 10 mM magnesium acetate, 1 mM EDTA, 0.1% SDS, and 10 μg/ml yeast tRNA at 37°C for 12 h. Extracted probes were precipitated in ethanol, and probes (1 × 10^5 cpm each) were annealed with 5 μg of poly(A)+ RNA from pig liver or with 10 μg of yeast tRNA, as a control, for 12 h at 30°C in 80% formamide, 40 mM PIPES, pH 6.4, 1 mM EDTA, and 400 mM NaCl. Non-annealed nucleic acids were digested with S1 nuclease (Boehringer Mannheim, Mannheim, Germany) at a final concentration of 1,000 units/ml in 0.28 M NaCl, 0.05 M sodium acetate, pH 4.5, 4.5 mM ZnSO4, and 20 μg/ml denatured herring sperm DNA. The protected fragments were purified by extraction with phenol/chloroform and precipitation with ethanol, and electrophoresed in 5% polyacrylamide gels containing 7 M urea. Gels were dried and exposed to imaging plates for 48 h. Images were analyzed with the Bioimage Analyzer.
Production of Antiserum against Purified Pig EP 24.16 (MOP)-Rabbits were injected subcutaneously with 75 μg of purified pig EP 24.16 (MOP) (formerly referred to as soluble angiotensin-binding protein, sABP (27)) in complete Freund's adjuvant. Booster injections with 75 μg of purified protein in incomplete Freund's adjuvant were given 2, 4, and 6 weeks after the initial injection. Rabbits were bled 10 days after the fourth injection.
(Fig. 1), which encoded only the 5′-ends of type 1′, 2, 2′, 3, and 3′ cDNAs, respectively, were digested with AlwNI at their 3′ termini, ligated with the 2291-bp AlwNI-EcoRI fragment of PAB-L1 and subcloned into pcDNA3. COS-1 cells were maintained in Dulbecco's modified Eagle's medium (Life Technologies, Inc., Gaithersburg, MD) that contained 10 mM HEPES, pH 7.2, 10% fetal bovine serum, 50 units/ml penicillin, and 50 μg/ml streptomycin, in a controlled atmosphere of 5% CO2 in air at 37°C. Approximately 6 × 10^6 cells were electroporated with 20 μg of each plasmid at 220 V at a capacitance setting of 960 microfarads in a Gene Pulser apparatus (Bio-Rad) and harvested 48 h after electroporation.
Subcellular Fractionation of Cells and Western Blotting-All steps were performed at 4°C. Cells were washed by centrifugation in Dulbecco's phosphate-buffered saline (2.7 mM KCl, 138 mM NaCl, 1.2 mM KH2PO4, and 8.1 mM Na2HPO4, pH 7.4) at 700 rpm for 2 min. Approximately 2 × 10^7 cells were suspended in 1 ml of 0.25 M sucrose and homogenized for 2 min. Nuclear fractions were removed by centrifugation at 3,000 rpm (700 × g) for 10 min, and supernatants were centrifuged at 9,200 rpm (7,000 × g) for 10 min to recover mitochondrial fractions as pellets. Mitochondrial fractions were washed twice by centrifugation at 25,000 rpm (24,000 × g) for 10 min. The post-mitochondrial supernatants were centrifuged at 50,000 rpm (105,000 × g) for 100 min, and the pellets (microsomes) and supernatants (cytosol) were recovered. The concentration of protein in each fraction was determined with the BCA protein assay reagent (Pierce). Five μg of each protein sample were fractionated by SDS-PAGE (10% polyacrylamide) in standard glycine running buffer (192 mM glycine, 25 mM Tris, and 0.1% SDS) or high-resolution running buffer (492 mM glycine, 75 mM Tris, and 0.1% SDS). The separated proteins were transferred to a polyvinylidene difluoride membrane (ATTO, Tokyo, Japan) and probed with 2,000-fold diluted rabbit antiserum against pig EP 24.16 (MOP). Bound antibodies were detected with alkaline phosphatase-conjugated second antibodies, with 4-nitro blue tetrazolium chloride and 5-bromo-4-chloro-3-indolyl phosphate (NBT/BCIP) as chromogen.

(45) which is shown in italics. Three nucleotide replacements, due possibly to allelic polymorphism, were found in exon 2 (underlined). Exon boundaries, determined by comparison of genomic and cDNA sequences (Fig. 4D), are indicated by vertical arrows.

cleotides were obtained.
DNA sequencing of these fragments revealed the presence of several mRNAs whose sequences diverged completely from the others starting 28 nt upstream from the ATG initiation codon (Fig. 1; M3). These results suggested that usage of alternative promoters and exons might be involved in the generation of the observed mRNA diversity. To determine the precise molecular mechanism responsible for generation of such heterogeneous mRNAs, we isolated and characterized the pig gene for EP 24.16 (MOP), which had previously been shown to be present as only a single copy (27).

Differences among 5′-Terminal
Structural Organization and 5′-Untranslated Exons of EP 24.16 (MOP) Gene-To isolate the pig gene for EP 24.16 (MOP), we screened approximately 2 × 10^6 independent plaques of a pig genomic library in EMBL3 (CLONTECH) using the PAB-L1 cDNA clone (27) as the probe. We isolated and mapped more than 50 clones, and then we subcloned and sequenced the phage fragments for the identification of exons. The exon-intron organization of the pig gene for EP 24.16 (MOP) was deduced from an analysis of 11 independent clones, each of which contained part of the gene (Fig. 2). The gene extends over 100 kilobases and contains 16 exons and 15 introns of various sizes (Figs. 2B and 4B). All the introns have typical splice donor and acceptor boundaries (Fig. 4D) (35).
Comparison of the nucleotide sequences of the genomic clones PAB-G32 and PAB-G33 (Fig. 2D) with those of the products of 5′-RACE (Fig. 1) allowed us to identify the alternatively spliced leader exons (Figs. 2, A and B, and 6A). Six distinct species of mRNA for EP 24.16 (MOP) appeared to be generated by differential use of three sites for initiation of transcription located upstream of exons 1, 2, and 3, respectively, and by the alternative splicing of exon 4; exons 1, 2, and 3 are mutually exclusive (Fig. 6A). Exon 1 encodes a putative mitochondrial targeting sequence, (M)IVRCLSAARRLHR (Fig. 6D), which is rich in basic amino acids and can be expected to form an amphipathic helix (36). The common exons 5 through 16 are used to assemble the functional domain of the enzyme. The zinc-binding motif HEFGH is encoded by exon 12 (Fig. 4B). The extreme 3′ exon, exon 16, encodes the last 44 amino acid residues, the termination codon, and the 3′-untranslated sequences that include three polyadenylation signals, a short interspersed repetitive element (SINE or PRE-1), and an AT repeat, all of which were identified previously by cDNA cloning (27).
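The combinatorics described in the preceding paragraph can be illustrated with a short sketch. The Python below is only an illustration and not part of the analysis reported here: the names LEADER_EXONS, COMMON_EXONS, enumerate_transcripts, and count_residues are hypothetical, and the sketch assumes that every transcript carries exactly one leader exon (1, 2, or 3), optionally exon 4, and then the common exons 5 through 16. It also tallies the Arg/Lys and Asp/Glu residues in the exon 1 segment quoted in the text.

```python
from itertools import product

# Assumption: exons 1, 2, and 3 are mutually exclusive leader exons,
# exon 4 is either included or skipped, and exons 5-16 are always present.
LEADER_EXONS = (1, 2, 3)
COMMON_EXONS = tuple(range(5, 17))

# Exon 1 segment as quoted in the text, with the initiator Met included.
TARGETING_SEGMENT = "MIVRCLSAARRLHR"


def enumerate_transcripts():
    """Return every leader-exon / exon-4 combination as a tuple of exon numbers."""
    transcripts = []
    for leader, with_exon4 in product(LEADER_EXONS, (False, True)):
        exon4 = (4,) if with_exon4 else ()
        transcripts.append((leader,) + exon4 + COMMON_EXONS)
    return transcripts


def count_residues(peptide, residues):
    """Count how many positions in the peptide carry one of the given residues."""
    return sum(1 for aa in peptide if aa in residues)


transcripts = enumerate_transcripts()
for exons in transcripts:
    print("-".join(str(e) for e in exons))

print("mRNA species:", len(transcripts))
print("Arg/Lys residues:", count_residues(TARGETING_SEGMENT, "RK"))
print("Asp/Glu residues:", count_residues(TARGETING_SEGMENT, "DE"))
```

Under these assumptions the sketch yields six exon combinations, and the quoted 14-residue segment contains four arginines and no acidic residues, in line with the description of a basic leader sequence.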
There appears to be a "pseudo-exon" that encodes a protein that resembles a ribosomal protein (11.5 kDa, L44 (37)) in reverse orientation (3′ to 5′) within the untranslated region of the 3′-most exon (Fig. 4B). The sequence encoding the homolog of ribosomal L44 is flanked by the direct repeat TGTTTTAGAGAATTT and has a poly(A) tract, suggesting that the pseudogene might have arisen as a result of retroposition.
Structural Comparison of the Genes for EP 24.15 (TOP) and EP 24.16 (MOP)-We wondered whether the complexity of organization of the gene for EP 24.16 (MOP) might be reflected in the genes for other members of the thimet oligopeptidase family and, to this end, we also characterized the gene for EP 24.15 (TOP). The gene for EP 24.15 (TOP) was isolated from the same pig genomic DNA library as that used for isolation of the gene for EP 24.16 (MOP), and it was found to have a much simpler structure in its 5′-region (Figs. 3 and 4B). The gene exists as a single copy, as revealed by Southern blot analysis (data not shown); it spans approximately 45 kilobase pairs (Fig. 4A); and it is organized into 13 exons. The overall organization of the two genes is very similar with the exception of the length of introns and the 5′-leader and untranslated exons (Fig. 4B). For example, exons 2-12 of the gene for EP 24.15 (TOP) correspond precisely to exons 5-15 of the gene for EP 24.16 (MOP) and there is strong conservation of the respective exon-intron boundaries (Fig. 4D), suggesting evolution from a common ancestor. The zinc-binding motif HEFGH is encoded within exon 9. The 3′-terminal exon 13 is composed of a short coding sequence, the termination codon, and the entire 3′-untranslated sequence. The promoter region of the gene for EP 24.15 (TOP) lacks the TATA box but contains several putative binding sites for ubiquitous factors including one CCAAT box, three Sp1 sites, one NF-1 site, one AP-1 site, and two AP-2 sites (data not shown).
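The exon correspondence stated above amounts to a constant offset of three exon numbers. As a minimal sketch (the function name mop_exon_for and the constant TOP_TO_MOP_OFFSET are hypothetical, introduced only for illustration), the mapping can be written out and checked against the two positions of the HEFGH motif given in the text:

```python
# Assumption: exons 2-12 of the EP 24.15 (TOP) gene pair one-to-one, in order,
# with exons 5-15 of the EP 24.16 (MOP) gene, as stated in the text.
TOP_TO_MOP_OFFSET = 3


def mop_exon_for(top_exon):
    """Return the MOP exon number that corresponds to a TOP exon number."""
    if not 2 <= top_exon <= 12:
        raise ValueError("correspondence is stated only for TOP exons 2-12")
    return top_exon + TOP_TO_MOP_OFFSET


pairs = {top: mop_exon_for(top) for top in range(2, 13)}
for top, mop in pairs.items():
    print(f"TOP exon {top:2d} <-> MOP exon {mop:2d}")
```

With this offset, exon 9 of the TOP gene maps to exon 12 of the MOP gene, which agrees with the exons reported to encode the zinc-binding motif in the two genes.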
Identification of Three Major Sites of Transcription Initiation-Characterization of the 5′-ends of mRNAs for EP 24.16 (MOP) by 5′-RACE (Fig. 1) revealed the presence of multiple sites for initiation of transcription, with at least one upstream of each of exons 1, 2, and 3. To determine the transcription start sites, we performed S1 nuclease mapping using poly(A)+ RNA from pig liver and three probes, which were complementary to exons 1, 2, and 3, respectively. The locations of these probes are indicated in Fig. 5A. As we had expected, we found three sites (Fig. 5A): one located 172 nt upstream of the first Met (ATG) codon of exon 1; another located 106 nt upstream of the Met codon of exon 2; and the third located 23 nt downstream of the TATA box close to exon 3.
Putative Sites for Binding of Transcription Factors Near Sites of Initiation of Transcription-Inspection of the sequence of 5′-flanking regions of exons 1, 2, and 3, which we designated promoter regions P1, P2, and P3, respectively (Fig. 4B), revealed potential cis-acting DNA elements (Fig. 5, B and C). Promoter regions 1 and 2 are very GC-rich and lack the TATA and CAAT boxes that are typical of eukaryotic class II promoters; promoter 3 contains a conserved TATA box, which begins 23 nt upstream of the previously identified 5′-end of exon 3. The sequences upstream of exons 1 and 2 contain several putative binding sites for transcription factors AP-2 and Sp1. AP-2 mediates enhanced transcription as a result of stimulation by protein kinase C, cAMP-dependent protein kinase A, and retinoic acid (38-40); Sp1 is a protein that binds to the GC box specifically and is often involved in the regulation of so-called housekeeping genes (41). The upstream region of exon 3 includes consensus binding sites for the transcription factors Myb (product of the myeloblastosis oncogene), AP-1, and GATA-1. The presence of multiple binding sites for hematopoiesis-specific factors is intriguing: Myb has been demonstrated to be important in the control of the proliferation and differentiation of hematopoietic cells (42), while GATA-1 was originally found as an erythroid-specific factor (43). The Myb-binding site immediately downstream of the TATA box for exon 3 is of particular interest since such a juxtaposed arrangement of a TATA box and a Myb target sequence was recently demonstrated to serve as a Myb-suppressible promoter (44).

(52), and dipeptidyl carboxypeptidase (Dcp) from E. coli (53). The extent of homology is indicated, as a percentage, below the corresponding exons. Bold numbers indicate relatively strongly conserved regions, with homology scores more than five points above the average. Italic numbers indicate poorly conserved regions, with homology scores more than five points lower than the average.

S1 nuclease mapping (A) and sequences of promoter regions (B and C). A, 5 μg of poly(A)+ RNA from pig liver (lanes 2, 4, and 6) or yeast tRNA (lanes 1, 3, and 5) were
Differential Subcellular Localization of EP 24.16 (MOP) Directed by Alternatively Generated Species of mRNA-The results described above suggest that the organization of the 5′-region of the gene for EP 24.16 (MOP) is unusually complex and that six mRNA species with different 5′ termini are generated as a consequence of the use of separate promoters (Fig. 5) and the splicing of the 5′-leader exons 1 through 3 (in a mutually exclusive manner) and of exon 4. The cDNA sequences corresponding to the six species of mRNA are shown schematically in Fig. 6A, and they were used for the expression experiments described below. It should be noted that exon 1 has an in-frame ATG codon (designated M1), when connected directly to exon 5, and the open reading frame in exon 1 encodes a putative signal peptide for import into mitochondria; exon 2 also has an in-frame ATG codon (M2) in an appropriate context for the initiation of translation (Fig. 1) (45), and the open reading frame predicts an enzyme with 64 more amino acids at its amino terminus than the product generated by the open reading frame that starts with an ATG codon (M3) in the common exon 5 (Figs. 1 and 6A). The fact that exon 1 could encode an amino-terminal leader sequence for targeting to mitochondria strongly suggests that, upon selection or elimination of the sequence of exon 1 via differential utilization of the multiple promoters, the subcellular localization of the products of the gene for EP 24.16 (MOP) is strictly and efficiently controlled. To confirm this possibility, we carried out the following experiments.
Targeting of Proteins to Different Subcellular Locations by Alternative Usage of Promoters: EP 24.16 (MOP) Represents the First Example of Such a Mechanism for Intracellular Peptidases/Proteinases-Proteins, after their synthesis, must be delivered to their sites of action. Delivery is usually accomplished with the help of terminal or internal targeting sequences. Sequences for targeting proteins to the following sites have been identified: mitochondria, endoplasmic reticulum, lysosomes, nuclei, and peroxisomes.
The presence of a putative mitochondrial targeting sequence at the amino terminus of the precursor to EP 24.16 (MOP) was first deduced by Serizawa et al. (24) from the potential ability of this sequence to form an amphipathic α-helix with a hydrophobic and a positively charged face of the type expected for a mitochondrial leader sequence (36,46). This scenario explains the presence of EP 24.16 (MOP) in mitochondria. The enzyme is, however, known also to be present in the cytosol and, prior to the present study, the mechanism responsible for this distribution of EP 24.16 (MOP) had remained unclear. Discovery of 5′-end variants of the mRNA for EP 24.16 (MOP) by the 5′-RACE technique led us to investigate the genetic basis for such diversity. Analysis of the structure of the gene led to the identification of the three 5′-leader exons that are selected, in a mutually exclusive manner, by use of alternative promoters and splicing, and it provided the following resolution of this problem (Fig. 7). If promoter 1 is used, the mitochondrial isoform of EP 24.16 (MOP) is generated by splicing of exon 1, which has a sequence that encodes a signal for transport to mitochondria, to exon 5, which is the beginning of the common translated region that encodes the mature portion of the protein (type 1 in Fig. 6, A and D). The precursor form (704 amino acid residues) with the mitochondrial targeting sequence is processed to the mature mitochondrial form of 667 residues (Fig. 7). The type 1 transcript can also yield the cytosolic form of 681 amino acids when the M3 site of initiation of translation is used instead of the M1 site. If promoter 3 is used and exon 3, which lacks an in-frame ATG codon, is joined to exon 5 (type 3 in Fig. 6, A and D), the cytosolic isoform is produced from the ATG initiation codon in exon 5. The use of promoter 2, which directs the synthesis of a cytosolic variant, is discussed below.
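The relationships among the isoform lengths quoted above can be made explicit with a little arithmetic. The sketch below is illustrative only: the variable names are hypothetical, the derived values are not stated in this report, and the calculation assumes that all length differences arise at the amino terminus and that the 64-residue extension of the M2 product is measured relative to the 681-residue M3 product.

```python
# Lengths (amino acid residues) quoted in the text.
PRECURSOR_FROM_M1 = 704   # precursor with the mitochondrial targeting sequence
MATURE_MITOCHONDRIAL = 667  # processed form found in mitochondria
CYTOSOLIC_FROM_M3 = 681   # product of translation initiated at M3 in exon 5
M2_EXTENSION = 64         # extra amino-terminal residues of the M2 product

# Derived values (assumption: differences are purely amino-terminal).
removed_on_import = PRECURSOR_FROM_M1 - MATURE_MITOCHONDRIAL
m1_to_m3_extension = PRECURSOR_FROM_M1 - CYTOSOLIC_FROM_M3
predicted_long_form = CYTOSOLIC_FROM_M3 + M2_EXTENSION

print("residues removed on import:", removed_on_import)
print("residues between M1 and M3:", m1_to_m3_extension)
print("predicted length of the M2 (long) form:", predicted_long_form)
```

On these assumptions, 37 residues are removed from the precursor during import, M1 lies 23 residues upstream of M3, and the long form initiated at M2 would be 745 residues; the last figure is an inference from the quoted numbers rather than a reported measurement.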
Similar scenarios have been reported for several other enzymes that are known to occur and function in more than one subcellular compartment (for a recent review, see Ref. 47). Typical examples are the histidine and valine tRNA synthetases of Saccharomyces cerevisiae that are involved in protein synthesis in the cytosol and the mitochondria (48,49). In these cases, two types of transcript (long and short) are produced by alternative usage of promoters, and the long transcript yields the mitochondrial isoform exclusively, while the short transcript yields the cytosolic enzyme. In this way, adjustment of the levels of the proteins to the needs of each compartment is possible.

used. Sequencing with the same oligonucleotides was used for calibration of mobilities. Strategies for the preparation of single-stranded antisense DNA probes are shown on the right. B, the sequence of the 837-bp ApaI-XhoI fragment of pig genomic DNA that contained exons 1 and 2. Sites of initiation of transcription are indicated by arrows. Exon 1 and exon 2 are located in a very small region with a GC content of 67%. Nine binding sequences for Sp1, five binding sequences for AP-2, and two Rb control elements (RCE) are indicated. Capital letters represent exons and the deduced amino acid sequences are shown. C, the sequence of the 923-bp BglII-EcoRI fragment of pig genomic DNA that contained exon 3. A site of initiation of transcription is indicated by an arrow. The sequence includes a TATA box at position -23 relative to the site of initiation of transcription of exon 3, three binding sites for Myb, one for AP-1, and one for AP-2. The Myb-binding site near the TATA box is very similar to the sequence that is found in the c-erbB-2 promoter that has been shown to suppress this gene (44).
Although the biological significance of this mechanism in the present case is not immediately apparent since the true substrates of the enzyme have not yet been identified, the general regulation of expression of the gene for an oligopeptidase by transcription regulatory factors and the unique regulation, as reported herein, of targeting of the product by use of alternative promoters seem to provide a powerful method by which cells can modulate the concentration of specific peptides in certain intracellular compartments to reflect the metabolic state of the cell.
Complex Organization of the Pig Gene for EP 24.16 (MOP)-As compared with the gene for EP 24.15 (TOP), another member of the thimet oligopeptidase family, the gene for EP 24.16 (MOP) has quite a complicated structure in its 5′- and 3′-regions. The two genes do, however, exhibit extensive similarity in the regions that encode the mature proteins, which consist of 11 exons, namely, exons 2 through 12 in the case of the gene for EP 24.15 (TOP) and exons 5 through 15 in the case of the gene for EP 24.16 (MOP). The similarity suggests that these two genes and, probably, the genes for other members of this family were generated from an ancestral gene as distinct sequences as a consequence of gene duplication. The presence of a SINE in the 3′-untranslated region of the gene for EP 24.16 (MOP) suggests that insertion of a SINE after the gene duplication might have destabilized the gene for EP 24.16 (MOP) and stimulated extensive diversification of the 5′- and 3′-regions by recruiting the entire gene for ribosomal protein L44 (in the reverse orientation) into the 3′-most exon (exon 16) and the 5′-leader exons into the 5′-flanking region by, perhaps, retroposition and gene conversion.
are indicated by boxes and numbered (box patterns: black, coding regions for the peptidase; hatched, reading frames encoding the amino-terminal extensions; and white, non-coding regions). Three sites for initiation of transcription (Fig. 5) are indicated by arrows. The deduced amino acid sequences corresponding to the reading frames of exons 1, 2, and 5 are shown in single-letter code, with translational initiation sites indicated by bold and large letters. Basic amino acid residues which are necessary for mitochondrial targeting sequences are indicated by bold letters. Alternative initiation of the transcription of exons 1-3 (P1-P3) and alternative splicing of exon 4 generate six isoforms of the mRNA. B, subcellular localization of the products of translation of isoforms of the cDNA for EP 24.16 (MOP) expressed in COS-1 cells. Six cDNA species, identified by 5′-RACE (Fig. 1 and panel A of this figure), were expressed in COS-1 cells and the products were detected by Western blotting. Subcellular fractions were obtained by differential centrifugation, as follows: cyt, cytosol (100,000 × g supernatant); mit, mitochondria (7,000 × g pellet); mic, microsomes (100,000 × g pellet). Construction of cDNA used for expression is indicated by boxes on the right of panel A. C, resolution of mitochondrial and cytosolic forms of EP 24.16 (MOP) by SDS-PAGE in the high-resolution buffer system described in the text. M1a, precursor of the mitochondrial form generated by use of the first site (M1) for initiation of translation; M1b, the processed mature form imported into mitochondria; M3, the cytosolic form generated by translation from the M3 site of initiation of translation (for details see Fig. 7). D, nucleotide and amino acid sequences of the type 1, 1′, 2, and 3 isoforms. Three codons for initiation of translation, M1, M2, and M3, are present in exon 1, exon 2, and exon 5, respectively. The mitochondrial targeting sequence of EP 24.16 (MOP), containing six arginine residues (24), is underlined. Serizawa et al. (24,54) determined the amino-terminal amino acid sequences of the isoforms of EP 24.16 (MOP) purified from mitochondria and the cytosol and showed that the mitochondrial form has a Ser residue at its amino terminus and that the major cytosolic form begins with Thr; the Ser and Thr residues are indicated by white lettering on a black background.
Long and Short Forms of EP 24.16 (MOP)-The use of promoter 2 of the gene for EP 24.16 (MOP) yields the type 2 transcript, which is predicted to encode an amino-terminally elongated product (Fig. 7). Consistent with this prediction, we detected a long form of the protein in the cytosol of COS-1 cells transfected with the type 2 construct after SDS-PAGE and Western blotting (Fig. 6B, lane 10). Although the relative abundance of the corresponding mRNA in the pig liver, as estimated from the data after 5′-RACE, is low (<10%), the physiological significance of this form clearly merits further study. The roles of the extended amino-terminal region of 64 amino acid residues (Fig. 6D) could include stabilization of the enzyme, modulation of the substrate specificity, and/or mediation of interactions with other cytosolic proteins.
In our analysis, we also noticed the presence of a splice variant that lacked the sequence of exon 15.² This variant should encode a protein with a short and slightly different carboxyl-terminal tail. The functional significance of this variant and the tissue- and development-specific regulation of the splicing will be the subject of further research. The presence of at least two forms of EP 24.16 (MOP) has also been demonstrated in purified preparations of the enzyme from rabbit and pig liver (50,51).
The type 1, type 2, and type 3 species of mRNA all have splice variants, designated type 1′, type 2′, and type 3′, respectively, with an extra exon sequence (5′-untranslated exon 4, Fig. 7), but none of them results in a long open reading frame from the first initiation codon. It is therefore unknown whether these variants have any biological significance. However, the possibility exists that the variant species of mRNA might contribute to the regulation of rates of translation. Alternatively, they might produce short peptides with as yet unidentified functions. In the case of the type 1′ transcript, the insertion of the sequence of exon 4 provides a mechanism by which the synthesis of the mitochondrial isoform is suppressed even under conditions under which promoter 1 is active but there is no mitochondrial requirement for the oligopeptidase.
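The way in which three sites of initiation of transcription and the optional inclusion of exon 4 combine to give six species of mRNA can be illustrated with a minimal sketch. It is only an enumeration of the nomenclature used here, not part of the analysis: the names `START_SITES` and `transcript_types` are hypothetical, the pairing of each promoter with its first exon is assumed from the description of exons 1-3 (P1-P3), and an ASCII apostrophe stands in for the prime symbol.

```python
from itertools import product

# Assumed pairing of each transcript type with the exon at which
# transcription starts (P1 -> exon 1, P2 -> exon 2, P3 -> exon 3).
START_SITES = {"1": "exon 1", "2": "exon 2", "3": "exon 3"}


def transcript_types():
    """Enumerate transcript types: three start sites x exon 4 skipped/included."""
    types = []
    for (number, first_exon), has_exon4 in product(START_SITES.items(), (False, True)):
        # A trailing apostrophe (prime) marks the variant that includes exon 4.
        name = "type " + number + ("'" if has_exon4 else "")
        types.append({"name": name, "first_exon": first_exon, "exon4": has_exon4})
    return types


for t in transcript_types():
    print(t["name"], "| starts at", t["first_exon"], "| exon 4 included:", t["exon4"])
```

The sketch deliberately says nothing about the protein products, since the translation products of the exon 4-containing variants have not been established.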
Conclusion-Analysis of the gene for EP 24.16 (MOP) revealed the very complex organization of the gene and the presence of a variety of transcripts generated by differential use of multiple sites of initiation of transcription and by alternative splicing of exons 2, 3, 4, and 15. In contrast to these complexities, a simple and definitive answer was obtained to the question of how the product of a single gene for EP 24.16 (MOP) is delivered to two different cellular compartments, namely, the cytosol and the mitochondria.