Protease Trafficking in Two Primitive Eukaryotes Is Mediated by a Prodomain Protein Motif*

Trypanosome protozoa, an early lineage of eukaryotic cells, have proteases homologous to mammalian lysosomal cathepsins, but the precursor proteins lack mannose 6-phosphate. Utilizing green fluorescent protein as a reporter, we demonstrate that the carbohydrate-free prodomain of a trypanosome cathepsin L is necessary and sufficient for directing green fluorescent protein to the lysosome/endosome compartment. A proper prodomain/catalytic domain processing site sequence is also required to free the mature protease for delivery to the lysosome/endosome compartment. A nine-amino acid prodomain loop motif, implicated in prodomain-receptor interactions in mammalian cells, is conserved in the protozoa. Site-directed mutagenesis now confirms the importance of this loop to protease trafficking and suggests that a protein motif targeting signal for lysosomal proteases arose early in eukaryotic cell evolution.

Sorting of newly translated lysosomal proteases may occur by two different mechanisms. In mammalian cells, the predominant mechanism is the addition of phosphomannosyl residues and targeting to the lysosome pathway by binding to mannose 6-phosphate (M6P) receptors within the Golgi (1). However, transport of lysosomal enzymes in many cells is unaffected by a deficiency in the phosphotransferase, which is required for M6P synthesis (2). M6P receptor-independent membrane association has been reported for several lysosomal proteins (3). Confirmation and further analysis of an M6P-independent sorting pathway in mammalian cells have been complicated by both the presence of the M6P pathway itself and difficulty in distinguishing effects on protein folding versus protein trafficking when deletional mutants of the protease precursors were analyzed (4,5). Trypanosoma cruzi and Leishmania mexicana are protozoa (Trypanosoma) representing one of the earliest lineages of eukaryotic cells. They nevertheless express proteases homologous to cathepsins L and B and, unlike yeast, have an organelle ultrastructure more reminiscent of mammalian cells (6). While the cathepsin L-like protease prodomains of these organisms have significant homology to mammalian procathepsin L (e.g. 45% identity for cruzain versus mouse cathepsin L), there are no carbohydrate modifications, so they are unique experimental models for analyzing M6P-independent protein sorting.
Trypanosome cysteine proteases are synthesized as precursor proteins with a hydrophobic signal peptide, a 100-122-amino acid prodomain, a 200-210-amino acid catalytic domain, and, in most cases, a 100-130-amino acid carboxyl-terminal domain (7-12). The function of the carboxyl-terminal domain is as yet unknown. Hypotheses that it plays a role in protease inactivation, or in facilitating folding of the catalytic domain, have been ruled out by expression of fully active recombinant proteases without this domain (10). Other proposed functions have included mediating intracellular trafficking of the protease (13), immune evasion (14), and facilitating activity against specific macromolecular substrates (15).
The prodomain of cysteine proteases has two well-defined functions: maintaining the enzyme in an inactive form (zymogen) until it reaches an appropriate site of protease function, and serving as a structural template to ensure proper folding during translation (10). The prodomain can also act as a reversible inhibitor and a stabilizer of the mature protease (16) and has been implicated in protease precursor trafficking (17,18). One hypothesis for the role of the prodomain in trafficking suggests that a peptide motif near the amino terminus is recognized and bound by a membrane receptor within the Golgi (18). This receptor-prodomain interaction is proposed to direct protease precursors to appropriate cellular compartments. Release of free, active catalytic domain from the prodomain-receptor complex is thought to occur in a downstream compartment and to require a final step of proteolytic processing.
We analyzed the contribution to intracellular protease trafficking of each of the three major domains of the T. cruzi cathepsin L-like protease, "cruzain," by transfection of constructs containing coding regions for the domains coupled to GFP as a reporter gene. We show that the prodomain is necessary and sufficient for trafficking. A nine-amino acid sequence motif, homologous to the putative membrane binding region of procathepsin L (18), was identified and its function confirmed by site-directed mutagenesis. Finally, the requirement of proteolytic release of the catalytic domain from the prodomain-receptor complex was confirmed by analysis of mutants in which the prodomain/catalytic domain processing site was altered. The results of these studies suggest that a protein motif trafficking pathway for lysosomal proteases arose early in eukaryotic cell evolution and may continue to function in higher eukaryotes independent of the M6P pathway.
EXPERIMENTAL PROCEDURES
Oligonucleotides-The synthetic DNA oligonucleotides used were synthesized on a Perkin-Elmer Applied Biosystems 394 DNA synthesizer at the Biomolecular Resource Center (University of California San Francisco) (see Table I).
PCR Amplification Reactions-PCR amplifications were for 25 cycles in a volume of 50 μl, containing 2 units of Pwo polymerase, 0.1 μg of each primer, and 2 ng of plasmid DNA as template on a Perkin-Elmer DNA thermal cycler. Amplification conditions were: 1-min denaturation at 94°C, 1-min annealing at 55°C, 2-min extension at 72°C.
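As a quick check on instrument time, the cycling program above can be tallied as follows. This is a minimal sketch: the variable and function names are illustrative, and ramp times between temperatures are ignored because the text does not state them.

```python
# Cycling parameters as stated in the text; the dictionary layout is illustrative.
CYCLES = 25
STEPS_MIN = {
    "denaturation_94C": 1,
    "annealing_55C": 1,
    "extension_72C": 2,
}


def total_cycling_minutes(cycles: int, steps_min: dict) -> int:
    """Return the summed hold time across all cycles, ignoring ramp time."""
    return cycles * sum(steps_min.values())


total_minutes = total_cycling_minutes(CYCLES, STEPS_MIN)
print(total_minutes)  # 100
```

Under these assumptions, the hold steps alone account for 100 min per amplification.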
Construction of pT-GFP and pT-Pre-GFP-The green fluorescent protein coding sequence was generated by PCR amplification using pS65T-C1 (CLONTECH) as DNA template and GFP1 and GFP2 oligonucleotides as primers. For pT-Pre-GFP we used GFP3 instead of GFP1. GFP3 contains the signal peptide of cruzain. The amplification product containing a multicloning site at the 3′ region of the GFP gene was treated with SpeI and ClaI restriction enzymes, gel-purified, and then ligated into the appropriate site in the pTEX plasmid (26) to generate pT-GFP or pT-Pre-GFP.
Construction of pT-Pro-GFP and pT-ProCat-GFP (see Diagrams in Figs. 1 and 3)-The prodomain (Pro) and prodomain-catalytic domain (ProCat) of the cruzain gene were amplified from pCheYTc (10) using the oligonucleotide primers Pro1 and Pro2, and Pro1 and Cat2, respectively. A SpeI site was incorporated in all the primers. Following digestion with SpeI, the PCR products were gel-purified and ligated into pT-GFP. This procedure resulted in the generation of Pro and ProCat coding regions of cruzain fused upstream of the GFP coding sequence in the pT-GFP vector. The fused domains were in-frame with the GFP sequence, as verified by DNA sequencing. The specific sequences from cruzain in each case were Leu -115 to Thr 14 for pT-Pro-GFP and Leu -115 to Gln 195 for pT-ProCat-GFP. Both of these constructs included the cleavage site between the pro- and catalytic domains, but neither included the cleavage site between the catalytic and carboxyl-terminal domains. Amino acids are numbered based on the papain numbering system (34).
Construction of pT-GFP-Ctd-The carboxyl-terminal domain (Ctd) of the cruzain gene (Tyr 186 to Leu 342) contained in pCheYTc (10) was PCR-amplified using oligonucleotide primers Ctd1 and Ctd2. Either HindIII or BamHI sites were included in the primers to facilitate cloning of the resultant PCR product into the multicloning site of the GFP gene in pT-Pre-GFP. This amplification product was digested and ligated as described above.
Construction of pT-Pro′Cat-GFP-A PCR-based mutagenesis procedure was used to generate V3D and V2P substitutions at the cleavage site of the prodomain (Fig. 3D). Two separate PCR reactions were done with primers Pro1 and Mut2 in one reaction and primers Mut1 and Cat2 in a second reaction (27). These reactions were carried out using 1 ng of plasmid pCheYTc and the PCR conditions described above. Two DNA fragments were generated by the reactions. Following agarose gel electrophoresis, these fragments were recovered and pooled, and 1 μl of the mixture was used in a second round of PCR amplification with primers Pro1 and Cat2 and an annealing temperature of 60°C. A resulting 1-kilobase product was cloned into pT-GFP to generate pT-Pro′Cat-GFP.
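The logic of the two-round mutagenesis described above can be illustrated with a small string-based simulation. This is a sketch under stated assumptions, not the procedure of reference 27: the template is a toy sequence (not the cruzain sequence), a single base is changed, and the function name and `flank` parameter are hypothetical.

```python
def overlap_extension(template: str, pos: int, new_base: str, flank: int = 4) -> str:
    """Simulate two-round PCR mutagenesis on a DNA string.

    Round 1 yields a left fragment (start .. mutation + flank) and a right
    fragment (mutation - flank .. end), both carrying the substitution.
    Round 2 joins them through their shared overlap.
    """
    if pos < flank or pos + flank >= len(template):
        raise ValueError("mutation site too close to the template ends")
    mutated = template[:pos] + new_base + template[pos + 1:]
    left = mutated[:pos + flank + 1]   # analogue of the Pro1 + Mut2 product
    right = mutated[pos - flank:]      # analogue of the Mut1 + Cat2 product
    overlap = left[pos - flank:]       # region shared by both fragments
    assert right.startswith(overlap), "fragments must overlap to anneal"
    return left + right[len(overlap):]


# Toy template: ATG GTT GTT GGT GCT CCA (hypothetical, for illustration only).
template = "ATGGTTGTTGGTGCTCCA"
# Changing index 4 from T to A turns the first GTT codon (Val) into GAT (Asp).
product = overlap_extension(template, pos=4, new_base="A")
print(product)  # ATGGATGTTGGTGCTCCA
```

Both first-round fragments carry the substitution in their shared region, so the second round with the outer primers regenerates a full-length product that differs from the template only at the mutated site.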
Transfection-The transfection procedure used was a modification of that described by Hariharan et al. (28). Parasite cultures were grown to mid-log phase in the appropriate media supplemented with 10% FBS. Cells were harvested by centrifugation, washed twice in electroporation buffer (100 mM NaCl, 3 mM KCl, 5 mM Na2HPO4, 2 mM KH2PO4, 0.5 mM MgCl2, 0.1 mM CaCl2), and resuspended at 10^8 cells/ml in electroporation buffer. Four-hundred fifty microliters of cell suspension was incubated on ice for 10 min in a cuvette containing 10 μg of plasmid DNA in 50 μl of water. The cells were electroporated with a single pulse using a Transfector-600 (BTX Corp., San Diego, CA) set at 400 V and 500 microfarads using an electrode with a 0.8-mm gap. The electroporated cells were left on ice for a further 10 min and then diluted into 5 ml of liver infusion tryptose/FBS. The parasites were incubated for 48 h before adding 20 μg/ml G418 (Geneticin, Life Technologies, Inc.). Transformants were selected by gradually increasing the G418 concentration to either 100 μg/ml for L. mexicana or 200 μg/ml for T. cruzi over 3 weeks. Parasites were subcultured every 5 days in the presence of G418 and analyzed as populations.
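The quantities in each cuvette follow directly from the figures above. The short calculation below is a sketch only: it reads the stated volumes and DNA mass as microliters and micrograms, assumes the two volumes are simply additive, and uses illustrative variable names.

```python
# Values taken from the transfection protocol; units are assumed to be
# microliters (volumes) and micrograms (DNA).
CELL_DENSITY_PER_ML = 1e8   # resuspension density in electroporation buffer
CELL_VOLUME_UL = 450        # cell suspension per cuvette
DNA_UG = 10                 # plasmid DNA per cuvette
DNA_VOLUME_UL = 50          # water carrying the DNA

cells_per_cuvette = CELL_DENSITY_PER_ML * CELL_VOLUME_UL / 1000
total_volume_ml = (CELL_VOLUME_UL + DNA_VOLUME_UL) / 1000
dna_ug_per_ml = DNA_UG / total_volume_ml

print(cells_per_cuvette)  # 45000000.0
print(total_volume_ml)    # 0.5
print(dna_ug_per_ml)      # 20.0
```

Each pulse is therefore applied to about 4.5 x 10^7 cells in 0.5 ml, with plasmid DNA at roughly 20 μg/ml.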
Detection of GFP/Confocal Microscopy-Cell pellets were washed twice with phosphate-buffered saline (4°C, 10 min, 3000 rpm) and fixed in 0.1 M sodium cacodylate buffer, pH 7.4, containing 2% paraformaldehyde and 1% sucrose. GFP fluorescence was first confirmed in fixed cells with a Zeiss Axioplan fluorescence microscope (excitation and emission of fluorescein). Each trypanosome preparation was imaged with a laser scanning microscope 410 (Carl Zeiss Inc., Thornwood, NY) equipped with an Axiovert 100 microscope (Zeiss), a 63×, 1.4 NA plan-APOCHROMAT objective lens (Zeiss), and an argon/krypton laser. The fluorescein-labeled probe and the propidium iodide were imaged separately. Alternatively, samples were imaged with an MRC 1024 unit (Bio-Rad) equipped with a Nikon Diaphot 200 inverted microscope. Propidium iodide was imaged using the 568-nm laser line and collecting emissions longer than 590 nm. For all specimens, a series of two-dimensional slice images, 0.5 μm apart, were acquired starting above the top surface of the section and extending below the bottom surface. A zoom factor of 2.0 was used during the scanning, resulting in a voxel size of approximately 0.2 μm for each specimen. Each slice voxel intensity was the average of at least three successive scans. Images were transferred to a UNIX workstation for archiving and analysis.

FIG. 1. Confocal microscopy of T. cruzi (A, C, E, and G) and L. mexicana (B, D, F, and H)

Digital image cytometry was done using the Quantitative Image Processing System in the Laboratory for Cell Analysis at the University of California San Francisco Cancer Center (29). The Quantitative Image Processing System images were acquired using a fluorescent Zeiss Axioscope equipped with a Xillix 1024 CCD camera and an automated filter wheel to select emission wavelengths. All mechanical controls for the system were under software control on a SUN workstation. Images were acquired sequentially for each fluorochrome, in addition to a bright field image, to show morphological features.
Western Blot-Western blots were performed with 1:1000 dilutions of the anti-GFP monoclonal antibody (CLONTECH) as the primary antibody, in phosphate-buffered saline, 5% non-fat milk, and detected using horseradish peroxidase-conjugated secondary antibody and ECL Western blot analysis system (Amersham Pharmacia Biotech).
Immunoelectron Microscopy-Immunoelectron microscopy was performed on frozen sections from epimastigotes fixed in 2% paraformaldehyde and 0.1% glutaraldehyde in phosphate-buffered saline, pH 7.4. Sections were probed with either rabbit antibody raised against recombinant cruzain (10, 21), conjugated to 15-nm gold (Amersham Pharmacia Biotech), or mouse anti-GFP monoclonal antibody (CLONTECH), conjugated to 10-nm gold. Controls were conducted with the nontransfected cell line and by omission of the respective primary antibody.
Lysosomal Localization-Epimastigotes were stained for 30 min with 20 nM red LysoTracker (Molecular Probes, Eugene, OR), a fluorescent probe (577 nm absorption, 590 nm emission) used to investigate protein localization in the lysosomes because of its selective accumulation in cellular compartments with low internal pH. Fixed and LysoTracker-stained epimastigotes were observed in a Zeiss microscope equipped with UV epifluorescence.
Homology-based Model of Cruzain Prodomain-The coordinates of human procathepsin L (Protein Data Bank deposition code 1CJL) (22) were used to model the proregion of cruzain. Coordinates were visualized with the program CHAIN (30). Optimal rotamer conformation for side chains of the cruzain sequence was selected based on the orientation of the corresponding side chain in the procathepsin L structure. These choices corresponded to a statistically likely rotamer conformation in all instances (31). Fig. 4 was generated with MOLSCRIPT (32).
Mutagenesis at the Conserved Nine-amino Acid Prodomain Motif-All substitution mutants were constructed using a PCR-based method (27) as above. The annealing temperature was varied empirically to maximize the yield of the products. The primers used to replace specific amino acids in each mutant are given in Table II. The extreme primers (Tex1 and Cat2) were common for the secondary reactions in all the mutants. The amplified fragments were gel-purified, digested with SpeI, and ligated in place of the corresponding wild type fragment in the pT-ProCat-GFP vector.
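The substitutions evaluated in this study (K42E, K44E, G46E, R47E, and A53E) each introduce a glutamic acid. The sketch below tabulates the approximate change in side-chain charge for each one. It is illustrative only: the charge table is a simplification (lysine and arginine +1, aspartate and glutamate -1, all other residues including histidine treated as neutral near pH 7), and the function name is hypothetical.

```python
import re

# Approximate side-chain charge near neutral pH (histidine treated as neutral here).
SIDE_CHAIN_CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def charge_shift(mutation: str) -> int:
    """Return the net charge change for a substitution written like 'K44E'."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation: {mutation}")
    old, _position, new = match.groups()
    return SIDE_CHAIN_CHARGE.get(new, 0) - SIDE_CHAIN_CHARGE.get(old, 0)


MUTANTS = ["K42E", "K44E", "G46E", "R47E", "A53E"]
shifts = {m: charge_shift(m) for m in MUTANTS}
print(shifts)
# {'K42E': -2, 'K44E': -2, 'G46E': -1, 'R47E': -2, 'A53E': -1}
```

Under this simplified model, replacing lysine or arginine shifts the local charge by two units and replacing glycine or alanine by one, which is why glutamic acid was expected to be the most disruptive replacement for a positively charged binding motif.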

FIG. 2. Confirmation of localization of ProCat-GFP in lysosome/endosome compartment (A) of T. cruzi by confocal microscopy with simultaneous staining with red LysoTracker (B).
T. cruzi epimastigotes were transfected with pT-ProCat-GFP as described under "Experimental Procedures." Bar = 10 μm. C, confirmation of colocalization of native protease and GFP fusion protein (ProCat-GFP) in lysosome-like organelles (arrows) of T. cruzi by immunoelectron microscopy with both an antibody to GFP (10-nm gold particle) and an antibody to cruzain (15-nm gold particle). T. cruzi epimastigotes were transfected with pT-ProCat-GFP as described under "Experimental Procedures." Bar = 1 μm.

RESULTS
Expression of GFP in L. mexicana Promastigotes and T. cruzi Epimastigotes-GFP was expressed in both trypanosomes using the pT vector system (Fig. 1A). Leishmania promastigotes expressing GFP required 2 weeks of selection in G418, while T. cruzi epimastigotes required 3 weeks. Expression of GFP alone (Fig. 1, A and B), without associated protease domains, resulted in distribution of the fluorescent protein throughout the cytoplasm of promastigotes or epimastigotes, including the flagellum, as observed previously in transfection of the related parasite, Leishmania major (19).
The Carboxyl-terminal Domain of Cruzain Does Not Direct GFP to L/E Compartments-Fig. 1, C and D, show that constructs that included the carboxyl-terminal domain, the last 20 amino acids of the catalytic domain, the natural processing site (VVGGP) between the catalytic domain and carboxyl terminus (downstream of GFP), and the signal peptide (Leu -115 to Ala -105) (upstream of GFP) did not alter the diffuse cytoplasmic localization of the fluorescent protein.
The Prodomain Is Sufficient for Directing GFP to L/E Compartment-Fig. 1, E and F, show that GFP is correctly directed to the lysosome/endosome (L/E) compartment of both T. cruzi and Leishmania when it is expressed downstream of the cruzain prodomain and Pro-Cat processing site (Leu -115 to Thr 14). Identical trafficking is observed if the entire cruzain catalytic domain is also included (Fig. 1, G and H). Intense fluorescence is seen in the "megasome" compartment of Leishmania promastigotes and the "reservasomes" of T. cruzi epimastigotes. These are both late endosome- or lysosome-like vacuoles identified by their size, ultrastructure, pH, and position within the cell relative to propidium iodide-labeled nucleus and kinetoplast (20,21). Localization to these compartments was confirmed by colocalization with a fluorescent marker of low pH cellular compartments, red LysoTracker (Fig. 2, A and B), and by immunoelectron microscopy with both an antibody to GFP and an antibody to the catalytic domain of cruzain (Fig. 2C).
The Prodomain Must Be Removed for the Final Steps of Protease Sorting-While a prodomain-membrane receptor interaction is proposed for proper trafficking of mammalian procathepsin L from endoplasmic reticulum or the Golgi compartment to lysosomes (3), the catalytic domain must eventually be cleaved from the prodomain to release soluble active protease (3). The effects of specific cysteine protease inhibitors on the ultrastructure of T. cruzi in culture suggested that the detrimental effects of the inhibitors are due to inhibition of autoproteolytic cleavage of the prodomain from the catalytic domain in a late Golgi compartment (21). In the presence of cysteine protease inhibitors, unprocessed cruzain accumulated in the Golgi and did not reach the L/E compartment. To test the hypothesis that the prodomain must be removed prior to final sorting of cruzain to this compartment, a mutant construct (Fig. 3D) was analyzed in which the processing site between prodomain and catalytic domain was altered. Western blot of T. cruzi extracts (Fig. 3E) showed that the protein product of the transfected construct was not processed to the catalytic domain-GFP fusion protein seen in wild type organisms. Fig. 3, A-C, show that failure to cleave the prodomain resulted in abnormal accumulation of GFP in the Golgi compartment at the expense of normal L/E targeting.
Alignment of Trypanosome and Mammalian Cysteine Protease Prodomain Sequences Identifies Putative Membrane Receptor Binding Motifs-The M6P-independent sorting pathway for mammalian procathepsin L is thought to involve specific motifs within the prodomain interacting with a microsomal membrane receptor (3). Alignment of the kinetoplastid prodomains with that of mammalian procathepsin L (Fig. 4A) reveals similarity in a nine-residue sequence, which, in procathepsin L, mediates prodomain-membrane association (3). A homology-based model of the prodomain, based on the crystal structure of procathepsin L (22), shows this motif is on a solvent-exposed loop between helices 1 and 2 at the amino terminus (Fig. 4B).

FIG. 3. Immunolocalization (A and C) and confocal microscopic image (B) of GFP fusion protein in T. cruzi epimastigotes transfected with pT-ProCat-GFP (mutated processing site). Immunoelectron microscopy was performed using an antibody to GFP (10-nm gold particle). Note diminished localization to lysosome/endosome (L) and accumulation in Golgi (G). C is a higher magnification to highlight accumulation of cruzain/GFP in Golgi stacks. The shift of cruzain/GFP to Golgi (G) from lysosome (L) is shown in B by confocal microscopy.

Mutagenesis of the Nine-amino Acid Motif Confirms Its Importance to Lysosomal Protease Trafficking-Based on the assumption that the arginine- and histidine-rich receptor binding motif would likely form salt bridges with negatively charged residues on the receptor, and because glutamic acid replacements in the motif would be most disruptive, five mutations were constructed and evaluated. All changed neutral or positively charged amino acids to glutamic acid. Two mutated residues were in helices flanking the loop region containing the target motif (K42E at the end of helix 1 and A53E at the beginning of helix 2, Fig. 4). Fig. 5 shows that, despite changing these amino acids to glutamic acid, normal targeting of GFP to the L/E compartment was observed. Three mutations within the loop (K44E, G46E, and R47E) all resulted in distribution of GFP throughout the cytoplasm of the cell (Fig. 5), similar to the distribution of GFP when GFP is expressed alone or with the COOH-terminal domain (Fig. 1, A-D). No retention in the endoplasmic reticulum was observed with any of the constructs, as would be expected if protein misfolding occurred.
DISCUSSION
We have shown that the prodomain of the cathepsin L-like cysteine protease, cruzain, is necessary and sufficient for directing GFP to the L/E compartment of two primitive eukaryotic unicellular organisms, T. cruzi and L. mexicana. In constructs lacking any protease domain, GFP was distributed throughout the cytoplasm, including the flagellum. Addition of the carboxyl-terminal domain failed to alter this distribution, indicating that the carboxyl-terminal domain is not sufficient to direct proper protein sorting. Correct sorting with the prodomain construct occurred whether or not the catalytic domain was included.
Coupled with previous analysis of Leishmania cysteine protease gene products (20,23), these results suggest that neither the carboxyl-terminal domain nor the catalytic domain is required in trafficking of trypanosome cysteine proteases, at least in the extracellular stages of T. cruzi and L. mexicana.
The pT-cruzain/GFP constructs correctly directed GFP in both L. mexicana and T. cruzi. This result suggests that key aspects of the cysteine protease trafficking pathway are shared between the two organisms. In fact, alignment of the kinetoplastid prodomains showed both significant homology between the prodomains of the two parasite species and significant similarity to mammalian cathepsin L in a nine-amino acid sequence implicated in mouse procathepsin L binding to microsomal membranes (3). Site-directed mutagenesis experiments confirmed that this motif, located on a solvent-exposed loop near the amino terminus of procathepsin L (22), must be intact if proper protease sorting is to take place.
Mammalian procathepsin L binds to a microsomal membrane receptor via its prodomain, which is later cleaved to release soluble mature protease into specific cellular compartments (18). Previous work by Eakin et al. (10) indicated that the prodomain of cruzain is autoproteolytically removed from the catalytic domain at pH 5-6 at the processing site indicated in Fig. 3 (10,24). To test whether this autoproteolysis step was necessary for final intracellular sorting, we prepared constructs in which the normal prodomain/catalytic domain processing site was mutated to prevent proteolytic cleavage by cruzain. By Western blot, the transfected gene product was not processed (Fig. 3) and the prodomain remained attached to the catalytic domain. Expression of this mutated protease-GFP construct resulted in a population of T. cruzi epimastigotes in which GFP now accumulated within the Golgi compartment. Based on these results, as well as previous studies showing that synthetic cruzain inhibitors cause accumulation of precursor cruzain in the Golgi (21), we propose that removal of the prodomain occurs in an acidified late Golgi or early post-Golgi compartment. A similar location for proteolytic processing has been observed for the lysosomal glycosidase of another primitive eukaryote, Dictyostelium discoideum (25).
The protozoa T. cruzi and L. mexicana are closely related species in one of the earliest lineages of eukaryotic cells. As experimental models they can provide a glimpse of basic protein sorting pathways that might have developed early in the evolution of the eukaryotic cell. By analogy to a model of M6P-independent trafficking of the lysosomal proteases cathepsin D and cathepsin L in mammalian cells (3,17), we propose that a membrane-associated receptor binds the prodomain of cruzain after its exit from the endoplasmic reticulum, presumably early in its Golgi transit. Evaluation of this model in primitive eukaryotic cells like T. cruzi is more straightforward because the cruzain prodomain lacks not only M6P, but any carbohydrate moiety (13). The observation that the nonglycosylated prodomain of the T. cruzi cysteine protease is necessary and sufficient for directing GFP to the L/E compartment of either organism is consistent with the model proposed by McIntyre and Erickson (17) for trafficking of mammalian procathepsin L. There is significant sequence similarity among these proteases in the nine-amino acid motif proposed to mediate procathepsin L binding to microsomal membranes (18). The observation that this motif is present on a solvent-accessible loop (22) and the results of the mutagenesis experiments reported here support the hypothesis of its role in prodomain-receptor interactions and the presence of a conserved M6P-independent pathway of lysosomal protease trafficking.