Characterization of Drosophila Carboxypeptidase D*

Metallocarboxypeptidase D (CPD) is a 180-kDa protein that contains three carboxypeptidase-like domains, a transmembrane domain, and a cytosolic tail and that functions in the processing of proteins that transit the secretory pathway. An initial report on the Drosophila melanogaster silver gene indicated a CPD-like protein with only two and a half carboxypeptidase-like domains with no transmembrane region (Settle, S. H., Jr., Green, M. M., and Burtis, K. C. (1995) Proc. Natl. Acad. Sci. U. S. A. 92, 9470–9474). A variety of bioinformatics and experimental approaches were used to determine that the Drosophila silver gene corresponds to a CPD-like protein with three carboxypeptidase-like domains, a transmembrane domain, and a cytosolic tail. In addition, two alternative exons were found, which result in proteins with different carboxypeptidase-like domains, termed domains 1A and 1B. Northern blot, reverse transcriptase PCR, and sequence analysis were used to confirm the presence of the various mRNA forms. Individual domains of Drosophila CPD were expressed in insect Sf9 cells using the baculovirus expression system. Media from domain 1B- and domain 2-expressing cells showed substantial enzymatic activity, whereas medium from domain 1A-expressing cells was no different from medium from cells infected with wild-type virus. Domains 1B and 2 were purified, and the enzymatic properties were examined. Both enzymes cleaved substrates with C-terminal Arg or Lys, but not Leu, and were inhibited by conventional metallopeptidase inhibitors and some divalent cations. Drosophila domain 1B is more active at neutral pH and greatly prefers C-terminal Arg over Lys, whereas domain 2 is more active at pH 5–6 and slightly prefers C-terminal Lys over Arg. The differences in pH optima and substrate specificity between Drosophila domains 1B and 2 are similar to the differences between duck CPD domains 1 and 2, suggesting that these properties are essential to CPD function.

Upon cloning and sequencing of gp180 cDNA (2), it became clear that this duck protein was the homolog of the newly discovered rat and bovine enzyme named CPD (3) and the silver (svr) gene of Drosophila melanogaster (4). CPD is thought to work together with endopeptidases such as furin to process peptides and proteins that transit the secretory and endocytic pathways (5,6). Unlike carboxypeptidase E (CPE; also known as CPH) and all other members of the metallocarboxypeptidase gene family, CPD is unique in that it contains multiple carboxypeptidase domains (7). Human, rat, mouse, and duck CPD contain three carboxypeptidase-like domains followed by a transmembrane domain and a 58-residue cytosolic tail (2, 8-10). Of the three carboxypeptidase-like domains, only the first two have enzyme activity toward standard substrates (11,12). Several of the key catalytic residues are missing from the third carboxypeptidase-like domain of human, rat, and duck CPD, consistent with the observed lack of enzyme activity toward standard carboxypeptidase substrates (11,12); the function of this highly conserved domain is not known.
The enzymatic properties of the first and second domains of duck CPD differ in several key aspects. The first domain is more active at neutral pH than acidic values, whereas the second is more active in the pH 5-6 range than at neutral pH (11,12). The combination of the two domains therefore results in an enzyme with a broader pH optimum than either domain alone. This is important in that CPD is present in the trans-Golgi network, the secretory and reuptake pathways, and transiently on the cell surface (13)(14)(15). The pH in these various compartments ranges from near neutral to acidic (pH 5-6), so the broad pH range presumably gives CPD the ability to function optimally in each compartment. In addition, the substrate specificities of the first and second domains also differ. Although both domains are highly specific for C-terminal basic amino acids, the first domain prefers C-terminal Arg, whereas the second domain prefers C-terminal Lys (12).
Mutation of the Drosophila svr gene can affect viability, pigmentation, wing shape, catecholamine metabolism, and the behavioral response to light, depending on the severity of the mutation (16). The gene was isolated, and the corresponding cDNA was sequenced in 1995 and found to encode a protein containing two full carboxypeptidase-like domains followed by half of a third domain and then a stop codon (4). A variety of mRNA forms were identified, ranging from 1.5 to 6 kb, suggesting alternative splicing (4). In addition, a partial clone of an alternatively spliced cDNA (named the 1a form) was found, which encoded a protein with a different N-terminal sequence (4). However, the partial sequence of this 1a form appeared to lack one of three key metal-binding residues, all of which are required for enzymatic activity, thus raising the possibility that this 1a form was inactive. Three members of the human metallocarboxypeptidase family (CPX-1, CPX-2, and AEBP-1) are also missing key metal-binding and/or other catalytic residues and do not cleave standard carboxypeptidase substrates (17)(18)(19).
To gain insight into the significance of the various carboxypeptidase-like domains of CPD, the Drosophila genome and various expressed sequence tag (EST) data bases were analyzed for CPD-like sequences. Then Northern blot analysis was performed with exon-specific probes in order to determine the forms of mRNA and the resulting proteins. The two different first carboxypeptidase-like domains (1A and 1B) and the second carboxypeptidase-like domain were individually expressed using the baculovirus expression system, and their enzymatic properties were determined. The finding that the first carboxypeptidase-like domain (the 1B form) and the second carboxypeptidase-like domain differ in their pH optima and specificities for C-terminal Lys versus Arg indicates that this is a fundamental aspect of CPD structure/function that has been conserved from Drosophila to vertebrates.

Analysis of the Drosophila Genome and EST Data Bases
The Berkeley Drosophila Genome Project Web site (www.fruitfly.org) and the FlyBase data base (flybase.bio.indiana.edu) were searched with a variety of carboxypeptidase sequences, including rat CPE, rat CPD, human carboxypeptidases A and B, and the Drosophila svr gene product. In addition to the genomic Drosophila sequence, data bases of EST clones were also screened with the various carboxypeptidase sequences. Five EST clones (LP08595, GH13060, LD23786, LD28490, and LP12324) were purchased (Invitrogen) and sequenced in both directions. Reverse transcriptase-PCR (RT-PCR) was also used to confirm the predicted splicing patterns of the various exons. For the predicted C-terminal region, one of the gene prediction programs in the FlyBase data base had predicted two short introns. RT-PCR was used with oligonucleotides corresponding to regions flanking these predicted introns, and the PCR product was analyzed on agarose gels. In addition, the PCR product was subcloned into the pCR4-TOPO vector (Invitrogen), and ~20 clones were isolated and sequenced.
The presence of an N-terminal signal peptide was predicted using the Web site www.cbs.dtu.dk/service/SignalP (20). The three-dimensional structure of each carboxypeptidase-like domain of the Drosophila CPD was analyzed using the PSSM Web site (www.sbg.bio.ic.ac.uk/~3dpssm/). Amino acid sequence identity among various carboxypeptidases was determined using GenePro (Hoeffer Scientific).
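The percent identity values reported for the carboxypeptidase domains can be illustrated with a short calculation. The following is a minimal sketch only; it assumes the two sequences have already been aligned, that gap positions are written as "-", and that identity is taken over ungapped columns. How GenePro defines the denominator is not stated in the text, and the function name and toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumption: the inputs are two rows of a pairwise alignment of equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns that contain a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment, not taken from any CPD sequence.
seq_a = "HGNEALGR-EL"
seq_b = "HGGEPLSRQEL"
identity = percent_identity(seq_a, seq_b)
print(f"{identity:.1f}% identity")
```

Other conventions (for example, dividing by the length of the shorter sequence) would give somewhat different values, so identities computed with different tools are only roughly comparable.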

Northern Blot Analysis
Total RNA was prepared using the Qiagen RNeasy protect minikit, and ~20 μg was fractionated on an agarose gel containing 2% formaldehyde. After photography of the ethidium bromide-stained gel, the RNA was transferred to a nitrocellulose membrane (Optitran; Schleicher & Schuell) and probed with fly CPD riboprobes. To generate exon-specific probes, PCR was performed using oligonucleotides corresponding to the 5′- and 3′-ends of exons 1A, 1B, 3, 6, and 8. The template for the PCR of exon 1A was the EST clone LP12324, and the reaction product was 510 nucleotides. For PCR of exons 1B and 3, the EST plasmid LD28490 was used, and the reaction products were 380 and 485 bp, respectively. For exons 6 and 8, RT-PCR was performed using Drosophila mRNA, and the reaction products were 470 and 465 bp, respectively. All PCR products were subcloned into the pCR4-TOPO vector (Invitrogen) and verified by sequence analysis. To generate riboprobe, the plasmids were linearized with either SpeI or NsiI, and the appropriate enzyme (T3 or T7 RNA polymerase) was used to generate antisense cRNA with [32P]UTP using standard procedures (21). Approximately 5 × 10^6 cpm of each probe was hybridized with the nitrocellulose blot in 5 × SSC, 50% formamide, 5 × Denhardt's solution, 1% SDS, and 100 μg/ml denatured salmon sperm DNA at 60°C overnight. After hybridization, blots were successively washed with 2 × SSC containing 0.1% SDS, 1 × SSC containing 0.1% SDS, and then 0.1 × SSC containing 0.1% SDS buffer at 65°C. Blots were dried and exposed to x-ray film (X-Omat Blue XB-1; Eastman Kodak Co.) for 3 days at −80°C with an intensifying screen.

Expression of CPD Domains in Baculovirus and Enzyme Purification
Plasmid Construction-For expression in the baculovirus system, PCR was used to generate a cDNA fragment corresponding to the various carboxypeptidase domains, and this fragment was subcloned into the pVL1393 baculovirus expression vector (Pharmingen) downstream of the signal peptide sequence derived from rat CPE. This CPE signal peptide sequence has previously been found to produce high levels of secreted proteins using the baculovirus expression system (22). The Drosophila CPD 1A and 1B constructs were created by amplifying 1.1-kb fragments using EST clones LP12324 and LD28490, respectively, as templates with Tgo DNA polymerase (Roche Molecular Biochemicals). The Drosophila CPD 2 construct was produced using Drosophila mRNA and RT-PCR with SUPERSCRIPT™ II RNase H− Reverse Transcriptase (Invitrogen) and then Platinum Taq DNA polymerase High Fidelity (Invitrogen) to amplify a 1.2-kb product. The 5′-end of each PCR product contained a restriction site (XbaI or SpeI) that was in-frame with the XbaI site immediately following the prepro-CPE sequence in the baculovirus expression construct pCPE20 (22). The 3′-end of each PCR product contained a stop codon followed by the NotI restriction site. The PCR product was purified (PCR purification kit; Qiagen), digested with XbaI or SpeI and NotI, and ligated into XbaI/NotI-digested pCPE20. All resulting plasmids were confirmed by sequencing in both directions.
Baculovirus Expression and Protein Purification-The three baculovirus expression plasmids (2.5 μg each) were separately combined with 0.25 μg of Baculoplatinum viral DNA (Orbigen) and used to transfect 10^6 Sf9 cells using the standard procedure recommended by Orbigen. The recombinant virus was amplified in Sf9 cells as described (11). For analysis of enzyme activity and subsequent protein purification, 100-300 ml of Sf9 cells at 2 × 10^6 cells/ml were infected with recombinant virus, and after 3 days, the cells were removed by centrifugation at 30,000 × g for 30 min. The supernatant was removed and either assayed directly (described below) or purified on a p-aminobenzoyl-Arg-Sepharose affinity resin (23). For domain 1B, 100 ml of supernatant was adjusted to 100 mM NaAc, pH 5.5, and applied to a 0.5-ml p-aminobenzoyl-Arg-Sepharose affinity resin column. The column was washed with 50 ml of 0.1 M NaAc, pH 5.5, containing 1 M NaCl and 1% Triton X-100 and then rinsed with 10 ml of 10 mM NaAc, pH 5.5. The resin was eluted with 5 ml of 50 mM Tris-HCl, pH 8.0, containing 0.01% CHAPS and 0.05% Triton X-100 (Elute 1) and then with 5 ml of the same buffer containing 25 mM Arg (Elute 2). Enzyme activity was found in Elute 2. For domain 2, 300 ml of supernatant was dialyzed for 48 h against several changes of 50 mM NaAc, pH 5.5, and applied to a 5-ml p-aminobenzoyl-Arg-Sepharose affinity resin column. The column was washed with 0.1 M NaAc, pH 5.5, containing 0.5 M NaCl and 0.5% Triton X-100; rinsed with 10 mM NaAc, pH 5.5; and eluted with 5 × 10 ml of Elute 1 buffer (above) and then 5 × 10 ml of Elute 2 buffer (above). Most of the enzyme was found in Elute 2. Domain 1A was also tested with the p-aminobenzoyl-Arg-Sepharose affinity resin but was not found to bind under any of the above conditions, even when 1 mM ZnCl2 was included in the binding buffer (data not shown).
Western Blot Analysis-Proteins were fractionated on SDS-PAGE gels and electrophoretically transferred to nitrocellulose (Optitran; Schleicher & Schuell). The nitrocellulose blots were blocked with 5% nonfat milk in 10 mM Tris, pH 7.4, 150 mM NaCl, 0.1% Tween 20 for 1 h at room temperature. The blots were probed with 1:1000 dilutions of polyclonal rabbit antisera raised against either duck CPD (AE 178) or rat CPD (AE 142). Following exposure of the blot to primary antiserum, the enhanced chemiluminescence method (Pierce) was used to detect bound antiserum.

Enzyme Assays
Carboxypeptidase Assay with Fluorescent Substrates-Enzyme activity was typically assayed in triplicate with 200 μM dansyl-Phe-Ala-Arg in 100 mM Tris acetate, pH 7.4, buffer (for domain 1B) or pH 5.7 buffer (for domain 2) containing 0.01-0.1% Triton X-100 in a final volume of 250 μl. The reaction was stopped with 100 μl of 0.5 N HCl after 30-45 min at 37°C, 2 ml of chloroform was added, and the tubes were mixed and centrifuged at 150 × g for 2 min. The amount of product (dansyl-Phe-Ala) was determined by measuring the fluorescence of the lower chloroform phase (excitation, 350 nm; emission, 500 nm). The pH optima of purified enzymes were determined using 100 μM dansyl-Phe-Ala-Arg in 50 mM Tris acetate buffer at various pH values at 37°C. To examine the effect of inhibitors and metals, purified enzymes were preincubated with the inhibitor for 15 min at room temperature. After the preincubation, substrate (dansyl-Phe-Ala-Arg) was added to 200 μM final concentration, and the reaction was incubated at 37°C for 4-5 h. For kinetic analysis, the purified enzymes were dialyzed (Centricon 30) to remove traces of Arg and incubated with substrate (final concentration 0.1, 0.2, 0.4, 0.6, 0.8, and 1 mM) for 1 h at 37°C. Km and kcat were calculated using SigmaPlot 2001 Enzyme Kinetic Module (Jandel Scientific).
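The kinetic analysis can be illustrated with a small worked example. The sketch below is not the SigmaPlot procedure used in this study; it substitutes a Hanes-Woolf linearization ([S]/v plotted against [S], with slope 1/Vmax and intercept Km/Vmax) fitted by ordinary least squares. The substrate concentrations are those listed above, but the rates are simulated from assumed parameter values, and the enzyme amount and all function names are hypothetical.

```python
def simulate_rates(substrate_mM, km_mM, vmax):
    """Michaelis-Menten rates v = Vmax*[S]/(Km + [S]) for assumed parameters."""
    return [vmax * s / (km_mM + s) for s in substrate_mM]


def fit_hanes_woolf(substrate_mM, rates):
    """Estimate (Km, Vmax) from a least-squares line through [S]/v versus [S]."""
    xs = list(substrate_mM)
    ys = [s / v for s, v in zip(substrate_mM, rates)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                   # equals 1/Vmax
    intercept = mean_y - slope * mean_x  # equals Km/Vmax
    vmax = 1.0 / slope
    km = intercept * vmax
    return km, vmax


def turnover_number(vmax, enzyme_amount):
    """kcat = Vmax / [E]; both arguments must use matching amount units."""
    return vmax / enzyme_amount


# Substrate concentrations from the assay description (mM).
substrate_mM = [0.1, 0.2, 0.4, 0.6, 0.8, 1.0]
# Simulated, noise-free rates; Km = 0.3 mM and Vmax = 2.0 are illustrative only.
rates = simulate_rates(substrate_mM, km_mM=0.3, vmax=2.0)

km_fit, vmax_fit = fit_hanes_woolf(substrate_mM, rates)
kcat = turnover_number(vmax_fit, enzyme_amount=0.05)  # hypothetical enzyme amount
print(f"Km = {km_fit:.3f} mM, Vmax = {vmax_fit:.3f}, kcat/Km = {kcat / km_fit:.1f}")
```

With real, noisy data a direct nonlinear fit of the Michaelis-Menten equation is generally preferred, because linearizations distort the error structure; the linear form is used here only to keep the example free of external dependencies.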

RESULTS
Analysis of the Drosophila genomic and EST data base sequences as well as RT-PCR and sequencing revealed that the CPD gene comprises eight exons (Fig. 1). There are three alternatively spliced first exons. Exon 1A encodes the sequence previously found as a partial cDNA sequence and named 1a (4), exon 1B corresponds to a cDNA sequence previously named 1b (4), and exon 1C corresponds to an EST clone (discussed further below). The original sequence report of the svr gene had a missing nucleotide, which led to a frameshift and the introduction of a stop codon in the middle of the third carboxypeptidase-like domain (4). Analysis of EST data base sequences confirmed that the previously reported svr gene sequence was incorrect and that there is a long open reading frame encoding three carboxypeptidase-like domains, a transmembrane domain, and a cytosolic tail. The sizes of the genomic BamHI and EcoRI fragments previously identified (4) match those predicted from the Drosophila genomic sequence (Fig. 1).
Exon-specific probes were generated to examine the mRNA forms of Drosophila CPD. Northern blot analysis revealed a common 6.5-kb form of CPD mRNA that was detected with all exon probes (Fig. 2). In addition, probes for exons 1B and 3 showed major bands of 3.4, 2.8, 1.7, and 1.5 kb (Fig. 2). These same species were also detected upon longer exposure of the blot probed with exon 1A (not shown), although the levels were much weaker relative to the 6.5-kb form. The exon 6 and 8 probes did not detect any of the smaller forms of CPD mRNA (Fig. 2). The forms of CPD mRNA were also examined by EST data base searches. A number of cDNA clones encoding Drosophila CPD were identified, and several of these were obtained and fully sequenced in both directions. The results of this analysis, together with the Northern blot data, are indicated in Fig. 3. The 1A and the 1B exons are found in both "long" and "short" forms. The N-terminal regions of the proteins encoded by the first ATGs in exons 1A and 1B are predicted to be signal peptides. There is no apparent "pro" region following the signal peptide of the proteins encoded by either the 1A or 1B form of Drosophila CPD, which is consistent with the lack of a pro region in vertebrate CPD. In addition to the 1A and 1B forms, a third exon 1 (designated 1C) was found in a single EST cDNA sequence. The protein encoded by the 1C form of the mRNA does not contain an N-terminal signal peptide and has a truncated carboxypeptidase-like domain.
The long forms of Drosophila CPD mRNA encode three carboxypeptidase-like domains, each separated by a short "bridge region" (b1 and b2) and then a transmembrane domain and cytosolic tail (Fig. 3A). Although the mRNA appears as a single species of ~6.5 kb, this appears to represent at least two and possibly three different splice forms, each encoding a different cytosolic tail protein sequence (Fig. 3B). The unspliced form, referred to as tail-1, encodes a cytosolic tail of 67 amino acids (Fig. 3C). The form with the γ segment spliced out, referred to as tail-2, encodes a cytosolic tail of 100 amino acids (Fig. 3, B and C). Both of these forms were detected in roughly equal amounts when RT-PCR was used to amplify this region (data not shown). In addition to these two forms, a third form lacking both the γ segment and the α segment (Fig. 3, B and C) was predicted in the FlyBase data base, but no cDNA clones to this region were detected either experimentally or in the various EST data bases.
The short forms of Drosophila CPD mRNA result from the failure to remove the intron between exons 4 and 5. This results in a single carboxypeptidase-like domain, part of the first bridge domain (b1), and then a unique C-terminal sequence of 10 amino acids followed by a stop codon (Fig. 3A). There is considerable variation in the length of the 3′-untranslated region of these short mRNA forms. Two of the exon 1B-containing cDNA clones obtained from the EST library correspond to the 1.7- and the 2.8-kb forms of 1B mRNA found in the Northern blots. These two clones differ in the site used for polyadenylation. An exon 1A-containing cDNA clone found in an EST library used a third polyadenylation site. Because the polyadenylation site is presumably unrelated to the presence of exon 1A or 1B, it is likely that all three polyadenylation sites are used for all of the short forms. Furthermore, the presence of additional bands on the Northern blots suggests that either additional polyadenylation sites are used or additional upstream exons may be included in some of the forms.
This region of exon 8 has been arbitrarily designated α, β, γ, and δ. The unspliced exon 8 produces the sequence designated "tail-1." The mRNA species in which the γ region has been removed is designated "tail-2," and the predicted species lacking both the α and the γ regions has been designated "tail-3." The location of the BamHI site within this region of exon 8 is indicated for tail-1 and tail-2. C, deduced amino acid sequences of the tail-1, tail-2, and tail-3 splice forms. The first 5-13 amino acids correspond to the end of the putative transmembrane domain. Representative cDNA sequences have been deposited in GenBank™. Accession numbers are AF545816 (1A "long" form with tail 1), AF545817 (1B "long" form with tail 1), AF545818 (1B "long" form with tail 2), AF545819 (1A "short" form), and AF545820 (1B "short" form).
The various carboxypeptidase-like domains were compared with each other and to selected members of the E/N subfamily of metallocarboxypeptidases. Domains 1A and 1B have highest amino acid sequence identity with each other (82%), although this includes the common region encoded by exons 2-4. The unique regions of these two proteins show 42% amino acid identity. Both the 1A and 1B forms of Drosophila CPD have 30-40% amino acid sequence identity with Drosophila CPD domain 2 and with various vertebrate carboxypeptidases (Table I). Interestingly, Drosophila domain 1B does not show higher amino acid sequence identity to the first domain of duck or rat CPD and has slightly higher sequence identity with the second domain of these proteins (Table I). The second domain of Drosophila CPD does show higher amino acid sequence identity to the second domains of duck and rat CPD relative to the first or third domains of these proteins (Table I). The third domain of Drosophila CPD shows low amino acid sequence identity to all domains of duck and rat CPD and to other members of the gene family (Table I).
In addition, the third domain of Drosophila CPD lacks many of the residues required for substrate binding and enzymatic activity, although it is predicted to fold into a carboxypeptidase-like structure. Domains 1A, 1B, and 2 are also predicted to fold into carboxypeptidase-like structures. Domains 1B and 2 contain all of the metal- and substrate-binding residues in the expected positions and are predicted to encode active enzymes. Domain 1A contains nearly all of the active site residues and only lacks one of the metal-binding His residues (the His in the position equivalent to His69 of bovine carboxypeptidase A, the reference numbering system for active site residues). In domain 1A, this critical His is replaced by a Gln, which is not predicted to bind Zn2+ with high affinity.
The third domain of duck CPD was previously found to be inactive toward standard carboxypeptidase substrates, and because the Drosophila third domain has even fewer of the key catalytic residues, this domain was not further studied. However, carboxypeptidase domain 1A appeared to only lack a single metal-binding residue, so this was included in further studies investigating enzyme activity. The individual carboxypeptidase domains 1A, 1B, and 2 were expressed in baculovirus under the polyhedrin promoter and with the signal peptide of rat CPE (in order to provide a uniform 5′-untranslated region and signal peptide for all three constructs). Western blot analysis of medium from cells expressing wild-type virus using an antiserum to duck CPD revealed a major band of 64 kDa (Fig. 4A), which corresponds to either a viral or an Sf9 cell protein. The media from cells infected with the domain 1A and domain 1B constructs showed bands of ~50 kDa, and the medium of cells infected with the domain 2 construct showed a band of 54 kDa, in addition to the common 64-kDa band (Fig. 4A). Similar results were obtained using an antiserum raised against purified rat CPD (data not shown). When assayed with the standard CPD substrate, dansyl-Phe-Ala-Arg, media from both domain 1B- and domain 2-expressing cells showed substantial amounts of activity at pH 5.6 and 7.4 (Table II). The domain 1B medium was more active at pH 7.4, whereas the domain 2 medium was more active at pH 5.6. Medium from domain 1A-expressing cells showed carboxypeptidase activity that was comparable with the medium from cells infected with wild-type virus (Table II).
To further study CPD domains 1B and 2, they were purified using a substrate affinity column previously used to isolate vertebrate CPD (3,11). Both preparations of enzyme showed a major band of the expected size when analyzed by denaturing gel electrophoresis and silver staining (Fig. 4B). Drosophila domain 1B is maximally active at pH values between 7 and 8, whereas domain 2 is maximally active in the pH 5-6.5 range (Fig. 5). The pH optima of the purified enzymes are not substantially different from those of the unpurified medium (Fig. 5). Domain 1B is not inhibited by 1 mM phenylmethylsulfonyl fluoride or iodoacetamide or by 1 μM guanidinoethylmercaptosuccinic acid, an active site-directed inhibitor (Table III). Domain 2 shows a slight degree of inhibition by 1 mM iodoacetamide and is more strongly inhibited by 1 μM guanidinoethylmercaptosuccinic acid. As previously found for duck CPD (11,12), both domains of Drosophila CPD are inhibited by the thiol-directed reagent para-chloromercuriphenyl sulfonate, with the second domain more sensitive than the first domain (Table III). The metallopeptidase nature of Drosophila CPD is evident from the substantial inhibition observed in the presence of the Zn2+ chelator 1,10-phenanthroline (Table III). EDTA is less potent as an inhibitor than 1,10-phenanthroline.
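The pH profiles are compared after expressing each activity relative to the maximum observed for that enzyme. A minimal sketch of that normalization is given below; the activity values are invented for illustration and are not data from this study, and the function names are hypothetical.

```python
def normalize_to_max(activity_by_ph):
    """Express each activity as a percentage of the highest activity observed."""
    peak = max(activity_by_ph.values())
    if peak <= 0:
        raise ValueError("maximal activity must be positive")
    return {ph: 100.0 * activity / peak for ph, activity in activity_by_ph.items()}


def ph_optimum(activity_by_ph):
    """Return the tested pH value at which activity was highest."""
    return max(activity_by_ph, key=activity_by_ph.get)


# Invented activities (arbitrary units) for a neutral-optimum enzyme.
example_profile = {5.0: 1.0, 6.0: 2.5, 7.0: 4.5, 7.5: 5.0, 8.0: 4.0}

normalized = normalize_to_max(example_profile)
optimum = ph_optimum(example_profile)
print(optimum, normalized)
```

Normalizing in this way allows two enzymes with very different specific activities to be plotted on the same axis, at the cost of hiding the absolute difference in activity between them.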
Domains 1B and 2 of Drosophila CPD show differences in their sensitivity to various divalent cations. Concentrations of Zn2+ of 10 μM or higher inhibit domain 1B, whereas domain 2 is not substantially affected by a 10 or 100 μM concentration of this ion (Fig. 6). Both domains are activated by Co2+, with domain 1B slightly more sensitive to this metal than domain 2. Cu2+ and Hg2+ inhibit domain 2 more potently than they inhibit domain 1B (Fig. 6). Low concentrations of Cd2+ and Ni2+ slightly activate domain 1B but not domain 2, whereas 1 mM concentrations of these ions inhibit the two activities (Fig. 6).
a The overall amino acid sequence identity between protein domains 1A and 1B is 82%, but this includes the common exons 2, 3, and 4; the amino acid sequence identity between the coding region of exon 1A and exon 1B is only 42%.
2) were fractionated on a denaturing polyacrylamide gel and transferred to nitrocellulose. The Western blot was probed with a 1:1000 dilution of polyclonal rabbit antiserum raised against duck CPD (AE 178). Following exposure of the blot to primary antiserum, the enhanced chemiluminescence method (Pierce) was used to detect bound antiserum. B, purified proteins were analyzed on a denaturing polyacrylamide gel and silver-stained as described (32). The positions and molecular masses (in kDa) of prestained standards are indicated.
a Activity was determined with 200 μM dansyl-Phe-Ala-Arg. Units are nmol of product formed/min/media from 10^6 cells, with less than 10% variation between duplicate determinations.
Both domains of Drosophila CPD cleave dansyl-Phe-Ala-Arg with generally comparable Km values (Table IV). However, the Vmax for domain 1B is substantially higher than that of domain 2, and the resulting kcat/Km values for the two domains vary 10-fold. Because comparable substrates with a C-terminal Lys residue are not available, we employed another assay using shorter substrates, FA-Ala-Arg and FA-Ala-Lys. Unfortunately, these compounds showed a large amount of substrate inhibition at concentrations greater than 200 μM, so it was not possible to evaluate kinetic parameters. Using a single concentration of 100 μM, domain 1B cleaved FA-Ala-Arg ~10 times faster than FA-Ala-Lys (Fig. 7). No detectable hydrolysis of FA-Gly-Leu was observed (Fig. 7), even upon prolonged incubation (data not shown). Domain 2 also did not show any detectable hydrolysis of the Leu-containing substrate but cleaved FA-Ala-Lys slightly faster than it cleaved FA-Ala-Arg. Domain 2 was markedly less active toward either substrate than domain 1B; to produce comparable amounts of product formation, 5-fold more domain 2 protein was used than domain 1B protein (Fig. 7). Thus, the 10-fold difference between kcat/Km observed between domains 1B and 2 for cleavage of dansyl-Phe-Ala-Arg is comparable with the difference in activity observed for cleavage of 100 μM FA-Ala-Arg.

DISCUSSION
The major finding of the present study is that key enzymatic differences between the first and second carboxypeptidase domains within vertebrate CPD are conserved in Drosophila CPD (this is discussed further below). In addition, the endogenous forms of Drosophila CPD have been deduced from a combination of bioinformatics and experimental approaches. A previous report on the sequence of the Drosophila svr gene contained a nucleotide error that incorrectly caused a termination of the open reading frame in the middle of the third carboxypeptidase-like domain (4).
FIG. 5. pH optima of Drosophila CPD domains 1B (D1B) and 2 (D2). CPD activity was determined using 100 μM dansyl-Phe-Ala-Arg in 50 mM Tris acetate buffer at the indicated pH, measured at 37°C. Activity was normalized to the maximal activity detected at the optimal pH.
The correct sequence of this region encodes a full-length protein with three complete carboxypeptidase-like domains, a transmembrane domain, and a cytosolic tail. Thus, this svr gene product is clearly a homolog of mammalian and duck CPD based on the identical domain organization. In some databases, the svr gene is described as a CPE or CPH homolog (these are two names for the same protein).
Whereas it is possible that the svr gene product performs some of the same functions as CPE (discussed below), CPE contains only a single carboxypeptidase domain and thus is quite distinct from the svr gene product.
There are eight members of the N/E subfamily of carboxypeptidases in humans but only two in Drosophila; one is CPD, and the other has the highest amino acid sequence identity to CPM. 2 The lower number of N/E-type carboxypeptidases in Drosophila is consistent with estimates that this organism contains fewer genes than humans. One way to increase the diversity from a small number of genes is by alternative splicing of the exons. The present analysis of Drosophila CPD is consistent with this concept. The long form of Drosophila CPD containing the 1B exon (form 1B long) (Fig. 3) is most like mammalian CPD in that it contains two active carboxypeptidase domains followed by a carboxypeptidase-like domain that lacks many critical active site residues and then a transmembrane domain and cytosolic tail. The Drosophila 1B short form is more like mammalian CPN or CPZ than CPD in that this form contains a single carboxypeptidase domain lacking any membrane-association domain and is maximally active at neutral pH. Because the Drosophila 1A short form lacks a critical active site residue and has no detectable enzyme activity, this form is more like the mammalian proteins CPX-1, CPX-2, and AEBP-1, all of which lack one or more critical residues and have no detectable activity toward dansyl-Phe-Ala-Arg (17)(18)(19)24). The second domain of Drosophila CPD has enzymatic properties that are similar to those of mammalian CPE: a pH optimum in the 5-6 range and a slight preference for C-terminal Arg over Lys (11,12). Thus, the single Drosophila svr gene appears to encode proteins that have properties in common with seven of the mammalian members of this family: CPD, CPN, CPZ, CPE, CPX-1, CPX-2, and AEBP-1.
The proposal that the Drosophila svr gene is the functional equivalent of multiple mammalian carboxypeptidases is consistent with the degree of amino acid sequence identity among various proteins. Drosophila CPD domain 1B has essentially the same amino acid sequence identity (39-40%) with rat CPD, rat CPE, human CPM, human CPN, and human CPZ. Although the second domain of the Drosophila CPD has a slightly higher sequence identity to the corresponding domain of duck CPD, the various mammalian carboxypeptidases show a more equal level of sequence identity. Despite the moderate conservation of amino acid sequences between Drosophila and vertebrate CPD proteins, the key enzymatic properties of the first and second domains of duck CPD are highly conserved in Drosophila domains 1B and 2. This finding implies that the difference in pH optimum and substrate specificity between the first and second domains of CPD is an essential feature. Other differences between the properties of the first and second domains, such as sensitivity to divalent cations and various enzyme inhibitors, are not as well conserved between Drosophila and duck CPD. However, these features are not physiological; none of the divalent cations that affect CPD activity are present in vivo at the concentrations that influence enzyme activity. Although Ca2+ is typically present at millimolar levels in the secretory pathway, this ion does not affect enzyme activity of either domain. Thus, the only physiological differences appear to be substrate specificity and pH, and together the first and second domains of CPD enable the enzyme to efficiently remove both Lys and Arg residues at all pH values present in the exocytic and endocytic pathways.
Other peptide-processing enzymes exist as multidomain enzymes in mammals, such as peptidyl glycine α-amidating monooxygenase and angiotensin-converting enzyme. Peptidyl glycine α-amidating monooxygenase functions in the formation of C-terminal amide residues in neuropeptides and consists of a hydroxylase and a lyase (25). Although both of these activities are present in Drosophila, they result from distinct genes (26). Similarly, mammalian angiotensin-converting enzyme is generally found with two tandem active domains (except for the single-domain testis form), but in Drosophila two distinct genes encode the active forms (27–29). As with CPD, the functional properties of the individual peptidyl glycine α-amidating monooxygenase and angiotensin-converting enzyme domains appear to be conserved between Drosophila and mammals, suggesting that these differences are evolutionarily important. In the case of peptidyl glycine α-amidating monooxygenase, the two domains perform distinct functions, which are both required for the formation of C-terminal amide residues (25, 26). The two angiotensin-converting enzyme domains are somewhat redundant, with only subtle differences in substrate specificity and buffer conditions for optimal activity (27, 30), as found for CPD.
The function of the third carboxypeptidase-like domain within CPD is not clear. Although a full-length third domain is present in Drosophila, the amino acid sequence identity between this region and the corresponding region of vertebrate CPD is extremely low. This contrasts with the relatively high (82%) amino acid sequence identity between the third domain of duck and rat CPD (9). The third domain of Drosophila CPD is predicted by the PSSM Web site program to fold into a metallocarboxypeptidase-like structure, and it is possible that the general structural requirement of this region is the important feature that has been conserved rather than a substrate-binding function. It is possible that the 1A domain of Drosophila CPD functions as a catalytically inactive substrate-binding domain. Metallocarboxypeptidases require Zn2+ for enzymatic activity but not for substrate binding (31). The absence of one of the three Zn2+-binding residues in domain 1A suggests that this protein will have a reduced affinity for Zn2+, if it binds at all, and this will translate into a reduced catalytic activity. However, the absence of Zn2+ may not affect substrate binding; further studies are needed to address this possibility and to test whether the addition of divalent cations can activate domain 1A.
² G. Sidyelyeva and L. D. Fricker, unpublished results.
Previously, Settle et al. (4) found that lethality and some, but not all, features of the svr mutant phenotype could be rescued with a 13.4-kb genomic BamHI fragment that contains most of the CPD gene. This fragment could produce all of the short forms of CPD and the long forms corresponding to tail-1 and tail-3. However, the BamHI site is located within the coding region of the tail-2 form, and this fragment would therefore result in a truncated protein lacking 40 C-terminal amino acids. Within this C-terminal region are two acidic clusters and a casein kinase 2 consensus site; similar domains are present in vertebrate CPD and have been found to be important in the intracellular routing of this protein (13–15). Thus, it is possible that the 13.4-kb genomic BamHI fragment failed to correct all of the features of the mutant phenotype because of the improper intracellular trafficking of the tail-2 form of the CPD protein. One of the nonlethal svr mutants, svr^poi, has a 0.9-kb insertion in the 1.5-kb EcoRI gene fragment (4). Because this corresponds to exons 5 or 6, it is likely that this mutant would express all of the short forms of CPD found in the present study. Thus, it appears that the first domain alone is sufficient for survival. Further studies are needed to define the precise role of the specific CPD forms in various phenotypes found in the different svr mutants.