Originally published In Press as doi:10.1074/jbc.M112254200 on February 8, 2002

J. Biol. Chem., Vol. 277, Issue 17, 14954-14964, April 26, 2002

Identification and Characterization of Three Members of the Human Metallocarboxypeptidase Gene Family*

Suwen Wei‡, Sonia Segura§, Josep Vendrell§, Francesc X. Aviles§, Edith Lanoue, Robert Day, Yun Feng‡, and Lloyd D. Fricker‡‖

From the ‡ Department of Molecular Pharmacology, Albert Einstein College of Medicine, Bronx, New York 10461, § Institut de Biotecnologia i Biomedicina and Departament de Bioquímica i Biologia Molecular, Universitat Autònoma de Barcelona, 08193 Bellaterra, Barcelona, Spain, and Departement de Pharmacologie, Institut de Pharmacologie et Faculté de Médecine, Université de Sherbrooke, Sherbrooke, Québec J1H 5N4, Canada

Received for publication, December 21, 2001, and in revised form, February 5, 2002

    ABSTRACT

Amino acid homology searches of the human genome revealed three members of the metallocarboxypeptidase (metallo-CP) family that had not been described in the literature in addition to the 14 known genes. One of these three, named CPA5, is present in a gene cluster with CPA1, CPA2, and CPA4 on chromosome 7. The cDNA encoding a mouse homolog of human CPA5 was isolated from a testis library and sequenced. The deduced amino acid sequence of human CPA5 has highest amino acid sequence identity (60%) to CPA1. Modeling analysis shows the overall structure to be very similar to that of other members of the A/B subfamily of metallocarboxypeptidases. The active site of CPA5 is predicted to cleave substrates with C-terminal hydrophobic residues, as do CPA1, -2, and -3. Using Northern blot analysis, CPA5 mRNA is detected in testis but not in kidney, liver, brain, or lung. In situ hybridization analysis shows that CPA5 is localized to testis germ cells. Mouse pro-CPA5 protein expressed in Sf9 cells using the baculovirus system was retained in the particulate fraction of the cells and was not secreted into the media. Pro-CPA5 was not enzymatically active toward standard CPA substrates, but after incubation with prohormone convertase 4 the resulting protein was able to cleave furylacryloyl-Gly-Leu, with 3-4-fold greater activity at pH 7.4 than at 5.6. Two additional members of the human CP gene family were also studied. Modeling analysis indicates that both contain the necessary amino acids required for enzymatic activity. The CP on chromosome 8 is predicted to have a CPA-like specificity for C-terminal hydrophobic residues and was named CPA6. The CP on chromosome 2 is predicted to cleave substrates with C-terminal acidic residues and was named CPO.

    INTRODUCTION

Carboxypeptidases (CPs)¹ serve many important functions in a variety of organisms. Previously, there were 14 known members of the human metallo-CP gene family with functions ranging from the digestion of food to the selective biosynthesis of neuroendocrine peptides (1-11). All metallo-CPs, including those from bacteria and plants, can be divided into one of two subfamilies based on consideration of the overall domain structure as well as on amino acid sequence similarities. One subfamily, which includes the digestive enzymes CPA1, CPA2, and CPB1 has been previously referred to as the "pancreatic" subfamily but is more appropriately referred to as the "A/B" subfamily. All members of the A/B subfamily contain an ~90 amino acid-long N-terminal "pro" region that functions as a chaperone to assist with the folding of the active CP domain. In addition, this pro domain also functions as an inhibitor of the CP, with full activity requiring endopeptidase cleavage of the pro domain in one or more places. Cleavage releases the active CP domain, which is ~300 residues long and does not contain any additional domains.

The other subfamily of metallo-CPs includes CPN, CPE, CPM, CPD, and others (1, 11). Although this subfamily has been referred to as "regulatory" CPs, this name is not ideal, because most of these enzymes are poorly regulated, whereas one of the members of the A/B subfamily, CPB2 or CPU, is highly regulated (4). Thus, this second subfamily has been recently renamed the N/E subfamily to reflect the first two members that were discovered (11). In contrast to the A/B subfamily members, all members of the N/E subfamily contain several domains in addition to the active CP domain and the signal peptide (11). Also, members of the N/E subfamily lack pro domains that function as enzyme inhibitors; some lack this region entirely (CPM, CPN, CPD), whereas another member (CPE) contains a short pro domain that has an unknown function (7-10, 12). A domain common to all members of the N/E subfamily is a transthyretin-like folding domain that is immediately C-terminal to the active CP domain (13, 14). The function of this domain is not yet clear. In addition to the transthyretin-like domain, members of the N/E subfamily contain domains that are thought to function in protein-protein and/or protein-membrane interactions and that target the protein to specific compartments within the secretory pathway or outside the cell (11).

The amino acid sequence identities within members of each subfamily (typically 35-65%) are higher than between subfamilies (15-25%). The three-dimensional structures of members of each subfamily show many similarities within the core, although there are major differences within several of the loops near the active site cleft (14, 15). The active site and substrate-binding residues are generally conserved among all members of the two subfamilies that display enzymatic activity; one or more critical residues are missing in some members of the N/E subfamily that do not appear to encode active carboxypeptidases such as CPX1, CPX2, AEBP1, and the third CP-like domain of CPD (11, 16-19).

The objective of the present study was to determine whether additional metallo-CP genes exist in the human genome. Homology searches were used to identify three novel human CP-like genes. One of these is related to the sequence of a mouse testis cDNA clone of unknown function (GenBank™ accession number AK015256), which we have named CPA5. The GenBank™ sequence does not contain a long open reading frame due to a frameshift with respect to the predicted human genomic sequence; the quality of the mouse sequence data was not reported. Another of the novel human CP-like genes was identical within exons 1-8 to a cDNA sequence previously found in human hematopoietic stem cells (GenBank™ accession number AF221594). However, this putative stem cell protein would not encode an active CP due to the absence of exons 9-11, which contain critical residues for enzyme activity. The present search revealed these missing exons downstream of exon 8, and reverse transcription (RT)-PCR was used to confirm the predicted exon splicing of this region of the gene. Finally, a third completely novel CP-like gene was identified. Although related to cDNA sequences found in several nonmammalian species, no mammalian cDNAs representing this gene have been described. Modeling of the three novel CPs was performed to determine whether these gene products were likely to encode active CPs and, if so, to predict their substrate preference for aliphatic, basic, or other amino acids. In addition, the tissue distribution of pro-CPA5 mRNA was determined, and a preliminary characterization of CPA5 enzymatic activity was performed.

    MATERIALS AND METHODS

Data Base Searches-- The NCBI Web site was used to search the public human genome data base with various human CP sequences using the tblast-n program and the default parameters. To identify additional exons and to determine the nucleotide sequence of the exon/intron junctions, the genomic sequences that corresponded to the novel CP-like genes on chromosomes 7, 8, and 2 were downloaded from the NCBI site. These sequences were translated in all three reading frames, and the deduced amino acid sequences were searched for homology to various CPs using the GenePro program (Hoeffer Scientific). Because the amino acid sequence similarity within the N-terminal precursor regions of the CPs is low, these regions could not be identified from this homology-based approach. Searches of the expressed sequence tag (EST) data base on the NCBI Web site using the blast-n program revealed matches to the 5' region of the novel CP-like genes on chromosomes 7 and 8. Several of these EST sequences extended in the 5' region into the putative signal peptide and pro region, so they were used to search the human genome data base and obtain the 5' exons for this N-terminal region. Predictions on the presence of N-terminal signal peptides and putative cleavage sites of these peptides were performed using the Signal-P Web site (20).
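The translation step described above can be illustrated with a minimal sketch. It assumes the standard genetic code and translates only the three forward frames of a DNA string; the function names are hypothetical, and this is not the GenePro program used in the study.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, frame=0):
    """Translate one forward reading frame (0, 1, or 2); unknown codons become 'X'."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def three_frame_translation(dna):
    """Return the translations of the three forward reading frames."""
    return [translate(dna, frame) for frame in range(3)]


if __name__ == "__main__":
    # Short illustrative sequence, not taken from the genomic data.
    for frame, protein in enumerate(three_frame_translation("ATGCATCACTAG")):
        print(frame, protein)
```

Each translated frame could then be compared with known CP sequences by a homology search, which is outside the scope of this sketch.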

Modeling of CPs-- The set of template structures used for modeling the target sequences was chosen in every case based on closest sequence similarity. The following templates were used: bovine pro-CPA1, porcine pro-CPA1, and human pro-CPA2 for the proenzyme sequence of CPA5 and human pro-CPB1, porcine pro-CPB1, human pro-CPA1, and human pro-CPA2 for the proenzyme sequence of CPA6 and CPO. All of the proteins were extracted from the Protein Data Bank (Research Collaboratory for Structural Bioinformatics, Rutgers University, New Brunswick, NJ) except human pro-CPB1.²

A preliminary multiple alignment was performed for the three sets of template proteins plus the target sequence using the program ClustalW (21) and a BLOSUM 62 matrix for weighting. To correctly align the target and the template sequences, data on the secondary and tertiary structures for this set of proteins were added. The secondary structure of the target proteins was predicted with the program PHD (22), and the real secondary structure of the templates was calculated with the program DSSP (23). The three-dimensional structures of the chosen templates were structurally superimposed by use of the program SSAP (24), and the multiple alignments of the templates were modified according to the three-dimensional superimposition of the templates. Subsequently, the sequence of each target sequence was used to obtain a final multiple alignment accounting for the structure of the templates, which was manually refined to account again for the structural information obtained from the template superimposed structures plus the predicted and real secondary structure.

A method of comparative modeling by satisfaction of spatial constraints was used to build the three-dimensional structure of pro-CPA5, pro-CPA6, and CPO. This method was implemented using the program MODELLER (25). The spatial constraints were derived by transferring the spatial features from the structures of the known proteins to the sequence of the unknown one. Idealization of bond geometry and removal of unfavorable nonbonded contacts were performed by energy minimization with the GROMOS force field for unsolvated systems (26, 27). The model was refined using 1000 steps of steepest descent. Root mean square (r.m.s.) deviation calculations and superimposition of the modeled structures with respect to the crystallographic ones were obtained by means of the program SSAP. Secondary structure calculations of the crystallographic structures and the model were performed with DSSP (23) and compared with the predicted secondary structure. The program PROSA-II (28) was used to check the quality of the models. This program allows identification of regions with non-near-native fold by the high positive values of pseudo-potential energy. The model showing the smallest pseudoenergy was taken for the final modeled structure.
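For readers unfamiliar with the r.m.s. deviation mentioned above, the following is a minimal sketch of the calculation for two sets of equivalent atoms. It assumes the two structures have already been superimposed (in the study this was done with SSAP, which is not reproduced here); the function name and the coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally long lists of (x, y, z) tuples.

    Assumes the structures are already superimposed and that the atoms are
    listed in equivalent order.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


if __name__ == "__main__":
    # Illustrative coordinates only (in angstroms), not from any model.
    model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    crystal = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0)]
    print(round(rmsd(model, crystal), 3))
```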

Isolation of Mouse Testis CPA5 cDNA-- Mouse full-length CPA5 cDNA was generated by performing high fidelity PCR using thermostable Taq and the 3'-5' exonuclease activity of Pwo (Roche Molecular Biochemicals) with a mouse BALB/c testis cDNA library (CLONTECH). The forward primer ATA AGA ATG CGG CCG CTT GTC TGG AAG GAG AAG C included a NotI site at its 5'-end. The reverse primer GGG GTA CCG AAT TCT AGT GAT GAT GAT GAT GAT GAT AGG GGT GAT TCA GGG TGT G included KpnI and EcoRI sites and a His tag sequence at its 5' end. The PCR product was purified (PCR purification kit; Qiagen), digested with NotI and EcoRI, and subcloned into the baculovirus expression vector pVL1392 (Invitrogen). To generate a probe for Northern blotting analysis, the purified PCR product was digested with NotI and HincII and subcloned into pBluescript KS vector (Stratagene). All positive clones were confirmed by sequencing.
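The primer design described above can be checked with a short sketch that locates the stated restriction sites and the His tag codons in the two primer sequences as printed. The recognition sequences for NotI, KpnI, and EcoRI are the standard ones; the helper names are hypothetical.

```python
# Primer sequences as given in the text, with spaces removed.
FORWARD = "ATA AGA ATG CGG CCG CTT GTC TGG AAG GAG AAG C".replace(" ", "")
REVERSE = (
    "GGG GTA CCG AAT TCT AGT GAT GAT GAT GAT GAT GAT "
    "AGG GGT GAT TCA GGG TGT G"
).replace(" ", "")

# Standard recognition sequences of the enzymes named in the text.
SITES = {"NotI": "GCGGCCGC", "KpnI": "GGTACC", "EcoRI": "GAATTC"}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_sites(seq):
    """Map each enzyme whose site occurs in seq to the 0-based index of the first hit."""
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


if __name__ == "__main__":
    print(find_sites(FORWARD))
    print(find_sites(REVERSE))
    # On the coding strand the reverse primer reads six His codons followed by a stop.
    print("CATCATCATCATCATCAC" + "TAG" in reverse_complement(REVERSE))
```

Read on the coding strand, the reverse primer places six histidine codons (five CAT and one CAC) immediately before a TAG stop codon, consistent with a C-terminal His tag.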

Northern Blot and RT-PCR Analysis-- Total RNA was prepared using the lithium chloride method (29), and ~20 µg was fractionated on an agarose gel containing 2% formaldehyde. Following photography of the ethidium bromide-stained gel, the RNA was transferred to a nitrocellulose membrane and probed with CPA5 riboprobe. The 500-nucleotide-long probe was produced using the HindIII-linearized mouse CPA5 vector described above, T3 RNA polymerase, and [32P]UTP. Approximately 4 × 10⁷ cpm of probe was hybridized with the blot in 5× saline-sodium phosphate-EDTA buffer, 5× Denhardt's solution, 50% (v/v) formamide, 0.1% (w/v) SDS, and 100 µg/ml denatured salmon sperm DNA at 64 °C overnight. Following the hybridization, the blot was washed with 1× standard saline citrate (SSC) and 0.1% SDS, 0.5× SSC plus 0.1% SDS, and then 0.1× SSC plus 0.1% SDS buffer at 72 °C. After washing, the blot was dried and exposed to x-ray film (Eastman Kodak Co.) for 3 days at -80 °C with an intensifying screen.

RT-PCR was performed using the Titan one tube RT-PCR system (Roche Molecular Biochemicals). RNA was extracted using the lithium chloride method (29) from various mouse tissues, AtT-20 cells, or human mononuclear cells prepared from fetal cord blood as described (30). For mouse CPA5, the forward and reverse primers were the same as used for the isolation of CPA5 cDNA from the testis library (above). For human CPA6, the forward primer was TGG ATG CAT CAT CTG AAT AAA ACT CAC, and the reverse primer was TCC ATT TTT GTA GGC CCA ATC CAT. For mouse CPA6, the same forward primer was used, but the reverse primer was chosen from a region with lower degeneracy: CTT CAC TTT CCA GTT TCT ATT GGC ATC. The PCR products were purified (PCR purification kit; Qiagen) and subcloned into the TA cloning vector (Invitrogen). The resulting plasmids were sequenced in both directions.

In Situ Hybridization Analysis-- All experiments carried out in this study used antisense cRNA probes and sense probes for controls. Probes were synthesized by in vitro transcription of linearized plasmid DNA (CPA5 in pBluescript, described above), and labeling was done by incorporation of radioactive [35S]UTP and [35S]CTP nucleotides (Amersham Biosciences) as described (31). The in situ hybridization studies were carried out using antisense and sense [35S]UTP/[35S]CTP-labeled cRNA probes. Radioactive probes were diluted to 33 × 10³ dpm/µl and in situ hybridization was performed as previously described (31). CD1 mice were sacrificed by rapid decapitation, and tissues were rapidly removed and frozen in isopentane cooled to -35 °C. The extracted tissues were stored at -80 °C until cryosectioning. Frozen 10-µm sections were cut on a Reichert cryostat (Leica Microsystems, Depew, NY), thaw-mounted on polylysine-coated glass slides, and stored at -80 °C until processing. These sections were submitted to the standard in situ hybridization procedure (31) followed by x-ray film autoradiography for 4 days to obtain low resolution images. These slides were subsequently dipped in NTB-2 nuclear emulsion (Eastman Kodak Co.) to obtain cellular resolution. The sections were counterstained with cresyl violet, cleared in xylene, and mounted with Permount histological mounting medium (Fisher).

Expression of Mouse Pro-CPA5 in the Baculovirus System-- Recombinant baculovirus expressing mouse pro-CPA5 was generated using Baculoplatinum DNA (Orbigen). Approximately one million Sf9 cells growing in Sf900-II serum-free medium (Invitrogen) were co-transfected with 5 µg of the pro-CPA5 pVL1392 vector along with 0.5 µg of Baculoplatinum DNA using the standard calcium phosphate procedure (18). Amplifications of the virus were performed as previously described (18). Sf9 cells from 450 ml of culture were recovered by centrifugation at 50,000 × g at 4 °C for 30 min. The cells were homogenized (Polytron; Brinkman) in 100 ml of 0.1 M Tris-Cl, pH 7.4, and centrifuged at 50,000 × g for 30 min. The pellet was homogenized in 0.1 M Tris-Cl containing 1 M NaCl and centrifuged at 50,000 × g. This pellet was homogenized in 0.1 M Tris-Cl containing both 1 M NaCl and 1% Triton X-100 and centrifuged again as above. The final pellet was resuspended in Tris buffer, and aliquots of each supernatant and the final pellet were analyzed on two denaturing polyacrylamide gels. One gel was stained with Coomassie while the other was transferred to nitrocellulose and probed with a mouse monoclonal antibody to the His6 epitope (Invitrogen).

The supernatant from the 1 M NaCl extract was dialyzed against 10 mM Tris-Cl, pH 7.4, overnight and centrifuged at 50,000 × g for 30 min at 4 °C. Both supernatant and pellet were analyzed on denaturing polyacrylamide gels (both protein and Western blot analysis, as above). The major protein bands in the precipitate fraction were cut from the gel, digested with trypsin (Promega), and subjected to mass spectrometry as previously described (32). The resulting fragments were compared with the theoretical masses of pro-CPA5 tryptic fragments predicted using the Paws computer program.

Characterization of Pro-CPA5-- Because the gel analysis revealed the size of the expressed protein to be that of pro-CPA5 and not the predicted active form, attempts were made to process the pro-CPA5 using a variety of endopeptidases. The pro-CPA5 could not be purified using the Talon His6 purification procedure (CLONTECH). Because the pellet after the dialysis of the 1 M NaCl cell extracts contained pro-CPA5 as one of the major components (based on the tryptic fingerprinting results), this material was used for subsequent studies on pro-CPA5. Approximately 5-10 µg of pro-CPA5 (estimated from the Coomassie-stained protein gel) was incubated with a variety of endoproteases in the following buffer: porcine trypsin (Promega), 0.1 M Tris, pH 7.4; purified baculovirus-expressed human furin (33), 0.1 M HEPES, pH 7.5, with 1 mM CaCl2, 1 mM 2-mercaptoethanol, and 0.5 mg/ml bovine serum albumin; purified baculovirus-expressed human PC7 (33), 20 mM bis-Tris, pH 6.5, with 1 mM CaCl2; and medium from High-Five cells expressing PC4 (the gift of Dr. Majambu Mbikay, Loeb Health Research Institute, Ottawa, Canada) (34), 0.1 M Tris, pH 7.0, with 2 mM CaCl2. For furin and the PCs, enzyme activity was verified with the standard substrate pyro-Glu-Arg-Thr-Lys-Arg-7-amido-4-methylcoumarin (34).

To measure CP activity, 5-10 µl of each sample (corresponding to 125-250 ng of pro-CPA5 prior to incubation with PC4) was added to 0.5 ml of 500 µM furylacryloyl-Gly-Leu (FA-Gly-Leu) in either 0.1 M Tris-Cl, pH 7.4, or 0.1 M sodium acetate, pH 5.5, at 25 °C. The enzyme reaction was followed by measuring the decrease in absorption at 336 nm; controls with bovine CPA1 (Sigma) showed a maximal change of 0.6 absorption units for the batch of substrate used.
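The quantities in this assay can be worked through with a small sketch. The substrate amount follows directly from the stated volume and concentration. The conversion of an absorbance change into a fraction of substrate hydrolyzed is an assumption made here for illustration (a linear relation scaled to the 0.6 unit maximal change), not a procedure stated in the text, and the function names are hypothetical.

```python
def substrate_nmol(conc_um, volume_ml):
    """Amount of substrate in nmol; 1 µM equals 1 nmol/ml."""
    return conc_um * volume_ml


def fraction_hydrolyzed(delta_absorbance, max_delta=0.6):
    """Assumed linear estimate of the fraction of substrate cleaved.

    max_delta is the maximal absorbance change reported for complete
    hydrolysis with bovine CPA1; linearity is an illustrative assumption.
    """
    return delta_absorbance / max_delta


if __name__ == "__main__":
    print(substrate_nmol(500, 0.5))   # 0.5 ml of 500 µM FA-Gly-Leu
    print(fraction_hydrolyzed(0.3))   # hypothetical absorbance decrease
```

Each reaction therefore contains 250 nmol of FA-Gly-Leu, and 125 ng of pro-CPA5 in 5 µl corresponds to 25 ng/µl of starting material.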

    RESULTS

Data Base Searches and Alignments of CPs-- When CPE was used to search the public human genome data base using the tblast-n program, each known member of the N/E subfamily was detected, but no additional genes were found (Table I). None of the N/E subfamily genes appeared to be present as a cluster; although chromosome 4 contains both CPE and CPZ, and chromosome 10 contains both CPN and CPX2, these genes are located at a considerable distance from each other within the chromosome. Similarly, none of the members of the N/E subfamily were located in close proximity to members of the A/B subfamily. Searches with CPE produced a limited number of hits to members of the A/B subfamily, as expected due to the generally low amino acid sequence identity between subfamilies.

                              
Table I
Genomic location of human metallocarboxypeptidase-related genes

Searches with CPA1 and CPB1 yielded generally similar results to each other, with hits to genes on chromosomes 7, 3, 8, 13, and 2 (Table I). The four genes found on chromosome 7 are found within a cluster, with the coding region of each gene located 3-23 kb apart from the coding region of the adjacent gene (Fig. 1). The genes for the pancreatic proteins CPA1 and CPA2 are located at the ends of this cluster. A cDNA sequence reported in the literature as CPA3 (35) and renamed CPA4 by the human genome nomenclature committee is also present in this cluster (Fig. 1). This sequence presumably encodes an active CPA-like enzyme based on consideration of the active site region; the enzymatic properties have not yet been reported. In addition, a novel human CP-like gene was found within this cluster and named CPA5 based on predictions of the active site region (discussed below). The genomic sequence of the appropriate region of chromosome 7 was searched in all three reading frames for homology to various CPs using the GenePro program. This analysis identified 11 exons for the entire predicted coding region of CPA5 (exons 3-13; Fig. 1), consistent with the 11 exons found in most other members of the A/B subfamily of CPs (36, 37). All of the introns in the coding region of CPA5 begin with GT and end with AG, and their locations exactly match those in other members of the A/B subfamily.


Fig. 1.   Gene structure of CP-like genes on chromosomes 7, 8, and 2. The length of each intron is indicated (kb), and the widths of the lines representing the exons correspond to their sizes (using a different scale from the intron size). The CPA5 gene on chromosome 7 is located in a cluster with CPA2, CPA4, and CPA1. The distance between the genes is indicated (kb). Within CPA5, exons 1 and 2A are found in all sequences in the GenBank™ EST data base with homology to this region of the gene, whereas exon 2B is only present in a single EST sequence from a testis cDNA library. The CP-like gene on chromosome 8 has been named CPA6. The putative CP on chromosome 2 (CPO) does not match any sequences in the public data base, so the N-terminal exons could not be identified from the homology search due to lower sequence similarity in this region. The exons were numbered to reflect the missing exons; nearly all other members of the CPA/B subfamily have three additional N-terminal exons.

CPA5 has >98% nucleotide identity with a total of 13 sequences in the GenBank™ human EST data base (as of November 30, 2001). Of these, nine were from libraries that included RNA from testis, three were from libraries of adult brain, and one was from fetal heart. One of the testis sequences and all three of the brain sequences were ~560 nucleotides longer in the 5' direction than the coding sequence predicted from homology to other CPs. All four of these clones contained two additional 5' noncoding exons (exons 1 and 2A; Fig. 1), and the testis clone contained a third noncoding exon (exon 2B; Fig. 1). The intron/exon junctions of the two common exons obeyed the GT/AG rule, whereas the junction of the exon found only in the testis clone had a GG in place of the GT.
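The GT/AG rule referred to above can be expressed as a simple check on the first and last two nucleotides of an intron. The sketch below is illustrative only; the intron sequences are hypothetical and are not taken from the CPA5 gene.

```python
def follows_gt_ag_rule(intron):
    """Return True if the intron begins with GT and ends with AG."""
    intron = intron.upper()
    return len(intron) >= 4 and intron.startswith("GT") and intron.endswith("AG")


if __name__ == "__main__":
    # Hypothetical introns: one canonical, one with GG in place of GT.
    for name, seq in [("canonical", "GTAAGTCCCTTTCAG"), ("variant", "GGAAGTCCCTTTCAG")]:
        print(name, follows_gt_ag_rule(seq))
```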

A mouse cDNA clone with 84% nucleotide sequence identity to human CPA5 was present in the NR data base (GenBank™ accession number AK015256). However, this GenBank™ mouse sequence does not encode a CP due to frameshifts in the mouse sequence relative to the long CP-like open reading frame of the human sequence. PCR was performed using a mouse testis cDNA library as template and oligonucleotides based on the sequence of the mouse cDNA clone. Sequence analysis of the resulting cDNA clones revealed four differences from the GenBank™ entry; one of these is a silent change, another changes Leu121 to Met (which matches the human sequence), and most importantly, there are two missing residues that shift the reading frame of the mouse protein to exactly match that of the predicted human protein.

The mouse CPA5 cDNA clone, like the human CPA5 sequences in the EST data base, extends ~560 nucleotides upstream of the coding region. The human upstream region contains eight ATGs, and the mouse region contains nine; many of these are conserved between the human and mouse sequences. However, only one of these upstream ATG motifs in either sequence is predicted to be a good consensus sequence for translation initiation (38), and this is followed by an in-frame stop codon. The second ATG that is predicted to be a good consensus sequence for translation initiation encodes a protein of 436 amino acids for both human and mouse sequences. The nucleotide sequence surrounding the ATG at the start of this long open reading frame is highly conserved between human and mouse, whereas the upstream ATGs show considerably less sequence similarity within the surrounding regions.

In addition to CPA5, two other sequences that have not previously been reported in the literature were detected on chromosomes 8 and 2 (Fig. 1). The first eight exons of the CP-like gene on chromosome 8 exactly match the sequence of a cDNA clone reported in GenBank™ (accession number AF221594). However, this GenBank™ entry lacks the sequence corresponding to exons 9-11 and has a 3'-end that does not have any homology with known carboxypeptidases. Instead, this 3'-end perfectly matches the putative intronic region found in the human genome sequence. Searches of the downstream genomic sequence for CP-like domains revealed the missing exons ~50 kb away. Although large for an intron, this is considerably smaller than the first and second introns, which are 121.8 and 98.4 kb (Fig. 1). All of the predicted introns begin with GT and end with AG, and their locations exactly match those found in other A/B subfamily CPs. Based on consideration of the active site region (discussed below), this chromosome 8 CP-like gene has been named CPA6.

Similar analysis of the novel gene on chromosome 2 revealed eight exons corresponding to the active CP domain of the protein (exons 4-11; Fig. 1). The amino acid homology within the N-terminal "prepro" regions of the carboxypeptidases was not sufficient to identify this portion of the gene based on homology searches. Because no cDNA sequence was found in public data bases that corresponds to this gene, it was not possible to determine the N-terminal exons. The downstream exons are numbered based on the related CPs, with the assumption that there would be three additional exons for the signal peptide and pro domains. The introns within this chromosome 2 CP-like gene all begin with GT and end with AG, and their positions exactly match one or more of the other family members. Based on consideration of the active site region, which suggests that this enzyme has a markedly distinct substrate specificity from CPA and CPB (discussed below), this gene was named CPO.

Both human and mouse CPA5 are predicted to be produced as precursors containing N-terminal signal peptides, with cleavage expected following Gly33. Following the signal peptide is a predicted 93-96-residue pro domain that has some homology to the corresponding pro region of other A/B subfamily CPs (Fig. 2). Based on sequence alignments with other CPs and on consideration of the specificity of various endopeptidases, cleavage of the pro region is predicted to occur either between Arg126 and Leu127 or between Arg129 and Ser130. Both sites are related to the consensus site for furin and other prohormone convertases in which cleavage occurs C-terminal to the second Arg of the sequence Arg-Arg or Arg-Xaa-Xaa-Arg (39-42). Although the first site has an Arg-Arg motif, the presence of the Leu in the P1′ position is not ideal for many of these PCs (39-42). The second site, with the Arg-Xaa-Xaa-Arg motif, may be more readily cleaved due to the P1′ Ser. Also, this second site is favored from the modeling analysis (discussed below).
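The consensus motifs described above can be located with a short pattern search. The sketch below is a simplification that scans a protein sequence for Arg-Arg and Arg-Xaa-Xaa-Arg and reports the position of the second Arg together with the residue that follows it. The function name and the example sequence are hypothetical; the toy sequence only mimics the arrangement described in the text and is not the CPA5 sequence.

```python
import re

# Lookahead patterns so that overlapping motifs are all reported.
MOTIFS = {
    "Arg-Arg": (re.compile(r"(?=RR)"), 2),
    "Arg-Xaa-Xaa-Arg": (re.compile(r"(?=R..R)"), 4),
}


def candidate_cleavage_sites(protein):
    """Return (motif name, 1-based position of the second Arg, following residue) tuples.

    Cleavage is taken to occur C-terminal to the second Arg of each motif.
    """
    sites = []
    for name, (pattern, length) in MOTIFS.items():
        for match in pattern.finditer(protein):
            cut_after = match.start() + length  # 1-based index of the second Arg
            following = protein[cut_after] if cut_after < len(protein) else ""
            sites.append((name, cut_after, following))
    return sorted(sites, key=lambda site: site[1])


if __name__ == "__main__":
    # Hypothetical sequence with an Arg-Arg-Leu and an Arg-Xaa-Xaa-Arg-Ser arrangement.
    for site in candidate_cleavage_sites("AAQRRLKRSAA"):
        print(site)
```

A scan of this kind only finds candidate sites; whether a given site is actually cleaved depends on the surrounding residues and on the structure, as discussed in the text.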


Fig. 2.   Alignment of the deduced amino acid sequences of human (h) and mouse (m) CPA5, CPA6, CPO, and the previously identified members of this subfamily. The gap below the arrow denotes the putative signal peptide cleavage site. For CPA6, the signal peptide cleavage was predicted to occur at either Gly28 or Ser30 (only the Ser30 site is indicated). The gap below the arrowhead denotes the predicted pro peptide cleavage site. For CPA5, the pro peptide cleavage site may occur at either Arg126 or Arg129 (only the Arg129 site is indicated). The dotted line indicates the missing N-terminal sequence of CPO. Dashes are used to introduce gaps for optimal alignment of the sequences. Active site residues are in boldface type. The asterisks indicate active site or substrate-binding residues that are not conserved in all members of the CPA/B subfamily. These residues are numbered using the traditional method, which is based on the catalytically active form of CPA1 (i.e. after removal of the pro peptide). The numbers on the left of each line following the gene names refer to the amino acid position relative to the initiation ATG. The cDNA sequences from which the protein sequences were derived have been deposited in GenBank™: mouse CPA5, AF466283; human CPA5, BK000187; human CPA6, AF466284 and BK000188; and human CPO, BK000189.

The deduced amino acid sequence of CPA6 encodes a protein of 437 residues that is also predicted to have a signal peptide, with cleavage predicted following either Gly28 or Ser30. The putative pro domain is nearly identical in length to that of CPA5 and other family members (Fig. 2). The predicted cleavage site of the pro domain of CPA6 is a perfect furin/prohormone convertase consensus site, with cleavage expected between Arg129 and Ser130 (Fig. 2). Because the N-terminal exons of CPO could not be identified due to the relatively low homology among the prepro regions of the various CPs, no information is available regarding the presence of a signal peptide and pro region in CPO. There is no furin/prohormone convertase consensus site within the region of CPO that aligns with the pro domain cleavage site of the other family peptides (Fig. 2).

Alignment of the amino acid sequences shows that the key residues for catalytic activity and substrate binding of other CPs are generally present in comparable positions in CPA5, CPA6, and CPO. Based on kinetic studies with rat CPB1 (36) and crystallographic studies with porcine CPB1 and CPA1 (43, 44), the important residues include those involved in the coordination of the active site zinc atom (His69, Glu72, and His196) and a series of residues important for substrate binding and catalysis (Arg71, Arg124, Arg127, Lys128, Asn144, Arg145, Ser194, Ser197, Tyr198, Ser199, Ile243, Tyr248, Ser253, Ile255, Thr268, Glu270, and Phe279; the numbering and residue names used are those of CPA1). All of the residues involved in zinc coordination and substrate catalysis are conserved in CPA5, CPA6, and CPO. Most of the residues involved in substrate binding are also conserved, and the differences observed in some of them are likely to be related to the individual specificities of the three forms (discussed below). The length of the active CP domain is nearly identical in all members of the A/B subfamily, with the major differences in overall length due to small differences in the number of residues of the signal peptide (Fig. 2). The overall amino acid sequence identity of prepro-CPA5 is highest to prepro-CPA1 (60%), slightly lower to prepro-CPA2 and prepro-CPA4 (53 and 51%), and lower (34-36%) to the remaining members of the gene family (Table II). Similar results are obtained when comparing just the active CP domains, although the amino acid sequence identities are 4-7% higher without the signal peptides and the pro regions (Table II). Prepro-CPA6 shows slightly higher amino acid sequence identity to prepro-CPA3 (41%) and prepro-CPB (40%) than to the other members of the subfamily (Table II). The amino acid sequence identities are also higher when only the active CP domains are compared (Table II).
Because the prepro region of CPO could not be determined, amino acid comparisons of this protein were done only with the active CP domains. CPO shows highest amino acid sequence identity to CPB1 (46%), CPA6 (45%), CPA3 (45%), and CPB2 (44%).
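Percentage identity values of this kind depend on how aligned positions are counted. The minimal sketch below shows one common convention, counting identical residues over alignment columns in which neither sequence has a gap. The function name `percent_identity`, the choice of denominator, and the toy sequences are assumptions for illustration and do not reproduce the calculation behind Table II.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0

# Hypothetical aligned fragments ("-" marks an alignment gap).
seq_a = "HSREW-TAA"
seq_b = "HTREWIQA-"
print(round(percent_identity(seq_a, seq_b), 1))
```

Other conventions (for example, dividing by the length of the shorter sequence or by the full alignment length) give somewhat different values, so the convention should be stated alongside any reported percentage.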

                              
Table II
Percentage of amino acid sequence identity among members of the CPA/B subfamily
The upper right half of the table indicates identity among the full-length (i.e. preprocarboxypeptidase) forms; the lower left half (italics) indicates identity only of the putative active forms (i.e. without the prepro sequence).

Modeling of CPs-- To gain a better understanding of whether the novel CPs encode active enzymes as well as to predict the optimal substrates, a method of comparative modeling by satisfaction of the spatial restraints determined by sequence alignments was used to build the three-dimensional structure of the pro-CPs using the program MODELLER. For each particular protein, a subset of sequences with known Protein Data Bank coordinates, chosen based on closest sequence similarity, was used as the starting alignment, which, in every case, contained the restraints shown in the global alignment in Fig. 2. In order to improve the model accuracy, a prediction of secondary structure was performed in every case. For pro-CPA5 and pro-CPA6, the prediction showed that the α-helix of the connecting segment (the region that links the pro and enzyme domains) is similar to human pro-CPA2 and slightly longer than the α-helix of pro-CPA1. Thus, several models were built in which the connecting region was only restrained by the structure of human pro-CPA2. These models showed an improved pseudoenergy profile with respect to those obtained using all restraints in this region. Due to the lack of information about the N-terminal pro region, CPO was modeled on CP templates only. In all cases, the structure with the lowest energy was chosen as the final model among the proposed ones.

The overall r.m.s. deviation values calculated between the enzyme moieties of the final refined models and the different CP templates are as follows: CPA5, 0.28, 0.34, and 0.37 Å for bovine CPA1, porcine CPA1, and human CPA2, respectively; CPA6, 0.49, 0.39, 0.66, and 0.69 Å for human CPB, porcine CPB, human CPA2, and bovine CPA1, respectively; CPO, 0.46, 0.41, 0.62, and 0.68 Å for human CPB, porcine CPB, human CPA2, and bovine CPA1, respectively. In the case of the two modeled proenzymes, the r.m.s. deviation values are higher in all cases when the complete models are compared with their templates. For instance, the r.m.s. deviation values for the pro-CPA5 model are 1.10, 1.06, and 0.50 Å for bovine pro-CPA1, porcine pro-CPA1, and human pro-CPA2, respectively. This indicates that while the enzyme domains of the pro-CPA5 or pro-CPA6 models are very similar to those of their templates, the pro segments are significantly different, especially when compared with pro-CPA1 structures. In the case of pro-CPA5, the main differences observed are located within the connecting segment, with its α-helix being five turns long (Fig. 3), as in the three-dimensional structure of pro-CPA2, and spanning from Ile110 to Leu127. This helix is followed by a short loop at the border with the pro-CPA5 moiety, where a highly exposed arginine (Arg129) is located. This arginine may be the first target for endopeptidase-mediated activation of the proenzyme. The connecting segment of pro-CPA1 has only a four-turn helix, and the loop next to it is longer than that of pro-CPA5. As in pro-CPB1, pro-CPB2, and pro-CPA3, the pro domain of pro-CPA6 contains a two-residue deletion after position 65 and a four-residue insertion after position 72 (CPA6 numbering) relative to all other CPAs (Fig. 2). Thus, the model shows the one-turn 3₁₀ helix located between β strands 2 and 3 that is characteristic of the CPB pro region. However, unlike the pro region of CPB1, in which Asp56 (numbering of Fig. 2) forms a salt bridge with the active site residue Arg145 (CPA1 active form numbering) to completely inactivate the proenzyme toward all substrates, none of the other pro regions contain an acidic amino acid in this position (corresponding to Ser71 of CPA6).
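For readers unfamiliar with the r.m.s. deviation measure quoted above, the sketch below computes it for two sets of equivalent atom positions. It assumes the two structures have already been optimally superimposed and that the atoms are listed in matching order; the function name `rmsd` and the coordinates are hypothetical and are not taken from the models described here.

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equal-length lists of (x, y, z).

    Assumes the structures are already superimposed and that the i-th point
    in each list refers to the same (equivalent) atom.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Hypothetical coordinates in angstroms, for illustration only.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
template = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(model, template))
```

Because the value is an average over all compared atoms, a low overall r.m.s. deviation for the enzyme moiety can coexist with larger local deviations in a short region such as the connecting segment.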


Fig. 3.   Ribbon representation of the modeled structures of pro-CPA5, pro-CPA6, and CPO. The pro and enzyme domains are shown in blue and gold, respectively, and the different color intensities indicate α or β secondary structures.

In the predicted active sites of the three CP models, the residues involved in the coordination of the active site zinc atom and the series of conserved key residues that form the different active center subsites have essentially the same conformation described for other CPs (Fig. 4). The positions that show higher variability among the various CPs are Ser194, Ile243, Ser253, Ile255, and Thr268 (CPA1 active form numbering and residue names). The predictions of individual specificities are therefore based largely on the conformational and space-filling effects of these amino acid substitutions. CPAs hydrolyze hydrophobic C-terminal residues. CPA2 exhibits a preference for C-terminal bulky aromatic residues, whereas CPA1 prefers smaller aliphatic residues. This difference is thought to be primarily due to two substitutions in the specificity pocket of human CPA2 at positions 194 and 268 (CPA1 active form numbering) (45). Human CPA5 has the same residues as CPA1 in these positions (Fig. 2). Two other residues in the specificity pocket of CPA5 are Ile380 and Val382, which correspond to residues Ser253 and Ile255 (CPA1 active form numbering). These changes would render the specificity pocket of CPA5 smaller than in the other CPs. CPA5 has only one predicted disulfide bond (between Cys265 and Cys288), which is in the same position as the disulfide bond in CPA1 and the other CPs. Human CPA2 contains a second disulfide bond between Cys318 and Cys352, which is unique for CPA2 and whose presence is thought to influence the specificity of the enzyme. This, together with the fact that the residues identified as responsible for the higher specificity of CPA2 for bulky aromatic residues are not present in CPA5 and the substitution at positions 253 and 255 (CPA1 active form numbering) that make the specificity pocket smaller, indicates that CPA5 is likely to exhibit a specificity for small aliphatic C-terminal residues.


Fig. 4.   Schematic diagram of the CP active site. The numbers indicate the residue number of CPA1 (active form numbering) and correspond to the numbers of the residues in boldface type in Fig. 2. The asterisks indicate residues that are variable among members of the CPA/B subfamily and that presumably affect the substrate specificity. R indicates the side chain of the C-terminal residue of the substrate.

There is a large overall coincidence in the hydrophobic nature of most of the residues that line the substrate binding pocket of CPA6 and CPA1. The CPA1-like activity of CPA6 may, however, be modulated by three substitutions: Ala371, Met383, and Ala396 in CPA6 correspond, respectively, to Ile243, Ile255, and Thr268 in CPA1 (active form numbering). These residues line the cavity where the side chain of the S1' residue sits. The smaller size of the residues lining the specificity pocket may indicate that CPA6 is more efficient in the hydrolysis of large, probably branched hydrophobic C-terminal residues. Finally, CPO is unique among human CPs in that it has a Ser at position 243 and an Arg at position 255 (CPA1 active form numbering); most CPs have a bulky hydrophobic residue at position 243 and either a hydrophobic residue (CPAs) or an acidic residue (CPBs) in position 255. These residues interact directly with the C-terminal side chain of the substrate. The presence of a basic residue in this position of CPO suggests a specificity for C-terminal acidic residues.
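Correspondences such as Met383 of CPA6 with Ile255 of CPA1 are read off a sequence alignment by counting residues in each sequence separately. The sketch below shows that bookkeeping for a pairwise alignment; the function name `map_numbering`, the starting numbers, and the toy fragments are hypothetical and chosen only to illustrate the idea.

```python
def map_numbering(aligned_ref, aligned_query, ref_start=1, query_start=1):
    """Map reference residue numbers onto query residue numbers.

    Returns {ref_number: (ref_residue, query_number, query_residue)} for every
    alignment column in which both sequences have a residue.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    mapping = {}
    ref_no = ref_start - 1
    query_no = query_start - 1
    for r, q in zip(aligned_ref, aligned_query):
        if r != "-":
            ref_no += 1      # advance reference numbering on a real residue
        if q != "-":
            query_no += 1    # advance query numbering on a real residue
        if r != "-" and q != "-":
            mapping[ref_no] = (r, query_no, q)
    return mapping

# Hypothetical aligned fragments with arbitrary starting numbers.
mapping = map_numbering("SI-IT", "AVGMA", ref_start=10, query_start=100)
print(mapping[12])
```

For the three CPA6 positions cited above, the difference between the CPA6 number and the CPA1 active form number is 128 in each case (371 versus 243, 383 versus 255, and 396 versus 268), which is what one expects when no gaps fall between those alignment columns.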

Distribution and Expression of CPA5-- The tissue distribution of mouse CPA5 was determined by Northern blot analysis using a cDNA probe isolated from a mouse testis library. This analysis showed a single band in testis of ~3.4 kb (Fig. 5). No signal was detected for kidney, liver, brain, or lung (Fig. 5). RT-PCR showed a signal in mouse testis as well as in pancreas and the AtT-20 mouse pituitary corticotrophic cell line (data not shown). No RT-PCR signal for CPA5 was detected in mouse heart, kidney, liver, or brain (not shown). To further characterize the localization of CPA5 in mouse testis and pituitary, in situ hybridization analysis was performed. A signal for CPA5 was detected in the germ cells of the testis (Fig. 6, top panels). The level of expression of CPA5 varied among tubules, suggesting the stage-specific expression of this mRNA. In pituitary, a diffuse signal for CPA5 was detected throughout the anterior and intermediate lobes but not in the neural lobe (Fig. 6, lower left panel). Control hybridizations with sense probes showed only background levels (Fig. 6, lower right panel).


Fig. 5.   Northern blot analysis of mouse CPA5. Total RNA from mouse kidney, liver, brain, testis, and lung was fractionated on a formaldehyde agarose gel, transferred to nitrocellulose, and probed with 32P-labeled mouse CPA5 cRNA. Top, autoradiogram of Northern blot (3-day exposure). The positions of size standards (in kb) are indicated (Invitrogen). Bottom, ethidium bromide-stained gel prior to transfer to nitrocellulose. The positions of 28 and 18 S RNA are indicated.


Fig. 6.   In situ hybridization analysis of CPA5 in mouse testis (top) and pituitary (bottom). Tissue sections were probed with antisense (AS) or sense (S) mouse CPA5 cRNA probes, as described under "Materials and Methods." AL, anterior lobe; IL, intermediate lobe. CPA5 is differentially expressed in tubules, suggesting a stage-specific germ cell expression. Low levels are observed in both the anterior and intermediate pituitary lobes. Sections hybridized with sense probes show only background signal.

RT-PCR of CPA6 showed a signal with RNA isolated from fetal umbilical cord blood, which contains human hematopoietic stem cells, and from mouse brain (data not shown). The primers used for the RT-PCR of human CPA6 were located on exons 5 and 11. The sequence of the cDNA derived from RT-PCR using these primers exactly matched that predicted from the gene, indicating that exons 5-11 are spliced as shown in Fig. 2. In addition, the nucleotide sequence of the mouse RT-PCR product for CPA6, which was amplified with primers on exons 5 and 8, was 89% identical to the corresponding region of human CPA6 (data not shown).

Mouse CPA5 protein was expressed in insect Sf9 cells using the baculovirus system. This system has been successful for the expression of a number of other CPs (46-48). To detect the protein, a His6 tag was added to the C terminus. Baculovirus-expressed CPA5 was detected by Western blot analysis with an anti-His6 antiserum. The protein was detected in the Sf9 cell pellets as a 45-kDa protein (Fig. 7A), consistent with the predicted size of pro-CPA5. No signal was detected when a comparable amount of Sf9 cells from wild-type virus-infected cells was analyzed (Fig. 7A). Although pro-CPA5 was predicted to be secreted in a soluble form due to the presence of a signal peptide and the absence of any transmembrane domain, immunoreactivity was not detected in the medium (Fig. 7A). A major protein of the same size as the immunoreactive band was detected in the homogenates of pro-CPA5 baculovirus-infected Sf9 cells but not in homogenates of cells infected with wild type virus (Fig. 7B). When the cell homogenate was centrifuged, none of the 45-kDa protein (Fig. 7B) or immunoreactivity (not shown) was detected in the supernatant. A small amount of the particulate 45-kDa protein (Fig. 7B) and immunoreactivity (not shown) could be solubilized with 1 M NaCl. The combination of high salt and detergents such as 1% Triton X-100 did not enhance the extraction of the particulate 45-kDa protein (Fig. 7B). Once solubilized with high salt, the removal of the salt by dialysis followed by centrifugation at 50,000 × g for 30 min led to the recovery of the 45-kDa protein in the pellet (Fig. 7C). This 45-kDa band was cut from the gel and digested with trypsin, and the tryptic fragments were identified by matrix-assisted laser desorption ionization time-of-flight mass spectrometry. Twenty-six major tryptic fragments were detected, and 15 of them matched (±0.07%) to the theoretical masses of predicted tryptic fragments of pro-CPA5. 
The lower molecular weight bands detected in the precipitate of the high salt extracts were similarly analyzed and did not show any tryptic fragments that matched those predicted for pro-CPA5. These bands also did not react with the monoclonal antibody to the His6 epitope tag.
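The fingerprinting criterion described above, in which an observed tryptic fragment counts as a match if its mass lies within ±0.07% of a predicted mass, can be sketched as follows. The function name `match_masses` and the mass values are hypothetical; only the tolerance is taken from the text.

```python
def match_masses(observed, theoretical, tolerance_percent=0.07):
    """Return (observed, theoretical) pairs agreeing within a relative tolerance."""
    matches = []
    for obs in observed:
        for theo in theoretical:
            deviation_percent = abs(obs - theo) / theo * 100.0
            if deviation_percent <= tolerance_percent:
                matches.append((obs, theo))
                break  # count each observed peak at most once
    return matches

# Hypothetical masses in Da, for illustration only.
observed_peaks = [1000.5, 1500.0, 2002.0]
predicted_fragments = [1000.0, 1499.2, 2000.0]
hits = match_masses(observed_peaks, predicted_fragments)
print(len(hits), "of", len(observed_peaks), "observed peaks matched")
```

With a relative tolerance, the permitted absolute error grows with fragment mass, so the third peak above (2 Da away from a 2000-Da prediction, or 0.1%) is rejected even though smaller absolute errors on the first two peaks are accepted.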


Fig. 7.   Expression of pro-CPA5 in Sf9 cells using the baculovirus system. A, aliquots of the cell pellets (Cell) or media (Sec.) from CPA5 or wild type (wt) virus-infected cells were fractionated on a denaturing polyacrylamide gel, transferred to nitrocellulose, and probed with a mouse antibody to the His6 tag (the His6 tag was included on the C terminus of the expressed protein). B, aliquots of pro-CPA5 (A5) or wild type baculovirus-infected Sf9 cell homogenates (Hom.), soluble fraction (Sol.), 1 M NaCl extract (Salt), 1 M NaCl plus 1% Triton X-100 extract (Salt/Tx), or pellet remaining after the extractions were analyzed on a denaturing polyacrylamide gel stained with Coomassie Blue. The arrowheads indicate the 45-kDa band in the pro-CPA5-expressing cell fractions that is not detected in the wild type virus cell fractions. C, the 1 M NaCl extract of the pro-CPA5-expressing virus-infected cells was dialyzed and centrifuged, and an aliquot of the pellet was analyzed by denaturing polyacrylamide gel electrophoresis. The 45-kDa band shown by the arrow was identified as pro-CPA5 by tryptic fragment fingerprinting with mass spectrometry. For all panels, the positions of prestained molecular weight markers (Invitrogen) are indicated.

No enzyme activity was detected with pro-CPA5 using two common CPA substrates, FA-Gly-Leu and hippuryl-Phe (not shown). Attempts to convert pro-CPA5 into the active form using purified furin, PC7, and trypsin were unsuccessful (not shown). However, when pro-CPA5 was incubated with medium from PC4-expressing baculovirus-infected cells, the immunoreactive 45-kDa band was greatly reduced, although a smaller immunoreactive band did not appear (not shown). The absence of a smaller band suggests that either the PC4 cleaved all or a portion of the His6 epitope tag, or else the activated CPA5 was able to cleave the C-terminal His6 epitope tag. The PC4-digested pro-CPA5 was enzymatically active toward FA-Gly-Leu (Fig. 8A). Neither PC4 medium alone nor pro-CPA5 incubated with wild type virus-infected medium showed enzyme activity toward the CPA substrate (Fig. 8A). Activity was severalfold greater at pH 7.4 than at pH 5.6 (Fig. 8B). The amount of FA-Gly-Leu cleaving activity detected with 125 ng of pro-CPA5 (digested with PC4) was comparable with the activity observed with 20 ng of bovine pancreatic type 1 CPA1 using the same substrate and buffer conditions.


Fig. 8.   Enzymatic assays with the standard CPA substrate FA-Gly-Leu. A, partially purified pro-CPA5 (from Fig. 7, lane C) was incubated with medium from High-Five cells expressing PC4 (triangles) or control medium (circles) overnight at 37 °C. Then 5 µl of the reaction (corresponding to ~250 ng of pro-CPA5 before incubation) was combined with 0.5 mM FA-Gly-Leu in 0.1 M Tris-HCl, pH 7.4, and the reaction was followed at 336 nm. In addition to the two samples with pro-CPA5, a control with PC4 alone was also tested (squares). B, approximately 125 ng of pro-CPA5 incubated with PC4 was combined with 0.5 mM FA-Gly-Leu in a 0.1 M concentration of either Tris-HCl, pH 7.4, or sodium acetate, pH 5.6, and the reaction followed at 336 nm. For both panels, relative activity indicates the decrease in absorption; units are standard absorption units. Control reactions with porcine CPA1 gave a maximal decrease in absorption of 0.6 units for the same concentration of substrate.


    DISCUSSION

Of the three CP genes identified in the present study, only two of them were expected based on previous data base searches of mRNA-based cDNA libraries. However, none of these mRNA-based sequences predicted an active CP protein due to either frameshift errors or incomplete sequence information. One of the major contributions of the present study is the identification of the putative genes for the full coding regions of all three CP-like proteins. It is unlikely that there are any more human metallo-CP-like proteins with significant amino acid sequence similarities to either A/B or N/E subfamily members; numerous searches of genome and cDNA libraries using a variety of query CP sequences and search parameters failed to turn up any additional hits. Thus, there are a total of 17 metallo-CP family members, with nine of these in the A/B subgroup and eight in the N/E subgroup (although CPD consists of three CP-like domains, so the total number of CP domain sequences is 19). However, it is possible that additional human proteins have structural homology to the CP family that is not detected by the amino acid homology searches used in the present study.

The locations of introns in the coding regions of the various members of the A/B subfamily of CP are generally conserved in all of the previously identified CPs as well as the three novel ones found in the present study (although the exons encoding the putative pro region of CPO could not be identified, so no conclusions about the intron locations within this region can be made). The one exception is the intron between exons 3 and 4 (i.e. intron 3), which is absent from CPA1. All other previously identified members of the A/B subfamily contain intron 3, but the exact position differs by about 20 nucleotides. The three novel CPs all have an intron 3 in a position that matches one of the previously identified members. There is little similarity between A/B and N/E subfamily members regarding intron positions. Only one of the intron positions present in A/B subfamily members is conserved in CPE (49). Within the N/E subfamily, there is less conservation of the intron positions than within the A/B subfamily.

It is likely that the predicted intron splicings are correct because the sites exactly match one of the previously identified CP genes, the splice sites all obey the GT-AG rule, and, for CPA5 and CPA6, cDNA sequences that cover much or all of the predicted coding region are present in the data base. Of the three additional CPs described in this study, CPA5 had the most matches to cDNA clones in the data base, including a nearly full-length mouse cDNA sequence of unknown function. One reason that no function was previously ascribed to this cDNA is presumably due to the frameshift errors early in the sequence. The protein predicted from this incorrect cDNA sequence has no substantial homology to metallo-CPs. Based on our isolation and sequencing of the mouse cDNA for CPA5, it is clear that there is a long open reading frame with 83% amino acid sequence identity to the predicted human CPA5. This level of conservation is similar to that for human and rat CPB (80%), CPA1 (85%), and CPA2 (89%). Importantly, all of the active site residues that are different between CPA5 and the other members of this family are well conserved in both human and mouse CPA5 sequences.
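The GT-AG rule mentioned above states that an intron begins with the dinucleotide GT and ends with AG on the sense strand. A minimal check of a predicted intron against this rule might look like the following; the function name `obeys_gt_ag` and the example sequences are hypothetical and are not taken from the genes analyzed here.

```python
def obeys_gt_ag(intron):
    """True if a sense-strand intron sequence starts with GT and ends with AG."""
    seq = intron.upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")

# Hypothetical intron sequences, for illustration only.
print(obeys_gt_ag("gtaagtcccttttcag"))
print(obeys_gt_ag("gcaagtcccttttcag"))
```

Passing this check is a necessary condition for a canonical splice site but not evidence of splicing on its own, which is why agreement with known CP gene structures and with cDNA sequences is also cited.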

Although full-length sequences of CPA6 have not been reported in GenBank™, it is clear that this gene also is expressed because of the matches to cDNA sequences in the data base and our RT-PCR results. The longest of these GenBank™ cDNA sequences corresponds to the first eight exons of CPA6 but then continues into the putative intron and would not therefore encode an active CP due to the absence of the catalytically important exons 9-11. Using RT-PCR, we confirmed that exons 9-11 are spliced to the other exons as predicted based on the amino acid homology considerations shown in Fig. 2. Although no matches were found to the C-terminal portion of CPA6 in mammalian cDNA data bases, several matches were found to this region in chicken (Gallus gallus) and frog (Xenopus laevis) cDNA data bases. The amino acid sequence identity is extremely high, with 95% identity over 128 amino acids for the longer of the two chicken clones and 89% identity over 113 residues for a similar region of the frog sequence. Most importantly, this region contains the critical substrate-binding residue Met383 (corresponding to Ile255 of the active form of CPA1). The presence of a Met in this position is predicted to give the protein a unique CPA-like specificity, and the conservation of this site in the chicken and frog sequences implies that they represent homologs of human CPA6 and that this protein has been highly conserved during evolution. Interestingly, two of the 11 Drosophila CPA/B-like genes have a Met in this position, but previously no known mammalian CPs had this residue. It is not clear what effect this Met will have on substrate specificity, other than imparting a preference for C-terminal aliphatic or aromatic residues (discussed below).

Based on modeling of the proteins, and in particular the active site regions, it was predicted that CPA5 and CPA6 would cleave aliphatic/aromatic amino acids, with a preference of CPA5 for smaller side chains. For CPA5, this prediction was confirmed by the expression of pro-CPA5 and subsequent activation with PC4 into an enzyme that cleaved FA-Gly-Leu. The amount of enzyme activity detected was reasonable, based on the amount of product formed and the amount of pro-CPA5 in the original incubation with PC4. However, without a pure preparation of soluble enzyme, it is not possible to do kinetic analysis to examine the substrate specificity of CPA5. Part of the problem is the low recovery of soluble pro-CPA5 from the baculovirus system. Although the baculovirus system produces soluble or extractable forms of CPE and CPD, baculovirus-produced CPZ cannot be extracted from the particulate fraction even with high salt and detergent containing buffers (46). For CPZ, the nonextractable form is enzymatically active (46), but for CPE and CPD the nonextractable form is inactive and presumably represents misfolded protein. In addition to the problem of extracting pro-CPA5 and maintaining its solubility, it was also difficult to activate pro-CPA5. The inability of furin to cleave the pro domain was unexpected. It is possible that the structure of this region prevented furin from cleaving, despite the reasonable match of this sequence to the consensus furin cleavage site (39). Both pro-CPA5 and pro-CPA6 have a long, five-turn α-helix at the region connecting the pro and enzyme moieties, being in that sense more similar to pro-CPA2 than to pro-CPA1 or pro-CPB. The finding that PC4 cleaves pro-CPA5 is consistent with the expression of both mRNAs in testis germ cells. However, CPA5 appears to have a broader distribution than PC4, with low levels of CPA5 mRNA detected in pituitary using in situ hybridization.
In addition, low levels of CPA5 mRNA are presumably present in the brain, based on the presence of several CPA5 sequences in human adult brain EST data bases but no detectable signal on a Northern blot of mouse brain RNA (Fig. 5). Because PC4 is not thought to be expressed in these tissues (34), if active CPA5 is produced in brain and/or pituitary it must be generated by another endopeptidase.

While the predicted active sites of CPA5 and CPA6 are generally similar to other CPAs, the predicted active site of CPO is unique. CPO resembles CPB in the nature of the amino acids that form the subsites for substrate anchoring but with significant substitutions at two residues that interact with the C-terminal side chain: Gly243 and Asp255 (CPA1 active form numbering). In CPO, these residues are substituted by Ser and Arg, indicating that the substrate binding funnel is of similar nature but the specificity is reversed. Thus, CPO probably cleaves C-terminal acidic amino acids, a specificity not yet described for any known CP. Interestingly, searches of the Drosophila and C. elegans genomes revealed metallo-CP-like genes with Lys and His in the position corresponding to the active site residue 255; it is therefore possible that these proteins have specificities similar to that of human CPO.
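The reasoning above, in which the charge class of the residue at position 255 (CPA1 active form numbering) is taken to favor substrates with a complementary C-terminal side chain, can be written as a simple lookup. This is a deliberately crude sketch of the argument in the text and not a validated predictor; the residue groupings and the function name `predict_specificity` are assumptions made for illustration.

```python
# Residue classes (one-letter codes); the groupings are a simplifying assumption.
ACIDIC = set("DE")
BASIC = set("KRH")
HYDROPHOBIC = set("AVLIMFWY")

def predict_specificity(residue_255):
    """Guess the preferred C-terminal residue class from the residue at position 255."""
    r = residue_255.upper()
    if r in ACIDIC:
        return "basic C-terminal residues (CPB-like)"
    if r in BASIC:
        return "acidic C-terminal residues (CPO-like)"
    if r in HYDROPHOBIC:
        return "hydrophobic C-terminal residues (CPA-like)"
    return "no prediction"

# Asp as in CPB, Arg as in CPO, Ile as in CPA1 (residues named in the text).
for residue in ("D", "R", "I"):
    print(residue, "->", predict_specificity(residue))
```

A single position cannot capture the contribution of the other pocket residues (for example positions 243, 253, and 268), so such a rule is only a first approximation to the structure-based argument.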

A key question concerns the function of the three CPs described in the present study. One way to approach this is to consider whether there are any known CP cleavages that cannot be explained by existing enzymes. One such cleavage has been observed for human β-endorphin 1-26. This peptide is presumably formed from β-endorphin 1-27, which is itself generated from β-endorphin 1-31 by an endopeptidase followed by the action of CPE or CPD (50, 51). Whereas the conversion of rat β-endorphin 1-27 into 1-26 requires cleavage of His27, which is slowly catalyzed by CPE (52), for human β-endorphin this requires cleavage of Tyr27. Neither CPE nor CPD cleaves C-terminal Tyr residues, so it is likely that an unknown CPA-like enzyme performs this cleavage. It is possible that CPA5 is this β-endorphin-processing enzyme; it is presumably expressed in the secretory pathway due to the presence of the signal peptide, and it is also detected in low levels in mouse pituitary, where it appears broadly distributed. Thus, this enzyme should be present in the β-endorphin-producing corticotrophs. The detection by RT-PCR of CPA5 mRNA in the AtT-20 corticotroph cell line further strengthens this possibility. However, this would not explain the function of testis germ cell CPA5. While β-endorphin is known to be also expressed in testis, it is restricted to the Leydig cells and not the germ cells (53). It is possible that testis CPA5 is involved in processing intercellular signaling molecules formed after the concerted action of PC4 and CPD. Speculation on the functions of CPA6 and CPO requires more detailed information regarding the tissue and cellular distribution of these proteins.

    ACKNOWLEDGEMENTS

We thank Dr. Majambu Mbikay for providing PC4-expressing cell medium for testing with pro-CPA5 and Dr. Fa-Yun Che for assisting with mass spectrometry.

    FOOTNOTES

* This work was primarily supported by National Institutes of Health Grant R01 DK51271, and also by Grants R01 DA04494 and K02 DA00194 (to L. D. F.), by Ministerio de Ciencia y Tecnología, Spain Grants BIO98-0362 and 2FD97-0872 and the Centre de Referència en Biotecnologia of the Generalitat de Catalunya, Spain (to F. X. A.), and by grants from the Canadian Institutes of Health Research and a fellowship from the Fonds de la Recherche en Santé du Québec (to R. D.). The DNA sequencing facility of the Albert Einstein College of Medicine is supported in part by Cancer Center Grant CA13330. The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

|| To whom correspondence should be addressed: Dept. of Molecular Pharmacology, Albert Einstein College of Medicine, 1300 Morris Park Ave., Bronx, NY 10461. Tel.: 718-430-4225; Fax: 718-430-8954; E-mail: fricker@aecom.yu.edu.

Published, JBC Papers in Press, February 8, 2002, DOI 10.1074/jbc.M112254200

2 P. J. Pereira, S. Segura-Martin, C. Ferrer-Orta, J. Vendrell, F. X. Aviles, M. Coll, and F. X. Gomis-Ruth, manuscript in preparation.

    ABBREVIATIONS

The abbreviations used are: CP, carboxypeptidase; FA, furylacryloyl; PC4 and PC7, prohormone convertase 4 and 7, respectively; r.m.s., root mean square; RT, reverse transcription; EST, expressed sequence tag; bis-Tris, 2-[bis(2-hydroxyethyl)amino]-2-(hydroxymethyl)propane-1,3-diol.

    REFERENCES

1. Barrett, A. J., Rawlings, N. D., and Woessner, J. F. (1998) in Handbook of Proteolytic Enzymes (Barrett, A. J., Rawlings, N. D., and Woessner, J. F., eds), pp. 1318-1320, Academic Press, Inc., San Diego
2. Auld, D. S. (1998) in Handbook of Proteolytic Enzymes (Barrett, A. J., Rawlings, N. D., and Woessner, J. F., eds), pp. 1321-1326, Academic Press, Inc., San Diego
3. Auld, D. S. (1998) in Handbook of Proteolytic Enzymes (Barrett, A. J., Rawlings, N. D., and Woessner, J. F., eds), pp. 1326-1328, Academic Press, Inc., San Diego
4. Hendriks, D. F. (1998) in Handbook of Proteolytic Enzymes (Barrett, A. J., Rawlings, N. D., and Woessner, J. F., eds), pp. 1328-1330, Academic Press, Inc., San Diego
5. Springman, E. B. (1998) in Handbook of Proteolytic Enzymes (Barrett, A. J., Rawlings, N. D., and Woessner, J. F., eds), pp. 1330-1333, Academic Press, Inc., San Diego
6. Aviles, F. X., and Vendrell, J. (1998) in Handbook of Proteolytic Enzymes (Barrett, A. J., Rawlings, N. D., and Woessner, J. F., eds), pp. 1333-1335, Academic Press, Inc., San Diego
7. Fricker, L. D. (1998) in Handbook of Proteolytic Enzymes (Barrett, A. J., Rawlings, N. D., and Woessner, J. F., eds), pp. 1341-1344, Academic Press, Inc., San Diego
8. Skidgel, R. A., and Erdos, E. G. (1998) in Handbook of Proteolytic Enzymes (Barrett, A. J., Rawlings, N. D., and Woessner, J. F., eds), pp. 1344-1347, Academic Press, Inc., San Diego
9. Skidgel, R. A. (1998) in Handbook of Proteolytic Enzymes (Barrett, A. J., Rawlings, N. D., and Woessner, J. F., eds), pp. 1347-1349, Academic Press, Inc., San Diego
10. Fricker, L. D. (1998) in Handbook of Proteolytic Enzymes (Barrett, A. J., Rawlings, N. D., and Woessner, J. F., eds), pp. 1349-1351, Academic Press, Inc., San Diego
11. Reznik, S. E., and Fricker, L. D. (2001) Cell. Mol. Life Sci. 58, 1790-1804
12. Song, L., and Fricker, L. D. (1997) Biochem. J. 323, 265-271
13. Gomis-Ruth, F. X., Companys, V., Qian, Y., Fricker, L. D., Vendrell, J., Aviles, F. X., and Coll, M. (1999) EMBO J. 18, 5817-5826
14. Aloy, P., Companys, V., Vendrell, J., Aviles, F. X., Fricker, L. D., Coll, M., and Gomis-Ruth, F. X. (2001) J. Biol. Chem. 276, 16177-16184
15. Eipper, B. A., Park, L. P., Dickerson, I. M., Keutmann, H. T., Thiele, E. A., Rodriguez, H., Schofield, P. R., and Mains, R. E. (1987) Mol. Endocrinol. 1, 777-790
16. Xin, X., Varlamov, O., Day, R., Dong, W., Bridgett, M. M., Leiter, E. H., and Fricker, L. D. (1997) DNA Cell Biol. 16, 897-909
17. Xin, X., Day, R., Dong, W., Lei, Y., and Fricker, L. D. (1998) DNA Cell Biol. 17, 311-319
18. Xin, X., Day, R., Dong, W., Lei, Y., and Fricker, L. D. (1998) DNA Cell Biol. 17, 897-909
19. He, G. P., Muise, A., Li, A. W., and Ro, H. S. (1995) Nature 378, 92-96
20. Nielsen, H., Engelbrecht, J., Brunak, S., and von Heijne, G. (1997) Protein Eng. 10, 1-6
21. Thompson, J. D., Higgins, D. G., and Gibson, T. J. (1994) Nucleic Acids Res. 22, 4673-4680