Originally published In Press as doi:10.1074/jbc.M112254200 on February 8, 2002
J. Biol. Chem., Vol. 277, Issue 17, 14954-14964, April 26, 2002
Identification and Characterization of Three Members of the Human
Metallocarboxypeptidase Gene Family*
Suwen Wei, Sonia Segura§, Josep Vendrell§, Francesc X. Aviles§, Edith Lanoue¶, Robert Day¶, Yun Feng, and Lloyd D. Fricker
From the
Department of Molecular Pharmacology, Albert
Einstein College of Medicine, Bronx, New York 10461, § Institut de Biotecnologia i Biomedicina and Departament de
Bioquímica i Biologia Molecular, Universitat Autònoma
de Barcelona, 08193 Bellaterra, Barcelona, Spain, and
¶ Departement de Pharmacologie, Institut de Pharmacologie et
Faculté de Médecine, Université de Sherbrooke,
Sherbrooke, Québec J1H 5N4, Canada
Received for publication, December 21, 2001, and in revised form, February 5, 2002
ABSTRACT
Amino acid homology
searches of the human genome revealed three members of the
metallocarboxypeptidase (metallo-CP) family that had not been
described in the literature in addition to the 14 known genes. One of
these three, named CPA5, is present in a gene
cluster with CPA1, CPA2, and CPA4
on chromosome 7. The cDNA encoding a mouse homolog of human
CPA5 was isolated from a testis library and sequenced. The
deduced amino acid sequence of human CPA5 has highest amino acid
sequence identity (60%) to CPA1. Modeling analysis shows the overall
structure to be very similar to that of other members of the A/B
subfamily of metallocarboxypeptidases. The active site of CPA5 is
predicted to cleave substrates with C-terminal hydrophobic residues, as
do CPA1, -2, and -3. Northern blot analysis detects CPA5 mRNA
in testis but not in kidney, liver, brain, or lung. In
situ hybridization analysis shows that CPA5 is localized to
testis germ cells. Mouse pro-CPA5 protein expressed in Sf9 cells
using the baculovirus system was retained in the particulate fraction
of the cells and was not secreted into the media. Pro-CPA5 was not
enzymatically active toward standard CPA substrates, but after
incubation with prohormone convertase 4 the resulting protein was able
to cleave furylacryloyl-Gly-Leu, with 3-4-fold greater activity at pH
7.4 than at 5.6. Two additional members of the human CP gene
family were also studied. Modeling analysis indicates that both contain
the necessary amino acids required for enzymatic activity. The CP on
chromosome 8 is predicted to have a CPA-like specificity for C-terminal
hydrophobic residues and was named CPA6. The CP on
chromosome 2 is predicted to cleave substrates with C-terminal acidic
residues and was named CPO.
 |
INTRODUCTION |
Carboxypeptidases (CPs)1
serve many important functions in a variety of organisms. Previously,
there were 14 known members of the human metallo-CP gene family with
functions ranging from the digestion of food to the selective
biosynthesis of neuroendocrine peptides (1-11). All metallo-CPs,
including those from bacteria and plants, can be divided into one of
two subfamilies based on consideration of the overall domain structure
as well as on amino acid sequence similarities. One subfamily, which
includes the digestive enzymes CPA1, CPA2, and CPB1 has been previously
referred to as the "pancreatic" subfamily but is more appropriately
referred to as the "A/B" subfamily. All members of the A/B
subfamily contain an ~90 amino acid-long N-terminal "pro" region
that functions as a chaperone to assist with the folding of the active
CP domain. In addition, this pro domain also functions as an inhibitor
of the CP, with full activity requiring endopeptidase cleavage of the
pro domain in one or more places. Cleavage releases the active CP
domain, which is ~300 residues long and does not contain any additional domains.
The other subfamily of metallo-CPs includes CPN, CPE, CPM,
CPD, and others (1, 11). Although this subfamily has been referred to
as "regulatory" CPs, this name is not ideal, because most of these
enzymes are poorly regulated, whereas one of the members of the A/B
subfamily, CPB2 or CPU, is highly regulated (4). Thus, this second
subfamily has been recently renamed the N/E subfamily to reflect the
first two members that were discovered (11). In contrast to the A/B
subfamily members, all members of the N/E subfamily contain several
domains in addition to the active CP domain and the signal peptide
(11). Also, members of the N/E subfamily lack pro domains that function
as enzyme inhibitors; some lack this region entirely (CPM, CPN, CPD),
whereas another member (CPE) contains a short pro domain that has an
unknown function (7-10, 12). A domain common to all members of the N/E subfamily is a transthyretin-like folding domain that is
immediately C-terminal to the active CP domain (13, 14). The function of this domain is not yet clear. In addition to the
transthyretin-like domain, members of the N/E subfamily contain
domains that are thought to function in protein-protein and/or
protein-membrane interactions and that target the protein to specific
compartments within the secretory pathway or outside the cell (11).
The amino acid sequence identities within members of each subfamily
(typically 35-65%) are higher than between subfamilies (15-25%).
The three-dimensional structures of members of each subfamily show many
similarities within the core, although there are major differences
within several of the loops near the active site cleft (14, 15). The
active site and substrate-binding residues are generally conserved
among all members of the two subfamilies that display enzymatic
activity; one or more critical residues are missing in some members of
the N/E subfamily that do not appear to encode active carboxypeptidases
such as CPX1, CPX2, AEBP1, and the third CP-like domain of CPD (11,
16-19).
The objective of the present study was to determine whether additional
metallo-CP genes exist in the human genome. Homology searches were used
to identify three novel human CP-like genes. One of these is related to
the sequence of a mouse testis cDNA clone of unknown function
(GenBank™ accession number AK015256), which we have
named CPA5. The GenBank™ sequence does not
contain a long open reading frame due to a frameshift with respect to
the predicted human genomic sequence; the quality of the mouse
sequence data was not reported. Another of the novel human CP-like
genes was identical within exons 1-8 to a cDNA sequence previously
found in human hematopoietic stem cells (GenBank™
accession number AF221594). However, this putative stem cell protein
would not encode an active CP due to the absence of exons 9-11, which
contain critical residues for enzyme activity. The present search
revealed these missing exons downstream of exon 8, and reverse
transcription (RT)-PCR was used to confirm the predicted exon splicing
of this region of the gene. Finally, a third completely novel CP-like
gene was identified. Although related to cDNA sequences found in
several nonmammalian species, no mammalian cDNAs representing this
gene have been described. Modeling of the three novel CPs was performed
to determine whether these gene products were likely to encode active
CPs and, if so, to predict their substrate preference for aliphatic,
basic, or other amino acids. In addition, the tissue distribution of
pro-CPA5 mRNA was determined, and a preliminary characterization of CPA5
enzymatic activity was performed.
MATERIALS AND METHODS
Data Base Searches--
The NCBI Web site was used to search the
public human genome data base with various human CP sequences using the
tblast-n program and the default parameters. To identify additional
exons and to determine the nucleotide sequence of the exon/intron
junctions, the genomic sequences that corresponded to the novel CP-like
genes on chromosomes 7, 8, and 2 were downloaded from the NCBI site. These sequences were translated in all three reading frames, and the
deduced amino acid sequences were searched for homology to various CPs
using the GenePro program (Hoeffer Scientific). Because the amino acid
sequence similarity within the N-terminal precursor regions of the CPs
is low, these regions could not be identified from this homology-based
approach. Searches of the expressed sequence tag (EST) data base on the
NCBI Web site using the blast-n program revealed matches to the 5'
region of the novel CP-like genes on chromosomes 7 and 8. Several of
these EST sequences extended in the 5' region into the putative signal
peptide and pro region, so they were used to search the human genome
data base and obtain the 5' exons for this N-terminal region.
Predictions on the presence of N-terminal signal peptides and putative
cleavage sites of these peptides were performed using the Signal-P Web
site (20).
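The three-frame translation step described above can be illustrated with a short Python sketch. This is not the GenePro program used in the study; the function names are hypothetical, the standard genetic code is assumed, and only the forward strand is translated (the complementary strand would first need to be reverse-complemented).

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, frame):
    """Translate one forward reading frame (0, 1, or 2); unknown codons give 'X'."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def three_frame_translation(dna):
    """Return the translations of the three forward reading frames."""
    return [translate(dna, frame) for frame in range(3)]


if __name__ == "__main__":
    # Toy sequence for illustration only, not taken from the genome data.
    for frame, protein in enumerate(three_frame_translation("ATGGCCTAA")):
        print(frame, protein)
```

Each translated frame could then be compared against known CP sequences by any homology search tool.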
Modeling of CPs--
The set of template structures used for
modeling the target sequences was chosen in every case based on closest
sequence similarity. The following templates were used: bovine
pro-CPA1, porcine pro-CPA1, and human pro-CPA2 for the proenzyme
sequence of CPA5 and human pro-CPB1, porcine pro-CPB1, human pro-CPA1,
and human pro-CPA2 for the proenzyme sequence of CPA6 and CPO. All of
the proteins were extracted from the Protein Data Bank (Research
Collaboratory for Structural Bioinformatics, Rutgers University, New
Brunswick, NJ) except human
pro-CPB1.2
A preliminary multiple alignment was performed for the three sets of
template proteins plus the target sequence using the program ClustalW
(21) and a BLOSUM 62 matrix for weighting. To correctly align the
target and the template sequences, data on the secondary and tertiary
structures for this set of proteins were added. The secondary structure
of the target proteins was predicted with the program PHD (22), and the
real secondary structure of the templates was calculated with the
program DSSP (23). The three-dimensional structures of the chosen
templates were structurally superimposed by use of the program SSAP
(24), and the multiple alignments of the templates were modified
according to the three-dimensional superimposition of the templates.
Subsequently, the sequence of each target sequence was used to obtain a
final multiple alignment accounting for the structure of the templates, which was manually refined to account again for the structural information obtained from the template superimposed structures plus the
predicted and real secondary structure.
A method of comparative modeling by satisfaction of spatial constraints
was used to build the three-dimensional structure of pro-CPA5,
pro-CPA6, and CPO. This method was implemented using the program
MODELLER (25). The spatial constraints were derived by transferring the
spatial features from the structures of the known proteins to the
sequence of the unknown one. Idealization of bond geometry and removal
of unfavorable nonbonded contacts were performed by energy minimization
with the GROMOS force field for unsolvated systems (26, 27). The model
was refined using 1000 steps of steepest descent. Root mean square
(r.m.s.) deviation calculations and superimposition of the modeled
structures with respect to the crystallographic ones were obtained by
means of the program SSAP. Secondary structure calculations of the
crystallographic structures and the model were performed with DSSP (23)
and compared with the predicted secondary structure. The program
PROSA-II (28) was used to check the quality of the models. This program
allows identification of regions with non-near-native fold by the high positive values of pseudo-potential energy. The model showing the
smallest pseudoenergy was taken for the final modeled structure.
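For reference, the r.m.s. deviation between two sets of equivalent atoms can be sketched as follows. This is a minimal illustration that assumes the two coordinate sets are already superimposed and listed in the same atom order; it does not perform the structural superimposition that SSAP carried out, and the function name is hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equal-length lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


if __name__ == "__main__":
    # Toy coordinates in angstroms, for illustration only.
    model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    crystal = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
    print(round(rmsd(model, crystal), 3))
```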
Isolation of Mouse Testis CPA5 cDNA--
Mouse full-length
CPA5 cDNA was generated by performing high fidelity PCR using
thermostable Taq and the 3'-5' exonuclease activity of
Pwo (Roche Molecular Biochemicals) with a mouse BALB/c testis cDNA library (CLONTECH). The forward
primer ATA AGA ATG CGG CCG CTT GTC TGG AAG GAG AAG C included a
NotI site at its 5'-end. The reverse primer GGG GTA CCG AAT
TCT AGT GAT GAT GAT GAT GAT GAT AGG GGT GAT TCA GGG TGT G included
KpnI and EcoRI sites and a His tag sequence at
its 5' end. The PCR product was purified (PCR purification kit;
Qiagen), digested with NotI and EcoRI, and
subcloned into the baculovirus expression vector pVL1392 (Invitrogen).
To generate a probe for Northern blotting analysis, the purified PCR
product was digested with NotI and HincII and subcloned into pBluescript KS vector (Stratagene). All positive clones
were confirmed by sequencing.
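The design of the two primers can be checked with a short Python sketch. The primer sequences are those given above; the recognition sequences (NotI, GCGGCCGC; KpnI, GGTACC; EcoRI, GAATTC) are the standard ones, and the helper names are hypothetical. Because the reverse primer is antisense, its reverse complement is inspected for the His codons and the stop codon.

```python
SITES = {"NotI": "GCGGCCGC", "KpnI": "GGTACC", "EcoRI": "GAATTC"}

FORWARD = "ATA AGA ATG CGG CCG CTT GTC TGG AAG GAG AAG C".replace(" ", "")
REVERSE = (
    "GGG GTA CCG AAT TCT AGT GAT GAT GAT GAT GAT GAT "
    "AGG GGT GAT TCA GGG TGT G"
).replace(" ", "")


def find_sites(primer):
    """Map each enzyme whose site occurs in the primer to its 0-based offset."""
    return {name: primer.find(site) for name, site in SITES.items() if site in primer}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


if __name__ == "__main__":
    print("forward:", find_sites(FORWARD))
    print("reverse:", find_sites(REVERSE))
    # Sense-strand view of the reverse primer: six His codons, then a stop codon.
    print(reverse_complement(REVERSE))
```

Read on the sense strand, the reverse primer ends the coding sequence with CAT CAT CAT CAT CAT CAC (six His codons) followed by TAG, and then the EcoRI and KpnI sites.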
Northern Blot and RT-PCR Analysis--
Total RNA was prepared
using the lithium chloride method (29), and ~20 µg was fractionated
on an agarose gel containing 2% formaldehyde. Following photography of
the ethidium bromide-stained gel, the RNA was transferred to a
nitrocellulose membrane and probed with CPA5 riboprobe. The
500-nucleotide-long probe was produced using the
HindIII-linearized mouse CPA5 vector described above, T3 RNA
polymerase, and [32P]UTP. Approximately 4 × 10⁷ cpm of probe was hybridized with the blot in 5×
saline-sodium phosphate-EDTA buffer, 5× Denhardt's solution, 50%
(v/v) formamide, 0.1% (w/v) SDS, and 100 µg/ml denatured salmon
sperm DNA at 64 °C overnight. Following the hybridization, the blot
was washed with 1× standard saline citrate (SSC) and 0.1% SDS, 0.5×
SSC plus 0.1% SDS, and then 0.1× SSC plus 0.1% SDS buffer at
72 °C. After washing, the blot was dried and exposed to x-ray film
(Eastman Kodak Co.) for 3 days at −80 °C with an intensifying screen.
RT-PCR was performed using the Titan one tube RT-PCR system (Roche
Molecular Biochemicals). RNA was extracted using the lithium chloride
method (29) from various mouse tissues, AtT-20 cells, or human
mononuclear cells prepared from fetal cord blood as described (30). For
mouse CPA5, the forward and reverse primers were the same as used for
the isolation of CPA5 cDNA from the testis library (above). For
human CPA6, the forward primer was TGG ATG CAT CAT CTG AAT AAA ACT CAC,
and the reverse primer was TCC ATT TTT GTA GGC CCA ATC CAT. For mouse
CPA6, the same forward primer was used, but the reverse primer was
chosen from a region with lower degeneracy: CTT CAC TTT CCA GTT TCT ATT
GGC ATC. The PCR products were purified (PCR purification kit; Qiagen)
and subcloned into the TA cloning vector (Invitrogen). The resulting
plasmids were sequenced in both directions.
In Situ Hybridization Analysis--
All experiments carried out
in this study used antisense cRNA probes and sense probes for controls.
Probes were synthesized by in vitro transcription of
linearized plasmid DNA (CPA5 in pBluescript, described above), and
labeling was done by incorporation of radioactive [35S]UTP and [35S]CTP nucleotides (Amersham
Biosciences) as described (31). The in situ hybridization
studies were carried out using antisense and sense
[35S]UTP/[35S]CTP-labeled cRNA probes.
Radioactive probes were diluted to 33 × 10³ dpm/µl
and in situ hybridization was performed as previously described (31). CD1 mice were sacrificed by rapid decapitation, and
tissues were rapidly removed and frozen in isopentane cooled to
−35 °C. The extracted tissues were stored at −80 °C until
cryosectioning. Frozen 10-µm sections were cut on a Reichert cryostat
(Leica Microsystems, Depew, NY), thaw-mounted on polylysine-coated
glass slides, and stored at −80 °C until processing. These sections
were submitted to the standard in situ hybridization
procedure (31) followed by x-ray film autoradiography for 4 days to
obtain low resolution images. These slides were subsequently dipped in
NTB-2 nuclear emulsion (Eastman Kodak Co.) to obtain cellular
resolution. The sections were counterstained with cresyl violet,
cleared in xylene, and mounted with Permount histological mounting
medium (Fisher).
Expression of Mouse Pro-CPA5 in the Baculovirus
System--
Recombinant baculovirus expressing mouse pro-CPA5 was
generated using Baculoplatinum DNA (Orbigen). Approximately one million Sf9 cells growing in Sf900-II serum-free medium
(Invitrogen) were co-transfected with 5 µg of the pro-CPA5 pVL1392
vector along with 0.5 µg of Baculoplatinum DNA using the standard
calcium phosphate procedure (18). Amplifications of the virus were
performed as previously described (18). Sf9 cells from 450 ml of
culture were recovered by centrifugation at 50,000 × g
at 4 °C for 30 min. The cells were homogenized (Polytron; Brinkman)
in 100 ml of 0.1 M Tris-Cl, pH 7.4, and centrifuged at
50,000 × g for 30 min. The pellet was homogenized in
0.1 M Tris-Cl containing 1 M NaCl and
centrifuged at 50,000 × g. This pellet was homogenized in 0.1 M Tris-Cl containing both 1 M NaCl and
1% Triton X-100 and centrifuged again as above. The final pellet was
resuspended in Tris buffer, and aliquots of each supernatant and the
final pellet were analyzed on two denaturing polyacrylamide gels. One gel was stained with Coomassie while the other was transferred to
nitrocellulose and probed with a mouse monoclonal antibody to the
His6 epitope (Invitrogen).
The supernatant from the 1 M NaCl extract was dialyzed
against 10 mM Tris-Cl, pH 7.4, overnight and centrifuged at
50,000 × g for 30 min at 4 °C. Both supernatant and
pellet were analyzed on denaturing polyacrylamide gels (both protein
and Western blot analysis, as above). The major protein bands in the
precipitate fraction were cut from the gel, digested with trypsin
(Promega), and subjected to mass spectrometry as previously described
(32). The resulting fragments were compared with the theoretical masses of pro-CPA5 tryptic fragments predicted using the Paws computer program.
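The prediction of theoretical tryptic fragment masses can be sketched in Python as follows. This is a stand-in for illustration, not the Paws program: it assumes cleavage C-terminal to Lys or Arg except when the next residue is Pro, no missed cleavages, unmodified residues, and standard monoisotopic residue masses. The function names are hypothetical.

```python
# Monoisotopic residue masses (Da); one water is added per free peptide.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def tryptic_peptides(protein):
    """Split after K or R unless the following residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def monoisotopic_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[residue] for residue in peptide) + WATER


if __name__ == "__main__":
    # Toy sequence for illustration only, not the pro-CPA5 sequence.
    for peptide in tryptic_peptides("GAKRPLKE"):
        print(peptide, round(monoisotopic_mass(peptide), 4))
```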
Characterization of Pro-CPA5--
Because the gel analysis
revealed the size of the expressed protein to be that of pro-CPA5 and
not the predicted active form, attempts were made to process the
pro-CPA5 using a variety of endopeptidases. The pro-CPA5 could not be
purified using the Talon His6 purification procedure
(CLONTECH). Because the pellet after the dialysis
of the 1 M NaCl cell extracts contained pro-CPA5 as one of
the major components (based on the tryptic fingerprinting results),
this material was used for subsequent studies on pro-CPA5. Approximately 5-10 µg of pro-CPA5 (estimated from the
Coomassie-stained protein gel) was incubated with a variety of
endoproteases in the following buffers: porcine trypsin (Promega), 0.1 M Tris, pH 7.4; purified baculovirus-expressed human furin
(33), 0.1 M HEPES, pH 7.5, with 1 mM
CaCl2, 1 mM 2-mercaptoethanol, and 0.5 mg/ml
bovine serum albumin; purified baculovirus-expressed human PC7 (33), 20 mM bis-Tris, pH 6.5, with 1 mM
CaCl2; and medium from High-Five cells expressing
PC4 (the gift of Dr. Majambu Mbikay, Loeb Health Research Institute,
Ottawa, Canada) (34), 0.1 M Tris, pH 7.0, with 2 mM CaCl2. For furin and the PCs, enzyme
activity was verified with the standard substrate
pyro-Glu-Arg-Thr-Lys-Arg-7-amido-4-methylcoumarin (34).
To measure CP activity, 5-10 µl of each sample (corresponding to
125-250 ng of pro-CPA5 prior to incubation with PC4) was added to 0.5 ml of 500 µM furylacryloyl-Gly-Leu (FA-Gly-Leu) in either
0.1 M Tris-Cl, pH 7.4, or 0.1 M sodium acetate,
pH 5.5, at 25 °C. The enzyme reaction was followed by measuring the
decrease in absorption at 336 nm; controls with bovine CPA1
(Sigma) showed a maximal change of 0.6 absorption units for the batch
of substrate used.
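As a rough illustration of how an absorbance change in this assay could be converted to the amount of substrate cleaved, the following Python sketch uses the values stated above (0.5 ml of 500 µM FA-Gly-Leu and a maximal change of 0.6 absorption units). It assumes that the absorbance decrease is linearly proportional to the fraction of substrate hydrolyzed, which is an assumption of this sketch and not a statement from the study; the function name is hypothetical.

```python
SUBSTRATE_UM = 500.0   # FA-Gly-Leu concentration (µM), from the text
VOLUME_ML = 0.5        # assay volume (ml), from the text
MAX_DELTA_A = 0.6      # maximal absorbance change with bovine CPA1, from the text


def substrate_hydrolyzed_nmol(delta_a):
    """Estimate nmol of substrate cleaved from the magnitude of the absorbance decrease.

    Assumes linear proportionality between absorbance change and hydrolysis.
    """
    total_nmol = SUBSTRATE_UM * VOLUME_ML          # µM × ml = nmol
    fraction = min(max(delta_a / MAX_DELTA_A, 0.0), 1.0)
    return fraction * total_nmol


if __name__ == "__main__":
    # Hypothetical reading, for illustration only.
    print(substrate_hydrolyzed_nmol(0.3))
```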
RESULTS
Data Base Searches and Alignments of CPs--
When CPE was used to
search the public human genome data base using the tblast-n program,
each known member of the N/E subfamily was detected, but no additional
genes were found (Table I). None of the
N/E subfamily genes appeared to be present as a cluster; although
chromosome 4 contains both CPE and CPZ, and
chromosome 10 contains both CPN and CPX2, these
genes are located at a considerable distance from each other within the
chromosome. Similarly, none of the members of the N/E subfamily were
located in close proximity to members of the A/B subfamily. Searches
with CPE produced a limited number of hits to members of the A/B
subfamily, as expected due to the generally low amino acid sequence
identity between subfamilies.
Searches with CPA1 and CPB1 yielded generally similar results to each
other, with hits to genes on chromosomes 7, 3, 8, 13, and 2 (Table I).
The four genes found on chromosome 7 are found within a cluster, with
the coding region of each gene located 3-23 kb apart from the coding
region of the adjacent gene (Fig. 1). The
genes for the pancreatic proteins CPA1 and CPA2 are located at the ends
of this cluster. A cDNA sequence reported in the literature as CPA3
(35) and renamed CPA4 by the human genome nomenclature committee is
also present in this cluster (Fig. 1). This sequence presumably encodes
an active CPA-like enzyme based on consideration of the active site
region; the enzymatic properties have not yet been reported. In
addition, a novel human CP-like gene was found within this cluster and
named CPA5 based on predictions of the active site region
(discussed below). The genomic sequence of the appropriate region of
chromosome 7 was searched in all three reading frames for homology to
various CPs using the GenePro program. This analysis identified 11 exons for the entire predicted coding region of CPA5 (exons
3-13; Fig. 1), consistent with the 11 exons found in most other
members of the A/B subfamily of CPs (36, 37). All of the introns in the
coding region of CPA5 begin with GT and end with AG, and
their locations exactly match those in other members of the A/B
subfamily.
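The GT/AG check applied to the predicted introns is simple enough to express as a short Python sketch. The function name and the example sequences are hypothetical and serve only to illustrate the rule.

```python
def obeys_gt_ag(intron):
    """True if the intron sequence starts with GT and ends with AG."""
    sequence = intron.upper()
    return len(sequence) >= 4 and sequence.startswith("GT") and sequence.endswith("AG")


if __name__ == "__main__":
    # Toy intron sequences, for illustration only.
    examples = {"canonical": "GTAAGTccctttcAG", "noncanonical": "GGAAGTccctttcAG"}
    for label, intron in examples.items():
        print(label, obeys_gt_ag(intron))
```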

Fig. 1.
Gene structure of CP-like genes on
chromosomes 7, 8, and 2. The length of each intron is indicated
(kb), and the widths of the lines representing
the exons correspond to their sizes (using a different scale from the
intron size). The CPA5 gene on chromosome 7 is located in a
cluster with CPA2, CPA4, and CPA1. The
distance between the genes is indicated (kb). Within CPA5,
exons 1 and 2A are found in all sequences in the GenBank™
EST data base with homology to this region of the gene, whereas exon 2B
is only present in a single EST sequence from a testis cDNA
library. The CP-like gene on chromosome 8 has been named
CPA6. The putative CP on chromosome 2 (CPO) does
not match any sequences in the public data base, so the N-terminal
exons could not be identified from the homology search due to lower
sequence similarity in this region. The exons were numbered to reflect
the missing exons; nearly all other members of the CPA/B subfamily have
three additional N-terminal exons.
CPA5 has >98% nucleotide identity with a total of 13 sequences in the GenBank™ human EST data base (as of
November 30, 2001). Of these, nine were from libraries that included
RNA from testis, three were from libraries of adult brain, and one was
from fetal heart. One of the testis sequences, and all three of the
brain sequences were ~560 nucleotides longer in the 5' direction than
the coding sequence predicted from homology to other CPs. All four of
these clones contained two additional 5' noncoding exons (exons 1 and 2A; Fig. 1), and the testis clone contained a third noncoding exon
(exon 2B; Fig. 1). The intron/exon junctions of the two common exons
obeyed the GT/AG rule, whereas the junction of the exon found only in
the testis clone had a GG in place of the GT.
A mouse cDNA clone with 84% nucleotide sequence identity to human
CPA5 was present in the NR data base (GenBank™
accession number AK015256). However, this GenBank™ mouse
sequence does not encode a CP due to frameshifts in the mouse sequence
relative to the long CP-like open reading frame of the human sequence.
PCR was performed using a mouse testis cDNA library as template and
oligonucleotides based on the sequence of the mouse cDNA clone.
Sequence analysis of the resulting cDNA clones revealed four
differences from the GenBank™ entry; one of these is a
silent change, another changes Leu121 to Met (which matches
the human sequence), and most importantly, there are two missing
residues that shift the reading frame of the mouse protein to exactly
match that of the predicted human protein.
The mouse CPA5 cDNA clone, like the human CPA5 sequences in the EST
data base, extends ~560 nucleotides upstream of the coding region.
The human upstream region contains eight ATGs, and the mouse region
contains nine; many of these are conserved between the human and mouse
sequences. However, only one of these upstream ATG motifs in either
sequence is predicted to be a good consensus sequence for translation
initiation (38), and this is followed by an in-frame stop codon. The
second ATG that is predicted to be a good consensus sequence for
translation initiation encodes a protein of 436 amino acids for
both human and mouse sequences. The nucleotide sequence surrounding the
ATG at the start of this long open reading frame is highly conserved
between human and mouse, whereas the upstream ATGs show considerably
less sequence similarity within the surrounding regions.
In addition to CPA5, two other sequences that have not
previously been reported in the literature were detected on chromosomes 8 and 2 (Fig. 1). The first eight exons of the CP-like gene on chromosome 8 exactly match the sequence of a cDNA clone reported in
GenBank™ (accession number AF221594). However, this
GenBank™ entry lacks the sequence corresponding to exons
9-11 and has a 3'-end that does not have any homology with known
carboxypeptidases. Instead, this 3'-end perfectly matches the putative
intronic region found in the human genome sequence. Searches of the
downstream genomic sequence for CP-like domains revealed the missing
exons ~50 kb away. Although large for an intron, this is considerably smaller than the first and second introns, which are 121.8 and 98.4 kb
(Fig. 1). All of the predicted introns begin with GT and end with AG,
and their locations exactly match those found in other A/B subfamily
CPs. Based on consideration of the active site region (discussed
below), this chromosome 8 CP-like gene has been named
CPA6.
Similar analysis of the novel gene on chromosome 2 revealed eight exons
corresponding to the active CP domain of the protein (exons 4-11; Fig.
1). The amino acid homology within the N-terminal "prepro" regions
of the carboxypeptidases was not sufficient to identify this portion of
the gene based on homology searches. Because no cDNA sequence was
found in public data bases that corresponds to this gene, it was not
possible to determine the N-terminal exons. The downstream exons are
numbered based on the related CPs, with the assumption that there would
be three additional exons for the signal peptide and pro domains. The
introns within this chromosome 2 CP-like gene all begin with GT and end
with AG, and their positions exactly match one or more of the other family members. Based on consideration of the active site region, which
suggests that this enzyme has a markedly distinct substrate specificity
from CPA and CPB (discussed below), this gene was named
CPO.
Both human and mouse CPA5 are predicted to be produced as precursors
containing N-terminal signal peptides, with cleavage expected following
Gly33. Following the signal peptide is a predicted
93-96-residue pro domain that has some homology to the corresponding
pro region of other A/B subfamily CPs (Fig.
2). Based on sequence alignments with
other CPs and on consideration of the specificity of various endopeptidases, cleavage of the pro region is predicted to occur either
between Arg126 and Leu127 or between
Arg129 and Ser130. Both sites are related to
the consensus site for furin and other prohormone convertases in which
cleavage occurs C-terminal to the second Arg of the sequence Arg-Arg or
Arg-Xaa-Xaa-Arg (39-42). Although the first site has an Arg-Arg motif,
the presence of the Leu in the P1′ position is not ideal for
many of these PCs (39-42). The second site, with the Arg-Xaa-Xaa-Arg
motif, may be more readily cleaved due to the P1′ Ser. Also,
this second site is favored from the modeling analysis (discussed
below).
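The search for candidate furin/prohormone convertase sites of the kind discussed above can be sketched in Python. The sketch only flags the two motifs named in the text (Arg-Arg and Arg-Xaa-Xaa-Arg) and reports the 1-based position of the Arg after which cleavage would occur; it does not score the residue that follows the cleavage site. The function name and the toy sequence are hypothetical.

```python
import re


def candidate_pc_sites(protein):
    """Map the 1-based position of each candidate P1 Arg to the motifs that end there."""
    sites = {}
    # Lookahead patterns allow overlapping motifs to be found.
    for match in re.finditer(r"(?=R..R)", protein):
        sites.setdefault(match.start() + 4, []).append("R-X-X-R")
    for match in re.finditer(r"(?=RR)", protein):
        sites.setdefault(match.start() + 2, []).append("R-R")
    return dict(sorted(sites.items()))


if __name__ == "__main__":
    # Toy sequence for illustration only, not the CPA5 sequence.
    print(candidate_pc_sites("GGRRLQRSGG"))
```

In the toy sequence, the Arg-Arg motif is followed by Leu and the Arg-Xaa-Xaa-Arg motif by Ser, the same arrangement described for the two candidate sites in pro-CPA5.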

Fig. 2.
Alignment of the deduced amino acid sequences
of human (h) and mouse (m) CPA5,
CPA6, CPO, and the previously identified members of this
subfamily. The gap below the
arrow denotes the putative signal peptide cleavage site. For
CPA6, the signal peptide cleavage was predicted to occur at either
Gly28 or Ser30 (only the Ser30 site
is indicated). The gap below the
arrowhead denotes the predicted pro peptide cleavage site.
For CPA5, the pro peptide cleavage site may occur at either
Arg126 or Arg129 (only the Arg129
site is indicated). The dotted line indicates the
missing N-terminal sequence of CPO. Dashes are used to
introduce gaps for optimal alignment of the sequences. Active site
residues are in boldface type. The
asterisks indicate active site or substrate-binding residues
that are not conserved in all members of the CPA/B subfamily. These
residues are numbered using the traditional method, which is based on
the catalytically active form of CPA1 (i.e. after removal of
the pro peptide). The numbers on the left of each
line following the gene names refer to the amino acid
position relative to the initiation ATG. The cDNA sequences from
which the protein sequences were derived have been deposited in
GenBank™: mouse CPA5, AF466283; human CPA5, BK000187;
human CPA6, AF466284 and BK000188; and human CPO, BK000189.
The deduced amino acid sequence of CPA6 encodes a protein of 437 residues that is also predicted to have a signal peptide, with cleavage
predicted following either Gly28 or Ser30. The
putative pro domain is nearly identical in length to that of CPA5 and
other family members (Fig. 2). The predicted cleavage site of the pro
domain of CPA6 is a perfect furin/prohormone convertase consensus site,
with cleavage expected between Arg129 and
Ser130 (Fig. 2). Because the N-terminal exons of CPO could
not be identified due to the relatively low homology among the prepro
regions of the various CPs, no information is available regarding the
presence of a signal peptide and pro region in CPO. There is no
furin/prohormone convertase consensus site within the region of CPO
that aligns with the pro domain cleavage site of the other family
members (Fig. 2).
Alignment of the amino acid sequences shows that the key residues for
catalytic activity and substrate binding of other CPs are generally
present in comparable positions in CPA5, CPA6, and CPO. Based on
kinetic studies with rat CPB1 (36) and crystallographic studies with
porcine CPB1 and CPA1 (43, 44), the important residues include those
involved in the coordination of the active site zinc atom
(His69, Glu72, and His196) and a
series of residues important for substrate binding and catalysis
(Arg71, Arg124, Arg127,
Lys128, Asn144, Arg145,
Ser194, Ser197, Tyr198,
Ser199, Ile243, Tyr248,
Ser253, Ile255, Thr268,
Glu270, and Phe279; the numbering and residue
names used are those of CPA1). All of the residues involved in zinc
coordination and substrate catalysis are conserved in CPA5, CPA6, and
CPO. Most of the residues involved in substrate binding are also
conserved, and the differences observed in some of them are likely to
be related to the individual specificities of the three forms
(discussed below). The length of the active CP domain is nearly
identical in all members of the A/B subfamily, with the major
differences in overall length due to small differences in the number of
residues of the signal peptide (Fig. 2). The overall amino acid
sequence identity of prepro-CPA5 is highest to prepro-CPA1 (60%),
slightly lower to prepro-CPA2 and prepro-CPA4 (53 and 51%), and lower
(34-36%) to the remaining members of the gene family (Table
II). Similar results are obtained when
comparing just the active CP domains, although the amino acid sequence
identities are 4-7% higher without the signal peptides and the pro
regions (Table II). Prepro-CPA6 shows slightly higher amino acid
sequence identity to prepro-CPA3 (41%) and prepro-CPB (40%) than the
other members of the subfamily (Table II). The amino acid sequence
identities are also higher when only the active CP domains are compared
(Table II). Because the prepro region of CPO could not be determined, amino acid comparisons of this protein were done only with the active
CP domains. CPO shows highest amino acid sequence identity to CPB1
(46%), CPA6 (45%), CPA3 (45%), and CPB2 (44%).
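The identity percentages in Table II can be illustrated with a short sketch that scores two rows of an alignment. It assumes identity is counted over columns in which neither sequence has a gap; the denominator actually used for Table II is not stated in the text, and the function name and toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns (an assumption, see lead-in)
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Toy alignment: 4 identical residues out of 5 compared columns.
print(percent_identity("ACD-EF", "ACDQEY"))  # 80.0
```

Restricting both inputs to the active CP domain, rather than the full prepro sequence, would correspond to the lower left half of Table II.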
Table II
Percentage of amino acid sequence identity among members of the CPA/B
subfamily
The upper right half of the table indicates identity among the
full-length (i.e. preprocarboxypeptidase) forms; the lower
left half (italics) indicates identity only of the putative active
forms (i.e. without the prepro sequence).
Modeling of CPs--
To gain a better understanding of whether the
novel CPs encode active enzymes as well as to predict the optimal
substrates, a method of comparative modeling by satisfaction of the
spatial restraints determined by sequence alignments was used to build the three-dimensional structure of the pro-CPs using the program MODELLER. For each particular protein, a subset of sequences with known
Protein Data Bank coordinates, chosen based on closest sequence similarity, was used as the starting alignment, which, in every case,
contained the restraints shown in the global alignment in Fig. 2. In
order to improve the model accuracy, a prediction of secondary
structure was performed in every case. For pro-CPA5 and pro-CPA6, the
prediction showed that the α-helix of the connecting segment (the
region that links the pro and enzyme domains) is similar to that of human
pro-CPA2 and slightly longer than the α-helix of pro-CPA1. Thus,
several models were built in which the connecting region was only
restrained by the structure of human pro-CPA2. These models showed an
improved pseudoenergy profile with respect to those obtained using all
restraints in this region. Due to the lack of information about the
N-terminal pro region, CPO was modeled on CP templates only. In all
cases, the structure with the lowest energy among the proposed models
was chosen as the final model.
The overall r.m.s. deviation values calculated between the enzyme
moieties of the final refined models and the different CP templates are
as follows: CPA5, 0.28, 0.34, and 0.37 Å for bovine CPA1, porcine
CPA1, and human CPA2, respectively; CPA6, 0.49, 0.39, 0.66, and 0.69 Å for human CPB, porcine CPB, human CPA2, and bovine CPA1, respectively;
CPO, 0.46, 0.41, 0.62, and 0.68 Å for human CPB, porcine CPB, human
CPA2, and bovine CPA1, respectively. In the case of the two modeled
proenzymes, the r.m.s. deviation values are higher in all cases when
the complete models are compared with their templates. For instance,
the r.m.s. deviation values for the pro-CPA5 model are 1.10, 1.06, and
0.50 Å for bovine pro-CPA1, porcine pro-CPA1, and human pro-CPA2,
respectively. This indicates that while the enzyme domains of the
pro-CPA5 and pro-CPA6 models are very similar to those of their templates,
the pro segments are significantly different, especially when compared
with pro-CPA1 structures. In the case of pro-CPA5, the main differences
observed are located within the connecting segment, with its α-helix being five
turns long (Fig. 3), as in the
three-dimensional structure of pro-CPA2, and spanning from
Ile110 to Leu127. This helix is followed by a
short loop at the border with the pro-CPA5 moiety, where a highly
exposed arginine (Arg129) is located. This arginine may be
the first target for endopeptidase-mediated activation of the
proenzyme. The connecting segment of pro-CPA1 has only a four-turn
helix, and the loop next to it is longer than that of pro-CPA5. As in
pro-CPB1, pro-CPB2, and pro-CPA3, the pro domain of pro-CPA6 contains a
two-residue deletion after position 65 and a four-residue insertion
after position 72 (CPA6 numbering) relative to all other CPAs (Fig. 2).
Thus, the model shows the one-turn 3₁₀ helix located
between β strands 2 and 3 that is characteristic of the CPB pro
region. However, unlike the pro region of CPB1, in which
Asp56 (numbering of Fig. 2) forms a salt bridge with the
active site residue Arg145 (CPA1 active form numbering) to
completely inactivate the proenzyme toward all substrates, none of the
other pro regions contain an acidic amino acid in this position
(corresponding to Ser71 of CPA6).
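The r.m.s. deviation values quoted above follow the usual definition, the square root of the mean squared distance between equivalent atoms. The sketch below assumes the two coordinate sets are already optimally superposed and list equivalent atoms in the same order (for example, one Cα per aligned residue); the superposition procedure and atom selection used for the models are not stated in the text, and the toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two pre-superposed coordinate lists.

    Each list holds (x, y, z) tuples in Angstroms, with equivalent atoms
    at the same index.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy example: every atom is displaced by exactly 1 Angstrom along z.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
template = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(model, template))  # 1.0
```

Computing the value once over the enzyme domain only and once over the complete proenzyme would reproduce the kind of comparison made above for pro-CPA5.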

Fig. 3.
Ribbon representation of the modeled structures of pro-CPA5, pro-CPA6, and CPO.
The pro and enzyme domains are shown in blue and
gold, respectively, and the different color intensities
indicate α or β secondary structures.
In the predicted active sites of the three CP models, the residues
involved in the coordination of the active site zinc atom and the
series of conserved key residues that form the different active center
subsites have essentially the same conformation described for other CPs
(Fig. 4). The positions that show higher variability among the various CPs are Ser194,
Ile243, Ser253, Ile255, and
Thr268 (CPA1 active form numbering and residue names). The
predictions of individual specificities are therefore based largely on
the conformational and space-filling effects of these amino acid
substitutions. CPAs hydrolyze hydrophobic C-terminal residues. CPA2
exhibits a preference for C-terminal bulky aromatic residues, whereas
CPA1 prefers smaller aliphatic residues. This difference is thought to
be primarily due to two substitutions in the specificity pocket of
human CPA2 at positions 194 and 268 (CPA1 active form numbering) (45).
Human CPA5 has the same residues as CPA1 in these positions (Fig. 2).
Two other residues in the specificity pocket of CPA5 are
Ile380 and Val382, which correspond to residues
Ser253 and Ile255 (CPA1 active form numbering).
These changes would render the specificity pocket of CPA5 smaller than
in the other CPs. CPA5 has only one predicted disulfide bond (between
Cys265 and Cys288), which is in the same
position as the disulfide bond in CPA1 and the other CPs. Human CPA2
contains a second disulfide bond between Cys318 and
Cys352, which is unique for CPA2 and whose presence is
thought to influence the specificity of the enzyme. This, together with
the fact that the residues identified as responsible for the higher
specificity of CPA2 for bulky aromatic residues are not present in CPA5
and the substitutions at positions 253 and 255 (CPA1 active form
numbering) that make the specificity pocket smaller, indicates that
CPA5 is likely to exhibit a specificity for small aliphatic C-terminal residues.

Fig. 4.
Schematic diagram of the CP active site.
The numbers indicate the residue number of CPA1 (active form
numbering) and correspond to the numbers of the residues in
boldface type in Fig. 2. The asterisks
indicate residues that are variable among members of the CPA/B
subfamily and that presumably affect the substrate specificity.
R indicates the side chain of the C-terminal residue of the
substrate.
Most of the residues that line the substrate binding pocket are
similarly hydrophobic in CPA6 and CPA1. The CPA1-like activity of
CPA6 may, however, be modulated by
three substitutions: Ala371, Met383, and
Ala396 in CPA6 correspond, respectively, to
Ile243, Ile255, and Thr268 in CPA1
(active form numbering). These residues line the cavity where the side
chain of the S1' residue sits. The smaller size of the residues lining
the specificity pocket may indicate that CPA6 is more efficient in the
hydrolysis of large, probably branched hydrophobic C-terminal residues.
Finally, CPO is unique among human CPs in that it has a Ser at position
243 and an Arg at position 255 (CPA1 active form numbering); most CPs
have a bulky hydrophobic residue at position 243 and either a
hydrophobic residue (CPAs) or an acidic residue (CPBs) at position 255. These residues interact directly with the C-terminal side chain of the
substrate. The presence of a basic residue in this position of CPO
suggests a specificity for C-terminal acidic residues.
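The charge-complementarity argument used above for position 255 can be written down as a small lookup. This is a deliberate simplification for illustration, not a validated predictor: it considers only the residue at position 255 (CPA1 active form numbering), and the residue groupings and function name are assumptions.

```python
# Residue classes at position 255 (CPA1 active form numbering).
# Grouping His with the basic residues is an assumption of this sketch.
ACIDIC = set("DE")
BASIC = set("KRH")
HYDROPHOBIC = set("AVLIMFWY")


def predicted_preference(residue_255):
    """Return the class of C-terminal side chain expected to be favored."""
    r = residue_255.upper()
    if r in ACIDIC:
        return "basic"        # CPB-like: Asp255 pairs with Lys/Arg side chains
    if r in BASIC:
        return "acidic"       # CPO-like: Arg255 would pair with Asp/Glu
    if r in HYDROPHOBIC:
        return "hydrophobic"  # CPA-like: Ile255 in CPA1
    return "unknown"


for name, residue in [("CPA1", "I"), ("CPB1", "D"), ("CPO", "R")]:
    print(name, residue, predicted_preference(residue))
```

The other variable positions (194, 243, 253, and 268) also shape the pocket, so a single-residue rule of this kind can only indicate the broad class of substrate.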
Distribution and Expression of CPA5--
The tissue distribution
of mouse CPA5 was determined by Northern blot analysis using a cDNA
probe isolated from a mouse testis library. This analysis showed a
single band in testis of ~3.4 kb (Fig.
5). No signal was detected for kidney,
liver, brain, or lung (Fig. 5). RT-PCR showed a signal in mouse testis
as well as in pancreas and the AtT-20 mouse pituitary corticotrophic
cell line (data not shown). No RT-PCR signal for CPA5 was detected in
mouse heart, kidney, liver, or brain (not shown). To further characterize the localization of CPA5 in mouse testis and pituitary, in situ hybridization analysis was performed. A signal for
CPA5 was detected in the germ cells of the testis (Fig.
6, top panels). The
level of expression of CPA5 varied among tubules, suggesting the
stage-specific expression of this mRNA. In pituitary, a diffuse signal for CPA5 was detected throughout the anterior and intermediate lobes but not in the neural lobe (Fig. 6, lower
left panel). Control hybridizations with sense
probes showed only background levels (Fig. 6, lower
right panel).

Fig. 5.
Northern blot analysis of mouse CPA5.
Total RNA from mouse kidney, liver, brain, testis, and lung was
fractionated on a formaldehyde agarose gel, transferred to
nitrocellulose, and probed with ³²P-labeled mouse CPA5
cRNA. Top, autoradiogram of Northern blot (3-day exposure).
The positions of size standards (in kb) are indicated (Invitrogen).
Bottom, ethidium bromide-stained gel prior to transfer to
nitrocellulose. The positions of 28 and 18 S RNA are indicated.

Fig. 6.
In situ hybridization analysis of
CPA5 in mouse testis (top) and pituitary
(bottom). Tissue sections were probed with
antisense (AS) or sense (S) mouse CPA5 cRNA
probes, as described under "Materials and Methods." AL,
anterior lobe; IL, intermediate lobe. CPA5 is differentially
expressed in tubules, suggesting a stage-specific germ cell expression.
Low levels are observed in both the anterior and intermediate pituitary
lobes. Sections hybridized with sense probes show only background
signal.
RT-PCR of CPA6 showed a signal with RNA isolated from fetal umbilical
cord blood, which contains human hematopoietic stem cells, and from
mouse brain (data not shown). The primers used for the RT-PCR of human
CPA6 were located on exons 5 and 11. The sequence of the cDNA
derived from RT-PCR using these primers exactly matched that predicted
from the gene, indicating that exons 5-11 are spliced as shown in Fig.
2. In addition, the nucleotide sequence of the mouse RT-PCR product for
CPA6, which was performed with primers on exons 5 and 8, was 89%
identical to the corresponding region of human CPA6 (data not shown).
Mouse CPA5 protein was expressed in insect Sf9 cells
using the baculovirus system. This system has been successful for the expression of a number of other CPs (46-48). To detect the protein, a
His6 tag was added to the C terminus. Baculovirus-expressed CPA5 was detected by Western blot analysis with an
anti-His6 antiserum. The protein was detected in the
Sf9 cell pellets as a 45-kDa protein (Fig.
7A), consistent with the
predicted size of pro-CPA5. No signal was detected when a comparable
amount of Sf9 cells from wild-type virus-infected cells was
analyzed (Fig. 7A). Although pro-CPA5 was predicted to be
secreted in a soluble form due to the presence of a signal peptide and
the absence of any transmembrane domain, immunoreactivity was not
detected in the medium (Fig. 7A). A major protein of
the same size as the immunoreactive band was detected in the
homogenates of pro-CPA5 baculovirus-infected Sf9 cells but not
in homogenates of cells infected with wild type virus (Fig.
7B). When the cell homogenate was centrifuged, none of the
45-kDa protein (Fig. 7B) or immunoreactivity (not shown) was
detected in the supernatant. A small amount of the particulate 45-kDa
protein (Fig. 7B) and immunoreactivity (not shown) could be
solubilized with 1 M NaCl. The combination of high salt and detergents such as 1% Triton X-100 did not enhance the extraction of the particulate 45-kDa protein (Fig. 7B). Once
solubilized with high salt, the removal of the salt by dialysis
followed by centrifugation at 50,000 × g for 30 min
led to the recovery of the 45-kDa protein in the pellet (Fig.
7C). This 45-kDa band was cut from the gel and digested with
trypsin, and the tryptic fragments were identified by matrix-assisted
laser desorption ionization time-of-flight mass spectrometry.
Twenty-six major tryptic fragments were detected, and 15 of them
matched (±0.07%) to the theoretical masses of predicted tryptic
fragments of pro-CPA5. The lower molecular weight bands detected in the
precipitate of the high salt extracts were similarly analyzed and did
not show any tryptic fragments that matched those predicted for
pro-CPA5. These bands also did not react with the monoclonal antibody
to the His6 epitope tag.
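The tryptic fingerprint comparison described above can be sketched in three steps: an in silico digest, calculation of theoretical peptide masses, and matching within a relative tolerance of 0.07%. The cleavage rule (after Lys or Arg, except before Pro), the use of neutral monoisotopic masses, and all function names are assumptions of this sketch; the software and settings used for the actual analysis are not given in the text.

```python
# Standard monoisotopic residue masses in Da.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free N and C termini


def tryptic_digest(sequence):
    """Cleave after K or R unless the next residue is P (assumed rule)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass; add ~1.00728 Da to compare with [M+H]+ ions."""
    return sum(RESIDUE_MASS[r] for r in peptide) + WATER


def match_masses(observed, theoretical, rel_tol=0.0007):
    """Return observed masses lying within rel_tol (0.07%) of any theoretical mass."""
    return [obs for obs in observed
            if any(abs(obs - theo) <= rel_tol * theo for theo in theoretical)]


# Toy sequence for illustration only; it is not pro-CPA5.
toy_peptides = tryptic_digest("AAKGGRPCCKLL")
print(toy_peptides)  # ['AAK', 'GGRPCCK', 'LL']
print([round(peptide_mass(p), 3) for p in toy_peptides])
```

With a real protein sequence, the length of the list returned by `match_masses` would correspond to the number of matching fragments reported above (15 of 26).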

Fig. 7.
Expression of pro-CPA5 in Sf9 cells
using the baculovirus system. A, aliquots of the cell
pellets (Cell) or media (Sec.) from CPA5 or wild
type (wt) virus-infected cells were fractionated on a
denaturing polyacrylamide gel, transferred to nitrocellulose, and
probed with a mouse antibody to the His6 tag (the
His6 tag was included on the C terminus of the expressed
protein). B, aliquots of pro-CPA5 (A5) or wild type
baculovirus-infected Sf9 cell homogenates (Hom.),
soluble fraction (Sol.), 1 M NaCl extract
(Salt), 1 M NaCl plus 1% Triton X-100 extract
(Salt/Tx), or pellet remaining after the extractions were
analyzed on a denaturing polyacrylamide gel stained with Coomassie
Blue. The arrowheads indicate the 45-kDa band in the
pro-CPA5-expressing cell fractions that is not detected in the wild
type virus cell fractions. C, the 1 M NaCl
extract of the pro-CPA5-expressing virus-infected cells was dialyzed
and centrifuged, and an aliquot of the pellet was analyzed by
denaturing polyacrylamide gel electrophoresis. The 45-kDa band shown by
the arrow was identified as pro-CPA5 by tryptic fragment
fingerprinting with mass spectrometry. For all panels, the
positions of prestained molecular weight markers (Invitrogen) are
indicated.
No enzyme activity was detected with pro-CPA5 using two common CPA
substrates, FA-Gly-Leu and hippuryl-Phe (not shown). Attempts to
convert pro-CPA5 into the active form using purified furin, PC7, and
trypsin were unsuccessful (not shown). However, when pro-CPA5 was
incubated with medium from PC4-expressing baculovirus-infected cells,
the immunoreactive 45-kDa band was greatly reduced, although a smaller
immunoreactive band did not appear (not shown). The absence of a
smaller band suggests that either the PC4 cleaved all or a portion of
the His6 epitope tag, or else the activated CPA5 was able
to cleave the C-terminal His6 epitope tag. The PC4-digested pro-CPA5 was enzymatically active toward FA-Gly-Leu (Fig.
8A). Neither PC4 medium
alone nor pro-CPA5 incubated with wild type virus-infected
medium showed enzyme activity toward the CPA substrate (Fig.
8A). Activity was severalfold greater at pH 7.4 than at pH
5.6 (Fig. 8B). The amount of FA-Gly-Leu cleaving activity
detected with 125 ng of pro-CPA5 (digested with PC4) was comparable
with the activity observed with 20 ng of bovine pancreatic type 1 CPA1 using the same substrate and buffer conditions.
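The comparison above can be turned into a rough relative specific activity. The sketch assumes that the two amounts give equal activity, that all of the pro-CPA5 was converted to active enzyme, and, for the molar estimate, that CPA1 has a mass of about 35 kDa; that last value is an assumption not given in the text, whereas 45 kDa for pro-CPA5 is the size reported above.

```python
def mass_basis_ratio(ng_test, ng_reference):
    """Specific activity of the test enzyme relative to the reference,
    when ng_test and ng_reference give the same total activity."""
    return ng_reference / ng_test


def molar_basis_ratio(ng_test, kda_test, ng_reference, kda_reference):
    """Same comparison per mole of protein (ng divided by kDa gives pmol)."""
    pmol_test = ng_test / kda_test
    pmol_reference = ng_reference / kda_reference
    return pmol_reference / pmol_test


# 125 ng of PC4-treated pro-CPA5 versus 20 ng of CPA1 (values from the text).
print(mass_basis_ratio(125, 20))  # 0.16
# 35 kDa for CPA1 is an assumed value.
print(round(molar_basis_ratio(125, 45, 20, 35), 3))  # 0.206
```

Under these assumptions the activated preparation shows roughly one-sixth to one-fifth of the activity of CPA1 toward FA-Gly-Leu; incomplete activation would make the true value for CPA5 higher.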

Fig. 8.
Enzymatic assays with the standard CPA
substrate FA-Gly-Leu. A, partially purified pro-CPA5
(from Fig. 7, lane C) was incubated with medium
from High-Five cells expressing PC4 (triangles) or control
medium (circles) overnight at 37 °C. Then 5 µl of the
reaction (corresponding to ~250 ng of pro-CPA5 before incubation) was
combined with 0.5 mM FA-Gly-Leu in 0.1 M
Tris-HCl, pH 7.4, and the reaction was followed at 336 nm. In addition
to the two samples with pro-CPA5, a control with PC4 alone was also
tested (squares). B, approximately 125 ng of
pro-CPA5 incubated with PC4 was combined with 0.5 mM
FA-Gly-Leu in a 0.1 M concentration of either Tris-HCl, pH
7.4, or sodium acetate, pH 5.6, and the reaction followed at 336 nm.
For both panels, relative activity indicates the
decrease in absorption; units are standard absorption units. Control
reactions with porcine CPA1 gave a maximal decrease in absorption of
0.6 units for the same concentration of substrate.
DISCUSSION
Of the three CP genes identified in the present study, only two of
them were expected based on previous data base searches of
mRNA-based cDNA libraries. However, none of these
mRNA-based sequences predicted an active CP protein due to either
frameshift errors or incomplete sequence information. One of the major
contributions of the present study is the identification of the
putative genes for the full coding regions of all three CP-like
proteins. It is unlikely that there are any more human metallo-CP-like
proteins with significant amino acid sequence similarities to either
A/B or N/E subfamily members; numerous searches of genome and cDNA libraries using a variety of query CP sequences and search parameters failed to turn up any additional hits. Thus, there are a total of 17 metallo-CP family members, with nine of these in the A/B subgroup and
eight in the N/E subgroup (although CPD consists of three CP-like
domains, so the total number of CP domain sequences is 19). However, it
is possible that additional human proteins have structural homology to
the CP family that is not detected by the amino acid homology searches
used in the present study.
The locations of introns in the coding regions of the various members
of the A/B subfamily of CP are generally conserved in all of the
previously identified CPs as well as the three novel ones found in the
present study (although the exons encoding the putative pro region of
CPO could not be identified, so no conclusions about the
intron locations within this region can be made). The one exception is
the intron between exons 3 and 4 (i.e. intron 3), which is
absent from CPA1. All other previously identified members of
the A/B subfamily contain intron 3, but the exact position differs by
about 20 nucleotides. The three novel CPs all have an intron 3 in a
position that matches one of the previously identified members. There
is little similarity between A/B and N/E subfamily members regarding
intron positions. Only one of the intron positions present in A/B
subfamily members is conserved in CPE (49). Within the N/E
subfamily, there is less conservation of the intron positions than
within the A/B subfamily.
It is likely that the predicted intron splicings are correct because
the sites exactly match one of the previously identified CP genes, the
splice sites all obey the GT-AG rule, and, for CPA5 and
CPA6, cDNA sequences that cover much or all of the
predicted coding region are present in the data base. Of the three
additional CPs described in this study, CPA5 had the most
matches to cDNA clones in the data base, including a nearly
full-length mouse cDNA sequence of unknown function. No function
was previously ascribed to this cDNA, presumably because of
the frameshift errors early in the sequence. The protein
predicted from this incorrect cDNA sequence has no substantial
homology to metallo-CPs. Based on our isolation and sequencing of the
mouse cDNA for CPA5, it is clear that there is a long open reading
frame with 83% amino acid sequence identity to the predicted human
CPA5. This level of conservation is similar to that for human and rat CPB (80%), CPA1 (85%), and CPA2 (89%). Importantly, all of the active site residues that are different between CPA5 and the other members of this family are well conserved in both human and mouse CPA5 sequences.
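The GT-AG check mentioned above can be sketched as follows: given a genomic sequence and predicted exon coordinates, each intervening intron should begin with GT and end with AG. The coordinate convention (0-based, half-open), the function names, and the toy sequence are assumptions for illustration.

```python
def introns_from_exons(genomic, exons):
    """Return intron sequences lying between consecutive exons.

    exons is a sorted list of (start, end) pairs, 0-based and half-open.
    """
    return [genomic[prev_end:next_start]
            for (_, prev_end), (next_start, _) in zip(exons, exons[1:])]


def obeys_gt_ag(intron):
    """True if the intron starts with GT and ends with AG."""
    seq = intron.upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


# Toy gene: two 6-nt exons separated by a 12-nt intron.
genomic = "ATGGCC" + "GTAAGTTTTCAG" + "GGATAA"
exons = [(0, 6), (18, 24)]
for intron in introns_from_exons(genomic, exons):
    print(intron, obeys_gt_ag(intron))  # GTAAGTTTTCAG True
```

A predicted splice junction that fails this check would be a candidate for re-examination, although a small fraction of genuine introns use other boundary dinucleotides.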
Although full-length sequences of CPA6 have not been
reported in GenBankTM, it is clear that this gene also is
expressed because of the matches to cDNA sequences in the data base
and our RT-PCR results. The longest of these GenBankTM
cDNA sequences corresponds to the first eight exons of
CPA6 but then continues into the putative intron and would
not therefore encode an active CP due to the absence of the
catalytically important exons 9-11. Using RT-PCR, we confirmed that
exons 9-11 are spliced to the other exons as predicted based on the
amino acid homology considerations shown in Fig. 2. Although no matches
were found to the C-terminal portion of CPA6 in mammalian
cDNA data bases, several matches were found to this region in
chicken (Gallus gallus) and frog (Xenopus laevis)
cDNA data bases. The amino acid sequence identity is extremely
high, with 95% identity over 128 amino acids for the longer of the two
chicken clones and 89% identity over 113 residues for a similar region
of the frog sequence. Most importantly, this region contains the
critical substrate-binding residue Met383 (corresponding to
Ile255 of the active form of CPA1). The presence of a Met
in this position is predicted to give the protein a unique CPA-like
specificity, and the conservation of this site in the chicken and frog
sequences implies that they represent homologs of human CPA6 and that
this protein has been highly conserved during evolution. Interestingly, two of the 11 Drosophila CPA/B-like genes have a Met in this
position, but previously no known mammalian CPs had this residue. It is not clear what effect this Met will have on substrate specificity, other than imparting a preference for C-terminal aliphatic or aromatic
residues (discussed below).
Based on modeling of the proteins, and in particular the active site
regions, it was predicted that CPA5 and CPA6 would cleave aliphatic/aromatic amino acids, with a preference of CPA5 for smaller
side chains. For CPA5, this prediction was confirmed by the expression
of pro-CPA5 and subsequent activation with PC4 into an enzyme that
cleaved FA-Gly-Leu. The amount of enzyme activity detected was
reasonable, based on the amount of product formed and the amount of
pro-CPA5 in the original incubation with PC4. However, without a pure
preparation of soluble enzyme, it is not possible to do kinetic
analysis to examine the substrate specificity of CPA5. Part of the
problem is the low recovery of soluble pro-CPA5 from the baculovirus
system. Although the baculovirus system produces soluble or extractable
forms of CPE and CPD, baculovirus-produced CPZ cannot be extracted from
the particulate fraction even with high salt and detergent containing
buffers (46). For CPZ, the nonextractable form is enzymatically active
(46), but for CPE and CPD the nonextractable form is inactive and
presumably represents misfolded protein. In addition to the problem of
extracting pro-CPA5 and maintaining its solubility, it was also
difficult to activate pro-CPA5. The inability of furin to cleave the
pro domain was unexpected. It is possible that the structure of this
region prevented furin from cleaving, despite the reasonable match of
this sequence to the consensus furin cleavage site (39). Both pro-CPA5
and pro-CPA6 have a long, five-turn α-helix in the region connecting the pro and enzyme moieties and are in that sense more similar to
pro-CPA2 than to pro-CPA1 or pro-CPB. The finding that PC4 cleaves
pro-CPA5 is consistent with the expression of both mRNAs in testis
germ cells. However, CPA5 appears to have a broader distribution than
PC4, with low levels of CPA5 mRNA detected in pituitary using
in situ hybridization. In addition, low levels of CPA5
mRNA are presumably present in the brain, based on the presence of
several CPA5 sequences in human adult brain EST data bases but no
detectable signal on a Northern blot of mouse brain RNA (Fig. 5).
Because PC4 is not thought to be expressed in these tissues (34), if
active CPA5 is produced in brain and/or pituitary it must be generated
by another endopeptidase.
While the predicted active sites of CPA5 and CPA6 are generally similar
to other CPAs, the predicted active site of CPO is unique. CPO
resembles CPB in the nature of the amino acids that form the subsites
for substrate anchoring but with significant substitutions at two
residues that interact with the C-terminal side chain:
Gly243 and Asp255 (CPA1 active form numbering).
In CPO, these residues are substituted by Ser and Arg, indicating that
the substrate binding funnel is of similar nature but the specificity
is reversed. Thus, CPO probably cleaves C-terminal acidic amino acids,
a specificity not yet described for any known CP. Interestingly,
searches of the Drosophila and C. elegans genomes
revealed metallo-CP-like genes with Lys and His in the position
corresponding to the active site residue 255; it is therefore possible
that these proteins have specificities similar to that of human CPO.
A key question concerns the function of the three CPs described in the
present study. One way to approach this is to consider whether there
are any known CP cleavages that cannot be explained by existing
enzymes. One such cleavage has been observed for human β-endorphin
1-26. This peptide is presumably formed from β-endorphin 1-27,
which is itself generated from β-endorphin 1-31 by an endopeptidase
followed by the action of CPE or CPD (50, 51). Whereas the conversion
of rat β-endorphin 1-27 into 1-26 requires cleavage of
His27, which is slowly catalyzed by CPE (52), for human
β-endorphin this requires cleavage of Tyr27. Neither CPE
nor CPD cleaves C-terminal Tyr residues, so it is likely that an unknown
CPA-like enzyme performs this cleavage. It is possible that CPA5 is
this β-endorphin-processing enzyme; it is presumably expressed in the
secretory pathway due to the presence of the signal peptide, and it is
also detected at low levels in mouse pituitary, where it appears
broadly distributed. Thus, this enzyme should be present in the
β-endorphin-producing corticotrophs. The detection by RT-PCR of CPA5
mRNA in the AtT-20 corticotroph cell line further strengthens this
possibility. However, this would not explain the function of testis
germ cell CPA5. While β-endorphin is also known to be expressed in
testis, it is restricted to the Leydig cells and not the germ cells
(53). It is possible that testis CPA5 is involved in processing
intercellular signaling molecules formed after the concerted action of
PC4 and CPD. Speculation on the functions of CPA6 and CPO requires more detailed information regarding the tissue and cellular distribution of
these proteins.
ACKNOWLEDGEMENTS
We thank Dr. Majambu Mbikay for providing
PC4-expressing cell medium for testing with pro-CPA5 and Dr. Fa-Yun
Che for assisting with mass spectrometry.
FOOTNOTES
*
This work was primarily supported by National
Institutes of Health Grant R01 DK51271, and also by Grants R01 DA04494
and K02 DA00194 (to L. D. F.), by Ministerio de Ciencia y
Tecnología, Spain Grants BIO98-0362 and 2FD97-0872, the Center
de Referència en Biotecnologia of the Generalitat de Catalunya,
Spain, (to F. X. A.), and grants from the Canadian Institutes of
Health Research and a fellowship from the Fonds de la Recherche en
Santé du Québec (to R. D.). The DNA sequencing facility of
the Albert Einstein College of Medicine is supported in part by Cancer
Center Grant CA13330. The costs of publication of this
article were defrayed in part by the payment of page charges. The article
must therefore be hereby marked "advertisement" in accordance with
18 U.S.C. Section 1734 solely to indicate this fact.
To whom correspondence should be addressed: Dept. of Molecular
Pharmacology, Albert Einstein College of Medicine, 1300 Morris Park
Ave., Bronx, NY 10461. Tel.: 718-430-4225; Fax: 718-430-8954; E-mail:
fricker@aecom.yu.edu.
Published, JBC Papers in Press, February 8, 2002, DOI 10.1074/jbc.M112254200
2
P. J. Pereira, S. Segura-Martin, C. Ferrer-Orta, J. Vendrell, F. X. Aviles, M. Coll, and F. X. Gomis-Ruth, manuscript in preparation.
ABBREVIATIONS
The abbreviations used are: CP, carboxypeptidase; FA, furylacryloyl;
PC4 and PC7, prohormone convertase 4 and 7, respectively; r.m.s., root
mean square; RT, reverse transcription; EST, expressed sequence tag;
bis-Tris, 2-[bis(2-hydroxyethyl)amino]-2-(hydroxymethyl)propane-1,3-diol.
REFERENCES
1. Barrett, A. J., Rawlings, N. D., and Woessner, J. F. (1998) in Handbook of Proteolytic Enzymes (Barrett, A. J., Rawlings, N. D., and Woessner, J. F., eds), pp. 1318-1320, Academic Press, Inc., San Diego
2. Auld, D. S. (1998) in Handbook of Proteolytic Enzymes (Barrett, A. J., Rawlings, N. D., and Woessner, J. F., eds), pp. 1321-1326, Academic Press, Inc., San Diego
3. Auld, D. S. (1998) in Handbook of Proteolytic Enzymes (Barrett, A. J., Rawlings, N. D., and Woessner, J. F., eds), pp. 1326-1328, Academic Press, Inc., San Diego
4. Hendriks, D. F. (1998) in Handbook of Proteolytic Enzymes (Barrett, A. J., Rawlings, N. D., and Woessner, J. F., eds), pp. 1328-1330, Academic Press, Inc., San Diego
5. Springman, E. B. (1998) in Handbook of Proteolytic Enzymes (Barrett, A. J., Rawlings, N. D., and Woessner, J. F., eds), pp. 1330-1333, Academic Press, Inc., San Diego
6. Aviles, F. X., and Vendrell, J. (1998) in Handbook of Proteolytic Enzymes (Barrett, A. J., Rawlings, N. D., and Woessner, J. F., eds), pp. 1333-1335, Academic Press, Inc., San Diego
7. Fricker, L. D. (1998) in Handbook of Proteolytic Enzymes (Barrett, A. J., Rawlings, N. D., and Woessner, J. F., eds), pp. 1341-1344, Academic Press, Inc., San Diego
8. Skidgel, R. A., and Erdos, E. G. (1998) in Handbook of Proteolytic Enzymes (Barrett, A. J., Rawlings, N. D., and Woessner, J. F., eds), pp. 1344-1347, Academic Press, Inc., San Diego
9. Skidgel, R. A. (1998) in Handbook of Proteolytic Enzymes (Barrett, A. J., Rawlings, N. D., and Woessner, J. F., eds), pp. 1347-1349, Academic Press, Inc., San Diego
10. Fricker, L. D. (1998) in Handbook of Proteolytic Enzymes (Barrett, A. J., Rawlings, N. D., and Woessner, J. F., eds), pp. 1349-1351, Academic Press, Inc., San Diego
11. Reznik, S. E., and Fricker, L. D. (2001) Cell. Mol. Life Sci. 58, 1790-1804
12. Song, L., and Fricker, L. D. (1997) Biochem. J. 323, 265-271
13. Gomis-Ruth, F. X., Companys, V., Qian, Y., Fricker, L. D., Vendrell, J., Aviles, F. X., and Coll, M. (1999) EMBO J. 18, 5817-5826
14. Aloy, P., Companys, V., Vendrell, J., Aviles, F. X., Fricker, L. D., Coll, M., and Gomis-Ruth, F. X. (2001) J. Biol. Chem. 276, 16177-16184
15. Eipper, B. A., Park, L. P., Dickerson, I. M., Keutmann, H. T., Thiele, E. A., Rodriguez, H., Schofield, P. R., and Mains, R. E. (1987) Mol. Endocrinol. 1, 777-790
16. Xin, X., Varlamov, O., Day, R., Dong, W., Bridgett, M. M., Leiter, E. H., and Fricker, L. D. (1997) DNA Cell Biol. 16, 897-909
17. Xin, X., Day, R., Dong, W., Lei, Y., and Fricker, L. D. (1998) DNA Cell Biol. 17, 311-319
18. Xin, X., Day, R., Dong, W., Lei, Y., and Fricker, L. D. (1998) DNA Cell Biol. 17, 897-909
19. He, G. P., Muise, A., Li, A. W., and Ro, H. S. (1995) Nature 378, 92-96
20. Nielsen, H., Engelbrecht, J., Brunak, S., and von Heijne, G. (1997) Protein Eng. 10, 1-6
21. Thompson, J. D., Higgins, D. G., and Gibson, T. J. (1994) Nucleic Acids Res. 22, 4673-4680