Identification and Characterization of Human Archaemetzincin-1 and -2, Two Novel Members of a Family of Metalloproteases Widely Distributed in Archaea

Systematic analysis of degradomes, the complete protease repertoires of organisms, has demonstrated the large and growing complexity of proteolytic systems operating in all cells and tissues. We report here the identification of two new human metalloproteases that have been called archaemetzincin-1 (AMZ1) and archaemetzincin-2 (AMZ2) to emphasize their close relationship to putative proteases predicted by bioinformatic analysis of archaeal genomes. Both human proteins contain a catalytic domain with a core motif (HEXXHXXGX3CX4CXMX17CXXC) that includes an archetypal zinc-binding site, the methionine residue characteristic of metzincins, and four conserved cysteine residues that are not present at the equivalent positions of other human metalloproteases. Analysis of genome sequence databases revealed that AMZs are widely distributed in Archaea and vertebrates and contribute to the defining of a new metalloprotease family that has been called archaemetzincin. However, AMZ-like sequences are absent in a number of model organisms from bacteria to nematodes. Phylogenetic analysis showed that these enzymes have undergone a complex evolutionary process involving a series of lateral gene transfer, gene loss, and genetic duplication events that have shaped this novel family of metalloproteases. Northern blot analysis showed that AMZ1 and AMZ2 exhibit distinct expression patterns in human tissues. AMZ1 is mainly detected in liver and heart whereas AMZ2 is predominantly expressed in testis and heart, although both are also detectable at lower levels in other tissues. Both human enzymes were produced in Escherichia coli, and the purified recombinant proteins hydrolyzed synthetic substrates and bioactive peptides, demonstrating that they are functional proteases. Finally, these activities were abolished by inhibitors of metalloproteases, providing further evidence that AMZs belong to this catalytic class of proteolytic enzymes.

Proteases mediate many key physiological processes (1). These enzymes play essential roles in a variety of events that determine cell life and death in all living organisms. Thus, proteases participate in the control of cell cycle progression, tissue morphogenesis and remodeling, cell proliferation and migration, ovulation and fertilization, angiogenesis, host defense, hemostasis, apoptosis, and autophagy (2-9). Because of these crucial roles, strict regulatory mechanisms are necessary to prevent misdirected temporal and spatial proteolytic activities. The failure of these regulatory mechanisms contributes to the development of many pathological processes including arthritis, cardiovascular diseases, neurodegenerative disorders, and cancer (10-15). The recent description of the human degradome, the complete set of human proteases, and the degradomes of other organisms represent preliminary steps to understanding the complexity of protease systems (16). According to our data, there are at least 561 protease and protease-related genes in the human genome (17) (web.uniovi.es/degradome). Likewise, we have annotated >600 protease genes in the mouse and rat genomes (17, 18). This increased complexity in the rodent degradomes is mainly due to the expansion in both mouse and rat genomes of specific gene families encoding proteases implicated in reproduction and host defense. Similarly, the availability of the genome sequences of other model organisms such as Caenorhabditis elegans, Drosophila melanogaster, or Arabidopsis thaliana has allowed us to predict that all of them contain a large number of protease genes ranging from 400 to 600 different members (19) (merops.sanger.ac.uk), thereby emphasizing the complexity of proteolytic systems present in all organisms.
Recently, and as part of our studies aimed at identifying novel human proteases, we have evaluated the possibility that human tissues could produce proteases described previously in evolutionarily distant organisms but whose occurrence in mammals has not yet been reported. This approach led us to identify and characterize human and mouse ovastacin, a novel metalloprotease similar to hatching enzymes from arthropods, birds, amphibians, and fish (20). Following this strategy, we have also explored the putative occurrence in the human genome of genes encoding proteases with sequence similarity to putative metalloproteases annotated in the course of genome sequencing projects of prokaryotic organisms (21). In this work, we report the identification, cloning, and characterization of two novel human metalloproteases called archaemetzincin-1 (AMZ1) and archaemetzincin-2 (AMZ2), which are closely related to proteins whose sequence has been predicted by bioinformatic analysis of archaeal genomes. We perform a detailed phylogenetic analysis of these enzymes to clarify the origin and complex evolutionary history of this new family of metalloproteases. Finally, we examine the tissue distribution of AMZ1 and AMZ2 in human tissues and analyze their enzymatic properties.

EXPERIMENTAL PROCEDURES
Materials-Restriction endonucleases and other reagents used for molecular cloning were from Roche Diagnostics. Double-stranded DNA probes were radiolabeled with [α-32P]dCTP (3000 Ci/mmol) from Amersham Biosciences, using a commercial random priming kit purchased from the same company. Human cDNA libraries and Northern blots containing polyadenylated RNAs from different tissues were from Clontech. Fluorogenic substrates and biologically active peptides (neurogranin, angiotensin II, and angiotensin III) were purchased from Bachem, and protease inhibitors and AMC were from Sigma. Albumin, fibrillar collagens, gelatin, plasminogen, and aprotinin were also from Sigma. Antibodies against GST were developed in our laboratory as described previously (22).
Bioinformatic Analysis and cDNA Cloning-The BLAST program was used to screen public (www.ncbi.nlm.nih.gov) and private (www.celera.com) human genome databases, searching for regions with sequence similarity to prokaryotic metalloproteinase sequences (21). We found two partial sequences located in the human chromosomes 7p22.3 and 17q24.2 exhibiting similarity to putative metalloproteinase sequences identified during the course of large scale genome-sequencing projects involving Archaea (23-26). After the identification of these human sequences, we designed specific oligonucleotides to PCR amplify the cDNAs for these metalloproteases using a human brain cDNA library as a template. All PCR amplifications were performed in a GeneAmp 2400 PCR system from PerkinElmer Life Sciences. After cloning the PCR products in pBluescript, their identity was confirmed by nucleotide sequencing.
Nucleotide Sequence Analysis-Cloned cDNAs were sequenced at the Oviedo University DNA analysis facility using BigDye Terminator (version 3.1) chemistry on an ABI PRISM 3100 genetic analyzer platform (Applied Biosystems). Computer analysis of DNA and protein sequences was performed with the GCG software package of the University of Wisconsin Genetics Computer Group.
Phylogenetic Analysis-The sequences of archaemetzincins from 27 archaeal, bacterial, and eukaryotic organisms were predicted from their genomic sequences with the TBLASTN algorithm. The obtained sequences were aligned automatically with ClustalX version 1.8 (www.igbmc.u-strasbg.fr/BioInfo/ClustalX) and manually with GeneDoc version 2.6 (www.psc.edu/biomed/genedoc). An unrelated metalloprotease from the enterobacterium Yersinia pestis was also added to this alignment as an out group. The most parsimonious tree according to this alignment was calculated with the Protpars program included in the Phylip package version 3.6 (evolution.genetics.washington.edu/phylip/getme.html). The resulting tree was plotted with TreePlot (www.bioinformatics.nl/tools/plottree.html). Additionally, a tree of the selected species was constructed based on a diverse array of phylogenetic resources (www.ncbi.nlm.nih.gov/Taxonomy/CommonTree/wwwcmt.cgi).
Northern Blot Analysis-Nylon membranes containing 2 μg of poly(A+) RNA from diverse human tissues were prehybridized at 42°C for 3 h in 50% formamide, 5× SSPE (1× SSPE is 150 mM NaCl, 10 mM NaH2PO4, 1 mM EDTA, pH 7.4), 10× Denhardt's solution, 2% SDS, and 100 μg/ml denatured herring sperm DNA. Membranes were then hybridized with specific radiolabeled probes containing nucleotides from positions 180 to 580 of AMZ1 cDNA and from 340 to 750 of AMZ2 cDNA. Hybridization was performed for 20 h under the same conditions used for prehybridization. Finally, blots were washed once with 2× SSC, 0.05% SDS for 30 min and three times in 0.1× SSC and 0.1% SDS for 30 min at 50°C and exposed to autoradiography. RNA integrity and loading were assessed by hybridization with an actin probe.
Production and Purification of Recombinant Proteins-cDNAs for the predicted catalytic domains of AMZ1 (positions 1-320) and AMZ2 (positions 1-300) were obtained by PCR amplification using specific oligonucleotide pairs containing defined restriction sites. The AMZ1 catalytic domain oligonucleotides were 5′-GGGGATCCCATGCTGCAGTGTAGACCCGCACAGGA-3′ and 5′-GGGTCGACGATTGAGAGAAGGGGTAGGGTCCCTG-3′, and the AMZ2 catalytic domain oligonucleotides were 5′-GGGGATCCCATGCAAATAATACGGCACTCCG-3′ and 5′-CAGGAATTCAGTAAAAACCTCTTGACGGTCCG-3′ (where the restriction sites are underlined). PCR amplifications were performed with 30 cycles of denaturation (95°C for 30 s), annealing (60°C for 30 s), and extension (68°C for 1 min) using the Expand™ long template, high fidelity PCR system. PCR products were then digested with the corresponding restriction enzymes and cloned in the appropriate sites of the pGEX-5x-2 expression vector (Amersham Biosciences). The resulting constructs were transformed into BL21(DE3)-pLysE-competent Escherichia coli cells, and expression was induced by the addition of isopropyl-1-thio-β-D-galactopyranoside (final concentration 1 mM), followed by 3 h of incubation at 28°C. The cells were then harvested by centrifugation, washed with phosphate-buffered saline, and lysed by incubation in phosphate-buffered saline with 100 μg/ml lysozyme, 10 μg/ml DNase, and 0.1% Triton X-100 overnight at 4°C. The recombinant catalytic domain proteins contained in the corresponding supernatants were purified by affinity chromatography using a glutathione-Sepharose column. The identity of the recombinant proteins was verified by Western blot and trypsin digestion followed by mass spectrometry analysis.
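The underlining that marked the restriction sites in the oligonucleotides has not survived in this text. As a rough aid, the sketch below scans the four primers for the recognition sequences of BamHI (GGATCC), SalI (GTCGAC), and EcoRI (GAATTC). The choice of these three enzymes is an assumption made for illustration; the text does not name the enzymes used. The primer sequences are written without the hyphens that appear in some printings, on the assumption that those hyphens are line-break artifacts.

```python
# Recognition sequences of three common restriction enzymes.
# Which enzymes the authors actually used is NOT stated in the text;
# this set is an assumption chosen for illustration.
SITES = {"BamHI": "GGATCC", "SalI": "GTCGAC", "EcoRI": "GAATTC"}

# Primers as given in the text (5' to 3'); dictionary keys are hypothetical labels.
PRIMERS = {
    "AMZ1_fwd": "GGGGATCCCATGCTGCAGTGTAGACCCGCACAGGA",
    "AMZ1_rev": "GGGTCGACGATTGAGAGAAGGGGTAGGGTCCCTG",
    "AMZ2_fwd": "GGGGATCCCATGCAAATAATACGGCACTCCG",
    "AMZ2_rev": "CAGGAATTCAGTAAAAACCTCTTGACGGTCCG",
}


def find_sites(primer):
    """Return {enzyme: 0-based index} for each recognition site found in primer."""
    return {
        enzyme: primer.find(site)
        for enzyme, site in SITES.items()
        if site in primer
    }


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, find_sites(seq))
```

Each primer turns out to carry exactly one of the three candidate sites near its 5′ end, which is where added cloning sites are normally placed.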
Trypsin Digestion-Gel bands were manually excised and placed into 0.5-ml tubes. Then, gel pieces were washed three times with 180 μl of 25 mM ammonium bicarbonate/acetonitrile (70:30) (v/v), dried at 90°C for 15 min, and incubated with 12 μg/ml trypsin (Promega) in 25 mM ammonium bicarbonate at 60°C for 1 h. Likewise, soluble proteins were incubated with trypsin (12 μg/ml) in 25 mM ammonium bicarbonate for 1 h at 60°C. The resulting peptide mixtures were placed on ice for 2 min, and 2 μl of 10% trifluoroacetic acid were added to each sample. Samples were then desalted by C18 reverse phase chromatography (ZipTip; Millipore). Peptides were eluted with 2 μl of α-cyano-4-hydroxycinnamic acid in acetonitrile and 0.1% trifluoroacetic acid (50:50) (v/v). In a typical experiment, 1 μl of this solution was analyzed by mass spectrometry.
Mass Spectrometry Analysis-Matrix-assisted laser desorption ionization was performed on a time-of-flight mass spectrometer equipped with a nitrogen laser source (Voyager-DE STR; Applied Biosystems). Data from 50 to 200 laser shots were collected to produce a mass spectrum (S.E. ± 20 ppm).
Protease Assays-Enzymatic activity of the purified recombinant human AMZ1 and AMZ2 was assayed using AMC-coupled amino acids (Asp-AMC, Thr-AMC, Leu-AMC, Glu-AMC, His-AMC, Val-AMC, Asn-AMC, Ser-AMC, Ile-AMC, Trp-AMC, Phe-AMC, Ala-AMC, Gln-AMC, Gly-AMC, Lys-AMC, Tyr-AMC, Pro-AMC, Met-AMC, and Arg-AMC) or the fluorogenic peptides QF35 (Mca-Pro-Leu-Ala-Nva-Dpa-Ala-Arg-NH2) and QF41 (Mca-Pro-Cha-Gly-Nva-His-Ala-Dpa-NH2), where Mca is (7-methoxycoumarin-4-yl)-acetic acid, Nva is norvaline, Dpa is L-dinitrophenyl-diamino propionic acid, and Cha is cyclohexyl alanine. Assays were carried out at 37°C at a substrate concentration of 5 μM in a buffer containing 50 mM Tris-HCl, 150 mM NaCl, and 0.05% Brij-35, pH 7.5. The fluorometric measurements were made in an LS55 spectrofluorometer from PerkinElmer Life Sciences (λex = 360 nm and λem = 460 nm for AMC-coupled amino acids and λex = 328 nm and λem = 393 nm for Mca-containing peptides). The fluorescent signal was calibrated using known concentrations of AMC and Mca. For inhibition experiments, the recombinant proteins were preincubated for 30 min at 37°C with o-phenanthroline, E-64, 4-(2-aminoethyl)-benzenesulfonyl fluoride, batimastat, tissue inhibitor of metalloproteinase-1, -2, -3, and -4, arphamenine A, and amastatin, and then the hydrolyzing activity against Ala-AMC for AMZ1 or against Arg-AMC for AMZ2 was determined by fluorometric measurements as described above. Kinetic studies were performed using different concentrations of the fluorogenic compounds (0.5-100 μM) in 100 μl of assay buffer containing recombinant enzymes (5 nM), and peptide hydrolysis was measured from the increase in fluorescence at 37°C over time. Initial velocities were calculated using the analysis package FL WinLab 2.01 (PerkinElmer Life Sciences), and data were fitted to the Michaelis-Menten equation (27) using GraFit version 4.0 (Erithacus).
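To make the kinetic analysis concrete, the following minimal sketch shows how Vmax, Km, and kcat/Km can be recovered from initial velocities measured over a substrate range. It is not the procedure used here: GraFit performs nonlinear regression, whereas this sketch substitutes the simpler Hanes-Woolf linearization ([S]/v plotted against [S]) so that it needs only the standard library. The data are synthetic and noise-free. The enzyme concentration (5 nM) and substrate range (0.5-100 μM) follow the text, but the individual Km and kcat values are hypothetical, chosen only so that their ratio is 46 M⁻¹ s⁻¹.

```python
def michaelis_menten(s, vmax, km):
    """Initial velocity v = Vmax*[S] / (Km + [S])."""
    return vmax * s / (km + s)


def fit_hanes_woolf(s_values, v_values):
    """Estimate (Vmax, Km) by least squares on the Hanes-Woolf form:
    [S]/v = [S]/Vmax + Km/Vmax  (slope = 1/Vmax, intercept = Km/Vmax)."""
    xs = list(s_values)
    ys = [s / v for s, v in zip(s_values, v_values)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / slope
    km = intercept * vmax
    return vmax, km


def kcat_over_km(vmax, km, enzyme_conc):
    """Specificity constant kcat/Km, with kcat = Vmax / [E]."""
    return (vmax / enzyme_conc) / km


# Synthetic example. ENZYME and SUBSTRATE follow the assay description;
# TRUE_KM and TRUE_KCAT are hypothetical illustration values.
ENZYME = 5e-9                                   # 5 nM, in M
SUBSTRATE = [0.5e-6, 1e-6, 5e-6, 10e-6, 50e-6, 100e-6]  # in M
TRUE_KM = 50e-6                                 # hypothetical, in M
TRUE_KCAT = 2.3e-3                              # hypothetical, in s^-1
VELOCITIES = [michaelis_menten(s, TRUE_KCAT * ENZYME, TRUE_KM) for s in SUBSTRATE]

if __name__ == "__main__":
    vmax, km = fit_hanes_woolf(SUBSTRATE, VELOCITIES)
    print("Km (M):", km)
    print("kcat/Km (M^-1 s^-1):", kcat_over_km(vmax, km, ENZYME))
```

With real, noisy data a nonlinear fit is generally preferred, because linearized forms distort the error structure of the measurements.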
Assays with purified proteins (albumin, fibrillar collagens, gelatin, plasminogen, or aprotinin) and bioactive peptides (neurogranin, angiotensin II, and angiotensin III) were performed by incubation of 2.5 nM recombinant AMZ1 or AMZ2 with each substrate (10 μM). Reactions were carried out at 37°C in a buffer containing 50 mM Tris-HCl and 150 mM NaCl, pH 7.5, overnight for purified proteins or in the course of 2 h for bioactive peptides. The digestions of purified proteins were analyzed by SDS-PAGE. Peptide digestions were purified using a ZipTip and analyzed by mass spectrometry.

RESULTS
Cloning and Characterization of Two Human cDNAs Encoding Novel Metalloproteases Similar to Archaeal Metzincins-A bioinformatic search of the human genome to look for sequences similar to those of archaeal or bacterial metalloproteases led us to identify two DNA contigs located in chromosomes 7p22.3 and 17q24.2 and encoding two uncharacterized proteins with sequence similarity to putative archaeal metalloprotease sequences (21, 23-26). The full-length cDNAs for both human enzymes were PCR-amplified using specific oligonucleotides and a brain cDNA library. These experiments led us to the amplification of 1.5- and 1-kb cDNAs, both containing in-frame initiator and stop codons. After cloning and sequencing of the PCR-amplified products, we confirmed by conceptual translation that the generated sequences encoded two novel proteins of 498 and 360 amino acids, respectively (Fig. 1A, and GenBank™ accession numbers AJ635357 and AJ635358). Domain analysis with the InterPro (www.ebi.ac.uk/interpro) and SMART (smart.embl-heidelberg.de) programs confirmed the presence in both human protein sequences of a catalytic domain related to neutral zinc metalloproteases. A search for orthologous sequences using the TBLASTN algorithm showed that both human sequences are closely related to members of a family of predicted metalloproteases originally identified during the analysis of archaeal genomes and tentatively called archaemetzincins (21). Accordingly, we propose to call the newly identified proteins human archaemetzincin-1 and -2. The maximum percentages of identities between the catalytic domains of human and archaeal enzymes were 27% between human AMZ1 and the predicted archaemetzincin from the genome sequence of Thermococcus kodakaraensis and 39% between human AMZ2 and the corresponding enzyme from Pyrococcus abyssi (Fig. 1B). Likewise, the percentage of identities between the catalytic domains of human AMZ1 and AMZ2 is ~40%.
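For readers unfamiliar with how a percentage of identities is obtained from an alignment, the short sketch below computes it for two pre-aligned sequences. Definitions differ in the choice of denominator, and the text does not state which one was used; this version counts only columns in which neither sequence has a gap, which is an assumption. The two aligned strings are toy examples, not AMZ sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap.

    Both inputs must come from the same alignment and have equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns in which both sequences carry a residue.
    columns = [
        (a, b)
        for a, b in zip(aligned_a, aligned_b)
        if a != gap and b != gap
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Toy alignment (hypothetical sequences, for illustration only).
    print(percent_identity("HEAAH-AG", "HELLHQAG"))
```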
Further bioinformatic analysis of available genome sequences revealed that archaemetzincins are widely distributed in Archaea as well as in vertebrates including birds, amphibians, and fish (Fig. 1B). However, no orthologous sequences were found in a large number of bacterial species, with the exception of Aquifex aeolicus and Myxococcus xanthus. Likewise, archaemetzincins were also absent from plants and from non-vertebrate Metazoa, such as C. elegans, D. melanogaster and Anopheles gambiae.
Amino acid sequence alignment of both human archaemetzincins with all related sequences present in different species allowed us to identify a highly conserved core catalytic motif, HEXXHXXGX3CX4CXMX17CXXC, where the putative metalloprotease zinc-binding site is underlined. The conserved methionine would be part of the "Met-turn" described in the metzincin clan of metalloproteases (21), although it should be noted that the sequences deduced for mouse and rat AMZ1 contain a Leu residue at this position (Fig. 1B, and GenBank™ accession numbers AJ879912 and AJ879913, respectively). By contrast, mouse and rat AMZ2 contain the archetypal Met at this position (Fig. 1B, and GenBank™ accession numbers AJ879914 and AJ879915, respectively). Notably, human AMZ1 and rodent AMZ1 lack a conserved His residue that is present in human and rodent AMZ2, as well as in archaeal and fungal archaemetzincins (Fig. 1B). Accordingly, this His residue can be used as a distinctive structural feature between AMZ1 and AMZ2. It is also noteworthy that the core catalytic motif of these enzymes contains four Cys residues that are absolutely conserved in all archaemetzincins but are absent at the equivalent positions of other metalloproteases. Accordingly, we propose that these four Cys residues can be used as a specific signature to distinguish this family of metalloproteases within the metzincin clan.
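The core motif can be written directly as a regular expression, which gives a simple way to screen protein sequences for candidate family members. The sketch below is an illustration under stated assumptions rather than the search method used in this work (which relied on BLAST and TBLASTN). The strict pattern encodes HEXXHXXGX3CX4CXMX17CXXC with X as any residue; the relaxed pattern additionally accepts Leu at the Met position, as noted above for mouse and rat AMZ1. The test sequences are synthetic constructs, not real AMZ sequences, and the function name is hypothetical.

```python
import re

# Strict core motif: HEXXHXXGX3CX4CXMX17CXXC (X = any residue).
AMZ_MOTIF = re.compile(r"HE..H..G.{3}C.{4}C.M.{17}C..C")

# Relaxed variant: Met or Leu at the Met-turn position.
AMZ_MOTIF_RELAXED = re.compile(r"HE..H..G.{3}C.{4}C.[ML].{17}C..C")


def find_amz_motif(protein, pattern=AMZ_MOTIF):
    """Return (0-based start, matched segment) of the first motif hit, or None."""
    match = pattern.search(protein.upper())
    if match is None:
        return None
    return match.start(), match.group()


def build_toy_sequence(turn_residue):
    """Synthetic sequence satisfying the motif spacing; not a real protein."""
    return (
        "GKT"                 # arbitrary flanking residues
        + "HEAAHAAG"          # HEXXHXXG
        + "AAA" + "C"         # X3 C
        + "AAAA" + "C"        # X4 C
        + "A" + turn_residue  # X then Met (or Leu)
        + "A" * 17 + "C"      # X17 C
        + "AA" + "C"          # XX C
        + "GG"                # arbitrary flanking residues
    )


if __name__ == "__main__":
    print(find_amz_motif(build_toy_sequence("M")))
    print(find_amz_motif(build_toy_sequence("L")))
    print(find_amz_motif(build_toy_sequence("L"), AMZ_MOTIF_RELAXED))
```

A pattern match of this kind is only a first filter; family membership in this work rests on alignment and phylogenetic analysis.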
Evolutionary Analysis of AMZs-An extensive search of the publicly available genome sequences allowed us to identify AMZ genes in multiple eukaryotic and prokaryotic organisms. However, AMZs are absent in several eukaryotic model organisms, such as Saccharomyces cerevisiae, A. thaliana, D. melanogaster, and C. elegans, as well as in most bacterial organisms. All of the predicted prokaryotic AMZs were classified as AMZ2 because of the presence of a third histidine residue in their catalytic site. Additionally, the archaeal organisms belonging to the Thermococcaceae family presented a second AMZ2 with a highly divergent N-terminal extension that we called AMZ2b. After alignment of these sequences, a phylogenetic tree was calculated and rooted with an unrelated metalloprotease (Fig. 2). The obtained tree shows eukaryotic and prokaryotic AMZs separated into two large groups. Only AMZ2b genes and AMZ2 from bacterial M. xanthus stand outside these groups. Interestingly, AMZ2 from A. aeolicus groups with AMZs from archaeal organisms, suggesting a relatively late lateral gene transfer event from Archaea to bacteria as was previously proposed for other A. aeolicus genes (28,29). Finally, data from this analysis were fitted to a taxonomic tree to construct a model that could explain the evolution of AMZ genes (Fig. 3). According to this model, the primordial AMZ arose in a common ancestor of Archaea and Eukaryota. Some bacterial species acquired this gene through lateral gene transfer from archaeal organisms. On the other hand, two duplication events would explain the presence of AMZ2b in Thermococcaceae and AMZ1 in Amniota. The lack of AMZ genes in several eukaryotic and archaeal organisms would be likely explained by multiple gene loss events at different times (Fig. 3).
Expression Analysis of AMZ1 and AMZ2 in Human Tissues-To analyze the distribution of both archaemetzincins in human tissues, Northern blots containing poly(A+) RNAs prepared from a variety of human adult tissues were hybridized with specific probes for human AMZ1 or AMZ2. As can be seen in Fig. 4, AMZ1 mRNA transcripts are detected predominantly in liver and heart, although there are significant mRNA levels in pancreas, kidney, and testis. On the other hand, AMZ2 mRNAs are mainly present in heart and testis, although there are also detectable transcripts in pancreas, kidney, liver, lung, placenta, brain, and prostate. Notably, both human AMZs display several mRNA transcripts, possibly derived from alternative splicing events. On the other hand, both AMZ mRNAs are present in all fetal tissues analyzed (Fig. 4), being mainly detected in kidney and liver in the case of AMZ1 and in kidney and brain in the case of AMZ2.
Enzymatic Properties of Human AMZ1 and AMZ2 Produced in E. coli-To analyze the enzymatic properties of both human AMZs, we produced in E. coli two fusion proteins containing the putative catalytic domains of these enzymes linked to GST at their N termini. The catalytic domains were defined based on the alignments of human AMZs with the related archaeal proteins, which showed the maximum degree of conservation in the N-terminal region of these proteins. Then, these constructs (encoding amino acids 1-320 of AMZ1 and 1-300 of AMZ2) were transformed in E. coli BL21, and, after isopropyl-1-thio-β-D-galactopyranoside induction, bands of the expected size (55 kDa) were detected by SDS-PAGE and Western blot analysis of protein extracts using antibodies against GST (Fig. 5A). These recombinant GST-proteases were then purified by glutathione-Sepharose chromatography. To assess the identity of the proteins present in these bands, they were digested with trypsin and analyzed by mass spectrometry. The obtained spectra confirmed that the 55-kDa bands corresponded to GST-AMZ1 and GST-AMZ2 fusion proteins.
The recombinant human AMZ1 and AMZ2 proteins were then used in enzymatic assays with the fluorescent substrates commonly employed for assaying other proteases. These assays showed that recombinant AMZ1 exhibits a significant hydrolytic activity against Ala-AMC, whereas recombinant AMZ2 preferentially cleaves Arg-AMC (Fig. 5B). By contrast, we did not detect any significant activity of human recombinant AMZs against QF35 or QF41, two peptides widely used for assaying metalloendopeptidases. Likewise, we did not detect any evidence of endoproteolytic activity of human recombinant AMZs against a number of protein substrates including albumin, fibrillar collagens, gelatin, plasminogen, or aprotinin (data not shown). It is also noteworthy that the two human AMZs exhibited different optimal pH values for their hydrolyzing activities against AMC derivatives. Whereas AMZ1 reaches a maximum of activity at pH 8.0, the optimal pH for AMZ2 activity is 7.0 (Fig. 5C). We then tested the ability of different protease inhibitors to block the enzymatic activity of both human AMZs. As can be seen in Fig. 5D, the activity of both peptidases was inhibited by the general metalloprotease inhibitors o-phenanthroline and batimastat, but not by 4-(2-aminoethyl)-benzenesulfonyl fluoride, E-64, or tissue inhibitors of metalloproteinases (TIMPs), which are inhibitors of serine, cysteine, and matrix metalloproteases, respectively. Interestingly, AMZ1 and AMZ2 activities were significantly inhibited by amastatin, which is an inhibitor of aminopeptidases (Fig. 5D). We next performed a kinetic analysis of the proteolytic reaction catalyzed by the catalytic domains of AMZ1 and AMZ2 with their preferred substrates (Ala-AMC and Arg-AMC, respectively).
The fitting of the resulting data to the Michaelis-Menten equation yielded kcat/Km values of 46 M⁻¹ s⁻¹ and 22 M⁻¹ s⁻¹ for catalytic domain proteins of AMZ1 and AMZ2, respectively, which are similar to the value reported for recombinant aminopeptidase O produced in the same expression system (30). We have also tried to perform similar experiments with the full-length proteins produced in bacterial systems. To date, these experiments have been hampered by the low amounts of full-length AMZs that can be recovered in active form by using different expression systems. Nevertheless, preliminary experiments performed with full-length AMZ1 and AMZ2 produced as His tail fusion proteins in E. coli BL21 pLysS have confirmed the above results for substrate specificity and sensitivity to inhibitors obtained by using the catalytic domains of both enzymes (data not shown).
To further characterize the enzymatic activity of the identified AMZ metalloproteases, several commercially available bioactive peptides were incubated in the presence of purified catalytic domains of AMZ1 or AMZ2, and the resulting samples were analyzed by mass spectrometry. As shown in Fig. 6, these experiments demonstrated that human AMZ1 exhibited aminopeptidase activity against neurogranin, whereas human AMZ2 was active against angiotensin III. Thus, as can be seen in Fig. 6A, neurogranin is detected as a 1800.1-Da peak consistent with its amino acid sequence (AAKIQASFRGHMARKK), whereas incubation of neurogranin with AMZ1 produced a single additional peak with a mass of 1657.9 Da, corresponding to the processed peptide KIQASFRGHMARKK (Fig. 6B). Notably, AMZ2 did not process neurogranin under the same experimental conditions (data not shown). Similarly, neither AMZ1 nor AMZ2 hydrolyzed angiotensin II (data not shown). However, AMZ2 cleaved the N-terminal Arg residue of angiotensin III (RVYIHPF), albeit with low efficiency, to produce angiotensin IV (VYIHPF) (Fig. 6, C and D).

DISCUSSION
In this work we describe two new human proteases that have been tentatively called archaemetzincin-1 and -2. According to a series of structural and enzymatic features, these proteins belong to a new family of metalloproteases characterized by a conserved motif (HEXXHXXGX3CX4CXMX17CXXC) that contains an archetypal zinc-binding site and four Cys residues that contribute to defining the specific signature of this novel metalloprotease family. Furthermore, enzymatic assays performed with human recombinant AMZs have provided the first evidence that these proteins are catalytically active metalloproteases that exhibit substrate specificity and sensitivity to inhibitors, which appears to indicate that both proteases may act predominantly as aminopeptidases.
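The aminopeptidase assignment rests on the mass differences between intact and processed peptides reported in Results. The sketch below recomputes those masses from the peptide sequences given in the text, using standard average residue masses plus one water per peptide. It computes neutral average masses; whether the reported peak values refer to neutral or protonated species is not stated, so agreement is checked only to within a few tenths of a dalton. The function and constant names are hypothetical.

```python
# Standard average residue masses (Da) for the 20 common amino acids.
AVG_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153  # average mass of H2O, added once per linear peptide


def average_mass(peptide):
    """Neutral average mass of an unmodified linear peptide (Da)."""
    return sum(AVG_RESIDUE_MASS[aa] for aa in peptide) + WATER


# Peptide sequences as given in the text.
NEUROGRANIN = "AAKIQASFRGHMARKK"
NEUROGRANIN_PROCESSED = "KIQASFRGHMARKK"
ANGIOTENSIN_III = "RVYIHPF"
ANGIOTENSIN_IV = "VYIHPF"

if __name__ == "__main__":
    for name, seq in [
        ("neurogranin peptide", NEUROGRANIN),
        ("AMZ1-processed peptide", NEUROGRANIN_PROCESSED),
        ("angiotensin III", ANGIOTENSIN_III),
        ("angiotensin IV", ANGIOTENSIN_IV),
    ]:
        print(name, round(average_mass(seq), 2))
    print("loss on AMZ1 processing:",
          round(average_mass(NEUROGRANIN) - average_mass(NEUROGRANIN_PROCESSED), 2))
```

The computed difference between the two neurogranin species equals the mass of two Ala residues, which matches the sequences quoted for Fig. 6 (AAKIQ... versus KIQ...) and is what would be expected from sequential removal of N-terminal residues.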
An additional distinctive feature of this family of metalloproteases is the complex series of evolutionary events that have contributed to its creation and diversification in different organisms. In fact, our bioinformatic analysis revealed that these enzymes are widely distributed in vertebrate and archaeal organisms but are absent in the genomes of a number of model organisms such as E. coli, S. cerevisiae, A. thaliana, D. melanogaster, and C. elegans. The occurrence of genes shared by prokaryotes and vertebrates but absent in other eukaryotes has been widely considered as an indication of lateral gene transfer events from prokaryotes to vertebrates (31). Accordingly, AMZs could represent novel and interesting examples of these rare evolutionary events. However, the recent accumulation of data questioning many cases of lateral gene transfer to the vertebrate lineage (32-34) prompted us to perform an exhaustive bioinformatic search for AMZ genes in all available genome sequences. This analysis led us to identify additional AMZ-related sequences in other non-vertebrate eukaryotes and in two bacterial species, as well as to uncover a series of complex evolutionary events underlying the formation of this metalloprotease family (Fig. 3). According to this phylogenetic analysis, the evolutionary history of AMZs is best described by a scenario in which the primordial AMZ gene, related to current AMZ2 enzymes, arose in a common ancestor of archaeal and eukaryotic organisms. The lack of AMZ genes in virtually all the analyzed bacterial genomes is fully consistent with the proposed origin of AMZs after the appearance of the primordial bacterial organism. The presence of an AMZ in A. aeolicus, a hyperthermophilic bacterium that occupies an ecological niche dominated by Archaea, is best explained by lateral gene transfer from some of these archaeal organisms. Furthermore, the clear phylogenetic relationship between A. aeolicus AMZ2 and archaeal AMZs provides additional support to the occurrence of the proposed lateral transmission. Similarly, the finding of AMZ-related sequences in M. xanthus is likely the result of lateral transfer from Archaea, but in this case this event was followed by a rapid accumulation of mutations, which would explain its location as an out group in the phylogenetic tree. The evolutionary history of AMZs in eukaryotic organisms has also involved a series of diverse events since their separation from their common ancestor with Archaea. First, the absence of AMZ genes in plants, nematodes, or insects is remarkable, suggesting the occurrence of multiple gene loss events in these organisms. Consistent with this proposal, codon usage or nucleotide composition analysis of AMZ genes failed to provide any evidence of lateral transmission from Archaea to vertebrates. Finally, our phylogenetic analysis also revealed that eukaryotic AMZ1 diverged from AMZ2 recently, probably by gene duplication, again illustrating the genomic plasticity of this family of metalloproteases.
To further explore the functional relevance of AMZ1 and AMZ2, we performed an enzymatic analysis of both recombinant enzymes produced in E. coli. This analysis revealed that the recombinant proteins are catalytically active and that their activities seem to correspond to those of aminopeptidases.
However, the two human metalloproteases show different substrate preferences. Whereas AMZ1 preferentially targets substrates that contain Ala at their N termini, AMZ2 mainly hydrolyzes substrates with Arg at that position. Consistent with this finding, AMZ1 could hydrolyze the N-terminal Ala of neurogranin, whereas AMZ2 processed the N-terminal Arg of angiotensin III. Furthermore, AMZ1 and AMZ2 failed to hydrolyze synthetic peptides such as QF35 and QF41, which are used for the analysis of metalloendopeptidases. The activities of both enzymes were abolished by general metalloprotease inhibitors such as o-phenanthroline and batimastat and by the specific aminopeptidase inhibitor amastatin, providing further evidence that AMZs could be classified as metalloaminopeptidases. Also in this regard, it is remarkable that the activity of the recombinant catalytic domains of AMZs is relatively low, which is somewhat reminiscent of the limited proteolytic activity observed with the catalytic domains of ADAMTSs (a disintegrin and metalloprotease with thrombospondin motifs) but is in contrast to the case of matrix metalloproteinase catalytic domains, which readily hydrolyze linear peptides and gelatin (35-37). On this basis, it is tempting to speculate that the catalytic domains of some metalloproteinases including AMZs and ADAMTSs may have very strict requirements in terms of the presence of additional motifs or domains to exhibit their full potential as proteolytic enzymes.
In this work, we have also analyzed the distribution of AMZ1 and AMZ2 in human tissues. These studies allowed us to detect the predominant expression of AMZ1 in liver and heart, whereas AMZ2 mRNA was mainly detected in heart and testis. This pattern of expression suggests that both AMZs are implicated in the development or physiology of these organs. Nevertheless, further studies will be required to validate at the protein level the observed distribution of AMZ RNAs in human tissues. Additional clues about the physiological and pathological roles of these enzymes may be derived from their chromosome locations at 7p22 and 17q24, respectively. Alterations in these regions have been frequently associated with cancer and other diseases such as hypertension or multiple sclerosis (38-41). Further studies will be required to ascertain whether AMZs could be a direct target of any of these genetic abnormalities resulting in cancer or other pathological conditions. Likewise, further experimental work, including the three-dimensional structural analysis of these enzymes and the generation of mutant organisms deficient in these proteases, will be necessary to clarify their functional roles and to define their precise relevance in the context of the growing complexity of proteolytic systems operating in all living organisms.