Structural and Biochemical Characterization of a Novel Aminopeptidase from Human Intestine*

Background: A protein product of the NAALADL1 gene is a homolog of glutamate carboxypeptidase II, a metallopeptidase studied as a promising theranostic cancer agent.
Results: We solved the x-ray structure and analyzed the substrate specificity of the NAALADL1 gene product.
Conclusion: We demonstrated that the protein represents a novel human ileal aminopeptidase.
Significance: This study describes a novel enzyme involved in protein/peptide digestion in the small intestine and clarifies controversial previous reports.

N-acetylated α-linked acidic dipeptidase-like protein (NAALADase L), encoded by the NAALADL1 gene, is a close homolog of glutamate carboxypeptidase II, a metallopeptidase that has been intensively studied as a target for imaging and therapy of solid malignancies and neuropathologies. However, neither the physiological functions nor structural features of NAALADase L are known at present. Here, we report a thorough characterization of the protein product of the human NAALADL1 gene, including heterologous overexpression and purification, structural and biochemical characterization, and analysis of its expression profile. By solving the NAALADase L x-ray structure, we provide the first experimental evidence that it is a zinc-dependent metallopeptidase with a catalytic mechanism similar to that of glutamate carboxypeptidase II yet distinct substrate specificity. A proteome-based assay revealed that the NAALADL1 gene product possesses previously unrecognized aminopeptidase activity but no carboxy- or endopeptidase activity. These findings were corroborated by site-directed mutagenesis and identification of bestatin as a potent inhibitor of the enzyme. Analysis of NAALADL1 gene expression at both the mRNA and protein levels revealed the small intestine as the major site of protein expression and points toward extensive alternative splicing of the NAALADL1 gene transcript. Taken together, our data imply that the NAALADL1 gene product's primary physiological function is associated with the final stages of protein/peptide digestion and absorption in the human digestive system. Based on these results, we suggest a new name for this enzyme: human ileal aminopeptidase (HILAP).

Human N-acetylated α-linked acidic dipeptidase-like protein (NAALADase L) is encoded by the NAALADL1 gene localized at chromosome 11q12. The gene spans more than 14 kbp and contains 18 exons (1). Translation of the mRNA transcript yields a 740-amino acid type II transmembrane protein. According to the MEROPS database, NAALADase L belongs to the M28B metallopeptidase subfamily (MEROPS ID: M28.011). Glutamate carboxypeptidase II (GCPII), which shares 37% sequence identity and 50% similarity with NAALADase L, is another member of this subfamily.
In 1997, Shneider et al. (2) identified a NAALADase L protein (designated as I100) in rat ileum. They analyzed mRNA expression in rat and human tissues, raised a polyclonal antibody against I100, and detected DPP IV activity in an immunoprecipitate from rat ileum. The only other report to date describing the NAALADL1 gene product was published by Pangalos et al. in 1999 (1). These researchers performed a comparative analysis of the human NAALADL1 gene product and its two close paralogs NAALADase I and NAALADase II, which are now known as GCPII and GCPIII, respectively. They gave the protein its current name, NAALADase L, based on the close homology of these enzymes. They cloned NAALADase L cDNA, identified its position in the human genome, analyzed its alternative splicing, and detected DPP IV activity in NAALADase L-transfected cell lysate. Additionally, they detected DPP IV activity in GCPII- and GCPIII-transfected cell lysates. However, this activity was subsequently not confirmed for GCPII and GCPIII in experiments with purified recombinant protein (3,4). In the present study, we performed a thorough structural and biochemical characterization of the human NAALADL1 gene product using a purified recombinant protein preparation to elucidate the physiological role of the protein.

EXPERIMENTAL PROCEDURES
Cloning NAALADase L cDNA and Preparation of Expression Plasmids-Two plasmids, A (ID LIFESEQ95147340) and B (ID LIFESEQ4181072), containing cDNA coding for NAALADase L were purchased from Open Biosystems (now GE Healthcare). The extracellular portion of NAALADase L (aa 28-740) was amplified from plasmid A, and BclI and XhoI restriction sites were introduced by PCR using primers FNAL28BclI (aaatgatcaatccccaaaaaagccaactcactggc) and RNAL740XhoI (tttctcgagtcatcagaggtcagccacaggcc). The PCR product was then ligated via BglII and XhoI restriction sites into pMT/BiP/V5-HisA (Life Technologies, Inc.), resulting in pMT/BiP/rhNaalL28-740. Sequencing identified two mutations in pMT/BiP/rhNaalL28-740 (resulting in aa mutations L364P and L393P). Therefore, the mutated part of DNA was replaced with the corresponding DNA from plasmid B utilizing NcoI and KpnI cleavage sites to insert leucine residues at the proper positions in the plasmid DNA. Additional sequencing confirmed that the mutations were repaired.
For preparation of N-terminal His-tagged NAALADase L, the extracellular portion of the protein (aa 28-740) was amplified from pMT/BiP/rhNaalL28-740 using primers FNAL28NdeI (aaacatatgatccccaaaaaagccaactcactggc) and RNAL740XhoI. The PCR product was then ligated into pET28b via NdeI and XhoI restriction sites, yielding pET28b-HisNaalL.
Preparation of His-NaalL Protein for Immunization-The plasmid pET28b-HisNaalL was used for large scale expression of the extracellular portion (aa 28-740) of NAALADase L with an N-terminal His tag (His-NaalL) in Escherichia coli BL21-CodonPlus-RIL cells using 1 mM isopropyl 1-thio-β-D-galactopyranoside for induction. His-NaalL was purified from inclusion bodies under denaturing conditions. First, the cell pellet was dissolved in running buffer (8 M urea, 0.5 M NaCl, 50 mM Na2HPO4, pH 8.0) and loaded onto a HisTrap HP (GE Healthcare) column using an ÄKTAexplorer LC system (GE Healthcare). The column was washed with running buffer containing 10 mM imidazole, and protein was eluted with 1 M imidazole. The elution fractions were pooled and used for mouse immunization.
Preparation of Monoclonal Antibody against NAALADase L-Monoclonal antibodies were prepared according to an established protocol (7). Briefly, BALB/c mice were intraperitoneally immunized with purified His-NaalL in 8 M urea (40 μg in 100 μl) mixed with an equal volume of incomplete Freund's adjuvant at 3-week intervals. Hybridoma cells, prepared by fusion of immune spleen cells and Sp/0 myeloma cells, were cloned by limited dilution and assayed for production of NAALADase L-specific antibodies by enzyme-linked immunosorbent assay. Positive selection was performed against His-NaalL, and negative selections were performed against rhGCPII and rhGCPIII prepared in our laboratory (3,4). Four positive hybridoma clones (1C11, 3C11, 3D10, and 4E9) were expanded, and conditioned medium from the most sensitive hybridoma clone (1C11) was used for immunoblots. Additionally, we tested the ability of all four hybridoma supernatants to immunoprecipitate native Avi-NaalL. The results indicated that these monoclonal antibodies bind specifically to the denatured form of NAALADase L (data not shown).
Expression and Purification of Avi-NaalL and Avi-NaalL(E416A)-Preparation and purification of Avi-NaalL was performed as described previously (5). Briefly, Drosophila S2 cells expressing E. coli biotin-ligase (BirA) in the endoplasmic reticulum were transfected with pMT/BiP/AviTEV/rhNaalL (encoding the extracellular portion (aa 28-740) of NAALADase L with an N-terminal Avi tag and TEV protease cleavage site). Large scale expression of Avi-NaalL was performed, and the conditioned medium was concentrated 10-fold. The recombinant protein was purified on Streptavidin Mutein Matrix™ (Roche Applied Science), followed by size exclusion chromatography on Superdex™ 200 10/300 GL (GE Healthcare) in 10 mM Bistris propane, 150 mM NaCl, pH 7.4. The concentration of the final protein preparation was determined by amino acid analysis using a Biochrom30 amino acid analyzer (Biochrom). The preparation of the Avi-NaalL(E416A) mutant was performed analogously.
HIV-1(D25N) protease was overexpressed in E. coli BL21(DE3)RIL (Novagen) transformed with the pET24a_HIVprD25N expression plasmid. Insoluble recombinant protein, accumulated in the form of inclusion bodies, was isolated and solubilized in 50% (v/v) acetic acid. Solubilized protein was purified by gel chromatography in 50% (v/v) acetic acid, using a Superdex G75 16/60 gel filtration column (GE Healthcare). Fractions containing mutant HIV-1 protease were pooled, and the protein was subsequently refolded by dropwise dilution into a 19-fold excess of water and dialysis against water for 3 h at 4°C, followed by overnight dialysis against 50 mM MES, 10% (v/v) glycerol, 1 mM EDTA, 0.05% (v/v) 2-mercaptoethanol, pH 5.8. The purified protein was stored at −70°C.
Crystallization and Data Collection-Avi-NaalL stock solution (5 mg/ml) was mixed in a 1:1 ratio with reservoir solution (0.16 M magnesium formate and 16% (w/v) PEG 3350). Crystals were grown using the hanging drop vapor diffusion method at 293 K. Diffraction quality crystals (0.3 × 0.2 × 0.2 mm³) were obtained by seeding. For diffraction experiments, crystals were cryogenized in liquid nitrogen after transferring to cryoprotectant containing 0.16 M magnesium formate, 16% (w/v) PEG 3350, and 14% (v/v) glycerol. The diffraction data were collected at 100 K using synchrotron radiation at the Southeast Regional Collaborative Access Team sector 22 (BM beamline) of the Advanced Photon Source (Argonne, IL) with the x-ray wavelength tuned to 1.0000 Å. The complete data set was collected from a single crystal and recorded on a MarMosaic 225-mm CCD detector. Data processing was performed with the HKL2000 software package (8).
Structure Determination and Refinement-The Avi-NaalL structure was solved by molecular replacement using the program Phaser (9) and coordinates of GCPII (PDB entry 2PVW) as a search model. Subsequent structural refinement was carried out with the program Refmac 5.1 (10). During the refinement process, 1% of randomly selected reflections were kept aside for cross-validation (Rfree). Refinement was interspersed with manual corrections aided by the program Coot (11). The stereochemical quality of the final model was evaluated using the Web server MolProbity (12). The final model and experimental amplitudes were deposited in the RCSB Protein Data Bank (entry 4TWE).
Quantitative PCR (qPCR)-Amounts of NAALADL1 gene transcript were quantified using two distinct sets of primers. The first set, NaalL_A, contained forward and reverse primers (sequences cactgggctgccactacag and ttgtagacgctcacattcacct) and fluorescent hydrolysis Universal ProbeLibrary Probe 3 (Roche Applied Science). This set was designed to amplify nucleotides 945-1032 in NAALADL1 transcript NM_005468, which covers the region containing exons 6 and 7 and corresponds to aa 310-339 in the longest ORF. The expected size of the amplified product was 88 bases; the size of the possibly amplified genomic DNA was 758 bases. The second set, NaalL_B, consisted of forward and reverse primers (sequences atctcctccatggacattgc and gctgtgtggtaggtggggta) and fluorescent hydrolysis Universal ProbeLibrary Probe 41 (Roche Applied Science). This set was designed to amplify nucleotides 1580-1656 in NAALADL1 transcript NM_005468, which covers the region containing exons 13 and 14 and corresponds to aa 522-547 in the longest ORF. The expected size of the amplified product was 77 bases, whereas the size of the possibly amplified genomic DNA was 1024 bases.
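The amplicon sizes quoted above follow from the inclusive nucleotide coordinates on NM_005468. The short Python sketch below reproduces that arithmetic; the function and dictionary names are illustrative only and are not part of the original workflow.

```python
def amplicon_length(first_nt: int, last_nt: int) -> int:
    """Length of an amplicon given inclusive, 1-based transcript coordinates."""
    if last_nt < first_nt:
        raise ValueError("last_nt must not be smaller than first_nt")
    return last_nt - first_nt + 1


# Coordinates on transcript NM_005468 as stated in the text
PRIMER_SETS = {
    "NaalL_A": (945, 1032),
    "NaalL_B": (1580, 1656),
}

for name, (start, end) in PRIMER_SETS.items():
    print(name, amplicon_length(start, end), "bases")
```

The same helper can be used to sanity-check any other primer pair before ordering, provided the coordinates are inclusive.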
All qPCRs (10 μl) were carried out in triplicate in sealed 96-well plates (FrameStar 480, 4titude) using a LightCycler 480 II instrument (Roche Applied Science). Each reaction consisted of LightCycler 480 Probe Master (Roche Applied Science) diluted according to the manufacturer's protocol, forward and reverse primers (final concentration 1 μM each), fluorescent probe (final concentration of probe 3 was 100 nM; final concentration of probe 41 was 50 nM), and template DNA. Reaction mixtures without template DNA were prepared as controls. Moreover, interplate calibrators were included in each plate. Initial denaturation for 3 min at 95°C was followed by 45 cycles of 10 s at 95°C, 30 s at 66°C, and 30 s at 72°C. The threshold cycle numbers (Cq) were then determined from fluorescence intensities acquired during the qPCR runs by the second derivative maximum method using LightCycler 480 software (Roche Applied Science). The presence and size of PCR products for all qPCRs were analyzed by agarose gel electrophoresis.
A set of serial dilutions, ranging from 10² to 10⁷ copies/reaction, of the pMT/BiP/AviTEV/rhNaalL plasmid was used for absolute quantification. The initial concentration of plasmid DNA prior to dilution was determined spectrophotometrically based on absorption at 260 nm (Nanodrop ND-1000, Thermo Scientific). The amount of NAALADL1 transcript was then measured in 0.1-μl aliquots of tissue cDNA libraries obtained from Clontech (Human MTC Panels I and II).
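Absolute quantification from a plasmid dilution series is commonly done by fitting a straight line to Cq versus log10(copies) and inverting it for unknown samples. The sketch below illustrates that general approach in plain Python; it is not the LightCycler 480 implementation, and the calibration values are hypothetical (an idealized series with perfect doubling per cycle), not data from this study.

```python
import math


def fit_standard_curve(copies, cq):
    """Least-squares fit of Cq = slope * log10(copies) + intercept."""
    x = [math.log10(c) for c in copies]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(cq) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, cq))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_cq(cq, slope, intercept):
    """Invert the standard curve to estimate template copies per reaction."""
    return 10 ** ((cq - intercept) / slope)


def amplification_efficiency(slope):
    """Efficiency per cycle; 1.0 corresponds to perfect doubling."""
    return 10 ** (-1.0 / slope) - 1.0


# Hypothetical calibration data: 10^2 to 10^7 copies, ideal doubling per cycle
standard_copies = [10 ** k for k in range(2, 8)]
standard_cq = [40.0 - math.log2(c) for c in standard_copies]

slope, intercept = fit_standard_curve(standard_copies, standard_cq)
print("slope:", round(slope, 3))
print("efficiency:", round(amplification_efficiency(slope), 3))
```

In practice the triplicate Cq values of each tissue sample would be averaged (or fitted individually) before being passed to `copies_from_cq`.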
To assess the selectivity of qPCR amplification, pcDNA4 plasmids with subcloned protein coding sequences of homologous human transcripts (ORF from NM_004476 coding aa 1-750 of GCPII, ORF from NM_005467 coding aa 1-740 of GCPIII) at concentrations of 10⁶ copies/reaction were amplified in the same qPCR setup with both primer sets. Possible products were inspected by agarose gel electrophoresis. Amplification of isolated human genomic DNA was tested in the same manner.
Human Tissue Lysate Preparation-Human tissue lysates were prepared as described previously (13). Briefly, human samples were obtained during autopsy performed 4-8 h postmortem. The samples were homogenized in 50 mM Tris-HCl, 150 mM NaCl, pH 7.5, supplemented with EDTA-free protease inhibitors (Roche Applied Science). The samples were then sonicated using Soniprep 150 (Sanyo) three times for 10 s on ice, and 1% (v/v) Triton X-100 was added. After a 30-min incubation on ice, the suspension was centrifuged at 16,000 × g for 10 min, and the supernatants were stored at −80°C for further experiments.
SDS-PAGE and Western Blotting-Protein samples were resolved by reducing SDS-PAGE. Gels were silver-stained for analysis of protein purification or stained with colloidal Coomassie G-250 for mass spectrometry analysis or electroblotted (100 V for 1 h) onto a nitrocellulose membrane using a Mini Trans-Blot Cell wet blot system (Bio-Rad).
MS Analysis-The chosen gel sections were excised, destained, reduced with DTT, alkylated with iodoacetamide, and digested with trypsin for 10 h. Peptides were extracted and dissolved in 0.1% formic acid (FA). Samples were analyzed with an UltiMate 3000 RSLCnano system (Thermo Scientific) coupled to a TripleTOF 5600 mass spectrometer with a NanoSpray III source (AB Sciex). Peptides were separated on an Acclaim PepMap100 analytical column (3 μm, 250 mm × 75-μm inner diameter; Thermo Scientific). The concentration of buffer B (0.1% FA in acetonitrile) was gradually increased from 5 to 30% in buffer A (0.1% aqueous FA) over 40 min. The MS mass range was set to 350-1250 m/z; in MS/MS mode, the instrument acquired fragmentation spectra within 100-1600 m/z. The program Protein Pilot 4.5 (AB Sciex) was used for protein identification against the UniProt Homo sapiens database (reviewed December 20, 2013).
Samples for proteomic identification of cleavage site (PICS) were analyzed in the same manner but without gel extraction and by applying a 95-min instead of a 40-min gradient. PICS results were searched against the UniProt H. sapiens database containing typical MS contaminants (reviewed June 23, 2014).
DPP IV Activity Assay-Protein samples, either purified recombinant Avi-tagged protein or LNCaP cell lysate, were incubated in 25 mM Bistris propane, 150 mM NaCl, pH 8.5, with 100 μM Gly-Pro-aminomethylcoumarin (Sigma) for 15 h at 37°C. The reactions were performed in triplicate in white 96-well polypropylene plates with V-bottoms. The fluorescence was measured on an Infinite M1000 PRO (TECAN) with excitation and emission wavelengths/bandwidths of 335/5 nm and 450/10 nm, respectively.
Synthesis of Dipeptides and Dipeptide Libraries-Substrate mixtures of the general formula Ac-Aaa-Xxx-OH (where Aaa is one of 19 proteinogenic amino acids (all except cysteine), and Xxx is an equimolar mixture of all 19 amino acids (all except cysteine)) were prepared by solid phase peptide synthesis on 2-chlorotritylchloride resin (for C-terminal amino acids Asp, His, Pro, and Trp) or on Wang resin (for all other C-terminal amino acids) using the Fmoc (N-(9-fluorenyl)methoxycarbonyl)/t-butyl protection and N,N′-diisopropylcarbodiimide/1-hydroxybenzotriazole coupling strategy described by Barlos et al. (15).
N-terminal acetylation was accomplished with acetic anhydride in the presence of N,N-diisopropylethylamine. The completeness of all couplings was controlled by a ninhydrin test (16). A mixture of 94% trifluoroacetic acid, 2% triisopropylsilane, 2% ethanedithiol, and 2% water was used for final cleavage from the resin. The peptides were freeze-dried from 10% acetic acid in water.
To confirm the presence and quantify the amount of all N-acetylated dipeptides in the mixtures, samples were analyzed by amino acid analysis using a Biochrom30 amino acid analyzer (Biochrom) following the manufacturer's protocol.
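To make the composition of the substrate mixtures concrete, the sketch below enumerates the Ac-Aaa-Xxx-OH libraries described above: 19 mixtures, each containing 19 dipeptides, and the resin rule keyed on the C-terminal residue. The function and variable names are illustrative assumptions, not part of the published protocol.

```python
# The 20 proteinogenic amino acids minus cysteine (three-letter codes)
AMINO_ACIDS = [
    "Ala", "Arg", "Asn", "Asp", "Gln", "Glu", "Gly", "His", "Ile", "Leu",
    "Lys", "Met", "Phe", "Pro", "Ser", "Thr", "Trp", "Tyr", "Val",
]

# C-terminal residues for which 2-chlorotritylchloride resin was used
CHLOROTRITYL_C_TERMINI = {"Asp", "His", "Pro", "Trp"}


def resin_for(c_terminal: str) -> str:
    """Resin choice stated in the text, based on the C-terminal residue."""
    if c_terminal in CHLOROTRITYL_C_TERMINI:
        return "2-chlorotritylchloride"
    return "Wang"


def build_libraries():
    """Map each fixed N-terminal residue Aaa to its mixture of dipeptides."""
    return {
        aaa: [f"Ac-{aaa}-{xxx}-OH" for xxx in AMINO_ACIDS]
        for aaa in AMINO_ACIDS
    }


libraries = build_libraries()
total = sum(len(members) for members in libraries.values())
print(len(libraries), "libraries,", total, "dipeptides in total")
```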
Carboxypeptidase Activity Assay-Reactions were performed in a round-bottom 96-well PP plate. The enzyme (final concentration 0.25 μM) and 5 μl of N-acetylated dipeptide library (the final concentration of each individual dipeptide in the library was ~25 μM) were diluted with reaction buffer (25 mM Bistris propane, 150 mM NaCl, 0.001% (w/v) C12E8, pH 7.4) to a final volume of 50 μl. The reaction was then incubated at 37°C for 15 h and subsequently analyzed by HPLC. The cleaved free amino acids in the reaction mixtures were analyzed on an Agilent 1260 HPLC system (Agilent Technologies) equipped with an AccQ-Tag Ultra RP 130 Å column (1.7 μm, 100 mm × 2.1-mm inner diameter; Waters) following a previously published protocol for amino acid analysis (17).
Enzymatic Assay with Peptides from PICS-Reactions were performed in 1. Separation of the peptides and amino acids was conducted on the Acquity UPLC BEH-C18 130 Å column (1.7 μm, 100 mm × 2.1-mm inner diameter; Waters) using a 10-min gradient from 2 to 100% mobile phase B (A, 0.1% aqueous FA; B, 0.1% FA in acetonitrile). The column temperature was 25°C, and the flow rate was 0.3 ml/min.
HPLC Analysis of Aminopeptidase Activity-Reactions were performed in round-bottom 96-well PP plates. For determination of kinetic parameters of IIDPNG peptide cleavage by Avi-NaalL, reactions were performed at peptide concentrations ranging from 6 to 1600 μM. Avi-NaalL was first diluted in reaction buffer (25 mM Bistris propane, 150 mM NaCl, 0.001% (w/v) C12E8, pH 7.4) in a final volume of 40 μl to ensure 10-20% conversion. The enzyme was then preincubated for 5 min at 37°C, and 10 μl of peptide substrate (IIDPNG) at various concentrations was added to start the reaction. After 15 min, the reactions were stopped by the addition of 50 μl of stopping buffer (1 M glycine-HCl, pH 2.4).
For determination of the inhibition constant of bestatin, 90 ng of Avi-NaalL, ensuring ~10% conversion, and bestatin hydrochloride (AppliChem) at concentrations ranging from 0.1 to 500 μM were mixed in reaction buffer in a final volume of 40 μl. The reactions were preincubated for 5 min at 37°C, and then 10 μl of peptide substrate (IIDPNG; final concentration 50 μM) was added to start the reaction. After 15 min, the reactions were stopped by the addition of 50 μl of stopping buffer. Each reaction was performed in duplicate.
The reactions were subsequently analyzed on an Agilent 1260 HPLC system (Agilent Technologies) equipped with the Acquity UPLC BEH-C18 130 Å column (1.7 μm, 100 mm × 2.1-mm inner diameter; Waters) using a 6-min gradient from 2 to 100% mobile phase B (A, 0.1% aqueous FA; B, 0.1% FA in acetonitrile). The column temperature was 25°C, and the flow rate was 0.3 ml/min. The obtained data were then processed using GraFit version 5.0.11 (Erithacus Software Ltd.).
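The text does not describe the fitting procedure used within GraFit. As an illustration of how Michaelis-Menten parameters can be estimated from initial rates measured over a substrate series, the sketch below uses a Hanes-Woolf linearization ([S]/v plotted against [S]), a deliberately simple stand-in for nonlinear regression. The Km and Vmax values and the rates derived from them are hypothetical and noise-free; only the 6-1600 μM concentration range is taken from the text.

```python
def michaelis_menten(s, vmax, km):
    """Initial rate predicted by the Michaelis-Menten equation."""
    return vmax * s / (km + s)


def fit_hanes_woolf(substrate, velocity):
    """Estimate (vmax, km) from a linear fit of [S]/v against [S].

    Slope equals 1/vmax and intercept equals km/vmax.
    """
    y = [s / v for s, v in zip(substrate, velocity)]
    n = len(substrate)
    mean_x = sum(substrate) / n
    mean_y = sum(y) / n
    sxx = sum((x - mean_x) ** 2 for x in substrate)
    sxy = sum((x - mean_x) * (yi - mean_y) for x, yi in zip(substrate, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / slope
    km = intercept * vmax
    return vmax, km


# Substrate concentrations in micromolar, spanning the range used in the text
substrate_um = [6, 12.5, 25, 50, 100, 200, 400, 800, 1600]

# Hypothetical "true" parameters used only to generate illustrative rates
TRUE_VMAX, TRUE_KM = 1.0, 150.0
rates = [michaelis_menten(s, TRUE_VMAX, TRUE_KM) for s in substrate_um]

vmax_est, km_est = fit_hanes_woolf(substrate_um, rates)
print("Vmax:", round(vmax_est, 4), "Km (uM):", round(km_est, 2))
```

With real, noisy data a weighted nonlinear fit is generally preferable, because linearizations distort the error structure of the measurements.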
The catalytic efficiency of the Avi-NaalL(E416A) mutant was determined by incubating Avi-NaalL(E416A) (final concentration 10 μM) with peptide IIDPNG (final concentration 4 mM) in reaction buffer at 37°C. Aliquots (2 μl) of the reaction mixture were withdrawn after 60, 120, and 240 min, added to 100 μl of stopping buffer, and subsequently analyzed by HPLC.
PICS Assay-PICS was performed according to a previously published protocol with several modifications (18). Non-transfected HEK293T cells were grown in DMEM to 90% confluence and harvested by washing into PBS buffer. Cells were transferred into hypotonic lysis buffer containing protease inhibitors and lysed by sonication, followed by cysteine reduction and alkylation. Proteins were precipitated using TCA and resolubilized into 200 mM HEPES, pH 7.5. Following resolubilization, the proteome-derived peptide library was prepared by cleaving isolated denatured proteins into peptides with trypsin (Sigma-Aldrich). After abolishing trypsin activity using 1 mM PMSF, a second round of sulfhydryl reduction and alkylation was performed. Primary amines on peptide N termini and lysine side chains were blocked by formaldehyde-cyanoborohydride reductive dimethylation. Excess modification reagents were removed by gel filtration, and the peptide library was purified and transferred to HPLC-grade water using a C-18 solid phase extraction cartridge Sep-Pak (Waters) according to the manufacturer's protocol. The peptide concentration was adjusted to 2 mg/ml. The integrity of the peptide library was confirmed by LC-MS/MS analysis, and aliquots were stored at −80°C until further use.
For the endopeptidase assay, the peptide library (final concentration 1 mg/ml) was incubated in 200 μl of 25 mM Bistris propane, 150 mM NaCl, pH 7.4, and 0.2 μM Avi-NaalL. The reaction was incubated for 12 h at 37°C. Following incubation, the reaction was heat-inactivated for 30 min at 70°C and titrated with 2 M HEPES, pH 8.5, to final pH 8.0.
Subsequently, newly formed free peptide N termini (products of proteolytic cleavage) were in vitro biotinylated by the addition of 350 μM sulfo-NHS-SS-biotin (Thermo Scientific) for 4 h at room temperature. Biotinylated products were then immobilized on streptavidin-agarose (Solulink) by a 2-h incubation with mild agitation at room temperature, followed by washing. Additional washing steps (2 M urea followed by 20% isopropyl alcohol, 5% DMSO, and 5% acetonitrile, all in washing buffer (50 mM HEPES, 150 mM NaCl, pH 7.5)) were added to the original protocol, followed by 10 washes with washing buffer. Immobilized peptides were eluted with 20 mM DTT, desalted using Pepclean C-18 reverse phase cartridges (Thermo Scientific) according to the manufacturer's protocol, and analyzed by mass spectrometry.
As negative controls, Avi-NaalL(E416A) and buffer alone (blank) were used. As positive controls, the HIV-1 protease cleavage profile in 100 mM sodium acetate, 300 mM NaCl, pH 4.7, was tested with wild-type and HIV-1(D25N) protease. The protease/library ratio was 1:200, and the final enzyme concentration was 0.2 μM.
PICS Data Analysis-Data were analyzed using a series of predesigned queries in Microsoft Access database software. First, lists of identified peptides from each MS run were loaded into the database and filtered for peptides containing products of N-terminal modification by biotinylation. Second, peptides with confidence over 80% were picked for enzymatic reactions (Avi-NaalL or HIV-1 protease), whereas peptides with confidence over 10% were picked for control reactions (reactions with catalytically ineffective forms of the enzymes and blanks). To properly subtract the background signal, the list of peptides found in each enzymatic reaction was screened for peptides present in the blank reaction and the reaction with catalytically ineffective enzyme. Such peptides were then removed from processing. Finally, the enzymatic reactions were screened for peptides identified in the original unprocessed peptide library. Such peptides were also removed from the analysis.
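The filtering and background subtraction described above were implemented by the authors as Microsoft Access queries. The same logic can be sketched in Python as set operations; the record layout, function name, and example peptides below are hypothetical and serve only to illustrate the sequence of steps (biotin filter, confidence cutoffs of 80% for enzyme runs and 10% for controls, then removal of peptides found in controls or in the untreated library).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeptideHit:
    sequence: str
    confidence: float    # identification confidence, in percent
    biotinylated: bool   # True if the N-terminal biotin-derived tag was found


def cleavage_products(enzyme_hits, control_hits, library_sequences,
                      enzyme_cutoff=80.0, control_cutoff=10.0):
    """Return enzyme-specific peptide sequences after background subtraction."""
    # Background: peptides seen in blank or inactive-enzyme runs (permissive
    # cutoff) plus peptides already present in the unprocessed library.
    background = {h.sequence for h in control_hits
                  if h.biotinylated and h.confidence > control_cutoff}
    background.update(library_sequences)
    kept = {h.sequence for h in enzyme_hits
            if h.biotinylated
            and h.confidence > enzyme_cutoff
            and h.sequence not in background}
    return sorted(kept)


# Hypothetical example records (not data from the study)
enzyme_run = [
    PeptideHit("LDPNGK", 95.0, True),    # kept
    PeptideHit("AVEELR", 99.0, True),    # also found in a control run
    PeptideHit("GSTPWK", 60.0, True),    # below the 80% cutoff
    PeptideHit("TTLQER", 92.0, False),   # no biotin tag
    PeptideHit("MNPLYR", 90.0, True),    # present in the untreated library
]
control_runs = [PeptideHit("AVEELR", 15.0, True)]
untreated_library = {"MNPLYR"}

print(cleavage_products(enzyme_run, control_runs, untreated_library))
```

Using a much lower confidence cutoff for the controls than for the enzymatic reactions makes the subtraction conservative: even weakly identified background peptides are excluded from the final list.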
The final cleared list of identified peptides was then mapped against the FASTA database used for proteomics database search. By identified peptide alignment with the database, the N-terminal portions of cleaved peptides (preceding the cleavage site) were determined. If there was more than one computationally identified amino acid for a given P position, the position was omitted from the processing, whereas the identified peptide sequence remained listed for downstream analysis. The final list of substrate peptides containing sequences of P′ amino acids identified in the MS experiment and four P amino acids identified computationally is shown in Table 2. The frequency of each amino acid in each position was then calculated and plotted, yielding substrate specificity matrix heat maps.
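The per-position frequency calculation behind such a specificity matrix can be sketched as follows. This is an assumed minimal implementation, not the authors' code: the sequences are hypothetical, and a `-` character is used here as an illustrative marker for positions omitted because the residue could not be assigned unambiguously.

```python
from collections import Counter


def position_frequencies(sequences, length):
    """Return one dict per position mapping residue -> fraction of sequences.

    Positions marked '-' (omitted/ambiguous) are skipped for that sequence.
    """
    matrix = []
    for pos in range(length):
        counts = Counter(
            seq[pos] for seq in sequences
            if len(seq) > pos and seq[pos] != "-"
        )
        total = sum(counts.values())
        if total:
            matrix.append({aa: n / total for aa, n in counts.items()})
        else:
            matrix.append({})
    return matrix


# Hypothetical aligned cleavage-site windows (not data from the study)
windows = ["ILDA", "IIDP", "-LNG"]
matrix = position_frequencies(windows, 4)
for index, column in enumerate(matrix, start=1):
    print("position", index, column)
```

Each column of the resulting matrix sums to 1, so the values can be plotted directly as a heat map or compared with the natural abundance of each residue in the library.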

RESULTS
We used an expression and purification system established in our laboratory to produce the extracellular portion (aa 28-740) of human NAALADase L, designated Avi-NaalL, and its putative catalytically ineffective mutant Avi-NaalL(E416A) (5). These recombinant proteins, with a final purity greater than 95% (Fig. 1), were used in all subsequent experiments.

X-ray Structure of Avi-NaalL
We prepared diffraction quality crystals of human Avi-NaalL in hanging drops using a microseeding approach. We solved the structure by molecular replacement using the program Phaser and refined the final model to 1.75 Å resolution (PDB code 4TWE; data collection and refinement statistics are shown in Table 1). As shown in Fig. 2A, the overall fold of Avi-NaalL is very similar to that of GCPII (PDB code 3BXM) with a root mean square deviation of 1.41 Å for the 646 equivalent Cα atoms (19). Guided by the well described structural features of GCPII (20,21), we annotated the Avi-NaalL structure accordingly.
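For readers unfamiliar with the metric, the root mean square deviation over paired atoms is the square root of the mean squared distance between equivalent positions. The sketch below assumes the two coordinate sets have already been superposed and paired one-to-one (the superposition step itself is not shown), and the coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) tuples.

    Assumes the structures are already superposed and atoms are paired.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical coordinates in angstroms (not taken from 4TWE or 3BXM)
model_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model_b = [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
print(rmsd(model_a, model_b))
```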
The Avi-NaalL monomer folds into three distinct domains: protease-like domain (residues 35-108 and 340-587), apical domain (residues 109-339), and C-terminal domain (residues 588-740). Two Avi-NaalL molecules in the asymmetric unit form a dimer with a buried surface area of more than 2300 Å². The dimerization interface mainly involves interactions between the C-terminal domain of the first monomer and the protease-like/apical domains of the second monomer (Fig. 2B). Size exclusion chromatography data also indicated that the protein is dimeric with an approximate relative molecular mass of 250 kDa (Fig. 1), suggesting that this is probably its physiological quaternary structure. This quaternary arrangement is reminiscent of the dimerization pattern of both human transferrin receptor and GCPII (22).
Unlike GCPII, Avi-NaalL contains one intramolecular disulfide bridge, linking Cys-296 with Cys-313. The Avi-NaalL sequence also contains seven potential N-glycosylation motifs. We found clear electron density for at least one N-acetylglucosamine moiety for all putative glycosylation sites except Asn-274. A sequence alignment of NAALADase L and GCPII is shown in Fig. 2C with several key functional amino acids highlighted.
A strong positive density peak representing a metal ion was observed in the Fo − Fc map of the Avi-NaalL structure. The metal is heptacoordinated by the γ-carboxylates of Glu-425 and Glu-428, the main chain carbonyls of Leu-261 and Thr-258, and the side chain hydroxyl group of Thr-258 (all at distances between 2.4 and 2.5 Å). The coordination sphere is completed by a water molecule (2.5 Å; Fig. 3A). In the final model, a Ca²⁺ ion was modeled into this position. Our decision to model a calcium ion was guided by several lines of indirect evidence, including the favored pentagonal bipyramidal geometry of the coordination sphere, agreement with the electron density, "optimal" interatomic distances, and matching atomic displacement parameters (B-factors) of the metal and coordinating residues (23). Additionally, the identity of the Ca²⁺ ion occupying the equivalent site in closely related human GCPII was confirmed experimentally before (20), thus strengthening our argument for calcium as the metal of choice. The exact structural and/or functional role of this cation, however, is not known at present and warrants further studies.
Avi-NaalL binds two Zn²⁺ ions, which are coordinated by the side chains of His-368, Asp-378, Glu-417, Asp-445, and His-545 (Fig. 3B) at distances between 1.9 and 2.0 Å. These zinc-coordinating residues adopt the same conformation as in GCPII, forming the core of the Avi-NaalL active site. Glu-416, analogous to Glu-424 in GCPII (24), probably serves as a catalytic acid/base residue for peptide cleavage. Additionally, the presence of zinc ions was experimentally confirmed by fluorescent scanning of the Avi-NaalL protein crystal (data not shown). Similar to GCPII, Avi-NaalL also contains a cis peptide bond between Asp-378 and Pro-379, which is a hallmark of bimetallic peptidases.

FIGURE 1. SDS-PAGE and size exclusion chromatography analysis of Avi-NaalL(E416A) affinity purification. Concentrated conditioned medium from S2 cells mixed with equilibration buffer (fraction L) was mixed with Streptavidin Mutein Matrix overnight. The medium was separated from the resin (fraction FT), and the resin was washed with washing buffer (fraction W1). The resin was then transferred to elution buffer (fraction W2). After a 1-h incubation, Avi-NaalL(E416A) was eluted with an excess of D-biotin (fractions E1-E4). Samples were loaded on the gel and silver-stained: fractions L and FT, 1 μl; fractions W1 and W2, 15 μl; fraction E1, 1 μl; fraction E2, 10 μl; fractions E3 and E4, 15 μl. The quaternary fold of Avi-NaalL(E416A) was subsequently analyzed using an FPLC system and Superdex™ 200 10/300 column. Prior to the analysis, the column was calibrated by low and high molecular weight gel filtration calibration kits (GE Healthcare) following the manufacturer's protocol. mAU, milliabsorbance units.

TABLE 1 Data collection and refinement statistics for Avi-NaalL
The data in parentheses refer to the highest resolution shell. Amino acids of the AviTEV-tag and first seven N-terminal amino acids of Avi-NaalL are disordered and not modeled in the structure. RMSD, root mean square deviation; AU, asymmetric unit.

Data collection statistics
FIGURE 2. X-ray structure of Avi-NaalL and its structural and sequence comparison to GCPII. A, alignment of Avi-NaalL (PDB code 4TWE) and GCPII (PDB code 3BXM) structures. Both are shown in ribbon representation. GCPII is shown in gray. Avi-NaalL is color-coded based on its domain organization: protease-like domain (red), apical domain (yellow), and C-terminal domain (orange). The zinc ions are shown as magenta spheres, and calcium ion is shown as a blue sphere. B, overall structure of Avi-NaalL dimer shown in ribbon representation. One monomer is shown in wheat, and the other is colored based on its domain organization. The zinc ions are shown as magenta spheres, and calcium ions are shown as blue spheres. The carbohydrate moieties are depicted as cyan sticks with oxygen atoms in red and nitrogen in blue. The disulfide bond between Cys-296 and Cys-313 is shown as a green stick. The top view of the dimer illustrates the extensive dimerization interface. C, sequence alignment of GCPII and NAALADase L. Conserved amino acids are highlighted in yellow, and similar amino acids are highlighted in cyan. The domain organization of NAALADase L is indicated by arrows above the sequence colored as follows: intracellular domain (black), transmembrane domain (gray), protease-like domain (red), apical domain (yellow), and C-terminal domain (orange). The catalytic acid/base Glu is marked with a blue star. Residues coordinating the zinc ions are black-boxed. N-Glycosylation sites are shown as blue circles, shaded if electron density for the corresponding N-acetylglucosamine was detected. Amino acids coordinating the calcium ion are shown as red circles, and Cys residues forming the disulfide bond are shown as green circles. Amino acids participating in N-acetyl-L-aspartyl-L-glutamate binding in GCPII are marked with magenta circles.

Although Avi-NaalL was crystallized in the absence of a ligand, we detected an extensive positive electron density peak in the vicinity of the zinc ions (Fig. 3C). Unfortunately, we were unable to reliably model any molecule into this electron density. Moreover, this electron density peak spans the expected position of the catalytic hydroxyl ion, which therefore could not be modeled in the Avi-NaalL structure. In GCPII, this hydroxyl ion completes the tetrahedral coordination of zinc ions and acts as a nucleophile during peptidic substrate cleavage (24).
The comparison of amino acids that mediate ligand binding in GCPII with their Avi-NaalL counterparts is illustrated in Fig. 3D; the structure of inactive GCPII in complex with its endogenous substrate N-acetyl-L-aspartyl-L-glutamate was used for comparison. We found two conserved residues that probably mediate substrate binding to Avi-NaalL, Arg-198 and Tyr-544. Their GCPII counterparts, Arg-210 and Tyr-552, interact with the C-terminal α-carboxyl group and the carbonyl oxygen of the cleaved peptide bond of the GCPII substrate. The residues Arg-534 and Arg-536, which form the so-called "arginine patch" in GCPII, thus providing the GCPII substrate binding cleft with a positive charge and contributing to its preference for negatively charged peptide substrates, are not conserved in Avi-NaalL (Fig. 3E). Consequently, the chloride ion, which stabilizes the energetically unfavorable "all-gauche" conformation of Arg-534 in GCPII, is not present in the Avi-NaalL structure.

Expression Pattern of the Human NAALADL1 Gene Product
Using qPCR, we determined the amount of NAALADL1 gene transcript in commercially available cDNA libraries from several human tissues, each normalized to a set of control genes. We used two primer sets, each consisting of a primer pair for amplification and a fluorescent probe for detection. The first set of amplification primers targeted the region spanning exons 6 and 7 (NaalL_A), whereas the second targeted exons 13 and 14 (NaalL_B). The absolute amounts of NAALADL1 transcripts detected are shown in Fig. 4A. We identified NAALADL1 gene transcripts in several tissues, including the testis, colon, small intestine, ovary, spleen, and thymus.
In contrast, using our newly developed mAb 1C11, we found that NAALADase L expression is primarily restricted to the small intestine (Fig. 4B). We verified the specificity of the Western blot signals by mass spectrometry, analyzing several gel areas corresponding to positive signals (boxed in Fig. 4B). The small intestine tissue sample (solid box) was the only one in which we identified the NAALADL1 gene product (UniProt accession number Q9UQQ1); we found 33 distinct peptides covering 41% of the sequence entry in the database. Gel sections from ovary tissue (dashed box), by contrast, did not contain NAALADase L, suggesting that 1C11 has nonspecific cross-reactivity.

Enzymatic Activities of NAALADase L
Putative DPP IV Activity. Because both human and rat NAALADase L were previously reported to possess DPP IV activity (1, 2), we attempted to confirm these findings using our purified protein preparations. As illustrated in Fig. 5, we were unable to detect any DPP IV activity for either Avi-NaalL or Avi-GCPII, whereas the LNCaP lysate used as a positive control produced a clear increase in fluorescence intensity.
Putative Carboxypeptidase Activity. Guided by our structural analysis of Avi-NaalL, we set out to investigate its putative carboxypeptidase activity. We synthesized and screened a dipeptide library encompassing 19 proteinogenic amino acids (all but cysteine). Our library consisted of 19 individual N-acetylated dipeptide mixtures, each mixture comprising 19 different dipeptides with defined N-terminal and variable C-terminal amino acids (e.g. Ac-Ala-Xxx-OH and Ac-Met-Xxx-OH). After incubating each mixture with Avi-NaalL, released C-terminal amino acids were detected using HPLC analysis. We used a catalytically ineffective mutant, Avi-NaalL(E416A), as a control. Additionally, we tested an active recombinant GCPII preparation (Avi-GCPII) and its catalytically ineffective E424A mutant as a corresponding control pair (5, 25).
The results of this carboxypeptidase activity assay, presented as differential heat maps of active and inactive protein preparations, are illustrated in Fig. 6. The control experiment with Avi-GCPII indicates that the assay is reliable because it confirms a clear preference for acidic residues (Glu/Asp) at both the C and N termini of the dipeptides (3). On the other hand, Avi-NaalL shows negligible activity and no preference for any amino acid at either the C or N terminus. These results demonstrate that Avi-NaalL does not possess detectable carboxypeptidase activity against N-acetylated peptide substrates.
Putative Endopeptidase Activity. We performed a PICS experiment to assess Avi-NaalL endopeptidase activity. This technique enables simultaneous characterization of both the S and S′ cleavage sites of an enzyme (18). We used a peptide library derived from the HEK293T cell proteome as substrate and included HIV-1 protease as a positive control to assess the functionality of the assay. Additionally, Avi-NaalL(E416A) and HIV-1(D25N), the corresponding catalytically ineffective mutants, were included to ensure reliable background subtraction.
The specificity matrices, presented in the form of heat maps, obtained from PICS are shown in Fig. 7. They summarize the sequences of peptide substrates for Avi-NaalL (72 individual sequences identified; a list of the peptides is shown in Table 2) and HIV-1 protease (over 160 individual sequences identified). For better clarity, we depicted matrices characterizing the percentage of individual amino acids at a given position in the identified cleaved peptides in Fig. 7A. Fig. 7B illustrates the ratios of individual amino acid occurrence at a given position compared with the overall occurrence of that amino acid in the human genome. Our data for HIV-1 protease showed high similarity to those found in the MEROPS database, demonstrating that our PICS experimental set-up provides a reliable account of protease substrate specificity. The specificity matrix for Avi-NaalL revealed substantial enrichment of acidic residues (Glu/Asp) in the P2′ position and Pro residue in the P3′ position.
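The two kinds of matrix described for Fig. 7 (percentage per position, and ratio to a background occurrence) can be illustrated with a minimal Python sketch. The peptide fragments and background frequencies below are hypothetical values chosen only to show the arithmetic; they are not data from this study, and the function names are illustrative.

```python
from collections import Counter


def position_frequencies(fragments, position):
    """Fraction of each amino acid at a 0-based position (Fig. 7A-style value)."""
    residues = [f[position] for f in fragments if len(f) > position]
    counts = Counter(residues)
    total = len(residues)
    return {aa: n / total for aa, n in counts.items()}


def enrichment(fragments, position, background):
    """Observed frequency divided by background frequency (Fig. 7B-style ratio).

    Ratios above 1 indicate over-representation at that position,
    ratios below 1 indicate under-representation.
    """
    observed = position_frequencies(fragments, position)
    return {aa: observed[aa] / background[aa] for aa in observed if aa in background}


# Hypothetical prime-side fragments, written P1', P2', P3', P4'
prime_sides = ["VDLE", "IDPN", "AELK", "GSPT"]

# Hypothetical background frequencies (NOT real proteome values)
background = {"D": 0.05, "E": 0.05, "S": 0.10}

# Index 1 corresponds to the P2' position
p2_prime_ratios = enrichment(prime_sides, 1, background)
print(p2_prime_ratios)
```

In this toy input, Asp occupies P2′ in two of four fragments, so its ratio is far above 1, which is the kind of signal the green shading in a ratio heat map would report.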
Novel Aminopeptidase Activity. To elaborate on the PICS results, we synthesized two of the identified peptides, peptide 52 (AVF↓VDLEPT) and peptide 71 (GLFI↓IDPNG), and assayed their processing by Avi-NaalL and Avi-NaalL(E416A).
During overnight incubation, both peptides were completely processed by wild-type Avi-NaalL, whereas its E416A mutant showed minimal activity (Fig. 8). Interestingly, we could identify only N-terminally truncated parts of the assayed peptides. Moreover, we detected free phenylalanine in both reactions, which most likely originated from the N-terminal portion of the peptides. These data suggest that Avi-NaalL possesses aminopeptidase rather than endopeptidase activity. To confirm this assumption, we performed a time-dependent analysis of hydrolysis of peptide 52 and found that Avi-NaalL indeed processes its peptide substrate sequentially from the N terminus (Fig. 9A).
FIGURE 6. Heat maps reflecting qualitative determination of carboxypeptidase activity. Recombinant Avi-NaalL and Avi-GCPII were incubated with 19 different N-acetylated dipeptide libraries of the general formula Ac-Aaa-Xxx-OH (where Aaa represents a constant amino acid, and Xxx is a mixture of 19 proteinogenic amino acids). After 15 h of incubation at 37°C, the free amine groups of cleaved C-terminal amino acids were modified, and the amino acids were detected by HPLC. Avi-NaalL(E416A) and Avi-GCPII(E424A), catalytically ineffective mutants of both enzymes, were used for background subtraction. The color coding corresponds to the percentage of conversion of the particular amino acid in the reaction mixture; the overall amount of each amino acid in dipeptide libraries was determined by amino acid analysis prior to the experiment.

To obtain more quantitative data on Avi-NaalL enzymatic activity, we designed a model peptide substrate derived from peptide 71 (I↓IDPNG) from which only one Ile should be cleaved off by Avi-NaalL. Using this model substrate, we determined kinetic parameters of Avi-NaalL cleavage, Km = 210 µM and kcat = 2.16 s⁻¹ (Fig. 9B), and also determined the inhibition constant of the common aminopeptidase inhibitor bestatin toward Avi-NaalL, Ki = 10.7 µM (Fig. 9C). Additionally, using a high substrate concentration to approximate saturation of the enzyme, we were able to determine the catalytic rate constant of the Avi-NaalL(E416A) mutant, kcat = 0.0011 s⁻¹. This result confirmed our prediction that Glu-416 indeed serves as a proton shuttle residue, which is crucial for the catalytic efficiency of the enzyme.
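As a worked illustration of how these constants relate, the following Python sketch derives the catalytic efficiency (kcat/Km) and the fold reduction in kcat for the E416A mutant from the reported values, and shows the standard Michaelis-Menten and Cheng-Prusoff (competitive inhibition) relations. The IC50 and substrate concentration passed to the Cheng-Prusoff function are hypothetical, because the text does not state them; the derived numbers are simple arithmetic on the reported constants, not additional measurements.

```python
def michaelis_menten_rate(kcat, km, enzyme, substrate):
    """Initial velocity v = kcat*[E]*[S] / (Km + [S]).

    km, enzyme and substrate must share one concentration unit.
    """
    return kcat * enzyme * substrate / (km + substrate)


def cheng_prusoff_ki(ic50, substrate, km):
    """Ki for a competitive inhibitor: Ki = IC50 / (1 + [S]/Km)."""
    return ic50 / (1.0 + substrate / km)


# Values reported in the text
KM_UM = 210.0         # Km of wild-type Avi-NaalL, in µM
KCAT_WT = 2.16        # kcat of wild-type Avi-NaalL, in s^-1
KCAT_E416A = 0.0011   # kcat of the E416A mutant, in s^-1

# Catalytic efficiency in M^-1 s^-1 (Km converted from µM to M)
efficiency = KCAT_WT / (KM_UM * 1e-6)

# Fold reduction in turnover caused by the E416A substitution
fold_drop = KCAT_WT / KCAT_E416A

print(f"kcat/Km = {efficiency:.0f} M^-1 s^-1")
print(f"kcat(WT) / kcat(E416A) = {fold_drop:.0f}")

# HYPOTHETICAL inputs, for illustration only: IC50 = 20 µM measured at [S] = Km
print(cheng_prusoff_ki(ic50=20.0, substrate=210.0, km=KM_UM))
```

With the reported constants, kcat/Km is roughly 1 × 10⁴ M⁻¹ s⁻¹ and the mutation lowers kcat by about three orders of magnitude.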
Finally, to further corroborate the Avi-NaalL substrate specificity, we designed a random peptide (RIKLNWFS) and assayed its degradation by Avi-NaalL. As shown in Fig. 10, the peptide was efficiently processed to single amino acids by Avi-NaalL during overnight incubation, a finding that is in line with the expected cleavage pattern because the peptide contained neither an acidic residue nor Pro that could block the peptide processing. Moreover, this experiment also revealed that Avi-NaalL is able to efficiently process longer oligopeptides as well as smaller tri- or dipeptides.
All of these results demonstrate that Avi-NaalL is an aminopeptidase with wide substrate specificity. Additionally, the results strongly suggest that the apparent Avi-NaalL endopeptidase-like substrate specificity determined by PICS (acidic residues in P2′ and Pro in P3′ position) is actually a motif preventing further N-terminal processing of peptides by Avi-NaalL.
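The sequential trimming described above can be sketched as a simple simulation. This is a deliberately simplified model, not the authors' analysis: it treats an acidic residue at P2′ or a Pro at P3′ as a hard stop, whereas the experimental data indicate that a single feature may only slow cleavage. The function names are illustrative.

```python
ACIDIC = {"D", "E"}


def is_blocked(peptide):
    """Assumed stop rule: the remaining peptide is read as P1', P2', P3', ...

    Trimming halts if P2' (index 1) is Asp/Glu or P3' (index 2) is Pro.
    """
    acidic_at_p2 = len(peptide) > 1 and peptide[1] in ACIDIC
    pro_at_p3 = len(peptide) > 2 and peptide[2] == "P"
    return acidic_at_p2 or pro_at_p3


def trim_from_n_terminus(peptide):
    """Remove N-terminal residues one at a time until the stop rule applies.

    Returns (released_residues, remaining_peptide).
    """
    released = []
    while len(peptide) > 1 and not is_blocked(peptide):
        released.append(peptide[0])
        peptide = peptide[1:]
    return released, peptide


for name, seq in [
    ("peptide 52", "AVFVDLEPT"),
    ("peptide 71", "GLFIIDPNG"),
    ("model substrate", "IIDPNG"),
    ("random peptide", "RIKLNWFS"),
]:
    released, remaining = trim_from_n_terminus(seq)
    print(f"{name}: released {''.join(released)}, remaining {remaining}")
```

Under this rule the simulation reproduces the cleavage sites AVF↓VDLEPT and GLFI↓IDPNG, removes a single Ile from the model substrate, and degrades the random peptide down to its last residue.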

DISCUSSION
FIGURE 7. Heat maps of PICS results. PICS experiments were performed as described under "Experimental Procedures" with Avi-NaalL and its catalytically ineffective mutant Avi-NaalL(E416A) and with HIV-1 protease and its catalytically ineffective mutant HIV-1(D25N) protease (positive control). Briefly, the proteins were incubated overnight with a peptide library derived from the HEK293T proteome. Cleaved peptides were subsequently isolated from the reaction mixture via biotin modification of newly formed primary amino groups. The peptides were identified by LC-MS/MS analysis, and their corresponding prime parts were determined by sequence alignment with a human proteome database. The MEROPS substrate profile of HIV-1 protease is presented for comparison (see the MEROPS Web site). A, heat maps illustrate in grayscale the percentage of individual amino acids at a particular position relative to the scissile peptide bond. Four positions (P4′-P4) are shown. B, heat maps illustrate ratios of individual amino acid occurrence at a given position compared with the overall occurrence of that amino acid in the human genome. Ratios above 1 (i.e. preferred by the protease analyzed) are shown in shades of green, whereas ratios below 1 (i.e. rejected by the protease analyzed) are shown in shades of red.

Structural Features of the NAALADL1 Gene Product. This is the first study to experimentally describe the structure of the NAALADL1 gene product, which is highly homologous to that of GCPII. The amino acids coordinating two active site zinc ions and the catalytic acid/base residue (Glu-416; Fig. 3B) in Avi-NaalL are identical to those in GCPII. On the other hand, the GCPII amino acids participating in substrate recognition and formation of the arginine patch, which provides positive charge to the GCPII active site, are not conserved in Avi-NaalL. This structural comparison suggested that Avi-NaalL might bind the C terminus of its potential peptide substrate and position it toward the catalytic acid/base residue to enable proteolysis. However, it also indicated that Avi-NaalL substrate specificity is probably different from that of GCPII.
Both GCPII and NAALADase L belong to the MH clan (family M28) of metallopeptidases. The primary feature of these peptidases is a di-zinc active site center with conserved zinc-coordinating amino acids His, Asp, Glu, Asp/Glu, and His, with the first Asp coordinating both metal ions. Additionally, for all peptidases from this clan, a further residue (Asp or Glu) that is important for catalysis has been identified; it is placed adjacent to the coordinating amino acids in the motif His-Xxx-Asp or Glu-Glu. Based on the sequence and structural alignment of Avi-NaalL with other members of the M28 family, such as aminopeptidase S from Streptomyces griseus (SGAP) or GCPII, Glu-416 most probably plays the role of an acid/base catalytic residue in Avi-NaalL. This assumption is additionally supported by the enzymological data with the E416A mutant presented in this study.
Similarly to other families from the MH clan (e.g. the M20 family), the M28 family of peptidases contains both carboxypeptidases (GCPII) and aminopeptidases (SGAP). In the M20 family, the peptidase enzymatic activity is determined by the sequence, more specifically by the presence of either a Glu or an Asp residue at the ambiguous position in the zinc coordination center. If a Glu residue is present, the peptidase exhibits carboxypeptidase activity (e.g. carboxypeptidase G2 from Pseudomonas sp.), whereas in the presence of an Asp residue, the peptidase acts as an aminopeptidase (e.g. aminopeptidase PepV from Lactobacillus delbrueckii) (26). Interestingly, this relationship is not observed in the M28 family, where an Asp coordinating residue is present in both GCPII and SGAP.
Considering tertiary and quaternary structure organization, Avi-NaalL adopts a structure identical to that of GCPII, containing three domains and forming a dimer, whereas aminopeptidases from the M28 family, such as SGAP, are single-domain monomeric proteins. On the other hand, our enzymological data indicate that Avi-NaalL, like SGAP, possesses aminopeptidase activity. Therefore, we hypothesize that Avi-NaalL represents an evolutionary intermediate between GCPII and SGAP, sharing the same structural features with the former and the same enzymatic activity with the latter. The elucidation of the Avi-NaalL reaction mechanism and substrate binding may lead to the identification of a structure-activity relationship similar to that described for the M20 family. However, additional crystallographic studies, targeting complexes of the wild-type enzyme with a potent inhibitor (e.g. bestatin) or complexes of a catalytically ineffective mutant (e.g. Avi-NaalL(E416A)) with a substrate, will be needed to address these questions in detail.
Distribution of the NAALADL1 Gene Product in Human Tissues. Although prior studies have described NAALADL1 gene product expression in human tissues at the mRNA level, this is the first study that analyzes its expression profile at the protein level. Interestingly, the restricted expression of NAALADase L, which we detected only in the small intestine, did not correspond well with the results from our qPCR analysis, which showed wide tissue distribution (Fig. 4). False positive results from the qPCR assay or alternative splicing of the NAALADL1 transcript are two possible explanations for these differences.
Amplification of plasmid NAALADL1 transcript under our qPCR assay conditions was ~90% efficient and showed a linear dependence between the Cq values and the logarithm of the plasmid concentration over the complete range of standard concentrations for both primer sets. Moreover, we confirmed that neither primer set amplifies transcripts of the close NAALADase L homologs human GCPII/III or genomic DNA to a degree that would interfere with our qPCR assay (data not shown). These additional experiments and the use of commercial cDNA libraries, which are normalized to several different housekeeping genes and pooled from numerous individuals, help to ensure the reliability of our qPCR data.
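For readers unfamiliar with how an amplification efficiency is read off a standard curve, the following Python sketch applies the commonly used relation E = 10^(−1/slope) − 1, where the slope comes from a linear fit of Cq against log10 of template amount. The dilution series and Cq values are synthetic, generated here to correspond to 90% efficiency; they are not the measurements from this study, and the intercept of 40 cycles is arbitrary.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def amplification_efficiency(slope):
    """Efficiency from the standard-curve slope: E = 10**(-1/slope) - 1.

    A slope of about -3.32 corresponds to perfect doubling (E = 1.0).
    """
    return 10 ** (-1.0 / slope) - 1.0


# SYNTHETIC standard curve: 10-fold plasmid dilutions, Cq values generated
# for a reaction in which template increases 1.9-fold per cycle
copies = [1e2, 1e3, 1e4, 1e5, 1e6]
true_slope = -1.0 / math.log10(1.9)
cq_values = [40.0 + true_slope * math.log10(c) for c in copies]

slope, intercept = fit_line([math.log10(c) for c in copies], cq_values)
efficiency = amplification_efficiency(slope)
print(f"slope = {slope:.3f}, efficiency = {efficiency:.1%}")
```

A linear Cq versus log-concentration relationship across the whole dilution range, as reported above, is what makes a single slope (and therefore a single efficiency value) meaningful.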
Previously reported data on NAALADL1 transcript expression by Pangalos et al. (1), amplifying the region between exons 14 and 16, showed wide tissue distribution. On the other hand, a study by Shneider et al. (2), using a large probe spanning exons 10-15, detected NAALADL1 transcript only in the small intestine. The discrepancies between these two studies could be explained by alternative splicing of the NAALADL1 transcript; short amplicons are more likely to detect several splice variants, whereas long probes are more specific.
Data from the AceView database also support a high degree of alternative splicing of the NAALADL1 transcript, describing more than 15 different splice variants (27). Among these variants, deletions of exon 11 (68 bp) or exon 13 (91 bp) are most common. The NaalL_A primer set, targeting exons 6 and 7, can detect both of these potential splice variants, whereas the NaalL_B primer set, targeting exons 13 and 14, cannot detect a variant lacking exon 13. This observation may be responsible for the different results in lung, ovary, and prostate tissues, in which NaalL_A but not NaalL_B primer sets amplified the NAALADL1 transcript (Fig. 4A).
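The primer-set reasoning above can be expressed as a small Python sketch. It assumes, as a simplification, that a primer set yields a product only when every exon it targets is present in the variant; probe placement and exon-junction details are ignored. The variant names are illustrative labels, and the frame check only notes whether a deleted exon length is a multiple of three.

```python
# Exons targeted by each primer set, as stated in the text
PRIMER_SETS = {"NaalL_A": {6, 7}, "NaalL_B": {13, 14}}

# Splice variants described by the exons they lack (labels are illustrative)
VARIANTS = {
    "full_length": set(),
    "delta_exon_11": {11},
    "delta_exon_13": {13},
}

# Lengths of the skipped exons given in the text, in bp
EXON_LENGTH_BP = {11: 68, 13: 91}


def detects(primer_exons, skipped_exons):
    """Assumption: amplification requires all targeted exons to be present."""
    return primer_exons.isdisjoint(skipped_exons)


def shifts_reading_frame(deleted_bp):
    """A deletion shifts the frame when its length is not a multiple of 3."""
    return deleted_bp % 3 != 0


detection = {
    variant: {name: detects(exons, skipped) for name, exons in PRIMER_SETS.items()}
    for variant, skipped in VARIANTS.items()
}

for variant, result in detection.items():
    print(variant, result)

for exon, length in EXON_LENGTH_BP.items():
    print(f"exon {exon} ({length} bp): frameshift = {shifts_reading_frame(length)}")
```

Under these assumptions, only the variant lacking exon 13 escapes the NaalL_B set while remaining detectable by NaalL_A. Neither 68 nor 91 is divisible by three, so each deletion on its own would be expected to shift the reading frame.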
Taken together, both the presented and previously published data support the hypothesis that the NAALADL1 transcript undergoes extensive alternative splicing. This processing may function as a regulatory mechanism of NAALADase L expression.
Putative Activities of NAALADase L. We were unable to detect the previously reported DPP IV activity in our enzyme preparation. The previous reports used either polyclonal antibody immunoprecipitate (2) or crude cell lysate (1, 2). Due to the ubiquitous presence of DPP IV activity in mammalian cells (28), we suggest that their results may have been false positives. Furthermore, one of these studies also reported DPP IV activity for GCPII and GCPIII, which was subsequently disproved by experiments with purified recombinant protein (3, 4).
The results from our carboxypeptidase assay, which suggest slow but specific Avi-NaalL cleavage of N-acetylated dipeptides with identical C and N termini, can be explained by sequential N-terminal degradation of the tested dipeptides. The free N-terminal amino acid, present in the reaction mixture 19 times more frequently than other amino acids, would be detected preferentially, leading to a false positive signal. The release of free N-terminal amino acid could be explained either by the ability of Avi-NaalL to cleave the N-acetyl moiety from the dipeptide or by incomplete N-acetylation during library synthesis.
Aminopeptidase Activity of NAALADase L. After ruling out the potential carboxypeptidase activity of Avi-NaalL, we used PICS, a more general method, to probe potential Avi-NaalL proteolytic activity. PICS enables not only analysis of endopeptidase activity but, in theory, also determination of aminopeptidase activity. Because the method is based on MS analysis of P′ regions of the cleaved peptides (i.e. the C-terminal fragments of putative peptide substrates), which are assigned by peptide spectrum matching against the proteomic database, the cleavage of even one amino acid from the N terminus of a putative peptide substrate can be detected. However, because the downstream sequences of different peptides are often identical, the determination of the "P sequence" can be ambiguous. We identified ~15% of peptides in our assay with more than one possible P sequence. We omitted these P sequences from the analysis (see Table 2 and "Experimental Procedures" for more details about data analysis).
The PICS data suggested that Avi-NaalL prefers an acidic residue at P2′ and a Pro residue at P3′. Interestingly, individual analysis of 72 peptide sequences revealed that Asp/Glu in the P2′ position (31 peptides identified) and Pro in the P3′ position (25 peptides identified) are almost mutually exclusive. Of those 56 peptides, Glu/Asp and Pro were present simultaneously only in two cases. This observation led us to assume that these amino acids do not form a conserved substrate recognition motif for Avi-NaalL. Indeed, subsequent enzymatic analyses revealed that the identified cleavage pattern acts as a stopping point for N-terminal peptide degradation rather than as a substrate recognition motif. Nevertheless, these results confirmed the potential of PICS for characterization of aminopeptidase substrate specificity.
It should be noted that the peptides used for our PICS experiments were dimethylated at their N termini. This suggests that Avi-NaalL is capable of cleaving off N-terminal amino acids with blocked primary amine groups. However, this assumption conflicts somewhat with the results from the carboxypeptidase assay, in which N-acetylated dipeptides were inefficiently processed by Avi-NaalL. A likely explanation for this phenomenon is incomplete dimethylation of the peptide library, which would leave some peptides accessible to Avi-NaalL degradation.
FIGURE 9. Quantitative analysis of Avi-NaalL aminopeptidase activity. A, time-dependent analysis of the peptide 52 cleavage by Avi-NaalL. Recombinant Avi-NaalL was mixed with peptide 52 in final concentrations of 0.1 and 100 µM, respectively. Reactions were performed at 37°C and were stopped at times ranging from 5 to 90 min and analyzed by LC/MS. The relative amount of individual peptides or amino acids is plotted against reaction time in A1-A5. B, determination of kinetic parameters of model peptide substrate (I↓IDPNG) cleavage by Avi-NaalL. The enzyme was incubated with various concentrations of substrate, ranging from 6.3 to 1600 µM, for 15 min at 37°C. The reaction mixtures were subsequently analyzed using HPLC. Each reaction was done in duplicate, and the result is shown as a mean with S.E. (error bar). The kinetic parameters were obtained by non-linear fit of the data using GraFit version 5.0.11 (Erithacus Software Ltd.). The reciprocal linear plot is also illustrated for comparison. C, determination of the inhibition constant of the aminopeptidase-specific inhibitor bestatin toward Avi-NaalL. Various concentrations of bestatin, ranging from 0.1 to 500 µM, were incubated with Avi-NaalL and a model substrate (I↓IDPNG) for 15 min at 37°C. The reaction mixtures were subsequently analyzed using HPLC. Each reaction was done in duplicate, and the result is shown as a mean with S.E. (error bar). The IC50 value was obtained by non-linear fit of the data using GraFit version 5.0.11 (Erithacus Software Ltd.), and the Ki value was then calculated using the Cheng-Prusoff equation for competitive mode of inhibition (31).

Subsequent analysis of two individual peptides identified by PICS confirmed the aminopeptidase activity of Avi-NaalL. We were able to detect only the P′ portion of peptide substrates. Additionally, we identified individual amino acids that were present in the N-terminal portions of the analyzed peptides (Phe in peptides 52 and 71 or Phe and Trp in the random peptide). Moreover, reactions with prolonged incubation times revealed that either Asp/Glu at P2′ or Pro at P3′ is sufficient to significantly decrease the cleavage efficacy, but only the combination of both amino acids completely prevents peptide degradation (Fig. 8, peptide 52). The inhibition experiment with bestatin together with detection of complete degradation of the random peptide showed that Avi-NaalL probably belongs to the group of aminopeptidases that are able to cleave a variety of amino acids from the peptide N terminus, such as bovine lens aminopeptidase and SGAP (29). Although we performed a quantitative analysis of Avi-NaalL cleavage of a model substrate, additional analysis of Avi-NaalL substrate specificity and cleavage efficacy will be necessary for proper classification of this novel aminopeptidase.
Nevertheless, our data clearly demonstrate that the enzymatic activity of the NAALADL1 gene product differs from that of GCPII. Because the protein product ("NAALADase L") currently bears the same EC number as GCPII (EC 3.4.17.21), a new EC number should be assigned to this enzyme. Additionally, considering that the name NAALADase L (N-acetylated α-linked acidic dipeptidase-like) does not in fact describe the enzyme's features and is rather misleading, we suggest renaming the enzyme human ileal aminopeptidase (HILAP), to reflect its actual enzymatic activity and expression profile.
To conclude, we solved the x-ray structure of the extracellular portion of the human NAALADL1 gene product and showed that it is very similar to the GCPII structure, including complete conservation of all amino acids necessary for peptidase activity. Analysis of NAALADL1 gene product expression showed high levels of corresponding mRNA in several tissues. However, the protein was identified predominantly in the small intestine, implying that the corresponding mRNA undergoes extensive alternative splicing. We disproved the previously reported DPP IV-like activity of the enzyme and demonstrated that a recombinant protein derived from the NAALADL1 gene possesses aminopeptidase activity. The enzyme degrades peptide substrates from their N termini until it reaches an acidic residue at P2′ or a proline residue at P3′. This activity is inhibited by the specific aminopeptidase inhibitor bestatin.
Given its restricted expression profile and wide substrate specificity, we suggest that the physiological functions of the NAALADL1 gene product are probably associated with the final stages of protein digestion in the human body and suggest a new name for this enzyme: human ileal aminopeptidase (HILAP).