Determination of the Substrate Specificity of Tripeptidyl-peptidase I Using Combinatorial Peptide Libraries and Development of Improved Fluorogenic Substrates

Classical late-infantile neuronal ceroid lipofuscinosis is a fatal neurodegenerative disease caused by mutations in CLN2, the gene encoding the lysosomal protease tripeptidyl-peptidase I (TPP I). The natural substrates for TPP I and the pathophysiological processes associated with lysosomal storage and disease progression are not well understood. Detailed characterization of TPP I substrate specificity should provide insights into these issues and also aid in the development of improved clinical and biochemical assays. To this end, we constructed fluorogenic and standard combinatorial peptide libraries and analyzed them using fluorescence and mass spectrometry-based activity assays. The fluorogenic group 7-amino-4-carbamoylmethylcoumarin was incorporated into a series of 7-amino-4-carbamoylmethylcoumarin tripeptide libraries using a design strategy that allowed systematic evaluation of the P1, P2, and P3 positions. TPP I digestion of these substrates liberates the fluorescence group and results in a large increase in fluorescence that can be used to calculate kinetic parameters and to derive the substrate specificity constant kcat/KM. In addition, we implemented a mass spectrometry-based assay to measure the hydrolysis of individual peptides in peptide pools and thus expand the scope of the analysis. Nonfluorogenic tetrapeptide and pentapeptide libraries were synthesized and analyzed to evaluate P1′ and P2′ residues. Together, this analysis allowed us to predict the relative specificity of TPP I toward a wide range of potential biological substrates. In addition, we evaluated a variety of new fluorogenic peptides with a P3 Arg residue, and we demonstrated their superiority compared with the widely used substrate Ala-Ala-Phe-AMC for selectively measuring TPP I activity in biological specimens.

The neuronal ceroid lipofuscinoses (NCLs) are a group of hereditary neurodegenerative diseases primarily affecting children and adolescents that share similar clinical features, including visual loss, seizures, mental regression, behavioral changes, movement disorders, and shortened life expectancy (1). At a cellular level, NCLs are characterized by accumulation of autofluorescent storage material in neurons and other cell types (1,2).
Classical late-infantile NCL (LINCL), one of the most common forms of NCL, typically has an age of onset of about 2-4 years and results in death at around age 10. The gene defective in LINCL was found by using a proteomic approach that compared the spectrum of lysosomal proteins in control and LINCL brain autopsy specimens (3). This gene, designated CLN2, encodes a lysosomal protease that shares sequence similarities with bacterial pepstatin-insensitive peptidases, and a pepstatin-insensitive protease activity was found to be missing in LINCL specimens (3,4).
The CLN2 gene product was later demonstrated to be identical to the lysosomal enzyme tripeptidyl-peptidase I (TPP I) (5,6). This enzyme is a serine protease (7) that, based on the structure of the bacterial homologs, has an unusual Ser, Asp, and Glu catalytic triad, allowing activity at acidic pH (8,9). TPP I has two proteolytic activities, a weak endopeptidase activity with a pH optimum of 3.0 (10) that may be important for the low pH-triggered intramolecular autoactivation and autoprocessing of the 66-kDa inactive proenzyme to the 46-kDa mature enzyme (7,11), and a stronger exopeptidase activity with a pH optimum of 4.5. The exopeptidase activity of TPP I enables the cleavage of tripeptides sequentially from unsubstituted N termini of polypeptides or proteins. A variety of proteinase assays has been developed for enzyme-based post- and prenatal diagnosis (4,12-16).
In terms of a potential physiological substrate, subunit c of mitochondrial ATP synthase was found to be a major constituent of the storage material in LINCL (17). However, this extremely hydrophobic protein also accumulates in a number of other NCLs (17-22). Without detailed understanding of TPP I substrate specificity and the combined actions of other lysosomal proteinases, it remains difficult to distinguish whether subunit c accumulation in LINCL is a primary or secondary effect of TPP I deficiency.
Although there is some information regarding TPP I substrate specificity (reviewed in Ref. 23), no comprehensive study has been performed to date. Originally, TPP I was found to release Gly-Pro-Xaa tripeptides from collagen-related synthetic polypeptides (24). Pro was not tolerated at P1 and P1′ (25-27), and the substrate specificity was related to peptide length (27). Detailed knowledge of the substrate specificity of TPP I is important for two reasons. First, identification of the true physiological substrates of TPP I may provide important insight into LINCL pathophysiology and may reveal targets for therapeutic intervention. Second, there is a need for a more specific substrate to assay TPP I activity for improved clinical enzyme-based diagnostic and carrier testing assays. In addition, a highly specific substrate would allow measurement of low levels of enzyme activity, which is important for determining residual activity that may be associated with some CLN2 mutants as well as screening for drugs that could potentially rescue the function of mutant CLN2 gene products (28).
Several recent methodological developments facilitate the construction and analysis of combinatorial peptide libraries for evaluation of peptidase substrate specificity. First, 7-amino-4-carbamoylmethylcoumarin (ACC) has been developed as a novel reagent that allows a fluorogenic molecule to be incorporated as the C-terminal residue of peptides using standard solid phase synthetic methods (29,30). Hydrolysis of the C-terminal amide bond results in release of free ACC and an ~900-fold increase in fluorescence that is largely independent of pH, thus providing an efficient method to assay enzyme activity. Second, use of an "isokinetic mixture" of Fmoc amino acids, where concentrations of reagents are adjusted so that they have similar pseudo-first order reaction kinetics, allows ready incorporation of multiple residues at a given position (30,31). Third, recent advances in LC/MS/MS technologies allow quantification of relatively complex mixtures of peptides for enzyme kinetic studies (32).
In this study, we used combinatorial fluorogenic and standard peptide libraries combined with fluorescence and LC/MS/MS-based assays to analyze the substrate specificity of TPP I. We determined the relative contribution of all non-sulfur-containing amino acids and norleucine to each of the P3-, P2-, P1-, P1′-, and P2′-substituted residues (see Ref. 48 for peptide subsite nomenclature). This approach allowed us to obtain a comprehensive substrate specificity map (landscape) of TPP I that should be useful in identifying true biological substrates. In addition, this information has enabled synthesis of new fluorogenic peptides that exhibit greater selectivity for TPP I compared with substrates used previously. These new TPP I substrates represent superior reagents for use in diagnostic and other activity assays.

EXPERIMENTAL PROCEDURES
Free ACC Synthesis-Free ACC was obtained by deprotection of the Fmoc-ACC resin followed by trifluoroacetic acid cleavage and cold ether precipitation. ACC was subjected to reversed-phase HPLC purification and lyophilization. The final product was dried over P2O5 before weight measurement.
Synthesis of Combinatorial Peptide Library-The splitting step in the combinatorial peptide synthesis was achieved by dividing the resin into multiple 1.5-ml polypropylene microcentrifuge tubes, and the synthesis of each pool of peptides was carried out manually. Reactions were conducted at room temperature with continuous mixing on an orbital shaker. Randomization was achieved during the coupling step by using the isokinetic Fmoc amino acid mixture described previously (31). All peptides synthesized in this study contain a free N terminus and a C-terminal amide.
P1 Peptide Library I-This synthesis followed the procedure described previously (29,30). In brief, the Fmoc-ACC resin was divided into 19 portions (60 mg, 0.029 mmol for each portion). Upon deprotection of the resins, one of the 19 Fmoc amino acids (0.15 mmol, omitting Cys and substituting Met by Nle) along with DMF (0.5 ml), 2-(7-aza-1H-benzotriazole-1-yl)-1,1,3,3-tetramethyluronium hexafluorophosphate (56 mg, 0.15 mmol), and collidine (39 µl, 0.29 mmol) were added to the designated tubes and coupled for 20 h. Double couplings were performed to improve yield. The unreacted aniline group on the ACC resin was partially blocked by adding AcOH (17 µl, 0.29 mmol), diisopropylcarbodiimide (46 µl, 0.29 mmol), and nitrotriazole (32 mg, 0.29 mmol). P2 and P3 positions were coupled using the isokinetic mixture (0.29 mmol of total Fmoc amino acids) along with PyBOP (150.9 mg, 0.29 mmol), HOBt (44.4 mg, 0.29 mmol), and DIEA (75.8 µl, 0.44 mmol). After removing the Fmoc group, the resin was dried and treated with 95% trifluoroacetic acid, 2.5% triisopropylsilane, 2.5% H2O to cleave the peptide from the resin. The peptides were lyophilized and dried under vacuum. The composition of the libraries was evaluated through peptide sequencing and LC/MS/MS. Both methods showed that the 19 amino acids were all present at approximately the same levels at both the P2 and P3 positions (data not shown). The molar yield of each mixture was calculated from the dry weight and the theoretical average molecular weight of the 361 peptides in the mixture. Peptide substrate stock solutions (25 mM) were prepared by dissolving the lyophilized peptide pools in Me2SO.
P2 Library-Four peptidyl ACC resins with the P1 position fixed as Leu, Nle, Phe, or Tyr (0.6 g, 0.29 mmol for each) were constructed following the procedure described above. Each library was divided into 19 portions (30 mg, 0.014 mmol each). The P2 residues were coupled individually by standard solid phase Fmoc chemistry. Fmoc amino acid (0.042 mmol), PyBOP (22 mg, 0.042 mmol), HOBt (7 mg, 0.042 mmol), and DIEA (11 µl, 0.078 mmol) were added to the designated tube and coupled for 3 h. Complete coupling was confirmed by the Kaiser test (33). The P3 residues were coupled using the isokinetic mixture method with the same synthesis, cleavage, and stock solution preparation steps as described above. The composition of the P2 library was evaluated by LC/MS, which demonstrated similar levels of the expected 19 peptides in each pool.
P3 Library-The P2 and P1 positions (P2-P1) were fixed as Nle-Leu, Nle-Nle, Pro-Nle, Ala-Phe, Nle-Phe, Pro-Phe, and Thr-Phe. Fmoc-ACC resin synthesis and coupling of both P1 and P2 residues followed the procedures described above. Batches of resin representing the seven P2-P1 sequences were divided into 19 portions each, and the 19 × 7 peptides were then synthesized individually. The identities of the peptides were confirmed by LC/MS, and the purity of all the crude peptides was determined to be greater than 85% by analytical reversed-phase HPLC. Peptide stock solutions were prepared by using the crude peptides without further purification.
P1 Library II-Fmoc-ACC resin was divided into 19 portions, and 19 P1-specific peptidyl resins were synthesized. Each of these resins was then divided into two portions and reacted to introduce either Ala-Ala or Arg-Nle as the P3-P2 sequence. These 38 ACC tripeptides were purified by semi-preparative reversed-phase HPLC to a purity greater than 90%.
P1′ Tetrapeptide and P1′P2′ Pentapeptide Libraries-These libraries were synthesized by standard Fmoc chemistry using the procedure described elsewhere (34). The P3, P2, and P1 residues were fixed as Ala, Ala, and Phe. For the P1′ tetrapeptides, the 19 Ala-Ala-Phe-Xaa peptides were synthesized individually. A similar strategy was used for the P1′-P2′ pentapeptide library, but after the initial coupling of the 19 P2′ residues to the resin, a mixture of 19 P1′ residues was introduced by isokinetic coupling. The 19 peptide pools were then individually coupled to introduce Ala-Ala-Phe at the P3-P1 positions. However, LC/MS/MS and Edman degradation analyses showed that in the 19 pools of the P1′P2′ library, the peptides containing Lys at P1′ were not present, reflecting failed incorporation of Lys from the isokinetic mixture. Thus, each P1′P2′ peptide pool contains 18 different peptides.
Fluorescence-based Enzyme Assay-The conditions for the enzyme kinetic assay were described previously (7,13). Briefly, purified proenzyme of TPP I (1 mg/ml, 16.6 µM) was diluted 10-fold into 0.1 M sodium formate buffer, 0.15 M NaCl, 0.1% Triton X-100, pH 3.5, and incubated overnight at room temperature to allow for complete activation. The activated TPP I was stored at −80°C and thawed shortly before use. The Me2SO stock solutions of the peptide substrates were diluted into Assay Buffer (0.1 M sodium acetate buffer, 0.15 M NaCl, 0.1% Triton X-100, pH 4.5) to the appropriate concentration before incubation with TPP I.
Calibration curves were constructed to obtain a conversion factor to transform output data obtained using a fluorescence plate reader (CYTOFLUOR 4000, PerSeptive Biosystems, excitation filter at 360 nm and emission filter at 460 nm) to units of ACC concentration. Free ACC was obtained by complete digestion of 80, 40, 20, 10, 5, 2.5, and 1.25 µM of Ala-Ala-Phe-ACC by 4.1 nM activated TPP I at 30°C for 24 h. Dilutions of synthetic free ACC (described above) were also analyzed directly to construct calibration curves. Both methods gave equivalent results. Preparation of tissue extract for enzymatic measurements was performed according to a reported procedure (4). The procedure for measurement of TPP I activity at different pH values was described previously (13). CLN2(−/−) mouse specimens were from the gene-targeted model described previously (35).
Determination of Vmax and KM-Activated TPP I and ACC peptides were serially diluted using Assay Buffer. Reactions were conducted in flat-bottom polystyrene 96-well plates and initiated by adding 80 µl of peptide solution to 20 µl of TPP I solution. The final concentration of TPP I during incubation was 4.10, 2.05, and 1.00 nM, and the concentration of substrate was 200, 100, 50, and 25 µM. (Note: when multiple peptides are present in the mixture, the concentration given refers to the sum of all peptides.) Fluorescence was monitored at 1-min intervals for at least 30 min at 30°C. In some cases, the reactions were repeated using a 4-fold lower range of substrate concentrations to achieve optimum reaction conditions. Digestion rate was determined by measuring the initial rate of digestion using conditions where the rate of digestion was linear with respect to time and enzyme concentration. Vmax and KM values were obtained by fitting the digestion rate with the corresponding substrate concentration to the Michaelis-Menten function. kcat was calculated by dividing Vmax by the TPP I concentration, [E]0. For substrates where the initial rate did not approach Vmax at increasing substrate concentrations (KM > 400 µM), the direct method was used to determine kcat/KM (see below).
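As a rough illustration of this fitting step, the sketch below estimates Vmax and KM from initial rates and converts Vmax to kcat. It assumes a Hanes-Woolf linearization fitted by ordinary least squares, which is a stand-in for whatever fitting routine was actually used; the function names and the synthetic rates are hypothetical and are not data from this study.

```python
def fit_michaelis_menten(substrate_uM, rates_uM_per_s):
    """Estimate (Vmax, KM) by Hanes-Woolf linearization.

    Rearranging v = Vmax*[S]/(KM + [S]) gives
    [S]/v = [S]/Vmax + KM/Vmax, a straight line in [S].
    """
    xs = list(substrate_uM)
    ys = [s / v for s, v in zip(substrate_uM, rates_uM_per_s)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                    # equals 1/Vmax
    intercept = mean_y - slope * mean_x  # equals KM/Vmax
    return 1.0 / slope, intercept / slope


def kcat_from_vmax(vmax_uM_per_s, enzyme_uM):
    """kcat = Vmax / [E]0, with both concentrations in the same units."""
    return vmax_uM_per_s / enzyme_uM


# Hypothetical, noise-free rates generated with Vmax = 0.004 uM/s and KM = 50 uM,
# at the substrate concentrations quoted in the text (200, 100, 50, and 25 uM).
substrate = [200.0, 100.0, 50.0, 25.0]
rates = [0.004 * s / (50.0 + s) for s in substrate]
vmax, km = fit_michaelis_menten(substrate, rates)
kcat = kcat_from_vmax(vmax, 0.0041)  # 4.1 nM enzyme expressed in uM
print(vmax, km, kcat, kcat / km)
```

With real plate-reader data a weighted nonlinear fit would usually be preferable, because linearized forms distort the error structure of the measured rates.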
Derivation of kcat/KM from Low Substrate Concentration-Substrate concentration for the TPP I digestion was 10, 5, 2.5, and 1.25 µM. Incubation and fluorescence monitoring followed the procedure described above. The slope (k) of digestion rate (Vobserve) plotted against the substrate concentration was obtained by linear fitting. The kcat/KM value was calculated by dividing k by the TPP I concentration [E]0.
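The direct method rests on the low-substrate limit of the Michaelis-Menten equation, where the rate is approximately (kcat/KM)·[E]0·[S], so the slope of rate against [S] divided by [E]0 gives kcat/KM. A minimal sketch follows, assuming an ordinary least-squares line with a free intercept (the text specifies only "linear fitting"); the function name and the example rates are hypothetical.

```python
def kcat_over_km_direct(substrate_uM, rates_uM_per_s, enzyme_uM):
    """Return kcat/KM (uM^-1 s^-1) as the slope of rate vs. [S], divided by [E]0.

    Valid only when every [S] is well below KM, so that
    v is approximately (kcat/KM) * [E]0 * [S].
    """
    n = len(substrate_uM)
    mean_s = sum(substrate_uM) / n
    mean_v = sum(rates_uM_per_s) / n
    sxx = sum((s - mean_s) ** 2 for s in substrate_uM)
    sxy = sum((s - mean_s) * (v - mean_v)
              for s, v in zip(substrate_uM, rates_uM_per_s))
    slope = sxy / sxx  # the slope k, in s^-1
    return slope / enzyme_uM


# Hypothetical rates for a substrate with kcat/KM = 0.025 uM^-1 s^-1,
# measured with 4.1 nM (0.0041 uM) enzyme at 10, 5, 2.5, and 1.25 uM substrate.
substrate = [10.0, 5.0, 2.5, 1.25]
rates = [0.025 * 0.0041 * s for s in substrate]
print(kcat_over_km_direct(substrate, rates, 0.0041))
```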
TPP I Digestion for LC/MS/MS Analysis-All incubations were carried out in Assay Buffer at 30°C. Peptide substrates were diluted in Assay Buffer to the desired concentration from stock solutions (25 mM) in Me2SO. To check that substrate disappearance corresponded to cleavage of the expected P1-P1′ bond, the proteolysis products from the peptide pool Xaa-Nle-Phe-ACC were identified by LC/MS/MS.
For the P2 library, peptide solutions were added to the activated TPP I to give a final concentration of 100 µM total peptides and 6.9 nM TPP I in a final volume of 250 µl. Samples (100 µl) were removed after 40 and 480 min, and the reaction was terminated using 10 µl of 15% trifluoroacetic acid. The initial substrate concentration was determined as described above for the 40-min time point but omitting enzyme from the reaction mixture. More detailed studies were initially conducted with the Xaa-Nle-Phe-ACC library to monitor the time course of the reaction for method validation.
The P1′P2′ pentapeptide library was digested as described above, except for the following. 1) The TPP I concentration during incubation was 2.1 nM. 2) Three ACC peptides (Ala-Ala-Phe-ACC, Asn-Nle-Nle-ACC, and Ala-Ala-Pro-ACC) were included as internal standards with a final concentration of 1 µM. 3) Samples (50 µl) were removed at 4, 40, and 480 min, and the reactions were terminated by adding 5 µl of 15% trifluoroacetic acid. 4) Terminated reactions were diluted 5-fold with distilled water before LC/MS/MS analysis. Before the digestion of the P1′ tetrapeptide library, equimolar amounts of 19 P1′ tetrapeptides were mixed. The peptide mixture was digested under the same conditions as the P1′P2′ pentapeptide library, but 7 time points were monitored (20 s and 4, 10, 40, 120, 280, and 480 min) to measure the time course of digestion.
LC/MS/MS Instrumentation-A Qtrap hybrid mass spectrometer (MDS/Sciex) was used for analysis of the P2 libraries. An LTQ linear ion trap mass spectrometer (Thermo) was used for analysis of the P1′ tetrapeptide and the P1′-P2′ pentapeptide libraries. The mass spectrometers were interfaced with an LC Packings HPLC system consisting of a solvent pumping system (Ultimate), a column switching device (Switchos II), and an autosampler (Famos).
LC/MS/MS Analysis of P2 Library-For the analysis of selected peptide pools from the P2 library (each pool containing 19 peptides), 5 µl of incubation mixture were injected for each LC/MS/MS run. Samples were loaded onto a trapping column (5 × 0.3 mm inner diameter, polymeric C-18; LC Packings) and washed for 5 min at a flow rate of 50 µl/min using 0.1% formic acid/H2O. The valve was then switched so that the flow to the trapping column was reversed and brought in line with an analytical column (15 cm × 1 mm inner diameter, 3-µm C-18 resin; LC Packings) and the mass spectrometer. The peptides were separated by using a linear gradient of 5-25% of buffer B (2% H2O, 98% ACN, 0.1% formic acid) in buffer A (2% ACN, 98% H2O, 0.1% formic acid) for 40 min at a flow rate of 40 µl/min. Multiple reaction monitoring (MRM) was used as the scan mode to identify and quantify each of the components in the peptide mixture. The first and third quadrupoles were operated at low resolution (2 m/z) to achieve maximum ion passage. The parent masses were selected based on the calculated molecular weight of each component in the mixture. The m/z values for singly charged MH+ ions were monitored for the majority of the peptides, whereas doubly charged (M + 2H)2+/2 ions were monitored for the peptides containing positively charged residues (Arg, His, and Lys). The collision energy for each component was optimized based on direct-infusion experiments of the peptide mixture using the "Enhanced MS/MS" scan mode. Briefly, Arg at the P3 position required a higher collision energy of 45-55 (arbitrary value); Pro, His, Glu, Gln, Phe, Trp, and Tyr required 35-40; Ile, Leu, Lys, and Nle required 30; Ala, Asn, Asp, Ser, Thr, and Val required 25; and Gly required 20. The most frequently occurring fragments based on the enhanced MS/MS experiments were a2 and b2 ions and the immonium ions of the residues at the P1 and P3 positions.
The two most intense fragments from each peptide in the mixture were selected for the MRM transitions. A total of 16 MRM scan events (16 distinguishable masses: Ile, Leu, and Nle are isobaric; Gln and Lys are indistinguishable) was used to cover all components in the mixture. Analysis of individual peptides from the P3 library was used to determine the relative retention time of peptides that differed at a single position (Ile < Leu < Nle and Lys < Gln), and this information was used to assign the appropriate transitions for analysis of the peptide pools. A complete cycle contained about 45 transitions. Each transition was allotted 100 ms to achieve optimum ion statistics and peak sampling. A complete cycle took about 5 s, which was compatible with the chromatographic method.
LC/MRM runs of the peptide mixtures enabled the chromatographic separation and identification of each peptide in the library. The MRM ion current for each transition was integrated by using the Analyst software package, and peak areas were normalized to the known initial substrate concentration [S]0 and the observed peak area for the undigested control. Because the concentration of each component in the peptide library during incubation was ~5 µM, which is well below the KM value of the corresponding substrate (≥30 µM), the enzyme kinetics can be considered as a first order reaction with respect to substrate, so that substrate disappearance follows a single exponential decay.
LC/MS/MS Analysis of P′ Libraries Using the LTQ-The general chromatographic method described above was applied to the analysis of P1′P2′ libraries except samples were loaded onto the trapping column and washed for 1.8 min at a flow rate of 30 µl/min using 0.1% trifluoroacetic acid, 2% ACN. Peptides were monitored using MS/MS, based on the calculated parent masses for MH+ or (M + 2H)2+/2 for each component. A scan event was conducted for each parent mass. A pilot run was carried out for each peptide pool to obtain the retention time for each component over the 40-min gradient. To allow adequate sampling of peaks, data acquisition was divided into three segments using the Xcalibur software, monitoring a maximum of 8 m/z values in each segment to ensure that the total scan time did not exceed 5 s. Peptides eluting near the border of two segments were included in both to ensure detection in case of run-to-run variation in retention times. The MS/MS ion chromatogram was analyzed by the Quan software in Xcalibur. Integrated peak areas for the two most intense product ions for each peptide were obtained and used for calculation of kcat/KM based on one time point for the P1′-P2′ pentapeptide library or on a time course of digestion for the P1′ tetrapeptide library as described above for the P2 libraries.
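The one-time-point and time-course calculations mentioned above can be sketched as follows, assuming the standard integrated first-order rate law [S]t = [S]0·exp(−(kcat/KM)·[E]0·t), which applies when each substrate is far below its KM. The function names, peak areas, and enzyme concentrations in the example are hypothetical illustrations, not values from this study.

```python
import math


def kcat_over_km_single_point(area_control, area_digested, enzyme_uM, time_s):
    """kcat/KM (uM^-1 s^-1) from one time point.

    From [S]t/[S]0 = exp(-(kcat/KM)*[E]0*t):
    kcat/KM = ln(area_control/area_digested) / ([E]0 * t).
    """
    return math.log(area_control / area_digested) / (enzyme_uM * time_s)


def kcat_over_km_time_course(times_s, areas, enzyme_uM):
    """kcat/KM from a time course, via a least-squares line through ln(area) vs. t."""
    logs = [math.log(a) for a in areas]
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_l = sum(logs) / n
    stt = sum((t - mean_t) ** 2 for t in times_s)
    stl = sum((t - mean_t) * (l - mean_l) for t, l in zip(times_s, logs))
    decay_rate = -stl / stt  # observed first-order rate constant, s^-1
    return decay_rate / enzyme_uM


# Hypothetical peptide with kcat/KM = 0.1 uM^-1 s^-1, digested by 2.1 nM enzyme.
enzyme = 0.0021                       # uM
times = [20.0, 240.0, 600.0, 2400.0]  # 20 s, 4 min, 10 min, 40 min
areas = [1000.0 * math.exp(-0.1 * enzyme * t) for t in times]
print(kcat_over_km_single_point(1000.0, areas[-1], enzyme, times[-1]))
print(kcat_over_km_time_course(times, areas, enzyme))
```

Fitting in log space is a convenience for this sketch; a direct exponential fit weights late, low-intensity time points differently and may be the better choice for noisy peak areas.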

RESULTS
Library Design-We used combinatorial fluorogenic ACC tripeptides as well as tetrapeptide and pentapeptide libraries to evaluate systematically the substrate specificity of TPP I. Construction of fluorogenic peptides was based on a procedure where we sequentially evaluated each P1, P2, and P3 position, using the results of the previous analysis to design subsequent libraries. By stepwise analysis of different positions, the residues surveyed in each consecutive round of peptide synthesis were restricted, thus decreasing the complexity and number of peptides required for sufficient coverage.
P1 Library I-The first P1-diverse library (see Table 1 for nomenclature) was constructed by synthesizing 19 separate pools of 361 ACC tripeptides, each with a different fixed P1 residue and containing a randomized mixture of 19 residues at the P2 and P3 positions. The fluorogenic TPP I enzyme activity assay was used to analyze each pool, thus yielding the relative substrate specificity of each P1 residue averaged over all P2 and P3 residues. Subsequently, after analysis of the P2 and P3 libraries (see below), a second more restricted P1 library was constructed. This library, designated P1 library II, consisted of 38 separate ACC tripeptides, where all 19 residues were surveyed in the P1 position in the context of two different P3-P2 sequences (Ala-Ala and Arg-Nle).
Peptide pools in the P1 library I were digested using TPP I, and the substrate specificity constant, kcat/KM, was calculated directly from the rates of reaction at low substrate concentrations (direct method) or by determination of the individual kinetic parameters kcat and KM at medium to high substrate concentrations (Michaelis-Menten method). Note that both the substrate specificity constants and kinetic parameters are apparent values, reflecting activity of all 361 peptides in each pool. For the majority of the P1 peptide pools, both methods yielded similar results (supplemental Table 1). TPP I digestion favored the large hydrophobic residues Phe, Leu, and Nle at P1 (Fig. 1). The individual values for peptide pools with the P1 residue fixed at His, Val, Ile, Thr, Gly, and Pro could not be derived because of their very slow rate of digestion above the ACC background signal and required further investigation using purified peptides (see below for analysis of P1 library II).

a Indicated residue at a given position. When designated as "defined," separate syntheses were used to introduce the indicated amino acid at a given position. When designated as "mixed," an approximately equimolar mixture of amino acids at a given position was introduced by isokinetic coupling during synthesis. The 19 amino acids used consist of all natural amino acids excluding Cys and Met and including Nle. All peptides contain a free N terminus and a C-terminal amide.
b Excludes Lys.
There were two drawbacks using the complex mixture for the evaluation of P1. First, some of the fixed P1 peptide pools had high background fluorescence because of the presence of free ACC, which could not be removed using standard peptide synthesis extraction methods. For example, Arg, Ile, Thr, and Val had low coupling efficiencies to the aniline group on ACC, probably because of the branched side-chain structures of these amino acids. In these syntheses, more uncoupled free ACC was cleaved from the peptidyl resin along with the ACC tripeptides, and its presence interfered with the analysis of weak substrates. Second, although the current method can rank the P1 position in a relative manner, the presence of 361 peptides in the pool prevents determination of true k cat and K M values.
To address these issues, peptides with diverse P1 residues and fixed P3-P2 sequences were individually synthesized and purified by HPLC. Two combinations were chosen for this series. One was based on a widely used TPP I substrate, Ala-Ala-Phe-AMC, and thus the P3-P2 sequence was fixed as Ala-Ala. The other contained the P3-P2 sequence Arg-Nle based on analysis of the P3 library (see below). Peptides (25-200 µM) were incubated with purified TPP I, and substrate specificity constants were calculated either using the Michaelis-Menten method or, when KM values exceeded 400 µM, the direct method. kcat and KM values were reported for individual peptides that had KM values below 400 µM (supplemental Table 2). Analysis of the individual peptides using the fluorometric TPP I assay revealed a similar rank order compared with that obtained from P1 library I. The hydrophobic residues Leu, Nle, and Phe were the most favorable residues at the P1 position (Fig. 2), followed by Tyr, Trp, Glu, Asp, Gln, and Ala. Interestingly, in cases when the kinetic constants KM and kcat could be determined, linear regression plots (not shown) indicated that the substrate specificity constant correlated well with substrate affinity and to a greater degree than with kcat (kcat/KM versus 1/KM, r² = 0.94; kcat/KM versus kcat, r² = 0.18). This analysis also showed clearly that peptides containing P1 Asn, His, Lys, Arg, Ser, Val, Ile, Thr, Gly, and Pro are all poor substrates for TPP I, with specificity constants less than 1% of that of the best substrates. The influence of the P1 position on the substrate specificity was quite marked, with over a 10,000-fold difference in kcat/KM between the optimal P1 residues (Leu, Nle, and Phe) and the poorest P1 residue (Pro) (see "Discussion" and Fig. 12).
There was excellent agreement on the effect of the P1 residue on substrate specificity when determined by using peptides with P3-P2 sequences being either Ala-Ala, Arg-Nle, or a mixture of all possible residues (Fig. 2). The notable exception was when Arg, Lys, or to a lesser extent His occupied the P1 position in the context of the Arg-Nle P3-P2 sequence. This suggests an unfavorable effect of placing positively charged residues at both P1 and P3 positions. Another noteworthy observation is that Ile and Val produce low substrate specificity in the P1 position for both peptide sets compared with other hydrophobic residues. This difference may be explained by the branched side chain at the β-carbon of Ile and Val, which may sterically hinder substrate binding. Thr likewise has a β-branched structure and results in low substrate specificity when present in the P1 position.
P2 Library-The P2-diverse library was restricted at the P1 position to the following four optimum residues found when evaluating P1 library I: Leu, Nle, Phe, and Tyr. Here, for each of these fixed P1 residues, 19 pools of 19 ACC tripeptides were synthesized, each with a different fixed P2 position and containing a randomized mixture of 19 residues at the P3 position (see Table 1). These 76 peptide pools were evaluated by using the fluorescence assay to determine the relative contribution of each P2 residue in the context of four different P1 residues averaged over all P3 residues. Subsequently, after analysis of the P3 library and validation of an LC/MS/MS-based assay, we analyzed a subset of the peptide pools to determine the specificity constants for 589 individual peptides from 31 pools in the P2 library, allowing more thorough evaluation of potential P3-P2-P1 interactions (see the P3 library analysis below).
The results showed that the non-natural amino acid, Nle, which has an unbranched aliphatic side chain, was the most favorable residue overall at P2 for TPP I digestion. Of the natural amino acids, Pro and Ala followed by Phe, Tyr, and Trp were the most favorable residues (Fig. 3). In general, the P2 position did not show the high selectivity observed at P1, as the highest and lowest kcat/KM values only differed by a factor of ~100 (supplemental Table 3, and also see Fig. 12).
P3 Library-The P3-diverse library was constructed by synthesizing 133 individual ACC tripeptides that surveyed all 19 residues in the P3 position in the context of the following seven different P2-P1 sequences: Nle-Leu, Nle-Nle, Pro-Nle, Ala-Phe, Nle-Phe, Pro-Phe, and Thr-Phe.
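The pool and peptide counts quoted for the three ACC tripeptide libraries follow from simple enumeration, which the sketch below reproduces. The residue list and fixed positions are taken from the text; the function names and the string representation of the peptides are illustrative assumptions.

```python
from itertools import product

# 19 residues: the natural amino acids without Cys and Met, plus Nle.
RESIDUES = ["Ala", "Arg", "Asn", "Asp", "Gln", "Glu", "Gly", "His", "Ile", "Leu",
            "Lys", "Nle", "Phe", "Pro", "Ser", "Thr", "Trp", "Tyr", "Val"]


def p1_library_i():
    """One pool per fixed P1; P3 and P2 are both randomized within the pool."""
    return {p1: [f"{p3}-{p2}-{p1}-ACC" for p3, p2 in product(RESIDUES, RESIDUES)]
            for p1 in RESIDUES}


def p2_library(fixed_p1=("Leu", "Nle", "Phe", "Tyr")):
    """One pool per fixed (P2, P1) pair; only P3 is randomized within the pool."""
    return {(p2, p1): [f"{p3}-{p2}-{p1}-ACC" for p3 in RESIDUES]
            for p1, p2 in product(fixed_p1, RESIDUES)}


def p3_library():
    """Individually synthesized peptides: every P3 on seven fixed P2-P1 sequences."""
    p2_p1 = [("Nle", "Leu"), ("Nle", "Nle"), ("Pro", "Nle"), ("Ala", "Phe"),
             ("Nle", "Phe"), ("Pro", "Phe"), ("Thr", "Phe")]
    return [f"{p3}-{p2}-{p1}-ACC" for (p2, p1), p3 in product(p2_p1, RESIDUES)]


p1_pools = p1_library_i()
p2_pools = p2_library()
print(len(p1_pools), len(p1_pools["Phe"]))                # pools, peptides per pool
print(len(p2_pools), len(p2_pools[("Nle", "Phe")]))       # pools, peptides per pool
print(len(p3_library()))                                  # individual peptides
```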
The specificity constant k cat /K M was calculated after determining individual kinetic parameters using the Michaelis-Menten method. The results showed that the positively charged residues Arg, His, and Lys, as well as Tyr, were favored at the P3 position, and Pro was the least  favorable residue (Fig. 4). These data also revealed that the relative effect of different P3 residues on the substrate specificity constant depended on the identity of the P1 residue. Arg clearly is the optimal P3 residue when the natural amino acids Phe or Leu occupy P1. In contrast, His, Tyr, and Arg are the optimal P3 residues when the non-natural amino acid Nle occupies P1 (Fig. 4 and see below).
Because one purpose of this study was to develop an improved assay for TPP I activity, we compared the new ACC substrates with Ala-Ala-Phe-ACC, which has similar kinetic properties to the widely used TPP I substrate Ala-Ala-Phe-AMC (kcat/KM ~0.025 µM−1 s−1 for both substrates). Inspection of Fig. 4 indicates that the Ala-Ala-Phe sequence was not optimal, with a number of fluorogenic peptides having considerably higher substrate specificity constants (kcat/KM ~0.1 to ~0.2 µM−1 s−1). Although these consist predominantly of peptides containing the non-natural amino acid Nle (e.g. P3-P2-P1 sequences Arg-Nle-Leu, Arg-Nle-Nle, Arg-Nle-Phe, His-Nle-Nle, and Tyr-Nle-Nle), a number of the more specific substrates contained natural P3-P2-P1 sequences (e.g. Arg-Pro-Phe, Arg-Ala-Phe, Lys-Pro-Phe, and Gln-Pro-Phe). Examination of the individual kinetic constants revealed that the higher substrate specificity constants for the more favorable substrates could largely be attributed to their higher affinity (lower KM) for TPP I compared with that of the corresponding Ala-Ala-Phe substrate (Table 2 and supplemental Table 4).
Given that the use of Nle in the P2 position produced some of the most specific substrates, we also synthesized additional peptides with the optimal residues at P3 (His, Tyr, and Arg) and P1 (Nle and Phe) but containing norvaline (Nva) at P2. (Like Nle, Nva is a non-natural amino acid that contains an unbranched aliphatic side chain but is one methylene shorter than Nle.) Analysis revealed a trend in which peptides that contain a P2 Nva residue have substrate specificity constants ~30% higher than those of the corresponding P2 Nle peptides, and this is primarily because of an effect on KM (Table 2 and supplemental Table 4).

LC/MS/MS Analysis of Peptide Pools-
The results obtained from analysis of the individual peptides comprising the P1-II and P3 libraries suggested some interdependence of the P1, P2, and P3 positions on TPP I substrate specificity. To investigate this further without the considerable effort and expense required to synthesize additional individual peptides, we implemented an LC/MS/MS-based assay that allowed us to determine the substrate specificity constants of individual peptides within peptide pools. Briefly, TPP I was incubated for varying periods of time with peptide pools at concentrations of individual peptides well below their anticipated KM. After stopping the reaction, the mixture was analyzed by LC/MS/MS. From the retention time, the precursor mass, and the mass of the product ions, all peptides within a pool could be distinguished (see "Experimental Procedures"). Control experiments indicated that the integrated peak areas of the selected product ions were proportional to the amount of peptide analyzed and that there was excellent run-to-run reproducibility (data not shown).
We validated the LC/MS/MS assay using the seven P2 library pools that contained all 133 peptides that had been analyzed as individual peptides (see above under "P3 Library"). Initially, time course experiments using the Xaa-Nle-Phe-ACC peptide pool indicated that substrate disappearance could be fit with a single exponential decay model (Fig. 5, top panel), as expected if the digestion was first order with respect to substrate concentration (see "Experimental Procedures"). We analyzed a total of 589 peptides (31 pools of 19 peptides) from the P2 library using the LC/MS/MS assay and calculated their individual substrate specificity constants (values listed in supplemental Table 5 and graphed in supplemental Fig. 1). To simplify the presentation of this large amount of information, we normalized the kcat/KM values for peptides with a given P2-P1 sequence to the corresponding P3 Arg peptide, and we then averaged these normalized values across all available P2 residues for peptides with the same P1 and P3 residues (Fig. 6). This revealed that for all natural amino acids surveyed at P1 (Leu, Phe, and Tyr), the optimal P3 residue clearly was Arg. In contrast, for peptides with a P1 Nle, the optimal P3 residues were His followed by Tyr and Arg. These results were also evident from the fluorescence-based assay of the peptides composing the P3 library (see Fig. 4). In terms of trends for the suboptimal P3 residues, the only apparent differences were that P1 Nle favored a P3 Phe and that P1 Tyr favored the negatively charged amino acids Glu and Asp at P3 (Fig. 6).
Substrate specificity constants of peptides within a given pool (fixed P1 and P2 residues) were first normalized to the corresponding value of the peptide containing P3 Arg within that pool. For peptides with a given P3 and P1 residue, the normalized value was averaged across all available P2 residues, and the mean value ± range is displayed.
Substrate Specificity at the P1′ and P2′ Position-We used nonfluorogenic peptide libraries and the LC/MS/MS assay to explore further the substrate specificity of TPP I. We initially evaluated the P1′ position by synthesizing 19 individual tetrapeptides with diverse P1′ residues, with the P3-P2-P1 sequence fixed as Ala-Ala-Phe based on the sequence of the commonly used TPP I substrate Ala-Ala-Phe-AMC. The peptides had a free N terminus and a C-terminal amide. Equimolar amounts of these 19 peptides plus three fluorogenic substrates serving as internal controls (Ala-Ala-Phe-ACC, Asn-Nle-Nle-ACC, and Ala-Ala-Pro-ACC) were mixed, digested with TPP I for periods varying from 20 s to 8 h, and analyzed using the LC/MS/MS-based assay (see "Experimental Procedures"). The kcat/KM values for each component were calculated by curve fitting as described above. The results obtained for the three internal standards agreed with the previous fluorescence-based analysis of the individual peptides, further validating this approach (data not shown).
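The relationship between substrate disappearance and kcat/KM used in the pooled assay can be illustrated with a short sketch. It assumes pseudo-first-order conditions ([S] well below KM), so that S(t) = S0·exp(−(kcat/KM)·[E]·t). For simplicity it uses a log-linear least-squares fit, which is a stand-in for the nonlinear exponential curve fitting described in the text, not a reproduction of it. The function name, the enzyme concentration, and the synthetic peak areas are hypothetical and serve only as an illustration.

```python
import math

def fit_specificity_constant(times_s, peak_areas, enzyme_conc_uM):
    """Estimate kcat/KM (in uM^-1 s^-1) from substrate-depletion data.

    Assumes [S] << KM, so S(t) = S0 * exp(-(kcat/KM) * [E] * t).
    The slope of ln(peak area) versus time is -k_obs, and
    kcat/KM = k_obs / [E].
    """
    logs = [math.log(a) for a in peak_areas]
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_s, logs))
    k_obs = -sxy / sxx  # observed first-order rate constant, s^-1
    return k_obs / enzyme_conc_uM

# Synthetic, noise-free data (hypothetical values, not from the paper).
true_kcat_over_km = 0.1   # uM^-1 s^-1
enzyme_uM = 0.01          # assumed enzyme concentration
times = [0, 600, 1800, 3600, 7200]  # seconds
areas = [1e6 * math.exp(-true_kcat_over_km * enzyme_uM * t) for t in times]

estimate = fit_specificity_constant(times, areas, enzyme_uM)
print(f"estimated kcat/KM = {estimate:.3f} uM^-1 s^-1")
```

Because every peptide in a pool is fit independently from its own peak areas, the same function could be applied peptide by peptide across a pool, provided each peptide remains well below its KM.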
The substrate specificity constants for the Ala-Ala-Phe-Xaa tetrapeptides are shown in Fig. 7 (solid bars). The most favored residues at P1′ were Tyr and Phe, followed by the moderately favorable residues Val, Trp, Ile, and Ala. Tetrapeptides with polar residues in the P1′ position were less favorable, and those with positively charged side chains were particularly unfavorable. Finally, as at the P1 position, no hydrolysis was observed for peptides with a P1′ proline.
This analysis also allowed us to compare the Ala-Ala-Phe-Xaa tetrapeptides with the corresponding Ala-Ala-Phe fluorogenic peptides. Both the ACC and AMC coumarin derivatives were relatively poor substrates, having specificity constants similar to those of Ala-Ala-Phe-Asn and Ala-Ala-Phe-Leu, ~40-fold lower than that of the optimal tetrapeptide Ala-Ala-Phe-Tyr. This suggests that either the aromatic coumarin is unfavorable because of steric constraints or that the nonpeptidic amide bond is somewhat resistant to hydrolysis. Regardless, the identity of the P1′ residue had a significant effect, with the substrate specificity constant of the optimal Ala-Ala-Phe-Tyr being 500-fold higher than that of the second least specific substrate Ala-Ala-Phe-Arg.
We extended this analysis using a P2′-diverse library consisting of 19 pools, each containing 18 peptides. Synthesis was conducted using 19 separate reactions with a different fixed P2′ residue, an isokinetic mixture of 18 amino acids in the P1′ position, and the P3-P2-P1 sequence Ala-Ala-Phe (Table 1). (Edman sequencing and MS analysis revealed that Lys was not present at the P1′ position, suggesting that Lys coupling from the isokinetic mixture of amino acids failed. However, given that analysis of the tetrapeptides revealed that P1′ Lys was extremely unfavorable, the P1′ Lys-deficient pentapeptide pools were used for further characterization of TPP I substrate specificity.) After adding three fluorogenic peptides as internal standards (see above), each pool was digested by TPP I for 0, 4, 40, and 480 min and analyzed using quantitative LC/MS/MS (see "Experimental Procedures").
The effect of peptide length on P1′ is shown in Fig. 7. In Fig. 7, the open bars represent pentapeptides containing the indicated P1′ residue averaged over all P2′ positions except Pro, with the range being indicated by the vertical lines. Comparison of the pentapeptides with the corresponding tetrapeptides (Fig. 7, solid bars) indicates that in general, the relative contribution of different P1′ residues to substrate specificity is similar for the different length peptides. Interestingly, both length and the identity of the P2′ position have a relatively small effect on the more favorable P1′ residues compared with the less favorable P1′ residues. In general, the improvement found by increasing the length of the peptide is modest, with the optimal pentapeptide tested (Ala-Ala-Phe-Trp-Val, kcat/KM = 2.3 μM⁻¹ s⁻¹) having only an ~2-fold higher substrate specificity than the corresponding optimal tetrapeptide (Ala-Ala-Phe-Tyr, Table 6).
The relative effects of different P2′ residues are more clearly demonstrated by Fig. 8. Here, peptides with the same P1′ residue are grouped, with each graph showing the relative effect of the P2′ position. In Fig. 8, solid lines indicate the substrate specificity constants for pentapeptides with different P1′ residues averaged across all P2′ residues except Pro, whereas the dashed lines in Fig. 8 indicate the values for the corresponding tetrapeptides. Note that the scales are different for different P1′ residues. Several general features are apparent from this analysis. First, the presence of Pro at the P2′ position markedly decreases hydrolysis of most peptides by TPP I (Fig. 8). Second, except for Pro, in general, the presence of a given P2′ residue has either a beneficial or neutral effect on substrate specificity (compare the dashed lines with individual solid bars in Fig. 8). Hydrophobic residues at P2′ tend to have the most beneficial effect, in particular Val, Leu, and Nle for all P1′ residues and Ile for the poorer P1′ residues. In contrast, hydrophilic residues and Gly at P2′ tend to have little effect or even decrease the substrate specificity constant compared with the corresponding P1′ tetrapeptides. However, compared with P1′, except for Pro, the overall effect of the P2′ residue on substrate specificity is relatively small (see "Discussion" and Fig. 12).
Use of Fluorogenic Peptides for Analysis of Enzyme Activity in Biological and Clinical Samples-When devising biochemical assays using fluorogenic peptides, one has to consider both the activity of the enzyme of interest and additional potentially interfering activities. For instance, TPP I activity would be overestimated if the fluorophore were liberated following cleavage by another tripeptidyl-peptidase or by the combined actions of a dipeptidase and an aminopeptidase. Alternatively, TPP I activity could be underestimated if there were nonproductive consumption of substrate resulting from a single cleavage by an aminopeptidase or by a dipeptidase. To test these possibilities, we incubated a series of different fluorogenic tripeptides, selected to represent a broad range of kcat/KM values, with either highly purified recombinant human TPP I or biological samples (brain extracts from an unaffected human as well as CLN2(+/+) and CLN2(−/−) mice), and we measured the apparent enzymatic activity at pH 4.5. Fig. 9 demonstrates that the measurements obtained using human brain extracts and purified human TPP I showed excellent agreement. There was little activity found when using extracts from CLN2(−/−) mouse brain. Taken together, these results indicate that in brain the majority of the activity measured using all substrates tested was due to TPP I (Fig. 9, open squares). Interestingly, the correlation between activities obtained with different substrates in purified human enzyme and CLN2(+/+) mouse brain was considerably poorer than the correlation found using purified human enzyme and human brain (compare Fig. 9, open and filled circles). Additional experiments using mouse liver and human fibroblast extracts revealed a consistent trend in that there was significantly better correlation when comparing the relative hydrolysis of different substrates using different enzyme sources within a species than when comparing across species (data not shown). This suggests that the human and mouse TPP I have evolved to have slightly different substrate preferences.
We also measured the activity as a function of pH for 12 fluorogenic peptides by using human lymphoblast extracts. Use of purified human TPP I gave an activity profile with a single acidic pH optimum in the range pH 4-5 (Fig. 10 and data not shown). Ala-Ala-Phe-ACC and Ala-Ala-Phe-AMC had similar pH profiles to each other, with high activity in the neutral pH range (Fig. 10, upper panel). In contrast, the Arg-Ala-Phe-ACC peptide had extremely low activity in the neutral pH range (Fig. 10, upper panel), and its pH profile was representative of three other ACC peptides with P3-P2-P1 sequences Arg-Nle-Nle, Arg-Pro-Leu, and Arg-Pro-Nle (partial data shown in Fig. 10, lower panel). The other fluorogenic peptides had appreciable activity in the neutral pH range, but all were considerably lower than the Ala-Ala-Phe coumarin derivatives (Fig. 10, lower panel). Interestingly, some substrates such as Arg-Pro-Phe-ACC (Fig. 10, upper panel) and Lys-Pro-Phe-ACC (data not shown) had broader pH optima, with activity at pH 5 greater than or equal to the activity at pH 4.5. Finally, other peptides such as Arg-Nle-Leu-ACC (Fig. 10, upper panel) and Arg-Nva-Nle-ACC (data not shown) had pH optima at both 4.5 and 6.0. When considered with the observation that only a single acidic pH optimum was found using purified TPP I, these data emphasize the multiplicity of activities present within biological specimens that can hydrolyze some of the fluorogenic substrates. Spillover of these activities into the acidic pH range as well as other activities (see below) is a serious concern when attempting to measure accurately the low levels of residual TPP I activity that can arise from hypomorphic mutations in the CLN2 gene.
We further explored the utility of the new fluorogenic substrates by analyzing different tissues from wild-type and CLN2(−/−) mice, comparing the new fluorogenic substrate Arg-Nle-Nle-ACC and the widely used substrate Ala-Ala-Phe-AMC. Although only low levels of activity were detected in CLN2(−/−) brain using both substrates, this was not the case for all tissues examined (Table 3). For instance, when comparing CLN2(−/−) and wild-type spleen specimens, ~20 and ~1% activity was detected using the Ala-Ala-Phe-AMC and Arg-Nle-Nle-ACC substrates, respectively. Kinetic measurements on CLN2(−/−) spleen revealed that the Ala-Ala-Phe-AMC substrate had nonlinear reaction progress curves (Fig. 11, top panel). The lag followed by an increase in hydrolytic activity might result from processive peptide hydrolysis (e.g. an aminopeptidase and/or a combination of aminopeptidase and dipeptidase activities) or from proteolytic processing that could either activate or change the substrate specificity of another hydrolase. Regardless, no such activity was detected using the Arg-Nle-Nle-ACC substrate (Fig. 11, bottom panel). This clearly demonstrated the improved utility of the new fluorogenic substrate for the selective measurement of TPP I activity in biological specimens.

DISCUSSION
In this study, we conducted detailed substrate specificity measurements using synthetic peptide libraries and highly purified recombinant human TPP I. Our results confirm and extend earlier work that used partially purified enzyme preparations and commercially available peptides. For instance, Watanabe et al. (26) conducted a limited analysis of the P1 position using para-nitroaniline (pNA) peptides and partially purified TPP I from rat liver tritosomes. They found that Ala-Ala-Phe-pNA was ~60 times more specific than Ala-Ala-Ala-pNA, and no hydrolysis was detected for Ala-Ala-Pro-pNA. In another study, McDonald et al. (24) tested partially purified porcine TPP I using several fluorogenic Gly-Pro-Xaa tripeptides and found that the relative substrate specificity of the P1 position was Met > Leu ~ Ala > Arg. Considering that the overall geometry of Met is very similar to Nle, these earlier results are consistent with our findings that Leu, Phe, and Nle are optimal at P1 and that peptides with Pro at P1 are extremely poor substrates.
Recently, Oda and co-workers (36) reported an investigation of the substrate specificity of recombinant human TPP I using a number of hexapeptide derivatives with the unnatural amino acid p-nitrophenylalanine (Nph) at the P1′ position, which, after cleavage, undergoes a small change in absorbance at 300 nm. They found that the most effective substrate tested, Ala-Arg-Phe-Nph-Arg-Leu, had a substrate specificity constant that was 40 times higher than that of Ala-Ala-Phe-AMC. This is similar to our comparison of Ala-Ala-Phe-AMC and the optimal tetrapeptide Ala-Ala-Phe-Tyr (see "Results"). These investigators also examined the relative effect of five different residues at the P1 position in the context of Ala-Xaa-P1-Nph-Xaa-Leu, where Xaa represents a randomized mixture of 19 amino acids, and they found that all had similar activities (100 to 67% relative hydrolysis rates for Phe, Nle, Tyr, Leu, and Trp), but they did not report any findings regarding other P1 substitutions. Although we find a greater range in differences among these hydrophobic P1 residues, possibly because of the greater sensitivity of the fluorescence and mass spectrometry-based assays, the overall results are not too dissimilar.
FIGURE 9. Hydrolysis of ACC tripeptides using different enzyme sources. Thirty-six ACC tripeptides were selected that covered a broad range of substrate specificity constants. Individual peptides (200 μM) were incubated with either purified human TPP I or the indicated brain extract. Correlation coefficients were as follows: purified human enzyme versus human brain, r² = 0.99; purified human enzyme versus CLN2(+/+) mouse brain, r² = 0.78.
Oda and co-workers (36) also analyzed five different P2 residues in the context of the sequence Ala-P2-Phe-Nph-Arg-Leu and five different P3 residues in the context of the sequence P3-Ala-Phe-Nph-Arg-Leu. Their results are markedly different from our findings. For the P2 analysis, they report a 5-fold difference in the kcat/KM values of their peptides, with the relative order being Arg > Ala > Asp > Ser ~ His. In contrast, we find that the relative kcat/KM values of these residues vary by >50-fold, with their optimal P2 substrate, Arg, being among our least favored P2 residues. Also, for their P3 analysis, Oda and co-workers (36) reported a 50-fold difference in the five kcat/KM values, with the relative order being Ala > Asp > His > Arg ~ Ser. In contrast, we find that one of their most disfavored P3 residues, Arg, is the optimal P3 residue in our study. It is possible that some of these discrepancies reflect the use of different reporter groups in the two studies (ACC versus Nph) and interdependencies between P1′ and other positions. Future studies that directly compare the different substrates under identical conditions are required to resolve these issues.
We synthesized and analyzed over 7,200 different fluorogenic and standard peptides (as individual peptides or within mixtures) for our systematic analysis of the P3, P2, P1, P1′, and P2′ residues. To summarize this large amount of data, we constructed a substrate specificity matrix that compares the relative substrate specificities of different residues at each position, with the optimal residue being assigned a score of 1 (Table 4). A plot of the relative kcat/KM values indicates that the P1 position is the most critical, followed by P1′, P2, and P3, with the P2′ position tolerating a wide range of substitutions other than Pro (Fig. 12). If we consider each position to be independent, we can assign a relative score to all possible pentapeptide sequences as the product of the relative substrate specificities at each position (see below).
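As a rough illustration of the scoring scheme, the sketch below multiplies per-position relative specificities under the independence assumption. The dictionary is a hypothetical fragment, not Table 4 itself: it holds only the five factors that the text quotes for the subunit c N-terminal sequence DIDTA, and it assumes those factors are listed in the order P3, P2, P1, P1′, P2′. The names RELATIVE_SPECIFICITY and relative_score are illustrative.

```python
from math import prod

POSITIONS = ("P3", "P2", "P1", "P1'", "P2'")

# Hypothetical excerpt of a specificity matrix. Only the five factors quoted
# in the text for DIDTA are included, assuming they are given in the order
# P3, P2, P1, P1', P2'. A full matrix would list every residue per position.
RELATIVE_SPECIFICITY = {
    "P3": {"Asp": 0.031},
    "P2": {"Ile": 0.18},
    "P1": {"Asp": 0.023},
    "P1'": {"Thr": 0.26},
    "P2'": {"Ala": 0.56},
}

def relative_score(residues, matrix=RELATIVE_SPECIFICITY):
    """Score a pentapeptide as the product of per-position relative
    specificities, treating the five positions as independent."""
    if len(residues) != len(POSITIONS):
        raise ValueError("expected one residue for each of P3..P2'")
    return prod(matrix[pos][res] for pos, res in zip(POSITIONS, residues))

score = relative_score(["Asp", "Ile", "Asp", "Thr", "Ala"])
print(f"relative score for DIDTA = {score:.2e}")
```

The product of the five quoted factors is about 1.87 × 10⁻⁵, in line with the value of 1.8 × 10⁻⁵ stated in the text. Because the score is a simple product, it cannot capture the synergistic or deleterious interactions between positions that the libraries revealed.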
The matrix may be useful as a first approximation to estimate the specificity of TPP I for different substrates. It should be noted, however, that analyses of the P2 and P2′ libraries and P1 library II revealed that there are peptides that exhibit synergistic or deleterious interactions between residues at different positions. For instance, the beneficial effects that we found for positively charged P3 residues may be related to the fact that these residues are located at the N terminus of the peptides. Thus, for P3 residues, the positive charge of the N terminus, coupled with the positive charge of the side chain, may enhance binding to a negatively charged subsite on TPP I. However, this may be complicated by charge repulsion or electrostatic steering effects that occur when other positively charged residues are present on the peptide substrate (see Fig. 2). Thus, a more sophisticated approach, including consideration of structural and electrostatic properties of both substrate and enzyme, may be required to predict TPP I substrate specificity.
A homology model of TPP I has been proposed based on the crystal structures of the related bacterial proteases (37). Unfortunately, the relatively low sequence identity between the mammalian and prokaryotic enzymes (25%) has limited the analysis to the prediction that the binding pocket is relatively open and that TPP I prefers substrates with bulky hydrophobic groups in the P1 position. No insights could be drawn regarding the N-terminal P3 residue or any of the P′ residues. Further analysis would benefit from detailed knowledge of the three-dimensional structure of TPP I. At that time, the actual substrate specificity constants for individual sequences listed in supplemental Tables 1-6 should provide useful training and testing sets for extended computational analysis.
It is worth attempting to relate our kinetic measurements to the function of TPP I in vivo. The lifetime of proteins in the lysosome varies considerably, from that of long lived resident lysosomal enzymes (hours to many days) to rapidly hydrolyzed substrates that enter the lysosome by the endocytic or autophagic pathways. Although it is difficult to predict what rates of hydrolysis would be biologically relevant, it seems likely that true substrates would be hydrolyzed in the seconds to hours
a Data represent the mean ± S.D. of two animals for each genotype. Activity was measured for 60 min at 30°C and at pH 4.5 by using substrate concentrations of either 200 μM for Ala-Ala-Phe-AMC or 100 μM for Arg-Nle-Nle-ACC. Similar results were found when activity was measured at pH 4.0 or 5.0 (data not shown).
(38), then [E]0 ~500 nM. By using this value and considering the optimal substrate specificity constant to be 30 μM⁻¹ s⁻¹, we can assign time scales for hydrolysis of peptides with different substrate specificity constants and relate these to the predicted distribution of TPP I substrates (Fig. 13). One goal of this study was to provide information with which to interpret future studies directed at identifying substrates that accumulate in LINCL. Accumulation of peptides and proteins is likely to reflect a number of processes, including intrinsic susceptibility to TPP I (which will include both primary and higher order substrate structure considerations), accessibility/susceptibility to other lysosomal proteases that could potentially compensate for the defective TPP I, and the rate at which unhydrolyzed substrates aggregate to form protease-resistant storage bodies.
Although little is known regarding the substrates of TPP I in vivo, mitochondrial ATP synthase subunit c is known to accumulate in LINCL (17,39). The specific pathophysiological processes that both lead to and result from this accumulation are not well understood, and this hydrophobic membrane protein has been identified as a prominent component in the storage material in many other NCLs (19). If subunit c is directly degraded by TPP I, the results presented here indicate that it would represent only an average substrate, with a predicted relative substrate specificity constant of 1.8 × 10⁻⁵ based on its N-terminal sequence DIDTA (Table 4; 1.8 × 10⁻⁵ = 0.031 × 0.18 × 0.023 × 0.26 × 0.56). Given the assumptions mentioned above, subunit c would be predicted to have a half-life of ~40 min in the lysosome (Fig. 13).
a The substrate specificity of each peptide in 31 Xaa-P2-P1-ACC pools that were analyzed using the LC/MS/MS assay was normalized to the corresponding Arg-P2-P1-ACC peptide within the pool. Substrate specificity constants that were below the limits of detection were censored. The relative substrate specificity listed represents the mean normalized value for the indicated P3 residue.
b Data from the fluorescence analysis of the P2 library (direct method) were used. For each of the four fixed P1 positions, data were normalized using the value of the P2 Nle peptide pool. The four values for each P2 residue were then averaged.
c Substrate specificities determined for P1 library II (supplemental Table 2) were normalized to the peptide with the highest kcat/KM value within each Ala-Ala-Xaa-ACC and Arg-Nle-Xaa-ACC set. The two normalized values for each P1 were averaged and then re-normalized to the highest (P1 Leu) value.
d Data from the P2′ and P1′ libraries were used. Substrate specificity values were normalized within the tetrapeptide group or within the pentapeptide groups that had the same P2′ residue. Normalized values for the peptides having the same P1′ residue were averaged and then re-normalized to the highest (P1′ Phe) value.
e Peptides in the P2′ library were grouped by P1′ residue, and a normalized substrate specificity value was calculated for each peptide (different P2′) within each group. A mean normalized P2′ value was calculated using all P1′ groups except Pro and these were then re-normalized to the highest average value (P2′ Val).
FIGURE 13. Histogram of the relative scores for TPP I digestion of pentapeptides. A score for each possible pentapeptide sequence is calculated as the product of the relative substrate specificities at each position (see Table 4). The optimal P3-P2-P1-P1′-P2′ sequence Arg-Nle-Leu-Phe-Val is assigned a score of 1, and the poorest sequence Pro-His-Pro-Pro-Pro has a score of 8.8 × 10⁻¹⁴. The sequences for all peptides are distributed as a log normal distribution having a mean value of 6.2 × 10⁻⁶. The data were transformed into log units, and a frequency histogram was constructed using an interval (bin size) of 0.1. Scores corresponding to the indicated half-lives (dotted lines) were calculated assuming a concentration of 500 nM TPP I in the lysosome and a value of 30 μM⁻¹ s⁻¹ for the optimum substrate. Although Nle has been used in generation of the histogram and can be taken as an analog for Met, the plot does not reflect data from the sulfur-containing amino acids Met and Cys.
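The link between a relative score and a predicted lysosomal half-life can be checked with a few lines of arithmetic. The sketch below assumes pseudo-first-order hydrolysis (substrate well below KM), so that the observed rate constant is score × (kcat/KM of the optimal substrate) × [E], and it takes the two values stated in the text: 500 nM TPP I and 30 μM⁻¹ s⁻¹ for the optimal substrate. The function name half_life_seconds is illustrative.

```python
import math

def half_life_seconds(relative_score,
                      optimal_kcat_over_km=30.0,  # uM^-1 s^-1, from the text
                      enzyme_conc_uM=0.5):        # 500 nM TPP I, from the text
    """Predicted half-life of a substrate under pseudo-first-order
    conditions ([S] << KM): t1/2 = ln 2 / (score * kcat/KM * [E])."""
    k_obs = relative_score * optimal_kcat_over_km * enzyme_conc_uM  # s^-1
    return math.log(2) / k_obs

# Subunit c, using the relative score of 1.8e-5 quoted in the text.
subunit_c_minutes = half_life_seconds(1.8e-5) / 60
# The optimal sequence has a relative score of 1 by definition.
optimal_seconds = half_life_seconds(1.0)

print(f"subunit c: {subunit_c_minutes:.0f} min")
print(f"optimal substrate: {optimal_seconds:.3f} s")
```

With these inputs the predicted half-life for subunit c comes out at roughly 43 min, consistent with the ~40 min stated in the text, whereas the optimal sequence would be hydrolyzed in well under a second.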
One argument against a direct role for TPP I in subunit c degradation is that subunit c accumulates in many NCLs where TPP I activity measured in vitro is normal or even elevated (4,40,41). To reconcile this, one intriguing possibility is that in these other NCLs, the primary defect results in an accumulation of high affinity substrates that would compete with subunit c for TPP I. Alternatively, the primary defect could result in a change in the lysosomal environment resulting in reduced TPP I activity in situ toward subunit c. In either case, an increase in the already relatively long half-life of subunit c in the lysosome could allow slow processes such as aggregation and formation of storage bodies to occur.
Although the current substrate specificity measurements provide useful information, additional in vitro studies, combined with analysis of substrates that accumulate in the recently developed TPP I-deficient mouse model (35), are required to understand the lysosomal pathology that underlies LINCL. Nonetheless, the wide substrate specificity of TPP I and relative tolerance for a wide variety of residues at the P3, P2, P1′, and P2′ positions (Fig. 12 and Fig. 13 and Table 4) suggest that it is involved in general lysosomal proteolysis rather than specific processing of a limited set of proteins or peptides. This is in agreement with other studies employing enzyme inhibitors that report that degradation of a variety of peptides by lysosomal extracts was dependent on TPP I (42)(43)(44).
Another goal of this study is to identify new substrates that may form the basis for improved TPP I activity assays that can be used for research and clinical purposes. The tripeptide-coumarin fluorogenic assays have a number of advantages over earlier enzymatic assays that were developed prior to the realization that the CLN2 protease was identical to TPP I (3,4,45). These advantages include sensitivity, ease of use, and ability to process large numbers of samples in parallel (6,13). However, although clearly distinguishing affected individuals from carriers or controls, because of background considerations, assays employing the standard Ala-Ala-Phe-AMC substrate cannot accurately detect low levels of activity, which could be useful in predicting late-onset or prolonged disease associated with certain mutations (46,47). In addition, accurate measurements of low levels of activity would be useful for drug screening assays where compounds are evaluated for their abilities to counteract the effect of some CLN2 mutations, for example, aminoglycoside-mediated stop-codon suppression (28).
One key factor when developing assays to measure a specific enzyme in complex biological samples is the specificity of the substrate for the enzyme of interest compared with other activities. Thus, the relative specificity, rather than the absolute kcat/KM value, is of prime importance. When there are overlapping enzymatic activities, additional specificity can be achieved by judicious choice of reaction conditions (e.g. use of selective inhibitors, choosing a nonoptimal pH where one activity predominates, and/or treating samples to preferentially inactivate one of the activities), but the optimal solution is to find a highly specific substrate. In addition, other practical factors include signal to noise, ability to process multiple samples in parallel, and throughput, which is why multiwell plate-based assays using fluorogenic substrates are frequently the method of choice. The new substrates that we have identified in this study retain the advantages of the Ala-Ala-Phe-AMC substrate in this regard but have superior characteristics in terms of their relative specificities for TPP I.