Definition of the extended substrate specificity determinants for beta-tryptases I and II.

Tryptases betaI and betaII were heterologously expressed in yeast and purified to functionally characterize the substrate specificity of each enzyme. Three positional scanning combinatorial tetrapeptide substrate libraries were used to determine the primary and extended substrate specificity of the proteases. Both enzymes have a strict primary preference for cleavage after the basic amino acids, lysine and arginine, with only a slight preference for lysine over arginine. betaI and betaII tryptase share similar extended substrate specificity, with a preference for proline at P4, a preference for arginine or lysine at P3, and a slight preference for asparagine at P2. Measurement of kinetic constants with multiple substrates designed for beta-tryptases reveals that selectivity is highly dependent on ground state substrate binding. Coupled with the functional determinants, structural determinants of tryptase substrate specificity were identified. Molecular docking of the preferred substrate sequence to the three-dimensional tetrameric tryptase structure reveals a novel extended substrate binding mode that involves interactions from two adjacent protomers, including P4 Thr-96', P3 Asp-60B' and Glu-217, and P1 Asp-189. Based on the determined substrate information, a mechanism-based tetrapeptide-chloromethylketone inhibitor was designed and shown to be a potent tryptase inhibitor. Finally, the cleavage sites of several physiologically relevant substrates of beta-tryptases show consistency with the specificity data presented here.

Mast cells, mediators of inflammatory and allergic response, are found throughout the body concentrated near blood vessels in connective tissue and the mucous membranes of the respiratory and gastrointestinal tract. They play an important role in innate and acquired immune responses through the release of dense granules upon activation. Mast cell activation has also been implicated as a mediator of asthma and other inflammatory diseases. The major components of mast cell secretory granules are the tryptase serine proteases (1). Tryptases are secreted as catalytically active tetramers that are resistant to inactivation by plasma inhibitors. The 3-Å crystal structure has been solved and reveals a ringlike structure with the four active sites facing a central cavity (2). Several in vitro studies have identified multiple substrates for tryptase, including neuropeptides, fibrinogen, stromelysin, prourokinase, prothrombin, and protease-activated receptor-2 (3-6). Human chromosome 16 encodes several homologous tryptase genes, designated tryptase α, β, and γ (7,8). The β-tryptases share greater than 99% sequence identity, with tryptase βI and βII differing by a single N-glycosylation site. It is unclear why so many highly similar tryptases are expressed by mast cells. One possibility is that they each perform different proteolytic functions that may be reflected in their substrate specificity preferences. Indeed, it has recently been shown that a single amino acid substitution between tryptase α and tryptase βII accounts for discrimination in substrate preference for the two enzymes (9).
The substrate specificity of heterologously expressed human tryptase βI and βII was defined using multiple positional scanning synthetic combinatorial tetrapeptide libraries. We show that βI and βII tryptase have a defined primary (P1) and extended substrate specificity (P4-P2). The library profiles indicate that the substrate specificity is similar for the two enzymes. Furthermore, single substrates were designed and assayed to test the extended substrate specificity requirements, resulting in a sensitive and selective substrate for β-tryptases. Similarly, an irreversible inhibitor was designed from the preferred substrate sequence and shown to be a potent β-tryptase inhibitor. Structural determinants of specificity were examined through the modeling of the optimized substrate into the active site of the tryptase structure. Finally, it is noted that the specificity determined in this study correlates with the cleavage sites found in many of the characterized physiological substrates and may lead to the identification of additional substrates involved in both the immunological and pathological consequences of βI and βII tryptase release.

EXPERIMENTAL PROCEDURES
Materials-DNA-modifying enzymes were obtained from Promega (Madison, WI). The Pichia pastoris expression system was purchased from Invitrogen (San Diego, CA). Native human lung tryptase was purchased from ICN (Aurora, OH). Factor Xa was purchased from New England Biolabs (Beverly, MA). tPA and uPA were purchased from
βII Tryptase Gene Construction-The pPIC9-Hu Try (human βI tryptase plasmid) (11) was subjected to site-directed mutagenesis using the GeneEditor™ in vitro site-directed mutagenesis system (Promega, Madison, WI). The mutant oligonucleotide 5′-GAGGAGCCGGTGAAGGTCTCCAGCCAC-3′ was used to introduce a substitution mutation in the DNA coding for amino acid residue 113 (N113K). Full-length nucleic acid sequencing of both strands confirmed the sequence conversion to the βII tryptase isoform.
Expression and Purification-Recombinant human βI and βII tryptases were expressed and purified as previously described (11). Briefly, pPIC9-Hu Try/N113K was linearized by SacI digestion and transformed into the GS115 strain of P. pastoris. A tryptase-expressing clone was isolated and used for large scale expression by fermentation in buffered minimal methanol complex medium with 0.5 mg/ml heparin. Secreted mature βI and βII tryptases were purified to homogeneity using a two-column affinity chromatography procedure described previously. The enzymes were suspended in a final storage buffer containing 2 M NaCl, 10 mM MES, pH 6.1, and 10% glycerol.
The proportion of catalytically active βI and βII tryptase was quantitated by active-site titration with 4-methylumbelliferyl p-guanidinobenzoate (12). Briefly, fluorescence was monitored, with excitation at 360 nm and emission at 450 nm, upon the addition of enzyme to 4-methylumbelliferyl p-guanidinobenzoate. The concentration of enzyme was determined from the increase in fluorescence based on a standard concentration curve.
The recombinant human βI and βII tryptases (1 μg) and native human lung tryptase were subjected to reducing SDS-polyacrylamide gel electrophoresis on a 4-20% TG gel (Novex). Following electrophoresis, the gel was stained with GelCode™ (Pierce) (Fig. 1) to verify size and purity.
Positional Scanning Synthetic Combinatorial Library Screening-Preparation and screening of the positional scanning synthetic combinatorial library were carried out as previously described (10,13). The concentration of each of the 361 substrates per well in the P1 lysine and P1 arginine libraries was 0.25 μM. The concentration of each of the 6859 compounds per well in the P1-diverse library was 0.013 μM. Enzyme activity of βI and βII tryptase in the positional scanning synthetic combinatorial tetrapeptide library was assayed in 100 mM HEPES, pH 7.5, 10% glycerol, and 0 or 0.1 mg/ml heparin at excitation and emission wavelengths of 380 and 450 nm, respectively.
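The per-well substrate counts quoted above follow from the library design: with 19 amino acids at each randomized position (cysteine omitted, norleucine in place of methionine), two randomized positions give 19² = 361 sequences and three give 19³ = 6859. The short Python check below is only an illustrative sketch; the variable and function names are hypothetical, and the per-substrate concentrations are assumed to be micromolar.

```python
# Sanity check of the library sizes quoted in the text.
# Assumption: 19 residues per randomized position (no Cys, Nle for Met).
N_AMINO_ACIDS = 19

def substrates_per_well(randomized_positions: int) -> int:
    """Number of distinct sequences when this many positions are randomized."""
    return N_AMINO_ACIDS ** randomized_positions

# P1-Lys / P1-Arg libraries: P1 and one scanned position fixed, two randomized.
p1_fixed_count = substrates_per_well(2)
# P1-diverse library: only P1 fixed, three positions randomized.
p1_diverse_count = substrates_per_well(3)

# Total substrate per well, assuming the per-substrate values are in micromolar.
total_p1_fixed_uM = p1_fixed_count * 0.25
total_p1_diverse_uM = p1_diverse_count * 0.013

print(p1_fixed_count, p1_diverse_count)
print(f"{total_p1_fixed_uM:.1f} uM vs {total_p1_diverse_uM:.1f} uM total per well")
```

Under these assumptions both library formats carry roughly the same total substrate concentration per well (about 90 μM), which is why the per-substrate concentrations differ by a factor of about 19.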
Single Substrate Kinetic Analysis-Tryptase activity was monitored at 30°C in assay buffer containing 100 mM HEPES, pH 7.5, and 10% glycerol. Substrate stock solutions were prepared in Me2SO. The final concentration of substrate ranged from 0.005 to 2 mM. The concentration of Me2SO in the assay was less than 5%. The tryptase concentration was 5 nM. Hydrolysis of ACC substrates was monitored fluorometrically with an excitation wavelength of 380 nm and emission wavelength of 450 nm on a Fluoromax-2 spectrofluorimeter (JY Horiba).
Irreversible Inhibitor, Ac-PRNK-cmk, Kinetic Analysis-Progress curves were obtained for tryptase (1 nM) inactivation by multiple concentrations of Ac-PRNK-cmk (50 nM to 10 μM). Activity was monitored at 30°C in activity buffer with 100 μM Ac-PRNK-ACC substrate. The rate constant for loss of enzyme activity, kobs, was determined from a nonlinear regression of the progress curve data. kobs varied linearly with inhibitor concentration. Therefore, ka, the rate constant for the inactivation of enzyme with inhibitor, was determined by linear regression analysis (14). Several P1 basic-preferring proteases were monitored for inhibition by Ac-PRNK-cmk as follows. Tryptase βI (50 nM), tryptase βII (50 nM), factor Xa (50 nM), tPA (50 nM), uPA (50 nM), thrombin (1 nM), and plasmin (5 nM) were incubated for 5 min with 0, 10, and 100 μM Ac-PRNK-cmk. After incubation, residual activity was monitored as follows. Ac-PRNK-ACC was added to a final concentration of 5 μM to the samples containing tryptase βI and βII; Ac-GTAR-ACC (5 μM) was added to the factor Xa and tPA samples; Ac-QFAR-ACC (5 μM) was added to the uPA samples; Ac-nTPR-ACC (5 μM) was added to the thrombin samples; and Ac-KQWK-ACC (5 μM) was added to plasmin samples.
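The second step of this analysis, extracting the inactivation rate constant as the slope of kobs against inhibitor concentration, can be sketched in Python as follows. This is a minimal illustration rather than the procedure of reference 14: the helper `fit_line` is hypothetical, the kobs values are synthetic numbers generated from an assumed slope (they are not measurements from this study), and no correction for competition by the reporter substrate is applied.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Assumption: an illustrative slope used only to generate synthetic data.
ASSUMED_SLOPE = 5000.0  # M^-1 s^-1

# Inhibitor concentrations spanning the 50 nM to 10 uM range, in molar.
inhibitor_molar = [50e-9, 1e-6, 5e-6, 10e-6]
# Synthetic k_obs values (s^-1); real values would come from fitting
# each progress curve by nonlinear regression.
k_obs = [ASSUMED_SLOPE * c for c in inhibitor_molar]

k_a, intercept = fit_line(inhibitor_molar, k_obs)
print(f"k_a = {k_a:.0f} M^-1 s^-1, intercept = {intercept:.2e} s^-1")
```

With real data, an intercept close to zero and a good linear fit would support treating kobs as proportional to inhibitor concentration over the range tested.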
Structural Modeling of Optimized Substrate into Tryptase Active Site-The tryptase structure (PDB code 1a0l) was prepared for modeling by removing inhibitor and water molecules, adding hydrogens using Sybyl 6.5 (Tripos Inc., St. Louis, MO), and assigning AMBER partial atomic charges (16). Because the structure was solved with a covalent inhibitor, the catalytic Ser-195 was modeled to a geometry consistent with a noncovalent inhibitor by restoring the hydrogen bond with His-57. This was accomplished with a two-step torsional minimization in Sybyl (Tripos force field, ε = 1r). In the first step, the position of the Ser-195 hydroxyl hydrogen was minimized via torsion around the χ2 bond, and in the second step both the oxygen and hydrogen were minimized via torsion around the χ2 and χ1 (CCCO) bonds. The structure of the enzyme was held rigid for the remainder of the modeling.
The capped peptide backbone of Ac-PRNK-Nme was modeled into the active site of the tryptase structure as follows. The structure of the P1-P3 portion of ovomucoid (complexed to chymotrypsin; PDB code 1cho) was used as a template for the backbone configuration. This portion of the inhibitor was translated into the tryptase active site using least squares superposition of the protease active site residues His-57, Asp-102, Ser-195, and 214-216 onto the corresponding residues of the tryptase "A" protomer. The peptide side chains were then truncated at C-β, hydrogens and AMBER charges were added (as above), and the configuration of the resultant (Ace-AAA-Nme) peptide was optimized with successive minimizations in the tryptase active site. Using DOCK4.0.1 (17), the atoms of the scissile amide bond were minimized first, and then successive rigid segments of the peptide were added (with torsional angles taken from the ovomucoid inhibitor) alternating with minimization. The minimizations included rigid and flexible degrees of freedom and were performed using the simplex algorithm with up to 500 iterations for each minimization. The DOCK energy scoring, applied to both intermolecular and intramolecular atom pairs, includes the coulombic and van der Waals terms from the AMBER force field (16,18) with an interatomic cut-off of 25 Å and ε = 4r. The peptide side chains (PRNK) were then added, and the conformations of the P1-P3 side chains and the P4 proline were modeled with DOCK4.0 as previously described (19). Finally, 10 independent minimizations were carried out, and the lowest energy configuration was retained.

RESULTS
Expression of Active βI and βII Tryptase in P. pastoris-Recombinant tryptase βI and βII were produced and secreted in P. pastoris as mature enzymes. The ability to produce active mature enzyme rather than the zymogen is important for substrate specificity studies, because it obviates the need to remove the propeptide through the addition of an activating protease, whose activity may complicate subsequent specificity studies. There is a single amino acid difference between tryptase βI and tryptase βII at position 113, an asparagine and a lysine, respectively. Replacement of asparagine with lysine removes an N-linked glycosylation site in tryptase βII, making it singly glycosylated. The relative degree of glycosylation can be seen in the recombinant expression of both enzymes (Fig. 1), with tryptase βI migrating as multiple glycosylated bands and tryptase βII migrating as a single glycosylated band. The only difference seen in expression and purification of the two enzymes is the final yield of active enzyme, with tryptase βI expressing 10-fold more than tryptase βII. The phenomenon of reduced expression upon the removal of a glycosylation site has been observed with other proteases and has been postulated to involve decreased stability or solubility of the enzyme lacking post-translational glycosylation (20).
βI and βII Tryptase Have Equivalent Primary and Extended Substrate Specificity-To explore whether this single difference in glycosylation affects the substrate specificity of tryptase βI and βII, three combinatorial peptide libraries with fluorogenic leaving groups were used. The P1 specificity was first defined using a library in which each of the P1 amino acids in a tetrapeptide was held constant while the other three positions contained an equimolar mixture of 19 amino acids (cysteine was omitted, and norleucine replaced methionine). Both tryptase βI and βII prefer cleaving after lysine over arginine, with no other amino acids being accepted at this position (Fig. 2).
To define the extended substrate specificities of the β-tryptases as well as to determine if extended specificity is dependent on the context of the P1 amino acid, tryptase βI and βII were screened against two libraries that differed only in the P1 amino acid that was held constant, lysine or arginine. The P4-P2 extended substrate specificities of both β-tryptases reveal that the isoforms have a similar substrate preference that is not dependent on the P1 amino acid (Fig. 3, A and B). Also apparent from the specificity screen is that many suboptimal amino acids can be accommodated in the substrate, suggesting that additional mechanisms of substrate discrimination may also be in place. Both tryptases show an unusual preference for proline in the P4 position; no other serine protease has been shown to have this preference to date. The P3 position shows a preference for positively charged amino acids. Finally, the P2 position shows a modest preference for asparagine (Fig. 3, A and B).
To quantitate tryptase βI and βII dependence on extended substrate specificity, several peptide substrates were synthesized and the kinetic constants were determined for each of the enzymes. The slight preference for lysine over arginine as seen in the P1-diverse peptide library (Fig. 2) was validated with the substrates Ac-PRNK-ACC and Ac-PRNR-ACC. The Ac-PRNR-ACC substrate displays about 70-90% of the activity of the Ac-PRNK-ACC substrate; compare kcat/Km of (1.12 ± 0.14) × 10⁶ to (1.23 ± 0.15) × 10⁶ M⁻¹ s⁻¹ for tryptase βI and (1.31 ± 0.19) × 10⁶ to (1.89 ± 0.17) × 10⁶ M⁻¹ s⁻¹ for tryptase βII (Table I). A minimal preference, less than 2-fold, for P2 asparagine over P2 threonine was seen for both enzymes when Ac-PRTK-ACC was compared with Ac-PRNK-ACC; compare kcat/Km of (0.78 ± 0.07) × 10⁶ to (1.23 ± 0.15) × 10⁶ M⁻¹ s⁻¹ for tryptase βI and (1.27 ± 0.12) × 10⁶ to (1.89 ± 0.17) × 10⁶ M⁻¹ s⁻¹ for tryptase βII. A major difference is seen in the P3 position, with an approximately 10-fold decrease in activity observed for Ac-PANK-ACC relative to Ac-PRNK-ACC; compare kcat/Km of (0.14 ± 0.01) × 10⁶ to (1.23 ± 0.15) × 10⁶ M⁻¹ s⁻¹ for tryptase βI and (0.18 ± 0.01) × 10⁶ to (1.89 ± 0.17) × 10⁶ M⁻¹ s⁻¹ for tryptase βII. All of these effects are manifested in the Km term, not the kcat term. This indicates that ground state binding and recognition are important factors in tryptase catalysis. These results are consistent with the previous findings of Tanaka et al., who showed that benzyloxycarbonyl-Lys-Gly-Arg-p-nitroanilide was the best of the 14 tripeptidyl para-nitroanilide substrates tested (21).
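The fold differences quoted in this paragraph can be reproduced directly from the kcat/Km values given in the text. The Python sketch below is illustrative only: the dictionary layout and names are hypothetical, the values are the central estimates transcribed from the paragraph (in units of 10⁶ M⁻¹ s⁻¹), and the reported uncertainties are not propagated.

```python
# k_cat/K_m central values from the text, in units of 1e6 M^-1 s^-1.
KCAT_OVER_KM = {
    "betaI":  {"PRNK": 1.23, "PRNR": 1.12, "PRTK": 0.78, "PANK": 0.14},
    "betaII": {"PRNK": 1.89, "PRNR": 1.31, "PRTK": 1.27, "PANK": 0.18},
}

def relative_to_optimal(values, optimal="PRNK"):
    """Express each substrate's k_cat/K_m as a fraction of the optimal substrate."""
    reference = values[optimal]
    return {name: v / reference for name, v in values.items()}

relative = {enzyme: relative_to_optimal(v) for enzyme, v in KCAT_OVER_KM.items()}

for enzyme, fractions in relative.items():
    for substrate, fraction in fractions.items():
        print(f"{enzyme:7s} Ac-{substrate}-ACC: {fraction:.2f} of Ac-PRNK-ACC")
```

On these numbers the P1 Arg substrate retains roughly 70-90% of the activity of the P1 Lys substrate, the P2 Thr substrate about 63-67%, and the P3 Ala substrate about 10%, which matches the ranking described above.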
To demonstrate that information obtained from the substrate screen could be translated into a potent tryptase inhibitor, the irreversible inhibitor Ac-PRNK-cmk was tested for inhibition of tryptase. The measured association rate constant, ka, of 5000 ± 200 M⁻¹ s⁻¹ for both βI and βII tryptase indicates that Ac-PRNK-cmk is a potent inhibitor of tryptase. Selectivity of the designed tryptase inhibitor, Ac-PRNK-cmk, was demonstrated through the measurement of inhibition of several tryptic plasma proteases: factor Xa, tPA, uPA, thrombin, and plasmin. At an inhibitor concentration of 10 μM, where tryptase is 95% inhibited, none of the proteases tested showed inhibition (Table II). At a 10-fold higher concentration of inhibitor (100 μM), where tryptase is completely inhibited, only uPA and plasmin showed inhibition, 34 and 63%, respectively (Table II).
β-Tryptase Binds Its Preferred Substrate with Potential Interactions from Two Protomers-The source of the preference for basic residues at the P1 position is well known for this class of proteolytic enzyme; Asp-189 is present in all trypsin-like serine proteases and resides at the bottom of the S1 pocket. The source of extended specificity is less apparent. The structure of tryptase is unique among serine proteases in that it is a ringlike tetramer with the four active sites in close proximity within the interior pore (2). Using the program DOCK with energy scoring (22), the capped tetrapeptide Ac-PRNK-Nme was docked into the active site of βII tryptase. The docked molecule had a score of −86.34 DOCK units, consisting of an electrostatic contribution of −56.88 and a van der Waals contribution of −29.46. The unusually large electrostatic component is a result of the large negative charge concentrated within the pore of the tetramer.
The model of substrate binding suggests a paired binding site, with contributions from two tryptase protomers. Specifically, docking of the optimal peptide into the active site of tryptase predicts that the P4 and P3 side chains interact with the adjacent protomer. The P4 Pro side chain interacts with the γ-carbon of Thr-96′ of the adjacent protomer (Fig. 4). A recognition site for the P3 Arg is formed by acidic residues from both protomers, Glu-217 from the cognate protomer and Asp-60B′ from the adjacent protomer (Fig. 4). Formation of the P4 and P3 side chain interactions requires a somewhat noncanonical backbone configuration, resulting in the loss of a backbone hydrogen bond. By contrast, the P2 and P1 sites make the canonical interactions seen with other members of this protease class. For example, the deep S1 pocket contains Asp-189 from the cognate protomer that interacts with P1 Lys (Fig. 4). Another consequence of the structure is that each active site has an adjacent active site in close proximity, leading to potential substrate-substrate interactions (Fig. 4).

DISCUSSION
The tryptase family of serine proteases has been implicated in a variety of allergic and inflammatory diseases involving mast cells because of elevated tryptase levels found in biological fluids from patients with these disorders. However, the exact role of tryptase in the pathophysiology of disease remains to be delineated. The scope of biological functions and corresponding physiological consequences of tryptases are substantially defined by their substrate specificity. In this study, high levels of mature recombinant human βI and βII tryptases were expressed in P. pastoris (Fig. 1) for studies of primary and extended substrate specificity. Human mast cells express at least four distinct tryptases, designated α, βI, βII, and βIII. These enzymes are not controlled by blood plasma proteinase inhibitors and only cleave a few physiological substrates in vitro. It is currently unknown whether human tryptases perform redundant or unique functions in vivo. One recent study, which included a protein engineering approach, demonstrated that a single amino acid difference in one of the surface loops that forms the substrate-binding cleft led to a functional distinction between human tryptase α and βII (9). However, no physiological differences have been reported between βI and βII tryptases, which differ only at a single amino acid residue at position 113, leading to the loss of an N-linked glycosylation site. Based on the data presented herein, βI and βII tryptases have similar P4 to P1 substrate preferences (Figs. 2 and 3, Table I).
The functional similarity observed for the two enzymes is in agreement with the reported crystal structure of βII tryptase (2), which shows that the glycosylation site is peripheral to the active site and should therefore have minimal effect on the substrate specificity. The shared preference for peptide substrates may extend to a shared preference for physiological substrates. Indeed, the optimal sequence for β-tryptase cleavage, P4 Pro, P3 Arg/Lys, P2 X, and P1 Lys/Arg, is found in many of the macromolecular substrates previously shown by others to be cleaved by tryptase in vitro.
Tryptase is a potent activator of pro-uPA, the zymogen form of a protease associated with tumor metastasis and invasion. Activation of the plasminogen cascade, resulting in the destruction of extracellular matrix for cellular extravasation and migration, may be a function of tryptase activation of prourokinase plasminogen activator at the P4-P1 sequence of Pro-Arg-Phe-Lys (4). Vasoactive intestinal peptide, a neuropeptide that is implicated in the regulation of vascular permeability, is also cleaved by tryptase, primarily at the Thr-Arg-Leu-Arg sequence (5). The G-protein-coupled receptor PAR-2 can be cleaved and activated by tryptase at the Ser-Lys-Gly-Arg sequence to drive fibroblast proliferation, whereas the thrombin-activated receptor PAR-1 is inactivated by tryptase at the Pro-Asn-Asp-Lys sequence (3). Taken together, this evidence suggests a central role for tryptase in tissue remodeling as a consequence of disease. This is consistent with the profound changes observed in several mast cell-mediated disorders. One hallmark of chronic asthma and other long term respiratory diseases is fibrosis and thickening of the underlying tissues that could be the result of tryptase activation of its physiological targets. Similarly, a series of reports during the past year have shown angiogenesis to be associated with mast cell density, tryptase activity, and poor prognosis in a variety of cancers (6, 23-25).
A search of the protein databases (Swiss-Prot) has revealed other candidate physiological substrates containing the predicted sequences for β-tryptase cleavage. These macromolecules have yet to be empirically characterized as tryptase substrates, but many, such as latent transforming growth factor-β-binding protein (cytokine modulator), annexin I and II (calcium-binding proteins that participate in the regulation of early inflammatory responses), and HGF (a growth factor implicated in tumor development and progression and in angiogenesis), are particularly intriguing because they further support the concept of a prominent role for tryptase in tissue remodeling during disease pathogenesis.
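A sequence search of this kind can be sketched as a simple pattern scan for the P4-P1 consensus (P4 Pro, P3 Arg/Lys, P2 any residue, P1 Lys/Arg). The Python example below is a hypothetical illustration, not the search procedure used in this study: the function name, the regular expression, and the toy sequence are all assumptions, and a strict motif match of this sort ignores the tolerance for suboptimal residues seen in the library profiles.

```python
import re

# P4 Pro, P3 Arg/Lys, P2 any residue, P1 Lys/Arg (one-letter codes).
# The lookahead lets overlapping candidate sites be reported.
MOTIF = re.compile(r"(?=(P[RK][A-Z][KR]))")

def find_candidate_sites(sequence: str):
    """Return (start_index, tetrapeptide) for every P4-P1 motif match."""
    return [(m.start(), m.group(1)) for m in MOTIF.finditer(sequence.upper())]

# Toy sequence invented for illustration; not a real protein.
toy_sequence = "MAPRNKGSPKAKTT"
print(find_candidate_sites(toy_sequence))  # [(2, 'PRNK'), (8, 'PKAK')]

# The pro-uPA activation site quoted in the text satisfies the strict motif,
# whereas the PAR-2 site (Ser-Lys-Gly-Arg) does not, because P4 is not Pro.
print(find_candidate_sites("PRFK"))  # [(0, 'PRFK')]
print(find_candidate_sites("SKGR"))  # []
```

Because known substrates such as PAR-2 are cleaved at sites that deviate from the optimum, hits from a strict scan are best read as a shortlist of candidates, and a position-weighted score would likely be needed to recover weaker sites.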
In this study, the kinetic constants for βI and βII tryptase were determined using four synthetic peptide substrates derived from the positional scanning combinatorial peptide library screening results (Table I). These substrates served to quantify the tryptase dependence on extended substrate specificity, indicating that ground state binding and recognition are important factors in tryptase catalysis. The preferred tetrapeptide PRNK substrate and irreversible inhibitor, revealed from the combinatorial library screens, also formed the basis for a rapid, sensitive, and selective enzymatic assay for human β-tryptases in a variety of complex biological media, including serum and plasma. To explore the structural determinants of substrate binding, the Pro-Arg-Asn-Lys peptide was modeled into the active site of tryptase. The modeling results revealed several canonical substrate interactions in addition to interactions not seen with other serine proteases of the chymotrypsin fold. The unique tetrameric structure of tryptase allows for a substrate to interact with two protomers simultaneously. In addition, the close proximity of paired tryptase active sites raises the possibility of interactions between multiple substrates.
Tryptase has been recognized as a viable drug target, and therapeutically useful inhibitors have been under development by several pharmaceutical companies, some even taking advantage of the bifunctional active site (15,26). Insights gained from the modeling of the optimal sequence into the active site will support further development of novel selective substrates of β-tryptases that will enhance our understanding of the pathophysiology of these enzymes as well as lead to the development of new and effective inhibitors.
In summary, recombinant human βI and βII tryptases were expressed and used to determine an optimal sequence for β-tryptase cleavage: P4 Pro, P3 Arg/Lys, P2 X, and P1 Lys/Arg. This sequence has already proved useful for the development of pharmacological tools for the further study of tryptase. Moreover, this study of βI and βII tryptase highlights the utility of generalized positional scanning combinatorial peptide libraries to functionally characterize similarities and differences between homologous enzymes, generate sensitive and selective substrates and inhibitors, and define a subset of potential physiological substrates.