New, Sensitive Fluorogenic Substrates for Human Cathepsin G Based on the Sequence of Serpin-reactive Site Loops*

Cathepsin G has both trypsin- and chymotrypsin-like activity, but studies on its enzymatic properties have been limited by a lack of sensitive synthetic substrates. Cathepsin G activity is physiologically controlled by the fast-acting serpin inhibitors α1-antichymotrypsin and α1-proteinase inhibitor, in which the reactive site loops are cleaved during interaction with their target enzymes. We therefore synthesized a series of intramolecularly quenched fluorogenic peptides based on the sequence of various serpin loops. These peptides were assayed as substrates for cathepsin G and other chymotrypsin-like enzymes including chymotrypsin and chymase. Peptide substrates derived from the α1-antichymotrypsin loop were the most sensitive for cathepsin G, with kcat/Km values of 5–20 mM−1 s−1. Substitutions were introduced at positions P1 and P2 in α1-antichymotrypsin-derived substrates in an attempt to improve their sensitivity. Replacement of Leu-Leu in ortho-aminobenzoyl (Abz)-Thr-Leu-Leu-Ser-Ala-Leu-Gln-N-(2,4-dinitrophenyl)ethylenediamine (EDDnp) by Pro-Phe in Abz-Thr-Pro-Phe-Ser-Ala-Leu-Gln-EDDnp produced the most sensitive substrate of cathepsin G ever reported. It was cleaved with a specificity constant kcat/Km of 150 mM−1 s−1. Analysis by molecular modeling of a peptide substrate bound into the cathepsin G active site revealed that, in addition to the protease S1 subsite, subsites S1′ and S2′ significantly contribute to the definition of the substrate specificity of cathepsin G.

Human cathepsin G, neutrophil elastase, and proteinase 3 are the major proteolytic enzymes in the azurophilic granules of polymorphonuclear leukocytes (neutrophils) (1, 2). They are also present in mast cells (3) and monocytes (4) and are involved in the degradation of foreign organisms or dead tissues within the phagolysosome (5) during inflammatory reactions. They are released into the extracellular space, by leakage and/or cell death, where they are controlled by plasma-derived inhibitors such as α1-antichymotrypsin or α1-proteinase inhibitor. Any imbalance between these proteinases and their inhibitors because of genetic or acquired deficiencies of these inhibitors, or chronic inflammation, can result in the uncontrolled digestion of most proteins of the extracellular matrix (6). Neutrophil proteinases could therefore contribute to the development of connective tissue diseases, such as emphysema, rheumatoid arthritis, or periodontitis, but the involvement of cathepsin G in these processes remains controversial. Recent studies suggest alternative functions for cathepsin G in various cellular processes such as platelet activation (7), monocyte and neutrophil chemotaxis (8), and enhancement of natural killer cytotoxicity (9). Cathepsin G also has a potent antibacterial activity that is independent of its serine protease activity (10).
The proteolytic activity of cathepsin G has been studied using a variety of protein and peptide substrates. This enzyme is implicated in the proteolysis of elastin, collagen type I and type II, cartilage proteoglycans, fibronectin, laminin, and immunoglobulins G and M (see Ref. 6 for a review). Cathepsin G is also able to interact with the V3 loop of human immunodeficiency virus type 1 gp120 protein (11) and is involved in the proteolytic processing/activation of interleukin-8 (12), complement C3 (13), and factor V (14).
Investigation of the substrate specificity of cathepsin G using oxidized insulin B chain (15), peptide 4-nitroanilide substrates (16, 17), and peptide thioesters (18) has shown that cathepsin G has both chymotrypsin- and trypsin-like specificities but that it is much less active against all of the synthetic substrates assayed so far than are homologous serine proteinases. As no sensitive substrate suitable for enzymatic studies or the search for specific inhibitors of cathepsin G was available, we have developed a series of fluorogenic peptide substrates derived from the inhibitory loop of α1-antichymotrypsin, the main physiological inhibitor of cathepsin G (19). Some other serpins, including α1-proteinase inhibitor (19), have also been reported to be cathepsin G inhibitors. Interaction of these two serpins with their cognate enzymes involves the cleavage of their reactive inhibitory loop with the production of an SDS-stable complex as a result of the formation of a covalent bond. Therefore, the sequence of serpin reactive loops may be used to develop peptide substrates of the corresponding target enzyme(s), as has been shown recently for human kallikreins hK1 and hK2 (20). We therefore prepared a series of intramolecularly quenched substrates of cathepsin G based on the sequence of various serpin-reactive site loops and bearing an ortho-aminobenzoyl (Abz) group and a quenching N-(2,4-dinitrophenyl)ethylenediamine (EDDnp) group as the donor-acceptor pair at the N and C termini of the peptides, respectively. The hydrolytic activity of cathepsin G was compared with that of such chymotrypsin-like enzymes as chymotrypsin, chymase, and human kallikrein hK3 (also called prostate-specific antigen). The effects of residues at the P1 and P2 positions on substrate binding were assessed using the crystal structure of human cathepsin G (21). The nomenclature used for the individual amino acid residues (P1, P2, etc.) of the substrate and corresponding residues of the enzyme subsites (S1, S2, etc.) is that of Schechter and Berger (22).
... (23). N,N-Dimethylformamide and acetonitrile were from Merck, and C18 cartridges for reverse-phase chromatography were from Brownlee Laboratories. All other reagents were of analytical grade.
Design and Synthesis of Quenched Fluorescent Substrates-All quenched fluorogenic substrates were synthesized by solid-phase procedures with the Fmoc (N-(9-fluorenyl)methoxycarbonyl) methodology using a multiple automated peptide synthesizer (PSSM-8, Shimadzu Co.) according to previously described procedures (24-26). Glutamine was the C-terminal residue in all peptides because of a requirement of the synthesis strategy (24). Substrate purity was checked by matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF; Micromass, TofSpec-E) and by reverse-phase chromatography on a C18 column eluted at 2 ml/min with a 10-min linear gradient of acetonitrile (0-60%) in 0.075% trifluoroacetic acid. Substrate stock solutions (2-5 mM) were prepared in N,N-dimethylformamide and diluted with activity buffer.
Enzyme Assays-Assays were carried out at 37°C in 50 mM Hepes buffer, pH 7.4, 50 mM NaCl for cathepsin G and chymotrypsin, in 0.1 M Tris/HCl, pH 8.0, 1.8 M NaCl for human chymase, and in 50 mM Tris/HCl, pH 8.3, 1 mM EDTA for hK3.
The hydrolysis of Abz-peptidyl-EDDnp substrates was followed by measuring the fluorescence at λex = 320 nm and λem = 420 nm in a Hitachi F-2000 spectrofluorometer. The system was standardized using Abz-FR-OH prepared from the total tryptic hydrolysis of an Abz-FR-pNA solution, and its concentration was determined from the absorbance at 410 nm, assuming ε410 nm = 8,800 M−1 cm−1 for p-nitroanilide. Substrate concentrations of Abz-peptidyl-EDDnp were determined by measuring the absorbance at 365 nm using ε365 nm = 17,300 M−1 cm−1 for EDDnp.
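The concentration determinations described above follow the Beer-Lambert law. The short sketch below illustrates the arithmetic with the two extinction coefficients quoted in the text; the 1-cm path length and the absorbance reading are assumptions chosen for illustration and are not values from this study.

```python
# Beer-Lambert law: A = epsilon * c * l, so c = A / (epsilon * l).
# Extinction coefficients are those quoted in the text; the path length
# (1 cm) and the absorbance reading below are illustrative assumptions.

EPS_EDDNP_365 = 17_300.0  # M^-1 cm^-1, EDDnp at 365 nm
EPS_PNA_410 = 8_800.0     # M^-1 cm^-1, p-nitroanilide at 410 nm


def concentration_from_absorbance(absorbance, epsilon, path_cm=1.0):
    """Return the molar concentration for a given absorbance."""
    if epsilon <= 0 or path_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    return absorbance / (epsilon * path_cm)


# Hypothetical reading of a diluted Abz-peptidyl-EDDnp solution at 365 nm.
stock_molar = concentration_from_absorbance(0.346, EPS_EDDNP_365)
stock_micromolar = stock_molar * 1e6  # 0.346 / 17,300 corresponds to 20 µM
```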
Specificity constants (kcat/Km) were determined under first-order conditions using a substrate concentration far below the Km (0.1-6 μM, depending on the enzyme). Under these conditions the Michaelis-Menten equation reduces to v = kobs[S], with kobs = (kcat/Km)[E]t, where [E]t is the final enzyme concentration; dividing kobs by [E]t gave the kcat/Km ratio. The kobs for the first-order substrate hydrolysis was calculated by fitting experimental data to the first-order law using Enzfitter software (Elsevier Science Publishers, Amsterdam).
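As an illustration of how kobs and kcat/Km follow from a first-order progress curve, the sketch below fits synthetic data. It is not the Enzfitter procedure used here: it swaps in a simple log-linear least-squares fit of ln(1 − F) against time, where F is the fraction of substrate hydrolyzed. The rate constant, enzyme concentration, and time points are hypothetical.

```python
import math


def fit_first_order(times_s, fraction_hydrolyzed):
    """Estimate k_obs (s^-1) as minus the least-squares slope of
    ln(1 - F) versus t. Every F must be strictly below 1."""
    ys = [math.log(1.0 - f) for f in fraction_hydrolyzed]
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_s, ys))
    return -sxy / sxx


def specificity_constant(k_obs, enzyme_molar):
    """k_cat/K_m in M^-1 s^-1, valid only when [S] is far below K_m."""
    return k_obs / enzyme_molar


# Synthetic, noise-free progress curve (hypothetical values).
K_TRUE = 0.003   # s^-1
ENZYME = 20e-9   # 20 nM active enzyme
times = [60.0 * i for i in range(11)]               # 0 to 600 s
fractions = [1.0 - math.exp(-K_TRUE * t) for t in times]

k_obs = fit_first_order(times, fractions)
kcat_km_per_mM = specificity_constant(k_obs, ENZYME) / 1000.0  # mM^-1 s^-1
```

With real fluorescence traces, a direct nonlinear fit of the exponential is generally preferable because the logarithmic transform amplifies noise near complete hydrolysis.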
Cathepsin G and chymotrypsin were titrated with α1-antichymotrypsin as described previously (27), whereas the concentration of active chymase was determined under standard assay conditions using 1 mM succinyl-Ala-Ala-Pro-Phe-pNA as a substrate and assuming a specific activity of 2.7 μmol of p-nitroaniline released min−1/nmol of chymase (28).
The Km and Vm for the hydrolysis of Abz-peptidyl-EDDnp substrates by cathepsin G were determined using 8-10 substrate concentrations (1-15 μM). The final concentration of cathepsin G was 20-100 nM depending on the substrate used. Experimental data were fitted to the hyperbolic Michaelis-Menten rate equation using Enzfitter software. Values for kcat were obtained from Vm/[E]t = kcat.
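The determination of Km, Vm, and kcat can be sketched as follows. The study fitted the hyperbolic rate equation directly; this sketch instead uses the Hanes-Woolf linearization ([S]/v versus [S]), which needs only ordinary linear regression. All numerical values are hypothetical and noise-free.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_hanes_woolf(substrate_uM, rates_uM_per_s):
    """Hanes-Woolf: [S]/v = [S]/Vm + Km/Vm.
    Returns (Km in µM, Vm in µM/s)."""
    ratios = [s / v for s, v in zip(substrate_uM, rates_uM_per_s)]
    slope, intercept = linear_fit(substrate_uM, ratios)
    vm = 1.0 / slope
    return intercept * vm, vm


# Hypothetical parameters used to generate synthetic initial rates.
KM_TRUE, KCAT_TRUE, ENZYME_UM = 5.0, 0.5, 0.05   # µM, s^-1, µM (50 nM)
VM_TRUE = KCAT_TRUE * ENZYME_UM
substrate = [1.0, 3.0, 5.0, 7.0, 9.0, 11.0, 13.0, 15.0]  # 8 points, 1-15 µM
rates = [VM_TRUE * s / (KM_TRUE + s) for s in substrate]

km, vm = fit_hanes_woolf(substrate, rates)
kcat = vm / ENZYME_UM   # kcat = Vm / [E]t
```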
Chromatographic Procedures and Analysis of Peptide Products-Fluorogenic substrates (4-80 μM final concentration) were incubated with cathepsin G (200-400 nM), chymotrypsin (7-75 nM), or chymase (20-120 nM) at 37°C in their respective buffers. The reaction was stopped by adding 3 μl of trifluoroacetic acid, and aliquots (200 μl) were removed at times ranging from 2 min to 1 h depending on the substrate and enzyme used. The substrate fragments generated were purified by reverse-phase chromatography on a C18 column (2.1 × 30 mm, Brownlee), using a P200 pump (Thermo Separation Products) and a Spectrasystem UV3000 detector (Thermo Separation Products), at a flow rate of 1 ml/min, with a linear (0-60%, v/v) gradient of acetonitrile in 0.07% trifluoroacetic acid over 12 min. Eluted peaks were simultaneously monitored at three wavelengths (220, 320, and 360 nm), which allowed the direct identification of EDDnp-containing peptides prior to sequencing or amino acid analysis. Cleavage sites were identified by N-terminal sequencing using an Applied Biosystems 477A pulsed liquid sequencer with the chemicals and program recommended by the manufacturer. Phenylthiohydantoin derivatives were identified with an on-line model 120A analyzer. Alternatively, cleavage sites were identified from the amino acid composition of Abz-containing hydrolysis products, determined after acid hydrolysis of the fragments in 6 N HCl at 165°C for 90 min using an Applied Biosystems 420A derivatizer coupled to a 130A separation system. Phenylthiocarbamide-amino acids were detected at 254 nm and identified and quantified by reference to phenylthiocarbamide-amino acid standards.
Model Building-The model of a P3-P3′ hexapeptide (TPFSAL) complexed to cathepsin G was constructed using the Insight II and Discover molecular modeling and energy minimization programs (Biosym Technologies Inc.). The coordinates for cathepsin G (Protein Data Bank code 1cgh) (21), from which the complexed succinyl-Val-Pro-PheP-(Oph)2 inhibitor (where (Oph)2 is diphenyl ester) was first removed, were optimally superimposed onto the ecotin-crab collagenase complex (29). The P3-P3′ side chains in bound ecotin were changed to those for the model substrate by keeping the same conformation, except for Phe at the P1 position, in which the side chain was manually moved to adopt a conformation similar to that of the Phe occupying the S1 pocket of cathepsin G. Flanking residues were deleted and the substrate-cathepsin G complex was minimized using the steepest descent algorithm until the derivative was <0.1 kcal/mol/Å, with the cathepsin G atoms fixed to their crystallographic positions and the atoms of the substrate backbone tethered with a force constant of 500 kcal/mol/Å2.

RESULTS AND DISCUSSION
The activity of proteolytic enzymes is generally measured by the amidolytic release of a chromophore or a fluorophore covalently attached to a peptide moiety that confers the specificity to the enzyme under study. Cathepsin G, like some other serine proteases including proteinase 3 or hK3 (also called prostate-specific antigen), has little activity toward these substrates because of a low intrinsic catalytic activity of the proteinase or a low affinity for the peptide moiety. We have therefore used peptide sequences derived from proteolytically sensitive loops in natural serpin inhibitors to synthesize substrates with intramolecularly quenched fluorescence (Table I) that have residues on both sides of the cleavage site. The interaction of serpins with their cognate enzymes results in cleavage of the reactive loop, leading to either the formation of a covalent, enzymatically inactive complex (inhibitory pathway) or the release of an inactive serpin of lower intramolecular energy that has undergone a large conformational change (substrate pathway). This conformational change results in the insertion of the cleaved loop in a β-sheet of the protein. Partitioning of the serpin between the substrate and the inhibitor pathway depends on the serpin-proteinase pair and explains why there can be stoichiometries of inhibition >1 (30). Because serpin specificity toward a proteinase is governed mainly by the inhibitory loop sequence, synthetic peptides derived from the variable loop structure of individual serpin inhibitors should be good substrates for the corresponding cognate serine proteinase(s). This approach was recently used to develop substrates of hK2 and hK1 (20).
Hydrolysis of α1-Antichymotrypsin-derived Fluorogenic Substrates by Cathepsin G and Other Chymotrypsin-like Enzymes-α1-Antichymotrypsin (ACT) is the major physiological inhibitor of cathepsin G and is also an inhibitor of chymase and chymotrypsin (31). Peptide substrates of increasing length (Table I), reproducing the sequence of the human ACT loop from P5 to P15′ (nomenclature based on the Leu-Ser cleavage site in the ACT reactive loop denoted as P1-P1′ (22)), were synthesized and assayed as substrates of cathepsin G and related enzymes having chymotrypsin-like specificity, such as human chymase, bovine chymotrypsin, and human kallikrein hK3. The specificity constants kcat/Km for the hydrolysis of substrates were determined under first-order conditions, i.e. using substrate concentrations far below the supposed Km, thus avoiding the intermolecular quenching that occurs when the substrate concentration and/or length increases. As the specificity constant is not influenced by the nature of the rate-limiting step or by nonproductive substrate binding (32), it can be compared for different substrates and enzymes.
All ACT-derived substrates were hydrolyzed by cathepsin G, chymotrypsin, and chymase (Table II), although at different rates, in keeping with the inhibition spectrum of ACT. The three enzymes cleaved the shortest substrate of the series (AC-2) with the same kcat/Km, but the specificity constant increased dramatically with the peptide length for chymotrypsin and chymase, except for the longest substrate (AC-6), whereas it remained essentially constant for cathepsin G. This shows the importance of the residues from P13 to P5, and to a lesser extent of residues at P4′-P5′, for the interaction between the substrate and chymotrypsin-like proteinases. Under the experimental conditions used, we found no significant hydrolysis of these substrates by hK3, despite the fact that this chymotrypsin-like enzyme is inhibited by α1-antichymotrypsin in vivo (33). This result could be because of the low reactivity of hK3 against synthetic substrates (33) or because the inhibition mechanism of this proteinase by ACT is different and results in a different site of cleavage or no cleavage of the reactive loop.
The Km values for the hydrolysis of the ACT-derived substrates AC-2 to AC-6 (Table III) are far lower than those reported for the peptidyl 4-nitroanilide substrates currently used for cathepsin G (Suc-VPF-pNA, Km = 1.4 mM (17), or succinyl-AAPF-pNA, Km = 1.7 mM (27)). However, there was a limited improvement in the specificity constants because the kcat values were lower than those for the peptidyl 4-nitroanilide substrates (Table III), in agreement with the fact that the para-nitroanilide group is a very good leaving group for chromogenic substrates.
Cleavage sites in ACT-derived substrates were identified by reverse-phase HPLC fractionation of hydrolysis products and sequencing of the EDDnp-containing fragment(s) and/or by amino acid analysis of the Abz-containing fragment. Cathepsin G, chymotrypsin, and chymase all cleaved the same Leu-Ser bond in all substrates tested, whatever the substrate length (Fig. 1). This cleavage site is identical to that in α1-antichymotrypsin-proteinase complexes. This result emphasizes the importance of this sequence as a whole fitting into the enzyme active site, because there was no further cleavage after the other two Leu residues.
Other Serpin-derived Substrates for Cathepsin G-Substrates derived from the sequence of α1-proteinase inhibitor and squamous cell carcinoma antigen-2 (SCCA2) (34), which are two other serpin inhibitors of cathepsin G, were synthesized (Table I) and assayed with chymotrypsin-like proteinases including cathepsin G. The α1-proteinase inhibitor-derived substrate AP-1 was hydrolyzed by cathepsin G and chymase, although less efficiently than ACT-derived substrates, but was a very good substrate for chymotrypsin (Table IV). All enzymes cleaved at the same Met-Ser bond as that reported in α1-proteinase inhibitor upon interaction with most of its target proteinases. A different cleavage site, at the Phe352-Leu353 bond of the α1-proteinase inhibitor, has been reported for chymase (28). This finding could explain the low activity of chymase on the α1-proteinase inhibitor-derived substrate AP-1, which does not contain the Phe-Leu pair. The SCCA2-derived substrate SC-2 was poorly hydrolyzed by cathepsin G but was very efficiently cleaved by chymase at the same Leu-Ser bond. Chymotrypsin did not hydrolyze the SCCA2-derived substrate SC-2, which agrees with the fact that it is not inhibited by native SCCA2, unlike cathepsin G or chymase (34).
We also tested a substrate that reproduces part of the sequence of SCCA1, a serpin that is 92% identical to SCCA2, with no serine proteinase target identified so far. It has been reported as a cysteine proteinase inhibitor (35). The SCCA1-derived substrate was not hydrolyzed by cathepsin G, chymotrypsin, or chymase, although it contained a Phe residue.
The reactive site loop (RSL) of maspin, a serpin that possesses a tumor-suppressing activity, could also be a good candidate for designing a substrate for cathepsin G. The inhibitory activity of this serpin is controversial, although recent studies have identified tissue-type plasminogen activator as its potential target (36). Other studies indicate that the maspin RSL is very sensitive to proteolysis, which leads to maspin inactivation (37) and would explain its lack of inhibitory activity. The substrate derived from the maspin RSL was a very poor substrate of cathepsin G and chymase, but it was rapidly hydrolyzed by chymotrypsin with a specificity constant of 41.7 ± 0.8 mM−1 s−1 (Table IV).
Substitution at Positions P2 and P1 of the Lead ACT-derived Substrate, Abz-TLLSALQ-EDDnp-The nature of the P1 amino acid is critical for defining the substrate specificity of many serine proteinases. Cathepsin G has a preference for Phe, Lys, Arg, Leu, and Met at the P1 position of substrates (16, 17, 38). We therefore replaced the P1 Leu residue of the scissile Leu-Ser bond in Abz-TLLSALQ-EDDnp with an aromatic (Phe) or a basic (Arg or Lys) residue to obtain a substrate of improved specificity toward cathepsin G. In keeping with the preference of chymotrypsin-like proteinases for an aromatic residue at P1, replacement of the P1 Leu by Phe in AC-9 resulted in a 2-6-fold increase in the specificity constants (Table V) for the three proteinases tested. Introduction of an Arg (AC-7) or a Lys (AC-8) at P1 resulted in a slight decrease in kcat/Km for cathepsin G and chymotrypsin, whereas chymase hardly cleaved these two substrates. Combining a Phe residue at P1 with a Pro at P2, which is often a preferred residue at position P2 of synthetic substrates for many serine proteinases (39) including cathepsin G (16), resulted in an approximately 2-fold increase in the kcat/Km values. This was thus the best substrate (AC-9p) for cathepsin G and chymase in this series. With Leu, Arg, and Lys at P1, the presence of a P2 Pro did not significantly modify the catalytic efficiency of the enzymes, except for the substrate containing Pro-Arg, AC-7p, which gave a slightly lower kcat/Km value when tested with cathepsin G and chymase.

TABLE I. Columns: No.; Serpin; Amino acid sequence. The sequence alignment is based on the scissile bond at the P1-P1′ site in the corresponding native serpins.
The substrates containing Phe (AC-9 and AC-9p) and Leu (AC-3 and AC-3p) at P1 were cleaved by all three enzymes at the Phe-Ser and Leu-Ser bonds, but those containing Arg or Lys residues were further cleaved at the Leu-Gln bond by chymotrypsin and chymase but not by cathepsin G. Therefore the experimentally determined kcat/Km value for the hydrolysis of AC-7, AC-7p, AC-8, and AC-8p by chymotrypsin and chymase is an average value for the parallel hydrolysis of two peptide bonds. They are probably cleaved with similar efficiencies, i.e. with the same kcat/Km, because experimental data for first-order substrate hydrolysis fit a simple exponential (data not shown). The presence of a basic residue at P1 therefore directs the cleavage by chymotrypsin and chymase, but not by cathepsin G, toward the Leu-Gln bond. Thus, chymotrypsin and chymase have a more restricted specificity for aromatic/hydrophobic residues at P1 than does cathepsin G, indicating that cathepsin G has almost equal trypsin- and chymotrypsin-like specificities.
The important contribution of a Phe at P1 or a Pro-Phe pair at P2-P1 in cathepsin G substrates is further shown by the substantially lower free energy (Table V) required to reach the transition state complex for hydrolysis of AC-9 and AC-9p (Δ(ΔG) = −4.5 kJ mol−1 for P1 Leu → Phe and −5.62 kJ mol−1 for P2-P1 Leu-Leu → Pro-Phe). However, flanking residues at the P and P′ positions also contribute to recognition and cleavage efficiency, because substrates retaining the Pro-Phe moiety in their sequences are not cleaved with the same efficiency by cathepsin G. This is emphasized by the low cleavage susceptibility of Abz-AAPFSQ-EDDnp (kcat/Km = 3.8 ± 0.28 mM−1 s−1) by cathepsin G. Other substitutions at P1 resulted in a slight increase in the relative transition state binding energy Δ(ΔG), the least favorable residue being Arg at P1.
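The link between a change in specificity constant and the free energy values quoted above can be sketched with the standard transition-state relation Δ(ΔG) = −RT ln[(kcat/Km)variant/(kcat/Km)reference]. Both the use of this relation and the temperature (37°C, the assay temperature) are assumptions made for illustration; the text does not state how the values in Table V were calculated.

```python
import math

R_GAS = 8.314462618  # J mol^-1 K^-1 (molar gas constant)
ASSAY_TEMP_K = 310.15  # 37 °C, assumed to match the assay temperature


def delta_delta_g(kcat_km_variant, kcat_km_reference, temp_k=ASSAY_TEMP_K):
    """Relative transition-state stabilization in kJ/mol.
    Negative values mean the variant substrate is cleaved more efficiently."""
    ratio = kcat_km_variant / kcat_km_reference
    return -R_GAS * temp_k * math.log(ratio) / 1000.0


def fold_change(ddg_kj_per_mol, temp_k=ASSAY_TEMP_K):
    """Inverse relation: ratio of specificity constants for a given ddG."""
    return math.exp(-ddg_kj_per_mol * 1000.0 / (R_GAS * temp_k))


# Under these assumptions the quoted values correspond to roughly
# 5.7-fold (P1 Leu -> Phe) and 8.8-fold (Leu-Leu -> Pro-Phe) increases.
fold_p1_phe = fold_change(-4.5)
fold_p2p1_pro_phe = fold_change(-5.62)
```

Because the relation is logarithmic, a 2-fold change in kcat/Km corresponds to only about 1.8 kJ mol−1 at this temperature.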
Taken together, the results show that cathepsin G has the following preference for P1 residues: Phe > Leu > Lys > Arg. This order of preference is similar but not identical to that recently described by Polanowska et al. (38), who used a series of p-nitroanilide substrates and found the following order: Phe = Lys > Arg = Leu > Met. These differences may be attributed to synergistic interactions between cathepsin G subsites, as shown for human chymase (40), and/or to the presence of P′ residues in the intramolecularly quenched substrates. Earlier studies were done using peptide sequences derived from the RSL of α1-antichymotrypsin (17) or α1-proteinase inhibitor (41) to design cathepsin G substrates, but the kinetic parameters indicated that they were poorer substrates than those developed in this study, probably because of their short length or lack of residues on the prime side.
ACT variants mutated at the P1 residue of their RSL have been produced and assayed as inhibitors of chymotrypsin-like proteinases (28). The stoichiometry of inhibition (SI) varies significantly, depending on the nature of the P1 residue. For example, an SI of 7.0 was obtained for a Leu358-Phe variant (rACT L358F) interacting with chymase, compared with an SI of 4.5 with native ACT, indicating the predominance of the substrate pathway for the Phe P1 ACT variant. A variant of ACT with an arginyl residue at P1 (rACT L358R) did not inhibit human chymase. Our results agree with these observations because substrate AC-7, which has Arg at P1, was poorly hydrolyzed by chymase, whereas ACT-derived peptide substrates with Leu or Phe at P1 are excellent substrates for chymase. No SI values for the interaction of cathepsin G with ACT variants are available. The association rate constants for P1 ACT variants and cathepsin G demonstrate a 15-fold lower association rate for L358R ACT than for native ACT (42). This agrees with the lower kcat/Km value for AC-7 (P1 Arg) hydrolysis by cathepsin G than for AC-3 (P1 Leu). Recent data suggest that the cathepsin G/α1-antichymotrypsin stoichiometry is 1:1 (43), indicating an absence of the substrate pathway in the overall mechanism of inhibition. Instead, cleavage of the inhibitory loop must occur to form an irreversible complex with the enzyme trapped as an acyl complex. This is reflected in the rather low turnover number and high affinity of cathepsin G for ACT-derived substrates. The conformation of the inhibitory RSL of native ACT is probably more restricted than that of peptides derived from this loop, which should make the loop bind more tightly and be cleaved even more slowly than the model substrates used here.
Thus, the predominance of the substrate pathway over the inhibitory pathway during the interaction of a serpin-proteinase pair seems to be directly correlated with the susceptibility of the serpin inhibitory loop to proteolysis.
Structural Determinants of Substrate Specificity in Cathepsin G-Putative interactions between substrates and the extended binding site of cathepsin G were examined using molecular modeling to gain insight into the structural basis for cathepsin G substrate specificity. A model of the P3-P3′ peptide portion (TPFSAL) of the substrate Abz-TPFSALQ-EDDnp within the active site was constructed based on the P3-P3′ region of the ecotin inhibitor bound to crab collagenase, a serine protease whose structure is very similar to that of cathepsin G. The conformation of the inhibitory loop of ecotin has been suggested to mimic that of a polypeptide substrate (29) and is thus a good template for modeling a substrate complex of cathepsin G. Optimal superimposition of the cathepsin G structure complexed to a transition state analogue inhibitor (Protein Data Bank code 1cgh) (21) over the ecotin-collagenase complex (Protein Data Bank code 1azz) (29) resulted in 210 topologically equivalent α-carbon atoms with a root mean square deviation of 0.86 Å. The inhibitor bound to cathepsin G was removed, and the P3-P3′ residues in the ecotin binding loop (Ser-Thr-Met-Met-Ala-Cys) were replaced with the corresponding residues of the hexapeptide portion (Thr-Pro-Phe-Ser-Ala-Leu) of the AC-9p substrate, Abz-TPFSALQ-EDDnp. This procedure yielded an initial cathepsin G-substrate complex, which was energy-minimized as indicated in "Experimental Procedures." Interactions that could mediate binding of the hexapeptide model TPFSAL to cathepsin G are shown in the complex model (Fig. 2). Phe at P1 is bound in the S1 pocket defined by the 189-191 and 214-217 segments of cathepsin G. Because Glu226, which is hydrogen-bonded to Ala190, is present in the S1 pocket, the benzyl group of Phe P1 does not completely enter the pocket, but the edge of its phenyl ring makes favorable electrostatic interactions with Glu226, a residue that is thought to be a major structural determinant for the preference of Phe at P1 (21).
The Phe P1 of the modeled substrate was replaced by Leu, Arg, or Lys (Fig. 2) to examine how the S1 site accommodates aromatic and/or hydrophobic as well as positively charged side chains. Glu226 prevents P1 Leu from reaching the far end of the pocket, so that it lies at the mouth of the pocket, establishing hydrophobic contacts with Phe191, Lys192, and Tyr215. The side chains of Lys and Arg may extend almost fully into the pocket to contact Glu226 through hydrogen bond(s) and electrostatic interactions. The unique location of Glu226 in the S1 subsite of cathepsin G makes it a key residue of the S1 specificity for both Phe and basic residues. In agreement with the experimental results presented here, the modeling studies indicate that the S1 subsite of cathepsin G can accommodate several amino acid side chains without requiring any enzyme conformational change. Therefore, the ability of cathepsin G to accommodate various P1 residues can be explained by the shape and nature of the primary specificity pocket rather than by any structural plasticity of the active site, as in some serine proteinases such as α-lytic protease (44). Some flexibility may be possible, however, to optimally bind the substrate because of the absence of the disulfide bond at Cys191-Cys220.
Cathepsin G has a preference for a Pro at P2, like many other chymotrypsin-like enzymes. Although lacking hydrogen bond donor functions, this residue allows a change in the direction of the substrate chain as it threads through the active site, thus avoiding steric hindrance with His57 and leading to an optimal positioning of the scissile bond (Fig. 2). The conformation of the P2 Pro in the model substrate obtained after replacement of the Ser residue by Pro in ecotin was very similar to that of the Pro belonging to the inhibitor complexed to cathepsin G. This observation emphasizes the necessity for the substrate or inhibitor backbone of the P2 residue to adopt a conformation allowing an optimal adaptation of the polypeptide chain in the enzyme active site. Examination of the S3 pocket reveals no obvious binding determinants for the P3 side chain, although Lys192 should favor the presence of acidic residues and disfavor basic ones, in agreement with data from subsite mapping experiments using a series of p-nitroanilide substrates (16).
The modeling studies also suggest that P1′ Ser of the substrate may form a hydrogen bond with Ser40 of cathepsin G (distance Oγ Ser P1′-Oγ Ser40 = 3.4 Å), thus providing a structural basis for the preference of Ser at P1′, as in the natural inhibitory serpins α1-antichymotrypsin, α1-proteinase inhibitor, and SCCA2. The segment 39-41, belonging to a surface loop (loop 30s), has an unusual conformation that is also found in crab collagenase (29) and chymase (45) but not in most serine proteinases. This conformation results in a narrowing of the S1′ pocket, suggesting why the preferred P1′ residues in cathepsin G substrates tend to be small amino acids.
The binding clefts in the S2′ and S3′ sites appear to be less well defined than for other serine proteinases because of the peculiar backbone path of segment 39-41. Consequently, cathepsin G shows no significant preference for the P2′ and P3′ residues. However, the side chain of Arg41, unlike the equivalent residues in collagenase (29) and chymase (45), does not project toward the solvent but is packed on the protein body toward the active site. This restricts the residues that can be accommodated in the S2′ pocket to small amino acids such as Ala or Val. The binding energy is provided by residues 39-42 of cathepsin G (Fig. 2), which run antiparallel to the modeled substrate and form antiparallel main chain hydrogen bonds with the P2′ residue. It is also possible that the positively charged side chain of Arg41 is flexible and that an acidic residue may be accommodated at P2′. The proximity of Arg41 may also provide some discrimination at P3′.
These observations thus suggest that the 30s loop, because of its unique conformation in cathepsin G, acts as a major discriminating structural determinant of the P1′-P3′ specificity of cathepsin G. For this reason, P′-elongated substrates, such as the intramolecularly quenched substrates used in this study, are far better substrates than conventional chromogenic or fluorogenic substrates of cathepsin G that have residues only on the P side.
Another structural determinant that could be important in defining the dual substrate specificity of cathepsin G is Gly216. Its conformation is determined by distal surface loops and is correlated with the P1 preference in serine proteinases of the chymotrypsin family (44). The backbone conformation of Gly216 in cathepsin G (Φ = −173.66°, Ψ = −160.89°), like that in crab collagenase (Φ = −152.28°, Ψ = −166.56°), a protease having an even broader specificity than cathepsin G (46), is intermediate between that of trypsin (Φ = 177.30°, Ψ = 174.87°) and chymotrypsin (Φ = −165.56°, Ψ = −160.82°). It has also been demonstrated that variant trypsins (46) lacking a chymotrypsin-like backbone at Gly216 do not accelerate hydrolysis of P1 Phe-containing substrates. Because the Gly216 of cathepsin G has a conformation close to that in chymotrypsin, the conformation of Gly216 probably plays an important part in governing the dual trypsin- and chymotrypsin-like specificity of cathepsin G.

We have shown here that sequences derived from a given serpin-reactive site loop may be useful for designing substrates of the corresponding target enzyme. Most of the intramolecularly quenched fluorogenic substrates described here for cathepsin G have greater kcat/Km ratios than the best substrates currently used to measure cathepsin G activity. They have low Km values and rather moderate kcat values, which make them suitable for several applications. The specificity constant values determined for the various substrates used in this study indicate that cathepsin G has about 40-60% of the activity of chymotrypsin and 10-15% of that of chymase, and is much more active than hK3.

FIG. 2. Models of cathepsin G complexed to a hexapeptide substrate. A, the P3-P3′ TPFSAL residues of the AC-9p substrate were docked in the active site of cathepsin G as described under "Experimental Procedures." The residues of cathepsin G forming the S1 site (189-191 and 214-217) and the loop 30s (residues 39-41) interacting with P′ residues of the substrates are shown as a thin line. Glu226, located at the bottom of the S1 site and thought to be the key residue defining the P1 specificity of cathepsin G, is also shown. Residues P3-P3′ of the substrate are shown as a thick line. The hatched line indicates the modeled hydrogen bond interaction between Ser40 and the P1′ Ser of the substrate. The P1 Phe residue of the modeled AC-9p substrate was replaced by P1 Leu (B), P1 Arg (C), and P1 Lys (D). The S1 site may thus accommodate all of these residues without any conformational changes in the enzyme.