The 2.2-Å Crystal Structure of Human Pro-granzyme K Reveals a Rigid Zymogen with Unusual Features*

Granzyme K (GzmK) belongs to a family of trypsin-like serine proteases localized in electron dense cytoplasmic granules of activated natural killer and cytotoxic T-cells. Like the related granzymes A and B, GzmK can trigger DNA fragmentation and is involved in apoptosis. We expressed the Ser 195 → Ala variant of human pro-GzmK in Escherichia coli, crystallized it, and determined its 2.2-Å x-ray crystal structure. Pro-GzmK possesses a surprisingly rigid structure, which is most similar to activated serine proteases, in particular complement factor D, and not their proforms. The N-terminal peptide Met 14-Ile 17 projects freely into solution and can be readily approached by cathepsin C, the natural convertase of pro-granzymes. The pre-shaped S1 pocket is occupied by the ion paired residues Lys 188B-Asp 194 and is hence not available for proper substrate binding. The Ser 214-Cys 220 segment, which normally provides a template for substrate binding, bulges out of the active site and is distorted. By analogy with complement factor D, we suggest that this strand will maintain its non-productive conformation in mature GzmK, mainly due …

Granzyme K (GzmK) was discovered in granules of human lymphokine-stimulated killer cells together with granzyme A (GzmA) as a trypsin-like non-glycosylated serine protease (molecular mass, ~28 kDa) that cleaves Nα-benzyloxycarbonyl-L-lysine thiobenzyl ester (BLT) bonds (5). GzmA is 10-fold more abundant in human lymphokine-stimulated killer cell granules than GzmK (5), whereas granules from the RNK-16 rat tumor cell line contain significantly higher amounts of GzmK than GzmA (6). The rat enzyme (also known as fragmentin-3) was originally identified as the third proteolytic inducer of DNA fragmentation in perforin-treated YAC-1 target cells apart from GzmA and GzmB. DNA fragmentation initiated by GzmK or GzmA in the presence of perforin shows similar time kinetics and was observed after 20 h, whereas GzmB/perforin-mediated apoptosis occurred already within 2 h (6). In evolutionary terms, GzmK and GzmA are the most closely related granzymes. They both are encoded by single copy genes in humans and rodents and map to orthologous chromosomal segments on human chromosome 5q11.2 and mouse chromosome 13, band D2.2, respectively (7-9). Transcripts of the murine GzmK gene have been detected in certain regions of the mouse brain, including the cerebral cortex, hippocampus, diencephalon, and the pineal and pituitary glands (10).
GzmK and GzmA differ from GzmB in their substrate specificity. Although the latter cleaves after Asp residues, both GzmK and GzmA preferentially cut after basic residues. To date only the thioesters Z-Arg-SBzl and BLT, and a highly basic 13-amino acid oligopeptide (Cys-Gly-Tyr-Gly-Pro-Lys-Lys-Lys-Arg-Lys-Val-Gly-Gly) have been reported to be cleaved by GzmK at neutral pH. GzmK cleaves this oligopeptide between basic residues, most rapidly after positions 6 and 9 and more slowly after positions 7 and 8 (11). The best synthetic inhibitors of GzmK known are 3,3-diphenylpropanoyl-Pro-4-amidinophenylglycyl-diphenylphosphonate esters (12) and D-Phe-Pro-Arg-chloromethyl ketone (13). The most efficient natural inhibitor is inter-α-trypsin inhibitor, a heterotrimeric protein complex containing bikunin. Inhibition by this plasma protein complex is mediated by the second Kunitz-type domain of its bikunin subunit, the Ki value being 22 nM (13).
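The cleavage pattern reported for the 13-residue oligopeptide can be made explicit with a short Python sketch. The sequence and the cleaved positions (6, 9, 7, and 8) are taken from the text; the helper name `cleave_after` is hypothetical and serves only as an illustration of which residue sits in the P1 position for each cut.

```python
# Three-letter sequence of the 13-residue oligopeptide quoted in the text.
PEPTIDE = "Cys-Gly-Tyr-Gly-Pro-Lys-Lys-Lys-Arg-Lys-Val-Gly-Gly".split("-")

def cleave_after(residues, position):
    """Split a peptide after the given 1-based position (hypothetical helper)."""
    if not 1 <= position < len(residues):
        raise ValueError("position must leave at least one residue on each side")
    return residues[:position], residues[position:]

# Positions reported as cleaved: 6 and 9 (most rapidly), 7 and 8 (more slowly).
for pos in (6, 9, 7, 8):
    n_frag, c_frag = cleave_after(PEPTIDE, pos)
    # PEPTIDE[pos - 1] is the P1 residue, PEPTIDE[pos] the P1' residue.
    print(pos, PEPTIDE[pos - 1], "-".join(n_frag), "|", "-".join(c_frag))
```

In every reported cut the P1 residue is Lys or Arg and the P1′ residue is also basic, which is consistent with the statement that cleavage occurs between basic residues.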
GzmK, like GzmA and GzmB, is synthesized as a zymogen precursor, containing a signal and a short pro-peptide preceding the mature polypeptide chain (14). The cleavage site of the signal peptidase in the GzmK precursor has not been experimentally determined, and thus the exact length of the propeptide is uncertain. The N-terminal propeptides of GzmA, GzmB, and mast cell chymase precursors, however, have been identified in cultured cells as Glu-Arg, Gly-Glu, and Gly-Glu, respectively (15-17). Removal of these dipeptides is ensured by the lysosomal cysteine protease cathepsin C (dipeptidyl peptidase 1) (18), which is deficient in the Papillon-Lefèvre syndrome (19). Conclusive confirmation of the essential role of cathepsin C in the conversion process was obtained from cathepsin C-deficient knockout mice, whose lymphocytes contain only catalytically inactive GzmA and GzmB precursors at normal levels (15). Our recent studies indicate that an N-terminal Met-Glu dipeptide is also efficiently removed by cathepsin C in vitro from the human granzyme K zymogen (13). Thus, the Met-Glu precursor of GzmK most likely represents the natural proform or at least a functional intermediate during biosynthesis of catalytically active GzmK in vivo. Transport of zymogens through the endoplasmic reticulum and Golgi network before recognition by cathepsin C requires a certain degree of conformational stability, which prevents both premature proteolytic activity and degradation of precursors intracellularly.
The natural Met-Glu precursor of human GzmK lacks N-linked carbohydrates and can be produced in high quantities by recombinant expression in Escherichia coli and refolding from inclusion body proteins, which is a prerequisite for further structural analyses (13). To understand the specific conversion of pro-GzmK by cathepsin C as well as the weak proteolytic activity of mature GzmK on a structural basis, we produced the more stable but otherwise identical Ser 195 → Ala variant of human pro-GzmK, crystallized this variant, and determined its crystal structure at 100 and 298 K. Our structural work reveals an unanticipated mechanism of zymogen stabilization and suggests a distorted conformation for activated GzmK, which is incompatible with the binding and processing of protein substrates.

EXPERIMENTAL PROCEDURES
Construction of the Human S195A Pro-GzmK Expression Plasmid-The cDNA segment between the NdeI and ApaI site of the wild type plasmid pET24c(+) (Novagen) was removed by NdeI and ApaI cleavage (New England BioLabs) and replaced by a mutated cDNA sequence. This cDNA sequence was generated from the wild type construct (13) using the oligonucleotides DJ601 (upper primer, 5′-TGTGTTTCCATATGGAAATTAT-3′) and DJ582 (lower primer, 5′-TCAAGGGGCCCCCTGCGTCAC-3′). Oligonucleotide DJ582 introduces a codon for Ala at position 195 in the wild type cDNA sequence of GzmK. The resulting PCR product was digested with NdeI and ApaI and ligated into the cleaved pET24c(+) vector carrying the 3′ coding portion of the cDNA. The recombinant plasmid was amplified in DH5α cells (Invitrogen) and transformed into B834(DE3) cells (Novagen) for protein expression.
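How the lower primer encodes the Ala substitution can be checked with a small Python sketch. It takes the DJ582 sequence as printed, assuming that any hyphen inside the printed sequence is a line-break artifact, reverse-complements it to obtain the sense strand, and translates it with the standard genetic code. The reading-frame offset of 2 is an assumption, chosen because it yields the Asp-Ala-Gly-Gly-Pro-Leu stretch expected around the catalytic residue; the function names are illustrative only.

```python
# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def translate(seq, offset=0):
    """Translate complete codons of seq starting at the given offset."""
    codons = (seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)

# Lower primer DJ582 as given in the text (hyphen removed, see assumption above).
DJ582 = "TCAAGGGGCCCCCTGCGTCAC"
sense = reverse_complement(DJ582)

print(sense)                       # sense-strand sequence covered by the primer
print(translate(sense, offset=2))  # assumed frame: D A G G P L
print("GGGCCC" in sense)           # ApaI recognition site used for cloning
```

Under the assumed frame the second codon, GCA, encodes Ala at the position normally occupied by the catalytic Ser, and the ApaI site GGGCCC lies immediately downstream.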
Expression and Purification of the Ser 195 → Ala Zymogen-The expression of the Ser 195 → Ala mutant of pro-GzmK in B834(DE3) cells, grown in Luria-Bertani broth (LB) with 50 μg/ml kanamycin at 37°C, was induced by addition of isopropyl-1-thio-β-D-galactopyranoside to a final concentration of 1 mM as described previously (13). Bacteria were harvested after 5-h induction and lysed in 50 mM Tris-HCl, 2 mM MgCl2, 10 μg/ml DNase I, and 0.25 mg/ml lysozyme, pH 7.2. Inclusion bodies were collected by centrifugation; washed twice in 50 mM Tris-HCl, 60 mM EDTA, 1.5 M NaCl, 6% Triton X-100, pH 7.2, and three times in 50 mM Tris-HCl, 60 mM EDTA, pH 7.2; and dissolved in 6 M guanidinium chloride, 100 mM Tris-HCl, 20 mM EDTA, 15 mM GSH, 150 mM GSSG, pH 8.0, by stirring at room temperature overnight. Solubilized proteins were then dialyzed against 6 M guanidinium hydrochloride, pH 5.0, at 4°C. Refolding was initiated by diluting the solubilized proteins (typically 10-15 mg/ml) 1:100 (v/v) in 50 mM Tris-HCl, 0.5 M L-arginine, 20 mM CaCl2, 1 mM EDTA, 0.1 M NaCl, 0.5 mM L-cysteine, pH 8.5 (refolding buffer). Two further aliquots were added to this solution after 8 and 16 h. Diluted proteins were kept at 4°C in refolding buffer for an additional period of 96 h. The protein solution was dialyzed against phosphate-buffered saline, pH 7.3, concentrated by ultrafiltration, and loaded onto a Mono S-Sepharose column (Amersham Biosciences). Proteins were recovered from the column using a linear salt gradient (from 0.137 to 2 M NaCl).
Crystallization, Data Collection, and Processing-Crystals suitable for diffraction analysis were grown by the sitting drop vapor-diffusion method. One microliter of the protein solution (10 mg/ml) was mixed with 1 μl of the reservoir solution (3.2 M sodium formate, pH 8.4) and concentrated against 300 μl of the reservoir. After 2 days of incubation, 100 μl of 6 M sodium formate solution, pH 8.4, were added to the reservoir. Within 3 months at 20°C, crystals grew to maximal dimensions of 0.3 mm × 0.2 mm × 0.5 mm. A complete data set to 2.9-Å resolution was recorded at room temperature from a single crystal mounted in a thin-walled glass capillary. A second complete data set to 2.23 Å was collected at 100 K from a flash-cooled crystal that had previously been equilibrated in a cryo-solution (3.9 M sodium formate, pH 8.4, 15% (v/v) glycerol). Both data sets were recorded "in house" on a 300-mm MAR Research image plate detector, using monochromatized CuKα radiation from a RIGAKU rotating anode x-ray generator. The crystals of pro-GzmK Ser 195 → Ala did not experience appreciable radiation decay over the 2 days that they were exposed to the x-ray beam at room temperature. This feature seems to be related to the relatively low solvent content of the crystals (39%, corresponding to a Matthews coefficient of 2.02 Å³/Da). The crystals belong to space group P2₁2₁2₁ and contain one molecule per asymmetric unit.
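The link between the quoted Matthews coefficient and the 39% solvent content follows from Matthews' standard estimate, solvent fraction ≈ 1 − 1.23/V_M, where 1.23 Å³/Da is the usual constant for an average protein (it assumes a partial specific volume of about 0.74 cm³/g). The Python sketch below simply evaluates that relation; the function and parameter names are illustrative, and the constant is the textbook default rather than a value stated in the text.

```python
def solvent_fraction(matthews_coefficient, protein_volume_per_dalton=1.23):
    """Estimate the crystal solvent fraction from V_M (in Å^3/Da).

    protein_volume_per_dalton = 1.23 Å^3/Da is the conventional constant
    for an average protein (assumption, not taken from the text).
    """
    if matthews_coefficient <= 0:
        raise ValueError("Matthews coefficient must be positive")
    return 1.0 - protein_volume_per_dalton / matthews_coefficient

# Value quoted in the text for the pro-GzmK crystals.
print(round(solvent_fraction(2.02), 3))  # 0.391, i.e. about 39% solvent
```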
Diffraction data sets were evaluated with MOSFLM (20), reduced, and scaled without applying a sigma cut-off, using programs supported by the Collaborative Computational Project No. 4 (21). Crystal structures were solved by molecular replacement with AMoRe (available at www.ccp4.ac.uk/dist/html/amore.html) using 15.0- to 3.5-Å data and a modified GzmB (1FQ3) search model (22) in which all residues differing between the two proteases had been truncated to alanine. Model building was done on a Silicon Graphics Indigo 2 workstation using MAIN (23,24). Calculation of the electron density maps and crystallographic refinement were performed with X-PLOR (25) and CNS (26) using the target parameters of Engh and Huber (27). In the final steps of refinement, highly restrained atomic B-values were also refined. Models were built in agreement with the human GzmK sequence as deposited in the Swiss-Prot database (accession code P49863). They start with the N-terminal Met 14, which was confirmed by sequence analysis of two crystals, and end with Asn 248 (chymotrypsinogen A numbering). In both cases, there is continuous electron density from Gly 19 to Ser 92 and from Ser 100 to Pro 244. For the N-terminal segment Met 14-Gly 18, segment Arg 93-Gln 99, and the C-terminal residues from Pro 245 to Asn 248, lacking interpretable electron density, the occupancy of all atoms was set to zero. The final model comprises 221 defined residues and 131 solvent (water) molecules. 99% of all residues fall into the most favored or additionally favored regions (PROCHECK version 3.5.4 (28)). The final R-factors are 24.07% (R free, 28.45%) and 22.36% (31.67%) for data sets measured at 100 and 298 K, respectively (see Table I for statistics of the data collection, processing, and refinement).
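The R-factors quoted above follow the conventional crystallographic definition, R = Σ| |F_obs| − |F_calc| | / Σ|F_obs|, with R free evaluated over reflections withheld from refinement. The Python sketch below illustrates that definition and a random work/test split; it is not the X-PLOR or CNS implementation, the amplitudes are invented toy numbers, the function names are hypothetical, and the calculated amplitudes are assumed to be already on the same scale as the observed ones.

```python
import random

def r_factor(f_obs, f_calc):
    """Conventional R = sum(| |Fo| - |Fc| |) / sum(|Fo|).

    Assumes f_calc has already been scaled to f_obs (no scale factor fitted).
    """
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator

def split_work_free(indices, n_free=500, seed=0):
    """Withhold n_free randomly chosen reflections as the R free test set."""
    rng = random.Random(seed)
    free = set(rng.sample(indices, n_free))
    work = [i for i in indices if i not in free]
    return work, sorted(free)

# Toy amplitudes (invented for illustration only).
f_obs = [100.0, 50.0, 25.0, 25.0]
f_calc = [90.0, 55.0, 30.0, 20.0]
print(r_factor(f_obs, f_calc))  # 0.125

# Hypothetical data set of 2000 reflections with 500 set aside.
work, free = split_work_free(list(range(2000)), n_free=500)
print(len(work), len(free))  # 1500 500
```

R free is computed with the same formula as the working R, restricted to the withheld reflections, which is why it is normally the higher of the two values.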
R free is the R value calculated with 500 reflections that were not used for the refinement.
Modeling of Active GzmK-Active GzmK was manually modeled using MAIN (24). The S1 pocket and surrounding loops were modeled following bovine trypsin in complex with bovine pancreatic trypsin inhibitor (BPTI) (2PTC and 1TPA) (29) and human leukocyte elastase in complex with the third domain of turkey ovomucoid inhibitor (1PPF) (30). Residues Gly 89 to Phe 94 comprising the reactive-site loop of the second bikunin domain were taken from the reported crystal structure (1BIK) (31) and adjusted to the model of active GzmK according to the main-chain trace in the above-mentioned complexes. The side chains of Lys 192 of GzmK and Phe 94 of bikunin D2 were rotated to avoid clashes between the two moieties (interatomic distances below 3 Å). The structure of the complex was finally minimized with CNS, and the quality of the model was checked with PROCHECK version 3.5.4 (28). Additional atomic coordinates for serine proteases used in this work were obtained from the Structural Bioinformatics Protein Data Bank (PDB) with the accession codes 2PTC, 1TPA, 1PPF, 1BIK, 1FDP, 1DIC, 1HFD, 1BIO, 1DFP, 1DSU, 1DST, 1PJP, 1KLT, 1CGH, 1FQ3, 2PKA, 2CGA, 1FDP, 1FAX, and 1TGB.

RESULTS AND DISCUSSION
Overall Structure of Pro-granzyme K-In this report we describe and discuss the first three-dimensional structure of one of the zymogens that are converted by cathepsin C to an active serine protease in lymphocytes, mast cells, and granulocytes during granule biosynthesis. We have chosen human pro-GzmK, because the natural zymogen starts with a Met residue, lacks glycosylation sites, and can be expressed at high levels in E. coli and refolded from inclusion bodies. The catalytic Ser 195 was replaced by an Ala residue to avoid any autocatalytic cleavages in highly concentrated solutions during crystallization. Complete data sets were collected independently for two crystals, one at room temperature and one at 100 K after flash cooling in liquid nitrogen. Both Fourier maps were essentially equivalent and allowed continuous tracing of the polypeptide chain from Gly 19 to Ser 92, and from Ser 100 to Pro 244, whereas the first N-terminal residues between Met 14 and Gly 18, seven residues of the 99 loop between Arg 93 and Gln 99, and the four residues of the C terminus (Pro 245-Asn 248) were found to be disordered. The temperature independence of this flexibility indicates that these segments are statically rather than thermally disordered (32).
By analogy with trypsinogen and pro-complement factor D (pro-DF) and because of the weak proteolytic activity of activated GzmK, the pro-GzmK structure was expected to be quite flexible as in trypsinogen (33) and DF (34). To our surprise, however, the crystal structure analysis revealed a rigid zymogen with a preformed active-site cleft (see below). The activation domain and the S1 specificity pocket in particular are well-ordered. The pro-GzmK molecule has the shape of an oblate ellipsoid with principal axes of about 35 and 50 Å (Fig. 1). As in other chymotrypsin-like serine proteases, the single GzmK polypeptide chain folds into two β-barrels each comprising six antiparallel strands labeled β1 to β6, and β7 to β12 (Fig. 2), which are strapped together by the domain-linking segments Val 22-Pro 28, Leu 121-Ser 129, and Thr 229-Lys 232A (Fig. 2). The residues of the catalytic triad, His 57, Asp 102, and Ser 195 → Ala, are located at the junction of the two β-barrels. The preformed active-site cleft (see below) runs perpendicular to this junction along the surfaces of both barrels. Three helical elements are found on the surface of the molecule, a single 3₁₀ turn (Ala 55 to Gln 59, α1), an "intermediate" α-helix (Ser 164 to Asn 169, α2), and a long C-terminal α-helix extending from Lys 232A to Leu 243 (α3). The optimal alignment to bovine chymotrypsinogen A required insertions of one tetrapeptide (60A to 60D), three dipeptides (170A and 170B; 173A and 173B; 188A and 188B), and two single amino acid residues (223A and 232A), and deletions of three single residues (36, 127, and 218) and of the tetrapeptide segment between 203 and 206 (see Fig. 2).
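Because the chymotrypsinogen numbering uses letter suffixes for inserted residues (60A to 60D, 188A, 188B, and so on), residue labels cannot be ordered as plain integers or plain strings. A minimal Python sketch of one way to parse and sort such labels is given below; the helper name `residue_key` and the example list are illustrative assumptions and not part of the original analysis.

```python
import re

def residue_key(label):
    """Turn a label such as '188B' into a sortable (number, insertion) pair."""
    match = re.fullmatch(r"(\d+)([A-Z]?)", label)
    if match is None:
        raise ValueError(f"unrecognized residue label: {label!r}")
    return int(match.group(1)), match.group(2)

# Example labels taken from the insertions described in the text.
labels = ["189", "188B", "60A", "188", "60D", "188A", "60", "232A"]
print(sorted(labels, key=residue_key))
# ['60', '60A', '60D', '188', '188A', '188B', '189', '232A']
```

An empty suffix sorts before any letter, so the parent residue always precedes its inserted neighbors, matching the order along the polypeptide chain.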
FIG. 2. Topological sequence alignment of GzmK and related proteases. Initial multiple sequence alignment was done using "pileup" (Genetics Computer Group, GCG, Madison, WI). Final alignments were adjusted according to the structural alignment of each serine protease with bovine chymotrypsinogen A (btCtrA). Homologous regions were highlighted using "prettybox" (GCG program package). The vote weights for mmGzmK and rnGzmK were reduced to 0.6; no vote weight was specified for all other sequences. Helices and β-strands of pro-GzmK are represented by cylinders and arrows. The residues of the active site triad are emphasized in boldface numbers and the unique residues 188B, 215, and 219 are shown by italic numbers. Residues that are identical in more than two of the aligned sequences are emphasized by black shading. The species designations Hs (Homo sapiens), Rn (Rattus norvegicus), Mm (Mus musculus), Bt (Bos taurus) are used as prefixes in combination with accepted gene symbols (Ctra, chymotrypsinogen A; DF, complement factor D; Gzm, granzyme). The numbering of pro-GzmK residues is based on topological similarity to chymotrypsinogen A. Inserted residues are identified with the number of the last topologically equivalent chymotrypsinogen residue, and capital letters as a suffix in alphabetical order.
A direct topological comparison between pro-GzmK and the available enzyme-zymogen pairs deposited with the Protein Data Bank (35) revealed a better superposition with the active enzymes than with their respective zymogens. The best fit was obtained with the six PDB entries for active human complement factor D (DF), 1DIC, 1HFD, 1BIO, 1DFP, 1DSU, and 1DST. Considering only α-carbon atom pairs having Cα-Cα distances of 1.5 Å or less, 181 residues turned out to be topologically equivalent between pro-GzmK and DF, showing an average root mean square (r.m.s.) deviation of 0.72 Å (Fig. 1B). This topological similarity extends into many structural details (see below). By contrast, pro-GzmK and the four independent pro-DF molecules in the asymmetric unit of 1FDP possess only 157 topologically equivalent residues, with a slightly higher r.m.s. deviation of 0.81 Å. Second and third on the score ranking list are two groups of active serine proteases, comprising human mast cell chymase (1PJP and 1KLT) (36,37) and human cathepsin G (1CGH) (38), followed by GzmB (1FQ3) (22) and porcine kallikrein (2PKA) (39). From the two latter pairs 162 and 161 α-carbon atoms, respectively, can be superimposed on pro-GzmK with r.m.s. deviations of 0.74 and 0.76 Å. The topological fit with chymotrypsin and trypsin was worse, but still better than for their zymogens, chymotrypsinogen and trypsinogen.
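The counting rule used in this comparison (Cα pairs within 1.5 Å are called topologically equivalent, and the r.m.s. deviation is taken over those pairs) can be sketched in a few lines of Python. The sketch assumes that the two structures have already been superposed and that residues have already been paired by an alignment; it does not perform the superposition itself, it is not the program used by the authors, and the coordinates are invented toy values.

```python
import math

def equivalent_distances(coords_a, coords_b, cutoff=1.5):
    """Distances of pre-paired, pre-superposed Cα atoms that lie within cutoff (Å)."""
    distances = (math.dist(a, b) for a, b in zip(coords_a, coords_b))
    return [d for d in distances if d <= cutoff]

def rms_deviation(distances):
    """Root mean square of a list of pair distances."""
    if not distances:
        raise ValueError("no equivalent pairs")
    return math.sqrt(sum(d * d for d in distances) / len(distances))

# Toy coordinates (Å), invented for illustration: three paired Cα atoms.
structure_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
structure_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (2.0, 0.0, 5.0)]

kept = equivalent_distances(structure_a, structure_b)
print(len(kept))                      # 2 pairs within 1.5 Å
print(round(rms_deviation(kept), 4))  # 0.7071
```

Because the r.m.s. deviation is computed only over pairs that pass the cutoff, the number of equivalent residues and the deviation have to be read together, as in the comparison of DF (181 residues, 0.72 Å) with pro-DF (157 residues, 0.81 Å).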
Mast cell chymase, cathepsin G, and GzmB are slightly more distant in agreement with their evolution and chromosomal clustering. These three proteases and GzmH lack the disulfide bond Cys 191 -Cys 220 and are clustered together on chromosome 14q11.2 within 130 kb (4), whereas granzyme-related members with four disulfide bonds are distributed over two separate loci on chromosome 5q11.2 (GzmA and GzmK) and chromosome 19p13.3 (GzmM, azurocidin 1, proteinase 3, neutrophil elastase, and complement factor D).
Heparin Binding Regions of Human Pro-GzmK-Compared with other serine proteases, the percentage of polar residues in GzmK is not particularly high. The 6 glutamate and 11 aspartate residues, however, do not compensate for the positive charges of the 8 arginine and 23 lysine residues, rendering pro-GzmK very basic in accordance with a calculated isoelectric point of 10.2. The basic amino acid residues are unevenly distributed on the surface and form several clusters of positively charged surface patches (Fig. 3). A particularly large basic region extends from the intermediate helix toward the C-terminal helix running along the upper rim (in the standard representation, Fig. 3A). This patch comprises residues Lys 126, Arg 131, Arg 165, Lys 166, Lys 178, Lys 232A, Lys 233, and Lys 239 and shows clear topological similarities with the heparin binding site (also called anion binding exosite II) of α-thrombin (40). Other positively charged surface patches comprise residues Lys 86, Lys 87, Lys 107, and Lys 113 on the front side (Fig. 3A) and Lys 20, Arg 27, Lys 135, Lys 137, and Lys 202 on the back side of the molecule (Fig. 3B). In and around the S1 pocket, the lysines (Lys 188B, Lys 192, and Lys 224) and arginines (Arg 60A, Arg 93, and Arg 150) and the catalytic His 57 are charge compensated by aspartates (Asp 97, Asp 102, Asp 145, and Asp 194) and glutamates (Glu 173B and Glu 219) (Fig. 3A).
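The charge imbalance behind the basic character of pro-GzmK can be tallied directly from the residue counts given above. The Python sketch below is a simple count that assumes every Arg and Lys carries +1 and every Asp and Glu carries −1 near neutral pH; histidines and the chain termini are ignored, so it is only a rough net-charge estimate and not an isoelectric point calculation.

```python
# Residue counts quoted in the text for pro-GzmK.
counts = {"Arg": 8, "Lys": 23, "Asp": 11, "Glu": 6}

# Assumed formal side-chain charges near neutral pH (His and termini ignored).
charge = {"Arg": +1, "Lys": +1, "Asp": -1, "Glu": -1}

positive = sum(n for res, n in counts.items() if charge[res] > 0)
negative = sum(n for res, n in counts.items() if charge[res] < 0)
net = sum(counts[res] * charge[res] for res in counts)

print(positive, negative, net)  # 31 17 14
```

A surplus of 14 positive side chains is consistent with the strongly basic calculated isoelectric point and with the binding to heparin and cation-exchange matrices.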
In line with these structural features, we were able to demonstrate strong interactions of both pro-GzmK and mature GzmK with a heparin matrix, from which they were eluted at high salt concentrations (0.8 M NaCl). Using the same buffer system, pro-GzmK also bound to a cation-exchange column (S15 Sepharose; Amersham Biosciences) but eluted at 0.6 M NaCl, suggesting that heparin-like ligands specifically interact with pro-GzmK with high affinity in vivo. This latter notion is also supported by our observations that heparin affects proteolytic activity of both murine and human GzmK. In activity assays using 0.5 unit/ml heparin, the activities of human and mouse GzmK were elevated by 50 and 30%, respectively, as compared with measurements in the absence of heparin (data not shown).
The fact that esterolytic activity is not reduced by heparin indicates that heparin and small substrates bind to separate regions on the surface of the molecule. This heparin binding (exo)site could thus contribute to the recognition of macromolecular substrates, as shown e.g. for α-thrombin, resulting in a highly selective function in vivo. Alternatively, the heparin-binding region of GzmK may play a role in the binding and uptake of GzmK by the target cells during killer cell attack. In this regard, all three mammalian homologs of GzmK are devoid of carbohydrates and thus cannot be transported into endosomal vesicles via the mannose 6-phosphate receptor during target cell killing, as recently shown for GzmB (41).
Figure caption (partial): … Fig. 1. B, pro-GzmK is further rotated by 180° around the x-axis. The colors indicate positive (blue) and negative (red) electrostatic potential at the molecular surface, contoured at +10 kT/e to −10 kT/e. Basic and acidic residues are highlighted by yellow labels consisting of single-letter symbols for amino acid residues and sequence numbers; the N and C termini of pro-GzmK are marked with yellow labels. The figure was made with GRASP (59).
Pro-GzmK, a Rigid Zymogen with a Preformed Active-site Cleft-Fig. 1 …
still compact, bulges out of the main molecular body away from the active site. The 99 loop, normally forming the "roof" of the active-site cleft, is fully disordered between Ser 92 and Ser 100. GzmK is special in that its residue 94 is not a tyrosine, as in almost all other chymotrypsin-like serine proteases, or (more rarely) a phenylalanine, but a smaller valine residue. In contrast to the bulkier side chains of Tyr 94 or Phe 94 residues, the Val 94 side chain does not cover the catalytic Asp 102 side chain and is hence unable to shield the "catalytic hydrogen bond" between His 57 Nδ1 and Asp 102 Oδ2 from bulk solvent molecules. With respect to position 94, GzmK resembles DF, which also carries a small residue, a serine, at this position (42). Adjacent to the flexible 99 loop of pro-GzmK lies the well-ordered 175 loop. This loop is two residues longer than in chymotrypsinogen and intrudes, in particular with its Pro 174 pyrrolidine ring, deeply into the active-site cleft (Fig. 1).
The "lower" boundary of the pro-GzmK active-site cleft (described from left to right) is mainly formed by the 222 (β11β12) loop, the Cys 191-Gly 193 S1 base, the well-defined compact 145 ("autolysis" or β7β8) loop, and the 70-80 (β4β5) loop (Fig. 1). The latter possesses a conformation similar to the "calcium binding loops" of the calcium-stabilized pancreatic and coagulation serine proteases, but differs e.g. from pro-DF. In contrast to calcium-dependent serine proteases, pro-GzmK lacks the negatively charged residues at positions 70 and 80, which contribute to the calcium coordination sphere (43). Instead, the distal ammonium group of Lys 80 of pro-GzmK occupies the position of the calcium ion and is within hydrogen bond distance to the carbonyl oxygens of Ser 72 and Lys 75 and to a bulk solvent molecule. Thus, GzmK should be stable and function independently of calcium, in agreement with our experimental findings (data not shown).
The catalytic triad residues Ser 195, His 57, and Asp 102 are located in the center of the active-site cleft as in functionally active proteases (Figs. 4 and 5A). The Asp 102 side chain is placed almost "normally," without being shielded by residue 94 from bulk solvent (see above). The His 57 imidazole group, however, is rotated away from its normal gauche+ (χ1 about 64°) to a trans conformation (χ1 about 165.5°) and is thus unable to form the mechanistically required hydrogen bonds between its Nδ1 atom and Asp 102 Oδ2, and between its Nε2 atom and Ser 195 Oγ. This inactive His 57 conformation apparently results from the unusual conformation of the spatially adjacent residue Gly 215, which would collide with a functionally arranged His 57 side chain, rather than from the absence of the Ser 195 hydroxyl group in the crystallized Ser 195 → Ala variant of pro-GzmK.
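The side-chain angle χ1 referred to here is the dihedral angle defined by the atoms N, Cα, Cβ, and Cγ. A minimal Python sketch of how such an angle can be computed from four atomic positions, and how it maps onto the gauche+/trans labels used in the text, is given below. The coordinates are synthetic test points rather than atoms from the pro-GzmK model, the 120°-wide rotamer bins are an assumed convention, and the labels follow the naming used in the text (gauche+ near +60°).

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond (IUPAC sign)."""
    b0, b1, b2 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b0, b1), _cross(b1, b2)
    y = _dot(_cross(n1, n2), b1) / math.sqrt(_dot(b1, b1))
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

def chi1_rotamer(angle):
    """Assign a rotamer label using assumed 120-degree-wide bins."""
    wrapped = (angle + 180.0) % 360.0 - 180.0  # map to [-180, 180)
    if 0.0 <= wrapped < 120.0:
        return "gauche+"
    if -120.0 <= wrapped < 0.0:
        return "gauche-"
    return "trans"

# Synthetic points giving a +90 degree dihedral (not real atom coordinates).
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))

# Angles quoted in the text for His 57.
print(chi1_rotamer(64.0), chi1_rotamer(165.5))  # gauche+ trans
```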
The flanking Ser 214-His 217 segment, which represents the C-terminal part of strand β11 and forms the kinked "entrance frame" of the S1 pocket in functional trypsin-like proteases, curves considerably outward beyond the molecular body of pro-GzmK (Figs. 1, 4, and 5A). This unusual conformation seems to be stabilized by a strong hydrogen bond connecting the Gly 215 nitrogen with the Asp 102 carboxylate instead of the usually observed Ser 214-Asp 102 hydrogen bond, with the Ser 214 side chain rotated into the molecule. In addition, the imidazole group of His 217 slips into a pocket on the left side between the 214-217 segment and the main molecular surface. Gly 216, which in active trypsin-like proteases typically anchors the P3 substrate residue (Fig. 6), exhibits a conformation incompatible with the donation (via N-H) or acceptance (via C-O) of substrate hydrogen bonds. Also with respect to the 214-217 loop, the pro-GzmK structure most closely resembles active DF (Fig. 5A). In the latter, a serine residue at position 215 similarly pushes the His 57 side chain away from its functional position. In the overwhelming majority of serine proteases with trypsin-like specificity, the carbonyl group of Gly 219 points into the S1 pocket. Pro-GzmK, DF, and coagulation factor IX represent notable exceptions, because position 219 is occupied by a glutamate residue. In the pro-GzmK structure, the main chain carbonyl group of Glu 219 points away from the S1 pocket and forms a hydrogen bond with the Lys 224 N atom. The following Cys 220-Pro 225 loop of pro-GzmK is loosely arranged on top of the Gln 187-Asp 189 segment similarly to trypsin. Segment Pro 225-Tyr 228, which follows the β12 strand, is placed as in activated proteases, where it forms the back side of the S1 pocket.
The main chain segment Asp 189-Ala 195, like in functionally active serine proteases, forms the lower base of a pre-shaped S1 pocket in pro-GzmK. Due to the inversion of the Lys 188B-Asp 189 segment, however, the Lys 188B side chain intrudes into the S1 pocket toward Ala 195, whereas the side chain of the immediately following Asp 189 residue points away from the pocket and into the solvent (Fig. 4). The segment from Ser 190 onward, in particular the Gly 193-Ala 195 loop of pro-GzmK, follows approximately the usual main chain course observed in active enzymes (Fig. 5A). The N-H groups of Gly 193 and Ser 195 provide a functional "oxyanion hole," important for the stabilization of the negatively charged transition state intermediate. The conformation of the oxyanion hole seems to be stabilized by the Lys 188B-Asp 194 ion pair, which is held together by a very short (2.5 Å) charged hydrogen bond (Fig. 4). The N atom of Lys 188B is further hydrogen-bonded to the main chain oxygen of Ser 190. The Asp 194 side chain of pro-GzmK is arranged as in mature enzymes with regard to its β-carbon atom, but differs with regard to its carboxylate group, which is directed toward the Lys 188B ammonium group (χ1 = 75°) rather than toward the Ile 16 pocket (requiring a large χ1 rotation of about 150° to adopt the active conformation, see Fig. 6). Notably, a similar conformation of Asp 194 has been found in the zymogen of coagulation factor VII (44), but the adjacent main chain segments are more flexibly arranged than in pro-GzmK.

FIG. 4. The S1 specificity pocket is preformed in pro-GzmK. Stereo view of the final electron density map around the preformed S1 pocket of pro-GzmK. The active site region of pro-GzmK is shown in standard orientation, with residues displayed as a ball-and-stick model. The oxygens are highlighted in red, the nitrogens in blue, and the Cys 191-Cys 220 disulfide bridge in yellow color. The 214-220 segment, the basement of the preformed S1 pocket, the residues of the unusual zymogen triad of pro-GzmK, Lys 188B, Asp 194, and Ser 190, and His 57 are completely defined. The figure was generated with Bobscript version 2.5 (60, 61).

FIG. 5. A comparison of the active site regions of pro-GzmK and related proteases/zymogens. Stereo views of the S1 pocket of pro-GzmK (yellow) superimposed with: A, complement factor D (red); B, pro-DF (dark blue); and C, chymotrypsinogen A (light blue). The same colors have been used to designate selected amino acid residues except for pro-GzmK, which is labeled in black. The stereo images were generated with Molscript (62).
Stability of the Inactive Zymogen-Stabilization of catalytically inactive conformations in multidomain serine proteases is achieved in several ways, whereas serine protease precursors with short propeptides can be stabilized by a hydrogen bond network between Ser 32, His 40, and Asp 194, known as the "zymogen triad" since its first description in trypsinogen and chymotrypsinogen (45). These three residues are conserved in human, rat, and mouse GzmK but do not form a trypsinogen-like triad in pro-GzmK.
By contrast, the inactivity of pro-GzmK is ensured by a number of other structural features. The pre-shaped S1 pocket is occupied by the Lys 188B-Asp 194 ion pair, which, although leaving space for substrates with small P1 residues, is not suited to bind a substrate correctly. In all pro-GzmK homologs sequenced so far (Fig. 2), both residues Lys 188B and Asp 194 are conserved. This suggests that the zymogen-stabilizing salt bridge Asp 194-Lys 188B is a common feature among all GzmK homologs, and in turn points to a higher stability of this ionic interaction compared with the zymogen triad. More importantly, the template segment Ser 214-His 217 is far from allowing peptide substrates to correctly align to the active site. Finally, the His 57 imidazole side chain is pushed out of its functional site by the uncommon Gly 215 (Figs. 4 and 5).
Conversion of Granzyme Precursors by Cathepsin C-Cathepsin C, a papain-like dipeptidyl-aminopeptidase, seems to be the genuine activator of several granule-targeted serine protease precursors produced by mast cells, natural killer cells, and activated lymphocytes (17), including pro-GzmK (13). Our crystallographic study of pro-GzmK was also aimed at understanding zymogen recognition by cathepsin C. In particular, we explored the possibility that exosites on the cathepsin C surface, distinct from its catalytic center, contribute to this high recognition specificity and cleavage efficacy via docking experiments between human cathepsin C (46) and pro-GzmK. We found that all four substrate binding regions of the cathepsin C tetramer are freely accessible. Any monomer of the tetrameric cathepsin C molecule can productively bind to the exposed N-terminal Met 14-Gly 18 segment of pro-GzmK through subsites S2 to S3′ without the risk of clashes between the two approaching molecules. In particular, the deep S2 specificity pocket of cathepsin C appears to be well-suited to accommodate the side chain of the N-terminal Met 14 of pro-GzmK. Subsequent cleavage can occur between Glu 15 (P1) and Ile 16 (P1′). We could not discern any additional complementary surfaces in the vicinity of the N terminus of pro-GzmK and the substrate binding region of cathepsin C.
Pro-chymase, one of the known cathepsin C substrates, has been reported to be resistant to cathepsin C in the absence of heparin. To explain this finding, favorable intramolecular electrostatic interactions between the Glu 15 carboxylate group and a basic patch on the surface of pro-chymase and their disruption by heparin have been postulated by some investigators (47) but not proven (47,48). The crystal structure of pro-GzmK now disproves this postulated role for Glu 15 . Consistent with our structural findings are recent investigations showing that heparin and other sulfated polysaccharides with high negative charge density do not stimulate the activation of pro-chymase, but rather inhibit its cleavage by cathepsin C at physiological salt concentrations. Increasing ionic strength counteracts the inhibitory effect of heparin, but also leads to some inhibition of the conversion reaction, suggesting that the N terminus of granzyme precursors may be less accessible in heparin/pro-granzyme complexes and at high salt concentrations (48). Alternatively, binding of heparin-like proteoglycans to zymogens might induce allosteric changes that impair cathepsin C binding. Such mechanisms could explain the inhibitory role of heparin in the pro-chymase conversion process and would not be at variance with our finding of a highly flexible N terminus in pro-GzmK.
FIG. 6. Putative structure of active GzmK bound to a substrate/inhibitor. The crystal structure of pro-GzmK (blue) is shown superimposed with a model of active GzmK (red). In addition, the reactive site loop of the second Kunitz-type domain of bikunin (1BIK, residues Gly 89 -Phe 94 , yellow) has been modeled into the active site region using the known complex between BPTI and trypsin as a template. The figure was prepared with WEBLABVIEWER (available at www.msi.com).
A Structural Model for Mature GzmK-The conformational rearrangements that occur after cathepsin C conversion can be predicted with some confidence (Fig. 6) on the basis of structures available for several zymogen/enzyme pairs and in particular by analogy with pro-DF (1FDP) and DF (1DFP) (34,49). After removal of the N-terminal dipeptide Met-Glu, Ile 16 -Gly 19 rotates around Gly 18 -Gly 19 , dives into the pre-shaped Ile 16 pocket between Asp 189 and Ala 143 , and forms a salt bridge with the Asp 194 side chain. This change must be preceded or accompanied by disruption of the salt bridge between Lys 188B and Asp 194 and rotation of the Asp 194 side chain around its Cα-Cβ bond (from χ1 = 75° to about −70°). Competition of the newly formed N-terminal segment with the Lys 188B side chain for forming a charged hydrogen bond with Asp 194 would possibly require some support by a fitting substrate.
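For readers who wish to check such side-chain torsion angles against deposited coordinates, the following minimal sketch shows how a signed dihedral angle can be computed from four atomic positions (for χ1 of Asp, the atoms N, Cα, Cβ, and Cγ). It is an illustration only, not the procedure used in this work. The helper names `dihedral` and `shortest_rotation` and the toy coordinates are assumptions introduced here; real coordinates would have to be read from the corresponding PDB entry.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed torsion angle (degrees) about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Project b0 and b2 onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


def shortest_rotation(start_deg, end_deg):
    """Smallest signed rotation (degrees) taking start_deg to end_deg."""
    return (end_deg - start_deg + 180.0) % 360.0 - 180.0


if __name__ == "__main__":
    # Toy coordinates (hypothetical, not from any structure).
    print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
    # Rotation implied by the chi1 values quoted in the text.
    print(shortest_rotation(75.0, -70.0))
```

With the χ1 values quoted above (75° and about −70°), the shortest rotation computed this way is about −145°.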
The N-terminal Ile 16 ammonium group, however, can easily form a second hydrogen bond with the carbonyl oxygen of Ala 143 , which, in contrast to most other zymogens, is already in an "active" conformation in pro-GzmK (Fig. 6). The following residue, Ile 17 , most likely interacts with Asp 189 via two inter main chain hydrogen bonds, which necessitates a C-CO bond rotation of Asp 189 and a complete rearrangement of the preceding segment Asp 185 -Asp 189 . The conformational changes associated with an intruding N terminus would thus affect the structure of pro-GzmK at multiple sites. The mutual flipping of Lys 188B and Asp 189 would result in the exposure of the Lys 188B side chain to the bulk solvent, and in the "correct" positioning of the Asp 189 side chain at the bottom of the S1 pocket. These conformational rearrangements would also affect segment Ser 190 -Gly 193 and the Ser 214 -Pro 225 loop, including the connecting Cys 191 -Cys 220 disulfide bridge (Fig. 6).
Not exactly predictable, of course, are the shape of the Ser 214 -His 217 segment, whose extended conformation is critical for productive substrate binding, and the conformation of the segment His 217 -Cys 220 , which is important for the correct formation of a trypsin-like S1 pocket. In analogy with observations in several independent structures for DF (49), which also possesses a non-aromatic residue at position 215, we assume first, that segment Ser 214 -His 217 and in particular Gly 216 maintain their projecting pro-GzmK conformations and remain exposed after activation cleavage, and second, that the His 57 imidazolyl side chain cannot move to its mechanistically correct position. Third, and in analogy to coagulation factor IXa (FIXa, 1FAX) (50), Glu 219 does not adopt the "high energy" trypsin-like conformation. Fourth, a trypsin-like S1 entrance frame is not opened up. Finally, formation of a functional catalytic triad is disfavored by the lack of an aromatic side chain at position 94 as in DF.
The non-productive resting-state conformation, which we postulate for mature GzmK, most likely originates from three atypical residues in human GzmK, namely Gly 215 , Glu 219 , and Val 94 , for the following reasons. In trypsin and related typical active serine proteases, the side chain of the aromatic residue 215 fixes the exposed β11 template segment to the underlying strand β12. Simultaneously, this residue provides the basis for subsites S2 and S4. Certainly, and as supported by the DF structures, Gly 215 of GzmK cannot substitute for an aromatic residue and, rather, might push His 57 out of its catalytic position.
Gly 219 in functional serine proteases with trypsin-like structure and specificity, i.e. with a kinked entrance frame, adopts a main-chain conformation that would be of high energy when replaced by any other amino acid. Upon substrate occupation of the S1 pocket, the carbonyl group of Gly 219 forms a hydrogen bond with the distal ammonium or guanidyl group of a P1 lysine or arginine residue, respectively. A non-glycine residue at position 219 would have to adopt a high energy main chain conformation to fulfill an equivalent function, but Glu 219 of pro-GzmK (Lys 219 in the mouse and rat GzmK, see Fig. 2) adopts a relaxed conformation (φ = −100°; ψ = 126°) instead. The extremely low affinity of GzmK for benzamidine (in the millimolar range (13)) seems to suggest such a cryptic, i.e. suboptimally widened S1 pocket. We had previously advanced a similar explanation for the weak activity of coagulation factor IXa toward chromogenic substrates (49), one of the very rare examples of a non-glycine residue at position 219 in trypsin-like enzymes. Another example is the mannan-binding lectin (mannose-binding protein)-associated serine protease 3, which can interact with mannan-binding lectin complexes (51). With regard to the factor IX-related, but much more active human coagulation factor Xa, we have shown that the substitution of Gly 219 by Glu in factor Xa (Gly 219 → Glu mutation) lowers its activity about 1000-fold (52), thus proving the detrimental effect of a Glu residue in this position.
Finally, the typical Tyr/Phe residue 94 is replaced by the shorter aliphatic Val in GzmK. Normally, an aromatic side chain covers the reactive site Asp 102 and shields the important hydrogen bond to His 57 Nδ2 from bulk solvent. Obviously, and similar to the even smaller Ser 94 in DF (42), Val 94 of human GzmK cannot protect this Asp 102 -His 57 pair, but leaves some extra space around the His 57 imidazolyl side chain to adopt an inactive tautomer conformation, as seen in some DF structures (42). This interpretation is consistent with experimental data showing that a triple mutant of DF (S94Y,T214S,S215W) exhibits a more than 16-fold higher esterolytic activity toward BLT than wild type DF (53).
Substrate-dependent Activation of Mature GzmK, a Hypothesis-In view of all these considerations and previous findings, we hypothesize that mature GzmK is activated only transiently by a well-fitting protein or extended peptide substrate that is able to induce a functional conformation of the nonspecific substrate-binding template strand 214 -217 (Fig. 7). In fact, GzmK can interact with the peptide mimetic inhibitor D-Phe-Pro-Arg-chloromethyl ketone (13), although quite slowly compared with the specific reaction of thrombin, for instance. Macromolecular substrates that are efficiently cleaved might exclude solvent on top of the Asp 102 -His 57 pair and/or pack their P2, P3, and/or P4 side chains in the encounter complex with GzmK to promote formation of a functional catalytic machinery.
To illustrate the anticipated conformational changes of GzmK after binding to a macromolecular substrate, we modeled residues Gly 89 to Phe 94 of the reactive-site loop of the bikunin subunit D2 into the hypothetical S4 to S2′ subsites of mature GzmK (Fig. 6). In this model, the bikunin Arg 92 side chain occupies the S1 pocket, as observed in the BPTI-trypsin complex (54), which was taken as the template (see "Experimental Procedures"). In addition, the Ser 214 -His 217 stretch had to be switched from a "resting" to an "active state" conformation to achieve an anti-parallel juxtaposition of this segment with the main chain of the substrate (Figs. 6 and 7). Concomitantly, the main chain conformation of Glu 219 had to be changed to allow for hydrogen bond formation between its carbonyl group and the terminal guanidyl group of the P1 Arg residue. His 217 has been shifted away by ~8 Å with its side chain now being exposed to the solvent. In this way the characteristic kink of the trypsin-like "entrance frame" to the S1 pocket can be generated.
GzmK evidently belongs to the growing subclass of trypsin-like serine proteases with low proteolytic activity such as FIXa (with a cryptic S1 pocket and a distorted 99 loop) (50), DF (42), degP (with a blocked and simultaneously distorted 214 -217 template) (55), and α-tryptase (with a kinked 214 -217 template and a blocked S1 pocket) (56). All these proteases are characterized by a non-functional substrate-interacting template, incompatible with productive substrate binding and processing. Interestingly, these proteases appear to develop their full proteolytic potential by diverse means. Binding of coagulation FIXa to coagulation cofactor VIIIa generates a complex (intrinsic X-ase) with proteolytic activity toward coagulation FX, its specific macromolecular substrate. Similarly, DF efficiently cleaves and activates factor B upon binding to the C3b-factor B complex (57). On the other hand, degP seems to become proteolytically active toward partially unfolded proteins upon large conformational changes at elevated temperatures. Advantages of such a mechanism would be high specificity for macromolecular substrates, inactivity toward short peptides, and dispensability of high affinity inhibitors.