Inhibition of Mammalian Legumain by Some Cystatins Is Due to a Novel Second Reactive Site*

We have investigated the inhibition of the recently identified family C13 cysteine peptidase, pig legumain, by human cystatin C. The cystatin was seen to inhibit enzyme activity by stoichiometric 1:1 binding in competition with substrate. The K i value for the interaction was 0.20 nM, i.e. cystatin C had an affinity for legumain similar to that for the papain-like family C1 cysteine peptidase, cathepsin B. However, cystatin C variants with alterations in the N-terminal region and the “second hairpin loop” that rendered the cystatin inactive against cathepsin B still inhibited legumain with K i values of 0.2–0.3 nM. Complexes between cystatin C and papain inhibited legumain activity against benzoyl-Asn-NHPhNO2 as efficiently as did cystatin C alone. Conversely, cystatin C inhibited papain activity against benzoyl-Arg-NHPhNO2 whether or not the cystatin had been incubated with legumain, strongly indicating that the cystatin inhibited the two enzymes with non-overlapping sites. A ternary complex between legumain, cystatin C, and papain was demonstrated by gel filtration supported by immunoblotting. Screening of a panel of cystatin superfamily members showed that type 1 inhibitors (cystatins A and B) and low M r kininogen (type 3) did not inhibit pig legumain. Of human type 2 cystatins, cystatin D was non-inhibitory, whereas cystatin E/M and cystatin F displayed strong (K i 0.0016 nM) and relatively weak (K i 10 nM) affinity for legumain, respectively. Sequence alignments and molecular modeling led to the suggestion that a loop located on the opposite side to the papain-binding surface, between the α-helix and the first strand of the main β-pleated sheet of the cystatin structure, could be involved in legumain binding. This was corroborated by analysis of a cystatin C variant with substitution of the Asn39 residue in this loop (N39K-cystatin C); this variant showed a slight reduction in affinity for cathepsin B (K i 1.5 nM) but ≫5,000-fold lower affinity for legumain (K i ≫1,000 nM) than wild-type cystatin C.

The activities of cysteine peptidases of the papain family (C1) such as cathepsins B, H, L, S, and K in and around mammalian cells are regulated by reversible, tight-binding protein inhibitors of the cystatin superfamily (1). The cystatins constitute a superfamily of evolutionarily related proteins that are all composed of at least one 100 -120-residue domain with conserved sequence motifs (2). The single-domain human members of this superfamily are of two major types. The type 1 cystatins (or stefins) A and B contain approximately 100 amino acid residues, lack disulfide bridges, and are synthesized without signal peptides. Cystatins of type 2 are secreted proteins of approximately 120 amino acid residues (M r 13,000 -14,000) and contain at least two characteristic intrachain disulfide bonds. The type 2 cystatins include the human cystatins C, D, S, SN, and SA, which are all products of genes located in the cystatin multigene locus on chromosome 20 (3). Two recently identified type 2 cystatins, cystatin E/M and cystatin F (also called leukocystatin), are also secreted low M r proteins but are more atypical in that they are glycoproteins and show only 30 -35% sequence identity in alignments with the classical type 2 cystatins. They are, however, still functional inhibitors of family C1 cysteine peptidases (4 -7). It has been shown that the cystatin inhibition of cysteine peptidases of the papain family is due to a tripartite wedge-shaped structure with very good complementarity to the active site clefts of such enzymes (8). The three parts of the cystatin polypeptide chain included in the enzyme-binding domain are the N-terminal segment, a central loop-forming segment with the motif Gln-Xaa-Val-Xaa-Gly, and a second C-terminal hairpin loop typically containing a Pro-Trp pair (8 -10).
Legumain (EC 3.4.22.34) is a cysteine endopeptidase that was until recently known only from plants (11,12) and Schistosoma (13). In plants there is abundant evidence that legumain performs a protein-processing function, causing limited proteolysis of precursor proteins and protein splicing (12,14). Following the discovery of the enzyme in mammalian cells, it was cloned and sequenced from human (15) and mouse (16). The amino acid sequences of legumains show that they belong to a distinct family of cysteine endopeptidases (C13). Mammalian legumain is predominantly lysosomal in distribution (16), but its strict specificity for the hydrolysis of bonds on the carboxyl side of asparagine is very different from that of any cathepsin and adapts it particularly for limited proteolysis (17). Human legumain may have an important physiological function as a key enzyme in antigen presentation (18).
It was recently reported that pig legumain is inhibited by human cystatin C and chicken cystatin with K i values below 5 nM (15). This finding was unexpected, since the cystatins are already known as potent inhibitors of the papain-like cysteine peptidases in the unrelated family C1. The legumain family members are believed to have a protein fold quite unlike that of papain, and to be much more closely related to the caspases and gingipain (19). Although the active site cysteine residue could seem to be a common factor, it is known not to be required for the interaction of papain with cystatins (20). The present investigation was undertaken to elucidate the mechanism of inhibition of mammalian legumain by cystatins.
Production of Cystatin C Variants-Cystatin C devoid of the N-terminal 10 residues ((des1-10)-cystatin C) was obtained by incubation of recombinant wild-type human cystatin C with neutrophil elastase and isolated by SEC (28). Cystatin C variants with Gly replacements for one, three, or four of the residues involved in papain interactions, W106G-, (R8G,L9G,V10G)-, and (R8G,L9G,V10G,W106G)-cystatin C, were isolated after mutagenesis in an E. coli expression system (26). Dimeric cystatin C was obtained from wild-type recombinant human cystatin C by incubation for 30 min at 70°C, and purified from trace amounts of monomeric cystatin C by SEC (29).
A cystatin C variant with Lys substitution for residue Asn 39 was obtained by oligonucleotide-directed mutagenesis of the cystatin C cDNA gene in pHD313 (22,30), using a PCR protocol. Taking advantage of a unique PstI recognition site located 24 base pairs downstream from the Asn 39 codon, a downstream primer was designed to introduce a C → A nucleotide substitution in the Asn 39 codon to result in an AAA codon for Lys (5′-CCACCTGCAGCGCGCGGCTGTGGTACATGTCTTTGCT-3′; minus strand, with PstI site and mutation underlined). This oligonucleotide was used together with the upstream vector primer MA206 (26), 5′-GTTTCGCCTGTCTGTTTTGC-3′, (both at 0.4 µM final concentration) to amplify a 360-base pair fragment by PCR using 0.1 ng of pHD313 DNA as template. DNA polymerase and PCR buffer were from the AmpliTaq kit (Perkin-Elmer-Cetus), and the PCR was accomplished by 30 incubation cycles of (95°C, 1 min; 57°C, 1 min; 72°C, 1 min) in a Perkin-Elmer-Cetus 2400 thermocycler. The PCR product was purified (PCR Purification Kit; Genomed, Oeynhausen, Germany), digested with ClaI and PstI (Life Technologies, Inc., Paisley, United Kingdom) and ligated into ClaI/PstI-cut and dephosphorylated pHD313, to generate plasmid pCmut39K. The plasmid was introduced into E. coli MC1061 as described in detail elsewhere (26). That the plasmid was correctly mutated was verified by complete nucleotide sequencing of the cystatin C insert, as described (26). The conditions for culturing and induction of expression in bacteria containing pCmut39K were as described previously for wild-type cystatin C production using pHD313 (22). Periplasmic extracts containing the recombinant cystatin C variant were obtained by cold osmotic shock (31) and directly applied to a Superdex 75 (Amersham Pharmacia Biotech, Uppsala, Sweden) SEC column (1.6 × 100 cm; in 50 mM ammonium bicarbonate buffer, pH 7.8, containing 100 mM NaCl). Fractions containing N39K-cystatin C were identified by agarose gel electrophoresis (32) and pooled. The purified cystatin variant was a homogeneous protein preparation (>95% pure as estimated by SDS-polyacrylamide and agarose gel electrophoreses) with the expected size and charge according to SDS-PAGE and agarose gel electrophoresis.
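The design of the mutagenic primer can be checked with a short script. The sketch below takes the reverse complement of the minus-strand primer quoted above and translates it with the standard genetic code. It assumes that the resulting plus-strand sequence starts in frame at the codon preceding Asn 39; that reading frame is an inference made for this illustration, not something stated in the text, and the helper names are hypothetical.

```python
# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons from position 0; a trailing partial codon is ignored."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# Minus-strand mutagenic primer as quoted in the text (5' to 3').
PRIMER = "CCACCTGCAGCGCGCGGCTGTGGTACATGTCTTTGCT"
PSTI_SITE = "CTGCAG"  # PstI recognition sequence (palindromic)

sense = reverse_complement(PRIMER)
peptide = translate(sense)  # assumed frame: first codon is the one before Asn 39

print(sense)
print(peptide)
print("PstI site present:", PSTI_SITE in PRIMER)
```

Under the stated frame assumption, the second codon of the plus strand is AAA (Lys), followed by Asp and Met, which agrees with the Ser 38 -Asn 39 -Asp 40 segment of cystatin C carrying the intended Asn 39 → Lys change.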
Analysis of Cystatin C Complexes with Pig Legumain and Papain-Concentrations of human cystatin C and papain in solutions were determined by absorbance measurements, using ε 280 values of 11,100 (33) and 58,500 (34) M⁻¹ cm⁻¹, respectively. An ε 280 value for pig kidney legumain of 47,100 M⁻¹ cm⁻¹ was calculated after quantitative amino acid analysis of a highly purified enzyme sample, by standard methods. The total cystatin and enzyme concentrations determined in this way are used in the text below, if not otherwise stated.
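The conversion from absorbance to molar concentration follows the Beer-Lambert relation, c = A 280 / (ε 280 · l). A minimal sketch using the three extinction coefficients quoted above is given here; the 1-cm path length and the example absorbance reading are assumptions chosen only for illustration.

```python
# Molar extinction coefficients at 280 nm (M^-1 cm^-1), as quoted in the text.
EPSILON_280 = {
    "cystatin C": 11_100.0,
    "papain": 58_500.0,
    "pig legumain": 47_100.0,
}


def molar_concentration(a280, protein, path_cm=1.0):
    """Beer-Lambert law: concentration (M) = A280 / (epsilon * path length)."""
    return a280 / (EPSILON_280[protein] * path_cm)


# Hypothetical reading: A280 = 0.111 for a cystatin C solution in a 1-cm cuvette.
conc_uM = molar_concentration(0.111, "cystatin C") * 1e6
print(f"cystatin C: {conc_uM:.1f} uM")
```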
Cystatin C complexes with legumain were formed by incubating cystatin C with active pig legumain (from a 30 µM solution in 50 mM sodium citrate buffer, pH 5.5, containing 0.4 M NaCl, 1 mM EDTA, 0.01% (w/v) CHAPS, and 10 mM cysteine) in SEC buffer (below) for 30 min at room temperature. Mixtures analyzed by SEC (below) contained 186 µM cystatin C and 7.1 µM legumain. Ternary complexes between cystatin C and the two peptidases were typically formed by first incubating cystatin C with Cm-papain in SEC buffer for 30 min, and then adding active pig legumain and further incubating the mixture, at room temperature, for 30-60 min. Mixtures analyzed by SEC contained cystatin C, Cm-papain, and legumain at final concentrations of 81, 16, and 10 µM, or 9, 24, and 14 µM, respectively.
Separation and size estimation of the different enzyme-inhibitor complexes was performed by SEC on a Superdex 75 HR 10/30 column (Amersham Pharmacia Biotech) equilibrated in 50 mM sodium citrate buffer, pH 5.6, containing 150 mM NaCl. The column was operated at a flow rate of 0.5 ml/min using an HPLC system (Waters) equipped with multiple wavelength detector and an integration system (Waters 990). Ovalbumin (M r 43,000), bovine serum albumin (M r 67,000), chymotrypsinogen (M r 23,400), and carbonic anhydrase (M r 30,000) were used for construction of a calibration curve.
The fractions corresponding to each SEC peak were pooled and concentrated by precipitation, by addition of nine volumes of 20% (w/v) trichloroacetic acid. Precipitated proteins in pellets obtained after centrifugation were resuspended in a minimal volume of SDS sample buffer, and analyzed by SDS-PAGE in 16.5% gels using the buffer system described by Schägger and von Jagow (35).
Immunoblotting-To verify the identity of protein bands after SDS-PAGE separation (above), transfer to PVDF membranes (Immobilon-P; Millipore, Bedford, MA) was performed using electrophoresis (Trans-Blot® SD; Bio-Rad). Immunodetection of cystatin C was done exactly as described before (36). The same procedure was followed for Cm-papain detection with polyclonal rabbit anti-papain antibodies (produced by standard immunization procedures, using Cm-papain as antigen). Legumain was detected by use of horseradish peroxidase-conjugated concanavalin A (Sigma; catalog no. L6397) at a final concentration of 5 µg/ml. Antibody- or concanavalin A-detected protein bands were visualized using chemiluminescence (ECL Plus; Amersham Pharmacia Biotech). Band intensities were assessed by densitometric scanning, using a Bio-Rad Imaging Densitometer GS-670 and Molecular Analyst software (Bio-Rad).
Enzyme Inhibition Assays-The methods used for active site titration of papain (with Bz-DL-Arg-NHPhNO 2 as substrate; Bachem Feinchemikalien, Bubendorf, Switzerland) and for titration of the molar papain-inhibitory concentration in cystatin preparations have been reviewed (1). Active inhibitor concentrations determined in this way were used for calculation of K i values, as this is the method traditionally used. However, freshly isolated cystatin C preparations typically display apparent activities of 50-70% if the results from such papain titration assays are compared with total protein concentration determined by A 280 measurement (22,24). The apparently lower inhibitor concentration is likely due to some of the papain molecules being catalytically inactive (possibly due to oxidation of the catalytic Cys residue) but still capable of binding cystatin. The stoichiometry of papain-cystatin C binding is indeed 1:1 viewed by molar total protein concentrations, according to fluorescence titration (33). Therefore, total protein concentrations are used in the text when describing SEC experiments with papain-cystatin C mixtures.
The fluorogenic substrate used for determination of equilibrium constants for dissociation (K i ) of complexes between cystatins and family C1 cysteine peptidases (1) was Z-Phe-Arg-NHMec (10 µM; from Bachem Feinchemikalien) and the assay buffer was 100 mM sodium phosphate buffer (adjusted to pH 6.5 for papain, pH 6.0 for cathepsin B), containing 1 mM dithiothreitol and 1 mM EDTA. Steady state velocities were measured before and after addition of the cystatin variant under study in assays at 37°C, and K i values were calculated according to Henderson (37). Corrections for substrate competition were made using K m values determined for the substrate batch used, under the assay conditions employed (60 and 55 µM for papain and cathepsin B, respectively).
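For a competitive inhibitor, the correction for substrate competition is commonly written as K i = K i(app) / (1 + [S]/K m ). The sketch below applies that relation with the substrate concentration and K m values quoted above. It covers only the correction step, not the Henderson analysis itself, and the apparent K i passed in the usage line is a made-up number used purely to show the call.

```python
def competition_factor(substrate_uM, km_uM):
    """Factor (1 + [S]/Km) by which a competing substrate raises the apparent Ki."""
    return 1.0 + substrate_uM / km_uM


def corrected_ki(ki_apparent, substrate_uM, km_uM):
    """True Ki for a competitive inhibitor, in the same unit as ki_apparent."""
    return ki_apparent / competition_factor(substrate_uM, km_uM)


SUBSTRATE_UM = 10.0  # Z-Phe-Arg-NHMec concentration in the assay
KM_UM = {"papain": 60.0, "cathepsin B": 55.0}  # Km values quoted in the text

for enzyme, km in KM_UM.items():
    print(f"{enzyme}: apparent Ki is divided by {competition_factor(SUBSTRATE_UM, km):.3f}")

# Hypothetical apparent Ki of 0.35 nM measured against cathepsin B.
example_ki_nM = corrected_ki(0.35, SUBSTRATE_UM, KM_UM["cathepsin B"])
```

With 10 µM substrate the correction is modest for both enzymes (below 20%), because the substrate concentration is well under K m.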
Essentially the same procedures were used for legumain inhibition assays. Pig legumain was titrated with a cystatin C solution of known total protein concentration in a microtiter plate format using Bz-Asn-NHPhNO 2 as substrate (Bachem Feinchemikalien), in sodium citrate/phosphate buffer (15), pH 5.8 (39.5 mM citric acid, 121 mM Na 2 HPO 4 ), containing 1 mM dithiothreitol, 1 mM EDTA, and 0.1% (w/v) CHAPS. The same buffer was used for fluorogenic continuous rate assays at 37°C with 10 µM Z-Ala-Ala-Asn-NHMec, prepared as described by Kembhavi et al. (11), as substrate. The legumain concentration used for K i determinations in such assays was 0.1-0.5 nM. Results from such assays with substrate concentration in the range 5-50 µM were used to assess whether the cystatin interaction was competitive with substrate binding, by standard methods. The K m value for legumain hydrolysis of this substrate under the assay conditions, used for corrections of apparent K i values, was 30 µM.
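One standard way to test for competition with substrate is to check that the apparent K i rises linearly with substrate concentration, K i(app) = K i (1 + [S]/K m ), so that the intercept of a straight-line fit gives K i and intercept/slope gives K m. The text does not say which analysis was actually used, so the sketch below is only one possible approach. The data points are synthetic, generated from the relation itself with K i = 0.20 nM and K m = 30 µM, and are not measurements from this study.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def competitive_parameters(substrate_uM, ki_app_nM):
    """Estimate Ki (nM) and Km (uM) from apparent Ki values at several [S]."""
    slope, intercept = fit_line(substrate_uM, ki_app_nM)
    if slope <= 0:
        # A flat or falling line is not what simple competition predicts.
        raise ValueError("apparent Ki does not increase with [S]")
    return intercept, intercept / slope


# Synthetic illustration only: points generated from the model, not measured.
substrate_uM = [5.0, 10.0, 20.0, 50.0]
ki_app_nM = [0.20 * (1.0 + s / 30.0) for s in substrate_uM]

ki_nM, km_uM = competitive_parameters(substrate_uM, ki_app_nM)
print(f"Ki = {ki_nM:.2f} nM, Km = {km_uM:.1f} uM")
```

An apparent K i that stays constant across the 5-50 µM substrate range would instead argue against simple competition, which is why the function rejects a non-positive slope.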
Computer Modeling-Sequence alignments were carried out using programs in the GCG package (38), taking into account the structural alignment (39) (42,43). Graphic illustrations were produced using the program Swiss-PdbViewer (44,45) and then rendered with QuickDraw3D (Apple Computer, Inc.).

Cystatin C Is a Tight-binding Legumain Inhibitor-To clarify whether cystatins are efficient legumain inhibitors or not, the interaction between pig legumain and human cystatin C was initially investigated. Cystatin C was able to completely inhibit legumain activity against Bz-Asn-NHPhNO 2 in a time- and dose-dependent manner. Titration curves drawn from experiments with varying cystatin concentrations in such assays were linear (see Fig. 1 for example). As the concentrations of enzyme and inhibitor in the assay were in the order of 1 µM, this indicated a relatively tight complex between enzyme and inhibitor with a K i value below 10 nM. Using Z-Ala-Ala-Asn-NHMec in different concentrations as substrate in a continuous-rate legumain assay at lower enzyme concentration (0.1-0.5 nM), it was observed that the cystatin C interaction with the enzyme was reversible and competitive with substrate binding (results not shown). Corrected for substrate competition, the K i value for the cystatin C-legumain complex was 0.20 nM.
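Why a linear titration curve points to tight binding can be illustrated with the Morrison equation for tight-binding inhibition, which gives the residual activity when enzyme and inhibitor concentrations are comparable to or larger than K i. This equation is brought in here for illustration and is not the analysis named in the text. Substrate competition is ignored, and the concentrations in the example are assumptions in the order of those described above.

```python
import math


def fractional_activity(enzyme, inhibitor, ki):
    """Morrison equation: residual activity v_i/v_0 (all arguments in the same unit)."""
    total = enzyme + inhibitor + ki
    bound = (total - math.sqrt(total * total - 4.0 * enzyme * inhibitor)) / 2.0
    return 1.0 - bound / enzyme


ENZYME_UM = 1.0  # assumed active enzyme concentration, about 1 uM as in the titration assay

for ki_uM in (0.0002, 0.01, 1.0):  # 0.2 nM, 10 nM, and 1 uM
    curve = [fractional_activity(ENZYME_UM, i / 4.0, ki_uM) for i in range(5)]
    print(f"Ki = {ki_uM} uM:", ", ".join(f"{v:.3f}" for v in curve))
```

With K i far below the enzyme concentration, activity falls almost in proportion to added inhibitor and reaches zero near the 1:1 equivalence point. With K i close to the enzyme concentration, the curve bends and substantial activity remains at equimolar inhibitor.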
Studies by SEC gave evidence for a 1:1 interaction between inhibitor and enzyme, with a faster eluting peak (9.0 ml) appearing upon mixing of the proteins, corresponding to an M r of 46,300 (Fig. 2A). This size agrees very well with a theoretical M r value of 46,000-47,000 calculated as the sum of the M r for cystatin C (13,343, from the amino acid sequence of recombinant cystatin C; Ref. 30) and the M r for pig legumain according to SDS-PAGE of the glycosylated native enzyme (33,600 and 33,100 estimated from the SDS-polyacrylamide gels in Figs. 3C and 2B, respectively). The retention volumes of 9.65 and 12.4 ml for the individual components on the calibrated SEC column (Fig. 2A, arrows) equaled M r values of 35,300 and 13,400 for legumain and cystatin C, respectively. Analysis of the peak at 9.0 ml by SDS-PAGE agreed with the expected protein staining for legumain and cystatin C if they were present in equimolar amounts in a complex (Fig. 2B). In addition, the dissociated complex in the SDS gel demonstrated a cystatin C band size identical to that of the native inhibitor with no signs of degradation products, indicating that the inhibitor is not cleaved as a result of the enzyme interaction.
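The size estimates above rest on a calibration in which log10(M r ) is treated as approximately linear in elution volume, a common convention for SEC. The elution volumes of the protein standards are not given, so the sketch below instead fits the line through the three volume/M r pairs reported in this paragraph. It is therefore a consistency check under that assumption, not a reconstruction of the actual calibration curve.

```python
import math

# (elution volume in ml, apparent Mr) pairs reported in the text.
REPORTED = [(9.0, 46_300), (9.65, 35_300), (12.4, 13_400)]


def fit_log_linear(points):
    """Least-squares fit of log10(Mr) = slope * volume + intercept."""
    xs = [v for v, _ in points]
    ys = [math.log10(m) for _, m in points]
    n = len(points)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def apparent_mr(volume_ml, slope, intercept):
    """Apparent Mr predicted by the fitted line at a given elution volume."""
    return 10.0 ** (slope * volume_ml + intercept)


slope, intercept = fit_log_linear(REPORTED)

# Theoretical Mr of a 1:1 complex: cystatin C plus legumain (two SDS-PAGE estimates).
CYSTATIN_C_MR = 13_343
theoretical = [CYSTATIN_C_MR + legumain for legumain in (33_100, 33_600)]

print(f"fitted Mr at 9.0 ml: {apparent_mr(9.0, slope, intercept):,.0f}")
print(f"theoretical 1:1 complex: {theoretical[0]:,} to {theoretical[1]:,}")
```

The summed values, 46,443 and 46,943, fall within the theoretical range of 46,000-47,000 quoted above, and the fitted line places the 9.0-ml peak close to the same size.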
The Papain-inhibitory Site on Cystatin C Is Not Responsible for Legumain Inhibition-To elucidate which parts of the cystatin structure are involved in binding and inhibition of mammalian legumain, the interactions of cystatin C variants with alterations in the N-terminal region and the second hairpin loop were studied (Table I). The (des1-10)-cystatin C variant, devoid of the N-terminal decapeptide as a result of neutrophil elastase cleavage and with seriously compromised affinities for cathepsins B, H, and L (28), showed the same affinity for legumain as native cystatin C. Three cystatin C variants with Gly replacements for up to four critical residues with side chains participating in the high affinity binding between the inhibitor and papain-like cysteine peptidases (26) also displayed virtually unaltered affinity for legumain. Additionally, dimerized cystatin C, which has lost papain-inhibitory activity completely and whose formation has been shown to be due to intermolecular binding via the papain-inhibitory reactive site regions of two monomer units (46,47), was essentially as efficient as the monomeric cystatin in the inhibition of legumain. Thus, the cystatin C surface responsible for the inhibition of papain-like enzymes seemed not to be involved in legumain binding.
The strict substrate specificity of legumain, with a requirement for an Asn residue in the P 1 position, allowed studies of the papain-cystatin C interaction in the presence of legumain by use of Bz-Arg-NHPhNO 2 as substrate (Fig. 1A). It was observed that the dose-dependent inhibition of papain by dilutions of a cystatin C solution was virtually unaffected when a portion of the same cystatin solution had been preincubated with an approximately equimolar amount of legumain (under conditions favoring stoichiometric interactions between enzyme and inhibitor). Essentially identical cystatin C-enzyme mixtures could be analyzed for the presence of legumain-inhibitory sites, as papain showed very slow hydrolysis of the legumain substrate, Bz-Asn-NHPhNO 2 (Fig. 1B). The dose-dependent inhibition of legumain by dilutions of the cystatin C solution was largely unaffected when a portion of the same cystatin solution was preincubated with papain (at the highest concentration possible given the concentrations of the stock solutions used). In the fluorogenic legumain assay with Z-Ala-Ala-Asn-NHMec as substrate, cystatin C preincubated with papain in 1:1 and 1:10 molar ratios displayed K i values of 0.26 and 0.33 nM, respectively, i.e. very similar to cystatin C alone (0.20 nM).
Demonstration of a Ternary Complex-Taken together, the results described above left little doubt that cystatin C inhibits legumain by a site that is distinct from that inhibiting papain and related peptidases. Consistent with this, the kinetic experiments indicated that the cystatin could simultaneously bind both legumain and papain, despite the small size of the cystatin molecule (M r 13,343) and the 2- and 3-fold larger papain and legumain molecules, respectively. Attempts were therefore made to detect a ternary complex between the cystatin and the two peptidases. For experiments with excess papain, Cm-papain was used, to eliminate the risk of digestion of legumain by the papain. It was found that when a molar excess of cystatin C was incubated with Cm-papain and then with legumain, at final molar ratios of cystatin C:Cm-papain:legumain of 24:5:3, SEC analysis showed three peaks (Fig. 3A). Under these conditions, with a 5-fold excess of cystatin C over Cm-papain, most Cm-papain should become bound to the inhibitor and the major fraction of cystatin C would be free when legumain was added to the mixture. According to SDS-PAGE of the peak fractions (Fig. 3C), peaks III and II corresponded to free cystatin C and a Cm-papain-cystatin C complex, respectively. Peak I contained legumain and cystatin C in a bimolecular complex (cf. Fig. 2A). Next, cystatin C was incubated with Cm-papain and legumain in molar ratios of 2:5:3, to ascertain that most of the cystatin present in the mixture was bound to Cm-papain at the time of legumain addition. After incubation with legumain, SEC analysis of the mixture (Fig. 3B) demonstrated a new peak eluting at 8.7 ml (peak IV), earlier than that corresponding to the bimolecular legumain-cystatin C complex, which corresponded to an M r of approximately 53,000. The expected elution volume on the calibrated SEC column for a three-member complex with a theoretical M r of 70,000 is 8 ml. 
SEC of Cm-papain alone or Cm-papain complexed with cystatin C resulted in anomalously low apparent M r values, a phenomenon that has been seen repeatedly in the past.² The anomalous behavior of papain in SEC may well explain the small apparent size of a ternary complex in our SEC experiment. However, it was clear from SDS-PAGE analysis of the SEC peak IV at 8.7 ml (Fig. 3C) that it contained proteins with estimated M r values of 33,600, 22,800, and 16,300, agreeing well with pig legumain, Cm-papain, and cystatin C, respectively. The protein staining of the three bands showed intensities consistent with a molar 1:1:1 ratio. Immunodetection of all three proteins after transfer to a PVDF membrane was performed on each fraction of the SEC shown in Fig. 3B. All three proteins were detected in the fractions included in peak IV, but were undetectable in the same fractions from control SEC experiments with cystatin C, legumain or Cm-papain alone (result not shown).
Legumain-inhibitory Activity of Cystatin Superfamily Members-A panel of different cystatins was tested for inhibitory activity against pig kidney legumain. The results (Table II) showed that the two type 1 inhibitors analyzed, human cystatin A and bovine cystatin B, had no significant inhibitory activity for pig legumain. Similarly, the type 3 inhibitor human L-kininogen, containing two cystatin repeats with inhibitory activity against papain-like cysteine peptidases, did not show any legumain-inhibitory activity. Of human type 2 cystatins, cystatin D was also non-inhibitory. Cystatin E/M demonstrated tight-binding affinity for pig legumain, with a K i value almost 100-fold lower than that of cystatin C, whereas cystatin F showed a significant but lower affinity for the enzyme, 50-fold lower than that of cystatin C.
Possible Location of the Legumain Inhibitory Site on Cystatins-An alignment of the amino acid sequences for the different cystatins (Fig. 4) was inspected for similarities between the proteins showing activity against legumain. Of those segments known to be present on the surfaces of the cystatin molecules (8,10,40,41,47), the loop starting at Asn 39 , following the conserved α-helix, seemed like a good candidate for inclusion in a legumain-binding surface. This was because: 1) it is located on the opposite side to the papain-inhibitory surface (Fig. 5), which would allow the simultaneous binding of legumain and papain observed for cystatin C; and 2) it contains residues that are conserved among the cystatins with legumain-inhibitory activity, including an Asn residue, which could be directly involved in interactions with the substrate specificity pocket of legumain, given the strict specificity of legumain for asparaginyl bonds (17). To test the possibility that the Asn 39 residue is intimately involved in the legumain-inhibitory site, this residue was mutated in cystatin C. The substitution selected was Asn 39 → Lys, guided by the notion that the non-inhibitory cystatin D, as well as cystatin B, has a positively charged residue in the loop segment. The N39K cystatin C variant was produced by E. coli expression and purified to homogeneity. The variant showed a 5-fold decreased affinity for cathepsin B as compared with wild-type cystatin C (Table I), but total loss of inhibitory activity for pig legumain (K i ≫1,000 nM, equaling ≫5,000-fold lower legumain affinity than wild-type cystatin C). This strongly indicated that the loop between the α-helix and the first strand of the main β-pleated sheet of the cystatin structure, and its Asn 39 residue, is part of a novel second reactive site of some cystatins. 
² A. J. Barrett, unpublished observation.

TABLE I Inhibition of legumain by cystatin C variants
Equilibrium constants for dissociation of pig legumain and human cathepsin B complexes with cystatin C variants were determined by steady-state kinetics in continuous rate assays at 37°C with Z-Ala-Ala-Asn-NHMec and Z-Phe-Arg-NHMec, respectively, as substrate. The standard deviation (S.D.) and number of measurements (n) used to calculate the mean K i values given are indicated. The K i values were corrected for substrate competition as described under "Experimental Procedures."
Wild type 0.20 ± 0.014

TABLE II ... 1, 2, and 3
Equilibrium constants for dissociation of pig legumain complexes with recombinant bovine cystatin B, recombinant human cystatins A, C, D, E/M, and F, and low M r kininogen purified from human blood plasma were determined by steady-state kinetics in continuous rate assays at 37°C with Z-Ala-Ala-Asn-NHMec as substrate. The standard deviation (S.D.) and number of measurements (n) used to calculate the mean K i values given are indicated. The K i values were corrected for substrate competition as described under "Experimental Procedures." v 0 , rate of substrate hydrolysis in absence of inhibitor; v i , rate of substrate hydrolysis in presence of inhibitor. The quality of the cystatin preparations used was checked by determination of K i values for their interaction with papain or cathepsin B (published values are reviewed in Refs. 1 and 5).
Column headings: Legumain; Cystatin activity control

FIG. 4. The alignment was done based on secondary structure elements from the NMR models of cystatin A and chicken cystatin (with structural elements indicated above the sequences of type 1 and type 2 cystatins, respectively) and the x-ray model of chicken cystatin (not represented in the figure) (8, 39-41). Arrows represent β-strands (yellow and blue for type 1 and 2, respectively) and red cylinders represent α-helices. The region possibly involved in a legumain-inhibitory site (the back-side loop) is shaded in yellow. This loop-forming segment is magnified below; residues with similar chemical properties present in the three cystatins showing legumain-inhibitory activity (cf. Table II) but not in the others, and thus possibly constituting a consensus sequence for legumain binding, are boxed. Residue Asn 39 is shown in orange for cystatin A and in light blue in cystatin C, in accordance with the colors used in Fig. 6. The numbering refers to the cystatin C sequence as deduced from its cDNA, starting from the first residue of the mature protein (30,48). For the other cystatins, the naturally occurring forms with longest N-terminal segments are shown (4, 5, 21, 49-51). Dots indicate gaps introduced to optimize the alignment.

DISCUSSION
The aim of the present investigation was to study the mechanism of inhibition of mammalian legumain by cystatins, to clarify how the inhibitor structure can result in tight-binding inhibition of enzymes belonging to two entirely different enzyme families, namely the papain family (C1) and the legumain family (C13). The different arrangements of catalytic residues and different active site motifs show that the two families are evolutionarily unrelated, and that their peptidases have different protein folds (19). Moreover, legumain is not inhibited by the general inhibitor of enzymes belonging to family C1, E-64 (15), which supports the theory that the general topography of the active site clefts of legumains is probably entirely different from that of family C1 enzymes. Our present SEC, electrophoretic and enzyme kinetic results show that cystatin C can inhibit mammalian legumain, as cystatins inhibit family C1 enzymes (52), by high affinity reversible binding (K i 0.20 nM), in a bimolecular reaction that is competitive with substrate, and with no detectable cleavage of the cystatin in the legumain complex. Despite these similarities, our present results demonstrate that the mechanisms of inhibition of legumain and family C1 endopeptidases must be completely different.
From structural studies of several cystatins (8-10), it is well known that the N-terminal segment together with the "first and second hairpin loops" in cystatins are responsible for the inhibition of the C1 enzymes (Fig. 5). Consequently, removal of the N-terminal segment or substitution of any of the conserved amino acids in the N-terminal segment or the hairpin loops by Gly/Ala residues abolishes or seriously affects the inhibition of papain-like enzymes (26,53). Of four such variants analyzed in the present study, all displayed virtually unaltered binding of legumain. Dimeric cystatin C, which is completely inactive against papain-like enzymes and by NMR studies has been shown to be a result of intermolecular interactions between the papain-binding surfaces of two cystatin C molecules (47), still showed legumain inhibition. We believe that this, together with the enzyme kinetic results presented and the direct demonstration of a ternary complex by SEC, shows that the binding sites for papain and legumain on cystatin C are most likely completely independent of each other.
Where then is the legumain reactive site? Our investigation of a set of other mammalian cystatin superfamily members indicated that the capacity to inhibit legumain is a property of only some cystatins (Table II). Guided by this result and amino acid sequence comparisons, we propose that the side of cystatins directly opposite to the papain-binding surface is responsible for the legumain binding and inhibition. The loop segment connecting the main α-helix of the cystatin structure to the first long β-strand contains a conserved Asn residue (residue 39 in cystatin C) and seems quite conserved in sequence in those cystatins that show inhibitory activity: cystatins C, E/M, and F. The importance of the Asn 39 residue was confirmed by construction of the N39K cystatin C variant, which was seen to lack legumain inhibitory activity. A correctly positioned Asn residue on the cystatin surface could possibly result in an initial substrate-like interaction between the inhibitor and legumain. Besides the requirement for an Asn residue in the P 1 position, legumains have no clear preferences for residues in other subsites (17,54). There are therefore few obvious structural possibilities for specific legumain inhibition besides interaction with the S 1 pocket. Still, assuming that the "back-side loop" containing Asn 39 interacts with the enzyme in a mode resembling substrate binding, it appeared from the inhibition data obtained that a loop segment preferentially containing polar amino acids is compatible with legumain interaction. The consensus sequence found in the three inhibitory cystatins is Ser(Thr)-Asn 39 -Asp(Ser)-Met(Ile). Strikingly, a Ser 38 -Asn 39 -Asp 40 sequence is completely conserved in mouse, rat, and bovine cystatin C, as well as in chicken cystatin, which also inhibits pig legumain (15). The positively charged Lys residue in this segment, present in the non-inhibitory cystatins B and D, may be unfavorable for inhibition.
The suggested binding loop must be able to adopt a conformation that allows legumain interaction, but at the same time does not expose the Asn39-Xaa bond to cleavage. A different back-side loop conformation may be one reason why the type 1 cystatins studied, with a loop sequence largely containing the proposed consensus sequence for legumain inhibition, although two residues shorter, do not show inhibitory activity (Fig. 6). In the case of the inhibitory cystatins, the loop might be partially restrained in cystatin F as Cys37 likely is involved in a disulfide bridge (5), which can explain why cystatin F is a poorer inhibitor than cystatins C and E/M. The size and conformation of the loop could also be one reason why cystatin D does not inhibit legumain, because of an amino acid insertion in this loop (Fig. 4). For the type 3 cystatin studied, human L-kininogen, the lack of legumain-inhibitory activity may be due to steric reasons, as both legumain and kininogen are bulky molecules. Two of the three cystatin domains of kininogens are clearly able to inhibit papain-like peptidases (55), which demonstrates that the papain-binding surfaces of these domains are exposed and accessible to protein interactions. Whether the kininogen structure is sufficiently flexible to also allow exposure of the back-side loops on the opposite sides of these domains is presently unclear, as a three-dimensional model for type 3 cystatins is unfortunately not yet available. For the individual kininogen domains, the sequence requirements for a legumain-binding back-side loop suggested above seem to be fulfilled for domain 3, but not for domain 2 of human kininogen. Clearly, more studies are needed to clarify whether some variants of low or high M r kininogens, resulting from proteolytic cleavages to release the kinin portion or individual cystatin domains of the protein, display legumain-inhibitory activity.
Although our initial studies indicate that the back-side loop around Asn39 is important for the ability of some cystatins to efficiently inhibit legumain, other cystatin segments may also be involved in interactions with the enzyme, just as several segments are involved in the cystatin inhibition of papain. The very flexible loop between the second and third of the four main β-strands of the cystatin structure, from Thr74 to Asn82 (which is not present in type 1 cystatins), may prove essential to stabilize the enzyme-inhibitor interaction, given its close proximity to the Asn39 loop (Fig. 5). Interestingly, this segment contains a five-residue insertion in the most efficient legumain inhibitor we identified, cystatin E/M (4,6). That this loop contains the primary binding site for legumain seems quite unlikely, however, as the loop sequence is relatively conserved between human cystatins C and D (Fig. 4), of which only cystatin C shows legumain-inhibitory activity.
In conclusion, our present results strongly indicate that the loop between the α-helix and the first strand of the main β-pleated sheet of the cystatin structure, with its Asn39 residue, is part of a novel second reactive site of some cystatins. Cystatins carrying this site are sufficiently potent to be physiological inhibitors of mammalian legumain. Since legumain-like activity has very recently been shown to be crucial for cellular presentation of certain antigens to the immune system, but no efficient inhibitors of this activity are presently known (18), continued studies to elucidate and explore the mechanism of legumain inhibition by the novel cystatin site may prove valuable.
FIG. 6. Structural alignment of cystatins. A three-dimensional alignment of cystatin A and chicken cystatin, zoomed in on the back-side loop region close to the cystatin α-helix, is shown in two different orientations. The loop and part of the first long β-strand of cystatin A (41) are shown in yellow and the corresponding segment of chicken cystatin (40) in blue. A, side view with the orientation of the chains from the N-terminal ends of the α-helices in the upper left corner. The alignment demonstrates that not only Asn39 (in light blue) but the entire loop is more exposed at the molecule surface in chicken cystatin when compared with the corresponding residue in cystatin A (orange), which is placed in a considerably shorter loop. B, view along the α-helices, from their C-terminal ends. The residues corresponding to Asn39 in cystatin C are shifted by 5.2 Å from each other, probably as a result of the kink in the third turn of the α-helix in cystatin A.