Structural Basis for Inhibition of Cathepsin B Drug Target from the Human Blood Fluke, Schistosoma mansoni*

Schistosomiasis caused by a parasitic blood fluke of the genus Schistosoma afflicts over 200 million people worldwide. Schistosoma mansoni cathepsin B1 (SmCB1) is a gut-associated peptidase that digests host blood proteins as a source of nutrients. It is under investigation as a drug target. To further this goal, we report three crystal structures of SmCB1 complexed with peptidomimetic inhibitors as follows: the epoxide CA074 at 1.3 Å resolution and the vinyl sulfones K11017 and K11777 at 1.8 and 2.5 Å resolutions, respectively. Interactions of the inhibitors with the subsites of the active-site cleft were evaluated by quantum chemical calculations. These data and inhibition profiling with a panel of vinyl sulfone derivatives identify key binding interactions and provide insight into the specificity of SmCB1 inhibition. Furthermore, hydrolysis profiling of SmCB1 using synthetic peptides and the natural substrate hemoglobin revealed that carboxydipeptidase activity predominates over endopeptidolysis, thereby demonstrating the contribution of the occluding loop that restricts access to the active-site cleft. Critically, the severity of phenotypes induced in the parasite by vinyl sulfone inhibitors correlated with enzyme inhibition, providing support that SmCB1 is a valuable drug target. The present structure and inhibitor interaction data provide a footing for the rational design of anti-schistosomal inhibitors.

Schistosomiasis (bilharzia) is a chronic infectious disease caused by trematode blood flukes that infect over 200 million people in tropical and subtropical areas (1). Of the five species of schistosomes infecting humans, Schistosoma mansoni is a major etiological agent of disease in parts of Asia, the Middle East, Africa, and South America. Morbidity associated with the disease arises from immunopathological reactions to parasite eggs that accumulate in various tissues, including the liver, intestinal tract, and bladder (2). Treatment and control of schistosomiasis now relies on just one drug, praziquantel, a perilous situation should drug resistance emerge and become established (1,3). Accordingly, there is continued impetus to identify new schistosomal protein targets and chemotherapeutically active anti-schistosomals (4-6).
Adult schistosomes live in the cardiovascular system, and host blood proteins are a nutritive source for growth, development, and reproduction. In the schistosome gut, a network of peptidases (proteases) contributes to the digestion of host proteins, predominantly hemoglobin, to absorbable peptides and amino acids (7,8). For S. mansoni, the component digestive peptidases thus far characterized include the following: (i) cysteine peptidases of the Clan CA (papain family), namely cathepsin B1, cathepsins L1-L3, dipeptidyl peptidase I (cathepsin C), and a Clan CD asparaginyl endopeptidase (legumain); (ii) the Clan AA aspartic peptidase, cathepsin D; and (iii) the Clan MF metallopeptidase, leucine aminopeptidase (7-11). This study focuses on S. mansoni cathepsin B1 (SmCB1), which is the most abundant cysteine peptidase in the parasite gut (12,13) and is necessary for normal parasite growth (14). SmCB1 is synthesized as an inactive zymogen and is converted in vitro to a mature, active 31-kDa enzyme by proteolytic removal of the pro-peptide that can be catalyzed by legumain (12). SmCB1 is a molecular target for cure of schistosomiasis mansoni in a mouse model using the vinyl sulfone cysteine peptidase inhibitor K11777 (15). Inhibition of SmCB1 therefore represents an attractive option for anti-schistosomal drug development; however, target-based rational design of lead compounds has been hampered by a lack of structural information for the enzyme.
Recently, we designed reversible inhibitors of SmCB1 based on the pro-peptide scaffold. These were effective in vitro in the low micromolar range (16). Here, we identify covalent nanomolar inhibitors of SmCB1 and analyze their binding mode by structural analysis. The inhibitors include the following: (i) epoxide inhibitor CA074, a specific inhibitor of cathepsin B-type peptidases (17) that has been previously structurally characterized in complex with mammalian cathepsins B (18), and (ii) vinyl sulfone inhibitors K11017 and K11777 that have not been crystallographically studied so far in complex with cathepsins B. Vinyl sulfones are effective against papain-like cysteine peptidases and were originally investigated in the context of inhibiting human cathepsins (19,20). Later, they were demonstrated to inhibit cysteine peptidases from a variety of protozoan pathogens such as Trypanosoma and Plasmodium, and provide either parasitological cure or a temporary remission of parasitemia (21-23). As a chemotype, vinyl sulfones have acceptable pharmacokinetic attributes and in vivo safety profiles (24,25). Currently, K11777 is in pre-clinical development as an anti-chagasic compound (26).
Here, we report the crystallographic structure of SmCB1, the first for a schistosomal proteolytic enzyme. A comprehensive analysis of structure-activity/inhibition relationships is provided to describe the active site of SmCB1. We demonstrate that SmCB1 is an efficient exopeptidase/endopeptidase against both synthetic peptide substrates and the physiologically relevant protein substrate, hemoglobin. Also, inhibition of SmCB1 by various vinyl sulfone inhibitors correlates with the severity of phenotypes induced in the parasite in culture. This study therefore provides both evidence that SmCB1 is a relevant drug target and the necessary structure-ligand data with which the design of anti-schistosomal SmCB1 inhibitors can be continued.

Recombinant Expression and Purification of SmCB1
The nonglycosylated SmCB1 zymogen was expressed in the X33 strain of the methylotrophic yeast Pichia pastoris, purified and activated by S. mansoni legumain (27), as described previously (16). All purification steps were performed under reducing conditions in the presence of 3.5 mM β-mercaptoethanol and 1 mM EDTA to protect the active site cysteine from oxidation. The expressed nonglycosylated SmCB1 exhibited activity properties analogous to those of the wild-type SmCB1 produced in the Pichia expression system (12). The nonglycosylated SmCB1 was used in all experiments described here.

Preparation of SmCB1-Inhibitor Complexes
The freshly activated SmCB1 was incubated (10 h, 18°C) with a 5-fold molar excess of the inhibitor in 0.1 M sodium acetate, pH 5.5, containing 15 mM cysteine and 1 mM EDTA. The enzyme inhibition was monitored with Cbz-Phe-Arg-AMC substrate. The complex was rechromatographed on an FPLC Mono S column (16), concentrated, and buffer-exchanged into 2.5 mM sodium acetate, pH 5.5, using Amicon Ultracel-10k centrifugal units (Millipore).

Protein Crystallization and Data Collection
Crystals were obtained by vapor diffusion in hanging drops. Drops consisting of 1 μl of the protein solution and 1 μl of the reservoir solution were equilibrated over 1 ml of reservoir solution at 20°C. The reservoir solutions were 0.2 M ammonium acetate, 0.1 M sodium citrate, 30% PEG 1500, pH 6.2, for the SmCB1-K11777 and SmCB1-K11017 complexes, and 0.1 M sodium citrate, 0.2 M ammonium acetate, 30% PEG 1500, pH 6.1, for the SmCB1-CA074 complex. Protein concentrations of the stock solutions of the complexes were 2.5-5 mg/ml (in 2.5 mM sodium acetate, pH 5.5). Rectangle-shaped crystals reached their final size within 10 days and were flash-cooled by plunging into liquid nitrogen. Diffraction data at 100 K were collected at beamline 19-BM of the Structural Biology Center at the Advanced Photon Source, Argonne National Laboratory, Argonne, IL. All diffraction data were processed using the HKL-3000 suite of programs (35). Crystal parameters and data collection statistics are given in supplemental Table S1.

Structure Determination, Refinement, and Analysis
The structure of SmCB1 was determined by molecular replacement with the program Molrep (36) using the structure of human cathepsin B (Protein Data Bank code 1HUC) (37) as the search model. The sequence alignment of SmCB1 with human cathepsin B is shown in supplemental Fig. S1. Model refinement was carried out using the program REFMAC 5.2 (38) from the CCP4 package (39), interspersed with manual adjustments using Coot (40). The final steps included TLS refinement (41). The quality of the final models was validated with Molprobity (42). Final refinement statistics are given in supplemental Table S1. Atomic coordinates and experimental structure factors have been deposited with the Protein Data Bank with the codes 3QSD, 3S3R and 3S3Q for SmCB1-CA074, SmCB1-K11777, and SmCB1-K11017, respectively. The following services were used to analyze the structures: PISA server (43) and CONTACTS (39). All figures showing structural representations were prepared with the program PyMOL (44).

Interaction Energy Calculations
The subsite interaction energy between inhibitors and SmCB1 was calculated using the quantum chemical approach. The strategy consisted of the following two steps: optimization of the crystallographic complexes and calculation of interaction energies in the individual subsites.
Model Setup and Geometry Optimization-Hydrogen atoms were added to the crystallographic complexes of SmCB1 and inhibitors to correspond to pH of ~6 and were optimized using the AMBER 10 software (45). Further optimization of the inhibitor in the active site (residues within 6 Å) was carried out using the corrected semi-empirical quantum chemical method PM6-DH2 (46,47), including implicit water environment. The alternative conformation of VSPh in the P1′ position of inhibitors was modeled using PyMOL (44).
Subsite Interaction Energies-The inhibitor structures were fragmented into the P3 to P2′ segments, with separated side chains and main chains, and capped by hydrogens. The reactive centers of the inhibitors originating from the vinyl and epoxide moieties (located between P1 and P1′) as well as the catalytic Cys-100 that form the covalent linkage were not calculated. The subsite interaction energies were obtained as the difference between the energy of the fragment noncovalently bonded to the enzyme and the sum of energies of the enzyme and the inhibitor fragment calculated separately. The PM6-DH2 quantum chemical calculations in implicit water were applied.
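The definition of the subsite interaction energy given above can be written compactly as follows. The symbols are introduced here only for illustration and are not taken from the original text: f denotes a capped inhibitor fragment, and each E is an energy evaluated at the PM6-DH2 level in implicit water.

```latex
\Delta E_{\mathrm{int}}(f) \;=\; E(\mathrm{enzyme}\cdots f) \;-\; \bigl[\, E(\mathrm{enzyme}) + E(f) \,\bigr]
```

With this sign convention, a negative value indicates a favorable (stabilizing) contribution of the fragment in its subsite, and a positive value an unfavorable one.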

SmCB1 Activity and Inhibition Assays
Activity measurements were performed in a microplate format (100-μl assay volume) at 37°C. The reaction mixture contained enzyme (0.6 nM) and fluorogenic substrate Cbz-Phe-Arg-AMC (20 μM) in 0.1 M sodium acetate, pH 5.5, containing 2.5 mM dithiothreitol and 0.1% PEG 1500 (16). The kinetics of product release was continuously monitored in an Infinite M200 microplate reader (Tecan) at excitation and emission wavelengths of 360 and 465 nm, respectively. For inhibition measurements, the enzyme was preincubated with inhibitor (0-100 μM) for 5 min followed by the addition of substrate. The IC50 values were determined by nonlinear regression using GraFit software. The SmCB1 activity assay with FRET substrates was performed analogously, and the kinetics of product release was continuously monitored at excitation and emission wavelengths of 320 and 420 nm, respectively. The screening of libraries of FRET carboxydipeptidase substrates was performed with 40 μM substrates in the reaction mixture. The Michaelis-Menten kinetic parameters (supplemental Table S5) were determined by measuring the rate of hydrolysis in the substrate concentration range of 0-200 μM, and Km and kcat values were obtained by nonlinear regression using GraFit software. In all assay systems, the final concentration of DMSO did not exceed 1.5%. Each measurement was performed in triplicate. The concentration of SmCB1 was determined by active site titration with E-64; the peptide solutions were quantified by amino acid analysis.
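The nonlinear regression step can be illustrated with a minimal sketch. It is not the GraFit procedure used in this study; it is a plain least-squares fit of the Michaelis-Menten equation v = Vmax·[S]/(Km + [S]), written with the Python standard library only. The function name, the search bounds for Km, and the data in the usage note are hypothetical. For any trial Km the best Vmax follows from linear least squares, so only Km is searched (golden-section search on log Km).

```python
import math


def fit_michaelis_menten(s, v, km_lo=1e-3, km_hi=1e4, tol=1e-9):
    """Least-squares fit of v = Vmax * S / (Km + S).

    s, v : substrate concentrations and initial rates (same length).
    km_lo, km_hi : assumed search bounds for Km (same units as s).
    Returns (Km, Vmax).
    """
    def vmax_for(km):
        # For fixed Km the model is linear in Vmax: closed-form optimum.
        f = [si / (km + si) for si in s]
        return sum(vi * fi for vi, fi in zip(v, f)) / sum(fi * fi for fi in f)

    def sse(log_km):
        # Sum of squared residuals at the best Vmax for this Km.
        km = math.exp(log_km)
        vmax = vmax_for(km)
        return sum((vi - vmax * si / (km + si)) ** 2 for si, vi in zip(s, v))

    # Golden-section search for the minimum of sse over log(Km).
    a, b = math.log(km_lo), math.log(km_hi)
    g = (math.sqrt(5) - 1) / 2
    c, d = b - g * (b - a), a + g * (b - a)
    while b - a > tol:
        if sse(c) < sse(d):
            b = d
        else:
            a = c
        c, d = b - g * (b - a), a + g * (b - a)
    km = math.exp((a + b) / 2)
    return km, vmax_for(km)
```

As a usage example, rates simulated without noise for substrate concentrations between 5 and 200 μM are fitted back to the Km and Vmax used to generate them; kcat would then be obtained as Vmax divided by the active enzyme concentration from the E-64 titration.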

Hemoglobin Degradation
Digestion of human hemoglobin (Sigma, H7379) with SmCB1 was performed as described previously (48). Briefly, hemoglobin (10 μg) was incubated with SmCB1 (0.25 μg) in 25 mM sodium acetate, pH 3-6, including 2.5 mM DTT in a total volume of 35 μl for 1-4 h at 37°C. Aliquots of the digest were subjected to derivatization with fluorescamine to quantify the newly formed N-terminal ends (49). The fluorescence signal was measured using an Infinite M200 microplate reader (Tecan) at 370 nm excitation and 485 nm emission wavelengths. All measurements were performed in triplicate. For SDS-PAGE visualization, hemoglobin digests were separated in Tricine gels (16% T, 6% C) containing 6 M urea (50). For RP-HPLC analysis, hemoglobin (0.15 mg) was incubated with SmCB1 (1.25 μg) in 50 mM sodium acetate, pH 4.5, containing 2.5 mM DTT in a total volume of 200 μl for 0-15 h at 37°C. The reaction mixture was treated with 10 μl of 10% TFA and separated by RP-HPLC on a C4 Vydac column (Vydac) equilibrated in 0.1% (v/v) TFA and eluted with a 1%/min gradient of a 99% (v/v) acetonitrile solution in 0.1% (v/v) TFA. The collected peak fractions were analyzed by FT-MS using an LTQ Orbitrap XL mass spectrometer (Thermo).

Parasite Assay and Phenotype Scoring
Newly transformed schistosomula (NTS) of S. mansoni were prepared from infective larvae (cercariae) as described previously (5) and incubated in the presence of protease inhibitors. Briefly, the assay was performed in a 96-well microplate format (200-μl assay volume) with 200-300 NTS in Basch Medium 169 (51) containing 5% FBS, 100 units/ml penicillin, and 100 μg/ml streptomycin (52). Final concentrations of 1 or 10 μM inhibitors in 0.5% DMSO were added and incubations continued for 3 days at 5% CO2 and 37°C. Phenotypes that arise as a function of time and concentration were graded as follows: grade I, dead NTS by 2 days of culture at 10 μM and dying/dead NTS by 3 days at 1 μM; grade II, dead NTS by 3 days at 10 μM and round/dark/dying by 3 days at 1 μM; grade III, round/dark/dying by 3 days at 1 and 10 μM concentrations.
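The grading scheme above can be expressed as a small decision rule. The sketch below is only one possible encoding, not the scoring protocol actually used: it assumes that each well is summarized by three observations (10 μM at day 2, 10 μM at day 3, 1 μM at day 3) and it collapses "round/dark/dying" into a single AFFECTED state. The names State and grade_phenotype are hypothetical.

```python
from enum import Enum
from typing import Optional


class State(Enum):
    NORMAL = 0
    AFFECTED = 1  # assumption: stands for the round/dark/dying phenotype
    DEAD = 2


def grade_phenotype(s10_day2: State, s10_day3: State,
                    s1_day3: State) -> Optional[str]:
    """Return 'I', 'II', 'III', or None if no grade definition applies."""
    # Grade I: dead by day 2 at 10 uM and dying/dead by day 3 at 1 uM.
    if s10_day2 is State.DEAD and s1_day3 in (State.AFFECTED, State.DEAD):
        return "I"
    # Grade II: dead by day 3 at 10 uM and round/dark/dying at 1 uM.
    if s10_day3 is State.DEAD and s1_day3 is State.AFFECTED:
        return "II"
    # Grade III: round/dark/dying by day 3 at both concentrations.
    if s10_day3 is State.AFFECTED and s1_day3 is State.AFFECTED:
        return "III"
    return None
```

Combinations that the written definitions do not cover (for example, no visible effect at either concentration) return None rather than being forced into a grade.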

Determination of Crystal Structures-Recombinant SmCB1 was produced as a nonglycosylated mutant in the methylotrophic P. pastoris expression system. The enzymatically active SmCB1 was obtained by activation processing of the SmCB1 zymogen with legumain that removes the activation peptide (pro-peptide) (12). The activated SmCB1 contains 253 amino acid residues starting with N-terminal Val-70 (the SmCB1 zymogen numbering is used throughout the paper).
SmCB1 was crystallized in complex with three covalent active site inhibitors, namely CA074, K11017, and K11777. The structure of SmCB1 was determined by molecular replacement using the structure of human cathepsin B (37) sharing 59% sequence identity. The SmCB1-CA074 and SmCB1-K11017 complexes crystallized in the same orthorhombic space group P2₁2₁2₁ with one molecule in the asymmetric unit and solvent content of ~23%. The structures were refined using data to resolution 1.3 and 1.8 Å, and the final crystallographic models contain residues 71-323 and 70-323 for the SmCB1-CA074 and SmCB1-K11017 complexes, respectively. The electron density used for modeling of inhibitors was of excellent quality in both structures (Fig. 3). The SmCB1-K11777 complex crystallized in the orthorhombic space group P2₁2₁2₁ with three molecules (named A, B, and C) in the asymmetric unit and solvent content of ~47% and was refined using data to resolution 2.64 Å. One C-terminal residue in molecule C as well as residues 118-122 of molecule B could not be located in the electron density map and were thus not included in the final model. All three molecules of the SmCB1-K11777 complex present in the asymmetric unit were very similar. The root mean square deviations (r.m.s.d.) for superposition of the three protein molecule backbones onto each other range from 0.34 to 0.48 Å, a value within the range observed for different crystal structures of identical proteins (53). Minor structural changes were localized in the surface exposed loops, and the substrate-binding sites are structurally almost identical. The electron density used to model K11777 was of excellent quality in all three protein chains in the asymmetric unit (Fig. 3). Mutual comparison of SmCB1 in complex with K11777, K11017, and CA074 did not reveal any significant differences in protein structure (backbone r.m.s.d. values are 0.18-0.54 Å).
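For reference, the r.m.s.d. values quoted above follow the standard definition: the square root of the mean squared distance between equivalent atoms after superposition. The sketch below only evaluates that definition on two coordinate lists that are assumed to be already superposed and listed in the same atom order; the superposition step itself is not shown, and the function name rmsd is hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally long lists of
    (x, y, z) tuples, assumed to be already superposed and paired."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))
```

For example, two atoms that are each displaced by exactly 1 Å between the two structures give an r.m.s.d. of 1 Å.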
Overall Structure of SmCB1-SmCB1 is a single polypeptide chain that adopts a classic papain-like fold in which the molecule is divided into L and R domains (37). The active site cleft with catalytic residues Cys-100, His-270, and Asn-290 is located between both domains (Fig. 1). The SmCB1 structure clearly resembles cathepsin B-type peptidases of the papain superfamily with the characteristic "occluding loop" (Phe-175 to Pro-197) that restricts access to the primed region of the active site (37). A comparison of SmCB1 with the structure of human cathepsin B shows a high degree of similarity (r.m.s.d. 0.87 Å for 247 Cα atoms). The major differences in backbone superposition (with r.m.s.d. >1 Å) are located at the surface loop segments, including residues 117-125, 164-167, 175-194 (occluding loop), 247-268, and 281 (supplemental Fig. S2A). On the SmCB1 surface, there are several large basic patches; the major positively charged cluster is located along the edge of the occluding loop and is absent in mammalian structures (supplemental Fig. S2B). This is reflected in the pI of SmCB1 that is more than 3 units higher than that of human (and other mammalian) cathepsin B based on theoretical pI values of 8.7 and 5.2, respectively.
In the SmCB1 occluding loop, there are two important features of structural rearrangements compared with mammalian homologs. First, a conserved segment Gly-Glu-Gly-Asp is replaced by the sequence Lys-Ile-Tyr-Lys (residues 192-195) in SmCB1. The glycine-containing segment is flexible in mammalian structures and able to move into the active site (18,54,55). In SmCB1, this segment is located more distally from the active site, where it is stabilized by the stacking interaction formed between Tyr-194 and Phe-175 at the side of the occluding loop (Fig. 2A). Second, the flexibility of the occluding loop of SmCB1 is restrained by the presence of two salt bridges that stabilize the loop in the "closed" conformation (37). The ion pair His-180/Asp-93 is conserved, whereas the pair Arg-186/Asp-295 of mammalian structures (SmCB1 numbering) is rearranged in SmCB1. The arginine is substituted by Tyr-186, which interacts with Asp-295 through Lys-185 to form a cluster stabilizing the loop conformation in SmCB1 (supplemental Fig. S3).
Mode of Binding of Inhibitors to SmCB1-Based on the crystal structures of the SmCB1-inhibitor complexes, the binding mode of the inhibitors K11777, K11017, and CA074 was described. These irreversible inhibitors form a covalent adduct with the thiol group of the catalytic residue Cys-100 and differ substantially in their positions in the active site (Fig. 3). CA074 (L-trans-epoxysuccinyl(propylamide)-Ile-Pro-OH) occupies the S2 to S2′ subsites and is bonded via a C6 atom after opening the epoxide ring. K11777 (N-Mpip-Phe-Hph-VSPh) and K11017 (Mu-Leu-Hph-VSPh) occupy the S3 to S1′ subsites, making a covalent bond through the Cβ atom of the vinyl sulfone moiety. In all complexes, there is a set of common interactions between the inhibitor backbone and the enzyme active site that involves predicted contacts of Gln-94, Gly-98, Gly-143, Gly-144, Gly-269 and His-270, Trp-101, Trp-292 (supplemental Table S2). Specific critical structural determinants of inhibitors and their interactions with SmCB1 subsites are as follows.
In the SmCB1-CA074 complex, propyl and carbamoyl groups of CA074 are in the S2 and S1 subsites, and the -Ile-Pro-OH part mimics a substrate in the S1′ and S2′ subsites (Fig. 3). The C-terminal carboxyl group of the P2′ Pro residue interacts with two His residues localized at the occluding loop of SmCB1; three charge-assisted hydrogen bonds in total were formed between carboxyl oxygens and imidazole nitrogens of His-180 and His-181 (Fig. 4). Thus, the binding mode of CA074 to SmCB1 is similar to that known in mammalian cathepsin B complexes with CA074 and related derivatives, which target the occluding loop at the S2′ subsite leading to cathepsin B specific inhibition (18,55,56). An additional stabilization is conferred by nonpolar interaction between Ile-193 and the P2′ proline ring. The S1′ subsite is a hydrophobic pocket (Val-247, Leu-252, Leu-267, His-270, and Trp-292) in the R domain and stably holds the P1′ Ile of CA074 through hydrogen bonding (Trp-292 and Gln-94). CA074 does not protrude deeply into the S1 subsite composed of Gln-94, Gly-98, Gly-144, and Gly-269. The inhibitor binding in this region is stabilized by the following interactions: C6 atom covalently bound to Cys-100 and two carbonyl oxygen atoms interacting with Gln-94 (in the "oxyanion hole") and Gly-144 (Fig. 4). The terminal propyl of CA074 occupies part of the S2 subsite, where it is directed toward Glu-316. This P2 group is bound through hydrophobic interactions with the backbone of Leu-146 and Ala-271.
The chemical structures of K11777 and K11017 are identical at the P1 position (Hph) and the P1′ position (VSPh) but differ at the P2 position (Phe and Leu, respectively) and P3 position (N-Mpip and Mu, respectively) (Fig. 3). Contrary to CA074, the vinyl sulfone inhibitors do not occupy the S2′ subsite of SmCB1. When comparing the binding mode of K11777 and K11017, a striking difference was observed for the conserved P1′ substituent; the phenyl sulfone moiety fills the S1′ subsite in K11017 but is flipped ~90° out of the active site in K11777 (for all three molecules in the asymmetric unit) (Fig. 3 and supplemental Fig. S4). Both conformations of phenyl sulfone are stabilized by polar contacts, mainly with Leu-252 and Trp-292 in K11017, and with Cys-97 and Gly-98 in K11777 (supplemental Table S2). The S1 subsite of SmCB1 binds the Hph residue of both K11777 and K11017; however, the network of polar contacts in this subsite is influenced by the situation in S1′. The residue Ile-193 located on the Lys-Ile-Tyr-Lys segment of the occluding loop has a different orientation in the K11777 and K11017 complexes as it interacts with different positions in the inhibitors (Fig. 2B). In the SmCB1-K11017 complex, Ile-193 makes contacts with Hph in the S1 subsite and, in the SmCB1-K11777 complex, with phenyl sulfone that is flipped out from the S1′ subsite. SmCB1 contains an acidic residue (Glu-316) at the bottom of the S2 subsite similarly to mammalian cathepsins B (57). Leu in the P2 position of K11017 is able to establish a polar contact with Glu-316, whereas for the bulkier Phe in K11777, Glu-316 points out of the pocket to avoid a steric clash (supplemental Fig. S5). Similar conformational changes of the acidic residue in S2 were reported for vinyl sulfone complexes of cathepsin L-type peptidases of Trypanosoma and Plasmodium (58). The flexibility of Glu-316 in SmCB1 is further demonstrated by its dual conformation in the SmCB1-K11017 complex (supplemental Fig. S5).
The S3 subsite located at the entrance of the SmCB1 active site cleft is generally hydrophobic. This wide subsite accommodated loosely the terminal groups Mu and N-Mpip as the P3 substituents of vinyl sulfone inhibitors.
Computational Analysis of the Inhibitor Binding Mode-The quantum chemical calculations on the crystallographic complexes were employed to determine the noncovalent interaction energy of the K11017, K11777, and CA074 inhibitors in the subsites of SmCB1. Fig. 5 shows the interaction energies of the individual side- and main-chain segments in P3 to P2′ positions.
For CA074, by far the largest favorable contribution comes from the P2′ position containing the C-terminal Pro residue (−32.8 kcal/mol). The other contributions range from −4.5 to 5.3 kcal/mol, with favorable interactions formed by the P2/P1 and P1 segments and unfavorable interactions by the side chains of P2 and P1′ (Fig. 5). This contrasts with the interaction energies of the vinyl sulfone inhibitors, which represent rather smaller favorable contributions. The comparison of K11017 and K11777 revealed that there are large differences in the interaction energy between both inhibitors at the P2 and P1′ positions (Fig. 5). The side chain of the P2 residue, Leu of K11017, does not bring any favorable interaction (0.1 kcal/mol) in contrast to Phe of K11777 (−6.9 kcal/mol). The phenyl sulfone moiety at P1′ adopts a distinct conformation in each of the two vinyl sulfone-SmCB1 complexes, contributing −13.8 kcal/mol in K11017 and −4.3 kcal/mol in K11777 to binding. To evaluate the effect of the two orientations of the P1′ residue in the respective inhibitor complexes, we calculated the interaction energies for artificial complexes of K11017 and K11777, in which the side chains were built with interchanged conformations (supplemental Table S3). The calculated interaction energy of P1′ is reduced substantially to less negative values, which strongly indicates that only the crystallographic conformation is favorable for binding of the respective inhibitor to SmCB1.
OCTOBER 14, 2011 • VOLUME 286 • NUMBER 41 JOURNAL OF BIOLOGICAL CHEMISTRY 35775
In the crystallographic complexes of SmCB1 with CA074 and K11017, we found dual conformations of the side chain of Glu-316 in S2 interacting with the P2 residues. Subsite interaction energies in both alternative conformations were calculated (supplemental Table S3). The energy differences were 1.3 and 0.1 kcal/mol for K11017 and CA074, respectively. We conclude that two conformations of Glu-316 in the S2 subsite are nearly isoenergetic and might be favorable for the complex formation in terms of the conformational entropy.
Inhibitor Specificity of the SmCB1-binding Subsites-A set of 20 vinyl sulfone inhibitors was screened in vitro against SmCB1 to explore the structural requirements of the inhibitor-binding subsites in the SmCB1 active site cleft. These compounds are listed in Table 1.
At the P1 position, Hph is the favored residue that is present in all tested inhibitors with IC50 of <10 nM. Its substitution with Lys (K11006) or Tyr (WRR-453) led to 1 and 4 orders of magnitude higher IC50, respectively, as compared with K11002. Also, a change of configuration at P1 Hph (and adjacent P2 Phe) from S to R substantially decreased the inhibitory potency as shown for WRR-359 derived from WRR-284 (IC50 ~114 and 7.8 nM, respectively). Unfavorable substitutions at P1 (containing (R)-Ala) and P1′ resulted in low inhibition of WRR-185 and WRR-200 (compare with WRR-145).
At the P2 position, both Phe and Leu are highly effective as demonstrated with K11002 and K11017 (IC50 around 1.7 nM). In the K11777 scaffold, replacement of the P2 Phe by Phe-4-CH3 (AR-198049) and Phe-4-CF3 (AR-198048) resulted in 3- and 5-fold weaker inhibition, and His (WRR-499) and Arg (WRR-483) afforded 14- and 24-fold weaker inhibition, respectively. The N-terminal modification of inhibitors corresponding to the P3 position was by N-Mpip, Mu, and Cbz capping groups. The heterocycles are present in the best inhibitors and do not differ importantly in their contribution to the inhibitory effect, as shown with K11777 and K11002 (IC50 of 2.09 and 1.73 nM, respectively).
Cleavage Mode and Substrate Specificity of SmCB1-Hydrolysis by SmCB1 was analyzed with the physiological protein substrate, hemoglobin, and with a series of synthetic peptide substrates. SmCB1 degraded hemoglobin at acidic pH between 4 and 6 as measured by a fluorescamine assay that directly quantifies hemoglobin fragments (Fig. 7A). SDS-PAGE visualization of the hemoglobin fragmentation showed that the disappearance of the substrate band is not associated with a corresponding accumulation of large hemoglobin fragments of >3.5 kDa (Fig. 7B). A detailed pattern of hemoglobin-derived products was obtained by RP-HPLC separation (Fig. 7C). Like those after SDS-PAGE, the RP-HPLC profiles indicated that hemoglobin is gradually converted into a pool of dipeptides with little accumulation of peptides of intermediate size. The detected intermediate fragments ranging in size from 10 to 41 amino acids are mostly derived from the interior of the hemoglobin sequence (supplemental Table S4); this demonstrates the involvement of endopeptidase activity. The combined data suggest that upon endopeptidolytic cleavage of hemoglobin by SmCB1, the substrate is rapidly processed via carboxydipeptidase activity.
SmCB1 was then tested with various fluorogenic peptide substrates that allowed for the discrimination of endo- and exopeptidase activities. The supplemental Table S5 compares
FIGURE 5. Subsite interaction energies between inhibitors and SmCB1. The noncovalent interaction energy was determined using quantum chemical calculations on the crystallographic complexes of SmCB1 with K11017, K11777, and CA074. The inhibitor structures were fragmented into the side-chain segments (P3 to P2′) and main-chain segments (Pi/P(i-1) connecting Pi and P(i-1)). The P1/P1′ segment forming a covalent bond with the catalytic Cys-100 was not calculated. Positions absent in inhibitor structures are marked with a ×.
FIGURE 6. Examples of three inhibitor-induced phenotypes in the parasite versus untreated controls (see Table 1). Phenotypes arise as a function of time and inhibitor concentration and were graded as follows: Grade I, dead NTS by 2 days of culture at 10 μM and dying/dead NTS by 3 days at 1 μM; Grade II, dead NTS by 3 days at 10 μM and round/dark/dying by 3 days at 1 μM; Grade III, round/dark/dying phenotype in 3 days at 1 and 10 μM concentrations. Scale bar, 0.2 mm.

TABLE 1. Inhibition of SmCB1 by vinyl sulfone inhibitors and their anti-schistosomal activity
The IC50 values for 20 vinyl sulfone inhibitors were determined in a kinetic activity assay with SmCB1 and the fluorogenic peptide substrate, Cbz-Phe-Arg-AMC, at pH 5.5. The epoxide inhibitor CA074 was assayed for comparison. The vinyl sulfone structures are defined by the compound core (see scheme below) and substituents R3 to R1′. Inhibitors are ranked according to their IC50 values. Mean values ± S.E. are given for triplicate measurements. Induction of phenotypic alterations by the inhibitors was determined with NTS of S. mansoni. The inhibitors were tested at 1 and 10 μM concentrations, and the resulting phenotypes, arising as a function of time and concentration, were graded I to III, with I being the most severe (see Fig. 6).
a The abbreviations used are as follows: N-Mpip, N-methylpiperazinylcarbonyl; Mu, morpholinylcarbonyl; Cbz, benzyloxycarbonyl; Ph, phenyl; Bz, benzyl.
b Compounds were analyzed by x-ray crystallography in complex with SmCB1.
c Residues are in R configuration; all other residues are in S configuration.
d Grade I was the most severe; see text for details.

We next designed FRET-based substrate libraries for mapping the carboxydipeptidase activity and residue preferences in the primed substrate-binding subsites of SmCB1. The synthesized libraries Abz-Phe-Arg-Xaa-Nph-OH and Abz-Phe-Arg-Nph-Xaa-OH contain two fixed residues, Phe-Arg in P2-P1, which are favored by SmCB1 and other cathepsins B (60,61,63) and help to anchor the substrates in nonprimed subsites. The substitutions in the Xaa positions define the P1′ and P2′ residues. The screening of the libraries showed that SmCB1 has a broader specificity of the S2′ subsite than the S1′ subsite (Fig. 8). All residues were accepted at the P2′ position, although Pro and basic residues led to lower substrate hydrolysis. At the P1′ position, larger hydrophobic, aromatic (except for Trp), and basic residues were unfavorable, whereas Pro was not tolerated.
Generally, the primed subsites of SmCB1 differed in their ability to accommodate large hydrophobic and aromatic residues of the tested carboxydipeptidase substrates.
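The layout of the two positional-scanning libraries can be made explicit with a short sketch. It simply substitutes each residue into the Xaa position of the two templates named in the text. The residue set is an assumption (the 20 proteinogenic amino acids); the set actually synthesized is not specified in this section, and the function and variable names are illustrative only.

```python
# Three-letter codes of the 20 proteinogenic amino acids.
# ASSUMPTION: the real libraries may have used a different residue set.
AMINO_ACIDS = [
    "Ala", "Arg", "Asn", "Asp", "Cys", "Gln", "Glu", "Gly", "His", "Ile",
    "Leu", "Lys", "Met", "Phe", "Pro", "Ser", "Thr", "Trp", "Tyr", "Val",
]


def build_library(template, residues=AMINO_ACIDS):
    """Map each residue to the substrate obtained by placing it at Xaa."""
    return {res: template.replace("Xaa", res) for res in residues}


# Phe-Arg occupy P2-P1, so the residue that follows Arg is P1' and the next is P2'.
p1_prime_library = build_library("Abz-Phe-Arg-Xaa-Nph-OH")  # Xaa scans P1'
p2_prime_library = build_library("Abz-Phe-Arg-Nph-Xaa-OH")  # Xaa scans P2'
```

Each dictionary holds one substrate per residue, so hydrolysis rates measured for the two libraries can be tabulated side by side by residue to compare S1′ and S2′ preferences.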
Severity of Phenotypes Induced in Cultured Parasites Correlates with the Potency of SmCB1 Inhibition-A panel of inhibitors of SmCB1 listed in Table 1 was screened against S. mansoni NTS, the post-invasive parasite stage that feeds on host blood (5). The NTS were exposed to 1 and 10 μM inhibitors, and the resultant phenotypes were graded I through III from most to least severe. About 30 and 60% of the tested compounds led to death of NTS by the 3rd day of the incubation at 1 and 10 μM, respectively. The score and images of typical disordered phenotypes are presented in Table 1 and Fig. 6. The severity of phenotypes induced by the inhibitors was statistically correlated with their potency to inhibit SmCB1 activity (Table 1; Kruskal-Wallis rank analysis of variance, H(2,21) = 12.6, p < 0.001). Specifically, grade I and II phenotypes were generally caused by inhibitors with IC50 values less than 20 nM, and grade III phenotypes were induced by inhibitors with IC50 values greater than 20 nM (except for WRR-185 and WRR-200, which were grade II).
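To illustrate the kind of rank test cited above, the following minimal sketch computes a tie-corrected Kruskal-Wallis H statistic for IC50 values grouped by phenotype grade. It is not the analysis pipeline used in this study. The IC50 numbers are hypothetical placeholders rather than values from Table 1, and the function names are illustrative. The p-value uses the chi-square approximation, which for three groups (2 degrees of freedom) reduces to exp(-H/2) and is only approximate for small samples.

```python
import math
from collections import Counter
from itertools import chain


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H for a list of non-empty samples."""
    pooled = list(chain.from_iterable(groups))
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        total += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1 - ties / (n ** 3 - n)
    # If every observation is identical, the ranks carry no information.
    return h / correction if correction > 0 else 0.0


def p_value_three_groups(h):
    """Chi-square approximation with 2 degrees of freedom: exp(-h / 2)."""
    return math.exp(-h / 2)


# HYPOTHETICAL IC50 values (nM) per phenotype grade; NOT data from Table 1.
ic50_by_grade = {
    "I": [2.0, 5.0, 8.0],
    "II": [12.0, 15.0, 30.0],
    "III": [45.0, 60.0, 90.0],
}
h_stat = kruskal_wallis_h(list(ic50_by_grade.values()))
p_approx = p_value_three_groups(h_stat)
```

Because the test operates on ranks, it makes no assumption that IC50 values are normally distributed, which suits potencies that span several orders of magnitude.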

DISCUSSION
SmCB1 is one of a number of digestive peptidases in the gut of the flatworm parasite S. mansoni (8,12). Both reverse genetics and chemical experiments suggest that it is critical for parasite growth and a valuable target for the development of novel anti-schistosomal drugs (14,15). In this study, we provide a comprehensive structure-activity analysis of SmCB1 that includes a series of crystal structure determinations. We also describe the SmCB1 interaction with inhibitors and characterize its specificity with both peptidyl and protein substrates.
Structure of SmCB1-The three-dimensional structure of SmCB1 was solved for three inhibitor complexes; the best resolution achieved was 1.3 Å. SmCB1 possesses an occluding loop that is characteristic of cathepsin B-type peptidases (37). It is known to regulate access to the active site, where it partially blocks the primed substrate-binding subsites (at S3′ and beyond), and thus confers carboxydipeptidase activity on cathepsins B (59,61,64). The occluding loop of SmCB1 presents local structural rearrangements compared with mammalian homologs; however, these changes retained the overall loop fold, suggesting its functional competence. This was probed by determining the mode of SmCB1 action using specific peptide substrates; carboxydipeptidase activity was clearly manifested. In addition, SmCB1 displayed endopeptidase activity, indicating that the SmCB1 occluding loop is flexible and can move to accommodate an endopeptidase substrate in the active-site cleft, as has been reported for human cathepsin B (59).

FIGURE 7. Hydrolysis of hemoglobin by SmCB1. A, human hemoglobin (Hb) was digested with SmCB1 at various pH values. The degradation rate was determined with the fluorescamine derivatization assay quantifying the liberated fragments. The mean values ± S.E. are expressed relative to the maximum value. B and C, Hb digest at pH 4.5 was performed at two time points; the reaction mixture was electrophoretically and chromatographically separated and compared with the undigested control. B, Tricine-SDS-PAGE of the Hb digest visualized by protein staining. C, RP-HPLC of the Hb digest resolved on a C4 column using a TFA/acetonitrile system. Elution positions are indicated for the intact Hb substrate (α and β subunits) and for Hb-derived fragments, which form the pools of dipeptides and large peptides (ranging in size from 10 to 41 amino acids, see supplemental Table S4). The flow-through peak (see profile at 0 h) contains nonpeptide components of the reaction mixture.

SmCB1, an Efficient Endo- and Exopeptidolytic Machine-The carboxydipeptidase catalytic efficiency of SmCB1 as measured with peptide substrates was greater than its endopeptidase efficiency. The screening of carboxydipeptidase substrate libraries showed that a broad range of residues is tolerated in the primed positions P1′ and especially P2′. This suggests that SmCB1 is able to trim effectively the C termini of peptides. SmCB1 has more promiscuous substrate specificity in P2′ than human cathepsin B (60). Furthermore, we identified a combined carboxydipeptidase/endopeptidase action of SmCB1 on the physiological substrate hemoglobin. Analysis of the reaction products indicates that endopeptidolytic fragments are rapidly converted into dipeptides. Thus, oligo/polypeptide fragments do not accumulate to the extent observed for hemoglobin digestion by helminth cathepsin L-type endopeptidases (65-67). With regard to hemoglobinolytic capability, SmCB1 resembles cathepsins B from the Southeast Asian liver fluke Opisthorchis viverrini and the hookworm Ancylostoma caninum (66,68) but differs from cathepsins B of the avian fluke Trichobilharzia regenti and the hookworm Necator americanus that cannot initiate hemoglobinolysis (69,70). We conclude that SmCB1 operates as an effective proteolytic machine to degrade the major protein in the parasite's blood meal.
Structure-based Insights for Drug Design-The interaction of SmCB1 with peptidomimetic inhibitors was investigated using the following: (i) crystal structures of three SmCB1-inhibitor complexes, (ii) computational analysis of interaction energies, and (iii) inhibition profiling with a panel of vinyl sulfones. These enabled us to evaluate the critical interactions of inhibitors in the binding subsites and provide a basic SAR for improving inhibitory potency and selectivity.
The S2′ subsite was efficiently occupied in the complex of CA074, a specific inhibitor of cathepsin B-type peptidases. The hydrogen bonding of the inhibitor's C-terminal P2′ residue with the occluding loop (especially the two His residues) was the largest favorable subsite interaction among the three crystallographic complexes (Fig. 5). The vinyl sulfone inhibitors, by contrast, do not contain a P2′ residue, and therefore, we would consider re-designing the scaffold to extend into the S2′ position, which may improve both potency and selectivity to SmCB1. For the vinyl sulfone P1′, aromatic sulfone moieties were preferred over the Ile residue in CA074 (in accordance with the low substrate specificity for Ile in P1′ (Fig. 8)). An important discovery is the strikingly different conformation between the P1′ phenyl sulfone moieties in the K11777 and K11017 complexes, which suggests a cooperativity between S1′ and other subsite(s) of SmCB1. This conformational switch should be taken into account in future docking experiments to optimize P1′ substituents, e.g. by aromatic groups with a longer linker (Table 1). Interestingly, two types of orientation of the P1′ phenyl sulfone were also observed in the crystal structures of several vinyl sulfone inhibitors with cathepsin L-type peptidases from protozoan parasites, as shown in supplemental Fig. S4 (58,71,72). In these complexes, the particular phenyl sulfone orientation was regulated by the structural environment of S1′; however, a dual (transient) conformation of this substituent was also documented (71).
In the S1 subsite of SmCB1, Hph of the vinyl sulfone inhibitors was energetically favored (Fig. 5). A basic residue at this position reduced inhibitory potency (Table 1), although basic P1 residues are preferred in SmCB1 substrates (63). This may reflect the effect of the overall scaffold of the active-site ligand, which has been reported to change the P1 specificity of human cathepsin B (60,61). Based on the structural difference, the S1 pocket can be exploited to engineer selective inhibition of SmCB1 over human cathepsin B. For this purpose, the interaction can be optimized between the P1 residue and Ile-193 located on a nonconserved sequence segment of the occluding loop of SmCB1 (Fig. 2). At P2, bulky hydrophobic residues such as Phe and Leu on the vinyl sulfone scaffold afforded highly potent inhibitors, which agrees with the known P2 substrate preferences of SmCB1 (63). The bottom of the S2 pocket of SmCB1 and other cathepsins B contains Glu, which facilitates the recognition of positively charged residues at P2 (57,60,63). Further focus can be placed on this P2-S2 interaction by introducing basic substituents of a suitable size to make contact without displacing the flexible side chain of Glu-316 (supplemental Fig. S5). Finally, for S3, occupation by monocyclic heterocycles of the vinyl sulfones generated favorable interaction energies. Bulkier substituents can be tested at P3 to improve inhibition, as reported for human cathepsin B inhibitors (73).
SmCB1 as Priority Drug Target-Although SmCB1 is one of a number of peptidases expressed in the gut (7-9) and elsewhere in the parasite (74), the correlation between the severity of phenotypes induced by vinyl sulfone inhibitors and the potency of inhibition of SmCB1 encourages the view that SmCB1 is a valuable drug target. This is congruent with the identification of SmCB1 as a major target for inhibition by K11777 during experimental therapy in a murine model of S. mansoni infection (15). Given the catalytic efficiency of SmCB1 against hemoglobin described here and its status as the major cysteine peptidase activity in the parasite (13,75), it might be anticipated that inhibition of this enzyme would impact the parasite's ability to thrive. Indeed, RNA interference of SmCB1 slowed the growth of the parasite both in culture and in an animal model of infection (14). To conclude, the SmCB1 crystal structures described herein provide the necessary first step in a structure-based drug development program to improve inhibitor specificity and potency and, possibly, generate new lead anti-schistosomal compounds.