Structural and Functional Relationships in the Virulence-associated Cathepsin L Proteases of the Parasitic Liver Fluke, Fasciola hepatica

The helminth parasite Fasciola hepatica secretes cysteine proteases to facilitate tissue invasion, migration, and development within the mammalian host. The major proteases cathepsin L1 (FheCL1) and cathepsin L2 (FheCL2) were recombinantly produced and biochemically characterized. By using site-directed mutagenesis, we show that residues at positions 67 and 205, which lie within the S2 pocket of the active site, are critical in determining the substrate and inhibitor specificity. FheCL1 exhibits a broader specificity and a higher substrate turnover rate compared with FheCL2. However, FheCL2 can efficiently cleave substrates with a Pro in the P2 position and degrade collagen within the triple helices at physiological pH, an activity that among cysteine proteases has only been reported for human cathepsin K. The 1.4-Å three-dimensional structure of FheCL1 was determined by x-ray crystallography, and the three-dimensional structure of FheCL2 was constructed via homology-based modeling. Analysis and comparison of these structures and our biochemical data with those of human cathepsins L and K provided an interpretation of the substrate-recognition mechanisms of these major parasite proteases. Furthermore, our studies suggest that a configuration involving residue 67 and the "gatekeeper" residues 157 and 158 situated at the entrance of the active site pocket creates a topology that endows FheCL2 with its unusual collagenolytic activity. The emergence of a specialized collagenolytic function in Fasciola likely contributes to the success of this tissue-invasive parasite.

Clan CA papain-like cysteine peptidases, such as cathepsins B and L (1), are ubiquitous in helminth (worm) parasites of human and veterinary importance. These peptidases are involved in a variety of pathogen-specific functions, including penetration and migration through host tissues, catabolism of host proteins to peptides and amino acids, and modulation or suppression of host immune defenses by cleaving immunoglobulin or altering the activity of immune effector cells (2-4). The central role of Clan CA proteases in the survival of helminth parasites has positioned them as lead targets for the development of new chemotherapies and vaccines (5-7).
Fasciola hepatica is a helminth parasite that causes liver fluke disease (fasciolosis) in cattle and sheep worldwide. It is most prevalent in Europe, where infection rates are increasing because of the emergence of drug-resistant parasites and possibly as a result of climate change (8,9). Human fasciolosis has recently emerged as a major zoonosis in rural areas of South America (particularly Bolivia, Peru, and Ecuador), Egypt, and Iran where organized farm management practices are poor. It is estimated that worldwide over 2.4 million people are infected with F. hepatica and about 180 million are at risk of infection (10,11).
Secretion of cysteine proteases is associated with the virulence of F. hepatica and its capacity to infect a wide range of mammalian hosts (4,6,12-14). Cathepsin L1 (FheCL1) and cathepsin L2 (FheCL2) are the two major peptidases secreted by the infective larvae that traverse the host intestinal wall, by the migratory stages that penetrate the liver tissues, and by the mature adult parasites that reside in the bile ducts and feed on host blood, which they ingest through the punctured bile duct wall (4,6,15). Experiments using purified native enzymes demonstrated that FheCL1 and FheCL2 efficiently degrade host hemoglobin, immunoglobulin, and interstitial matrix proteins such as fibronectin, laminin, and native collagen (6,16,17). Although FheCL1 and FheCL2 exhibited similar substrate specificities, FheCL2 showed a greater affinity for peptides containing Pro residues in the P2 position (18-20). We proposed that by producing proteases with overlapping specificity the parasite could digest these host macromolecules more efficiently, and therefore more effectively penetrate host organs (6,16).
The F. hepatica cathepsin Ls belong to a lineage that eventually gave rise to the mammalian cathepsin Ls from which the mammalian cathepsin Ks diverged (2). Mammalian cathepsin L is ubiquitously expressed in tissues and performs a housekeeping function in protein turnover, but it also plays a part in more specialized functions such as antigen processing and presentation, hormone and protease activation, and extracellular matrix turnover (21). Cathepsin K, on the other hand, exhibits a more restricted expression profile being predominantly found in osteoclasts but also in multinucleated giant cells, macrophages, and lung epithelial cells (22,23). A specific role for cathepsin K in bone resorption by osteoclasts has been related to the ability of the protease to cleave the covalently linked triple helices of native collagen, a unique property among the mammalian papain-like cysteine proteases (24). This unusual property was attributed to the presence of a tyrosine residue at position 67 within the S2 subsite of cathepsin K that interacts with proline in the P2 of substrates, including the Gly-Pro-Xaa repeat sequence (where Xaa is mainly proline or 4-trans-L-hydroxyproline) found in collagen. A parallel therefore exists between mammalian cathepsin K and the F. hepatica FheCL2 as the latter can also cleave substrates with a P2 proline and possesses a tyrosine residue at the corresponding position 67.
To understand the role of the major secreted cathepsin L proteases of F. hepatica in the virulence of the parasite and its adaptation to various hosts, it is important to elucidate their biochemical properties and relate these to structure and function. Therefore, in this study, we have characterized the substrate specificity of active recombinant forms of FheCL1 and FheCL2. These properties were further explored by preparing variants of FheCL1 in which specific substitutions were made within the S2 subsite of the active site (positions 67 and 205) to simulate those residues present in human cathepsins L and K. In addition, the 1.4-Å three-dimensional structure of a variant FheCL1 zymogen, in which the active site Cys was replaced by a Gly (FheproCL1Gly 25 ), has been determined by x-ray crystallography. For FheCL2, the three-dimensional structure has been constructed via homology-based modeling. Analysis and comparison of these major parasite proteases with the human cathepsins L and K provide a structural interpretation of the substrate-recognition mechanisms.

EXPERIMENTAL PROCEDURES
Expression and Purification of Recombinant Cathepsin L Zymogens in Yeast-F. hepatica procathepsin L1 (FheCL1) and procathepsin L2 (FheCL2) were amplified by PCR from the pAAH5 Saccharomyces cerevisiae expression vector into which the full-length cDNA had been cloned previously in our laboratory (12,25). FheCL1 variants (FheCL1 L67Y and FheCL1 L205A) were synthesized and incorporated an SnaBI restriction site at the 5′ end of the gene and an AvrII restriction site and His6 tag sequence at the 3′ end (Geneart, Regensburg, Germany). The 980-bp fragments were ligated into pCR-Script cloning vector (Stratagene), which was transformed into competent Escherichia coli for amplification. Inserts were digested from plasmid preparations with AvrII and SnaBI and inserted in-frame with the yeast α-factor at the AvrII/SnaBI site of P. pastoris expression vector pPIC9K (Invitrogen). Plasmids were linearized with SacI and then transformed into chemically competent GS115 cells (Invitrogen) as described previously (12). All inserts were sequenced to ensure congruence with original cDNAs.
P. pastoris yeast transformants were cultured in 500 ml of buffered glycerol complex medium broth, buffered to pH 8.0, in 5-liter baffled flasks at 30°C until an A600 of 2-6 was reached (12). Cells were harvested by centrifugation at 2000 × g for 5 min, and protein expression was induced by resuspending in 100 ml of buffered minimal methanol medium broth, buffered at pH 6.0 containing 1% methanol (20). Recombinant proteins were purified from yeast medium by affinity chromatography using nickel-nitrilotriacetic acid-agarose (Qiagen) (12,26). Purified recombinant zymogens were dialyzed against phosphate-buffered saline (PBS) and stored at −20°C. The 37-kDa zymogens were autocatalytically activated and processed to 24.5-kDa mature enzymes by incubation for 2 h at 37°C in 0.1 M sodium citrate buffer, pH 5.0, containing 2 mM DTT and 2.5 mM EDTA. The mixture was then dialyzed against PBS, pH 7.3. The proportion of functionally active recombinant protein in these preparations was determined by titration against E-64.
P1-P4 Specificity Using a Positional Scanning Synthetic Combinatorial Library-The substrate specificities of FheCL1, FheCL1 L67Y, FheCL1 L205A, and FheCL2 were determined using a complete diverse positional scanning synthetic combinatorial library (PS-SCL) (27). Screens were performed at 25°C in 0.1 M sodium acetate, 0.1 M NaCl, 0.01 M DTT, 0.001 M EDTA, 0.01% Brij-35, 1% Me2SO (from the substrates), pH 5.5. Aliquots of 25 nmol in 1 µl from each of 20 sub-libraries of the P1, P2, P3, and P4 libraries were added to the wells of a 96-well Microfluor-1 U-bottom plate (Dynex Technologies). The final concentration of each of the 8000 compounds per well was 31.25 nM in a 100-µl final reaction volume. The assays were initiated by addition of preactivated enzyme, and the reaction was monitored with a SpectraMax Gemini fluorescence spectrometer (Molecular Devices) with excitation at 380 nm, emission at 460 nm, and cutoff at 435 nm. Screens were performed in duplicate and triplicate for wild type and mutated enzymes, respectively.
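The per-compound concentration quoted above follows from simple dilution arithmetic, sketched here in Python. The input values are those stated in the text; the function and variable names are illustrative only. The figure of 8000 compounds per well is consistent with three randomized positions drawn from 20 residues (20^3 = 8000), which is an inference rather than something stated explicitly here.

```python
# Worked arithmetic for the per-compound concentration in one PS-SCL well.
# Input values come from the text; names are illustrative assumptions.

SUBLIBRARY_AMOUNT_NMOL = 25.0   # total substrate added per well (nmol)
COMPOUNDS_PER_WELL = 8000       # distinct compounds in each sub-library
FINAL_VOLUME_UL = 100.0         # final reaction volume (microliters)


def per_compound_concentration_nm(amount_nmol, n_compounds, volume_ul):
    """Return the final concentration of each compound in nM."""
    amount_per_compound_nmol = amount_nmol / n_compounds
    volume_l = volume_ul * 1e-6
    # nmol per liter is numerically equal to nM
    return amount_per_compound_nmol / volume_l


conc_nm = per_compound_concentration_nm(
    SUBLIBRARY_AMOUNT_NMOL, COMPOUNDS_PER_WELL, FINAL_VOLUME_UL
)
print(f"{conc_nm:.2f} nM per compound")
```

The result reproduces the 31.25 nM stated in the text.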
Enzyme Assays and Kinetics with Fluorogenic Peptide Substrates-Initial rates of hydrolysis of the fluorogenic dipeptide substrates were measured by monitoring the release of the fluorogenic leaving group, NHMec, at an excitation wavelength of 380 nm and an emission wavelength of 460 nm using a Bio-Tek KC4 microfluorometer. kcat and Km values were determined using nonlinear regression analysis. Initial rates were obtained at 37°C over a range of substrate concentrations spanning Km values (0.2-200 µM) and at fixed enzyme concentrations (0.5-5 nM). Assays were performed in PBS, pH 7.3, and 100 mM sodium acetate buffer, pH 5.5, each containing 2.5 mM DTT and 2.5 mM EDTA.
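The nonlinear regression step can be illustrated with a minimal Python sketch that fits the Michaelis-Menten equation, v = Vmax[S]/(Km + [S]), to initial-rate data. This is not the software or procedure used in the study, which the text does not specify. The data are synthetic and noise-free, the "true" parameters are hypothetical, and the fitting strategy (closed-form Vmax for each trial Km, a coarse grid, then golden-section refinement on log Km) is one simple choice among many.

```python
# Minimal sketch of a Michaelis-Menten least-squares fit, standard library only.
# Not the authors' method; data and parameter values below are hypothetical.
import math


def best_vmax(km, s_values, v_values):
    """Least-squares Vmax for a fixed Km (the model is linear in Vmax)."""
    x = [s / (km + s) for s in s_values]
    return sum(xi * vi for xi, vi in zip(x, v_values)) / sum(xi * xi for xi in x)


def sse(km, s_values, v_values):
    """Sum of squared residuals at a given Km, using the optimal Vmax."""
    vmax = best_vmax(km, s_values, v_values)
    return sum((vmax * s / (km + s) - v) ** 2 for s, v in zip(s_values, v_values))


def fit_michaelis_menten(s_values, v_values, log_lo=-3.0, log_hi=4.0, n_grid=141):
    """Return (Vmax, Km): coarse grid over log10(Km), then golden-section search."""
    def cost(log_km):
        return sse(10 ** log_km, s_values, v_values)

    # Coarse scan to bracket the minimum.
    step = (log_hi - log_lo) / (n_grid - 1)
    grid = [log_lo + i * step for i in range(n_grid)]
    best = min(range(n_grid), key=lambda i: cost(grid[i]))
    a = grid[max(best - 1, 0)]
    b = grid[min(best + 1, n_grid - 1)]

    # Golden-section refinement inside the bracket.
    phi = (math.sqrt(5) - 1) / 2
    for _ in range(100):
        c = b - phi * (b - a)
        d = a + phi * (b - a)
        if cost(c) < cost(d):
            b = d
        else:
            a = c
    km = 10 ** ((a + b) / 2)
    return best_vmax(km, s_values, v_values), km


# Synthetic initial rates over the substrate range given in the text (µM).
TRUE_VMAX, TRUE_KM = 100.0, 5.0   # hypothetical values, arbitrary rate units
substrate_um = [0.2, 0.5, 1, 2, 5, 10, 20, 50, 100, 200]
rates = [TRUE_VMAX * s / (TRUE_KM + s) for s in substrate_um]

vmax_fit, km_fit = fit_michaelis_menten(substrate_um, rates)
print(f"Vmax = {vmax_fit:.3f}, Km = {km_fit:.3f} µM")
```

Given a fitted Vmax and the active enzyme concentration obtained by E-64 titration, kcat follows as Vmax divided by that concentration, once fluorescence units have been converted to product concentration.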
Rate constants for the inactivation of enzyme by Z-Phe-Ala-CHN2 and cathepsin K inhibitor II were determined from progress curves in the presence of substrate (28,29). When substrate and inhibitor bind to enzyme in rapid equilibrium and the substrate concentration does not change significantly during the course of the assay, the concentration of product, [P], at time t after the start of the reaction is given by Equation 1, where v0 is the initial rate of reaction, kobs is the observed rate of inactivation, and A0 is the background fluorescence. kobs is related to the inhibitor concentration by Equation 2. When [I] ≪ Ki, plots of kobs versus [I] were linear with slope equal to an apparent second-order rate constant kobs/[I]. This value was then corrected for substrate concentration and the Michaelis constant to determine a true second-order rate constant kinact/Ki. The initial rate v0 is related to inhibitor concentration by Equation 3. Because the inactivation was carried out with [S] = Km, Equation 3 reduces to Equation 4. An apparent inhibition constant Ki(app) for the formation of the initial reversible enzyme-inhibitor complex prior to inactivation was determined by plotting v0 against [I] and fitting to Equation 4.
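Equations 1-4 referenced in the preceding paragraph are not reproduced in this text. A reconstruction in the standard form for irreversible inactivation in the presence of a competing substrate, consistent with the symbols defined there, is sketched below. The exact typeset form in the original may differ, and V_max (the limiting rate in the absence of inhibitor) is a symbol introduced here for illustration.

```latex
\begin{align}
[P] &= \frac{v_0}{k_{obs}}\left(1 - e^{-k_{obs} t}\right) + A_0 \tag{1}\\[4pt]
k_{obs} &= \frac{k_{inact}\,[I]}{[I] + K_i\left(1 + [S]/K_m\right)} \tag{2}\\[4pt]
v_0 &= \frac{V_{max}\,[S]}{[S] + K_m\left(1 + [I]/K_{i(app)}\right)} \tag{3}\\[4pt]
v_0 &= \frac{V_{max}}{2 + [I]/K_{i(app)}} \qquad \text{for } [S] = K_m \tag{4}
\end{align}
```

Under this form, when [I] is much smaller than K_i, Equation 2 becomes linear in [I], so that k_inact/K_i = (k_obs/[I])(1 + [S]/K_m). With [S] = K_m the correction factor is 2.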
Collagen Digestion-Calf skin collagen type 1 was solubilized in 0.2 M acetic acid at a concentration of 2 mg/ml and dialyzed for 2 days against 0.1 M sodium acetate, pH 4.0, 0.1 M sodium acetate, pH 5.5, or PBS, pH 7.3. Reactions contained 10 µg of dialyzed collagen type 1, 1 mM DTT, 2 mM EDTA, and 5.47 µM activated peptidase in a final volume of 100 µl of one of the above buffers. Reactions were performed at 28°C for 3 and 20 h or at 37°C for 30 min. All reactions were stopped by the addition of 10 µM E-64. Collagen digests were analyzed by 4-20% gradient SDS-PAGE under reducing conditions and stained with Coomassie Brilliant Blue R-250.
Production of Inactive Variant FheproCL1 Gly 25 -For the purpose of obtaining a high resolution three-dimensional structure of FheCL1, an inactive enzyme was produced by replacing the active site Cys residue at position 25 in the mature domain by a Gly (12,26). This FheproCL1 Gly 25 enzyme migrated as a single protein of 37 kDa on reducing 12% SDS-PAGE, which represents the full zymogen containing a prosegment and mature enzyme domain (data not shown).
Data Collection, Structure Solution, and Crystallographic Refinement of FheproCL1 Gly 25 -Initial crystallization screening experiments were performed at the Hauptman-Woodward Institute high throughput crystallization laboratory. A total of 1536 conditions were tested using a nanoscale microbatch-under-oil method, resulting in several preliminary hits that suggested a route to diffraction-quality crystals (30). Ultimately, high quality crystals were grown in-house via vapor diffusion in sitting drops. One µl of 10 mg/ml FheproCL1 Gly 25 enzyme was mixed with 1 µl of the precipitating agent, 0.2 M sodium thiocyanate in 20% polyethylene glycol 3350, and allowed to equilibrate at 23°C over a 100-µl reservoir of precipitating agent. Crystalline plates formed within 2 days; however, full-size growth to plates greater than 75 µm in thickness took nearly 2 months.
Diffraction data were collected at the Advanced Light Source, beam line 8.3.1, using monochromatic (Si-111) radiation of 1.11588 Å (31). An ADSC Quantum 210 2 × 2 CCD array detector was used with low temperature conditions of 100 K at the crystal position. Crystals of the single mutant protein were flash-cooled in liquid nitrogen after being soaked for ~1.5 min in a cryoprotectant solution of crystal growth solution plus 50% 2-methyl-2,4-pentane diol. High and low resolution datasets were collected from the same crystal. Data processing was completed with MOSFLM (32) and SCALA. The structure was solved via molecular replacement using the MOLREP program of the CCP4 suite (33) with a polyserine search model derived from the 1.8 Å structure of 1CS8 (human procathepsin L). The topmost solution had an R-factor value of 0.535 and correlation coefficient of 0.288, each several levels above the next best solution, which had corresponding statistics of 0.604 and 0.083, respectively. One unique solution was found with one molecule in the asymmetric unit and a starting R factor of 0.526. The initial molecular replacement solution was improved using ARP/wARP as implemented in the CCP4 program suite (34), resulting in a model that was better than 85% complete. Iterative rounds of visualization and manual model building and refinement were completed with QUANTA (Accelrys, San Diego) and Refmac5 with anisotropic atomic displacement parameters (35), respectively. Water molecules were added automatically using ARPwaters in CCP4 (36) and were manually verified. In the final stages of refinement, XPLEO (37) was used to improve the fit of two areas of ambiguous density in the structure. Final visualization and manual adjustments to the structure as well as final assessment of water molecules were completed with COOT (38).
Crystallographic parameters and statistics are summarized in Table 1, and final atomic coordinates have been deposited with the Protein Data Bank, accession ID 2O6X (RCSB040763).
Homology-based Molecular Modeling-A model structure of the mature domain of FheCL2 was built using Modeler (release 8, version 1), a program for protein structure modeling (39-41). The 1.8 Å structure of human procathepsin L (PDB code 1CS8), the 2.2 Å structure of human cathepsin K (PDB code 1ATK), and our 1.4 Å solved structure of FheproCL1 Gly 25 were used as three-dimensional templates of related fold. Generated models were visualized and compared with COOT (37) and with PyMOL (42).
Sequence Analysis-F. hepatica cathepsin L protein sequences were aligned using Clustal X 1.81. Phylogenetic trees were generated from the alignment by the bootstrapped (1000-trial) neighbor-joining method using MEGA (43).

RESULTS
Active Site Residues Involved in Substrate Specificity of FheCL1 and FheCL2-Residues that make up the S2 pocket of FheCL1 and FheCL2 were determined using the three-dimensional x-ray crystal structure of FheCL1 and the homology-based model of FheCL2, respectively (see below), and their comparison to the structure of human cathepsin L (PDB code 1CS8) and cathepsin K (PDB code 1ATK) is shown in Table 2 (see also Fig. 1 and Fig. 7A; papain numbering is used). Most variation between papain cysteine proteases occurs at residues 67 and 205, and studies with human cathepsin L and cathepsin K demonstrated that the differences at residues 67 (Leu and Tyr, respectively) and 205 (Ala and Leu, respectively) underlie the striking difference in the substrate specificity of these two enzymes; for example, human cathepsin L exhibits a broad specificity and favors both aromatic and aliphatic P2 residues but will not accept proline, whereas cathepsin K prefers only aliphatic residues and most particularly proline. Indeed, the acceptance of a P2 Pro residue confers on cathepsin K its unique ability to cleave native type I and type II collagens, proteins that contain repeated Gly-Pro-X motifs (44). Like human cathepsin L, FheCL1 possesses a Leu at position 67; however, unlike cathepsin L it possesses a Leu at position 205 rather than an Ala. The Leu at position 205 is similar to cathepsin K, and thus, FheCL1 exhibits hybrid character in the S2 subsite. By contrast, FheCL2 possesses a Tyr at position 67 and Leu at position 205 and hence is identical to cathepsin K at both sites. It has been suggested by us (2) and others (44,45) that the accommodation of Pro in the P2 position of peptide substrates by FheCL2 may be related to the presence of the Tyr 67, analogous to the cathepsin K scenario.
To address the relationship between the residues present at positions 67 and 205 and the substrate specificity and function of FheCL1 and FheCL2, we prepared variants of FheCL1 as shown in Table 2. The FheCL1 L67Y variant has a single amino acid change making the S2 subsite similar to FheCL2 and cathepsin K at positions 67 and 205. The FheCL1 L205A variant has a single amino acid change that was designed to make the S2 subsite similar to human cathepsin L. The wild type and variant F. hepatica cathepsin L peptidases were recombinantly expressed in the methylotrophic yeast P. pastoris, purified, and activated as described under "Experimental Procedures." All enzymes were expressed as 37-kDa zymogens that autocatalytically processed at pH 4.5 to produce 24.5-kDa mature enzymes, which was confirmed by N-terminal sequencing (Fig. 2). Enzymatic assays showed that none of the substitutions made in the S2 subsite of the FheCL1 active site altered its pH profile for activity against the fluorogenic substrate Z-Phe-Arg-NHMec; both the wild type and variant free enzymes exhibited a Gaussian bell-shaped pH profile with an optimum for activity in the region pH 6.5 to 7.0 (pKI = 3.87 ± 0.07 and pKII = 8.14 ± 0.08).
Substrate Specificity Profiling Using a PS-SCL Reveals Unique and Distinct Activities of FheCL1 and FheCL2-Wild type FheCL1 and FheCL2 exhibited similar preferences for amino acids at P1. As expected for papain-like cysteine proteases, both enzymes had a clear preference for Arg at P1, but other residues accommodated in this position included Lys, Glu, Thr, and Met (Fig. 3, P1 panel), and these were all cleaved at similar relative rates to that observed for human cathepsin L and cathepsin K (44). Similar results were obtained for the variants FheCL1 L67Y and FheCL1 L205A, as expected because the introduced substitutions do not affect the S1 active site pocket (not shown).
A P1-Arg fixed library was then used to explore P2-P4 specificities of FheCL1 and FheCL2. The enzymes show a distinct preference for hydrophobic amino acids in the P2; both favored Leu. Interestingly, however, the positional scanning method did not identify Phe as a suitable P2 residue even though our kinetic studies demonstrate that both FheCL1 and FheCL2, like other papain cysteine proteases, cleave fluorogenic substrates with a P2 Phe efficiently (see Table 3). The most striking observation was the distinct preference for Pro residues by FheCL2, particularly when compared with FheCL1 that did not accommodate this residue (Fig. 3, P2 panel). The unusual preference for a P2 Pro exhibited by FheCL2 is similar to that observed for human cathepsin K using the same methodology (44). However, whereas human cathepsin K favored equally Ile and Leu at P2 (44), both Fasciola cathepsins were more similar to human cathepsin L by preferring Leu over Ile.
The replacement of Leu by Ala at residue 205 (FheCL1 L205A) markedly altered the activity profile relative to the wild type enzyme (Fig. 4). This variant exhibited a broader substrate specificity by accepting Phe, Trp, and Tyr at P2, residues that were not accepted by wild type FheCL1. The same residues are also accommodated by human cathepsin L (44), thus demonstrating that the replacement of Leu 205 by Ala in FheCL1 generates an enzyme more similar to the human orthologue. By contrast, the FheCL1 L67Y variant did not show a significant change in P2 preference relative to wild type FheCL1; in particular, this substitution did not alter the activity of the enzyme toward Pro in the P2 position (Fig. 4). This was a surprising result as we expected that the FheCL1 L67Y variant would behave similarly to FheCL2 and cathepsin K given that the residues at positions 67 and 205 were identical.
As anticipated, the P3 and P4 specificities for FheCL1 and FheCL2 were similar, and like human cathepsin L and cathepsin K, the Fasciola enzymes accepted a broad range of residues in these positions. The P3-P4 specificity of FheCL1 was unaffected by the P2 substitutions present in the variant proteases (not shown).

Wild Type and Variant Protease Specificities against Fluorogenic Peptide Substrates Correlate with Residues at Positions 67 and 205-To support and extend the data derived from the positional scanning libraries, and to determine substrate kinetic parameters (Km, kcat, and kcat/Km) for wild type FheCL1, the variants FheCL1 L67Y and FheCL1 L205A, and wild type FheCL2, we examined their hydrolytic activity against various fluorogenic di- and tripeptides (Table 3). FheCL1 efficiently cleaved both Z-Phe-Arg-NHMec (kcat/Km = 1,021,092 M−1 s−1) and Z-Leu-Arg-NHMec (kcat/Km = 8,395,402 M−1 s−1); the enzyme cleaved the latter substrate over eight times more rapidly, largely because its Km value for this substrate is much lower than for the former substrate. Although the substrates Z-Pro-Arg-NHMec (kcat/Km = 5,387 M−1 s−1), Tos-Gly-Pro-Arg-NHMec (kcat/Km = 35,928 M−1 s−1), and Boc-Ala-Gly-Pro-Arg-NHMec (kcat/Km = 46,021 M−1 s−1) were cleaved relatively poorly, the data nevertheless indicate that FheCL1 can accommodate proline residues in the P2 position.
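The relative efficiencies quoted for FheCL1 can be checked with a short Python calculation. The kcat/Km values are those given in the text for wild type FheCL1; the dictionary and function names are illustrative only.

```python
# Fold differences between FheCL1 specificity constants (kcat/Km, M^-1 s^-1).
# Values are taken from the text; names are illustrative assumptions.

KCAT_KM_FHECL1 = {
    "Z-Phe-Arg-NHMec": 1_021_092,
    "Z-Leu-Arg-NHMec": 8_395_402,
    "Z-Pro-Arg-NHMec": 5_387,
    "Tos-Gly-Pro-Arg-NHMec": 35_928,
    "Boc-Ala-Gly-Pro-Arg-NHMec": 46_021,
}


def fold_difference(substrate_a, substrate_b, table=KCAT_KM_FHECL1):
    """Ratio of kcat/Km for substrate_a over substrate_b."""
    return table[substrate_a] / table[substrate_b]


best_substrate = max(KCAT_KM_FHECL1, key=KCAT_KM_FHECL1.get)
leu_vs_phe = fold_difference("Z-Leu-Arg-NHMec", "Z-Phe-Arg-NHMec")
phe_vs_pro = fold_difference("Z-Phe-Arg-NHMec", "Z-Pro-Arg-NHMec")

print(f"Preferred substrate: {best_substrate}")
print(f"Leu/Phe at P2: {leu_vs_phe:.1f}-fold")
print(f"Phe/Pro at P2: {phe_vs_pro:.0f}-fold")
```

The Leu/Phe ratio of about 8.2 matches the "over eight times" stated above, and the roughly 190-fold gap between P2 Phe and P2 Pro in the dipeptide series quantifies how poorly FheCL1 accepts proline at this position.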
In comparison with FheCL1, FheCL2 is much less efficient at cleaving substrates with Phe and Leu in the P2 position; the kcat/Km values for Z-Phe-Arg-NHMec and Z-Leu-Arg-NHMec with this enzyme are 24- and 7-fold lower than for FheCL1, respectively. However, its ability to cleave Z-Pro-Arg-NHMec, Tos-Gly-Pro-Arg-NHMec, and Boc-Ala-Gly-Pro-Arg-NHMec is 6-, 2-, and 2-fold higher than that of FheCL1 (Table 3). The kinetic data show that the S2 subsite of FheCL2 is able to accommodate proline residues more readily than the S2 subsite of FheCL1, in agreement with the data obtained by PS-SCL (Figs. 3 and 4). Substitution at the 205 position of FheCL1 to generate variant FheCL1 L205A had a significant impact on the substrate specificity of the enzyme by increasing its ability to cleave Z-Phe-Arg-NHMec, while reducing its effectiveness on Z-Leu-Arg-NHMec, Z-Pro-Arg-NHMec, Tos-Gly-Pro-Arg-NHMec, and Boc-Ala-Gly-Pro-Arg-NHMec (Table 3). Substitution at position 67 to give the variant FheCL1 L67Y reduced the efficiency of the enzyme for both Z-Phe-Arg-NHMec and Z-Leu-Arg-NHMec about 2-fold, which was reflected in a reduction of both kcat and Km values for each substrate. This substitution did not significantly alter the specificity of the enzyme for the substrate Z-Pro-Arg-NHMec or Tos-Gly-Pro-Arg-NHMec, although it almost doubled its efficiency on Boc-Ala-Gly-Pro-Arg-NHMec (Table 3).
Kinetic Analyses of Wild Type and Variant Proteases with Specific Inhibitors-Peptidyl diazomethyl ketones are irreversible inhibitors of cysteine proteases (46). Changes in rates of inactivation by these inhibitors have highlighted different specificities at subsites of cysteine proteases such as cathepsin L and cathepsin B (47). In this study, rates of inactivation of FheCL1, FheCL1 L205A, FheCL1 L67Y, and FheCL2 by the cathepsin inhibitor Z-Phe-Ala-CHN2 have been measured. Wild type FheCL1 and FheCL2 had second-order rate constants of 20,838 and 11,899 M−1 s−1, respectively, showing that both enzymes were rapidly inactivated by Z-Phe-Ala-CHN2 (Table 4). The 2-fold greater rate of inactivation of FheCL1 compared with FheCL2 is further evidence that FheCL1 accommodates hydrophobic P2 residues more effectively than FheCL2.
The rate of inactivation of FheCL1 L205A was 24-fold greater than wild type, indicating that Ala at residue 205 in the S2 subsite binds a P2 Phe more effectively than a Leu, which is consistent with our substrate kinetics studies (Table 3) and data derived from our tetrameric peptide library. These data highlighted further the major impact that Ala at position 205 has on binding P2 residues, and it is interesting to note that the second-order rate constant of 492,727 M−1 s−1 (Table 4) for the inactivation of FheCL1 L205A by Z-Phe-Ala-CHN2 is similar to the value of 660,000 M−1 s−1 for the inactivation of mammalian cathepsin L by the same inhibitor (47,48). By contrast, the FheCL1 L67Y variant had a kobs/[I] value of 53,704 M−1 s−1 (Table 4), which is only 2.5-fold higher than wild type FheCL1 and 5-fold greater than wild type FheCL2; therefore, this substitution has not had such a major influence on the binding of Phe in the S2 pocket.
The inhibitor known as cathepsin K Inhibitor II (Z-LNHNH-CONHNHLF-Boc, CKII) is a potent time-dependent inhibitor [...] (Tables 4 and 5). These values are similar to the value of 590,000 M−1 s−1 reported for the inactivation of cathepsin K by Wang et al. (49). The data are consistent with the kinetic data for hydrolysis of peptidyl fluorogenic substrates as both enzymes had highest kcat/Km values for Z-Leu-Arg-NHMec.
The rate of inactivation of FheCL1 L205A by cathepsin K inhibitor II was 7-fold lower than wild type. The Ki(app) increased 11-fold, demonstrating that this variant cannot accommodate leucine in the S2 subsite to the same extent as wild type FheCL1. Because the rate of inactivation of human cathepsin L by cathepsin K Inhibitor II was 53-fold lower than that for human cathepsin K (Table 5), these data indicate that the FheCL1 L205A variant has S2 specificity more characteristic of human cathepsin L. The rate of inactivation of FheCL1 L67Y by cathepsin K inhibitor II was 3.5-fold lower than wild type, although the Ki(app) did not change significantly, again showing that the Tyr substitution at this position exerts a relatively lower effect on P2 binding.
Wild Type FheCL2 but Not Wild Type FheCL1 or Its Variants Cleaves Native Collagen Type 1-FheCL1 and FheCL2 degraded type 1 collagen at pH 4.0 and 5.5 in reactions held at 28°C, but the activity of FheCL1 was much less and was limited to the β and γ chains, whereas the α1 and α2 chains remained intact. Moreover, whereas FheCL1 produced clear degradation fragments, FheCL2 degraded the collagen completely, particularly at pH 4.0, indicating that only the latter cleaves efficiently within the helical structures (Fig. 5A). Because low pH may cause some structural unraveling of the collagen, additional studies were performed at neutral pH. FheCL1 and FheCL2 both exhibit optimum activity against fluorogenic substrates in the neutral pH range. However, FheCL1 exhibited minimal activity against type 1 collagen in PBS, pH 7.3, whereas FheCL2 cleaved within all collagen chains (Fig. 5B).
Like the wild type FheCL1 enzyme, FheCL1 L205A and FheCL1 L67Y variants cleaved collagen but were unable to cleave within the tightly wound helices. Although in all experiments FheCL1 L205A appeared to cleave collagen more efficiently than the wild type enzyme, the pattern of digested fragments was similar (Fig. 5B). Nevertheless, the greater efficiency of cleavage of collagen is consistent with this variant's enhanced activity against fluorogenic substrates (Table 3). The inability of the FheCL1 L67Y variant to cleave within the helices of collagen is also consistent with the PS-SCL studies and substrate kinetics studies, because this enzyme did not show any increase in preference for P2 Pro compared with wild type FheCL1.
The FheproCL1Gly 25 Structure in Comparison to Other Clan CA Cysteine Proteases-The experimentally determined structure of FheproCL1Gly 25 is quite similar to that of previously described mammalian cathepsins. Although the x-ray crystal structure of FheCL1 presented here is that of an inactive zymogen mutant in which active site Cys 25 has been mutated to glycine, the remainder of the active site machinery is intact, and the key specificity determinant, the S2 pocket, has not been altered.
The molecule, FheproCL1Gly 25 , is similar in tertiary structure to human cathepsin L1. Electron density is clear, connected, and easily traceable for the entirety of the main chain of the mature domain of FheproCL1Gly 25 . The mature domain is bi-lobed, with a substrate-binding cleft running between the two lobes of the enzyme, which is characteristic of the papain superfamily of cysteine proteases (Fig. 6A). With the exception of the mutated catalytic cysteine at position 25, the expected catalytic machinery, highlighted in pink in Fig. 6B, is present in the area of the substrate-binding cleft. The left-hand lobe of the mature domain (Fig. 6A) is predominantly helical in composition. The second domain contains several elements of β-sheet. Similar to other members of the papain superfamily of [...] shape similarity that exists among the family of papain-like cysteine proteases (1). The prosegment of FheproCL1 Gly 25 folds in a manner very similar to human cathepsin L, as indicated by the value given for superimposition above, although it does show more divergence than is observed in the mature domain. In general, there is a globular region and an extended C-terminal portion, as illustrated in Fig. 6B, that connects the prosegment to the mature domain. As was described for human procathepsin L, the globular portion of the prosegment is fairly well structured and is comprised of distinct components of helix and β-strand (50). One notable difference from the human zymogen structure is as follows. In FheproCL1Gly 25 , a stretch of β-strand extends from 79P through 84P, which is then followed by a very short helical turn from 85P to 88P. In the human enzyme, this final helical turn is absent, and this segment as well as the remainder of the prodomain is made up of β-strand only. The final visible residues of FheproCL1Gly 25 , 89P through 96P, are β-strand.
It should be noted that the two species show the greatest structural divergence from residues 85P to 96P, with the chains carving a somewhat different path through three-dimensional space. The final four residues (97P, 98P, 99P, and 100P) of the prosegment of FheproCL1Gly 25 are not visible in experimental electron density, which suggests that they are disordered and subject to motion within the crystal. Most of the prosegment of FheproCL1Gly 25 sits adjacent to one side of the mature domain, in the region of a loop that extends from approximately residues 138-155 of the mature domain. The corresponding area of contact in the prosegment is residues 55P through 68P. The extended C-terminal tether of the prosegment that links the two domains lies across the active site cleft of the mature domain (Fig. 6B).
Significant Differences Exist in the Active Site Clefts of FheCL1 and FheCL2-The composition of the active site cleft, particularly the deep and well defined S2 pocket, is a key determinant of the substrate specificity of the papain family of cysteine proteases (44). The ability to accept or exclude particular substrate moieties is highly dependent upon the size, shape, and volume of the available pocket, as well as the presence or absence of stabilizing interactions such as charge-charge pairs, hydrogen bonding, and hydrophobic interactions. The S2 pocket in FheproCL1Gly 25 is lined with several residues that extend into the active site space. These include Leu 67, Met 68, Ala 133, Val 157, Ala 160, and Leu 205 (Table 2 and Fig. 7A). Leu 67 and Val 157 are situated at the entrance to the pocket and act as "gatekeepers." Met 68, Ala 133, and Ala 160 sit below them, deeper into the pocket, whereas Leu 205 lines the floor of the pocket (illustrated in Fig. 7B). Sequence alignment and homology-based modeling of FheCL2 place the following residues within the S2 pocket: Tyr 67, Met 68, Ala 133, Leu 157, Ala 160, and Leu 205 (Table 3), all at locations corresponding to those observed in the structure of FheproCL1Gly 25. The differences between the S2 pockets in these similar enzymes include the presence of the dramatically larger Tyr in the "gatekeeping" position 67 at the opening of the S2 pocket, and the somewhat larger Leu 157 in the opposing position at the entrance to the pocket. Although tyrosine is much larger and bulkier than leucine, it is conformationally able to rotate somewhat freely based on the availability of an unrestrained torsion angle about the Cα-Cβ bond, and its presence at the top of the pocket does not necessarily preclude the entry of P2 substrate residues (Fig. 7).
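The comparison of the two S2 pockets can be restated as a small position-by-position check. The sketch below is only a bookkeeping aid: the residue assignments are those listed in the text, whereas the dictionary and function names are hypothetical and introduced here for illustration.

```python
# S2 pocket residues as listed in the text (position -> residue).
# FheCL1 values come from the crystal structure; FheCL2 values come from
# sequence alignment and homology-based modeling.
S2_FHECL1 = {67: "Leu", 68: "Met", 133: "Ala", 157: "Val", 160: "Ala", 205: "Leu"}
S2_FHECL2 = {67: "Tyr", 68: "Met", 133: "Ala", 157: "Leu", 160: "Ala", 205: "Leu"}


def pocket_differences(a, b):
    """Return {position: (residue_in_a, residue_in_b)} for positions that differ."""
    return {pos: (a[pos], b[pos]) for pos in sorted(a) if a[pos] != b[pos]}


differences = pocket_differences(S2_FHECL1, S2_FHECL2)
for pos, (res1, res2) in differences.items():
    print(f"position {pos}: FheCL1 {res1}, FheCL2 {res2}")
```

Only the two gatekeeping positions, 67 and 157, differ in this listing; the residues deeper in the pocket (68, 133, 160, and 205) are the same in both enzymes.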
In one of the variants of FheCL1 constructed for this study (FheCL1 L67Y), a substitution of Tyr was made for Leu 67 at the opening of the pocket, which renders the entrance to the S2 pocket somewhat more similar to that found in FheCL2 and human cathepsin K (Table 2). Human cathepsin K is similar to the model constructed for FheCL2, sharing the presence of a larger Tyr residue at the entrance to the pocket. In the structure of human cathepsin K (PDB code 1ATK), the Tyr residue does not preclude access to the pocket and is positioned such that an inhibitor (E-64) is able to bind with a P2 Leu-like moiety just within the top of the S2 pocket. In the second FheCL1 variant constructed (FheCL1 L205A), a substitution of Ala was made in the Leu 205 position at the base of the pocket that changed this site to be similar to that found in human cathepsin L (Table 2). As mentioned above, the overall structure of the human enzyme is very similar to FheCL1 (50).

DISCUSSION
Substrate Specificity of FheCL1 and FheCL2-A comparison of the substrate specificity between the F. hepatica cathepsin L peptidases (wild type and variants) and human cathepsin L and cathepsin K is shown in Fig. 8, and helps to summarize the findings of our substrate specificity analyses and the effect that active site substitutions have on it. First, it is clear that both FheCL1 and FheCL2 are similar to cathepsin K with regard to their preference for a P2 Leu over Phe. Second, both enzymes can accommodate Pro in the P2 position, but this is more readily accepted by FheCL2 compared with FheCL1; neither enzyme, however, cleaves substrates with this residue in the P2 position as readily as human cathepsin K. Third, substituting Tyr for Leu at residue 67 (variant FheCL1 L67Y) to make the S2 subsite of FheCL1 more like that of human cathepsin K did not significantly enhance its ability to cleave substrates with Pro in the P2 position; this was confirmed using three fluorogenic substrates as shown in Table, and by PS-SCL as shown in Fig. 4. Finally, substitution of Leu 205 with Ala (variant FheCL1 L205A) increased the relative activity of the peptidase for substrates with Phe in the P2 position, but this increase was not sufficiently dramatic to reverse its preference for Leu over Phe, as is observed for human cathepsin L; thus, compared with wild type FheCL1, FheCL1 L205A is more similar to human cathepsin L but is not identical in its substrate specificity.
Our results using inhibitors are consistent with our data derived from the PS-SCL and substrate specificity studies. FheCL1 accommodates hydrophobic P2 residues of diazomethyl ketone Z-Phe-Ala-CHN 2 more effectively than FheCL2, and therefore its inhibition by this reagent was 2-fold greater. Replacement of the Leu 205 by Ala, however, created an S2 pocket that accepted the P2 Phe more readily, and hence the inhibitory constant for Z-Phe-Ala-CHN 2 against the FheCL1 L205A variant was 24-fold greater than for the wild type enzyme. The cathepsin K inhibitor II, on the other hand, was 20 times more potent than Z-Phe-Ala-CHN 2 against both FheCL1 and FheCL2 and exhibited similar kinetics to that reported for human cathepsin K by Wang et al. (49). By contrast, it was 7-fold less potent against the FheCL1 L205A variant, which is interesting because it is 53-fold less effective against human cathepsin L, which possesses an Ala at position 205, compared with cathepsin K (49). Similar to our observations for substrate binding, the replacement of Leu 67 with Tyr did not have a dramatic effect on the binding of either inhibitor.
Differences in the substrate specificities of human cathepsin L and cathepsin K can be exquisitely demonstrated using collagen type 1 as a substrate. Because of its acceptance of a P2 proline, cathepsin K can completely degrade collagen by cleaving within the repeated Gly-Pro-Xaa motif in the helices of the tightly wound triple helical structure. Human cathepsin L, on the other hand, cleaves within the nonhelical telopeptide regions but does not possess intrahelical activity (24,44). Although both FheCL1 and FheCL2 could cleave native collagen, only FheCL2 cleaved this substrate within the helical structures. Most strikingly, FheCL2 cleaved native collagen even at neutral pH, which suggests that the enzyme could perform this function in vivo to facilitate parasite tissue migration. Collectively, these results support the idea that the ability of FheCL2 to accommodate proline in the P2 position of substrates confers on the enzyme a collagenase-like activity, similar to that observed for cathepsin K. Although FheCL1 exhibited low activity against fluorogenic substrates with a P2 Pro, this was insufficient to endow this enzyme with the ability to cleave the helices within native collagen. Replacement of Leu 67 in FheCL1 with Tyr (FheCL1 L67Y) to make the S2 subsite of this enzyme similar to FheCL2 and cathepsin K did not enhance its ability to accept substrates with Pro residues in the P2 position, nor did it confer collagenase-like activity on the enzyme, suggesting that other S2 residue(s) besides that at position 67 are essential for this activity (see below).
The Amino Acid at Position 205 Lies at the Bottom of the S2 Pocket and Has a Major Impact on Substrate Specificity-Our PS-SCL, substrate binding, and inhibitor studies showed that the Leu at position 205 is indeed a determinant of substrate turnover and inhibitor specificity in FheCL1. The wild type enzyme is able to cleave substrates with Phe in the P2 position; however, substrates with the much smaller Leu moiety in the P2 position were cleaved more than 8-fold more rapidly. In comparison with the wild type FheCL1, replacing the Leu with an Ala (FheCL1 L205A) not only enhanced the ability of the enzyme to accept P2 Phe much more readily, but broadened the overall substrate specificity of the enzyme such that larger P2 residues such as Tyr and Trp were also accepted (see Table 3 and Fig. 4).
It has been observed in another papain family member, cruzain, from the protozoan parasite Trypanosoma cruzi, that the character (i.e. size, charge, and torsion-based flexibility) of the residue Glu 205 is crucial for determining which P2 residues can be accommodated in the S2 pocket (51). Examination of structures of cruzain bound to inhibitors containing Phe at P2 shows that this residue is flexible and can rotate in or out of the pocket depending upon the substrate residue entering the pocket (51). In several inhibitor-bound structures of cruzain where the P2 residue is a Phe, Glu 205 is swung out into the solvent to accommodate the size of the phenylalanine (52). In FheCL1, however, the Leu at position 205 is shorter by one carbon-carbon bond than Glu and therefore cannot rotate its C-δ1 and C-δ2 out of the pocket. Although the Leu residue of FheCL1 is shorter than cruzain's Glu by ~1.5 Å, it is conformationally flexible. It can therefore position itself to make the most space possible available to an incoming P2 Phe. Nonetheless, it clearly prefers the smaller Leu. The side chain of the wild type Leu, as determined in the x-ray crystal structure of FheproCL1Gly 25, extends 3.98 Å from the bottom of the pocket, filling a volume of 166.7 Å³ (53) and exposing a surface area of 170 Å² (54). The observed broader specificity of FheCL1 L205A, i.e. the ability to minimally accommodate Tyr and Trp, can thus be understood by comparing the space available in the base of the S2 pocket. An Ala variant at 205 would extend only as far as the C-β of Leu, or to a distance of 1.51 Å and a corresponding volume of 88.6 Å³ and surface area of 115 Å², leaving considerable additional space for larger substrate peptide residues to be accommodated.
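The size of the change at position 205 can be made explicit with simple subtraction of the figures quoted above. The following is a minimal sketch, assuming that plain differences between the quoted Leu and Ala values are a fair first approximation of the space released at the base of the pocket; it is not a calculation of actual pocket volume, and the names used are hypothetical.

```python
# Side-chain values quoted in the text for position 205.
# Units: extension in Å, volume in Å^3, exposed surface area in Å^2.
LEU_205 = {"extension": 3.98, "volume": 166.7, "surface": 170.0}
ALA_205 = {"extension": 1.51, "volume": 88.6, "surface": 115.0}


def freed_space(wild_type, variant):
    """Difference (wild type minus variant) for each quoted quantity."""
    return {key: round(wild_type[key] - variant[key], 2) for key in wild_type}


gain = freed_space(LEU_205, ALA_205)
print(gain)
```

On these numbers, the L205A substitution shortens the side chain by about 2.47 Å and reduces its volume by about 78.1 Å³ and its exposed surface by about 55 Å², which is the additional room invoked to explain the acceptance of Phe, Tyr, and Trp at P2.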
Ability to Cleave Proline at P2 Is Influenced by Residue 67 and the Surrounding Gatekeeper Residues-By engineering a variant cathepsin K with a Leu replacing the Tyr at position 67, Lecaille et al. (44) demonstrated an important role for Tyr 67 in determining the P2 Pro activity of cathepsin K and in its unique collagen cleaving activity. In this study, we found that replacing the Leu 67 of FheCL1 with a Tyr to engineer the S2 pocket of the active site of this enzyme to mimic FheCL2 and human cathepsin K did not significantly alter its S2 subsite specificity and, most particularly, did not enhance the ability of the enzyme to accept a P2 Pro. The three-dimensional structure of FheCL1 was therefore analyzed and compared with cathepsin K to explain these observations and to determine what additional factors within the S2 pocket of FheCL1 may influence the acceptance of a P2 Pro residue (Fig. 7).
The S2 pockets of human cathepsin K and FheCL2 are very similar, as evidenced by our modeling data and that presented in Table 2; however, there are some noteworthy differences within a 5 Å radius of this site. First, residue 133 is an alanine in the FheCL2 enzyme, but is a serine in human cathepsin K. The impact of this difference on substrate preference may be minimal, however, because this residue is more peripheral to the outer edge of the base of the S2 pocket (Fig. 7). More significant is residue 158, which is adjacent to the gatekeeping residue 157 and sits just above the upper lip of the pocket. Based on its position, this residue, which is Asn in cathepsin K and Thr in FheCL2, appears to have some secondary influence on accessibility of the opening of the pocket. In the published structure of human cathepsin K (PDB code 1ATK), Asn 158 is swung out of the way of the entrance to the pocket and does not preclude the entrance of any incoming P2 moiety. However, the Thr 158 of FheCL2, which is shorter by one carbon than asparagine, cannot move completely out of the way and would possibly have either a carbon atom or an oxygen atom pointing in toward the top of the S2 opening. Based on spatial constraints, a proline would be accommodated in the S2 area of FheCL2, but this would not be as readily accepted as in human cathepsin K. Our analysis suggests, however, that acceptance of proline in this pocket is not achieved by offering a topology of easy deep penetration access but rather by providing opportunities for stabilizing interactions with the 5-membered proline ring of the substrate at the entrance to the pocket, and that such stabilization involves interactions between the aromatic ring of Tyr 67 and the P2 proline. The location and positioning of Leu 157 in the structure of human cathepsin K and FheCL2 suggest its availability to further stabilize the presence of a P2 proline, perhaps with constructive aliphatic interactions.
By comparison, there are greater differences in the gatekeeping positions at the top entrance to the S2 pocket of FheCL1 compared with FheCL2 and cathepsin K (Fig. 7). Residue 67 is the smaller Leu in FheCL1, and its terminal carbons, C-δ1 and C-δ2, extend only as far as the corresponding C-δ1 and C-δ2 of Tyr, which stretches its terminal oxygen nearly 3.7 Å further from the protein main chain along the edge of the pocket. On the opposing side of the pocket entrance, residue 158 of FheCL1 is Asn 158, as in human cathepsin K, and is swung away from the entrance to the pocket, offering unimpeded access. However, residue 157 is a Val in FheCL1 and is one carbon shorter than the Leu found in FheCL2 and human cathepsin K, and accordingly its extension into the pocket is ~1.5 Å less; this does not allow it to extend far enough into the available space to participate in aliphatic interactions. Thus, the absence of both stabilizing Tyr and Leu residues would account for the reduced preference for P2 proline by FheCL1. On the other hand, the composition of the S2 pocket in FheCL1, being more open and accessible to deeper penetration, would more readily favor processing of longer amino acid moieties, such as Leu and Phe, as we have observed.
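The gatekeeper argument can be tabulated for the enzymes discussed. The sketch below encodes only the residue identities stated in the text, with the FheCL1 L67Y entry derived by applying the single substitution to wild type FheCL1. The rule it applies (Tyr 67 together with Leu 157) is a restatement of the hypothesis proposed here, not a validated predictor, and all names are hypothetical.

```python
# Gatekeeper residues at the entrance of the S2 pocket (position -> residue),
# as stated in the text. FheCL1 L67Y differs from FheCL1 only at position 67.
GATEKEEPERS = {
    "FheCL1": {67: "Leu", 157: "Val", 158: "Asn"},
    "FheCL1 L67Y": {67: "Tyr", 157: "Val", 158: "Asn"},
    "FheCL2": {67: "Tyr", 157: "Leu", 158: "Thr"},
    "human cathepsin K": {67: "Tyr", 157: "Leu", 158: "Asn"},
}


def has_stabilizing_pair(residues):
    """True when both residues proposed to stabilize a P2 proline are present."""
    return residues[67] == "Tyr" and residues[157] == "Leu"


summary = {name: has_stabilizing_pair(res) for name, res in GATEKEEPERS.items()}
for name, flag in summary.items():
    print(f"{name}: Tyr 67 + Leu 157 present = {flag}")
```

Under this rule, only FheCL2 and human cathepsin K carry both residues, while FheCL1 L67Y still lacks Leu 157, which is consistent with the observation that the single substitution did not enhance acceptance of a P2 Pro.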
In summary, previous studies with human cathepsins K and L have shown that residues at positions 67 and 205 are essential in dictating substrate specificity (44,55). Much attention has been given to the importance of Tyr 67 in conferring on cathepsin K the ability to accept P2 Pro residues in the corresponding S2 subsite of the enzyme and the capacity to degrade native collagens. However, this study using FheCL1 and a recent study by Lecaille et al. (56) using human cathepsin L show that mutations that replace Leu 67 with Tyr 67 in these enzymes are not sufficient alone to accommodate proline and thus to endow collagenolytic activity. Therefore, other residues at the opening of the active site pocket, namely the gatekeeper residues identified here that occupy sites 157 and 158, combine with Tyr 67 to generate these specialized properties, and hence, we have set the groundwork for future mutational studies. It is important to note that glycosaminoglycans such as chondroitin sulfate are known to enhance the collagenolytic activity of human cathepsin K by binding to a site other than the active site (57). However, these do not influence the activity of FheCL2 (data not shown), which points to further intriguing differences between the parasite and mammalian enzymes.
Given that collagen is a major interstitial matrix protein that is highly resistant to proteolysis, our data showing that FheCL2 can degrade native collagen within the helical regions at physiological pH would suggest that this protease enabled Fasciola spp. to become proficient tissue-degrading pathogens. It is important to note that native collagenase-like activity is restricted to very few enzymes. These include the bacterial collagenases, matrix-metalloproteinases, and cathepsin K (24), and therefore the evolution and maintenance of such an activity in Fasciola are significant. By extension, the emergence of this enzyme group may have been essential to the adaptation of the parasites to the wide variety of mammalian species they infect (13).