Crystal Structure and Activity Studies of the C11 Cysteine Peptidase from Parabacteroides merdae in the Human Gut Microbiome

Clan CD cysteine peptidases, a structurally related group of peptidases that include mammalian caspases, exhibit a wide range of important functions, along with a variety of specificities and activation mechanisms. However, for the clostripain family (denoted C11), little is currently known. Here, we describe the first crystal structure of a C11 protein from the human gut bacterium, Parabacteroides merdae (PmC11), determined to 1.7-Å resolution. PmC11 is a monomeric cysteine peptidase that comprises an extended caspase-like α/β/α sandwich and an unusual C-terminal domain. It shares core structural elements with clan CD cysteine peptidases but otherwise structurally differs from the other families in the clan. These studies also revealed a well ordered break in the polypeptide chain at Lys147, resulting in a large conformational rearrangement close to the active site. Biochemical and kinetic analysis revealed Lys147 to be an intramolecular processing site at which cleavage is required for full activation of the enzyme, suggesting an autoinhibitory mechanism for self-preservation. PmC11 has an acidic binding pocket and a preference for basic substrates, and accepts substrates with Arg and Lys in P1 and does not require Ca2+ for activity. Collectively, these data provide insights into the mechanism and activity of PmC11 and a detailed framework for studies on C11 peptidases from other phylogenetic kingdoms.

Clan CD enzymes have a highly conserved His/Cys catalytic dyad and exhibit strict specificity for the P1 residue of their substrates. However, despite these similarities, clan CD forms a functionally diverse group of enzymes: the overall structural diversity between (and at times within) the various families provides these peptidases with a wide variety of substrate specificities and activation mechanisms. Several members are initially expressed as proenzymes, demonstrating self-inhibition prior to full activation (2).
The archetypal and arguably most notable family in the clan is that of the mammalian caspases (C14a), although clan CD members are distributed throughout the phylogenetic kingdoms and are often required in fundamental biological processes (2). Interestingly, little is known about the structure or function of the C11 proteins, despite their widespread distribution (1) and the fact that the family's archetypal member, clostripain from Clostridium histolyticum, was first reported in the literature in 1938 (9). Clostripain has been described as an arginine-specific peptidase with a requirement for Ca2+ (10) and loss of an internal nonapeptide for full activation; lack of structural information on the family appears to have prohibited further investigation.
As part of an ongoing project to characterize commensal bacteria in the microbiome that inhabit the human gut, the structure of a C11 peptidase, PmC11, from Parabacteroides merdae was determined using the Joint Center for Structural Genomics (JCSG) HTP structural biology pipeline (11). The structure was analyzed, and the enzyme was biochemically characterized to provide the first structure/function correlation for a C11 peptidase.

Experimental Procedures
Cloning, expression, purification, crystallization, and structure determination of PmC11 were carried out using standard JCSG protocols (11) as follows.
Cloning-Clones were generated using the polymerase incomplete primer extension (PIPE) cloning method (12). The gene encoding PmC11 (SP5111E) was amplified by polymerase chain reaction (PCR) from P. merdae genomic DNA using Pfu-Turbo DNA polymerase (Stratagene) and I-PIPE primers that included sequences for the predicted 5′ and 3′ ends (shown below). The expression vector, pSpeedET, which encodes an amino-terminal tobacco etch virus protease-cleavable expression and purification tag (MGSDKIHHHHHHENLYFQ/G), was PCR amplified with V-PIPE (Vector) primers. V-PIPE and I-PIPE PCR products were mixed to anneal the amplified DNA fragments together. Escherichia coli GeneHogs (Invitrogen) competent cells were transformed with the I-PIPE/V-PIPE mixture and dispensed on selective LB-agar plates. The cloning junctions were confirmed by DNA sequencing. The plasmid encoding the full-length protein was deposited in the PSI:Biology Materials Repository at the DNASU plasmid repository (PmCD00547516). For structure determination, to obtain soluble protein using the PIPE method, the gene segment encoding residues Met1-Asn22 was deleted because these residues were predicted to correspond to a signal peptide using SignalP (13).
Protein Expression and Selenomethionine Incorporation-The expression plasmid for the truncated PmC11 construct was transformed into E. coli GeneHogs competent cells and grown in minimal media supplemented with selenomethionine and 30 µg ml⁻¹ of kanamycin at 37°C using a GNF fermentor (14). A methionine auxotrophic strain was not required as selenomethionine is incorporated via the inhibition of methionine biosynthesis (15,16). Protein expression was induced using 0.1% (w/v) L-arabinose and the cells were left to grow for a further 3 h at 37°C. At the end of the cell culture, lysozyme was added to all samples to a final concentration of 250 µg ml⁻¹ and the cells were harvested and stored at −20°C until required.
Protein Purification for Crystallization-Cells were resuspended, homogenized, and lysed by sonication in 40 mM Tris (pH 8.0), 300 mM NaCl, 10 mM imidazole, and 1 mM Tris(2-carboxyethyl)phosphine hydrochloride (TCEP) (Lysis Buffer 1) containing 0.4 mM MgSO4 and 1 µl of 250 units µl⁻¹ benzonase (Sigma). The cell lysate was then clarified by centrifugation (32,500 × g for 25 min at 4°C) before being passed over Ni2+-chelating resin equilibrated in Lysis Buffer 1 and washed in the same buffer supplemented with 40 mM imidazole and 10% (v/v) glycerol. The protein was subsequently eluted in 20 mM Tris (pH 8.0), 150 mM NaCl, 10% (v/v) glycerol, 1 mM TCEP, and 300 mM imidazole, and the fractions containing the protein were pooled.
To remove the His tag, PmC11 was exchanged into 20 mM Tris (pH 8.0), 150 mM NaCl, 30 mM imidazole, and 1 mM TCEP using a PD-10 column (GE Healthcare), followed by incubation with 1 mg of His-tagged tobacco etch virus protease per 15 mg of protein for 2 h at room temperature and subsequent overnight incubation at 4°C. The sample was centrifuged to remove any precipitated material (13,000 × g for 10 min at 4°C) and the supernatant loaded onto Ni2+-chelating resin equilibrated with 20 mM Tris (pH 8.0), 150 mM NaCl, 30 mM imidazole, and 1 mM TCEP and washed with the same buffer. The flow-through and wash fractions were collected and concentrated to 13.3 mg ml⁻¹ using Amicon Ultra-15 5K centrifugal concentrators (Millipore).
Crystallization and Data Collection-PmC11 was crystallized using the nanodroplet vapor diffusion method using standard JCSG crystallization protocols (11). Drops comprised 200 nl of protein solution mixed with 200 nl of crystallization solution in 96-well sitting-drop plates, equilibrated against a 50-µl reservoir. Crystals of PmC11 were grown at 4°C in mother liquor consisting of 0.2 M NH4H2PO4, 20% PEG-3350 (JCSG Core Suite I). Crystals were flash cooled in liquid nitrogen using 10% ethylene glycol as a cryoprotectant prior to data collection, and initial screening for diffraction was carried out using the Stanford Automated Mounting system (17) at the Stanford Synchrotron Radiation Lightsource (SSRL, Menlo Park, CA). Single wavelength anomalous dispersion data were collected using a wavelength of 0.9793 Å at the Advanced Light Source (ALS, beamline 8.2.2, Berkeley, CA) on an ADSC Quantum 315 CCD detector. The data were indexed and integrated with XDS (18) and scaled using XSCALE (18). The diffraction data were indexed in space group P2₁ with a = 39.11, b = 108.68, c = 77.97 Å, and β = 94.32°. The asymmetric unit contained two molecules, resulting in a solvent content of 39% (Matthews' coefficient (Vm) of 2.4 Å³ Da⁻¹).
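The relationship between the cell parameters and the Matthews coefficient can be sketched as follows. This is a minimal illustration, not the calculation used by the authors: the function names are hypothetical, and the molecular mass is a labeled placeholder because the text does not state the mass of the crystallized construct. The volume formula V = a·b·c·sin β applies to a monoclinic cell, and space group P2₁ has two asymmetric units per cell.

```python
import math


def monoclinic_cell_volume(a, b, c, beta_deg):
    """Unit-cell volume in cubic angstroms for a monoclinic cell: V = a*b*c*sin(beta)."""
    return a * b * c * math.sin(math.radians(beta_deg))


def matthews_coefficient(cell_volume, molecules_per_cell, molecular_mass_da):
    """Matthews coefficient V_M = V / (Z * M), in cubic angstroms per dalton."""
    return cell_volume / (molecules_per_cell * molecular_mass_da)


# Cell parameters quoted in the text (angstroms and degrees).
volume = monoclinic_cell_volume(39.11, 108.68, 77.97, 94.32)

# P2(1) has two asymmetric units per cell; with two molecules in each,
# the cell holds four molecules.
MOLECULES_PER_CELL = 2 * 2

# ASSUMPTION: placeholder mass for illustration only, not a value from the text.
ASSUMED_MASS_DA = 40_000.0

vm = matthews_coefficient(volume, MOLECULES_PER_CELL, ASSUMED_MASS_DA)
print(f"cell volume = {volume:.0f} A^3; V_M = {vm:.2f} A^3/Da for the assumed mass")
```

The coefficient scales inversely with the mass supplied, so the printed value is illustrative only and should be recomputed with the actual mass of the crystallized construct.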
Structure Determination-The PmC11 structure was determined by the single wavelength anomalous dispersion method using an x-ray wavelength corresponding to the peak of the selenium K edge. Initial phases were derived using the autoSHARP interface (19), which included density modification with SOLOMON (20). Good quality electron density was obtained at 1.7-Å resolution, allowing an initial model to be obtained by automated model building with ARP/wARP (21). Model completion and refinement were iteratively performed with COOT (22) and REFMAC (23,24) to produce a final model with an Rcryst and Rfree of 14.3 and 17.5%, respectively. The refinement included experimental phase restraints in the form of Hendrickson-Lattman coefficients, TLS refinement with one TLS group per molecule in the asymmetric unit, and NCS restraints. The refined structure contains residues 24-375 and 28-375 for the two molecules in the crystallographic asymmetric unit. Structural validation was carried out using the JCSG Quality Control Server, which analyzes both the coordinates and data using a variety of structural validation tools to confirm the stereochemical quality of the model (ADIT (25), MOLPROBITY (26), and WHATIF 5.0 (27)) and agreement between model and data (SGCHECK (28) and RESOLVE (29)). All of the main-chain torsion angles were in the allowed regions of the Ramachandran plot and the MolProbity overall clash score for the structure was 2.09 (within the 99th percentile for its resolution). The atomic coordinates and structure factors for PmC11 have been deposited in the Protein Data Bank (PDB) with the accession code 3UWS. Data collection, model, and refinement statistics are reported in Table 1.
Structural Analysis-The primary sequence alignment with assigned secondary structure was prepared using CLUSTAL OMEGA (30) and ALINE (31). The topology diagram was produced with TOPDRAW (32) and all three-dimensional structural figures were prepared with PyMol (33) with the electrostatic surface potential calculated with APBS (18) and contoured at ±5 kT/e. Architectural comparisons with known structures revealed that PmC11 was most structurally similar to caspase-7, gingipain-K, and legumain (PDB codes 4hq0, 4tkx, and 4aw9, respectively). The statistical significance of the structural alignment between PmC11 and both caspase-7 and gingipain-K is equivalent (Z-score of 9.2) with legumain giving a very similar result (Z-score of 9.1). Of note, the β-strand topology of the CPD domains of Clostridium difficile toxin B (family C80; TcdB; PDB code 3pee) is identical to that observed in the PmC11 β-sheet, but the Z-score from DaliLite was notably lower at 7.6. It is possible that the PmC11 structure is more closely related to the C80 family than to other families in clan CD, and the two appear to reside on the same branch of the phylogenetic tree based on structure (2).
Protein Production for Biochemical Assays-The PmCD00547516 plasmid described above was obtained from the PSI:Biology Materials Repository and used to generate a cleavage site mutant PmC11 K147A and an active-site mutant PmC11 C179A using the QuikChange Site-directed Mutagenesis kit (Stratagene) as per the manufacturer's instructions using the following primers: K147A mutant (forward): CAGAATAAGCTGGCAGCGTTCGGACAGGACG, and K147A mutant (reverse): CGTCCTGTCCGAACGCTGCCAGCTTATTCTG; C179A mutant (forward): CCTGTTCGATGCCGCCTACATGGCAAGC, and C179A mutant (reverse): GCTTGCCATGTAGGCGGCATCGAACAGG. The expression plasmids containing PmC11 were transformed into E. coli BL21 Star (DE3) and grown in Luria-Bertani media containing 30 µg ml⁻¹ of kanamycin at 37°C until an optical density (600 nm) of ~0.6 was reached. L-Arabinose was added to a final concentration of 0.2% (w/v) and the cells incubated overnight at 25°C.
Compared with the protein production for crystallography, a slightly modified purification protocol was employed for biochemical assays. Initially, the cells were resuspended in 20 mM sodium phosphate (pH 7.5), 150 mM NaCl (Lysis Buffer 2) containing an EDTA-free protease inhibitor mixture (cOmplete, Roche Applied Science). Cells were disrupted by three passages (15 KPSI) through a One-Shot cell disruptor (Constant Systems) followed by centrifugation at 20,000 × g for 20 min at 4°C. The supernatant was collected and sterile-filtered (0.2 µm) before being applied to a 5-ml HisTrap HP column (GE Healthcare) equilibrated in Lysis Buffer 2 containing 25 mM imidazole, and the protein was eluted in the same buffer containing 250 mM imidazole. The peak fractions were pooled and buffer exchanged into the assay buffer (20 mM Tris, 150 mM NaCl, pH 8.0) using a PD-10 column. When required, purified PmC11 was concentrated using Vivaspin 2 30-K centrifugal concentrators (Sartorius). Protein concentration was routinely measured using Bradford's reagent (Bio-Rad) with a BSA standard.
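QuikChange-style mutagenesis uses a forward and a reverse primer that are exact reverse complements of one another. A short sanity check of that property for the two primer pairs listed above might look like the sketch below; the sequences are written as continuous strings, and the function and variable names are illustrative rather than part of any published protocol.

```python
# Translation table for Watson-Crick complementation of uppercase DNA.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def primers_are_complementary(forward, reverse):
    """True when the reverse primer is the exact reverse complement of the forward primer."""
    return reverse_complement(forward) == reverse


# (forward, reverse) primer pairs as given in the text, 5' to 3'.
PRIMERS = {
    "K147A": ("CAGAATAAGCTGGCAGCGTTCGGACAGGACG", "CGTCCTGTCCGAACGCTGCCAGCTTATTCTG"),
    "C179A": ("CCTGTTCGATGCCGCCTACATGGCAAGC", "GCTTGCCATGTAGGCGGCATCGAACAGG"),
}

for name, (forward, reverse) in PRIMERS.items():
    print(name, len(forward), "nt, complementary:", primers_are_complementary(forward, reverse))
```

A check of this kind is a quick way to catch transcription errors in primer sequences before ordering oligonucleotides.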
Fluorogenic Substrate Activity Assays-The release of the fluorescent group AMC (7-amino-4-methylcoumarin) from potential peptide substrates was used to assess the activity of PmC11. Peptidase activity was tested using 20 µg of PmC11 in a final reaction volume of 200 µl, and all samples were incubated (without substrate) at 37°C for 16 h prior to carrying out the assay. The substrate and plate reader were brought to 37°C for 20 min prior to the addition of the PmC11, and samples prepared without PmC11 were used as blanks (negative controls). The curves were plotted using the blank-corrected fluorescence units against the time of acquisition (in min). The assays were carried out in black 96-well flat-bottomed plates (Greiner). AMC fluorescence was measured using a PHERAstar FS plate reader (BMG Labtech) with excitation and emission wavelengths of 355 and 460 nm, respectively.

TABLE 1 footnotes: ... where n is the multiplicity of reflection hkl, and $I_i(hkl)$ and $\langle I(hkl) \rangle$ are the intensity of the ith measurement and the average intensity of reflection hkl, respectively (44). d, $R_{cryst}$ and $R_{free} = \sum \bigl| |F_{obs}| - |F_{calc}| \bigr| / \sum |F_{obs}|$ for reflections in the working and test sets, respectively, where $F_{obs}$ and $F_{calc}$ are the observed and calculated structure-factor amplitudes, respectively. $R_{free}$ is the same as $R_{cryst}$ but for 5% of the total reflections chosen at random and omitted from structural refinement. e, ESU is the estimated standard uncertainties of atoms. f, The average isotropic B includes TLS and residual B components. g, RMSD, root-mean-square deviation.
To investigate the substrate specificity of PmC11, substrates Z-GGR-AMC, Bz-R-AMC, Z-GP-AMC, Z-HGP-AMC, Ac-DEVD-AMC (all Bachem), BOC-VLK-AMC, and BOC-K-AMC (both PeptaNova) were prepared at 100 mM in 100% dimethyl sulfoxide. The amount of AMC (micromoles) released was calculated by generating an AMC standard curve (as described in Ref. 34) and the specific activity of PmC11 was calculated as picomoles of AMC released per min per mg of the protein preparation.
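The conversion from raw fluorescence to specific activity described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' analysis: the function names are hypothetical, the standard curve is assumed to be linear through the origin and expressed in fluorescence units per picomole of AMC, and the numbers in the example are invented to show the arithmetic.

```python
def blank_corrected(sample, blank):
    """Subtract the no-enzyme blank from each sample reading."""
    return [s - b for s, b in zip(sample, blank)]


def linear_slope(x, y):
    """Ordinary least-squares slope of y against x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def specific_activity(slope_fu_per_min, fu_per_pmol_amc, protein_mg):
    """Specific activity in pmol AMC released per min per mg of protein."""
    return slope_fu_per_min / fu_per_pmol_amc / protein_mg


# HYPOTHETICAL readings, for illustration only.
times_min = [0, 1, 2, 3]
sample_fu = [10, 62, 110, 161]
blank_fu = [10, 12, 10, 11]

rate = linear_slope(times_min, blank_corrected(sample_fu, blank_fu))  # FU per min
# ASSUMPTIONS: 2.0 FU per pmol AMC from a standard curve; 0.020 mg (20 ug) of protein.
print(specific_activity(rate, fu_per_pmol_amc=2.0, protein_mg=0.020))
```

Fitting a slope over the linear part of the progress curve, rather than taking a single end-point reading, makes the estimate less sensitive to noise in individual measurements.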
The reaction rates (Vmax) and Km values were determined for mutants PmC11 K147A and PmC11 C179A by carrying out the activity assay at varying concentrations of Bz-R-AMC between 0 and 600 µM. The blank-corrected relative fluorescence units were plotted against time (min) with ΔFU/T giving the reaction rate. The Km and Vmax of PmC11 and PmC11 K147A against an R-AMC substrate were determined from the Lineweaver-Burk plot as described (34), calculated using GraphPad Prism6 software. All experiments were carried out in triplicate.
Effect of VRPR-FMK on PmC11-To test the effect of the inhibitor on the activity of PmC11, 25 µM Z-VRPR-FMK (100 mM stock in 100% dimethyl sulfoxide, Enzo Life Sciences), 20 µg of PmC11, 100 µM R-AMC substrate, and 1 mM EGTA were prepared in the assay buffer and the activity assay carried out as described above. A gel-shift assay, to observe Z-VRPR-FMK binding to PmC11, was also set up using 20 µg of PmC11, 25 µM inhibitor, and 1 mM EGTA in assay buffer. The reactions were incubated at 37°C for 20 min before being stopped by the addition of SDS-PAGE sample buffer. Samples were analyzed by loading the reaction mixture on a 10% NuPAGE BisTris gel using MES buffer.
Effect of Cations on PmC11-The enzyme activity of PmC11 was tested in the presence of various divalent cations: Mg2+, Ca2+, Mn2+, Co2+, Fe2+, Zn2+, and Cu2+. The final concentration of the salts (MgSO4, CaCl2, MnCl2, CoCl2, FeSO4, ZnCl2, and CuSO4) was 1 mM and the control was set up without divalent ions but with addition of 1 mM EGTA. The assay was set up using 20 µg of PmC11, 1 mM salts, 100 µM R-AMC substrate, and the assay buffer, and incubated at 37°C for 16 h. The activity assay was carried out as described above.
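A Lineweaver-Burk analysis linearizes the Michaelis-Menten equation as 1/v = (Km/Vmax)(1/[S]) + 1/Vmax, so that a straight-line fit of 1/v against 1/[S] gives Vmax from the intercept and Km from the slope. The sketch below illustrates that calculation in plain Python; it is not the GraphPad Prism procedure used in the study, the function names are hypothetical, and the data are synthetic values generated from assumed constants purely to show that the fit recovers them.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def lineweaver_burk(substrate, rates):
    """Estimate (Km, Vmax) from a double-reciprocal fit.

    Points with zero substrate or zero rate are skipped because their
    reciprocals are undefined.
    """
    pairs = [(s, v) for s, v in zip(substrate, rates) if s > 0 and v > 0]
    inv_s = [1.0 / s for s, _ in pairs]
    inv_v = [1.0 / v for _, v in pairs]
    slope, intercept = linear_fit(inv_s, inv_v)
    vmax = 1.0 / intercept
    km = slope * vmax
    return km, vmax


# SYNTHETIC data: ASSUMED Km = 100 uM and Vmax = 10 (arbitrary rate units).
TRUE_KM, TRUE_VMAX = 100.0, 10.0
substrate_um = [0, 25, 50, 100, 200, 400, 600]
rates = [TRUE_VMAX * s / (TRUE_KM + s) for s in substrate_um]

km, vmax = lineweaver_burk(substrate_um, rates)
print(f"Km = {km:.1f} uM, Vmax = {vmax:.2f}")
```

Because the double-reciprocal transform weights low-substrate points heavily, a direct nonlinear fit of the Michaelis-Menten equation is often preferred for noisy data; the linear form is shown here only because it is the method named in the text.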
Size Exclusion Chromatography-Affinity-purified PmC11 was loaded onto a HiLoad 16/60 Superdex 200 gel filtration column (GE Healthcare) equilibrated in the assay buffer. The apparent molecular weight of PmC11 was determined from calibration curves based on protein standards of known molecular weights.
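Estimating an apparent molecular weight from a gel filtration calibration curve typically relies on the approximately linear relationship between log10(molecular weight) and elution volume for globular standards. The sketch below shows that calculation under stated assumptions: the function names are hypothetical, and the standards and elution volumes are invented for illustration and are not the calibration data used in the study.

```python
import math


def fit_calibration(elution_volumes_ml, molecular_weights_da):
    """Least-squares fit of log10(MW) against elution volume; returns (slope, intercept)."""
    x = list(elution_volumes_ml)
    y = [math.log10(mw) for mw in molecular_weights_da]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def apparent_molecular_weight(elution_volume_ml, slope, intercept):
    """Apparent MW (Da) of a sample from its elution volume and the calibration line."""
    return 10 ** (intercept + slope * elution_volume_ml)


# HYPOTHETICAL standards (elution volume in ml, molecular weight in Da).
standard_volumes = [60.0, 70.0, 80.0, 90.0]
standard_weights = [300_000.0, 100_000.0, 30_000.0, 10_000.0]

slope, intercept = fit_calibration(standard_volumes, standard_weights)
# HYPOTHETICAL sample elution volume.
print(f"{apparent_molecular_weight(78.0, slope, intercept):.0f} Da")
```

The estimate is only meaningful within the range spanned by the standards and assumes the sample is roughly globular, since elution position depends on hydrodynamic size rather than mass alone.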
Autoprocessing Profile of PmC11-Autoprocessing of PmC11 was evaluated by incubating the enzyme at 37°C and removing samples at 1-h intervals from 0 to 16 h and placing them into SDS-PAGE loading buffer to stop the processing. Samples were then analyzed on a 4-12% NuPAGE (Thermo Fisher) Novex BisTris gel run in MES buffer.
Autoprocessing Cleavage Site Analysis-To investigate whether processing is a result of intra- or intermolecular cleavage, the PmC11 C179A mutant was incubated with increasing concentrations of activated PmC11 (0, 0.1, 0.2, 0.5, 1, 2, and 5 µg). The final assay volume was 40 µl and the proteins were incubated at 37°C for 16 h in the PmC11 assay buffer. To stop the reaction, NuPAGE sample buffer was added to the protein samples and 20 µl was analyzed on a 10% NuPAGE Novex BisTris gel using MES buffer. These studies revealed no apparent cleavage of PmC11 C179A by the active enzyme at low concentrations of PmC11 and that only limited cleavage was observed when the ratio of active enzyme (PmC11:PmC11 C179A) was increased to ~1:10 and 1:4.

Results
Structure of PmC11-The crystal structure of the catalytically active form of PmC11 revealed an extended caspase-like α/β/α sandwich architecture comprising a central nine-stranded β-sheet, with an unusual C-terminal domain (CTD), starting at Lys250. A single cleavage was observed in the polypeptide chain at Lys147 (Fig. 1, A and B), where both ends of the cleavage site are fully visible and well ordered in the electron density. The central nine-stranded β-sheet (β1-β9) of PmC11 consists of six parallel and three anti-parallel β-strands with 4↑3↓2↑1↑5↑6↑7↓8↓9↑ topology (Fig. 1A) and the overall structure includes 14 α-helices with six (α1-α2 and α4-α7) closely surrounding the β-sheet in an approximately parallel orientation. Helices α1, α7, and α6 are located on one side of the β-sheet with α2, α4, and α5 on the opposite side (Fig. 1A). Helix α3 sits at the end of the loop following β5 (L5), just preceding the Lys147 cleavage site, with both L5 and α3 pointing away from the central β-sheet and toward the CTD, which starts with α8. The structure also includes two short β-hairpins (βA-βB and βD-βE) and a small β-sheet (βC-βF), which is formed from two distinct regions of the sequence (βC precedes α11, α12, and β9, whereas βF follows the βD-βE hairpin) in the middle of the CTD (Fig. 1B).
The CTD of PmC11 is composed of a tight helical bundle formed from helices α8-α14 and includes strands βC and βF, and β-hairpin βD-βE. The CTD sits entirely on one side of the enzyme, interacting only with α3, α5, β9, and the loops surrounding β8. Of the interacting secondary structure elements, α5 is perhaps the most interesting. This helix makes a total of eight hydrogen bonds with the CTD, including one salt bridge (Arg191-Asp255), and is surrounded by the CTD on one side and the main core of the enzyme on the other, acting like a linchpin holding both components together (Fig. 1C).
Structural Comparisons-PmC11 is, as expected, most structurally similar to other members of clan CD with the top hits in a search of known structures being caspase-7, gingipain-K, and legumain (PDB codes 4hq0, 4tkx, and 4aw9, respectively) (Table 2). The C-terminal domain is unique to PmC11 within clan CD and structure comparisons for this domain alone do not produce any hits in the PDB (DaliLite, PDBeFold), suggesting a completely novel fold. As the archetypal and arguably most well studied member of clan CD, the caspases were used as the basis to investigate the structure/function relationships in PmC11, with caspase-7 as the representative member. Six of the central β-strands in PmC11 (β1-β2 and β5-β8) share the same topology as the six-stranded β-sheet found in caspases, with strands β3, β4, and β9 located on the outside of this core structure (2) (Fig. 1B, box). His133 and Cys179 were found at locations structurally homologous to the caspase catalytic dyad, and other clan CD structures (2), at the C termini of strands β5 and β6, respectively (Figs. 1, A and B, and 2A). A multiple sequence alignment of C11 proteins revealed that these residues are highly conserved (data not shown). Five of the α-helices surrounding the β-sheet of PmC11 (α1, α2, α4, α6, and α7) are found in similar positions to the five structurally conserved helices in caspases and other members of clan CD, apart from family C80 (2). Other than its more extended β-sheet, PmC11 differs most significantly from other clan CD members at its C terminus, where the CTD contains a further seven α-helices and four β-strands after β8.

FIGURE 1. Crystal structure of a C11 peptidase from P. merdae. A, primary sequence alignment of PmC11 (Uniprot ID A7A9N3) and clostripain (Uniprot ID P09870) from C. histolyticum with identical residues highlighted in gray shading. The secondary structure of PmC11 from the crystal structure is mapped onto its sequence with the position of the PmC11 catalytic dyad, autocatalytic cleavage site (Lys147), and S1 binding pocket Asp (Asp177) highlighted by a red star, a red downturned triangle, and a red upturned triangle, respectively. Connecting loops are colored gray, the main β-sheet is in orange, with other strands in olive, α-helices are in blue, and the nonapeptide linker of clostripain that is excised upon autocleavage is underlined in red. Sequences around the catalytic site of clostripain and PmC11 align well. B, topology diagram of PmC11 colored as in A except that additional (non-core) β-strands are in yellow. Helices found on either side of the central β-sheet are shown above and below the sheet, respectively. The position of the catalytic dyad (H, C) and the processing site (Lys147) are highlighted. Helices (1-14) and β-strands (1-9 and A-F) are numbered from the N terminus. The core caspase-fold is highlighted in a box. C, tertiary structure of PmC11. The N and C termini (N and C) of PmC11 along with the central β-sheet (1-9), helix α5, and helices α8, α11, and α13 from the C-terminal domain, are all labeled. Loops are colored gray, the main β-sheet is in orange, with other β-strands in yellow, and α-helices are in blue.

TABLE 2. Summary of PDBeFOLD (45) superposition of structures found to be most similar to PmC11 in the PDB based on DaliLite (46)
The results are ordered in terms of structural homology (QH), where %SSE PC-X is the percentage of the SSEs in PmC11 that can be identified in the target X (where X = caspase-7 (47), legumain (3), gingipain (48), and TcdB-CPD (49)); %SSE X-PC is the percentage of SSEs in X (as above) that can be identified in PmC11 (as above); % sequence ID is the percentage sequence identity after structural alignment; Nalign is the number of matched residues; and r.m.s. deviation is the root mean squared deviation on the Cα positions of the matched residues.

Autoprocessing of PmC11-Purification of recombinant PmC11 (molecular mass = 42.6 kDa) revealed partial processing into two cleavage products of 26.4 and 16.2 kDa, related to the observed cleavage at Lys147 in the crystal structure (Fig. 2A). Incubation of PmC11 at 37°C for 16 h resulted in a fully processed enzyme that remained as an intact monomer when applied to a size-exclusion column (Fig. 2B). The single cleavage site of PmC11 at Lys147 is found immediately after α3, in loop L5 within the central β-sheet (Figs. 1, A and B, and 2A). The two ends of the cleavage site are remarkably well ordered in the crystal structure and displaced from one another by 19.5 Å (Fig. 2A). Moreover, the C-terminal side of the cleavage site resides near the catalytic dyad with Ala148 being 4.5 and 5.7 Å from His133 and Cys179, respectively. Consequently, it appears feasible that the helix attached to Lys147 (α3) could be responsible for steric autoinhibition of PmC11 when Lys147 is covalently bonded to Ala148. Thus, the cleavage would be required for full activation of PmC11. To investigate this possibility, two mutant forms of the enzyme were created: PmC11 C179A (a catalytically inactive mutant) and PmC11 K147A (a cleavage-site mutant). Initial SDS-PAGE and Western blot analysis of both mutants revealed that no discernible processing occurred as compared with active PmC11 (Fig. 2C). The PmC11 K147A mutant enzyme had a markedly different reaction rate (Vmax) compared with WT, where the reaction velocity of PmC11 was 10 times greater than that of PmC11 K147A (Fig. 2D). Taken together, these data reveal that PmC11 requires processing at Lys147 for optimum activity.
To investigate whether processing is a result of intra- or intermolecular cleavage, the PmC11 C179A mutant was incubated with increasing concentrations of processed and activated PmC11. These studies revealed that there was no apparent cleavage of PmC11 C179A by the active enzyme at low concentrations of PmC11 and that only limited cleavage was observed when the ratio of active enzyme (PmC11:PmC11 C179A) was increased to ~1:10 and 1:4, with complete cleavage observed at a ratio of 1:1 (Fig. 2E). This suggests that cleavage of PmC11 C179A was most likely an effect of the increasing concentration of PmC11 and intermolecular cleavage. Collectively, these data suggest that the pro-form of PmC11 is autoinhibited by a section of L5 blocking access to the active site, prior to intramolecular cleavage at Lys147. This cleavage subsequently allows movement of the region containing Lys147 and the active site to open up for substrate access.
Substrate Specificity of PmC11-The autocatalytic cleavage of PmC11 at Lys147 (sequence KLK∧A) demonstrates that the enzyme accepts substrates with Lys in the P1 position. The substrate specificity of the enzyme was further tested using a variety of fluorogenic substrates. As expected, PmC11 showed no activity against substrates with Pro or Asp in P1 but was active toward substrates with a basic residue in P1 such as Bz-R-AMC, Z-GGR-AMC, and BOC-VLK-AMC. The rate of cleavage was ~3-fold greater toward the single Arg substrate Bz-R-AMC than for the other two (Fig. 2F) and, unexpectedly, PmC11 showed no activity toward BOC-K-AMC. These results confirm that PmC11 accepts substrates containing Arg or Lys in P1 with a possible preference for Arg.
The catalytic dyad of PmC11 sits near the bottom of an open pocket on the surface of the enzyme at a conserved location in the clan CD family (2). The PmC11 structure reveals that the catalytic dyad forms part of a large acidic pocket (Fig. 2G), consistent with a binding site for a basic substrate. This pocket is lined with the potential functional side chains of Asn50, Asp177, and Thr204 with Gly134, Asp207, and Met205 also contributing to the pocket (Fig. 2A). Interestingly, these residues are in regions that are structurally similar to those involved in the S1 binding pockets of other clan CD members (shown in Ref. 2).
Because PmC11 recognizes basic substrates, the tetrapeptide inhibitor Z-VRPR-FMK was tested as an enzyme inhibitor and was found to inhibit both the autoprocessing and activity of PmC11 (Fig. 3A). Z-VRPR-FMK was also shown to bind to the enzyme: a size-shift was observed, by SDS-PAGE analysis, in the larger processed product of PmC11, suggesting that the inhibitor bound to the active site (Fig. 3B). A structure overlay of PmC11 with the MALT1-paracaspase (MALT1-P), in complex with Z-VRPR-FMK (35), revealed that the PmC11 dyad sits in a very similar position to that of active MALT1-P and that Asn50, Asp177, and Asp207 superimpose well with the principal MALT1-P inhibitor binding residues (Asp365, Asp462, and Glu500, respectively, as described in Ref. 5); VRPR-FMK from MALT1-P with the corresponding PmC11 residues from the structural overlay is shown in Fig. 1D. Asp177 is located near the catalytic cysteine and is conserved throughout the C11 family, suggesting it is the primary S1 binding site residue. In the structure of PmC11, Asp207 resides on a flexible loop pointing away from the S1 binding pocket (Fig. 3C). However, this loop has been shown to be important for substrate binding in clan CD (2) and this residue could easily rotate and be involved in substrate binding in PmC11. Thus, Asn50, Asp177, and Asp207 are most likely responsible for the substrate specificity of PmC11. Asp177 is highly conserved throughout the clan CD C11 peptidases and is thought to be primarily responsible for substrate specificity of the clan CD enzymes, as also illustrated by the proximity of these residues relative to the inhibitor Z-VRPR-FMK when PmC11 is overlaid on the MALT1-P structure (Fig. 3C).
Comparison with Clostripain-Clostripain from C. histolyticum is the founding member of the C11 family of peptidases and contains an additional 149 residues compared with PmC11. A multiple sequence alignment revealed that most of the secondary structural elements are conserved between the two enzymes, although they are only ~23% identical (Fig. 1A). Nevertheless, PmC11 may be a good model for the core structure of clostripain.
The primary structural alignment also shows that the catalytic dyad in PmC11 is structurally conserved in clostripain (36) (Fig. 1A). Unlike PmC11, clostripain has two cleavage sites (Arg181 and Arg190); cleavage at both results in the removal of a nonapeptide and is required for full activation of the enzyme (37) (highlighted in Fig. 1A). Interestingly, Arg190 was found to align with Lys147 in PmC11. In addition, the predicted primary S1-binding residue in PmC11, Asp177, also overlays with the residue predicted to be the P1 specificity determining residue in clostripain (38) (Asp229, Fig. 1A).
As studies on clostripain revealed that addition of Ca2+ ions is required for full activation, the Ca2+ dependence of PmC11 was examined. Surprisingly, Ca2+ did not enhance PmC11 activity and, furthermore, other divalent cations, Mg2+, Mn2+, Co2+, Fe2+, Zn2+, and Cu2+, were not necessary for PmC11 activity (Fig. 3D). In support of these findings, EGTA did not inhibit PmC11, suggesting that, unlike clostripain, PmC11 does not require Ca2+ or other divalent cations for activity.

Discussion
The crystal structure of PmC11 now provides three-dimensional information for a member of the clostripain C11 family of cysteine peptidases. The enzyme exhibits all of the key structural elements of clan CD members, but is unusual in that it has a nine-stranded central β-sheet with a novel C-terminal domain. The structural similarity of PmC11 with its nearest structural neighbors in the PDB is decidedly low, overlaying better with six-stranded caspase-7 than any of the other larger members of the clan (Table 2). The substrate specificity of PmC11 is Arg/Lys and the crystal structure revealed an acidic pocket for specific binding of such basic substrates. In addition, the structure suggested a mechanism of self-inhibition in both PmC11 and clostripain and an activation mechanism that requires autoprocessing. PmC11 differs from clostripain in that it does not appear to require divalent cations for activation.

FIGURE 3. ... B, gel-shift assay reveals that Z-VRPR-FMK binds to PmC11. PmC11 was incubated with (+) or without (−) Z-VRPR-FMK and the samples analyzed on a 10% SDS-PAGE gel. A size shift can be observed in the larger processed product of PmC11 (26.1 kDa). C, PmC11 with the Z-VRPR-FMK from the MALT1-paracaspase (MALT1-P) superimposed. A three-dimensional structural overlay of Z-VRPR-FMK from the MALT1-P complex onto PmC11. The position and orientation of Z-VRPR-FMK was taken from superposition of the PmC11 and MALT1-P structures and indicates the presumed active site of PmC11. Residues surrounding the inhibitor are labeled and represent potentially important binding site residues, labeled in black and shown in an atomic representation. Carbon atoms are shown in gray, nitrogen in blue, and oxygen in red. D, divalent cations do not increase the activity of PmC11. The cleavage of Bz-R-AMC by PmC11 was measured in the presence of the cations Ca2+, Mn2+, Zn2+, Co2+, Cu2+, Mg2+, and Fe2+ with EGTA as a negative control, and relative fluorescence measured against time (min). The addition of cations produced no improvement in activity of PmC11 when compared with activity in the presence of EGTA, suggesting that PmC11 does not require metal ions for proteolytic activity. Furthermore, Cu2+, Fe2+, and Zn2+ appear to inhibit PmC11.
Several other members of clan CD require processing for full activation including legumain (39), gingipain-R (40), MARTX-CPD (8), and the effector caspases, e.g. caspase-7 (41). To date, the effector caspases are the only group of enzymes that require cleavage of a loop within the central ␤-sheet. This is also the case in PmC11, although the cleavage loop is structurally different to that found in the caspases and follows the catalytic His (Fig. 1A), as opposed to the Cys in the caspases.
All other clan CD members requiring cleavage for full activation do so at sites external to their central sheets (2). The caspases and gingipain-R both undergo intermolecular (trans) cleavage and legumain and MARTX-CPD are reported to perform intramolecular (cis) cleavage. In addition, several members of clan CD exhibit self-inhibition, whereby regions of the enzyme block access to the active site (2). Like PmC11, these structures show preformed catalytic machinery and, for a substrate to gain access, movement and/or cleavage of the blocking region is required.
The structure of PmC11 gives the first insight into this relatively unexplored family of proteins and should allow important catalytic and substrate binding residues to be identified in a variety of orthologues. Indeed, insights gained from an analysis of the PmC11 structure revealed the identity of the Trypanosoma brucei PNT1 protein as a C11 cysteine peptidase with an essential role in organelle replication (42). The PmC11 structure should provide a good basis for structural modeling and, given the importance of other clan CD enzymes, this work should also advance the exploration of these peptidases and potentially identify new biologically important substrates.