Enzymatic and Structural Characterization of the Major Endopeptidase in the Venus Flytrap Digestion Fluid*

Carnivorous plants primarily use aspartic proteases during digestion of captured prey. In contrast, the major endopeptidases in the digestive fluid of the Venus flytrap (Dionaea muscipula) are cysteine proteases (dionain-1 to -4). Here, we present the crystal structure of mature dionain-1 in covalent complex with inhibitor E-64 at 1.5 Å resolution. The enzyme exhibits an overall protein fold reminiscent of other plant cysteine proteases. The inactive glycosylated pro-form undergoes autoprocessing and self-activation, optimally at the physiologically relevant pH value of 3.6, at which the protective effect of the pro-domain is lost. The mature enzyme was able to efficiently degrade a Drosophila fly protein extract at pH 4 showing high activity against the abundant Lys- and Arg-rich protein, myosin. The substrate specificity of dionain-1 was largely similar to that of papain with a preference for hydrophobic and aliphatic residues in subsite S2 and for positively charged residues in S1. A tentative structure of the pro-domain was obtained by homology modeling and suggested that a pro-peptide Lys residue intrudes into the S2 pocket, which is more spacious than in papain. This study provides the first analysis of a cysteine protease from the digestive fluid of a carnivorous plant and confirms the close relationship between carnivorous action and plant defense mechanisms.

The unusual carnivorous adaptation of plants to nutrient-poor environments has attracted scientific attention since the time of Darwin. Small animals are captured and subsequently digested by the plant's use of passive or active trap structures. A variety of hunting strategies are observed in these plants, including the passive pitfall traps of tropical pitcher plants (Nepenthes), the adhesive flypaper traps of sundews (Drosera), and the active snap traps of Venus flytraps. Because plant carnivory is known to have evolved independently at least six times in the plant kingdom and is represented in >600 identified species, the mechanical components and enzymatic processes display variation from one carnivorous plant to another (1). The Venus flytrap (Dionaea muscipula) belongs to the Droseraceae family, and its natural habitat is the southeastern region of North America. This plant attracts animals using volatile organic compounds (2) and catches prey by rapid movement of specialized leaves in response to a mechanical stimulation of designated trigger hairs (3). The digestion process is initiated through the secretion of acidic digestive fluid (pH 4) from secretory glands and continues for ~10 days, after which the trap reopens to complete the hunting cycle (4,5). During the digestion phase, the protein content of the digestive fluid increases along with further acidification and higher proteolytic activity (4,6).
The protein components of the Venus flytrap digestive fluid were recently determined using a combined proteomics and transcriptomics approach (5). The presence of chitinases, lipases, phosphatases, and peptidases revealed a concerted action targeted at insect prey, where chitinases degrade chitin in the cuticle (chitin-rich outer layer) of insects and spiders, thereby providing access to the internal parts of the prey for further enzymatic breakdown (7). The Venus flytrap takes up alanine, glycine, and peptides from the digestive fluid (8-10) along with the active assimilation of ammonia through designated transporters (11). The degradation of prey proteins is performed by cysteine endopeptidases (dionains) supported by the action of a serine carboxypeptidase (5). This is in contrast to the digestive fluids of Nepenthes, Cephalotus, and Drosera, which all rely heavily on aspartic proteases (12,13).
Dionains display <70% sequence similarity to other cysteine proteases characterized to date. They are ascribed to the papain-like C1 family of the MEROPS database according to the overall homology and conservation of specific sequence motifs uniquely related to autocatalytic maturation from inactive zymogenic pro-forms to active proteases (14). These include the ERFNIN and GXNXFXD motifs in a predicted pro-region of ~100-120 residues (15,16). The pro-domain prevents adventitious proteolysis by blocking the active site Cys-His catalytic dyad (17,18). Activation occurs at an acidic pH and proceeds through automaturation. The majority of known plant cysteine proteases, such as papain and caricain, are secreted into an acidic environment, where the active enzyme serves as a first line of defense against pathogens and herbivores (19-22). For dionains, pro-domain peptides have been identified in the acidic digestive fluid (5), suggesting that a similar postsecretory maturation process occurs.
In this study, we have performed the first functional and structural characterization of a digestive cysteine protease (dionain-1) from a carnivorous plant. We sequenced the full-length pre-pro-dionain-1 cDNA through targeted cloning based on de novo identified peptide sequences from the digestive fluid. The glycosylated inactive pro-form was produced by heterologous recombinant expression in yeast and found to undergo accelerated autolytic maturation at low pH to yield active dionain-1. The crystal structure of mature dionain-1 in complex with the general cysteine peptidase inhibitor E-64 was solved and refined with data to 1.5 Å resolution. The results revealed an overall conserved protein fold, similar to that of other plant cysteine proteases (papain, caricain, castor bean cysteine peptidase, etc.). In addition, substrate library profiling and structural insight into the specificity pockets of dionain-1 were used to discuss evolutionary similarities and carnivorous lifestyle adaptations.

Experimental Procedures
Materials-Venus flytrap plants were purchased from the Lammehave Nursery (Ringe, Denmark) and grown in a walk-in plant growth chamber under a 12-h/12-h light/dark cycle at 26°C. All of the experiments were performed on healthy mature plants. EasySelect™ Pichia expression kit, Pichia EasyComp™ kit, and Zeocin were purchased from Invitrogen. The LA PCR kit was purchased from Clontech, and the FastDigest restriction enzymes and T4 ligase were obtained from Fermentas. HiTrap columns, Superdex 75 10/30, and the Ettan CAF™ MALDI sequencing kit were purchased from GE Healthcare. C18 stage tips were purchased from Proxeon Biosystems A/S. Z-FR-AMC was obtained from Bachem, and Z-LR-AMC, Boc-LKR, and Boc-GKR were purchased from the Peptide Institute Inc. (PeptaNova GmbH, Sandhausen, Germany). All other chemicals were obtained from Sigma-Aldrich unless otherwise specified.
De Novo Sequencing of Dionain-1 Peptides-Digestive fluid from mechanostimulated Venus flytrap leaves was analyzed by SDS-PAGE using a 5-15% gradient gel. The suspected major band containing dionain-1 was excised for in-gel trypsin digestion or analyzed using automated Edman degradation (Procise 494-HT protein sequencer) after electrophoretic blotting to an Immobilon membrane. The trypsin digestion was performed overnight at 37°C and followed by acidification with 0.1% trifluoroacetic acid (TFA), C18 reverse-phase micropurification, and spotting onto a target plate for matrix-assisted laser desorption ionization mass spectrometry (MALDI-MS) using α-cyano-4-hydroxycinnamic acid matrix. Chemical-assisted fragmentation (Ettan CAF™) was employed to facilitate the fragmentation and interpretation of the generated MS/MS spectra. The MALDI-MS analysis was performed on a Micromass Q-TOF Ultima Global mass spectrometer (Waters).
Cloning of cDNA Encoding Dionain-1-To stimulate the dionain-1 transcription, prey digestion by Venus flytrap was initiated by feeding the plants with yellow mealworm beetles (Tenebrio molitor). After 40-88 h, the trap leaves were collected, rinsed in water to remove residual beetle remains, snap-frozen in liquid nitrogen, and stored at −80°C. Total RNA was isolated using a cetyltrimethylammonium bromide protocol, followed by mRNA purification using oligo(dT) magnetic beads. Intact transcripts were amplified by RNA ligase-mediated PCR, in which an RNA adapter is linked to dephosphorylated intact 5′ transcript ends (23). First-strand synthesis was performed with 5′-oligo(dT)-linker-primer-3′ and Moloney murine leukemia virus RNase H reverse transcriptase and followed by long and accurate PCR (Sigma-Aldrich) amplification using a 5′-cDNA primer (adapter sequence) and a 3′-primer downstream of the poly(A) tail. For subsequent 5′-RACE PCR, dionain-1 reverse primers were designed from two peptides sequenced de novo by MALDI-MS/MS, DCDTDGNDK (5′-TTTATCGTTACCATCTGTXTCXCAXTC) and WGTSWGEDGY (5′-ATAACCATCTTCACCCCAAGAXGTXCCCCA), where X denotes degenerate positions during primer synthesis to sample all possible codons. The PCR and subsequent TA cloning into pBS II sk+ vector resulted in generation and sequencing of full-length dionain-1 cDNA, which was deposited with GenBank™ (access code KP663370). The dionain-1 cDNA without the predicted 29-residue signal peptide (by SignalP version 4.0 (24)) was subsequently amplified through the incorporation of XhoI and NotI restriction sites and then cloned into the yeast pPICZαC expression vector. An Asn-to-Gln N98pQ mutant, in which "p" refers to the numbering of the pro-protein sequence, was generated by site-directed mutagenesis.
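The correspondence between a reverse primer and the peptide it was designed from can be checked by taking the reverse complement of the primer and translating it codon by codon. The following is a minimal sketch, assuming the standard genetic code and treating X as any of the four bases; the helper names (`reverse_complement`, `encoded_residues`, `primer_matches_peptide`) are illustrative and not part of the published workflow.

```python
# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
# X marks a degenerate position; it stays degenerate in the complement.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "X": "X"}


def reverse_complement(primer):
    """Return the coding-strand sequence matching a reverse primer."""
    return "".join(COMPLEMENT[base] for base in reversed(primer))


def encoded_residues(codon):
    """Set of residues a codon can encode, expanding X to all four bases."""
    options = [""]
    for base in codon:
        choices = BASES if base == "X" else base
        options = [prefix + c for prefix in options for c in choices]
    return {CODON_TABLE[option] for option in options}


def primer_matches_peptide(primer, peptide):
    """True if the primer's reverse complement can encode the peptide."""
    coding = reverse_complement(primer)
    codons = [coding[i:i + 3] for i in range(0, len(coding) - 2, 3)]
    return len(codons) == len(peptide) and all(
        residue in encoded_residues(codon)
        for codon, residue in zip(codons, peptide)
    )


# Primer sequences as printed in the text (5' to 3').
PRIMER_30MER = "ATAACCATCTTCACCCCAAGAXGTXCCCCA"
PRIMER_27MER = "TTTATCGTTACCATCTGTXTCXCAXTC"

for primer in (PRIMER_30MER, PRIMER_27MER):
    for peptide in ("DCDTDGNDK", "WGTSWGEDGY"):
        print(primer, peptide, primer_matches_peptide(primer, peptide))
```

Under these assumptions, the 30-mer is compatible only with WGTSWGEDGY and the 27-mer only with DCDTDGNDK.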
Expression and Purification of WT and N98pQ Pro-dionain-1-For recombinant expression of WT and N98pQ-mutant pro-dionain-1 in Pichia pastoris, competent KM71H cells were transformed with linearized vector DNA according to the manufacturer's protocols for the EasyComp™ and EasySelect™ Pichia expression kits (Invitrogen). The transformants were grown at 28°C for 3 days on YPDS (yeast extract peptone dextrose medium with sorbitol) plates containing Zeocin (100 µg/ml). High-yield expressing colonies were identified by tests in buffered glycerol complex medium, and large scale expression cultures were grown in buffered methanol complex medium, followed by concentration to A600 at ~85 and expression at 28°C for 72 h (WT) or 48 h (N98pQ) with the addition of 0.5% methanol every 24 h. For WT pro-dionain-1, the culture supernatant was dialyzed against buffer A (10 mM Tris-HCl, 10 mM NaCl, pH 7.6), filtered, and loaded onto a 5-ml HiTrap Q column equilibrated in buffer A. Bound protein was eluted using a 1%/min gradient of buffer B (10 mM Tris-HCl, 1 M NaCl, pH 7.6), and fractions containing pro-dionain-1 were pooled and dialyzed into 10 mM Tris-HCl, 20 mM NaCl at pH 7 and then concentrated. The final protein concentration was determined using a BCA kit (Thermo Scientific). For pro-dionain-1 N98pQ, ammonium sulfate was added to the culture supernatant to a final concentration of 1 M, and the solution was filtered and applied to a 5-ml HiTrap Phenyl HP column equilibrated in buffer A (20 mM Tris-HCl, 1 M ammonium sulfate, pH 7.0). Bound protein was eluted using a 2%/min gradient of buffer C (20 mM Tris-HCl, pH 7.0). Fractions containing pro-dionain-1 N98pQ were collected and dialyzed against 20 mM Tris-HCl, 100 mM NaCl, pH 7.0, and then concentrated. The concentration of the purified zymogens was estimated by absorbance at 280 nm, using the BCA kit (Thermo Scientific) and SDS-PAGE analysis. The yield was ~150 mg/liter of culture medium.
Protein identity was determined by LC-MS/MS analysis (see below). For further purification of the highest glycosylated WT variant, gel filtration was performed using a Superose 6 column (GE Healthcare) equilibrated in 20 mM phosphate buffer (pH 7.0) containing 100 mM NaCl.
SDS-PAGE and Edman Degradation-SDS-PAGE was performed using 5-15% gradient gels and a discontinuous ammediol/glycine buffer system (25). Dionain-1 samples at pH 7.0 were boiled for 5 min in the presence of 50 mM DTT and sample buffer. Dionain-1 samples at pH 3.2-6.0 were treated with 1 mM freshly prepared E-64 solution for 10 min on ice and then boiled in the presence of 50 mM DTT and sample buffer. Deglycosylation of pro-dionain-1 WT was conducted by treatment with endo-β-N-acetylglucosaminidase H for 2 h at 37°C at a 1:30 (w/w) ratio (endo-β-N-acetylglucosaminidase H/dionain-1) in 10 mM Tris-HCl, 50 mM NaCl, pH 7.0. Gels were stained with Coomassie Brilliant Blue or blotted for 20 min to an Immobilon-P membrane (Millipore) for N-terminal protein sequence analysis by Edman degradation using a Procise 494-HT protein sequencer (Applied Biosystems).
Mass Spectrometry Analyses (LC-MS/MS)-To identify proteins from the gel bands, the relevant material was excised and subjected to overnight in-gel digestion by trypsin (26). The tryptic peptides were purified by C18 stage tips (Proxeon, Thermo Scientific) and analyzed on a TripleTOF 5600+ instrument (AB Sciex) equipped with a Nanospray II source (AB Sciex). The instrument was coupled in-line with an EASY-nLC II system (Thermo Scientific) equipped with a trap column and a 15-cm analytical column pulled in-house and packed with ReproSil-Pur C18-AQ 3-µm resin (Dr. Maisch GmbH, Ammerbuch-Entringen, Germany).
The collected MS files were converted to Mascot generic format (MGF) using the AB SCIEX MS Data Converter beta 1.1 (AB SCIEX) and the "proteinpilot MGF" parameters. The generated peak lists were searched against the Swiss-Prot database using an in-house Mascot search engine (Matrix Science). Search parameters allowed one missed trypsin cleavage site and propionamide (in-gel digest) or carbamidomethyl (in-solution) as a fixed modification with peptide tolerance and MS/MS tolerance set to 10 ppm and 0.2 Da, respectively. The Mascot-provided exponentially modified protein abundance index value was adjusted for molecular mass to calculate the relative protein amount (27). The relative protein amount was based on three replicates.
Proteolytic Susceptibility toward Trypsin-Five µg of each pro-dionain-1 variant (WT, deglycosylated, and N98pQ) was incubated with 0.1 µg of trypsin for either 5 or 15 min at 37°C, which was followed by immediate transfer to ice and the addition of SDS-PAGE sample buffer. The samples were boiled and analyzed on a 5-15% gradient gel.
Circular Dichroism Spectroscopy-The wavelength spectra and thermal unfolding of WT and N98pQ dionain-1 were assessed by circular dichroism (CD) on a Jasco J-800 spectropolarimeter using a quartz cuvette of 0.1-cm path length and a protein concentration of 0.2 mg/ml in 20 mM NaH2PO4/Na2HPO4 buffer, pH 6.0, with 2 mM E-64. The wavelength spectra were recorded in the far-UV area from 200 to 250 nm at 20°C to verify similarity in secondary structure content. Thermal unfolding was assessed by monitoring changes in the signal at 222 nm over a temperature range from 20 to 95°C with a scan rate of 90°C/h. The thermal melting point was estimated by fitting each thermal denaturation curve to a two-state unfolding model using the Kaleidagraph software (Synergy Software).
Characterization of Pro-dionain-1 Maturation Using SDS-PAGE-WT pro-dionain-1, deglycosylated pro-dionain-1, or N98pQ pro-dionain-1 was incubated at 0.2 mg/ml in standard phosphate-citrate buffer solutions (0.1 M citrate with 0.2 M Na2HPO4 adjusted to the desired pH). To determine the effect of pH, buffers of pH 3.2, 3.4, 3.6, 3.8, 4.0, and 7.0 were tested at 37°C for 40 min. To follow the automaturation over time, incubation periods of 0, 15, and 30 min were used in a phosphate-citrate buffer solution at pH 4.0. All incubation periods were followed by inhibition with 1 mM E-64 on ice before SDS-PAGE analysis.
Identification of Pro-domain Peptides-A total of 50 µg of pro-dionain-1 was incubated at 0.2 mg/ml in phosphate/citrate buffer solutions at pH 4.0 for 10 min, followed by the addition of 1 mM E-64 and subsequent acidification by TFA to a final concentration of 0.5%. Pro-domain peptides ranging from 6 to 10 kDa were purified by reverse phase HPLC on a Brownlee SPP C18 column (PerkinElmer Life Sciences) equilibrated in 0.1% TFA (solvent A). The peptides were eluted using a linear gradient from solvent A to 90% acetonitrile. The relevant fractions were analyzed using SDS-PAGE, and the masses were determined by MALDI-TOF-TOF mass spectrometry and compared with relevant pro-domain sequences (GPMAW software).
Kinetic Profiling of Pro-dionain-1 Maturation-Incubation of pro-dionain-1 at various pH values, temperatures, and concentrations was performed in phosphate-citrate buffer solutions (0.1 M citrate with 0.2 M Na2HPO4 adjusted to the desired pH) with a 0.25 mM concentration of the fluorogenic substrate, Z-FR-AMC, in a 96-well plate setup using a FluoStar Omega plate reader at λex = 355 nm and λem = 485 nm for product detection. Assay buffers were preincubated in 96-well half-area plates (Costar 3881) at the indicated temperatures, and 10 nM final pro-dionain-1 was used for each well. Where relevant, the change in enzyme activity was extracted according to the first derivative of the fluorescent signal. Autocatalytic fitting was performed using a two-step model assuming the following reactions and rate expressions, where E* represents the zymogen and E represents the mature enzyme: 1) intermolecular "pro" activation, and 2) intermolecular "mature" activation.
The combined rate equation can be expressed as follows.
we can express the following, where E*(0) is the initial amount of pro-dionain-1.
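A two-step autocatalytic scheme of this kind can be illustrated numerically. The sketch below is not the fitted model: it assumes simple mass-action kinetics, with zymogen activating zymogen at rate k1·[E*]² and mature enzyme activating zymogen at rate k2·[E*]·[E], together with the conservation relation [E*] = E*(0) − [E]. The rate law, the rate constants, and the function name are illustrative assumptions.

```python
def simulate_automaturation(e_star0, k1, k2, t_end, dt=0.01):
    """Euler integration of an assumed two-step autocatalytic scheme.

    Assumed rate law (mass action, not taken from the source):
        d[E]/dt = k1 * [E*]**2 + k2 * [E*] * [E],  with [E*] = e_star0 - [E]

    Returns a list of (time, mature enzyme concentration) pairs.
    """
    mature = 0.0
    trace = [(0.0, mature)]
    steps = int(round(t_end / dt))
    for step in range(1, steps + 1):
        zymogen = e_star0 - mature
        rate = k1 * zymogen ** 2 + k2 * zymogen * mature
        # Cap at the total enzyme amount to guard against overshoot.
        mature = min(e_star0, mature + rate * dt)
        trace.append((step * dt, mature))
    return trace


# Hypothetical parameters: 10 nM zymogen, slow "pro" step, faster "mature" step.
trace = simulate_automaturation(e_star0=10.0, k1=1e-4, k2=0.05, t_end=60.0)
for time, mature in trace[::1000]:
    print(f"t = {time:5.1f} min, [E] = {mature:6.3f} nM")
```

With a small k1 and a larger k2, the trace shows a lag followed by an exponential phase and a plateau at E*(0), which is the qualitative shape expected for concentration-dependent trans-activation.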
For the automaturation at different temperatures, an apparent rate constant (k_app) was extracted for each curve as the inverse of the time to reach a fixed product signal of 5000 relative fluorescence units (RFU). A classic semilog Arrhenius relationship was observed for k_app as a function of temperature.
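The extraction of an apparent rate constant and the Arrhenius analysis can be sketched as follows. This is a minimal illustration, assuming linear interpolation between time points to locate the 5000-RFU crossing and an ordinary least-squares line through ln(k_app) versus 1/T; both function names and the synthetic numbers are hypothetical.

```python
import math

GAS_CONSTANT = 8.314  # J mol^-1 K^-1


def apparent_rate(times, signal, threshold=5000.0):
    """Inverse of the time at which the signal first reaches the threshold.

    The crossing time is found by linear interpolation between the two
    neighboring time points (an assumption of this sketch).
    """
    points = list(zip(times, signal))
    for (t0, s0), (t1, s1) in zip(points, points[1:]):
        if s0 < threshold <= s1:
            t_cross = t0 + (threshold - s0) * (t1 - t0) / (s1 - s0)
            return 1.0 / t_cross
    raise ValueError("signal never reaches the threshold")


def arrhenius_fit(temps_celsius, k_apps):
    """Least-squares fit of ln(k) = ln(A) - Ea / (R * T).

    Returns (Ea in J/mol, ln A).
    """
    x = [1.0 / (t + 273.15) for t in temps_celsius]
    y = [math.log(k) for k in k_apps]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sum(
        (a - mean_x) ** 2 for a in x
    )
    return -slope * GAS_CONSTANT, mean_y - slope * mean_x


# Synthetic example: a curve crossing 5000 RFU between 10 and 20 min.
print(apparent_rate([0, 10, 20], [0, 4000, 6000]))
```

A straight line in the ln(k_app) versus 1/T plot, with no break, is what the text refers to as the absence of a thermal threshold.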
Kinetic Profiling of Mature Dionain-1-Mature dionain-1 was produced by incubation of reduced (in 2 mM DTT, 15 min, room temperature) pro-dionain-1 (0.2 mg/ml) for 8-10 min at 45°C in 50 mM sodium acetate at pH 4.0. The mature enzyme was used immediately. The Z-FR-AMC substrate was used for kinetic profiling either at 0.25 mM or in a dilution series to determine kinetic constants. The catalytic competence of dionain-1 at different temperatures (4, 20, 30, 40, 50, 60, 70, and 80°C) was probed by end point fluorescence after 15-min incubation of dionain-1 in preheated 50 mM sodium acetate, pH 5.5, assay buffer containing 2 mM DTT and 0.25 mM Z-FR-AMC. The reaction was stopped by the addition of 1 mM E-64 and incubation for 10 min at room temperature before measuring total fluorescence. The effect of pH on the hydrolysis rate was assessed at 35°C in phosphate-citrate buffer solutions from pH 2.0 to 7.0 adjusted to a constant ionic strength of 0.5 M with KCl. Maximal hydrolysis rates were extracted and normalized to the observed maximal rate. The kinetic constants for activated dionain-1 and papain (Sigma-Aldrich, P4752) were determined in 10 mM phosphate buffer at pH 6.0 and 35°C. A final theoretical concentration of 5 nM dionain-1 was used, and the actual amount of active sites was quantified using E-64 titration series with preincubation of dionain-1 at 100 nM with relevant E-64 concentrations. For papain, preincubation for 10 min at room temperature with 2 mM DTT preceded kinetic measurements at a final concentration of 2.5 nM. The active site titration by E-64 was performed at the final 2.5 nM papain concentration, and it produced an estimate within 2% of the theoretical concentration. All assays were based on triplicate determinations and repeated at least twice with similar results.
K_m and V_max were extracted using Kaleidagraph software (Synergy), and k_cat was calculated as V_max/[E_T], where [E_T] is the concentration of active sites determined by E-64 titration.
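The relationship between the fitted constants can be sketched in a few lines. The source used nonlinear fitting in Kaleidagraph; the example below substitutes a Hanes-Woolf linearization ([S]/v = [S]/V_max + K_m/V_max) purely to stay dependency-free, so it is a stand-in rather than the authors' procedure. All numbers are synthetic, and the function names are hypothetical.

```python
def michaelis_menten_fit(substrate, rates):
    """Estimate (K_m, V_max) by Hanes-Woolf linear regression.

    Plots [S]/v against [S]: slope = 1/V_max, intercept = K_m/V_max.
    This is a simple substitute for a nonlinear least-squares fit.
    """
    y = [s / v for s, v in zip(substrate, rates)]
    n = len(substrate)
    mean_x, mean_y = sum(substrate) / n, sum(y) / n
    slope = sum((s - mean_x) * (b - mean_y) for s, b in zip(substrate, y)) / sum(
        (s - mean_x) ** 2 for s in substrate
    )
    intercept = mean_y - slope * mean_x
    v_max = 1.0 / slope
    return intercept * v_max, v_max


def turnover_number(v_max, active_sites):
    """k_cat = V_max / [E_T], with [E_T] from active-site titration."""
    return v_max / active_sites


# Synthetic, noise-free data: K_m = 50 uM, V_max = 2 uM/s, [E_T] = 0.005 uM.
conc = [5.0, 10.0, 25.0, 50.0, 100.0, 250.0]
rate = [2.0 * s / (50.0 + s) for s in conc]
k_m, v_max = michaelis_menten_fit(conc, rate)
print(k_m, v_max, turnover_number(v_max, 0.005))
```

Using the titrated active-site concentration rather than the nominal protein concentration matters because only a fraction of the preparation may be catalytically competent.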
Specificity Profiling Using an Internally Quenched Fluorogenic Probe (IQFP) Library-Substrate preference for mature dionain-1 was probed with a library of 3375 IQFPs (Mimotopes). The peptides in this library consist of AMC-GGXYZG-GDPAKK, where each variable position (X, Y, and Z) contains equimolar ratios of A/V, L/I, K/R, S/T, N/Q, E/D, F/Y, or P. Thus, each X, Y, and Z combination results in a peptide pool of 2^3 unique sequences, except for the proline-containing peptides, where the number of unique sequences is 1, 2, or 4. In total, the assay contains 512 peptide pools divided into microtiter plate wells for the analysis. The peptides remain optically silent in the uncleaved state, whereas upon cleavage, they emit a fluorescent signal with an intensity that is proportional to the extent of the cleavage. The library has been validated as a tool for specificity profiling against the major protease classes (28). The IQFP assay for dionain-1 was performed in pH 6 assay buffer (50 mM sodium acetate) at 40°C for 2 h with 6 nmol of total substrate and 0.37 pmol of activated dionain-1 in each well (2.5 nM). The extent of cleavage was evaluated as the background-subtracted end point fluorescence (λex = 320 nm, λem = 420 nm). The intensity-weighted score distribution was calculated as follows, where I denotes the product intensity score in RFU, i denotes observed individual substrate pools above 1000 RFU, and n denotes the residue type. An analysis of exact cleavage sites for the top 50 substrates was performed by MALDI-MS. In brief, 2 µl of each cleaved IQFP solution was acidified with 8 µl of 0.1% TFA solution, micropurified using C18 Stagetips (Proxeon), and spotted onto MALDI target plates using an α-cyano-4-hydroxycinnamic acid matrix. The MS analysis was performed on a Micromass Q-TOF Ultima Global mass spectrometer (Waters/Micromass).
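The library arithmetic (512 pools, 3375 peptides, pool sizes of 8, 4, 2, or 1) follows directly from the eight residue groups at the three variable positions. A short enumeration, using only the residue groups stated in the text, makes this explicit; the variable and function names are illustrative.

```python
from itertools import product

# The eight residue groups offered at each variable position (X, Y, Z).
RESIDUE_GROUPS = ["AV", "LI", "KR", "ST", "NQ", "ED", "FY", "P"]

# One pool per combination of groups at the three positions.
pools = list(product(RESIDUE_GROUPS, repeat=3))


def pool_size(pool):
    """Number of unique peptide sequences in a pool."""
    size = 1
    for group in pool:
        size *= len(group)
    return size


total_peptides = sum(pool_size(pool) for pool in pools)
proline_pool_sizes = sorted({pool_size(p) for p in pools if "P" in p})

print(len(pools))          # number of pools
print(total_peptides)      # number of unique peptides
print(proline_pool_sizes)  # sizes of pools containing at least one proline
```

The total equals 15^3 because the eight groups cover 15 distinct residues (seven pairs plus proline).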
Specificity Profiling of Dionain-1 against Bovine β-Casein-A total of 5 µg of bovine β-casein (14 kDa; Sigma) was analyzed on a 5-15% gradient gel, and the intact protein band was cut out and subjected to in-gel digestion with dionain-1 (1:50, w/w) at 37°C overnight. After digestion, the sample was acidified with 0.1% TFA, micropurified using Poros 50-µm R2 reverse phase material (Applied Biosystems), and analyzed by LC-MS/MS, as described above. Unique cleavage sites in β-casein were identified based on the peptides that were detected using Mascot version 2.3.02 (Matrix Science). A control sample of bovine β-casein without dionain-1 was included, and the four cleavage sites identified herein were excluded from the dionain-1 specificity analysis. The frequency of each residue identity on positions P3-P1′ was calculated by dividing the number of observations (p_obs) by the total number of this residue type in β-casein (p_total) after subtracting the restricted positions with a P1′ proline identity (p_pro).
Specificity Profiling against AMC Substrates-All substrates were solubilized according to the manufacturer's instructions. For Z-FR-AMC and Z-RR-AMC, 1.6 mM stocks were prepared in H2O. For all other substrates, 20 mM stocks in DMSO were used. The activities of mature dionain-1 (at 10, 50, and 250 nM) and papain (at 5, 25, and 125 nM) were tested against a 0.25 mM concentration of each peptide substrate. The average hydrolysis rates were extracted as RFU/s and normalized to the total fluorescent signal after complete substrate conversion, through the addition of excess papain. Of note, quantification was performed by normalization of the fluorescent yield to the theoretical substrate concentration because manufacturer variations precluded reliable AMC fluorophore normalization.
Dionain-1 Digest of Drosophila melanogaster-Frozen D. melanogaster were homogenized in 50 mM sodium acetate, pH 4.0, using 10 µl/fly. Thirty µl of the extract (representing three flies) were titrated with 0, 0.625, 1.25, 2.5, 5, and 10 µg of activated dionain-1. The samples were incubated at 37°C for 90 min with shaking at 1000 rpm. After incubation, 200 mM Tris-HCl, pH 7.6, was added, and one-third of the sample was immediately boiled in the presence of SDS sample buffer and 50 mM DTT and analyzed by SDS-PAGE. The content of selected protein bands was identified by mass spectrometry. The remaining two-thirds of the sample was lyophilized, reduced with 10 mM DTT in the presence of 8 M urea, and alkylated in 30 mM iodoacetamide. The sample was diluted with 20 mM ammonium bicarbonate to reduce the urea concentration to 2 M and treated with 1 µg of trypsin for 16 h at 37°C. The samples were micropurified and analyzed by mass spectrometry (see above).
Crystallization and Data Collection of Mature Dionain-1-Purified wild-type recombinant pro-dionain-1 was prepared for crystallization by maturation at pH 4.1 (50 mM sodium acetate) at 45°C for 70 min. For inhibition, E-64 was added in a 6.5-fold molar excess and incubated for 30 min at room temperature before applying the mixture to a Superdex 75 10/30 column (GE Healthcare) at 4°C, which was equilibrated with 10 mM Tris-HCl containing 20 mM NaCl at pH 7. The purification was performed using an ÄKTA Purifier FPLC system operated at 4°C (GE Healthcare). The eluate was monitored at 280 nm, and fractions were collected and assayed for the presence of mature dionain-1 + E-64 using SDS-PAGE (10% acrylamide gel, Tricine buffer system) followed by Coomassie Brilliant Blue staining. The pooled fractions were concentrated to 37 mg/ml using a Vivaspin centrifugal concentrator unit with a 5000 molecular weight cut-off polyethersulfone membrane (Sartorius).
Crystallization assays were performed by the sitting drop vapor diffusion method, and reservoir solutions were prepared by a Tecan robot, with 100-nl crystallization drops dispensed on 96 × 2-well MRC plates (Innovadyne) by a Phoenix Nano-Drop robot (Art Robbins) or a Cartesian Microsys 4000 XL (Genomic Solutions) robot at the joint Institut de Biologia Molecular de Barcelona/Institute for Research in Biomedicine Automated Crystallography Platform at Barcelona Science Park. In total, 768 different conditions were screened. Plates were stored in Bruker steady temperature crystal farms at 4 and 20°C. Successful conditions were scaled up to the microliter range in 24-well Cryschem crystallization dishes (Hampton Research).
Mature dionain-1 + E-64 in 10 mM Tris-HCl, 20 mM NaCl, pH 7, was crystallized at 4°C from equivolumetric drops with 0.8 M K2HPO4, 0.9 M NaH2PO4, 0.2 M Li2SO4, 0.1 M CAPS, pH 10.5, as reservoir solution. Crystals were cryoprotected by immersion in reservoir solution stepwise supplemented with glycerol from 0 to 30% (v/v). The optimal diffraction data set was collected at 100 K from a liquid N2 flash-cryocooled crystal (Oxford Cryosystems 700 series cryostream) on an Area Detector Systems Corp. Quantum Q315r detector at beam line ID23-1 of the European Synchrotron Radiation Facility synchrotron (Grenoble, France) within the block allocation group "Barcelona." Diffraction data were integrated, scaled, merged, and reduced with the programs XDS (29) and XSCALE (30) and transformed with XDSCONV to formats suitable for the CCP4 suite of programs (31). The crystal was centered monoclinic and contained one protein molecule per asymmetric unit (V_M = 2.20 Å³/Da; solvent content = 44%).
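The reported solvent content is consistent with the Matthews coefficient under the usual approximation that the protein fraction of the crystal volume is 1.23/V_M, which assumes a typical protein partial specific volume of about 0.74 cm³/g. A minimal sketch of that check follows; the constant and function name are this example's own, not taken from the source.

```python
# 1.23 A^3/Da corresponds to a partial specific volume of ~0.74 cm^3/g,
# a common default for proteins (assumption of this sketch).
PROTEIN_VOLUME_PER_DALTON = 1.23


def solvent_fraction(matthews_coefficient):
    """Estimate the crystal solvent fraction from V_M (in A^3/Da)."""
    return 1.0 - PROTEIN_VOLUME_PER_DALTON / matthews_coefficient


print(f"{solvent_fraction(2.20):.1%}")
```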
Structure Solution and Refinement-The structure was solved by maximum likelihood-scored molecular replacement with the program PHASER (32) and a search model obtained by trimming the side chains of the structure of castor bean (Ricinus communis) cysteine peptidase (Protein Data Bank (PDB) entry 1S4V (17)). Table 1 presents the final refinement and model quality statistics. The ideal coordinates and parameters for crystallographic refinement of the E-64 ligand were obtained from the PRODRG server (37). Structural similarity searches were performed with DALI (38), and structure figures were prepared with the CHIMERA program (39). The experimental dionain-1 structure was validated with MOLPROBITY (40). The final coordinates of mature D. muscipula dionain-1 were deposited in the Protein Data Bank (entry 5A24).

Results and Discussion
De Novo Sequencing of Full-length Pre-pro-dionain-1-The most abundant protein in the digestive fluid of the Venus flytrap, based on SDS-PAGE and Coomassie Brilliant Blue staining, is a ~45-kDa cysteine protease named dionain-1, reflecting its origin (Dionaea) and homology with papain (41). A more recent combined transcriptomics and proteomics approach revealed the existence of four different dionains in total present at both the transcript and protein levels (5). To verify the high abundance of dionain-1 in the digestive fluid, we analyzed the composition of the Venus flytrap digestive fluid after mechanical stimulation by SDS-PAGE followed by N-terminal sequencing of the major protein band migrating at ~45 kDa (Fig. 1A). The result revealed a single N terminus with the sequence DVPAAVDXRTAGAVTP (where X denotes ambiguous residue identification), uniquely identifying the protein as dionain-1. The major protein band was also digested with trypsin to facilitate de novo sequencing of peptides using chemically assisted fragmentation MALDI-MS. Two selected peptides were used to generate designated primers for the amplification and cloning of the complete pre-pro-dionain-1 cDNA sequence through targeted 5′-RACE PCR as described under "Experimental Procedures." The full-length dionain-1 cDNA sequence (GenBank™ KP663370) is 1056 base pairs long and encodes a 352-residue pre-pro-dionain-1 with a molecular mass of 37.8 kDa (Fig. 1B). The pre-pro-dionain-1 cDNA sequence obtained by high-throughput next-generation sequencing (5) as well as the previously purified and sequenced dionain-1 peptides (41) fit well with the sequence reported here by the targeted cloning approach. Pro-dionain-1 shares 37% sequence identity with pro-papain from Carica papaya and 44% sequence identity with the cysteine endopeptidase precursor (pro-Cys-EP) from the castor bean (R. communis), representing the closest ortholog with a crystal structure of the mature protease domain (PDB code 1S4V (17)).
The pro-domain of dionain-1 forms the first 100 residues (Ser1p-Thr100p, corresponding to Ser30-Thr129 of the full-length pre-pro-protein sequence), followed by the mature protease (Asp1-Ala223; Asp130-Ala352 of full-length pre-pro-protein) with active site residues Cys26 and His165, corresponding to Cys25 and His159 in papain, respectively.
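The three numbering schemes used here (full-length pre-pro-protein, pro-protein "p" numbering, and mature enzyme numbering) differ only by fixed offsets: the 29-residue signal peptide and the 100-residue pro-domain. A small helper, with hypothetical names, shows the conversion under exactly those two stated lengths.

```python
SIGNAL_PEPTIDE_LENGTH = 29  # predicted signal peptide (residues 1-29)
PRO_DOMAIN_LENGTH = 100     # pro-domain (residues 1p-100p)


def to_full_length(number, region):
    """Convert pro ('p') or mature numbering to pre-pro-protein numbering."""
    if region == "pro":
        return number + SIGNAL_PEPTIDE_LENGTH
    if region == "mature":
        return number + SIGNAL_PEPTIDE_LENGTH + PRO_DOMAIN_LENGTH
    raise ValueError("region must be 'pro' or 'mature'")


# Examples from the text: Ser1p, Thr100p, Asp1, Ala223, and the catalytic Cys26.
for number, region in [(1, "pro"), (100, "pro"), (1, "mature"),
                       (223, "mature"), (26, "mature")]:
    print(region, number, "->", to_full_length(number, region))
```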

Heterologously Expressed Pro-dionain-1 Is Glycosylated and Automaturates at Acidic pH-The functional expression of pro-dionain-1 (323 residues, 34.8 kDa, pI 4.7) was achieved in P. pastoris using the endogenous yeast signal peptide in place of the 29-residue predicted dionain-1 signal sequence (by SignalP version 4.0 (24)). The pro-form was purified by anion exchange and yielded a protein migrating at sizes ranging from 50 to 66 kDa in SDS-PAGE. We attributed the heterogeneity in mass to variations of N-linked glycans at residue Asn98p. Glycosylation was verified by treating pro-dionain-1 with endoglycosidase H and by producing and purifying a glycosylation-deficient N98pQ mutant (Fig. 1C). Both approaches resulted in a protein that migrated as a single band of ~50 kDa.
The N terminus of pro-dionain-1 was identified by Edman degradation as S1pSSRLLTS, which is consistent with the predicted processing site. Minor proportions were processed to yield N termini T7pSSEQ (~20% of total) and S8pSEQV (~10%). The molecular mass of the deglycosylated form of pro-dionain-1 derived from SDS-PAGE (~50 kDa) differed from the theoretical value (~35 kDa) by ~15 kDa. This has previously been observed for other plant cysteine proteases (42). The aberrant migration during reducing SDS-PAGE was most likely caused by acidic motifs in the active enzyme (residues 1-223) because the activated form similarly migrated abnormally at ~45 kDa (Fig. 1A), despite a theoretical molecular mass of 23.1 kDa.
The active site in the pro-form of the C1 papain proteases is sterically blocked by a reverse-oriented substrate-like pro-peptide stretch (16,43,44). Catalytic competence is gained upon acidification by the loosening of pro-domain contacts and proteolytic maturation (45-47). To investigate the activation of pro-dionain-1, the pro-form was exposed to a series of acidic buffers with pH values between 2.4 and 5.0. This mimicked biological digestive fluid conditions in which slow acidification from approximately pH 4.4 to 3.4 occurs (4). The SDS-PAGE analysis demonstrated accelerated maturation upon pH reduction (Fig. 2A). Automaturation was also followed over time at pH 4.0 and produced a similar maturation pattern, suggesting that pH reduction accelerated the process without altering its features. The autoproteolytic processing of the pro-domain was evident from the appearance of small molecular mass bands representing pro-dionain-1 fragments. The major species observed at ~9 kDa was found to have an S8pSE sequence as the N terminus. The pro-domain peptide fragments were subjected to further proteolytic breakdown, indicating the presence of initial preferential cleavage sites. Reverse phase purification and subsequent analysis of the intact peptide fragments by mass spectrometry identified S8pSEQ...PLK85p (9.3 kDa), D30pDA...PLK85p (6.6 kDa), and D30pDA...GYK81p (6.1 kDa) as the three major peptide fragments. The occurrence of these peptides preceded the full maturation of dionain-1, indicating that shedding of the glycosylated pro-domain peptide is a sequential process. Furthermore, the results suggested a substrate preference of dionain-1 for hydrophobic/aliphatic residues at P2 and positively charged residues at P1.
To better understand the catalytic activation of dionain-1, we used a Z-FR-AMC fluorogenic substrate and analyzed the substrate conversion over a range of pH values in a phosphate/citrate buffer system. The maturation of dionain-1 progressed autocatalytically in the pH range from pH 5.0 to 3.4. The exponential phase was essentially absent below pH 3.4 (Fig. 2B), and in this range, the maximal rate of substrate hydrolysis was reduced with pH, most likely by partial protonation of the active site cysteine (pH-induced change in the AMC fluorophore quantum yield was not significant). Furthermore, the time required to reach full catalytic competence (rate plateau) rapidly decreased upon acidification (Fig. 2B, inset).
Next, we extracted the peak substrate conversion rates as a function of pH. The optimal enzymatic efficiency of dionain-1 was achieved at approximately pH 3.6 and expressed the balance between pro-domain destabilization, catalytic competence, and autodegradation (Fig. 2C). Additional analyses revealed a clear Arrhenius relationship of the automaturation process upon temperature increase, confirming that no apparent thermal threshold is required to unfold or facilitate pro-domain destabilization. The maturation of dionain-1 protease was concentration-dependent, suggesting the dominance of intermolecular trans-activation events. This result correlates well with the pro-domain location of the major observed cut at Lys85p that cannot be performed intramolecularly because of the apparent distance to the active site cleft (see below). Overall, the autolysis of pro-dionain-1 is thermally accelerated and only proceeds to a significant extent below pH 5.0. It reaches a maximum at pH 3.5-3.8 before the protonation of key residues or acid-mediated partial denaturation slows the catalytic activity. This suggests that pro-dionain-1 maturation is tailored for a slightly more acidic environment than that of pro-papain, which presents optimal automaturation at pH 3.5-4.0 (45).
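The link between concentration dependence and trans-activation can be illustrated with a minimal kinetic sketch. It assumes two idealized limiting cases that are not taken from the experimental procedures: a first-order intramolecular (cis) step, and a second-order autocatalytic (trans) step E + P -> 2E in which active enzyme E processes pro-enzyme P. The rate constants, concentrations, and seed fraction are hypothetical values chosen only for illustration.

```python
import math


def half_time_cis(k_cis):
    """Half-time of a first-order (intramolecular) activation.

    It does not depend on the protein concentration.
    """
    return math.log(2) / k_cis


def half_time_trans(k_trans, total_conc, seed_fraction):
    """Half-time of autocatalytic trans-activation, E + P -> 2E.

    Rate law (assumed): d[E]/dt = k_trans * [E] * (total_conc - [E]).
    total_conc is [E] + [P]; seed_fraction is the active fraction at t = 0.
    The logistic solution reaches [E] = total_conc / 2 at the returned time.
    """
    if not 0 < seed_fraction < 0.5:
        raise ValueError("seed_fraction must lie between 0 and 0.5")
    return math.log((1 - seed_fraction) / seed_fraction) / (k_trans * total_conc)


# Hypothetical parameters, for illustration only.
for conc in (1.0, 2.0, 4.0):
    print(conc, half_time_cis(0.1), half_time_trans(0.1, conc, 0.01))
```

Under these assumptions, doubling the total protein concentration halves the trans-activation half-time, whereas the cis half-time stays constant. A concentration-dependent maturation rate is therefore the behavior expected when intermolecular events dominate.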
Pro-region Glycosylation Supports Proper Folding of Dionain-1 but Does Not Influence Stability-The presence of an N-linked glycan attached to Asn98p in pro-dionain-1, only two residues away from the mature N terminus (Fig. 1), prompted us to investigate the effect of glycosylation on both domain stability and automaturation. First, the maturation of WT and endo-β-N-acetylglucosaminidase H-treated pro-dionain-1 was tracked by SDS-PAGE and showed no apparent difference (Fig. 3A). Second, the same D1VVPA mature dionain-1 N terminus was observed by Edman degradation. Third, thermal denaturation by CD spectroscopy was conducted to address the structural stability and gave very similar transition midpoints (Tm) of 72.4 ± 0.1 and 73.0 ± 0.1°C for WT and deglycosylated pro-dionain-1, respectively (Fig. 3B). Because no apparent effect on enzyme function was found for the Asn98p glycosylation, we addressed the glycosylation-deficient N98pQ mutant. This variant was successfully secreted by P. pastoris but displayed severe cooperative unfolding above 40°C with a Tm of only 42.7 ± 0.1°C (Fig. 3B). High proteolytic susceptibility of N98pQ toward trypsin corroborated the lack of proper domain folding, where both WT and deglycosylated pro-dionain-1 were well protected. These data collectively suggest that pro-dionain-1 glycosylation serves to ensure proper folding of the pro-enzyme through the secretory pathway during biosynthesis. Note that pro-domain glycosylation is found in the homologous human pro-cathepsin K (48) and is considered a prerequisite for functional expression of pro-papain in insect cells (49). However, because pro-domain glycosylation motifs are not always used (42) or may be absent in certain cysteine proteases, it is not a per se requirement for proper folding and processing of C1 proteases.
Dionain-1 Activity and Kinetic Parameters-The acidic digestive fluid of Venus flytraps leads to the production of mature dionain-1.
The pro-region shedding at low pH to produce a more compact functional form (residues 1-223) is similar to the process in papain and other related extracellular plant cysteine proteases participating in defense mechanisms in the latex of fruits and plants during wounding (19). To determine the kinetic parameters of mature dionain-1, the steady-state hydrolysis rate of Z-FR-AMC was probed as a function of pH and compared with that of papain (Fig. 4A). The optimum activity was found at the pH interval from 5.4 to 6.0, and a rapid decline was observed above pH 7.0. Interestingly, dionain-1 maintained higher relative activity on the acidic limb of the bell-shaped curve compared with that of papain. At pH 3.4, dionain-1 displayed 35% of maximal activity, whereas papain was only 5% active. The catalytic competence of the mature acidic cysteine proteases is primarily linked to the formation of a cysteine-S⁻/imidazolium-H⁺ ion pair. The particular reactive Cys displays a low microscopic pKa value (2.5-3.4), but surrounding electrostatic modulators of the active site result in a bell-shaped curve with optimum proteolytic activity at approximately pH 6.0 (50-53). The pH optimum of dionain-1 and the conservation of the most important electrostatic modulators around the active site indicate a similar catalytic construction. However, the biological environment in the digestive fluid of the Venus flytrap requires dionain-1 to function from pH 3.2 to 4.5, which underscores how both the optimum for automaturation (Fig. 2) and the enhanced acidic activity (Fig. 4) may reflect an adaptation to functioning in the acidic environment of the "green stomach." The thermal effect on dionain-1 substrate hydrolysis was found to be positive up to ~60°C, after which the activity rapidly declined. This was most likely caused by denaturation and enhanced autodegradation (Fig. 4B).
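To give a rough feel for what the relative activities at pH 3.4 imply, the sketch below converts a single relative-activity reading into an apparent pKa for the acidic limb of the bell-shaped curve. It assumes a single-ionization model, f = 1 / (1 + 10^(pKa - pH)), and ignores the alkaline limb entirely. This model and the function name are illustrative assumptions, not the analysis used in the study, and the resulting values are apparent quantities only.

```python
import math


def apparent_acidic_pka(ph, relative_activity):
    """Apparent acidic-limb pKa from one relative-activity reading.

    Assumes f = 1 / (1 + 10**(pKa - pH)), i.e. one ionizing group and
    no contribution from the alkaline limb (a simplifying assumption).
    """
    if not 0 < relative_activity < 1:
        raise ValueError("relative_activity must lie between 0 and 1")
    return ph + math.log10(1 / relative_activity - 1)


# Relative activities at pH 3.4 as reported in the text.
print("dionain-1:", round(apparent_acidic_pka(3.4, 0.35), 2))
print("papain:   ", round(apparent_acidic_pka(3.4, 0.05), 2))
```

Within this simplified model, the higher residual activity of dionain-1 corresponds to an acidic limb shifted by roughly one pH unit toward lower pH relative to papain.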
The kinetic parameters of dionain-1 at pH 6.0 and 35°C against Z-FR-AMC were determined and compared with those of papain (Table 2). In both cases, the amount of active enzyme was determined by active site titration with E-64. Notably, poor E-64 inhibition of dionain-1 occurred, possibly due to the presence of interfering pro-domain sequences, and resulted in a slightly overestimated active site concentration. Interestingly, the Km values for the two enzymes were very similar (68.5 ± 4.7 and 71.3 ± 4.7 μM for papain and dionain-1, respectively), but the catalytic efficiency was higher for papain against this common cysteine protease substrate.
Effect of pH and temperature on pro-dionain-1 automaturation. A, pro-dionain-1 was incubated at 0.2 mg/ml at the indicated pH values at 37°C for 45 min. E-64 (1 mM) was added, and the samples were analyzed by SDS-PAGE. Low pH triggered automaturation to produce mature dionain-1 (~45 kDa, arrow) and degradation of its own N-terminal pro-domain. B, the enzymatic activity was followed by Z-FR-AMC hydrolysis over time as a function of pH, indicated as RFU. Each curve represents triplicate mean values with experimental deviation of <2.5%. The time to reach the maximal enzyme activity (rate plateau) was extracted through first derivative plots (see inset). The curve profiles clearly demonstrated the pH dependence of the automaturation process. C, the maximum enzyme activity (rate plateau value in RFU/s) was plotted as a function of pH. Error bars, plateau value fit error to the triplicate mean derivative plot. The inset illustrates the derivative plot at pH 3.8 and the fitting curve to the automaturation function (see "Experimental Procedures").
Specificity of Dionain-1 Probed by Substrate Library Profiling-The specificity of dionain-1 was determined using a substrate library screening approach with three variable positions (Trp/His/Met/Cys/Gly not represented).
The library contained 3375 different peptide sequences and was based on IQFPs. The analysis suggested that dionain-1 cleaved ~30% of the substrates in the library to varying extents. Decoding of the resulting main substrate pools (168 sequence motifs) and weighting them by the fluorescence output revealed a high occurrence of Leu/Ile, Arg/Lys, and Phe/Tyr on the variable positions (Fig. 5A). Because the positive identification of substrate cleavage did not provide the P1-P1′ border, we performed mass spectrometry analyses of the top 50 substrate pools, of which the top 10 are shown (Fig. 5B). The identification of Leu/Ile at P2 and Arg/Lys at P1 is consistent with the papain-like preference for hydrophobic residues at P2 and a positively charged P1 (54). In addition, cleavage also occurred with Arg/Lys in the P2 position, which has not been observed for papain (55) (Fig. 5B).
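The stated library size follows directly from the design: three variable positions, each drawn from the 20 standard residues minus the five that are not represented. The short sketch below enumerates that sequence space. It assumes every remaining residue can occur at every variable position; how the sequences are grouped into physical pools is not modeled here.

```python
from itertools import product

ALL_RESIDUES = "ACDEFGHIKLMNPQRSTVWY"   # 20 standard amino acids
EXCLUDED = set("WHMCG")                 # Trp/His/Met/Cys/Gly not represented

allowed = [res for res in ALL_RESIDUES if res not in EXCLUDED]

# Every combination of allowed residues over the three variable positions.
library = ["".join(combo) for combo in product(allowed, repeat=3)]

print(len(allowed), "residues per position,", len(library), "sequences")
```

Fifteen residues at each of three positions give 15 x 15 x 15 = 3375 sequences, matching the number reported for the library.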
To further characterize the P2 preference of dionain-1 observed in the library, peptides with validated cleavage after Arg/Lys were grouped and analyzed for their residue type distribution at P3-P1′ (Fig. 5C). Of the 16 validated unique peptides, 13 contained either Phe/Tyr or Ile/Leu at P2, two contained Asn/Gln, and one contained Arg/Lys. Therefore, hydrophobic residues are favored at P2 by dionain-1, although positively charged residues may also be accommodated. The P3 position did not display any considerable preference, whereas the P1′ position preferred the glycine provided by the linker region. Several cleavages after the first C-terminal linker glycine were observed during the MS validation of P1-P1′ positions in the positive peptide substrates. Because glycine is outside of the variable positions, these cleavage sites would have escaped detection without the MS analysis. The residue preference for the glycine P1 cleavage was followed by a hydrophobic P2 position and an Arg/Lys preference at P3 (Fig. 5D). This identification suggests a need for a stabilizing charge interaction in P3 to support the cleavage after small uncharged residues at P1.
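The P3-P1′ bookkeeping used in this analysis can be made explicit with a small helper that labels the residues around a scissile bond. This is a minimal sketch of the standard subsite nomenclature only; the function name and the example peptide are hypothetical and do not come from the library data.

```python
def subsite_residues(peptide, cleavage_index, n_unprimed=3, n_primed=1):
    """Label residues around a scissile bond as P3..P1 and P1'.

    cleavage_index is the number of residues N-terminal to the bond, so the
    bond lies between peptide[cleavage_index - 1] (P1) and
    peptide[cleavage_index] (P1').
    """
    if not 0 < cleavage_index < len(peptide):
        raise ValueError("cleavage must fall between two residues")
    labels = {}
    # Unprimed side: count backwards from the bond (P1, P2, P3, ...).
    for i in range(1, n_unprimed + 1):
        pos = cleavage_index - i
        if pos >= 0:
            labels[f"P{i}"] = peptide[pos]
    # Primed side: count forwards from the bond (P1', P2', ...).
    for i in range(1, n_primed + 1):
        pos = cleavage_index + i - 1
        if pos < len(peptide):
            labels[f"P{i}'"] = peptide[pos]
    return labels


# Hypothetical peptide cleaved after the Lys: A-L-K | G
print(subsite_residues("ALKG", 3))
```

Applied to a cleavage after a linker glycine instead, the same helper shifts every label by one position, which is why those cuts show a hydrophobic residue at P2 and Arg/Lys at P3.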
The dionain-1 cleavage of peptides with a charged residue at P2 (R/K) prompted us to investigate differences in substrate preference between papain and dionain-1. A range of di- and tripeptide AMC-coupled substrates were investigated through the extraction of their relative activity benchmarked against Z-FR-AMC (Fig. 5F). The results for the dipeptide substrates Z-FR-AMC, Z-LR-AMC, and Z-RR-AMC showed a preference for Leu in P2 for dionain-1, whereas papain preferred Phe at this position. However, both enzymes only displayed marginal activity with Arg at P2. To test whether the presence of a P3 residue altered the activity toward basic residues at P2, we analyzed the activity toward Boc-LKR-AMC and Boc-GKR-AMC, where the former substrate resembles the composition of one of the top hits in the IQFP library. Dionain-1 was able to cleave Boc-LKR-AMC as efficiently as the Z-LR-AMC substrate but with a clear indication of Leu at P2 and Lys at P1 (identified by the reduced fluorescence yield of R-AMC compared with free AMC). Papain also cleaved this substrate but with reduced efficiency, again preferring Phe over Leu at P2. However, Boc-GKR-AMC was a poor substrate, suggesting that a positively charged residue at P2 was unfavorable, at least in the case of the small peptide substrates tested here.
Dionain-1 Degradation of Intact Protein Targets-The substrate preference of dionain-1 was tested in the context of an intact protein target using bovine β-casein. Here, incubation overnight at an enzyme substrate ratio of 1:50 (w/w) led to extensive β-casein degradation. Analysis of the P3-P1′ identities in the observed proteolytic peptides demonstrated promiscuous P2 acceptance by dionain-1, although hydrophobic/aliphatic residues were still prevalent (~50% of all cuts) (Fig. 5E). In addition, the substrate preference may change or broaden upon acidification (56) and accelerate autolytic cleavage at the dionain-1 pro-domain border (FT↓DV), not otherwise considered a favored substrate. Next, we addressed the biological function of dionain-1 as a digestive protease by exposing a Drosophila protein extract to increasing amounts of mature dionain-1 at pH 4.0. A clear degradation of several protein bands was observed by SDS-PAGE, and their content was identified by mass spectrometry (Fig. 6A, lanes 1-6). Interestingly, myosin represented the majority of the proteins in all of the analyzed gel bands, including the bands most susceptible to dionain-1 treatment (Fig. 6A). Although endogenous Drosophila proteases may have contributed to the presence of myosin fragments of various sizes, this protein constitutes an excellent biological substrate for dionain-1. To quantitatively assess the degradation of the most abundant proteins in the Drosophila extract, we analyzed the digests directly by mass spectrometry using the exponentially modified protein abundance index method (Fig. 6B) (27). Myosin heavy chain emerged as a major substrate for dionain-1, supporting our observation by SDS-PAGE. It accounted for ~27% of the total protein content and was degraded faster than any of the other abundant proteins, with a reduction to 14% of the total protein content at the lowest dionain-1 concentration.
... indicate identified cleavage after both P1 positions. C, substrate P3-P1′ residue distribution with R/K set to P1 from substrates with R/K in the Zaa or Yaa position. A clear preference for hydrophobic P2 residues (L/I or F/Y) is seen with a minor fraction of N/Q or R/K. Gly at P1′ is part of the linker sequence in the IQFP substrate. D, substrate P3-P1′ residue distribution for peptides with validated cleavage after Gly in the linker sequence (as P1 position). A strong preference for hydrophobic residues at P2 and R/K in P3 is observed. E, substrate P3-P1′ residue distribution (weighted residue frequencies) of the observed β-casein peptide cuts after overnight in-gel digest by dionain-1. Peptide identification was done by LC-MS/MS. Also, cleavage with P2 His (1 of 2 possible), Trp (1 of 1), and Met (1 of 4 possible) was observed but is excluded from the pie diagram shown here. F, relative activities of dionain-1 and papain against different di- and tri-peptide AMC substrates, normalized to the rate of Z-FR-AMC hydrolysis.
The exponentially modified protein abundance index value is essentially based on the number of identified peptides in a protein after trypsin treatment (27). The list of detectable tryptic peptides is reduced by enzymatic action at P1 sites other than Lys and Arg prior to the trypsin treatment. The clear reduction we observed in myosin content underlines the promiscuous nature of dionain-1, which was also evident in the degradation of the β-casein substrate (Fig. 5E). Myosin has a 17% relative content of Lys and Arg residues compared with the average of 10% in the Drosophila proteome. This may explain the effective breakdown by dionain-1, which, although promiscuous, displays a preference for Lys and Arg in the P1 position (Fig. 5). Therefore, in a biological context, dionain-1 mediates the rapid release of peptides high in nitrogen content and facilitates the essential nitrogen uptake for the carnivorous lifestyle.
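The abundance estimate referred to above can be sketched as follows. The index is commonly defined as emPAI = 10^(N_observed / N_observable) - 1, with the molar protein content taken as each protein's share of the summed emPAI values. The text does not spell out the formula, so this form is assumed from the general description of the method, and all peptide counts below are hypothetical.

```python
def empai(n_observed, n_observable):
    """Exponentially modified protein abundance index.

    n_observed: tryptic peptides identified for the protein.
    n_observable: tryptic peptides that could in principle be detected.
    """
    if n_observable <= 0:
        raise ValueError("n_observable must be positive")
    return 10 ** (n_observed / n_observable) - 1


def mol_percent(empai_by_protein):
    """Protein content (mol %) as each protein's share of the summed emPAI."""
    total = sum(empai_by_protein.values())
    return {name: 100 * value / total for name, value in empai_by_protein.items()}


# Hypothetical peptide counts before and after protease treatment.
before = {"myosin": empai(30, 60), "other": empai(20, 40)}
after = {"myosin": empai(15, 60), "other": empai(18, 40)}
print(mol_percent(before))
print(mol_percent(after))
```

Cleavage at non-tryptic sites before the trypsin step removes peptides from the observed list, which lowers the index for that protein and hence its calculated share of the total.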
Structure of Dionain-1-The molecular structure of dionain-1 in complex with the covalent inhibitor E-64 (57) was determined by x-ray diffraction and refined with data to 1.5 Å resolution. As expected, the mature peptidase demonstrated an overall shape reminiscent of other plant cysteine peptidases, such as those of papaya and castor bean (R. communis). Similar to other cysteine proteases from the papain superfamily, dionain-1 has three disulfide bridges and consists of two domains that form the active site cleft at their interface (Fig. 7). The most N-terminal domain is primarily α-helical and harbors the catalytic nucleophile, Cys26, along with Gln20, which forms the oxyanion hole (corresponding to Cys25 and Gln19 in papain (58)). The catalytic His, His165, is located in the β-sheet-rich C-terminal domain along with Asp164, which contributes to catalytic efficiency. The resemblance between dionain-1, papain, and castor bean cysteine protease is reflected by the low root mean square deviation of the corresponding Cα atoms upon structural superposition (0.56 and 0.52, respectively) (Fig. 7C).
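For readers unfamiliar with the metric, the root mean square deviation quoted above is the square root of the mean squared distance between paired Cα atoms. The sketch below computes it for two coordinate sets that are assumed to be already superposed and paired residue by residue. The superposition step itself is not shown, and the coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between paired 3D coordinates.

    Both inputs are sequences of (x, y, z) tuples in the same order and
    are assumed to be superposed already.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical coordinates: the second set is displaced by 1 unit along z.
set_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
set_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(set_a, set_b))
```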
Catalytic Site and E-64 Contacts-The geometrical arrangement and electrostatic environment around the catalytic site are major determinants of catalytic competence and efficiency through their effect on the pKa values of the catalytic residues. This architecture not only ensures the existence of the cysteine-S⁻/imidazolium-H⁺ ion pair but also speeds up the acylation and deacylation steps in the hydrolytic cycle by interaction or stabilization of the transition states (59). Active site His165 is hydrogen-bonded to Asn187 in the dionain-1 crystal structure, and Asp164 forms a contact with the main-chain amide nitrogen of Ala142 and Ser143. This configuration is essentially identical to the arrangement found in both papain (Asp158 contacts Ala136 and Ala137) and castor bean cysteine protease (Asp161 contacts Ala139 and Gly140). The active site Cys26 is covalently linked to the C2 atom of the epoxy ring of the cysteine protease inhibitor, E-64, which neatly fits into the subsites of the substrate-binding cleft upstream of the cleavage site and is held in place by multiple hydrogen bonds (Fig. 8A and Table 3). The arrangement is highly similar to the binding mode found in papain and other cysteine proteases of the same family and causes only a minor expansion of the active site to accommodate the proper coordinate chemistry (60,61). The oxyanion hole is occupied by the carboxylic acid group of E-64, and the leucyl moiety is located in the S2 pocket of dionain-1, where it is held in place by hydrophobic interactions and by backbone hydrogen bonding of the succeeding E-64 amide nitrogen to the carbonyl oxygen of Gly69 (Fig. 8A). The 4-guanidinobutane moiety of E-64 is not visible in the electron density map (Fig. 8A), most likely due to the absence of interactions with the enzyme to restrict its motion.
In papain, Tyr61 and Tyr67 hold the guanidinium group in place through bonding via the side-chain hydroxyl groups (61); however, dionain-1 differs by having Arg64 and Thr70 in the corresponding positions. This arrangement indicates clear differences with a possible effect on substrate recognition profiles from S3 subsites and onward between these homologous cysteine proteases.
Subsites of Dionain-1-The nucleophilic attack on the peptide carbonyl group of the P1-P1′ scissile bond by the thiolate anion of Cys26 is guided by the accommodation of the peptide or peptide-like substrate in the subsites of the active site cleft of dionain-1. The unique subsite arrangement is illustrated in Fig. 8C and indicates the electrostatic groups involved in the selectivity and modulation of substrate binding. The S1 subsite of dionain-1 is a relatively shallow V-shaped groove that primarily provides main chain stabilization, as seen in the E-64 contacts formed here. The substrate library profiling (Fig. 5) indicates that long-chain positively charged residues, such as Arg or Lys, are highly preferred at this position, followed by smaller aliphatic residues. Indeed, the exposed main-chain oxygen atoms from Gly24, Cys66, and Asn67 would disfavor negatively charged Asp and Glu residues. A similar arrangement is seen in Cys-EP, where the inhibitor P1 arginine points toward the sol-
In both Cys-EP and ervatamin A, the cavity-bottom residue is also Ala, but the volume of the S2 subsite is restricted by substitution of Pro71 (in dionain-1) with Met (Cys-EP) or Phe (ervatamin A). Additionally, the presence of Thr70 in dionain-1 in place of papain's Tyr67 widens the substrate-binding cleft from subsite S3 outward (Fig. 8D). This places dionain-1 as a promiscuous cysteine protease with a spacious S2 pocket and a large range of possible interactions to accommodate the P3 substrate residue and beyond. These structural features agree with the substrate library profiling in which large hydrophobic residues preferentially occupy the S2 site and low selectivity is seen for P3. In regard to the observed cleavages for substrates with Gly in P1 and concomitant preference for Lys/Arg in P3 (Fig. 5D), dionain-1 adequately carries a negatively charged interface by Asn62 and Asp63 that could comprise a stabilizing S3 subsite.
In fact, this interface results from a dionain-1-specific 2-residue insertion (Asp63-Arg64) that gives rise to a main-chain protrusion toward subsites S3-S4 (Fig. 8E). For the dionain-1 activity observed for substrates filling the S2 subsite with a Lys/Arg residue (Fig. 5B), the accommodation of the positive charges in S2 could be explained by the presence of Gln217 at the top of the pocket, which may act as a hydrogen bond donor. Such stabilization is observed in cathepsin B, which carries an Asp in this position for charge compensation (64,65). Downstream of the scissile bond, the dionain-1 S1′ pocket is very similar to that of papain and is formed by the side chains of Ala142, Gln148, Asp164, and Trp189. From here, the substrate main chain may continue along the domain interface and engage in various interactions because a particular structural requirement for P2′ side chains (or further) is not apparent.
The overall globular structure of dionain-1 is highly similar to the papain-like fold, differing primarily by the presence of three sequence inserts (Fig. 8E). The largest sequence insert is found from residue 176 to 181 in the C-terminal domain and leads to expansion of an antiparallel β-sheet, located on the distal top side of the active site cleft. In Cys-EP, a similar 5-residue insertion is found, but the functional role of this segment remains to be established.
Insights into the Pro-domain-The pro-domain of pro-dionain-1 was modeled by the structural prediction program PHYRE2 (66), justified by the conservation of known important sequence motifs for the pro-domain interaction surface with the core enzyme and intradomain salt bridge networks (15, 16) (Fig. 9A). The model suggested an architecture of the dionain-1 pro-domain similar to that of pro-papain and pro-caricain (Fig. 9B) and was combined with the mature dionain-1 enzyme structure. In general, the topology of the enzyme body does not alter significantly upon maturation in cysteine peptidases (root mean square deviation = 0.35 Å for papain and the pro-papain enzyme core Cα atoms) (18). Interesting differences between pro-papain, pro-caricain, and the putative pro-dionain-1 structure are observed in the pro-domain region that covers subsites S2′-S2 (Fig. 9C). In pro-papain, helix III is anchored to the core domain in the S2′ pocket by the hydrophobic pro-domain residues Phe70p, Phe78p, and Tyr82p (pro-domain residue numbers with "p" suffix; Fig. 9C). In pro-dionain-1, the hydrophobic patch is formed by Phe65p, Phe73p, and His77p. The ionizable nature of His77p is likely to play a role in the fine-tuning of automaturation, considering that an F70pH mutation in papain causes a pH shift in activation from 4.0 to 5.0 (16). The extended polypeptide stretch that follows helix III sterically blocks the active site. In papain, the stretch is composed of Thr82p-Gly83p-Ser84p-Leu85p, which bulges away from the catalytic cysteine. In pro-caricain, the similar peptide stretch is Val-Gly-Ser-Leu, but here, the Gly83 carbonyl group is in hydrogen bonding distance to Cys25, and the Ser84p Oγ is positioned toward the solvent.
For pro-dionain-1, the model suggests that the active site would be blocked by Asn78p-Gly79p-Tyr80p-Lys81p which, similar to pro-caricain, place Gly79p in hydrogen-bonding distance to Cys26 but also reduce the insertion of the backbone of Tyr80p in the S1 pocket through a stabilized bulge, involving Arg74p and Asn78p (Fig. 9C, middle). Tentatively, these changes could facilitate the filling of the S2 pocket by positively charged Lys81p, which is larger than the corresponding Leu residues in papain and caricain.
The observed primary proteolytic sites in the pro-domain of dionain-1 during automaturation are VYK29p, GYK81p, and PLK85p, as previously mentioned. Lys29p locates to the loop region between pro-domain helices I and II and would be solvent-exposed according to the modeled structure of pro-dionain-1. Tyr28p, however, would be buried in the hydrophobic interstrand environment, participating in the hydrogen bond network of Arg24p and Asp65p (conserved motif; see Fig. 9A). Thus, loosening of pro-domain contacts upon acidification is predicted to cause unfolding and solvent exposure of Tyr28p, which would mediate binding to the substrate cleft of dionain-1 and hydrolysis at Lys29p. The cleavages at Lys81p and Lys85p both locate to the extended polypeptide chain spanning the non-primed subsites of dionain-1. In fact, Lys81p fills the S2 site in the proposed pro-domain model. For these peptide stretches to be cleaved intramolecularly, it is necessary to liberate the active site, loosen the stretch, and reverse the peptide chain direction. The reversal is not possible due to structural and spatial constraints, which support an intermolecular cleavage event at these positions. The final maturation to create the dionain-1 N terminus at Asp1 is also assumed to be mediated through trans-processing for the same reason.
Dionains of the Venus Flytrap-The presence of four distinct dionains in Venus flytrap digestive fluid has been established at the transcript and protein level; however, the regulation and function of each isoform remain unclear (5). The full-length sequence of pre-pro-dionain-2 is 85% identical to dionain-1 and does not differ in any aspects regarding the S2 pocket or the pro-peptide chain that blocks the active site cleft. This may represent a functional redundancy and a likely result of an ancestral gene duplication event. Pro-dionain-3 is 58% identical to dionain-1 within the aligned sequence (from Glu18p to Thr222), and dionain-4 is the most diverse, with only 48% identity (from Glu10p to Thr222). The S2 substrate specificity pockets of both dionain-3 and -4 differ slightly from that of dionain-1, with either Asp (dionain-3) or Glu (dionain-4) in place of Gln217 providing a full electrostatic stabilization for a positively charged P2 residue. The substrate-like pro-peptide stretches read GYKQR and GTKLK, maintaining the S2 pocket occupation by a pro-peptide Lys side chain. Although dionain-1 to -3 carry the pro-region ionizable His in the hydrophobic patch of the S2′ site (Fig. 9B), dionain-4 has the more common Tyr in this position and also lacks the pro-domain glycosylation signal. In addition, the active site dicysteine motif, which is present in dionain-1 to -3, is substituted with the more regular serine-cysteine sequence in dionain-4, suggesting a more distant relationship. The presence of multiple dionain isoforms in the digestive fluid of Venus flytrap is comparable with that of the enzyme-rich latex produced upon injury in several other plants.
Features of the pro-dionain-1 structure from homologous modeling. A, sequence alignment of pro-dionain-1, pro-papain, and pro-caricain reveals highly conserved sequence motifs, including ERFNIN (boxed orange), salt bridges (indicated by squares), and hydrophobic packing motifs (indicated by spheres). Most divergence is seen in the pro-peptide region blocking the active site (subsite filling) and the sequence that follows. Sequence numbers indicate pro-dionain-1 numbering. B, pro-enzyme structures of thermostable pro-papain (left, PDB entry 3TNX) and pro-caricain (right, PDB entry 1PCI). Pro-dionain-1 (middle) was modeled using Phyre2, and the core enzyme was replaced with the crystal structure determined in this study. C, display of the subsite filling by the pro-peptides of the cognate enzymes. Hydrophobic patch residues are colored green, and active site cysteine and histidine are presented by orange and purple surfaces, respectively (pro-caricain carries the H164A mutation). For pro-dionain-1, subsite carbonyl groups in the core enzyme are represented by red surface colors. For pro-papain and pro-caricain, selected core enzyme-stabilizing carbonyl groups are represented by maroon surface colors.
Therefore, the closely related dionains are most likely an attribute of a defense-related system in an ancestral lineage. The carnivorous adaptation has been facilitated by exploitation of a pre-existing proteolytic machinery that, in combination with aspartic acid proteases and a serine carboxypeptidase (5), now constitutes an evolutionarily tailored digestive system for efficient breakdown and nutrient absorption. The origin of the essential proteolytic activity associated with an efficient plant carnivorous lifestyle differs between species and reflects independent evolutionary events (1,67). For D. muscipula, cysteine proteases account for the majority of the proteolytic activity in prey digestion (5,41,68), and recent studies strongly link chemical stimulation by the jasmonate phytohormones to the increased production and secretion of the dionains (69). Such results provide evidence for the fine-tuning of digestive fluid proteolytic activity by hormonal stimulation from prey decomposition, in addition to the mechanical stimulation triggered by trap closure. A similar induction of cysteine proteases in the leaves of R. communis in response to mechanical wounding or jasmonates (70) further supports the link between protective defense mechanisms and evolution of plant carnivory.
Conclusion-The unambiguous identification by N-terminal sequence analysis of dionain-1 in Venus flytrap digestive fluid corroborates the abundance of this protease in the medium. In this study, we used a targeted approach to provide the full-length pre-pro-dionain-1 cDNA sequence. The recombinant heterologous expression of pro-dionain-1 in P. pastoris was established to advance the knowledge of this digestive cysteine protease.
The findings presented here show that mature dionain-1 is structurally homologous to the Cys-EP and papain cysteine proteases from MEROPS family C1. It is activated at low pH by autocatalytic processing and dissociation of the inhibitory pro-domain. Homology modeling of the pro-dionain-1 structure suggests that access to the active site may be blocked by a stabilized Arg-Tyr bulge under occupancy of the S2 subsite by a pro-region Lys residue. The mature protease displays broad activity against a wide range of substrate sequences and prefers hydrophobic residues at the P2 position of the substrates and long positively charged residues at P1. Dionain-1 is inefficient at cleaving substrates carrying positively or negatively charged residues in P2. However, as gauged from the pro-domain organization, positively charged residues seem to fit into the S2 pocket but may lead to suboptimal positioning of the scissile bond for nucleophilic attack. The presence of dionain isoforms and a carboxypeptidase in the enzymatic sap of the Venus flytrap (5) complements dionain-1 in the enzymatic breakdown and highlights the concerted action of proteases for efficient protein and peptide hydrolysis during prey digestion.
This study provides the first functional and structural insight into the essential proteolytic component of the Venus flytrap "green stomach." Dionain-1 is an efficient, stable, and broad digestive protease resembling defense-related plant proteases. It is activated by the acidic pH in the digestive fluid and is able to efficiently degrade prey proteins. The dionain-1 pro-domain binding mode and inhibitory potential are of particular interest for future investigation. Because pro-domain peptides have been shown to confer potent and selective inhibition of homologous C1 proteases across species (71,72), such knowledge could be applied in the specific targeting of proteolytic processes. This may be relevant in the inhibition of prey proteases released during the Venus flytrap digestion process or in the field of cancer biology and beyond, where homologous C1 proteases play pivotal roles in tissue remodeling, and selective inhibitors are of great interest to pharmaceutical and therapeutic research.