Characterization of Gut-associated Cathepsin D Hemoglobinase from Tick Ixodes ricinus (IrCD1)*

Background: Aspartic peptidase activity initiates a multienzyme hemoglobinolysis inside tick guts.
Results: IrCD1 is a structurally unique hemoglobinolytic cathepsin D that is up-regulated in tick gut cells during feeding.
Conclusion: IrCD1 is the major intestinal aspartic peptidase of I. ricinus.
Significance: Biochemical and functional characterization of IrCD1 completes our knowledge on initial host hemoglobin degradation inside tick gut cells.

To identify the gut-associated tick aspartic hemoglobinase, this work focuses on the functional diversity of multiple Ixodes ricinus cathepsin D forms (IrCDs). Out of three encoding genes representing Ixodes scapularis genome paralogs, IrCD1 is the most distinct enzyme with a shortened propeptide region and a unique pattern of predicted post-translational modifications. IrCD1 gene transcription is induced by tick feeding and is restricted to the gut tissue. The hemoglobinolytic role of IrCD1 was further supported by immunolocalization of IrCD1 in the vesicles of tick gut cells. Properties of recombinantly expressed rIrCD1 are consistent with the endo-lysosomal environment because the zymogen is autoactivated and remains optimally active in acidic conditions. Hemoglobin cleavage pattern of rIrCD1 is identical to that produced by the native enzyme. The preference for hydrophobic residues at the P1 and P1′ position was confirmed by screening a novel synthetic tetradecapeptidyl substrate library. Outside the S1-S1′ regions, rIrCD1 tolerates most amino acids but displays a preference for tyrosine at P3 and alanine at P2′. Further analysis of the cleavage site location within the peptide substrate indicated that IrCD1 is a true endopeptidase. The role in hemoglobinolysis was verified with RNAi knockdown of IrCD1 that decreased gut extract cathepsin D activity by >90%. IrCD1 was newly characterized as a unique hemoglobinolytic cathepsin D contributing to the complex intestinal proteolytic network of mainly cysteine peptidases in ticks.

Previous mapping of those proteolytic enzymes (peptidases) that digest the blood meal inside the guts of partially engorged I. ricinus females demonstrated the presence of cysteine and aspartic peptidases. Their multienzyme complex operating in the acidic compartments of tick gut cells (4,5) is analogous to those found in platyhelminthes (6,7) and nematodes (8). This complex most likely predated the evolution of secreted alkaline trypsin-like hemoglobinases of blood-sucking insects (9). Using biochemical assays and PCR cloning systems, a model describing tick hemoglobinolysis was developed (5,10). In this model, an aspartic cathepsin D endopeptidase (IrCD), supported by the cysteine peptidases cathepsin L (IrCL) and asparaginyl endopeptidase (legumain; IrAE), is responsible for initiating cleavage of host hemoglobin. Production of shorter secondary hemoglobin fragments is performed primarily by a cathepsin B (IrCB). These peptides are further processed by the exopeptidase activity of IrCB (carboxyl dipeptidase) and IrCC (cathepsin C, an amino dipeptidase). Single amino acids are then released from N or C termini of peptides by leucine or serine monoexopeptidases, respectively.
Here, we demonstrate that out of the three identified Ixodes cathepsin D paralogs, the newly characterized and most diverse IrCD1 is responsible for the specific aspartic endopeptidase activity detected from I. ricinus female gut extracts. This study completes our analysis of the initial endopeptidases of the intestinal tick hemoglobinolytic network (11,12).

EXPERIMENTAL PROCEDURES
Tick Tissue Preparation-I. ricinus ticks were collected and fed on laboratory guinea pigs as described previously (4,12). All animals were treated in accordance with the Animal Protection Law of the Czech Republic No. 246/1992 sb., ethics approval number 137/2008. For tissue preparation, guts, salivary glands, and ovaries were dissected from individual partially engorged females (day 6 of feeding). To prepare gut samples, the luminal contents were carefully removed, and the remaining tissue was gently washed free of excess host blood in phosphate-buffered saline (PBS). Samples were further divided into two halves and pooled for either RNA isolation or tissue extraction. Gut tissue extracts were prepared and stored at −80°C as described previously (5). A smaller number (3-4) of dissected tick gut tissues was processed independently for microscopy observations (see below).
Isolation of RNA, Full cDNA Sequencing, and RT-PCR-Total RNA was isolated from tissues of I. ricinus via the NucleoSpin RNA II kit (Macherey-Nagel) and stored at −80°C. First strand cDNA was reverse-transcribed from 0.5 μg of total RNA using the transcriptor high fidelity cDNA synthesis kit (Roche Applied Science) and oligo(dT) primer and stored at −20°C. cDNA fragments of IrCD2 and IrCD3 were PCR-amplified and sequenced using primers designed from Ixodes scapularis genes ISCW003823 and ISCW023880, respectively (genome dataset IscaW1.1). Full-length IrCD2 and IrCD3 cDNA sequences were obtained with gene-specific primers from the partial PCR amplicons via a modified 3′-RACE PCR protocol for the SMART™ cDNA library construction kit (Clontech and Takara) and the 5′-RACE system for rapid amplification of cDNA ends (Invitrogen) as described before (4).
For RT-PCR, the cDNA was diluted 20-fold and used as a template in a ratio of 2 μl per 25 μl of PCR mixture. The following combinations of gene-specific primers were used for RT-PCR profiling of the three cathepsin D mRNAs: IrCD1 forward 5′-GACAGAAGGCGGACAGTACC-3′ and reverse 5′-CGGAAATTGTGAAGGTGACAT-3′; IrCD2 forward 5′-CCGAGATCCTGCACG-3′ and reverse 5′-GCTCACGATGTACTCTCC-3′; and IrCD3 forward 5′-CCTGACGTTTGTGGCTG-3′ and reverse 5′-TCTTGAGGACGTAGTCGC-3′. Dual-labeled UPL probes and specific primers were designed online (Roche Applied Science) and used for quantitative RT-PCR assays. IrCD1 forward 5′-GACAGAAGGCGGACAGTACC-3′ and reverse 5′-CGGAAATTGTGAAGGTGACAT-3′ PCR primers were used in combination with probe 78 (Roche Applied Science). IrCD2 forward 5′-GAGCTGCAAGAGCATCGAC-3′ and reverse 5′-TTCGAGCACGAAGTCCTTG-3′ PCR primers were used in combination with probe 44 (Roche Applied Science). The reaction was carried out in triplicate in a Rotor-Gene RG3000 PCR cycler (Corbett Research) with the following conditions: 95°C for 10 min followed by 40 cycles of 95°C for 15 s and 60°C for 60 s. Data were analyzed and quantified with the Rotor-Gene 6 analysis software. Relative values were standardized to the PCR amplification of the cDNA for elongation factor 1α (ELF1α) (13) and normalized to the sample with the highest level of expression.
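The two-step normalization described above (reference gene first, then the highest-expressing sample) can be sketched as follows. This is only an illustration: it assumes the common 2^-ΔCt approximation with perfect doubling per cycle, which the text does not specify, and the Rotor-Gene software may apply a different efficiency model. The Ct values are invented placeholders, not data from this study.

```python
def relative_expression(ct_target, ct_reference):
    """Target expression relative to the reference gene.

    Assumes 100% amplification efficiency (2 ** -dCt); this is an
    assumption, not necessarily the model used by the Rotor-Gene software.
    """
    return 2.0 ** -(ct_target - ct_reference)


def normalize_to_max(values):
    """Scale a dict of values so that the highest one equals 100%."""
    top = max(values.values())
    return {key: 100.0 * value / top for key, value in values.items()}


# Hypothetical Ct pairs (target gene, ELF1a reference) keyed by feeding day.
hypothetical_ct = {0: (30.0, 20.0), 2: (28.0, 20.0), 6: (25.0, 20.0)}

raw = {day: relative_expression(t, r) for day, (t, r) in hypothetical_ct.items()}
percent = normalize_to_max(raw)
```

With these placeholder values the day 6 sample defines 100%, and each additional cycle of delay relative to the reference halves the reported level.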
Expression, Refolding, and Purification of Recombinant IrCD1-The Escherichia coli bacterial expression system Champion™ pET directional expression kit (Invitrogen) was selected for expression of the IrCD1 zymogen. N-terminal His6-tagged fusion IrCD1 was prepared by PCR amplification of the IrCD1 cDNA without the signal peptide. PCR primers for directional pET cloning were as follows: forward 5′-CACCGCTTTCAGGATCCCGCT-3′ and reverse 5′-GCAGCGGACGAAGTCGGAA-3′. The product was inserted into the pET100/D-TOPO expression vector. The sequence-verified construct was transformed into BL21 Star™ (DE3) E. coli (Invitrogen), and the expression of recombinant protein was performed according to the manual provided with the kit. Inclusion bodies were solubilized in buffered 6 M guanidinium hydrochloride (14), and the recombinant IrCD1 (rIrCD1) was purified with Co2+ chelating chromatography (Hi-Trap™ IMAC FF, GE Healthcare) in the presence of 8 M urea. A linear gradient of 0.01-0.5 M imidazole was used for elution on an FPLC AKTA purifier (GE Healthcare). The purified protein was refolded using the following protocol: L-arginine was added to the sample to 0.4 M final concentration. This solution was successively dialyzed at 4°C against 25 mM Tris/HCl buffer, pH 7.5, 0.15 M NaCl, 1 mM mercaptoethanol containing the following: 1) 4 M urea, 0.4 M L-arginine for 3 h; 2) 2 M urea, 0.4 M L-arginine for 3 h; 3) 0.4 M L-arginine for 3 h; and 4) plain buffer overnight. The refolded rIrCD1 zymogen was purified by FPLC on a Q-Sepharose column (GE Healthcare) equilibrated in 20 mM BisTris, pH 6.5, and eluted using a linear gradient of 0-1 M NaCl. The purified rIrCD1 zymogen was activated in 0.1 M sodium acetate, pH 4.0, for 3 h at 37°C. Activated rIrCD1 was subsequently purified by FPLC on a Mono S column (GE Healthcare) equilibrated in 50 mM sodium formate, pH 3.8, and eluted using a linear gradient of 0-1 M NaCl.
The purification and activation processes were monitored by the FRET activity assay and SDS-PAGE (see below).
Preparation of Antibodies and Indirect Immunofluorescence Microscopy-To obtain IrCD1-specific polyclonal antibodies (Ra×rIrCD1), a rabbit was repeatedly immunized with purified rIrCD1 according to a previously described protocol (15). To increase specificity, the rabbit antibodies were affinity-purified using a previously described protocol (16). Briefly, purified rIrCD1 zymogen was coupled to CNBr-activated Sepharose 4B and packed into a column. The isolated Ra×rIrCD1 Ig fraction was diluted in PBS and purified over this column, washed with PBS, eluted with 0.2 M L-glycine, 0.15 M NaCl, pH 2.2, neutralized with 1 M Tris-base, and stored at −20°C. Reducing SDS-PAGE and Western blot analyses were performed using a previously described protocol (17). To prepare samples for indirect fluorescent microscopy, dissected tissue was fixed and processed using an optimized protocol (16). Briefly, the gut tissue was fixed in a solution of formaldehyde and glutaraldehyde, dehydrated in ascending ethanol dilutions, infiltrated in LR White resin (London Resin Co.), and polymerized in gelatin capsules (Polysciences). Semi-thin sections (0.5 μm) were transferred onto glass slides, blocked with BSA and low fat dry milk, and incubated with Ra×rIrCD1 antibodies. Alexa Fluor 488 dye-conjugated goat anti-rabbit antibody (Invitrogen) diluted 1:500 in PBS/Tween and 4′,6′-diamidino-2-phenylindole (DAPI) (Sigma) counterstain were used for fluorescent labeling. Sections were mounted in 2.5% 1,4-diazabicyclo[2.2.2]octane (Sigma) dissolved in glycerol, examined with the Olympus FW 1000 confocal microscope, and subsequently processed with the Fluoview software (FV10-ASW, Version 1.7).
RNAi-The detailed RNAi protocol was reported previously (18). Briefly, a 281-bp gene-specific DNA fragment of IrCD1 was amplified using primers forward 5′-ATGGGCCCGTTAGCGCCTCAAAATCGG-3′ and reverse 5′-ATTCTAGACTCACGCAAAGCGTTTGACC-3′, containing ApaI and XbaI restriction sites (underlined) for further cloning into pll10 vector with two T7 promoters in reverse orientation (19). The dsRNA synthesis was performed as described previously. IrCD1 dsRNA (0.5 μl; 3 μg/μl) was injected into the hemolymph of female ticks using a micromanipulator (Narishige). The control group was injected with an identical volume of GFP dsRNA synthesized under the same conditions from the linearized plasmid pll6 (19). After 24 h of rest in a humid chamber, ticks (25 females and 25 males) were fed on guinea pigs. Partially engorged females were forcibly removed from the host and weighed, and the guts were dissected. The level of IrCD1 knockdown was checked at the following levels: 1) transcription (qRT-PCR); 2) protein abundance (immunoblotting using Ra×rIrCD1 antibodies); and 3) cathepsin D activity (assays of the gut tissue extract using FRET and labeled hemoglobin substrates; see below).
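As a quick consistency check on the primers above, the ApaI and XbaI recognition sequences (GGGCCC and TCTAGA) can be located in the printed sequences. The sketch below assumes that the hyphens inside the printed primers are line-break artifacts and not part of the sequence; the function name is illustrative only.

```python
# Primer sequences as printed in the text, with internal hyphens removed
# (assumption: those hyphens are typesetting artifacts).
FORWARD = "ATGGGCCCGTTAGCGCCTCAAAATCGG"
REVERSE = "ATTCTAGACTCACGCAAAGCGTTTGACC"

# Recognition sequences of the two restriction enzymes named in the text.
SITES = {"ApaI": "GGGCCC", "XbaI": "TCTAGA"}


def find_sites(primer, sites=SITES):
    """Map each enzyme to the 0-based index of its site in the primer (-1 if absent)."""
    return {name: primer.find(seq) for name, seq in sites.items()}


forward_hits = find_sites(FORWARD)
reverse_hits = find_sites(REVERSE)
```

Under that assumption, the forward primer carries only the ApaI site and the reverse primer only the XbaI site, each preceded by two extra 5′ bases.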
Sequence and in Silico Structural Analysis of IrCDs-The multiple sequence alignment was generated using the Clustal X version 1.83 software (20). Three-dimensional structure models of IrCD1, IrCD2, and IrCD3 were constructed using the Phyre software version 0.2 (21) with the x-ray structure of porcine pepsinogen (Protein Data Bank entry 3PSG (52)) as a template. Three-dimensional structures were visualized using the UCSF Chimera program package (22).
IrCD1 Activity and Inhibition Assays-Cathepsin D activity was measured using the FRET peptide substrate Abz-Lys-Pro-Ala-Glu-Phe-Nph-Arg-Leu (single letter code Abz-KPAEFnFRL; Abz, 2-aminobenzoyl; nF, nitrophenylalanine) in 96-well microplates in a total volume of 100 μl. Recombinant IrCD1 (0.1-0.5 μg) or gut extract (150 μg of proteins) was preincubated for 10 min at 37°C in 150 mM phosphate/citrate buffer, pH 4.0. Hydrolytic activity was continuously measured after addition of substrate (40 μM final concentration) in an Infinite M200 microplate reader (Tecan) at excitation and emission wavelengths set to 330 and 410 nm, respectively. The pH profile of activity was determined as stated above in 150 mM phosphate/citrate buffer, pH 2.5-7.0. Assays were performed in triplicate. The kinetic measurements were performed in the initial linear phase of reaction progress curves and in the linear response region to enzyme concentration. For activity assay in the presence of peptidase inhibitors, an aliquot of rIrCD1 was preincubated for 15 min at 37°C in the CP buffer, pH 4.0, with inhibitors (Table 2). Cathepsin D activity in the tick gut extract was measured in the presence of 10 μM E-64 to prevent undesired hydrolysis by cysteine cathepsins. For the profiling of cathepsin D activity during feeding on the host, measured activities were normalized per one tick gut as described previously (16).
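Restricting the kinetic measurement to the initial linear phase amounts to fitting a straight line to the first part of the progress curve. A minimal sketch follows, assuming an ordinary least-squares fit over a user-chosen number of early points; the text does not state how the linear phase was selected, and the readings below are invented.

```python
def initial_rate(times, signal, n_points):
    """Least-squares slope over the first n_points of a progress curve.

    Units follow the inputs (e.g. fluorescence units per minute).
    """
    t = times[:n_points]
    y = signal[:n_points]
    n = len(t)
    if n < 2:
        raise ValueError("need at least two points to fit a slope")
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    return sxy / sxx


# Invented progress curve: linear for the first four readings, then flattening.
times = [0, 1, 2, 3, 4, 5]
signal = [100, 150, 200, 250, 280, 295]
rate = initial_rate(times, signal, n_points=4)
```

Including the later, flattening readings would lower the fitted slope, which is why only the early points are used here.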
Hemoglobin Degradation, Quantification, and Fragment Identification-Bovine hemoglobin (100 μg/ml) was incubated with rIrCD1 (0.5 μg) in CP buffer, pH 2.5-7.0, in a total volume of 50 μl overnight at room temperature. Hemoglobin digests were separated in 4-12% NuPAGE BisTris gel in NuPAGE MES SDS Running Buffer (Invitrogen) and stained with Coomassie Brilliant Blue. For RP-HPLC analysis, bovine hemoglobin (0.3 mg) was incubated with rIrCD1 (0.5 μg) in 50 mM sodium acetate, pH 4.2, in a total volume of 200 μl for 15 min and 4 h at 35°C. The reaction mixture was treated with 10 μl of 10% trifluoroacetic acid (TFA) and separated by RP-HPLC on a C4 Vydac column equilibrated in 0.1% (v/v) TFA and eluted with a 1%/min gradient of a 99% (v/v) acetonitrile solution in 0.1% (v/v) TFA. The collected peak fractions were analyzed by mass spectrometry. Mass spectra of peptides were measured by Fourier transform/MS using an LTQ Orbitrap XL mass spectrometer (Thermo) operating in high resolution mode (R ~ 10^5). Cleavage sites were searched by the MS nonspecific module of Protein Prospector software (University of California San Francisco) using a mass tolerance of 3 ppm. For quantification of hemoglobin degradation, the tick gut extract was preincubated for 10 min with 10 μM E-64 and 1 μM Aza-N-11a (12) to prevent undesired hydrolysis by cysteine cathepsins and asparaginyl endopeptidase. Hemoglobin (10 μg) was incubated with 5 μl of the gut tissue extract in 25 mM sodium acetate, pH 4.2, in a total volume of 35 μl for 1-4 h at 37°C. Aliquots of the digest were subjected to derivatization with fluorescamine to quantify the newly formed N-terminal ends (23). The fluorescence signal was measured using an Infinite M200 microplate reader at 370 nm excitation and 485 nm emission wavelengths. All measurements were performed in triplicate, and the measured rates were normalized per one tick gut (16).
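The 3 ppm mass tolerance used in the cleavage site search can be made concrete with a short calculation. The sketch below uses the standard definition of a relative mass error in parts per million; the function names and the masses in the usage lines are illustrative and are not taken from Protein Prospector or from the data.

```python
def ppm_error(observed_mass, theoretical_mass):
    """Relative mass error in parts per million."""
    return (observed_mass - theoretical_mass) / theoretical_mass * 1e6


def within_tolerance(observed_mass, theoretical_mass, tol_ppm=3.0):
    """True if the observed mass matches the theoretical mass within tol_ppm."""
    return abs(ppm_error(observed_mass, theoretical_mass)) <= tol_ppm


# Invented example: a 1000 Da peptide measured 2 mDa and 4 mDa too heavy.
accepted = within_tolerance(1000.002, 1000.0)   # about 2 ppm
rejected = within_tolerance(1000.004, 1000.0)   # about 4 ppm
```

At 1000 Da, a 3 ppm window therefore corresponds to an absolute error of about 3 mDa.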
Active Site Labeling of IrCD-Active site labeling of gut extracts and rIrCD1 was performed with the cathepsin D-specific fluorescently tagged probe in a total volume of 100 μl. The selected probe FAP-09 (24) has a binding core of the reversible active site ligand Val-Val-Sta-Ala-Leu-Gly containing a statine (Sta) residue. An aliquot of the gut tissue extract (100 μg of protein) or purified rIrCD1 (0.5 μg of protein) was incubated (20 min at 26°C) with 0.5 μM FAP-09 in 50 mM sodium acetate, pH 4.0. The competitive labeling was performed after preincubation (15 min at 26°C) with 2 μM pepstatin A. The reaction mixture was irradiated in an open tube for 10 min on ice with a 125-watt high pressure mercury vapor lamp (at a distance of 20 cm) to allow for photoactivated cross-linking. The reaction mixture was then precipitated with 4 volumes of acetone, boiled in reducing SDS sample buffer, and separated by SDS-PAGE (15% gel), and the labeled peptidases were visualized directly in the gel using a Typhoon 9400 fluorescence imager (GE Healthcare) with 532 nm excitation (green laser) and the 580 nm emission filter.
Substrate Specificity Profiling of rIrCD1-A highly diversified peptide library consisting of 124 synthetic tetradecapeptides was synthesized using Fmoc (N-(9-fluorenyl)methoxycarbonyl) chemistry. Each peptide was purified by HPLC. All peptides had unmodified termini and consisted of natural amino acids except methionine and cysteine. Norleucine was included as a substitute for methionine. The peptides were mixed into equimolar pools consisting of 52, 52, and 20 peptides and diluted to 1 μM in 50 mM ammonium acetate, pH 4.0. An equal volume of 100 nM IrCD1 in the same buffer was added to the peptide pools such that the final concentration of each peptide was 0.5 μM. An enzyme-free assay was set up as a control. The assay was incubated at room temperature, and aliquots were removed after 5, 10, 15, 30, 60, 120, 240, and 1200 min. All reactions were quenched by the addition of pepstatin to a final concentration of 0.5 μM, evaporated to dryness, and reconstituted to the original volume in 0.1% formic acid. 10 μl of each time point was injected onto a 150 × 0.3 mm Magic C18AQ column (Michrom Bioresources) connected to a Thermo Finnigan LTQ ion trap mass spectrometer equipped with a standard electrospray ionization source. Peak lists were generated from the raw files using PAVA software (University of California, San Francisco) and searched against a database consisting of all 124 peptides using Protein Prospector. For estimation of false discovery rate, four different decoy databases containing the randomized sequences of the same 124 entries were concatenated to the original 124 entries to create a final database of 620 sequences. Protein Prospector score thresholds were set to a minimum protein score of 15, minimum peptide score of 10, and maximum expectation value of 0.1 for protein and 0.1 for peptide matches, and resulted in a peptide false discovery rate of 0.17%.
Newly formed IrCD1 cleavage products were identified by comparison with a control assay consisting of peptides and buffer. Four residues at either side of the cleaved bond (P4-P4′) were included in the frequency analysis, and heatmaps and cleavage signatures were made using iceLogo (25). All possible cleavage sites within the peptide library (n = 1612) served as the negative data set, and only amino acids that differ significantly (p < 0.05) from the negative dataset are highlighted in the cleavage signature.
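The size of the negative data set follows from the library design: each tetradecapeptide has 13 peptide bonds, so 124 peptides give 124 × 13 = 1612 possible cleavage sites. The sketch below enumerates the P4-P4′ window around every bond of one peptide. The example sequence, the padding character for positions beyond a terminus, and the function name are illustrative assumptions and are not part of the iceLogo workflow.

```python
def cleavage_windows(peptide, flank=4, pad="-"):
    """Return (P-side, P'-side) residue strings for every peptide bond.

    Positions that would run past a terminus are filled with `pad`.
    """
    windows = []
    for bond in range(1, len(peptide)):  # bond lies between index bond-1 and bond
        left = peptide[max(0, bond - flank):bond].rjust(flank, pad)
        right = peptide[bond:bond + flank].ljust(flank, pad)
        windows.append((left, right))
    return windows


# Hypothetical 14-residue peptide; not a sequence from the actual library.
example = "AFLHKIYWGEQRSD"
windows = cleavage_windows(example)

# 124 library peptides, each contributing the same number of bonds.
library_sites = 124 * len(windows)
```

The first and last windows show why terminal bonds are special cases: three of their four P or P′ positions are empty.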

RESULTS
Three different cathepsin D enzymes are expressed by I. scapularis and I. ricinus ticks. Data mining of the latest I. scapularis genome dataset (IscaW1.1, December 2008) identified three cathepsin D paralogs as follows: ISCW013185, ISCW003823, and ISCW023880 tagged as I. scapularis cathepsin D1 (IsCD1), D2 (IsCD2), and D3 (IsCD3), respectively. IsCD1 is an ortholog of previously identified I. ricinus cathepsin D (IrCD1; GenBank™ EF428204) (4). A set of PCR primers was designed to clone cDNA fragments of the I. ricinus homologs of IsCD2 and IsCD3. The newly identified IrCD2 and IrCD3 cDNAs were fully sequenced via 5′- and 3′-RACE PCR. Basic parameters of IrCD zymogens, including GenBank™ accession numbers, mRNA length, calculated molecular weight, and theoretical pI are shown in Table 1.
Comparison of Three Identified IrCD Zymogens Reveals Modifications in the Propeptides-The full Clustal X amino acid sequence alignment of the three IrCD zymogens, two other tick hemoglobinolytic cathepsin D precursors longepsin (26) and BmAP (27), extracellular porcine pepsinogen, and lysosomal human cathepsin D can be found in the supplemental File 1. A graphical schematic overview (excluding BmAP and longepsin) demonstrating basic organization of the primary structures is shown in Fig. 1A. All aligned IrCDs are synthesized as prepropeptides, with predicted signal peptides for targeting to the endoplasmic reticulum. All three IrCD enzyme core structures are related to other cathepsin D-like enzymes (28) and differ mostly in the propeptide region consisting of the conserved part A and variable part B. Processing of the zymogens most likely involves the removal of propeptide parts that have roles in protein folding, stability, inhibition of the active site, pH dependence of activation, and intracellular sorting (29). All three mature IrCDs have two catalytic residues, Asp33 and Asp231, in the conserved motif Asp-Thr-Gly (numbering after mature hCD, see supplemental File 1) and do not incorporate the processing loop of mammalian lysosomal cathepsin Ds. The sequence similarity matrix (supplemental File 1) reveals that IrCD1 is 50% identical to IrCD2 and 49% to IrCD3, whereas IrCD2 and IrCD3 share about 58% identity. The gut-associated tick cathepsins, BmAP from R. (B.) microplus (27) and longepsin from Haemaphysalis longicornis (26), are 54-58% identical to IrCD1 and IrCD3 and 74% identical to IrCD2. Maximum parsimony analysis was performed using available cathepsin D sequences from ticks aligned with 53 cathepsin D-like molecules from various animal groups. Human pepsinogen was used as outgroup (supplemental File 2). The analysis resulted in 21 equally parsimonious trees. Cathepsins D from ticks created a monophyletic group within arthropods.
The results demonstrate that longepsin and BmAP are orthologous to IrCD2, whereas IrCD1 is the most diverse I. ricinus cathepsin D. Surprisingly, orthologs of evolutionarily distant ovarian R. microplus yolk cathepsin (30) and the heme-binding aspartic peptidase (tick heme-binding aspartic proteinase) (23) are apparently missing in the I. scapularis genome. Spatial homology models of IrCD1, IrCD2, and IrCD3 zymogens were constructed to compare the structural features of IrCD isoforms (Fig. 1B). The x-ray structure of porcine pepsinogen (Protein Data Bank code 3PSG) was used as a template. All tick cathepsins D have a conserved bilobal structure with two catalytic aspartic acid residues (red, Fig. 1) on each side of the active site cleft (31). All IrCDs contain two loops in the proximity of the active site cleft for substrate binding. These are designated as "flap region" and the so-called "polyproline loop" (pink and blue in Fig. 1, respectively) in accordance with nomenclature of mammalian aspartic peptidases (28). However, the polyproline loop of IrCD1 is rearranged; the conserved Gly308 and Asp310 are mutated to serine and glutamic acid, respectively, and the loop contains a three-amino acid deletion at Ser315-Pro317 (supplemental File 1), which is also present in the I. scapularis analog IsCD1 (data not shown).
FIGURE 1. A, depicted residues are shown in single letter amino acid coding: two catalytic aspartic acid residues (red D); predicted N-glycosylations (yellow full circles, N); phosphorylation determinant lysines in the K203 region and in the β-loop region (violet diamond labels, K). (Note: the full Clustal X alignment, including the two tick aspartic hemoglobinases longepsin and BmAP, can be found in supplemental File 1.) B, tertiary structures of IrCD1, IrCD2, and IrCD3 zymogens (pro-IrCD1-3) modeled using the x-ray structure of porcine pepsinogen (Protein Data Bank entry 3PSG). Models created with Phyre software version 0.2 (21) were visualized with UCSF Chimera (22). Pro-cathepsins D (ribbons) are shown with two catalytic aspartic residues (red sticks). The propeptide is divided into a conservative part (blue-green) and a variable part (green) that is reduced in the structure of IrCD1 and IrCD2 compared with IrCD3 and pPD. The Y flap region and the polyproline loop in the vicinity of the substrate binding pocket are labeled pink and blue, respectively; predicted sites of N-glycosylations (yellow sticks) differ in the three IrCD isoforms.
Predicted N-glycosylation sites at Asn73 and Asn273 in the IrCD1 proenzyme are positionally different from the three predicted N-glycosylation sites in the IrCD2 (Asn70, Asn172, and Asn194) and the IrCD3 (Asn70 and Asn172) proenzymes, respectively (Fig. 1). The positions of lysine residues in the "K203" and "β-loop" regions (Fig. 1A) play a critical role in the phosphotransferase recognition patch (32). IrCD1 appears analogous to lysosomal human cathepsin D by possessing conserved Lys203, Lys267, and Lys293. These residues are important for lysosomal targeting of mammalian cathepsin D via the mannose 6-phosphate pathway (33).

IrCD1 Is Solely Expressed in the Gut, Up-regulated by Feeding, and Localized in Digestive Cell Vesicles-RT-PCR profiling of IrCDs demonstrated that IrCD1 mRNA is restricted to the gut tissue of partially engorged female ticks. IrCD2 is expressed in guts and salivary glands, whereas IrCD3 mRNA is mostly produced in the ovaries (Fig. 2A). Thus, IrCD3 was further excluded from studies on hemoglobinolysis. The dynamics of expression of IrCD1 and IrCD2 mRNAs in the gut tissue following feeding were analyzed by qRT-PCR (Fig. 2B). IrCD1 mRNA peaks in partially engorged females (day 6 of feeding), whereas IrCD2 mRNA increases in fully fed and detached females (day 8 of feeding). IrCD1 protein abundance in the gut tissue during feeding was monitored by Western blotting using Ra×rIrCD1. The IrCD1 protein signal is detectable in partially and fully fed females (Fig. 3A). Activities monitored with the Abz-KPAEFnFRL substrate in gut extracts prepared from females at different time points (days) of feeding show a rapid increase peaking in fully fed ticks (Fig. 3A). Indirect immunofluorescence microscopy with Ra×rIrCD1 and Alexa Fluor 488 dye-conjugated secondary antibodies localized IrCD1 in the vesicles of digestive gut cells at day 6 of feeding (Fig. 3B).
FIGURE 2. A, two-step RT-PCR was performed with IrCD1, IrCD2, and IrCD3 gene-specific primers and first strand cDNA templates prepared from total RNA isolated from guts, salivary glands (gl), and ovaries. The identity of resulting PCR products was confirmed by DNA sequencing. Ferritin 1 primers (15) were used as template loading control. IrCD1 is the only cathepsin D with gut tissue restricted expression. B, dynamics of IrCD1 and IrCD2 expression in female guts during feeding. Levels of IrCD1 and IrCD2 mRNAs were determined by qRT-PCR using dual labeled UPL probes 78 and 44 (Roche Applied Science), gene-specific primers, and gut cDNA templates from days 0, 2, 4, 6, and 8 of female feeding. Reactions were done in triplicate, and standard deviations are depicted. The levels of mRNA are normalized to the sample with maximum mRNA level (set to 100%). The relative level of IrCD1 mRNA peaks on the 6th day before rapid engorgement, which is conditioned by fertilization, whereas the relative level of IrCD2 mRNA peaks in fully fed fertilized females after rapid engorgement.
rIrCD1 Activates Autocatalytically and Displays Cathepsin D-like Substrate/Inhibitor Specificity-IrCD1 zymogen was expressed in E. coli (rIrCD1) and isolated with Co2+-chelating chromatography under denaturing conditions, renatured, and further purified by ion-exchange FPLC. The correctly folded pro-rIrCD1 was autocatalytically activated at pH 4.0. The active enzyme efficiently hydrolyzed the cathepsin D-specific FRET substrate Abz-KPAEFnFRL with the quencher 4-nitrophenylalanine group. Maximal activity was detected after 2-3 h of activation at pH 4.0 (Fig. 4A). Processing of the zymogen was followed by SDS-PAGE to demonstrate that activation was accompanied by autocatalytic proteolytic processing (Fig. 4C). The pro-IrCD1 band (47 kDa) was completely converted to a 40-kDa band. This corresponds to the theoretical molecular mass of mature IrCD1 predicted from the amino acid sequence. N-terminal amino acid sequencing of the 40-kDa band identified a single sequence, Ile-His-Glu-Gly-Pro-Tyr, that was generated by the cleavage between Lys22 and Ile23 (pro-IrCD1 numbering). This autoactivation cleavage site is located three amino acids upstream of the homologous mature human cathepsin D sequence (supplemental File 1). No other processing intermediates or inter-chain processing products were observed.
The pH activity profile of rIrCD1 was determined using the synthetic FRET peptide substrate Abz-KPAEFnFRL and bovine hemoglobin (Fig. 4B). Both substrates were effectively cleaved at pH 2.5-5.0 with optimal activity at pH 4.0.
Inhibitory specificity of rIrCD1 was determined using a panel of selective peptidase inhibitors (Table 2). rIrCD1 activity was completely inhibited by pepstatin A (34) and potato cathepsin D inhibitor (35), both of which specifically inactivate cathepsin D-like peptidases. Partial inhibition was observed with lopinavir (36), which targets the aspartic peptidases of the retropepsin family. rIrCD1 activity was unaffected by Pefabloc (37), leupeptin (38), E-64 (39), and EDTA (40), which inhibit peptidases of serine, cysteine, and metallopeptidase classes, respectively. This inhibition profile confirms that IrCD1 has ligand binding characteristics similar to mammalian cathepsin D.
Recombinant IrCD1 and native IrCD in the tick gut tissue extract were visualized using the specific activity-based probe, FAP-09, that binds to the cathepsin D active site (Fig. 4D) (24). The labeled enzymes migrate on the SDS-PAGE as single bands of 40 and 45 kDa, respectively. The lower molecular weight of the rIrCD1 band can be explained by the absence of N-glycosylations at two predicted sites (Asn73 and Asn273, Fig. 1A; hCD numbering) in the E. coli expressed zymogen. Labeling with FAP-09 was quenched when the active site had been pre-occupied by pepstatin A as a specificity control.
FIGURE 3. IrCD activity, IrCD1 expressional profiling, and localization with specific antibodies. A, Abz-KPAEFnFRL measured authentic cathepsin D activity in female I. ricinus gut extracts during feeding. Time line depicts feeding phases as follows: attachment, slow feeding period, rapid engorgement (eng), and detachment. Activity increases from day 4 of feeding, peaks in fully fed females, and then slowly decreases. Bottom panel, Western blot immunodetection of authentic IrCD1 with Ra×rIrCD1 in gut extracts on different days of feeding; the signal rises by day 6 and peaks on day 8. B, indirect immunofluorescence microscopy of a semi-thin section labeled with affinity-purified Ra×rIrCD1. Goat anti-rabbit IgG conjugated with Alexa Fluor® 488 was used as the secondary antibody. Note that the signal is localized to intracellular vesicles of digestive cells. Nuclei of cells were counterstained with DAPI. BC, basal epithelial cells; HC, hemoglobin crystal in the gut lumen; DC, digestive gut cells.
IrCD1 Has a Preference for Hydrophobic Residues at the P1 and P1′ Substrate Positions-A novel set of short peptidyl substrates and macromolecular hemoglobin were used to determine rIrCD1 cleavage site specificity. rIrCD1 was incubated with an equimolar mixture of 124 synthetic tetradecapeptides that were designed to have equal representation of all amino acids. Samples of the assay were removed at multiple time points between 5 and 1200 min and subjected to LC-MS/MS sequencing. The total number of cleavage sites identified was 202; however, 97 of these were observed as early as 15 min (supplemental File 3). A heat map illustrating the frequency of residues found in the P4 to P4′ sites after 15 min of incubation was generated. These data indicated that rIrCD1 has a strong preference for hydrophobic residues in both the P1 and P1′ positions, whereas the other positions are much less selective. The S1 position appears to be the major determinant of substrate specificity, with a preference for Phe > Tyr > Leu/Trp/norleucine but not Ala, Val, and Ile. The S1′ pocket is less selective and has an equal preference for all hydrophobic residues except Pro and Leu. Outside the S1-S1′ regions, rIrCD1 tolerates most amino acids but displays a preference for tyrosine at P3 and alanine at P2′. Further analysis of the cleavage site location within the peptide substrate indicated that rIrCD1 is a true endopeptidase, with 91% of these cleavages occurring when the S3 to S3′ subsites were occupied and 100% when S2 to S2′ were occupied (Fig. 5A). To investigate whether a phenylalanine residue in the P1 position was sufficient for cleavage to occur, a list of all tetrapeptides present within the library with phenylalanine in the second position was generated (Fig. 5B). This represents all possible P2 to P2′ residues with Phe in the P1 site. Cleavage sites were characterized by the time at which peptide products were first observed. Although a Phe in the P1 site is the major specificity element for rIrCD1, not all Phe-X bonds are cleaved. In general, cleavage of Phe-X bonds occurred readily when a hydrophobic residue was present in the P1′ position. In the substrates containing the AFnH, NFnA, and SFIE sites (single letter amino acid codes according to Fig. 5), cleavage does not occur after Phe because each tetrapeptide sequence is situated at a terminus of the tetradecapeptide substrate such that no residues are present in either the P3/P4 (NFnA and SFIE) or P3′/P4′ (AFnH) positions. Furthermore, cleavage occurs slowly or not at all in substrates where the P1′ residue is not optimal. When cleavage occurred at sites with a nonoptimal P1′ residue, such as IF↓EI, the residues at P2 and P2′ were often preferred ones.
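The positional bookkeeping behind this analysis can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the function names, the convention of identifying a cleavage by the index of its P1 residue, and the example 14-mer (which is hypothetical and not taken from the library) are all introduced here for illustration only.

```python
from collections import Counter


def subsite_window(peptide, p1_index, flank=4):
    """Map a cleavage after peptide[p1_index] to residues at P{flank}..P{flank}'.

    Subsites that fall beyond either terminus are reported as None (unoccupied).
    """
    window = {}
    for k in range(1, flank + 1):
        i = p1_index - (k - 1)  # P1 is the residue just before the scissile bond
        j = p1_index + k        # P1' is the residue just after the scissile bond
        window[f"P{k}"] = peptide[i] if i >= 0 else None
        window[f"P{k}'"] = peptide[j] if j < len(peptide) else None
    return window


def occupied(window, depth):
    """True when every subsite from P{depth} to P{depth}' holds a residue."""
    return all(
        window[f"P{k}"] is not None and window[f"P{k}'"] is not None
        for k in range(1, depth + 1)
    )


def position_frequencies(windows):
    """Tally residues per subsite across many cleavage windows (heat-map input)."""
    freqs = {}
    for w in windows:
        for site, residue in w.items():
            if residue is not None:
                freqs.setdefault(site, Counter())[residue] += 1
    return freqs


def phe_tetrapeptides(peptide):
    """List all tetrapeptides (P2-P1-P1'-P2') with Phe ('F') in the second position."""
    return [peptide[i:i + 4] for i in range(len(peptide) - 3) if peptide[i + 1] == "F"]


# Hypothetical tetradecapeptide; 'n' stands for norleucine as in Fig. 5.
example = "GAKLFYWRnDEHST"
internal = subsite_window(example, 4)  # cleavage after Phe, all subsites occupied
terminal = subsite_window(example, 1)  # cleavage near the N terminus, P3/P4 empty
print(occupied(internal, 3), occupied(terminal, 3))
print(phe_tetrapeptides(example))
```

Counting how many observed cleavages satisfy `occupied(window, 3)` versus `occupied(window, 2)` is the kind of tally that underlies the endopeptidase argument, and `position_frequencies` yields the per-subsite counts that a heat map would display.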
Hemoglobin digested by rIrCD1 was resolved by RP-HPLC, and peptide fragments were characterized by mass spectrometry. The cleavage sites identified in the α- and β-subunits of hemoglobin are indicated in Fig. 5C. In general, these cleavage sites contain hydrophobic residues in the P1 and P1′ positions, with a preference for Phe and Leu in P1. After 15 min of incubation with rIrCD1, four cleavage sites were identified: Leu29-Glu30, Phe33-Leu34, and Leu109-Ala110 in the α-subunit and Phe40-Phe41 in the β-subunit. These initial hemoglobin cleavage sites of rIrCD1 are identical to those reported previously for the gut tissue extract cathepsin D activity (5). The overall identity of cleavage sites found for rIrCD1 and the authentic enzyme is 62%.
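As a small worked check of the stated P1 preference, the four initial cleavage sites listed above can be parsed and tallied by their P1 residue. The snippet below is only an illustrative sketch: the site strings are those given in the text, whereas the helper names and the string format are assumptions made here.

```python
import re
from collections import Counter

# Initial (15 min) rIrCD1 cleavage sites in hemoglobin, as listed in the text.
INITIAL_SITES = {
    "alpha": ["Leu29-Glu30", "Phe33-Leu34", "Leu109-Ala110"],
    "beta": ["Phe40-Phe41"],
}

# Three-letter residue code followed by its sequence position, twice.
SITE_RE = re.compile(r"^([A-Z][a-z]{2})(\d+)-([A-Z][a-z]{2})(\d+)$")


def parse_site(site):
    """Split 'Leu29-Glu30' into (P1 residue, P1 position, P1' residue, P1' position)."""
    match = SITE_RE.match(site)
    if match is None:
        raise ValueError(f"unrecognized site: {site!r}")
    p1, n1 = match.group(1), int(match.group(2))
    p1_prime, n2 = match.group(3), int(match.group(4))
    if n2 != n1 + 1:
        raise ValueError(f"residues are not adjacent: {site!r}")
    return p1, n1, p1_prime, n2


def p1_tally(sites_by_chain):
    """Count how often each residue occupies P1 across all listed sites."""
    return Counter(
        parse_site(site)[0]
        for sites in sites_by_chain.values()
        for site in sites
    )


print(p1_tally(INITIAL_SITES))
```

All four initial sites carry Phe or Leu at P1 (two each), which matches the preference described for the full cleavage map.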
RNAi Confirms the Function of IrCD1 as an Intestinal Aspartic Hemoglobinase-The major contribution of IrCD1 to the overall gut-associated cathepsin D activity and the specific role of this enzyme in the hemoglobinolytic digestive machinery were validated using gene-specific RNAi. The IrCD1 transcript, as well as protein synthesis and specific cathepsin D activity, was dramatically reduced (Fig. 6) compared with GFP-dsRNA-treated control ticks. Gene specificity of RNAi was verified by qRT-PCR. The level of IrCD1 mRNA in the IrCD1-dsRNA-treated tick group was reduced to 16% of that in the GFP control group, whereas the level of IrCD2 mRNA remained at ~90% of the GFP control (not significant, Fig. 6A). In kinetic assays with Abz-KPAEFnFRL and labeled hemoglobin substrates, the overall gut extract cathepsin D activity in the IrCD1-dsRNA-treated tick group was reduced to 20 and 10% of the control, respectively (Fig. 6C). Phenotype markers (mortality, weight, oviposition, and larval hatching) (41) displayed no statistically significant changes in IrCD1-dsRNA-treated ticks (data not shown).
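Because the knockdown is reported here as the level remaining relative to the GFP control, whereas elsewhere it is quoted as a percentage decrease, the conversion between the two is worth making explicit. The following is a trivial sketch; the function name and the dictionary labels are hypothetical, and the input values are the ones stated in the paragraph above.

```python
def percent_reduction(remaining_percent):
    """Convert a level given as % of control into the % decrease relative to control."""
    if not 0 <= remaining_percent <= 100:
        raise ValueError("expected a value between 0 and 100")
    return 100 - remaining_percent


# Remaining levels (% of GFP control) as stated in the text.
remaining = {
    "IrCD1 mRNA": 16,
    "activity, Abz-KPAEFnFRL": 20,
    "activity, hemoglobin": 10,
}

for label, value in remaining.items():
    print(f"{label}: {value}% remaining, {percent_reduction(value)}% decrease")
```

Read this way, a residual activity of 20 and 10% corresponds to the decreases of about 80 and 90% quoted for the two substrates.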

DISCUSSION
Since the 1970s, none of the digestive tick cathepsin D reports (23,26,42,43) has reflected the functional diversity of multiple tick cathepsin D paralogs and the coordinated hemoglobinolytic action of cysteine and aspartic peptidases (4,5,44). Therefore, identification and full cloning of all I. ricinus cathepsin D genes was the primary concern of this study. Despite the high relative diversity among primary structures, phylogenetic analyses confirmed that the three IrCD forms most likely evolved by gene duplication and ongoing mutation within the Acari group (supplemental File 2). We hypothesize that the structural modifications of IrCD1 arose with the adaptation of ticks/mites to a blood-feeding lifestyle.
FIGURE 5. Substrate specificity profiling of rIrCD1: screening of a novel highly diversified peptide library consisting of 124 synthetic tetradecapeptides, and the hemoglobin rIrCD1 cleavage map. A, a heat map and iceLogo reflecting the frequency of residues in the P4 to P4′ positions of 97 unique rIrCD1 cleavage sites. Each amino acid is represented by the single letter code, and n corresponds to norleucine. In the heat map, red boxes identify amino acids that are present at a frequency less than that of the entire library (negative selection), and green boxes identify residues that are enriched (positive selection). Cleavage of substrates occurred only when the P2 to P2′ sites were occupied, whereas 3 and 6% of substrates were cleaved when no amino acid was present in the P3 and P3′ sites, respectively. The same dataset is represented as an iceLogo (25) where only residues that are significantly different from background (p < 0.05) are illustrated. B, qualitative assessment of rIrCD cleavage products observed between 5 and 1200 min of incubation with the peptide library. All tetrapeptide sequences within the library that possess Phe in the second position (corresponding to P1) are listed for comparison. C, bovine hemoglobin was digested in vitro with rIrCD1 at pH 4.0. The fragments were identified by mass spectrometry, and the corresponding cleavage sites are indicated in the hemoglobin sequence with black triangles. The initial cleavage sites (after 15 min of reaction) are marked with asterisks. Numbering of the α-subunit residues excludes the first mRNA-translated methionine of the hemoglobin precursor.
Our results clearly identify IrCD1 as the only I. ricinus cathepsin D exclusively expressed in the gut. The hemoglobinolytic role of IrCD1 was previously indicated by missing RT-PCR signals in I. ricinus life stages not feeding on hosts (4). Dynamic up-regulation of IrCD1 mRNA was found to be consistent with the rapidly increasing intestinal aspartic peptidase activity during several days of tick feeding (16). The presence of IrCD1 in female guts was also previously noted by mass spectrometry (5).
The uptake of blood components by tick gut cells (45) appears to follow endocytic mechanisms of fluid phase endocytosis and receptor-mediated endocytosis similar to those described for mammalian cells (46). Immunolocalization of IrCD1 shows a distribution throughout vesicles, most likely lysosomes and endosomes of gut cells. This localization is analogous to that of the previously characterized tick hemoglobinolytic cysteine peptidases IrAE1 (12), IrCB1 (16), IrCL1 (11), and the cathepsin L from R./B. microplus (47). We propose that IrCD1 is targeted to endolysosomes via the mannose 6-phosphate pathway because an identical recognition patch (Lys203, Lys267, and Lys293; Fig. 1A) for the lysosomal N-acetylglucosamine-1-phosphotransferase (UDP-GlcNAc) is present in IrCD1 and in human lysosomal cathepsin D (33). In addition to the difference in the post-translational modification patterns of IrCD1 and the BmAP/longepsin analog IrCD2 (supplemental File 1), the putative role of IrCD2 in secretory or extracellular processes is supported by IrCD2 expression in both salivary glands and guts, where the expression peaks in fully fed females (Fig. 2B).
Our current tick hemoglobinolytic enzyme network model includes the primary hemoglobinolytic endopeptidases IrCD1, IrCL1, and IrAE1 (10). All three peptidases prepared as recombinant proteins are capable of autocatalytic activation in acidic conditions, indicating that the process occurs upon acidification of gut cell endosomes (for review see Ref. 48). Acidification of the large hemoglobin-containing vesicles was demonstrated by acridine orange staining of R. microplus gut cells by Lara et al. (45). Hydrolysis is also facilitated by the spontaneous denaturation of hemoglobin below pH 4.5 (49).
FIGURE 6. RNAi knockdown of IrCD1. I. ricinus females (25 per group) were injected with IrCD1 dsRNA (iIrCD1, minus, experimental group) or GFP dsRNA (GFP, plus, control group). Gut tissue of partially engorged ticks was used for RNA/DNA isolation and preparation of tissue homogenates. A, effect on mRNA expression levels. Dual-labeled UPL probe 78 (Roche Applied Science) and IrCD1-specific primers were used for qRT-PCR analysis of the IrCD1 RNAi effect between the iIrCD1 and GFP tick groups. The relative level of IrCD1 mRNA is decreased to 16% of that in the GFP control group. To assess gene specificity of the IrCD1 knockdown, qRT-PCR with UPL probe 44 (Roche Applied Science) and IrCD2-specific primers was performed. The IrCD2 mRNA level showed no significant decrease between the two tick groups. All PCRs were performed in triplicate. Relative values are depicted with standard deviations. B, effect on protein abundance. SDS-PAGE-separated gut extracts of iIrCD1 and GFP ticks were electrotransferred to a PVDF membrane (Coomassie Blue-stained lanes serve as loading control). The presence of IrCD1 was determined by Western blot with the antibody Ra×rIrCD1. No signal is seen in the iIrCD1 tick group lane compared with the ~40-kDa rIrCD1 signal in the GFP control group lane. C, effect on gut extract cathepsin D activity. Kinetic assays measured with Abz-KPAEFnFRL and hemoglobin displayed ~80% and >90% decreases of activity, respectively, between the GFP and iIrCD1 tick groups. Measurements were performed in triplicate. Relative values are depicted with standard deviations.
LC-MS characterization of peptide libraries was introduced to resolve secondary preferences at the P3, P2, P2′, and P3′ subsites (50) as a more sophisticated alternative to internally quenched fluorescent peptide libraries (51). Our tetradecapeptidyl library approach revealed that rIrCD1 is a true endopeptidase, with most cleavages occurring when the P3 to P3′ positions are occupied. The S1 and S1′ pockets define the primary specificity, and a distinct preference for hydrophobic residues at each site is evident. Importantly, hemoglobin cleavage sites of rIrCD1 and of the native I. ricinus gut cathepsin D show an identical preference for Phe and Leu at P1 (5).
IrCD1 RNAi experiments decreased the overall gut aspartic peptidase activity by >90%, demonstrating the major contribution of IrCD1 to the cathepsin D activity of the multienzyme hemoglobinolytic network (5). The decrease excludes hemoglobinolytic activity of papain-like enzymes because of the presence of 10 μM E-64 in the assay. We propose that the remaining 10% of activity in IrCD1 knockdown ticks (Fig. 6C) is not in itself sufficient to prevent the appearance of obvious RNAi phenotypes with respect to tick weight, mortality, oviposition, and larval hatching. The missing RNAi phenotype may be better explained by the synchronous operation of IrCD1 and the other initial hemoglobinase, IrCL1, which is also apparent from hemoglobinolytic assays performed with gut extracts (5). The redundancy between IrCD1 and IrCL1 was also indicated by the only changed phenotype marker (average tick weight decreased by 24%) in the IrCL1 RNAi knockdown ticks (11). Recently, Cruz et al. (27) confirmed in B. (R.) microplus that BmAP and BmCL are together responsible for the generation of hemoglobin-derived antimicrobial peptides. Because of dsRNA dose and volume limits for each tick female, our current method has not yet allowed us to obtain multiple endopeptidase RNAi knockdowns to confirm the cathepsin L/D redundancy (data not shown).
To conclude, this report identifies and characterizes IrCD1 as the specific gut-associated aspartic peptidase contributing to the hemoglobinolytic peptidase network operating in the digestive epithelium of partially engorged I. ricinus females. Together with previous reports characterizing IrAE1 (12) and IrCL1 (11), this study completes the biochemical and functional characterization of the primary individual endopeptidases of the network. Although IrCD1 plays a greater role in the initial processing of hemoglobin (5,10), all three endopeptidases appear to have synergistic roles similar to those described for the intestinal proteolytic network of Schistosoma mansoni (7). However, to confirm potential synergies and possibly yield phenotypes, combinatorial RNAi and chemical inhibition of multiple protease targets will be required; these approaches may also help prioritize individual Ixodid proteases as targets for anti-tick interventions.