Enzymic, Phylogenetic, and Structural Characterization of the Unusual Papain-like Protease Domain of Plasmodium falciparum SERA5*

Serine repeat antigen 5 (SERA5) is an abundant antigen of the human malaria parasite Plasmodium falciparum and is the most strongly expressed member of the nine-gene SERA family. It appears to be essential for the maintenance of the erythrocytic cycle, unlike a number of other members of this family, and has been implicated in parasite egress and/or erythrocyte invasion. All SERA proteins possess a central domain that has homology to papain except in the case of SERA5 (and some other SERAs), where the active site cysteine has been replaced with a serine. To investigate whether this domain retains catalytic activity, we expressed, purified, and refolded a recombinant form of the SERA5 enzyme domain. This protein possessed chymotrypsin-like proteolytic activity as it processed substrates downstream of aromatic residues, and its activity was reduced by the serine protease inhibitor 3,4-dichloroisocoumarin. Although all Plasmodium SERA enzyme domain sequences share considerable homology, phylogenetic studies revealed two distinct clusters across the genus, separated according to whether they possess an active site serine or cysteine. All Plasmodia appear to have at least one member of each group. Consistent with separate biological roles for members of these two clusters, molecular modeling studies revealed that SERA5 and SERA6 enzyme domains have dramatically different surface properties, although both have a characteristic papain-like fold, catalytic cleft, and an appropriately positioned catalytic triad. This study provides impetus for the examination of SERA5 as a target for antimalarial drug design.

Malaria inflicts serious health and economic burdens on many countries throughout the world. Plasmodium falciparum is responsible for the most acute form of the disease, causing high morbidity and death in more than one million children each year. An effective vaccine and new drugs are urgently required. One molecule that has the potential to serve as both is the highly expressed blood-stage protein known as the serine repeat antigen (SERA) (1-3).
Recently, it has become clear that SERA, and another well known P. falciparum SERA paralogue known as SERPH (4), belong to a large gene family that includes a total of nine members in P. falciparum and four in Plasmodium yoelii, as well as numerous others in other Plasmodium genomes that have not yet been completely sequenced (5-8). SERA and SERPH are the fifth and sixth genes in an eight-gene cluster on chromosome 2 and are now known as P. falciparum SERA5 and SERA6, respectively (9,10). Although mRNA has been detected for all P. falciparum SERAs in mature blood-stage forms, not all are expressed equally, with mRNA levels higher for SERA3-6 and for SERA9, a separate SERA-like gene located on chromosome 9. Moreover, protein expression in blood stages appears to be restricted to those more strongly expressed at the mRNA level in all parasite lines examined to date (9,10). SERA5 is expressed at much greater levels than other family members and is indeed highly abundant generally. Furthermore, SERA5 appears to be essential for normal P. falciparum blood-stage growth given the inability to disrupt this gene using a number of different approaches (9). SERAs are peripheral proteins of 100-130 kDa that are exported to the parasitophorous vacuole (1,3,9,11), and there is evidence that SERA5 associates with the merozoite surface, presumably by interacting with an integral membrane protein (12). Antibodies against the N-terminal 47-kDa fragment of SERA5 are elicited in immune individuals (9,10) and can effectively inhibit blood-stage growth operating, in one way at least, by preventing the dispersal of merozoites in a rupturing schizont (13). Therefore, it is possible that SERA5 plays a key role in egress and/or invasion of merozoites into host erythrocytes.
* This work was supported in part by the National Health and Medical Research Council (NHMRC) of Australia. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
All SERA proteins possess a central domain that shows strong homology to the papain family of cysteine proteases. However, some SERAs, including P. falciparum SERA5, have a serine substituted in place of the active site cysteine residue (2,6,7). In P. falciparum, six of the nine SERAs have an active site serine (SERA1-5 and SERA9), whereas the other three have an active site cysteine. This non-conventional substitution has led to questions as to whether SERAs with an active site serine retain enzymic function (6).
In this study, we have expressed and renatured a recombinant form of the SERA5 protease-like domain. This protein cleaved substrates downstream of aromatic residues and demonstrated autolytic activity that was inhibited by a serine protease inhibitor, providing the first indication that this protein has retained proteolytic activity. Phylogenetic analysis revealed that the cysteine to serine substitution predates speciation and that the current SERA family is separated into two distinct phylogenetic groups possessing either SERA5-like (active site serine) or SERA6-like (active site cysteine) enzyme domains. Molecular modeling further supported the notion of distinct proteolytic roles for the enzyme domains of SERA5 and SERA6.

EXPERIMENTAL PROCEDURES
Reversed-phase HPLC-Reversed-phase HPLC (RP-HPLC) was performed using a Hewlett-Packard (Waldbronn, Germany) 1050 modular HPLC. Buffer A comprised 0.05% (v/v) trifluoroacetic acid (HPLC/Spectro grade, Pierce, Rockford, IL) in Milli-Q grade water (Millipore, Bedford, MA); buffer B comprised 0.05% (v/v) trifluoroacetic acid in acetonitrile (ChromAR HPLC grade, Mallinckrodt, Paris, KY) in Milli-Q grade water. Aliquots from the refold mixture were centrifuged at 14,000 rpm for 15 min at 4°C prior to loading onto a C8 column (2.1-mm inner diameter × 100-mm (Brownlee) columns, PerkinElmer Life Sciences, Norwalk, CT) in the presence of buffer A. Bound proteins were eluted using a linear gradient of 0-100% B over 12 min at a flow rate of 0.5 ml/min.
SDS-PAGE and Immunoblotting of Parasite Proteins-Parasite cultures were synchronized by treatments with sorbitol then enriched by Percoll purification. The parasitized erythrocytes were washed in phosphate-buffered saline then placed into sample buffer with or without β-mercaptoethanol prior to snap freezing and storage at −70°C. Parasite proteins were fractionated on 10% discontinuous gels and electrophoretically transferred to nitrocellulose, and immunoblotting was performed using procedures described previously (14).
Limited Proteolysis-A 5 mg/ml solution of elastase (porcine pancreas, Roche Applied Science) in 50% (v/v) glycerol, 100 mM NaCl, 0.5 mM EDTA, and 20 mM Tris, pH 8.25, was diluted progressively in a 3-fold series using a stock solution of 300 mM NaCl/60 mM Tris, pH 8.25. To each 5-μl aliquot of enzyme from the dilution series was added 5 μg of 5PE in 10 μl (total vol. = 15 μl). The reactions were incubated at 37°C for 10 min, then placed immediately into an ice bath before fractionation by SDS-PAGE. To identify the major fragments generated by limited proteolysis, the procedure was scaled up 5-fold, and elastase concentrations of 1.6 and 0.06 mg/ml were used to generate 5PEb and 5PEa plus 5PEb, respectively. These reactions were incubated for 15 min at 37°C. Further elastase activity was inhibited by the addition of phenylmethylsulfonyl fluoride to a 2 mM final concentration. Fragments of SERA 5PE were purified by RP-HPLC as described above except that a gradient profile of 0-24 min, 0-100% B at a flow rate of 0.5 ml/min was used. The 5PEa and 5PEb elastase fragments co-eluted using these RP-HPLC conditions.
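The progressive 3-fold dilution series described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the number of dilution steps is not stated in the text and is assumed here, and the function name is hypothetical. Under these assumptions the first and fourth dilutions of the 5 mg/ml stock come out at roughly 1.7 and 0.06 mg/ml, close to the two concentrations quoted for the scaled-up digests.

```python
def dilution_series(stock_mg_per_ml, fold, steps):
    """Return concentrations for a progressive serial dilution.

    Index 0 is the undiluted stock; each later entry is the previous
    concentration divided by `fold`.
    """
    return [stock_mg_per_ml / fold ** i for i in range(steps + 1)]


# 5 mg/ml elastase stock, diluted progressively in a 3-fold series.
# The number of steps (6) is an assumption made for illustration only.
series = dilution_series(5.0, 3, 6)
for step, conc in enumerate(series):
    print(f"dilution {step}: {conc:.3f} mg/ml")
```

Note that these are concentrations of the enzyme aliquots themselves; each 5-μl aliquot is further diluted 3-fold on addition to the 15-μl reaction.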
Disulfide Bond Determination-Disulfide-linked peptides were identified by comparing the RP-HPLC chromatographic profiles for dithiothreitol-reduced and non-reduced tryptic digests of SERA 5PE, in a manner similar to that described previously (15). A preparative run using 60 μg of tryptic digest was then fractionated under identical chromatographic conditions, and the peaks identified as containing disulfide-linked peptides were subjected to Edman sequencing for identification. RP-HPLC was performed as described above; however, a Vydac (The Nest Group, Southborough, MA) C18 column (4.6 × 250 mm inner diameter) was used throughout, and the elution profile was 0-25 min, 0% B; 25-115 min, 0-60% B; 115-120 min, 100% B. Fractions were collected manually at a flow rate of 1 ml/min.
N-terminal Sequence Analysis-N-terminal sequence analyses were performed using a Hewlett-Packard biphasic N-terminal protein sequencer (Model G1005A, Hewlett-Packard) using version 3.0 chemistry as described previously (16).
Electrospray Mass Spectrometry-On-line mass spectrometric analysis of 5PE preparations was performed using a Micromass QTof2 hybrid quadrupole time-of-flight mass spectrometer equipped with a nanoelectrospray ionization source. Multiply charged mass spectra obtained were processed using MaxEnt 1 built into the MassLynx software package to obtain parent molecular weights (Micromass, UK).
Zymograms-Samples were separated in 12% discontinuous SDS-PAGE gels with or without the presence of 0.4% (w/v) Bloom 300 gelatin (Sigma). Gels were rinsed in water and incubated for 30 s in 10% (v/v) acetic acid, rinsed again in water, and incubated for 30 min at room temperature in 2.5% (v/v) Triton X-100. Gels were left overnight at 37°C in 0.1 M sodium bicarbonate, pH 7.5, buffer containing 1 mM CaCl2. Protein bands were visualized after the staining of gels with a solution of Coomassie Brilliant Blue R-250 (Bio-Rad).
Enzyme Solution Assays-Solution assays were performed in 0.1 M NaHCO3, pH 7.5, 5 mM CaCl2, and 0.1% (v/v) Tween 20. The fluorescent substrates Succinyl-AAPF-7-amino-4-methylcoumarin (Suc-AAPF-AMC) and Suc-LLVY-AMC were obtained from a commercial source (Calbiochem, La Jolla, CA). Stock substrate solutions (1 mM in Me2SO) were diluted to 10 μM in the above buffer. Fluorescence readings were taken every 90 s over a 2-h period using an excitation wavelength of 370 nm and an emission wavelength of 460 nm.
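The text does not state how rates were derived from the fluorescence time courses. One common approach, shown here purely as a hedged sketch, is to take the least-squares slope of fluorescence against time, which gives a rate in fluorescence units per second. The function name and the synthetic trace are hypothetical and are not data from this study.

```python
def initial_rate(times_s, fluorescence):
    """Least-squares slope of fluorescence versus time (units per second)."""
    n = len(times_s)
    if n != len(fluorescence) or n < 2:
        raise ValueError("need at least two paired readings")
    mean_t = sum(times_s) / n
    mean_f = sum(fluorescence) / n
    sxy = sum((t - mean_t) * (f - mean_f)
              for t, f in zip(times_s, fluorescence))
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    return sxy / sxx


# Readings every 90 s over a 2-h period, as in the assay description.
times = [90 * i for i in range(81)]  # 0 s to 7200 s inclusive

# Synthetic, perfectly linear trace used only to exercise the function.
trace = [100.0 + 0.2 * t for t in times]

rate = initial_rate(times, trace)
print(f"{rate:.3f} fluorescence units per second")
```

In practice only the early, linear part of a real trace would normally be fitted, since substrate depletion flattens the curve later in the time course.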
Molecular Modeling-A three-dimensional model was first constructed for the putative enzyme domain of P. falciparum SERA6. To select the most appropriate structural template, this sequence was threaded against a data base of 4258 structural templates using the fold recognition module of the program ProCeryon (17). Default values were used for the gap opening and gap extension penalty parameters. The program Modeler (18), which builds homology models from the satisfaction of spatial restraints derived from the alignment of the target with the template, as implemented within the InsightII (version 98.0) Homology software package (Accelrys Inc., San Diego, CA), was used to construct the homology model. Twenty structural models of SERA6 enzyme were built using the following parameter values or options: library_schedule of 1, max_var_iterations of 500, md_level of "refine1," repeat_optimization of 2, and max_molpdf of 10^6. The sub-routine "special_patches" was used to constrain the formation of disulfide bonds absent in the template. The model with the lowest value of the Modeler objective function (i.e. −1.0 × ln(molecular probability density function)) was selected as the best model structure. The structural and stereochemical quality of this model was checked and confirmed with the programs PROSAII version 3.0 (19), Profiles-3D (20), and PROCHECK version 3.5 (21). For example, ProsaII gave a total Z-score of −7.64 for the best SERA6 model compared with −9.86 for the template structure caricain. The same approach was subsequently used to model the enzyme domain of SERA5, which gave a ProsaII Z-score of −7.01.
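The selection rule described above (build several models, keep the one with the lowest objective function) can be illustrated with a short sketch. It does not call Modeler or InsightII; the model names and objective values are hypothetical placeholders, and the helper names are assumptions introduced only for this example.

```python
import math


def objective_from_molpdf(molpdf):
    """Objective function as defined in the text: -1.0 x ln(molpdf)."""
    return -1.0 * math.log(molpdf)


def best_model(objectives):
    """Return (name, objective) for the model with the lowest objective."""
    name = min(objectives, key=objectives.get)
    return name, objectives[name]


# Hypothetical objective values for three of the twenty models.
scores = {"model_01": 1523.4, "model_02": 1498.7, "model_03": 1610.2}

selected = best_model(scores)
print(selected)
```

Because the objective is the negative logarithm of a probability density, the lowest value corresponds to the model that best satisfies the spatial restraints.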
Phylogenetic Analysis-Sequences for 28 presumptive SERA enzyme domains were obtained from the Plasmodium data base (www.PlasmoDB.org) and aligned using ClustalW, with default parameters (22). P. falciparum SERA8, a suspected pseudogene (9), was excluded from the analysis. Initially, pairwise distances were computed as maximum likelihood estimates of the expected number of substitutions per site, under the Dayhoff PAM matrix substitution model. These distances were calculated with PROTDIST, and trees were constructed with the neighbor-joining algorithm as implemented in NEIGHBOR, both in the PHYLIP package (23,24). The trees were bootstrapped, and the input order of the sequences was randomized in the process. Subsequent comparison of the resulting unrooted gene tree with hypothesized evolutionary relationships among the six Plasmodium species produced a list of implied speciation, duplication, and deletion events. Rooting the gene tree (see below) and selecting two alternative configurations over two low confidence internal edges in the original neighbor-joining tree dramatically reduced the number of duplication and, in particular, deletion events required to reconcile the gene and species trees.
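To make the tree-building step above concrete, the following is a minimal pure-Python sketch of the neighbor-joining algorithm applied to a pairwise distance matrix. It is not the NEIGHBOR program from PHYLIP and omits bootstrapping and input-order randomization; the function name, the internal node labels ("node1", "node2", ...), and the five-taxon toy matrix are all assumptions for illustration, not SERA data.

```python
def neighbor_joining(names, matrix):
    """Neighbor-joining on a symmetric distance matrix.

    Returns (joins, final_edge). Each join is
    (child_a, child_b, new_node, branch_a, branch_b); final_edge is
    (node_x, node_y, length) connecting the last two nodes.
    Taxon names must be unique and must not look like 'nodeN'.
    """
    # Distances keyed by unordered pairs of node labels.
    d = {}
    for x, a in enumerate(names):
        for y, b in enumerate(names):
            if x < y:
                d[frozenset((a, b))] = float(matrix[x][y])
    nodes = list(names)
    joins = []
    while len(nodes) > 2:
        n = len(nodes)
        # Sum of distances from each node to all other current nodes.
        total = {i: sum(d[frozenset((i, k))] for k in nodes if k != i)
                 for i in nodes}
        # Pick the pair minimizing Q(i, j) = (n - 2) d(i, j) - r_i - r_j.
        best = None
        for x in range(n):
            for y in range(x + 1, n):
                i, j = nodes[x], nodes[y]
                q = (n - 2) * d[frozenset((i, j))] - total[i] - total[j]
                if best is None or q < best[0]:
                    best = (q, i, j)
        _, i, j = best
        dij = d[frozenset((i, j))]
        # Branch lengths from the joined pair to the new internal node.
        len_i = 0.5 * dij + (total[i] - total[j]) / (2 * (n - 2))
        len_j = dij - len_i
        new = f"node{len(joins) + 1}"
        # Distances from the new node to every remaining node.
        for k in nodes:
            if k not in (i, j):
                d[frozenset((new, k))] = 0.5 * (
                    d[frozenset((i, k))] + d[frozenset((j, k))] - dij)
        nodes = [k for k in nodes if k not in (i, j)] + [new]
        joins.append((i, j, new, len_i, len_j))
    x, y = nodes
    return joins, (x, y, d[frozenset((x, y))])


# Toy additive distance matrix for five taxa (illustrative only).
taxa = ["a", "b", "c", "d", "e"]
dist = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
joins, final_edge = neighbor_joining(taxa, dist)
for join in joins:
    print(join)
print(final_edge)
```

The result is an unrooted tree, which is why the analysis in the text requires a separate rooting step before the gene tree can be reconciled with the species tree.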

RESULTS
Expression and Refolding of the Enzyme Domain of P. falciparum SERA5-To determine if the SERA5 enzyme domain possessed proteolytic activity, we expressed most of the 50-kDa fragment in E. coli (Fig. 1). The N terminus of 5PE is the same as the native 50-kDa species (11), whereas the location of the C terminus of 5PE was based on sequence alignments and the known termini of numerous other papain family proteases. These alignments were also used to predict the location of the N terminus of the enzyme domain 187 residues downstream from the beginning of 5PE (Fig. 1).
E. coli-expressed 5PE was deposited exclusively as insoluble inclusion bodies (Fig. 2A). These were solubilized under reducing and denaturing conditions and purified by Ni-NTA-agarose chromatography (Fig. 2A). 5PE was then refolded in vitro by dilution into refolding buffer in the presence of a glutathione redox pair. Refolding was monitored by RP-HPLC until the majority of the starting material had converted to an earlier-eluting, Gaussian-shaped peak (Fig. 2B), consistent with appropriate internalization of hydrophobic groups. Refolded SERA 5PE was further purified using one or more chromatographic steps. The observed differential migration under reduced versus non-reduced SDS-PAGE supports the existence of a disulfide bond-stabilized conformation within the refolded SERA 5PE (Fig. 2, A and C). Also consistent with appropriate refolding, only 5PE that was refolded and not exposed to reducing agent, either prior to or during electrophoresis, reacted strongly with IgG from malaria-immune donors, which recognizes native SERA5 epitopes (Fig. 2C, lane 6). By gel filtration chromatography, the vast majority of the refolded 5PE appeared to be monomeric, with a molecular mass estimated from the calibration curve at 49,545 Da, consistent with the calculated mass of 54,150 Da (Fig. 3).
Limited proteolysis with elastase was used to investigate the structural stability of refolded 5PE. Treatment with different concentrations of elastase revealed a moderately stable and a highly stable fragment termed 5PEa and 5PEb, respectively (Fig. 4A). The identities of these species, presumed to represent tightly folded, homogeneous species, were determined using a combination of Edman degradation and mass spectrometry on RP-HPLC-purified products (15). Both fragments correspond to the full putative protease domain with a small N-terminal extension of either 32 (5PEa) or 20 (5PEb) amino acids and no processing at the C terminus (Fig. 4, B-E, and see Fig. 1B). [...] masses for all fragments (including 5PE itself) perhaps reflected differences in the oxidative state of the numerous internal methionine residues within these fragments.
Studies on the structures of zymogens have shown that the Pro sequences of some of these proteases prevent premature activity by inserting into the catalytic cleft of the enzyme domain (25,26). It is possible that the C-terminal region of the SERA5 Pro domain associates with the catalytic cleft and is held in position via a hairpin loop that is stabilized by a disulfide bond between cysteine residues 3 and 4. Both disulfide linkages found in the Pro region of 5PE were determined by disulfide-mapping techniques as described elsewhere (15) (Fig. 5). Tryptic digestion of the refolded 5PE revealed two disulfide-linked peptides, which were purified by RP-HPLC. Edman degradation of these peptides established a disulfide linkage between C1 and C2 and between C3 and C4 (Fig. 5). Providing support for the hypothesis that the C terminus of the Pro domain associates with the catalytic cleft is the finding that SERA 5PE undergoes autolytic breakdown upon storage to generate a stable species of similar molecular weight to the highly elastase-resistant 5PEb species (see below). This species, termed 5PEc, was shown by Edman sequencing/mass spectrometry to comprise the protease domain with an N-terminal extension of 18 amino acids (Fig. 4, B and F, and see Fig. 1B).

FIG. 1 (partial legend). The catalytic serine is indicated with an asterisk, and other active site residues are indicated by the § symbol. The predicted N terminus of the enzyme domain is indicated by E. The N termini of the elastase-resistant species (5PEa and 5PEb) are indicated, as is that of the major autolytic breakdown product (5PEc).

To further test if 5PE resembled the native SERA5 50/56-kDa domain, polyclonal rabbit antisera and a panel of mouse polyclonal and monoclonal antibodies were raised to this recombinant protein. Although details of their reactivities will be described elsewhere (A. N. Hodder and B. S. Crabb, manuscript in preparation), these reagents reacted strongly and specifically with parasite species that represent full-length (~110 kDa) and processed forms of native SERA5. In Fig. 6, an example is shown of the reactivity by immunoblot of three different anti-5PE polyclonal sera (mouse and rabbit origin) to proteins derived from mature blood-stage parasites. These antibodies recognize a band that co-migrates with the full-length SERA5 as recognized by antibodies that react specifically with the N terminus of SERA5 (9). The other bands recognized by anti-5PE antibodies have Mr values equivalent to the three different processing products, P73, P56, and P50, that are known to contain the central protease domain (Fig. 6) (27). As expected, antibodies that recognize the N terminus of SERA5 do not recognize these sub-fragments but instead detect the N-terminal P47 species. Pre-bleed sera from these animals failed to react with parasite proteins (Fig. 6).

FIG. 2. Expression and refolding of the enzyme domain of SERA5 (5PE).
A, Coomassie Blue-stained SDS-PAGE gel loaded with the following samples: uninduced p5PE-E. coli (−IPTG), induced p5PE-E. coli (+IPTG), washed 5PE inclusion bodies (IB), 5PE eluent from a Ni-NTA column using denaturing buffers (NiNTA), and refolded and purified 5PE (ref5PE). All samples were electrophoresed in the presence of reducing agent except for that at the extreme right of the panel, which was separated under non-reducing conditions. B, RP-HPLC profiles of denatured (Un) and in vitro refolded (Ref) 5PE. C, identical SDS-PAGE gels that were either stained with Coomassie Blue (left) or probed with human anti-P. falciparum IgG following immunoblotting (right). Antibody staining was visualized by chemiluminescence (14). For Coomassie Blue staining, samples were loaded at 1 μg/lane, whereas for immunoblotting samples were loaded at 40 ng/lane. Samples 1-3 were separated in the presence of reducing sample buffer, whereas samples 4-6 were separated under non-reducing conditions. Prior to the addition of SDS-PAGE sample buffer, the 5PE in lanes 1 and 4 had been reduced and denatured (i.e. this material had not yet been refolded in vitro), whereas the material in lanes 2 and 5 had been reduced and alkylated (14). Lanes 3 and 6 contain the purified in vitro refolded 5PE. 12% resolving gels were used in each case.

FIG. 4. Refolded 5PE contains a tightly folded, elastase-resistant domain.
A, incubation of purified 5PE with decreasing amounts of elastase (E). Samples were separated by SDS-PAGE under reducing conditions on 12% resolving gels. B, Edman degradation and summary of mass spectrometric data for 5PE and its proteolytic fragments. Note that the N-terminal methionine residue is not present in the 5PE peptide sequence (asterisk) and has probably been removed from the mature 5PE. In the peptide sequence, X represents residues that correspond to serine or cysteine residues. Note also that the theoretical mass determined for each fragment assumes all disulfide linkages are in place (double asterisk). The deconvoluted mass spectra shown in Panels C-F are for samples containing 5PE, 5PEa plus 5PEb, 5PEb, and the autolytic product 5PEc, respectively. 5PEa and 5PEb were analyzed together (D), because these fragments could not be separated by RP-HPLC.

Recombinant SERA5 Enzyme Domain (5PE) Has Proteolytic Activity-The refolded 5PE fragment displayed enzymatic activity on zymograms after SDS-PAGE in non-reducing, but not reducing, sample buffer (Fig. 7A). Overnight incubation of gelatin-embedded zymograms with a pH 7.5 activation buffer resulted in no detectable loss of gelatin but did result in the almost complete disappearance of 5PE that had been electrophoresed in non-reducing sample buffer (Fig. 7, A and B). No such degradation was observed in pH 5.5 activation buffer (data not shown). As mentioned above, several smaller forms of 5PE, which comprise the enzyme domain with varying lengths of N-terminal sequence, have been generated. These proteins were similarly degraded in zymograms under identical conditions (the example of 5PEc is shown in Fig. 7A).
Autolytic activity of 5PE was significantly inhibited by the presence of the serine protease inhibitor 3,4-dichloroisocoumarin (3,4-DCI) at 200 μM. Fig. 7B summarizes the results obtained from three independent experiments, involving different 5PE preparations, designed to quantify the amount of 5PE autolysis in the presence or absence of 3,4-DCI. Me2SO, used to solubilize the inhibitor, had a negligible effect on 5PE autolysis on its own (Fig. 7B). The presence of another serine protease inhibitor, phenylmethylsulfonyl fluoride, at 1 mM had no detectable effect on activity in zymograms (data not shown).
The autolytic product 5PEc was processed immediately downstream of a tyrosine residue, suggesting chymotrypsin-like activity. Hence, we tested whether 5PE could process two synthetic fluorescent peptide substrates, Suc-AAPF-AMC and Suc-LLVY-AMC, downstream of their respective aromatic residues at pH 7.5. As shown in Fig. 7C, 5PE demonstrated activity against both substrates (0.22 and 0.08 fluorescence units per second, respectively).
SERA Protease Domain Sequences Group into Distinct Phylogenetic Clusters-An understanding of the evolutionary history of the SERA protease family may shed some light on the possibility of different functions for the different family members. To examine this, the amino acid sequences of 28 presumptive SERA protease domains from six Plasmodium species were aligned and subjected to phylogenetic analysis. From the alignment it was clear that all SERA domains are closely related to one another, sharing 48-98% amino acid identity. However, the family separated into two broad phylogenetic clusters, those with an active site serine forming one cluster and those with an active site cysteine the other (Fig. 8). This separation had strong bootstrap support. All species examined for which more than one SERA gene has been identified encoded at least one member of each cluster. This analysis is consistent with the active site cysteine to serine mutation occurring only once in the evolutionary history of the SERA family.
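The pairwise identity range quoted above depends on how identity is counted, which the text does not specify. As a hedged sketch, the function below computes percent identity over aligned columns in which neither sequence has a gap; this denominator convention is an assumption, and the two aligned fragments are invented for illustration rather than taken from any SERA sequence.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity over columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, not real SERA sequences).
identity = percent_identity("GSCWAF-STV", "GCCWAFAST-")
print(f"{identity:.1f}% identity")
```

Other conventions (for example, dividing by the full alignment length or by the shorter sequence) give different percentages for the same alignment, so the convention should be stated alongside any reported range.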
Interestingly, the "cysteine" sequences were more closely related to each other than were those of the "serine" cluster. The serine cluster further subdivided into three sub-classes, which were grouped along the lines of closely related species and according to the presence of additional mutations to the active site histidine (Fig. 8). However, no such mutations were evident in members of the cysteine group. All six P. falciparum protein sequences with the catalytic serine grouped together, including the SERA9 sequence, which is encoded by a gene on a separate chromosome from the other five. Together, this analysis is consistent with an ancient cysteine to serine event followed by gene duplication and divergence of these genes post speciation.
Molecular Modeling of the Enzyme Domains of P. falciparum SERA5 and SERA6-We performed molecular modeling on representative members of the serine and cysteine groups (P. falciparum SERA5 and SERA6, respectively) to further investigate the relationship of these domains. Initial threading analysis revealed that the SERA6 enzyme sequence was most compatible with the caricain fold. Accordingly, the caricain structure (Protein Data Bank identification number, 1MEG) was used as the template to model the SERA enzyme domain sequences. The initial threading alignment, after some manual adjustment, was used as the input for the homology modeling. A disulfide bond constraint was applied on cysteines #5 and #7 (see Fig. 5), because initial modeling showed that these cysteine side chains were close to each other and in relative orientations amenable to the formation of a disulfide link. The analogous caricain residues are not cysteines; however, there is precedent for the existence of such a disulfide linkage in human cathepsin B, a related protein from the papain family. Cysteine residues 14 and 43, which form a disulfide linkage in human cathepsin B, align with cysteines #5 and #7 from 5PE (28). The remaining six cysteines in the enzyme domain of 5PE align in positions equivalent to those found in caricain and were predicted to form the same disulfide linkages in the model (Fig. 5).
Ribbon representations of the final models for enzyme domains of SERA5 and SERA6 are shown in relation to the known structure for caricain (Fig. 9A). It is clear that each of the modeled structures is similar to caricain and both possess a cleft with key active site residues positioned appropriately for catalysis. By analogy with papain, the S1′ sub-site is completely conserved among SERA sequences from P. falciparum, P. vivax, and P. vinckei, whereas sub-sites S1 and S2 are largely conserved. Despite the similarity of the enzyme domains of SERA5 and SERA6 at a sequence and structural level, the electrostatic potential on the molecular surface of the modeled enzyme domain of SERA5 is strikingly different to that of SERA6, with the former predominantly negatively charged (total formal charge of −13) and the latter positively charged (total formal charge of +8; Fig. 9B). Together with their different active-site residues and their distinct evolutionary groupings, these data suggest different functional roles for the catalytic domains of SERA5 (serine-type)-like and SERA6 (cysteine-type)-like proteins.

FIG. 5. Determination of disulfide bonds in the SERA5 Pro domain.
A, RP-HPLC of 5PE tryptic digest. Expanded regions of the chromatogram are shown for identification of the peaks containing disulfide-linked tryptic peptides (as described under "Experimental Procedures"). B, Edman degradation results determined for the peptides found in peaks 1 and 2 of the tryptic digest. C, schematic of the disulfide connectivity pattern for 5PE. The solid lines portray connectivities determined via biochemical techniques, whereas the bonds in the enzyme domain depicted with broken lines were predicted by modeling (see Fig. 9).

FIG. 6. Immunoblot of P. falciparum D10 strain parasites probed with polyclonal antisera raised to 5PE in rabbits and mice.
Parasite proteins were electrophoresed on 10% resolving gels in the presence of reducing sample buffer. The contents of lanes 3 and 4 were probed with antibodies from two rabbits that were immunized three times (1 month apart) with recombinant refolded 5PE. The contents of lanes 1 and 2 were probed with the corresponding prebleeds of these rabbits. The contents of lanes 5 and 6 were probed with the prebleed and the serum of a three times 5PE immunized mouse, respectively. A polyclonal rabbit serum raised to a unique region of the N terminus of SERA5 was used to probe the contents of lane 7 (9). All polyclonal sera were used at a dilution of 1/2000.

FIG. 8 (partial legend). Rooting and two minor internal modifications were suggested by reconciliation of an initial inferred gene tree with assumed species phylogeny (36,37). Squares denote inferred duplication events. The identities of the amino acids at two key active site positions, which are normally cysteine and histidine in papain family proteases, are indicated on the tree. Sequences were derived from P. yoelii (y1-y4), P. vinckei (vi1-vi3), P. vivax (v1-v6), P. knowlesi (k1-k5), P. falciparum (f1-f9), and P. chabaudi (c1). P. falciparum sequences are in red. Gene numbering for P. falciparum, P. vivax, and P. vinckei is as assigned in the literature (6,7,9).
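The total formal charges quoted for the two modeled enzyme domains are the kind of quantity that can be approximated directly from sequence. The sketch below counts +1 for each Lys or Arg and −1 for each Asp or Glu, treating His as neutral and ignoring the termini; these counting rules are assumptions, since the convention used by the modeling software is not stated in the text. The peptide is a made-up example, not a SERA sequence.

```python
def formal_charge(sequence):
    """Net formal charge from side chains only.

    Counts +1 per Lys/Arg and -1 per Asp/Glu. Histidine is treated as
    neutral and the chain termini are ignored (both are assumptions).
    """
    seq = sequence.upper()
    positive = sum(seq.count(residue) for residue in "KR")
    negative = sum(seq.count(residue) for residue in "DE")
    return positive - negative


# Hypothetical ten-residue peptide used only to exercise the function.
charge = formal_charge("DNSDNMFKRE")
print(charge)
```

A single number like this says nothing about how charge is distributed over the surface, which is why the comparison in the text relies on the electrostatic potential mapped onto the modeled structures.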

DISCUSSION
It is well established that proteases play key roles in mediating the release of P. falciparum merozoites from infected erythrocytes and in the invasion by these forms into new erythrocytes (29 -33). Given the parasite-specific nature of these processes and the likely accessibility of the enzymes involved, such proteases are attractive potential drug targets. Unfortunately, although many proteases are encoded by the P. falciparum genome (34), very few have been demonstrated to play a role in parasite egress and/or erythrocyte invasion. SERA proteins are clearly candidates for such functional roles. Numerous additional factors point to an important role for P. falciparum SERA5 in particular. However, the unusual cysteine to serine active site substitution in this molecule has raised uncertainty about whether it functions as a protease. Here, we demonstrate that the SERA5 enzyme domain is capable of proteolytic activity.
There are several reasons why we are confident that the activity observed represents genuine processing by 5PE and not by contaminating E. coli-derived proteases. First, 5PE was obtained in a highly purified state that followed a number of sequential enrichment steps that included inclusion body preparation, nickel affinity, ion exchange, and gel permeation chromatography. Moreover, the inclusion bodies containing the histidine-tagged 5PE were extensively denatured and purified initially in a reduced state. Hence, contaminating proteases would also require appropriate refolding that may or may not occur with the conditions used for 5PE. Also in relation to purity, no evidence of contaminating proteases capable of digesting gelatin or casein in 5PE zymograms was detected. Second, the activity was reproducible using different 5PE preparations and slightly different purification strategies (e.g. RP-HPLC, lyophilization, and reconstitution). Third, smaller fragments of 5PE, containing the enzyme domain, show similar behavior to the parent 5PE on zymograms. Finally, autolysis could be reduced significantly using the inhibitor 3,4-DCI, a serine protease inhibitor that binds covalently to catalytic site serine residues.
Also, the modeled structure of the SERA5 enzyme domain and its characteristics upon refolding bear the hallmarks of a proteolytic enzyme. The predicted structure of this domain conforms well to a papain-like fold despite retaining only ~20% amino acid identity to caricain. This includes a characteristic catalytic cleft and appropriate positioning of key coordinating and active site residues (see Fig. 9). Furthermore, appropriate folding of recombinant forms of the SERA5 enzyme domain appeared to require the presence of the Pro sequence. In a number of instances with other active papain-like proteases, it is known that this sequence folds back into the active site, stabilizing the structure of the enzyme and/or inhibiting proteolytic function until appropriate activation conditions are encountered (25,26). The results obtained for the limited digestion of 5PE with elastase were consistent with a tight association of enzyme and Pro sequences. Here, the highly elastase-resistant fragment encompassed the entire enzyme domain plus 20 amino acids of Pro sequence. This sequence includes two cysteine residues, which we determined were disulfide-bonded in 5PE. Hence, it is likely that this bond stabilizes a hairpin loop that places residues upstream of the first of these cysteine residues in the catalytic cleft, i.e. residues 12-20 upstream of the enzyme domain. Consistent with this, the stable autolytic product 5PEc had an N terminus beginning 18 residues upstream of the beginning of the enzyme domain. This narrows down the likely amino acids in the cleft to some or all of the residues in the peptide DNSDNMF (see Fig. 1).
In this study we also addressed the phylogenetic relationship of the Plasmodium SERA enzyme domains. It appears from this analysis that an ancient gene duplication event, predating speciation, gave rise to the serine-type SERA enzymes from their cysteine-type ancestor. Plasmodia that circulate today retain at least one of each SERA type, suggesting independent roles for each. Furthermore, in P. falciparum it is known that members of both types, SERA4 and SERA5 representing the serine-type and SERA6 the cysteine-type, are expressed in blood stages (9,10). Supporting different roles, it was apparent that there is greater evolutionary constraint on cysteine-type SERA enzymes than on those of the serine-type. One possible explanation is that cysteine-type SERA proteases are under greater functional constraint than those of the serine-type. Another possibility is that only enzymes of the serine-type function at a stage at which they are exposed to protective antibodies: late in schizont rupture or perhaps in invasion itself. It is apparent, for instance, that SERA5 growth-inhibitory antibodies are able to access parasites during egress in a manner that prevents dispersal of merozoites (13). It is tantalizing to speculate that the higher rate of gene duplication and divergence of serine-type SERA genes is a response to this increased selection pressure. Interestingly, in a previous study we detected antibodies in P. falciparum-immune humans to four different serine-type SERAs but found no such antibodies to any of the cysteine-type proteins (9). Also consistent with a different functional role for the two SERA protease types, the electrostatic properties of the modeled structures of P. falciparum SERA5 and SERA6 were very different.
We speculate that the two SERA types of enzymes may function at different points in the two-step egress process: the cysteine-type in the exit from the parasitophorous vacuole, which is a cysteine protease-dependent process inhibited by E64, and the serine-type in release from the host erythrocyte, a process blocked by the inhibitors leupeptin and chymostatin (29-31).
Proteases with unusual catalytic triads have been described elsewhere, such as the picornavirus 3C proteinase (35). In almost the reciprocal circumstance to SERA5, 3C resembles chymotrypsin but has some papain-like features, including a catalytic cysteine. Further characterization of the unusual catalytic activity of SERA5 is clearly required, with identification of its natural substrate a particularly important goal. However, whatever its precise role in parasite biology, there is now considerable evidence validating SERA5 as a target for the development of a new class of antimalarials.