Cloning and Characterization of Four Anopheles gambiae Serpin Isoforms, Differentially Induced in the Midgut by Plasmodium berghei Invasion*

The genomic locus SRPN10 of the malaria vector Anopheles gambiae codes for four alternatively spliced serine protease inhibitors of the serpin superfamily. The four 40- to 42-kDa isoforms differ only at their C terminus, which bears the reactive site loop, and exhibit protein sequence similarity with other insect serpins and mammalian serpins of the ovalbumin family. Inhibition experiments with recombinant purified SRPN10 serpins reveal distinct and specific inhibitory activity of three isoforms toward different proteases. All isoforms are mainly expressed in the midgut but also in pericardial cells and hemocytes of the mosquito. The cellular localization of SRPN10 serpins is nucleocytoplasmic in pericardial cells, in hemocytes and in a hemocyte-like mosquito cell line, but in the gut the proteins are mostly localized in the nucleus. Although the transcript levels of all SRPN10 isoforms are marginally affected by bacterial challenge, the transcripts of two isoforms (KRAL and RCM) are induced in female mosquitoes in response to midgut invasion by Plasmodium berghei ookinetes. The KRAL and RCM SRPN10 isoforms represent new potential markers to study the ookinete midgut invasion process in anopheline mosquitoes.

Successful transmission of malaria parasites to vertebrate hosts requires completion of their sporogonic cycle within the mosquito vector. Therefore, in addition to factors such as feeding behavior or mosquito longevity, vector competence is determined by the ability of the mosquito to support effectively the sporogonic development of a given Plasmodium species. The bird parasite Plasmodium gallinaceum, for example, normally fails to cross the midgut epithelium of the major human malaria vector Anopheles gambiae, because its ookinete stages are lysed in the midgut cells of the vector through an unknown mechanism (1). Even in naturally occurring, well adapted mosquito/parasite combinations the parasites experience severe losses. In fact, in the mosquito the parasite faces not only the cellular and humoral immune responses of hemocytes and the fat body but also the well documented local immune responses of the midgut and salivary gland epithelial barriers (2). In the genetically selected A. gambiae strain L3-5 the antiparasitic mechanisms reach an extreme form, in which complete refractoriness is achieved through the melanotic encapsulation of early oocysts (3).
Because the mosquito's immune system was shown to react to the presence of the parasite, major efforts have been devoted to dissection of the molecular mechanisms underlying these responses (reviewed in Refs. 4 and 5). Continuing efforts are guided by fundamental studies on invertebrate innate immunity performed either in model experimental systems such as Drosophila (6), or in systems that permit consistent biochemical studies such as the Lepidoptera Manduca sexta and Bombyx mori (7) or the horseshoe crab Limulus (8).
Serine proteases play critical roles in the regulation of the invertebrate innate immune responses. In Limulus, for example, cascades of autocatalytically activated proteases triggered either by lipopolysaccharide or glucans converge to activate the hemolymph clotting system, which functions both in coagulation and in defense against pathogens (9). Similarly in M. sexta, the prophenoloxidase cascade (which catalyzes the formation of melanin during the defense reaction and thus plays an important role in the encapsulation of pathogens) is also initiated by proteolytic processing (10).
Not only serine proteases but also their associated regulatory serine protease inhibitors of the serpin superfamily are important modulators of immune responses. The proteases underlying the activation of the clotting cascade in Limulus are known to be inhibited by regulatory serpins (11), as are the Manduca and Hyphantria cunea prophenoloxidase-activating proteases (12,13). The most dramatic evidence for the regulatory function of serpins comes from Drosophila, where genetic dissection of complex pathways is feasible. The Toll pathway has important roles not only in early development but also in the antifungal response (14). Intracellular signaling is initiated through the binding of the extracellular ligand Spaetzle to the Toll receptor, after Spaetzle is proteolytically cleaved by Persephone (15). This recently identified protease is thought to be the end point of an antimicrobial recognition cascade that controls activation of Spaetzle and, therefore, the Toll pathway-mediated antifungal response. In necrotic mutants (nec) the proteolytic cascade is deregulated, leading to the uncontrolled cleavage of Spaetzle and to constitutive expression of the antifungal peptide drosomycin (16). The necrotic (nec) phenotype is associated with two mutations in a serpin gene, the Spn43Ac locus (17). Several immune-responsive serpins have been isolated from various insects, but their specific functions remain unknown (18 -21).
Serpins constitute a large group of functionally diverse serine protease inhibitors that fold into a conserved structure and exhibit a peculiar suicide substrate-like inhibitory mechanism through an exposed reactive site loop (RSL) located toward the C terminus of the protein (22-24). Serpins have been found in viruses and many groups of eukaryotes, including plants, nematodes, arthropods, and vertebrates, but not prokaryotes or yeast (23). Most members of this superfamily are inhibitory and modulate serine proteases, subtilisins (25,26), cysteine proteases such as caspases (27), and papain-like cysteine proteinases and cathepsins (28). In addition, some serpins have lost their inhibitory potential and perform other roles such as hormone transport, storage, and blood pressure regulation (reviewed in Ref. 24).
We present the cloning and characterization of SRPN10, an Anopheles mosquito serpin gene, which gives rise to four alternative spliced serpin isoforms. These isoforms share homology to other insect serpins but also to intracellular cytoprotective mammalian serpins of the ovalbumin family. We characterize biochemically the inhibitory potential of three recombinant serpins and show that the isoforms are expressed in tissues that participate in insect defense reactions, such as hemocytes and the midgut epithelium. Finally, we present evidence that at least two isoforms are transcriptionally up-regulated during parasite passage through the midgut, suggesting that they may be implicated in antiparasitic action or, alternatively, parasite tolerance.

EXPERIMENTAL PROCEDURES
Mosquito and Parasite Techniques-The A. gambiae strain G3 and the hemocyte-like cell line Sua 5.1* were reared and cultured according to previous studies (Refs. 29 and 30, respectively). Mosquito infections were performed as described previously (2), and the mosquitoes were kept at 19°C to allow parasite development until RNA extraction.
Cloning, Characterization, and Cytogenetic Mapping-Degenerate primers corresponding to the third strand of β-sheet A and to the P15-P9 hinge region (fSPI: 5′-AAYGCIGTSTAYTTYAARG; rSPI: 5′-GCYCCYTCYTCRTTIACYTC) were used at 20 pmol in 50-µl PCR reactions (1.5 mM MgCl2, 200 µM dNTP, Amersham Biosciences; 0.1 unit of Taq polymerase, Roche Molecular Biochemicals) to amplify products of expected size from mosquito cell-line and larval lambda ZAP-express cDNA libraries (Stratagene). After subcloning and sequencing, positive fragments were used to screen cDNA libraries for full-length clones. For characterization of the SRPN10 genomic locus an A. gambiae lambda DASH (Stratagene) genomic phage library was amplified, and the phage DNA was isolated using a Qiagen lambda prep kit, following the manufacturer's instructions. The isolated phage DNA was digested with EcoRI, XbaI, and XhoI restriction endonucleases and subjected to Southern blotting with radioactively labeled serpin probes. Identified fragments containing the serpin locus were excised from gels, purified, and cloned into a KS Bluescript plasmid. Several positive clones were picked and sequenced. Cytogenetic mapping was performed with serpin RCM isoform cDNA by in situ hybridization to polytene chromosomes of the A. gambiae Suakoko strain (31).
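The degeneracy of the two primers can be illustrated with a minimal sketch that counts how many distinct sequences each degenerate oligonucleotide represents under the standard IUPAC ambiguity codes. Treating inosine (I) as a single universal base is an assumption of this sketch, and the function name is hypothetical.

```python
# Bases represented by each IUPAC nucleotide code.
# "I" (inosine) is counted as one universal base: an assumption of this sketch.
IUPAC_CHOICES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT", "I": "I",
}

def degeneracy(primer: str) -> int:
    """Return the number of distinct sequences encoded by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC_CHOICES[base])
    return count

# Primer sequences as given in the text.
F_SPI = "AAYGCIGTSTAYTTYAARG"
R_SPI = "GCYCCYTCYTCRTTIACYTC"

if __name__ == "__main__":
    for name, seq in (("fSPI", F_SPI), ("rSPI", R_SPI)):
        print(name, len(seq), "nt, degeneracy", degeneracy(seq))
```

Under these assumptions each primer contains five two-fold degenerate positions, so each corresponds to a pool of 32 sequences.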
RNA Techniques-Total RNA from mosquitoes was isolated using TRIzol (Invitrogen). Northern blot analysis was carried out using Hybond N+ (Amersham Biosciences) nylon membranes with 20 µg of total RNA, following the manufacturer's instructions. Blots were hybridized with radioactive DNA probes, labeled by random priming (Megaprime labeling kit, Amersham Biosciences). For semiquantitative RT-PCR experiments, first-strand cDNA synthesis was primed from total RNA using oligo(dT) magnetic Dynabeads (Dynal) as described previously (29), using 1-5 µl of resuspended beads as template. To amplify specific products, serpin-specific primers (20 pmol) were used in 50-µl reactions with Amplitaq (0.25 unit, Roche Molecular Biochemicals), 200 µM dNTP, 1.5 mM MgCl2, through several thermal cycles (45 s at 94°C, 45 s at 55/60°C, and 30 s at 72°C). During a pause at 72°C, primers complementary to the ribosomal protein S7 gene were added, and the reaction was allowed to proceed for additional cycles, leading to amplification of both the gene of interest and the internal standard, and permitting normalization of the reaction. After electrophoresis on agarose or polyacrylamide slabs, gels were stained with the sensitive SYBR green dye (Molecular Probes) for 45 min and analyzed with a fluorescence imager (Fuji). The primers used for RT-PCRs were as follows. General SRPN10 primers: 5′-TTCTGGCTGAGCGAGACGGAATC and 5′-CTTTGTGGACGACTTTGGACACC; KRAL-specific: 5′-GCTTGGATGATGGGGTCTTC, 5′-TCGTGGCGATTTGCTTGGGC; RCM-specific: 5′-TACCGGTATGATCATGATGATGC, 5′-CGCGACCCACAAAGTAAACCATC; FCM-specific: 5′-CCATGATCGCGGTGTCATTC, 5′-GGCATTTTACAGGTTTTTCC; CAM-specific: 5′-CTGAAAGATTCGCAAGGAAACAT, 5′-CATACATACGGATGGATTAGTTA; rpS7: 5′-GGCGATCATCATCTACGT and 5′-GTAGCTGCTGCAAACTTCGG.
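The normalization to the S7 internal standard can be sketched as a simple ratio of band intensities. The function names and the example intensity values below are hypothetical, and the text does not specify how the imager output was processed, so this is only an illustration of the principle.

```python
def normalize_to_s7(serpin_signal: float, s7_signal: float) -> float:
    """Express a serpin band intensity relative to the S7 band of the same lane."""
    if s7_signal <= 0:
        raise ValueError("S7 reference signal must be positive")
    return serpin_signal / s7_signal

def fold_change(treated: tuple, control: tuple) -> float:
    """Ratio of S7-normalized serpin signal between two samples.

    Each argument is a (serpin_signal, s7_signal) pair from one lane.
    """
    return normalize_to_s7(*treated) / normalize_to_s7(*control)

if __name__ == "__main__":
    # Hypothetical band intensities in arbitrary fluorescence units.
    infected = (600.0, 1000.0)
    naive = (300.0, 1000.0)
    print("fold change:", fold_change(infected, naive))
```

Dividing by the S7 signal corrects for differences in template input between lanes; it does not correct for plateau effects, which is one reason the method is described as semiquantitative.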
Recombinant Serpin Expression and Affinity Purification-The ORF encoding the RCM-serpin isoform was amplified from an Anopheles cell-line library with primers containing a BamHI (5′-AAGGATCCATGGCCGACAATAGCAGCT) and a SacI overhang (5′-TTGGAGCTCGCTAAGCATCGATC) and cloned directionally in a pQE-30 vector (Qiagen). This RCM-pQE30 plasmid was used as the starting point to construct the expression vectors for the other serpin variants. An internal PinAI/AgeI restriction site, conveniently located immediately upstream of the splice site of the CAM and RCM isoforms, was used in combination with a pQE-30 HindIII site to substitute the RCM RSL with the homologous CAM sequence, isolated from a full-length library cDNA. For KRAL and FCM isoforms, specific reverse primers with HindIII overhangs located at the 3′-end of the coding sequences (5′-TGTGTAAAGCTTACAATTCCTCGTGGCGATTTGC and 5′-TTCGGAAGCTTGTTATGGCATTTTACAGGTTTTTC, respectively) were used in combination with a general forward primer, 5′-TTCTGGCTGAGCGAGACGGAATC, to amplify specific products from an adult Anopheles library. The PCR products were then digested with Eco47III and HindIII and directionally cloned into the RCM-pQE30 acceptor plasmid, linearized with the same restriction enzymes. All the expression constructs were sequenced before transformation into Escherichia coli strain TG1. For native protein purification, bacterial cultures were grown until A600 reached 0.7-0.8, induced with 0.4 mM isopropyl-1-thio-β-D-galactopyranoside, and grown for 4 additional hours at 30°C. This procedure resulted in a high solubility of all serpin isoforms. Bacteria were lysed in 300 mM NaCl, 50 mM NaH2PO4, 10 mM imidazole, pH 8.0, incubated 40 min on ice with 100 µg/ml lysozyme and 10 µg/ml DNase, passed through a French press, and centrifuged at 20,000 rpm for 40 min in a refrigerated Beckman ultracentrifuge.
The supernatants containing the soluble protein fraction were recovered, incubated with 0.1% Triton X-100, and gently mixed for 30 min at 4°C. Soluble tagged serpins were purified to over 80% purity with an Ni-NTA column (Qiagen) following the manufacturer's instructions.
Generation of Antiserum against the Common Serpin Backbone-Purified recombinant RCM-serpin (1.0 mg) was used to immunize two rabbits in Ribi Adjuvant (RAS, Ribi Immunochem). Boosts were carried out every fourth week with 0.2 mg of antigen until final bleeding. For antiserum purification, the RCM-serpin gene was fused in-frame to a GST fusion vector pGEX-5x-1. The construct was sequenced and transformed into an E. coli BL21(DE3) strain for protein expression. The fused GST-serpin was purified through a glutathione-Sepharose 4B matrix (Amersham Biosciences) according to the instructions of the supplier, and eluted fractions were further purified with a Mono Q ion-exchange column (Amersham Biosciences) using a fast protein liquid chromatography device. The peak corresponding to the GST-serpin fusion was collected, and 4 mg of protein was coupled to CNBr-activated Sepharose 4B (Amersham Biosciences). The serpin polyclonal antiserum was incubated with GST-serpin fusion Sepharose overnight at 4°C, packed in a column, and washed with PBS until A280 was close to zero. Bound antibodies were eluted with 200 mM glycine-HCl, 200 mM NaCl, pH 2.5, and fractions with A280 ≥ 0.1 were pooled and tested in immunoblots for activity.

RESULTS
Cloning and Characterization of SRPN10-To isolate mosquito serpin genes, degenerate primers were designed based on conserved LVNAVYF and IEVNEEGTEA sequences of both insect and vertebrate serpins (23). A 450-bp fragment of expected size was amplified from an A. gambiae cell line cDNA library and subcloned into a Topo-TA plasmid vector. After confirmation by sequencing, the cloned fragment was used to screen the cell line library, and several positive clones were picked and sequenced. All clones had an identical 5′-end but formed two sequence groups at their 3′-end, indicative of alternative splicing. Therefore, additional clones were analyzed and a fourth instar cDNA larval library was also screened, leading to identification of three distinct full-length serpin clones. They correspond to alternatively spliced 3′-ends, which code for the exposed reactive site loop (RSL) of the serpin.
Because multiple alternatively spliced isoforms had been reported for another insect (32), a mosquito lambda DASH genomic library was screened and used to derive the genomic sequence, thereby determining the full range of possibilities for alternative splicing. Six recombinant bacteriophage clones were isolated by high stringency plaque hybridization with probes corresponding to the sequence of the common serpin backbone and to the RSL of one isoform (CAM), respectively, and analyzed using appropriate restriction enzymes. A single bacteriophage clone spnG6.1 encompassed the whole gene locus (12551 bp) in three adjacent EcoRI fragments (3917, 1772, and 6862 bp). The genomic structure of the locus is schematically represented in Fig. 1A. It was mapped to subdivision 21F on the left arm of the second chromosome (2L) by in situ hybridization to polytene chromosomes (Fig. 1B). The serpin gene was named SRPN10, and its nucleotide sequence was deposited in the GenBank TM data base with accession number AJ420785 (SPI21F).
Several exons encompass the coding region present in the spnG6.1 genomic clone. Comparison with the three isolated cDNA clones showed that the first three exons (1-3) form the common backbone of all splice variants. Exons R, F, and C code for the distinct isoform specific C-terminal reactive site loops, known to be alternatively spliced from the cDNA studies. Careful search of the genomic sequence revealed the presence of an additional in-frame exon (K), encoding a fourth putative reactive site loop, suggesting an additional splice variant from this locus. Specific primers to exon K were designed and used in combination with a backbone primer to amplify from a larval cDNA library a band, which upon sequencing was shown to correspond to the KRAL transcript, thereby confirming the existence of the fourth serpin isoform. Thus, seven exons of the SRPN10 gene were defined, separated by six introns of highly variable size (85, 114, 125, 108, 2459, and 141 bp, respectively, for introns 1-6). Exon/intron boundaries conform to the GT/AG splice donor/acceptor rule, and all introns were characterized by the presence of a polypyrimidine tract. The fifth intron is large (2459 bp), setting the last two RSL exons F and C far apart from the rest of the gene. This intron sequence was carefully searched for cryptic exons, looking for additional putative RSL sequences, but none were found.
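The splice-site checks described above can be sketched as follows. The intron sizes are those reported in the text; the toy intron sequence, the function names, and the 20-nt window used to score the polypyrimidine tract are assumptions made only for illustration and are not taken from the SRPN10 sequence.

```python
# Intron sizes (bp) for introns 1-6, as reported in the text.
INTRON_SIZES = [85, 114, 125, 108, 2459, 141]

def follows_gt_ag_rule(intron: str) -> bool:
    """True if the intron starts with GT (donor) and ends with AG (acceptor)."""
    seq = intron.upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")

def pyrimidine_fraction(intron: str, window: int = 20) -> float:
    """Fraction of C/T in the window immediately upstream of the terminal AG.

    The window length is an arbitrary choice for this sketch.
    """
    seq = intron.upper()
    region = seq[-(window + 2):-2]
    if not region:
        return 0.0
    return sum(base in "CT" for base in region) / len(region)

if __name__ == "__main__":
    # Hypothetical toy intron, not the real SRPN10 sequence.
    toy = "GTAAGT" + "ACGATCGA" + "TTTCTTTCCTTTTCTCTTTC" + "AG"
    print("GT/AG:", follows_gt_ag_rule(toy))
    print("pyrimidine fraction:", pyrimidine_fraction(toy))
    largest = max(range(len(INTRON_SIZES)), key=INTRON_SIZES.__getitem__)
    print("largest intron:", largest + 1, INTRON_SIZES[largest], "bp")
```

A scan of this kind only flags candidate boundaries; in the study the exon/intron structure was established by comparing cDNA and genomic sequences.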
In contrast to the alternatively spliced Manduca serpin-1 locus (33), each specific RSL exon codes for its own 3′-untranslated region. In-frame translational stop codons are found in each of the RSL exons, followed at a distance ranging from 105 to 257 bp by a consensus polyadenylation site AATAAA, with the exception of exon K in which an obvious polyadenylation sequence is absent. The putative translation initiation site is located in exon 1, with a conserved Kozak consensus (A at -3, G at +4; Ref. 34). Consistent with the computational prediction, a single major transcriptional initiation site was located by primer-extension analysis, 25 nucleotides downstream of a well conserved TATA box (data not shown). Using Genomatix Matinspector software, a 1400-bp region upstream of the transcriptional initiation site was explored for the presence of putative regulatory elements (diagrammed in Fig. 1A). A CCAAT enhancer binding protein β (C/EBPβ) sequence was located 10 bp upstream of the TATA box. Other putative binding sites were found, for morphogenetic factors implicated in embryogenesis (Dorsal (DL), Hunchback (HB), Deformed (DFD), Krüppel (KR)), for the ecdysone-inducible DNA binding proteins Broad Complex Z4 (BRZ4) and E74A, for GATA factors, for the Activator Protein 1, and for an activator of the alcohol dehydrogenase gene. Motifs with high similarity to a neuronal cis-element, to nuclear factor AT, to a yeast stress response element, and to binding sites for the ZESTE regulator and c-REL were also observed. This complex organization of the putative SRPN10 promoter region may reflect complex developmental and tissue-specific transcriptional regulation and deserves further investigation. Two REL family factors are also implicated in immune responses in Drosophila (6).
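The relationship between the in-frame stop codon and the downstream AATAAA signal can be sketched as a simple sequence scan. The toy sequence is hypothetical, and measuring the distance from the end of the stop codon to the start of the signal is an assumption of this sketch; the text does not state how the 105-257 bp distances were measured.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def first_inframe_stop(seq: str, frame: int = 0):
    """Return the index of the first in-frame stop codon, or None."""
    seq = seq.upper()
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None

def polya_distance(seq: str, frame: int = 0, signal: str = "AATAAA"):
    """Distance (bp) from the end of the stop codon to the polyadenylation signal.

    Returns None if no in-frame stop or no downstream signal is found.
    """
    seq = seq.upper()
    stop = first_inframe_stop(seq, frame)
    if stop is None:
        return None
    pos = seq.find(signal, stop + 3)
    return None if pos == -1 else pos - (stop + 3)

if __name__ == "__main__":
    # Hypothetical toy exon: short ORF, 105-bp spacer, then the signal.
    toy_exon = "ATGGCCTGA" + "C" * 105 + "AATAAA" + "GG"
    print("distance to AATAAA:", polya_distance(toy_exon))
```

A return value of None corresponds to the situation described for exon K, where no obvious polyadenylation sequence follows the stop codon.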
When analyzing the flanking regions of the locus, searching for potential ORFs or additional exons encompassing previously undetected RSLs, sequences with homology to retroviral elements were noted at both ends (depicted with gray bars in Fig. 1A). Upstream of the serpin promoter a coding sequence with significant homology to a retroelement polyprotein was located, whereas downstream of exon C (encoding the last RSL) a sequence was found with similarity to the reverse transcriptase-like protein of the Aedes aegypti LINE transposon Juan-A (35). To check whether the SRPN10 locus might encompass additional, distant RSL exons, we checked the A. gambiae genome sequence that became available on the web after this study was completed (scaffold AAAB01008900.1). None were found in the 24.7-kb genomic sequence spanning from exon C to the next predicted ORF (a putative Zinc-finger DNA-binding protein). Thus, we are confident that the SRPN10 locus encodes only four alternatively spliced isoforms of the serpin superfamily.
Analysis of Sequence Similarities of SRPN10 Serpins-Serpins typically consist of 370-400 amino acid residues and fold into a conserved structure, with an exposed reactive site loop (RSL) located at the C terminus of the molecule. The RSL represents the accessible "bait" region that mimics the cleavage consensus sequence of the target protease. Upon cleavage of the RSL (at the so-called P1-P1′ scissile bond), serpins undergo a drastic conformational reorganization in which part of the RSL and of the adjacent hinge region fold back into the β-sheet of the backbone, conferring a more stable conformation. When conformational change occurs before deacylation of the Michaelis complex, formed between serpin and protease, the latter is trapped in a very stable complex with the inhibitor (24). Thus, serpins act as suicide substrates, and their specificity is largely determined by the scissile bond located within the RSL. The SRPN10 serpins lack a signal peptide and share the first 335 N-terminal residues, differing only in the last 44-60 C-terminal amino acid residues, which encompass the RSL (Fig. 1, C and D). The alternative splicing of the RSL coding exons therefore potentially represents a functional multiplication of the inhibitory range of a single serpin gene.
FIG. 1. Four serpin isoforms arise from alternative splicing of the SRPN10 gene. A, the serpin SRPN10 genomic locus. Two probes indicated by narrow black bars were used to isolate and characterize a phage containing the whole locus. Restriction enzyme sites (EcoRI, XbaI, and XhoI) are shown. White boxes indicate the three exons forming the common serpin backbone. Colored boxes represent alternatively spliced exons (K, R, F, and C), which give rise to four splice variants (KRAL, RCM, FCM, and CAM, respectively). Gray boxes indicate the location of sequences coding for a retrotransposon-like element. Potential upstream regulatory sites were identified with the Genomatix Matinspector software available on the web and are presented if showing over 90% sequence similarity to the core matrix of known eukaryotic regulatory sequences. For abbreviations, see text. B, in situ hybridization to polytene chromosomes, mapping the serpin locus to the left arm of the second chromosome, subdivision 21F (2L 21F). C, schematic representation of serpin variants. The common backbone is depicted in white (cf. exons 1, 2, and 3 in panel A) and the isoform-specific reactive site loops are colored (cf. exons K, R, F, and C in panel A). The number of amino acid residues in each isoform is indicated. The serpin variants are named after the first residue in the scissile bond region (P1 in panel D). A matrix comparison of peptide sequence similarity and identity of SRPN10 serpin isoforms is presented. Values indicate the percent identity (pink) and similarity (light gray) of the peptide sequences encompassing the RSL in different isoforms (in parentheses are the corresponding values for the entire protein sequences). D, sequence alignment of SRPN10 serpins (in red lettering) and homologous serpins in the C-terminal region encompassing the RSL. The hinge region is indicated with a gray box, the putative scissile bond with yellow, and a motif with high similarity to an ER retention signal with pink. Abbreviations: Dm Sp, Drosophila melanogaster serpin; Bm, B. mori; Em, Echinococcus multilocularis; At, Arabidopsis thaliana; Hv ZX, Hordeum vulgaris protein ZX; ACH, anti-chymotrypsin; PAI, plasminogen activator inhibitor; SCCA-1, human squamous cell carcinoma antigen 1; LEI, leukocyte elastase inhibitor; PI-8, protease inhibitor 8; PI-6, protease inhibitor 6; PTI-6, mouse placental serine protease inhibitor 6 (also known as SPI-3); SPB11, serpin B11; MENT, chicken heterochromatin-associated protein; A1AT, A1 antitrypsin; THBG, thyroxine-binding globulin; A1A4, α1-antitrypsin 1-4; KAIN, kallikrein inhibitor (kallistatin).
The reactive site loops of the four isoforms exhibit sequence conservation both at the nucleotide and the amino acid level. In Fig. 1C we show a matrix comparison of amino acid similarity and identity values between the RSLs of the SRPN10 locus, which are lowest for the KRAL isoform.
BLAST analysis of the whole protein sequences revealed that SRPN10 serpins share high homology to other insect serpins, in particular the Drosophila Sp-4 and Sp-6, with which it forms an orthologous group according to bioinformatic analysis of complete mosquito and fruit fly genomes (36). In addition, SRPN10 shows significant similarity with the mammalian serpins belonging to the neuroserpin and ov-serpin clades (data not shown).
The sequence and conformation of the RSL largely determine the selectivity of inhibition. Thus, sequence alignments (Fig. 1D) and BLAST analysis of the serpin C termini comprising the RSLs were particularly revealing. By these criteria, the four SRPN10 isoforms could be distinguished as follows. The RCM and CAM isoforms resemble not only other insect serpins such as Drosophila Sp-6 and B. mori serpin-2 but also neuroserpin and multiple intracellular cytoprotective ov-serpins (e.g. human and bovine PI-6, mouse PTI-6, human PI-8 and PI-9), which are involved in the inflammatory response and in the modulation of pro-apoptotic proteases of epithelial and endothelial tissues as well as of neutrophils and macrophages (37). The Anopheles KRAL isoform and the Drosophila serpin Sp-4 are characterized by multiple basic residues in the scissile bond and further down the C-terminal peptide sequence. Their C terminus is distinguished by a short stretch of residues that closely resemble an endoplasmic reticulum retention signal found also in neuroserpins and in other serpin clades (38). Multiple basic residues are also found in the C-terminal sequences of Hordeum and Arabidopsis serpins, as well as in the chicken MENT protein, which belongs to the intracellular ov-serpins. MENT is known to induce higher order chromatin compaction and is an abundant component in terminally differentiated hematopoietic cells (39). The FCM isoform, characterized by a stretch of hydrophobic residues in the reactive site, resembles most closely mammalian leukocyte elastase inhibitors (intracellular ov-serpin) and two mouse stomach serpins. It presents a Phe residue in the predicted scissile bond sequence, as is the case for MENT and for the rabbitpox virus serpin SPI-1 (the latter is not shown in the figure).
Searching for Drosophila orthologs of SRPN10 serpins, we noticed that the Sp-4 gene codes for 10 serpin splice combinations (accession numbers AJ428880 to AJ428889), with the possibility of four alternative RSLs (40). For convenience, we named the Sp-4 RSL variants according to the amino acid residues located at the predicted scissile bond (ASM, TSL, VMA, and KRAI, respectively; Fig. 1D). Protein sequence alignments of the C-terminal regions of SRPN10 and Sp-4 show that Sp-4 KRAI may be considered an ortholog of Anopheles SRPN10 KRAL, unlike the other Sp-4 isoforms. In fact, all the other RSLs diverge significantly from the SRPN10 ones. Finally, additional mammalian and insect serpin RSLs with low similarity to SRPN10 are presented in the lowermost alignment of Fig. 1D. In conclusion, SRPN10 Anopheles serpins exhibit remarkable sequence homology not only to specific Drosophila serpins but also to a set of vertebrate serpins of the neuroserpin and ov-serpin clades, both at the whole protein level and in the exposed reactive site loop.
Specific Inhibitory Activity of Anopheles Serpin Isoforms-Recombinant serpin isoforms with a His tag were produced in a bacterial expression system, and fractions containing a band of the expected size, corresponding to the predicted authentic polypeptides, were purified using Ni-NTA matrices (Fig. 2A). An exception was the KRAL isoform, where extensive processing resulted in the accumulation of a lower mass product, possibly corresponding to the serpin after cleavage of the RSL at the dibasic scissile bond.
The functionality of SRPN10 serpin isoforms was assayed in protease inhibition tests, in which the enzymatic activity of commercially available proteases was monitored using chromogenic substrates. The test proteases were mammalian digestive enzymes (trypsin, chymotrypsin, and elastase), serine proteases from human blood (thrombin, kallikrein, and plasmin), and subtilisin-like proteases (proteinase K, subtilisin Carlsberg). Three of the isoforms proved to be inhibitors, whereas KRAL was inactivated as expected. As can be seen in Fig. 2B, all four isoforms have small to medium hydrophobic residues (Cys and Ala) at or near their scissile bond, suggesting that they might be good inhibitors of elastase, which cleaves preferentially at such target residues. In contrast, only RCM and KRAL have basic residues at the predicted P1 site, making them potential inhibitors of trypsin and thrombin, which prefer Arg or Lys at the N-terminal residue of the target peptide bond. Consistent with these predictions, all the testable isoforms, CAM, FCM, and RCM, are potent inhibitors of elastase, whereas only RCM inhibits in addition trypsin as well as thrombin (Fig. 2B). The pattern of chymotrypsin inhibition is also interesting. This protease tends to cleave at hydrophobic residues, with the catalytic efficiency improving as the side chain increases in size. As predicted, it is efficiently inhibited by FCM, which has a Phe residue in the scissile bond, and least so by the CAM isoform, which is characterized by the presence of small hydrophobic residues (Cys and Ala). Importantly, the CAM isoform proved to be an effective inhibitor of bacterial subtilisin-like proteases, inhibiting both subtilisin Carlsberg and proteinase K. This is intriguing, because microbial subtilisin-like proteases are often associated with pathogenicity.
FIG. 2. Protease inhibition assay. A, SDS-PAGE of recombinant serpins. The proteins were produced in a bacterial expression system and were purified through their N-terminal His tags on Ni-NTA columns. After purification, bands of expected size were detected on the gel, with the exception of the KRAL isoform that is cleaved under native purification conditions. Lanes marked i are extracts of bacterial cultures induced with 0.4 mM isopropyl-1-thio-β-D-galactopyranoside, whereas p are purified recombinant serpin isoforms. B, SRPN10 protease inhibition assay. The indicated proteases (10 pmol) were incubated for 5 min with an equimolar (10 pmol) or 10-fold higher concentration (100 pmol) of purified serpin. Proteases were trypsin (TRYP), thrombin (THRMB), chymotrypsin (CHY), porcine pancreatic elastase (PPE), kallikrein (KAL), human plasmin (PLAS), proteinase K (ProtK), and subtilisin Carlsberg. Inhibition is reported as percent reduction of the rate of cleavage of a protease-specific chromogenic substrate (see "Experimental Procedures"). Inhibition rates are relative to the non-inhibitory serpin ovalbumin and reported values are averages of two independent experiments (different serpin preparations have been used). In each experiment and for each SRPN10 variant/protease combination, duplicate assays were performed. Therefore, each mean and standard deviation was calculated from four data sets. S.D. < 5%. Inhibition rates of ≤15% were treated as insignificant (--). Asterisks denote that the KRAL isoform showed no inhibitory activity, apparently due to proteolytic cleavage of the RSL during the purification process.
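The way inhibition values of the kind shown in Fig. 2B are derived can be sketched as follows: a percent reduction in substrate cleavage rate relative to the ovalbumin control, averaged over four data sets and compared with the 15% cutoff. The rate values and function names are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not specify which estimator was used.

```python
from statistics import mean, stdev

# Inhibition at or below this percentage is treated as insignificant (Fig. 2 legend).
INSIGNIFICANT_AT_OR_BELOW = 15.0

def percent_inhibition(rate_with_serpin: float, rate_with_ovalbumin: float) -> float:
    """Percent reduction in cleavage rate relative to the non-inhibitory control."""
    if rate_with_ovalbumin <= 0:
        raise ValueError("control rate must be positive")
    return 100.0 * (1.0 - rate_with_serpin / rate_with_ovalbumin)

def summarize(inhibitions):
    """Return (mean, sample SD, significant?) for a set of replicate values."""
    avg = mean(inhibitions)
    sd = stdev(inhibitions)  # sample SD: an assumption of this sketch
    return avg, sd, avg > INSIGNIFICANT_AT_OR_BELOW

if __name__ == "__main__":
    # Hypothetical rates (absorbance change per minute) for four assays.
    control_rate = 1.0
    serpin_rates = [0.20, 0.18, 0.22, 0.20]
    values = [percent_inhibition(r, control_rate) for r in serpin_rates]
    print(summarize(values))
```

Four values per combination arise from two independent experiments with duplicate assays each, as stated in the figure legend.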
FIG. 3. SRPN10 developmental profile and localization. A, tissue expression profiles of SRPN10 transcripts. Total RNA was extracted from dissected midgut-free abdomens (ab), midguts (gt), and thoraces (tx) of adult female mosquitoes and subjected to RT-PCR. An internal control corresponding to the ribosomal gene S7 was used for normalization. The abundance of serpin transcripts was assayed both with the general primers (upper panel) and with isoform-specific primers (lower panels). The numbers to the right of the panels report the number of amplification cycles used in each experiment. B and C, SRPN10 developmental protein profile. Samples derived from different developmental stages were immunoblotted using the serpin antiserum (α-SRPN10). Boiling in SDS loading buffer for 5 min is not sufficient to dissociate the high molecular weight serpin-protease complexes (panel B). Treatment with 8 M urea succeeds in dissociating the inhibitor-protease complexes (panel C). EE, early embryo 18 h; LE, late embryo 42 h; L1 and L4, first and fourth instar larval stages; EP, early pupae; LP, late pupae; M, male adult mosquitoes; F, female adult mosquitoes; BF, female adult mosquitoes 24 h after blood feeding; 5.1*, Sua 5.1* mosquito cell line. Equal amounts of total protein were loaded, as calculated using a Bradford assay (Bio-Rad) prior to treatment with SDS or urea. D-I, immunolocalization by confocal microscopy. SRPN10 is stained in red in all the panels. Nuclei are green in panels D-G, whereas the serine protease Sp22D is blue in panel E and green in panel H. D, a group of hemocytes (hc) is attached to a trachea, with one expressing SRPN10. E, two hemocytes attached to the fat body (fb, note the characteristic lipid inclusions of this tissue), one with nucleocytoplasmic SRPN10 (red) staining, the other with cytoplasmic Sp22D (blue) staining. F, most of the hemocyte-like cells of line Sua 5.1* show nucleocytoplasmic SRPN10 staining. G, low magnification view of pericardial cells with strong SRPN10 staining, which is absent from fat body cells (fb). H, magnified view of a binucleated pericardial cell showing nucleocytoplasmic serpin staining (arrowheads indicate the nuclei) and patchy Sp22D staining (absent in the nuclei). I, in the A. gambiae midgut, SRPN10 serpins are mostly located in the nucleus of the epithelial cells. In panels D and E the transmitted light channel (differential interference contrast filter) was combined with the fluorescent channels of the confocal microscope. Scale bars = 10 μm.
Expression and Localization of SRPN10 Isoforms-The distribution of the SRPN10-derived transcripts in adult tissues was first monitored by RT-PCR (Fig. 3A), using as template RNAs extracted from dissected adult thoraces, midguts and gut-free abdomens, and a primer pair annealing to the sequences common to all isoforms, or a combination of common and isoform-specific primers. Amplified products were separated on agarose gels and visualized with a fluorescence imager after SYBR green staining. Amplification of ribosomal protein S7 transcripts served as an internal standard for sample normalization. Serpin expression is highest in dissected midguts (gt), with weaker expression levels in the thorax (tx) and the gut-free abdomen (ab) (Fig. 3A). This is true both for all isoforms together and for three individual isoforms. The exception is KRAL, which is enriched both in the gut and in the gut-free abdomen. In the adult, RCM is the most abundant of the four isoforms: its transcripts are readily detected with 26 amplification cycles, whereas the other isoforms need four additional cycles to reach comparable amplification levels. Similarly, in the hemocyte-like mosquito cell-line Sua 5.1*, the RCM transcripts are significantly more abundant than the other isoform transcripts (data not shown).
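As a rough guide to what a four-cycle gap between RCM and the other isoforms could mean, the following sketch converts a difference in amplification cycles into an approximate fold difference in starting template. It assumes idealized exponential amplification with a constant per-cycle efficiency, which end-point RT-PCR only approximates; the function name `fold_difference` and the `efficiency` parameter are illustrative and are not part of the original analysis.

```python
def fold_difference(cycle_gap: float, efficiency: float = 1.0) -> float:
    """Approximate fold difference in starting template for a gap in cycle number.

    efficiency = 1.0 means perfect doubling in every cycle (an idealized assumption).
    """
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    return (1.0 + efficiency) ** cycle_gap


# RCM is detected at 26 cycles; the other isoforms need four more (30 cycles).
print(fold_difference(30 - 26))            # 16.0 with perfect doubling
print(round(fold_difference(4, 0.9), 1))   # 13.0 at an assumed 90% efficiency
```

Under these assumptions the four-cycle gap corresponds to roughly an order of magnitude more RCM template, but the figure should be read as an estimate rather than a measured ratio.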
Based on previous experience with the gut-free abdominal fraction, the RT-PCR data suggested that SRPN10 serpin is produced in hemocytes as well as the midgut. To determine the developmental profile as well as the tissue distribution of total SRPN10 serpin, a polyclonal antiserum was raised against a recombinant protein in which amino acid sequences encompassing the common serpin backbone were fused to the His tag.
To determine the developmental profile of serpin levels, total protein was extracted from different stages of the A. gambiae G3 strain, and the total protein content of each sample was equalized on the basis of Bradford assays. The samples were then either treated with SDS loading buffer and boiled for 5 min (Fig. 3B) or boiled for 15 min in 8 M urea to promote the complete dissociation and denaturation of serpins and serpin-protease complexes (Fig. 3C) and were then immunoblotted with the serpin antiserum. Serpins are known to bind to their target proteases, forming very stable complexes resistant to SDS denaturation. The predicted molecular masses of the four intact isoforms range from 40 to 42 kDa, and the expected size of each isoform cleaved by its target protease is ~37 kDa. An immunoreactive band in this size range was detected in protein extracts treated with 8 M urea (Fig. 3C). Additional bands of higher molecular size were detected in the same extracts in the absence of urea treatment and probably represent serpin-protease complexes as well as uncleaved serpins (Fig. 3B). SRPN10 serpins are nearly undetectable in early embryos (EE). Their levels increased in late embryos and the first larval stage (LE and L1) and peaked at the fourth larval stage (L4), thereafter declining in the early and late pupal stages (EP and LP). In the adults, 1-day-old sugar-fed females (F) showed higher serpin content than males of the same age (M), and a further increase in serpin levels was recorded in females 24 h after a blood meal (BF).
The antiserum was then used for whole-mount stainings of dissected adult tissues. High serpin levels were detected in selected hemocytes attached to tracheae (Fig. 3D) or to the fat body (Fig. 3E) but not in the fat body itself (Fig. 3E, fb). Consistent with the RT-PCR data, serpins were also localized at high levels in the midgut cells, showing a predominantly nuclear localization (Fig. 3I). An additional class of cells that stain strongly with the antibody are the scavenger (detoxifying) pericardial cells (Fig. 3, panels G and H). In all three cell types serpin was detected in the nucleus. This is better shown in Fig. 3, panels E and H, where an antibody against the serine protease Sp22D (29), which is also expressed in hemocytes and pericardial cells, was used in addition to anti-serpin and anti-histone antibodies. In the pericardial cells serpins were detected in the nucleus (Fig. 3G and arrowheads in Fig. 3H) and distributed throughout the cytoplasm, whereas Sp22D is absent from the nucleus and shows a granular cytoplasmic localization (Fig. 3H). In hemocytes Sp22D is restricted to a narrow subpopulation, whereas serpins are present in a broader set of blood cells, only partially overlapping with those that express Sp22D (data not shown). Similarly, in the hemocyte-like cell line Sua 5.1* almost the entire cell population (90%) showed nucleocytoplasmic serpin staining (Fig. 3F), whereas Sp22D was present in secretory vesicles in only 5% of the cells, as previously reported (29).
Ookinete Midgut Invasion Enhances Transcription from the SRPN10 Locus-Because SRPN10 serpins are expressed in the midgut and hemocytes of the mosquito, tissues that both have key roles in insect defense against pathogens, we wanted to investigate whether the expression levels of SRPN10 serpins are affected by challenge of mosquitoes with bacteria and Plasmodium parasites. Serpins regulating the humoral response pathways in insects (for example the Drosophila Spn43Ac) are expected to be secreted in the hemolymph (16). However, aspects of insect defense are cell-mediated, and in M. sexta an intracellular hemocyte-specific serpin was shown to be induced upon bacterial challenge (18).
After pricking with a mixture of heat-inactivated Gram-positive and Gram-negative bacteria, mosquito females were dissected and the levels of serpin transcripts were analyzed by RT-PCR using as template total RNA isolated at successive time points after bacterial challenge (Fig. 4A). Although transcriptional up-regulation of the antimicrobial gene defensin was evident 12 h after pricking, the bacterial challenge had no substantial effect either on the total level of all SRPN10 serpin transcripts combined or on the level of each serpin variant, as monitored separately by using pairs of common and isoform-specific primers (quantified in Fig. 4B). Only the KRAL splice variant showed a modest and transient up-regulation after bacterial challenge (Fig. 4B). The same type of analysis was applied to female mosquitoes fed on P. berghei-infected mice. Although no apparent differences in serpin transcript levels were detected in mosquitoes 18 and 20 h following an infected blood meal, a remarkable induction was visible at 24 h and persisted at 48 h (Fig. 5A). This response coincides with ookinete invasion and is midgut-specific, because no comparable induction was evident in the gut-free carcasses of the same infected mosquitoes. The results were confirmed by blot analysis of total RNA extracted from naïve or infected midguts, dissected 23 h after the blood meal (Fig. 5B).
To check whether the levels of all four isoforms are equally affected during ookinete invasion, pairs of isoform-specific primers were used in RT-PCR analysis to amplify isoform-specific transcripts (Fig. 5C). Although the levels of FCM and CAM transcripts in mosquitoes fed on infected mice (black bars) did not diverge significantly from the levels present in the RNA of control blood-fed mosquitoes (gray bars), the transcript levels for RCM and especially KRAL were markedly enriched 24 h after the infective blood meal and remained higher than the control levels even at 48 h.
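The quantification behind this kind of comparison (normalization of each band to S7, then expression as fold induction relative to a reference sample, averaged over replicate experiments with S.E.) can be sketched as follows. This is a minimal illustration, not the authors' analysis pipeline: the function name and all band intensities are hypothetical values chosen only to show the arithmetic.

```python
from math import sqrt
from statistics import mean, stdev


def fold_induction(target, s7, ref_target, ref_s7):
    """Return (mean, S.E.) of fold induction across replicate experiments.

    Each argument is a list of band intensities, one value per replicate:
    target / s7         -- serpin and S7 bands of the sample of interest
    ref_target / ref_s7 -- the same bands for the reference sample
    """
    if not (len(target) == len(s7) == len(ref_target) == len(ref_s7)):
        raise ValueError("all inputs need one value per replicate")
    if len(target) < 2:
        raise ValueError("at least two replicates are needed for an S.E.")
    ratios = []
    for t, s, rt, rs in zip(target, s7, ref_target, ref_s7):
        normalized = t / s      # serpin signal relative to the S7 control
        reference = rt / rs     # same ratio in the reference sample
        ratios.append(normalized / reference)
    return mean(ratios), stdev(ratios) / sqrt(len(ratios))


# Hypothetical intensities from three replicate experiments (arbitrary units).
avg, se = fold_induction(
    target=[240, 300, 360], s7=[100, 100, 100],
    ref_target=[100, 100, 100], ref_s7=[100, 100, 100],
)
print(round(avg, 2), round(se, 2))  # 3.0 0.35
```

Computing the ratio within each replicate before averaging keeps gel-to-gel variation in overall signal from inflating the error bars.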
To exclude the possibility that this enrichment was due to the presence of the parasite, we used KRAL- and RCM-specific primers in an attempt to amplify any putative contaminating band from genomic DNA or from cDNA derived from in vitro cultured ookinetes (data not shown). No signal was recorded, confirming that the RT-PCR results demonstrated an increase in the expression levels of the RCM and KRAL isoforms in the infected midguts.
DISCUSSION
The present work reports the cloning and characterization of an A. gambiae serpin gene (SRPN10) that is transcriptionally regulated during ookinete midgut invasion. Four isoforms were derived from this gene by alternative splicing of exons encoding distinct reactive site loops. This kind of genomic architecture permits multiplication of the functionality of the gene by increasing the number of target-specific bait regions and resembles the organization found in some other insect serpin genes, such as the M. sexta serpin-1 (12 splice variants (12)), the B. mori serpin-1 (2 splice variants (21)), and the Drosophila serpin Sp-4 (4 alternatively spliced RSLs (40)).
Partial clones encoding serpin-like sequences were obtained recently in gene discovery projects aimed at identifying immune-responsive molecules in A. gambiae. Analysis of expressed sequence tags derived from a hemocyte-like cell line library revealed four clone clusters encoding putative serpins (41). One of these clusters, I10, corresponds to SRPN10 serpins. Similarly, a differential display search for immune-responsive genes in the adult females of A. gambiae identified a fragment (AF203339) with predicted sequence homology to inhibitory serpins (42). Interestingly, in the salivary glands of the mosquito vector Aedes aegypti, a secreted 48-kDa serpin was identified that possesses hemostatic activity and is assumed to inhibit the clotting Factor Xa during mosquito blood feeding (43).
FIG. 5. SRPN10 up-regulation during ookinete invasion. A and B, midgut-specific SRPN10 induction after ookinete invasion. A, RT-PCR analysis of common backbone SRPN10 transcripts assayed in midgut and carcass 18, 20, 24, and 48 h after an infective (i) or noninfective (c) blood meal. B, RNA blot analysis of SRPN10 RNAs in dissected midguts of infected (i) or control (c) female mosquitoes 23 h after the blood meal. The ribosomal protein gene S7 was used as loading control in both experiments. C, differential mRNA abundance of SRPN10 splice variants after ookinete midgut invasion. RT-PCR of midgut transcripts assayed 18, 20, 24, and 48 h after an infective (i, black bars) or non-infective blood meal (c, gray bars) with isoform-specific primers. Amplification products were resolved on agarose or polyacrylamide gels and stained with SYBR green dye. After normalization to S7, the intensity of each band was measured with a fluorescence imager. The bars indicate the average -fold induction of serpin transcripts relative to a non-infective blood meal 18 h after feeding (triplicate experiments). Error bars indicate S.E. In the lower panels photographs of the RT-PCR products of a representative experiment are shown.
We were able to show experimentally that at least three of the four SRPN10 serpin variants are functional inhibitors of serine proteases. This is consistent with features they share with inhibitory serpins, which can generally be recognized by a consensus pattern of residues in the hinge region: in inhibitory serpins, P15 is usually glycine, P14 is threonine or serine, and residues with short side chains, such as alanine, glycine, or serine, usually occupy positions P12-P9. These residues are essential for inhibitory activity, because they permit a rapid insertion of the RSL into the A β-sheet, facilitating the conformational change that is necessary for the inhibitory activity of the serpin to be manifested (23).
In SRPN10 serpins these essential residues are conserved (shaded gray in Fig. 1D), suggesting the potential for inhibitory activity, which in fact was demonstrated for all three variants that could be tested in vitro. We were unable to test the KRAL serpin isoform, because it is proteolytically cleaved, presumably by endogenous bacterial proteases, during the production and purification procedure. Consistent with the distinct composition of the RSLs, we also showed that the different isoforms exhibit specific protease inhibition spectra in vitro. These biochemical assays did not aim to identify the physiological target(s) of SRPN10 serpins but rather to establish their inhibitory potential. However, using a specific anti-serpin antibody we detected high molecular mass complexes in immunoblots, which are dissociated under harsh denaturing conditions, suggesting that SRPN10 serpins are also associated with target proteases in vivo.
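The hinge-region consensus for inhibitory serpins described above (P15 glycine, P14 threonine or serine, short side chains at P12-P9) lends itself to a simple sequence check. The sketch below is illustrative only: the function name is hypothetical, the caller must supply the position of the P1 residue, and the example peptide is an invented toy sequence, not an SRPN10 sequence.

```python
SHORT_SIDE_CHAINS = set("AGS")  # alanine, glycine, serine


def hinge_matches_consensus(seq: str, p1_index: int) -> bool:
    """Check the hinge consensus: P15 = G, P14 = T or S, P12-P9 short side chains.

    seq is a one-letter amino acid sequence and p1_index is the 0-based
    position of the P1 residue. Pn lies n - 1 residues N-terminal to P1.
    """
    if p1_index - 14 < 0 or p1_index >= len(seq):
        raise ValueError("sequence must contain P15 through P1")

    def residue(p: int) -> str:
        return seq[p1_index - (p - 1)]

    return (
        residue(15) == "G"
        and residue(14) in ("T", "S")
        and all(residue(p) in SHORT_SIDE_CHAINS for p in range(9, 13))
    )


# Invented toy peptide spanning P15..P1 (P1 = R at index 14); not SRPN10.
print(hinge_matches_consensus("GTEAAASVIAVPMLR", 14))  # True
print(hinge_matches_consensus("ATEAAASVIAVPMLR", 14))  # False: P15 is not glycine
```

A match only indicates that the hinge is compatible with inhibitory activity; as in the text, actual inhibition still has to be demonstrated biochemically.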
The similarity of SRPN10 serpins to mammalian ov-serpins, both in the whole protein sequence and in some cases in the RSL amino acid composition, leads to intriguing speculations as to their physiological role. Ov-serpins reside in the cytosol and/or in the nucleus of protease-secreting cells, including cytotoxic lymphocytes, monocytes, and epithelial and endothelial cells (44). The physiological role of ov-serpins is still emerging, but for many members of the family a cytoprotective role is envisaged and thought to be exerted through the modulation of pro-inflammatory and pro-apoptotic proteases (24).
Midgut epithelial cells invaded by ookinetes show features indicative of apoptosis, such as loss of cell contacts, genomic DNA fragmentation, and sometimes caspase activation (45,46). In this context, up-regulation of the inhibitory SRPN10 serpin gene may reflect the activation of anti-apoptotic or cytoprotective mechanisms during ookinete invasion.
Alternatively, the inhibitory activity of the CAM isoform against two distinct bacterial subtilisin-like proteases may support another working hypothesis, according to which SRPN10 serpin isoforms may inhibit ookinete-derived proteases. It is known that ookinetes secrete Sub2, a subtilisin-like protease, during the midgut invasion process (45). Because a pivotal role in red blood cell invasion is predicted for subtilisins secreted by the merozoite stages (47), a similar role of Sub2 during ookinete midgut invasion is an intriguing hypothesis. Provided that functional, purified enzyme becomes available, it would be interesting to test the potential effect of all four SRPN10 serpins toward Sub2. Production and testing of the KRAL isoform would be of special interest, because it is strongly up-regulated during midgut invasion.
In agreement with the lack of an obvious signal peptide, we demonstrated by immunofluorescence that Anopheles SRPN10 serpins have an intracellular nucleocytoplasmic localization, principally in midgut cells, scavenger pericardial cells, and hemocytes; midgut cells and hemocytes are well known to mediate epithelial and cellular immune responses, respectively, in the mosquito. Bacterial challenge elicits only marginal up-regulation of serpin transcripts (particularly of the KRAL isoform), in contrast to the immune-responsive Spn43Ac Drosophila serpin (16,17) and to the hemocyte-specific Manduca serpin-2 (18).
A remarkable property of SRPN10 is that two of its isoforms are transcriptionally up-regulated upon ookinete midgut invasion but respond only marginally to bacterial challenge. This differentiates SRPN10 serpins from other described markers, such as nitric-oxide synthase, defensin, and Gram-negative bacteria-binding protein, which are transcriptionally regulated by both ookinete invasion and bacterial challenge (2,5,48,49).
Marker genes that are specifically regulated during ookinete invasion are particularly valuable as tools to dissect and compare the physiological responses triggered in the vector when infected with different Plasmodium species or strains. A fragment encoding a gene with sequence similarity to α2-macroglobulin was shown to respond strongly to malaria parasite infection and not to bacteria (42). Several genes that are differentially regulated and may be involved in the defense reaction of the A. gambiae midgut toward P. falciparum have been recently isolated by differential display (50). Curiously, SRPN10 serpins were not among them. This might be due to the different combination of experimental organisms (A. gambiae and P. falciparum) or to the low midgut infection rates in that study (not exceeding 15 oocysts per infected midgut). In our experimental combination (A. gambiae and P. berghei), high infection rates were achieved. Additional studies are necessary to distinguish whether SRPN10 serpin up-regulation takes place only in cells that are invaded by the parasite or is a general response of the midgut after heavy infection. Our highly specific α-SRPN10 antibody is a very valuable tool for such studies.
Only two of the serpin variants (the ones that utilize the most upstream alternative exons) are enriched during ookinete midgut invasion. Combined with the results of primer extension experiments that indicate lack of alternative promoters, these observations point to regulation at a step other than transcriptional initiation. The step in question may affect transcriptional termination, splicing, or relative stability of the mRNAs. The existence of distinct 3′-untranslated regions for each splice variant might be associated with differences in transcriptional termination or mRNA stability. Alternatively, ookinete invasion could result in preferential splicing of KRAL- and RCM-serpin variants: cells penetrated by the parasite show apoptotic phenotypes (45,46), and it has been reported that cell death can profoundly affect the splicing machinery, favoring maturation of distinct gene products through specific responsive elements present in their introns (51,52).
Understanding the complex regulation of SRPN10 serpin expression requires systematic studies, starting with the functional dissection of the promoter region, which is characterized by the presence of multiple regulatory elements, and proceeding with analysis of pre-mRNA transcription and in vivo processing and stability. In addition, the effects of purified SRPN10 serpin isoforms on in vitro cultured ookinetes, or ookinete invasion in transgenic midguts overexpressing or inhibiting specific isoforms, will be necessary to clarify whether SRPN10 promotes or inhibits ookinete invasion.