A Pattern Recognition Protein for Peptidoglycan

Peptidoglycan recognition protein (PGRP) specifically binds to peptidoglycan and is considered to be one of the pattern recognition proteins in the innate immunity of insects. PGRP is an essential component for peptidoglycan to trigger the prophenoloxidase cascade, which is now recognized to be an important insect defense mechanism. We cloned cDNA encoding PGRP from a silkworm fat body cDNA library. Northern blot analysis showed that the PGRP gene is constitutively expressed in the fat body, epithelial cells, and hemocytes of naive silkworms. Furthermore, a bacterial challenge intensified the gene expression, with the maximal period being from 6 to 36 h after infection. The upstream sequence of the cloned PGRP gene was shown to contain putative cis-regulatory elements similar to the NF-κB-like element, interferon-response half-element, and GATA motif element, which have been found in the promoters of the acute phase protein genes of mammals and insects. A homology search revealed that homologs of silkworm PGRP are present in mice, nematodes, and bacteriophages. This suggests that the recognition of peptidoglycan as foreign is effected in both vertebrates and invertebrates by PGRP homologs with an evolutionarily common origin.

Innate immunity plays vital roles in primary defense mechanisms against invading pathogens in both vertebrates and invertebrates. Structures that are common among pathogens are recognized as non-self by the innate immune system. This type of recognition is termed pattern recognition, as opposed to clonal recognition in which clonally selected immunoglobulins are employed (1). Lipopolysaccharide (LPS) and peptidoglycan (PG) are cell wall components of bacteria, and β-1,3-glucan (βG) is a cell wall component of fungi. They are recognized by pattern recognition proteins that are present in plasma as free-floating molecules or on the cell surface as receptors.
Among the pattern recognition proteins in mammals, a humoral LPS-binding protein and the cellular receptor CD14 have been well characterized (2-5). Their roles in stimulating macrophages to produce cytokines are well understood. In contrast, pattern recognition proteins for PG and βG have not been characterized as thoroughly as those for LPS, although CD14 is also implicated in the recognition of PG as non-self (6,7). In insects, a number of proteins that could be pattern recognition proteins have been described. They are lectins (8-11), hemolin (12,13), lipopolysaccharide-binding protein (14,15), Gram-negative bacteria-binding protein (16), peptidoglycan recognition protein (PGRP) (17), and β-1,3-glucan recognition protein (βGRP) (18,19). The latter two proteins are members of the prophenoloxidase cascade, of which other members are known to be serine protease zymogens and prophenoloxidase (20). They have been shown to have specific affinity to PG or βG and to work at initiation points of the cascade. The prophenoloxidase cascade is now recognized to be one of the major insect defense mechanisms and possibly to play vital roles in interrelating those mechanisms and in recognizing microbes as foreign. Thus, it is probable that PGRP and βGRP of the prophenoloxidase cascade are employed generally in insects as pattern recognition molecules for PG and βG. We previously reported the purification and the characterization of silkworm PGRP and βGRP (17,18). Pattern recognition proteins other than PGRP and βGRP are not known to be distributed among insects as widely as PGRP and βGRP.
It is not clear whether pattern recognition proteins with the same specificity in mammals and in insects have any common structural similarity. However, parallels between cellular signaling pathways for the synthesis of mammalian acute phase proteins and insect immune proteins after a bacterial challenge have been revealed (21). This indicates a common origin of the pathways in innate immunity in mammals and insects.
Here, we report the cloning of PGRP cDNA from the fat body cDNA library and of the PGRP gene from the genomic library of the silkworm Bombyx mori. The homology search showed that PGRP is homologous to bacteriophage T7 lysozyme, although it does not have lysozyme activity, and that proteins homologous to PGRP are expressed in mammals. This experimental evidence suggests that recognition proteins for the initial extracellular non-self recognition in innate immunity of vertebrates and invertebrates have developed from a common origin. Our experimental results also show that PGRP synthesis is induced by a bacterial challenge and suggest that expression of the PGRP gene is regulated by the Rel family of transcription factors.

EXPERIMENTAL PROCEDURES
Insect and Bacterial Challenge-Silkworms, B. mori (strain Kinshu × Showa), were reared on an artificial diet as described previously (22). The larvae on day 5 of the fifth instar were injected with 10 μl of late logarithmic phase Enterobacter cloacae (JCM1232) suspension (A600 = 0.1) in the physiological saline (10 mM bis-Tris propane buffer, pH 6.5, containing 150 mM NaCl) or with 10 μl of the saline as the control experiment.
Purification and Amino Acid Sequencing of PGRP-PGRP was purified as described by Yoshida et al. (17). S-Pyridylethylated or intact PGRP was digested with trypsin at a molar ratio of enzyme to substrate of 1:50. The digestion was carried out in 0.1 M Tris-HCl, pH 6.5, at 37°C for 24 h, and the resulting peptides were separated by high performance liquid chromatography on a C8 column (4.6 × 150 mm, Vydac). The isolated peptides were sequenced using an automatic protein sequencer (Shimadzu PSQQ-10). The sequences of the peptides derived from S-pyridylethylated or intact PGRP were compared to determine the disulfide bond locations.
Cloning and Sequencing of PGRP cDNA-A silkworm fat body cDNA library was constructed in the vector λZAP (Stratagene). Using internal peptide sequences, degenerate oligonucleotides corresponding to KKQWDG and WPEWLE were synthesized. Their sequences were 5′-AAGAATTCAA(A/G)AA(A/G)CA(A/G)TGGGA(C/T)GG-3′ (sense primer) and 5′-AAGAATTCTC(A/C/G/T)A(A/G)CCA(C/T)TC(A/C/G/T)GGCCA-3′ (antisense primer), respectively. These primers were used for the polymerase chain reaction. The reaction was performed using the silkworm fat body cDNA library as a template under the following conditions: 35 cycles comprising 94°C for 1 min, 60°C for 2 min, and 72°C for 3 min. A 480-bp fragment was amplified, subcloned into the plasmid vector pBluescript (Stratagene), and sequenced using an automatic DNA sequencer (PE Applied Biosystems, model 377). The cloned polymerase chain reaction product labeled with [α-32P]dCTP was used to probe the λZAP fat body cDNA library. Hybridization was carried out at 42°C for 16 h in 2× PIPES buffer (0.8 M NaCl, 40 mM PIPES, pH 6.5), 50% formamide, 0.5% SDS, and 100 μg/ml denatured salmon sperm DNA. The membrane was washed twice in 0.1× SSC (15 mM NaCl, 1.5 mM sodium citrate, pH 7.2) at 53°C for 15 min and subjected to autoradiography. Three positive clones were obtained and sequenced. The clone that had the longest insert and a complete open reading frame was used as the PGRP cDNA clone.
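The degenerate primers described in this subsection can be checked with a short script. The following is a minimal Python sketch, not the authors' procedure: it expands the bracketed alternatives in the two primer strings quoted in the text, counts the oligonucleotides in each pool, and translates each variant with the standard genetic code after removing the shared 8-nucleotide 5′ extension (AAGAATTC, which contains an EcoRI recognition sequence). The function names `expand`, `revcomp`, and `translate` are illustrative assumptions.

```python
import itertools
import re

# Standard genetic code, built from the canonical TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

def expand(primer):
    """Expand '(A/G)'-style alternatives into all concrete sequences."""
    parts = re.findall(r"\(([ACGT/]+)\)|([ACGT]+)", primer)
    choices = [alt.split("/") if alt else [fixed] for alt, fixed in parts]
    return ["".join(combo) for combo in itertools.product(*choices)]

def revcomp(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def translate(seq):
    """Translate complete codons only; a trailing 1-2 nt are ignored."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))

TAIL = "AAGAATTC"  # 5' extension shared by both primers
SENSE = "AAGAATTCAA(A/G)AA(A/G)CA(A/G)TGGGA(C/T)GG"
ANTISENSE = "AAGAATTCTC(A/C/G/T)A(A/G)CCA(C/T)TC(A/C/G/T)GGCCA"

sense_pool = expand(SENSE)
antisense_pool = expand(ANTISENSE)

# The sense primer is read directly; the antisense primer is read
# on the coding strand, i.e. as its reverse complement.
sense_peptides = {translate(p[len(TAIL):]) for p in sense_pool}
antisense_peptides = {translate(revcomp(p[len(TAIL):])) for p in antisense_pool}

print(len(sense_pool), sorted(sense_peptides))
print(len(antisense_pool), sorted(antisense_peptides))
```

Each primer ends two nucleotides into the last codon of its peptide, so only the first five residues (KKQWD and WPEWL) are translated. Under the standard code, the antisense pool covers all six leucine codons and, as a side effect of placing the degeneracy at two positions of that codon, two phenylalanine codons as well.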
Northern Blot Analysis-In the detection of PGRP transcript in naive silkworms, poly(A)+ RNA preparations from hemocytes, fat body, epidermal cells, midgut, silk gland, and malpighian tubules were prepared by chromatography on oligo(dT)-cellulose (Amersham Pharmacia Biotech). 5 μg of poly(A)+ RNA of each preparation were separated by electrophoresis in 1% agarose gels with 10 mM sodium phosphate, pH 7.0, transferred to a Hybond-N+ (Amersham Pharmacia Biotech) membrane, and hybridized with a 32P-labeled PGRP cDNA probe. The hybridization was performed for 16 h at 42°C in 50% formamide, 4× SSPE (1.2 M NaCl, 40 mM sodium phosphate, pH 7.4, and 4 mM EDTA), 5× Denhardt's (0.1% polyvinylpyrrolidone, 0.1% bovine serum albumin, and 0.1% Ficoll), 0.1% SDS, and 100 μg/ml denatured salmon sperm DNA. The membrane was washed once in 2× SSPE containing 0.1% SDS and subsequently twice in 1× SSPE containing 0.1% SDS at 56°C for 10 min. In the experiments to demonstrate the induction of the PGRP gene in the fat body, total RNA from the fat body of silkworms was collected at intervals after the bacterial challenge. 20-μg aliquots of the RNA preparations were subjected to Northern blot analyses as above except that silkworm cecropin B (0.4 kbp) (23) and α-tubulin (1.3 kbp) probes were used in addition to the PGRP probe. The silkworm cecropin B (0.4 kbp) and α-tubulin (1.3 kbp) probes were obtained as polymerase chain reaction products, and the identities of their sequences to those previously reported were confirmed by referring to GenBank™ accession numbers S60579 and X83429, respectively.
Southern Blot Analysis-Silkworm genomic DNA was isolated from the silk gland using DNAzol (Life Technologies, Inc.), and 10 μg of the DNA were digested with EcoRI, BamHI, or SacI. DNA fragments in the digests were separated by electrophoresis on a 0.8% agarose gel, blotted onto Hybond-N+ membrane, and hybridized with the 32P-labeled EcoRI/KpnI-digested cDNA. The digested cDNA probe (0.77 kbp) was the entire insert of the PGRP cDNA clone with short flanking sequences. The conditions for the hybridization and the washings were the same as those in Northern blot analysis.
Analysis of PGRP Gene Structure-2 × 10⁵ plaques of an amplified B. mori genomic library constructed in λFIX (Stratagene) were screened with the 32P-labeled PGRP cDNA probe. Three positive plaques were isolated, and the DNA was digested with BamHI or SacI. The 32P-labeled PGRP cDNA probe hybridized with 2.5- and 4.3-kbp DNA fragments of the BamHI digest and with a 2.1-kbp fragment of the SacI digest. All the fragments were separately subcloned into pBluescript. After deletion clones of the subclones were prepared, they were sequenced. Computer analysis of the sequencing data was performed using the GENETYX system (Software Development Co., Ltd., Tokyo).

RESULTS
Nucleotide Sequence of PGRP cDNA and the Deduced Amino Acid Sequence-PGRP cDNA clones were obtained by screening a fat body cDNA library of the silkworm B. mori. The nucleotide sequence and the deduced amino acid sequence are shown in Fig. 1. The open reading frame composed of nucleotides 31-618 encodes 196 amino acid residues. The first 23 amino acid residues seemed to be a signal peptide because the deduced amino acid sequence beginning from the 24th residue was identical to that observed previously at the N terminus of PGRP. Thus, the mature protein consists of 173 residues with a calculated molecular mass of 19290 Da. Of the deduced mature protein sequence, 49%, including the N terminus, was confirmed to be identical to that determined by direct sequencing of peptides obtained after trypsin digestion of PGRP (Fig. 1).
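The coordinates and residue counts quoted above are mutually consistent, which a few lines of arithmetic confirm. This is a minimal Python sketch; the variable names are illustrative, and all numbers are those given in the text.

```python
orf_start, orf_end = 31, 618   # nucleotide positions of the open reading frame
signal_len = 23                # residues in the putative signal peptide
mature_mass_da = 19290         # calculated mass of the mature protein

orf_nt = orf_end - orf_start + 1          # inclusive coordinates
assert orf_nt % 3 == 0                    # a reading frame is a whole number of codons
precursor_len = orf_nt // 3               # residues encoded by the reading frame
mature_len = precursor_len - signal_len   # residues left after signal peptide removal
mean_residue_mass = mature_mass_da / mature_len

print(orf_nt, precursor_len, mature_len, round(mean_residue_mass, 1))
```

The 588-nucleotide reading frame gives 196 codons, and removing the 23-residue signal peptide leaves the 173 residues stated for the mature protein.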
FIG. 1. Nucleotide sequence of cDNA encoding PGRP from B. mori and the deduced amino acid sequence. The amino acid sequence is numbered at right, beginning at the N terminus of the mature protein, and the nucleotide sequence is numbered at left. Underlined amino acid residues were confirmed by sequencing of the N terminus of the mature protein and of the peptide fragments that were obtained after proteolysis of the S-pyridylethylated PGRP by trypsin.
A consensus sequence for N-glycosylation was not found in the deduced sequence. A polyadenylation consensus signal was not observed downstream of the stop codon. The cysteine residues engaged in the formation of disulfide bonds were determined to be Cys2-Cys124 and Cys38-Cys44 from a comparison of the results of peptide mapping and amino acid sequencing of reduced and nonreduced PGRP.
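A search for the N-glycosylation consensus mentioned above can be sketched in Python as follows. The sketch assumes the commonly used sequon definition Asn-X-Ser/Thr with X being any residue except Pro; the text does not state which definition or software was used. The function name `find_sequons` and the demo peptide are hypothetical, and the demo peptide is not the PGRP sequence.

```python
import re

# Sequon N-X-[S/T] with X != P; the lookahead lets overlapping sites be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")

def find_sequons(protein):
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]

# Hypothetical demo peptide constructed only to exercise the function.
demo_peptide = "MNGSANPTQNAT"
print(find_sequons(demo_peptide))
```

An empty list for a deduced protein sequence would correspond to the statement that no N-glycosylation consensus was found.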
Expression of PGRP mRNA-Northern blot analysis using PGRP cDNA as a probe detected a 0.8-kbp transcript of PGRP in hemocytes, fat body, and epidermal cells but not in malpighian tubules, silk gland, and midgut of naive silkworms. This result indicates that PGRP mRNA is constitutively expressed in the fat body, epidermal cells, and hemocytes. Major sites for the synthesis of PGRP seemed to be the fat body and epidermal cells (Fig. 2A). The expression of PGRP gene and cecropin B gene in fat body was induced by injection of bacteria (E. cloacae) to silkworm larvae, as shown in Fig. 2B. The induction kinetics was similar in both genes: the induction was detected at 6 h and reached its maximum at 24 h after injection. When saline was injected instead of bacteria in the above Northern analysis, neither of the transcripts of PGRP and cecropin B genes increased in the fat body for 36 h after the injection (data not shown). These results indicated that the expression of PGRP gene is inducible by a bacterial challenge and that PGRP could be classified as an acute phase protein.
We could also detect the induction of PGRP gene expression in the fat body by the injection of Micrococcus luteus peptidoglycan.
Structure of PGRP Gene-We isolated three positive clones from the silkworm genomic library. The restriction maps of the three clones indicated that a 2.1-kbp SacI fragment overlaps both the 2.5- and 4.3-kbp BamHI fragments (Fig. 3B). Both of the BamHI fragments and the 2.1-kbp SacI fragment were subcloned separately into pBluescript. The sequences of these fragments were examined after subcloning the various restriction fragments. By comparison of the nucleotide sequence of the PGRP gene with that of the PGRP cDNA clone, the PGRP gene was revealed to contain four exons (exon I, nucleotides 1-100; exon II, 101-299; exon III, 300-418; and exon IV, 419-739 by nucleotide numbering of the PGRP cDNA clone in Fig. 1) interspersed with three introns. The 5′ end of exon I is putative because the starting point of transcription of the PGRP gene has not been studied. The 2.5-kbp BamHI fragment contained 1966 bp of 5′ upstream sequence of the initiating Met codon (Fig. 3C). The TATA-box (TATATA) is located from nucleotide −73 to −68. Several putative cis-regulatory sequences were observed within −210 bp from the TATA-box. They were a cAMP response motif (GTGACGTCAC), an NF-κB-like motif (AGGGATTTCC), a GATA motif (TGATAA), and five interferon response half-motifs (GAAANN) that have been found in promoter regions of many acute phase protein genes (24).
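A scan for the consensus motifs named above can be sketched in Python as follows. This is an illustrative sketch, not the authors' GENETYX analysis: the motif strings are taken from the text, while the `scan` and `to_regex` functions and the demo sequence are assumptions made up for the example. The demo sequence is not the PGRP promoter, and only the forward strand is searched.

```python
import re

# Consensus motifs as quoted in the text; N stands for any nucleotide.
MOTIFS = {
    "TATA box": "TATATA",
    "cAMP response motif": "GTGACGTCAC",
    "NF-kappaB-like motif": "AGGGATTTCC",
    "GATA motif": "TGATAA",
    "interferon response half-motif": "GAAANN",
}

def to_regex(motif):
    """Turn a motif with N wildcards into a lookahead regex,
    so that overlapping occurrences are all reported."""
    return re.compile("(?=(" + motif.replace("N", "[ACGT]") + "))")

def scan(sequence, motifs=MOTIFS):
    """Return {name: [(0-based start, matched text), ...]} for the forward strand."""
    sequence = sequence.upper()
    return {name: [(m.start(), m.group(1)) for m in to_regex(pattern).finditer(sequence)]
            for name, pattern in motifs.items()}

# Hypothetical demo sequence, constructed only to exercise the scanner.
demo = "CCGTGACGTCACTTAGGGATTTCCTTGATAAGGAAACTCCTATATACC"
for name, hits in scan(demo).items():
    print(name, hits)
```

Short motifs such as GAAANN occur frequently by chance, so hits from a scan of this kind are candidates only, which is why the text describes the elements as putative.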
Southern blot analyses of genomic DNA digested with EcoRI, BamHI, or SacI were performed by using the PGRP cDNA probe. As is shown in Fig. 3A, only one hybridized band was observed with the digest by EcoRI or BamHI, and the digest with SacI gave two hybridized bands (2.1 and 6.0 kbp). Because the cloned PGRP gene had one restriction site for each of BamHI and SacI, the digest with BamHI should have given two hybridization bands (2.1 kbp in addition to the observed 6.0 kbp). The reason we could not observe the 2.1-kbp fragment seems to be that the fragment contains only a short exon and the PGRP cDNA probe barely hybridized with the fragment. Thus, the results of the Southern blot analyses indicate the presence of a single copy of the PGRP gene in the silkworm genome.
Search for the Homologs of PGRP-We searched for silkworm PGRP homologs in the data base sequences by BLAST and found that an expressed sequence tag from the lymphatic filarial nematode Brugia malayi and the sequences of bacteriophage T7 lysozyme and mouse Tag7 protein are homologous to silkworm PGRP (Fig. 4). Furthermore, the Drosophila melanogaster gene encoding RNA polymerase II (M27431) was found to contain a PGRP-like sequence. The sequence is in the complementary strand of the RNA polymerase II-coding sequence and corresponds to 140 bp from the end of the registered sequence (Fig. 4).

DISCUSSION
We have previously reported the purification and characterization of the PGRP from the hemolymph of the silkworm B. mori (17). The PGRP specifically binds to peptidoglycan, and this binding leads to activation of the prophenoloxidase cascade in the plasma fraction of the silkworm hemolymph. In the present study, we cloned PGRP cDNA and the gene.
The cloned PGRP cDNA contained an open reading frame (nucleotides 31-618) encoding a protein with 196 amino acid residues. The deduced amino acid sequence of the protein had a putative 23-amino acid signal sequence and a mature protein sequence. The molecular mass (19290 Da) calculated from the primary structure of the deduced mature protein is in good agreement with the value (19 kDa) determined before by SDS-polyacrylamide gel electrophoresis of purified PGRP. The absence of an N-glycosylation consensus sequence in the deduced amino acid sequence of PGRP is consistent with our previous observation that purified PGRP did not manifest reactivity to eight commercially available lectins. We have produced recombinant PGRP in a baculovirus expression system and confirmed that the protein possesses the ability to specifically bind to peptidoglycan. This observation further supports the conclusion that our cloned cDNA encodes PGRP.
Among proteins whose sequence and function are known, bacteriophage T7 lysozyme is the only one that interacts with peptidoglycan and shows sequence similarity to PGRP. Furthermore, the amino acid residues of PGRP corresponding to the five catalytically active residues of the lysozyme (25) [...] Considering the ability of PGRP to bind to peptidoglycan, the homology is understandable. At the same time, the present results concerning the PGRP sequence did not offer any clues to answering our question: What kind of activity does the complex of PGRP and peptidoglycan have? Because binding of PGRP to peptidoglycan is known to lead to the activation of protease zymogens of the prophenoloxidase cascade, we had speculated that PGRP itself is a protease zymogen that is activated by its binding to peptidoglycan. However, the deduced PGRP sequence seemed to rule out this possibility. We now speculate that the complex of peptidoglycan and PGRP activates a protease zymogen that is a member of the prophenoloxidase cascade. In the blood coagulation system of the horseshoe crab, the factors that directly interact with elicitors such as lipopolysaccharide and β-1,3-glucan are protease zymogens (26). In this respect, the component located at the initiation point of the prophenoloxidase cascade is different from that of the blood coagulation cascade.
Our search for proteins with sequence similarity to PGRP indicated that homologs of PGRP are present in mice and nematodes. The mouse Tag7 protein has been reported to be a cytokine (27). It remains to be determined whether the silkworm PGRP has a similar biological activity. The ancestral protein of the PGRP homologs seems to have appeared before protostomia and deuterostomia diverged in evolution and to be present widely in animals. This wide distribution of PGRP suggests the possibility that insect PGRP and its homologs may generally play a role in the recognition of peptidoglycan in innate immunity. The function of PGRP in triggering the prophenoloxidase cascade may be peculiar to insects. Peptidoglycan manifests various biological activities, as mentioned in the introductions of our previous paper (17) and of papers by Dziarski and co-workers (6,7). Despite the myriad activities, the mechanism for the recognition of peptidoglycan as foreign has been poorly understood. Recently, Dziarski and co-workers showed that CD14 on murine macrophages has affinity to peptidoglycan and is involved in activation of the transcription factor NF-κB by peptidoglycan (7). It has been established that CD14 is a receptor for LPS in mammals (3). In this case, however, another protein, the LPS-binding protein, in plasma is assumed to form a complex with LPS, and the KD of CD14 for the complexed LPS becomes very low in comparison with that for noncomplexed LPS. Thus, the LPS-binding protein increases the sensitivity of cells with CD14 to LPS. It would be worth testing whether PGRP works in the same way as the LPS-binding protein.
FIG. 3. The PGRP gene. A, Southern blot analysis of B. mori genomic DNA digested with EcoRI, SacI, or BamHI. 10 μg of the digested DNA was separated on 0.8% agarose gel, blotted onto a nylon membrane, and hybridized with the 32P-labeled EcoRI/KpnI-digested PGRP cDNA. B, schematic representation of the genomic clone of PGRP. Open boxes with Roman numerals represent exons, and restriction enzyme sites are indicated by their names and arrows. C, the nucleotide sequence of the SacI-BamHI genomic fragment containing exon I of the PGRP gene. Nucleotides are numbered from the A of the translation start codon (position +1). Exon I is double-underlined, and the deduced amino acid sequence for PGRP is shown below the nucleotide sequence in single-letter code. The putative TATA box and various cis-regulatory elements are underlined with their names.
FIG. 4. Sequence alignment of silkworm PGRP with the homologs from the GenBank™ data base. The shown sequences, except B. mori PGRP, were obtained from the data base: D. melanogaster PGRP, M27431; nematode-expressed sequence tag, AA228200; mouse Tag7, X86374; T7 lysozyme, S75616. Alignments were done with the CLUSTAL V program. Gaps (indicated by hyphens) are inserted to maximize sequence alignment, and amino acid residues identical to silkworm PGRP are boxed. Percentage identities relative to silkworm PGRP are indicated.
A single copy of the PGRP gene was indicated to be present in the silkworm genome (Fig. 3A). We detected a cAMP response element, an NF-κB-like element, and five interferon response half-elements in the promoter region of the gene. The transcripts of the gene were detected in the fifth instar larvae before challenging them with Gram-negative bacteria or peptidoglycan. However, the transcription was up-regulated after challenging them with the microbes or the cell wall component (Fig. 2). The presence of cis-regulatory elements like an NF-κB element and the enhancement of transcription by cell wall components of microbes are common features of acute phase proteins of insects and mammals (28,29). In this respect, PGRP can be classified as an acute phase immune protein.
Whether up-regulated expression of the PGRP gene upon bacterial challenge has merits for silkworms and, if so, what those merits are remain to be studied.
The intracellular events leading to the induction of acute phase immune protein synthesis are now being intensely studied in both insects and mammals. Proteins belonging to the Rel family are commonly employed in the signaling pathway for the induction in different organisms (21,30,31). Toll has been shown to be a receptor for a protein named spätzle and to participate in the signaling for the activation of the Drosophila immune protein gene coding for drosomycin (32). Recently, a Toll homolog has been reported to be involved in human innate immunity as well as adaptive immunity (33). All these observations point to a common origin of the intracellular signaling pathway. Our present results indicated the presence of homologous pattern recognition proteins for peptidoglycan in both insects and mammals. The results seem to imply that the mechanism for extracellular recognition of microbes as foreign is conserved between insects and mammals, indicating a yet closer relation between innate immune systems in such phylogenetically remote organisms as insects and mammals.
Addendum-After the submission of the manuscript of the present paper, a paper reporting a cDNA for PGRP from a moth Trichoplusia ni and its mouse and human homologs was published (34).