Molecular cloning and characterization of Porphyromonas gingivalis lysine-specific gingipain. A new member of an emerging family of pathogenic bacterial cysteine proteinases.

The proteinases of Porphyromonas gingivalis are key virulence factors in the etiology and progression of periodontal disease. Previous work in our laboratories resulted in the purification of arginine- and lysine-specific cysteine proteinases, designated gingipains, that consist of several tightly associated protein subunits. Recent characterization of arginine-specific gingipain-1 (gingipain R1; RGP-1) revealed that the sequence is unique and that the protein subunits are initially translated as a polyprotein encoding a proteinase domain and multiple adhesin domains (Pavloff, N., Potempa, J., Pike, R. N., Prochazka, V., Kiefer, M. C., Travis, J., and Barr, P. J. (1995) J. Biol. Chem. 270, 1007-1010). We now show that the lysine-specific gingipain (gingipain K; KGP) is also biosynthesized as a polyprotein precursor that contains a proteinase domain that is 22% homologous to the proteinase domain of RGP-1 and multiple adhesin domains. This precursor is similarly processed at distinct sites to yield active KGP. The key catalytic residues in the proteinase domain of KGP are identical to those found in RGP-1, but there are significant differences elsewhere within this domain that likely contribute to the altered substrate specificity of KGP. Independent expression of the proteinase domain in insect cells has shown that KGP does not require the presence of the adhesin domains for correct folding to confer proteolytic activity.

The anaerobic Gram-negative rod bacterium Porphyromonas gingivalis is implicated in the initiation and progression of certain forms of periodontitis, including juvenile and adult periodontal disease (1-3). Mechanisms by which P. gingivalis evades the host defense response and elicits hard and soft tissue damage in the periodontal pocket are under investigation, with many studies being focused on the proteinases produced by the organism. To date, the combined proteolytic activities of P. gingivalis have been shown to be capable of degrading and inactivating host defense proteins (iron-binding proteins, immunoglobulins, and complement components), structural proteins (collagen, fibronectin, and fibrinogen), and plasma proteinase inhibitors (4-10). In addition, proteinase activity is associated with the ability of the organism to adhere to collagenous substrata and to hemagglutinate and lyse red cells, thus allowing the organism to remain in the periodontal pocket while meeting its nutritional requirements for both heme and peptides (11,12).
Recent work in our laboratories has resulted in the identification and purification of multiple cysteine proteinases, designated gingipains, some of which comprise several tightly associated protein subunits. The recent complete gene sequence of arginine-specific gingipain-1 (gingipain R1; RGP-1) revealed a unique proteinase that is synthesized initially as a polyprotein encoding a proteinase domain and four adhesin domains (13). The protein subunits are generated by subsequent proteolytic processing to mature RGP-1.
We have also purified a lysine-specific gingipain (gingipain K; KGP) from P. gingivalis. This enzyme is a 60-kDa cysteine proteinase that is also found associated with adhesin molecules (14,15). Amino-terminal protein sequencing of the 60-kDa proteinase subunit of KGP showed that it also is a unique proteinase. Here we present the complete DNA sequence of the kgp gene and its encoded protein. This represents the second fully characterized proteinase from an emerging family of cysteine proteinases that have structural features distinct from those of all other known families.
The sequence shows that KGP is similar to RGP-1 in its structural organization, biosynthesis, and maturational processing. We also show by expression of the KGP proteinase domain in insect cells that KGP proteinase activity is not dependent on the presence of the adhesin domains for correct folding. It is likely that the tightly regulated coexpression of proteinase and adhesion functions is present in other soluble or membrane-bound forms of gingipains that have been described and shown to be key factors in contributing to the pathogenicity of the organism.

EXPERIMENTAL PROCEDURES
Bacterial Strains-P. gingivalis strains HG66 and W50 were obtained from Dr. Roland Arnold (Emory University, Atlanta).
Oligonucleotide Synthesis-Oligonucleotide primers for PCR probes and DNA sequencing were synthesized by the phosphoramidite method on an automated DNA synthesizer (Applied Biosystems Model 394), purified by polyacrylamide gel electrophoresis, and desalted on Sep-Pak C18 cartridges (Millipore Corp.) using standard protocols. Primer MK-9-29 was designed to complement the noncoding strand of kgp DNA corresponding to the six amino-terminal residues of the mature protein (DVYTDH) (14). The sequence of this 29-base primer consisted of 17 kgp-specific bases, a 6-base EcoRI restriction site, and 6 extra bases at the 5'-end (underlined) as follows: 5'-AGATCTGAATTCGA(C/T)GT(A/C/G/T)TA(C/T)AC(A/C/G/T)GA(C/T)CA-3'. Primer MK-10-29 was designed to complement the coding strand of kgp DNA corresponding to residues 16-21 of the mature protein (MLVVAC) (14). The sequence of this 29-base primer comprised 17 kgp-specific bases, a 6-base HindIII restriction site, and 6 extra bases at the 5'-end (underlined) as follows: 5'-AGATCTAAGCTTCC(A/C/G/T)GC(A/C/G/T)AC(A/C/G/T)AC(A/C/G/T)A(A/G)CAT-3'. Primer Lys-1-33 (5'-CATACGAACCGGCGTATTATACAAGTCGCCATG-3') was designed to complement the noncoding strand of kgp DNA corresponding to amino-terminal residues 6-16 of the mature protein (HGDLYNTPVRM) and was designed on the basis of partial sequence information on kgp (nucleotides 1351-1383; see Fig. 1A). This primer was used as a probe to screen a DASH P. gingivalis genomic DNA library (see below). An additional oligonucleotide primer, Lys-1-35, was used as a probe to identify and clone the PstI/Asp718 3'-end fragment (see Fig. 1B) of the kgp gene from genomic DNA. Primer Lys-1-35 was designed to complement the noncoding strand of kgp DNA corresponding to 27 bases specific for the 3'-end of the kgp gene (5'-TTCTACCGTAACGTCTTTACATACCTT-3', nucleotides 3445-3471). Finally, primer HGP27-1 was used as a probe to identify and clone the HindIII/HindIII 3'-end fragment (see Fig. 1B) of the kgp gene from genomic DNA. Primer HGP27-1 was designed to complement the noncoding strand of kgp DNA corresponding to 30 bases specific for the 3'-end of the kgp gene (5'-GTAACCCGTATTGTCTCCCCATACGTTGTC-3', nucleotides 2893-2922).
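The primer descriptions above can be checked with a few lines of code. The following is a minimal sketch, not part of the original methods: it counts the length and degeneracy of the two mixed primers and translates the reverse complement of Lys-1-33 with the standard genetic code. The helper names (`primer_length`, `degeneracy`, `reverse_complement`, `translate`) are hypothetical and were introduced only for this illustration.

```python
import re

# Primer sequences as printed in the text; "(X/Y)" marks a degenerate position.
MK_9_29 = "AGATCTGAATTCGA(C/T)GT(A/C/G/T)TA(C/T)AC(A/C/G/T)GA(C/T)CA"
MK_10_29 = "AGATCTAAGCTTCC(A/C/G/T)GC(A/C/G/T)AC(A/C/G/T)AC(A/C/G/T)A(A/G)CAT"
LYS_1_33 = "CATACGAACCGGCGTATTATACAAGTCGCCATG"

# Standard genetic code, with codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def primer_length(spec: str) -> int:
    """Number of positions in a primer; each (X/Y) group counts as one base."""
    return len(re.sub(r"\([^)]*\)", "N", spec))


def degeneracy(spec: str) -> int:
    """Number of distinct sequences present in a mixed-base primer."""
    count = 1
    for group in re.findall(r"\(([^)]*)\)", spec):
        count *= len(group.split("/"))
    return count


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq: str) -> str:
    """Translate complete codons of a DNA sequence, reading from position 0."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


if __name__ == "__main__":
    for name, spec in (("MK-9-29", MK_9_29), ("MK-10-29", MK_10_29)):
        print(name, primer_length(spec), "bases,", degeneracy(spec), "species")
    print("Lys-1-33:", translate(reverse_complement(LYS_1_33)))
```

Under these definitions both mixed primers are 29 positions long, with 128 and 512 sequence variants, respectively, and the reverse complement of Lys-1-33 as printed translates to HGDLYNTPVRM, the peptide quoted for residues 6-16.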
PCR-PCR amplification was performed on P. gingivalis strain HG66 DNA using primers MK-9-29 and MK-10-29 and consistently yielded a single 76-base pair product (P76) representing a kgp DNA fragment. After Klenow treatment and digestion with EcoRI/HindIII, P76 was subcloned into M13mp18 and M13mp19 vectors (New England Biolabs Inc.) and sequenced. Based on these results, the specific primer (Lys-1-33) was synthesized, 32P-labeled, and used to screen the DASH library. Incorporated radiolabeled nucleotides were separated from unincorporated nucleotides on a Sephadex G-25 column (Boehringer Mannheim).
Genomic DNA Library Synthesis-The P. gingivalis strain HG66 DASH and P. gingivalis strain W50 ZAP DNA libraries used here have been described previously (13).
Screening of Genomic Libraries-Approximately 2 × 10⁵ phage were grown on 5 × 150-mm plates, lifted in duplicate onto supported nitrocellulose transfer membranes (BAS-NC, Schleicher & Schuell), and hybridized with the 32P-labeled Lys-1-33 probe. Hybridizations were performed overnight at 42°C in 2 × Denhardt's solution, 6 × SSC (1 × SSC is 15 mM sodium citrate and 150 mM NaCl), 0.4% (w/v) SDS, and 500 µg/ml fish sperm DNA. Filters were washed in 2 × SSC containing 0.05% (w/v) SDS at 48°C. The DNA from positive plaques was purified and subjected to Southern analysis (see below). A 3.8-kb BamHI fragment and 3.4-kb PstI fragment were identified, excised, and cloned into pBluescript SK(−). The 3.4-kb PstI fragment and a 0.9-kb PstI/BamHI-generated 3'-end fragment of the 3.8-kb BamHI fragment were cloned into M13mp18 and M13mp19 vectors and sequenced. Standard protocols for cDNA library screening, phage purification, agarose gel electrophoresis, and plasmid cloning were employed (16). To clone the 3'-end of the kgp gene, PstI/Asp718- and HindIII-digested DNAs were size-selected on 1% agarose, and the regions at ~0.2 and 4.5 kilobase pairs, respectively, were cloned into pBluescript SK(−). Positive clones were identified by probe hybridization, and smaller fragments were subcloned into M13 for DNA sequencing (see Fig. 1B).
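The hybridization and wash conditions above are given as multiples of SSC. As a small worked illustration, and assuming the quoted composition (15 mM sodium citrate, 150 mM NaCl) refers to 1 × SSC as is conventional, the sketch below converts the stated strengths to millimolar concentrations and computes the plating density. The names `SSC_1X_MM` and `ssc` are hypothetical.

```python
# Composition of 1x SSC in mM (assumed to be what the parenthetical describes).
SSC_1X_MM = {"NaCl": 150, "sodium citrate": 15}


def ssc(fold: int) -> dict:
    """Millimolar composition of an n-fold SSC solution."""
    return {component: fold * mm for component, mm in SSC_1X_MM.items()}


PHAGE_TOTAL = 200_000   # approximately 2 x 10^5 phage screened (from the text)
PLATES = 5              # five 150-mm plates (from the text)
phage_per_plate = PHAGE_TOTAL // PLATES

if __name__ == "__main__":
    print("hybridization (6x SSC):", ssc(6))
    print("wash (2x SSC):", ssc(2))
    print("phage per plate:", phage_per_plate)
```

On this reading, hybridization takes place at 900 mM NaCl and 90 mM sodium citrate, the wash at 300 mM and 30 mM, and each plate carries roughly 40,000 phage.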
DNA Sequencing-Double-stranded DNA cloned into pBluescript SK(−) and single-stranded DNA cloned into M13mp18 and M13mp19 vectors were sequenced by the dideoxy terminator method (17) using sequencing kits purchased from U. S. Biochemical Corp. (Sequenase Version 2.0). DNA was sequenced using the M13 universal primer, reverse sequencing primer, and internal primers (see Fig. 1B).
Active-site Titration and Labeling-KGP from P. gingivalis HG66 was purified and titrated, and the active site was labeled as described previously (13). Biotinylated KGP (20 nmol) was denatured in 6 M guanidine HCl, reduced with 2-mercaptoethanol, and S-pyridylethylated by the method of Howke and Yuan (18). The sample was desalted by dialysis against 25 mM ammonium bicarbonate, pH 7.8.
Polypeptide Chain Fragmentation and Analysis-The derivatized protein (25 nmol) was digested in 25 mM ammonium bicarbonate buffer, pH 7.8, with trypsin at 37°C for 16 h (1:50 enzyme/substrate weight ratio). Each digest was analyzed as described earlier (13). Biotinylated peptides were analyzed for amino-terminal sequence, amino acid composition, and molecular mass.
Construction of Recombinant Baculovirus-The KGP proteinase domain was cloned into the pBlueBac III vector (Invitrogen). A PCR primer corresponding to DNA encoding the 5'-end of the proteinase domain was used in conjunction with a second primer complementary to DNA encoding the 3'-end of the proteinase domain. The 5'-oligonucleotide contained six extra nucleotides, a BamHI site, the Kozak consensus sequence GCC (19), an initiation codon, and the first six amino acids of the KGP proteinase domain. The 3'-oligonucleotide contained six extra nucleotides, a PstI site, a termination codon, and the last six amino acids of the proteinase domain. PCR fragments were cloned into pBlueBac III, and recombinant plasmids were isolated. Two of them were sequenced as described above and used to generate recombinant viruses by in vivo homologous recombination. Recombinant viruses were used to infect Spodoptera frugiperda clone 9 (Sf9) cells. After 48 h, recombinant viruses were collected, identified by PCR, and further purified. Standard protocols for plasmid cloning were used (16). Standard procedures for selection, screening, and propagation of recombinant baculovirus were performed as described by the supplier (Invitrogen).
Recombinant KGP Activity Assay-Sf9 cells were infected at a multiplicity of infection of 5 with either wild-type baculovirus or baculovirus engineered to encode the KGP proteinase domain. At 24, 48, and 72 h, cells and media were collected and centrifuged (5000 rpm, 3 min). Supernatants (2 ml) and pellets (resuspended in KGP assay buffer) were assayed for lysine-specific amidolytic activity as follows. Aliquots (200 µl) of the supernatant or resuspended pellet were added to 800 µl of assay buffer (0.2 M Tris, 0.1 M NaCl, 10 mM L-cysteine, and 5 mM CaCl2, pH 7.6) and mixed for 1 min. Substrate (S-2251, Chromogenix) was added to a final concentration of 100 µM, and amidolytic activity was measured at 405 nm. The amount of active recombinant KGP proteinase domain expressed was quantitated by comparison with the activity of a fixed amount of KGP purified from P. gingivalis.
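Quantitation by comparison with a standard, as described above, amounts to a simple proportionality. The sketch below illustrates that calculation under explicit assumptions: the rate of absorbance change at 405 nm is taken to be linear in enzyme amount, and the recombinant domain is assumed to have the same specific activity as the purified standard. The function name and all numerical inputs in the usage lines are hypothetical and are not data from this study.

```python
ALIQUOT_UL = 200.0      # sample volume added to the assay (from the text)
BUFFER_UL = 800.0       # assay buffer volume (from the text)
DILUTION_FACTOR = (ALIQUOT_UL + BUFFER_UL) / ALIQUOT_UL  # 5-fold in the cuvette


def estimate_kgp_mg_per_liter(sample_rate: float,
                              standard_rate: float,
                              standard_ug: float,
                              aliquot_ul: float = ALIQUOT_UL) -> float:
    """Estimate active enzyme concentration in the original sample.

    sample_rate   -- dA405/min measured for the sample aliquot
    standard_rate -- dA405/min measured for the standard under the same conditions
    standard_ug   -- micrograms of purified KGP in the standard assay
    Assumes rate is proportional to enzyme amount (hypothetical simplification).
    """
    if standard_rate <= 0 or aliquot_ul <= 0:
        raise ValueError("standard_rate and aliquot_ul must be positive")
    ug_in_aliquot = standard_ug * sample_rate / standard_rate
    # ug per ul equals g per liter; multiply by 1000 to express as mg per liter.
    return ug_in_aliquot / aliquot_ul * 1000.0


if __name__ == "__main__":
    # Hypothetical readings chosen only to illustrate the arithmetic.
    estimate = estimate_kgp_mg_per_liter(sample_rate=0.04,
                                         standard_rate=0.10,
                                         standard_ug=1.0)
    print(f"estimated active KGP domain: {estimate:.2f} mg/liter")
```

Because the standard and the sample are measured in the same final assay volume, the 5-fold dilution in the cuvette cancels out of the ratio and only the aliquot volume enters the final concentration.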

RESULTS AND DISCUSSION
Cloning and Sequencing of the kgp Gene-Pike et al. (14) have previously determined the primary structure of the NH2 terminus of Lys-specific gingipain by direct amino acid sequencing. This information was used to prepare a mixture of synthetic oligonucleotides complementary to amino acids 1-6 and 16-21 of the mature protein. These primers were used to amplify, by PCR, kgp gene sequence from P. gingivalis DNA. A single 76-base pair product (P76) was identified, cloned, and sequenced to determine codon usage for the amino-terminal residues of KGP. On the basis of this sequence, a 32P-labeled oligonucleotide probe (Lys-1-33) corresponding to the coding strand of this partial kgp DNA was synthesized and used to screen the DASH P. gingivalis DNA library.
DNA from positive clones was extracted, purified, and subjected to restriction enzyme analysis. All clones gave an ~3.8-kb BamHI fragment and an ~3.4-kb PstI fragment. Similar results were obtained by Southern analysis of P. gingivalis total genomic DNA (data not shown). We therefore concentrated on one clone designated A2. The 3.8-kb BamHI and 3.4-kb PstI fragments from clone A2 were cloned into pBluescript SK(−). The 3.4-kb PstI fragment and a 0.9-kb PstI/BamHI-generated 3'-end fragment of the 3.8-kb BamHI fragment were subcloned into M13mp18 and M13mp19 vectors and sequenced. To clone the region of the kgp gene containing the termination codon, overlapping clones were isolated from size-selected PstI/Asp718 and HindIII plasmid libraries using oligonucleotide probes (see "Experimental Procedures"). Using this procedure, several clones containing ~0.7- and 5.7-kb fragments, respectively, were obtained. In total, ~8.1 kb of genomic DNA, from PstI to HindIII sites (Fig. 1B), was isolated and characterized. The composite 7.304-kb PstI/NcoI fragment of this genomic DNA (Fig. 1B) was fully sequenced in both directions and is described here.
Analysis of the Encoded KGP Sequence-Within the composite kgp gene sequence was found an open reading frame encoding a 1723-amino acid sequence, with the 5'-most ATG initiation codon at nucleotides 652-654 (Fig. 1). Between this ATG codon and the published amino-terminal sequence of KGP are an additional four in-frame methionine codons. The exact ATG codon used for initiation of translation is currently unknown, but the presence of a consensus TATA box (ATAAATT) at nucleotides 635-641 and a 15-amino acid signal peptide sequence immediately following the 5'-most ATG codon suggests it to be the strongest candidate (amino acids −227 to −213; Fig. 1A).
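The coordinates quoted above can be related to one another by simple codon arithmetic. The sketch below maps residue numbers of the mature protein onto nucleotide positions of the composite sequence. One assumption is not stated in the text and is labeled in the code: the initiator methionine is taken to be residue −228, so that the signal peptide (−227 to −213) follows it directly and mature residue 1 is codon 229 of the open reading frame. This choice is made here only because it reproduces the nucleotide range 1351-1383 given for probe Lys-1-33 (mature residues 6-16). Function names are hypothetical.

```python
ATG_START = 652     # first nucleotide of the 5'-most ATG (from the text)
ORF_CODONS = 1723   # length of the open reading frame in amino acids (from the text)
PRO_LEN = 228       # ASSUMPTION: residues -228..-1 precede mature residue 1


def orf_index(residue: int) -> int:
    """Convert mature-protein numbering (no residue 0) to a 1-based ORF codon index."""
    if residue == 0:
        raise ValueError("mature numbering has no residue 0")
    return PRO_LEN + residue + (1 if residue < 0 else 0)


def codon_span(index: int) -> tuple:
    """Nucleotide (start, end) of the codon at a 1-based ORF index."""
    start = ATG_START + 3 * (index - 1)
    return start, start + 2


def mature_to_nt(first_res: int, last_res: int) -> tuple:
    """Nucleotide range covering mature residues first_res..last_res inclusive."""
    return codon_span(orf_index(first_res))[0], codon_span(orf_index(last_res))[1]


if __name__ == "__main__":
    print("initiator codon:", codon_span(1))
    print("Lys-1-33 target (residues 6-16):", mature_to_nt(6, 16))
    print("last sense codon:", codon_span(ORF_CODONS))
```

With these inputs the last sense codon ends at nucleotide 5820, which lies well within the 7.304-kb sequenced fragment.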
The most striking feature of the deduced protein sequence is the presence of multiple homologous sequences immediately carboxyl-terminal to the proteinase-coding domain (Fig. 1B), leading to a calculated molecular mass of 186.8 kDa for the encoded polyprotein. As described for RGP-1 (13), within these sequences can be found peptides identified by Pike et al. (14) as the components of 95-kDa gingipain R (high molecular mass gingipain (HGP)) that likely confer adhesion activity on the high molecular mass KGP complex. The polyprotein sequence deduced from the gene sequence now allows exact delineation of the primary sequence of the mature KGP proteinase. The amino terminus is derived from proteolytic processing at an arginine residue (Fig. 1B). The carboxyl terminus of the proteinase domain is derived by processing at Arg-509, which also releases the amino terminus of a novel subunit that is a hybrid of the 27-kDa (HGP27) and 44-kDa (HGP44) polypeptide chains described recently in the rgp-1 gene structure (Fig. 1B) (13). Previously, we identified three large repeats of homologous sequence located within three of the HGP cleavage products (13). It is interesting to note that the hybrid HGP27/HGP44 appears to result from fusion within this repeated region. The molecular mass of this hybrid protein is 44.7 kDa. The first 147 amino acids are identical to HGP27. The remaining 271 amino acids show 99% identity to HGP44, including one amino acid change and two amino acid deletions (98% identity at the nucleotide level). Okamoto et al. (20) have cloned and sequenced RGP-1 from P. gingivalis strain 381. The encoded protein also contains a hybrid of the HGP44/HGP27 molecules that results from fusion within this homologous sequence. It is, however, in the reverse combination to that observed for KGP. The remainder of the processing events are identical to those described for RGP-1. The alignment of each structure is shown in Fig. 1B.
Processing at Arg-927 gives rise to HGP15; processing at Lys-1062 gives rise to HGP17 (with two amino acid changes compared with HGP17/RGP-1 (13)); and processing at Arg-1220 gives rise to HGP27. The homology to the rgp-1 gene is also evident in the first 120 nucleotides of the 3'-noncoding sequences.
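The processing sites listed above define the lengths of the individual subunits. The following sketch tabulates them and gives a rough mass for each. Two inputs are assumptions rather than statements from the text and are labeled as such in the code: the carboxyl terminus of the mature polyprotein is placed at residue 1495 (the 1723-residue precursor minus an assumed 228 residues preceding mature residue 1), and masses are approximated with a rule-of-thumb average of 0.110 kDa per residue.

```python
ORF_LENGTH = 1723                 # amino acids in the precursor (from the text)
ASSUMED_PRO_LENGTH = 228          # ASSUMPTION: residues preceding mature residue 1
MATURE_C_TERMINUS = ORF_LENGTH - ASSUMED_PRO_LENGTH
AVG_RESIDUE_KDA = 0.110           # ASSUMPTION: rule-of-thumb average residue mass

# Residues after which processing occurs (from the text), in mature numbering.
CLEAVAGE_SITES = [509, 927, 1062, 1220]
SEGMENT_NAMES = ["proteinase domain", "HGP27/HGP44 hybrid", "HGP15", "HGP17", "HGP27"]


def segments(sites, c_terminus):
    """Return (first, last) residue pairs for the pieces between cleavage sites."""
    bounds = [0] + list(sites) + [c_terminus]
    return [(bounds[i] + 1, bounds[i + 1]) for i in range(len(bounds) - 1)]


def segment_lengths(sites, c_terminus):
    """Length in residues of each piece."""
    return [last - first + 1 for first, last in segments(sites, c_terminus)]


if __name__ == "__main__":
    pieces = segments(CLEAVAGE_SITES, MATURE_C_TERMINUS)
    for name, (first, last) in zip(SEGMENT_NAMES, pieces):
        length = last - first + 1
        print(f"{name:20s} {first:5d}-{last:<5d} {length:4d} aa "
              f"~{length * AVG_RESIDUE_KDA:.0f} kDa")
```

The second segment comes out at 418 residues, matching the 147 + 271 residues described for the hybrid, and the rough masses (about 56, 46, 15, 17, and 30 kDa) are in line with the subunit names and the calculated values quoted in the text, bearing in mind that the per-residue average is only an approximation.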
The proteinase domain of KGP comprises 509 amino acids with a calculated molecular mass of 55.9 kDa. This is lower than the 70-kDa molecular mass of Lys-gingivain first described by Scott et al. (21), but corresponds more closely to the 60-kDa molecular mass determined by Pike et al. (14). A comparison of the deduced amino acid sequences of the proteinase domains of RGP-1 and KGP indicates 22% identity using the Macaw alignment program (22). This includes a long stretch of homologous sequence in the carboxyl terminus that incorporates the motif LTATT present in all HGP repeat sequences (Fig. 2, A and B) (13). This sequence includes Asn-463, which aligns with Asn-442 in RGP-1 and is the putative asparagine in the catalytic triad. Similarly, His-216 in KGP and His-211 in RGP-1 are other putative residues in the catalytic triads (Fig. 2, B and C). Previously, Cys-185 in RGP-1 was thought to be the active-site residue using active-site labeling and subsequent reduction and S-carboxymethylation (13). For KGP, an alternative methodology (described under "Experimental Procedures") was used to determine that Cys-249 is the active-site residue of this proteinase. This residue aligns with Cys-244 in RGP-1. This prompted us to re-examine the active site of RGP-1. This methodology, which utilizes a more sensitive reduction and pyridylethylation step, identified Cys-244 as the actual active-site cysteine of RGP-1, and not Cys-185 as reported previously (13).
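A percent-identity figure such as the one above is computed from a pairwise alignment. The sketch below shows one common convention, identical residues divided by the number of columns in which neither sequence has a gap. This is not necessarily the convention used by the Macaw program, and the two short aligned strings are hypothetical toy sequences, not the gingipain alignment. The sketch also tabulates the numbering offsets between the putative catalytic residues named in the text.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # gapped columns are excluded from the denominator
        compared += 1
        identical += res_a == res_b
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Putative catalytic residues (KGP position, RGP-1 position) as given in the text.
CATALYTIC_PAIRS = {"Cys": (249, 244), "His": (216, 211), "Asn": (463, 442)}


def numbering_offsets(pairs):
    """KGP minus RGP-1 residue number for each aligned catalytic residue."""
    return {name: kgp - rgp for name, (kgp, rgp) in pairs.items()}


if __name__ == "__main__":
    # Hypothetical toy alignment used only to exercise the function.
    toy_a = "LTATT-GK"
    toy_b = "LTATTAGR"
    print(f"toy identity: {percent_identity(toy_a, toy_b):.1f}%")
    print("offsets:", numbering_offsets(CATALYTIC_PAIRS))
```

The offsets show that the cysteine and histidine are shifted by the same five residues between the two enzymes, whereas the asparagine is shifted by 21, which implies additional insertions in KGP between the histidine/cysteine region and the carboxyl-terminal homologous stretch.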
KGP Has No Homology to Known Cysteine Proteinase Families-To date, some 20 families of cysteine proteinases are recognized. In Fig. 2C is shown the alignment of active sites of several members of these families and the putative active-site sequences in RGP-1 and KGP. As observed previously, RGP-1 exhibits no similarity to any known cysteine proteinase. This is also true for KGP. Thus, both P. gingivalis proteinases appear to represent an emerging distinct branch of this family of proteolytic enzymes in which the catalytic apparatus is likely different from that of other known cysteine proteinases. We also have evidence of the existence of a third member of this family. Restriction endonuclease-digested P. gingivalis DNA hybridized with an oligonucleotide corresponding to the amino terminus of mature RGP-1 (13) and showed two BamHI fragments of ~9.4 and ~3.5 kb and two PstI fragments of ~10 and ~3 kb (data not shown). Isolation of positive recombinant clones from the library revealed, upon analysis, one clone with a 3.5-kb BamHI fragment and a 3-kb PstI fragment. This corresponded to the rgp-1 gene described previously (13). We also identified independent clones containing a 9.4-kb BamHI fragment and an ~10-kb PstI fragment. Preliminary sequence analysis confirms that these clones encode a protein that is different from RGP-1 and that the translated sequence corresponds to peptide sequences of high molecular mass protein sequenced earlier in our laboratories and designated RGP-2 (data not shown).
Fig. 2 legend (fragment): ... (22). Identical residues are shaded. C, conservation of sequences around the potential catalytic cysteine, histidine, and asparagine residues of some cysteine proteinase families with those in the new gingipain cysteine proteinase family. Residues identical to those in either RGP-1 or KGP are underlined. The catalytic triad Cys, His, and Asn are in boldface. The cysteine proteinase alignment was based on that of Rawlings and Barrett (28).
Recombinant KGP Proteinase Domain Is Functionally Active-Because of the lack of a specific antibody against KGP, we tested the expression of the KGP proteinase domain in the baculovirus system by measuring its activity. Lysine-specific amidolytic activity was detectable after 48 h and was maximal at 72 h (Fig. 3) only in the supernatant of KGP recombinant baculovirus-infected cells. No amidolytic activity was observed in resuspended pellets or in the supernatant of wild-type infected cells. Calculations based on the activity of full-length KGP purified from P. gingivalis cultures indicate that at least 2 mg/liter KGP proteinase domain is being expressed. These results demonstrate expression of the KGP proteinase domain in the baculovirus system and imply that this domain does not require the remainder of the precursor protein for correct folding into a catalytically active conformation. They also support the notion that the membrane-bound forms of gingipains likely associate with the membrane via the adhesin domains.
Implications of KGP Structure for P. gingivalis Pathogenicity-Molecular cloning of the kgp gene confirms previous findings that, as shown for RGP-1, KGP is also closely associated with adhesion activity. Sequencing of the corresponding kgp gene of the virulent W50 strain of P. gingivalis revealed 94% identity in the proteinase domain sequence (data not shown). Thus, as described for RGP-1, any involvement of KGP in virulence is likely due to its differential regulation and enhanced expression in virulent strains. Microorganism adhesion to host tissues is the initial critical event in the pathogenesis of most infections and an essential step for colonization of surfaces exposed to continuous fluid flow (23,24). The specific binding between the microbial cell-surface components (adhesins) and host cell, or other microorganism, surfaces is mediated by either protein-protein (non-lectin adhesin) or protein-carbohydrate (lectin adhesin) interactions. Besides mediating bacterial cell attachment to host tissues, the non-lectin adhesins of P. gingivalis have been implicated in hemagglutination, with both activities associated with bacterial proteinases. A clearer understanding of the association between hemagglutinins/adhesins and proteinases occurred recently when we elucidated the structure of the gene encoding RGP-1 (13). We have now shown that both RGP-1 and KGP are synthesized as proteinase-adhesin polyproteins. It remains to be determined which polypeptide chain of gingipain complexes possesses affinity for microbial adhesion substrates such as fibrinogen. Fibrinogen is found in high concentrations in blood plasma, where it plays roles in blood coagulation and wound-healing processes. Its deposition on foreign bodies and at sites of trauma allows it to serve as a substrate for microbial adhesion. Sequence alignments of the proteinase domains of RGP-1 and KGP and of HGP17 and HGP44 reveal a region of high homology that might function as an adhesin domain (Fig. 2A).
The ability of RGP-1, KGP, and other membrane-associated P. gingivalis gingipains to adhere to matrix molecules indicates similarity to the fibrinogen-binding microbial surface components recognizing adhesive matrix molecules (25,26). Recently, two P. gingivalis cell-surface cysteine proteases, named porphypains 1 and 2, were purified and shown to bind and degrade fibrinogen (27). These likely correspond to two forms of high molecular mass complexes of gingipains.
The availability of RGP-1 and KGP DNA sequences will allow the further study of erythrocyte hemolysis mediated by the hemagglutinin/adhesin activities of this family of proteolytic polyproteins. In addition, recombinant polypeptides derived from these sequences may also prove useful in the development of potential immunoprophylactic and therapeutic agents against this human pathogen.