Structural insights unravel the zymogenic mechanism of the virulence factor gingipain K from Porphyromonas gingivalis, a causative agent of gum disease from the human oral microbiome*

Skewing of the human oral microbiome causes dysbiosis and preponderance of bacteria such as Porphyromonas gingivalis, the main etiological agent of periodontitis. P. gingivalis secretes proteolytic gingipains (Kgp and RgpA/B) as zymogens inhibited by a pro-domain that is removed during extracellular activation. Unraveling the molecular mechanism of Kgp zymogenicity is essential to design inhibitors blocking its activity. Here, we found that the isolated 209-residue Kgp pro-domain is a boomerang-shaped all-β protein similar to the RgpB pro-domain. Using composite structural information of Kgp and RgpB, we derived a plausible homology model and mechanism of Kgp-regulating zymogenicity. Accordingly, the pro-domain would laterally attach to the catalytic moiety in Kgp and block the active site through an exposed inhibitory loop. This loop features a lysine (Lys129) likely occupying the S1 specificity pocket and exerting latency. Lys129 mutation to glutamate or arginine led to misfolded protein that was degraded in vivo. Mutation to alanine gave milder effects but still strongly diminished proteolytic activity, without affecting the subcellular location of the enzyme. Accordingly, the interactions of Lys129 within the S1 pocket are also essential for correct folding. Uniquely for gingipains, the isolated Kgp pro-domain dimerized through an interface, which partially overlapped with that between the catalytic moiety and the pro-domain within the zymogen, i.e. both complexes are mutually exclusive. Thus, pro-domain dimerization, together with partial rearrangement of the active site upon activation, explains the lack of inhibition of the pro-domain in trans. Our results reveal that the specific latency mechanism of Kgp differs from those of Rgps.

by anaerobic, proteolytic Gram-negative species, which cause tissue destruction and inflammation (8). In the oral cavity this causes inflammation of the gums (gingivitis) and the periodontium (periodontitis) (6), which erodes the alveolar bone support of the teeth. Bacterial species infecting the periodontium include Actinobacillus actinomycetemcomitans, Tannerella forsythia, Prevotella intermedia, Fusobacterium nucleatum, and Porphyromonas gingivalis. P. gingivalis is a chief component of the dysbiotic oral microbiome and the major etiologic agent of chronic periodontitis (CP), as revealed by comparative studies between healthy individuals and CP patients (6,9). Routine treatment of severe CP consists of mechanical debridement of the tooth surface below the gum line, which is laborious, repetitive, painful, and incompletely effective (10). Accordingly, there is an unmet need for the development of novel therapeutic approaches against CP, which is among the most prevalent infection-driven inflammatory diseases (11), and P. gingivalis is a prime target (6).
P. gingivalis persistently colonizes the human oral cavity, as indicated by its detection in several paleomicrobiological samples, which include the wet mummy of the Tyrolean Iceman "Ötzi" dated to ~5,300 years ago (12-14). During this multimillennial colonization of our mouth, the bacterium has evolved to deactivate our innate immune and inflammatory defense mechanisms and to keep bacterial competitors in the gingival crevice in check through a panel of virulence factors, which include peptidases (15-17). Among the latter are the gingipains K (Kgp) and R (RgpA and RgpB), which specifically cleave substrates after lysines and arginines (18), respectively. They are soluble or outer membrane-anchored cysteine peptidases responsible for up to 85% of the total extracellular proteolytic activity of the bacterium (19-21) and can be found at very high concentrations in gingival crevicular fluid from CP patients (22). Kgp accounts for most of this activity (23) and is essential for bacterial survival and progression of CP (18). Accordingly, blocking Kgp may be a promising approach to combating P. gingivalis (24,25).
Kgp is a multidomain protein consisting of a signal peptide, an N-terminal pro-domain (NPD), a catalytic domain (CD), an immunoglobulin superfamily domain (IgSF), between three and five hemagglutinin/adhesion domains, and a C-terminal domain, thus spanning up to 1,732 residues in total. It is secreted through a C-terminal domain-dependent type-9 secretion system, which is also called a "Por secretion system" and has so far only been found in the Bacteroidetes phylum (26-33). Similar to other secretory peptidases, Kgp is produced as a zymogen to prevent undesired intracellular proteolysis and only achieves full activity once secreted. The NPD exerts zymogenic latency and is proteolytically removed during activation of Kgp (34-37). In addition to maintaining latency, NPDs often fold independently and fulfill a chaperone-like function on the downstream CDs to facilitate their correct folding during biosynthesis (38). They may also participate in intracellular sorting of zymogens (35) and inhibit the mature enzymes when added in trans, as described for funnelin metallocarboxypeptidases, for example (39,40).
Unraveling the biochemical and structural determinants of zymogenicity is essential to understand its pathophysiological function and to facilitate the design of inhibitors to block its proteolytic activity (41). We recently reported the crystal structure of the linked CD and IgSF domains of Kgp (Kgp-CD + IgSF) (42) and of the complex between active RgpB-CD + IgSF and its cognate NPD (RgpB-NPD) (43). To determine the mechanism of Kgp zymogenicity, we crystallized the isolated NPD of Kgp and solved its structure, which revealed a fold similar to Rgp-NPD despite only 20% sequence identity and allowed modeling of its interaction with CD in the zymogen. Mutagenesis studies further suggest that the NPD of Kgp is an essential chaperone for the folding of the CD and perhaps other domains.

Results and Discussion
Overall Structure of the Kgp Pro-domain-Over several years, our attempts to produce crystals of the Kgp zymogen spanning domains NPD, CD, and IgSF have failed, which contrasts with our success with RgpB (43). Accordingly, we followed a divide-and-conquer approach and obtained the structures of Kgp-CD + IgSF (42) and Kgp-NPD (this work) separately. The latter is visible in the final Fourier map from residue Gln20 to Gln199 or Ala201 (Kgp numbering) of molecules B and A, respectively, within the asymmetric unit of the crystal.
The molecule is boomerang-shaped, with approximate maximal dimensions of 55 × 40 × 30 Å (Fig. 1, A and B). Its core is a central, 12-stranded, strongly bent β-structure (strands βI′, βI-βIV, βIV′, βV, βVI, βVII + βVIII, βIX-βXI; strand numbering based on the structure of RgpB-NPD; see Fig. 1 in Ref. 43) split into β-sandwiches 1 and 2, which are approximately perpendicular to each other (reference view according to Fig. 1, C and D). Two 3₁₀-helices, I and II, are laterally attached to sandwich 2. The Kgp-NPD moiety is held together by a central hydrophobic core traversing the molecule, which glues the two sandwiches and reaches from Val92, Pro93, and Ala96 on the rightmost face of sandwich 1 to Phe64, Phe138, and Tyr145 on the leftmost face of sandwich 2. Sandwich 1 is made up of antiparallel front and back sheets featuring, respectively, four strands (βI, βII, βXI, and βVI) and three strands (βV, βIX, and βVII + βVIII). Front sheet strands βXI and βVI are, respectively, N- and C-terminally extended beyond the limits of the sandwich and bent by ~50-60°. In this way, they contribute to the antiparallel three-stranded front sheet (strands βVI, βXI, and βX) of sandwich 2. The back sheet of the latter is five-stranded (βI′, βIII, βIV, βV, and βIV′) and mixed parallel-antiparallel.
The N-terminal part of the molecule shapes the outermost strand of the back sheet of sandwich 2 (βI′) before entering a first β-ribbon, which gives rise to one edge of the front sheet of sandwich 1 (ribbon βIβII). The chain rejoins sandwich 2 through two consecutive β-ribbons (βIIIβIV and βIV′βV), which together with βI′ form the back sheet. The C-terminal extension of strand βV shapes through its extra part one edge of the back sheet of sandwich 1. At this point, the chain enters edge strand βVI of the front sheet of sandwich 1. The C-terminal elongation of βVI, in turn, gives rise to an edge strand of the front sheet of sandwich 2. After this strand, a 45-residue loop, the "inhibitory loop," connects strands βVI and βVII + βVIII. It contains helices I and II and largely protrudes from the Kgp-NPD moiety (Fig. 1, A and B). This extended loop leads to a β-ribbon that creates the two edge strands of the back sheet of sandwich 1 (termed βVII + βVIII and βIX). Thereafter, a loop leads to another β-ribbon (βXβXI) that completes the front sheet of sandwich 2. Finally, the C-terminal extension of βXI contributes to the front sheet of sandwich 1 as a central strand before reaching the C terminus of the molecule (Fig. 1D).
Structural Similarity with the RgpB Pro-domain-The gingipain pro-domain constructs we expressed and crystallized in the present and past work comprised, respectively, fragments Gln20-Arg228 of Kgp and Gln25-Arg229 of RgpB, because the N-terminal 19 and 24 residues, respectively, correspond to signal peptides for secretion. Although the structure of Kgp is well ordered in the final Fourier map from Gln20 on, RgpB is only ordered from Gly30/Arg31 onward (there are four molecules in the crystal asymmetric unit; see PDB code 4IEF and Ref. 43). At the opposite, C-terminal end, Kgp is ordered at most until Ala201 and flexible thereafter (Asp202-Arg228), whereas RgpB is ordered until the end of the construct at Glu226/Arg229 (except for flexible segment Leu205/Val206-Ser209/Thr210). The C-terminal flexibility of Kgp in the crystal structure is supported by the likely absence of regular secondary structure within Gln199-Arg243 according to bioinformatics predictions (data not shown). Both NPDs share a grossly similar central core but differ at both termini.
Initial automatic superposition of the coordinates of Kgp-NPD and RgpB-NPD confirmed the similarity of the structures (Fig. 2, A and B). There were 147 aligned residues, which showed a core root mean square deviation of 2.  with helix I (LβVII) and Asn141 within LIII (Fig. 2C). Starting from this superposition, we performed an accurate sequence alignment based on visual inspection of the structures (Fig. 2, A-C), which instead revealed 164 common residues and just 21% sequence identity (in comparison, RgpA-NPD and RgpB-NPD are 75% identical (44)). The low sequence identity and the large root mean square deviation value of the superposed structures provide an explanation a posteriori for the fact that the structure of Kgp-NPD could not be solved by molecular replacement using the coordinates of RgpB-NPD. Structural superposition further unveiled that Kgp possesses an extra β-strand, βI′, which contributes to sandwich 2 (see above) and is missing in RgpB-NPD. The latter, in turn, has an extra C-terminal helix, αI, missing in Kgp-NPD (see Fig. 1 in Ref. 43 and Fig. 2C). Moreover, Kgp-NPD has an extra β-strand between βIV and βV, termed βIV′, and strands βVII and βVIII, separated in RgpB, are joined to a single continuous strand in Kgp (Figs. 1, C and D, and 2C).
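As a hedged illustration of how a value such as 21% identity over 164 structurally equivalent residues can be computed, the following Python sketch counts identical positions over the alignment columns occupied in both sequences. The function name, the gap character, and the toy sequences are assumptions made for illustration only; they are not the alignment of Fig. 2C, and the denominator convention used for the reported value is not stated in the text.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over columns where both sequences carry a residue.

    Assumption: gapped columns are excluded from the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns in which neither sequence has a gap.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != gap and b != gap]
    if not pairs:
        return 0.0
    identical = sum(1 for a, b in pairs if a == b)
    return 100.0 * identical / len(pairs)


if __name__ == "__main__":
    # Hypothetical toy alignment, not real gingipain sequences.
    toy_a = "MKT-LVAG"
    toy_b = "MRTQLV-G"
    print(f"{percent_identity(toy_a, toy_b):.1f}% identity")
```

Under this convention, 21% identity over 164 common positions would correspond to roughly 34 identical residues.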
Furthermore, superposition revealed that Lys129 within the inhibitory loop of Kgp likely corresponds to Arg126 of RgpB as the "intruding residue" penetrating the S1 pocket of the active site cleft, thereby contributing to inhibitory potency (see Refs. 43 and 44 and Fig. 2, B and C). To assess the importance of Lys129 for Kgp, we constructed point mutants (K129A, K129E, and K129R) and analyzed their expression at mRNA, protein, and activity levels using Rgps and wild-type Kgp as controls. Mutations had no effect on transcription as determined by qRT-PCR (data not shown). In contrast, Western blotting analysis revealed no protein for mutant K129E in stationary bacterial cultures (fraction BC), the washed cell fraction (WC), the soluble intracellular protein fraction (CP + PP), the cell envelope (CE) fraction, and the cell-free growth medium (GM) (Fig. 3A). Mutant K129R only revealed trace amounts in the first two fractions, whereas the wild type and K129A were detected in all fractions except the intracellular fraction, as expected for secreted proteins. The respective presence of protein correlated with enzymatic activity, which was absent for K129E and only residual for K129R (Fig. 3C). K129A displayed ~45% of the wild-type activity (Fig. 3C) but only a slightly lower signal in Western blotting analysis (Fig. 3A). Mutations in the Kgp gene had no effect on expression, processing, and activity of Rgps (Fig. 3, B and D).
FIGURE 2 (legend, partial). Here, the proven S1-intruding residue of RgpB (R126) and the putative one of Kgp (K129) are shown for their side chains and labeled. C, structure-based sequence alignment of the NPDs of RgpB and Kgp. Residues not defined in the respective structure are in light gray, structurally aligned residues are in black (when differing) or red (when identical), and residues defined but structurally non-equivalent in the two structures are in light salmon. The residue (potentially) intruding the S1 pocket of the respective CD is framed. Numbering and secondary structure elements (arrows for strands; kringles for helices) above the alignment in green correspond to Kgp; those in blue below the alignment correspond to RgpB.
Taken together, these findings suggest that occupancy of the S1 specificity pocket of the CD by Lys129 not only imposes latency to the Kgp zymogen (44) but also contributes significantly to the chaperone function of the NPD, similar to what was previously found for Rgps (45). The deep, negatively charged pocket of Kgp is optimally designed to accommodate a lysine side chain (42). Consistently, a negative charge (mutant K129E), as well as extended or reduced side chains (mutants K129R and K129A, respectively), lead to interaction impairment or inefficiency, which apparently disturbs the folding process. This results in subsequent degradation of misfolded proteins by quality control proteases in the periplasm.
Dimerization of Kgp-NPD-The two molecules in the asymmetric unit of the Kgp-NPD crystal associate through a large interface, which gives rise to a pseudo-continuous antiparallel eight-stranded β-sheet mediated by the respective edge strands βI of the upper sheets of sandwiches 1 (Fig. 4, A-C). The hydrophobic cores cohering the monomers (see above) are likewise joined at the interface. Among the internal hydrophobic residues, two free cysteines (Cys35) within strands βI are close to the dimer interface but too far apart from each other for bonding (~7 Å), thus indicating that dimerization is non-covalent (Fig. 4C). Instead, the respective Sγ atoms are bridged through electrostatic interactions with a (tentatively assigned) azide molecule from the buffer.
Computational analysis of the dimer revealed an interface of 1,060 Å², which is quasisymmetrically shaped by 27 residues from each of molecules A and B and includes nine hydrogen bonds and two salt bridges in total (Table 1). The calculated solvation free energy gain upon interface formation is −11.7 kcal/mol, with an associated p value of 0.230. The extensive nature of the interface suggests the functional relevance of dimerization. In addition, the complexation significance score, which estimates the relevance of an interface for assembly formation (46), is 97.7%. Altogether, these results strongly suggest that the oligomerization state in solution is a stable dimer. To verify this hypothesis, we performed size exclusion chromatography experiments at three pH values (5.5, 6.5, and 8.0), which reproduce, respectively, values of the crystallization conditions and of human gingival crevicular fluid in inflamed sites (47) plus an intermediate value. Consistent with the observations in the crystals (Fig. 4B), we found that Kgp-NPD elutes as a dimer at all three pH values (Fig. 5), thus supporting the notion that the NPD dimerizes upon Kgp activation. This dimerization is likely to provide a mechanism to prevent rejoining of the NPD and the CD, which would lead to inhibition in trans of the secreted gingipain.
FIGURE 3 (legend, partial). Western blotting analysis of parental strain P. gingivalis W83 and mutants K129A, K129E, and K129R is shown. Late exponential/early stationary bacterial cultures (BC) were separated by centrifugation into cell-free growth medium (GM) and cell pellet. The latter was washed, giving rise to the whole cell (WC) fraction, which was further fractionated into the soluble intracellular protein fraction (CP + PP) and the cell envelope fraction (CE) by sonication and ultracentrifugation. A and B, all fractions were standardized to the initial volume of the culture subjected to centrifugation and analyzed by Western blotting to detect Kgp forms (A) and Rgps (control) (B). C and D, Kgp (C) and Rgp (D) gingipain activities determined in whole cultures, cell-free growth medium, and washed cells using specific substrates. The whole-culture activity of the wild-type strain was arbitrarily taken as 100%.
Homology Model of the Kgp Zymogen-To gain insight into the structure of the full Kgp zymogen, we constructed a homology model spanning domains NPD, CD, and IgSF based on the structural information available on Kgp and RgpB fragments (see "Experimental Procedures" and Fig. 6, A-D). As an independent validation of the model, the distance between the last NPD residue (Ala201) and the first CD residue (Asp229) spans ~22 Å, which is sufficient to accommodate the missing 28 residues. Model building required rearrangement of six segments from their conformation observed in the unbound NPD and CD structures, three from each moiety, to avoid clashes (Fig. 6D). The largest changes in the NPD involved the inhibitory loop, and in particular, Lys129, which was rebuilt to match the position of an L-lysylmethyl moiety found attached to the catalytic cysteine Cys477 in mature Kgp (Fig. 6C), as well as strands βVII + βVIII and βIX plus the intervening loop, which were folded outward to avoid clashes. Rearrangement of the CD, in turn, mainly involved the N terminus, helices α11 and α12, and Lα11α12, which are close to the active site (see Ref. 42 and Fig. 6D) and produced severe clashes with the NPD (Fig. 6D), suggesting that the CD must adopt distinct conformations in the zymogenic and activated forms. According to this model, the NPD would attach laterally to the catalytic moiety through a large concave surface, distal to the IgSF (Fig. 6A) and overlapping with the NPD dimerization surface. Latency of the zymogen would be conferred by the inhibitory loop, by blocking access to the active site cleft through the insertion of Lys129 into the S1 pocket. The lysine residue would be sandwiched for its aliphatic part between Trp513 and Cys477 and bound for its Nζ group by the side chains of Asp516 and Thr442, and the main chain carbonyl of Asn475 at the bottom of the pocket.
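The linker argument above can be checked with simple geometry. The sketch below is a rough plausibility check, not part of the modeling procedure: it assumes an upper bound of about 3.8 Å per consecutive Cα-Cα step (the usual value for a trans peptide) and treats the 28 missing residues as 28 such steps; both the constant and the function names are illustrative assumptions.

```python
# Assumed upper bound for the distance between consecutive C-alpha atoms
# joined by a trans peptide bond, in angstroms.
CA_CA_MAX_ANGSTROM = 3.8


def max_linker_span(n_steps: int, per_step: float = CA_CA_MAX_ANGSTROM) -> float:
    """Upper bound on the end-to-end distance of a fully extended chain."""
    if n_steps < 0:
        raise ValueError("n_steps must be non-negative")
    return n_steps * per_step


def can_bridge(distance: float, n_steps: int) -> bool:
    """True if a linker of n_steps C-alpha steps could span the distance."""
    return distance <= max_linker_span(n_steps)


if __name__ == "__main__":
    gap_angstrom = 22.0    # distance quoted in the text
    missing_steps = 28     # missing residues quoted in the text
    span = max_linker_span(missing_steps)
    print(f"maximum span: {span:.1f} A; bridges gap: "
          f"{can_bridge(gap_angstrom, missing_steps)}")
```

Because the fully extended bound is several times larger than the gap, the missing segment does not need to be extended and can remain a flexible, disordered linker, which is consistent with its absence from the electron density.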
Conclusions-By following a concerted biophysical and biochemical approach, which included protein crystallography, bioinformatics with homology modeling, oligomerization studies in solution, and mutagenesis studies, we have unveiled the molecular mechanism of zymogenic latency maintenance for Kgp, an essential proteolytic virulence factor of a major disease-causing pathogen for oral dysbiosis and CP. According to our working model, Kgp-NPD, which comprises an all-β scaffold, would inhibit cognate Kgp-CD within the zymogen through steric hindrance of the active-site cleft and specificity pocket, with a key feature being the insertion of Lys129 into the S1 pocket. This part of the latency mechanism is shared with related gingipains RgpA and RgpB.
Activation is conferred by cleavage at the hinge between Kgp-NPD and Kgp-CD, resulting in conformational changes in the active site of Kgp-CD and of surface segments of Kgp-NPD, followed by release and dimerization of the Kgp-NPD. Because dimerization occurs through a surface that overlaps with the NPD/CD interface in the zymogen, dimerized Kgp-NPD is no longer inhibitory. In contrast, there is no evidence for NPD dimerization in RgpA/B, and no significant structural differences are found in RgpB-CD between the zymogenic complex and the mature enzyme (see Fig. 3 in Ref. 43). Collectively, these findings explain why, although RgpA and RgpB are strongly inhibited by their respective NPDs when added in trans, with inhibition constants in the nanomolar range, Kgp is not inhibited by addition of its NPD (44). The importance of the intruding residue in Rgp inhibition is reflected by the fact that its mutation to lysine reduced the inhibitory potency by 1 order of magnitude. Replacement with alanine, glutamine, or glycine even totally abolished inhibition (44). This inhibitory capacity in trans in turn poses the requirement that Rgp-NPDs be completely degraded after NPD cleavage, so Rgp-CD activity can be fully liberated. Highly purified RgpB zymogen does not auto-activate, possibly because of NPD inhibition, and the addition of active enzyme in catalytic amounts only very slowly generates active RgpB. Along this line, the pI value of RgpB-NPD is 8.06, which is close to the pH value of human gingival crevicular fluid in inflamed sites (8.0) and thus might destabilize the NPD and render it prone to degradation. In contrast, Kgp-NPD has a pI of 5.95 and should be stable at pH 8. Kgp-NPD may therefore remain intact, requiring dimerization to prevent inhibition of Kgp after activation, and the dimer might also have other functions in the colonization of P. gingivalis in the gingival crevice.

Experimental Procedures
Protein Production-Kgp-NPD of P. gingivalis strain W83 (sequence Gln20-Arg228; see UniProt database access code Q51817) was produced as a fusion construct with glutathione S-transferase by recombinant overexpression in Escherichia coli and purified as described previously (44). The cloning strategy resulted in the retention of a pentapeptide, GPLGS (numbered −5 to −1), from the expression vector on the N terminus of the recombinant protein after excision of the fusion protein. Kgp-NPD was then concentrated and dialyzed against 5 mM Tris-HCl, pH 7.4, 0.02% sodium azide for crystallization.
Oligomerization Studies in Solution-Size exclusion chromatography analysis of Kgp-NPD was performed in a Superdex 75 column equilibrated with 50 mM sodium acetate, 150 mM sodium chloride, pH 5.5/6.5, or 50 mM Tris-HCl, 150 mM sodium chloride, pH 8.0, at 0.5 ml/min. Both buffers were supplemented with 0.02% sodium azide and 2 mM 1,4-dithiothreitol, with the latter added to prevent formation of intermolecular disulfide bridges.

Generation of P. gingivalis Mutant Strains-Strains incorporating point mutations of Kgp-NPD residue Lys129 (K129A, K129E, and K129R) were generated from P. gingivalis strain W83. To this end, master plasmid Kgp-CepA was first obtained through PCR/restriction digestion methods. Briefly, a 0.8-kb region upstream of the kgp gene, the CepA ampicillin resistance, the 2.8-kb fragment of the kgp gene, and the CepA ampicillin resistance cassette were amplified by PCR using Phusion polymerase (Thermo Fisher) and appropriate primers (Table 2); digested with SphI, HindIII, BglI, and BamHI restriction enzymes (Thermo Fisher); and cloned into the pUC19 plasmid. The correct placement and orientation of DNA segments in resulting plasmid Kgp-CepA were confirmed by sequencing. The wild-type plasmid construct of Kgp-NPD (Kgp-CepA) was subsequently used to produce K129A, K129E, and K129R mutations by the SLIM method (48). The mutated constructs were verified by DNA sequencing.
Chromosomal integration of the mutated regions into the P. gingivalis genome was achieved via double homologous recombination as described previously (49). Briefly, 1 μg of purified plasmid DNA was electroporated into P. gingivalis strain W83 competent cells (2.5 kV, 4 ms; Bio-Rad Micropulser). Bacteria were grown for 5-7 days on enriched tryptic soy broth blood agar (30 g of trypticase soy broth, 5 g of yeast extract, 5 mg of hemin, 15 g of agar, pH 7.5, per liter containing 4% defibrinated sheep blood) supplemented with 5 μg/ml ampicillin for antibiotic resistance selection. Resistant clones were further subcultured and analyzed by PCR and sequencing to confirm mutations.
Growth of P. gingivalis Strains and Cell Fractionation-Wild-type and mutant strains of P. gingivalis were grown under anaerobic conditions (85% nitrogen, 5% hydrogen, and 10% carbon dioxide) in liquid enriched tryptic soy broth (30 g of trypticase soy broth, 5 g of yeast extract, 5 mg of hemin, pH 7.5, per liter; further supplemented with 0.5 g of L-cysteine and 2 mg of menadione). For the mutants, the medium was additionally supplemented with 5 μg/ml ampicillin. The cultures were harvested at the beginning of the stationary growth phase (A600 = 1.0-1.3; on average after 24 h), adjusted to A600 = 1.0 with enriched tryptic soy broth and centrifuged (6,000 × g, 15 min, 4°C) to separate bacterial cells from the medium. The pellets were washed with PBS, and bacterial cells were resuspended in the same buffer to be disrupted by ultrasonication (A = 70%, 50 s, 5 s pulse, 2 s off). The resulting homogenate was subjected to ultracentrifugation (150,000 × g, 60 min, 4°C) to obtain the particle-free soluble cytoplasmic/periplasmic proteins fraction (supernatant) and the pellet. The latter consisted of the cell envelope fraction, thus encompassing inner and outer membranes, and was resuspended in PBS for further fractionation.
Analysis of Gingipain Expression-Expression levels of gingipains (Rgps and Kgp) were determined by activity assays, Western blotting analyses, and qRT-PCR. The activities of Rgps and Kgp were determined in the whole cell culture, cell-free growth medium, and suspension of washed bacterial cells using the chromogenic p-nitroanilide (pNA) substrates benzoyl-Arg-pNA and acetyl-Lys-pNA, respectively. Briefly, 10 μl of cell fraction was added to 190 μl of TNCT buffer (50 mM Tris-HCl, pH 7.5, 5 mM calcium chloride, 150 mM sodium chloride, 0.05% Tween 20; supplemented with 10 mM L-cysteine-HCl neutralized with 10 mM sodium hydroxide). After 10 min of preincubation at 37°C, the reaction was initiated by adding 10 μl of substrate (final concentration, 0.5 mM), and the increase of A405 was recorded using a Spectromax Flexstation 3 (Molecular Devices) microplate reader. Gingipain activity was expressed as mOD/min/μl, and the activity in the full culture of the wild-type strain was taken as 100%.
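The conversion from a recorded A405 time course to an activity in mOD/min/μl, and its normalization to the wild type, can be sketched as follows. This is a minimal illustration under stated assumptions: the initial rate is taken as the least-squares slope of the linear phase, the sample volume is the 10 μl given above, and all absorbance readings and function names are hypothetical rather than measured data or the authors' analysis script.

```python
def slope_per_min(times_min, absorbances):
    """Least-squares slope of absorbance versus time (OD per minute)."""
    if len(times_min) != len(absorbances) or len(times_min) < 2:
        raise ValueError("need two or more paired readings")
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_a = sum(absorbances) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (a - mean_a)
              for t, a in zip(times_min, absorbances))
    return sxy / sxx


def activity_mod_per_min_per_ul(times_min, absorbances, sample_ul=10.0):
    """Activity in mOD/min per microliter of cell fraction assayed."""
    return 1000.0 * slope_per_min(times_min, absorbances) / sample_ul


def percent_of_wild_type(activity, wild_type_activity):
    """Express an activity relative to the wild type taken as 100%."""
    return 100.0 * activity / wild_type_activity


if __name__ == "__main__":
    times = [0.0, 1.0, 2.0, 3.0]                 # minutes
    wild_type = [0.100, 0.150, 0.200, 0.250]     # hypothetical A405 readings
    mutant = [0.100, 0.1225, 0.145, 0.1675]      # hypothetical A405 readings
    wt_act = activity_mod_per_min_per_ul(times, wild_type)
    mut_act = activity_mod_per_min_per_ul(times, mutant)
    print(f"wild type: {wt_act:.2f} mOD/min/ul")
    print(f"mutant: {percent_of_wild_type(mut_act, wt_act):.0f}% of wild type")
```

Fitting a slope over several readings, rather than taking a two-point difference, makes the rate less sensitive to noise in any single well reading.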
For Western blotting analysis, whole bacterial cultures (BC), washed-cell lysates (WC), the soluble cytoplasmic/periplasmic fraction (CP + PP), the CE fraction, and GM were separately resolved by SDS-PAGE and proteins transferred to nitrocellulose membranes. The membranes were stained with Coomassie Brilliant Blue, photographed to document the protein load, and blocked for nonspecific binding sites with 2% albumin. Mouse monoclonal anti-Kgp and rabbit polyclonal anti-Rgp primary antibodies were used at 1:1,000 dilution. Secondary anti-mouse horseradish peroxidase-conjugated antibodies from rabbit (BD Pharmingen) and anti-rabbit antibodies from goat (Amersham Biosciences) were used at 1:15,000 dilution.
The expression of Kgp variants was further tested at the mRNA level by quantitative RT-PCR. All strains were grown from initial A600 = 0.1 until they reached 1.0. Total RNA was isolated using the TRI-reagent (Invitrogen) and incubated with DNase I (Ambion) to eliminate genomic DNA contamination. For the subsequent reverse transcriptase reaction (Applied Biosystems), 400 ng of RNA were used. Quantitative RT-PCR was performed with a Bio-Rad CFX96 real time PCR detection system using SYBR Green Jump Start (Sigma) and the following steps: 95°C for 5 min, followed by 40 cycles of 95°C for 30 s, 56°C for 30 s, and 72°C for 45 s. The final elongation step consisted of incubation at 72°C for 10 min. Melting curves were acquired on a SYBR channel from 60 to 95°C (0.5°C increment). The expression level of each variant was normalized to 16S rRNA by the ΔΔCT method and represented in relation to the wild type.
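For readers unfamiliar with the ΔΔCT normalization mentioned above, the following sketch shows the standard calculation: the target CT is first referenced to 16S rRNA within each strain, and the difference between mutant and wild type is then converted to a fold change. It assumes an amplification efficiency of 2 (perfect doubling per cycle); the function name and all CT values are hypothetical and are not data from this work.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control,
                        efficiency=2.0):
    """Fold change of a target transcript relative to a control strain.

    delta CT       = CT(target) - CT(reference gene), per strain
    delta-delta CT = delta CT(sample) - delta CT(control)
    fold change    = efficiency ** (-delta-delta CT)
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    delta_delta = delta_sample - delta_control
    return efficiency ** (-delta_delta)


if __name__ == "__main__":
    # Hypothetical CT values: kgp as target, 16S rRNA as reference.
    wild_type = {"kgp": 22.0, "16S": 10.0}
    mutant = {"kgp": 22.1, "16S": 10.1}
    fold = relative_expression(mutant["kgp"], mutant["16S"],
                               wild_type["kgp"], wild_type["16S"])
    print(f"kgp transcript, mutant relative to wild type: {fold:.2f}")
```

A fold change close to 1 indicates unchanged transcript levels, which is the pattern reported here for the Lys129 mutants.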
Crystallization and Diffraction Data Collection-Crystallization trials were set up at 20°C as sitting drop vapor diffusion experiments by an Innovadyne Screenmaker 96 + 8 Xtal robot with 96-well MRC plates. Crystals were obtained in conditions 1-2 of the MIDAS screen (Molecular Dimensions), which contained 0.1 M MES, 12% polyvinylpyrrolidone K15, pH 5.5. Optimization screens were performed by the hanging drop vapor diffusion method in 24-well plates. The best crystals were obtained with protein solution (at 12 mg/ml in 5 mM Tris-HCl, pH 7.4, 0.02% sodium azide) and 14% polyvinylpyrrolidone K15, 0.1 M Bis-Tris, pH 5.5, as reservoir solution from 1:1-μl drops at 20°C. The crystals were cryo-protected by soaking for 2 min in 50% polyvinylpyrrolidone K15, 0.1 M Bis-Tris, pH 5.5, and flash vitrified in liquid nitrogen. The crystals used for phasing experiments were immersed for 2-3 min in a solution of 0.1 M Bis-Tris, pH 5.5, 50% polyvinylpyrrolidone K15 plus either 0.5 or 1 M potassium iodide prior to flash vitrification. A native data set was collected at 0.9795 Å wavelength, and several data sets for sulfur and iodide SAD/SIRAS phasing experiments were collected at 1.7 and 1.8 Å wavelength, respectively, at Diamond Light Source Beamline I04 (Didcot).
Data Processing and Structure Solution-After analyzing several data set combinations and methods, the structure of Kgp-NPD was eventually solved by SIRAS employing a native and an iodide-derivative data set processed to 1.8 and 2.6 Å resolution, respectively, with the programs XDS and XSCALE (Ref. 50 and Table 3). The correct space group was identified as P2₁2₁2₁ based on the systematic absences. SIRAS calculations with program SHELXD (51) and data cut to a resolution limit of 3.7 Å enabled us to identify five sites with occupancies above 20%. This solution stood out because of the clear discrimination of the correlation coefficient (CC; 32% for all data; 25% for weak reflections with E < 1.5). Subsequently, phasing with program SHELXE (52) using a data set obtained by merging native and derivative data plus the iodide sites yielded calculated phase shifts. These showed small though clear discrimination in contrast, connectivity, and CCfree between the original and inverted substructures and gave an interpretable map. Phasing and main chain tracing were done with a beta version of SHELXE, which enforces goodness of tertiary structure on strand tracing. This led to a polyalanine model of 245 residues, which was subsequently refined with the REFMAC5 program (53). Two Kgp-NPD molecules (A and B) were contained in the crystal asymmetric unit and were completed in successive rounds of manual model building with program COOT (54) and crystallographic refinement with program BUSTER/TNT (55), which included TLS refinement. The final model, which was validated with program MolProbity (Ref. 56 and Table 3), contained Kgp residues Gln20-Ala201 of molecule A and Gly(−2)-Ser(−1)-Gln20-Gln199 of molecule B (except Ser128-Asp132) plus one azide molecule and 307 solvent molecules.
Overall, molecule A was more rigid and better defined in the final Fourier map than molecule B as suggested by a lower average thermal displacement parameter (31.9 Å² versus 41.6 Å²; Table 3), so the former molecule was taken as a reference for the "Results and Discussion" if not otherwise stated.
Bioinformatics-A homology model of the zymogen fragment of Kgp comprising NPD + CD + IgSF was obtained as follows: the structure of RgpB-CD + IgSF in complex with its NPD (PDB code 4IEF and Ref. 43) was superimposed onto Kgp-NPD (molecule A) with the SSM program (57) using the NPDs only. Afterward, the structure of Kgp-CD + IgSF (PDB code 4RBM and Ref. 42) was superimposed on the equivalent parts of RgpB. This provided an initial model for Kgp-NPD + CD + IgSF. Detailed visual inspection of this model with COOT revealed some loop regions involved in clashes. These regions were corrected in two rounds of manual remodeling with COOT and energy minimization with the Chimera program employing default parameters (58) plus a final regularization step with the geometry_minimization routine of the PHENIX suite (59). The latter included restraining metal-binding distances to tabulated values. The final homology model was assessed with MolProbity, which revealed the following parameters of protein geometry and all-atom contacts: clash-score for all atoms, 1.76 (99th percentile); poor rotamers, 2 (0.38% of 524 residues); Ramachandran outliers, 1 (0.16% of 630 residues); Ramachandran favored, 612 residues (97.1%); Cβ deviations > 0.25 Å, 0; residues with bad bond/angles, 0/0; MolProbity score, 1.09 (100th percentile). The regularized homology model of the Kgp zymogen can be obtained from the corresponding authors.
Secondary structure predictions were calculated with JPRED-4 (60). The sequence alignment in Fig. 2C was performed manually based on visual inspection with COOT of the structures previously superposed with the SSM program. Structure figures were prepared with Chimera and protein interfaces were analyzed with the PISA server (46). The coordinates of the final experimental model of Kgp-NPD from P. gingivalis strain W83 were deposited with the PDB under access code 5MUN.