A structure-derived snap-trap mechanism of a multispecific serpin from the dysbiotic human oral microbiome

Enduring host-microbiome relationships are based on adaptive strategies within a particular ecological niche. Tannerella forsythia is a dysbiotic member of the human oral microbiome that inhabits periodontal pockets and contributes to chronic periodontitis. To counteract endopeptidases from the host or microbial competitors, T. forsythia possesses a serpin-type proteinase inhibitor called miropin. Although serpins from animals, plants, and viruses have been widely studied, those from prokaryotes have received only limited attention. Here we show that miropin uses the serpin-type suicidal mechanism. We found that, similar to a snap trap, the protein transits from a metastable native form to a relaxed triggered or induced form after cleavage of a reactive-site target bond in an exposed reactive-center loop. The prey peptidase becomes covalently attached to the inhibitor, is dragged 75 Å apart, and is irreversibly inhibited. This coincides with a large conformational rearrangement of miropin, which inserts the segment upstream of the cleavage site as an extra β-strand in a central β-sheet. Standard serpins possess a single target bond and inhibit selected endopeptidases of particular specificity and class. In contrast, miropin uniquely blocked many serine and cysteine endopeptidases of disparate architecture and substrate specificity owing to several potential target bonds within the reactive-center loop and to plasticity in accommodating extra β-strands of variable length. Phylogenetic studies revealed a patchy distribution of bacterial serpins incompatible with a vertical descent model. This finding suggests that miropin was acquired from the host through horizontal gene transfer, perhaps facilitated by the long and intimate association of T. forsythia with the human gingiva.

The authors declare that they have no conflicts of interest with the contents of this article. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
The atomic coordinates and structure factors (codes 5NCS, 5NCT, 5NCU, and 5NCW) have been deposited in the Protein Data Bank (http://wwpdb.org/).
1 Both authors contributed equally to this work.
2 Present address: EMBL Outstation Grenoble, 71 Ave. des Martyrs, CS 90181, 38042 Grenoble Cedex 9, France.
3 To whom correspondence should be addressed. E-mail: xgrcri@ibmb.csic.es.
4 Please note that the JBC is not responsible for the long-term archiving and maintenance of this site or any other third party hosted site.
5 The abbreviations used are: PD, periodontal disease; CEP, cysteine endopeptidase; SEP, serine endopeptidase; α2M, α2-macroglobulin; RSB, reactive-site bond; RCL, reactive-center loop; RMSD, root mean square deviation; SI, stoichiometry of inhibition; PDB, Protein Data Bank.

The human oral microbiome was first described in 1683 by Antoni van Leeuwenhoek (1). It is the second largest microbial community after the gut microbiome (2) and comprises over 600 species or phylotypes (http://www.homd.org) 4 (3). Under conditions of dysbiosis, it is dominated by opportunistic pathogens causing periodontal disease (PD) 5 and likely contributing to the development of systemic diseases (4,5). PD is a chronic inflammatory disease driven by bacteria in the subgingival dental plaque, which occurs in 5-20% of the adult population worldwide (6). It is mainly caused by the "red complex" (7), a bacterial consortium including Tannerella forsythia, Porphyromonas gingivalis, and Treponema denticola, which exclusively reside in periodontal pockets (8). Analysis of contemporary and ancient oral microbiomes revealed that these pathogenic bacteria have inhabited our oral cavity over several thousand years (5). Indeed, paleomicrobiological studies showed that all three species were abundant in ancient dental calculus samples from medieval individuals with PD dated to ~800-1000 years ago (5). P. gingivalis and members of the Tannerella and Treponema genera were also identified in Polish mesolithic/paraneolithic, German neolithic, and English Bronze Age calculus samples dated to ~7,550-5,450, ~7,400-6,725, and ~4,200-3,000 years ago, respectively (9). Moreover, P. gingivalis and T. denticola were detected in the mummy of
the Tyrolean iceman "Ötzi" from ~5,300 years ago, who also suffered from PD (10). In addition, P. gingivalis was found in human dental calculus samples from Chile and Argentina dated to ~3,700-4,500 years ago (11). Finally, the three red-complex members were recently identified in ~49,000-year-old dental calculus samples from Homo neanderthalensis (12), which suffered from dental caries and PD like modern humans (13) and likely interbred with them across Eurasia (14). The persistence of these bacterial species within the human oral microbiome can only be explained by the development of adaptive mechanisms to thrive in a harsh environment, which is characterized by competing microorganisms and host defenses. T. forsythia is an anaerobic, Gram-negative bacterium from the Bacteroidetes phylum that is associated with grievous PD (15). It inhabits dental plaques in periodontal pockets (16) that contain an inflammatory exudate: gingival crevicular fluid. This fluid is rich in defensive cysteine endopeptidases (CEPs) and serine endopeptidases (SEPs), such as neutrophil elastase and cathepsin G, from the host immune system (17). It also contains extracellular peptidases engaged in virulence and colonization from T. forsythia (18) and other red-complex partners, which compete for resources at the site of infection. Indeed, P. gingivalis secretes the CEPs calpain-like peptidase Tpr (19), periodontain (20), and gingipains K and R (21), as well as SEP PepK (22); and T. denticola contributes with CEP dentipain (23) and SEP dentilysin (24). To keep these peptidases in check, T. forsythia possesses miropin, a serine proteinase inhibitor from the serpin family (25)(26)(27).
Serpins are comparatively large proteins of ~350-400 residues that are grouped into peptidase inhibitor family I4 in the MEROPS database (http://merops.sanger.ac.uk) 4 (28). They span over 3,000 members and also include noninhibitory variants with other functions (27,29,30). Serpins have been extensively studied in humans and other mammals, where they participate in inflammation, coagulation, fibrinolysis, intracellular signaling, and complement activation (29,31). They are also widespread in other animals, plants, and some viruses (32). In contrast, they are only sporadically found in prokaryotes (33,34), where they have been mainly studied from environmental microbiota (34). Here, their precise biological roles remain obscure (27).
Generally, serpins target, in a highly specific manner, certain chymotrypsin-like and/or subtilisin-like SEPs, as well as CEPs from the papain, cathepsin, and caspase families (31,32,35,36). These inhibitors are one of three covalent suicide inhibitor families (37), which also include the α2-macroglobulins (α2Ms; family I39) (38) and the relatives of baculovirus p35 protein (family I50). Serpins behave like pseudo-substrates, and inhibition is initiated by formation of the Michaelis complex, which occurs without significant structural changes in either enzyme or serpin (39). Subsequently, nucleophilic attack of the catalytic hydroxyl or sulfhydryl group from the target SEP or CEP, respectively, on the scissile carbonyl of a specific reactive-site bond (RSB; bond P1-P1′, nomenclature of substrate subsites in the active-site cleft according to Ref. 40) cleaves the bond and forms a covalent (thio)acyl-enzyme intermediate. The RSB occurs in an exposed, flexible 20-24-residue reactive-center loop (RCL) (41) or serpin binding loop (42). At this stage, a large conformational rearrangement of the serpin (29,32) causes the loose segment upstream of the cleavage site to insert as a new strand into a central β-sheet of the inhibitor moiety (sheet sA). This occurs under translocation of the covalently attached peptidase (27,32,43). The residues in P1 and P1′ are then far apart and thus unavailable for enzymatic resynthesis of the RSB, a reaction that has been described for some standard-mechanism protein inhibitors (44), so the cleavage reaction is irreversible. Furthermore, the steric collisions arising from translocation deform the peptidase (41). The peptidase becomes unable to catalyze hydrolysis of the (thio)acyl-enzyme intermediate, which is chemically stable and resistant to SDS-PAGE, and becomes strongly susceptible to proteolysis (41).
This inhibitory mechanism also entails that the serpin undergoes a "stressed-to-relaxed" transition (45,46) between an intact, high-energy, "S" or native conformation and a cleaved, triggered, low-energy, "R" or induced conformation, which is reminiscent of a snap trap and thermodynamically driven by the energy derived from strand insertion. The process results in an induced serpin that is substantially more stable than the native molecule (47). Some intact serpins have been further described in two additional conformers: the "latent form" and the "δ form," but these forms are rare (29,32).
Since the discovery of the first inhibitory serpin structures of induced α1-trypsin inhibitor, which is also called α1-proteinase inhibitor (48), and native human antithrombin (49), a number of structural studies have contributed to shedding light on the working mechanism of animal, plant, and viral serpins (see Refs. 29,31,32,37,46, and 50 for reviews). However, the only prokaryotic serpins that have been structurally investigated to date are Thermobifida fusca thermopin (51,52) and Caldanaerobacter subterraneus tengpin (53,54), both of which originate in environmental thermophiles. To expand these data, we assessed the inhibitory capacity of miropin against physiologically relevant SEPs and CEPs and found that it uniquely blocks many serine and cysteine endopeptidases of disparate architecture and substrate specificity owing to several potential target bonds within the RCL. We further investigated the structure-based molecular mechanism of miropin, which is the first of a bacterial serpin from a human microbiome member of biomedical relevance, and found that broad specificity is due to plasticity in accommodating extra β-strands of variable length. To this aim, we analyzed its native and peptidase-induced wild-type forms, as well as two mutants affecting the RCL and a mutant ablating a disulfide bond between vicinal residues. We further performed phylogenetic studies to hypothesize about the evolutionary origin of miropin and other bacterial serpins and suggest that it was acquired from the host through horizontal gene transfer.

Miropin broadly inhibits SEPs and CEPs through various reactive-site bonds
T. forsythia miropin has a signal peptide (Met1-Ala16; miropin residue numbering), which is cleaved off upon secretion. This leaves a cysteine at the N terminus (Cys17), which suggests post-translational modification for insertion into the membrane through a "lipobox"-like mechanism (55) and location to the outer layer of the outer membrane (27). For functional and structural studies, we produced a soluble fragment spanning residues Glu39-Glu408 (hereafter referred to as "wild-type miropin") and found it had higher activity (data not shown) than the previously described N-terminally extended variants (27). This suggests that the first ~20 residues of the secreted protein are dispensable for function and likely arranged as a flexible spacer from the bacterial membrane surface across the periplasm. The periplasmic location of miropin contrasts with that of most animal serpins, which are secreted and soluble (32). This paradox is reminiscent of α2M family inhibitors, whose animal members are also generally secreted to the circulation (38), whereas bacterial members undergo post-translational modifications for periplasmic location similar to miropin (56).
Miropin inhibited physiologically relevant SEPs such as trypsin, neutrophil elastase, pancreatic elastase, subtilisin, and cathepsin G, which have disparate substrate specificities and architectures, but not chymotrypsin and thrombin (Fig. 1, A and B) (27). Moreover, to assess whether the broad inhibitory spectrum of miropin could be extended to CEPs, we tested papain (Fig. 1, A and B) and two physiologically relevant peptidases secreted by red-complex partner P. gingivalis at the site of infection, viz. calpain-like peptidase Tpr and gingipain K. 6 When covalent enzyme-inhibitor complexes were incubated, trapped peptidases underwent processing over time because of destabilization, as is usually observed in serpin complexes (41). Only fragments spanning up to ~3 kDa remained covalently attached through the catalytic residue to the C-terminal residue of the miropin-cleaved bond. For inhibition, miropin employed not only the theoretic RSB, P1-P1′ (Thr369-Ser370; assigned based on Ref. 27), but also upstream bonds P2-P1 (Lys368-Thr369) and P3-P2 (Val367-Lys368), i.e. up to two positions upstream within the RCL. In stark contrast, the overwhelming majority of inhibitory serpins described are specific for one or a few endopeptidases of equivalent specificity and one class and employ a single RSB as bait (57)(58)(59)(60). In these cases, cleavage outside the RSB within the RCL results in inactivated serpins without endopeptidase inhibition, similarly to snap traps that are triggered but do not catch the prey (29,61,62).
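The subsite labels used above follow from simple arithmetic once P1 is fixed. The following minimal sketch assumes the P1 assignment quoted in the text (Thr369); the helper name `p_label` is hypothetical and only illustrates the Schechter and Berger convention, in which non-primed positions count upstream from P1 and primed positions count downstream from P1′.

```python
def p_label(residue: int, p1: int) -> str:
    """Return the Schechter-Berger subsite label of a residue.

    Residues at or upstream of P1 are labelled P1, P2, P3, ...;
    residues downstream of the scissile bond are labelled P1', P2', ...
    """
    if residue <= p1:
        return f"P{p1 - residue + 1}"
    return f"P{residue - p1}'"


# P1 of miropin as assigned in the text (Thr369); used here as an input assumption.
MIROPIN_P1 = 369

for name, number in [("Glu", 353), ("Val", 367), ("Lys", 368),
                     ("Thr", 369), ("Ser", 370), ("Pro", 376)]:
    print(f"{name}{number}: {p_label(number, MIROPIN_P1)}")
```

With this numbering, the three bonds used as bait are Thr369-Ser370 (P1-P1′), Lys368-Thr369 (P2-P1), and Val367-Lys368 (P3-P2), and the RCL boundaries Glu353 and Pro376 map to P17 and P7′.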
To characterize inhibition in more detail and determine whether there was any preferred RSB in miropin, we produced two mutants in which the lysine in P2 of the wild type was moved to P1 (mutant K368A/T369K) or P3 (mutant V367K/K368A), whereas the pI of the molecule was kept intact. We then compared their inhibitory capacity against trypsin with that of the wild type. Similar mutation studies that moved the target bond had revealed cleavage, but no peptidase inhibition, in other serpins (32). Given the mechanism of serpins as suicide substrate-like inhibitors ("branched-pathway mechanism"; see Fig. 6 in Ref. 32), their inhibitory activity is usually assessed by the "stoichiometry of inhibition" (SI). This is defined as the number of molecules of serpin needed to inhibit one molecule of proteinase and is ~1 for most serpins. A second parameter, which is more relevant for efficacy, is the second-order association rate constant, k_ass, which is ~10⁵-10⁶ M⁻¹ s⁻¹ for most inhibitory serpins (63). Wild-type miropin and mutant K368A/T369K inhibited trypsin with SI ≈ 1.5, whereas V367K/K368A did so with SI ≈ 5. The k_ass values, in contrast, were very similar (~10⁵ M⁻¹ s⁻¹; Fig. 2, A-C), thus indicating that the RSBs are equivalent with respect to inhibitory power in miropin.
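To make the two parameters more tangible, the sketch below turns them into numbers. It rests on two textbook assumptions that are not taken from this article: that 1/SI approximates the fraction of serpin cleavage events ending in a trapped complex (branched-pathway model), and that free peptidase decays with pseudo-first-order kinetics when the inhibitor is in large excess. The function names and the 1 µM inhibitor concentration are hypothetical.

```python
import math


def trapped_fraction(si: float) -> float:
    """Fraction of serpin molecules that end up in a covalent complex (1/SI)."""
    if si < 1:
        raise ValueError("SI cannot be smaller than 1")
    return 1.0 / si


def half_life_s(k_ass: float, inhibitor_molar: float) -> float:
    """Half-life (s) of free peptidase, assuming inhibitor in large excess.

    Under this assumption the observed rate constant is k_ass * [I].
    """
    return math.log(2) / (k_ass * inhibitor_molar)


# SI values reported in the text for trypsin inhibition.
for label, si in [("wild type, K368A/T369K", 1.5), ("V367K/K368A", 5.0)]:
    print(f"{label}: SI = {si}, trapped fraction = {trapped_fraction(si):.2f}")

# k_ass of ~1e5 M^-1 s^-1 as reported; 1 uM inhibitor is an illustrative choice.
print(f"half-life at 1 uM inhibitor: {half_life_s(1e5, 1e-6):.1f} s")
```

Read this way, an SI of about 5 means that roughly four of five mutant molecules are consumed as substrates, while the similar k_ass values indicate that the initial encounter proceeds at a comparable rate for all three variants.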
Accordingly, miropin is unique in its inhibitory capacity against endopeptidases of variable specificity, from two enzyme classes. It employs various bonds of the RCL, which is reminiscent of the "bait region" from the likewise suicidal but otherwise structurally and mechanistically unrelated α2Ms (38,64).

Structure of native miropin
To shed light on the mechanistic basis of this broad specificity, we determined crystal structures of native and induced miropin. The structure of native wild-type miropin was solved using diffraction data to 3.0 Å resolution (Table 1). Miropin is an elongated ellipsoid with two similar equatorial semi-axes of ~35 Å and a long vertical semi-axis of ~80 Å (Fig. 3A; molecule orientation of reference). The structure contains a central sandwich consisting of two perpendicular, partially overlapping, twisted β-sheets, sA and sB (consensus nomenclature of serpins; see Ref. 32), of five vertical strands (from left to right, s6A, s5A, s3A, s2A, and s1A) and seven horizontal strands (from top to bottom, s0B-s6B; see Fig. 3, A and D), respectively. Sheet sB is antiparallel and includes an uppermost extra strand, s0B, that is absent from other serpins, whereas sA is mixed parallel-antiparallel. A third four-stranded antiparallel β-sheet (sC; left to right, s1C-s4C) is placed on top of sA, forming the roof of the molecule and making a second β-sandwich with the left half of sB. In sC, strands s1C, s2C, and the C-terminal segment of s3C are antiparallel. The latter strand is bent leftwards at half-length, so that its N-terminal half does not interact with neighboring strand s2C. Instead, it forms a β-ribbon with strand s4C, which protrudes from the front surface of the molecule. Overall, the three β-sheets constitute the central core of the protein, which is decorated with nine helices (hA-hI). All are located behind the sheets, except for frontal helix hF (Fig. 3A).
N-terminal helix hA runs horizontally along the back surface of miropin right to left, approximately at half-height of sA, and leads to the lowermost β-strand of sB, s6B (Fig. 3, A and D). This strand leads to hB, which nestles in the back surface of sheet sA and roughly parallels its strands. Helices hA-hE and hI are arranged as a helical cluster behind sA and shape the back surface of the lower half of miropin. Among them, hB, hC, and hD are contiguous and linked by short loops, and they connect s6B with s2A. Helix hE, in turn, is inserted between the latter strand and s1A, and helix hI is placed in an extended loop connecting strands s5A and s6A. In the upper part of the molecule, behind sB, two more tandem helices (hG and hH) shape the left back surface of the molecule and connect s3B with s0B. Finally, on the front surface of sA, helix hF and the downstream extended loop leading to s3A (loop-hF-s3A) cover sA like a "front flap." Interestingly, a highly strained disulfide bond links the side chains of vicinal residues Cys245 and Cys246 (Fig. 4A). To investigate the role of this unique feature for serpins, we constructed a mutant in which both residues were replaced with alanine (mutant C245A/C246A). We found that this mutant had inhibitory properties indistinguishable from wild-type miropin (data not shown). Next, because disulfide bridges generally contribute to protein stability, we further compared mutant and wild-type miropin by differential scanning calorimetry. These studies revealed a single peak for both species in the analyzed range, but the temperature at which the excess molar heat capacity was maximal was 4.7°C lower in the mutant (56.4°C versus 61.1°C; Fig. 4B). This reveals that the intramolecular disulfide bond plays a role in the stability, but not in the activity, of miropin. Native miropin exhibits an exposed RCL, which connects s5A with s1C and protrudes upwards from the molecular body (Fig. 3, A, D, and F). It spans 24 residues (Glu353-Pro376, corresponding to positions P17-P7′) and does not interact with the protein moiety between Ala360 and Thr375. This suggests that it is likely to be flexible and disordered in solution, so it can easily adapt to the active-site clefts of disparate prey peptidases. In the crystal structure, the RCL is defined in the final Fourier map (Fig. 3F) because of noncrystallographic-symmetry contacts. It contains a short 1.5-turn helix (hRCL; Thr362-Val367; P8-P3) shortly before the theoretic RSB (Fig. 3A), which is reminiscent of the native structures of noninhibitory serpin ovalbumin from chicken (65) and of a variant of human α1-antichymotrypsin (66). The RCL is subdivided into a "hinge region" (27) spanning Glu353-Val361 (P17-P9; Fig. 3A), which is essential for the conformational rearrangement of the RCL upon induction (see the next section), and an "exposed loop" region from P8 to P7′ (Thr362-Pro376; Fig. 3A), which contains the theoretic RSB (P1-P1′).
Another region of functional importance is the "breach": the point of initial strand insertion at the top of sA (see "Structures of induced miropin and reaction mechanism"). Following the consensus serpin architecture (67), the breach includes residues Phe201, Lys202, Gly203, Trp205, and Phe209 from s3A and loop-s3A-s4C; Met232 from s3C; Tyr253 from s2B; and Glu353 plus Gly355 from the RCL hinge region. Also relevant is the "shutter," engaged in sheet opening and other changes prior to strand insertion (see "Structures of induced miropin and reaction mechanism"), which in miropin encompasses Phe47 from hA; Ser67, Pro68, Ser70, and Leu75 from loop-s6B-hB and hB; Leu94 from hC; Ile168, Asn169, Cys172, Thr176, Asp178, and Ile180 from the front flap; Leu195 and Asn197 from s3A; Asn345 from s5A; Ala358 from the hinge region; and Leu398, Phe399, and Gly401 from s5B and the C-terminal tail. Collectively, the residues of shutter and breach contribute to a set of 51 positions, which are required for core structure integrity and are conserved among >70% of inhibitory serpins (see Table 2 in Ref. 67). In miropin, 45 positions are strictly conserved (Fig. 5A), and the exceptions are conservative replacements that do not interfere with the general core architecture: Cys172 instead of valine, Asp178 instead of glycine, Ala191 instead of threonine, Arg301 instead of lysine, Ile338 instead of leucine, and Val406 instead of proline. Moreover, inhibitory serpins also show a consensus sequence pattern in the hinge region, which is required for efficient and rapid strand insertion upon induction (see also "Structures of induced miropin and reaction mechanism"): E(E/K/R)G(T/S)X(A/G/S)₄ (42,68). Mutation of these residues in inhibitory serpins often resulted in inhibitor cleavage but not peptidase inhibition (67,69). Inspection of the miropin structure reveals that it matches the consensus sequence (353-EEGTEAAAV-361). Accordingly, the structure of native miropin fulfills all the structural requirements described for functional inhibitory serpins and contains a long and flexible RCL, which is potentially targetable by endopeptidases.

[Fig. 3 legend, opening truncated] …-hH and hRCL). The strands are arranged in three β-sheets (sheet sA, yellow strands s6A, s5A, and s3A-s1A; sheet sB, orange strands s0B-s6B; and sheet sC, red strands s1C-s4C). The RCL (Glu353-Pro376) connects strand s5A with s1C and is subdivided into the hinge region (Glu353-Val361; brown ribbon) and the exposed loop (Thr362-Pro376; blue ribbon). The residues flanking the theoretic RSB (P1-P1′; Thr369-Ser370) are shown for their side chains and labeled. The position of the P17 (Glu353) and P16 (Glu354) residues is indicated by lines. B, same as A but showing trypsin-induced wild-type miropin as representative of the induced miropin structures. β-Strand s4A from sheet sA, absent in A, is shown in the colors of the corresponding segment of A and labeled. Dipeptide Asp194-Ser195 of trypsin is covalently attached through atom Ser195 Oγ to the carbonyl of Lys368 after cleavage of bond P2-P1 (Lys368-Thr369). On the primed side, the chain is only defined from Ser373 (P4′) onwards. C, superposition of native and subtilisin-induced wild-type miropin as Cα-traces in cross-eye stereo depicting sheet sA and the RCL (respectively, in orchid and turquoise), and the flap consisting of helix hF and downstream loop hF-s3A (pink and blue). Upon induction, rotation around the curved orange arrow leads to strand insertion, which entails a left shift for strands s6A and s5A and a right shift for s3A, s2A, and s1A (small orange arrows). The visible ends of the cleaved region of induced miropin (Thr369, P1, and Pro376, P7′) are pinpointed by black arrows. D and E, topology scheme of native (D) and induced (E) miropin in the coloring of A and B. Relevant positions for the mechanism are depicted in the notation of Schechter and Berger (40) (P18-P8′; miropin residues Asp352-Ile377; see also Fig. 7 in Ref. 27). The exposed loop of the RCL (blue coil) spans P8-P7′ (Thr362-Pro376), includes helix hRCL, and contains the theoretic RSB (P1-P1′). The hinge region of the RCL (brown coil) spans P17-P9 (Glu353-Val361). Upon productive cleavage at the RCL and induction (red scissors, D), P15-P2 (Gly355-Lys368; trypsin cleavage at P2-P1) or P15-P1 (Gly355-Thr369; subtilisin cleavage at P1-P1′) becomes inserted into sheet sA as s4A (in brown/blue) between s3A and s5A. The residues of each regular secondary structural element are indicated in italics, and those differing in native and induced miropin are in red. The covalently attached endopeptidase is symbolized by a green and yellow ellipse. F, fragment of refined native miropin depicting the five strands of sheet sA (from left to right, s6A, s5A, s3A, s2A, and s1A), strand s1C in magenta, and the RCL in orange superposed with the final refined (2mFobs − DFcalc)-type Fourier map (turquoise mesh) shown with a zone radius of 2 Å and a contour level of 1 σ. Residues Glu353 (fulcrum) and Glu354 (rotation around bond Cα-C leads to strand insertion) are labeled, and the rotation occurring in the latter upon induction is pinpointed by a magenta arrow. G, same as F but showing trypsin-induced miropin around sheet sA only (from left to right, strands s6A-s1A) as magenta sticks, except for strand s4A and the preceding hinge region, in orange as in F. Trypsin dipeptide Asp194-Ser195 is shown as blue sticks at the sheet bottom. H, same as G but showing the structure of subtilisin-induced miropin. I, same as G but depicting the structure of trypsin-induced miropin V367K/K368A.

Structures of induced miropin and reaction mechanism
We obtained covalent complexes of induced wild-type miropin with two SEPs of disparate specificity and structure, viz. trypsin (to 1.6 Å resolution) and subtilisin (to 1.7 Å resolution) (Table 1), which cleaved the RCL at bonds P2-P1 and P1-P1′, respectively. These sites are consistent with the respective endopeptidase specificities (70,71). We also solved the structure of trypsin-induced mutant V367K/K368A (to 1.5 Å resolution), which was cleaved at P3-P2. The high resolutions of these structures contrast with the low resolution of the native structure, which is consistent with the difference between the relaxed, low-energy induced conformation and the stressed, metastable native conformation normally found in serpins. As mentioned above, the enzyme-inhibitor complexes underwent processing, so that the crystallized samples contained miropin covalently linked only to small fragments of trypsin and subtilisin via the Oγ atoms of the respective catalytic serines. These fragments could not be resolved in the crystal structures, with the exception of dipeptide Asp194-Ser195 (trypsin numbering) in the complex between wild-type miropin and trypsin (Fig. 3, B, C, E, and G).
Superposition of the three induced miropin structures shows they are practically indistinguishable (RMSD values of ~0.5 Å). As expected from an inhibitory serpin, induction cleavage causes the downstream segment of the RCL to loosely protrude from the top molecular surface and be flexible, following a similar chain trace to native miropin only from Pro376 onwards (Fig. 3, A and B). In contrast, the upstream segment of the RCL undergoes large rearrangement, which drags the covalently bound peptidase fragment from one pole of the serpin to the opposite (Fig. 3, A-C). This results from a ~180° rotation around bond Cα-C of P16 residue Glu354 (Fig. 3, A and B), next to "fulcrum" (29) residue Glu353 in P17, which causes the downstream polypeptide chain to be rotated ~90° downwards and inserted between sA strands s5A and s3A as new strand s4A (Gly355-Lys368). In the three induced structures, strand insertion is equivalent until Val367 (Lys367 in mutant V367K/K368A), which is the last residue buried in the molecular moiety and visible in the Fourier map of the mutant complex. The trypsin complex of wild-type miropin includes extra residue Lys368, which is linked to the catalytic serine of trypsin (Fig. 3G), and the subtilisin complex even further includes Thr369 as the last defined residue. Thus, miropin uniquely inserts segments spanning between 14 and 16 residues, apparently without destabilization of the complex, which provides a structural explanation for its flexible inhibitory capacity. This is exceptional, because although the length on the free C-terminal side of the reactive-site bond generally varies between five and nine residues (P1′-P5′/P9′) (32) among serpins, the segments for s4A strand insertion strictly span 16 or 17 residues (P1-P16/P17) (32). This has been hailed as critical for enzyme-inhibitor complex stabilization upon cleavage and rearrangement (29).
Serpin variants that were two residues longer or shorter were cleaved, but no inhibitory complexes were formed, because the peptidase was still able to hydrolyze the acyl-enzyme intermediate (72).
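The RMSD quoted above for the superposed induced structures is the root of the mean squared distance between equivalent atoms. The sketch below computes it for two coordinate sets that are assumed to be already superposed and matched atom by atom (for instance Cα atoms after a least-squares fit); it is not the procedure used by the authors, and the coordinates are hypothetical toy values.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation (same units as the input, e.g. Å).

    Both inputs are sequences of (x, y, z) tuples for equivalent atoms
    and are assumed to be superposed already; no fitting is done here.
    """
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical Cα positions: the second set is the first shifted by 0.5 Å along y.
structure_1 = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
structure_2 = [(0.0, 0.5, 0.0), (3.8, 0.5, 0.0), (7.6, 0.5, 0.0)]
print(f"RMSD = {rmsd(structure_1, structure_2):.2f} Å")
```

A value of about 0.5 Å, as reported for the three induced structures, lies well below their resolution limits of 1.5-1.7 Å, which is in line with describing them as practically indistinguishable.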
Strand insertion transforms sA from a mixed five-stranded β-sheet to a more stable antiparallel six-stranded β-sheet (Fig. 3, B, C, E, and G). This movement causes the cleaved residue upstream of the scissile bond, and thus the bound prey peptidase fragments, to be pulled downwards by ~75 Å. The insertion is accounted for by a horizontal sliding motion, a rigid-body left shift of strands s6A and s5A of up to ~2 Å, and a right shift of strands s3A, s2A, and s1A of maximally ~3.5 Å (Fig. 3C). This "stage curtain opening" mechanism was supported by analysis of the theoretical molecular flexibility of native and trypsin-induced wild-type miropin based on the elastic network model, which identified three hinge points centered at positions Tyr200-Phe201, Thr304-Cys306, and Thr347-Phe348 (scores of 0.85-0.92) (73). These are central residues of s3A, s6A, and s5A, respectively, which highlight the importance of the horizontal section at half-height of sA in conformational rearrangement upon induction. This rearrangement cascades down to elements from the shutter region: the C-terminal segment of the molecule moves ~1.5 Å to the left, and this causes sC β-ribbon s3C-s4C to be rotated backwards ~7°. Moreover, loop-hA-s6B from the shutter is shifted leftwards by ~2 Å. On the right-hand side of sA, displacement of s2A causes slight rearrangement of the C-terminal turn of hD, which leads loop-s4B-s5B to be shifted backwards ~2 Å. However, the most relevant of all the changes affects the front flap (Fig. 3C). This structural element may play a role in stabilizing the five-strand conformation in native serpins, thus preventing untimely triggering of the rearrangement, and in controlling opening of the sheet (32,74). The front flap has to be pulled away to allow for strand insertion and then positioned back to protect the refurbished β-sheet sA in what is known as a "coupling mechanism" (see Fig. 10 in Ref. 32).
In miropin, displacement of s1A causes downstream helix hF and loop-hF-s3A from the flap to be rearranged under maximal displacement of ~3.5 Å (at N177), in particular because of a χ1-rotation of the side chain of Tyr200. In contrast to all these changes, the helical cluster behind sA and the rest of the molecule remain essentially unchanged upon induction.
The induced structures also explain the aforementioned conservation of the hinge region in miropin: generally small residues are required to prevent clashes after strand insertion with side chains behind sheet sA, which are unaltered upon induction. Indeed, a glycine (Gly355) is the only residue allowed in P15 to avoid collision with Trp205 and Tyr253; maximally a threonine (Thr356) is allowed in P14 because of the aforementioned aromatic residues plus Met260, Val349, and Val351; and small or middle-sized side chains are also required at the positions of downstream s4A residues Ala358, Ala360, Thr362, and Val364. Notably, a methionine (Met366) is found at P4, whose bulky hydrophobic side chain contributes to a hydrophobic cluster with Phe192, Met193, and Leu195 from neighboring strand s3A; upstream s4A residue Val364; Ile338 and Ile340 from the segment preceding s5A; Thr78 from hB; and Phe330 plus Ile333 from loop-hI-s5A. Taken together, these methionine-mediated interactions provide a strong anchor for s4A to the subjacent moiety in miropin, as found in plasminogen activator inhibitor 1 and peptidase inhibitor 6 among human serpins, which have an equivalent methionine (32). To sum up, structures of induced miropin provide the basis for its potential targetability by endopeptidases of different architecture and specificity.
In turn, sequence identity searches revealed close matches within bacteria (see Fig. 5A for a selection). Miropin is the only serpin of T. forsythia, and several potential orthologs were found in other Tannerella species that displayed pairwise sequence identities of 57-76%. In contrast, red-complex partners P. gingivalis and T. denticola lacked serpins, despite sharing the habitat with T. forsythia, possibly because they themselves secrete several SEPs and CEPs that would be cancelled out by endogenous serpins (78). However, other Porphyromonas strains, such as human intestinal Porphyromonas uenonis and canine gingival Porphyromonas cangingivalis and Porphyromonas crevioricanis, encoded potential serpins sharing 26-49% sequence identity, but they clustered separately from miropin (Fig. 5B). Clustering with miropin were sequences from Bacteroides species residing in the human gastrointestinal tract, with identities spanning 33-50%. However, as within the genus Porphyromonas, other Bacteroides species, such as Bacteroides fragilis and Bacteroides thetaiotaomicron, lacked serpins.
Serpin sequences were also found in other human symbionts such as Bifidobacterium and Prevotella (27,33,79) and in human pathogenic Mycobacterium ulcerans but not in Mycobacterium tuberculosis. Their presence was also detected in bacteria of distinct environmental origin, such as the plant symbiont Rhizobium leguminosarum, the free-living bacteria Streptomyces albus and Bacillus subtilis, and the cyanobacteria Anabaena variabilis and Arthrospira platensis.
The high structural similarity and sequence identity of miropin to the human intracellular serpin SCCA1 (37%) is not an isolated case. Other bacterial serpin sequences shared even higher identities with eukaryotic homologs, such as that from Chondromyces crocatus, a myxobacterium isolated from dead herbarium specimens and animal faeces (80). This potential serpin shares 46% identity with the potential homolog from the American sparrow Zonotrichia albicollis, and it clusters together with the eukaryotic sequences (Fig. 5B).
To sum up, the presence of serpins and serpin-like sequences in bacteria is widespread and includes free-living, pathogenically invasive, symbiotic colonizing and saprophytic species, which indicates that bacterial serpins can be housekeeping proteins but may also participate in colonization or virulence by providing protection against attacking peptidases of human or bacterial origin (27,29). However, taken together, the occurrence of bacterial serpin sequences is patchy.

Concluding remarks
Our resident microbes play a pivotal role in health and disease, and they can be envisaged as an additional organ contributing to our human condition (81). Unfortunately, we know remarkably little about their diversity, variation, and evolution, so unveiling the molecular mechanisms they have derived to adapt to us and to microbial competitors is key to our understanding of both healthy symbiosis and pathogenic dysbiosis.
We here describe for the first time the mechanism of action of the bacterial serpin that enables T. forsythia to better thrive in the harsh and crowded human oral cavity. Miropin is capable of broadly inhibiting SEPs and CEPs of disparate classes, architectures, and substrate specificities. This is achieved by offering several target bonds of the RCL for cleavage within a bait region, instead of a single RSB as found in canonical serpins. In addition, promiscuous inhibition is facilitated by the capacity to insert strands deviating from the canonical length into the central sheet sA, while keeping the prey peptidase bound and inactivated. Despite the apparent flexibility of RCLs, so far only three serpins have been shown to inhibit target proteases using two separate, though overlapping, sites (57-60). Hence, the structural adaptation of miropin to provide a relaxed inhibitory specificity, which allows for formation of inhibitory complexes using different sites, is unique among serpins. It can be hypothesized that this adaptation evolved in response to the highly proteolytic environment of the dysbiotic bacterial biofilm, which is crowded with microbial species secreting a broad array of SEPs and CEPs and saturated with host phagocyte-derived proteases. For T. forsythia, a bacterium whose integrity depends on the semi-crystalline surface layer of proteins (S layer) (82), protection from proteolytic degradation is a matter of life or death. To this end, outer membrane-anchored miropin, with its ability to inhibit a variety of both host and bacterial proteases, seems to be a perfect solution. It would be interesting to see whether other miropin-like serpins from highly proteolytic environments have the same propensity. A long RCL, which in many cases evinces one or two additional residues, is found in putative serpins from gut- and dental-plaque-dwelling Bacteroides sp. and Prevotella sp. but not in environmental specimens, which suggests that these serpins may also possess a relaxed inhibitory specificity. Also in this context, it is interesting that Tannerella species from the oral microbiome carry a cluster of four genes encoding serpins that are very closely related to miropin and comprise an RCL of variable length. Analysis of the inhibitory properties of further miropin-like serpins from species inhabiting different environments will test this hypothesis.
Because of the wide distribution of serpins in animals and plants (for example, 36 paralogs are found in humans) (36), they should also be broadly and homogeneously present among ancestral organisms if they had an ancient origin (34). However, proteinase inhibitors are generally rare within prokaryotes (33), and this also holds for serpins: their distribution is patchy in bacteria, where they have only been identified recently (34). This is incompatible with a vertical descent model and would only be partially explained by massive loss-of-function events during evolution, so bacterial serpins may not share a common ancestor. Instead, xenologous horizontal gene transfer from our interfacial epithelial cells to the bacterial cells of our microbiomes, partially supported by phylogenetic data, may explain the possible origin and evolution of at least a fraction of bacterial serpins, including miropin (27,34,67,83). The intimate host-microbiome interaction and persistence, which can be traced back several millennia for red-complex partners, would support such transfer. It is further backed by the proposal that an exchange of DNA in plaque biofilms by a transformation-like process may provide an important ecological advantage for the survival and persistence of oral microbiome bacteria (84). Finally, this hypothesis is in line with those postulated for the origin of other human microbiome effectors, such as bacterial α2Ms (56), and bacterial metallopeptidases, such as T. forsythia karilysin (85) and B. fragilis fragilysin (86).

Protein production and purification
A fragment of T. forsythia strain ATCC43037 miropin (GenBank™ code WP_041590947; UniProt code G8UQY8) spanning residues Glu39-Glu408 (mutation R174Q attributed to natural variability within T. forsythia strains, hereafter referred to as "wild-type miropin") was produced from a construct derived from plasmid pGEX-6P-1_Tfs46 (27), which attaches an N-terminal glutathione S-transferase (GST) tag and a PreScission protease cleavage site to the protein of interest, by employing the Phusion site-directed mutagenesis kit (Thermo Fisher Scientific) and phosphorylated forward primer GAAAAGATAGAAAAAGACAATGCCTTTGCCTTC and reverse primer GGATCCCAGGGGCCCCTGGAAC. The resulting plasmid was transformed into Escherichia coli Rosetta (DE3) cells, which were grown in Luria Bertani medium supplemented with ampicillin (100 μg/ml) and chloramphenicol (33 μg/ml) at 37°C to an A600 of 0.75-1 and then incubated for 30 min at 4°C. Recombinant protein expression was induced with 0.1 mM isopropyl-β-D-1-thiogalactopyranoside. After 6 h at 20°C, cells were harvested by centrifugation (15 min, 6,000 × g, 4°C), resuspended in PBS plus 0.02% sodium azide (15 ml per pellet from 1 liter of culture), and subsequently lysed by sonication (cycle of 30 × 0.5 s pulses at 70% amplitude per pellet from 1 liter of culture) using a Branson Digital 450 Sonifier (Branson Ultrasonics). Cell lysates were clarified by centrifugation (40 min, 40,000 × g, 4°C) and loaded onto a glutathione-Sepharose 4 Fast Flow column (GE Healthcare Life Sciences; bed volume, 10 ml), previously equilibrated with PBS plus 0.02% sodium azide at 4°C. Tag-free recombinant miropin was obtained by in-column cleavage of the GST moiety with PreScission protease (GE Healthcare Life Sciences), which left five residues (GPLGS, numbered −5 to −1) attached to the N terminus of the protein.
Protein-containing fractions were pooled, concentrated, and further purified by size-exclusion chromatography in a HiLoad 16/600 Superdex 75-pg column (GE Healthcare) previously equilibrated with 5 mM Tris-HCl, 50 mM sodium chloride, 0.02% sodium azide, pH 8.0, using an ÄKTA Purifier 900 FPLC system (GE Healthcare) at a flow rate of 1.5 ml/min. Protein identity and purity were assessed by 15% Tricine SDS-PAGE stained with Coomassie Blue, peptide-mass fingerprinting of tryptic protein digests, N-terminal sequencing through Edman degradation, and mass spectrometry. The latter three approaches were carried out at the Protein Chemistry Service and the Proteomics Facilities of the Centro de Investigaciones Biológicas (Madrid, Spain). Ultrafiltration steps were performed with Vivaspin 15 and Vivaspin 500 filter devices of 5-kDa cutoff (Sartorius Stedim Biotech). Protein concentrations were estimated from the respective theoretical extinction coefficients by measuring A280 in a spectrophotometer (NanoDrop). When required, the concentrations were also determined more precisely with the BCA protein assay kit (Thermo Scientific) with bovine serum albumin as a standard.
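The A280-based estimate described above follows the Beer-Lambert law (A = ε·c·l). The following is a minimal sketch of that conversion, not the procedure actually scripted in this work. The function name, the extinction coefficient, and the molecular mass are hypothetical placeholders and are not the values for miropin.

```python
def concentration_mg_per_ml(a280, ext_coeff_m1_cm1, mol_weight_da, path_cm=1.0):
    """Convert an A280 reading into mg/ml using A = epsilon * c * l.

    ext_coeff_m1_cm1 is the theoretical molar extinction coefficient
    (M^-1 cm^-1) and mol_weight_da the molecular mass in Da (g/mol).
    """
    if ext_coeff_m1_cm1 <= 0 or path_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    molar = a280 / (ext_coeff_m1_cm1 * path_cm)  # mol/liter
    return molar * mol_weight_da  # g/liter, numerically equal to mg/ml


# Placeholder inputs for illustration only (assumed, not measured values).
example = concentration_mg_per_ml(0.8, 40000.0, 42000.0)
```

The path length defaults to 1 cm on the assumption that the instrument reports path-normalized absorbance; this should be checked for the spectrophotometer in use.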

Peptidase cleavage assays
Native wild-type miropin and double mutants V367K/K368A and K368A/T369K were used to assay inhibition of SEPs and CEPs through incubation for different time spans (30 s to 1 h) at room temperature. SEPs tested included trypsin from bovine pancreas, elastase from porcine pancreas (both from Sigma-Aldrich), subtilisin Carlsberg from B. subtilis (CalBiochem), and human neutrophil elastase (Elastin Products Company, Inc.). CEPs tested included papain from papaya latex (Sigma-Aldrich). The reactions were carried out in 50 mM Tris-HCl, 150 mM sodium chloride, pH 7.5, at peptidase:miropin ratios between 1:1 and 1:2. The reactions were stopped through inhibition with 4 mM Pefabloc SC (Roche Life Sciences) or precipitation with 2,2,2-trichloroacetic acid (Sigma-Aldrich). The reaction products were visualized directly on SDS-PAGE or after purification by size-exclusion chromatography.

Differential scanning calorimetry
The thermostability of wild-type miropin and mutant C245A/C246A was studied using a NANO DSC III apparatus (model 6300) with capillary cells. Samples in 5 mM Tris-HCl, 50 mM sodium chloride, 0.02% sodium azide, pH 8.0, were diluted with the same buffer to 0.2 mg/ml, degassed for 10 min, and centrifuged at 20,000 × g for 10 min. Buffer was used for controls. Measurements were performed within the temperature range 25-75°C, with a scanning rate of 1°C/min. The thermograms of the protein solutions and the buffer were used to calculate excess molar heat capacity curves using NanoAnalyze software (TA Instruments).
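As a rough illustration of the buffer-subtraction step, the sketch below subtracts a buffer scan from a protein scan point by point and reports the temperature at the peak of the difference curve. It is a simplified stand-in under stated assumptions and not the NanoAnalyze procedure, which also handles baseline fitting and molar normalization. The function names and the numbers are hypothetical.

```python
def excess_heat_capacity(sample_cp, buffer_cp):
    """Subtract the buffer thermogram from the sample thermogram."""
    if len(sample_cp) != len(buffer_cp):
        raise ValueError("scans must be recorded at the same temperatures")
    return [s - b for s, b in zip(sample_cp, buffer_cp)]


def apparent_tm(temperatures, excess_cp):
    """Return the temperature at which the excess heat capacity peaks."""
    if not excess_cp or len(temperatures) != len(excess_cp):
        raise ValueError("temperatures and heat capacities must pair up")
    peak_index = max(range(len(excess_cp)), key=lambda i: excess_cp[i])
    return temperatures[peak_index]


# Made-up scans in arbitrary units, for illustration only.
temps = [25.0, 35.0, 45.0, 55.0, 65.0, 75.0]
sample = [1.0, 1.2, 2.0, 4.5, 2.1, 1.1]
buffer_scan = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
tm = apparent_tm(temps, excess_heat_capacity(sample, buffer_scan))
```

Peak picking on raw points only resolves the transition to the sampling interval; a fitted transition model would be needed for a more precise midpoint.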

Stoichiometry of inhibition
The number of molecules of miropin needed to inhibit one molecule of target protease (SI) was determined as previously reported (27) by incubating constant amounts of active-site titrated bovine trypsin (50 nM) in 100 mM Tris-HCl, 150 mM sodium chloride, 5 mM calcium chloride, 0.02% Tween 20, pH 7.6, with increasing concentrations of miropin (wild type, V367K/K368A, or K368A/T369K) to yield molar ratios of enzyme-inhibitor ranging from 0 to 5. After 15 min of incubation at 37°C, an equal volume of a solution of chromogenic trypsin substrate Nα-benzoyl-L-Arg-4-nitroanilide hydrochloride (3 mM) was added, and enzymatic hydrolysis of the substrate was monitored for 30 min at 37°C at λ = 410 nm in a SpectraMax microplate reader. Residual activity was plotted as a function of the molar ratio of miropin:proteinase. The SI was considered to be the value at which the fitted line intersected the x axis.
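The x-intercept readout described above can be sketched as an ordinary least-squares line through the residual activities. This is a minimal illustration, assuming that only the linear portion of the titration is passed in; the function name and the data points are hypothetical and are not results from this study.

```python
def stoichiometry_of_inhibition(ratios, residual_activity):
    """Return the x-intercept of a least-squares line through the points.

    ratios: molar ratios of inhibitor to proteinase.
    residual_activity: residual activity at each ratio (any consistent unit).
    """
    n = len(ratios)
    if n < 2 or n != len(residual_activity):
        raise ValueError("need two or more paired observations")
    mean_x = sum(ratios) / n
    mean_y = sum(residual_activity) / n
    sxx = sum((x - mean_x) ** 2 for x in ratios)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(ratios, residual_activity))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # The line crosses zero residual activity at x = -intercept / slope.
    return -intercept / slope


# Idealized titration (percent residual activity), for illustration only.
ratios = [0.0, 0.5, 1.0, 1.5]
activity = [100.0, 75.0, 50.0, 25.0]
si = stoichiometry_of_inhibition(ratios, activity)
```

Points at or beyond full inhibition flatten the curve and would bias the intercept, so they are assumed to have been excluded before fitting.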

Determination of the association rate constant
Kinetic parameters of inhibition of trypsin by miropins were determined by the progress curve method (87). Briefly, mixtures containing constant concentrations of substrate (Boc-Gln-Ala-Arg-MCA; at 40 μM) and increasing concentrations of miropin (wild type, V367K/K368A, or K368A/T369K) in a total volume of 100 μl were prepared in microtiter plates. Next, 100 μl of trypsin (0.2 nM) was added, and the rate of substrate hydrolysis was recorded (λexc = 360 nm, λem = 460 nm) employing a SpectraMax Gemini XS microplate reader (Molecular Devices). The pseudo-first-order association rate constant, kobs, was determined as previously described (27).
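For orientation, a progress curve under irreversible inhibition is commonly modeled as P(t) = (v0/kobs)(1 − exp(−kobs·t)), that is, with a final steady-state velocity of zero. The sketch below recovers kobs from such a trace by linear regression of ln(plateau − P) against time. It is an illustrative simplification that assumes a known plateau and noise-free data, and it is not necessarily the fitting procedure of Ref. 27. All names and numbers are hypothetical.

```python
import math


def progress_curve(t, v0, k_obs):
    """Product formed at time t when the enzyme is inactivated irreversibly."""
    return (v0 / k_obs) * (1.0 - math.exp(-k_obs * t))


def estimate_k_obs(times, product, plateau):
    """Estimate k_obs as minus the slope of ln(plateau - P(t)) versus t."""
    xs, ys = [], []
    for t, p in zip(times, product):
        remaining = plateau - p
        if remaining > 0:  # the logarithm is only defined below the plateau
            xs.append(t)
            ys.append(math.log(remaining))
    if len(xs) < 2:
        raise ValueError("need at least two points below the plateau")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx


# Simulated, noise-free trace in arbitrary units (not data from this study).
times = [30.0 * i for i in range(11)]
trace = [progress_curve(t, v0=2.0, k_obs=0.01) for t in times]
k_est = estimate_k_obs(times, trace, plateau=2.0 / 0.01)
```

With real fluorescence traces, a nonlinear fit of the full equation is usually preferable, because the log transform amplifies noise near the plateau.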

Crystallization and diffraction data collection
For crystallization, native wild-type or V367K/K368A mutant miropin was reacted with either trypsin or subtilisin, and the residual activity was inhibited by adding Pefabloc SC or phenylmethylsulfonyl fluoride (Roche). Proteins were subsequently buffer-exchanged to 20 mM Tris-HCl, 2 mM sodium chloride, pH 7.5, and further purified by ion-exchange chromatography in a TSKgel DEAE-2SW column (TOSOH Bioscience) equilibrated with the same buffer. A gradient of 4-60% of 20 mM Tris-HCl, 500 mM sodium chloride, pH 7.5, was applied over 20 ml, and samples were collected and pooled. Finally, each pool was concentrated by ultrafiltration and subjected to size-exclusion chromatography in a Superdex 75, 10/300 column (GE Healthcare) equilibrated with 20 mM Tris-HCl, 150 mM sodium chloride, pH 7.5. Under these conditions, only small peptidase fragments of 370-3,580 Da, as measured by MALDI-TOF analysis, remained covalently attached through the catalytic serine to induced miropin forms because of autoproteolysis. These specimens were crystallized. In parallel, a certain fraction of intact enzyme-inhibitor complexes with trypsin or subtilisin could be isolated by rapid inhibition of the SEPs and subsequent size-exclusion chromatography, but these samples did not crystallize.
Crystallization assays were performed by the sitting-drop vapor diffusion method using 96 × 2-well MRC plates (Innovadyne) at the joint Molecular Biology Institute of Barcelona/Institute for Research in Biomedicine (IRB) Automated Crystallography Platform at Barcelona Science Park. To this end, reservoir solutions were prepared with a Tecan robot and 100-nL crystallization drops were dispensed with a Phoenix nanodrop robot (Art Robbins) or a Cartesian Microsys 4000 XL robot (Genomic Solutions). The plates were stored in Bruker steady-temperature crystal farms at 4 or 20°C. Successful conditions were scaled up to the microliter range in 24-well Cryschem crystallization dishes (Hampton Research). The best crystals of native wild-type miropin were obtained at 20°C from 1 μl:1 μl drops of protein solution (at 6-15 mg/ml concentration in 20 mM Tris-HCl, pH 7.4, 100 mM sodium chloride) and 2.4 M disodium malonate, pH 7.0, as reservoir solution. In turn, all induced miropin variants were crystallized similarly, but the reservoir solution contained 200 mM sodium iodide, 100 mM Bis-Tris, 20% (w/v) polyethylene glycol 3350, pH 6.5, instead. Carefully washed and dissolved crystals were analyzed by N-terminal Edman degradation and mass spectrometry, which revealed that native crystals contained intact miropin spanning residues Glu39-Glu408. In contrast, induced miropin variants were cleaved in the N-terminal segment (after Ile41 or Lys40) and within the RCL (trypsin/wild type, after Lys368; subtilisin/wild type, after Thr369; and trypsin/mutant V367K/K368A, after Lys367).
Crystals were cryoprotected by rapid passage through drops containing increasing concentrations of either 2.4-3.5 M disodium malonate, pH 7.0 (native crystals), or the crystallization buffer plus glycerol up to 15% (v/v) (induced crystals). Complete diffraction data sets were collected at 100 K from liquid-N2 flash cryocooled crystals (Oxford Cryosystems 700 series cryostream) on a Pilatus 6M pixel detector (from Dectris) at Beamline XALOC (88) of the ALBA synchrotron (Barcelona, Spain). Further data were collected on the same detector type at Beamline ID30A of the European Synchrotron Radiation Facility synchrotron (Grenoble, France) within the Block Allocation Group "BAG Barcelona." Diffraction data were integrated, scaled, merged, and reduced with programs XDS and XSCALE (89). Native miropin crystals belonged to space group P4₁2₁2, contained one dimer per asymmetric unit, and diffracted to maximally 3.0 Å. Trypsin-induced and subtilisin-induced wild-type miropin crystals, as well as trypsin-induced V367K/K368A mutant miropin crystals, belonged to space group P2₁2₁2₁, contained a monomer per asymmetric unit, and diffracted to maximally 1.6, 1.7, and 1.5 Å resolution, respectively. Table 1 provides a summary of data collection and processing.

Structure solution and refinement
An initial sequence similarity search identified horse leukocyte elastase inhibitor as the closest relative of miropin among the proteinase-activated serpin structures reported (PDB code 1HLE) (90). Its coordinates were used to solve the structure of trypsin-induced wild-type miropin by likelihood-scored molecular replacement with the PHASER program (91). A clear solution was found at 293.6, 116.5, 332.1 (α, β, γ in Eulerian angles) and 0.639, 0.960, 0.209 (x, y, z, as fractional unit-cell coordinates) after rigid-body refinement. This solution gave an initial Z score of 12.4 for the rotation function and 19.6 for the translation function, as well as a final log-likelihood gain of 508. A subsequent density improvement step with ARP/wARP (92), which employed refinement program REFMAC5 (93), yielded a partial model and a Fourier map that enabled straightforward manual model building with the COOT program (94). The latter alternated with crystallographic refinement with PHENIX (95) and BUSTER/TNT (96) under inclusion of TLS refinement, until the final refined model was obtained. The latter comprised residues Asp44-Lys368 and Ser373-Glu408 from miropin and Asp₁₉₄-Ser₁₉₅ from trypsin (residue numbers in subscript for trypsin) covalently linked between atoms Ser₁₉₅ Oγ and Lys368 C. Two tentative glycerol molecules and 409 solvent molecules completed the structure.
Among the reported structures of native serpins, human SCCA1 (PDB code 2ZV6) (97) displayed the closest sequence similarity with miropin and was employed, with all side chains mutated to alanine, to solve the native structure of the latter, for which a data set to 3.3 Å resolution was initially available (data not shown). Two solutions were found with PHASER at 15.1, 84.3, 247.1, 0.660, 0.337, 0.219 and 251.4, 87.5, 79.3, 0.841, 0.213, 0.834 after rigid-body refinement. These solutions yielded initial Z scores of 4.7/15.1 and 3.3/11.9 for the respective rotation/translation functions, as well as a final log-likelihood gain of 422. This calculation was followed by a density modification and model extension step with the AUTOBUILD protocol of PHENIX (98), which produced an improved Fourier map and a partial model. Model building and refinement proceeded as for trypsin-induced wild-type miropin. At the final stages of model completion, an isomorphous data set to higher resolution (3.0 Å) became available, which was used to complete the structure (Table 1). The final model of native miropin contained residues Glu39-Glu408 plus residues PLGS (numbered −4 to −1) from the N-terminal tag for each of the two molecules (A and B) present in the asymmetric unit. A further 16 solvent molecules completed the structure. Molecule A was significantly more rigid and better defined than molecule B (overall thermal displacement parameter of 66.0 Å² versus 90.8 Å²), and superposition of the two molecules revealed essentially identical chain traces (RMSD value of 0.9 Å), so molecule A was used for the presentation of the results and discussion.
The structure of subtilisin-induced wild-type miropin was solved by molecular replacement with the protein coordinates of trypsin-induced miropin after omitting segments Glu353-Lys368 and Ser373-Ile381 and the trypsin dipeptide. These calculations yielded a solution at 255.0, 174.0, 77.1, −0.472, 0.546, −0.055 after rigid-body refinement, which had initial Z scores of 4.1 and 6.1 for the rotation and translation functions, respectively, as well as a final log-likelihood gain of 9,201. Subsequent density modification, model building, and refinement proceeded as with trypsin-induced miropin. The final model comprised miropin residues Lys40-Val369 and the following (partially occupied) tentatively assigned ligands: 7 iodide ions, 1 potassium ion, 4 glycerol molecules, and 402 solvent molecules.
Finally, the structure of trypsin-induced miropin mutant V367K/K368A was solved in the same way as that of subtilisin-induced wild-type miropin. The molecular replacement calculations yielded a top peak at 192.3, 178.3, 12.2, −0.494, 0.497, −0.001 after rigid-body refinement, which had initial Z scores of 4.1 and 6.6 for the rotation and translation functions, respectively, as well as a final log-likelihood gain of 12,722. Subsequent density modification, model building, and refinement proceeded as with trypsin-induced miropin. The final model comprised miropin mutant residues Glu39-Glu408 plus residues Gly−5 to Ser−1 from the fusion construct, as well as the following (partially occupied) tentatively assigned ligands: 2 zinc ions, 4 iodide ions, 1 potassium ion, 6 chloride ions, 2 glycerol molecules, 1 tris(hydroxymethyl)aminomethane molecule, and 448 solvent molecules.

Phylogenetic analysis
The miropin sequence was used for BLAST searches within the NCBI database (99), which were initially performed within the phyla Fibrobacteres, Chlorobi, and Bacteroidetes and subsequently expanded to other bacterial families and eukaryotes. Only selected sequences were used and further analyzed because of the very high number of sequences retrieved. These were aligned with MAFFT using the G-INS-i algorithm (100). A phylogenetic tree was constructed with PHYML (101) using LG as substitution model, with 100 bootstrapping replicates, 4 substitution rates, optimization in topology/length/rate and topology search based on nearest-neighbor interchange, and subtree-pruning and regrafting within the Geneious platform.
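Pairwise sequence identities such as those quoted for the aligned serpin sequences can be computed from two rows of a multiple alignment. The sketch below is one possible convention: it counts identical residues over positions where neither sequence has a gap. This is an assumption, since the text does not state which denominator was used. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither row has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # columns with a gap are excluded from the denominator
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy aligned fragments, for illustration only (not real serpin sequences).
identity = percent_identity("MKT-AL", "MRTQAL")
```

Other conventions divide by the full alignment length or by the shorter sequence length, which yields lower values for gapped alignments, so the chosen definition should be reported alongside any identity figure.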