A Self-compartmentalizing Hexamer Serine Protease from Pyrococcus horikoshii

Background: Oligopeptidases are serine proteases cleaving only short peptides. Results: The complex channel system found within a hexameric oligopeptidase presents a rigid, double-gated model for size-based substrate selection. Conclusion: The substrate selection mechanism applied by an oligopeptidase depends on its multimerization state. Significance: Degradation of cytotoxic and misfolded proteins is aided by oligopeptidases, which are thus possible targets of cancer therapy. Oligopeptidases impose a size limitation on their substrates, the mechanism of which has long been under debate. Here we present the structure of a hexameric serine protease, an oligopeptidase from Pyrococcus horikoshii (PhAAP), revealing a complex, self-compartmentalized inner space, where substrates may access the monomer active sites passing through a double-gated “check-in” system, first passing through a pore on the hexamer surface and then turning to enter through an even smaller opening at the monomers' domain interface. This substrate screening strategy is unique within the family. We found that among oligopeptidases, a residue of the catalytic apparatus is positioned near an amylogenic β-edge, which needs to be protected to prevent aggregation, and we found that different oligopeptidases use different strategies to achieve such an end. We propose that self-assembly within the family results in characteristically different substrate selection mechanisms coupled to different multimerization states.

Members of the prolyl oligopeptidase family are serine proteases, such as prolyl oligopeptidase (POP) itself, dipeptidyl peptidase IV (DPP-IV), oligopeptidase B (OPB), and acylaminoacyl peptidase (AAP), that are able to distinguish between potential substrates based on their size and only cleave those that do not exceed 30 amino acids in length (1,2). When the first crystal structure of a serine oligopeptidase, that of POP, was determined (3), size exclusion was explained by the domain structure of the enzyme, which later proved to be quite similar for all family members. Oligopeptidases consist of a half-sphere-shaped hydrolase region containing the active site and a cylindrical propeller domain, which caps it, burying the Ser-His-Asp catalytic triad in a well protected inner hole with dimensions comparable with the size limitations imposed by the enzyme. The structure, however, raised the question of substrate access and, through it, how enzyme effectiveness and selectivity are simultaneously maintained in the case of such a decidedly buried active site.
It was initially proposed that substrates must reach the active site through the narrow channel bisecting the propeller domain. Alternatively, substrate access routes must target the domain interface, either through a transient opening between the propeller and the hydrolase domains or through a permanent hole between the two. POP, the first discovered and most extensively studied member of the family, was revealed to have both a closed (4) (PDB codes 1H2W and 3MUN) and an open arrangement (6, 7) (PDB codes 3IUL and 1YR2) in crystal structures of its unligated form, in the latter of which the cylindrical dome of the propeller region flips back, exposing the active site entirely. This suggests that an open/closed dynamic equilibrium, fine-tuned by the amino acid composition of the interacting domain surfaces, might describe the resting state of the enzyme, where both forms coexist. This notion recently gained support by the determination of the crystal structure of another oligopeptidase, AAP from Aeropyrum pernix (ApAAP), which contains both the open and closed forms of the unligated enzyme within the same crystal lattice, in the form of 1:1 open/closed mixed dimers (8). In both cases, a keen control mechanism is coupled to the opening; the active site is disassembled in the open form mainly by the destabilization and reformation of the flexible loop containing the catalytic His (6,8). Substrates exceeding the size limit disrupt the closing of the enzyme and thus only encounter the impaired active site of the open form and leave the transient complex unchanged.
An alternate possible substrate selection strategy is formation of a permanent entrance hole and providing for its shielding. DPP-IV was shown to have such an architecture; substrates can access its active site either through its large propeller hole (9) or through a spacious side opening between its hydrolase and propeller domains stabilized and partly covered by dimerization (10,11,12).
In this work, we present an even more complex example of the latter: that of acylaminoacyl peptidase from Pyrococcus horikoshii (PhAAP), which we found active in the hexameric form (13). Three crystal structures were determined, which demonstrate that hexamerization results in self-compartmentalization: a self-assembled, double-gated channel system that may effectively screen the substrates of the six individual monomers, all possessing a permanent entry at the domain interface. Although a hexameric arrangement is unique among classical serine proteases, it is reminiscent of other self-compartmentalizing intracellular proteases, such as Gal6/bleomycin hydrolase (14), tricorn protease (15), Lon protease (16), or the complex systems of proteasomes.
Thus, the PhAAP structure provides a unique possibility of comparison among oligopeptidases sharing a common domain structure but applying different substrate selection strategies and among self-compartmentalizing proteases of different monomeric structure but of similar selection strategy. This comparison allowed us to propose that the flexibility of the loop holding the catalytic His residue, the accessibility of a sticky β-strand anchoring the His-loop, and unobtrusive insertions or terminal extensions determine the substrate selection strategy and self-assembly in the oligopeptidase family. Structure determination of PhAAP and understanding of the structural requirements and consequences of multimerization also take us one step closer to the physiologically relevant mammalian enzyme, which is active as a tetramer.
AAP functions as an exopeptidase in mammals, removing N-terminally blocked amino acids from peptides, although other orthologs have endopeptidase activity too (13,17). It has been shown to act in concert with and exert regulation over the proteasome (18) while also being implicated in renal and small cell carcinoma (19-21), and it is also referred to as a potential target of certain cognitive enhancers (22,23).

EXPERIMENTAL PROCEDURES
Protein Expression and Purification for Crystallization-PhAAP was expressed in Escherichia coli and purified as described previously for PhAAP and ApAAP (13,17). The active site mutation was introduced in PhAAP (S466A) by the two-step PCR method as described for ApAAP (17). The protein solutions were concentrated to 10 mg/ml in 20 mM Tris/HCl buffer (pH 8.0). The covalent complex of PhAAP with the inhibitor benzyloxycarbonyl-glycyl-glycyl-phenylalanyl-chloromethyl ketone (CMK) was prepared by adding the inhibitor in 2.4-fold excess to the protein solution.
Crystallization, Data Collection, Phasing, and Refinement-The PhAAP structure was determined from crystals grown at 20°C by the hanging drop method. Crystal quality was improved by microseeding. The crystals were flash-frozen and stored in liquid nitrogen prior to data collection.
For the native crystal, drops were prepared by mixing 2 μl of protein solution and 2 μl of reservoir solution (3.0 M 1,6-hexanediol, 0.20 M MgCl2, 0.1 M Tris/HCl buffer, pH 8.5). The protein solutions contained 10.5-13.0 mg/ml PhAAP in 20 mM Tris/HCl buffer (pH 8.0). Phase information was obtained using the following three derivatives. A crystal was soaked in 4.5 μl of reservoir solution (same as for the native crystal) mixed with 1.5 μl of 0.05 M UO2(NO3)2 solution for 1 h prior to flash cooling. A crystal was soaked in 2 μl of reservoir solution mixed with 2 μl of 0.02 M K2PtCl4 solution for 2 h prior to flash cooling. After mixing 2 μl of 0.2 M KI solution in the crystallization drop, a crystal was quick soaked in it.
The orthorhombic crystal form was obtained by a co-crystallizing trial of the S466A mutant form of PhAAP and its substrate (13) N-benzyloxycarbonyl-Glu-Phe-Ser-Pro-(para-nitro-Phe)-Arg-Ala. The substrate was dissolved in the protein solutions (9.2 mg/ml protein in 20 mM Tris/HCl buffer, pH 8.0) at a 10-fold excess. The drops contained 2 μl of protein solution and 2 μl of reservoir solution (3.0 M 1,6-hexanediol, 0.30 M MgCl2, 0.1 M Tris/HCl buffer, pH 8.0). The structure, however, does not contain any fragment of the substrate, possibly because 1,6-hexanediol competes for and occupies the S1 site. For studying binding of a substrate fragment, we prepared crystals with a covalent inhibitor, CMK.
Data collection was carried out on a Rigaku diffractometer (Cu Kα radiation, wavelength 1.5418 Å) for the native, UO2(NO3)2, and KI derivative crystals (graphite monochromator, R-AXIS IIc detector) as well as the CMK complex (blue optics, R-AXIS IV++ detector). Data sets of the K2PtCl4 derivative crystal were collected at synchrotron sources at wavelengths 0.8162 and 1.0715 Å, at the EMBL beamline X11 of the Deutsches Elektronen-Synchrotron (DESY) and at beamline BM14 of the European Synchrotron Radiation Facility (ESRF), respectively. The data set of a crystal of the orthorhombic crystal form was collected at beamline BM14 of ESRF (wavelength 1.0715 Å). All data were collected at 100 K. Data processing was carried out using the XDS and XSCALE programs (24) for all data sets. All but one crystal were isostructural with the hexamers of AAP formed by space group symmetry H32 from the single monomer of the asymmetric unit. The asymmetric unit of the crystal in the orthorhombic crystal form contains two hexamers of PhAAP (for this data set, reflections for Rfree calculation were selected in thin shells).
Data sets of the native, UO2(NO3)2, KI, and K2PtCl4 derivatives were used for solving the phase problem using multiple isomorphous replacement with anomalous scattering. The initial set of sites was located by the SHELX package (25). Phases were refined with the program MLPHARE (26), and they were further improved with the program DM (27). An initial model was automatically generated by Buccaneer (28). The phase problem for the orthorhombic crystal form was solved by molecular replacement using the program MOLREP (29). Manual model building was performed for all three structures using the program Coot (30). Refinement was carried out either by using REFMAC (31) (native and CMK complex) or Phenix (32) (orthorhombic crystal form), including TLS B-factor refinement for each individual protein domain. The model was validated using SFCHECK (33) and MolProbity (34). The percentages of residues in the favored/disallowed Ramachandran regions are as follows: PhAAP native structure, 95.82%/0.17%; PhAAP-CMK complex, 96.22%/0.33%; PhAAP orthorhombic crystal form, 96.19%/0.00%. The final statistics can be found in Table 1.
Molecular Dynamics (MD) Simulations-60-ns NPT MD simulations were carried out on monomers of POP (using PDB structure 1H2W (4) as a starting conformation) and PhAAP (using the presently determined native structure as starting conformation) at 300 K. The systems were pre-equilibrated at 300 K by decreasing the restraints on the protein atoms and submitting the free system to a 200-ps NVT step to stabilize pressure. The GROMACS program suite (35) and CHARMM27 force field (36,37) were utilized. Systems were solvated by ~28,000 TIP3P waters, protein overall charge was neutralized, and physiological salt concentration was set using Na+ and Cl− ions. Average geometries and B-factors were calculated using the last 10 ns of the trajectories.
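As an illustration of how average geometries and B-factors can be derived from a trajectory window, the following minimal sketch assumes the standard isotropic relation B = (8π²/3)⟨Δr²⟩ and a set of frames that have already been superposed on a common reference. The text does not state which tool or formula was actually used, and the function name `bfactors_from_trajectory` and the toy coordinates are hypothetical.

```python
import numpy as np

def bfactors_from_trajectory(coords):
    """Per-atom isotropic B-factors (Å^2) from a (frames, atoms, 3) array in Å.

    Assumes the frames were superposed beforehand, so that only internal
    fluctuations remain (an assumption of this sketch).
    """
    coords = np.asarray(coords, dtype=float)
    if coords.ndim != 3 or coords.shape[2] != 3:
        raise ValueError("expected an array of shape (frames, atoms, 3)")
    mean_pos = coords.mean(axis=0)                     # average geometry
    sq_fluct = ((coords - mean_pos) ** 2).sum(axis=2)  # squared displacement per frame
    msf = sq_fluct.mean(axis=0)                        # mean square fluctuation per atom
    return (8.0 * np.pi ** 2 / 3.0) * msf

# Toy example: atom 0 oscillates by 1 Å around the origin, atom 1 is static.
frames = np.array([
    [[1.0, 0.0, 0.0], [5.0, 5.0, 5.0]],
    [[-1.0, 0.0, 0.0], [5.0, 5.0, 5.0]],
])
toy_b = bfactors_from_trajectory(frames)
```

In practice, the frame array would be taken from the last 10 ns of the trajectory after fitting to a reference structure.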

RESULTS
Crystal Structures-Crystal structure of the PhAAP hexamer was determined in two crystal forms. Two sets of data originate from crystals bearing R32 space group symmetry, where the six monomers of the hexamer are crystallographically equivalent. One of these shows the substrate-free form of the enzyme, whereas the other contains a chloromethyl ketone inhibitor (CMK) covalently bound to the active site serine and histidine residues, serving as an analog of the intermediate of the substrate hydrolysis reaction. The second crystal form is orthorhombic, containing two substrate-free hexamers in the asymmetric unit, with no crystallographic restrictions for the symmetry and topology of the monomers within the hexamer (Table 1). The monomer units within each hexamer and in between monomers of the different structures are strikingly similar (after fitting to the native unligated structure, root mean square deviation of Cα atoms is 0.23 Å for the CMK complex structure, and it is between 0.25 and 0.41 Å for the 12 molecules of the orthorhombic structure). Furthermore, the overall hexameric form is also quite similar in all three structures, indicating that the hexamer is a symmetric object with a well defined structure.
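Root mean square deviations of Cα atoms after fitting, such as those quoted above, are obtained by least-squares superposition of matched atoms. A minimal sketch of such a calculation is given below, assuming the Kabsch algorithm and two equal-length, pre-matched coordinate arrays; the function name `kabsch_rmsd` and the random test coordinates are illustrative and are not taken from the structures or software used in this study.

```python
import numpy as np

def kabsch_rmsd(mobile, target):
    """RMSD between two (N, 3) coordinate sets after optimal superposition."""
    mobile = np.asarray(mobile, dtype=float)
    target = np.asarray(target, dtype=float)
    if mobile.shape != target.shape or mobile.ndim != 2 or mobile.shape[1] != 3:
        raise ValueError("both inputs must be (N, 3) arrays of matched atoms")
    # Remove translation by centering both sets on their centroids.
    p = mobile - mobile.mean(axis=0)
    q = target - target.mean(axis=0)
    # Covariance matrix and its singular value decomposition.
    h = p.T @ q
    u, _, vt = np.linalg.svd(h)
    # Correct for a possible reflection so that the result is a proper rotation.
    d = np.sign(np.linalg.det(vt.T @ u.T))
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    # Rotate the centered mobile set onto the centered target set.
    diff = (rot @ p.T).T - q
    return float(np.sqrt((diff ** 2).sum() / len(p)))

# Illustrative check: a rotated and translated copy superposes with ~zero RMSD.
rng = np.random.default_rng(0)
ref = rng.normal(size=(20, 3))
angle = np.deg2rad(30.0)
rz = np.array([[np.cos(angle), -np.sin(angle), 0.0],
               [np.sin(angle), np.cos(angle), 0.0],
               [0.0, 0.0, 1.0]])
moved = ref @ rz.T + np.array([1.0, -2.0, 3.0])
rmsd_moved = kabsch_rmsd(moved, ref)
```

For real structures, the two arrays would hold the Cα coordinates of equivalent residues from the two monomers being compared.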
Overall Structure of the PhAAP Monomer-The enzyme monomer consists of a dome-shaped, seven-bladed β-propeller domain and the C-terminal catalytic domain of α/β-hydrolase fold. Each blade of the β-propeller domain contains four antiparallel β-strands. There is a large cavity with overall diameter of about 22 Å between the two domains, which contains the active site (Fig. 1, A and B).
A side opening of this cavity (12 × 16 Å) is created under blade 2, gated by blades 1 and 3 (Fig. 1B). Blade 2 is both shortened and shifted upward by 6-8 Å as compared with the corresponding blades of other family members, such as POP, OPB, or ApAAP, but similar to that of DPP-IV with an even greater interdomain opening of 15 × 22 Å formed by shortening and shift of its blades 1-3 (9).
The single structure homologous to PhAAP available to date is that of the β-propeller domain of trilobed protease from Pyrococcus furiosus (TLP; sequence identity 79 and 85% for the propeller domain and the full protein chain, respectively; PDB code 2GOP (38)). Alignment of the propeller domains of PhAAP and TLP reveals high structural similarity (root mean square deviation of fitted Cα atoms is 0.94-1.06 Å), and the conformational differences as well as disorder of certain loop regions of TLP can all be explained by the fact that in PhAAP, these segments are involved in interactions with the hydrolase domain (which is missing from the TLP structure) or other monomers of the PhAAP multimer structure (supplemental Fig. 1).
The hydrolase domain (residues 1-14 and 347-618), which contains the catalytic triad, consists of a short N-terminal segment and the residues sequentially following the seven-bladed β-propeller domain inserted between the two (residues 15-346). The N-terminal segment is unusually short within the family; it contains only 14 amino acid residues (as compared with the 22, 31, 72, and 94 amino acids of the same region of ApAAP, DPP-IV, POP, and OPB, respectively), whereas the structure of the core hydrolase domain is well conserved despite its relatively low sequence homology.
In PhAAP, the Ser466 of the catalytic triad is positioned on a short turn between a β-strand, member of an eight-stranded β-sheet, and a helix and is thus in a rather well defined environment. This feature is conserved among oligopeptidases, as is the fact that the other two members of the catalytic triad, His578 and Asp546, are placed on neighboring, longer loops (His-loop/Asp-loop), connecting β-strands of the very same β-sheet that the Ser is linked to, with helices that stand on either side of the sheet, at the point where it reaches the surface of the hydrolase domain (Fig. 1C). Thus, the catalytic His and Asp residues provide direct coupling between the rather buried catalytic Ser and the domain surface. The Asp-loop (residues 543-552) is well structured with four intraloop hydrogen bonds, similarly to the corresponding loops of the other family members. However, the well defined nature of the His-loop of PhAAP (residues 574-586) is unique; there are six intraloop hydrogen bonds contributing to its internal stability (Fig. 1C). The only other example of a rigidified His-loop among oligopeptidases is that of DPP-IV, with four hydrogen bonds in this region. In the rest, 0-2 intraloop hydrogen bonds can be found in the corresponding His-loops (zero in ApAAP (8), one in POP from porcine brain (4) and muscle (3) and from Aeromonas caviae, two in POP from Myxococcus xanthus (39) and OPB (40); PDB codes are 3o4j, 1h2w, 1qfm, 3mun, 2bkl, and 2xe4, respectively); their conformations are mainly defined by their interaction with the neighboring Asp-loops and by interdomain connections with the loops of the propeller domain. In PhAAP, five hydrogen bonds link the His-loop and the Asp-loop to one another in addition to the catalytic His578-Asp546 interaction, but neither the His- nor the Asp-loop forms direct interdomain hydrogen bonds with the propeller domain.
This is the case because, interestingly, these two loops line the entrance hole facing (through the cavity) the shortened blade 2 of the propeller, so besides participating in the catalytic reaction, these regions may partake in the screening and direction of the substrates too.
Accommodation of the Covalently Bound Inhibitor and Hexanediol Molecules in the Active Site-In both crystal forms of the substrate-free enzyme, the substrate specificity pocket was occupied by 1,6-hexanediol (Fig. 1D), which was present in the crystallization condition in high concentration (3 mol/dm3). For studying substrate binding, a covalent adduct of PhAAP and a chloromethyl ketone inhibitor was crystallized, and its structure was solved. The covalently bound part and the Phe and Gly moieties occupying the S1 and S2 subsites are well defined by electron density and are bound in canonical position (supplemental Fig. 2). The N-terminal benzyloxycarbonyl-glycyl moiety, however, cannot be located in the electron density map, suggesting that it does not specifically bind to the enzyme surface. Comparison of the structure containing the covalently bound inhibitor mimicking the reaction intermediate and that of the substrate-free form reveals high similarity of the two, indicating that the substrate binding site and the catalytic apparatus are preformed in the latter.
The Active Site Structure Is Consistent with Endopeptidase Activity-The active site structure of PhAAP resembles that of ApAAP (Fig. 1D). The catalytic triad and the oxyanion site display an active conformation in both enzymes. The conformation and interactions of the loop holding the conserved His stabilizing the oxyanion hole (His367) highly resemble that of ApAAP, because it was proposed that this structural element is characteristic of all AAPs within the prolyl oligopeptidase family (41). The substrate specificity pocket (S1) is deep, like in ApAAP (supplemental Fig. 3) [...] charged residue (Asp508) are located, whereas in ApAAP, the corresponding side chains are Met477 and Ile489, respectively, suggesting that PhAAP may also cleave after positively charged residues (supplemental Fig. 3). Indeed, we observed that in the absence of Phe at the P1 site, substrates with Arg or Lys can also be hydrolyzed (data not shown).

FIGURE 1 (panels C and D). C, the loops holding the catalytic triad (Ser466, His578, and Asp546; light blue, orange, and green, respectively) are exposed to solvent and form only few contacts with the propeller domain (yellow), yet they are stabilized by intra- and interloop hydrogen bonds. (Hydrogen bonds are color coded as follows. Red, intra-His-loop; green, intra-Asp-loop; light blue, between His- and Asp-loops; black, formed with the rest of the hydrolase domain; yellow, with the propeller domain.) D, comparison of the PhAAP structure with hexanediol (wheat) bound in the substrate specificity pocket and that of PhAAP in complex with a covalently bound chloromethyl ketone inhibitor (magenta; with two hexanediol molecules outside the S1-S2 region shown in wheat) reveals that the only significant difference is adjustment of the S1 pocket by rotation of the Phe507 side chain in the covalent complex (in darker colors).
In addition to the acylaminoacyl peptidase activity, PhAAP also exhibits endopeptidase activity (13). The structure is in line with this finding. The substrate binding site is extended and does not have any steric or electrostatic blocking at the S2-S3 sites, which true exopeptidases, like DPP-IV, do.
Stabilization of the PhAAP Hexameric Structure; Trimer of Dimers-In a previous study, we have shown, using size exclusion chromatography, that PhAAP is a homohexameric enzyme in solution (13). Hexamers of PhAAP in the crystalline state are formed as trimers of dimers of the monomer enzyme (Fig. 2, A and B). Analysis of molecular contacts revealed that the association within the dimers is much stronger than that securing trimerization. Both the area of the buried surfaces and the free enthalpy contribution of the dimers dominate over those of the trimers (supplemental Table 1). Although the highly homologous enzyme TLP was proposed to be trimeric (38), we suggest that it may form hexamers in solution, because the sequence identity between PhAAP and TLP is even higher at the contact regions of the trimer interfaces formed by the propeller domains (81% compared with 79% for the propeller domain) as well as for the dimer interfaces formed by both domains (96%, compared with 85% for the whole protein) (supplemental Table 2).
The dimer interface lies between the long loop insertion of blade 3 from the propeller domain of one monomer and the hydrolase domain of the other monomer (Fig. 2C). The long insertion of one molecule lines up to and extends the central eight-stranded β-sheet holding the catalytic residues of the other, by forming the ninth strand of it in an antiparallel manner (framed in Fig. 2C; supplemental Fig. 4). In contrast, in ApAAP, which is active in the dimeric form, the dimer interface is formed between the two hydrolase domains so that the central β-sheets of the two hydrolase domains are connected by forming a 16-stranded continuous sheet, whereas the propeller domain of one monomer has no contacts with either domain of the other (supplemental Fig. 4).
The trimer interfaces are composed of the outer side of blade 1 of the propeller of one monomer and the outward facing part of blades 2 and 3 of its neighboring monomer (Fig. 2D, blades 1-3 are labeled for the three monomers) and are dominated by side chain interactions. Blades 1-3 are those lining the entrance of the monomer; thus, trimerization contributes to fixing of the pore size by restricting the blades' upward movement.
Hexamerization creates further substrate size screeners. Three large openings of about 20 × 30 Å can be found on the hexamer surface (Fig. 2, B and E), which continue in tunnels and are joined in a big cavity in the middle of the hexamer. The interdomain opening of each monomer is also connected to this central chamber of the hexamer, because all six cavities face inward. In principle, the substrate can reach one of the active sites in two ways: 1) first entering the opening of the hexamer and subsequently the side chamber of a monomer or 2) through the central pore of the propeller domain, provided that the pore can be widened (42). However, the trimerization contacts restrict the flexibility of the propeller pore region, which makes this second option unlikely (Fig. 2E). The two successive openings and the complex tunnel system between them seem sufficient for screening the substrates. Those molecules that are small enough to enter the antechamber through the larger hole but are not flexible enough for entering the second may simply pass through the central chamber and leave at the opposite end; however, those that can proceed through the monomer cavity to the active site will be cleaved.
MD Simulations for Determining Monomer Stability-MD simulations were carried out to determine the monomer stability of two members of the prolyl oligopeptidase family, those of POP and PhAAP. Calculations were designed to test the significance of multimerization; POP is active and stable as a monomer, whereas PhAAP hexamerizes. Monomer structures were compared after 60 ns of simulation time. The two molecules behave quite similarly; the conformation seen in the crystal structure of each was preserved in the simulation. This has a special significance in the case of the PhAAP structure because it also demonstrates that the 1,6-hexanediol molecule trapped in the active site in the crystalline state and absent in the simulation does not have a structure-determining role. The overall root mean square deviation of main chain atoms from their crystallographically determined position was found to be 1.34 and 1.44 Å in POP and PhAAP, respectively (supplemental Fig. 5).
A notable difference between the crystal structures and the simulation results can be seen in the conformation of the catalytic triad, both in the closed structure of POP and in PhAAP with its wide side opening. The hydrogen-bonded triad of the crystal structures loosens in the fully solvated monomer. The Ser-Oγ-His-Nε2 distance increases from 2.9 and 2.6 Å to 4.7 and 4.6 Å, in the case of POP and PhAAP, respectively, due mainly to the active Ser sampling a different rotamer, relaxing its strained, nearly eclipsed N(+1)-C-Cα-Cβ torsion of 2-5° seen in the crystal structures to that of a more staggered 35°. Ser and His stay connected through a shared water molecule in 58 and 19% of the structures of the last 10 ns of the trajectory in POP and PhAAP, respectively, and through two or more coordinated water molecules in the rest. The His-Asp hydrogen bond remains intact in all snapshots. Such a "latent" conformation of the Ser-His-Asp catalytic triad was seen in the crystal structure of ligand-free tricorn interacting factor F1 (43), where, in the absence of propeller domain, the hydrolase is only partially shielded from solvent. In crystal structures of other oligopeptidases, in addition to the strained backbone conformation of the catalytic Ser, the above mentioned N(+1)-C-Cα-Cβ torsion is also in a high energy conformation of −6 to 7°. It seems that the partially "dried out" condition of crystallization or the binding of a substrate is required to force the Ser into the conformation that guarantees its reactivity.
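The torsion of the catalytic Ser discussed above is an ordinary four-atom dihedral angle, with values near 0° corresponding to an eclipsed arrangement. As an illustration only, the sketch below computes a signed dihedral from Cartesian coordinates; the function name `dihedral_deg` and the example points are hypothetical and are not taken from the PhAAP or POP structures.

```python
import numpy as np

def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four points p0-p1-p2-p3."""
    p0, p1, p2, p3 = (np.asarray(p, dtype=float) for p in (p0, p1, p2, p3))
    b0 = p0 - p1
    b1 = p2 - p1
    b2 = p3 - p2
    # Unit vector along the central bond.
    b1 = b1 / np.linalg.norm(b1)
    # Components of the outer bonds perpendicular to the central bond.
    v = b0 - np.dot(b0, b1) * b1
    w = b2 - np.dot(b2, b1) * b1
    x = np.dot(v, w)
    y = np.dot(np.cross(b1, v), w)
    return float(np.degrees(np.arctan2(y, x)))

# Toy geometries around a central bond along z.
eclipsed = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
perpendicular = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
anti = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
```

Applied to the coordinates of the four atoms defining the Ser torsion in each trajectory frame, such a function would give the distribution of the angle over the simulation.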

DISCUSSION
When comparing the structure of PhAAP presented here and those of oligopeptidases determined previously, three fundamentally different substrate admission routes can be detected: 1) allowing access of substrates through the propeller hole of the closed form, 2) dynamic domain flapping between the closed and open conformation, or 3) a channel system created by multimerization.
The first provides an almost too strict screener; the narrow pore of the propeller domain cannot be appropriate for all oligopeptidase substrates. However, DPP-IV, for example, with an eight-bladed propeller domain and a correspondingly wider pore than those of POP, ApAAP, OPB, or PhAAP was proposed to function that way (9), although, because it can also cleave even 80-amino acid-long peptides, it must apply other substrate admission routes too (10).
POP and ApAAP seem to utilize the second strategy. In the closed conformation of these enzymes, the hydrolase and propeller domains sit on top of each other, burying their active site. In their open form, the relative domain position is fixed only on one side by a hinge region, whereas at the opposite side of the molecules, the domains flap away from each other, forming a 40-50° opening, causing an ~20-Å shift at the point of the greatest opening. This conformation provides easy access to the active site. In all cases where such an open form was observed, the active site was disassembled and inactivated by the opening, chiefly through the destabilization of the loop holding the catalytic His residue. In the closed form, this loop is supported by interdomain hydrogen bonds that are lost when the opening takes place and the domains move away from each other (6,8).
In the crystal structure of PhAAP described in this paper, the third strategy can be witnessed. Hexamerization creates a compartmentalized inner space, with a complex, double-gated "check-in" system. Substrates first have to pass through the entrance at the hexamer surface (Fig. 2E). Once inside, the active sites can be reached through the spacious side opening of the monomers created by the shortening and upward shift of the second of the seven blades of the propeller domains.
The overall topology enables all oligopeptidases to widen their propeller pore in an induced fit step, to multimerize, or to open, thus to utilize any of the three outlined strategies. The interesting question is which structural features encode selecting one or the other.
A crucial feature to be considered is the flexibility of the loop containing the His residue of the catalytic triad (His-loop). Pliability of this region is essential for those oligopeptidases that open up, because selectivity is assured by "switch-off" of the active site in the open form by destabilization of the His-loop. Accordingly, in the cases of POP, OPB, and ApAAP, the shape of the His-loop and the position and orientation of the catalytic His are defined only by interloop contacts, both with the neighboring Asp-loop containing the third member of the triad and with the facing loops of the propeller domain. Therefore, the shape and stability of the His-loop is coupled to the proximity of the propeller domain. In contrast, in multimer enzymes with permanent entrance between the two domains (PhAAP and DPP-IV), the His-loop is rigidified in the active conformation by a number of intraloop hydrogen bonds.
In all oligopeptidase structures so far determined, the His-loop connects an α-helix and the terminal β-strand of the central β-sheet of the hydrolase domain, both at the surface of the monomer. Interestingly, this edge β-strand of 6-8 residues is of regular extended geometry in all cases, without twists, bulges, or prolines to disrupt its hydrogen-bonding potential; thus, it is an ideal aggregation primer (44). The corresponding sequences were recognized by the WALTZ predictor (45) as having high probability for amyloid formation, PhAAP scoring the highest value, 98.3, whereas the others scored between 79.9 and 89.0.
The propensity for amyloid-type aggregation is a generic property of proteins (46), which thus have evolved structural and sequential adaptations to protect their surface-close β-strands (44,47,48), because aggregation prompted by such segments, even in the native fold or in locally unfolded states, has been shown to be the first step of toxic aggregation processes (49,50). Negative selection against conformers with significant tendency to aggregate is now regarded as one of the key determinants in the development of complex molecular machineries (51), as seen in the case of PhAAP in the present study (Fig. 3).
In the monomeric oligopeptidases POP and OPB, the sticky β-strand is covered by a 72- and a 95-residue-long N-terminal extension, respectively (Fig. 3A (left) and supplemental Fig. 4). Although the extensions join to the amino acids of the propellers in sequence, they run alongside the hydrolase domains and form a complex strap around them. Numerous hydrogen bonds are established between them, the majority of which fix the strap to the sticky β-strand and the two α-helices that stand on either side of it (12-14 and 17 in total, 8-11 and 9 to the α-helix/β-strand/α-helix triad in POPs from different sources and in OPB, respectively). Because the structure of POP was determined in the open conformation too, it could be ascertained that these interactions are unchanged by the opening. The N-terminal strap might also be a promoter of monomer stability, but our MD simulations showed that the PhAAP monomer, devoid of such a segment, preserved its structure just as well as POP, supporting the notion that the N-terminal extension might instead be a safeguard against aggregation.
In ApAAP, DPP-IV, and PhAAP, the edge β-strand is unprotected in the monomer; these oligopeptidases multimerize. In ApAAP, the N-terminal extension is only 22 amino acids long; it forms a helix that runs parallel to the α-helix/β-strand/α-helix triad and leaves the sticky β-strand fully unprotected (Fig. 3A, right). However, in the functional ApAAP dimer, the β-strand of one monomer is covered by the corresponding strand of the other in an antiparallel manner, joining their central eight β-strands to form a 16-pleated continuous β-sheet (supplemental Fig. 4). This arrangement results in the two active sites facing in opposite directions, which allows them to function independently; they might even adopt strikingly different conformations, as was seen in the open/closed mixed dimer structure of ApAAP, where one monomer was found in an open and the other in a closed conformation. Flexibility is retained because only the hydrolase domains participate in the dimerization, and this is essential, because the structure of the closed form of the ApAAP monomer has no side opening to allow access of substrates.
Small hindrances obstruct full access to the exposed β-strands of DPP-IV and PhAAP (Fig. 3B). In DPP-IV, a 30-amino acid-long insertion of blade 4 of the propeller reaches to the top of the edge β-strand, whereas in PhAAP, a 12-residue-long C-terminal extension comes into proximity to the bottom end of it. Both behave as a peg, making it impossible for the head-on dimerization seen in the case of ApAAP to take place. DPP-IV dimerizes; in the dimer, the insertions from each monomer form an X-shaped cross, whereas the eight-pleated central β-sheets meet in a perpendicular orientation. In PhAAP, the loop insertion of blade 3 of one monomer lines up with the sticky β-strand of the other and forms a ninth, antiparallel strand of the central β-sheet, making a dimer, and the dimers are further organized into the hexamer (supplemental Fig. 4). Because in both DPP-IV and PhAAP both the hydrolase and the propeller domains participate in multimerization, domain movements necessarily become restricted; thus, these enzymes must possess a fixed substrate entrance to be effective, which they do. The mode of multimerization also leads to self-compartmentalization, especially in the hexamer structure of PhAAP, because the active site openings point inward toward a common central chamber formed by the association. This arrangement therefore provides a substrate screening mechanism sufficient to maintain selectivity.
Experimental evidence also supports the suggested role of the N-terminal pegs of oligopeptidases. Removal of the N-terminal segment of POP (where no additional protection for the sticky β-strand is available) results in aggregation and inactivation depending on the size of the removed segment (52). In contrast, removing the N-terminal extension of ApAAP causes only some destabilization of the enzyme, lowering the melting temperature by about 20 K (53).
Outside the α/β-hydrolase clan, among self-compartmentalizing proteases, tricorn protease, for example, has a multidomain structure similar to that of PhAAP. This enzyme carries out the degradation of small (7-9-amino acid-long) peptides and, just like PhAAP, is also hexameric, although its multimer architecture has a somewhat simpler, toroidal shape. Its monomer, however, is more complex than that of PhAAP, consisting of five subdomains. Its α/β catalytic region is capped on both sides by propeller domains; substrates reach the active site by passing the great opening of the toroid and then proceeding through the channel of a seven-bladed propeller, a permanently open route to its active site. Instead of a catalytic triad, a four-residue hydrogen-bonded system of Ser-His-Ser-Glu performs the catalytic reaction. Glu 1023 is situated on a long loop connecting an unprotected strand and a distant helix. The seven-bladed propeller is close enough to obstruct dimerization with the corresponding strand of another monomer but too far for shielding; thus, a β-turn-β type insert from the neighboring monomer covers it via hexamerization, similarly to that seen in the case of PhAAP.
The proteasome, a further self-compartmentalizing system, forms an immense multilayer, barrel-shaped complex, where four rings, each made up of seven monomers, stack on each other to compose two antechambers and one central catalytic chamber. Proteasome monomers do not have a propeller domain (so they have a very large permanent hole above the hydrolase). The catalytically active Thr is located on the penultimate β-strand of a five-membered sheet, where, in the heptamer, the β-edge of the sheet in one monomer is covered by an insert of the neighboring monomer, resulting in a self-assembled compartment system of inwardly turned active sites in this case too.
Based on the above, we propose that the mode of multimerization and self-assembly among oligopeptidases is fine-tuned by the need to shield a sticky β-edge, which is met by insertions and by N- and C-terminal extensions, whereas to maintain the effectiveness of catalysis and selectivity, the pliability of the His-loop and the position of the propeller blades are adjusted (as summarized in Table 2).