Complexation of Two Proteic Insect Inhibitors to the Active Site of Chymotrypsin Suggests Decoupled Roles for Binding and Selectivity*

The crystal structures of two homologous inhibitors (PMP-C and PMP-D2v) from the insect Locusta migratoria have been determined in complex with bovine α-chymotrypsin at 2.1- and 3.0-Å resolution, respectively. PMP-C is a potent bovine α-chymotrypsin inhibitor whereas native PMP-D2 is a weak inhibitor of bovine trypsin. One unique mutation at the P1 position converts PMP-D2 into a potent bovine α-chymotrypsin inhibitor. The two peptides have a similar overall conformation, which consists of a triple-stranded antiparallel β-sheet connected by three disulfide bridges, thus defining a novel family of serine protease inhibitors. They have in common the protease interaction site, which is composed of the classical protease binding loop (position P5 to P′4, corresponding to residues 26–34) and of an internal segment (residues 15–18), held together by two disulfide bridges. Structural divergences between the two inhibitors result in an additional interaction site between PMP-D2v (position P10 to P6, residues 21–25) and the residues 172–175 of α-chymotrypsin. This unusual interaction may be responsible for species selectivity. A careful comparison of data on bound and free inhibitors (from this study and previous NMR studies, respectively) suggests that complexation to the protease stabilizes the flexible binding loop (from P5 to P′4).

Small canonical serine protease inhibitors are widely distributed among living organisms. They have been classified by Bode and Huber (1) into 16 structural families. One of the novel families that has emerged since then is the grasshopper family (2). The first members were characterized from the brain (pars intercerebralis) and the hemolymph of the insect Locusta migratoria (3–5). Furthermore, similar peptides (named SGPI) were isolated from Schistocerca gregaria (6); they share 40–80% homology (including the six conserved cysteines) with the Locusta peptides. More recently, the same sequence motif was also identified in Pacifastin, a 155-kDa protein from the crayfish Pacifastacus leniusculus that is composed of two domains with different activities. One of these domains contains nine repeats similar to the locust peptides and has a protease inhibitory activity (7).
We have carried out extensive investigations on two locust peptides, PMP-C and PMP-D2 (pars intercerebralis major peptide). They consist of 36 and 35 residues, respectively, have three disulfide bridges, and are 40% identical in sequence. The three-dimensional structures of PMP-D2 and PMP-C were determined by 1H NMR (8,9). These peptides display a new fold with an unusual disulfide bond pattern. PMP-C is a very potent bovine α-chymotrypsin inhibitor (Ki 0.13 nM) and a weak human leukocyte elastase inhibitor (Ki 180 nM) and is devoid of activity toward porcine trypsin. The nature of its P1 residue (nomenclature according to Schechter and Berger (10)), Leu30, is in accordance with the literature and its inhibitory properties. Indeed, chymotrypsin inhibitors have bulky or aromatic residues such as Phe, Leu, or Met as their P1 residue, whereas trypsins require basic amino acids (Lys or Arg). In PMP-D2, the corresponding residue is Arg29. However, PMP-D2 has no effect on porcine trypsin and is a weak inhibitor of bovine trypsin and chymotrypsin (Ki 100 and 1,500 nM, respectively), whereas it can be converted to a potent bovine α-chymotrypsin inhibitor by a single mutation R29L at P1 (Ki 0.8 nM). In addition, a second point mutation at P′1 (K30M) converts PMP-D2 into a fairly potent human leukocyte elastase inhibitor (Ki 2 nM) (5). Very recently, we showed that PMP-D2 is a strong inhibitor of trypsins isolated from L. migratoria.¹ The inhibitory properties of PMP-D2 are of great interest, because they highlight selectivity not yet observed for PMP-C.
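To put the inhibition constants quoted above on a common energy scale, the following minimal sketch converts each Ki into a standard binding free energy, ΔG° = RT ln(Ki), and computes the fold change in Ki produced by the R29L mutation. It treats Ki as a dissociation constant relative to a 1 M standard state and assumes a temperature of 25 °C, which is not stated in the text; the function and variable names are illustrative only.

```python
import math

R_GAS = 8.314462618       # gas constant, J mol^-1 K^-1
TEMPERATURE_K = 298.15    # ASSUMPTION: 25 degrees C; assay temperature not given in the text


def binding_free_energy_kj(ki_nanomolar, temperature_k=TEMPERATURE_K):
    """Return dG = RT ln(Ki) in kJ/mol, with Ki converted from nM to mol/l."""
    ki_molar = ki_nanomolar * 1e-9
    return R_GAS * temperature_k * math.log(ki_molar) / 1000.0


def fold_change(ki_before_nm, ki_after_nm):
    """Ratio of two inhibition constants (larger means tighter binding after)."""
    return ki_before_nm / ki_after_nm


# Ki values (nM) toward bovine alpha-chymotrypsin, as quoted in the text.
KI_CHYMOTRYPSIN_NM = {
    "PMP-C": 0.13,
    "PMP-D2 (native)": 1500.0,
    "PMP-D2 R29L": 0.8,
}

if __name__ == "__main__":
    for name, ki in KI_CHYMOTRYPSIN_NM.items():
        dg = binding_free_energy_kj(ki)
        print(f"{name}: Ki = {ki} nM, dG = {dg:.1f} kJ/mol")
    gain = fold_change(KI_CHYMOTRYPSIN_NM["PMP-D2 (native)"],
                       KI_CHYMOTRYPSIN_NM["PMP-D2 R29L"])
    print(f"R29L lowers Ki toward chymotrypsin by a factor of about {gain:.0f}")
```

Because the free energy depends on the logarithm of Ki, a change of three orders of magnitude in Ki corresponds to a difference of only a few tens of percent in ΔG°, which is one way to read the effect of a single P1 substitution.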
To provide the first description of the recognition mechanism in the grasshopper family, we solved the crystal structure of PMP-C in complex with bovine α-chymotrypsin. We also determined the structure of the R29L/K30M PMP-D2 variant (named PMP-D2v) bound to chymotrypsin, because the weak affinity of native PMP-D2 prevented any complex crystallization with available proteases.

EXPERIMENTAL PROCEDURES
Biological Material-The peptides were synthesized on solid phase using Fmoc strategy and were refolded by air oxidation of the six cysteines as described previously (5). Bovine α-chymotrypsin was purchased from Sigma (product code C3142).
Crystallization and X-ray Diffraction Studies-The peptides, as well as the enzyme, were solubilized in a buffer containing 50 mM Tris, pH 7.5, and mixed together in a molar ratio of 1:3 (enzyme:inhibitor). The crystallization trials were performed after 1 h of incubation. Single crystals of PMP-C peptide complexed to bovine α-chymotrypsin were obtained at 20°C using the hanging drop vapor diffusion method. Protein drops were equilibrated against a 1-ml reservoir solution containing 100 mM sodium acetate, 29% polyethylene glycol 400, and 100 mM CdCl₂ at pH 5.0. The drops consisted of 2 µl of protein solution at 10 mg/ml mixed with an equal volume of the reservoir solution. Hexagonal bipyramids of about 0.3 × 0.3 × 0.5 mm³ in size developed after 10 days. X-ray diffraction data were collected at room temperature to 2.1-Å resolution on a 300-mm Mar Research imaging plate with λ = 0.970 Å on beam line W32 at Laboratoire pour l'Utilisation du Rayonnement Electromagnétique (LURE). The data were processed using the DENZO software package (11). The PMP-C·α-chymotrypsin complex crystallized in the space group P6₅ with cell dimensions a = b = 92.9 Å, c = 165.8 Å. Specific volume calculations yielded three molecules per asymmetric unit, with a solvent content of 50%. A total of 46,359 unique reflections were indexed using the SCALEPACK program (11), with an Rmerge on intensities of 7.9%, a data set multiplicity of 3.9, and a completeness of 98.2%, between 20.0 and 2.1 Å.

* The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
To whom correspondence may be addressed. Fax: 33-3-88-60-74-64; E-mail: ckellen@chimie.u-strasbg.fr.
Single crystals of PMP-D2v complexed to bovine α-chymotrypsin were obtained at 20°C using the hanging drop vapor diffusion method. Protein drops were equilibrated against a reservoir solution similar to that used for PMP-C. Crystals about 0.2 × 0.2 × 0.3 mm³ in size developed after 15 days. X-ray diffraction data were collected at room temperature to 3.0-Å resolution on a 300-mm Mar Research imaging plate with λ = 0.970 Å on beam line W32 at LURE. The data were processed using the DENZO software package. The PMP-D2v·α-chymotrypsin complex crystallized in the space group P6₅22 with cell dimensions a = b = 86.1 Å, c = 187.3 Å. Specific volume calculations yielded one molecule per asymmetric unit, with a solvent content of ~60%. A total of 8,699 unique reflections were indexed using the SCALEPACK program, with an R factor on intensities of 15.1%, a data set multiplicity of 7.3, and a completeness of 99.8%, between 15.0 and 3.0 Å.
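The specific volume calculations mentioned for both crystal forms can be sketched as a Matthews-type estimate. The snippet below is an illustration under stated assumptions, not the authors' calculation: it uses the hexagonal cell volume V = a²c·sin(120°), the number of general positions for P6₅ (6) and P6₅22 (12), the common approximation solvent fraction ≈ 1 − 1.23/Vm, and an assumed molecular mass of about 29 kDa for one protease·inhibitor complex (roughly 25 kDa for chymotrypsin plus about 4 kDa for the peptide), which is not given in the text.

```python
import math

# ASSUMPTION: approximate mass of one chymotrypsin-inhibitor complex, in Da.
ASSUMED_COMPLEX_MASS_DA = 29_000.0


def hexagonal_cell_volume(a, c):
    """Unit cell volume (cubic angstroms) for a = b, gamma = 120 degrees."""
    return a * a * c * math.sin(math.radians(120.0))


def matthews_coefficient(a, c, n_symmetry_ops, n_copies, mass_da):
    """Vm in cubic angstroms per dalton."""
    n_molecules_in_cell = n_symmetry_ops * n_copies
    return hexagonal_cell_volume(a, c) / (n_molecules_in_cell * mass_da)


def solvent_fraction(a, c, n_symmetry_ops, n_copies,
                     mass_da=ASSUMED_COMPLEX_MASS_DA):
    """Approximate solvent fraction, 1 - 1.23 / Vm."""
    vm = matthews_coefficient(a, c, n_symmetry_ops, n_copies, mass_da)
    return 1.0 - 1.23 / vm


if __name__ == "__main__":
    # PMP-C complex: P6(5), 6 general positions, three copies per asymmetric unit.
    pmp_c = solvent_fraction(92.9, 165.8, n_symmetry_ops=6, n_copies=3)
    # PMP-D2v complex: P6(5)22, 12 general positions, one copy per asymmetric unit.
    pmp_d2v = solvent_fraction(86.1, 187.3, n_symmetry_ops=12, n_copies=1)
    print(f"PMP-C complex solvent fraction:   {pmp_c:.2f}")
    print(f"PMP-D2v complex solvent fraction: {pmp_d2v:.2f}")
```

With the assumed mass, the estimates should fall in the neighbourhood of the 50% and ~60% solvent contents reported above; trying other copy numbers per asymmetric unit shows why three and one copies, respectively, are the physically reasonable choices.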
Structure Determination-The structure of the PMP-C·α-chymotrypsin complex was solved by the molecular replacement method using the AMoRe program (12). The α-chymotrypsin structure (Protein Data Bank code 1CHO) was used as the search model. The rotation function gave two solutions, leading to three positions in the translation function, two of them with the same orientation. Before fitting, the correlation coefficient and R factor were 51.5 and 39.6%, respectively, which refined to 63.9 and 33.1%, respectively. Threefold averaging was then carried out in a region containing the α-chymotrypsin trimer, as well as the putative inhibitor binding region, using the DEMON/ANGEL package (13). All non-chymotrypsin density peaks were assigned as waters, and X-PLOR refinement (14) was carried out between 27 and 2.7 Å using strict non-crystallographic symmetry (NCS). Water molecules were progressively added during subsequent cycles of refinement until the three PMP-C molecules became visible. The model was then transferred to the high resolution data set collected at LURE, and the refinement was pursued, using cycles of ARP/X-PLOR between 10 and 2.1 Å, this time with weaker NCS restraints as determined by Rfree trials. Once the PMP-C molecules were mostly built, the refinement protocol was ended with several cycles of refinement in TNT, using data between 20 and 2.1 Å. The final model has good geometry, with an Rfree of 22.8% for an R factor of 18.9%. The three PMP-C molecules are complete between residues 2 and 35, 4 and 34, and 5 and 35, respectively. The Ramachandran plot shows that 86.9% of the non-glycine residues fall within the most favorable regions. The remaining 13.1% lie within the additionally allowed regions. The statistics of the refinement are given in Table I.
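The R factor and Rfree quoted above follow the usual crystallographic definition, R = Σ| |Fobs| − |Fcalc| | / Σ|Fobs|, with Rfree evaluated on a small set of reflections kept out of refinement. The sketch below illustrates that definition on toy amplitudes; it assumes Fcalc is already on the same scale as Fobs, picks the 5% test set at random with a fixed seed, and is not a reconstruction of what X-PLOR or TNT do internally.

```python
import random


def r_factor(f_obs, f_calc):
    """Conventional R factor between observed and calculated amplitudes.

    ASSUMPTION: f_calc has already been scaled to f_obs.
    """
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("amplitude lists must be non-empty and of equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def split_work_free(n_reflections, free_fraction=0.05, seed=0):
    """Return (work_indices, free_indices); the free set is never refined against."""
    rng = random.Random(seed)
    n_free = round(n_reflections * free_fraction)
    free = set(rng.sample(range(n_reflections), n_free))
    work = [i for i in range(n_reflections) if i not in free]
    return work, sorted(free)


if __name__ == "__main__":
    # Toy amplitudes, invented for illustration only.
    f_obs = [100.0, 200.0, 300.0]
    f_calc = [90.0, 210.0, 300.0]
    print(f"toy R factor = {r_factor(f_obs, f_calc):.4f}")

    # Size of a 5% test set for the 46,359 unique reflections of the PMP-C data set.
    work, free = split_work_free(46359)
    print(f"{len(work)} working reflections, {len(free)} test reflections")
```

Rfree is obtained by calling `r_factor` on the test-set reflections only; a gap of a few percentage points between R and Rfree, as reported here, is the usual sign that the model is not overfitted.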
The molecular replacement method using the AMoRe program was also used to solve the structure of the PMP-D2v·α-chymotrypsin complex with the same search model. The rotation function yielded one solution, and the translation function yielded a unique solution, with a correlation coefficient and an R factor of 63.0 and 36.8%, respectively, for data between 15 and 4 Å. After rigid body refinement, the correlation coefficient was 70% for an R factor of 32.7%. A weak electron density appeared for the inhibitor that was sufficient to start building the inhibitor structure. After several cycles of a slow cooling protocol starting at 2500 K, combined with manual replacement and building of the inhibitor on the graphic display with the Turbo-Frodo program (15), the R factor decreased to 16.4% (Rfree 19.3%). Coordinates for the structures of the PMP-C·α-chymotrypsin and PMP-D2v·α-chymotrypsin complexes have been deposited in the Protein Data Bank under file names 1GL1 and 1GL0, respectively.

RESULTS AND DISCUSSION
General Description of the Structures-The crystals of the PMP-C·α-chymotrypsin complex belong to the space group P6₅ and diffract up to 2.1-Å resolution. After phase determination by molecular replacement and phase improvement by density modification, refinement yielded an R factor and an Rfree of 18.9 and 22.8%, respectively (Table I).
The asymmetric unit contains three copies of the complex displaying nearly identical conformations. Pairwise comparisons of the Cα atoms of the three proteases yield root mean square deviations (r.m.s.d.) [...] of the estimated coordinate errors, it can be considered that the three complexes are identical. The main chain of the α-chymotrypsin component can be traced from Cys1 to Gly12 (activation peptide), from Ile16 to Tyr146, and from Asn150 to Asn245. The main chain of PMP-C shows no ordered electron density at either the amino (Glu1) or the carboxyl terminus (Asn36) (Fig. 1A). Fig. 1B shows the overall fold of PMP-C colored according to the temperature factors. The molecular replacement method was also used to solve the structure of the PMP-D2v·α-chymotrypsin complex. Crystals belong to the space group P6₅22 and diffract up to 3.0-Å resolution. The refinement yields an R factor of 16.4% and an Rfree of 19.3%. The model of the complex accounts for 242 residues for the protease (1–12 and 16–245) and for 33 residues for the inhibitor. The N-terminal (Glu1) and C-terminal (Ala35) residues of the inhibitor are not modeled because of lack of electron density. The refined average B factors are 21.5 Å² for all protease atoms and 67.8 Å² for the inhibitor. PMP-D2v exhibits the same overall structure as PMP-C with one β-sheet composed of three antiparallel β-strands (β1, residues 8–11; β2, residues 16–19; and β3, residues 25–28) connected by two loops (loop 1, residues 12–15; loop 2, residues 20–24) and stabilized by three disulfide bridges (Cys4-Cys19, Cys17-Cys27, and Cys14-Cys32).

Table I footnotes (fragment): where Fobs and Fcalc are the observed and calculated structure factor amplitudes, respectively. c, Rfree is calculated with 5% of the diffraction data, which were not used during the refinement.
Binding Loop of PMP-C and Its Interactions with α-Chymotrypsin-PMP-C buries 875 Å² of its solvent-accessible surface upon binding to the protease. As illustrated in Fig. 2A, the interaction site of PMP-C is composed of two regions, the binding loop (residues 26–34, 596 Å²) and an internal segment (residues 15–18, 153 Å²). Leu30 (P1) binds to the S1 pocket of the protease, and its side chain conformation is located in the deepest energy minimum (angular values of χ1 = −57°, χ2 = 173°) and superimposes well with the P1 side chain of inhibitors such as ascaris inhibitor of chymotrypsin/elastase (χ1 = −71°, χ2 = 164°; see Ref. 16) and OMTKY3 (χ1 = −48°, χ2 = 157°; see Ref. 17). The P3-P5 segment (residues 26–28) forms an antiparallel β-sheet with chymotrypsin residues 218–216. Ala32 (P′2) interacts with Phe41 through a hydrogen bond, and Pro34 (P′4) makes a stacking interaction with the Phe39 side chain. The intermolecular hydrogen bonds are listed in Table II, top. The binding loop of PMP-C is maintained in a rigid conformation by intramolecular hydrogen bonds to an internal segment (residues 14–20), as shown in Table II, bottom. In particular, a network of four hydrogen bonds (denoted by * in Table II, bottom), involving the Asn15 and Thr29 side chains, highly stabilizes P′1 and P2 and maintains the local conformation of P1. Reviewing the grasshopper family inhibitors, we found that Asn15 and Thr29 are the only conserved residues, apart from the six cysteines (Fig. 1C). Although unrelated to grasshopper inhibitors, Ecotin (18,19) displays a segment (residues 47–56) similar to that of PMP-C (residues 11–20), which is also connected to the binding site by a disulfide bridge (Cys50-Cys87). However, the second disulfide bridge present in PMP-C (Cys17-Cys28) is missing in Ecotin, and the cysteines are replaced by His53 and Ser82, respectively. This local conformation is maintained by a hydrogen bond network similar to that described above for PMP-C, where the Asn51 and Thr83 side chains play the same role as Asn15 and Thr29 (Fig. 3A).
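The Schechter and Berger position labels used throughout (P1, P′2, P5, and so on) can be derived mechanically from the residue number once the P1 residue is known. The helper below is a small illustrative sketch; its name is hypothetical, and it uses only the P1 assignments given in the text (Leu30 in PMP-C, Leu29 in PMP-D2v).

```python
def schechter_berger_label(residue_number, p1_residue_number):
    """Map an inhibitor residue number to its Schechter-Berger position.

    Residues N-terminal to the scissile bond are P1, P2, P3, ... counting
    backwards from P1; residues C-terminal to it are P'1, P'2, ...
    """
    offset = p1_residue_number - residue_number
    if offset >= 0:
        return f"P{offset + 1}"
    return f"P'{-offset}"


if __name__ == "__main__":
    # PMP-C: P1 is Leu30, binding loop spans residues 26-34.
    pmp_c_loop = {r: schechter_berger_label(r, 30) for r in range(26, 35)}
    print("PMP-C binding loop:", pmp_c_loop)

    # PMP-D2v: P1 is Leu29, binding loop spans residues 25-33.
    pmp_d2v_loop = {r: schechter_berger_label(r, 29) for r in range(25, 34)}
    print("PMP-D2v binding loop:", pmp_d2v_loop)
```

Applied to PMP-C, residues 26 and 34 map to P5 and P′4, matching the span of the binding loop described above, and Ala32 and Pro34 map to P′2 and P′4.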
Comparison of PMP-C and PMP-D2v-From P3 (Cys27) to P′2 (Gly31), the intermolecular interactions between PMP-D2v and α-chymotrypsin are similar to those observed for PMP-C. The P1 residue of PMP-D2v (Leu29) is in exactly the same conformation as that of PMP-C. However, some differences are observed between the two inhibitors. The nature of the P′4 residue in PMP-D2v (Gln33 instead of Pro34) suppresses the stacking with Phe39. In addition, the hydrogen bond between Ser218 and Ala26 (P5) in PMP-C is not present in PMP-D2v. A water-accessible surface area of 946 Å² is buried upon chymotrypsin binding to PMP-D2v (Fig. 2A). Besides the primary interaction site, composed of the binding loop (residues 25–33, 584 Å²) and the internal segment (residues 15–18, 108 Å²), PMP-D2v displays a secondary site (residues 20–24, 175 Å²). This region interacts with residues 172–175 of α-chymotrypsin through van der Waals contacts (Thr20-Trp172, Pro21-Gly173 and Thr174, Thr22-Gly173, Val24-Trp172). The structures of PMP-C and PMP-D2v were superimposed on each other (Fig. 2B), yielding an overall r.m.s.d. value of 0.7 Å for 17 residues (residues 8–17 and 27–33, according to PMP-C numbering). A large structural divergence, with deviations between Cα atoms higher than 5 Å, is observed for residues 4–7 and 18–27 (loop 2). Accordingly, PMP-C and PMP-D2v can be dissected into two structural subdomains as illustrated in Fig. 2B. On one hand, the so-called "subdomain I" is formed by residues 8–17 and 28–34, and these two regions are bound together by the disulfide bridge Cys14-Cys33 (numbering of PMP-C). The subdomains I of the two inhibitors carry the primary binding site and are therefore structurally similar. On the other hand, residues 4–7 and 18–27, which are linked together by the disulfide bridge Cys4-Cys19, constitute subdomains II. The structural divergences that characterize subdomains II may be caused by differences in loop 2, which is one residue shorter and possesses a Pro at position 21 in PMP-D2v. As a result, subdomain II of PMP-D2v is shifted closer to chymotrypsin and may be responsible for additional contacts (either favorable interaction or steric hindrance) and could therefore be considered as an element for discriminating between protease targets. PMP-C may be regarded as more permissive toward various enzymes, as also observed with S. gregaria inhibitors (20).

Figure legend (fragment): The color code is as follows: PMP-C in pink (subdomain I) and red (subdomain II), PMP-D2v in blue (subdomain I) and dark blue (subdomain II), the disulfide bridges in green, and α-chymotrypsin in gray.
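The comparison above rests on two quantities: an overall r.m.s.d. over the matched Cα atoms and per-residue Cα deviations, with 5 Å used as the threshold for a large divergence. The sketch below shows both calculations on coordinates that are assumed to be already superimposed and paired residue by residue; it does not perform the least-squares fit itself, and the toy coordinates are invented for illustration rather than taken from 1GL1 or 1GL0.

```python
import math


def ca_deviations(coords_a, coords_b):
    """Per-residue distances (angstroms) between paired, pre-superimposed C-alpha atoms."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    return [math.dist(a, b) for a, b in zip(coords_a, coords_b)]


def rmsd(deviations):
    """Root mean square of a list of per-atom deviations."""
    return math.sqrt(sum(d * d for d in deviations) / len(deviations))


def divergent_residues(residue_numbers, deviations, cutoff=5.0):
    """Residues whose C-alpha deviation exceeds the cutoff (5 angstroms by default)."""
    return [r for r, d in zip(residue_numbers, deviations) if d > cutoff]


if __name__ == "__main__":
    # Hypothetical coordinates for four paired residues (not real data).
    residues = [16, 17, 18, 19]
    inhibitor_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
    inhibitor_b = [(0.2, 0.0, 0.0), (3.8, 0.5, 0.0), (7.6, 0.0, 5.5), (11.4, 6.0, 0.0)]

    devs = ca_deviations(inhibitor_a, inhibitor_b)
    print("per-residue deviations:", [round(d, 2) for d in devs])
    print("residues diverging by more than 5 A:", divergent_residues(residues, devs))
    print(f"r.m.s.d. over the first two residues: {rmsd(devs[:2]):.2f} A")
```

Restricting the r.m.s.d. to the well-matched residues and listing the outliers separately mirrors the way the text separates subdomain I from subdomain II.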
Comparison of Complexed (X-ray) and Free (NMR) Inhibitors-The x-ray structure of bound PMP-C was compared with the free form (36 structures; Protein Data Bank code 1PMC), solved in solution by 1H NMR (9). The β-sheets of the two structures superimpose well, with an r.m.s.d. smaller than 1 Å, whereas two regions display larger deviations, loop 1 (residues 12–15) and the C terminus (residues 29–35), with r.m.s.d. values of 1.8 and 2.4 Å, respectively (see Fig. 3B). The NMR structures display an average r.m.s.d. of 0.33 (±0.14) Å for the backbone atoms of the well defined segments 3–11, 14–19, and 24–30 (calculation performed with 164 nuclear Overhauser effect-derived distances using the program X-PLOR) and below 1 Å for the regions 12–15 and 29–33. As the r.m.s.d. values from the NMR study are smaller than the deviations resulting from NMR/x-ray structure superimposition, the free and bound structures may be considered conformationally different in the regions 12–15 and 29–33. However, it should be pointed out that the conformation of region 29–33 was built from only four medium and long range nuclear Overhauser effects. Therefore, we believe that the scarcity of NMR data instead reflects greater flexibility in this region. Our assumption is supported by the examination of PMP-D2. Although the superimposition of the free (20 structures)³ and bound structures yields r.m.s.d. values similar to those of PMP-C (within 1 Å for the core and 1.3 and 2.1 Å for loop 1 and the C terminus, respectively), free PMP-D2v is globally less structurally defined than PMP-C, with NMR r.m.s.d. values in the range of 1–2 Å for loop 1 and of 2–6 Å for the C-terminal region (residues 29–35). This demonstrates a clear tendency toward flexibility in these regions.
A scenario of events can therefore be put forward: Leu30 (P1) penetrates deeply into the active site pocket S1 of the protease, followed by its neighbors Thr29 (P2) and residues 31–33, resulting in a rather rigid conformation of this stretch. The Cys14-Cys33 bridge, together with the hydrogen bond network involving Asn15 and Thr29, stabilizes residues 12–15. This cascade of events is supported by a detailed comparison of the main chain conformations of the P3-P′3 residues in bound and free PMP-C, in relation to the canonical values reported by Bode and Huber (1). The dihedral angles of the binding loop in our study are in accordance with the canonical values whereas some given by the NMR study are not, in particular for the P1 and P′1 residues (Table III). We therefore infer that the disordered C terminus of free PMP-C achieves conformational stabilization upon chymotrypsin binding.
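The dihedral angles compared above (backbone angles of the binding loop, and the side chain angles quoted earlier for P1) are all computed the same way from four consecutive atoms. The sketch below is a generic illustration: it returns the signed torsion angle in degrees and offers a simple range check. The atom coordinates in the example are invented, and the window passed to the range check must be supplied by the caller; no canonical values from Table III are reproduced here.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _scale(a, s):
    return tuple(x * s for x in a)


def dihedral_deg(p0, p1, p2, p3):
    """Signed torsion angle (degrees, -180 to 180) defined by four points.

    For a backbone angle the points would be C(i-1), N, CA, C (first angle) or
    N, CA, C, N(i+1) (second angle); for a side chain chi1 they would be
    N, CA, CB, CG.
    """
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    b1 = _scale(b1, 1.0 / math.sqrt(_dot(b1, b1)))
    # Components of the outer bonds perpendicular to the central bond.
    v = _sub(b0, _scale(b1, _dot(b0, b1)))
    w = _sub(b2, _scale(b1, _dot(b2, b1)))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


def in_window(angle_deg, low_deg, high_deg):
    """True if the angle lies in [low, high]; windows crossing 180 are not handled."""
    return low_deg <= angle_deg <= high_deg


if __name__ == "__main__":
    # Toy geometry (not real atoms): a quarter turn about the z axis.
    angle = dihedral_deg((1.0, 0.0, 0.0), (0.0, 0.0, 0.0),
                         (0.0, 0.0, 1.0), (0.0, 1.0, 1.0))
    print(f"toy dihedral = {angle:.1f} degrees")
```

Running such a check over the P3 to P′3 residues of each NMR model and of the crystal structure is one way to tabulate which residues fall inside or outside a chosen canonical window.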
Such a hypothesis is supported by the example of Elafin, a potent elastase inhibitor. Its structure was studied in complexed (21) and free (22) forms. The binding loop of the free inhibitor shows a high degree of flexibility, most of the φ and ψ angles being out of the range defined for a canonical conformation. In the crystal structure, the φ and ψ angles for P3 to P′3 residues fall in the defined range.

³ G. Mer, personal communication.
Concluding Remarks-We have determined the structures of two insect inhibitors in complex with bovine α-chymotrypsin. These inhibitors display a new fold, unrelated to that of other small serine protease inhibitors, and were recognized as the first members of the grasshopper family (2). To our knowledge, PMP-C and PMP-D2 are the only insect serine protease inhibitors that have been studied in both their free and bound forms.
From crystal structure superimposition, it appears that they are composed of two subdomains. The first subdomain is structurally conserved and is responsible for the inhibition as it contains the primary interaction site. The binding loop (P5-P′4) is found in a canonical conformation and superimposes well with that of other inhibitors such as OMTKY3, Ecotin, and bovine pancreatic trypsin inhibitor (23). In locust peptides, the binding loop is anchored to an internal segment by a hydrogen bond network involving mainly the two conserved residues Asn15 and Thr29. This feature can be considered as a signature of the grasshopper family. The similarities reported in Ecotin strengthen this view, so we may consider that this "micro device" contributes to the inhibitory mechanism.
The second subdomain displays more variability in terms of sequence and structure. In PMP-D2v, it contains an additional interaction site with the protease that could be involved in the species selectivity. Indeed, PMP-D2 is not active toward porcine trypsin, whereas it is weakly and strongly active toward bovine and locust trypsins, respectively.¹ Both x-ray and NMR structures of PMP-C show the binding loops in extended conformations. However, whereas the x-ray data account for a canonical conformation, the free inhibitor does not present the typical 3₁₀ helix conformation for the P1 residue. Thus we propose that the lock and key mechanism for serine protease inhibition (24) is valid for locust inhibitors, where the flexible binding loop is seen as a "soft" preformed key that acquires its canonical conformation upon lock-enzyme binding.
Until recently, there was a dearth of sequence and structure information about insect serine proteases, although a large number of insect inhibitors were characterized (25). The only crystal structure of an insect proteolytic enzyme (ant chymotrypsin) was reported recently (26). In the context of the physiological targets of locust inhibitors, Lam et al. (27) identified several forms of chymotrypsins and trypsins (28) from the midgut of L. migratoria. Because we established the species selectivity of PMP-D2, further x-ray studies of locust protease·locust inhibitor complexes will be helpful in characterizing the peculiar structure of insect proteases and the structural requirements for inhibition.
The preeminent feature of the recently sequenced insect genome of Drosophila melanogaster was the high number of putative protease open reading frames (over 300 identified). Such potentially dangerous enzymes need to be handled carefully by insects along their developmental pathway, a task that might be devoted to the short peptide inhibitors described in this study.