Inhibition of the Hepatitis C Virus NS3/4A Protease

The hepatitis C virus NS3 protein contains a serine protease domain with a chymotrypsin-like fold, which is a target for development of therapeutics. We report the crystal structures of this domain complexed with NS4A cofactor and with two potent, reversible covalent inhibitors spanning the P1–P4 residues. Both inhibitors bind in an extended backbone conformation, forming an anti-parallel β-sheet with one enzyme β-strand. The P1 residue contributes most to the binding energy, whereas P2–P4 side chains are partially solvent exposed. The structures do not show notable rearrangements of the active site upon inhibitor binding. These results are significant for the development of antivirals.

Hepatitis C virus (HCV) infection is a major health problem that leads to cirrhosis and hepatocellular carcinoma in a substantial number of the estimated 100-200 million infected individuals worldwide. Immunotherapy or other effective treatments for HCV infection are not yet available, and administration of interferon in combination with ribavirin has several limitations because of toxicity (1). One of the best characterized targets for HCV therapy is the serine protease of the NS3 protein.
The NS3 protease domain constitutes the N terminus of the NS3 protein; when associated with the NS4A polypeptide, it is activated and is responsible for maturation of the viral polyprotein (2).
The structure determination of the HCV NS3 protease complexed with a truncated NS4A cofactor (residues 21-34) revealed a shallow, nonpolar P1 specificity pocket. Because of the unusual substrate specificity of this enzyme, it has been inferred that the design of highly selective inhibitors that could bind to the NS3 protease would be unlikely (3,4). We have found that capped tripeptide α-ketoacids, incorporating difluoro aminobutyric acid in the P1 position, are potent, slow binding inhibitors of this enzyme (5). Their mechanism of inhibition is biphasic. The first kinetic phase involves the rapid formation of a noncovalent collision complex with association rate constants >0.2 s⁻¹, and the second kinetic phase consists of a slow isomerization with rate constants between 5 × 10⁻³ and 7.5 × 10⁻³ s⁻¹. This results in the formation of a very tight complex with dissociation rate constants between 1.2 × 10⁻⁵ and 1.8 × 10⁻⁵ s⁻¹ and with half-lives of 11-16 h. The overall Ki values are between 27 and 67 nM (5).
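The quoted half-lives follow from the dissociation rate constants if the covalent complex is assumed to decay by simple first-order kinetics, so that t1/2 = ln 2 / k_off. The short sketch below only checks that arithmetic; the function name and the labels are illustrative and do not come from the kinetic analysis in Ref. 5.

```python
import math


def half_life_hours(k_off_per_s: float) -> float:
    """Half-life in hours for first-order decay with rate constant k_off (s^-1)."""
    return math.log(2) / k_off_per_s / 3600.0


# Dissociation rate constants quoted in the text (s^-1).
# The dictionary keys are labels chosen for this example only.
K_OFF = {"lower bound": 1.2e-5, "upper bound": 1.8e-5}

if __name__ == "__main__":
    for label, k_off in K_OFF.items():
        print(f"{label}: k_off = {k_off:.1e} s^-1, t1/2 = {half_life_hours(k_off):.1f} h")
```

Under this assumption the two limits give half-lives of about 16 and 11 h, in agreement with the range stated above.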
The inhibitors described here span the P1-P4 residues and contain an activated carbonyl in an α-ketoacid moiety as the active-site serine trap. To investigate the binding mode of these compounds, a hexagonal crystal form of the NS3 protease domain (J strain) complexed with the truncated NS4A cofactor, amenable to soaking experiments, was obtained. The crystal structures of the noninhibited NS3/4A complex (2.4 Å) and of its complexes with two inhibitors (Fig. 1A), inhibitor I, the α-ketoacid t-BOC-L-Glu-L-Leu-L-(difluoro)aminobutyric acid (2.1 Å), and inhibitor II, the α-ketoacid Z-L-Ile-L-Leu-L-(difluoro)aminobutyric acid (2.4 Å), were solved.

EXPERIMENTAL PROCEDURES
Protein Expression and Purification-A DNA fragment encoding the serine protease domain of NS3J (amino acids 1-187) was obtained by polymerase chain reaction amplification of full-length cDNA and cloned in the pT7-7 vector. The NS3 protein was expressed in Escherichia coli BL21(DE3) as described previously (6). The purified protein was characterized by N-terminal sequencing, electrospray ionization mass spectrometry, dynamic light scattering, and HPLC activity assay. Results obtained from the N-terminal sequencing were confirmed by the molecular mass determined by electrospray ionization mass spectrometry (Met-NS3J (1-187)). Molecular size estimation by light scattering showed NS3J and the NS3J/4A complex to have hydrodynamic radii (R_H) of 2.6 and 2.2 nm, which correspond to estimated molecular masses of 29 and 20 kDa, respectively. These results indicate a more compact protein state for the complex compared with NS3 alone and correlate with the activation of the protease upon complex formation with the cofactor, as determined by HPLC activity assay following the protocol previously described (7). The kinetics of inhibition of the NS3 protease by the α-ketoacids were determined by stopped flow experiments, as described by Narjes et al. (5).
Crystallization-The NS3 protein (1 mg·ml⁻¹) was incubated (4 °C) with the NS4A cofactor peptide, containing a solubilizing lysine tag at its N and C termini (KGSVVIVGRIILSGRK), at a molar ratio of 1:2 and concentrated by ultrafiltration to 290 μM. NS3J/4A crystals, with a maximum size of 0.6 × 0.3 × 0.2 mm³, were obtained by both "hanging" and "sitting drop" vapor diffusion methods, after 2 weeks at room temperature, with 3.4 M NaCl, 4.8 mM cyclohexyl-pentyl-β-D-maltoside, 5 mM dithiothreitol, and 0.02% NaN₃ in 0.1 M citrate buffer, pH 5.1. The ternary complexes with inhibitors were prepared by adding 5 mM inhibitor I or 2.5 mM inhibitor II to the stabilized NS3J/4A crystals (in 4.5 M NaCl, 10 mM dithiothreitol, 0.1 M citrate buffer, pH 5.1) and equilibrating for 2-3 weeks before mounting. Inhibitor binding inside the crystal was confirmed by mass spectrometry.
Structure Determination-The x-ray diffraction data were collected at 100 K at beam lines X11 and X31 (EMBL, Deutsches Elektronen Synchrotron, Hamburg, Germany) and ID14/EH3 (European Synchrotron Radiation Facility, Grenoble, France), using 30% glycerol as cryoprotectant. Data were integrated and scaled with the HKL suite (8) and with the CCP4 suite (9). Crystals belong to the space group P6₁ with two molecules in the asymmetric unit. A summary of the diffraction data is presented in Table I. The noninhibited structure was solved by molecular replacement with AMoRe (10), using the NS3BK/4A coordinates (Protein Data Bank code 1JXP) as a search model. The starting models of the inhibited structures were obtained by rigid body refinement of the refined native coordinates within AMoRe. In each case, the unique inhibitor was built into the initial, clearly interpretable 2Fo − Fc and Fo − Fc density maps. Refinement, using a maximum likelihood target function, was performed with REFMAC (11). All data were used (no cutoff) from 20 Å to the high resolution limit of each data set (Table I). Modeling of solvent sites was executed with an automatic refinement program, ARP (12). In the refinement, 5% of the data was set aside for use as a cross-validation set (13). Refinement was continued, interspersed with manual model building with the program O (14). The final statistics are given in Table I. The refined 2Fo − Fc electron density maps for the inhibitors I and II and for the glycerol molecule are shown in Fig. 2.
FIG. 1. The α-ketoacid inhibitors bound to the NS3/4A complex. A, inhibitor I, the α-ketoacid t-BOC-L-Glu-L-Leu-L-(difluoro)aminobutyric acid, and inhibitor II, the α-ketoacid Z-L-Ile-L-Leu-L-(difluoro)aminobutyric acid. B, secondary structure diagram of the NS3 protease in complex with the NS4A cofactor peptide and with inhibitor I (stick model). Inhibitor II is superimposed (ball-and-stick model). Helices are colored in red, sheets are in yellow, turns are in purple, and random coils are in green. Both inhibitors are colored by atom type (carbon is green, oxygen is red, fluorine is light blue, and nitrogen is purple). N and C indicate the N and C termini of the protease domain, whereas N′ and C′ indicate the N and C termini of the NS4A cofactor peptide. The catalytic Ser139 is also shown in a stick model colored by atom type. C, stereo view of the hydrogen bond interactions of inhibitors I and II (darker green) bound to NS3. The NS3 residues and both inhibitors are shown as stick models and colored by atom type. D, solid surface representation of the NS3 active site with the bound inhibitor I (ball-and-stick model, colored by atom type). On the protein surface, acidic residues are colored in red, basic residues are in light purple, hydrophobic residues are in lime green, and Ser139 is in very light blue. B and D were created using Insight97.0 from Molecular Simulations, Inc. (San Diego, CA); C was created with BOBSCRIPT (24).
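The cross-validation step can be illustrated with a minimal sketch that flags 5% of the reflections as a test set and evaluates the conventional residual R = Σ||Fo| − |Fc|| / Σ|Fo| separately on the working and test sets. This is not the REFMAC implementation: the function names, the random selection scheme, and the toy amplitudes are assumptions made for the example, and scaling between Fo and Fc is ignored.

```python
import random


def split_free_set(n_reflections, free_fraction=0.05, seed=0):
    """Return one boolean per reflection; True marks the held-out (free) set."""
    rng = random.Random(seed)
    n_free = round(n_reflections * free_fraction)
    free_idx = set(rng.sample(range(n_reflections), n_free))
    return [i in free_idx for i in range(n_reflections)]


def r_factor(f_obs, f_calc):
    """Conventional residual: sum(| |Fo| - |Fc| |) / sum(|Fo|)."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def r_work_and_free(f_obs, f_calc, is_free):
    """Compute R on the working set and on the free set separately."""
    work = [(o, c) for o, c, f in zip(f_obs, f_calc, is_free) if not f]
    free = [(o, c) for o, c, f in zip(f_obs, f_calc, is_free) if f]
    r_work = r_factor([o for o, _ in work], [c for _, c in work])
    r_free = r_factor([o for o, _ in free], [c for _, c in free])
    return r_work, r_free


if __name__ == "__main__":
    # Toy amplitudes (hypothetical): every Fc is 10% lower than its Fo.
    f_obs = [100.0 + i for i in range(200)]
    f_calc = [0.9 * o for o in f_obs]
    flags = split_free_set(len(f_obs))
    print(sum(flags), r_work_and_free(f_obs, f_calc, flags))
```

Because the free reflections never enter the refinement target, a test-set residual that rises while the working-set residual falls is the usual sign of overfitting.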
Coordinates-The coordinates of the three structures have been deposited in the Protein Data Bank (accession codes 1dxp, 1dy8, and 1dyg).

RESULTS AND DISCUSSION
NS3 Structure-Two crystal forms of the NS3/4A complex (H and BK strains) have been reported (3,4). Likewise, the NS3J/4A complex contains a C-terminal domain (residues 94-175) with a six-stranded β-barrel that ends with a C-terminal helix. The N-terminal domain (NS3 residues 1-93 and NS4A residues 21-34), also a β-barrel, contains eight β-strands, with one of the two additional strands contributed by the N-terminal region of the protease and the other by the NS4A cofactor (Fig. 1B). The cofactor assumes an extended conformation forming main chain hydrogen bonds with the protease in an anti-parallel fashion. The N terminus is well ordered in both molecules of the asymmetric unit, and, as such, it is similar to the NS3BK/4A crystal structure (4). Conversely, in the NS3H/4A structure (3), only one N terminus is ordered. Both inhibitors bind at the active site (His57, Asp81, and Ser139) located at the crevice between the two domains and are rather solvent exposed (Fig. 1B). In one of the two molecules of the asymmetric unit of the noninhibited enzyme, a glycerol molecule is bound at the oxyanion hole through a hydrogen bonding system (Fig. 3A). This finding correlates with biochemical data showing glycerol-induced stabilization of the NS3/4A complex (15). At the metal binding site, the Zn²⁺ ion is present in the noninhibited structure (Fig. 3B) and is coordinated to Cys97, Cys99, Cys145 and, via a water molecule, to the imidazole ring of His149. The electron density at this site is very well defined. In contrast, in the inhibited structures, the metal ion is absent or present only with a low occupancy, and this site is quite disordered. The absence of Zn²⁺ is likely due to the presence of dimethyl sulfoxide (~5%), necessary to dissolve the inhibitors. This is the only region where there is a large root mean square deviation (r.m.s.d.) for the Cα distance (~0.6-1.0 Å) between the noninhibited and inhibited structures. Otherwise the r.m.s.d. is within the coordinate experimental error. Perturbations at the Zn²⁺ site do not affect the NS3/4A complex or the active site conformation in our crystal structures. This appears to be in conflict with folding studies (6), although it has not yet been formally shown that the removal of Zn²⁺ from the folded NS3 protease has an effect on the activity or the structure.
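For reference, the Cα r.m.s.d. used in this comparison can be computed as in the sketch below. It assumes that the two coordinate sets contain the same residues in the same order and have already been superimposed; the function name and the toy coordinates are hypothetical and are not taken from the deposited structures.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """Root mean square deviation (in the input units) between paired atoms.

    coords_a and coords_b are equal-length sequences of (x, y, z) tuples
    for equivalent C-alpha atoms of two pre-superimposed structures.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


if __name__ == "__main__":
    # Toy example: every atom displaced by 1.0 along z gives an r.m.s.d. of 1.0.
    native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
    inhibited = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0)]
    print(ca_rmsd(native, inhibited))
```

Restricting the input lists to the residues around the metal binding site, or to all other residues, gives the kind of regional comparison described above.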
Inhibitor Binding-The binding mode of the two inhibitors is equivalent, forming an antiparallel β-sheet with the protease, with one strand contributed by the inhibitor and one by the protease (Fig. 1, B and C). Two pathways leading to different conformations (Fig. 4) are possible, depending on the stereochemistry of the nucleophilic attack at the carbonyl carbon of the ketoacid by the serine OH group. In the x-ray crystal structures of thrombin and trypsin complexed with p-amidinophenylpyruvate (16,17), the attack is from the re-side (Fig. 4, complex I). In the case of HCV NS3 protease, the carbonyl carbon of the inhibitor is attacked from the si-side by the catalytic serine, forming a covalent bond (1.44 Å) with the inhibitor via the Oγ, resulting in the tetrahedral intermediate shown in Fig. 4 (complex II). The carbonyl oxygen does not point into the oxyanion hole, as observed for thrombin and trypsin (16,17); instead, it is hydrogen bonded to the Nε2 of His57. The two carboxyl oxygens point into the oxyanion hole: one forms a hydrogen bond with the Ser139 nitrogen, and the other is hydrogen bonded with the Gly137 nitrogen. This different binding mode may be influenced by the soaking conditions, although it explains the remarkable discrepancy between the kinetics of hemiketal formation in the p-amidinophenylpyruvate/trypsin complex (18) and the kinetics of covalent bond formation in the α-ketoacid/NS3 complexes (5). A biphasic kinetic behavior, analogous to that reported for our inhibitors (5), was observed for the reaction between trypsin and p-amidinophenylpyruvate. In the latter case, the rate constants describing the equilibrium between the initial complex and the covalent complex were determined to be 3.15 and 0.1 s⁻¹, respectively (18). These data should be compared with 7.5 × 10⁻³ and 1.2 × 10⁻⁵ s⁻¹ for the NS3-inhibitor I complex and with 5 × 10⁻³ and 1.8 × 10⁻⁵ s⁻¹ for the NS3-inhibitor II complex, respectively. Therefore, these results suggest a significantly larger energy barrier for both the formation and the dissociation of the covalent complex in the case of the NS3 protease and correlate with the structural differences of the final covalent complexes (Fig. 4).
TABLE I footnotes. e, Rfree is calculated from 5% of the data that were omitted during the course of the refinement. f, r.m.s.d. is the root mean square deviation from ideal geometry. g, As defined by PROCHECK (25), the percentage distribution is given in parentheses.
FIG. 2. ... resolution and contoured at 1.0 σ, for the glycerol molecule and Ser139. All panels were created with BOBSCRIPT (24).
FIG. 3. ... NS3 (B). A, all residues and the glycerol molecule are shown as stick models and colored by atom type. The interaction between the glycerol molecule and Lys136 is speculative, because the side chain of Lys136 is not visible in the electron density of the noninhibited structure and has been modeled based on the two inhibited structures. B, residues, in a stick model, are colored by atom type; the zinc ion, in a CPK model, is colored dark yellow; and a water molecule, in a CPK model, is colored magenta. Both panels were created using Insight97.0 from Molecular Simulations, Inc. (San Diego, CA).
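The size of the barrier difference implied by the trypsin and NS3 rate constants can be estimated with transition state theory. If the two reactions are assumed to share the same pre-exponential factor, the difference in activation free energy is ΔΔG‡ = RT ln(k_fast/k_slow). The sketch below applies this to the rate constants quoted in the text; the temperature of 298.15 K and the function name are assumptions, because the measurement temperature is not restated here.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def barrier_difference_kcal(k_fast, k_slow, temperature_k=298.15):
    """Difference in activation free energy (kcal/mol) implied by two rate
    constants, assuming identical pre-exponential factors (Eyring form)."""
    return R_KCAL * temperature_k * math.log(k_fast / k_slow)


# Rate constants quoted in the text (s^-1): (formation, dissociation).
TRYPSIN = (3.15, 0.1)
NS3_COMPLEXES = {
    "inhibitor I": (7.5e-3, 1.2e-5),
    "inhibitor II": (5e-3, 1.8e-5),
}

if __name__ == "__main__":
    for name, (k_form, k_diss) in NS3_COMPLEXES.items():
        d_form = barrier_difference_kcal(TRYPSIN[0], k_form)
        d_diss = barrier_difference_kcal(TRYPSIN[1], k_diss)
        print(f"{name}: formation +{d_form:.1f}, dissociation +{d_diss:.1f} kcal/mol")
```

Under these assumptions the NS3 complexes face barriers roughly 3.6-3.8 kcal/mol higher for formation and 5.1-5.3 kcal/mol higher for dissociation than the trypsin complex.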
Another characteristic of our structures is that His57 makes two strong hydrogen bonds (ranging from 2.33 to 2.75 Å for all three structures; Fig. 1C): one from its Nδ1 to the Oδ2 of Asp81, the other from its main chain nitrogen to the Oδ1 of the same aspartate. The presence of these particularly short hydrogen bonds is consistent with the structure of the subtilisin protease from Bacillus lentus, solved at 0.78 Å (19). More generally, the existence of hydrogen bonds with a partially covalent character has been demonstrated by a recent Compton scattering study (20). The His57-Asp81 interaction is missing in the structures of the noninhibited NS3 without the NS4A cofactor (21,22), whereas it is present in our noninhibited NS3/4A complex (Fig. 3A). The solvent exposed residue Lys136, close to the active site, likely stabilizes the carboxylate bound at the oxyanion hole (Fig. 1, C and D).
The P1 and P2 residues of both inhibitors are the same, (difluoro)aminobutyric acid and Leu, respectively (Fig. 1, B and C). The S1 pocket constitutes a small nonpolar depression on the protease surface and is formed by Val132, Leu135, and Phe154, which, together with the aliphatic part of Lys136, can make lipophilic interactions, thus explaining the preference for a Cys residue in the P1 position of substrates. The P1 residue of both inhibitors, a chemically inert fluorocarbon group, mimics the cysteine thiol of natural substrates (5) and makes hydrophobic interactions with Val132, Leu135, the aliphatic part of Lys136, and Phe154 (Fig. 1D). In addition, attractive interactions between the fluorine lone pairs and the aromatic C-H of Phe154 are possible. In the S2 region, the Ala156 side chain together with His57 and the aliphatic part of Arg155 provide a small hydrophobic patch on the protein surface (Fig. 1D). The side chains of His57, Arg155, and Ala156 sterically limit the space available to the P2 residue. Because of this steric hindrance, hydrophobic β-branched amino acids cannot be accommodated in S2, hence explaining the lower activity of inhibitors with Val or Ile in P2 (23), whereas Leu in P2 fits well. Inhibitor binding is further reinforced through its main chain N1 and O3, which form hydrogen bonds with the main chain carbonyl of Arg155 and the main chain amide of Ala157, respectively (Fig. 1C). The S3 surface, an extension of the S1 pocket, is formed by the hydrophobic side chains of Ala157 and Cys159 and makes lipophilic interactions with the nonpolar part of the P3 side chain of the inhibitors (Fig. 1D; Glu for inhibitor I and Ile for inhibitor II). In the structure of NS3/4A in complex with inhibitor I, the carboxylate of Glu forms a salt bridge with the NZ of Lys136 (Fig. 1C).
Because of the bulky nature of the P3 side chain of inhibitor II, Lys136 is sterically restricted and also visible, whereas it is disordered in the noninhibited structure. In addition, at the P3 position, a carbon-sulfur interaction with the enzyme residue Cys159 is possible, thereby helping to stabilize the inhibitor in an extended conformation. A hydrogen bond is formed between the backbone N2 of the inhibitors and the main chain carbonyl of Ala157 (Fig. 1C). The S4 site consists of a solvent exposed hydrophobic patch, created by Arg123 and Val158 (Fig. 1D). This small hydrophobic area should favor hydrophobic amino acids. The preference for bulky hydrophobic amino acids in P4, shown previously (23), can be explained by the hydrophobic interaction of the P4 residue (tert-butyl in inhibitor I and benzyl capping group in inhibitor II) with Arg123 and Val158 of the enzyme and by the intramolecular hydrophobic contact with Leu in P2 (Fig. 1D). To our knowledge, this is the first report of the crystal structure of inhibitors bound to the HCV NS3/4A complex, and it permits the interpretation of structure/activity relationship data (5). These structures show that even in the presence of an inhibitor, the substrate-binding region remains largely featureless and solvent-exposed, and they underscore the likely difficulties of developing small drug-like inhibitors.