Insights to Substrate Binding and Processing by West Nile Virus NS3 Protease through Combined Modeling, Protease Mutagenesis, and Kinetic Studies*

West Nile Virus is becoming a widespread pathogen, infecting people on at least four continents with no effective treatment for these infections or many of their associated pathologies. A key enzyme that is essential for viral replication is the viral protease NS2B-NS3, which is highly conserved among all flaviviruses. Using a combination of molecular fitting of substrates to the active site of the crystal structure of NS3, site-directed enzyme and cofactor mutagenesis, and kinetic studies on proteolytic processing of panels of short peptide substrates, we have identified important enzyme-substrate interactions that define substrate specificity for NS3 protease. In addition to better understanding the involvement of S2, S3, and S4 enzyme residues in substrate binding, a residue within cofactor NS2B has been found to strongly influence the preference of flavivirus proteases for lysine or arginine at P2 in substrates. Optimization of tetrapeptide substrates for enhanced protease affinity and processing efficiency has also provided important clues for developing inhibitors of West Nile Virus infection.

regions of Africa, the Middle East, Europe, Russia, western Asia, and Australia (less severe subtype Kunjin) and most recently in North America (1). WNV is transmitted by Culex mosquitoes from avian reservoir hosts to vertebrate dead-end hosts, including humans and horses (2). Human infection is generally asymptomatic or causes a mild febrile disease, West Nile fever. However, more recent infections of WNV have also been associated with higher rates of severe neurological disease and fatalities, particularly among the elderly (2). Since the introduction of WNV into New York in 1999, the virus has spread rapidly throughout North America, infecting over 19,000 people and causing more than 700 fatalities (see the Centers for Disease Control and Prevention site on the World Wide Web at www.cdc.gov/ncidod/dvbid/westnile/index.htm). Currently there is no vaccine or antiviral therapy for the prevention or treatment of human WNV infection (1).
WNV is a small, enveloped virus with a single-stranded, positive sense 11-kb RNA genome, which encodes a single polyprotein precursor. This polyprotein must be cleaved co- and posttranslationally to produce 10 functional proteins: three structural (C, prM, and E) and seven nonstructural (NS1, NS2A, NS2B, NS3, NS4A, NS4B, and NS5). Translation of the viral polyprotein is membrane-associated with host proteases cleaving junctions within the lumen of the endoplasmic reticulum and the Golgi, whereas a viral protease encoded within NS3 cleaves at the junctions NS2A/NS2B, NS2B/NS3, NS3/NS4A, and NS4B/NS5 and also internal sites within C, NS3, and NS4A (Fig. 1) (3). Cleavage at these sites by the NS3 protease is essential for viral replication, so the protease is a potential therapeutic target (4-7).
NS3 is a multifunctional protein, the protease comprising the N-terminal third and nucleotide triphosphatase, RNA triphosphatase, and helicase components comprising the remainder (8-10). NS3 is a trypsin-like serine protease with a classical catalytic triad (His-51, Asp-75, Ser-135) (11) and is highly specific for substrates with dibasic P1 and P2 components and a small amino acid at P1′. This recognition sequence is highly conserved throughout flaviviruses (12); however, different flaviviruses prefer either Lys or Arg at P2. Although Den and YF NS3 proteases predominantly recognize Arg at P2, WNV protease recognizes Lys at P2 (Table 1). The activity of flavivirus NS3 proteases is dependent on an NS2B cofactor, with truncation studies in Den2 having shown that a central 40-amino acid hydrophilic domain is sufficient for activity (13). The flanking hydrophobic domains within NS2B are likely to function in promoting membrane association of NS2B-NS3 (14).
Due to their pivotal roles in both normal physiology and disease, proteases are increasingly attracting interest as pharmaceutical targets (15). Since early successes in human immunodeficiency virus chemotherapy (human immunodeficiency virus-protease inhibitors) and in the treatment of high blood pressure (angiotensin-converting enzyme (ACE) inhibitors), a large number of new protease inhibitors have entered clinical trials (16). One reason for the drug potential of proteases is the relatively predictable way in which they recognize their substrates and inhibitors in extended (β-strand) conformations (4,17). There are no known examples of proteolytic processing of peptide α-helices, β-sheets, or β-turns. Greater access to three-dimensional structures for proteases (over 1500 in the Protein Data Bank) has also facilitated hybrid structure/substrate-based drug design (5).
Recently reported crystal structures for NS2B/NS3 proteases of both WNV and Den2 (6) provide new structural insights to flaviviral proteases in ligand-bound conformations. An earlier homology model of the WNV protease (7), derived from the crystal structures of a highly homologous dengue NS3 protease without NS2B cofactor (18) and a less homologous hepatitis C virus NS3 with bound NS4A cofactor (19), differs significantly from the crystal structure of WNV NS2B-NS3. This has prompted a reexamination of some of the previous mutagenesis data (20). The reported WNV protease crystal structure shows an N-capped tetrapeptide aldehyde inhibitor bound in the substrate-binding cleft as a loop (instead of the commonly observed β-strand conformation) with the "P5" capping benzoyl residue sitting on top of the P1 residue. It therefore seemed unlikely that this ligand had the same binding mode as substrates beyond P1 and P2.
In previous kinetic studies using short hexapeptide p-nitroanilide substrates derived from endogenous polypeptide cleavage sites, we found no preference of WNV protease for specific residues except at P1 and P2 (7). However, more recent studies using nonnative dengue hexapeptide and decapeptide 5 and tetrapeptide and octapeptide (21) substrate sequences have suggested opportunities for enhancing substrate affinity for flaviviral proteases using nonnative or nonproteinogenic amino acids with hydrophobic (Nle, Leu) residues at P4 and a basic (Lys > Arg) residue at P3, a feature not seen in suboptimal native sequences. Using a combination of computer docking of substrates into the enzyme crystal structure, site-directed mutagenesis of the protease (Fig. 1), and kinetic studies of the processing of tetrapeptide substrates, we have focused the present study on increasing substrate affinity and processing efficiency, identifying the enzyme residues likely to be involved in binding to the P2-P4 positions of substrates, and taking early steps toward potent substrate-based nonpeptidic inhibitors by incorporating unnatural amino acids in tetrapeptide substrates.

EXPERIMENTAL PROCEDURES
para-Nitroanilide (pNA) Substrate Synthesis-pNA substrates were synthesized according to the general method of Abbenante et al. (22) and characterized by analytical high performance liquid chromatography, mass spectrometry, and NMR (see supplemental material).
Modeling of Substrates into the NS3 Crystal Structure-The crystal structure of West Nile virus NS2B/NS3 protease (Protein Data Bank code 2fp7) was prepared for docking by adding protons using InsightII (version 2000; Accelrys Inc.). Substrates were assembled using the Biopolymer and Sketcher modules within InsightII and minimized using Discover. All substrate docking experiments were conducted using GOLD version 2.1.2 (23). Hydrogen bonding and distance constraints were used to align the substrate within the active site as follows. The P1 Arg residue was positioned as observed for the corresponding residue in the aldehyde inhibitor complex (Protein Data Bank code 2fp7) (24), using a distance constraint of 3.5 ± 1.5 Å between the Arg ζ-carbon and the aromatic γ-carbon of Tyr-161 in the S1 pocket. This positions the positive charge of the arginine optimally for a π-cation interaction but also enables a charge-charge interaction with Asp-129. A hydrogen bond between the P2 Lys ζ-NH3+ and the Asn-152 side-chain carbonyl oxygen was used to anchor P2 in the shallow solvent-exposed S2 pocket, as predicted in earlier modeling work (7) and later verified experimentally (24). Hydrogen bond constraints for critical substrate backbone-enzyme interactions between NS3-Gly-153 carbonyl oxygen and substrate P3 NH, Gly-153 NH and substrate P3 carbonyls, and Gly-151 carbonyl oxygen and Arg P1 αNH were also used. No constraints on the position of P3 and P4 side chains were used. For the larger 2-naphthoyl residue in 2-naphthoyl-KKR-pNA, the docking (Fig. 2) produced poses with a high degree of steric clash in the vicinity of S4. In these cases, the docking poses were minimized with the enzyme backbone either fixed or tethered and the enzyme side chains and the entire substrate allowed to move using Discover (Accelrys).
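The distance constraint on the P1 Arg can be read as a simple acceptance window of 2.0 to 5.0 Å. The following is a minimal sketch of that check, not the constraint syntax of the docking software; the function name and the coordinates are hypothetical and are not taken from the 2fp7 structure.

```python
import math

# Hypothetical helper: tests whether two atoms satisfy a "target +/- tolerance"
# distance window, here the 3.5 +/- 1.5 angstrom window quoted in the text.
def within_distance_constraint(atom_a, atom_b, target=3.5, tolerance=1.5):
    """Return True if the distance between two (x, y, z) points, in angstroms,
    lies within target +/- tolerance."""
    return abs(math.dist(atom_a, atom_b) - target) <= tolerance

# Made-up coordinates for illustration only (not from any crystal structure).
arg_side_chain_carbon = (12.0, 4.0, 7.5)
tyr161_ring_carbon = (14.0, 6.0, 8.5)

print(within_distance_constraint(arg_side_chain_carbon, tyr161_ring_carbon))
```

A docked pose whose atom pair falls outside the window would be penalized or rejected by the docking program; how that penalty is applied is specific to the software and is not modeled here.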
Enzyme Expression and Purification-The pQE9 vector was used to allow high level, inducible expression of N-terminal His6-tagged recombinant proteins. Cultures of E. coli strain SG13009 transformed with the expression plasmids containing the site-directed mutations were grown in 2 × 25 ml of LB medium containing 100 μg/ml ampicillin and 25 μg/ml kanamycin at 37°C until the A600 reached 0.5. Expression of the recombinant protein was induced by the addition of isopropyl β-D-thiogalactopyranoside to a final concentration of 0.3 mM and incubated for an additional 3 h at 22°C. Cells were then harvested by centrifugation at 4500 × g for 10 min and stored at −20°C. For protein purification, the cell pellets were thawed and resuspended in 1 ml of lysis buffer (50 mM HEPES, pH 7.5, 300 mM NaCl, 10 mM imidazole, 5% glycerol). To prevent proteolytic cleavage of protein during lysis and purification, the following protease inhibitors were added to give the final concentrations of 1 μg/ml aprotinin, 1 μg/ml leupeptin, 1 μg/ml benzamidine, and 1 mM phenylmethylsulfonyl fluoride. Resuspended cells were lysed by sonication, and insoluble products were pelleted by centrifugation at 27,000 × g for 20 min. The recombinant proteases were purified by affinity chromatography using an N-terminal His6 tag on Ni2+ nitrilotriacetic acid-agarose. Resin (0.5 ml) was pre-equilibrated with 10 ml of column buffer (50 mM HEPES, pH 7.5, 300 mM NaCl, 10 mM imidazole, 5% glycerol), and then the resin was removed and mixed with the supernatant of the cell lysates. These mixtures were incubated at 4°C on a shaker for 30 min to allow the His-tagged protein to bind to the Ni2+ column. Resin was pelleted at low speed (100 × g), and the buffer was removed. The resin was washed with 3 × 5 ml of column buffer containing 50 mM imidazole, and the proteins were eluted into a single 300-μl fraction with column buffer containing 500 mM imidazole. Purification was confirmed by 12% SDS-PAGE.
Enzymatic Characterization and Substrate Analysis-Purified recombinant protease, WNV CF40.Gly.NS3pro, and site-directed mutants were assayed against tetrapeptide (Ac-LKKR-pNA) and hexapeptide (Ac-LQYTKR-pNA) substrates corresponding to P6-P1 of cleavage sites in the endogenous substrates but with a chromogenic pNA group at the P1′ position. Cleavage of pNA from the peptides by WNV protease produced a yellow color that allowed monitoring at 405 nm. The assay was conducted in a 96-well plate, with a final reaction volume of 200 μl containing 0.25 or 0.5 μM recombinant protease, using optimized conditions (final concentration of 50 mM glycine-NaOH, pH 9.5, 30% glycerol, and 1 mM CHAPS) (7). Eight different substrate concentrations, each in triplicate, were used for determining kinetic constants. After preincubation in separate wells (10 min, 37°C), catalysis was initiated by mixing substrate with enzyme-buffer solution by automatic shaking for 5 s. The optical density was measured at 405 nm every 11-30 s for 210 s to 30 min (depending on activity) in a SpectraMax 250 reader, and the average change in millioptical density/min was calculated. For low substrate concentration, where there was a visible loss in activity over time, only the first five points were used to calculate the average change in millioptical density/min. Kinetic parameters were calculated from weighted nonlinear regression of the initial velocities as a function of the eight substrate concentrations using GraphPad Prism 4 software. The parameters k cat , K m , and k cat /K m were cal-
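The regression of initial velocity on substrate concentration can be illustrated with a small stand-in. The sketch below is not the Prism procedure; it assumes the Michaelis-Menten model v = Vmax[S]/(K m + [S]) and fits it by an undamped Gauss-Newton iteration in plain Python. The function name, the weighting argument, and the eight data points are all hypothetical (the data are synthetic and noise-free).

```python
def fit_michaelis_menten(s, v, weights=None, iterations=20):
    """Weighted least-squares fit of v = Vmax * S / (Km + S).

    Illustrative Gauss-Newton stand-in, not the algorithm of any named package.
    Returns (Vmax, Km) in the units of the inputs.
    """
    n = len(s)
    w = weights or [1.0] * n
    # Starting values from a Hanes-Woolf line: S/v = S/Vmax + Km/Vmax.
    y = [si / vi for si, vi in zip(s, v)]
    s_mean, y_mean = sum(s) / n, sum(y) / n
    slope = (sum((si - s_mean) * (yi - y_mean) for si, yi in zip(s, y))
             / sum((si - s_mean) ** 2 for si in s))
    vmax = 1.0 / slope
    km = (y_mean - slope * s_mean) * vmax
    for _ in range(iterations):
        # Accumulate the 2x2 normal equations (J^T W J) d = J^T W r.
        a11 = a12 = a22 = b1 = b2 = 0.0
        for si, vi, wi in zip(s, v, w):
            d_vmax = si / (km + si)                # dv/dVmax
            d_km = -vmax * si / (km + si) ** 2     # dv/dKm
            r = vi - vmax * si / (km + si)         # residual
            a11 += wi * d_vmax * d_vmax
            a12 += wi * d_vmax * d_km
            a22 += wi * d_km * d_km
            b1 += wi * d_vmax * r
            b2 += wi * d_km * r
        det = a11 * a22 - a12 * a12
        vmax += (b1 * a22 - b2 * a12) / det
        km += (a11 * b2 - a12 * b1) / det
    return vmax, km


# Synthetic example: eight substrate concentrations (arbitrary units),
# generated from Vmax = 2.0 and Km = 50.0 without noise.
concentrations = [10.0, 25.0, 50.0, 100.0, 200.0, 400.0, 800.0, 1600.0]
velocities = [2.0 * c / (50.0 + c) for c in concentrations]
vmax_fit, km_fit = fit_michaelis_menten(concentrations, velocities)
print(vmax_fit, km_fit)
```

Dividing the fitted Vmax by the enzyme concentration, once both are expressed in consistent molar units, gives k cat. With noisy data a damped or bounded optimizer would be the safer choice.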

RESULTS
Predicted Substrate-Enzyme Interactions-The reported crystal structure of WNV NS3 protease in association with the cofactor domain of NS2B has increased the understanding of the mechanism of substrate binding and protease activity. Since the enzyme was also bound to a tetrapeptide-aldehyde inhibitor, it was possible to observe enzyme residues that interact with P1 and P2 side chains. However, because the inhibitor was not bound in the extended β-strand conformation typical of substrate-protease interactions (4,17), enzyme contacts with substrate beyond P1 and P2 positions could not be deduced from that crystal structure. In this work, we sought to dissect critical substrate interactions with enzyme at S2, S3, and S4. Cleavage sites for WNV protease vary considerably between P6 and P3 in native polypeptide substrates (e.g. DPNRKR↓GW (NS2A-NS2B), LQYTKR↓GG (NS2B-NS3), FASGKR↓SQ (NS3-NS4A), KPGLKR↓GG (NS4B-NS5)), with polar, acidic, basic, and hydrophobic residues all tolerated at P6-P3 positions.
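The P and P′ labels used throughout count residues outward from the scissile bond. As a minimal sketch of that bookkeeping, the snippet below labels the four native cleavage sequences quoted above; the function name is hypothetical, and the "|" character is an assumed stand-in for the cleavage point.

```python
def label_positions(site):
    """Map a cleavage site such as 'DPNRKR|GW' to P/P' labels.

    Residues before '|' are numbered P1, P2, ... moving away from the
    scissile bond; residues after it are numbered P1', P2', ...
    """
    nonprime, prime = site.split("|")
    labels = {}
    for offset, residue in enumerate(reversed(nonprime), start=1):
        labels[f"P{offset}"] = residue
    for offset, residue in enumerate(prime, start=1):
        labels[f"P{offset}'"] = residue
    return labels


# The four native WNV cleavage sequences given in the text.
SITES = ["DPNRKR|GW", "LQYTKR|GG", "FASGKR|SQ", "KPGLKR|GG"]

for site in SITES:
    positions = label_positions(site)
    print(site, "P1 =", positions["P1"], "P2 =", positions["P2"], "P3 =", positions["P3"])
```

Run on these four sequences, the labels show Arg at P1 and Lys at P2 in every case, while P3 varies (Arg, Thr, Gly, Leu).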
To predict protease residues that are important for substrate binding, we conducted molecular modeling experiments using GOLD to dock tetrapeptide pNA substrates into the crystal structure of the WNV NS2B/NS3 protease (Protein Data Bank code 2fp7). The tetrapeptide substrate Ac-LKRR-pNA spanning P4 -P1 had been previously identified (21) as optimal for dengue protease and, since we knew that WNV elicited a preference for Lys over Arg at P2, our docking studies began with the tetrapeptide substrate Ac-LKKR-pNA. Docking flexible molecules such as peptides into rigid solid state structures of proteases is notoriously difficult due to inadequate sampling of conformational space and insufficiently minimized docked poses (25) and because GOLD does not allow for cooperative interactions or enzyme flexibility. To generate more valid docking results, hydrogen bonding and distance constraints were used to restrict the substrate to ␤-strand-like conformations that are more biologically relevant (4,5).
The crystal structure of WNV protease suggests that in addition to substrate-binding residues within NS3, residues within the NS2B cofactor also interact with substrate. Molecular docking (Fig. 2A) suggests that the P3 Lys side chain does not occupy a well defined S3 pocket in the enzyme but instead is largely solvent-exposed and binds in a shallow groove extending toward S1. This hydrophobic region is perhaps the reason that the four endogenous cleavage sequences contain a range of residues at P3 (Arg, Thr, Gly, and Leu). In all cases, there are hydrophobic elements in proximity to the main chain that are able to interact with the hydrophobic wall of the groove (e.g. Ile-155) as well as being able to accommodate both charged and polar side-chain termini directed outward into solvent. Asn-152 is hypothesized to be the S2 hydrogen bond acceptor of the P2 Lys side chain (7). On the opposite side of the substrate binding cleft to S2 is a hydrophobic surface patch consisting of Val-154, Met-156, and cofactor residue Leu-87. The hydrophobicity of these residues is highly conserved within the Flavivirus genus, and they most likely constitute one side of the shallow S4 pocket. The cofactor residue Gln-86 is observed in the crystal structure to participate in a hydrogen bond with the P3 Lys-NH3+ of an aldehyde inhibitor bound in an unusual conformation, but, since this residue is poorly conserved among the flaviviruses, the side chain is probably not important for substrate binding. Farther away, the cofactor residue Val-75 may make interactions with P5 or P6. However, it is difficult to predict how substrate might extend into and bind at this position. Also of interest is the cofactor residue Asn-84, which appears to make a hydrogen bond with the P2 Lys. This residue is semiconserved within the Flavivirus genus as either a polar or negatively charged residue (Asn, Ser, Thr, Asp, and Glu). Docking of 2-naphthoyl-KKR-pNA generally resulted in two docking poses.
One had the bulky naphthoyl residue in the S1 pocket, resulting in a nonextended conformation reminiscent of the turnlike structure of the inhibitor aldehyde in the published crystal structure. The other had the aromatic residues interacting with hydrophobic enzyme residues at S4 but with some resultant steric clashes. Docking with GOLD used an explicit rigid protein method that is frequently inadequate for proteases, since they often display a high degree of active site plasticity (25). In this case, we took the docked poses where the 2-naphthoyl residue occupied the conventional extended conformation in the S4 site and used a combination of molecular dynamics and energy minimization to investigate possible induced fit binding modes. Fig. 2B shows one minimized docking pose where the small S4 pocket has been enlarged by a subtle movement of the side chains of Ile-155 and Val-154 in NS3 and Val-75, Val-77, and Leu-87 of NS2B. After these movements, the now slightly deeper S4 pocket is additionally defined by two residues (Phe-116 from NS3 and Phe-85 from NS2B) that make favorable aromatic-aromatic interactions with the naphthoyl ring.
Regarding the orientation of the substrate in the active site of the enzyme, the model suggests that the carbonyl carbon of the substrate scissile amide was 2.5-2.8 Å from the catalytic serine hydroxyl and in an orientation reminiscent of a Michaelis complex, despite no explicit restraints being used to fix it.
Following these substrate-docking modeling experiments, we prepared a number of site-directed mutants with residue substitutions in both NS3 and NS2B to test the predicted interactions. In parallel, we synthesized a library of chromogenic pNA substrates, designed around the optimal substrate used for our docking studies, and examined their processing kinetics by both wild type and mutant NS2B/NS3 West Nile Virus proteases.
Cofactor-Substrate Correlations-The crystal structure of WNV NS2B/NS3 protease revealed that Asn-84 of the NS2B cofactor is within hydrogen bonding distance of the P2 Lys of the bound ligand. Asn-84 is located within a highly conserved region of the cofactor: the Gly residue on its N-terminal side is completely conserved, and the third residue on its C-terminal side is a highly conserved hydrophobic Leu or Ile. Although the residue homologous with NS2B Asn-84 in other flaviviruses is variable, it is always polar or negatively charged (Table 1). This is of particular interest, because there appears to be an association between this residue and either Lys or Arg at P2 in native cleavage sequences. An Asn residue is at this position in WNV and St. Louis encephalitis (SLEV) proteases, corresponding to a preferred Lys at P2 of native substrates. However, in the proteases of all four serotypes of dengue, there is either Ser or Thr at this position matched by Arg at P2 in native substrates. The presence of Gln at P2 for one of the crucial cleavage sites (NS2B-NS3) suggests a requirement for a hydrogen-bonding pair and not a charge-charge interaction pair.
Specific partnering between cofactor and substrate is also seen when a negatively charged residue is present in the homologous position of the cofactor. In YF, a Glu at this position in the cofactor always corresponds with Arg at P2 in the native cleavage site. In JE, Murray Valley encephalitis (MVE), Zika virus (ZIKV), and Bussuquara virus (BSQV), an Asp is matched by a P2 Lys predominantly in the native cleavage site (Table 1). On the other hand, proteases of flaviviruses that are transmitted by ticks, tick-borne encephalitis (TBE) and Langat virus (LGTV), do not show the same preference as WNV and St. Louis encephalitis proteases. Despite Asn at the homologous position in the cofactor for TBE and LGTV proteases, in all cases they recognize Arg at P2 in the cleavage sites of their native substrates, compared with Lys at P2 in substrates for SLEV and WNV proteases. The tick-borne flaviviruses form a cluster distinct from the mosquito-borne flaviviruses, and there are significant amino acid changes between the two subgroups. Therefore, it is likely that there are other significant differences within the substrate binding cleft that may account for the preference of a P2 Arg in substrates. To investigate the role of NS2B-84, various mutant proteases of WNV were produced with this residue altered to homologous residues from other flaviviruses (see below).

TABLE 1 ... and tick-borne encephalitis virus (TBE). The first column shows the alignment of the region of NS2B involved in substrate binding, whereas the next four columns show the native flavivirus cleavage sequences (P4-P1′). The final column shows the degree of homology between the various flaviviruses and the WNV NS2B 40-amino acid cofactor domain and the NS3 protease domain. Residues shown in green and yellow designate homology. The residue shown in boldface type and designated by the asterisk is believed to interact with P2. A P2 arginine residue is shown in blue, and a P2 lysine is shown in red.
Substrate Design and Kinetic Processing-Kinetic parameters for proteolytic processing of modified substrates by WNV NS2B-NS3 protease are shown in Table 2. Initially, our hypothesis was that hydrophobic residues at P4 could interact with the surface-exposed hydrophobic region bounded by Val-154 and Met-156 of NS3 and Leu-87 of NS2B. We therefore prepared substrates based on Ac-XKKR-pNA, where X represents a hydrophobic unnatural amino acid (Table 2, entries 1-9, and Fig. 3). Most of the examined substrates had K m and k cat values comparable with those of Ac-LKKR-pNA, with a β-branched residue t-butylglycine that induces steric crowding on the peptide main chain being most detrimental to binding and processing. Substrate processing was least efficient for this and similar β-branched residues like aminoisobutyric acid and cyclohexylglycine (Table 2, entries 3, 9, 10). The substrate containing 2-amino-octanoic acid (Aoc) gave the highest k cat of any substrate in this study, but this was tempered by a slightly lower affinity (K m ). Replacing the flexible Ac-P4 moiety with a rigid 2-naphthoyl group resulted in the highest affinity substrate (K m = 25 μM) with a slightly lower k cat but the most efficient processing (k cat /K m = 42,603 M−1 s−1). Taken together, these results support the hypothesis that hydrophobic interactions are the key to interactions between P4 of substrates and S4 of enzyme.
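The two values quoted for the 2-naphthoyl substrate fix the third. As a worked check, assuming the reported K m is 25 μM (micromolar), the turnover number implied by k cat /K m = 42,603 M−1 s−1 is simply their product. The function name below is hypothetical, and the resulting k cat is an arithmetic consequence of the two quoted numbers, not a value read from Table 2.

```python
def implied_kcat(km_micromolar, efficiency_per_molar_per_s):
    """Return k_cat (s^-1) implied by K_m (micromolar) and k_cat/K_m (M^-1 s^-1)."""
    km_molar = km_micromolar * 1e-6  # convert micromolar to molar
    return efficiency_per_molar_per_s * km_molar


# Values quoted in the text for 2-naphthoyl-KKR-pNA.
print(round(implied_kcat(25.0, 42603.0), 3))  # about 1.065 s^-1
```

The same relation, rearranged, shows why a substrate can have the lowest K m in a series and still the best k cat /K m even when its k cat is not the highest.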
The observation that substrates bearing the bulky 2-naphthoyl group at P4 possessed enhanced K m but lower k cat values may be attributable to an induced fit at P4/S4. As found in the docking studies on this class of substrate, allowing the enzyme side chains to move produced better binding modes at this site. However, this requirement may contribute to lower overall catalytic efficiency by causing a repositioning of the scissile bond relative to the catalytic machinery, or alternatively, if the enzyme moves back into a "native" conformation after substrate hydrolysis, it may affect the departure of the N-terminal cleavage fragment from the active site (k off ).
We next examined whether nonproteinogenic amino acids could be incorporated at P3 (Table 2, entries 10-20). In light of the finding that 2-naphthoyl was a higher affinity replacement for Ac-Leu, a number of P3 mutants were examined with both Ac-Leu and 2-naphthoyl at P4. Interestingly, all of the changes made in this panel of substrates (Fig. 4 and Table 2, entries 10-19) resulted in lower overall k cat /K m values, primarily due to large reductions in k cat . Substitution of 2-naphthoyl for Ac-Leu at P4 contributed a 10-fold improvement in K m for substrates containing citrulline at P3, 3-fold for ornithine, 2-fold for homoarginine, but negligible improvement for 3-pyridylalanine. These gains, however, came at a cost of a 4-fold reduction in k cat for citrulline, 2-fold for ornithine, 10-fold for homoarginine, but surprisingly a 2-fold gain in k cat for 3-pyridylalanine. It therefore appears that a protonated amine is not absolutely required for affinity at P3, since the citrulline mutant substrate showed the best K m . However, this required the presence of 2-naphthoyl at P4 for a cooperative effect. The effect of lengthening the spacer between the positively charged Arg and the main chain (hArg) was minor when P4 was Ac-Leu but detrimental when 2-naphthoyl capped the substrate.
This suggests that the naphthoyl substituent, possibly due to induced fit effects, had a major effect on the mode of binding of the substrates at the neighboring S3 enzyme subsite.
For the study of P2 mutant substrates ( Table 2, entries 21-30), 2-naphthoyl was invariant at P4, and P3 was initially Lys. As for dengue protease (21), the presence of a basic amine at this position was found to be essential, with aromatic amines, capped amines, and hydrophobic residues not being tolerated. Only ornithine, with its high affinity and good k cat , was a satisfactory replacement for Lys, although homoarginine with the longer side chain before the positive charge, had comparable activity, with an increase in K m and a decrease in k cat . The effects of cooperative changes were then examined using ornithine at P2 and either citrulline or ornithine at P3. As predicted, there was no cooperative gain in substrate fitness by these double mutations. The Orn-Orn mutant possessed a good K m but had a further reduction in k cat , with Cit-Orn conferring inferior K m and k cat . It is interesting to note that WNV protease processed substrates with a P2 Lys over 2-fold more efficiently than those with a P2 Arg.
We first evaluated the kinetics of substrate processing by WNV protease mutants with tetrapeptide and hexapeptide substrates. The 24 mutant proteases were tested against the optimized tetrapeptide substrate Ac-LKKR-pNA, and 10 were also tested against the hexapeptide substrate Ac-LQYTKR-pNA based on the native NS2B/NS3 cleavage site. Kinetic parameters (Table 3) for these mutant proteases showed that most had impaired function compared with wild type enzyme, with only the conservative NS2B-L87F and polar charged NS2B-Q86E mutants retaining comparable catalytic efficiency. In particular, it is notable that NS2B-Q86 participates in a hydrogen bond with the charged P3 Lys of inhibitor in the crystal structure, suggesting that the Q86E mutant might enhance this interaction. However, K m is not significantly higher than the wild type, and the overall reduction in processing efficiency is due to a reduction in k cat . This may be due to a reduction in k off or alternatively due to repositioning of the scissile bond farther from the catalytic machinery. All WNV protease mutants also exhibited variable decreases in substrate affinity. The effects and their implications for substrate-binding interactions are discussed below.
We also examined the kinetics of processing by WNV protease mutants on the truncated N-capped tripeptide substrate 2-naphthoyl-KKR-pNA (Table 4). Docking of the substrate 2-naphthoyl-KKR-pNA into the crystal structure of WNV protease yielded two possible substrate conformations. The β-strand substrate conformation orients the 2-naphthoyl group in a hydrophobic S4 subsite in the enzyme, bounded by NS3 residues Val-154 and Ile-155 and the NS2B cofactor residues Val-75, Val-77, and Leu-87, whereas a turn-like conformation positions the 2-naphthoyl substituent over the P1 arginine of the substrate. To identify which of these is more likely, we tested one relevant NS3 mutant (V154F) and two separate relevant NS2B mutants (L87A and L87F). The V154F and L87A mutations caused similarly dramatic reductions in k cat and minor increases in K m , whereas L87F resulted in a similar K m and only a 3-fold decrease in k cat . Although these results show that the V154F and L87A mutations have a large effect on processing of substrates with a naphthoyl at P4, those substrates are still relatively efficiently processed by the L87F mutant protease, suggesting that Phe-87 may make a π interaction that stabilizes naphthoyl binding at S4.
To investigate the hypothesis that the residue at NS2B-84 contributes to substrate binding and provides flavivirus NS3/NS2B proteases with specificity for either Lys or Arg at P2, the wild type WNV recombinant protease and the NS2B-N84D/N84E/N84S mutant proteases were tested against the tetrapeptide substrates Ac-LKKR-pNA and Ac-LKRR-pNA (Table 5). The comparative enzyme kinetic parameters show that the N84S mutation resulted in a 4-fold increase in K m when P2 was Lys and a 2-fold decrease in K m when P2 was Arg. This supports observations that Asn-84 (native to WNV) binds more strongly to a P2 Lys, whereas Ser-84 (native to Den) binds more strongly to a P2 Arg. Against both substrates the N84S mutation caused a decrease in k cat , ~2-fold against Ac-LKKR-pNA and as much as 15-fold against Ac-LKRR-pNA, translating in both cases to about an 8-fold decrease in catalytic efficiency (k cat /K m ) versus wild type enzyme.
Similarly, the substitution of N84D (native to JE, MVE, ZIKV, and BSQV) produced a mutant protease with a 2-fold higher K m for Ac-LKKR-pNA and a 2-3-fold lower K m for Ac-LKRR-pNA, whereas the k cat was unaffected for the former substrate but about 6-fold lower for the latter substrate. The overall effect was a 2-fold decrease in catalytic efficiency for both substrates following this enzyme mutation. Comparison between the substrates reveals an almost 4-fold higher k cat and a 2-fold higher catalytic efficiency, k cat /K m for Ac-LKKR-pNA compared with Ac-LKRR-pNA, indicating a preference for a P2 Lys (predominantly present in native cleavage sequences for JE, MVE, ZIKV, and BSQV).
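Because catalytic efficiency is the ratio k cat /K m , its fold change on mutation is the fold change in k cat divided by the fold change in K m . The sketch below applies that arithmetic to the N84D fold changes quoted above. The function name is hypothetical, and taking 2.5 as the midpoint of the "2-3-fold" range is an assumption made only for illustration.

```python
def efficiency_fold_change(kcat_fold, km_fold):
    """Fold change in k_cat/K_m given fold changes in k_cat and K_m.

    Each argument is the mutant value divided by the wild type value,
    so 2.0 means doubled and 0.5 means halved.
    """
    return kcat_fold / km_fold


# N84D versus wild type, Ac-LKKR-pNA: K_m 2-fold higher, k_cat unchanged.
lys_substrate = efficiency_fold_change(kcat_fold=1.0, km_fold=2.0)

# N84D versus wild type, Ac-LKRR-pNA: K_m about 2.5-fold lower (assumed
# midpoint of "2-3-fold"), k_cat about 6-fold lower.
arg_substrate = efficiency_fold_change(kcat_fold=1.0 / 6.0, km_fold=1.0 / 2.5)

print(lys_substrate, round(arg_substrate, 2))
```

Both ratios come out near one-half, in line with the roughly 2-fold loss of catalytic efficiency reported for both substrates.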
The N84E mutation (native to YF) produced a mutant protease with a slightly higher catalytic efficiency for a P2 Arg (present in YF native cleavage sequences). However, the trend was not the same when comparing substrate affinity, since both mutants had slightly higher affinity for the Ac-LKRR-pNA substrate. The N84S mutant protease was also tested against the truncated substrates 2-naphthoyl-KRR-pNA and 2-naphthoyl-K(hR)R-pNA in order to examine whether a P2 homoarginine residue could possibly be recognized with similar efficiency by either Asn-84 or Ser-84. Both the wild type protease and the N84S mutant had very similar K m values when assayed against the 2-naphthoyl-K(hR)R-pNA, suggesting that a P2 homoarginine can bind to Asn-84 or Ser-84 with similar affinity; however, the catalytic efficiency was decreased by over 20-fold. Taken together, these results support our earlier work (7,26), which suggested that these WNV constructs of the NS2B/NS3 protease are more efficient enzymes than their dengue counterparts at processing their native sequences.

TABLE 4 Site-directed mutant enzyme kinetics against 2-naphthoyl-Lys-Lys-Arg-pNA
Enzyme kinetics were obtained using the in vitro enzyme assay against the substrate 2-naphthoyl-KKR-pNA, with each data point representing the mean of triplicate measurements ± S.E.

DISCUSSION
Substrate Modifications-One of the most common approaches to inhibitor development is substrate-based and begins with substrate optimization. Such approaches for hepatitis C virus NS3 protease have led to improvements in IC50 values of over 1000-fold (27,28). In this study, we began with a tetrapeptide substrate and systematically replaced P4, P3, and P2 amino acids with peptidic and nonpeptidic side chains designed to map the individual substrate-binding subsites in the protease, to test the importance of specific enzyme residues that line the substrate-binding groove, and to enhance substrate affinity and catalytic efficiency. These solution studies have the potential to complement structural information provided from solid state crystal structures of the enzyme, especially providing new information on the binding of substrate in an extended conformation and induced fit effects, information that is not available from crystal structures.
A significant improvement in ligand affinity came from replacement of the "P5" acetyl cap and the P4 hydrophobic Leu with the more rigid, nonpeptidic 2-naphthoyl group. This change enhanced substrate affinity, with Km decreasing over 2-fold. Docking studies suggested that the bulkier 2-naphthoyl group may insert deeper into the S4 subsite, with cooperative movement of both NS3 and NS2B amino acids for optimal ligand binding. It is likely that further modifications of this nonpeptidic P4 cap can produce even higher affinity substrate/inhibitor ligands. Incorporating a citrulline into P3 also improved substrate affinity but reduced kcat. However, when next to a P4 Leu in the substrate, a P3 citrulline had a 10-fold lower Km but a relatively high kcat. From a drug design standpoint, citrulline-containing peptides had higher affinity than basic amines but poorer substrate turnover (kcat) when 2-naphthoyl was present, while having good turnover in the presence of Ac-Leu at P4. This suggests that a double mutation to a nonpeptidic entry with no positive charges may be useful for the preparation of potent inhibitors. Removal of this positive charge increases drug-like characteristics, such as membrane permeability. Replacing positive charges at P2 and P1 has so far not been effective in flavivirus protease inhibitors.
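The trade-off described above, in which a modification can improve affinity (lower Km) while reducing turnover (kcat), can be illustrated with the standard Michaelis-Menten rate law. The following Python sketch is illustrative only: the function names and the two parameter sets are hypothetical and do not correspond to any substrate measured in this study.

```python
def initial_rate(substrate_conc, k_cat, k_m, enzyme_conc):
    """Michaelis-Menten initial rate: v = k_cat * [E] * [S] / (K_m + [S])."""
    return k_cat * enzyme_conc * substrate_conc / (k_m + substrate_conc)


def catalytic_efficiency(k_cat, k_m):
    """Specificity constant k_cat / K_m, which governs the rate when [S] << K_m."""
    return k_cat / k_m


# Hypothetical substrates in arbitrary, mutually consistent units.
tight_slow = {"k_cat": 1.0, "k_m": 10.0}    # binds tightly, turns over slowly
weak_fast = {"k_cat": 5.0, "k_m": 100.0}    # binds weakly, turns over quickly

for name, params in (("tight_slow", tight_slow), ("weak_fast", weak_fast)):
    low = initial_rate(1.0, enzyme_conc=1.0, **params)      # [S] well below K_m
    high = initial_rate(1000.0, enzyme_conc=1.0, **params)  # near saturation
    eff = catalytic_efficiency(**params)
    print(f"{name}: kcat/Km={eff:.3f}, v(low [S])={low:.4f}, v(high [S])={high:.3f}")
```

With these assumed values, the tighter-binding substrate is processed faster at low substrate concentration but more slowly near saturation. This is one reason a high-affinity, low-turnover ligand can be a more attractive starting point for an inhibitor than for a substrate.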
The finding that WNV protease was twice as active against substrates with Lys rather than Arg at P2 contrasts with the four serotypes of dengue protease (21), which were more active against substrates with Arg instead of Lys at P2. This confirms predictions from native cleavage sequences and constitutes the first recognized difference in substrate preference between related flavivirus proteases. It was previously assumed, based on the high level of homology within NS3, that an inhibitor developed against one flaviviral protease could be active against multiple flaviviruses. This may still be possible if a replacement at P2 can be found that has a high affinity for multiple flavivirus proteases. The WNV protease had a substrate affinity (Km) for a P2 homoarginine that was comparable with that for lysine, and a P2 ornithine also gave a relatively high Km and kcat. Neither of these residues has yet been tested in dengue protease substrates.
Correlation of Mutagenesis Results to Architecture of the Substrate-binding Cleft-Ile-155 in the crystal structure lies just beyond the rim of the S1 pocket, with docking of a substrate in the extended β-strand conformation suggesting that Ile-155 may contribute hydrophobic contacts to the methylene region of the substrate P3 Lys side chain. Mutation of this residue to a bulkier Phe could potentially affect binding in three ways. First, steric clashes may reduce substrate affinity, seen in the 4-fold reduction in Km and a halving of kcat. Second, extra bulk at the lip of the S1 pocket may interfere with entry of the critical P1 Arg of substrates into the S1 pocket, partially capping it. Third, it is possible that the aromatic Phe side chain participates in a π-cation interaction with the charged Lys side chain at P3.

Hydrophobic residues in NS3 (Val-154 and Met-156) and NS2B (Leu-87) in the crystal structure form a hydrophobic patch that probably constitutes S4. The hydrophobic character of these residues is conserved throughout the Flavivirus genus. Both WNV and dengue NS3 proteases have a preference for hydrophobic P4 residues. Substitution of each of these residues gave large losses in substrate affinity, NS3-M156A resulting in a 3-fold decrease in Km, NS2B-L87A resulting in an 8-fold reduction, and even the conservative substitution NS3-V154L resulting in a 3-fold decrease in Km. Substitution at this site with a bulky Phe had a large effect on substrate affinity; NS3-V154F gave a 5-fold decrease in Km, whereas NS2B-L87F caused a 3-fold decrease. Each of these substitutions caused a large loss of catalytic activity except NS2B-L87F, which maintained a similar catalytic turnover. Modeling suggests that the L87F mutant directs the Phe side chain away from S4, whereas in the L87A mutant the small side chain remains directed into the pocket.
This suggests that the S4 pocket is quite flexible and capable of induced fitting to various P4 substituents, but Leu may be optimal for efficient association of NS2B to NS3. Each of these mutants was also tested against the hexapeptide substrate Ac-LQYTKR-pNA and produced similar effects on substrate affinity and catalytic turnover, reflecting similar binding to the P4 Tyr as to the P4 Leu.
The Gln at NS2B-86 is within hydrogen bonding distance of the main chain between the P3 and P4 substrate residues. Gln-86 is relatively poorly conserved within the Flavivirus genus but is predominantly polar or positively charged (Glu, Ser, Lys, Arg, and His; Table 3), suggesting that it may make a hydrogen bond with the oxygen atom of the P4 backbone carbonyl. Substitution of this residue with Ala or Leu produced 5- and 3-fold losses of substrate affinity, respectively. However, substitution with a negatively charged Glu produced only a minor effect on substrate affinity, suggesting that this residue may participate in hydrogen bonding with the adjacent NS2B Asn-84. The Q86A and Q86L mutants would disrupt this interaction and destabilize the region, whereas Q86E would maintain it.
The NS2B Val-75 is located in a position that could potentially interact with either P5 or P6. A hydrophobic residue is completely conserved in this position within the Flavivirus genus. Substitution with Ala resulted in a 4-fold loss in Km when assayed against the tetrapeptide Ac-LKKR-pNA but a large 27-fold loss in Km when assayed against the hexapeptide Ac-LQYTKR-pNA, supporting the prediction that NS2B Val-75 makes an interaction with P5 or P6. Substitution with an aromatic Phe produced a comparable effect on both tetrapeptide and hexapeptide substrates, 5- and 4-fold losses in Km and 7- and 3-fold losses in kcat, respectively. It is possible that NS2B Val-75 is involved in a hydrophobic interaction between the cofactor and NS3 and contributes to the loss in activity and affinity.
There is also an interesting correlation between the mutagenesis results and the binding site for the 2-naphthoyl substituent. Although the docking of the substrate 2-naphthoyl-KKR-pNA into the crystal structure of WNV protease yielded two possible conformations for the substrate, kinetic analysis of this substrate against mutant proteases has suggested the β-strand substrate conformation to be most likely. This conformation orients the bulky 2-naphthoyl group by induced fit into a slightly enlarged hydrophobic S4 subsite. Since the NS3-V154F and NS2B-L87A mutants retained good affinity for the substrate 2-naphthoyl-KKR-pNA but dramatically affected kcat, there are possibly favorable hydrophobic, aromatic-aromatic, or π-π interactions occurring. The decrease in kcat is presumably due to the slow release (koff) of the N-terminal cleavage product from the enzyme-product complex. It is unlikely that this naphthoyl residue is exerting its effect on kcat by displacing the scissile amide bond from its optimal location, given its significant distance from the catalytic serine residue. When the important amine/basic side chains of the substrates are shortened or lengthened (Table 2, entries 10-30), modeling studies suggest that this causes a displacement of the scissile bond, the likely cause of the marked drop in kcat for these substrates.
Substrate Specificity and Cofactor Residue Asn-84-The cofactor residue at NS2B-84 is associated with a preference for Lys or Arg at the substrate P2 position. For flaviviral proteases in which Asn or Asp is at NS2B-84, Lys is predominantly present in P2 of the native substrate. When Ser, Thr, or Glu is at NS2B-84, Arg is preferred at P2 (Table 1).
Various mutations of Asn-84 were tested to further analyze the effect of this residue on substrate specificity. N84S decreased affinity of a P2 Lys substrate by 4-fold but increased affinity of the Arg analogue 2-fold. This came at the expense of catalytic activity, which decreased by 2- or 20-fold, depending on the substrate. The altered substrate affinity may prevent dissociation of the cleavage products that inhibit activity, or alternatively substrate may bind in a slightly different position in the mutant enzyme, altering the position of the scissile bond and affecting catalytic activity.
N84D (native to JE, MVE, ZIKV, and BSQV) gave a higher catalytic efficiency against substrates containing a P2 Lys, whereas the N84E (native to YF) gave higher catalytic efficiency against substrates containing P2 Arg. Because there are subtle differences in the sizes and shapes of substrate binding pockets in flavivirus proteases, substitution of a single residue is unlikely to account for the divergence of specificity, and it is likely to be these additional differences that add to the specificity for either Lys or Arg at P2 in substrates.
Although it is unlikely that NS2B-84 contributes the only difference between the active sites, it does appear to be a major factor that will need to be addressed in order to develop a broad spectrum flavivirus protease inhibitor. As an initial attempt to identify compounds that can bind equally well to either Asn or Ser at NS2B-84, we found that the tetrapeptide substrate 2-naphthoyl-K(hR)R-pNA, with a P2 homoarginine residue, could bind with similar affinity to either the wild type WNV protease or the NS2B-N84S mutant (Km = 26.7 ± 3.7 and 29.5 ± 4.1, respectively). This residue showed a high affinity that was similar to that for substrate with a P2 lysine binding to wild type enzyme (Km = 25.4 ± 4.4), better than that for substrate with a P2 arginine binding to wild type enzyme (Km = 43.4 ± 4.6), and only slightly worse than that for a substrate with a P2 arginine binding to the mutant N84S protease (Km = 19.7 ± 5.0). Although this substrate has yet to be tested against Den NS2B/NS3 protease, these results indicate that the P2 homoarginine is likely to bind well to the P2 pocket of both Den and WNV, and therefore it may be a good candidate for incorporation into broad spectrum flavivirus protease inhibitors.
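The Km comparison in the preceding paragraph can be laid out side by side. The following Python sketch takes the five Km values and standard errors quoted in the text and reports each as a fold ratio relative to the wild type enzyme with a P2 lysine substrate. The dictionary layout, the function names, and the "mean ± S.E. ranges overlap" check are assumptions made for illustration; the overlap check is a crude heuristic, not a statistical test, and units are as reported in the source tables.

```python
# Km values (mean, S.E.) quoted in the text, keyed by (enzyme, P2 residue).
KM = {
    ("wild type", "P2 hArg"): (26.7, 3.7),
    ("N84S", "P2 hArg"): (29.5, 4.1),
    ("wild type", "P2 Lys"): (25.4, 4.4),
    ("wild type", "P2 Arg"): (43.4, 4.6),
    ("N84S", "P2 Arg"): (19.7, 5.0),
}


def fold_vs(reference, other):
    """Ratio of mean Km values, other / reference (above 1 means weaker binding)."""
    return KM[other][0] / KM[reference][0]


def ranges_overlap(a, b):
    """True if the mean ± S.E. ranges of two entries overlap (heuristic only)."""
    (mean_a, se_a), (mean_b, se_b) = KM[a], KM[b]
    return (mean_a - se_a) <= (mean_b + se_b) and (mean_b - se_b) <= (mean_a + se_a)


reference = ("wild type", "P2 Lys")
for key in KM:
    ratio = fold_vs(reference, key)
    overlap = ranges_overlap(reference, key)
    print(f"{key[0]:>9} / {key[1]:<7}: {ratio:.2f}x of reference Km, overlaps reference: {overlap}")
```

Under this simple view, the homoarginine substrate gives Km ranges that overlap the wild type P2 lysine reference for both enzymes, whereas the wild type P2 arginine value does not, in line with the comparison made in the text.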
Mutations outside the Substrate-binding Cleft-Some mutations were also made in the vicinity of the putative P3 and P4 substrate-binding sites as defined by the homology model. Within the resolution of the WNV NS2B/NS3 crystal structure, these mutated residues were some distance from the active site. Thr-111 is on the opposite side of the protease from the substrate-binding cleft but is within hydrogen bonding distance of the carbon backbone of the NS2B residue Thr-69. The T111L mutation caused a 4-fold reduction in Km and a 27-fold reduction in kcat, much greater than a T111F mutation (3-fold reduction in Km, 8-fold reduction in kcat). This suggests that the hydrogen bond contributed by Thr-111 is important for cofactor binding, and in the absence of this, the cofactor binds less efficiently to protease or in a way that affects catalytic activity and substrate affinity.
The NS3-I162F mutant produced a large reduction in Km (5-fold) and kcat (10-fold) against the tetrapeptide substrate but a lesser reduction against the hexapeptide substrate for both Km (2-fold) and kcat (5-fold). Ile-162 is located below the Val-154 and Met-156 residues, which are proposed to constitute the hydrophobic S4 pocket, and disruption of these residues by the I162F mutation is likely to affect substrate binding at S4. Both Ala-164 and Val-166 are in the center of the enzyme, and their substitution may affect folding of the enzyme. The A164S and A164V mutations substantially reduced Km (3- and 6-fold, respectively) and kcat (6- and 9-fold, respectively). The more conservative substitution V166L had only a minor effect on processing of both tetrapeptide and hexapeptide substrates; the largest effect was the 2.5-fold reduction in kcat for Ac-LKKR-pNA.

CONCLUSION
This study has provided valuable new information about the architecture of the flavivirus NS3 protease in solution and about interactions made between peptide substrate analogues and the substrate-binding cleft of the enzyme. Together with solid state crystal structures of the WNV NS2B/NS3 protease, this new information helps to provide a firmer basis for rational drug design. The changes to tetrapeptide substrates described herein substantially increased enzyme affinity and provide important clues toward development of substrate-based inhibitors of flavivirus proteases.