Functional profiling of recombinant NS3 proteases from all four serotypes of dengue virus using tetrapeptide and octapeptide substrate libraries.

Regulated proteolysis by the two-component NS2B/NS3 protease of dengue virus is essential for virus replication and the maturation of infectious virions. The functional similarity between the NS2B/NS3 proteases from the four genetically and antigenically distinct serotypes was addressed by characterizing the differences in their substrate specificity using tetrapeptide and octapeptide libraries in a positional scanning format, each containing 130,321 substrates. The proteases from different serotypes were shown to be functionally homologous based on the similarity of their substrate cleavage preferences. A strong preference for basic amino acid residues (Arg/Lys) at the P1 positions was observed, whereas the preferences for the P2-4 sites were in the order of Arg > Thr > Gln/Asn/Lys for P2, Lys > Arg > Asn for P3, and Nle > Leu > Lys > Xaa for P4. The prime site substrate specificity was for small and polar amino acids in P1' and P3'. In contrast, the P2' and P4' substrate positions showed minimal activity. The influence of the P2 and P3 amino acids on ground state binding and the P4 position for transition state stabilization was identified through single substrate kinetics with optimal and suboptimal substrate sequences. The specificities observed for dengue NS2B/NS3 have features in common with the physiological cleavage sites in the dengue polyprotein; however, all sites reveal previously unrecognized suboptimal sequences.

Dengue virus is the etiologic agent of dengue fever, dengue hemorrhagic fever, and dengue shock syndrome and is the most prevalent arthropod-transmitted infectious disease in humans. Dengue consists of four closely related but antigenically distinct viral serotypes (DEN1-4) of the genus Flavivirus (1,2). Following primary infection, lifelong immunity develops that prevents repeated assault by the same serotype but does not provide protection from a virus of a different serotype (3). Dengue diseases are endemic in the tropics and subtropics, and the viruses are maintained in a cycle that involves humans and the Aedes aegypti mosquito. Infection with dengue viruses produces a spectrum of clinical illness ranging from a nonspecific viral syndrome to severe and fatal hemorrhagic disease (1,2). Currently there is no antiviral drug or vaccine available against dengue viruses, and the pathogenesis of the disease is poorly understood.
As with other members of the Flaviviridae family, the genomes of the dengue viruses consist of a positive single-stranded RNA of ~10,700 bases in length (4). Co-translational processing and post-translational processing of the polyprotein give rise to three structural proteins and at least seven nonstructural proteins (4). The correct processing of these proteins is essential for virus replication and requires host proteases such as signalase and furin (5) and a two-component viral protease, NS2B/NS3 (4). Previous studies have shown that the N-terminal part of NS3 contains a trypsin-like protease domain (6) and that the activity of NS3 is dependent on at least 40 amino acids of NS2B (6-8).
The preferred NS3 protease-cleavage sites in the viral polyprotein have two basic amino acid residues (Arg-Arg, Arg-Lys, Lys-Arg, or occasionally Gln-Arg) at the P2 and P1 positions, followed by a Gly, an Ala, or a Ser at the P1′ position (4). The crystal structure of the DEN-2 NS3pro in the absence of NS2B has been determined at 2.1-Å resolution by Murthy et al. (9) and shows a shallow substrate binding site, indicating a lack of significant interactions beyond P2-P2′. The NS3pro domain in the absence of NS2B is an inefficient protease as demonstrated by the low turnover rate of the small chromogenic substrate N-α-benzoyl-L-Arg-p-nitroanilide (10). Although NS2B is required for efficient enzymatic activity of the NS3pro, the structure of the latter without the cofactor resembles that of the related hepatitis C NS3 protease bound to its activating peptide NS4A. The exact mechanism by which the NS2B cofactor stimulates the protease is not currently known. However, it is plausible that NS2B resembles NS4A and interacts directly with the NS3 protease domain, causing a conformational change that extends the binding pockets (10).
The aim of the current study was to elucidate and compare the substrate specificity of NS3 protease from all four serotypes. We performed functional substrate profiling of the P1-P4 and P1′-P4′ positions for the DEN1-4 protease complexes using tetrapeptide and octapeptide positional scanning peptide libraries. As a consequence, we expanded the earlier findings on DEN2 NS3 to a broader extent (P4-P4′) and discovered that its substrate preference was shared by enzymes of the other three serotypes.

EXPERIMENTAL PROCEDURES
Materials-Dengue virus serotype 1 (strain Hawaii) and serotype 4 (H241) were purchased from American Type Culture Collection (Manassas, VA). Dengue virus serotype 3 (strain S221/03, GenBank™ accession number AY662691) was obtained from a dengue patient and was a kind gift from Dr. Eng Eong Ooi (Environmental Health Institute, Singapore). The plasmids pGEM-T-(E-NS3) and pET15b-NS3NS5 containing, respectively, the NS2B/NS3 and NS3 cDNAs from Dengue virus serotype 2 (strain TSV01, GenBank™ accession number AY037116) were kind gifts from James Cook University, Queensland, Australia. The dengue virus NS3 protease substrate peptide Boc-Gly-Arg-Arg-AMC was purchased from Bachem (Bubendorf, Switzerland). Restriction enzymes and modifying enzymes were purchased from New England Biolabs (Beverly, MA).
Expression and Purification of DEN 1-4 CF40-Gly-NS3pro185-Competent Escherichia coli BL21-CodonPlus-(DE3) (Stratagene) were transformed with pET15b-DEN 1-4 CF40-Gly-NS3pro185 expression vectors and grown in 500 ml Luria-Bertani broth containing ampicillin (100 µg/ml), chloramphenicol (50 µg/ml), and 0.2% (w/v) glucose at 37°C with shaking until A595 reached ~0.5. Cells were centrifuged in a Sorvall SLA 3000 rotor at 5000 × g for 10 min and resuspended in 500 ml of Luria-Bertani medium with ampicillin and chloramphenicol. Cultures were induced with 0.4 mM isopropyl β-D-thiogalactopyranoside, and growth was continued for a further 16 h at 16°C. The resulting cells were pelleted and resuspended in 30 ml of cold lysis buffer (50 mM HEPES, pH 7.5, 300 mM NaCl, and 5% glycerol). Cells were passed through a cell disruptor twice at 20,000 p.s.i. (Basic Z model; Constant Systems Ltd.), and debris was removed by centrifugation at 35,000 × g for 30 min. The protein solution was filtered through a 0.22-µm filter and loaded onto a 5-ml HiTrap chelating heparin (Amersham Biosciences) column equilibrated with the lysis buffer. The resin was washed with 10 column volumes of lysis buffer before the bound proteins were eluted from the column with lysis buffer and a linear gradient of imidazole from 20-300 mM in the same buffer. The peak fractions were analyzed by 10% SDS-PAGE. The positive fractions were pooled, desalted, and concentrated with spin concentrators (Amicon Ultra-15 ml; Millipore, Billerica, MA) with a molecular mass cutoff of 10,000 Da.
SDS-PAGE Gels and Western Analysis-Protein samples were resolved on a 12% SDS-polyacrylamide gel, transblotted onto Hybond-C membranes (Amersham Biosciences), blocked with 3% nonfat skim milk in phosphate-buffered saline, and then probed with anti-NS3 polyclonal (a gift from James Cook University) or anti-His monoclonal (1:1000 dilution; Qiagen, Valencia, CA) antibodies for 1 h at room temperature. After extensive washes in 0.05% Tween 20 in phosphate-buffered saline, a secondary anti-mouse antibody conjugated to horseradish peroxidase (1:5000 dilution; Sigma) was applied to the blots for at least 1 h at room temperature. Washes were repeated, and membrane-bound antibodies were detected with an ECL chemiluminescence kit (Amersham Biosciences).
Profiling of P4-P1 and P1′-P4′ Specificities with Substrate Libraries-For P4-P1 substrate specificity determination, two-position fixed positional scanning tetrapeptide libraries were synthesized and assayed as described previously (12-15). Assays were carried out in 384-well plates on a SpectraMax Gemini EM or XS microtiter plate reader (Molecular Devices). The final reaction mixtures (30 µl) contained 50 mM Tris-HCl (pH 8.5), 20% glycerol, 1 mM CHAPS, and ~150 µM total substrate. After the addition of enzymes (1-3 µM CF40-Gly-NS3pro185 proteases) to the tetrapeptide coumarin library, reaction mixtures were incubated at 37°C, and the liberated coumarin fluorophore was monitored at a λex of 380 nm and a λem of 450 nm. Initial fluorescent velocities in relative fluorescent units per second were calculated as a fraction of the highest velocity in the library set and plotted into a two-dimensional format with DecisionSite (Spotfire).
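The normalization step described above (each initial velocity expressed as a fraction of the highest velocity in the library set) can be sketched as follows. This is only an illustration: the function name and the example velocities are hypothetical and are not taken from the study's data or software.

```python
def normalize_velocities(velocities):
    """Scale initial velocities (RFU/s) to a fraction of the largest one."""
    top = max(velocities.values())
    if top <= 0:
        raise ValueError("no well shows positive activity")
    return {well: v / top for well, v in velocities.items()}


# Hypothetical initial velocities (RFU/s), keyed by the two fixed residues
# of a P1 x P2 sub-library well, given as (P1, P2) single-letter codes.
raw = {("R", "R"): 4.0, ("R", "T"): 1.0, ("K", "R"): 2.0, ("A", "R"): 0.0}

relative = normalize_velocities(raw)

# Wells ordered from most to least active, as one would read off the matrix.
ranked = sorted(relative, key=relative.get, reverse=True)
```

Because every well is divided by the same maximum, the ordering of wells is unchanged and only the scale becomes comparable across plots.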
The octapeptide donor-quencher positional scanning library was synthesized and assayed as described previously (16). Briefly, CF40-Gly-NS3pro185 proteases (0.5-2 µM) were incubated in 96-well plates with 100 µl reaction containing the same buffer as described above with ~100 µM total substrates (16,17). The reactions were monitored at a λex of 320 nm and a λem of 380 nm, and initial velocities were analyzed and graphed in DeltaGraph.
Steady-state Kinetics of Fluorogenic and Chromogenic Peptide Substrates-Five fluorogenic tetrapeptide substrates with the 7-amino-3-carbamoylmethyl-4-methyl coumarin (ACMC) leaving group (Bz-Nle-Lys-Arg-Arg-ACMC, Bz-Nle-Lys-Thr-Arg-ACMC, Bz-Nle-Thr-Arg-Arg-ACMC, Bz-Thr-Lys-Arg-Arg-ACMC, and Bz-Thr-Thr-Arg-Arg-ACMC) were synthesized using standard Fmoc (N-(9-fluorenyl)methoxycarbonyl) solid phase peptide synthesis techniques (14,15). The thiobenzyl ester substrate, Bz-Nle-Lys-Arg-Arg-SBzl, was purchased from Peptides International. After high performance liquid chromatography purification, the concentration of aliquots of each fluorogenic substrate was determined using total hydrolysis with trypsin, and the released ACMC fluorophore was read at a λex of 380 nm and a λem of 450 nm. The concentration of each substrate was then calculated with standard ACMC solutions. The concentration of the SBzl substrate was also determined using total hydrolysis with trypsin; the released SBzl moiety was monitored spectrophotometrically at 324 nm in the presence of 0.5 mM 4,4′-dithiodipyridine, and concentration was determined using the extinction coefficient of 19,800 M⁻¹ cm⁻¹ for the SBzl-thiopyridine conjugate. Active site titration for purified CF40-Gly-NS3pro185 proteases was performed by inhibition with freshly reconstituted aprotinin (18,19). For kinetic studies, CF40-Gly-NS3pro185 proteases were incubated with various concentrations of individual ACMC, AMC, or SBzl peptide substrates at 37°C. The proteolytic reaction was monitored as an increase in fluorescence at a λex of 380 nm and a λem of 450 nm for the ACMC and AMC substrates or an increase in absorbance at 324 nm in the presence of 0.5 mM 4,4′-dithiodipyridine for the SBzl substrate. Typical reaction mixtures (100 µl) contained 50 mM Tris-HCl, pH 8.5, 20% glycerol, 1 mM CHAPS, 10 nM enzyme, and fluorogenic/chromogenic peptide substrates ranging from 0.5 µM to 1 mM.
Initial fluorescence or absorbance velocities (relative fluorescence units per minute or relative absorbance units per minute) were converted to M·s⁻¹ using a standard ACMC or AMC calibration curve or the extinction coefficient of 19,800 M⁻¹ cm⁻¹ for the SBzl-thiopyridine conjugate. The progress curves were fitted to the Michaelis-Menten equation by nonlinear regression using GraphPad Prism. Steady-state kinetic constants of each substrate were determined from duplicate measurements and reported as mean ± S.E.
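The fitting step above was performed in GraphPad Prism. As a rough, library-free illustration of what a nonlinear least-squares fit of the Michaelis-Menten equation v = Vmax·[S]/(Km + [S]) involves, the sketch below exploits the fact that, for any fixed Km, the best Vmax has a closed form, so the fit reduces to a one-dimensional golden-section search over log10(Km). The function name, search bounds, and synthetic data are assumptions for illustration; this is not the algorithm used by Prism.

```python
import math


def fit_michaelis_menten(s, v, km_lo=1e-3, km_hi=1e4, iters=200):
    """Least-squares fit of v = Vmax * S / (Km + S); returns (Vmax, Km)."""

    def best_vmax(km):
        # For fixed Km the model is linear in Vmax: v = Vmax * x, x = S/(Km+S)
        x = [si / (km + si) for si in s]
        return sum(xi * vi for xi, vi in zip(x, v)) / sum(xi * xi for xi in x)

    def sse(log_km):
        km = 10 ** log_km
        vmax = best_vmax(km)
        return sum((vi - vmax * si / (km + si)) ** 2 for si, vi in zip(s, v))

    # Golden-section search on log10(Km); assumes a single minimum in range.
    phi = (math.sqrt(5) - 1) / 2
    a, b = math.log10(km_lo), math.log10(km_hi)
    for _ in range(iters):
        c = b - phi * (b - a)
        d = a + phi * (b - a)
        if sse(c) < sse(d):
            b = d
        else:
            a = c
    km = 10 ** ((a + b) / 2)
    return best_vmax(km), km


# Synthetic, noise-free velocities generated from assumed parameters
# (Vmax = 0.028 uM/s, Km = 15 uM); substrate range 0.5 uM to 1 mM.
substrate_um = [0.5, 1, 2, 5, 10, 20, 50, 100, 200, 500, 1000]
true_vmax, true_km = 0.028, 15.0
velocity = [true_vmax * si / (true_km + si) for si in substrate_um]

vmax_fit, km_fit = fit_michaelis_menten(substrate_um, velocity)

# kcat follows from Vmax and the active enzyme concentration (10 nM here).
enzyme_um = 0.010
kcat_fit = vmax_fit / enzyme_um
```

With real, noisy data the same routine applies, but standard errors would have to be estimated separately, for example from replicate fits.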
Model of Substrate Binding to the NS3pro Structure-The P4-P4′ octapeptide (Nle-Lys-Arg-Arg-Ser-Gly-Ser-Gly) was fitted to the active site of the enzyme using the crystal structure of the dengue NS3 protease complex with the mung bean Bowman-Birk inhibitor (Protein Data Bank code 1DF9) (20) as a guide. The side chains of residues 44-51 of the inhibitor, representing P4-P4′, were mutated to the sequence of the octapeptide and manually fitted to dengue NS3 protease with the molecular modeling program Maestro (Schrödinger LLC, Portland, OR), seeking to maximize electrostatic interactions, hydrogen bond formation, and hydrophobic interactions. The main chain coordinates were not moved, nor were any atoms of the protein altered.

RESULTS
In Vitro Expression and Purification of DEN1-4 CF40-Gly-NS3pro185-Clum et al. showed that the expression of the core hydrophilic domain of the DEN2 New Guinea C strain NS2B (cNS2B; 40 amino acids) as an N-terminal fusion was sufficient for activating the NS3 protease domain (21). The introduction of a flexible, protease-resistant, nine-amino acid linker (Gly4-Ser-Gly4) generated a soluble and catalytically active protease complex (11). In the study presented here, similar constructs were expressed based on this strategy by cloning the chimeric CF40-Gly-NS3pro185 cDNAs derived from dengue serotypes 1-4 into the pET15b vector from Novagen (see "Experimental Procedures"). Expression of the recombinant DEN2 CF40-Gly-NS3pro185 protease as an N-terminal His tag fusion protein in E. coli followed by affinity purification led to high yields (typically 15-20 mg from a 1-liter culture) of soluble protein, of which >95% was full-length (Fig. 1). The identity of CF40-Gly-NS3pro185 was confirmed with Western analyses using anti-His and anti-NS3 antibodies (Fig. 1). The minor lower band detected by the anti-NS3 antibody (Fig. 1B, asterisk), but not by the anti-His antibody (Fig. 1C), is presumably the heterodimeric protein without the hexahistidine tag. CF40-Gly-NS3pro185 proteases for DEN1, DEN3, and DEN4 were similarly expressed and purified, except that the yield of the DEN4-derived enzyme was 10-fold lower (data not shown). Active site titration with aprotinin was used to accurately assess the active enzyme concentration (11). Aprotinin binds to the four CF40-Gly-NS3pro185 proteases with high affinity (Ki = 79, 25, 88, and 6.4 pM for DEN1-4 CF40-Gly-NS3pro185, respectively).
Characterization of Enzymatic Activity of DEN1-4 CF40-Gly-NS3pro185 on a Fluorogenic Tripeptide Substrate-The activities of the four CF40-Gly-NS3pro185 enzymes were characterized using the fluorogenic peptide Boc-GRR-AMC, which had been shown previously to be cleaved by the DEN2-NGC cNS2B/NS3 protease complex (10). The Km, Vmax, and kcat/Km (substrate specificity) for DEN2 CF40-Gly-NS3pro185 using Boc-GRR-AMC were determined by varying the substrate concentration from 1000 to 10 µM using 12 serial dilutions (see "Experimental Procedures"). The steady-state kinetic parameters obtained were kcat = 0.13 ± 0.02 s⁻¹, Km = 150 ± 15 µM, and kcat/Km = 840 ± 100 M⁻¹ s⁻¹ (Table I).
The activities of the purified DEN1, DEN3, and DEN4 CF40-Gly-NS3pro185 were characterized using the same Boc-GRR-AMC fluorogenic peptide. All four proteases exhibited comparable Km values, but the kcat and kcat/Km values showed greater variations, with the values for the DEN1 protease being the lowest, making it the least active (Table I). This observation was also consistent for a DEN1 protease cloned from a clinical isolate obtained during the dengue fever outbreak in the Indonesian city of Jakarta (data not shown).
Profiling of P4-P1 Specificities of DEN1-4 CF40-Gly-NS3pro185-Sequence analysis of the NS3 proteases from all four distinct serotypes indicated that they share greater than 60% identity in their primary sequences (Fig. 2). To explore the substrate structure-activity relationship, the P4-P1 substrate specificities of recombinant NS3 proteases from DEN1-4 were examined using tetrapeptide positional scanning synthetic combinatorial libraries of the general structure Ac-XXXX-7-amino-4-carbamoylmethyl coumarin (Fig. 3) (14,15,17). The tetrapeptide substrates were synthesized and assayed as mixtures of peptides in a positional scanning format where two positions were fixed with a specific amino acid and two positions were randomized with 19 amino acids (X represents all 19 × 19 × 19). Cleavage of the peptide-7-amino-4-carbamoylmethyl coumarin bond results in an increase in fluorescence that can be directly monitored. The total concentration of substrates in each well was ~150 µM, or ~0.4 µM for each substrate. The relative rates for the mixture of substrates are represented in a two-dimensional matrix, with each square in the matrix representing both the identity of the two fixed amino acids (x-axis and y-axis) and the relative activity as indicated on a gray scale in which white represents no activity and black represents the highest activity (Fig. 3B). The activity of the enzyme across all three sublibraries (P1 × P2, P1 × P3, and P1 × P4) was normalized to the highest activity as indicated by the white-to-black scale below each of the two-dimensional graphs. The enzymatic activity was also represented in histogram form (Fig. 3B), where the P1 position is fixed as arginine, the x-axis represents the P2, P3, and P4 fixed positions, and the y-axis represents the normalized hydrolysis rates in relative fluorescence units per second.
The substrate specificity at each subsite in the tetrapeptide can be determined by the highest hydrolysis rate (in relative fluorescence units per second) observed in the individual P2, P3, and P4 sub-libraries.
The substrate specificities for the four NS3 proteases from DEN1-4 were found to be very similar (Fig. 3B). Because the whole library is assayed at a constant substrate concentration, the relative importance of the amino acid at a particular position can be compared with the intensities of corresponding signals across the library. An exclusive preference for basic amino acid residues (either Arg or Lys) at the P1 position was observed in P1-P2, P1-P3, or P1-P4 fixed libraries (Fig. 3B, left). After plotting the activities from wells containing Arg at P1, P2-4 preferences were readily observed in the order of Arg > Thr > Gln/Asn/Lys for P2, Lys > Arg > Asn for P3, and Nle > Leu > Lys > Xaa for P4 (Fig. 3B, right).
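The library arithmetic quoted above can be checked with a short calculation. The sketch below assumes 19 residues per randomized position and a total substrate concentration of 150 µM per well, as stated in the text; the function and key names are illustrative only.

```python
N_RESIDUES = 19  # cysteine excluded; methionine replaced by norleucine


def library_stats(length, fixed, total_conc_um):
    """Sizes and per-substrate concentration for a positional scanning library."""
    randomized = length - fixed
    per_well = N_RESIDUES ** randomized
    return {
        "total_sequences": N_RESIDUES ** length,
        "substrates_per_well": per_well,
        "wells_per_sublibrary": N_RESIDUES ** fixed,
        "conc_per_substrate_um": total_conc_um / per_well,
    }


# Tetrapeptide library: two fixed and two randomized positions per well.
tetra = library_stats(length=4, fixed=2, total_conc_um=150.0)
```

Here `tetra["total_sequences"]` is 130,321, each well holds 361 substrates, and each substrate is present at roughly 0.42 µM, in line with the ~0.4 µM quoted in the text.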
Steady-state Kinetic Constants for the Hydrolysis of Optimal Substrates by Dengue NS3 Proteases-For serine proteases, the well established catalytic mechanism involves a two-step process of acylation and deacylation as shown in Scheme 1.
During the acylation step the catalytic serine acts as a nucleophile to attack the P1 carbonyl of the substrate, forming an acyl-enzyme intermediate where the non-prime portion of the peptide substrate remains covalently bound to the enzyme and the prime site segment of the substrate dissociates from the enzyme. In the subsequent deacylation step, water acts as the nucleophile to form the new C terminus of the cleaved substrate with the ensuing regeneration of the catalytic serine (22). A consequence of this catalytic mechanism, as shown in Equations 1 and 2, is that the macroscopic steady-state constants kcat and Km are related to the acylation (k3) and deacylation (k5) rate constants.

$$k_{cat} = \frac{k_3 \times k_5}{k_3 + k_5} \qquad \text{(Eq. 1)}$$

When acylation is rate-determining, k5 >> k3, the steady-state kinetic constants simplify to kcat = k3 and Km = Kd. For most serine proteases, acylation is rate-determining for amide bond hydrolysis, and deacylation is rate-determining for ester bond hydrolysis (22-24). To determine whether the acylation step is rate-determining for dengue NS3 protease, the steady-state kinetic constants were determined for two substrates that contained the same peptide sequence but with different leaving groups, namely Bz-Nle-Lys-Arg-Arg-ACMC, representing amide bond hydrolysis, and Bz-Nle-Lys-Arg-Arg-SBzl, representing thio ester bond hydrolysis. If deacylation were rate-determining for both substrates, then the catalytic rate constants would be largely indistinguishable because the catalytic rate would be dependent on deacylation of the acyl-enzyme intermediate, and the acyl-enzyme intermediate formed by both substrates would be identical. If acylation is rate-determining for one or both of the substrates, then the catalytic rates would be significantly different and would depend on the relative reactivities of the leaving groups in the original substrates. Indeed, for dengue NS3 protease (Table I, DEN4) the kcat for the Bz-Nle-Lys-Arg-Arg-SBzl substrate is ~100-fold greater than that of the Bz-Nle-Lys-Arg-Arg-ACMC substrate, kcat,SBzl = 300 s⁻¹ versus kcat,ACMC = 2.8 s⁻¹. This observation is consistent with acylation being the rate-determining step for amide bond hydrolysis by dengue NS3 protease.
FIG. 3. P4-P1 specificity of dengue NS3 serine proteases. A, the representative structure of peptides in the libraries. B, the initial velocities associated with each substrate in the individual library for each NS3 protease were normalized with the highest initial velocity observed and presented in a two-dimensional format scaled by gray intensity (white for no activity and black for highest activity). The amino acids are represented using the single letter codes, and norleucine is represented as n. The initial velocities of P4, P3, and P2 positions when P1 is fixed at arginine are presented in the right section. At the top of each graph X stands for the 19-amino acid equimolar mixture (cysteine excluded; methionine is replaced by the isostere norleucine), and O stands for single fixed amino acid. Numerical data are available in the supplemental material, which can be found in the on-line version of this article.
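The rate-limiting-step argument for the amide and thio ester substrates can be illustrated numerically. The sketch below implements Equation 1 as given in the text and, as an assumption, the standard acyl-enzyme expression Km = Kd·k5/(k3 + k5) for Equation 2, which is not legible in this copy. The microscopic rate constants used in the limiting cases are hypothetical; only the two kcat values (300 and 2.8 s⁻¹) come from the text.

```python
def k_cat(k3, k5):
    """Eq. 1: turnover number from acylation (k3) and deacylation (k5)."""
    return k3 * k5 / (k3 + k5)


def k_m(k_d, k3, k5):
    """Assumed standard form of Eq. 2; reduces to Kd when k5 >> k3."""
    return k_d * k5 / (k3 + k5)


# Acylation-limited case (k5 >> k3): kcat approaches k3 and Km approaches Kd.
# k5 = 1e6 s^-1 and Kd = 15 uM are hypothetical values for illustration.
amide_kcat = k_cat(k3=2.8, k5=1e6)
amide_km = k_m(k_d=15.0, k3=2.8, k5=1e6)

# Deacylation-limited case (k3 >> k5): kcat approaches k5 instead.
ester_kcat = k_cat(k3=1e6, k5=300.0)

# Ratio of the two measured kcat values reported for DEN4 in the text.
measured_ratio = 300.0 / 2.8
```

If deacylation limited both substrates, the two kcat values would converge on the same k5; a ratio on the order of 100 is what the acylation-limited reading of the amide substrate predicts.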
To validate the substrate specificities identified from the non-prime site substrate profiling, we synthesized a series of individual substrates and examined the contribution of the P2-P4 positions by comparing their steady-state kinetic parameters (Km, kcat, and kcat/Km; Table I). Bz-Nle-Lys-Arg-Arg-ACMC contains the combination of optimal amino acids at P4-P1, whereas the other substrates (Bz-Nle-Lys-Thr-Arg-ACMC, Bz-Nle-Thr-Arg-Arg-ACMC, Bz-Thr-Lys-Arg-Arg-ACMC, and Bz-Thr-Thr-Arg-Arg-ACMC) each contain one or two substitutions in P2-P4, with the corresponding optimal residue replaced by the suboptimal Thr residue. The kcat/Km values of Bz-Nle-Lys-Arg-Arg-ACMC for the four NS3 enzymes are 75-1000-fold higher than that of Boc-Gly-Arg-Arg-AMC, a widely used NS3 substrate (Table I). The increase in activity is due to both a decrease in Km and an increase in kcat. This drastic difference is not likely caused by the use of a modified fluorophore, because Bz-Nle-Lys-Arg-Arg-AMC and Bz-Nle-Lys-Arg-Arg-ACMC have comparable Km and kcat values (Table I, DEN4).
The recognition of dibasic residues at the P1 and P2 sites by dengue NS3 protease is considered the key specificity characteristic of flaviviral NS3 enzymes and has been reported by a number of groups (7,8,25). The data presented here demonstrate that P2 is very sensitive to substitution and support the role of P2 in substrate ground state binding in view of the markedly increased Km of the suboptimal substrate Bz-Nle-Lys-Thr-Arg-ACMC when compared with Bz-Nle-Lys-Arg-Arg-ACMC (Table I).
The single substrate kinetic data also indicate that non-prime subsites beyond P2 contribute significantly to substrate binding and turnover. Specifically, a suboptimal P3 substitution (Bz-Nle-Thr-Arg-Arg-ACMC) causes an increase in Km of up to 10-fold but has little influence on kcat. The P4 substitution with a suboptimal amino acid, Bz-Thr-Lys-Arg-Arg-ACMC, maintains a similar Km but displays a 6-fold decrease in kcat. The role of P4 in catalysis can also be observed when comparing the suboptimal substrates Bz-Nle-Thr-Arg-Arg-ACMC and Bz-Thr-Thr-Arg-Arg-ACMC. Not surprisingly, changing both P3 and P4 (Bz-Thr-Thr-Arg-Arg-ACMC) to suboptimal residues affects both Km and kcat and leads to a loss of kcat/Km of 34-168-fold.
Profiling of P1′-P4′ Specificities of DEN1-4 NS3 Protease-A number of observations have suggested the presence of prime site substrate specificity in dengue NS3 proteases. Murthy et al. (9) first reported the crystal structure of the apo NS3 serine protease domain at 2.1 Å. This structure revealed a restricted substrate binding cleft with few predicted interactions beyond P2-P2′. Defined interactions with the prime site pockets were recently observed in the structure of NS3 protease complexed with a Bowman-Birk inhibitor that has P1′-Ser and P3′-Pro in both active-site loops (20). Further direct evidence stems from a mutagenesis study on the S2′ pocket where a single Gly-133 to Ala substitution strongly reduced the autoprocessing of NS2B-NS3 (27).
To further elaborate the prime site substrate specificity of NS3, a focused P1′-P4′ octapeptide donor-quencher library was synthesized in a positional scanning format (Fig. 4A). The P1′-P4′ region of the donor-quencher positional scanning library contained a tetrapeptide sequence with one position fixed as a specific amino acid and three positions randomized as 19 amino acids for a total of 130,321 substrates (19 × 19 × 19 × 19) in mixtures of 6,859 per well (19 randomized amino acids × 19 randomized amino acids × 19 randomized amino acids × 1 fixed amino acid). Cleavage of the peptide between the fluorescence donor methoxycoumarin group and the quencher dinitrophenyl group results in an increase in fluorescence. The library was designed to bias for cleavage between the P1 and P1′ positions by occupying the P4-P1 sites with the sequence Nle-Lys-Arg-Arg, the non-prime specificity determined from the tetrapeptide positional scanning library (Fig. 3). The dependence of the hydrolysis rate on the identity of the amino acid in the prime site position is represented in Fig. 4B, where the x-axis represents the amino acid in the fixed position and the y-axis represents the relative rate of hydrolysis in relative fluorescent units per second.
Dengue NS3 proteases from all four serotypes displayed similar prime site substrate specificity as observed in the donor-quencher positional scanning substrate library (Fig. 4). In particular, P1′ and P3′ sites exhibited specificity for small and polar amino acids such as serine, whereas the P2′ and P4′ substrate positions showed minimal activity when compared with the P1′ and P3′ positions.
Correlation of Substrate Specificities with Natural Cleavage Sites-The NS3 protease is responsible for the cleavage of at least the 2A/2B, 2B/3, 3/4A, and 4B/5 boundaries of the virus-encoded polyprotein (Fig. 5B) (6,7,25,28,29). These sites are highly conserved; dibasic residues at P1 and P2 are followed by a small or polar residue at P1′ (Ser, Gly, or Ala). The only exception is that the 2B/3 sites of all four dengue serotypes contain a Gln residue at the P2 position. The dibasic cleavage patterns are evident from the substrate profiling experiment (summarized in Fig. 5A). At the P3 position, the four sites of DEN1-3 contain optimal Lys/Arg or near optimal Gly, whereas three of the DEN4 sites are occupied with unfavorable Ser/Thr/Pro. On the prime side, the strong preference for Ser at P3′ in the profiling study was not reflected in the native sites except for the 3/4A linker in DEN1.
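The consensus described for the natural cleavage sites (Arg-Arg, Arg-Lys, Lys-Arg, or occasionally Gln-Arg at P2-P1, followed by Gly, Ala, or Ser at P1′) can be written as a simple pattern search. The sketch below is a minimal illustration: the function name and the test sequence are hypothetical, and a match marks only a candidate site, not a verified cleavage site.

```python
import re

# P2-P1 pair (RR, RK, KR, or QR) followed by G, A, or S at P1'.
# The lookahead lets overlapping candidates be reported as well.
SITE = re.compile(r"(?=((?:RR|RK|KR|QR)[GAS]))")


def find_candidate_sites(seq):
    """Return (0-based index of the P1' residue, P2-P1-P1' triplet) pairs."""
    return [(m.start() + 2, m.group(1)) for m in SITE.finditer(seq)]


# Hypothetical sequence, not taken from any dengue polyprotein.
example = "AAKRSLLQRGTTRKAMM"
sites = find_candidate_sites(example)
```

Extending the pattern to P3 or to the prime-side preferences would follow the same idea, but any such scoring is an interpretation of the profiling data rather than something reported in the study.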
Octapeptide Substrate Docking into the NS3 Active Site-The structure of the dengue NS3 protease complexed with Bowman-Birk inhibitor was used to dock the optimized octapeptide (Nle-Lys-Arg-Arg-Ser-Gly-Ser-Gly) by mutating the corresponding P4-P4′ residues in the bound inhibitor. The schematic of the structure with the potential interacting side chains is shown in Fig. 6. Despite the strong preference for a positive charge at P1 and P2 positions in the substrate, the corresponding substrate binding pockets do not appear to contain any negative charges. In the S1 pocket, it has been recently suggested that Tyr-150 may be involved in a π-interaction with the side chain of P1-Arg (30). The residues that line the S1 pocket are mostly conserved in all the NS3 sequences (data not shown) except at position 115, where Leu, Thr, or Ile can be found. The predicted S2 pocket is lined by the side chains of Gln-35 and Asn-152. All three residues are completely conserved, and it is conceivable that the P2 Arg side chain may hydrogen bond with Gln-35. In the S3 pocket the side chains of Leu-128, Asp-129, Pro-132, and Val-155 are completely conserved in all NS3 sequences with the exception of Arg-157, which may be replaced by a Lys or a Thr. The prediction of a hydrophobic wall provided by the completely conserved side chains of Val at positions 153 and 155 is consistent with the P4 norleucine preference from the substrate profiling results. Ser-135 and His-51 are close to the S2 pocket and are positioned to interact with the carbonyl group of the scissile bond. The pockets that accommodate the prime site residues are less prominent, except that His-51 may interact with Ser at P1′.
FIG. 4. P1′-P4′ substrate specificity of NS3 proteases. A, representative structure of the donor-quencher peptide substrate in a positional scanning library format. P1′-P4′ represents positions in the peptide that are either fixed as a specific amino acid or are an equimolar mixture of 19 amino acids (cysteine excluded; methionine replaced by the isostere norleucine). B, initial velocities associated with each substrate in the library for each NS3 protease were normalized with the highest activity and presented as relative fluorescent units per second. The amino acids are represented using the single letter codes, and norleucine is represented as n. At the top of each graph X stands for the 19-amino acid equimolar mixture, and O stands for a single fixed amino acid.
Of the residues that form the various substrate binding pockets, all but two are invariant across the four dengue serotypes. Position 115 is deep in the S1 subsite near the head group of the P1 residues (ε-amino group of Lys or the guanidinium group of Arg) and can either be a Leu (serotypes 2 and 4) or a Thr (serotypes 1 and 3). Replacement of Leu with the smaller Thr side chain enlarges the S1 pocket but removes any direct contact between the side chain of residue 115 and the P1 residue of the substrate. Residue 157 of the dengue NS3 protease can either be a Lys (serotypes 3 and 4), an Arg (serotype 2), or a Thr (serotype 1).

DISCUSSION
Four genetically similar but antigenically distinct serotypes of dengue viruses, DEN1-4, have been identified. Each of these serotypes can infect humans and induce host immunity to the corresponding serotype. However, dengue hemorrhagic fever or dengue shock syndrome is thought to be mostly associated with secondary infections and therefore remains a factor that poses challenges to the development of a safe vaccine (31). The encoded dengue viral protease, NS3, is an attractive therapeutic target, as it is essential for the formation of the virally encoded non-structural proteins (32,33).
Although efforts are under way to develop protease inhibitors against dengue viral infection, the question remains open whether a pan-serotype inhibitor can be developed or whether multiple inhibitors will have to be designed for individual serotypes. Based on the sequence analysis, the NS3 proteases from the four serotypes share >60% identity in the protease domains (Fig. 2). To reveal the functional similarity between the four dengue NS2B/NS3 proteases, we report the bacterial expression and purification of highly active chimeric single chain NS2B/NS3 proteases from all four serotypes. With combinatorial peptide substrate libraries it was demonstrated for the first time that all four enzymes exhibit very similar substrate specificities at both non-prime sites (Fig. 3) and prime sites (Fig. 4). Individual substrate kinetics further confirmed the similar preference and sensitivity to replacement at P4-P1 positions by all four NS3 proteases (Table I). These results suggest that the four NS3 proteases share very similar, if not identical, peptide substrate structure-activity relationships and imply that it is possible to develop a single inhibitory agent targeting all four dengue NS3 proteases.
The other open question pertains to the extent of the enzyme substrate binding site. The crystal structure of the DEN2 NS3pro in the absence of NS2B shows a shallow substrate binding site, indicating that significant interactions are restricted to P2-P2′ (9). The substrate profiling study presented here clearly supports the importance of P2-P2′ in substrate binding as indicated by the strong dibasic preference at P1 and P2 as well as the small amino acids at P1′ (Figs. 3 and 4). The critical role of P2-P2′ is also reflected in changes of kinetic parameters upon single residue substitution (Table I). For example, Bz-Nle-Lys-Arg-Arg-ACMC binds to the enzyme 3-30-fold tighter as compared with its P2 suboptimal counterpart, Bz-Nle-Lys-Thr-Arg-ACMC (3-30-fold increase in Km).
More importantly, the profiling and kinetic data reveal for the first time that the P3 position also contributes to substrate ground state binding. Substitution of the optimal P3 Lys with Thr leads to a 4-10-fold increase in Km for all four NS3 proteases. Besides P3, a clear preference for Ser at P3′ also suggests a strong enzyme-substrate interaction beyond P2-P2′. The discrepancy between this observation and the crystal structure could be explained by the difference in enzyme source. The enzymes used in this study are composed of the protease domain of NS3 tethered to the central 40 amino acids of the hydrophilic NS2B element that confers full proteolytic activity to NS3 and, therefore, more closely resemble the native conformation (11). In contrast, the crystal structure is derived from the NS3pro domain in the absence of NS2B (9,20). Although the structure without the cofactor resembles that of the related hepatitis C NS3 protease bound to its activating peptide, NS4A, the exact mechanism by which the NS2B cofactor stimulates the protease activity is not currently known. Yusof et al. (10) experimentally compared the kinetic properties of NS3 and NS2B/NS3, and their results suggested that NS2B generates additional specific interactions with the P2 and P3 residues of the substrates. It is possible that NS2B activates the NS3 protease directly or interacts with the NS3 protease domain, causing a conformational change in the substrate binding pockets (10). Although substitution at P4 from suboptimal Thr to optimal Nle does not significantly alter the Km, it does increase kcat by 6-fold. This result again supports the idea that the substrate specificity of NS3 protease is influenced by more extensive contact than has been reported previously.
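Fold changes in Km or kcat such as those above are often placed on an energy scale using ΔΔG = RT ln(fold change). The sketch below is an illustration under stated assumptions rather than a calculation from this study: it assumes 25 °C (the assay temperature is not given in this section), treats Km as a proxy for ground state binding as the text does, and uses a hypothetical helper name.

```python
import math

GAS_CONSTANT_KCAL = 1.987e-3  # kcal mol^-1 K^-1


def ddg_kcal_per_mol(fold_change: float, temperature_k: float = 298.15) -> float:
    """Free energy difference equivalent to a fold change (assumed 25 degrees C by default)."""
    if fold_change <= 0:
        raise ValueError("fold change must be positive")
    return GAS_CONSTANT_KCAL * temperature_k * math.log(fold_change)


# Fold changes quoted in the text; the labels are descriptive only.
for label, fold in [
    ("P3 Lys to Thr, Km, lower bound", 4.0),
    ("P3 Lys to Thr, Km, upper bound", 10.0),
    ("P4 Thr to Nle, kcat", 6.0),
]:
    print(f"{label}: {ddg_kcal_per_mol(fold):.2f} kcal/mol")
```

Under these assumptions a 6-fold change corresponds to roughly 1 kcal/mol, so the effects discussed here are modest on an energy scale even though they are clearly resolved kinetically.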
Several groups have characterized the enzymatic properties of NS2B/NS3 proteases with synthetic peptide substrates bearing endogenous dengue cleavage sites (11,34,35). The best substrate from these studies, Ac-Thr-Thr-Ser-Thr-Arg-Arg-para-nitroanilide, covers the sequence from the NS4B/5 cleavage site with a kcat/Km of 275 M⁻¹ s⁻¹. From the current study, Bz-Nle-Lys-Arg-Arg-ACMC was the optimal sequence identified from the profiling experiments and exhibits a kcat/Km of 51,800 M⁻¹ s⁻¹, thus representing a >150-fold improvement in activity over existing substrates.
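The fold improvement quoted above follows directly from the two specificity constants. A short check, using hypothetical variable names:

```python
# Specificity constants (kcat/Km) quoted in the text, in M^-1 s^-1.
KCAT_KM_NS4B5_PNA = 275.0        # Ac-Thr-Thr-Ser-Thr-Arg-Arg-para-nitroanilide
KCAT_KM_OPTIMAL_ACMC = 51_800.0  # Bz-Nle-Lys-Arg-Arg-ACMC

fold_improvement = KCAT_KM_OPTIMAL_ACMC / KCAT_KM_NS4B5_PNA
print(f"{fold_improvement:.0f}-fold")
```

The ratio is about 188, consistent with the >150-fold figure given in the text.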
The studies presented here not only provide a powerful tool for monitoring the proteolytic activity of dengue NS3 proteases but also illustrate the fact that natural cleavage sites are not necessarily occupied by optimal residues. The most obvious example is the conserved glutamine residue at the 2B/3 boundary in all four serotypes. Consistent with previously reported results from Kumthong et al. (35) and Leung et al. (11), our kinetic study with synthetic peptides resembling the native cleavage sequence revealed a much lower kcat/Km for the substrate with the 2B/3 sequence (see the supplemental data available in the on-line version of this article). These data support the observation from the in vitro processing study that an intramolecular cleavage between 2A/2B precedes an intramolecular cleavage between 2B/3 (6). The differential rates of cleavage at the four major cleavage sites (Fig. 5B) may guide an ordered processing of the dengue viral polyprotein, yield sufficient intermediates with desired function, and harmonize viral replication and assembly. A similar example of timed cleavage has been recently observed with the human immunodeficiency virus Gag precursor (26,37,38).
Taken together, a systematic evaluation is presented here of the extended substrate specificity of dengue serine proteases from all four distinct serotypes by using a combination of synthetic positional scanning combinatorial libraries and single substrate kinetics. This study represents the first observation of the conserved and extended substrate specificities among the four dengue NS3 proteases. The data provided here should facilitate the development of dengue NS3 protease inhibitors with detailed peptide substrate structure-activity relationships and greatly improve protease activity detection agents.