Dengue Virus NS3 Serine Protease

The mosquito-borne dengue viruses are widespread human pathogens causing dengue fever, dengue hemorrhagic fever, and dengue shock syndrome, placing 40% of the world’s population at risk with no effective treatment. The viral genome is a positive strand RNA that encodes a single polyprotein precursor. Processing of the polyprotein precursor into mature proteins is carried out by the host signal peptidase and by NS3 serine protease, which requires NS2B as a cofactor. We report here the crystal structure of the NS3 serine protease domain at 2.1 Å resolution. This structure of the protease combined with modeling of peptide substrates into the active site suggests identities of residues involved in substrate recognition as well as providing a structural basis for several mutational effects on enzyme activity. This structure will be useful for development of specific inhibitors as therapeutics against dengue and other flaviviral proteases.

Dengue viruses, members of the family Flaviviridae, are transmitted by the mosquitoes Aedes aegypti and Aedes albopictus and cause severe and widespread epidemics of diseases such as dengue fever and dengue hemorrhagic fever/dengue shock syndrome (for a review, see Ref. 1). The global distribution of dengue virus infections is comparable to that of malaria, with approximately 2.5 billion people at risk, mostly in the southern hemisphere. Nearly 5% of the close to one million dengue hemorrhagic fever cases each year are fatal (2). Currently, dengue infections are endemic in all continents except Europe (for recent reviews, see Refs. 3 and 4). Management of dengue fever is largely supportive; however, severe hemorrhagic manifestations may require blood transfusions (1). No vaccine is available to protect against dengue virus infections, and thus there is considerable interest in developing new antiviral therapeutic agents to combat diseases caused by dengue viruses.
Dengue virus type 2 (Den2), the most prevalent of the four serotypes, contains a single-stranded RNA of positive polarity with a type I cap structure at the 5′-end and codes for a single polyprotein precursor (3,391 amino acid residues for Den2) (5) arranged in the order NH2-C-prM-E-NS1-NS2A-NS2B-NS3-NS4A-NS4B-NS5-COOH. Maturation of the polyprotein precursor occurs both cotranslationally and post-translationally within the endoplasmic reticulum, yielding three structural proteins, C (core), prM (precursor to membrane), and E (envelope), which are components of the virion, and at least seven nonstructural (NS) proteins (NS1-NS5). The nonstructural proteins are thought to function in viral replication, although the components of the viral RNA replicase have not been defined precisely.
Proteolytic processing by at least two different proteases, one of host and the other of viral origin, liberates the individual viral proteins from the polyprotein precursor. Signal peptidase within the endoplasmic reticulum cleaves at the C-prM, prM-E, E-NS1, and NS4A-NS4B junctions (6-8). Processing at the NS1-NS2A junction is mediated by a host protease within the endoplasmic reticulum which may be either signal peptidase or another unknown enzyme (9). Maturation of prM to M is mediated by another host protease located in an acidic compartment encountered during the exocytic pathway and occurs at a late step of virion morphogenesis (10). The remaining cleavages in the polyprotein precursor are mediated by the trypsin-like serine protease encoded within the NH2-terminal 180 amino acid residues of the NS3 protein. This protease domain was identified based on the sequence similarity of NS3 with several viral and cellular serine proteases (11, 12). Besides the protease domain at the NH2 terminus, the COOH-terminal three-fourths of NS3 contains conserved motifs found in the DEXH family of several viral RNA-stimulated NTPases/RNA helicases. The RNA helicase activity has been implicated in unwinding of the double-stranded RNA replicative intermediate. NS3 also interacts with the viral RNA-dependent RNA polymerase (NS5) (13), and these protein-protein interactions may facilitate the localization of the viral replicase complex to endoplasmic reticulum membranes where genome replication occurs (for reviews, see Refs. 14 and 15).
Analysis of polyprotein processing established that the NS3 protease, as a complex with the viral activator protein NS2B (16-18), catalyzes the cleavages at the NS2A-NS2B, NS2B-NS3, NS3-NS4A, and NS4B-NS5 sites in the polyprotein, which have Lys-Arg, Arg-Arg, Arg-Lys, and occasionally Gln-Arg at the P2 and P1 positions, followed by a short chain amino acid Gly, Ala, or Ser at the P1′ position (16, 17, 19-23). Thus, the biologically active viral protease is a heterodimeric complex of NS2B-NS3. The importance of this protease activity in viral viability is underscored by the finding that mutations that abolished protease activity when introduced in the context of an infectious cDNA clone eliminated virus recovery (16). NS2B has three hydrophobic regions flanking a conserved hydrophilic domain of about 40 amino acid residues. This hydrophilic region is necessary and sufficient for activation of the NS3 protease domain in vivo (18, 32) and in vitro. Although the hydrophobic regions of NS2B are dispensable for the protease activity, they are required for cotranslational membrane insertion of full-length NS2B and its efficient activation of the NS3 serine protease domain (24).
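The cleavage-site rule stated above (Lys-Arg, Arg-Arg, Arg-Lys, or occasionally Gln-Arg at P2-P1, followed by Gly, Ala, or Ser at P1′) can be illustrated with a short motif scan. This is a minimal sketch, not the authors' method: the function name and the test peptide are hypothetical, and a sequence match alone does not establish that a site is cleaved in vivo.

```python
import re

# P2-P1 pairs named in the text; Gln-Arg is the occasional variant.
P2_P1_PAIRS = ("KR", "RR", "RK", "QR")
# Short chain residues allowed at P1' (Gly, Ala, Ser).
P1_PRIME = "GAS"

# A zero-width lookahead lets overlapping candidate sites be reported.
SITE_PATTERN = re.compile(r"(?=(?:%s)[%s])" % ("|".join(P2_P1_PAIRS), P1_PRIME))


def candidate_cleavage_sites(sequence: str) -> list[int]:
    """Return the 0-based index of the P1' residue for every motif match."""
    return [m.start() + 2 for m in SITE_PATTERN.finditer(sequence.upper())]


# Hypothetical peptide for illustration only, not a dengue sequence.
toy_peptide = "GGRRGAAQRA"
print(candidate_cleavage_sites(toy_peptide))
```

Each returned index marks the residue immediately after the predicted scissile bond, so the bond lies between positions index - 1 and index.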
The covalently linked trypsin-like serine protease domain and the RNA-stimulated NTPase/RNA helicase domain are also found in hepatitis C virus (HCV), a new member of Flaviviridae (25, 26), although HCV does not depend on arthropod vectors for transmission (27). Moreover, unlike the arthropod-borne flaviviral NS3 protease domain, which is activated by the hydrophilic region of NS2B, the protease domain of HCV is activated by a 19-residue NH2-terminal hydrophobic region of NS4A protein (28-33).
Serine proteases are perhaps the best studied (34-40) of the four classes (serine, aspartic, metallo, and cysteine) of proteases (for a review, see Ref. 41 and the references therein). Pioneering crystallographic, biochemical, and molecular biological studies have extensively documented the existence of a common catalytic apparatus (Asp-His-Ser triad) participating in a conserved mechanism of catalysis among serine proteases (35, 37-39) (for a review, see Ref. 41). The basic mechanism consists of a charge relay system that transfers the unfavorable negative charge on the buried carboxyl via the histidine to the serine. This results in transfer of the Ser Oγ proton to the histidine, converting the serine into a strong nucleophile for attack on the peptidyl carbonyl of the substrate. The exquisite selectivity of proteases for particular substrates is a result of the existence of specific binding sites (frequently termed pockets) on the enzyme for amino acid side chains of the substrate(s). The substrate is oriented by binding of the amino acid side chain of the P1 residue in the S1 pocket (the P and S nomenclature of Schechter and Berger (73) is used; P1 is the substrate residue at the NH2-terminal end, and P1′ is the residue at the COOH-terminal end of the scissile bond), a hydrogen bond between the backbone NH of the P1 residue, and two hydrogen bonds between the carbonyl oxygen of the scissile bond and two backbone NH groups of the enzyme (oxyanion binding hole). The reaction proceeds through a tetrahedral transition state with an acyl enzyme intermediate. Recently, however, serine proteases that lack one of the three components of the catalytic triad have also been identified (41). The important role of the predicted catalytic triad residues (His-51, Asp-75, and Ser-135 in Den2) in the mechanism of homologous flaviviral NS3 serine proteases was established by site-directed mutagenesis of these residues, which abolished protease activity.
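The Schechter and Berger nomenclature used above numbers substrate residues outward from the scissile bond: P1, P2, ... toward the NH2 terminus and P1′, P2′, ... toward the COOH terminus. The sketch below, with a hypothetical helper name, labels a peptide once the position of the scissile bond is given; the tetrapeptide RRSW is the P2-P2′ example named in the legend to Fig. 3.

```python
def label_subsites(peptide: str, scissile_after: int) -> dict[str, str]:
    """Map Schechter-Berger labels to residues.

    scissile_after is the number of residues on the NH2-terminal side of the
    scissile bond, so peptide[scissile_after - 1] is P1 and
    peptide[scissile_after] is P1'.
    """
    if not 0 < scissile_after < len(peptide):
        raise ValueError("scissile bond must lie inside the peptide")
    labels = {}
    for i, residue in enumerate(peptide):
        if i < scissile_after:
            # Residues before the bond count down toward P1.
            labels[f"P{scissile_after - i}"] = residue
        else:
            # Residues after the bond count up from P1'.
            labels[f"P{i - scissile_after + 1}'"] = residue
    return labels


print(label_subsites("RRSW", 2))
```

By the same convention, the enzyme pocket that receives the Pn side chain is called Sn, and the pocket for Pn′ is Sn′.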
In this study we report the structure of the Den2 protease domain at 2.1 Å resolution, which provides a structural basis for the effects of several site-specific mutations on enzyme activity (42). Modeling of tetrapeptide substrates into the active site indicates that a number of residues that were identified as potential determinants of substrate specificity by sequence alignments do make contacts with the substrate, although there are significant differences from anticipated interactions. The structure of the Den2 protease reveals a substrate binding cleft that is small and shallow, except for the S1 and S2 pockets, and a catalytic triad that mimics half of the charge relay system (43).
The structures of the NS3 protease domain of HCV in the presence and absence of a synthetic activator region of NS4A have been reported (44, 45). Comparison of the Den2 protease structure with these HCV protease structures reveals notable differences; for example, the structural zinc binding site and the long hydrophobic NH2-terminal loop of the HCV protease are absent in the Den2 protease. Taken together, this is the first structural report of an arthropod-borne flavivirus protease, including the differences in substrate cleavage specificities between the Den2 and HCV serine proteases (discussed later). As such, this structure will be useful not only for the development of specific inhibitors with therapeutic potential for treatment of diseases caused by dengue viruses but will also serve as a model for other serine proteases of more than 70 members of the arthropod-borne flavivirus family.

EXPERIMENTAL PROCEDURES
Expression and Purification of the Den2 NS3(Protease) Domain-The primers 5′-GGGGTACCGCTGGAGTATTGTGGGAT (underlined nucleotides 4522-4539 of the Den2 genome (5)) and 5′-CCCAAGCTTCTTTCGAAAAATGTCATC (underlined complementary nucleotides 5059-5076) were used for polymerase chain reaction on the template pTM1-NS2B-3(Pro)-PFH (24). The polymerase chain reaction product was digested with KpnI and HindIII and cloned into pQE-30 (Qiagen). This plasmid codes for a hexahistidine tag fused to the NH2 terminus of the Den2 NS3 protease domain. Escherichia coli strain XL1-Blue MRF′ transformed with the 6His-NS3(185aa) plasmid was grown at 37°C in LB containing 100 μg/ml ampicillin until the OD600 nm reached about 0.5, induced with 1 mM isopropyl 1-thio-β-D-galactopyranoside, and shifted to 30°C for 4 h. Cells were harvested and resuspended in a lysis buffer (50 mM HEPES, pH 7.5, 0.3 mM NaCl, and 5% glycerol). Cells were disrupted either by a French press or by sonication. Bacterial cell lysates were clarified by centrifugation at 27,000 × g for 30 min at 4°C. The 6His-NS3(185aa) was distributed in both soluble and pellet fractions, but predominantly in the latter. The soluble fraction was loaded onto a preequilibrated Ni2+-agarose affinity resin. The column was then washed with the lysis buffer (20 × bed volume) and then with the same buffer containing 40 mM imidazole. The protein was eluted with the buffer containing 0.5 M imidazole. The pellet fraction was solubilized in a denaturation buffer (7 M urea, 0.1 M NaH2PO4, 0.01 M Tris-HCl, pH 8.0) by incubation for 2 h at 0°C. The suspension was clarified by centrifugation at 27,000 × g for 30 min, and the solubilized fraction was loaded onto a preequilibrated Ni2+-agarose affinity resin. The protein eluate was refolded by successive dialysis against 50 mM Tris-HCl, pH 7.5, 50 mM NaCl (4 × 1 liter for 14-16 h).
The refolded protein was clarified by centrifugation (27,000 × g for 30 min) and concentrated by ammonium sulfate precipitation (70% w/v).
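The primer design described above can be sanity-checked with a few lines of code. The sketch below is an illustration only: it copies the two primer strings from the text (with any line-break hyphen omitted), locates the KpnI (GGTACC) and HindIII (AAGCTT) recognition sequences, and checks that the stated genome coordinates span a whole number of codons. It does not compare the primers with the Den2 genome itself, which is not reproduced here, and the helper names are hypothetical.

```python
KPNI = "GGTACC"     # KpnI recognition sequence
HINDIII = "AAGCTT"  # HindIII recognition sequence

FORWARD = "GGGGTACCGCTGGAGTATTGTGGGAT"   # forward primer from the text
REVERSE = "CCCAAGCTTCTTTCGAAAAATGTCATC"  # reverse primer from the text


def annealing_region(primer: str, site: str) -> str:
    """Return the part of the primer that lies 3' of the restriction site."""
    start = primer.find(site)
    if start < 0:
        raise ValueError(f"site {site} not found in primer")
    return primer[start + len(site):]


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


fwd_anneal = annealing_region(FORWARD, KPNI)
rev_anneal = annealing_region(REVERSE, HINDIII)

# Genome coordinates quoted in the text for the two annealing regions.
FIRST_NT, LAST_NT = 4522, 5076
amplicon_nt = LAST_NT - FIRST_NT + 1
codons, remainder = divmod(amplicon_nt, 3)

print(len(fwd_anneal), len(rev_anneal))  # length of each annealing region
print(amplicon_nt, codons, remainder)    # amplified span in nt and codons
print(reverse_complement(rev_anneal))    # sense-strand reading of the reverse primer
```

Under these assumptions each annealing region is 18 nucleotides long, matching the 18-nucleotide ranges 4522-4539 and 5059-5076, and the amplified span of 555 nucleotides corresponds to 185 codons, consistent with the construct name 6His-NS3(185aa).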
Crystallographic Analysis-Crystals were grown in hanging drops using a 5-mg/ml solution of the protein. Well solutions contained 200 mM Tris-HCl, pH 7.8, 200 mM LiCl, 4 mM NiCl2, 0.4% β-octyl glucoside, and 11% polyethylene glycol 3350. Protein drops (2 μl) mixed with an equal volume of well solution were allowed to equilibrate at 20°C, and crystals measuring 0.2 × 0.25 × 0.4 mm were obtained after 3-4 weeks. The space group was determined to be P2₁ with a = 48.8, b = 62.4, c = 35.6 Å, and β = 96.70°, and a solvent content of 50% assuming a monomer of the protease in the asymmetric unit. A native data set to 2.1 Å was derived from 59,122 measurements made on three crystals (an average redundancy of 4.6) on a Siemens X1000 area detector system at room temperature and processed with XDS (46). Data measurement statistics are presented in Table I. The structure was determined using a single two-site samarium derivative, obtained by soaking crystals in a solution containing 2.5 mM samarium acetate at pH 7.8 over 2 days. The strong Bijvoet signal from samarium in conjunction with isomorphous differences (Table I) was used to obtain an excellent set of SIRAS phases. All derivative data (80% complete to 2.7 Å) used in phasing were measured from one crystal rotated around the crystallographic b direction to map hkl and h-kl reflections onto the detector simultaneously and in similar geometry, thus reducing systematic errors in Bijvoet differences. Systematic errors in both Bijvoet and isomorphous differences were reduced further by anisotropic local scaling (47). Positions of the two samarium ions in the asymmetric unit were determined from the Harker section of a Bijvoet difference Patterson map, and the relative y translation between them was derived from analysis of cross-vectors. The positions were confirmed from isomorphous difference Patterson maps. Correct chirality was established by inspection of electron density maps.
SIRAS phasing and refinement were done using MLPHARE (48), part of the CCP4 package (49), with reflections to 2.7 Å, the limit of usable data from the derivative. Phasing statistics are given in Table I.
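The quoted solvent content can be checked from the cell dimensions with the usual Matthews estimate. The sketch below is a rough calculation under stated assumptions, not the authors' procedure: the molecular mass of about 21 kDa for the His-tagged 185-residue domain is an assumed round figure that does not appear in the text, Z = 2 follows from space group P2₁ with one monomer per asymmetric unit, and 1.23 is the conventional constant in the Matthews approximation.

```python
import math


def monoclinic_cell_volume(a: float, b: float, c: float, beta_deg: float) -> float:
    """Unit cell volume in cubic angstroms for a monoclinic cell."""
    return a * b * c * math.sin(math.radians(beta_deg))


def solvent_fraction(volume: float, z: int, mass_da: float) -> tuple[float, float]:
    """Return (solvent fraction, Matthews coefficient in cubic angstroms per Da)."""
    vm = volume / (z * mass_da)
    return 1.0 - 1.23 / vm, vm


# Cell parameters from the text.
volume = monoclinic_cell_volume(48.8, 62.4, 35.6, 96.70)

ASSUMED_MASS_DA = 21_000  # assumption: approximate mass of 6His-NS3(185aa)
Z = 2                     # P2(1), one monomer per asymmetric unit

solvent, vm = solvent_fraction(volume, Z, ASSUMED_MASS_DA)

# Unique reflections implied by the stated measurements and redundancy.
unique_reflections = 59_122 / 4.6

print(f"cell volume   : {volume:.0f} A^3")
print(f"Matthews coeff: {vm:.2f} A^3/Da")
print(f"solvent       : {solvent:.0%}")
print(f"unique refl.  : {unique_reflections:.0f}")
```

With the assumed mass the estimate comes out close to the 50% solvent content stated above; a different mass would shift the value proportionally.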
Significant improvement to the electron density map calculated with weighted SIRAS phases was obtained using DM (50), and a model of approximately 40% of the protein was built into this map using Bones and O (51). At this stage similarities to HCV NS3 protease became apparent, and the rest of the structure was built into SIGMAA (43) weighted 2Fo-Fc maps calculated using SIRAS phases combined with those from the partial model. The initial model, consisting of residues 12-178, had an R value of 0.42, and refinement with XPLOR (52) and model building using O with extension of resolution to 2.1 Å were achieved expeditiously. The slow cooling protocol (53) was used in refinement, and progress was monitored using free R values (54). The current model includes residues 5-181 of Den2 protease and 64 solvent molecules modeled as water oxygens. The final R value after restrained individual B factor refinement is 0.186, and the free R value is 0.228. Root mean square (r.m.s.) deviation in bond lengths is 0.012 Å, in bond angles 1.2°, and r.m.s. ΔB between bonded atoms is 3.2 Å². Analysis using PROCHECK (55) indicates excellent geometry with no residues in disallowed regions of the Ramachandran map. Despite the requirement for nickel ions in crystallization, neither the metal ions nor the His tag was visible in maps. Refinement statistics are presented in Table I.

RETRACTED BY PUBLISHER

RESULTS AND DISCUSSION
Structure Description-A side-by-side stereo representation of the electron density map around the catalytic triad is shown in Fig. 1. Fig. 2a shows the overall folding along with the conserved orientation of the catalytic triad. Comparisons with the structures of the HCV protease domain, either as a heterodimeric complex with the activating NS4A peptide (Protein Data Bank no. 1jxp (56); see also Ref. 45) or in the absence of NS4A (Protein Data Bank no. 1a1q (44)), show similarities expected from sequence comparisons as well as significant differences. Although both the amino- and carboxyl-terminal β barrels are six-stranded in both Den2 and HCV proteases, the strands in the NH2-terminal domain of Den2 protease are shorter, and the barrel is significantly more deformed (Fig. 2a).
In the carboxyl-terminal domain, all six strands are more strongly conserved and are of comparable length. In general, the carboxyl-terminal domain carries most of the residues that contribute to substrate specificity and is highly conserved among serine proteases (34), and Den2 protease is not an exception. Superposition (57) of 68 α carbon atoms of the COOH-terminal domain of Den2 protease (residues 87-167) and either HCV protease structure (residues 96-186), omitting strands B2 and C2 in both, yields an r.m.s. deviation of 0.9 Å compared with 0.6 Å for the two structures of HCV protease. Alignment of pairs of structures using this transformation shows that the conformation of Den2 protease has a greater resemblance to that of the HCV protease-NS4A complex in this region. The first 30 residues in the two HCV protease structures have very different conformations (Fig. 2b). In the absence of NS4A these residues make β sheet interactions with symmetry-related molecules. Upon interaction with NS4A peptide, these 30 residues form two β strands (A0 and A1) and a helix α0. However, the structure of the Den2 protease in the absence of NS2B peptide more closely resembles that of the HCV protease-NS4A complex rather than the HCV protease domain alone (r.m.s. deviation 0.4 Å) (Fig. 2b). A second major difference between the two HCV structures occurs near helix Ha and strands D1, E1, and F1. Here again, the conformation of Den2 protease is much closer to that of the HCV protease-NS4A complex (Fig. 2, a and b).
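The r.m.s. deviations quoted above are computed over matched α carbon positions after the two structures have been superposed. The sketch below shows only that final step and assumes the coordinates are already in a common frame; the optimal rigid-body superposition itself (57) is not reproduced. The function name and the toy coordinates are hypothetical.

```python
import math

Coordinate = tuple[float, float, float]


def rmsd(coords_a: list[Coordinate], coords_b: list[Coordinate]) -> float:
    """Root mean square deviation between two matched, pre-superposed atom lists."""
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Toy data: three atoms, the second set displaced by 0.9 A along x.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0)]
displaced = [(x + 0.9, y, z) for x, y, z in reference]
print(round(rmsd(reference, displaced), 3))
```

Because every atom in the toy set is shifted by the same 0.9 Å, the function returns 0.9; for real structures the value depends on which residues are included, which is why the text specifies the residue ranges and the omitted strands.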
There are significant differences in the mode of catalysis by Den2 and HCV proteases. For example, the substrate cleavage specificities of HCV and Den2 proteases are different; HCV protease prefers a Cys residue at the P1 position of the substrates for cleavages at the NS4A-NS4B, NS4B-NS5A, and NS5A-NS5B sites but a Thr residue at this position for the intramolecular cleavage (NS3-NS4A site) (28-30, 58, 59), in contrast to a basic residue (Lys or Arg) for all of the cleavages by Den2 protease (14) (see Table II). The activating peptide, NS4A, enhances the efficiency of HCV protease activity in vitro to a different extent: 100-fold for the cleavage of the NS4B-NS5A site but only 11- and 3-fold for the cleavages of the NS4A-NS4B and NS5A-NS5B sites, respectively (60). However, the Den2 protease requires NS2B for cleavages of all protease-sensitive sites, at least within the sensitivity of currently available assays. Moreover, the region of NS2B required for activation of Den2 protease is a hydrophilic domain of 40 residues, in contrast to the HCV NS4A activator peptide, which consists of hydrophobic residues. Thus the mode of activation of Den2 and HCV protease domains by their respective activator peptides and their activated states are likely to be different.
Other significant (>2 Å) deviations of Den2 from the fold of both HCV protease structures occur in strand F1 (Fig. 2, a and b), where there is a two-residue insertion in HCV protease relative to Den2 protease; strand D2, the site of a three-residue insertion in HCV protease; and strand C2, where Den2 protease has a seven-residue insertion relative to HCV protease. Strand C2, a long 13-residue strand in HCV protease, is split into two shorter strands of three residues each interrupted by a four-residue coil segment in Den2 protease. Among the conserved features of all three protease structures are the small helical turn that carries the catalytic His residue as well as the helices close to the termini. The spatial relationship between members of the catalytic triad, His-51, Asp-75, and Ser-135, is strongly conserved as in other serine proteases, with Ser-135 being within hydrogen bonding distance (2.7 Å) of Nε2 of His-51 (Fig. 2a). Connecting density for this hydrogen bond was visible in electron density maps. Side chain carboxyl oxygen atoms of Asp-75 are, however, oriented away from His-51, hydrogen bonding to main chain N of Trp-50; a hydrogen bonding interaction with His-51 can be generated easily by a small rotation of the side chain, thus establishing the charge relay system (43). There are also two water molecules within hydrogen bonding distance of His-51 and Ser-135 which could be relevant to the catalytic mechanism, perhaps by providing the attacking nucleophile for hydrolysis of the acyl enzyme intermediate (61, 62). As evidenced by solvent accessibility calculations (63) using a 1.4 Å probe, the enzyme appears to be in at least a partially "open" conformation, with the catalytic triad and the residues in the substrate binding pocket accessible to model substrates. The open conformation suggests that Den2 protease is likely to have some level of intrinsic activity. Experimental evidence for this expectation must await development of sensitive in vitro assays for Den2 protease.
Substrate-Protease Interactions Using Molecular Modeling-Sequences around the in vivo cleavage sites of Den2 protease and seven other flaviviral NS3 proteases (14) are given in Table II. The substrate binding cleft of the Den2 protease is not very extensive (Figs. 2 and 3, a and b) and does not appear capable of providing specific interactions, in the absence of NS2B activating peptide, with side chains beyond P2 and P2′. This observation is consistent with heterogeneity of residues beyond these sites seen (14) in flaviviral proteases (Table II). Most viral proteases are considerably smaller than their cellular counterparts; several loops that are present in cellular enzymes that provide specific interactions to P3, P4, and P5 side chains of substrates are absent in the viral proteases (45). It is possible that heterodimerization with NS2B peptide could generate additional specific interactions for side chains beyond P2 and P2′, both by altering the protease conformation into an activated state and by interacting directly with the substrate as has been suggested for other two-component proteases (45).
Accordingly, molecular modeling was limited to building a tetrapeptide into the substrate binding cleft. Complexes with substrates were modeled using well known principles of serine protease-substrate interactions (64). Rigid-body transformations (57) and small manual adjustments were used throughout. Using the relevant tetrapeptide section of the main chain from a porcine trypsin-inhibitor complex (65) (Protein Data Bank no. 1mct), the scissile carbonyl was positioned within hydrogen bonding distance of α nitrogen atoms of Gly-133 (2.71 Å) and Ser-135 (2.8 Å). Simultaneously the peptide nitrogen was placed at 3.1 Å from the carbonyl oxygen of Gly-151.
Gly-133 and Ser-135 were identified as the most likely to form the oxyanion hole by comparison with structures of other serine proteases. Further small translations established hydrogen bonding interactions between the main chain of P1 and P2 residues with appropriate main chain atoms of Gly-153 and Asn-152 to generate the short section of β sheet common to serine protease-inhibitor interactions (64). Side chains were built in their most probable conformations using O to generate the desired sequence at each of the viral polypeptide junctions (Table II). Table III presents a summary of results obtained. There are three residues (Ser-131, Tyr-150, and Ser-163) within the S1 pocket (Fig. 3b), in addition to the catalytic Ser-135, which are potentially capable of providing specific stabilizing interactions with the guanidino nitrogen atoms of an Arg residue at P1. In addition, Leu-115 provides nonspecific van der Waals interactions. These interactions can also accommodate a Lys side chain (as found in the NS3-NS4A cleavage site; see Table II) equally well. All of the residues that form the S1 pocket except Ser-131 and Leu-115 are absolutely conserved (14) among all of the flaviviruses listed in Table III. Primary specific interaction of the Den2 protease with a lysyl (NS2A-NS2B site), glutaminyl (NS2B-NS3 site), or an arginyl (NS3-NS4A and NS4B-NS5 sites) side chain at P2 is provided by Asn-152. The catalytic His-51 and Asp-75 as well as Gly-151 and Gly-153 make additional interactions through their main chain atoms. These residues are also strongly conserved among the listed flaviviruses (Table III). The Oε1 atom of Asn-152 forms a salt bridge/hydrogen bond with Nε of the P2 Arg in the modeled complex (Fig. 3b); alternate interactions are possible by rearrangement of the side chains of both the protease and the P2 residue of the substrate to accommodate a Lys or Gln (Table II).
A serine side chain at P1′ fits into the S1′ pocket formed by the catalytic His-51 and Ser-135 and residues Gly-35, Ile-36, and Val-52. Of these, Ile-36 interacts only through its main chain atoms and is not conserved (Table III). A Ser at P2′ is also stabilized exclusively by interaction with main chain atoms of the residues listed (Table III). A larger side chain at P2′, such as a Trp (as in the NS2A-NS2B cleavage site; see Table II) or an Arg (occurring in an internal cleavage site within Den2 protease), makes additional van der Waals interactions with side chain atoms of residues defining the S2′ pocket as well as with Asp-129 and Phe-130. It is possible that the paucity of specific interactions at sites other than P2 and P2′ is caused by the protease not being complexed with NS2B. Heterodimerization with NS2B could provide additional interactions beyond the P2 and P2′ positions of the substrate side chains directly as well as indirectly through altered positions of residues in the protease.

3 consists of strand D2 and includes the catalytic Ser-135 together with several residues expected to contribute to substrate binding; and conserved region 4 includes strand E2 with additional residues involved in substrate recognition. Residues Asp-129, Phe-130, Tyr-150, Asn-152, and Gly-153 were judged to be important for determination of substrate specificity in Den2 protease as these are analogs of residues 189, 190, 213, 215, and 216 in trypsin (chymotrypsin numbering) which determine interactions with side chains of the substrate in that enzyme. Asp-129, in particular, was identified as the residue that lies at the bottom of the S1 site and provides charge neutralization for the substrate Arg or Lys side chain (12, 66). A wide ranging site-specific mutagenesis study reported recently (42) measured the change in catalytic activity of the enzyme resulting from each of a number of mutations in the protease domain. Residues, mutations of which resulted in significant alteration of enzyme activity (42), are plotted in Fig. 3c on a Cα trace of the NS3 protease.
Mutations in Conserved Region 3-In conserved region 3, Asp-129 and Phe-130 are residues that were expected to form part of the S1 pocket, with Asp-129 providing charge neutralization of a Lys or Arg residue at P1, in analogy with trypsin. However, Asp-129 could be replaced by Glu, Ser, or Ala without significant loss of activity; substitution by Lys, Arg, or Leu did not abolish activity completely. Substrate modeling indicates that the side chain carboxylate of Asp-129 is distant (11 Å; Fig. 3, b and c) from the S1 pocket and cannot interact with a P1 Arg or Lys side chain without substantial main chain motion. Phe-130 likewise is too far from the P1 side chain, but both Asp-129 and Phe-130 could provide some van der Waals contacts for a large side chain at P2′, such as a Trp found at the NS2A-NS2B cleavage site. Ser-135 is an essential residue and cannot tolerate replacement by Ala; however, a replacement of Ser-135 by a cysteine also renders the Den2 protease nearly inactive. Cysteine proteases, in which a Cys plays the role of the active site serine in serine proteases, have structural requirements that are similar to those of serine proteases. Mechanistic details of substrate hydrolysis for the two classes of proteases are also very similar (67). Thus it would be of interest to determine the underlying cause of the severe loss of activity in the S135C mutant Den2 protease. Gly-133 is in the loop that carries the catalytic Ser-135 (Fig. 3d) and is part of a β turn that positions strand D2 to interact with strand A2 to form the core β barrel (Fig. 2a). Replacement of this residue might destabilize the barrel as well as alter the position of Ser-135 significantly. Other residues in conserved region 3, residues 139-144 (Fig. 4 of Ref. 42), do not seem to play an essential role in the modeled substrate-enzyme complex. This observation is consistent with the mutational effects of these residues (42).
FIG. 3. Substrate modeling and structural explanation of mutational data. Panel a, substrate binding (top left). A stick model (carbon atoms white, nitrogen purple, and oxygens red) of the tetrapeptide RRSW (P2-P2′ residues at the NS2A-NS2B cleavage site) is shown binding in the active site cleft of Den2 protease, represented by its molecular surface (magenta). The shallowness of the active site and the side chain of the P1 arginine residue disappearing into the S1 pocket of the enzyme are readily visible. The figure was made with GRASP (71). Panel b, illustration of interactions in the S1 pocket (top right). The contribution of strands E2 and F2 (thin gray ribbon) to the surface shown in panel a has been removed to provide a clear view of the S1 pocket. Substrate side chains are labeled (P2-P2′), carbon atoms are colored white, nitrogen purple, and oxygen red. Enzyme residues near the guanidine moiety of P1 arginine are shown as sticks, colored cyan, and labeled. Asn-152 is shown in yellow and can be seen interacting with Nε of the P2 arginine. Note that protease residues are shown in their conformations in the native structure to illustrate the multiple possibilities for interaction, but they are capable of interaction with the P1 residue by side chain rotations. The figure was made with GRASP (71). Panel c, mutations causing significant reduction in enzyme activity (bottom left). Side chains of residues (stick representation) that caused a reduction of catalytic activity (42)
Mutations in Conserved Region 4-Tyr-150 and Ser-131 (conserved region 3), which were mutated to a number of other residues (42), are close enough to the basic side chains of either Arg or Lys at P1 to form a salt bridge or hydrogen bond with small side chain rotations. In contrast to conclusions from sequence comparisons, it is Tyr-150 rather than Asp-129 which appears to provide primary charge stabilization within the S1 pocket. In addition, the side chain of Ser-163, which is also highly conserved, could stabilize a P1 Arg or Lys in a different conformation after side chain rotation (Fig. 3c). Tyr-150, in addition, could provide stabilization for a positively charged side chain through its aromatic electron cloud, a mode of interaction which has been suggested earlier (68) and proposed for the other serine proteases (44, 45). Thus replacement of Tyr-150 by Phe will not reduce activity significantly because although it would abolish the salt bridge/hydrogen bond to the Tyr-150-OH, it would not abrogate aromatic stabilization and the potential for interaction with Ser-131 or Ser-163. However, replacement of Tyr-150 with Ala, Val, or His would be expected to decrease activity because fewer modes of stabilization of the P1 side chain would be available, which is consistent with observation (42). It would be interesting to see the effects of a double or triple mutant of residues Tyr-150, Ser-131, and Ser-163. Gly-148 is part of the strand E2 packing against the strand F2 in the COOH-terminal β barrel (Fig. 2a), and a mutation is likely to cause destabilization of the barrel, which was indeed observed (42).
The S2 pocket of Den2 protease has Asn-152 as one of its components (Table III), which can provide hydrogen bonding interactions for Lys, Arg, or Gln side chains at P2. A number of carbonyl atoms in the vicinity provide an electrostatically favorable environment for a basic P2 side chain. If Asn-152 does indeed provide the primary interaction with the P2 side chain, its replacement with Ala, as was done (42), would negate this interaction. Its replacement with a Lys would also be expected to decrease activity because a basic side chain at P2 would be repelled. A Lys would still be capable of providing an interaction for a P2 Gln, and perhaps this mutant enzyme will have activity at the NS2B-NS3 junction (Table II) but not at other sites. It is more difficult to rationalize the observed decrease in activity of the N152Q mutant because this would appear to conserve the interaction with a positively charged substrate. The structure of the heterodimeric NS2B-NS3 protease, the species that is probably present under the assay conditions of that study (42), might suggest an appropriate rationalization.
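The reasoning about position 152 can be written down as a small lookup. The Python sketch below is only a restatement of the qualitative argument in the paragraph above, assuming two rules (a polar side chain is needed for a hydrogen bond, and two basic side chains repel); it is not a physical model, and the names `s2_prediction`, `BASIC`, and `POLAR` are hypothetical.

```python
# Toy restatement of the S2 argument; the rules are assumptions for illustration.
BASIC = {"Lys", "Arg"}
POLAR = {"Asn", "Gln", "Lys", "Arg"}  # side chains treated as hydrogen bond capable

def s2_prediction(residue_152, p2):
    """Qualitative outcome for a residue at position 152 facing a P2 side chain."""
    if residue_152 not in POLAR:
        return "no interaction"
    if residue_152 in BASIC and p2 in BASIC:
        return "repulsion"
    return "hydrogen bond"

for variant in ("Asn", "Ala", "Lys", "Gln"):
    row = {p2: s2_prediction(variant, p2) for p2 in ("Lys", "Arg", "Gln")}
    print(variant, row)
```

Under these rules the Gln variant is predicted to behave like the wild-type Asn, which is exactly the case the text flags as difficult to reconcile with the observed loss of activity (42).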
Mutations of Other Residues - It is clear why Den2 protease loses activity if Gly-153 is replaced (42). Gly-153 is part of strand E2, which forms one half of the sheet E2-F2 (Fig. 2a). It is packed against Ser-163 in the opposite strand and is almost completely inaccessible to solvent. Substitution of the α hydrogen on Gly-153 by any side chain will undoubtedly destabilize this sheet and cause main chain changes in surrounding residues. Modeling studies indicate that Gly-153 is likely to be one of the residues that form the oxyanion hole, donating its NH to the carbonyl oxygen of the P1 residue to stabilize and orient the substrate in the Michaelis complex. Any change in its position would, therefore, severely curtail enzyme activity. In addition, Gly-153 is in the vicinity of Tyr-150 and Asn-152, both of which are part of strand E2. Changes in this part of the main chain will likely alter the positions of Tyr-150 and Asn-152, further reducing the ability to stabilize the basic P1 side chain.
Effects of mutations at Val-126 appear to be related to the packing environment of this residue. Val-126 participates in a hydrophobic cluster that probably defines the conformation of this loop and positions Ser-135 appropriately for catalysis. The main chain beyond Val-126 descends into the active site cleft through a series of stacked β turns (Fig. 3d). None of these turns is internally hydrogen bonded; they thus derive their stability from van der Waals interactions in the vicinity and from a hydrogen bond between the Asp-129 and Ser-127 side chains (Fig. 3d). The side chain of Val-126, along with those of Leu-115 from the loop between strands B2 and C2 (Fig. 2a) and Ala-160 and Tyr-161 from strand F2, participates in this cluster (Fig. 3d and Table III). Replacement of Val-126 with smaller side chains will create a packing fault, very likely altering the conformation of the loop, making it more mobile and therefore making it more difficult to position Ser-135 optimally for catalysis. The severe loss of activity upon deletion of residues amino-terminal to residue 167 in the NS3 domain 2 can also be explained by the observation that residues Ala-160 and Ser-163 probably contribute critical interactions that stabilize the enzyme-substrate complex.
It is clear from the structure and substrate modeling experiments that Den2 protease, in the absence of interaction with the NS2B peptide, is not likely to provide specific interactions to substrate side chains beyond P2′ on the carboxyl side of the scissile bond and P2 on the amino-terminal side. Such a limited substrate binding cleft is likely to make the design of specific inhibitors difficult. However, the structure of the protease domain provides an excellent starting point for structural studies of binary complexes with inhibitors and ternary complexes with inhibitors and the activating NS2B polypeptide. These analyses will provide additional information on both substrate interactions and the molecular mechanism of activation. If NS2B extends the binding site of the protease to provide additional specific interactions with substrates, that information can be used in the structure-assisted design of specific inhibitors (69). Moreover, a heterodimeric NS3 protease complex with NS2B could be an alternate target aimed at preventing full activation of the enzyme, a strategy that has been suggested for the HCV protease (45).
The interactions that the modeling calculations in this study suggest are important for substrate binding are also conserved among the serine protease domains of other arthropod-borne flaviviruses (some shown in Table III). This observation suggests that conclusions drawn about substrate binding with the Den2 protease are also applicable to these other serine proteases. Results reported here for the Den2 protease are thus of additional significance in providing a structural basis for the design and evaluation of antiviral therapeutics. This structure-based approach offers considerable promise toward treatment of a wide range of life-threatening diseases caused by flaviviruses.