Hepatitis C Virus Non-structural Protein 3 (HCV NS3): A Multifunctional Antiviral Target

Hepatitis C virus non-structural protein 3 contains a serine protease and an RNA helicase. Protease cleaves the genome-encoded polyprotein and inactivates cellular proteins required for innate immunity. Protease has emerged as an important target for the development of antiviral therapeutics, but drug resistance has turned out to be an obstacle in the clinic. Helicase is required for both genome replication and virus assembly. Mechanistic and structural studies of helicase have hurled this enzyme into a prominent position in the field of helicase enzymology. Nevertheless, studies of helicase as an antiviral target remain in their infancy.

Hepatitis C virus (HCV) infection is a leading cause of chronic liver disease and hepatocellular carcinoma. HCV is the founding member of the Hepacivirus genus of the family Flaviviridae (1). The HCV genome is a single-stranded RNA of positive polarity that is on the order of 9000 nucleotides (nt) in length (Fig. 1a). The 5′-non-translated region (NTR) contains a terminal stem-loop that is a required cis-acting replication element and the internal ribosome entry site. The 3′-NTR contains a cis-acting replication element that consists of an RNA stem-loop of variable sequence, a polyuridine/polypyrimidine tract of variable length, and a highly conserved "X-tail." The variability observed in the 3′-NTR and elsewhere in the genome is sufficient to permit classification of HCV into six distinct genotypes.
The HCV genome encodes a single open reading frame, the translation of which is directed by the internal ribosome entry site. The HCV polyprotein is on the order of 3000 amino acids long and can be divided into a structural region (C-p7 proteins) and a non-structural (NS) region (NS2-NS5B proteins) (Fig. 1a). Cleavage of the HCV polyprotein occurs co- and post-translationally by host (structural region) and viral (non-structural region) proteases. Only the NS3-NS5B region of the polyprotein is required for genome replication in cell culture (2). NS5B is the viral RNA-dependent RNA polymerase (3). NS5A is a phosphoprotein (4) capable of specifically interacting with the 3′-NTR of the HCV genome (5), other non-structural proteins (6), and numerous cellular proteins (7,8). NS5A also functions in virus assembly (9,10). NS4B is an integral membrane protein that is required for assembly of the "membranous web," the organelle used for RNA replication (11,12). NS4A is a cofactor for NS3 that directs the localization of NS3 and modulates its enzymatic activities (13). The N-terminal one-third of NS3 contains the protease activity responsible for processing of the non-structural region of the polyprotein (Fig. 1b) (14-16) and some cellular proteins (17-19). The C-terminal two-thirds of NS3 is an RNA helicase of the DExH family (Fig. 1c) (20). The biological function of the RNA helicase activity is not known but may include 1) RNA folding/remodeling (21), 2) enhancement of polymerase processivity (22), and/or 3) genome encapsidation (23).
The significance, if any, of having the major viral protease tethered to the RNA helicase is not known. The protease domain can function independently of the helicase domain and vice versa. Evidence suggests that each domain modulates the biochemical activity of the other (24,25). Importantly, both the protease domain and NS4A modulate the RNA helicase activity of NS3 (26,27). The ability to physically and functionally separate the two activities of NS3 has fostered the development of two diametrically opposed NS3 disciplines. Study of NS3 protease activity has been very practical, focusing on the pursuit of inhibitors as specifically targeted antiviral therapy for hepatitis C (STAT-C). Study of NS3 helicase activity has been much more fundamental, focusing on the use of this helicase as a model for the structure and mechanism of DExH proteins in general. Here, we summarize our understanding of the structure, mechanism, and/or inhibition of both activities of NS3 and discuss the implications of the current state of the art on future directions for studies of this very important yet incompletely tapped antiviral target.

NS3 Protease
Structure, Substrate Recognition, and Catalytic Mechanism-NS3 protease is a serine protease that belongs to the trypsin/chymotrypsin protease superfamily (28,29). The enzyme consists of two β-barrel domains that are flanked by two short α-helices (Fig. 1b). One of the β-strands of the N-terminal β-barrel derives from the central hydrophobic region of NS4A. The structure is stabilized by a Zn2+ ion that is coordinated by three cysteine residues and one water molecule (Fig. 1b). Zn2+ binding is essential for function (30). The substrate-binding pocket can accommodate six amino acid residues (P4-P2′) (Fig. 2a). However, cleavage efficiency is best when substrates include 10 amino acid residues (P6-P4′) (31), suggesting that interactions of the substrate with the surface of NS3 occur.
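The P and P′ labels used here count residues outward from the scissile bond (the Schechter and Berger convention). The short Python sketch below shows how the labels map onto a peptide. The helper name `label_sites` and the placeholder sequence are hypothetical and chosen only for illustration; the sequence is not an HCV cleavage site, although it places cysteine at P1 and serine at P1′ as the consensus site does.

```python
def label_sites(peptide: str, n_before: int) -> dict[str, str]:
    """Label residues relative to the scissile bond.

    `n_before` is the number of residues on the N-terminal side of the bond
    (the P side); the remaining residues lie on the P' side.
    """
    if not 0 < n_before < len(peptide):
        raise ValueError("the scissile bond must lie inside the peptide")
    labels = {}
    # P side: P1 is immediately N-terminal to the bond; numbers grow outward.
    for offset, residue in enumerate(reversed(peptide[:n_before]), start=1):
        labels[f"P{offset}"] = residue
    # P' side: P1' is immediately C-terminal to the bond.
    for offset, residue in enumerate(peptide[n_before:], start=1):
        labels[f"P{offset}'"] = residue
    return labels


# Placeholder decapeptide spanning P6-P4' (hypothetical, not an HCV sequence).
example = label_sites("AAAAACSAAA", n_before=6)
```

With six residues before the bond and four after it, the dictionary covers exactly the P6-P4′ window that gives the best cleavage efficiency, whereas the binding pocket itself contacts only P4-P2′.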
Protease activity requires a catalytic triad (Ser-139, His-57, and Asp-81) and an oxyanion hole (backbone amides of Gly-137 and Ser-139) (Fig. 1b). The NS4A cofactor contributes to the proper positioning of the catalytic triad and the substrate (32), thus explaining its role in catalytic efficiency and substrate specificity.
Similar to other viral proteases, NS3 protease substrates are defined by multiple amino acid residues, not just those at the scissile bond (P1-P1′). The consensus cleavage site has cysteine and serine at the scissile bond (Fig. 2a). Studies in vitro with peptides have shown that cleavage at the NS5A/NS5B junction is at least an order of magnitude more efficient than observed at other sites, including the NS4B/NS5A junction, which also has cysteine and serine at the scissile bond (31). This observation is consistent with the existence of multiple determinants of substrate recognition. All of the studies that have evaluated the substrate specificity of NS3 protease quantitatively have used steady-state kinetics and parameters such as kcat, Km, and kcat/Km to define this specificity. The general assumption for serine proteases is that kcat reports on formation of the acyl-enzyme intermediate with amides, i.e. release of both product peptides is fast relative to formation of this intermediate (33). It is worth noting that seminal studies by De Francesco and co-workers showed that some Pn-P1 product peptides exhibit substantial inhibitory activity that can be increased by amino acid substitution (34). Therefore, it is possible that the affinity of the NS3 protease for the N-terminal product peptide may alter the rate-limiting step in the steady state. Changes in rate-limiting steps such as these have substantial implications for the use of steady-state kinetics alone to understand substrate specificity of HCV NS3 protease. The addition of pre-steady-state kinetic analysis to the study of NS3 substrate specificity may be warranted but has never been reported.
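The assumption about kcat can be made explicit with the textbook three-step scheme for a serine protease under rapid-equilibrium substrate binding. The symbols K_s (substrate dissociation constant), k_2 (acylation), and k_3 (deacylation together with release of the second product) are notation introduced here for illustration and are not taken from the studies cited above.

```latex
\begin{align}
\mathrm{E} + \mathrm{S} \;\overset{K_s}{\rightleftharpoons}\; \mathrm{ES}
  \;&\xrightarrow{\;k_2\;}\; \mathrm{E\text{-}acyl} + \mathrm{Prod}_1
  \;\xrightarrow{\;k_3\;}\; \mathrm{E} + \mathrm{Prod}_2 \\
k_{cat} &= \frac{k_2\,k_3}{k_2 + k_3}, \qquad
K_m = K_s\,\frac{k_3}{k_2 + k_3}, \qquad
\frac{k_{cat}}{K_m} = \frac{k_2}{K_s}
\end{align}
```

When k_3 is much larger than k_2, kcat approaches k_2 and Km approaches K_s, which is the usual assumption. If slow release of a tightly bound product lowers the effective k_3, kcat and Km both fall while kcat/Km is unchanged in this simple scheme, so kcat alone no longer reports on acylation.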
The catalytic/chemical mechanism of NS3 protease-catalyzed cleavage of peptide bonds is likely identical to that observed for other serine proteases (33,35) based on the conservation of the catalytic triad (Fig. 2b). His-57 is predicted to serve as a general base to activate the Ser-139 nucleophile. The pKa value for His-57 is elevated to the level required to deprotonate Ser-139 by hydrogen bonding to Asp-81, which likely also exhibits an elevated pKa value. Once substrate binds, nucleophilic attack of the carbonyl carbon of the scissile bond by Ser-139 leads to formation of a tetrahedral intermediate containing an oxyanion (Fig. 2b) that is stabilized by the oxyanion hole of the enzyme (Fig. 1b). Collapse of the tetrahedral intermediate produces an acyl-enzyme intermediate, in which the N-terminal (P-side) portion of the substrate is esterified to Ser-139, and the C-terminal product peptide. Dissociation of this product peptide permits binding of water, hydrolysis of the acyl-enzyme intermediate, and production of the N-terminal product peptide. Dissociation of this product resets the system for another round of catalysis.
Antagonism of Innate Immunity-Pathogen-associated molecular patterns (PAMPs), such as pathogen-specific nucleic acids, proteins, carbohydrates, and lipids, are recognized by pathogen recognition receptors (PRRs). Toll-like receptors (TLRs) are PRRs that are located on the cell surface and recognize extracellular PAMPs, although some TLRs are capable of detecting intracellular PAMPs. Most intracellular PAMPs are detected by retinoic acid-inducible gene I (RIG-I)-like receptors, including RIG-I and MDA5 (melanoma differentiation antigen 5). Once engaged, PRRs activate the innate immune response by inducing expression of interferon (IFN)-β. Release of IFN-β then causes an auto- and paracrine response that includes activation of IFN-stimulated genes, the products of which are effectors of the innate immune response.
FIGURE 1. a, location of the NS3-coding sequence in the HCV genome. The cis-acting elements in the NTRs are shown. The asterisk indicates the microRNA-122-binding site. Full-length NS3 protein is located from amino acids 1027 to 1658 of the polyprotein of the genotype 1b consensus sequence (NCBI accession number AJ238799) (2). IRES, internal ribosome entry site. b, structure of the NS3 protease domain in complex with the NS4A cofactor and substrate peptide. The crystal structure (Protein Data Bank code 1A1R) of the N-terminal protease domain (red) in complex with the NS4A peptide cofactor (purple) is rendered as a ribbon (28). The polypeptide containing the NS3/NS4A junction, corresponding to positions P4-P2′, was modeled into the active site of the NS3 protease and shown as surface (gray); the sequence of the peptide is shown next to the structure. The polypeptide model, starting with its primary sequence, was constructed in-house. The expanded views highlight the catalytic triad, the oxyanion hole, and the zinc-binding site. The lower left expanded view displays the interactions of the catalytic triad (His-57, Asp-81, and Ser-139), rendered as ball-and-stick, with the modeled peptide; black dashed lines represent the hydrogen bonds. The lower right expanded view shows the oxyanion loop (positions 135-139), whose backbone nitrogen atoms play a role in stabilizing the developing negative charge on the Oγ of Ser-139 during the peptide cleavage; the side chains are drawn as sticks, and the oxyanion hole is highlighted by the blue shadow. The top expanded view shows the side chains of the key residues (Cys-97, Cys-99, Cys-145, and His-149), depicted as sticks, that interact with the metal ion (purple sphere). c, structure of the NS3 helicase domain in complex with ssDNA and ADP·AlF4−. Coordinates are from Protein Data Bank code 3KQL (89). The conserved motifs of RNA helicases are colored and identified by established Roman numeral designations (62,89). The expanded view shows the ATP transition state analog in the NTP-binding/catalytic site of the enzyme.
In the case of HCV infection, viral RNA is sensed by two mechanisms. The first requires TLR3, which may detect viral RNA during entry and uncoating of the virion. The second requires RIG-I, which is thought to detect cytosolic HCV RNA and its replication intermediates. PRRs recognize a variety of features associated with HCV RNA, e.g. the 5′-triphosphate (36), double-stranded RNA associated with the NTRs (37), and perhaps even single-stranded stretches like the polypyrimidine tract (38).
HCV infection is capable of blocking IRF3 activation in response to PAMP engagement by TLR3 and RIG-I (17,19,44,45). This antagonism of the innate immune response is mediated by the NS3 protease activity (17-19, 44, 45). Both TRIF and MAVS are substrates for NS3 protease (17-19, 44, 45). As shown in Fig. 2a, both proteins are cleaved after a cysteine residue as observed for most NS3 protease substrates. Similarities beyond the P1 position are also apparent (Fig. 2a).
Inhibition, Inhibitor Resistance, and the Substrate Envelope Hypothesis-The significant experience of the pharmaceutical industry in developing clinically useful inhibitors of human immunodeficiency virus protease (HIV PR) led very early on to the development of antiviral programs targeting NS3 protease activity. These early investments have filled the pipeline with numerous compounds that are at various stages of clinical development. Although NS3 protease and HIV PR have similar roles in the viral life cycle (polyprotein processing), NS3 protease also antagonizes activation of the innate immune response. Therefore, inhibition of NS3 protease activity will not only decrease the efficiency of viral replication but will also restore, at least partially, the innate immune response in HCV-infected cells.
Most NS3 protease inhibitors are competitive with the substrate and thus target the substrate-binding site (Fig. 3a). The earliest inhibitors were based on product peptides (46). Efforts to remove reactive groups and create molecules with enhanced pharmacological properties led to the development of macrocyclic compounds (Fig. 3b), and it was this class of compounds that was used in the first proof-of-concept clinical studies (47). Although these initial macrocyclic compounds were never approved, new macrocyclic compounds are currently in clinical trials (48,49). A second class of compounds showing promise in clinical trials consists of linear peptide-like molecules that introduce an electrophilic α-ketoamide group at the scissile bond, with which the catalytic serine can react to form a reversible covalent bond (Fig. 3b) (50, 51).
The major complication for the development of therapeutics targeting activities encoded by RNA viruses is the development of drug-resistant mutants. Given the high mutation rate and large size of RNA virus populations, essentially every single non-lethal amino acid substitution possible will be present at the start of drug therapy. Mutants capable of replicating in the presence of the drug will then be rapidly selected. As anticipated, this scenario has played out for inhibitors of NS3 protease activity (52). Numerous positions of NS3 protease have been shown to contribute to resistance in cell culture and in the clinic. Many of the positions confer resistance to both macrocyclic and linear inhibitors. Amino acid residues of NS3 protease linked to drug resistance are shown in Fig. 3c. Many of these residues are actually quite far (7-8 Å) from the substrate-binding pocket (Fig. 3c).
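The claim that essentially every viable single substitution pre-exists can be illustrated with a back-of-the-envelope estimate. The sketch below takes the genome length (on the order of 9000 nt) from the text; the per-site mutation rate and the number of genomes produced per day are illustrative assumptions, not values reported here, and the calculation counts nucleotide rather than amino acid changes and ignores lethality.

```python
GENOME_LENGTH_NT = 9000          # from the text: on the order of 9000 nt
MUTATION_RATE_PER_SITE = 1e-5    # ASSUMPTION: errors per nt per copying event
GENOMES_PER_DAY = 1e12           # ASSUMPTION: genomes synthesized per day in one patient


def single_mutant_count(genome_length: int) -> int:
    """Number of distinct single-nucleotide variants (3 alternatives per site)."""
    return 3 * genome_length


def expected_copies_per_variant(rate: float, genomes: float) -> float:
    """Expected new copies per day of one specific single-nucleotide variant.

    Assumes the three possible substitutions at a site are equally likely.
    """
    return genomes * rate / 3


variants = single_mutant_count(GENOME_LENGTH_NT)
copies = expected_copies_per_variant(MUTATION_RATE_PER_SITE, GENOMES_PER_DAY)
print(f"{variants} possible single-nt variants; "
      f"about {copies:.1e} new copies of each per day")
```

Under these assumed numbers each of the 27,000 possible single-nucleotide variants is regenerated millions of times per day, so selection by a drug acts on variants that are already present rather than waiting for them to arise.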
Studies of HIV PR have provided substantial insight into mechanisms of resistance and strategies to increase the barrier to resistance (53,54). Schiffer and co-workers have put forward a very thought-provoking hypothesis for the apparent low barrier to resistance observed for HIV PR inhibitors. They refer to this hypothesis as the "substrate envelope hypothesis." The fundamental tenet is that the inhibitor envelope (van der Waals surface of the inhibitor) exceeds the substrate envelope. Therefore, residues of HIV PR that confer tight binding of the inhibitor need not contribute to substrate binding. Amino acid substitutions at positions that are not essential for substrate binding would lead to drug-resistant proteases and viruses that are not debilitated for function. Additional second-site substitutions could then restore any fitness loss associated with the initial substitution. A prediction of this hypothesis is that by constraining the inhibitor envelope to that of the substrate, the barrier to resistance should be increased because loss of interaction with the inhibitor should now correspond, minimally, to interference with substrate binding. Initial studies testing the substrate envelope hypothesis have supported the hypothesis (55).
Nalam and Schiffer (56) and Romano and Schiffer (57) have further shown that the low barrier to resistance observed for HCV NS3 inhibitors likely relates to the fact that the inhibitor envelope exceeds the substrate envelope.⁴ The substrate envelope for an NS3/NS4A junction peptide is shown in Fig. 3c. Again, many amino acid residues involved in drug resistance do not interact directly with the substrate. When the positions of inhibitors are evaluated relative to this substrate envelope, the inhibitor envelope is not coincident with the substrate envelope (Fig. 3d). It is also evident from this figure how sites remote from the substrate-binding site could lead to inhibitor resistance without affecting substrate binding and/or cleavage and therefore viral fitness. Application of the substrate envelope hypothesis to the design of HCV NS3 protease inhibitors could yield inhibitors with a high barrier to resistance.
NS3 Protease and Tools to Study HCV-In addition to being a target for antiviral therapy, NS3 protease activity is being exploited for the development of new assays to study viral infection in primary cell cultures (58), to perform live cell imaging and analysis of individual infected cells (59), and to identify inhibitors (60,61). These new capabilities link localization, activity, or function of a reporter to processing by NS3 protease activity introduced into cells by infection.

NS3 Helicase
Nucleic Acid Binding and Oligomerization in Vitro-Helicases are classified into superfamilies (SFs) based on sequence homology, with SF1 and SF2 being the largest (62). NS3 is a member of SF2. A thorough review on NS3 helicase (63) and several excellent reviews on helicase mechanisms (64-67) have recently been published. NS3 binds to DNA and RNA with an equilibrium dissociation constant in the low nM range, a binding site size of 7-8 nt, and little or no reported cooperativity (68). Binding to RNA and DNA, as well as unwinding of both substrates, is enhanced at pH ~6.5 (69). It is not known whether the pH dependence observed in vitro has biological significance, but it may indicate a unique environment within the membranous web where HCV replication occurs (11).
NS3 interacts with itself in vitro to form large aggregated structures, but it is not known whether oligomerization is biologically significant (70). The active form of NS3 has been studied by several laboratories with different conclusions regarding the active species including monomer (71-73), dimer (74,75), and oligomer (70). Evidence indicates that monomeric NS3 can rapidly unwind RNA, albeit with relatively low processivity (73,76). The NS3 helicase domain (NS3h), unlike NS3 or NS3/NS4A, does not readily interact with itself (70). However, NS3h unwinding activity is increased when multiple molecules bind to the same DNA substrate molecule. This phenomenon is referred to as functional cooperativity, which results from all of the bound enzymes translocating in the same direction on the tracking strand of the substrate (77).
Kinetic and Physical Mechanism for DNA and RNA Unwinding-Unwinding of duplexes of varying length has led to several descriptors of the kinetic and physical constants associated with helicases (78). Unwinding experiments reveal a distinct lag phase, which can be interpreted to determine the kinetic step size for unwinding, which is the number of base pairs unwound prior to a rate-limiting kinetic step. The physical step size refers to the number of base pairs unwound in a single physical step. A helicase can unwind 1 bp at a time (physical step of one) but then proceed through a slow conformational change that [...] An explanation for the large kinetic step sizes must involve a periodic slow step in the kinetic mechanism. Slow dissociation of the displaced strand was proposed to account for the large kinetic step size (73). NS3 binds more tightly to DNA or RNA compared with NS3h, so the protease domain may interact with the nucleic acid, perhaps by binding to the displaced strand. Identification of residues that interact with the displaced strand is now needed to test this hypothesis.
⁴ C. A. Schiffer, personal communication.
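One common way to extract a kinetic step size from the lag phase is an n-step sequential model, in which a duplex of L bp is unwound in n identical, irreversible steps with rate constant k, so that the kinetic step size is L/n. The Python sketch below is a generic illustration under those assumptions; the function names and parameter values are hypothetical and are not fits to NS3 data.

```python
import math


def fraction_unwound(t: float, k: float, n_steps: int) -> float:
    """Fraction of duplex fully unwound at time t after n identical steps.

    Each step is irreversible with rate constant k, so product appears with
    the cumulative distribution of an Erlang(n, k) waiting time.
    """
    kt = k * t
    still_duplex = sum(kt**r / math.factorial(r) for r in range(n_steps))
    return 1.0 - math.exp(-kt) * still_duplex


def kinetic_step_size(duplex_bp: int, n_steps: float) -> float:
    """Base pairs unwound per rate-limiting step."""
    return duplex_bp / n_steps


# More steps give a longer lag before product appears (k = 1 per unit time).
early_one_step = fraction_unwound(0.5, k=1.0, n_steps=1)
early_three_steps = fraction_unwound(0.5, k=1.0, n_steps=3)
```

Because n is obtained by fitting time courses for duplexes of different length, the resulting L/n reports on how often a slow step recurs, not on how many base pairs open in each physical movement, which is why the kinetic and physical step sizes can differ.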
NS3 contains a number of highly conserved helicase motifs that line the cleft between domains 1 and 2 or face into the binding site for single-stranded DNA (ssDNA) (Fig. 1c). Structure-function studies have led to a description of the role for many of these motifs in ATP binding and hydrolysis or interaction with nucleic acid (82,83). The first structure of NS3h bound to ssDNA led to a proposed inchworm model for unwinding activity (84). Translocation of the helicase was suggested to result from movement of domains 1 and 2 relative to one another as a function of ATP binding and hydrolysis. The inchworm mechanism for other helicases has received support from structural (84-87) and biochemical (88) studies.
The proposed inchworm model for NS3h was further supported by recent structures of NS3h co-crystallized with ATP analogs and ssDNA (Fig. 4, a and b) (89). The ATP analog ADP·BeF3 mimics the ground state of ATP, whereas ADP·AlF4− mimics the transition state for ATP hydrolysis. Two amino acids of particular importance to the mechanism are Trp-501 and Val-432, which form "bookends" for the tracking strand of ssDNA bound in the active site of the enzyme. Trp-501 stacks with a base at the 3′-end of the ssDNA, whereas Val-432 inserts between bases on the 5′-end of the ssDNA (Fig. 4c). Five nt of the tracking strand are bound to the enzyme between the bookend residues in the absence of ATP. When bound to ADP·BeF3, NS3h undergoes a conformational change, with domains 1 and 3 rotating in the 5′-direction and closing with domain 2 (Fig. 4b). Domain movement results in 1 nt sliding outside the binding site at the position of Trp-501, leaving only 4 nt between the bookend residues. The structure of NS3h in the presence of the ATP transition state analog ADP·AlF4− illustrates intradomain movements that accompany the domain rotations. The net effect of these molecular motions is to rearrange interactions with nucleotides that ensure unidirectional movement along the binding track (Fig. 4c). ATP hydrolysis is proposed to release the closed conformation, allowing domain 2 to move in the 5′-direction by 1 nt. The net result is movement of 1 nt per ATP hydrolyzed (Fig. 4d). Key to this mechanism is that Val-432 and additional contacts from domain 2 hold the 5′-end of the ssDNA in place during movement of domains 1 and 3, whereas stacking of Trp-501 and additional ssDNA contacts hold the 3′-end of the ssDNA while domain 2 moves forward. Changing Trp-501 or Val-432 to alanine greatly reduces DNA unwinding activity, consistent with the inchworm mechanism (83).
Movement of the DNA through the active site has also been proposed to be driven in part by electrostatic repulsion between the DNA and a glutamate residue (Glu-493) that maintains an unusually high pKa value (90).
An alternative mechanism to the inchworm model is the Brownian ratchet mechanism, in which NS3 diffuses along the DNA as a result of ATP-dependent cycling between tight and weak binding states (91). In the weak binding state, ATP binding causes the enzyme to have reduced affinity for ssDNA. Brownian motion allows the enzyme to slide forward or backward in the weakly bound state. Movement in the 3′-to-5′-direction is favored owing to the ratchet effect presumably due to Trp-501, which reduces the probability of the enzyme sliding backward.
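The way a ratchet converts undirected diffusion into net movement can be seen in a toy random walk. The Python sketch below is only a cartoon of the idea: the one-attempt-per-cycle rule, the 50% forward attempt probability, and the parameter `p_back_allowed` (the chance that a backward attempt gets past the ratchet) are assumptions made for illustration, not measured properties of NS3.

```python
import random


def ratchet_walk(cycles: int, p_back_allowed: float, seed: int = 0) -> int:
    """Toy Brownian ratchet on a 1-nt lattice.

    In each ATP cycle the weakly bound enzyme attempts one random step.
    Forward (3'-to-5') attempts always succeed; backward attempts succeed
    only with probability `p_back_allowed`, standing in for the ratchet.
    Returns the net displacement in nt (positive = 3'-to-5').
    """
    rng = random.Random(seed)
    position = 0
    for _ in range(cycles):
        if rng.random() < 0.5:               # attempt a forward step
            position += 1
        elif rng.random() < p_back_allowed:  # backward attempt passes the ratchet
            position -= 1
        # otherwise the enzyme rebinds tightly where it started
    return position


biased = ratchet_walk(10_000, p_back_allowed=0.1)    # ratchet mostly blocks backsliding
unbiased = ratchet_walk(10_000, p_back_allowed=1.0)  # no ratchet: plain diffusion
```

With the ratchet engaged the walker drifts steadily in the 3′-to-5′ direction, whereas without it the net displacement stays close to zero even though the same number of ATP cycles is consumed.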
Another region of NS3 that is critical for unwinding is a β-hairpin that extends from domain 2 (Fig. 1c). This has been referred to as the Phe loop due to two conserved phenylalanine residues in the turn of the hairpin (92). The role of the hairpin was inferred from the structure of the archaeal SF2 helicase Hel308, which was solved in the presence of a single-stranded/double-stranded DNA junction (93). The pin appears to serve as a wedge that splits the duplex DNA. A recent report for the SF1 helicase RecD2 showed that removal of the corresponding pin domain abolished double-stranded DNA unwinding without affecting DNA-stimulated ATPase activity (94). Hence, a likely role for the β-hairpin (Phe loop) is to separate the incoming duplex.
Data from single-molecule fluorescence resonance energy transfer experiments support a 3-bp step size in a "spring-loaded" mechanism for NS3 (72). The 3-bp physical step size was determined to include smaller 1-bp steps that are likely associated with one ATP hydrolysis event. These data led to a proposed variation of the inchworm mechanism whereby domains 1 and 2 move along the DNA strand by 1 nt for each ATP hydrolyzed, whereas domain 3 remains anchored to the DNA through its tight interaction via Trp-501. Domain 3 was proposed to spring ahead after building up strain in the enzyme, resulting in the unwinding of 3 bp and providing an explanation for the single-molecule fluorescence resonance energy transfer data.
FIGURE 4. [...] c, amino acid residues required for translocation (ribbon diagram of the structure shown in b highlighting the position of Val-432 and Trp-501 relative to nucleic acid). d, model for translocation and unwinding by NS3 invoking a 1-bp physical step size (84,89). Domain colors match those in a, and the base stacked with Trp-501 is indicated in red. ATP binding leads to movement of domains 1 and 3 in the 3′-to-5′-direction along the tracking strand, leading to 1 nt emerging from the ssDNA-binding site. ATP hydrolysis leads to movement of domain 2, resulting in melting of 1 bp and returning the enzyme to its initial state. This schematic does not depict all of the possible enzyme states that must occur along the reaction pathway. For example, more subtle conformational changes, transition states for ATP hydrolysis, and intermediate states with ADP and Pi bound must occur but are not shown for clarity. A similar model invoking a 1-nt substep and a 3-bp unwinding step has been proposed (72).
Progress has been made in determining the specific step in the kinetic mechanism that limits duplex unwinding. Phosphate release after ATP hydrolysis occurred at the same rate as unwinding of a 9-bp RNA duplex, suggesting that phosphate release may contribute to the rate-limiting step (95). Recently, analysis of the ATPase kinetic cycle of the DEAD box protein DbpA showed that one ATP hydrolysis event leads to unwinding of an 8-bp duplex (96). For this enzyme, the release of RNA occurs after ATP hydrolysis but prior to phosphate release.
Although RNA is the biological substrate for NS3 helicase, the enzyme exhibits robust DNA binding and unwinding activities. NS3 has been found in the nuclei of liver cells in patients infected with HCV (97). To date, no direct connection has been made between the DNA unwinding activity of NS3 and the pathobiology of HCV.
Inhibitors of NS3 Helicase Activity-No inhibitors of NS3 helicase activity have entered clinical trials. The helicase has not been found to recognize an RNA sequence or structure with high specificity, in contrast to the NS3 protease, whose active site recognizes specific substrate sequences. Therefore, it may be difficult to find molecules that bind to the active site of NS3 helicase and that do not exhibit cross-reactivity with the myriad cellular helicases whose active sites are similar to that of NS3. High-throughput helicase assays, such as those with hairpin-forming molecular beacons, may enable discovery of specific inhibitors (98). Structure-based approaches might also yield success based on new structures that reveal different conformations of the protein (89). Finally, disruption of protein-protein interactions between NS3 and other HCV structural or non-structural proteins may impair HCV replication.