Crystal Structure of RNA Helicase from Genotype 1b Hepatitis C Virus

Crystal structure of RNA helicase domain from genotype 1b hepatitis C virus has been determined at 2.3 Å resolution by the multiple isomorphous replacement method. The structure consists of three domains that form a Y-shaped molecule. One is a NTPase domain containing two highly conserved NTP binding motifs. Another is an RNA binding domain containing a conserved RNA binding motif. The third is a helical domain that contains no β-strand. The RNA binding domain of the molecule is distinctively separated from the other two domains forming an interdomain cleft into which single stranded RNA can be modeled. A channel is found between a pair of symmetry-related molecules which exhibit the most extensive crystal packing interactions. A stretch of single stranded RNA can be modeled with electrostatic complementarity into the interdomain cleft and continuously through the channel. These observations suggest that some form of this dimer is likely to be the functional form that unwinds double stranded RNA processively by passing one strand of RNA through the channel and passing the other strand outside of the dimer. A “descending molecular see-saw” model is proposed that is consistent with directionality of unwinding and other physicochemical properties of RNA helicases.

Helicases are enzymes catalyzing strand separation of double stranded DNA (dsDNA) or dsRNA coupled with hydrolysis of NTP. They are required for many cellular events including transcription, RNA processing, translation, and DNA or RNA replication (1)(2)(3). Since the first discovery of DNA helicase activity more than 20 years ago (4,5), many different helicases have been identified with preferences for unwinding duplexes of DNA or RNA. In in vitro experiments, nearly all helicases require a single stranded region; some require a 3′ overhang region (3′ to 5′ helicases), whereas others require a 5′ overhang region (5′ to 3′ helicases). This single stranded region is proposed to provide an initiation site for unwinding duplex nucleic acids (6). Escherichia coli Rep DNA helicase, an extensively studied helicase, exhibited highly processive unwinding of replicative form of phage DNA in an in vitro experiment (7). In order for a helicase to unwind duplex nucleic acids in a processive manner, the enzyme should destabilize the hydrogen bonds between the base pairs, translocate to the next base paired region, and repeat the cycle without fully dissociating (6,8). Recently, oligomeric forms, generally dimers or hexamers, were observed for some DNA helicases (8,9). These oligomers are believed to provide the helicases with multiple nucleic acid binding sites necessary for the helicase function (6). Although Rep DNA helicase, for example, is a stable monomer in solution in the absence of DNA, a dimeric form of Rep is induced in the presence of DNA which is known as the functional form (6,8).
Hepatitis C viruses (HCVs) are the major etiologic agents of non-A, non-B hepatitis that are estimated to have infected about 1% of the population worldwide. HCV belongs to Flaviviridae, the positive-strand RNA virus family (10). Its genome consists of about 9400 nucleotides with the gene order of N′-C-E1-E2-NS2-NS3-NS4A-NS4B-NS5A-NS5B-C′ (10) encoding a viral polyprotein of about 3010 residues (11). The polyprotein is processed into functional proteins by host- and virus-encoded proteases. Among the processed proteins, NS3 is best characterized. The N-terminal one-third of NS3 is a serine protease domain (12) which is known to cleave the NS3/4A, NS4A/4B, NS4B/5A, and NS5A/5B junctions (13)(14)(15). The C-terminal two-thirds of NS3 is an RNA helicase domain exhibiting nucleotide triphosphatase/RNA helicase activity (16,17). The domain was shown to unwind not only dsRNA but also RNA/DNA heteroduplex and dsDNA (18). For its function the helicase domain strictly requires a 3′ overhang region, and it unwinds double stranded nucleic acids only in the 3′ to 5′ direction (18).
Sequence alignment of many RNA helicases revealed four highly conserved sequence motifs, and in HCV RNA helicase they are conserved as G 207 SGKST, D 290 ECH, T 322 AT, and Q 460 RRGRTGRGRRG sequences. The G 207 SGKST sequence, known as Walker A motif, is found in nearly all NTP hydrolyzing enzymes and is responsible for NTP binding. The D 290 ECH sequence is a variant of Walker B motif (19). Biochemical and mutational analyses showed that the T 322 AT sequence is important in unwinding of RNA, whereas the Q 460 RRGRTGRGRRG sequence is important in the RNA binding and the unwinding of RNA (20,21).
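As an illustration of how the four motifs named above could be located in a protein sequence, the following is a minimal sketch. The function name `find_motifs` and the toy sequence are assumptions made for this example; the toy fragment is filler and is not the real NS3 sequence.

```python
import re

# Conserved motifs as written in the text (one-letter amino acid codes).
MOTIFS = {
    "Walker A": "GSGKST",
    "Walker B variant": "DECH",
    "TAT": "TAT",
    "RNA binding": "QRRGRTGRGRRG",
}

def find_motifs(sequence, first_residue_number=1):
    """Return {motif name: [residue numbers at which the motif starts]}.

    first_residue_number is the residue number assigned to sequence[0],
    so that hits can be reported in the numbering of the full protein.
    """
    hits = {}
    for name, pattern in MOTIFS.items():
        hits[name] = [
            match.start() + first_residue_number
            for match in re.finditer(re.escape(pattern), sequence)
        ]
    return hits

# Hypothetical toy fragment (alanine filler between motifs), for illustration only.
toy = "AA" + "GSGKST" + "AAAA" + "DECH" + "AAA" + "TAT"
hits = find_motifs(toy)
for name, positions in hits.items():
    print(name, positions)
```

With a real NS3 sequence and the appropriate `first_residue_number`, the reported start positions would be expected to match the residue numbers quoted in the text (207, 290, 322, and 460), provided the same numbering scheme is used.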
Here we report the crystal structure of the genotype 1b HCV RNA helicase domain and discuss in detail the structural features of the conserved motifs. Based on a modeling experiment we propose a mechanism of processive unwinding of the duplex RNA consistent with previously observed physicochemical properties of the enzyme.

MATERIALS AND METHODS
Protein Purification and Crystallization-The HCV RNA helicase domain was isolated from an overexpressing E. coli strain (BL21 (DE3)) and purified using a Ni-NTA-agarose column (QIAGEN) and a poly(U)-Sepharose column (Amersham Pharmacia Biotech) successively as described previously (22). Crystals were grown at 4°C from a precipitant solution containing 30% polyethylene glycol 4000, 0.1 M sodium cacodylate (pH 6.5), and 0.2 M ammonium acetate on micro batch plates under Al's oil (Hampton Research). The crystals belong to the space group P3₁21 with the unit cell dimensions of a = b = 93.3 Å, c = 104.6 Å. The crystals contain one molecule of the enzyme in the asymmetric unit.
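The cell dimensions above fix the unit cell volume and the volume available to each molecule. A minimal sketch of that arithmetic follows, assuming the usual hexagonal-axes setting for a trigonal space group (α = β = 90°, γ = 120°) and six asymmetric units per cell for P3₁21. The `matthews_coefficient` helper is included for convenience only; the molecular mass is not given in the text and must be supplied by the reader.

```python
import math

def hexagonal_cell_volume(a, c):
    """Volume in Å^3 of a cell with a = b, alpha = beta = 90°, gamma = 120°."""
    return a * a * c * math.sin(math.radians(120.0))

def matthews_coefficient(asu_volume, molecular_mass_da, molecules_per_asu=1):
    """Crystal volume per unit of protein mass (Å^3/Da)."""
    return asu_volume / (molecular_mass_da * molecules_per_asu)

A_AXIS = 93.3   # Å, from the text
C_AXIS = 104.6  # Å, from the text
N_ASU = 6       # general positions in P3(1)21 (assumption stated in the lead-in)

cell_volume = hexagonal_cell_volume(A_AXIS, C_AXIS)
asu_volume = cell_volume / N_ASU

print(f"unit cell volume: {cell_volume:.0f} Å^3")
print(f"volume per asymmetric unit: {asu_volume:.0f} Å^3")
```

Dividing the asymmetric-unit volume by the molecular mass of the crystallized construct gives the Matthews coefficient, from which the solvent content can be estimated.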
Data Collection, Structure Determination, and Refinement-All diffraction data were measured from flash-frozen crystals on a DIP2020 area detector system with graphite monochromated CuKα x-ray generated by a MacScience M18XHF rotating anode generator operated at 90 mA and 50 kV. Data reduction, merging, and scaling were accomplished with the programs DENZO and SCALEPACK (23). Initial diffraction phases were obtained by multiple isomorphous replacement with three heavy atom derivatives. A difference Patterson map of an iridium derivative (K₃IrCl₆) and that of thimerosal derivative were calculated, respectively, with the fast Fourier transform of the CCP4 suite (1994). Heavy atom sites for the two derivatives were readily identified by strong Patterson peaks (>6σ) on Harker sections (Z = 1/3) and at general positions. The heavy atom positions were used to calculate MIR phases with the program MLPHARE (24). The MIR phases revealed relatively weak heavy atom positions for a gold derivative (KAu(CN)₂). The MIR phases with all three derivative data had a mean figure of merit of 0.55 at 3.0 Å resolution (Table I) and were improved with real space density modification using the program DM in the CCP4 suite (1994). The final MIR map was of high quality showing virtually all side chain electron densities except in flexible regions. A nearly complete model of the helicase was built using the program O (25) and was refined using the X-PLOR program package (26). MIR phases were abandoned at this point, and electron density maps calculated with phases derived from the refined model allowed model building of loop regions. The N-terminal 39 residues including the hexahistidine tag attached to the protein exhibited no electron density and were omitted in the final model. Only faint electron densities were observed for residues 416-420.
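For readers unfamiliar with the figure of merit quoted above, the sketch below evaluates the standard textbook definition for a single reflection, m = |Σ P(α) exp(iα)| / Σ P(α), on a sampled phase probability distribution. This illustrates the definition only and is not a reproduction of how MLPHARE computes it; the function name and the sample distributions are assumptions made for this example.

```python
import cmath
import math

def figure_of_merit(phases_deg, probabilities):
    """Figure of merit of one reflection from a sampled phase distribution.

    phases_deg    : phase angles alpha in degrees
    probabilities : P(alpha) at those angles (need not be normalized)
    """
    weighted_sum = sum(
        p * cmath.exp(1j * math.radians(alpha))
        for alpha, p in zip(phases_deg, probabilities)
    )
    return abs(weighted_sum) / sum(probabilities)

# A perfectly determined phase gives m = 1.
sharp = figure_of_merit([45.0], [1.0])

# A flat distribution over the full circle gives m close to 0.
angles = list(range(0, 360, 10))
flat = figure_of_merit(angles, [1.0] * len(angles))

# Two equally likely phases 90° apart give m = cos(45°).
bimodal = figure_of_merit([0.0, 90.0], [0.5, 0.5])

print(sharp, flat, bimodal)
```

The mean figure of merit reported for a data set is the average of this quantity over reflections, so a value of 0.55 lies between the fully ambiguous and fully determined extremes shown here.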

RESULTS AND DISCUSSION
Protein Fold and Structural Features-The structure of HCV RNA helicase consists of three nearly equal-sized domains that form a Y-shaped molecule (Fig. 1). The N-terminal one-third of the enzyme is a NTPase domain consisting of a typical central core of pleated sheet surrounded by helices (27). The active site cleft for the NTP hydrolysis can be readily identified at the periphery of the domain by the G 207 SGKST sequence, the NTP binding motif. The second domain is an RNA binding domain containing the highly conserved Q 460 RRGRTGRGRRG sequence identified as an RNA binding motif (Fig. 1). The folding pattern of the RNA binding domain is similar to that of the NTPase domain, but it contains fewer α-helices. The C-terminal one-third of the enzyme is a helical domain composed of five α-helices and loops. The NTPase and the helical domain are more or less continuously linked with a shallow groove between the two, but the RNA binding domain is distinctly separated from the other two domains forming a deep interdomain cleft as shown in Fig. 1. The size of the interdomain cleft is adequate for binding ssRNA (or ssDNA) but too narrow for binding double stranded nucleic acids. The NTPase and the RNA binding domain are connected by two random coils (Fig. 1). In contrast, two antiparallel β-strands, unusually protruding from the RNA binding domain, are inserted into the helical domain like an anchor linking the two domains (Fig. 1). The turn region of the antiparallel β-strands is rich in hydrophobic amino acids and interacts extensively with apolar residues of the helical domain.
It remains a question why NS3, which contains two completely different activities, protease and helicase, is not cleaved into two polypeptides in the processing of the viral polyprotein. In the crystal structure presented here, the electron density for the first 39 amino acids is not visible, indicating that the protease domain and the helicase domain of NS3 should be linked to each other by a highly flexible loop region. The visible N terminus of the helicase is at the back side of the molecule (opposite the interdomain cleft), and thus the enzyme activities of the protease and the helicase domain appear independent of each other (see below).
NTP Binding Site-The NTP binding site is located at the periphery of the NTPase domain. The G 207 SGKST and the D 290 ECH sequences are close to each other, lining part of the active site cavity (Fig. 2). The side chain of Asp 290 is involved in an ionic interaction with Lys 210 and a hydrogen bond with the side chain of Ser 211. The D 290 ECH sequence is on an unusual loop structure which orients the side chains of Asp 290, Glu 291, and His 293 toward the cavity and that of Cys 292 in the opposite direction (Fig. 3). The functional roles of Asp 290 and Glu 291 can be inferred from the structures of other NTPases. The crystal structures of Bacillus stearothermophilus PcrA (28) and E. coli Rep DNA helicases (29) were determined as complexes with ADP. It has been proposed that in the major domain of RecA, which exhibits similarity to the NTPase domains of the DNA helicases, the magnesium ion of Mg-ATP is coordinated by Asp 144 (30). This residue corresponds to Asp 223 and Asp 214 in the DEXX motif of PcrA and Rep DNA helicase, respectively. Glu 215 (in Rep) and Glu 224 (in PcrA) of the DEXX motif are in the same relative position in space as Glu 96 in RecA, which was proposed to activate the catalytic water molecule during the [...] (31). The electron density for the NTP binding site is very strong showing detailed features including many bound water molecules (Fig. 2). However, it was not possible to predict a correct binding mode of NTP by simple model building due to severe steric clashes. Some conformational change in the active site cavity is expected to occur upon NTP binding.

TABLE I (footnote fragment). ... where α is the phase and P(α) is the phase probability distribution. The R free was calculated with 5% of the data.
Flexible Hinge-One of the two loops connecting the NTPase and the RNA binding domain contains the invariant T 322 AT. In the structure presented here, the hydroxyl group of Thr 322, located at the beginning of the loop (Figs. 1 and 3), is involved in a hydrogen bond with the imidazole ring nitrogen of His 293. In contrast, the hydroxyl group of Thr 324, located toward the middle of the loop and about 4 Å apart from His 293, is just exposed to the bulk solvent (Fig. 3). It is generally believed based on mutational studies that the T 322 AT sequence couples the NTP hydrolysis and the duplex unwinding by the enzyme (20). In other experiments, H293A mutation in HCV RNA helicase and the corresponding mutation in vaccinia virus RNA helicase were shown to affect severely the duplex unwinding activity without affecting the NTP hydrolysis activity (31,32). Thus, His 293, Thr 322, and Thr 324 may function as a triad in coupling the NTP hydrolysis and the helicase activity. It is possible that His 293 could switch its hydrogen bond to and from Thr 322 and Thr 324 during the helicase function, which should require a small flexible hinge motion of the connecting loops considering the proximity of the three residues. As supporting evidence, in the crystal structure of highly homologous genotype 1a HCV RNA helicase which was determined very recently, two molecules in the asymmetric unit displaying 3-4° rigid body rotation of the RNA binding domain with respect to each other undergo a small hinge bending motion of the two loops (33). The domain movement appears intrinsically small due to the presence of the two antiparallel β-strands which link the RNA binding and the helical domain. The "structured" strands, interacting heavily with the helical domain, are unlikely to undergo an appreciable conformational change. In the crystal structure of genotype 1a HCV RNA helicase, the β-strands of the two molecules in the asymmetric unit show a negligible twist with respect to each other.
Consistently, despite the completely different crystal packing, the RNA helicase structure presented here (in the space group P3₁21) does not exhibit any noticeable closure or opening of the interdomain cleft compared with the structure of genotype 1a HCV RNA helicase (in the space group P2₁2₁2₁).
RNA Binding Motif-The RNA binding motif, Q 460 RRGRTGRGRRG sequence, shows unusually high occurrence of glycine and arginine. The first three residues constitute the end of an α-helix, and the rest of the residues form a loop structure (Figs. 1 and 4). The high occurrence of glycine on the loop structure strongly suggests that the sequence can easily undergo a conformational change necessary for the alignment of the arginine residues on the loop structure in favorable contacts with the phosphate backbone of RNA. It was noted that the side chains of the most conserved residues, Gln 460, Arg 461, Arg 464, and Arg 467, point to the interdomain cleft with an ~7 Å spacing (Fig. 4).
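The glycine and arginine enrichment described above can be checked directly from the motif sequence. The short sketch below counts residues in the motif and maps residue numbers onto it, assuming (as the text states) that the first residue, Gln, is residue 460; the helper name `residue_at` is hypothetical.

```python
from collections import Counter

MOTIF = "QRRGRTGRGRRG"   # RNA binding motif as written in the text
FIRST_RESIDUE = 460      # residue number of the leading Gln

counts = Counter(MOTIF)
gly_arg_fraction = (counts["G"] + counts["R"]) / len(MOTIF)

def residue_at(number):
    """One-letter code of the motif residue with the given residue number."""
    return MOTIF[number - FIRST_RESIDUE]

print(dict(counts))
print(f"Gly + Arg fraction: {gly_arg_fraction:.2f}")

# Residues singled out in the text as the most conserved.
for number in (460, 461, 464, 467):
    print(number, residue_at(number))
```

Ten of the twelve positions are glycine or arginine, and the numbering places arginine at positions 461, 464, and 467, in agreement with the residues named in the text.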
Putative Functional Dimeric Model-Entrapment of substrates within protein structures has been observed as a common theme for several proteins which possess processivity in the interaction with nucleic acids, including the β subunit of E. coli DNA polymerase III (34) and λ-exonuclease (35). A functional oligomeric state of HCV helicase is not known but has been proposed equivocally as a monomer or a dimer (36). We examined crystal packing interfaces in the crystal structure and found that a symmetry-related monomer-monomer interaction could reflect interfaces of a functional form of the RNA helicase. The interfaces are formed by the interactions between the NTPase and the RNA binding domains of the two molecules and represent the most extensive crystal packing interactions (Fig. 5). The interactions screen a total of 144 Å² surface area of one molecule. In the middle of the dimer a channel is found that is helical in shape with a vertical length of about 29 Å and a horizontal length of about 10 Å on the front side and 12 Å on the back side. The RNA binding motif of each molecule is the major part shaping the surface of the channel.
It was possible to model a canonic RNA (37) into the interdomain cleft of one molecule and continuously through the channel at the dimer interface with the 3′ end hanging out of the channel. In the modeling experiment, the phosphate backbone of ssRNA was brought into contact with the RNA binding motifs of the two molecules with slight dihedral angle changes of the phosphate backbone. This was easily achieved because both the two RNA binding motifs and the two interacting phosphate groups are separated by ~17 Å corresponding to a half-turn of RNA (Fig. 6). In this binding mode, one molecule of the RNA helicase interacts with ssRNA extensively, whereas the other molecule interacts less extensively using the RNA binding motif only. The phosphate backbone of the bound ssRNA, interacting with the RNA binding motifs at the channel, appears as a "glue" which stabilizes the dimeric structure. Similar observations were made for the dimer or the trimer of the DNA polymerase β subunit, both of which are supposed to be stabilized by dsDNA entrapped in the middle of the oligomeric structures (34,38). It was noted that the interdomain cleft is slightly wider for the ssRNA modeled in the interdomain cleft. A small rotation of the RNA binding domain toward the NTPase and the helical domain is expected for an induced fit of ssRNA as a result of the hinge bending motion. It is known that ssRNAs increase the NTPase activity of HCV RNA helicase up to 27-fold (36), whereas dsRNAs do not. The activity increase is likely due to the ssRNA binding to the interdomain cleft, which may also trigger some conformational change at the active site of the enzyme. In this regard, the RNA helicase with a bound ssRNA at the interdomain cleft can be considered as an activated molecule, whereas the RNA helicase without a bound ssRNA can be considered as a resting molecule. It is not known whether the RNA and the NTP binding to the enzyme are sequential or random. Our modeling experiment, which shows that the bound ssRNA does not block the NTP binding site, cannot distinguish the two. Whether it is sequential or random, ATP hydrolysis would occur mainly on the activated molecule. Because of the expected small conformational change upon the ssRNA binding at the interdomain cleft, the asymmetric putative functional dimeric form composed of the activated molecule and the resting molecule would be slightly different from the symmetric dimer composed of the two resting molecules presented here. The dimer interface in the crystal structure does not involve any specific interaction such as ion pairs between charged amino acids. Thus, the slight rotation of the RNA binding domain at the dimer interface could easily occur upon the induced conformational change by the ssRNA binding.

FIG. 5. A dimer interface which exhibits the most extensive crystal packing interaction in the crystals of HCV RNA helicase. A channel is found between the two symmetry-related molecules through which ssRNA is able to pass. A crystallographic 2-fold symmetry axis is roughly perpendicular to the figure. The interdomain cleft is predominately negatively charged (red). A putative active dimeric form is proposed to be similar in conformation to the symmetric dimer shown here (see text).

FIG. 6. Modeling of single stranded portion of RNA into the interdomain cleft and through the channel between the two molecules of the symmetric dimer shown in Fig. 5. In the modeling, a canonic RNA was used, and the phosphate backbone of the ssRNA was brought into contact with the RNA binding motifs of the dimer with slight changes of the backbone dihedral angles of the ssRNA. The top portion of the RNA binding domain facing the channel is the RNA binding motif. The negative electrostatic potential of the interdomain cleft suggests that it interacts with bases of ssRNA. This was accounted for in the modeling experiment, but it was not necessary to change the phosphate backbone dihedral angles.

FIG. 7 (legend fragment; beginning not recovered). The ssRNA bound to the interdomain cleft of α is proposed to induce a small conformational change which increases the NTP hydrolysis activity. The dimer is stabilized by the interaction of ssRNA with the RNA binding motifs of α and β. C, the NTP hydrolysis by α in B results in the detachment of the ssRNA and a rigid body rotation of the dimer along an axis at the RNA binding motif of α. As a result, the dimer translocates along the ssRNA in the 5′ direction, and the interdomain cleft of β binds the other portion of the ssRNA. D, the dimer reaches the junction of ssRNA and dsRNA by repeated cycles of the translocation. E, in the same manner, the dimer translocates along the same strand of RNA. Energy required for the disruption of base pairings can be supplied by favorable interactions between the interdomain cleft and the ssRNA. One strand passing through the channel at the dimer interface is separated from the other strand hanging out of the dimer.
Mechanism of Duplex Unwinding-With this putative functional dimeric model, we propose a mechanism of processive unwinding of duplex RNA coupled with the NTP hydrolysis by HCV and related RNA helicases. ATP decreases the affinity of HCV RNA helicase for deoxyuracil 18-mer by 95% (36), indicating that the ATP hydrolysis results in the dissociation of RNA from the enzyme. Based on this observation, it can be hypothesized that the NTP hydrolysis causes a hinge bending motion which transforms the activated conformation of the enzyme to the resting conformation concomitant with the detachment of the bound ssRNA from the interdomain cleft. In the model presented here, the detached ssRNA can be closer to and bound by the interdomain cleft of the resting molecule of the putative functional dimer. This can be described as a rotation of the dimer relative to the bound ssRNA (Fig. 7). In order for the dimer to translocate on the ssRNA by the rotational motion, a rotation axis should be toward the 5′ end of the bound RNA relative to the "pseudo-" 2-fold symmetry axis of the functional dimer. In the structure of the dimer, the front part of the RNA binding motif of one molecule is below the 2-fold symmetry axis (Fig. 6). It was shown that individual alanine substitutions of the conserved arginine residues in the RNA binding motif in vaccinia virus RNA helicase cause severe defects in RNA unwinding with slight reduction in RNA binding affinity (21). This mutational study led to the conclusion that the motif must play an essential role in the helicase mechanism. The structural observation and the biochemical data suggest that a front part of the RNA binding motif serves as the pivoting region for the rotation in the context of the proposed model.
A rotation of about 60° of the dimer along an axis passing the front part of the RNA binding motif of the activated molecule brings the resting molecule into contact with the ssRNA and results in a translocation of the dimer toward the 5′ direction as shown in Fig. 7. After this one cycle of the rotation and translation the conformations of the activated molecule and the resting molecule are exchanged, reproducing the same RNA binding mode of the functional dimer. Repeated cycles of the rotation and translation along the ssRNA containing the 3′ overhang can be described as "descending molecular see-saw" motion (Fig. 7). Since the size of the interdomain cleft is adequate only for ssRNA, duplex unwinding by the disruption of base pairings should occur at the ssRNA and dsRNA junction. Required energy can be provided by the favorable interaction between the interdomain cleft and the ssRNA. The translocation of the dimer along the ssRNA is the process of duplex unwinding because one strand of RNA passing through the channel is separated from the other strand hanging out of the dimer (Figs. 6 and 7). The step size of duplex unwinding and translocation along the ssRNA is about 5 nucleotides when the dimer rotates along an axis at the front part of the RNA binding motif (about Arg 462). This coincides with the step size of UvrD helicase obtained by kinetic measurement (39).
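The alternating cycle described above can be summarized as a toy bookkeeping model. The sketch below is only an illustration of the proposed scheme, not a kinetic or structural simulation: it assumes one NTP hydrolysis per rotation cycle and a fixed step of 5 nucleotides, and the class and function names are hypothetical.

```python
from dataclasses import dataclass

STEP_NT = 5  # nucleotides advanced per cycle (step size quoted in the text)

@dataclass
class SeeSawDimer:
    position: int = 0          # nucleotides travelled toward the 5' end
    activated: str = "alpha"   # molecule whose interdomain cleft holds the ssRNA
    ntp_used: int = 0          # assumed: one NTP hydrolyzed per cycle

    def cycle(self):
        """One rotation: advance one step and exchange the two conformations."""
        self.position += STEP_NT
        self.activated = "beta" if self.activated == "alpha" else "alpha"
        self.ntp_used += 1

def translocate(distance_nt):
    """Run cycles until the dimer has travelled at least distance_nt."""
    dimer = SeeSawDimer()
    while dimer.position < distance_nt:
        dimer.cycle()
    return dimer

dimer = translocate(20)
print(dimer)
```

In this picture the activated and resting roles swap on every cycle, so after an even number of cycles the same molecule is activated again while the dimer has moved a whole number of steps toward the 5′ end.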
The proposed functional dimeric model satisfies the previously observed properties of the enzyme. First, the model requires a 3′ overhang region of RNA for the initial binding of one molecule to the RNA and for the formation of the putative active dimer (Fig. 8). This explains the requirement of an at least 11-base-long single stranded portion of the RNA substrates for the helicase function (36). It is the minimum length that spans the interdomain cleft and the channel of the dimer (Fig. 6).
Second, the functional dimer moves along the ssRNA containing the 3′ overhang in the 3′ to 5′ direction as previously observed (40). It was not possible to insert ssRNA into the dimeric structure in the 5′ to 3′ direction simply due to severe steric clashes. Third, the model requires the pivoting motion around the RNA binding motif for the duplex unwinding. This explains why the RNA binding motif is necessary not only for the binding of RNA but also, more importantly, for the unwinding activity of the enzyme. Fourth, since the functional dimer translocates along only one strand of nucleic acids in this model, the dimer is able to unwind dsRNA, dsDNA, and DNA/RNA duplex as previously known. Besides, since the dimer uses only the front side of the molecules for the duplex unwinding and the flexible N terminus is located at the back side of the molecule, the protease domain linked to the helicase domain in NS3 would not interfere with the movement of NS3 along RNA substrates in vivo.
The proposed model is analogous in concept to the "rotary-type" mechanism of F₁-ATPase. It is believed that a 120° rotation of the γ subunit of F₁-ATPase induces sequential conformational changes of the α₃β₃ subunits. Each subunit alternates among three conformations, ADP and Pᵢ bound, ATP bound, and none-bound (41). Like the γ subunit of F₁-ATPase, the bound ssRNA results in the asymmetric functional dimer, each molecule of which alternates between the activated and the resting conformation. Interestingly, it was proposed recently that hexameric T7 DNA helicase encircles only one strand of DNA (42), as does the functional dimeric model proposed here. In conclusion, the "descending molecular see-saw" model is presented consistent with the previously observed biochemical data for the RNA helicases. The model provides a plausible framework explaining how the enzymes achieve the duplex unwinding and the translocation along nucleic acids in a processive manner.