The hepatitis C virus RNA-dependent RNA polymerase directs incoming nucleotides to its active site through magnesium-dependent dynamics within its F motif

RNA viruses synthesize new genomes in the infected host thanks to dedicated, virally-encoded RNA-dependent RNA polymerases (RdRps). As such, these enzymes are prime targets for antiviral therapy, as has recently been demonstrated for hepatitis C virus (HCV). However, peculiarities in the architecture and dynamics of RdRps raise fundamental questions about access to their active site during RNA polymerization. Here, we used molecular modeling and molecular dynamics simulations, starting from the available crystal structures of HCV NS5B in ternary complex with template-primer duplexes and nucleotides, to address the question of ribonucleotide entry into the active site of viral RdRp. Tracing the possible passage of incoming UTP or GTP through the RdRp-specific entry tunnel, we found two successive checkpoints that regulate nucleotide traffic to the active site. We observed that a magnesium-bound nucleotide first binds next to the tunnel entry, and interactions with the triphosphate moiety orient it such that its base moiety enters first. Dynamics of RdRp motifs F1 + F3 then allow the nucleotide to interrogate the RNA template base prior to nucleotide insertion into the active site. These dynamics are finely regulated by a second magnesium dication, thus coordinating the entry of a magnesium-bound nucleotide with shuttling of the second magnesium necessary for the two-metal ion catalysis. The findings of our work suggest that at least some of these features are general to viral RdRps and provide further details on the original nucleotide selection mechanism operating in RdRps of RNA viruses.

Template-dependent synthesis of nucleic acids to generate a template-complementary strand is at the heart of the transmission and expression of genetic information. Nucleic acid polymerases are those enzymes that direct such template-dependent DNA or RNA synthesis. At the most basic level, the crucial function of these enzymes is to achieve exact positioning of a nucleotide so that it may probe the next base of the template to be copied (base labeled "+1" on Fig. 1B), checking for proper Watson-Crick base pairing. The "P-1" 3′ hydroxyl of the neosynthesized nucleic acid can then attack the 5′-α-phosphate of the nucleoside triphosphate (NTP), ultimately leading to the addition of the nucleoside monophosphate and release of pyrophosphate. This chemistry step is assisted by two metal ions (normally magnesium) and this overall scheme seems general to all nucleic acid polymerases (1). In bacteria and viruses the polymerase activity may be carried by a single polypeptide chain that is necessary and sufficient for template-dependent nucleic acid synthesis. These polymerases usually only use one type of nucleic acid (DNA or RNA) as template, and synthesize only one type (DNA or RNA) complementary to the first. Atomic structures and sequence comparisons of these enzymes have shown that a very successful architecture is that initially found in the Klenow fragment of Escherichia coli DNA polymerase I (a DNA-dependent DNA polymerase, or replicase) (2). This architecture (hereafter called base polymerase) has been likened to a right hand with three domains dubbed "fingers," "palm," and "thumb" (Fig. 1A). There are base polymerases found for all four possible activities: replicases, transcriptases (DNA-dependent RNA polymerases such as the bacteriophage T7 RNA polymerase), reverse transcriptases (such as the HIV DNA polymerase), and RNA-dependent RNA polymerases (RdRp). In the latter two cases of RNA-dependent base polymerases, the enzymes are specific to the viral world and are prime targets of antivirals.
The first complete crystal structure of a viral RdRp, that of hepatitis C virus polymerase (HCV NS5B), revealed that, in contrast to other base polymerases, RdRp harbor extensions from the fingers connecting to the thumb. These "fingertips" (Fig. 1A) leave only a narrow tunnel for access of NTP to the catalytic site (3-5). Nearly all RNA viruses with no DNA stage encode their own RdRp for intracellular transcription and replication, and numerous atomic structures of these are now known. They all display this peculiar elaboration on the base polymerase fold, whether in the 400-600-residue RdRp of single-stranded, positive-sense RNA ((+)-ssRNA) viruses or in the usually larger and more complex RdRp of (−)-ssRNA viruses (such as influenza virus) (6) and dsRNA viruses (7,8), where the fingertips may be very large.
Groundbreaking work at the turn of the 1990s identified conserved sequence motifs within the basic architecture of base polymerases, some found in all (2) and others specific to RNA-dependent polymerases (9) or more narrowly to RdRp (10). We follow here the nomenclature of Delarue and colleagues (9), extended by other authors, in labeling these motifs in alphabetical order. Thus motifs A and C (common to all base polymerases) provide aspartic acid residues that chelate the two active site magnesium ions (for HCV NS5B, Asp 220 of motif A and Asp 318-Asp 319 of motif C, Fig. 1B). The so-called motif F is best described as a tripartite, RdRp-specific motif F1-F2-F3 (in NS5B, residues 141-144, 151-154, and 155-161) (11). It resides in the RdRp fingertips in successive circles of residues lining the NTP tunnel and increasingly distant from the active site in the order F3 < F1 < F2. F1-F3 are not as conserved in RdRp as motifs A and C. Indeed sequence divergence in these and other residues allows partitioning of (+)-ssRNA RdRp into three supergroups (10,12). Among major pathogens, supergroup I contains the Picornaviridae (including the enteroviruses such as poliovirus, or hepatitis A virus), and Caliciviridae (including noroviruses), as well as Coronaviridae (including severe acute respiratory syndrome virus). Supergroup II contains the Flaviviridae (including among many others dengue virus, Zika virus and HCV). Supergroup III contains the Togaviridae (including the alphaviruses such as chikungunya virus and the rubella virus), as well as hepatitis E virus. Available atomic structures of complexes with template-primer nucleic acids (called binary complexes in the field) and of complexes with nucleic acids and an incoming nucleotide bound at the active site (ternary complexes) are very unevenly distributed among (+)-ssRNA RdRp supergroups. There are no structures of supergroup III RdRp (apo or complexes). Recently the structures of binary (13) and ternary (14) complexes of HCV NS5B have been the first for supergroup II. Finally, numerous crystal structures of binary and ternary complexes are available for supergroup I (e.g. Refs. 15 and 16).
Of particular interest in these supergroup I structures are those obtained in recent years for enteroviruses (17-19). Through careful tailoring of experimental conditions, Gong and colleagues (17-19) could catch incoming NTP not only at the active site but also just prior to transfer to the active site. At this stage, the NTP already base pairs with the +1 template base but its triphosphate, which comes with catalytic magnesium B (Mg(B)), is not in the right place yet. Indeed, Mg(A) (Fig. 1B) does not yet reside in the active site where it is to activate the primer 3′-OH for attack of the NTP α-phosphate. In one enterovirus complex it is found instead some 5 Å away (18), at a noncatalytic site that has also been reported in other RdRp including Flaviviridae (supergroup II) RdRp (20) (and references therein). Gong and colleagues (17-19) conclude that these complexes are equivalent to the "preinsertion" state reported for the T7 RNA polymerase (21). In the present work, we extend these structural insights on nucleotide entry into RdRp through extensive molecular dynamics simulations using a variety of methods on HCV NS5B complexes.

Overall dynamics of ternary and binary complexes
We took advantage of the availability of crystal structures of ternary complexes of NS5B with double-stranded primer-template RNAs and incoming nucleoside diphosphate (14) to generate the all-atom ternary system (see "Experimental procedures"): NS5B, template-primer RNA and UTP with two Mg²⁺, Mg(A) and Mg(B), at the active site (Fig. 1, A and B). Starting from this system, we first produced 1 μs of simulation in the forms of (i) 5 replicas of 100-ns simulations of the ternary complex and (ii) 5 replicas of 100-ns simulations of a binary complex where we kept Mg(A) at the active site but removed UTP, together with Mg(B), because it is expected to come with the incoming nucleotide (22,23). In all 10 simulations, the systems showed several conserved properties. The first is an initial relaxation/expansion from the crystal structure conformation corresponding to an opening movement of the protein, mainly of the thumb domain. To monitor this opening we used the NS5B "interdomain angle" as defined by Davis and Thorpe (24). We find that this angle consistently goes up from some 76° in the crystal structure to about 84° during the simulations (see Fig. S1 for the ternary complexes). In fact this initial relaxation occurs in large part during the careful equilibration protocol we used to avoid spurious breakup of contacts that we saw in initial trials. The relaxation is complete in all simulations after 10 ns and the angle then oscillates with breathing motions of the enzyme. We therefore considered the first 10 ns of simulations as pre-production.
Root mean square fluctuation (RMSF) increases with mobility of atoms and is a good indicator of which parts are rigid and which flexible. RMSF on all nonhydrogen atoms computed over the last 90 ns shows that several NS5B internal loops are very mobile (Fig. 2, A and B). We focus on the loop in the fingertips that overhangs the NTP entry tunnel and shows a very sharp RMSF peak in all 10 simulations at residues 148-152 (RMSF peaking at 4-5 Å at residue 151 and above 2.5-3 Å over 148-152, whereas it is below 1 Å immediately upstream and downstream). In crystal structures this part (assigned by Bruenn (11) to RdRp motif F2, and hereafter called "entry loop") is often not visible in electron density maps and hence not built in the model, suggesting dynamic disorder. However, it is present in the crystal structure we used and our simulations show that it does stabilize at times in several defined positions. This is best illustrated by the distance between Lys 151 Nζ and Asp 352 Cγ on the other side of the NTP tunnel next to motif D (Fig. 2, C and D). The very wide distributions (from below 3.5 to 25 Å) show several peaks that correspond to alternate salt bridges made by Lys 151. The dynamics of the entry loop thus allow it to stably sample conformations ranging from "closed" (including one where Lys 151 and Asp 352 make a salt bridge, dividing the NTP tunnel entry in two) to "open" (including one where the entry loop sticks out from the protein, Fig. 2C). When we remove UTP and Mg(B) (binary complex), the mobility of the entry loop increases and the distribution of distances shifts, as the loop samples the open conformations more and the closed ones less (Fig. 2D).
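The two quantities used in this paragraph, per-atom RMSF and the distribution of a single interatomic distance, can be illustrated with a minimal NumPy sketch. It assumes the trajectory is already loaded as an array of superimposed frames; the function names and the toy two-atom trajectory are hypothetical and are not taken from the analysis tools actually used in this work.

```python
import numpy as np

def rmsf(coords):
    """Per-atom RMSF for coords of shape (n_frames, n_atoms, 3).

    Assumes the frames were already superimposed on a common reference.
    """
    mean_pos = coords.mean(axis=0)                    # average position of each atom
    sq_dev = ((coords - mean_pos) ** 2).sum(axis=2)   # squared deviation per frame and atom
    return np.sqrt(sq_dev.mean(axis=0))

def distance_series(coords, i, j):
    """Distance between atoms i and j in every frame."""
    return np.linalg.norm(coords[:, i] - coords[:, j], axis=1)

# Toy data (illustrative only): one nearly rigid atom and one mobile atom.
rng = np.random.default_rng(0)
n_frames = 500
rigid = rng.normal(0.0, 0.1, (n_frames, 1, 3))
mobile = rng.normal(0.0, 2.0, (n_frames, 1, 3)) + np.array([10.0, 0.0, 0.0])
traj = np.concatenate([rigid, mobile], axis=1)

fluct = rmsf(traj)                                    # the mobile atom has the larger RMSF
counts, edges = np.histogram(distance_series(traj, 0, 1), bins=20)
```

With real data, atoms i and j would be the Lys 151 and Asp 352 side chain atoms, and peaks in the histogram would correspond to the alternate loop positions discussed above.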

NS5B residues interfering with access to and egress from the active site
In base polymerases, the NTP tunnel is RdRp-specific (see "Discussion") and how nucleotides shuttle through it has been an almost completely unaddressed question. We therefore set out to explore how the UTP could enter the NS5B NTP tunnel and reach the active site. As a starting approach to locate residues likely involved during NTP entry, we first placed the UTP with Mg(B) at the tunnel entry 40 Å from Mg(A), which we kept at first at the active site. We then used biased MD protocols to direct UTP toward the active site.

Targeted molecular dynamics simulations on 4 systems (2 UTP orientations × 2 loop conformations)
We first used targeted molecular dynamics simulations (TMD). For the starting point we chose the side of the larger opening remaining when Lys 151 forms a salt bridge to Asp 352 (Fig. S2, panels 1 and 2). We denote this side as "Asp 387 side," whereas the other side is "Lys 51 side." In TMD an extra energy term is added that depends on the differences between a set of atoms in the simulated system and the same set in a reference, target system. In all cases the target system was our initial, energy minimized and equilibrated ternary complex system. The difference between the simulated and target systems was computed as the root mean square deviation (RMSD) between UTP in the simulated system and UTP in the target system. The extra energy term increased with the square of RMSD, thus pushing UTP toward its position and orientation in the ternary complex. Mg(B) was not included in the RMSD target, but still remained associated with UTP in all simulations. In view of the previous results, we tried two starting NS5B conformations, one with the entry loop closed and the other open (Fig. 2C, Fig. S2). These conformations were chosen by clustering of the entry loop conformations in the 5 binary complex simulations using our in-house clustering util-
In all 4 systems UTP enters the tunnel up to a point, then is checked in its progression (Fig. S3 for the closed entry loop, UTP base-oriented case). The main contribution to this check seems to be RdRp motif F3's Arg 158, whose side chain keeps UTP out of the active site. Increasing the force constant to 0.1 kcal/mol/Å² allows UTP to finally pass the Arg 158 block, pushing aside the Arg 158 side chain. Even after this forced passage, UTP is not correctly inserted in the active site and neither makes a Watson-Crick base pair to the template +1 adenosine nor has its α-phosphate coordinating Mg(A).
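The TMD bias described in this section, an energy term growing with the square of the RMSD between the simulated and target UTP, can be sketched as follows. This is a minimal illustration under stated assumptions: the 1/2 prefactor, the absence of per-atom normalization, and the function names are ours, and MD packages differ in these conventions. No structural superposition is performed before the RMSD.

```python
import numpy as np

def rmsd(current, target):
    """RMSD between two (n_atoms, 3) coordinate sets, without superposition."""
    diff = current - target
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))

def tmd_energy(current, target, k, rmsd_goal=0.0):
    """Harmonic TMD-style bias, 0.5 * k * (RMSD - rmsd_goal)**2.

    k is in kcal/mol/A^2. rmsd_goal = 0 pulls toward the target;
    a large rmsd_goal pushes away from it instead.
    """
    return 0.5 * k * (rmsd(current, target) - rmsd_goal) ** 2

# Toy nucleotide of three atoms displaced by 3 A along x from its target pose.
target_pose = np.zeros((3, 3))
current_pose = target_pose + np.array([3.0, 0.0, 0.0])

pull_in = tmd_energy(current_pose, target_pose, k=0.1)                    # entry-type bias
push_out = tmd_energy(current_pose, target_pose, k=0.1, rmsd_goal=30.0)   # exit-type bias
```

Because the bias depends only on the RMSD value, a stronger force constant raises the penalty for any given distance from the target, which is how a blocking side chain can eventually be forced aside.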

Steered molecular dynamics simulations of the base-oriented systems
Steered molecular dynamics (SMD) is a complementary method to the previous TMD to investigate incoming UTP at the active site. To better assess how Arg 158 interferes with the establishment of base pairing to the template RNA we performed 30 SMD, most on the open entry loop (initial UTP orientation) and a few on the closed entry loop (initial UTP orientation) systems. During the SMD simulations, a time-dependent external force is applied between two selected atoms (or groups of atoms) to define the direction forcing the UTP to move from its initial position to the target position (a straight line in this case). Parameter values were varied for the velocity constant, the atoms on which the force was applied, and the force constant. In all cases, from the entrance of the tunnel to Arg 158, the nucleotide advances steadily and continuously. Then the nucleotide abuts on the salt bridge between Arg 158 and Glu 143 of motif F1 and at that moment we observe four scenarios (Fig. S4, panels A, B, C, and D, respectively). 1) The progress of the nucleotide is stopped (panel A). 2) The salt bridge Arg 158-Glu 143 breaks off before the nucleotide arrives, clearing the passage to the template +1 adenosine. 3) The nucleotide passes next to the salt bridge but is poorly positioned to reach the base to be interrogated. 4) The nucleotide goes under the salt bridge and can reach the template strand. We could obtain all four scenarios with both conformations of the entry loop and similar work was required for the UTP trajectories. Interestingly, in scenario 2, the salt bridge Arg 158-Glu 143 breaks while the incoming UTP is still far away, about 36 Å from its target (Fig. S4B, right panel). This is because Glu 143 can form a salt bridge with Lys 141, whereas Arg 158 interacts with the template +1 adenosine (π-cation interaction). These results further highlight Arg 158 in motif F3 and more precisely its salt bridge with Glu 143 in motif F1 as a possible dynamic checkpoint in nucleotide entry.
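The time-dependent pulling force of SMD can be sketched in its common constant-velocity form, where a virtual spring anchored to a point moving along a straight line drags the pulled group. The exact SMD variant and parameter values used here are not specified in the text, so this form, the function name and the numbers below are assumptions for illustration only.

```python
import numpy as np

def smd_force(position, start, direction, k, v, t):
    """Constant-velocity SMD force on the pulled group.

    The spring anchor moves along the unit vector n at speed v;
    the force is k * (v*t - progress) * n, where progress is the
    displacement of the pulled group projected on n.
    """
    n = np.asarray(direction, dtype=float)
    n = n / np.linalg.norm(n)                       # unit pulling direction
    progress = np.dot(np.asarray(position) - np.asarray(start), n)
    return k * (v * t - progress) * n

start = np.array([0.0, 0.0, 0.0])
axis = np.array([2.0, 0.0, 0.0])                    # straight line toward the target

# Nucleotide has not moved yet although the anchor advanced by 1 A: it is pulled forward.
lagging = smd_force(start, start, axis, k=5.0, v=0.01, t=100.0)
# Nucleotide has kept up with the anchor: no net force.
in_step = smd_force(start + np.array([1.0, 0.0, 0.0]), start, axis, k=5.0, v=0.01, t=100.0)
```

In this picture, a nucleotide stalled at an obstacle such as the Arg 158-Glu 143 salt bridge accumulates lag behind the anchor, so the applied force keeps rising until the nucleotide passes or the simulation ends.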

Nucleotide interactions once at the active site: a locked down UTP
In view of this, we examined the behavior of Arg 158 in the ternary complex, once UTP has entered the active site. In the initial structure (PDB code 4WTA), Arg 158 is sandwiched between Glu 143 and UTP and makes two stable interactions with the latter. It establishes on the one hand a salt bridge to the α-phosphate and on the other hand a cation-π interaction with the base. In our 5 unbiased simulations of ternary complexes, these contacts are maintained throughout, as well as Watson-Crick interactions of the UTP base with the +1 template adenine (>70%) and classical Mg²⁺-mediated electrostatic interactions on its triphosphate side. Only ribose interactions with the protein are dynamic, with the 2′-OH environment being particularly labile on one side. These observations highlight a locking down of the active site after nucleotide insertion where Arg 158 contributes to keeping in place triphosphate and base moieties, whereas the ribose moiety is more loosely bound.
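An occupancy figure such as the one quoted for Watson-Crick pairing can be computed as the fraction of frames in which both base-pair hydrogen bonds are present at once. The sketch below assumes a purely geometric criterion with a 3.5 Å donor-acceptor cutoff; the criterion actually used for the percentage above is not stated in the text, and the function name and toy distances are hypothetical.

```python
import numpy as np

def pair_occupancy(d_first, d_second, cutoff=3.5):
    """Fraction of frames where both hydrogen-bond distances are within cutoff (in A)."""
    d_first = np.asarray(d_first)
    d_second = np.asarray(d_second)
    both_formed = (d_first <= cutoff) & (d_second <= cutoff)
    return float(both_formed.mean())

# Toy per-frame distances for the two hydrogen bonds of an A-U pair
# (adenine N1 to uracil N3, and adenine N6 to uracil O4).
n1_n3 = [2.9, 3.0, 4.0, 3.1]
n6_o4 = [3.0, 3.6, 3.0, 3.2]
occupancy = pair_occupancy(n1_n3, n6_o4)
```

Requiring both distances simultaneously is stricter than averaging the two bonds separately, so the reported value depends on which convention is chosen.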

Exit TMD
To further probe this locking down, we also performed reverse TMD from the ternary UTP complex (not shown). We again used the initial ternary complex as the target system, but set the target RMSD to 30 Å instead of 0 to push UTP out of the active site. With a force constant of 0.01 kcal/mol/Å², UTP does not budge from the active site (final RMSD of about 2 Å). Increasing the constant to 0.1 kcal/mol/Å² finally pushes UTP out of the active site (final RMSD goes above 20 Å). However, UTP does not then pass Arg 158 to go to the NTP tunnel but pushes aside the −1 template-primer base pair and the surrounding protein. The two magnesium ions remain attached to the triphosphate moiety and pull along the three coordinating aspartates: Asp 220, Asp 318, and Asp 319.

A minimally biased approach of the NTP tunnel: distance-restrained MD from "open entry loop"
Having found a likely checkpoint along the NTP tunnel, we next searched how an incoming nucleotide would actually approach the tunnel. For this we did not use targeted or steered MD, as they introduce important biases. For instance in TMD, "large" conformational changes, which reduce the target RMSD most, are strongly biased to occur first (26). In the case of UTP entry this means that translation in a straight line toward the target position will tend to occur first. The starting nucleotide position is therefore very important as the nucleotide may be on an improper pathway and/or become trapped in an improper orientation. There may also not be enough time for the protein to adjust to the nucleotide's presence. Therefore we designed minimally biased MD protocols to explore both translational and rotational components of diffusion of UTP at the NTP tunnel entry. We started simulations from 9 initial UTP positions with centers of mass 40 Å from Mg(A). The positions (labeled 1-9 on Fig. 3A) were chosen to cover the space outside the NTP tunnel, with position 1 corresponding to the starting point of TMD (see above). In view of our previous results we selected as initial NS5B conformation one that is typical of an open entry loop in binary complexes after clustering of the trajectories (Fig. 2, C and D). We only biased the dynamics by adding a harmonic energy term when the distance to Mg(A) is more than 30 Å (Fig. 3A, right panel), with 5 energy constants: 0 (no bias), 0.02, 0.05, 0.1, and 0.5 kcal/mol/Å². Four ns were sufficient for UTP to settle on the protein in most of the 45 simulations (Fig. 3B, Fig. S5).
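The one-sided restraint and the resulting simulation grid can be written down in a few lines. This is a sketch under assumptions: the 1/2 prefactor of the harmonic term and all names are ours, since the text only states that a harmonic term applies beyond 30 Å.

```python
from itertools import product

def restraint_energy(distance, k, threshold=30.0):
    """Flat-bottom harmonic restraint on the UTP to Mg(A) distance (in A).

    Zero at or below the threshold, 0.5 * k * (distance - threshold)**2 above it.
    k is in kcal/mol/A^2.
    """
    excess = max(0.0, distance - threshold)
    return 0.5 * k * excess ** 2

start_positions = range(1, 10)                     # positions 1-9 of Fig. 3A
force_constants = (0.0, 0.02, 0.05, 0.1, 0.5)      # kcal/mol/A^2, 0 means no bias
runs = list(product(start_positions, force_constants))

energy_at_start = restraint_energy(40.0, k=0.1)    # UTP starts 40 A from Mg(A)
energy_inside = restraint_energy(25.0, k=0.5)      # inside 30 A the bias vanishes
```

The grid of 9 positions by 5 constants gives the 45 simulations mentioned above, and the vanishing bias below 30 Å is what makes any further approach of UTP to the tunnel spontaneous.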
Visual examination of the trajectories shows that with these protocols, UTP indeed samples many orientations before coming to the protein's surface, as is apparent also from the variety of final orientations that may be reached from the same starting point (e.g. Fig. 3D). Interestingly, in many cases the UTP distance to Mg(A) converges below 30 Å, where no restraining force is applied any longer. This is particularly apparent for starting position 3 (Fig. 3B), but for 8 of 9 starting positions at least one simulation displays this behavior, sometimes even with no force applied at any point (red curves).
To quantify and visualize regions where UTP accumulates, we computed a density map of UTP in all 45 simulations (Fig.  3C). A prominent apparent accumulation site lies at a surface pocket on the outside of the palm domain. We dismiss this site as artifactual because it is actually contributed to only by simulations from position 9, the only starting position where UTP never goes much below 30 Å from Mg(A) (Fig. S5). Clearly the restraint force is what traps UTP there. The relevant features of the map show that UTP goes preferentially to the entry of the NTP tunnel. We can thus define for NS5B a "capture cone," one limit of which is position 9, which extends from the NTP tunnel and in which nucleotides will tend to bind at the tunnel's entry.
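A density map of this kind amounts to binning nucleotide positions from all frames on a 3D grid. The sketch below uses one point per frame (for example the UTP center of mass) and a hypothetical grid; the grid spacing, the atoms counted and the function name are assumptions, as the text does not give these details.

```python
import numpy as np

def density_map(points, origin, spacing, shape):
    """Fraction of points falling in each voxel of a regular 3D grid.

    points: (n, 3) positions pooled over frames; points outside the grid are ignored.
    """
    points = np.asarray(points, dtype=float)
    idx = np.floor((points - np.asarray(origin)) / spacing).astype(int)
    inside = np.all((idx >= 0) & (idx < np.array(shape)), axis=1)
    grid = np.zeros(shape)
    np.add.at(grid, tuple(idx[inside].T), 1.0)      # count points per voxel
    return grid / max(int(inside.sum()), 1)

# Toy positions: three frames in one voxel, one in a neighbor, one off the grid.
positions = [[0.5, 0.5, 0.5]] * 3 + [[1.5, 0.5, 0.5], [5.0, 5.0, 5.0]]
grid = density_map(positions, origin=(0.0, 0.0, 0.0), spacing=1.0, shape=(2, 2, 2))
```

Building one such map per starting position, in addition to the pooled map, makes it easy to spot a density peak that is fed by a single starting position, as with the surface pocket dismissed above.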
Contouring the map at a higher level (red density), we find two locations on either side of the entry loop: a fairly wide density at 20-25 Å lined on one side by residue Asp 387 and a more sharply localized one at 15 Å near Lys 51, corresponding to the two distance plateaus below 30 Å that may be reached by UTP (e.g. Fig. 3B). Both densities are contributed to by nucleotide poses where the triphosphate (still with its Mg(B)) makes most interactions to the protein, whereas the ribose and base may be oriented in several directions but generally point outside the NTP tunnel. Indeed the end points from starting position 3 sample these positions and orientations and are shown
on Fig. 3D. Replicating some of the calculations (not shown) we found that UTP could end up from the same initial position and with the same restraint force either on the same side of the entry loop but in a different orientation, or on the other side of the entry loop. We next tried to extend simulations beyond 4 ns (not shown). After 13 ns we found that the nearer positions (15 Å) had remained stable once reached, whereas the farther (20 -25 Å) generally had not. Indeed in one case UTP had gone from a farther to a nearer position within the initial 4-ns simulation (Fig. S5, position 8, blue curve). Interestingly, the part of the 15 Å density closest to the active site overlaps with the "interrogation (I) site" (Fig. 3E) that was previously identified as a site that bound NTP by their triphosphate with ribose and base crystallographically disordered (27). From these observations, we conclude that this region likely represents the end of the capture cone and that nucleotides will tend to funnel toward it, albeit in an undefined orientation.

The NTP tunnel entry is a nucleotide orientation location
We reasoned that the next step in nucleotide entry either occurred at a much larger time scale than a few nanoseconds or depended on an extra trigger. An obvious one is the magnesium Mg(A). Indeed, in crystallographic structures of RdRp, Mg(A) is seen at the active site only with the fully inserted nucleotide, i.e. in ternary complexes. However, in some cases a "noncatalytic ion" site has been reported 5-7 Å away from the catalytic site, as measured after superimposition of the RdRp backbones. The hallmark of this noncatalytic site is that the counterpart of NS5B Asp 319 is part of the Mg²⁺ coordination sphere but is flipped away from the 3′ end of the RNA primer (Fig. 1B) and toward the NTP tunnel (Ref. 20, and references therein). The recent work by Gong and colleagues (17,18) has shown that Mg(A) may be located at this noncatalytic site during nucleotide entry. Therefore, on the one hand we used accelerated molecular dynamics (aMD), a technique for exploring the conformational space of biomolecules that accelerates the state to state evolution of a system relative to normal molecular dynamics (28). On the other hand, we considered several hypotheses as to the location of Mg(A) during the initial steps of nucleotide entry, either at the noncatalytic position or missing altogether.
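To make the idea of acceleration concrete, the sketch below implements the boost potential in the form commonly used for aMD, where energy wells lying below a threshold are raised while barriers above it are left untouched. Whether the boost was applied to dihedral or total potential energy, and the threshold and tuning values, are not given in this text; the function name and the numbers are illustrative assumptions.

```python
def amd_boost(potential, threshold, alpha):
    """Boost energy added by aMD to a configuration with the given potential energy.

    dV = (E - V)**2 / (alpha + E - V) when V < E, and 0 otherwise,
    with E the threshold and alpha > 0 a tuning parameter (same energy units as V).
    """
    if potential >= threshold:
        return 0.0
    gap = threshold - potential
    return gap * gap / (alpha + gap)

deep_well = amd_boost(-10.0, threshold=0.0, alpha=10.0)   # well 10 units below the threshold
barrier_top = amd_boost(5.0, threshold=0.0, alpha=10.0)   # above the threshold: unchanged
```

A smaller alpha flattens the modified surface more strongly, which speeds up transitions between states at the cost of a larger reweighting correction if unbiased populations are needed.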

Simulations of binary complexes with Mg(A) at noncatalytic position or missing, and subsequent distance restrained MD (drMD)
First we ran preparatory simulations of binary complexes as our previous ones (Fig. 2, B and D) but now with Mg(A) either at the noncatalytic site (Fig. 4A, middle) or removed altogether (Fig. 4A, right). Mg(A) was as stable at the noncatalytic site as at the catalytic site over 100 ns. We found that the mobility of the entry loop was much affected, as is evident in the Lys 151-Asp 352 distance distribution (Fig. 4A, red histograms). Contrary to our expectations, in the case of Mg(A) at the noncatalytic site the
two extra positive charges near Asp 352 actually increase the frequency of the Lys 151 -Asp 352 salt bridge (i.e. of the closed entry loop conformation). Also unexpectedly, the absence of any Mg(A) quickly induces a particular, retracted conformation of the loop where it establishes several salt bridges including Lys 151 -Asp 387 , a conformation that is at best very minor in the other two kinds of simulations (Fig. 4A, blue histograms). We performed new drMD with UTP (and Mg(B)) starting from position 3 (Fig. S6), using this "retracted loop" conformation for systems with no Mg(A) and an open loop conformation for systems with Mg(A) at the pre-catalytic position (Fig. 4A, stars). The results were qualitatively similar to those with Mg(A) at the catalytic site, in that UTP bound stably at the entry of the NTP tunnel in the same general places and also in a variety of orientations, but there were differences in details. Generally simulations took longer to converge and reached slightly different positions, as evidenced by the distances between the UTP center of mass and residue 319 of motif C (Fig. 4B) and by examination of the positions after convergence (Fig. 4C).

Spontaneous nucleotide orientation observed by accelerated molecular dynamics
Next we performed accelerated molecular dynamics simulations from positions shown in Fig. 4, B (arrows) and C. With Mg(A) at the catalytic position or with no Mg(A) we find that UTP flips to insert its base moiety into the NTP tunnel, pointing toward the RNA template (Fig. 5, A and B). This movement can be followed by monitoring the distance between the base N1 and Arg 48 C (Fig. 5, A and B, right panels, green curves), as the Arg 48 side chain lines the side of the tunnel, interacting at this point with Asp 225 of motif A. Early in aMDs this distance drops to a plateau of 6 Å, a feature that follows soon after establishment of interactions of Asp 220 also in motif A with the ribose moiety (Fig. 5, A and B, blue curves). The details of interactions are different, with the nucleotide bound much more stably when no Mg(A) is present. Close examination of the trajectories shows that in this case, all three nucleotide moieties interact strongly with the NS5B conserved residues, as the triphosphate moiety is pinched between Lys 51 at the tunnel entry (red
curve) and Lys 154 , positioned on the other side in the "retracted entry loop" conformation (orange curve). Only the first interaction is established in the "open entry loop" conformation of the system with Mg(A) at the catalytic site. As to the case of Mg(A) at the noncatalytic site, the nucleotide does not flip toward the template RNA at all (Fig. S7), although we tried either initial base orientation (Fig. 4C, middle) and in both cases the nucleotide base samples several positions and orientations. The problem seems to be that with Mg(A) stably bound at the noncatalytic position, the triphosphate moiety establishes many strong interactions when the nucleotide binds more deeply in the NTP tunnel (Fig. 4B, middle), whereas the Lys 154 -Lys 51 pincer of the case with no Mg(A) allows much more rotational freedom.

Mg(A)-dependent unmasking and initial base pairing of the +1 templating base
In view of these results, we continued with the system with no Mg(A). Extending the aMD did not lead to further progress of the nucleotide toward the template RNA. Indeed at this stage we find the check by Arg 158 that we noted in directionally biased simulations (TMD and SMD, see above). It is all the more prominent here because, when the nucleotide flipped, the Arg 158 side chain broke its salt bridge to Glu 143 and moved to a position where it stacks against the −1 base of the primer strand, i.e. right between the +1 template base and the nucleotide base (Fig. 5C). We noted in simulations of binary complexes that the presence and location of Mg(A) affected the dynamics not only of the entry loop (Fig. 4A), but also of Arg 158 (not shown). We therefore tried adding Mg(A) at the noncatalytic site and performed a subsequent 20-ns simulation (Fig. 6, A-C). The main effect is a large increase in the dynamics of motif F3 (Fig. 6, B and C). Initially Arg 158 Cα moves away from the primer's 3′ end by 2.3-2.6 Å, followed by successive changes in side chain rotamers and interactions to Glu 143 that break the Arg 158 side chain contact with the P-1 base (Fig. 6B). Arg 158 then cycles between two major conformations (Fig. 6C, clusters 6 and 7). Compared with cluster 7, cluster 6 seems stabilized by the UTP base being sandwiched between Arg 158 and Arg 48, with Arg 158 itself stabilized by its salt bridge to Glu 143. We performed targeted MD simulations from this conformation with the same parameters as described previously (Fig. S3, orange curves). This time, however, the Arg 158 guanidinium group moved readily around the UTP base to reach the α-phosphate, allowing pairing with the +1 base (Fig. 6D), in striking similarity to the poliovirus RdRp preinsertion complex (Fig. 6E).

GTP entry with C as +1 base can be recapitulated as UTP's with A as +1 base
Having found a likely path through the NS5B NTP tunnel for UTP, we sought to check the generality of our observations by using the same procedures with another NTP. We chose GTP, a purine, as a more trying test with the larger guanine base size. We generated a system identical to the one in Fig. 4A, right, namely with no Mg(A) and a retracted NS5B entry loop, but with GTP as entering nucleotide starting from position 3 and C as +1 base. Distance-restrained MD (Fig. 7, A and B) shows the retracted loop conformation also allows GTP binding at the tunnel entry in several conformations, although at only one site (likely due to the larger base size) and with differences in details. As with UTP, aMD then allows nucleotide orientation (Fig. 7C) thanks to interactions of the triphosphate with Lys 51 and its base with the tunnel's side (Fig. 7D), which rearranges similarly with a salt bridge between Arg 48 and Asp 225. Again details differ, most obviously as the guanine base and ribose are almost at a right angle to UTP's in the former system. Still, adding Mg(A) at this stage also has the effect of increasing dynamics of Arg 158 leading to unmasking of the +1 base (Fig. 7, E and F). The set of Arg 158 interactions, however, is different from the former case of the UTP system with A as +1 base, as is readily apparent from the lack of exchange of the Arg 158-Glu 143 salt bridge (compare Fig. 7F with Fig. 6B). Although a major conformation of Arg 158 again emerges, it is not the same as found in the former system. Still, TMD from this major conformation also leads to Watson-Crick base pairing with the +1 C and to Arg 158 binding the α-phosphate, constitutive of RdRp preinsertion (Fig. 7G).

GTP entry with a +1 base A mismatch
Finally, we sought to find out which of the above-mentioned differences were due to a different +1 base and which to a different entering nucleotide, and to assess how NS5B would discriminate a Watson-Crick mismatch. We thus performed the same successive steps for a system with A as +1 base and with GTP, allowing comparison with both previous systems (Fig. S8). We found the same behavior for GTP in the early binding and orientation steps. After these two steps the GTP poses are superimposable regardless of the +1 base. In contrast, the dynamics of Arg 158 were dependent on the +1 base. They were identical to the system with UTP and A as +1 base, before and after Mg(A) addition (e.g. compare Fig. S8F to Fig. 6B). This resulted in the same major Arg 158 conformation after Mg(A) addition. TMD from this conformation did not allow GTP to pass Arg 158 with the same force constant (0.05 kcal/mol/Å²) as the one applied for the previous GTP TMD. The G and A bases remained apart (Fig. S8G). Further increasing the force constant was necessary to bring the bases within hydrogen bonding distance, but even then Arg 158 did not come around to contact the GTP α-phosphate.
From all this we conclude that GTP may advance as far as the base orientation step in the NTP tunnel but will not in general reach preinsertion, even with an available Mg(A), if the +1 base is A. On the other hand, Mg(A) is sufficient for +1 base unmasking by Arg 158. The atomic details of Arg 158 dynamics depend on the nature of the +1 base. Thus Arg 158 seems to be an integral part of nucleotide selection by Watson-Crick base pairing.

Discussion
The nucleotide addition cycle of base polymerases involves several crucial steps. Starting from the product state of the preceding addition (step 1 in Fig. 8), the template-primer nucleic acid must translocate (1 to 2) to allow subsequent probing of the template +1 base by incoming nucleotides (step 6), transfer to the active site proper (6 to 7) and the chemistry step with release of pyrophosphate (7 to 1). In all base polymerases except RdRp (i.e. in replicases, transcriptases, and reverse transcriptases), a large movement (typically a 20° rotation) of the fingers domain occurs between steps 2 and 7 as the fingers close in on the correct incoming nucleotide. The structure of the preinsertion complex (step 6) of the T7 RNA polymerase (a transcriptase) shows that this occurs essentially between steps 6 and 7 (21). Biomolecular simulations have proved extremely helpful in complementing the wealth of structural and biochemical data on base polymerases. For instance, pyrophosphate release was found thereby to be largely independent of the structural changes associated with fingers reopening and translocation in the T7 RNA polymerase (i.e. 1 and 2 in Fig. 8 are distinct states) (29). This finding is consistent with the emerging picture that catalysis and translocation are actually uncoupled in base polymerases including RdRp (30). Similarly, simulations with ribonucleotide or antiviral ribonucleotide analogues at the active site of the mitochondrial RNA polymerase allowed computing differences in affinity that match experimental differences (31).

[Figure legend fragment] … Fig. 5C show a reversal of Arg 158 interaction with base P-1 as the salt bridge with Glu 143 is re-established. C, clustering of the Arg 158 conformation along the trajectory with a timeline (top) and a distribution of clusters (bottom). The representative snapshot from the major cluster 6 used for A is indicated by the cyan arrow. D, after targeted molecular dynamics (see text for details), base pairing with the template base (labeled "+1") is readily established. E, the same view for the preinsertion complex crystal structure for the poliovirus RdRp obtained with a 2′,3′-ddCTP (PDB 3OLB) (17) is shown for comparison with D.
Still, there is next to no data on the path followed by nucleotides to access the active site after translocation. This may not be a large issue for non-RdRp base polymerases, whose active sites are readily accessible before fingers closing. But the RdRp fingers remain closed throughout the nucleotide addition cycle. This raises the question of how NTPs access the active site through the restricted NTP tunnel and what quality checks they encounter. Here we trace the path of incoming UTP and GTP to the cognate +1 base of the template RNA for hepatitis C virus RdRp NS5B using a variety of biased and unbiased molecular dynamics simulation methods. We find that once the nucleotide, together with an associated Mg²⁺ ion, Mg(B), enters a cone extending from the entry of the NTP tunnel, it is attracted to the tunnel entry (Fig. 8, steps 2-3). The interactions leading to this binding are mostly electrostatic and involve the triphosphate moiety (step 3). At a longer time scale, the ribose and base spontaneously reorient and UTP or GTP enters the NTP tunnel base first (step 4). This necessary step is thus achieved well upstream of base pairing by the organization of the tunnel entry at this stage. Indeed, it is observed identically when the +1 template base is mismatched with the entering nucleotide. A major determinant of this organization is a mobile entry loop harboring what has been termed "motif F2" (11). Lys 151 in F2 may form alternate salt bridges to Asp 352 on the other side of the tunnel, or to Asp 387 on the same side, stabilizing the entry loop in opposite conformations and consequently making up alternate and very different tunnel entries.
These entry loop conformations, the preferential nucleotide-binding positions, and the subsequent nucleotide reorientation are all strongly affected by the presence and location of magnesium ion Mg(A). Thus the tunnel entry is widest and UTP reorients most easily and stably when no Mg(A) is present in our simulations. Furthermore, in this case GTP also binds and reorients at the tunnel entry despite its larger size. On the other hand, further progress toward the active site after nucleotide reorientation depends on Mg(A) arriving at the RdRp noncatalytic site (Fig. 8, steps 4-5). Gong and colleagues (17,18) showed for the enterovirus RdRp that the movement of Mg(A) from the noncatalytic to the catalytic position was a major determinant of active site closure in the very last stages of nucleotide entry. We propose here that Mg(A) is involved in nucleotide entry already from a very early stage. Its dissociation from the active site region after the previous nucleotide incorporation, either at the pyrophosphate release step as depicted in Fig. 8 (steps 7 to 1), or together with translocation (steps 1 to 2), thus triggers a rearrangement of the NTP tunnel entry that signals availability of the active site for the next +1 base probing. After a new nucleotide orientation at the tunnel entry, Mg(A) passage at the noncatalytic site signals further nucleotide advance to check for pairing to the +1 base. This ensures the simultaneous presence of two magnesium dications with the nucleotide for active site closure, if Watson-Crick base pairing is successful.
The roles of Mg(A) in pre-chemistry steps have been much studied experimentally for base DNA polymerases, notably by stopped-flow assays. For instance, in HIV1 reverse transcriptase a combined experiment and simulation study using Be²⁺ to substitute for Mg(B) highlighted a specific role of Mg(A). After fingers closing upon incoming NTP binding, Mg(A) would bind and stabilize a catalytically competent active site conformation (32). A similar mechanism involving late binding of Mg(A) is found in bacterial base polymerases, for instance, in an experimental study of the pre-chemistry conformational transitions of the Klenow fragment of DNA polymerase I. Alternately mutating the two conserved aspartates Asp 705 and Asp 882 from motifs A and C (corresponding to NS5B Asp 220 and Asp 318) shows that the latter (which can bind an Mg²⁺ ion at the active site even in the absence of nucleotide) is required for fingers closing (33). In contrast, the former aspartate is necessary for a subsequent conformational change, likely induced by Mg(A) entry. This late involvement of Mg(A) is clearly similar to the active site closure of RdRp seen when comparing the preinsertion to the ternary complexes of enterovirus RdRp (17)(18)(19). Our work indicates that the steps of NTP binding (with Mg(B)) and fingers closure are replaced in RdRp by shuttling through the NTP tunnel, and that this early process is critically dependent on Mg(A) for limited but essential conformational adjustments in the fingertips. In accordance with this extra early role, Mg(A) can be found at the noncatalytic site in structures of nonternary RdRp complexes crystallized at high Mg²⁺ concentration (20), and residues making up the site (including Asp 219) are conserved throughout RdRp supergroups.
The patterns of residue conservation along the NTP tunnel show successive circles receding from the active site. Thus motif F3 (closest to the active site) is also conserved, with Arg 158 being found in all (+)-ssRNA virus RdRp and even in reverse transcriptases, where its counterpart (Arg 72 in HIV1 RT) binds the incoming NTP upon fingers closing (5,32,34). Further away, motif F1, which we show here contributes prominently to the Arg 158 dynamics that substitute for fingers closing, is not present in reverse transcriptases. However, it is readily detectable in all (+)-ssRNA virus RdRp, with the counterpart of Lys 141 strictly conserved and the counterpart of Glu 143 strictly conserved in supergroups I and II (it is an aspartate in supergroup III) (10,12). In contrast, residues involved in nucleotide reorientation at the tunnel entry are not detectably conserved when considering all RdRp of (+)-ssRNA viruses. For instance, the segment between motifs F1 and F3 varies widely in length between virus families (from 5 to >50 residues) and the entry loop at its tip that harbors motif F2 is highly mobile (35), rendering both sequence-only and structure-based alignments dubious. However, in this third circle residues are conserved within virus families, whether they participate in triphosphate or base stabilization in our simulations. Thus at the tunnel entry motif F2 (including Lys 151), Arg 48 and Lys 51 are strictly conserved in all genotypes of HCV and in its closest relatives. Motif D on the other side of the tunnel from motif F2 also displays this pattern of conservation within families, for instance, in flaviviruses and enteroviruses (35). In enterovirus RdRp a lysine from motif D (not present in HCV NS5B) could serve in reorientation of the nucleotide. This lysine has also been proposed to be directly involved in catalysis and fidelity control (36), but the two roles are by no means mutually exclusive. We propose that nucleotide reorientation at the NTP tunnel entry is a general feature of RdRp but that it may be achieved by different means in different families. Indeed, mechanistically it only requires that the triphosphate moiety be bound by suitably oriented basic residues allowing rotational mobility, and that favorable interactions, not specific to the base, be made available for the base. In support of this, we find here that this is what happens for both UTP and GTP at the tunnel entry of HCV NS5B. The GTP base and triphosphate are bound and oriented at the same spot as UTP, but in distinct orientations and by making distinct contacts to the same NS5B residues.
As stated, and in contrast to nucleotide reorientation, in the next stages of UTP entry we find only residues conserved among (+)-ssRNA RdRp of all three supergroups (10). The central residue regulating template base probing by the incoming nucleotide base is the strictly conserved arginine of motif F3 (Arg 158 for HCV NS5B, a supergroup II RdRp). Arg 158 achieves its limited but essential movements through an Mg(A)-dependent mobility that is surprising for such a buried residue, particularly as it involves in part its main chain. Sofosbuvir, the ribose-modified nucleotide analogue that is so successful in the treatment of hepatitis C, selects a rare resistance mutation in a residue that is close to the ribose in ternary complexes, but also another mutation, L159F, that is buried in the protein core away from the nucleotide. This highlights the importance of motif F3 main chain dynamics in nucleotide selection (37). The other factor in Arg 158 dynamics is its ability to form alternate networks of salt bridges involving conserved residues of motif F1 (Glu 143 and Lys 141 in HCV NS5B). The counterpart of K141R was selected as a resistance mutation for the antiviral favipiravir (a purine analogue) in the RdRp of chikungunya virus (supergroup III) (38). The same mutation engineered into enterovirus B3 (supergroup I) led to an RdRp with lower fidelity and increased susceptibility (rather than resistance) to favipiravir (39). Again these findings are not ascribable to direct interaction with the analogue but are consistent with the regulation that we see here of F3 dynamics by F1 being important for base selection in all three supergroups of (+)-ssRNA RdRp.
In conclusion, our work points to the dependence on magnesium of conformational adjustments in viral RdRp that allow these enzymes to shuttle incoming nucleotides to the preinsertion state, thus achieving template probing and eventually initial base-pairing through completely different mechanisms than the other, fingers-closing base polymerases. First, the preclosed fingers of RdRp imply that incoming NTP must be suitably oriented at the entry of their NTP tunnel, something that is probably achieved in different ways by different RdRp families. Second, the counterpart of fingers closing is achieved by magnesium-dependent mobility of motif F3, regulated by motif F1.
This divergent preinsertion mechanism explains in principle why some nucleotide analogues can be so active against viral RdRp, yet not inhibitory toward the mitochondrial RNA polymerase, a critical requirement for nontoxicity (31,40). For instance, in this scheme Gong and colleagues (17,18) proposed that local conformational changes in motif A allow a ribose check prior to preinsertion. Such a mechanism for ribose checking is unrelated to the one proposed for T7 RNA polymerase, which would instead involve residues of the closing fingers (41). Our work also provides a framework for further establishing mechanistic differences between viral RdRp and cellular polymerases.

Construction of the molecular systems
To generate our systems we used the crystallographic structure 4WTA refined to 2.8 Å resolution with a UDP in the active site base-paired to the template RNA +1 adenosine (14). To stabilize the nucleotide in the active site, Appleby et al. (14) used Mn²⁺ instead of Mg²⁺ (the likely physiological catalytic ion). Furthermore, to prevent catalysis and incorporation they used nucleoside diphosphates rather than triphosphates, the single β-phosphate in the former being a much worse leaving group than the β,γ-diphosphate in the latter (see the supplemental information to Ref. 14). We changed the two catalytic ions back to Mg²⁺ (Mg(A) and Mg(B)) and added a γ-phosphate to obtain the ternary NS5B/template-primer RNA/UTP complex (Fig. 1). In principle there is an ambiguity as to which of the two nonchelating oxygens is the bridging oxygen between the β- and γ-phosphates, but comparison with a norovirus polymerase ternary complex (PDB 3BS0) (16) makes this unambiguous. The NS5B construct crystallized in 4WTA harbored two deletions, one C-terminal (residues 571-591) to increase solubility and one internal (residues 444-453) to favor the more open RNA-binding conformation. All residues were crystallographically ordered except for a few side chains (that were modeled in stereochemically consistent starting positions) and residues beyond 542 (that were not modeled). Thus our systems comprise 534 protein residues (NS5B 1-443 and 454-542, with an added diglycine to connect 443 to 454) and 12 RNA bases (a 7-base template and a 5-base primer). We included or excluded UTP bound to Mg(B) and/or Mg(A) as indicated. For simulations of GTP entry, we either used the same NS5B/RNA system as for UTP or we mutated the +1 base to C, as indicated. Parameters for UTP were adapted and parameters for GTP were obtained from the nucleotide parameters (http://research.bmh.manchester.ac.uk/bryce/amber).
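The residue bookkeeping in the paragraph above can be checked with a few lines of Python. This is only an illustrative sketch: the segment boundaries and linker length are taken from the text, while the helper name `count_inclusive` is ours.

```python
# Residue bookkeeping for the simulated NS5B construct (numbers from the text).

def count_inclusive(first: int, last: int) -> int:
    """Number of residues in the inclusive range first..last."""
    return last - first + 1

segments = [(1, 443), (454, 542)]  # NS5B residues retained from the 4WTA construct
linker_residues = 2                # added diglycine connecting 443 to 454

n_protein = sum(count_inclusive(a, b) for a, b in segments) + linker_residues
n_rna = 7 + 5                      # 7-base template + 5-base primer

print(n_protein, n_rna)
```

The two segments contribute 443 and 89 residues, which together with the diglycine linker give the 534 protein residues quoted above.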

Unbiased molecular dynamics simulations
Molecular dynamics simulations of the ternary complexes (with UTP) and the binary complexes (without UTP) were carried out with the Gromacs program (42) using the AMBER ff99SB-ILDN force field. The systems to simulate were neutralized with Cl⁻ counterions and solvated in a periodic cubic box of explicit TIP3P water molecules with a minimum distance of 12 Å between the protein atoms and the edge of the water box. The obtained models were then energy minimized with the steepest descent method for 20,000 steps, while restraining the protein and RNA atoms and ions with harmonic potentials with a force constant of 1000 kJ/mol. Our simulation protocol starts with progressive NVT heating in two steps of 500 ps each, followed by NPT equilibration. During the NPT equilibration the restraints were gently removed in 12 successive stages, progressively releasing the restraints on protein side chains, then main chain, then RNA. The total length of the equilibration phase was 4 ns. For both equilibration and production, a 2-fs integration step was used. The cut-off distances for short-range van der Waals interactions (rvdw), Coulomb interactions (rcoulomb), and the neighbor list (rlist) were fixed at 12 Å. The particle mesh Ewald method was used for the treatment of long-range electrostatic interactions. Water molecules were constrained using the SETTLE algorithm and the parallel linear constraint solver (P-LINCS) algorithm was used for covalent bond constraints of biomolecules. After the equilibration phase, production runs of 100 ns were carried out, each simulation starting from different random initial velocities.
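As a rough illustration of how the production settings described above translate into a Gromacs parameter file, the following Python sketch assembles the corresponding `.mdp` entries. The run length, time step, cut-offs, PME, and LINCS come from the text; the choice `constraints = h-bonds`, the use of `gen_vel`/`gen_seed` to draw new velocities, and the function names are our assumptions, and thermostat and barostat settings are deliberately omitted because the text does not specify them.

```python
# Sketch: production-run .mdp entries implied by the protocol described above.

def production_mdp(length_ns: float = 100.0, dt_fs: float = 2.0,
                   cutoff_nm: float = 1.2) -> dict:
    """Return mdp key/value pairs; 12 A cut-offs are expressed in nm."""
    dt_ps = dt_fs / 1000.0
    nsteps = round(length_ns * 1000.0 / dt_ps)  # total integration steps
    return {
        "integrator": "md",
        "dt": dt_ps,
        "nsteps": nsteps,
        "coulombtype": "PME",            # particle mesh Ewald
        "rcoulomb": cutoff_nm,
        "rvdw": cutoff_nm,
        "rlist": cutoff_nm,
        "constraint_algorithm": "lincs",
        "constraints": "h-bonds",        # assumption: not stated in the text
        "gen_vel": "yes",                # assumption: new random velocities per run
        "gen_seed": -1,                  # -1 lets Gromacs pick a random seed
    }

def to_mdp_text(params: dict) -> str:
    """Format the parameters as 'key = value' lines."""
    return "\n".join(f"{key:<24}= {value}" for key, value in params.items())

mdp = production_mdp()
print(to_mdp_text(mdp))
```

With a 2-fs step, a 100-ns production run corresponds to 50,000,000 integration steps.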
Simulations were also performed with the AMBER 14 suite of programs (43) as indicated, particularly for minimizing and equilibrating systems prior to biased MD (see below). The AMBER protocol was equivalent to the Gromacs protocol.

Biased molecular dynamics simulations
All biased molecular dynamics simulations were performed using the AMBER 14 suite (43).

aMD simulations
aMD (28) is a technique for exploring the conformational space of biomolecules that accelerates the state-to-state evolution of a system relative to normal molecular dynamics, thereby making rare events accessible in reasonable simulation times. In aMD, a positive boosting potential, ΔV(r), is applied to the system if the potential energy drops below a certain energy threshold (E). We used the dual boosting potential of the AMBER implementation of aMD (iamd = 3). We estimated the AMBER aMD input parameters for the boost to the total potential and the extra boost to the torsion potential (EthreshP, αP, EthreshD, and αD) by using the average energy values computed from a 4 to 12 ns unbiased simulation of our system and we used a boost factor of 0.2. aMD were then launched with the parallelized version of the pmemd module of the AMBER package with a 2-fs time step.
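To make the boost concrete, here is a minimal Python sketch of the standard aMD boost, ΔV = (E − V)² / (α + E − V) for V < E and zero otherwise, together with one commonly used rule of thumb for deriving the four parameters from average energies. The parameter recipe is an assumption on our part: we interpret the boost factor of 0.2 as a per-atom energy (kcal/mol) for the total potential, and the 4 kcal/mol per residue and the 1/5 factor for the dihedral term are conventional defaults that the text does not state. Function and argument names are hypothetical.

```python
# Sketch of the aMD boost and of a common parameter-estimation recipe.

def amd_boost(v: float, e_thresh: float, alpha: float) -> float:
    """Boost dV = (E - V)^2 / (alpha + E - V) if V < E, else 0."""
    if v >= e_thresh:
        return 0.0
    gap = e_thresh - v
    return gap * gap / (alpha + gap)

def amd_parameters(avg_total: float, avg_dihedral: float,
                   n_atoms: int, n_residues: int,
                   per_atom: float = 0.2,       # assumed meaning of "boost factor"
                   per_residue: float = 4.0) -> dict:  # assumed conventional value
    """Estimate EthreshP/alphaP/EthreshD/alphaD from average energies."""
    alpha_p = per_atom * n_atoms
    dihedral_term = per_residue * n_residues
    return {
        "EthreshP": avg_total + alpha_p,
        "alphaP": alpha_p,
        "EthreshD": avg_dihedral + dihedral_term,
        "alphaD": dihedral_term / 5.0,
    }

# Toy numbers only, to show the behavior of the two functions.
print(amd_boost(-10.0, 0.0, 10.0))
print(amd_parameters(-1000.0, 500.0, n_atoms=100, n_residues=10))
```

A larger α flattens the boost, so the modified landscape stays closer to the original one; a smaller α accelerates transitions more aggressively at the cost of noisier reweighting.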

TMD
In a TMD simulation, a molecular system is guided from an initial structure using a reference structure by adding an extra term to the energy function according to the following equation,

$$E_{\mathrm{TMD}} = \frac{1}{2}\, K\, N_{\mathrm{atm}} \left[\mathrm{RMSD}(t) - \mathrm{RMSD}_{\mathrm{target}}\right]^{2}$$

where N_atm is the number of atoms in the subset used for RMSD(t) computation; K is the energy constant applied during the simulation; RMSD(t) is the root mean square deviation of a subset of atoms in the current structure compared with the same subset in the reference structure; and RMSD_target is the desired final RMSD. The subset was the UTP heavy atoms (30 atoms) in all our TMD. We used K > 0 and RMSD_target = 0 to guide the simulation toward the reference structure's UTP from a different starting point. We used K > 0 and RMSD_target > 0 to push UTP from the reference structure's position. We used energy constants ranging from K1 = 0.01 kcal/mol/Å² to K2 = 0.1 kcal/mol/Å² as indicated. The initial and reference systems were built to be chemically identical (same topology file) and minimized and equilibrated according to our unbiased AMBER protocol prior to TMD simulations. The TMD were then launched with the parallelized version of the sander module of the AMBER package. The duration of TMD simulations ranged from 1 to 4 ns with a 2-fs time step.
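The TMD restraint term can be sketched in Python as follows, assuming the form used in the AMBER implementation, E = ½ K N_atm [RMSD(t) − RMSD_target]². For simplicity the sketch assumes that the current and reference coordinates are already expressed in the same frame (no best-fit superposition is performed), and the function names are ours.

```python
import math

# Sketch of the TMD energy term, assuming E = 0.5 * K * N_atm * (RMSD - target)^2.

def rmsd(coords, ref):
    """Root mean square deviation between two equally sized lists of (x, y, z)."""
    n = len(coords)
    squared = sum((a - b) ** 2
                  for p, q in zip(coords, ref)
                  for a, b in zip(p, q))
    return math.sqrt(squared / n)

def tmd_energy(coords, ref, k, rmsd_target=0.0):
    """Restraint energy (kcal/mol) for force constant k in kcal/mol/A^2."""
    n_atm = len(coords)
    deviation = rmsd(coords, ref) - rmsd_target
    return 0.5 * k * n_atm * deviation * deviation

# Toy example: two atoms, each displaced by 1 A from the reference.
current = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
reference = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
print(tmd_energy(current, reference, k=0.1))                   # pulls toward the reference
print(tmd_energy(current, reference, k=0.1, rmsd_target=1.0))  # already at the target RMSD
```

Setting the target RMSD to zero pulls the selected atoms onto the reference, whereas a positive target penalizes structures that are closer to (or farther from) the reference than the chosen RMSD, which is how the same term can be used to push the nucleotide away from a starting position.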

SMD
During the SMD runs the incoming UTP was positioned outside the polymerase and a harmonic symmetric restraint was applied to reduce the distance of UTP with the +1 base. The atoms defining this distance, the velocity of the translation, and the force constant of the external harmonic force were varied as indicated.
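A constant-velocity steering restraint of the kind described above can be sketched as a harmonic term whose equilibrium distance shrinks linearly with time. The functional form ½ k (d − d_target(t))² and all names and numbers below are illustrative assumptions, not the exact settings of the simulations.

```python
# Sketch of a constant-velocity SMD restraint on a single distance.

def smd_target(t_ps: float, d0: float, velocity: float) -> float:
    """Target distance (A) at time t for a pull of `velocity` A/ps from d0."""
    return max(d0 - velocity * t_ps, 0.0)

def smd_energy(distance: float, t_ps: float, d0: float,
               velocity: float, k: float) -> float:
    """Harmonic steering energy 0.5 * k * (d - d_target(t))^2."""
    delta = distance - smd_target(t_ps, d0, velocity)
    return 0.5 * k * delta * delta

# Toy values: start at 30 A, pull at 1 A/ps, nucleotide lagging at 25 A after 10 ps.
print(smd_energy(30.0, 0.0, d0=30.0, velocity=1.0, k=2.0))
print(smd_energy(25.0, 10.0, d0=30.0, velocity=1.0, k=2.0))
```

The restraint is symmetric, so the nucleotide is penalized equally for lagging behind or running ahead of the moving target distance.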

drMD
drMD simulations are conventional MD simulations with an added harmonic energy term if the distance between the UTP or GTP center of mass and the active site is above a threshold distance that we fixed at 30 Å. For the location of the active site we used either Mg(A) or the center of mass of residue 319, as indicated. For each UTP or GTP starting position, we performed five simulations with force constants ranging from 0 to 0.5 kcal/mol/Å² as indicated.
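The distance restraint described above is a one-sided (flat-bottom) harmonic term: it vanishes within the 30 Å threshold and grows quadratically beyond it. A minimal Python sketch follows; the ½ prefactor is our assumption (some restraint implementations omit it), and the function name is hypothetical.

```python
# Sketch of the one-sided harmonic distance restraint used in drMD.

def drmd_energy(distance: float, k: float, threshold: float = 30.0) -> float:
    """Zero within the threshold; 0.5 * k * (d - threshold)^2 beyond it."""
    excess = distance - threshold
    if excess <= 0.0:
        return 0.0
    return 0.5 * k * excess * excess

print(drmd_energy(25.0, k=0.5))  # inside the threshold: the nucleotide diffuses freely
print(drmd_energy(32.0, k=0.5))  # 2 A beyond the threshold: the nucleotide is pulled back
```

With a force constant of zero the simulation reduces to unbiased MD, which corresponds to the lower end of the range of force constants quoted above.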