Structural and Functional Characterization of Programmed Ribosomal Frameshift Signals in West Nile Virus Strains Reveals High Structural Plasticity Among cis-Acting RNA Elements*

West Nile virus (WNV) is a prototypical emerging virus for which no effective therapeutics currently exist. WNV uses programmed −1 ribosomal frameshifting (−1 PRF) to synthesize the NS1′ protein, a C-terminally extended version of its non-structural protein 1, the expression of which enhances neuro-invasiveness and viral RNA abundance. Here, the NS1′ frameshift signals derived from four WNV strains were investigated to better understand −1 PRF in this quasispecies. Sequences previously predicted to promote −1 PRF strongly promote this activity, but frameshifting was significantly more efficient upon inclusion of additional 3′ sequence information. The observation of different rates of −1 PRF, and by inference differences in the expression of NS1′, may account for the greater degrees of pathogenesis associated with specific WNV strains. Chemical modification and mutational analyses of the longer and shorter forms of the −1 PRF signals suggest dynamic structural rearrangements between tandem stem-loop and mRNA pseudoknot structures in two of the strains. A model is suggested in which this is employed as a molecular switch to fine tune the relative expression of structural to non-structural proteins during different phases of the viral replication cycle.

West Nile virus (WNV) is a member of the Japanese encephalitis serogroup within the genus Flavivirus of the family Flaviviridae (1). WNV is an enveloped virus that contains a single-stranded plus-strand RNA genome that harbors one open reading frame encoding three structural and seven non-structural (NS) proteins (Fig. 1A). WNV is named for the region in Uganda where it was first isolated in 1937 (1). It is a zoonosis, i.e. it replicates and accumulates in birds and spreads to new hosts via mosquitoes (1). The virus has been known to infect other animals including cows, horses, and humans. Humans are considered to be "dead end hosts" as the virus is not able to accumulate to high enough titers to enable person-to-person transmission. WNV infection produces symptoms ranging from mild fever to paralysis (2). There are numerous variations, or "strains", of West Nile virus that are generally classified into two distinct lineages (3). Lineage 1 strains are found throughout the world in Africa, Australia, South America, North America, and the Middle East and include those most commonly associated with severe human illnesses, i.e. meningoencephalitis (encompassing encephalitis, meningitis, myelitis, and cases with overlapping features of these syndromes). Lineage 2 strains are limited to the African continent and typically cause the clinically uncomplicated West Nile fever in humans, which usually lasts less than a week (1). WNV was introduced into the United States in 1999, causing a meningoencephalitis outbreak in New York City resulting in 7 deaths (2). In 2002, there were nearly 4000 human cases of WNV-associated disease reported with 246 deaths (4). In addition, more than 14,000 dead crows, blue jays, and other birds were found to be infected. The virus has been documented in all 48 states and the District of Columbia in the United States mainland, sparking nationwide interest in finding ways to prevent and treat the resulting illnesses.
No WNV vaccine has been approved for human use (5) leaving current prevention efforts focused entirely on mosquito control (6). Treatment of WNV-related diseases is primarily palliative (5). Some antivirals show potential, but none have been proven effective in humans (6). Thus, there is a demand for a drug effective for treatment of WNV infection. When considering drug development, it is important to identify molecular mechanisms that are specific to the pathogen.
Many viruses, particularly those with RNA genomes such as WNV, utilize a molecular mechanism called programmed −1 ribosomal frameshifting (−1 PRF) to make more efficient use of their small genomes: −1 PRF enables them to control the relative expression of different proteins encoded within a single open reading frame. Programmed −1 ribosomal frameshifts direct a fraction of translating ribosomes to shift one base in the 5′ direction, enabling them to continue translation in the −1 reading frame (7). Frameshift signals typically comprise three parts: a slippery site, a spacer, and an mRNA structural element, often an mRNA pseudoknot (7). A slippery site is a heptameric sequence on which frameshifting can occur while maintaining non-wobble base pairing between tRNAs in the ribosomal A- and P-sites with the mRNA after the frameshift event. The spacer is a short region between the slippery site and the downstream structural element. This structural element is thought to direct the ribosome to stall with its A- and P-site tRNAs positioned over the slippery site, which increases the likelihood of kinetic partitioning of ribosomes into the −1 frame (8,9). It is not known whether the mRNA structural element plays an active role in this process, or if it is a passive participant. Prior studies using the Kunjin variant of WNV demonstrated that the virus utilizes −1 PRF to produce an extended version of the NS1 protein, called NS1′ (10). Whereas NS1 plays an essential role in viral replication and assembly, NS1′ is not essential. However, in addition to serving the same function as NS1, it has effects on WNV virulence: abrogation of NS1′ production reduces neuroinvasiveness (10), viral replication, and RNA levels (11,12). The primary translational product is a polyprotein that is subsequently cleaved into structural (encoded by sequences in the 5′ region of the genome) and NS (encoded by downstream sequence) proteins (Fig. 1A).
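The slippery-site geometry described above can be made concrete with a short sketch. It assumes the conventional X XXY YYZ layout, in which the 0-frame P- and A-site codons are nucleotides 2-4 and 5-7 of the heptamer, and the −1 frame codons are nucleotides 1-3 and 4-6. The function names are illustrative only, and the check uses strict identity of the first two codon positions, so it does not credit alternative pairings such as G-U.

```python
def site_codons(heptamer):
    """Return ((P, A) in the 0 frame, (P, A) in the -1 frame) for X XXY YYZ."""
    if len(heptamer) != 7:
        raise ValueError("a slippery site is seven nucleotides long")
    zero_frame = (heptamer[1:4], heptamer[4:7])   # XXY, YYZ
    minus_one = (heptamer[0:3], heptamer[3:6])    # XXX, YYY
    return zero_frame, minus_one


def nonwobble_preserved(heptamer):
    """For the P- and A-site, report whether codon positions 1 and 2
    (the non-wobble positions) are unchanged after the -1 shift."""
    zero_frame, minus_one = site_codons(heptamer)
    return tuple(z[:2] == m[:2] for z, m in zip(zero_frame, minus_one))


print(site_codons("CCCUUUU"))          # 0-frame and -1 frame codon pairs
print(nonwobble_preserved("CCCUUUU"))  # both sites keep their pairing
print(nonwobble_preserved("UCCUUUU"))  # P-site first position changes
```

Under this strict test the CCCUUUU heptamer preserves pairing in both sites, whereas UCCUUUU changes the first position of the P-site codon.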
Programmed −1 ribosomal frameshift events direct translating ribosomes to an out-of-frame stop codon. Thus, the −1 PRF signal is localized in the WNV genome such that frameshift events prevent production of NS proteins, i.e. −1 PRF affects ratios of structural (upstream of the frameshift) to non-structural (downstream) proteins. Increased frameshifting results in production of more structural proteins, favoring virion production. Indeed, inhibition of frameshifting resulted in decreased ratios of structural/non-structural proteins (E/NS5 ratio), and in decreased virus production (11). Furthermore, −1 PRF attenuated mutants showed reduced virulence in a mouse encephalitis model (10). The infectivity of a −1 PRF-deficient WNV-NY infectious clone was also attenuated in birds, and a −1 PRF-deficient WNV-Kunjin virus displayed decreased replication and spread in Culex mosquitos (11). Thus, −1 PRF presents a possible therapeutic target against WNV.
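As a rough illustration of how frameshifting sets this ratio, consider a simplified model that is an assumption made here rather than an analysis from the study: every ribosome translates the sequence upstream of the frameshift signal, a fraction f shifts into the −1 frame and terminates, and protein stability is ignored. The symbols f and R are introduced only for this sketch.

```latex
% f : fraction of ribosomes that shift into the -1 frame (0 <= f < 1)
% R : ratio of upstream (structural) to downstream (non-structural) synthesis
\[
  R(f) \;=\; \frac{1}{1-f},
  \qquad
  R(0.35) \approx 1.5,
  \qquad
  R(0.72) \approx 3.6 .
\]
```

Under these assumptions, frameshifting efficiencies of about 35% and 72%, the approximate range reported in Results, would correspond to roughly 1.5-fold and 3.6-fold excesses of upstream over downstream products.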
Although the WNV −1 PRF signal was previously identified, detailed functional and structural analyses have not been undertaken. To address this, the NS1′ frameshift signals of four WNV strains were analyzed. The study included two lineage 1 strains (New York and Kunjin) and two from lineage 2 (Madagascar and h442). The New York strain (accession number NC_009942) was responsible for the 1999 encephalitis outbreak in New York and continues to be one of the main virulent strains in the United States (2). Kunjin (accession number AY274504) has caused outbreaks of disease in human and equine populations in Australia. It was previously believed to be a unique species, but now occupies its own clade within lineage 1 of WNV (12). The Madagascar strain (accession number DQ176636) caused an outbreak of disease in birds in 1978, but does not produce appreciable pathogenesis in humans (13). The h442 strain (accession number EF429200) was isolated from a patient in South Africa and is one of the few lineage 2 strains associated with human disease (14). Consistent with prior studies using mammalian and mosquito cells as well as cell-free assays (10, 15), measurements using standard dual luciferase reporters in HeLa cells revealed that the sequences from all four strains promote very high rates of −1 PRF: from ~30 to ~70%. Mutational analyses are consistent with −1 PRF, and do not support the presence of cryptic promoters or internal ribosome entry site elements. This approach also revealed the requirement of additional sequences 3′ of those previously identified for optimal frameshifting activity. Notably, rates of −1 PRF promoted by the pathogenic strain sequences were significantly greater than those conferred by sequences derived from the low pathogenesis Madagascar strain. Much of the difference in −1 PRF efficiencies between the New York (pathogenic) and Madagascar strains can be attributed to a single base difference in their heptameric slippery sequences.
In addition, chemical modification studies suggest that, at least in these two cases, the −1 PRF promoting elements may be structurally dynamic, transiting between tandem stem-loop and mRNA pseudoknot structures, such that formation of pseudoknot structures further enhances the already strong −1 PRF stimulating activity of slippery site proximal stem-loops. A deeper understanding of the structural and molecular biology of the WNV −1 PRF signals may partially explain the increased pathogenicity of the New York strain of WNV and could be exploited for therapeutic intervention.

Results
The WNV Sequences Encode Bona Fide −1 PRF Signals-The predicted −1 PRF signal sequences identified in the Recode Database (16) are limited to 75 nucleotides in length, and secondary structures are predicted using mfold (17), a program that cannot predict complex RNA structures such as pseudoknots.
Given that many known viral −1 PRF signals are longer than 75 nucleotides and contain mRNA pseudoknots, longer sequence windows (up to 129 nucleotides) were analyzed by three programs capable of identifying pseudoknotted RNAs: Pknots (18), NUPACK (19), and HotKnots (20). These suggested that additional 3′ sequence information may participate in the downstream −1 PRF stimulatory structures (not shown). Thus, the 129-nucleotide long sequences shown in Fig. 1B derived from the New York, Kunjin, Madagascar, and h442 strains were cloned into the dual luciferase vector p2luci (21). This figure also highlights the presence of heterogeneity among these sequences, most of which maps to sequence 3′ of the original 75-nucleotide window. However, it is notable that the first triplet of the WNV-NY slippery site is a perfect CCC, whereas this is UCC in the other three strains. The full-length sequences shown in Fig. 1B were cloned into dual luciferase reporters, as well as three additional series of mutant versions of the frameshift signals that were created as controls. Mutants in which the (T/C) CCT TTT (0-frame indicated by spaces) slippery sites were changed to ACTGACT were constructed to validate −1 PRF through the canonical mechanism. Constructs harboring termination codons in the zero frame 5′ of the slippery sites were used to control for the presence of cryptic promoters, whereas those harboring termination codons in the −1 frame 3′ of the slippery sites were used to control for the presence of internal ribosome entry site elements or cryptic splice sites (Fig. 1B). All four of the longer WNV-derived sequences promoted highly efficient frameshifting, ranging from ~35% (Madagascar) to ~72% (Kunjin) (Fig. 2A). In all cases, the mutated frameshift signals showed a near complete loss of firefly luciferase expression. These findings provide strong genetic support for the hypothesis that the WNV-derived sequences direct highly efficient levels of −1 PRF.
A prior report suggested that the WNV −1 PRF signals were entirely encoded within shorter 75-nucleotide long sequences (15) (Fig. 1C). To determine whether an additional sequence is required to promote optimal frameshifting, the −1 PRF activities of the shorter sequences (NY Short and Madagascar Short) were compared with those of the longer sequences (NY Long and Madagascar Long) using the dual luciferase assay. Although both of the short sequences were able to promote efficient levels of −1 PRF, the actual values were significantly less than those promoted by their longer counterparts, indicating that the complete WNV −1 PRF signals extend beyond the previously predicted 75 nucleotides (Fig. 2B). Additionally, a head to head comparison of the New York versus Madagascar −1 PRF signals revealed that both the short and long versions of the New York-derived sequence promoted significantly greater levels of frameshifting than those derived from the Madagascar strain (Fig. 2C).
As noted above, the New York strain CCCUUUU slippery site conforms to the canonical XXXYYYZ slippery site (22). In contrast, the Madagascar UCCUUUU slippery site is non-canonical, although it does allow for G-U base pairing between the P-site tRNA and the −1 frame codon upon slippage (23,24). To determine whether this single base difference may partially account for the observed increased ability of the New York WNV-derived sequence to promote −1 PRF relative to the Madagascar strain, the slippery sites were switched between the two in the context of the full-length signal. Consistent with this hypothesis, swapping the UCCUUUU slippery site into the New York WNV sequence resulted in a significant decrease in the efficiency of −1 PRF promoted by this sequence (Fig. 2D). Conversely, substitution of the CCCUUUU slippery site into the Madagascar WNV sequence promoted significantly increased rates of −1 PRF relative to the wild-type frameshift signal (Fig. 2D).
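The distinction between the two heptamers can be expressed as a simple pattern test. The sketch below checks only the XXXYYYZ repeat structure named in the text (three identical nucleotides followed by three identical nucleotides); any further restrictions on which nucleotides may occupy X, Y, or Z are deliberately not encoded, and the function name is hypothetical.

```python
def is_canonical_slippery_site(heptamer):
    """True if the heptamer has the XXXYYYZ repeat pattern.

    DNA input is accepted (T is treated as U). Only the repeat pattern
    is tested; nucleotide identity constraints are not modeled.
    """
    h = heptamer.upper().replace("T", "U")
    if len(h) != 7 or set(h) - set("ACGU"):
        return False
    return len(set(h[0:3])) == 1 and len(set(h[3:6])) == 1


for name, site in [("New York", "CCCUUUU"), ("Madagascar", "UCCUUUU")]:
    print(name, site, is_canonical_slippery_site(site))
```

By this test the New York site is canonical and the Madagascar site is not, matching the single base difference examined in the slippery site swap experiment.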
Chemical Modification Analyses Suggest That the WNV −1 PRF Signals Are Structurally Complex, Diverse, and Dynamic-The combination of (a) sequence diversity among the different WNV −1 PRF signals, (b) the observed range of −1 PRF promoted by these sequences, and (c) the observed differences in −1 PRF activities between the long and short forms of the New York and Madagascar WNV −1 PRF signals suggested a certain degree of structural complexity among these frameshift stimulating sequences. To investigate this further, synthetic "short" and "long" form RNA transcripts of the New York and Madagascar strains of the WNV −1 PRF signals were treated with 1M7, and subjected to SHAPE analyses. Representative autoradiograms are shown in Fig. 3, A (the short New York WNV sequence, i.e. New York Short), B (New York Long), E (Madagascar Short), and F (Madagascar Long). Autoradiograms of denaturing polyacrylamide gels were scored for reactivity of 1M7 versus untreated (DMSO only) control RNAs. SHAPE data were combined with computational predictions to create two-dimensional models of the folded structures. For ease of visualization, these are depicted as Feynman diagrams (18). These analyses indicated that the short forms of both −1 PRF signals form similar stem-loop structures (Fig. 3, C and G). In contrast, the long versions of the WNV New York and Madagascar −1 PRF signals appeared to be able to assume at least two distinct and mutually exclusive structures: tandem stem-loops and more complex mRNA pseudoknots, each of which spans nearly the entire 129-nucleotide sequence (Fig. 3, D and H). The different 3′ sequences among all four strains also suggested that there may be a significant amount of structural diversity among the various −1 PRF signals. To address this, SHAPE analysis was also performed on the long versions of −1 PRF signals derived from the Kunjin and h442 strains.
This analysis revealed that they, too, were able to form complex RNA pseudoknot structures encompassing nearly the entire length of the sequences (Fig. 4, A-D). Consistent with the hypothesis, this analysis revealed significant structural diversity in the −1 PRF signals of the four strains.

Discussion
Consistent with prior studies (10, 15), a thorough genetic analysis strongly supports the hypothesis that these sequences promote highly efficient rates of −1 PRF. However, whereas those studies employed the computationally predicted 75-nucleotide long WNV stem-loop structure, the current analysis revealed that an additional 3′ sequence (~50 nucleotides) contributes to enhancement of −1 PRF efficiency. Although the shorter sequences promote very high levels of −1 PRF (compare the 30-50% promoted by these WNV sequences with the ~1-15% observed for the −1 PRF signals of most other viruses (25)), the longer sequences further serve to increase −1 PRF rates. This is likely because mRNA pseudoknots are more difficult for ribosomes to resolve than the stem-loops observed in shorter sequences, resulting in longer ribosomal stalling and thus increasing the chances for −1 PRF to occur. Furthermore, −1 PRF promoted by sequences derived from pathogenic strains was consistently more efficient than that promoted by the non-pathogenic Madagascar strain. Swapping the slippery sites of the New York and Madagascar strains revealed that this can be partially attributed to a single base difference. However, it should be noted that both the Kunjin and h442 strains also harbor the UCCUUUU slippery sites, and thus this is not the sole reason for lower rates of −1 PRF by the Madagascar WNV-derived sequence. As discussed above, this increased efficiency should result in higher expression of NS1′, increasing the ratios of structural to non-structural viral proteins and enabling greater production of WNV virions by the New York strain. Thus, this single base difference may partially account for the increased pathogenicity of this strain.
JULY 22, 2016 • VOLUME 291 • NUMBER 30

Structurally Plastic Frameshift Signals
SHAPE analyses revealed a greater degree of RNA structural variation than was previously predicted among the −1 PRF promoting mRNA pseudoknots of the WNV-NY and WNV-Madagascar strains. This is likely due to the greater number of sequence differences toward the 3′ ends of the signals, which were not included in earlier computational predictions, demonstrating the need for empirical wet-bench studies to complement computational predictions. The high degree of sequence diversity (Fig. 1) among the −1 PRF signals from the different WNV strains is consistent with the prior observations that cis-acting elements such as −1 PRF signals evolve rapidly (26-29). Furthermore, their structural diversity (Figs. 3 and 4) suggests that −1 PRF signals are very plastic, i.e. that highly efficient levels of −1 PRF can be easily stimulated by many different RNA structures. Analyses of the long forms of the New York and Madagascar −1 PRF stimulatory sequences also support this idea of structural dynamism: both tandem stem-loop and mRNA pseudoknot structures can be derived from the SHAPE data (Fig. 3). Given the functional analyses showing that the entirety of these sequences is required for maximal frameshifting (Fig. 2), we suggest that the tandem stem-loops may first be formed co-transcriptionally, and that the more complex structure is folded post-transcriptionally. Co-transcriptional formation of the tandem stem-loops seems likely because the first loop could form in its entirety before transcription of the 3′ end of the signal has been completed. This is also consistent with the structure formed by the shorter sequences, showing that the absence of the 3′ end still permits formation of the 5′ proximal stem-loop (Fig. 3). After transcription is complete, however, longer range base pairing interactions can enable the RNA to refold into the mRNA pseudoknot structures. This RNA folding is a dynamic process; the same sequence may have different conformations temporally and/or spatially.
Indeed, it is possible that this structural heterogeneity could function as a molecular switch to control viral structural to non-structural protein production through the infectious program. For example, during the early phase of infection it may be advantageous for the virus to maximize production of non-structural proteins; this may be required to delay onset of the innate immune response (13, 30-32). This could be effected by decreasing −1 PRF by favoring formation of the stem-loop structures. In contrast, during the late phase of infection, maximization of −1 PRF by favoring formation of the pseudoknotted structures would serve to increase synthesis of structural proteins, maximizing viral particle production. As an alternative to the dynamic switch model, this sequence element may simply function like a resistor, attenuating the amplitude of the downstream translational output to control the relative ratios of structural to non-structural proteins independent of the viral life cycle. Regardless of the switch or resistor model, increased production of virions consequent to higher levels of −1 PRF may contribute to viral pathogenesis as discussed above. It should be noted that WNV can be viewed as a prototype for emerging Flaviviruses in the Western hemisphere, including the Zika virus. In addition, emerging Alphaviruses, e.g. Chikungunya, and the Equine Encephalitis viruses also utilize −1 PRF (16). A deeper understanding of how RNA structural dynamics control −1 PRF and gene expression and how this may relate to the viral life cycle may contribute toward understanding how to disrupt the process.

and reverse primers containing overlaps of at least 20 nucleotides (Integrated DNA Technologies) were designed to include BamHI and SalI restriction endonuclease recognition sites (Table 1). Primers were extended and amplified from full-length clones (kindly provided by Dr. Brenda Fredericksen) by PCR using the DreamTaq Master Mix (Life Technologies), using the following protocol: denaturation at 95°C for 30 s, annealing at 58°C for 30 s, and extension at 72°C for 40 s over the course of 30 cycles. Insert size and purity were confirmed by electrophoresis through a 1% TAE-agarose gel, and inserts were extracted using the GeneJET Gel Extraction Kit (Fermentas). BamHI/SalI-digested inserts were ligated into a similarly cut dual luciferase reporter plasmid p2luci (21) with T4 DNA ligase such that only a −1 shift in the reading frame would result in translation of the downstream firefly luciferase gene. All other sequences were cloned into p2luci using the Gibson Assembly method (34). Primers were designed with at least 15 nucleotide overlaps with both each other and the template and ordered from Integrated DNA Technologies (Table 1). Primers were extended and amplified via PCR, gel purified, and extracted as described above. Linear plasmid and inserts were assembled using Gibson Assembly master mix (New England BioLabs) at 50°C for 1 h. All plasmids were amplified in Escherichia coli strain DH5α and extracted using GeneJET Plasmid Miniprep Kit (Life Technologies). Cloned sequences were confirmed by commercial sequence analysis (GeneWiz). Oligonucleotide-directed mutagenesis was used to create additional reporters harboring the following (see Fig. 1A): (a) a 0-frame UAA termination codon immediately 5′ of the WNV slippery sites to control for the presence of cryptic promoters; (b) a UAA termination codon in the −1 frame immediately after the slippery site to control for the presence of internal ribosome entry site elements; and (c) substitution of the non-slippery ACTGACT sequence for the wild-type slippery sequences to verify that slippage was dependent on the slippery sequences. Primers used for sequence confirmation and to construct reciprocal swaps of the New York and Madagascar WNV slippery sites into one another's downstream frameshift promoting sequences are shown in Table 1. HeLa cells (ATCC) were grown at 37°C in DMEM+++ (7.5 ml of fetal bovine serum, 100 µl of penicillin and streptomycin, 500 µl of non-essential amino acid mixture, 1 ml of 5% L-glutamate, 41 ml of Dulbecco's modified Eagle's medium) and 5% CO2. Cells were transfected with 1100 ng of plasmid and 3.3 µl of FuGene reagent in 1 ml of DMEM+++ and grown overnight in 12-well plates.
Dual Luciferase Assays-Frameshifting efficiencies were tested by dual luciferase assay (21,35). Forty hours post-transfection, Renilla and firefly luciferase activities were measured in HeLa cell lysates using the Dual Luciferase Reporter Assay System and read with a GloMax-Multi Microplate Luminometer (Promega). p2luci was employed as the in-frame control. Lysates of mock-transfected HeLa cells were used to control for background levels of luminescence. −1 PRF efficiencies were calculated as the ratio of firefly to Renilla luciferase reads and statistical analyses were performed as previously described (36).
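The arithmetic behind a dual luciferase frameshifting measurement can be sketched as follows. This is a minimal illustration rather than the published analysis pipeline: it assumes that background luminescence from mock-transfected lysates is subtracted from each read and that the firefly/Renilla ratio of the test reporter is expressed as a percentage of the same ratio from the in-frame control (p2luci). The function and parameter names are hypothetical, and the statistical treatment cited in the text (36) is not reproduced.

```python
def prf_efficiency(test_firefly, test_renilla,
                   ctrl_firefly, ctrl_renilla,
                   bg_firefly=0.0, bg_renilla=0.0):
    """Percent -1 PRF from raw luminometer reads.

    test_*  : reads from the frameshift reporter
    ctrl_*  : reads from the in-frame control reporter
    bg_*    : reads from mock-transfected lysate (assumed subtraction)
    """
    test_ratio = (test_firefly - bg_firefly) / (test_renilla - bg_renilla)
    ctrl_ratio = (ctrl_firefly - bg_firefly) / (ctrl_renilla - bg_renilla)
    return 100.0 * test_ratio / ctrl_ratio


# Made-up reads, for illustration only.
print(prf_efficiency(60.0, 110.0, 110.0, 110.0, bg_firefly=10.0, bg_renilla=10.0))
```

Normalizing to Renilla within each sample corrects for differences in transfection and lysate recovery, and normalizing to the in-frame control converts the ratio into a frameshifting percentage.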
Selective 2′-Hydroxyl Acylation Analyzed by Primer Extension-RNA transcripts of frameshift signals were synthesized using the MEGAscript T7 kit (Ambion) at 37°C overnight and purified using the MEGAclear kit (Ambion). Full-length transcripts were purified from 1% agarose gels. Transcripts were treated with 1-methyl-7-nitroisatoic anhydride (1M7) using 10 pmol of RNA, folding buffer, and 1M7 at 37°C for 10 min. Identical reactions were performed with DMSO as a negative control. SHAPE primers (IDT) were designed to amplify the transcripts from two points to allow for resolution of the entire frameshift signal. A universal primer was designed to anneal to the Renilla sequence 3′ of the frameshift sequence, optimizing visualization of the 3′ end of the sequence. Specific primers were designed to anneal to the middle of individual viral sequences to enable optimal resolution of the 5′ ends of the viral sequences. Primers were labeled with [γ-32P]ATP and T4 DNA kinase at 37°C for 30 min. Labeled primers were purified using Sephadex 25 spin columns. 1M7- and DMSO-treated RNAs were denatured at 65°C for 5 min, and annealed with the labeled primers at 42°C for 15 min. An enzyme mixture containing dNTPs, DTT, SuperScript III (Life Technologies), and buffer was added to the primed RNA. cDNA synthesis proceeded at 45°C for 10 min followed by 52°C for 10 min.
Denaturing Polyacrylamide Gel Electrophoresis and Interpretation-Synthesized cDNAs and parallel sequencing reactions were separated through 8% polyacrylamide urea denaturing gels. Radiolabeled cDNAs were visualized using a Typhoon phosphorimager (GE Healthcare Life Sciences). Phosphorimages were scored by identifying bands appearing in the 1M7-treated, but not the DMSO control, lanes. Dark bands were scored as strongly reactive; lighter, 1M7-specific bands were scored as moderately reactive; and the appearance of no band was scored as unreactive. After scoring, secondary structures were manually refined using computationally predicted folding structures as guides (18-20).
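The three-level scoring scheme can be illustrated with a small sketch. The text describes scoring by visual inspection of band darkness, so the numeric version below is an assumption for illustration only: band intensities are taken to be normalized to a 0-1 scale, and the cutoffs `strong` and `moderate` are hypothetical values, not thresholds used in the study.

```python
def score_band(m7_intensity, dmso_intensity, strong=0.7, moderate=0.3):
    """Classify one nucleotide position from paired lane intensities.

    Only signal specific to the 1M7 lane counts, so the DMSO control
    intensity is subtracted first. Cutoffs are illustrative assumptions.
    """
    specific = max(m7_intensity - dmso_intensity, 0.0)
    if specific >= strong:
        return "strong"
    if specific >= moderate:
        return "moderate"
    return "unreactive"


# Made-up (1M7, DMSO) intensity pairs for three positions.
lanes = [(0.9, 0.1), (0.5, 0.1), (0.8, 0.75)]
print([score_band(m7, dmso) for m7, dmso in lanes])
```

Positions scored as reactive are generally interpreted as flexible or unpaired, which is how such scores constrain the choice among computationally predicted folds.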