Structural and functional conservation of the programmed −1 ribosomal frameshift signal of SARS coronavirus 2 (SARS-CoV-2)

Approximately 17 years after the severe acute respiratory syndrome coronavirus (SARS-CoV) epidemic, the world is currently facing the COVID-19 pandemic caused by SARS corona virus 2 (SARS-CoV-2). According to the most optimistic projections, it will take more than a year to develop a vaccine, so the best short-term strategy may lie in identifying virus-specific targets for small molecule–based interventions. All coronaviruses utilize a molecular mechanism called programmed −1 ribosomal frameshift (−1 PRF) to control the relative expression of their proteins. Previous analyses of SARS-CoV have revealed that it employs a structurally unique three-stemmed mRNA pseudoknot that stimulates high −1 PRF rates and that it also harbors a −1 PRF attenuation element. Altering −1 PRF activity impairs virus replication, suggesting that this activity may be therapeutically targeted. Here, we comparatively analyzed the SARS-CoV and SARS-CoV-2 frameshift signals. Structural and functional analyses revealed that both elements promote similar −1 PRF rates and that silent coding mutations in the slippery sites and in all three stems of the pseudoknot strongly ablate −1 PRF activity. We noted that the upstream attenuator hairpin activity is also functionally retained in both viruses, despite differences in the primary sequence in this region. Small-angle X-ray scattering analyses indicated that the pseudoknots in SARS-CoV and SARS-CoV-2 have the same conformation. Finally, a small molecule previously shown to bind the SARS-CoV pseudoknot and inhibit −1 PRF was similarly effective against −1 PRF in SARS-CoV-2, suggesting that such frameshift inhibitors may be promising lead compounds to combat the current COVID-19 pandemic.

SARS-CoV-2, the etiological agent of COVID-19, is a member of the coronavirus family (1). Coronaviruses have (+) single-strand RNA genomes that harbor two long ORFs that occupy approximately two-thirds of the 5′ end of the genomic RNA (ORF1 and ORF2), followed by several ORFs that are expressed late in the viral replication cycle from subgenomic RNAs (Fig. 1A) (2). In general, the immediate early proteins encoded by ORF1a are involved in ablating the host cellular innate immune response, whereas the early proteins encoded in ORF1b are involved in genome replication and RNA synthesis. These functions include generating the minus-strand replicative intermediate, new plus-strand genomic RNAs, and subgenomic RNAs, which mostly encode structural, late proteins. ORF1b is out of frame with respect to ORF1a, and all coronaviruses utilize a molecular mechanism called programmed −1 ribosomal frameshifting (−1 PRF) as a means to synthesize the ORF2-encoded proteins (3,4). −1 PRF is a mechanism in which cis-acting elements in the mRNA direct elongating ribosomes to shift the reading frame by 1 base in the 5′ direction. The use of a −1 PRF mechanism for expression of a viral gene was first identified in the Rous sarcoma virus (5). A −1 PRF mechanism was shown to be required to translate ORF1ab in a coronavirus, avian infectious bronchitis virus, 2 years later (6). In coronaviruses, −1 PRF functions as a developmental switch, and mutations and small molecules that alter this process have deleterious effects on virus replication (7,8).
The −1 PRF signal can be broken down into three discrete parts: the "slippery site," a linker region, and a downstream stimulatory region of mRNA secondary structure, typically an mRNA pseudoknot (reviewed in Ref. 3). The primary sequence of the slippery site and its placement in relation to the incoming translational reading frame is critical: it must be N NNW WWZ (codons are shown in the incoming or 0-frame), where NNN is a stretch of three identical nucleotides, WWW is either AAA or UUU, and Z ≠ G. The linker region is less well-defined, but typically is short (1-12 nt long) and is thought to be important for determining the extent of −1 PRF in a virus-specific manner. The function of the downstream secondary structure is to induce elongating ribosomes to pause, a critical step for efficient −1 PRF to occur (reviewed in Ref. 9). The generally accepted mechanism of −1 PRF is that the mRNA secondary structure directs elongating ribosomes to pause with their A- and P-site bound aminoacyl- and peptidyl-tRNAs positioned over the slippery site. The sequence of the slippery site allows for re-pairing of the tRNAs to the −1 frame codons after they "simultaneously slip" by 1 base in the 5′ direction along the mRNA. The subsequent resolution of the downstream mRNA secondary structure allows the ribosome to continue elongation of the nascent polypeptide in the new translational reading frame. The downstream stimulatory elements are most commonly H-type mRNA pseudoknots, so called because they are composed of two co-axially stacked stem-loops where the second stem is formed by base pairing between sequence in the loop of the first stem-loop and additional downstream sequence (10). The SARS-CoV pseudoknot is more complex because it contains a third, internal stem-loop element (11-13). Mutations affecting this structure decreased the rates of −1 PRF and had deleterious effects on virus propagation, thus suggesting that it may present a target for small-molecule therapeutics (7,8). In addition, the presence of a hairpin located immediately 5′ of the slippery site has been reported to regulate −1 PRF by attenuating its activity (14).
Here, we report on the −1 PRF signal from SARS-CoV-2. The core −1 PRF signal is nearly identical to that of SARS-CoV, containing only a single-nucleotide difference, a C to A. This change maps to a loop region in the molecule and is not predicted to affect the structure of the three-stemmed pseudoknot. The primary sequence of the attenuator hairpin is less well-conserved. However, genetic analyses reveal that both elements appear to have been functionally conserved. Conservation of RNA structure is further supported by the similarity of the small-angle X-ray scattering profiles for the two pseudoknots and by the similar anti-frameshifting activity of a small-molecule ligand against both frameshift signals.
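The slippery-site rule and the "simultaneous slippage" model described in the introduction can be illustrated with a short sketch. It is not part of the original analysis: the function names are hypothetical, and the final position is taken to be any base other than G, an assumption that the U UUA AAC site studied here (final base C) satisfies.

```python
def normalize(site: str) -> str:
    """Remove frame-marking spaces and convert DNA letters to RNA."""
    return site.replace(" ", "").upper().replace("T", "U")


def is_canonical_slippery_site(site: str) -> bool:
    """Check the N NNW WWZ pattern: NNN identical, WWW = AAA or UUU, Z != G.

    The Z != G condition is an assumption of this sketch.
    """
    s = normalize(site)
    if len(s) != 7 or any(base not in "ACGU" for base in s):
        return False
    nnn, www, z = s[0:3], s[3:6], s[6]
    return len(set(nnn)) == 1 and www in ("AAA", "UUU") and z != "G"


def frame_codons(site: str) -> dict:
    """Return the (P-site, A-site) codons before and after a -1 slip.

    In the incoming 0-frame the tRNAs pair with bases 2-4 and 5-7 of the
    heptamer; after slipping 1 base in the 5' direction they pair with
    bases 1-3 and 4-6.
    """
    s = normalize(site)
    return {"0": (s[1:4], s[4:7]), "-1": (s[0:3], s[3:6])}


if __name__ == "__main__":
    print(is_canonical_slippery_site("U UUA AAC"))
    print(frame_codons("U UUA AAC"))
```

For U UUA AAC, the 0-frame codons UUA and AAC become UUU and AAA after the slip, so each tRNA can still pair at its first two codon positions, which is the re-pairing the model requires.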

Comparative structural analyses of the two −1 PRF signals
The core of the SARS-CoV −1 PRF signal begins with the U UUA AAC slippery site, followed by a 6-nt spacer region and then the three-stemmed mRNA pseudoknot that stimulates −1 PRF. A second regulatory element, called the attenuator hairpin, is located 5′ of the slippery site. Pairwise analysis of the SARS-CoV and SARS-CoV-2 frameshift signals revealed that the sequence of the attenuator hairpin was less well-conserved than the frameshift-stimulating pseudoknot (Fig. 1B). The structure of the SARS-CoV −1 PRF signal was previously determined to include a three-stemmed pseudoknot (11). Using this structure as a guide, the single C-to-A base difference between the core SARS-CoV and SARS-CoV-2 −1 PRF signals (Fig. 1B) maps to a loop and is not predicted to alter the structure of the −1 PRF stimulating element (7) (Fig. 1C). In contrast, the attenuator hairpin contains six differences in the nucleotide sequence between the two viruses (Fig. 1B), and the SARS-CoV-2 element is predicted to be less stable than its SARS-CoV counterpart (Fig. 1D). To determine the importance of each of these elements, a series of silent coding mutants of both the SARS-CoV and SARS-CoV-2 sequences were constructed to disrupt the putative attenuators, slippery sites, and stems 1, 2, and 3 of the pseudoknots (Fig. 1, E and F).
Comparative functional analyses of the two −1 PRF signals
Standard Dual-Luciferase assays were used to monitor −1 PRF activities of the two −1 PRF signals (15,16) in cultured human cell lines. For both of the elements, −1 PRF activity was ~20% in HEK cells (Fig. 2A) and ~30% in HeLa cells (Fig. 2B). Silent coding mutation of the U UUA AAC slippery sites to C CUC AAC (the incoming 0-frame codons are indicated by spaces), which preserves the amino acid sequence, ablated −1 PRF activity in both cases to less than 1% (Fig. 2, A and B), demonstrating the functional conservation of this central feature of the −1 PRF signal.
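As a check on the term "silent," the two complete 0-frame codons of the wild-type and mutant slippery sites can be translated with the standard genetic code. The sketch below is illustrative only, and the helper names are hypothetical. The first base of each heptamer is the third position of an upstream codon that is not given in the text, so it is left out of the comparison.

```python
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def zero_frame_codons(site: str) -> list:
    """Return the complete 0-frame codons of a site written as N NNN NNN.

    The leading base belongs to an upstream codon and is skipped.
    """
    s = site.replace(" ", "").upper().replace("T", "U")
    return [s[i:i + 3] for i in range(1, len(s) - 2, 3)]


def translate(codons: list) -> str:
    """Translate a list of RNA codons into one-letter amino acid codes."""
    return "".join(CODON_TABLE[c] for c in codons)


def is_silent(wild_type: str, mutant: str) -> bool:
    """True if both sites encode the same amino acids in the 0-frame."""
    return translate(zero_frame_codons(wild_type)) == translate(
        zero_frame_codons(mutant)
    )


if __name__ == "__main__":
    print(translate(zero_frame_codons("U UUA AAC")))
    print(translate(zero_frame_codons("C CUC AAC")))
    print(is_silent("U UUA AAC", "C CUC AAC"))
```

Both UUA and CUC encode leucine and AAC is unchanged, so the encoded peptide is the same, whereas the mutant no longer contains a run of three identical bases followed by AAA or UUU.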
To test functional conservation of the three-stemmed pseudoknot, a series of silent 0-frame coding mutations were made to each of the stems in both the SARS-CoV and SARS-CoV-2 frameshift signals, and assays were performed in HEK cells. Disruption of stem 1 strongly suppressed the ability of both elements to promote −1 PRF, decreasing rates to 0.67 ± 0.03 and 0.7 ± 0.1% for SARS-CoV and SARS-CoV-2, respectively (p < 0.0001) (Fig. 2C). Similarly, disruption of stem 2 had a strong negative impact on −1 PRF, decreasing rates to 0.68 ± 0.04% for SARS-CoV and 0.8 ± 0.1% for SARS-CoV-2 (p < 0.0001) (Fig. 2D). In contrast, disruption of stem 3 decreased −1 PRF efficiencies less severely, although the decreases were still statistically significant (13.1 ± 0.9 and 8 ± 1% for SARS-CoV and SARS-CoV-2, respectively; p < 0.0001) (Fig. 2E). These findings support the hypothesis that the structure and function of the core −1 PRF signals have been conserved between the two viruses.

Conservation of the 5′ attenuator function
Prior studies demonstrated the presence of an element located immediately 5′ of the SARS-CoV slippery site that had the ability to decrease −1 PRF, called the attenuator hairpin (14). Although less well-conserved at the primary sequence level (Fig. 1, B and C), addition of this sequence into the SARS-CoV-2 reporter also resulted in decreased −1 PRF efficiency: 16 ± 3% compared with 25 ± 3% without the attenuator hairpin (p < 0.01), whereas disruption of the hairpin did not result in decreased efficiency (22 ± 4%, p = 0.415) (Fig. 2F). In the control experiment, the SARS-CoV attenuator also promoted decreased −1 PRF, albeit to a lesser extent: 20 ± 2% compared with 23 ± 2% without the attenuator hairpin (p = 0.04) and 24 ± 1% with the disrupted hairpin (p = 0.716) (Fig. 2G). Thus, the attenuation function has also been conserved between the two viruses despite the differences in primary nucleotide sequences.
Small-molecule frameshift inhibitor of SARS-CoV −1 PRF is also active against SARS-CoV-2
Based on the strong conservation of the frameshift signal between SARS-CoV and SARS-CoV-2, we tested whether a frameshift inhibitor active against the first also retained activity against the second. We focused on a small-molecule ligand previously shown to bind to the SARS-CoV pseudoknot and suppress −1 PRF, 2-{[4-(2-methyl-thiazol-4-ylmethyl)-[1,4]diazepane-1-carbonyl]-amino}-benzoic acid ethyl ester, hereafter denoted as MTDB (17,18). Comparing the −1 PRF activity from Dual-Luciferase measurements in rabbit reticulocyte lysates in the presence and absence of MTDB, we found that 5 µM MTDB reduced −1 PRF activity by almost 60%, from 36 ± 3 to 15 ± 1% (Fig. 3). This reduction was comparable with, but slightly smaller than, that seen previously for the SARS-CoV pseudoknot, where 0.8 µM MTDB reduced −1 PRF by roughly 60% (17).
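The "almost 60%" figure is a relative reduction, not a difference in percentage points. A minimal sketch of the arithmetic, using only the two mean values quoted above (the function name is hypothetical, and the uncertainties are not propagated):

```python
def relative_reduction(before: float, after: float) -> float:
    """Percent reduction of a quantity relative to its starting value."""
    if before <= 0:
        raise ValueError("starting value must be positive")
    return 100.0 * (before - after) / before


if __name__ == "__main__":
    # Mean -1 PRF efficiencies without and with MTDB, taken from the text.
    print(round(relative_reduction(36.0, 15.0), 1))
```

With 36% and 15% as inputs, the relative reduction is 21/36, or about 58%, whereas the absolute drop is 21 percentage points.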

Solution scattering profiles of the SARS-CoV and SARS-CoV-2 pseudoknots are indistinguishable
Finally, we used small- and wide-angle X-ray scattering (SAXS) to compare the solution scattering profiles of the two pseudoknots, which reflect their structure. The scattering profiles (intensity as a function of the scattering vector q) were indistinguishable for laboratory-purified samples of SARS-CoV (Fig. 4A, blue) and SARS-CoV-2 (Fig. 4A, red) pseudoknots. The difference between their scattering profiles is consistent with 0 at all q (Fig. 4B). The high-q portion of the profile is sensitive to the finer molecular details of the structure (19); hence the similarity of the profiles for the two pseudoknots indicates that their structures are likely the same. Because SARS-CoV pseudoknots can dimerize (20), we also performed inline size-exclusion chromatography (SEC) SAXS measurements, where the RNA was purified by SEC immediately before X-ray exposure to ensure only monomers were measured. From inline SEC-SAXS profiles (Fig. 4A, inset), we determined the monomer size, parameterized as the radius of gyration, Rg. We measured the same values for SARS-CoV and SARS-CoV-2 pseudoknots: Rg = 28.1 ± 0.3 Å and 28.1 ± 0.2 Å, respectively. The difference profile for this set is also consistent with 0 for all q (Fig. 4C).
Figure 2 legend (fragment). AH denotes constructs that included attenuator hairpin sequences. AH mutant denotes mutants harboring the silent coding attenuator hairpin sequences shown in Fig. 1 (E and F). Assays were performed using Dual-Luciferase assays as previously described (15,16). Each data point represents a single biological replicate comprising three technical replicates. Error bars denote S.E. n.s., not significant.
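Two quantities in this comparison lend themselves to a short sketch: the difference profile and the radius of gyration. The code below is illustrative and does not reproduce the beamline's data-processing packages. It assumes that "consistent with 0" means the difference lies within a chosen multiple of the combined uncertainty, and it estimates Rg with the standard Guinier approximation, ln I(q) ≈ ln I(0) − (Rg² q²)/3, which the text does not state was the method used. All function names and the synthetic data are hypothetical.

```python
import math


def difference_profile(i_a, i_b):
    """Point-by-point difference of two profiles sampled on the same q grid."""
    return [a - b for a, b in zip(i_a, i_b)]


def consistent_with_zero(diff, sigma_a, sigma_b, n_sigma=3.0):
    """True if every difference lies within n_sigma combined uncertainties.

    The n_sigma threshold is an assumption of this sketch.
    """
    return all(
        abs(d) <= n_sigma * math.hypot(sa, sb)
        for d, sa, sb in zip(diff, sigma_a, sigma_b)
    )


def guinier_rg(q, intensity):
    """Estimate Rg from a least-squares line through ln I versus q^2.

    Valid only at low q (roughly q * Rg < 1.3).
    """
    x = [qi * qi for qi in q]
    y = [math.log(i) for i in intensity]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("Guinier slope must be negative")
    return math.sqrt(-3.0 * slope)


if __name__ == "__main__":
    # Synthetic, noise-free Guinier-region data generated with Rg = 28.1 Å.
    q = [0.005 + 0.002 * k for k in range(20)]
    profile = [math.exp(-((qi * 28.1) ** 2) / 3.0) for qi in q]
    print(guinier_rg(q, profile))
```

On real data the fit range, buffer subtraction, and error model all affect the result, so this is best read as a description of what Rg and the difference profile represent.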

Discussion
These results verify that SARS-CoV-2 does indeed have a functional −1 PRF site. They also show that the properties of the frameshift signal in SARS-CoV-2 are very similar to those of the frameshift signal in SARS-CoV. Not only was the level of −1 PRF close to identical for both viruses, but disrupting stems 1 and 2 in the stimulatory pseudoknot abolished frameshifting in both cases, whereas disrupting stem 3 reduced −1 PRF but did not abolish it in each case. Furthermore, each frameshift signal featured an attenuator hairpin that promoted modestly decreased −1 PRF levels, and the global structures of the two pseudoknots as reflected in the SAXS scattering profiles were virtually identical.
The very close correspondence in the properties of the frameshift-stimulatory pseudoknots from SARS-CoV and SARS-CoV-2 suggests that other properties of the former that have been characterized in previous studies are highly likely to carry over to SARS-CoV-2. For example, deletion of stem 3 will likely lead to little or no change in −1 PRF, whereas mutation of the A bulge in stem 2 will likely abolish −1 PRF (7,8), the pseudoknot will likely dimerize via interactions between loop 2 (20), and suppression of −1 PRF will most likely attenuate viral propagation (11). This likely susceptibility of SARS-CoV-2 to attenuation by suppressing −1 PRF is of particular interest, because it suggests that targeting −1 PRF may provide a promising avenue for therapeutic intervention. Previous work on SARS-CoV found that antisense peptide nucleic acids could inhibit both −1 PRF and virus replication (21). The fact that the compound MTDB, which was found in a computational search for −1 PRF inhibitors in SARS-CoV (17), is similarly active at suppressing −1 PRF in SARS-CoV-2 provides concrete evidence for small-molecule frameshifting inhibitors in SARS-CoV-2 and supports the hypothesis that the frameshift-stimulatory pseudoknot may be an attractive therapeutic target.

Experimental procedures
Identification of the SARS-CoV-2 −1 PRF signal and computational methods
The SARS-CoV-2 −1 PRF signal was identified from the complete genome sequence (NCBI sequence NC_045512.2). The EMBOSS Water pairwise alignment tool was used to identify sequences in the SARS-CoV-2 genome most similar to the SARS-CoV −1 PRF sequence. One hit was reported between bases 13,461 and 13,547 of SARS-CoV-2 that was 98.9% identical to the original SARS sequence. The SARS-CoV-2 sequence contains a single point mutation from C to A at base 13,533. EMBOSS Water was used to generate pairwise alignments between sequences derived from SARS-CoV (GenBank™ entry NC_004718.3, begin nt 13361, end nt 13478) and SARS-CoV-2 (GenBank™ entry NC_045512.2, begin nt 13431, end nt 13547).
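The reported identity follows from the length of the hit and the single mismatch. The sketch below assumes a gap-free alignment scored position by position, which is a simplification of what EMBOSS Water reports, and it uses toy sequences in place of the viral ones. The function names are hypothetical.

```python
def span_length(start: int, end: int) -> int:
    """Number of bases in an inclusive, 1-based coordinate range."""
    return end - start + 1


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of matching positions between two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length (no gaps handled)")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


if __name__ == "__main__":
    n = span_length(13461, 13547)
    # Toy sequences only: n identical bases with one C-to-A substitution.
    reference = "C" * n
    variant = "C" * (n - 1) + "A"
    print(n, round(percent_identity(reference, variant), 1))
```

An 87-nt hit with one mismatch gives 86/87 matching positions, which rounds to the 98.9% identity quoted above.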

Preparation of plasmids and RNA transcription templates
Plasmids for cell-based Dual-Luciferase assays for SARS-CoV-2 were generated by site-directed mutagenesis of the pJD2359 plasmid (SARS-CoV pSGDluc reporter plasmid) (8), introducing a single C-to-A point mutation at base 1873, corresponding to the point mutation in the SARS-CoV-2 genome (Q5 site-directed mutagenesis kit, NEB). Site-directed mutagenesis primers (Table S1) were synthesized and purified by IDT. Products were transformed into DH5α Escherichia coli cells (NEB) and spread onto LB agar plates containing 50 µg/ml carbenicillin. Positive clones were verified by DNA sequencing (Genewiz). The frameshift reporter negative controls and reporter constructs containing silent mutations disrupting the −1 PRF slippery site (ssM), stem 1 (St1), stem 3 (St3), and attenuator hairpins were constructed similarly by site-directed mutagenesis. Reporters containing silent mutations to stem 2 were made by digesting pJD2257 with SalI and BamHI and ligating a DNA oligonucleotide insert (IDT) containing the silent mutations to stem 2 of SARS-CoV and SARS-CoV-2 into the plasmid using T4 DNA ligase (NEB).
Plasmids for cell-free Dual-Luciferase assays were made as described previously (22). Briefly, the reporter construct was made by cloning the sequence for Renilla luciferase and the SARS-CoV-2 frameshift signal in the 0 frame upstream of the firefly luciferase sequence in the pISO plasmid (Addgene), with firefly luciferase in the −1 frame. A negative control was made by replacing part of the slippery sequence with a stop codon, and a positive control was made without a frameshift signal and with the two luciferases in-frame. RNA transcription templates were amplified from these plasmids by PCR and transcribed in vitro by T7 RNA polymerase.
Plasmids for producing samples for SAXS were prepared by ligating an insert containing the sequences of the SARS-CoV and SARS-CoV-2 pseudoknots into the BamHI and SpeI sites of the pMLuc-1 plasmid as described previously (23). RNA transcription templates were amplified from these plasmids by PCR, including three extra nucleotides upstream of the pseudoknot and four downstream (all Us). The forward PCR primer was extended on its 5′ end to include the T7 polymerase promoter. The transcription templates were then transcribed in vitro by T7 RNA polymerase. Plasmids used in this study are shown in Table S2.
Figure 4 legend (fragment). Inset, scattering profiles from inline SEC-SAXS measurements, containing purely monomeric pseudoknots. B and C, difference between the scattering profiles for SARS-CoV and SARS-CoV-2 pseudoknots obtained from lab-purified SAXS (B) and inline SEC-SAXS (C) samples. Arb., arbitrary.
… 061), 1% nonessential amino acids (Fisher Scientific 11140-050), 1% HEPES buffer (Fisher Scientific 15630-030), and 1× penicillin/streptomycin (Fisher Scientific 15140-122) at 37 °C in 5% CO2. HeLa cells were maintained in Dulbecco's modified Eagle's medium supplemented with 10% fetal bovine serum, 1% GlutaMAX, and 1× penicillin/streptomycin at 37 °C in 5% CO2. HEK293T and HeLa cells were seeded at 4 × 10⁴ cells/well into 24-well plates. The cells were transfected 24 h after seeding with 500 ng of Dual-Luciferase reporter plasmid using Lipofectamine 3000 (Invitrogen L3000015) per the manufacturer's protocol.

Dual-Luciferase assays of −1 PRF
The frameshifting efficiency of the reporter plasmids in cultured cells was assayed as described previously (15,16) using a Dual-Luciferase reporter assay system kit (Promega). 24 h after transfection, the cells were washed with 1× PBS and then lysed with 1× passive lysis buffer (E194A, Promega). Cell lysates were assayed in triplicate in a 96-well plate, and luciferase activity was quantified using a GloMax microplate luminometer (Promega). The percent frameshift was calculated by averaging the three firefly or Renilla luciferase technical replicate reads per sample and then forming a ratio of firefly to Renilla luminescence per sample. Each sample luminescence ratio was compared with a read-through control set to 100%. The ratio of ratios for each sample is the percent frameshift for the sample. A minimum of three biological replicates were assayed for each sample, each of which was assayed in triplicate (technical replicates). Mean technical replicate values of each biological replicate are depicted on graphs with standard error of the mean for biological replicates. Statistical analyses were conducted using Student's t test or one-way analysis of variance as appropriate using Prism 8 software (GraphPad).
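The ratio-of-ratios calculation described above can be sketched as follows. The function names and the luminescence values are hypothetical and serve only to show the order of operations; the statistical tests performed in Prism are not reproduced.

```python
import math
from statistics import mean, stdev


def luminescence_ratio(firefly_reads, renilla_reads):
    """Firefly/Renilla ratio from the averaged technical replicate reads."""
    return mean(firefly_reads) / mean(renilla_reads)


def percent_frameshift(sample_ff, sample_rl, control_ff, control_rl):
    """Sample ratio expressed relative to a read-through control set to 100%."""
    sample = luminescence_ratio(sample_ff, sample_rl)
    control = luminescence_ratio(control_ff, control_rl)
    return 100.0 * sample / control


def mean_and_sem(biological_replicates):
    """Mean and standard error of the mean across biological replicates."""
    n = len(biological_replicates)
    return mean(biological_replicates), stdev(biological_replicates) / math.sqrt(n)


if __name__ == "__main__":
    # Made-up reads for one biological replicate (three technical replicates).
    value = percent_frameshift(
        [200.0, 210.0, 190.0], [1000.0, 1000.0, 1000.0],
        [1000.0, 990.0, 1010.0], [1000.0, 1000.0, 1000.0],
    )
    print(value)
    print(mean_and_sem([19.0, 20.0, 21.0]))
```

Normalizing firefly to Renilla within each well corrects for differences in transfection and expression, and dividing by the read-through control converts that ratio into a percentage of ribosomes that shift frame.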
To measure −1 PRF efficiency in cell-free assays, 2 µg of mRNA from each construct was heated to 65 °C, mixed with 35 µl of nuclease-treated RRL (Promega) and 0.5 µl of 1 mM amino acid mixture lacking Leu and Met, and then incubated for 90 min at 30 °C. The firefly luminescence from each of the constructs was measured after incubating 20 µl of each reaction with 100 µl of Dual-Glo Luciferase reagent (Promega) for 10 min, and then Renilla luminescence was measured 10 min after adding 100 µl of Dual-Glo Stop and Glo reagent. The −1 PRF efficiency was calculated from the ratio of firefly and Renilla luminescence, subtracting the background measured from the negative control and normalizing by the positive control. Eight independent measurements were made without MTDB, and four were made with MTDB.
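One possible reading of the cell-free calculation is sketched below. The text does not specify whether the background is also subtracted from the positive control; this sketch assumes that it is, and that assumption should be checked against the cited protocol (22). The function name and the example ratios are hypothetical.

```python
def cell_free_prf_efficiency(sample_ratio, negative_ratio, positive_ratio):
    """Background-subtracted, positive-control-normalized -1 PRF efficiency (%).

    Each argument is a firefly/Renilla luminescence ratio. Subtracting the
    negative-control ratio from both numerator and denominator is an
    assumption of this sketch, not a detail stated in the text.
    """
    denominator = positive_ratio - negative_ratio
    if denominator <= 0:
        raise ValueError("positive control must exceed the negative control")
    return 100.0 * (sample_ratio - negative_ratio) / denominator


if __name__ == "__main__":
    # Made-up ratios chosen only to illustrate the calculation.
    print(cell_free_prf_efficiency(0.40, 0.04, 1.04))
```

Under this reading, a construct indistinguishable from the negative control scores 0% and one matching the in-frame positive control scores 100%.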

SAXS measurements
RNA samples for SAXS experiments were made by in vitro transcription of DNA templates followed by ethanol precipitation of the RNA. To avoid dimerization, RNA was resuspended in a low-salt solution (50 mM MOPS, 10 mM KCl, pH 7.5). The RNA was annealed by heating to 95 °C for 5 min and then placed on ice. After concentration with a spin concentrator, a fraction of the RNA was set aside for inline SEC-SAXS, performed just prior to X-ray exposure, whereas the rest was purified by SEC in a column equilibrated with the SAXS buffer (50 mM MOPS, 130 mM KCl, pH 7.5). Selected peak fractions of these lab-purified samples were then concentrated to 17.3 µM for the SARS-CoV RNA and 19.2 µM for the SARS-CoV-2 RNA shown in the figure. All samples were sent to the National Synchrotron Light Source II for data acquisition.
The SAXS data were collected at the Life Sciences X-Ray Scattering Beamline (LIX) at Brookhaven National Laboratory using their standard solution scattering set-up, experimental procedures, and data-processing packages (24). SEC-SAXS was performed on a Superdex 200 increase 5/150 GL column (GE) equilibrated in the SAXS buffer condition.

Data availability
Full data sets of −1 PRF assays are available upon request. Please contact Dr. Jonathan D. Dinman, University of Maryland, dinman@umd.edu.