The 5′UTR of HCoV-OC43 adopts a topologically constrained structure to intrinsically repress translation

The emergence of SARS-CoV-2, which is responsible for the COVID-19 pandemic, has highlighted the need for rapid characterization of viral mechanisms associated with cellular pathogenesis. Viral UTRs represent conserved genomic elements that contribute to such mechanisms. Structural details of most CoV UTRs are not available, however. Experimental approaches are needed to allow for the facile generation of high-quality viral RNA tertiary structural models, which can facilitate comparative mechanistic efforts. By integrating experimental and computational techniques, we herein report the efficient characterization of conserved RNA structures within the 5′UTR of the HCoV-OC43 genome, a lab-tractable model coronavirus. We provide evidence that the 5′UTR folds into a structure with well-defined stem-loops (SLs) as determined by chemical probing and direct detection of hydrogen bonds by NMR. We combine experimental base-pair restraints with global structural information from SAXS to generate a 3D model that reveals that SL1-4 adopts a topologically constrained structure wherein SLs 3 and 4 coaxially stack. Coaxial stacking is mediated by short linker nucleotides and allows SLs 1 to 2 to sample different cojoint orientations by pivoting about the SL3,4 helical axis. To evaluate the functional relevance of the SL3,4 coaxial helix, we engineered luciferase reporter constructs harboring the HCoV-OC43 5′UTR with mutations designed to abrogate coaxial stacking. Our results reveal that the SL3,4 helix intrinsically represses translation efficiency since the destabilizing mutations correlate with increased luciferase expression relative to wildtype without affecting reporter mRNA levels, thus highlighting how the 5′UTR structure contributes to the viral mechanism.

The COVID-19 pandemic is a reminder of the pervasive threat that RNA viruses pose to global human health. According to the World Health Organization, 652 million cases of SARS-CoV-2 infection had been reported worldwide as of December 2022, with 6.65 million deaths from complications associated with viral pathogenesis. A common mechanism of all positive-sense RNA (+ssRNA) viruses, which include Coronaviridae (CoV), is that viral replication initiates in the cytoplasm of infected cells via processes that are tightly regulated by RNA structures located within the 5′-untranslated regions (UTRs). Viral 5′UTRs are genetic elements under high selective pressure to maintain their overall structural topologies, even when their sequences diverge. This is because the structures regulate essential gene expression events during the earliest stages of viral infection (1).
Betacoronaviruses constitute an important genus of the larger CoV family and are responsible for human illnesses, ranging from the common cold to more morbid conditions, such as respiratory failure, neurological complications, and even death (2-5). Human coronavirus OC43 (HCoV-OC43) is a nonsegmented, enveloped virus, which causes mild to moderate respiratory illness and is endemic in the population, resulting in frequent reinfections throughout life (6). Owing to its low pathogenicity, HCoV-OC43 is a tractable model coronavirus for investigating structure-conservation-function relationships of RNA elements from the 5′UTR and to calibrate novel chemical modalities with capacity to target CoV RNA structural elements (7-9).
Multiple studies have shown conservation of the 5′UTR structure among the members of the three major CoV genera (Alpha-, Beta-, and Gamma-), particularly for stem-loops (SLs) 1, 2, and 4 (1, 10, 11). Other SLs have been observed for different CoVs, suggesting species-specific and host-dependent fine tuning of biological function (1, 12). The current understanding is that SLs 1 to 4 modulate steps of viral protein and RNA synthesis, since mutations, deletions, or substitutions of the SLs differentially affect gene expression and viral viability (10). For example, specific mutations to the pyrimidine-rich pentaloop within SL2 reduce the viability of murine hepatitis virus (MHV) by affecting subgenomic RNA (sgRNA) synthesis (11, 13). Moreover, substitution of U48 in MHV with either a C or an A retains viral viability, whereas the U48G substitution does not (11, 13). In addition, substitutions at C47, C51, or C59 maintain viral replication fidelity, while a substitution at G50 does not, confirming the selective pressure on SL2. Related functional phenotypes are observed when the structures of SL1, 3, and 4 are also changed or swapped (10, 12, 14-17).
Collectively, these observations indicate that SLs 1 to 4 of the 5′UTR confer essential and shared functional capacity across most CoVs.
Predictive and empirical secondary-structural models of CoV 5′UTRs have been reported, with recent structures of SARS-CoV-2 confirming the conserved foldability of SLs 1 to 5 (18-23). Surprisingly, there are no structures of the HCoV-OC43 5′UTR despite it being responsible for widespread and seasonal viral infections and coinfection. To fill this knowledge gap, we herein report a combined chemical- and NMR-probed structure of the 5′UTR (nucleotides 4-329) of HCoV-OC43. Our chemical modification results show that the 5′UTR folds into five SLs consistent with prior phylogenetic comparisons (1). NMR analysis of isolated SLs 1 to 4 and an intact construct (SL1-4) verifies that this region folds independently of the remaining 5′UTR. Small-angle X-ray scattering (SAXS) further validates that these elements have native tertiary structure and that SLs 3 and 4 coaxially stack. This provides a common axis about which SLs 1 and 2 can sample multiple orientations. Interestingly, evolutionary trace analysis (ETA) on a 3D structural model, refined with empirical hydrogen-bond restraints and SAXS scattering data, demonstrates that the compactness of the tertiary structure brings distally conserved residues within close proximity so as to form an extended surface. The functional significance of the SL3,4 coaxial helix was further probed via a luciferase reporter assay wherein mutations designed to abrogate stacking of SL3 and SL4 correlate with an increase in luciferase expression. We also show that the N-terminal domain (NTD) of the nucleocapsid protein binds the leader transcription regulatory sequence (TRS-L), located within the apical loop of SL3, with distinctly different thermodynamics depending on its structural context. In sum, this study shows that SLs 1 to 4 of the HCoV-OC43 5′UTR form an evolutionarily conserved 3D architecture, which we posit can be targeted with chemical modalities in order to modulate viral replication. The work further verifies the utility of integrating orthogonal methods to provide efficient mechanistic insights into viral RNA elements at multiple levels of structural resolution (24).

Results
The HCoV-OC43 5′UTR folds into an evolutionarily conserved structured domain
Previous global phylogenetic comparisons of β-CoVs suggest that the HCoV-OC43 5′UTR folds into a structure composed of five conserved SL domains (1); however, this structure has not been validated experimentally. To obtain an empirical model of the HCoV-OC43 5′UTR, we probed the first 327 nts using DMS-MaPseq (25). Figure 1A shows a plot of the normalized DMS reactivity indices of the 5′UTR. Interestingly, the data reveal low to moderate DMS modifications for the first 150 nts of the 5′UTR, with increasingly higher reactivity indices extending out to the 3′ end (Fig. 1A). The reactivity indices of the last 150 nts were on average 2-fold higher than those comprising the 5′ end. We incorporated the reactivity indices as restraints into RNAstructure (26) to determine an empirical model of the 5′UTR. Figure 1B illustrates that the population-averaged structure of the 5′UTR is composed of four isolated SLs and a fifth multibranched domain (SL5mb) consisting of five sub-SLs (5A-5E) organized around a central junction (Fig. 1B). Of note, there are no DMS restraints included for the first 24 and last 27 nts due to the design of the sequencing primers, so their base pairing is exclusively determined by the thermodynamic parameters. Mapping the reactivity indices onto the HCoV-OC43 5′UTR structure shows that SLs 1 to 4 have low to moderate DMS modifications, with the most reactive region localized to the apical loop of SL3 and a short linker connecting to the lower helix of SL4 (Fig. 1B). By comparison, SL5mb shows significantly higher DMS modifications spread out over each of the five sub-SLs and the central junction. Despite being the largest singly folded structure in the 5′UTR, SL5E shows the highest degree of modification primarily adjacent to or within several structural motifs (Fig. 1B).
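The regional comparison of DMS reactivity described above (the last 150 nts versus the first 150 nts) amounts to averaging the reactivity indices over probed adenosines and cytosines in two windows. The following is a minimal sketch of that arithmetic, not the analysis pipeline used in this work; the function names and the toy sequence and reactivity values are hypothetical and serve only to illustrate the calculation.

```python
def mean_ac_reactivity(sequence, reactivity, start, end):
    """Mean reactivity over probed A/C positions in the 1-based inclusive window [start, end].

    Positions without data (None), such as primer-binding sites, are skipped.
    """
    values = [
        reactivity[i]
        for i in range(start - 1, end)
        if sequence[i] in "AC" and reactivity[i] is not None
    ]
    if not values:
        raise ValueError("no probed A/C positions in window")
    return sum(values) / len(values)


def fold_difference(sequence, reactivity, window_a, window_b):
    """Ratio of mean A/C reactivity in window_b to that in window_a."""
    return (mean_ac_reactivity(sequence, reactivity, *window_b)
            / mean_ac_reactivity(sequence, reactivity, *window_a))


# Hypothetical toy data: G/U positions carry no DMS signal and are stored as None.
toy_sequence = "ACGUACGUAC"
toy_reactivity = [0.1, 0.1, None, None, 0.1, 0.1, None, None, 0.2, 0.2]
print(fold_difference(toy_sequence, toy_reactivity, (1, 6), (7, 10)))
```

Restricting the average to A and C positions mirrors the fact that DMS reactivities were only obtained for unpaired adenosines and cytosines.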
Since DMS reactivities were only detected for unpaired adenosines and cytosines, we also independently probed the HCoV-5′UTR using selective 2′-hydroxyl acylation analyzed by primer extension, with mutational profiling (SHAPE-MaP, Fig. S1). Figure 1C shows the HCoV-5′UTR folded using SHAPE (2-methylnicotinic acid imidazole [NAI]) reactivity indices as restraints in RNAstructure (26). The population-averaged structure of the SHAPE-derived HCoV-5′UTR model is very similar to that determined using DMS reactivity indices. SLs 1 to 4 adopt identical apical loop topologies in the two models, confirming that these structures are thermodynamically favored. Minor differences in the helical lengths of SL3 are observed between the SHAPE and DMS models, with the SHAPE structure containing two additional terminal AU base pairs. The additional terminal pairs shorten the SL2-3 and SL3-4 linkers relative to the DMS-derived structure (Fig. 1, B and C). SL4 also shows minor differences in the two structures, with the lower helix being shortened by one base pair and two new Watson-Crick (WC) base pairs within the large internal loop in the SHAPE model.
The most significant differences between the SHAPE- and DMS-derived models are observed within the SL5mb domain (Fig. 1, B and C). Both models contain a central helix with near identical base pair composition; however, the SHAPE-derived structure consists of three sub-SLs (5A′-5C′) compared with five observed in the DMS-determined structure. The apical sequences of SLs 5A/5A′ are identical in the two models, as are those of SLs 5B′/5C and 5C′/5E. Sub-SLs 5B-5D of the DMS-derived model primarily fold into the longer SL5B′ in the SHAPE structure. Notably, the AUG start codon localizes to distinct structural environments within the two models. In the DMS structure, the AUG start codon is located in a linker connecting SL5D to SL5E, whereas in the SHAPE structure it is adjacent to a single-nucleotide bulge and partially base paired in SL5B′ (Fig. 1, B and C). The structural variability of the SL5mb domain observed between the DMS- and SHAPE-determined structures indicates that this region is more conformationally dynamic than SLs 1 to 4. In general, the empirical HCoV-OC43 5′UTR structures determined here are consistent with the consensus structure previously proposed by phylogenetic comparisons of group 2A β-CoVs (also known as embecoviruses), except for SL5mb, where this region was predicted to fold into three independent SLs (1).
To investigate if conserved residues localize to specific structures within the 5′UTR, we performed ETA on 17 β-CoV 5′UTR sequences curated by Rfam (27). RNA ETA was developed in order to identify conserved nucleotide clusters that map to potentially functional surfaces (28). To that end, we reasoned that RNA ETA might be useful for identifying structural motifs within the HCoV-OC43 5′UTR that are under evolutionary pressures to maintain sequence clusters, even in the background of compensatory substitutions. ETA indicates that the apical loops of SL2 and SL3 contain the most conserved nucleotide clusters, which is consistent with previous phylogenetic studies of β-CoVs (28). Moderately conserved nucleotides also cluster to the apical loops of SL5B, SL5C, and SL5D. By comparison, the non-WC structural motifs of SL1, SL4, SL5A, and SL5E are more divergent (Fig. 1D). Collectively, these results indicate that the HCoV-OC43 5′UTR folds into a highly structured domain that positions clusters of conserved nucleotides in distinct structural environments.

Isolated stem-loops 1 to 4 adopt defined tertiary structures as determined by NMR and SAXS
As an orthogonal evaluation of structures within the 5′UTR, we investigated the solution biophysical properties of isolated SLs 1 to 4 by NMR spectroscopy and SAXS. The extent of DMS modification for SLs 1 to 4 is 2-fold lower compared with the remaining 3′ end, suggesting that the structures within the first 150 nts are more conformationally stable. A similar positional difference in chemical reactivity was also observed by SHAPE-MaP (Fig. S1). Constructs that correspond to SL1-2, SL3, SL4, and SL1-4, based on the DMS-determined model, were in vitro transcribed with or without (¹⁵N/¹³C) selectively labeled rNTPs. Details of the NMR and SAXS analysis for each construct are described separately below. For NMR studies (283 K), samples were prepared in 25 mM K₂HPO₄, 50 mM KCl, pH 6.2 in 10% D₂O. SAXS measurements (298 K) were collected on samples prepared in 5 mM Mes, 50 mM KCl, pH 6.5.
HCoV-OC43 5′UTR stem-loops 1 and 2
SLs 1 to 2 span nucleotides 4 to 53 of the 5′UTR of HCoV-OC43. To achieve high yields by T7 RNAP-dependent in vitro transcription, we included a nonnative guanosine on the 5′ end (Fig. 2A). ETA for SL1 indicates relatively low levels of conservation, including within the internal and apical loops. Helical nucleotides show correlated levels of divergence consistent with previous observations that the majority of these residues covary to maintain structure (1). Conversely, ETA shows conserved clustering of nucleotides within the 6-nt apical loop of SL2, with a lower rate of correlated divergence in its 5-bp helix (Fig. 1D). In agreement with the evolutionary pressure to maintain stable structure, sequential and long-range (G/U)NH-(G/U)NH NOE cross-peaks can be traced in the ¹H-¹H NOESY spectrum for most of the WC base pairs that were independently verified by low DMS and SHAPE reactivity (Fig. 2B). Direct measurement of hydrogen-bond interactions, as detected by HNN-COSY for 12 of 15 expected WC pairs, provides unambiguous evidence of the secondary structure of SL1-2 (Fig. 2E). Notably, we observed strong NOE cross-peaks between U13 and U26 within the internal loop of SL1, suggesting that the WC edges of these bases are likely engaged in some type of pairing interaction, which also agrees with their ¹H/¹⁵N chemical shifts (Fig. 2C). Indeed, weak scalar couplings are observed between U12:C28 and U13:C27, revealing that these nucleotides form transient WC-type base pairs, which is also consistent with the relatively low chemical modification of C27 and C28 (Fig. 1, B and C). Taken together, the NMR data provide evidence that SL1,2 adopts a defined structure that includes additional noncanonical base-base interactions within conserved structural motifs.
Figure 1 (legend, partial). The normalized reactivity indices are superimposed as a scaled color code from 0 (low reactivity) to 1 (high reactivity) for each modified A or C nucleotide. Guanosine and uracil nucleotides are rendered in white, and nucleotides that overlap with primers are rendered in a lower transparency. C, the secondary structure was also calculated using the reactivity indices collected by 2-methylnicotinic acid imidazole modification and shown in a similar manner, with intensities colored as relatively low (black), moderate (orange), and high (red). D, the evolutionary conservation of each nucleotide as determined with evolutionary trace analysis is superimposed onto the calculated secondary structure, with the highest level of conservation in red and the lowest conservation in blue. Nucleotides that do not have data are rendered in a lower transparency.

To gain insights into the global topology of SL1,2, SAXS measurements were collected on the 52-nt construct. Figure 2, F-I show SAXS data sets on an SL1,2 sample that was resolved by size exclusion chromatography (SEC). The radius of gyration (Rg) calculated from the linear region of the Guinier plot is 23.83 Å (Table S2), in close agreement with that expected for an RNA with A-form helical geometry. The overall shape of the pairwise distribution (P(r)) plot is asymmetric, with its maximum peak centered at around 20 Å, monotonically tailing off to zero with a maximum dimension (Dmax) of 86 Å. The profile of the P(r) plot suggests that SL1,2 adopts an elongated rod-like structure in solution that likely arises through coaxial stacking of the respective helices. In agreement with a compact topology, the Kratky plot of SL1,2 has an inverted parabolic shape that is characteristic of stably folded structures. In sum, the SAXS data confirm that SL1-2 folds into a topologically stable 3D structure.
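For readers less familiar with Guinier analysis, the radius of gyration is obtained from the low-q region, where ln I(q) is approximately linear in q² with slope −Rg²/3. The sketch below illustrates that fit on synthetic data; it is not the software used for the analysis reported here. The function names, the q·Rg ≤ 1.3 cutoff (a common convention that may need to be lowered for elongated particles), and the noise-free test curve are assumptions made for illustration.

```python
import math


def linear_fit(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def guinier_rg(q, intensity, qrg_max=1.3):
    """Estimate (Rg, I(0)) from ln I(q) = ln I(0) - (Rg**2 / 3) * q**2.

    q must be sorted in ascending order; q in 1/Angstrom gives Rg in Angstrom.
    The fit window is shrunk until all fitted points satisfy q * Rg <= qrg_max.
    """
    n = len(q)
    for _ in range(50):
        slope, intercept = linear_fit([v * v for v in q[:n]],
                                      [math.log(v) for v in intensity[:n]])
        if slope >= 0:
            raise ValueError("Guinier slope is not negative; cannot derive Rg")
        rg = math.sqrt(-3.0 * slope)
        n_new = sum(1 for v in q if v * rg <= qrg_max)
        if n_new == n or n_new < 3:
            break
        n = n_new
    return rg, math.exp(intercept)


# Synthetic, noise-free Guinier curve built with the Rg reported for SL1,2.
q_demo = [0.005 + 0.001 * i for i in range(40)]
i_demo = [2.0 * math.exp(-((v * 23.83) ** 2) / 3.0) for v in q_demo]
print(guinier_rg(q_demo, i_demo))
```

On real data, the linearity of the fitted window (and the absence of an upturn at the lowest q) is what supports the conclusion that a sample is monodisperse.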
HCoV-OC43 5′UTR stem-loop 3
SL3 spans nucleotides 56 to 77 of the HCoV-OC43 5′UTR. In order to achieve high yields during transcription, we included two nonnative guanosines on the 5′ end and two nonnative cytosines on the 3′ end of SL3 (Fig. 3A). ETA for SL3 indicates relatively high levels of conservation, particularly in the 7-nt TRS-L, which spans the apical loop of SL3 as 5′-aUCUAAACUu-3′ (Fig. 1D). This is consistent with the evolutionary pressure for this region to maintain a single-stranded context to facilitate RNA-RNA interactions that promote sgRNA synthesis (10). Analysis of DMS data reveals higher levels of reactivity in the adenosines located within or around the apical loop (Fig. 3B). Direct measurement of hydrogen-bond interactions for six of eight expected WC base pairs by HNN-COSY, coupled with sequential NOE crosspeaks observable in the ¹H-¹H NOESY, provides unambiguous evidence for the secondary structure (Fig. 3, C and D). Notably, we observed moderately intense NOE interactions between U63 and U70 located at the base of the apical loop (Fig. 3, A and D), indicating that the imino groups are protected from rapid solvent exchange. Moreover, the ¹H/¹⁵N chemical shifts of U63 and U70 are consistent with values expected for non-WC base pairs. Taken together, these data show that SL3 forms a stable structure, with the presence of a U63:U70 base pair located within the apical loop containing the TRS-L.
Figure 2 (legend, partial). Nonnative nucleotides are denoted using N(+/−#) according to genomic numbering. Size exclusion chromatography-SAXS results for the SL12 construct are shown (F) with the overall scattering intensity (I(q)) versus q, (G) Guinier fit, (H) unitless Kratky plot, and (I) P(r) versus r profile from the data in normalized equal area (i.e., proportional to P(r)/r) for ease of comparison. SAXS experiments were performed in 5 mM Mes, 50 mM KCl, pH 6.5.
To gain further insights into the global topology of SL3, SAXS measurements were collected on the 26-nt construct. Figure 3, F-I show SAXS datasets on an SL3 sample that was resolved by SEC. The radius of gyration (Rg) calculated from the linear region of the Guinier plot is 15.43 Å (Table S2). By comparison with the SL1-2 construct, the overall profile of the P(r) plot is less asymmetric, with the peak centered at around 18.5 Å and a Dmax of 51.0 Å, suggesting that SL3 is more globular in shape (Table S2). The overall folded structure of SL3 is further corroborated by the inverted parabolic profile of the Kratky plot (Fig. 3H).
HCoV-OC43 5′UTR stem-loop 4
SL4 spans nucleotides 79 to 131 of the HCoV-OC43 5′UTR; however, two terminal GC base pairs were added to the construct in order to achieve high yields during transcription (Fig. 4A). ETA depicts relatively low levels of conservation among the nucleotides in SL4 with the exception of the G/U-rich apical loop. Nucleotides comprising the helices, including the AUG of the short, upstream open reading frame, show relatively high rates of evolutionary divergence, as expected for positions that undergo compensatory substitutions to maintain base pairing. Analysis of the DMS and SHAPE modification patterns of SL4 shows that nucleotides located in the upper helix are generally less reactive than those located in the lower helix (Fig. 1). Consistent with the modification pattern, direct measurement of hydrogen bonding for seven of eight expected WC base pairs is observed in the upper helical region. Furthermore, sequential and long-range NOE interactions were traced for the upper helix, providing evidence of continuous base stacking. Conversely, we were only able to confidently assign four of the eight native base pairs of the lower helix; however, several broad diagonal peaks with chemical shifts consistent with the expected base-pair composition can be detected in the ¹H-¹H NOESY, indicating that the missing imino NOEs likely result from rapid solvent exchange. We cannot exclude the possibility that local conformational dynamics also contribute to peak broadening, however. A difference in solvent exchange properties suggests that the upper helix is more stable than the lower, which is also reflected in the chemical modification profile for SL4 (Fig. 1, A and B). Interestingly, we observed NOE crosspeaks between G92 and one or more unassigned uracils located within the internal loop of SL4 (Fig. 4B), indicating that this large internal loop contains residual structure as observed in the SHAPE-determined model (Fig. 1C). Together, the NMR data reveal that SL4 folds into a well-defined structure that includes a partially stacked internal loop that separates two base-paired helices.
We next investigated the overall topology of SL4 by collecting SAXS measurements on samples resolved by SEC (Fig. 4, F-I). The radius of gyration (Rg) calculated from the linear region of the Guinier plot is 24.87 Å (Table S2), in close agreement with that expected for an RNA with an A-form helical geometry. The overall profile of the P(r) plot is asymmetric, with the peak centered at around 25 Å and a Dmax of 92 Å (Table S2). Furthermore, the Kratky plot of SL4 has an inverted parabolic shape that is characteristic of stably folded structures (Fig. 4H). In summary, the SAXS data confirm that SL4, which contains a large internal loop, adopts a topologically defined 3D architecture.
The intact SL1-4 locus contains a coaxially stacked helix that juxtaposes conserved nucleotide clusters
As described above, the first 131 nts of the HCoV-OC43 5′UTR fold into four consecutive SLs that are connected via short (1- to 2-nt) linkers (Fig. 5, A and B). Low to moderate DMS reactivities were detected for most of the terminal base pairs and linking nucleotides, except for the 1-nt linker that connects SL3-4 and the terminal base pair of SL4 (Fig. 1B). By contrast, the linker uracils in the SHAPE-derived model display moderately low reactivities and as such form base pairs. As a step toward characterizing the overall structure of the SL1-4 locus, we in vitro transcribed nucleotides 4 to 131 containing 13 [...] between the SL2,3 and SL3,4 linkers as observed in the SHAPE-derived structure (Fig. 1C).
We also collected SAXS data on the intact SL1-4 locus to evaluate its global structural properties. Figure 5, D-G summarize SAXS measurements on a representative SL1-4 sample resolved by SEC. The Guinier plot is linear, indicating that SL1-4 is monodisperse. The calculated radius of gyration (Rg) is 52.34 ± 0.4 Å, considerably larger than expected for an RNA with all A-form helical topology. Consistent with the NMR data, the Kratky plot of SL1-4 shows an inverted profile that further validates that the RNA is stably folded in solution. Interestingly, the pairwise distance distribution function is highly asymmetric, with a peak maximum at around 20 Å, followed by a broader peak, which eventually decays to zero at a Dmax of 200 Å. The Rg determined from the P(r) analysis is 53.86 ± 0.25 Å, in close agreement with the value calculated by the Guinier plot. The P(r) profile suggests that the SL1-4 locus adopts a topology consisting of a central coaxially stacked helix, with at least one SL extending outward from a common junction. Such a topological arrangement is consistent with the secondary structure of SL1-4 and the moderately low chemical modifications observed at the SL termini.
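The real-space radius of gyration quoted above follows from the second moment of the pair distance distribution, Rg² = ∫r²P(r)dr / (2∫P(r)dr). The sketch below illustrates that relation numerically; it is not the program used for the P(r) analysis in this work. The function names are hypothetical, and the test particle is an analytical uniform sphere (for which Rg = √(3/5)·R), chosen only because its P(r) is known in closed form.

```python
import math


def trapezoid(x, y):
    """Trapezoidal integral of y over x."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i]) for i in range(len(x) - 1))


def rg_from_pr(r, p):
    """Real-space Rg from P(r): Rg**2 = int r**2 P(r) dr / (2 * int P(r) dr)."""
    second_moment = trapezoid(r, [rv * rv * pv for rv, pv in zip(r, p)])
    return math.sqrt(second_moment / (2.0 * trapezoid(r, p)))


def sphere_pr(r, radius):
    """Unnormalized analytical P(r) of a uniform sphere; zero beyond the diameter."""
    if r > 2.0 * radius:
        return 0.0
    x = r / radius
    return r * r * (1.0 - 0.75 * x + x ** 3 / 16.0)


radius = 50.0  # hypothetical test particle radius in Angstrom (Dmax = 100 Angstrom)
r_grid = [0.05 * i for i in range(2001)]
p_grid = [sphere_pr(r, radius) for r in r_grid]
print(rg_from_pr(r_grid, p_grid), math.sqrt(3.0 / 5.0) * radius)
```

Because this estimate uses the whole scattering curve through P(r) rather than only the lowest-q points, agreement with the Guinier value is a useful internal consistency check.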
To gain insights into the 3D architecture of SL1-4, we proceeded to determine a data-driven structural model by inputting NMR- and DMS-derived base-pair restraints into Rosetta FARFAR2 (29), followed by SAXS scoring of representative low-energy conformers from different clusters. For comparative purposes, structural models of the isolated SL constructs used for NMR and SAXS studies were also determined (Fig. S3). Representative structures were further refined in AMBER using hydrogen-bond restraints derived directly from NMR and indirectly from DMS modifications (see Methods). Fig. S4 shows that the SL1-4 models cluster into distinct conformers, where each is characterized by an SL3,4 coaxially stacked helix that forms the central axis of the structure. SLs 1 and 2 jointly adopt different relative conformations about the SL3,4 helical axis, in which the majority (n = 5) of the structures converge to a similar overall topology. Interestingly, SLs 1 and 2 coaxially stack in the isolated construct (Fig. S3); however, this is not observed within the context of the intact SL1-4 locus, likely due to topological constraints imposed by the 2-nt linker that connects SL2 to the SL3,4 extended domain.
Figure 6B shows an ensemble of the five common SL1-4 structures colored according to the evolutionary conservation of each nucleotide position. In this subset of structural models, clusters of conserved nucleotides located in SL2 and SL3 assemble in close proximity, suggesting that they might cooperate to regulate 5′UTR-dependent functions. Notably, the global topology of the SL1-4 structural model agrees reasonably well with the molecular envelope calculated from the SAXS data, albeit areas of unoccupied or missing density likely reflect inter- and intradomain dynamics. Potential SL1-4 structural dynamics is inferred from the broad peak in the P(r) plot, which indicates a distribution of interparticle distances (Fig. 4I).
Figure 5. The intact SL1-4 locus is an independently folded HCoV-OC43 5′UTR domain. A and B, the collective hydrogen bonds directly detected for each sub-SL construct are shown next to the DMS-determined secondary structure of the SL1-4 locus. C, the ¹H,¹⁵N-SOFAST-HSQC for SL1,4 is shown (black) overlaid with individual SL12 (orange), SL3 (blue), and SL4 (yellow) HSQC results. Size exclusion chromatography-small-angle X-ray scattering results for the SL14 construct are shown with the overall scattering intensity (I(q)) versus q (D), Guinier fit (E), unitless Kratky plot (F), and P(r) versus r profile (G) from the data in normalized equal area (i.e., proportional to P(r)/r) for ease of comparison. Small-angle X-ray scattering experiments were performed in 5 mM Mes, 50 mM KCl, pH 6.5. SL, stem-loop.
We also collected SEC-SAXS data on the intact HCoV-5′UTR (Fig. S5). Surprisingly, the SAXS profiles show evidence of globally defined structural features despite the large size of this RNA construct. In particular, the Kratky plot has an inverted parabolic shape, revealing that the 5′UTR adopts an overall folded conformation in solution. The P(r) plot has a distinct maximum centered at 22 Å and a broader peak close to 100 Å, which is close to the average length (105 Å) of the SL3,4 coaxial helix as measured from the apical loop of SL3 to SL4 (Fig. S4). The broadness of the P(r) plot suggests that the HCoV-OC43 5′UTR samples multiple large-scale conformations around a central axis.

The HCoV-OC43 nucleocapsid N-terminal domain binds to the SL1-4 locus specifically and with high affinity
The 5′UTRs of CoV genomes regulate essential biological processes in part by binding viral and cellular proteins that in turn modulate distinct stages of the virus lifecycle (30). The coronavirus nucleocapsid protein binds to the 5′UTR to regulate transcription and genome packaging (31-33). The NTD of nucleocapsid interacts specifically with the transcription regulatory sequence within the leader (TRS-L) to modulate nascent viral RNA synthesis (33, 34). Depending on the CoV, the TRS-L is either completely single stranded or forms the apical loop surface of SL3 (1, 15). Nucleocapsid has RNA chaperone activity and can destabilize secondary structures to facilitate its various biological functions (35, 36); as such, the structural environment in which nucleocapsid engages its cognate genomic receptors is important for understanding its mechanism of action.
To gain some insight into nucleocapsid-5′UTR interactions, we performed calorimetric titrations of the HCoV NTD with isolated SL3 and the SL1-4 locus. Fig. S6 shows that the nucleocapsid NTD binds SL3 as a 1:1 complex and with moderate affinity (KD = 2.00 ± 0.34 μM). The binding event is driven by a large favorable change in total binding enthalpy (ΔH = −21.15 ± 0.419 kcal/mol) and opposed entropically (−TΔS = 13.7 ± 0.419 kcal/mol). Interestingly, the calorimetric profile of nucleocapsid NTD titrated into the SL1-4 locus shows a more sigmoidal shape as compared with isolated SL3 (Fig. S6). Fitting of the processed isotherm to a 1:1 binding model reveals that the nucleocapsid NTD binds the SL1-4 locus approximately 4.5-fold tighter (KD = 0.447 ± 0.056 μM) than it binds isolated SL3. The binding event is also driven by a large change in total binding enthalpy (ΔH = −16.64 ± 0.042 kcal/mol) and opposed entropically (−TΔS = 7.97 ± 0.042 kcal/mol). The comparative thermodynamic results show that the structural environment of the HCoV-5′UTR influences the capacity by which the nucleocapsid protein recognizes the TRS-L. Additional studies are needed to fully elucidate the determinants of nucleocapsid-5′UTR interactions; however, the thermodynamic signatures illustrate that the RNA structural environment of the TRS-L influences the binding mechanism.
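The thermodynamic parameters above are linked by ΔG = ΔH − TΔS = RT ln KD, so the fitted KD values can be cross-checked against the enthalpic and entropic terms. The sketch below performs that check using the values reported in this section. The titration temperature is not stated here, so 298.15 K is an assumption, and the function and variable names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def dg_from_kd(kd_molar, temperature_k=298.15):
    """Binding free energy (kcal/mol) from a dissociation constant: dG = RT ln(KD).

    The default temperature is an assumption, not a value taken from the text.
    """
    return R_KCAL * temperature_k * math.log(kd_molar)


def dg_from_components(delta_h, minus_t_delta_s):
    """Binding free energy (kcal/mol) as dH + (-T dS)."""
    return delta_h + minus_t_delta_s


# Values reported above (KD in M; dH and -TdS in kcal/mol).
sl3 = {"kd": 2.00e-6, "dh": -21.15, "mtds": 13.7}
sl14 = {"kd": 0.447e-6, "dh": -16.64, "mtds": 7.97}

for name, fit in (("SL3", sl3), ("SL1-4", sl14)):
    print(name,
          round(dg_from_kd(fit["kd"]), 2),
          round(dg_from_components(fit["dh"], fit["mtds"]), 2))

print("KD ratio (SL3 / SL1-4):", round(sl3["kd"] / sl14["kd"], 2))
```

Under the assumed temperature, the two routes to ΔG agree to within about 0.01 kcal/mol for the SL1-4 locus (close to −8.7 kcal/mol) and differ by roughly 0.3 kcal/mol for isolated SL3, and the ratio of the two KD values is about 4.5.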

The SL3,4 coaxial helix intrinsically represses the efficiency of cap-dependent translation
To evaluate RNA determinants that modulate cap-dependent translation of virus RNAs, we cloned a series of luciferase reporter constructs that contain different HCoV 5′UTR variants (Fig. 7A). Vero E6 cells were transfected with equal amounts (200 ng) of in vitro transcribed, 5′-capped and polyadenylated Renilla and Firefly luciferase reporter mRNAs (RLuc and FLuc, respectively). The RLuc mRNA served as an internal, leader-only control, whereas the FLuc mRNA constructs contained variants of the HCoV-OC43 5′UTR upstream of the FLuc open reading frame (Fig. 7A). Notably, the activity of the RLuc reporter was significantly higher than that of each of the FLuc constructs (Fig. S7A), indicating that the highly structured HCoV-OC43 5′UTR intrinsically represses translation efficiency relative to a leader sequence that lacks stable structure. Quantitative PCR analysis shows that the RLuc and FLuc mRNA levels are statistically indistinguishable (Fig. S7B), thus indicating that the differences in luciferase activities derive from the presence of the various HCoV 5′UTR structures. Deletion of the SL5mb (5′UTR ΔSL5mb), which contains the HCoV-OC43 ORF-1ab AUG start codon, resulted in a 23% reduction in FLuc activity relative to the intact 5′UTR (Fig. 7B, graphs for nucleotides 4-131 and nucleotides 4-329, respectively). This observation suggests that SLs 1 to 4 set a lower threshold on translation efficiency and that the SL5mb domain functions as a translational enhancer. We speculated that the repressive activity of SLs 1 to 4 on FLuc translation derives in part from the SL3,4 coaxially stacked helix. As a test, we engineered two FLuc reporters that contained mutations designed to abrogate the topological SL3,4 structure (Fig. 7B): (i) nucleotides A75, A76, and A77 were mutated to C75, C76, and C77, which breaks three WC base pairs in the lower helix of SL3 (5′UTR SL1-4,AAA/CCC), and (ii) five cytosines were inserted between nucleotides 77 and 79, which introduces a flexible linker between SLs 3 and 4 (5′UTR SL1-4,5C-INS). Relative to the 5′UTR ΔSL5mb FLuc mRNA (nucleotides 4-131), the 5′UTR SL1-4,AAA/CCC and the 5′UTR SL1-4,5C-INS RNAs showed increases in translation efficiency of approximately 113% and 167%, respectively. Surprisingly, the 5′UTR SL1-4,5C-INS RNA was more efficient (123%) than the intact 5′UTR RNA within the context of the luciferase reporter assay (Fig. 7B). Taken together, the results indicate that the SL3,4 coaxial helix functions to intrinsically repress HCoV translation by forming a thermodynamically stable tertiary structure that likely acts as a steric block to ribosomal scanning.
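For readers less familiar with dual-luciferase readouts, the comparison of reporter variants can be sketched as a ratio of ratios: the FLuc signal of each construct is divided by the co-transfected RLuc signal, and the result is expressed as a percentage of a reference construct. The text does not spell out the exact normalization, so this scheme is an assumption. The function name is hypothetical and the luminescence values are invented purely for illustration; they are not data from this study.

```python
def relative_efficiency(fluc, rluc, ref_fluc, ref_rluc):
    """FLuc/RLuc of a variant, as a percentage of the reference FLuc/RLuc."""
    return 100.0 * (fluc / rluc) / (ref_fluc / ref_rluc)

# Invented raw luminescence readings (arbitrary units), for illustration only.
reference = {"fluc": 1000.0, "rluc": 20000.0}   # hypothetical reference construct
variant = {"fluc": 1500.0, "rluc": 25000.0}     # hypothetical mutant construct

percent_of_reference = relative_efficiency(
    variant["fluc"], variant["rluc"], reference["fluc"], reference["rluc"]
)
print(f"Variant translation efficiency: {percent_of_reference:.0f}% of reference")
```

Dividing by the RLuc signal corrects for well-to-well differences in transfection efficiency and cell number before constructs are compared.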

Discussion
The 5′UTRs of CoVs are genetic elements under high selective pressures to maintain their overall structures (and to some extent sequences) because they regulate essential steps during the cellular stages of replication including viral translation, RNA synthesis, genome packaging, and innate immunity. Furthermore, 5′UTRs are receptors for cellular RNA-binding proteins. Through their recruitment, CoVs reprogram the cellular environment to favor optimal viral production. Thus, the requirement of 5′UTR structures to orchestrate required events during replication makes them attractive models to calibrate principles of structure-conservation-function relationships. Moreover, this information can be leveraged to identify new targets for therapeutic intervention. To that end, we carried out a comprehensive analysis of the HCoV-OC43 5′UTR to reveal that the first four SLs fold into a 3D architecture that intrinsically restricts the efficiency of cap-dependent translation of viral RNAs and juxtaposes clusters of conserved residues.
By combining chemical modification with NMR spectroscopy, we provide orthogonal evidence that the first four SLs of the HCoV-OC43 5′UTR fold into independent subdomains. Although these structures were predicted previously, this work illuminates the extent of base pairing through the direct detection of 38 of 44 native hydrogen bonds by HNN-COSY and ¹H-¹H NOESY. Moreover, we observed several mismatch base pairs within the loops of SL1, SL3, and SL4 (not detectable by chemical modification), demonstrating that these structural motifs engage in local and long-range tertiary contacts. Low levels of chemical modification were detected for several of the nucleotides within the internal loops of SL1 and SL4, revealing the degree of correspondence between the reactivity profile and NMR analysis. Interestingly, a subset of the nucleotides within the internal loops of SL1 and SL4 show a similar rate of evolutionary divergence as the adjacent helices, suggesting that base positions within these structural motifs covary (Fig. 1). SAXS data collected on different SL constructs provide additional evidence on the degree of tertiary structure encoded within the 5′UTR. Indeed, SAXS revealed that the intact SL1-4 locus folds into a distorted Y-shaped structure consisting of three arms of unequal dimensions (Fig. 6, B and C). In each of the experimentally restrained structural models of the intact SL1-4 locus, SLs 3 and 4 coaxially stack to form a continuous helix approximately 105 Å long (Fig. S4). Coaxial stacking is mediated in part through a long-range U55:A78 base pair predicted by FARFAR2 and observed in the SHAPE-derived structure but not the DMS model (Fig. 6). A78 and A79 are highly modified by DMS but show low reactivity by NAI, indicating that these nucleotides likely engage in dynamic base pairs with U54 and U55 as determined in the SHAPE model. The coaxially stacked topology of SL3 and SL4 is also corroborated by the SAXS data, as is evident by superimposing a low-energy SL1-4 model into the molecular envelope. By comparison, the topological arrangement of SLs 1 and 2 is less defined in the structural models and can adopt various orientations that pivot about the SL3,4 coaxial helix (Figs. 6B and S4). In a subset of the structural models that best agree with the SAXS envelope, the apical loops of SL2 and SL3 are spatially juxtaposed such that the most conserved nucleotides within the 5′UTR cluster to one surface (Fig. 6B), suggesting a potential functional synergy.
Interpreting the work presented here within the context of prior phylogenetic and functional studies provides insights into the biological significance of the 3D architecture of SLs 1 to 4 of HCoV-OC43. As observed previously, the linkers between the various SLs within the 5′UTR of CoVs are variable and depend on whether SL3 forms or not (1,10). The linkers that connect SLs 1 to 4 of lineage A β-CoVs are some of the shortest, including those observed here for HCoV-OC43. Short nucleotide linkers adjoining neighboring and well-folded SLs impose topological constraints on the overall 3D structure that the 5′UTR can adopt. The coaxial stacking of SL3,4 fixes their relative orientations and places their apical loops a maximal distance apart. Of note, we cannot rule out the possibility that the linkers themselves undergo dynamic base pairing as suggested by comparing the DMS- and SHAPE-derived secondary structural models. Such dynamic remodeling of the linkers may fine-tune the overall compactness of the HCoV 5′UTR in a replication stage-dependent manner given that the same genome is used for viral protein and RNA synthesis. In support of this concept, our luciferase reporter assays show that mutations engineered to increase the flexibility between SLs 3 and 4 correlate with an increase in luciferase gene expression (Fig. 7B). This observation indicates that the compactness of the HCoV 5′UTR restricts the efficiency of cap-dependent translation of viral RNAs, likely by sterically blocking ribosome scanning (37).
It has been suggested that SL4 regulates sgRNA synthesis by functioning in part as a spacer element since deleting it is lethal for MHV replication, but viability can be restored by replacing it with a sequence-unrelated SL (38). The spacer element hypothesis agrees with the structural model determined here, which illustrates that the SL3,4 coaxial helix determines the spatial disposition of the TRS-L element located within the apical loop of SL3. Mutations that disrupt SL4 base pairing would destabilize the coaxial stacking of SL3,4 and in turn distort the relative orientation of the TRS-L to modulate its interactions with cognate binding partners. Indeed, our calorimetric titrations of the HCoV nucleocapsid protein reveal that its binding capacity to the TRS-L depends on its structural environment since its affinity is approximately 5-fold tighter for the SL1-4 locus compared with the isolated SL3. The biophysical origins of this difference in binding affinity await further verification.
The apical loop of SL2 is one of the most conserved genetic elements within the 5′UTR of CoVs (1,10,12). Mutations that change the apical loop sequence of SL2 or that disrupt base pairing of its helix are lethal to MHV and affect viral sgRNA synthesis. This observation suggests some level of functional correlation between SL2 and the TRS-L. In the SL1-4 structural model determined here for HCoV-OC43, the SL3,4 coaxially stacked helix controls the relative orientation of SL2, bringing it within spatial proximity of the TRS-L in a subset of the models. Such a topological arrangement would create more of an extended surface to coordinate RNA-RNA or protein-RNA interactions that regulate sgRNA synthesis. Indeed, ETA reveals that phylogenetically conserved residues located within the apical loops of SL2 (5′-uCUUGUUa-3′) and SL3 (5′-aUCUAAACUu-3′) cluster to one surface of the intact SL1-4 locus.
In sum, this study shows that the SL1-4 locus of the HCoV-OC43 5′UTR folds into a topologically defined 3D architecture in which the SL3,4 coaxially stacked helix controls the relative orientations of conserved surface residues that are required for viral replication and intrinsically represses cap-dependent translation efficiency of viral RNAs. The work also sheds light on roles of linker residues that connect SL domains within the 5′UTR. Strain-dependent differences in linker lengths could differentially modulate the biological outputs of CoV 5′UTRs. Future comparative structural and functional studies should consider the roles of linkers in determining the overall topology of 5′UTRs. Lastly, the work demonstrates the effectiveness of integrating complementary biophysical, biochemical, and computational approaches to efficiently determine RNA structural models with potential to inform on biological functions (24).

RNA synthesis and purification
RNA constructs for SLs 1 and 2 (SL1,2), 3 (SL3), and 4 (SL4) were transcribed in vitro using synthetic DNA oligos (Integrated DNA Technologies, Inc, IDT) as templates (Table S1). A construct consisting of the first four SLs (SL1-4) was transcribed in vitro using a synthetic DNA ultramer (IDT) as the template. Uniformly ¹⁵N,¹³C-labeled guanosine (GTP), cytidine (CTP), adenosine (ATP), and uridine (UTP) triphosphates (Cambridge Isotope Laboratories) were used to prepare isotopically labeled samples for SOFAST-HSQC and HNN-COSY data collection. Fully protonated rNTPs (Sigma-Aldrich) were used to prepare samples for H₂O NOESY and SAXS data collection. Transcription reaction conditions were optimized individually for each construct prior to data collection. Following in vitro transcription, constructs were purified by 6 to 12% urea-PAGE and eluted in Tris-Borate-EDTA buffer. RNA constructs were desalted using centrifugation (Amicon Ultra Centrifugal filter, MWCO: 3000-10,000 Da, Millipore-Sigma) and then refolded at concentrations of 20 μM in water. Refolded samples were then exchanged into experimental buffers and concentrated using centrifugation (Amicon MWCO: 3000-10,000 Da).

DMS-MaPseq of HCoV-OC43 5′UTR
Purified HCoV-OC43 5′UTR (1 μg) was heated to 95 °C for 15 s and flash cooled on ice for 2 min. Following this, 95 μl of DMS modification buffer (100 mM sodium cacodylate, 140 mM KCl, 3 mM MgCl₂, pH 7.5) was added to the RNA sample. This was incubated at room temperature for 30 min. Dimethyl sulfate (2-5%) was added to the RNA sample, which was subsequently incubated at 37 °C while shaking at 500 rpm for 5 to 10 min. The reaction was terminated with 60 μl of β-mercaptoethanol (Sigma-Aldrich). The modified RNA sample was then cleaned using an RNA cleanup and concentrator-5 column (Zymo Research). The methylated RNA was reverse transcribed using a thermostable group II intron reverse transcriptase, third generation (TGIRT-III, InGex) and a specific reverse primer complementary to nucleotides 299 to 327 of the HCoV-OC43 5′UTR sequence. RNA templates were then digested by addition of RNase H (New England Biolabs) and incubated for 20 min at 37 °C. The reverse-transcribed DNA was amplified via PCR using the Phusion High-Fidelity DNA polymerase (New England Biolabs) and specific primer sets (FW nucleotides 1-24, RV nucleotides 299-327). The PCR cycle began with initial denaturing for 30 s, followed by 25 PCR cycles (denaturing for 5 s at 98 °C, annealing for 10 s at 65 °C, and extension for 15 s at 72 °C), and final extension for 5 min at 72 °C. The PCR products were desalted using the DNA cleanup and concentrator-5 column kit (Zymo Research). Sample homogeneity was assessed by agarose gel.
Sequencing was performed on a HiSeq 2000 sequencing system (Illumina, Inc), which uses cluster generation and sequencing by cluster chemistry. The sequencing results were aligned according to a well-established protocol (25). The minimal mutational signals of 5′- and 3′-primer regions and signals from T and G were determined to be null. Signals for A and C in the target region were 95% winsorized and normalized to the highest to generate a DMS reactivity index. DMS restraints were used as input to guide folding of the HCoV-OC43 5′UTR in the RNAstructure algorithm and visualized in VARNA (26,39).

SHAPE analysis of HCoV-OC43 5′UTR
SHAPE experiments were carried out using the SHAPE Single Kit from Eclipsebio according to the manufacturer's guidelines (www.eclipsebio.com). In short, 100 ng of in vitro transcribed RNA was resuspended in molecular biology grade water. The RNA was denatured using heat and subsequently refolded using SHAPE folding buffer reagent. Samples were probed in vitro using NAI or dimethyl sulfoxide for control samples. Samples were then fragmented and subsequently phosphorylated in preparation for ligation. After phosphorylation, a barcoded RNA adapter was ligated to the 3′ end of all RNA fragments. RNA was reverse transcribed in the presence of MnCl₂ to induce mutations in the cDNA (SHAPE-MaP). The following steps, ssDNA adapter ligation, quantitative PCR, and PCR amplification, were performed according to the published single-end seCLIP protocol (40). Samples were sent to Eclipsebio for sequencing and library preparation. SHAPE reactivity scores were then used as constraints for secondary structure prediction using the RNAstructure web server. The secondary structure generated using SHAPE reactivities was then exported to the VARNA structural visualization software, where the raw reactivity scores were used to generate a color map for visualization.
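The DMS normalization step described above (95% winsorization of the A and C signals followed by scaling to the highest value) can be sketched as follows. This is a minimal illustration, not the pipeline used in the study. It assumes that "95% winsorized" means capping values above the 95th percentile at that percentile, whereas other conventions clip both tails. The function names and the mutation rates are hypothetical.

```python
def percentile(values, pct):
    """Percentile with linear interpolation between sorted neighbors."""
    ordered = sorted(values)
    if len(ordered) == 1:
        return ordered[0]
    position = (len(ordered) - 1) * pct / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction

def dms_reactivity_index(mutation_rates, upper_pct=95.0):
    """Cap outliers at the chosen percentile, then scale so the maximum is 1."""
    cap = percentile(mutation_rates, upper_pct)
    capped = [min(rate, cap) for rate in mutation_rates]
    top = max(capped)
    if top <= 0:
        return [0.0] * len(capped)
    return [rate / top for rate in capped]

# Hypothetical per-nucleotide mutation rates for A and C positions only;
# the last value mimics a single hyper-reactive outlier.
rates = [0.001, 0.004, 0.010, 0.020, 0.015, 0.002, 0.030, 0.200]
index = dms_reactivity_index(rates)
for rate, value in zip(rates, index):
    print(f"raw = {rate:.3f}  index = {value:.3f}")
```

Capping before scaling keeps one hyper-reactive position from compressing every other value toward zero, which is the usual motivation for winsorizing reactivity data.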

NMR data acquisition, processing, and analysis
NMR spectra were recorded on a Bruker Avance 900-MHz high-field NMR spectrometer equipped with a cryogenically cooled H-C-N triple-resonance probe and a z-axis pulsed-field gradient accessory (Table S2). Exchangeable ¹H-¹H NOESY (tm = 200 ms) spectra were collected on fully protonated samples using the Watergate noesygpph19 pulse sequence.
simulation with H-bonding and EMAP restraints to track model progress.

Ab initio structural modeling
Based on the predicted secondary structures for the SL1,2, SL3, SL4, and SL1-4 constructs, as well as the experimentally determined hydrogen-bonding pattern for each construct, de novo structure predictions were generated using the FARFAR2 module of Rosetta 3.11 (29). The RNAtools package was used to generate idealized helices for each of the base-paired regions, for all constructs, using the secondary structure and sequence as inputs. FARFAR2 was used to generate 10,000 initial structures, which were then filtered by the k-means clustering algorithm in Rosetta 3.11 in order to best sample conformational space. Clusters were analyzed, and a representative structure was generated from each of the clusters. The ATSAS program CRYSOL was used to compare experimental SAXS data to each representative structure in order to calculate χ² values and track the validity of the models throughout the refinement process. Each of the 10 representative structures was prepared for further refinement in AMBER using the ff99bsc-OL3 forcefield in tLEaP. All simulations were conducted using the High-Performance Computing resource in the Core Facility for Advanced Computing at Case Western Reserve University.
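The χ² comparison between a structural model and the experimental scattering curve can be illustrated in general form: the model intensity is scaled to the data by least squares, and the error-weighted squared residuals are averaged over the data points. The sketch below shows only this generic reduced χ². It is not CRYSOL's implementation, which also adjusts hydration-shell and excluded-volume parameters during fitting. The function names are hypothetical, and the curves are synthetic Guinier-type profiles used purely as stand-ins for real data.

```python
import math

def saxs_chi2(i_exp, sigma, i_calc):
    """Reduced chi-square between experimental and scaled model intensities.

    The model curve is multiplied by the least-squares factor c that
    minimizes sum(((I_exp - c * I_calc) / sigma) ** 2).
    """
    weights = [1.0 / s ** 2 for s in sigma]
    numerator = sum(w * e * m for w, e, m in zip(weights, i_exp, i_calc))
    denominator = sum(w * m * m for w, m in zip(weights, i_calc))
    c = numerator / denominator
    total = sum(((e - c * m) / s) ** 2 for e, m, s in zip(i_exp, i_calc, sigma))
    return total / (len(i_exp) - 1), c

def guinier(q, rg, i0=1.0):
    """Synthetic Guinier-type intensity, used here only to make test curves."""
    return i0 * math.exp(-((q * rg) ** 2) / 3.0)

q_values = [0.01 + 0.005 * k for k in range(30)]          # 1/Angstrom
i_exp = [guinier(q, rg=25.0, i0=2.5) for q in q_values]   # stand-in "data"
sigma = [0.02] * len(q_values)                            # stand-in errors

model_a = [guinier(q, rg=25.0) for q in q_values]  # same shape as the data
model_b = [guinier(q, rg=35.0) for q in q_values]  # different shape

chi2_a, scale_a = saxs_chi2(i_exp, sigma, model_a)
chi2_b, scale_b = saxs_chi2(i_exp, sigma, model_b)
print(f"model A: chi2 = {chi2_a:.3g}, scale = {scale_a:.3g}")
print(f"model B: chi2 = {chi2_b:.3g}, scale = {scale_b:.3g}")
```

A model whose shape matches the data yields a lower χ² regardless of absolute scale, which is why the metric is useful for ranking representative structures against one SAXS curve.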

Molecular dynamics simulations with hydrogen-bonding and EMAP restraints
After initial preparation using tLEaP, each individual construct, along with the SL1-4 structure, was minimized using sander (Simulated Annealing with NMR-Derived Energy Restraints) as part of the AMBER molecular dynamics package (46). During the minimization runs, the RNA was held fixed with an arbitrary restraint weight of 10.0 kcal/mol and minimized for 4000 steps. A 24.0-Å cutoff was used for calculation of Born radii throughout each minimization. Initial refinement of models was performed in AMBER by inputting base pair restraints derived from NMR and DMS chemical probing. The RNA was heated from 0 to 300 K in the first 100 ps and then held at 300 K for the duration of the simulation with NMR restraints. The salt concentration used for each explicit-solvent simulation was 0.154 M NaCl, representing physiological salt concentration. Following the simulation, structures were generated from the last frame of the trajectory file using CPPTRAJ. These structures were compared with SAXS data using CRYSOL in order to calculate χ² values for each representative structure for each construct. The restart file from the previous simulation was used as the starting structure for the final refinement. Each of the 10 representative structures for each construct was then simulated for 1 ns using explicit solvent conditions with both NMR and EMAP restraints. The RNA was held at 300 K for the duration of the simulation with a salt concentration of 0.154 M. Structures were again generated from the final trajectories of each simulation using CPPTRAJ. Structures were then compared with SAXS data using CRYSOL.

Preparation of HCoV nucleocapsid N-terminal domain and calorimetric titrations
RNA samples for isothermal titration calorimetry (ITC) were prepared using the in vitro transcription and purification processes as described above using fully protonated, unlabeled rNTPs. HCoV-OC43 nucleocapsid N-terminal domain protein was cloned into the pMCSG7 vector using a geneblock purchased from IDT with an N-terminal His tag attached. The protein was overexpressed in LB broth and purified using Ni-affinity and SEC. Both RNA and protein samples were buffer exchanged into ITC buffer (20 mM K₂HPO₄, 20 mM KCl, 0.5 mM EDTA, pH 8.0) before ITC titrations were conducted.
Calorimetric titrations were performed on a VP-ITC calorimeter (Microcal, LLC) at 25 °C into ITC buffer, centrifuged prior to use. Nucleocapsid NTD at 100 μM was titrated into 1.4 ml of 10 μM of the respective RNA construct over a series of 32 injections set at 6 μl each. To minimize the accumulation of experimental error associated with batch-to-batch variation, titrations were performed in triplicate. Data were analyzed using KinITC routines supplied with Affinimeter (47).
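The titration scheme above fixes the range of protein:RNA molar ratios sampled by the isotherm. A back-of-the-envelope calculation is sketched below. It uses the stated concentrations and volumes but deliberately ignores the volume displaced from the cell with each injection, so the ratios are nominal values rather than the corrected ones an analysis package would use. The function name is hypothetical.

```python
def cumulative_molar_ratios(n_injections, injection_ul, titrant_um, cell_ml, cell_um):
    """Nominal titrant:cell molar ratio after each injection (no dilution correction)."""
    cell_nmol = cell_ml * cell_um  # mL x uM = nmol
    ratios = []
    for i in range(1, n_injections + 1):
        titrant_nmol = i * injection_ul * titrant_um * 1e-3  # uL x uM = pmol, converted to nmol
        ratios.append(titrant_nmol / cell_nmol)
    return ratios

# Values stated in the text: 32 injections of 6 ul of 100 uM NTD
# into 1.4 ml of 10 uM RNA.
ratios = cumulative_molar_ratios(32, 6.0, 100.0, 1.4, 10.0)
print(f"after first injection: {ratios[0]:.3f}")
print(f"after last injection:  {ratios[-1]:.3f}")
```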

Evolutionary trace
The Evolutionary Trace software package was acquired from GitHub (github.com/LichtargeLab/RNA_ET_ms). All 17 multiple sequence alignments were obtained from Rfam in FASTA-gapped format (49). Sequences were made to be identical lengths before being converted to multiple sequence format. ETA was performed according to the established protocol (28). Nucleotides ranked between 0 and 0.35 (0 and 35%) were deemed "ETA nucleotides" (28). Nucleotide ranks were then mapped onto the empirically determined secondary structure of the OC43 5′UTR using the VARNA software package in order to generate a heatmap. In addition, nucleotide ranks were visualized on the data-driven model of SL1-4 using the PyETV plugin for PyMOL (50).

Figure 1. Secondary structural model of the HCoV-OC43 5′UTR. A, normalized DMS reactivity indices of the HCoV-OC43 5′UTR. B, the population averaged secondary structure of the HCoV-OC43 5′UTR as determined by incorporating DMS reactivity indices as pseudoenergy restraints. The normalized reactivity indices are superimposed as a scaled color code from 0 (low reactivity) to 1 (high reactivity) for each modified A and C nucleotide. Guanosine and uracil nucleotides are rendered in white, and nucleotides that overlap with primers are rendered in a lower transparency. C, the secondary structure was also calculated using the reactivity indices collected by 2-methylnicotinic acid imidazolide (NAI) modification and shown in a similar manner with intensities colored as relatively low (black), moderate (orange), and high (red). D, the evolutionary conservation of each nucleotide as determined with evolutionary trace analysis is superimposed onto the calculated secondary structure with the highest level of conservation in red and the lowest conservation in blue. Nucleotides that do not have data are rendered in a lower transparency.

Figure 2. NMR and small-angle X-ray scattering (SAXS) analysis of the HCoV-OC43 SL1,2 construct. Comparison of the SL1,2 secondary structures as determined by (A) direct NMR detection of hydrogen bonding and (B) DMS profiling. (C) ¹H,¹⁵N-SOFAST-HSQC, (D) ¹H-¹H NOESY, and (E) HNN-COSY spectra for imino proton correlation of SL1,2 are shown with peak assignments provided. In the HNN-COSY, positive contours for the hydrogen bond donors are shown in red and negative contours for hydrogen bond acceptors are in blue. Nonnative nucleotides are denoted using N(+/−#). All NMR spectra were collected in 25 mM K₂HPO₄, 50 mM KCl, pH 6.2 in 10% D₂O at 283 K. Size exclusion chromatography-SAXS results for the SL1,2 construct are shown with the (F) overall scattering intensity I(q) versus q, (G) Guinier fit, (H) unitless Kratky plot, and (I) P(r) versus r profile from the data in normalized equal area (i.e., proportional to P(r)/r) for ease of comparison. SAXS experiments were performed in 5 mM Mes, 50 mM KCl, pH 6.5.

Figure 3. NMR and small-angle X-ray scattering (SAXS) analysis of the HCoV-OC43 SL3 construct. Comparison of the SL3 secondary structures as determined by (A) direct NMR detection of hydrogen bonding and (B) DMS profiling. (C) ¹H,¹⁵N-SOFAST-HSQC, (D) ¹H-¹H NOESY, and (E) HNN-COSY spectra for imino proton correlation of the 5′-cis-acting RNA construct SL3 are shown with peak assignments provided. Positive signals are shown in blue, whereas negative signals are shown in red. Nonnative nucleotides are denoted using N(+/−#) according to genomic numbering. Size exclusion chromatography-SAXS results for the SL3 construct are shown with the (F) overall scattering intensity I(q) versus q, (G) Guinier fit, (H) unitless Kratky plot, and (I) P(r) versus r profile from the data in normalized equal area (i.e., proportional to P(r)/r) for ease of comparison. SAXS experiments were performed in 5 mM Mes, 50 mM KCl, pH 6.5.
C/¹⁵N(G/U)-selectively labeled rNTPs and recorded a ¹H-¹⁵N HSQC spectrum to compare with the spectra of the isolated SLs. Despite being 42 kDa, the spectrum of the SL1-4 locus shows well-resolved and dispersed correlation signals that span the chemical shift range expected for WC, GU wobble, and non-WC base pairs, indicating that it folds into a regular secondary structure (Figs. 5C and S2). Indeed, overlay of the ¹H-¹⁵N correlation signals of the intact SL1-4 locus onto the signals of the isolated SLs shows a remarkable degree of overlap (Figs. 5C and S2), verifying that the SL1-4 locus adopts nearly identical component structures as the isolated SLs. Subtle differences in chemical shifts reflect the different termini used to prepare the isolated SLs and potentially inter-SL stacking interactions. Of note, the ¹H-¹⁵N HSQC spectrum of the intact SL1-4 locus shows a few additional signals relative to the spectra of the isolated constructs (Figs. 5C and S2). The majority of the additional signals fall within the chemical shift range expected for uracils involved in AU base pairs, which likely reflect the formation of base pairs

Figure 4. NMR and small-angle X-ray scattering (SAXS) analysis of the HCoV-OC43 SL4 construct. Comparison of the SL4 secondary structures as determined by (A) direct NMR detection of hydrogen bonding and (B) DMS profiling. (C) ¹H,¹⁵N-SOFAST-HSQC, (D) ¹H-¹H NOESY, and (E) HNN-COSY spectra for imino proton correlation of the 5′-cis-acting RNA construct SL4 are shown with peak assignments provided. Positive signals are shown in blue, whereas negative signals are shown in red. Nonnative nucleotides are denoted using N(+/−#) according to genomic numbering. Size exclusion chromatography-SAXS results for the SL4 construct are shown with the (F) overall scattering intensity I(q) versus q, (G) Guinier fit, (H) unitless Kratky plot, and (I) P(r) versus r profile from the data in normalized equal area (i.e., proportional to P(r)/r) for ease of comparison. SAXS experiments were performed in 5 mM Mes, 50 mM KCl, pH 6.5.

Figure 6. Coaxial stacking of SL3,4 topologically constrains the 5′UTR to juxtapose conserved nucleotide clusters. A, the evolutionary conservation as determined by evolutionary trace analysis (ETA) is shown first on the DMS-determined secondary structure. B, the ensemble of models from AMBER simulations is shown colored according to ETA data. C, to better visualize the global topology in relation to the highly conserved regions, ETA data are shown on the surface representation of the model best fitting small-angle X-ray scattering (SAXS) data. D, finally, one model is shown superimposed on the SAXS-generated molecular envelope. The ensemble of structures generated through molecular dynamics simulations is shown overlaid on each other. The AMBER-generated structure is shown superimposed onto the SAXS-generated molecular envelope. All structures are colored according to evolutionary conservation with red indicating the highest levels of conservation and blue indicating the lowest level of conservation as depicted on the color bar.

Figure 7. The HCoV-OC43 SL3,4 coaxial helix located within the 5′UTR intrinsically represses the efficiency of cap-dependent translation of virus RNA. A, luciferase reporter constructs used to assess the influence of the HCoV-OC43 5′UTR on cap-dependent translation. Top, schematic of the Renilla luciferase reporter used as an internal, leader-only control. Bottom, schematic of the Firefly luciferase reporter with corresponding secondary structures of the HCoV-OC43 5′UTR variants used to measure influence on cap-dependent translation efficiency. B, Vero E6 cells were transfected with equal amounts (200 ng) of RLuc and FLuc reporter mRNAs. At 24 h post transfection, cells were lysed and assayed for RLuc and FLuc activities. Plots of relative luciferase levels reveal that the highly structured HCoV-OC43 5′UTR significantly reduces cap-dependent translation compared with the Renilla control mRNA and that the intrinsic block to translation depends on the SL3,4 coaxial helix. Mean values ± standard deviations from 15 independent experiments (N = 15) are shown in the bar graphs. p Values were determined by unpaired two-tailed Student's t test. ****p < 0.0001.