The molecular virology of coronaviruses

Few human pathogens have been the focus of as much concentrated worldwide attention as severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the cause of COVID-19. Its emergence into the human population and ensuing pandemic came on the heels of severe acute respiratory syndrome coronavirus (SARS-CoV) and Middle East respiratory syndrome coronavirus (MERS-CoV), two other highly pathogenic coronavirus spillovers, which collectively have reshaped our view of a virus family previously associated primarily with the common cold. It has placed intense pressure on the collective scientific community to develop therapeutics and vaccines, whose engineering relies on a detailed understanding of coronavirus biology. Here, we present the molecular virology of coronavirus infection, including its entry into cells, its remarkably sophisticated gene expression and replication mechanisms, its extensive remodeling of the intracellular environment, and its multifaceted immune evasion strategies. We highlight aspects of the viral life cycle that may be amenable to antiviral targeting as well as key features of its biology that await discovery.

The Coronaviridae family of viruses are enveloped, single-stranded positive-sense RNA viruses grouped into four genera (alphacoronavirus, betacoronavirus, gammacoronavirus, and deltacoronavirus) that primarily infect birds and mammals, including humans and bats. The seven coronaviruses known to infect humans fall within the alpha- and betacoronavirus genera, whereas gamma- and deltacoronaviruses primarily infect birds. Coronaviruses have been studied for decades using the model betacoronavirus, murine hepatitis virus (MHV), and the human alphacoronavirus HCoV-229E. In humans, the circulating coronaviruses HCoV-229E, HCoV-NL63, HCoV-OC43, and HCoV-HKU1 generally cause mild upper respiratory illness and collectively are associated with 10-30% of common cold cases (1). However, within the past two decades, three highly pathogenic coronaviruses that can cause severe respiratory illness have emerged into the human population as the result of spillover events from wildlife: severe acute respiratory syndrome coronavirus (SARS-CoV) emerged in 2002, Middle East respiratory syndrome coronavirus (MERS-CoV) emerged in 2012, and most recently, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) emerged in 2019. These outbreaks, together with estimates suggesting that hundreds to thousands of additional coronaviruses may reside in bats alone (2), highlight the potential for future coronavirus zoonotic transmission.
In this article, we provide an overview of the coronavirus life cycle with an eye toward its notable molecular features and potential targets for therapeutic interventions (Fig. 1). Much of the information presented is derived from studies of the betacoronaviruses MHV, SARS-CoV, and MERS-CoV, with a rapidly expanding number of reports on SARS-CoV-2. The first portion of the review focuses on the molecular basis of coronavirus entry and its replication cycle. We highlight several notable properties, such as the sophisticated viral gene expression and replication strategies that enable maintenance of a remarkably large, single-stranded, positive-sense (+) RNA genome and the extensive remodeling of cellular membranes to form specialized viral replication and assembly compartments. The second portion explores the mechanisms by which these viruses manipulate the host cell environment during infection, including diverse alterations to host gene expression and immune response pathways. This article is intended as a more in-depth companion piece to our online "Coronavirus 101" lecture (https://youtu.be/8_bOhZd6ieM).

Part I: The viral life cycle

Viral entry
Coronavirus particles consist of a ~30-kb strand of positive-sense RNA that forms the genome; this genome is coated with nucleocapsid (N) protein and enclosed in a lipid bilayer containing three membrane proteins: spike (S), membrane (M), and envelope (E) (3). For all studied coronaviruses, the M protein is critical for incorporating essential viral components into new virions during morphogenesis, and N protein associates with the viral genome and M to direct genome packaging into new viral particles. The E protein forms an ion channel in the viral membrane and participates in viral assembly. The S protein is required for viral entry, as it binds to the target cell and initiates fusion with the host cell membrane (reviewed in Ref. 4). S is homotrimeric, with each subunit consisting of two domains, S1 and S2. S1 contains the receptor-binding domain (RBD) and engages with the host receptor, whereas S2 mediates subsequent membrane fusion to enable the virus to enter the host cytoplasm. Activation of the S protein fusion activity requires prior proteolytic cleavage at two sites. The first cleavage site is at the S1/S2 boundary, leading to structural changes in the S2 domain that place it in a prefusion conformation. This cleavage event also separates S2 from S1, although the two domains remain noncovalently associated. The second cleavage site is at S2′, which drives fusion of the viral and cellular membranes to enable release of the N-coated RNA genome into the cytoplasm.
‡ These authors contributed equally to this work. * For correspondence: Britt Glaunsinger, glaunsinger@berkeley.edu.
Whereas coronaviruses use the above general strategy to enter target cells, the receptors and proteases used as well as subcellular sites of S cleavage differ depending on the virus (reviewed in Ref. 5). The S proteins of both SARS-CoV and SARS-CoV-2 use host ACE2 as their receptor (6-8) (Fig. 2). ACE2 is a cell-surface peptidase that hydrolyzes angiotensin II and is expressed in most organs, with particularly high expression in the epithelia of lung and small intestine (9). After ACE2 receptor binding, SARS-CoV and SARS-CoV-2 S proteins are subsequently cleaved and activated by the host cell-surface protease TMPRSS2 at the S1/S2 and S2′ sites, leading to membrane fusion (6, 10-12). Some coronavirus S proteins are precleaved at the S1/S2 site by the cellular protease furin during their biosynthesis in the producer cell, as has been shown for both MHV and MERS-CoV (13-15), priming them for entry upon receptor binding on the target cell. MERS-CoV S protein uses DPP4 as its receptor (16,17), and multiple cellular proteases, including TMPRSS2, endosomal cathepsins, and furin, have been implicated in the subsequent cleavage at the S2′ site (16,18,19). The MHV S protein uses host CEACAM1a as its receptor and is subsequently cleaved at S2′ by lysosomal proteases (20,21).
The extent to which specific coronaviruses fuse at the plasma membrane versus during endocytosis remains incompletely resolved. In the cases of SARS-CoV, MERS-CoV, and MHV, the involvement of endosomal and lysosomal proteases in cleavage of their S proteins suggests that entry can occur during endocytosis. MHV enters predominantly through clathrin-mediated endocytosis and fusion with lysosomal membranes, as lysosomal proteases activate the S protein (22,23). For SARS-CoV and MERS-CoV, both the endocytic and direct membrane fusion pathways may be used for entry. Studies in which components of endocytosis and endosomal proteases have been blocked demonstrate that SARS-CoV and MERS-CoV can exploit the endocytic pathway to enter target cells (24-27). For these viruses, it is likely that the producer and target cell type influence which pathway they use for viral entry. For instance, when MERS-CoV S is precleaved in the producer cell, it gets activated by cell-surface proteases and enters the target cell by direct membrane fusion (28). In contrast, when MERS-CoV S is uncleaved in the producer cell, it enters the target cell through endocytosis and is instead activated by endosomal cathepsins. MERS-CoV with S that has not been precleaved during morphogenesis is incapable of infecting target cell types that have low expression of cathepsins. There are reports demonstrating that inhibition of endosomal cathepsins reduces the efficiency of SARS-CoV-2 entry, suggesting that this virus also exploits endocytosis as another route of entry in addition to direct membrane fusion (6,29,30).
Figure 1. Coronaviruses engage with a host cell-surface receptor and deposit their RNA genomes into the host cytoplasm through endocytosis or direct membrane fusion (1). The positive-sense RNA genome is translated by the host translation machinery (2) to make polyproteins that are cotranslationally cleaved by proteases encoded in the polyprotein to generate components of the RdRp complex (3). The RdRp complex uses the genome as a template to generate negative-sense subgenome and genome-length RNAs (4), which are in turn used as templates for synthesis of positive-sense full-length progeny genomes and subgenomic mRNAs (5). Transcription and replication occur in convoluted membranes (CM) adjacent to DMVs that are both derived from rough endoplasmic reticulum (see Fig. 6 for more details). The subgenomic mRNAs are translated into structural and accessory proteins (6). The positive-sense genomic RNA is bound by nucleocapsid and buds into the ERGIC, which is decorated with structural proteins S, E, and M translated from positive-sense subgenomic RNAs (steps 6 and 7). The enveloped virion is then exported from the cell by exocytosis (steps 8 and 9).
There has already been considerable research on the SARS-CoV-2 S protein, given the crucial role it plays during viral entry (reviewed in Ref. 31). Comparing the SARS-CoV-2 S protein sequence with that of closely related SARS-CoV-like viruses revealed that almost all the residues important for ACE2 engagement are not conserved in SARS-CoV-2 (32), although the SARS-CoV-2 S RBD has a 10-20-fold higher binding affinity to ACE2 than SARS-CoV S RBD (33). The mechanistic basis for the enhanced binding affinity is not entirely clear, as ACE2 engagement is structurally similar between SARS-CoV S and SARS-CoV-2 S (34). However, there is a unique salt-bridge interaction present between SARS-CoV-2 S and ACE2, and this may contribute to the enhanced binding affinity. Furthermore, the S1/S2 site in SARS-CoV-2 S contains an insertion of polybasic residues (35,36). The stretch of polybasic residues contains a furin recognition motif, and recent data suggest that furin can cleave at the S1/S2 site on SARS-CoV-2 S, but not SARS-CoV S, in producer cells (29,37). This precleavage event is analogous to the processing of MERS-CoV S and MHV S, both of which also contain a furin cleavage site at S1/S2. A precleavage event at the S1/S2 site implies that SARS-CoV-2 S may only require cleavage at the S2′ site on the target cell surface, which would potentiate the membrane fusion process. Notably, acquisition of polybasic cleavage sites occurs during experimental selection for increased transmissibility and expanded tropism in other viruses, suggesting that it may have played a role in the bat-to-human spillover of SARS-CoV-2 (38-42). Further investigation into the properties of S protein from SARS-CoV-2 and other closely related viruses may provide insight into the origin of SARS-CoV-2 as well as the mechanism behind its high transmissibility.
Numerous therapeutic strategies are being explored to inhibit SARS-CoV-2 entry, including blocking ACE2 engagement, inactivating host proteases, and inhibiting S2-mediated membrane fusion. Neutralizing antibodies against SARS-CoV S display moderate efficacy in blocking SARS-CoV-2 infection due to significant differences in the epitope region (6,35,43,44). A recent study isolated neutralizing antibodies capable of blocking the interaction between S and ACE2 from convalescent SARS-CoV-2 patients and demonstrated that they effectively reduce viral load in a mouse model, garnering optimism about the possible use of neutralizing antibodies for treatment (45). Other strategies include development of lipopeptides that block S2-mediated membrane fusion (46) and use of a clinically tested TMPRSS2 inhibitor (6). Not surprisingly, generating protective immunity against the S protein has been the major focus of SARS-CoV-2 vaccine efforts. S protein-directed vaccine platforms under development include production of recombinant S protein, use of nonpathogenic viral vectors to direct expression of S, and nucleic acid-based vaccines in which sequence encoding the S protein is delivered as an mRNA or on a DNA backbone (47). The viral vector and nucleic acid vaccine strategies rely on host ribosomes to translate the S sequence into protein, which would then be subsequently processed and presented to the immune system.

Genome organization, polyprotein synthesis, and proteolysis
Coronaviruses have one of the largest known genomes among RNA viruses, ranging from 27 to 32 kb in length, more than double the length of the average RNA virus genome, and encode ~22-29 proteins (48, 49). Given the constraints of eukaryotic translation, which generally allow one protein to be translated per mRNA with ribosome scanning beginning near the 5′ end, it is worth pausing to consider how this number of viral proteins can be synthesized from the genome with a single ribosome entry site. Coronaviruses achieve this feat through the use of large, multiprotein fusions (termed polyproteins, described below) that are subsequently processed into individual proteins (50), as well as through synthesis of subgenome-length mRNAs using an unusual transcription mechanism (discussed in the subsequent section).
Figure 2. The SARS-CoV-2 S protein engages with the host ACE2 receptor and is subsequently cleaved at S1/S2 and S2′ sites by TMPRSS2 protease. This leads to activation of the S2 domain and drives fusion of the viral and host membranes. See section on 'viral entry' for details.
All of the viral nonstructural proteins (nsps) are encoded in two open reading frames (ORF1a and -b) that encompass roughly the first two-thirds of the viral genome (Fig. 3). ORF1a/b is translated from the 5′-capped RNA genome by cap-dependent translation to produce a shorter polyprotein (the ~440-500-kDa pp1a, which includes nsp1 to -11) or a longer polyprotein (the ~740-810-kDa pp1ab, which includes nsp1 to -16), depending on whether the stop codon at the end of ORF1a is recognized or bypassed. Bypassing the ORF1a stop codon occurs through a −1 ribosomal frameshift in the overlapping region between ORF1a and -1b just upstream of the stop codon, enabling production of the larger pp1ab polyprotein. Frameshifting occurs with ~20-50% efficiency (51) and is triggered by the presence of a slippery sequence, UUUAAAC, followed by an RNA pseudoknot structure (52), the disruption of which affects frameshifting efficiency (53). Whereas nsp1 to -11 from ORF1a are involved in a broad range of functions from blocking the initial immune response to functioning as cofactors for replication and transcription proteins, the core components of the replication and transcription machinery, such as the RNA-dependent RNA polymerase (RdRp), helicase, and other RNA-modifying enzymes, are present in the ORF1b portion of pp1ab. This frameshifting-based translational control strategy helps the virus maintain a stoichiometry of pp1a and pp1ab proteins that is optimal for infectivity and replication (54,55). Due to this requirement of precise ratios of pp1a and pp1ab, frameshifting has been explored as a novel drug target (56, 57) similar to such efforts in HIV (58). These drugs typically prevent frameshifting by binding to RNA structures that are required for frameshifting (56,57).
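The effect of a −1 frameshift on the translated product can be illustrated with a minimal Python sketch. Everything here other than the slippery sequence UUUAAAC is an assumption made for illustration: the toy RNA is invented, the function name `translate` is hypothetical, the standard genetic code is used, the pseudoknot and the ~20-50% efficiency are not modeled, and the last codon of the slippery heptamer is assumed to lie in the zero frame.

```python
# Toy model of -1 programmed ribosomal frameshifting (illustrative only).
SLIPPERY = "UUUAAAC"  # slippery sequence named in the text

# Standard genetic code, built in UCAG order.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Hypothetical mini-ORF: the zero frame hits a stop codon shortly after the
# slippery site, whereas the -1 frame stays open a little longer.
TOY_RNA = "AUGGCUUUAAACGGGUAACGUGA"


def translate(rna: str, frameshift: bool) -> str:
    """Translate from position 0; optionally slip back 1 nt after the slippery site."""
    site = rna.find(SLIPPERY)
    slip_end = site + len(SLIPPERY) if site >= 0 else None
    protein = []
    pos = 0
    shifted = False
    while pos + 3 <= len(rna):
        amino_acid = CODONS[rna[pos:pos + 3]]
        if amino_acid == "*":  # stop codon recognized
            break
        protein.append(amino_acid)
        pos += 3
        # After decoding the last codon of the heptamer, the ribosome slips
        # back one nucleotide, so that nucleotide is read a second time.
        if frameshift and not shifted and pos == slip_end:
            pos -= 1
            shifted = True
    return "".join(protein)


if __name__ == "__main__":
    print("no frameshift :", translate(TOY_RNA, frameshift=False))
    print("-1 frameshift :", translate(TOY_RNA, frameshift=True))
```

In this toy, the unshifted ribosome terminates at the zero-frame stop codon (the pp1a-like outcome), whereas the shifted ribosome bypasses it and yields a longer product (the pp1ab-like outcome).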
To liberate the individual nsps, pp1a and pp1ab are proteolytically processed in cis and in trans by two viral proteases encoded by nsp3 and nsp5. Nsp3 contains one or two papain-like proteases (PLpro1 and PLpro2), and nsp5 contains a chymotrypsin-like cysteine protease (3CLpro) (reviewed in Ref. 59). The 3CLpro catalyzes the proteolytic cleavage of all nsps downstream of nsp4 and is thus referred to as the main protease. 3CLpro and PLpro have long been considered potential drug targets, as their cleavage recognition sequences are distinct from those of human proteases and they are essential to viral replication (60-62). Although PLpro is responsible for fewer cleavage events in pp1a, it additionally functions as a deubiquitinase and deISGylating (removal of conjugated interferon-stimulated gene 15 from cellular proteins) enzyme (63,64), activities that contribute to evasion of the initial antiviral response (64). It is therefore possible that targeting PLpro would inhibit viral replication as well as prevent dysregulation of cellular signaling pathways that could lead to cell death in surrounding cells (65).

Replication and gene expression
A subset of nsps generated by proteolytic cleavage of the polyproteins come together to form the replication and transcription complexes (RTCs) that copy and transcribe the genome. RTCs reside in convoluted membrane structures (discussed in detail below) derived from rough endoplasmic reticulum (ER) and are anchored in place by viral transmembrane proteins nsp3, nsp4, and nsp6 (66-69). Similar to other positive-strand RNA viruses, replication of coronaviruses involves synthesis of the complementary full-length negative-strand RNA, which serves as a template for generation of positive-strand progeny genomes (70). The negative-strand templates get turned over via unknown mechanisms (71), and the positive-strand genomes are packaged into virions. Several cis-acting RNA elements in the 5′ and 3′ ends of the genome are important for replication and transcription (reviewed in Refs. 72 and 73). These include conserved stem loop structures within ~500 nucleotides of the 5′ end of the genome, structural elements in the 3′ UTR that are partially conserved across the different coronaviruses, and the 3′ poly(A) tail. Negative-strand synthesis is facilitated by the N protein interacting with both the poly(A) tail and the 5′ end of the genome to bring these termini in proximity (74).
Figure 3. The RNA genome encodes two categories of proteins: nsps and structural and accessory proteins. The nonstructural proteins are encoded in ORF1a and ORF1b. Cap-dependent translation begins at ORF1a and produces pp1a, encompassing nsp1-11, or pp1ab, a longer polypeptide that includes nsp12-16. The production of either polypeptide depends on whether the stop codon at ORF1a is recognized by the ribosome or is bypassed through a change in the reading frame by the ribosome frameshifting site. The structural and accessory proteins are synthesized by translation of their respective subgenomic mRNAs (see Fig. 4). The proteins have been color-coded by functional categories for SARS-CoV (see Table 1).
In addition to genomic replication, the RTCs also carry out synthesis of subgenomic (sg) mRNAs, which encode the ORFs located in the 3′-proximal one-third of the genome. All sg mRNAs are co-terminal and contain a common 5′ leader sequence that is derived from the 5′ end of the viral genome (75). Placement of the common leader sequence at the 5′ end of all sg mRNAs involves an unusual and complex mechanism of discontinuous transcription (Fig. 4) (reviewed in Ref. 76). During negative-strand synthesis, the RdRp complex terminates or pauses at specific sites along the genome called transcription regulatory sequences (TRSs). The TRSs are present downstream of the common leader sequence at the 5′ end of the genome (TRS-L) and 5′ of every viral ORF along the body of the viral genome (TRS-B) except ORF1a and -1b. Complementarity between sequences in TRS-B on the newly synthesized negative-sense RNA and TRS-L allows for the transcription complex to switch templates, effectively jumping from a given TRS-B to the TRS-L at the 5′ end of the genome. Transcription then continues, copying the leader sequence to complete the negative-strand sg RNA (77,78). The negative-strand sg RNAs subsequently serve as templates to generate large numbers of sg mRNAs; the positive-strand RNAs far outnumber the negative-strand RNAs (79). Secondary structure analysis of the TRS-L region has shown that the context of the sequence and associated structures are important for ensuring that only the TRS-L, and not other TRS-B sequences, acts as the template for strand switching by the RdRp (80). The purpose of the 5′ leader sequence in all sg mRNAs, other than to potentially prime sg mRNA synthesis, is not completely understood. One study with SARS-CoV suggested that the 5′ leader sequence could be important for protection against cleavage by viral nsp1 (81), although the mechanism by which protection is rendered is unclear.
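The template switch described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not a description of any real genome: the sequences in `TOY_GENOME`, the core motif in `TOY_TRS`, and all function names are hypothetical, the switch is treated as occurring exactly at the start of the TRS, and pausing, switching efficiency, and RNA structure are ignored.

```python
# Toy model of discontinuous transcription (all sequences are invented).
import re

COMPLEMENT = str.maketrans("ACGU", "UGCA")

TOY_TRS = "ACGAAC"  # assumed core TRS motif for this sketch
TOY_GENOME = (
    "GAUUA"            # leader at the 5' end
    + TOY_TRS          # TRS-L
    + "GGGCCC"         # stands in for ORF1a/b (no TRS-B of its own)
    + TOY_TRS + "AAAUUU"   # TRS-B + downstream ORF
    + TOY_TRS + "CCCGGG"   # TRS-B + downstream ORF
    + "AAAA"           # poly(A) tail
)


def reverse_complement(rna: str) -> str:
    """Return the complementary strand, written 5'->3'."""
    return rna.translate(COMPLEMENT)[::-1]


def trs_sites(genome: str, trs: str) -> list[int]:
    """Start positions of every TRS copy; the first one is taken as TRS-L."""
    return [m.start() for m in re.finditer(re.escape(trs), genome)]


def negative_sg_rna(genome: str, trs: str, trs_b_start: int) -> str:
    """Negative-strand sg RNA (5'->3') made by switching at one TRS-B."""
    trs_l_start = trs_sites(genome, trs)[0]
    # The RdRp copies the genome from its 3' end back through the TRS-B...
    body_copy = reverse_complement(genome[trs_b_start:])
    # ...then jumps to TRS-L and copies the leader to finish the strand.
    leader_copy = reverse_complement(genome[:trs_l_start])
    return body_copy + leader_copy


def sg_mrnas(genome: str, trs: str) -> list[str]:
    """Positive-sense sg mRNAs templated by each negative-strand sg RNA."""
    _trs_l, *trs_b_starts = trs_sites(genome, trs)
    return [
        reverse_complement(negative_sg_rna(genome, trs, start))
        for start in trs_b_starts
    ]


if __name__ == "__main__":
    for mrna in sg_mrnas(TOY_GENOME, TOY_TRS):
        print(mrna)
```

Each sg mRNA produced by the sketch begins with the same leader and ends at the same 3′ terminus, which reproduces the nested, co-terminal arrangement described in the text.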
The efficiency with which the template switch occurs is an important determinant of the levels of the different sg mRNAs and the ratio of sg mRNAs to genome-length RNA, as failed template switching leads to read-through at TRSs and increases the probability of producing genome-length RNA (reviewed in Ref. 80). Most of what is known about this regulation is from studies on arteriviruses, which belong to the same order (Nidovirales) as coronaviruses and synthesize sg mRNAs by a similar mechanism. The levels of several sg mRNAs are correlated with the stability (ΔG) of the duplex between TRS-L and TRS-B (77), and hence duplex stability was thought to be an important regulator of this process. However, a recent sequencing study with an arterivirus showed that some TRS-B sequences with 100% similarity to TRS-L core sequences were not used as switching points for the transcription complex, suggesting that whereas duplex stability is necessary, it is not sufficient to dictate template switching (82). Regulation of the levels of some sg mRNAs, such as the N protein sg mRNA in coronaviruses, was shown to be mediated by short- and long-range RNA-RNA interactions (83,84).
Figure 4. [...] (1). Upon copying the TRS-B sequences present at specific sites along the genome body (2), the RdRp complex may "jump" to the TRS-L sequence (3) owing to complementarity between the TRS-B sequence on the nascent sg RNA and TRS-L sequence on the genome. Transcription is resumed on the new template, and the leader sequence (shown in red) is copied to complete the negative-strand sg RNA. The RdRp complex does not always switch templates at TRS-B sequences, resulting in the synthesis of genome-length negative-strand RNA. The negative-strand RNAs serve as templates for the synthesis of genome-length positive-strand RNAs or sg mRNAs.
Several proteins have also been implicated in regulating the levels of sg mRNAs and the switch between full-length negative-strand synthesis and sg RNA synthesis, although a clear picture of features that favor transcription or replication has not emerged. For example, the viral N protein (85) and the cellular kinase GSK-3 and helicase DDX1 (86) have been shown to be important for producing full-length negative-strand genomic RNA and long sg RNAs, suggesting a role in read-through of TRSs. However, the N protein also has helicase-like activity (87), promotes template switching, and appears dispensable for replication but required for efficient sg mRNA transcription (88). It is also possible that the transcription complex that carries out negative-strand synthesis is distinct from the version that carries out positive-strand synthesis (89).

Composition of the replication/transcription complex
Coronavirus replication, discontinuous transcription, and RNA processing are orchestrated by a remarkably sophisticated replicase complex (Fig. 5). Unlike other RNA viruses, where replication is primarily dependent on the RdRp and a small number of cofactors, coronaviruses appear to use a multiprotein complex, including the RdRp (nsp12), processivity factors (nsp7-8), a helicase (nsp13), single-strand binding protein (nsp9), a proofreading exonuclease (nsp14), other cofactors (e.g. nsp10), and capping enzymes (e.g. nsp16). This is more reminiscent of replisomes from DNA-based organisms and is potentially a consequence of their unusually large genomes (90).
In vitro studies showed that whereas the SARS-CoV RdRp nsp12 has some minimal activity on its own, its activity and processivity are greatly stimulated in the presence of nsp7-nsp8 cofactors (91). Cryo-EM structures of the SARS-CoV and SARS-CoV-2 nsp12-nsp7-nsp8 tripartite complex revealed that nsp8 binds nsp12 both as a heterodimer (nsp7-nsp8) and by itself to stabilize the regions of nsp12 involved in RNA binding (92,93). Whether the RdRp is capable of de novo initiation or requires a primer-template substrate remains heavily debated (94,95). Coronavirus RdRps also have a conserved N-terminal domain that has nucleotidylation activity (NiRAN domain), which is essential for coronavirus replication (96). Structural homology analysis of the NiRAN domain suggests that it shares significant homology with the nucleotide-binding site of protein kinases (92), although how it might mediate nucleotidyltransferase activity and what function this domain serves are not known.
In addition to its role as a processivity factor for the RdRp, nsp8 was first thought to function as a primase during replication (97,98). However, whereas nsp8 has polyadenylation activity that is stimulated by the presence of a polyU stretch on the template strand, it is unable to incorporate other nucleotides on heteropolymeric templates (99), suggesting that it might not be a primase. Additionally, the cryo-EM structure with nsp7 and nsp12 does not suggest a mechanism for nucleotide incorporation by nsp8. It has been proposed that the presence of polyU sequences at the 5′ end of the negative-strand viral RNA could promote polyadenylation of the viral positive-strand RNAs by nsp8, but this remains to be experimentally validated. The poly(A) tail length also varies during infection (100), and it would be interesting to explore whether nsp8 has a role in this process.
One of the interacting partners of nsp8 in the RTC is nsp9, a single-strand (ss) nucleic acid-binding protein (101,102) with no obvious sequence specificity or function. It binds ssDNA and ssRNA with equal affinity, although ssRNA is the presumed substrate during infection. Structural studies have shown that it dimerizes, and this is important for viral replication but dispensable for RNA binding (103). It is possible that nsp9 binds to single-stranded regions of the viral genome and protects them from nucleases, akin to the role played by ssDNA-binding proteins in DNA replication systems. Indeed, other ss nucleic acid-binding proteins are also known to play roles in recombination and homologous base pairing (104), processes that occur during discontinuous negative-strand synthesis in coronaviruses.
Another key component of the RTC is nsp13, a superfamily 1 (SF1) 5′ → 3′ helicase (105) that interacts with nsp12 (106) and several other components of the RTC. The functional role of helicases in replication of RNA viruses is largely unknown, although they are one of the most conserved proteins encoded by coronaviruses (reviewed in Ref. 107). Helicases use the energy from nucleotide hydrolysis to translocate on nucleic acids. In addition to its (d)NTPase activity, nsp13 also has a 5′-triphosphatase activity, suggesting a role for it in RNA capping (108). The helicase domain of MERS-CoV and SARS-CoV nsp13 shows remarkable similarity to the cellular Upf1 helicase, a protein involved in the nonsense-mediated decay pathway. Based on this observation, it has been proposed that nsp13 could also play a role in quality control of RNAs (109).
One of the central outstanding questions about the role of helicases in RNA viruses is whether they function similarly to replicative helicases or if they are involved in unwinding local structures and removing obstacles for the polymerase. Replicative helicases typically work together with the polymerase to unwind the double-stranded nucleic acid ahead of the polymerase. The 5′ → 3′ directionality of the helicase is reminiscent of prokaryotic replisomes, where the helicase and polymerase translocate on different strands and the helicase helps in unwinding the duplex ahead of the polymerase. Thus, during the synthesis of full-length progeny genomes using the negative-strand RNA as a template, nsp13 could be bound to the positive-strand RNA and assist the RdRp as it copies the negative strand (Fig. 5). Cooperativity between the replicative helicase and polymerase is a conserved feature of DNA replisomes. The RdRp stimulates the activity of the helicase (106), but whether the helicase has a reciprocal effect on RdRp activity, similar to DNA replisomes, would be interesting to test. A nonmutually exclusive possibility is that the helicase facilitates RdRp template switching during discontinuous transcription by releasing subgenomic RNAs at TRS sites during negative-strand synthesis, similar to a role played by some other SF1 helicases in recombination (110).

Mechanisms underlying high-fidelity replication
RNA viruses typically have high mutation rates due to lack of RdRp proofreading activity, which promotes viral genetic diversity and increases their adaptive potential. However, the potential for accumulation of deleterious mutations leading to collapse of the viral population through error catastrophe caps the size of most RNA virus genomes to ~15 kb (reviewed in Ref. 90). The ~30-kb coronavirus genome far exceeds this threshold, indicating that they must have specialized mechanisms to counteract this mutational burden. In this regard, they are one of the few RNA viruses apart from toroviruses and roniviruses (which are also exceptionally large) that have an exonuclease activity and associated high-fidelity replication (111). The discovery of this exonuclease (nsp14-ExoN) in the coronavirus genome (112) showed for the first time the potential for proofreading activity in RNA viruses and explained how coronaviruses maintain their genome integrity. Indeed, the mutation rates of coronaviruses are an order of magnitude lower (10⁻⁶ to 10⁻⁷) than that of most RNA viruses, and mutating the SARS-CoV or MHV ExoN gene causes the error frequency to jump to that observed in many other RNA viruses (10⁻³ to 10⁻⁵) (113-115).
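A short back-of-the-envelope calculation shows why these rates matter for a ~30-kb genome. The sketch below assumes, purely for illustration, that the quoted rates can be read as independent per-nucleotide error probabilities per round of copying; the function names and the specific example rates chosen from within the quoted ranges are hypothetical.

```python
# Rough arithmetic linking per-site error rate to genome-wide mutational load.
# Assumption: errors are independent and the rate is per nucleotide per copy.

GENOME_LENGTH_NT = 30_000  # ~30-kb coronavirus genome, as stated in the text


def expected_mutations(genome_length_nt: int, error_rate: float) -> float:
    """Mean number of errors introduced per genome copy."""
    return genome_length_nt * error_rate


def error_free_fraction(genome_length_nt: int, error_rate: float) -> float:
    """Probability that a single genome copy carries no errors at all."""
    return (1.0 - error_rate) ** genome_length_nt


if __name__ == "__main__":
    # Example rates picked from the ranges quoted above.
    for label, rate in [("with ExoN proofreading", 1e-6),
                        ("ExoN-deficient", 1e-4)]:
        mean = expected_mutations(GENOME_LENGTH_NT, rate)
        clean = error_free_fraction(GENOME_LENGTH_NT, rate)
        print(f"{label}: ~{mean:.2f} errors per copy, "
              f"{clean:.1%} of copies error-free")
```

Under these assumptions, a rate of 10⁻⁶ gives roughly 0.03 errors per genome copy, whereas 10⁻⁴ gives roughly 3, so most copies made without proofreading would carry at least one change.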
Active-site mutants that abolish the exonuclease activity of ExoN are lethal for HCoV-229E and transmissible gastroenteritis virus (TGEV) and cause impaired growth for MHV and SARS-CoV (112), suggesting that ExoN is important but may not be essential under all conditions. Why MHV and SARS-CoV but not HCoV-229E and TGEV can tolerate ExoN mutants is unclear, although it is possible that ExoN is essential only in alphacoronaviruses (HCoV-229E and TGEV) and not in betacoronaviruses (MHV and SARS-CoV). It is also possible that the active-site mutation in SARS-CoV and MHV did not fully deactivate the enzyme or that other proteins in the replicase can compensate for the absence of an active ExoN. For example, nsp10 stimulates the catalytic activity of nsp14-ExoN to remove a mismatched nucleotide at the 3′ end of the RNA by >35-fold (116), and the high replication fidelity depends on the nsp10-nsp14 interaction (117). ExoN (nsp14) also interacts with the nsp12-nsp8-nsp7 tripartite complex (91), providing biochemical evidence for its role in proofreading during transcription/replication. Nsp10 also interacts with nsp16 (a potential RNA-modifying enzyme), and it has been proposed that all of these proteins could come together to form a larger complex during replication similar to DNA replisome complexes. In vitro biochemical studies comparing the activity of ExoN from MHV, SARS-CoV, and HCoV-229E together with the accessory proteins could shed mechanistic light on these phenotypic differences between the ExoN mutants.
Replication fidelity is inherently tied to viral fitness and, in most cases, changes to replication fidelity decrease fitness (reviewed in Ref. 90). This suggests that mutants with altered replication fidelity (such as the ExoN mutant) have potential therapeutic value as live attenuated vaccines (118). Indeed, the SARS-CoV ExoN mutant had decreased pathogenesis and did not revert to virulence even after persistent infection in vivo (118). The ExoN mutation did not revert to WT even over 250 viral passages, although it accumulated a variety of mutations that partially compensated for the replication defect and decreased the population sensitivity to mutagens (119). Several components of the replicase complex, including nsp8, nsp9, nsp12, and nsp13, had mutations in the coding region, underscoring the complexity and interdependence of the RTC and how that helps the virus circumvent the consequences of decreased fidelity. A better understanding of the mechanism of replication fidelity will also allow for the exploration of mutants that increase replication fidelity and thereby reduce diversity and potentially fitness of the population, as has been shown for polioviruses (120).
Recombination, which is generally high in RNA viruses and is linked to their virulence and pathogenicity (121), may also influence coronavirus diversity. In coronaviruses, recombination occurs as an inherent part of the replication cycle during the synthesis of sg RNAs and is tied to the ability of the RdRp to switch templates from the TRS-B sequence to the TRS-L sequence to copy the leader sequence from the 5′ end of the genome. Such recombination events can also occur between coinfecting coronaviruses with different genotypes (reviewed in Ref. 122). Recombination can lead to defective copies of RNA that can no longer be replicated (123) or recombinants with new properties, such as the ability to replicate in a new host (122), leading to new outbreaks. Mutational reversion and recombination-driven processes can pose significant challenges to the use of live attenuated vaccines (120), emphasizing the need to engineer recombination-resistant strains (124). A recent study suggests the involvement of nsp14-ExoN in mediating recombination frequency and junction site selection in several coronaviruses (125), opening up an exciting avenue of exploration for nsp14 in vaccine development.

Viral RNA processing
Capping the 5′ end of the viral mRNA is important for viral mRNA stability, translation initiation, and escape from the cellular innate immune system (126). Capping typically occurs cotranscriptionally in the nucleus, so RNA viruses that replicate in the cytoplasm encode their own enzymes or incorporate other strategies, such as cap snatching (as in bunyaviruses) (127), to protect the 5′ end of their RNAs. The coronavirus capping mechanism is not completely understood, although it appears to follow the canonical capping pathway. Capping begins with hydrolysis of the γ-phosphate of the 5′ end nucleotide; although not yet directly shown, this is thought to be mediated by the nucleotide triphosphatase activity of the nsp13 helicase (128). This is followed by the addition of a guanosine monophosphate to the diphosphate RNA by a guanylyl transferase that has remained elusive in coronaviruses, although the NiRAN domain of nsp12 could be involved in this process (96). The guanosine is then methylated at the N7 position, likely by N7-methyltransferase (MTase) activity that resides in the C-terminal part of ExoN (nsp14) (129). Finally, nsp16 is thought to methylate the first and second nucleotides at the 2′-O position (130). This activity requires interaction with nsp10, which appears to improve substrate and RNA binding by nsp16 (131). The 2′-O-methylation is important for evasion of the type-I interferon (IFN) response (which is discussed below) (132). Of the enzymes involved in capping, the N7-MTase of nsp14 is an attractive antiviral target, as this domain exhibits a noncanonical MTase fold different from cellular MTases (129).
The 3′ ends of coronavirus mRNAs are polyadenylated. The length of the poly(A) tail regulates translation efficiency of the mRNAs (100) and is essential for negative-strand synthesis (133). Whereas polyadenylation-related elements, such as an AGUAAA hexamer and the poly(A) tail, work in concert to ensure polyadenylation of the genome (134), the precise mechanism by which this occurs is not known. It is also unclear whether the RdRp carries out the polyadenylation or if cellular poly(A) polymerases are recruited for this process.
Given that translation of coronavirus mRNAs relies on host cap-dependent translation machinery, a number of cellular cap-binding complex factors are candidates for therapeutic targeting (135,136). Systematic mapping of the interaction between SARS-CoV-2 proteins and the host proteome has revealed interactions between viral proteins and host translation machinery, and an inhibitor of cap-dependent translation initiation reduced viral infectivity in cell culture (137). These data point to the possible effectiveness of a host-directed antiviral therapeutic strategy in treating COVID-19.

Replication/transcription complex proteins as drug targets
Whereas the complexity of the coronavirus replisome may have enabled the virus to expand its genome, it also presents numerous targets for the development of antivirals (138). Most prominent is the RdRp, as it is essential for the virus and lacks homologs in the host. Nucleoside analogs, which are nucleotide triphosphate (NTP) mimics, are commonly used RdRp inhibitors (139). However, designing nucleoside analogs as inhibitors is particularly challenging for coronaviruses due to the presence of the exonuclease, which can excise incorporated analogs and thus provide resistance. An exception to this has been the adenosine analog remdesivir, which is currently in phase 3 clinical trials for treating coronavirus infections (140). A recent in vitro study with purified RdRp-nsp8 complex from several coronaviruses showed that remdesivir incorporation blocks chain elongation 3 nucleotides downstream of its incorporation site, which potentially protects it from ExoN cleavage (141,142). Additionally, remdesivir is selectively incorporated by the RdRp over the natural substrate ATP. Better in vitro reconstitution systems incorporating the other components of the RTC (nsp7, nsp13, and the nsp14 exonuclease) will further help to elucidate the mechanism of inhibition.
It may also be of interest to develop nonnucleoside RdRp inhibitors, as have been developed for other RNA viruses, such as hepatitis C virus (143). Nonnucleoside inhibitors typically function allosterically and hence are potentially immune to the resistance conferred by the exonuclease activity of ExoN. Combining compounds that inhibit ExoN together with nucleoside analogs to inhibit the RdRp or using small molecules that increase the mutation load of the virus by other mechanisms that are not sensitive to the exonuclease are other viable options (144). Finally, other components of the RTC, such as the helicase (145), exonuclease (115), and capping machinery (131,146), have also been considered as potential druggable targets.

Coronavirus replication occurs within heavily modified membranes
A defining feature of many positive-strand RNA viruses, including CoVs, is their ability to hijack and reform intracellular membranes to create a cellular niche for the replication of their RNA genome. Ultrastructural characterization of mainly MHV- and SARS-CoV-infected cells has revealed the membranes that anchor RTCs in CoV-infected cells to be quite striking, consisting of double membrane vesicles (DMVs) among other intricate convoluted membrane structures that isolate CoV RNA from the rest of the cellular environment (Fig. 6) (147)(148)(149)(150). Conceptually, RTC formation leads to the concentration of viral replication machinery, spatially separating the sites of viral RNA replication from downstream virion assembly in the endoplasmic reticulum-Golgi intermediate compartment (ERGIC). Additionally, RTCs likely prevent detection of viral dsRNA replication products by innate immune sensors.
The DMVs and convoluted membranes in CoV-infected cells form at the nuclear periphery and are derived from host ER membrane (66). The majority of the membrane manipulation is carried out by three nonstructural proteins with integral transmembrane domains: nsp3, nsp4, and nsp6 (67). Although biochemical characterization of these proteins is hindered by their hydrophobic nature, protein-protein interaction studies performed in cells suggest that nsp3, nsp4, and nsp6 can oligomerize and form complexes through their luminal loops (67)(68)(69). Expression of these RTC proteins individually in uninfected cells is sufficient to cause membrane proliferation and various perturbations of membrane morphology (69). Coexpression of nsp3 and nsp4 leads to their colocalization in perinuclear foci by fluorescence microscopy and the formation of membrane structures with increased curvature by EM (67). Because the specific interaction of nsp3 and nsp4 is required for these structures to form, it is hypothesized that nsp3 and nsp4 rearrange membranes and introduce curvature by a "zipper" mechanism, essentially bringing ER membranes together through nsp3/4 interactions (69) (Fig. 6). The nsp3/4 interaction also recruits other proteins, including nsp6, to anchor RTCs. Finally, triple transfection of nsp3, nsp4, and nsp6 together results in the formation of DMVs in uninfected cells (67). Due to the importance of membrane modification during viral replication, CoV transmembrane proteins may be attractive drug targets. In fact, a small molecule screen for antiviral activity yielded a compound that targets the transmembrane protein nsp6 and essentially blocks viral RNA replication and DMV formation (151).
During CoV infection, the inner membrane of the DMV is sealed while the outer membrane of the DMVs forms a contiguous network with the convoluted membranes and modified ER membranes (Fig. 6). When this network is isolated from cells, it is capable of producing both genomic and subgenomic RNAs in vitro even in the presence of RNases and proteases, but not detergent, thus implicating the membrane network in shielding viral RNA replication (152). The anchored RTCs consist of viral proteins nsp2-10, nsp12-16, and N protein, which have diverse enzymatic functions required for RNA replication as discussed above (150,153,154). The RTC microenvironment also includes numerous host proteins that participate in CoV biology, such as proteins involved in vesicular trafficking and translation initiation factors, the latter of which are suggestive of active translation near sites of viral RNA replication (154). The site of RNA replication inside this membrane network is currently unknown. Whereas viral RTC proteins labeled by immuno-EM primarily localize to convoluted membranes between DMVs, dsRNA (presumed to be of viral origin) labeled by the J2 antibody localizes inside the DMVs (Fig. 6). However, there is no experimental evidence demonstrating whether dsRNAs inside the DMV represent nascent viral transcripts, viral RNA replication byproducts, or even host dsRNAs. Recently, nascent viral RNA was visualized by metabolic labeling and quantitative EM autoradiography, revealing that viral transcription does in fact occur in association with the DMVs rather than convoluted membranes (155). The spatial resolution of this technique, while clearly demonstrating viral transcription within the vicinity of the DMVs, was not sufficient to pinpoint the localization of nascent viral RNA within DMVs and/or in association with DMV membranes.
Because no visible pores or openings in the inner membrane of the DMV have been detected with conventional EM techniques, viral RNA synthesis regardless of locale would rely on a yet unidentified transport mechanism capable of moving viral proteins and/or RNA in and out of the DMV inner membrane (66,155).

Viral packaging and egress
The assembly of an infectious CoV virion requires that its nucleocapsid, consisting of the viral RNA genome coated with N protein, and viral envelope coalesce into the same intracellular space. Viral glycoproteins that are incorporated into the envelope (M, E, and S proteins) are translated in the ER and retained at the site of budding in the ERGIC (Fig. 1). The ERGIC budding site is distinct from the site of viral genome synthesis in the RTC. The nucleocapsid core of the virion traffics from the RTC to ultimately bud into ERGIC membranes, which are decorated with M, E, and S protein and become the lipid envelope of the virion. The most abundant envelope component is the M protein, which plays a central role in viral egress. Outside of the context of infection, M protein expression alone is not sufficient to cause budding of virus-like particles, but co-expression with E (or N in the case of SARS-CoV) can result in virus-like particle formation in the absence of infection (156)(157)(158). During infection, the M protein nucleates virion components within the ERGIC budding compartment, as M directly interacts with the virion proteins E, N, and S and the CoV genomic RNA (159)(160)(161)(162). The E protein, while not highly abundant in the envelope, is critical for viral envelope curvature and maturation and can form membrane ion channels, although the significance of this latter activity is not yet appreciated (163,164). S protein assembly into virions is enhanced by C-terminal dilysine, dibasic, or tyrosine-based endoplasmic reticulum retention signals (165)(166)(167). Although the retention signals are quite divergent among CoVs, all serve to maintain S near the ERGIC-localized M protein, ensuring M-S interaction at the site of virion assembly. Following budding of the nucleocapsid core into the M-, E-, and S-containing ERGIC membranes, the newly enveloped virion then leaves the cell through the exocytic pathway.
Figure 6. Nsp3 and nsp4 are co-translationally embedded in the ER membrane and interact via their luminal loops. This leads to "zippering" of ER membranes and induced curvature (1). These interactions yield a complex array of convoluted membranes (CM) and DMVs that are contiguous with the rough ER (2). The protein components of RTCs are mainly localized to the convoluted membranes. The DMVs contain dsRNA, thought to be sequestered replication intermediates. The DMV inner membrane has no ribosomes, connections to the cytoplasm, or connections to the rest of the network. The mechanism of DMV formation and the exact site of CoV RNA replication within this membrane network are currently unknown. See the section "Coronavirus replication occurs within heavily modified membranes" for references.
Although CoV replication produces an abundance of unique viral RNAs in the cell (positive-strand genomic RNAs, positive-strand sg mRNAs, and negative-strand RNAs), purified CoV virions house mainly full genome-length RNA (159,168,169). Conceptually, this specificity is thought to be driven by a packaging signal unique to the genome-length RNA. In MHV, a packaging signal has been mapped to ORF1b within the nsp15 gene (a region absent in sg RNAs) and is predicted to form a bulged stem-loop structure with repeating AGC/GUAAU motifs (170,171). This packaging signal specifically binds both the N and M proteins, but the order in which these interactions occur is not clear (reviewed in Ref. 172). N protein must have broad RNA-binding activity, as it ultimately coats the length of the viral genome to form the nucleocapsid component of the virion and additionally forms complexes with sg RNAs (160,173,174). Thus, an additional role of M in recognizing the packaging signal and selecting full-length genomic RNA is an attractive model for genome packaging specificity, at least in the context of MHV infection (175). In contrast, the packaging signal identified in MHV is absent from other lineages of β-coronaviruses, including SARS-CoV and MERS-CoV (reviewed in Ref. 172), leaving us with little understanding of how other CoVs selectively package genome-length RNAs.

Part II: Viral manipulation of the host
Viruses depend on host processes to complete their life cycle. In addition to employing cellular machines like the ribosome to translate their proteins and manipulating cellular membranes during RNA synthesis and viral morphogenesis, several coronavirus proteins modify the cellular environment in ways that may influence viral pathogenesis and replication in vivo. In this section, we discuss the roles of coronavirus proteins in altering the cellular signaling landscape as well as the ability of the virus to modulate host gene expression and its interactions with and counteraction of the host immune response.

Accessory proteins and viral pathogenicity
Coronavirus genomes contain a number of genes concentrated in the 3′ region of the genome that encode accessory proteins that are largely dispensable for viral replication and growth in vitro (176)(177)(178)(179)(180)(181)(182). The SARS-CoV genome encodes eight accessory proteins (3a, 3b, 6, 7a, 7b, 8a, 8b, and 9b), which are the best-studied set of accessory proteins among β-coronaviruses (183,184). Accessory proteins are specific to each CoV genus and exhibit little homology across the family; as such, this set of eight proteins is specific to human and animal isolates of SARS-CoV (185). Additionally, no significant amino acid sequence similarity is shared between SARS-CoV accessory proteins and other known viral or cellular proteins, providing little insight to predict functional roles (186). Despite being nonessential for viral replication in cultured cells, the accessory proteins presumably modulate virus-host interactions that are important during in vivo infection, including cell proliferation, programmed cell death, pro-inflammatory cytokine production, and IFN signaling (see Table 1) (186,187). Many SARS-CoV accessory proteins can also be incorporated into virions or virus-like particles during infection, potentially suggesting minor structural roles (187).
Given the variability of accessory genes between coronaviruses, they may be linked to virus-specific pathogenicity. That said, annotation of the SARS-CoV-2 genome has identified a similar set of accessory genes as SARS-CoV (ORF3a, ORF3b, ORF6, ORF7a, ORF7b, ORF8, and ORF9b), albeit with some notable differences among the putative type-I IFN signaling antagonists (188,189). These include a premature stop codon in ORF3b resulting in a truncated and likely nonfunctional 20-amino acid protein, and relatively lower (69%) amino acid similarity of the ORF6 protein. This may suggest differences in the susceptibility of SARS-CoV and SARS-CoV-2 to host IFN responses (190).
The ORF8 region of the SARS-CoV genome, encoding for ORF8a and ORF8b proteins, displays major variation among human and animal isolates of SARS-CoV (191). Animal isolates contain a single ORF8 gene, whereas this region forms two separate genes in human isolates. However, SARS-CoV and SARS-CoV-2 isolated from patients during early phases of outbreaks closely resemble animal isolates with a single ORF8 gene, likely representing the first zoonotic transmission (192,193). It remains unclear whether these variances arise from genomic instability or if there is adaptive evolutionary pressure for these changes that may be related to the functional role of ORF8 proteins.
Functional roles have yet to be established for the majority of accessory proteins of other alpha- or gammacoronaviruses. Moreover, much of our understanding of these proteins in the betacoronaviruses derives from transfection or overexpression systems rather than from infection of cultured cells or in vivo studies. Further development of animal models is paramount to advancing our mechanistic understanding in this area. Nonetheless, the propensity for these genes to be maintained in coronavirus genomes suggests underlying functional importance.

Host shutoff
Numerous RNA and DNA viruses inhibit cellular gene expression by directly targeting mRNAs in order to redirect resources toward viral gene expression and dampen innate immune responses (194). In coronaviruses, this "host shutoff" activity is best characterized for nsp1, which uses an unusual two-part mechanism to restrict translation of mRNA that involves translational repression by 40S binding as well as mRNA cleavage (195,196). Nsp1 itself does not have detectable RNase activity, but its expression causes cleavage of mRNAs such as IFN-β near the 5′ end of transcripts, perhaps by recruitment of a host endonuclease (195). Viral RNAs as well as some highly structured 5′ UTRs, including certain internal ribosome entry site sequences, are resistant to cleavage, although still susceptible to translational repression (81). The 5′ common leader sequence is necessary and sufficient to confer protection to viral and reporter RNAs from nsp1-induced endonucleolytic cleavage (81,197). Whereas nsp1 specifically targets RNA polymerase II-transcribed RNAs in cells (195), the spectrum of mRNAs that are cleaved in response to nsp1 is unclear. However, if it functions analogously to mRNA-targeting host shutoff factors in other viruses, host transcripts may be broadly down-regulated (198)(199)(200).
Mutations in SARS-CoV nsp1 that block its interaction with the 40S ribosome inhibit both the translational repression and RNA cleavage functions, but an RNA cleavage-deficient mutant retains the translational repression activity (201)(202)(203)(204). Thus, nsp1-induced RNA cleavage may occur subsequent to translational repression. Unlike SARS-CoV and SARS-CoV-2 nsp1, MERS nsp1 does not contain the region of the protein that mediates interaction with the 40S, although it still represses host translation. Instead, it targets translationally competent transcripts of nuclear origin and spares virus-like reporter RNAs that are introduced directly into the cytoplasm (203).
The accessory protein ORF7a has also been shown to participate in SARS-CoV host shutoff by reducing total protein synthesis (205). Additional studies are needed to clarify the relative contribution of ORF7a and nsp1 to the translational repression seen during infection as well as to decipher the mechanisms underlying translational repression and mRNA cleavage.

Immune antagonism
Coronavirus-induced dampening of host antiviral responses and an overexuberant pro-inflammatory host response (e.g. cytokine storm) have been linked to the disease pathology associated with infection (206)(207)(208)(209)(210). Infection with the circulating human coronaviruses (HCoV-229E, HCoV-NL63, HCoV-OC43, and HCoV-HKU1) rarely causes severe disease, and that which occurs is largely associated with comorbidities (211). However, the highly pathogenic human coronaviruses SARS-CoV, MERS-CoV, and SARS-CoV-2 can cause significant acute respiratory distress syndrome (212). Samples from SARS, MERS, and COVID-19 patients show limited induction of antiviral IFN cell signaling pathways (206,213-216). Additionally, in SARS patients, high initial virus titers and increased inflammatory monocyte-macrophage and neutrophil accumulation in the lungs were associated with marked elevation of pro-inflammatory cytokines and chemokines (217,218). Pro-inflammatory cytokines and chemokines recruit inflammatory cells to the sites of infection. Subsequently, neutrophils and cytotoxic T cells, along with these cytokines, induce severe lung tissue damage, including vascular leakage, and stimulate pulmonary fibrosis (219). Recent work analyzing pro-inflammatory profiles among COVID-19 patients identified a similar subset of cytokines and chemokines to be markedly up-regulated (220,221).
Coronaviruses engage and counteract the immune system in a variety of ways (Fig. 7), which collectively are hypothesized to underlie the disease pathology. SARS-CoV in particular encodes multiple factors that directly antagonize pattern recognition receptors (PRRs) and simultaneously target the expression of IFN-signaling molecules induced by viral recognition. Many of these factors also further stimulate pro-inflammatory cellular responses. These multifaceted interactions with the immune system presumably contribute to the highly restricted induction of type I IFNs during coronavirus infection, while stimulating production of pro-inflammatory molecules. Below we summarize how individual coronavirus proteins modulate the host innate, inflammatory, and adaptive immune responses.
Modulation of the host innate immune response. In contrast to viruses like Sendai virus and influenza A, all assayed coronaviruses, including MERS-CoV, HCoV-229E, SARS-CoV, and SARS-CoV-2, have shown limited induction of IFN-β and other type I IFNs in tissue culture models, in mice, and in patient samples (206,214-216,222). Type I IFNs remain suppressed even during co-infection of Sendai virus and SARS-CoV, highlighting the ability of coronaviruses to actively silence immune effector expression (201). Suppression by MERS-CoV is particularly robust, as it down-regulates IFN-β ~60-fold more than SARS-CoV and 300-fold more than HCoV-229E, indicating that differences in viral gene sequences between the coronaviruses influence this response (184).
The strong down-regulation of type I IFN during CoV infection suggests that these viruses are highly sensitive to the presence of IFN, and administration of IFNs has been proposed as a therapeutic for SARS-CoV and SARS-CoV-2. IFN-β dramatically (5 × 10⁴-fold) reduces SARS-CoV RNA copies in cell culture, and IFN-α reduced viral titer in macaques 1 × 10⁴-fold. In cell culture, pretreatment with either IFN-α or IFN-β followed by SARS-CoV infection or post-treatment with IFN-β decreased viral replication (223,224). Similar results showing antiviral effects of type I IFN treatment were recently described in tissue culture models with SARS-CoV-2 (190,225), highlighting the potential for the rational design of a live attenuated vaccine with mutations in key immune antagonist genes, discussed below.
Replication intermediates produced during RNA virus infection can be recognized by two PRRs: RIG-I and MDA5. RIG-I preferentially recognizes short dsRNA with 5′ di- and triphosphates, whereas MDA5 preferentially recognizes long dsRNA, which is formed as an intermediate during RNA copying (226,227). MHV is primarily recognized by MDA5, as loss of MDA5, but not of RIG-I, impairs IFN-β induction following infection (228). Interestingly, no coronavirus inhibitor of MDA5 has yet been identified, which is notable, given CoV targeting of many other arms of the innate immune response.
Multiple SARS-CoV proteins antagonize the host innate immune response, including ORF3b, ORF6, nsp1, N, M, and PLpro (Fig. 7). For example, N protein inhibits TRIM25 ubiquitylation, thereby limiting activation of the RIG-I PRR that recognizes viral dsRNA with a 5′ di- or triphosphate (229). Downstream of PRR activation, IFN-β and other type I IFNs are transcriptionally induced by the phosphorylation and dimerization of IRF3 and -7, which then traffic to the nucleus to initiate transcription. PLpro, nsp1, ORF3b, and N all inhibit IRF3 phosphorylation, blocking its nuclear entry and type I IFN transcription (63,201,230-232). Nsp1 further inhibits IRF7 activation and reduces c-Jun expression and phosphorylation (201). Type I IFNs can also be turned on by NF-κB, but NF-κB-responsive promoter activation is inhibited by both the viral M and nsp1 proteins (233). Once type I IFNs are produced, they then signal through the JAK/STAT pathway to induce interferon-stimulated genes (ISGs) in an autocrine and paracrine fashion. Coronaviruses target this pathway as well; nsp1 induces degradation of IFN-β RNA during host shutoff, ORF6 inhibits STAT1 translocation to the nucleus, and nsp1 inhibits STAT1 phosphorylation, inhibiting downstream induction of ISGs (195,234).
Experiments with viral deletion mutants using reverse genetics have begun to parse out the overlapping contributions of each of these viral proteins. Consistent with its role as an essential virulence factor, deletion of nsp1 severely attenuates infection in in vivo mouse models with MHV and renders mice immune to a subsequent challenge with WT virus (235). Although viruses lacking ORF3b or ORF6 do not exhibit reduced viral replication in tissue culture or in mouse models of SARS-CoV, this may be due to functional redundancy with other IFN antagonists, which could contribute to the pathogenicity of SARS-CoV (178). In this regard, expression of SARS-CoV ORF6 (but not other SARS-CoV accessory proteins) during infection with an attenuated version of MHV leads to increased viral replication in cell culture and increased virulence in mice (236).
dsRNA produced during RNA virus replication can also trigger host translation shutoff through induction of the antiviral 2′-5′ oligoadenylate synthetase and RNase L. 2′-5′ oligoadenylate synthetase synthesizes RNAs with unique 2′-5′ linkages that activate RNase L, which broadly antagonizes translation by cleaving host and viral RNAs, including ribosomal RNAs, restricting viral replication. MERS NS4b and MHV ns2 are 2′,5′-phosphodiesterases that directly cleave the 2′-5′ RNAs that activate RNase L, thereby inhibiting cellular detection of viral replication intermediates. Deletion of the MHV ns2 protein or mutation of its catalytic residues results in increased IFN-γ in the liver of infected mice (237)(238)(239)(240). MHV nsp15 (EndoU) also targets the production of dsRNA by endonucleolytically degrading stretches of polyU RNA made during copying of the viral poly(A) tail, and mutation of nsp15 results in a 200-fold increase in IFN-β induction over WT (241). In addition to the above mechanisms of immune antagonism, coronaviruses appear to also reduce the immunogenicity of the dsRNA they produce by sequestering it in the DMVs (147,242).
Figure 7. The N protein inhibits recognition of the foreign viral RNA by inhibiting TRIM25 activation of RIG-I and also inhibiting IRF3 phosphorylation. PLpro, nsp1, and ORF3b also inhibit IRF3 phosphorylation, and ORF3b and N further inhibit IRF3 translocation to the nucleus. Nsp1 additionally targets IRF7 and c-Jun phosphorylation. M inhibits assembly of the Traf6 complex, thereby reducing NF-κB import into the nucleus. Together, these activities result in reduced type I IFN production (IFN-β). IFN-β signals in an autocrine and paracrine fashion to activate ISGs through JAK/STAT signaling. Nsp1 inhibits STAT1 phosphorylation, and ORF6 inhibits STAT1 translocation to the nucleus, further dampening ISG production.
Work in animal models using a mouse-adapted strain of SARS-CoV, which produces disease resembling that seen in humans, has further bolstered the connection between a dysregulated innate immune response and disease pathology (209). Genetic knockout of the IFN-α/β receptor or inflammatory monocyte-macrophage depletion during infection protected SARS-CoV-infected mice, demonstrating the role of a vigorous pro-inflammatory response in lethal SARS-CoV infection and identifying these pathways as potential therapeutic targets in patients infected with a highly pathogenic coronavirus.
Modulation of host pro-inflammatory response and programmed cell death pathways. A number of SARS-CoV proteins have been implicated in modulating pro-inflammatory immune responses, likely contributing to the cytokine storm detected in infected patients. Mitogen-activated protein kinase (MAPK) pathways are critical in relaying environmental stress to host cellular stress responses to elicit appropriate physiological responses, such as cellular proliferation, differentiation, development, inflammatory responses, and apoptosis (243). There are three major MAPK pathways in mammals: the extracellular signal-regulated kinase (ERK), p38 MAPK, and the stress-activated protein kinase/c-Jun N-terminal kinase (SAPK/JNK), which are all targeted during SARS-CoV infection (244). Phosphorylation of ERK and JNK has been observed in SARS-CoV-infected cells, and increased levels of phosphorylated p38 MAPK have been found in both cell culture studies and in the leukocytes of SARS patients and have been linked to abnormal interleukin-6 (IL-6) and interleukin-8 (IL-8) cytokine profiles in these patients (244,245). COVID-19 patients also exhibit a similar pattern of immune dysregulation, with elevated levels of IL-6 in particular, which correlates with severe disease pathology and mortality (221,246). Work in murine models has shown that elevated IL-6 levels play a major role in driving acute lung injury akin to that observed in both SARS and COVID-19 patients, and loss of IL-6 alleviates the severity of acute lung injury (247).
In cell culture experiments, overexpression of SARS-CoV structural (S, M, N, and E) and various accessory proteins (ORF3a, ORF3b, and ORF7a) has been associated with the activation of or interference with MAPK and NF-κB signaling pathways, correlating with dramatic expression changes at cytokine and chemokine promoters, such as CCL2, IL-8, and RANTES (summarized in Table 1) (248)(249)(250)(251)(252)(253)(254). However, interference with host NF-κB activity occurs with some cell-type variability and varying degrees of effects on cytokine induction (250)(251)(252)(255). Nonetheless, this likely contributes to the up-regulation of cytokines and chemokines associated with acute respiratory distress syndrome, asthma, and pulmonary fibrosis, which is consistent with pro-inflammatory profiles observed during SARS-CoV infection and accounts for patient deaths (256).
Another trigger of pro-inflammatory responses during infection with highly pathogenic coronaviruses is activation of host programmed cell death pathways. Necroptosis and pyroptosis are forms of highly inflammatory cell death that are observed during infection with cytopathic viruses and likely contribute to the molecular mechanisms underlying the severe lung pathology associated with SARS, MERS, and COVID-19 (257). Cell death through these mechanisms leads to a wave of local inflammation involving increased secretion of pro-inflammatory cytokines and chemokines, leading to further tissue damage. SARS-CoV-2-infected patients exhibit elevated levels of the cytokine IL-1β, which is associated with pyroptosis (220). In particular, expression of the SARS-CoV ORF3a protein induces caspase-independent necrotic cell death and also initiates an inflammatory cascade through activation of the NLRP3 inflammasome, contributing to pyroptosis (258). The SARS-CoV ORF3b protein has also been shown to induce necrosis (259). Notably, the ORF3b protein is truncated to 20 aa in SARS-CoV-2 and is likely nonfunctional, suggesting differences in the underlying mechanisms driving virally induced necrotic cell death in SARS and COVID-19 patients (188).
Whereas noninflammatory apoptosis often serves as a host antiviral response during infection (260), infection-induced activation or modulation of host apoptotic machinery may also induce death of particular cell types that enhance viral egress and pathogenesis (261). During human coronavirus infection, virally induced apoptosis can occur in a variety of cell types beyond those of the respiratory tract, including immune cells such as macrophages, monocytes, T lymphocytes, and dendritic cells (262)(263)(264)(265)(266). A molecular understanding of the pro-apoptotic roles for coronavirus proteins largely comes from studies of SARS-CoV or homologous MHV proteins that investigate the pro-apoptotic roles of their structural proteins and unique accessory proteins (summarized in Table 1). Expression of S, N, E, M, ORF3a, ORF3b, ORF7a, ORF8a, or ORF9b proteins in various cell lines has been shown to trigger apoptosis mediated through various pathways, including the PERK pathway through the unfolded protein response, cytochrome c release, and caspase-dependent apoptosis pathways (259,264)(267)(268)(269)(270)(271)(272)(273)(274)(275)(276). At present, it remains unclear whether SARS-CoV-induced cell death functions as an immune evasion tactic, an exit strategy to enhance viral spread, or an indirect consequence of viral replication.
Neutralizing antibodies and memory B-cell response-Protective immunity requires preexisting antibodies, memory B cells, and memory T-cell responses. B- and T-cell responses can be detected within 1 week following the onset of symptoms in both SARS and COVID-19 patients (277,278). Following infection with SARS-CoV in particular, neutralizing antibodies develop within 2-3 weeks, likely against the S protein (279,280). In contrast, COVID-19 patients may develop an antibody response earlier due to viral titers peaking earlier (281)(282)(283)(284). The primary target of neutralizing antibodies in SARS-CoV is the RBD of the S protein, a region of the protein that is significantly different in SARS-CoV-2 (285,286). As such, only a small number of previously identified monoclonal antibodies to SARS-CoV bind and neutralize SARS-CoV-2 (287,288). A number of strategies are being employed to develop therapeutic monoclonal antibodies against SARS-CoV-2, including mouse immunization and hybridoma isolation, as well as cloning of B-cell sequences from convalescent human patients, which has previously been successful in treating SARS patients (44)(289)(290)(291). Importantly, neutralizing antibody titers and the memory B-cell responses, while robust against SARS-CoV (and likely for SARS-CoV-2), are relatively short-lived in recovered patients. Neutralizing antibody titers consistently decline over time and cannot be detected in most SARS-recovered patients 6 years following the onset of symptoms, and memory B-cell responses cannot be detected as early as 3.5 years post-infection (292). These responses may also be short-lived for at least a subset of COVID-19 patients (44). In contrast, memory T-cell responses persist up to 6 years post-infection in a large subset of SARS-recovered patients (292).
Whereas T-cell responses are critical for controlling infection and memory T cells are present in higher numbers and often elicit faster responses post-infection than memory B cells, memory T cells alone likely cannot provide adequate long-term protective immunity (293). Importantly, vaccinated animal models have also shown increased immunopathology associated with detrimental T-cell responses; thus, further investigation is critical in understanding coronavirus-specific T-cell responses, particularly in the context of vaccine development (294).
Waning protective immunity among previously infected individuals opens up questions regarding susceptibility to reinfection. Future studies defining immune correlates of protection following SARS-CoV-2 infection are critical and will inform both vaccine strategies and disease management.

Conclusions
Coronavirus spillovers have provoked three epidemics in the last 20 years, and our ability to counteract future emergent viruses will be influenced by how deeply we understand the mechanistic details of coronavirus replication and the virus-host interaction. Despite immense progress, significant questions still exist. For example, much remains to be learned about the mechanisms by which the multicomponent replicase complex executes its sophisticated genome replication, transcription, and RNA-processing functions. The membrane reorganization necessary to form viral replication and transcription compartments is also a central facet of coronavirus biology, yet how the various stages of the viral replication cycle are coordinated and organized within these vesicles and the mechanism of dsRNA sequestration in the DMVs are largely unknown. Virus-host interactions that influence the innate and adaptive immune response are of obvious importance, as they presumably underlie aspects of coronavirus pathogenesis that can differ markedly between viral strains and are central to vaccine development. Many viral accessory proteins appear to antagonize the innate immune system, yet their relative contributions and roles during infection are not clear. Despite these significant unknowns, the current pace of SARS-CoV-2 research is truly remarkable and a testament to the power of cooperative and collaborative science. As these new discoveries are rapidly layered onto the existing foundational work in the coronavirus field, our understanding of this fascinating group of viruses will surely be refined and reshaped more rapidly than for any pathogen in human history.
Acknowledgments-We apologize to those who conducted the significant amount of research we were unable to cite herein and acknowledge that the current pace of the coronavirus field means that many new findings will have emerged by the time this manuscript is published. All figures were designed in collaboration with BioRender and are available as editable templates at BioRender.com.
Conflict of interest-The authors declare that they have no conflicts of interest with the contents of this article.