Site-specific Protease Activity of the Carboxyl-terminal Domain of Semliki Forest Virus Replicase Protein nsP2*

The virus-specific components (nsP1–nsP4) of Semliki Forest virus RNA polymerase are synthesized as a large polyprotein (P1234), which is cleaved by a virus-encoded protease. Based on mutagenesis studies, nsP2 has been implicated as the protease moiety of P1234. Here, we show that purified nsP2 (799 amino acids) and its C-terminal domain Pro39 (amino acids 459–799) specifically process P1234 and its cleavage intermediates. Analysis of cleavage products of in vitro synthesized P12, P23, and P34 revealed cleavages at sites 1/2, 2/3, and 3/4. The cleavage regions of P1/2, P2/3, and P3/4 were expressed as thioredoxin fusion proteins (Trx12, Trx23, and Trx34), containing ~20 amino acids on each side of the cleavage sites. After exposure of these purified fusion proteins to nsP2 or Pro39, the reaction products were analyzed by SDS-polyacrylamide gel electrophoresis, mass spectrometry, and amino-terminal sequencing. The expected amino termini of nsP2, nsP3, and nsP4 were detected. The cleavage at the 3/4 site was most efficient, whereas cleavage at the 1/2 site required 5000-fold more Pro39, and the 2/3 site was almost resistant to cleavage. The activity of Pro39 was inhibited by N-ethylmaleimide,

The genomes of many positive strand RNA viruses are expressed as polyproteins in order to achieve the expression of multiple proteins from a single message, unlike the mRNAs of their eukaryotic host cells, which mostly code for single proteins. Thus, proteolytic cleavages of the polyprotein precursors are essential events in the regulation of the replication and morphogenesis of these RNA viruses. In picornaviruses and flaviviruses, the entire RNA genome is translated as a single polyprotein, from which the structural and nonstructural proteins are processed by proteolysis. In picornavirus-infected cells, the processing is carried out by virus-encoded proteases within the polyprotein, whereas the processing of flavivirus polyprotein is assisted by host proteases (1,2). The large RNA genomes of coronaviruses (approximately 30 kilobases) and arteriviruses (12.7-15.7 kilobases), together classified as Nidovirales, use a set of subgenomic mRNAs in addition to the polyprotein strategy (3). Alphaviruses and rubella virus, members of the Togaviridae family, express two polyproteins. The nonstructural polyprotein is expressed directly from the RNA genome, whereas the structural polyprotein is synthesized from a subgenomic mRNA (4-6).
Semliki Forest virus (SFV) is a typical alphavirus with a lipoprotein envelope surrounding the nucleocapsid. The 5′ two-thirds of the 11.5-kilobase 42 S RNA genome codes for the nonstructural polyprotein (P1234) of 2432 aa, which is autocatalytically cleaved to finally yield the virus-specific components of the RNA polymerase complex, nsP1-nsP4 (4,5,7). The processing of the nonstructural polyproteins P1234 and P123 of Sindbis virus (SIN), another alphavirus, has been studied in detail (4,8,9). Using mostly in vitro translation and site-directed mutagenesis as tools, autocatalytic protease activity was detected in the polyprotein and its cleavage intermediates. The protease activity was localized to nsP2, and more precisely, to its carboxyl-terminal part (10,11). Cysteine 481 and histidine 558 were identified as essential residues for the autoprotease activity (12), supporting the view that the protease is a thiol proteinase belonging to the papain superfamily. The same conclusion was reached by sequence comparisons (13). The respective mutation of Cys-478 in SFV nsP2 also inactivates the autocatalytic processing of P1234, P123, and P23 (14).
The amino-terminal domain of SFV nsP2 (residues 1-470) has been shown to house several enzymatic activities including RNA triphosphatase (15), nucleoside triphosphatase (16), and RNA helicase (17). Interestingly, a significant amount of nsP2, synthesized during infection, is transported into the nucleus (18,19). The core of the nuclear localization signal was mapped to a short sequence P647RRR (20). The nsP2 mutant PRDR is not lethal for the virus in cell culture, but SFV carrying this mutation is apathogenic in mice (21). In addition, the carboxyl-terminal domain of SFV nsP2 has been implicated in the regulation of the synthesis of the subgenomic mRNA (22-24).
Taken together, these data suggest that alphavirus nsP2 consists of two structurally independent domains, each possessing a distinct set of biological activities. However, direct proof that purified nsP2 or its carboxyl-terminal part has protease activity has been lacking. To this end, recombinant nsP2 and a set of its amino-terminally truncated variants were expressed in Escherichia coli, purified by metal-chelate chromatography, and assayed for the presence of protease activity. Both full-length nsP2 and its soluble carboxyl-terminal fragment Pro39 (aa 458-799) catalyzed site-specific proteolysis of SFV P1234 in vitro. Furthermore, both nsP2 and Pro39 could specifically cleave purified recombinant fusion proteins containing short (~40 aa) SFV-specific peptides, which span protease cleavage sites. Using this newly devised in vitro assay, we show that Pro39 was inactivated by N-ethylmaleimide, in accordance with the catalytic mechanism of cysteine proteases.
Fig. 1 (legend fragment). C478 represents the catalytic residue of the active site; +, protease activity; −, no protease activity; nd, activity could not be determined.
nsP2 Expression Plasmids-Full-length nsP2 of SFV was produced and purified as described previously (15). To obtain a set of nsP2 mutants containing progressive amino-terminal truncations, nine separate PCR amplifications were carried out using Pfu Turbo DNA polymerase (Stratagene), SFV infectious cDNA as a template, and 3′-Mut (Table I) as the common 3′ primer. This oligonucleotide was designed to introduce a G → E mutation in the 2/3 cleavage site to prevent possible self-proteolysis of the expressed protein at the nsP2-His6 tag boundary. Oligonucleotides NΔ60, NΔ120, NΔ180, NΔ240, NΔ300, NΔ350, NΔ400, NΔ458, and NΔ470 (Table I) were used as 5′ primers. Similarly, a set of bidirectional truncations of nsP2 was also prepared, using NΔ300 as a common 5′ primer and CΔ40, CΔ80, or CΔ120 (Table I).
Fig. 3 (legend fragment). In vitro synthesized nsP1 and polyproteins containing nsP1 regularly show a truncated form, ΔP1, due to internal initiation of translation. B, polyproteins P12CA3 and P12CA34 as substrates. Immunoprecipitation analyses are shown in lanes 4-6 and 10-13. The proteins were synthesized using a cell-free in vitro system in the presence of [35S]methionine for 1 h at 30°C. The synthesis was stopped by adding cycloheximide. The enzyme preparations (100 ng) were added to the reaction mixture followed by incubation for 1 h at 30°C. The reaction products were analyzed by SDS-PAGE in 10% gels before and after immunoprecipitation, followed by visualization by autoradiography. Positions of the polyproteins and their cleavage products are indicated with arrows.
Site-directed Mutagenesis of the Cleavage Sites-SFV cDNA fragments covering the 1/2 and 3/4 cleavage sites (SFV genome regions 1444-1944 and 5306-6138, respectively) were subcloned into pBlueScript KS vector (Stratagene), and the fragment covering the 2/3 cleavage site (genome region 3791-5531) was subcloned into pUC18. The three resultant plasmids were used as templates for PCR-based site-specific mutagenesis, using Pfu Turbo DNA polymerase (Stratagene) and one of the following primer pairs: 1MutF and 1MutR, 2MutF and 2MutR, or 3MutF and 3MutR (Table I). The primers were designed to change the 1/2 processing site YHAGA↓GVVE to YHAEA↓GVVE, the 2/3 site HTAGC↓APSY to HTAEC↓APSY, and the 3/4 site GRAGA↓YIFS to GRAEV↓YIFS, respectively (target residues underlined). PCR products were treated with DpnI (Stratagene) and T4 polynucleotide kinase (New England Biolabs) and self-ligated using T4 DNA ligase (New England Biolabs). The plasmids obtained were verified by sequencing and designated pMut1^2, pMut2^3, and pMut3^4.
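The substitutions introduced at each processing site can be read off by aligning the wild-type and mutated sequences quoted above. The following is a minimal sketch, not part of the original procedure: the scissile bond is written as `|`, and the P2/P1/P1' position labels follow the common protease-site numbering convention, which is an assumption added here for illustration. The function name `substitutions` is hypothetical.

```python
# Wild-type and mutated cleavage-site sequences as quoted in the text.
# "|" marks the scissile bond (notation chosen for this sketch).
SITES = {
    "1/2": ("YHAGA|GVVE", "YHAEA|GVVE"),
    "2/3": ("HTAGC|APSY", "HTAEC|APSY"),
    "3/4": ("GRAGA|YIFS", "GRAEV|YIFS"),
}


def substitutions(wild_type, mutant):
    """List (position, wild-type residue, mutant residue) for each change.

    Positions on the amino-terminal side of the bond are labeled P1, P2, ...
    counting away from the bond; positions on the carboxyl-terminal side
    are labeled P1', P2', ... (assumed labeling convention).
    """
    wt_n, wt_c = wild_type.split("|")
    mu_n, mu_c = mutant.split("|")
    if len(wt_n) != len(mu_n) or len(wt_c) != len(mu_c):
        raise ValueError("wild-type and mutant sequences must align")
    changes = []
    for i, (a, b) in enumerate(zip(wt_n, mu_n)):
        if a != b:
            changes.append((f"P{len(wt_n) - i}", a, b))
    for i, (a, b) in enumerate(zip(wt_c, mu_c)):
        if a != b:
            changes.append((f"P{i + 1}'", a, b))
    return changes


if __name__ == "__main__":
    for site, (wt, mu) in SITES.items():
        print(site, substitutions(wt, mu))
```

Under these assumptions, each site carries a Gly to Glu change two residues before the bond, and the 3/4 site carries an additional Ala to Val change immediately before the bond.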
In Vitro Synthesis of the Nonstructural Polyprotein Substrates-Coupled transcription-translation of the constructs encoding SFV P12CA, P2CA3, P12CA3, P12CA34, and P34 was carried out in the T7 TNT rabbit reticulocyte lysate system (Promega) according to the manufacturer's instructions. Reaction mixtures (10 µl) supplemented with 10 µCi of [35S]methionine (>1000 Ci/mmol; Amersham Pharmacia Biotech) and 1 µg of plasmid DNA were incubated for 1 h at 30°C, and the protein synthesis was stopped by adding 1 mM cycloheximide. For the protease assay with 35S-labeled nonstructural polyprotein substrates, typically 0.1-0.5 µg of the isolated protease was added to the cycloheximide-arrested translation mixtures (see above), and the mixture was incubated for 1 h at 30°C. Reaction products were separated by SDS-PAGE. Gels were dried, and radioactive protein bands were visualized using a phosphorimager (BAS-1500, Fuji).
Protease Assay with Purified Thioredoxin Fusion Protein Substrates-Protease activity of nsP2 and Pro39 was assayed using thioredoxin (Trx) substrates (see above). Purified substrates and enzymes were mixed in a buffer containing 50 mM HEPES-NaOH, pH 7.2, 20-100 mM NaCl, and 1 mM dithiothreitol, and the mixtures were incubated for 1 h at 30°C. Reaction products were analyzed by SDS-PAGE and/or mass spectrometry after HPLC purification. In the protease inhibition assays, the enzyme was pre-incubated with the inhibitor for 10 min.
Reversed Phase Chromatography-Substrates and the reaction products were separated on a 0.1 × 15-cm Vydac C8 column (300 Å, 5 µm, LC-Packings) using a SMART system (Amersham Pharmacia Biotech, Uppsala, Sweden). Elution was performed using linear gradients of acetonitrile (0-60% in 100 min) in 0.1% trifluoroacetic acid. Chromatography was monitored for absorbance at 214 nm, and the peptide-containing fractions were collected automatically.
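For orientation, the acetonitrile concentration at any point of such a linear gradient follows from simple interpolation. The sketch below assumes an ideal linear gradient from 0 to 60% over 100 min with no delay volume; the function name `acetonitrile_percent` and its parameters are illustrative and do not describe the SMART system software.

```python
def acetonitrile_percent(t_min, start=0.0, end=60.0, duration_min=100.0):
    """Percent acetonitrile at time t_min of an ideal linear gradient."""
    if not 0 <= t_min <= duration_min:
        raise ValueError("time must lie within the gradient")
    return start + (end - start) * t_min / duration_min


if __name__ == "__main__":
    # A fraction eluting at 50 min corresponds to 30% acetonitrile
    # under these idealized assumptions.
    print(acetonitrile_percent(50))
```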
Mass Spectrometry and NH2-terminal Sequence Analysis-MALDI-TOF mass spectrometry was performed on a Biflex time-of-flight instrument (Bruker-Franzen Analytik, Bremen, Germany) equipped with a nitrogen laser operating at 337 nm. The reversed phase HPLC separated fractions were analyzed in the linear positive ion delayed extraction mode using saturated sinapic acid in a mixture of 0.1% trifluoroacetic acid and 50% acetonitrile (1:2) as a matrix. Samples were prepared by mixing 1 µl of reversed phase HPLC eluate with 1 µl of sinapic acid matrix on the target plate and dried under a gentle stream of warm air. All mass spectra were calibrated externally with either cytochrome c or myoglobin as standards. Electrospray ionization mass spectra were obtained using a Micromass Q-TOF quadrupole/time-of-flight hybrid mass spectrometer (Micromass, Manchester, United Kingdom). Pro39 was dissolved in a mixture of 0.1% trifluoroacetic acid and 50% acetonitrile (1:2) and directly injected into the electrospray ionization mass spectrometer with a syringe pump at a flow rate of 30 µl/h. The mass spectrometer was calibrated using sodium trifluoroacetate as described (27). Protein masses were calculated by deconvolution in MassLynx 3.4 (Micromass). NH2-terminal sequence analyses were performed by Edman degradation using a Procise 494A Sequencer (PerkinElmer Applied Biosystems Division).
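The principle behind calculating a protein mass from an electrospray spectrum can be illustrated with the textbook adjacent-peak method. This is a hedged sketch only: it is not the algorithm used by MassLynx, it assumes positive-ion mode with protons as the sole charge carriers, and the 39,000-Da mass in the example is a hypothetical value, not a measurement from this work. The function names are illustrative.

```python
PROTON = 1.007276  # proton mass in Da


def mz(mass, z):
    """m/z of a molecule of the given neutral mass carrying z protons."""
    return (mass + z * PROTON) / z


def mass_from_adjacent_peaks(mz_low, mz_high):
    """Neutral mass and charge from two adjacent charge-state peaks.

    mz_low is the peak with one more charge than mz_high.
    Returns (mass, charge of the mz_high peak).
    """
    if mz_low >= mz_high:
        raise ValueError("mz_low must be smaller than mz_high")
    z_high = round((mz_low - PROTON) / (mz_high - mz_low))
    return z_high * (mz_high - PROTON), z_high


if __name__ == "__main__":
    # Hypothetical protein of 39,000 Da observed at charge states 31+ and 30+.
    low, high = mz(39000.0, 31), mz(39000.0, 30)
    print(mass_from_adjacent_peaks(low, high))
```

In practice a deconvolution program combines many charge states rather than a single pair, which averages out peak-picking error.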
Both nsP2 and Pro39 Are Proteolytically Active in Vitro-To assay protease activities of the isolated proteins, several SFV nonstructural polyprotein substrates containing [35S]methionine were synthesized in a cell-free transcription-translation system. These were P12CA, P2CA3, P12CA3, P12CA34 (with a protease-inactivating mutation of C478A in the nsP2 moiety), as well as P34. Both nsP2 and its amino-terminally truncated fragments were active as proteases; however, Pro39 showed the highest specific activity (data not shown). From the polyprotein pairs, both preparations cleaved readily at sites 1/2 (Fig. 3A, lanes 2-5) and 3/4 (Fig. 3A, lanes 12-15), whereas cleavage at site 2/3 was much less efficient, but detectable (Fig. 3A, lanes 9 and 10). The two larger polyproteins were also cleaved at sites 1/2 and 3/4 and to some extent at site 2/3, as revealed by immunoprecipitation of the cleavage products (Fig. 3B, lanes 4-6 and 10-13). The cleavage at site 3/4 was more efficient than that at site 1/2, as seen after reduction of enzyme concentration or time of incubation (data not shown). As expected, the C478A mutation inactivated both nsP2 and Pro39 completely (data not shown). Overall, these data demonstrate site-specific protease activity in vitro of purified nsP2 and Pro39. The high solubility and specific activity of Pro39 suggest that this fragment represents a structurally compact protease domain of nsP2.
Protease Activity of nsP2 and Pro39 Assayed with Purified Substrates-In the following experiments, protease activity of nsP2 and Pro39 was assayed with a set of purified recombinant Trx fusion proteins as substrates (Fig. 4A). These fusion proteins (~18 kDa) contained approximately 40-aa sequences spanning the SFV P1234 polyprotein cleavage sites (Trx12, Trx23, and Trx34). As controls we used non-cleavable analogs with point mutations (glycine to glutamic acid) in the penultimate amino acid of the predicted cleavage site (Trx1^2, Trx2^3) as well as an additional mutation (Ala → Val) for the 3/4 cleavage site, yielding Trx3^4 (Fig. 4). Both nsP2 and Pro39 cleaved Trx34 with the highest efficiency. As expected, two proteolytic products were formed, one larger (L; ~14 kDa) and one smaller (S; ~4 kDa). The L-fragment (thioredoxin moiety plus 19 carboxyl-terminal residues from nsP3) could be detected in standard SDS-PAGE gels (Fig. 4B). The S-fragment migrated at the front in this gel system (data not shown). However, the Tricine SDS-PAGE method (28) resolved this fragment as a defined band (Fig. 4C). Pro39 also cleaved Trx12, as judged by the appearance of the L-fragment (Fig. 4D). Under the same conditions, the cleavage products obtained with nsP2 as protease and Trx12 as substrate were barely detectable (data not shown). No cleavage of Trx23 could be detected with either nsP2 or Pro39 when analyzed by SDS-PAGE (Fig. 4E). Neither nsP2 nor Pro39 could cleave Trx1^2, Trx2^3, or Trx3^4, which served as controls (Fig. 4, B, D, and E). The relative cleavage efficiency by Pro39 of sites 1/2 and 3/4 was further analyzed using serial dilutions of the enzyme (Fig. 5). From these results it could be estimated that complete cleavage of Trx12 requires ~5000-fold more enzyme than cleavage of Trx34.
HPLC Purification and MALDI-TOF Mass Spectrometry of the Proteolytic Products-In addition to the SDS-PAGE analysis, the proteolytic products of Trx12, Trx23, and Trx34 were also separated by reversed phase HPLC (see "Experimental Procedures" for details). Fractions containing S- and L-fragments, as well as non-cleaved substrates, were analyzed by MALDI-TOF mass spectrometry as shown in detail for Trx12 (Fig. 6). Amino-terminal sequences of the S-fragments were also determined. The results of these experiments confirmed that Pro39 cleaves Trx-fusion protein substrates exactly at the predicted positions of P1234, and were supported by previous NH2-terminal radiosequence analyses of in vivo labeled nsP2, nsP3, and nsP4 (29,30) (Table II). Importantly, this approach was sensitive enough to detect the cleavage of the Trx23 substrate. Thus, the 2/3 site can be hydrolyzed correctly in vitro by Pro39 but with very poor efficiency (Table II).
Characterization of the Pro39-catalyzed Reaction-Effects of several reaction parameters on the protease activity of Pro39 were studied systematically. The enzyme was active over a broad range of pH values (pH 6.8-9.5) and different ionic strengths (0-500 mM NaCl), and the optimal reaction temperature was 30°C (data not shown). Omission of reducing reagents from the reaction mixture had no detectable effect on Pro39 activity. The time course of the proteolysis under optimized conditions was also studied. In this experiment, Pro39 demonstrated a very high specific activity, hydrolyzing 50% of a 400-fold molar excess of Trx34 substrate in 5 min (Fig. 7).
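The time-course figure quoted above can be converted into an approximate average turnover. The following back-of-the-envelope sketch assumes that all enzyme molecules are active and that the rate is averaged over the whole 5-min interval; it is therefore not a measured kcat, and the function name is illustrative.

```python
def average_turnover_per_min(molar_excess, fraction_cleaved, minutes):
    """Average number of substrate molecules cleaved per enzyme per minute."""
    return molar_excess * fraction_cleaved / minutes


if __name__ == "__main__":
    # 50% of a 400-fold molar excess of Trx34 cleaved in 5 min.
    per_min = average_turnover_per_min(400, 0.5, 5)
    print(per_min, "cleavages per enzyme per minute")
    print(per_min / 60, "cleavages per enzyme per second")
```

Under these assumptions each Pro39 molecule cleaves on the order of 40 Trx34 molecules per minute, that is, somewhat less than one per second.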
The effect of different protease inhibitors on the enzymatic activity of Pro39 was also tested. The enzyme was completely resistant to the inhibitors of serine proteases (PMSF), metalloproteases (EDTA), aspartic proteases (pepstatin), and some cysteine protease inhibitors (leupeptin and E-64) (Fig. 8A). Cleavage of both Trx12 and Trx34 was completely inhibited by 2.5 mM N-ethylmaleimide (NEM), a typical cysteine protease inhibitor. Surprisingly, the protease was also sensitive to some divalent cations in the reaction mixture. Addition of 2 mM Zn2+ or Cu2+ resulted in total inactivation of the protease activity, and Co2+ and Ni2+ caused partial inhibition (Fig. 8B). On the other hand, the same concentrations of Ca2+, Mg2+, and Mn2+ had no effect on Pro39 activity.
DISCUSSION
Previous work on Semliki Forest virus showed that early in infection the synthesis of the negative strand RNA was strictly dependent on protein synthesis and ceased about 15 min after addition of cycloheximide (31), whereas late in infection the synthesis of positive strand RNAs could continue for several hours in the absence of protein synthesis. The solution to this dilemma came from findings with Sindbis virus, another alphavirus, where the processing intermediate P123 of the nonstructural polyprotein together with nsP4 was shown to be responsible for the synthesis of the negative strand RNA (32-34). Cleavage of P123 is essential for the synthesis of the positive RNA strands. Thus, the regulated processing of the nonstructural polyprotein controls the early events of virus infection. An overactive processing mutant of Sindbis virus nsP2 (N614D) cannot replicate, as the P123 intermediate is too short-lived to enable the necessary synthesis of the complementary RNA (35).
Our knowledge of the processing of alphavirus nonstructural polyprotein(s) is based mostly on experiments of in vitro translation of Sindbis virus RNA. Ingenious constructions in which the cleavage sites were mutated alone and in different combinations, together with constructs coding for enzymatically inactive polyprotein as substrates, have been used to analyze this complex process (4,8,9). These experiments showed that the polyprotein P1234 itself and all its cleavage intermediates containing nsP2 are active proteases. The cleavability of the different sites varied and was dependent on the order of removal of different nsPs from the polyprotein substrate. Particularly interesting was the finding that the cleavage of site 2/3 in P1234 or P123 was only possible after the cleavage of nsP1 (36). Evidently, conformational changes in P1234 and its cleavage products affect the interactions between the cleavage sites and the protease domain of nsP2 in a complex manner. To understand the processes better, we have characterized purified SFV nsP2 and its carboxyl-terminal fragment Pro39 as proteolytic enzymes. As substrates we used in vitro synthesized polyproteins P1234, P123, P23, and P34, as well as recombinant thioredoxin fusion proteins, which contain short SFV-specific fragments spanning the polyprotein processing sites 1/2, 2/3, and 3/4.
We show for the first time that purified nsP2 has proteolytic activity, which readily cleaves the 3/4 site of P1234 and P34. A deletion series of nsP2 resulted in a soluble, active carboxyl-terminal fragment consisting of amino acid residues 459-799, which was designated Pro39. It was purified to near homogeneity by metal-affinity chromatography. The specificities of Pro39 and nsP2 were identical, indicating that the amino-terminal half of nsP2 does not affect the fidelity of the protease. According to sequence alignments with the thiol protease superfamily, Pro39 contains a conserved protease domain (459-600), but also almost 200 extra carboxyl-terminal amino acids (9,13). Our attempts to delete 40-120 amino acids from the carboxyl terminus of nsP2 resulted in insoluble or inactive proteins (Fig. 1). Experiments with temperature-sensitive mutants of SIN and SFV nsP2 have shown that amino acid replacements N700K in SIN ts133, K736S in SIN ts24, and M781T in SFV ts4 result in inhibition of protease activity at 39°C, suggesting that the extreme carboxyl terminus of nsP2 participates somehow in the protease function (24).
Establishment of a biochemical assay system consisting of isolated Pro39 and thioredoxin-attached cleavage regions of the SFV nonstructural polyprotein allowed characterization of the viral protease under defined experimental conditions. Pro39 was inactivated by N-ethylmaleimide but not by pepstatin, EDTA, or PMSF. These properties are in accordance with its classification as a thiol proteinase of the papain superfamily. However, Pro39 is not inhibited by E-64, which is a typical inhibitor of cysteine proteases (37). Sensitivity to NEM and resistance to E-64 have been previously reported for poliovirus 3C thiol proteinase (38,39). Another interesting feature of Pro39 is the inhibition by zinc ions (Fig. 8).
When Pro39 (or nsP2) was added to the reaction mixture after in vitro translation of P12 or P123, an almost quantitative release of nsP1 was observed. Similarly, when P34 or P1234 was used as substrate, quantitative release of nsP4 was seen, whereas only a small amount of nsP3 was released from P23, P123, or P1234 (Fig. 3). These results suggested that sites 1/2 and 3/4 were exposed to the added protease, whereas site 2/3 was not. To study this phenomenon under controlled conditions, in which the large protein domains would not interfere sterically with the proteolysis, we constructed fusion proteins with different numbers of amino acid residues around the cleavage sites.
Constructs with fewer than 10 amino acids on both sides of the cleavage site were not digested by Pro39 or nsP2 (data not shown). We ended up using thioredoxin fusion proteins with about 40 residues of each cleavage region (Trx12, Trx23, and Trx34). Isolation of the cleavage products and their mass spectrometric analysis, as well as amino-terminal sequencing, showed that Pro39 cleavage products were derived exactly from the predicted cleavage sites, determined previously by radiosequence analysis (29,30) (Table II). As controls we used thioredoxin fusion proteins with mutations close to the cleavage site, which were not digested by Pro39 or nsP2 (Fig. 4). Thus, we conclude that both proteases specifically recognize the three cleavage sites of the SFV nonstructural polyprotein. However, there were large differences in the sensitivity of the different cleavage sites, the 3/4 junction being most sensitive. Roughly 5000-fold more Pro39 was needed for complete cleavage of site 1/2 (Fig. 5). Under the same conditions, only a small amount of Trx23 was cleaved.
The different sensitivities of the three cleavage sites may well reflect the different specificities of the protease associated with the polyproteins, in which the cleavage at site 2/3 of P123 or P1234 does not take place unless preceded by cleavage of nsP1. The fact that Pro39 can catalyze cleavage at the 1/2 site of in vitro translated SFV P12, P123, and P1234, which normally undergoes cleavage in cis (14), is better understood when one realizes that the estimated molar enzyme-to-substrate ratio represents an excess of 50 to 1, which is difficult to imagine occurring during virus infection. Thus, we cannot exclude the possibility that cleavages at sites 1/2 and 2/3 require cofactor(s), which might be derived from the other nsPs. Such a situation has been characterized thoroughly for the NS3 protease of hepatitis C virus. The site-specific proteolytic activity of NS3 protease was greatly increased by a short amino acid sequence of the NS4 protein located adjacently in the polyprotein (40,41,43,44).
The processing intermediate P123, together with nsP4, enables the synthesis of the complementary RNA for a short time period, whereafter P123 is autocatalytically cleaved to yield the components of the stable RNA polymerase, among them nsP2. The released nsP2 exercises its role in two different forms. As a part of the RNA polymerase complex (45), the amino-terminal domain provides RNA triphosphatase and RNA helicase activities (15,17). As "soluble nsP2," the carboxyl-terminal domain acts as a regulator of 26 S RNA synthesis (24) and as a trans-acting protease, which catalyzes the rapid cleavage of P1234 and P123, thus preventing negative strand RNA synthesis late in infection.