Alternate Translation Occurs within the Core Coding Region of the Hepatitis C Viral Genome*

The majority of hepatitis C virus (HCV) isolates contain an open reading frame (ORF) overlapping with the core coding sequences in the +1 frame, which was assumed to be untranslated. We present evidence supporting the expression of this ORF (designated core+1 ORF) via novel translation mechanisms. First, fusion of the luciferase gene with the HCV-1 core+1 ORF followed by in vitro translation resulted in the synthesis of a chimeric protein (core+1-luciferase) that exhibited ~54% luciferase activity relative to the positive control (core-luciferase). Second, antisera raised against two different synthetic core+1 peptides recognized the previously identified p16 (but not p21) core protein band expressed from HCV-1, indicating the presence of epitopes from the core+1 ORF within the p16 protein. Third, HCV-positive sera specifically recognized lysates of Escherichia coli cells expressing recombinant core+1 protein, suggesting the presence of anti-core+1 antibodies in HCV-infected patients. Finally, luciferase tagging experiments designed to assess -1 frameshifting, combined with site-directed mutagenesis experiments, supported the presence of +1/-1 ribosomal frameshift translation mechanisms within the core coding region. In conclusion, our data provide evidence for novel translation mechanisms within the core coding region of HCV.

Hepatitis C virus (HCV) is the major cause of non-A, non-B acute and chronic hepatitis, which frequently leads to liver cirrhosis and hepatocellular carcinoma (1-3). HCV is a member of the Flaviviridae family, with a positive, single-stranded RNA genome of ~10 kb. The genome encodes a single polyprotein that is proteolytically cleaved to produce three structural (core, E1, and E2) and at least six nonstructural (NS2, NS3, NS4A, NS4B, NS5A, and NS5B) proteins (4,5). The core protein is located at the N terminus of the polyprotein and is predicted to have a length of 191 amino acids and a molecular mass of 23 kDa (p23) (6-9). Additional processing of p23 produces the mature core protein (p21), consisting of 173-182 amino acids (10). A shorter form of the core protein (p16) with an apparent molecular mass of 16 kDa has also been reported (11,12). This form was first detected during in vitro expression studies of the prototype HCV-1 isolate and has been largely attributed to a specific Arg-to-Lys mutation in codon 9 of the core coding sequences (11).
Besides its apparent role in viral assembly (5,13), the core protein has multiple independent activities and thus plays a pivotal role in viral pathogenesis (14,15). According to recent reports, the core protein interacts with an increasing number of cellular proteins and modulates the expression of several cellular or viral promoters. Thus, the core protein can either activate or inhibit programmed cell death (16,17), modulate signal transduction pathways of the host cells (18,19), suppress the host immune response (20), and affect lipid metabolism (21,22); it also has transforming potential (23).
Interestingly, computer-assisted analysis of the HCV genome has revealed the presence of an additional out-of-frame open reading frame (ORF) overlapping the core gene in the +1 frame (24,25). This novel ORF is open for 124-160 codons in most of the HCV strains (25). Thus, a putative polypeptide of ~14-17 kDa could be potentially synthesized by an alternate translation mechanism. Comparison of the complete genome sequences from different variants of HCV has shown a strong conservation within the core coding region at both the amino acid and nucleotide levels (24,25). More importantly, it has been reported that synonymous substitutions at the third position are highly suppressed in the core coding region (24). On the other hand, this region lacks an obvious translation start codon, which may help to explain why this sequence conservation was attributed to structural constraints of the viral genome rather than the presence of a functional gene.
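The ~14-17 kDa estimate follows directly from the ORF length. The short sketch below reproduces that arithmetic, assuming an average residue mass of about 110 Da; this is a common rule of thumb rather than a value given in this paper, and the function name is purely illustrative.

```python
# Rough protein mass estimate from ORF length.
# ASSUMPTION: ~110 Da per amino acid residue (rule of thumb, not from the paper).
AVG_RESIDUE_MASS_DA = 110.0


def approx_mass_kda(n_codons: int) -> float:
    """Approximate mass (kDa) of a polypeptide encoded by n_codons sense codons."""
    return n_codons * AVG_RESIDUE_MASS_DA / 1000.0


# ORF lengths reported for most HCV strains (124-160 codons).
for codons in (124, 160):
    print(f"{codons} codons -> ~{approx_mass_kda(codons):.1f} kDa")
```

Under this assumption the two ORF lengths give roughly 13.6 and 17.6 kDa, in line with the range quoted in the text.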
In the course of experiments involving the construction of chimeric GST-core hybrid proteins, we found that a chimeric GST protein containing mostly HCV sequences encoded by the +1 ORF was reactive to HCV-positive human sera. This construct was the result of a random PCR-induced single nucleotide deletion mutation near the GST-core junction. This prompted us to examine the possibility that this ORF (designated core+1 ORF) represents a functional ORF. To this end, we undertook three different approaches: (a) protein tagging experiments using the luciferase protein as a tag, (b) directly monitoring the expression of the core+1 ORF in vitro with specific anti-core+1 antibodies combined with site-directed mutagenesis experiments, and (c) screening of sera from HCV-infected patients for the presence of circulating anti-core+1 antibodies. From these studies, we provide evidence supporting the expression of the core+1 ORF, at least for some HCV isolates, via novel translation mechanism(s).
* This work was supported by grants from the National Secretariat of Research and Technology. A preliminary report of this study describing the luciferase tagging experiments and the screening of human sera was presented at the Seventh International Meeting on Hepatitis C and Related Viruses, December 3-7, 2000, Gold Coast, Queensland, Australia. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18

EXPERIMENTAL PROCEDURES
Construction of Plasmids-All plasmids described in this study were constructed by PCR using various primer pairs and the following conditions: 35 cycles at 94°C for 60 s, 65°C for 30 s, and 72°C for 2 min, with a final extension step at 72°C for 10 min. For all DNA plasmids described, the DNA sequences were confirmed twice by sequencing (Amersham Biosciences sequencing kit and Applied Biosystems sequencer).
Plasmid pHPI-668 is based on the pGEX-3X expression vector (Amersham Biosciences) and was made to express a chimeric GST-core+1 ORF fusion protein. The core coding region (nt 390-920) was obtained by PCR using, as template, p36-27, which contains the prototype HCV-1 cDNA sequence (nt 268-1052) cloned into the pBluescript KS vector (kindly provided by M. Beach). The oligonucleotides 5′-CCGGAATTCCGTAACACCAACCGTCGCCCA-3′ and 5′-CTCGAATTCCACTAGGTAGGCCGAAG-3′ (underlined sequences represent EcoRI sites) were used as sense and antisense primers, respectively. The PCR fragment was digested with EcoRI and cloned into the pGEX-3X expression vector. Plasmids pHPI-756 and pHPI-996 contain the core coding sequences (nt 341-920) from prototype HCV-1 and HCV-1a (strain H) (kindly provided by G. Inchauspe), respectively, cloned into the pGEM-3zf(+) vector (Promega) under the control of the SP6 promoter. For the construction of pHPI-756, a double-step PCR was performed using plasmid p36-27 as template. The reason for the double-step PCR was to ensure the presence of the 10 consecutive adenine residues at nt 363-372 of the prototype HCV-1 core region, inasmuch as DNA sequencing analysis of this region gave inconclusive results. For the first PCR, we used 5′-GTGCTTGCGAATTCCCCGGGA-3′ as the sense primer and 5′-ACGTTTGTTTTTTTTTTGAG-3′ as the antisense primer. For this PCR only, different conditions were used: 35 cycles at 94°C for 60 s, 50°C for 30 s, and 72°C for 120 s, with a single final extension step at 72°C for 10 min. The product of the first PCR was used in the second PCR as the sense primer along with 5′-CTCGAATTCCACTAGGTAGGCCGAAG-3′ as the antisense primer. Plasmid p36-27 was used again as template. The conditions for the second PCR were exactly the same, except that the annealing step was at 62°C for 30 s. The final PCR product was digested with EcoRI and cloned into the EcoRI cloning site of the pGEM-3zf(+) vector under the control of the SP6 promoter.
Plasmid pHPI-996 was also obtained by PCR using pRc/CMV/HCV-H as template and primers 5′-GTGCTTGCGAATTCCCCGGGA-3′ (sense) and 5′-CTCGAATTCCACTAGGTAGGCCGAAG-3′ (antisense). The PCR product was subsequently digested with EcoRI and cloned into the EcoRI cloning site of the pGEM-3zf(+) vector under the control of the SP6 promoter. Plasmids pHPI-725, pHPI-736, and pHPI-737 contain the luciferase gene fused at the 0, +1, or -1 frame, respectively, with a mutated HCV-1 core coding region (9 A residues). The HCV cDNA sequences encoding the IRES and part of the mutated core coding sequences (nt 9-630) were obtained by PCR using plasmid pHPI-888 as template. The sense primer 5′-CGCCGGATCCTGATGGGGGCGACA-3′ was used for all three plasmids. The antisense primers were 5′-AGACAGGATCCAATCCCGCC-3′ (oligonucleotide A) for pHPI-736, 5′-CACGGGGAGACAGGATCCATCCCGCCCACC-3′ (oligonucleotide B) for pHPI-725, and 5′-CACGGGGAGACAGGATCCACCCGCCCACCC-3′ (oligonucleotide C) for pHPI-737 (underlined sequences represent BamHI sites). The PCR products were digested with BamHI and inserted into the BamHI cloning site of the pGEM-luc vector (Promega). Plasmid pHPI-888 is based on the pGEM-3zf(+) vector and contains cDNA sequences (nt 9-1054) from the prototype HCV-1 isolate. Nucleotide sequence analysis of pHPI-888 revealed the presence of a single nucleotide (A363) deletion in the core coding region.
Plasmids pHPI-766, pHPI-767, and pHPI-768 contain the wild-type core coding region (nt 9-630) from the prototype HCV-1 isolate fused to the luciferase gene. As before, to ensure the presence of the wild-type nucleotide sequences within the nt 363-372 region, we used a double-step PCR cloning approach. For the first PCR, plasmid pHPI-888 was used as template, and oligonucleotides 5′-AGTGTTGGGTCGCGAAAGGCC-3′ (the NruI site located at nt 271 of the 5′-untranslated region is underlined) and 5′-ACGTTTGTTTTTTTTTTGAG-3′ were used as sense and antisense primers, respectively. Subsequently, the PCR product was used as the sense primer, and oligonucleotide 5′-CCAAGGGTACCCGGGCTGAG-3′ (the KpnI site located at nt 585 in the core coding region is underlined) was used as the antisense primer for the second PCR. The template was again plasmid pHPI-888. The PCR product of the second PCR was digested with NruI and KpnI and used to replace the corresponding sequences from plasmid pHPI-736, yielding pHPI-766. Plasmids pHPI-767 and pHPI-768 were made by replacing the ScaI-KpnI fragment (ScaI is a vector site located 5′ to the HCV sequences) of the pHPI-725 or pHPI-737 plasmid with the corresponding sequences from pHPI-766.
Plasmids pHPI-748, pHPI-749, and pHPI-750 contain the entire IRES and part of the core coding sequences (nt 9-630) from the HCV-1a (H) strain fused to the luciferase gene in all three frames. Cloning was performed by PCR using oligonucleotide 5′-CGCCGGATCCTGATGGGGGCGACA-3′ as the sense primer and oligonucleotide A (for pHPI-748), oligonucleotide B (for pHPI-749), and oligonucleotide C (for pHPI-750) as the antisense primers. The PCR products were digested with BamHI and inserted into the BamHI cloning site of the pGEM-luc vector. Plasmid pHPI-1309 contains the core coding sequences (nt 385-920) from the prototype HCV-1 isolate with a start codon in the +1 frame. The core coding region was obtained by PCR using pHPI-755 as template and primers 5′-CCGGAATTCGTAATGCCAACCGTCGCCCACAGGACGTCAAGTTCC-3′ (sense) and 5′-CTCGAATTCCACTTAGTAGGCCGAAGC-3′ (antisense) (underlined sequences represent the EcoRI sites, and the AUG start codon and the TTA stop codon are in boldface). The PCR product was digested with EcoRI and cloned into the EcoRI site of pGEM-3zf(+) under the control of the SP6 promoter.
Site-directed mutagenesis was performed using the QuikChange™ site-directed mutagenesis kit (Stratagene) and plasmid pHPI-755 as template. The pHPI-755 plasmid contains the HCV-1 core coding sequences (nt 341-920) cloned into the EcoRI cloning site of the pGEM-3zf(+) vector under the control of the T7 promoter. Plasmid pHPI-774 is a product of site-directed mutagenesis at nt 342 and 343 of the prototype HCV-1 core coding region (mutation of A342→T and T343→A) using primers 5′-CGTGCACCTAGAGCACGGATCCTAAACCTC-3′ (sense) and 5′-GAGGTTTAGGATCCGTGCTCTAGGTGCACG-3′ (antisense). This double substitution converts the start codon of the core coding region (ATG) into a stop codon (TAG). The primers also insert a BamHI enzyme site, mutating A351→G, allowing us to check the mutation by digestion.
Plasmid pHPI-775 is a product of site-directed mutagenesis at nt 357 of the prototype HCV-1 core coding region (mutation of A357→T) using primers 5′-ATGAGCACGGATCCTTAACCTCAA-3′ (sense) and 5′-TTGAGGTTAAGGATCCGTGCTCAT-3′ (antisense). This substitution converts Lys (AAA) into a stop codon (TAA). The primers also insert a BamHI enzyme site, mutating A351→G, allowing us to check the mutation by digestion. Plasmid pHPI-777 is a product of site-directed mutagenesis at nt 453 of the prototype HCV-1 core coding region (mutation of C453→A) using primers 5′-GTTTACTTGTTGACGCGCAGGGG-3′ (sense) and 5′-CCCCTGCGCGTCAACAAGTAAAC-3′ (antisense). This substitution converts Cys (TGC) of the core+1 ORF into a stop codon (TGA). The primers also insert a HincII enzyme site, mutating C453→A, allowing us to check the mutation by digestion.
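The design of the three mutagenesis primer pairs can be sanity-checked computationally. The sketch below is an illustration only, not part of the original protocol: it verifies that each antisense primer is the exact reverse complement of its sense primer, that the diagnostic restriction site is present (GGATCC for BamHI; GTTGAC, one form of the HincII site GTYRAC), and that each described base change produces a stop codon. The helper names and the dictionary layout are assumptions introduced for this example; primer sequences are written without the line-break hyphens.

```python
# Sanity checks for the mutagenesis primers (illustrative helpers, not from the paper).
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def mutate(codon: str, changes: dict) -> str:
    """Apply point substitutions {position_in_codon: new_base} to a codon."""
    bases = list(codon)
    for pos, base in changes.items():
        bases[pos] = base
    return "".join(bases)


# plasmid -> (sense primer, antisense primer, diagnostic restriction site)
PRIMERS = {
    "pHPI-774": ("CGTGCACCTAGAGCACGGATCCTAAACCTC",
                 "GAGGTTTAGGATCCGTGCTCTAGGTGCACG", "GGATCC"),  # BamHI
    "pHPI-775": ("ATGAGCACGGATCCTTAACCTCAA",
                 "TTGAGGTTAAGGATCCGTGCTCAT", "GGATCC"),        # BamHI
    "pHPI-777": ("GTTTACTTGTTGACGCGCAGGGG",
                 "CCCCTGCGCGTCAACAAGTAAAC", "GTTGAC"),         # HincII (GTYRAC)
}

# plasmid -> (wild-type codon, substitutions described in the text)
CODON_CHANGES = {
    "pHPI-774": ("ATG", {0: "T", 1: "A"}),  # A342->T, T343->A
    "pHPI-775": ("AAA", {0: "T"}),          # A357->T
    "pHPI-777": ("TGC", {2: "A"}),          # C453->A (core+1 frame)
}

for name, (sense, antisense, site) in PRIMERS.items():
    wild_type, changes = CODON_CHANGES[name]
    mutant = mutate(wild_type, changes)
    print(name,
          "complementary:", reverse_complement(sense) == antisense,
          "site present:", site in sense,
          f"{wild_type}->{mutant}",
          "stop:", mutant in STOP_CODONS)
```

All three pairs pass these checks, which also supports reading the hyphens inside the printed primer sequences as line-break artifacts rather than gaps.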
Bacterial Expression-For bacterial expression of the recombinant core+1 fusion proteins, Escherichia coli XL1-Blue cells were transformed with plasmid pGEX-3X or pHPI-668 and grown at 37°C in LB medium containing 50 µg/ml ampicillin to an absorbance of 0.6 at 600 nm. Isopropyl-β-D-thiogalactopyranoside (Sigma) was added to a final concentration of 0.7 mM, and the bacterial cells were grown for an additional 3 h at 37°C and then pelleted by centrifugation and stored at -20°C.
Antibodies-For the production of polyclonal antibodies against the core+1 ORF, peptide R1 (consisting of amino acid sequence CCRAGALDWVCARRERLPSGRNLEV) was chemically synthesized using the branched method, and peptide R2 (consisting of amino acid sequence AGGRDGSCLPVALGLA) was chemically synthesized and conjugated with keyhole limpet hemocyanin. Both peptides were used to immunize rabbits. Specifically, 100 µg of each peptide were mixed separately with 750 µl of complete Freund's adjuvant (Sigma) and injected into New Zealand White rabbits. The rabbits were boosted three times with the same antigens mixed with incomplete Freund's adjuvant (Sigma) at an interval of ~2 weeks each. The antisera were collected 2 weeks after the last boost and used in Western blot analysis and enzyme-linked immunosorbent assays. The monoclonal antibody against the core protein was obtained from Chemicon International, Inc. and Biogenesis.
Western Blot Analysis-Proteins from E. coli lysates transformed with pHPI-668 and the corresponding empty vector were subjected to 10% SDS-PAGE and transferred onto nitrocellulose membrane (Schleicher & Schüll). The membrane was incubated with blocking buffer (5% nonfat dry milk and 0.05% Tween 20 in phosphate-buffered saline) for 2 h at room temperature. Subsequently, the membrane was incubated overnight at 4°C either with human sera diluted 1:100 or with anti-GST antiserum in 1% nonfat dry milk and Tween 20 in phosphate-buffered saline. Horseradish peroxidase-conjugated anti-human or anti-rabbit immunoglobulin, respectively, was then used as the secondary antibody. Incubation with the secondary antibody was carried out at room temperature for 2 h. Recombinant proteins were detected using 4-chloro-1-naphthol solution as substrate.
In Vitro Transcription and Translation-For all plasmids lacking an IRES, the TNT reticulocyte lysate coupled in vitro transcription/translation reaction kit (Promega) was used in a standard 50-µl reaction according to the manufacturer's protocol. For all the luciferase-expressing plasmids containing the HCV IRES element, Flexi rabbit reticulocyte lysates (Promega) supplemented with KCl at 120 mM and Mg(OAc)2 at 0.5 mM were used. 1 µg of each DNA was linearized with SalI and transcribed in vitro with SP6 RNA polymerase (Promega) according to the manufacturer's instructions. In vitro translation experiments were carried out on uncapped RNAs in a total volume of 25 µl using [35S]Met (Amersham Biosciences). 5 µl of the translation products were analyzed by 12% SDS-PAGE, transferred onto nitrocellulose membranes, and detected by autoradiography. 5 µl of the same translation products were assayed for chemiluminescence using a Turner TD-20/20 luminometer and a Promega luciferase assay kit according to the manufacturer's protocol. Each in vitro transcription/translation reaction was performed in triplicate.
RNA Sequencing-10 µg of plasmid pHPI-CS were linearized with BamHI and transcribed in vitro with SP6 RNA polymerase according to the manufacturer's instructions. The RNA was dephosphorylated using shrimp alkaline phosphatase (Roche Molecular Biochemicals) for 2 h at 37°C. Subsequently, the RNA was end-labeled using [γ-32P]ATP and T4 polynucleotide kinase (MBI Fermentas). After elution from a 5% denaturing polyacrylamide gel, the γ-labeled RNA was incubated with RNases A, T2, CL3, and T1 for 4 min at 37°C in buffer containing 10 mM HEPES (pH 7.4), 1 mM EDTA, and 1 µg of tRNA and subsequently loaded onto a 10% denaturing polyacrylamide gel.
Immunoprecipitation Analysis-20 µl of the translation products were mixed with 250 µl of triple detergent buffer consisting of 50 mM Tris (pH 8), 150 mM NaCl, 0.1% SDS, 100 µg/ml phenylmethylsulfonyl fluoride, 1 µg/ml aprotinin, 1% Nonidet P-40, and 0.5% sodium deoxycholate. The reactions were incubated overnight at 4°C either with 5 µl of monoclonal antibody or with 10 µl of polyclonal antibodies. 50 µl of protein A-Sepharose (Sigma) were added, and the reactions were incubated for 2 h at 4°C. After microcentrifugation, the Sepharose beads were washed three times with buffer consisting of 50 mM Tris (pH 7.5), 150 mM NaCl, 0.1% Nonidet P-40, 1 mM EDTA, 0.25% gelatin, and 0.02% sodium azide. The immunoprecipitates were subsequently subjected to 12% SDS-PAGE, transferred onto nitrocellulose membranes, and detected by autoradiography.

RESULTS
Evidence for Alternate Translation within the HCV Core Coding Region-As a first attempt to investigate the possibility of alternate translation within the HCV core coding region, the luciferase gene was fused with the first half of the core coding sequences (630 nt) in the 0, +1, and -1 frames relative to the AUG initiation codon of the polyprotein, and the expression of these constructs was analyzed in a cell-free system using rabbit reticulocyte lysates. To facilitate the construction of these plasmids, we designed three different sets of primers for PCR cloning of the HCV cDNA sequences into a luciferase gene-expressing vector (pGEM-luc). Each set of primers uses the same 5′-oligonucleotide, but different oligonucleotides complementary to the 3′-core coding sequences (Fig. 1A). Oligonucleotide A was designed to place the luciferase gene in the 0 frame relative to the AUG initiation codon. Oligonucleotide B contains a single nucleotide insertion mutation (T627) that places the luciferase gene in-frame with the overlapping ORF in the +1 frame relative to the core coding sequences (core+1 ORF). Oligonucleotide C contains a deletion of the adenine residue at nt 626, which places the luciferase gene in-frame with the -1 frame relative to the AUG initiation codon. Additionally, all three oligonucleotides contain a single base substitution at nt 630 (C to A) to create a BamHI cloning site. Three groups of plasmids were constructed using the above primers.
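The frame logic behind the three antisense oligonucleotides reduces to arithmetic modulo 3. The sketch below illustrates that reasoning under stated assumptions: the upstream length of 288 nt is a hypothetical in-frame value chosen only because it is divisible by 3, and the function name and the frame labels returned by it are conventions adopted for this example, not identifiers from the study.

```python
# Reading frame of a reporter relative to an upstream AUG (illustrative sketch).
def reporter_frame(upstream_len: int) -> str:
    """Frame of the reporter's first codon relative to the AUG.

    upstream_len: number of nucleotides from the A of the AUG up to, but not
    including, the first nucleotide of the reporter coding sequence.
    """
    return {0: "0", 1: "+1", 2: "-1"}[upstream_len % 3]


# HYPOTHETICAL in-frame fusion length (any multiple of 3 behaves the same way).
IN_FRAME_LEN = 288

# No indel near the junction: reporter stays in the 0 frame (as with oligonucleotide A).
print("no change     ->", reporter_frame(IN_FRAME_LEN))
# One inserted nucleotide: reporter moves to the +1 frame (as with oligonucleotide B).
print("1-nt insertion ->", reporter_frame(IN_FRAME_LEN + 1))
# One deleted nucleotide: reporter moves to the -1 frame (as with oligonucleotide C).
print("1-nt deletion  ->", reporter_frame(IN_FRAME_LEN - 1))
```

In such a design, luciferase fused in the +1 or -1 frame can only be produced if ribosomes leave the frame set by the AUG somewhere upstream of the junction, which is what makes the reporter a readout for alternate translation.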
The first group contains the core coding sequences from the prototype HCV-1 isolate, and it was designed to test for a possible expression of the core+1 ORF. Plasmid pHPI-766, constructed using oligonucleotide A, contains the luciferase gene fused in-frame with the preceding core coding sequences and served as a positive control. This plasmid was expected to produce a chimeric core-luciferase protein with an apparent molecular mass of 72 kDa, which is ~10 kDa larger than the mass of the native luciferase protein. Plasmid pHPI-767, constructed using oligonucleotide B, contains the luciferase gene fused in-frame with the core+1 ORF. This plasmid was predicted to produce a chimeric luciferase protein only if the core+1 frame was expressed. The size of this putative protein would depend on the location of the translation initiation codon of the core+1 ORF. Finally, plasmid pHPI-768, made using oligonucleotide C, carries the luciferase gene fused in-frame with the -1 frame relative to the core coding sequences and served as a negative control. Because the -1 frame contains multiple stop codons, no functional luciferase was predicted to be produced. As anticipated, in vitro translation of pHPI-766 resulted in the synthesis of a chimeric luciferase protein with an apparent molecular mass of 72 kDa. High enzymatic activity also indicated that this construct was capable of supporting the expression of an active chimeric core-luciferase protein (Fig. 1B, lane 4). As expected, only background levels of luciferase activity and no protein were detected from the expression of the pHPI-768 plasmid (Fig. 1B, lane 6). However, when pHPI-767 was used, a fusion protein with an apparent molecular mass of 72 kDa was again produced. This protein exhibited ~54% luciferase activity relative to that from the pHPI-766 construct (Fig. 1B, lane 5). Notably, normal translation from the RNA of this plasmid led to a translation stop shortly after the luciferase gene region.
As expected, the chimeric luciferase protein expressed from pHPI-767 reacted strongly with anti-core+1 antibodies (R1 and R2; see below), whereas no reaction was observed with the chimeric luciferase protein expressed from pHPI-768 (data not shown). Therefore, the results presented here provide the first direct evidence supporting the expression of the core+1 ORF from prototype HCV-1.
We then sought to explore the possibility of a -1 ribosomal frameshift event within the core coding region of the same HCV isolate. Because the -1 frame of the core gene contains multiple stop codons, premature termination would prevent the detection of a potential -1 frameshift event. To overcome this limitation, we constructed a new series of plasmids that contained a single nucleotide deletion mutation within the core coding region (A364). This mutation introduced a change in the reading frame after the first 9 amino acids of the core ORF, thus placing the upstream AUG codon out-of-frame with respect to the core ORF and in-frame with the core+1 ORF. Consequently, plasmid pHPI-736, which was made with oligonucleotide A, placing the luciferase gene in-frame with the core ORF, now had the luciferase gene out-of-frame relative to the AUG initiation codon. This plasmid was therefore predicted to produce a functional luciferase protein only if a -1 frameshift event occurred. On the other hand, plasmid pHPI-725, which was made using oligonucleotide B, placing the luciferase gene in-frame with the core+1 ORF, now had the luciferase gene in-frame with the AUG initiation codon. This plasmid was expected to give rise to a hybrid luciferase protein containing the first 10 amino acids of the core protein followed by the 86 amino acids of the core+1 ORF in-frame with the luciferase gene. Finally, plasmid pHPI-737, which was made using oligonucleotide C, was our negative control plasmid because the luciferase gene was fused in the third frame containing the multiple stop codons. As shown in Fig. 1B (lane 1), when plasmid pHPI-725 was used to program in vitro translation, a fully active luciferase protein with an apparent molecular mass of 72 kDa was produced. This construct reproducibly gave the highest levels of luciferase activity.
Interestingly, plasmid pHPI-736 also produced an active luciferase protein with a similar size (72 kDa) and an enzymatic activity of ~22% compared with our positive control construct (Fig. 1B, lanes 2 and 1, respectively). As predicted, anti-core+1 antibodies R1 and R2 reacted strongly with the chimeric luciferase protein from pHPI-725 and failed to react with the translation product from pHPI-736 (data not shown). No detectable signal was obtained from the translation products of our negative control construct, plasmid pHPI-737 (Fig. 1B, lane 3). These data provide strong evidence for the presence of a -1 ribosomal frameshift mechanism within the core coding sequences and predict that the slippery site would be within the nucleotide sequences encoding the N terminus of the core protein.
Finally, we sought to assess the expression of the core+1 ORF from another HCV isolate. Thus, a third set of plasmids containing the core coding sequences from the HCV-1a (H) strain cloned into the luciferase-expressing vector was constructed. Plasmid pHPI-748 (equivalent to the pHPI-766 construct) contains the luciferase gene fused in-frame with the core coding sequences. Plasmid pHPI-749 (equivalent to pHPI-767) contains the luciferase gene fused in-frame with the core+1 ORF, and pHPI-750 (equivalent to pHPI-768) contains the luciferase gene in-frame with the -1 frame relative to the AUG initiator codon. As shown in Fig. 1B, translation of plasmid pHPI-748 resulted in the expression of a fully active luciferase protein with an apparent molecular mass of 72 kDa (lane 7), whereas no luciferase expression was detected from the negative control plasmid pHPI-750 (lane 9). However, plasmid pHPI-749, with the luciferase gene fused in-frame with the core+1 ORF, was not able to support detectable levels of chimeric luciferase protein (Fig. 1B, lane 8), suggesting that the HCV-1a (H) isolate does not support efficient expression of the core+1 ORF under the experimental conditions of this study.
Overall, these results provide evidence for the presence of novel translation mechanism(s) within the HCV-1 core coding sequences. However, the reasons for the apparent lack of detectable expression of the core+1 ORF from the HCV-1a (H) isolate are not currently clear.
FIG. 1. Tagging experiments with the luciferase gene. A, shown is a schematic representation of the constructs used for the tagging experiments. The entire HCV IRES (nt 9-340) and part of the core coding sequences (nt 341-630) were cloned into the pGEM-luc expression vector under the control of the SP6 promoter. Nucleotide sequences at the junction of the core and luciferase coding regions are illustrated underneath. The AUG initiation codon of the luciferase gene is boxed. Oligonucleotide A (Oligo-A) was used to fuse the luciferase gene in-frame with the AUG initiator codon (0 frame) in plasmids pHPI-766, pHPI-748, and pHPI-736. Oligonucleotide B (Oligo-B) was used to fuse the luciferase gene in the +1 frame relative to the AUG initiator codon (+1 frame) in plasmids pHPI-767, pHPI-749, and pHPI-725. Oligonucleotide C (Oligo-C) was used to fuse the luciferase gene in the -1 frame relative to the preceding core coding sequence (-1 frame) in plasmids pHPI-768, pHPI-750, and pHPI-737. The underlined nucleotide indicates an insertion of a thymidine residue, and the inverted triangle indicates a deletion of an adenine residue. B, the different IRES-core-luciferase fusion constructs (as described under "Experimental Procedures") were transcribed in vitro, and equal amounts of uncapped RNAs were used to program in vitro translation reactions using Flexi rabbit reticulocyte lysates. 5 µl of the products were resolved by SDS-PAGE, and the results from autoradiography are shown. Products are indicated by the arrow. 5 µl of the same translation products were measured for luciferase activity according to the Promega protocol, and the resulting values are illustrated in the graph. Each bar represents the average luciferase activity of triplicate in vitro translation samples. Error bars indicate the S.D. values of triplicate samples. For clarity, bars corresponding to the same series of constructs have been similarly shaded.
Evidence for Expression of the Core+1 ORF in Vitro-Previous in vitro translation studies have shown that the core coding region from the HCV-1a (H) isolate expresses predominantly the 21-kDa core protein (p21), whereas the same cDNA sequence from the prototype HCV-1 isolate produces predominantly a faster migrating form of the core protein that is ~16 kDa in size (p16), in addition to p21 (11). The p16 protein is assumed to represent a C-terminal truncated form of the core, and its synthesis has been correlated to an Arg-to-Lys change in codon 9 (11). However, the possibility that p16 is in fact a different protein generated from an alternative reading frame of the HCV core coding region was not addressed at that time.
To explore this possibility and to further characterize the nature of the p16 protein, we raised specific antibodies against the core+1 ORF using two different peptides, R1 and R2, corresponding to the HCV sequences encoded by nt 448-522 and 613-660, respectively. The reactivity of these antibodies against the translation products of the core coding region from the HCV-1a (H) or prototype HCV-1 isolate was analyzed by immunoprecipitation experiments in a coupled in vitro transcription/translation system. Consistent with previous studies, the major translation product from plasmid pHPI-996 (HCV-1a (H)) was a protein of 21 kDa (p21), whereas plasmid pHPI-756 (HCV-1) resulted in the production of two major products of 21 and 16 kDa (p21 and p16, respectively) (Fig. 2B, lanes 1 and 5). Immunoprecipitation experiments carried out with the in vitro translation products indicated that the p21 protein from both plasmids pHPI-996 (HCV-1a (H)) and pHPI-756 (HCV-1) reacted strongly with the monoclonal antibody against the core (Fig. 2B, lanes 2 and 6), but failed to react with the R1 and R2 polyclonal antibodies (lanes 3 and 4 and lanes 7 and 8, respectively). These results are in agreement with previous studies and provide strong evidence for the specificity of the R1 and R2 polyclonal sera. In contrast, the p16 protein was recognized by the anti-core monoclonal antibody (Fig. 2B, lane 6) and the two specific anti-core+1 antisera (lanes 7 and 8), suggesting that the 16-kDa protein band contains epitopes from both the core and core+1 proteins. Thus, in contrast to the hypothesis made in previous studies, the p16 protein more likely represents a chimeric protein containing amino acids encoded by both the core and core+1 ORFs.
In an attempt to further characterize the p16 protein, three separate mutations were introduced into the HCV-1 core coding sequences, and the expression of the p21 and p16 proteins was analyzed in vitro as described above (Fig. 3A). First, the AUG initiator codon for the core was mutated to a UAG stop codon, resulting in plasmid pHPI-774. Second, a stop codon was introduced 6 amino acids downstream of the AUG initiator codon and in-frame with the core reading frame, yielding pHPI-775. Finally, plasmid pHPI-777 contains a nucleotide substitution at nt 453 that introduces a stop codon (TGA) into the core+1 ORF (Fig. 3A). As shown in Fig. 3B (lane 1), plasmid pHPI-755, which contains the wild-type core coding sequences of the HCV-1 isolate, produced both forms (p21 and p16) of the core protein. On the other hand, plasmids pHPI-774 and pHPI-775, both containing a termination codon at the beginning (first or sixth amino acid) of the conventional frame of the core coding sequences, failed to synthesize the 21-kDa as well as the 16-kDa core proteins (Fig. 3B, lanes 2 and 3). The phenotype of both mutations is in agreement with previous studies (12), suggesting that the expression of the p16 protein is dependent on the AUG initiator codon of the polyprotein and that both the p21 and p16 proteins share common amino acids. Interestingly, however, plasmid pHPI-777, which contains a stop codon (TGA) in the core+1 ORF, failed to express the 16-kDa protein, whereas the 21-kDa protein was normally synthesized (Fig. 3B, lane 4). These data are in agreement with the observed reactivity of the p16 protein to anti-core+1 antibodies (Fig. 2) and strongly suggest that the synthesis of the p16 protein is dependent on the translation of both the core and core+1 ORFs.
Finally, to exclude the possibility that the expression of the core+1 ORF is the result of slippage of the SP6 polymerase in the A-rich region of the HCV core sequence (nt 364-373), we performed direct/enzymatic sequencing of the in vitro synthesized HCV RNA of this region. As shown in Fig. 4, following digestion with the indicated RNases, the stretch of 10 A residues is clearly visible, with no apparent deletion or insertion that would lead to the synthesis of a frameshifted product.
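The reasoning behind this control can be stated in a few lines of code. The sketch below is only illustrative: the sequences are made-up placeholders rather than HCV sequence, and `longest_run` and `frame_preserved` are hypothetical helper names. It shows why a transcript that gained or lost a single A in a homopolymeric run would place downstream codons in a different frame, whereas an intact run of 10 A residues would not.

```python
import re

def longest_run(seq, base="A"):
    """Length of the longest uninterrupted run of `base` in `seq`."""
    runs = re.finditer(f"{re.escape(base)}+", seq)
    return max((len(m.group()) for m in runs), default=0)

def frame_preserved(template, transcript):
    """True if the length difference is a multiple of 3 (no net frameshift)."""
    return (len(transcript) - len(template)) % 3 == 0

# Placeholder sequences: a run of 10 A residues flanked by arbitrary bases.
TEMPLATE = "GCC" + "A" * 10 + "CGG"
FAITHFUL = TEMPLATE                   # polymerase copied the run exactly
SLIPPED = "GCC" + "A" * 11 + "CGG"    # hypothetical one-nucleotide insertion

for name, transcript in [("faithful", FAITHFUL), ("slipped", SLIPPED)]:
    print(name, longest_run(transcript), frame_preserved(TEMPLATE, transcript))
```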
Evaluation of HCV-positive Human Sera for the Presence of Anti-core+1 Antibodies-Patients with chronic HCV infection produce antibodies against most HCV proteins. If the core+1 ORF is expressed during HCV infection, it is likely that some HCV-positive patients will have antibodies recognizing epitopes of the core+1 ORF, in addition to other anti-HCV antibodies. To test this possibility, serum samples from both HCV-infected and healthy individuals were evaluated for their reactivity against recombinant core+1 proteins expressed in E. coli. For this purpose, a cDNA fragment containing the core coding region (nt 390-920) of the prototype HCV-1 isolate was cloned into the pGEX-3X expression vector, resulting in plasmid pHPI-668 (Fig. 5A). This plasmid was designed to produce a GST-core+1 fusion protein with a calculated molecular mass of 41 kDa. First, the expression of the pHPI-668 plasmid was analyzed by Western blotting using anti-GST antiserum. As expected, E. coli extracts harboring the pHPI-668 plasmid expressed the 41-kDa protein, whereas bacterial lysates harboring the plasmid vector expressed the vector-encoded GST protein (27 kDa) (Fig. 5B, lanes 1 and 2). The 41-kDa protein was also reactive with the R1 and R2 anti-core+1 polyclonal antibodies (data not shown). Next, using the same E. coli extracts, we screened a panel of previously characterized sera from HCV-positive individuals (34 samples) and controls (15 samples) by immunoblot assays. Representative results are shown in Fig. 5B. Interestingly, most of the HCV-positive human sera (77%) recognized a ladder of protein bands with apparent molecular masses of 41 to 29 kDa only in the pHPI-668-transformed E. coli lysates (Fig. 5B, lanes 4, 6, 8, and 10), suggesting the presence of circulating antibodies against the core+1-encoding determinants in the HCV-infected individuals. No reaction was observed with the corresponding bands in E. coli extracts harboring the pGEX-3X vector (Fig. 5B, lanes 3, 5, 7, and 9). Furthermore, all of the HCV-negative serum samples from healthy individuals failed to react (data not shown). These results indicate that the reactivity of the HCV-positive human sera to extracts from E. coli cells expressing the GST-core+1 protein was not caused by nonspecific binding of serum immunoglobulins to E. coli proteins or to the GST part of the recombinant antigen. However, because multiple sero-reactive bands were present instead of a single band, we constructed an additional recombinant core+1 antigen based on the pMal-c2 expression vector. Screening of human sera against the maltose binding protein-core+1-expressing E. coli extracts verified the positive reactivity of the HCV-positive sera (data not shown). As before, however, the serum reactivity was mainly against a broad protein band, suggesting instability problems inherent to the recombinant core+1 antigen in E. coli.
Finally, in an attempt to overcome this problem, a truncated form of the core+1 protein lacking the first 10 amino acids of the core region (Fig. 5C) was synthesized in vitro in the presence of [35S]methionine. Immunoprecipitation experiments with human sera from HCV-positive patients revealed a weak reactivity with a single protein band (~16 kDa) corresponding to the core+1 protein, whereas no significant reactivity was observed with sera from the HCV-negative controls (Fig. 5D). It should be noted, however, that in contrast to the previous screening analysis, background levels could be detected in some HCV-negative sera from non-healthy individuals (Fig. 5D, lane 6). Taken together, these results are in agreement with recent studies (26) and support the presence of circulating anti-core+1 antibodies in the HCV-infected individuals, thus implying the expression of this ORF in vivo.

DISCUSSION
In this study, evidence is presented that supports the expression of the ORF that overlaps the HCV core coding sequences in the +1 frame, designated in this study as the core+1 ORF. Most importantly, we have shown that the previously identified p16 protein band from the HCV-1 isolate contains antigenic determinants of both the core and core+1 polypeptides. Finally, we have provided preliminary evidence supporting the presence of novel translation mechanisms within the core coding region.
The first direct evidence for the expression of the core+1 ORF was obtained from in vitro translation studies of the core region of the HCV-1 isolate using luciferase tagging experiments. Fusion of the luciferase gene with the +1 frame relative to the core coding sequences resulted in the expression of a chimeric luciferase protein that reacted with specific anti-core+1 antibodies. Moreover, the chimeric protein exhibited ~54% luciferase activity relative to the control construct carrying the luciferase gene fused to the core ORF. Because, by design, expression of the luciferase gene from this construct (pHPI-767) occurs only through the translation of the +1 frame, these data provide strong evidence that the core+1 ORF from HCV-1 is a functional ORF.
Further evidence for the expression of the core+1 ORF was obtained by analyzing the reactivity of novel anti-core+1 antibodies (R1 and R2) against the translation products of the core region from the HCV-1 isolate. Interestingly, we showed that the HCV-1 p16 protein contains epitopes not only from the core, but also from the core+1 ORF, inasmuch as R1 and R2 specifically immunoprecipitated the p16 protein band, whereas they failed to react with the p21 core product. Both p21 and p16 were immunoprecipitated, as expected, by the anti-core monoclonal antibody. This was an unanticipated finding because the p16 protein has been assumed to represent a processed form of the p21 core protein (11,12). To explain these data, we considered two likely possibilities. Either the p16 protein is a chimeric protein containing core coding sequences at the N terminus and core+1 amino acid sequences at the C terminus, i.e. p16 and p21 are identical in their N-terminal ends, but differ at their C-terminal ends, owing to the use of the +1 translation frame; or the p16 protein band may consist of two proteins with the same apparent molecular mass: the truncated core protein and the core+1 polypeptide. Because of the basic nature of the core and putative core+1 proteins (predicted pI > 11), two-dimensional gel electrophoresis would not resolve these polypeptides. On the other hand, separation of those polypeptides on Tricine gels reproducibly gave a single protein (data not shown). Moreover, site-directed mutagenesis experiments support the first possibility, inasmuch as insertion of a translation termination codon after the ninth amino acid of the core abolished the expression of both the p21 and p16 proteins, whereas insertion of a stop codon in the core+1 ORF (nt 453) abolished the expression of the p16 protein only.
When the core region of the HCV-1a (H) isolate was used to perform similar experiments, no evidence for detectable core+1 expression was found. Luciferase tagging experiments performed with the core+1 frame from HCV-1a (H) gave no detectable expression of the luciferase gene, indicating no or very low expression of the core+1 ORF. Furthermore, no reactivity of R1 and R2 anti-core+1 antibodies was detected with in vitro translated products of the core region. The latter finding is in agreement with the low levels of expression of p16 from HCV-1a (H). Thus, there appears to be a correlation between the lack of efficient core+1 expression (based on the luciferase tagging experiments) and the low level of p16 synthesis in the HCV-1a (H) isolate, further supporting the suggestion that expression of the HCV-1 p16 protein is directly related to the translation of the core+1 ORF.
At this time, it is unclear why expression of the core+1 ORF and/or the synthesis of p16 protein is predominantly observed only for the prototype HCV-1 isolate, a finding that raises important questions regarding the biological relevance of p16 in the life cycle of the virus. However, two recent reports may shed some light on this issue. Of particular interest is the study by Suzuki et al. (27) indicating that the p16 protein can be expressed by other isolates, but in an unstable form due to proteasome-induced degradation. Furthermore, Yeh et al. (28) provided evidence for the expression of the p16 protein during natural infection. Notably, the expression of p16 is associated with three different mutations located within codons 9-11 other than the Arg-to-Lys change observed in HCV-1. One critical concern is to verify that the p16 protein produced in those studies is the same protein as the p16 protein produced by the HCV-1 isolate in vitro. Interestingly, Yeh et al. (28) showed that polyclonal antibodies against p16 fail to react with the p21 protein and vice versa. This suggests the presence of different epitopes in the two proteins and, similar to our findings, supports the different nature of the p16 and p21 proteins.
The mechanisms of the core+1 expression and/or synthesis of the p16 protein are unknown. However, our data support ribosomal frameshifting into the +1 frame, even though we cannot formally exclude other, less "orthodox" translation mechanisms such as protein splicing and RNA editing that may operate in vivo. Although the luciferase tagging experiments cannot directly address the mechanism responsible for core+1 ORF expression, the apparent molecular mass of the chimeric luciferase protein (encoded by pHPI-767) argues against the possible use of an obvious start codon in the +1 frame, as the first AUG codon is located just 12 amino acids upstream of the luciferase gene. Furthermore, the combined reactivity of p16 to both anti-core and anti-core+1 antibodies provided strong evidence for a translation mechanism that would allow translation from both the 0 and +1 frames, resulting in the synthesis of a chimeric protein. Further support for a +1 ribosomal frameshift mechanism was provided by the in vitro mutagenesis experiments, inasmuch as insertion of a stop codon into either the 0 or +1 frame (6 or 37 amino acids downstream of the AUG initiator codon, respectively) abolished the expression of the p16 protein. This is in contrast to the expression of the p21 protein, which, as expected, was not affected by the insertion of a stop codon in the +1 frame. Moreover, alteration of the AUG initiator codon to a stop codon eliminated the expression of both the p21 and p16 proteins.
FIG. 5. Screening of human sera for the presence of anti-core+1 antibodies. A, construction of the GST-recombinant core+1 plasmid. Nucleotides 390-920 from the HCV core coding sequences were cloned in the +1 frame into the pGEX-3X expression vector, resulting in plasmid pHPI-668. The GST-core+1 fusion protein has a calculated molecular mass of 41 kDa. B, Western blot analysis of the GST-recombinant core+1 plasmid. Plasmid pHPI-668 and the pGEX-3X control vector were transformed into E. coli cells; and after induction with isopropyl-β-D-thiogalactopyranoside, the cell lysates were resolved by SDS-PAGE. Bacterial lysates transformed with the pGEX-3X vector alone (lanes 1, 3, 5, 7, and 9) and with plasmid pHPI-668 (lanes 2, 4, 6, 8, and 10) were tested by Western blotting for their reactivity against anti-GST antiserum and human sera from HCV-positive patients. Protein molecular mass markers (in kilodaltons) are indicated on the left, and the migration of the recombinant proteins is indicated by the arrow. C, schematic diagram of plasmid pHPI-1309, which contains the core+1 coding region under the control of the SP6 promoter with a start codon in the +1 frame. D, immunoprecipitation experiments performed with the core+1 protein. Lanes 1-3, immunoprecipitations using human sera from HCV-positive patients; lanes 4-6, immunoprecipitations using human sera from HCV-negative individuals. The location of the core+1 protein is indicated by the arrow.
Interestingly, the luciferase tagging experiments also support the presence of a -1 ribosomal frameshift event within the core region. The -1 frame relative to the core ORF contains multiple stop codons. Thus, should such a mechanism operate in vivo, it could function as a regulatory switch to control the expression of the core and/or core+1 rather than to culminate in the synthesis of a new protein, as is the case in the putative +1 frameshift event. Negative regulation of core expression through the -1 frameshift event or translation of the core+1 ORF would function in favor of encapsidation or RNA replication. The details of the molecular mechanisms responsible for alternate translation within the core region are currently under investigation.
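The observation that one frame is closed by stop codons while another remains open can be checked mechanically for any sequence. The following is a minimal sketch on a made-up 15-nucleotide RNA, not on HCV sequence; `stop_positions`, `FRAME_OFFSETS`, and `TOY_RNA` are hypothetical names. It assumes frames are counted relative to a reference ORF starting at index 0, so that the +1 frame begins at offset 1 and the -1 frame is equivalent to offset 2.

```python
STOPS = {"UAA", "UAG", "UGA"}

# Offsets of each frame relative to a reference ORF that starts at index 0.
# Reading one nucleotide "back" (-1) is equivalent to offset 2 modulo 3.
FRAME_OFFSETS = {"0": 0, "+1": 1, "-1": 2}

def stop_positions(rna, offset):
    """Start indices of stop codons encountered when reading from `offset`."""
    return [i for i in range(offset, len(rna) - 2, 3) if rna[i:i + 3] in STOPS]

# Made-up sequence: open in frame 0, one stop codon in each alternate frame.
TOY_RNA = "AUGGCUAAGCUGACC"

for frame, offset in FRAME_OFFSETS.items():
    print(frame, stop_positions(TOY_RNA, offset))
```

A frame that returns an empty list over a long stretch is a candidate ORF, whereas a frame with several early stops, as described for the -1 frame of the core region, cannot encode a long polypeptide.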
The data presented here suggest the presence of novel RNA signals within the core coding region responsible for the reading of alternative frames by the ribosomes. According to previous studies, the efficiency of -1 frameshifting depends on the primary sequence of the RNA slippery site or on distal sequences that participate in a secondary structure that is responsible for the pausing of the ribosomes (29). On the other hand, the cis-acting elements responsible for the +1 frameshift mechanism are less well defined (30,31). Finally, it should be noted that the previously described mutation in codon 9 of the HCV-1 isolate generates a region of 10 consecutive A residues that represents a known -1 slippery site (29). Notably, Yeh et al. (28) detected p16 from clinical isolates containing mutations at nt 366-374 (codons 9-11) that failed to reproduce the 10-A residue region of HCV-1, indicating that the presence of 10 A residues is not obligatory for the efficient expression of the p16 protein.
The critical question that remains to be answered concerns the biological importance of our findings in the context of the HCV life cycle. Our first attempt to address this was to screen human sera from HCV-positive patients against E. coli lysates expressing recombinant core+1 protein or in vitro synthesized core+1 protein. Our data suggest the presence of circulating antibodies against epitopes from the core+1 ORF in most of the HCV-infected individuals, implying that the core+1 ORF may be expressed in vivo during HCV infection. Notably, although the nature of the positive signal for the E. coli antigen is a family of overlapping polypeptides rather than a single protein band, the reaction was specific for the core+1-containing E. coli extracts, inasmuch as no reaction was observed with the negative controls. Furthermore, our data are in agreement with recent studies reporting the identification of antibodies to synthetic peptides representing the core+1 ORF in HCV-infected patients (26).
The genome organization of all the members of the Flaviviridae family is similar and characterized by the presence of the structural proteins at the amino terminus and the nonstructural proteins at the carboxyl terminus of the polyprotein. Notably, however, the structural region of the pestiviruses has two distinguishing characteristics: the presence of an extensive secondary structure at the 5′-untranslated region of their genome that functions as an IRES element as well as the presence of the Npro gene located upstream of the core coding sequence. Npro is a nonstructural protease that is autocleaved from the nascent polyprotein (32). Flaviviruses lack both characteristics, whereas the HCV genome encodes an IRES element. In fact, secondary structure modeling of the 5′-untranslated region of HCV and pestiviruses revealed a remarkable folding similarity, and several short stretches with a significant sequence identity have been recognized in the 5′-untranslated region of both viruses (33,34). Thus, it is tempting to speculate that the HCV core+1 polypeptide might represent the counterpart of Npro in HCV. Interestingly, a closer examination of the catalytic amino acids of Npro and other members of the chymotrypsin-like cysteine proteases revealed that the important catalytic amino acids are relatively conserved in the +1 ORF between nt 615 and 656 (35).
In summary, this study provides direct evidence for the presence of alternate translation mechanisms within the core coding region of the HCV-1 isolate. Such mechanisms may be important for translation of the core+1 ORF and/or regulation of core expression itself. How widespread expression of the core+1 ORF and/or the putative +1/-1 frameshift mechanism(s) may be remains an open question. The finding of a p16 protein from clinical isolates combined with the observed reactivity of human sera to recombinant core+1 antigens or peptides suggests that the core+1 ORF and/or the alternate translation mechanism within the core may indeed represent novel functions for HCV.
Finally, it should be noted that during the review process of our paper, Xu et al. (36) reported the discovery of a novel HCV protein (designated F protein) synthesized from the core coding region. It was shown that this protein is synthesized from the coding sequence that overlaps the core protein via a ribosomal frameshift translation mechanism. It appears that the F protein is identical to the core+1 protein described in this report.