Expression, purification, and characterization of an active RNase H domain of the hepatitis B viral polymerase.

The replication of the hepatitis B viral DNA genome proceeds through a pregenomic RNA intermediate. This pregenomic RNA subsequently serves as the template for the formation of the viral DNA by the reverse transcriptase activity of the viral P gene product. The P gene product is believed to be a multifunctional enzyme with DNA-dependent DNA polymerase, RNA-dependent DNA polymerase, and RNase H activities. Detailed biochemical studies of this protein have not been performed because of the inability to obtain sufficient amounts of the enzyme from the virus and the inability to produce the enzyme in heterologous expression systems. The RNase H activity is essential for viral replication and is believed to be responsible for the degradation of the RNA pregenomic intermediate as well as for generating the short RNA primer that is required for DNA second strand synthesis. We have assembled an expression vector which directs the synthesis of a protein that corresponds to the putative RNase H domain of the P gene product and has a carboxyl-terminal polyhistidine tag to facilitate purification. The protein has been expressed in Escherichia coli and purified to yield 1-2 mg of protein/liter of culture. This protein has RNase H activity as defined by its ability to degrade the RNA component of RNA-DNA hybrids but not the DNA component. The RNase H has a basic optimum pH, is active only in the presence of reducing agents, and is dependent on the presence of divalent cations, with magnesium being preferred over manganese.

Hepatitis B virus (HBV) is a member of the Hepadnaviridae, a family of small, partially double-stranded (mammalian) or fully double-stranded (avian) DNA viruses (1). This group also includes the woodchuck hepatitis virus, ground and tree squirrel hepatitis viruses, duck hepatitis B virus, and heron hepatitis virus. The genomes of these viruses are very small, having only about 3000 base pairs, and all members have very similar genome organization (2). There are four (mammalian) and three (avian) open reading frames (for review, see Ref. 3). One open reading frame (preS/S gene) encodes the three different lengths of surface protein HBsAg (S, M, and L). A second open reading frame (preC/C) encodes the core and e proteins (HBcAg and HBeAg). A third open reading frame (X gene) in the mammalian viruses encodes a trans-activating protein, the HBx protein, which is lacking in the avian members. The fourth open reading frame (P gene) encodes the putative viral polymerase (4).
HBV infects more than 300 million people worldwide. Acute infections result in disease ranging in severity from asymptomatic infection to fulminant hepatitis and death. In addition, chronic infection is closely correlated with the development of hepatocellular carcinoma (for review, see Ref. 3). Although an effective vaccine has been available for more than a decade (5), the fact that there are millions of HBV carriers ensures that there will remain a relatively high interest in developing possible antiviral therapy. The most obvious target for antiviral drugs is the viral polymerase. However, HBV is not readily cultured and therefore is not a suitable source for obtaining this enzyme in sufficient quantities for study. Moreover, although all the other proteins of the hepatitis virus have been produced in large quantity by recombinant methods, the P gene product has not been produced and purified in an active form in heterologous expression systems.
The HBV polymerase is predicted to be a multifunctional protein, having DNA-dependent DNA polymerase, reverse transcriptase, and RNase H activities (3,6). The replication of these viruses has been most extensively studied using the duck hepatitis B virus as a model. Summers and Mason (7) demonstrated that replication proceeds through an RNA intermediate, the pregenomic RNA, which is produced by transcription of the closed circular viral DNA by cellular RNA polymerase (for review, see Ref. 3). This RNA intermediate serves as a template for minus strand DNA synthesis, which is primed by a protein product of the P gene and is catalyzed by the reverse transcriptase activity of the polymerase. During synthesis of the minus strand DNA, the pregenomic RNA is degraded by the RNase H activity of the polymerase. The full-length minus strand then serves as a template for plus strand DNA synthesis by the DNA-dependent DNA polymerase activity of the polymerase. This reaction is primed by a short RNA, which is believed to be also derived from the pregenomic RNA by the activity of the RNase H domain of the polymerase (8).
The domain structure of the HBV P gene product has been studied by in vitro transfection with viral DNA containing mutations leading to amino acid changes in different parts of the protein (9). Based on these studies, as well as comparisons of the amino acid sequence of the expected protein product of the P gene with other polymerases and RNase Hs, the HBV RNase H has been localized to within the carboxyl-terminal portion of the P gene product and connected to the carboxyl terminus of the polymerase domain (6,9). However, the activity of RNase H has not been directly demonstrated. Because RNase H is apparently an essential component of the viral replication machinery, this enzyme could be an important target for antiviral drugs, if the pure enzyme were available in sufficient quantities to allow the development of screening assays.
Multifunctional enzymes are common in cells and are often separable into active domains. Oberhaus and Newbold (10) have recently reported the detection of RNase H activity in highly purified duck hepatitis B virions, and that the activity was associated with a protein with a molecular weight much smaller than the expected full-length P gene product. Although these investigators did not demonstrate that the activity was actually related to the viral P gene protein, if it were, these results would show that the RNase H domain might be separable from the polymerase domain. In this article we demonstrate that this is, in fact, the case for the RNase H activity of the HBV multifunctional polymerase protein. We describe here the production of an active HBV RNase H in Escherichia coli, its purification, and some of the basic properties of this domain of the HBV viral polymerase.

MATERIALS AND METHODS
Oligodeoxynucleotides were obtained from the Virginia Commonwealth University Oligonucleotide Synthesis Core Facility. Vent polymerase was obtained from New England Biolabs. 3H-Labeled nucleotides were obtained from American Radiolabeled Chemicals, Inc. (St. Louis, MO). All other chemicals were obtained from Sigma.
Cloning of the RNase H Domain of the HBV-Oligodeoxyribonucleotides having the sequences C CGG GAA TTC CAA CGG CCA GGT CTG TGC and GGG AAG CTT CGG TGG TCT CCA TGC GAC were used to copy bases 1161-1625 of the HBV (ayw) genome (11). The product was purified by agarose gel electrophoresis, cleaved with EcoRI and HindIII restriction endonucleases, and ligated into the vector pET21a (Novagen, Inc., Madison, WI). The ligation mixture was used to transform competent E. coli HMS174 cells and plated on medium containing ampicillin. Bacterial colonies were selected, and 5-ml cultures were grown to allow screening of plasmids by restriction nuclease digestion of the DNA. Plasmids containing an insert with the expected size were subsequently sequenced (by the Virginia Commonwealth University Nucleic Acid Core Facility), and one containing the correct sequence was used for all subsequent studies. This plasmid is designated pET-RNase H.
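The primer design described above can be checked with a short script. The following is a minimal sketch, not part of the original methods: it confirms that each primer as printed carries the recognition site of the enzyme used for cloning (GAATTC for EcoRI, AAGCTT for HindIII) and computes the length of the stated template span. The function names are illustrative assumptions.

```python
# Recognition sequences of the two restriction enzymes used for cloning.
SITES = {"EcoRI": "GAATTC", "HindIII": "AAGCTT"}

# Primer sequences as printed in the text, with the spacing removed.
FORWARD = "C CGG GAA TTC CAA CGG CCA GGT CTG TGC".replace(" ", "")
REVERSE = "GGG AAG CTT CGG TGG TCT CCA TGC GAC".replace(" ", "")


def find_sites(primer):
    """Return {enzyme: 0-based start index} for every site present in the primer."""
    return {name: primer.find(seq) for name, seq in SITES.items() if seq in primer}


def span_length(first_base, last_base):
    """Number of bases in an inclusive, non-wrapping base range."""
    return last_base - first_base + 1


if __name__ == "__main__":
    print("forward primer:", find_sites(FORWARD))
    print("reverse primer:", find_sites(REVERSE))
    print("template span (bases 1161-1625):", span_length(1161, 1625), "bases")
```

The forward primer contains only the EcoRI site and the reverse primer only the HindIII site, which is what directional ligation into a vector cut with both enzymes requires.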
Expression and Purification of the RNase H Domain-Cells were grown in 100 ml of L broth (10 g of tryptone, 5 g of yeast extract, 10 g of NaCl, and 50 μg/ml ampicillin) overnight and then used to inoculate 1 liter of medium containing 10 g of tryptone, 6 g of Na2HPO4, 3 g of monobasic KH2PO4, 5 g of NaCl, 1 g of glucose, 1 mM MgCl2, 0.1 mM CaCl2, and 50 mg of ampicillin/liter. The A600 was monitored, and when it had reached 0.6-0.8, isopropylthiogalactopyranoside was added to 60 mg/liter. The cells were grown for an additional 4 h to allow expression of the protein and then harvested by centrifugation. Cells were disrupted by a single passage through an Avestin, Inc., Emulsiflex operating at 20,000 p.s.i. The protein was found to be present as insoluble inclusion bodies. These were collected by centrifugation at 25,000 × g for 30 min. The inclusion bodies were washed extensively by suspension in Tris-HCl, pH 8.0, and centrifugation (three times), and then they were solubilized in 6 M urea in 50 mM Tris-HCl, pH 8.0. The solution was clarified by centrifugation at 25,000 × g and applied to a nickel-Sepharose column (Qiagen, Chatsworth, CA). The column was washed extensively with 50 mM Tris-HCl, pH 8.0, containing 6 M urea, 10 mM imidazole, and 300 mM NaCl until the A280 returned to the baseline value. The protein was then eluted with 100 mM EDTA in Tris-HCl, pH 8.0, containing 6 M urea. Elution was monitored at 280 nm, and protein-containing fractions were pooled and dialyzed extensively against 6 M urea to remove EDTA.
Renaturation of the RNase H was performed by dilution of the protein to a concentration of 0.1-0.2 mg/ml, a final urea concentration of 0.7 M, 12 mM mercaptoethanol, and 20% glycerol. The solution was dialyzed against the same solution overnight and then dialyzed against the same buffer lacking urea at 4°C.
Sodium Dodecyl Sulfate-Polyacrylamide Gel Electrophoresis-SDS-polyacrylamide gel electrophoresis was performed with 15% acrylamide gels according to the methods of O'Farrell (12). Midrange molecular weight markers were obtained from Promega (Madison, WI). For immunoblotting, the protein was transferred to nitrocellulose according to the method of Burnette (13) and reacted with anti-HBV-positive human serum, which had been diluted 1:200 in phosphate-buffered saline containing 4% bovine serum albumin. Following overnight incubation at room temperature with gentle agitation, the membrane was washed extensively with phosphate-buffered saline containing 0.02% Tween 20 and then incubated with peroxidase-labeled goat anti-human IgG, diluted 1:2000 in phosphate-buffered saline containing 4% bovine serum albumin. After 1 h at room temperature, the membrane was washed extensively, and substrate (chloronaphthol and hydrogen peroxide) was added.
Protein Concentration Determination-The concentration of the HBV RNase H was determined spectrophotometrically using the value of 26,780 for the molar extinction coefficient at 280 nm, which was calculated from the amino acid composition (14).
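The spectrophotometric determination reduces to the Beer-Lambert relation. As a minimal sketch, the calculation below uses the molar extinction coefficient given above and the molecular weight calculated from the amino acid composition (M r 19,963, stated under "Results"). The 1-cm path length and the function names are assumptions made for illustration.

```python
EXTINCTION_280 = 26_780    # molar extinction coefficient at 280 nm (M^-1 cm^-1), from the text
MOLECULAR_WEIGHT = 19_963  # calculated from the amino acid composition, from the text


def a280_of_1_mg_per_ml(extinction=EXTINCTION_280, mw=MOLECULAR_WEIGHT):
    """Absorbance at 280 nm of a 1 mg/ml solution in a 1-cm cuvette."""
    # 1 mg/ml equals 1 g/liter, i.e. (1 / mw) mol/liter.
    return extinction / mw


def concentration_mg_per_ml(a280, path_cm=1.0,
                            extinction=EXTINCTION_280, mw=MOLECULAR_WEIGHT):
    """Protein concentration from a measured A280 (Beer-Lambert law)."""
    molar = a280 / (extinction * path_cm)  # mol/liter
    return molar * mw                      # g/liter, numerically equal to mg/ml


if __name__ == "__main__":
    print(f"A280 of a 1 mg/ml solution: {a280_of_1_mg_per_ml():.2f}")
    print(f"A280 = 0.20 corresponds to {concentration_mg_per_ml(0.20):.3f} mg/ml")
```

The computed absorbance of a 1 mg/ml solution, about 1.34, rounds to the value of 1.3 used under "Results."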
Protein Sequencing-The amino-terminal sequence of the protein was determined by automatic Edman degradation in the Virginia Commonwealth University Protein/Peptide core facility.
Substrate Preparation-Poly(dC)·[3H]poly(rG) was synthesized using E. coli RNA polymerase as described by Starnes and Cheng (15). E. coli RNA polymerase (20 units) was added to a reaction mixture containing 50 mM Tris-HCl, pH 7.8, 100 mM KCl, 5 mM MgCl2, 1 mM MnCl2, 5 mM DTT, 5% glycerol, 1 A260 unit of poly(dC), and 500 μM [3H]GTP (2 Ci/mmol). The reaction mixture was incubated at 37°C for 10 min, the reaction was stopped by the addition of 10 μl of 0.5 M EDTA, and the products were separated from unincorporated nucleotide by passage through a Sephadex G-50 column (1.5 ml). Poly(rG)·[3H]poly(dC) was synthesized with mouse mammary tumor virus reverse transcriptase by the addition of 25 units of enzyme to a reaction mixture containing 50 mM Tris-HCl, pH 7.8, 100 mM KCl, 5 mM MgCl2, 5 mM DTT, 5% glycerol, 1 A260 unit of poly(rG), 0.1 A260 unit of synthetic oligo(dC) (dodecamer), and 500 μM [3H]dCTP (25 Ci/mmol). The reaction was incubated at 37°C for 30 min, the reaction was stopped, and the products were separated from unincorporated nucleotide by passage through a Sephadex G-50 column.
RNase H Assay-A modification of the procedure of Starnes and Cheng (15) was used as follows. Poly(dC)·[3H]poly(rG) or poly(rG)·[3H]poly(dC) was used as substrate. 1-5 μl of protein solution (0.1 mg/ml) was added to a 50-μl reaction mixture containing 50 mM Tris-HCl, pH 8.0, 4 mM DTT, 2 mM MgCl2, and 2-8 × 10^4 cpm of 3H-labeled substrate. The reaction mixture was incubated at 37°C for various times, and then the reaction was terminated by application to trichloroacetic acid (5%)-soaked glass fiber filters (Whatman GF/A). The filters were washed extensively with 5% trichloroacetic acid and then 70% ethanol. Radioactivity was then measured by liquid scintillation counting.
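The assay readout is the loss of trichloroacetic acid-insoluble radioactivity retained on the filters. The sketch below shows, under stated assumptions, how filter counts could be converted to the fraction of substrate solubilized and how much enzyme a given volume of the 0.1 mg/ml stock delivers. The function names and the example counts are hypothetical and are not data from this study. The stock concentration and molecular weight are those given in the text.

```python
MOLECULAR_WEIGHT = 19_963  # calculated M r of the tagged RNase H domain, from the text
STOCK_MG_PER_ML = 0.1      # protein stock used in the assay, from the text


def fraction_solubilized(initial_cpm, retained_cpm):
    """Fraction of acid-insoluble label converted to acid-soluble products."""
    if initial_cpm <= 0:
        raise ValueError("initial_cpm must be positive")
    return 1.0 - retained_cpm / initial_cpm


def enzyme_micrograms(volume_ul, stock=STOCK_MG_PER_ML):
    """Mass of enzyme added; microliters times mg/ml gives micrograms."""
    return volume_ul * stock


def enzyme_picomoles(volume_ul, stock=STOCK_MG_PER_ML, mw=MOLECULAR_WEIGHT):
    """Moles of enzyme added; micrograms / (g/mol) gives micromoles."""
    return enzyme_micrograms(volume_ul, stock) / mw * 1e6


if __name__ == "__main__":
    # Hypothetical filter counts, chosen only to illustrate the arithmetic.
    print(fraction_solubilized(initial_cpm=40_000, retained_cpm=10_000))
    for volume in (1.0, 2.0, 5.0):
        print(f"{volume} ul -> {enzyme_micrograms(volume):.1f} ug, "
              f"{enzyme_picomoles(volume):.1f} pmol")
```

On this arithmetic, the 0.2 μg of enzyme quoted in the figure legends corresponds to 2 μl of stock, or roughly 10 pmol of protein in the 50-μl reaction.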
Product Analysis-The products of the reactions were analyzed by thin layer chromatography on Silica Gel 1B2 plates (J.T. Baker Chemical Co., Phillipsburg, NJ) developed with isopropanol:water:ammonium hydroxide (5:4:1). Standards containing nucleotide monophosphates, diphosphates, and triphosphates were run next to the reaction mixtures. The positions of the standards were determined by examination under ultraviolet light. The thin layer chromatograms were then cut into sections, and the position of the radioactive compounds was determined by scintillation counting.

RESULTS
The HBV/ayw pol gene is encoded by bases 2309-1624 (GenBank accession number J02203 V01460) and encodes 832 amino acids. Fig. 1 shows the amino acid sequence of the HBV RNase H protein expected from the translation of that segment of the HBV pol gene, containing the codons for amino acid residues 679-832, which had been inserted into the expression plasmid. This sequence differs from the viral sequence by the presence of a 16-residue amino-terminal extension (vector-derived) and a 13-residue carboxyl-terminal extension (vector-derived). The latter contains 6 histidyl residues at the carboxyl terminus, which facilitates purification by nickel chelate chromatography. These additional amino acids are shown in lower case letters. Fig. 2 shows the expression of the HBV RNase H protein in E. coli by the vector pET-RNase H and its subsequent purification. When cells containing the pET-RNase H vector were induced by the addition of isopropylthiogalactopyranoside, there was no obvious expression of a new protein at the expected molecular weight when cell lysates were examined by SDS-polyacrylamide gel electrophoresis (Fig. 2, lanes 2 and 3). However, when the insoluble fraction of the cell lysate was solubilized in urea and subsequently applied to a nickel-Sepharose column, a protein having an apparent molecular weight of about 20,000 was obtained by elution with EDTA (Fig. 2, lane 4). The apparent molecular weight of the final protein product is consistent with that which was calculated from its amino acid composition (M r 19,963). The protein concentration was determined from the absorption spectrum of the final solution, using a value of 1.3 for the A280 for a 1 mg/ml solution of protein, which was calculated from the molar extinction coefficient (determined from the amino acid sequence) and molecular weight as given above. The normal yield of the purified protein was about 1-2 mg/liter of culture.
Automatic Edman degradation verified that the amino-terminal sequence of the purified protein was that shown in Fig. 1, except that the amino-terminal methionine had been removed by the E. coli methionine aminopeptidase. This is consistent with the known specificity of this aminopeptidase (16). Only the initial 5 amino acids were determined, since they were as predicted. The remaining sequence, as shown in Fig. 1, was deduced by determining the complete nucleotide sequence of the DNA in the expression vector. Only the insert and a few flanking nucleotides were sequenced.
When immunoblotted with HBV-positive human sera (obtained from asymptomatic chronic carriers), several (3 of 10) were found to contain antibodies that recognized the RNase H domain, further confirming that this protein is the expected viral protein. One example is shown in Fig. 2, lanes 5 and 6. Lane 5 shows the results of immunoblotting with normal human serum, whereas lane 6 shows the results obtained when blotted with an HBV-positive serum. No reactivity was observed in the case of the normal human serum, but the positive serum reacted strongly with the RNase H.
Assay of RNase H-Fig. 3 shows the time-dependent release of 3H-labeled poly(rG) from poly(dC)·[3H]poly(rG) duplexes. Essentially all of the labeled RNA was converted to trichloroacetic acid-soluble products within 1 h of incubation. However, when poly(rG)·[3H]poly(dC) was the substrate, very little radioactivity was converted to trichloroacetic acid-soluble material (Fig. 3). Heating the RNase H to temperatures up to 60°C for 10 min did not affect the reaction (data not shown); however, no RNase H activity was detected in protein samples that were incubated in boiling water for 10 min (Fig. 3).
The products of the reaction with poly(dC)·[3H]poly(rG) were analyzed by thin layer chromatography, and the results are shown in Fig. 4. The reaction was examined at a time when approximately 50% of the initial substrate had been converted to trichloroacetic acid-soluble material. Radioactivity was observed at two positions in the chromatogram. About 50% of the radioactivity remained very near the origin and is presumably undigested or only partially digested substrate. The remaining radioactivity migrated at a position corresponding to authentic GMP. When the products of reaction with poly(rG)·[3H]poly(dC) were examined by the same procedure, all of the label remained near the origin of the chromatogram (data not shown).
Effect of Reducing Agents-The enzyme is apparently easily inactivated by oxidation, since no activity was detected in the absence of DTT (or mercaptoethanol). Reactions were performed with varying concentrations of DTT ranging from 0 to 10 mM. As shown in Fig. 5, 4.0 mM DTT was sufficient to give maximal activity and was used in all subsequent assays.
Divalent Cation Requirement-To determine whether the enzyme requires magnesium ions for activity, the standard assay was modified to contain varying amounts of this metal ion. Fig. 5 shows the results of these studies. The enzyme has no activity in the absence of magnesium ions, and 2 mM magnesium is sufficient for maximal activity. Magnesium is preferred over manganese (data not shown). At concentrations greater than 10 mM the activity decreases significantly (Fig. 5).
Effect of Salt Concentration-KCl concentrations of 0-20 mM gave maximal enzyme activity, whereas higher salt concentrations resulted in lower activity (Fig. 5).
Optimum pH of the RNase H-The effect of pH on enzyme activity was examined using Tris-HCl buffer adjusted to the various pH values shown in Fig. 6. Under these conditions, the enzyme exhibited an alkaline optimum pH of about 8-8.5.

DISCUSSION
Although the P gene has been predicted to code for a multifunctional protein with DNA-dependent DNA polymerase, RNA-dependent DNA polymerase, and RNase H activities, none of these activities has been previously demonstrated in a purified protein.
The reverse transcriptase activity was first hypothesized to be coded for by the P gene based on its homology to the polymerase domain of the retroviral reverse transcriptase (4). Since the retroviral reverse transcriptases were located on multifunctional proteins, which were shown to also contain RNase H activities, Khudyakov and Makhov (6) examined the HBV pol gene for possible regions of homology to the retroviral RNase H. However, no homologies were noted. On the other hand, homology between the carboxyl-terminal segment of the P gene product and the E. coli RNase H was observed, and they proposed that this part constitutes an HBV RNase H.
Data consistent with this hypothesis were later presented by Radziwill et al. (9), who examined the effect of various mutations within the P gene on the ability to produce core particles containing the virus-associated polymerase activity. Two mutations led to results consistent with inactivation of a viral RNase H. These were the conversion of glutamic acid 56 (in Fig. 1 and corresponding to residue 718 of the predicted P gene protein) to histidine, and aspartic acid 75 (in Fig. 1 and corresponding to residue 737 of the predicted P gene protein) to valine. These results support the assignment of the RNase H activity to this region of the P gene product. However, this interpretation is complicated by a mutation that converted alanine 63 (in Fig. 1, corresponding to residue 725 of the P protein) to aspartic acid. This mutation resulted in complete loss of reverse transcriptase activity, which is presumably in a different protein domain. Thus, the conclusion that any of these effects were directly due to modification of the RNase H domain rather than indirect effects could not be made with certainty.
FIG. 3. RNase H assay. The RNase H activity was measured by monitoring the conversion of trichloroacetic acid (5%)-insoluble to trichloroacetic acid (5%)-soluble radioactivity. ●, reaction with the substrate poly(dC)·[3H]poly(rG). ○, same reaction using RNase H that had been heated to 100°C. For these reactions, the left ordinate indicates the cpm observed. ×, reaction with [3H]poly(dC)·poly(rG) as substrate. For this reaction, the right ordinate indicates the cpm observed. In each case 0.2 μg of enzyme was used.
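The two residue numberings used for the mutations discussed above (position in the Fig. 1 construct and position in the predicted P gene protein) differ by a constant offset that follows from the construct layout given under "Results": a 16-residue amino-terminal extension, viral residues 679-832, and a 13-residue carboxyl-terminal extension. The conversion can be sketched as follows. The function and constant names are illustrative assumptions, and the sketch assumes that Fig. 1 numbering starts at the first residue of the vector-derived extension.

```python
N_EXTENSION = 16    # vector-derived amino-terminal residues
DOMAIN_START = 679  # first P protein residue in the construct
DOMAIN_END = 832    # last P protein residue in the construct
C_EXTENSION = 13    # vector-derived carboxyl-terminal residues, including 6 His

DOMAIN_LENGTH = DOMAIN_END - DOMAIN_START + 1
CONSTRUCT_LENGTH = N_EXTENSION + DOMAIN_LENGTH + C_EXTENSION


def construct_to_p_protein(position):
    """Map a 1-based Fig. 1 position to P protein numbering.

    Raises ValueError for positions that lie in a vector-derived extension.
    """
    index_in_domain = position - N_EXTENSION  # 1-based index within the viral segment
    if not 1 <= index_in_domain <= DOMAIN_LENGTH:
        raise ValueError(f"position {position} is not a viral residue")
    return DOMAIN_START + index_in_domain - 1


if __name__ == "__main__":
    print("construct length:", CONSTRUCT_LENGTH)
    for name, position in (("Glu", 56), ("Ala", 63), ("Asp", 75)):
        print(f"{name} {position} in Fig. 1 -> P protein residue "
              f"{construct_to_p_protein(position)}")
```

Under these assumptions the three mutated positions map to residues 718, 725, and 737, in agreement with the correspondences stated in the text.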
Oberhaus and Newbold (10) reported the detection of RNase H activity associated with highly purified duck hepatitis B virions. When assayed in SDS-polyacrylamide gels following renaturation, the activity was detected at a position corresponding to a molecular weight of 34,000-36,000. No activity was observed at molecular weights corresponding to those at which these authors had previously observed RNA- and DNA-dependent DNA polymerase activity using similar activity gel assays. However, the RNase H was not purified; therefore, its relationship, if any, to the viral P gene product is not known. If, however, it were derived from the P gene protein, then these data would demonstrate that the RNase H of duck hepatitis B virus can function in a less than full-length P protein and possibly as a domain independent of polymerase activity. This would be similar to what has been observed in the Moloney murine leukemia virus polymerase (20) as described below.
In this study, we chose to produce a protein containing only the putative RNase H domain of the hepatitis B virus, based on the predicted domain location as described by Khudyakov and Makhov (6). We included several additional amino acids at the amino terminus, which might be expected to be a nonessential linker between the polymerase and RNase H domains. This protein was expressed in E. coli, although in relatively small amounts compared with the levels of many other proteins produced in this system. The reason for the relatively low expression level is unknown. Studies of E. coli in which the bacterial RNase H was expressed from a multicopy plasmid did not show deleterious effects from such overexpression under normal growth conditions (17). Therefore, it seems unlikely that low expression is due to the protein being toxic to the cells. Nevertheless, even though the protein is not expressed at high levels, it was easily purified in milligram amounts.
The properties of this RNase H are very similar to those of other RNase Hs from eukaryotic, prokaryotic, and viral sources (18). Thus, like many other RNase Hs, the enzyme is small, has a basic pI, requires reducing agents and a divalent cation for activity, and (by definition) degrades the RNA but not the DNA strands of RNA-DNA hybrids. Two viral RNase Hs that have been well characterized are those of HIV and the Moloney murine leukemia virus. The HIV-1 RNase H has been expressed as a fusion protein with dihydrofolate reductase. The chimeric protein was subsequently cleaved, and the RNase H domain was purified. However, unlike the HBV RNase H domain described here, the HIV RNase H domain was catalytically inactive. Nevertheless, the structure of HIV RNase H has been determined by x-ray crystallography and shown to be similar in overall structure to the E. coli enzyme but with numerous differences (19). On the other hand, the Moloney murine leukemia virus reverse transcriptase has been expressed as enzymatically active, isolated domains (20). The amino-terminal domain (approximately two-thirds of the protein sequence) has only polymerase activity, and the carboxyl-terminal domain (approximately one-third of the protein sequence) has only RNase H activity. Although the RNase H domain of the murine leukemia virus (191 amino acids) is similar in length to the RNase H domain of HBV (154 amino acids) reported here, the two have little homology (20% sequence identity and 50% sequence similarity by BESTFIT analysis). Moreover, residues reported to be critical for the murine leukemia virus RNase H (21) are not among the 20% identical residues, and the residues reported to be critical for the HBV RNase H, described above, are also not conserved in the murine leukemia viral enzyme. In addition, secondary structure predictions for the two enzymes reveal little similarity. Murine leukemia viral RNase H is predicted to have extensive helical segments (22), whereas the HBV enzyme has only one (or possibly two) short helical segments (between residues 32 and 48). Although it is likely that the HBV RNase H will have a core structure similar to the E. coli, HIV, murine leukemia viral, and, by extension, many other RNase Hs, considerable differences must also exist. Whether such structural differences between RNase Hs can be exploited for the design of specific antiviral agents remains to be determined. However, the availability of this enzyme in pure form should facilitate the development of screening assays for the identification of potential inhibitors and for the detailed analysis of the structure and mechanism of the enzyme, which would be required for the rational design of such inhibitors.
FIG. 5. Poly(dC)·[3H]poly(rG) was used as the substrate for all reactions. All reaction mixtures contained 50 mM Tris-HCl, pH 8.0, and were performed at 37°C. The reaction was initiated by the addition of 0.2 μg of enzyme and allowed to react for 15 min. Trichloroacetic acid-insoluble counts were determined as described under "Materials and Methods." The values reported are relative to those obtained with the standard assay, which contains 4 mM DTT, 2 mM MgCl2, and no KCl. For determining the effect of DTT, the reaction mixture also contained 2 mM MgCl2. For determining the effect of MgCl2, the reaction mixture also contained 4 mM DTT. For determining the effect of KCl, the reaction mixture also contained 4 mM DTT and 2 mM MgCl2.
FIG. 6. Effect of pH on RNase H activity. Trichloroacetic acid-soluble radioactivity released from the substrate poly(dC)·[3H]poly(rG) was determined as described under "Materials and Methods." The reaction mixture contained 50 mM Tris-HCl, 4 mM DTT, and 2 mM MgCl2 and was adjusted to the pH indicated. The data indicate the cpm observed relative to that observed in the standard assay, which is performed at pH 8.0.