Molecular Cloning of the Human Gene, PNKP, Encoding a Polynucleotide Kinase 3′-Phosphatase and Evidence for Its Role in Repair of DNA Strand Breaks Caused by Oxidative Damage*

Mammalian polynucleotide kinases catalyze the 5′-phosphorylation of nucleic acids and can have associated 3′-phosphatase activity, predictive of an important function in DNA repair following ionizing radiation or oxidative damage. The sequences of three tryptic peptides from a bovine 60-kDa polypeptide that correlated with 5′-DNA kinase and 3′-phosphatase activities identified human and murine dbEST clones. The 57.1-kDa conceptual translation product of this gene, polynucleotide kinase 3′-phosphatase (PNKP), contained a putative ATP binding site and a potential 3′-phosphatase domain with similarity to L-2-haloacid dehalogenases. BLAST searches identified possible homologs in Caenorhabditis elegans, Schizosaccharomyces pombe, and Drosophila melanogaster. The gene was localized to chromosome 19q13.3–13.4. Northern analysis indicated a 2-kilobase mRNA in eight human tissues. A glutathione S-transferase-PNKP fusion protein displayed 5′-DNA kinase and 3′-phosphatase activities. PNKP is the first gene for a DNA-specific kinase from any organism. PNKP expression partially rescued the sensitivity to oxidative damaging agents of the Escherichia coli DNA repair-deficient xth nfo double mutant. PNKP gene function restored termini suitable for DNA polymerase, consistent with in vivo removal of 3′-phosphate groups, facilitating DNA repair.

Because of its widespread presence in mammalian cells, the acidic pH optimum PNK is likely to be a key enzyme in DNA metabolism, and its biochemical functions immediately suggest a role in the critical process of DNA repair. One of its enzymatic activities, DNA 3′-phosphatase, implies an ability to repair strand breaks terminated by 3′-phosphate, a type of DNA damage seen in cells treated with ionizing radiation or hydrogen peroxide (16). Removal of this 3′-end blocking lesion allows synthesis by DNA polymerase and joining of nicks by DNA ligase. DNA purified from irradiated thymocytes and irradiated thymus, but not DNA irradiated in vitro, contains strand breaks with 5′-OH termini (17,18). The 5′-phosphorylation activity of the SNQI-PNK enzyme suggests a possible model in which 5′-OH termini are repaired prior to ligation. 5′-OH termini in DNA also occur in ischemia in rat brain (19), after cleavage by nucleases with the appropriate specificity such as DNase II (20), and as intermediates during topoisomerase cleavage (21,22). The highest concentration of 5′-DNA termini occurs during DNA replication, and Pohjanpelto and Hölttä (23) proposed that a small fraction of Okazaki fragments contain 5′-OH termini; this fraction decreases upon incubation of extracts with ATP at pH 6.0, which was inferred to reflect 5′-phosphorylation by a cellular PNK.
Despite extensive biochemical studies, to date there are no molecular reagents such as antibodies or cDNAs available for mammalian PNKs, hampering further investigation. We present here the molecular cloning of the PNKP gene, the first gene for a mammalian PNK and the first gene for a DNA-specific kinase from any organism. Concomitantly, the PNKP gene also represents the second gene for a mammalian DNA 3′-phosphatase; the previously described human APE/HAP1 AP endonuclease has a weak 3′-phosphatase activity (24,25). Using Escherichia coli as a model biological system, we report the first evidence for participation in DNA repair in vivo by the PNKP gene product.

EXPERIMENTAL PROCEDURES
Protein Purification-In studies of bovine thymus, we observed multiple activities that 5′-phosphorylate oligo(dT)25 (8). The SNQI-PNK fraction was named on the basis of fractionating into the supernatant after a Polymin P precipitation and eluting first among the activities from the supernatant on a Q-Sepharose column (8). The polypeptide of approximately 60 kDa (SNQI-PNK) that correlated with 5′-DNA kinase and 3′-phosphatase activities assayed as described (6,8) was purified to near-homogeneity from the thymus glands of 6-month-old calves (6). Briefly, the purification scheme involved preparation of a crude cell extract; precipitation of nucleic acids and some acidic proteins with Polymin P (Sigma Aldrich, Oakville, Ontario, Canada; Ref. 26); then sequential chromatography on Q-Sepharose (Amersham Pharmacia Biotech, Baie d'Urfe, Quebec, Canada), SP Sepharose, Blue Sepharose (all obtained from Amersham Pharmacia Biotech), DNA cellulose (Sigma), and Superose 12 (Amersham Pharmacia Biotech) columns (method 1). If the order of the DNA cellulose and Superose 12 columns was reversed, three polypeptides were observed, one of which was similar in estimated molecular mass to the active enzyme as observed in renaturation gel activity experiments (method 2).
Amino Acid Sequence Analysis of the Purified Protein-Protein concentration was measured using the method of Bradford (27) with a commercially obtained reagent (Pierce). Protein (2 μg) from a method 2 purification was electrophoresed through a 10% SDS-polyacrylamide gel (28). The gel was briefly stained with Coomassie Blue R-250, and the band at around 60 kDa was excised for tryptic digestion. Tryptic peptides were purified by HPLC, and the sequences were analyzed by mass spectrometry using a Finnigan LCQ ion trap mass spectrometer at the Harvard Microchemistry Facility (Cambridge, MA). In addition, near-homogeneous protein (0.2 μg) from a method 1 preparation was further purified by SDS-polyacrylamide gel electrophoresis, and HPLC-purified tryptic peptides were subjected to sequencing both by Edman degradation and mass spectrometry. The isobaric pairs of amino acid residues I/L, Mr = 113, and Q/K, Mr = 128, could not be unambiguously differentiated in mass spectrometric sequencing.
Identification and Isolation of cDNAs-Peptide sequences of suitable length were used to search the dbEST section of GenBank using the BLAST algorithm (29). Further overlapping EST clones were identified in subsequent searches. I.M.A.G.E. consortium clones were ordered from ATCC (Rockville, MD) or GenomeSystems (St. Louis, MO). DNA was prepared using Wizard Minipreps (Fisher, Laval, Quebec, Canada) or Mini or Midi Preps (Qiagen, Mississauga, Ontario, Canada), and the identity was confirmed by DNA sequencing at Sheldon Biotechnology (McGill University, Montreal, Quebec, Canada). Clone 32798 (human) was fully sequenced on both strands using an overlapping primer walk strategy. Additional human cDNA clones were obtained independently by screening of a cDNA library from a lymphoblastoid cell line cloned into pREP4 (Ref. 30; kindly provided by Dr. Manuel Buchwald, University of Toronto, Toronto, Ontario, Canada) using a probe that was prepared by PCR from the 5′ end of the 32798 clone. The PCR primers (U221, 5′-GCGTATGCGGAAGTCAAACC-3′; and L400, 5′-TCGGAGCTTACGGGGAATCT-3′) also generated a signal of about 220 base pairs in the cDNA library, confirming the presence of the cDNA. Positive colonies were detected by colony hybridization using the 220-base pair fragment labeled with the RediPrime kit (Amersham Pharmacia Biotech) and [α-32P]dCTP (ICN, Mississauga, Ontario, Canada) with a modification (31) of the procedure in Sambrook et al. (32). The cDNA sequence of the 5′ end of the gene was obtained by PCR from the aforementioned pREP4 cDNA library, using the pREP4 sequencing primer (30) as an anchored primer together with the L400 primer.
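As an illustration only, the basic properties of the two PCR primers quoted in this section can be checked with a few lines of Python. The sketch below treats each primer as a contiguous 20-mer and uses the Wallace rule of thumb, Tm ≈ 2(A+T) + 4(G+C), which is an assumption introduced here for a rough estimate and is not a calculation described in this work; the function names are likewise hypothetical.

```python
# Rough primer sanity check. The Wallace rule is a rule of thumb for short
# oligonucleotides; it is an assumption of this sketch, not a method from the paper.

PRIMERS = {
    "U221": "GCGTATGCGGAAGTCAAACC",  # upper primer, written 5' to 3'
    "L400": "TCGGAGCTTACGGGGAATCT",  # lower primer, written 5' to 3'
}


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in the oligonucleotide."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Approximate melting temperature in degrees C by the Wallace rule."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), "nt", gc_percent(seq), "% GC", wallace_tm(seq), "C")
```

Under these assumptions the two primers come out matched in length, GC content, and estimated Tm, which is what one would hope for in a primer pair; a nearest-neighbor model would give different absolute temperatures.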
Southern Blot Analysis-A Southern blot containing 4 μg of EcoRI-digested DNA from nine eukaryotic species was obtained from CLONTECH (Palo Alto, CA). The blot was prehybridized for 5 h at 65°C in ExpressHyb (CLONTECH) according to the directions of the manufacturer. A KpnI/HindIII fragment from a pcDNA 3.1/His C construct containing the full-length PNKP cDNA was used as a probe. Hybridization was for 16 h at 65°C. The blot was washed quickly four times with 2× SSC, 0.05% SDS, three times for 10 min at room temperature with 2× SSC, 0.05% SDS, and twice for 20 min each at 65°C with 0.1× SSC.
Chromosomal Localization-A bacterial artificial chromosome clone was identified by hybridization screening of the RPCI-11 human genomic library (33) with the HindIII/NotI insert from I.M.A.G.E. clone 32798. This genomic clone (429D20) was then used for mapping by fluorescence in situ hybridization (FISH; Refs. 34 and 35) at the MRC Genome Resource Facility.
Northern Blot Analysis-A Northern blot of 2 μg of poly(A)+ RNA extracted from eight different human tissues (Human Multiple Tissue Northern blot II, CLONTECH) was hybridized to a cDNA probe from the middle portion of the coding sequence (HindIII/PstI fragment of 32798) and to a probe including the 3′ portion of the PNKP cDNA (32798 digested with HindIII and NotI). A GAPDH cDNA probe was used as a loading control. Probes were labeled as described above, and prehybridization and hybridization were carried out with ExpressHyb (CLONTECH). The blot was prehybridized for 2 h at 68°C and hybridized for 3 h at 68°C. Washing protocols were as described in Sambrook et al. (32). Between hybridizations, the previous probe was stripped by heating at 95-100°C for 10 min in 0.1× SSC, 0.5% SDS. The filter was then exposed to x-ray film for at least 96 h to confirm the absence of prior signals.
Expression of I.M.A.G.E. Consortium Clone 32798 as a Fusion Protein in E. coli (GST-PNKP) and Assay of Enzymatic Activity-DNA from clone 32798, a 1.45-kb insert with an open reading frame of 452 amino acids comprising residues 69-521 of the full-length PNKP conceptual translation cloned into the Lafmid BA vector, was cleaved with HindIII. The 5′ terminus of the gene fragment was treated with Klenow polymerase and dNTPs to generate a blunt end, after which the DNA was cleaved with NotI and purified from an agarose gel. DNA from the vector pGEX-4T-3 (Amersham Pharmacia Biotech) was cleaved with SmaI and NotI. The insert and vector were incubated with T4 DNA ligase (Canadian Life Technologies, Burlington, Ontario, Canada), and the ligation mixture was used to transform DH5α-competent cells (Life Technologies, Inc.). Plasmid DNA was prepared (Promega Wizard kit, Fisher, Town of Mount Royal, Quebec, Canada), and the DNA sequence of the pGST-PNKP construct was verified by sequencing. Expression of the fusion protein in exponentially growing BL21 cells was induced with 1 mM isopropyl-1-thio-β-D-galactopyranoside. A freeze-thaw method with buffer consisting of 50 mM Tris-HCl, pH 7.5, 30 mM NaCl, 0.5 mM dithiothreitol, 0.5 mM EDTA, and a mixture of protease inhibitors (aprotinin, leupeptin, chymostatin, Nα-p-tosyl-L-lysine chloromethyl ketone, and phenylmethylsulfonyl fluoride) was used to lyse the cells. The GST protein or the GST-PNKP protein was purified on glutathione-Sepharose 4B (Amersham Pharmacia Biotech) according to the instructions of the manufacturer. Purified protein (130 ng) from the cell extracts expressing GST or GST-PNKP 69-521 was assayed for PNK activity with oligo(dT)25 at pH 5.5 (8) and for 3′-phosphatase activity using the procedure of Cameron and Uhlenbeck (6, 10). The same procedure was followed to purify GST or GST-PNKP expressed in BW528, with slightly lower yields of GST-PNKP.
Preparation of Antiserum-A 14-mer peptide, EPRLGRLYCQFSEG, comprising the C terminus of the conceptual translation product of the PNKP gene was synthesized and conjugated to keyhole limpet hemocyanin. The conjugated peptide was then used to produce a rabbit polyclonal antiserum, termed AC-IV, using a standard inoculation protocol. This work was carried out at Research Genetics (Huntsville, AL).
Immunoblot Analysis-Protein samples were electrophoresed through 10% SDS-polyacrylamide gels (28) and transferred to a nitrocellulose membrane using a Bio-Rad MiniTransblot apparatus as recommended by the manufacturer. Immunoblots were carried out using the ECL kit (Amersham Pharmacia Biotech) according to the directions of the manufacturer, with an anti-rabbit horseradish peroxidase-conjugated secondary antibody (Amersham Pharmacia Biotech). In some experiments, alkaline phosphatase activity was used for detection (36). Antibodies against glutathione S-transferase were purchased from Amersham Pharmacia Biotech. Secondary antibodies conjugated to alkaline phosphatase were from Jackson Laboratories (West Grove, PA).
Gradient Plate Assay-Cells were grown overnight in 1 ml of Luria broth in the presence of 50 μg/ml ampicillin. Cells were replicated onto various Luria broth agar drug gradient plates, which were prepared as described in detail elsewhere (37,38).

... 37.0 MBq) was added to the reaction mixture to a specific activity of 1260 cpm/pmol. The reaction was started when the samples were immersed into a 37°C water bath. At the indicated time, 40-μl samples were withdrawn and added to tubes containing 200 μl of 0.1 M sodium pyrophosphate and 1 mg/ml bovine serum albumin, followed by the addition of 200 μl of 0.8 M trichloroacetic acid, mixed, and placed on ice for 10 min. The samples were processed on a 12-hole filtration apparatus (Millipore, Bedford, MA) using GF/C circle filters (Whatman). The trapped DNA was washed three times with 3 ml of 0.1 M sodium pyrophosphate, briefly rinsed with ethanol, air-dried, and counted with 5 ml of scintillation fluid (BCS, Amersham Pharmacia Biotech). In the case of chromosomal DNA pretreated with purified endonuclease IV, 10 ng of the enzyme was incubated with the DNA for 20 min at 37°C in 10 μl of endonuclease reaction buffer (25 mM Hepes-KOH, pH 7.6, 50 mM KCl, 1 mg/ml bovine serum albumin). Endonuclease IV was heat-inactivated at 70°C for 3 min.
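For readers who wish to relate scintillation counts to the amount of nucleotide incorporated, the conversion implied by the stated specific activity (1260 cpm/pmol) is a simple division. The following minimal sketch illustrates it; the function name and the optional background-subtraction parameter are hypothetical additions and are not part of the procedure described above.

```python
# Convert acid-precipitable counts to pmol of labeled nucleotide, using the
# specific activity quoted in the text. The background term is an assumption
# added for illustration; the text does not describe background subtraction.

SPECIFIC_ACTIVITY_CPM_PER_PMOL = 1260.0


def cpm_to_pmol(cpm: float,
                specific_activity: float = SPECIFIC_ACTIVITY_CPM_PER_PMOL,
                background_cpm: float = 0.0) -> float:
    """Return pmol of nucleotide corresponding to a filter count in cpm."""
    if specific_activity <= 0:
        raise ValueError("specific activity must be positive")
    net = max(cpm - background_cpm, 0.0)  # never report negative incorporation
    return net / specific_activity


if __name__ == "__main__":
    # Hypothetical count, chosen only to show the arithmetic.
    print(cpm_to_pmol(12600.0))
```

The value returned refers to the aliquot that was counted; scaling to the whole reaction would additionally require the ratio of reaction volume to the 40-μl sample withdrawn.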

RESULTS
Identification of dbEST Clones Containing Peptide Sequence from the Bovine DNA Kinase SNQI-PNK-Two bovine peptide sequences were found from analysis of method 2 purified material, I/LVI/LFTN[Q/K]MGI/LGR (peptide 1), and I/LI/LYI/LEI/L(PR) (peptide 2), where I/L or Q/K symbolizes an isobaric amino acid, the brackets indicate an ambiguous residue assigned according to the sequence found in a dbEST hit, and the parentheses indicate residues assigned with uncertainty. The peptide 1 sequence identified murine and human cDNAs in the dbEST data base in searches using the BLAST algorithm (29). A human cDNA, clone 32798, contained the peptide 2 sequence in its longest open reading frame upon conceptual translation. This cDNA clone, with an insert of 1.45 kb, was fully sequenced. In the analysis of method 1 purified material, a peptide of sequence GPI/LI/LTQ/KVTDR (peptide 3) was determined by mass spectrometry. A murine cDNA, clone 598211, contained peptide 3 in its longest ORF in EST sequence from the 5′ end (GenBank accession no. AA162545), indicating that peptide 3 probably mapped close to the N terminus of the protein.
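The isobaric ambiguity in the peptide notation lends itself to a simple pattern search. As a minimal sketch, and not the BLAST procedure actually used here, each I/L or Q/K position can be turned into a regular-expression character class and the pattern searched against a translated sequence. The helper name and the toy protein sequence below are hypothetical and serve only to show the mechanics.

```python
import re

# Sketch: expand the peptide notation used in the text (X/Y = isobaric pair,
# brackets and parentheses = uncertain assignments) into a regular expression.
# This illustrates the ambiguity only; it does not reproduce a BLAST search.


def peptide_to_regex(notation: str) -> str:
    """Convert e.g. 'GPI/LI/LTQ/KVTDR' into 'GP[IL][IL]T[QK]VTDR'."""
    s = re.sub(r"[\[\]() ]", "", notation)  # drop uncertainty markers
    parts = []
    i = 0
    while i < len(s):
        if i + 2 < len(s) and s[i + 1] == "/":
            parts.append(f"[{s[i]}{s[i + 2]}]")  # either residue is allowed
            i += 3
        else:
            parts.append(s[i])
            i += 1
    return "".join(parts)


PEPTIDE_1 = "I/LVI/LFTN[Q/K]MGI/LGR"
PEPTIDE_3 = "GPI/LI/LTQ/KVTDR"

if __name__ == "__main__":
    pattern = peptide_to_regex(PEPTIDE_1)
    toy_protein = "MKLVIFTNQMGLGRAA"  # hypothetical sequence, not PNKP
    hit = re.search(pattern, toy_protein)
    print(pattern, hit.start() if hit else None)
```

An exact pattern of this kind tolerates no substitutions, so a species difference at an unambiguous position would cause it to miss a true homolog; a similarity search is the more forgiving choice.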
Assembly of the Composite Full-length Human cDNA Sequence-Another cDNA clone (clone 27) was obtained by screening a human lymphoblastoid cell line cDNA library in pREP4 (30) using a colony hybridization protocol. Other clones were generated by anchored PCR (Fig. 1A). The combined DNA sequence information from all of the available cDNA clones allowed the inference of the sequence of the full-length cDNA (Fig. 1B). The gene was given the name PNKP, for polynucleotide kinase 3′-phosphatase.
FIG. 1. A, cloning strategy and cDNAs available. As described under "Experimental Procedures," a human dbEST clone from an infant brain cDNA library (32798, 1.45-kilobase pair insert) was obtained by BLAST searches using the sequence of a 13-mer tryptic peptide, and sequenced using an overlapping primer walking strategy. An additional clone (pREP4 clone 27) of about the same size was identified in a human lymphoblastoid cell line cDNA library in the vector pREP4. The latter library was the source for 5′ clones generated by anchored PCR (TA13, TA21, TA26). These clones were sequenced in their entirety using the M13 forward and M13 reverse primers. B, sequence of composite cDNA. The DNA sequence and several features are shown. Underlined are the presumed ATG start codon, the first in-frame stop codon, and the putative poly(A) binding site.

Features of the PNKP Protein-The PNKP gene encodes a polypeptide of 521 amino acids, with a predicted molecular mass of 57,148 daltons. Salient features of the protein include a consensus nucleotide binding site, GFPGAGKS, at residues 372-379 (40). Such a nucleotide binding site is also found in the T4 polynucleotide kinase peptide sequence (41), although in T4 PNK it maps near the N terminus at residues 9-17. Several motifs found in the L-2-haloacid dehalogenase superfamily (42, 43), also observed in the T4 polynucleotide kinase sequence (41), are present in the PNKP translation product (Fig. 2, double underline). Motif 1 includes an aspartate residue (171) and a threonine residue (175), motif 2 includes threonine 217, and motif 3 comprises aspartates 283 and 289. These motifs are likely to be involved in the 3′-phosphatase activity of the PNKP protein. Biochemical studies of the rat liver enzyme indicated a nuclear localization (1, 44), although we used whole cell extracts for our purification. There is no obvious candidate for a nuclear localization sequence, but there is a grouping of four basic residues, RKKK at 301-304, that might be involved in nuclear localization (45). All three of the bovine peptides were observed in the conceptual translation product; only one difference (a glycine in peptide 1 instead of a serine in the human conceptual translation at residue 221) in sequence was apparent, although this analysis cannot take into account the possibility of differences at positions of isobaric amino acids.
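The nucleotide binding site noted above, GFPGAGKS, fits the widely used P-loop (Walker A) pattern G-x(4)-G-K-[S/T]. Naming that pattern is an interpretation added here; the text itself refers only to a consensus nucleotide binding site. A minimal sketch of how such a motif could be located in a protein sequence follows, with a hypothetical function name and a toy test sequence rather than the PNKP sequence.

```python
import re

# Sketch: scan a protein sequence for the P-loop pattern G-x(4)-G-K-[ST].
# Equating this pattern with the consensus cited in the text is an assumption.

P_LOOP = re.compile(r"G.{4}GK[ST]")


def find_p_loops(protein: str):
    """Return (start, end, motif) tuples using 1-based, inclusive positions."""
    hits = []
    for m in P_LOOP.finditer(protein.upper()):
        hits.append((m.start() + 1, m.end(), m.group()))
    return hits


if __name__ == "__main__":
    toy = "AAAGFPGAGKSTT"  # hypothetical sequence carrying the reported motif
    print(find_p_loops(toy))
```

The pattern is eight residues long, consistent with the span 372-379 reported for PNKP. Because the pattern is short and degenerate, chance matches are expected in large sequence sets, so a hit is suggestive rather than conclusive.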

Conservation during Evolution as Detected by BLAST Searches of the GenBank Data Base and Southern Analysis-There is a high degree of homology between the murine and human PNKP genes as seen in dbEST, at least 70% identical amino acids. GenBank data base searches using the BLAST algorithm (29) were conducted with the PNKP conceptual translation product in order to identify similar genes. Caenorhabditis elegans (F21D5) and Schizosaccharomyces pombe (c23c11) clones with scores of 159 and 111 and E values of 2e-66 and 3e-43, respectively, representing possible homologs, were retrieved from the NR data base (Fig. 3). The C. elegans cosmid F21D5 gene contained two regions of similarity with 46% and 47% identical residues. These regions spanned amino acids 160-504 of the human PNKP protein and contained both the putative phosphatase and nucleotide binding site motifs. As well, the S. pombe chromosome 1 gene had two segments with scores over 80, from 147 to 267 in hPNKP (46% identical or conserved amino acids) and from 268 to 464 in hPNKP (50% identical or conserved amino acids) and spanned both putative domains. A Drosophila dbEST (LPO5621) clone with a score of 140 and expected value of 5e-32 was identified, with 45% identical residues. A ClustalW alignment of human PNKP polypeptide with the three related polypeptides, indicating the identical or conserved residues, is shown in Fig. 3. In Saccharomyces cerevisiae, BLAST searches revealed a protein (YMR156c) with a lower score (43, E value 0.014). T4 PNK is retrieved by a PSI-BLAST search using the PNKP peptide sequence within 4 iterations, with an expected value of 1e-38. There are two regions of similarity, one with 20% identity and 34% conserved or identical residues, and one with 16% identity and 32% conserved or identical residues (data not shown). A eukaryotic viral gene with homology to T4 PNK and T4 RNA ligase is encoded by the ORF86 gene product of Autographa californica nucleopolyhedrovirus (46). In PSI-BLAST searches, this ORF had a region of similarity from residue 361 to residue 484 of PNKP, with 22% identical and 35% conserved or identical residues. The protein has not yet been demonstrated to have any enzymatic activity (47).
The availability of cDNA for the PNKP gene prompted us to see if this gene is conserved across the various eukaryotic species. Southern analysis of EcoRI-digested genomic DNA from multiple species probed with the full-length cDNA for PNKP (Fig. 4) indicated a strong hybridization signal in lanes containing mammalian DNA, and a weak signal in DNA from chicken. Reproducible bands were observed in S. cerevisiae DNA. These data suggest that the PNKP gene is conserved among mammals and as far as avian species (chicken). Taken together, the results from the BLAST searches and the Southern analysis provide ample evidence that the exons of the PNKP gene have been highly conserved during evolution, implying an important function in many species.
Localization of the PNKP Gene in the Human Genome-It was important to investigate the chromosomal localization of the PNKP gene to ascertain if it corresponded to the position of any potential disease gene and to initiate a study of the genomic organization of the PNKP locus. Genomic clone 429D20 was isolated by hybridization screening and used for fluorescence in situ hybridization (Fig. 5). Panel A indicates the fluorescence results, and panel B the overall chromosomal staining. The gene maps to chromosome 19q13.3-13.4 as indicated in the ideogram (Fig. 5, panel C). The results of the screening for genomic clones (data not shown) and the chromosomal localization (Fig. 5A) indicate that there are no other closely related loci in the human genome.
Expression of the PNKP Gene in Human Tissues-Since biochemical studies had indicated presence of acid pH optimum DNA kinase activity in many tissues (reviewed in Ref. 1), we were interested to investigate tissue-specific gene expression, which was studied in eight human tissues using a commercially obtained Northern blot. A message of 2 kb was observed in all of the tissues (Fig. 6A). The size of this mRNA would be sufficient to encode the 60-kDa polypeptide found in preparations of SNQI-PNK. The strongest intensity of this signal was observed in spleen and in testis (Fig. 6A, lanes 1 and 4, respectively), and the weakest in small intestine (Fig. 6A, lane 6). In order to confirm whether each lane of the blot contained an equal amount of RNA, the blot was stripped and probed with glyceraldehyde-3-phosphate dehydrogenase (GAPDH) cDNA. The result is shown in Fig. 6B; lane 6 (small intestine) was observed to be underloaded. Normalization to GAPDH levels indicated about 3-fold higher expression in spleen and testis (data not shown). Despite the highly stringent washing conditions, a second signal of about 7.5 kb with highest intensity in spleen was also observed, either with an internal probe (Fig. 6A) or a probe containing the 3′ region of the cDNA (data not shown). The intensity of this signal relative to the 2-kb signal varied somewhat depending on the tissue source. The genomic library screen and the chromosomal localization suggest that this signal also arises from the PNKP locus, since there were no data indicating genomic clones of a highly related gene that cross-hybridized with PNKP clones. Currently, it is not known whether this signal may represent an alternatively spliced transcript; we have at present characterized no cDNA clones that indicated that an alternative transcript might be present. Another possibility is that this larger transcript represents retained intron sequences. There is also a weak band appearing between 2.4 and 4.4 kb.
FIG. 4. Conservation of exons of the PNKP gene analyzed by Southern blotting. A Southern blot containing 4 μg of DNA from nine species digested with EcoRI was probed with the full-length PNKP cDNA. All genomic DNAs were isolated from kidney tissue, except human DNA was isolated from placental tissue, while chicken DNA was isolated from liver tissue.

Immunoblot Analysis of Bovine PNKP-The AC-IV antiserum against the C terminus of the conceptual translation product of human PNKP produced a signal of slightly below 60 kDa in partially purified (SP Sepharose step; Ref. 6) bovine SNQI-PNK preparations (Fig. 7A, lane 1), while no signal was apparent with the same dilution of preimmune serum (Fig. 7A, lane 2). This figure demonstrates that an antiserum produced against a conceptual translation product with features expected in a polynucleotide kinase identifies a polypeptide of a similar size to the purified protein. Importantly, in samples from the same stage of the SNQI-PNK purification, renaturation gel activity studies demonstrated an active polypeptide of about 60 kDa (6). These results also show that the cDNA sequence obtained probably accounts for the size of the purified mammalian protein. Also evident is the conservation of the epitope(s) recognized by the antiserum between the bovine and human proteins.
Expression of the PNKP Gene Product as a 5′-DNA Kinase and 3′-Phosphatase in E. coli-A plasmid, pGST-PNKP, was constructed by fusing in-frame the clone 32798 next to the C-terminal end of GST in the vector pGEX-4T-3. The expression construct consisted of amino acids 69-521 of the PNKP conceptual translation. When the plasmid was introduced into strain E. coli BL21, it produced a polypeptide that migrated with an apparent molecular mass of 80 kDa, which is consistent with the predicted mass of the fusion protein. Protein from cells expressing pGST-PNKP and from control cells expressing the empty vector pGEX-4T-3 was purified on glutathione-Sepharose 4B and analyzed by SDS-polyacrylamide gel electrophoresis and immunoblotting. As expected, the AC-IV antibodies directed against the C terminus of the conceptual translation product of the PNKP gene detected the GST-PNKP fusion protein (Fig. 7B, lane 4). A signal of about 80 kDa was detected, reflecting the anticipated size of the fusion protein.
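The agreement between the observed 80-kDa band and the predicted fusion mass can be checked with back-of-the-envelope arithmetic. The sketch below scales the predicted mass of full-length PNKP (57,148 Da for 521 residues) to the 452-residue open reading frame of clone 32798 and adds an assumed GST tag of about 26 kDa. The tag mass is an outside approximation, linker residues are ignored, and none of this reproduces a calculation reported in the text.

```python
# Rough estimate of the GST-PNKP fusion mass from numbers quoted in the text.
# GST_TAG_DA is an assumed approximate value and is not taken from the paper.

PNKP_FULL_LENGTH_DA = 57148.0   # predicted mass of the 521-residue protein
PNKP_FULL_LENGTH_RESIDUES = 521
FRAGMENT_RESIDUES = 452         # open reading frame length quoted for clone 32798
GST_TAG_DA = 26000.0            # assumption: approximate mass of the GST moiety


def estimated_fusion_mass_kda(fragment_residues: int = FRAGMENT_RESIDUES,
                              tag_da: float = GST_TAG_DA) -> float:
    """Scale the full-length mass to the fragment and add the tag (kDa)."""
    mean_residue_da = PNKP_FULL_LENGTH_DA / PNKP_FULL_LENGTH_RESIDUES
    return (fragment_residues * mean_residue_da + tag_da) / 1000.0


if __name__ == "__main__":
    print(round(estimated_fusion_mass_kda(), 1))
```

Under these assumptions the estimate falls in the mid-70-kDa range, which is compatible with an apparent mass of about 80 kDa given the usual uncertainty of SDS-polyacrylamide gel sizing.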
The bacterially expressed GST-PNKP was tested for DNA kinase function using oligo(dT)25 as a substrate. Protein from crude extracts of E. coli expressing pGEX-4T-3 contained no detectable DNA kinase activity, while 5′-phosphorylation of the oligonucleotide substrate was observed in protein from crude extracts expressing pGST-PNKP (data not shown). This finding led us to test purified GST-PNKP and corresponding amounts of control GST protein for DNA kinase activity. In Fig. 8A, there was again no detectable activity in control samples (lanes 1-3), but the DNA kinase activity of GST-PNKP protein was clearly detectable (lanes 4-6). The specific activity was increased compared with the bovine SNQI-PNK preparations at the final step of purification (data not shown). The 3′-phosphatase activity of GST-PNKP was investigated using a TLC assay monitoring conversion of 5′ [32P]Tp to 5′ [32P]TOH (6,10). The region of the autoradiogram showing 5′ [32P]TOH is displayed in Fig. 8B. As seen with T4 polynucleotide kinase run as a positive control, the GST-PNKP (lanes 5-7) functioned as a 3′-phosphatase. No detectable 3′-phosphatase activity was observed in reactions containing control GST protein (lanes 2-4). The GST-PNKP had a comparable specific activity to the bovine SNQI-PNK preparation at the final step of the purification (data not shown).
Human PNKP Confers Resistance to Some DNA Damaging Agents in E. coli Lacking 3′-Phosphodiesterase-Several oxidants are known to engender genetic instability by inducing DNA single strand breaks (ssb) with blocked 3′-termini that prevent DNA repair synthesis (25,48). For example, H2O2 and the antitumor drug bleomycin produce ssb bearing 3′-phosphate and 3′-phosphoglycolate, respectively. These blocked 3′-termini are typically removed by enzymes with 3′-phosphodiesterase activity, thereby permitting DNA repair synthesis to proceed. In E. coli, exonuclease III and endonuclease IV constitute the major 3′-phosphodiesterases that process the 3′-blocking groups in damaged DNA. In addition, these enzymes each possess an apurinic/apyrimidinic (AP) endonuclease activity that hydrolyzes AP sites produced indirectly by alkylating agents, such as methyl methane sulfonate (MMS). Thus, E. coli mutants lacking both exonuclease III and endonuclease IV display marked hypersensitivity to both oxidative agents and MMS (Fig. 9, A, C, and D; Ref. 39). We predicted that if the apparent 3′-phosphatase activity of PNKP indeed plays a role in DNA repair, then this enzyme should restore some oxidant resistance to E. coli strain BW528 (xth nfo) that is deficient in both exonuclease III (xth) and endonuclease IV (nfo). Plasmid pGST-PNKP and its control empty vector pGEX-4T-3 were introduced into BW528 cells. Purification of the fusion protein from a glutathione affinity column, and subsequent confirmation of its 3′-phosphatase activity, revealed that GST-PNKP was actively expressed in strain BW528 as an 80-kDa polypeptide with a similar specific activity as obtained from BL21 cells (data not shown). When GST alone was purified from the affinity column, no 3′-phosphatase activity was detected, again similar to the data from BL21 cells (data not shown).
We next examined whether the expressed GST-PNKP is capable of restoring resistance to DNA damaging agents to strain BW528 by using a gradient plate assay. In this assay, cells deficient in DNA repair grow only a short distance into the gradient of increasing chemical concentration, as compared with cells that are proficient in DNA repair (39). The plasmid pGST-PNKP restored to strain BW528 partial resistance to the chemical oxidants H2O2 and tert-butylhydroperoxide (tBH, Fig. 9, panel A, bar 7 and panel C, bar 5, respectively). No drug resistance was conferred to strain BW528 either by the empty vector pGEX-4T-3 (Fig. 9, panel A, bar 6; panels C and D, bar 4), or by a plasmid carrying the human LIG1 gene fused to GST (Fig. 9, panel A, bar 8). These latter observations preclude the possibility that the GST domain itself is contributing to the enhanced drug resistance of strain BW528 harboring plasmid pGST-PNKP. In control experiments, and as previously reported (39), the plasmid pNfo, which actively expresses bacterial endonuclease IV, also restored drug resistance to strain BW528 (Fig. 9, panels A and C, bar 5 and bar 3, respectively). It was important to determine whether overproduction of GST-PNKP conferred additional resistance to wild type AB1157; this was not the case (Fig. 9, panels A and B, bar 1 (AB1157/pGEX-4T-3) compared with bar 2 (AB1157/pGST-PNKP)). We further tested whether the partial restoration of drug resistance to strain BW528 conferred by pGST-PNKP is specific for oxidative agents. A gradient plate assay was performed on cells treated with the alkylating agent MMS. Surprisingly, pGST-PNKP also conferred MMS resistance to strain BW528, but not to the same extent as pNfo (Fig. 9D; bar 5 versus bar 3). One possible interpretation of this finding is that the endogenous AP lyases, such as endonuclease III, formamidopyrimidine-DNA glycosylase, and/or endonuclease VIII, may cleave the AP sites to generate 3′-blocked termini, which are then further processed by the 3′-phosphatase activity of GST-PNKP (see "Discussion"). It should be noted that strain BW528 is no more sensitive than the parental strain AB1157 to the DNA damaging agent formaldehyde (Fig. 9B), which produces DNA lesions other than strand breaks with blocked 3′-termini and AP sites. Moreover, the expressed GST-PNKP conferred no additional formaldehyde resistance to strain BW528 (Fig. 9B). From these data, it would appear that the drug resistance conferred by pGST-PNKP to strain BW528 may be due to the enzyme's ability to process DNA lesions.

... lanes 4-6) was incubated in a standard DNA kinase assay with oligo(dT)25 substrate. The autoradiogram was exposed for 3 h, and the bands were excised for quantitation of DNA kinase activity. B, 3′-phosphatase activity. A 3′-phosphatase assay employed 5′ [32P]Tp as substrate. Reactions were monitored by TLC and autoradiography. The portion of the autoradiogram corresponding to the 5′ [32P]TOH product is shown. For quantitation, the corresponding area of the TLC plate was excised and counted by liquid scintillation. Lane 1, T4 polynucleotide kinase (positive control); lanes 2-4, purified control GST protein (130 ng); lanes 5-7, purified GST-PNKP (130 ng).
Human PNKP Acts in Vivo to Process H2O2-induced 3′-Blocking DNA Lesions-To directly test whether PNKP acts in vivo to remove 3′-blocking DNA lesions at ssb, we examined whether chromosomal DNA isolated from H2O2-treated cells could sustain in vitro DNA repair synthesis by E. coli DNA polymerase I (39). Three exponentially growing strains, BW528/pNfo, BW528/pGEX-4T-3, and BW528/pGST-PNKP, were either untreated or treated with 25 mM H2O2 for 1 h; the chromosomal DNA was immediately isolated from each strain and examined for the extent of [methyl-3H]dTMP incorporation by DNA polymerase I. Chromosomal DNA isolated from any of the untreated cells showed virtually no incorporation of [methyl-3H]dTMP by DNA polymerase I (Fig. 10, A-C, open circles). In contrast, H2O2-damaged chromosomal DNA isolated from strain BW528/pNfo showed a substantial level, at least a 30-fold increase, of [methyl-3H]dTMP incorporation (Fig. 10A, closed circles). The incorporation of [methyl-3H]dTMP was directly dependent on the in vivo processing of the damaged DNA by the 3′-phosphodiesterase activity of endonuclease IV, as no incorporation was observed into H2O2-damaged DNA derived from strain BW528 carrying only the vector pGEX-4T-3 (Fig. 10B, closed circles). However, if the damaged DNA from strain BW528/pGEX-4T-3 was pretreated with purified endonuclease IV, the extent of [methyl-3H]dTMP incorporation was greatly enhanced and reached the same level as damaged DNA derived from strain BW528/pNfo (Fig. 10, A and B, closed squares). The most striking observation was the incorporation of [methyl-3H]dTMP into the H2O2-damaged DNA derived from strain BW528 harboring pGST-PNKP (Fig. 10C, closed circles). This finding can only be explained if the 3′-phosphatase activity of PNKP acts in vivo to process H2O2-induced DNA lesions.
It is noteworthy, however, that the extent of [methyl-3H]dTMP incorporation into the damaged DNA derived from BW528/pGST-PNKP was only 65% of the level incorporated with H2O2-damaged DNA obtained from strain BW528/pNfo (Fig. 10, A and C). Endonuclease IV pretreatment of the H2O2-damaged DNA derived from strain BW528/pGST-PNKP permitted an additional 30% of [methyl-3H]dTMP incorporation by DNA polymerase I (Fig. 10C). This latter finding suggests that the level of GST-PNKP in strain BW528 may be insufficient to process all the H2O2-induced DNA lesions. Nonetheless, the level of label incorporation into the H2O2-damaged DNA was unchanged if the level of GST-PNKP was increased by expression from the isopropyl-1-thio-β-D-galactopyranoside-inducible lac promoter of the pGEX-4T-3 vector (data not shown). It is certainly possible that PNKP may be unable to access and/or repair all the H2O2-induced DNA lesions in vivo (see "Discussion"), thus accounting for the partial drug resistance conferred by GST-PNKP to strain BW528.

FIG. 9. Resistance to DNA damaging agents conferred by human PNK in the DNA repair-deficient E. coli strain BW528 (xth nfo). In panels A and B the numbers represent the following strains: 1-3, AB1157 (Xth+ Nfo+) harboring the vector pGEX-4T-3 and the plasmids pNfo and pGST-PNKP, respectively; 4-8, BW528 (xth nfo) harboring the vector pBluescript S/K, pNfo, pGEX-4T-3, pGST-PNKP, and pGST-LIG1, respectively. For panels C and D, the strains are numbered as follows: 1, AB1157; 2-5, BW528 harboring the plasmids pBluescript S/K, pNfo, pGEX-4T-3, and pGST-PNKP, respectively. The initial amount of drug in the bottom layer of the gradient was 75 μmol of H2O2 (panel A), 0.01% formaldehyde (HCHO, panel B), 3.9 μmol of tBH (panel C), and 0.4 mmol of MMS (panel D). The concentration increases in a gradient from left to right. Photographs were taken after cells were incubated overnight at 37°C. Complementation was scored as complete restoration of growth to that observed in the wild type strain; partial complementation was scored as increased growth compared with the mutant strain.

DISCUSSION

The identification of a cDNA encoding a human PNKP represents the first mammalian polynucleotide kinase and the second mammalian 3′-phosphatase to be cloned. Importantly, this gene is the first DNA-specific kinase from any organism to be characterized at the molecular level. This discovery has allowed significant progress toward an understanding of its role in DNA metabolism, particularly in DNA repair of damage by oxidative damaging agents.
We conclude that we have obtained a gene encoding a polynucleotide kinase 3′-phosphatase on the basis of the 5′-DNA kinase and 3′-phosphatase activities of the GST-PNKP construct expressed in E. coli. The observation of 3′-phosphatase activity of the PNKP gene expressed in E. coli BW528, which lacks significant endogenous 3′-phosphatase activity that might have copurified with GST-PNKP, is incontrovertible evidence that the two activities are encoded by the same gene. Sequence similarity to T4 PNK in certain key motifs also supports our conclusion that a PNK gene has been cloned. Furthermore, antibodies raised against the PNKP peptide sequence recognize in immunoblots a bovine polypeptide of the size of the active DNA kinase detected in renaturation gel activity assays (6).
The PNKP gene product contains a motif found in adenine nucleotide-binding proteins (40), GFPGAGKS, at residues 372-379. There is a corresponding conserved motif, GCPGSGKS, at positions 9-16 in the T4 polynucleotide kinase peptide sequence (41). The importance of this conserved region for the polynucleotide kinase activity of the T4 enzyme correlates with mapping of the kinase domain to the N terminus (49). We identified in the T4 PNK peptide sequence (6) a series of three motifs that have been found to be important in numerous studies of proteins in the L-2-haloacid dehalogenase superfamily (42,43). In the PNKP gene product, a particularly important residue in catalysis is predicted to be aspartate 171, which is the conserved residue found to be critical for function of L-2-haloacid dehalogenase (50) and observed to be phosphorylated when human phosphomannomutase, also a member of the superfamily, is incubated with substrate (51). Interestingly, in the T4 sequence these motifs map to the C-terminal half of the protein (6), but in the mammalian PNKP protein sequence the motifs are centrally located between residues 171 and 290. The C. elegans and S. pombe genes that scored high in BLAST searches contained the putative 3′-phosphatase and nucleotide binding domains in the same order as in the PNKP protein.
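The comparison of the two nucleotide-binding motifs can be checked with a short script. The Python sketch below is illustrative only: it assumes that the motif cited from ref. 40 corresponds to the widely used Walker A (P-loop) consensus G-x(4)-G-K-[ST], and the names `WALKER_A`, `MOTIFS`, and `identical_positions` are hypothetical helpers, not part of the original analysis.

```python
import re

# Assumption: the adenine nucleotide-binding motif of ref. 40 is the
# Walker A (P-loop) consensus G-x(4)-G-K-[ST].
WALKER_A = re.compile(r"G.{4}GK[ST]")

# Motif sequences and inclusive residue ranges as quoted in the text.
MOTIFS = {
    "human PNKP": ("GFPGAGKS", 372, 379),
    "T4 PNK": ("GCPGSGKS", 9, 16),
}


def identical_positions(a, b):
    """Count aligned positions carrying the same residue (ungapped)."""
    if len(a) != len(b):
        raise ValueError("motifs must have equal length")
    return sum(x == y for x, y in zip(a, b))


for name, (seq, start, end) in MOTIFS.items():
    # The quoted residue range should match the motif length.
    assert end - start + 1 == len(seq)
    fits = WALKER_A.fullmatch(seq) is not None
    print(f"{name}: {seq} ({start}-{end}), fits consensus: {fits}")

shared = identical_positions(MOTIFS["human PNKP"][0], MOTIFS["T4 PNK"][0])
print(f"identical positions: {shared} of 8")
```

Under this assumption both motifs fit the consensus, and they agree at 6 of the 8 aligned positions, differing only at the second and fifth residues.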
Thus, the organization of the putative active domains of the T4 PNK and PNKP proteins seems to differ considerably, with the relative locations of the putative PNK and 3′-phosphatase motifs switched. The observation that a GST fusion construct containing amino acids 69-521 of the conceptual translation product was active in E. coli was not entirely unexpected, since this construct contained 87% of the coding sequence, including the putative phosphatase motifs at residues 171-289 and the putative nucleotide binding site at residues 372-379 that would be anticipated to be critical for the 3′-phosphatase and 5′-kinase activities, respectively.
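The 87% figure quoted above follows from simple residue arithmetic. The Python sketch below reproduces it under one stated assumption: that the full-length conceptual translation product is 521 residues long, which is implied by the construct ending at residue 521 but is not stated outright in this section. All names in the code are illustrative.

```python
# Assumption: the full-length product has 521 residues
# (the GST fusion construct ends at residue 521).
FULL_LENGTH = 521
CONSTRUCT = (69, 521)  # residues fused to GST, as stated in the text

# Putative motif ranges quoted in the text (inclusive).
MOTIF_RANGES = {
    "3'-phosphatase motifs": (171, 289),
    "nucleotide binding site": (372, 379),
}


def span_length(start, end):
    """Number of residues in an inclusive range."""
    return end - start + 1


def is_within(inner, outer):
    """True if the inclusive range `inner` lies entirely inside `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


covered = span_length(*CONSTRUCT) / FULL_LENGTH
missing = (CONSTRUCT[0] - 1) / FULL_LENGTH

print(f"construct covers {covered:.1%} of the product")
print(f"N-terminal portion absent: {missing:.1%}")
for name, motif in MOTIF_RANGES.items():
    print(f"{name} retained: {is_within(motif, CONSTRUCT)}")
```

Under this assumption the construct spans 453 of 521 residues, which rounds to 87%, and both putative motif regions fall inside the expressed fragment.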
The PNKP gene maps to chromosome 19q13.3-13.4, a region of the genome rich in well-characterized genes, including POLD1 at 19q13.3-13.4 (52), and LIG1 at 19q13.2-q13.3 (53, 54) among those involved in DNA metabolism. DNA repair genes that localize proximally to PNKP include ERCC1 at 19q13.2-q13.3 (54), ERCC2/XPD at 19q13.2-q13.3 (55), and XRCC1 at 19q13.2 (54,56). A search of the OMIM data base (57) revealed no obvious candidate human disease mapping to this area of the genome; however, since many human genetic diseases have not been mapped, a contribution of PNKP to human disease burden cannot be ruled out. This part of chromosome 19 is involved in certain translocations found in malignancies (57). In addition, loss of heterozygosity of this portion of chromosome 19 has been reported in some neoplasms (58-62).
Preliminary gene expression studies revealed two major signals upon Northern analysis of samples of human poly(A)+ RNA, corresponding to sizes of 2 and 7.5 kb. The cDNA sequences obtained correspond well to the size of the smaller of the two messages. In the tissues that we examined, testis and spleen displayed greater amounts of the 2-kb message when the results were normalized to expression of the GAPDH gene. The larger message is of unknown biological significance; it was detectable with a probe containing the 3′ end of the cDNA and a probe from the interior of the cDNA. The genomic DNA library screening results and FISH mapping results are strongly indicative of one gene. The larger message may be an alternatively spliced transcript, although dbEST clones analyzed so far do not support this notion, or perhaps may represent retained intron sequences.
A potential physiological role in DNA repair has been suggested repeatedly from biochemical studies of mammalian DNA kinase. Findings from in vitro reconstitution experiments with synthetic substrates containing 3′-phosphate and/or 5′-OH termini support the notion that polynucleotide kinase 3′-phosphatases can function in DNA repair (4,63). Another correlation with DNA repair is that the first human DNA 3′-phosphatase to be cloned is the APE/HAP1 gene, which has a firmly established DNA repair function supported by many biochemical and genetic studies. The APE/HAP1 gene is a member of the exonuclease III family, which together with the endonuclease IV family includes enzymes detected in E. coli and eukaryotes that are able to repair 3′-phosphate damage at DNA strand breaks (25,64). These proteins have multiple functions, including Type II AP endonuclease activity, 3′-phosphodiesterase activity, and sometimes an exonuclease activity. Biochemical studies of APE/HAP1 function (25) indicate that it is unlikely to be responsible for repair of all 3′-blocking residues arising from radiation or oxidative damage to DNA. Moreover, fractionation of mammalian cells suggests the presence of multiple 3′-phosphodiesterase activities (24,65).
To investigate the role of PNKP in DNA repair in vivo, we performed heterologous complementation experiments. Rescue of a mutant phenotype in another species can often provide compelling evidence for a physiological role. We took advantage of the availability of E. coli with vastly reduced 3′-phosphodiesterase activity, xth nfo mutants deficient in AP endonuclease activity and 3′-phosphodiesterase activity (37). We performed experiments testing sensitivity to the DNA damaging agents hydrogen peroxide and tBH. As expected, the BW528 (xth nfo) cells were highly sensitive, and overexpression of the bacterial nfo gene provided resistance. Our discovery that overexpression of pGST-PNKP, but not an empty vector or pGST-hLIG1, partially overcame the sensitivity to these oxidative DNA damaging agents but not to formaldehyde used in control experiments, supports a role for PNKP in DNA repair in living cells. Importantly, isolated DNA from H2O2-treated cells expressing pGST-PNKP, but not cells expressing the empty vector, pGEX-4T-3, was a better substrate for DNA polymerase. This reflects in vivo DNA repair and shows that the partial mutant rescue was correlated with events at the DNA level rather than some other effect on cellular response to H2O2.
The ability of PNKP to confer partial resistance to the alkylating agent MMS to strain BW528 can be explained if endogenous AP lyases, i.e. formamidopyrimidine-DNA glycosylase, endonuclease III, or endonuclease VIII, cleave the MMS-induced AP sites to produce blocked 3′-termini, such as 3′-phosphate. Alternatively, the PNKP enzyme might directly cleave the AP sites. This latter possibility is supported by the finding that GST-PNKP purified from AP endonuclease-deficient strain BW528 weakly cleaves a substrate with a centrally located AP site (66). Experiments are in progress to determine if the enzyme acts as an AP lyase or as a hydrolytic AP endonuclease that directly produces 3′-hydroxyl termini, which are compatible with DNA repair synthesis. The inability of PNKP to fully restore drug resistance to strain BW528 is not entirely surprising, since the enzyme is not in its natural environment. In fact, expression of the yeast homologue of endonuclease IV, Apn1, in strain BW528 also only partially substitutes for endonuclease IV (39). We cannot completely exclude the possibilities that the N-terminal 13% of the polypeptide may be required for full complementation or that the GST domain may interfere with the enzyme's ability to process DNA lesions. Furthermore, in mammalian cells, PNKP may require accessory factors, which are lacking in E. coli, to efficiently repair damaged DNA.
This work provides support for a crucial function of the 3′-phosphatase activity of the PNKP gene product in repair of oxidative DNA damage in mammalian cells. Importantly, this type of damage arises endogenously during normal cellular metabolism, and its repair is essential for cellular survival. The role of the 5′-DNA kinase activity of the gene product may also reside in DNA repair, restoring 5′-OH termini arising during the life of the cell to 5′-P termini suitable for ligation. The PNKP gene product may also participate in DNA replication.
The molecular reagents reported here, together with the appropriate genetic model systems that we have now identified, will allow further detailed analysis of the biological implications of juxtaposition of 5′-DNA kinase and 3′-phosphatase activities in the same enzyme.