Direct interaction between uracil-DNA glycosylase and a proliferating cell nuclear antigen homolog in the crenarchaeon Pyrobaculum aerophilum



SUMMARY
Proliferating cell nuclear antigen (PCNA) acts as a sliding clamp on duplex DNA. Its homologs, present in Eukarya and Archaea, are part of protein complexes that are indispensable for DNA replication and DNA repair. In Eukarya, PCNA is known to interact with more than a dozen different proteins, including a human major nuclear uracil-DNA glycosylase (hUNG2) involved in immediate post-replicative repair. In Archaea, only three classes of PCNA binding proteins have been previously reported: replication factor C (the PCNA clamp loader), family B DNA polymerase (Pol B), and flap endonuclease (FEN). In this study we report a direct interaction between a uracil-DNA glycosylase (PaUDGa) and a PCNA homolog (PaPCNA1), both from the hyperthermophilic crenarchaeon Pyrobaculum aerophilum (Topt = 100°C). We demonstrate that the PaUDGa-PaPCNA1 complex is thermostable, and two hydrophobic amino acid residues on PaUDGa (Phe191 and Leu192) are shown to be crucial for this interaction. Although PaUDGa has homologs throughout the Archaea and Bacteria, it does not share significant sequence similarity with hUNG2. Nevertheless, our results raise the possibility that PaUDGa may be a functional analog of hUNG2 for PCNA-dependent post-replicative removal of misincorporated uracil.

INTRODUCTION
Proliferating cell nuclear antigen (PCNA) is essential for life. It is a processivity factor for DNA polymerase, forming a toroidal trimer that acts as a sliding clamp on duplex DNA (1-4). Its function requires another protein, the clamp loader replication factor C, to load it onto circular DNA (5-9). PCNA is present in eukaryotes, and its functional analog, the β subunit of DNA polymerase III holoenzyme, is present in bacteria (10,11). More than a dozen classes of eukaryotic PCNA binding proteins have been shown to interact with the PCNA sliding clamp, linking PCNA to several important biological processes beyond DNA replication, such as DNA repair and cell cycle regulation (12-18). In many cases PCNA binding partners interact with PCNA through a conserved motif, "Qxx(L/M/I)xx(F/Y/H)(F/Y)", that is usually located near either the amino or the carboxyl terminus. One important example of a eukaryotic PCNA binding protein involved in DNA repair is human major nuclear uracil-DNA glycosylase (hUNG2), which removes uracil from misincorporated dUMP residues in an immediate post-replicative process (19,20). hUNG2 interacts with PCNA through its PCNA binding site, 4-QKTLYSFF-11, which is located near its amino terminus.
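The consensus motif quoted above can be written as a regular expression. The following minimal Python sketch is an illustration only and is not part of the original study; the names PIP_CONSENSUS and matches_consensus are hypothetical, and "x" is assumed to stand for any residue.

```python
import re

# Consensus PCNA binding motif Qxx(L/M/I)xx(F/Y/H)(F/Y); "x" is treated as any residue.
# The name PIP_CONSENSUS is an illustrative assumption, not a term from the paper.
PIP_CONSENSUS = re.compile(r"Q..[LMI]..[FYH][FY]")


def matches_consensus(peptide: str) -> bool:
    """Return True if the whole eight-residue peptide fits the consensus."""
    return PIP_CONSENSUS.fullmatch(peptide) is not None


# hUNG2 PCNA binding site (residues 4-11) as quoted in the text.
print(matches_consensus("QKTLYSFF"))
```

Using fullmatch rather than search keeps the check strict: the peptide must be exactly the eight consensus positions, with no flanking residues.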
Recently PCNA sequence homologs have been identified in Archaea (21,22). So far, each of the 10 completely sequenced archaeal genomes contains at least one putative PCNA homolog (23). There is a distinction found between the two major subdomains of the Archaea, Crenarchaeota and Euryarchaeota (23). While each euryarchaeal genome tends to have one PCNA homolog, each crenarchaeal genome has two or three putative PCNA homologs (21,23,24). Biochemical studies have been conducted with several of the archaeal PCNA homologs, including a PCNA homolog from the euryarchaeote Pyrococcus furiosus and two PCNA homologs from the crenarchaeote Sulfolobus solfataricus (21,22). These studies have confirmed that all of them are processivity factors for their corresponding DNA polymerases.
In Archaea, in addition to the PCNA clamp loader (replication factor C), two classes of archaeal proteins have so far been identified as PCNA binding proteins, based on in vitro binding studies and crystal structure analysis. They are family B DNA polymerase (Pol B) (22) and flap endonuclease (FEN) (25-29), both proteins known to interact with PCNA in eukaryotes. The proposed putative PCNA binding motifs in these archaeal PCNA binding proteins are quite similar to the conserved PCNA binding motif identified in eukaryotic PCNA binding proteins (22,27). However, these putative PCNA binding sites have not been verified by mutation analysis.
In this study we report identification of another archaeal PCNA binding protein, Pyrobaculum aerophilum uracil-DNA glycosylase 1 (PaUDGa), and the biochemical confirmation of its interaction with PCNA via the PCNA binding motif. P. aerophilum is a hyperthermophile with an optimal growth temperature of 100°C, and a member of the crenarchaeal subdomain of Archaea (30). The biochemical characterization of PaUDGa's uracil-DNA glycosylase activity was recently published (31). Analysis of the complete genome sequence of P. aerophilum revealed two putative PCNA homologs (24), PaPCNA1 and PaPCNA2, as expected for a crenarchaeote (23). We demonstrate that PaUDGa preferentially binds to PaPCNA1, similar to two other P. aerophilum PCNA binding proteins, PaFEN and PaPol B3. PaUDGa's ability to bind to PaPCNA1 resembles the eukaryotic PCNA binding protein hUNG2, which belongs to a distinctly different UDG family due to low amino acid sequence similarity to PaUDGa. Our results raise the possibility that PaUDGa may be a functional analog of hUNG2 for PCNA-dependent post-replicative removal of misincorporated uracil.

Bacterial Expression Plasmids
P. aerophilum genomic DNA was prepared as described previously (32). The coding regions for PaUDGa (PAE0651, accession number AAL62921) and PaFEN (PAE0698, accession number AAL62961) were amplified by polymerase chain reaction (PCR) using P. aerophilum genomic DNA as template with their corresponding primer pairs synthesized by Invitrogen (Carlsbad, CA). The PCR products were cloned into a pCR2.1-TOPO vector using a TOPO TA cloning kit (Invitrogen). Primer information is available upon request.
The full-length PaUDGa gene was amplified by PCR using pCR2.1TOPOPaUDGa as template and cloned into a pGEX-2TK vector (Amersham Pharmacia Biotech, Piscataway, NJ) at the BamHI site to create a plasmid that expresses a fusion protein of glutathione S-transferase (GST) and PaUDGa. The full-length PaFEN gene was amplified by PCR using pCR2.1TOPOPaFEN as template and cloned into a pGEX-2TK vector (Amersham Pharmacia Biotech) at the EcoRI site to create a plasmid that expresses a GST-PaFEN fusion protein. Two PaPol B3 (PAE2109, accession number AAL63952) fragments containing the C-terminal region (C1, amino acid residues 612-785, and C2, amino acid residues 726-785) were amplified by PCR using P. aerophilum genomic DNA as template and cloned into a pGEX-2TK vector (Amersham Pharmacia Biotech) between the BamHI and EcoRI sites to create two plasmids that express the GST-PaPol B3 (C1, amino acids 612-785) and GST-PaPol B3 (C2, amino acids 726-785) fusions, respectively. The full-length PaPCNA1 gene (PAE3038, accession number AAL64629) was amplified by PCR using P. aerophilum genomic DNA and cloned into a pQE30 vector (Qiagen, Chatsworth, CA) between the BamHI and HindIII sites to create a plasmid that expresses N-terminal hexahistidine-tagged PaPCNA1. The full-length PaPCNA2 gene (PAE0720, accession number AAL62977) was amplified by PCR using P. aerophilum genomic DNA and cloned into a pQE30 vector (Qiagen) between the SphI and SalI sites to create a plasmid that expresses N-terminal hexahistidine-tagged PaPCNA2. The murine PCNA (33) was subcloned into a pBluescriptII KS vector (Stratagene, La Jolla, CA) between the BamHI and EcoRI sites. Subsequently the BamHI-HindIII fragment was subcloned into a pQE30 vector (Qiagen) between the BamHI and HindIII sites to create a plasmid that expresses N-terminal hexahistidine-tagged murine PCNA.
For thermostable binding assays, the BamHI-BamHI fragment of PaUDGa was subcloned into a pQE30 vector (Qiagen) to create a plasmid (pQE30PaUDGa) that expresses an N-terminal hexahistidine-tagged PaUDGa recombinant protein. The BamHI-HindIII fragment of PaPCNA1 was subcloned into a pQE60 vector (Qiagen) to create a plasmid (pQE60PaPCNA1) that expresses the native form of PaPCNA1 without the histidine tag.

Generation of PaUDGa and PaFEN mutants
DNA fragments encoding PaUDGa amino acids 1-182, 131-196, and 172-196 were amplified by PCR using pCR2.1TOPOPaUDGa as template. The products were subcloned into a pGEX-2TK vector (Amersham Pharmacia Biotech) at the BamHI site to create GST fusion protein expression plasmids.
The PaUDGa mutant F183A/F184A was generated by a standard site-directed mutagenesis procedure (34) using pCR2.1TOPOPaUDGa as template. The PCR products were cloned into a pGEX-2TK vector (Amersham Pharmacia Biotech) at the BamHI site to create an expression plasmid for GST-PaUDGa (F183A/F184A), and the mutations were verified by DNA sequencing using a SequiTherm EXCEL II DNA sequencing kit (Epicentre, Madison, WI).
The PaUDGa mutant F191A/L192A was generated by site-directed mutagenesis. The PCR products were first cloned into a pCR2.1-TOPO vector using a TOPO TA cloning kit (Invitrogen). The BamHI-BamHI fragment containing the full-length PaUDGa with two amino acid substitutions was subcloned into a pGEX-2TK vector (Amersham Pharmacia Biotech) at the BamHI site to create an expression plasmid for GST-PaUDGa (F191A/L192A), and the mutations were verified by DNA sequencing using a SequiTherm EXCEL II DNA sequencing kit (Epicentre).
The PaFEN mutant F345A/F346A was generated by site-directed mutagenesis. The PCR products were first cloned into a pCR2.1-TOPO vector using a TOPO TA cloning kit (Invitrogen). The EcoRI-EcoRI fragment containing the full-length PaFEN with two amino acid substitutions was subcloned into a pGEX-2TK vector (Amersham Pharmacia Biotech) at the EcoRI site to create an expression plasmid for GST-PaFEN (F345A/F346A), and the mutations were verified by DNA sequencing using a SequiTherm EXCEL II DNA sequencing kit (Epicentre).

Expression and partial purification of recombinant PCNA homologs
Overnight cultures of E. coli BL21/pREP4 harboring plasmid pQE30-PaPCNA1,
A native PaPCNA1 recombinant protein without the histidine tag was obtained for the thermostable binding assay by a protocol similar to that described above. E. coli strain BL21/pREP4 harboring plasmid pQE60PaPCNA1 was used for the preparation of cell lysates. Subsequently, a heat treatment was applied to the cell lysates to obtain partially purified recombinant PaPCNA1 in its native form without the histidine tag.

Thermostable binding assay
An N-terminal hexahistidine-tagged PaUDGa protein was expressed in an E. coli BL21/pREP4/pQE30PaUDGa strain, and the cell lysate was prepared by sonication in buffer B (50 mM sodium phosphate, pH 8.0, 300 mM NaCl, supplemented with 1 mM PMSF). To immobilize PaUDGa, Ni2+-NTA beads were added to the cell lysate and incubated at 4°C for 1 h. After extensive washing with buffer B, the Ni2+-NTA beads were transferred to buffer A. The Ni2+-NTA beads (60 µl) containing immobilized PaUDGa were mixed with the partially purified recombinant PaPCNA1 in its native form or with E. coli cell lysate containing the pQE30 vector alone after 70°C heat treatment. As a control for the experiment, a sample containing the partially purified PaPCNA1 and 60 µl Ni2+-NTA beads without immobilized PaUDGa was also prepared. After five washes with 0.8 ml buffer A at 4°C, the Ni2+-NTA beads were evenly split into two tubes. One tube was incubated in a 70°C water bath for 5 min with 0.8 ml buffer A pre-equilibrated at 70°C, while the other tube was kept on ice with 0.8 ml chilled buffer A. The Ni2+-NTA beads were pelleted at 1000 x g for 2 min. Bound proteins were released by boiling in SDS sample buffer, analyzed by SDS-PAGE, and visualized by Coomassie Brilliant Blue staining.

Expression and partial purification of two putative PCNA homologs from P. aerophilum
As a member of the archaeal subdomain Crenarchaeota, P. aerophilum contains two putative PCNA homologs, PaPCNA1 and PaPCNA2, identified by amino acid sequence homology (Fig. 1A, ref. 24). PaPCNA1 shares 24% amino acid sequence identity with PaPCNA2 (Fig. 1B). The amino acid sequence identities between the two putative PaPCNA homologs and the other characterized eukaryotic or archaeal PCNAs range from 17% to 26% (Fig. 1B). Both putative PaPCNA homologs contain the highly conserved (L/I)-A-P-(K/R) motif located near the carboxyl terminus, which may interact with the clamp loader, replication factor C (23). Phylogenetic analysis of PCNA homologs in Crenarchaea reveals that the multiple homologs fall into two classes, consistent with a duplication event early after the divergence of the crenarchaeal clade (23).
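Pairwise identity values such as those above are computed from an alignment. The sketch below shows one common convention, identical residues divided by the number of aligned columns in which neither sequence has a gap. The source does not state which alignment program or denominator was used, so this convention, the function name, and the toy sequences are all assumptions for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over gap-free columns of a pairwise alignment.

    Assumption: gaps are written as "-" and columns with a gap in either
    sequence are excluded from the denominator. Other conventions exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip columns that contain a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * identical / compared


# Toy alignment (hypothetical sequences, not PaPCNA1 or PaPCNA2).
print(percent_identity("MKV-LA", "MRVQLA"))  # 4 identical of 5 compared columns
```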
PaPCNA1 and PaPCNA2 were cloned into the bacterial expression vector pQE30 and expressed as N-terminal hexahistidine-tagged recombinant proteins in E. coli (Fig. 2A and 2B, lanes 3 and 5). To obtain partially purified recombinant PaPCNA homologs for in vitro binding assays, we took advantage of the heat stability characteristic of proteins from thermophiles and heated the E. coli crude cell lysates expressing each homolog to 70°C. The recombinant PaPCNA1 and PaPCNA2 were largely heat-stable after a 10-min heat treatment (Fig. 2A and 2B, lanes 4 and 6). A hexahistidine-tagged eukaryotic murine PCNA recombinant protein was also included in the experiment and was found to be heat labile under the conditions tested (Fig. 2A and

In vitro direct binding of PaPol B3 to PaPCNA1
P. aerophilum DNA polymerase B3 (PaPol B3) was tested for binding to recombinant PaPCNA1 and PaPCNA2. Analysis of the PaPol B3 protein sequence revealed that its C-terminal region contains a putative PCNA binding motif, "778-ERTLLDFF-785" (Fig. 3A), which is similar to the putative PCNA binding motifs of several archaeal Pol B homologs predicted by Ishino and his colleagues (22). Glutathione S-transferase (GST) fusion proteins were constructed for two overlapping fragments containing the carboxyl terminal region of PaPol B3 (C1, amino acids 612-785; C2, amino acids 726-785; Fig. 3A and 3B). In the "pull-down" affinity bead interaction assays, GST alone had no detectable binding to either of the PaPCNAs (Fig. 3C, lanes 2 and 6). However, binding to PaPCNA1 was detected with both of the GST-PaPol B3 fusions (Fig. 3C, lanes 3 and 4). In addition, weak binding to PaPCNA2 was also detected (Fig. 3C, lanes 7 and 8). The observed preferential binding of the C-terminal region of PaPol B3 to PaPCNA1 provides evidence for a direct in vivo interaction between PaPol B3 and PaPCNA1.

In vitro direct binding of PaUDGa and PaFEN to PaPCNA1
A GST fusion protein with PaUDGa was constructed for the study of in vitro direct binding of PaUDGa to two recombinant PaPCNA homologs using the pull-down affinity bead interaction assay (Fig. 4A, lane 3). PaFEN, which was predicted to be a PCNA binding protein based on previous studies (27), was also included for the binding experiment (Fig. 4A, lane 4).
While GST alone bound to neither PaPCNA (Fig. 4B, lanes 3 and 4), both GST-PaUDGa and GST-PaFEN bound to PaPCNA1 (Fig. 4B, lanes 5 and 7). Binding to PaPCNA2 was not detected with either GST fusion (Fig. 4B, lanes 6 and 8). Therefore, the observed binding of PaUDGa and PaFEN to PaPCNA1 provides evidence for direct in vivo interactions between PaUDGa and PaPCNA1 and between PaFEN and PaPCNA1. The effect of NaCl concentration on the formation of a complex between PaUDGa and PaPCNA1 was studied. Binding was detected in the presence of 0.05-0.4 M NaCl and was disrupted at NaCl concentrations higher than 0.4 M (Fig. 4C and 4D). The effect of temperature on the formation of the complex was also studied. When using a recombinant hexahistidine-tagged PaUDGa and a recombinant PaPCNA1 in its native form without any tag, binding was observed at 4°C and was largely retained after a 5-min treatment at 70°C (Fig. 4E, lanes 3 and 4). Under the same experimental conditions, the recombinant PaPCNA1 protein alone had no detectable binding to the Ni2+-NTA resin (data not shown). Binding of PaUDGa and PaFEN to the eukaryotic murine PCNA was also tested, but was undetectable under our experimental conditions (data not shown).

The PCNA interaction motif is located near the C-terminus of PaUDGa
The interaction between PaUDGa and PaPCNA1 was further verified by identifying the specific regions on PaUDGa required for PCNA binding. First, three GST fusion proteins that contain various regions of PaUDGa were constructed and tested for PaPCNA1 binding activity using the pull-down affinity bead interaction assay (Fig. 5A). These results demonstrate that the 25-amino acid region (residues 172-196) near the PaUDGa carboxyl terminus is required for PCNA binding.
Next, PaUDGa mutants with specific substitutions within the 25-amino acid region identified above were generated in order to identify amino acid residues critical for PCNA binding activity. Previous studies have shown that many PCNA binding proteins contain the consensus PCNA binding motif "Qxx(L/M/I)xx(F/Y/H)(F/Y)", and mutational analysis indicates that the two consecutive hydrophobic amino acid residues within this motif are involved in the interaction with PCNA (36). Analysis of the PaUDGa amino acid sequence revealed two putative PCNA binding motifs near the carboxyl terminus of PaUDGa (Fig. 5B). The sequence for the first region is 177-QKDLAMFF-184 (motif 1), which contains the eukaryotic PCNA binding consensus sequence "QxxLxxFF". The sequence for the second region is 185-GGGLDRFL-192 (motif 2), which contains two consecutive hydrophobic amino acid residues (Phe191 and Leu192) closer to the carboxyl terminus (Fig. 5B). Two PaUDGa mutants were generated for the pull-down binding assay, each with two amino acid changes in the putative PCNA binding motif 1 (F183A/F184A) or motif 2 (F191A/L192A). A PaFEN mutant carrying the F345A/F346A modification in its putative PCNA binding motif 339-TSSLDSFF-346 near its carboxyl terminus was also included in the experiment (Fig. 5B). As expected, the PaFEN mutant carrying F345A/F346A failed to bind to PaPCNA1 (Fig. 5C and 5D, lane 4).
The PaUDGa mutant carrying F183A/F184A in the putative binding motif 1 was still capable of binding to PaPCNA1 (Fig. 5C and 5D, lane 6). However, the binding was largely abolished in the PaUDGa mutant carrying F191A/L192A in the putative motif 2 (Fig. 5C and 5D, lane 7). These results show that Phe345 and Phe346 of PaFEN, and Phe191 and Leu192 of PaUDGa, are necessary for the binding of PaFEN and PaUDGa to PaPCNA1.
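The mutation notation used here (for example F191A/L192A) follows protein residue numbering. As an illustration only, the Python sketch below applies the two double substitutions to PaUDGa residues 177-192, built from the motif 1 and motif 2 sequences given above. The helper apply_mutations and the constant names are hypothetical and do not describe how the mutants were made in the laboratory.

```python
import re

# PaUDGa residues 177-192: motif 1 (177-184) followed by motif 2 (185-192),
# taken from the sequences quoted in the text.
REGION_START = 177
WILD_TYPE = "QKDLAMFF" + "GGGLDRFL"


def apply_mutations(seq: str, start: int, spec: str) -> str:
    """Apply substitutions such as 'F191A/L192A' using protein numbering.

    `start` is the residue number of seq[0]. Each substitution is checked
    against the wild-type residue before it is applied.
    """
    residues = list(seq)
    for item in spec.split("/"):
        m = re.fullmatch(r"([A-Z])(\d+)([A-Z])", item)
        if m is None:
            raise ValueError(f"unrecognized mutation: {item}")
        old, pos, new = m.group(1), int(m.group(2)), m.group(3)
        idx = pos - start
        if not 0 <= idx < len(residues):
            raise ValueError(f"position {pos} is outside the region")
        if residues[idx] != old:
            raise ValueError(f"expected {old} at {pos}, found {residues[idx]}")
        residues[idx] = new
    return "".join(residues)


motif1_mutant = apply_mutations(WILD_TYPE, REGION_START, "F183A/F184A")
motif2_mutant = apply_mutations(WILD_TYPE, REGION_START, "F191A/L192A")
print(motif1_mutant)  # QKDLAMAAGGGLDRFL
print(motif2_mutant)  # QKDLAMFFGGGLDRAA
```

Checking the wild-type residue at each position guards against off-by-one numbering errors, which are easy to make when a fragment does not start at residue 1.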

DISCUSSION
With the goal of finding new archaeal PCNA binding proteins, we searched the predicted protein sequences of the P. aerophilum genome with identified putative PCNA binding motifs using MACPATTERN 3.6 (37). In this way PaUDGa was identified as a putative PCNA binding protein. We carried out in vitro biochemical analysis of PaUDGa binding to two putative P. aerophilum PCNA homologs, PaPCNA1 and PaPCNA2, using the pull-down affinity bead interaction assay. Our results show that PaUDGa preferentially binds to PaPCNA1 to form a thermostable protein complex. The binding between PaUDGa and PaPCNA1 appears to be specific, as PaUDGa has no detectable binding to the murine PCNA. Two consecutive hydrophobic amino acid residues, Phe191 and Leu192, located near the carboxyl terminus of PaUDGa, are crucial for PCNA binding activity. Under the same experimental conditions, the interactions between PaPCNA1 and two other P. aerophilum proteins, PaFEN (wild type and mutant F345A/F346A) and the C-terminal region of PaPol B3, provide evidence for PaPCNA1 as a functional PCNA homolog in P. aerophilum. At this point we were unable to perform in vivo experiments to verify the physiological significance of the PCNA binding property of PaUDGa because of the current lack of a genetic system in P. aerophilum. However, identification of PaUDGa as an archaeal PCNA binding protein, together with the amino acid residues critical for its PCNA binding, is the first step toward revealing its biological significance in future genetic analyses.
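A motif search of this kind can be sketched in a few lines of Python. The example below is not the MACPATTERN procedure and does not reproduce the patterns the authors used, which the text does not list. It assumes the eukaryotic consensus Qxx(L/M/I)xx(F/Y/H)(F/Y) as the query, keeps only hits within a hypothetical 30-residue window of either terminus, and runs on made-up toy sequences.

```python
import re
from typing import Dict, List, Tuple

# Assumed query pattern: the eukaryotic consensus quoted in the Introduction.
# The lookahead lets overlapping matches be reported as well.
PATTERN = re.compile(r"(?=(Q..[LMI]..[FYH][FY]))")

Hit = Tuple[str, int, int, str]  # (protein name, start, end, matched peptide)


def scan_termini(proteins: Dict[str, str], window: int = 30) -> List[Hit]:
    """Report motif hits lying within `window` residues of either terminus.

    Positions are 1-based and inclusive. The 30-residue window is an
    illustrative choice, not a value taken from the paper.
    """
    hits: List[Hit] = []
    for name, seq in proteins.items():
        for m in PATTERN.finditer(seq):
            start = m.start() + 1
            end = start + len(m.group(1)) - 1
            if start <= window or end > len(seq) - window:
                hits.append((name, start, end, m.group(1)))
    return hits


# Toy input: hypothetical sequences with glycine/alanine filler.
toy_proteome = {
    "hypothetical_1": "M" + "G" * 50 + "QKDLAMFF" + "GGGLDRFLKRSA",  # near C terminus
    "hypothetical_2": "M" + "G" * 40 + "QKDLAMFF" + "G" * 60,        # internal only
    "hypothetical_3": "MAS" + "QKTLYSFF" + "G" * 60,                 # near N terminus
    "hypothetical_4": "M" + "A" * 80,                                # no motif
}

for hit in scan_termini(toy_proteome):
    print(hit)
```

A strict consensus like this one would miss sequences such as motif 2 of PaUDGa (GGGLDRFL), so any computational hit or miss still needs the kind of binding and mutation experiments described above.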
The PCNA binding activity observed in PaUDGa from this study and in hUNG2 from previous studies (19) raises the possibility that PaUDGa may be a functional analog of hUNG2.
PaUDGa and hUNG2 belong to two distinct UDG families due to the low sequence similarity between them (31,38). In addition, PaUDGa is an olive green colored protein (31)  Biochemical experiments will be required to test these binding motifs and to determine whether the remaining archaeal homologs also bind PCNA.

ACKNOWLEDGEMENT
This work was supported by USHHS Institutional National Research Service Award

Figure legend (fragment): The asterisks mark amino acid residues corresponding to those forming the hydrophobic pocket in human PCNA (17,23).