Relating structure to function in phi29 DNA polymerase.

Bacteriophage φ29 DNA polymerase, the product of viral gene 2, was originally characterized as a protein involved in the initiation of φ29 DNA replication based on both in vivo (1) and in vitro (2-4) studies. The cloning of gene 2 (5), the overproduction and purification of its product (6), and the development of an in vitro system for complete φ29 DNA replication (7) allowed the characterization of protein p2 as the viral DNA replicase (8). This monomeric enzyme, with a molecular mass of only about 66 kDa, catalyzes two distinguishable synthetic reactions: 1) DNA polymerization, like any other DNA-dependent DNA polymerase, with insertion discrimination values ranging from 10⁴ to 10⁶ and with an efficiency of mismatch elongation 10⁵- to 10⁶-fold lower than that of a properly paired primer terminus (9); 2) terminal protein (TP) deoxynucleotidylation, which consists of the formation of a covalent linkage (phosphoester) between the hydroxyl group of a specific serine residue (Ser232) in φ29 TP and 5′-dAMP, requires the presence of divalent metal ions, and is strongly stimulated by the presence of the viral DNA replication origins. By means of this reaction, in which the TP acts as a primer, φ29 DNA polymerase catalyzes the initiation step of φ29 DNA replication (5,8). In addition to the synthetic activities, φ29 DNA polymerase has two degradative activities: 1) pyrophosphorolysis, the reversal of polymerization, whose physiological significance is still unclear (10); 2) 3′-5′-exonuclease, shown to be involved in a proofreading function (11,12). This activity, kinetically characterized using ssDNA as substrate and Mg²⁺ as metal activator (13), processively degrades DNA substrates longer than six nucleotides, the catalytic constant being 500 s⁻¹. When the DNA length is reduced below 4-6 nucleotides, the φ29 DNA polymerase-ssDNA complex dissociates at a rate of 1 s⁻¹.
The multiple enzymatic activities of φ29 DNA polymerase (summarized in Table I) allow this enzyme to be the only polymerase involved in the replication of the φ29 genome (7,14). Moreover, the enzyme has two intrinsic properties: high processivity (>70 kilobases) and strand displacement ability (15). Based on this enzymatic potential, complete replication of both DNA strands can proceed continuously from each terminal priming event, without the need for synthesis of RNA-primed Okazaki fragments and without the participation of accessory proteins and DNA helicases. The efficiency of the protein-primed initiation reaction is in part guaranteed by the previous formation of a heterodimer between TP and DNA polymerase (16), whereas the nucleotide specificity, as in normal DNA polymerization, is dictated by the DNA template (17).

A "Sliding-back" Mechanism to Initiate TP-primed DNA Replication
It has been shown that φ29 DNA polymerase does not start replication at the first base of the genome but uses the second position from the 3′-end of the template for the initial base pairing and formation of the corresponding TP-dAMP complex at each DNA end. The DNA ends (telomeres) are recovered by a specific mechanism, called "sliding-back," that is based on a 3′-terminal repetition of two T residues. This reiteration permits, prior to DNA elongation, the asymmetric translocation of the initiation product, TP-dAMP, so that it pairs with the first T residue (18). The fact that TP-containing genomes, from either viruses or linear plasmids (7), contain some kind of sequence repetition at their ends supports the hypothesis that the "sliding-back" mechanism could be a common feature of protein-primed replication systems (18). This proposal has been demonstrated in the case of bacteriophages PRD1 from Escherichia coli and Cp1 from Streptococcus pneumoniae and in adenovirus. PRD1 DNA polymerase initiates replication at the fourth nucleotide of the terminal 3′-CCCC repetition (19) and Cp1 DNA polymerase at the third nucleotide of the terminal 3′-TTT repetition, the end being recovered in both cases by a "stepwise sliding-back" mechanism. Adenovirus DNA polymerase initiates replication at the fourth nucleotide of the terminal repetition 3′-GTAGTA, followed by a preelongation step that produces TP-CAT, the end being recovered by a "jumping-back" step (20).
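The positional logic of sliding-back can be illustrated with a short Python sketch. It is a toy model under stated assumptions, not a description of the enzymology: the function name is hypothetical, the bases written after each terminal repeat are placeholders rather than real genome sequence, the stepwise case is traced as single-position moves without modelling any nucleotide additions between steps, and the adenovirus "jumping-back" variant is not covered.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def initiate_and_slide_back(template_3to5, start_pos):
    """Toy model of protein-primed initiation at an internal template position.

    template_3to5 -- template strand written 3'->5' (position 1 is the 3' end)
    start_pos     -- 1-based template position that directs the first nucleotide

    Returns (nucleotide linked to TP, template positions occupied by the
    TP-dNMP product while it is moved back to position 1).
    """
    if not 1 <= start_pos <= len(template_3to5):
        raise ValueError("start_pos lies outside the template")

    start_base = template_3to5[start_pos - 1]
    first_nucleotide = COMPLEMENT[start_base]

    # The product can only re-pair further back if every position from the
    # 3' end up to the start position carries the same base (the repetition).
    terminal_repeat = template_3to5[:start_pos]
    if set(terminal_repeat) != {start_base}:
        raise ValueError("no terminal repetition: the product cannot re-pair")

    positions = list(range(start_pos, 0, -1))
    return first_nucleotide, positions


# Terminal repeats come from the text; the bases after them are placeholders.
examples = {
    "phi29": ("TTCATC", 2),    # two T residues, initiation at position 2
    "PRD1": ("CCCCATGA", 4),   # 3'-CCCC, initiation at position 4
    "Cp1": ("TTTGCA", 3),      # 3'-TTT, initiation at position 3
}

for name, (template, start) in examples.items():
    nucleotide, path = initiate_and_slide_back(template, start)
    print(f"{name}: TP-d{nucleotide}MP formed at position {start}, path {path}")
```

The check on the terminal repeat makes explicit why sequence reiteration at the ends matters: without it, the initiation product would have no complementary base to pair with after translocation.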

Structural Mapping of the Enzymatic Activities of φ29 DNA Polymerase
The C-terminal Domain of φ29 DNA Polymerase
DNA Polymerization: Our structure-function studies of φ29 DNA polymerase started when we found three regions of significant amino acid similarity shared with DNA polymerases of eukaryotic origin. Interestingly, these segments of similarity served to identify putative DNA polymerases encoded by linear plasmids from eukaryotic organisms and are also present in the DNA polymerase from bacteriophage T4 (21). In good agreement with this novel eukaryotic affiliation, φ29 DNA polymerase and T4 DNA polymerase were shown to be sensitive to specific inhibitors of eukaryotic DNA polymerase α such as aphidicolin, phosphonoacetic acid, butylanilino-dATP, and butylphenyl-dGTP (21,22). These three regions, located in the C-terminal portion of each polypeptide (see Fig. 1A), contain the amino acid motifs "DX₂SLYP," "KX₃NSXYG," and "YXDTDS." The results obtained by site-directed mutagenesis at these three motifs of φ29 DNA polymerase (23-26) support the proposal that these three segments, corresponding to motifs A, B, and C, form an evolutionarily conserved polymerization active site in several groups of nucleic acid-synthesizing enzymes (27). Afterward, more detailed amino acid sequence comparisons, facilitated by the increasing number of DNA polymerase sequences available, allowed definition of additional conserved regions and motifs belonging to the C-terminal portion of the eukaryotic type superfamily (28,29), whose general conservation among other polymerase families is not clear at present. Two of these motifs, "TX₂GR" and "KXY," have also been studied by site-directed mutagenesis in φ29 DNA polymerase (30,31). The mutational analysis demonstrated that the C-terminal two-thirds of the φ29 DNA polymerase polypeptide constitutes the polymerization domain, containing sites for interaction with the metal activator, dNTPs, and DNA (see Fig. 1B).
Thus, three aspartate residues, invariant in all members of the eukaryotic type superfamily, were implicated in metal binding and catalysis at the polymerization active site (23,25). These φ29 DNA polymerase residues, Asp249, belonging to motif "DX₂SLYP," and Asp456 and Asp458, belonging to motif "YXDTDS," are predicted to form a metal binding tripod, analogous to that formed by Pol I residues Asp705, Asp882, and Glu883, by human immunodeficiency virus reverse transcriptase residues Asp110, Asp185, and Asp186 (32), and by Pol β residues Asp190, Asp192, and Asp256 (33,34). In addition, φ29 DNA polymerase residue Arg438, forming part of the "TX₂GR" motif, was also proposed to play a role in catalysis of the polymerization reaction (30). Three tyrosine residues, invariant or highly conserved in the eukaryotic type superfamily, were identified as directly or indirectly involved in interaction with dNTPs: Tyr254 (motif "DX₂SLYP" (24,25)), Tyr390 (motif "KX₃NSXYG" (24,26)), and Tyr454 (motif "YXDTDS" (23)). Several defects, such as an increased Km for dNTPs, instability of the incorporated dNTPs, altered sensitivity to dNTP analogs, and reduced selection of the correct dNTPs, could be measured during either DNA polymerization or TP-primed initiation reactions. Tyr254 and Tyr390 were also implicated in nucleotide binding selection, thus playing a crucial role in the fidelity of DNA replication (35). Eight residues, invariant or highly conserved in the C-terminal domain of the eukaryotic type superfamily, have been implicated in binding template-primer structures (see Fig. 1B): Ser252 (motif "DX₂SLYP" (25)), Asn387, Gly391, and Phe393 (motif "KX₃NSXYG" (26)), Thr434 and Arg438 (motif "TX₂GR" (30)), and Lys498 and Tyr500 (motif "KXY" (31)).
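The motif shorthand used above (X for any residue, a number after X for a run of unspecified residues) maps directly onto regular expressions, which is one way such motifs can be located in a sequence. The Python sketch below is illustrative only: the function names are hypothetical, the subscript counts are written as plain digits (for example "DX2SLYP"), the bracketed form "(F/Y)" for alternative residues is assumed to follow the same convention, and the test peptide is invented rather than taken from any polymerase.

```python
import re


def motif_to_regex(motif):
    """Translate motif shorthand (e.g. 'DX2SLYP', 'NX2-3(F/Y)D') into a regex."""
    # Alternative residues at one position: (F/Y) -> [FY]
    pattern = re.sub(
        r"\(([A-Z](?:/[A-Z])+)\)",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        motif,
    )

    # Unspecified residues: X -> '.', X2 -> '.{2}', X2-3 -> '.{2,3}'
    def expand_gap(m):
        low, high = m.group(1), m.group(2)
        if low is None:
            return "."
        if high is None:
            return ".{" + low + "}"
        return ".{" + low + "," + high + "}"

    return re.sub(r"X(?:(\d+)(?:-(\d+))?)?", expand_gap, pattern)


def find_motif(sequence, motif):
    """Return 1-based start positions of every (possibly overlapping) match."""
    lookahead = "(?=" + motif_to_regex(motif) + ")"
    return [m.start() + 1 for m in re.finditer(lookahead, sequence)]


def fixed_positions(motif, first_residue):
    """Number the specified residues of a fixed-length motif.

    Only handles motifs without ranges or alternatives (e.g. 'DX2SLYP').
    """
    positions = {}
    number = first_residue
    for token in re.finditer(r"X(\d+)?|[A-Z]", motif):
        text = token.group(0)
        if text.startswith("X"):
            number += int(token.group(1) or 1)  # skip the unspecified residues
        else:
            positions[number] = text
            number += 1
    return positions


print(motif_to_regex("KX3NSXYG"))
print(find_motif("MADGKSLYPQ", "DX2SLYP"))  # invented test peptide
print(fixed_positions("DX2SLYP", 249))      # Asp249 is the first motif residue
```

Numbering "DX2SLYP" from Asp249 places Ser at 252 and Tyr at 254, and numbering "YXDTDS" from Tyr454 places the two aspartates at 456 and 458, which agrees with the residue assignments given in the text.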
Structural Mapping of Processive Synthesis by φ29 DNA Polymerase: A flexible subdomain of Klenow, which closes the "primer cleft" once the DNA is bound to it (36), was proposed to be mainly responsible for the extent of processivity required by Pol I, a repair enzyme. However, the high processivity required for DNA replication is generally achieved by association of the catalytic subunit with accessory proteins that reduce the rate of dissociation of the enzyme from the DNA, relative to translocation and further nucleotide addition. The fact that φ29 DNA polymerase is highly processive in the absence of any accessory protein suggests that this enzyme must have specific binding subdomains involved in processivity. By amino acid sequence comparisons, two large insertions flanking the evolutionarily conserved motif "KX₃NSXYG" have been identified in φ29 DNA polymerase and in other DNA polymerases catalyzing TP-primed replication, a mechanism involving highly processive synthesis of both DNA strands (21,37). The structural mapping of the putative φ29 DNA polymerase domain(s) involved in processivity will be carried out by site-directed mutagenesis of the most conserved residues corresponding to these two specific insertions.
Fig. 1. Structure-function studies of φ29 DNA polymerase. A, relative arrangement of the most conserved regions among prokaryotic and eukaryotic DNA polymerases. The amino acid sequence of φ29 DNA polymerase (572 amino acids) is represented by a bar, with the N terminus at the left. Gray and filled-in regions indicate the predicted 3′-5′-exonuclease and DNA polymerization domains, respectively. The area between these two domains (ct) has been implicated in the communication or cross-talk between them. Alternative nomenclature for the regions (boxed) that contain the motifs (44) is indicated. A, B, and C correspond to motifs that are generally conserved among different classes of nucleic acid-synthesizing enzymes (27). B, proposed role for individual residues forming highly conserved N-terminal and C-terminal motifs of φ29 DNA polymerase, as defined by site-directed mutagenesis. Motifs are represented in single-letter notation, where x indicates any amino acid. Alternative residues for a particular position are separated by a bar. A summary of the mutational analysis carried out in φ29 DNA polymerase is described in the text.
The binding of φ29 DNA polymerase to DNA primer-template structures has been reported to be greatly enhanced by the presence of metal ions known to activate DNA polymerization (30). This behavior suggests that metal-assisted DNA binding could also increase the efficiency of DNA translocation, thus favoring the processivity of φ29 DNA polymerase.
TP-primed Initiation: φ29 DNA polymerase interacts with TP to form a very stable heterodimer as a prerequisite for the initiation of φ29 DNA replication (16). By extrapolation to the three-dimensional structure of the Klenow fragment of E. coli DNA polymerase I (Pol IK) complexed with DNA (36), the same cleft involved in binding the double-stranded region (primer cleft) of the replicating DNA molecule is proposed to be also the TP-binding site. In agreement with this, mutations at residues Thr434 and Arg438 (motif "TX₂GR") of φ29 DNA polymerase decreased in parallel the ability to bind both the TP and template-primer DNA molecules (30) (see Fig. 1B). Moreover, recent results indicate that one of the specific insertions, proposed to form a flexible domain that could be involved in processivity of protein-priming DNA polymerases, appears to have a direct role in TP binding. Based on our site-directed mutagenesis analysis of φ29 DNA polymerase, it can also be concluded that protein-primed initiation and DNA polymerization are both catalyzed at a single active site, involving the same critical residues and amino acid motifs generally conserved in eukaryotic type DNA polymerases.

The N-terminal Domain of φ29 DNA Polymerase
3′-5′-Exonuclease: Based on both amino acid sequence similarities and site-directed mutagenesis studies in φ29 DNA polymerase, Bernad et al. (38) proposed that the 3′-5′-exonuclease active site of prokaryotic and eukaryotic DNA polymerases is evolutionarily conserved, being formed by three N-terminal amino acid segments (ExoI, ExoII, and ExoIII) that invariantly contain the five critical residues identified in Pol IK, involved in metal binding and 3′-5′-exonuclease catalysis (39) (see Fig. 1A). The validity of this proposal has been confirmed in the case of other prokaryotic and eukaryotic enzymes such as T7, T4, and herpes simplex virus DNA polymerases, E. coli Pol II, Bacillus subtilis Pol III, and cellular DNA polymerases δ, ε, and γ from Saccharomyces cerevisiae (see the excellent review of all these mutagenesis studies by Derbyshire et al. (40)). A steady-state analysis of mutants at each putative 3′-5′-exonuclease active site residue of φ29 DNA polymerase (Asp12, Glu14, Asp66, Tyr165, and Asp169) demonstrated their role in catalysis, supporting the idea that the geometry of the Pol I 3′-5′-exonuclease active site and the two-metal ion mechanism proposed for this enzyme (41) can be extrapolated to φ29 DNA polymerase and the rest of the proofreading DNA polymerases (13). In addition to the residues involved in metal binding and catalysis at the 3′-5′-exonuclease active site, other residues appear to be structurally and functionally conserved in the exonuclease domain of most prokaryotic and eukaryotic DNA polymerases. Among them, φ29 DNA polymerase residues Thr15 and Asn62, located at the ExoI and ExoII motifs, respectively, act as single-stranded DNA ligands, having a critical role in the stabilization of the frayed primer terminus at the 3′-5′-exonuclease active site (42) (see Fig. 1B).
Strand Displacement: Surprisingly, the mutational analysis of the ExoI, ExoII, and ExoIII motifs of φ29 DNA polymerase showed that the intrinsic capacity to couple strand displacement to DNA polymerization is also located in the N-terminal domain, somehow overlapping with the 3′-5′-exonuclease active site (9,43) (Fig. 1A). Our model proposed that the enzyme could make an alternative use of the ssDNA binding site, present at the N-terminal domain, either to bind the 3′-5′-exonuclease substrate or to stabilize the interaction between the polymerase molecule and the DNA strand to be displaced. However, the ssDNA ligands Thr15 and Asn62 of φ29 DNA polymerase seem to be specialized in the stabilization of the editing complex, not having a role in the strand displacement capacity of the enzyme (42). Therefore, a dual role in 3′-5′-exonuclease and strand displacement appears to be restricted to residues directly acting as metal ligands (see Fig. 1B), such as residues Asp12 and Glu14 of the ExoI motif (DXE), Asp66 of the ExoII motif (NX₂₋₃(F/Y)D), and Asp169 of the ExoIII motif (YX₃D), or likely affecting the metal binding network, such as Tyr165 of the ExoIII motif (YX₃D). These data suggest that the interaction with the displaced strand, leading to duplex opening, could be assisted by contacts with the divalent metal ions that hold and orient the ssDNA substrate for exonucleolytic proofreading.

Structural Independence of φ29 DNA Polymerase Domains
A C-terminal deletion derivative of φ29 DNA polymerase, containing the first 188 N-terminal amino acid residues (including the three Exo motifs), was independently expressed in E. coli cells. As expected from our hypothesis of a modular organization of enzymatic activities in φ29 DNA polymerase, analogous to that of the Klenow fragment of DNA polymerase I, this N-terminal domain was devoid of any synthetic activity (TP-primed initiation and DNA polymerization) but retained 3′-5′-exonuclease activity (44). Recently, an N-terminal deletion derivative of φ29 DNA polymerase, lacking the first 188 N-terminal amino acid residues (among them the three Exo motifs), has been independently expressed in E. coli cells. This C-terminal domain retained both synthetic activities, TP-primed initiation and DNA polymerization, but it was devoid of 3′-5′-exonuclease activity.

Communication between the N-terminal and C-terminal Domains: Coordination between Synthesis and Degradation
As described above, the mutational analysis carried out along the φ29 DNA polymerase molecule demonstrated the existence of two structurally independent domains containing the synthetic and degradative activities of this enzyme. However, for effective proofreading of DNA polymerization errors, a mechanism for coordinating DNA polymerization and DNA excision must exist, relying on a structural and functional communication or cross-talk between the N-terminal and C-terminal domains. The basis of this communication, especially important in the case of processive DNA polymerases, involves the intramolecular switching of the primer terminus between the polymerization and 3′-5′-exonuclease active sites.
By site-directed mutagenesis of φ29 DNA polymerase, it has recently been demonstrated that the conserved motif "YXG(G/A)," located between the 3′-5′-exonuclease and polymerization domains of eukaryotic type DNA polymerases (Fig. 1), is a DNA binding motif that plays a role in the coordination between DNA synthesis and proofreading. We propose that residues Tyr226 and Phe230 of φ29 DNA polymerase are primarily involved in the stabilization of template-primer structures at the polymerization active site, also playing a role in the movement (switching) of the primer terminus between the polymerase and exonuclease active sites. This dual role could be achieved if the "YXG(G/A)" motif is involved in a conformational change, triggered by the destabilization produced by insertion of a mismatched nucleotide. In addition to this motif, other amino acid residues of φ29 DNA polymerase have been implicated in stabilization of the primer terminus at both the polymerization (Thr434, Arg438, Lys498, and Tyr500) and exonuclease (Thr15 and Asn62) active sites, playing in this case an indirect role in the dynamics of DNA interaction required to coordinate polymerization and proofreading.

Future Prospects
In addition to the structure-function studies that can be extrapolated to most DNA-dependent DNA polymerases, one of the main goals of our research is the characterization of the structural basis for the intrinsic high processivity of φ29 DNA polymerase. To approach this problem, we will search for specific subdomains that could lead to a proliferating cell nuclear antigen-like topological interaction with DNA. As has been described for both proliferating cell nuclear antigen (45) and the β subunit of E. coli DNA polymerase III (46), such a strong interaction would be dissociated only on reaching a DNA end, as should be the case after completing replication of the linear φ29 DNA molecule. Of additional interest is understanding how φ29 DNA polymerase is able to use both a protein and DNA as primers, and the dynamics of the interactions occurring at the transition between TP-primed initiation and the elongation stage of φ29 DNA replication. One of the most intriguing questions is whether, after formation of the TP-dAMP initiation complex, TP and φ29 DNA polymerase must dissociate either as a consequence of the special translocation step ("sliding-back"), necessary to accommodate the newly created primer terminus in an adequate position to accept the next incoming dNTP, or after the synthesis of a short DNA suitable to be used as primer.
Attempts to obtain φ29 DNA polymerase crystals adequate for X-ray diffraction analysis have not been successful so far. Other approaches, such as the crystallization of φ29 DNA polymerase complexed with TP, will also be explored to elucidate the structural basis for the vast potential of this monomeric enzyme.