Unique Error Signature of the Four-subunit Yeast DNA Polymerase ϵ*

We have purified wild type and exonuclease-deficient four-subunit DNA polymerase ϵ (Pol ϵ) complex from Saccharomyces cerevisiae and analyzed the fidelity of DNA synthesis by the two enzymes. Wild type Pol ϵ synthesizes DNA accurately, generating single-base substitutions and deletions at average error rates of ≤2 × 10⁻⁵ and ≤5 × 10⁻⁷, respectively. Pol ϵ lacking 3′ → 5′ exonuclease activity is less accurate to a degree suggesting that wild type Pol ϵ proofreads at least 92% of base substitution errors and at least 99% of frameshift errors made by the polymerase. Surprisingly the base substitution fidelity of exonuclease-deficient Pol ϵ is severalfold lower than that of proofreading-deficient forms of other replicative polymerases. Moreover the spectrum of errors shows a feature not seen with other A, B, C, or X family polymerases: a high proportion of transversions resulting from T·dTTP, T·dCTP, and C·dTTP mispairs. This unique error specificity and amino acid sequence alignments suggest that the structure of the polymerase active site of Pol ϵ differs from those of other B family members. We observed both similarities and differences between the spectrum of substitutions generated by proofreading-deficient Pol ϵ in vitro and substitutions occurring in vivo in a yeast strain defective in Pol ϵ proofreading and DNA mismatch repair. We discuss the implications of these findings for the role of Pol ϵ polymerase activity in DNA replication.

Replication of chromosomes in eukaryotes is believed to require three DNA polymerases, Pol α, Pol δ, and Pol ϵ (for reviews, see Refs. 1-3). Pol α has an associated DNA primase activity and synthesizes short RNA-DNA primers to initiate replication at origins and start Okazaki fragments on the lagging DNA strand. The roles of Pol δ and Pol ϵ in chromosomal replication are not clearly defined. Evidence for essential roles of Pol δ and Pol ϵ in replication comes from genetic studies in Saccharomyces cerevisiae. Mutational inactivation of catalytic subunits and some of the additional subunits of both polymerases is lethal. Strains carrying temperature-sensitive mutations in the POL3 gene (encoding the catalytic subunit of Pol δ) or in the POL2 or DPB2 genes (encoding subunits of Pol ϵ) arrest in S phase of the cell cycle with a terminal morphology characteristic of a DNA replication defect upon shift to non-permissive temperature (4-6). Incorporation of labeled precursors into DNA stops in these mutants at non-permissive temperature (4,5,7). The POL3, POL2, and DPB2 genes, as well as the DPB3 gene encoding the third subunit of Pol ϵ, are expressed periodically in the cell cycle with a peak at the G1 to S phase transition (4,5,8), which is characteristic of genes encoding DNA replication proteins. Both Pol δ and Pol ϵ have intrinsic 3′ → 5′ exonuclease activities that correct polymerase errors during replication (9,10). Mutations in POL3 or POL2 inactivating the exonucleases result in a mutator phenotype (11,12).
To accommodate roles for Pol δ and Pol ϵ at a replication fork, it was proposed that Pol ϵ elongates the leading DNA strand, while Pol δ elongates Okazaki fragments on the lagging DNA strand (13). Other models for the roles of Pol δ and Pol ϵ in chromosomal replication have also been considered (14,15). "One strand-one polymerase" models are consistent with the observation that both polymerases are monomers in solution (16,17) and with our earlier genetic studies demonstrating that the 3′ → 5′ exonucleases of Pol δ and Pol ϵ correct replication errors on opposite DNA strands (18). However, it is not yet known whether Pol δ and Pol ϵ always correct their own errors in vivo, and genetic data on the roles of Pol δ and Pol ϵ exonucleases are not sufficient to conclude that the two polymerases perform synthesis on opposite DNA strands.
Pol ϵ belongs to the B family of DNA polymerases that includes Pol α and Pol δ as well as DNA polymerases that replicate genomes of bacteriophages T4 and RB69. The B family also includes the eukaryotic DNA polymerase ζ, a translesion synthesis polymerase that is not essential for chromosomal DNA replication in yeast. A unique feature of Pol ϵ is that the catalytic subunit contains a large C-terminal domain that is not present in other B family DNA polymerases. This domain is needed for interaction of the catalytic subunit with the other subunits of the Pol ϵ holoenzyme and for the function of Pol ϵ in S phase checkpoint control (13, 19-21). The essential role of Pol ϵ polymerase activity in replication was questioned by the observation that only the C-terminal domain of the catalytic subunit is required for cell viability, while mutants with a deletion of the N-terminal domain, which contains all the conserved DNA polymerase motifs, survive (22). However, these mutants display severe growth and replication defects (23). Moreover single amino acid substitutions in the polymerase active site of Pol ϵ are lethal (20,24). One explanation for these observations is that Pol ϵ is normally a component of the replication machinery, but when absent, it can be partially substituted for by another DNA polymerase. It was also suggested that Pol ϵ participates in DNA replication during late but not early S phase in human cells (25).
A property of replication that is critical for genome stability is the fidelity of DNA synthesis. An initial study conducted 12 years ago (26) demonstrated that the catalytic subunit of calf thymus Pol ϵ synthesized DNA with high fidelity. A more recent kinetic analysis of fidelity (27) demonstrated that yeast Pol ϵ efficiently prevents nucleotide misinsertion at one template position and that two of the three possible mismatches at that site can be excised by the 3′ → 5′ exonuclease of Pol ϵ, indicating that yeast Pol ϵ can conduct accurate DNA synthesis. Just how accurate yeast Pol ϵ is for all 12 mispairs in different sequence contexts and how efficiently it prevents and proofreads insertion and deletion errors are unknown. With this in mind and with the goal to eventually understand the role of yeast Pol ϵ in replicating one or both strands of the 12 million-nucleotide yeast genome and stably maintaining it in the face of DNA damage, here we characterize the fidelity of DNA synthesis by wild type and proofreading-deficient forms of yeast Pol ϵ holoenzyme. We provide a comprehensive description of error rates in many sequence contexts for all 12 single-base mismatches as well as for single-base additions and single-base and larger deletions. We confirm that Pol ϵ is highly accurate, describe the contribution of proofreading to fidelity, report two unanticipated aspects of fidelity suggesting that the polymerase active site is unique among family B members, and compare Pol ϵ fidelity to mutational specificity data in vivo to support a replication function.

EXPERIMENTAL PROCEDURES
Purification of Yeast DNA Polymerase ϵ-A protease-deficient S. cerevisiae strain was constructed for this study by disrupting the PEP4 gene in strain CG379Δ (MATα ade5-1 lys2-Tn5-13 trp1-289 his7-2 leu2-3,112 ura3Δ, Ref. 28) using a heterologous kanMX cassette as described previously (29). The chromosomal POL2 allele of this strain was replaced with the pol2-4 allele as described previously (11). The pol2-4 mutation results in the FDIET → FAIAT amino acid replacement in the Exo I motif of Pol ϵ. Wild type Pol ϵ was overproduced in the POL2⁺ strain, and the exonuclease-deficient Pol ϵ with the FDIET → FAIAT substitution was overproduced in the pol2-4 strain. The enzymes were purified in parallel by conventional chromatography as described previously (17).
The primer (0.8 μM) was labeled by 20 μCi of [γ-³²P]ATP in a 100-μl reaction with 25 units of T4 polynucleotide kinase (New England Biolabs) for 1 h. The reaction was heated for 10 min at 56°C and cleaned using a Qiagen nucleotide removal kit. The labeled product and unlabeled primer were annealed to the template in a final 100-μl volume at a ratio of 1 μM primer to 1.04 μM template by 5-min incubation at 86°C and cooling to room temperature. The substrate (100 nM) was tested for completeness of annealing by performing a 10-min reaction with 15 nM Exo⁻ T4 DNA polymerase and 50 μM dNTPs in a buffer described under "Processivity Assay." Exo⁻ T4 DNA polymerase was kindly provided by F. Kadyrov (30). Under these conditions, more than 95% of the primer was elongated. For determining exonuclease activity, 2.5 μl of substrate (50 nM) was incubated with various concentrations of Pol ϵ in a 50-μl reaction in standard buffer (50 mM Tris-Cl, pH 7.5, 8 mM MgCl2, 2 mM dithiothreitol, 100 μg/ml bovine serum albumin, 10% glycerol; see Ref. 31) for 15 min. Ten μl of the reaction was mixed with 10 μl of deionized formamide dye solution, and 5 μl of the mixture was loaded on an 18% urea-polyacrylamide gel and electrophoresed for 1.5 h at 80 watts.
Processivity Assay-An oligonucleotide (PAGE-purified, Oligos Etc.) was annealed to a single-stranded M13mp2 DNA template such that DNA synthesis initiates at nucleotide 191 of the lacZ gene, 6 bases from the start of synthesis in the gap-filling mutagenesis assay (see below). The primer sequence was 5′-CGGAAACCAGGCAAAGCGCCATTCGCCATTCAGGCTGCGCAG-3′. The primer (0.8 μM) was labeled with 20 μCi of [γ-³²P]ATP in a 100-μl reaction with 25 units of T4 polynucleotide kinase (New England Biolabs) for 1 h. The reaction was heated for 10 min at 56°C and cleaned using a Qiagen nucleotide removal kit. Unlabeled primer and 10 μl of labeled primer were annealed in a total volume of 100 μl to single-stranded M13mp2 template DNA at a ratio of 0.25 μM primer to 0.25 μM template by 30-min incubation at 52.5°C in 100 mM potassium acetate. After cooling to room temperature, the substrate was desalted using a Bio-Rad Micro Bio-Spin 30 column. When tested for complete annealing as described above, more than 80% of the primer was elongated. For processivity assays, the substrate (5 nM) was incubated with various concentrations (0.05-1 nM) of Pol ϵ (at 30°C) and other polymerases (at 37°C) in 50-μl reactions in standard buffer (see below) with 250 μM dNTPs for the times listed in the legend to Fig. 2. Ten μl of the reaction mixture was mixed with 10 μl of deionized formamide dye solution, and 5 μl of the mixture was loaded onto a 12% denaturing polyacrylamide gel and subjected to electrophoresis for 2 h at 80 watts. The catalytic subunits of human Exo⁻ Pol γ and Pol α were kindly provided by W. Copeland (Refs. 32 and 33, respectively). Buffers used were the following: for T4 Pol, 20 mM Tris-Cl, pH 7.5, 50 mM potassium acetate, 8 mM MgCl2, 5 mM dithiothreitol; for Pol γ, 25 mM HEPES, pH 7.5, 2 mM β-mercaptoethanol, 0.1 mM EDTA, 50 μg/ml bovine serum albumin, 5 mM MgCl2; and for Pol α, 20 mM Tris-Cl, pH 8.0, 8 mM MgCl2, 1 mM β-mercaptoethanol, 200 μg/ml bovine serum albumin.
The sequencing ladder standard was produced by using the same substrate and a thermostable polymerase manual sequencing kit (U.S. Biochemical Corp.).
Assay to Measure Polymerase Fidelity in Vitro-DNA synthesis fidelity was measured using the bacteriophage M13mp2 forward mutation assay described previously (34). In brief, double-stranded M13mp2 DNA with a 407-nucleotide single-stranded region containing a portion of the lacZ gene was used as a substrate for in vitro DNA synthesis. Reaction mixtures contained ~1.5 nM DNA template, 50 mM Tris-Cl, pH 7.5, 8 mM MgCl2, 2 mM dithiothreitol, 100 μg/ml bovine serum albumin, 10% glycerol, 25 or 250 μM dNTPs, and 1.5-14 nM wild type or exonuclease-deficient Pol ϵ. Reactions were incubated at 30°C for 10 min. Aliquots of the reactions were analyzed by agarose gel electrophoresis to confirm complete gap filling, and another aliquot of DNA was introduced into Escherichia coli to score the frequency of light blue and colorless plaques reflecting errors made during in vitro DNA synthesis. Single-stranded DNA was isolated from independent mutant plaques, and the lacZ gene was sequenced using either an ABI Prism 377 or ABI Prism 3100 sequencer. Error rates for individual types of mutation were calculated according to the following equation: $\mathrm{ER} = [(N_i/N) \times \mathrm{MF}]/(D \times 0.6)$, where $N_i$ is the number of mutations of a particular type, $N$ is the total number of mutants analyzed, MF is the frequency of lacZ mutants, $D$ is the number of detectable sites for the particular type of mutation, and 0.6 is the probability of expressing a mutant lacZ allele in E. coli (34).
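The error-rate equation above can be written out as a short calculation. The following is a minimal sketch in Python; the function name, argument names, and the numbers in the usage line are illustrative assumptions and are not values taken from Tables I-III.

```python
def error_rate(n_i, n_total, mutant_frequency, detectable_sites, p_expression=0.6):
    """Error rate per detectable nucleotide: ER = [(N_i / N) x MF] / (D x 0.6).

    n_i              -- number of sequenced mutations of one particular type (N_i)
    n_total          -- total number of mutants analyzed (N)
    mutant_frequency -- frequency of lacZ mutants (MF)
    detectable_sites -- number of detectable sites for this mutation type (D)
    p_expression     -- probability of expressing a mutant lacZ allele in E. coli
    """
    if n_total <= 0 or detectable_sites <= 0 or p_expression <= 0:
        raise ValueError("n_total, detectable_sites and p_expression must be positive")
    return (n_i / n_total) * mutant_frequency / (detectable_sites * p_expression)


# Hypothetical inputs, chosen only to show the arithmetic (not data from this study):
# 10 of 100 sequenced mutants are of one type, MF = 0.01, 50 detectable sites.
example_rate = error_rate(n_i=10, n_total=100, mutant_frequency=0.01, detectable_sites=50)
print(f"{example_rate:.2e}")
```

Dividing by the number of detectable sites is what makes rates comparable between mutation classes that can be scored at different numbers of template positions.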
Forward Mutation Assay in Vivo-Independent colonies of the pol2-4 msh6 bik1::URA3 derivatives of CG379Δ were streaked on yeast extract peptone dextrose plates, grown for 2 days at 30°C, and replica-plated onto medium with 5-fluoroorotic acid (FOA) (36) to select for ura3 mutants. One colony resistant to FOA (FOAʳ) was picked from each patch, and the URA3 open reading frame was amplified by PCR and sequenced by using either an ABI Prism 377 or ABI Prism 3100 sequencer. The rate of FOAʳ mutation in the pol2-4 msh6 strains was measured by fluctuation analysis as described previously (37). The majority of FOAʳ colonies result from mutations in the URA3 gene in wild type strains (36). For more accurate estimation of the ura3 mutation rate in the pol2-4 msh6 strains, we determined the proportion of FOAʳ mutants that were due to ura3 mutations in our fluctuation tests. To do this, the FOAʳ colonies were randomly picked and crossed to strain E68 (MATa ade2-1 arg4-8 leu2-3,112 thr1-4 trp1-1 ura3-52 lys2Δ cup1-1) (37). The hybrids were replica-plated onto medium lacking uracil to monitor complementation of the ura3-52 mutation. To estimate the ura3 mutation rate, the FOAʳ mutation rate was multiplied by the proportion of ura3 mutants among FOAʳ clones.
Mutational Spectra Analysis-The frequencies of different types of substitutions were compared using the Pearson χ² test as implemented in the COLLAPSE program (38). Monte Carlo modification of the Pearson χ² test of spectra homogeneity (39,40) and Kendall's correlation coefficient (41-43) were used to compare spectra. Pcc is the probability that an observed correlation between two spectra is due to random fluctuations. Calculations were done using the programs HG-PUBL (44), CORR12, and COMP12 (41). The CLUSTERM program (43,45) was used for hot spot prediction. The known list of detectable positions in lacZ (46) was used for analysis of spectra.
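The idea behind a Monte Carlo test of spectra homogeneity can be illustrated with a generic permutation scheme. The sketch below is an assumption-laden illustration in plain Python, not the algorithm of the COLLAPSE or HG-PUBL programs cited above: it computes a Pearson χ² statistic for two spectra given as counts per mutation class, then estimates a p value by repeatedly reassigning the pooled mutants to the two spectra at random. The function names, iteration count, and example counts are all hypothetical.

```python
import random


def chi_square(spectrum_a, spectrum_b):
    """Pearson chi-square statistic for a 2 x k table of mutation counts."""
    total_a, total_b = sum(spectrum_a), sum(spectrum_b)
    grand_total = total_a + total_b
    statistic = 0.0
    for a, b in zip(spectrum_a, spectrum_b):
        column = a + b
        if column == 0:
            continue  # a class absent from both spectra contributes nothing
        expected_a = total_a * column / grand_total
        expected_b = total_b * column / grand_total
        statistic += (a - expected_a) ** 2 / expected_a
        statistic += (b - expected_b) ** 2 / expected_b
    return statistic


def monte_carlo_p(spectrum_a, spectrum_b, n_iter=2000, seed=0):
    """Estimate P(chi-square >= observed) by shuffling mutants between spectra."""
    rng = random.Random(seed)
    observed = chi_square(spectrum_a, spectrum_b)
    n_classes = len(spectrum_a)
    # One class label per pooled mutant; both spectra must be non-empty.
    labels = [k for k in range(n_classes)
              for _ in range(spectrum_a[k] + spectrum_b[k])]
    size_a = sum(spectrum_a)
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(labels)
        sim_a = [0] * n_classes
        sim_b = [0] * n_classes
        for label in labels[:size_a]:
            sim_a[label] += 1
        for label in labels[size_a:]:
            sim_b[label] += 1
        if chi_square(sim_a, sim_b) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_iter + 1)


# Hypothetical counts per substitution class, for illustration only.
p_identical = monte_carlo_p([10, 10, 10], [10, 10, 10])
p_disjoint = monte_carlo_p([30, 0], [0, 30])
```

A resampling approach of this kind avoids relying on the asymptotic χ² distribution, which can be unreliable when many mutation classes contain only a few events.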
Alignment of Amino Acid Sequences and Secondary Structure Prediction-Multiple alignment of Pol ϵ homologs was constructed using the MAXHOM program (47). Secondary structure prediction was made using the PSIPRED (48) and PHD (49) programs.
Here we used this procedure to overproduce and purify wild type Pol ϵ and a variant with a double amino acid change (D289A,E291A) in the Exo I motif of the catalytic subunit. Both protein preparations contained all four subunits in the expected ratios (Fig. 1A). The additional faint bands seen on the colloidal Coomassie-stained gel represent minor contaminants and not Pol2 proteolytic fragments as confirmed by Western blot analysis with polyclonal antibodies to Pol2 (data not shown). The double amino acid change in the exonuclease active site of the catalytic subunit was previously shown to reduce the 3′ → 5′ exonuclease activity of Pol ϵ with no significant effect on polymerization activity (11,27). When we examined our enzyme preparations for exonuclease activity, wild type Pol ϵ readily digested a labeled, correctly paired DNA primer (Fig. 1B, lanes 2-4). No digestion was seen for the Pol ϵ D289A,E291A preparation (lanes 5-7). Similar results were obtained when a primer with a terminal mismatched nucleotide was used as a substrate for the exonuclease reaction (data not shown). Therefore the catalytic subunit of the Pol ϵ D289A,E291A as well as the three accessory subunits lacks 3′ → 5′ exonuclease activity detectable in this assay.
Processivity of Pol ϵ-Next we compared the processivity of DNA synthesis by Pol ϵ D289A,E291A to that of several other B family DNA polymerases (Fig. 2). Reactions were performed with excess primer-template over polymerase such that once the polymerase synthesizes DNA and dissociates, the probability of using the extended product again is very low (see legend to Fig. 2 for further details). Analysis of the products of the reactions catalyzed by exonuclease-deficient variants of Pol ϵ, T4 Pol (gp43 protein only), and Pol γ (catalytic subunit only) shows that all three enzymes terminate synthesis at numerous template positions (Fig. 2). Some of the sites of frequent termination are shared by the three polymerases, while others are not shared, suggesting that these polymerases interact somewhat differently with template-primers even in the same DNA sequence context. All three polymerases incorporated up to 170 nucleotides in one cycle of polymerization (up to approximately template nucleotide 20, Fig. 2). Maximum processivity could be even higher with other DNA templates since the termination band observed at position 20 corresponds to the location of the palindromic lacZ operator sequence. The DNA synthesis reached this point in 11% of the Pol ϵ reactions as compared with 6 and 3% for T4 Pol and Pol γ, respectively (the values are calculated by phosphorimagery as the amount of product at that location divided by the total products). This suggests that Pol ϵ holoenzyme is the most processive of these three replicative polymerases. Additional studies will be required to determine whether high processivity is intrinsic to the catalytic subunit of Pol ϵ or is influenced by the accessory subunits. In contrast to Pol ϵ, T4 Pol, and Pol γ and in agreement with earlier studies (50-52), the catalytic subunit of human Pol α is less processive, incorporating from 1 to about 20 nucleotides per cycle.
Fidelity of Wild Type Pol ϵ-We first analyzed fidelity of the wild type Pol ϵ during gap-filling DNA synthesis using the M13mp2 forward mutation assay (see "Experimental Procedures"). Agarose gel electrophoresis demonstrated that Pol ϵ filled the 407-nucleotide gap (data not shown but similar to Fig. 3 in Ref. 34). The frequency of lacZ mutants obtained after transfection of E. coli with these reaction products (0.00194, Table I) was reproducibly only slightly elevated over the background mutant frequency measured upon transfection with the unfilled gapped DNA substrate (0.00093). Moreover sequencing of the lacZ gene from 29 independent mutants resulting from gap-filling synthesis by wild type Pol ϵ (Table I) shows that 19 of the mutants result from C → T, G → T, and G → C base substitutions. This specificity is characteristic of the background lacZ mutation spectrum, which is believed to reflect replication of DNA damage (e.g. deaminated cytosine, modified guanine) that accumulates in the DNA template during preparation of gapped DNA (Refs. 53, 54, and references therein). Thus, we consider the error rates that can be calculated for individual mispairs (Table I) to be minimum estimates of wild type Pol ϵ fidelity with actual error rates being lower than the background noise in the forward mutation assay. This puts yeast Pol ϵ holoenzyme into the group of highly accurate replicative DNA polymerases that includes eukaryotic mitochondrial DNA polymerase γ, E. coli DNA polymerase III, and DNA polymerases of bacteriophages T4 and RB69 (Table II).
Fidelity of Exonuclease-deficient Pol ϵ-Next we measured the fidelity of exonuclease-deficient Pol ϵ in the presence of either 25 or 250 μM dNTPs. Analysis of reaction products (not shown) again demonstrated complete gap filling. The lacZ mutant frequencies obtained upon transfection of E. coli with these products were 6- and 13-fold higher than the frequency of mutants obtained with wild type Pol ϵ (bottom of Table I), consistent with the important role for the 3′ → 5′ exonuclease activity of Pol ϵ in correcting DNA synthesis errors. The error specificity of proofreading-deficient Pol ϵ was determined by sequencing DNA from 95 independent lacZ mutants obtained at a dNTP concentration of 25 μM and 191 mutants obtained at a dNTP concentration of 250 μM (Table III and Fig. 3). The majority of mutations were single-base substitutions (89% at 25 μM dNTPs and 76% at 250 μM dNTPs), whereas single-base deletions constituted 9.3 and 18% of the mutations in the two spectra, respectively. Other types of mutations were also observed but at lower frequencies (Table III), including three deletions of long stretches of DNA flanked by direct repeats of 5-6 nucleotides. Two of these deletions occurred between CCCGC repeats at positions −152 to −148 and +166 to +170. These deletions likely occur by a slippage mechanism previously proposed to explain this same error when frequently generated by Pol β (see Fig. 3B in Ref. 55). Interestingly the first repeat encountered during gap-filling synthesis is at a location (nucleotides +166 and +167) where Pol ϵ frequently dissociates from the template (Fig. 2). This correlation further supports the general concept (for a review, see Ref. 56; also see Ref. 57) that replication slippage errors are more likely to occur during the dissociation-reassociation phase of a polymerization reaction.
Overall average (Table II) and individual (Table I) single-base substitution, addition, and deletion error rates per detectable nucleotide polymerized were calculated from the mutant frequency and DNA sequencing data using the known number of phenotypically detectable template sites for scoring each class of error (34). The average single-base substitution and deletion error rates are ≥12- and ≥112-fold higher, respectively, for exonuclease-deficient Pol ϵ as compared with wild type Pol ϵ. This suggests that at least 92% of base-base mismatches and at least 99% of deletion mismatches made by yeast Pol ϵ holoenzyme are corrected by its intrinsic proofreading activity. These may be minimum estimates of the contribution of proofreading because the error rates for the wild type enzyme could be much lower than can be measured with the forward mutation assay. Similarly the estimated contribution of proofreading of specific mismatches ranges from ≥2- to ≥56-fold (last column of Table I).
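The step from a fold difference in error rate to a percentage of errors proofread can be made explicit. The sketch below assumes the simplest relationship, in which the fraction of errors corrected equals 1 minus the ratio of the wild type to the exonuclease-deficient error rate; this is one reading consistent with the figures quoted above, and the function name is hypothetical.

```python
def fraction_proofread(fold_increase):
    """Fraction of polymerase errors removed by proofreading.

    fold_increase is the exonuclease-deficient error rate divided by the
    wild type error rate. If proofreading lowers the error rate by that
    factor, then 1 - 1/fold_increase of the errors were corrected.
    """
    if fold_increase < 1:
        raise ValueError("fold_increase must be at least 1")
    return 1.0 - 1.0 / fold_increase


# Fold differences quoted in the text for substitutions and deletions.
substitution_fraction = fraction_proofread(12)   # about 0.917
deletion_fraction = fraction_proofread(112)      # about 0.991
print(f"{substitution_fraction:.1%}, {deletion_fraction:.1%}")
```

Because the fold differences are lower bounds, the resulting fractions are lower bounds as well.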
Interestingly the overall average base substitution fidelity of exonuclease-deficient Pol ϵ is consistently lower than for exonuclease-deficient derivatives of other eukaryotic and prokaryotic replicative DNA polymerases (lower part of Table II) by factors of 4-14-fold depending on the comparison made. The most frequently observed base substitutions result from misinsertion of dATP opposite template C, A, and G and from three pyrimidine·pyrimidine mismatches, T·dCTP, T·dTTP, and C·dTTP (Table I). The three pyrimidine·pyrimidine mispairs as well as G·dATP and C·dATP mispairs are generated at substantially higher rates than for other B family polymerases (Fig. 4) or for A family and C family replicative polymerases (see references in Table II). This substitution error specificity is therefore a unique feature of the exonuclease-deficient Pol ϵ spectrum. The rate of single-base deletions is also somewhat higher than those of some other replicative DNA polymerases (lower part of Table II). Interestingly the error rate for deleting noniterated template nucleotides is similar to deletion rates in runs of 2 or 3 identical nucleotides (Fig. 5A) and only increases when the homopolymeric run length exceeds 3 nucleotides. This relationship between single-base deletion error rate and run length is distinct from the patterns observed with most of the other DNA polymerases when copying this same template sequence (e.g. Fig. 5, B and C; also note differences in scale on y axis). However, the pattern of single-base deletions observed with Pol ϵ is similar to those previously seen with human and yeast DNA polymerase α (58, 59) and with the RB69 replicative complex (60). This suggests that interactions of B family polymerases with the primer-template that modulate frameshift fidelity may be different from polymerases in the other families.
Mutational Specificity of Yeast Strains Defective in 3′ → 5′ Exonuclease Activity of Pol ϵ-The yeast pol2-4 mutation results in a mutator phenotype in vivo that is thought to reflect loss of proofreading of replication errors by Pol ϵ (10, 11). Were Pol ϵ to correct its own errors during chromosomal DNA replication then the spectrum of spontaneous substitution mutations in a pol2-4 strain should reflect replication errors made by Pol ϵ in vivo. A previous study described pol2-4 mutational specificity in the URA3 gene inserted into chromosome III in two orientations near a defined replication origin, ARS306 (10). The hallmark of that spectrum was a high proportion of AT → TA transversions, a specificity that correlates with the high error rates for A·dATP and T·dTTP mispairs by exonuclease-deficient Pol ϵ in vitro (Table I). AT → TA transversions were also abundant in the plasmid reporter gene SUP4-o in the pol2-4 strain (61).
In the two studies just mentioned, base substitutions were scored in yeast strains that were proficient in DNA mismatch repair where most DNA replication errors generated will be corrected. We analyzed the spectra of spontaneous mutations in strains that were isogenic to the CG379 strain used by Morrison et al. (10) and carried both the pol2-4 mutation and an msh6 mutation that inactivates repair of single-base mismatches. These strains were also isogenic to the strains we used to overproduce and purify Pol ϵ. The URA3 reporter gene for scoring mutations was at the same position in chromosome III as in the earlier study (10) and was again present in each of the two possible orientations (designated LR and RL) relative to ARS306. Two orientations of the URA3 gene were used for the purpose of comparison with the previous study of the URA3 mutational spectrum in pol2-4 strains (10). The spectrum of ura3 mutations in the double pol2-4 msh6 mutants was also compared with the error specificity of exonuclease-deficient Pol ϵ in vitro.
Table footnote a: Other mutations include substitution of a GT dinucleotide for nucleotides −38 to +20 of the lacZ gene, two deletions between CCCGC repeats at positions −152 to −148 and +166 to +170, a deletion between CTGGCG repeats at positions −180 to −175 and +146 to +151, a substitution of T for the CCC sequence at position −44 to −42, and a deletion of nucleotides −10 to −6.
The ura3 mutation rates in the pol2-4 msh6 strains were 1.0 × 10⁻⁵ and 0.52 × 10⁻⁵ for orientations LR and RL, respectively. These rates are 45- and 37-fold higher than in the pol2-4 single mutant strains (10), consistent with loss of DNA mismatch repair of replication errors and with the earlier data on synergistic interaction of pol2-4 and msh6 mutations (62). We then isolated 93 and 88 independent ura3 mutants for the LR and RL orientations, respectively, and sequenced the URA3 gene to identify the types of mutations. The vast majority of mutations were single-base substitutions (Table IV). The spectra for the two orientations of the reporter gene were clearly different (p < 0.001), which is consistent with the earlier data suggesting that Pol ϵ corrects replication errors during synthesis of only one of the two DNA strands (18,61). Comparison of the mutational specificity of the pol2-4 msh6 strains to the specificity of Pol ϵ errors in vitro revealed both similarities and differences (Fig. 6). GC → AT transitions and GC → TA transversions predominate in the in vivo spectra, consistent with the high Pol ϵ error rate in vitro for G·dATP, C·dATP, and C·dTTP mispairs. A hot spot for GC → TA transversions (34 of the 93 sequenced mutants, 37%) was seen at position 679 in the LR orientation of URA3 in vivo (Table IV). In the in vitro spectrum, the template site exhibiting the highest error rate was position 146 where 11 of 191 lacZ mutants (5.8%) resulted from incorporation of dTTP opposite C to yield a GC → TA transversion (Fig. 3). The CTG sequence at this site in the lacZ gene matches the CTG sequence for the hot spot at position 679 in the URA3 gene (Fig. 7). The probability that this correlation is due to stochastic reasons is low (p = 0.03).
A noticeable difference between the in vitro and in vivo spectra is the lower proportion of AT → TA transversions in the pol2-4 msh6 spectrum in comparison with the proportion of A·dATP and T·dTTP mismatches made by Pol ϵ in vitro that would yield these mutations. Another large difference is the rates at which base substitutions are generated in vivo and in vitro. The lacZ mutation frequency, 2.59 × 10⁻² (Table I), multiplied by the proportion of base substitutions in the spectrum (69%, Table I) and divided by 0.6, the probability of expressing a mutant lacZ allele in E. coli, gives an estimate of the base substitution mutation rate in vitro of 3.0 × 10⁻². The ura3 mutation rate, 7.6 × 10⁻⁶ (average for the two orientations), multiplied by the proportion of base substitutions (97%, Table IV) gives an estimate of the base substitution rate in the pol2-4 msh6 strain of 7.4 × 10⁻⁶. This dramatic difference is discussed further below.
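The arithmetic behind the two estimates above can be retraced directly from the quoted values. The sketch below uses only numbers stated in the text; the variable names are illustrative, and the final ratio is a derived quantity shown here for orientation rather than a figure reported by the study.

```python
# In vitro: lacZ mutant frequency x proportion of base substitutions / 0.6
lacz_mutant_frequency = 2.59e-2
in_vitro_substitution_share = 0.69
expression_probability = 0.6
in_vitro_rate = (lacz_mutant_frequency * in_vitro_substitution_share
                 / expression_probability)

# In vivo: average ura3 mutation rate of the LR and RL orientations
# x proportion of base substitutions among ura3 mutants
rate_lr = 1.0e-5
rate_rl = 0.52e-5
average_ura3_rate = (rate_lr + rate_rl) / 2
in_vivo_substitution_share = 0.97
in_vivo_rate = average_ura3_rate * in_vivo_substitution_share

# Derived for illustration: how many fold the in vitro estimate exceeds the in vivo one
fold_difference = in_vitro_rate / in_vivo_rate

print(f"in vitro: {in_vitro_rate:.1e}")
print(f"in vivo:  {in_vivo_rate:.1e}")
print(f"ratio:    {fold_difference:.0f}-fold")
```

The two estimates differ by more than three orders of magnitude, which is the gap the text refers to as a dramatic difference.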

DISCUSSION
The fidelity and error specificity of Pol ϵ described here reveal several important properties of this replicative enzyme. First, wild type yeast Pol ϵ holoenzyme is highly accurate. The low single-base substitution, addition, and deletion error rates (Tables I and II) are in agreement with the proposed substantial role of Pol ϵ in chromosomal DNA replication where accurate DNA synthesis contributes to genome stability. The low base substitution error rates are consistent with a recent kinetic analysis reporting that yeast Pol ϵ inserts incorrect nucleotides at relative rates of only 4.6 × 10⁻⁶, 1.5 × 10⁻⁶, and 0.1 × 10⁻⁶ for three mismatches in one sequence context (27). These values are slightly below the detection limit of the M13mp2 forward mutation assay used here. High fidelity DNA synthesis by yeast Pol ϵ holoenzyme was anticipated based on a study performed 12 years ago (26) showing high fidelity synthesis by the catalytic subunit of native Pol ϵ purified from calf thymus tissue. However, at that time, it was not yet possible to inactivate the proofreading exonuclease of Pol ϵ by mutation. Here we did so in the recombinant enzyme to measure yeast Pol ϵ selectivity in the absence of proofreading. The comparisons this allows (Table I) suggest that yeast Pol ϵ holoenzyme efficiently proofreads the vast majority of its own single-base substitution, addition, and deletion errors.
This study also reveals unanticipated properties of exonuclease-deficient Pol ϵ that make it a clear outlier among replicative polymerases whose fidelity has been characterized in vitro (listed in Table II). These properties include a slightly higher processivity than seen with certain other replicative polymerases (Fig. 2), a somewhat lower fidelity in comparison with proofreading-deficient versions of other replicative DNA polymerases (Table II), and an unusually high rate of forming three pyrimidine·pyrimidine mispairs (Fig. 4). To consider possible explanations for these properties, we performed amino acid sequence alignments of yeast Pol ϵ with other B family polymerases (Fig. 8A) and then considered these alignments in relation to structural information on RB69 DNA polymerase (also family B). The crystal structure of RB69 polymerase reveals that the binding pocket for the nascent base pair snugly accommodates correct Watson-Crick base pairs (63) to permit efficient and accurate DNA synthesis. When the polymerase binds a correct dNTP, this pocket assembles using amino acid residues in the fingers and palm domains (Fig. 8B), including residues in conserved polymerase motif II. Alignment of the motif II coding region (Fig. 8A and depicted in green in Fig. 8B) reveals that Pol ϵ from different organisms differs in two ways   Fig. 8B) that are otherwise conserved among B family members. In yeast Pol ϵ, these include serine rather than glutamate at 560, leucine rather than proline at 561, asparagine rather than serine at position 647, and arginine rather than asparagine at 653. Given their inferred locations near the polymerase active site (Fig. 8B), these amino acid differences might in part be responsible for the somewhat higher error rates of Pol ϵ and/or the unusual error specificity observed here.
For example, it is already known that human Pol α with a serine to alanine mutation at one of these positions (Ser 867 in Pol α and Asn 647 in Pol ϵ) has increased ability to extend mispaired primer termini (64). It is also tempting to speculate that the active site of Pol ϵ may be more hydrophobic than that of other DNA polymerases. This would rationalize the high error rates for three mismatches involving misinsertion of dTTP, the most hydrophobic base. It might even explain the unusually high error rates for pyrimidine·pyrimidine mismatches, which have been suggested to be more readily accommodated if both pyrimidines can be desolvated, thus allowing these smallest of the 12 possible base-base mismatches to more easily escape geometric selection in the nascent base pair binding pocket (65). Region II of Pol ϵ also differs from that in other family B members in containing a 66-amino acid insertion that is predicted to encode two α-helices (Fig. 8A). These α-helices could lie near or contribute to the active site and thus influence fidelity. Given the location of this insertion (Fig. 8B), these residues could perhaps even contribute to template-primer binding and the high processivity of Pol ϵ. This hypothesis can be tested in future studies. The protein-substrate interactions at and within a few base pairs of the active site control binding and correct alignment of the two DNA strands and/or the incoming dNTP. The frameshift error specificity (Fig. 5) suggests that these interactions differ for Pol ϵ as compared with polymerases from other families studied to date. One theme of our continuing studies of DNA synthesis fidelity is to use polymerase error-specificity information to understand possible functions of these enzymes in cells.
Thus, the somewhat unusual error specificity of Pol ϵ may eventually be helpful for distinguishing among its proposed roles in chromosomal DNA replication, nucleotide excision repair, base excision repair, double strand break repair, DNA mismatch repair, and S phase checkpoint control (for a review, see Ref. 2). Toward this long term goal, the present study compares the spectrum of mutations generated by Pol ϵ in vitro with the spectrum of mutations that result from spontaneous DNA replication errors in vivo in a region ~4,400 base pairs from the ARS306 replication origin that fires early in S phase (66). The data in Table IV reveal that the spectra of spontaneous mutations in the URA3 reporter gene in a pol2-4 msh6 genetic background are significantly different for the two orientations of the reporter gene. This confirms our previous finding that the 3′ → 5′ exonuclease of Pol ϵ proofreads replication errors on one of the two DNA strands in this region (18). Whether this proofreading occurs during leading or lagging strand replication is currently unknown but can be investigated in the future using a recently described approach to study leading and lagging strand replication in vivo (67). Although it is tempting to assume that Pol ϵ corrects its own errors and to conclude that Pol ϵ also performs DNA synthesis on one of the two DNA strands, this has not yet been demonstrated.
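One way to ask whether two mutational spectra differ, such as those for the two orientations of a reporter gene, is a Pearson chi-square test on a table of mutation counts by class. The sketch below is an assumption for illustration only: the statistical method is not specified in this section, the function name is hypothetical, and the counts are placeholders rather than data from Table IV. The critical values are the standard 5% points of the chi-square distribution.

```python
# Standard chi-square critical values at the 5% significance level.
CRITICAL_VALUES_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def chi_square_spectra(counts_a, counts_b):
    """Return (statistic, degrees_of_freedom) for a 2 x k table of counts.

    counts_a and counts_b hold the number of mutations of each class
    (for example, each substitution type) observed in two spectra.
    """
    if len(counts_a) != len(counts_b):
        raise ValueError("both spectra must list the same mutation classes")
    total_a, total_b = sum(counts_a), sum(counts_b)
    if total_a == 0 or total_b == 0:
        raise ValueError("each spectrum must contain at least one mutation")
    grand_total = total_a + total_b

    statistic = 0.0
    informative_classes = 0
    for a, b in zip(counts_a, counts_b):
        class_total = a + b
        if class_total == 0:
            continue  # class absent from both spectra carries no information
        informative_classes += 1
        for observed, spectrum_total in ((a, total_a), (b, total_b)):
            expected = spectrum_total * class_total / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic, informative_classes - 1


# Placeholder counts for two mutation classes in two reporter orientations.
orientation_1 = [30, 10]
orientation_2 = [10, 30]

stat, dof = chi_square_spectra(orientation_1, orientation_2)
differs = stat > CRITICAL_VALUES_05[dof]
print(f"chi-square = {stat:.1f}, df = {dof}, differs at 5% level: {differs}")
```

When expected counts in some classes are small, as is common for rare mutation types, an exact or Monte Carlo test is generally preferable to the chi-square approximation.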
To the extent that Pol ϵ contributes to chain elongation during replication, the spectrum of mutations observed in vivo in the absence of Pol ϵ proofreading and mismatch repair should reflect errors made by Pol ϵ. Support for a role for Pol ϵ in chain elongation during replication comes from the observed similarities when the pol2-4 msh6 mutational spectra were compared with exonuclease-deficient Pol ϵ error specificity in vitro. These similarities include a predominance of GC → AT transitions and GC → TA transversions (Fig. 6) and common mutable motifs (Fig. 7), suggesting that Pol ϵ contributes to strand elongation during synthesis of the DNA strand on which it also proofreads replication errors. However, there are striking differences between Pol ϵ error rates in vitro and the rates at which substitutions are generated in the pol2-4 msh6 strain. These differences suggest that Pol ϵ participation in DNA synthesis during replication could be small, at least during early S phase when the URA3 reporter gene is replicated. Alternatively, errors during DNA synthesis by proofreading-deficient Pol ϵ in vivo may be prevented by additional accessory proteins or efficiently corrected by another exonuclease, for example, the proofreading exonuclease of Pol δ. It has been suggested that the exonucleases of Pol ϵ and Pol δ can substitute for each other when one of them is inactivated (10). Another striking difference between the in vitro and in vivo spectra is in the proportion of AT → TA transversions, which are abundant in the in vitro spectrum but rare in the pol2-4 msh6 spectrum. We offer three possible explanations for this difference.
1) Pol ϵ error specificity in vivo is modulated by interaction with other replication proteins. In addition, the in vitro assay conditions may differ from the environment at the replication fork, which might affect error specificity.
2) The number of detectable sites for AT → TA transversions in the URA3 gene is currently unknown but may be smaller than in the lacZ gene such that other types of mutations predominate in vivo.
3) The contribution of Pol ϵ to replication is limited in this region of the chromosome, which is known to be replicated in early S phase. In the future, it would be interesting to determine whether a better match to the Pol ϵ in vitro error signature can be found in a reporter gene placed at different locations on chromosomes, for example, near late replication origins, centromeres, or telomeres. Indeed, it has been suggested that Pol ϵ participates in replication during late but not early S phase in human cells (25).