Interacting fidelity defects in the replicative DNA polymerase of bacteriophage RB69.

The DNA polymerases (gp43s) of the related bacteriophages T4 and RB69 are B family (polymerase alpha class) enzymes that determine the fidelity of phage DNA replication. A T4 whose gene 43 has been mutationally inactivated can be replicated by a cognate RB69 gp43 encoded by a recombinant plasmid in T4-infected Escherichia coli. We used this phage-plasmid complementation assay to obtain rapid and sensitive measurements of the mutational specificities of mutator derivatives of the RB69 enzyme. RB69 gp43s lacking proofreading function (Exo(-) enzymes) and/or substituted with alanine, serine, or threonine at the conserved polymerase function residue Tyr(567) (Pol(Y567(A/S/T)) enzymes) were examined for their effects on the reversion of specific mutations in the T4 rII gene and on forward mutation in the T4 rI gene. The results reveal that Tyr(567) is a key determinant of the fidelity of base selection and that the Pol and Exo functions are strongly coupled in this B family enzyme. In vitro assays show that the Pol(Y567A) Exo(-) enzyme generates mispairs more frequently but extends them less efficiently than does a Pol(+) Exo(-) enzyme. Other replicative DNA polymerases may control fidelity by strategies similar to those used by RB69 gp43.

Bacteriophage RB69 is a relative of phage T4, with which it shares many similarities in genetic organization (1,2) and in the structures and functions of the phage-encoded DNA replication proteins (3,4). Replication fidelity in T4, and presumably also in RB69, is determined almost exclusively by the fidelities of the phage-encoded DNA polymerase and its associated proofreading 3'-5' exonuclease (5). This useful simplicity reflects the fact that T4 DNA replication appears to be devoid of DNA mismatch repair; phage T4 is not subject to the action of the several Escherichia coli mismatch repair systems (6) and seems unable to repair mutational heteroduplexes on its own. Screens for T4 mutator mutations have failed to uncover evidence for the involvement of mismatch repair in mutagenesis, and the mutational dose response to base analogues does not display the mismatch repair-dependent lag seen in E. coli (5).
The DNA polymerases of phages T4 and RB69 (gp43, product of phage gene 43) are members of the polymerase α class or B family of DNA polymerases, which includes the replicative polymerases α, δ, and ε of eukaryotic cells and the polymerases of several of their DNA viruses (7). Some archaeons also encode gp43-like B family enzymes (8-10). As such, T4 gp43 and RB69 gp43 are attractive subjects for studies of mechanisms of replication by this class of enzymes, particularly because of the amenability of the phage system to combined genetic and biochemical analyses (11-14). A recently determined crystal structure of RB69 gp43 reveals five discrete domains termed N, Exo, Palm, Fingers, and Thumb (15). This structure is in the "open" configuration and provides a preliminary framework for understanding the dynamics of DNA polymerase interactions with the DNA primer template, with incoming dNTPs, and with other proteins of the DNA replicase complex (16,17). These multiple interactions are critical for regulating the processivity and accuracy (fidelity) of replicative DNA synthesis.
RB69 gp43 (903 residues) and T4 gp43 (898 residues) differ at ~40% of their amino acids but are probably quite similar in tertiary structure and function because they can substitute for each other to conduct phage DNA replication in vivo (3). In particular, RB69 gp43 has almost the same proficiency and fidelity as T4 gp43 when replicating the T4 genome in vivo (3,18). Thus, one can readily use clones of specifically altered alleles of RB69 gene 43 to study the roles of its five structural domains (and amino acids therein) in supporting T4 DNA replication. Although numerous T4 gp43 alterations can alter the fidelity of DNA replication (12), the ~40% divergence between the T4 and RB69 gp43s mandates a focus on the RB69 enzyme when pursuing structure-function relationships because a crystallographic structure is available for RB69 gp43 but not for T4 gp43. Such structural information is apt to be particularly important when probing interactions between gp43 domains, notably of the type to be described in this report, because conserved residues exist in environments that are likely to be influenced by nonconserved residues. Another advantage of the RB69-T4 complementation system is that although recombination occurs vigorously between a phage and a plasmid carrying the T4 gene 43, such recombination is rare (e.g. <10^-7 am(+) progeny) when the plasmid carries the diverged RB69 gene 43 (3,18). In addition, future structural information is more likely to be derived from the RB69 enzyme than from the T4 enzyme (16,17).
Gp43 and many other DNA polymerases control fidelity through two catalytic functions, a template-dependent 5'-3' nucleotidyl transferase activity (the Pol function) and a single-stranded DNA-dependent 3'-5' proofreading exonuclease activity (the Exo function). Although some replicative DNA polymerases have separate Pol and Exo subunits, the catalytic centers for the two gp43 activities reside in separate structural modules of the same polypeptide (15). It is clear that the biological role of the Exo function is to erase errors committed by the Pol function, but it is much less clear how the Pol function makes (or avoids) errors in the first place and how the enzyme recruits the Exo function to reverse such errors. Some insights have been obtained from biochemical studies of T4 gp43 and of the polymerase I family (A family) T7 DNA polymerase, which, like gp43, bears separate Pol and Exo modules in the same polypeptide (19-21). Kinetic assays with these enzymes indicate that the fidelity of the Pol function is achieved through two transactions that precede proofreading and that occur at or near the Pol catalytic center (20). The first step is accurate selection of an incoming dNTP at the single dNTP-binding site, and the second step is slowed primer extension from a mispaired base at the primer terminus. The base selection step provides a large contribution to fidelity and appears to depend both on base pair geometry and on the hydrogen bonding potential of the incoming dNTP (22-25). The primer extension step depends particularly strongly on the hydrogen bonding potential of the 3'-terminal nucleotide (26) and may provide the signal for transferring the primer terminus to the Exo catalytic site for proofreading (20). Both structural and biochemical evidence suggest that the switch from primer extension to proofreading involves a conformational transition in the enzyme from a "closed" (or Pol) mode to an open (or Exo) mode. This transition includes fraying the primer end to allow its appropriate positioning relative to the Exo catalytic center (17,27).
Recently, a cluster of amino acid residues at the juncture of the Palm and Fingers domains of RB69 gp43 was implicated in dNTP binding (28). One of these residues, Tyr(567), was proposed to play a role in interactions with the base component of the incoming dNTP during the alignment of the nucleotide for nucleotidyl transfer. We show here that substitutions at this residue can dramatically increase replication errors while exhibiting only small effects on total DNA synthesis and viable phage production. An RB69 gp43 with the Y567A substitution is highly mutagenic in vivo while exhibiting normal 3'-exonuclease activity in vitro. Thus, the mutator activity of Y567A-gp43 is not caused by a proofreading-exonuclease defect. An RB69 Y567A-gp43 mutant that is also defective in the Exo function (through the introduction of a D222A/D327A double substitution) does not support viable phage production, although it does support 50-70% of the normal amount of DNA synthesis. Combining Y567A with the proofreading defect increases the mutation rate only modestly over the increase caused by either component alone. However, the combined increase appears to be sufficient to cause error catastrophe. Based on an analysis of the in vivo mutational spectra produced by Exo(-) gp43 and from three different substitutions at the gp43 Tyr(567) site, together with some in vitro kinetic properties of the double-mutator enzyme, we conclude that Tyr(567) is an important determinant of base selection by RB69 gp43. The structural similarities between gp43 and other DNA polymerases of the B family (8-10) may reveal the existence of similarly positioned tyrosine residues at the dNTP binding sites of these enzymes.

Materials
Strains-Wild type T4 and the T4 gene 43 double-amber mutant amE4332 amE4322 (hereafter referred to as 43am) with UAG codons at positions 202 and 386 were described previously (18). Using two amber mutations minimizes translational read-through in nonpermissive (sup(o)) bacterial hosts (29).
Three T4 rII mutants (30) were used for reversion tests. rIIUV131 carries a -1 frameshift mutation that reverts primarily by the addition of an A·T base pair within a run of five A·T base pairs, although other frame-restoring alterations are possible. rIIUV356 carries a base pair substitution (BPS) that can revert by transitions (and perhaps also transversions) at a G·C site. rIIUV375 carries an ochre (UAA) mutation that can revert to sense codons by both transitions and transversions. For the reversion tests, each rII mutation was combined with 43am by recombination. Most E. coli strains have been described (18). E. coli strain B40 su(+)II (supE) efficiently suppresses T4 amber mutations. E. coli strain BB lacks amber-suppressing activity; it displays the characteristic r plaque phenotype with T4 rI mutants and also with a few other kinds of T4 r mutants but not with T4 rII mutants. E. coli K12 strain QA1(h) su(+)I (supD) (hereafter referred to as QA1) restricts the growth of nonamber T4 rII mutants. E. coli strain B E NapIV (31), used for some DNA labeling experiments, requires vitamin B1 supplementation.
pRB.43 denotes a plasmid encoding RB69 gene 43 (or a mutant version of this gene) in a T7 promoter vector in which gene 43 expression is repressed. Although leak-through expression is weak, it produces sufficient RB69 gp43 to replicate T4 43am somewhat better than is achieved using the amber-suppressor B40 su(+)II. Good replication is achieved in T4 with only a small amount of gp43 (29,32). Recombination between the phage and plasmid-borne gene 43 homologues is negligible (3,18).
Other-T4 polynucleotide kinase was from New England Biolabs. [γ-32P]ATP was from Amersham Pharmacia Biotech. dNTPs were from Pharmacia/LKB. Oligonucleotides for kinetics experiments were synthesized by the W. M. Keck Foundation Biotechnology Resource Laboratory (Yale University). All other chemicals were analytical grade. The Pol(Y567A) Exo(-) derivative of RB69 gp43 was constructed and purified as described previously (28).

Methods
Growth, Screening, and Assay Conditions-Cells for genetic experiments were grown in LB broth in a rotary shaker water bath at 37°C. Plates were incubated overnight at 37°C. T4 43am stocks (with or without rII mutations) were grown on BB cells carrying the desired version of pRB.43. QA1 cells were used to assay 43am rII(+) revertants. B40 su(+)II cells were used to assay 43am stocks with or without rII mutations. BB cells carrying pRB.43 (wild type) were used to screen for mutant plaques displaying the rI phenotype.
To measure burst sizes, E. coli BB was grown to 3 × 10^8 cells/ml at 30°C in M9SB medium. At 0 min, 1-ml samples were infected with 10^7 plaque-forming units of T4 43am or T4 wild type added in 10 µl of M9SB. At 12 min the cultures were diluted 10^3-fold in M9SB and aerated at 30°C, and infective centers were assayed. Under these conditions, cell lysis begins at about 42 min. Cell lysis was completed by the addition of chloroform to the diluted infected cultures at 60 min.
To measure T4-induced DNA synthesis in E. coli NapIV cells carrying pRB.43 (shown later in Fig. 1B), the cells were grown at 30°C with vigorous aeration to 3 × 10^8 cells/ml in M9SB medium containing ampicillin at 20 µg/ml. 2 ml of each culture were added to 1 ml of fresh medium containing 6 × 10^9 plaque-forming units of T4 43am phage (multiplicity of infection = 10), and aeration was continued at 30°C. After 5 min, [3H]thymidine was added (20 µCi/ml at a specific activity of 20 µCi/µg dT). Samples (0.1 ml) were withdrawn at various times thereafter to determine trichloroacetic acid-precipitable counts.
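As a quick arithmetic check on the infection conditions described above, the multiplicity of infection follows from the phage added and the cells present. The sketch below is illustrative only: the helper name is an assumption introduced here, and it treats every cell and every plaque-forming unit as taking part in the infection.

```python
def multiplicity_of_infection(pfu_added: float, cells_per_ml: float, culture_ml: float) -> float:
    """Phage added per cell present (hypothetical helper, not part of the original methods)."""
    return pfu_added / (cells_per_ml * culture_ml)

# DNA synthesis experiment: 2 ml of culture at 3e8 cells/ml receive 6e9 pfu.
moi_labeling = multiplicity_of_infection(6e9, 3e8, 2.0)

# Burst-size experiment: a 1-ml sample at 3e8 cells/ml receives 1e7 pfu.
moi_burst = multiplicity_of_infection(1e7, 3e8, 1.0)

print(f"labeling MOI = {moi_labeling:.1f}, burst-size MOI = {moi_burst:.3f}")
```

The first value reproduces the stated multiplicity of 10; the second shows that the burst-size infections were carried out at a much lower multiplicity, so that most infected cells received a single phage.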
For alkaline gel analysis of newly synthesized T4 DNA (Fig. 1C), E. coli NapIV cells carrying the desired pRB.43 plasmid were grown and infected with T4 43am phage as described above. [3H]thymidine (32 µCi/ml, 10 µCi/µg) was added to 8 ml of each infected cell culture at 15 min postinfection. Radiolabeling was stopped at 30 min postinfection by chilling the cultures in an ethanol/wet ice bath. The cells were harvested by centrifuging (5000 × g, 5 min) in the cold and resuspending in 250 µl of TE buffer (10 mM Tris-HCl, 1 mM EDTA, pH 8.0), and their DNA was extracted by the Phase Lock Gel method (catalogue number p1-678901, 5 Prime → 3 Prime, Inc., Boulder, CO). Samples of purified DNA (3 µl containing 70,000-110,000 cpm) were incubated for 5 min at 65°C with 12 µl of a gel loading buffer containing 20 mM NaOH, 1 mM EDTA (pH 8.0), 10% Ficoll, 0.0215% bromcresol green. The samples were then subjected to electrophoresis for 24 h at 25 V in a 0.6% agarose gel (Seakem GTG) in a continuously recycling buffer containing 30 mM NaOH, 1 mM EDTA (pH 8.0). The gel was stained with ethidium bromide and destained, and the DNA lanes were visualized with a UV transilluminator. Subsequently, each ~14-cm lane was sliced into 25 pieces of equal size (~5 mm) that were transferred to scintillation vials and counted for 3H in Ultima Gold scintillation fluid (Packard Instruments); this scintillator decreases quenching from agarose in the samples.
Reversion Tests-To measure rII(+) revertant frequencies, stocks of T4 43am rII mutants were grown in E. coli BB cells carrying the desired pRB.43 allele. Revertants were scored by plating on QA1 cells, and the total phage were scored by plating on B40 su(+)II cells. Revertant frequencies are the median values for 21 stocks; for as few as 5 stocks, 2-fold reproducibility is observed about 95% of the time (6).
Forward Mutation Tests-Mutations in several T4 r (rapid lysis) genes result in large plaques with sharp edges. rI mutants predominate among mutants detected by plating on BB cells. The rI gene is a good candidate for a mutation reporter gene (35,36). It seems not to be involved in DNA metabolism, it is of appropriate size (294 base pairs including the termination codon) for repetitive sequencing, and it displays a mutant phenotype for many missense mutations (as confirmed in this study). Phage stocks for measuring rI mutant frequencies consisted of individual T4 43am plaques recovered in their entirety from lawns of BB cells carrying the desired pRB.43 allele. The plaques were resuspended in 1 ml of LB broth plus a drop of chloroform and were plated on BB cells bearing pRB.43 Pol(+) Exo(+) (pCW19R) to yield about 600 plaques/plate, and the plates were then screened for r mutants. rI frequencies (median values for 7-21 stocks) consist of the total r mutant frequency multiplied by the fraction of rI mutants among all r mutants (as determined by subsequent sequencing). The correction factor (0.64) was the ratio of 287 rI mutants among 449 sequenced r mutants distributed among the four gene 43 genotypes.
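The conversion from r mutant frequency to rI mutant frequency described above can be sketched as follows. The two sequencing counts come from the text; the per-stock plaque counts and the function names are hypothetical and serve only to show the order of operations (median across stocks, then multiplication by the rI fraction).

```python
from statistics import median

RI_AMONG_SEQUENCED = 287  # rI mutants among the sequenced r mutants (from the text)
R_SEQUENCED = 449         # all sequenced r mutants (from the text)

def ri_fraction() -> float:
    """Fraction of r mutants that map to rI; the text rounds this to 0.64."""
    return RI_AMONG_SEQUENCED / R_SEQUENCED

def ri_frequency(r_counts, plaque_counts) -> float:
    """Median per-stock r mutant frequency scaled by the rI fraction."""
    per_stock = [r / n for r, n in zip(r_counts, plaque_counts)]
    return median(per_stock) * ri_fraction()

# Hypothetical counts for seven stocks (illustration only, not data from this study).
example_r = [3, 5, 2, 4, 6, 3, 4]
example_plaques = [6000, 6200, 5800, 6100, 5900, 6000, 6300]

print(round(ri_fraction(), 2), ri_frequency(example_r, example_plaques))
```

Using the median rather than the mean limits the influence of occasional stocks in which a mutation arose early and produced a large clone of mutants.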
Measuring Mutations beyond the Error-Catastrophe Barrier-T4 43am phage infecting BB cells carrying pRB.43 Pol(Y567A) Exo(-) produce no viable progeny, probably because the double mutator has a mutation rate so high that all progeny are mutationally inactivated. Therefore, we designed growth conditions in which both the wild type (Pol(+) Exo(+)) and double-mutator (Pol(Y567A) Exo(-)) polymerases were provided, hoping to recover some viable progeny in which the rI region had been replicated by the mutator gp43. The gp43 mixture was achieved by infecting cells bearing a plasmid expressing double-mutator RB69 gp43 (or, as a control, wild type RB69 gp43) with T4 particles expressing wild type T4 gp43 from the cognate phage gene. BB cells carrying either pRB.43 Pol(+) Exo(+) or pRB.43 Pol(Y567A) Exo(-) were grown to about 10^8/ml and were concentrated by centrifugation to 10^9/ml. T4 43(+) was adjusted to 10^10/ml. At t = 0, equal volumes of prewarmed phage and cells were mixed at 37°C on a rotary shaker water bath. At t = 10 min the mixture was diluted 2-fold in warm LB broth. At t = 15 min, before the appearance of significant numbers of intracellular phage under these conditions, a sample was taken into LB broth with chloroform and assayed to estimate the fraction of unadsorbed parental phages. At t = 20 min the mixture was diluted an additional 50-fold in warm broth. At t = 40 min, infection was terminated with chloroform, and the lysate was assayed for progeny phages and further screened for r mutants on BB cells bearing pRB.43 Pol(+) Exo(+). The contribution of unadsorbed parental phages to viable progeny phages was negligible.
Calculating Mutation Rates-In the experiments reported here, sufficiently large numbers of mutational events occurred per T4-infected culture to justify using the expression $\mu_{rI} = f/\ln(N/N_0)$, where $\mu_{rI}$ is the mutation rate of the rI gene per replication, $f$ is the observed mutation frequency, $N$ is the final population size, and $N_0$ is the initial population size, or, for $N_0 \ll 1/\mu_{rI}$, the expression $\mu_{rI} = f/\ln(N\mu_{rI})$ (37,38). The genomic mutation rate is $\mu_g = \mu_{rI}\,C\,(168{,}897\ \text{base pairs/genome})/(294\ \text{rI base pairs})$, where $C$ is the ratio of all mutations to detected mutations (the reciprocal of the efficiency of mutation detection). We assume that all non-BPS mutations are detected but that only chain-terminating mutations are efficiently detected among BPSs, synonymous mutations and many missense mutations remaining undetected (37). If we let $B$ = the fraction of mutations that are BPSs and $D$ = the correction factor for the fraction of all BPSs detected, then $C = (1 - B) + DB$. The Pol(+) Exo(+) rI spectrum contained 47 BPSs of which 9 produced chain-terminating codons. Because the T4 genome is about two-thirds A·T, about 0.073 of random BPSs will produce a chain-terminating mutation (37). Therefore, $D = 9/(47 \times 0.073) = 2.6$. Although only approximate, this value is lower than those encountered in most of a variety of other systems (37), demonstrating that the rI system detects many missense mutations.
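The expressions above can be collected into a short numerical sketch. The genome size, target size, and the counts used for D come from the text; the function names, the fixed-point solver for the implicit expression, and the example values of f, N, and B are assumptions introduced here for illustration and are not data from this study.

```python
import math

T4_GENOME_BP = 168_897  # from the text
RI_TARGET_BP = 294      # from the text

def rate_from_frequency(f: float, n_final: float, n_initial: float) -> float:
    """mu_rI = f / ln(N / N0)."""
    return f / math.log(n_final / n_initial)

def rate_small_inoculum(f: float, n_final: float, iterations: int = 100) -> float:
    """Solve mu_rI = f / ln(N * mu_rI) by fixed-point iteration.

    Starts from mu = f and assumes N * mu stays above e, so the iteration contracts.
    """
    mu = f
    for _ in range(iterations):
        mu = f / math.log(n_final * mu)
    return mu

def bps_detection_correction(bps_total: int = 47, bps_chain_term: int = 9,
                             p_chain_term: float = 0.073) -> float:
    """D = observed chain terminators / (all BPSs x expected chain-terminating fraction)."""
    return bps_chain_term / (bps_total * p_chain_term)

def genomic_rate(mu_ri: float, bps_fraction: float, d: float) -> float:
    """mu_g = mu_rI * C * genome / target, with C = (1 - B) + D * B."""
    c = (1.0 - bps_fraction) + d * bps_fraction
    return mu_ri * c * T4_GENOME_BP / RI_TARGET_BP

D = bps_detection_correction()

# Hypothetical inputs: f, N, and B are illustrative values only.
mu_example = rate_small_inoculum(f=1e-3, n_final=1e10)
mu_g_example = genomic_rate(mu_example, bps_fraction=0.6, d=D)
print(round(D, 1), mu_example, mu_g_example)
```

The implicit form is solved iteratively because the rate appears on both sides of the expression; any root finder would serve equally well.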
Sequencing rI Mutants-Mutant plaques were resuspended in 40 µl of water, and the rI region was directly amplified by PCR and sequenced. The upstream primer was 5'-GTTAAGGCCCTGCATCG-3' and the downstream primer was 5'-CCTAAGTATTCATCTGCCTTTG-3' for both PCR and sequencing. The PCR consisted of 30 cycles of 1 min at 94°C, 1 min at 55°C, 1 min at 72°C, with a final extension time of 10 min at 72°C using Taq large fragment polymerase (Display System Biotech TAQFL from PGC Scientifics, Gaithersburg, MD). PCR products were purified with the PCR purification kit (Qiagen). Sequencing was performed with an ABI Prism 377 automatic sequencer using the dRhodamine terminator cycle sequencing kit (PE Applied Biosystems). Each mutation was identified by sequencing in both directions.

RESULTS
Effects of Substitutions at RB69 gp43 Tyr(567)-The tyrosine at position 567 of RB69 gp43 (Tyr(564) in T4 gp43) is highly conserved in Region III of B family DNA polymerases (7,11). Nevertheless, we found that replacing this residue with alanine or several other amino acids affects replication functions only weakly. T4 43am was used to infect E. coli BB cells carrying a plasmid expressing one or another allele of RB69 gene 43, and both DNA synthesis and phage growth were monitored (Fig. 1, A and B). The Exo(-) defect had little effect on DNA synthesis and phage growth. The Y567A substitution slightly reduced the rate of DNA synthesis in vivo (by ~20%), whereas phage growth was hardly affected. The Y567S, Y567T, and Y567V substitutions reduced DNA synthesis moderately (by 30-40%) and phage yield somewhat less (Fig. 1A).
In contrast to alanine, serine, threonine, and valine substitutions at Tyr(567), alanine substitutions at conserved residues located close to Tyr(567) in the crystal structure sharply reduce DNA synthesis and/or phage production, e.g. N564A (Fig. 1, A and C) and others (28). Unexpectedly, the conservative substitution Y567F failed to support DNA replication. In addition, gp43 Pol(Y567F) Exo(-) exhibits very weak activity in an M13 gap filling assay (results not shown). The poor catalytic efficiency of Y567F-gp43 was confirmed both by repeating its construction and by reverting the mutated gene 43 at the 567 site (results not shown). Tyr(567) is unusual among highly conserved RB69 gp43 residues tested to date (results not shown) in tolerating nonconservative amino acid substitutions while not tolerating a conservative substitution.
The Pol(Y567A) Exo(-) construct reduced DNA synthesis by a moderate 40% but reduced phage growth sharply. Unlike the other gp43 constructs listed in Fig. 1, RB69 Pol(Y567A) Exo(-) also inhibited the growth of wild type T4. This dominant lethality has two possible explanations, both probably operating here. First, dominant lethality can be exhibited by RB69 gp43 mutants that are deficient in polymerase activity but retain the capacity to repress the translation of the gene 43 mRNA transcribed by the infecting wild type T4 particles (3,11). This Pol(Y567A) Exo(-) enzyme can indeed repress heterologous translation (results not shown). Second, as we show later, the Pol(Y567A) Exo(-) enzyme has such low fidelity that most of the genomes it synthesizes carry numerous mutations.
When we used denaturing gel electrophoresis to examine the size distributions of 3H-labeled DNA synthesized in vivo, no notable differences were observed among the four genotypes used in this study (Pol(+), Pol(Y567A), Exo(+), or Exo(-)) (Fig. 1C). In particular, the Pol(Y567A) Exo(-) enzyme appears to accumulate no more single-stranded DNA of reduced size than does the Pol(+) Exo(+) enzyme. We therefore presume that DNA is packaged in phage progeny with similar efficiency in all four infections.
Reversion Tests-Measuring reversion of specific T4 rII mutations can quickly reveal changes in mutation rates along particular mutational pathways. To this end, we used rII mutants that revert by +1 frameshifts in a run of five A·T base pairs or by base pair substitutions at either a G·C base pair or an ochre codon (TAA/ATT). The results are summarized in Table I. When Tyr(567) was replaced by serine, threonine, or alanine, the result was weak mutator activity for frameshift mutations in an A·T run and strong mutator activity for BPSs at both a G·C and an A·T site. The polymerases with serine or threonine substitutions appear to be slightly more prone to frameshift mutator activity than the polymerase with an alanine substitution. In contrast to these Pol effects, a defect in exonuclease function strongly promotes all three mutational pathways. Note that the values in Table I are frequencies and not rates; under these experimental conditions, values for rates are roughly an order of magnitude smaller than for frequencies, but relative rates are similar to relative frequencies.
Forward Mutation Tests-We used forward mutation in the T4 rI gene to determine the mutational spectrum generated by the RB69 gp43 mutator mutants. In contrast to reversion tests, which tend to display high sensitivity, forward mutation tests provide generality and, when augmented by sequencing, provide detailed information about mutability at specific sites. Forward mutation tests can also reveal classes of mutations that are not detected in reversion tests.
For the polymerases that supported high levels of T4 DNA synthesis and phage production, it was straightforward to measure r mutant frequencies (discussed later) and to collect mutants of independent origin for sequencing. For gp43 Pol(Y567A) Exo(-), which failed to support the production of viable phage, we designed a procedure in which T4 infection was supported competitively by T4 gp43 Pol(+) Exo(+) and RB69 gp43 Pol(Y567A) Exo(-) (see "Experimental Procedures"). The results of an infection supported by this mixture of gp43s appear in Table II. The average number of viable progeny per infected cell fell about 90-fold, whereas the frequency of r mutants among the progeny rose by 60-fold. Thus, although the ratio of DNA synthesis conducted by the two competing gp43s is unknown, the DNA in the large majority of the mutated rI regions must have been synthesized by the double-mutator gp43. We believe that these rI regions are embedded in genomes that were mostly synthesized by Pol(+) Exo(+) polymerase, the mutated rI regions then finding their way into otherwise little-mutated genomes by recombination; T4 has a high frequency of recombination, about 1% per 150 base pairs. An alternative but less attractive possibility is that the mutated rI regions were introduced by brief intervals of synthesis by the double-mutator enzyme. Because these r mutants arose during a single round of infection, they are all presumed to be of independent origin.
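The inference that most mutated rI regions were synthesized by the double-mutator enzyme can be made explicit with a small calculation. The sketch assumes, as a simplification introduced here, that the wild type polymerase contributes r mutants at no more than its usual frequency, so that a k-fold rise in mutant frequency leaves at most 1/k of the mutants attributable to it. The function name is hypothetical.

```python
def min_fraction_from_mutator(fold_increase: float) -> float:
    """Lower bound on the share of mutants made by the mutator polymerase.

    Assumes the background (wild type) contribution is at most the baseline frequency.
    """
    if fold_increase < 1:
        raise ValueError("fold_increase must be at least 1")
    return 1.0 - 1.0 / fold_increase

# The r mutant frequency rose about 60-fold in the mixed infection (from the text).
fraction = min_fraction_from_mutator(60)
print(f"{fraction:.3f}")
```

Under this assumption, roughly 59 of every 60 r mutants recovered from the mixed infection would derive from synthesis by the double-mutator enzyme.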
Mutational Classes-Table III lists the numbers of mutations of different kinds arising in Pol(+) Exo(+), Pol(+) Exo(-), Pol(Y567A) Exo(+), and Pol(Y567A) Exo(-) backgrounds. Complex mutations (closely spaced multiple BPSs and/or frameshift mutations) (5) were excluded from further analysis here because they are rarely produced by our mutator polymerases.
The Pol(+) Exo(+) mutational distribution contains a majority of BPSs, a characteristic of most collections of spontaneous mutations in diverse wild type organisms studied in vivo; the exceptions involve distributions in genes that harbor extraordinarily strong frameshift mutation hot spots or organisms experiencing outbursts of transposon mobility. The Pol(+) Exo(+) distribution contains roughly twice as many transversions as transitions, whereas the Pol(+) Exo(-) spectrum displays the reverse ratio. Small additions and small deletions (frameshift mutations) are equally frequent in the Pol(+) Exo(+) spectrum, whereas small additions predominate in the Pol(+) Exo(-) spectrum. Therefore, gp43 proofreading appears to operate roughly four times more efficiently to repair transition mispairs than transversion mispairs and three or four times more efficiently to repair +1 than -1 frameshift mutations. (Remember that T4 mutations are not subject to mismatch repair.) These results are in good agreement with the reversion tests (Table I).
The Pol(Y567A) Exo(+) distribution consists almost exclusively of BPSs with a substantial bias in favor of transitions. This distribution is consistent with the reversion tests (Table I).
The Pol(Y567A) Exo(-) distribution appears to be quantitatively intermediate in character between the Pol(+) Exo(-) and Pol(Y567A) Exo(+) distributions when broad categories such as transitions, transversions, or frameshift mutations are examined (Table III). If the Pol(Y567A) and Exo(-) mutator activities operated independently, then the high frequencies of transitions produced by Pol(Y567A) should have continued to predominate in the Exo(-) background, and few or no frameshift mutations would have been seen. It therefore appears that these two mutator activities do not act independently of each other.
When three different substitutions at Tyr(567) in the Exo(+) background were compared in the forward mutation test, their mutational patterns (Table IV) and their spectra (not shown) were qualitatively similar, although quantitative differences were discernible. In reversion tests, we observed more frameshift mutations with Pol(Y567S) and Pol(Y567T) than with Pol(Y567A) (Table I). This tendency is repeated in the forward mutation tests (4/73 for Pol(Y567S) + Pol(Y567T) combined versus 1/79 for Pol(Y567A)). Pol(Y567S) and Pol(Y567T) also seemed to produce a higher ratio of transitions to transversions than did Pol(Y567A) (66:3 for Pol(Y567S) + Pol(Y567T) combined versus 68:10 for Pol(Y567A)). Codon usage patterns and critical amino acids can bias the recovery of missense mutations but probably not sufficiently to determine the observed ratios of various kinds of BPSs.
In all of the distributions, G·C → A·T transitions outnumbered A·T → G·C transitions. G·C → T·A mutations predominated among the transversions, whereas G·C → C·G transversions were completely absent. The predominance of G·C → A·T transitions and G·C → T·A transversions is consistent with the A·T-rich nature of the T4 genome. The rarity of G·C → C·G transversions suggests that these polymerases form C·C and G·G mispairs much less readily or extend such mispairs less efficiently than they do G·A, C·T, and other transversion-generating mispairs. Almost all frameshift mutations in these spectra arise within short repeats of single base pairs.
Mutational Spectra-Mutational spectra reveal widely different intrinsic mutabilities at different sites. Highly mutable sites are often called hot spots, but this designation is arbitrary because sites typically display a smooth gradient of mutabilities rather than discrete steps, at least within the resolving power of almost all spectra. Hot spots are of considerable interest because they identify genetically unstable sequences. They may also interfere with the analysis of error proclivities [...] (41); fortunately, the rI reporter gene lacks repeats of more than five base pairs. Spontaneous base substitution hot spots, on the other hand, remain largely unexplained. The many contacts between a DNA polymerase and both the primer template and the incoming dNTP (25) probably modulate error frequencies in still uncharacterized ways that can vary with local DNA sequence up to a dozen bases away (42,43). Fig. 2 shows the mutational spectra obtained with Pol(+) Exo(+), Pol(+) Exo(-), Pol(Y567A) Exo(+), and Pol(Y567A) Exo(-) gp43s. As expected, frameshift mutations cluster within AAAAA, TTTTT, AAAA, and TTTT, and most are simple additions or deletions of single bases.
The Pol(+) Exo(-) spectrum exhibits no pronounced hot spots but does display numerous sites of intermediate mutability.
The Pol(Y567A) Exo(+) spectrum displays two hot spots, each embedded within a small region of generally increased mutability. The hot spot at position 247 is specific for G·C → A·T transitions. It contains 23% of all the mutations in the spectrum, whereas positions 248 and 250 each contain two G·C → A·T transitions. The hot spot at position 203 produces C·G → T·A transitions and C·G → A·T transversions about equally often, and together they account for about 19% of all the mutations in this spectrum.
The Pol(Y567S) Exo(+) and Pol(Y567T) Exo(+) spectra are very similar to the Pol(Y567A) Exo(+) spectrum and are therefore not shown here. Their ratios of transitions to transversions, which are somewhat higher than those of the Pol(Y567A) Exo(+) spectrum (Table IV) [...] At sites of multiple occurrences, the Pol(Y567A) Exo(-) spectrum much more often resembled the Pol(Y567A) Exo(+) than the Pol(+) Exo(-) spectrum. However, the Pol(Y567A) Exo(-) mutations at positions 2, 3, 77, 109, 110, 131-135, and 154 are predicted by neither the Pol(+) Exo(-) nor the Pol(Y567A) Exo(+) spectrum. If the Pol(Y567A) and Exo(-) mutators acted sequentially and independently, then the Pol(Y567A) Exo(-) spectrum should resemble the predominant Pol(Y567A) mutational input rather than the Pol(+) mutational input seen in the Pol(+) Exo(-) spectrum. Thus, the Pol(Y567A) Exo(-) spectrum tends to reinforce the conclusion drawn from the mutation distribution (Table III) that the Pol(Y567A) and the Exo(-) mutator activities do not interact multiplicatively. The exceptions to a simple spectral mixture noted above highlight the complexity of this interaction.
Sequence Determinants of Genetic Instability-Mutationally warm and hot sites and regions constitute DNA sequences that constrain polymerase fidelity and that therefore may provide insights into fidelity mechanisms. The hot spot at rI position 247 is imbedded in a generally hypermutable sequence, 5′-TGGCAC-3′, that is similar to another hypermutable sequence, 5′-TGGCAA-3′, previously described in the T4 rIIB gene (44). In both cases the central G is a transition hot spot, whereas the adjacent C mutates moderately often and the first G only slightly more than average. (The complement of this sequence, TTGCCA, is located at positions 26-31 but does not contribute mutations to any of our spectra; however, a transition at the first C of this sequence would produce a Ser → Val replacement, which might well go undetected.) The hot spot at rI position 203 is imbedded in yet another generally hypermutable sequence, 5′-CCCGTG-3′, where the third C is the most mutable, sometimes producing both transitions and transversions, and the T is a little less mutable and produces transitions. The hypermutable region around position 203 begins with three G·C base pairs, the only run of three G·C base pairs in rI. From the rI sequence, the sum of CCC and GGG expected from a random distribution of bases is 3.5, so perhaps the frequency of these runs has been reduced by mutation pressure. The entire T4 genome has A·T = 109278 and G·C = 59619 and is unequivocally depleted of such runs (expected = 1857, observed = 1239), and this deficit also appears in the observed infrequent use in T4 of the codons CCX, XCC, GGX, and XGG.
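The expected number of CCC plus GGG runs quoted for the whole T4 genome can be reproduced with a short calculation. The sketch below is only one plausible reading of how such an expectation is obtained: it assumes independent bases, equal frequencies of G and C on a strand, and overlapping three-base windows on a single linear strand. The function name is illustrative and does not come from the paper.

```python
# Expected number of CCC plus GGG runs on one strand of a random sequence
# with the base composition quoted in the text for the T4 genome.
AT_PAIRS = 109278   # A·T base pairs (from the text)
GC_PAIRS = 59619    # G·C base pairs (from the text)


def expected_ccc_ggg(at_pairs: int, gc_pairs: int) -> float:
    """Expected count of CCC + GGG windows, assuming independent bases
    and equal G and C content on the strand (a simplifying assumption)."""
    genome_length = at_pairs + gc_pairs
    p_c = gc_pairs / (2 * genome_length)   # per-strand frequency of C (and of G)
    windows = genome_length - 2            # overlapping 3-base windows, linear strand
    return 2 * p_c**3 * windows


if __name__ == "__main__":
    expected = expected_ccc_ggg(AT_PAIRS, GC_PAIRS)
    print(f"expected CCC + GGG: {expected:.0f}; observed (from the text): 1239")
```

Under these assumptions the expectation rounds to the value of 1857 given in the text, so the observed count of 1239 is roughly two-thirds of the random expectation.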
Further inspection of all six rI spectra reveals a strong association between hypermutability and GG (or CC) dinucleotides. We then examined four other spontaneous spectra available for T4 (all produced by T4 gp43) (36,45) and found an identical association. Both the central GG/CC motif and sites to its left and right frequently display increased mutability, and this motif accounts for almost all sites of hypermutability observed in T4 in vivo. The increased mutability of G·C-rich regions may in part reflect a previously suggested role for the increased stability of G·C base pairs compared with A·T base pairs. Such differential stability might modulate the melting of an adjacent mispair prior to partitioning from the Pol site to the Exo site (21,46). Indeed, the contribution of GG and CC regions to mutability is higher in the two Exo(+) spectra than in the two Exo(-) spectra at positions 202-203 and 247, although not at positions 3 and 109-110. However, our search for associations between the flanking and nearby sequences and the degree of hypermutability failed to reveal a mutability thermostat. Thus, understanding variable GG/CC hypermutability in T4 is likely to require systematic studies of the kinetics of misincorporation in vitro as a function of nearby bases.
Mutation Rates-An experimentally determined mutation frequency f (such as total mutants per total phages) can be converted into a mutation rate μ (such as mutations per chromosome replication) provided that the topology of replication is known (such as semiconservative DNA replication) and the growth parameters of the population have been measured (most often the final population size N). The conditions required to apply the expression μ = f/ln(Nμ) are well met for the gp43 constructs studied here except for gp43 Pol(Y567A) Exo(-). However, the growth conditions described above, which sufficed to provide a mutational spectrum, also sufficed to estimate the Pol(Y567A) Exo(-) mutation rate. Because of the high rates characteristic of these mutators, mutants sometimes contain multiple mutations. If the mutations are randomly distributed among rI genes, then most multiple mutations should lie many base pairs away from each other in any particular mutant, and the distributions of these distances should appear to be random. Such is the case (Fig. 3). For randomly distributed mutations, the Poisson distribution predicts that M≥2/M = 1 − f/(e^f − 1), where M≥2 is the number of mutants with ≥2 rI mutations and M is the total number of rI mutants. Thus, f can be estimated solely from the numbers of multiple and single mutations among sequenced mutants and can then be combined with an estimate of N to obtain the mutation rates; this is the method of multiples. The numbers of single and multiple mutations for the relevant RB69 gp43 variants are shown in Table V. Several rI mutation rates were then calculated in two ways. The method of the median was simply the median rI mutation rate calculated using μ = f/ln(Nμ) and was used to obtain the Pol(+) Exo(-) and Pol(Y567A) Exo(+) rates. The method of multiples was used to obtain values of f for all three gp43s.
These f values were then converted to rates using μ = f/ln(Nμ).
(In the case of Pol(Y567A) Exo(-), we assumed that the average number of progeny phage was the same, 182, for both infections described in Table II but that most of the phages in the mixed polymerase infection carried lethal mutations. This assumption is justified by the previously described vigor of the polymerase in vivo, where nearly normal amounts of DNA of normal size are synthesized by Pol(Y567A) Exo(-). Here, N = 182 instead of the total number of particles in the stock. However, because the condition N₀ << 1/μ is not met here, we must calculate the rate using μ = f/ln(N/N₀). The multiplicity of infection was about 10, but not all particles can participate under our experimental conditions. We therefore used the full estimated range N₀ = 1-8.) Note, however, that these multiples values are overestimates for at least two reasons. One is that all of the synonymous mutations and some of the missense mutations among the multiples have an r(+) phenotype as singles and therefore engender false multiples in the sense of our argument. Another is that some of the multiples may have arisen by recombination between singles. These confounding issues can be roughly factored out by determining the ratio of the multiples rate to the median rate for the Pol(+) Exo(-) and Pol(Y567A) Exo(+) polymerases, averaging them (to obtain 5.32), and then dividing the multiples rate for Pol(Y567A) Exo(-) (0.0391-0.0652) by this average value to obtain the corrected median value (0.00740-0.0122). Although this value may have an experimental uncertainty of a few-fold even beyond its stated range, it still turns out to be very useful.
Fig. 2 (legend). The sequence is that of the wild type rI strand complementary to the coding strand. P(+), Pol(+); E(+), Exo(+); P(M), Pol(Y567A); E(-), Exo(-). Transitions are entered above the wild type bases, and transversions are entered below. Additions (+) appear above the underlined short repeats and are almost all of single bases. Deletions (−) appear below the underlined short repeats and are almost all of single bases. Complex mutations and large additions and deletions are omitted because they appeared almost exclusively in the Pol(+) Exo(+) spectrum and only once among the 268 mutations in the other spectra; they will be described in a subsequent report. The four top displays (above the horizontal line) are the first halves of the four different spectra, and each second half appears below the dividing line and position numbers.
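The method of multiples described above amounts to inverting the Poisson relation for f and then dividing by a logarithmic growth term. The following is a minimal sketch under stated assumptions, not the authors' own procedure: the function names are illustrative, the inversion is done by simple bisection, and only the N/N₀ form of the rate expression (the one applied to the double mutator) is implemented. The counts in the usage example are hypothetical, because Table V is not reproduced here.

```python
import math


def multiples_fraction(f: float) -> float:
    """Poisson prediction from the text: M(>=2)/M = 1 - f/(e^f - 1)."""
    return 1.0 - f / math.expm1(f)


def solve_f(ratio: float, lo: float = 1e-9, hi: float = 50.0) -> float:
    """Invert multiples_fraction by bisection; the function increases with f."""
    if not 0.0 < ratio < 1.0:
        raise ValueError("ratio must lie strictly between 0 and 1")
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if multiples_fraction(mid) < ratio:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def rate_from_f(f: float, n_final: float, n_initial: float) -> float:
    """Rate expression used for the double mutator: mu = f / ln(N / N0)."""
    return f / math.log(n_final / n_initial)


if __name__ == "__main__":
    # Hypothetical counts for illustration only: 10 multiples among 100 mutants.
    f_hat = solve_f(10 / 100)
    # N = 182 and N0 = 1 to 8 are the values quoted in the text.
    low = rate_from_f(f_hat, 182, 1)
    high = rate_from_f(f_hat, 182, 8)
    print(f"f = {f_hat:.4f}, rate range = {low:.4f} to {high:.4f}")
```

Whatever the value of f, the two ends of the N₀ = 1-8 range differ by the factor ln(182)/ln(182/8), about 1.67, which is the spread expected between the lower and upper multiples rates.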
The forward mutation rates appear in Table VI. As in the reversion tests, all of the mutant polymerases display strong mutator activity. The top three values are expected to be reproducible to within 1.5-fold or less (and the next two to within 2-fold or less) about 95% of the time (6), so that all four single-mutator values are indistinguishable. Because the forward mutation test averages over many base pairs, at some of which BPSs will not produce a mutant phenotype, the relative increases for the Pol site mutators are lower than in the reversion tests. (Relative μ_g values, not shown here, are not identical to relative μ_rI values because the correction factor C varies according to the fraction of BPSs in each spectrum.) The mutator activity of the Pol(+) Exo(-) polymerase (510-fold) is the same as that observed previously (490-fold) using an assay that screens all T4 r mutants (18) and is similar to values obtained with various T4 Pol(+) Exo(-) mutants: 650-fold (47) and 760-fold (48) when screening for acridine resistance and 310-fold when screening for all r mutants (18).
Kinetic Parameters in Vitro-Using highly purified, nuclease-free Exo(-) gp43, we measured steady-state kinetic constants for Pol(+) and Pol(Y567A) for the incorporation and extension of one correct base pair and various mispairs. Using measured values of kcat and Km(app), we calculated the steady-state catalytic efficiency kcat/Km, a discrimination factor against mispair formation by a particular gp43, and a mutator factor or antimutator factor for gp43 Pol(Y567A) versus gp43 Pol(+).
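The derived quantities named above can be illustrated with a small calculation. The text does not spell out the formulas, so the sketch below assumes the conventional definitions: discrimination is the catalytic efficiency for the correct pair divided by that for the mispair, and the mutator factor is the wild-type discrimination divided by the variant discrimination. All names and numbers in it are hypothetical and are not taken from Tables VII or VIII.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Kinetics:
    kcat: float  # turnover number (any consistent unit)
    km: float    # apparent Km (any consistent unit)

    @property
    def efficiency(self) -> float:
        """Steady-state catalytic efficiency kcat/Km."""
        return self.kcat / self.km


def discrimination(correct: Kinetics, mispair: Kinetics) -> float:
    """Assumed definition: efficiency(correct pair) / efficiency(mispair)."""
    return correct.efficiency / mispair.efficiency


def mutator_factor(disc_wild_type: float, disc_variant: float) -> float:
    """Assumed definition: > 1 indicates a mutator, < 1 an antimutator."""
    return disc_wild_type / disc_variant


if __name__ == "__main__":
    # Hypothetical values chosen only to exercise the functions.
    wt = discrimination(Kinetics(10.0, 2.0), Kinetics(0.01, 200.0))
    variant = discrimination(Kinetics(4.0, 4.0), Kinetics(0.02, 100.0))
    print(f"mutator factor = {mutator_factor(wt, variant):.1f}")
```

Separating efficiency from discrimination in this way makes it easy to ask whether a change in the mutator factor comes mainly from kcat or from Km, which is the comparison drawn in the text.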
The kinetic parameters for mispair formation are presented in Table VII. Although these results are preliminary, the trends are unequivocal. The catalytic efficiency with Pol(Y567A) Exo(-) for the correct G·C base pair is about 0.2 of the efficiency with Pol(+) Exo(-). This value is somewhat smaller than the relative rate of total DNA synthesis in vivo for these two genotypes (about 0.7; Fig. 1). However, the in vitro value of 0.2 varies depending on the sequence context of the measured site. In vitro, gp43 Pol(+) Exo(-) discriminates strongly against all tested mispairs but more strongly against transversion mispairs than against transition mispairs (a result concordant with the higher in vivo frequency of transitions than transversions seen in Table III). Conversely, the Pol(Y567A) Exo(-) mutator factor is stronger for a transition mispair than for any of three transversion mispairs. Discrimination against C·C and G·G is particularly strong in gp43 Pol(+) Exo(-) and remains strong in gp43 Pol(Y567A) Exo(-) (with no change at all in the case of C·C), a result concordant with the absence of G·C → C·G transversions in all these spectra (Table III). For all four mispairs, the mutator factor is influenced more strongly by a change in kcat than in Km. Thus, studies both in vivo (Table III) and in vitro (Table VIII) demonstrate that Pol(Y567A) is a strong BPS mutator with a preference for generating transition mutations.
The kinetic parameters for mispair extension are presented in Table VIII. The catalytic efficiencies for extending a normal base pair are similar (2.2 versus 3.1) for these two gp43s. Both a transition mispair and a transversion mispair were extended very inefficiently, in some cases beyond the limits of measurement. Perhaps surprisingly, the Pol(Y567A) enzyme extended both the transition and the transversion mispairs less efficiently than did the Pol(+) enzyme, with the Km contribution outweighing the kcat contribution. A somewhat similar result was reported for the Y766S variant of the Klenow fragment of E. coli DNA polymerase I (49), where catalytic efficiencies rather than discrimination factors revealed a misinsertion mutator factor of about 130, although in this case most of the difference was contributed by the kcat term. We note in closing that a mispair might extend poorly either because of its inappropriate geometry within the Pol site (25) or if it resided for a long time in the inactivated Exo site.
Fig. 3. Distances between double mutations. Whereas complex mutations are always close together, most pairs of randomly arising single mutations will be far apart. No multiple mutations were observed in the Pol(+) Exo(+) spectra, as expected from the low mutation frequency.
TABLE V. Distributions of multiple rI mutations produced by mutant RB69 gp43s. Multiple mutations consisted mostly of doubles (D) plus one triple (T). The top two median μ_rI values were calculated directly from mutant frequencies. The multiples μ_rI values were calculated from the M≥2/M ratio as described in the text. The top two multiples/median ratios were averaged to obtain the third value, which was then applied to the bottom multiples value to obtain the bottom (underlined) median value.
DISCUSSION
This report describes the first detailed analysis in vivo of determinants of the fidelity of DNA synthesis in a B family DNA polymerase.
One determinant is the proofreading exonuclease (Exo) function, which was studied using a bimutational knockout of Exo activity. Another determinant is the Pol function, in which more subtle modifications define a residue that turns out to be critical for accurate base selection. The interpretation of the results is simpler than in many other systems because phage T4 is not subject to the action of the several E. coli mismatch repair systems (6) and seems to be unable to repair mutational heteroduplexes on its own (5). Therefore, comparing mutations arising in Pol(+) Exo(+) and Pol(+) Exo(-) backgrounds reveals both the kinds of mutations introduced by the Pol function in the first place and the efficiencies with which each kind of mutation is proofread.
The Proofreading Contribution to Fidelity-Although T4 Exo(-) mutations often impair polymerase activity (47), we were fortunate to observe only a small effect of the D222A/D327A combination on polymerase activity either in vivo (Fig. 1) or in vitro (28). The Exo(-) form produces a strong mutator effect, increasing both the rI mutant frequency (Table III) and rate (Table VI) and the total r frequency (18) about 500-fold. In reversion tests, the Exo(-) state increased mutation frequencies by 500- to 2600-fold (Table I) and rates by similar factors. The standard DNA microbial genomic mutation rate is 0.0034/replication (50). For phage RB69, whose genome size is very close to that of T4 (34), this genomic rate corresponds to an average rate per base pair of 2.0 × 10⁻⁸. If proofreading contributes a fidelity factor of about 1/500 = 2 × 10⁻³ to this rate, then the average fidelity of DNA synthesis itself must be 10⁻⁵/base pair. Applying the standard genomic rate to E. coli and dividing by genome size gives an average rate/base pair of 7.3 × 10⁻¹⁰. Extending fidelity factors for base substitution (51) to all kinds of mutations gives a synthesis fidelity of 0.9 × 10⁻⁵, a proofreading factor of 1.7 × 10⁻², and a mismatch repair factor of 5 × 10⁻³/base pair. Thus, RB69 achieves its spontaneous mutation rate starting with almost exactly the same accuracy of DNA synthesis as achieved by E. coli but attains the remaining balance in a proofreading step that is about 8-fold stronger than in E. coli. E. coli uses additional powerful DNA mismatch repair systems to achieve the standard rate. However, we should point out that these computations do not take into account any coupling that may occur between the several determinants of fidelity.
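The arithmetic behind these fidelity factors is short enough to check directly. The sketch below uses only numbers quoted in the text; the one assumption is that the RB69 genome size can be approximated by the T4 base-pair counts given earlier (109278 A·T plus 59619 G·C), which the text describes as very close. Variable names are illustrative.

```python
GENOMIC_RATE = 0.0034            # standard genomic mutation rate per replication (text)
T4_GENOME_BP = 109278 + 59619    # A·T + G·C base pairs quoted for T4; assumed for RB69
PROOFREADING_FACTOR = 1 / 500    # from the roughly 500-fold Exo(-) mutator effect

# Average rate per base pair and the implied fidelity of synthesis alone.
rate_per_bp = GENOMIC_RATE / T4_GENOME_BP
synthesis_fidelity = rate_per_bp / PROOFREADING_FACTOR

# E. coli fidelity factors quoted in the text.
ECOLI_SYNTHESIS = 0.9e-5
ECOLI_PROOFREADING = 1.7e-2
ECOLI_MISMATCH_REPAIR = 5e-3
ecoli_rate_per_bp = ECOLI_SYNTHESIS * ECOLI_PROOFREADING * ECOLI_MISMATCH_REPAIR

# How much stronger proofreading is in RB69 than in E. coli.
proofreading_advantage = ECOLI_PROOFREADING / PROOFREADING_FACTOR

if __name__ == "__main__":
    print(f"RB69 rate per bp:        {rate_per_bp:.2e}")
    print(f"RB69 synthesis fidelity: {synthesis_fidelity:.2e}")
    print(f"E. coli rate per bp:     {ecoli_rate_per_bp:.2e}")
    print(f"proofreading advantage:  {proofreading_advantage:.1f}-fold")
```

The product of the three E. coli factors comes out near the quoted 7.3 × 10⁻¹⁰ per base pair, and the ratio of the two proofreading factors gives the roughly 8-fold difference stated in the text. Like the text's own computation, this treats the factors as independent and ignores any coupling between them.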
In E. coli, proofreading discriminates about 4-fold more strongly against transversions than against transitions, whereas mismatch repair discriminates about 14-fold more against transitions than against transversions (51). Transition mismatches are more easily extended and are less efficiently proofread than transversion mismatches by most polymerases (21,25). Because RB69, like T4, almost certainly lacks mismatch repair, we might expect to find a reversed proofreading balance, and we do. The transition rate increases about 1370-fold (from 0.43 × 10⁻⁵ × 15/79 in Pol(+) Exo(+) to 220 × 10⁻⁵ × 39/77 in Pol(+) Exo(-)), whereas the transversion rate increases only about 310-fold (calculated similarly). The corresponding factors for frameshifts arising predominantly in runs are roughly 1800-fold for +1 mutations and 530-fold for −1 mutations.
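The fold increase for transitions follows from partitioning each total rI rate by the fraction of mutations in that class. The sketch below repeats that calculation with the numbers given in the parentheses above; the helper name is illustrative, and the transversion counts are not repeated in this section, so only the transition case is shown.

```python
def class_rate(total_rate: float, class_count: int, total_count: int) -> float:
    """Rate for one mutation class: total rate times that class's share of mutations."""
    return total_rate * class_count / total_count


# Values quoted in the text for transitions.
wild_type_transitions = class_rate(0.43e-5, 15, 79)    # Pol(+) Exo(+)
exo_minus_transitions = class_rate(220e-5, 39, 77)     # Pol(+) Exo(-)
fold_increase = exo_minus_transitions / wild_type_transitions

if __name__ == "__main__":
    print(f"transition rate increase: about {fold_increase:.0f}-fold")
```

The result agrees with the quoted increase of about 1370-fold to within rounding of the input values.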
The absence of complex mutations and large additions and deletions from the Pol(+) Exo(-) distribution may mean either that these arise at intrinsically low frequencies and are not proofread efficiently or that they are generated by an aberration of proofreading in the first place. The latter, for instance, might occur by the removal of a correct base followed by misaligned reannealing of the primer terminus to a distant complementary template sequence.
A Polymerase Contribution to Fidelity-We chose to examine the role in accuracy of RB69 gp43 residue Tyr(567) because this residue is unequivocally close to the Pol active site (10,15) and because kinetic parameters for the incorporation of correct bases were almost unaffected in a Pol(Y567A) Exo(-) mutant (28). It therefore seemed likely that if Tyr(567) interacts with the incoming dNTP, it does so with the base rather than with the phosphate or deoxyribose. As it turns out, modifications at this residue produce either a robust polymerase with sharply reduced fidelity (Y567(A/S/T)) or a moribund polymerase (Y567F) whose fidelity could not be measured.
RB69 gp43 Tyr(567) may be related functionally (even if not strictly structurally) to either or both of two A family E. coli Klenow Pol site residues, Tyr(766) (15) and Phe(762) (10), that are crucial for accuracy. Because substitutions at gp43 Tyr(567) increase BPS mutagenesis far more than frameshift mutagenesis, Tyr(567) is involved more deeply in the fidelity of base selection than in preventing slippage errors. This result is consistent with the properties of Klenow fragment Y766S and Y766A, which are strong BPS mutators promoting especially T·G and G·T mispairs in vitro but also generating deletions of two or more bases (49,52). The mutator activities of RB69 gp43 Y567A, Y567S, and Y567T are similar, although Y567S and Y567T are slightly more prone to transition and frameshift mutagenesis than is Y567A (Tables IV and VI). Understanding the similar effects of these three substitutions and the unanticipated near lethality of the Y567F substitution must await further structural information.
Genomic Mutation Rates-The genomic mutation rates of the mutators are instructive. The rate for each of the single mutators is about 3-4. T4 can survive with this high rate at least long enough to grow into a population of roughly 10⁹ phages because of several relieving factors: more than half of the genome is composed of genes whose function is not required for survival under laboratory conditions, the fraction of BPSs is at least 75% among the mutators (and many of these are relatively innocuous), and selection against deleterious mutations occurs during the growth of a stock. However, the genomic rate for the double mutator is roughly 15, rendering it unable to propagate. Despite the uncertainty in this last value, it is clearly greater than 3, a value that does permit propagation. On the other hand, a rate of 15 is far lower than expected if the component mutator activities in Pol(Y567A) Exo(-) were multiplicative, in which case the genomic mutation rate would be roughly 1800.
Pol-Exo Coupling-Depictions of the accuracy of replicative DNA synthesis usually attribute multiplicative effects to fidelity factors for insertion, proofreading, and mismatch repair. In E. coli, mutationally dissecting the latter two revealed them to be coupled; specifically, a double knockout displayed mutator strength not much greater than did a knockout of proofreading alone, because the large number of input mutations that were not proofread quickly saturated mismatch repair (53). In phage T4, both the deleterious effects of Exo defects upon polymerase activity (47) and the properties of numerous mutants that seem to affect partitioning between the Pol and Exo site (13) imply coupling between the Exo and Pol sites. In no polymerase, however, had the interaction between Pol and Exo mutators been examined previously. After overcoming the complications of error catastrophe, we observed that Pol(Y567A) and Exo(-) interact only a little more strongly than additively. First, the double-mutator spectrum is an approximate mixture of the component spectra, although with important exceptions (Fig. 2). Second, the double-mutator mutation rate is only a little greater than the sum of the component rates (220 + 210 = 430 versus ~970), where the probable accuracy of the values does not preclude simple additivity (Table VI). We have observed a quantitatively similar interaction between these two mutator activities when the purified polymerases are assayed in vitro.² In a structural sense, coupling means that the partitioning of mispairs between the Pol and Exo sites is determined by more than simple melting and diffusion, and in particular it means that local changes in molecular architecture can affect partitioning strongly. Our results might be explained in several ways: 1) Most DNA polymerases extend mispaired primer termini far more slowly than correctly paired primer termini (25), favoring partitioning to the Exo site. If gp43 Tyr(567) replacements strongly promoted mismatch extension, residence time in the Exo site could be diminished sufficiently to render the Pol(Y567A) Exo(+) enzyme functionally Exo(-). However, the Pol(Y567A) Exo(+) and Pol(Y567A) Exo(-) mutational distributions and spectra would then be very similar, which they are not, and direct estimates of mispair extension would reveal a large increase, whereas they reveal a substantial decrease (Table VIII). 2) Because Tyr(567) replacements reduce mismatch extension, the DNA might simply dissociate from the enzyme, and the defective primer terminus might then be unable to reassociate with a gp43 molecule in a productive manner. Thus, most mutations would be lost. 3) If the amino acid replacements in the Exo domain lower the rate of return of the primer to the Pol site, the outcome could be the same, the mutations being lost. 4) The Exo defect might unexpectedly improve fidelity at the modified Pol site. However, there is neither precedent nor structural hint of such a possibility, and it seems unlikely.
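The additive-versus-multiplicative comparison made under Pol-Exo Coupling can be restated numerically. The sketch below uses the rI rates quoted in this section, all in units of 10⁻⁵; it assumes that the value 210 belongs to Pol(Y567A) Exo(+), since 220 is given elsewhere in the text for Pol(+) Exo(-). The multiplicative expectation is computed here as the wild-type rate times the product of the two fold increases, which is one conventional reading and not a formula stated by the authors.

```python
# rI mutation rates in units of 1e-5, as quoted in the text.
WILD_TYPE = 0.43         # Pol(+) Exo(+)
EXO_MINUS = 220.0        # Pol(+) Exo(-)
POL_Y567A = 210.0        # Pol(Y567A) Exo(+), assignment inferred from context
DOUBLE_OBSERVED = 970.0  # Pol(Y567A) Exo(-), approximate

# Expectation if the two mutator effects simply add.
additive_expectation = EXO_MINUS + POL_Y567A

# Expectation if the two fold increases multiply.
multiplicative_expectation = WILD_TYPE * (EXO_MINUS / WILD_TYPE) * (POL_Y567A / WILD_TYPE)

if __name__ == "__main__":
    print(f"additive expectation:       {additive_expectation:.0f}")
    print(f"multiplicative expectation: {multiplicative_expectation:.0f}")
    print(f"observed / additive:        {DOUBLE_OBSERVED / additive_expectation:.2f}")
    print(f"multiplicative / observed:  {multiplicative_expectation / DOUBLE_OBSERVED:.0f}")
```

On this reading the observed double-mutator rate is only about twice the additive expectation but more than a hundred times lower than the multiplicative one, consistent with the contrast the text draws between genomic rates of roughly 15 and roughly 1800.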
Future Directions-It will be important to extend our analyses to these same gp43 variants in vitro to gain insights into how accurately in vitro analyses of polymerase fidelity reflect the situation in vivo. Such analyses will benefit from reconstituting the replication complex as fully as possible, in addition to studying unassisted gp43s. This will include extending our kinetic analyses to the pre-steady state. It is also crucial to gain structural information on gp43 in the closed configuration complexed with a primer template and an incoming dNTP. The logical later extension is a structural comparison of correct versus incorrect dNTPs, extending eventually to a comparative structural analysis of mutationally cold and mutationally hot sequences.