Base Substitution Specificity of DNA Polymerase β Depends on Interactions in the DNA Minor Groove*

To examine the hypothesis that interactions between a DNA polymerase and the DNA minor groove are critical for accurate DNA synthesis, we studied the fidelity of DNA polymerase β mutants at residue Arg283, where arginine, which interacts with the minor groove at the active site, is replaced by alanine or lysine. Alanine substitution, removing minor groove interactions, strongly reduces polymerase selectivity for all single-base mispairs examined. In contrast, the lysine substitution, which retains significant interactions with the minor groove, has wild-type-like selectivity for T·dGMP and A·dGMP mispairs but reduced selectivity for T·dCMP and A·dCMP mispairs. Examination of DNA crystal structures of these four mispairs indicates that the two mispairs excluded by the lysine mutant have an atom (N2) in an unfavorable position in the minor groove, while the two mispairs permitted by the lysine mutant do not. These results suggest that unfavorable interactions between an active site amino acid side chain and mispair-specific atoms in the minor groove contribute to DNA polymerase specificity.

Accurate DNA synthesis catalyzed by DNA polymerases during repair and replication is essential for genome stability. Thus, it is important to understand how DNA polymerases select correct nucleotides for incorporation. That polymerases may distinguish correct from incorrect base pairs by examining the positions of hydrogen bonding atoms in the minor groove of the template·primer duplex was suggested more than 20 years ago (1,2). The positions of the two minor groove hydrogen bond acceptors (N3 of purines and O2 of pyrimidines) are similar in Watson-Crick base pairs, but different in mismatched base pairs (3,4). The structure of a DNA polymerase β·DNA·ddNTP complex reveals that the active site binding pocket for the nascent base pair is partly formed by interactions of side chains with the DNA minor groove (5,6). Structural studies of Pol I family polymerases from Thermus aquaticus (7,8), Escherichia coli (9), bacteriophage T7 (10), and Bacillus stearothermophilus (11), and a Pol α family polymerase from bacteriophage RB69 (12) reveal numerous interactions (hydrogen bonds and van der Waals contacts) in the DNA minor groove, both at and upstream of the active site.
Here we probe the importance of polymerase interactions with the DNA minor groove using DNA pol β, the smallest and most extensively studied mammalian DNA polymerase. Pol β's primary function is in base excision repair (BER) of DNA damage resulting from exposure to endogenous metabolites and reactive genotoxicants (13-15). In this capacity, pol β fills in gaps of a single nucleotide (single-nucleotide BER) or gaps of two to six nucleotides (alternate or long patch BER). Pol β lacks an intrinsic proofreading exonuclease and generates single-base substitution errors at rates of 0.5-13 × 10⁻⁴ when filling short gaps (16,17). This fidelity is higher than predicted by free energy differences between correct and incorrect base pairing in solution (18), suggesting that, like other DNA polymerases, pol β contributes to the selectivity of nucleotide incorporation.
Structural and biochemical evidence suggests that nucleotide selection by pol β occurs at a step following initial, nonspecific dNTP binding (5,6,19,20). When DNA with a single-nucleotide gap is bound by pol β, a 90° kink is observed at the 5′-phosphodiester linkage of the template residue that base pairs with the incoming dNTP (5,6). The N-terminal 8-kDa domain of the enzyme binds to the 5′-phosphate in the gap, causing pol β to have a more compact structure compared with the unliganded enzyme (5). Upon binding of a correct dNTP, the 8-kDa domain moves slightly toward the active site, and a portion of the C-terminal domain rotates a large distance, closing around the incoming nucleotide and stabilizing its complement in the template strand (5) (Fig. 1, B and C). The binding pocket for the nascent base pair (Fig. 1A) is thus composed of the terminal base pair in the duplex template·primer stem on one side and residues in the C-terminal domain of pol β on the other side (5,19). If there is complementarity between the enzyme and a correct base pair, numerous small conformational changes take place, triggering catalysis (5).
A key side chain involved in forming the active site pocket is Arg283. As highlighted in red in Fig. 1A, this side chain makes van der Waals contacts with the minor groove edge of the templating nucleotide. This side chain is hydrogen-bonded to the sugar of the preceding template nucleotide in the template·primer duplex. Previous studies have shown that the R283A mutant of pol β has greatly reduced fidelity, consistent with the importance of these interactions for polymerase selectivity (16,21,22). This phenotype includes an error rate of 25% for T·dGMP (16), suggesting complete loss of discrimination.
Here we compare the extreme loss of selectivity resulting from this near complete removal of side chain interactions to the effects of arginine substitution with lysine, retaining some interactions with the minor groove and therefore the possibility of some selectivity.

EXPERIMENTAL PROCEDURES
Short Gap Fidelity Assays-The mutagenesis of the pol β gene and wild-type and mutant enzyme purification have been described previously (16). Gapped DNAs were constructed in which the five- or one-nucleotide single-stranded gap contains a portion of the lacZ α-complementation sequence, which has been modified by the introduction of an in frame opal codon (17). Fidelity was determined in reaction mixtures (20 μl) containing 20 mM Tris-HCl, pH 8.0, 2 mM dithiothreitol, 25 mM NaCl, 10 mM MgCl2, 5% glycerol, 1 mM ATP (Amersham Pharmacia Biotech), 150 ng (32 fmol) of gapped DNA, 500 μM each dATP, dCTP, dGTP, and dTTP (all dNTPs were from Amersham Pharmacia Biotech), 400 units of T4 DNA ligase (New England Biolabs), and pol β. Wild-type pol β was used at a 25:1 molar ratio of enzyme to DNA, the lysine mutant was used at a 200:1 molar ratio, and the alanine mutant was used at a 250:1 molar ratio. Following incubation at 37 °C for 60 min, reactions were stopped by adding EDTA to 15 mM, and the products were separated on an agarose gel. Gel slices containing covalently closed circular DNA products were isolated, and DNA products were electroeluted and concentrated. DNA samples were introduced into E. coli MC1061 by electroporation, and cells were plated as described (23). After scoring revertant and total plaques, revertants were replated to confirm the phenotype, and then reversion frequencies were calculated. Sequence analysis of revertants was performed to define the sequence responsible for the blue plaque phenotype (Sequenase Version 2.0 and sequencing reagents from U. S. Biochemical Corp.).

* The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
¶ To whom correspondence should be addressed. Tel.: 919-541-2644; Fax: 919-541-7613; E-mail: kunkel@niehs.nih.gov.
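The enzyme-to-DNA molar ratios quoted for the short gap fidelity assays can be converted into absolute enzyme amounts with simple arithmetic. The sketch below is only an illustration, assuming the stated 32 fmol of gapped DNA and the 20 μl reaction volume; the function and variable names are hypothetical and not part of the original protocol.

```python
# Values stated in the assay description.
DNA_FMOL = 32.0            # 150 ng of gapped DNA
REACTION_VOLUME_UL = 20.0  # reaction volume in microliters

# Enzyme:DNA molar ratios stated for each pol beta variant.
MOLAR_RATIOS = {"wild-type": 25, "R283K": 200, "R283A": 250}


def enzyme_fmol(ratio, dna_fmol=DNA_FMOL):
    """Femtomoles of enzyme needed for a given enzyme:DNA molar ratio."""
    return ratio * dna_fmol


def concentration_nm(fmol, volume_ul=REACTION_VOLUME_UL):
    """Concentration in nM; 1 fmol per microliter equals 1 nM."""
    return fmol / volume_ul


if __name__ == "__main__":
    for name, ratio in MOLAR_RATIOS.items():
        amount = enzyme_fmol(ratio)
        print(f"{name}: {amount:.0f} fmol enzyme, {concentration_nm(amount):.0f} nM")
```

Under these assumptions the mutant enzymes are present at roughly an order of magnitude higher concentration than the wild-type enzyme, which matches the higher molar ratios stated in the text.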
Structural Superpositions-The superpositions of the structures of Watson-Crick base pairs and base mismatches were accomplished using the Insight® II Version 97.0 molecular modeling system. The atoms in the ribose ring of the nucleotides in the Watson-Crick base pair and the corresponding atoms in the mispair were selected and used to perform a minimum root mean square alignment of the two base pairs. The structures of the T·A and A·T base pairs are from the x-ray diffraction studies of a synthetic B-DNA dodecamer (24). The Insight® II system was also used to create space-filling representations of the active site of pol β. Drawings are based on the crystal structure of DNA polymerase β complexed with gapped DNA and ddCTP (5) (Protein Data Bank number 1BPY). The arginine side chain at position 283 of pol β was replaced with the lysine or alanine side chain using the Insight® II Biopolymer module. The correct or mismatched base pairs were superimposed on the nascent base pair at the pol β active site. The atoms in the sugar rings of the template G and incoming ddCTP in the pol β complex and the corresponding atoms in the correct base pair or mispair were selected and used to perform the superposition.
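A minimum root mean square alignment of two matched atom sets, such as the sugar ring atoms described above, can be sketched in a few lines. This is not the Insight II implementation, whose internals are not described here; it is a generic least-squares superposition using Horn's quaternion method with a small Jacobi eigenvalue solver, and the coordinates in the usage example are made-up placeholders rather than real sugar ring atoms.

```python
import math


def centroid(points):
    n = len(points)
    return [sum(p[k] for p in points) / n for k in range(3)]


def jacobi_eigen(matrix, sweeps=50, tol=1e-24):
    """Eigenvalues and eigenvectors (as columns) of a small symmetric matrix."""
    n = len(matrix)
    a = [row[:] for row in matrix]
    v = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(i + 1, n))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                theta = math.pi / 4 if diff == 0.0 else 0.5 * math.atan(2 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for m in (a, v):  # column update: M <- M J
                    for k in range(n):
                        mp, mq = m[k][p], m[k][q]
                        m[k][p], m[k][q] = c * mp - s * mq, s * mp + c * mq
                for k in range(n):  # row update: A <- J^T A
                    ap, aq = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * ap - s * aq, s * ap + c * aq
    return [a[i][i] for i in range(n)], v


def optimal_rotation(mobile, reference):
    """Rotation matrix that best maps centered `mobile` onto centered `reference`."""
    cm, cr = centroid(mobile), centroid(reference)
    P = [[p[k] - cm[k] for k in range(3)] for p in mobile]
    Q = [[q[k] - cr[k] for k in range(3)] for q in reference]
    S = [[sum(p[i] * q[j] for p, q in zip(P, Q)) for j in range(3)] for i in range(3)]
    (sxx, sxy, sxz), (syx, syy, syz), (szx, szy, szz) = S
    N = [
        [sxx + syy + szz, syz - szy, szx - sxz, sxy - syx],
        [syz - szy, sxx - syy - szz, sxy + syx, szx + sxz],
        [szx - sxz, sxy + syx, -sxx + syy - szz, syz + szy],
        [sxy - syx, szx + sxz, syz + szy, -sxx - syy + szz],
    ]
    values, vectors = jacobi_eigen(N)
    best = max(range(4), key=lambda i: values[i])
    # The eigenvector of the largest eigenvalue is the optimal unit quaternion.
    w, x, y, z = (vectors[k][best] for k in range(4))
    return [
        [w * w + x * x - y * y - z * z, 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (y * x + w * z), w * w - x * x + y * y - z * z, 2 * (y * z - w * x)],
        [2 * (z * x - w * y), 2 * (z * y + w * x), w * w - x * x - y * y + z * z],
    ]


def superpose(mobile, reference):
    """Return the moved copy of `mobile` and its RMSD to `reference`."""
    R = optimal_rotation(mobile, reference)
    cm, cr = centroid(mobile), centroid(reference)
    moved = []
    for p in mobile:
        d = [p[k] - cm[k] for k in range(3)]
        moved.append([sum(R[i][k] * d[k] for k in range(3)) + cr[i] for i in range(3)])
    rmsd = math.sqrt(sum(math.dist(m, r) ** 2 for m, r in zip(moved, reference)) / len(mobile))
    return moved, rmsd


if __name__ == "__main__":
    # Placeholder coordinates for five matched atoms (not real structural data).
    ring_a = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [2.0, 1.4, 0.3], [0.8, 2.3, -0.2], [-0.4, 1.2, 0.5]]
    ring_b = [[-y + 1.0, x + 2.0, z + 3.0] for x, y, z in ring_a]  # rotated and shifted copy
    _, rmsd = superpose(ring_a, ring_b)
    print(f"RMSD after superposition: {rmsd:.6f}")
```

In practice the same rotation and translation derived from the selected ring atoms would then be applied to every atom of the base pair, so that differences in base positions become visible relative to a common sugar frame.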

RESULTS AND DISCUSSION
As models for DNA synthesis in both the single-nucleotide and long patch BER pathways, we used M13mp2 DNA substrates containing a single-nucleotide gap or a five-nucleotide gap (17). These gaps contain an in frame opal codon in the lacZ α-complementation sequence that renders M13 plaques colorless, and polymerase errors that restore α-complementation yield a blue plaque phenotype. When the fidelity of the R283A and R283K mutant polymerases was compared with the wild-type enzyme, both had significantly lower fidelity, as indicated by the increases in overall reversion frequencies (mutant fraction (MF)) in both gap-filling assays (Table I). Consistent with retention of some selectivity by the R283K mutant, the increase relative to wild-type was approximately 10-fold. In contrast, a greater than 100-fold increase was observed for the R283A mutant. DNA sequence analysis of lacZ revertants indicated that the R283A mutant was much less accurate than wild-type pol β for a variety of mispairs (Table I, Fig. 2). In contrast, the R283K mutant was selectively error-prone for some mispairs but not for other mispairs. For example, both mutant enzymes were error-prone for misincorporation of dCMP opposite template T or A, but the R283K mutant was as accurate as wild-type pol β for misincorporation of dGMP opposite template T or A (Table I, Fig. 2).
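The reversion frequency (mutant fraction) and the fold increase relative to wild type follow from plaque counts by simple division. The sketch below illustrates that bookkeeping; the plaque counts are invented placeholders chosen only to show the calculation, not data from Table I, and the function names are hypothetical.

```python
def mutant_fraction(revertant_plaques, total_plaques):
    """Reversion frequency: blue (revertant) plaques divided by all plaques scored."""
    if total_plaques <= 0:
        raise ValueError("total_plaques must be positive")
    return revertant_plaques / total_plaques


def fold_increase(mutant_mf, wild_type_mf):
    """Mutant fraction of a variant expressed relative to the wild-type enzyme."""
    return mutant_mf / wild_type_mf


if __name__ == "__main__":
    # Hypothetical plaque counts (revertants, total), for illustration only.
    counts = {"wild-type": (30, 100_000), "variant": (300, 100_000)}
    wt_mf = mutant_fraction(*counts["wild-type"])
    for name, (revertants, total) in counts.items():
        mf = mutant_fraction(revertants, total)
        print(f"{name}: MF = {mf:.1e}, fold over wild-type = {fold_increase(mf, wt_mf):.1f}")
```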
To explain the distinct base substitution specificities of the R283K and R283A enzymes, we compared the crystal and NMR structures of DNAs containing the four mispairs characterized in Fig. 2 with those of DNAs with correct base pairs. When compared with a correct base pair, the thymine O4-keto group of a T·G wobble base pair (25) is displaced in the major groove and the guanine N2-amino group protrudes farther into the minor groove (Fig. 3A). In the active site of both the wild-type and R283K enzymes, closure of the C-terminal domain around the T·G mispair to allow catalysis would require the close approach of the hydrogen bond donors of the Arg283 side chain and N2-atom of the incoming dGTP (Fig. 4, A and B; Arg, Lys). This unfavorable interaction or interference in the minor groove (Fig. 4A, white arrow; B, black arrows) may explain how the wild-type and R283K enzymes discriminate against the T·dGTP mispair (Fig. 2). The R283A mutant is unable to discriminate against this mispair due to the absence of unfavorable minor groove interactions (Fig. 4B, Ala). In contrast to the T·G mispair, the structure of a T·C mispair (26) (Fig. 3B) does not suggest a difference in the minor groove that would lead to steric interference with lysine but not alanine at position 283 (Fig. 4C). This analysis is consistent with our observation of the similar infidelity for the R283A and R283K mutants with the T·dCMP mispair (Fig. 2).
In a similar manner, a comparison of the crystal structures of A·G and A·C mispairs to those of correct base pairs can be used to understand the distinct error specificities of the R283A and R283K mutant polymerases. In both the five- and one-nucleotide gap-filling reactions, R283K discriminates against A·dGTP mispairs to the same extent as wild-type pol β (Fig. 2). This may be due to unfavorable interactions (Fig. 4D, black arrows) between the arginine or lysine side chain and the N2 of the incoming guanine in the minor groove in the A(syn)·G mispair (27) (Fig. 3C). Again, the smaller alanine side chain does not provide this discriminating interaction (Fig. 4D). The structure of the A·C pair (28) superimposes well with the adenine·thymine pair (24) (Fig. 3D). This minor groove similarity may provide little or no information for steric hindrance discrimination by the lysine or alanine side chains (Fig. 4E).
The unique error specificities observed with these mutant pol β enzymes thus suggest that interactions between amino acid side chains and the DNA minor groove contribute to the high polymerase selectivity needed to faithfully repair genomes. This interpretation is consistent with the altered fidelity of other DNA polymerases when side chains that interact in the minor groove are altered (29-31).

Footnotes to Table I:
a Error rates per detectable nucleotide polymerized were calculated by dividing the total number of revertants obtained for each substitution error by the total number of mutants sequenced and then multiplying by the total mutant fraction.
b Data for wild-type pol β are from Osheroff et al. (17).
c Number of pol β revertants sequenced (n). Analysis is shown for the revertants resulting from misinsertions opposite the template adenine. For the wild-type enzyme, an additional 19 revertants resulted from complex substitution-addition mutations, and eight revertants had a base substitution mutation at one of the other two positions in the TGA codon (17). For the R283K mutant, 6 of 48 errors were due to substitution-additions, and one revertant resulted from a base substitution outside of the TGA target.
d MF, mutant fraction.
e Numbers in parentheses indicate number of base substitution errors observed.
f Data for the wild-type and R283A pols β are from Beard et al. (16).
g DNA from 47 wild-type, 50 R283K, and 92 R283A pol β revertants was sequenced, yielding the total number of base substitution mutants shown (n) in each column. Multiple base substitution mutations in a revertant were treated as independent events in the calculation.
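The error rate calculation described in footnote a of Table I can be written out as a short function. The sketch follows only the wording of that footnote (observed errors of one class, divided by the number of mutants sequenced, multiplied by the total mutant fraction); any further corrections the original analysis may have applied are not represented, and the numbers in the usage example are hypothetical.

```python
def error_rate(errors_of_class, mutants_sequenced, mutant_fraction):
    """Error rate for one substitution class, following Table I footnote a.

    errors_of_class   -- revertants carrying this particular substitution
    mutants_sequenced -- total number of revertants sequenced (n)
    mutant_fraction   -- overall reversion frequency (MF) for the enzyme
    """
    if mutants_sequenced <= 0:
        raise ValueError("mutants_sequenced must be positive")
    return (errors_of_class / mutants_sequenced) * mutant_fraction


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only (not values from Table I).
    rate = error_rate(errors_of_class=10, mutants_sequenced=50, mutant_fraction=1e-2)
    print(f"error rate = {rate:.1e}")
```

Because each class is scaled by the same mutant fraction, the per-class rates sum to the overall mutant fraction when every sequenced revertant is assigned to exactly one class.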