Basis of Miscoding of the DNA Adduct N2,3-Ethenoguanine by Human Y-family DNA Polymerases

Background: The miscoding of N2,3-etheno(ϵ)guanine(G) is of interest regarding cancer.
Results: N2,3-ϵG:T mispairing was found with Y-family human DNA polymerases, and crystal structures of polymerase ι revealed Hoogsteen base pairing.
Conclusion: Structural similarity for N2,3-ϵG:C and N2,3-ϵG:T underlies similar catalytic efficiencies for polymerase ι.
Significance: The structural basis of N2,3-ϵG miscoding is revealed.

N2,3-Ethenoguanine (N2,3-ϵG) is one of the exocyclic DNA adducts produced by endogenous processes (e.g. lipid peroxidation) and exposure to bioactivated vinyl monomers such as vinyl chloride, which is a known human carcinogen. Existing studies exploring the miscoding potential of this lesion are quite indirect because of the lability of the glycosidic bond. We utilized a 2′-fluoro isostere approach to stabilize this lesion and synthesized oligonucleotides containing 2′-fluoro-N2,3-ϵ-2′-deoxyarabinoguanosine to investigate the miscoding potential of N2,3-ϵG by Y-family human DNA polymerases (pols). In primer extension assays, pol η and pol κ replicated through N2,3-ϵG, whereas pol ι and REV1 yielded only 1-base incorporation. Steady-state kinetics revealed that dCTP incorporation is preferred opposite N2,3-ϵG with relative efficiencies in the order of pol κ > REV1 > pol η ≈ pol ι, and dTTP misincorporation is the major miscoding event by all four Y-family human DNA pols. Pol ι had the highest dTTP misincorporation frequency (0.71) followed by pol η (0.63). REV1 misincorporated dTTP and dGTP with much lower frequencies. Crystal structures of pol ι with N2,3-ϵG paired to dCTP and dTTP revealed Hoogsteen-like base pairing mechanisms. Two hydrogen bonds were observed in the N2,3-ϵG:dCTP base pair, whereas only one appears to be present in the case of the N2,3-ϵG:dTTP pair. Base pairing mechanisms derived from the crystal structures explain the slightly favored dCTP insertion for pol ι in steady-state kinetic analysis. Taken together, these results provide a basis for the mutagenic potential of N2,3-ϵG.

The integrity of DNA is continually challenged by environmental factors (e.g. UV irradiation and radiation), exogenous and endogenous chemicals, and suboptimal repair processes (1). DNA damage produces modified DNA bases (i.e. DNA lesions or DNA adducts), abasic sites, DNA inter- and intrastrand cross-links, and DNA-protein cross-links that, if not properly repaired, can lead to genomic instability and ultimately disease (e.g. cancer).
DNA polymerases (pols) are crucial in maintaining genome integrity. Fifteen human DNA pols, varying in their functions in replication, repair, and tolerance of DNA damage, are known (2). The Y-family DNA polymerases (pol η, pol ι, pol κ, and REV1) are specialized in translesion synthesis (3,4). For example, pol η is known for its unique role in correctly bypassing UV irradiation-induced cyclobutane pyrimidine dimers (5,6). Pol ι, on the other hand, is unable to copy past cyclobutane pyrimidine dimers but can proficiently insert T or C opposite adducted purines that are impaired in their capability of forming Watson-Crick base pairs (7-9). Pol κ has a specialized role in bypassing bulky N2-G adducts (10) and interstrand cross-links (11) and is distinct in its moderate processivity, extending beyond the lesion, possibly due to the use of its N-clasp domain. REV1 is highly selective for inserting C opposite normal (12) and adducted template G (10,13). Crystal structures of Y-family pols provide insight into their diverse functions in bypassing normal and adducted templates (14). Pol ι adopts an induced fit mechanism by flipping template purines into the syn conformation, forming Hoogsteen base pairs (7,8,15,16). REV1 features pairing between dCTP and template G but uses its G-loop to hydrogen bond with the template G and an Arg in another segment (N-digit) to ensure the incorporation of dCTP (12). A high degree of functional and structural differences underlies the diverse but specialized roles in lesion bypass by Y-family human DNA polymerases (17).
The mutagenic potentials of etheno adducts have been established in in vitro bypass assays (26-30) and site-specific mutagenesis in bacteria (31,32), Chinese hamster ovary cells (33), and simian kidney cells (34). N2,3-ϵG has been less well studied in terms of its replication and repair mechanisms because of the lability of its glycosidic bond. In a polyribo(G/N2,3-ϵG) template, both C and T were incorporated opposite N2,3-ϵG by avian myeloblastosis virus reverse transcriptase (35). N2,3-ϵ-Deoxyguanosine triphosphate was reported to be inserted opposite T by Escherichia coli DNA polymerase I (Klenow fragment), Drosophila melanogaster polymerase α-primase complex, and human immunodeficiency virus-I reverse transcriptase (30). An indirect assay in E. coli showed an estimated mutation frequency of 13% for N2,3-ϵG, resulting in G to A transitions (32). The long half-life of N2,3-ϵG in rat liver and lung (150 days) and in rat kidney (75 days) in vinyl chloride-exposed rats suggests inefficient repair of this lesion. In human glycosylase assays (in vitro), N2,3-ϵG was released at a much slower rate compared with 1,N6-ϵA and 3,N4-ϵC (36). The mutagenicity and persistence of N2,3-ϵG suggest a high miscoding potential in vivo. N2,3-ϵG is generally considered to contribute to the carcinogenesis of vinyl chloride and inflammation-driven malignancies (37). The dominance of GC to AT transitions in five of six K-ras (oncogene) tumors found in vinyl chloride workers (25) suggests the importance of G adducts, and the miscoding pattern of 1,N2-ϵG is not consistent with this transition (26-35).
To further understand the miscoding potential of N2,3-ϵG by Y-family human DNA polymerases, which are highly relevant to translesion synthesis, we carried out a series of primer extension and steady-state kinetic analyses using human pol η, human pol ι, and human REV1 with a template containing 2′-F-N2,3-ϵdG. The extension products formed by pol ι were identified using LC-MS/MS. A consistent mispairing pattern was observed (2′-F-N2,3-ϵdG:T), and base pairing mechanisms were revealed in two pol ι crystal structures with either dCTP or dTTP paired with 2′-F-N2,3-ϵdG but with individual differences.
Preparation of Recombinant Catalytic Core of Human REV1-The gene fragment covering the catalytic core (residues 330-833) (12) of wild-type human REV1 was obtained by PCR amplification from the vector pET-22b(+)/hREV1 (13) as template using Pfu DNA polymerase (Stratagene, La Jolla, CA) with a pair of primers (5′-GGATCCATGTCTACGTTTAGCAAGGCAG-3′ and 5′-GCGGCCGCTTATGTGGAAGGGTTCAGATTAG-3′). The resulting PCR product of the 1.5-kb hREV1 fragment was cloned into the vector pSC-B-Amp/Kan (Stratagene). Following sequence confirmation, the hREV1 gene fragment was cloned into the BamHI and NotI sites of the vector pBG101 (obtained from the Center for Structural Biology, Vanderbilt University) to generate the cleavable glutathione S-transferase (GST)-tagged protein. The GST-tagged hREV1(330-833) was expressed in E. coli BL21 (DE3) cells, which were grown at 37°C and 220 rpm to an OD600 of 0.6 and then induced with isopropyl β-D-1-thiogalactopyranoside (0.2 mM) for 12 h at 16°C. The harvested pellets were resuspended in lysis buffer containing 50 mM Tris-HCl (pH 7.4), 500 mM NaCl, 10% (v/v) glycerol, 5 mM β-mercaptoethanol, 1 mg/ml lysozyme, and protease inhibitor mixture (Roche Applied Science). Suspensions were sonicated, and the cell lysate was clarified by centrifugation at 4 × 10^4 × g for 60 min at 4°C. The resulting supernatants were loaded onto a 1-ml GSTrap 4B column (GE Healthcare), and the column was washed with 20 ml of Buffer A (50 mM Tris-HCl (pH 7.4) containing 150 mM NaCl, 10% (v/v) glycerol, and 5 mM β-mercaptoethanol). The GST-tagged REV1(330-833) bound on the column was cleaved with PreScission protease (GE Healthcare) for 14 h at 4°C. Cleaved REV1(330-833) was eluted with Buffer A, and the purity was analyzed by SDS-polyacrylamide gel electrophoresis with Coomassie Brilliant Blue R-250 staining. A typical yield was ~760 μg from 1 liter of culture.
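The cloning design described above can be sanity-checked with a short script. This is only an illustrative sketch: it assumes the primer sequences read as printed once the line-break hyphens are removed, and the helper names are hypothetical. It checks that the forward primer starts with a BamHI site (GGATCC) followed by a start codon and that the reverse primer starts with a NotI site (GCGGCCGC) followed by the reverse complement of a stop codon.

```python
# Recognition sequences of the two restriction enzymes named in the text.
BAMHI_SITE = "GGATCC"
NOTI_SITE = "GCGGCCGC"

# Primers (5'->3') as printed, with line-break hyphens removed (assumption).
FORWARD_PRIMER = "GGATCCATGTCTACGTTTAGCAAGGCAG"
REVERSE_PRIMER = "GCGGCCGCTTATGTGGAAGGGTTCAGATTAG"

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def starts_with_site(primer: str, site: str) -> bool:
    """True if the primer begins with the given recognition sequence."""
    return primer.upper().startswith(site)


if __name__ == "__main__":
    # Forward primer: BamHI site, then the ATG start codon.
    print(starts_with_site(FORWARD_PRIMER, BAMHI_SITE))
    print(FORWARD_PRIMER[len(BAMHI_SITE):len(BAMHI_SITE) + 3])
    # Reverse primer: NotI site, then the antisense form of a stop codon.
    print(starts_with_site(REVERSE_PRIMER, NOTI_SITE))
    print(reverse_complement(REVERSE_PRIMER[len(NOTI_SITE):len(NOTI_SITE) + 3]))
```

The stated insert size is also self-consistent: residues 330-833 span 504 codons, or about 1.5 kb.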

LC-MS Analysis of Full-length Extension Products-An 18-mer primer (5′-GGGGGCTCGTAAGGAT(dU)C-3′) was annealed to the same 23-mer oligomer as described above at a 1:1 molar ratio. Reaction conditions were similar to those used in steady-state kinetic assays except that the final concentrations were as follows: 10 μM pol ι, 12.5 μM primer·template complex, and 2% (v/v) glycerol in a total volume of 80 μl. Reactions were carried out in the presence of four dNTPs (10 mM each) for 3.5 h at 37°C. The reactions were terminated by spin column separations to extract dNTPs and Mg2+, and the resulting product was treated with 50 units of uracil-DNA glycosylase and then 0.25 M hot piperidine (40). LC-MS/MS analyses were performed using an ACQUITY ultraperformance liquid chromatography (UPLC) system (Waters Corp.) connected to a Finnigan LTQ mass spectrometer (Thermo Scientific Corp., San Jose, CA) operating in the electrospray ionization negative ion mode and using an ACQUITY UPLC system BEH octadecylsilane (C18) column (1.7 μm; 1.0 × 100 mm). UPLC conditions were as described (38).
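As an illustration of the primer·template design, the sketch below checks how far the 18-mer primer anneals to a 23-mer template and how many bases remain to be copied. It assumes the 23-mer is the template sequence given in the Table 1 legend (3′-CCCCCGAGCATTCCTAAGXTACT-5′, where X is the lesion or G) and treats dU as pairing with A; the function names are hypothetical.

```python
# Primer written 5'->3' (U stands for the dU residue).
PRIMER_5_TO_3 = "GGGGGCTCGTAAGGATUC"
# Template written 3'->5' so that index i pairs with primer index i.
# Assumed to be the 23-mer from the Table 1 legend; X marks the lesion site.
TEMPLATE_3_TO_5 = "CCCCCGAGCATTCCTAAGXTACT"

# Watson-Crick partners; dU pairs like T.
PARTNER = {"A": "T", "T": "A", "G": "C", "C": "G", "U": "A"}


def annealed_length(primer: str, template_3_to_5: str) -> int:
    """Count consecutive complementary positions from the primer 5' end."""
    count = 0
    for primer_base, template_base in zip(primer, template_3_to_5):
        if PARTNER.get(primer_base) != template_base:
            break
        count += 1
    return count


def bases_to_full_length(primer: str, template_3_to_5: str) -> int:
    """Number of nucleotides that must be added to reach the template end."""
    return len(template_3_to_5) - len(primer)


if __name__ == "__main__":
    paired = annealed_length(PRIMER_5_TO_3, TEMPLATE_3_TO_5)
    print("paired positions:", paired)
    print("first templating base:", TEMPLATE_3_TO_5[paired])
    print("bases to full length:",
          bases_to_full_length(PRIMER_5_TO_3, TEMPLATE_3_TO_5))
```

Under these assumptions the whole primer is complementary, the first templating position is X, and five additions give a full-length product.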
Structure Determination and Refinement-X-ray diffraction data were collected at the Advanced Photon Source (Argonne National Laboratory, Argonne, IL) on the 21-ID-F and 21-ID-G (Life Sciences Collaborative Access Team) beam lines. All data sets were recorded from cryoprotected crystals using a single wavelength at 100 K. Data were indexed and scaled with the program HKL2000 (41). Both crystal types belonged to space group P6₅22. X-ray diffraction data collection and processing statistics are listed in Table 3 (see below). Phases were calculated using MOLREP as a part of the CCP4 program suite (42,43) based on a previously refined model (Protein Data Bank code 3OSN) (16). Refinements were performed using Refmac 6.0 with restrained and rigid body refinement (44,45). Repeated cycles of manual rebuilding were performed in Coot (46). Structural images were generated in PyMOL (47).

RESULTS
Primer Extensions by Human Y-family Pols-Primer extension experiments were performed using a template containing stabilized 2′-F-N2,3-ϵdG and compared with those using templates containing 2′-F-dG and unmodified dG (Fig. 2). Among the four pols, pol η appeared to be most efficient in terms of producing full-length products (5 bases extended). Although similar to pol η in efficiency, pol κ only produced products with 4 or fewer bases extended in a 20-min reaction. REV1 and pol ι extended the primer by 1 base within the same time frame. Quantitation of the extension products showed that the pol η-extended products constituted 93, 96, and 83% of the substrate for dG, 2′-F-dG, and 2′-F-N2,3-ϵdG templates, respectively, upon 20-min incubation. The corresponding respective values were 73, 76, and 50% for REV1; 19, 10, and 10% for pol ι; and 94, 94, and 70% for pol κ. The resistance to extension observed here is consistent with the known low processivities of pol ι and REV1 (9,13).
Steady-state Kinetics of Nucleotide Incorporation-To determine the catalytic efficiency of incorporation of each dNTP catalyzed by different pols, steady-state kinetic analysis was performed. The catalytic efficiency (i.e. kcat/Km) and misinsertion frequency (i.e. f = (kcat/Km(dNTP))incorrect/(kcat/Km(dCTP))) are two useful parameters for comparing different polymerase reactions (Table 1). For the template containing 2′-F-N2,3-ϵdG, all three pols preferred to insert dCTP. Pol η and pol ι had relatively high misincorporation frequencies for T because their catalytic efficiencies for T insertion were only marginally lower than those for C insertion. For REV1, the order of preference for dNTP misinsertion opposite the lesion was G > T > A (based on kcat/Km) but with much lower misincorporation frequencies compared with the three other pols (Table 1). When comparing the kcat/Km values for dCTP insertion opposite the lesion with those obtained for the dG template, pol η and pol ι showed ~40-fold attenuation followed by about 10-fold attenuation for REV1, suggesting that the presence of the lesion did not significantly affect the deoxycytidyltransferase activity of REV1. The catalytic efficiencies for different dNTPs were similar in magnitude for the 2′-F-dG and dG templates, indicating that 2′-F modification did not dramatically perturb pol recognition. Among all four pols examined (including pol κ (38)), pol κ exhibited the highest relative efficiency (0.24) of nucleotide (dCTP) incorporation opposite 2′-F-N2,3-ϵdG (compared with dCTP incorporation opposite unmodified dG) followed by REV1 (0.11), pol η (0.027), and pol ι (0.026) (Table 1). Pol η showed the highest absolute value of catalytic efficiencies for all three types of template, consistent with results seen in product extension experiments (see above).
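The two kinetic parameters defined above are simple ratios, and a minimal sketch makes the arithmetic explicit. The relative efficiencies (0.24, 0.11, 0.027, 0.026) are the values quoted in the text, with polymerase names assigned following the order given in the abstract; the kcat and Km numbers in the misinsertion example are hypothetical placeholders, not values from Table 1.

```python
def catalytic_efficiency(kcat: float, km: float) -> float:
    """Catalytic efficiency kcat/Km."""
    return kcat / km


def misinsertion_frequency(eff_incorrect: float, eff_dctp: float) -> float:
    """f = (kcat/Km) for the incorrect dNTP divided by (kcat/Km) for dCTP."""
    return eff_incorrect / eff_dctp


def fold_attenuation(relative_efficiency: float) -> float:
    """Fold decrease implied by a lesion/dG relative efficiency."""
    return 1.0 / relative_efficiency


# Relative efficiencies of dCTP insertion (lesion template vs. dG template),
# as quoted in the text.
RELATIVE_EFFICIENCY = {
    "pol kappa": 0.24,
    "REV1": 0.11,
    "pol eta": 0.027,
    "pol iota": 0.026,
}

if __name__ == "__main__":
    for name, rel in RELATIVE_EFFICIENCY.items():
        print(f"{name}: {fold_attenuation(rel):.1f}-fold attenuation")

    # Hypothetical kcat (min^-1) and Km (uM) values, for illustration only.
    eff_c = catalytic_efficiency(kcat=1.0, km=2.0)
    eff_t = catalytic_efficiency(kcat=0.71, km=2.0)
    print(f"f(T) = {misinsertion_frequency(eff_t, eff_c):.2f}")
```

The reciprocals of 0.027 and 0.026 are about 37 and 38, which is where the "~40-fold" figure comes from, and the reciprocal of 0.11 is about 9, matching "about 10-fold" for REV1.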
Analysis of Pol ι Primer Extension Products by LC-MS/MS-Because the highest misinsertion frequency was observed for pol ι, we examined the extension products of pol ι reactions using LC-MS/MS. Previous procedures were followed using a uracil-containing primer (Fig. 3A), and the product was cleaved using uracil-DNA glycosylase to simplify the sequencing results obtained with collision-induced dissociation fragmentation. The most abundant species (−2 or −3 charge) were chosen for collision-induced dissociation analysis, and the identity of the product was established by matching the fragmentation pattern to the theoretical pattern obtained from a program linked to the mass spectrometry group at the University of Utah (48). By using a longer incubation time (3.5 h) and higher enzyme and substrate concentrations compared with those in the primer extension gel analyses, a greater portion of primer was extended by pol ι, and the products could be identified. As shown in Fig. 3, B-D, three products were identified with C, T, and A incorporated opposite 2′-F-N2,3-ϵdG. Two additional products containing C and T opposite the lesion (Fig. 3, E and F, respectively) were identified as having an extra A at the end. The confirmed fragment ions are illustrated in the spectra with fragmentation patterns in the insets.
OCTOBER 12, 2012 • VOLUME 287 • NUMBER 42

The relative yields of various products were calculated based on the peak areas of extracted LC-MS chromatograms (data not shown). The sum of the peak areas was used for any product that existed in more than one charge state. Consistent with the low efficiency of pol ι seen in the primer extension gel analysis, the total amount of extended products (based on the total peak areas; data not shown) accounted for only 6% of the total products formed by Dpo4 under the same conditions (38). The major extension products were those containing T and C opposite 2′-F-N2,3-ϵdG, and the other three were minor products (Table 2). Pol ι produced similar yields of extension products with C (41%) and T (52%) incorporated opposite the lesion. In addition, pol ι readily extended the 2′-F-dG and dG templates in an error-free manner. The base insertion pattern obtained from LC-MS/MS analysis agrees with the steady-state kinetic analysis (Table 1) with T insertions being the major misincorporation events for pol ι-catalyzed bypass.
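The yield calculation described above (sum the peak areas over charge states for each product, then normalize) can be sketched as follows. The peak areas are hypothetical and were chosen only so that the output reproduces the 41% (C) and 52% (T) yields quoted in the text; the underlying chromatogram data are not shown in the paper.

```python
from typing import Dict


def relative_yields(peak_areas: Dict[str, Dict[int, float]]) -> Dict[str, float]:
    """Percent yield per product from extracted-ion peak areas.

    peak_areas maps product name -> {charge state: peak area}. Areas for a
    product detected in several charge states are summed before normalizing.
    """
    totals = {name: sum(by_charge.values())
              for name, by_charge in peak_areas.items()}
    grand_total = sum(totals.values())
    if grand_total <= 0:
        raise ValueError("total peak area must be positive")
    return {name: 100.0 * area / grand_total for name, area in totals.items()}


# Hypothetical peak areas (arbitrary units), keyed by charge state.
EXAMPLE_AREAS = {
    "C opposite lesion": {-2: 30.0, -3: 11.0},
    "T opposite lesion": {-2: 40.0, -3: 12.0},
    "minor products": {-2: 7.0},
}

if __name__ == "__main__":
    for product, percent in relative_yields(EXAMPLE_AREAS).items():
        print(f"{product}: {percent:.0f}%")
```

Peak-area ratios of this kind are only semi-quantitative because ionization efficiency can differ between oligonucleotide products.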
Crystal Structures of Pol ι with Oligonucleotides Containing 2′-F-N2,3-ϵdG and dCTP or dTTP-To understand the base pairing mechanisms for dCTP and dTTP observed above, we conducted co-crystallization experiments with pol ι, a template containing 2′-F-N2,3-ϵdG, and dCTP or dTTP. Two types of crystals were obtained, i.e. pol ι-1 (pol ι·2′-F-N2,3-ϵdG:dCTP; Protein Data Bank code 4FS2) and pol ι-2 (pol ι·2′-F-N2,3-ϵdG:dTTP; Protein Data Bank code 4FS1). Structures of these two ternary complexes were determined by molecular replacement using a previously refined model (Protein Data Bank code 3OSN) (16) without the lesion and the incoming nucleotide (Table 3). Clear electron densities around the 2′-F-N2,3-ϵdG and incoming nucleotide facilitated the unbiased determination of the base pairing conformations at the active site. Similar to several structures seen previously for template native purines (7,15) and adducted purines (8,9,16), the electron density around 2′-F-N2,3-ϵdG indicated that the lesion was rotated from the anti to the syn conformation (Fig. 4, A and C). This substrate-induced conformational change of template purine is thought to be dictated by the rigid active site of pol ι (7).
In both the pol ι-1 (Fig. 4A) and pol ι-2 (Fig. 4C) structures, the incoming nucleotide served as a donor in hydrogen bonds with the Hoogsteen edge of 2′-F-N2,3-ϵdG (i.e. the O6 and N7 atoms). When 2′-F-N2,3-ϵdG was paired with dCTP (Fig. 4B), one hydrogen bond was observed between the N4 atom of dCTP and O6 atom of 2′-F-N2,3-ϵdG as indicated by a 2.5-Å distance. The possibility of a second hydrogen bond cannot be ruled out based on the distance (3.0 Å) between the N3 atom of dCTP and the N7 atom of 2′-F-N2,3-ϵdG provided that the N3 atom of dCTP is protonated. This mechanism was proposed in previous studies for dCTP paired with G (15) or adducted G (16).
In the case of 2′-F-N2,3-ϵdG-paired dTTP (Fig. 4D), it is likely that only one hydrogen bond exists between the O6 atom of 2′-F-N2,3-ϵdG and the N3 atom of dTTP with a distance of 2.8 Å. Although a distance of 3.2 Å between N7(2′-F-N2,3-ϵdG) and N3(dTTP) can also be interpreted as a potential hydrogen bonding distance, the asymmetry between the two distances (2.8 versus 3.2 Å) makes it unlikely that a bifurcated hydrogen bonding structure is present. The slightly longer distance (2.8 Å) in the case of 2′-F-N2,3-ϵdG:T hydrogen bond may be an indication of its weaker strength compared with the 2.5-Å distance seen in the 2′-F-N2,3-ϵdG:C pair, although the resolution limit of 2.5 Å does not permit a firm conclusion in this respect. Together with the possibility of two hydrogen bonds in the 2′-F-N2,3-ϵdG:C pair, the base pair modes observed here are consistent with the slightly favorable insertion of C observed in the steady-state kinetic analysis.

TABLE 1 Steady-state kinetic analysis of polymerase-catalyzed single base insertion
The insertion was opposite X in the template sequence of 3′-CCCCCGAGCATTCCTAAGXTACT-5′ where X is 2′-F-N2,3-ϵdG, 2′-F-dG, or dG.
Irrespective of the incoming nucleotide, the pol ι-1 and pol ι-2 structures are quite similar with a root mean square deviation value of 0.27 Å for all atom pairs upon superimposition. The superimposition of the incoming nucleotides suggested a movement of 2′-F-N2,3-ϵdG toward the minor groove for the 2′-F-N2,3-ϵdG:T base pair (Fig. 5A). The pol ι-1 structure superimposes with the native G:C complex (Protein Data Bank code 2ALZ; Ref. 15) with a root mean square deviation value of 0.27 Å (Fig. 5B), indicating that the presence of the lesion (2′-F-N2,3-ϵdG) did not significantly affect the conformations of the protein and the nucleic acid. The conformation of the 2′-F-N2,3-ϵdG:C pair also resembles that of an N2-ethylguanine:C pair crystallized with pol ι (9) (Fig. 5C). When the conformation of the 2′-F-N2,3-ϵdG:T mispair is compared with an O6-methylguanine:T base pair (16) (because of the lack of a structure with the pol ι G:T pair in the Protein Data Bank), the conformations of both base pairs are similar except that the lesion is slightly moved toward the minor groove in the case of the 2′-F-N2,3-ϵdG:T pair (Fig. 5D, green). This shift could be due to the bulkier size of 2′-F-N2,3-ϵdG compared with O6-methylguanine. Overall, pol ι appears to be able to accommodate the 2′-F-N2,3-ϵdG pair rather well at the active site without significant protein and nucleic acid conformational changes. The structures showed that both C and T pair with the lesion in a similar fashion in line with the observation that pol ι promoted both error-free and error-prone bypass in steady-state kinetic and LC-MS/MS analyses.
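For readers unfamiliar with the root mean square deviation quoted above, the following minimal sketch computes it for two sets of paired atomic coordinates. It assumes the structures have already been superimposed (the optimal-superposition step performed by structure-analysis software is not shown), and the coordinates are toy values, not atoms from 4FS1, 4FS2, or 2ALZ.

```python
import math
from typing import Sequence, Tuple

Coordinate = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Coordinate], coords_b: Sequence[Coordinate]) -> float:
    """Root mean square deviation (in the input units) over paired atoms.

    Assumes coords_a[i] and coords_b[i] are equivalent atoms in two
    structures that are already superimposed.
    """
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


if __name__ == "__main__":
    # Toy example: every atom displaced by 0.27 A along z.
    structure_1 = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
    structure_2 = [(x, y, z + 0.27) for x, y, z in structure_1]
    print(f"RMSD = {rmsd(structure_1, structure_2):.2f} A")
```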

DISCUSSION
The DNA adduct N2,3-ϵG is a ubiquitous modification produced from endogenous processes (e.g. lipid peroxidation) or exposure to environmental pollutants (e.g. vinyl chloride or urethane). We recently developed an isostere approach to incorporate the stabilized analog (2′-F-N2,3-ϵdG) into oligonucleotides and investigated the miscoding potential of N2,3-ϵG using several prokaryotic and eukaryotic DNA pols (38). In the present work, we extended our previous investigation into the other three human Y-family DNA pols and provided the structural basis of the most error-prone bypass enzyme, pol ι.

TABLE 2 Summary of pol ι extension products from LC-MS/MS analysis
The results were obtained using template·primer complexes containing 2′-F-N2,3-ϵdG, 2′-F-dG, or dG. Underlined nucleotides indicate the base incorporated opposite the lesion.
Primer extension gel analysis generated a qualitative comparison of the capability of bypassing 2′-F-N2,3-ϵdG by Y-family pols (Fig. 2). The order of bypassing efficiency (from the percentage of total product extended) is pol η > pol κ > REV1 ≈ pol ι. Compared with pol ι and pol κ, the higher activity of pol η copying past N2,3-ϵG observed here is similar to that seen previously for other etheno adducts, i.e. 1,N2-ϵG (29), 1,N6-ϵA (26,49), and 3,N4-ϵC (50). With regard to DNA polymerases, the extension pattern is particularly similar to that of bypass of 1,N2-ϵG; i.e. pol η readily extended the primer into full-length products, whereas pol ι and pol κ showed some single base incorporation (29).
Steady-state kinetic analysis established the preferred base incorporated opposite the lesion and provided a kinetic rationale for primer extension experiments (Table 1). For all four human Y-family DNA pols, the correct base C is marginally preferred opposite 2′-F-N2,3-ϵdG with similar relative efficiencies in comparison with the insertion of C opposite a regular G (Table 1). The misinsertion of T is consistent for all four human Y-family DNA pols as well as for several other prokaryotic and eukaryotic DNA polymerases (38). The highest absolute value of catalytic efficiency (kcat/Km) seen (for pol η) is in line with primer extension results, which may be partly explained by the more open active site of pol η compared with other polymerases (51). The pattern of fidelity for pol η bypassing different etheno lesions is similar: both error-free and error-prone syntheses have been observed. Pol η inserted a C opposite N2,3-ϵG in a marginally error-free manner with a misinsertion frequency of 0.63 for T (Table 1). Similarly, pol η copied past 1,N6-ϵA in the order of preference T > C > A > G (49). The order was G > A > C for 1,N2-ϵG (29) and A ≈ G > C ≈ T for 3,N4-ϵC (50). Pol ι has the highest misincorporation frequency (although C is preferred 1.5-fold compared with T), which is consistent with the view that pol ι generally catalyzes error-prone bypass (3). The incorporation patterns seen for pol ι bypassing other etheno DNA adducts are as follows: pol ι somewhat prefers to incorporate C opposite 1,N6-ϵA (8) and inserts both C and T opposite 1,N2-ϵG with almost the same catalytic efficiencies (29). The fact that REV1 prefers to catalyze dCTP insertion is not surprising in that REV1 utilizes its G-loop to hydrogen bond with template G and an Arg in another segment (N-digit) to ensure the incorporation of dCTP (12).
When comparisons are made with the catalytic efficiency of dCTP insertions opposite native G in the template, the order of relative efficiency is pol κ (0.24) > REV1 (0.11) > pol η (0.027) ≈ pol ι (0.026) (Table 1), suggesting that 2′-F-N2,3-ϵdG affects the DNA syntheses of the four Y-family pols to a similar extent.
LC-MS/MS analysis of the primer extension products by pol ι provided further insight into the nature of the bases inserted beyond the lesion in these error-prone reactions. With pol ι, approximately half of the products contained T with a high fidelity extension beyond the lesion (Fig. 3 and Table 2). The observation of almost equal amounts of products containing C and T opposite 2′-F-N2,3-ϵdG (with LC-MS/MS analysis) is in line with results from kinetic analysis (Table 1). The much lower amount of total extended products (6%) compared with Dpo4 (38) agrees with the low bypass efficiency of pol ι seen in the primer extension gel analysis (Fig. 2). These extension products are similar to the products generated by Dpo4 (38); however, the pattern of miscoding is considerably different from that generated by 1,N2-ϵG, which yields mainly products with G inserted by human pol η (29) and −1 deletion products by Dpo4 (28).

TABLE 3 Crystal data collection and refinement statistics for the ternary complexes pol ι-1 (pol ι·2′-F-N2,3-ϵdG:dCTP, Protein Data Bank code 4FS2) and pol ι-2 (pol ι·2′-F-N2,3-ϵdG:dTTP, Protein Data Bank code 4FS1)
Data collection: beamlines 21-ID-F and 21-ID-G; space group P6₅22.
The hydrogen bonding patterns of 2′-F-N2,3-ϵdG:C and 2′-F-N2,3-ϵdG:T base pairs seen in the crystal structures provided molecular explanations for the error-free and error-prone bypass of pol ι. The distance of 2.5 Å is a clear indication that a hydrogen bond is established between the O6 atom of 2′-F-N2,3-ϵdG and the N4 atom of dCTP. The possibility of a second hydrogen bond also exists, i.e. between the N7 atom of 2′-F-N2,3-ϵdG and the N3 atom of dCTP, provided that the N3 atom of dCTP is protonated. The tendency for protonation of the N3 atom of dCTP has been discussed in several other pol ι·DNA structures with both native and adducted purines in the templates (8,15,16). Although the N3 atom of free cytosine has a pKa of ~4.5, the local molecular environment could elevate the pKa to 6.2-7.2 at a terminal position or >8.5 at an internal position in DNA triple helices (52,53). Nair et al. (8) suggested that an elevation of the pKa of dCTP could be due to the base-stacking and long range electrostatic interactions of the active site residues Asp-126 and Glu-127. The one hydrogen bond observed in the 2′-F-N2,3-ϵdG:T pair is an indication that the 2′-F-N2,3-ϵdG:T pair might be less stable compared with the 2′-F-N2,3-ϵdG:C pair. Our crystallization attempts are consistent with this view in that pol ι-1 type crystals (with dCTP) grew more easily and diffracted to higher resolution than the pol ι-2 crystal (with dTTP). Collectively, the difference in hydrogen bonding may explain a slightly higher catalytic efficiency for dCTP by pol ι.
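To give a sense of why the pKa shift matters for the proposed second hydrogen bond, the sketch below applies the Henderson-Hasselbalch relation to estimate the fraction of cytosine N3 that would be protonated. The pKa values are those quoted in the text; the pH of 7.0 is an assumption made for illustration and is not a condition reported in the paper.

```python
def fraction_protonated(pka: float, ph: float) -> float:
    """Fraction of a base present in its protonated (conjugate acid) form.

    From Henderson-Hasselbalch: [BH+]/([B] + [BH+]) = 1 / (1 + 10**(pH - pKa)).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


ASSUMED_PH = 7.0  # illustrative assumption, not taken from the paper

# pKa values for cytosine N3 quoted in the text.
PKA_CASES = [
    ("free cytosine", 4.5),
    ("terminal position, lower bound", 6.2),
    ("terminal position, upper bound", 7.2),
    ("internal position (lower limit)", 8.5),
]

if __name__ == "__main__":
    for label, pka in PKA_CASES:
        frac = fraction_protonated(pka, ASSUMED_PH)
        print(f"{label}: pKa {pka} -> {100 * frac:.1f}% protonated")
```

At the assumed pH, free cytosine would be protonated well below 1% of the time, whereas a pKa in the elevated range would make the protonated form a substantial or dominant species.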
The typical strategy by which pol ι uses a Hoogsteen base pairing mechanism to accommodate native and adducted purines was once again demonstrated in both the pol ι-1 and pol ι-2 structures albeit with different hydrogen bonding schemes. The similarity of the two structures consists of their use of the Hoogsteen edge of 2′-F-N2,3-ϵdG to hydrogen bond with the incoming nucleotide. The conformation of 2′-F-N2,3-ϵdG:C also resembles G:dCTP (15) and N2-ethylG:dCTP (9) at the pol ι active site. However, the base pair conformations seen here are quite different from what has been observed at the active site of Dpo4 (38). Specifically, the 2′-F-N2,3-ϵdG:C pair adopts a Watson-Crick-like conformation, and the 2′-F-N2,3-ϵdG:T structure contains a sheared base pair at the Dpo4 active site (Fig. 6, C and D, and Ref. 38). That 2′-F-N2,3-ϵdG was observed to be positioned in the anti conformation by Dpo4 is likely due to the relatively open active site compared with pol ι (17) (surface view shown in Fig. 6, A and B). Particularly, the residues adjacent to the template base are bulkier (Leu-62, Val-64, and Gln-59) in pol ι compared with Dpo4 (Ala-42, Ala-44, and Val-32) (54). In the Dpo4 structure, hydrophobic interactions are likely to exist between Val-32 and the imidazole ring of 2′-F-N2,3-ϵdG (anti). Conversely, residues (Leu-62, Val-64, and Gln-59) may force 2′-F-N2,3-ϵdG to rotate into the syn conformation, which would otherwise clash with these residues if the lesion were positioned in the anti conformation. Despite these conformational differences, similar extents of T misinsertion are observed in both cases.

FIGURE 5. A, shearing of 2′-F-N2,3-ϵdG toward the minor groove for the 2′-F-N2,3-ϵdG:dTTP pair (green) compared with the 2′-F-N2,3-ϵdG:dCTP pair (red). B, similarity between the 2′-F-N2,3-ϵdG:dCTP pair (red) and the native G:dCTP pair (blue; Protein Data Bank code 2ALZ). C, structural similarity between the 2′-F-N2,3-ϵdG:dCTP pair (red) and the N2-ethylguanine:dCTP pair (gray; Protein Data Bank code 3EPG) crystallized with pol ι. D, shearing of 2′-F-N2,3-ϵdG toward the minor groove for the 2′-F-N2,3-ϵdG:dTTP pair (green) compared with the O6-methylguanine:dTTP pair (16).
As mentioned in the Introduction, Singer and co-workers (30,32,35) reported three studies on the miscoding of N2,3-ϵG more than 20 years ago. These studies were limited by the general methods available for studying miscoding at the time as well as the inherent lability of the glycosidic bond of N2,3-ϵdG. The uncorrected mutation frequency for N2,3-ϵdG inserted into an M13 phage system was only 0.5%, but in that study (32), an in vitro study with a polyribo(G/N2,3-ϵG) template and a reverse transcriptase (35), and a study on "reverse" incorporation of N2,3-ϵdG triphosphate (30), the general pattern was pairing of N2,3-ϵG with T and C. This pattern, despite any deficiencies in the earlier work, is similar to those seen in our own studies (Ref. 38 and the present work). The N2,3-ϵG:T wobble pairing proposed in that early work (35) had no experimental basis and has not been observed in our crystal structures with Dpo4 (38) or human pol ι (Fig. 4).
More recently, theoretical studies (55) have predicted that G should be the base most likely to pair with N2,3-ϵG followed by T > A > C, a prediction that is clearly inconsistent with the results obtained with all DNA polymerases thus far (Tables 1 and 2) (38). The pairing patterns predicted in the theoretical study (55) are also inconsistent with our N2,3-ϵG:C and N2,3-ϵG:T structures observed in the Dpo4 (38) and human pol ι (Fig. 4) crystals.
As mentioned in the Introduction, the goal of the 2′-fluoro substitution was to stabilize the glycosidic bond by destabilizing the transition state leading to an oxocarbenium-like intermediate in hydrolysis. The substitution was clearly successful in stabilizing the residue in oligonucleotides (38). Although miscoding by N2,3-ϵG (specifically, 2′-F-N2,3-ϵdG) was clearly demonstrated relative to both dG and 2′-F-dG (Table 1), it should be noted that the substitution of fluorine for hydrogen at the C2′ sugar position is not without effect; i.e. the substitution caused up to a 12-fold change in kcat/Km (primarily in the Km parameter) among four Y-family DNA polymerases: an ~8-fold decrease of kcat/Km with pol ι and a ~12-fold increase of kcat/Km with pol but no changes with pol or REV1. Therefore, 2′-fluoro substitution seems to slightly interfere with pol ι activity but to facilitate pol activity, which might be related to a possible stabilizing effect of 2′-fluorine to exert a (intra- and/or inter-residual) pseudo-hydrogen bonding interaction with purine H8 as shown previously with 2′-fluoroarabinonucleic acid (56-58). Such a conformational effect (preferentially to an anti conformation) by 2′-F at dG might affect catalysis differently with the various polymerases by interfering with the (syn-anti) Hoogsteen base pairing adopted by pol ι but facilitating the (anti-anti) Watson-Crick base pairing utilized by pol (albeit not with pol ). Nevertheless, these points regarding the influence of the fluorine do not affect our conclusions about the miscoding properties of N2,3-ϵG reported here.
In conclusion, we have utilized a recently developed stabilized analog, 2′-F-N2,3-ϵdG, to discern the mutation potential of a ubiquitous but unstable DNA lesion, N2,3-ϵdG. Kinetic and extension analyses allow qualitative and quantitative assessments of the miscoding pattern of this lesion for Y-family DNA polymerases, which are particularly relevant to translesion synthesis. Structural insights provided the molecular bases of error-free and error-prone synthesis by pol ι. The consistency of T misinsertion with all polymerases studied thus far underscores the miscoding potential of N2,3-ϵG. The miscoding for T suggests the relevance of N2,3-ϵG to vinyl chloride-induced angiosarcomas in which prevailing GC to AT transition mutations were found in the second base of codon 13 of the K-ras gene (59). Our study supports the hypothesis that N2,3-ϵG may contribute to the carcinogenesis of vinyl chloride and inflammation-driven malignancies (25, 37).