DNA Adduct Bypass Polymerization by Sulfolobus solfataricus DNA Polymerase Dpo4

1,N2-Etheno(ϵ)guanine is a mutagenic DNA lesion derived from lipid oxidation products and also from some chemical carcinogens. Gel electrophoretic analysis of the products of primer extension by Sulfolobus solfataricus P2 DNA polymerase IV (Dpo4) indicated preferential incorporation of A opposite 3′-(1,N2-ϵ-G)TACT-5′, among the four dNTPs tested individually. With the template 3′-(1,N2-ϵ-G)CACT-5′, both G and A were incorporated. When primer extension was done in the presence of a mixture of all four dNTPs, high pressure liquid chromatography-mass spectrometry analysis of the products indicated that (opposite 3′-(1,N2-ϵ-G)CACT-5′) the major product was 5′-GTGA-3′ and the minor product was 5′-AGTGA-3′. With the template 3′-(1,N2-ϵ-G)TACT-5′, the following four products were identified by high pressure liquid chromatography-mass spectrometry: 5′-AATGA-3′, 5′-ATTGA-3′, 5′-ATGA-3′, and 5′-TGA-3′. An x-ray crystal structure of Dpo4 was solved (2.1 Å) with a primer-template and A placed in the primer to be opposite the 1,N2-ϵ-G in the template 3′-(1,N2-ϵ-G)TACT 5′. The added A in the primer was paired across the template T with classic Watson-Crick geometry. Similar structures were observed in a ternary Dpo4-DNA-dATP complex and a ternary Dpo4-DNA-ddATP complex, with d(d)ATP opposite the template T. A similar structure was observed with a ddGTP adjacent to the primer and opposite the C next to 1,N2-ϵ-G in 3′-(1,N2-ϵ-G)CACT-5′. We concluded that Dpo4 uses several mechanisms, including A incorporation opposite 1,N2-ϵ-G and also a variation of dNTP-stabilized misalignment, to generate both base pair and frameshift mutations.

The accurate replication of DNA is important in the maintenance of the genetic integrity of an organism, and DNA polymerases have biochemical mechanisms to copy with very high fidelity (1,2). DNA is constantly subject to damage by physical and chemical agents, including compounds produced endogenously and those derived from the environment (3). More than 130 human proteins are involved in the repair of DNA damage (4). However, some damage escapes these surveillance systems, and DNA polymerases are forced to encounter multiple types of damaged DNA. The polymerases can be blocked, leading to cell cycle arrest or death, or can proceed to copy past the modified DNA. Copying past a DNA adduct raises the prospect of mutation. Mutation is a significant factor in the carcinogenicity of chemicals, and mutagens are generally regarded as cancer suspects (5).
1,N2-ϵ-G, originally reported by Leonard and co-workers (6), is a prototype for the so-called exocyclic DNA adducts, a series of lesions formed by various bis-electrophiles (7). Included in the list of compounds that generate etheno adducts are electrophilic products of industrial vinyl monomers (e.g. epoxides formed by oxidation of vinyl chloride, urethane, etc. (8,9)), some nitrosamines (10), mucochloric acid (11), and also products of the peroxidation of endogenous lipids (12-14). 1,N2-ϵ-G produces misincorporation and mutations in systems involving model DNA polymerases and bacterial and mammalian cells (15-17). 1,N2-ϵ-G is an interesting lesion in that the two added exocyclic carbons block the normal coding "face" involved in Watson-Crick pairing and preclude the usual hydrogen bonding mode (Scheme 1). Most interestingly, the frequency of mutations is low in bacterial and mammalian cells in which the 1,N2-ϵ-G lesion is placed (16,17). This low frequency of errors may be attributed to the presence of bacterial and mammalian glycosylases that act on this lesion (18).
Until recently, biochemical studies on the mechanisms of misincorporation have been largely focused on replicative and other available DNA polymerases (2). However, many mutagens were known to be most effective in bacteria when the SOS response system was activated (19). The characterization of DNA damage-inducible polymerases in bacteria led to a better understanding of the mutagenic mechanism (20), and now these so-called translesion DNA polymerases are considered to have major roles in the replication past DNA damage, both "error-free" and "error-prone," in eubacteria, eukaryotes, and archaebacteria (21-25). Some of these translesion polymerases have been studied in considerable detail in recent years, in regard to function, structure, and mechanism (26-29).
One archaebacterial translesion DNA polymerase that has been used as a model of eukaryotic pol and as a prototype is Sulfolobus solfataricus P2 DNA polymerase IV (Dpo4) (24,30). Several crystal structures have been reported, including some with damaged DNA (31-34), and pre-steady-state kinetic analysis has helped define a general catalytic mechanism (for undamaged DNA) (35,36). A number of questions still exist about the interactions of 1,N2-ϵ-G with DNA polymerases, in that different misincorporations and frameshifts have been observed with individual polymerases, and in some cases undefined mixtures of products are obtained with a single polymerase (15,16). In cells, complex biological effects are also observed (16,17). The availability of Dpo4 provided opportunities to study the incorporation of dNTPs in a simple system and to obtain structures of crystals of ternary complexes (of the polymerase, primer-template complex, and dNTP).
We report a series of steady-state kinetic and other primer-template studies with Dpo4 and 1,N2-ϵ-G in two template sequences. LC-MS/MS methods were used to define the products and provide insight into the multiple bypass mechanisms Dpo4 can use. Four crystal structures of Dpo4:1,N2-ϵ-G oligonucleotide complexes provide direct visualization of some of the intermediates, particularly the −1 frameshift mechanism. This latter intermediate has normal hydrogen bonding to the base 5′ of the 1,N2-ϵ-G but no distortion of the adducted oligonucleotide.
Isolation of S. solfataricus P2 DNA Polymerase IV (Dpo4) Gene and Construction of Dpo4 Expression Plasmid-The dpo4 gene was obtained by PCR amplification from the genomic DNA of S. solfataricus P2 (American Type Culture Collection, Manassas, VA; catalog number ATCC 35092) as template using Pfu ultra DNA polymerase (Stratagene, La Jolla, CA) with two primers, 5′-CATATGCATCACCATCACCATCACATGATTGTTCTTTTCGTTG-3′ and 5′-GGATCCTTAAGTATCGAAGAACTTG-3′. The resulting 1.1-kb Dpo4 gene PCR product was cloned into the vector pPCR-Script Amp (Stratagene, La Jolla, CA), and the nucleotide sequence was determined (Vanderbilt facility) to confirm the coding region. The 1.1-kb Dpo4 gene fragment was then cloned into the NdeI and XhoI sites of the vector pET-22b(+) (Novagen, Madison, WI), generating the vector pET22b(+)/DPO4-NHis.
Expression and Purification of Dpo4--Dpo4 fused to an N-terminal His6 tag was expressed in E. coli strain BL21(DE3). E. coli BL21(DE3) harboring the pET22b(+)/DPO4-NHis vector (6 liters) was grown in Luria-Bertani broth supplemented with ampicillin (100 μg ml−1) at 37°C, with aeration, to an A600 of 0.6. Isopropyl β-D-thiogalactopyranoside was added to 1 mM, and the incubation was continued for 3 h. The cells were harvested by centrifugation and resuspended in 60 ml of lysis buffer (50 mM Tris-HCl, pH 7.4, containing 300 mM NaCl, 10% glycerol (v/v), 5 mM β-mercaptoethanol, 1 mg of lysozyme ml−1, and protease inhibitor mixture) (Roche Applied Science), cooled on ice for 30 min, and then lysed by sonication (12 × 10 s duration with a Branson digital sonifier (VWR, West Chester, PA), microtip, 45% amplitude, with intervening cooling time). The cell lysate was clarified by centrifugation at 4 × 10⁴ × g for 45 min at 4°C. The resulting supernatant was heated at 80°C for 10 min and centrifuged to remove the denatured proteins; the soluble and heat-stable proteins were collected (including Dpo4).
The soluble supernatant was applied to a 5-ml HisTrap column (Amersham Biosciences) and washed sequentially with 50 ml of buffer A (50 mM Tris-HCl, pH 7.4, containing 300 mM NaCl, 10% glycerol (v/v), 5 mM β-mercaptoethanol) containing 20 mM imidazole, 50 ml of buffer A containing 40 mM imidazole, and 50 ml of buffer A containing 50 mM imidazole. Bound proteins were eluted with a linear gradient of 50-400 mM imidazole. Eluted proteins (detected at 280 nm) were dialyzed against buffer B (50 mM Tris-HCl, pH 7.5, containing 0.5 mM EDTA, 10% glycerol (v/v), and 5 mM β-mercaptoethanol) and loaded onto a 1-ml MonoS column (Amersham Biosciences) that had been equilibrated with buffer B. The column was washed with 20 ml of buffer B containing 100 mM NaCl to remove the contaminating proteins, and the protein was eluted with a 30-ml linear gradient of 100-500 mM NaCl in buffer C. The eluted fractions (280 nm absorbance) were analyzed by SDS-PAGE and silver staining (40). Dpo4 was eluted at 200 mM NaCl. By using this protocol, Dpo4 was purified to electrophoretic homogeneity, based on staining (see Supplemental Material). Protein concentration was determined using a calculated ϵ280 value of 22 mM−1 cm−1 for Dpo4 (41). The final yield of purified Dpo4 was ~10 mg.
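The concentration determination above follows from the Beer-Lambert law (A = ϵ · c · l). The short Python sketch below illustrates that arithmetic with the ϵ280 value quoted in the text; the absorbance reading and the 1-cm path length are hypothetical inputs chosen only for illustration, not measurements from this work.

```python
# Beer-Lambert law: A = epsilon * c * l, solved for concentration c.
# The extinction coefficient is the calculated value quoted in the text;
# the absorbance reading used below is a hypothetical example.

EPSILON_280_DPO4 = 22.0  # mM^-1 cm^-1


def protein_concentration_mM(a280, epsilon=EPSILON_280_DPO4, path_cm=1.0):
    """Return the protein concentration in mM from an A280 reading."""
    if epsilon <= 0 or path_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    return a280 / (epsilon * path_cm)


def to_micromolar(conc_mM):
    """Convert a concentration from mM to micromolar."""
    return conc_mM * 1000.0


if __name__ == "__main__":
    reading = 0.44  # hypothetical A280 of a diluted sample, 1-cm cuvette
    conc = protein_concentration_mM(reading)
    print(f"{to_micromolar(conc):.1f} micromolar")
```

Any dilution factor applied before the reading would have to be multiplied back in separately.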
Polymerization Assays and Gel Electrophoresis-A 32P-labeled primer, annealed to either an unmodified or adducted template, was extended in the presence of single dNTPs (Scheme 1). Each reaction was initiated by adding 2 μl of dNTP-Mg2+ solution (final concentrations of 250 μM of each dNTP and 5 mM MgCl2) to a preincubated E·DNA complex (final concentrations of 50 mM Tris-HCl, pH 7.8, 100 nM DNA duplex, 100 nM Dpo4 or pol T7−, 1 mM DTT, 50 μg of BSA ml−1, 50 mM NaCl, and 10% glycerol (v/v)) at 37°C, yielding a total reaction volume of 8 μl. When pol T7− was used, E. coli thioredoxin was added in a 20-fold molar excess of pol T7−. After 30 min, reactions were quenched with 50 μl of 20 mM EDTA (pH 9.0) in 95% formamide (v/v). Aliquots (3 μl) were separated by electrophoresis on a denaturing gel containing 8.0 M urea and 16% acrylamide (w/v) (from a 19:1 acrylamide:bisacrylamide solution, AccuGel, National Diagnostics, Atlanta, GA) with 80 mM Tris borate buffer, pH 7.8, containing 1 mM EDTA. The gel was exposed to a PhosphorImager screen (Imaging Screen K, Bio-Rad) overnight. The bands (representing extension of the primer) were visualized with a PhosphorImaging system (Bio-Rad, Molecular Imager® FX) using the manufacturer's Quantity One software, version 4.3.0.
Steady-state Kinetics-Unless indicated otherwise, all Dpo4 reactions were performed at 37°C in 50 mM Tris-HCl buffer, pH 7.8, containing 10% glycerol (v/v), 5 mM DTT, 50 mM NaCl, and 50 μg of BSA ml−1. For unmodified templates, the molar ratio of primer-template to enzyme was 40:1, and the reactions were done at eight dNTP concentrations (reaction time of 3 min). For modified templates, the molar ratio of primer-template to enzyme was usually 10:1, and reactions were done at eight dNTP concentrations (reaction time of 5 min).
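Rates measured at a series of dNTP concentrations are commonly reduced to kcat and Km with the Michaelis-Menten equation. The text does not state which fitting procedure was used, so the Python sketch below is only one possible treatment: it uses a Hanes-Woolf linearization ([S]/v against [S]) with ordinary least squares, and the kcat, Km, and concentration values are hypothetical, noise-free inputs chosen to show that the fit recovers them.

```python
# Michaelis-Menten: v = kcat * [S] / (Km + [S]), with v expressed per enzyme.
# Hanes-Woolf form: [S]/v = [S]/kcat + Km/kcat (a straight line in [S]).
# All numbers below are hypothetical; they are not data from this study.


def mm_rate(s, kcat, km):
    """Turnover rate (same time unit as kcat) at substrate concentration s."""
    return kcat * s / (km + s)


def hanes_woolf_fit(concs, rates):
    """Estimate (kcat, Km) by least squares on the Hanes-Woolf line."""
    xs = list(concs)
    ys = [s / v for s, v in zip(concs, rates)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                    # equals 1 / kcat
    intercept = mean_y - slope * mean_x  # equals Km / kcat
    kcat = 1.0 / slope
    km = intercept * kcat
    return kcat, km


if __name__ == "__main__":
    # Eight dNTP concentrations (micromolar), mirroring the eight-point design.
    concs = [5, 10, 25, 50, 100, 250, 500, 1000]
    rates = [mm_rate(s, kcat=2.0, km=50.0) for s in concs]
    kcat, km = hanes_woolf_fit(concs, rates)
    print(f"kcat = {kcat:.3f} per min, Km = {km:.1f} micromolar")
    print(f"kcat/Km = {kcat / km:.4f} per min per micromolar")
```

With real, noisy data a nonlinear fit to the hyperbola is usually preferred, since linearizations weight the points unevenly.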
LC-MS Analysis of Oligonucleotide Products from Dpo4 Reactions-Dpo4 reactions were performed at 37°C for 6 h in 50 mM Tris-HCl buffer, pH 7.8, containing 5% glycerol (v/v), 5 mM DTT, 50 mM NaCl, 5 mM MgCl2, and 100 μg of BSA ml−1. The reactions were done with 10 μM oligonucleotide substrate, 5 μM Dpo4, and four dNTPs at 1 mM each, in a final reaction volume of 100 μl. The reaction was terminated by extraction of excess dNTPs using a spin column (Bio-Spin 6 chromatography column, Bio-Rad). To the above filtrate (120 μl), concentrated Tris-HCl, DTT, and EDTA were added to restore the initial concentrations, and UDG solution was added (20 units). The reaction was incubated at 37°C for 6 h to hydrolyze the uracil residues on the primer. The final reaction mixture was heated at 95°C for 1 h in the presence of 0.25 M piperidine, followed by lyophilization. The dried residues were dissolved in 100 μl of H2O for the following MS analysis.
SCHEME 1. Oligonucleotides used for steady-state kinetic assays (Table II, Z = T; Table III, Z = C). A, insertion experiments. B, next-base extension experiment. C, second-base extension experiment.
MS was performed on a DecaXP ion trap instrument (ThermoFinnigan, San Jose, CA). Separation of oligonucleotides was carried out with a Jupiter microbore column (1.0 mm × 150 mm, 5 μm, Phenomenex, Torrance, CA). Buffer A contained 10 mM NH4CH3CO2, pH 6.8, and 2% CH3CN (v/v); buffer B contained 10 mM NH4CH3CO2, pH 6.8, and 95% CH3CN (v/v). The following gradient program was used with a flow rate of 1.0 ml min−1: 0-2 min, hold at 100% A; 2-20 min, linear program to 100% B; 20-30 min, hold at 100% B; 30-31 min, linear program to 100% A; 31-40 min, hold at 100% A (for next injection). The desired oligonucleotide products were eluted at ~7 min. A pre-column "Tee" set-up was applied, with only 10% of the total flow infused to the ion source. The fast flow rate helped reduce the equilibration time of the column, and thus retention time of the oligonucleotides was consistent between runs. Samples were infused using an autosampler, with 8 μl withdrawn from a 50-μl reaction. ES conditions were as follows: source voltage 3.4 kV, source current 8.5 μA, sheath gas flow rate setting 28.2, auxiliary sweep gas flow rate setting 4.3, capillary voltage 49 V, capillary temperature 230°C, tube lens voltage 67 V. MS/MS conditions were as follows: normalized collision energy 35%, activation Q 0.250, time 30 min, 1 scan. Product ion spectra were acquired over the range m/z 250-2000. The abundant ions from LC-MS spectra were selected for CID analysis, and the cut-off was set above ~15% of the most abundant ion. When more than one ion came from a single species, the peak corresponding to the doubly charged parent ion was chosen for fragmentation. The calculations of the CID fragmentations of a certain oligonucleotide sequence were done using a program linked to the Mass Spectrometry Group of Medicinal Chemistry at the University of Utah (medlib.med.utah.edu/masspec/).
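The gradient program described above is a piecewise-linear function of time. The Python sketch below encodes the breakpoints as listed in the text and interpolates the percentage of buffer B at any time point; the function names are illustrative, and the acetonitrile calculation simply assumes that mixing the 2% and 95% (v/v) buffers is linear in volume.

```python
# Piecewise-linear HPLC gradient, taken from the program listed in the text.
# Each breakpoint is (time in min, percent buffer B).
GRADIENT = [(0, 0), (2, 0), (20, 100), (30, 100), (31, 0), (40, 0)]

ACN_IN_A = 2.0   # % CH3CN (v/v) in buffer A
ACN_IN_B = 95.0  # % CH3CN (v/v) in buffer B


def percent_b(t_min):
    """Percent buffer B delivered at time t_min (0 to 40 min)."""
    if t_min < GRADIENT[0][0] or t_min > GRADIENT[-1][0]:
        raise ValueError("time is outside the gradient program")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise AssertionError("unreachable: breakpoints cover the whole range")


def percent_acetonitrile(t_min):
    """Approximate % CH3CN in the mobile phase, assuming linear mixing."""
    b = percent_b(t_min)
    return ACN_IN_A + (ACN_IN_B - ACN_IN_A) * b / 100.0


if __name__ == "__main__":
    for t in (0, 7, 11, 25, 35):
        print(t, round(percent_b(t), 1), round(percent_acetonitrile(t), 1))
```

This reports the programmed composition at the pump only; the composition actually reaching the column outlet lags by the system dwell volume, which is not given in the text.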
Crystallization of Dpo4-DNA Complexes-Primer and template sequences that were used in the crystallizations were similar to those utilized by Yang and co-workers (32, 34) as follows: 5′-GGG GGA AGG ATT CA-3′ (14-mer primer; binary Dpo4-1 complex; see Table IV), 5′-GGG GGA AGG ATT C-3′ (13-mer primer; ternary Dpo4-2, -3, and -4
Table I footnotes (fragment): where |Fo,hkl| and |Fc,hkl| are the observed and calculated structure factor amplitudes, respectively. d, Rfree idem, for the set of reflections (5% of the total) omitted from the refinement process.
Structure Determination and Refinement-The Dpo4-1 structure was solved by molecular replacement with the program AMoRe (45), using the 2.8 Å resolution structure of a Dpo4-DNA complex (Protein Data Bank accession code 1s0n (34)), devoid of all side chains and solvent molecules, as the search model. The positioned model gave a correlation coefficient between observed and calculated structure factor amplitudes of 0.469 at 3.5 Å resolution. The location of the Dpo4-1 model with side chains added was optimized by several rounds of rigid body refinement, including data of increasing resolution until the diffraction limit was reached. Due to the similar values of the unit cell parameters of the four crystal structures reported here (Table I), the Dpo4-1 protein-DNA complex without solvent molecules was used as a starting model for the Dpo4-2, -3, and -4 structures. The positions of the models were then optimized by several rounds of rigid body refinement by using data of increasing resolution. Manual model rebuilding was carried out with the TURBO-FRODO program (57) into σA maps computed using σA-modified coefficients (46). The initial difference Fourier maps showed clear negative density (higher than 5.0 r.m.s. deviation) for the Ca2+ ions as well as for the nucleotides used for crystallization (i.e. dATP for Dpo4-2, ddATP for Dpo4-3, and ddGTP for Dpo4-4). Unambiguous density was also observed for the DNA duplex except for the first three template nucleotides (5′-TCA) that were disordered in all four structures. Therefore, in the absence of density, the 5′-terminal T of the template nucleotide was completely omitted from the models. The adjacent C2 has lower occupancy in all four structures, but A3 has lower occupancy only in Dpo4-3 and Dpo4-4.
Figure legend (beginning missing): were annealed with a template containing either X = G or 1,N2-ϵ-G (Scheme 1) and either (A) Z = T or (B) Z = C at the position on the template 5′ to X. Reactions in each set were done for the same time period with the indicated concentration (in nM) of Dpo4 or pol T7−. The concentration of primer-template complex was 100 nM, and the concentration of each dNTP was 250 μM. The reaction products were analyzed by denaturing gel electrophoresis with PhosphorImaging analysis. A, the percentage of total extended products was 76, 79, and 82% with X = G using 25, 50, and 100 nM Dpo4, respectively. The corresponding values with X = 1,N2-ϵ-G and Dpo4 were 8, 16, and 17%, respectively. With X = 1,N2-ϵ-G and 1, 5, 10, and 20 nM pol T7−, the values were 2, 4, 32, and 33%, respectively. B, the percentage of total extended products was 65, 79, and 80% with X = G using 25, 50, and 100 nM Dpo4, respectively. The corresponding values with X = 1,N2-ϵ-G and Dpo4 were 13, 20, and 23%, respectively. With X = 1,N2-ϵ-G and 1, 5, 10, and 20 nM pol T7−, the values were 13, 15, 53, and 62%, respectively.
Refinements were performed using the program CNS (47). To calculate the cross-validation residual Rfree, 5% of amplitudes were excluded from refinement. During the model rebuilding steps, water oxygen atoms were added into positive regions of (Fo − Fc) Fourier difference electron density (higher than 3.0 standard deviation). They were kept in the model after the final refinement round only if they explained the σA density maps and formed at least one hydrogen bond with any other atom in the model. Current statistics of the refined models for all four structures are presented in Table I.
The quality and stereochemistry of the four models were examined with standard procedures in CNS (47) and PROCHECK (48). All the crystallographic figures were prepared using CHIMERA (49).

Extension of Oligonucleotide Primers by Dpo4 in the Presence of All Four dNTPs-
The template sequences selected for use here were based on crystallography work done by Ling et al. (31-34). Separate experiments were done with the pyrimidines T and C placed 5′ to the 1,N2-ϵ-G (Scheme 1A).
As expected from previous work with 1,N2-ϵ-G (15), the model replicative DNA polymerase pol T7− was blocked considerably in its ability to extend past this DNA adduct (Fig. 1). The major product involved a 1-base extension, i.e. insertion of one dNTP opposite 1,N2-ϵ-G. Dpo4 was not particularly efficient but did extend the primer to the apparent end of the template (Fig. 1). In the extension, several bands were seen, as a result of the distributive character of this translesion polymerase or due to frameshifts.
Insertion of Individual dNTPs by Dpo4--With the template containing a C residue 5′ to 1,N2-ϵ-G (Scheme 1A), pol T7− showed a tendency to insert a single A, presumably opposite the 1,N2-ϵ-G adduct (Fig. 2A). Some similarity was observed in the pattern with Dpo4 (Fig. 2B), although (i) more incorporation of dGTP was detected (possibly opposite the C residue 5′ of the 1,N2-ϵ-G and suggesting a deletion in the product), and (ii) with both dATP and dGTP a tendency was seen for multiple incorporations.
By using the template with a T residue 5′ of the 1,N2-ϵ-G, pol T7− incorporated a single base in the apparent order of preference A >> T > C > G (Fig. 3A). With Dpo4 some incorporation of a single G, C, or T was observed, but the preferred reaction was the incorporation of three As (Fig. 3B). This latter result was unexpected and would appear to correspond to the insertion of an A opposite the 1,N2-ϵ-G, the T, and the A (Scheme 1A). Some of this tendency was also seen with pol T7− under these conditions (Fig. 3A) but was less pronounced.
Steady-state Kinetic Analysis of dNTP Incorporation-The incorporation of individual dNTPs was analyzed in more detail using Dpo4 and the template with a T residue 5′ of the 1,N2-ϵ-G adduct (Scheme 1A). The individual experiments are outlined in Scheme 1, with the results presented in Table II.
"Correct" incorporation of dCTP opposite 1,N2-ϵ-G was very unfavorable, with an apparent catalytic efficiency 20,000-fold less than the incorporation of dCTP opposite G (Table II, 1st and 2nd rows). As indicated in Fig. 3B, A was the base preferentially incorporated into the primer. Three As were incorporated sequentially, even under limiting conditions, and we were unable to cleanly stop the reaction at the individual intermediates, even using rapid mixing approaches. The results for the incorporation of A are presented in Table II, 3rd and 4th rows, with the results being calculated for the 3-A addition product. The values shown for the insertion of dATP opposite G are based on incorporation of a single residue, and thus a direct comparison of the efficiency of incorporation of A opposite G and 1,N2-ϵ-G is somewhat misleading. However, even with these caveats the apparent misinsertion frequency (f) is 14, indicating that a 1,N2-ϵ-G adduct has considerable proclivity for directing the insertion of A > C.
The tendency for Dpo4 to add additional As to the primer beyond the 1,N2-ϵ-G:A pair was also examined (Table II, 6th and 7th rows). The apparent catalytic efficiencies for these steps are low (0.0009 and 0.0033 min−1 μM−1) and may limit the value for the reaction shown in row 4. The efficiency for adding the third A (row 7) is ~4-fold higher than for the second A (Table II, 6th row) and helps to explain the pattern of incorporation of three As in the gels shown in Fig. 3B.
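The comparisons in this section rest on ratios of catalytic efficiencies (kcat/Km). As a minimal Python sketch, assuming the usual definition of misinsertion frequency as the efficiency for the nucleotide in question divided by that for the "correct" one, the fold difference between the second and third A additions can be reproduced from the two efficiencies quoted above. The kcat and Km inputs in the second example are hypothetical, because Table II is not reproduced here.

```python
# Ratios of catalytic efficiencies. The definition of f used here
# (efficiency of the tested dNTP over that of the reference dNTP) is the
# conventional one and is assumed rather than quoted from the text.


def catalytic_efficiency(kcat, km):
    """kcat / Km, in the units implied by the inputs."""
    if km <= 0:
        raise ValueError("Km must be positive")
    return kcat / km


def misinsertion_frequency(eff_tested, eff_reference):
    """f = (kcat/Km) for the tested dNTP / (kcat/Km) for the reference dNTP."""
    if eff_reference <= 0:
        raise ValueError("reference efficiency must be positive")
    return eff_tested / eff_reference


if __name__ == "__main__":
    # Efficiencies quoted in the text for adding the second and third A.
    second_a = 0.0009
    third_a = 0.0033
    print(f"third A vs second A: {third_a / second_a:.1f}-fold")

    # Hypothetical kcat/Km inputs giving f = 14, for illustration only.
    eff_a = catalytic_efficiency(kcat=7.0, km=100.0)
    eff_c = catalytic_efficiency(kcat=0.5, km=100.0)
    print(f"f = {misinsertion_frequency(eff_a, eff_c):.0f}")
```

A value of f above 1 means the tested nucleotide is inserted more efficiently than the reference, which is the sense in which A is preferred over C opposite the adduct.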
When the addition of C or A to the primer was examined using rapid quench methods, no burst phase was observed (results not presented), indicating that the addition is an inherently slow process, and the koff rate (for the oligonucleotide) is not rate-limiting, in contrast to results reported for normal base incorporation (35) and confirmed here.
The 1-base incorporation studies were also done with the template sequence containing C instead of T as the pyrimidine
SCHEME 2. LC-MS/MS method for sequencing primer extension products (Figs. 4-11).
MS Analysis of Dpo4 Reaction Products-ES-MS/MS has been used in the past to sequence short oligonucleotides (50,51). LC-MS/MS has advantages over traditional gel sequencing methods (15,52), including the capability of handling an oligonucleotide mixture and better sensitivity, to facilitate the analysis of the oligonucleotide products of the Dpo4 reaction. A uracil-containing primer was designed (Scheme 2). After the polymerase extension was done (see Supplemental Material for an electrophoretogram of an incubation done in parallel), the extended primers were digested by UDG to generate short oligonucleotides, i.e. 6- and 7-mers in length. The short oligonucleotides can be separated easily from the complementary strand by HPLC, due to the reduced Tm. Furthermore, short oligonucleotides generate less complex CID spectra for sequence analysis.
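The uracil strategy can be pictured as a string operation: UDG removes the uracil base, piperidine cleaves the strand at the resulting abasic site, and the fragment 3′ of the uracil is released with a 5′-phosphate. The Python sketch below models only that bookkeeping. The extended-primer sequence in the example is an assumption, assembled from the 5′-GGGGGAAGGA-3′ and 5′-pTCGTGA-3′ fragments named elsewhere in the text, and the 3′-end chemistry of the upstream fragment is ignored.

```python
# String-level model of UDG digestion followed by piperidine cleavage.
# Sequences are written 5'->3'; "p" marks a 5'-phosphate on the released
# downstream fragment. The example sequence is an assumed reconstruction.


def udg_piperidine_fragments(extended_primer):
    """Split a primer containing exactly one U into its two fragments.

    Returns (upstream_fragment, downstream_fragment); the uracil nucleotide
    itself is lost, and the downstream fragment carries a 5'-phosphate.
    """
    if extended_primer.count("U") != 1:
        raise ValueError("expected exactly one uracil in the primer")
    upstream, downstream = extended_primer.split("U")
    return upstream, "p" + downstream


if __name__ == "__main__":
    # Hypothetical extended primer: upstream half + U + sequenced product.
    up, down = udg_piperidine_fragments("GGGGGAAGGAUTCGTGA")
    print(up, down)
```

The short downstream fragment is the piece that is sequenced by CID, which is why its length (6- or 7-mer) reports directly on how many nucleotides the polymerase added.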
After UDG digestion and piperidine treatment, a gap was generated in the DNA strand obtained from primer extension with the template containing a C residue 5′ of 1,N2-ϵ-G (Scheme 2). This mixture was separated by HPLC, and the oligonucleotide-containing peak was eluted at 6.8 min. Preliminary analysis of the ES mass spectrum (Fig. 4B) Fig. 5A. The apparent composition of the m/z 942.4 (M − 2H) oligonucleotide is 1 phosphate, 2 Ts, 1 C, 2 Gs, and 1 A (because the unknown oligonucleotide was a hydrolysis product obtained using UDG, it should contain a phosphate residue at the 5′-end) (see web page for the Oligo Composition Calculator at medlib.med.utah.edu/masspec/compo.htm). Given the sequence of the template, the unknown sequence was proposed to be 5′-pTCGTGA-3′. CID fragments calculated for this sequence are listed in Table IV, and all of the major a-B and w ions appeared in the CID spectra of the unknown oligonucleotide (Fig. 5B). Thus, 5′-pTCGTGA-3′ was identified as one of the major products from the Dpo4 reaction. The same analysis was carried out for the 1098.9 (M − 2H) ion, and this sequence was identified as 5′-pTCAGTGA-3′, with the calculated CID fragments shown in Table V. From comparisons of the TIC intensities, 5′-pTCGTGA-3′ corresponds to 84% of the total product, and 5′-pTCAGTGA-3′ accounts for 16%. Because the 1255.3 (M − 2H) ion could not be fragmented under the same CID conditions and was present throughout the HPLC gradient, this species was not considered for further analysis.
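The link between a base composition and an observed m/z value can be checked with simple arithmetic. The Python sketch below is not the Utah calculator cited above; it is an independent approximation that sums average masses of the free 2′-deoxynucleoside 5′-monophosphates, subtracts one water per phosphodiester bond, and removes protons for the negative charge state. The monomer masses are standard average values supplied here, not numbers from the text.

```python
# Approximate average mass of a 5'-phosphorylated DNA oligonucleotide and
# the m/z of its (M - nH) negative ion. Monomer masses are average masses
# of the free 2'-deoxynucleoside 5'-monophosphates (assumed inputs).
DNMP_AVG_MASS = {"A": 331.22, "C": 307.20, "G": 347.22, "T": 322.21}
WATER = 18.015
PROTON = 1.00728


def oligo_mass(sequence):
    """Average neutral mass of 5'-p-sequence-3'-OH (sequence written 5'->3')."""
    total = sum(DNMP_AVG_MASS[base] for base in sequence)
    # One water is lost for every phosphodiester bond formed.
    return total - (len(sequence) - 1) * WATER


def mz_negative(mass, charge):
    """m/z of the (M - charge*H) ion in negative-ion mode."""
    return (mass - charge * PROTON) / charge


if __name__ == "__main__":
    for seq, reported in (("TCGTGA", 942.4), ("TCAGTGA", 1098.9)):
        mz = mz_negative(oligo_mass(seq), 2)
        print(f"5'-p{seq}-3': calculated {mz:.1f}, reported {reported}")
```

The calculated values fall within a few tenths of an m/z unit of the reported ions, which is the level of agreement expected from an ion trap when average rather than monoisotopic masses are used.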
When the C on the 5′-side of 1,N2-ϵ-G was replaced by a T (Z = T in Scheme 1), four major oligonucleotide products were generated and eluted at tR 7.4 min as a mixture: m/z 1090.8 (M1 − 2H), 1086.3 (M2 − 2H), 934.4 (M3 − 2H), and 777.8 (M4 − 2H) (Fig. 7). The ions m/z 727.1 (M1 − 3H) and 622.9 (M2 − 3H) belong to the parent ions M1 and M2, and other peaks were regarded as minor products. The same sequence analysis procedure was used. For example, the m/z 1090.8 ion was assumed to be a M − 2H ion, and the Mr of the parent ion should be 2183.6. Thus, the unknown oligonucleotide contained 1 phosphate, 2 Ts, 1 C, 1 G, and 3 As, based on the calculations. Given the sequence of the template, a candidate sequence of the unknown is 5′-pTCAATGA-3′. The CID spectra (Fig. 8B) matched well with that calculated based on this candidate sequence (see Supplemental Data for tables of calculated CID fragments). Thus, the four products were identified as 5′-pTCAATGA-3′ (M1), 5′-pTCATTGA-3′ (M2), 5′-pTCATGA-3′ (M3), and 5′-pTCTGA-3′ (M4) (Figs. 7-11), and their yields were 31, 27, 24, and 18% respectively (Scheme 3) (see Footnote 3).
X-ray Crystallography-We determined the crystal structures of a binary Dpo4·DNA and three ternary Dpo4·DNA·d(d)NTP complexes at resolutions of between 2.0 and 2.5 Å (Table I). The lengths of the primer (14-mer for Dpo4-1 and 13-mers for the Dpo4-2, -3, and -4 complexes) and template strands (18-mers in all cases) are similar to those used previously for crystallizing complexes between DNA and Dpo4 (31-34). Alterations in the sequences of the primer and template strands used here compared with published work concern the 3′ terminus of the former and the 5′-half of the latter, including the adducted 1,N2-ϵ-G residue (see Ref. 31 and "Experimental Procedures").
The structure of the complex between Dpo4 and a DNA primer-template duplex featuring an abasic site (at position 14 of the 18-mer template (34)) served as the model in the molecular replacement searches to phase the Dpo4-1 complex. Consistent with the excellent correlation factors and initial R factors resulting from the search, the two structures (and indeed all four structures described here) are very similar.
The active site of the polymerase in the Dpo4-1, -2, -3, and -4 complexes closely resembles that in the "type II" crystals of the complex between Dpo4 and native DNA (31). Thus, the 1,N2-ϵ-G residue of the template as well as the 5′-adjacent T (Dpo4-1 to -3) and C (Dpo4-4) are both accommodated at the active site (Fig. 12). In the complex with native DNA, the next residue of the template 5′-TTCG- (T, underlined) is also directed toward the active site (see Fig. 6B in Ref. 31), whereas in our complexes with template sequences 5′-TCAT(C)-, the corresponding residue (A, underlined) is looped out and directed away from the active site (Figs. 12 and 13). This may indicate that a purine base is too bulky to be stacked above the second of the two bases lodging in the active site. However, the situation encountered in the Dpo4-4 complex serves to demonstrate that the template strand is able to adopt a range of conformations at this site, as the adenine in that complex assumes a different orientation than in the other three complexes (Fig. 13D).
Footnote 3: One interesting observation was that two major species resulted from the m/z 1086.3 ion (Fig. 9). Based on the CID analysis, the one eluted at 7.5 min was identified as 5′-pTCATTGA-3′ (M2). Another species eluted at tR 3 min and contained fragments from 5′-GGGGGAAGGA-3′, which was the other half of the primer after the piperidine hydrolysis. In addition, 1086.3 matched the M − 3H ion of the species, 5′-GGGGGAAGGA-(2′,3′-dideoxy-2′,3′-dehydroribose)-H2O. It can be assumed that this oligonucleotide may form a cyclic configuration by losing one H2O molecule. Thus, this product can be eluted even faster than a 5-mer oligonucleotide such as 5′-pTCTGA-3′ (M4) (Fig. 5).
Table footnote a: Boldface numbers refer to the peaks found in the CID spectrum in Fig. 6B.
A shared feature of all four complexes is the orientation of the 1,N2-ϵ-G adduct. The residue is stacked into the duplex but is not opposed by a base from the primer strand (Fig. 13). Instead, the cytosine and adenine bases of the extended primer (+1 and −1 positions relative to the position of the 1,N2-ϵ-G in the template strand) in the Dpo4-1 complex are wrapped around the modified residue. To bridge the larger distance that results from omitting the template 1,N2-ϵ-G, both primer bases are tilted in the product complex as well as the ternary Dpo4-2, -3, and -4 complexes. The positions of nucleobases at the active site vary only slightly as can be seen in the superpositions of the complexes with adducted DNA and the "type I" and type II complexes with native DNA (Fig. 14, A and B, respectively). The superposition of the DNAs in the Dpo4-1, -2, and -4 complexes with a portion of the native DNA primer-template duplex at and adjacent to the active site in the type I crystal shows that the 1,N2-ϵ-G residue assumes a position that was previously occupied by a Watson-Crick T:pA pair (Fig. 14B). The superposition reveals practically no changes in the conformation of the sugar-phosphate backbone of the template. Deviations relative to the native DNA duplex are restricted to the primer residues that wrap around 1,N2-ϵ-G. In addition to the aforementioned tilting of bases, there are shifts and altered torsion angles in the sugar-phosphate backbone of the primer to allow for the bridging of the gap resulting between T and A (G in the case of Dpo4-4; Fig. 14A).
The superposition of the binary Dpo4-1 complex, the ternary complexes Dpo4-2 and Dpo4-4, as well as the type II ternary complex containing native DNA also manifests only subtle variations between the complex with the product and those with dNTPs or ddNTPs occupying the active site (Fig. 14B). The 3′-OH groups of the last primer nucleotide are poised for attack at the α-phosphates of dATP (Dpo4-2; Fig. 14B) and ddATP (Dpo4-3; not shown). Moreover, the metal ions at the active site occupy virtually identical sites in all four complexes with 1,N2-ϵ-G-modified templates. However, there is an interesting rearrangement of the triphosphate moiety of ddGTP in the Dpo4-4 complex (Fig. 13D) compared with the conformations of dATP and ddATP in the Dpo4-2 and Dpo4-3 complexes, respectively (Fig. 13, B and C). In the Dpo4-4 complex, the γ-phosphate is closest to the 3′-terminal nucleotide of the primer, whereas the α-phosphate now occupies a position that is close to those of the γ-phosphates in the Dpo4-2 and -3 complexes. The resolutions of the structures allow unambiguous assignments of the conformations of nucleotides, and the altered orientation of the triphosphate in Dpo4-4 is not a consequence of poor electron density or local disorder. When the positions of the d(d)NTPs in the three structures are compared (results not shown), it is also evident that the dATP (Dpo4-2) occupies a position that is distinct from those of ddATP (Dpo4-3) and ddGTP (Dpo4-4). This is likely the result of a hydrogen bond formed between the 3′-OH of dATP and OD2 of Asp-105 that is absent in the two other complexes with either ddATP or ddGTP. Detailed analysis reveals that the slightly altered orientations of the nucleotide triphosphates go along with different sets of hydrogen bonding interactions with Dpo4 side chains in the respective complexes.
The observation that the presence of 1,N2-ϵ-G in the template strand leads to virtually no structural changes relative to the type II Dpo4 complex with native DNA is remarkable. In particular, there is no evidence from the structural data for an extrahelical orientation of the 1,N2-ϵ-G residue. Such a rearrangement into the major groove (toward the viewer in Figs. 12 and 13) would be necessary in order to allow a Hoogsteen-type interaction with adenine via the N-2 and N-3 positions of 1,N2-ϵ-G from the minor groove side. However, the observed intrahelical orientation of 1,N2-ϵ-G in all four complexes presumably offers energetic advantages in terms of stacking. As is evident from Figs. 14 and 15, 1,N2-ϵ-G provides an excellent platform for cross-strand stacking with an incoming dATP or dGTP. Independent of the sequence of the template 5′ to 1,N2-ϵ-G, it may therefore be easier to incorporate A or G, even if the template strand codes for a pyrimidine. This phenomenon would be consistent with the facile incorporation of multiple As catalyzed by Dpo4 with templates containing 1,N2-ϵ-G (Figs. 2 and 3).

DISCUSSION
The interactions of the archaeal DNA polymerase Dpo4 and oligonucleotides containing 1,N2-ϵ-G were studied as a model of a polymerase-DNA-carcinogen system; 1,N2-ϵ-G is also an adduct of potential mutagenic concern in its own right. Typical gel electrophoresis assays and steady-state kinetic analysis indicated that a mixture of products was formed. The products formed by Dpo4 polymerization past the template 1,N2-ϵ-G were analyzed by LC-ES-MS/MS. The six identified products, obtained with the two templates, correspond to substitution at the adduct site, −1 and −2 frameshifts, and a more complex mechanism involving a variation of a "dNTP-stabilized" slipped intermediate (53) (Schemes 4 and 5). The proposed mechanisms are supported by the results of analysis of x-ray crystal structures of three of the intermediates, including one binary complex and three different ternary complexes (Fig. 13).
The substitution of a uracil for T in the primer, 5′ of the start of polymerization, permitted better recovery, separation, and analysis of the products (Scheme 2). The MS methodology was optimized using the product obtained with the 3′-1,N2-ϵ-G-C-5′ template. In ES-MS, the signal intensity is a reasonable measure of the relative amounts of similar materials, and we estimate that 84% of the product corresponded to an incorporation of G opposite the C residue 5′ of the 1,N2-ϵ-G, thus skipping the adduct and resulting in a −1 deletion product (Scheme 4A). The minor product (16%) was due to the insertion of A opposite 1,N2-ϵ-G, followed by accurate polymerization beyond this match and yielding a base pair substitution (Scheme 4B). ES-MS/MS was used instead of matrix-assisted laser desorption ionization methods, for which the signal response can be quite variable for similar compounds. The ES-MS/MS method was also applied to the analysis of products obtained with the 3′-1,N2-ϵ-G-T-5′ template. Four products were identified (Scheme 3), and reasonable mechanisms are proposed in Scheme 5. The crystal structures of two of the intermediates were obtained (Fig. 13, A-C).
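The arithmetic behind such a semi-quantitative estimate can be sketched as follows. This is a minimal illustration only, assuming that the signal intensities of closely similar oligonucleotides are roughly proportional to their amounts; the function name and the intensity values are hypothetical and were chosen solely to reproduce the 84%/16% split reported above, not taken from the measured spectra.

```python
def relative_abundance(intensities):
    """Convert per-product signal intensities into percentages of the total signal."""
    total = sum(intensities.values())
    if total <= 0:
        raise ValueError("total signal must be positive")
    return {name: 100.0 * value / total for name, value in intensities.items()}


# Hypothetical peak intensities (arbitrary units), NOT measured data.
example_intensities = {
    "GTGA (-1 deletion)": 4.2e5,
    "AGTGA (A opposite adduct)": 0.8e5,
}

fractions = relative_abundance(example_intensities)
for product, percent in fractions.items():
    print(f"{product}: {percent:.0f}%")
```

The estimate is only as good as the proportionality assumption, which is why the comparison is restricted to products of similar length and composition.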
The MS methods for analysis of the products are far superior to Maxam-Gilbert analysis of oligonucleotide products, which was done previously in work with 1,N2-ϵ-G (15, 16). In that approach, the two purines, and likewise the two pyrimidines, are distinguished from each other only by "subtractive analysis" in the reading of the gels (52). The electrophoretic assays are inappropriate for mixtures (e.g. Scheme 3) and particularly so for oligonucleotides of varying lengths with partially identical sequences. Furthermore, if longer products were formed and ligated into vectors for modern automated sequencing methods, the results obtained with a mixture (e.g. Scheme 3) would be uninterpretable.
The analysis of the products by MS/MS provides a more accurate analysis of the course of incorporation, in the presence of a mixture of all four dNTPs, than is obtained with the more traditional gel assays using a single dNTP in each setting (Figs. 2 and 3 and Tables II and III). With the 3′-1,N2-ϵ-G-C-5′ template (Fig. 2), Dpo4 inserts multiple As and Gs (or 1 T). A similar pattern was seen with Dpo4 and the 3′-1,N2-ϵ-G-T-5′ template, although apparently 3 As, 1 G, or 1 T was inserted (Fig. 3). The steady-state kinetic analyses are consonant with these results and provide some quantitative information, although the subsequent MS (and x-ray) analyses provide a superior view of the polymerase mechanism. In principle, the course of the nucleotide incorporation process opposite and beyond 1,N2-ϵ-G might be discerned by analysis of sets of quantitative steady-state analyses (e.g. Tables II and III), although complex multiple experiments would be in order. The difficulty of doing this is apparent from Scheme 5 and its myriad possible events. Each 1-base incorporation event would have to be followed by analysis of the next base insertion, and identifying the formation of the product in Scheme 5B would be quite difficult. Thus the MS analysis has the advantage of providing a semi-quantitative description of the overall process in a single experiment.
Some comparisons of the Dpo4 results can be made with results for other DNA polymerases and 1,N2-ϵ-G, although (i) some of the earlier work with individual polymerases was done in different sequence contexts, and (ii) all of the earlier work with individual polymerases was done with simple gel analysis and has limitations. Previous work with E. coli pol I− (Klenow fragment) and pol II−, pol T7−, human immunodeficiency virus, type 1, reverse transcriptase, and rat pol β has yielded incorporation of all four dNTPs opposite 1,N2-ϵ-G, as well as −1 and −2 deletions (15, 16). Both G to A transitions and G to T transversions were found in E. coli cells, corresponding to insertion of A and T opposite 1,N2-ϵ-G (16). In Chinese hamster ovary cells, the most common base pair substitution at the 1,N2-ϵ-G site corresponded to insertion of T (G to A transition) (17). One of the base pair mutants corresponded to insertion of dATP, and a number of complex mutants resulted from rearrangements and deletions. A direct comparison of Dpo4 with another polymerase, pol T7−, is made here in Figs. 1-3, with the same sequences used. pol T7− showed somewhat similar tendencies, as judged by these results, but there were some differences. For instance, Dpo4 was more likely to insert G than A (Fig. 2). With the other template sequence used (Fig. 3), both polymerases preferred to incorporate A, but Dpo4 catalyzed multiple incorporations. Thus, we conclude that the course of polymerization events is a function of the individual DNA adduct, the DNA sequence (Scheme 4), and the polymerase (Figs. 2 and 3).
Multiple mechanisms are required to explain the mixtures of products Dpo4 generated (Scheme 3). We first consider the mechanisms involved in extension of the primer opposite the template with a C residue 5′ of 1,N2-ϵ-G (Scheme 4). The major product, the −1 deletion, is explained by the mechanism shown in Scheme 4A, with a direct insertion of the dGTP opposite the C. This intermediate is confirmed by its crystal structure (Fig. 13D). This behavior of Dpo4 has been suggested previously in kinetic (35) and structural (31) studies with mispaired bases, without DNA adducts. The minor product has the A inserted opposite 1,N2-ϵ-G, followed by normal incorporation. The latter result is of significance in explaining the results obtained with the other template (Schemes 3 and 5) because it shows an inherent tendency to insert an A opposite 1,N2-ϵ-G.
The mechanisms for replication of the template with T in the position 5′ of 1,N2-ϵ-G are more complex, with four products recovered in similar yield (Scheme 5). The product shown in Scheme 5A can be derived via the intermediates for which crystal structures were obtained (in boxes) or from an initial binding of dATP opposite 1,N2-ϵ-G. The former path is supported by the crystal structures, but the latter is also possible, given the evidence for the Scheme 4B mechanism with the other sequence. The product produced in Scheme 5B is presumed to be bulged. The proposed mechanism begins with the intermediates identified in the crystal and proceeds with a slippage. This mechanism bears some resemblance to the "dNTP-stabilized misalignment" and "misinsertion misalignment" mechanisms discussed by Tippin et al. (53), although it is not exactly the same as either. The mechanism proposed in Scheme 5C is identical to that of Scheme 4A and is supported by the crystal structures. Finally, the product generated in Scheme 5D (−2 deletion) clearly must come from dTTP insertion opposite the A, although we have not searched for crystal structures of this type. It is also of interest to note that the results of the gel assays with single dNTPs, by themselves (Figs. 2 and 3), are of limited usefulness in predicting the results obtained using MS and x-ray crystallography, as discussed above.
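The relationship between the simpler mechanisms and the product sequences can be made concrete with a short sketch. It is an illustration under stated assumptions, not the analysis used in this work: the function name, the use of "X" to mark 1,N2-ϵ-G, and the restriction to A opposite the adduct are all choices made here for clarity. The sketch enumerates candidate extension products for a substitution, a −1 deletion, and a −2 deletion; it does not model the bulged product of Scheme 5B, and it does not predict which candidates are actually formed.

```python
# Watson-Crick complements for the unmodified template bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def candidate_products(template_3to5, opposite_adduct="A"):
    """Return candidate primer-extension products, written 5'->3'.

    template_3to5: template read 3'->5', with 'X' (the adduct) as its first base.
    opposite_adduct: base assumed to be inserted opposite the adduct.
    """
    if not template_3to5 or template_3to5[0] != "X":
        raise ValueError("template must start with 'X' marking the adduct")
    # Copying the template 3'->5' yields the new strand in 5'->3' order.
    copied = "".join(COMPLEMENT[base] for base in template_3to5[1:])
    return {
        "substitution": opposite_adduct + copied,  # base placed opposite the adduct
        "-1 deletion": copied,                     # adduct skipped
        "-2 deletion": copied[1:],                 # adduct and next base skipped
    }


for template in ("XCACT", "XTACT"):
    print(template, candidate_products(template))
```

For the C-containing template this gives AGTGA and GTGA for substitution and −1 deletion, and for the T-containing template AATGA, ATGA, and TGA, matching the sequences identified by MS; the fourth product with the T-containing template, ATTGA, requires the slippage mechanism and falls outside this simple enumeration.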
The first crystal structure of Dpo4, published by Ling et al. (31), showed a much more open cleft than known for the processive DNA polymerases, as visualized in Fig. 12 of the present work. Our crystal structures have the incoming d(d)ATP skipping the 1,N2-ϵ-G and pairing in a classic hydrogen-bonding mode with the T base 5′ to the 1,N2-ϵ-G (Fig. 13, B and C). Similarly, in the template sequence with the C, the incoming d(d)GTP skips the 1,N2-ϵ-G and pairs in a classic hydrogen-bonding mode with the C that is 5′ to the 1,N2-ϵ-G. In the binary complex with 1-base-extended primer plus template, the same Watson-Crick bonding of the A with the T base 5′ to the 1,N2-ϵ-G is observed (Figs. 13 and 14). A somewhat unexpected feature is that the 1,N2-ϵ-G base is not bulged or flipped, and the chains extend past the lesion for phosphodiester bond formation. This perturbed duplex structure might be expected to be associated with an altered Dpo4 structure, although we have not seen any major changes relative to the complex with unmodified DNA.
Several other Dpo4 structures are now available with adducted DNA or mismatched residues and may be compared. Two structures were observed with a cis-syn-cyclobutane (T-T) dimer (32). In this system, Dpo4 incorporates two As and thus preserves the coding information. An N6-adenyl benzo[a]pyrene diol epoxide complex yielded two crystal forms (BP-1 and BP-2), which correspond to stacked and extruded orientations (33). The considerable bulk of this lesion makes a comparison with our work difficult. Recently, Wang et al. (54) published a molecular dynamics simulation of the interaction of Dpo4 with an N2-G adduct derived from benzo[a]pyrene diol epoxide and compared the predicted stabilities. A set of five structures related to copying by Dpo4 past an abasic site, published by Ling et al. (34), provides very useful information on the course of the mechanism of several phenomena.
Several of these structures may have relevance for the structures we solved with the 1,N2-ϵ-G adduct (Fig. 13) and the products obtained with the two different templates (Schemes 4 and 5). In the Ab-2B complex that displays the type II structure, the primer terminus is opposite the template abasic site, the next template base 5′-adjacent to the lesion has no partner, and the incoming dTTP then base-pairs with the template A two bases from the lesion. The ensuing −1 frameshift is similar to those seen with 1,N2-ϵ-G-modified templates (Schemes 4, B and C, and 5A; Fig. 13). If one were to consider the abasic site in that complex as a model for a sugar-phosphate backbone conformation that could potentially be adopted by 1,N2-ϵ-G, one would arrive at a scenario that is consistent with the −2 frameshift product depicted in Scheme 5D. In the structures of the Ab-1 and Ab-2A complexes that are of type I, the abasic lesion in the template is looped out, indicating that a local distortion of the DNA backbone is acceptable at the active site of the Dpo4 polymerase. Incorporation of an adenine opposite the 1,N2-ϵ-G adduct (Scheme 4B and Scheme 5A) requires the latter to swing out of the helix, thus making way for the pairing interaction with A (presumably via the minor groove edge of 1,N2-ϵ-G). Finally, the structure of the Ab-3 complex revealed the 3′-A of the primer opposite the intrahelical abasic site and the incoming dCTP opposite the template G 5′ to the lesion. Therefore, both the template and the primer were in the correct base pair register. However, the terminal primer base has slid toward the template strand to increase stacking with the preceding bases from both strands and the 5′-adjacent guanine from the template. This latter interaction is reminiscent of the cross-strand stacking observed in the four structures of binary and ternary complexes of Dpo4 with 1,N2-ϵ-G-modified DNAs.
This observation serves to reiterate the importance of stacking for generating a stable intermediate of the polymerase-DNA complex and may provide a possible explanation for why we have thus far been unable to isolate structures of complexes with Dpo4 that display partly extrahelical or fully extruded orientations of the 1,N2-ϵ-G residue.
Finally, Trincao et al. (55) analyzed the structure of a G:T mismatch. In this structure a wobble configuration (56) was observed for the G:T pair and is postulated to prevent extension of the primer. This latter structure does not seem relevant to those determined here with the 1,N2-ϵ-G adduct.
In summary, analysis of the nucleotide sequences of the Dpo4 reaction products provides new insight into the complexity of the action of a polymerase on carcinogen-modified DNA. X-ray crystal structures of several of the proposed intermediates support their proposed roles in the process. Mechanisms include direct coding for A incorporation opposite the adduct (1,N2-ϵ-G), what appears to be a dNTP-stabilized misalignment mechanism (35, 53), a variant of the latter that produces a 2-base deletion, and a more complex mode of the dNTP-stabilized misalignment that involves both the primer and template (Scheme 5). Two striking features of the −1 deletion crystal structures, both with the dNTP and the 1-base elongated product (Fig. 13), are (i) the lack of perturbation of the template 1,N2-ϵ-G and (ii) the classic Watson-Crick hydrogen bonding of the "misaligned" bases. The application of these approaches to other types of DNA adducts, coupled with kinetic and other biochemical approaches, may provide more general insight into the nature of interactions of DNA-carcinogen adducts with DNA polymerases.