The Spacious Active Site of a Y-Family DNA Polymerase Facilitates Promiscuous Nucleotide Incorporation Opposite a Bulky Carcinogen-DNA Adduct: Elucidating the Structure-Function Relationship through Experimental and Computational Approaches*

Y-family DNA polymerases lack some of the mechanisms that replicative DNA polymerases employ to ensure fidelity, resulting in higher error rates during replication of undamaged DNA templates and the ability to bypass certain aberrant bases, such as those produced by exposure to carcinogens, including benzo[a]pyrene (BP). A tumorigenic metabolite of BP, (+)-anti-benzo[a]pyrene diol epoxide, attacks DNA to form the major 10S (+)-trans-anti-[BP]-N2-dG adduct, which has been shown to be mutagenic in a number of prokaryotic and eukaryotic systems. The 10S (+)-trans-anti-[BP]-N2-dG adduct can cause all three base substitution mutations, and the SOS response in Escherichia coli increases bypass of bulky adducts, suggesting that Y-family DNA polymerases are involved in the bypass of such lesions. Dpo4 belongs to the DinB branch of the Y-family, which also includes E. coli pol IV and eukaryotic pol κ. We carried out primer extension assays in conjunction with molecular modeling and molecular dynamics studies in order to elucidate the structure-function relationship involved in nucleotide incorporation

tional interplay between replicative and Y-family DNA polymerases, as well as the mechanism of damage bypass by both types of enzymes, is of considerable interest in understanding the origin of in vivo mutagenesis. While replicative DNA polymerases employ a number of mechanisms to ensure fidelity, including induced-fit geometric selection of the nascent base pair and minor groove scanning of the primer-template duplex region (reviewed in Refs. 10-16), Y-family DNA polymerases appear to lack such mechanisms, resulting in higher error rates during replication of undamaged DNA templates and the ability to bypass certain aberrant bases (1, 2, 5-8, 17-25). It is thought that the more permissive nature of Y-family polymerases has evolved, at least in part, to rescue replication machinery that is blocked by altered bases, such as those produced by exposure to bulky mutagens and carcinogens (26-29). Erroneous bypass of carcinogen-damaged bases can contribute to the carcinogenic process if a mutation occurs in a gene critical to cell cycle control (30,31).
* This work was supported by National Institutes of Health Grants CA28038 (to S. B.), ES10581 (to D. A. S.), and CA76660 (to N. E. G.). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Chemical carcinogens are encountered throughout the environment, and the manner by which their tumorigenic effects are exerted is of much interest in understanding the origins of environmental mutagenesis and carcinogenesis. Polycyclic aromatic hydrocarbons (PAHs) are a ubiquitous class of environmental carcinogens that are byproducts of the incomplete combustion of organic matter; they are therefore produced during volcanic eruptions and are present in automobile, jet, and factory emissions, barbecued meats and fish, and cigarette smoke (32-36). Benzo[a]pyrene (BP) is a commonly studied PAH that has been implicated as a causative agent in lung cancer (37,38); it is metabolically activated to a number of reactive intermediates in mammals (39), including the tumorigenic (+)-anti-benzo[a]pyrene diol epoxide (BPDE) (40). The electrophilic (+)-anti-BPDE attacks DNA predominantly at the exocyclic amino group of guanine to form the 10S (+)-trans-anti-[BP]-N2-dG ((+)-ta[BP]G) adduct (41,42) (shown in Fig. 1a). High resolution NMR solution studies have shown that (+)-ta[BP]G adopts a normal anti conformation in duplex DNA opposite a dC partner, and the carcinogen moiety resides in the minor groove with its pyrenyl rings pointing toward the 5′-side of the damaged guanine (43,44). A similar conformation, with the damaged guanine anti and hydrogen-bonded to its partner, is adopted when (+)-ta[BP]G is placed opposite dC at a single-strand/double-strand junction (45). In contrast, with no partner opposite the lesion at a single-strand/double-strand junction, the (+)-ta[BP]G adduct adopts a syn conformation with the carcinogen moiety stacked on top of the duplex region (46).
The (+)-ta[BP]G adduct is mutagenic in a number of prokaryotic (7, 47-61) and eukaryotic (4, 59, 62-64) systems. Specifically, this adduct can cause all three base substitution mutations, with their relative frequencies dependent on sequence context, as well as the host cell replication system by which the adduct is bypassed (47-53, 55, 56, 59-62, 65). In Escherichia coli, the replicative DNA polymerase III has been shown to participate in the predominant point mutation pathway, namely G → T transversions, at (+)-ta[BP]G adducts in a 5′-GG*A-3′ sequence context, where G* represents the adduct (60). However, induction of the SOS response in E. coli increases bypass of bulky adducts, including (+)-ta[BP]G, with a significant increase in both error-free and error-prone events (49,51,60,66); this suggests that Y-family DNA polymerases, including pol IV and pol V, evolved to overcome replication hurdles in order to increase cellular survival (6,60). In agreement, a recent study suggests two mutagenic pathways for G → T mutations in E. coli dictated by sequence context, one of which is dependent on DNA polymerase V, a Y-family DNA polymerase (67). Indeed, bulky adducts, such as (+)-ta[BP]G, predominantly block replicative DNA polymerases (47, 56-58, 68-70), but members of the Y-family, including pol (71), pol (64), and pol (63), bypass these adducts more readily than their replicative counterparts, albeit with reduced efficiency when compared with that in undamaged DNA (4,7,20,63,71).
In addition, pol has been implicated in facilitating survival of mammalian cells exposed to metabolically activated benzo-
Previous studies have suggested that replicative DNA polymerases are predominantly blocked by (+)-ta[BP]G because the damaged base must adopt the less common syn conformation in order to be accommodated within the sterically constrained active site (72-74). However, little is known about the mechanism by which the low fidelity Y-family DNA polymerases accommodate bulky lesions, such as (+)-ta[BP]G, compared with high fidelity DNA polymerases. Dpo4 is a Y-family DNA polymerase from the crenarchaeon Sulfolobus solfataricus P2 that offers a unique opportunity to investigate the structure-function relationship involved in adduct bypass. Not only is the enzyme readily available for functional studies (1,22), but high resolution crystal structures of the ternary complex of this enzyme, including the primer-template and incoming nucleotide, have also been solved (2, 75-77).
The 40-kDa Dpo4 enzyme belongs to the DinB branch of the Y-family, which also includes E. coli pol IV and eukaryotic pol κ (78), and was isolated from S. solfataricus P2, a thermophilic archaeon (1). Despite little sequence similarity, Dpo4 and other Y-family DNA polymerases display structural similarities to their replicative cousins, adopting shapes reminiscent of a right hand with palm, finger, and thumb domains (2, 22, 79-83). However, unlike the α-helical nature of replicative polymerase finger domains, a β-strand and adjacent loop in the finger domain of Dpo4 contact the face of the nascent base pair (2). In addition, Dpo4 possesses an extra C-terminal domain, dubbed the little finger, that is thought to be involved in increasing processivity (1,2,22). Dpo4 makes limited contacts with the primer-template, and notably, the minor groove side of the primer-template is solvent-exposed (2) and lacks the minor groove-scanning tract present in replicative DNA polymerases (14,15,79,80,82-88). Limited protein-DNA interactions and relaxed fidelity mechanisms may help explain why Dpo4 is both distributive, incorporating 1-2 nucleotides per primer-template binding event, and relatively error-prone, with a mutational frequency of 8 × 10⁻³ to 3 × 10⁻⁴ when replicating undamaged DNA (1).
In the present work, we have carried out primer extension assays in conjunction with molecular modeling and molecular dynamics studies in order to elucidate the structure-function relationship involved in the incorporation of different nucleotides opposite the bulky (+)-ta[BP]G adduct catalyzed by Dpo4. Our experimental results reveal that this enzyme is able to incorporate all four bases opposite the damaged guanine at similar rates, and the simulations provide molecular explanations for the accommodation of each incoming nucleotide opposite this bulky carcinogen-DNA adduct.

Damaged Oligodeoxynucleotides and Primer Extension Template Construction-The oligodeoxynucleotides were synthesized using automated methods based on phosphoramidite chemistry on a Biosearch Cyclone automated DNA synthesizer (Milligen-Biosearch Corp., San Rafael, CA) and were purified using high performance liquid chromatography (HPLC). The 11-mer oligonucleotide, 5′-TTA TAG6* CAC AC-3′ with the single (+)-ta[BP]G adduct at G6*, was generated by a direct synthetic method and purified and characterized as described in detail elsewhere (71, 89-91). This oligodeoxynucleotide sequence was designed to prevent slippage-mediated bypass of the adduct (45). The 42-mers containing a single (+)-ta[BP]G were constructed by ligation methods (45,65,68); the modified 11-mers and flanking 12- and 19-mers were annealed with a complementary 38-mer template strand, and then the three oligonucleotides were ligated to one another using T4 DNA ligase (New England Biolabs, Inc., Beverly, MA) as described previously (47,68,71). The intact 42-mers were purified using 20% PAGE with 8 M urea, visualized using ethidium bromide, and then excised from the gel. The sequence of the 42-mer template thus generated was: 5′-CAC GTA ATG ATG TTA TAG6* CAC ACG CTA TCT GGC CAG ATC GCG-3′, where G6* represents (+)-ta[BP]G or unmodified guanine. The oligonucleotides were eluted from the gel slice using the PAGE purification kit (Bio-Rad), the ethidium bromide was removed by butanol extraction, and the samples were desalted using Bio-Rad Spin6 columns. The unmodified control 42-mer template strand was constructed in a similar manner. Each template strand was end-labeled using T4 polynucleotide kinase (PNK) (New England Biolabs, Inc., Beverly, MA), and its purity was checked by running on a 20% polyacrylamide gel containing 8 M urea. Adduct presence was evident because the damaged strand migrated more slowly than the undamaged strand, and the presence of the (+)-ta[BP]G was verified by checking the fluorescence and UV absorption of the sample.
Complementary primers 18 and 21 bases long were purchased from Sigma Genosys (The Woodlands, TX) and were PAGE-purified as described above.
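As a quick consistency check on the template design, the following Python sketch locates the 11-mer within the 42-mer and derives the flanking segments. The split into a 12-mer 5′ flank and a 19-mer 3′ flank is inferred here from the sequences given in the text rather than stated explicitly, the variable names are illustrative, and the adduct is tracked only as a position, not as a chemical modification.

```python
# Sequences are copied from the text with spaces removed.
# The adducted base G6* is represented as a plain G; only its index is tracked.
ADDUCT_11MER = "TTATAGCACAC"  # 5'-TTA TAG6* CAC AC-3'
TEMPLATE_42MER = "CACGTAATGATGTTATAGCACACGCTATCTGGCCAGATCGCG"

# Locate the 11-mer inside the ligated 42-mer (0-based start index).
start = TEMPLATE_42MER.find(ADDUCT_11MER)

# Assumption: the flanking oligonucleotides are simply the sequence on
# either side of the 11-mer in the final 42-mer.
five_prime_flank = TEMPLATE_42MER[:start]
three_prime_flank = TEMPLATE_42MER[start + len(ADDUCT_11MER):]

# G6* is the sixth base of the 11-mer; report its 1-based position in the 42-mer.
adduct_position = start + 6

if __name__ == "__main__":
    print("5' flank:", five_prime_flank, len(five_prime_flank))
    print("3' flank:", three_prime_flank, len(three_prime_flank))
    print("adduct position in 42-mer (1-based):", adduct_position)
```

Under these assumptions the flank lengths match the 12- and 19-mers named in the text, and the adducted guanine falls at position 18 of the 42-mer.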
Primer Extension Assays-The 21-mer primers, which extend up to 1 base before the adduct, were 5′-end labeled using 1 unit of T4 PNK (New England BioLabs, Beverly, MA) and 0.01 mCi of [γ-32P]ATP (3,000 Ci/mmol) (PerkinElmer Life Sciences, Wellesley, MA) in 70 mM Tris-HCl (pH 7.6), 10 mM MgCl2, and 5 mM DTT. The 10-µl reactions were incubated for 1 h at 37°C, the volume was brought up to 20 µl with water, and the solutions were passed through Micro Bio-Spin P6 Chromatography Columns to remove unincorporated radioactivity. The labeled primer and unlabeled 42-mer DNA templates were annealed in a 1:1.25 ratio by heating to 80°C and cooling slowly to 4°C. In the standing-start assays, the Vmax for insertion of each nucleotide by Dpo4 was determined with only one dNTP at a time. The 40-µl reaction contained 40 mM Tris-HCl (pH 8.0), 5 mM MgCl2, 10 mM DTT, 250 µg/ml bovine serum albumin, 2.5% glycerol, 2.5 nM 32P-primer-template, and 10 nM Dpo4. The concentration of dNTP and reaction times were dependent on the saturation conditions for each dNTP. The dNTP concentrations and reaction times utilized are given in Table S1 under Supplementary Materials. Reactions were incubated at 37°C, and 10-µl aliquots were removed and stopped with 10 µl of stop solution (80% formamide, 5 mM Tris-boric acid, 1 mM EDTA, 0.1% xylene cyanol, 0.1% bromphenol blue). The reaction-stop mixtures were heated to 90°C for 10 min prior to analysis on a 20% polyacrylamide gel containing 8 M urea. Gels were dried and exposed to a phosphorimager plate prior to data analysis with ImageQuant software (Amersham Biosciences). All statistics reported for reactions (Fig. 2c and Table S1) are the averages and standard deviations of at least three independent trials.
Running-start primer extension assays were also carried out (results not shown) in a similar fashion, but utilizing an 18-mer primer that extended up to four bases before the adduct. The labeled primer and unlabeled 42-mer DNA templates were annealed in a 1:1.25 ratio by heating to 80°C and cooling slowly to 4°C. Running-start primer extension assays with Dpo4 were carried out in a 40-µl final volume, containing 40 mM Tris-HCl (pH 8.0), 5 mM MgCl2, 10 mM DTT, 250 µg/ml bovine serum albumin, 2.5% glycerol, 2.5 nM 32P-primer-template, 2 mM dNTPs, and 10 nM Dpo4. Running-start primer extension assays using T7 DNA polymerase were also carried out in a 40-µl volume, containing 40 mM Tris-HCl (pH 7.5), 20 mM MgCl2, 50 mM NaCl, 2000 µM dNTPs (Promega Corp., Inc., Madison, WI), 2.5 nM 32P-primer-template, and 10 nM T7 DNA polymerase. Reactions were incubated at 37°C, and 10-µl aliquots were removed and stopped with 10 µl of stop solution (80% formamide, 5 mM Tris-boric acid, 1 mM EDTA, 0.1% xylene cyanol, 0.1% bromphenol blue) after 15, 30, and 60 min. Control reactions were carried out in an analogous fashion for each polymerase with undamaged DNA template for 60 min, except that 125 µM dNTPs was used. The reaction-stop mixtures were heated to 90°C for 10 min prior to analysis on a 20% polyacrylamide gel containing 8 M urea. Gels were dried and exposed to a phosphorimager plate prior to data analysis with ImageQuant software (Amersham Biosciences).

Molecular Modeling and Molecular Dynamics Simulations of Dpo4 Ternary Complexes Containing the (+)-trans-anti-[BP]-N2-dG Adduct-Parameters for the (+)-ta[BP]G, dNTP, and Mg2+ residues developed previously were employed in the present work (72-74). The 1.7-Å resolution type I crystal structure of the Dpo4 ternary complex (2) was used as the starting structure for molecular models (PDB ID: 1JX4), and the coordinates were obtained from the Protein Data Bank (92). The Builder module of InsightII (Accelrys, Inc., San Diego, CA) was used for molecular modeling, including remodeling the sequence in the crystal structure to correspond with that used in the primer extension assays. The catalytic Mg2+ ion in the active site was unresolved in the crystal structure, likely because of the lack of a hydroxyl group on the 3′-terminal base of the primer, and it was modeled in using its predicted coordination site (2). The (+)-ta[BP]G residue was modeled in both anti and syn conformations, and each anti dNTP was modeled opposite the adduct; an unmodified control simulation, with an undamaged anti guanine template and dCTP partner, was also created in parallel. In addition, systems were constructed where the anti (+)-ta[BP]G was modeled opposite syn dATP and syn dGTP, since a Dpo4 structure with a damaged template opposite a syn incoming purine has been previously observed in a crystal structure (75). Therefore, a total of eleven simulations were carried out. In order to create the damaged starting structures in which (+)-ta[BP]G was anti, the BP moiety was docked to the templating guanine and modeled into a position with the least steric crowding between the damaged base and the protein and DNA backbone, as evaluated in InsightII with the bump check utility. The starting torsion angles for the anti (+)-ta[BP]G residue (see Fig. 1a) were: χ, 236°; α′, 109°; β′, 248°. The syn (+)-ta[BP]G systems were created in a similar fashion, and the starting torsion angles were: χ, 71°; α′, 321°; β′, 230°. The modeled combinations of χ, α′, and β′ were within the sterically favored regions for the (+)-ta[BP]G adduct (93). The anti dNTPs had a starting χ torsion of 222°, while the syn dATP and dGTP had a starting χ torsion of 42°. Bases that were dideoxynucleotides in the crystal structure were remodeled as deoxynucleotides by adding hydroxyl groups to the 3′ positions, and hydrogens were added to all residues using the LEaP module of AMBER 6.0 (94).
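The count of eleven simulations follows directly from the combinations described above. The short Python sketch below enumerates them; the labels and the function name `build_systems` are illustrative only and are not taken from the authors' workflow.

```python
from itertools import product

DNTPS = ["dCTP", "dATP", "dGTP", "dTTP"]


def build_systems():
    """Enumerate (template, incoming nucleotide) pairs described in the text."""
    systems = []
    # Adduct in the anti or syn conformation opposite each of the four anti dNTPs.
    for adduct_conf, dntp in product(["anti", "syn"], DNTPS):
        systems.append((f"{adduct_conf} (+)-ta[BP]G", f"anti {dntp}"))
    # Anti adduct opposite the two purine dNTPs in the syn conformation.
    for purine in ["dATP", "dGTP"]:
        systems.append(("anti (+)-ta[BP]G", f"syn {purine}"))
    # Unmodified control: undamaged anti guanine opposite dCTP.
    systems.append(("anti G (unmodified)", "anti dCTP"))
    return systems


if __name__ == "__main__":
    for template, incoming in build_systems():
        print(f"{template} : {incoming}")
    print("total systems:", len(build_systems()))
```

Eight adduct:anti dNTP systems, two anti adduct:syn purine systems, and one control account for the eleven simulations.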
All added hydrogen atoms were minimized for 250 steps of steepest descent (SD) using the Hingerty distance-dependent dielectric function (95), as implemented in our version of AMBER 5.0. The entire structure was then minimized for 100 steps of SD using the Hingerty distance-dependent dielectric function (95) in order to alleviate steric clashes, and the protein was then minimized for 1000 steps of SD while holding the DNA, Mg2+, and incoming dNTP fixed with 10 kcal/mol·Å harmonic restraints. The entire system was minimized for 250 steps of SD and 250 steps of conjugate gradient (CG) minimization, again using the Hingerty distance-dependent dielectric function (95). Each system was then oriented optimally for solvation in a rectangular periodic box using the Simulaid program (96), and 17 Na+ ions were added using LEaP at the positions of minimum electrostatic potential. LEaP was then used to solvate the neutralized systems in TIP3P (97) waters with a buffer of 10 Å, which maintains all atoms of the solute at least 10 Å away from the edge of the periodic box. The counterions and solvating waters were minimized for 5000 steps of SD while holding the solute fixed with 10 kcal/mol·Å restraints using the SANDER module of AMBER 6.0. A total of 200 ps of equilibration were carried out for each system in order to relax the waters and Na+, as well as achieve a realistic density of 1 g/cm3. All simulations were carried out at a constant pressure of 1 atm and constant temperature, unless otherwise indicated. Simulations were carried out using AMBER 6.0 (94), the Cornell et al. (98) force field, and the parm99.dat parameter set (99). The particle mesh Ewald method (100) was used to approximate the important Coulombic interactions, and the SHAKE option was used to restrain bond lengths that involved hydrogens, with a tolerance of 0.0005 Å.
A 2-fs time-step was utilized, non-bonded interactions were updated every time-step, and a 9-Å cut-off was applied to Lennard-Jones interactions. Periodic boundary conditions were applied throughout the simulation, and the rotational motion was removed every 200-300 ps. Translational motion of the center of mass in each simulation was removed approximately every 500 ps, but removal of global rotation was not carried out since it is technically uncertain in a periodic system. Trajectory visualization revealed that there was no excessive or abnormal rotational or translational motion of the solutes, indicating that energy leakage from internal motion to the global rotation was not occurring. This shows that the "flying ice cube effect" was not present in our simulations (101).
To begin equilibration, 30 ps of molecular dynamics (MD) was carried out at 10 K with 25 kcal/mol·Å restraints on the solute. The systems were then heated from 10 to 310 K over 25 ps, again holding the solute fixed with 25 kcal/mol·Å restraints, and the system was held at 310 K for 60 ps to mix the waters and ions. The remainder of the simulations was carried out at 310 K. The restraints on the solute were slowly released over 85 ps: 10 ps with 10 kcal/mol·Å restraints; 30 ps with 1.0 kcal/mol·Å restraints; and 45 ps with 0.1 kcal/mol·Å restraints. Following equilibration, unrestrained MD was carried out for 2.5 ns.
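The equilibration protocol can be tallied stage by stage. The Python sketch below uses only the durations, time-step, and save interval quoted in the text; the stage labels and variable names are illustrative. It confirms that the listed stages add up to the 200 ps of equilibration stated earlier, with the three restraint-release stages contributing 85 ps, and it derives the number of integration steps and saved frames for the production run.

```python
# (stage description, duration in ps), in the order given in the text.
EQUILIBRATION_STAGES = [
    ("10 K, solute restrained at 25 kcal/mol·Å", 30),
    ("heating 10 K -> 310 K, solute restrained at 25 kcal/mol·Å", 25),
    ("310 K hold to mix waters and ions", 60),
    ("restraint release, 10 kcal/mol·Å", 10),
    ("restraint release, 1.0 kcal/mol·Å", 30),
    ("restraint release, 0.1 kcal/mol·Å", 45),
]

TIME_STEP_FS = 2          # integration time-step
PRODUCTION_NS = 2.5       # unrestrained MD after equilibration
SAVE_INTERVAL_PS = 1      # coordinates saved every 1 ps

total_equilibration_ps = sum(ps for _, ps in EQUILIBRATION_STAGES)
release_ps = sum(ps for name, ps in EQUILIBRATION_STAGES
                 if name.startswith("restraint release"))

# 1 ns = 1,000,000 fs and 1 ns = 1,000 ps.
production_steps = round(PRODUCTION_NS * 1_000_000 / TIME_STEP_FS)
production_frames = round(PRODUCTION_NS * 1_000 / SAVE_INTERVAL_PS)

if __name__ == "__main__":
    print("equilibration total (ps):", total_equilibration_ps)
    print("restraint release total (ps):", release_ps)
    print("production steps:", production_steps)
    print("saved frames:", production_frames)
```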
The coordinates of the simulated systems were saved every 1 ps for detailed analysis of the time-dependent structural features, and the CARNAL and PTRAJ modules of AMBER 6.0 were used for trajectory processing and analyses. Structural characteristics analyzed include: root mean square deviations (r.m.s.d.) of the system, the active site region of the system and individual residues, interatomic distances, hydrogen bonding occupancies, hydrogen bonding distances, and angles and torsion angles. The cut-off for hydrogen bonding heavy-atom to heavy-atom distance was 3.4 Å and that for the hydrogen bonding angle was 140°. Plots of r.m.s.d. demonstrating the stability of the simulations are shown in Supplementary Materials Fig. S1. Solvent-accessible surface area was calculated using the Connolly algorithm (102) with a probe diameter of 1.4 Å in InsightII (Accelrys, Inc.).
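The hydrogen bonding criteria above translate into a simple per-frame geometric test. The following Python sketch is not the CARNAL or PTRAJ implementation; it assumes the angle is measured at the hydrogen (donor-H···acceptor), which the text does not specify, and the function names and coordinates are hypothetical.

```python
import math

DIST_CUTOFF_A = 3.4     # heavy-atom to heavy-atom distance cut-off (Å)
ANGLE_CUTOFF_DEG = 140  # hydrogen bonding angle cut-off (degrees)


def angle_deg(a, vertex, c):
    """Angle a-vertex-c in degrees for three 3D points."""
    v1 = [a[i] - vertex[i] for i in range(3)]
    v2 = [c[i] - vertex[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.hypot(*v1) * math.hypot(*v2)
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))


def is_hbond(donor, hydrogen, acceptor):
    """Apply both cut-offs; the angle is taken at the hydrogen (an assumption)."""
    close_enough = math.dist(donor, acceptor) <= DIST_CUTOFF_A
    linear_enough = angle_deg(donor, hydrogen, acceptor) >= ANGLE_CUTOFF_DEG
    return close_enough and linear_enough


def occupancy_percent(frames):
    """Percentage of frames, each a (donor, hydrogen, acceptor) tuple, with an H-bond."""
    hits = sum(1 for d, h, a in frames if is_hbond(d, h, a))
    return 100.0 * hits / len(frames)
```

With one frame saved per picosecond, an occupancy computed this way is simply the fraction of the 2.5-ns trajectory in which both criteria are met.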

Primer Extension Assays Reveal Similar Vmax Values for Each Nucleotide Opposite (+)-trans-anti-[BP]-N2-dG by Dpo4-Running-start primer extension experiments (data not shown) indicate that primer extension catalyzed by Dpo4 is slowed one base prior to (+)-ta[BP]G on the template strand, but a nucleotide can be incorporated opposite the adduct, and the damage can be bypassed to a significant extent. In contrast, a replicative DNA polymerase, such as T7, produces few full-length extension products under the same experimental conditions.
Standing-start primer extension assays were carried out using a primer that extended up to the modified guanine, with the adduct (or unmodified guanine in the control reactions) as the next-to-be-replicated base, in order to determine the Vmax for incorporation of each nucleotide opposite (+)-ta[BP]G by Dpo4. The control standing-start primer extension experiments, as well as previously published work (1), showed that dC is overwhelmingly the nucleotide inserted opposite unmodified guanine. Examples of standing-start primer extension experiments at two temperatures (37 and 55°C) are shown in Fig. 2a. In these examples the dNTP concentration was 2 mM in each case. All four 2′-deoxynucleotide triphosphates are effectively incorporated opposite the adduct, especially at 55°C. However, the scenario modeled in our structural studies involves full occupancy of the enzyme active site by the incoming dNTP, which represents conditions of substrate saturation. This is relevant to Vmax, the rate of incorporation in a ternary complex that has a dNTP at its active site. Accordingly, corresponding Vmax values for the insertion of each nucleotide opposite the damaged guanine were obtained experimentally in order to elucidate the biochemical behavior of the enzyme under saturating conditions. We determined Vmax by measuring the rates of incorporation of the different dNTPs at nucleotide concentrations high enough that further increases caused no further change in the rate of incorporation. Interestingly, Dpo4 is highly promiscuous in standing-start primer extension assays opposite (+)-ta[BP]G. Fig. 2b shows the Vmax values and standard deviations for the incorporation of each nucleotide opposite the damaged base. All four nucleotides are incorporated opposite (+)-ta[BP]G with comparable Vmax values.
This indicates that, once the active site of Dpo4 is occupied by one of the incoming nucleotides opposite the damaged guanine, they are all incorporated to a similar extent. The Vmax values determined in this study for the damaged DNA were approximately one order of magnitude lower than those for unmodified DNA (1).

The Pyrenyl Moiety of (+)-trans-anti-[BP]-N2-dG Can Be Accommodated on the Minor Groove Side of the Nascent Base Pair, with the Adduct in the anti Conformation within the Dpo4 Active Site-Unlike high fidelity DNA polymerases (15,79,80,87,88,104), Dpo4 does not forge critical interactions with the minor groove side of the primer-template DNA (2). In fact, the crystal structure of the Dpo4 ternary complex reveals that the protein has a cleft on the minor groove side of the primer-template and nascent base pair that exposes the DNA to solvent on this edge. Because of the presence of this open cleft on the minor groove side, we modeled the (+)-ta[BP]G adduct into the active site of Dpo4 in the anti conformation, with the bulky BP moiety on the minor groove side of the nascent base pair. We also included each of the four nucleotides to create the anti (+)-ta[BP]G:dCTP, anti (+)-ta[BP]G:dATP, anti (+)-ta[BP]G:dGTP, and anti (+)-ta[BP]G:dTTP structures. In addition, two simulations were carried out with the damaged base in the anti conformation and a purine incoming nucleotide in the syn conformation: anti (+)-ta[BP]G:syn dATP and anti (+)-ta[BP]G:syn dGTP. These structures were created because an anti damaged base was observed opposite a syn incoming purine nucleotide in a previously solved crystal structure (75). The simulated structures, after potential energy minimization, equilibration, and 2.5 ns of unrestrained molecular dynamics, are shown in Fig. 3, a and b. Figs. S1 and S2 in the Supplementary Material show the time-dependent analyses of the r.m.s.d. and torsion angles governing the orientation of the damaged guanine and carcinogen moiety, respectively, in the simulations during the 2.5 ns of unrestrained molecular dynamics.
The anti (+)-ta[BP]G forms stable hydrogen bonds with anti dCTP, dATP, and dTTP during their respective simulations, as demonstrated in Fig. S3b. In addition, the nascent base pairs in these simulations remain coplanar during the simulation, as exemplified in Fig. 3a. In contrast, hydrogen bonds between anti (+)-ta[BP]G and partner anti dGTP do not form because the identical hydrogen bonding edges of each base repel each other; this repulsion forces the dGTP base moiety out of the plane of the damaged guanine template, as shown in Fig. 3a. Table I lists the solvent-exposed surface area (SSA) of the template and incoming bases in each simulation, and the (+)-ta[BP]G residue in the anti (+)-ta[BP]G:dGTP simulation has the highest SSA of the anti (+)-ta[BP]G systems. This indicates that the anti (+)-ta[BP]G:dGTP pair is the most solvent-exposed nascent base pair. This probably results from the unfavorable interactions between the hydrogen bonding faces of the two guanine residues in the anti conformation. In lieu of interaction with the anti (+)-ta[BP]G template base, the incoming dGTP participates in hydrogen bonding interactions with the dC5 residue on the 3′ side of the modified guanine (see Fig. 1c), forming a pseudo-slipped structure, as shown in Fig. S3b. The dC5 base interacts both with the skewed dGTP base and its normal partner G30. Misaligned bases within the active site of Dpo4 have been observed previously in x-ray crystal structures (2,77), and this enzyme is capable of misincorporating bases via a mechanism that is mediated by misaligned incoming nucleotides (18,103). Therefore, this type of slipped structure within Dpo4 may be adequate for successful nucleotide incorporation. Such a slipped structure could persist through elongation, resulting in a frameshift mutation. Alternatively, this slipped structure could realign and be extended by the polymerase, resulting in a point mutation.
Although the (+)-ta[BP]G adduct can be accommodated by Dpo4 in the anti conformation, placing the bulky BP moiety in the minor groove results in crowding between the pyrenyl rings and the protein and primer-template DNA. This wedging of the BP moiety into the minor groove cleft causes the little finger domain (2) to move away from the fingers and palm domains, as shown by the distance between these domains (Fig. S4 in Supplementary Materials). For example, the Lys-78 Cα (palm) to Lys-275 Cα (little finger) distance was 12.3 ± 0.7 Å and 16.5 ± 0.8 Å in the unmodified and anti (+)-ta[BP]G:dCTP simulations, respectively. The BP moiety in the minor groove is positioned between protein residues and the phosphodiester backbone of the template DNA strand. This causes the template phosphodiester backbone to shift toward the major groove compared with the unmodified system when the two systems are aligned via the palm, fingers, and thumb domains, as shown in Fig. 4a. If the BP moiety were present in conjunction with the DNA in the conformation adopted by the unmodified system, it would clash with the +1 template base, as well as Lys-78 and Met-76. The shift in position of the template backbone caused by the BP moiety is likely the cause of the opening of the little finger, which is not observed in the simulation with the unmodified DNA.
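An interdomain distance reported as a mean ± standard deviation, such as the Lys-78 Cα to Lys-275 Cα values above, can be obtained from per-frame coordinates as in the minimal Python sketch below. This is an illustration rather than the authors' analysis script; the function names are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was used.

```python
import math
import statistics


def distance_series(coords_a, coords_b):
    """Per-frame distance (Å) between two atoms, given lists of (x, y, z) tuples."""
    return [math.dist(a, b) for a, b in zip(coords_a, coords_b)]


def mean_and_sd(values):
    """Mean and sample standard deviation of a distance time series."""
    return statistics.mean(values), statistics.stdev(values)


if __name__ == "__main__":
    # Hypothetical three-frame example; real input would be one coordinate
    # pair per saved frame of the trajectory.
    atom_a = [(0.0, 0.0, 0.0)] * 3
    atom_b = [(12.0, 0.0, 0.0), (13.0, 0.0, 0.0), (14.0, 0.0, 0.0)]
    mean, sd = mean_and_sd(distance_series(atom_a, atom_b))
    print(f"{mean:.1f} ± {sd:.1f} Å")
```

Comparing the two reported means gives an opening of roughly 4 Å between the palm and little finger in the adduct-containing simulation relative to the unmodified control.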
In addition to widening the minor groove cleft in the Dpo4 protein, the presence of the bulky BP moiety on the minor groove side of the nascent base pair causes rearrangement of several residues near the active site to allow for the accommodation of the carcinogen moiety, as shown in Fig. 4b. Most notably, Lys-78, which contacts the active site region DNA on its minor groove side in the unmodified system, is rotated ~180° in the anti (+)-ta[BP]G:dCTP simulation to allow room for the BP moiety. Due to the rotation and motion of Lys-78, Glu-79 is also significantly moved in the anti (+)-ta[BP]G:dCTP system compared with its position in the unmodified structure. In addition, Met-76 contacts the nascent base pair on its minor groove side in the unmodified system and moves toward the 5′-side of the modified base in order to allow the BP moiety to reside in the minor groove. It is likely that the reorganization caused by the presence of the BP moiety on the minor groove side of the nascent base pair, including the opening of the little finger and rearrangement of active site region residues, results in the poor bypass of (+)-ta[BP]G by Dpo4 in vitro. However, the ability of Dpo4 to accommodate the bulky BP moiety in the minor groove without a major distortion of the protein likely plays an important role in the increased ability of this enzyme to bypass (+)-ta[BP]G in contrast to the minimal bypass by replicative T7 DNA polymerase. It also helps explain the ability of Dpo4 to incorporate cytosine opposite the bulky lesion, albeit to a reduced extent as compared with that opposite unmodified guanine.

Incoming Purine Nucleotide Triphosphates in the syn Conformation Can Be Accommodated Opposite anti (+)-trans-anti-[BP]-N2-dG Adduct by Dpo4-Recent work by Ling et al. (75) has shown that Dpo4 can bypass a cyclobutane pyrimidine dimer (CPD) and can stably accommodate a syn ddATP opposite the 5′-thymine base moiety of the lesion. Similarly, our simulations reveal that a syn purine nucleotide can be accommodated opposite the (+)-ta[BP]G adduct in the anti conformation, albeit with a slight increase in the distance between the palm/fingers domains and little finger, as also observed for the anti incoming dNTPs. In the anti (+)-ta[BP]G:syn dATP simulation, a stable hydrogen bond forms between the nascent base pair during the simulation (see Fig. S3c) and the nascent base pair remains coplanar. While dGTP in the syn conformation does not participate in significant hydrogen bonds with anti (+)-ta[BP]G during the simulation, the nascent base pair remains coplanar. Instead, the syn dGTP residue hydrogen bonds with the dC5 residue on the 3′-side of the damaged guanine; the N7 and O6 atoms of the incoming dGTP are hydrogen bonded to the N4 atom of the dC residue for 53 and 31% of the simulation, respectively. This hydrogen bonding leaves the dG 3′-terminal base of the primer without a partner, but this base remains stably stacked within the duplex region, as shown in Fig. 3b. This structure is similar in nature to that seen in the anti (+)-ta[BP]G:dGTP structure, and could be viable for nucleotide incorporation due to the ability of Dpo4 to accommodate slipped structures (2,18,77,103). The ability of Dpo4 to accommodate incoming dATP and dGTP nucleotides in the syn conformation opposite anti (+)-ta[BP]G within its active site suggests another option for nucleotide incorporation opposite this bulky lesion.
During replication, it is likely that the (+)-ta[BP]G adduct is most often encountered by the polymerase in the normal anti conformation; this may explain the proclivity of this adduct to impede replicative DNA polymerases (47, 56-58, 73), as well as constituting a partial block to Dpo4. Blocking of replicative DNA polymerases by the anti conformation of (+)-ta[BP]G has been discussed previously (72)(73)(74), and the present work shows that the presence of a BP moiety on the minor groove side of the nascent base pair within Dpo4 causes the little finger domain to open up and to move further from the palm and fingers domains. While this structural disruption by the BP moiety in the minor groove may explain the inability of Dpo4 to easily bypass (+)-ta[BP]G, the ability of the enzyme to accommodate the adduct at all in the anti conformation may explain its ability to incorporate dC opposite the adduct. In contrast, replicative DNA polymerases can only accommodate (+)-ta[BP]G in the syn conformation without disrupting the enzyme (72)(73)(74). Since dCTP is accommodated most poorly opposite the syn damaged guanine within the more sterically restricted active sites of these high fidelity enzymes, it is most rarely incorporated opposite (+)-ta[BP]G.
The Dpo4 enzyme has a large open pocket on the major groove side of the primer-template and nascent base pair, reminiscent of that seen in high fidelity DNA polymerases (72,79,80,83,86), and the bulky BP moiety is easily accommodated within this region when the template (+)-ta[BP]G adopts the syn conformation. Therefore, the (+)-ta[BP]G was also modeled in the syn conformation, with the BP moiety in the major groove, opposite each of the four nucleotides, to create the syn (+)-ta[BP]G:dCTP, syn (+)-ta[BP]G:dATP, syn (+)-ta[BP]G:dGTP and syn (+)-ta[BP]G:dTTP structures. In contrast to the systems in which the adduct adopted the anti conformation, placing the BP moiety in the major groove, as in the syn (+)-ta[BP]G:dNTP systems, does not cause the little finger domain to open up, as seen in Fig. 4c and Fig. S4d. Also, the residues located on the minor groove side of the primer-template of the syn (+)-ta[BP]G structures are in positions very similar to those in the unmodified system, since this region is not disrupted by the presence of the bulky carcinogen moiety. The major groove side of the nascent base pair provides a more spacious cavity in which the BP moiety can reside without nearly as significant a disruption of the Dpo4 structure. Fig. 4c shows a comparison of the cavity size on the minor and major groove sides of the primer-template/nascent base pair in the Dpo4 enzyme.
Although the BP moiety has more room in the major groove, certain residues in the active site region still undergo modest rearrangement as compared with their positions in the unmodified DNA-enzyme complex in order to accommodate the pyrenyl moiety. Primarily, Leu-293 is shifted toward the 3′ side of the template strand in order to avoid steric crowding with the saturated ring of the BP moiety, and Arg-332 is reoriented in the syn (+)-ta[BP]G simulations to hydrogen bond with the phosphodiester backbone of the modified guanine residue.
When (+)-ta[BP]G adopts the syn conformation, it hydrogen bonds to both dCTP and dATP within the active site of Dpo4 during their respective simulations, as shown in Fig. S3d. One stable hydrogen bond forms between the nascent base pair in the syn (+)-ta[BP]G:dATP simulation; the dATP incoming base is hydrogen bonded to N7 of the template through its N6 atom for 81% of the simulation. The dCTP residue is hydrogen bonded to the syn (+)-ta[BP]G for 64% of the simulation; during the last ~250 ps, this interaction is disrupted through a motion of the dCTP base moiety toward the major groove side of the nascent base pair. Prior to this shift, the dCTP base is also involved in a hydrogen bond with Lys-78 for 51% of the simulation.
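The hydrogen bond percentages quoted above are occupancies: the fraction of trajectory frames in which a geometric hydrogen bond criterion is satisfied. The following is a minimal sketch of that arithmetic, assuming per-frame donor-acceptor distances and donor-hydrogen-acceptor angles have already been extracted from a trajectory. The function name `hbond_occupancy` and the 3.3 Å and 135° cutoffs are illustrative assumptions and are not necessarily the criteria used in this work.

```python
def hbond_occupancy(distances, angles, max_dist=3.3, min_angle=135.0):
    """Return the percentage of frames in which a hydrogen bond is present.

    distances: donor-acceptor heavy-atom distances per frame (angstroms)
    angles:    donor-hydrogen-acceptor angles per frame (degrees)
    max_dist, min_angle: assumed geometric cutoffs (hypothetical defaults)
    """
    if len(distances) != len(angles):
        raise ValueError("distances and angles must have one value per frame")
    if not distances:
        return 0.0
    # A frame counts as hydrogen bonded only if both cutoffs are met.
    bonded = sum(
        1 for d, a in zip(distances, angles)
        if d <= max_dist and a >= min_angle
    )
    return 100.0 * bonded / len(distances)


# Toy five-frame example (made-up numbers, not simulation data).
toy_distances = [2.9, 3.0, 3.6, 3.1, 4.0]
toy_angles = [160.0, 150.0, 170.0, 120.0, 140.0]
toy_percent = hbond_occupancy(toy_distances, toy_angles)
```

In the toy input only the first two frames satisfy both cutoffs, so the occupancy is two frames out of five. Tightening or loosening the assumed cutoffs changes the reported percentage, which is why occupancies are comparable only when the same criterion is applied to every simulation.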
The (+)-ta[BP]G:dGTP nascent base pair fares better when the modified guanine adopts the syn conformation as compared with its anti counterpart. The oppositely oriented hydrogen bonding faces juxtapose complementary substituents, facilitating 2-3 hydrogen bonds between the syn (+)-ta[BP]G and the anti dGTP during the simulation. In contrast, dTTP does not hydrogen bond with the template base during the simulation, but instead interacts with protein residues. The dTTP residue shifts toward the minor groove side of the active site pocket during the simulation and hydrogen bonds with Lys-78 N and Tyr-10 O for 73 and 97% of the simulation, respectively. The position of the dTTP in the syn (+)-ta[BP]G:dTTP simulation is stabilized through these hydrogen bonds on the minor groove side of the nascent base pair, as well as hydrophobic interactions of its methyl group with Ala-44 and Gly-58 on its 3′-side, as shown in Fig. 5. Despite the failure of dTTP to hydrogen bond with the modified base, the nascent base pair remains coplanar during the simulation and the SSA of the nascent base pair does not increase as compared with the other syn (+)-ta[BP]G systems (Table I). Regardless of the specific partner dNTP, availability of the syn domain of the modified guanine within the active site of Dpo4 provides another avenue through which nucleotide incorporation can occur without major disruption of the protein structure.
Summary and Conclusions-The Dpo4 enzyme is a good model system for elucidating structure-function relationships because of the availability of both high resolution crystal structures of its ternary complex with DNA for structural studies (2) and purified enzyme for in vitro functional assays (1). While running-start primer extension assays showed that Dpo4 exhibits a slowed incorporation of dNTPs opposite (+)-ta[BP]G, full bypass of the damaged base is observed as well. Standing-start primer extension assays showed that the enzyme is promiscuous opposite the adduct, inserting all four bases with comparable Vmax values (Fig. 2).
Molecular modeling and molecular dynamics simulations of (+)-ta[BP]G opposite each of the four dNTPs reveal that the adduct can adopt either the anti or syn conformation within the Dpo4 active site. This finding highlights structural differences between Dpo4 and the more stringent replicative DNA polymerases, specifically T7 DNA polymerase, where (+)-ta[BP]G must adopt the syn conformation to be accommodated within the active site of the ternary complex without a major perturbation of the protein structure (72)(73)(74). In the anti conformation, (+)-ta[BP]G can be accommodated opposite each of the four nucleotides within the active site of Dpo4, including all four dNTPs in the anti conformation and the purine dNTPs in the syn conformation. This could explain the promiscuity of this enzyme opposite the damaged base. However, dGTP opposite anti (+)-ta[BP]G becomes distorted because of the abutting identical hydrogen bonding edges of the incoming dNTP and template base. In fact, both anti and syn dGTP opposite the anti adduct forgo hydrogen bonding with the template base in favor of that with dC5, on the 3′ side of the adduct. Despite the lack of hydrogen bonds with the templating base, dG is incorporated opposite the adduct with a Vmax comparable to that of the other three bases. It is likely that slipped structures are acceptable substrates for Dpo4 (2,18,77,103).
In addition, (+)-ta[BP]G can be accommodated in the syn conformation opposite all four anti dNTPs. Dpo4 can exploit the solvent-exposed regions on both the minor and major groove sides of the nascent base pair to accommodate the bulky carcinogen moiety of (+)-ta[BP]G, effectively giving the enzyme at least two options for accommodating the damaged base, in either the anti or syn conformation. It is possible that the syn conformation is less accessible, and predominance of the adduct anti conformation within the active site could result in the significant blockage of Dpo4 by (+)-ta[BP]G. When opposite dC at a primer-template junction, (+)-ta[BP]G adopts the anti conformation (45), suggesting that this conformation may be favored when the adduct has a partner base, or at least the non-mutagenic partner.
It is plausible that the more open and permissive nature of the Dpo4 active site facilitates accommodation of aberrant and/or damaged base pairs within the active site region, possibly with less-constrained alignments than in high fidelity replicative DNA polymerases. This effect may work in conjunction with the anti/syn structural opportunities within the active site to produce the nucleotide incorporation promiscuity observed in standing-start primer extension experiments with the Dpo4 enzyme. In fact, (+)-ta[BP]G has been observed in both anti and syn conformations at a primer-template junction in solution (45,46). Thus, a combination of structural factors in the Dpo4 enzyme, involving both the spacious active site and the open minor groove together, may provide the flexibility to admit the different incoming nucleotides with similar rates. Our current understanding of the structure-function relationship in these low fidelity Y-family DNA polymerases is just emerging, and studies such as ours are starting to provide the outline of structural features acceptable for nucleotide incorporation. Modeling studies have the advantage of permitting investigation of systems that are inaccessible experimentally because of their rarity, despite biological significance, and/or failure to crystallize.
As pointed out by an insightful reviewer, a number of hypotheses can be considered to account for the comparable incorporation rates of all four nucleotides opposite (+)-ta[BP]G. These include (1) a single orientation of (+)-ta[BP]G, for which all dNTPs have equal affinity; (2) different orientations of (+)-ta[BP]G, each with its preference for a certain dNTP or dNTPs; (3) a "non-instructional" conformation of (+)-ta[BP]G, allowing equivalent accommodation of each dNTP. Our results suggest multiple conformations of (+)-ta[BP]G can facilitate nucleotide incorporation within the spacious active site of Dpo4, and all four dNTPs can be accommodated opposite one or more of these conformations. In addition, we find that purine-incoming nucleotides can also be accommodated in more than one conformation (anti and syn) opposite the damaged guanine. Thus, through a number of combinations of template and incomer conformations, together with the inherent flexibility and roominess of the active site, our results suggest how Dpo4 can incorporate each of the four dNTPs opposite (+)-ta[BP]G at comparable rates in an instructional, template-directed manner.
Y-family DNA polymerases are structurally distinct from their replicative cousins and it has been proposed that they may be involved in bypassing DNA damage in vivo (27,78,103,105). Because of their possible involvement in damage processing, it is of great interest to understand how these polymerases handle DNA adducts, including bulky lesions such as those derived from BPDE. A recent crystal structure has revealed the accommodation of a BPDE-adenine adduct, linked to N6 of dA, within Dpo4. Two structures were resolved, including a proposed blocking structure with the BP moiety intercalated between adjacent base pairs, similar to that seen by high resolution NMR in duplex DNA (106), and a proposed bypass structure, with the BP moiety solvent-exposed in the major groove (76). The major groove structure showed that the enzyme can coerce the adduct into adopting a conformation that is disfavored in duplex DNA (106, 107) and on the nucleoside level (108).
It appears that the increased bypass ability and promiscuous nature of nucleotide incorporation opposite (+)-ta[BP]G by Dpo4 can be attributed to the open nature of the active site and the ability to accommodate bulky moieties in the minor groove, albeit to a lesser degree than in the major groove. It is possible that this structural theme is also echoed in other Y-family DNA polymerases, including a Dpo4 DinB family relative, pol κ, which has been shown to primarily incorporate dC opposite (+)-ta[BP]G (4,71). Perhaps pol κ possesses a larger pocket on the minor groove side of the nascent base pair that allows for a less disruptive accommodation of the bulky adduct in the anti conformation, thus facilitating a greater extent of translesion bypass, tailored to insert dC opposite the lesion. Elucidation of structure-function relationships involved in DNA adduct bypass by Y-family DNA polymerases may help lend insight into the mechanisms by which these adducts are processed in vivo, including mutagenic routes that may lead to the initiation and progression of carcinogenesis.