Substrate discrimination by formamidopyrimidine-DNA glycosylase: a mutational analysis.

Formamidopyrimidine-DNA glycosylase (Fpg) is a primary participant in the repair of 8-oxoguanine, an abundant oxidative DNA lesion. Although the structure of Fpg has been established, amino acid residues that define damage recognition have not been identified. We have combined molecular dynamics and bioinformatics approaches to address this issue. Site-specific mutagenesis coupled with enzyme kinetics was used to test our predictions. On the basis of molecular dynamics simulations, Lys-217 was predicted to interact with the O8 of extrahelical 8-oxoguanine accommodated in the binding pocket. Consistent with our computational studies, mutation of Lys-217 selectively reduced the ability of Fpg to excise 8-oxoguanine from DNA. Dihydrouracil, also a substrate for Fpg, served as a nonspecific control. Other residues involved in damage recognition (His-89, Arg-108, and Arg-109) were identified by combined conservation/structure analysis. Arg-108, which forms two hydrogen bonds with cytosine in Fpg-DNA, is a major determinant of opposite-base specificity. Mutation of this residue reduced excision of 8-oxoguanine from thermally unstable mispairs with guanine or thymine, while excision from the stable cytosine and adenine base pairs was less affected. Mutation of His-89 selectively diminished the rate of excision of 8-oxoguanine, whereas mutation of Arg-109 nearly abolished binding of Fpg to damaged DNA. Taken together, these results suggest that His-89 and Arg-109 form part of a reading head, a structural feature used by the enzyme to scan DNA for damage. His-89 and Lys-217 help determine the specificity of Fpg in recognizing the oxidatively damaged base, while Arg-108 provides specificity for bases positioned opposite the lesion.

Recently, we established the three-dimensional structures of E. coli Fpg (Eco-Fpg) and Nei cross-linked to DNA by sodium borohydride (32,33). The structures of uncomplexed Fpg from Thermus thermophilus (Tth-Fpg), as well as Fpg from Geobacillus stearothermophilus (Bst-Fpg) and Lactococcus lactis (Lla-Fpg) complexed with various DNA ligands, have been reported by others (34-36). The three-dimensional structures of Fpg and Nei are very similar but differ in several significant details, some of which may underlie the different substrate specificities of these enzymes. Although structural data provide a wealth of information on the interactions of Fpg and Nei with damaged DNA, a disordered loop present in both structures is generally positioned so that it might be involved in recognition of the damaged base. In this study, we explore the role of various amino acid residues, including ones from the disordered loop, in determining the substrate specificity of Fpg. To generate testable hypotheses regarding these interactions, we performed molecular dynamics simulations of an Fpg/DNA complex containing a damaged base. The availability of three-dimensional structures for similar enzymes with different substrate specificities also enabled us to employ a combined conservation/structure analysis (37) to predict amino acid residues responsible for substrate specificity. We subsequently used site-directed mutagenesis to test the predictions obtained by these in silico approaches, which allowed us to identify several residues responsible for damage recognition by Fpg.

MATERIALS AND METHODS
Oligonucleotides and Enzymes-Oligodeoxynucleotides for site-directed mutagenesis were purchased from Invitrogen. 8-Oxo-2′-deoxyguanosine phosphoramidite was prepared as described previously (38) and other phosphoramidites, including DHU phosphoramidite, were purchased from Glen Research. Modified oligodeoxynucleotides for kinetic studies were synthesized by standard solid-state phosphoramidite chemistry and purified by reverse-phase high-pressure liquid chromatography. The sequence used was 5′-CTC TCC CTT CXC TCC TTT CCT CT-3′, where X was 8-oxoG or DHU; the opposite strand was complementary except for containing A, C, G, or T opposite the modified base. The modified strand was labeled at the 5′-end using [γ-32P]ATP (Amersham Biosciences) and bacteriophage T4 polynucleotide kinase (New England Biolabs) following the manufacturer's protocol, and annealed to the complementary strand at a 1:1.3 ratio. Fpg was purified as described earlier (33). Fpg mutants were produced by site-directed mutagenesis of the expression plasmid pET-24b (Novagen) carrying the wild-type fpg coding sequence inserted at NdeI-HindIII restriction sites. Recombinant plasmids were maintained in XL1-Blue E. coli. To obtain mutated Fpg proteins, plasmids were transfected into BL21(DE3) E. coli and the protocol for purification of the wild-type enzyme was followed (33). No significant difference in chromatographic behavior was observed between wild-type Fpg and the mutants.
Determination of Kinetic and Dissociation Constants-Kinetic and dissociation constants for wild-type and mutant Fpg proteins were determined essentially as described (15). Concentrations of substrates and enzymes and the time of reaction were adjusted to ensure steady-state rate conditions for the experiment. Kinetic constants were determined by nonlinear regression fit to the hyperbolic equation using Hyper v1.1 software.
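The hyperbolic fit mentioned above can be illustrated with a minimal sketch. It assumes the standard Michaelis-Menten form v = Vmax·[S]/(Km + [S]) and is not a reproduction of the Hyper v1.1 software; the function names and the golden-section search over Km are illustrative choices, and the data in the usage note are synthetic.

```python
import math

def mm_rate(s, vmax, km):
    """Michaelis-Menten (hyperbolic) rate at substrate concentration s."""
    return vmax * s / (km + s)

def _best_vmax(km, s_vals, v_vals):
    # For a fixed Km the model is linear in Vmax, so the least-squares
    # Vmax has a closed form.
    x = [s / (km + s) for s in s_vals]
    return sum(v * xi for v, xi in zip(v_vals, x)) / sum(xi * xi for xi in x)

def _sse(km, s_vals, v_vals):
    # Sum of squared residuals at the best Vmax for this Km.
    vmax = _best_vmax(km, s_vals, v_vals)
    return sum((v - mm_rate(s, vmax, km)) ** 2 for s, v in zip(s_vals, v_vals))

def fit_michaelis_menten(s_vals, v_vals, km_lo=1e-6, km_hi=1e6, iters=200):
    """Nonlinear least-squares fit; returns (Vmax, Km).

    Assumption: the residual profile has a single minimum in log10(Km)
    between km_lo and km_hi, which is searched by golden-section steps.
    """
    phi = (math.sqrt(5) - 1) / 2
    a, b = math.log10(km_lo), math.log10(km_hi)
    c = b - phi * (b - a)
    d = a + phi * (b - a)
    for _ in range(iters):
        if _sse(10 ** c, s_vals, v_vals) < _sse(10 ** d, s_vals, v_vals):
            b = d
        else:
            a = c
        c = b - phi * (b - a)
        d = a + phi * (b - a)
    km = 10 ** ((a + b) / 2)
    return _best_vmax(km, s_vals, v_vals), km
```

Calling fit_michaelis_menten on noise-free rates generated with mm_rate should recover the generating Vmax and Km; kcat then follows from Vmax divided by the enzyme concentration, and the specificity constant from kcat/Km.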
Molecular Dynamics Simulation-Simulations were carried out using the SANDER module of the AMBER 5.0 molecular modeling and molecular dynamics (MD) software package (39), the Cornell et al. force field (40) and the parm99.dat parameter set (41). Non-bonded interactions were updated at each time step, a 9 Å cut-off was applied to Lennard-Jones interactions, periodic boundary conditions were applied, and MD simulations were carried out under constant pressure and temperature unless indicated otherwise. Coulombic interactions were approximated using the particle mesh Ewald method (42), bonds involving hydrogen atoms were constrained using the SHAKE option with a tolerance of 0.0005 Å, and a 2-fs time step was employed throughout the simulation. Coordinates of DNA covalently linked to Fpg protein, including a modeled loop (residues 217-224) that was not experimentally located in the crystal structure, were obtained as described (33). The N-terminal proline residue, which acts as the nucleophile in Fpg catalysis (28), was modeled as neutral to mimic the situation directly preceding the reaction. Residues not included in the AMBER force field were parameterized. Partial charges for the neutral N-terminal proline and 8-oxodG residues were obtained in the conformations used for the starting models. The conformation of Pro-1 in the crystal structure was used in the parameterization of this residue. For parameterization of 8-oxodG, three conformations that differed slightly in their χ angle were used to obtain partial charges. Hartree-Fock calculations, with the basis set 6-31G* in GAUSSIAN 98 (43), were used to calculate the partial charge of each atom in the parameterized residues. The RESP module of AMBER was used to fit each charge to its respective atomic center in the molecules (44) and charges were normalized to avoid charge imbalance in each residue (45).
The final partial charges and atom type assignments for the Pro-1 and 8-oxodG residues are given under Supplementary Materials. Bond and angle parameters not included in the original force field were assigned by analogy to chemically similar types in the AMBER data base parm99.dat parameter set and are listed in Supplementary Materials.
Starting structures were prepared using molecular modeling and potential energy minimization. The covalent bond linking Pro-1 and C1′ of the ring-opened deoxyribose, which comprises the reduced Schiff base (33), was broken and the deoxyribose moiety reconstituted. The modeled DNA sequence corresponded to that in the crystal structure, 5′-AGGAXGAAG-3′, where X denotes the lesion (Fig. 1B). An unmodified G or 8-oxoG was attached to the reconstituted sugar moiety with an initial glycosidic torsion angle of 294° (anti structure) or 110° (syn structure). Two additional simulations of 8-oxodG with initial glycosidic torsion angles of 317° and 255° were also carried out. These starting models were chosen for their apparent optimal steric and electrostatic/hydrogen bonding interactions between the damaged base and the substrate binding pocket. Three anti structures were analyzed because the O8 group of 8-oxoG was likely to interact with the modeled loop (residues 217-224), whose conformation was somewhat uncertain; visual inspection suggested three conformations of 8-oxodG with similar stabilizing interactions. However, in the syn conformation, damage-specific groups of 8-oxoG were positioned to optimally interact with protein residues well-defined in the crystal structure of E. coli Fpg; only one such region of the syn domain was located. Molecular modeling was carried out using the Builder module of InsightII (Accelrys).
Once the DNA was remodeled, 750 steps of steepest descent (SD) and 250 steps of conjugate gradient potential energy minimization were carried out using the SANDER module of AMBER 5.0. Following minimization, 425 crystallographic water molecules were added to each system, and the systems were minimized for 600 steps of SD using the Hingerty distance-dependent dielectric function (46), while holding the solute (protein and DNA) fixed using harmonic restraints of 0.1 kcal/mol. A total of 20 neutralizing Na+ counterions were then added to each system at positions of lowest electrostatic potential using the LEaP module of AMBER 5.0. The neutralized systems were minimized for 1000 steps of SD using the Hingerty distance-dependent dielectric function while holding the solute fixed with harmonic restraints of 10 kcal/mol. The systems were reoriented using the Simulaid program (47) and then solvated in a box of TIP3P waters using a buffer of 10 Å. The solvating waters, crystallographic waters and counterions were minimized for 5000 steps of SD while keeping the solute fixed with 10 kcal/mol restraints. MD was carried out to equilibrate the system: 30 ps at 10 K holding the solute fixed with 10 kcal/mol restraints, 25 ps heating the system from 10 to 300 K holding the solute fixed with 10 kcal/mol restraints, 45 ps at 300 K holding the solute fixed with 10 kcal/mol restraints, 20 ps at 300 K holding the solute fixed with 1 kcal/mol restraints, and 30 ps at 300 K holding the solute fixed with 0.1 kcal/mol restraints. Following 150 ps of equilibration, unrestrained production MD was carried out at 300 K for 1 ns.
FIG. 1. Structure of damaged DNA used in this work. Panel A, chemical structures of 8-oxoguanine and dihydrouracil. Arrow denotes rotation between anti and syn conformations; χ, glycosidic torsion angle. Panel B, DNA sequence utilized in MD simulations. The guanine highlighted in red was modeled as 8-oxoG in modified simulations and as G in the unmodified control.
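As a bookkeeping aid, the equilibration schedule can be tabulated and converted into integration steps. The sketch below uses only numbers stated in the Methods (stage lengths, restraint weights, the 2-fs time step, the 1-ns production run and the 500-fs sampling interval); the data layout and function names are illustrative and do not correspond to SANDER input.

```python
TIME_STEP_FS = 2.0        # integration time step stated in the Methods
SAVE_INTERVAL_FS = 500.0  # coordinates collected every 500 fs
PRODUCTION_PS = 1000.0    # 1 ns of unrestrained production MD

# (duration in ps, temperature regime, solute restraint in kcal/mol)
EQUILIBRATION = [
    (30.0, "10 K", 10.0),
    (25.0, "10 -> 300 K", 10.0),
    (45.0, "300 K", 10.0),
    (20.0, "300 K", 1.0),
    (30.0, "300 K", 0.1),
]

def md_steps(duration_ps, dt_fs=TIME_STEP_FS):
    """Number of integration steps needed to cover duration_ps."""
    return round(duration_ps * 1000.0 / dt_fs)

def saved_frames(duration_ps, interval_fs=SAVE_INTERVAL_FS):
    """Number of coordinate sets written over duration_ps."""
    return round(duration_ps * 1000.0 / interval_fs)

total_equil_ps = sum(duration for duration, _, _ in EQUILIBRATION)

if __name__ == "__main__":
    for duration, regime, restraint in EQUILIBRATION:
        print(f"{duration:5.0f} ps  {regime:12s} restraint {restraint:4.1f}"
              f"  -> {md_steps(duration)} steps")
    print("equilibration total:", total_equil_ps, "ps")
    print("production steps:", md_steps(PRODUCTION_PS))
    print("production frames:", saved_frames(PRODUCTION_PS))
```

The five stages sum to the 150 ps of equilibration quoted in the text, which serves as a quick consistency check on the protocol.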
The cysteine residues involved in chelating the Zn2+ retained their relative orientations (data not shown), although the Zn2+ diffused out of the protein, due to the absence of other Zn2+ ions in the simulation. In Figs. 2 and S1, Zn2+ was modeled into the chelating position using the Builder module of InsightII (Accelrys).
Coordinates, velocities, and energies for each simulated system were collected every 500 fs. The CARNAL and PTRAJ modules of AMBER were used for trajectory processing and analyses of the atomic fluctuations, critical torsion angles and distances, interactions between the solute and solvent, and the occupancy and quality of the hydrogen bonds between Fpg and its substrates. The hydrogen-bonding cutoff for the distance between heavy atoms was 3.4 Å and that for the hydrogen-bonding angle was 140°.
Conservation Analysis-The protein sequence data base of the National Center for Biotechnology Information, composed of nonredundant sequences deposited in GenBank™, RefSeq, PDB, Swiss-Prot, PIR and PRF databases, was searched by BLASTP (48). The sequences of E. coli Fpg and Nei were used as initial queries with the search restricted to prokaryotes (Bacteria and Archaea). Sequences possessing neither the N-terminal PE(L/G) Fpg/Nei signature motif nor the C-terminal zinc finger motif were discarded. All remaining sequences were aligned using ClustalW (49). A tree was constructed by the neighbor-joining method (50), and the complete sequences most distant from the query sequences were used as queries for the second round of the search. Sequences identified were classified as the Fpg subgroup, based on the presence of the N-terminal motif PE(L/I/M)PE (124 sequences); the remaining 23 sequences, which contained the N-terminal signature motif PEG, were considered outside the Fpg subgroup and in the Fpg/Nei family. To analyze the conservation of physicochemical properties of the Fpg residues, subsets of the sequences were randomly selected (except that E. coli K12 sequences were always present) to represent no more than one per family, order, class, or phylum, then re-aligned. Conservation in the aligned sequences was analyzed by the AMAS algorithm (51) with the threshold Cn = 9, 10% atypical residues allowed, no gaps ignored and cysteines considered reduced (37). The structure of E. coli Fpg covalently complexed with DNA (1K82 in the Protein Data Bank) (33) was used for mapping similar and dissimilar residues, as described (37).
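The geometric hydrogen-bond criterion can be written out explicitly. The following sketch is not the CARNAL or PTRAJ implementation; it assumes that the 140° cutoff applies to the donor-hydrogen-acceptor angle as a lower bound and that the 3.4 Å cutoff applies to the donor-acceptor heavy-atom distance as an upper bound. The function names and coordinate layout are hypothetical.

```python
import math

DIST_CUTOFF_A = 3.4       # heavy-atom (donor to acceptor) distance, in Å
ANGLE_CUTOFF_DEG = 140.0  # assumed to be the minimum donor-H-acceptor angle

def angle_deg(a, vertex, b):
    """Angle a-vertex-b in degrees for three (x, y, z) points."""
    v1 = [p - q for p, q in zip(a, vertex)]
    v2 = [p - q for p, q in zip(b, vertex)]
    dot = sum(p * q for p, q in zip(v1, v2))
    norm = math.hypot(*v1) * math.hypot(*v2)
    cos_t = max(-1.0, min(1.0, dot / norm))  # guard against rounding
    return math.degrees(math.acos(cos_t))

def is_hbond(donor, hydrogen, acceptor):
    """True if one frame satisfies both the distance and angle cutoffs."""
    close = math.dist(donor, acceptor) <= DIST_CUTOFF_A
    linear = angle_deg(donor, hydrogen, acceptor) >= ANGLE_CUTOFF_DEG
    return close and linear

def occupancy(frames):
    """Fraction of frames, given as (donor, hydrogen, acceptor) tuples,
    in which the hydrogen bond is present."""
    hits = sum(1 for d, h, a in frames if is_hbond(d, h, a))
    return hits / len(frames)
```

Applied to every saved frame of a trajectory, occupancy returns the kind of percentage quoted in the Results for individual protein-DNA contacts.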

MD Simulation of Substrate Recognition by Fpg-Four simulations of Fpg complexed with an oligodeoxynucleotide containing 8-oxoG and two simulations of the protein with unmodified DNA were carried out for 1 ns following 150-ps equilibration. Stereoviews of the entire protein-DNA complex for the anti and syn 8-oxodG simulations are shown in Fig. 2 (A and B); views for 8-oxodG simulations 2 and 3 together with anti and syn unmodified controls are provided as Supplementary Material (Fig. S1). The 8-oxodG residue was modeled starting from four different torsion angles, three in the anti conformation range and one in the syn conformation (see "Materials and Methods" for details regarding the choice of the starting conformations). Following potential energy minimization and molecular dynamics simulations, 8-oxodG was observed to fit into the active site pocket very well in either the anti or the syn conformation, assisted by a number of stabilizing van der Waals and electrostatic interactions. The unmodified dG residue also was modeled and simulated in both conformations.
Each system had a relatively low RMSD from the starting structure, demonstrating the stability of each simulation. Fig. S2 shows the two-dimensional RMSD analysis of all residues in each system from the starting structure over the 1-ns simulations. The average RMSD for protein and DNA residues, as well as the RMSD of the final structure from the starting structure, are also shown in Fig. S2. The stability of the active site region, including 28 residues within 5 Å of the target base and the base itself, also was monitored; two-dimensional plots are shown in Fig. S2. This analysis reveals that the modified base is very stable in the 8-oxodG simulations. The third metastable state in anti 8-oxodG simulation 3 coincides very closely with that of the simulation 1 stable state, exhibiting very similar conformations for the 8-oxodG residue and similar protein-DNA interactions; therefore, only simulation 1 will be discussed further for anti 8-oxodG unless otherwise specified. The time-dependent values of the substrate base glycosidic torsion are shown in Supplementary Material (Fig. S3); throughout the simulations, the anti and syn structures remained in their respective starting domains.
Modeling of 8-oxoG within the active site and subsequent MD simulations allow specific interactions between Fpg and DNA containing the 8-oxoG base to be observed. Fig. 2 (C and D) shows the active site of the Fpg/anti 8-oxodG (simulation 1) and Fpg/syn 8-oxodG complexes following 1 ns of unrestrained MD. In both 8-oxodG simulations, O8 of the damaged base appears to be recognized through a stable hydrogen bond to the Lys-217 Nζ, with lengths of 2.8 ± 0.1 Å and 3.1 ± 0.5 Å, respectively, in the anti and syn simulations. These interactions are illustrated by the time-dependent distance measurements and the angle between these groups (Fig. 3). Although this bond is absent from anti 8-oxodG simulation 2, a hydrogen bond was present in this case between the Met-73 main-chain carbonyl and N7 of 8-oxoG. The time-dependent length and angle of this hydrogen bond also are shown in Fig. 3. In the case of syn 8-oxodG, the hydrogen bond has a smaller AA-A-H angle than in the other simulations (AA is the atom attached to the acceptor of a hydrogen bond, A is the acceptor and H is hydrogen), but this orientation is within range of a normal hydrogen-bonding interaction (52, 53) throughout the simulation. Alternatively, it may belong to the class of NH-facial bonds, often observed in protein structures (54), or even represent a cation-π interaction (53).
N7 is a hydrogen bond acceptor in unmodified guanine, as is O8 in 8-oxoG, prompting us to check whether this group could interact with Lys-217 when undamaged guanine in either syn or anti conformation resides in the active site. Monitoring of the interaction between N7 and Lys-217 in the unmodified simulations revealed that the hydrogen bond formed between these groups does not persist. In the anti conformation, a hydrogen bond apparently was present but, for the final 300 ps of the simulation, the distance between Lys-217 Nζ and N7 of guanine was 5.6 ± 0.7 Å (Fig. 3), indicating that this interaction was broken and remained so for the rest of the simulation.
In addition to hydrogen bonding with O8 of 8-oxodG in both the anti and syn conformations, Lys-217 Nζ also participates in a hydrogen bond with O4′ of the damaged nucleotide in the syn conformation, as illustrated by the distance and angle analysis shown in Fig. 3. In the anti conformation, Lys-217 partakes in a water-mediated hydrogen bond with O4′ for ~25% of the simulation. These interactions are not present in either the anti or syn simulation with the unmodified nucleoside.
Our simulations suggest that Lys-217 plays an important role in hydrogen bonding to O8 and O4′ of the 8-oxodG residue; therefore, we chose this amino acid residue as the most promising candidate for site-directed mutagenesis to explore the substrate specificity of Fpg by enzyme kinetics.
Substrate specificity studies (15) have shown that the carbonyl group at C6 or its associated pyrrolic N1 is required for efficient excision of 8-oxoG by Fpg. Although both 8-oxoG and G possess a pyrrolic N1, the hydrogen bonding interactions of this group were found to differ in our simulations of 8-oxodG and dG in the anti conformation. In the anti dG simulation, N1 of guanine participates in a hydrogen bond with the Thr-214 main-chain carbonyl. This hydrogen bond was disrupted in anti 8-oxodG simulation 2 and persisted for only parts of anti 8-oxodG simulations 1 and 3, where it was later replaced by interactions with Glu-5. Fig. S4 in Supplementary Materials shows the time-dependent distance and angle of these interactions. Apparently, Glu-5 participates in electrostatic or van der Waals interactions, involving Oε1 and Oε2, with N1 of 8-oxoG during anti simulation 1. Also, a hydrogen bond developed between N1 of 8-oxoG and Oε2 of Glu-5, and later its Oε1, during anti simulation 3. The interaction between N1 of anti 8-oxodG and Glu-5 possibly contributes to the catalytic efficiency of Fpg; in fact, the E5Q mutation of this enzyme decreases glycosylase activity (55). The simulations of 8-oxodG and dG in the syn conformation also showed stable interactions with O6 of the base. In both simulations, Gln-221 Nε2 participated in a hydrogen bond with O6 for >85% of the simulation; this interaction did not differ significantly between the modified and unmodified simulations.
Structural (33,35,36) and biochemical (see below) data pinpoint Arg-108 as a critical residue in the discrimination of the base positioned opposite 8-oxoG. Our current modeling studies explored Fpg binding to the 8-oxoG:C base pair. In our simulations, stable hydrogen bonds between Arg-108 and the orphaned C were present throughout the 1-ns unrestrained production run. The distance and angle of these interactions over the simulations are shown in Fig. S6. In addition, the site-directed mutagenesis studies (see below) showed that His-89 contributes to substrate specificity. Our simulations also reveal that His-89 participates in hydrogen bonding interactions on the undamaged strand with T15 and C16, with these interactions being ~15 Å from the substrate base. Because these interactions occur at a remote site and are the same for 8-oxoG and G, it seems likely that the discriminatory effect of His-89 is exerted at an earlier stage of damage recognition, not modeled in our complex. The hydrogen bonding distances and angles between His-89 and T15 and C16 are shown in Fig. S6.
Conservation of Amino Acid Residues in the Fpg Family-Analysis of conservation of amino acid residues in protein families, combined with information regarding the three-dimensional structure of representative members of this family, provides a powerful tool for generating hypotheses regarding functions of various protein moieties (37). An especially useful situation occurs when a protein family may be divided into non-overlapping subfamilies with defined functional differences. In this case, residues conserved in one subfamily but not in others (i.e. dissimilar residues) are likely to be responsible for subgroup-specific function(s). In the Fpg family, which includes homologs of both Fpg and Nei, the underlying assumption was that dissimilar residues could be involved in determining substrate specificity in the Fpg and Nei subfamilies.
We improved upon our earlier analysis of the Fpg family (37) by using a formal search protocol instead of a predefined data base cluster (see "Materials and Methods" and compare with Ref. 37), and by using the structure of Eco-Fpg complexed with DNA, instead of free T. thermophilus Fpg, for this analysis. Our search uncovered 147 homologs of Fpg and Nei derived from 43 taxa of the family rank (31 orders, 16 classes, 7 phyla, 1 kingdom), showing a quite even distribution of Fpg in the bacterial kingdom. Interestingly, no Fpg homologs were found in Archaea.
The structure of E. coli Fpg covalently complexed with DNA, representing an intermediate reaction complex, reveals three amino acid residues (Met-73, Arg-108, and Phe-110) inserted in the double helix, thereby filling the void formed by eversion of the damaged base from the DNA helix (33). Of these residues, only Arg-108 forms two direct hydrogen bonds with the base opposite the lesion, where it is positioned to contribute to discrimination of C versus A by the enzyme. Arg-108 is highly conserved among known bacterial Fpg homologs but not in members of the Nei subfamily (Fig. 4A). The structures of Lla-Fpg and Bst-Fpg in a non-covalent complex with DNA reveal preservation of the void-filling function and hydrogen bonding with the opposite base by Arg-108 (35,36), suggesting that these contacts persist along the reaction coordinate and may be crucial elements for opposite base discrimination by Fpg. E. coli Nei employs different parts of the protein for void-filling and opposite-base recognition (32). In fact, Gln-69, which makes a contact with the orphaned base in E. coli Nei, is not conserved in the Nei subfamily (not shown).
Among the several other residues conserved in the Fpg subfamily, His-89 (Fig. 4A) is of special interest for this analysis as it (i) is highly conserved within the Fpg subfamily but not in the Nei subfamily and (ii) forms two hydrogen bonds to DNA. In the intermediate complex, His-89 contacts phosphates of the strand opposite the lesion, possibly contributing to early stages of lesion recognition as opposed to providing lesion-specific interactions in the Michaelis complex. We also analyzed the conservation of Lys-217, which was predicted by our molecular modeling analysis to be a major 8-oxoG recognition element in the active site pocket. Surprisingly, Lys-217 was not absolutely conserved; in addition to a conservative substitution with arginine, this position contained serine, glycine, histidine, alanine, aspartate, and glutamate. Nevertheless, in an alignment of 30 randomly chosen Fpg sequences belonging to species from 30 different orders, Lys or Arg occurred in 19 of these sequences at position 217. We therefore searched for residues co-conserved with Lys/Arg-217. We found that a basic residue at position 217 always occurred together with a basic residue (Lys or Arg) at position 109 (Arg-109 in E. coli), whereas only 5 of 11 nonbasic residues (including gaps) at position 217 occurred together with a basic residue at position 109.
Strikingly, in the three-dimensional structure of E. coli Fpg, Arg-109 is located near His-89. The two residues contact the undamaged DNA strand as part of a saddle-like structure positioned to impose a kinking force on DNA (Fig. 4, B and C). As the actual substrate specificity is unknown for most of the reported Fpg sequences, we reasoned that Arg-109 and Lys-217, despite being spatially separated (Fig. 4B), might form a single functional unit that also involves His-89. For these reasons, we mutated Arg-108 to investigate its role in opposite base recognition, and His-89 and Arg-109 to determine their possible roles in lesion recognition.
Kinetics of Fpg on Oxidized Purines and Pyrimidines-As a reference point for comparison of Fpg mutants, we determined the Michaelis, catalytic and dissociation constants for this enzyme acting on its preferred lesion, 8-oxoG, with each of the four canonical bases positioned opposite the modified base in the complementary strand. In another series of experiments, the lesion was represented by DHU. In contrast to 8-oxoG, the efficiency of DHU cleavage was similar for each of the four opposite bases (data not shown); thus, a full kinetic investigation was performed only for DHU:C. The data are summarized in Table I, Fig. 5 and Fig. 6. For the case of Fpg acting on 8-oxoG paired with different bases, our data confirm those reported by Tchou et al. (15). Although oxidation of guanine bases in native DNA results in formation of 8-oxoG:C pairs, which are presumed to be the physiological substrate for the enzyme, Fpg is actually more active on 8-oxoG:G and 8-oxoG:T pairs that it would encounter only rarely in a cell. Fpg was highly active against the DHU:C substrate, exceeding its activity on 8-oxoG:C. (Note: hereafter, activity comparisons refer to the specificity constant (ksp = kcat/Km), unless otherwise stated.) To our knowledge, this is the first kinetic study of Fpg activity on DHU, although excision of a related pyrimidine, dihydrothymine, from oligonucleotides by Fpg has been demonstrated (56).
His-89, Arg-109, and Lys-217 as Determinants of Damaged Base Specificity-To experimentally test predictions based on MD simulation and conservation analysis, we used site-directed mutagenesis to change His-89 to alanine (H89A), Arg-109 to alanine (R109A), and Lys-217 to threonine (K217T). The threonine mutant was chosen as it was easier to express in a soluble form and purify to homogeneity. We then tested the activities of these mutant proteins, and determined their kinetic parameters. We were interested particularly in whether the targeted mutations would differentially affect excision of 8-oxoG and DHU, the former representing the specific, and the latter, a nonspecific substrate for this enzyme (see "Discussion"). These data are summarized in Table I and Figs. 5 and 6.
The K217T mutation had a profound effect on the recognition of 8-oxoG:C by Fpg, decreasing the overall catalytic efficiency ~5-fold, primarily through a decrease in Km. Interestingly, the KD for this mutant was essentially unchanged. In contrast, little or no change in activity was observed when the K217T mutant was compared with wild-type Fpg for their ability to cleave a DHU-containing substrate. The H89A mutation decreased the activity of Fpg on 8-oxoG:C ~3-fold, with major contributions by both Km and KD toward this more modest, but still significant effect. Introduction of another mutation, R109A, in the DNA-bending head had dramatic consequences. This mutant displayed very low enzymatic activity against 8-oxoG:C, comparable with the activity of the wild-type enzyme on 8-oxoG:A. In Fpg R109A, ksp decreased 200-fold compared with the wild-type enzyme and 70-fold compared with the H89A mutant. There was a less-pronounced effect of the R109A mutation on the excision of DHU; ksp decreased 15-fold and 20-fold, compared with wild-type Fpg and H89A, respectively. For 8-oxoG, this effect was almost entirely due to increases in Km and was accompanied by an increase in KD, whereas in the case of DHU (Table I) Km (and KD), and to a lesser extent kcat, were affected. An even more dramatic effect was seen with the H89A/R109A double mutation (Table I). This combination had an additive rather than synergistic effect, suggesting that His-89 and Arg-109 may have different functions despite being spatially close.
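The fold changes quoted above are all ratios of specificity constants, so a decrease relative to one mutant can be cross-checked against the decreases of both proteins relative to wild type. The short sketch below illustrates this arithmetic using only the rounded fold values given in the text; the function names are illustrative, and Table I remains the authoritative source for the underlying constants.

```python
def specificity_constant(kcat, km):
    """ksp = kcat / Km, the quantity used for activity comparisons."""
    return kcat / km

def fold_vs_reference(fold_ref_vs_wt, fold_mutant_vs_wt):
    """Fold decrease of a mutant relative to a reference protein,
    given the fold decreases of both relative to wild type."""
    return fold_mutant_vs_wt / fold_ref_vs_wt

# 8-oxoG:C, from the text: H89A is ~3-fold and R109A 200-fold below wild type.
r109a_vs_h89a = fold_vs_reference(3.0, 200.0)

if __name__ == "__main__":
    print(f"R109A vs H89A on 8-oxoG:C: about {r109a_vs_h89a:.0f}-fold")
```

The result, close to 67-fold, agrees with the 70-fold decrease reported for R109A relative to H89A once rounding of the individual values is taken into account.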
Complementary Base Recognition by Arg-108 -To test the hypothesis that Arg-108 is directly involved in recognizing the base opposite the lesion, we replaced this residue with alanine (R108A) and investigated the activity of this mutant on substrates containing 8-oxoG positioned opposite each of the four natural bases. The results are shown in Table I and in Figs. 5 and 6. The mutation led to a 17-fold decrease in Fpg activity on 8-oxoG:C, a 15-fold decrease for 8-oxoG:A, and drastic decreases of 360-and 410-fold for 8-oxoG:T and 8-oxoG:G, respectively. The decreases in activity against 8-oxoG:C and 8-oxoG:A were due almost exclusively to an effect on k cat , whereas, for the other two substrates, K m was mostly affected with a less pronounced influence on k cat . The R108A mutation also influenced the activity of Fpg on DHU:C, although to a lesser degree than when 8-oxoG was positioned opposite various bases (2-fold decrease in k sp ). In the case of DHU:C, k cat was primarily affected by the R108A mutation. The magnitude of the effect observed for DHU is consistent with the interpretation that excision of DHU by Fpg is independent of the base opposite the lesion.   dues in the active site pocket of Fpg contribute significantly to enzyme specificity. Indeed, when the transition from undamaged to damaged DNA is characterized by the change in free energy of Fpg binding ⌬⌬G°ϭ Ϫ2.8 kcal/mole, of which only Ϫ0.9 kcal/mole is contributed by 8-oxoG (57), the 3-8 orders of magnitude increase in the reaction rate appears to be provided by the k cat term through precise alignment of the specific substrate in the active site.
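To put the binding free energies in the same terms as the rate enhancement, a difference in binding free energy can be converted into a ratio of equilibrium constants through K1/K2 = exp(−ΔΔG°/RT). The sketch below performs this conversion; the temperature of 298.15 K is an assumption made for illustration (the text does not state the temperature to which the cited values refer), and the function names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)
T_ASSUMED = 298.15    # assumed temperature in K; not specified in the text

def binding_ratio(ddg_kcal, temperature=T_ASSUMED):
    """Ratio of binding constants corresponding to ddG (kcal/mol).

    A negative ddG (tighter binding) gives a ratio greater than 1.
    """
    return math.exp(-ddg_kcal / (R_KCAL * temperature))

def orders_of_magnitude(ddg_kcal, temperature=T_ASSUMED):
    """The same ratio expressed as a base-10 logarithm."""
    return math.log10(binding_ratio(ddg_kcal, temperature))

if __name__ == "__main__":
    for ddg in (-2.8, -0.9):
        print(f"ddG = {ddg} kcal/mol -> ratio {binding_ratio(ddg):.1f}, "
              f"{orders_of_magnitude(ddg):.2f} orders of magnitude")
```

Under this assumption, −2.8 kcal/mol corresponds to roughly two orders of magnitude in binding affinity and −0.9 kcal/mol to less than one, which is consistent with the argument that binding alone cannot account for a rate enhancement of 3-8 orders of magnitude.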
Based on our MD simulations, a stable interaction between Lys-217 N and O 8 of 8-oxoG might be required to facilitate base excision (Fig. 2, C and D), independently of the conformation adopted by 8-oxodG in the pocket. The O 8 group is a distinct hydrogen bond acceptor that distinguishes 8-oxoG from guanine. In addition, the pyrrolic nitrogen at position 7 in 8-oxoG differs from the pyridinic nitrogen at this position in guanine. Thus, the hydrogen-bonding role of N7 is reversed upon the conversion of G to 8-oxoG, while a hydrogen bond acceptor is created at C8. We suggest that Fpg recognizes this new hydrogen bonding profile of 8-oxoG through Lys-217, as illustrated in our simulations of anti and syn 8-oxodG in the active site of Fpg. In addition, the main chain oxygen of Met-73 was hydrogen-bonded with the N7 of the damaged base in the anti 8-oxodG simulation 2. It is possible that both the Met-73 carbonyl and Lys-217 N are involved in the recognition of 8-oxodG in the anti conformation. While we cannot use sitedirected mutagenesis to test this hypothesis with respect to the main-chain atom of Met-73, data obtained with the K217T mutant are consistent with the involvement of Lys-217 in the substrate specificity of Fpg, as predicted by the MD simulations.
As revealed by earlier mutagenesis experiments (55), Glu-5 contributes to the glycosylase activity of Fpg. Our MD simulations reveal that while Glu-5 interacts with N1 in anti 8-oxodG (Figs. 2C and 7), the interaction is abolished when anti dG is placed in the active site of the protein. In this case, N1 of G donates a hydrogen bond to the Thr-214 main chain carbonyl. In the simulation of syn 8-oxodG, Glu-5 does not interact with the damaged base, but forms hydrogen bonds with amino acids in the active site region, including Glu-2, Ser-9, Tyr-170, and Thr-214. The stabilization of 8-oxoG and confirmation of the presence of a lesion are likely carried out by several amino acid residues within the Fpg active site. It appears that Lys-217 and possibly the main chain oxygen of Met-73 are responsible for recognition of O 8 and N7, respectively, of 8-oxoG, while Glu-5 increases the rate at which this base is excised by confirming the presence of pyrrolic N1 in the purine system when the modified nucleotide adopts the anti conformation.
Recognition of 8-oxoG by Enzymes of Different Structural Families-DNA containing 8-oxoG is a primary substrate both for Fpg and for a eukaryotic DNA glycosylase, Ogg1 (58,59). Surprisingly, there is no similarity between the Fpg and Ogg1 sequences (13,58,59) or three-dimensional structures (33-36,60). Nevertheless, the possibility remained that recognition and stabilization of the damaged base by these enzymes might be similar and require the same or similar functional groups. Such a correlation is not borne out by our data.
In the active site pocket of human Ogg1 (60), the lesion is sandwiched between the aromatic ring of Phe-319 and the thiol group of Cys-253; Gln-315 accepts two direct hydrogen bonds through its Oε1 from N1 and N 2 of 8-oxoG and is positioned to donate a hydrogen bond through a water bridge to O 6 of the lesion. The only bond to the lesion-specific functional groups of 8-oxoG, O 8 and pyrrolic N7, is made between N7 and the main chain carbonyl of Gly-42 (Fig. 7). However, the patterns of possible contacts in the active site of E. coli Fpg are quite different. The opposite π-faces of 8-oxoG touch the walls of a hydrophobic pocket in both the anti and syn conformations. In the anti conformation, one wall of the pocket is formed by Ile-169 and Leu-216, and the other, by aliphatic atoms of Pro-1, Glu-2, and Glu-5. When the damaged nucleotide adopts the syn conformation, one wall of the pocket is formed by Phe-219, Met-73, Ser-74, and Lys-217, while the other wall is formed by Pro-1, Glu-2, and Leu-71. Lys-217 Nζ donates a hydrogen bond to O 8 in a lesion-specific but conformation-independent manner. The Thr-214 side chain hydroxyl accepts a hydrogen bond from N 2 of anti 8-oxodG, whereas the carboxyl of Glu-5 accepts a hydrogen bond from N1 of the damaged base (Figs. 2C and 7). When 8-oxodG adopts the syn conformation, the Watson-Crick edge is recognized by Gln-221, Arg-258, and Gly-75. Importantly, when unmodified G is present in the active site in either conformation, Lys-217 Nζ does not participate in hydrogen bonding, even with the acceptor N7, and Glu-5 Oε1 is replaced by the Thr-214 main-chain carbonyl as the group interacting with N1 of the base. This less advantageous fit of undamaged G in the active site pocket may lead to a subtle rearrangement of the active center, throwing the interacting orbitals out of their precise alignment and thus preventing excision of the undamaged guanine.
[Figure legend, continued] Oxygen atoms are colored red; nitrogens, blue; carbons, gray; sulfur, yellow. 8-oxoG is represented as a space-filling model; the hydrophobic residues lining the pockets are shown as wireframe models with dotted van der Waals radii. The residues shown as stick models are the only ones forming direct hydrogen bonds (dotted lines) with the lesion-specific N7-O 8 region of 8-oxoG.
Our results may explain why MeFaPy-G is an excellent substrate for Fpg (14,61) while being excised rather inefficiently by Ogg1 (58,59,62-64). Recognition of 8-oxoG by Ogg1 involves hydrogen bonding to N7 of the purine ring, which is methylated in MeFaPy-G. Fpg recognizes the modified imidazole ring of 8-oxoG (and presumably the methylformamide moiety of MeFaPy-G) through another structural feature, O 8, which is unchanged by methylation at N7.
Specific and Nonspecific Substrates-As discussed above, recognition of the same lesion might proceed by different mechanisms. A related question concerns excision by the same enzyme of lesions with apparently different structures. In the case of Fpg, which excises FaPy-G, FaPy-A and DHU, as well as 8-oxoG, with high specificity, our experiments provide insight into recognition of 8-oxoG and FaPy-G, offer a less obvious explanation for FaPy-A (in which N1 is a hydrogen bond acceptor rather than a donor), and are not applicable to the excision of DHU. Intuitively, DHU and 8-oxoG would not be expected to interact with the same residues in the active site pocket of Fpg. Our mutagenesis experiments are consistent with this prediction, as we identified two mutants (K217T and H89A) that impede recognition of 8-oxoG but not of DHU.
What constitutes a specific substrate for a given enzyme is open to discussion beyond simple enzyme kinetics. For DNA glycosylases, a specific (or physiological) substrate would be any lesion present in a cell that is excised with high probability when encountered. Most kinetic studies on DNA glycosylases have been performed with oligonucleotide substrates under conditions where the effective concentration of the lesion is extremely high (in the experiments described here, 1 lesion in 23 base pairs). A dramatic effect is observed upon switching to the more realistic situation of macromolecular DNA damaged by γ-irradiation. In such a system, Fpg excises only 8-oxoG, FaPy-G and FaPy-A (65), and was specifically shown not to excise DHU, dihydrothymine, 5-hydroxycytidine, 5-hydroxyuracil, thymine glycol, or 5-hydroxy-5-methylhydantoin (65,66), although all of these lesions are reported to be excised by Fpg from duplex oligonucleotides (Refs. 56, 66-68 and present study). Similarly, excision of 8-oxoG by Nei, easily observed with oligonucleotide substrates (69), does not occur with irradiated DNA (70).
Interestingly, direct comparison of kinetic constants of Fpg obtained using oligonucleotide substrates or irradiated DNA shows nearly the same k cat , but the K m for lesions embedded in high-molecular-weight DNA increases by almost two orders of magnitude (15,65). This phenomenon is consistent with the one-dimensional diffusion model developed by von Hippel and co-workers (71-73) for processes involving proteins searching for a specific element in DNA. This model implies that k 1 included in the denominator of the K m measured in biochemical experiments is a function of both a bimolecular encounter constant (representing enzyme binding to nonspecific DNA) and a characteristic translocation time (reflecting the search for the lesion without dissociation of the enzyme-DNA complex). Increasing the size of a DNA molecule and therefore the average distance between the lesions will increase the translocation time and thus increase K m . It is reasonable to suggest that lesions in high molecular weight DNA are mostly encountered in a run-on mode, i.e. when the enzyme approaches the lesion while diffusing along DNA (Fig. 8A). With oligonucleotide substrates, a significant fraction of bimolecular encounters will result immediately in the association of the lesion with the active site of the enzyme (jump-on mode) and lead to apparently efficient excision of substrates not recognized in long DNA (Fig. 8B). Structural and mechanistic aspects of lesion binding may differ between these two modes (see discussion below of the reading head model). We suggest that Fpg recognizes 8-oxoG, FaPy-A, and FaPy-G in both run-on and jump-on modes (Fig. 8A), whereas only the latter type of encounter results in a productive Fpg complex with DHU or other damaged pyrimidines (Fig. 8B). The low level of excision of oxidized pyrimidines from DNA in the cell is confirmed by the observation that spontaneous reversions to rifampicin resistance in triple nth nei fpg mutants of E. coli are only slightly increased over those in double nth nei mutants, and the resultant mutation spectra do not correspond to those expected from oxidized pyrimidines (74,75). Thus, we contend that DHU and certain other damaged pyrimidines are nonspecific substrates for Fpg, whereas 8-oxoG and formamidopyrimidines represent specific, physiological substrates.
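The dependence of K m on k 1 can be made explicit with the textbook expression K m = (k −1 + k cat )/k 1 for a simple one-substrate scheme. The following sketch assumes that scheme; the rate constants are hypothetical placeholders, not measured values, chosen only to show that a 100-fold slower effective association raises K m 100-fold while leaving k cat untouched.

```python
import math


def michaelis_constant(k1: float, k_minus1: float, k_cat: float) -> float:
    """K_m = (k_-1 + k_cat) / k_1 for the scheme E + S <-> ES -> E + P."""
    return (k_minus1 + k_cat) / k1


# Hypothetical rate constants (illustrative placeholders only).
K_MINUS1 = 1.0      # s^-1, dissociation of the enzyme-substrate complex
K_CAT = 0.1         # s^-1, catalytic step, same for both substrates
K1_OLIGO = 1.0e8    # M^-1 s^-1, effective association with a short duplex

# Assumption: a longer search along macromolecular DNA lowers the
# effective association constant 100-fold.
K1_LONG_DNA = K1_OLIGO / 100.0

km_oligo = michaelis_constant(K1_OLIGO, K_MINUS1, K_CAT)
km_long = michaelis_constant(K1_LONG_DNA, K_MINUS1, K_CAT)
km_ratio = km_long / km_oligo

if __name__ == "__main__":
    print(f"K_m (oligonucleotide): {km_oligo:.2e} M")
    print(f"K_m (long DNA):        {km_long:.2e} M")
    print(f"fold increase in K_m:  {km_ratio:.0f}")
```

In this picture only the association term changes between the two kinds of substrate, which is the pattern described above: nearly the same k cat with a much larger K m for lesions in high-molecular-weight DNA.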
Comparison with Structural Data-After this work was complete, the crystal structure was established for a catalytically inactive Bst-Fpg E2Q mutant complexed with an 8-oxoG-containing oligonucleotide duplex (76). In this complex, the lesion was accommodated in the syn conformation in the active site pocket. N7 of 8-oxoG was hydrogen-bonded to the main chain carbonyl of Ser-222 (Thr-214 in Eco-Fpg). The side chain of Arg-225 (Lys-217 in Eco-Fpg) was not in the vicinity of the damaged base in Bst-Fpg. Another complex, in which DHU was the lesion, presented a similar pattern of hydrogen-bond donors and acceptors through its Watson-Crick edge to the protein moieties that recognize the lesion-specific elements.
The most likely explanation of this apparent discrepancy between the crystal structure and our simulated model lies in the nature of the mutation used to inactivate the protein. The E2Q mutation replaces a charged carboxylate with a neutral amide, which was hydrogen-bonded to O4′ of the extruded damaged residue. This mutation significantly changes the electrostatic environment in the immediate vicinity of the lesion, which could alter interactions between the mutant protein and the damaged base; this, in fact, could account for the inactivity of this mutant. Our starting conformation for the syn simulation reported here displayed Eco-Fpg/8-oxoG interactions very similar to those reported for the mutant Bst-Fpg, but in our case the wild-type enzyme contained a Glu-2 residue. Early in the simulation (~100 ps), a concerted series of motions within the active site region resulted in lesion-specific moieties of the damaged base forming a stable hydrogen bond with Lys-217 and moving far outside of hydrogen bonding distance with Thr-214. The O4′ and O8 moieties are repelled by the negatively charged carboxylate in the wild-type protein, whereas these are attracted to the amino group in E2Q Bst-Fpg.
FIG. 8. Run-on and jump-on modes of lesion finding. In the run-on mode, the enzyme binds DNA nonspecifically and diffuses along the major or minor groove. In a model suggested for Fpg acting on 8-oxoG (G o in the scheme), encountering this lesion during this facilitated diffusion leads to formation of a productive Michaelis complex (left), as does direct binding to the lesion (productive encounters are indicated by bold arrows). In macromolecular DNA the frequency of run-on encounters significantly exceeds that of jump-on encounters, while both types of encounter may contribute significantly to cleavage of oligonucleotide substrates. In a jump-on mode, hypothesized to occur when Fpg excises DHU (right), diffusion may occur but does not lead to productive encounters, which are generated only by primary binding of the enzyme in the immediate vicinity of the lesion. Thus, DHU is efficiently excised from oligonucleotides but not from long DNA targets.
Our biochemical data clearly indicate that Lys-217 mutants are deficient in recognition of 8-oxoG but not DHU. In addition, it is difficult to explain why the elements of the Watson-Crick edge of DHU, reportedly recognized by Bst-Fpg as being isosteric to lesion-specific moieties of 8-oxoG (76), would not be recognized when thymine or uracil are in the active site pocket; Eco-Fpg excises neither T nor U. Thus, the lesion-specific effects of Lys-217 mutation in Eco-Fpg are better explained in terms of our computational model for Eco-Fpg than by the crystal structure of E2Q Bst-Fpg. Perhaps, in solution 8-oxodG could bind in the active site pocket of Fpg in both syn and anti conformations, of which only the former is trapped in the Bst-Fpg/DNA crystal. The possibility of a conformational transition of the damaged nucleotide within the active site pocket seems unlikely but cannot be ruled out yet.
General Features of Damage Recognition: a Reading Head?-Different effects of H89A and R109A mutations on the excision of 8-oxoG and DHU suggest that the DNA-bending saddle containing these residues may participate less intimately in the efficient recognition of DHU. The saddle in Eco-Fpg also contains the non-conserved residues Lys-88 and Pro-107, whereas its lower part is formed by two residues of the void-filling triad, Arg-108 and Phe-110 (Fig. 4C). The entire saddle structure lies some distance from Lys-217, ruling out direct interactions between these two structural elements. Nevertheless, conservation analysis and site-directed mutagenesis results suggest a functional interplay between the saddle motif and Lys-217. Mutations in either residue specifically affect recognition of 8-oxoG. An analogous saddle structure consisting of different residues (Gln-69, Tyr-71, Val-88, Arg-90, Ser-104, Ser-106) exists in Eco-Nei. Here, Arg-90 is absolutely conserved in the Nei subfamily but not in the Fpg subfamily and, similar to Arg-109 in Eco-Fpg, contacts the p(−1) phosphate (32). However, the orientations and conformations of the two arginine residues are quite different, as probably are the forces they exert locally on DNA.
As discussed earlier, a plausible model of recognition suggests that 8-oxoG may be recognized in both run-on and jump-on modes, whereas only jump-on mode applies for DHU. We suggest that the saddle motif acts as a DNA reading head. The reading head model for DNA repair enzymes was introduced by Tainer and co-workers (77) based on their studies of uracil-DNA glycosylase. They proposed that two residues in this enzyme, tyrosine and arginine, invaded and widened the minor groove of DNA and scanned it for N3 purine atoms. The reading head concept, in general, may be described as a motif in a glycosylase that senses a lesion-specific pattern or introduces distortion into DNA that permits indirect probing for instability created by a given lesion. We suggest that two void-filling moieties (Arg-108 and Phe-110) and three DNA-bending moieties (Lys-88, His-89, and Arg-109) of the saddle motif function together to distort DNA during one-dimensional diffusion in such a way that facilitates eversion of 8-oxoG or FaPy nucleotides-but not other lesions-into the enzyme active site pocket. When undamaged DNA is bound, residues of the saddle motif, such as His-89, Arg-108, or Arg-109, interact with bases in the minor groove of DNA instead of (or in addition to) the phosphate backbone. Once the damaged nucleotide is everted, it encounters Lys-217 (or possibly Met-73). In this model, His-89/Arg-109 and Lys-217 act at consecutive steps of damage recognition, providing an explanation for their co-conservation.
The structural machinery capable of this push-flip action most likely is assembled when the enzyme closes on DNA after initial binding. In Tth-Fpg (34), residues of the saddle motif (His-81 and Arg-100, corresponding to His-89 and Arg-109 of E. coli Fpg, together with Arg-32 and Arg-83) form a very interesting structure. The guanidine groups of the three arginines essentially stack against each other, and His-81 stacks edge to face with Arg-32. Together, they form a bulky positively charged ridge with several potential sites for hydrogen bonding. In Eco-Fpg, these residues are conserved (except for Arg-83, which is replaced by His-91), but in the cross-linked form, stacking is disrupted. We envision two possibilities: either the ridge exists in the free Eco-Fpg and changes its conformation upon binding, or the ridge is unique for Tth-Fpg. In the latter case, the reason for conservation of these residues is unclear. It is more likely, therefore, that the structure of the reading head in Fpg undergoes a conformational change from that assumed by the free form and may adopt yet another conformation when undamaged DNA is bound. In the free form, Fpg probably exists in a relatively open conformation, as suggested by NMR (78) and fluorescence measurements (79); direct structural data for Nei shows that this enzyme closes when binding DNA (32). 2 The active site of the enzyme is more accessible in the open form and a lesion, such as DHU, may be accommodated directly when the enzyme closes, as described for the jump-on mode, without having to be pushed into the active site by the saddle motif. However, gross alteration of the saddle motif, as in the H89A/R109A double mutant, would preclude efficient DNA kinking and subsequent formation of the catalytically competent active site, influencing excision of DHU as well.
Opposite Base Recognition-Structural and biochemical data reveal Arg-108 as the only residue contacting the orphaned base in the Fpg-DNA complex. As shown earlier (15) and confirmed in the present study, Fpg excises 8-oxoG more efficiently from the presumably unnatural substrates 8-oxoG:T and 8-oxoG:G than from the physiological substrate, 8-oxoG:C. Structures of Bst-Fpg-DNA complexes with C, G, or T positioned opposite the lesion (35) provide clues as to how Fpg can stabilize the Michaelis complexes with these bases, but do not explain the order of substrate preference. The lower thermodynamic stability of 8-oxoG:G and 8-oxoG:T has been suggested as the major reason for this preference (15,80).
In light of this hypothesis, the detailed kinetics of Fpg R108A on 8-oxoG positioned opposite different bases is most intriguing. The mutation affected k cat for all opposite bases to about the same extent, decreasing each 10-15-fold (Table I). This obviously reflects the fact that disruption of the void-filling structure disorients the catalytic residues in the Michaelis complex for all mispairs involving 8-oxoG. However, the effect on K m clearly depends on the opposite base. K m values for stable mispairs (8-oxoG:C, 8-oxoG:A) are hardly affected by the R108A mutation, whereas K m values for the less thermodynamically stable mispairs (8-oxoG:T, 8-oxoG:G) are increased 20-50-fold in the mutant. Even more intriguingly, dissociation constants for all four mispairs, as measured by the gel mobility shift assay, do not change significantly as compared with the changes in K m (Table I). These data indicate that the Michaelis complex with 8-oxoG:T and 8-oxoG:G is destabilized without dissociation of Fpg from DNA, possibly due to a shift in the equilibrium of nucleotide eversion toward an intrahelical 8-oxoG. On the other hand, structural data (35) suggest that contacts of Arg-108 with C are probably more energetically favored than are its contacts with T or G. In this case, the removal of Arg-108 would affect the 8-oxoG:C mispair to a higher extent than 8-oxoG:T or 8-oxoG:G, contrary to our observations. We therefore suggest that destabilization of the Michaelis complex in the R108A mutant may be caused by a decrease in the rate of 8-oxoG eversion from 8-oxoG:T or 8-oxoG:G. Arg-108 would then play a dominant role in initiating 8-oxoG eversion from energetically destabilized mispairs, where void filling may be the rate-limiting step. Additionally, Arg-108 could play an accessory role in excision of stable mispairs, such as 8-oxoG:C and 8-oxoG:A, where the rate of reaction may be limited by destabilization involving the reading head or the flanking phosphates-pinching triad of Lys-56, Asn-168, and Arg-258 (33).
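As a rough consistency check, the fold changes quoted above can be combined, assuming that k sp denotes the specificity constant k cat /K m so that a decrease in k cat and an increase in K m multiply. The helper name below is hypothetical; the numeric ranges and the 360- and 410-fold values are those quoted in the text for R108A on 8-oxoG:T and 8-oxoG:G.

```python
def ksp_fold_decrease(kcat_fold_decrease: float, km_fold_increase: float) -> float:
    """Fold decrease in k_sp = k_cat / K_m (assumed definition of k_sp).

    Lowering k_cat by a factor a and raising K_m by a factor b lowers
    their ratio by a factor a * b.
    """
    return kcat_fold_decrease * km_fold_increase


# Ranges quoted for the unstable mispairs: k_cat down 10-15-fold,
# K_m up 20-50-fold.
lowest = ksp_fold_decrease(10, 20)
highest = ksp_fold_decrease(15, 50)

# Overall decreases in activity quoted for the same substrates.
reported = {"8-oxoG:T": 360, "8-oxoG:G": 410}
within_range = {
    pair: lowest <= fold <= highest for pair, fold in reported.items()
}

if __name__ == "__main__":
    print(f"expected k_sp decrease: {lowest:.0f}- to {highest:.0f}-fold")
    for pair, ok in within_range.items():
        print(f"{pair}: reported {reported[pair]}-fold, within range: {ok}")
```

Both reported values fall inside the 200- to 750-fold window implied by the quoted k cat and K m changes, so under this assumed definition the separate kinetic terms account for the overall loss of activity on the unstable mispairs.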
Design of Site-directed Mutants-Atomic-resolution structures provide a solid basis for interpreting the roles of individual amino acid residues in enzyme functions and for generating hypotheses to be tested by site-directed mutagenesis. However, in most cases, crystallographic structures reflect only a single snapshot along the reaction coordinate. Even if the reaction coordinate is sampled for representative structures, as has been done for Ung (77,81,82) and Fpg (35), dynamic features of the reaction are difficult to study in this way. This is especially true for enzymes acting on polymeric substrates like DNA, where both the catalyst and its substrate may undergo multiple and extensive conformational changes during the reaction. Furthermore, as illustrated by Fpg, functionally important residues may be missing from the crystallographically determined structure of the enzyme-DNA complex. Thus, the experimentally determined structure can be fruitfully supplemented with studies designed to generate hypotheses regarding the function(s) of individual residues that are not readily deducible from a static model. In this study, we have explored two approaches to in silico analysis of damage recognition: molecular evolution and molecular modeling with MD simulations. Each approach allowed us to make meaningful predictions regarding amino acids crucial for substrate recognition and to design appropriate mutants to test these predictions. MD simulations of various steps along the reaction coordinate have been performed for only a limited number of DNA repair enzymes (34,83-89). Experimental tests of the specific predictions of the simulations are even more rare (Refs. 87 and 90) 3 and do not always support their predictions (90). We have used MD simulations here to address a specific question; namely, how do the loop and the base missing from the structure of Fpg interact with the remainder of the protein-DNA complex? The results of site-directed mutagenesis studies with the K217 mutant support predictions based on the simulation.
Another theoretical approach, conservation analysis coupled with structural information, also has rarely been applied to DNA repair enzymes (37). Site-directed mutants have been produced previously for several DNA glycosylases based on high conservation value of certain residues in a limited number of sequences and without consideration of subgroups (55,91,92). To the best of our knowledge, addressing the role of His-89, Arg-108, and Arg-109 in this study is the first example of an experimental test of a prediction made from a combined structural-conservation analysis. Thus, the joint use of structural information, computer simulation and bioinformatics methods provides a powerful approach to generating biochemical hypotheses; one that can be applied to other protein families as well.