Structure of formamidopyrimidine-DNA glycosylase covalently complexed to DNA.

Formamidopyrimidine-DNA glycosylase (Fpg) is a DNA repair enzyme that excises oxidized purines from damaged DNA. The Schiff base intermediate formed during this reaction between Escherichia coli Fpg and DNA was trapped by reduction with sodium borohydride, and the structure of the resulting covalently cross-linked complex was determined at 2.1-Å resolution. Fpg is a bilobal protein with a wide, positively charged DNA-binding groove. It possesses a conserved zinc finger and a helix-two turn-helix motif that participate in DNA binding. The absolutely conserved residues Lys-56, His-70, Asn-168, and Arg-258 form hydrogen bonds to the phosphodiester backbone of DNA, which is sharply kinked at the lesion site. Residues Met-73, Arg-108, and Phe-110 are inserted into the DNA helix, filling the void created by nucleotide eversion. A deep hydrophobic pocket in the active site is positioned to accommodate an everted base. Structural analysis of the Fpg-DNA complex reveals essential features of damage recognition and the catalytic mechanism of Fpg.

Oxidative DNA damage is generated by a variety of environmental and endogenous agents, including ionizing radiation, certain chemicals, and products of aerobic metabolism (1). 8-Oxoguanine (8-oxoG) is one of the most abundant forms of oxidative DNA damage (2). Due to its ability to form a Hoogsteen-type base pair with adenine (3), 8-oxoG is miscoding (4) and mutagenic, resulting in G→T transversions in bacterial and eukaryotic cells (5,6). The potentially harmful effects of this lesion are avoided by base excision repair. In Escherichia coli, formamidopyrimidine-DNA glycosylase (Fpg, EC 3.2.2.23) removes 8-oxoG, Me-FaPy, and several structurally related lesions from damaged DNA (7,8). Fpg is a component of the "GO system" that includes MutY, a mismatch adenine-DNA glycosylase, and MutT, an 8-oxodGTPase (9,10); E. coli strains deficient in any of these genes are strong mutators (11).
Fpg shares significant sequence homology with endonuclease VIII (Nei) of E. coli (12). Both proteins belong to a family unrelated by sequence or tertiary structure to a larger family of DNA glycosylases, for which the prototype is endonuclease III (Nth) (13,14). The substrate specificity of Fpg differs significantly from that of Nei (7,8,15) but closely resembles that of the eukaryotic 8-oxoguanine-DNA glycosylase, Ogg1, a member of the Nth family (14,16,17). Fpg also possesses AP lyase activity, nicking the phosphodiester backbone of DNA at the site of the lesion. Base excision by Fpg is followed immediately by two β-elimination steps, resulting in a single nucleotide gap flanked by phosphate termini (7). A Schiff base intermediate, involving Pro-1 of the enzyme and C1′ of the damaged nucleotide, forms early in the reaction sequence and can be reductively trapped by treatment with NaBH4, forming a stable covalent complex (18,19). The mechanism of cleavage is similar to that of Nei (15,20), but not to that of Ogg1, where only one β-elimination occurs and the efficiency of the elimination step is very low compared with base excision (16,17).
Comparing the structures of Fpg, Nei, and Ogg1 provides a unique opportunity to analyze features of damage recognition and catalysis common to DNA glycosylases/AP lyases. The presence of DNA enhances the analytic power of the model by revealing the precise nature of enzyme-DNA interactions. The structure of the human Ogg1 catalytic domain complexed to DNA has been solved (21,22), as has the structure of E. coli Nei covalently cross-linked to DNA by NaBH4 (23). The structure of Fpg from Thermus thermophilus HB8 (Tth-Fpg) has recently been solved in the absence of DNA (24). Although mechanisms for lesion recognition and catalysis by Fpg have been suggested on the basis of this structure and on earlier biochemical studies of E. coli Fpg (8,18,24,25), many questions remain unanswered regarding the mode of Fpg-DNA interactions and the catalytic reaction mechanism of this important DNA repair protein.
To investigate the mechanisms of Fpg-DNA interactions, we have utilized NaBH4 reduction of the Schiff base intermediate to produce a stable covalent cross-link between Fpg and duplex DNA. We crystallized this complex and determined its three-dimensional structure at 2.1 Å, revealing for the first time the precise three-dimensional mode of DNA binding by Fpg as well as several important features of the catalytic mechanism of this enzyme. We used this model to rationalize numerous biochemical observations and site-directed mutagenesis studies of Fpg. The present study provides insight into the structural basis of Fpg-DNA binding, furnishing information essential for a mechanistic understanding of DNA glycosylases.

MATERIALS AND METHODS
Oligonucleotides and Enzymes-The 13-mers CCAGGA(8-oxoG)GAAGCC and GGCTTCATCCTGG were synthesized by established phosphoramidite chemistry (26) and annealed in a 1:1 ratio. The E. coli fpg gene was amplified from E. coli genomic DNA by PCR using Pfu DNA polymerase (Stratagene); NdeI and BamHI restriction sites were present on the primers. The amplified fragment was inserted into the NdeI-BamHI site of the pET13a plasmid (27) and used to transform B834(DE3) E. coli cells. To purify recombinant Fpg, cells from 1 liter of culture, induced with 50 μM isopropyl-1-thio-β-D-galactopyranoside for 6 h at 37°C, were lysed by treatment with 50 μg/ml lysozyme in Tris/EDTA buffer, pH 8.0, containing 1 mM phenylmethylsulfonyl fluoride. DNA was precipitated with 0.01% polyethyleneimine, 1 M NaCl, and the supernatant was treated with 45% saturated (NH4)2SO4. The pellet was dissolved in Buffer A (20 mM HEPES-NaOH, pH 7.5, 1 mM EDTA, 1 mM dithiothreitol, 200 mM NaCl), loaded on a Fractogel EMD SO3− 650 (Merck) column equilibrated in the same buffer, and eluted with a 200-600 mM NaCl gradient in Buffer A. Fractions containing Fpg were diluted, applied to a heparin-Sepharose CL-6B (Amersham Biosciences) column, and eluted with a 200-800 mM NaCl gradient in Buffer A. Fractions of at least 90% purity were used directly for cross-linking.
Crystallization and Data Collection-Crystals were obtained by mixing 2-μl volumes of Fpg-DNA complex and reservoir solution (30% (w/v) polyethylene glycol 8000, 0.2 M (NH4)2SO4, 0.1 M sodium cacodylate, pH 6.5) and equilibrating the drop with 1 ml of reservoir solution at 15°C for several days. X-ray diffraction data (Table I) were collected on crystals soaked for ~3 min in a solution composed of 80% reservoir solution and 20% glycerol and flash-cooled to 100 K under a cold nitrogen gas stream. Data were obtained with a Quantum-4 CCD area detector (ADSC) at the National Synchrotron Light Source X26C beamline, Brookhaven National Laboratory, and processed with DENZO and SCALEPACK (28).
Structure Determination-Matthews coefficient calculations suggested four Fpg-DNA monomers in the asymmetric unit. The Fpg-DNA structure was determined by molecular replacement (MR), using REPLACE (29,30) and CNS (31). Polar angle self-rotation function calculations resulted in two major peaks, indicating 222 non-crystallographic symmetry within the asymmetric unit. Rotation and translation function calculations employed all data in the 10.0-4.0-Å resolution range. Structures of Tth-Fpg (24) (1EE8) and a recently determined Nei-DNA complex (Ref. 23, 1K3W) were combined to form a search model for MR. Calculations imposing non-crystallographic symmetry, implemented in the "locked" cross rotation function in REPLACE, resulted in clear separation between the top and the second peaks, with no overlap between monomers in the unit cell. A translation function search was performed with CNS, using the constructed Fpg-DNA tetramer as a search model and including Patterson correlation refinement (32) and rigid body refinement for each of the monomers. This search gave a clear MR solution that was used for later stages of refinement. Solvent flipping (33) and 4-fold averaging, calculated to 2.1 Å resolution, increased the overall figure of merit from 0.42 to 0.89.
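The Matthews coefficient estimate mentioned at the start of this section can be sketched as follows. This is only an illustration under stated assumptions: the unit-cell volume, the number of symmetry operators, and the mass of one Fpg-DNA complex used below are hypothetical placeholders (the measured cell parameters are not reproduced in this text), and the function names are ours. Only the relation V_M = V_cell / (Z × n × M) and the common solvent-content approximation 1 − 1.23/V_M are standard.

```python
def matthews_coefficient(cell_volume_A3, n_sym_ops, n_copies, mass_da):
    """Return V_M in cubic angstroms per dalton.

    cell_volume_A3: unit-cell volume; n_sym_ops: symmetry operators in the
    space group; n_copies: complexes per asymmetric unit; mass_da: mass of
    one complex in daltons.
    """
    return cell_volume_A3 / (n_sym_ops * n_copies * mass_da)


def solvent_fraction(vm):
    """Approximate solvent fraction; 1.23 assumes a typical protein density."""
    return 1.0 - 1.23 / vm


# HYPOTHETICAL inputs, chosen only to make the sketch runnable.
CELL_VOLUME_A3 = 1.4e6      # placeholder unit-cell volume
N_SYM_OPS = 4               # placeholder number of symmetry operators
COMPLEX_MASS_DA = 38_000.0  # placeholder mass of one Fpg-DNA complex

# Scan plausible copy numbers and report V_M and solvent content for each.
for n_copies in range(1, 7):
    vm = matthews_coefficient(CELL_VOLUME_A3, N_SYM_OPS, n_copies, COMPLEX_MASS_DA)
    print(f"{n_copies} copies: V_M = {vm:.2f} A^3/Da, "
          f"solvent ~ {solvent_fraction(vm):.0%}")
```

In practice, the copy number whose V_M falls in the range usually observed for macromolecular crystals is taken as the most likely content of the asymmetric unit.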
Electron density maps were calculated from the refined MR phases. An initial model was constructed using the program "O" (34), allowing clear determination of most of the structure, including components (e.g. zinc ion and most of the DNA bases) missing from the search model. The model was subjected to simulated annealing and iterative cycles of positional and temperature factor refinement, followed by manual fitting and rebuilding. An overall anisotropic temperature factor and bulk solvent correction factor were applied throughout the calculation. Progress of the refinement was monitored via Rfree (35). Strict non-crystallographic symmetry constraints were applied in the initial rounds of refinement. Subsequently, tight non-crystallographic symmetry restraints (force constant of 300 kcal/mol) were applied to allow proper refinement of regions significantly different in structure from that of the search model. All of the DNA and most of the protein residues were identified, except for the loop including residues 217-224 and the side chains of several amino acids remote from the protein-DNA interface. Water molecules were assigned to peaks in the [Fo − Fc] electron density maps larger than 3σ and within potential hydrogen-bonding distance. The figures were prepared using MidasPlus (36), MOLSCRIPT (37), BOBSCRIPT (38), Raster3D (39), and GRASP (40). 3DNA (41) was used to calculate various DNA structural parameters.
Molecular Modeling and Conservation Analysis-A model of Fpg complexed with the DNA duplex containing an everted 8-oxodG residue was built based on the current structure and the structure of 8-oxoG-containing DNA from the Ogg1-DNA complex (1EBM, Ref. 21). The missing residues 217-224 were modeled based on the corresponding loop of Tth-Fpg (1EE8, Ref. 24). The model was subjected to a series of energy minimization steps using the Discover module of Insight II (Accelrys) until the root mean-squared gradient was smaller than 0.001 kcal/(mol·Å). All energy optimizations were performed using the AMBER force field (42) with a distance-dependent dielectric constant of 4r, using the steepest descent and conjugate gradient methods. The phosphorus atoms of the DNA and the Cα atoms of the protein, except for the newly built loop, were restrained with harmonic forces. To find a conformation with lower energy for the missing loop and the everted 8-oxodG, we also performed simulated annealing molecular dynamics runs from 1000 to 298 K. The conservation of residues within the Nei and Fpg subfamilies of the Fpg family was analyzed using the AMAS algorithm (43) and the cluster of orthologous groups database (44), as described elsewhere (23).

RESULTS AND DISCUSSION
Overall Structure-The overall structure of Fpg comprises two domains connected by a hinge polypeptide (Fig. 1, a and b) and resembles that of Tth-Fpg (24) and E. coli Nei (23). The N-terminal domain contains eight β-strands, forming a β-sandwich with two α-helices parallel to its edges. The C-terminal domain includes four α-helices, two of which, αD and αE, form the helix-two turn-helix (H2TH) motif, and two β-strands that form a β-hairpin zinc finger.
The excised base, 8-oxoG, is not retained in the crystal. The ring-opened deoxyribitol moiety (dRbl), formed after base excision and NaBH4 reduction, is everted from the helix with C1′ bound covalently to Nα of Pro-1 (Fig. 2), as suggested by biochemical experiments identifying Pro-1 as the residue involved in Schiff base formation (18,19). DNA is severely kinked at the lesion point (roll angle, 66°), and the minor groove is significantly widened. Except at the lesion site, the DNA is essentially B-form. As commonly observed in DNA-binding proteins, the DNA-binding groove is positively charged (Fig. 3).
Fpg-DNA Interactions-Binding to DNA involves extensive interactions between Fpg and DNA (Fig. 4). A hydrogen bond network involving all loops that face DNA comprises 2512 Å² of contact surface area. This relatively large buried surface area is consistent with the heat capacity change reported for Fpg binding to a lesion-containing duplex. A similar value for binding-induced burial of previously exposed solvent-accessible surface area (2268 Å²) has been reported for Ogg1 (21); both values are significantly higher than those for human alkylpurine-DNA glycosylase (1034 Å²) (45) and uracil-DNA glycosylase (700 Å²) (46).
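The comparison of buried surface areas above can be made explicit with a short calculation. The sketch below uses only the four areas quoted in the text; the dictionary and function names are ours and purely illustrative.

```python
# Buried surface areas (in square angstroms) quoted in the text.
BURIED_AREA_A2 = {
    "Fpg": 2512.0,
    "Ogg1": 2268.0,
    "human alkylpurine-DNA glycosylase": 1034.0,
    "uracil-DNA glycosylase": 700.0,
}


def fold_difference(reference, areas=BURIED_AREA_A2):
    """Return the reference area divided by each other enzyme's area."""
    ref_area = areas[reference]
    return {name: ref_area / area
            for name, area in areas.items() if name != reference}


ratios = fold_difference("Fpg")
for name, ratio in ratios.items():
    print(f"Fpg buries {ratio:.2f}x the area buried by {name}")
```

On these figures, the Fpg and Ogg1 interfaces differ by roughly 10%, whereas the Fpg interface is about 2.4 and 3.6 times larger than those of alkylpurine-DNA glycosylase and uracil-DNA glycosylase, respectively.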
Fpg mainly contacts the damaged strand 3′ to dRbl (P0, P−1, P−2) via side chains and backbone hydrogen atoms of the highly conserved Lys-56, His-70, Asn-168, Tyr-236, and Arg-258 residues. Fpg binds DNA in the minor groove; the damaged base is extruded from the helix through the major groove (Fig. 3). The complementary strand is held in position largely through Watson-Crick bonds; interactions with the enzyme are few, and except for His-89, the amino acids involved are not conserved.
Asn-168, Arg-258, and Lys-56 contact P0, P−1, and P−2, stabilizing the complex (Fig. 5a). Asn-168 is part of the H2TH motif and forms hydrogen bonds through its backbone and side chain amides to P−1 and P0, respectively. The zinc finger forms four hydrogen bonds with the phosphodiester backbone, three of which involve Arg-258 (two to P0, one to P−1) and one via Gln-257, to P(3). Lys-56, located on the β2-β3 loop, forms hydrogen bonds with P−1 and P−2.
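The contacts listed in this paragraph can be tabulated and summed per phosphate or per structural element. The sketch below is a bookkeeping aid only: the table is transcribed from the paragraph above, Lys-56 is assumed to make one hydrogen bond to each of its two phosphates (the text gives no count), and all names in the code are ours.

```python
from collections import Counter

# (residue, structural element, phosphate, hydrogen bonds), from the text.
# Phosphate labels: "P0", "P-1", "P-2" on the damaged strand; "P(3)" as written.
CONTACTS = [
    ("Asn-168", "H2TH motif", "P-1", 1),         # backbone amide
    ("Asn-168", "H2TH motif", "P0", 1),          # side chain amide
    ("Arg-258", "zinc finger", "P0", 2),
    ("Arg-258", "zinc finger", "P-1", 1),
    ("Gln-257", "zinc finger", "P(3)", 1),
    ("Lys-56", "beta2-beta3 loop", "P-1", 1),    # ASSUMED one bond each
    ("Lys-56", "beta2-beta3 loop", "P-2", 1),
]


def bonds_by(column, contacts=CONTACTS):
    """Sum hydrogen bonds grouped by the given column index (0, 1, or 2)."""
    totals = Counter()
    for row in contacts:
        totals[row[column]] += row[3]
    return totals


per_phosphate = bonds_by(2)
per_element = bonds_by(1)
print("bonds per phosphate:", dict(per_phosphate))
print("bonds per structural element:", dict(per_element))
```

Grouping by structural element recovers the four zinc finger hydrogen bonds stated in the text, and grouping by phosphate shows that the listed contacts concentrate on P0 and P−1, the phosphates adjacent to the lesion.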
Interactions at the Lesion Site-Extrusion of the damaged base, which facilitates its binding in the enzyme active site, is a common structural feature of DNA glycosylases (21,45,47,48). Eversion of deoxyribose is achieved by rotation around the P-O5′ and O3′-P bonds. The geometry of Fpg-induced DNA kinking differs from that observed for uracil-DNA glycosylase (46,47) and Ogg1 (21), which "pinch" the DNA backbone at the extrusion site, decreasing the normal distance (~12 Å in B-DNA) between P−1 and P1 by 30-50%. Similar to E. coli and human alkylpurine-DNA glycosylases complexed with DNA (45,48), Fpg does not compress the phosphodiester backbone (P−1-P1 is 11.4 Å). Thus, base eversion is not absolutely dependent on strain induced by massive backbone compression (46,47). Nucleotide eversion effected by Fpg may be achieved by "holding" the flanking phosphates with Lys-56, Asn-168, and Arg-258, coupled with intercalation of hydrophobic residues into duplex DNA (49) (Fig. 5, a and b). The energy required for extruding a single base from kinked double helical DNA has been estimated at ~3 kcal/mol (50), a barrier readily attained through noncovalent enzyme-DNA interactions.
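The backbone compression argument above reduces to simple arithmetic on the quoted distances. The sketch below takes the approximate B-DNA reference (about 12 Å) and the Fpg value (11.4 Å) from the text; the function and constant names are ours.

```python
B_DNA_P_P_A = 12.0  # approximate P(-1) to P(1) distance in B-DNA, from the text
FPG_P_P_A = 11.4    # P(-1) to P(1) distance in the Fpg-DNA complex, from the text


def compressed_distance(reference_A, fractional_decrease):
    """Distance remaining after the reference is shortened by a given fraction."""
    return reference_A * (1.0 - fractional_decrease)


def fractional_decrease(reference_A, observed_A):
    """Fraction by which the observed distance is shorter than the reference."""
    return (reference_A - observed_A) / reference_A


# A 30-50% "pinch" corresponds to these P-P distances.
pinch_range = (compressed_distance(B_DNA_P_P_A, 0.50),
               compressed_distance(B_DNA_P_P_A, 0.30))
fpg_decrease = fractional_decrease(B_DNA_P_P_A, FPG_P_P_A)

print(f"pinched P-P distance: {pinch_range[0]:.1f} to {pinch_range[1]:.1f} A")
print(f"Fpg shortens the P-P distance by {fpg_decrease:.0%}")
```

A 30-50% decrease corresponds to P-P distances of about 6.0 to 8.4 Å, whereas the 11.4 Å observed for Fpg is a shortening of only about 5% relative to the approximate B-DNA value.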
Eversion of dRbl, coupled with loss of a base, creates a substantial gap between opposite strands, reflected in the 14.7 Å distance between C1′ of dRbl and C(0), as compared with 10.5 Å in B-DNA. The gap is filled by the hydrophobic residues Met-73, Arg-108, and Phe-110 (Fig. 5b). Met-73, part of the β4-β5 loop, enters the helix through the minor groove, occupying the position vacated by the extruded base. Arg-108 and Phe-110 are located on the β7-β8 loop. Phe-110, wedged between C(1) and C(0), engages in face-to-face interactions (51) with the C(1) pyrimidine ring. Unstacking of these bases may contribute significantly to DNA kinking. C(0) remains intrahelical, stabilized by hydrogen bonds from O2 and N3 to Arg-108, contributing to Fpg specificity through opposite base recognition (Fig. 5b).
The position of dRbl is fixed by the Nα-C1′ covalent bond and the absolutely conserved Glu-2, which forms a hydrogen bond with O4′. Glu-2 is stabilized by hydrogen bonds to backbone amides of Ile-169 and Gly-167 located in the H2TH αE helix. Because the E2Q mutation inactivates the glycosylase activity of Fpg but does not affect its AP lyase activity (52), Glu-2 probably is involved in protonation of O4′ during base excision (53) (Fig. 6a).
In addition, we analyzed conservation of amino acid residues in the Fpg family, which can be divided into Fpg and Nei subfamilies (54). At some positions, residues are highly conserved within the Fpg subfamily but not within the Nei subfamily. Alternatively, they may be conserved within the Nei subfamily but differ significantly in their physicochemical properties from residues at the respective positions of Fpg subfamily members. Such "dissimilar" residues may be important for functions specific to the Fpg subfamily, the most obvious of which is substrate specificity. When dissimilar residues are mapped on the surface of Fpg, several of them (Glu-5, Ile-169, Tyr-170) fall into the pocket identified by modeling, suggesting the importance of this region for 8-oxoG-specific recognition.
Mechanistic Implications-The present structure may be used to rationalize the wealth of biochemical data accumulated for Fpg (53,55). Pro-1 and Glu-2 are critical for base release in the Fpg-catalyzed reaction. During the consecutive β-elimination steps that follow (56), Fpg must protonate the leaving phosphates. Lys-56 and Arg-258 are well positioned to protonate P−1 and P0, respectively (Fig. 6, b and c). Following elimination of P−1, its contact with Arg-258 may be relaxed, allowing this residue to move closer to P0 for more efficient protonation. Surprisingly, the conformation of dRbl in the catalytic intermediate is inconsistent with the syn stereochemistry of pro-S-2′ hydrogen abstraction proposed for other AP lyases (57,58), suggesting an anti stereochemical course.
Peptide mapping, site-directed mutagenesis, and mass spectrometric experiments established Pro-1 as the site of Schiff base formation (18,19). The Fpg-DNA structure confirms that Pro-1 Nα is covalently bound to C1′. Fluorescence experiments suggest that the structure of P1E Fpg is more rigid than that of wild type Fpg and the P1G and K56G mutants (59). Pro-1 lies at the head of a long helix, αA, which in addition to a standard dipole moment has acidic N-terminal and basic C-terminal halves. The P1E mutation further increases the dipole moment, strengthening electrostatic interactions with Lys-56 and Arg-258 and tightening the protein globule.
Mutations of conserved glutamates Glu-2, Glu-5, Glu-131, and Glu-173 to glutamines decrease the glycosylase but not the AP lyase activity of Fpg (52). Glu-2 stabilizes and protonates O4′ of dRbl (Fig. 6a) and may activate Pro-1 for nucleophilic attack ((Glu-2)Oε2-Nα(Pro-1), 5.71 Å). Glu-5, which moderates Fpg activity, may help orient Glu-2 for catalysis via interaction with its backbone amide. In addition, Glu-2 and Glu-5 participate in forming the 8-oxoG-binding pocket. Glu-131 forms hydrogen bonds to Arg-53, Arg-54, and Ala-55, stabilizing the β2-β3 loop and bringing together the N- and C-terminal domains; loss of these interactions would affect positioning of Lys-56. Glu-173 forms hydrogen bonds with the backbone amides of Gln-234, Val-235, and Tyr-236, stabilizing the turn formed by these residues. This turn, together with the edge-to-face interaction (51) with Phe-261, helps orient Tyr-236, which donates a hydrogen bond to P0. Hydrogen bonding by these Glu residues is modified upon mutation to Gln. In contrast, conversion of the conserved Asp-106 and Asp-159 residues to Asn retains hydrogen bonding capability; these mutations do not affect Fpg activity (52).
Lys-56 and Lys-154 have been implicated in substrate specificity. Mutations K56G, K56R, and K154A reduce the efficiency of Fpg acting on substrates containing 8-oxoG but not on DNA containing Me-FaPy or AP sites (25,60). These conserved residues participate in a hydrogen-bonding network near dRbl, with Lys-56 contacting P−1 and P−2 and Lys-154 forming a strong hydrogen bond to Oδ1 of Asn-168. Nucleotide eversion moves the β position of C1′ away from these residues, making it unlikely that they will participate in recognition of the damaged base. However, Me-FaPy is less stable than 8-oxoG to acid-catalyzed depurination and protonation of the base; alternatively, protonation of O4′ may be sufficient to break the glycosidic bond of this lesion. The resulting AP site would be less sensitive to the loss of Lys-56 or Lys-154 due to spontaneous opening of the deoxyribose ring, which relaxes the requirement for strict positioning of C1′ for nucleophilic attack (23).
FIG. 6. Scheme of principal steps in the sequence of reactions catalyzed by Fpg, showing catalytically important amino acid residues. Nucleophilic attack at C1′ and protonation at O4′ (a) lead to base displacement and deoxyribose ring opening. A Schiff base involving Pro-1 is formed with O4′ stabilized by hydrogen bonding to Glu-2. b, following abstraction of the 2′ proton of deoxyribose by a general base, Lys-56 protonates the 3′-phosphate leading to β-elimination (c). Deprotonation of C4′, now vinylogous to C1′, is followed by protonation of the 5′-phosphate by Arg-258 and the second β-elimination event (c).
DNA binding by Fpg is abolished by mutation of any of the four cysteines in the zinc finger (61,62). This motif exists as a separate subdomain, and its disruption would not affect overall folding, consistent with circular dichroism data (61), but would create conformational changes that affect positioning of the conserved Arg-258. In the zinc finger mutants, AP lyase activity is affected less than base excision activity (62), consistent with our observation that mutation of the homologous Arg-252 in Nei inactivates glycosylase, but not AP lyase activity (23). Cysteines 146 and 194 are remote from the active site and when mutated have little effect on enzymatic function (62).
In conclusion, the crystal structure of Fpg cross-linked to duplex DNA reveals the mode of substrate binding for this important family of DNA glycosylases and identifies highly conserved active site residues involved in the catalytic action of this enzyme. The structural data presented here clarify many of the biochemical properties reported for Fpg. They provide insights into the nature of protein-nucleic acid interactions and facilitate our understanding of mechanisms of base excision repair.