Entrapment and Structure of an Extrahelical Guanine Attempting to Enter the Active Site of a Bacterial DNA Glycosylase, MutM

MutM, a bacterial DNA glycosylase, protects genome integrity by catalyzing glycosidic bond cleavage of 8-oxoguanine (oxoG) lesions, thereby initiating base excision DNA repair. The process of searching for and locating oxoG lesions is especially challenging, because of the close structural resemblance of oxoG to its million-fold more abundant progenitor, G. Extrusion of the target nucleobase from the DNA double helix to an extrahelical position is an essential step in lesion recognition and catalysis by MutM. Although the interactions between the extruded oxoG and the active site of MutM have been well characterized, little is known in structural detail regarding the interrogation of extruded normal DNA bases by MutM. Here we report the capture and structural elucidation of a complex in which MutM is attempting to present an undamaged G to its active site. The structure of this MutM-extrahelical G complex provides insights into the mechanism MutM employs to discriminate against extrahelical normal DNA bases and into the base extrusion process in general.

Cells are constantly challenged by both endogenous and exogenous sources of DNA damage, such as reactive oxygen species generated during metabolism and chemicals from the environment (1). Failure to repair the resulting damaged nucleobases has adverse consequences ranging from acquisition of mutations to induction of cell cycle arrest and apoptosis (2). Cells have evolved mechanisms that protect genome integrity by specifically identifying and repairing lesions within their DNA. The base excision DNA repair pathway, for example, is responsible for the recognition and repair of single-nucleobase lesions. Base excision DNA repair is initiated by DNA glycosylases, which target aberrant nucleobases and catalyze cleavage of their glycosidic linkage to the DNA backbone (3). Despite the structural divergence of the target lesions for these enzymes, all of the known DNA glycosylases gain access to their target nucleobases by extruding the entire target nucleoside from the DNA helix and inserting it into an extrahelical active site pocket (4).
MutM (also known as Fpg) is a well characterized bacterial DNA glycosylase that specifically recognizes and cleaves oxidatively damaged nucleobases (5). The major substrate, oxoG, is the predominant form of oxidative damage in DNA, because G has the lowest redox potential among all four normal nucleobases (6). During DNA replication, adenine (A) is frequently incorporated opposite oxoG through Hoogsteen base pairing; replication of this intermediate results in a G:C to T:A transversion mutation (7). Deficiency in the human functional counterpart of MutM, hOGG1, has been implicated in a variety of cancers (8).
MutM faces the formidable challenge of locating rare oxoG sites amid a vast excess of undamaged DNA, and doing so before replication unleashes the mutagenic potential of the lesion. In mammalian cells, oxoG occurs at a steady state frequency of 10⁻⁷ to 10⁻⁸ per nucleotide, more than a million-fold less frequent than G (9). Unlike many other types of DNA damage (thymine dimer, ring-opened bases, and alkylation) that impose characteristic perturbations to the local DNA conformation, oxoG has minimal impact on the structure and energetics of B-form DNA (10-12). The absence of structural alterations in isolated oxoG-containing DNA duplex is not surprising because the Watson-Crick face is unchanged, and oxoG differs from G by only two atoms: an additional oxygen present on the C-8 position of oxoG and a lone pair of electrons on N-7 of G in place of an N-7-hydrogen in oxoG (Fig. 1A). Further complicating the search process is the fact that a necessary step in repair by MutM is complete extrusion of the target nucleoside from the DNA helical stack and insertion into the extrahelical active site pocket on the enzyme. Finally, and in contrast to replication-coupled and mismatch repair processes (13), MutM and other DNA glycosylases receive no input of biochemical energy (e.g. from ATP) to fuel their needle-in-a-haystack damage search process.
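As a rough check on the "million-fold" figure, the ratio of G to oxoG can be estimated from the quoted steady state frequency. The sketch below assumes that G makes up one quarter of all nucleotides, an illustrative base composition that is not stated in the text; the function name is hypothetical.

```python
def g_to_oxog_ratio(oxog_per_nucleotide, g_fraction=0.25):
    """Ratio of undamaged G to oxoG at a given oxoG frequency per nucleotide.

    g_fraction = 0.25 is an assumed base composition, not a value from the paper.
    """
    return g_fraction / oxog_per_nucleotide


# The two ends of the frequency range quoted in the text.
for frequency in (1e-7, 1e-8):
    ratio = g_to_oxog_ratio(frequency)
    print(f"oxoG at {frequency:g} per nucleotide: about {ratio:.1e} G per oxoG")
```

Under this assumption both ends of the range give a ratio above one million, in line with the statement in the text.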
How does MutM locate oxoG with such high precision and efficiency? Recent structural studies of MutM-DNA complexes at various stages along the base extrusion pathway have provided valuable insights. As shown by the structural studies and computations (14), the extrusion pathway consists of (i) initial disengagement of the target nucleobase from its complementary pairing partner and transition to an unstacked and exposed conformation, followed by (ii) rotation of the extrahelical base about its glycosidic linkage to a syn configuration, and then (iii) swiveling of the DNA backbone to enable displacement of the target nucleoside away from the DNA surface and insertion of the target base in the enzyme active site, accompanied by structural adaptations in the enzyme. Detailed views of the end state of the base extrusion pathway (Fig. 1B) have been gained via structures of lesion recognition complexes (LRCs) comprising a catalytically inactive but recognition-competent mutant of MutM bound to DNA with an extrahelical oxoG in the active site. These structures revealed extensive contacts to the Hoogsteen face of the extruded oxoG by a loop on the enzyme, the oxoG-capping loop (OCL), that becomes ordered only upon full insertion of oxoG into the enzyme active site (15). Rational design of a disulfide cross-linking (DXL) system based on the LRC structures enabled entrapment and structural elucidation of MutM interrogating intrahelical nucleobases, either normal or damaged, at the earliest stages of the base extrusion pathway. Specifically, the structures of "interrogation complexes" (ICs) having MutM interrogating a fully base-paired undamaged target nucleobase suggested that MutM actively interrogates the intact DNA duplex while searching for lesions (16). By destabilizing the interaction between oxoG and the OCL, we have been able to trap "encounter complexes" (ECs), in which MutM is examining a fully intrahelical target oxoG (14).
Comparison of several pairs of sequence-matched EC and IC structures, the protein-DNA interfaces of which differ only by the two atoms that distinguish oxoG from G, illuminated the structural basis of the ability of MutM to discriminate an intrahelical lesion from its normal progenitor. This structural difference underlies the kinetic preference of MutM to extrude oxoG from DNA more rapidly than G. Notwithstanding these advances in elucidation of the MutM base extrusion pathway, to this date, no complex of MutM interrogating an extrahelical G has been structurally characterized, because this state of the system is ordinarily too fleeting to be observed crystallographically.
Here, we report the entrapment and structural elucidation of a MutM-DNA complex trapped at the stage of attempting to present an extrahelical, undamaged G to the active site. As described in detail below, the entrapment of this intermediate was enabled by the use of DXL technology to tug on the DNA backbone at the site of the target nucleobase. The structure of this MutM-extrahelical G complex (XGC) reveals that the MutM active site can detect the subtle structural differences between extrahelical G and oxoG, so that an extruded normal G fails to attain the final precatalytic state of the base extrusion pathway.

EXPERIMENTAL PROCEDURES
Cloning, Overexpression, and Purification of MutM-The N174C point mutation was introduced into WT or the enzymatically inactive E3Q mutant of Geobacillus stearothermophilus MutM (15) in a pET24 (Novagen) expression vector using a QuikChange II site-directed mutagenesis kit (Stratagene). N174C MutM and N174C E3Q MutM were overexpressed and purified essentially as described before (17).
DNA Synthesis and Purification-DNA oligomers containing the backbone DXL N-ethylthio tethers were synthesized on an ABI 392 DNA synthesizer (Applied Biosystems) using published methods (16). The complementary strands of normal DNA oligomers were synthesized on a Mermade-12 Oligonucleotide Synthesizer (BioAutomation) using standard phosphoramidite chemistry. DNA oligomers were purified by urea-PAGE, dissolved in 10 mM Tris, pH 8.0, and annealed at 200 μM concentration. Sequence and disulfide functionalization of the oligomers were confirmed by matrix-assisted laser desorption ionization. The following DNA oligomer duplexes were prepared for the structures discussed in this paper, with oG denoting oxoG and x representing the location of the phosphate moiety bearing the nonbridging N-ethylthio-tether modified phosphoramidate used in cross-linking to MutM: lesion recognition complex (LRC N174C), 5′-TGCGTCCxoGAGTCTACC-3′ and 3′-CGCAGGCTCAGATGGA-5′; XGC, 5′-TGCGTCCAxGGTCTACC-3′ and 3′-CGCAGGTCCAGATGGA-5′.
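The pairing register of the two duplexes can be checked directly from the sequences as printed. The following is a minimal sketch, not part of the published methods: it removes the tether marker x, writes oxoG as the single letter "o" and treats it as pairing like G (an assumption made only for this check), and scans small offsets between the strands. All function and variable names are hypothetical.

```python
# Watson-Crick partners; "o" stands for oxoG and is assumed to pair like G.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G", "o": "C"}


def count_pairs(top_5to3, bottom_3to5, offset):
    """Count Watson-Crick pairs when top[i + offset] sits opposite bottom[i]."""
    matched = 0
    total = 0
    for i, base in enumerate(bottom_3to5):
        j = i + offset
        if 0 <= j < len(top_5to3):
            total += 1
            if PAIR[top_5to3[j]] == base:
                matched += 1
    return matched, total


def best_register(top_5to3, bottom_3to5, max_shift=2):
    """Return (offset, matched, total) for the offset with the most pairs."""
    best = None
    for offset in range(-max_shift, max_shift + 1):
        matched, total = count_pairs(top_5to3, bottom_3to5, offset)
        if best is None or matched > best[1]:
            best = (offset, matched, total)
    return best


# Sequences as listed in the text, tether marker removed.
LRC_TOP = "TGCGTCCoAGTCTACC"     # 5'->3'
LRC_BOTTOM = "CGCAGGCTCAGATGGA"  # 3'->5'
XGC_TOP = "TGCGTCCAGGTCTACC"     # 5'->3'
XGC_BOTTOM = "CGCAGGTCCAGATGGA"  # 3'->5'

print(best_register(LRC_TOP, LRC_BOTTOM))
print(best_register(XGC_TOP, XGC_BOTTOM))
```

With the sequences read this way, both duplexes pair fully only when the strands are staggered by one nucleotide, which leaves a single unpaired nucleotide at each 5′ end.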
DXL Reactions and Crystallization-LRC N174C and XGC were made by cross-linking the corresponding DNA duplexes to the N174C E3Q MutM protein. Preparative cross-linking reactions were performed under nondenaturing conditions as previously described (16). The DXLed complexes were purified on a MonoQ column using a linear gradient of 100-600 mM NaCl in 20 mM Tris, pH 7.4, buffer exchanged into 20 mM Tris, pH 7.4, and 50 mM NaCl, concentrated to 250 μM, and subjected to crystallization under essentially the same conditions as before (16). Crystals readily appeared within a week at 4°C using the hanging drop vapor diffusion method, with a reservoir solution composed of 12-18% polyethylene glycol 8000, 100 mM sodium cacodylate, pH 7.0, and 5% glycerol. After growing for 3 weeks, the crystals were soaked in a cryoprotectant solution (18% polyethylene glycol 8000, 100 mM sodium cacodylate, pH 7.0, and 25% glycerol), mounted in a loop, and flash-frozen in liquid nitrogen for data collection.
Data Collection and Structure Determination-X-ray diffraction data were collected at 100 K at the X29 Beamline of the National Synchrotron Light Source and the 19ID-C Beamline of the Advanced Photon Source at Argonne National Laboratory. The HKL2000 program suite (18) was used to process all of the diffraction data. The phases of the structures were solved by rigid body fitting of the coordinates of the protein portion from the published un-DXLed MutM lesion recognition complex (15) in CNS (19). Iterative manual adjustments of the initial models in COOT (20) were followed by simulated annealing, energy minimization, and grouped B factor refinements. The DNA molecules were built into the readily visible density in the 2Fo − Fc maps generated with the improved initial models. The clearly visible density for the cross-link served to confirm the register of the DNA. After Rfree (21) dropped to a steady level below 28% through rounds of model adjustments and refinements, water molecules were added to the models using both automated methods in CNS and manual inspection of the difference maps. Simulated annealing omit maps were used throughout to reduce model bias. The DNA backbone cross-links were modeled with restraint parameters generated using PRODRG (22). The complete LRC N174C and XGC models were subjected to a final round of solvent flattening, energy minimization, ADP (B factor), and translation/libration/screw refinements in Phenix (23). Protein and DNA were defined as separate translation/libration/screw groups. Electron density for regions of the OCL in the XGC was not evident, and these residues were thus omitted from the model. Amino acid residues with incomplete electron density around their side chains were truncated at the corresponding positions. Atomic coordinates were deposited to the Protein Data Bank under accession numbers 3JR4 and 3JR5. Images of the models were rendered using PyMol (DeLano Scientific).
oxoG and Abasic Site Cleavage Assays-The DNA duplex used for the LRC N174C and the corresponding duplex lacking the N-ethylthio tether were used in the oxoG cleavage assays. The strands containing oxoG were 5′-end-labeled using T4 polynucleotide kinase (New England Biolabs) and [γ-³²P]ATP (PerkinElmer Life Sciences) and annealed with 1.1-fold excess of the unlabeled complementary strand. Single turnover cleavage reactions were carried out with 100 nM duplex DNA and 500 nM WT or N174C MutM in a standard reaction buffer of 40 mM NaCl, 50 mM Tris, pH 7.4, at room temperature. Abasic site cleavage reaction substrates were prepared by treating a 5′-³²P-labeled uracil-containing 16-mer duplex DNA (5′-AGCGTCCAUGTCTACC-3′) with uracil DNA glycosylase (New England Biolabs) at 37°C for 1 h. Pretreated duplex DNA (100 nM) was then incubated with 100 nM WT or N174C MutM in a reaction buffer of 200 mM NaCl, 50 mM Tris, pH 7.4, at room temperature. Aliquots of the reactions were removed periodically, quenched with an equal amount of 100 mM dithiothreitol in 95% formamide and 1× Tris/borate/EDTA buffer, and subjected to denaturing urea-PAGE and visualization on a phosphorimaging plate.
Fluorescence Polarization Experiments-A 3′-fluorescein-labeled DNA oligomer (5′-TGGTAGACCTGGACGC-fluorescein-3′, purchased from Operon and gel-purified) was annealed 1:1 with either the tether-containing strand used in the XGC or the corresponding normal DNA without the N-ethylthio tether. The resulting normal DNA duplexes with or without tether were used in the WT and N174C MutM-DNA binding measurements in a buffer of 50 mM NaCl, 20 mM Tris, pH 7.4, and 0.1% β-mercaptoethanol. MutM at different concentrations (5 nM to 20 μM) was mixed with 10 nM DNA duplex in 96-well plates and equilibrated at room temperature. A SpectraMax M5 microplate reader (Molecular Devices) was used to monitor the changes in fluorescence polarization of the samples using excitation at 485 nm and emission at 530 nm. Dissociation constants were extracted by curve fitting of the changes in fluorescence polarization versus log of protein concentrations using Kaleidagraph 3.6. The experimental data points were measured in triplicate.
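To illustrate how a dissociation constant can be extracted from a polarization titration of this kind, the sketch below fits a single-site binding isotherm. It is a stand-in for the Kaleidagraph fit and not the procedure actually used: it assumes that the free protein concentration equals the total protein concentration and that polarization is linear in the fraction of DNA bound. The data are synthetic and generated inside the example; all names are hypothetical.

```python
import math


def fraction_bound(protein_nM, kd_nM):
    """Single-site isotherm, assuming free protein ~ total protein."""
    return protein_nM / (kd_nM + protein_nM)


def fit_linear(xs, ys):
    """Least squares for y = a + b*x; returns (a, b, sum of squared errors)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return a, b, sse


def fit_kd(protein_nM, polarization, kd_min=1.0, kd_max=1e5, n_grid=2000):
    """Scan Kd on a log grid; at each Kd solve the baseline and amplitude linearly.

    Returns (Kd in nM, polarization of free DNA, change upon full binding).
    """
    lo, hi = math.log10(kd_min), math.log10(kd_max)
    best = None
    for k in range(n_grid + 1):
        kd = 10 ** (lo + (hi - lo) * k / n_grid)
        bound = [fraction_bound(p, kd) for p in protein_nM]
        a, b, sse = fit_linear(bound, polarization)
        if best is None or sse < best[3]:
            best = (kd, a, b, sse)
    return best[:3]


# Synthetic, noise-free titration (made-up values, not data from this study).
conc_nM = [5, 20, 80, 320, 1280, 5120, 20000]
data_mP = [50.0 + 120.0 * fraction_bound(c, 300.0) for c in conc_nM]

kd, p_free, delta_p = fit_kd(conc_nM, data_mP)
print(f"Kd ~ {kd:.0f} nM, free = {p_free:.1f} mP, amplitude = {delta_p:.1f} mP")
```

Scanning Kd on a logarithmic grid mirrors the practice of fitting against the log of protein concentration and keeps the example free of external dependencies. With 10 nM DNA, the free-protein approximation is weakest at the lowest protein concentrations.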
Molecular Dynamics Simulations of the DXL Effects-Systems with extrahelical oxoG or G bound to MutM were set up based on the x-ray crystal structure of LRC N174C or XGC, respectively. The positions of the hydrogen atoms were determined using the HBUILD facility in the CHARMM program version c33a2 (24,25). For the DNA duplexes, we kept only the central 14 base pairs and discarded the flanking segments. The total charges of the resulting protein and DNA duplex systems were neutralized by placing Na⁺ ions (18 for un-cross-linked and 17 for cross-linked simulations) 4.5 Å away from the phosphorus atom along the line passing through the phosphorus atom and the midpoint of the two nonbridging oxygens. All of the ordered waters observed in the x-ray structures were included, and the resulting systems were further solvated with a box of water molecules measuring 80.0 × 65.0 × 65.0 Å³. Each system was first energy-minimized to alleviate high energy contacts with a series of constraints and harmonic restraints and then equilibrated with molecular dynamics (MD) simulations at constant pressure (26) for 0.7 ns. The dimensions of each system measured 78.8 × 63.8 × 63.8 Å³ after the equilibration. Production MD simulations were performed for 5 ns for each system, and the last 1 ns of each simulation was used to determine the average structure. In the simulations, the leapfrog Verlet algorithm (27) was used with a 2-fs integration time step, and SHAKE (28) was applied to bonds involving hydrogens. The temperature was maintained at 298 K by coupling the system to the external thermal bath (29), and the volume was held constant. The all-atom CHARMM 27 force fields (30,31) were used to represent the protein, ions, and nucleic acids, and the TIP3P model (32) was employed to represent water molecules. For the protein backbone dihedral angles, the CMAP correction (33) was applied. Periodic boundary conditions were used for the simulations.
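The integration scheme and the bookkeeping of steps can be illustrated with a toy example. The sketch below is a one-dimensional leapfrog Verlet integrator applied to a harmonic oscillator in reduced units; it is not the CHARMM implementation and omits SHAKE, the thermostat, and periodic boundaries. The helper that counts integration steps uses the durations and time step given above; all function names are hypothetical.

```python
import math


def n_steps(duration_ns, dt_fs):
    """Number of integration steps for a run of duration_ns at a time step of dt_fs."""
    return round(duration_ns * 1_000_000 / dt_fs)  # 1 ns = 1e6 fs


def leapfrog(x, v, force, mass, dt, steps):
    """Leapfrog Verlet in one dimension.

    Velocities are stored at half steps and positions at whole steps.
    """
    # Shift the initial velocity back by half a step: v(-dt/2).
    v_half = v - 0.5 * dt * force(x) / mass
    for _ in range(steps):
        v_half += dt * force(x) / mass  # v(t + dt/2)
        x += dt * v_half                # x(t + dt)
    # Bring the velocity back in step with the final position.
    v = v_half + 0.5 * dt * force(x) / mass
    return x, v


print(n_steps(5, 2))    # production run
print(n_steps(1, 2))    # averaging window
print(n_steps(0.7, 2))  # equilibration

# Toy system: unit mass on a unit spring, roughly one period of motion.
x_end, v_end = leapfrog(1.0, 0.0, lambda x: -x, 1.0, 0.01, 628)
energy = 0.5 * v_end ** 2 + 0.5 * x_end ** 2
print(f"x = {x_end:.4f}, expected about {math.cos(6.28):.4f}, energy = {energy:.6f}")
```

At a 2-fs time step, the 5-ns production run corresponds to 2.5 million steps, of which the final 500,000 contribute to the averaged structure.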
For the electrostatics, the particle mesh Ewald summation method (34) was used with a cut-off distance of 9 Å for the real space summation. The van der Waals interactions were evaluated with the same cut-off distance, and a shift function was used to smoothly turn off the interaction energy at the cut-off distance.

DXL Strategy for Stabilizing the Target Nucleoside in an Extruded Conformation-The presentation of a normal nucleobase to the MutM active site is likely to be a rare event having a fleeting lifetime. We sought to devise a strategy with which to capture this state and elucidate its structure. Previously, in the case of hOGG1, which is structurally unrelated to MutM, capturing an XGC was accomplished by a DXL strategy that interfered directly with base pairing of the target G to its complementary C (35). In the case of MutM, we have employed an alternative strategy involving conformational manipulation of the DNA backbone.
Computer simulations of the base extrusion processes indicated that there is a significant energetic barrier (~11 kcal/mol) for an intrahelical G to be extruded by MutM into a fully extrahelical and rotated conformation followed by insertion into the enzyme active site (14). Furthermore, the free energy of the fully extruded state for G bound to MutM is ~7 kcal/mol higher than for the initial intrahelical state. This, plus the fact that single-molecule sliding experiments estimated that the one-dimensional diffusion rate of MutM approaches that of the theoretical limit (36), essentially precludes the possibility that MutM extrudes and examines every normal nucleobase in its active site during the search for oxoG lesions. Additionally, MutM has no preference to bind any particular DNA sequence. These considerations suggested it would be necessary to provide an energetic driving force to stabilize G in the extrahelical conformation and to restrict the roaming range of the enzyme to the immediate vicinity of the target G. We reasoned that both of these objectives might be accomplished via DXL technology, which has previously been employed successfully to trap various other fleeting species along the MutM lesion search and extrusion pathway (14,16).
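The energetic argument can be made concrete with a Boltzmann estimate. The sketch below converts the ~7 kcal/mol free energy difference into an equilibrium population ratio and shows the Boltzmann factor associated with the ~11 kcal/mol barrier. The temperature of 298 K is an assumption taken from the MD setup in this work rather than from the cited simulations, and the function name is hypothetical.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal/(mol*K)


def boltzmann_ratio(delta_g_kcal, temperature_k=298.0):
    """exp(-dG/RT): population of the upper state relative to the lower state."""
    return math.exp(-delta_g_kcal / (R_KCAL_PER_MOL_K * temperature_k))


extruded_vs_intrahelical = boltzmann_ratio(7.0)
barrier_factor = boltzmann_ratio(11.0)

print(f"extruded : intrahelical ~ {extruded_vs_intrahelical:.1e}")
print(f"Boltzmann factor for the barrier ~ {barrier_factor:.1e}")
```

Under these assumptions the extruded state of an undamaged G is populated at a level on the order of 10⁻⁵ relative to the intrahelical state, which is consistent with the expectation that such a state would be too rare and short-lived to capture without an added driving force.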
Comparison of EC and IC structures (14) with those of the LRC (15) revealed, not surprisingly, that the most drastic structural differences brought about by extrusion of the target nucleoside are localized to the site of the target nucleoside, including the flanking 5′- and 3′-phosphates. We reasoned that we might be able to attach a phosphoramidate tether to one of these phosphates, providing a point from which to tug on the DNA backbone at the site of the target nucleoside and thereby forcing it to adopt a conformation approximating that in the extrahelical state. Inspection of the previously reported structure of a MutM LRC (LRC 1R2Y, where 1R2Y is the Protein Data Bank code) (15), in which oxoG is extrahelical and bound specifically in the enzyme active site having an ordered OCL, led to the identification of a key residue, Asn 174, that employs its side chain amide to hydrogen bond with the 5′ phosphate of the target nucleoside (supplemental Fig. S1A). The distances separating Cγ of Asn 174 from the pro-S and pro-R nonbridging 5′-phosphate oxygens of oxoG in the LRC 1R2Y are 4.3 and 3.9 Å, respectively. In an IC sequence-matched to LRC 1R2Y (16), the corresponding distances are 7.1 and 5.3 Å for the respective pro-S and pro-R oxygens. From previous structures having an N-ethylthio (C2) phosphoramidate DXL located at position 166 in MutM, we measured an average distance of 4.4 ± 0.3 Å between the cross-linker N and the Cys 166 Sγ (13 structures total) (14,16). That length closely matches the distance between the target nucleobase and Asn 174 in the extruded state but is much shorter than the distance in the corresponding intrahelical state.
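The geometric reasoning behind the choice of tether site can be summarized numerically. The sketch below expresses each of the distances quoted above as a deviation from the mean tether span, in units of its standard deviation. Comparing a Cγ-to-oxygen distance with a cross-linker N-to-Sγ span is the same approximation made in the text, and the dictionary labels and function name are hypothetical.

```python
TETHER_MEAN = 4.4  # Å, mean cross-linker N to Cys Sγ distance (13 structures)
TETHER_SD = 0.3    # Å

# Asn 174 Cγ to nonbridging 5'-phosphate oxygen distances quoted in the text (Å).
DISTANCES = {
    "LRC 1R2Y, extrahelical oxoG": {"pro-S": 4.3, "pro-R": 3.9},
    "IC, intrahelical G": {"pro-S": 7.1, "pro-R": 5.3},
}


def deviation_in_sd(distance):
    """Signed difference from the mean tether span, in standard deviations."""
    return (distance - TETHER_MEAN) / TETHER_SD


for state, oxygens in DISTANCES.items():
    for oxygen, distance in oxygens.items():
        print(f"{state}, {oxygen}: {distance} Å, {deviation_in_sd(distance):+.1f} SD")
```

The pro-S distance in the extruded state falls within one standard deviation of the tether span, whereas the corresponding intrahelical distance is about nine standard deviations too long, so a cross-link at this position should favor the extruded backbone conformation.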
It is not possible to test the effect of the DXL on catalysis by MutM, because the rate of cross-linking is not considerably faster than the rate of DNA cleavage. We therefore decided to test the effects of the component parts of the cross-link, namely the N174C mutation in MutM and the N-ethylthio phosphoramidate in DNA, on the rate of the enzymatic repair reaction and on the strength of protein/DNA binding. The rate of enzyme-catalyzed oxoG and abasic site cleavage is barely affected by the presence of the N174C mutation and only modestly diminished by the presence of the N-ethylthio-bearing tether on the phosphate 5′ to oxoG (Fig. 2A and supplemental Fig. S2). Neither modification has a substantial effect (<2-fold) on the affinity of the MutM/DNA interaction (Fig. 2B). These results indicate that the component parts of the DXL, namely the N174C mutation in MutM and the N-ethylthio tether in DNA, do not significantly compromise the catalytic function of MutM.
Structure of a Control LRC N174C Having the DXL Introduced-To validate the N174C cross-linking strategy and test whether the DXL perturbs the MutM-DNA structure in unintended ways, we determined the 1.7 Å structure of an LRC having the N174C/5′-oxoG cross-link introduced into it (LRC N174C; Fig. 3A, Table 1, and supplemental Figs. S1B and S3); this complex also bore the catalysis-inactivating E3Q mutation. In the structure of LRC N174C, electron density is plainly visible and interpretable for the atoms of the disulfide cross-link (Fig. 3C); although the DNA used in the DXL reaction consisted of a mixture of stereoisomers at the tethered 5′-phosphoramidate, the cross-linked product appears to bear only the S-configured phosphoramidate, thus indicating strong stereochemical selectivity in the cross-linking reaction. In other systems, including the case of Q166C MutM, we have invariably observed stereochemical mixtures in the crystallized cross-linked products.
The control DXLed LRC N174C is nearly identical in structure to the un-DXLed LRC 1R2Y (root mean square deviation = 0.263 Å for all protein heavy atoms) and to other LRCs obtained using the Q166C DXL system (14,16). Of particular importance, in the LRC N174C structure, the target oxoG is fully extrahelical and correctly inserted into the MutM active site, with the OCL being well ordered (Fig. 3A). Despite the presence of a covalent cross-link to the 5′-phosphate, the DNA backbone conformation is nearly identical in LRC N174C and LRC 1R2Y (Fig. 4A).
The most significant difference in the protein-DNA interface of the two LRC structures is the swiveling of the Arg 264 side chain, which results in disengagement from its contacts to the oxoG 5′- and 3′-phosphates in LRC 1R2Y and establishment of new contacts to nearby DNA bases in LRC N174C (supplemental Fig. S4). This change in the Arg 264 contact repertoire is accompanied by a slight (~1 Å) movement of the loop containing Arg 264 (β10-β11 loop; supplemental Fig. S5). The reorganization of Arg 264 appears to be driven by the presence of the DXL, specifically the need to avoid a steric clash between the sulfur atoms of the disulfide cross-link and the guanidinium head group of the amino acid.
Notwithstanding the minor adjustments at Arg 264 and the β10-β11 loop, the overwhelming structural similarity between an un-DXLed LRC and one having a DXL introduced at Asn 174 and the oxoG 5′-phosphate provides validation that this DXL system faithfully recapitulates physiologically relevant MutM/DNA interactions and therefore is suitable for the study of G presentation to the enzyme active site.
Structure of an Extrahelical G Complex, with a Normal G Being Presented to the MutM Active Site-Having validated the N174C DXL strategy by determining and analyzing the structure of LRC N174C, we moved on to using the same system to study G presentation to the MutM active site. We simply replaced the oxoG nucleobase in LRC N174C with G, a difference of two atoms in the ~40-kDa complex. We retained the catalysis-inactivating E3Q mutation, so as to avoid the possibility of MutM-catalyzed cleavage of the undamaged nucleobase over the extended time periods elapsed during crystallization and data collection.
The structure of the XGC, solved to 2.6 Å, revealed the target G to be unambiguously extrahelical (Fig. 3B, Table 1, and supplemental Fig. S3). As with the LRC N174C control complex, clear electron density attributable to the disulfide cross-link in the XGC is observed at the 1σ level in the 2Fo − Fc composite omit map (Fig. 3E). Again, there was clear evidence of strong stereochemical selectivity in the DXL reaction, with the cross-linked phosphoramidate having predominantly or exclusively the S-configuration. The distance between the cross-linker N and the Sγ of the partner cysteine is 4.8 Å in the XGC (4.3 Å for the LRC N174C), as compared with 4.3 Å for the corresponding atoms in LRC 1R2Y and 7.1 Å in an intrahelical IC. The strategy of promoting target nucleoside extrusion by tugging on its 5′-phosphate thus succeeded in furnishing a complex in which an undamaged G is presented to the extrahelical active site of MutM.
Alignment of the XGC, LRC N174C, and LRC 1R2Y using only the protein Cα coordinates (Fig. 4) shows that despite differences in DNA sequence, the DNA backbone of the strands containing the target base follows the same track; the complementary strands only differ on the 5′ flank, which tends to vary similarly among various MutM/DNA complexes, because of a lack of significant protein and crystal contacts. Not unexpectedly, the β10-β11 loop is slightly shifted in the XGC, as it was in LRC N174C.
The XGC structure (Fig. 3B) shares many characteristic structural elements with the LRC N174C structure (Fig. 3A). These features serve to stabilize the significantly deformed DNA bearing an extrahelical target base present in both structures. Three MutM residues, Met 77, Arg 112, and Phe 114, penetrate the DNA helix through the minor groove and stabilize the sharp bend in the DNA duplex (supplemental Fig. S6). Specifically, Phe 114 inserts itself between the "estranged" C (opposite the extruded target base) and the 5′-neighboring nucleoside, causing the target base pair to buckle and de-stack from the neighboring base pair. Arg 112 hydrogen bonds with the estranged C and, along with Met 77, fills in the void left behind by the extruded nucleobase. In addition, MutM contacts the backbone phosphates of both DNA strands in the region surrounding the target base to further enforce the bent DNA conformation (supplemental Fig. S4). The direct hydrogen bonds involved in the LRC N174C protein-DNA interface are preserved in the XGC; the side chains of Lys 258, Tyr 242, Gln 3, Lys 60, His 74, and Arg 76 contact the DNA strand containing the target base, and the side chains of His 93, Lys 113, Asn 32, and Trp 30 hydrogen bond to the complementary strand. A rigorous comparison of the water-mediated hydrogen bonds in the two structures is not feasible because the XGC and LRC N174C structures are not of the same resolution. However, in both cases, there seem to be relatively few ordered waters at the protein-DNA interface mediating indirect contacts, as compared with intrahelical DNA-MutM structures (ICs and ECs) (14). The similarity between the hydrogen-bonding networks at the MutM-DNA interface in the LRC N174C and the XGC is expected, considering the similar protein and DNA backbone conformations in the two structures, most likely imposed by the engineered DXL (discussed above).
The main distinctions between the XGC and LRC N174C structures lie in the conformations of the target nucleobases and the positions of the MutM residues involved in stabilizing these conformations. The oxoG in the LRC N174C completely enters the active site and is stabilized by the OCL, forming multiple hydrogen bonds to the O-6 of the oxoG (amide Ns of Val 222, Arg 223, Thr 224, and Tyr 225; Fig. 5B). The oxoG is specifically contacted by the side chain of Gln 3 at the O-8 position and the carbonyl oxygen of Ser 220 at the protonated N-7. Although G bears the same O-6 functionality as oxoG, G differs from oxoG at C-8 and N-7, and this difference is sufficient to prevent the target G in the XGC from fully entering the active site and engaging the OCL; instead, the target G lies parallel to the minor groove floor of the DNA (Fig. 3B). The orientation of the extruded G is almost perpendicular to that of the extrahelical oxoG, and the OCL in the XGC is disordered throughout most of its length (Fig. 3B and supplemental Fig. S3). Interestingly, in the complexes of MutM interacting with an intrahelical target oxoG or normal base (ECs or ICs), the entire OCL (residues 216-237) is disordered, but in the XGC, 12 more residues of the OCL are ordered than in the intrahelical complexes (Fig. 3D). The disordered region of the OCL (residues 222-231) comprises mainly the tip of the loop, which contains residues that interact directly with extrahelical oxoG. In contrast to the extensive contacts between the target oxoG and the MutM active site in the LRC N174C, only a few hydrogen bonds appear to be established between MutM and the extrahelical G in the XGC: the side chain of Glu 78 and the carbonyl oxygen of Met 77 with N-2 and Tyr 242 with N-7, none of which is specific for G versus oxoG (Fig. 5A).
Overlaying the 5′-phosphate groups from the extrahelical G and the oxoG in the active site (supplemental Fig. S7) reveals that the center of mass for the deoxyribose relative to the backbone phosphate is quite similar for the two target bases, but the glycosidic bond angle between the base and the sugar (the O-4′-C-1′-N-9-N-3 dihedral angle) differs significantly (162° for the G and 96° for the oxoG). Although the active site oxoG is in the syn conformation (Fig. 6B), the extrahelical G remains close to the anti conformation (Fig. 6A). Conversion of the extrahelical G conformation to the active site oxoG conformation requires (i) pseudo-rotation of the deoxyribose (i.e. downward movement of the C-3′ with respect to the center of the sugar) to relax the sugar to the C-2′-endo conformation, and (ii) rotation of the glycosidic bond to assume the syn conformation, which orients O-6 of the base in the proper direction for interactions with the OCL (supplemental Fig. S7). In summary, the XGC structure represents MutM examining a fully extrahelical but not fully rotated G in an "exo-site" outside a partially assembled active site.
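For readers who wish to reproduce a dihedral measurement of this kind from deposited coordinates, the following is a minimal sketch of a four-atom dihedral calculation reported on a 0-360° scale. It is a generic geometric routine, not the procedure used to obtain the values above; the four points in the example are arbitrary test coordinates rather than atoms from the XGC or LRC models, and the function names are hypothetical.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral_deg(p1, p2, p3, p4):
    """Dihedral angle p1-p2-p3-p4 in degrees, mapped onto the range [0, 360)."""
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    b3 = _sub(p4, p3)
    n1 = _cross(b1, b2)  # normal to the plane of p1, p2, p3
    n2 = _cross(b2, b3)  # normal to the plane of p2, p3, p4
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    return math.degrees(math.atan2(y, x)) % 360.0


# Arbitrary test points: the central bond lies along z, the first atom along +x.
first, second, third = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
for fourth in ((1.0, 0.0, 1.0), (0.0, 1.0, 1.0), (-1.0, 0.0, 1.0), (0.0, -1.0, 1.0)):
    print(fourth, round(dihedral_deg(first, second, third, fourth), 1))
```

The 0-360° range is used so that values are directly comparable with angles quoted on that scale; the sign convention is positive for a clockwise rotation of the fourth atom relative to the first when viewed along the central bond.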
Simulating the Structure of an un-DXLed G Complex-Although the foregoing analysis established that the presence of the N174C DXL in the LRC N174C complex does not perturb the DNA structure at the interface with oxoG, it does have a slight effect on the protein conformation, specifically of Arg 264 and the β10-β11 loop. An un-DXLed XGC is obviously inaccessible experimentally, but we reasoned that because the effects of DXL are modest and localized, we could be confident in the use of equilibrium MD simulations to remove the effect of DXL and provide a close semblance of a physiologic intermediate with G being presented to the MutM active site. A series of MD simulations was performed starting with the crystal structure of the DXLed XGC, and the results were compared with a second series of MD simulations carried out in the absence of the DXL. Corresponding MD simulations were performed for the LRC N174C structure in the presence and absence of the DXL, again as a control. An averaged structure for each system was determined from each run of MD simulations. As expected, the MD-averaged structures for the DXLed XGC and LRC systems remained close to those of the corresponding crystal structures (heavy atom root mean square deviation = 1.292 Å for the XGC and 0.830 Å for the LRC). The global structures generated via the un-DXLed MD simulations also remained close to those of the relevant crystal structures (heavy atom root mean square deviation = 0.800 Å for the XGC and 0.648 Å for the LRC), consistent with the notion that the effect of cross-linking to N174C is restricted to minor structural adjustments.
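The root mean square deviation used for these comparisons can be written out explicitly. The sketch below computes it for two equal-length lists of matched heavy atom coordinates and assumes that the structures have already been superimposed; it performs no alignment itself and is not the routine used in this work. The coordinates in the example are arbitrary, and the function name is hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between matched, pre-superimposed atom lists (Å)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Arbitrary example: every atom displaced by 1 Å along z.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
displaced = [(x, y, z + 1.0) for x, y, z in reference]
print(rmsd(reference, displaced))
```

Because the measure averages squared displacements over all atoms, a value below 1 Å for several thousand heavy atoms is compatible with localized shifts of a few angstroms in a single loop or side chain.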
Close inspection of the MD-averaged XGC structures, in comparison with the crystal structure, reveals that the extruded G of the un-DXLed system (Fig. 7D) is oriented slightly differently from that of the simulated DXLed system (Fig. 7C) and the crystal XGC structure (Fig. 7A). One notable difference lies in the glycosidic bond angle of the extruded G, which is 162° in the crystal XGC structure, 169° for the DXLed MD-averaged structure, and 244° for the un-DXLed MD-averaged structure. As a reference, for the LRC systems with almost identical oxoG conformations for DXLed and non-DXLed crystal structures, the glycosidic bond angle of the oxoG in the active site is 72° for the DXLed MD simulations and 62° for the un-DXLed simulations, respectively, whereas that for the crystal structure is 99° (supplemental Fig. S8). It is particularly noteworthy that previous potential of mean force simulations (14), carried out by us before the present XGC was available, have suggested the existence of an extrahelical intermediate state in G extrusion by MutM (Fig. 7B); the orientation of the extruded G in the potential of mean force simulations is almost identical to that of the G from the present un-DXLed MD simulations, i.e. the glycosidic bond angle for the potential of mean force extrahelical intermediate structure is ~240°, essentially the same as the angle from the present un-DXLed simulations.
A second point of difference between the DXLed and un-DXLed structures lies in the conformation of Arg264 and the β10-β11 loop. As mentioned above, the positions of these elements shift slightly upon DXL introduction, as evidenced by detailed comparison of the crystal structures of (un-DXLed) LRC 1R2Y and (DXLed) LRC N174C. Computational removal of the DXL in LRC N174C restores Arg264 and the β10-β11 loop to conformations nearly identical to those seen in LRC 1R2Y (supplemental Fig. S8). This restoration can be understood as resulting from removal of the unfavorable steric interaction between the DXL moiety and the side chain of Arg264, a residue that is part of the β10-β11 loop. The forces that cause the glycosidic bond adjustment are less obvious from the available data but may also reflect the influence of Arg264, specifically the presence or absence of its interactions with the 5′ and 3′ phosphates of the target G.
Taken together, the MD simulation results support the notion that the presence of the DXL in the XGC does not significantly alter the protein/DNA interface in the XGC, and what minor differences do exist are readily removed through the MD simulations so as to provide an otherwise inaccessible physiologic extrusion intermediate.

DISCUSSION
Trapping Transient Intermediates in Protein/DNA Interaction Systems Using DXL-Intermolecular disulfide cross-linking of protein-DNA complexes, first employed to enable structure determination of the HIV reverse transcriptase-primer-template complex (37), has since been used in a variety of systems to trap fleeting or rarely formed intermediates and to restrict the roaming range of proteins on DNA, thereby increasing the homogeneity of otherwise inhomogeneous complexes (14, 16, 35, 38-41). The process of forming an intermolecular disulfide bond between DNA and protein is typically carried out under conditions in which thiol-disulfide interchange is kinetically reversible and can thereby attain an equilibrium position (supplemental Fig. S1B). Although some thermodynamic driving force in the cross-linking reaction is provided by the high local concentration of the engineered Cys on the protein and thiol tether on the DNA with respect to each other, this effect is modest and therefore insufficient under equilibrating conditions to drive the formation of highly strained species. In this way, DXL differs from kinetically irreversible covalent conjugation methods such as photocross-linking, which involve highly reactive intermediates that can drive the formation of strained species. By virtue of formative equilibration, DXL is predisposed toward trapping structures that are readily accessible energetically and are therefore likely to be substantially represented under physiologic conditions.
FIGURE 7 (legend, partial). ... (14). C, the average structure from MD simulations based on the XGC crystal structure with the cross-link. D, the average structure from MD simulations based on the XGC crystal structure but with the cross-link removed. The coloring scheme is the same as in Fig. 3. Hydrogen bonds are shown as dashed lines.
In the present case, a pair of interacting thiols was engineered into the interface between MutM and an undamaged duplex DNA, with a sufficiently large gap between the thiol partners that the protein, the DNA, or both would have to undergo structural remodeling to form a disulfide linkage. We found that indeed the protein and DNA readily reacted to form a DXL. Inspection of the structure of this complex revealed that the structural remodeling entailed extrusion of the normal G nucleoside 3′ to the N-ethylthio-tethered phosphoramidate (i.e. the G having the tether on its 5′-phosphate); very little structural remodeling appears to have taken place on the part of the protein.
Validation of the structure of this MutM XGC was provided by generating a control lesion recognition complex (LRC N174C ) that differed from the XGC only in having a target oxoG in place of G, a difference of two atoms. The DNA conformation at the site of the extruded target base in the DXLed LRC N174C control structure proved to be nearly identical to that of un-DXLed LRC structures, and only minor perturbations attributable to the cross-link were evident. These perturbations in LRC N174C were found to be readily rectified via MD simulations. Having thus also validated the MD simulations using the LRC structures, we applied the computational rectification procedure to the XGC structure, thereby producing a hybrid crystallographic/computational structure that provides a view of MutM extruding an undamaged G nucleobase from DNA and presenting it for insertion into the extrahelical enzyme active site.
Discrimination against Normal Extrahelical DNA Bases by MutM-Because the structures of LRC N174C and XGC and of the MD-rectified versions thereof differ only in whether they possess oxoG or G, respectively, at the target nucleobase and because the differences are not ascribable to crystal packing factors, we are confident in the conclusion that the observed structural differences result from the two-atom structural divergence in the target nucleobase. Whereas oxoG undergoes full insertion into the MutM active site, causing the OCL to become ordered and engage the Hoogsteen face of the nucleobase, G undergoes only partial insertion, and the OCL fails to become fully ordered.
What are the structural features of G relative to oxoG that cause the undamaged nucleobase to be rejected by the MutM active site? To gain insight into this question, we modeled the structure of a MutM-DNA complex having an extruded G fully inserted into the MutM active site (Fig. 5C); this was done in a straightforward manner by changing oxoG to G in LRC N174C. As expected, the multiple contacts between the OCL and the O-6 of oxoG would appear to be preserved with O-6 of G (Fig. 5, compare C and D), and thus O-6 contacts are unlikely to play a role in discriminating G from oxoG. On the other hand, oxoG and G do differ in their protonation state at N-7, and this is likely to affect the interaction with the main chain carbonyl of Ser220. Specifically, the attractive hydrogen bonding interaction between the Ser220 amide carbonyl and the oxoG N-7-hydrogen would be replaced by a repulsive interaction when presented with the lone pair of electrons on N-7 of G (Fig. 5, compare B with C). Indeed, studies with analogs of oxoG have shown that a protonated N-7 position is essential for lesion recognition by MutM (and hOGG1) (42). Although the OCL could in principle adjust to avoid the repulsive interaction with N-7 of G, such an adjustment would possibly dislodge the contacts to O-6.
In addition, in the unpaired state, oxoG has a strong preference for the syn glycosidic bond conformation, whereas an unpaired G does not prefer syn over anti (43). Full engagement of oxoG in the MutM active site requires the syn conformation, and this factor therefore provides an additional thermodynamic bias toward oxoG and against G recognition by the MutM active site. In summary, a combination of differences in N-7 protonation state and glycosidic torsion angle preferences appears to provide the structural cues that the MutM active site uses to distinguish an oxoG lesion from its closely related normal counterpart G.
In work published elsewhere (14), we have found that mutations in the OCL that locally alter its conformation near the residues involved in oxoG recognition completely abrogate base excision repair of oxoG while leaving intact the ability of the enzyme to catalyze β-lyase degradation of abasic sites. Thus, proper ordering and positioning of the OCL over the substrate appears to be essential for catalysis of base excision by MutM. Presentation of a G residue to the active site also prevents the OCL from attaining its proper position; this, combined with the higher intrinsic glycosidic bond stability of G versus oxoG (42), can be safely assumed to make even a forcibly extrahelical G a poor substrate for MutM.
This work implies that MutM, like hOGG1 (C. M. Crenshaw and G. L. Verdine, unpublished results), has a late checkpoint in its nucleobase extrusion pathway that consists of acceptance or rejection of an extruded substrate nucleoside on the basis of active site complementarity. The existence of this checkpoint prevents MutM from accidentally cleaving G, which is present in vast excess over oxoG. How frequently is this catalytic checkpoint called upon to function? One potential source of extrahelical G residues for MutM to bind is those that have been produced by random thermal fluctuations in the DNA. Normal base pairs in DNA have a lifetime of 5-100 ms as measured by NMR imino proton exchange experiments (44), but this figure reflects the rate of breathing and not that of extrusion. Breathing involves minimal loss of base stacking interactions and little increase in solvent exposure of hydrophobic surface area, whereas spontaneous extrusion entails both of these energetically costly occurrences; therefore, the rate of spontaneous nucleobase extrusion from DNA is orders of magnitude slower than the breathing rate (45). The near barrier-free one-dimensional diffusion of MutM along DNA (3.5 ± 0.6 × 10⁵ base pair²/s) implies that chance encounters of MutM with spontaneously extrahelical nucleosides are extremely rare (36), to the point of being negligible. Another source is G residues whose extrusion from DNA has been accelerated through interaction with MutM. Studies in our laboratories have demonstrated that when MutM binds even undamaged DNA, it induces significant helical bending and de-stacking at, and buckling of, target base pairs (16). This is associated with a lowering of the energy barrier to extrusion. The barrier for oxoG extrusion is lowered by 14 kcal/mol more than that for G, but even for G, MutM-assisted extrusion is much faster than in naked DNA (14).
A calculated energy barrier to MutM-assisted G extrusion of 11 kcal/mol suggests that the enzyme extrudes G at a rate of ~1 × 10⁻⁸ ps⁻¹. This is frequent enough to be deleterious to cells; hence we conclude that the catalytic checkpoint serves a vital biological function.
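The text does not state how the barrier was converted into a rate. As a rough consistency check only, the sketch below applies a simple Arrhenius-type estimate, assuming a pre-exponential factor of 1 ps⁻¹ and a temperature of 298.15 K; both values, and the function name `arrhenius_rate`, are assumptions introduced here for illustration and are not taken from the paper.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def arrhenius_rate(barrier_kcal_per_mol, temperature_k=298.15, prefactor_per_ps=1.0):
    """Rate in ps^-1 from k = A * exp(-Ea / (R * T)).

    The prefactor A = 1 ps^-1 and T = 298.15 K are illustrative
    assumptions, not values reported in the text.
    """
    return prefactor_per_ps * math.exp(-barrier_kcal_per_mol / (R_KCAL * temperature_k))

rate_per_ps = arrhenius_rate(11.0)   # barrier quoted in the text
rate_per_s = rate_per_ps * 1e12      # 1 s = 1e12 ps
```

Under these assumptions an 11 kcal/mol barrier gives a rate on the order of 10⁻⁸ ps⁻¹, that is, roughly 10⁴ events per second. A different prefactor or temperature would shift the estimate accordingly.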
Finally, structures of DNA glycosylases having partially extruded undamaged bases have been observed previously with hOGG1 (35, 46) and uracil DNA glycosylase (47). As observed for the MutM XGC, a common feature of these structures is the small number of contacts between the target nucleobase and the enzyme exo-site. Therefore, stalling partially extruded normal nucleobases in exo-sites might be a general mechanism of lesion recognition at the late stage of base extrusion for DNA glycosylases.
Implications for the MutM Base Extrusion Pathway-The present structure of MutM attempting to present an undamaged nucleobase to the active site provides a representation of a formerly cryptic state along the base extrusion pathway of MutM; as such, it complements the previously reported structures of MutM-DNA complexes at other stages of lesion recognition and base extrusion (IC, EC, and LRC; Fig. 1B) (14-16). As shown in the previous computational studies, the base extrusion pathway of MutM involves three steps: (i) extrusion of the nucleobase from the intrahelical base pair to a fully extrahelical position through the minor groove, (ii) rotation of the nucleobase to a syn conformation, and (iii) further extrusion of the base to settle into the enzyme active site. Compared with the fully extruded oxoG in the LRC, the G in the XGC has only completed the first step of the extrusion process but has not completed the second or third step. Using MD simulations, the XGC can be readily relaxed into a structure very close to that of a previously calculated local free energy minimum state (Z = 1.0) (14) on the two-dimensional free energy surface for the extrusion of G assisted by MutM. Therefore, consistent with the computational model of base extrusion by MutM, the XGC structure suggests that complete rotation of the nucleobase is a late event following the extrusion of the base to a fully extrahelical position. Furthermore, ordering of the OCL accompanies the extrusion of the base, and the OCL does not assume a single conformation until full extrusion of the target base occurs.
Model for Overall Damage Recognition by MutM-The complete set of MutM-DNA structures along with the currently available biochemical and biophysical data suggest the following model for lesion recognition and discrimination against normal nucleobases by MutM: MutM scans along the DNA rapidly in a highly redundant mode by one-dimensional Brownian motion. During this process, the protein interrogates the duplex, bending it and buckling target base pairs more or less indiscriminately. Upon encountering an intrahelical oxoG, the free energy barrier to extrusion of the target lesion is selectively lowered as the result of steric interactions between oxoG and the bent DNA backbone induced by interactions with MutM, and the lesion therefore is extruded in preference to undamaged bases. Once extruded, oxoG rotates into the syn conformation and is captured and stabilized by the OCL in the active site. Whenever MutM accidentally promotes the extrusion of G, the undamaged nucleoside reaches the exo-site but fails to enter the active site fully or to cause the OCL to lock down over it. Absent proper assembly of a functional catalytic apparatus, the partially extruded G fails to undergo base excision and quickly collapses back into the intrahelical state. Because the other three undamaged nucleobases have very different chemical compositions from oxoG, discrimination against these nucleobases by MutM is not expected to be as challenging as with G. In conclusion, through a combination of intrahelical and extrahelical lesion recognition mechanisms, MutM efficiently locates the rare oxoG sites and avoids being overwhelmed by the more abundant normal nucleobases in the genome.