NH2-terminal proline acts as a nucleophile in the glycosylase/AP-lyase reaction catalyzed by Escherichia coli formamidopyrimidine-DNA glycosylase (Fpg) protein.

Formamidopyrimidine-DNA glycosylase (Fpg) protein plays a prominent role in the repair of oxidatively damaged DNA in Escherichia coli. The protein possesses three enzymatic activities: hydrolysis of the N-glycosidic bond (DNA glycosylase), β-elimination (AP lyase), and δ-elimination. These functions act in a concerted manner to excise oxidized deoxynucleosides from duplex DNA. Schiff base formation between the enzyme and substrate has been demonstrated (Tchou, J., and Grollman, A. P. (1995) J. Biol. Chem. 270, 11671-11677); this protein-DNA complex can be trapped by reduction with sodium borohydride. By digesting the stable, covalently linked intermediate with proteases and determining the accurate mass of the products by negative electrospray ionization-mass spectrometry, we show that the N-terminal proline of Fpg protein is linked to DNA and, therefore, is identified as the nucleophile that initiates the catalytic excision of oxidized bases from DNA. This experimental approach may be applicable to the analysis of other protein-DNA complexes.

Fpg protein, a DNA base excision repair enzyme with N-glycosylase and AP lyase activities (1-3), efficiently removes 8-oxoguanine (8-oxoGua) and formamidopyrimidines from oxidatively damaged DNA (4-7). We have shown previously (8) that this reaction involves an imino-enzyme-substrate (Schiff base) intermediate and that the amino group involved is located within a 72-amino acid fragment of Fpg protein containing the N terminus. The four-cysteine zinc finger motif located near the C terminus (9, 10) is utilized in binding oxidatively damaged DNA (10, 11). The 8-oxo moiety is a critical structural determinant by which Fpg protein recognizes duplex DNA substrates containing 8-oxoguanine (5).
It has been proposed that Schiff bases participate in the catalytic action of AP endonucleases (12) and DNA glycosylases that possess AP lyase activity (13-16). Sodium borohydride and cyanoborohydride have been used to trap Schiff base intermediates as covalently linked DNA-protein complexes in reactions catalyzed by Fpg protein (8, 16) and by bacteriophage T4 endonuclease V, Micrococcus luteus UV endonuclease, and Escherichia coli endonuclease III (15, 16). A mechanism involving nucleophilic attack on C1′ of the modified deoxynucleoside targeted for excision (8, 15-17) has been proposed to explain the catalytic action of these enzymes. Nucleophilic attack, facilitated by protonation of the base, effects cleavage of the N-glycosidic bond. The AP lyase activity of Fpg protein is a concerted reaction involving β- and δ-elimination reactions (2, 18, 19), producing a gap in one strand of the duplex demarcated by 3′- and 5′-phosphate termini (2, 18).
Several nucleophiles capable of forming Schiff base intermediates (20) are located within the 72-residue N-terminal fragment of Fpg protein. Basic nitrogen functional groups are found in the N-terminal proline, the free amino group of Lys-56, and eight arginine residues. Following enzymatic digestion of the covalently linked complex, we used electrospray ionization-mass spectrometry (ESI-MS) to identify unambiguously the residual peptide-DNA fragment, leading to the conclusion that the N-terminal proline residue links the peptide to DNA. To our knowledge, Schiff base formation involving nucleophilic attack by secondary amines has not been reported for biological systems. The experimental approach used in this study is applicable to the structural analysis of protein-DNA complexes.

EXPERIMENTAL PROCEDURES
Oligodeoxynucleotides-Oligodeoxynucleotides containing 8-oxo-7,8-dihydro-2′-deoxyguanosine (8-oxo-dG) (21) and a carba analogue of 8-oxo-dG (Fig. 1) were synthesized on an Applied Biosystems model 394 DNA Synthesizer using standard phosphoramidite chemistry. 5′-Dimethoxytrityl-protected phosphoramidite derivatives were used to prepare the 23-mer, 5′-CTCTCCCTTCXCTCCTTTCCTCT-3′ (X = modified nucleotide). 2′-Deoxycytidine was positioned opposite the modified base in the complementary strand. The normal oligomer was deprotected and removed from the support by treatment with 28% ammonia at 55°C for 16 h. For oligomers containing 8-oxo-dG, β-mercaptoethanol (final concentration, 0.1 M) was added prior to the deprotection reaction. The resin was removed by filtration, and the solution was evaporated to dryness at room temperature in a SpeedVac (Savant). 5′-DMT-protected oligomers were purified by reverse phase HPLC (Waters) on a semipreparative PRP-1 column (Hamilton) and eluted over 20 min with 16-33% acetonitrile, 0.1 M triethylammonium acetate, pH 7.0, at a flow rate of 2 ml/min. Fractions containing the 5′-DMT-protected oligomer were combined and lyophilized; the 5′-DMT moiety was removed by treatment for 30 min with 80% acetic acid. Oligomers were further purified on a PRP-1 column, this time eluting over 15 min with 0-12% acetonitrile, 0.1 M triethylammonium acetate, pH 7.0, at a flow rate of 2 ml/min. The fraction containing the desired product was evaporated to dryness, dissolved in 1 ml of water, and stored frozen at −20°C.
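The duplex design described above (a 23-mer carrying the modified nucleotide X, with deoxycytidine opposite it) can be illustrated with a short Python sketch. This is an illustration only, not part of the published procedure; the helper name `complementary_strand` and the pairing table are assumptions introduced here, with X paired to C as stated in the text.

```python
# Watson-Crick partners; "X" (the modified nucleotide) is paired with C,
# as stated for the complementary strand in the text.
PAIRING = {"A": "T", "T": "A", "G": "C", "C": "G", "X": "C"}

MODIFIED_STRAND = "CTCTCCCTTCXCTCCTTTCCTCT"  # 23-mer from the text, written 5' to 3'


def complementary_strand(strand: str) -> str:
    """Return the complementary strand, written 5' to 3' (hypothetical helper)."""
    return "".join(PAIRING[base] for base in reversed(strand))


complement = complementary_strand(MODIFIED_STRAND)

# 1-based position of the lesion and of its partner in the complement.
x_position = MODIFIED_STRAND.index("X") + 1
partner_position = len(MODIFIED_STRAND) - x_position + 1

print(len(MODIFIED_STRAND), x_position)
print(complement, complement[partner_position - 1])
```

The sketch only checks sequence bookkeeping (length, lesion position, and the base opposite the lesion); it says nothing about synthesis or purification.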
Enzymes-The general procedure for the purification of E. coli Fpg protein has been described elsewhere (10) and was used with minor modifications. Briefly, E. coli JM109 harboring the pKK233-3 overexpression plasmid with the fpg insert were grown at 37°C to an A600 of 0.7. Cells were induced by adding isopropylthio-β-galactoside (1 mM), incubated at 37°C for an additional 3 h, and harvested by centrifugation.
* This work was supported by Grants CA17395 and ES04068 from the National Institutes of Health. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
‡ To whom correspondence should be addressed.
Trapping the Schiff Base Intermediate-An oligonucleotide duplex containing a single 8-oxo-dG residue positioned opposite dC was mixed with Fpg protein in the presence of freshly dissolved NaBH4. The reaction mixture included 1 µM oligodeoxynucleotide, 25 mM potassium phosphate, pH 6.8, 100 mM NaCl, 100 mM NaBH4, and varying amounts of Fpg protein, added last to minimize substrate cleavage. For the Edman degradation experiments, a 1:2 ratio of Fpg:duplex was used in a final volume of 10-100 µl. For preparative scale synthesis, the ratio of Fpg:duplex was 5:1 in a reaction volume of 10 ml. To reduce foaming and protein denaturation, the preparative reaction mixture was allowed to equilibrate for 30 min before oligodeoxynucleotide and Fpg protein were added. All reactions were incubated for 1 h at 37°C.
Tryptic Digestion of the Covalent Complex-For analytical purposes, an aliquot (10 µl) of the reaction mixture was removed, dried in a SpeedVac, dissolved in 10 µl of 8 M urea, and heated for 10 min at 95°C. Tris-HCl, pH 7.5, urea, and trypsin were added, bringing the solution to a final concentration of 2 M urea, 20 mM Tris-HCl, and 4% w/w trypsin/Fpg. The reaction mixture was incubated for 1 h at 37°C and then heated for 10 min at 95°C to terminate the reaction. In preparative scale reactions, the NaBH4 reaction mixture was loaded on Centricon-10 concentration devices (Amicon); the volume was reduced to 200-300 µl, and the sample was washed three times with 2 ml of water. After the final wash, the sample was dried, dissolved in 50 µl of 8 M urea, and heated for 10 min at 95°C. Tris-HCl, pH 7.5, and urea were added to the solution; the final concentration was 2 M urea and 20 mM Tris-HCl. Trypsin was added (4% w/w trypsin/Fpg, assuming all of the Fpg protein had been recovered) following the Centricon concentration step. After 1 h incubation at 37°C, the sample was desalted by washing three times with 2 ml of water in a Centricon-3 device and dried in a SpeedVac.
Pronase Digestion of the Covalent Intermediate-After NaBH4 reduction of the protein-DNA complex, glucose was added (final concentration, 400 mM), and the solution was incubated at 37°C for 3 h. Fpg (20 ng for the analytical reaction and 20 µg for the preparative reaction) was added after 2 h to cleave unreacted substrate. In the analytical reaction, Pronase E (final concentration, 2 mg/ml) was added directly to the reaction mixture. In the preparative reaction, the mixture was loaded onto a Q Sepharose Fast Flow column (10-ml bed volume) and eluted in a 100-ml gradient of 200-1000 mM NaCl with 20 mM Tris-HCl, pH 7.5. Protein samples were analyzed by SDS-PAGE. Peaks containing cross-linked material, which eluted at ~650 mM NaCl, were pooled, concentrated by evaporation to ~500 µl, and dialyzed twice against 1 liter of 10 mM Tris-HCl, pH 7.5, 10 mM Na-EDTA. Pronase E was added to a final concentration of 1 mg/ml. The reaction mixture was incubated at 37°C for 24 h, desalted using a SepPak reverse phase cartridge (Waters), and dried in a SpeedVac.
PAGE and Reverse Phase HPLC-Following tryptic or Pronase digestion, the samples were dried, dissolved in 10 µl of 8 M urea, heated for 5 min at 95°C, and loaded onto a 0.4-mm 20% polyacrylamide gel containing 8 M urea. Electrophoresis was conducted at 50 V/cm, maintaining constant voltage, with xylene cyanol loaded in a separate lane as a mobility marker. After the dye migrated approximately 65 cm, oligonucleotide-containing material was localized by UV shadowing; bands were cut from the gel, crushed with a spatula, and extracted twice with 1 ml of 1 mM Tris-HCl, pH 8.0, 0.1 mM EDTA for 2 h and a third time overnight. Extracts were concentrated to 1 ml, loaded on a Novapak C8 reverse phase column (Waters), and eluted with 0-15% acetonitrile, 0.1 M triethylammonium acetate buffer, pH 7.0, at 1 ml/min. Fractions were collected in Eppendorf tubes, dried in vacuo, and stored frozen.
Electrospray Mass Spectrometry-Samples were diluted to an approximate concentration of 20-50 µM using a solution of 60% acetonitrile/water containing 1% triethylamine and then injected into the electrospray source on a TRIO-2000 mass spectrometer (Micromass). The source was supplied with a flow of solvent, 60% acetonitrile/water, at a rate of 12 µl/min by means of a syringe pump (Harvard Apparatus). The mass spectrometer was operated in the negative ion spray mode with the probe voltage set to −3.3 kV. Nitrogen was used as the drying gas (250 liters/h) and also as the nebulization gas (15 liters/h). Approximately eight scans were averaged; the base line was adjusted and filtered digitally to produce each mass spectrum. Protein Analysis Worksheet (PAWS, v6.1.1) was used to identify amino acid sequences in the Fpg protein with masses equal to the observed mass of the peptide portion of the complex.
Edman Degradation of the Covalent Intermediate-The NaBH4 reaction mixture was added to an equal amount of SDS reducing buffer (62.5 mM Tris-HCl, pH 6.8, 2% w/v SDS, 5% v/v β-mercaptoethanol, 10% v/v glycerol, and 0.05% w/v bromphenol blue), heated for 5 min at 95°C, and then subjected to 12% discontinuous SDS-PAGE (22) on a Mini-PROTEAN II Dual Slab Cell (Bio-Rad). Electrophoresis was conducted in a cold room at 300 V. Bands were transferred onto an Immobilon P membrane (Waters) using the Mini Trans-Blot electrophoretic transfer cell (Bio-Rad). The membrane was stained briefly with Coomassie Blue and destained with methanol:acetic acid:water 4:1:5 (v/v/v). Bands were cut, washed 5 times with 10% methanol and 10 times with deionized water, and then subjected to Edman degradation (23). Sequence analysis was conducted on an Applied Biosystems 475A Protein Sequencer.

RESULTS
Formation of the Oligodeoxynucleotide-Peptide Complex-When a 5-fold excess of Fpg protein was added to the reaction mixture, 90-95% of the 8-oxoguanine-modified oligonucleotide was covalently linked to Fpg in 1 h (Fig. 2A, lane 2). When the enzyme:substrate ratio was 1:2 (used in preparing a covalent complex for Edman degradation), approximately 80% of the protein was cross-linked in 1 h (Fig. 2B, lanes 1 and 2). No cleavage products were observed.
To exclude the possibility that a covalent bond was formed during the reduction step at a position other than C1′ of 8-oxo-dG, a duplex oligodeoxynucleotide containing carba-8-oxo-dG (Fig. 1) was tested as a substrate for the cross-linking reaction. In this analogue, a methylene group replaces the ring oxygen in the deoxyribose moiety of 8-oxo-dG; consequently, the glycosidic bond is stable to nucleophilic attack at C1′. Duplexes containing the carbacyclic analogue bind Fpg protein tightly (Kd = 22.2 nM for carba-8-oxo-dG:dC and 8.9 nM for 8-oxo-dG:dC), but the duplex is not cleaved after 24 h incubation with a 20-fold excess of enzyme. As shown in Fig. 2A (lane 4), this modified oligomer does not generate a cross-linked product under conditions where more than 90% of the 8-oxo-dG-containing oligomer is covalently bound to Fpg.
The proteolytic digestion of the cross-linked complex is shown in Fig. 3. Following trypsin digestion for 1 h, the reduced enzyme-substrate complex was converted to lower molecular weight products (Fig. 3A, lane 3). Twenty four-hour incubation was required to completely digest this complex with Pronase E (Fig. 3B, lane 3).
Isolation of the Cross-linked Complex-Following tryptic digestion, products of a preparative scale reaction were subjected to denaturing gel electrophoresis. Two clearly separated bands (data not shown) were excised and purified by reverse phase HPLC. The elution pattern of the lower mobility product (A) contained a single large peak with a retention time of 42 min (Fig. 4). The UV absorption spectrum of peak A (Fig. 4, inset) resembles that of a standard oligomer containing 8-oxo-dG (B); however, the shoulder absorbing at 300-320 nm, reflecting the presence of 8-oxo-dG, is not present in the spectrum of the reaction product. The higher mobility band co-migrated with the unmodified complementary strand during PAGE and HPLC.
Electrospray Mass Spectrometer Analysis-The negative ion ESI mass spectrum of product A (Fig. 5) shows an excellent signal:noise ratio and reveals the presence of two components. The major product is distributed into nine ion peaks (A series) with charged states ranging from −7 to −15; the mass of this product is 7786.4 ± 3.7 Da. A series of peaks representing a minor product (B series) also is evident in the mass spectrum. The measured molecular mass of this substance is 7944.1 ± 4.0 Da.
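To make the relationship between a charge-state series and a single molecular mass concrete, the following Python sketch computes the expected m/z values for the major product across the nine charge states and shows how a charge can be inferred from two adjacent peaks. It is a minimal sketch under stated assumptions: ions are taken to be simple deprotonated species [M − zH]^z− with no counter-ion adducts, the function names are hypothetical, and the peak positions are computed from the reported mass rather than read from Fig. 5.

```python
HYDROGEN_MASS = 1.00794  # average mass of H (Da); used to approximate loss of a proton


def mz_negative(neutral_mass: float, charge: int) -> float:
    """m/z of the [M - zH]^z- ion for neutral mass M and charge magnitude z."""
    return (neutral_mass - charge * HYDROGEN_MASS) / charge


def neutral_mass_from_peak(mz: float, charge: int) -> float:
    """Invert mz_negative: recover the neutral mass from one peak."""
    return charge * (mz + HYDROGEN_MASS)


def charge_from_adjacent(mz_lower_charge: float, mz_higher_charge: float) -> int:
    """Charge magnitude z of the first peak, given the neighbouring peak at z + 1.

    Follows from m/z(z) - m/z(z+1) = M / (z (z + 1)) and
    m/z(z+1) + m_H = M / (z + 1).
    """
    return round((mz_higher_charge + HYDROGEN_MASS) / (mz_lower_charge - mz_higher_charge))


PRODUCT_A_MASS = 7786.4  # Da, major product as reported in the text

# Nine charge states, -7 through -15.
series = {z: mz_negative(PRODUCT_A_MASS, z) for z in range(7, 16)}

for z, mz in series.items():
    print(f"z = -{z}: m/z = {mz:.1f}")
```

In practice each peak in the series gives an independent estimate of the neutral mass, and averaging those estimates is one common way to arrive at a value with an uncertainty of the kind quoted above.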
Mass data were analyzed by PAWS to determine the mass of those amino acid sequences within Fpg protein that correspond to the measured mass of the peptide portion of the complex. Since the 8-oxoGua moiety is cleaved by Fpg protein, the mass of the depurinated oligomer (6631.3 Da) must be used to calculate the mass of residual Fpg protein sequences covalently bound to the oligomer. Thus, the mass difference between the apurinic oligomer and products A and B, corresponding to the mass of the residual peptide fragment, is 1155.1 and 1312.8 Da, respectively. Mass calculations from the PAWS program returned a total of 11 sequences in the Fpg protein (268 amino acids) for the major product with a calculated mass of 1156.1 Da (Table I). This value is 1 Da higher than the measured mass, since the program calculates the mass of the neutral peptide molecule rather than the mass of the peptide cleavage product obtained from the complex. Only one of these sequences is a tryptic fragment. It is located at the N terminus of Fpg and contains the first 10 amino acids, PELPEVETSR. The 10 remaining sequences are not expected to be products of trypsin cleavage (Table II). For the minor product, the program also returned multiple candidate sequences (Table I); however, only two of those represent trypsin-generated fragments (Table II). One sequence corresponds to the first 11 amino acids of the Fpg sequence, PELPEVETSRR, with a calculated mass of 1312.4 Da (measured mass 1312.8 Da). The second fragment (Thr-151 through Lys-161) is located within the internal sequence of the protein and has a mass of 1315.6 Da.
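The kind of search described above can be sketched as a sliding-window scan over a protein sequence. The sketch below is not the PAWS program and makes no claim about how PAWS works internally; it assumes standard average residue masses, uses only the 11 N-terminal residues quoted in the text (the full Fpg sequence is not reproduced here), applies an illustrative tolerance of 0.5 Da, and uses a simplified trypsin rule (cleavage after Lys or Arg, ignoring the proline exception). All function names are hypothetical.

```python
# Standard average residue masses (Da); water is added once per peptide.
AVG_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153


def peptide_mass(seq: str) -> float:
    """Average mass of the neutral peptide."""
    return sum(AVG_RESIDUE_MASS[aa] for aa in seq) + WATER


def find_subsequences(protein: str, target: float, tol: float):
    """All contiguous subsequences whose neutral mass is within tol of target.

    Returns (start, end, sequence) tuples with 1-based, inclusive positions.
    """
    hits = []
    for start in range(len(protein)):
        running = WATER
        for end in range(start, len(protein)):
            running += AVG_RESIDUE_MASS[protein[end]]
            if running > target + tol:
                break  # masses only grow as the window is extended
            if abs(running - target) <= tol:
                hits.append((start + 1, end + 1, protein[start:end + 1]))
    return hits


def is_tryptic(protein: str, start: int, end: int) -> bool:
    """Simplified rule: both ends lie at a terminus or follow/are Lys or Arg."""
    starts_ok = start == 1 or protein[start - 2] in "KR"
    ends_ok = end == len(protein) or protein[end - 1] in "KR"
    return starts_ok and ends_ok


N_TERMINAL_RESIDUES = "PELPEVETSRR"  # first 11 residues of Fpg, from the text
TARGET_MASS = 1156.1                 # calculated neutral mass quoted for the major product

hits = find_subsequences(N_TERMINAL_RESIDUES, TARGET_MASS, tol=0.5)
for start, end, seq in hits:
    print(start, end, seq, is_tryptic(N_TERMINAL_RESIDUES, start, end))
```

Run against a full protein sequence, a scan like this returns every window that matches by mass alone, which is why the additional tryptic-fragment filter is needed to narrow the candidates.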
A similar experiment was conducted using Pronase E. The ESI mass spectrum of the product eluted from the HPLC column showed a good signal:noise ratio, yielding a molecular mass of 7213.2 ± 1.4 Da (Fig. 6). Signals from other components present in the spectrum are insufficient to provide accurate mass assignments. Analysis of the major product by PAWS returned a total of 15 sequences in Fpg protein with a calculated mass of 582.9 ± 1.4 Da for the peptide component (Table I). One of these sequences contains the first five amino acids, PELPE, in the N terminus of the enzyme (calculated molecular mass 582.6 Da; measured mass 581.9 Da). One of the other sequences reported (Leu-59 through Pro-63) is found within the first 72 amino acids of the N terminus; however, none of the amino acids in this sequence, LLELP, contains a functional group capable of nucleophilic attack. The remaining peptide fragments are located internally. There is no overlap (full or partial) with the internal trypsin-generated fragment, Thr-151 to Lys-161, described above.
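The mass bookkeeping behind the three products can be checked with a few lines of Python. This is a plain arithmetic sketch using only the masses reported in the text; the dictionary labels and the function name are introduced here for illustration.

```python
APURINIC_OLIGOMER_DA = 6631.3  # depurinated oligomer, as reported in the text

# Measured masses of the peptide-oligonucleotide complexes (Da).
measured_complex_mass = {
    "tryptic major product (A series)": 7786.4,
    "tryptic minor product (B series)": 7944.1,
    "Pronase E product": 7213.2,
}


def peptide_portion(complex_mass: float, oligomer_mass: float = APURINIC_OLIGOMER_DA) -> float:
    """Mass attributed to the residual peptide, rounded to 0.1 Da."""
    return round(complex_mass - oligomer_mass, 1)


residual_peptide_mass = {
    label: peptide_portion(mass) for label, mass in measured_complex_mass.items()
}

for label, mass in residual_peptide_mass.items():
    print(f"{label}: {mass} Da")
```

The differences reproduce the values of 1155.1, 1312.8, and 581.9 Da quoted for the peptide portions.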
Edman Degradation of the Covalent Complex-The reduced covalent complex was fully resistant to 10 cycles of Edman degradation (Table III), indicating that its N terminus was modified. A control reaction mixture containing NaBH4 but lacking the oligonucleotide substrate was subjected to similar treatment. This reaction did not lead to blockage of the N terminus, and the eight amino acids in the N-terminal sequence of Fpg protein were identified.

DISCUSSION
In this study, negative electrospray ionization-mass spectrometry was used to determine the mass of a peptide-oligodeoxynucleotide complex created by reductive cross-linking of Fpg protein to a duplex oligodeoxynucleotide containing a single 8-oxo-dG residue. The complex was partially digested by treatment with trypsin or Pronase E and denatured, leaving a residual peptide bound to a single strand oligomer. Accurate masses for the peptide component, determined from the ESI-MS data, were used to identify amino acid sequences within Fpg. A single peptide was identified as the major product of the tryptic digest; its calculated mass was consistent with the measured mass of a tryptic fragment forming the peptide portion of the complex. This peptide, PELPEVETSR, representing the N terminus, contains 10 amino acids. The mass of a minor product also was determined from the same digest. Fpg contained two sequences consistent with the data. The first, PELPEVETSRR, includes the sequence of the major product with an additional Arg present at the C terminus. This peptide may be the result of incomplete tryptic digestion. The second sequence, Thr-151 to Lys-161, is located internally and is not part of the 72-amino acid N-terminal fragment of Fpg protein to which the active site has been mapped (8). The central role of the N-terminal sequence was confirmed by analysis of a peptide-DNA complex formed by digestion with Pronase E.
Several sequences corresponding to the measured mass were identified (Table I); however, only two of these were located within 72 amino acids of the N terminus. One sequence, PELPE, represents the first five amino acids of the N terminus and completely overlaps both trypsin-generated fragments. The second sequence (LLELP, Leu-59 to Pro-63) does not contain an amino group capable of Schiff base formation or share common elements with any of the trypsin-generated fragments. Other sequences compatible with the mass data are found in the internal sequence of Fpg protein and have no overlap with the trypsin-generated fragments. Most of these cannot form Schiff bases due to lack of a suitable functional group.
The three peptide-DNA complexes identified by ESI-MS have common sequence elements in the peptide portion of the molecule; in each case, N-terminal residues of Fpg protein, beginning with Pro-1, are covalently bound to the oligonucleotide. This N-terminal sequence is conserved in all known Fpg proteins (24). The ESI-MS technique does not identify the precise amino acid involved in the formation of the complex; however, the nature of the peptide sequences identified by this method indicates that only the N-terminal proline is capable of reacting with 8-oxo-dG in the substrate. Strong support for participation of the N terminus as an active site residue comes from experiments in which the covalently trapped intermediate was subjected to Edman degradation. Formation of the complex completely blocked this reaction, indicating that the N terminus was modified. It seems unlikely that steric inhibition by the oligonucleotide prevented access of low molecular weight reagents used in the Edman procedure to the N terminus. Combined with the ESI-MS data, resistance to Edman degradation permits us to identify unambiguously the N-terminal proline as the nucleophile involved in the base elimination reaction catalyzed by Fpg protein.
A mechanistic scheme illustrating the involvement of Pro-1 in the glycosylase and AP lyase activities of Fpg-catalyzed excision is shown in Fig. 7. The nitrogen atom of proline attacks at C1′ and, assisted by protonation at O6 of 8-oxoGua (5), cleaves the glycosidic bond leading to expulsion of the damaged base. Removal of a proton from the positively charged proline nitrogen, mediated by solvent or the enzyme, is expected to follow. Protonation of the sugar oxygen leads subsequently to opening of the deoxyribose ring and Schiff base formation. The positively charged nitrogen can facilitate hydrogen abstraction at C2′, and β-elimination (syn) of the C3′ phosphate leads to DNA strand breakage. At this point the complex may undergo elimination of the C5′-phosphate, mechanistically similar and vinylogous to the one already described. In the final step, the enzyme is regenerated by attack of hydroxyl ion on C1′ with release of 4-oxo-2-pentenal.
The N-terminal residue of another DNA glycosylase/AP lyase, bacteriophage T4 endonuclease V, has been shown to be involved in Schiff base formation (14, 25, 26). The reaction mechanism proposed above for Fpg is similar to that suggested for endonuclease V (14) and other DNA glycosylase/AP lyases (16) with respect to Schiff base formation. In endonuclease V, the nucleophile is the α-NH2 group of Thr-1; when this residue is replaced by proline, the activity of the enzyme is reduced to negligible values (25). The facts that Pro-1 is conserved in all Fpg proteins (24) and that replacement with glycine abolishes cleavage but not binding (8) indicate that proline must play an important role in Fpg catalysis.
The finding that the N-terminal proline of Fpg protein acts as a nucleophile is unique. As a strong base, proline can act as a general acid/base catalyst and has been shown to abstract a proton from a substrate molecule bound to 4-oxalocrotonate tautomerase (27). Our study presents evidence for the involvement of an N-terminal proline in nucleophilic catalysis with formation of a Schiff base. Formation of an imine with a positive charge on the nitrogen atom may be rendered more favorable by coupling to a charge-delocalizing system, as in pyridoxal-containing enzymes (20), or stabilized by some other means. In this case, the positive charge on nitrogen undoubtedly facilitates AP lyase activity (β- and δ-elimination) through abstraction of the C2′- and C4′-hydrogens. In addition, introduction of a positive charge between the highly negatively charged phosphodiester backbone and the negatively charged N-terminal region, which contains three glutamic acid residues in the first seven amino acids, may play a role in spatial organization of the active center or in stabilizing the intermediate complex.
The ESI-MS approach used in this study to identify amino acid residues involved in nucleophilic attack on a DNA substrate may be used to analyze other DNA glycosylases and DNA-protein complexes and may, in some cases, replace more time-consuming methods of identifying the peptide portions of these complexes (28). A stable covalent bond, linking protein to DNA, is required for ESI-MS analysis; two new ionization techniques, electrospray and matrix-assisted laser desorption, provide accurate mass assignments for large complexes of this type. Negative ion ESI is especially useful for molecules containing multiple negative charges and was used in this case because it was perceived that the charged state of the covalent complex would be controlled by the phosphates of the oligodeoxynucleotide. With pure samples, ESI-MS data yield mass accuracies of 0.01%; our measurements of more complex samples in low concentrations yielded results in the range 0.02-0.05%. Enzymatic degradation of protein and/or DNA was not required for this analysis; however, partial degradation by proteases was helpful in defining the position of the covalent bond and in restricting the number of possible amino acid sequences.
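The quoted accuracy range can be related to the reported masses with a short calculation. The sketch below assumes that the ± values given with each measured mass are the uncertainties behind the stated percentages, which the text does not say explicitly; the function name and labels are illustrative.

```python
# (measured mass, reported uncertainty), both in Da, as given in the text.
measurements = {
    "tryptic major product": (7786.4, 3.7),
    "tryptic minor product": (7944.1, 4.0),
    "Pronase E product": (7213.2, 1.4),
}


def relative_uncertainty_percent(mass: float, uncertainty: float) -> float:
    """Uncertainty expressed as a percentage of the measured mass."""
    return 100.0 * uncertainty / mass


relative = {
    label: relative_uncertainty_percent(mass, err)
    for label, (mass, err) in measurements.items()
}

for label, pct in relative.items():
    print(f"{label}: {pct:.2f}%")
```

Under that assumption, the three measurements fall at roughly 0.02% to 0.05% of the measured mass, in line with the range stated above.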