The domains of mammalian base excision repair enzyme N-methylpurine-DNA glycosylase. Interaction, conformational change, and role in DNA binding and damage recognition.

Repair of a variety of alkylated base adducts in DNA is initiated by their removal by N-methylpurine-DNA glycosylase. The 31-kDa mouse N-methylpurine-DNA glycosylase, derived by deletion of 48 amino acid residues from the 333-residue wild type protein without loss of activity, was analyzed for the presence of protease-resistant domains with specific roles in substrate binding and catalysis. Increasing proteolysis with trypsin generated first a 29-kDa polypeptide by removal of 42 amino-terminal residues, followed by production of 8-, 6-, and 13-kDa fragments with defined, nonoverlapping boundaries. The 8- and 13-kDa domains include the amino and carboxyl termini, respectively. Based on DNA-affinity chromatography and the protease protection assay, it appears that the 6- and 13-kDa domains are necessary for nontarget DNA binding and that the 8-kDa domain, in cooperation with the other two domains, participates in recognition of damaged bases. Furthermore, chemical cross-linking studies indicated that, in the presence of substrate DNA, the 8- and 6-kDa domains undergo conformational changes reflected by both protection from proteolysis and reduced availability of cysteine residues for the thiol-exchange reaction.

These diverse DNA adducts are repaired by the base excision repair pathway, whose first step, release of the free modified base, is catalyzed by N-methylpurine-DNA glycosylase (MPG¹; Refs. 18-23). The resulting abasic (AP) sites are then repaired in a series of steps starting with endonucleolytic cleavage at AP sites by AP-endonuclease (24). Only one MPG, homologous in humans and rodents, has been extensively characterized as a result of the cloning of its cDNA by several laboratories (18,28,29). This protein resembles the inducible AlkA protein of Escherichia coli in its broad substrate range. E. coli has a second MPG, the constitutive Tag protein, which is specific for 3-alkyladenine (reviewed in Refs. 25 and 26).
The mammalian MPGs, like many other DNA glycosylases, are monomeric in active form, do not have an absolute requirement for a cofactor for activity, and are active when expressed in E. coli (27). These features provided the strategy for cloning the cDNAs of human and rodent MPGs by phenotypic rescue in E. coli (18,28,29). Human and mouse MPGs share about 83% identity in their amino acid sequences. These MPGs have been purified to homogeneity and characterized biochemically (27,30). Moreover, the mouse MPG (and particularly the amino-terminal half of the mouse MPG in a mouse/human hybrid protein) repaired 7-methylguanine and 3-methylguanine at a higher rate than did human MPG or the amino-terminal half of human MPG in a human/mouse hybrid protein (when the levels of both proteins were adjusted to release 3-methyladenine at comparable rates). This result indicates that the amino-terminal half of MPG is more critical for the recognition of 3-methylguanine and 7-methylguanine (31). We also observed that deletion of the amino-terminal residues led to greater solubility of the overexpressed protein, and that deletion of 44 or 48 residues apparently did not affect enzyme activity for 3-methyladenine release.² Results from the chimeric mouse/human MPG suggest that the protein is organized in discrete substructures or domains. If so, how do these domains interact during substrate binding and catalysis? In this study, controlled proteolysis was employed to identify the domains of mouse MPG in solution and to characterize their possible functions.

EXPERIMENTAL PROCEDURES
Purification of Mouse MPG-NΔ48 mMPG, overexpressed in bacteria, was purified essentially to homogeneity as described previously (27) in four steps: ammonium sulfate precipitation (30-55%), anion-exchange chromatography on Q-Sepharose (Pharmacia Biotech Inc.), cation-exchange chromatography on phosphocellulose P11 (Whatman), and finally FPLC on MonoS (Pharmacia). The yield of mouse MPG was about 15 mg of protein/20 g of cells (wet weight), and the purity was >95%. After dialysis against buffer A (20 mM Tris, pH 8.0, 1.0 mM EDTA, 1 mM dithiothreitol) containing 0.1 M NaCl and 50% glycerol, purified mouse MPG was stored at −20°C.
Chromatography on Single-stranded DNA-Cellulose-MPG (100 μg) was digested with endoprotease Arg-C at E/S of 1:800 for 5 min and then applied to a 2.0-ml single-stranded DNA-cellulose column equilibrated with buffer A containing 0.1 M NaCl. The column was washed with 100 mM NaCl in buffer A and was sequentially eluted at room temperature with 1.0 ml each of 150, 200, 250, and 300 mM NaCl in buffer A. The fractions were collected on ice and mixed with trichloroacetic acid, and the protein precipitates were centrifuged and suspended in 15 μl of H2O and 5 μl of 4× sample buffer (200 mM Tris-HCl, pH 6.8, 400 mM dithiothreitol, 8% SDS, 0.4% bromphenol blue, and 40% glycerol) prior to loading for SDS-PAGE.
N-Methylpurine-DNA Glycosylase Assay-MPG was assayed by incubating purified MPG or proteolytic fragments with an excess of substrate, [3H]methyl-labeled calf thymus DNA (370 cpm/μg of DNA; 109 fmol of m7G and 20 fmol of m3A in a total of 960 pmol of adenine and 640 pmol of guanine/μg of DNA), as described previously (27).
Protease Protection Assay-MPG (2.3 nmol) was incubated with either substrate εA-oligonucleotide or control A-oligonucleotide (single-stranded or duplex, 25-mer) at a 1:1 molar ratio for 10 min at 25°C, and the reaction mixture was digested with trypsin (E/S = 1:100) at 25°C for 5, 25, and 60 min. The digests were then electrophoresed on SDS-PAGE, stained with Coomassie Brilliant Blue, and photographed.
Chemical Cross-linking Assay-DNA-bound MPG (with binding performed as in the protease protection assay) or MPG (2.3 nmol) alone was incubated with 3 mM DTSSP for 2 and 4 h. The reaction was then stopped by addition of the quencher, 100 mM ethanolamine, pH 8.0. The reaction mixture was adjusted to 0.1% SDS and digested with endoprotease Glu-C (E/S = 1:10) for 60 min at 25°C. The digests were then electrophoresed on 20.1% SDS-PAGE ± 100 mM β-mercaptoethanol.
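As a rough check on the protease amounts implied by the E/S ratios in the protease protection and cross-linking assays, the following sketch converts 2.3 nmol of MPG to mass and applies the 1:100 (trypsin) and 1:10 (Glu-C) ratios. It assumes that E/S is a weight-to-weight ratio and that the truncated protein has a mass of 31 kDa; the basis of the ratio is not stated in the text, and the helper names are illustrative only.

```python
# Sketch: protease mass implied by an E/S ratio.
# Assumptions (not stated in the text): E/S is weight/weight,
# and the truncated MPG has a molecular mass of 31 kDa.

MPG_MASS_DA = 31_000  # g/mol, from the 31-kDa truncated protein


def substrate_mass_ug(nmol: float, mass_da: float = MPG_MASS_DA) -> float:
    """Convert an amount of protein in nmol to micrograms."""
    return nmol * 1e-9 * mass_da * 1e6


def protease_mass_ug(substrate_ug: float, substrate_parts: float) -> float:
    """Protease mass (ug) for an E/S ratio of 1:substrate_parts (w/w)."""
    return substrate_ug / substrate_parts


mpg_ug = substrate_mass_ug(2.3)               # MPG used per assay
trypsin_ug = protease_mass_ug(mpg_ug, 100)    # protease protection assay
glu_c_ug = protease_mass_ug(mpg_ug, 10)       # cross-linking assay

print(f"MPG: {mpg_ug:.1f} ug, trypsin: {trypsin_ug:.2f} ug, Glu-C: {glu_c_ug:.2f} ug")
```

Under these assumptions, 2.3 nmol of MPG corresponds to about 71 μg of protein, so the 1:100 and 1:10 ratios imply roughly 0.7 μg of trypsin and 7 μg of Glu-C, respectively.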
Kinetics of Thiol Exchange-MPG (2.6 μM, dithiothreitol-free) was mixed with εA-containing duplex oligonucleotide (6 μM, 25-mer) or A-containing control oligonucleotide (7.5 μM, 25-mer) and incubated at room temperature for 10 min. The MPG·DNA mixtures and MPG alone were then mixed separately with a 68-fold excess of Ellman's reagent (DTNB, Pierce) in the presence of 0.1 M sodium phosphate, pH 8.0, and the reaction was monitored by the change in A412 in a Beckman DU640 spectrophotometer at room temperature.
Other Methods-Circular dichroism (CD) spectra were measured with an Aviv 62 DS spectropolarimeter. To acquire a full range of near- and far-ultraviolet (UV) CD spectra, fused quartz cuvettes with path lengths of 0.01 and 0.1 cm were used for protein and DNA solutions with concentrations around 10⁻⁵ M. Since the Kd for DNA-protein complex formation is 2 × 10⁻⁸ M (27), the spectra for the 1:1 DNA-protein mixture must be those of the complex, with a negligible amount of free protein or DNA. Each spectrum was recorded with a 0.5-nm increment and 1-s interval. For each sample, multiple repetitive scans were obtained and averaged.
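The argument that a 1:1 mixture is almost entirely complex can be illustrated with the standard quadratic solution for simple 1:1 binding. The sketch below is only an illustration: it assumes exactly 10⁻⁵ M each of protein and DNA (the text says "around" this value) and a single-site equilibrium with the quoted Kd; the function name is hypothetical.

```python
import math

# Sketch: fraction of protein in complex for 1:1 binding, P + D <-> PD.
# Assumptions: single-site equilibrium, total protein = total DNA = 1e-5 M
# (the text gives only "around 1e-5 M"), Kd = 2e-8 M as quoted from Ref. 27.


def complex_concentration(p_total: float, d_total: float, kd: float) -> float:
    """Exact [PD] from the quadratic solution of the binding equilibrium."""
    s = p_total + d_total + kd
    return (s - math.sqrt(s * s - 4.0 * p_total * d_total)) / 2.0


P_TOTAL = 1e-5  # M, assumed
D_TOTAL = 1e-5  # M, assumed
KD = 2e-8       # M

pd = complex_concentration(P_TOTAL, D_TOTAL, KD)
fraction_bound = pd / P_TOTAL

print(f"[PD] = {pd:.3e} M, fraction of protein bound = {fraction_bound:.3f}")
```

With these assumed concentrations, roughly 96% of the protein is in complex and about 4% of each component remains free.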
The amino-terminal sequences of MPG or its proteolytic fragments, attached to a Sequelon-AA membrane, were determined in an Applied Biosystems 475/477A protein sequencer. Protein concentrations were measured by the bicinchoninic acid procedure (33) with bovine serum albumin as standard using a BCA protein assay kit (Pierce). Radioactivity in liquid samples was quantitated with Scintiverse II (Fisher Biotech) in a Beckman LS6000SC liquid scintillation counter. Oligonucleotides were synthesized in an Applied Biosystems 394 DNA/RNA synthesizer using phosphoramidite chemistry. Adenine, thymine, guanine, and cytosine CE phosphoramidites were obtained from Applied Biosystems and special ethenoadenine CE phosphoramidite from Glen Research.

Mouse MPG Is Organized in Three Domains and Contains an Unstructured Region at the Amino Terminus-Domains are independently folded units of protein structure (reviewed in Refs. 34 and 35). Fifty to 150 amino acids typically form a domain; a protein composed of more than 150 amino acids may contain multiple domains connected by flexible loops (36). In general, controlled proteolysis provides a classical approach to define domain organization (36-46). Proteases preferentially cleave the solvent-exposed loops and domain-linking regions to yield domain fragments without affecting their tertiary structures. The domains often become susceptible to further proteolysis only under denaturing conditions. The individual domains sometimes retain some of the functions assigned to the intact protein (40,44,47-50).
We used controlled proteolysis to cleave the mouse MPG into discrete fragments. Because the protein contains multiple cleavage sites for several proteases, such as trypsin (Trp), endoprotease Lys-C (endoLys-C), endoprotease Arg-C (endoArg-C), endoprotease Glu-C (V8), and endoproteinase Asp-N (endoAsp-N), different fragments were generated by these proteases. Digestion with trypsin for different times under nondenaturing conditions cleaved MPG into three stable fragments (p13, p8, and p6), which were resistant to further proteolysis, although each of the fragments contained a number of trypsin cleavage sites: p13 had 6 Arg and 5 Lys, p8 had 9 Arg, and p6 had 5 Arg and 2 Lys residues. Additionally, fragments p29 and p20 were transiently observed at earlier time points and yielded the same stable fragments, p8, p6, and p13, after further digestion (Fig. 1A). The identity of these fragments was confirmed by sequencing their amino-terminal residues (Table I). Proteolysis with endoLys-C generated three discrete fragments: p18, p16, and p13 (Fig. 1B). However, amino-terminal sequence analysis shows that p13 (endoLys-C) was equivalent to p13 (Trp), and that p16 is generated from p18, which is equivalent to the combination of the p8 (Trp) and p6 (Trp) fragments (Table I). Digestion with endoArg-C generated two transient fragments, p29 and p20, but longer digestion produced only one stable fragment, p13 (Fig. 1C). Amino-terminal sequencing of p13 revealed two bands, one equivalent to the carboxyl-terminal p13 (Trp), and the other to the combination of the p8 (Trp) and p6 (Trp) fragments (Table I). Table I also shows that the first 30-40 amino acid residues at the amino terminus are not folded into a protease-resistant structure, because the region was hypersensitive to all the tested proteases, including trypsin, endoLys-C, and endoArg-C.
Although MPG had cleavage sites for V8 and endoAsp-N, only ~25 residues were released from the amino terminus under nondenaturing conditions, even when these enzymes were used at a level 10-fold higher than that of the other proteases (Fig. 2). This observation suggests that the remaining V8 and endoAsp-N sites are located within the tightly folded domains. Taken together, these results lead to the model that the MPG polypeptide is organized in three tightly folded, protease-resistant domains linked by protease-sensitive hinge regions, and has a protease-sensitive amino-terminal region.
The unstructured region at the amino terminus of the truncated MPG consists of nearly 40 amino acid residues. The p8 domain starts at residue 93, the center domain (p6) at residue 166, and the carboxyl-terminal domain (p13) includes the region beyond residue 223 (Fig. 3).
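The domain boundaries given above can be checked against the fragment names with a back-of-the-envelope mass estimate. The sketch below assumes an average residue mass of about 110 Da (a common rule of thumb, not a value from this paper), takes each domain to end one residue before the next begins, and takes p13 to run from residue 223 to the carboxyl terminus at residue 333; these end points are assumptions, since the text gives only the start positions.

```python
# Sketch: approximate domain masses from residue boundaries.
# Assumptions: ~110 Da average residue mass (rule of thumb);
# each domain ends just before the next one starts; p13 spans 223-333.

AVG_RESIDUE_DA = 110

# (first residue, last residue); start positions are from the text,
# end positions are inferred for illustration only.
DOMAINS = {
    "p8": (93, 165),
    "p6": (166, 222),
    "p13": (223, 333),
}


def approx_mass_kda(first: int, last: int) -> float:
    """Estimate mass in kDa from an inclusive residue range."""
    n_residues = last - first + 1
    return n_residues * AVG_RESIDUE_DA / 1000.0


masses = {name: approx_mass_kda(*bounds) for name, bounds in DOMAINS.items()}

for name, kda in masses.items():
    print(f"{name}: ~{kda:.1f} kDa")
```

On these assumptions the estimates come out near 8, 6, and 12 kDa, in line with the apparent sizes used to name the p8, p6, and p13 fragments.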
Role of Domains in MPG-mediated Base Excision DNA Repair-As a DNA glycosylase specific for certain modified bases, MPG should have at least three broad functions: binding to nontarget DNA, recognizing a damaged base (substrate), and finally, catalyzing cleavage of the DNA glycosyl bond.
To determine whether the ability to bind nontarget DNA is localized in a specific domain, we performed affinity chromatography on single-stranded DNA-cellulose of the domain fragments generated by endoArg-C digestion (Fig. 4). The p5 fragment did not bind to DNA, while the p13 fragments co-eluted at 150 mM NaCl. The largest fragment, p29, was eluted at 300 mM NaCl. The transient p20 fragment (consisting of domains p5, p8, and p6) was eluted at 200 mM NaCl. Although not shown here, the individual domains produced in a trypsin digest (p8, p6, and p13) did not bind to DNA at 100 mM NaCl. However, a transient tryptic fragment, p20 (consisting of domains p6 and p13), was retained on the column and eluted at 200-250 mM NaCl. The NaCl concentration required for elution reflects the strength of nonspecific DNA binding by the protein fragments. Fig. 5 summarizes the results from DNA-cellulose affinity chromatography, indicating that p6, present in both p20 fragments, appears to be the common denominator in binding to nontarget DNA; however, another domain, from either the amino or carboxyl terminus, appears to be needed to anchor nontarget DNA.
In order to elucidate the involvement of individual domains in damage recognition, we determined protease sensitivity of the enzyme bound to the substrate. Fig. 6 shows the results of protease protection experiments with MPG·DNA complexes. The transiently formed fragment p20 (lanes 1-3) became stable to trypsin treatment in the presence of duplex control DNA (lanes 4-6). More interestingly, in the presence of substrate DNA, a larger transient fragment, p29, was also stable for up to 60 min of digestion (lanes 7-9). This result indicates that the binding of nonspecific DNA protects a cleavage site in p20, which consists of domains p6 and p13. The substrate DNA induces protection of an additional domain (p8) in the p29 fragment. This difference implies that specific and nonspecific DNA binding do not induce identical structural perturbation, as judged by changes in patterns of proteolytic digestion. Our results, including those obtained from the DNA-cellulose column binding assay, also indicate that p20 (consisting of p6 and p13 domains) is necessary for nontarget DNA binding, while the p8 domain in the p29 fragment is necessary for recognition of the substrate. Single-stranded substrate DNA was also able to protect MPG from protease digestion, but to a lesser extent than the duplex DNA (Fig. 6, last panel); the difference may be due to reduced affinity of the enzyme for single-stranded DNA. We may point out in this context that MPG removes N-methylpurines from single-stranded DNA at a rate 3-4-fold lower than from duplex DNA.³
DNA Damage Recognition Induces Conformational Change within the Domains of MPG-The difference in the protease resistance patterns of protein complexed with nonspecific DNA versus substrate DNA (Fig. 6) may indicate 1) a rigid body movement of domains in response to DNA binding, 2) DNA-induced changes in the secondary or tertiary structure of individual domains, or 3) a combination of both possibilities.
Three types of experiments (chemical cross-linking, UV CD spectroscopy, and kinetics of the thiol-exchange reaction with DTNB) were performed to assess any conformational change. We reasoned that any conformational change within the protein during recognition of substrate DNA may alter the interdomain distances. To test this hypothesis, we used a thiol-cleavable homobifunctional protein cross-linker, DTSSP, which can cross-link reactive amino groups up to 12 Å apart. Thus, domains with primary amino groups oriented such that they are closer than 12 Å will be cross-linked, and the bond can be cleaved by a reducing agent such as β-mercaptoethanol. Incubating MPG with DTSSP for up to 2 h did not yield any higher molecular weight products, and the protein remained mostly in the monomeric form (Fig. 7A, lanes 1 and 5). Because the DNA-protein complex was resistant to trypsin, we digested cross-linked MPG or (cross-linked) DNA-bound MPG with endoproteinase Glu-C (V8) under mild denaturing conditions (0.1% SDS) for 1 h. After digestion of MPG alone with V8, four fragments (designated as B, B*, C, and D) were resolved by SDS-PAGE (Fig. 7A, lane 2). The amino-terminal sequences showed that the V8 protease cleaves MPG at the carboxyl end of the glutamic acids at positions 145 and 265 (generating fragment B), at 153 and 265 (forming B*), at 52 and 145 (forming C), and at 265, forming fragment D (residues 265-333; Fig. 7C). The presence of cross-linker generated a new set of bands (A1-A3) after V8 treatment in the absence of DNA (Fig. 7A, lanes 3 and 4); their amino-terminal sequences showed that A1, A2, and A3 started from Glu145, Glu153, and Glu52, respectively (Fig. 7C). Thus, "A" consists of fragments C and B (Fig. 7D). The fragment sizes were different due to the availability of V8 cleavage sites. However, fragments "C," "B," and "D" corresponded to domains p8, p6, and p13, respectively.
After treatment with β-mercaptoethanol, the bands in group "A" disappeared, generating B, B*, C, D, and a new band C* (a carboxyl-terminal truncation product of C) (Fig. 7A, lanes 7 and 8). This indicates that the bands in group "A" were specifically cross-linked products and that the p8 and p6 domains involved in cross-linking had to be closer than 12 Å. Furthermore, the p13 domain was far enough away to prevent its being cross-linked with the p8·p6 complex.
V8 digestion of MPG bound to control DNA generated peptide fragments (Fig. 7B, lane 2) identical to those formed after digestion of MPG alone with V8 (Fig. 7A, lane 2). The p8 and p6 domains were again cross-linked with DTSSP in the presence of control DNA (Fig. 7B, lane 1), but those domains were not cross-linked in the presence of substrate DNA (Fig. 7B, lane 3). These results suggest that the p8 and p6 domains move away from each other as a result of substrate recognition and binding.
³ R. Roy and S. Mitra, unpublished observation.
CD spectra of MPG and MPG·DNA complexes were measured. Difference CD spectra generated by subtracting the spectra of uncomplexed DNA from those of the complexes showed a small change in molar ellipticity in the region of 230-280 nm (data not shown). Regardless of the nature of the DNA, nonspecific or substrate, the spectra of the complexes were not the sum of the spectra of uncomplexed protein and DNA. These results imply that the DNA and/or protein structures were perturbed as a consequence of complex formation; because both DNA and protein could contribute to the spectral changes in the region of 230-280 nm, no attempts were made to dissect the respective contributions to the observed spectral changes. Nevertheless, the changes around 240-260 nm could be a consequence of perturbation of cysteine residues or their environments. Thus, we directly tested the effect of DNA binding on the accessibility of cysteines in MPG by measuring the rate of the thiol-exchange reaction with DTNB (Ellman's reagent).
The 31-kDa protein has 5 cysteine residues, 2 in the p6 and 3 in the p13 domain. These residues were much less reactive when the protein was bound to substrate DNA than they were in the free protein or in the presence of control DNA. Fig. 8 shows that the pseudo-first order rate of the reaction was about 5-fold lower in the presence of substrate DNA (t1/2 = 24.8 min) than in the presence of control DNA (t1/2 = 5.0 min). In the absence of any DNA, the reaction (t1/2 = 2.0 min) was about 2.5 times faster than in the presence of control DNA (t1/2 = 5.0 min).
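The fold differences quoted for the thiol-exchange reaction follow directly from the half-lives, because for a pseudo-first order process k = ln 2 / t1/2. The short sketch below makes that arithmetic explicit; it uses only the three half-lives reported in the text, and the dictionary keys are illustrative labels rather than terms from the paper.

```python
import math

# Sketch: pseudo-first order rate constants from the reported half-lives.
# For a first-order process, k = ln(2) / t_half.

HALF_LIVES_MIN = {
    "free_protein": 2.0,
    "control_dna": 5.0,
    "substrate_dna": 24.8,
}

rate_constants = {
    condition: math.log(2) / t_half  # units: per minute
    for condition, t_half in HALF_LIVES_MIN.items()
}

# Fold change in rate: control DNA vs. substrate DNA, and free vs. control DNA.
fold_control_over_substrate = (
    rate_constants["control_dna"] / rate_constants["substrate_dna"]
)
fold_free_over_control = (
    rate_constants["free_protein"] / rate_constants["control_dna"]
)

for condition, k in rate_constants.items():
    print(f"{condition}: k = {k:.3f} per min")
print(f"control/substrate = {fold_control_over_substrate:.2f}")
print(f"free/control = {fold_free_over_control:.2f}")
```

The ratios come out at about 4.96 and 2.5, matching the "about 5-fold" and "about 2.5 times" statements.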
To identify the domain containing the catalytic site, we separately digested MPG with endoprotease Arg-C for 5 min and trypsin for 15 min and attempted to separate the resulting fragments on different chromatographic and affinity columns. Endoprotease Arg-C-digested MPG was resolved on a single-stranded DNA-cellulose affinity column into p5 (the amino-terminal unstructured region), a p14/p13 mixture, p20 (a combination of p5, p8, and p6), and p29 (the intact protein without p5). Analysis of the enzymatic activity of these fragments showed that p5 was inactive. However, p29, generated after loss of p5, retained activity comparable to that of intact MPG. None of the individual domains or the transient fragments had any activity. However, a purified mixture of p14 (a combination of p8 and p6) and p13 retained about 8% of the original activity (Table II). It is not clear whether this residual activity was due to the presence of a small amount of undegraded enzyme in the column eluate, or to an intrinsic ability of noncovalently linked fragments to carry out catalysis. That fragments p14 and p13 could not be separated from each other by either ion exchange chromatography (Mono Q and Mono S) or chromatofocusing (Mono P) suggested strong association under nondenaturing conditions. We hope to resolve this issue by assaying enzyme activity of a mixture of p13 and p14 recombinant polypeptides individually overexpressed in E. coli.

DISCUSSION
DNA glycosylases act in the first step of the base excision repair pathway and constitute an important class of enzymes because they are responsible for removal of modified bases from DNA. Among these enzymes, uracil-DNA glycosylases have been extensively studied, and the structures of the human and herpesvirus-encoded uracil-DNA glycosylases have recently been elucidated by x-ray crystallographic analysis (51,52).
The MPGs are also critical DNA glycosylases because they are involved in repair of many spontaneous and induced base lesions in DNA. Mammalian MPGs do not share significant sequence homology with the E. coli MPGs, yet both recognize many common substrates that differ widely in their structural features. These unusual enzymatic properties set the MPGs apart from uracil-DNA glycosylases and other DNA glycosylases that are specific for a single substrate, and raise the possibility that MPGs possess unusual structural features.
The three-dimensional structures of MPGs have yet to be elucidated, and the amino acid residues involved in substrate binding and catalysis have not been identified. As a first approach to these issues, we asked whether the MPG polypeptide is organized in substructures or domains, and if so, how these interact in substrate recognition. Our results indicate that the mouse MPG has three well defined, non-overlapping globular regions arranged in a linear array of p8-p6-p13 domains, in addition to a protease-sensitive region at the amino terminus.
"Exon shuffling" has been proposed as being responsible for diversification of protein sequences during evolution (53). Because domains are analogous to modules with discrete structures of their own, we examined whether the MPG domains are coded by distinct exons in its gene. However, comparison of the exon boundaries with the coding sequence defining the domain boundaries does not show equivalence (Fig. 9). This result suggests that the domains of MPG did not evolve as independent modules.
The contributions of the distinct domains to nonspecific DNA binding, substrate lesion recognition, and glycosyl bond cleavage have been analyzed by deletion mutagenesis and by examining the role of specific domains in these activities. We have already shown that up to 100 and 18 amino acid residues can be removed from the amino and carboxyl terminus of the full-length MPG, respectively, without loss of activity.² Examination of the domain boundaries in this protein indicates that not only is the loosely structured amino terminus dispensable for enzyme activity, but also that 8 residues of the p8 domain and up to 18 residues of the p13 domain are not required for glycosylase activity. However, deletion of additional residues from both termini resulted in a loss of activity, but did not reduce the binding of substrate DNA. The substrate DNA binding activity was retained in truncated proteins with deletion of 65 residues from the carboxyl terminus of the p13 domain. At the same time, this activity was lost when 23 or more residues were deleted from the amino terminus of the p8 domain.
Our results suggest that the domains' structural integrity, which should be essential for catalysis, is preserved despite partial truncation. The inhibition of the cleavage of interdomain hinge regions in the presence of DNA suggests that p6, in combination with p13, binds DNA nonspecifically as a prerequisite to specific DNA binding. Furthermore, specific binding to the damaged base requires interaction of all three domains. Studies of protein cross-linking suggest movement of the p8 domain away from or toward the p13·p6 complex in the presence or absence of substrate DNA, respectively. Finally, sequestration of cysteine residues (present in the p13·p6 complex in the protein) from the thiol exchange reaction supports the possibility of induced change in the domains' conformations as a result of substrate binding.
Taken together, these results are consistent with a model proposed for conformational change of MPG during its binding to control and substrate DNA (Fig. 10). In the absence of DNA, the p8 and p6 domains are in close proximity to each other, with the p13 domain somewhat removed in space. In the presence of nonspecific DNA, p6 binds to DNA, with additional interaction by the p13 domain. When DNA containing a substrate lesion is available, p8 moves away from p6, with a significant change in overall conformation of the protein. However, all three domains appear to be involved in binding substrate DNA. Although our experiments have not directly addressed the question of how the same enzyme recognizes distinct substrate lesions in DNA with widely different structures, it is reasonable to expect that the three domains can interact among themselves in subtly different ways in order to allow "induced fit" of the substrates. This intrinsic structural flexibility may distinguish MPGs from uracil-DNA glycosylases, which have a rigid recognition pocket for uracil, their only substrate (51).