DNA polymerases η and κ bypass N2-guanine-O6-alkylguanine DNA alkyltransferase cross-linked DNA-peptides

DNA-protein cross-links are formed when proteins become covalently trapped with DNA in the presence of exogenous or endogenous alkylating agents. If left unrepaired, they inhibit transcription as well as DNA unwinding during replication and may result in genome instability or even cell death. The DNA repair protein O6-alkylguanine DNA-alkyltransferase (AGT) is known to form DNA cross-links in the presence of the carcinogen 1,2-dibromoethane, resulting in G:C to T:A transversions and other mutations in both bacterial and mammalian cells. We hypothesized that AGT-DNA cross-links would be processed by nuclear proteases to yield peptides small enough to be bypassed by translesion (TLS) polymerases. Here, a 15-mer and a 36-mer peptide from the active site of AGT were cross-linked to the N2 position of guanine via conjugate addition of a thiol containing a peptide dehydroalanine moiety. Bypass studies with DNA polymerases (pols) η and κ indicated that both can accurately bypass the cross-linked DNA peptides. The specificity constant (kcat/Km) for steady-state incorporation of the correct nucleotide dCTP increased by 6-fold with human (h) pol κ and 3-fold with hpol η, with hpol η preferentially inserting nucleotides in the order dC > dG > dA > dT. LC-MS/MS analysis of the extension product also revealed error-free bypass of the cross-linked 15-mer peptide by hpol η. We conclude that a bulky 15-mer AGT peptide cross-linked to the N2 position of guanine can retard polymerization, but that overall fidelity is not compromised because only correct bases are inserted and extended.

DNA-protein cross-links are formed when proteins become covalently trapped with DNA in the presence of exogenous or endogenous alkylating agents. If left unrepaired, they inhibit transcription as well as DNA unwinding during replication and may result in genome instability or even cell death. The DNA repair protein O 6 -alkylguanine DNA-alkyltransferase (AGT) is known to form DNA cross-links in the presence of the carcinogen 1,2-dibromoethane, resulting in G:C to T:A transversions and other mutations in both bacterial and mammalian cells. We hypothesized that AGT-DNA cross-links would be processed by nuclear proteases to yield peptides small enough to be bypassed by translesion (TLS) polymerases. Here, a 15mer and a 36-mer peptide from the active site of AGT were cross-linked to the N2 position of guanine via conjugate addition of a thiol containing a peptide dehydroalanine moiety. Bypass studies with DNA polymerases (pols) η and κ indicated that both can accurately bypass the cross-linked DNA peptides. The specificity constant (k cat /K m ) for steady-state incorporation of the correct nucleotide dCTP increased by 6-fold with human (h) pol κ and 3-fold with hpol η, with hpol η preferentially inserting nucleotides in the order dC > dG > dA > dT. LC-MS/ MS analysis of the extension product also revealed error-free bypass of the cross-linked 15-mer peptide by hpol η. We conclude that a bulky 15-mer AGT peptide cross-linked to the N2 position of guanine can retard polymerization, but that overall fidelity is not compromised because only correct bases are inserted and extended.
The repair of DNA-protein cross-links is more challenging due to its diverse nature, and more remains to be discovered. The repair process involves all three moieties, i.e., protein, DNA, or the attachment between DNA and protein (21,22). Nuclear proteases and proteasome-based repair target the protein moieties in DNA-protein cross-links and degrade proteins into peptides to reduce their bulkiness and prevent the toxicity generated by these cross-links (23). Proteases involved in repair processes include Wss1, SPRTN, GCNA, and FAM111A and B (24), although only Wss1 and SPRTN have been shown to act directly. Posttranslational modification of proteases via binding to ubiquitin or SUMO can play an important role in proteolysis (25). Nuclease-based repair targets the DNA moiety of DNA-protein cross-links (e.g., MRN (Mre11-Rad50-Xrs2) complex in the processing of topoisomerase II cross-links at the ends of double-strand breaks) (26). The third target is the attachment between DNA and protein that can be hydrolyzed to release protein from the DNA (e.g., tyrosyl-DNA phosphodiesterase (TDP), an enzyme capable of breaking down the covalent attachment between topoisomerases and DNA) (27). In addition, DNA-protein cross-links can also be repaired by homologous recombination and nucleotide excision repair pathways (28), although the latter have some limitations (29).
If bulky DNA-protein cross-links are left unrepaired, they can be very detrimental to cells as major obstacles for cellular processes such as replication, DNA repair, recombination, chromatin remodeling, and transcription (3). DNA-protein crosslinks are thought to contribute to various mutagenic events, genomic instability, and even cell death, in that defects in protease-mediated DNA-protein cross-link repair are associated with aging and cancer, e.g., Ruijs-Aalfs syndrome linked to mutations in SPRTN (30), pediatric germ cell tumors (31), Kenny-Caffey syndrome type 2 (32), and gracile bone dysplasia (33).
The DNA repair protein O 6 -alkylguanine DNAalkyltransferase (AGT or MGMT) is known to form crosslinks with DNA in the presence of bifunctional electrophiles (i.e., 1,2-dibromoethane) resulting in G:C to T:A transversions and other mutations in both Escherichia coli and CHO cells (34)(35)(36). The mechanism of formation of AGT-DNA cross-links involves the nucleophilic attack of the active site residue Cys-145 on 1,2-dibromoethane and leads to the formation of a halfmustard intermediate, which further cyclizes into an unstable episulfonium ion (37). Nucleophilic sites on DNA react with the unstable episulfonium ion and form AGT-DNA cross-links (34,35), including the N6 position of dA, N7 position of dG, N2 position of dG, N1 position of dG, and O6 position of dG (37). This process is related to the induction of mutations in both E. coli and mammalian cells (34)(35)(36).
The processing (and possible repair) of AGT-DNA crosslinks are not fully understood. Our current hypothesis is that such AGT-DNA cross-links are processed by proteases to yield peptides that are small enough to be bypassed by human translesion (TLS) DNA polymerases, with the introduction of mutations (Fig. 1). The focus of our work was the synthesis and bypass studies with DNA-peptide cross-links from the active site of AGT at the N2 position of dG, known to occur in DNA (37) (Fig. 1). N 2 -dG-DNA-peptide cross-links with varying bulk (15-and 36-amino acids) were synthesized using the strategy involving the conjugate addition of a thiol-containing oligonucleotide with a dehydroalanine moiety formed from AGT in the Cys-145 peptides, thus yielding the same basic AGT-DNA cross-link as formed with 1,2-dibromoethane. Our results indicate that both hpol η and hpol κ bypass these long peptides and insert bases correctly across from the N 2 -dG-DNA-peptide cross-links.
A postoligomerization approach was developed for the synthesis of the oligonucleotide-peptide cross-links. The 2-F-dIcontaining oligonucleotide was converted into an N 2 -cystamine-dG-modified oligonucleotide (Fig. 2). The synthesis was carried out on a CPG (controlled pore glass) level, which involves nucleophilic displacement using cystamine followed by O 6 -(pnitrophenylethyl) (NPE) deprotection (41). The final oligonucleotide deprotection was carried out to cleave CPG beads and to remove base-protecting groups to obtain N 2 -cystamine-dGmodified oligonucleotide (Fig. 2). The cystamine-modified oligonucleotide was purified by HPLC ( Fig. S4) and characterized by MALDI MS (Fig. S5).
The cystamine-containing oligonucleotide was reduced with DTT to obtain an N 2 -(2-thioethyl)-dG oligonucleotide (Fig. 2), which was coupled with each respective dha-containing peptide (Fig. 2). The DNA-peptide cross-links were purified by gel electrophoresis (Fig. S6), and the N 2 -dG-15-mer peptide crosslink was characterized by MALDI MS (Fig. 3). To simplify the LC-MS/MS characterization, the nucleic acid moiety of the oligonucleotide-peptide cross-links was hydrolyzed with HF to generate a peptide adducted with only the base guanine (  cross-link and their fragmentation patterns (Figs. S8B and S9B; Tables S2 and S3) indicated an oxidized sulfur atom of the cysteine (sulfone). These oxidized forms of oligonucleotidepeptide cross-links, which had also been observed in our work with N 6 -adenyl cross-links to the AGT peptides (39), were used for further studies.
Human TLS pol ηand pol κ-mediated bypass of N 2 -dGpeptide cross-links Full-length extension and single-nucleotide incorporation assays were carried out with two major human (h) TLS pols, η and κ. Preliminary studies with shorter peptides (3-, 5-, 7-, and 11-mers) bound at the N6 atom of dA, prepared using the  same dehydroalanine strategy (39,40), were bypassed readily by these two polymerases, and we focused our work on the AGT active site 15-mer peptide bound at the N2 atom of dG, in that it was more informative of longer peptides that might be present in cells. In the course of our work, we extended the studies to a 36-mer AGT active site peptide cross-link at the N2 atom of dG, although we did not repeat all of the miscoding studies. The full-length extension reactions were performed using a 14-mer primer and respective unmodified and cross-linked templates in the presence of mixture of dNTPs (Fig. 4A). With the N 2 -dG-15-mer peptide cross-link, hpol η fully extended the primer with similar efficiency as the unmodified template (Fig. 4B, lanes 1-6 and 7-12). In the case of the N 2 -dG-36-mer peptide cross-link, the extension efficiency was reduced as compared with the unmodified as well as the N 2 -dG-15-mer peptide cross-link template (Fig. 4B, lanes [13][14][15][16][17][18].
With the N 2 -dG-15-mer peptide cross-link, hpol κ also fully extended the primer with similar efficiency as the unmodified template (Fig. 4C, lanes 1-6 and 7-12). In the case of the N 2 -dG-36-mer peptide cross-link, the extension efficiency was reduced as compared with the unmodified template as well as the N 2 -dG-15-mer peptide crosslink template (Fig. 4C, lanes [13][14][15][16][17][18]. Overall, both of the TLS polymerases were able to tolerate the 36-amino acid bulk. Single-nucleotide insertion assays were carried out using a 14-mer primer and respective unmodified and cross-linked templates in the presence of all four of the individual dNTPs (Fig. 5A). As observed earlier in our laboratory and by others (e.g., Ghodke et al. (39)), hpol η is prone to misincorporation at high dNTP concentrations and to insert multiple copies of a single dNTP. hpol η preferentially added dCTP across from (unmodified) dG, with the incorporation preference dCTP > dATP > dTTP > dGTP (Fig. 5B, lanes 1-12). For both the N 2 -dG peptide cross-links, the incorporation preference was also dCTP > dATP > dTTP > dGTP (Fig. 5B, lanes 1-12, X is N 2 -dG-15-mer or 36-mer peptide cross-link).
With hpol κ, only dCTP incorporation was observed for all the templates (Fig. 5C). hpol κ had a slower rate of incorporation of dCTP across from N 2 -dG-36-mer peptide cross-link as compared with N 2 -dG-15-mer peptide cross-link (Fig. 5C, lanes 1-12, X is N 2 -dG-36-mer peptide cross-link). Overall, hpol κ showed faithful bypass of N 2 -dG-peptide cross-links irrespective of the bulk of the adduct.

Miscoding potential of N 2 -dG oligonucleotide peptide crosslinks
To determine the frequency of misincorporation across from N 2 -dG-peptide cross-links, steady-state kinetic analysis was done using individual dNTPs at varying concentrations with both hpol η and hpol κ (Figs. 6, 7, and S10-S12 and Table 1). The steady-state kinetics of insertion of the correct base (i.e., dCTP) by hpol η and κ are shown in Figures 6 and 7. The steady-state kinetics of misincorporations (i.e., dATP, dGTP, and dTTP) are shown in Figures S10-S12.
Comparison of the specificity constants indicated that the 15-mer peptide cross-link did not increase the misinsertion frequency compared with dG.
LC-ESI-MS/MS analysis of hpol η primer extension past the N 2 -dG-15-mer peptide cross-link As discussed earlier, hpol η inserted all four dNTPs opposite the N 2 -dG-15-mer peptide cross-link (Table 1). MS analysis was performed to define other miscoding events occurring with hpol η. Full-length extension reactions were carried out using a 2 0 -deoxyuridine (dU)-containing 14-mer primer and the unmodified and N 2 -dG-15-mer peptide-cross-linked templates. After full-length extension by hpol η, the reaction mixtures were treated with uracil-DNA glycosylase (UDG) and piperidine (43) to cleave the oligonucleotide to a length that could be analyzed by LC-MS/MS (Fig. S13). The fully extended products and relative yields are summarized in Table 2.
Bypass of N 2 -dG DNA-peptide cross-links complex, m/z 939.36 (−3) is indicative of insertion of the correct base C (Fig. 8, Table S7). Ions at m/z 1043.82 (−3) and 1148.36 (−3) were also observed, indicating insertion of the correct base C with blunt end addition of A or two As (Figs. S17 and S18; Tables S8 and S9). Overall, the MS analysis revealed only an error-free bypass of the N 2 -dG-15-mer peptide cross-link.

Discussion
Our current working hypothesis is that the AGT-DNA cross-links induced by 1,2-dibromoethane and other biselectrophiles (35,44) undergo proteolysis and generate DNA-peptide cross-links that can be bypassed by TLS polymerases (Fig. 1). Accordingly, we considered DNA-peptide cross-links formed at the N 2 -position of dG, generated from the active site of AGT (Cys-145) (35,43). We focused on a 15-mer long AGT peptide cross-linked at the N2-position of dG in DNA and the major TLS polymerases hpol η and κ, which were considered to be the ones most likely involved in the bypass of DNA-peptide cross-links, based on our previous experience with bulky N 2 -dG adducts (45,46) and with another cross-link (N 6 -dA) (39). Following initial results with the 15-mer, we extended some of the work to a 36-mer derived from the AGT active site, known to be cross-linked in biological settings (35). The chemical synthesis of the N 2 -dG DNA-peptide cross-links was achieved by coupling a thiolcontaining oligonucleotide and dehydroalanine-modified 15-mer or 36-mer peptide (Fig. 2). As a result, oxidized N 2 -dG DNA-peptide cross-links were obtained (Fig. 2).
The enzymatic and MS studies were performed with this (oxidized form of the) N 2 -dG-oligonucleotide-peptide crosslink (Figs. 3, S8, and S9), which we encountered with both the 15-and 36-mer peptides. We have been unable to avoid what appears to be the artifactual oxidation of the thioether of Cys to the sulfone in this or work with the N 6 -dA cross-link (39) although we had not observed this species in earlier work on the isolation of the cross-linked peptide from DNA (34,35,37,45). The only other differences we introduced, compared with the natural cross-link, was in the AGT peptide,  Table 1 for k cat and K m values (fit to a hyperbolic equation in Prism) and Table S1 for the oligonucleotide sequences used. P, FAM-labeled 14-mer DNA primer.
is that we converted Cys-150 to prevent competition with Cys-145 and changed Lys-125 and Met-134 to leucine because of nucleophilic interference in the synthetic procedure we used.
Early studies from the Lloyd laboratory showed that bacterial DNA polymerases could efficiently bypass 4-mer-and 12-mer peptides linked to DNA by a ring-opened γ-hydroxypropano dG adduct (47,48). Since then, several studies have been done with human TLS polymerases that can bypass DNA-peptide cross-links (19,49,50), although the number of reports is relatively small. However, these have involved  Table 1 for k cat and K m values (fit to a hyperbolic equation in Prism) and Table S1 for the oligonucleotide sequences used. P, FAM-labeled 14-mer DNA primer.

Table 1
Steady-state kinetic analysis of insertion opposite N 2 -dG-15-mer and 36-mer peptide cross-links Bypass of N 2 -dG DNA-peptide cross-links shorter peptides (10-mers), and in one study a 23-mer posed a strong block to replication by hpol η, ι, and κ (50). Almost all of these studies used artificial linkages or peptides from model proteins, with no clear connection to actual DNA-protein cross-link structures or biological phenomena. Among the various human TLS polymerases, hpol η and κ have been studied with various N 2 -dG DNA adducts (45,46,51,52). Our full-length primer extension assays showed that hpol η and κ tolerated both the N 2 -dG-15-mer and 36-mer peptide cross-links (Fig. 4). These results indicate that both polymerases can tolerate the 36-amino acid residue bulk found in the minor groove of DNA. Single-nucleotide incorporation assays indicated that only hpol η misincorporated individual dNTPs across from unmodified template, the N 2 -dG-15-mer, and the 36-mer peptide cross-links (Fig. 5). hpol κ showed the only incorporation of dCTP across from unmodified template and the N 2 -dG-15-mer and 36-mer peptide cross-links, indicating high-fidelity bypass of the oxidized oligonucleotidepeptide cross-links (Fig. 5).
We analyzed hpol η-mediated misincorporation levels by comparing specificity constants for incorporation of individual dNTP across from unmodified, N 2 -dG-15-mer, and 36-mer peptide cross-link templates. With the N 2 -dG-15-mer peptide cross-link, hpol η inserted dATP 33-fold, dGTP 228fold, and dTTP 316-fold less efficiently than dCTP (Figs. 6 and S10-12 and Table 1) with an overall misincorporation frequency less than that observed for dG. The steady-state kinetic results indicate a very low level of misincorporation at the site of the N 2 -dG-15-mer peptide cross-link by hpol η.
We also compared hpol η-mediated correct base incorporation opposite N 2 -dG-15-mer as well as 36-mer peptide cross-link templates. The steady-state kinetic results suggest that the specificity constant for incorporation of dCTP across from N 2 -dG-15-mer peptide cross-link template was increased by 3.5-fold as compared with dG (Table 1). Surprisingly, the N 2 -dG-36-mer peptide cross-link template showed even faster insertion. As a result, the 36-amino acid bulk had only a minor effect on extension of the primer (Fig. 4) but did not attenuate single dCTP incorporation rates.
In the case of hpol κ (Fig. 7), the steady-state kinetic specificity constant for incorporation of dCTP across from N 2 -dG-15-mer peptide cross-link template was increased 6-fold compared with dG, and the specificity constant with the N 2 -dG-36-mer peptide cross-link template was nearly identical to the rate measured with dG (Table 1). These findings are consistent with previous studies with hpol κ and bulky N 2 -dG   Table S7 for fragment assignments.
Bypass of N 2 -dG DNA-peptide cross-links adducts (46,51), in which alkyl and polycyclic aromatic hydrocarbon adducts had only small effects on the efficiency of polymerization or on misincorporation (46). MS analysis of bypass of the N 2 -dG-15-mer peptide crosslink by hpol η showed products with the only insertion of the correct base C across from the adducted site (with or without blunt end addition of A, Table 2, Figs. 8, S17, and S18). Previous studies of our own and others have also shown the efficient bypass of some smaller N 2 -dG DNA adducts from the minor groove of DNA (46,52). As discussed earlier, in steadystate kinetics the extent of misincorporation observed opposite the N 2 -dG-15-mer peptide cross-link was less than for dG ( Table 1). In contrast, our own studies with N 6 -dA-oligonucleotide-peptide cross-links showed 37% of overall misincorporation by hpol η (39).
On the basis of both steady-state kinetics and mass spectral analysis, we conclude that other bases (i.e., A, G, and T) can be inserted across from the N 2 -dG-15-mer peptide cross-link (Table 1) but that the misincorporation frequency for hpol η is not higher than opposite unmodified dG. Further, if a misincorporation occurs, the enzyme is not able to extend the primer further, so mutations should not result.
In conclusion, the steady-state kinetics and mass spectral analysis of extended primers revealed only error-free products while bypassing the N 2 -dG-15-mer AGT peptide cross-link. In contrast, our own studies with N 6 -dA-oligonucleotide-peptide cross-links showed 37% of overall misincorporation by hpol η (39). The lack of miscoding with such bulky N 2 -dG peptides was somewhat surprising, but the ability of these enzymes to bypass such large peptides was also very unexpected. The formula weight of the 36-mer is >3500, and an object of future investigation is to understand how the structures of these polymerases can accommodate the bulk so well from the minor groove of DNA.

Experimental procedures
Reagents 2-F-dI phosphoramidite was purchased from Glen Research. Unmodified oligonucleotides and 6-carboxyfluorescein (FAM)-labeled oligonucleotide primers were purchased from Integrated DNA Technologies (IDT). The 15-and 36-mer peptides were purchased from New England Peptides. C 18 Sep-Pak columns were purchased from Waters. Other chemical reagents were from Sigma-Aldrich. hpol η (catalytic core of 1-432 amino acids) and hpol κ (19-526 amino acids) were expressed in E. coli and purified as previously reported (53,54). Unlabeled dNTPs and UDG were purchased from New England Biolabs. Micro Biospin-6 columns were purchased from Bio-Rad.

Oligonucleotide synthesis
Solid-phase synthesis of the 2-F-dI containing oligonucleotide was done on a PerSeptive Biosystems Model 8909 DNA synthesizer. The oligonucleotides were synthesized on a 1-μmol scale using the appropriate CPG as solid support.

Synthesis of N 2 -cystamine-dG oligonucleotide
The synthesis of the N 2 -cystamine-dG modified oligonucleotide was achieved on CPG (Fig. 2). The CPG-bound 2-F-dI containing oligonucleotide (5 0 -TCTCXGTTTATGGACC ACC-3 0 , where X is 2-F-dI) was treated with cystamine-2 HCl (10 μl of 500 mM stock in DMSO) in a mixture of DMSO (86 μl) and diisopropylethylamine (4.4 μl), with incubation at 55 C for 22 h to obtain the O 6 -(p-nitrophenylethyl) (NPE)protected N 2 -cystamine-dG-modified oligonucleotide. Next, the supernatant was discarded carefully, and the CPG was washed with DMSO (2 ml), followed by CH 3 CN (3 ml), and then air-dried. NPE deprotection of N 2 -cystamine-dG modified oligonucleotide was carried out using 1 M solution of DBU in CH 3 CN (1 ml) for 2 h at room temperature with continuous shaking. After DBU treatment, the DBU solution was carefully removed and the CPG was washed with anhydrous CH 3 OH (2 ml) and with anhydrous CH 3 CN (3 ml) and then air-dried. Finally, the NPE-deprotected N 2 -cystamine-dG-modified oligonucleotide CPG was treated with 1 ml of 0.4 M NaOH (in CH 3 OH) at room temperature overnight with continuous stirring. After completion of the final deprotection step, the CPG was sonicated for 5 min. Further, the supernatant was collected in an Eppendorf tube and the CPG was washed with H 2 O (400 μl); the supernatant was collected in the same tube. The collected supernatant was neutralized to pH 7.0 with 10% glacial CH 3 CO 2 H (v/v) and concentrated in Bypass of N 2 -dG DNA-peptide cross-links vacuo using a centrifugal evaporator. Next, a dried DNA pellet was resuspended in 300 μl of nuclease-free water.
HPLC purification of N 2 -cystamine-dG modified oligonucleotide was carried out using mobile phase A (0.1 M triethylammonium acetate (TEAA), pH 7.0) and mobile phase B (0.1 M triethylammonium acetate (TEAA), and CH 3 CN, 1:1, v/v) using a Phenomenex Clarity Oligo-RP (C 18 ) column (150 mm × 10 mm, 5 μm) at room temperature. UV detection was at 260 nm. The following gradient program (v/v) was used with a flow rate of 3 ml min −1 : started at 17% B, continued for 5 min, then increased to 40% B over 20 min, then increased to 100% B at 21 min, held at 100% B for 5 min, and reequilibrated for 4 min at 17% B (all v/v). The oligonucleotide eluted at 16.5 min (Fig. S4). The desired oligonucleotide fractions were collected and concentrated with a centrifugal evaporator. Finally, the purified oligonucleotide pellet was resuspended in 20 ml of 10 mM Tris-HCl buffer (pH 8.0) containing 1 mM EDTA and 300 mM NaCl (TEN buffer) and desalted using a C 18 Sep-Pak column. The integrity of the N 2cystamine-dG modified oligonucleotide was confirmed by MALDI-TOF MS (Fig. S5).

Synthesis of N 2 -dG-15-mer and 36-mer peptide cross-links
Reduction of the disulfide bond of N 2 -cystamine-dG oligonucleotide yielded the N 2 -(2-thioethyl)-dG-modified oligonucleotide (Fig. 2). DTT (2.5 μl of a 500 mM solution), HEPES buffer (1 M, pH 7.3, 10 μl), and water (14 μl) were added to the solution of the oligonucleotide (24 μl, 6 nmol) (total 50 μl). The reaction mixture was incubated at 37 C for 2 h. The reaction mixture was further precipitated: 3 M NaCl (30 μl) and H 2 O (220 μl) were added and the mixture was mixed with a vortex device. Further, C 2 H 5 OH (900 μl) was added to the same tube, vortex mixing was done, and the sample was stored at -80 C for overnight. Next, the mixture was centrifuged (21,000g, 30 min at 4 C), and the supernatant was carefully removed. The pellet was washed with absolute C 2 H 5 OH (100 μl). After centrifugation at 21,000g at 4 C for 20 min, the supernatant was removed. Finally, the oligonucleotide pellet was air-dried.
The dehydroalanine peptide (7 mg of the 15-mer or 5 mg of the 36-mer) was added to 40 μl of 100 mM Na 2 CO 3 (pH 9.2), mixed with a vortex device, and centrifuged. This peptide solution was added to the dry oligonucleotide pellet. An additional 10 μl of 100 mM Na 2 CO 3 (pH 9.2) and 10 μl of 10% CH 3 CN (v/v) were used to dissolve the rest of the peptide. This solution was added to the oligonucleotide solution again and mixed with a vortex device. The reaction mixture (total 60 μl) was incubated at 37 C for overnight. After 12 h, reaction mixtures were loaded directly for further purification by gel electrophoresis. Products were separated using a 20% polyacrylamide gel electrophoresis (PAGE) (7 M urea), at 55 W for 3 h (Fig. S6).
The desired oligonucleotide-peptide cross-link bands were located by brief exposure to a UV lamp (260 nm). The desired oligonucleotide-peptide cross-links (which migrated more slowly than the control oligonucleotide) were extracted from gel pieces by soaking in TEN buffer at 4 C overnight. The vial containing the gel fragment was shaken for 1 h to release the oligonucleotide-peptide cross-link into the buffer. The buffer solution from the vial was collected in a clean Falcon tube. This extraction step was repeated, and the buffer fractions were combined and desalted using a C 18 Sep-Pak column.
The integrity of the oligonucleotide-peptide cross-links was confirmed by MALDI MS (Fig. 3), as well as positive ion ESI nano-LC-MS/MS (Figs. S8 and S9). An additional mass of 32 a. m. u. was found, indicative of apparent oxidation to the sulfone.
For nano-LC-MS/MS, the desired DNA-peptide cross-links were hydrolyzed using HF to obtain a peptide adducted with only the base guanine (Fig. S7). The desired oligonucleotidepeptide cross-links (125-150 pmol each) were dried and suspended in HF (48%, 50 μl) and incubated at 4 C for 16 h. Next, the hydrolyzed samples were dried under a stream of nitrogen and resuspended in anhydrous CH 3 OH (50 μl). These samples were dried again under a stream of nitrogen and the same step repeated once. The dried pellet of hydrolyzed oligonucleotide-peptide cross-links was dissolved in 0.1% HCO 2 H (20 μl, v/v), shaken for 10 min, and centrifuged for 5 min at 23 C (21,000g). Finally, nucleobase-adducted peptides (without any purification) were analyzed using a nanoLC Ultra system (Eksigent Technologies) interfaced with an LTQ Orbitrap XL mass spectrometer (Thermo Scientific) in the positive ion mode as described previously (41), except for the separation conditions: a linear gradient increased from 2% to 45% solvent B over a period of 0-45 min, increased from 45% to 95% solvent B over a period of 45-50 min, held at 95% solvent B over a period of 50-60 min, decrease from 95% to 2% solvent B over a period of 60-62 min, and column was equilibrated at 2% solvent B over a period of 62-72 min (all v/v).
MS analysis showed that the fragmentation patterns of oligonucleotide-peptide cross-links are consistent with the additional mass of 32 a. m. u. indicating the oxidized form of sulfur (sulfone) (Figs. S8 and S9; Tables S2 and S3).

Single-nucleotide insertion assays
A primer-template oligonucleotide complex containing a FAM-labeled oligonucleotide primer (14-mer) and an unmodified or N 2 -dG-15-mer as well as 36-mer peptide cross-linked templates were annealed (1:1 M ratio) at 95 C for 5 min, followed by slow cooling overnight (Fig. 5). All single-nucleotide insertion reactions were performed using 50 mM Tris-HCl buffer (pH 7.5) containing 50 mM NaCl, 5 mM MgCl 2 , 5% glycerol (v/v), 5 mM DTT, and 50 μg ml -1 BSA at 37 C. The final concentration of primer-template complex was 120 nM. Concentrations of 5 nM hpol η and hpol κ were used to obtain extended primers. The reactions were initiated by addition of 1 μl of each individual dNTP (final concentration 100 μM) to a total volume of 25 μl. Aliquots (3.5 μl) of reaction mixtures were taken at each time point (0, 5, 10, and 30 min) and quenched with 6.5 μl of 10 mM EDTA (pH 8.0) in 95% deionized formamide (v/ v). Products were separated (5 μl) using 20% PAGE (7 M urea), and results were visualized using a Typhoon scanner (GE Healthcare) and analyzed by ImageJ software.

Steady-state kinetics
A primer-template complex containing a FAM-labeled oligonucleotide primer (14-mer) and an unmodified or N 2 -dG-15-mer or 36-mer peptide cross-linked template were annealed (1:1 M ratio) at 95 C for 5 min, followed by slow cooling overnight (Table 1, Figs. 6, 7 and S10-S12). All steadystate kinetic reactions were performed using 50 mM Tris-HCl buffer (pH 7.5) containing 50 mM NaCl, 5 mM MgCl 2 , 5% glycerol (v/v), 5 mM DTT, and 50 μg ml -1 BSA at 37 C. The final concentration of primer-template complex was 120 nM. Desired concentrations of hpol η (ranging from 0.75 to 3 nM) and hpol κ (ranging from 2 to 2.5 nM) were used to maintain steady-state kinetics of each dNTP insertion (<30% product). Reactions were started by adding 1 μl of an individual dNTP stock solution, at each of ten different concentrations (in some cases only eight concentrations were used in plotting because product formation was >30%), to a total volume of 5 μl. Aliquots (3.5 μl) of reaction mixtures were taken at respective time points and quenched with 6.5 μl of 10 mM EDTA (pH 8.0) in 95% deionized formamide (v/v). Products were separated using 20% PAGE (7 M urea), and results were visualized using a Typhoon scanner (GE Healthcare) and analyzed by ImageJ software. Data points are shown as means ± SD (range) from two independent experiments; see Table 1 for k cat and K m values and estimated using fit to a hyperbolic equation in Prism software (GraphPad, San Diego, CA). Hyperbolic fitting was done to obtain k cat / K m (k sp ) and k cat directly and then dividing to obtain K m (42).
LC-ESI-MS/MS: full-length extension assays followed by UDG and piperidine treatment A 2 0 -dU-containing FAM-labeled 14-mer primer (5 0 -GGTGGTCCAUAAAC-3 0 ) and an unmodified or N 2 -dG-15-mer cross-linked templates were annealed (1:1 M ratio) at 95 C for 5 min followed by slow cooling overnight (Fig. S13) (43,55). The full-length extension reactions were carried out under same reaction conditions as described in full-length extension reactions, except that the final concentrations were 2.5 μM primer-template complex and 0.75 μM hpol η, in a total reaction volume of 85 μl. Reactions were carried out in the presence of a mixture of dNTPs (final concentration 1 mM; i.e., 250 μM of each dNTP) at 37 C for 4 h. Reactions were terminated using Micro Biospin-6 column separations to extract Mg 2+ and dNTPs. The resulting products were treated with 25 units of UDG at 37 C for 4 h, followed by 0.25 M piperidine at 95 C for 1 h. The reaction mixture was dried by lyophilization. The dried pellet was resuspended in 500 μl of nuclease-free H 2 O and taken to dryness by lyophilization. The dried pellet was resuspended in 25 μl of nuclease-free water for LC-ESI-MS/MS analysis.
The fully extended products were identified ( Table 2, Figs. 8 and S14-S18; Tables S4-S9) by comparing the observed CID fragments with the theoretical values using a Mongo Oligo Mass Calculator v2.06 (http://rna.rega.kuleuven.be/masspec/ mongo.htm). The relative yields of extended products were calculated based on the peak areas of extracted ion chromatograms.

Data availability
All data are contained within the article and the supporting information.
Supporting information-This article contains supporting information (39).