Ada protein– and sequence context–dependent mutagenesis of alkyl phosphotriester lesions in Escherichia coli cells

Alkyl phosphotriester (alkyl-PTE) lesions are frequently induced in DNA and are resistant to repair. Here, we synthesized and characterized methyl (Me)- and n-butyl (nBu)-PTEs in two diastereomeric configurations (Sp and Rp) at six different flanking dinucleotide sites, i.e. XT and TX (X = A, C, or G), and assessed how these lesions impact DNA replication in Escherichia coli cells. When single-stranded vectors contained an Sp-Me-PTE in the sequence contexts of 5′-AT-3′, 5′-CT-3′, or 5′-GT-3′, DNA replication was highly efficient and the replication products for all three sequence contexts contained 85–90% AT and 5–10% TG. Thus, the replication outcome was largely independent of the identity of the 5′ nucleotide adjacent to an Sp-Me-PTE. Furthermore, replication across these lesions was not dependent on the activities of DNA polymerases II, IV, or V; Ada, a protein involved in adaptive response and repair of Sp-Me-PTE in E. coli, however, was essential for the generation of the mutagenic products. Additionally, the Rp diastereomer of Me-PTEs at XT sites and both diastereomers of Me-PTEs at TX sites exhibited error-free replication bypass. Moreover, Sp-nBu-PTEs at XT sites did not strongly impede DNA replication, and other nBu-PTEs displayed moderate blockage effects, with none of them being mutagenic. Taken together, these findings provide in-depth understanding of how alkyl-PTE lesions are recognized by the DNA replication machinery in prokaryotic cells and reveal that Ada contributes to mutagenesis of Sp-Me-PTEs in E. coli.

The specific sequence of DNA within an organism imparts the genetic code for all domains of life; however, this code is susceptible to alterations because of limited chemical stability of DNA (1). As a result, the genetic integrity of DNA can be compromised by endogenous metabolites and exogenous chemicals, resulting in different types of damage (1,2).
Alkylation is a major type of DNA damage (3), and the cytotoxic effects of DNA alkylation adducts are manifested by the fact that DNA alkylation also constitutes the central mechanism of action for many widely prescribed chemotherapeutic agents (2). Nucleobase modifications have been the major focus of DNA alkylation studies (3), although alkyl phosphotriesters (alkyl-PTEs) can also be efficiently induced (4). The latter lesions are formed when alkylating agents conjugate with one of the two non-carbon-bonded oxygen atoms and, based on which of these oxygen atoms is attacked, alkyl-PTEs can form in the Sp or Rp configuration (Fig. 1) (4). It was also shown that chronic exposure of rats to 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanone, a major tobacco-specific N-nitrosamine, through drinking water induces the formation of Me-PTE lesions in lung tissues (5).
There have been some studies about the repair and biological consequences of alkyl-PTEs (4). Because the addition of an alkyl group to the backbone phosphate neutralizes its negative charge, the presence of alkyl-PTE lesions in DNA may perturb its interactions with proteins. For instance, a mixture of Sp- and Rp-ethyl-PTE impedes in vitro primer extension catalyzed by T4 DNA polymerase and Escherichia coli DNA polymerase I (6). Isopropyl-PTE was also found to inhibit the unwinding of duplex DNA mediated by superfamily 2 DNA helicases (6). Our previous study showed that the two diastereomers of alkyl-PTEs exhibited different replication bypass efficiencies in E. coli cells, where replication across Sp-Me-PTE at the TT dinucleotide site is mutagenic, and intriguingly the mutagenicity of the lesion requires the presence of Ada protein (7).
Previous studies showed that the frequencies for the formation of alkyl-PTEs are influenced by flanking sequences. For instance, Guichard et al. (8) employed N-nitrosodiethylamine to treat three strains of mice, detected the levels of Et-PTE products by 32P-postlabeling, and observed higher frequencies of Et-PTE lesions with the 5′-flanking nucleoside being a thymidine or 2′-deoxyguanosine compared with a 2′-deoxyadenosine or 2′-deoxycytidine. Likewise, the relative frequencies of 5′-nucleobases at PTE sites exhibited nonrandom distribution in calf thymus DNA and liver DNA of BALB/c mice treated with diethyl sulfate (9). LC-MS/MS results also revealed the effects of flanking sequences on the levels of Me-PTE lesions induced in lung tissues of rats treated with 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanone and its metabolite, 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanol (5). Several models were proposed to rationalize the sequence-dependent accumulation of alkyl-PTE lesions: 1) other than the non-carbon-bound oxygen atoms on backbone phosphate groups, each nucleobase possesses unique nucleophiles with different reactivities toward alkylating agents; 2) the alkylating agents' electrophilicity may also influence the nonrandom distribution of alkyl-PTE lesions, where highly reactive alkylating agents may confer a more random distribution (4); or 3) the repair efficiencies for alkyl-PTEs may vary with flanking base sequences. However, not much is known about how the flanking base sequence contexts modulate the biological consequences of alkyl-PTEs.
This article contains supporting information. * For correspondence: Yinsheng Wang, Yinsheng.Wang@ucr.edu.
Here, we employed a shuttle vector-based method, in conjunction with LC-MS (10,11), to analyze how the flanking nucleobases of Me- and nBu-PTE lesions influence the fidelity and efficiency of DNA replication in E. coli cells, and how replication past alkyl-PTEs is modulated by Ada protein and translesion synthesis DNA polymerases. Considering that we have previously investigated the TT sequence (7), here we examined the Me- and nBu-PTE lesions in both Sp and Rp configurations and in six different combinations of flanking sequences (TX and XT, with X being A, C, or G) (Fig. 1). The Me- and nBu-PTE lesions were selected to examine the influence that the size of the alkyl group has on the replicative bypass of the alkyl-PTE lesions.

Results
The aim of the present study was to gain a comprehensive understanding about the impact flanking base sequences have on DNA replication past alkyl-PTE lesions in E. coli, and to examine the roles of Ada protein and translesion synthesis DNA polymerases in modulating the replicative bypass of these lesions.
We synthesized 12-mer oligodeoxyribonucleotides (ODNs) containing a site-specifically inserted alkyl-PTE lesion in different sequence contexts following our recently published procedures (7), except that the exocyclic amino groups of adenine, cytosine, and guanine bases in the phosphoramidite building blocks were protected (see "Materials and Methods" in supporting information, Fig. 2, Figs. S1-S4). The synthesized ODNs were purified by HPLC (Fig. S5). Because the Sp and Rp diastereomers of the T(Me)C- and T(nBu)A-containing ODNs cannot be resolved from each other by HPLC, a mixture of the two diastereomers was utilized for these two ODNs in the subsequent experiments. We characterized the ODNs using electrospray ionization-MS (ESI-MS) and tandem MS (MS/MS), and the results confirmed the expected site of alkyl-PTE incorporation and the sequences of the modified ODNs (Figs. S6-S17).
We employed a previously reported shuttle vector method to assess the bypass efficiencies and mutation frequencies of the alkyl-PTE lesions (12). The lesion-containing ODNs were ligated into single-stranded M13 phage (Fig. S18). After replication, progeny recovery, PCR amplification, and restriction enzyme digestion, the liberated ODNs were analyzed by native PAGE and LC-MS/MS to identify the replication products (Fig. 3 and Figs. S19-S30). As shown in Fig. 3, we employed two restriction enzymes, BbsI and MluCI, to digest the PCR products of the progeny genome, resulting in the release of the initial damage-containing region as 10-mer ODNs for the lesion-containing or control genome, or a 13-mer ODN for the corresponding region in the competitor genome. By switching the order of digestion of the two restriction enzymes, we selectively radiolabeled the 5′-terminus of either the original lesion-situated strand (p*GGCMNGCTAT) or the opposite strand (p*AATTATAGCY), where p* designates the 32P-labeled phosphate group.
Effects of alkyl-PTE lesions on the fidelity of DNA replication in E. coli cells
Replication across the alkyl-PTEs may yield up to 16 potential products (i.e. with the four natural nucleotides being incorporated at the two nucleosides flanking the PTE site), some of which could not be resolved from one another by PAGE analysis. Thus, we employed a restriction endonuclease and MS assay to identify the mutagenic products and to quantify the mutation frequencies (10). In this vein, the aforementioned digestion products were subjected to LC-MS and MS/MS analyses, where we monitored the fragmentation of the [M-3H]3− ions. The mutation frequencies were quantified by the calibrated ratios of peak areas found in the selected-ion chromatograms for the mutagenic and nonmutagenic products (Fig. 4b and Figs. S21 and S22).
We found that, intriguingly, the replication products of the Sp-Me-PTEs at the three XT sites (X = A, C, or G) are largely independent of the neighboring 5′ nucleosides, with ~85–90% and ~5–10% of products carrying AT and TG at bases flanking the initial damage site (XT), respectively, although replication across C(Me)T also yields 7% of nonmutagenic replication product (Fig. 4b and Figs. S19 and S20). Additionally, none of the Rp diastereomers of the Me-PTEs in the XT sequences, none of the Me-PTEs in the TX sequences, and none of the nBu-PTEs in the TX or XT sequences are mutagenic (Figs. S19, S23, S29, and S30).
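The product space described above can be sketched in a few lines of Python. This is a minimal illustration, not the analysis pipeline used in the study: it enumerates the 16 possible 10-mer products d(GGCMNGCTAT) and converts calibrated peak areas into percentages of the total. The function names and the example peak areas are hypothetical and serve only to show the arithmetic.

```python
from itertools import product

BASES = "ACGT"

def candidate_products():
    """Return the 16 possible 10-mers d(GGCMNGCTAT), keyed by the MN dinucleotide."""
    return {m + n: f"GGC{m}{n}GCTAT" for m, n in product(BASES, repeat=2)}

def product_distribution(calibrated_areas):
    """Convert calibrated peak areas (keyed by MN) into percentages of the total."""
    total = sum(calibrated_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {mn: 100.0 * area / total for mn, area in calibrated_areas.items()}

products = candidate_products()

# Hypothetical calibrated peak areas, for illustration only (not measured data).
example_areas = {"AT": 880.0, "TG": 70.0, "CT": 50.0}
distribution = product_distribution(example_areas)
```

For a lesion placed at a CT site, the entry keyed by "CT" would correspond to the nonmutagenic product, and every other key to a mutagenic one.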

Impacts of alkyl-PTE lesions on the efficiency of DNA replication in E. coli cells
The bypass efficiencies of the Me-PTE lesions were quantified by comparing the relative intensity of signal for the 10-mer products from the lesion-containing or control genome to that of the 13-mer replication product from the competitor genome; this value was adjusted based on the molar ratios of lesion/competitor and control/competitor genomes used for the transfection (Fig. 4a). For Me-PTEs, none of the Sp-PTEs at the three XT sites were strong impediments to DNA replication, although the corresponding Rp diastereomers significantly blocked DNA replication. Additionally, the blockage effects were more pronounced when the flanking 5′-nucleobase was a purine (adenine or guanine). Meanwhile, all Me-PTEs at TX sites elicited moderate blockage effects on DNA replication.
We also examined whether SOS-induced DNA polymerases (Pol II, Pol IV, and Pol V) promote the replicative bypass of different Me-PTEs. Our results showed that, similar to the results obtained for the alkyl-PTE lesions at the TT site (7), simultaneous depletion of all three SOS-induced DNA polymerases did not exert any apparent effects on the replication bypass efficiencies for the Me-PTE lesions (Fig. 4a). For nBu-PTEs, we found that none of the Sp-PTEs at XT sites significantly impede DNA replication in E. coli cells, whereas the Rp diastereomer in the XT sequence and both diastereomers of nBu-PTEs in the TX sequence elicit moderate blockage effects (Fig. 4c).
Impact of Ada protein on the efficiency and fidelity of replication across alkyl-PTE lesions in E. coli cells
Considering our previous observation that Ada protein can influence the mutagenicity and cytotoxicity of Sp-Me-PTE at the TT dinucleotide site (7), we next asked how Ada protein affects the replication bypass efficiencies and mutation patterns of Sp-Me-PTEs in different flanking base sequences (Fig. 5 and Figs. S25-S27). We found that genetic ablation of the ada gene resulted in a moderate decline in bypass efficiencies for the Sp-Me-PTEs in all three XT sequences, and a complete abrogation of mutations. However, depletion of the ada gene did not elicit significant changes in efficiency or accuracy in replication across the Sp-Me-PTE in any of the three TX sequences.

Discussion
We previously investigated how size and stereochemistry of the alkyl-PTEs at the TT site influence DNA replication in E. coli (7). Here, we systematically investigated how compositions of the flanking nucleobases of the alkyl-PTE lesions affect DNA replication. We found that replication across Me-PTEs in XT sequences (X = A, C, or G) shares common features with what we observed for the replicative bypass of T(Me)T (7). First, we demonstrated that neither the replication bypass efficiency nor the mutation frequency of Me-PTEs was impacted by concurrent ablation of the three SOS-induced DNA polymerases. Second, we revealed that none of the Sp-Me-PTEs in the three XT sequences suppress DNA replication in E. coli cells, whereas the Rp-Me-PTEs in these sequence contexts exhibit significant replication blockage effects. Third, we show that Sp-Me-PTEs at XT sites are mutagenic and Ada protein is indispensable for the mutagenic bypass. In this vein, removal of Ada protein also resulted in decreases in bypass efficiencies for the Sp-Me-PTEs in the three XT sequences.
We also uncovered some unique features regarding replicating past alkyl-PTEs with different flanking base sequences. Strikingly, we found that the distribution of replication products for Sp-Me-PTEs in the three XT sequences (X = A, C, or G) was largely independent of the 5′-neighboring nucleobase being an A, C, or G, where ~85–90% and ~5–10% of the replication products were with AT and TG being inserted at the initial XT site, respectively (Fig. 4b). In this context, it is worth noting that Sp-Me-PTE at the TT site induced 50% TT→GT and 15% TT→GC mutations, and the induction of these mutations also entails Ada protein (7).
Ada was reported to remove the methyl group from Sp-Me-PTEs, where Cys-38 on the N-terminal domain of the protein interacts with the methyl group on the DNA backbone (13,14). A previous structural study showed that the N-terminal domain of Ada protein preferentially recognizes A/T in the Ada box (13,15). In this recognition, Arg-45 participates in hydrogen bonding interactions with thymine residues in both strands, and the binding with Arg-45 is sterically incompatible with a G positioned in the last two bases of the Ada box (13,15), which is consistent with the sequence specificity observed for replication past alkyl-PTEs.
In contrast to what we found for Me-PTEs at XT sites (X = A, C, G, or T), replication across the Me-PTEs at TX sites (X = A, C, or G) was accurate. In addition, neither the efficiency nor fidelity of replication across any of the Me-PTEs at TX site was altered upon genetic ablation of Ada. We also examined the replication past nBu-PTEs and found that the trends in bypass efficiency were similar to what we found for Me-PTEs (Fig. 4c); however, none of nBu-PTEs were mutagenic.
The results from our replication studies prompted us to propose a tentative model where Ada protein binds to Sp-Me-PTE lesions formed at XT dinucleotide sites, and this binding is maintained during the replicative bypass of these lesions in ssDNA. In this vein, the very high levels of AT (85–90%) and TG (5–10%) replication products induced at the Sp-Me-PTE-containing XT (X = A, C, or G) dinucleotide sites, together with the dependence of these product distributions on Ada, suggest that Ada protein is bound with the lesion during replicative bypass of the Sp-Me-PTE at these sites (Fig. 4b). Moreover, this interaction with Ada protein also assists the replicative bypass of these lesions in E. coli cells. The lack of apparent dependence of the distributions of replication products for the Sp-Me-PTE lesion in the three XT sequences (X = A, C, or G) on the identities of the 5′-flanking nucleobases strongly suggests their lack of recognition by DNA polymerases during nucleotide incorporation at the site. These results, together with the previous observation that an arginine residue in REV1 could direct the incorporation of dCMP through direct hydrogen bonding interaction with the nucleobase in the incoming nucleotide (16), indicate that some amino acid residue(s) in the Ada protein may direct nucleotide incorporation when the E. coli replication machinery incorporates a nucleotide opposite the 5′-flanking base of the Me-PTE lesions.
Harper and Lee (17) analyzed the mutations induced by N-methyl-N′-nitro-N-nitrosoguanidine in 16 different strains of E. coli, and they found that 96.6% of the 4099 detected mutations were G→A transition mutations, which were attributed to O6-MedG (18,19). Our results, however, suggest that the G→A mutation may also arise, in part, from Sp-Me-PTE formed at the GT site.
In summary, the results from our shuttle vector-based replication study showed that flanking base sequences play important roles in DNA replication across Me- and nBu-PTE lesions. Sp-Me-PTEs and Sp-nBu-PTEs at XT sites (X = A, C, or G) were not strong impediments to DNA replication, whereas their Rp counterparts exhibited blockage effects. Meanwhile, Me- and nBu-PTEs at TX sites moderately block DNA replication in E. coli. Furthermore, replication across Sp-Me-PTEs at XT sites is mutagenic, which requires the presence of Ada protein. However, this phenomenon was not observed for Rp-Me-PTEs at XT sites, which is consistent with the notion that Ada protein does not recognize Rp-Me-PTEs. There are two established functions for Ada protein, i.e. removal of the methyl group from Sp-Me-PTE or O6-MedG, and transcriptional activation of ada, alkA, alkB, and aidB genes (3,20,21). The N-terminal domain of Ada is required for its functions in the repair of Sp-Me-PTE and transcriptional regulation (13,22), whereas the C-terminal domain of Ada is necessary for the repair of O6-MedG (23). Our work suggests that, aside from these two well-characterized functions, Ada may assume other important functions in cells, i.e. by binding and modulating the replicative bypass of Sp-Me-PTE lesions in some sequence contexts. In the future, it will be important to investigate whether a mutant form of Ada protein that is competent in binding with Sp-Me-PTE, but deficient in removal of the methyl group from the lesion (e.g. a Cys-38 mutant), can still support the replicative bypass of the lesion, and if so, whether this role of Ada is modulated by the flanking nucleobases of Sp-Me-PTE. Additionally, future studies about how alkyl-PTE lesions influence DNA replication in mammalian cells will also provide additional insights into the biological impacts of this unique class of DNA damage.

Chemical syntheses
The detailed materials, synthetic procedures, reaction yields, and spectroscopic characterizations of compounds are provided in the supporting information, and the NMR spectra for these compounds are shown in Figs. S1-S4.

ODN synthesis
A Beckman Oligo 1000M DNA synthesizer (Fullerton, CA, USA) was used to synthesize the 12-mer lesion-containing ODNs, 5′-ATGGCX(Y)TGCTAT-3′ and 5′-ATGGCT(Y)XGCTAT-3′ (X represents A, C, or G; Y represents an Me or nBu group) at 1 µmol scale. The synthesized phosphoramidite building block was dissolved in anhydrous acetonitrile at a concentration of 67 mM. Incorporation of unmodified nucleotides was conducted by using commercially available ultramild phosphoramidite building blocks (Glen Research Inc., Sterling, VA, USA) following standard protocols. Synthesized ODNs were cleaved and deprotected from controlled pore glass with concentrated ammonium hydroxide at room temperature for 55 min. After solvent removal using a SpeedVac, the solid residues were dissolved in water and HPLC purified.

HPLC
HPLC separation was conducted on an Agilent 1100 system with a Synergi Fusion-RP column (10 × 150 mm, 4 µm in particle size and 80 Å in pore size; Phenomenex Inc., Torrance, CA, USA). Triethylammonium acetate (TEAA) solution (50 mM, pH 6.8), and a mixture of 50 mM TEAA and acetonitrile (70:30, v/v) were employed as mobile phases A and B, respectively. The gradient profile was 5-30% B in 5 min and 30-60% B in 70 min, and the flow rate was 0.8 ml/min. The HPLC traces for the purification of the 12-mer lesion-containing ODNs are shown in Fig. S5 and their ESI-MS and MS/MS are provided in Figs. S6-S17.
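As a rough worked example of the gradient arithmetic, the Python sketch below interpolates the percentage of mobile phase B over time and converts it into acetonitrile content. It assumes, without confirmation from the text, that the two gradient segments run consecutively (5 to 30% B over the first 5 min, then 30 to 60% B over the following 70 min) and that mobile phase A contains no acetonitrile. The names are illustrative.

```python
# Assumed reading of the gradient: two consecutive linear segments,
# 5 -> 30% B over the first 5 min, then 30 -> 60% B over the next 70 min.
GRADIENT = [(0.0, 5.0), (5.0, 30.0), (75.0, 60.0)]  # (time in min, % B)
ACN_FRACTION_IN_B = 0.30  # mobile phase B is 70:30 (v/v) TEAA:acetonitrile

def percent_b(t_min):
    """Linearly interpolate % B at time t_min within the programmed gradient."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time outside the programmed gradient")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)

def percent_acetonitrile(t_min):
    """Acetonitrile content (% v/v) of the eluent, assuming phase A has none."""
    return ACN_FRACTION_IN_B * percent_b(t_min)
```

Under these assumptions the eluent reaches roughly 18% acetonitrile at the end of the gradient, since 60% of a 30% acetonitrile solution is 18%.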
The single-stranded lesion-containing and lesion-free competitor M13 genomes were prepared following published procedures (Fig. S18) (12). Briefly, 20 pmol of M13mp7 (L2) plasmid was digested with 40 units EcoRI-HF at 25°C for 8 h to linearize the vector. Two scaffolds, 5′-CTTCCACTCACTGAATCATGGTCATAGCTTTC-3′ and 5′-AAAACGACGGCCAGTGAATTATAGC-3′ (25 pmol each), were subsequently annealed with the linearized vector. A phosphorylated 22-mer lesion-containing or lesion-free control ODN, or a 25-mer competitor ODN, was then added to the mixture and incubated with scaffold ODNs and T4 DNA ligase at 16°C for 8 h. Unligated linear vector and ODNs were removed by the exonuclease activity of T4 DNA polymerase (22.5 units, 16°C for 2 h). The resulting plasmids were purified using Cycle Pure Kit (Omega), and the purified lesion-containing and lesion-free control plasmids were subsequently normalized against the competitor plasmid (12).
Preparation of Ada-deficient AB1157 E. coli strain
P1 transduction was employed to obtain the Ada-deficient E. coli strain (Δada::kan) in the AB1157 background from the JW2201-1 strain (26). The genotype of the ensuing deficient strain was confirmed by antibiotic resistance and PCR followed by sequencing.

Transfection of M13 genomes into E. coli cells
The lesion-free control or lesion-containing plasmids were mixed with the competitor plasmid at a 1:1 molar ratio. The mixtures were transfected into electrocompetent WT AB1157 E. coli strains as well as the isogenic cells deficient in Ada, or all three SOS-induced DNA polymerases (i.e. Pol II, Pol IV and Pol V, TKO) (11). The transfected E. coli cells were cultured at 37°C for 5.5 h and the M13 phage was isolated from the supernatant by centrifugation at 13,200 rpm for 5 min. The purified M13 phage was transfected into SCS110 E. coli cells for amplification, followed by extraction with QIAprep Spin M13 kit (Qiagen) to obtain M13 ssDNA template for PCR amplification.

Quantification of bypass efficiency by the competitive replication and adduct bypass assay
We utilized a modified version of the competitive replication and adduct bypass assay to assess the bypass efficiency of alkyl-PTE lesions in E. coli cells (10-12,19). The regions of interest in the progeny M13 genomes were amplified by PCR with the use of Phusion high-fidelity DNA polymerase. The primers were 5′-YCAGCTATGACCATGATTCAGTGAGTGGA-3′ and 5′-YTCGGTGCGGGCCTCTTCGCTATTAC-3′, where Y represents a 5′-amino modifier conjugated to the 5′-phosphate group of ODNs, i.e. H2N(CH2)6-. The PCR amplification started from 98°C for 30 s, followed by 35 cycles of amplification, with each cycle consisting of 98°C for 10 s, 65°C for 30 s, and 72°C for 15 s, and then with a final extension at 72°C for 5 min, ending at 4°C. The PCR products were purified by Cycle Pure kit (Omega).
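For readers who script their thermocycler setup, the cycling conditions above can be written down as a small data structure. The sketch below is only one possible encoding: the class and variable names are hypothetical, and the computed time counts programmed hold times only, ignoring ramping between temperatures and the final 4°C hold.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Step:
    temp_c: int   # block temperature in degrees Celsius
    seconds: int  # programmed hold time

INITIAL_DENATURATION = Step(98, 30)
CYCLE = (Step(98, 10), Step(65, 30), Step(72, 15))  # denature, anneal, extend
N_CYCLES = 35
FINAL_EXTENSION = Step(72, 5 * 60)

def programmed_hold_seconds():
    """Sum of programmed hold times, excluding ramp times and the 4 degree hold."""
    per_cycle = sum(step.seconds for step in CYCLE)
    return (INITIAL_DENATURATION.seconds
            + N_CYCLES * per_cycle
            + FINAL_EXTENSION.seconds)
```

With these settings the programmed holds add up to 2255 s, or just under 38 min.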
The PCR products (100 ng) were digested with BbsI-HF restriction endonuclease (10 units) and recombinant shrimp alkaline phosphatase (rSAP, 10 units) in 10 µl 1× CutSmart buffer (New England Biolabs) at 37°C for 25 min, followed by deactivation of rSAP at 80°C for 10 min. To the above mixture were added 5 mM DTT, 1.66 pmol [γ-32P]ATP, 10 units T4 polynucleotide kinase (T4 PNK), CutSmart buffer, and water to give a total volume of 15 µl. The mixture was incubated at 37°C for 30 min, and T4 PNK was then deactivated by heating the solution at 70°C for 10 min. The resulting mixture was further digested with 10 units MluCI at 37°C for 25 min and subsequently quenched by adding 15 µl formamide gel-loading buffer containing xylene cyanol FF and bromphenol blue dyes. The radiolabeled digestion mixtures were resolved using a 30% native polyacrylamide gel (19:1 acrylamide:bis) and the intensities for the radiolabeled gel bands were measured by using a Typhoon 9410 imager.
The aforementioned digestion procedures yield a 10-mer duplex, 5′-p*GGCMNGCTAT-3′/5′-AATTATAGCY-3′, for full-length replication products of the lesion-containing plasmid, with M and N being the nucleobases at the dinucleotides initially flanking the alkyl-PTE lesions, Y being the complementary base of N in the opposite strand, and p* being the radiolabeled phosphate. The corresponding digestion of the PCR product for the competitor genome yielded a 13-mer duplex: 5′-p*GGCGATAAGCTAT-3′/5′-AATTATAGCTTAT-3′. The bypass efficiency was calculated by: Bypass efficiency (%) = (lesion signal/competitor signal)/(control signal/competitor signal) × 100%.
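The bypass efficiency formula can be expressed as a short Python function. This is a minimal sketch under the assumption that each sample (lesion-containing or control) has its own co-transfected competitor band, so the competitor signal is read separately from the lesion lane and the control lane. The function name, parameter names, and example intensities are hypothetical.

```python
def bypass_efficiency(lesion_signal, lesion_competitor_signal,
                      control_signal, control_competitor_signal):
    """Bypass efficiency (%) = (lesion/competitor) / (control/competitor) x 100.

    Each competitor signal is the 13-mer band measured in the same lane as the
    corresponding 10-mer band (lesion-containing or control genome).
    """
    if lesion_signal < 0:
        raise ValueError("lesion signal cannot be negative")
    for value in (lesion_competitor_signal, control_signal,
                  control_competitor_signal):
        if value <= 0:
            raise ValueError("competitor and control signals must be positive")
    lesion_ratio = lesion_signal / lesion_competitor_signal
    control_ratio = control_signal / control_competitor_signal
    return 100.0 * lesion_ratio / control_ratio

# Hypothetical band intensities, for illustration only (not measured data).
example = bypass_efficiency(400.0, 1000.0, 800.0, 1000.0)
```

Dividing by the control ratio normalizes away differences in ligation and transfection efficiency that affect lesion-free and lesion-containing genomes alike, which is why the competitor genome is included in every transfection.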

Identification and quantification of mutation frequencies by MS
Approximately 3 µg PCR products in 250 µl of 1× CutSmart buffer was mixed with 50 units BbsI-HF and 20 units rSAP, and incubated at 37°C for 2 h, followed by deactivation of rSAP at 80°C for 20 min. MluCI (20 units) was subsequently added to the mixture and the digestion was continued at 37°C for 1 h. The resulting mixture was extracted once with phenol/chloroform/isoamyl alcohol (25:24:1, v/v) and the aqueous phase was evaporated, desalted by Waters Oasis HLB extraction cartridges (Milford, MA, USA), and redissolved in 10 µl water. A 5-µl aliquot was analyzed by LC-MS/MS on an LTQ linear ion trap mass spectrometer (Thermo Electron, San Jose, CA, USA) with an Agilent Zorbax SB-C18 column (0.5 × 150 mm, 5 µm in particle size). The gradient was 5 min of 5-20% methanol followed by 35 min of 20-50% methanol in 400 mM HFIP (pH was adjusted to 7.0 with triethylamine). The temperature for the ion transfer tube was 300°C, and the mass spectrometer was set up for acquiring the higher-resolution ultra-zoom scan MS and full-scan MS/MS for the [M-3H]3− ions of 10-mer ODNs, d(GGCMNGCTAT), with M and N being A, T, C, and G.

Data availability
All data are contained within the manuscript.