Thymine DNA Glycosylase Can Rapidly Excise 5-Formylcytosine and 5-Carboxylcytosine

Background: Thymine DNA glycosylase is essential for active DNA demethylation and embryonic development.
Results: TDG rapidly excises 5-formylcytosine (fC) and 5-carboxylcytosine (caC), products of Tet-mediated oxidation of 5-methylcytosine.
Conclusion: Excision of fC and caC is consistent with TDG specificity for removing modified C or mC from CpG sites.
Significance: The results suggest that active DNA demethylation could involve TDG excision of Tet-produced fC (or caC) and subsequent BER.

Thymine DNA glycosylase (TDG) excises T from G·T mispairs and is thought to initiate base excision repair (BER) of deaminated 5-methylcytosine (mC). Recent studies show that TDG, including its glycosylase activity, is essential for active DNA demethylation and embryonic development. These and other findings suggest that active demethylation could involve mC deamination by a deaminase, giving a G·T mispair followed by TDG-initiated BER. An alternative proposal is that demethylation could involve iterative oxidation of mC to 5-hydroxymethylcytosine (hmC) and then to 5-formylcytosine (fC) and 5-carboxylcytosine (caC), mediated by a Tet (ten eleven translocation) enzyme, with conversion of caC to C by a putative decarboxylase. Our previous studies suggest that TDG could excise fC and caC from DNA, which could provide another potential demethylation mechanism. We show here that TDG rapidly removes fC, with higher activity than for G·T mispairs, and has substantial caC excision activity, yet it cannot remove hmC. TDG excision of fC and caC, oxidation products of mC, is consistent with its strong specificity for excising bases from a CpG context. Our findings reveal a remarkable new aspect of specificity for TDG, inform its catalytic mechanism, and suggest that TDG could protect against fC-induced mutagenesis. The results also suggest a new potential mechanism for active DNA demethylation, involving TDG excision of Tet-produced fC (or caC) and subsequent BER. Such a mechanism obviates the need for a decarboxylase and is consistent with findings that TDG glycosylase activity is essential for active demethylation and embryonic development, as are mechanisms involving TDG excision of deaminated mC or hmC.

DNA methylation at CpG dinucleotides in DNA is an important epigenetic modification that is implicated in biological processes including regulation of gene expression and silencing of transposons (1,2). Maintaining proper CpG methylation is essential for embryonic development, and aberrant CpG methylation is implicated in cancer. Although it is well known that cytosine 5-methyltransferases convert C to 5-methylcytosine (mC), the mechanism for demethylation remains to be established (3). Several proposed mechanisms for active mC demethylation are shown in Fig. 1, and most involve a DNA glycosylase and subsequent base excision repair (BER) (3). Although plants have DNA glycosylases that excise mC (1), animals do not; thus, glycosylase-mediated demethylation likely begins with deamination or oxidation of mC.
One potential mechanism involves active deamination of mC to T by an AID/APOBEC enzyme, giving a G·T mispair that is converted to G·C by thymine DNA glycosylase (TDG) and subsequent BER (4-6). Consistent with such a mechanism, recent studies show that TDG, including its glycosylase activity, is essential for active demethylation and for embryonic development (7,8). Other studies suggest a role in active demethylation for another G·T glycosylase (9,10), methyl binding domain IV (MBD4) (11), which is not essential for development (12).
Other potential pathways for demethylation were revealed by the discovery that Tet proteins (Tet1-3) convert mC to 5-hydroxymethylcytosine (hmC) (13,14) and that hmC is found in mammalian DNA (13,15,16). Thus, demethylation could involve deamination of hmC to hmU by an AID/APOBEC enzyme followed by BER-mediated conversion of hmU to C (3,17). Previous studies show that hmU can be efficiently excised by the DNA glycosylase SMUG1 (18) or by TDG (19), which is highly specific for excising bases from a CpG context (20,21).
It was also proposed that demethylation could involve iterative oxidation, where a Tet enzyme would oxidize mC to hmC and further to 5-formylcytosine (fC) and 5-carboxylcytosine (caC), with conversion of caC to C by a putative decarboxylase (3,22). Precedent for such a mechanism includes a pathway for iterative oxidation of T to 5-carboxyluracil (caU) and decarboxylation of caU to U (23,24). Recent studies confirm that Tet enzymes can oxidize hmC to fC and to caC and show that fC and caC are found in mammalian DNA (25,26). However, a decarboxylase that can convert caC to C has not been identified.
Our previous studies suggest that TDG can excise fC and perhaps caC (19), which could potentially obviate the need for a decarboxylase and provide a novel mechanism for active DNA demethylation. We showed that TDG activity is greater for C (and U) analogs with an electron-withdrawing C5-substituent (σm > 0) that can stabilize negative charge developing on the excised base in the chemical transition state (19) (σm is the electronic substituent constant) (27). In particular, we found that TDG can excise 5-hydroxycytosine (hoC, σm = 0.12) and 5-fluorocytosine (5FC, σm = 0.34) but not C (σm = 0) or mC (σm = −0.07) (19). The formyl group (CHO) is strongly electron-withdrawing (σm = 0.35) (27), suggesting that TDG could remove fC. The C5-carboxyl of caC has modest electron-withdrawing effects (28), suggesting potential TDG excision activity for caC. The negligible electronic effect for CH2OH (σm = 0) (27) suggests that hmC may be a poor TDG substrate.
We examine these possibilities here. As discussed below, our findings provide important insight into the specificity and catalytic mechanism of TDG and suggest that TDG might protect against fC-induced mutagenesis. In addition, our results suggest a new potential mechanism for active demethylation of CpG sites.

EXPERIMENTAL PROCEDURES
Materials-Human TDG was expressed and purified as described previously, quantified by absorbance (ε280 = 31.5 mM⁻¹ cm⁻¹) (21,29), flash-frozen, and stored at −80°C. The duplex DNA substrates consisted of a target strand, 5′-GGG AGA AGA GGA GGA AXG AAG AGA GCTC-3′ where X = T, hmC, fC, or caC, and a complement that places G opposite the target base (X). This DNA construct also places the target base in a CpG context. Oligodeoxynucleotides (ODNs) were synthesized (trityl-on) at the Keck Foundation Biotechnology Resource Laboratory, Yale University, using phosphoramidites for hmC, fC, and caC from Glen Research. The ODNs containing these bases were synthesized and deprotected following the manufacturer's instructions. ODNs were purified using Glen-Pak cartridges (Glen Research) and quantified by absorbance (260 nm) (19). Following deprotection and purification, the 5-CH(OH)CH2OH-dC substituent was converted to 5-formyl-dC by reacting the ODN with NaIO4 (0.1 M) at 4°C for 30 min (30), and the ODN was desalted with a Glen-Pak cartridge. The purity of the final ODNs was verified by analytical anion-exchange HPLC under denaturing (pH 12) conditions (19).
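The sequence context described above can be checked with a short sketch. The target strand is taken from the text; the complement shown here is an assumption (a full Watson-Crick complement with G placed opposite X), because the complementary strand sequence is not given, and the function names are illustrative only.

```python
# Target strand from the text, 5'->3', spaces removed; "X" marks the target base.
TARGET_STRAND = "GGGAGAAGAGGAGGAAXGAAGAGAGCTC"


def target_in_cpg_context(strand, marker="X"):
    """Return True if the marked base is immediately followed (3' side) by G."""
    i = strand.index(marker)
    return i + 1 < len(strand) and strand[i + 1] == "G"


def complement_with_g_opposite_target(strand, marker="X"):
    """Return an assumed complementary strand (5'->3') with G opposite the marker."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G", marker: "G"}
    return "".join(pairs[base] for base in reversed(strand))


if __name__ == "__main__":
    print(len(TARGET_STRAND), "nt target strand")
    print("Target base in CpG context:", target_in_cpg_context(TARGET_STRAND))
    print("Assumed complement (5'->3'):", complement_with_g_opposite_target(TARGET_STRAND))
```

Under this assumption the G opposite X is itself preceded by C on the complementary strand, so both strands carry a CpG at the target site.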
Determination of Glycosylase Activity-The glycosylase reactions were performed at 22 or 37°C, with 0.5 μM DNA substrate and 5 μM TDG, in HEMN.1 buffer (0.02 M HEPES, pH 7.5, 0.1 M NaCl, 0.2 mM EDTA, 2.5 mM MgCl2). Reactions were initiated by mixing concentrated TDG with buffered substrate, quenched with 50% (v/v) 0.3 M NaOH, 0.03 M EDTA, and incubated at 85°C for 15 min to cleave the DNA at TDG-induced abasic sites. Cleavage was not observed for any substrate subjected to quenching and heating without TDG.
For the electrophoretic assay of glycosylase activity, a sample from a quenched reaction was diluted 10-fold, mixed with sample buffer, and subjected to denaturing PAGE with a 15% polyacrylamide gel (Invitrogen). The gel was imaged using a Typhoon 9400 imager (GE Healthcare). For the reactions analyzed by denaturing PAGE, the target strand of DNA substrates contained 3′-fluorescein-dT (31).
For the single turnover kinetics experiments, reaction progress was analyzed by anion-exchange HPLC under denaturing (pH 12) conditions (19). The data were fitted by non-linear regression to Equation 1 using Grafit 5 (32),
$$\text{Fraction product} = A\,\bigl(1 - \exp(-k_{\text{obs}}\,t)\bigr) \qquad \text{(Eq. 1)}$$
where A is the amplitude, k_obs is the rate constant, and t is the reaction time (in min). Under the saturating enzyme conditions used, the reaction is not impacted by enzyme-substrate association or by product release or inhibition, such that k_obs reflects the maximal base excision rate (k_obs ≈ k_max) (19,21).
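As an illustration of fitting Equation 1, the following minimal sketch recovers A and k_obs from a time course by least squares using only the Python standard library. It is not the Grafit procedure used in the study: the search strategy (closed-form A for each trial k_obs, a log-spaced scan, then golden-section refinement), the function names, and the synthetic data points are all assumptions made for the example.

```python
import math


def fraction_product(t, amplitude, k_obs):
    """Eq. 1: fraction product = A * (1 - exp(-k_obs * t)), with t in min."""
    return amplitude * (1.0 - math.exp(-k_obs * t))


def _amplitude_and_sse(k, times, fractions):
    # For a fixed k the model is linear in A, so the best A has a closed form.
    g = [1.0 - math.exp(-k * t) for t in times]
    amplitude = sum(f * x for f, x in zip(fractions, g)) / sum(x * x for x in g)
    sse = sum((f - amplitude * x) ** 2 for f, x in zip(fractions, g))
    return amplitude, sse


def fit_single_exponential(times, fractions, k_low=1e-4, k_high=1e2,
                           n_grid=400, n_refine=60):
    """Return (A, k_obs) minimizing the sum of squared residuals."""
    # Coarse scan over log-spaced trial values of k_obs (min^-1).
    log_lo, log_hi = math.log(k_low), math.log(k_high)
    step = (log_hi - log_lo) / (n_grid - 1)
    grid = [math.exp(log_lo + i * step) for i in range(n_grid)]
    sses = [_amplitude_and_sse(k, times, fractions)[1] for k in grid]
    i_best = min(range(n_grid), key=sses.__getitem__)
    lo = grid[max(i_best - 1, 0)]
    hi = grid[min(i_best + 1, n_grid - 1)]
    # Golden-section refinement inside the bracketing interval.
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(n_refine):
        c = hi - inv_phi * (hi - lo)
        d = lo + inv_phi * (hi - lo)
        if _amplitude_and_sse(c, times, fractions)[1] < _amplitude_and_sse(d, times, fractions)[1]:
            hi = d
        else:
            lo = c
    k_fit = 0.5 * (lo + hi)
    return _amplitude_and_sse(k_fit, times, fractions)[0], k_fit


if __name__ == "__main__":
    # Synthetic, noise-free time course (hypothetical values, not measured data).
    times = [0.1, 0.25, 0.5, 1.0, 2.0, 4.0]  # min
    fractions = [fraction_product(t, 0.95, 2.64) for t in times]
    amplitude, k_obs = fit_single_exponential(times, fractions)
    print(f"A = {amplitude:.3f}, k_obs = {k_obs:.3f} min^-1")
```

At least one time point must be greater than zero, otherwise the amplitude is undefined.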

RESULTS
TDG Rapidly Excises fC and caC but Not hmC-We examined the ability of human TDG to excise hmC, fC, and caC, using DNA substrates in which these bases are paired with G and located in a CpG site, consistent with the specificity of TDG and the context in which oxidized mC bases will likely arise in mammalian DNA. As shown in Fig. 2, TDG can rapidly excise fC and caC from DNA, converting a substantial fraction of fC and caC substrate to abasic DNA product in 30 s at 37°C. Moreover, its activity for fC is similar to that for excision of T from a G·T mispair (Fig. 2).
In sharp contrast, TDG exhibits no significant activity for hmC in the 30-s time period during which substantial fC and caC excision is observed (Fig. 2). Moreover, no significant hmC excision is evident for much longer reaction times of 1 or 2 h at 37°C or even for 18 h at 22°C (lower temperature used to maintain enzyme stability). Thus, TDG cannot remove hmC from DNA, similar to previous findings that it cannot excise C or mC (19,20).
Quantifying TDG Activity for fC, caC, and hmC-To quantitatively compare the activity of TDG for excising fC and caC relative to that for a G·T mispair, we performed single turnover kinetics experiments, which provide the maximal rate of base excision (k_max) for a given substrate (19,21). The kinetics data are shown in Fig. 3, and the rate constants are reported in Table 1. As shown in Fig. 3A, TDG rapidly excises fC from a G·fC substrate, with k_max = 2.64 ± 0.09 min⁻¹. The reaction occurs with a half-life of ~15 s and is essentially complete within 2 min. Fig. 3B shows that TDG also possesses substantial activity for excising caC from a G·caC pair, with k_max = 0.47 ± 0.01 min⁻¹, and a corresponding reaction half-life of ~1.5 min. TDG activity for a G·T mispair, k_max = 1.83 ± 0.04 min⁻¹, falls between that for G·fC and G·caC. Thus, relative to a G·T substrate, TDG activity is 40% faster for fC and 4-fold slower for caC (at 37°C).
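The half-lives and relative activities follow directly from the reported rate constants, since a first-order reaction has a half-life of ln 2 / k. A minimal sketch of that arithmetic, using the 37°C k_max values quoted in the text (the dictionary keys and function name are illustrative):

```python
import math

# k_max values at 37 C from the text, in min^-1.
K_MAX_37C = {"G.fC": 2.64, "G.caC": 0.47, "G.T": 1.83}


def half_life_min(k_per_min):
    """Half-life (min) of a first-order reaction with rate constant k (min^-1)."""
    return math.log(2.0) / k_per_min


if __name__ == "__main__":
    reference = K_MAX_37C["G.T"]
    for substrate, k in K_MAX_37C.items():
        t_half = half_life_min(k)
        print(f"{substrate}: t1/2 = {t_half:.2f} min ({t_half * 60:.0f} s), "
              f"k_max relative to G.T = {k / reference:.2f}")
```

With these inputs the G·fC half-life is about 16 s, the G·caC half-life is about 1.5 min, and the G·fC and G·caC rate constants are about 1.44 and 0.26 times the G·T value, respectively.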
To quantify the degree of specificity of TDG for fC and caC relative to hmC, we sought to determine whether some level of hmC activity could be observed for an extended reaction time of 48 h (at 22°C) using an HPLC assay, as we have done for other slow TDG substrates (19,21,33). Although we see some evidence of product formation, the level is below that which we could reliably quantify. Nevertheless, we can place an upper limit on TDG activity for excising hmC; k_max < 1.4 × 10⁻⁵ min⁻¹ (<4% product in 48 h). For comparison, rate constants determined here by single turnover kinetics (22°C) for G·T, G·fC, and G·caC substrates are provided in Table 1, as are previously determined rate constants for G·hoC, G·5FC, and G·5-bromocytosine. The results indicate that TDG activity is at least 44,000-fold higher for fC and 10,000-fold higher for caC, relative to hmC. Thus, TDG exhibits a striking degree of specificity for excising fC and caC, yet it completely avoids excision of hmC, mC, and C.
[Table 1 footnotes, partially recovered: ... (19) and were determined using a DNA construct that gives 2-fold lower activity (G·T) than the construct used for these studies. e, No significant activity detected in previous studies (19,20).]
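The upper limit on k_max for hmC can be reproduced from the detection threshold by inverting the single-exponential progress curve. The sketch below assumes an amplitude of 1 (the reaction would eventually go to completion), which is an assumption of this example rather than something stated in the text; the function name is illustrative.

```python
import math


def k_upper_limit(max_fraction_product, time_min):
    """Upper bound on a first-order rate constant (min^-1), assuming amplitude A = 1.

    If less than `max_fraction_product` has formed after `time_min`, then
    1 - exp(-k * t) < f, which rearranges to k < -ln(1 - f) / t.
    """
    return -math.log(1.0 - max_fraction_product) / time_min


if __name__ == "__main__":
    limit = k_upper_limit(0.04, 48 * 60)  # <4% product in 48 h
    print(f"k_max < {limit:.2e} min^-1")
```

For 4% product in 48 h this gives a bound of about 1.4 × 10⁻⁵ min⁻¹, matching the limit quoted for hmC.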

DISCUSSION
Our studies reveal that TDG can rapidly excise fC and caC from DNA but has essentially no activity for removing hmC. In addition to the mechanistic and biological implications discussed in detail below, our findings suggest that TDG may protect against a number of mutations that can be induced by fC (34). In addition to fC that may be produced by Tet enzymes (Fig. 1A), fC can be generated upon exposure of mC to UV light, ionizing radiation, and oxidizing agents (35,36). Further studies are needed to explore the potential role of TDG in countering fC-induced mutagenesis.
Specificity and Mechanism of TDG-Previous studies show that regardless of the target base, TDG has strong specificity for excising bases that are paired with G and are followed by G (20,21,37,38), consistent with interactions that TDG forms with both guanines in a CpG site (39). These findings suggest that TDG is specific for modified forms of mC because mC is generated selectively at CpG sites. Our discovery that TDG can rapidly excise fC and caC, oxidation products of mC, is entirely consistent with its specificity.
The observation that TDG readily excises fC and caC, but not hmC, is consistent with our previous findings that TDG activity is greater for C (and U) analogs harboring an electron-withdrawing C5-substituent (σm > 0) that can stabilize the negative charge that develops on the excised base in the chemical transition state (19) (σm reflects the electronic effect of the C5-substituent, Table 1). Thus, we showed that TDG can remove 5FC (σm = 0.34) and hoC (σm = 0.12) but not C (σm = 0) or mC (σm = −0.07). Of course, other substituent properties (steric, etc.) can potentially impact active-site interactions, and thus TDG activity (19,21), and additional studies are needed to more fully explore these potential effects. Nevertheless, the substituent electronic effect, and perhaps active-site interactions that could stabilize negative charge on the C5-substituent, seem to be key factors, as discussed below. Notably, our previous finding that TDG excises BrC (Table 1) indicates substantial tolerance for bulky C5-substituents (19).
Our finding that TDG cannot remove hmC is consistent with σm = 0 for CH2OH and with the absence of activity for C and mC (Table 1). It seems reasonable to speculate that rapid excision of hmC by TDG could potentially be deleterious, if hmC functions in epigenetic regulation rather than simply existing as an intermediate in active demethylation.
The rapid excision of fC by TDG is consistent with the electron-withdrawing properties of the CHO group (σm = 0.35). We suggest that the greater activity for fC relative to 5FC, which have similar σm values for CHO and fluoro substituents (Table 1), could reflect favorable electrostatic interactions with the carbonyl oxygen of CHO. Similarly, electrostatic interactions with the 5-OH group of hoC may account for previous findings that hoC is excised faster than might be predicted by σm = 0.12 (Table 1) (19). Additional studies are needed to test these ideas.
Regarding the caC activity of TDG, we first note that previous studies show that σm poorly predicts the electronic effect for the carboxyl of caC (28). The carboxyl group is deprotonated at neutral pH, and the σm of −0.10 for COO⁻ would indicate an electron-donating effect. However, the COO⁻ substituent lowers the N3 pKa of cytosine (ΔpKa = −0.3) (28,40), likely due to stabilization of COO⁻ by NH2 (Fig. 1B), indicating an electron-withdrawing effect. Note that σm = 0.37 for a COOH substituent. The robust activity of TDG for caC and the absence of activity for C and mC are consistent with a strong electron-withdrawing effect for the COO⁻ group of caC in the TDG active site, suggesting substantial stabilization of COO⁻ by the NH2 of caC and perhaps active-site interactions.
The robust activity for fC and caC is somewhat remarkable given that G·fC and G·caC base pairs are more stable than G·C pairs (28,30), which are in turn much more stable than G·T mispairs (41). Previous studies of TDG and other glycosylases show that their activity is weaker for a target base that forms more stable base pair interactions (38,42,43). To overcome this energetic handicap, TDG may provide additional interactions to stabilize base flipping for fC and caC relative to the interactions provided for T.
Implications for Active DNA Demethylation-Our results suggest a new potential mechanism for active demethylation of CpG sites. A recently proposed mechanism involves Tet-mediated iterative oxidation of mC to hmC and then to fC and caC, with conversion of caC to C by a putative decarboxylase (Fig.  1A) (25,26). However, Tet-catalyzed conversion of fC to caC appears to be fairly inefficient (25), and the required decarboxylase has not been identified.
Our finding that TDG rapidly excises fC suggests that active demethylation could involve Tet-mediated production of fC and subsequent conversion of fC to C by TDG-initiated BER (Fig. 1A). In addition, the substantial caC activity reported here suggests that TDG could also initiate BER-mediated conversion of caC to C. Both options obviate the need for a decarboxylase, which would be required for an iterative oxidation mechanism (25,26). We note that robust TDG excision of fC could account in part for findings that fC and caC are present at very low levels in DNA (25,26,44). Importantly, this new potential demethylation pathway is consistent with findings that TDG glycosylase activity is essential for active demethylation and embryonic development (7,8), as are the proposed demethylation mechanisms involving TDG excision of deaminated mC or hmC (Fig.  1A).
We note that a study published online while our manuscript was under review shows that TDG can excise caC, but did not examine its ability to remove fC (45). It was also shown that depletion of TDG in mouse embryonic stem cells leads to accumulation of caC in DNA and that caC could not be detected in TDG-proficient cells (45). This result could be explained by TDG excision of caC, as proposed (45). Alternatively, our finding that TDG excises fC much faster than caC indicates that their results could also be explained by TDG excision of fC in TDG-proficient cells (and lack of fC excision in TDG-deficient cells). In addition to more efficient TDG excision of fC (relative to caC), such a mechanism would bypass the need for Tet-mediated conversion of fC to caC (Fig. 1A). Additional studies are needed to determine whether a potential Tet-TDG-BER pathway for demethylation involves TDG excision of fC, caC, or perhaps both, and whether such a pathway is rapid enough to account for rates of active demethylation observed in vivo. Our findings here and previous studies of Tet-mediated conversion of hmC to fC (25) suggest that the Tet and TDG steps could be quite efficient (minutes timescale).
Regarding a potential role for other glycosylases in the excision of fC and caC, it has been shown that SMUG1 cannot remove fC from DNA (46). We find that MBD4 (catalytic domain) has ~120-fold lower activity for G·fC relative to a G·T substrate and that G·fC activity is ~450-fold lower for MBD4 as compared with TDG. (Note that the catalytic domain of MBD4 retains full activity for a G·T substrate (47).) Recent studies find that SMUG1 and MBD4 have no significant activity for excision of caC from DNA (45). Thus, TDG is the only glycosylase shown to have substantial activity for excision of fC or caC from DNA, consistent with findings that depletion of TDG leads to accumulation of caC in ES cells (45).