Structural insights into methylated DNA recognition by the C-terminal zinc fingers of the DNA reader protein ZBTB38

Methyl-CpG–binding proteins (MBPs) are selective readers of DNA methylation that play an essential role in mediating cellular transcription processes in both normal and diseased cells. This physiological function of MBPs has generated significant interest in understanding the mechanisms by which these proteins read and interpret DNA methylation signals. Zinc finger and BTB domain–containing 38 (ZBTB38) represents one member of the zinc finger (ZF) family of MBPs. We recently demonstrated that the C-terminal ZFs of ZBTB38 exhibit methyl-selective DNA binding within the ((A/G)TmCG(G/A)(mC/T)(G/A)) context both in vitro and within cells. Here we report the crystal structure of the first four C-terminal ZBTB38 ZFs (ZFs 6–9) in complex with the previously identified methylated consensus sequence at 1.75 Å resolution. From the structure, methyl-selective binding is preferentially localized at the 5′ mCpG site of the bound DNA, which is facilitated through a series of base-specific interactions from residues within the α-helices of ZF7 and ZF8. ZF6 and ZF9 primarily stabilize ZF7 and ZF8 to facilitate the core base–specific interactions. Further structural and biochemical analyses, including solution NMR spectroscopy and electrophoretic mobility gel shift assays, revealed that the C-terminal ZFs of ZBTB38 utilize an alternative mode of mCpG recognition from the ZF MBPs structurally evaluated to date. Combined, these findings provide insight into the mechanism by which this ZF domain of ZBTB38 selectively recognizes methylated CpG sites and expands our understanding of how ZF-containing proteins can interpret this essential epigenetic mark.

Normal cellular function relies on exquisite control of transcriptional processes, which is in part mediated by the methylation status at CpG sites. In a number of disease conditions, patterns of DNA methylation become inappropriately distributed, leading to aberrant transcriptional outcomes that promote and maintain the disease state (1-7). Transcriptional regulation at many methylated gene loci is facilitated by methyl-CpG-binding proteins (MBPs), transcription factors that have evolved to selectively recognize methylated CpG (mCpG) sites (8-11). Upon binding, these epigenetic readers translate the mCpG signal into recruitment of protein assemblies that subsequently alter local chromatin architecture (12-18). Therefore, MBPs play a critical intermediary role in regulating gene activity in both normal and diseased cells, prompting significant interest in mechanistically understanding how these MBPs select and interpret DNA methylation signals.
In recent years, high-resolution structural investigations for several Cys2His2 zinc finger (ZF)-containing proteins in complex with their cognate methylated DNA sequences have been reported (19-26). Comparisons between these structural models have afforded insight into the similarities and differences in modes of mCpG recognition for this important class of MBPs (27-29). More recently, several high-throughput screening strategies have identified a number of additional ZF-containing proteins that appear to have selective recognition of methylated DNA (30-33) but remain to be biochemically and structurally characterized. It is therefore possible that additional modes of mCpG recognition by ZF MBPs exist.
Zinc finger and BTB domain-containing protein 38 (ZBTB38) belongs to the large broad complex, tramtrack, bric-à-brac, pox zinc finger (BTB/POZ) family of transcription factors and represents one of only three members of this extended family that have demonstrated selectivity for mCpG recognition (34). Methyl-specific DNA binding of ZBTB38 was originally characterized for the N-terminal ZFs 3-5 (Fig. 1A), which have high sequence commonality with the three ZFs of ZBTB33 (Kaiso), the founding member of the ZBTB MBPs (34-36). Recently, we identified a sequence-specific DNA binding site ((A/G)TmCG(G/A)(mC/T)(G/A)) for the C-terminal set of ZBTB38 ZFs and determined that motif recognition is also methyl-selective both in vitro and within cells (37). It was further demonstrated that, depending on the gene context, the C-terminal ZBTB38 ZFs participate in modulating transcriptional outcomes (37). These findings have significant implications for delineating the overall biological function of ZBTB38, given that its transcriptional activities have been linked to a number of core cellular processes such as cellular proliferation (38), apoptosis (35,39), and replication/genomic stability (40). Using a hybrid biophysical methodology and collective structural insight from available ZF-methylated DNA structures, we previously presented an atomistic model in which it was proposed that a key arginine common to ZF MBPs was replaced with a lysine (Lys-1055) in mCpG recognition (37). Here, using a combination of high-resolution structural and biochemical analyses, we characterize the interactions of the C-terminal ZBTB38 ZFs in complex with its methylated consensus motif, termed the methylated C-terminal ZBTB38 binding sequence (mCZ38BS). From this investigation, it was determined that Lys-1055 does function as an arginine surrogate in mCpG binding. This finding provides the first evidence that alternative modes of mCpG recognition by ZF MBPs may be utilized and expands our understanding of the mechanisms by which this essential epigenetic mark is read.
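The degenerate consensus quoted above, (A/G)TmCG(G/A)(mC/T)(G/A), can be written as a simple pattern for scanning a sequence. The following is a minimal sketch only: the single-letter code `M` for 5-methylcytosine, the function name, and the example sequence are assumptions made for illustration and do not come from the study. A string match of this kind says nothing about binding affinity.

```python
import re

# Assumed encoding (not from the paper): 'M' denotes 5-methylcytosine (mC).
# Consensus positions: (A/G) T mC G (G/A) (mC/T) (G/A)
# A lookahead is used so that overlapping matches are also reported.
MCZ38BS_PATTERN = re.compile(r"(?=([AG]TMG[GA][MT][GA]))")


def find_motif_sites(seq):
    """Return (0-based start, matched 7-mer) for each consensus match on one strand."""
    return [(m.start(), m.group(1)) for m in MCZ38BS_PATTERN.finditer(seq.upper())]


# Made-up sequence embedding the high-affinity motif ATmCGGmCG as ATMGGMG.
example_sites = find_motif_sites("GGCATMGGMGTTC")
print(example_sites)
```

Only one strand is scanned here; a fuller treatment would also scan the reverse complement with the methylation marks carried over.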

Optimization of the C-terminal ZBTB ZF domain for structural investigations with methylated DNA
We determined previously that the C-terminal ZBTB38 ZFs 7-9 were necessary and sufficient for methyl-selective DNA recognition (37). However, it was also demonstrated that the highest-affinity binding was achieved in the presence of all five C-terminal ZFs (ZFs 6-10), suggesting that either or both ZF6 and ZF10 may contribute to the overall stability of the protein:DNA interaction. Thus, initial attempts to crystallize the C-terminal ZFs of ZBTB38 with the mCZ38BS focused on utilizing residues 1006-1153, encompassing ZFs 6-10 (Fig. 1A). Preliminary screening of crystallization conditions yielded several low-resolution diffracting crystals. Further optimization attempts, including varying the DNA length and introducing overhangs, failed to produce crystals with better than 7 Å diffraction despite obtaining a number of crystals with multiple morphologies.

Structure of ZBTB38 C-terminal ZFs with methylated DNA
mCZ38BS, both ZFs 6-10 and 6-9 produced well-resolved spectra indicative of high-affinity binding interactions (Fig. 1C). Moreover, the majority of resonances outside of the random coil region observed for the ZF 6-10:mCZ38BS complex spectrum were largely recapitulated in the ZF 6-9:mCZ38BS complex spectrum, indicating similar structures for the protein:DNA complexes. In contrast, ZFs 7-10 in complex with the mCZ38BS resulted in significant signal loss for many resonances, suggestive of a much weaker binding interaction (Fig. 1C).
Consistent with the solution NMR observations, ZFs 6-10 and 6-9 had nearly identical low nanomolar binding affinities (Kd values of ~5 nM) for mCZ38BS recognition, whereas ZFs 7-10 exhibited ~8-fold weaker binding (Fig. 1D and Fig. S1A). This finding was also in agreement with our previously proposed atomistic model, where the α-helices of ZFs 6-9 could be readily positioned along the major groove of the DNA, whereas the exact positioning of ZF10 remained indeterminate (37). Although these observations do not fully preclude ZF10 from interacting nonspecifically with DNA, it is possible that another primary physiological role for this ZF domain exists. Indeed, it was shown that the region of ZBTB38 encompassing the C-terminal ZFs can also participate in protein:protein interactions (40). Combined, these findings suggest that ZBTB38 ZFs 6-9 constitute the minimal subset of ZFs necessary for high-affinity DNA recognition.
Overall structure of the C-terminal ZBTB38 ZF 6-9:mCZ38BS complex
ZBTB38 ZFs 6-9 (Fig. 2A) were co-crystallized with an 18-bp oligonucleotide sequence containing the previously determined high-affinity mCZ38BS motif (ATmCGGmCG) (37) (Fig. 2, B and C). Crystallization of the complexes succeeded with a shorter oligonucleotide duplex than what was used for the EMSA and initial solution NMR experiments (Table S1). No discernible differences in protein resonance chemical shifts were observed by solution NMR in complexation with either the 21-mer or 18-mer version of the mCZ38BS. The protein:DNA complex crystallized into space group C222₁, with one complex per asymmetric unit, and the structure was determined at a resolution of 1.75 Å (Table 1). Atomic positioning was afforded for all base atoms within the mCZ38BS and ZBTB38 residues 1009-1120, which excludes three and four flexible residues at the N and C termini, respectively.
All four of the Cys2His2 ZFs adopt the canonical ββα zinc finger fold and wrap around the DNA with their α-helices oriented toward the major groove (Fig. 2, A and B). However, only ZF7 and ZF8 participate in base-specific DNA interactions, which are concentrated at the three core T8:A29, mC9:G28, and G10:mC27 base steps (Fig. 2C and Table S2). The α-helix of ZF6 spans the major groove and is primarily anchored through a series of electrostatic and direct hydrogen-bonding interactions with the phosphate backbone as well as by van der Waals contacts with the sugar rings on both DNA strands (Fig. 2C, Fig. S2A, and Table S2). These contacts from ZF6 are positioned toward the 5′ and 3′ ends of the coding and noncoding strands, respectively. ZF9 has an extended α-helix relative to ZFs 6-8, the C-terminal end of which orients away from the major groove of the DNA. This affords accumulation of a hydration layer within the corresponding major groove region of the DNA (Fig. S2B). Similar to ZF6, the N-terminal end of the ZF9 α-helix contributes analogous backbone interactions along both DNA strands toward the 3′ and 5′ ends of the coding and noncoding strands, respectively (Fig. 2C, Fig. S2C, and Table S2).
Figure 2. High-resolution structure of C-terminal ZBTB38 ZFs 6-9 in complex with its methylated DNA consensus motif. A, polypeptide sequence for ZBTB38 ZFs 6-9 (residues 1006-1124) utilized for structural analysis. Canonical amino acid positions along the ZF α-helices typically involved in DNA recognition are denoted above. B, X-ray crystal structure of ZBTB38 ZFs 6-9 in complex with the mCZ38BS_18-mer. The core ATmCGGmCG sequence is highlighted in blue, and zinc atoms for each ZF are depicted as yellow spheres. This figure was prepared using PyMOL. C, contact map summarizing the ZBTB38 ZF 6-9:mCZ38BS interactions. The dashed box denotes the location of the core mCZ38BS consensus motif. Residues are colored red, green, blue, and purple, corresponding to ZFs 6-9, respectively. Numbers in parentheses designate the residue position along the α-helix. Black arrows define van der Waals interactions, pink arrows define classical hydrogen bonds, and blue arrows define water-mediated hydrogen bonds. A complete list of contacts is available in Table S2.
Both ZF6 and ZF9 also contribute interactions between residues along their respective α-helices (Met-1027 and Ser-1107) and the C5-methyl groups of thymine bases found along the noncoding strand (T30 and T22) (Fig. 2C, Fig. S2A, Fig. S2C, and Table S2). The side-chain interactions observed between the Cγ-methyl of Met-1027 (Fig. S1A) and the C5-methyl of T30 in conjunction with a contact between the Cγ2-methyl of Val-1049 (ZF7) and the C5-methyl of T30 may explain the slight observed preference for having an A:T step in the first position of the mCZ38BS (37) (corresponding to A7:T30 in Fig. 1C). Overall, it appears that ZFs 6 and 9 primarily function to stabilize ZF7 and ZF8 for making core base-specific interactions.

Molecular basis for recognition of the core TmCG consensus
Of the high-resolution structures currently available for ZF MBPs in complex with methylated DNA, the ZBTB38 ZF 6-9:mCZ38BS complex structure most resembles the ZBTB33:mDNA structures (19,25) in that two ZF helices provide residues for mCpG recognition. Further, although the ZBTB33 methylated consensus motif contains two consecutive mCpG (mCpGmCpG) sites, structural and biochemical evidence has established that methyl sensitivity is primarily conferred by base-specific contacts at the 5′ mCpG, whereas methylation at the 3′ mCpG is less essential (19,25). Similarly, the C-terminal ZBTB38 ZFs preferentially recognize the 5′ mCpG site within the mCZ38BS (⁷ATmCGGmCG¹³) (37). Although optimal binding between the ZBTB38 C-terminal ZFs and the mCZ38BS was observed when both CpG sites were methylated, biochemical analysis demonstrated that the methylation status of the 3′ mCpG contributed less to the overall binding interaction (37). Finally, the adjacent base preceding the 5′ mCpG in the mCZ38BS was identified from in vitro selection methodologies to be exclusively a thymine (37), which is consistent with the base-specific interactions from residues in ZF7 and ZF8 being centralized at the 5′ ⁸TmCG¹⁰ region.
Direct binding interactions for the T8:A29 base pair are mediated by the side chains of Asn-1052 and Lys-1055. Specifically, the Nδ2 and Oδ1 of Asn-1052 each form hydrogen bonds with the N7 and N6 positions of A29, respectively (Fig. 3A). The Nζ of Lys-1055 makes a subsequent hydrogen bond with O4 of the pairing T8. Combined, these interactions afford selectivity in distinguishing a T:A from a G:C step. Indeed, mutation of T8:A29 to C8:G29 resulted in a significant ~54-fold decrease in binding (Fig. 3B and Fig. S1B).
A common mode of mCpG recognition among ZF MBPs involves hydrogen bond interactions between an arginine and the 3′ G and the formation of weaker van der Waals and/or CH···O-type hydrogen bonds between a glutamate and the mC methyl groups (19-21, 23-26). Although still considered a relatively weak nonbonding interaction, the importance of CH···O-type hydrogen bonds in stabilizing protein structure (41,42) and facilitating protein binding at consensus DNA sites (43) is becoming increasingly recognized. Further structural insight from several ZF MBPs suggests that CH···O-type hydrogen bonds are essential for conferring methyl-selective DNA recognition (19-21, 23-26). For ZBTB33 and Zfp57, which have much higher selectivity for mCpG over CpG relative to the other structurally characterized ZF MBPs, the core glutamate side-chain atoms can also participate in direct hydrogen-bonding interactions with the N4 amino group of one (Zfp57) or both (ZBTB33) mCs within the mCpG palindrome (19,24,25).
Similar to ZBTB33 and Zfp57, the C-terminal ZFs of ZBTB38 also exhibit high selectivity for mCpG over CpG sites. Consistent with this observation, Glu-1079 recognizes the 5′ palindromic mCpG site by forming cross-strand hydrogen bonds between its Oε2 atom and the N4 amino groups of mC9 and mC27. The positioning of Glu-1079 for making these DNA interactions is maintained through a hydrogen bond with Lys-1055 (Fig. 3A). In this orientation, the Oε1 and Oε2 atoms of Glu-1079 reside within the 3.7 Å van der Waals distance cutoff (41,44) and adopt the H···O-C angle (110° ± 29° (43)), consistent with formation of CH···O-type hydrogen bonds with the C5-methyls of both T8 and mC9, respectively. Overall, given the number of core base contacts facilitated by Glu-1079, loss of this residue would be predicted to have a significant impact on the ability of the C-terminal ZFs of ZBTB38 to recognize DNA. Indeed, removal of Glu-1079 was assessed previously and found to result in an ~10-fold decrease in DNA binding (37).
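The two geometric criteria cited above for a CH···O-type hydrogen bond (a 3.7 Å distance cutoff and an H···O-C angle of 110° ± 29°) can be checked from atomic coordinates along the following lines. This is a hedged sketch rather than the procedure used in the study: it assumes the distance is measured between the methyl carbon and the acceptor oxygen, that a hydrogen position has been modeled (crystal structures at this resolution do not usually resolve hydrogens), and the function names and toy coordinates are hypothetical.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _norm(v):
    return math.sqrt(sum(x * x for x in v))


def angle_deg(a, vertex, b):
    """Angle a-vertex-b in degrees."""
    v1, v2 = _sub(a, vertex), _sub(b, vertex)
    cos_t = sum(x * y for x, y in zip(v1, v2)) / (_norm(v1) * _norm(v2))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))


def ch_o_geometry(methyl_c, methyl_h, acceptor_o, o_bonded_c,
                  max_c_o=3.7, angle_center=110.0, angle_tol=29.0):
    """Return (passes, C...O distance in angstroms, H...O-C angle in degrees).

    Assumption: the cutoff applies to the methyl-carbon to oxygen distance.
    """
    distance = _norm(_sub(methyl_c, acceptor_o))
    angle = angle_deg(methyl_h, acceptor_o, o_bonded_c)
    passes = distance <= max_c_o and abs(angle - angle_center) <= angle_tol
    return passes, distance, angle


# Toy coordinates (made up): oxygen at the origin, its bonded carbon on +x,
# and the methyl C-H pointing at the oxygen from a direction 110 degrees off +x.
direction = (math.cos(math.radians(110.0)), math.sin(math.radians(110.0)), 0.0)
oxygen = (0.0, 0.0, 0.0)
bonded_c = (1.25, 0.0, 0.0)
hydrogen = tuple(2.6 * x for x in direction)
methyl_carbon = tuple(3.5 * x for x in direction)

toy_result = ch_o_geometry(methyl_carbon, hydrogen, oxygen, bonded_c)
print(toy_result)
```

Meeting both criteria only flags a candidate contact; it does not by itself establish that the interaction is energetically significant.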

An additional contribution to mC binding specificity comes from the Cδ1-methyl of Leu-1077, which packs against the C5-methyl group of mC27 (Fig. 3C). An analogous Leu within ZBTB33 (Leu-533) was observed to participate in a similar interaction with its methylated DNA target (19). Substitution of Leu-1077 with an alanine did not affect the structural integrity of the protein (Fig. S3), as evidenced by the expected observation of chemical shift perturbations for only a few resonances relative to the WT, but did result in an ~4-fold decrease in DNA binding (Fig. 3D and Fig. S1C).
Notably, unlike the other ZF MBPs, ZF7 and ZF8 do not have an analogous arginine residue capable of hydrogen bonding with the mCpG guanines. However, Lys-1055 occupies a spatially comparable position in which its terminal Nζ group forms a water-mediated hydrogen bond with O6 of G28 (Fig. 3A). Overall, it is clear from the structure that Lys-1055 plays a key role in directly and indirectly mediating core recognition of the TmCG motif (Fig. 3A). As further evidence, removal of Lys-1055 was found previously to result in an ~14-fold decrease in DNA binding (37). To determine whether Lys-1055 functions as an arginine surrogate in mCpG recognition, a K1055R point variant was introduced into ZFs 6-9. The chemical shift perturbations observed relative to the WT were consistent with those expected for a conservative point mutation, indicating that protein structural integrity was maintained (Fig. S3), and the variant was then evaluated for mCZ38BS binding capability.
Comparative ¹H-¹⁵N HSQC analysis of WT ZFs 6-9 and ZFs 6-9_K1055R in complex with the mCZ38BS demonstrated that the K1055R variant also produced a well-resolved spectrum indicative of a high-affinity binding interaction, although there were chemical shift perturbation differences between the two complexes (Fig. 4A). Nevertheless, both proteins exhibit nearly identical low nanomolar binding affinities for recognition of the mCZ38BS (Fig. 4B and Fig. S1C). The ZBTB38 ZFs 6-9_K1055R construct was also capable of cocrystallizing with the mCZ38BS in the same space group as the WT complex, affording structural determination to a resolution of 1.60 Å (Table 1). The WT:mCZ38BS and K1055R:mCZ38BS structures are highly superimposable (protein backbone root mean square deviation between the crystal structures of 0.18 Å) and definitively demonstrate that Arg-1055 is able to accommodate the same direct hydrogen-bonding interactions at the 5′-⁸TmCG¹⁰-3′ region (Fig. 4, C and D, Fig. S2D, and Table S2). In addition, the ordered water molecule present in the WT:mCZ38BS complex (W54) is displaced by the Arg-1055 Nη2 atom, which can make a direct hydrogen bond with O6 of G28 (Fig. 4D and Table S2). This substitution of direct base interactions by Arg-1055 relative to the indirect water-mediated base contacts by Lys-1055 coincides with the observed chemical shift variances between these two complexes by solution NMR (Fig. 4A). Combined, the presented results indicate, for the first time, how a lysine residue, through the use of an ordered water molecule, can function as a surrogate for arginine in selective mCpG recognition. These findings expand the mechanisms by which ZF scaffolds can read and interpret this essential epigenetic mark.
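A root mean square deviation such as the backbone value quoted above is computed over matched atom pairs once the two structures have been superimposed. The sketch below shows only that final calculation, under the assumption that superposition and atom matching have already been done; it is not the procedure used in the study, and the function name and toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD in the input units over equally ordered, pre-superimposed atom pairs."""
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy coordinates (made up): every atom displaced by 0.3 along z.
wt_atoms = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
variant_atoms = [(0.0, 0.0, 0.3), (1.5, 0.0, 0.3), (3.0, 0.0, 0.3)]
toy_rmsd = rmsd(wt_atoms, variant_atoms)
print(round(toy_rmsd, 3))
```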

Interactions for the 3′ mCpG site are limited to side chain contacts with the C5-methyl of mC24 as well as the presence of a water layer surrounding the C5-methyls of both mC24 and mC12 (Fig. S2B). Recognition of the C5-methyl of mC24 is mediated through the Cγ2-methyl of Ile-1083 and the ring edge of Tyr-1105 (Fig. S2C). The C5-methyl group of mC12 is devoid of direct or indirect interactions with the protein but has a couple of proximally localized ordered water molecules (Fig. S2B). Intriguingly, the majority of ZF MBP-methylated DNA complex structures solved to date have one mC within the mCpG palindrome that has a hydration shell around its methyl group (20,23,24,26). This hydration of the mC methyl has been determined to improve protein binding by a factor of 2 (24). Although these side chain interactions and the methyl hydration observed in the ZF 6-9:mCZ38BS complex structure are individually relatively weak, they may collectively contribute to the moderate methyl selectivity observed at the 3′ mCpG site.

Discussion
The presented structure of ZBTB38 ZFs 6-9 in complex with the mCZ38BS illuminates how this ZF-containing domain elicits mCpG recognition. In particular, the structure affirms the prediction (37) that methyl-selective binding is preferentially localized at the 5′ mCpG site. In a manner comparable with ZBTB33 (19,25), recognition of mC27pG28 is accommodated through a series of base-specific and van der Waals interactions contributed from the side chains of Lys-1055, Leu-1077, and Glu-1079 (Fig. 3 and Fig. S4). Additional canonical and CH···O-type hydrogen bonds from Glu-1079 facilitate recognition of the corresponding mC9 base within the 5′ mCpG palindrome.
In contrast, the 3′ mCpG is recognized through a series of side chain contacts between Ile-1083 and Tyr-1105 with the C5-methyl of mC24, which is consistent with this site contributing less to the overall DNA binding. Combined, it is evident that the majority of protein-mediated contacts for mCZ38BS recognition are preferentially localized to the noncoding strand (Fig. 2C and Table S2). This asymmetric DNA strand recognition is emerging as a common feature in ZF MBP binding (19-21, 23-26), although the physiological consequence of this capability is not yet fully understood.
Although it is evident that the C-terminal ZBTB38 ZFs display selectivity for recognition of mC over C at the 5′ site, contributions from interactions at the T8:A29 step in overall recognition of the mCZ38BS cannot be discounted. Indeed, loss of the combined hydrogen-bonding contributions from Asn-1052 and Lys-1055 as well as the CH···O-type hydrogen bond between Glu-1079 and the C5-methyl of T8 has a significant impact on DNA binding (Fig. 3B).
Notably, the presented structures definitively demonstrate that Lys-1055 is capable of functioning analogously to the spatially conserved arginine residue required for mCpG recognition in all other ZF MBPs investigated to date (19-26). Nevertheless, the functional equivalence of lysine and arginine residues for mCpG recognition by the C-terminal ZFs of ZBTB38 is intriguing for two main reasons. First, substitution of the corresponding mCpG-recognizing Arg-178 with a lysine in Zfp57 severely impacted DNA binding and resulted in loss of methyl sensitivity (24), suggesting that not all ZF protein scaffolds can accommodate this interconversion. Second, structures from both the ZF and methyl-CpG-binding domain (MBD) families of MBPs in complex with methylated DNA demonstrate that an mC-Arg-G triad is utilized (28,29) where, in addition to making a direct hydrogen bond with the mCpG guanine base, van der Waals interactions between the arginine side chain and the mC C5-methyl group are important for stabilizing the interaction and conferring some of the methyl sensitivity (28,45). In the ZBTB38 ZF 6-9:mCZ38BS complex structure, these van der Waals contacts are missing. This is seemingly due to the requirement of Lys-1055 to hydrogen-bond with the O4 of T8, which pulls the lysine side chain away from the C5-methyl of mC27 so that the distance is too great to accommodate these van der Waals interactions (Fig. 4D and Table S2). However, substitution of Lys-1055 with arginine results in a slight shift of the Arg-1055 side chain to a position more proximal to the C5-methyl of mC27 that re-establishes the van der Waals contacts (Fig. 4D, Fig. S2D, and Table S2). Given that the binding affinities for mCZ38BS are similar between the WT and K1055R variant proteins, it would seem that, in the context of the C-terminal ZFs of ZBTB38, loss of these van der Waals interactions can be compensated. This may be in part through interactions between the ordered water molecule (W59) and the C5-methyl of mC27 (Fig. 4D).
Figure 4. A, ¹H-¹⁵N HSQC spectral overlay for WT ZBTB38 ZF 6-9 (black/green) and ZBTB38 ZF 6-9_K1055R (red/blue) in complex with mCZ38BS_18-mer. Green and blue cross-peaks indicate aliased arginine guanidinium side chain resonances that only appear upon DNA binding at neutral pH. B, binding isotherms comparing the binding affinity for WT ZBTB38 ZFs 6-9 (replotted from Fig. 1D) and ZBTB38 ZFs 6-9_K1055R with mCZ38BS_27-mer. Each data point represents the average of triplicate data, with error bars depicting ± S.D. C, overlay of the ZBTB38 ZF 6-9:mCZ38BS (dark colors, beige DNA, orange zinc atoms) and ZBTB38 ZF 6-9_K1055R:mCZ38BS (light colors, blue DNA, yellow zinc atoms) structures superimposed on their protein backbones with a root mean square deviation of 0.18 Å. The K1055R site is highlighted in magenta. D, comparison of interactions between Asn-1052, Lys-1055/Arg-1055, and Glu-1079 with the core T8:A29, mC9:G28, and G10:mC27 base pairs. The WT ZBTB38 ZF 6-9:mCZ38BS complex is depicted in dark colors and beige DNA, whereas the ZBTB38 ZF 6-9_K1055R:mCZ38BS complex is depicted in light colors and blue DNA. Black dotted lines represent classical hydrogen bond interactions, and blue dotted lines represent water-mediated hydrogen bonds.
Regardless of the knowledge gained from the reported structures, the physiological rationale for why a lysine substitution was selected by the C-terminal ZFs of ZBTB38 for mCpG recognition remains unclear. Given that ZBTB38 appears to have two independent ZF domains that can each selectively recognize mCpG loci, perhaps the usage of Lys-1055 in the C-terminal ZFs provides an element for diversifying DNA target selection between the two ZF domains. Indeed, initial investigations within cells suggested that, depending on the gene context, the N- and C-terminal ZFs may work synergistically as well as mutually exclusively to modulate transcription (37). Nevertheless, the presented structural observations suggest that additional ZF-containing proteins that utilize this alternative mode of mCpG recognition may remain to be found. This is particularly relevant given that a number of uncharacterized ZFs have recently been identified as potential selective readers for methylated DNA (30-33).
All oligonucleotides utilized for NMR, X-ray crystallography, and EMSA were prepared synthetically (DNA/Peptide Facility, University of Utah), further desalted on Nap-5 or Nap-10 Sephadex G25 columns (GE Healthcare Life Sciences), and lyophilized. Duplex DNA formation was completed as described previously (46). DNA sequences utilized for solution NMR spectroscopy, X-ray crystallography, and EMSA are presented in Table S1.

X-ray crystallography and structure refinement
All protein:DNA complexes were formed and validated by solution NMR spectroscopy as described above prior to crystallization. Robotic crystallization trials were performed for the various C-terminal ZBTB38 ZF:DNA complexes using an automated Crystal Gryphon robot (Art Robbins Instruments) within the PACT (pH, anion, cation crystallization trial) Premier HT-96 (Molecular Dimensions) and Natrix HT (Hampton Research) screens. Promising crystal conditions were further optimized using the sitting drop vapor diffusion method. Final crystals for the ZF 6-9:mCZ38BS complex were formed by mixing 1 μl of 8 mg/ml protein:DNA complex with 1 μl of crystallization buffer (0.25 M ammonium chloride, 0.1 M MES (pH 5.8), and 20% PEG 6000) and sealed in a well containing 1 ml of crystallization buffer. Final crystals for the ZF 6-9_K1055R:mCZ38BS complex were similarly formed by mixing 1 μl of 8 mg/ml protein:DNA complex with 1 μl of crystallization buffer (0.25 M ammonium chloride, 0.1 M MES (pH 6.0), and 19% PEG 6000) and sealed in a well containing 1 ml of crystallization buffer. Crystals for both complexes appeared in ~6 days at 4 °C. Crystals were cryoprotected in crystallization buffer containing 30% (v/v) ethylene glycol. Diffraction data for both complexes were collected under cryogenic conditions from single crystals at the Stanford Synchrotron Radiation Lightsource, beamline 12-2 at a wavelength of 0.98 Å using a Dectris Pilatus3 S 6M detector.
Data were processed using either HKL2000 (49) (ZF 6-9:mCZ38BS complex) or XDS (50) (ZF 6-9_K1055R:mCZ38BS complex). The ZF 6-9:mCZ38BS structure was determined by molecular replacement (MR) within PHENIX (51) using a subset of the Aart (residues 75-184):DNA (base steps 1-18) complex (PDB: 2I13) (52) and refined to 1.75 Å. The ZF 6-9_K1055R:mCZ38BS structure was determined by MR within PHENIX using the ZF 6-9:mCZ38BS complex as a search model and refined to 1.60 Å. To avoid biasing during MR, DNA and ZF segments were systematically removed and then built back into the electron density using Coot (53,54) prior to further refinement in PHENIX to validate each crystal structure. Crystallographic and refinement statistics for each complex are summarized in Table 1. Interactions at distances of <4.2 Å were identified using ENTANGLE (55) and are summarized in Fig. 2C and listed in Table S2.

EMSA
For Kd determination, duplex DNA (2 nM) was incubated with various concentrations of ZBTB38(1006-1153), ZBTB38(1006-1124) or mutant variants, or ZBTB38(1034-1153) in 20 μl of binding buffer (10 mM Tris (pH 7.0), 150 mM NaCl, 1 mM tris(2-carboxyethyl)phosphine, 0.005% NaN3, 100 μg/ml BSA, and 10% (w/v) sucrose). Samples were incubated at room temperature for 30 min prior to electrophoretic separation in 10% 1× Tris borate (pH 8.3) nondenaturing polyacrylamide gels at 100 V for 1 h. Gels were soaked in 1× Tris borate supplemented with 1× SYBR Gold Nucleic Acid Gel Stain (Thermo Fisher Scientific) and imaged on an Amersham Biosciences Typhoon Biomolecular Imager (GE Healthcare Life Sciences) at 495 nm. Gel band intensities were quantified by ImageJ (56,57). Kd values were obtained by first averaging the data from triplicate experiments and fitting the averaged fraction-bound DNA as a function of protein concentration to a Langmuir isotherm (SigmaPlot version 10.0 from Systat Software, Inc.), where the maximal fraction bound value was set to 1.
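The fitting step described above (average the triplicates, then fit fraction bound against protein concentration to a Langmuir isotherm with the maximum fixed at 1) can be sketched as follows. This is an illustration only and not the SigmaPlot routine used in the study. It assumes the simple 1:1 form, fraction bound = [P] / (Kd + [P]), with total protein concentration standing in for free protein. The function names, the golden-section least-squares search, and the synthetic data are all choices made for this example.

```python
import math


def average_replicates(replicates):
    """Average fraction-bound values point by point across replicate titrations."""
    n = len(replicates)
    return [sum(values) / n for values in zip(*replicates)]


def langmuir(protein_nM, kd_nM):
    """Fraction of DNA bound for a 1:1 isotherm with the maximum fixed at 1."""
    return protein_nM / (kd_nM + protein_nM)


def sum_squared_error(kd_nM, conc_nM, frac_bound):
    return sum((langmuir(p, kd_nM) - f) ** 2 for p, f in zip(conc_nM, frac_bound))


def fit_kd(conc_nM, frac_bound, kd_min=1e-3, kd_max=1e4, tol=1e-8):
    """Least-squares Kd (nM) by golden-section search over log10(Kd)."""
    a, b = math.log10(kd_min), math.log10(kd_max)
    ratio = (math.sqrt(5) - 1) / 2
    while b - a > tol:
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if (sum_squared_error(10 ** c, conc_nM, frac_bound)
                < sum_squared_error(10 ** d, conc_nM, frac_bound)):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return 10 ** ((a + b) / 2)


# Synthetic titration (made up): generated from Kd = 5 nM with offsets that
# cancel on averaging, so the fit should recover a value close to 5 nM.
conc = [0.5, 1.0, 2.0, 5.0, 10.0, 20.0, 50.0, 100.0]
exact = [langmuir(p, 5.0) for p in conc]
triplicate = [[f + 0.02 for f in exact], exact, [f - 0.02 for f in exact]]

mean_fraction = average_replicates(triplicate)
fitted_kd = fit_kd(conc, mean_fraction)
print(round(fitted_kd, 3))
```

With real gel data, a general-purpose nonlinear least-squares routine that also reports parameter uncertainties would usually be preferable. When the DNA concentration is not far below Kd, a quadratic binding expression that accounts for ligand depletion may be a better model than this simple form.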