Structural basis for the ability of MBD domains to bind methyl-CG and TG sites in DNA

Open Access. Published: March 22, 2018. DOI: https://doi.org/10.1074/jbc.RA118.001785
      Cytosine methylation is a well-characterized epigenetic mark and occurs at both CG and non-CG sites in DNA. Both methylated CG (mCG)- and mCH (H = A, C, or T)-containing DNAs, especially mCAC-containing DNAs, are recognized by methyl-CpG–binding protein 2 (MeCP2) to regulate gene expression in neuron development. However, the molecular mechanism by which the methyl-CpG–binding domain (MBD) of MeCP2 binds these different DNA motifs is unclear. Here, we systematically characterized the DNA-binding selectivities of the MBD domains in MeCP2 and MBD1–4 with isothermal titration calorimetry–based binding assays, mutagenesis studies, and X-ray crystallography. We found that the MBD domains of MeCP2 and MBD1–4 bind mCG-containing DNAs independently of the sequence identity outside the mCG dinucleotide. Moreover, some MBD domains bound to both methylated and unmethylated CA dinucleotide–containing DNAs, with a preference for the CAC sequence motif. We also found that the MBD domains bind to mCA or nonmethylated CA DNA by recognizing the complementary TG dinucleotide, which is consistent with an overlooked ligand of MeCP2, i.e. the matrix/scaffold attachment regions (MARs/SARs), whose consensus sequence of 5′-GGTGT-3′ was identified in the early 1990s. Our results also explain why MeCP2 exhibits similar binding affinity to both mCA- and hmCA-containing dsDNAs. In summary, our results suggest that in addition to mCG sites, unmethylated CA or TG sites also serve as DNA-binding sites for MeCP2 and other MBD-containing proteins. This discovery expands the genome-wide activity of MBD-containing proteins in gene regulation.

      Introduction

      The abbreviations used are: m, methylation; hm, 5-hydroxymethyl; MBD, methyl-CpG-binding domain; ITC, isothermal titration calorimetry; aa, amino acid; 5-mC, 5-methylcytosine; hmC, 5-hydroxymethylcytosine; c, chicken; ARBP, attachment region-binding protein; PDB, Protein Data Bank; GST, glutathione S-transferase; EMSA, electrophoretic mobility shift assay; SPR, surface plasmon resonance.
      Cytosine methylation occurs prevalently at CG dinucleotide sites, with about 70% of CG sites being subject to methylation (m) in the human genome (
      • Ehrlich M.
      • Gama-Sosa M.A.
      • Huang L.H.
      • Midgett R.M.
      • Kuo K.C.
      • McCune R.A.
      • Gehrke C.
      Amount and distribution of 5-methylcytosine in human DNA from different types of tissues or cells.
      ). Nevertheless, cytosine methylation is also present at CH (H = A, T, or C) sites (
      • Ramsahoye B.H.
      • Biniszkiewicz D.
      • Lyko F.
      • Clark V.
      • Bird A.P.
      • Jaenisch R.
      Non-CpG methylation is prevalent in embryonic stem cells and may be mediated by DNA methyltransferase 3a.
      ,
      • Woodcock D.M.
      • Crowther P.J.
      • Diver W.P.
      The majority of methylated deoxycytidines in human DNA are not in the CpG dinucleotide.
      ), and non-CG methylation (mCH) accounts for about 25% of the total cytosine methylation in both embryonic stem cells and neurons, contributing to transcriptional repression and imprinting, similar to CG methylation (
      • Guo J.U.
      • Su Y.
      • Shin J.H.
      • Shin J.
      • Li H.
      • Xie B.
      • Zhong C.
      • Hu S.
      • Le T.
      • Fan G.
      • Zhu H.
      • Chang Q.
      • Gao Y.
      • Ming G.L.
      • Song H.
      Distribution, recognition and regulation of non-CpG methylation in the adult mammalian brain.
      ,
      • Lister R.
      • Pelizzola M.
      • Dowen R.H.
      • Hawkins R.D.
      • Hon G.
      • Tonti-Filippini J.
      • Nery J.R.
      • Lee L.
      • Ye Z.
      • Ngo Q.M.
      • Edsall L.
      • Antosiewicz-Bourget J.
      • Stewart R.
      • Ruotti V.
      • Millar A.H.
      • et al.
      Human DNA methylomes at base resolution show widespread epigenomic differences.
      ,
      • Kribelbauer J.F.
      • Laptenko O.
      • Chen S.
      • Martini G.D.
      • Freed-Pastor W.A.
      • Prives C.
      • Mann R.S.
      • Bussemaker H.J.
      Quantitative analysis of the DNA methylation sensitivity of transcription factor complexes.
      ). Non-CG methylation occurs in virtually all human tissues and is associated with repression of development-related genes during differentiation of adult stem cells (
      • Schultz M.D.
      • He Y.
      • Whitaker J.W.
      • Hariharan M.
      • Mukamel E.A.
      • Leung D.
      • Rajagopal N.
      • Nery J.R.
      • Urich M.A.
      • Chen H.
      • Lin S.
      • Lin Y.
      • Jung I.
      • Schmitt A.D.
      • Selvaraj S.
      • et al.
      Human body epigenome maps reveal noncanonical DNA methylation variation.
      ).
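      The CG versus CH (H = A, C, or T) distinction used above can be made concrete with a short sketch. The function name and the example sequence below are hypothetical and serve only as an illustration; the code classifies each cytosine on one strand by the base that follows it and says nothing about whether that cytosine is actually methylated.

```python
def cytosine_contexts(seq):
    """Return (position, context) for every cytosine on the given strand.

    The context is "CG" when the next base is G, and "CA", "CC", or "CT"
    (collectively CH) otherwise. A cytosine at the 3' end has no following
    base and is skipped. Positions are 0-based.
    """
    seq = seq.upper()
    contexts = []
    for i in range(len(seq) - 1):
        if seq[i] != "C":
            continue
        nxt = seq[i + 1]
        if nxt in "ACGT":
            contexts.append((i, "C" + nxt))
    return contexts


# Hypothetical example sequence, not taken from the study.
for position, context in cytosine_contexts("ACGTCACCT"):
    kind = "CG" if context == "CG" else "CH"
    print(position, context, kind)
```

      Applied to both strands of a genomic region, a classification of this kind is what underlies statements such as "mCH accounts for a given fraction of total cytosine methylation".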
      mCG-mediated transcriptional repression occurs through the binding of mCG by a family of proteins containing the MBD domain, a specific methyl-CpG–binding domain of about 70 residues. Eleven MBD domain–containing proteins have been identified in mammals, including MeCP2, MBD1–6, SETDB1/2, and BAZ2A/B.
      In both mouse and human neurons, mCH is mainly located in chromatin regions of low CG density, which is established and maintained by DNMT3A (
      • Ramsahoye B.H.
      • Biniszkiewicz D.
      • Lyko F.
      • Clark V.
      • Bird A.P.
      • Jaenisch R.
      Non-CpG methylation is prevalent in embryonic stem cells and may be mediated by DNA methyltransferase 3a.
      ,
      • Guo J.U.
      • Su Y.
      • Shin J.H.
      • Shin J.
      • Li H.
      • Xie B.
      • Zhong C.
      • Hu S.
      • Le T.
      • Fan G.
      • Zhu H.
      • Chang Q.
      • Gao Y.
      • Ming G.L.
      • Song H.
      Distribution, recognition and regulation of non-CpG methylation in the adult mammalian brain.
      ). Among the three CH dinucleotides, CA is the major target for cytosine methylation (
      • Ramsahoye B.H.
      • Biniszkiewicz D.
      • Lyko F.
      • Clark V.
      • Bird A.P.
      • Jaenisch R.
      Non-CpG methylation is prevalent in embryonic stem cells and may be mediated by DNA methyltransferase 3a.
      ,
      • Guo J.U.
      • Su Y.
      • Shin J.H.
      • Shin J.
      • Li H.
      • Xie B.
      • Zhong C.
      • Hu S.
      • Le T.
      • Fan G.
      • Zhu H.
      • Chang Q.
      • Gao Y.
      • Ming G.L.
      • Song H.
      Distribution, recognition and regulation of non-CpG methylation in the adult mammalian brain.
      ,
      • Ichiyanagi T.
      • Ichiyanagi K.
      • Miyake M.
      • Sasaki H.
      Accumulation and loss of asymmetric non-CpG methylation during male germ-cell development.
      ,
      • Ziller M.J.
      • Müller F.
      • Liao J.
      • Zhang Y.
      • Gu H.
      • Bock C.
      • Boyle P.
      • Epstein C.B.
      • Bernstein B.E.
      • Lengauer T.
      • Gnirke A.
      • Meissner A.
      Genomic distribution and inter-sample variation of non-CpG methylation across human cell types.
      ). A flurry of recent studies demonstrate that MeCP2, a protein involved in neuron development whose mutations are linked to Rett syndrome and other neurological diseases (
      • Du Q.
      • Luu P.L.
      • Stirzaker C.
      • Clark S.J.
      Methyl-CpG-binding domain proteins: readers of the epigenome.
      ,
      • Amir R.E.
      • Van den Veyver I.B.
      • Wan M.
      • Tran C.Q.
      • Francke U.
      • Zoghbi H.Y.
      Rett syndrome is caused by mutations in X-linked MECP2, encoding methyl-CpG-binding protein 2.
      ), interacts with mCH sites, particularly the mCA sites, in neurons, implying that the MeCP2–mCA interaction plays a key role in regulation of gene expression in normal neuron development (
      • Guo J.U.
      • Su Y.
      • Shin J.H.
      • Shin J.
      • Li H.
      • Xie B.
      • Zhong C.
      • Hu S.
      • Le T.
      • Fan G.
      • Zhu H.
      • Chang Q.
      • Gao Y.
      • Ming G.L.
      • Song H.
      Distribution, recognition and regulation of non-CpG methylation in the adult mammalian brain.
      ,
      • Chen L.
      • Chen K.
      • Lavery L.A.
      • Baker S.A.
      • Shaw C.A.
      • Li W.
      • Zoghbi H.Y.
      MeCP2 binds to non-CG methylated DNA as neurons mature, influencing transcription and the timing of onset for Rett syndrome.
      ,
      • Gabel H.W.
      • Kinde B.
      • Stroud H.
      • Gilbert C.S.
      • Harmin D.A.
      • Kastan N.R.
      • Hemberg M.
      • Ebert D.H.
      • Greenberg M.E.
      Disruption of DNA-methylation-dependent long gene repression in Rett syndrome.
      ,
      • Kinde B.
      • Gabel H.W.
      • Gilbert C.S.
      • Griffith E.C.
      • Greenberg M.E.
      Reading the unique DNA methylation landscape of the brain: non-CpG methylation, hydroxymethylation, and MeCP2.
      ). MeCP2 mainly represses long genes (>100 kb) with high mCA density that are primarily expressed in brain (
      • Gabel H.W.
      • Kinde B.
      • Stroud H.
      • Gilbert C.S.
      • Harmin D.A.
      • Kastan N.R.
      • Hemberg M.
      • Ebert D.H.
      • Greenberg M.E.
      Disruption of DNA-methylation-dependent long gene repression in Rett syndrome.
      ). EMSA analysis indicates that MeCP2 binds to mCA as tightly as to mCG DNA and that MeCP2 prefers mCA over mCT and mCC (
      • Gabel H.W.
      • Kinde B.
      • Stroud H.
      • Gilbert C.S.
      • Harmin D.A.
      • Kastan N.R.
      • Hemberg M.
      • Ebert D.H.
      • Greenberg M.E.
      Disruption of DNA-methylation-dependent long gene repression in Rett syndrome.
      ,
      • Kinde B.
      • Gabel H.W.
      • Gilbert C.S.
      • Griffith E.C.
      • Greenberg M.E.
      Reading the unique DNA methylation landscape of the brain: non-CpG methylation, hydroxymethylation, and MeCP2.
      ). Hydroxylation of mCG into hmCG (hmC is 5-hydroxymethylcytosine) significantly reduces its binding affinity to MeCP2, whereas hydroxylation of mCA into hmCA does not affect its binding to MeCP2 (
      • Kinde B.
      • Gabel H.W.
      • Gilbert C.S.
      • Griffith E.C.
      • Greenberg M.E.
      Reading the unique DNA methylation landscape of the brain: non-CpG methylation, hydroxymethylation, and MeCP2.
      ).
      Recent progress in understanding the physiological role of mCA recognition by MeCP2 motivated us to systematically analyze mCG and mCH binding by the MBD domains of human MeCP2 and MBD1–4 using ITC and crystallography. We found that the MBD domains of MeCP2 and MBD1–4 bound to mCG DNAs independently of the sequence outside the mCG dinucleotide, and that the MBD domains of MeCP2 and MBD1/2/4 could bind to mCA DNAs with a preference for the mCAC sequence motif. We next determined the crystal structures of the MBD2 MBD domain in complex with several different DNA ligands, including mCG, mCAT, mCAC, and unmodified CAC dsDNAs. These structures show that the MBD domain of MBD2 recognizes mCA or CA by binding to the complementary TG dinucleotide, and they explain why the MBD domains favor the mCAC motif. Taken together, our results imply that unmethylated CA (or TG) DNAs also serve as binding sites for MeCP2 and other MBD proteins, and they provide a foundation for studying how the TG dinucleotide–binding ability of some MBD proteins, including MeCP2, affects their genome-wide distributions and the associated regulation of gene expression.
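      The relationship between CA and its complementary TG dinucleotide follows directly from Watson–Crick base pairing. The sketch below is only an illustration of that relationship (the helper name is ours, and methylation is not represented in the plain sequence strings); it takes the reverse complement of the motifs discussed here, including the 5′-GGTGT-3′ MAR/SAR consensus mentioned in the abstract.

```python
# Watson-Crick pairing partners for the four DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the complementary strand, read 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


# CG is its own reverse complement, whereas CA pairs with TG.
for motif in ("CG", "CA", "CAC", "GGTGT"):
    print(motif, "->", reverse_complement(motif))
```

      The CG step is palindromic, so both strands present the same dinucleotide, whereas a CA step presents TG on the opposite strand. Likewise, the strand complementary to GGTGT reads ACACC, which contains a CAC trinucleotide.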

      Results and discussion

      MBD domains of MeCP2 and MBD1–4 bind to mCG DNA independent of the sequence outside the mCG dinucleotide

      The methyl-CG binding ability and sequence selectivity of MBD domains have been studied extensively. For instance, MeCP2 has been reported to prefer some A/T nucleotides surrounding the fully methylated CG dinucleotide (
      • Klose R.J.
      • Sarraf S.A.
      • Schmiedeberg L.
      • McDermott S.M.
      • Stancheva I.
      • Bird A.P.
      DNA binding selectivity of MeCP2 due to a requirement for A/T sequences adjacent to methyl-CpG.
      ). On the basis of the SELEX selection assay, the MBD domain of MBD1 has been shown to preferentially recognize mCG within the TCGCA and TGCGCA sequence contexts (
      • Clouaire T.
      • de Las Heras J.I.
      • Merusi C.
      • Stancheva I.
      Recruitment of MBD1 to target genes requires sequence-specific interaction of the MBD domain with methylated DNA.
      ). By surface plasmon resonance (SPR) and structural analysis, the MBD domain of cMBD2 (chicken MBD2) was reported to preferentially recognize the mCGG sequence (
      • Scarsdale J.N.
      • Webb H.D.
      • Ginder G.D.
      • Williams Jr., D.C.
      Solution structure and dynamic analysis of chicken MBD2 methyl binding domain bound to a target-methylated DNA sequence.
      ). The MBD domain of MBD3, which was initially found to lack mCpG-binding ability (
      • Hendrich B.
      • Bird A.
      Identification and characterization of a family of mammalian methyl-CpG binding proteins.
      ,
      • Zhang Y.
      • Ng H.H.
      • Erdjument-Bromage H.
      • Tempst P.
      • Bird A.
      • Reinberg D.
      Analysis of the NuRD subunits reveals a histone deacetylase core complex and a connection with DNA methylation.
      ,
      • Saito M.
      • Ishikawa F.
      The mCpG-binding domain of human MBD3 does not bind to mCpG but interacts with NuRD/Mi2 components HDAC1 and MTA2.
      ), has been reported to display preferential binding to 5hmC by EMSAs (
      • Yildirim O.
      • Li R.
      • Hung J.H.
      • Chen P.B.
      • Dong X.
      • Ee L.S.
      • Weng Z.
      • Rando O.J.
      • Fazzio T.G.
      Mbd3/NURD complex regulates expression of 5-hydroxymethylcytosine marked genes in embryonic stem cells.
      ) or preferential binding to mCG by residual dipolar coupling analysis (
      • Cramer J.M.
      • Scarsdale J.N.
      • Walavalkar N.M.
      • Buchwald W.A.
      • Ginder G.D.
      • Williams Jr., D.C.
      Probing the dynamic distribution of bound states for methylcytosine-binding domains on DNA.
      ). The MBD4 protein contains a T/G or U/G mismatch-specific glycosylase domain in addition to the MBD domain, and its MBD domain was found to recognize mCG/TG mismatch DNA, a product of the deamination of methylated CG DNA, as well as mCG DNA (
      • Hendrich B.
      • Hardeland U.
      • Ng H.H.
      • Jiricny J.
      • Bird A.
      The thymine glycosylase MBD4 can bind to the product of deamination at methylated CpG sites.
      ,
      • Otani J.
      • Arita K.
      • Kato T.
      • Kinoshita M.
      • Kimura H.
      • Suetake I.
      • Tajima S.
      • Ariyoshi M.
      • Shirakawa M.
      Structural basis of the versatile DNA recognition ability of the methyl-CpG binding domain of methyl-CpG binding domain protein 4.
      ). However, some reports indicate that the sequence flanking the mCG dinucleotide does not affect binding by MBD domains (
      • Otani J.
      • Arita K.
      • Kato T.
      • Kinoshita M.
      • Kimura H.
      • Suetake I.
      • Tajima S.
      • Ariyoshi M.
      • Shirakawa M.
      Structural basis of the versatile DNA recognition ability of the methyl-CpG binding domain of methyl-CpG binding domain protein 4.
      ,
      • Baubec T.
      • Ivánek R.
      • Lienert F.
      • Schübeler D.
      Methylation-dependent and -independent genomic targeting principles of the MBD protein family.
      ). In this study, we systematically measured by ITC the binding affinities between the recombinant MBD domains of human MeCP2 and MBD1–4 and mCG-containing DNAs of different lengths and sequence contexts (Table 1, Table 2 and Fig. S1). We did not observe significant sequence selectivity beyond the mCG dinucleotide. To elucidate the structural determinants of these observations, we determined crystal structures of the MBD domain of MBD2 in complex with two different mCG-containing dsDNAs, mCGG and mCGT (Fig. 1 and Table S1).
      Table 1. Binding affinities (Kd values, in μM) of the MBD domains of MeCP2 and MBD2 with different DNAs
      Table 2. Binding affinities (Kd values, in μM) of the MBD domains of MBD1/3/4 with different DNAs
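      Dissociation constants such as those tabulated above are often easier to compare as binding free energies. The following sketch converts a Kd into a standard free energy using ΔG° = RT ln(Kd / 1 M) and expresses a difference between two ligands as ΔΔG. The temperature of 298.15 K and the Kd values in the example are assumptions for illustration only; they are not measurements from this study.

```python
import math

R_KJ = 8.314462618e-3  # molar gas constant in kJ/(mol*K)


def delta_g_kj_per_mol(kd_micromolar, temperature_k=298.15):
    """Standard binding free energy (kJ/mol) for a Kd given in micromolar.

    Uses a 1 M standard state; more negative values mean tighter binding.
    """
    kd_molar = kd_micromolar * 1e-6
    return R_KJ * temperature_k * math.log(kd_molar)


def delta_delta_g(kd_a_micromolar, kd_b_micromolar, temperature_k=298.15):
    """Free-energy difference (A minus B); positive means A binds more weakly."""
    return R_KJ * temperature_k * math.log(kd_a_micromolar / kd_b_micromolar)


# Hypothetical Kd values (in micromolar), chosen only to show the arithmetic.
print(round(delta_g_kj_per_mol(1.0), 2))
print(round(delta_delta_g(10.0, 1.0), 2))
```

      On this scale, a 10-fold difference in Kd corresponds to roughly 5.7 kJ/mol at 25 °C, which gives a sense of how modest the energetic differences between closely related DNA ligands can be.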
      Figure 1. Structural basis for the recognition of mCG DNA by the MBD domain of MBD2. A, overall structure of the MBD domain of MBD2 in complex with an mCG DNA in a schematic cartoon representation. The protein is shown in blue; the DNA ligand is shown in green except for the mC6–G6′ and G7–mC7′ bp, which are shown as yellow and red sticks, respectively. The mCG dinucleotide-interacting residues in MBD2 are shown as stick models, and water molecules are shown as red spheres. B, detailed interactions of the mCG dinucleotide-specific recognition by the MBD2 MBD domain. The interacting residues and DNA bases are shown in the same mode as in A. A and B, hydrogen bonds formed between protein residues and bases are marked as black dashed lines, and gray dashed lines represent hydrogen bonds between bp. C, schematic diagram of the detailed interactions between MBD2 and mCG DNA. Direct and water-mediated hydrogen bonds are indicated by solid and dashed red arrows, respectively. The stacking interactions between Arg-166 and mC6 and between Arg-188 and mC7′ are indicated by gray arrows. D, superposition of the structures of the MBD2 MBD domain in complex with AmCGT (blue) and CmCGG (green) DNA. E–G, structural comparison of the mCG DNA recognition by the MBD domains of human and chicken MBD2. The protein is shown in blue, and the DNA ligand is shown in green. The mCG-interacting residues in both human MBD2 and chicken MBD2 are shown as stick models, and hydrogen bonds formed between protein residues and DNA are shown as dashed lines.
      In both crystal structures the base-specific protein–DNA interactions are largely confined to the mCG dinucleotide motif (Fig. 1, A–C). We did not observe any base-specific interaction between protein and methylated DNA outside the mCG dinucleotide. The two MBD2–mCG complex structures could be well superimposed with a root mean square deviation of 0.66 Å over aligned backbone Cα atoms (Fig. 1D). In contrast to the published cMBD2–mCG structure (
      • Scarsdale J.N.
      • Webb H.D.
      • Ginder G.D.
      • Williams Jr., D.C.
      Solution structure and dynamic analysis of chicken MBD2 methyl binding domain bound to a target-methylated DNA sequence.
      ), we found that Lys-174 of human MBD2 did not interact with the guanine following the mCG dinucleotide, explaining why MBD2 does not display sequence selectivity other than the mCG dinucleotide (Fig. 1, E–G). Although the human MBD2 MBD domain is 95% identical to that of cMBD2, the affinities we measured were slightly stronger than those reported for cMBD2 (
      • Scarsdale J.N.
      • Webb H.D.
      • Ginder G.D.
      • Williams Jr., D.C.
      Solution structure and dynamic analysis of chicken MBD2 methyl binding domain bound to a target-methylated DNA sequence.
      ). Based on the complex structures, we could not establish a causal link between the few differing sequence positions and the observed difference in affinity because these differing amino acids do not play an obvious role in binding. Thus, we propose that the discrepancy in binding between our human MBD2 and the reported cMBD2 may result from the different experimental techniques, i.e. ITC versus SPR (
      • Scarsdale J.N.
      • Webb H.D.
      • Ginder G.D.
      • Williams Jr., D.C.
      Solution structure and dynamic analysis of chicken MBD2 methyl binding domain bound to a target-methylated DNA sequence.
      ). The sequence-independent binding of these MBD domains is not only consistent with our binding results, but is also in line with crystal structures of MeCP2 and MBD4 in complex with the mCG DNA solved by others, which reveal that the MBD domains of MeCP2 and MBD4 barely make contact with any bases other than the CG dinucleotide (Fig. S2) (
      • Otani J.
      • Arita K.
      • Kato T.
      • Kinoshita M.
      • Kimura H.
      • Suetake I.
      • Tajima S.
      • Ariyoshi M.
      • Shirakawa M.
      Structural basis of the versatile DNA recognition ability of the methyl-CpG binding domain of methyl-CpG binding domain protein 4.
      ,
      • Ho K.L.
      • McNae I.W.
      • Schmiedeberg L.
      • Klose R.J.
      • Bird A.P.
      • Walkinshaw M.D.
      MeCP2 binding to DNA depends upon hydration at methyl-CpG.
      ,
      • Ohki I.
      • Shimotake N.
      • Fujita N.
      • Jee J.
      • Ikegami T.
      • Nakao M.
      • Shirakawa M.
      Solution structure of the methyl-CpG binding domain of human MBD1 in complex with methylated DNA.
      ). Taken together, the MBD domains of MeCP2 and MBD1–4 display no sequence selectivity outside the mCG dinucleotide.
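      The root mean square deviation quoted above for the two MBD2–mCG complexes summarizes how far paired Cα atoms lie from each other after superposition. A minimal sketch of that final calculation is given below. It assumes the two coordinate sets have already been superimposed and that atoms are paired by index; the superposition step itself (for example, by a least-squares fit) is not shown, and the coordinates in the example are made up.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) coordinates.

    Assumes the structures are already superimposed and that atom i in
    coords_a corresponds to atom i in coords_b. Units follow the input
    (angstroms for PDB coordinates).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Made-up coordinates: every atom in the second set is shifted by 1 along z.
structure_1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
structure_2 = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (2.0, 0.0, 1.0)]
print(rmsd(structure_1, structure_2))
```

      Because each squared deviation is averaged before the square root is taken, a small RMSD over the aligned Cα atoms indicates that the protein backbone is essentially unchanged between the two DNA contexts.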
      It has been reported that the MBD domains recognize the duplex mCG dinucleotide through two highly conserved arginine “fingers” (Fig. 1B, Fig. S2, B and D) (
      • Ho K.L.
      • McNae I.W.
      • Schmiedeberg L.
      • Klose R.J.
      • Bird A.P.
      • Walkinshaw M.D.
      MeCP2 binding to DNA depends upon hydration at methyl-CpG.
      ,
      • Ohki I.
      • Shimotake N.
      • Fujita N.
      • Jee J.
      • Ikegami T.
      • Nakao M.
      • Shirakawa M.
      Solution structure of the methyl-CpG binding domain of human MBD1 in complex with methylated DNA.
      ). Each of the two arginine fingers recognizes one mCG dinucleotide from the duplex mCG DNA and forms a stair motif (
      • Zou X.
      • Ma W.
      • Solov'yov I.A.
      • Chipot C.
      • Schulten K.
      Recognition of methylated DNA through methyl-CpG binding domain proteins.
      ). This stair-shaped motif is usually bound together by means of three kinds of interactions: bidentate hydrogen bonds between the arginine side chain and the guanine base; cation–π interactions between the guanidinium group of the same arginine side chain and the 5-methylcytosine (5-mC) 5′ to the guanine; and the nucleobase stacking interactions between the two bases in the mCG dinucleotide (Fig. 1, B and C, and Fig. S2). Cytosine methylation enlarges the binding interface and enhances cation–π interactions between 5-methylcytosine and arginine (
      • Zou X.
      • Ma W.
      • Solov'yov I.A.
      • Chipot C.
      • Schulten K.
      Recognition of methylated DNA through methyl-CpG binding domain proteins.
      ). The stair-shaped motif is also found in other protein–DNA complexes and usually consists of an arginine residue interacting with consecutive bases (pyrimidine followed by guanine) (
      • Lamoureux J.S.
      • Glover J.N.
      Principles of protein-DNA recognition revealed in the structural analysis of Ndt80-MSE DNA complexes.
      ,
      • Rooman M.
      • Liévin J.
      • Buisine E.
      • Wintjens R.
      Cation-π/H-bond stair motifs at protein-DNA interfaces.
      ). Therefore, we propose that the two arginines and the two symmetrically related mCG steps are the structural determinants of the specific interactions between the MBD domains and mCG DNA.
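      Because the stair motif involves an arginine contacting a pyrimidine followed by a guanine, the number of candidate stair sites in a duplex can be counted from sequence alone. The sketch below is a simplified, sequence-only illustration with hypothetical helper names and example sequences; it ignores methylation and three-dimensional geometry and merely lists pyrimidine–guanine (CG or TG) steps on each strand.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the complementary strand, read 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def yg_steps(strand):
    """0-based positions of pyrimidine-guanine (CG or TG) steps, read 5'->3'."""
    strand = strand.upper()
    return [i for i in range(len(strand) - 1)
            if strand[i] in "CT" and strand[i + 1] == "G"]


def stair_sites(top_strand):
    """List candidate stair-motif steps on each strand of a duplex."""
    bottom_strand = reverse_complement(top_strand)
    return {"top": yg_steps(top_strand), "bottom": yg_steps(bottom_strand)}


# Hypothetical 4-bp duplexes with a central CG or CA step.
print(stair_sites("ACGA"))  # CG step: one YG step on each strand
print(stair_sites("ACAA"))  # CA step: a single TG step, on the bottom strand
```

      A palindromic CG step offers one pyrimidine–guanine step per strand, one for each arginine finger, whereas a CA step offers only the TG step on the complementary strand.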

      MBD domains of MeCP2 as well as MBD1/2/4 bind to mCA DNA with a preference for mCAC sequence motif

      As the MBD domain of MeCP2 recognizes mCA DNA in addition to mCG DNA (
      • Guo J.U.
      • Su Y.
      • Shin J.H.
      • Shin J.
      • Li H.
      • Xie B.
      • Zhong C.
      • Hu S.
      • Le T.
      • Fan G.
      • Zhu H.
      • Chang Q.
      • Gao Y.
      • Ming G.L.
      • Song H.
      Distribution, recognition and regulation of non-CpG methylation in the adult mammalian brain.
      ,
      • Chen L.
      • Chen K.
      • Lavery L.A.
      • Baker S.A.
      • Shaw C.A.
      • Li W.
      • Zoghbi H.Y.
      MeCP2 binds to non-CG methylated DNA as neurons mature, influencing transcription and the timing of onset for Rett syndrome.
      ,
      • Gabel H.W.
      • Kinde B.
      • Stroud H.
      • Gilbert C.S.
      • Harmin D.A.
      • Kastan N.R.
      • Hemberg M.
      • Ebert D.H.
      • Greenberg M.E.
      Disruption of DNA-methylation-dependent long gene repression in Rett syndrome.
      ,
      • Kinde B.
      • Gabel H.W.
      • Gilbert C.S.
      • Griffith E.C.
      • Greenberg M.E.
      Reading the unique DNA methylation landscape of the brain: non-CpG methylation, hydroxymethylation, and MeCP2.
      ), we also measured by ITC the binding affinities of the MBD domains of MeCP2 and MBD1–4 for different non-CG DNAs (Fig. 2, A and B, Fig. S1, and Table 1, Table 2). We found that the MBD domains of MeCP2 and MBD1/2/4 bound to mCA DNA, albeit generally more weakly than to mCG DNA, and that the MBD domain of MBD3 exhibited only weak binding to mCA (Table 1, Table 2 and Fig. S1). We found that Tyr-178 of MBD2 formed a water-mediated hydrogen bond with mCG DNA in the MBD2 complex structures (Fig. 1B), and this interaction is also conserved in the MeCP2–mCpG DNA structure (Fig. S2B) (
      • Ho K.L.
      • McNae I.W.
      • Schmiedeberg L.
      • Klose R.J.
      • Bird A.P.
      • Walkinshaw M.D.
      MeCP2 binding to DNA depends upon hydration at methyl-CpG.
      ). This conserved tyrosine residue has been proposed to be critical for mCG binding (
      • Ho K.L.
      • McNae I.W.
      • Schmiedeberg L.
      • Klose R.J.
      • Bird A.P.
      • Walkinshaw M.D.
      MeCP2 binding to DNA depends upon hydration at methyl-CpG.
      ,
      • Ohki I.
      • Shimotake N.
      • Fujita N.
      • Jee J.
      • Ikegami T.
      • Nakao M.
      • Shirakawa M.
      Solution structure of the methyl-CpG binding domain of human MBD1 in complex with methylated DNA.
      ,
      • Fraga M.F.
      • Ballestar E.
      • Montoya G.
      • Taysavang P.
      • Wade P.A.
      • Esteller M.
      The affinity of different MBD proteins for a specific methylated locus depends on their intrinsic binding properties.
      ), but it is substituted with phenylalanine (Phe-34) in MBD3 (Fig. 2C), which cannot form a hydrogen bond as tyrosine does in MBD2 and MeCP2 (
      • Ho K.L.
      • McNae I.W.
      • Schmiedeberg L.
      • Klose R.J.
      • Bird A.P.
      • Walkinshaw M.D.
      MeCP2 binding to DNA depends upon hydration at methyl-CpG.
      ,
      • Ohki I.
      • Shimotake N.
      • Fujita N.
      • Jee J.
      • Ikegami T.
      • Nakao M.
      • Shirakawa M.
      Solution structure of the methyl-CpG binding domain of human MBD1 in complex with methylated DNA.
      ,
      • Fraga M.F.
      • Ballestar E.
      • Montoya G.
      • Taysavang P.
      • Wade P.A.
      • Esteller M.
      The affinity of different MBD proteins for a specific methylated locus depends on their intrinsic binding properties.
      ). As a result, MBD3 is a weaker mCG binder, and an even weaker binder to mCA DNA (Table 2 and Fig. S1). Our ITC binding results also revealed that MBD domains bind to mCT and mCC DNAs only weakly (Table 1, Table 2 and Fig. S1), consistent with the earlier report that the MBD domain of MeCP2 binds to mCT and mCC DNAs as weakly as unmethylated CG DNA (
      • Gabel H.W.
      • Kinde B.
      • Stroud H.
      • Gilbert C.S.
      • Harmin D.A.
      • Kastan N.R.
      • Hemberg M.
      • Ebert D.H.
      • Greenberg M.E.
      Disruption of DNA-methylation-dependent long gene repression in Rett syndrome.
      ).
      Figure 2. MBD domain proteins possess CAC-binding ability. A, ITC-binding curves for the MBD domain of MBD2 and its mutants with different dsDNAs. B, ITC-binding curves for the MBD domain of MeCP2 and its mutants with different dsDNAs. NB, no detectable binding. C, sequence alignment of the MBD domains of human MBD2 (NP_003918.1), MBD1 (NP_001191065.1), MBD3 (NP_001268382.1), MBD4 (NP_001263199.1), and MeCP2 (NG_007107.2). The secondary structures of MBD2 and MeCP2 are indicated at the top and bottom of the sequences, respectively. The mCG dinucleotide-interacting residues of MBD2 and MeCP2 are labeled.
      Motif analysis of genome-wide CH methylation shows that, in neurons, CH methylation occurs predominantly in the mCAC trinucleotide context (
      • Guo J.U.
      • Su Y.
      • Shin J.H.
      • Shin J.
      • Li H.
      • Xie B.
      • Zhong C.
      • Hu S.
      • Le T.
      • Fan G.
      • Zhu H.
      • Chang Q.
      • Gao Y.
      • Ming G.L.
      • Song H.
      Distribution, recognition and regulation of non-CpG methylation in the adult mammalian brain.
      ,
      • Ichiyanagi T.
      • Ichiyanagi K.
      • Miyake M.
      • Sasaki H.
      Accumulation and loss of asymmetric non-CpG methylation during male germ-cell development.
      ,
      • Lagger S.
      • Connelly J.C.
      • Schweikert G.
      • Webb S.
      • Selfridge J.
      • Ramsahoye B.H.
      • Yu M.
      • He C.
      • Sanguinetti G.
      • Sowers L.C.
      • Walkinshaw M.D.
      • Bird A.
      MeCP2 recognizes cytosine methylated tri-nucleotide and di-nucleotide sequences to tune transcription in the mammalian brain.
      ,
      • Kozlenkov A.
      • Roussos P.
      • Timashpolsky A.
      • Barbu M.
      • Rudchenko S.
      • Bibikova M.
      • Klotzle B.
      • Byne W.
      • Lyddon R.
      • Di Narzo A.F.
      • Hurd Y.L.
      • Koonin E.V.
      • Dracheva S.
      Differences in DNA methylation between human neuronal and glial cells are concentrated in enhancers and non-CpG sites.
      ). Interestingly, our ITC results also revealed that the MBD domains of MeCP2 and MBD1/2/4 preferred mCAC over other mCAH (H = T, G and A) DNA (Fig. 2, A and B, Fig. S1, and Table 1, Table 2), in line with the observation that the preferential binding of MeCP2 to mCAC is critical for gene expression in the brain (
      • Lagger S.
      • Connelly J.C.
      • Schweikert G.
      • Webb S.
      • Selfridge J.
      • Ramsahoye B.H.
      • Yu M.
      • He C.
      • Sanguinetti G.
      • Sowers L.C.
      • Walkinshaw M.D.
      • Bird A.
      MeCP2 recognizes cytosine methylated tri-nucleotide and di-nucleotide sequences to tune transcription in the mammalian brain.
      ). Taken together, the MBD domains of MeCP2 and MBD1/2/4 exhibited binding abilities to mCA DNAs with a preference for the mCAC sequence motif.

      Structural basis for the mCA recognition by the MBD domain

      To understand the molecular basis of mCA recognition by MeCP2, we tried to co-crystallize the MBD domain of MeCP2 with different mCA DNAs, but these attempts failed. Because our binding results also revealed that the MBD domains of MBD1/2/4 were able to recognize mCA DNA, we attempted co-crystallization with these domains and successfully determined the crystal structure of the MBD domain of MBD2 in complex with an mCAT DNA at a resolution of 2.05 Å (Fig. 3, A–C, and Table S1). In the MBD2–mCAT complex structure, the MBD domain of MBD2 adopted a canonical MBD fold, with a C-terminal α-helix packed against the three-stranded β-sheet. The β-sheet was inserted into the major groove of mCA DNA and interacted extensively with the mCA dinucleotide (Fig. 3A).
      Figure 3. Structural basis for the selective recognition of mCA over mCT and mCC by the MBD2 MBD domain. A, overall structure of the MBD domain of MBD2 in complex with mCAT DNA in a schematic representation. The protein is shown in blue; the DNA ligand is shown in green, except the A5–T5′, mC6–G6′, and G7–mC7′ bp, which are shown as gray, yellow, and red sticks, respectively. The mCAT-interacting residues in MBD2 are shown as stick models. B, detailed interactions between the MBD2 MBD domain and mCAT DNA. The mCAT-interacting residues and DNA bases are shown in the same mode as in A. A and B, hydrogen bonds formed between protein residues and bases are marked as black dashed lines, and gray dashed lines represent hydrogen bonds between bp. C, schematic diagram of the detailed interactions between MBD2 and mCAT DNA, with the intermolecular interactions indicated in the same way as in Fig. 1C. D, superposition of the MBD2–mCAT (blue) and MBD2–mCGG (gray) structures. Hydrogen bonds formed between protein residues and DNA bases are marked as black dashed lines; hydrogen bonds between bp are marked as gray dashed lines. E and F, structural models of the MBD2 MBD domain bound to the mCT and mCC DNA, respectively. The protein and DNA are shown in the same way as in B. Hydrogen bonds formed between the MBD domain and the top or the bottom CG pairs are marked as red and gray dashed lines, respectively.
      In the MBD2–mCAT complex structure, Arg-166 formed two hydrogen bonds with the guanine base and simultaneously formed cation–π interactions with the pyrimidine ring of thymine in the TG dinucleotide that pairs with the mCA dinucleotide, completing an R/TG stair interaction motif (Fig. 3, B and C). Despite the same positively charged binding groove and the similar Arg-166 binding pattern between the MBD2–mCAT and other available MBD–mCG structures (Figs. 1B and 3B and Fig. S3, A and B) (
      • Klose R.J.
      • Sarraf S.A.
      • Schmiedeberg L.
      • McDermott S.M.
      • Stancheva I.
      • Bird A.P.
      DNA binding selectivity of MeCP2 due to a requirement for A/T sequences adjacent to methyl-CpG.
      ,
      • Hendrich B.
      • Bird A.
      Identification and characterization of a family of mammalian methyl-CpG binding proteins.
      ,
      • Zhang Y.
      • Ng H.H.
      • Erdjument-Bromage H.
      • Tempst P.
      • Bird A.
      • Reinberg D.
      Analysis of the NuRD subunits reveals a histone deacetylase core complex and a connection with DNA methylation.
      ,
      • Saito M.
      • Ishikawa F.
      The mCpG-binding domain of human MBD3 does not bind to mCpG but interacts with NuRD/Mi2 components HDAC1 and MTA2.
      ,
      • Ho K.L.
      • McNae I.W.
      • Schmiedeberg L.
      • Klose R.J.
      • Bird A.P.
      • Walkinshaw M.D.
      MeCP2 binding to DNA depends upon hydration at methyl-CpG.
      ), there are significant differences between mCA and mCG recognition. In the MBD2–mCG complex, Arg-188, the other arginine finger, recognizes the second mC–G pair; in the MBD2–mCAT complex, however, Arg-188 did not interact with the adenine of the mCA dinucleotide, because both the side chain of Arg-188 and the 6-NH2 group of adenine function as hydrogen bond donors and cannot form a hydrogen bond with each other. Instead, the side chain of Arg-188 was pushed away from the interaction interface, resulting in the loss of the cation–π interactions between Arg-188 and 5-mC (Fig. 3D and Fig. S4, A and B). The 5-mC nevertheless formed a water-mediated hydrogen bond with Asp-176 and a C–H···O hydrogen bond with the main chain carbonyl oxygen of Arg-188 (Fig. 3, B and C) (
      • Derewenda Z.S.
      • Lee L.
      • Derewenda U.
      The occurrence of C–H···O hydrogen bonds in proteins.
      ).
      The arginine finger Arg-166 forms a salt bridge with the conserved residue Asp-176, as observed in the MBD–mCG complex structures (Fig. 3B). Because Arg-166 was fixed by Asp-176 through two intramolecular hydrogen bonds, whereas Arg-188 had more flexibility, Arg-166 was the residue used to recognize the TG dinucleotide. If the fixed Arg-166 instead recognized the complementary CA dinucleotide, the adenine would come into unfavorably close contact with Arg-166, because both are hydrogen bond donors. Consistently, our mutagenesis binding results revealed that mutating Arg-166 to alanine severely diminished the binding of MBD2 to mCA, whereas mutating Arg-188 to alanine reduced its binding to mCA by only about 4-fold, highlighting that Arg-166 is essential for the binding of MBD2 to mCA DNA (Fig. 2A and Fig. S1).
      Interestingly, in the MeCP2–mCG DNA structures, Arg-133 (corresponding to Arg-188 in MBD2) also formed a hydrogen bond with Glu-137, in addition to the conserved salt bridge interactions between Arg-111 and Asp-121 (corresponding to Arg-166 and Asp-176 in MBD2, respectively) (Fig. 2C and Fig. S2, A and B). To investigate how MeCP2 recognizes mCA DNA, we also mutated Arg-111 and Arg-133 to alanine and found that R111A disrupted mCA DNA binding, whereas R133A retained modest mCA DNA binding (Fig. 2B and Fig. S1), implying that MeCP2 binds mCA DNA in a mode similar to that of MBD2.
      Our structure also explained why mCC and mCT DNAs displayed significantly reduced binding affinities toward the MBD domains (Tables 1 and 2 and Fig. S1): in these DNAs the thymine of the complementary TG is replaced by a purine, and Arg-166 could not form cation–π interactions with the purine ring of adenine or guanine as it does with methylcytosine or thymine (Fig. 3, E and F). This binding mode also explained why MeCP2 exhibits similar binding affinities to both mCA and hmCA (
      • Kinde B.
      • Gabel H.W.
      • Gilbert C.S.
      • Griffith E.C.
      • Greenberg M.E.
      Reading the unique DNA methylation landscape of the brain: non-CpG methylation, hydroxymethylation, and MeCP2.
      ), because its MBD domain recognized the mCA mainly through its complementary sequence TG, a mimic of mCG, regardless of the modification status of CA.

      Molecular basis for the preferential mCAC binding by the MBD domain

      To further address why the MBD domains of MeCP2 and MBD1/2/4 prefer mCAC over other mCAH (H = A, T, or G) DNAs, we also determined the structures of the MBD2 MBD domain in complex with two different mCAC DNAs (Fig. 4, A–C, Fig. S3, C and D, and Table S1). The only difference between these two mCAC DNA sequences is that a thymine nucleotide located at the −2 position relative to the mCAC motif is replaced with a cytosine. The two structures are highly similar, further implying that the flanking sequences do not affect MBD binding. In the MBD2–mCAC complex structures, in addition to the interactions between Arg-166 and the T6G7 dinucleotide, Arg-188 adopted a conformation different from that in the MBD2–mCAT structure and formed a hydrogen bond with G5, which pairs with the C5′ following the mC7′A6′ dinucleotide (Figs. 4B and 5A and Fig. S3C). This interaction is not possible if the nucleotide following the mCA is not cytosine, explaining why the MBD domains of MeCP2 and MBD1/2/4 favor mCAC over other mCAH (H = G, A, or T) motifs (Fig. 2, A and B, Fig. S1, and Tables 1 and 2).
      Figure 4. Structural basis for the recognition of mCAC and CAC DNA by the MBD domain of MBD2. A, overall structure of the MBD domain of MBD2 in complex with an mCAC DNA in a schematic representation. The protein is shown in blue, and the DNA ligand is shown in green except the A4–T4′ (gray), G5–C5′ (gray), T6–A6′ (yellow), and G7–mC7′ (red) bp. The mCAC-interacting residues in MBD2 are shown as stick models. B, specific recognition of the mCAC trinucleotide by MBD2. The interacting residues and DNA bases are shown in the same mode as in A. C, schematic diagram of the detailed interactions between MBD2 and mCAC DNA. The direct hydrogen bonds and water-mediated hydrogen bonds are indicated by solid and dashed red arrows, respectively. The stacking interaction between Arg-166 and T6 is indicated by a gray arrow. D, overall structure of the MBD domain of MBD2 in complex with a CAC DNA in a schematic representation. The protein and DNA are shown the same as observed in A. E, detailed interactions between the MBD2 MBD domain and CAC DNA. The DNA-interacting residues and DNA bases are shown in the same mode as in D. F, schematic diagram of the detailed interactions between MBD2 and CAC DNA, with the intermolecular interactions indicated in the same way as shown in C. A, B, D, and E, hydrogen bonds formed between protein residues and DNA bases are marked as black dashed lines, and gray dashed lines represent hydrogen bonds between bp.
      Figure 5. Structural basis for preferential recognition of mCAC DNA by the MBD domain of MBD2. A, superposition of the MBD2–mCAT (blue) and MBD2–mCAC (orange) structures. B, superposition of the MBD2–mCAC (orange) and MBD2–CAC (green) structures. Hydrogen bonds formed between protein residues and DNA bases are marked as black dashed lines, and gray dashed lines represent hydrogen bonds between bp.

      Cytosine methylation of the CA dinucleotide is not essential for the binding of MBD domains

      The structural revelation that the MBD domain of MBD2 bound to the mCA DNA by specifically recognizing the complementary TG dinucleotide prompted us to investigate whether MBD2 was also able to recognize unmethylated CA (or TG) DNA. Our binding results indeed revealed that the MBD domains of MBD2, MBD4, and MeCP2 could bind to unmethylated CA DNA, albeit more weakly than to mCA DNA (Fig. 2, A and B, Fig. S1, and Tables 1 and 2), presumably because of the loss of the C–H···O hydrogen bond between the 5-methyl group of 5-mC and the main chain carbonyl oxygen of Arg-188 in MBD2. To illustrate the structural basis of the recognition of unmethylated CA DNA by the MBD domains, we determined the complex structure of the MBD2 MBD domain bound to a CAC-containing DNA (Fig. 4, D–F, and Table S1). The MBD2–CAC complex structure confirmed our hypothesis: the only difference between the MBD2–mCAC and MBD2–CAC structures is the loss of the C–H···O hydrogen bond between the cytosine of the CA dinucleotide and the main chain carbonyl oxygen of Arg-188 (Figs. 4, C and F, and 5B, and Fig. S4, C and D).
      Although the MBD domain has been long established as a methyl-CG–binding domain (
      • Lewis J.D.
      • Meehan R.R.
      • Henzel W.J.
      • Maurer-Fogy I.
      • Jeppesen P.
      • Klein F.
      • Bird A.
      Purification, sequence, and cellular localization of a novel chromosomal protein that binds to methylated DNA.
      ), surprisingly, it was reported as early as 1991 that the chicken attachment region-binding protein (ARBP), which was later found to be the MeCP2 homolog in chicken (
      • Weitzel J.M.
      • Buhrmester H.
      • Strätling W.H.
      Chicken MAR-binding protein ARBP is homologous to rat methyl-CpG-binding protein MeCP2.
      ), recognizes the matrix/scaffold attachment regions (MARs/SARs) through a consensus sequence of 5′-GGTGT-3′ with flanking AT-rich sequences (
      • Buhrmester H.
      • von Kries J.P.
      • Strätling W.H.
      Nuclear matrix protein ARBP recognizes a novel DNA sequence motif with high affinity.
      ,
      • von Kries J.P.
      • Buhrmester H.
      • Strätling W.H.
      A matrix/scaffold attachment region binding protein: identification, purification, and mode of binding.
      ), and this recognition depends on the MBD domain and a central 5′-GGTGT-3′ sequence (
      • Weitzel J.M.
      • Buhrmester H.
      • Strätling W.H.
      Chicken MAR-binding protein ARBP is homologous to rat methyl-CpG-binding protein MeCP2.
      ,
      • Buhrmester H.
      • von Kries J.P.
      • Strätling W.H.
      Nuclear matrix protein ARBP recognizes a novel DNA sequence motif with high affinity.
      ). Mutation of the central three nucleotides GTG of 5′-GGTGT-3′ motif either abolishes or diminishes its binding to ARBP (or MeCP2) (
      • Buhrmester H.
      • von Kries J.P.
      • Strätling W.H.
      Nuclear matrix protein ARBP recognizes a novel DNA sequence motif with high affinity.
      ). The GTG sequence corresponds to the CAC sequence in the complementary strand of the DNA duplex. Furthermore, by re-assessing previously published DNA-binding data generated with the protein-binding microarray (PBM) assay, a technology developed to characterize the DNA-binding sequence specificities of proteins, including transcription factors, in a high-throughput manner, we found that the MBD domain of MeCP2 selectively bound to unmethylated CA/TG sequences (Fig. 6A) (
      • Weirauch M.T.
      • Cote A.
      • Norel R.
      • Annala M.
      • Zhao Y.
      • Riley T.R.
      • Saez-Rodriguez J.
      • Cokelaer T.
      • Vedenko A.
      • Talukder S.
      • DREAM5 Consortium
      • Bussemaker H.J.
      • Morris Q.D.
      • Bulyk M.L.
      • Stolovitzky G.
      • Hughes T.R.
      Evaluation of methods for modeling transcription factor sequence specificity.
      ,
      • Lam K.N.
      • van Bakel H.
      • Cote A.G.
      • van der Ven A.
      • Hughes T.R.
      Sequence specificity is obtained from the majority of modular C2H2 zinc-finger arrays.
      ). Hence, these observations, together with our findings presented here, demonstrate that MBD domains, such as those of MeCP2 and MBD2, bind to mCA DNAs through recognition of the complementary TG dinucleotide, and that cytosine methylation of the CA dinucleotide is not essential for the binding of MBD domains.
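      The strand relationship used in this argument can be checked with a few lines of code. The following is a minimal sketch and not part of the original analysis; the helper name reverse_complement is hypothetical. It shows that TG is the reverse complement of CA and that the GTG core of the 5′-GGTGT-3′ MAR/SAR consensus reads CAC on the opposite strand.

```python
# Illustrative helper (hypothetical name, not taken from the paper's methods).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand of a DNA sequence, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    # CA dinucleotide, CAC trinucleotide, and the MAR/SAR consensus from the text
    for motif in ("CA", "CAC", "GGTGT"):
        print(f"5'-{motif}-3' pairs with 5'-{reverse_complement(motif)}-3'")
```

      Because every CA on one strand is a TG on the other, a reader that recognizes TG will necessarily bind CA-containing duplexes regardless of which strand is written down.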
      Figure 6. Structural basis for TG dinucleotide recognition by Kaiso, KLF4, and MBD2. A, protein-binding microarray of the MeCP2 MBD domain with the binding motifs highlighted. B, structural basis for TG recognition by Kaiso. T9–A9′ and G10–C10′ are shown as yellow and red sticks, respectively. The Kaiso residues involved in TG recognition are shown as blue sticks. C, structural basis for TG recognition by KLF4 with T5–A5′ and G6–mC6′ of dsDNA shown as yellow and red sticks, respectively. The KLF4 residues involved in TG recognition are shown as blue sticks. D, structural basis for TG recognition by MBD2. T6–A6′ and G7–mC7′ are shown as yellow and red sticks, respectively. The MBD2 residues involved in TG recognition are shown as blue sticks. Hydrogen bonds between protein residues and the top or the bottom CG pairs are marked as red and gray dashed lines, respectively.
      The ability of some MBD domains to recognize both mCG and TG DNA is analogous to that of some other transcription factors (
      • Liu Y.
      • Zhang X.
      • Blumenthal R.M.
      • Cheng X.
      A common mode of recognition for methylated CpG.
      ), such as KLF4 (Krüppel-like factor 4) and Kaiso (
      • Liu Y.
      • Olanrewaju Y.O.
      • Zheng Y.
      • Hashimoto H.
      • Blumenthal R.M.
      • Zhang X.
      • Cheng X.
      Structural basis for Klf4 recognition of methylated DNA.
      ,
      • Buck-Koehntop B.A.
      • Stanfield R.L.
      • Ekiert D.C.
      • Martinez-Yamout M.A.
      • Dyson H.J.
      • Wilson I.A.
      • Wright P.E.
      Molecular basis for recognition of methylated and specific DNA sequences by the zinc finger protein Kaiso.
      ,
      • Schuetz A.
      • Nana D.
      • Rose C.
      • Zocher G.
      • Milanovic M.
      • Koenigsmann J.
      • Blasig R.
      • Heinemann U.
      • Carstanjen D.
      The structure of the Klf4 DNA-binding domain links to self-renewal and macrophage differentiation.
      ). Nevertheless, unlike KLF4 and Kaiso that bind to both mCG and TG DNA located within some specific sequences (
      • Liu Y.
      • Olanrewaju Y.O.
      • Zheng Y.
      • Hashimoto H.
      • Blumenthal R.M.
      • Zhang X.
      • Cheng X.
      Structural basis for Klf4 recognition of methylated DNA.
      ,
      • Buck-Koehntop B.A.
      • Stanfield R.L.
      • Ekiert D.C.
      • Martinez-Yamout M.A.
      • Dyson H.J.
      • Wilson I.A.
      • Wright P.E.
      Molecular basis for recognition of methylated and specific DNA sequences by the zinc finger protein Kaiso.
      ,
      • Schuetz A.
      • Nana D.
      • Rose C.
      • Zocher G.
      • Milanovic M.
      • Koenigsmann J.
      • Blasig R.
      • Heinemann U.
      • Carstanjen D.
      The structure of the Klf4 DNA-binding domain links to self-renewal and macrophage differentiation.
      ,
      • Daniel J.M.
      • Spring C.M.
      • Crawford H.C.
      • Reynolds A.B.
      • Baig A.
      The p120(ctn)-binding partner Kaiso is a bi-modal DNA-binding protein that recognizes both a sequence-specific consensus and methylated CpG dinucleotides.
      ), the MBD domains recognize mCG or GTG DNA without additional sequence selectivity. Comparing our structure with the KLF4–TG and Kaiso–TG complex structures, we found that, apart from the water-mediated interaction between Lys-178 and DNA, MBD2 uses the conserved arginine residue and acidic amino acid to recognize the TG dinucleotide (Fig. 6, B–D). The TG motif binding by MBD domains is also reminiscent of another DNA sequence motif, the GT box, a GGTGTGGG-like sequence (
      • Suske G.
      • Bruford E.
      • Philipsen S.
      Mammalian SP/KLF transcription factors: bring in the family.
      ). The GT box is predominantly found in the proximal promoter regions or the more distal regulatory regions of mammalian genes with its CG-rich sequence unmethylated (also called GC box) (
      • Suske G.
      • Bruford E.
      • Philipsen S.
      Mammalian SP/KLF transcription factors: bring in the family.
      ). The GT and GC boxes together function as the recruiting elements for the Sp (specificity protein) and KLF families of transcription factors (
      • Suske G.
      • Bruford E.
      • Philipsen S.
      Mammalian SP/KLF transcription factors: bring in the family.
      ). Recent genome-wide MeCP2 distribution analysis reveals that, in addition to binding chromatin regions of high mCG density, MeCP2 also occupies chromatin sites of high mCH density but with lower mCG density. The distinctive MeCP2–mCG and MeCP2–mCA binding events may control different transcriptional programs during brain development (
      • Lagger S.
      • Connelly J.C.
      • Schweikert G.
      • Webb S.
      • Selfridge J.
      • Ramsahoye B.H.
      • Yu M.
      • He C.
      • Sanguinetti G.
      • Sowers L.C.
      • Walkinshaw M.D.
      • Bird A.
      MeCP2 recognizes cytosine methylated tri-nucleotide and di-nucleotide sequences to tune transcription in the mammalian brain.
      ). In this study, we revealed that the unmethylated CA (or TG) DNA might function as a novel biological ligand of MBD proteins. Consistently, the MBD proteins have been found to bind to chromatin in a methylation-independent manner, and more MeCP2 is located in the 5mC-scarce open chromatin regions than in the 5mC-rich heterochromatin regions (
      • Baubec T.
      • Ivánek R.
      • Lienert F.
      • Schübeler D.
      Methylation-dependent and -independent genomic targeting principles of the MBD protein family.
      ,
      • Shin J.
      • Ming G.L.
      • Song H.
      By hook or by crook: multifaceted DNA-binding properties of MeCP2.
      ), further implicating a potential role for the unmethylated CA dinucleotide in recruiting MeCP2 and other MBD proteins. Nevertheless, how the CA (or TG) binding ability of MBD proteins, including that of MeCP2, affects their genome-wide distributions and the associated regulation of gene expression warrants further study.

      Experimental procedures

      Cloning, expression, and purification of MBD domains

      MeCP2 (aa 80–164), MBD2 (aa 143–220), and MBD4 (aa 55–152) fragments of human genes were subcloned into the pET28-MHL (Structural Genomics Consortium) expression vector to generate N-terminal His-tagged fusion proteins, whereas human MBD1 (aa 1–77) and MBD3 (aa 1–71) domains were subcloned into the pET28-GST-LIC (Structural Genomics Consortium) expression vector to generate N-terminal GST-tagged fusion proteins. The MBD domain mutants of MeCP2 (R111A and R133A) and MBD2 (R166A and R188A) were obtained by QuikChange site-directed mutagenesis (Agilent Technologies) using the MeCP2 (aa 80–164) and MBD2 (aa 143–220) expression constructs as the templates, respectively.
      The recombinant proteins were overexpressed in Escherichia coli BL21 (DE3)-V2R-pRARE2, induced with 1 mM isopropyl β-D-thiogalactopyranoside at 14 °C overnight. The cell pellet was resuspended and lysed in a buffer containing 20 mM Tris-HCl, pH 7.5, 500 mM NaCl, 0.5 mM phenylmethylsulfonyl fluoride, and 5% glycerol. The supernatant was collected after centrifugation at 16,000 × g for 1 h and then purified with nickel-nitrilotriacetic acid resin (Qiagen) or GSH-Sepharose 4 beads (GE Healthcare). Purified proteins were then treated with tobacco etch virus protease (for MeCP2, MBD2, and MBD4 proteins) or thrombin (for MBD1 and MBD3 proteins) to remove the tags. The treated samples were further purified by affinity chromatography, anion-exchange chromatography, and gel filtration (GE Healthcare). Finally, the pure proteins were concentrated to 10 mg/ml in a buffer containing 20 mM Tris-HCl, pH 7.5, and 150 mM NaCl.

      Isothermal titration calorimetry binding assay

      All the DNA ligands used for ITC and crystallization experiments were synthesized by Integrated DNA Technologies and dissolved in the same buffer as the protein samples (20 mM Tris-HCl, pH 7.5, and 150 mM NaCl). The DNA solution was then adjusted to approximately pH 7.5 with NaOH. The single-stranded DNA was annealed into duplex DNA as described previously (
      • Xu Y.
      • Xu C.
      • Kato A.
      • Tempel W.
      • Abreu J.G.
      • Bian C.
      • Hu Y.
      • Hu D.
      • Zhao B.
      • Cerovina T.
      • Diao J.
      • Wu F.
      • He H.H.
      • Cui Q.
      • Clark E.
      • et al.
      Tet3 CXXC domain and dioxygenase activity cooperatively regulate key genes for Xenopus eye and neural development.
      ,
      • Xu C.
      • Liu K.
      • Lei M.
      • Yang A.
      • Li Y.
      • Hughes T.R.
      • Min J.
      DNA sequence recognition of human CXXC domains and their structural determinants.
      ). The concentrations of the protein and DNA samples were determined from UV absorbance using a NanoDrop ND-1000 spectrophotometer (Thermo Fisher Scientific). Each sample was measured at least three times to obtain an average concentration. ITC measurements were carried out with MBD domain proteins at 20 to 60 μM and DNA ligands at 0.5 to 1 mM. The assays were performed using a MicroCal ITC or ITC200 instrument (GE Healthcare) at 25 °C. Most titrations were performed once; for the remaining samples, titrations were repeated until we found experimental conditions, mainly protein and DNA concentrations, that gave ITC curves with a heat change large enough to calculate the Kd reliably. For each binding pair, only the best curve was used to calculate the Kd, and the reported standard errors are the fitting errors from that curve. All ITC curves with the corresponding thermodynamic parameters are shown in Fig. S1. To determine the Kd values, the data were fitted using the ITC data analysis module of Origin 7.0 (MicroCal Inc.) with the one-site binding model.
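      For readers less familiar with the quantities involved, the sketch below illustrates what a dissociation constant implies under a simple 1:1 binding equilibrium. It is not the Origin fitting routine used here; the function names, the example Kd of 1 μM, and the 40 μM concentrations are assumptions chosen only for illustration. It computes the equilibrium concentration of complex from the total protein and DNA concentrations, and the standard binding free energy at 25 °C from ΔG° = RT ln Kd.

```python
import math

R = 8.314462618  # gas constant, J/(mol*K)


def complex_conc(p_total: float, l_total: float, kd: float) -> float:
    """Equilibrium [PL] for 1:1 binding; all concentrations in the same unit."""
    b = p_total + l_total + kd
    # Physically meaningful root of [PL]^2 - b*[PL] + p_total*l_total = 0
    return (b - math.sqrt(b * b - 4.0 * p_total * l_total)) / 2.0


def delta_g_kj(kd_molar: float, temp_c: float = 25.0) -> float:
    """Standard binding free energy in kJ/mol (1 M standard state)."""
    return R * (temp_c + 273.15) * math.log(kd_molar) / 1000.0


if __name__ == "__main__":
    kd_um = 1.0  # hypothetical Kd in uM, not a value from this study
    pl = complex_conc(40.0, 40.0, kd_um)  # 40 uM protein mixed with 40 uM DNA
    print(f"fraction of protein bound: {pl / 40.0:.2f}")
    print(f"dG = {delta_g_kj(kd_um * 1e-6):.1f} kJ/mol")
```

      A tighter binder (smaller Kd) gives a more negative ΔG° and a larger bound fraction at the same concentrations, which is the sense in which the affinities in this work are compared.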

      Crystallization

      The purified proteins were mixed at a 1:1 molar ratio with different DNA ligands followed by incubation on ice for 30 min. The protein/DNA reaction mixtures were crystallized using the sitting drop vapor diffusion method at 18 °C by mixing 0.5 μl of the complex samples with 0.5 μl of the reservoir solution. Finally, we successfully obtained the complex crystals for MBD2 (aa 143–220) with the respective DNA ligands. The detailed crystallization conditions for each MBD–DNA complex are summarized in Table S1.

      Data collection and structure determination

      The native crystals were soaked in the crystallization solution plus a final concentration of 15% glycerol and frozen by immersion in liquid nitrogen. Diffraction data were collected at synchrotron or rotating anode X-ray sources under cooling to 100 K, processed with XDS (
      • Kabsch W.
      XDS.
      ), and merged with SCALA or AIMLESS (
      • Evans P.R.
      • Murshudov G.N.
      How good are my data and what is the resolution?.
      ). Structures were solved by molecular replacement with PHASER (
      • McCoy A.J.
      • Grosse-Kunstleve R.W.
      • Adams P.D.
      • Winn M.D.
      • Storoni L.C.
      • Read R.J.
      Phaser crystallographic software.
      ) using coordinates from PDB entries 3QMG and 2KY8 (for MBD2-CmCGG) or unpublished models (for remaining MBD2 structures) as required. The MBD2–AmCAT complex was used as a starting model for the nearly isomorphous triclinic MBD2–AmCAC complex structure, which in turn was used as a starting model for the MBD2–ACAC complex. In these cases, molecular replacement search was not needed, and POINTLESS (
      • Evans P.R.
      • Murshudov G.N.
      How good are my data and what is the resolution?.
      ) analysis and initial refinement were controlled by a DIMPLE (ccp4.github.io/dimple/) script.
      ARP/WARP (
      • Perrakis A.
      • Harkiolaki M.
      • Wilson K.S.
      • Lamzin V.S.
      ARP/wARP and molecular replacement.
      ) was used for electron density map improvement and COOT (
      • Emsley P.
      • Lohkamp B.
      • Scott W.G.
      • Cowtan K.
      Features and development of Coot.
      ) for interactive model building. Restrained model refinement was performed with PHENIX.REFINE (
      • Afonine P.V.
      • Grosse-Kunstleve R.W.
      • Echols N.
      • Headd J.J.
      • Moriarty N.W.
      • Mustyakimov M.
      • Terwilliger T.C.
      • Urzhumtsev A.
      • Zwart P.H.
      • Adams P.D.
      Towards automated crystallographic structure refinement with phenix.refine.
      ), REFMAC (
      • Murshudov G.N.
      • Skubák P.
      • Lebedev A.A.
      • Pannu N.S.
      • Steiner R.A.
      • Nicholls R.A.
      • Winn M.D.
      • Long F.
      • Vagin A.A.
      REFMAC5 for the refinement of macromolecular crystal structures.
      ), and AUTOBUSTER (Cambridge, United Kingdom, Global Phasing Ltd.). MOLPROBITY (
      • Chen V.B.
      • Arendall 3rd, W.B.
      • Headd J.J.
      • Keedy D.A.
      • Immormino R.M.
      • Kapral G.J.
      • Murray L.W.
      • Richardson J.S.
      • Richardson D.C.
      MolProbity: all-atom structure validation for macromolecular crystallography.
      ) and PARVATI server (
      • Zucker F.
      • Champ P.C.
      • Merritt E.A.
      Validation of crystallographic models containing TLS or other descriptions of anisotropy.
      ) were used for analysis of model geometry and atomic anisotropic displacement parameters, respectively. PDB_EXTRACT (
      • Yang H.
      • Guranovic V.
      • Dutta S.
      • Feng Z.
      • Berman H.M.
      • Westbrook J.D.
      Automated and accurate deposition of structures solved by X-ray diffraction to the Protein Data Bank.
      ) and IOTBX.CIF (
      • Gildea R.J.
      • Bourhis L.J.
      • Dolomanov O.V.
      • Grosse-Kunstleve R.W.
      • Puschmann H.
      • Adams P.D.
      • Howard J.A.
      iotbx.cif: a comprehensive CIF toolbox.
      ) were used for the compilation of data collection and refinement statistics summarized in Table S1.
      Coordinates and structure factors for the structures of the MBD domains in complex with their respective DNA ligands have been deposited in the Protein Data Bank (PDB) under the accession codes 6C1A, 6C1U, 6C1T, 6C1V, 6CNP, and 6CNQ.

      Author contributions

      K. L., C. X., M. L., A. Y., P. L., and T. R. H. investigation; K. L., C. X., and J. M. writing-original draft; K. L., C. X., and J. M. writing-review and editing; C. X. software; K. L., C. X. and J. M. validation; C. X. project administration; J. M. conceptualization; J. M. formal analysis; J. M. supervision.

      Acknowledgments

      We thank Wolfram Tempel for the structure determination. We thank Chuanbing Bian for the MBD2–mCG crystal structure, Aiping Dong for a coordinate averaging program, and Amy Wernimont for some diffraction data collection and crystal structure review. Some diffraction data were collected at the Structural Biology Center and at the General Medical Sciences and Cancer Institutes Structural Biology Facility at the Advanced Photon Source (GM/CA at APS), which is part of the X-ray Science Division at APS, Argonne National Laboratory. GM/CA received support from National Institutes of Health Grants ACB-12002 from NCI and AGM-12006 from NIGMS. Argonne is operated by the United States Department of Energy under Contract DE-AC02-06CH11357. The Structural Genomics Consortium is a registered charity (number 1097737) that receives funds from the following: AbbVie; Bayer Pharma AG; Boehringer Ingelheim; Canada Foundation for Innovation; Eshelman Institute for Innovation; Genome Canada through Ontario Genomics Institute Grant OGI-055; Innovative Medicines Initiative (EU/EFPIA) ULTRA-DD Grant 115766; Janssen; Merck KGaA (Darmstadt, Germany); Merck Sharp and Dohme; Novartis Pharma AG; Ontario Ministry of Research, Innovation and Science (MRIS); Pfizer; São Paulo Research Foundation (FAPESP); Takeda; and Wellcome.

      Supplementary Material

      References

        • Ehrlich M.
        • Gama-Sosa M.A.
        • Huang L.H.
        • Midgett R.M.
        • Kuo K.C.
        • McCune R.A.
        • Gehrke C.
        Amount and distribution of 5-methylcytosine in human DNA from different types of tissues or cells.
        Nucleic Acids Res. 1982; 10 (7079182): 2709-2721
        • Ramsahoye B.H.
        • Biniszkiewicz D.
        • Lyko F.
        • Clark V.
        • Bird A.P.
        • Jaenisch R.
        Non-CpG methylation is prevalent in embryonic stem cells and may be mediated by DNA methyltransferase 3a.
        Proc. Natl. Acad. Sci. U.S.A. 2000; 97 (10805783): 5237-5242
        • Woodcock D.M.
        • Crowther P.J.
        • Diver W.P.
        The majority of methylated deoxycytidines in human DNA are not in the CpG dinucleotide.
        Biochem. Biophys. Res. Commun. 1987; 145 (3593377): 888-894
        • Guo J.U.
        • Su Y.
        • Shin J.H.
        • Shin J.
        • Li H.
        • Xie B.
        • Zhong C.
        • Hu S.
        • Le T.
        • Fan G.
        • Zhu H.
        • Chang Q.
        • Gao Y.
        • Ming G.L.
        • Song H.
        Distribution, recognition and regulation of non-CpG methylation in the adult mammalian brain.
        Nat. Neurosci. 2014; 17 (24362762): 215-222
        • Lister R.
        • Pelizzola M.
        • Dowen R.H.
        • Hawkins R.D.
        • Hon G.
        • Tonti-Filippini J.
        • Nery J.R.
        • Lee L.
        • Ye Z.
        • Ngo Q.M.
        • Edsall L.
        • Antosiewicz-Bourget J.
        • Stewart R.
        • Ruotti V.
        • Millar A.H.
        • et al.
        Human DNA methylomes at base resolution show widespread epigenomic differences.
        Nature. 2009; 462 (19829295): 315-322
        • Kribelbauer J.F.
        • Laptenko O.
        • Chen S.
        • Martini G.D.
        • Freed-Pastor W.A.
        • Prives C.
        • Mann R.S.
        • Bussemaker H.J.
        Quantitative analysis of the DNA methylation sensitivity of transcription factor complexes.
        Cell Rep. 2017; 19 (28614722): 2383-2395
        • Schultz M.D.
        • He Y.
        • Whitaker J.W.
        • Hariharan M.
        • Mukamel E.A.
        • Leung D.
        • Rajagopal N.
        • Nery J.R.
        • Urich M.A.
        • Chen H.
        • Lin S.
        • Lin Y.
        • Jung I.
        • Schmitt A.D.
        • Selvaraj S.
        • et al.
        Human body epigenome maps reveal noncanonical DNA methylation variation.
        Nature. 2015; 523 (26030523): 212-216
        • Ichiyanagi T.
        • Ichiyanagi K.
        • Miyake M.
        • Sasaki H.
        Accumulation and loss of asymmetric non-CpG methylation during male germ-cell development.
        Nucleic Acids Res. 2013; 41 (23180759): 738-745
        • Ziller M.J.
        • Müller F.
        • Liao J.
        • Zhang Y.
        • Gu H.
        • Bock C.
        • Boyle P.
        • Epstein C.B.
        • Bernstein B.E.
        • Lengauer T.
        • Gnirke A.
        • Meissner A.
        Genomic distribution and inter-sample variation of non-CpG methylation across human cell types.
        PLoS Genet. 2011; 7 (22174693): e1002389
        • Du Q.
        • Luu P.L.
        • Stirzaker C.
        • Clark S.J.
        Methyl-CpG-binding domain proteins: readers of the epigenome.
        Epigenomics. 2015; 7 (25927341): 1051-1073
        • Amir R.E.
        • Van den Veyver I.B.
        • Wan M.
        • Tran C.Q.
        • Francke U.
        • Zoghbi H.Y.
        Rett syndrome is caused by mutations in X-linked MECP2, encoding methyl-CpG-binding protein 2.
        Nat. Genet. 1999; 23 (10508514): 185-188
        • Chen L.
        • Chen K.
        • Lavery L.A.
        • Baker S.A.
        • Shaw C.A.
        • Li W.
        • Zoghbi H.Y.
        MeCP2 binds to non-CG methylated DNA as neurons mature, influencing transcription and the timing of onset for Rett syndrome.
        Proc. Natl. Acad. Sci. U.S.A. 2015; 112 (25870282): 5509-5514
        • Gabel H.W.
        • Kinde B.
        • Stroud H.
        • Gilbert C.S.
        • Harmin D.A.
        • Kastan N.R.
        • Hemberg M.
        • Ebert D.H.
        • Greenberg M.E.
        Disruption of DNA-methylation-dependent long gene repression in Rett syndrome.
        Nature. 2015; 522 (25762136): 89-93
        • Kinde B.
        • Gabel H.W.
        • Gilbert C.S.
        • Griffith E.C.
        • Greenberg M.E.
        Reading the unique DNA methylation landscape of the brain: non-CpG methylation, hydroxymethylation, and MeCP2.
        Proc. Natl. Acad. Sci. U.S.A. 2015; 112 (25739960): 6800-6806
        • Klose R.J.
        • Sarraf S.A.
        • Schmiedeberg L.
        • McDermott S.M.
        • Stancheva I.
        • Bird A.P.
        DNA binding selectivity of MeCP2 due to a requirement for A/T sequences adjacent to methyl-CpG.
        Mol. Cell. 2005; 19 (16137622): 667-678
        • Clouaire T.
        • de Las Heras J.I.
        • Merusi C.
        • Stancheva I.
        Recruitment of MBD1 to target genes requires sequence-specific interaction of the MBD domain with methylated DNA.
        Nucleic Acids Res. 2010; 38 (20378711): 4620-4634
        • Scarsdale J.N.
        • Webb H.D.
        • Ginder G.D.
        • Williams Jr., D.C.
        Solution structure and dynamic analysis of chicken MBD2 methyl binding domain bound to a target-methylated DNA sequence.
        Nucleic Acids Res. 2011; 39 (21531701): 6741-6752
        • Hendrich B.
        • Bird A.
        Identification and characterization of a family of mammalian methyl-CpG binding proteins.
        Mol. Cell. Biol. 1998; 18 (9774669): 6538-6547
        • Zhang Y.
        • Ng H.H.
        • Erdjument-Bromage H.
        • Tempst P.
        • Bird A.
        • Reinberg D.
        Analysis of the NuRD subunits reveals a histone deacetylase core complex and a connection with DNA methylation.
        Genes Dev. 1999; 13 (10444591): 1924-1935
        • Saito M.
        • Ishikawa F.
        The mCpG-binding domain of human MBD3 does not bind to mCpG but interacts with NuRD/Mi2 components HDAC1 and MTA2.
        J. Biol. Chem. 2002; 277 (12124384): 35434-35439
        • Yildirim O.
        • Li R.
        • Hung J.H.
        • Chen P.B.
        • Dong X.
        • Ee L.S.
        • Weng Z.
        • Rando O.J.
        • Fazzio T.G.
        Mbd3/NURD complex regulates expression of 5-hydroxymethylcytosine marked genes in embryonic stem cells.
        Cell. 2011; 147 (22196727): 1498-1510
        • Cramer J.M.
        • Scarsdale J.N.
        • Walavalkar N.M.
        • Buchwald W.A.
        • Ginder G.D.
        • Williams Jr., D.C.
        Probing the dynamic distribution of bound states for methylcytosine-binding domains on DNA.
        J. Biol. Chem. 2014; 289 (24307175): 1294-1302
        • Hendrich B.
        • Hardeland U.
        • Ng H.H.
        • Jiricny J.
        • Bird A.
        The thymine glycosylase MBD4 can bind to the product of deamination at methylated CpG sites.
        Nature. 1999; 401 (10499592): 301-304
        • Otani J.
        • Arita K.
        • Kato T.
        • Kinoshita M.
        • Kimura H.
        • Suetake I.
        • Tajima S.
        • Ariyoshi M.
        • Shirakawa M.
        Structural basis of the versatile DNA recognition ability of the methyl-CpG binding domain of methyl-CpG binding domain protein 4.
        J. Biol. Chem. 2013; 288 (23316048): 6351-6362
        • Baubec T.
        • Ivánek R.
        • Lienert F.
        • Schübeler D.
        Methylation-dependent and -independent genomic targeting principles of the MBD protein family.
        Cell. 2013; 153 (23582333): 480-492
        • Ho K.L.
        • McNae I.W.
        • Schmiedeberg L.
        • Klose R.J.
        • Bird A.P.
        • Walkinshaw M.D.
        MeCP2 binding to DNA depends upon hydration at methyl-CpG.
        Mol. Cell. 2008; 29 (18313390): 525-531
        • Ohki I.
        • Shimotake N.
        • Fujita N.
        • Jee J.
        • Ikegami T.
        • Nakao M.
        • Shirakawa M.
        Solution structure of the methyl-CpG binding domain of human MBD1 in complex with methylated DNA.
        Cell. 2001; 105 (11371345): 487-497
        • Zou X.
        • Ma W.
        • Solov'yov I.A.
        • Chipot C.
        • Schulten K.
        Recognition of methylated DNA through methyl-CpG binding domain proteins.
        Nucleic Acids Res. 2012; 40 (22110028): 2747-2758
        • Lamoureux J.S.
        • Glover J.N.
        Principles of protein-DNA recognition revealed in the structural analysis of Ndt80-MSE DNA complexes.
        Structure. 2006; 14 (16531239): 555-565
        • Rooman M.
        • Liévin J.
        • Buisine E.
        • Wintjens R.
        Cation-π/H-bond stair motifs at protein-DNA interfaces.
        J. Mol. Biol. 2002; 319 (12051937): 67-76
        • Fraga M.F.
        • Ballestar E.
        • Montoya G.
        • Taysavang P.
        • Wade P.A.
        • Esteller M.
        The affinity of different MBD proteins for a specific methylated locus depends on their intrinsic binding properties.
        Nucleic Acids Res. 2003; 31 (12626718): 1765-1774
        • Lagger S.
        • Connelly J.C.
        • Schweikert G.
        • Webb S.
        • Selfridge J.
        • Ramsahoye B.H.
        • Yu M.
        • He C.
        • Sanguinetti G.
        • Sowers L.C.
        • Walkinshaw M.D.
        • Bird A.
        MeCP2 recognizes cytosine methylated tri-nucleotide and di-nucleotide sequences to tune transcription in the mammalian brain.
        PLoS Genet. 2017; 13 (28498846): e1006793
        • Kozlenkov A.
        • Roussos P.
        • Timashpolsky A.
        • Barbu M.
        • Rudchenko S.
        • Bibikova M.
        • Klotzle B.
        • Byne W.
        • Lyddon R.
        • Di Narzo A.F.
        • Hurd Y.L.
        • Koonin E.V.
        • Dracheva S.
        Differences in DNA methylation between human neuronal and glial cells are concentrated in enhancers and non-CpG sites.
        Nucleic Acids Res. 2014; 42 (24057217): 109-127
        • Derewenda Z.S.
        • Lee L.
        • Derewenda U.
        The occurrence of C–H···O hydrogen bonds in proteins.
        J. Mol. Biol. 1995; 252 (7674305): 248-262
        • Lewis J.D.
        • Meehan R.R.
        • Henzel W.J.
        • Maurer-Fogy I.
        • Jeppesen P.
        • Klein F.
        • Bird A.
        Purification, sequence, and cellular localization of a novel chromosomal protein that binds to methylated DNA.
        Cell. 1992; 69 (1606614): 905-914
        • Weitzel J.M.
        • Buhrmester H.
        • Strätling W.H.
        Chicken MAR-binding protein ARBP is homologous to rat methyl-CpG-binding protein MeCP2.
        Mol. Cell. Biol. 1997; 17 (9271441): 5656-5666
        • Buhrmester H.
        • von Kries J.P.
        • Strätling W.H.
        Nuclear matrix protein ARBP recognizes a novel DNA sequence motif with high affinity.
        Biochemistry. 1995; 34 (7696275): 4108-4117
        • von Kries J.P.
        • Buhrmester H.
        • Strätling W.H.
        A matrix/scaffold attachment region binding protein: identification, purification, and mode of binding.
        Cell. 1991; 64 (1846084): 123-135
        • Weirauch M.T.
        • Cote A.
        • Norel R.
        • Annala M.
        • Zhao Y.
        • Riley T.R.
        • Saez-Rodriguez J.
        • Cokelaer T.
        • Vedenko A.
        • Talukder S.
        • DREAM5 Consortium
        • Bussemaker H.J.
        • Morris Q.D.
        • Bulyk M.L.
        • Stolovitzky G.
        • Hughes T.R.
        Evaluation of methods for modeling transcription factor sequence specificity.
        Nat. Biotechnol. 2013; 31 (23354101): 126-134
        • Lam K.N.
        • van Bakel H.
        • Cote A.G.
        • van der Ven A.
        • Hughes T.R.
        Sequence specificity is obtained from the majority of modular C2H2 zinc-finger arrays.
        Nucleic Acids Res. 2011; 39 (21321018): 4680-4690
        • Liu Y.
        • Zhang X.
        • Blumenthal R.M.
        • Cheng X.
        A common mode of recognition for methylated CpG.
        Trends Biochem. Sci. 2013; 38 (23352388): 177-183
        • Liu Y.
        • Olanrewaju Y.O.
        • Zheng Y.
        • Hashimoto H.
        • Blumenthal R.M.
        • Zhang X.
        • Cheng X.
        Structural basis for Klf4 recognition of methylated DNA.
        Nucleic Acids Res. 2014; 42 (24520114): 4859-4867
        • Buck-Koehntop B.A.
        • Stanfield R.L.
        • Ekiert D.C.
        • Martinez-Yamout M.A.
        • Dyson H.J.
        • Wilson I.A.
        • Wright P.E.
        Molecular basis for recognition of methylated and specific DNA sequences by the zinc finger protein Kaiso.
        Proc. Natl. Acad. Sci. U.S.A. 2012; 109 (22949637): 15229-15234
        • Schuetz A.
        • Nana D.
        • Rose C.
        • Zocher G.
        • Milanovic M.
        • Koenigsmann J.
        • Blasig R.
        • Heinemann U.
        • Carstanjen D.
        The structure of the Klf4 DNA-binding domain links to self-renewal and macrophage differentiation.
        Cell. Mol. Life Sci. 2011; 68 (21290164): 3121-3131
        • Daniel J.M.
        • Spring C.M.
        • Crawford H.C.
        • Reynolds A.B.
        • Baig A.
        The p120(ctn)-binding partner Kaiso is a bi-modal DNA-binding protein that recognizes both a sequence-specific consensus and methylated CpG dinucleotides.
        Nucleic Acids Res. 2002; 30 (12087177): 2911-2919
        • Suske G.
        • Bruford E.
        • Philipsen S.
        Mammalian SP/KLF transcription factors: bring in the family.
        Genomics. 2005; 85 (15820306): 551-556
        • Shin J.
        • Ming G.L.
        • Song H.
        By hook or by crook: multifaceted DNA-binding properties of MeCP2.
        Cell. 2013; 152 (23452844): 940-942
        • Xu Y.
        • Xu C.
        • Kato A.
        • Tempel W.
        • Abreu J.G.
        • Bian C.
        • Hu Y.
        • Hu D.
        • Zhao B.
        • Cerovina T.
        • Diao J.
        • Wu F.
        • He H.H.
        • Cui Q.
        • Clark E.
        • et al.
        Tet3 CXXC domain and dioxygenase activity cooperatively regulate key genes for Xenopus eye and neural development.
        Cell. 2012; 151 (23217707): 1200-1213
        • Xu C.
        • Liu K.
        • Lei M.
        • Yang A.
        • Li Y.
        • Hughes T.R.
        • Min J.
        DNA sequence recognition of human CXXC domains and their structural determinants.
        Structure. 2018; 26 (29276034): 85-95.e3
        • Kabsch W.
        XDS.
        Acta Crystallogr. D Biol. Crystallogr. 2010; 66 (20124692): 125-132
        • Evans P.R.
        • Murshudov G.N.
        How good are my data and what is the resolution?
        Acta Crystallogr. D Biol. Crystallogr. 2013; 69 (23793146): 1204-1214
        • McCoy A.J.
        • Grosse-Kunstleve R.W.
        • Adams P.D.
        • Winn M.D.
        • Storoni L.C.
        • Read R.J.
        Phaser crystallographic software.
        J. Appl. Crystallogr. 2007; 40 (19461840): 658-674
        • Perrakis A.
        • Harkiolaki M.
        • Wilson K.S.
        • Lamzin V.S.
        ARP/wARP and molecular replacement.
        Acta Crystallogr. D Biol. Crystallogr. 2001; 57 (11567158): 1445-1450
        • Emsley P.
        • Lohkamp B.
        • Scott W.G.
        • Cowtan K.
        Features and development of Coot.
        Acta Crystallogr. D Biol. Crystallogr. 2010; 66 (20383002): 486-501
        • Afonine P.V.
        • Grosse-Kunstleve R.W.
        • Echols N.
        • Headd J.J.
        • Moriarty N.W.
        • Mustyakimov M.
        • Terwilliger T.C.
        • Urzhumtsev A.
        • Zwart P.H.
        • Adams P.D.
        Towards automated crystallographic structure refinement with phenix.refine.
        Acta Crystallogr. D Biol. Crystallogr. 2012; 68 (22505256): 352-367
        • Murshudov G.N.
        • Skubák P.
        • Lebedev A.A.
        • Pannu N.S.
        • Steiner R.A.
        • Nicholls R.A.
        • Winn M.D.
        • Long F.
        • Vagin A.A.
        REFMAC5 for the refinement of macromolecular crystal structures.
        Acta Crystallogr. D Biol. Crystallogr. 2011; 67 (21460454): 355-367
        • Chen V.B.
        • Arendall 3rd, W.B.
        • Headd J.J.
        • Keedy D.A.
        • Immormino R.M.
        • Kapral G.J.
        • Murray L.W.
        • Richardson J.S.
        • Richardson D.C.
        MolProbity: all-atom structure validation for macromolecular crystallography.
        Acta Crystallogr. D Biol. Crystallogr. 2010; 66 (20057044): 12-21
        • Zucker F.
        • Champ P.C.
        • Merritt E.A.
        Validation of crystallographic models containing TLS or other descriptions of anisotropy.
        Acta Crystallogr. D Biol. Crystallogr. 2010; 66 (20693688): 889-900
        • Yang H.
        • Guranovic V.
        • Dutta S.
        • Feng Z.
        • Berman H.M.
        • Westbrook J.D.
        Automated and accurate deposition of structures solved by X-ray diffraction to the Protein Data Bank.
        Acta Crystallogr. D Biol. Crystallogr. 2004; 60 (15388930): 1833-1839
        • Gildea R.J.
        • Bourhis L.J.
        • Dolomanov O.V.
        • Grosse-Kunstleve R.W.
        • Puschmann H.
        • Adams P.D.
        • Howard J.A.
        iotbx.cif: a comprehensive CIF toolbox.
        J. Appl. Crystallogr. 2011; 44 (22199401): 1259-1263