Uncovering the mechanistic basis for specific recognition of monomethylated H3K4 by the CW domain of Arabidopsis histone methyltransferase SDG8

Chromatin consists of DNA and histones, and specific histone modifications that determine chromatin structure and activity are regulated by three types of proteins, called writer, reader, and eraser. Histone reader proteins from vertebrates, vertebrate-infecting parasites, and higher plants possess a CW domain, which has been reported to read histone H3 lysine 4 (H3K4). The CW domain of Arabidopsis SDG8 (also called ASHH2), a histone H3 lysine 36 methyltransferase, preferentially binds monomethylated H3K4 (H3K4me1), unlike the mammalian CW domain protein, which binds trimethylated H3K4 (H3K4me3). However, the molecular basis of the selective binding by the CW domain of SDG8 (SDG8-CW) remains unclear. Here, we solved the 1.6-Å-resolution structure of SDG8-CW in complex with H3K4me1, which revealed that residues in the C-terminal α-helix of SDG8-CW determine binding specificity for low methylation levels at H3K4. Moreover, substitutions of key residues, specifically Ile-915 and Asn-916, converted SDG8-CW binding preference from H3K4me1 to H3K4me3. Sequence alignment and mutagenesis studies revealed that the CW domain of SDG725, the homolog of SDG8 in rice, shares the same binding preference with SDG8-CW, indicating that preference for low methylated H3K4 by the CW domain of ASHH2 homologs is conserved among higher-order plants. Our findings provide first structural insights into the molecular basis for specific recognition of monomethylated H3K4 by the H3K4me1 reader protein SDG8 from Arabidopsis.

The basic unit of chromatin is the nucleosome, which contains eight histone proteins and 147 bp of DNA (1). The histone proteins have tails that protrude from the nucleosome, and many residues in these tails can be covalently modified (2). A number of specific modifications of histones have been identified, including methylation, acetylation, ubiquitination, phosphorylation, SUMOylation, deamination, and ADP-ribosylation (2)(3)(4). Combinations of these modifications play important roles in many biological processes, such as regulation of gene activity and cell fate determination (5)(6)(7)(8). Most of these modifications have been found to be dynamic (9,10). Therefore, establishment, recognition, and removal of histone modifications are carefully regulated by three types of proteins, called "writer," "reader," and "eraser," respectively (11,12). Investigations of these proteins show that more than one functional domain may occur in one protein, and they may play multiple roles in chromatin-associated processes (13). The N termini of histones are rich in lysine residues. Methylation markers can be deposited on particular lysine residues with different degrees of methylation (mono-, di-, and trimethylation) and can be recognized by various functional domains of histone readers (14). The "royal family" of proteins, which includes the chromodomain, Tudor domain, malignant brain tumor domain, and PWWP domain, is well known for its function in the recognition of methylated histones (15,16). The plant homeodomain and WD40 domain are also capable of binding methylated or unmethylated histones (17)(18)(19). Recently, a domain family, called the CW domain, was found to function as an H3K4 reader (20-23).
The CW domain is a zinc-binding domain with conserved cysteines and tryptophans (hence the name CW) and has been found in vertebrates, vertebrate-infecting parasites, and higher-order plants (24-27). CW domains are usually found in chromatin-related proteins together with other domains, such as the PWWP domain, SET domain, and amine oxidase domain, suggesting a gene regulation role for this domain. There are seven CW domain-containing proteins in humans and 11 in Arabidopsis. Previous studies indicate that the CW domains in various proteins show different preferences for the degree of methylation of H3K4 (20,21,23,28). The CW domains in mammalian MORC1, MORC2, and LSD2 have been reported to have no ability to bind any histone H3K4 peptides, whereas the CW domains of mammalian ZCWPW1, ZCWPW2, MORC3, and MORC4 bind to H3K4me3 (20,23,29). Interestingly, the CW domain of Arabidopsis SDG8 (also called ASHH2/CCR1/EFS) was reported to preferentially bind H3K4me1 (21). Moreover, SDG8 bears sequence homology to SET2 (the sole yeast H3K36 methyltransferase) and catalyzes the di- and trimethylation of H3K36 from the monomethylated state (30,31). The sdg8 mutant plants exhibit early flowering with a global reduction of H3K36me2/me3 level and an increase of H3K36me1 level. SDG8 is not the sole H3K36 methyltransferase in Arabidopsis. SDG26, which lacks the N-terminal CW domain, is also homologous to SET2. In addition, SDG8 is involved in many biological processes, including shoot branching, ovule and anther development, carotenoid biosynthesis, defense response, seed development, brassinosteroid-regulated gene expression, and light- and/or carbon-responsive gene expression, indicating the nonredundant role of SDG8 in chromatin modification and gene regulation (32)(33)(34)(35)(36)(37)(38)(39). Therefore, SDG8 may serve as a platform for downstream H3K36 methylation via the recognition of H3K4me1 by the CW domain.
So far, several structures of mammalian CW domains complexed with histone H3K4 peptides have been reported (20,21,23,40). However, mammalian CW domains prefer to bind unmethylated or trimethylated H3K4 (20,23,40). The molecular mechanism by which the Arabidopsis SDG8 CW domain specifically recognizes low-level methylation of H3K4 remains unclear. Here, we determined the crystal structure of SDG8-CW in complex with H3K4me1 peptide at 1.6-Å resolution. The structural and biochemical data provide the molecular basis for the selective recognition of H3K4me1/2. Key residues that determine the specificity were identified. Furthermore, we also tested the binding specificity of SDG725, the homolog of SDG8 in rice, for various histone peptides. Sequence alignment and biochemical data indicate that the preference for low-level methylation of H3K4 by the CW domain of SDG8 is conserved in green plants. Our findings may provide new insights into the molecular mechanism of the recruitment of SDG8 to its target genes and shed light on the conserved role played by an incomplete aromatic cage in plants in recognizing low-level methylation of H3K4.

The CW domain of SDG8 preferentially binds monomethylated H3K4
SDG8 harbors a CW domain in the middle of its amino acid sequence and a SET domain that is C-terminal to the CW domain (Fig. 1A). To systematically explore the binding affinity of the SDG8 CW domain for different histone markers, we performed isothermal titration calorimetry (ITC) using label-free histone peptides. SDG8-CW (residues 862-921) was purified with its N-terminal His-SUMO fusion protein removed to rule out any effect of the fusion partner (Fig. 1B). As shown in Fig. 1C, SDG8-CW bound H3K4me peptides but not H3K9me3, H3K27me3, or H3K36me3, consistent with a previous report (21). The equilibrium dissociation constants (K_D) determined for SDG8-CW with the various H3K4me peptides are 1.3 ± 0.2 μM for H3K4me1, 3.3 ± 0.3 μM for H3K4me2, 18.9 ± 1.5 μM for H3K4me3, and 65.8 ± 11.5 μM for H3K4me0 (Fig. 1B and Table S1). Compared with monomethylated H3K4, di- and trimethylation decreased the binding affinity 2.5- and 14.5-fold, respectively, indicating the SDG8 CW domain's strong preference for low levels of H3K4 methylation, especially H3K4me1.
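The fold differences quoted above follow directly from ratios of the dissociation constants. The short Python sketch below reproduces that arithmetic; it assumes the reported constants are in micromolar, and the dictionary and function names are illustrative only, not part of the original analysis.

```python
# Dissociation constants for SDG8-CW as reported in the text.
# ASSUMPTION: units are micromolar; only the ratios matter here.
kd_um = {"H3K4me1": 1.3, "H3K4me2": 3.3, "H3K4me3": 18.9, "H3K4me0": 65.8}


def fold_weaker(kd, reference="H3K4me1"):
    """Return each K_D divided by the reference K_D (larger = weaker binding)."""
    ref = kd[reference]
    return {peptide: value / ref for peptide, value in kd.items()}


fold = fold_weaker(kd_um)
for peptide, ratio in fold.items():
    print(f"{peptide}: {ratio:.1f}-fold relative to H3K4me1")
```

The same ratio applied to H3K4me0 gives a roughly 50-fold weaker affinity than for H3K4me1, a figure the text does not state explicitly.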

Overall structure of SDG8-CW in complex with H3K4me1
To elucidate the molecular mechanism for the specific recognition of H3K4me1, we attempted to cocrystallize SDG8-CW with an H3K4me1 peptide (residues 1-9). However, our initial attempts at crystallizing the complex failed. Guided by an analysis of the SDG8-CW sequence with the SERp server (41), we mutated Glu-917 to alanine to reduce the surface entropy and thereby promote crystallization. The binding affinity of the E917A mutant for H3K4me1 was also measured (Fig. 1C). The K_D is 2.79 ± 0.36 μM, indicating that the alanine mutation at Glu-917 has little impact on the binding affinity. E917A was successfully cocrystallized with H3K4me1. Thus, for convenience, we refer to the E917A mutant as SDG8-CW in the structural analysis.
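One way to gauge how small the effect of the E917A substitution is would be to convert the two dissociation constants into binding free energies with ΔG = RT ln(K_D). The sketch below does this under stated assumptions: the constants are taken to be in micromolar, the temperature is the 20°C used for the ITC experiments, and the function name is hypothetical.

```python
import math

R = 8.314462618   # gas constant, J/(mol*K)
T = 293.15        # 20 degrees C, the ITC temperature given in the methods


def delta_g_kj(kd_molar):
    """Standard binding free energy (kJ/mol) from a dissociation constant in M."""
    return R * T * math.log(kd_molar) / 1000.0


# ASSUMPTION: the reported K_D values are micromolar.
dg_wt = delta_g_kj(1.3e-6)       # wild-type SDG8-CW with H3K4me1
dg_e917a = delta_g_kj(2.79e-6)   # E917A mutant with H3K4me1
ddg = dg_e917a - dg_wt           # positive value = weaker binding

print(f"dG(WT)    = {dg_wt:.1f} kJ/mol")
print(f"dG(E917A) = {dg_e917a:.1f} kJ/mol")
print(f"ddG       = {ddg:.2f} kJ/mol")
```

The difference comes out below 2 kJ/mol, which is small compared with the total binding free energy and is consistent with the statement that the mutation has little impact.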

Structural basis for the interaction of SDG8-CW with H3K4me1
Residues 1-7 of the histone H3K4me1 peptide show well-defined electron density and could be modeled (Fig. 2A). Multiple interactions were observed between SDG8-CW and H3K4me1 (Fig. 2, B and C). The methyl group of Ala-1 inserts into a hydrophobic pocket formed by Val-866, Ile-885, and Trp-891 (Fig. 2D). The free amine group of Ala-1 interacts with the main chain carbonyl groups of Asp-886 and Ser-889 via hydrogen bonds. The carboxyl group of Asp-869 further stabilizes Ala-1 via water-mediated hydrogen bonds. Moreover, Arg-2 and K4me1 form hydrogen bonds with the main chains of Arg-867 and Trp-865 on β1, respectively (Fig. 2B). Arg-2 stretches toward the solvent with its guanidinium moiety sandwiched by Arg-867 and Glu-887 (Fig. 2E). The methyl group of Thr-3 is anchored in a shallow hydrophobic pocket formed by Val-866, Ile-877, Val-882, and Ile-885 (Fig. 2F). In addition, the hydroxyl group of Thr-3 is involved in water-mediated hydrogen bond interactions with the main chains of Val-882 and Ile-885. Intriguingly, SDG8-CW adopts a unique cage to accommodate the monomethylated Lys-4 (Fig. 2G). Two conserved tryptophans, Trp-865 and Trp-874, occupy the back and left walls of the cage, respectively. Ile-915 and Leu-919 on α1 form the right wall of the cage, and Asn-916 constitutes the floor of the cage, leaving the front side of the cage unshielded. In addition, the side chain of Asn-916 contacts the methylammonium group of K4me1 via a hydrogen bond. K4me1 binds in a straight concave surface of SDG8-CW, which differs from the canonical trimethyllysine-binding pocket, in which three or more aromatic residues increase the hydrophobicity and space to accommodate a bulky trimethyllysine.

Validation of the key residues that determine the specific recognition of sequence and lysine methylation level
To validate the intermolecular interactions between SDG8-CW and H3K4me1, we performed ITC assays to detect changes in binding affinity introduced by site-specific mutagenesis. Structural analysis showed that the first three residues, ART, account for most of the interactions between SDG8-CW and H3K4me1, indicating that these residues may be important for sequence-specific binding. We designed and synthesized four H3K4me1 mutant peptides: H3K4me1ΔA1, which lacks Ala-1; AH3K4me1, which carries an additional alanine at the N terminus of the peptide; and two alanine substitution mutants, H3K4me1R2A and H3K4me1T3A, in which Arg-2 and Thr-3, respectively, are replaced with alanine (Fig. 3A). The H3K4me1ΔA1 peptide completely lost its ability to bind SDG8-CW, whereas AH3K4me1 exhibited severely diminished binding, underscoring the importance of residue Ala-1 in the recognition of H3K4me1 (Fig. 3B and Table S1). Mutation of Arg-2 to alanine reduced the binding only mildly, about 5-fold, whereas mutation of Thr-3 abolished the binding, consistent with the structural analysis. Thus, Ala-1 and Thr-3 are two key residues that determine the sequence-specific recognition of H3K4me1 by SDG8-CW.
Unlike other CW domains, such as MORC3-CW, which has additional negatively charged residues, or ZCWPW2-CW, which has a third aromatic residue, the monomethyllysine-binding cage of SDG8-CW comprises some hydrophobic residues (see Fig. 7, A and B). To further investigate how SDG8-CW prefers to bind monomethylated lysine, we used the same strategy as above to monitor the impact of point mutations on binding affinity (Table S1). Alanine substitution of Trp-865 and Trp-874 significantly affected binding ability. The W874A mutant exhibited no ability to bind any of the four H3K4me peptides (Fig. 3C). The W865A mutant lost its ability to bind H3K4me0 and H3K4me3 and showed dramatically decreased binding to H3K4me1 and H3K4me2, about 60- and 23-fold lower than WT, respectively (Fig. 3D). This is consistent with previous studies, which indicated that the corresponding residue of Trp-865 is substituted by isoleucine and threonine in ZCWPW2, resulting in the absence of binding to any histone H3K4me peptide (23). In the crystal structure, the monomethyl group of K4me1 faces toward Ile-915, and Asn-916 contacts the methylammonium ion via a hydrogen bond (Fig. 2G). Using this structural information as a guide, we generated two single mutants, I915A and N916A, and the double mutant I915A/N916A. Both the I915A and N916A mutations reduced the ability to bind H3K4me peptides (Fig. 3, E and F). However, the magnitude of the reduction differed drastically with the degree of methylation. Compared with WT, I915A reduced the binding about 116-fold for H3K4me1, 16-fold for H3K4me2, and 1.5-fold for H3K4me3 (Fig. 3E). N916A and the double mutant I915A/N916A showed selectively diminished binding to H3K4me peptides in a manner similar to I915A (Fig. 3, F and G). Therefore, mutation of Ile-915 and Asn-916 leads to a conversion of ligand binding preference from K4me1 to K4me3, indicating that Ile-915 and Asn-916 are determinants of the selectivity for the recognition of monomethyllysine (Fig. 3H).
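The switch in preference can be illustrated by combining the wild-type dissociation constants with the approximate fold reductions quoted for I915A. The sketch below is a back-of-the-envelope estimate only: the fold changes in the text are rounded ("about"), the units are assumed to be micromolar, and the measured mutant values are those listed in Table S1, not the numbers computed here.

```python
# Wild-type K_D values (assumed micromolar) and the approximate
# fold reductions reported in the text for the I915A mutant.
wt_kd_um = {"H3K4me1": 1.3, "H3K4me2": 3.3, "H3K4me3": 18.9}
i915a_fold = {"H3K4me1": 116, "H3K4me2": 16, "H3K4me3": 1.5}

# Estimated mutant K_D = wild-type K_D x fold reduction (illustrative only).
est_kd_um = {p: wt_kd_um[p] * i915a_fold[p] for p in wt_kd_um}

# Rank peptides from tightest to weakest estimated binding.
ranking = sorted(est_kd_um, key=est_kd_um.get)

for peptide in ranking:
    print(f"{peptide}: ~{est_kd_um[peptide]:.0f} uM (estimated)")
```

Under these assumptions the estimated order for I915A becomes H3K4me3, then H3K4me2, then H3K4me1, the reverse of the wild-type order.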

SDG8-CW undergoes conformational change upon binding to H3K4me1
The solution structure of the apo form of the SDG8 CW domain was reported previously (21). Superimposition of the apo form (Protein Data Bank code 2L7P) and the complex form reveals a significant conformational change at the η1 turn (Fig. 4A). The r.m.s.d. between the apo form and the complex is 1.18 Å. In the complex structure, the loop connecting β2 and α1, which contains η1, is closer to the β-hairpin core, thus facilitating binding to the H3K4me1 peptide. In the apo form, residues Val-882, Ile-885, and Trp-891, which are involved in the recognition of Ala-1 and Thr-3, are far from Val-866 on β1 and Ile-877 on β2 and thus fail to form the hydrophobic pockets that accommodate the methyl groups of Ala-1 and Thr-3 (Fig. 4B). In addition, conformational changes were also observed in the methyllysine-binding cage. The cage is more open in the complex structure than in the apo form because of a shift of the indole ring of Trp-874 (Fig. 4, C and D). Moreover, the side chain of Asn-916, which forms a hydrogen bond with the methylammonium of K4me1 in the complex, is turned away from the peptide in the apo form (Fig. 4E). Together, these observations suggest that the methyllysine-binding pocket is closed in the absence of histone ligand. Upon binding, SDG8-CW undergoes conformational changes, including opening of the methyllysine-binding cage, the flip of Asn-916, and the closer approach of the 3₁₀-turn η1, to facilitate the binding of the histone H3K4me1 peptide.
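For readers unfamiliar with the r.m.s.d. measure quoted above, the following minimal Python sketch shows how it is defined for two sets of matched atomic coordinates. It assumes the two structures have already been superimposed and that atoms are paired one to one; the text does not state which atoms or which program were used, so the function and the toy coordinates are purely illustrative.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (same units as the input, e.g. angstroms)
    between two equally long lists of (x, y, z) tuples that are already
    superimposed and paired atom by atom."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Toy example: every atom displaced by exactly 1 unit along z.
apo_like = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
complex_like = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(apo_like, complex_like))
```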

The C-terminal α-helix of the SDG8-CW domain that is important for K4me1 recognition is conserved in green plants
Ile-915, Asn-916, and Leu-919 are the three key residues that form the monomethyllysine-binding pocket. The mutagenesis study and ITC assays also indicated the critical role of Ile-915 and Asn-916 in the selection of the level of lysine methylation. These three residues are located on the C-terminal α-helix of SDG8-CW. We searched for structural homologs of the SDG8 CW domain using the Dali server (44), which identified ZCWPW2 as the closest structural homolog, with an r.m.s.d. of 1.6 Å. However, we found that no C-terminal α-helix is observed in any previously reported CW structure, including the CW domains of ZCWPW1-3, MORC1-3, and LSD2 (20,23,43), indicating that the presence of the C-terminal α-helix is unique to SDG8-CW (Fig. 1G). To investigate whether the C-terminal α-helix may exist in a wider range of species, we performed sequence alignments (Fig. 5A). The alignment results indicated that the three key residues (Ile-915, Asn-916, and Leu-919) involved in the monomethyllysine-binding pocket were not conserved in human CW domains or in other CW-containing proteins in Arabidopsis. Therefore, we speculated that the specific recognition of monomethylated lysine is a feature unique to SDG8 proteins in green plants.

The CW domain of OsSDG725 in rice shows a similar binding preference as SDG8-CW
However, we found that the C-terminal α1 of SDG8-CW is highly conserved in ASHH2 proteins in most green plants, including dicots and monocots (Fig. 5B). To investigate whether the CW domain of SDG8 proteins in other plant species has the same binding preference as Arabidopsis SDG8, we chose OsSDG725, the rice homolog of SDG8. OsSDG725 contains 637 amino acids with a CW domain and a SET domain (Fig. 6A). Like SDG8-CW, the OsSDG725-CW domain (residues 41-101) exhibits high binding affinity for histone H3K4me peptides but no binding to H3K9me3, H3K27me3, or H3K36me3 (Fig. 6B and Table S2). The K_D values for H3K4me  . Therefore, the order of preference of OsSDG725 for H3K4me peptides is H3K4me1 ≥ H3K4me2 > H3K4me3 > H3K4me0. However, the difference in binding affinity caused by the degree of methylation is narrower. Next, guided by the mutagenesis studies on SDG8-CW, we generated five mutants of OsSDG725: W53A (corresponding to W874A in SDG8-CW), W44A (corresponding to W865A), I95A (corresponding to I915A), N96A (corresponding to N916A), and I95A/N96A (corresponding to the double mutant I915A/N916A). Similar to the SDG8-CW mutants, the W53A mutant completely lost the ability to bind any of the four H3K4me peptides (Fig. 6C and Table S2). W44A reduced the binding ability about 26-fold for mono-, 24-fold for di-, and 31-fold for trimethylated H3K4 and 28-fold for the unmethylated peptide (Fig. 6D). Ile-95 and Asn-96 in OsSDG725 also play critical roles in the recognition of low-level methylation (Fig. 6, E, F, and G). Mutations in the C-terminal α-helix decrease the binding to H3K4me peptides to different extents, and the binding affinity for H3K4me1 is affected more than that for H3K4me2 or H3K4me3. The fold change in binding affinity caused by I95A is 74-fold for mono-, 38-fold for di-, and 6-fold for trimethylated H3K4 (Fig. 6, E and I). N96A reduced the binding about 21-, 6-, and 1.4-fold, respectively (Fig. 6, F and I), as did the double mutant I95A/N96A (Fig. 6, G and I).
Mutations of the histone peptide were also tested (Fig. 6H). The results showed that ART (residues 1-3) of histone H3 is important for the recognition of H3K4me1 by OsSDG725-CW, as it is for SDG8-CW. Together, these results demonstrate that the sequence-specific recognition of H3K4me1 by the CW domain of ASHH2 is conserved in rice. According to the sequence alignment results, residues that specifically recognize Ala-1 and Thr-3 are highly conserved among green plants.

Discussion
In this work, we investigated the binding ability and preference of the CW domains of Arabidopsis SDG8 and rice SDG725 for the monomethylated histone H3K4 peptide and determined the crystal structure of SDG8-CW in complex with an H3K4me1 peptide. We found that the N terminus of H3 is critical for the sequence-specific binding and that residues on the C-terminal α-helix α1 are the determinants of monomethyllysine recognition. Mutations on α1 lead to a conversion of the binding preference from monomethylated lysine to trimethylated lysine. By analyzing the sequences and structures of other CW domains, we found that α1 occurs only in the CW domains of ASHH2 homologs among green plants. Our findings may provide structural insights into the molecular mechanism of the specific recognition of a histone H3K4me1 marker by the SDG8 CW domain.
So far, many crystal structures of histone readers in complex with histone H3K4 peptides have been reported (45,46). However, most of these structures contain H3K4me3 or unmethylated H3K4. Two structures of histone readers in complex with H3K4me1 have been deposited previously in the Protein Data Bank: MORC3-CW-H3K4me1 (Protein Data Bank code 5SVY) (40) and WDR5-H3K4me1 (Protein Data Bank code 2H9N) (47). However, these two proteins are not H3K4me1 readers; they preferentially bind trimethylated H3K4 (MORC3-CW) and unmethylated H3K4 (WDR5). Therefore, our structure is the first of an H3K4me1-specific reader protein in complex with H3K4me1.
We compared our crystal structure with other reported human CW domain structures, such as those of MORC3 and ZCWPW2, which revealed some differences between the H3K4me1-reader CW domain and the H3K4me3-reader CW domains (23). In our structure, Ala-1 and Thr-3 are key residues in the sequence-specific recognition. Arg-2 plays a less important role than Ala-1 and Thr-3. Arg-2 is sandwiched by Arg-867 and Glu-887 in SDG8 (Fig. 2, B and D). However, the guanidinium moiety is closer to Arg-867, which causes electrostatic repulsion. Mutation of Arg-2 to alanine only decreased the binding about 5-fold. In the MORC3-CW-H3K4me3 and ZCWPW2-CW-H3K4me3 complex structures, the residues corresponding to Arg-867 are Gln-412 and Gln-32, respectively (23). Thus, Arg-2 forms hydrogen bonds with Gln-412 or Gln-32 (Fig. 7, A and B). Mutation of Arg-2 to alanine nearly abolishes the binding by ZCWPW2-CW (23). No binding could be detected in an NMR experiment in which MORC3-CW was titrated with a histone H3K4me3 peptide (residues 3-10) lacking Ala-1 and Arg-2 (40). We noticed that the orientation of the histone H3K4me peptide starting from residue Gln-5 is quite different in the SDG8-CW-H3K4me1 complex structure compared with similar structures (Fig. 7, C and D). In the MORC3-CW or ZCWPW2-CW complex structure, the C-terminal part of the histone peptide forms an antiparallel β-strand with β1. However, in our structure, the C-terminal α1 helix blocks this position, so that the peptide turns away and cannot form an antiparallel β-strand. Moreover, the presence of α1 results in a narrower pocket and tighter binding of the methylated lysine. By contrast, in the MORC3-CW-H3K4me3 and ZCWPW2-CW-H3K4me3 complex structures, the trimethylated lysine is surrounded by three aromatic residues in a much larger and more open pocket (Fig. 7, E and F). If we model the trimethylated lysine in the structure of SDG8-CW, a steric clash between the trimethyl group and the binding pocket is observed (Fig. 7G). Similarly, unmethylated lysine does not fit well into the pocket, resulting in the formation of a cavity in the pocket (Fig. 7H). Our crystal structure reveals that the hydrophobic, narrow pocket of SDG8-CW excludes binding of a more highly methylated state of lysine due to steric hindrance (Fig. 7I).
In Arabidopsis, SDG8 is an H3K36 methyltransferase, which catalyzes the dimethylation and trimethylation of H3K36 and functions in nutrient and energy metabolism, cell differentiation, timing of flowering, and other processes (30,48,49). The function of the CW domain in SDG8 remains unclear. In animals, the H3K4me1 marker is associated with enhancers (50). However, the H3K4me1 in Arabidopsis is predominantly located on gene bodies, especially the transcribed regions correlated with CG DNA methylation (51)(52)(53). How the combination of a histone H3K4me1 reader and an H3K36me2/3 writer interprets the effect on gene regulation, plant development, and stress response remains to be elucidated. Based on our study, we propose that SDG8 is recruited via the CW domain to the H3K4me1-labeled transcribed region to deposit the H3K36me2/3 mark and is subject to gene regulation.

Protein expression and purification
The cDNAs encoding the CW domain of Arabidopsis thaliana SDG8 (residues 862-921) and Oryza sativa SDG725 (residues 41-101) were amplified by PCR and cloned into the modified pET28-SMT3 vector with an N-terminal His-SUMO tag. Site-specific mutants were generated using a site-directed mutagenesis kit (New England Biolabs) according to the manufacturer's instructions. The plasmid was transformed into Escherichia coli strain BL21 (DE3). The cells were cultured in LB medium with 50 μg/ml kanamycin and 0.1 mM ZnSO4. Expression was induced with isopropyl β-D-thiogalactopyranoside (IPTG) at a final concentration of 0.2 mM, and the cells were grown at 18°C for a further 18 h. Cells were harvested by centrifugation and resuspended in 20 mM Tris, pH 8.0, 500 mM NaCl, 25 mM imidazole. Cells were lysed by French press (JNBIO). WT and mutant proteins were purified on a HisTrap column (GE Healthcare) followed by removal of the His-SUMO tag by Ulp1 digestion. The target protein was further purified by ion exchange chromatography using a HiTrap Q column (GE Healthcare) and size-exclusion chromatography using a Superdex G75 HiLoad 16/60 column (GE Healthcare). Fractions containing the target proteins were pooled and concentrated to 50 mg/ml in buffer containing 10 mM Tris, pH 8.0, 100 mM NaCl, 1 mM DTT for structural and biochemical studies. Sequence alignment was calculated using Jalview (54), and amino acid residues were shaded according to ESPript server (55).

Figure 5. Sequence alignment of CW proteins in species. Sequence alignment was calculated using Jalview (53), and amino acids were shaded according to the ESPript server (54). Secondary structural elements of SDG8-CW are displayed above the sequence alignment. The conserved zinc-binding mode is shown by lines at the bottom of the alignment. The residues that form the conserved aromatic cage are marked by red circles, and the various residues of the aromatic cage are marked by green frames. The key residues involved in histone tail sequence-specific recognition are marked by green triangles. A, sequence alignment of CW domains in human (Hs) and plant (Os and At

Crystallization, data collection, and structure determination
To crystallize the SDG8-CW E917A and H3K4me1 complex, purified E917A protein (40 mg/ml) and H3K4me1 (residues 1-9) were mixed at a 1:2 ratio and incubated on ice for 1 h. The SDG8-CW E917A-H3K4me1 complex was crystallized by the hanging drop vapor diffusion method. The well buffer contained 0.1 M Tris, pH 8.5, 30% PEG 3350, 30% isopropanol. X-ray diffraction data were collected at BL19U1 of the Shanghai Synchrotron Radiation Facility. Crystals were flash frozen under a cold nitrogen stream (100 K) during data collection. The data were processed using the HKL3000 program suite (56). Initial phases were determined by the single-wavelength anomalous dispersion method using zinc anomalous scattering. The PHENIX program suite was used for location of zinc positions, phasing, and density modification (57). The graphics program Coot was used for model building (58), and refinement was performed using PHENIX. The structure was analyzed using the MolProbity server (59). Phasing and refinement statistics are listed in Table 1. Buried surface area was calculated using PISA (42). Figures were generated using PyMOL (The PyMOL Molecular Graphics System, Version 1.8, Schrödinger, LLC). Electrostatic surface potential was calculated by the PDB2PQR server (60).

Isothermal titration calorimetry assays
ITC experiments were performed at 20°C on a MicroCal iTC200 (Malvern). Proteins and peptides were kept in an identical buffer of 20 mM Tris, pH 8.0, 100 mM NaCl. The sample cell was filled with a 0.05 mM solution of protein, and the injection syringe was filled with 1 mM titrating ligand. Each titration consisted of 20 injections of 2 μl each at 2-min intervals. Binding isotherms were analyzed by fitting the data to a one-site model using the ITC data analysis module in Origin 7.0 software.
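The equilibrium that underlies a one-site model can be sketched in a few lines of Python. The example below is not the Origin fitting routine; it only solves the 1:1 mass-action equation for the concentration of complex after each injection, using the concentrations given above. The cell volume of 200 μl and the 2-μl injection volume are assumptions, protein dilution and displaced volume are ignored, and the K_D of 1.3 μM is the value reported for H3K4me1.

```python
import math


def bound_complex(p_total, l_total, kd):
    """Concentration of the 1:1 complex PL for P + L <-> PL, obtained from the
    quadratic form of the mass-action law (all arguments in the same units)."""
    b = p_total + l_total + kd
    return (b - math.sqrt(b * b - 4.0 * p_total * l_total)) / 2.0


P_CELL_MM = 0.05      # protein in the sample cell (mM), from the methods
L_SYRINGE_MM = 1.0    # peptide in the syringe (mM), from the methods
N_INJECTIONS = 20     # from the methods
INJECTION_UL = 2.0    # ASSUMPTION: injection volume in microliters
CELL_UL = 200.0       # ASSUMPTION: nominal cell volume, not given in the text
KD_MM = 1.3e-3        # 1.3 uM for H3K4me1, expressed in mM

fractions = []
for i in range(1, N_INJECTIONS + 1):
    # Crude bookkeeping: ligand is diluted into the cell volume;
    # protein dilution and displaced volume are ignored.
    l_total = L_SYRINGE_MM * i * INJECTION_UL / CELL_UL
    pl = bound_complex(P_CELL_MM, l_total, KD_MM)
    fractions.append(pl / P_CELL_MM)

print(f"fraction of protein bound after the last injection: {fractions[-1]:.3f}")
```

In an actual fit, the heat of each injection is proportional to the change in complex concentration multiplied by the binding enthalpy and the cell volume, and K_D, the enthalpy, and the stoichiometry are adjusted to match the measured heats.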