Structure and Hemimethylated CpG Binding of the SRA Domain from Human UHRF1

Human UHRF1 (ubiquitin-like PHD and RING finger 1) functions to maintain CpG DNA methylation patterns through DNA replication by co-localizing with the DNA methyltransferase DNMT1 at chromatin in mammals. Recent studies show that UHRF1 binds selectively to hemimethylated CpG via its conserved SRA (SET- and RING finger-associated) domain. However, the underlying molecular mechanism is not known. Here, we report a 1.95 Å resolution crystal structure of the SRA domain of human UHRF1. Using NMR structure-guided mutagenesis, electrophoretic mobility shift assay, and fluorescence anisotropy analysis, we determined key amino acid residues for methyl-DNA binding that are conserved in the SRA domain.

Almost all DNA methylation in the mammalian genome occurs at the CpG dinucleotide motif; it is carried out by the DNA methyltransferase DNMT1 and is maintained through DNA replication (1). As a heritable epigenetic mark, CpG methylation works in concert with histone modifications to control gene transcriptional silencing and heterochromatin formation (2, 3). Studies show that the multidomain human protein UHRF1 (also known as ICBP90 in human and Np95 in mouse) functions to maintain CpG methylation patterns by co-localizing with DNMT1 (4, 5) and the histone-lysine deacetylase HDAC1 (6). More recent studies show that UHRF1 has E3 ubiquitin ligase activity (7) and can also selectively recognize histone H3 methylated at lysine 9 via its PHD finger (4, 8) and hemimethylated CpG via its SRA domain (6, 9, 10), thus bridging two important epigenetic marks in DNA and histones for functional regulation of gene silencing and pericentromeric heterochromatin formation (11-13). Given its multiple functions in epigenetic gene regulation, it is not surprising that human UHRF1 is found to be overexpressed in many different forms of human cancer, including breast (6, 14-16), cervical (17), and prostate (16) cancers; pancreatic adenocarcinomas (18); rhabdomyosarcomas (19); and gliomas (20). However, the molecular mechanistic understanding of UHRF1 is limited. In an effort to determine the structural basis of its hemimethylated DNA recognition, we solved a high resolution crystal structure of the SRA domain from human UHRF1. By using NMR structure-guided analysis and mutagenesis, we further identified the amino acid residues that are important for SRA domain recognition of hemimethylated DNA.

EXPERIMENTAL PROCEDURES
Protein Preparation-The SRA domain (residues 414-617) of human UHRF1 was cloned into the pET28a plasmid as an N-terminal His₆-tagged fusion protein. The fusion protein was expressed in Escherichia coli (strain BL21(DE3)). Uniformly ¹⁵N- and ¹⁵N/¹³C-labeled proteins were prepared by growing bacteria in minimal medium with ¹⁵NH₄Cl and/or [¹³C₆]glucose as the sole nitrogen and carbon sources. Deuterated protein was generated by cell growth in 90% ²H₂O. The SRA domain proteins were purified by nickel-nitrilotriacetic acid affinity and size exclusion columns followed by Mono S ion exchange chromatography after thrombin cleavage of the His₆ tag. Protein NMR samples (~0.5 mM) were prepared in 50 mM sodium phosphate buffer (pH 6.5) containing 150 mM NaCl and 2 mM dithiothreitol-d₁₀.
DNA Preparation-High pressure liquid chromatography-purified 5′-fluorescein-labeled, 5′-biotinylated, unmodified, or hemimethylated CpG oligonucleotides were purchased from Fisher (supplemental Table 1). The lyophilized oligonucleotides were resuspended to 2.0 mM in 10 mM Tris-HCl (pH 8.0) containing 100 mM NaCl. Complementary strands were mixed in equimolar amounts and annealed by heating to 368 K followed by cooling to room temperature.
Site-directed Mutagenesis-Human UHRF1 SRA domain mutants were generated using the QuikChange kit (Stratagene) and cloned into the pGEX6p1 vector as an N-terminal glutathione S-transferase fusion protein. The presence of appropriate mutations was confirmed by DNA sequencing.
NMR Spectroscopy-All NMR spectra were acquired at 293 K on an 800, 600, or 500 MHz NMR spectrometer. ¹H, ¹³C, and ¹⁵N backbone resonances of the protein were assigned with three-dimensional deuterium-decoupled triple-resonance HNCA, HN(CA)CB, HN(COCA)CB, and HN(CO)CACB spectra recorded on a uniformly ¹⁵N/¹³C-labeled and fractionally deuterated protein (21). DNA titration was performed by recording a series of two-dimensional ¹⁵N heteronuclear single quantum coherence spectra on uniformly ¹⁵N-labeled protein in the presence of different amounts of DNA oligonucleotides ranging from 0 to 0.05 mM.
Crystallization-Native crystals of the UHRF1 SRA domain were obtained at 293 K with the hanging drop vapor diffusion method by mixing 1 μl of the protein solution with 1 μl of crystallization solution (100 mM Tris-HCl (pH 8.0) containing 200 mM NaCl, 1 M ammonium sulfate, and 15% glycerol). For initial phasing, a sodium selenate (Na₂SeO₄) heavy atom derivative was prepared by soaking native SRA domain protein crystals in mother liquor in which the ammonium sulfate was substituted with sodium selenate while all remaining components were kept constant. The crystal soaking was carried out for at least 1 h at 293 K to ensure SO₄/SeO₄ substitution. All diffraction data were measured at 100 K at Brookhaven National Laboratory beamlines X4C and X6A. The heavy atom-soaked crystal and the native crystal diffracted to 3.0 and 1.95 Å, respectively. All data were processed with HKL-2000 (22). The crystals belong to space group P3₁21 with unit cell dimensions of a = b = 65.6 Å, c = 95.4 Å, α = β = 90°, and γ = 120°. During the phasing process, the crystal structure of an SRA family protein (Protein Data Bank code 3BI7) was released and was used as a molecular replacement template. The molecular replacement solution was used subsequently for the structure refinement to 1.95 Å using REFMAC5 (23) and COOT (24). X-ray data collection and refinement statistics are listed in Table 1.
Fluorescence Anisotropy-The SRA domain binding affinity for 5′-fluorescein-labeled oligonucleotides was determined at 293 K using a TECAN Safire fluorescence reader with excitation at 470 nm and emission at 525 nm. The 5′-fluorescein-labeled DNA (5 nM) was added to a series of wild-type or mutant SRA domain protein samples at concentrations ranging from 5 nM to 0.5 mM. Anisotropy values were referenced against a blank sample of buffer (20 mM Tris-HCl (pH 7.5) containing 50 mM NaCl and 1 mM dithiothreitol) at the beginning of each experiment for background correction. The anisotropy data were fitted by nonlinear least-squares regression to the following equation:
$$A = A_{\min} + (A_{\max} - A_{\min}) \, \frac{(E + D + K_d) - \sqrt{(E + D + K_d)^2 - 4ED}}{2D}$$
where $A$ is the anisotropy, $E$ is the total protein concentration, $D$ is the total DNA concentration, $A_{\min}$ is the anisotropy of free DNA, $A_{\max}$ is the anisotropy of the DNA-protein complex, and $K_d$ is the dissociation constant. A 1:1 stoichiometry for the DNA-protein complex was assumed.
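As an illustration of the fitting procedure described above, the following minimal sketch evaluates the standard 1:1 binding model in terms of total protein and total DNA concentrations and recovers a dissociation constant from a synthetic, noise-free titration. It is not the authors' fitting code: the function names, the log-spaced grid search (used here in place of a full nonlinear least-squares routine), the fixed A_min and A_max values, and the synthetic data are all assumptions made for the example.

```python
import math


def fraction_bound(E, D, Kd):
    """Fraction of DNA in complex for 1:1 binding.

    E: total protein, D: total DNA, Kd: dissociation constant
    (all in the same concentration unit, here molar).
    """
    b = E + D + Kd
    return (b - math.sqrt(b * b - 4.0 * E * D)) / (2.0 * D)


def anisotropy(E, D, Kd, A_min, A_max):
    """Observed anisotropy as a weighted mix of free and bound DNA."""
    return A_min + (A_max - A_min) * fraction_bound(E, D, Kd)


def fit_kd(E_values, A_values, D, A_min, A_max, lo=1e-9, hi=1e-3, steps=2000):
    """Least-squares estimate of Kd by a log-spaced grid search.

    A_min and A_max are held fixed here for simplicity (an assumption);
    a real analysis would usually float them as well.
    """
    best_kd, best_sse = None, float("inf")
    for i in range(steps + 1):
        kd = lo * (hi / lo) ** (i / steps)
        sse = sum(
            (anisotropy(E, D, kd, A_min, A_max) - A) ** 2
            for E, A in zip(E_values, A_values)
        )
        if sse < best_sse:
            best_kd, best_sse = kd, sse
    return best_kd


# Synthetic titration (illustrative values only, not measured data):
# 5 nM DNA, protein from 5 nM to 0.5 mM in half-log steps.
D_total = 5e-9
E_values = [5e-9 * 10 ** (k / 2) for k in range(11)]
true_kd = 2e-7          # hypothetical Kd used to generate the data
A_min, A_max = 0.05, 0.25  # hypothetical anisotropy limits
A_values = [anisotropy(E, D_total, true_kd, A_min, A_max) for E in E_values]

fitted_kd = fit_kd(E_values, A_values, D_total, A_min, A_max)
print(f"fitted Kd = {fitted_kd:.3e} M")
```

With real data, A_min, A_max, and K_d would normally be fitted together by a general nonlinear least-squares routine; the grid search is used here only to keep the sketch dependency-free.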

RESULTS AND DISCUSSION
The SRA domain from the human UHRF1 protein was purified to homogeneity and crystallized under the conditions described under "Experimental Procedures." The three-dimensional crystal structure of the SRA domain from human UHRF1 was determined to 1.95 Å resolution (Table 1). As shown in Fig. 1a, its compact architecture is built on an eight-stranded β-barrel core consisting of amino acid sequences conserved in the SRA domain family (supplemental Fig. 1). One side of the β-barrel is curved around helix α1 through interactions with the conserved residues Trp⁴³⁰ and Arg⁴³³, whereas the amphipathic helix α2 closes off one end of the barrel through Asn⁵⁰⁵, Leu⁵⁰⁸, and Asn⁵¹¹, which interact with residues in the core of the barrel. Moreover, the C-terminal helix α5 ties together the short N-terminal β1, emphasizing the modular nature of the structure. Note that the loop connecting β4 and α2 is flexible, as residues 484-495 were invisible in the electron density map.
Based on SCOP and DALI searches, the structural fold of the SRA domain is distinct topologically from any available protein structures, including the β-barrel OB fold known to interact with nucleic acids (25) and the methylated or non-methylated CpG-binding domains (26, 27) (supplemental Fig. 2).
The structure exhibits a large, distinct, positively charged patch on the SRA domain surface (Fig. 1b) composed of Arg⁴³¹, Arg⁴³³, Arg⁴⁴³, Lys⁵⁴⁰, and Arg⁴⁵². At this positive patch, the electron density map reveals a well ordered sulfate anion (from the crystallization buffer) and a glycerol molecule (from the cryoprotectant solution) that form salt bridges with Arg⁴³¹ and Arg⁴³³, respectively, and a sulfate anion that forms hydrogen bonds with the side chain of Arg⁴⁴³ (Fig. 1a, right panel).
We characterized SRA domain/methyl-DNA binding with NMR spectroscopy. As shown in two-dimensional ¹H-¹⁵N heteronuclear single quantum coherence spectra (Fig. 1c), upon addition of a 13-nucleotide hemimethylated DNA duplex, a large number of protein amide resonances underwent major chemical shift perturbations and conformational exchange-induced line broadening. To identify SRA residues that are important for methyl-DNA recognition, we obtained 90% backbone ¹H, ¹⁵N, and ¹³C resonance assignment of the protein with triple-resonance NMR spectra, including residues 484-495 in the β4/α2 loop (supplemental Fig. 3), which are structurally flexible and missing in the electron density map of the crystal structure. Interestingly, the residues that show the most profound chemical shift perturbations or line broadening upon binding to hemimethylated DNA are those located in the β4/α2 loop; Arg⁴³¹, Phe⁴³², and Arg⁴³³ in the β1/α1 loop; Arg⁴⁴³, His⁴⁴⁵, and Val⁴⁴⁶ in the α1/β2 loop; and Gly⁴⁶⁵, Tyr⁴⁶⁶, Glu⁴⁶⁷, and Asp⁴⁶⁹ in the β3/β4 loop (Fig. 1a, right panel). These four loops are spatially clustered at the positive patch of the protein (28), supporting the notion that these residues likely constitute the DNA-binding site. It is worth noting that many residues in the β1/α1, α1/β2, and β3/β4 loops are highly conserved in the SRA domain family (supplemental Fig. 1). Using a DNA electrophoretic mobility shift assay (EMSA), we further demonstrated that the SRA domain recognizes a 13-nucleotide singly or doubly hemimethylated DNA duplex and forms a stable complex, but binds only very weakly, if at all, to a 13-nucleotide non-methylated DNA (Fig. 2a, lanes 2 and 3 versus lane 1).
To investigate the molecular determinants of methyl-DNA binding by the SRA domain, we performed site-directed mutagenesis of Arg⁴³³, Arg⁴⁴³, Tyr⁴⁶⁶, Asp⁴⁶⁹, and Arg⁴⁹¹, which are located at the putative DNA-binding site as mapped out by NMR titration. Individual Ala mutation of Arg⁴³³ and Arg⁴⁹¹ did not cause a significant reduction in DNA binding, whereas mutants R443A and Y466G showed markedly decreased DNA binding compared with the wild type (Fig. 2b). Moreover, mutant D469G almost totally lost the ability to form a complex with hemimethylated DNA in the gel. Note that the D469G mutation did not cause any significant perturbation of the protein structure, as judged by the similarity of its one-dimensional ¹H NMR spectrum to that of the wild-type protein (supplemental Fig. 4a). Given that Arg⁴⁴³, Tyr⁴⁶⁶, and Asp⁴⁶⁹ are highly conserved in the SRA domain family (supplemental Fig. 1), we argue that these residues are likely important for the SRA domain interactions with DNA.
Fig. 1 legend: a (right panel), key protein residues that interact with bound sulfate anions and glycerol and those that constitute a positively charged patch on the protein surface. b, surface electrostatic potential representation of the protein, shown in the front view as in the right panel in a. c, two-dimensional ¹H-¹⁵N heteronuclear single quantum coherence spectra showing chemical shift changes of the protein amide resonances between its free form (black) and in the presence of a 13-nucleotide hemimethylated DNA duplex (red) (supplemental Table 1). d, surface of the SRA domain highlighting residues color-coded according to chemical shift perturbations or line broadening effects upon binding to a 13-nucleotide hemimethylated DNA duplex. The residues are indicated in a gray-to-red color gradient according to weighted ¹H and ¹⁵N chemical shift changes of the protein induced by methylated DNA binding. The residues showing line broadening effects were treated the same as the residues that had the most profound chemical shift perturbations.
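The Fig. 1 legend refers to weighted ¹H and ¹⁵N chemical shift changes, with line-broadened residues treated like the most strongly perturbed ones. The sketch below shows one common way such a per-residue score can be computed. The source does not state the weighting that was used, so the combined-shift formula, the ¹⁵N scaling factor of 0.2, the function names, and the placeholder residue labels and shift values are all assumptions for illustration, not the authors' procedure or data.

```python
import math


def weighted_csp(delta_h, delta_n, n_weight=0.2):
    """Combined 1H/15N chemical shift perturbation in ppm.

    n_weight scales the 15N change onto the 1H scale; 0.2 is an
    assumed, commonly used value, not one taken from the paper.
    """
    return math.sqrt(delta_h ** 2 + (n_weight * delta_n) ** 2)


def rank_residues(shifts, broadened=(), n_weight=0.2):
    """Score residues and sort them from most to least perturbed.

    shifts: dict mapping residue label -> (delta_1H, delta_15N) in ppm.
    broadened: labels of residues whose peaks broaden on binding; they
    receive the maximum observed score, mirroring the legend's rule.
    """
    scores = {
        res: weighted_csp(dh, dn, n_weight) for res, (dh, dn) in shifts.items()
    }
    top = max(scores.values(), default=0.0)
    for res in broadened:
        scores[res] = top
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Placeholder input (hypothetical labels and values, not measurements).
example_shifts = {
    "res_A": (0.30, 1.0),
    "res_B": (0.05, 0.2),
    "res_C": (0.10, 0.1),
}
ranking = rank_residues(example_shifts, broadened=["res_D"])
for label, score in ranking:
    print(f"{label}: {score:.3f} ppm")
```

Scores of this kind can then be mapped onto a color gradient for surface display, as in panel d of the figure.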
To verify the mutation effects quantitatively, we determined the dissociation constants for the binding of the wild-type SRA domain and its mutants to a 5′-fluorescein-labeled 13-nucleotide hemimethylated DNA duplex in a fluorescence anisotropy binding study. The wild type bound the hemimethylated DNA with a K_d of ~0.2 μM, whereas the R433A, R443A, Y466G, and R491A mutants showed reductions in binding affinity of 6-, 19-, 16-, and 7-fold, respectively (Fig. 2c). The D469G mutant showed much more severely reduced binding to the hemimethylated DNA, with a K_d of ~50 μM, as determined by fluorescence anisotropy, consistent with its nearly complete loss of DNA binding in EMSA. Interestingly, the D469G mutation resulted in a much more dramatic reduction in SRA domain binding to hemimethylated DNA than to non-methylated DNA of the same sequence, i.e. 250-fold versus 50-fold (supplemental Fig. 4b).
Moreover, D469G mutant binding to non-methylated DNA was ~2-fold tighter than to hemimethylated DNA (K_d = 25 μM versus 50 μM). Collectively, these results suggest a critical role of Asp⁴⁶⁹ in SRA domain recognition of hemimethylated DNA.
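The fold changes quoted above can be cross-checked with simple arithmetic. The sketch below recomputes the 250-fold figure from the two reported K_d values, derives the K_d values implied for the other mutants and for wild-type binding to non-methylated DNA, and converts fold changes into approximate binding free energy differences via ΔΔG = RT ln(fold) at the 293 K measurement temperature. The implied K_d values and ΔΔG estimates are derived here for illustration and are not reported in the text; the variable and function names are hypothetical.

```python
import math

R_KCAL = 1.987e-3   # gas constant in kcal/(mol*K)
TEMP_K = 293.0      # temperature of the anisotropy measurements

# Values reported in the text (micromolar).
KD_WT_HEMI = 0.2
KD_D469G_HEMI = 50.0
KD_D469G_NONMETH = 25.0
FOLD_REDUCTION = {"R433A": 6, "R443A": 19, "Y466G": 16, "R491A": 7}
FOLD_D469G_NONMETH = 50  # reported reduction for non-methylated DNA

# Derived quantities (not stated explicitly in the text).
fold_d469g_hemi = KD_D469G_HEMI / KD_WT_HEMI
implied_kd = {m: KD_WT_HEMI * f for m, f in FOLD_REDUCTION.items()}
implied_kd_wt_nonmeth = KD_D469G_NONMETH / FOLD_D469G_NONMETH


def ddg_kcal(fold):
    """Approximate loss of binding free energy for a given fold change."""
    return R_KCAL * TEMP_K * math.log(fold)


print(f"D469G, hemimethylated: {fold_d469g_hemi:.0f}-fold, "
      f"ddG ~ {ddg_kcal(fold_d469g_hemi):.1f} kcal/mol")
for mutant, kd in implied_kd.items():
    print(f"{mutant}: implied Kd ~ {kd:.1f} uM, "
          f"ddG ~ {ddg_kcal(FOLD_REDUCTION[mutant]):.1f} kcal/mol")
print(f"wild type, non-methylated: implied Kd ~ {implied_kd_wt_nonmeth:.1f} uM")
```

On these numbers the wild-type domain would prefer hemimethylated over non-methylated DNA by a modest factor, and the D469G mutation inverts that preference, which is the pattern the text interprets as evidence for the role of Asp⁴⁶⁹.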
In conclusion, the crystal structure of the human UHRF1 SRA domain reported here enabled us to identify key amino acid residues, including Arg⁴⁴³, Tyr⁴⁶⁶, and Asp⁴⁶⁹, at the opening of the β-barrel that contribute to hemimethylated DNA binding. As these residues are highly conserved within the SRA domain family, we expect that these findings will yield a better understanding of the structure-function relationship of the SRA domain family in epigenetic gene regulation.