Structural and Functional Insights into the Human Börjeson-Forssman-Lehmann Syndrome-associated Protein PHF6*

Background: The PHF6 gene is mutated in BFLS and in adult acute myeloid and T-cell acute lymphoblastic leukemias.
Results: The crystal structure of the second extended PHD domain of PHF6 was solved.
Conclusion: PHF6-ePHD2 is a novel structural module and binds dsDNA.
Significance: PHF6 may function as a transcriptional repressor, using its ePHD domains to bind DNA and its NoLS region to recruit the NuRD complex to regulate gene transcription.

The plant homeodomain finger 6 (PHF6) gene was originally identified as the gene mutated in the X-linked mental retardation disorder Börjeson-Forssman-Lehmann syndrome. Mutations in the PHF6 gene have also been associated with T-cell acute lymphoblastic leukemia and acute myeloid leukemia. Approximately half of the disease-associated mutations are distributed in the second conserved extended plant homeodomain (ePHD2) of PHF6, indicating the functional importance of the ePHD2 domain. Here, we report the high resolution crystal structure of the ePHD2 domain of PHF6, which contains an N-terminal pre-PHD (C2HC zinc finger), a long linker, and an atypical PHD finger. PHF6-ePHD2 appears to fold as a novel integrated structural module. Structural analysis of PHF6-ePHD2 reveals the pathological implications of PHF6 gene mutations in Börjeson-Forssman-Lehmann syndrome, T-cell acute lymphoblastic leukemia, and acute myeloid leukemia. The binding experiments show that PHF6-ePHD2 can bind dsDNA but not histones. We also demonstrate that the PHF6 protein directly interacts with the nucleosome remodeling and deacetylation complex component RBBP4. Via this interaction, PHF6 exerts its transcriptional repression activity. Taken together, these data support the hypothesis that PHF6 may function as a transcriptional repressor by using its ePHD domains to bind the promoter region of its repressed gene, and that this process is regulated by the nucleosome remodeling and deacetylation complex, which is recruited to the genomic target site by the NoLS region of PHF6.

The PHF6 gene encodes a 365-amino acid protein that contains four nuclear localization signals and two imperfect PHD zinc finger domains with a proposed role in the regulation of gene transcription (1,14). PHF6 is highly conserved among vertebrate species, including Mus musculus and Xenopus laevis, but it has no obvious orthologs in nonvertebrates, suggesting that the functions of PHF6 in the cell are conserved (1). Northern blot and in situ hybridization analyses have revealed that PHF6 is highly expressed during embryonic, fetal, and postnatal stages of brain development and is predominantly present in the developing embryonic central nervous system (1,15). Therefore, it appears that PHF6 plays an important role in human brain development. In addition, RT-PCR analysis demonstrates that the PHF6 gene is ubiquitously expressed in human tissues and most highly expressed in thymus, ovary, and thyroid (16). ChIP-on-chip analysis identifies PHF6 as a direct target gene of NOTCH1, which is frequently mutated in T-cell acute lymphoblastic leukemia (T-ALL) (17). Indeed, a reported BFLS patient developed T-ALL (18). Meanwhile, somatic mutations and deletions of the PHF6 gene have been found in patients with T-ALL and acute myeloid leukemia (16, 19-22). Moreover, a small set of microRNAs has been found to cooperatively suppress several tumor suppressor genes implicated in T-ALL, including the PHF6 gene (23). It has been suggested that PHF6 may function as an X-linked tumor suppressor gene and play a role in the pathogenesis of hematologic tumors.
Concerning the molecular mechanism, PHF6 has recently been identified as a novel protein partner of the nucleosome remodeling and deacetylation (NuRD) complex in the nucleus (24). PHF6 regulates cell cycle progression by suppressing ribosomal RNA synthesis in the nucleolus (25). PHF6 physically associates with the PAF1 transcription elongation complex and plays an essential role in neuronal migration in the brain (26). Nevertheless, although PHF6 is implicated in chromatin regulation, neural development, and tumor suppression, very little is known about how PHF6 carries out its functions and how PHF6 mutations contribute to disease pathogenesis. To gain more insight into the structure-function relationship of PHF6 and its role in the pathogenesis of BFLS and leukemia, we determined the high resolution crystal structure of the second extended PHD domain (C2HC zinc finger-PHD, referred to as ePHD2) of the human PHF6 protein and found that the C2HC zinc finger-PHD folds as a novel integrated structural module. The PHF6-ePHD2 domain can bind dsDNA. Structural analysis of PHF6-ePHD2 reveals the pathological implications of disease-associated mutations of PHF6. In addition, we demonstrate that PHF6 can directly interact with RBBP4, a component of the NuRD complex, through its NoLS region, which is important for the transcriptional repression activity of the PHF6 protein.

EXPERIMENTAL PROCEDURES
Protein Cloning, Expression, and Purification-DNA fragments encoding the full-length protein (365 amino acids) and the second extended PHD domain (ePHD2, residues 208-333) of human PHF6 were amplified by PCR from a human brain cDNA library and cloned into the pGEX-4T-1 (GE Healthcare) expression vector between the NdeI and XhoI sites. The resulting plasmid was transformed into Escherichia coli Rosetta (DE3) cells for protein expression. Expression was induced with 0.25 mM isopropyl 1-thio-β-D-galactopyranoside at 16°C overnight in LB medium supplemented with 0.1 mM ZnSO4. 15N- or 15N/13C-labeled proteins were produced in SV40 medium supplemented with 15NH4Cl or with 15NH4Cl and [13C6]glucose. The fusion protein was purified on glutathione-Sepharose beads (GE Healthcare). The proteins were eluted with buffer containing 20 mM reduced L-glutathione. The eluted fraction was concentrated to 5 ml using Amicon Ultra-15 centrifugal filter units (10-kDa cutoff, Millipore) and further purified by gel filtration chromatography using a 16/26 Superdex 200 column (GE Healthcare) pre-equilibrated in 25 mM sodium phosphate buffer, pH 6.5, containing 400 mM NaCl. After cleavage of the GST tag by thrombin (GE Healthcare) at 16°C, an additional cation exchange purification step was used.
Full-length RBBP4 (residues 1-425; UniProt accession number Q09028) was cloned into a pFastBac vector encoding an N-terminal His6 tag and a tobacco etch virus protease cleavage site. Recombinant baculovirus was generated in Tn5 cells using the Bac-to-Bac (Invitrogen) expression methodology according to the manufacturer's instructions.
Crystallization Conditions, Data Collection, and Structure Determination-Crystals were grown at 16°C by vapor diffusion in sitting drops prepared by mixing equal volumes of protein solution at 10 mg/ml and well solution containing 200 mM ammonium acetate, 100 mM tri-sodium citrate dihydrate, and 30% (w/v) polyethylene glycol 4000, pH 5.6.
The diffraction data were collected on beamline 17U at the Shanghai Synchrotron Radiation Facility. The initial diffraction data were indexed and integrated with iMosflm (27) and scaled with SCALA (28) from the CCP4 program suite (29). The positions of all three zinc atoms were determined with SHELXC/D (30), and initial phase calculation and density modification were performed with SHELXE (30). An initial model was automatically built by the program ARP/wARP (31). The model was further built and refined using Coot (32) and Refmac (33), respectively.
NMR Spectroscopy, 15N Relaxation Experiments, and Residual Dipolar Couplings (RDCs)-All NMR experiments were collected at 298 K on a Bruker DMX 600-MHz spectrometer equipped with a cryoprobe. Sequence-specific backbone assignments were carried out by analysis of three-dimensional HNCO, HN(CA)CO, CBCANH, and CBCA(CO)NH experiments. 1H and 13C side chain assignments were obtained from three-dimensional HBHA(CBCACO)NH, C(CO)NH-TOCSY, H(CCO)NH-TOCSY, HCCH-COSY, and HCCH-TOCSY. NMR distance information was obtained from 15N- and 13C-edited NOESY-HSQC spectra. One-dimensional spectra of PHF6-ePHD2 domain mutants were collected at 298 K on a Varian 500-MHz spectrometer. All NMR data were processed using NMRPipe (34) and analyzed with Sparky (T. D. Goddard and D. G. Kneller, University of California at San Francisco). 15N relaxation measurements were performed on a Bruker DMX-600 equipped with a cryoprobe using a 15N-labeled PHF6-ePHD2 sample at a concentration of 0.6 mM at 298 K. The 1H-15N heteronuclear NOE experiments were recorded in an interleaved fashion, alternately with and without proton presaturation in the recovery delay. A 2-s relaxation delay and a 3-s 1H presaturation period were used in the NOE experiment, and a 5-s relaxation delay was used in the reference experiment.
15N T1 and T2 relaxation times were derived from nine spectra with different relaxation delays (11, 61, 142, and 242 (run twice) and 362, 523, 753, and 1147 ms) and from seven spectra with different relaxation delays (17.6, 35.2, and 52.8 (run twice) and 70.4, 105.6, and 140.8 ms), respectively. T1 and T2 values were extracted and fitted using a curve-fitting subroutine included in the program Sparky.
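As a minimal sketch of how a relaxation time can be extracted from such a delay series, the following assumes a single-exponential decay, I(t) = I0 exp(-t/T1), fitted by linear regression on the logarithm of the peak intensity. The fitting model used by the Sparky subroutine is not described here, so this is an illustration rather than the authors' procedure; the function name and the intensities are hypothetical.

```python
import numpy as np

# T1 relaxation delays listed in the text, in ms (242 ms was recorded twice).
T1_DELAYS_MS = np.array([11, 61, 142, 242, 242, 362, 523, 753, 1147], dtype=float)


def fit_exponential_decay(delays_ms, intensities):
    """Fit I(t) = I0 * exp(-t / T) by linear regression on log(I).

    Returns (I0, T), with T in the same units as delays_ms.
    Assumes every intensity is positive.
    """
    slope, intercept = np.polyfit(delays_ms, np.log(intensities), 1)
    return float(np.exp(intercept)), float(-1.0 / slope)


# Synthetic, noise-free peak intensities for one hypothetical residue.
true_t1_ms = 780.0
intensities = 1.0e6 * np.exp(-T1_DELAYS_MS / true_t1_ms)

i0, t1_ms = fit_exponential_decay(T1_DELAYS_MS, intensities)
r1_per_s = 1000.0 / t1_ms  # R1 = 1/T1, converted from ms to s^-1
print(f"T1 = {t1_ms:.1f} ms, R1 = {r1_per_s:.2f} s^-1")
```

With real, noisy data a nonlinear least-squares fit with error estimates from the duplicated delay would normally be preferred over the log-linear shortcut used here.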
Polyacrylamide gels were prepared and squeezed into an open-ended NMR tube (35). Apparent JNH constants of 15N-labeled PHF6(208-333) were measured on an Agilent 700-MHz spectrometer using the in-phase and anti-phase scheme (36). RDC values were obtained by subtracting the reference value of the protein in isotropic solution. Two duplicate datasets were collected at a temperature of 25°C, and PALES was used for data analysis (37). Dipolar couplings from residues having lower 1H-15N NOEs (<0.65) were excluded.
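The subtraction and filtering steps described above can be sketched as follows. The NOE cutoff of 0.65 is taken from the text; the function name, residue numbers, and splitting values are hypothetical and serve only to illustrate the bookkeeping.

```python
NOE_CUTOFF = 0.65  # residues with a lower 1H-15N NOE are excluded (from the text)


def compute_rdcs(j_aligned_hz, j_isotropic_hz, noe, cutoff=NOE_CUTOFF):
    """Return {residue: RDC in Hz} as the aligned minus the isotropic splitting.

    Only residues measured in both media and having a heteronuclear NOE
    at or above the cutoff are kept.
    """
    rdcs = {}
    for res, j_al in j_aligned_hz.items():
        if res not in j_isotropic_hz or res not in noe:
            continue  # both splittings and an NOE value are required
        if noe[res] < cutoff:
            continue  # flexible residue, left out of the tensor fit
        rdcs[res] = j_al - j_isotropic_hz[res]
    return rdcs


# Hypothetical apparent splittings (Hz) and NOE values, for illustration only.
j_gel = {220: -85.5, 241: -100.25, 256: -93.0}
j_iso = {220: -93.0, 241: -93.5, 256: -93.25}
noe_values = {220: 0.80, 241: 0.75, 256: 0.40}

rdcs = compute_rdcs(j_gel, j_iso, noe_values)
print(rdcs)
```

The retained couplings would then be passed, together with the crystal structure coordinates, to a tensor-fitting program such as PALES.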
GST Pulldown Experiments-GST-fused PHF6 ePHD2 was bound to glutathione-Sepharose beads (GE Healthcare), and the beads were then incubated with calf thymus histones (Worthington) in buffer containing 25 mM BisTris, pH 6.5, 1 M NaCl, and 1% Nonidet P-40 for 1 h at 4°C. After six washes with the buffer described above, the results were analyzed on Coomassie-stained SDS-polyacrylamide gels. GST-tagged PHF6(145-207) and PHF6(152-171) were bound to beads, and the beads were subsequently mixed with purified RBBP4 in 1× PBS buffer for 1 h at 4°C. The glutathione-Sepharose was sedimented by centrifugation and washed five times with 1× PBS buffer. The bound materials were analyzed by SDS-PAGE with Coomassie Blue staining.

RESULTS

PHF6 Contains Two Highly Conserved Extended PHD Domains-The PHD finger is a C4HC3 zinc finger-like motif, first noted in two closely related plant homeodomain proteins, HAT3.1 and HOX1A (39). This evolutionarily conserved cysteine-rich motif has also been observed in various proteins involved in epigenetics and chromatin-mediated transcriptional regulation in human, Drosophila, Caenorhabditis elegans, yeast, and other organisms. The PHD finger binds two zinc ions using the so-called "cross-brace" motif to form a stable unit (40). It can function as a protein-protein interaction domain and in many cases as a reader of histone marks that recognizes methylated or unmethylated lysines (41,42). The PHF6 protein has been reported to contain two imperfect PHD zinc finger domains and may play a role in chromatin transcription regulation.
Unexpectedly, we found in our experiments that the second PHD finger region of PHF6 alone was unstable in solution. We therefore extended the N terminus of the second PHD finger and obtained a stable construct (residues 208-333). This construct is about 70 amino acids longer than the PHD finger region predicted in the UniProtKB protein knowledge database (residues 277-329), and we refer to it as the extended PHD domain (ePHD2) (Figs. 1 and 2A).
To further characterize the structural domains of PHF6, we aligned the sequences of PHF6 proteins from different vertebrate species. The alignment shows conserved extended PHD domains at the N and C termini, each of which contains a C2HC zinc finger and the previously reported imperfect PHD finger (Fig. 1). This conservation suggests that the ePHD2 unit has an important function.
Overall Structure of the Second Extended PHD Domain of PHF6-To understand the molecular structure of the ePHD domain, we solved the high resolution crystal structure of the ePHD2 domain of PHF6 (Table 1). The final model contains two molecules of the PHF6-ePHD2 domain in the asymmetric unit, with 124 amino acid residues in molecule A (residues 208-331) and 118 amino acid residues in molecule B (residues 209-255 and 260-330). The two molecules superimpose with a root-mean-square deviation of 0.296 Å, reflecting slight differences in the flexible loops of the protein.
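For readers unfamiliar with how such a root-mean-square deviation is obtained, the sketch below computes the RMSD between two equally sized coordinate sets after optimal rigid-body superposition using the Kabsch algorithm. The text does not state which program or atom selection was used for the 0.296 Å value, so this is an illustration under that assumption; the coordinates are random stand-ins, not PHF6 atoms.

```python
import numpy as np


def kabsch_rmsd(p, q):
    """RMSD between two (N, 3) coordinate sets after optimal superposition."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # Remove the translation by centering both sets on their centroids.
    p_c = p - p.mean(axis=0)
    q_c = q - q.mean(axis=0)
    # The SVD of the covariance matrix gives the optimal rotation of p onto q.
    u, _, vt = np.linalg.svd(p_c.T @ q_c)
    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    diff = (rot @ p_c.T).T - q_c
    return float(np.sqrt((diff ** 2).sum() / len(p)))


# Hypothetical coordinates: a rotated and translated copy should give ~0 A.
rng = np.random.default_rng(0)
mol_a = rng.normal(size=(10, 3)) * 5.0
angle = np.deg2rad(30.0)
rot_z = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                  [np.sin(angle), np.cos(angle), 0.0],
                  [0.0, 0.0, 1.0]])
mol_b = mol_a @ rot_z.T + np.array([3.0, -2.0, 7.0])
rmsd_identical = kabsch_rmsd(mol_a, mol_b)

# Displacing a single atom by 1 A yields a small, nonzero RMSD.
mol_c = mol_b.copy()
mol_c[0, 0] += 1.0
rmsd_moved = kabsch_rmsd(mol_a, mol_c)
print(rmsd_identical, rmsd_moved)
```

In practice the two chains must first be matched residue by residue (here molecule B lacks residues 256-259), so only atoms present in both molecules enter the calculation.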
The tertiary structure of the PHF6-ePHD2 domain displays a globular conformation (Fig. 2, C and D). It contains two different motifs. The N-terminal part of the PHF6-ePHD2 domain (residues 208-247) includes a Cys-2-His-Cys motif constituting a putative zinc finger (designated pre-PHD), consisting of two α-helices (α1, residues 220-225; α2, residues 241-244) separated by an anti-parallel β-sheet (β1, residues 228-230; β2, residues 237-239) (Figs. 2, B and C, and 3A). Three cysteine residues and one histidine residue from the N-terminal loop, the β2-strand, and the α2-helix coordinate one zinc ion (designated Zn1) to stabilize the pre-PHD (Fig. 2, B and C). The C-terminal part of the ePHD2 domain (residues 279-330) is, as expected, a PHD finger, consisting of one short anti-parallel β-sheet (β3, residues 290-292; β4, residues 300-301) and one long anti-parallel β-sheet (β5, residues 312-316; β6, residues 321-325) that are linked by one α-helix (α4, residues 303-308) (Figs. 2, B and C, and 3A). Like other PHD fingers, the PHD finger of the PHF6-ePHD2 domain consists of two interleaved zinc fingers. A pair of bound zinc ions (designated Zn2 and Zn3) specifically stabilizes the characteristic cross-braced folding topology of the PHD finger (Fig. 2, B-D). Each zinc ion is coordinated by a combination of four cysteine and histidine residues; in PHF6-ePHD2 the Zn3 ion is coordinated by a C3H motif instead of a Cys-4 motif (Fig. 2, B and D), which is similar to BRPF2-PHD2 (38). For all three zinc-chelating histidines, zinc coordination occurs through the Nδ1 atom of the imidazole ring (data not shown). Moreover, in the ePHD2 structure, one long α-helix (α3, residues 265-275) and one long reversed flexural loop (L1, residues 248-264) connect the pre-PHD motif and the imperfect PHD finger to form an integrated fold unit (Fig. 2, D and E).
ePHD2 Domain of PHF6 Folds as an Integrated Structural Module-In the ePHD2 domain, the pre-PHD and the PHD finger associate with each other via extensive hydrophobic interactions and numerous hydrogen bonds to form a compact globular protein module. Moreover, the long loop L1 and helix α3 interact extensively with the pre-PHD and the PHD finger, especially with the former (Figs. 2E and 3A). In particular, residues Leu-244 and Phe-245 of helix α2 in the pre-PHD and residues Ile-290, Tyr-303, Tyr-313, and Tyr-322 from the β3 strand, helix α4, and the β5 and β6 strands of the PHD finger, respectively, form a hydrophobic interface between the two motifs (Fig. 3B). In addition, residues Tyr-240 and Met-243 of helix α2 and residues Phe-214, Phe-231, and Ala-236 from loops in the pre-PHD interact hydrophobically with residues Phe-260 and Phe-263 of the long loop L1 and residues Ile-265, Val-268, Leu-269, and Ile-272 of helix α3, respectively (Fig. 3C). Besides the hydrophobic contacts, a network of hydrogen bonds further stabilizes the pre-PHD-PHD finger fold unit. The hydroxyl oxygen of residue Ser-246 and the main chain amino group and hydroxyl oxygen of residue Ser-247 in the pre-PHD form hydrogen bonds with the hydroxyl oxygen of residue Thr-289, the main chain amino group of residue Gly-291 in the β3 strand, and the main chain carbonyl group of residue Thr-289 of the PHD finger, respectively (Fig. 3D). Furthermore, the side chain of residue Glu-271 in helix α3 forms two hydrogen bonds, one with the hydroxyl oxygen of residue Ser-246 and one with the main chain amino group of residue Gly-248 (Fig. 3D). One nitrogen atom of the Arg-274 side chain in helix α3 forms a hydrogen bond with the main chain carbonyl group of residue Thr-289, and another nitrogen atom of the Arg-274 side chain makes a water-bridged polar contact with the hydroxyl oxygen of residue Thr-300 in the β4 strand of the PHD finger (Fig. 3D).
Additionally, the main chain carbonyl group of residue Phe-214 and the Nε2 atom of the imidazole ring of residue His-216 in the pre-PHD form hydrogen bonds with the Nε2 atom of the imidazole ring of residue His-302 and the main chain carbonyl group of residue Pro-286, respectively (Fig. 3E). The main chain carbonyl group of residue Gly-275 in helix α3 forms two hydrogen bonds with the main chain amino groups of residues Gly-287 and Met-278 of the PHD finger (Fig. 3E). The main chain carbonyl group of residue Gly-213 is bridged by a water molecule to the main chain carbonyl of residue Ile-272. These numerous interactions fix the relative orientation of the pre-PHD and the PHD finger and allow the PHF6-ePHD2 domain to adopt a compact structure resembling a single folded unit. The importance of this network of hydrogen bonds and hydrophobic interactions for stabilizing the structure of the PHF6-ePHD2 domain is supported by our attempts to prepare the PHD finger region alone and several other constructs, all of which yielded products that were unstable in solution, suggesting that the integrated structure is critical for the biological function of the PHF6 protein.
Structure of PHF6-ePHD2 Represents a Novel Structural Module-To identify structurally similar proteins, we performed a structural homology search using the DALI server (43). Unexpectedly, no reported protein structure was found to be similar to the whole PHF6-ePHD2 structure. Thus, the structure of PHF6-ePHD2 represents a novel structural module.
The DALI results show that only the PHD finger of PHF6-ePHD2 has reasonable structural similarity to other PHD domains, such as BRPF2-PHD2 (Z-score 5.4), Pygo1-PHD (Z-score 3.8), NSD3-PHD5-C5HCH (Z-score 3.7), and Trim33-PHD (Z-score 3.5). The N-terminal pre-PHD exhibits no significant structural similarity to other known structures. Interestingly, the pre-PHD motif of PHF6-ePHD2 superimposes well on part of BRPF2-PHD1 (Protein Data Bank code 2KU3), with a root mean square deviation of 2.78 Å for Cα atoms over 28 amino acid residues (Fig. 3F). However, sequence alignment shows that the pre-PHD motif of PHF6 lacks half of the conserved C4HC3 motif that chelates two zinc ions in the BRPF2-PHD1 finger (Fig. 3G). Therefore, the pre-PHD motif of PHF6-ePHD2 binds only one zinc ion, through the first half of the structure.

[Fig. 3 legend, partial: ... shown in gray). G, structure-based sequence alignment of the pre-PHD zinc finger of PHF6-ePHD2 with AIRE-PHD and BRPF2-PHD1 fingers. The conserved zinc-chelating residues are highlighted on an orange background, the absent zinc-chelating residues on a pink background, and the absent acidic residues on a gray background. Secondary structural elements of the pre-PHD zinc finger of PHF6-ePHD2 and the BRPF2-PHD1 finger are shown above and below the sequences, respectively.]

PHF6-ePHD2 Does Not Bind Histones in Vitro-Because PHD fingers often recognize histone marks, we investigated whether the PHF6-ePHD2 domain binds to histones. No interaction was detected between calf thymus histones and the PHF6-ePHD2 domain in high salt (1 M NaCl) buffer using an in vitro GST pulldown assay (data not shown). NMR titration experiments were also performed to explore the interaction of PHF6-ePHD2 with histone H3 and H4 N-terminal peptides; however, no obvious chemical shift perturbation was observed in 15N-HSQC spectra (data not shown), implying that PHF6-ePHD2 does not bind histones. The structure of PHF6-ePHD2 offers an explanation for why it does not bind histones. Despite the overall structural similarity, the PHD finger of PHF6-ePHD2 differs from histone-binding PHD fingers in several structural features. First, all aromatic residues that are required to form the methyl-lysine recognition cage are absent from the PHD finger of PHF6-ePHD2, and the acidic residues used to recognize unmodified lysine are replaced by oppositely charged residues (Fig. 4A). Second, PHF6-ePHD2 contains one long helix, α3, that sterically hinders any histone tail peptide from binding to the histone-binding site of the PHD finger (Fig. 4, C and D). These analyses suggest that the PHD finger of PHF6 is a noncanonical PHD finger that lacks the structural features needed for binding to histone tails. These results indicate that although the PHD domain of PHF6-ePHD2 shows overall structural similarity to other PHD domains, the two domains of the protein are fused to produce a unique structure.
Dynamic Properties of the PHF6-ePHD2 Module-The 1H, 13C, and 15N chemical shifts of the backbone resonances of the PHF6-ePHD2 domain were assigned by multidimensional heteronuclear NMR experiments on the 15N,13C-labeled sample (data not shown). Using uniformly 15N-labeled PHF6-ePHD2, we measured the 15N spin relaxation rates R1 and R2 and the 1H-15N heteronuclear NOE values to investigate the internal mobility of the secondary structural elements and loops of the PHF6-ePHD2 domain. The backbone relaxation data R1, R2, and the 1H-15N NOEs show a remarkably uniform distribution over most of the amino acid sequence (Fig. 5A). The overall average values of R1, R2, and NOE are 1.28 ± 0.02 s−1, 13.74 ± 0.17 s−1, and 0.76 ± 0.02, respectively, implying that the overall molecular architecture of the PHF6-ePHD2 domain is rather rigid and might fold as an integrated structural unit. The exception is the segment comprising residues 253-260 of loop L1, where the increased R1 values and the decreased R2 and 1H-15N NOE values reveal a region that is flexible compared with the core (Fig. 5A). The dynamic property of this segment is consistent with its poor electron density in one molecule of the asymmetric unit.
To further assess the relative orientation of the pre-PHD and PHD2 zinc fingers of PHF6-ePHD2 in solution, we measured their 1H-15N RDCs in a 5% stretched polyacrylamide gel medium. Alignment tensors were then determined separately for the pre-PHD, the PHD2, and the whole ePHD2 domain using the crystal structure. The magnitude, rhombicity, and Euler angles of the alignment tensors from the individual fits are indistinguishable (Fig. 5B), and the experimental RDC values agree well with the back-calculated values (Fig. 5C), indicating that the tandem domains can be fit simultaneously to a single tensor just as well as each individual domain can be fit on its own. These results demonstrate that the two zinc fingers are held in a rigid relative orientation.
PHF6-ePHD2 Domain Basic Surface Nonspecifically Binds to DNA-We noticed that the positively charged region of the PHF6-ePHD2 domain, which covers one side of the structure, consists of highly conserved arginine and lysine residues. To investigate the possibility that the function of the PHF6-ePHD2 domain involves a DNA binding activity, we conducted a fluorescence polarization assay (FPA) and an electrophoretic mobility shift assay (EMSA) to test whether the PHF6-ePHD2 domain binds directly to dsDNA. It has been shown that the C-terminal tail (from amino acid 334 or 343 to 365) of PHF6 is missing in several BFLS patients, indicating an important role of the tail in the function of PHF6. We therefore performed the dsDNA binding assays using purified PHF6(208-333) and PHF6(208-365) proteins.
As shown in Fig. 6, A and B, the fluorescence polarization anisotropy (in millipolarization units, mP) increased with the protein/DNA ratio, and the binding affinities of PHF6(208-333) and PHF6(208-365) for dsDNA oligonucleotides were similar. The Kd values for PHF6(208-333) binding to 14-20-bp dsDNA were about 12-17 µM, whereas for PHF6(208-365) the Kd values decreased to 7-8 µM, indicating a stronger binding affinity of PHF6(208-365) for dsDNA (Table 2).
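A dissociation constant can be estimated from such a titration by fitting the anisotropy to a binding isotherm. The sketch below assumes a simple one-site model, A = A_free + ΔA·[P]/(Kd + [P]), in which the labeled DNA concentration is low enough that free protein is approximately equal to total protein; the model actually used for Table 2 is not stated here. The titration points, the micromolar scale of the example, and the function name are hypothetical.

```python
import numpy as np


def fit_kd(protein_um, anisotropy_mp, kd_grid_um=None):
    """Fit A = A_free + dA * P / (Kd + P) to a titration.

    Kd is scanned over a logarithmic grid; for each trial Kd the remaining
    parameters (A_free, dA) are linear and are solved by least squares.
    Returns (Kd, A_free, dA) for the trial Kd with the smallest residual.
    """
    p = np.asarray(protein_um, dtype=float)
    a = np.asarray(anisotropy_mp, dtype=float)
    if kd_grid_um is None:
        kd_grid_um = np.logspace(-1, 3, 4001)  # 0.1 to 1000 uM
    best = None
    for kd in kd_grid_um:
        frac_bound = p / (kd + p)
        design = np.column_stack([np.ones_like(p), frac_bound])
        coef, *_ = np.linalg.lstsq(design, a, rcond=None)
        sse = float(((design @ coef - a) ** 2).sum())
        if best is None or sse < best[0]:
            best = (sse, float(kd), float(coef[0]), float(coef[1]))
    return best[1], best[2], best[3]


# Synthetic, noise-free titration for illustration (protein in uM, signal in mP).
protein = np.array([0.0, 2.0, 5.0, 10.0, 20.0, 40.0, 80.0, 160.0])
signal = 60.0 + 120.0 * protein / (12.0 + protein)

kd_fit, a_free_fit, d_a_fit = fit_kd(protein, signal)
print(f"Kd ~ {kd_fit:.1f} uM")
```

The grid scan keeps the example dependency-free; with measured data a nonlinear least-squares routine that also reports parameter uncertainties would be the more usual choice.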
We next examined whether PHF6(208-333) and PHF6(208-365) exhibit DNA-binding sequence preferences, but no obvious preference for an AT- or GC-rich sequence was observed (Fig. 6, C and D). The Kd values for PHF6(208-333) binding to AT- and GC-rich dsDNA were about 20.5 and 30.5 µM, respectively (Table 2). The Kd values for PHF6(208-365) binding to AT- and GC-rich dsDNA were about 4.2 and 5.2 µM, respectively (Table 2), suggesting that the C-terminal tail of PHF6 may contribute to the higher binding affinity of PHF6(208-365) for dsDNA. We also performed EMSA to confirm the interaction of dsDNA with either PHF6(208-333) or PHF6(208-365). As shown in Fig. 6, E and F, the amount of shifted dsDNA increased with increasing concentrations of GST-fused PHF6(208-333) or PHF6(208-365).
To determine the interfaces of PHF6(208-333) and PHF6(208-365) that are responsible for DNA binding, we monitored chemical shift perturbations upon titrating DNA into the 15N-labeled PHF6(208-333) and PHF6(208-365) protein samples. The NMR perturbation experiments showed many residues with obvious chemical shift changes (Fig. 7, A-F). The residues with the most perturbed amides were distributed across the basic face of the pre-PHD finger and helix α3 in PHF6(208-333) (Fig. 7G). The residues perturbed by DNA binding are the same for PHF6(208-333) and PHF6(208-365) (Fig. 7, E and F).
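Chemical shift perturbations in 1H-15N spectra are commonly summarized per residue as a combined value, Δδ = sqrt(ΔδH² + (α·ΔδN)²), where α down-weights the larger 15N shift range. The text does not say which formula or weighting was used for Fig. 7, so the sketch below is only an illustration; the value α = 0.2, the function names, and all shift values are assumptions.

```python
import math


def combined_csp(d_h_ppm, d_n_ppm, n_scale=0.2):
    """Combined 1H/15N shift change in ppm; n_scale (assumed) down-weights 15N."""
    return math.sqrt(d_h_ppm ** 2 + (n_scale * d_n_ppm) ** 2)


def rank_by_csp(free, bound, n_scale=0.2):
    """Rank residues by combined perturbation, largest first.

    free and bound map residue number -> (1H ppm, 15N ppm).
    Only residues assigned in both states are considered.
    """
    scores = []
    for res in free.keys() & bound.keys():
        d_h = bound[res][0] - free[res][0]
        d_n = bound[res][1] - free[res][1]
        scores.append((res, combined_csp(d_h, d_n, n_scale)))
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical amide shifts without and with DNA, for illustration only.
free_shifts = {209: (8.20, 120.00), 234: (8.45, 118.50),
               274: (7.90, 122.10), 300: (8.75, 115.30)}
bound_shifts = {209: (8.23, 120.20), 234: (8.51, 118.90),
                274: (8.02, 122.35), 300: (8.75, 115.35)}

ranked = rank_by_csp(free_shifts, bound_shifts)
for residue, csp in ranked:
    print(residue, round(csp, 3))
```

Residues whose combined value exceeds a chosen threshold (for example, the mean plus one standard deviation) are then mapped onto the structure to outline the binding surface.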
To investigate whether PHF6-ePHD2 interacts with dsDNA via the positively charged region (Fig. 7G), we examined the dsDNA binding activities of four mutants of PHF6-ePHD2: R209E/K211E, K234E/K235E, K273E/R274E/K276E/R277E, and K276E/R277E/M278A/K279E. Although these mutations do not appear to affect the overall protein structure of PHF6-ePHD2 (data not shown), the Kd value of the R209E/K211E mutant increased to 62.0 ± 16.8 µM, and the dsDNA-binding abilities of the other three mutants were almost completely abolished (Fig. 7, H-J; Table 2). These results indicate that the positively charged residues located in the pre-PHD and helix α3 are crucial for the ability of PHF6-ePHD2 to bind dsDNA.
Structural Analysis of PHF6-ePHD2 Reveals Pathological Implication of Amino Acid Point Mutations-To relate the structural information to function, we mapped the previously reported PHF6 point mutations present in BFLS, T-ALL, and acute myeloid leukemia onto the three-dimensional structure of the ePHD2 domain (Fig. 8, A-D). These mutations are dispersed across the pre-PHD finger, helix α3, loop L1, and the PHD finger. Several point mutations have been found in BFLS patients; of these, H229R, K234E, R257G, and I314V lie within the PHF6-ePHD2 domain (1,8,10). Residue His-229 is located on the β1 strand; residue Lys-234 is located on the loop linking the β1 and β2 strands; residue Arg-257 is located on loop L1; and residue Ile-314 is located on the β5 strand. To assess the structural impact of these substitutions, we expressed, purified, and analyzed the corresponding PHF6-ePHD2 mutants by NMR spectroscopy. None of the H229R, K234E, R257G, and I314V mutations appears to have an obvious impact on the structure, based on analysis of the one-dimensional NMR spectra (data not shown). These results indicate that substitutions at residues His-229, Lys-234, Arg-257, and Ile-314 do not affect the structural integrity of the PHF6-ePHD2 domain but may interfere with the interaction of PHF6 with its binding partners.

[Figure legend fragment: ... 30.8, 61.7, 123.4, 185, 247, 14.2, 28.3, 56.7, 113.4, 226.8, and 283.5 µM.]
Multiple point mutations in the PHF6-ePHD2 domain found in T-ALL and acute myeloid leukemia patients affect nine zinc-coordinating amino acid residues (C212R, C215F/C215Y/C215R, C242Y, C280R/C280N, C283Y, C297R, H302Y, C326W, and H329R/H329L) (1, 16, 19-22), and position Cys-215 within the pre-PHD finger appears to be a mutational hot spot. PHF6-ePHD2 mutants harboring mutations of the zinc-coordinating cysteines and histidines are expressed only in trace amounts, mostly in inclusion bodies, indicating that these mutants are misfolded. Additional point mutations change deeply buried amino acid residues (G275R, A288T/A288V, and I290N) (20,22) in the hydrophobic core of the structure and partially solvent-exposed amino acid residues (I294R, T300A, A311P, and Y325H) (16,20,22). Mutations of buried amino acids destabilize the structural core, resulting in low yields of soluble protein, and most of the protein is unfolded in inclusion bodies. The partially solvent-exposed Y325H and I294R mutants, in which the substitution is adjacent to the third zinc-coordinating position, are also mostly expressed in inclusion bodies. The T300A and A311P mutations have a slight effect on the structure of the PHF6-ePHD2 domain (data not shown).
The dsDNA binding abilities of these point mutants differed, but only that of the K234E mutant was reduced significantly. Thus, mutations that affect the structural core or zinc coordination of the PHF6-ePHD2 domain destabilize the correct protein fold and are therefore likely to affect its normal biological function. Another class of mutations, located on the surface of the ePHD2 domain, also affects the structural and electrostatic character or the stability of the protein to some extent and is most likely to affect interactions with specific partners directly.
PHF6 Directly Interacts with NuRD Complex Component RBBP4-PHF6 has recently been found to interact with the NuRD (also known as Mi-2) complex. PHF6 can be co-purified with CHD4, RBBP4, and HDAC1 (24). However, it remains unknown to which component of the NuRD complex PHF6 binds directly. We therefore determined whether RBBP4 could interact directly with PHF6. We expressed GST-fused full-length PHF6 and several PHF6 fragments, and we carried out GST pulldown experiments with recombinant RBBP4 protein. Several conclusions can be drawn from the pulldown results (Fig. 9). First, RBBP4 interacts directly with full-length PHF6 but not with GST alone (Fig. 9A). Second, the extended PHD2 domain of PHF6 is not involved in the interaction with RBBP4 (Fig. 9B). Third, RBBP4 binds directly to two PHF6 fragments, PHF6(145-207) and PHF6(152-171) (Fig. 9C), both of which contain a putative NoLS composed of nuclear localization sequences 3 and 4 (NLS3 and NLS4). This region is rich in positively charged amino acid residues (Lys and Arg) and is similar to the N-terminal region of several co-repressors and transcription factors, including SALL1, BCL11A/B, and FOG1/2 (48). A sequence alignment of PHF6 with SALL1, BCL11A/B, and FOG1/2 is shown in Fig. 9D. The high sequence similarity indicates that PHF6 likely recruits the NuRD complex in a manner similar to SALL1, BCL11A/B, and FOG1/2.
PHF6-mediated Transcriptional Repression Is Dependent on the RBBP4 Interaction-To investigate the function of the PHF6 protein, we used a reporter system in which PHF6 was expressed as a fusion with the GAL4 DNA binding domain (GAL4-DBD) and co-expressed with reporters controlled by GAL4-response elements (GAL-TK-luciferase). The results showed that full-length PHF6 strongly repressed the transcription of the reporter gene, by more than 5-fold (Fig. 9E). However, we found that the transcriptional repression function of PHF6 is not associated with the ePHD2 domain or the C-terminal region of PHF6 (Fig. 9E). When the putative NoLS region was deleted from full-length PHF6, transcription of the reporter gene was recovered (Fig. 9E), suggesting that PHF6-mediated transcriptional repression may be dependent on this RBBP4-interacting sequence.

DISCUSSION
Mutations in the PHF6 gene were first discovered in patients with BFLS (1). BFLS is a recessive X-linked disorder characterized by mental retardation, hypogonadism, hypometabolism, and obesity (1, 15). Somatic mutations in the PHF6 gene have also been found in T-ALL and acute myeloid leukemia patients, implicating PHF6 as a tumor suppressor (16, 19-22). However, the pathogenic implications of these mutations are poorly understood. It has been reported that PHF6 can interact with both the NuRD complex and the PAF1 transcription elongation complex (24, 26). Knockdown of PHF6 profoundly impairs neuronal migration in the mouse cerebral cortex in vivo, leading to the formation of white matter heterotopias displaying neuronal hyperexcitability (26). PHF6 is implicated in chromatin regulation and neural development (1, 24), but the molecular mechanisms underlying these processes are still unclear. Here, we report the high resolution crystal structure of the ePHD2 domain of PHF6. Structural analysis of PHF6-ePHD2 reveals the pathological implications of PHF6 gene mutations in BFLS, T-ALL, and acute myeloid leukemia. We showed that PHF6-ePHD2 can bind dsDNA. We also demonstrated that the PHF6 protein directly interacts with RBBP4 through its NoLS region, which is important for the transcriptional repression activity of PHF6.
ePHD Domain of PHF6 Represents a Novel Structural Module-To better understand the function of PHF6, we have determined the structure of the ePHD2 domain from human PHF6 protein. The pre-PHD and PHD fingers in ePHD2 are associated with each other via extensive hydrophobic interactions and numerous hydrogen bonding interactions and fold as an intact structural module.
FIGURE 10. Sequence alignment of homologous extended PHD domains of proteins from a Psi-Blast search in NCBI. Zinc-chelating cysteines, histidines, and identical amino acid residues are highlighted on a red background. Similar residues that may be associated with hydrophobic interactions between the pre-PHD and linker region, and between the pre-PHD and PHD, are highlighted on a yellow and cyan background, respectively. The secondary structure elements of the PHF6-ePHD2 domain are shown at top.
Given the sequence similarity, it is likely that the essential features of this structure are also shared by other extended PHD domain-containing proteins. We performed a Psi-Blast search in NCBI and found that several human chromatin-associated proteins indeed contain similar variant PHD domains, including MLL1/2/3/4, JMJD2A/B/C, and several PHD finger proteins (Fig. 10). Intriguingly, unlike other PHD domain-containing proteins, PHF6-ePHD2 is not able to interact with histones, suggesting that the extended PHD domains of PHF6 may have unique functions that differ from those of typical PHD fingers. We showed here that PHF6-ePHD2 can bind dsDNA; it is also possible that the ePHD2 domain interacts with RNA, given the role of PHF6 in regulating rRNA synthesis (25).
Previous studies have reported that the extended LAP/PHD finger mediates homo-oligomerization of recombinant AF10, and this structural module is conserved in several proteins, including MLL (49). However, the ePHD2 domain of PHF6 behaves as a monomer in our experiments, so despite similarity within the extended PHD domains, different proteins have their unique functional and structural features.

Implication of Pathological Mutants Relevant to BFLS, T-ALL, or Acute Myeloid Leukemia Revealed by Structure-The important role of the ePHDs of PHF6 is highlighted by the mutations found in BFLS, T-ALL, and acute myeloid leukemia. The ePHD2 domain of PHF6 harbors 50% of the missense mutations found in the PHF6 gene in BFLS and more than 50% of the PHF6 mutations found in T-ALL. However, very little was previously known about the function of PHF6 or the pathophysiological mechanism underlying PHF6 mutations. The structural analysis presented here shows that the point mutations associated with BFLS mainly affect the electrostatic character of the ePHD2 domain; they do not affect its structural integrity but probably disrupt its ability to interact with specific binding partners. We demonstrate that point mutations in the crucial zinc-coordinating residues or buried residues affect the folding of the ePHD2 domain and are therefore expected to impair the function of this protein.
PHF6 through Its NoLS Region Directly Interacts with RBBP4, a Component of NuRD Complex, and Functions in Transcriptional Repression-Somatic inactivating mutations in PHF6 have been linked to T-ALL. Loss of PHF6 is also associated with leukemias driven by aberrant expression of the homeobox transcription factor oncogenes TLX1 and TLX3 (16). These findings suggest that PHF6 is an X-linked tumor suppressor (16). However, the underlying molecular mechanisms remain obscure.
Recently, PHF6 was found to interact with the NuRD complex, and its identity as a chromatin-associated protein was also confirmed (24). We showed that PHF6 is able to directly interact with RBBP4, a component of the NuRD complex, through the NoLS region. We also demonstrated that this NoLS region is associated with the transcriptional repression activity of PHF6. Loss of this region clearly reduces the repressive effect of PHF6 on reporter gene activity.
Taken together, our data support the hypothesis that PHF6 may function as a transcriptional repressor that uses its ePHD domains to bind the promoter region of its target gene, with transcription regulated by the NuRD complex, which is recruited to the genomic target site by the NoLS region of PHF6. Mutations in PHF6 relevant to BFLS, T-ALL, or acute myeloid leukemia may disturb normal gene expression during brain development or T-cell development and thereby cause disease. Further biochemical and structural studies will aim to identify the target genes and interacting partners of PHF6 and to further elucidate its molecular functions in BFLS and leukemia.