Crystal Structure of the Human Centromere Protein B (CENP-B) Dimerization Domain at 1.65-Å Resolution

The human centromere protein B (CENP-B), a centromeric heterochromatin component, forms a homodimer that specifically binds to a distinct DNA sequence (the CENP-B box), which appears within every other α-satellite repeat. Previously, we determined the structure of the human CENP-B DNA-binding domain, CENP-B-(1-129), complexed with the CENP-B box DNA. In the present study, we determined the crystal structure of its dimerization domain (CENP-B-(540-599)), another functional domain of CENP-B, at 1.65-Å resolution. CENP-B-(540-599) contains two α-helices, which are folded into an antiparallel configuration. The CENP-B-(540-599) dimer formed a symmetrical, antiparallel, four-helix bundle structure with a large hydrophobic patch in which 23 residues of one monomer form van der Waals contacts with the other monomer. In the CENP-B-(540-599) dimer, the N-terminal ends of CENP-B-(540-599) are oriented on opposite sides of the dimer. This CENP-B dimer configuration may be suitable for capturing two distant CENP-B boxes during centromeric heterochromatin formation.

The centromere, an essential chromosomal region required for the proper segregation of chromosomes during cell division, is embedded within heterochromatin in most eukaryotes (1-3). This centromere-specific heterochromatin contains specific DNA-binding proteins that target the centromeric region of chromosomes. In humans, the centromere-specific DNA-binding proteins CENP-A, CENP-B, and CENP-C were identified as antigens for autoimmune sera from scleroderma patients (4,5).
CENP-A is the centromere-specific histone H3 variant (6), and CENP-C is a fundamental component of the inner kinetochore plate (7-11). CENP-A and CENP-C do not seem to have any obvious sequence specificity in DNA binding. In contrast, CENP-B specifically binds a 17-base pair sequence (the CENP-B box), which appears in every other 171-base pair α-satellite repeat in human centromeres (12), and induces nucleosome positioning in the vicinity of the CENP-B box (13). These results suggest that CENP-B may function as a trans-acting factor that induces the formation of the centromere-specific heterochromatin. Analyses with cultured human cells revealed that the presence of the CENP-B box within the α-satellite sequence is required for the formation of a functional centromere containing CENP-B in minichromosomes in vivo (14,15). Biochemical and genetic studies also showed that CENP-A, CENP-B, and CENP-C cooperatively constitute functional centromeres in mammalian cells (16,17).
However, CENP-B null mice appeared to be normal (18-20). This discrepancy regarding the CENP-B requirement for centromere formation may be explained by presuming the existence of functional homologue(s) of CENP-B. In fact, CENP-B-like proteins of unknown function have been identified in humans, such as the jerky-like protein (21,22) and the transposases encoded by the human Tigger1 and Tigger2 transposable elements (23,24). Functional redundancy of CENP-B homologues has also been found in the fission yeast Schizosaccharomyces pombe. Three fission yeast proteins, Abp1, Cbh1, and Cbh2, are homologues of CENP-B (25-27) and associate with centromeric heterochromatin (28,29). Like human CENP-B, none of these proteins is essential for viability in S. pombe (30). However, the double disruptants of Abp1 and Cbh1 showed a synergistic reduction of centromere function in fission yeast (29,30).
Human CENP-B is a dimeric protein composed of 80-kDa subunits (31). It contains DNA-binding domains and a dimerization domain at the N terminus and the C terminus, respectively (32). Electron microscopic observations showed that CENP-B bundles two distant CENP-B boxes through its dimer formation and DNA-binding activities (13). The bundling activity suggests that, in the centromere, CENP-B may function in higher-order chromatin formation, in addition to the centromere-specific nucleosome assembly. To reveal the molecular mechanism of the CENP-B functions in centromeric chromatin formation, we previously determined the crystal structure of the CENP-B N-terminal DNA-binding domain complexed with the CENP-B box DNA (33). In the present study, we determined the crystal structure of another CENP-B functional domain, the dimerization domain, at 1.65-Å resolution. The structure provides new insights into CENP-B dimerization, which may play an essential role in the centromere-specific heterochromatin formation.

EXPERIMENTAL PROCEDURES
Purification of the Dimerization Domain of CENP-B-A DNA fragment containing the CENP-B-(540-599) portion of the human CENP-B gene was amplified by polymerase chain reaction and cloned into the NdeI site of the pET15b vector. Selenomethionine-labeled CENP-B-(540-599) was overexpressed in Escherichia coli BL21 CodonPlus (DE3)-RP-X cells under the control of the T7 promoter as a His6-tagged protein. The E. coli cells carrying the CENP-B-(540-599) expression vector were grown at 30°C, and isopropyl-β-D-thiogalactopyranoside (IPTG; 80 μM) was added at an OD600 of 0.4 to induce protein expression. After overnight cultivation at 18°C, the cells were harvested and disrupted by sonication in 50 mM Tris-HCl buffer (pH 8.0) containing 500 mM NaCl. The cell debris was removed by centrifugation for 20 min at 30,000 × g, and the lysate was mixed gently by the batch method with nickel-nitrilotriacetic acid (Ni-NTA)-agarose beads (Qiagen) at 4°C for 1 h. The CENP-B-(540-599)-bound Ni-NTA-agarose beads were then packed into an Econo-Column (Bio-Rad) and washed with 30 column volumes of 50 mM Tris-HCl buffer (pH 8) containing 500 mM NaCl, 5% glycerol, 1 mM phenylmethylsulfonyl fluoride, and 5 mM imidazole, at a flow rate of about 0.3 ml/min. The His6-tagged CENP-B-(540-599) was eluted with a 20-column volume linear gradient from 5 to 300 mM imidazole in 50 mM Tris-HCl buffer (pH 8) containing 500 mM NaCl, 5% glycerol, and 1 mM phenylmethylsulfonyl fluoride. The protein was dialyzed against 50 mM Tris-HCl buffer (pH 8) containing 50 mM NaCl, 5% glycerol, and 1 mM phenylmethylsulfonyl fluoride, and the His6 tag was cleaved from CENP-B-(540-599) by digestion with 10 units of thrombin protease (Amersham Biosciences) per milligram of CENP-B-(540-599). Then, the fractions containing CENP-B-(540-599) were mixed with 5 ml of heparin-Sepharose (Amersham Biosciences) at 4°C for 2 h and packed into an Econo-Column (Bio-Rad).
The heparin-Sepharose beads with CENP-B-(540-599) were washed with 6 column volumes of 50 mM Tris-HCl buffer (pH 8) containing 50 mM NaCl, and CENP-B-(540-599) was eluted with a 12-column volume linear gradient from 50 to 800 mM NaCl in this buffer. The CENP-B-(540-599) protein was concentrated and further purified by HiLoad 26/60 Superdex 75 gel filtration chromatography (Amersham Biosciences).
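As a rough illustration of the two linear elution gradients described above, the following Python sketch computes the nominal eluent concentration at a given point in a gradient. The function name gradient_concentration is hypothetical, and the calculation assumes an ideal linear gradient with no mixing delay or dead volume, which a real column will only approximate.

```python
def gradient_concentration(start_mm, end_mm, total_cv, elapsed_cv):
    """Nominal concentration (mM) after `elapsed_cv` column volumes of an
    ideal linear gradient from `start_mm` to `end_mm` over `total_cv`."""
    if total_cv <= 0:
        raise ValueError("total_cv must be positive")
    # Clamp the progress to [0, 1] so points outside the gradient
    # return the starting or final concentration.
    fraction = min(max(elapsed_cv / total_cv, 0.0), 1.0)
    return start_mm + (end_mm - start_mm) * fraction


# Imidazole gradient on Ni-NTA: 5 to 300 mM over 20 column volumes.
imidazole_midpoint = gradient_concentration(5, 300, 20, 10)

# NaCl gradient on heparin-Sepharose: 50 to 800 mM over 12 column volumes.
nacl_midpoint = gradient_concentration(50, 800, 12, 6)

print(f"Imidazole at mid-gradient: {imidazole_midpoint:.1f} mM")
print(f"NaCl at mid-gradient: {nacl_midpoint:.1f} mM")
```

Such an estimate only indicates where in the gradient a fraction was collected; the actual elution concentration of the protein is not stated in the text.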
Crystallization and Structure Determination-The purified CENP-B-(540-599) was concentrated to 12 mg/ml, and crystals were obtained by the hanging drop method after mixing an equal volume of 12 mg/ml CENP-B-(540-599) with a reservoir solution of 50 mM CHES buffer (pH 9.5) containing 0.5 M sodium citrate. The CENP-B-(540-599) crystals were soaked in a cryo-protectant solution of 50 mM CHES buffer (pH 9.5) containing 5% glycerol for a few minutes. Then, the crystals were transferred to 50 mM CHES buffer (pH 9.5) containing 10% glycerol for a few minutes and transferred again to 50 mM CHES buffer (pH 9.5) containing 20% glycerol. The crystals were flash-frozen in a stream of N₂ gas (100 K). The crystals belong to the orthorhombic space group P2₁2₁2₁, with unit cell constants of a = 43.7 Å, b = 49.0 Å, and c = 100.7 Å, and contain two dimers per asymmetric unit. High resolution diffraction data were obtained using the synchrotron radiation source at the RIKEN beam line BL44B2 station (RIKEN Structural Biology Beamline II; Ref. 34) of SPring-8, Harima, Japan. Intensity data were collected with a MAR CCD detector. The structure of CENP-B-(540-599) was initially solved to 1.8-Å resolution by the multiple wavelength anomalous dispersion (MAD) method. Diffraction data were integrated and scaled with HKL2000 and SCALEPACK (35). General handling of the scaled data was carried out with programs from the CCP4 suite (36). The positions of the selenium atoms were determined using the program SOLVE (37). The magnitudes of the anomalous differences at the peak wavelength were normalized, and the density modification was performed with RESOLVE (37). The resulting electron density map was sufficiently clear to build an initial model of the structure. The model was built with the program TURBO-FRODO (38). Structural refinement was performed using ARP/wARP (39), X-PLOR version 3.851 (40), and REFMAC.
Solvent molecules were placed at positions where spherical electron density peaks were found above 1.3σ in the 2Fo − Fc map and above 3σ in the Fo − Fc map and where stereochemically reasonable hydrogen bonds could be formed. Structural evaluations of the final models using PROCHECK (41) indicated that 97% of the residues are in the most favorable regions of the Ramachandran plot, with no residues in the "disallowed" regions. A summary of the data collection and refinement statistics is given in Table I. The atomic coordinates of CENP-B-(540-599) have been deposited (Research Collaboratory for Structural Bioinformatics identification code, rcsb005769; Protein Data Bank identification code, 1UFI).
Gel filtration analysis revealed that the molecular mass of the purified CENP-B-(540-599) is ~16 kDa (Fig. 1B), which is close to the value expected for two molecules of CENP-B-(540-599) (14.5 kDa). Therefore, we concluded that the purified CENP-B-(540-599) formed homodimers in solution. The CENP-B-(540-599) dimer was then concentrated to 12 mg/ml, and crystals were obtained by the hanging drop method. The crystals belong to the orthorhombic space group P2₁2₁2₁, with unit cell constants of a = 43.71 Å, b = 48.96 Å, c = 100.70 Å, and α = β = γ = 90°, and contain two dimers per asymmetric unit. The crystal structure was solved by the multiple wavelength anomalous dispersion (MAD) method using crystals of the selenomethionine-substituted sample (Table I). A diffraction data set was collected at 1.65-Å resolution at beamline BL44B2 in SPring-8, Harima, Japan.
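The dimer assignment and the reported crystal packing can be cross-checked with simple arithmetic. The Python sketch below is not a calculation taken from this study; it assumes a monomer mass of 7.25 kDa (half of the 14.5 kDa quoted above for two molecules, ignoring the small mass change from selenomethionine substitution), four asymmetric units per unit cell for space group P2₁2₁2₁, and the conventional factor of 1.23 Å³/Da for estimating the solvent fraction of a protein crystal. All function names are hypothetical.

```python
def oligomeric_state(apparent_kda, monomer_kda):
    """Ratio of the apparent (gel filtration) mass to the monomer mass."""
    return apparent_kda / monomer_kda


def matthews_coefficient(a, b, c, asu_per_cell, chains_per_asu, chain_da):
    """Matthews coefficient (A^3/Da) for a cell with all angles at 90 degrees."""
    cell_volume = a * b * c  # valid for orthorhombic cells only
    return cell_volume / (asu_per_cell * chains_per_asu * chain_da)


def solvent_fraction(vm):
    """Approximate solvent fraction from the Matthews coefficient,
    using the conventional 1.23 A^3/Da protein factor (an assumption)."""
    return 1.0 - 1.23 / vm


MONOMER_KDA = 14.5 / 2  # assumption: half of the quoted two-molecule mass

state = oligomeric_state(16.0, MONOMER_KDA)
vm = matthews_coefficient(
    43.71, 48.96, 100.70,      # unit cell constants in angstroms
    asu_per_cell=4,            # P2(1)2(1)2(1) has four asymmetric units
    chains_per_asu=4,          # two dimers per asymmetric unit
    chain_da=MONOMER_KDA * 1000.0,
)
solvent = solvent_fraction(vm)

print(f"Apparent monomers per particle: {state:.2f}")
print(f"Matthews coefficient: {vm:.2f} A^3/Da")
print(f"Estimated solvent fraction: {solvent:.2f}")
```

Under these assumptions the apparent mass corresponds to roughly 2.2 monomers, and the Matthews coefficient comes out near 1.86 Å³/Da (about 34% solvent), which is compatible with two dimers in the asymmetric unit of a tightly packed crystal.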
Protein Structure-In the crystal structure, CENP-B-(540-599) is composed of two α-helices (α1 and α2) (Fig. 2A), which are folded into an antiparallel configuration (Fig. 2B). The human CENP-B-(540-599) region is perfectly conserved among the mammalian CENP-Bs, except for the hamster CENP-B Trp-595 residue, suggesting that the dimerization domain structures of mammalian CENP-Bs are essentially identical (Fig. 2A). The α1 and α2 helices are amphipathic, and the two CENP-B-(540-599) monomers, which interact via their hydrophobic surfaces, form a dimer with a symmetrical, antiparallel, four-helix bundle structure (Fig. 2, B-D). The last 16 residues of each CENP-B-(540-599) C terminus, which is the natural C terminus of the full-length CENP-B, are disordered. Therefore, they are not required for dimer formation. This disorder may reflect the flexible nature of the CENP-B C-terminal region, which is rich in glycine residues. Interestingly, the amino acid residues in the CENP-B region containing the amphipathic helices are significantly conserved in the human Tigger1 transposase (Fig. 2A), which also shares sequence similarity with CENP-B in the N-terminal DNA-binding and central transposase-like domains. Therefore, the Tigger1 transposase may be a CENP-B functional homologue, which can complement the CENP-B function in CENP-B-deficient mice. Furthermore, the amino acid residues in the structured region of CENP-B-(540-599) are also conserved in the S. pombe CENP-B homologues Abp1, Cbh1, and Cbh2 (Fig. 2A), suggesting that the dimerization mechanisms of the human and fission yeast CENP-Bs are similar.
The crystals used in the present study contain two dimers per asymmetric unit, in which the CENP-B-(540-599) dimers directly interact with each other (Fig. 2E). The root mean square deviation value between these CENP-B-(540-599) dimers was 0.7 Å for the Cα atoms, indicating that these CENP-B-(540-599) dimers share almost identical structures.
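For readers unfamiliar with the metric, the root mean square deviation quoted above is the square root of the mean squared distance between matched Cα atoms. The following Python sketch shows only that final formula and assumes the two coordinate sets have already been optimally superposed (the superposition step, for example by the Kabsch algorithm, is omitted). The coordinates are made up for illustration and are not taken from the deposited structure.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation (same units as the input) between two
    equally sized, pre-superposed lists of (x, y, z) coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Hypothetical C-alpha positions in angstroms (illustration only).
dimer_1 = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
dimer_2 = [(0.0, 0.7, 0.0), (3.8, 0.0, 0.7), (7.6, -0.7, 0.0)]

value = rmsd(dimer_1, dimer_2)
print(f"RMSD: {value:.2f} angstroms")
```

In this toy example every atom is displaced by 0.7 Å, so the RMSD equals the individual displacement; in a real comparison the value summarizes a distribution of per-atom deviations.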
The Dimer Interface-As shown in Fig. 3A, the two CENP-B-(540-599) monomers dimerize to form an antiparallel, four-helix bundle. This type of dimerization is also present in the bacterial ROP protein (root mean square deviation value of 2.6 Å for the Cα atoms) (Fig. 3B). The two α-helices of CENP-B-(540-599) are arranged in parallel (that is, their axes are nearly parallel, although the chains run in opposite directions). Similarly, the two α-helices of the ROP monomer are also parallel in the dimeric form. The topologies in the dimeric forms of CENP-B-(540-599) and ROP are the same; the N-terminal helices interact with those of the other monomer. Similar four-helix bundle structures were also found in the human telomere-binding proteins TRF1 and TRF2 (42) (Fig. 3, C and D). However, these four helices are only parts of the large dimerization domains of TRF1 and TRF2 (Fig. 3E). Therefore, the dimerization domains of these telomere-binding proteins differ from that of CENP-B. The structural similarities between the CENP-B-(540-599) dimer and the dimerization interfaces of TRF1 (residues 65-111) and TRF2 (residues 43-89) were not significant (root mean square deviation values of 3.5 and 3.3 Å, respectively, for the Cα atoms), and the two helices of the TRF1 or TRF2 monomer within the interface are non-parallel (the angles between these helices are ~15°) (Fig. 3, C and D). Thus, these results indicate that the CENP-B-(540-599) dimer structure is unique.
FIG. 2. Crystal structure of CENP-B-(540-599). A, the secondary structure of CENP-B-(540-599). The secondary structures of the CENP-B-(540-599) monomers in the dimer are presented in the top and second rows. The N-terminal and C-terminal α-helices are denoted as α1 and α2, respectively, and the N-terminal loop and the loop between α1 and α2 are denoted as L1 and L2, respectively. The two molecules are colored blue and pink, respectively, and dashed lines indicate the C-terminal disordered regions. The dimerization domain sequence of human CENP-B is aligned to those of the African green monkey (AGM) CENP-B, hamster CENP-B, mouse CENP-B, S. pombe (S. p) Abp1, S. pombe Cbh1, and S. pombe Cbh2 and is independently aligned to the corresponding region of the human Tigger1 transposase (bottom row). These amino acid sequences were aligned by the ClustalW program (43). Fully conserved residues, moderately conserved residues, and weakly conserved residues are colored red, orange, and yellow, respectively. Numbers indicate the amino acid position from the N terminus. B-D, three views of the CENP-B-(540-599) structure. Colors correspond to those described above for panel A. E, overall structures of the two CENP-B-(540-599) dimers in the asymmetric unit.
The dimerization interface of CENP-B-(540-599) is a large, buried surface area of 1,386 Å² (~34% of the monomer surface) (Fig. 4A), which is large enough to be consistent with the native dimerization of CENP-B. On the other hand, the area buried in the dimer-dimer surface in the asymmetric unit (Fig. 2E) is as small as 711 Å², suggesting that this interaction is non-physiological. In the functional dimerization interface, 23 residues of one monomer form van der Waals contacts with the other monomer (Fig. 4B). Specific hydrogen bonds were also found in the large hydrophobic interface between the monomers, indicating that the CENP-B-(540-599) dimer formation is structure-specific. The side chain OH group of Tyr-557 forms a hydrogen bond with the Thr-582 Oγ atom of the other monomer (Fig. 4C, top). The δ-NH group of His-570 forms a hydrogen bond with the Asp-577 Oδ2 atom of the other monomer (Fig. 4C, bottom). These interactions are nearly symmetrical between the monomers because of their non-crystallographic symmetry.
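The two interface areas can be put side by side with a short calculation. The Python sketch below back-calculates the monomer surface area implied by the quoted percentage and compares the dimer-dimer contact with the dimerization interface. It assumes that the 1,386 Å² figure is the area buried per monomer, as the phrase "of the monomer surface" suggests; the variable names are hypothetical, and the derived values are estimates rather than numbers reported in this study.

```python
# Values quoted in the text.
DIMER_INTERFACE_A2 = 1386.0      # buried area in the dimerization interface
BURIED_FRACTION = 0.34           # approximate fraction of the monomer surface
DIMER_DIMER_CONTACT_A2 = 711.0   # buried area between dimers in the asymmetric unit

# Assumption: the buried area and the percentage both refer to one monomer.
implied_monomer_surface = DIMER_INTERFACE_A2 / BURIED_FRACTION

# Relative size of the crystal-packing contact versus the dimer interface.
contact_ratio = DIMER_DIMER_CONTACT_A2 / DIMER_INTERFACE_A2

print(f"Implied monomer surface: {implied_monomer_surface:.0f} A^2")
print(f"Dimer-dimer contact / dimer interface: {contact_ratio:.2f}")
```

On these assumptions the monomer surface is on the order of 4,000 Å², and the dimer-dimer contact buries roughly half the area of the dimerization interface.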
Centromeric Chromatin-CENP-B reportedly bundles two distant CENP-B boxes through its dimer formation and DNA-binding abilities (13). This DNA-bundling activity is likely to be important for centromeric chromatin formation. To reveal the functional significance of the CENP-B DNA-bundling activity, structural analyses of the DNA-binding and dimerization domains of CENP-B would be useful. We previously determined the crystal structure of the CENP-B N-terminal DNA-binding domain (CENP-B-(1-129)) in complex with the CENP-B box DNA (33). In the present study, we determined the crystal structure of another CENP-B functional domain, the dimerization domain, which revealed that the CENP-B dimerization domains form a homodimer with a symmetrical, antiparallel, four-helix bundle structure. Many hydrophobic residues face the monomer-monomer interface and form van der Waals contacts between the monomers. In fact, no CENP-B-(540-599) monomer was detected in solution by the gel filtration analysis. Therefore, the monomeric form of the CENP-B dimerization domain is unlikely to be stable in solution, because its hydrophobic surfaces, which correspond to about 34% of the monomer surface, would be exposed to the solvent.
In the crystal structure, the N-terminal loops of the CENP-B-(540-599) dimer were located on opposite sides of the dimer. These N-terminal positions of the CENP-B-(540-599) dimer may be suitable for capturing two distant CENP-B boxes with its N-terminal DNA-binding domains, as shown in the model presented in Fig. 5. Because the CENP-B box sequence exists in every other α-satellite repeat (171 base pairs), CENP-B may accommodate a pair of centromeric nucleosomes that contain histones H2A, H2B, H4, and a centromere-specific histone H3 variant, CENP-A, between two CENP-B boxes tethered by the CENP-B dimer (Fig. 5).
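The spacing argument behind this model can be made explicit. The Python sketch below uses the 171-base pair repeat and the 17-base pair CENP-B box from the text, together with two values that are not given in this study and are assumed here: roughly 147 base pairs of DNA wrapped in a nucleosome core particle and a rise of about 0.34 nm per base pair for B-form DNA. It is a back-of-the-envelope estimate, not part of the published model.

```python
REPEAT_BP = 171            # alpha-satellite repeat length (from the text)
BOX_BP = 17                # CENP-B box length (from the text)
REPEATS_PER_BOX = 2        # the box appears in every other repeat

NUCLEOSOME_CORE_BP = 147   # assumption: DNA wrapped per nucleosome core
RISE_NM_PER_BP = 0.34      # assumption: B-form DNA rise per base pair

# Distance from the start of one box to the start of the next.
box_period_bp = REPEAT_BP * REPEATS_PER_BOX

# DNA lying between two consecutive boxes.
between_boxes_bp = box_period_bp - BOX_BP

# DNA left over once two nucleosome cores are placed between the boxes.
spare_bp = between_boxes_bp - 2 * NUCLEOSOME_CORE_BP

# Contour length of the naked DNA spanning one box period.
contour_nm = box_period_bp * RISE_NM_PER_BP

print(f"Box-to-box period: {box_period_bp} bp")
print(f"DNA between boxes: {between_boxes_bp} bp")
print(f"Spare DNA after two nucleosome cores: {spare_bp} bp")
print(f"Contour length of one period: {contour_nm:.0f} nm")
```

With these assumed values, the DNA between two neighboring boxes is long enough to wrap two nucleosome cores with a few tens of base pairs to spare, which is consistent with the proposal that a pair of nucleosomes sits between the tethered boxes.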