Solution Structure and DNA-binding Mode of the Matrix Attachment Region-binding Domain of the Transcription Factor SATB1 That Regulates the T-cell Maturation*

SATB1 is a transcriptional regulator controlling the gene expression that is essential in the maturation of the immune T-cell. SATB1 binds to the nuclear matrix attachment regions of DNA, where it recruits histone deacetylase and represses transcription through a local chromatin remodeling. Here we determined the solution structure of the matrix attachment region-binding domain, possessing similarity to the CUT DNA-binding domain, of human SATB1 by NMR spectroscopy. The structure consists of five α-helices, in which the N-terminal four are arranged similarly to the four-helix structure of the CUT domain of hepatocyte nuclear factor 6α. By an NMR chemical shift perturbation analysis and by surface plasmon resonance analyses of SATB1 mutant proteins, an interface for DNA binding was revealed to be located at the third helix and the surrounding regions. Surface plasmon resonance experiments using groove-specific binding drugs and methylated DNAs indicated that the domain recognizes DNA from the major groove side. These observations suggested that SATB1 possesses a DNA-binding mode similar to that of the POU-specific DNA-binding domain, which is known to share structural similarity to the four-helix CUT domain.

The nuclear matrix is a structural component inside the nucleus, to which chromatin binds via matrix (or scaffold) attachment regions (MARs) of the DNA, forming looped chromatin structures (1). MARs are frequently located at the boundaries of transcription units, where they are likely to delimit the ends of the active chromatin domains in terms of transcription as well as replication (1). The MAR sequences commonly contain regions where base pairs tend to break under an unwinding stress (base-unpairing region (BUR)), which is important in binding to the nuclear matrix (2). SATB1 (from special AT-rich sequence-binding protein 1), which is predominantly expressed in thymus, was initially identified as a factor that specifically binds to the BUR sequence (3). In thymocytes, SATB1 recruits histone deacetylase complex to the MAR site inside the transcription of the interleukin-2 receptor α gene, in order to repress its expression in the premature T-cells (4). Indeed, SATB1-null mice exhibit irregular expression of the interleukin-2 receptor α gene and related genes in the premature CD4⁺CD8⁺ T-cells, resulting in small thymi and spleens and death at the age of 3 weeks (5). SATB1 also regulates the expression of fetal globin genes in the erythroid progenitor cells by directly binding to MARs in the locus control region and the ε-globin promoter region in the β-globin cluster (6). Inside the cells, SATB1 is localized at nuclei and surrounds heterochromatin, forming a cage-like network structure (7). SATB1 is ~800 amino acids in length, in which a region of ~150 amino acids located nearly in the middle (Tyr 346 -Asn 495 ) was originally reported to be relevant to binding to the MAR DNA (MAR-binding domain (MBD)) (8). This protein also contains a homeodomain located at a more C-terminal region, which alone does not show significant DNA binding activity but enhances the activity of MBD (9).
The MBD sequence possesses a similarity to the CUT DNA-binding motif that is found in a group of DNA-binding proteins also containing a homeodomain at a more C-terminal region (10). Recently, a four-helix structure of the CUT domain of hepatocyte nuclear factor 6α (HNF6α) has been determined (11). Although this structure showed similarity to those of the POU-specific domains of POU-homologous proteins (12,13), the DNA-binding mechanism of the CUT domain is not yet known.
In this study, a three-dimensional structure of SATB1-MBD was determined by NMR spectroscopy. The structure consists of five α-helices packing together into a globular shape, which possesses an additional helix as compared with the HNF6α CUT domain structure. In addition, the DNA-binding mode was elucidated by NMR titration, surface plasmon resonance (SPR), and point mutation experiments and was proposed to be similar to that of POU-specific domains.

MATERIALS AND METHODS
Sample Preparation-The cDNA that codes for the SATB1 MAR-binding region (Tyr 346 -Asn 495 ) was amplified from Quick Clone™ human thymus cDNA mixture (Clontech) by PCR using Pyrobest DNA polymerase (Takara) and was subcloned into the NdeI-BamHI cloning site of pET15b vector (Novagen). Shorter fragments, Val 353 -Asn 490 , Asn 368 -Lys 475 , and Asn 368 -Ala 455 , were also produced by PCR from the above vector. The isotope-labeled and unlabeled proteins were expressed in Escherichia coli cells, as described previously (14). The proteins were purified by nickel-nitrilotriacetic acid Superflow (Qiagen) and Sephadex G-75 (Amersham Biosciences) column chromatography. The buffers used were 50 mM sodium phosphate (pH 8.0), 500 mM NaCl, 10-250 mM imidazole for nickel-nitrilotriacetic acid Superflow chromatography and 50 mM sodium phosphate (pH 5.5) for Sephadex G-75 chromatography. Protein concentration was determined by A₂₈₀ values, and molar absorption coefficients were calculated from the amino acid sequences. For NMR measurements, ~1.3-1.7 mM proteins were dissolved in 50 mM sodium phosphate buffer (pH 5.5) containing 0.5 mM sodium 2,2-dimethyl-2-silapentane-5-sulfonate and 5% D₂O, unless otherwise stated.
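The concentration determination from A₂₈₀ can be sketched as follows. The text does not say which scheme was used to calculate the molar absorption coefficients, so this sketch assumes the widely used approximation based on Trp, Tyr, and cystine counts; the function names and the example sequence are hypothetical.

```python
# Assumed per-residue molar absorption coefficients at 280 nm (M^-1 cm^-1).
# This Trp/Tyr/cystine approximation is an assumption, not stated in the text.
EPS_TRP = 5500
EPS_TYR = 1490
EPS_CYSTINE = 125


def molar_absorption_coefficient(sequence, n_cystine=0):
    """Estimate the 280 nm coefficient from a one-letter amino acid sequence."""
    seq = sequence.upper()
    return (seq.count("W") * EPS_TRP
            + seq.count("Y") * EPS_TYR
            + n_cystine * EPS_CYSTINE)


def concentration_molar(a280, sequence, path_cm=1.0, n_cystine=0):
    """Beer-Lambert law: c = A / (epsilon * path length)."""
    eps = molar_absorption_coefficient(sequence, n_cystine)
    if eps == 0:
        raise ValueError("no Trp, Tyr, or cystine: A280 cannot be used")
    return a280 / (eps * path_cm)


if __name__ == "__main__":
    # Hypothetical peptide, used only to exercise the functions.
    demo = "MWYYK"
    print(molar_absorption_coefficient(demo))
    print(concentration_molar(0.848, demo))
```

A concentrated NMR sample would normally be diluted before the absorbance reading, in which case the result is multiplied by the dilution factor.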
NMR Measurements, Resonance Assignments, and Structural Calculation-Typical homonuclear and heteronuclear NMR spectra (15,16) ¹³C, and 50.68 MHz for ¹⁵N) spectrometers at 308 K, essentially as described previously (17). The recorded spectra were analyzed for the backbone and side chain resonance assignments, also as described previously (17). HSQC spectra of samples containing 100% D₂O were recorded at 293 K, by which 43 hydrogen bond donors were identified. By using the HMQC-J experiment (18), 80 ³J(HN,Hα) coupling values in the calculated region (see below) were obtained. By analyzing the nuclear Overhauser effect spectroscopy, total correlation spectroscopy, and double quantum-filtered correlation spectroscopy spectra, 25 and 4 pairs of H-β and valine H-γ resonances, respectively, in the calculated region were assigned stereospecifically.
Structure was calculated for the Asn 368 -Ala 455 region, because more N-terminal and C-terminal regions appeared to be unstructured in the NMR spectra, and resonance assignments were not completed mainly because of the cross-peak overlapping. Indeed, NMR spectra of the Asn 368 -Ala 455 fragment were very similar to those of the fragment Val 353 -Asn 490 , as described under "Results." Random simulated annealing (20) was carried out by using the program CNS (21), essentially as described (19).
SPR-Experiments were carried out at 293 or 288 K using a BIAcore X apparatus (BIAcore). The running buffer was 50 mM sodium phosphate buffer (pH 5.5) containing 0.005% Tween 20. A total of 338 and 522 resonance units of two double-stranded DNAs (5′-bio-CGTTTCTAATATATGC-3′/5′-GCATATATTAGAAACG-3′, the BUR nucleation sequence (22) is underlined, and 5′-bio-GGGCATGTATCTTGAATC-3′/5′-GATTCAAGATACATGCCC-3′, respectively, where "bio" indicates biotinylation at the 5′ end) were immobilized on the surfaces of Sensor Chip SAs (BIAcore) in one (flow cell 2) of the two flow cells, so that the other (flow cell 1) was treated as the control. Solutions containing the protein at concentrations of 5 nM to 5 μM were injected into the flow cells at 20 μl min⁻¹ for 5 min. The equilibrium binding constant was estimated by fitting the equilibrium response values at different protein concentrations to the simple 1:1 binding model using the BIAevaluation 3.0 software (BIAcore). For concentration-response profiles that did not appear to be saturated within the experimental concentration range, the saturation levels were constrained to the maximal response value expected when the protein molecules bind to all the immobilized DNA molecules at 1:1 stoichiometry. The standard deviations in four repeated experiments were used as the error levels.
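The equilibrium analysis can be illustrated with a minimal sketch. It is not the BIAevaluation procedure: it assumes the standard 1:1 isotherm R_eq = R_max·K_A·C/(1 + K_A·C), holds R_max fixed as in the constrained case described above, and finds K_A by a one-dimensional least-squares search. The expected R_max is taken from the usual mass-ratio relation for SPR, and the function names, molecular masses, and synthetic data are hypothetical.

```python
import math


def response_1to1(conc, ka, rmax):
    """Equilibrium response of a simple 1:1 binding model."""
    return rmax * ka * conc / (1.0 + ka * conc)


def expected_rmax(immobilized_ru, mw_analyte, mw_ligand, stoichiometry=1.0):
    """Maximal response if every immobilized ligand binds the analyte."""
    return immobilized_ru * (mw_analyte / mw_ligand) * stoichiometry


def fit_ka(concs, responses, rmax, log10_lo=3.0, log10_hi=10.0, iters=200):
    """Least-squares K_A with R_max fixed (golden-section search on log10 K_A)."""
    def sse(log_ka):
        ka = 10.0 ** log_ka
        return sum((r - response_1to1(c, ka, rmax)) ** 2
                   for c, r in zip(concs, responses))

    phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = log10_lo, log10_hi
    for _ in range(iters):
        c = b - phi * (b - a)
        d = a + phi * (b - a)
        if sse(c) < sse(d):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return 10.0 ** ((a + b) / 2.0)


if __name__ == "__main__":
    # Synthetic, noise-free data over the 5 nM to 5 uM range (values in M).
    concs = [5e-9, 5e-8, 5e-7, 5e-6]
    rmax = 100.0  # hypothetical saturation level in resonance units
    data = [response_1to1(c, 4.8e6, rmax) for c in concs]
    print(fit_ka(concs, data, rmax))
```

Searching on log10 K_A keeps the step size sensible across the several orders of magnitude that binding constants can span.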
Essentially the same experiments were performed with DNAs containing methylated bases, 5′-bio-CGTTTCTAATATATGC-3′/5′-GCATATATTm⁶AGAAACG-3′ and 5′-bio-CGTTTCTAATATATGC-3′/5′-GCATATm⁶ATTAGAAACG-3′, or in the presence of distamycin (Sigma), methyl green (Funakoshi, Japan), or spermine (Wako, Japan). Because methylation at the N⁶H₂ group of adenine base destabilizes the DNA double strands (23), experiments using methylated DNAs were performed at 288 K. Experiments were also performed for the Val 353 -Asn 490 fragment in 50 mM sodium acetate buffer (pH 5.5), which confirmed that the phosphate ions in the phosphate buffer do not interfere with the DNA binding (data not shown).
NMR Titration Analysis-¹H-¹⁵N HSQC spectra of the protein at the initial concentration of 0.31 mM were recorded at 293 K by adding increasing amounts of a 7.8 mM solution of the double-stranded 16-mer DNA (5′-CGTTTCTAATATATGC-3′/5′-GCATATATTAGAAACG-3′) (the BUR nucleation sequence (22) is underlined). The concentration of the double-stranded DNA was determined by using an extinction coefficient calculated after digestion of the strands with phosphodiesterase I (Worthington). Heteronuclear three-dimensional spectra of SATB1-MBD in the complex with DNA were recorded in order to confirm assignments of backbone ¹H and ¹⁵N resonances.
Site-directed Mutagenesis-A series of point mutations (Table 2) were introduced into the SATB1-MBD (Val 353 -Asn 490 ) expression plasmid, using the QuickChange™ XL site-directed mutagenesis kit (Stratagene). After annealing of primers of 30 bases containing mutated triplet codons, DNA strands were elongated by KOD Plus DNA polymerase (Toyobo). The mutant proteins were prepared essentially as described above, and SPR experiments were performed also as described above. The error levels in the ratios of the binding constants of the wild-type and mutant proteins ($K_{AWT}/K_{Amut}$) were evaluated as $(\sigma_{WT}^2 K_{Amut}^2 + \sigma_{mut}^2 K_{AWT}^2)^{1/2}/K_{Amut}^2$, where $K_{AWT}$ and $K_{Amut}$ represent the binding constants of wild-type and mutant proteins, respectively, and $\sigma_{WT}$ and $\sigma_{mut}$ represent the error levels in $K_{AWT}$ and $K_{Amut}$, respectively.
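The error estimate for the ratio of binding constants is the standard first-order propagation for a quotient of two independent quantities, sqrt(σ_WT²·K_Amut² + σ_mut²·K_AWT²)/K_Amut². A minimal sketch follows; the function name and the numerical values are hypothetical and are not taken from Table 2.

```python
import math


def ratio_with_error(k_wt, sigma_wt, k_mut, sigma_mut):
    """Return K_AWT/K_Amut and its propagated error level.

    Assumes the two error levels are independent.
    """
    ratio = k_wt / k_mut
    error = math.sqrt(sigma_wt ** 2 * k_mut ** 2
                      + sigma_mut ** 2 * k_wt ** 2) / k_mut ** 2
    return ratio, error


if __name__ == "__main__":
    # Hypothetical binding constants (M^-1) and their standard deviations.
    ratio, error = ratio_with_error(4.0e6, 0.4e6, 2.0e6, 0.2e6)
    print(ratio, error)
```

The same expression can be written as the ratio multiplied by the square root of the summed squared relative errors, which is a convenient check on the arithmetic.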
Modeling of Complex of SATB1-MBD and DNA-Starting from an initial model in which the side of the protein containing the residues with largely affected chemical shifts in the NMR titration experiment was oriented to the major groove side of a standard B-DNA, the most energetically favored structure was selected by a careful and systematic search, essentially as described previously (17).

RESULTS
MAR-binding Domain and CUT Domain-The MAR-binding domain region was originally assigned to be Tyr 346 -Asn 495 for the murine SATB1 protein, and the human counterpart possesses a nearly identical sequence (8) (Fig. 1). We initially expressed and purified a fragment corresponding to this region, and NMR analyses showed that the fragment contained significantly unfolded region(s) (data not shown). Therefore, we prepared shorter fragments Val 353 -Asn 490 , Asn 368 -Lys 475 , and Asn 368 -Ala 455 . All of them were expressed well and successfully purified. Comparison of the NMR spectra suggested that the shortest fragment contains most of the folded region and that the longer ones possess additional unstructured region(s) (data not shown).
The abilities of the above fragments to bind to a DNA containing a MAR sequence (the BUR nucleation sequence; see Ref. 22) were evaluated by SPR experiments (Fig. 2, a and b). The original MBD region fragment (Tyr 346 -Asn 495 ) and a slightly shorter fragment (Val 353 -Asn 490 ) were demonstrated to bind to the DNA with similar binding constants (3.7 (±0.4) × 10⁶ M⁻¹ and 4.8 (±0.6) × 10⁶ M⁻¹, respectively) (Fig. 2b). The response ratio of the maximal binding of the protein to the DNA immobilized on the sensor chip suggests that SATB1-MBD binds to this DNA at a stoichiometry of 1:1. Nonspecific binding to a DNA with an unrelated sequence was shown to be very weak, which indicates the recognition specificity to the MAR sequence.
In contrast to the above, fragment Asn 368 -Lys 475 showed significantly weaker binding, and the shortest fragment Asn 368 -Ala 455 , closest to the CUT domain region (Fig. 1), showed even lower activity (Fig. 2b). This is probably because the deletion reduced the net charges of the fragments at neutral pH (difference between the numbers of basic and acidic residues), which need to be positive enough to stably bind to the negatively charged DNA, i.e. the net charges are +2, +2, +1, and −1 for fragments Tyr 346 -Asn 495 , Val 353 -Asn 490 , Asn 368 -Lys 475 , and Asn 368 -Ala 455 , respectively. Indeed, decreasing the pH value to 4.0 partially recovered the binding activity of fragment Asn 368 -Ala 455 (data not shown). Also, another fragment containing Asn 368 -Leu 452 and four additional basic residues at the C terminus showed a strong sequence-specific binding (data not shown). Therefore, it is likely that the CUT domain region is mainly responsible for the MAR-DNA binding, although the other unstructured regions contribute to the strong binding at least by increasing the net charges. Considering the above, we used the fragment Val 353 -Asn 490 as SATB1-MBD for all the experiments hereafter unless otherwise stated.
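The net charge used in this argument is simply the number of basic residues minus the number of acidic residues. A minimal sketch of that count is given below, assuming one-letter sequences and ignoring His, the termini, and any affinity tag; the function name and the example peptide are hypothetical and do not reproduce the SATB1 fragments.

```python
def net_charge(sequence):
    """Number of basic (Lys, Arg) minus acidic (Asp, Glu) residues."""
    seq = sequence.upper()
    basic = seq.count("K") + seq.count("R")
    acidic = seq.count("D") + seq.count("E")
    return basic - acidic


if __name__ == "__main__":
    # Hypothetical peptide, used only to exercise the function.
    core = "NSEDLKRAQE"
    print(net_charge(core))
    # Appending four basic residues raises the net charge by four.
    print(net_charge(core + "KKRR"))
```

Counting in this way treats every Lys and Arg as fully protonated and every Asp and Glu as fully deprotonated, which is the approximation implied by "at neutral pH" in the text.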
FIGURE 1. [...] (35). Sequences of eight SATB proteins (hsSATB1 and hsSATB2 from Homo sapiens, mmSATB1 and mmSATB2 from Mus musculus, ggSATB1 and ggSATB2 from Gallus gallus, and tnSATB1 and tnSATB2 from Tetraodon nigroviridis), four ONECUT group proteins (human hepatocyte nuclear factor 6α (hsHNF6a), human ONECUT2 (hsONECUT2), Drosophila melanogaster ONECUT (dmONECUT), and Danio rerio ONECUT (drONECUT)), and two CDP/Cux proteins (D. melanogaster CUT protein (dmCUT) and human CCAAT displacement protein (hsCDP)), each containing three repeated CUT domains, were obtained from the NCBI data base (www.ncbi.nlm.nih.gov). Entry codes are AAH01744 for hsSATB1, AAA17372 for mmSATB1, XP_418746 for ggSATB1, CAG02763 for tnSATB1, AAI03501 for hsSATB2, AAL37172 for mmSATB2, XP_421919 for ggSATB2, CAF93666 for tnSATB2, AAD00826 for hsHNF6a, CAB38253 for hsONECUT2, AAL13705 for dmONECUT, AAH66466 for drONECUT, CAA30794 for dmCUT, and AAB26579 for hsCDP. Numbers above the sequences are for hsSATB1, whereas those of the first residues of the aligned sequences of the individual proteins are indicated beside the sequences. Basic (Arg and Lys) residues conserved in eight or more of the proteins presented here are colored cyan, whereas aliphatic (Ile, Leu, Met, and Val) and aromatic (His, Phe, Trp, and Tyr) residues conserved in 17 or more are colored red. Colored boxes above the sequence alignment indicate regions of the helices of SATB1-MBD. Below the sequence, identical and similar residues are marked as produced by the ClustalX program (35). The CUT domain region is indicated by a horizontal bar. The arrows indicate the introduced mutational sites (see Table 2).
Structural Description-Although the NMR analyses were performed on the Val 353 -Asn 490 fragment, the structure was calculated for region Asn 368 -Ala 455 , because the N-terminal and C-terminal regions appear to be largely unfolded as described above. The experimental constraints and stereochemical properties of the NMR solution structure of SATB1-MBD are shown in Table 1. The secondary structure elements are 5 α-helices (α1, Glu 374 -Ala 386 ; α2, Gln 390 -Ala 397 ; α3, Leu 404 -Lys 411 ; α4, Gln 420 -Gln 434 ; and α5, Glu 437 -Gln 445 ), as identified by the program Procheck-NMR (24) (Figs. 1 and 3). These helices pack against one another, forming a globular shape as a whole. The packing of the helices was achieved by hydrophobic interactions between side chains, many of which are those of aliphatic or aromatic residues, i.e. [...] the loop regions contribute to formation of the structural core, through hydrophobic interactions. Also, an electrostatic interaction between Glu 382 and Arg 440 side chains, with an average O⁻-H⁺ distance of 3.2 Å, is likely to contribute to the packing of α-helices 1 and 5.
It should be noted that many of the above residues, except for those in α-helix 5, are conserved among the CUT domains (Fig. 1, top), indicating that the domains of these proteins share a common structural architecture except for α5. Indeed, the structure of the CUT domain of hepatocyte nuclear factor 6α has been reported to possess four helices (11), which correspond to α1-4 of SATB1-MBD. More will be described on the structural comparisons below under "Discussion."
Binding to the Major Groove of DNA-To test to which groove of DNA SATB1-MBD binds, SPR experiments employing DNAs methylated on the major groove side and groove-specific binding drugs, such as methyl green (major groove binding; Ref. 25), spermine (major groove binding; Ref. 26), and distamycin (minor groove binding; Ref. 27), were carried out. When a DNA containing an m⁶A base outside the BUR nucleation sequence (5′-CGTTTCTAATATATGC-3′/5′-GCATATATTm⁶AGAAACG-3′; the BUR nucleation sequence (22) is underlined) was used, the binding ability of SATB1-MBD was only slightly affected (Fig. 2c). In contrast, binding to another DNA containing an m⁶A base in the BUR nucleation sequence (5′-CGTTTCTAATATATGC-3′/5′-GCATATm⁶ATTAGAAACG-3′) was weaker by ~40-fold than that to the DNA without modification (Fig. 2c). Because the N⁶H₂ group of adenine base is exposed to the major groove side of DNA, the above results indicate that SATB1-MBD binds to the BUR nucleation sequence from the major groove side. Consistently, experiments using the groove-specific DNA-binding drugs showed the major groove binding of SATB1-MBD, namely the presence of 50 μM distamycin did not affect the DNA binding of SATB1-MBD, whereas the presence of 50 μM methyl green or spermine significantly reduced the apparent affinity of the protein (Fig. 2c). Therefore, it was concluded that SATB1-MBD binds to the DNA from the major groove side.
Interface for DNA Binding-The protein surface involved in the DNA binding tends to be positively charged to bind to negatively charged DNA. On the surface of the SATB1-MBD structure, a characteristic array of positively charged amino acid side chains, i.e. Lys 411 -Arg 410 -Arg 380 -Lys 384 -Arg 385 , is observed (Fig. 3c). These residues are conserved in the SATB proteins and either of CUT domains of ONECUT group or Cux/CDP proteins and are considered to be candidate residues that form the interface for DNA binding (Fig. 1). As described later, the first four residues of this array are suggested to be involved in the DNA binding.
To elucidate the protein-DNA interface, an NMR titration experiment was carried out (Fig. 4a). By adding increasing amounts of the DNA, chemical shift perturbations were observed in the HSQC spectra as follows. The positions of some cross-peaks did not change at all, or they shifted only slightly so that the chemical shift changes were easy to follow (e.g. Ser 372 , Ser 373 , Phe 398 , Glu 437 , and Gln 430 H-ε in Fig. 4a). For some cross-peaks, even though the differences were significant, the change could be followed in the fast-exchange manner and/or resonances were assigned by analyses of three-dimensional spectra (e.g. Asp 381 , Asp 414 , Leu 409 , and Arg 427 in Fig. 4a). For others, however, the changes could not be followed, for which distances to the nearest unassigned cross-peaks of the complex were temporarily considered as the chemical shift difference or classification into the highest category was applied if there were no such candidate peaks within the classification limit ($(\Delta\delta_H^2 + (\Delta\delta_N/5.0)^2)^{1/2} \geq 0.2$ ppm) (e.g. Gln 390 , Thr 401 , Gln 402 , Gln 420 , and Gln 390 (H-ε) in Fig. 4a). This treatment may cause underestimation of the difference but causes no overestimation except for the case where a strong exchange broadening occurs without large change in the chemical shift. The chemical shift perturbations were completed when the concentration ratio of DNA to protein reached ~1.0, which suggested a 1:1 binding stoichiometry of the protein-DNA interaction.
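The combined chemical shift difference used for the classification, sqrt(Δδ_H² + (Δδ_N/5.0)²), and the 0.2 ppm limit for the highest category can be sketched as follows. Only that limit is given in the text, so the sketch does not assign the lower categories; the function names and the example shift values are hypothetical.

```python
import math


def combined_shift(delta_h, delta_n, n_scale=5.0):
    """Combined 1H/15N shift difference in ppm, with 15N scaled down by 5.0."""
    return math.sqrt(delta_h ** 2 + (delta_n / n_scale) ** 2)


def in_highest_category(delta_h, delta_n, limit=0.2):
    """True if the combined difference reaches the classification limit."""
    return combined_shift(delta_h, delta_n) >= limit


if __name__ == "__main__":
    # Hypothetical per-residue differences (1H ppm, 15N ppm).
    shifts = {"residue A": (0.03, 0.20), "residue B": (0.15, 1.00)}
    for name, (dh, dn) in shifts.items():
        print(name, combined_shift(dh, dn), in_highest_category(dh, dn))
```

Scaling the ¹⁵N difference before combining keeps the wider ¹⁵N shift range from dominating the ¹H contribution.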
After classifying the residues according to their chemical shift differences, it became apparent that a relatively limited region of the structure was largely affected by the binding of DNA (Fig. 4b). These regions are the loop between α1 and α2 (loop 12), α3, the loop between α3 and α4 (loop 34), and α4. In contrast, most residues of α1 and α5 are not significantly affected. Considering the above, a model of the complex of SATB1-MBD and DNA was built using a computational approach (Fig. 4b). In this model, α-helix 3 deeply enters the major groove of the DNA in such a way that the axis of the α-helix is nearly perpendicular to that of the DNA double helix. Accordingly, the side chains of Ser 406 , Glu 407 , Arg 410 , and Lys 411 in α3 contact the DNA bases of the ~4-bp region, probably forming hydrogen bonds. The side chains of Arg 380 , Lys 384 , Ser 389 , Gln 390 , Arg 400 , Arg 410 , Ser 419 , and Ser 421 , and the backbone amide of Gln 390 form intermolecular hydrogen bonds to the DNA phosphate groups. The NMR titration experiment is highly consistent with these interactions, e.g. cross-peaks of both the side chain and the backbone signals of Gln 390 are significantly affected upon adding DNA, as shown in Fig. 4a. It should be noted that Ala 428 and neighboring residues in the middle of α4 are also largely affected in the NMR titration experiment, although the residues are distant from the interface to DNA. This suggests a possibility that α4 slightly kinks upon binding to DNA, in order to achieve better intermolecular fitting.
To evaluate the reliability of the proposed model, point mutations at 8 Arg/Lys residues and 3 Ser residues were introduced, and their effects on the DNA binding ability were evaluated by SPR (Fig. 2d and Table 2). It should be noted that all the Arg/Lys to Asn mutations decreased the affinity constants at least by ~3-fold. This is likely to be at least partially because of the effect of reducing net charge (from +2 to +1) by eliminating a basic residue. However, three mutations at Lys 384 , Arg 410 , and Lys 416 decreased affinity constants more significantly (>10-fold) (Fig. 2d and Table 2). The first two residues contact the base and/or phosphate of DNA in the proposed model, as described above, whereas Lys 416 is significantly distant from the interface to the DNA. In the structure, the side chain of this residue is capable of forming a salt bridge with those of Glu 370 and Asp 414 . It is likely that the mutation of Lys 416 altered the structure at least locally, making it unsuitable for DNA binding.
It should be noted that Glu 370 , Asp 414 , and Lys 416 are conserved only in the SATB proteins (SATB1 and related SATB2 proteins; Fig. 1), indicating that the above possible salt bridge network is specific for the SATB proteins. Thus, although a possibility cannot be excluded that Lys 416 is directly involved in DNA binding through relatively large conformational changes of loop 34, this is not the case with the CUT domains of the other families. Four of the other Arg/Lys residues with relatively small mutational effects on the DNA binding (Arg 380 , Arg 395 , Arg 427 , and Arg 442 ; Table 2) are distant from the protein-DNA interface in the model, except for Arg 380 in α1. Arg 380 forms a hydrogen bond with a DNA backbone phosphate, as described above. Although the reason why the mutation at this site does not have a large effect is unknown, it is possible that an Asn residue at this site also forms a hydrogen bond to the DNA phosphate. Indeed, Thr or Ser residues, which are also capable of hydrogen bonding, are conserved at this position in the ONECUT group of proteins (Fig. 1). Lys 475 in the C-terminal unstructured loop also showed a relatively small mutational effect. Although the basic residues in the unstructured regions were considered to be important in increasing the net charge, as described above, a possibility that their side chains directly associate with the DNA phosphate cannot be excluded.
Among the three mutants of Ser residues, only that of Ser 406 in α3 induced a large decrease in the affinity to DNA (Fig. 2d and Table 2). This residue contacts the DNA bases possibly forming hydrogen bonds, as described above. The other two Ser residues in the N- or C-terminal loop are likely to be distant from the interface. Therefore, the model proposed in the present study is largely consistent with the mutational effects on the DNA binding activity and is likely to be reliable at least to define the framework of the protein-DNA binding. To reveal the sequence-specific DNA-recognition mechanism, however, future experiments on the structure determination of the protein-DNA complex are necessary.
FEBRUARY 24, 2006 • VOLUME 281 • NUMBER 8

DISCUSSION
Major Groove Binding-In contrast to the present results, a previous report suggested the minor groove binding of the full-length SATB1 protein (3). This is mainly based on the result that a minor groove-specific binding drug, distamycin, at high concentrations (>25 μM) competes out the DNA binding of the SATB1 protein in a gel retardation experiment, which requires more stable binding than the SPR experiment. In the present SPR experiment, distamycin at a higher concentration (50 μM) did not interfere with DNA binding by MBD (Fig. 2c). It should be noted that the full-length protein includes the homeodomain that significantly enhances the affinity of the protein and DNA (9), which is likely to ensure that the binding is stable enough to be detected in the gel retardation experiments. It was reported that distamycin at even lower concentrations (>5 μM) inhibits the DNA binding of the Antennapedia homeodomain (28) that is known to bind specifically to the major groove. Therefore, the previous observation of chemical competition is possibly because of its competition with the homeodomain but not with MBD.
Comparison with the Structures of CUT, POU-specific, and Related Helix-Turn-Helix DNA-binding Domains-The CUT domain of HNF6α was reported to possess a four-helix structure (11), as shown in Fig. 5a, whereas the SATB1-MBD region similar to the CUT domain possesses an additional C-terminal helix, α5. The HNF6α CUT domain structure was shown to be similar to the POU-specific DNA-binding domains of the POU-homologous homeodomain proteins (12, 13, 29-31). The common structure of the POU-specific domain contains four helices arranged similarly to the HNF6α CUT domain and the N-terminal four helices of SATB1-MBD, although an atypical variation possessing an additional N-terminal helix was reported for the HNF1α POU-specific domain (31) (Fig. 5, b and c). Furthermore, the POU-specific domain is known to show similarity to the helix-turn-helix DNA-binding domains of phage λ and 434 repressors and 434 Cro proteins (13). Although these prokaryotic DNA-binding domains possess five helices (32), the relative position of the fifth helix, which is known to be required for the dimer formation, and the length of the fourth helix are substantially different from SATB1-MBD (Figs. 3b and 5d).
It is also clear that the DNA-binding mode of the POU-specific domain is very similar to the proposed model of the complex of SATB1-MBD and DNA (Fig. 4b, and Fig. 5, b and c), in which the third helix (of the common four-helix structure) deeply enters the major groove of DNA in order to contact bases. Also for λ/434 repressors and 434 Cro DNA-binding domains, the third helix acts as the DNA-recognition helix. Therefore, these groups of the DNA-binding domains are likely to possess a common DNA-binding framework. It is especially noted that at the N terminus of the second α-helix, a Gln residue is strictly conserved for all these proteins (residue 390 of SATB1 in Fig. 1) (13,31). This residue forms a hydrogen bond to a DNA phosphate group and is likely to contribute to defining the DNA-binding framework. Therefore, it is concluded that SATB1-MBD is likely to be classified into the same helix-turn-helix group as the POU-specific domain, the CUT domain, and the λ/434 repressors/434 Cro DNA-binding domains, commonly containing four helices, although it is atypical with regard to the C-terminal helix. The role of this additional helix of SATB1-MBD is highly likely to be stabilizing the structural core, mainly by hydrophobic interactions between Val 396 (α2)-Ile 443 (α5), Phe 432 (α4)-Arg 440 (α5), and Phe 432 (α4)-Tyr 444 (α5).
Despite the similarity in the DNA-binding modes of the above domains, the prokaryotic and eukaryotic proteins containing the above domains are different in the DNA-binding modes of the whole molecules, i.e. the prokaryotic proteins form homodimers of the above domains when bound to DNA, whereas the eukaryotic proteins possess homeodomains and bind to DNA without forming homodimeric protein-protein contacts. Furthermore, among the eukaryotic members, the POU-homologous proteins and CUT-like proteins, including the SATB proteins, are different in that the homeodomains in the former proteins are involved in the specific recognition of DNA (29-31), whereas the homeodomains of CUT-like proteins are not required in the specific recognition or only assist the CUT-like domains (9,10,33).
It should be noted that the residues in the recognition helices of the helix-turn-helix DNA-binding domains described here are not conserved very much, which should cause differences in the DNA recognition sequence. Even among the three subgroups of CUT domain proteins, including the SATB family, the residues of the α3 region are similar but slightly different, i.e. LLSEILRK for the SATB proteins, TLSDLLRN for the ONECUT group proteins, and (S/T)VS(D/E)(L/I/M)L(A/S)(K/R) for the CDP/Cux proteins (Fig. 1) (commonly conserved or similar residues are underlined). This is very likely to cause similarity and differences in the DNA recognition sequence. Indeed, the recognition sequences of SATB1 (BUR nucleation sequence), HNF6α, and CDP R2/R3 are ATATAT, ATCAAT, and ATCGAT, respectively (10,22,33).
Biological Implications of the SATB1/MAR-DNA Interaction-The BUR sequence is known to be essential to the binding of MAR-DNAs and the nuclear matrix, in which the tendency of the base unpairing appears to be important (2). Although the SATB1 protein was found as a factor that specifically binds to the BUR sequence, the present experimental results are consistent with a model in which SATB1-MBD binds to a double-stranded DNA in the standard B-form from the major groove side. This complex should stabilize the double-stranded MAR-DNA in the B-form and protect it from the base-unpairing force. Therefore, it is likely that the binding of SATB1 to MAR competes with the binding of MAR and the nuclear matrix, if the base-unpaired DNA is assumed to be involved in the binding to the nuclear matrix.
By considering that SATB1 recruits the histone deacetylase complex, in order to inactivate the gene expression through the chromatin remodeling (4), and that SATB1 itself possesses a nuclear matrix targeting sequence that is important in the transcriptional repression (34), the multistep gene-silencing reactions driven by SATB1 would be as follows: (i) translocation of SATB1 to the nuclear matrix, (ii) competing out the binding of the nuclear matrix and MAR-DNA, and (iii) chromatin remodeling by histone deacetylation.