A novel matrix attachment region DNA binding motif identified using a random phage peptide library.

SATB1 is a nuclear matrix attachment DNA (MAR)-binding protein which is predominantly expressed in thymocytes. This protein binds to the minor groove specifically recognizing an unusual DNA context exhibited by a specific MAR region with strong base-unpairing propensity. A phage library displaying nonamer random peptides without any built-in structure was used to identify a MAR binding motif of SATB1. One predominant cyclic peptide C1 of CRQNWGLEGC selected by a MAR-affinity column showed 50% identity with a segment in SATB1 (amino acids 355-363). Replacement of the C1 similarity segment in SATB1 by a random amino acid sequence or its truncation resulted in more than 80% reduction in MAR binding. In contrast, replacement of the same SATB1 segment with the C1 peptide restored full MAR binding activity and specificity as the wild-type protein. Single amino acid mutation of the conserved Arg or Glu residue to Ala greatly reduced MAR binding. Taken together our data show that a nine amino acid sequence in SATB1 represents a key MAR binding motif. Phage display may provide a general tool for rapid identification of DNA binding peptide motifs.

Eukaryotic chromatin is organized into loop domains, which may have both structural and functional roles such as differential gene expression and replication (1-3). It is believed that the chromatin loops are anchored to the nuclear matrix at specific DNA regions that exhibit high affinity to the nuclear matrix. These matrix-associated regions or MARs (2,3) are common at the boundaries of transcription units (4-8) and often found near enhancer-like regulatory sequences (9-12). Recent evidence shows that MARs play a role in tissue-specific gene expression. MARs associated with the immunoglobulin heavy chain locus have been shown to be essential for transcription of a rearranged gene in transgenic B lymphocytes (13). Also, MARs are binding targets for a cell type-specific MAR-binding protein, SATB1. SATB1 is primarily expressed in thymus (14), and because it is expressed during specific stages of T-cell development, it is believed to be involved in thymocyte maturation and differentiation.
SATB1 is a novel type of DNA-binding protein that recognizes a specific sequence context in which one strand exclusively consists of mixed A, T, and C nucleotides (ATC sequences). Clustered ATC sequences commonly found in MARs have a strong tendency to unwind by extensive base unpairing (15). The unwinding property of MARs confers high affinity binding to the nuclear matrix and is essential for transcription enhancing activity of such sequences (16). SATB1 does not bind to sequences that are similarly AT-rich but lack the unwinding capability. SATB1 binding is highly specific, but exhibits an unusual mode of DNA recognition; it interacts with the minor grooves of its target sequences while making little contact with the bases (14). A 150-amino acid segment of SATB1 was recently identified as the MAR binding domain (17). The concomitant presence of both NH₂- and COOH-terminal arms of this segment is necessary for full binding activity, but the DNA contact sites remain to be determined.
We have taken an unconventional approach in delineating the DNA binding sites within SATB1 using a random peptide bacteriophage display library. Bacteriophage display is a powerful tool to study protein-target molecule interactions (18,19). In this system, hundreds of millions of random peptides are expressed on the surface of bacteriophage as fusion protein libraries, and ligands for various purposes can be selected from them. Peptide ligands for proteins (20-24), antibodies (25), and enzymes (26) have been identified in this manner. We speculated that phage display libraries could also represent a vast source of DNA binding motifs. We report here that a random peptide bacteriophage library without any built-in structures can be used to affinity-select specific MAR binding peptides. Based on the sequence similarities between the selected peptide and a native MAR-binding protein, SATB1, we were able to define a nine-amino acid segment in SATB1 as a key MAR binding motif for this protein. Deletion of this segment markedly reduced the MAR binding activity of SATB1. In the context of SATB1, this motif together with other components in SATB1 confers unique binding specificity to the AT-rich sequences with strong unpairing potential.

EXPERIMENTAL PROCEDURES
MAR Probes and Affinity Column-Multimeric wild-type MAR DNA (25)₅ (top strand of the monomer sequence: 5′-TCTTTAATTTCTAATATATTTAGAAttc-3′) and mutant MAR DNA (24)₈ (top strand of the monomer sequence: 5′-TCTTTAATTTCTACTGCTTTAGAAttc-3′) were generated as described (14). The differences between the wild-type and mutated MARs are underlined. To prepare the affinity column, concatemers of the wild-type MAR were coupled to CNBr-activated Sepharose 6MB (Pharmacia Biotech Inc.) at 200 μg/ml bed volume as described (27). A control column was made by coupling sonicated herring sperm DNA (average size about 200 base pairs) to Sepharose 6MB at the same concentration.
Construction of CX₉ Library-The details for constructing the CX₉ library (C, cysteine; X, any amino acid) have been described elsewhere (28). Briefly, an oligonucleotide corresponding to the sense strand of the CX₉ insert was synthesized and amplified with primers complementary to the surrounding constant regions. The purified polymerase chain reaction product was cloned into the amino terminus coding region of the pIII gene in the fUSE5 vector (18,19) and transformed into the MC1061 strain of Escherichia coli. About 10⁹ independent clones were obtained.

* This work was supported by NIH Grants CA 28896 (to E. R.) and CA 39681 (to T. K.-S.), Tobacco-related Disease Grant 4IT0147 (to E. R.), and Cancer Center Support Grant CA 30199. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Phage Selection on a MAR Affinity Column-Our selection strategy combined procedures for biopanning (18,19) and methods for purifying DNA-binding proteins (29,30). In the first round of panning, about 10¹¹ transducing units (TU) of the CX₉ library were incubated with 0.2 ml of the concatemerized MAR Sepharose 6MB (16) for 3 h at room temperature in DNA binding buffer (DBB: 25 mM HEPES-KOH, pH 7.6, 12.5 mM MgCl₂, 0.1 M KCl, 20% glycerol, and 0.1% Nonidet P-40) plus 0.1% bovine serum albumin and 10 μg/ml poly(dI-dC) in a final volume of 0.5 ml. The unbound phage were removed, and the beads were washed three times with 1 ml of DBB in a microcentrifuge tube. After transfer to a 1.5-ml microcolumn (Bio-Rad), the beads were subjected to three 1-ml washes. The bound phage were then eluted successively with 0.2 ml of DBB containing increasing concentrations of KCl: once each with buffers containing 0.1 M, 0.4 M, and 0.7 M KCl and three times each with buffers containing 1.0 M and 2.0 M KCl. The eluted phage fractions were titrated, pooled, and amplified. The second to the fifth rounds of selection were carried out under the same conditions except that the input phage and incubation time were gradually reduced to 10⁹ TU and 20 min, respectively. Only phage amplified from the 1.0 M and 2.0 M KCl eluates were used in the sixth panning, where the KCl concentration in the binding buffer was raised to 0.4 M. In the last selection, the KCl concentration was further raised to 0.7 M. DNA from individual phage clones was extracted and sequenced as described (18,19).
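The step elution described above can be written down as a small schedule, which makes the number of fractions and the total elution volume per round explicit. The following is a minimal sketch, not part of the original protocol: the function names and the 1.0 M cutoff used to summarize "high-salt" recovery are illustrative assumptions, and titers must be supplied by the user.

```python
# Step-elution schedule taken from the text:
# (KCl concentration in M, number of successive 0.2-ml elutions).
ELUTION_STEPS = [(0.1, 1), (0.4, 1), (0.7, 1), (1.0, 3), (2.0, 3)]
FRACTION_VOLUME_ML = 0.2


def elution_fractions(steps=ELUTION_STEPS):
    """Return the KCl concentration of every eluted fraction, in order."""
    return [kcl for kcl, repeats in steps for _ in range(repeats)]


def total_elution_volume_ml(steps=ELUTION_STEPS, volume_ml=FRACTION_VOLUME_ML):
    """Total buffer volume used for elution in one round of panning."""
    return round(len(elution_fractions(steps)) * volume_ml, 3)


def high_salt_share(titers_tu, steps=ELUTION_STEPS, cutoff_m=1.0):
    """Share of eluted TU recovered at or above cutoff_m KCl.

    titers_tu holds one measured titer per fraction, in elution order.
    The cutoff is an assumption chosen for illustration only.
    """
    fractions = elution_fractions(steps)
    if len(titers_tu) != len(fractions):
        raise ValueError("expected one titer per eluted fraction")
    total = sum(titers_tu)
    if total == 0:
        raise ValueError("no phage recovered")
    high = sum(t for t, kcl in zip(titers_tu, fractions) if kcl >= cutoff_m)
    return high / total
```

With this schedule each round yields nine fractions and 1.8 ml of eluate; tracking the high-salt share across rounds is one simple way to express a shift of the elution profile toward higher salt.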
Filter Binding Assays-A concatemerized MAR probe (25)₅ (27) was labeled by filling in the BamHI and HindIII ends with Klenow enzyme to a specific activity of 5-10 × 10⁹ cpm/μg with [α-³²P]dCTP. About 10⁹ TU of phage were incubated with 0.1 ng of the probe in a final volume of 40 μl of DBB containing 0.1% bovine serum albumin and 10 μg/ml poly(dI-dC) for 1 h at room temperature. The reaction mixture was spotted onto a nitrocellulose filter in a dot blot apparatus. The filters were washed twice with 200 μl of DBB and exposed to x-ray films (Fig. 2). Alternatively, the level of binding was quantitated by counting each spot in a liquid scintillation counter (Table I). In competition assays, specific or nonspecific competitor DNA was added as indicated at the same time as the probe. For reduction of disulfide bonds, phage were preincubated with dithiothreitol (5 mM final) for 15 min before adding the probe.
Chimeric SATB1 and Mutagenesis-The SATB1 fragment coding for the MAR binding domain (amino acids 346-495) (MD-wt) as well as the fragment containing the NH₂-terminal deletion (amino acids 367-495) (MD-Δ367) were cloned into pGEX-2T as described previously (17). Two complementary 63-nucleotide oligonucleotides spanning the coding region of the amino-terminal end of the MAR binding domain (amino acids 346-367) were synthesized in which the wild-type sequence RSMNKPLEQ (amino acids 355-364) was replaced with the C1 peptide sequence RQNWGLEG. The annealed oligonucleotides were ligated to MD-Δ367 (amino acids 367-495) to yield the plasmid encoding the chimeric protein. Mutations of the conserved residues in the chimeric proteins were obtained by changing the codons for the underlined amino acids in C1 simultaneously (mut RNE-AAA) or individually (mut R-A, mut N-A, and mut E-A). The glutathione S-transferase-fusion proteins were purified as described (17).

RESULTS
A library containing about 10⁹ independent clones was constructed by inserting CX₉ peptides (C: cysteine; X: any amino acid) into the pIII protein of the fUSE5 vector (18,19). The inclusion of the cysteine was intended to facilitate the selection of peptides containing cysteine pairs, should the binding activity studied require a cyclic structure. Such peptides can potentially cyclize and improve the affinities for target binding (28,31). Phage particles were incubated with MAR covalently coupled to Sepharose (27) under conditions developed for purifying DNA-binding proteins (29,30). The DNA consisted of concatemerized repeats of a 25-base pair SATB1 recognition sequence (5′-TCTTTAATTTCTAATATATTTAGAA-3′), which is derived from the core unwinding element of the MAR downstream of the mouse immunoglobulin heavy chain enhancer (14). After washing, the bound phage were eluted with a KCl gradient in DNA binding buffer, amplified, and reapplied to the column for further selection.
Following five rounds of such selection, the retention efficiencies on the MAR column, measured as transducing units (TU) of eluted phage, increased more than 300-fold compared with plain Sepharose or Sepharose coupled with nonspecific herring sperm DNA. Concomitantly, with each successive round of selection the elution profile shifted toward higher salt concentrations (Fig. 1), reflecting the enrichment of phage with high affinity. In the sixth and seventh rounds of panning, the KCl concentration in the incubation buffer was raised from 0.1 M to 0.4 M and 0.7 M, respectively; this led to a further shift in the elution profile, with most of the phage eluting at 2 M KCl (Fig. 1). Table I lists the amino acid sequences displayed on these phage and their MAR binding activity determined by a filter binding assay. After seven rounds of selection, a single peptide, CRQNWGLEGC (C1), predominated, accounting for over 60% (26/42) of all the phage clones sequenced, and exhibited strong MAR binding activity. Sequence alignment revealed a similarity of C1 with SATB1 at the NH₂-terminal (four identities and one similarity among 8 residues) and COOH-terminal arms (two identities and one similarity among 8 residues) of the MAR binding domain (Fig. 2 and Ref. 17). Two other clones (C2 and C3) were similar to C1 and SATB1 in that they both contained charged residues at the conserved positions. Clone 4 (C4) was identified four times and showed some similarities to the NH₂-terminal arm of SATB1 adjacent to the C1 similarity region, but not to the COOH-terminal arm (not shown). The remaining clones, including clone 5 (C5), which was repeated three times, did not show discernible similarities to SATB1. Except for C1 and C4, all other clones showed low levels of binding, but still above those of control phage.
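The identity count quoted for the NH₂-terminal arm can be checked with a few lines of code. The sketch below is illustrative only: the native segment is the wild-type sequence given under "Experimental Procedures", the C1 sequence omits the two flanking cysteines, and the position of the single gap is an assumption chosen to reproduce the reported count, not the alignment actually drawn in Fig. 2.

```python
def identity(seq_a, seq_b, gap="-"):
    """Count identical residues between two pre-aligned sequences.

    Columns containing a gap in either sequence are skipped.
    Returns (identities, aligned_residues, percent_identity).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(seq_a, seq_b) if gap not in (a, b)]
    if not columns:
        raise ValueError("no aligned residues")
    matches = sum(a == b for a, b in columns)
    return matches, len(columns), 100.0 * matches / len(columns)


# C1 without its flanking cysteines; the gap position is an assumption.
C1_ALIGNED = "RQ-NWGLEG"
# Wild-type SATB1 segment replaced in the chimera (from the text).
SATB1_NH2_ARM = "RSMNKPLEQ"

matches, aligned, percent = identity(C1_ALIGNED, SATB1_NH2_ARM)
print(matches, aligned, percent)  # 4 8 50.0
```

Under this assumed gap placement the identical positions are Arg, Asn, Leu, and Glu, which gives four identities among eight aligned residues, that is, 50% identity.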
All the clones had a second cysteine, suggesting a preference for cyclized peptides. The conformational constraint provided by the disulfide bond appeared to be important for the binding, as reduction of the bond greatly decreased MAR binding by C1 (Fig. 3A) and by other clones (not shown) in the filter binding assay. To test the specificity of MAR binding by the predominant C1 clone, an unrelated DNA sequence was used as a probe, and no binding was detected (Fig. 3B). A 100-fold molar excess of unlabeled wild-type MAR completely inhibited C1 binding, whereas nonspecific herring sperm DNA had no effect even at much higher concentrations (Fig. 3C). However, C1 binding was also inhibited by unlabeled mutated MAR (5′-TCTTTAATTTCTACTGCTTTAGAA-3′), which has lost the base unpairing capability and is not bound by SATB1, even though it is still AT-rich (14). As described below, it appears that the context of the whole DNA binding domain is required for the C1 similarity region in SATB1 to distinguish between the wild-type and mutated MARs.
The importance of the native SATB1 sequence mimicked by the peptides for MAR binding was examined by deletion mutagenesis. Partial truncation of the C1 region in the SATB1 DNA binding domain greatly reduced MAR binding activity (Fig. 4A, MD-Δ361), but additional truncations, MD-Δ367 (17) and MD-Δ369, which deleted most, or all, of the C4 similarity region caused no further decrease in MAR binding activity. It may be that the C4 similarity region is a minor contributor to MAR binding in SATB1 and that the C4 peptide represents a more potent version of that minor binding site. Similar enhancement of weak binding sites by peptide mimics selected from libraries has been seen with an integrin (28). Because of the apparent minor role of the C4 region for SATB1 binding activity, this region was not studied further.
To test the relevance of the similarity between the C1 peptide and the two arms of the SATB1 DNA binding domain, we replaced the native sequence (amino acids 355-362) with the C1 peptide and its variations (Fig. 4B). The resultant chimeric SATB1 MAR binding domains were expressed as glutathione S-transferase-fusion proteins. The C1 peptide chimera (MD-C1) yielded a gel shift pattern virtually identical to that of the wild-type DNA binding domain (MD-wt) (Fig. 5) and exhibited similar affinity to the wild-type MAR probe (Fig. 4B). In contrast, a deletion mutant lacking the amino-terminal 21 residues (MD-Δ367) and a protein produced by inserting a random peptide (MD-XX) had greatly reduced MAR binding activities (Figs. 4 and 5). Furthermore, like MD-wt, the MD-C1 chimera discriminated between the wild-type and mutated probes, since a 200-fold molar excess of unlabeled wild-type MAR completely inhibited binding, whereas the mutated MAR had virtually no effect (Fig. 5). Thus, MD-C1 confers specific binding to the AT-rich sequence with high unwinding capability.
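The affinity comparison in Fig. 4B rests on relative dissociation constants, estimated from the amount of protein needed to shift 50% of the probe and normalized to the wild-type construct. The sketch below shows one way such a number could be derived from a titration series. It is an illustration under stated assumptions, not the procedure of Ref. 14: linear interpolation between neighboring titration points is assumed, and under protein excess with simple 1:1 binding the half-shift concentration is taken as an approximation of K_D. All function names are hypothetical.

```python
def half_shift_concentration(protein_nm, fraction_shifted):
    """Protein concentration giving a 50% probe shift.

    Uses linear interpolation between the two titration points that
    bracket 0.5 (an assumption made for illustration).
    """
    points = sorted(zip(protein_nm, fraction_shifted))
    for (c0, f0), (c1, f1) in zip(points, points[1:]):
        if f0 <= 0.5 <= f1 and f1 > f0:
            return c0 + (0.5 - f0) * (c1 - c0) / (f1 - f0)
    raise ValueError("titration does not bracket a 50% shift")


def relative_kd(construct, wild_type):
    """Half-shift concentration of a construct normalized to wild type.

    Each argument is a (protein_nm, fraction_shifted) pair of sequences.
    A value above 1 indicates weaker binding than the wild-type domain.
    """
    return (half_shift_concentration(*construct)
            / half_shift_concentration(*wild_type))
```

A construct that needs ten times more protein than MD-wt to shift half of the probe would have a relative K_D of 10 under these assumptions.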

FIG. 2. Alignment of the C1 phage-derived peptide to the NH₂- and COOH-terminal arms of the MAR binding domain of SATB1.
The amino acid positions of the native SATB1 are shown. The MAR binding domain of mouse SATB1 was defined as a 150-amino acid region (amino acids 346-495) within the 764-amino acid coding sequence (17). Sequence identities and similarities are indicated by solid and broken rectangles, respectively. Gaps are indicated by bars. Alignment between the phage-derived peptides and SATB1 is shown with thick lines. Alignment restricted between the NH₂- and COOH-terminal arms of the native SATB1 protein is shown with thin lines. Since the two cysteines in the peptide displayed on the phage are likely to play a structural role (see text), they are not included in the alignment.

FIG. 4. A, truncations of the C1 similarity region greatly reduce MAR binding. A series of deletion mutants of the SATB1 MAR-binding domain was generated by polymerase chain reaction and expressed in bacteria as glutathione S-transferase fusion proteins. The C1 and C4 phage peptide similarity regions are indicated by a thick solid or a thin broken bar, respectively, above the diagram of the wild-type construct. B, the residues conserved between the C1 peptide and SATB1 are required for MAR binding. Depicted is the schematic representation of the native MAR binding domain of SATB1 (MD-wt) and the various constructs derived from it (see text for details of construction). Purified proteins were subjected to quantitative gel mobility shift assays as described previously (14). Dissociation constants (K_D) were estimated from the amount of protein required for a 50% shift of the probe under conditions of protein excess and normalized to that of the wild-type construct. The relative K_D values are shown on the right.

TABLE I MAR binding peptide sequences
Phage from a nonamer random peptide display library were subjected to seven rounds of selection on MAR-Sepharose. The frequency of occurrence for each sequence and DNA binding activities as measured in the filter binding assay are shown. The underlined C corresponds to the constant C in the CX₉ library. Boldface indicates conserved residues. Two phage clones from two unrelated selection experiments were used as negative controls (Ctl 1 and Ctl 2).

The amino acids in the C1 peptide critical for MAR binding were studied by point mutations. Simultaneous mutations of the three conserved residues Arg, Asn, and Glu to Ala in MD-C1 (mut RNE-AAA) reduced the affinity of the resultant chimera for MAR to the level of the deletion mutant MD-Δ367 (Fig. 4, A and B). Mutation of either Arg (mut R-A) or Glu (mut E-A) alone was sufficient to reduce binding to the same level, whereas a mutation in Asn (mut N-A) had an intermediate effect. These results show that specific amino acids, Arg and Glu, that are conserved among C1, the NH₂-terminal arm, and the COOH-terminal arm of the SATB1 MAR binding domain are critical for MAR recognition. Thus, consistent with the predominance of C1 phage after stringent selection, results from peptide swapping, point mutations, and deletion mutations all suggest that the C1 similarity region in SATB1 is a key DNA binding site.

DISCUSSION
Using a random peptide phage library without any built-in structure, we have isolated a predominant peptide (C1) of eight amino acids which is similar to the sequences found in SATB1 located within the NH₂- and COOH-terminal regions of the 150-amino acid MAR binding domain. Several lines of evidence suggest that this peptide represents a MAR binding motif. First, phage displaying this peptide can directly bind both to the MAR immobilized on an affinity column and to the radiolabeled MAR in solution. More importantly, it could functionally replace the similar native sequence in a chimeric SATB1 construct without loss of binding affinity or specificity. In contrast, either deletion of this SATB1 sequence or its replacement with a random sequence markedly reduced the MAR binding activity of the mutated SATB1. Finally, when amino acids conserved between the phage-derived peptide and SATB1 were singly mutated to Ala, the MAR binding was greatly reduced.
Because both the NH₂- and COOH-terminal arms of the MAR binding domain are required for MAR binding, one of the models proposed earlier is that the two arms may converge at the surface of SATB1 to form a symmetric binding structure capable of grasping DNA (17). Consistent with this model, we have demonstrated here that a subsequence of no more than 9 amino acids within these arms can directly interact with DNA, as the predominant peptide selected for MAR binding is similar to both arms of the SATB1 DNA binding domain and can functionally replace the similarity region. Thus, SATB1 may be similar to TATA-binding protein in that the critical DNA binding activities reside in two similar portions of the same molecule (32-34).
SATB1 is the first non-ubiquitous MAR-binding protein cloned to date which is predominantly expressed in a specific tissue, thymus. The observation that SATB1 is expressed in thymocytes during specific stages of T-cell development suggests that this protein may play a role in T-cell maturation. In collaboration with M. Sikorska, we have found that SATB1 may be involved in thymocyte apoptosis, for it is proteolytically cleaved very early after the apoptotic induction of thymocytes. The nuclear matrix protein composition varies depending on cell types (35), suggesting the existence of other cell type-specific MAR-binding proteins like SATB1. The DNA binding sequence of SATB1 may facilitate the identification of MAR binding potential in other proteins and may define a new class of DNA binding motifs.
Our results demonstrate that small peptides displayed on the surface of filamentous phage can have high affinity for specific DNA sequences and can allow identification of a specific DNA binding motif within such proteins. Recently, three zinc fingers of the Zif268 protein were expressed on the phage surface, and randomization of the DNA contacting residues was used to select for zinc fingers with new DNA binding specificities (36-38). Rather than using long peptides with built-in structures, we have demonstrated that a random short-peptide phage library can be effective in searching for potential DNA binding sites in DNA-binding proteins for which the DNA binding domains are unknown or have not been completely characterized. Structure/function studies of DNA binding peptides identified from peptide libraries may present new opportunities for drug discovery.

FIG. 5. Gel mobility shift assays were carried out by incubating identical concentrations (16 nM) of purified bacterial SATB1 fusion proteins (Fig. 4) with ³²P-labeled MAR (25)₅. For competition assays, unlabeled wild-type MAR (25)₅ (wt) or mutant MAR (24)₈ (mut) was used at 200-fold molar excess. Gel mobility shift assays were carried out according to the legend to Fig. 4.