Protein Footprinting Reveals Specific Binding Modes of a High Mobility Group Protein I to DNAs of Different Conformation*

The high mobility group proteins I and Y (HMGI/Y) are abundant components of chromatin. They are thought to derepress chromatin, affect the assembly and activity of the transcriptional machinery, and associate with constitutive heterochromatin during mitosis. HMGI/Y protein molecules contain three potential DNA-binding motifs (AT-hooks), but the extent of contacts between DNA and the entire protein has not been determined. We have used a protein-footprinting procedure to map regions of the Chironomus HMGI protein molecule that are involved in contacts with DNA. We find that in the presence of double-stranded DNA all AT-hook motifs are protected against hydroxyl radical proteolysis. In contrast, only two motifs were protected in the presence of four-way junction DNA. Large regions that flank the AT-hook motifs were found to be strongly protected against proteolysis in complexes with interferon-β promoter DNA, suggesting that amino acid residues outside the AT-hooks contribute considerably to DNA binding.

The properties of the insect cHMGI protein resemble the major structural features of the mammalian HMGI and Y proteins. It contains three DNA-binding motifs (AT-hooks) (7) and a negatively charged COOH-terminal domain (3) and has a similar charge distribution in the regions flanking the AT-hooks (8). Both human and Chironomus HMGI/Y proteins bind preferentially to four-way junction DNA (3) and to AT-tracts of double-stranded DNA (3,9). Their binding alters the DNA conformation (10-12) and unbends intrinsically bent DNA (13). Moreover, these proteins from evolutionarily distant organisms are substrates of Cdc-2 kinase (8,14), mitogen-activated protein kinase (8), and protein kinase C (8,15). Phosphorylation reduces their DNA-binding affinity (14-16) and alters the mode of binding to DNA (8).
Diverse biological functions for the mammalian proteins have been suggested. Initially the HMGI/Y proteins were considered as specific components of constitutive chromatin (17,18). Further studies revealed that they are involved in the modulation of transcription of specific genes (for review see Ref. 19). Studies on the human interferon-β (IFN-β) gene and the gene encoding the α-subunit of the interleukin 2 receptor showed transcriptional activation upon binding of HMGI(Y) proteins to positive regulatory domains (PRD), which facilitate binding of transcription factors (20-22). More recently, HMGI/Y proteins were identified as components of a repressor complex that inactivates the promoter of the T cell receptor α-chain gene (23) and as crucial host proteins in the HIV-1 preintegration complex (24). Cytological studies of insect polytene chromosomes have demonstrated that the cHMGI protein is present in many transcriptionally active loci and in nucleoli, suggesting that HMGI/Y proteins are involved in polymerase I and II transcription (25). They are highly abundant in undifferentiated and rapidly dividing cells (25,26). Elevated levels of HMGI/Y in differentiated tissues have been found to be correlated with progressive and neoplastic transformations (27,28). Disruption or rearrangements of their genes lead to tumorigenesis (29,30).
Whereas different biological effects of HMGI/Y proteins have been described, the molecular and biochemical mechanism by which they affect DNA and chromatin structure is not well defined. The spatial organization of the HMGI/Y·DNA complex remains only partially understood. NMR data of a complex of a truncated form of the protein and a DNA dodecamer show that the central part of the AT-hook domain, the Arg-Gly-Arg motif, interacts with the bases and the sugar within the minor groove of the DNA double helix (31). In addition, several residues flanking this motif interact with the sugar-phosphate backbone and are responsible for the strength of the overall protein-DNA binding (31). Moreover, the central AT-hook motif mediates specific DNA binding and cooperates with the other two AT-hooks (32). Here we report the mapping of the regions of the cHMGI protein involved in contacts with various types of DNA, including linear synthetic poly(dA-dT)·poly(dA-dT), four-way junction DNA, and a region of the promoter of the IFN-β gene. The mapping was performed by means of the protein hydroxyl radical footprinting technique (33). The data presented show that the interaction of cHMGI protein with DNA involves residues of two or three AT-hook motifs, depending on the DNA type and/or the protein to DNA ratio. Large regions flanking these motifs also contribute to the binding of cHMGI to DNA.
[...] protein were phosphorylated at Ser 3 (8) at 30°C with 8 units of recombinant human Cdc-2 kinase (New England Biolabs Inc.) for 4 h in the presence of 3.5 mM ATP and 100-150 µCi of [γ-32P]ATP in 8 µl of Cdc-2 kinase buffer containing 50 mM Tris/HCl, 10 mM MgCl2, 1 mM dithiothreitol, 1 mM EGTA, pH 7.5. The reaction was stopped by precipitation of the proteins with 30% (w/v) CCl3COOH for 30 min at 0°C. The pellet was washed with 30% CCl3COOH, with 0.2% HCl in acetone, and twice with pure acetone, and then dried.
DNA and Oligonucleotides-The synthetic linear poly(dA-dT)·poly(dA-dT) DNA was from Amersham Pharmacia Biotech. The approximate average length of this DNA was 5000 bp. The 34-bp fragment of the promoter of the IFN-β gene containing the PRDII/NRDI sites was prepared from synthetic oligonucleotides. The sequence of the top strand was 5'-GAAGTGAAAGTGGGAAATTCCTCTGAATAGAGAG-3' (PRDII site is underlined). Four-way junction c was prepared according to Bianchi (36).
Hydroxyl Radical Footprinting-100 pmol of the radioactively end-labeled protein (15,000-30,000 cpm) were incubated in the presence or absence of DNA in 257 mM NaCl and 14.3 mM MOPS, pH 7.2, buffer at room temperature for 15 min. The chemical digestions were started by sequential addition of 1 µl each of the following freshly prepared solutions: (i) 20 mM EDTA and 10 mM (NH4)2Fe(II)(SO4)2; (ii) 0.2 M sodium ascorbate; and (iii) 0.375% H2O2. If not specified otherwise, the reactions were stopped after 40 min by addition of 3.3 µl of 4-fold SDS sample buffer (4% SDS, 16% glycerol, 25 mM Tris/HCl, pH 6.8, 6% β-mercaptoethanol, and 0.01% bromphenol blue). The reaction products were separated on 16.5% polyacrylamide gels using the Tricine-glycine-SDS buffer system (37). The gels were dried and scanned with a PhosphorImager (Molecular Dynamics).
Size Markers and Assignment of the Hydroxyl Radical Cleavage Sites-Size markers were obtained by limited digestions of the end-labeled cHMGI protein with trypsin or proteinase Glu-C (V8). 100 pmol of end-labeled cHMGI were digested with 17 ng of trypsin in 10 µl of 180 mM NaCl, 20 mM Tris/HCl, pH 7.5, at 0°C for 5 min. Reactions were stopped by addition of 1 µl of 0.14 mM Nα-p-tosyl-L-lysine chloromethyl ketone (TLCK). The cleavage with proteinase Glu-C (V8) was carried out in the presence of 50 ng of enzyme in 25 mM sodium phosphate, pH 7.8, and 180 mM NaCl at 0°C for 2 min. Reactions were stopped by addition of SDS sample buffer and immediate boiling of the sample. The end-labeled peptides 1-48 and 1-6 were obtained by cleavage of the cHMGI protein with hydroxylamine (38) and trypsin (8), respectively. The peptide 1-32 was synthesized as described previously (8). The hydroxyl radical fragments were assigned using a standard curve.
Data Analysis-The phosphorimages were analyzed essentially according to Heyduk et al. (39) and Baichoo and Heyduk (40). Briefly, phosphorimages of the full lane width were scanned and the intensities were plotted versus mobility (ImageQuant Software, Molecular Dynamics). The intensity plots were aligned to correct distortions between different lanes using ALIGN software (gift from Dr. T. Heyduk, St. Louis, MO). The aligned intensity plots were imported into Excel (Microsoft), and gel-loading efficiencies and the extent of cleavage were normalized. The electrophoretic mobilities were transformed into amino acid residue positions, and mean values for each position were calculated. Finally, each amino acid residue position was compared in a difference plot: $\Delta_{\mathrm{norm}} = (I_{\text{without DNA}} - I_{\text{with DNA}}) / I_{\text{without DNA}}$, where $\Delta_{\mathrm{norm}}$ is the normalized difference, $I_{\text{without DNA}}$ is the mean value of the corrected phosphorimager intensity of a single residue position measured in the absence of DNA, and $I_{\text{with DNA}}$ is the mean value of the corrected phosphorimager intensity of the same position measured in the presence of DNA. Cutting frequencies within tryptic protein footprints were calculated by integrating the background-subtracted intensities of the bands (ImageQuant). During electrophoresis, digestion products shorter than 7 or longer than 90 residues were not resolved. Therefore, the difference plots were calculated excluding these regions (39).
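The difference-plot calculation described above can be sketched in a few lines of Python. This is only an illustration, not the ALIGN/Excel workflow used in the study: the function name `normalized_difference` and the example intensities are hypothetical, and the input values are assumed to be already aligned, normalized, and averaged per residue position.

```python
def normalized_difference(i_without_dna, i_with_dna):
    """Return (I_without - I_with) / I_without for each residue position.

    Positive values mean less cleavage with DNA (protection);
    negative values mean more cleavage with DNA (exposure).
    """
    if len(i_without_dna) != len(i_with_dna):
        raise ValueError("both intensity profiles must cover the same positions")
    deltas = []
    for without, with_dna in zip(i_without_dna, i_with_dna):
        if without <= 0:
            raise ValueError("intensity without DNA must be positive")
        deltas.append((without - with_dna) / without)
    return deltas


# Hypothetical mean intensities for three residue positions (illustrative only).
i_without = [100.0, 80.0, 50.0]
i_with = [60.0, 80.0, 65.0]
deltas = normalized_difference(i_without, i_with)
print(deltas)
```

With this sign convention, the first position in the example would count as protected, the second as unchanged, and the third as more exposed in the presence of DNA.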

RESULTS
Experimental Strategy-Electrophoretic analysis of the products of a limited digestion of a protein labeled at its NH2 terminus yields a characteristic pattern. With DNA-binding proteins, changes in the electrophoretic pattern can be observed. In the presence of DNA, the disappearance or fading of bands in defined regions (footprints) can be related to a protection of the protein at sites contacting the DNA. To map the DNA-binding regions of the Chironomus HMGI protein, the protein was (i) radioactively phosphorylated at Ser 3 and (ii) partially digested with sequence-specific proteinases or hydroxyl radicals in the presence or absence of DNA; (iii) the digestion products were separated on polyacrylamide gels; and (iv) the gels were subjected to quantitative scanning and objective data analysis, i.e. corrections for gel loading and cleavage efficiency and the transformation of the electrophoretic mobility of the bands into residue numbers (39,40).
Cleavage Conditions and Assignment of the Bands-Phosphorylation of the cHMGI protein at Ser 3 could be used as an end-labeling procedure because this is a unique target of Cdc-2 kinase, and because its modification does not change the DNA binding properties of the protein (8). Limited digestion of the labeled cHMGI by proteinase Glu-C or trypsin followed by electrophoresis yielded patterns in which the individual bands could be assigned to peptides of defined lengths (Fig. 1, A, Glu-C and trypsin, and B). Application of end-labeled peptides 1-48, 1-32, and 1-6 facilitated this assignment. Nonlinear regression enabled transformation of the relative mobilities of the cleavage products into residue sites within the protein (Fig. 1B).
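The conversion of relative mobility into residue position can be illustrated with a simple standard curve. The text does not state the functional form of the nonlinear regression, so the sketch below assumes, purely for illustration, that log10(fragment length) falls linearly with relative mobility. The function names and the marker mobilities are hypothetical; only the fragment lengths are taken from cleavage sites named in the text.

```python
import math


def fit_log_linear(markers):
    """Least-squares fit of log10(length) = a + b * mobility.

    markers: list of (relative_mobility, fragment_length) pairs.
    The log-linear form is an assumption made for this sketch.
    """
    n = len(markers)
    xs = [mobility for mobility, _ in markers]
    ys = [math.log10(length) for _, length in markers]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def mobility_to_residue(mobility, a, b):
    """Estimate the fragment length (residue position) for a relative mobility."""
    return 10 ** (a + b * mobility)


# Fragment lengths correspond to marker peptides ending at residues 85, 76,
# 54, 32, and 6; the mobilities are hypothetical values on the 0-to-1 scale.
markers = [(0.15, 85), (0.22, 76), (0.40, 54), (0.62, 32), (0.95, 6)]
a, b = fit_log_linear(markers)
print(round(mobility_to_residue(0.50, a, b)))
```

Any other monotonic calibration function could be substituted in `fit_log_linear`; the point is only that each band's mobility is mapped to a residue number through a curve fitted to markers of known length.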
In the absence of DNA, trypsin cut the cHMGI protein preferentially at five positions, amino acid residues 27/28, 31/32, 53/54, 76, and 84/85 (two adjacent lysine residues are present in four of the sites, and therefore the precise position of cleavage is uncertain). In the presence of poly(dA-dT)·poly(dA-dT) DNA, some bands nearly disappeared, whereas the intensity of others was only slightly reduced (Fig. 1A, lanes marked trypsin). In particular, the cutting frequencies at the tryptic cleavage sites 53/54 and 76 were strongly reduced, whereas the cutting frequencies at positions 27/28 and 84/85 were increased (Fig. 1C). The percentage of cleavage at site 31/32 remained almost unchanged. These results show that regions corresponding to the second and the third AT-hook motif in the protein were protected after DNA binding, whereas other regions were more exposed. Since trypsin (like other proteinases), because of its relatively large size, does not gain access to all cleavage sites of a protein and is able to cleave the peptide backbone only at certain positions, these data provide only preliminary information on the structure of the cHMGI·DNA complex.
To obtain more precise information on regions of the protein that contact the DNA, hydroxyl radicals were used as a chemical proteinase (33,39,41).
The hydroxyl radical footprinting patterns exhibited relatively small changes after binding to DNA and were therefore difficult to analyze by visual comparison (Fig. 1A, hydroxyl radical). A quantitative analysis of the data showed that defined regions were protected from chemical proteolysis, whereas others were not. A comparison of the mean intensities (radioactivity) measured in the lanes (from Fig. 2A) is shown in Fig. 2C. The plot of normalized differences measured between corresponding lanes with and without poly(dA-dT)·poly(dA-dT) DNA shows the protected and exposed regions of the DNA-bound protein (Fig. 2D). The protected and the exposed regions found by this procedure matched those found in the enzymatic approach (compare Fig. 1C and Fig. 2D). The sizes and positions of the DNA-protected areas were constant over a wide range of digestion times (Fig. 2, C and D). For further experiments, 40 min of digestion were selected since under these conditions the ratio of small to long fragments is well balanced (Fig. 2C, 40 min), and 55-70% of the protein remained uncleaved (Fig. 2B), suggesting conditions of single cleavage (42). Furthermore, simultaneous DNA digestion by hydroxyl radicals was low under these conditions (not shown).
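One way to see why 55-70% uncleaved protein points to single-cleavage conditions is a short Poisson estimate. The sketch below assumes that cuts are distributed randomly and independently among protein molecules; that assumption and the function name `cleavage_statistics` are ours and are not taken from the text or from Ref. 42.

```python
import math


def cleavage_statistics(fraction_uncleaved):
    """Estimate cleavage statistics under an assumed Poisson model.

    Returns (mean cuts per molecule, share of cleaved molecules cut exactly once).
    """
    if not 0.0 < fraction_uncleaved < 1.0:
        raise ValueError("fraction_uncleaved must lie strictly between 0 and 1")
    # Poisson: P(0 cuts) = exp(-mean), so mean = -ln(P(0 cuts)).
    mean_cuts = -math.log(fraction_uncleaved)
    # Poisson: P(1 cut) = mean * exp(-mean) = mean * P(0 cuts).
    p_single = mean_cuts * fraction_uncleaved
    p_cleaved = 1.0 - fraction_uncleaved
    return mean_cuts, p_single / p_cleaved


for uncleaved in (0.55, 0.70):
    mean_cuts, single_share = cleavage_statistics(uncleaved)
    print(f"{uncleaved:.2f} uncleaved: {mean_cuts:.2f} cuts per molecule, "
          f"{single_share:.0%} of cleaved molecules cut once")
```

Under this model the large majority of cleaved molecules carry a single cut across the stated range, which is the condition needed for each labeled fragment to report one cleavage site.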
In the Presence of poly(dA-dT)·poly(dA-dT) DNA, Three AT-hook Domains Are Protected from Digestion-Earlier work has revealed that mammalian HMGI protein binds strongly to poly(dA-dT)·poly(dA-dT) (9). Because this synthetic DNA is also a good ligand of cHMGI, we chose it for our experiments. cHMGI was footprinted in the presence of concentrations of poly(dA-dT)·poly(dA-dT) ranging from 16 to 120 bp per protein molecule (Fig. 3, A and B). At lower ratios (16 bp:1 and 32 bp:1) protection was observed at amino acid residues 10-22 and 54-60. In addition, protection at amino acid residues 30-40 was detected. The region between residues 10 and 20 corresponds to part of the first AT-hook sequence motif and the adjacent NH2-terminal stretch. The region 53-60 constitutes the second AT-hook motif. At DNA to protein ratios of 60 bp:1 (Fig. 3B) and 120 bp:1 (not shown) an additional region, amino acid residues 75-81, which comprises the third AT-hook, was protected. Under these conditions, protection of region 30-40 was no longer observed, and in contrast to lower DNA to protein ratios, a large portion of the protein (residues 29 to 50) was found to be highly susceptible to digestion and should therefore be exposed to the solvent. At all concentrations of DNA used, an enhanced susceptibility to digestion was observed for the negatively charged COOH terminus of the protein.
FIG. 1. Enzymatic protein footprint and assignment of the bands. A, 100 pmol of cHMGI 32P-labeled on Ser 3 were digested with proteinase Glu-C, trypsin, or hydroxyl radicals in the absence or presence of 6 nmol (bp) of poly(dA-dT)·poly(dA-dT) (i.e. 60 bp per molecule of cHMGI). The digestion products were separated electrophoretically, and the dried gels were scanned. The numbers indicate the amino acid residue positions of the cleavage sites of Glu-C and trypsin digestion. B, plot of the size of the peptide markers versus relative mobility. Relative mobility of the uncleaved cHMGI protein was defined as 0 and that of the most rapidly migrating band of hydroxyl radical cleavage as 1. End-labeled peptides were generated by digestion of the protein with trypsin (closed circles), proteinase Glu-C (open circles), and hydroxylamine (triangle). The presented data are mean values from four independent experiments. The curve is a nonlinear regression fit. C, tryptic cutting frequencies observed within the cHMGI protein in the presence (gray bars) or absence (filled bars) of poly(dA-dT)·poly(dA-dT). The DNA and protein concentrations were the same as used in A.
Binding of cHMGI to Oligonucleotide with Binding Site of Human HMGI Involves the Three AT-hooks and Regions Flanking AT-hooks Two and Three-In vivo binding sites of the insect HMGI on the DNA are not known. Because most properties of this protein are identical or similar to those of mammalian HMGI, we analyzed the binding determinants of cHMGI in complexes with a DNA sequence comprising a mammalian ligand. A 34-bp DNA carrying the PRDII/NRDI elements of the promoter region of the human IFN-β gene was selected for the experiments. The stoichiometry of cHMGI binding to this DNA is 1:1 (Footnote 2). Footprinting revealed protection of residues 46-82 and 8-10 (Fig. 3, C and D). The NH2-terminal region corresponds to part of the first AT-hook motif, whereas the large protected region comprises the other two AT-hooks and areas flanking these motifs. Regions of cHMGI rich in negatively charged residues (20-44 and 85-90) were, as in the complex with poly(dA-dT)·poly(dA-dT), found to be cleaved more frequently than without DNA and are therefore probably exposed to solvent in the complex with the DNA.
cHMGI Binding to Four-way Junction DNA Does Not Involve the Third AT-hook Motif-Chironomus and human HMGI proteins specifically recognize cruciform DNA (3). cHMGI was footprinted at different concentration ratios of DNA to protein (1:4, 1:2, 1:1, and 2:1) (Fig. 3E). The results were almost identical under all four conditions and showed protection of residues 9-15 and 51-74. The first region corresponds to part of the first AT-hook motif, whereas the second one comprises the second AT-hook motif and residues between the second and third AT-hooks (Fig. 3F). Interestingly, the third AT-hook was not protected in the presence of the cruciform DNA. The regions of amino acid residues 26-34, 38-48, and >82 were found to be more susceptible in the presence of four-way junction DNA and are thus probably exposed within the complex.
FIG. 3. [...] poly(dA-dT)·poly(dA-dT) (A and B), PRDII/NRDI (C and D), and four-way junction DNA (E and F). A and B, 100 pmol of cHMGI 32P-labeled on Ser 3 were digested with hydroxyl radicals in the absence or presence of 1.6, 3.2, 6, and 12 nmol (bp) of poly(dA-dT)·poly(dA-dT), respectively. C-F, the same amount of protein was footprinted in the presence of 50, 100, and 200 pmol of PRDII/NRDI or four-way junction DNA, respectively. Panels A, C, and E, representative phosphorimages of the gels from single experiments. The schematic primary structure of cHMGI with AT-hooks (gray boxes) is shown at the left of each image. Panels B, D, and F, in each case, difference plots show averaged data from comparisons of 12 lanes. Bold lines above the plots indicate regions where the observed protection/exposure was statistically significant according to a Student's t-test (confidence level of 0.95). Only the statistical analysis for the highest DNA concentration used is shown. The schematic primary structure of cHMGI with AT-hooks (gray boxes) is shown in the lower part of panels B, D, and F.
the structure of the histones and HMG1/2 proteins, well characterized folded domains are found in central positions. Another characteristic structural feature shared by these proteins is the presence of large structurally undefined regions, termini, bristles, or tails. Two groups of the HMG proteins, the families HMGI/Y and HMG14/17, are thought to be composed mainly of flexible regions of undefined structure. Binding to DNA induces a spatial ordering of regions containing residues involved in contacting DNA. Binding of the HMGI protein to DNA simultaneously leads to specific ordering of the protein structure (31) and induces changes in the conformation of the DNA (10-13). The contacts of the entire HMGI molecule to DNA in protein-DNA complexes have been unknown. Here we have mapped for the first time the regions of an HMGI protein that are involved in the binding to DNAs of various types.
Our protein footprinting experiments revealed some general and some DNA structure-specific features of the cHMGI·DNA complex (Fig. 4). Binding of cHMGI to poly(dA-dT)·poly(dA-dT), to cruciform DNA, and to the HMGI binding site in interferon-β promoter DNA (PRDII/NRDI) involved contacts by the first and second AT-hook motifs and led to exposure to the solvent of large parts of the region joining these two motifs and of the COOH-terminal acidic tail of the protein. At ratios higher than 30 bp of poly(dA-dT)·poly(dA-dT) per protein molecule, the third AT-hook motif was also protected. This suggests a lower DNA-binding affinity of the third motif as compared with the other two motifs and also that the entire protein occupies more than 30 bp on this DNA. This is reminiscent of the situation observed in human HMGI, where the third AT-hook exhibited a DNA-binding affinity several orders of magnitude lower than that of the second motif (31). Also, an 18-bp-long DNA molecule was found not to accommodate the entire HMGI protein (31). Furthermore, a 27-bp PRDII/NRDI-DNA fragment and a 40-bp PRDI/PRDII/NRDI-DNA fragment were able to bind only one molecule of HMGI-C (45) or HMGI (32), respectively.
Binding of cHMGI protein to the 34-bp PRDII/NRDI-DNA appears to involve all three AT-hook motifs and also residues in front of the second motif and between the second and the third motif. These data suggest that the protein contacts with this DNA are more extensive than those found with poly(dA-dT)·poly(dA-dT) and probably reflect a capability of cHMGI to recognize specific sequence contexts and/or a specific conformation of this DNA, which are absent in poly(dA-dT)·poly(dA-dT). This DNA with alternating A/T residues has a widened minor groove and is conformationally flexible in solution (46), resulting in a weaker binding of the HMGI protein as compared with binding to rigid or intrinsically curved AT-rich DNA (47). The PRDII/NRDI-DNA is intrinsically pre-bent by about 20°, and binding of human HMGI to this DNA induces a partial reversal of the bending (13). A similar unbending of the PRDII/NRDI-DNA was also exhibited by the insect HMGI protein (Footnote 3). Our finding that the insect HMGI protein makes extensive contacts with this specific DNA is in accordance with data from recent NMR studies that suggest extensive contacts between HMGI and PRDII-DNA and have revealed that 19 residues of the 42-residue-long HMGI(2/3) peptide are directly involved in the interaction with a PRDII-dodecamer (31).
In contrast, in complexes with cruciform DNA the third motif of the cHMGI protein remained unprotected, suggesting that this DNA cannot accommodate the entire protein or that the protein has a specific conformation in a complex with the cruciform DNA. Because this type of DNA is able to accommodate up to two HMGI molecules (48), the second possibility appears to be more probable. Interestingly, in this complex the region joining the second and the third motif was protected. It contains another sequence motif, PKRP (Fig. 4), which occurs not only in the HMGI/Y proteins but is also characteristic for proteins of the families HMG14/17 and HMG1/2. Peptides carrying this motif interact specifically with the minor groove of the DNA (35). Because all of these proteins exhibit preferential binding to cruciform DNA, it is possible that this motif plays a role in the recognition of this type of DNA.
Four-way junction DNA has been suggested as a model for DNA in chromatin at the site where it enters and exits the nucleosome (49,50). The binding of HMGI to this DNA may reflect a constitutive function of this protein in the organization of chromatin, in contrast to specific functions such as the organization of promoter complexes of particular genes. We have recently shown that phosphorylation at various residues may be involved in the adaptation of members of the HMGI/Y family to fulfill different cellular functions (8). In further studies, this possibility could be checked by investigating the binding of various phosphoforms of HMGI/Y proteins to DNA as well as to nucleosomes. By means of the protein-footprinting technique applied in this study it would also be possible to map contacts of the cHMGI protein within reconstituted chromatin or its selected components.