HMGB1 interacts with many apparently unrelated proteins by recognizing short amino acid sequences.

The chromatin high mobility group protein 1 (HMGB1) is a very abundant and conserved protein that is structured into two HMG box domains plus a highly acidic C-terminal domain. From the ability to bind DNA nonspecifically and to interact with various proteins, several functions in DNA-related processes have been assigned to HMGB1. Nevertheless, its functional role remains the subject of controversy. Using a phage display approach we have shown that HMGB1 can recognize several peptide motifs. A computer search of the protein data bases found peptide homologies with proteins already known to interact with HMGB1, like p53, and allowed us to identify new potential candidates. Among them, transcriptional activators like the heterogeneous nuclear ribonucleoprotein K (hnRNP K), repressors like methyl-CpG binding protein 2 (MeCP2), and co-repressors like the retinoblastoma susceptibility protein (pRb) and Groucho-related gene proteins 1 (Grg1) and 5 (Grg5) can be found. A detailed analysis of the interaction of Grg1 with HMGB1 confirmed that the binding region contained the sequence homologous to one of the peptides identified. Our results have led us to propose that HMGB1 may play a central role in the stabilization and/or assembly of several multifunctional complexes through protein-protein interactions.

In the eukaryotic cell nucleus of all vertebrate cell types, HMGB1 (formerly named HMG1, see Ref. 1 for a revised nomenclature) is one of the most abundant non-histone proteins. HMGB1 has been shown to be essential because knockout mice die 24 h after birth (2). HMGB1 is highly conserved, particularly in mammals and to a lesser extent throughout the animal kingdom. HMGB1 is structured into three domains, two basic HMG boxes (HMG domains A and B) and a highly acidic C-terminal domain, which confer an overall dipolar appearance to this protein (see Refs. 3-5 for reviews). Each of the HMG boxes is formed by two short and one long α-helix that upon folding produce an L- or V-shaped three-dimensional domain structure (6-8). Whereas the acidic C-terminal domain is presumably involved in the modulation of HMGB1 activity, the HMG box domains allow the protein to bind to linear DNA with moderate affinity and to highly structured (3- and 4-way junction DNA, cruciform DNA) or distorted DNA (bent or kinked DNA, bulged DNA, cisplatin-modified DNA) with higher affinity, but always without sequence specificity. The concave surface of the L- or V-shaped HMG box domain contacts the DNA in the minor groove in two slightly different ways, introducing important modifications in the structure of DNA, in particular a strong bend (reviewed in Ref. 5). Presumably, these features will be of relevance for the biological functions in which HMGB1 has been involved (DNA repair, recombination, replication, and transcription).
The activity of HMGB1 is not solely mediated by its ability to bind to DNA. Indeed, HMGB1 and the related HMGB2 protein can interact through their HMG box domains with a broad range of proteins ranging from nuclear cell proteins to viral proteins. Interactions of HMGB1 have been described with the recombination activation gene protein RAG1 (9), several transcription factors including the cellular tumor suppressor p53 (10), the octamer transcription factors Oct1, Oct2, Oct4, and Oct6 (11,12), some homeotic HOX proteins (13), the steroid receptors (progesterone (PR), glucocorticoid (GR), estrogen (ER), and androgen (AR)) (14,15), the general initiation factor human TATA-binding protein (hTBP) (16-18), and the viral replication proteins Rep78 and Rep68 (19). The consequences of these interactions are multiple. HMGB1 in general increases the DNA binding affinity of those factors and, depending on the context and the assay conditions, HMGB1 has been shown to have a positive or a negative effect on transcription (14-17, 20). In the case of RAG1 and Rep68/78, HMGB1 enhances the rate of the sequence-specific DNA cleavage reaction (19,21). Interestingly, HMGB1 can also stimulate the ATPase activity of Rep78 (19).
The two HMG box domains of HMGB1 appear to have a similar but not identical behavior with respect to their protein-interacting features. Thus, HMG box A is important for binding to hTBP and p53, whereas the binding to Oct factors, HOX factors, and hormone receptors can take place through boxes A or B (11,13,17,22). However, the interaction with RAG requires both HMG box domains (9).
To date, neither the HMG box surface that is involved in the interaction with other proteins nor the required amino acids of HMGB1 are known. On the other hand, sequence analysis of the factors interacting with HMGB1 does not suggest any apparent homology or similarity. For instance, the interaction with RAG1, Oct, and HOX factors occurs at the homeodomain. In the case of hTBP, it is the H2′ α-helix of the core, and with Rep78, two different regions are recognized. From these data, no consensus can be defined.
The fast progress of genomics and proteomics has made it obvious that an important focus in understanding biological processes is to characterize how proteins interact in macromolecular complexes. Attempts to define general rules for predicting specific recognition between proteins have been unsuccessful because each protein-protein interaction has its own properties. Nevertheless, some indications are emerging. The development of powerful tools has led to the discovery that one type of recognition involves asymmetric interactions that occur between a particular domain and a short region, often less than 10 amino acids in length, within the other protein. A recent review (23) recapitulated several examples of protein domains involved in these kinds of interactions like SH3 (Src homology 3), phosphotyrosine-binding WW, EH (Eps15 homology), PDZ modules (PSD-95/dlg/ZO1), as well as pRb and the ER. Some of them, like the ER, could interact in several modes with different peptides, in a manner that depends on the bound ligand.
In the present study we have explored the molecular recognition properties of HMGB1 by ligand selection from a large library of heptapeptides displayed on phages. Our results do not give support to one unique strong consensus sequence but rather to a few different kinds of peptide sequences. A BLAST search enabled us to predict new proteins that may interact with HMGB1. We have tested and confirmed the interactions with a few of these and have shown that HMGB1 can interact not only with transcriptional activators but also with repressors and co-repressors. Taken together the data suggest a complex network of protein-protein interactions that will clarify the biological function(s) of HMGB1.

EXPERIMENTAL PROCEDURES
Expression and Purification of Recombinant HMGB1 Box A, Box B, and Full-length HMGB1-The plasmids pT7-HMGB1bA, pT7-HMGB1bB, and pET14b-HMGB1 used for the expression of rat HMGB1 box A, box B, and full-length HMGB1, respectively, have already been described. The procedure for the expression and purification of the three recombinant proteins was as described previously (17). Native calf thymus HMGB1 was purified as described previously (24).
Peptide Phage Display Analysis with HMGB1 Boxes A and B-A phage display heptapeptide library kit (New England Biolabs, Beverly, MA) was used to screen for peptides binding to HMGB1 box domains A and B. The kit contained a random combinatorial collection of heptapeptides fused via a flexible linker sequence to the N terminus of protein pIII of bacteriophage M13. Each phage expressed at the tip of the cover 3-5 copies of the unique peptide it encoded. The library complexity contained all the possible combinations of the 20 natural amino acids taken as 7-mer sequences. For the phage biopanning process, we followed the kit instructions as indicated by the manufacturer. Four independent experiments were run in which 30-40 μg of HMGB1 box A or B were immobilized overnight at 4°C on 96-well microtiter plates (Costar 3690). Wells were then blocked for 1 h at 4°C with Tris-buffered saline, 0.1% Tween 20. 2 × 10^11 plaque-forming units of the phage library were added per well and incubated for an additional hour at room temperature. Wells were washed ten times with Tris-buffered saline, 0.1% Tween 20 for the first round. For subsequent rounds of washing, Tris-buffered saline, 0.5% Tween 20 was used to increase stringency. Finally, phages were either eluted at room temperature by incubation with 0.2 M glycine-HCl, pH 2.2 for 10 min or affinity eluted by incubation for 1 h with 30-40 μg of the respective HMGB1 box A or B. Phage amplification, titration, and purification were carried out according to the manufacturer's protocol. Automatic phage DNA sequencing used the −96gIII primer (5′-CCCCTCATAGTTAGCGTAACG-3′) and was performed by the Serveis Científico-Tècnics of the Universitat de Barcelona.
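As a rough check on library coverage, the following sketch computes the theoretical number of distinct heptapeptides and the mean number of copies of each sequence in the phage input added per well. It assumes, for illustration only, that every sequence is equally represented; the actual clone complexity and codon bias of the commercial library are not given here and would change these figures.

```python
# Back-of-the-envelope coverage estimate for a random heptapeptide library.
# ASSUMPTION: all sequences are equally represented (idealised; real
# libraries deviate because of codon bias and cloning efficiency).

AMINO_ACIDS = 20        # natural amino acids
PEPTIDE_LENGTH = 7      # heptapeptides
INPUT_PFU = 2e11        # plaque-forming units added per well (from the text)


def theoretical_diversity(alphabet_size: int, length: int) -> int:
    """Number of distinct peptides of the given length."""
    return alphabet_size ** length


def mean_copies(input_pfu: float, diversity: int) -> float:
    """Average copies of each sequence, under the uniform assumption."""
    return input_pfu / diversity


diversity = theoretical_diversity(AMINO_ACIDS, PEPTIDE_LENGTH)
copies = mean_copies(INPUT_PFU, diversity)

print(f"distinct heptapeptides: {diversity:.3e}")
print(f"mean copies per sequence in the input: {copies:.1f}")
```

Under these assumptions each possible heptapeptide would be present on the order of a hundred times in the input, which is why a single well can in principle sample the whole sequence space.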
The data base search for homology with the sequences selected in the phage display experiments was performed using the BLAST (Basic Local Alignment Search Tool) program (NCBI, National Center for Biotechnology Information) (25). The alignments were obtained using the MultAlin program (Multiple Sequence Alignment, INRA, Institut National de Recherche Agronomique, Toulouse, France) (26).
GST Fusion Constructions and Protein Purification-All GST constructs were prepared using the pGEX-4T3 plasmid (Amersham Biosciences, Inc.). The twelve peptides selected to be further analyzed as GST fusions were amplified from the phages with PCR using the following primers 5′-TGGTACCTTTGAATTCTCACTC-3′ and 5′-TCAACAGTGTCGACCGAACC-3′, which introduced EcoRI and SalI sites, respectively (underlined). The PCR products were inserted between these sites in the pGEX-4T3 plasmid.
pGEX-Grg1 and pGEX-Grg5 were constructed by inserting the NaeI/XhoI fragments obtained from the pBS-Grg1 and pBS-Grg5 (kindly provided by Dr. C. Lobe) into pGEX-4T3 digested with SmaI and XhoI. pGEX-Grg Q-GP-CcN was produced by inserting a NaeI/SmaI fragment from pBS-Grg1 into a SmaI site of the pGEX vector. pGEX-Grg SP-WD was generated by inserting a SmaI/XhoI fragment of pBS-Grg1 into the same sites of the pGEX vector. pGEX-Grg Q was obtained by digesting pGEX-Grg1 with Bpu1102I and XhoI. The vector was blunt-ended and then self-ligated. pGEX-Grg GP-CcN was prepared by digestion of pGEX-Grg1 Q-GP-CcN with BamHI and Bpu1102I. The vector was blunt-ended and then self-ligated.
pGEX-Grg ΔGP-CcN was obtained by inserting a BamHI/XhoI fragment obtained by PCR from the pGEX-Grg1 construct by using the primers 5′-TTCAGCCTCCTGGATCCCCG-3′ and 5′-GTTTTCTCGAGGTGAGTGTG-3′ (restriction sites underlined) into pGEX-4T3 digested with the same enzymes. pGEX-Grg SP was generated by inserting a SmaI/XhoI fragment obtained as above by PCR with primers 5′-ACTCACCCCGGGAAAACG-3′ and 5′-GTTGATCTCTCGAGCATGTCG-3′ into pGEX-4T3 digested with the same enzymes. All constructions were verified by manual or automated DNA sequencing.
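Because the underlining that marked the restriction sites in the primers has been lost, the sites can be located again with a short script. This is a minimal sketch: the primer labels are hypothetical names chosen for illustration, the recognition sequences are the standard ones for each enzyme, and hyphens inside the printed primer sequences are assumed to be line-break artifacts and have been removed.

```python
# Locate the restriction sites carried by the cloning primers.
# Recognition sequences are the standard ones for each enzyme.
SITES = {
    "EcoRI": "GAATTC",
    "SalI": "GTCGAC",
    "BamHI": "GGATCC",
    "XhoI": "CTCGAG",
    "SmaI": "CCCGGG",
}

# Primer labels are hypothetical; sequences are taken from the text
# (5' to 3'), with line-break hyphens assumed to be artifacts.
PRIMERS = {
    "peptide_fwd": "TGGTACCTTTGAATTCTCACTC",
    "peptide_rev": "TCAACAGTGTCGACCGAACC",
    "dGP_CcN_fwd": "TTCAGCCTCCTGGATCCCCG",
    "dGP_CcN_rev": "GTTTTCTCGAGGTGAGTGTG",
    "SP_fwd": "ACTCACCCCGGGAAAACG",
    "SP_rev": "GTTGATCTCTCGAGCATGTCG",
}


def find_sites(primer: str) -> dict:
    """Return {enzyme: 0-based start index} for each site found in the primer."""
    primer = primer.upper()
    return {
        enzyme: primer.find(site)
        for enzyme, site in SITES.items()
        if site in primer
    }


for label, sequence in PRIMERS.items():
    print(label, find_sites(sequence))
```

Each primer carries exactly one of the five sites, matching the enzyme pairs named for the corresponding constructs.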
Far-Western Analysis and Western Blotting-GST-peptide fusions were separated by SDS-PAGE and electroblotted in transfer buffer (25 mM Tris, 40 mM glycine, 0.05% SDS, 20% methanol) to nitrocellulose membranes (Optitran BA-85, Schleicher & Schuell). After blocking in phosphate-buffered saline, 0.1% Tween 20, containing 5% nonfat dry milk for 1-2 h at room temperature, membranes were incubated overnight in 10 ml of D buffer without glycerol (20 mM HEPES, pH 7.9, 0.2 mM EDTA, 100 mM NaCl, 0.5 mM dithiothreitol, 0.1 mM phenylmethylsulfonyl fluoride) containing 10-20 μg of either HMGB1 or its derivatives, HMGB1 box A or B. After several washes in phosphate-buffered saline/Tween, membranes were incubated with a primary chicken anti-HMGB1 antibody raised in our laboratory against recombinant HMGB1 deleted of the C-terminal domain and subsequently with a secondary anti-chicken IgY-HRP antibody (Jackson Laboratories) and detected using ECL reagents.
Pull-down Assays-Glutathione-Sepharose beads (Amersham Biosciences, Inc.) were loaded with the different GST fusion proteins as suggested by the manufacturer and washed five times with 450 μl of D buffer containing 20% glycerol. Then, they were incubated with HMGB1 box A or B for 1 h at 4°C in the same buffer and washed six times more with the same buffer. Beads were finally boiled in protein-loading buffer, and proteins were separated by SDS-PAGE and detected by Western blotting with specific antibodies.

RESULTS
Preferential Interaction of HMGB1 Box Domains A and B with Several Peptides-In an attempt to identify targets that could be potentially recognized by HMGB1, and because other general approaches like yeast two-hybrid analysis were not possible (likely due to toxic effects of the expression of HMGB1 boxes in yeast, results not shown), a peptide library screening approach was carried out. A highly complex library containing the whole collection of natural heptapeptide sequences displayed on phage M13 was used. Four rounds of selection were carried out for each of the four independent experiments that were performed using highly purified recombinant HMGB1 boxes A or B as bait proteins. During the biopanning process stringency was increased by using higher detergent concentrations in the washing buffer. Bound phages were recovered by either a nonspecific acid elution (experiment 1) or by affinity competition using HMGB1 boxes A or B free in solution (experiments 2-4). Several peptides were selected as potentially interacting with HMGB1 with some specificity (Table I). Their sequences made it clear that they were not related to a single strong consensus sequence, suggesting that either the interactions between HMGB1 and the peptides were weak or that HMGB1 boxes could interact with several unrelated motifs. We noted that the frequency of appearance of some amino acids in the selected peptides clearly deviated from the random theoretical level, indicating that the selection process was successful. That is, if interactions of HMGB1 were specific they should be independent of the particular growth features of the phages, and their statistics should clearly differ from those of the unselected phages in the biopanning assays and be due to their ligand-binding requirements.
For example, positively and negatively charged amino acids usually presented frequencies lower than expected in the selected peptides, suggesting that the interactions did not mainly rely on electrostatic forces despite the highly basic character of the HMGB1 boxes. Also, a high rate of aromatic residues was observed, in particular a high level of tryptophan in experiment number 4, which contrasts with the observation that this amino acid tends to decrease naturally without selection. A remarkable level of proline was also obtained in experiment number 3. Hydrophilic amino acids, which tend to be involved in hydrogen-bond recognition, showed some decrease as well. These data indicate that selection had in fact occurred in the presence of the HMGB1 boxes although no clear-cut consensus could be easily drawn.
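The comparison of observed and expected residue frequencies described above can be illustrated with a small calculation. The sketch below uses only the four peptides quoted in the text, not the full content of Table I, and it assumes a uniform expectation of 1/20 per residue; the true expectation for the unselected library depends on its codon usage, so the numbers are indicative only.

```python
from collections import Counter

# Peptides quoted in the text; the complete Table I is not reproduced here.
PEPTIDES = ["HWGMWSY", "HAIYPRH", "HSWLWWP", "LPLTPLP"]
ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 natural amino acids


def fold_enrichment(peptides, alphabet=ALPHABET):
    """Observed residue frequency divided by a uniform 1/20 expectation.

    ASSUMPTION: uniform background. A value above 1 means the residue is
    over-represented, below 1 under-represented, 0 means absent.
    """
    counts = Counter("".join(peptides))
    total = sum(counts.values())
    expected = 1 / len(alphabet)
    return {aa: (counts.get(aa, 0) / total) / expected for aa in alphabet}


enrichment = fold_enrichment(PEPTIDES)
for aa, value in sorted(enrichment.items(), key=lambda kv: -kv[1]):
    print(f"{aa}: {value:.2f}")
```

In this small sample tryptophan and proline come out as the most over-represented residues, while aspartate, glutamate, and lysine are absent, in line with the trends described for the full data set.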
Despite the fact that the sequences selected were very variable (as shown in Table I), the appearance of several copies of the same peptide in each experiment indicated a high enrichment and specificity in the screening. Note also that some peptides (e.g. HWGMWSY, HAIYPRH) were selected in independent experiments with both HMGB1 domains. Despite this variability in the peptide sequences it was also clear that amino acid distribution in the peptides was not random (Table I). Thus, peptides could be grouped into at least two classes: proline-rich and tryptophan-rich. Two peptides enriched in tryptophan, HWGMWSY and HSWLWWP, accounted for 50% of the clones in experiment 4, and HWGMWSY appeared also in experiments 1 and 2. A minimal consensus WXXW motif could be a potential site for interaction. In the case of proline-enriched peptides it was more difficult to define a consensus given the diversity of these sequences.
Because of the high complexity in the peptide sequences retrieved in the phage display experiments we were concerned about the potential existence of false positives in the selected set of peptides. As a second approach to confirm the bona fide association of HMGB1 with those peptides, we performed in vitro assays in which peptide sequences were fused to GST. Twelve peptides were chosen from among those selected with HMGB1 boxes A and B as representative of the different kinds of peptides obtained. They were fused to the C terminus of GST to facilitate their expression and purification in E. coli. As a control, a GST with an extended, unrelated, and never selected peptide was included (GST-KG). Because direct GST pull-down assays did not work, likely because of steric hindrance of the GST moiety to the very small peptide (not shown), Far-Western experiments were performed (Fig. 1). In contrast to the negative control, all the other peptides showed interaction with HMGB1 box B (Fig. 1B). The relative intensity of each band varied from experiment to experiment, indicating that the results were not quantitative. Nevertheless, these interactions were always detectable, whereas interaction with the control was never detected (Fig. 1B, lane GST-KG). Moreover, the same assays were also done using either HMGB1 box A or full-length HMGB1 and the results were the same (not shown).
We noted that none of the peptides selected was represented in the sequence of HMGB1. Because HMGB1 can only very inefficiently interact with itself forming a homodimer (27) and HMG box domains do not interact with each other inside the HMGB1 molecule (28), this suggested that the peptides selected were representative of reasonable rather than very weak interactions.
Discovery of New Partners for HMGB1-Once our results were confirmed it was of interest to look for proteins containing regions of homology to the peptides selected in the phage display assay in order to perform a survey. This point was addressed by searching the available data bases with the BLAST program for nuclear proteins. HWGMWSY was a peptide that generated interesting candidates and among them appeared ER, a factor previously identified to interact with HMGB1 (15). We noted a potential WXXW consensus motif (Table IIB) for the factors belonging to this group. These factors were not studied in detail because in comparison another peptide, LPLTPLP, generated the most interesting homologies and allowed the identification of a new set of nuclear proteins that could potentially interact with HMGB1. These putative factors are summarized in Table IIA. Once again, two other factors, p53 and PR, already described to interact with HMGB1 appeared (10,14) and interestingly, the region homologous to our peptide in p53 was highly conserved in mammals. Remarkably, all the proteins listed presented a potential PXXPXP consensus motif, and among them components for many transcription factor complexes can be found. In this list we had only included the proteins containing the sequence motifs homologous to the peptide that were conserved among mammalian species. Note that in some cases two conserved motifs could be found (e.g. Grg1, p53). Also, the two orientations of the motif were considered because similar proline-rich motifs were reported to be recognized by SH3 domains independently of their orientation (Ref. 29 and references therein).
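The motif definitions used in this survey can be made explicit as pattern searches. The sketch below is not the BLAST procedure used in the study; it is a simple regular-expression scan, offered as an illustration, that looks for the PXXPXP motif in both orientations and for the WXXW motif, where X stands for any residue. The pattern names and the `scan` helper are hypothetical.

```python
import re

# X = any residue. Lookahead is used so that overlapping matches are reported.
MOTIFS = {
    "PXXPXP": re.compile(r"(?=(P..P.P))"),
    "PXXPXP_reversed": re.compile(r"(?=(P.P..P))"),  # opposite orientation
    "WXXW": re.compile(r"(?=(W..W))"),
}


def scan(sequence: str) -> dict:
    """Return {motif name: [(0-based start, matched segment), ...]}.

    Only motifs with at least one hit are included in the result.
    """
    seq = sequence.upper()
    hits = {}
    for name, pattern in MOTIFS.items():
        found = [(m.start(), m.group(1)) for m in pattern.finditer(seq)]
        if found:
            hits[name] = found
    return hits


# Peptides quoted in the text
for peptide in ("LPLTPLP", "HWGMWSY", "HSWLWWP"):
    print(peptide, scan(peptide))
```

The same function can be applied to a full protein sequence to list candidate sites; whether such a site is conserved or accessible has to be judged separately, as discussed in the text.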
To confirm that HMGB1 can physically interact with the surveyed proteins, we used a pull-down approach employing GST-p53 as a positive control and proteins fused to GST, which were selected as representative of the several classes described above: the RNA-binding protein and positive regulator hnRNP K, the negative regulator MeCP2, and co-repressors pRb, Grg1, and Grg5. Because HMGB1 is a protein that presents many post-translational modifications, including at least acetylation, ADP-ribosylation, and methylation (reviewed in Ref. 30), we have taken this fact into account and have analyzed in parallel the interaction with recombinant and native calf thymus purified HMGB1 (i.e. presenting many post-translational modifications, not shown). Fig. 2 shows that HMGB1 can effectively interact with all the proteins tested with some notable differences between the recombinant and the purified forms. Whereas either recombinant or purified HMGB1 interacted with Grg1 to a similar extent, recombinant HMGB1 showed some preference for Grg5, p53, hnRNP K, and pRb. Purified HMGB1, in contrast, interacted with MeCP2 more efficiently than recombinant HMGB1. Neither recombinant nor purified HMGB1 can interact with GST. We wanted to emphasize, however, that because those results were semiquantitative, only large differences should be taken into consideration. Thus, it was tempting to suggest that whereas post-translational modification did not seem to have a major effect on HMGB1 interaction with some (mainly Grg1, but also Grg5 to some extent), it appeared to negatively affect interaction with others (especially pRb and p53) and in one case was slightly preferred (MeCP2). In this assay the efficiency of the interaction was dependent on the fusion protein. Preliminary data obtained using recombinant HMG boxes A and B showed the same results (not shown).
Fig. 2. HMGB1 interacts with several transcriptional regulators. Pull-down assay of recombinant (r) and purified (p, highly modified) HMGB1 with GST fusions of p53, pRb, hnRNP K, MeCP2, Grg1, and Grg5 (see "Experimental Procedures" for details). GST alone was used as negative control. Lanes containing 10% of the input material and corresponding to recombinant (Input r) and purified (Input p) HMGB1 are shown on the left and the right, respectively. In order to analyze the effect of post-translational modifications on the interactions of HMGB1, equal amounts of GST fusions were used with the two HMGB1 preparations. Recombinant HMGB1 migrates slightly above purified HMGB1 because of the addition of a ~1-kDa histidine tag.
Mapping of the Interacting Regions on Grg1-We further studied the potential of the PXXPXP motif by using Grg1 as a model target. Grg1 is structured into five domains, each having a particular function (Fig. 3A). A highly conserved N-terminal glutamine-rich domain (Q) is involved in protein oligomerization, a C-terminal WD-repeat domain is used for interaction with other proteins, and a central domain encompassing the CcN motif allows the nuclear localization of Grg1. The two other domains, GP and SP, are poorly conserved and may play a direct role in repression of transcription (31). By using different deletions of Grg1 the involvement of the PXXPXP motif in interaction with HMGB1 was tested (Fig. 3A). As shown in Fig. 3B, splitting of the Grg1 molecule into two moieties, N- and C-terminal, showed that HMGB1 was able to interact with both, the one containing domains Q-GP-CcN (lane 4) and the other containing domains SP-WD (lane 8), respectively.
The binding site on the N-terminal moiety was located at the N-terminal region of the GP domain because deletion encompassing residues 131-155 (Fig. 3B, compare lanes 6 and 7, GST-GP-CcN and GST-ΔGP-CcN, respectively) completely abolished the interaction. Additionally, the Q domain clearly showed no interaction with HMGB1 (lane 5). The sequence, which upon deletion abolished binding of HMGB1 on this side of the protein, precisely corresponded to the region of homology to the LPLTPLP peptide (Fig. 3C, shaded). Note that in the closely related Grg5 factor this region is also present and is likely used for interaction with HMGB1 as well (Fig. 2).
The site of interaction in the C-terminal moiety is likely located at the SP domain because deletion of the WD domain did not abolish the interaction. Nevertheless, the existence of still another site on the WD domain cannot be ruled out. A careful examination of the amino acid sequence of the SP domain revealed another region of weaker homology (vPfpPmP) to the LPLTPLP peptide (Fig. 3D). In this case, this site is absent in the related Grg5 factor.

DISCUSSION
On the HMGB1 Side-HMGB1 was first isolated almost 30 years ago and since then much effort has been put into characterizing this protein and finding its functional role (30). The latter point has been shown to be difficult because of controversial results, and it is still a matter of discussion. Because HMGB1 does not present any enzymatic activity on its own or any sequence-specific DNA or RNA binding, it is clear that the functions in which it could be involved would require the assistance of other factors, likely proteins. Several reports have clearly shown that HMGB1 interacts with a set of different factors that present very little, if any, sequence or structure in common. The question of how HMGB1 can recognize its partners is one we have addressed here.
As yet, the rules that govern the interactions between two given proteins are unknown, and in any case interactions are impossible to predict. A hypothesis that would reconcile all the data on HMGB1 interactions could be that the interaction surface recognized by HMGB1 on these proteins might be very small, so that only a few amino acids would be important. Possibly, more than one such motif could be recognized by HMGB1. This idea is based on recent findings showing that many different proteins can recognize motifs as short as 4-5 residues long, such as, for example, LXCXE for pRb (32) or WRPY/W for Groucho in Drosophila (33).
As an approach that occasionally helped in determining the residues that might be required for interactions to take place, the screening of peptide repertoires has been successfully used by several groups to shed light on the mechanisms of ligand recognition of macromolecular complexes. The use of a peptide phage display approach allowed us to begin understanding some previously reported interactions, to predict others, and to begin testing them. In our hands, HMGB1 can recognize several short motifs through the HMG boxes. However, the lack of a unique sequence motif or a single consensus sequence derived from them complicated the interpretation of the results and clearly suggested that HMGB1 is a protein that does not have a highly preferred interaction sequence but can establish interactions with many peptides having rather different sequences. Therefore, no strong enrichment is observed, and this heterogeneity makes the interpretation of the results difficult. As far as we know, this seems to be a particular behavior of HMGB1. Some of the peptides were isolated with both HMG boxes A and B (HWGMWSY and HAIYPRH) suggesting that either recognition with these peptides is due to the three-dimensional structure of the HMG box or it uses some of the most highly conserved regions between the two domains. The other peptides seem to be different for each HMG box. However, the complexity of the patterns obtained suggests that the relative affinities for the peptides are similar and weak. This may in turn enrich a particular experiment in a certain set of peptides because of slight changes (in temperature for instance). But it could also be attributable to the differences in sequence of the two HMG boxes because their three-dimensional structures are rather similar (6,8).
On the other hand, taking into account that sequence identity in HMG boxes A and B is very high when compared with the corresponding domains in HMGB2, it is also very likely that many if not all of the peptide motifs will be recognized by HMGB2 as well. In fact, the apparent redundancy of HMGB1 and HMGB2 is a general observation for all of the interactions described so far for these proteins, including Oct factors (11,12), steroid receptors (15), hTBP, 2 and RAG1 (9) among others.
Our results do not discount the highly acidic C-terminal domain as potentially interacting with other factors. In fact, the C-terminal domain is the one involved in the interaction with histone H1 (34) and with the histone dimer H2A·H2B as well (24), and recently it has been claimed to interact with the glutamine-rich N-terminal domain of human and Drosophila TBP (35). However, the highly acidic nature of this domain makes its interactions highly electrostatic and in general weakly specific.
New HMGB1-interacting Proteins Come to Light-The use of peptides in our assay is likely limiting the set of potential partners for interaction with HMGB1 to those that are recognized only by sequence, because little structure can be expected from heptapeptides. Therefore, it is reasonable to expect some of the interactions previously described not to appear in this assay. This may be the case, for instance, of the interaction with the H2′ α-helix of hTBP previously described (17). Moreover, the fact that a particular protein contains one of the motifs in its sequence does not necessarily imply that this will be the site used for interaction, nor does it exclude the use of another motif. In addition, the contribution of peptidic sequences around the motifs identified here can be determinant, because in some cases they have been shown to play a major role either in modulating or even in affecting the specificity of the interactions. Additionally, the accessibility of the motifs embedded in some protein contexts might be rather limited. Finally, there is always a potential for HMGB1 to be actively recognized by the other partner (see below), and then motifs can be completely different.
When searching protein data bases, a point that must be taken into account is that the relative representation of the different peptides and motifs is not homogeneous. Thus, some peptide motifs produce long lists of proteins, like for PXXPXP, whereas other motifs do not. Nevertheless, some motifs not studied here in detail also generated very interesting candidates. For instance, a sequence closely related to the potential WXXW motif, and in particular to peptide HWGMWS, can be found in conserved sequences of the α- and β-ER of mouse, rat, and human, of ETS-1, and also in a conserved sequence of the mouse and human RNA-binding protein nucleolysin TIAR among others (Table IIB). Remarkably, the interaction with α-ER was previously described. Moreover, the proposed HMGB1-binding sequence lies very close to the zinc finger of the ER that interacts with DNA and is clearly accessible in the co-crystal structure (15,36). These features might help to explain how HMGB1 can stabilize the interaction of ER with DNA.
The PXXPXP consensus motif has predicted the most interesting partners upon BLAST search and for that reason has been studied in detail. These include pRb, MeCP2, hnRNP K, Grg1, Grg5, and p53. Remarkably, p53, which was previously shown to interact with HMGB1, appeared, giving credit to this linear sequence as a potential HMGB1 recognition motif. In fact, all the PXXPXP-containing proteins tested here interacted with HMGB1. This clear result is likely due to the high proline content of the motif, which makes the adoption of a defined secondary structure difficult even if embedded in a protein sequence. Thus, it is very likely that other proteins containing this motif and listed in Table IIA will interact with HMGB1 as well.
HMGB1 has already been shown to interact with p53 and be a unique activator of this factor (10). The PXXPXP motif is always found at the N terminus of p53 around position 80-90 in the different species and forming part of a proline-rich domain that negatively affects p53 interaction with DNA (37). The interaction of HMGB1 at this domain could explain the stimulatory effects observed on DNA binding and transactivation (10). Because the proline-rich domain is dispensable for transactivation (38) the role of HMGB1 might be restricted to stabilize the p53-DNA complex. We must mention, however, that residues 363-376 in the basic domain of p53 have been recently reported to recognize HMG box A on HMGB1 and that p53 conversely enhanced HMGB1 interaction with cisplatin-modified DNA (22). In this region of p53 no homology to any of the peptides identified in this work was found. This might be in apparent contradiction to the data discussed above but could be reconciled if there is mutual recognition between p53 and HMGB1. If so, HMG box A would be recognized by the C-terminal region of p53, and the proline-rich domain of p53 would be recognized by the HMG boxes of HMGB1. Depending on the experimental conditions one or the other interaction would prevail and be selected.
This may also be the case with pRb because, in addition to having a sequence showing homology to the PXXPXP motif, the pocket region of pRb can recognize the LXCXE motif in other proteins (32). This motif (as the sequence LFCSE) is present in the HMG box B of both HMGB1 and HMGB2 and is absolutely conserved in vertebrates. We have not analyzed here in detail whether recognition is via the LXCXE motif (i.e. pRb recognizes HMGB1) or the PXXPXP motif in pRb (i.e. HMGB1 recognizes pRb) and have simply described this interaction. Although the interaction may work in both directions, we reason that recognition through the LXCXE motif may be difficult because the LFCSE sequence is structured into an α-helix facing the concave side of the L-shaped domain and is partially buried in HMG box B (6).
The interaction between HMGB1 and Grg1 has been studied in more detail because of the high homology to the peptide and also because this is the first interaction of HMGB1 with a co-repressor. In Grg1 two regions of interaction with HMGB1 have been uncovered. The one corresponding to the GP domain shows high homology to the LPLTPLP peptide isolated in the phage display analysis and very likely accounts for that interaction, as predicted. The second domain interacting with HMGB1, the SP domain, also contains a region with the PXXPXP motif, although its homology to the original peptide sequence is weaker. These results support the PXXPXP motif as a good recognition sequence for HMGB1. Brantjes et al. (51) have recently shown that all Tcf HMG box transcription factors interact with Groucho-related co-repressors and, moreover, that all the long members containing five domains (here represented by Grg1) mediated repression of the Tcfs, whereas the short member (Grg5) mediated de-repression. All Tcfs are HMG box transcription factors, but so far the region that interacts with the Grgs has never included the HMG box domain. In contrast, HMGB1 uses the HMG box domains to interact with Grg proteins, suggesting that recognition of the PXXPXP motif is not a feature common to all HMG box domains. On the other hand, transcription mediated by AR is inhibited by Grg5 (39). Because the PXXPXP motif is also present in the AR sequence (Table IIA), and the interaction of HMGB1 with AR stabilizes the AR-DNA complex as shown by others, the interaction of HMGB1 with Grg5 could cooperate by displacing it from AR and helping to relieve inhibition (15).
MeCP2 is a protein involved in the recognition of methylated DNA at CpG islands and is closely related to gene silencing (see Ref. 40 for a review). The HMGB1 binding motif is located at the C terminus of this protein (residues 380-386) in a region where no other factor has been shown to interact. Therefore, it is possible that HMGB1 interaction does not interfere with the binding of the many factors known to interact with MeCP2 (among others: mSin3A, NcoR, and c-Ski), and it remains to be analyzed whether in this case the interaction could also stabilize MeCP2-DNA complexes, as HMGB1 does in other complexes.
hnRNP K presents a remarkable variety of protein interactions, with some factors involved in signal transduction and others involved in several aspects of gene expression (reviewed in Refs. 41 and 42). This diversity suggests that the hnRNP K protein may act as a docking platform or as a scaffold protein within multiple functional modules. The HMGB1-hnRNP K interaction may therefore connect two systems, each with a large protein-protein interaction potential, and so extend their respective domains of action even further. For example, hnRNP K was shown to be a transcription-activating factor for the c-myc promoter (43,44). Although we have not been able to establish a direct connection, it is remarkable that c-myc mRNA levels in HMGB1−/− mouse cells are about 5-10-fold lower than those in wild type cells.³
We have focused so far on the nuclear environment, where HMGB1 is expected to play a role in gene expression. However, HMGB1 is an extraordinary protein: besides being a nuclear protein in most cases, it has been known for more than 10 years to be present, under the name of amphoterin, on the outer cell membrane of some cell types (45). It is now clear that HMGB1 can be actively released outside the cell in response to tumor necrosis factor and interleukin 1 (46) and passively by necrosis or cell damage in a variety of cell types, mainly immature and transformed cells (47). Release of HMGB1 induces some pathological processes, acting as a potent late mediator of endotoxin lethality and inflammation. Many of these phenomena require the interaction of HMGB1 with the receptor for advanced glycation end products (RAGE) (reviewed in Ref. 48). Examination of the mouse RAGE amino acid sequence shows a region highly homologous to the LPLTPLP motif in the second Ig-like C2-type domain that is highly conserved among several species. Similarly, HMGB1 was also shown to interact with Syndecan-1, a cell surface heparan sulfate-rich proteoglycan (49) whose sequence also contains a PXXPXP motif. These and other data suggest that HMGB1 might use the same motifs to interact with cell surface proteins and with nuclear factors.
Final Considerations-Prior to this work, a long list of proteins had been reported to interact with HMGB1. Along with the new candidates reported here, this may give the impression that HMGB1 interacts with almost every protein in the cell. Although HMGB1 may thus appear to be a "sticky" protein (48), this is by no means the case. For example, we have been trying to extend the initial interaction of HMGB1 with human TBP to other factors of the general transcription machinery, and we failed to observe any interaction of HMGB1 with TFIIA, TFIIB, TFIIF, the CTD of RNA polymerase II, TFIIE (although a clear interaction can be observed with the p56 subunit, no interaction can be observed with the native tetramer), and a TBP-related factor, among others,⁴ suggesting that the list of factors interacting with HMGB1 may be long but is by no means indiscriminate. This particular behavior makes it difficult to attribute a defined role to HMGB1 in the organism but may explain the general weakness observed in HMGB1 knockout mice, which leads to their death 24 h after birth (2). HMGB1 seems to have developed a high potential for protein-protein interaction with multiple partners, always taking part in a macromolecular complex and either assisting in the assembly or stabilizing the assembled factors. This feature, along with the remarkable abundance of HMGB1 in the nucleus and the many post-translational modifications that it undergoes, could explain its general involvement in nuclear processes and its modulation.
On the one hand, HMGB1 is a protein that can interact with angled DNA on its own. On the other hand, a general observation is that upon interaction HMGB1 can help stabilize many proteins previously bound to DNA. These functions may be compatible; however, although the concave region of the L-shaped HMG box domain is clearly the DNA interaction site, there are no data about the region of the HMG box domain involved in the interaction with other protein factors. Some evidence at the enhanceosome of BHLF-1 suggests that HMGB1 can interact with both the narrow groove of DNA and with protein factors (50). However, more detailed work is required to discover whether HMGB1 interacts simultaneously with proteins and DNA, what the stoichiometry of the interactions is, and how its binding can stabilize protein-DNA complexes. A potential role as a co-factor is emerging for HMGB1 in many different processes.