Role of Conserved Salt Bridges in Homeodomain Stability and DNA Binding*

The sequence information available for homeodomains reveals that salt bridges connecting pairs 19/30, 31/42, and 17/52 are frequent, whereas aliphatic residues at these sites are rare and mainly restricted to proteins from homeotherms. We have analyzed the influence of salt and hydrophobic bridges at these sites on the stability and DNA binding properties of human Hesx-1 homeodomain. Regarding the protein stability, our analysis shows that hydrophobic side chains are clearly preferred at positions 19/30 and 31/42. This stabilizing influence results from the more favorable packing of the aliphatic side chains with the protein core, as illustrated by the three-dimensional solution structure of a thermostable variant, herein reported. In contrast only polar side chains seem to be tolerated at positions 17/52. Interestingly, despite the significant influence of pairs 19/30 and 31/42 on the stability of the homeodomain, their effect on DNA binding ranges from modest to negligible. The observed lack of correlation between binding strength and conformational stability in the analyzed variants suggests that salt/hydrophobic bridges at these specific positions might have been employed by evolution to independently modulate both properties.

Presently, the largest searchable collection of information for the homeodomain protein family is the Homeodomain Resource Data Base (5)(6)(7). It contains around 1056 full-length protein sequences isolated from 112 different species. Inspection of these data shows that helices I/II, I/III, and II/III can be connected by three salt bridges involving pairs 19/30, 31/42, and 17/52, respectively, which in addition can exhibit different polarities. In contrast, hydrophobic bridges at these sites are present only in a minor fraction of sequences (and mainly restricted to homeodomains from warm-blooded animals). This predominance of salt versus hydrophobic bridges suggests a role for the former in homeodomain function and/or stability.
Here we analyze the relative effect of salt and hydrophobic bridges on homeodomain stability and DNA binding properties, employing the human Hesx-1 DNA binding domain as a model system. Our approach involves the extensive use of site-directed mutagenesis experiments together with CD, NMR, and isothermal titration microcalorimetry (ITC) measurements. Thus, as a first step, the hydrophobic pair Val19/Ile30, present in the wild-type polypeptide, was replaced by salt bridges of different polarities, either isolated or networked. A previous statistical analysis of the information stored in the Homeodomain Resource Data Base (8,9) revealed an intriguing correlation between the nature of pair 19/30 and the loop residue 26 (proline or a branched aliphatic amino acid in all cases). Considering this, both salt bridges and aliphatic residues connecting 19/30 were also examined in the context of either proline or leucine at position 26. Finally, conserved salt bridges 31/42 and 17/52 were substituted by hydrophobic pairs observed in other natural sequences.
The work reported here involved the production of 33 homeodomain variants, including single, double, triple, and quadruple mutants. All of them were subjected to careful thermal and chemical denaturation experiments, employing circular dichroism. This analysis allowed the identification of a thermophilic variant of the Hesx-1 homeodomain. Its three-dimensional solution structure was determined by NMR methods to provide a detailed structural framework for elucidating the origin of the enhanced thermal stability. In addition, the pKa values for carboxyl groups in some selected variants were measured [...]
[...] initial and final denaturant concentrations were confirmed by refractometry. ΔG (the free energy of folding) and m values (the dependence of the free energy of folding on the denaturant concentration) were determined from the chemical denaturation data assuming a two-state transition and using the linear extrapolation method (10). Nonlinear regression calculations were performed employing Origin 5.0. The errors in the ΔG values derived from the fitting procedure were in all cases smaller than 0.1 kcal/mol. Thermal unfolding transitions were measured employing 1-cm path length quartz cells from samples containing 3 μM protein in 10 mM phosphate, 0.5 mM DTT, pH 6, and five different NaCl concentrations (0, 0.1, 0.4, 1.0, and 2.0 M). The transitions were monitored by the decrease of the CD signal at 222 nm using a 2-nm bandwidth. Heating rates were 20°C/h. Transitions were evaluated using a nonlinear least squares fit assuming a two-state model with sloping pre- and post-transitional baselines. ΔG values at a common temperature (55 or 77°C depending on the particular set of protein variants, see Table 1) were derived from the thermal denaturation data. In this procedure, a constant value of −0.7 kcal/(mol·K) was used for the change in heat capacity upon folding (ΔCp). It represents the average of all the values found in the analyses of the individual thermal transitions.
In addition, it is consistent with the denaturant m values measured for the different protein variants and also with the decrease in the accessible surface area estimated for the homeodomain upon folding (11). The fact that m values exhibit very low variations for the different mutants (in all cases they are in the 0.8-1 kcal/(mol·M) range for denaturation experiments with urea and in the 1.4-1.7 kcal/(mol·M) range for those with guanidinium chloride) supports the employment of a common ΔCp for all of them (11). Comparable ΔCp values have been described for the Engrailed (−0.7 kcal/(mol·K)) (9) and VndK2 (−0.52 kcal/(mol·K)) (12) homeodomains. In addition, it has to be considered that most variants showed denaturation midpoints relatively close to 55 or 77°C (depending on the set of mutants), and therefore large extrapolations were not necessary for determining ΔG at these temperatures.
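The extrapolation of ΔG to a common temperature can be sketched with the standard Gibbs-Helmholtz relation for a two-state transition. The text does not spell out the exact expression used, so the formula, the function name, and the example inputs below are assumptions for illustration. The sketch is written for unfolding (positive ΔH and ΔCp), so the folding free energy has the opposite sign.

```python
import math

def delta_g_unfolding(t_celsius, tm_celsius, dh_m, dcp=0.7):
    """Two-state unfolding free energy (kcal/mol) at t_celsius.

    dh_m : unfolding enthalpy at Tm in kcal/mol (hypothetical input)
    dcp  : heat capacity change on unfolding in kcal/(mol K); 0.7 mirrors
           the magnitude of the constant value quoted in the text
    """
    t = t_celsius + 273.15
    tm = tm_celsius + 273.15
    return dh_m * (1.0 - t / tm) + dcp * ((t - tm) - t * math.log(t / tm))

# Hypothetical variant: Tm = 60 C, dH = 40 kcal/mol, evaluated at 55 C.
dg_55 = delta_g_unfolding(55.0, 60.0, 40.0)
```

Because the ΔCp term vanishes as the reference temperature approaches Tm, the choice of ΔCp matters little when the midpoint lies close to 55 or 77°C, which is the point made in the paragraph above.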
The standard errors for Tm and ΔH determined from the analysis of the individual melting profiles were typically not higher than 0.1°C and 1 kcal/mol, respectively. The accuracy in the ΔG values at 55 or 77°C is obviously correlated with the Tm. To estimate the errors in the folding free energies, their values were also derived considering a variation of 2 kcal/mol in ΔH (ΔH ± 2 kcal/mol), 0.2°C in Tm (Tm ± 0.2°C), and 0.2 kcal/(mol·K) in ΔCp (ΔCp ± 0.2 kcal/(mol·K)). In every case, the maximum difference between the calculated free energies was taken as the error (see supplemental Fig. S1).
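The error estimate described above can be sketched as a scan over the stated perturbations. This is one possible reading, not the authors' script: it assumes the Gibbs-Helmholtz form for ΔG, takes the largest deviation from the central value as the error, and uses hypothetical inputs.

```python
import itertools
import math

def delta_g(t, tm, dh, dcp):
    """Gibbs-Helmholtz unfolding free energy; temperatures in kelvin."""
    return dh * (1.0 - t / tm) + dcp * ((t - tm) - t * math.log(t / tm))

def delta_g_error(t, tm, dh, dcp, d_tm=0.2, d_dh=2.0, d_dcp=0.2):
    """Largest deviation of dG from its central value over all +/- shifts."""
    central = delta_g(t, tm, dh, dcp)
    deviations = []
    for s_tm, s_dh, s_dcp in itertools.product((-1, 0, 1), repeat=3):
        shifted = delta_g(t, tm + s_tm * d_tm, dh + s_dh * d_dh,
                          dcp + s_dcp * d_dcp)
        deviations.append(abs(shifted - central))
    return max(deviations)

# Hypothetical variant evaluated at 55 C (328.15 K).
err_near = delta_g_error(328.15, 331.15, 40.0, 0.7)  # Tm 3 K above target
err_far = delta_g_error(328.15, 348.15, 40.0, 0.7)   # Tm 20 K above target
```

In this toy case the error grows as Tm moves away from the evaluation temperature, which illustrates why the accuracy of ΔG is tied to the Tm.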
Finally, the contributions to stability still present in 2 M NaCl were assumed to reflect primarily packing and hydrophobic interactions (herein referred to as ΔG(2 M NaCl)), whereas those that could be screened by adding 2 M NaCl were attributed to coulombic interactions (ΔGcoulomb). They were calculated from the difference ΔGcoulomb = ΔG(0 M NaCl) − ΔG(2 M NaCl), and the corresponding errors were estimated from E(ΔGcoulomb) = [...]
[...] and Varian Unity 500-MHz spectrometers. NMR samples included 400-800 μM protein in 100 mM NaCl, 10 mM sodium phosphate buffer, and 2 mM DTT, pH 6.0. Protein assignments were obtained using a set of two-dimensional NOESY (13), TOCSY (14), and DQF-COSY (15) experiments. The TOCSY, NOESY, and DQF-COSY experiments were carried out in the phase-sensitive mode using the TPPI method (15) for quadrature detection in the indirect dimension. Typically, a data matrix of 512 × 2048 points was used to digitize a spectral width of 6000-8000 Hz. 80 scans were used per increment with a relaxation delay of 1 s. Prior to Fourier transformation, zero filling was used in the indirect dimension to expand the data to 1K × 2K. Baseline correction was applied in both dimensions. The TOCSY spectra were acquired using an isotropic mixing period of 60 ms. The NOESY experiments were performed with mixing times from 50 to 200 ms.
Upper limits for proton-proton distances were obtained from NOESY cross-peak intensities at three mixing times, 50, 150, and 200 ms. Cross-peaks were classified as strong, medium, and weak corresponding to upper limits of 2.5, 3.5, and 5.0 Å. The lower limit for proton/proton distances was set as the sum of the van der Waals radii of the protons. Distance geometry calculations were performed on a Silicon Graphics O2 computer using the program DYANA (16). A set of 703 constraints was used in the final round of calculations.
The 30 best DYANA structures in terms of target function were submitted to a simulated annealing protocol (17) with the AMBER 5.0 package and the parameters described by Kollman and co-workers (18). To prevent nonrealistic interactions between disordered regions of the protein and the structured helical core, explicit TIP3P (18) water molecules and periodic boundary conditions were employed in these calculations (19).
pKa Measurements-NMR experiments were performed at 5°C on a Bruker Avance 800 spectrometer in 85:15 ¹H₂O:²H₂O. NMR samples were prepared in 100 mM NaCl, 10 mM sodium phosphate, and 2 mM DTT at 400-600 μM protein concentrations. Resonances for all the glutamate and aspartate side chains were assigned employing a set of two-dimensional NOESY (13), TOCSY (14), and DQF-COSY (15) experiments. Changes in chemical shift for the side-chain Hγ/Hβ protons in glutamates and Hβ protons in aspartates were monitored as a function of pH. The pKa values were obtained from nonlinear least squares fitting employing the Henderson-Hasselbalch equation with Hill coefficients set to 1.
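The pKa fit can be sketched as follows. This is a minimal illustration, not the fitting routine used in the study: it assumes a single titrating group with a Hill coefficient of 1, scans candidate pKa values on a grid, and solves the two plateau chemical shifts by linear least squares at each step. All names and the synthetic data are hypothetical.

```python
def fraction_deprotonated(ph, pka):
    """Henderson-Hasselbalch fraction of the charged (deprotonated) form."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))

def fit_pka(ph_values, shifts, pka_min=2.0, pka_max=7.0, step=0.01):
    """Return (pKa, shift_protonated, shift_deprotonated) minimizing the RSS."""
    best = None
    n_steps = int(round((pka_max - pka_min) / step))
    for i in range(n_steps + 1):
        pka = pka_min + i * step
        f = [fraction_deprotonated(ph, pka) for ph in ph_values]
        a = [1.0 - fi for fi in f]
        # Normal equations for shift = d_acid * a + d_base * f
        saa = sum(x * x for x in a)
        sff = sum(x * x for x in f)
        saf = sum(x * y for x, y in zip(a, f))
        say = sum(x * y for x, y in zip(a, shifts))
        sfy = sum(x * y for x, y in zip(f, shifts))
        det = saa * sff - saf * saf
        if abs(det) < 1e-12:
            continue
        d_acid = (say * sff - saf * sfy) / det
        d_base = (saa * sfy - saf * say) / det
        rss = sum((d_acid * ai + d_base * fi - y) ** 2
                  for ai, fi, y in zip(a, f, shifts))
        if best is None or rss < best[0]:
            best = (rss, pka, d_acid, d_base)
    return best[1], best[2], best[3]

# Synthetic titration curve (hypothetical shifts in ppm, true pKa = 3.75).
ph_points = [2.0 + 0.5 * k for k in range(11)]
synthetic = [2.40 + (2.25 - 2.40) * fraction_deprotonated(ph, 3.75)
             for ph in ph_points]
pka_fit, d_acid_fit, d_base_fit = fit_pka(ph_points, synthetic)
```

A grid scan is used here only to keep the sketch dependency-free; a general nonlinear least squares routine would serve the same purpose.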
DNA Binding Experiments-Binding studies were performed at 15, 25, and 35°C, in 150 mM NaCl, 20 mM phosphate, 5 mM MgCl2, pH 6.0, using a VP-ITC titration calorimeter (MicroCal, LLC) with a reaction cell volume of 1.467 ml. Both the protein and duplex DNA (5′-GTCTAATTGACGCG-3′ and its complementary 5′-CGCGTCAATTAGAC-3′) solutions were dialyzed against the same buffer prior to ITC experiments to ensure chemical equilibration. Typically, 4.0-14.0 μM duplex DNA in the reaction cell was titrated with a 100-200 μM solution of the different Hesx-1 variants contained in a 300-μl syringe. At least 30 consecutive injections of 5-10 μl were applied at 5-min intervals, while the DNA solution was stirred at a constant speed of 300 rpm. Dilution heats of protein into DNA solutions (which agreed with those obtained by injections of proteins into the same volume of buffer) were subtracted from measured heats of binding. Titration curves were analyzed with Origin, provided with the instrument by MicroCal LLC, using a one-site binding model to fit the curves. For every single protein variant, thermodynamic parameters were derived from two independent experiments and averaged.

RESULTS
Occurrence of Salt Bridges within the Homeodomain Family-As a first step, the relative occurrence of both salt bridges and hydrophobic pairs at positions 19/30, 31/42, and 17/52 was determined from the information stored in the Homeodomain Resource Data Base (1056 nonredundant full-length and domain sequences). For our analysis, a data set of 754 proteins, including the canonical 60-residue homeodomain sequence with no insertions or deletions, was considered. The results obtained are represented schematically in Fig. 1, a-c. It can be observed that positions 19/30 are connected by salt bridge interactions of both polarities in 52.5% of the cases (396 sequences) with a clear preference for those combinations with a positively charged residue in 30 (Fig. 1a). In contrast, aliphatic residues at these sites are relatively rare (2.5%).
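The kind of tally behind these percentages can be sketched as below. The residue classes, function names, and placeholder sequences are assumptions made for illustration; the actual criteria and data set used for Fig. 1 are not reproduced here.

```python
POSITIVE = set("KR")     # assumed positively charged residues
NEGATIVE = set("DE")     # assumed negatively charged residues
ALIPHATIC = set("VLI")   # assumed aliphatic class

def classify_pair(seq, i, j):
    """Classify residues at 1-based positions i and j of an aligned domain."""
    a, b = seq[i - 1], seq[j - 1]
    if (a in POSITIVE and b in NEGATIVE) or (a in NEGATIVE and b in POSITIVE):
        return "salt bridge"
    if a in ALIPHATIC and b in ALIPHATIC:
        return "aliphatic pair"
    return "other"

def pair_frequencies(sequences, i, j):
    """Percentage of sequences in each pair class."""
    counts = {}
    for seq in sequences:
        label = classify_pair(seq, i, j)
        counts[label] = counts.get(label, 0) + 1
    total = len(sequences)
    return {label: 100.0 * n / total for label, n in counts.items()}

def make_domain(res19, res30):
    """Placeholder 60-residue string with chosen residues at 19 and 30."""
    seq = ["G"] * 60
    seq[18], seq[29] = res19, res30
    return "".join(seq)

toy_set = [make_domain("E", "R"), make_domain("R", "E"),
           make_domain("V", "I"), make_domain("Q", "R")]
freqs = pair_frequencies(toy_set, 19, 30)
```

The fixed 1-based indexing only works for sequences of the canonical 60-residue length with no insertions or deletions, which is why the analysis above is restricted to that subset.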
In some cases, residues 19/30 can establish a cooperative network with positions 15/37, like that observed in the x-ray structure of the Engrailed homeodomain (9,20). An equivalent pattern of interactions involving pairs 19/30 and 15/37 would be feasible in only 2.7% of the sequences stored in the Homeodomain Resource Data Base (see Fig. 1b). In fact, an alternative network involving the triad 19/30/33 (typically Glu19/Arg30/Glu33) seems to be much more frequent within this family of proteins.
From a statistical analysis of homeodomain sequences, Clarke (8) identified a dominating pattern of pairwise co-variation centered on residue 26. Using the co-varying network, homeodomains were divided into two classes. One class has branched aliphatic residues at position 26, whereas the second contains proline. Previous analysis of a reduced data set of representative human homeodomain sequences (9) found that the branched aliphatic subgroup usually has a salt bridge connecting residues 19 and 30 (92% of cases). Strikingly, none of the Pro26 subgroup, with more than 60 members, had this potential interaction. Based on this observation, it was suggested that salt bridge 19/30 in the presence of Leu26 could contribute to the conformational variability of the loop between helices I and II. This could affect the ability of the homeodomain to undergo induced fit on binding to DNA.
Our analysis of an extended data set with 754 sequences reveals a slightly more complex situation (Fig. 1c). For homeodomains with a salt bridge connecting 19/30, 26 can either be a branched aliphatic residue (usually leucine) or proline, depending on the salt bridge polarity. If the positive charge is at 30, 26 is a branched aliphatic amino acid in >96% of the cases. On the other hand, if position 30 is negatively charged, then 26 can be either leucine or proline, although leucine is clearly preferred (Fig. 1c).

Regarding positions 31/42, they form a salt bridge in the majority of homeodomains (61.7%). For a minor fraction of sequences, an aliphatic residue can be found either in 31 (14.7%) or 42 (3.8%). Finally, both are aliphatic in just five sequences (0.7%) of our data set (Fig. 1a).
The most conserved salt bridge interaction within the homeodomain family is that connecting positions 17/52, present in 75.1% of the sequences. Hydrophobic residues at either position, 17 or 52, can be found in 0.8 and 0.1% of the cases, respectively. However, a hydrophobic pair at this site is not present in any single protein within our data set (Fig. 1a). A selection of representative homeodomain structures with salt bridges 19/30, 31/42, and 17/52 is shown in supplemental Fig. S2.
Salt Versus Hydrophobic Bridges as Determinants of Homeodomain Stability, Methodological Aspects-To analyze the relative efficacy of salt versus hydrophobic bridges at positions 19/30, 31/42, and 17/52 in the stabilization of the homeodomain fold, several single, double, triple, and quadruple mutants of Hesx-1 homeodomain were produced. The obtained variants (1-33 in Table 1) were subjected to both heat and chemically induced denaturation experiments.
To gain further insights into the different contributions to the observed changes in stability, thermal denaturation profiles were collected at five different NaCl concentrations (0, 0.1, 0.4, 1.0, and 2.0 M), and the dependence of ΔG on the ionic strength at a common temperature (55 or 77°C, see Table 1) was analyzed.
Typical thermal and chemical denaturation profiles collected for Hesx-1 mutants are shown in Fig. 2a and supplemental Figs. S3 and S4. It can be observed that the obtained variants are strongly stabilized by increasing NaCl concentrations. Moreover, in all cases, ΔG exhibits a linear dependence on the square root of the ionic strength, suggesting that the electrostatic screening of unfavorable interactions is the dominant mechanism for the observed salt stabilization (21).
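The linear dependence on the square root of the ionic strength can be checked with an ordinary least squares fit, sketched below. The ΔG values are hypothetical, and the ionic strength is approximated by the NaCl concentration alone (a 1:1 electrolyte, buffer contribution neglected); both are assumptions for illustration.

```python
import math

def linear_fit(x, y):
    """Ordinary least squares; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

nacl_molar = [0.0, 0.1, 0.4, 1.0, 2.0]           # concentrations used in the text
dg_folding = [-0.50, -0.82, -1.13, -1.50, -1.91]  # hypothetical, kcal/mol

sqrt_ionic = [math.sqrt(c) for c in nacl_molar]
slope, intercept = linear_fit(sqrt_ionic, dg_folding)
```

The fitted slope plays the role of the m1 parameter reported in Table 1, in kcal/(mol·M^1/2).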
Free energies derived at 0 and 2 M NaCl are referred to as ΔG(0 M NaCl) and ΔG(2 M NaCl), respectively. Following the methodology described by Schmid and co-workers (22-25), the contributions to stability still present in 2 M NaCl were assumed to reflect primarily packing and hydrophobic interactions. Other effects, such as desolvation penalties, the intrinsic helix propensities of the different residues, or their conformational entropies, might also contribute to this term. On the other hand, those contributions that could be screened by adding 2 M NaCl were attributed to coulombic interactions (ΔGcoulomb). Throughout this paper, residues relevant for the discussion in the different protein variants are indicated in parentheses.
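The decomposition amounts to a simple difference, sketched here with hypothetical numbers. The error propagation in quadrature is an assumption made for this sketch (it treats the two uncertainties as independent) and is not taken from the text.

```python
import math

def coulomb_contribution(dg_0m, dg_2m, err_0m=0.0, err_2m=0.0):
    """Split the folding free energy into salt-screenable and residual parts.

    dg_0m, dg_2m : folding free energies at 0 and 2 M NaCl (kcal/mol)
    Returns (dg_coulomb, error); the error formula is an assumed
    quadrature sum, not the expression used in the study.
    """
    dg_coulomb = dg_0m - dg_2m
    error = math.sqrt(err_0m ** 2 + err_2m ** 2)
    return dg_coulomb, error

# Hypothetical variant: folding is more favorable at 2 M NaCl.
dg_coulomb, dg_coulomb_err = coulomb_contribution(-1.2, -2.9, 0.1, 0.1)
```

With these sign conventions a positive ΔGcoulomb indicates net unfavorable electrostatics that are screened by salt, whereas ΔG(2 M NaCl) is read as the packing and hydrophobic term.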

TABLE 1 Stability data measured for wild-type Hesx-1 and variants 1-33 from chemical (left) and thermal (right) denaturation experiments (employing urea or guanidinium chloride as indicated)
Mutants are numbered according to their thermal stability at 0 M NaCl. For all variants a proline residue is present at the loop position 26 unless explicitly stated (in parentheses, variants 1 and 17).
a m (kcal/(mol·M)), slope of ΔG versus denaturant concentration plots.
b Tm is the midpoint of the thermal unfolding transition. Values measured at 0 and at 2 M NaCl concentration are shown.
c ΔG is the change in Gibbs free energy of folding at 55 or 77°C (as indicated); ΔG at 0 M NaCl is the total change, ΔG at 2 M NaCl represents the nonpolar contribution, and the difference between 0 and 2 M NaCl represents the coulombic contribution to ΔG.
d m1 (kcal/(mol·M^1/2)), slope of ΔG versus [NaCl]^1/2 determined as described by Rios and Plaxco (21). This parameter is proportional to the ΔGcoulomb term.

Salt Bridges Versus Hydrophobic Pairs at Positions 19/30-First, the influence of pair 19/30 on the stability of Hesx-1 was analyzed. It is well established that formation of extensive cooperative networks can greatly enhance the stabilizing effect of salt bridges in proteins (26). Pair 19/30 has been shown to be involved in a polar network with residues 15/37 in the Engrailed homeodomain (Fig. 1b). Taking this into account, variants including different combinations of charged/aliphatic side chains at these four positions were considered. The obtained results are represented in Table 1 and summarized in Fig. 2b and supplemental Figs. S4-S7.
It can be observed that the single or double replacement of residues Val19/Ile30 (present in the wild-type polypeptide) by charged side chains to build salt bridges of different polarities results, in all cases, in a clear destabilization of the homeodomain fold. Interestingly, this effect seems relatively independent of the context provided by pair 15/37. The salt dependence exhibited by ΔG at 55°C provides a possible explanation for this behavior. Thus, although the unfavorable influences of the mutations are, in most cases, slightly attenuated at 2 M NaCl (indicating a less optimized electrostatic balance in the mutants), to a large extent they are maintained at such ionic strength (see ΔG(0 M NaCl), ΔG(2 M NaCl), and ΔGcoulomb terms in Table 1 and Fig. 2b), suggesting that Val19 and Ile30 establish more optimized packing interactions with the protein hydrophobic core than the charged side chains. Even variants 11 [...] (Table 1). It is likely that this particular distribution of amino acids allows a better packing of the side-chain methylene groups with the protein hydrophobic core, and/or a more optimized interaction between position 30 and the loop connecting helices I and II. Interestingly, the observed preference is reflected in the relative abundance of both types of contacts in the Homeodomain Resource Data Base (salt bridges of equivalent polarity to those present in mutants 19 and 16 are present in 33.8 and 18.7% of the sequences, respectively, see Fig. 1a). In conclusion, simple hydrophobic interactions established by aliphatic residues at positions 19 and 30 provide a more effective stabilization of the Hesx-1 fold than salt bridges of any polarity, either isolated or networked.
Interaction between Pair 19/30 and the Loop Residue 26, Influence on Homeodomain Stability-As indicated under "Occurrence of Salt Bridges within the Homeodomain Family," the homeodomain resource data base reveals an intriguing correlation between pair 19/30 and residue 26, located on the loop between helices I and II.
Wild-type Hesx-1 includes a proline at position 26, and residues 19/30 are both aliphatic (Val19/Ile30, see Fig. 3). This pair was replaced by a salt bridge Glu19/Arg30 in the previously analyzed mutants 19 and 15, resulting in a rather unusual combination (Glu19/Pro26/Arg30). To assess the relative effect of aliphatic versus salt bridges connecting sites 19/30 on the protein stability, in the context of either proline or leucine at position 26, two additional protein variants were produced and analyzed (Fig. 3a). First, the triple mutant 17 (Glu19/Leu26/Arg30), which contains the triad more frequently found in homeodomains, was obtained. Second, Pro26 was replaced by a leucine in the wild-type protein to produce mutant 1, characterized by the unnatural triad Val19/Leu26/Ile30.
As a preliminary analysis, high resolution one- and two-dimensional NMR spectra were collected for mutants 17 (Glu19/Leu26/Arg30) and 19 (Glu19/Pro26/Arg30) and compared with those corresponding to the wild-type polypeptide (Fig. 3b). All spectra exhibit sharp signals with a similar dispersion. Moreover, distinct chemical shifts and key long range NOEs, which are hallmarks of this family of proteins, could also be identified for the two mutants, which rules out the existence of major structural differences between the three homeodomains. Next, both 17 (Glu19/Leu26/Arg30) and 19 (Glu19/Pro26/Arg30) were subjected to denaturation studies. The results obtained are shown schematically in Fig. 3a. It can be observed that the P26L substitution in 19 (Glu19/Pro26/Arg30), to give 17 (Glu19/Leu26/Arg30), has a minor effect on the homeodomain stability. In fact, 17 (Glu19/Leu26/Arg30) is slightly less stable, according to both chemical and thermal denaturation experiments, mainly reflecting a less optimized electrostatic balance (see ΔGcoulomb terms in Table 1). A speculative but plausible explanation for this result is that the P26L mutation induces minor conformational differences in the loop between helices I and II (a flexible region in homeodomains), which in turn might lead to a poorer electrostatic interaction between charged residues Glu19 and Arg30.
A different behavior was observed for variant 1 (Val19/Leu26/Ile30). Strikingly, the P26L substitution in the wild-type protein strongly promoted the formation of inclusion bodies under all the expression conditions tested. In addition, the minor fraction of protein that was expressed in soluble form showed a strong tendency to aggregate and precipitate and was lost during purification in all attempts. Therefore, this mutant could not be subjected to a detailed thermodynamic analysis. However, taking into account that residue 26 is not solvent-exposed in the native protein (and therefore is not likely to significantly affect the protein solubility), this behavior suggests that the P26L mutation strongly destabilizes the protein. This conclusion might seem surprising considering that the substitution appears conservative at first glance. However, it would be in agreement with previous studies on protein design (27-30). More specifically, it has been shown that although some variations in the buried positions of a protein are allowed, there are limits on the sequences that result in stable native-like folds (27-30). Taking this into account, specific packing between residues 19/26/30 seems critical for the stability of the homeodomain fold.
Salt Bridges Versus Hydrophobic Pairs at Positions 31/42, Three-dimensional Solution Structure of a Thermostable Homeodomain Variant-To determine the influence of pair 31/42 on the homeodomain structure and stability, additional single and double mutants were prepared and analyzed. The results obtained are schematically shown in Fig. 4a. It can be observed that the single replacement E42L (variant 28 (Arg31/Leu42)) slightly increases the protein stability (see Table 1 and Fig. 4a). This result was unexpected given the exposed location of residue 42 and considering that the E42L substitution disrupts the salt bridge present in the wild-type polypeptide (and observed in most homeodomain structures). A much larger stabilization (up to 2.32 kcal/mol at 77°C) is achieved by the single mutation R31L (variant 29 (Leu31/Glu42)). Finally, the double substitution R31L/E42L leads to variant 31 (Leu31/Leu42), a thermophilic version of the Hesx-1 homeodomain, stabilized by more than 4 kcal/mol with respect to the wild-type polypeptide (see Table 1 and Fig. 4a) and characterized by melting temperatures in the 89-95°C range. The origin of this effect can be understood by examining the thermodynamic data in detail. Thus, despite the disruption of the Arg31/Glu42 interaction, replacement of both side chains by aliphatic residues results in an improvement of the homeodomain electrostatic balance (as deduced from the ΔGcoulomb term). In addition, these substitutions lead to a very favorable ΔG(2 M NaCl) term, suggesting more optimized packing or hydrophobic contacts in the mutated proteins. Indeed, energy differences at 0 M NaCl are to a large extent maintained in the presence of very high ionic strength (2 M NaCl, see Fig. 4a). In principle, ΔG(2 M NaCl) should be rather sensitive to the packing of the aliphatic side chain at position 31.
To test this point, the leucine residue present at position 31 in variant 31 (Leu31/Leu42) was replaced by an isoleucine to give mutant 30 (Ile31/Leu42). As expected, 30 (Ile31/Leu42) and 31 (Leu31/Leu42) present identical coulombic contributions to ΔG at 77°C. In contrast, the salt-independent contribution is slightly reduced in variant 30 (Ile31/Leu42). The observed destabilization might partially reflect the lower intrinsic helix propensity of isoleucine with respect to that of leucine. However, it also suggests that a leucine side chain establishes more optimized packing interactions within the protein hydrophobic core than an isoleucine.
As a next step, one-dimensional NMR spectra were collected for the wild-type protein and variant 31 (Leu31/Leu42) at different temperatures (Fig. 4b). It can be observed that the thermophilic polypeptide exhibits good dispersion in the proton one-dimensional spectra, a characteristic of well folded proteins. Moreover, the lack of significant broadening of the signals rules out the existence of aggregation under the experimental conditions employed. Finally, the NMR data are in agreement with the CD denaturation experiments, confirming the enhanced stability of 31 (as revealed by the persistence of its proton NH signals at high temperatures). To determine the influence of pair Leu31/Leu42 on the homeodomain conformation, its solution structure was solved by employing NMR methods. An ensemble of 30 structures was calculated on the basis of 703 unambiguous nuclear Overhauser effects. The structures have very small deviations from ideal geometry and reasonable nonbonded contacts (supplemental Tables S1 and S2). A schematic illustration of the 25 simulated annealed structures is depicted in Fig. 5 and supplemental Fig. S8.
It can be observed that, despite the mutations, the global structure of 31 (Leu31/Leu42) is very similar to that reported for wild-type Hesx-1 (backbone pairwise root mean square deviation of 0.51 Å between residues 8 and 55). According to the NMR data, Leu31 is buried, presenting clear contacts with side chains of Leu42, Pro26, and Phe49 (Fig. 5, a and b). Minor structural differences between the two polypeptides are located on the loop between helices I and II (Fig. 5c). For example, in the wild-type protein, Pro26 stacks on the aromatic ring of Phe49. As a result of this interaction, Pro26 Hγ protons present unusual chemical shifts (1.04/0.12 ppm). For variant 31 (Leu31/Leu42), the Leu31/Phe49 contact induces a slight displacement in the loop residue Pro26, which is clearly reflected in its side-chain resonances (i.e. the proton Hγ2 is shifted downfield 0.85 ppm with respect to wild-type Hesx-1). The refined structure of the thermophilic polypeptide conclusively shows that the interaction between Pro26 and Phe49 (present in the wild-type homeodomain) is partially disrupted in 31 by the burial of the Leu31 side chain in the hydrophobic core. In contrast to residue 31, the leucine at position 42 is exposed to the solvent. However, this residue contributes significantly to protein stabilization through its contact with Leu31. The close proximity between both side chains is evidenced by several nuclear Overhauser effect cross-peaks. In fact, the three-dimensional structure obtained shows that the Leu42 aliphatic side chain significantly reduces the solvent-accessible surface of Leu31, which is almost totally buried in mutant 31.
In conclusion, aliphatic side chains at positions 31 and 42 provide a more efficient stabilization of the homeodomain fold than the highly conserved Arg31/Glu42 contact. This is achieved without promoting significant aggregation of the polypeptide even at high concentrations. The NMR data are consistent with a local repacking of the hydrophobic core in the thermophilic protein, which is in agreement with the stabilizing ΔG(2 M NaCl) terms measured for variants 30 (Ile31/Leu42) and 31 (Leu31/Leu42).
Salt Bridges Versus Hydrophobic Pairs at Positions 17/52-The salt bridge 17/52 has been the subject of several studies in the past. For example, a stabilization of 0.5 kcal/mol has been reported for the K52E variant of the Engrailed homeodomain (in which a bridge Lys17/Glu52 can be formed) with respect to the single mutant K52A (9). Moreover, it has been shown that the H52R replacement in vnd/NK-2 (allowing the electrostatic interaction between side chains of Glu17 and Arg52) increases the thermal stability of the polypeptide by 7.7°C at 0 M NaCl (12). Overall, these studies indicate that a salt bridge connecting 17 and 52 is moderately stabilizing. Interestingly, close inspection of the sequence information available for homeodomains reveals a very low occurrence of hydrophobic residues at sites 17 or 52. Aliphatic (usually Val or Leu) or aromatic (Tyr in all cases) side chains can be found at position 17 in less than 2% of the sequences included in our data set. Although a minor fraction of polypeptides exhibits aromatic residues (Phe, Trp, or Tyr) at position 52 (<0.5%), aliphatic amino acids (Val, Leu, or Ile) are totally absent from this site. Finally, hydrophobic bridges (involving either aliphatic or aromatic residues) connecting 17/52 were not found in any sequence. This observation argues against the viability of such an interaction in homeodomains. To test this prediction, additional mutants were designed. Thus, variants 32 and 33, with pairs Leu17/Tyr52 and Tyr17/Tyr52, respectively, were produced and analyzed. Inspection of molecular models suggests that both pairs should be sterically tolerated and could provide stabilizing interactions with the hydrophobic core of the protein. The obtained results are schematically represented in Fig. 6a and Table 1. Only a minor amount of mutant 33 (Tyr17/Tyr52), insufficient for a detailed thermodynamic analysis, was purified after several attempts.
However, the measured thermal denaturation profiles are consistent with a significant decrease in stability (−15°C at 0 M NaCl). Regarding variant 32 (Leu17/Tyr52), this protein was not expressed in soluble form under any of the conditions tested. It seems that the Leu17/Tyr52 combination strongly promotes protein aggregation. Altogether, these observations suggest that nonpolar side chains connecting 17/52 reduce the protein stability and might also affect its solubility. In conclusion, only polar residues seem to be tolerated at positions 17/52.

Stabilizing Influence of Side Chain Packing in Conserved Salt Bridges-In addition to their electrostatic contribution to the energy of folding, charged side chains could also contribute to protein stability by means of packing interactions involving methylene groups. To analyze this effect, two additional variants were prepared. In both cases, a glutamic acid involved in a salt bridge interaction was replaced by the shorter aspartic acid (see Fig. 6b). Thus, the Glu19/Arg30 bridge present in variant 19 was replaced by pair Asp19/Arg30 (variant 2). In a similar way, residue Glu42, which is involved in a salt bridge with Arg31 in the wild-type protein, was substituted by Asp (to give 27, see Table 1 and Fig. 6b).
According to our analysis, the salt bridge Asp 19/Arg 30, present in variant 2, is strongly destabilizing with respect to the pair Glu 19/Arg 30 present in variant 19 (see Table 1). This remarkable destabilization is maintained in the presence of 2 M NaCl (ΔΔG_{2 M NaCl} = 1.69 kcal/mol). A fraction of the observed difference in ΔG_{2 M NaCl}, estimated at about 0.2 kcal/mol (32), could be attributed to the more unfavorable contribution of the aspartic acid side chain to the intrinsic helix propensity of the polypeptide. The remaining 1.49 kcal/mol difference is suggestive of less optimized packing or hydrophobic interactions in variant 2 (Asp 19/Arg 30). In contrast, the ΔG_coulomb terms are similar in variants 2 (Asp 19/Arg 30) and 19 (Glu 19/Arg 30), which indicates that the electrostatic interaction between 19 and 30 is not altered significantly by the shortening of the acidic side chain.
In a similar way, the single substitution E42D in the wild-type polypeptide (to give variant 27) leads to a moderate decrease in stability (around 0.7 kcal/mol at both 20 and 55°C; see Table 1). Again, the observed difference in free energy between the wild-type protein (with pair Arg 31/Glu 42) and variant 27 (Arg 31/Asp 42) is rather insensitive to the ionic strength, suggesting that the reduced number of hydrophobic and van der Waals contacts established by the shorter Asp 42 side chain is responsible for this effect. In conclusion, our results indicate that the packing of methylene groups from salt-bridging side chains provides significant favorable contributions to the global ΔG.
NMR Analysis of pKa Values for Selected Multiple Mutants-To further dissect the electrostatic contributions to homeodomain stability, we measured the pKa values of acidic residues (Glu and Asp) in the folded state for selected protein variants, employing NMR spectroscopy. By monitoring the chemical shift dependence of resonances adjacent to the titrating groups, the degree of stabilization of the charged state may be estimated (33)(34)(35)(36). Our analysis was focused on quadruple mutants 12 and 15, given that they include all the potential salt bridge interactions conserved within this family of proteins (involving pairs Arg 19 Table 2 and supplemental Fig. S9. The measured pKa values for Glu 17 in 12 and 15 are 3.75 and 3.95, respectively, which is consistent with a moderate stabilizing effect of the charged residue (in the 0.6-0.8 kcal/mol range assuming a pKa = 4.4 in the unfolded state). In contrast, the side chain of Glu 42 shows a pKa close to those typical for random coils (4.39 and 4.40 for 12 and 15, respectively), suggesting that, overall, a negative charge at this position makes a minor contribution to protein stability.
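The conversion from a pKa shift to a stabilization free energy can be sketched with the standard relation ΔΔG = 2.303 RT (pKa,unfolded − pKa,folded). The short Python example below is only illustrative: the function name is hypothetical, the reference pKa of 4.4 is the value assumed in the text, and the temperature of 25°C is an assumption, because the temperature of the NMR titrations is not stated in this section. Under these assumptions the Glu 17 values come out with a magnitude similar to the range quoted above, whereas Glu 42 yields a value close to zero.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def charge_stabilization(pka_folded, pka_unfolded=4.4, temp_k=298.15):
    """Free energy (kcal/mol) by which folding favours the charged side chain.

    Implements ddG = ln(10) * R * T * (pKa_unfolded - pKa_folded).
    Positive values mean the deprotonated (charged) form is stabilized
    in the folded protein. temp_k = 298.15 K is an assumed temperature.
    """
    return math.log(10) * R_KCAL * temp_k * (pka_unfolded - pka_folded)


# pKa values quoted in the text for variants 12 and 15
measured = [
    ("Glu 17, variant 12", 3.75),
    ("Glu 17, variant 15", 3.95),
    ("Glu 42, variant 12", 4.39),
]
for label, pka in measured:
    print(f"{label}: {charge_stabilization(pka):.2f} kcal/mol")
```

Because the result scales linearly with both the absolute temperature and the assumed unfolded-state pKa, the estimate is only as reliable as those two inputs.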
Finally, the negative charge of residue 37 in variant 15 seems to have a minor effect on the homeodomain stability, as judged by the measured pKa value (pKa = 4.52). A plausible explanation is that its favorable contact with Arg 15 is counterbalanced by a repulsive interaction with Asp 33.
It should be pointed out that these conclusions are in qualitative agreement with those derived from the thermal denaturation studies previously described (a comparison is shown in supplemental Fig. S10). Thus, the electrostatic influence of a negatively charged side chain on the protein stability could be estimated from the effect of neutralizing mutations on the ΔG_coulomb term. For example, the E42L substitution in the wild-type protein, to give variant 28, entails the removal of a negatively charged residue. As expected from the measured pKa value (4.4), this mutation does not significantly alter the electrostatic balance of the homeodomain (ΔΔG_coulomb = 0.04 kcal/mol, see supplemental Fig. S10a). In another example, the neutralization of Glu 15 in 12 by the E15V mutation, to give variant 13, produces significant changes in the salt-dependent ΔG_coulomb contribution (0.43 kcal/mol), which is also in agreement with the pKa measured for this residue (pKa = 3.8, see supplemental Fig. S10c). Similar conclusions can be obtained for the other acidic residues analyzed (see supplemental Fig. S10). Unfortunately, a more quantitative comparison between the results obtained with the two approaches is not possible, given that the pKa values in the Hesx-1 denatured state are not currently known.
Finally, the analysis of the pKa values suggests the presence of additional salt bridges in Hesx-1 not conserved within the homeodomain family. Thus, Glu 41 (pKa = 3.52 and 3.47 for 12 and 15, respectively) seems to participate in a polar intra-helical contact with Arg 44. More interestingly, Asp 43 (pKa = 3.77/3.67) might interact electrostatically with Arg 3, located at the disordered N terminus of the homeodomain. Indeed, both residues are in close proximity according to the NMR structure of wild-type Hesx-1. This dynamic protein region is known to participate in DNA recognition. One could speculate that the Asp 43/Arg 3 interaction might contribute to DNA binding by orienting the N-terminal region and also by reducing its conformational entropy.
Influence of Conserved Salt Bridges on the Homeodomain DNA Binding Properties-As a final step, the specific binding of selected Hesx-1 variants to a 14-bp DNA was analyzed, employing microcalorimetry. The experimental conditions selected (pH 6.0 in 150 mM NaCl, 20 mM phosphate, and 5 mM MgCl2) were close to those employed in the stability studies previously described, to allow a direct correlation between binding and stability.
First, the influence of pair 31/42 on the thermodynamic parameters of the association was determined. The structural data available indicate that residue Arg 31 usually participates in DNA recognition by forming a salt bridge with a DNA phosphate. Interestingly, in most cases, such an interaction is established without disrupting the previously existing contact Arg 31/Glu 42 (as illustrated by the x-ray structures of the MSX-1 and Engrailed homeodomains bound to a DNA duplex; see Fig. 7a). The thermodynamic influence of such a pattern of interactions on complex formation is unclear. On the one hand, it could be hypothesized that the Arg 31/Glu 42 salt bridge fixes the Arg 31 side chain in the right orientation for DNA binding (pre-organizing the protein receptor and reducing the entropic cost of the recognition process). On the other hand, secondary repulsive interactions between Glu 42 and the DNA backbone might destabilize the complex. However, this side chain is in most cases relatively far from the phosphates (6.5-9.0 Å), and therefore, under physiological conditions, its influence would presumably be small.
ITC titrations performed with the wild-type protein and variants 28 (Arg 31/Leu 42), 29 (Leu 31/Glu 42), and 31 (Leu 31/Leu 42) at 25°C are represented in Fig. 7b, and the derived thermodynamic parameters are shown in Table 3. It can be observed that, in all cases, complex formation is enthalpically driven, whereas entropy opposes binding. This is in contrast with the results described for the Nkx-2.5 (37) and vnd/NK-2 (12) homeodomains, where entropy was also found to be favorable. According to our data, the mutation E42L (to give variant 28) has an almost negligible influence on the homeodomain binding properties, indicating that the electrostatic interaction between the Glu 42 side chain and the DNA receptor, in the wild-type polypeptide, is small. In contrast, the single substitution of Arg 31 by leucine (to give 29) has a moderate effect on the complex stability. Thus, the affinity constant (Kb) is reduced by a factor of 4.4, corresponding to a free energy increment (ΔΔG) of 0.9 kcal/mol. This destabilization probably reflects the absence of the Arg 31/phosphate interaction in the complex formed by the mutated protein. However, it should be pointed out that the observed effect is modest in comparison with that reported for comparable contacts in other homeodomains. Thus, previous studies have shown that, in Hesx-1, the single replacement of Arg 53 (a highly conserved arginine, also involved in a salt bridge with the DNA backbone) by a cysteine decreases binding affinity by more than 2 orders of magnitude (19). This observation highlights the context-dependent character of such interactions and their role in DNA binding.
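The correspondence between the 4.4-fold reduction in Kb and the quoted ΔΔG of about 0.9 kcal/mol follows from ΔΔG = RT ln(Kb,wt/Kb,mut) at the titration temperature of 25°C. A minimal Python sketch of this arithmetic is given below; the function name is hypothetical, and the 100-fold case simply stands for the lower bound of "more than 2 orders of magnitude" mentioned for the Arg 53 replacement.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def ddg_from_affinity_ratio(fold_reduction, temp_c=25.0):
    """Binding free energy penalty (kcal/mol) for a given fold reduction in Kb.

    Implements ddG = R * T * ln(Kb_wild_type / Kb_mutant).
    """
    temp_k = temp_c + 273.15
    return R_KCAL * temp_k * math.log(fold_reduction)


# R31L (variant 29): Kb reduced by a factor of 4.4
print(round(ddg_from_affinity_ratio(4.4), 2))    # 0.88

# A 100-fold loss in affinity, for comparison
print(round(ddg_from_affinity_ratio(100.0), 2))  # 2.73
```

On this scale, a loss of two orders of magnitude in affinity costs roughly three times the free energy associated with the R31L substitution.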
Finally, it should be noted that the influence of both mutations on the protein-DNA complex stability is almost additive, i.e. the effect of the R31L substitution in the presence and absence of Glu 42 is almost identical (see Fig. 7b). It can be concluded from these data that the salt bridge Arg 31/Glu 42 exerts a minor influence on the homeodomain DNA binding properties.
As a next step, the role of salt bridges 19/30 and 15/37 in DNA binding was analyzed. In contrast with pair Arg 31/Glu 42, none of these sites is involved in direct contacts with the DNA duplex. In fact, they are all located far from the protein/DNA interface, as deduced from the inspection of the homeodomain-DNA complex structures available (see Fig. 8). However, they might still have an indirect influence on DNA binding by modulating the dynamic properties of the polypeptide and its adaptability. The thermodynamic parameters collected for selected mutants, with different combinations of charged or hydrophobic residues at these sites (shown in Table 3), do not support this hypothesis. It can be observed that differences in binding strength (ΔΔGb), with respect to the wild-type polypeptide, are in the 0-0.2 kcal/mol range. Although somewhat larger differences in enthalpy were measured, they are partially cancelled by the entropic contribution (enthalpy/entropy compensation). Under the experimental conditions employed, neither the conformational stability of the polypeptide (maximum differences in the free energy of folding for the selected mutants amount to 2.6 kcal/mol at 20°C; see Table 1) nor the total charge of the protein (which varies from −1 to +1 with respect to the wild-type homeodomain) seems to have a significant effect on complex formation.
To further analyze the DNA binding properties of the mutated homeodomains, the temperature dependence of the thermodynamic parameters was measured by performing ITC titrations at three different temperatures (15, 25, and 35°C) with a reduced set of variants (see Table 4 and supplemental Figs. S11 and S12). It has been shown that the specific interaction of homeodomains with DNA is accompanied by folding of their intrinsically disordered N- and C-terminal regions. In fact, coupling of local folding and binding is common among DNA-binding proteins (38). Considering that all protein variants included in our analysis present mutations far from these unstructured regions, it might be anticipated that they would have a minor influence on the induced fit of the protein upon binding to DNA. However, it has been shown that the homeodomain core is dynamic, leading to the hypothesis that this character is essential for the adaptability of the polypeptide. In addition, it has been suggested that salt bridge 19/30 is part of a co-varying network of interacting residues that plays a role in the induced fit of the protein on complex formation (9). The estimated changes in heat capacity that accompany the specific binding of the selected variants to DNA (ΔCp, one of the thermodynamic signatures of the induced fit) do not provide support for this hypothesis. Thus, for the wild-type protein it can be observed (Table 4 and supplemental Figs. S11 and S12) that the binding process becomes more exothermic with temperature. The measured variation in ΔH is consistent with a negative ΔCp value of −0.38 kcal/(mol K). A similar behavior was deduced for variants 16, 11, and 7 (including salt bridges 19/30 of opposite polarities). Our experimental data suggest a minor increase in ΔCp for the mutated homeodomains that might correlate with their thermal stability. This would be consistent with a slightly more pronounced adjustment of the polypeptides in the more unstable variants. However, the observed differences are, in all cases, ≤0.1 kcal/(mol K). Overall, the data obtained rule out a major effect of pairs 19/30 and 15/37 on DNA recognition and suggest that the induced fit of the protein is mainly restricted to residues in or close to the protein/DNA interface.
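As an illustration of how ΔCp is obtained from such data, the binding heat capacity change can be estimated as the slope of a linear fit of ΔH against temperature (ΔCp = dΔH/dT). The Python sketch below assumes an ordinary least-squares fit, which may differ from the procedure actually used. The function name is hypothetical, and the enthalpy values are placeholders chosen only to give a slope of −0.38 kcal/(mol K); they are not the measured values of Table 4.

```python
def heat_capacity_change(temps_c, enthalpies_kcal):
    """Least-squares slope of binding enthalpy versus temperature.

    Returns dCp in kcal/(mol K). A slope taken against temperatures in
    degrees Celsius equals the slope against kelvin, since the two
    scales differ only by an offset.
    """
    n = len(temps_c)
    mean_t = sum(temps_c) / n
    mean_h = sum(enthalpies_kcal) / n
    sxy = sum((t - mean_t) * (h - mean_h)
              for t, h in zip(temps_c, enthalpies_kcal))
    sxx = sum((t - mean_t) ** 2 for t in temps_c)
    return sxy / sxx


# Temperatures used for the ITC titrations
temps = [15.0, 25.0, 35.0]

# Placeholder enthalpies (kcal/mol): hypothetical, NOT taken from Table 4
placeholder_dh = [-10.0, -13.8, -17.6]

dcp = heat_capacity_change(temps, placeholder_dh)
print(f"dCp = {dcp:.2f} kcal/(mol K)")
```

With only three temperatures the slope is sensitive to the error of each titration, which is one reason why differences of ≤0.1 kcal/(mol K) between variants are best regarded as minor.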

DISCUSSION
According to our analysis, of the three salt bridges conserved in homeodomains, two (19/30 and 31/42) are clearly destabilizing with respect to the aliphatic pairs found in a minor fraction of sequences. Thus, we have shown that replacement of the highly conserved salt bridge 31/42 by an aliphatic pair (in variants 30 and 31) leads to a thermophilic version of the Hesx-1 homeodomain. Moreover, for sites 19/30 the stabilizing influence of the aliphatic side chains is maintained even in those cases in which a polar network involving tetrad 15/19/30/37 could be established. Our analysis shows that stabilization by the aliphatic residues reflects, in all cases, the improvement in packing interactions with the protein hydrophobic core (as judged from the ΔG_{2 M NaCl} terms).
Despite the observed influence of salt versus hydrophobic bridges, charged side chains at positions 19, 30, 31, and 42 contribute to the stability of the homeodomain fold through packing interactions between the side-chain methylene groups and the protein hydrophobic core. The destabilization induced by single Glu/Asp substitutions in protein variants 2 and 27 is illustrative of this effect.
In addition, our work provides clues about the correlation within triad 19/26/30, deduced from the statistical analysis of homeodomain sequences. Thus, a proline at position 26 is compatible with both aliphatic and polar (neutral or charged) residues at 19/30. In contrast, the sterically larger and conformationally more mobile leucine is incompatible with aliphatic side chains at positions 19/30. Violations of this rule lead to a severe destabilization and/or aggregation of the protein, which is in agreement with the absence of this particular combination from any natural sequence.
It should be emphasized that, apart from position 26, the hydrophobic environment of residues 19/30 is highly conserved and shows no relevant differences in homeodomains with salt or hydrophobic bridges connecting these sites (supplemental Fig. S13). Taking this into account, the observed preference for aliphatic residues at positions 19/30 is likely to be general for those polypeptides with a proline in 26.
Overall, the results obtained are in agreement with those reported by other groups for different systems (39) and correspond well with theoretical predictions by Hendsch and Tidor (31).
In contrast with sites 19/30 and 31/42, only polar residues seem to be tolerated at positions 17 and 52. This particular behavior might have its origin in the dynamic properties of the polypeptide. Whereas sites 19, 30, 31, and 42 are located in well structured regions, residue 52 is located close to the end of helix III. This segment of the protein (especially from residue 55), although not totally disordered in solution, is characterized by a low helical content that might increase solvent exposure of pair 17/52. Thus, a hydrophobic pair at these sites is likely to affect the protein stability and promote aggregation.
Finally, the influence of positions 15, 19, 30, 31, 37, and 42 on the homeodomain DNA binding properties has been tested employing microcalorimetry. First, our analyses confirm the participation of Arg 31 in the recognition of DNA. However, its effect on the binding strength (only a factor of 4.4 in Kb) is relatively small in comparison with that reported for other comparable contacts in homeodomains. In contrast, the influence of positions 15, 19, 30, 37, and 42 on the DNA binding properties of the polypeptide is almost negligible. Interestingly, the thermodynamic parameters derived seem almost independent of the protein conformational stability. Indeed, folding free energies of the analyzed homeodomains range from −2.01 to −4.60 kcal/mol (at 20°C, see Table 1). Despite this fact, all of them exhibit similar affinities for DNA (ΔΔGb ≤ 0.3 kcal/mol). Moreover, binding enthalpies (ΔH) and the changes in heat capacity (ΔCp) are also comparable. It has been hypothesized that the dynamic character of the homeodomain hydrophobic core plays a role in the induced fit of the protein upon complex formation. Considering that protein stability and dynamics are usually strongly correlated, the observed lack of influence of the former on DNA binding argues against this proposal. Instead, it strongly suggests that the homeodomain core does not play a significant role in the induced fit mechanism, which is probably restricted to the protein/DNA interface. This conclusion is probably meaningful in the context of evolution. Thus, if the dynamics of the homeodomain hydrophobic core were important for complex formation, many of the mutations affecting the conformational stability of the polypeptide would also have an effect on DNA binding. In contrast, the observed lack of correlation between binding and stability allows an independent modulation of both properties by evolution.
In this context, it should be considered that homeodomains are present in both cold-blooded and warm-blooded organisms, under different environments, and therefore should be able to satisfy different stability requirements. Rapid protein turnover ensures that some regulatory proteins are degraded so that the cell can respond to constantly changing conditions. Such proteins have usually evolved to be relatively unstable under physiological conditions. We propose that salt/hydrophobic bridges at positions 19/30, 15/37, and 31/42 probably contribute to adjusting the thermal stability of the polypeptide for a particular environment while maintaining binding affinity and specificity. In agreement with this hypothesis, an analysis of the sequence information available reveals that stabilizing hydrophobic pairs at positions 31/42 in homeodomains can be found exclusively in warm-blooded animals. Moreover, these sequences also exhibit a significantly greater abundance of hydrophobic bridges connecting sites 19 and 30 (3.5% versus 1% in sequences from cold-blooded animals).
In summary, the studies presented herein provide a detailed description of the role of conserved salt bridge interactions on the homeodomain stability and DNA binding properties together with new insights into the relationship between sequence, structure, stability, and function within this important family of proteins.
Accession Number-The coordinates corresponding to the NMR structure of the thermostable Hesx-1 homeodomain R31L/E42L double mutant have been deposited in the Protein Data Bank (PDB) under code 2k40.