A Backbone-reversed Form of an All-β α-Crystallin Domain from a Small Heat-shock Protein (Retro-HSP12.6) Folds and Assembles into Structured Multimers*

The structural consequences of polypeptide backbone reversal (“retro” modification) remain largely unexplored, in particular, for the retro forms of globular all-β-sheet proteins. To examine whether the backbone-reversed form of a model all-β-sheet protein can fold and adopt secondary and tertiary structure, we created and examined the recombinant retro form of a 110-residue-long polypeptide, an α-crystallin-like small heat-shock protein, HSP12.6, from C. elegans. Following intracellular overexpression in fusion with a histidine affinity tag in Escherichia coli, purification under denaturing conditions, and removal of denaturant through dialysis, retro-HSP12.6 was found to fold to a soluble state. The folded protein was examined using fluorescence and CD spectroscopy, gel filtration chromatography, non-denaturing electrophoresis, differential scanning calorimetry, and electron microscopy and confirmed to have adopted secondary structure and assembled into a multimer. Interestingly, like its parent polypeptide, retro-HSP12.6 did not aggregate upon heating; rather, heating led to a dramatic increase in structural content and the adoption of what would appear to be a very well folded state at high temperatures. However, this was essentially reversed upon cooling, with some hysteresis being observed, resulting in greater structural content in the heated-cooled protein than in the unheated protein. The heated-cooled samples displayed CD spectra indicative of structural content

The structure of a naturally occurring globular protein is determined by its amino acid sequence. The amino acid sequence has a definite polarity with the C=O group of every residue forming a peptide bond with the N-H group of the next residue. Here, we explore the structural-biochemical consequences of reversing the polarity of the polypeptide backbone through the creation of a novel protein with an amino acid sequence that is the exact reverse of the sequence of a naturally occurring protein, the α-crystallin-like small heat-shock protein HSP12.6 from Caenorhabditis elegans (1).
The consequences of effecting such a transformation have previously been explored both in theory and in experiment.
Among works dealing with polypeptides that are large enough to be called proteins, the following discussions are worthy of note. (a) Guptasarma (2,3) hypothesized that the retro form of an all-β-sheet protein would fold into a topological mirror image of the structure adopted by the parent sequence through mirror imaging of the entire scheme of side chain-side chain interactions facilitating folding. The dihedral angles characterizing each residue in the parent structure would thus change both sign as well as definition in the mirror-imaged structure because of the replacement of C=O by N-H and vice versa. As a consequence, every φ would become −ψ and every ψ would become −φ. Notably, with β-sheets, such a transformation could conceivably allow each residue to remain in a β-sheet configuration (3). However, with α-helices, such a transformation would be expected to effect a change in the handedness of the helix, and so mirror imaging would not ordinarily occur for single helices and all-helix protein structures; rather, it would occur only with isolated helices in predominantly β-sheet structural contexts. In other words, only helices that could pay the energy penalty for switching handedness through stabilization by packing contacts with other mirror-imaged substructures would undergo the transformation. (b) Skolnick and colleagues (4) performed folding simulations with the retro form of an all-α protein, the B domain of Staphylococcal protein A, and showed that 3 of 12 folding simulations led to mirror-imaged structures, whereas the remaining nine simulations folded into α-helical structures. (c) Another result with α-helical proteins was obtained experimentally by Grutter and colleagues (5), who showed that the retro form of a GCN4 leucine zipper folds into a structure (determined crystallographically) that is almost identical to the parent structure and not a mirror image of the parent structure.
The similarity of the parent and retro structures was explained on the basis of the fact that there was a 2-fold palindrome in the hydrophobicity profile of the protein, intersecting the central cavity in the structure of the protein.
(d) Importantly, whereas the retro forms of α-helical proteins did display a tendency to fold, an experimental attempt to reverse the backbone of a protein containing β-sheets produced a polypeptide that displayed no tendency to fold. Serrano and colleagues (6) demonstrated that the retro form of an Src homology 3 domain does not fold. However, structural modeling carried out by the same group established that a mirror image topology was feasible, especially in combination with folding to a molten-globule state rather than a rigid unique structure (6). Since only ~40% of an Src homology 3 domain constitutes β-structure, with the remaining polypeptide being folded into rigid loop-like structures that do not qualify to be called secondary structure (the 60-residue-long domain has only ~25 of 60 residues forming β-strands (7)), the question of what would happen upon reversal of a larger protein with a much greater propensity to form β-sheets has thus far remained open.
* The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
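The dihedral-angle argument in point (a) can be made concrete with a minimal sketch. It assumes that the hypothesized transformation maps each parent (φ, ψ) pair to (−ψ, −φ), and it uses typical textbook angles for an antiparallel β-strand and a right-handed α-helix; the function names and the coarse region thresholds are illustrative assumptions, not part of the original work.

```python
def retro_mirror(phi, psi):
    """Map parent (phi, psi) to the angles hypothesized for the
    mirror-imaged retro chain: phi -> -psi and psi -> -phi."""
    return (-psi, -phi)


def region(phi, psi):
    """Very coarse Ramachandran classification.
    The thresholds are illustrative assumptions only."""
    if phi < 0 and psi > 90:
        return "beta"
    if phi < 0 and -90 <= psi <= 0:
        return "right-handed helix"
    if phi > 0 and 0 <= psi <= 90:
        return "left-handed helix"
    return "other"


# Typical textbook backbone angles in degrees (assumed, not from the paper).
examples = {
    "antiparallel beta strand": (-139.0, 135.0),
    "right-handed alpha helix": (-57.0, -47.0),
}

for name, (phi, psi) in examples.items():
    new_phi, new_psi = retro_mirror(phi, psi)
    print(f"{name}: {region(phi, psi)} -> {region(new_phi, new_psi)}")
```

Under these assumptions the β-strand residue stays in the β region, whereas the helical residue moves into the left-handed helical region, which is the handedness switch that the hypothesis treats as energetically costly.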
We decided to examine the consequences of reversing the sequence of a larger all-β-sheet protein. We chose as a parent sequence for backbone reversal the 110-residue-long HSP12.6 from C. elegans. This is the smallest known homolog of all of the proteins belonging to the family of α-crystallin-like heat-shock domains (1). Based on actual determination of the structures of several members of this family (8,9), α-crystallin-like domains are currently accepted to be proteins of a defined all-β-fold. Our investigations reveal that the reversed sequence too folds and assembles into a multimer like the parent.

Molecular Genetic Manipulations and Design of Constructs-From the sequence of the gene encoding the parent protein, Hsp12.6 (1), the sequence of a novel gene encoding retro-HSP12.6 (RETHSP)¹ was created by reversing the sequence of codons used by the parent gene. The DNA encoding the reversed sequence (retro-HSP12.6) was then synthesized through a combination of contract synthesis of double-stranded oligonucleotides and our own molecular genetic manipulations to derive constructs encoding retro-HSP12.6 with a choice of restriction sites flanking the sequence for insertion into the vector pQE30 (Qiagen) to facilitate expression in fusion with a His6 affinity tag. Finally, two forms of retro-HSP12.6 were created. (i) The first form, RETHSP-1, was a backbone-reversed form of the 110-residue-long parent sequence flanked by N- and C-terminal extensions. The N-terminal extension consisted of 12 residues incorporating a 10-residue affinity tag (MRGSHHHHHH) and an additional two residues contributed by the cloning site (GS). The C-terminal extension of nine residues (VDLQPSLIS) was entirely due to the choice of restriction sites at the multiple cloning site of pQE-30, as the vector's own stop codon was used. (ii) To make the second form, RETHSP-2, the C-terminal extension was removed through inclusion of a stop codon immediately after the backbone-reversed HSP12.6 sequence. Expression of proteins from both constructs was first checked in XL1Blue, which was the cloning host. The sequence of the insert in the positive clone was confirmed through automated DNA sequencing on an ABI 310 Prism sequencer, and the plasmid was transformed into the expression host M15pREP4. The sequences of HSP12.6, RETHSP-1, and RETHSP-2 are shown in Table I for reference.
Expression, Purification, and Folding-The expression of both proteins in the expression host Escherichia coli M15pREP4 was low, but this was compensated for by setting up larger culture volumes. For expression, the cells were grown overnight and a 1% secondary inoculum was added to an appropriate volume of LB. Cells were induced with 1 mM isopropyl-1-thio-β-D-galactopyranoside at an optical density of 0.6 and harvested 4 h after induction. Harvested cells were suspended in lysis buffer containing denaturant (8 M urea, 0.1 M NaH2PO4, 0.01 M Tris-Cl, pH 8.0) and lysed by sonication. The lysate was centrifuged at 18,000 × g for 1 h, and the supernatant thus obtained was loaded onto a nickel-nitrilotriacetic acid column in the presence of the denaturant, urea. Washing (with 8 M urea, 0.1 M NaH2PO4, 0.01 M Tris-Cl, pH 6.3) and elution (with 8 M urea, 0.1 M NaH2PO4, 0.01 M Tris-Cl, pH 5.9 and pH 4.5) were also done under denaturing conditions. Dialysis of the eluted protein to remove denaturant was done against 20 mM Tris. After dialysis, the protein was concentrated by centrifugation under vacuum to the point at which it started to precipitate, after which the sample was centrifuged and the supernatant was taken for an estimation of the concentration of soluble protein. This was taken as the maximally concentrated solution of the protein, since upon further concentration the protein precipitated. The change in molarity of the buffer following such concentration was also estimated by accounting for the change in volume effected by centrifugal vacuum concentration.
Spectroscopy and Microcalorimetry-Protein concentrations were estimated through UV absorption measurements at 280 nm using a predicted molar extinction coefficient of 12,780 M⁻¹ cm⁻¹ for proteins encoded by both RETHSP-1 and RETHSP-2. Fluorescence spectra were collected on a PerkinElmer LS-50B spectrofluorimeter with variable excitation and emission bandpasses as appropriate, using excitation with light of 280 nm and scanning protein emission between 300 and 400 nm. CD spectra were collected at intervals of 1 nm on a Jasco J-810 spectropolarimeter through scanning of wavelengths from 250 to 200 nm using a protein concentration of 0.4 mg/ml and a cuvette path length of 0.2 cm. CD signals below 200 nm could not be collected because the spectra were noisy. Consequently, no attempt was made to estimate secondary structural contents from these data. Calorimetry was carried out using a Microcal MC-2 ultrasensitive microcalorimeter using a protein concentration of 80 μM and a scan rate of 60 K/h. Fourier Transform Infrared spectra for the solid precipitate of RETHSP-1 obtained through concentration beyond the solubility limit were collected on a PerkinElmer Spectrum BX instrument with the protein sample placed between two calcium fluoride windows at a resolution of 1 cm⁻¹, taking an average of 32 scans.
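The concentration and ellipticity conversions implied by these measurements can be sketched as follows. This is a minimal illustration, assuming the extinction coefficient is in M⁻¹ cm⁻¹, using the RETHSP-2 chain mass (13,970 Da) and length (122 residues) quoted elsewhere in the text, and applying the standard Beer-Lambert and mean-residue-ellipticity formulas; the observed ellipticity of −10 mdeg is a hypothetical value, not a measurement from this work.

```python
def mg_per_ml_to_molar(conc_mg_ml, mass_da):
    """Convert mg/ml (numerically equal to g/l) to mol/l."""
    return conc_mg_ml / mass_da


def absorbance(conc_molar, epsilon, path_cm=1.0):
    """Beer-Lambert law: A = epsilon * c * l."""
    return epsilon * conc_molar * path_cm


def mean_residue_ellipticity(theta_mdeg, mass_da, n_residues, conc_mg_ml, path_cm):
    """Mean residue ellipticity in deg cm^2 dmol^-1:
    [theta] = theta(mdeg) * MRW / (10 * l(cm) * c(mg/ml)),
    with mean residue weight MRW = M / (N - 1)."""
    mrw = mass_da / (n_residues - 1)
    return theta_mdeg * mrw / (10.0 * path_cm * conc_mg_ml)


EPSILON_280 = 12780.0   # assumed units: M^-1 cm^-1
MASS_DA = 13970.0       # RETHSP-2 polypeptide mass quoted in the text
N_RESIDUES = 122        # RETHSP-2 chain length quoted in the text

conc_molar = mg_per_ml_to_molar(0.4, MASS_DA)        # CD sample concentration
a280 = absorbance(conc_molar, EPSILON_280)           # expected A280 in a 1-cm cell
mre = mean_residue_ellipticity(-10.0, MASS_DA, N_RESIDUES, 0.4, 0.2)  # hypothetical signal

print(f"{conc_molar * 1e6:.1f} uM, A280 = {a280:.3f}, MRE = {mre:.0f}")
```

The same helpers apply to RETHSP-1 once its chain mass is substituted; that value is not stated in the text and is therefore left out here.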
Chromatography and Non-denaturing Gel Electrophoresis-Gel filtration chromatography was performed on a Pharmacia SMART system using an analytical Superdex-200 column (bed volume ≈2.4 ml, void volume 0.8 ml) at a flow rate of 0.1 ml/min, using 0.05-ml protein samples at a concentration of 0.4 mg/ml. The upper limit of the fractionation range of the column was 600,000 Da, and the exclusion limit was 1,600,000 Da. Non-denaturing gel electrophoresis for the determination of the native molecular weight of RETHSP-2 was carried out by the standard procedures involving: (i) determination, plotting, and (linear) least squares fitting of changes in the relative mobilities of protein samples as a function of gel acrylamide percentage, followed by (ii) plotting of the negative value of the slope thus obtained for each standard protein against its known native molecular weight (Ferguson plot), least squares fitting of these data to a straight line, and interpolation of the value of the slope obtained for RETHSP-2 into the plot.
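The two-step Ferguson analysis described above can be sketched in a few lines. This is a minimal illustration on synthetic data, assuming that the logarithm of relative mobility is fitted against gel percentage and that the second fit is done on log-log axes; the gel percentages, standard masses, and the relation used to generate mobilities are hypothetical and are not the values used in this work.

```python
import math


def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def retardation_coefficient(gel_percents, rel_mobilities):
    """Step (i): K_r is the negative slope of log10(relative mobility)
    plotted against gel acrylamide percentage."""
    slope, _ = linear_fit(gel_percents, [math.log10(r) for r in rel_mobilities])
    return -slope


def mass_from_kr(standards, unknown_kr):
    """Step (ii): fit log10(K_r) against log10(mass) for the standards,
    then invert the line to interpolate the unknown's mass."""
    slope, intercept = linear_fit(
        [math.log10(mass) for mass, _ in standards],
        [math.log10(kr) for _, kr in standards],
    )
    return 10 ** ((math.log10(unknown_kr) - intercept) / slope)


# Synthetic demonstration data (hypothetical, for illustration only).
GEL_PERCENTS = [6.0, 8.0, 10.0, 12.0]


def synthetic_mobilities(mass_kda):
    """Generate ideal mobilities from an assumed power-law K_r(mass)."""
    kr = 0.01 * mass_kda ** 0.5
    return [10 ** (-kr * t) for t in GEL_PERCENTS]


standard_masses = [14.0, 29.0, 45.0, 66.0, 132.0]
standards = [
    (m, retardation_coefficient(GEL_PERCENTS, synthetic_mobilities(m)))
    for m in standard_masses
]
unknown_kr = retardation_coefficient(GEL_PERCENTS, synthetic_mobilities(50.0))
estimated_mass = mass_from_kr(standards, unknown_kr)
print(f"estimated native mass: {estimated_mass:.1f} kDa")
```

Because the synthetic data are noise-free, the interpolation returns the input mass; with real gels the scatter of the standards about the fitted line sets the uncertainty of the estimate.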
Electron Microscopy-Transmission electron microscopy studies were carried out through use of routine negative staining procedures employing phosphotungstic acid and uranyl acetate on a JEOL 1200 EX-2 microscope.

RESULTS AND DISCUSSION
Solubility of RETHSP-1 and RETHSP-2-No precipitation was observed during dialysis-based removal of denaturant from solutions of either RETHSP-1 or RETHSP-2 following His6 tag-based affinity purification on nickel-nitrilotriacetic acid-agarose columns in the presence of urea. The protein samples obtained were found to be soluble up to a concentration of 1.2-1.5 mg/ml for RETHSP-1 and ~0.6 mg/ml for RETHSP-2. Concentration beyond these values led to protein precipitation.
Spectrofluorimetric Characterization-The two forms, RETHSP-1 and RETHSP-2, displayed wavelengths of maximal fluorescence emission (λmax) of 348 and 351 nm, respectively, indicating only very nominal burial of their aromatic residues (two tryptophans and one tyrosine). The emission spectrum of RETHSP-2 is shown in Fig. 1A.
RETHSP-1 and RETHSP-2 Appear to Be Trimers/Tetramers at Low Concentrations-Because both proteins were soluble, we examined their quaternary structural status using gel filtration chromatography. On an analytical SMART Superdex-200 column, RETHSP-2 eluted at 1.47 ml (Fig. 1B), corresponding to a molecular mass of ~52-53 kDa. In some preparations of the protein, a minor population was also found to elute close to the void volume of the column (0.8-0.9 ml), which had a bed volume of 2.4 ml and a fractionation range of 10,000-600,000 Da with an exclusion limit of 1.3 × 10⁶ Da. RETHSP-2 has a polypeptide molecular mass of 13,970 Da, indicating that the majority population eluting at ~1.47 ml, which falls within the optimal fractionation range for the column, is predominantly tetrameric. In comparison with the value of 52-53 kDa that was obtained through gel filtration, determination of the native molecular mass at low protein concentration by non-denaturing gel electrophoresis (see Fig. 6, C and D) yielded an estimated molecular mass of 45-46 kDa. Both of these estimates lie between the values expected for trimeric (~42 kDa) and tetrameric (~56 kDa) states of a ~14-kDa polypeptide like RETHSP-2. Thus, three possibilities apply: (i) the molecule is assembled into a tetramer that behaves as a smaller species on account of compactness; (ii) the molecule is assembled into a trimer that behaves as a larger species on account of being swollen; or (iii) the molecule forms a mixed population of trimers and tetramers existing in equilibrium.
¹ The abbreviation used is: RETHSP, retro-heat-shock protein.
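The arithmetic behind the trimer/tetramer ambiguity can be laid out explicitly. The sketch below uses only the masses and volumes quoted above; taking the midpoints of the quoted mass ranges and treating the bed volume as the total volume in the partition coefficient are assumptions made for illustration.

```python
MONOMER_KDA = 13.970  # RETHSP-2 polypeptide mass quoted in the text


def apparent_subunits(apparent_kda, monomer_kda=MONOMER_KDA):
    """Number of subunits implied by an apparent native mass."""
    return apparent_kda / monomer_kda


def partition_coefficient(elution_ml, void_ml, total_ml):
    """K_av = (V_e - V_0) / (V_t - V_0); V_t is assumed here to be
    the column bed volume."""
    return (elution_ml - void_ml) / (total_ml - void_ml)


gel_filtration_n = apparent_subunits(52.5)   # midpoint of 52-53 kDa
native_page_n = apparent_subunits(45.5)      # midpoint of 45-46 kDa
k_av = partition_coefficient(1.47, 0.8, 2.4)

print(f"gel filtration: {gel_filtration_n:.2f} subunits")
print(f"native PAGE:    {native_page_n:.2f} subunits")
print(f"trimer = {3 * MONOMER_KDA:.1f} kDa, tetramer = {4 * MONOMER_KDA:.1f} kDa")
print(f"K_av at 1.47 ml = {k_av:.3f}")
```

Both estimates fall between three and four subunits, which is why neither technique alone distinguishes a compact tetramer from a swollen trimer or a mixture of the two.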
Because techniques used to determine multimeric molecular masses, including non-denaturing gel electrophoresis, gel filtration chromatography, dynamic light scattering, and analytical ultracentrifugation, are all influenced to varying extents by molecular shape and effective hydrodynamic volume, which need not correlate perfectly with size for non-spherical species, estimates of native molecular mass do not always correspond to expected multiples of subunit molecular weight. We emphasize that dynamic light scattering or analytical ultracentrifugation data could provide more accurate information concerning whether the population is mostly trimeric or tetrameric, and we are arranging to perform these experiments. Meanwhile, we have obtained preliminary plate-like crystals of the protein, and attempts are being made to refine crystallization conditions toward eventual structure determination, which should resolve the issue. Notably, the occasional observation of a soluble higher order multimer at the void volume of the Superdex-200 column indicates that this assembly may also be capable of associating further into larger multimers approaching sizes of 600,000 Da or more, especially at high protein concentrations.

Evidence of Formation of a Higher Order Multimer upon Concentration-As already mentioned, RETHSP-1 has a solubility limit of ~1.2-1.5 mg/ml. Beyond this protein concentration, protein precipitates are obtained. Whereas the gel filtration studies reported above used the sample remaining in the supernatant after concentration, examination of the precipitated protein using transmission electron microscopy and negative staining showed the presence of a globular, bead-like form with a diameter of roughly 18-20 nm (Fig. 2, panels A and B).
We are in the process of analyzing multiple images of these beads to attempt partial three-dimensional reconstruction. Fourier Transform Infrared spectra of these beaded samples clearly show the presence of secondary structure (Fig. 2C) with C=O stretch band maxima at 1629 and 1651 cm⁻¹, indicative of predominantly β-sheet with some helical content.
Evidence of Secondary Structure Formation-The CD spectrum of any protein is a linear combination of the contributions of peptide bonds in various secondary structures. The CD spectrum of RETHSP-1 (Fig. 3A) indicates β/α-secondary structural content together with a substantial component of randomly coiled structure. It may be noted that for completely randomly coiled polypeptides, no negative ellipticity signal is observed between 250 and 210 nm. Only at wavelengths below 210 nm are spectra observed to show a negative band that dips to display a minimum at ~198 nm. Out of a total of 131 residues, RETHSP-1 contains 21 residues (16% of the chain) that comprise N- and C-terminal extensions and are in any case not expected to participate in structure formation. Furthermore, any negative CD signal resulting from random coil is nearly 4-5 times stronger than that due to a β-sheet configuration for any peptide bond (i.e. in a plot of mean residue ellipticity). The CD spectrum of RETHSP-1 might therefore arguably be expected to be dominated by a negative band at ~198 nm, albeit with significant negative mean residue ellipticity being visible even at wavelengths above 210 nm. Gratifyingly, this is what is seen.
When the C-terminal extension is removed as in RETHSP-2, the length of the chain reduces to 122 residues, bringing down the number of extra residues to 12 (about 10% of the chain) from 21. This would be expected to reduce the contribution of residues in a randomly coiled configuration to the CD spectrum and thus lead to a shift of the band minimum to a longer wavelength. As shown by the CD spectrum of RETHSP-2, such a shift is exactly what is seen. RETHSP-2 shows a band minimum at 210 nm together with another band minimum at ~230 nm (Fig. 3B). Notably, the mean residue ellipticity over the entire spectral range is enhanced by a factor of almost 2.5 over that seen with RETHSP-1, indicating that this polypeptide is significantly more structured even though there is still some random coil component.
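The residue bookkeeping behind this comparison is simple enough to check directly. The sketch below uses only the chain and extension lengths given in the text; the variable and function names are illustrative.

```python
CORE_RESIDUES = 110   # backbone-reversed HSP12.6 sequence
N_EXTENSION = 12      # MRGSHHHHHH tag plus GS from the cloning site
C_EXTENSION = 9       # VDLQPSLIS, present only in RETHSP-1


def extension_fraction(n_ext, c_ext, core=CORE_RESIDUES):
    """Return (total chain length, fraction of residues in extensions)."""
    total = core + n_ext + c_ext
    return total, (n_ext + c_ext) / total


rethsp1_len, rethsp1_frac = extension_fraction(N_EXTENSION, C_EXTENSION)
rethsp2_len, rethsp2_frac = extension_fraction(N_EXTENSION, 0)

print(f"RETHSP-1: {rethsp1_len} residues, {rethsp1_frac:.1%} in extensions")
print(f"RETHSP-2: {rethsp2_len} residues, {rethsp2_frac:.1%} in extensions")
```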
Heating Causes Structural Consolidation Rather than Unfolding-To examine whether heating of the protein is characterized by an endothermic reaction resulting in a change in enthalpy associated with unfolding, differential scanning calorimetry was carried out with RETHSP-1. The scan showed a small, almost indiscernible endothermic reaction followed by a dramatic and unexpected exothermic reaction at high temperatures (Fig. 4). Such an exothermic reaction could only be due to the formation rather than destruction of non-covalent contacts. Such contacts could be because of either aggregation or further structural consolidation within the polypeptide. To examine whether any aggregation occurs upon heating, we heated the proteins at 90°C for 15 min and found no visible sign of aggregation. Gel filtration of cooled samples also showed no signs of high molecular weight species (see Fig. 6A). One further test excluding any possibility of aggregation was a monitoring of the HT voltage associated with the transmission of light through the sample in a CD spectrometer during heating of the sample. The HT voltage in a spectropolarimeter rises to compensate for reductions in light intensity not associated with differential absorption of left and right circularly polarized light when the detector is starved for light through absorption or scattering. The HT voltage, therefore, is extremely sensitive to changes in the level of scattering of transmitted light. We found no significant changes in HT voltage associated with heating, signaling a lack of aggregation during heating. Because there was no evidence of aggregation and suspecting that structural consolidation could indeed have occurred in the sample, we examined the CD spectrum of RETHSP-2 at high temperature (Fig. 5A). We also examined the nature of changes in ellipticity associated with heating and cooling of RETHSP-2 (Fig. 5B) to further confirm the changes in CD signal associated with heating and investigate the extent to which such changes are reversible for this protein. As is evident from Fig. 5A, the protein is significantly more structured at high temperatures than at room temperature. Fig. 5B reinforces this conclusion, demonstrating that there is a gradual increase in the negative ellipticity at 218 nm during heating (effected at a rate of 5°C/min) as well as a clear reduction in negative ellipticity associated with cooling (effected at the same rate). This finding shows that the structural consolidation effected through heating is largely reversed upon cooling. However, some hysteresis is also clearly seen to be associated with the process (Fig. 5B), with the signal at 218 nm not returning to its original value upon cooling. Thus, the structural gains effected through heating are not entirely lost upon cooling, and as a result, the heated and cooled protein (Fig. 3C) shows a negative band maximum at 215-216 nm with a second band at ~230 nm, indicating a clear predominance of β-sheet configuration and a well folded state of the polypeptide even upon return to room temperature. It may be emphasized once again that, perhaps because the heat-induced changes do not reverse completely, the spectrum of the heated-cooled protein (Fig. 3C) is different from that of the unheated protein (Fig. 3B), which displays a band minimum at ~210 nm, as well as from the spectrum of the protein at the high temperature of 92°C (Fig. 5A), which most resembles the CD spectrum of a well folded naturally occurring protein.
FIG. 3. Far-UV CD spectra of retro-HSP12.6 constructs. Panel A, the retroprotein, RETHSP-1, at room temperature. Panel B, the retroprotein, RETHSP-2, at room temperature. Panel C, the retroprotein, RETHSP-2, following heating to 90°C and cooling to room temperature.
The differences in signal intensities among the various spectra, especially at lower wavelengths, conform to what is expected for enhancement of β-sheet content at the expense of unstructured content. As is well known, the negative ellipticity associated with a peptide bond in a β-sheet configuration is much lower than that associated with a peptide bond in a random coil.
The intriguing semi-reversible thermally induced consolidation of structure described above caused us to carry out a further comparative examination of RETHSP-2 in the unheated, heated, and heated-cooled states. The unheated sample and the heated-cooled sample were chromatographed through gel filtration (Fig. 6A) and seen to contain the same dominant trimeric/tetrameric population eluting at 1.47 ml on a Superdex-200 column. As can be seen, there was no evidence of any additional species eluting at the column's void volume (0.8-0.9 ml), indicating that there was no generation of high molecular weight aggregates through the process of heating and cooling. However, a slight difference can be seen in the width at half-height of the elution peak as well as in the volume at which the elution begins for the heated-cooled sample. Thus, the gross quaternary structural status of RETHSP-2 is not changed through heating and cooling, although, as already pointed out, some of the structural gains effected through heating are clearly retained by the molecule after cooling, as is evident from the hysteresis seen in the ellipticity signal as a function of temperature (Fig. 5B) as well as from differences in the CD spectral shapes of unheated, heated, and heated-cooled samples (Figs. 3B, 5A, and 3C, respectively).
Fluorescence emission spectroscopy at different temperatures did not shed light on structural transitions. The 351-nm emission maximum (λmax) of the protein alluded to earlier (Fig. 1A) remained at 351 nm, even at 90-92°C as well as upon cooling to room temperature, displaying only a reversible reduction in intensity with increased temperature but no change in other spectral characteristics (Supplemental Fig. 1). Therefore, it would appear that the exposed tryptophan residues of the protein remain largely exposed to the solvent, even in the course of the heat-induced structural consolidation and resettlement into a more structured state upon cooling. Fluorescence quenching carried out for unheated and heated-cooled samples reinforces this conclusion. Stern-Volmer plots (Fig. 6B) show that the accessibility of the fluorescing aromatic residues is virtually unaltered between the unheated and the heated-cooled samples. To further investigate the conclusion from gel filtration data, which indicated that the unheated and heated-cooled samples have similar sizes despite their structural differences, non-denaturing gel electrophoresis was carried out. Gels of four different acrylamide percentages were run. Variations in the relative mobilities of five different protein standards as a function of varying gel density were analyzed and used to construct a Ferguson plot. A representative gel (10% acrylamide) is shown (Fig. 6C) with unheated and heated-cooled samples run in the last two lanes, establishing that the hydrodynamic volumes of these two forms are entirely similar. The Ferguson plot (Fig. 6D) shows that both forms correspond to a molecular mass of 45-46 kDa, as already mentioned.
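The Stern-Volmer comparison mentioned above rests on the relation F0/F = 1 + K_SV[Q], where [Q] is the quencher concentration. The sketch below shows one way to extract K_SV by least squares with the line constrained to pass through F0/F = 1 at zero quencher; the quencher concentrations and the quenching constant used to generate the data are hypothetical and do not come from this work.

```python
def stern_volmer_constant(quencher_molar, f0_over_f):
    """Least-squares slope of (F0/F - 1) versus [Q], with the line
    forced through the origin, i.e. F0/F = 1 + K_sv * [Q]."""
    numerator = sum(q * (ratio - 1.0) for q, ratio in zip(quencher_molar, f0_over_f))
    denominator = sum(q * q for q in quencher_molar)
    return numerator / denominator


# Hypothetical acrylamide concentrations (M) and ideal, noise-free ratios.
quencher = [0.0, 0.1, 0.2, 0.3, 0.4]
assumed_ksv = 8.0  # M^-1, illustrative value only
ratios = [1.0 + assumed_ksv * q for q in quencher]

ksv = stern_volmer_constant(quencher, ratios)
print(f"K_sv = {ksv:.2f} M^-1")
```

Fitting the unheated and heated-cooled data sets separately and comparing the two constants would be one way to quantify the statement that fluorophore accessibility is virtually unaltered.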
CONCLUSIONS
Retro-HSP12.6 appears to fold and assemble into multimeric states that further associate to form large globular structures. At low protein concentrations, the polypeptide displays secondary structural content and no tendency to aggregate, despite possessing solvent-exposed aromatic residues. Secondary structural content is enhanced through heating and largely lost through cooling, as evidenced by CD spectroscopy, with calorimetry showing an exothermic reaction occurring upon raising of temperature without attendant molecular aggregation. Thus, heating of this backbone-reversed all-β heat-shock protein results in enhancement of structural content, perhaps because of improved hydrophobic interactions among residues at high temperatures facilitating further hydrogen-bonding interactions and greater structure and stability. At the high temperature of 92°C, the protein shows a CD spectrum not unlike that of any folded naturally occurring protein. Upon cooling, there is loss of most of the structural content gained through heating, but nevertheless some hysteresis is seen and the process is not fully reversible, such that the heated-cooled protein shows greater structural content than the unheated protein and shows a CD spectrum indicative of a considerably better folded state. Most intriguingly, heating is not associated with any aggregation. Independently, concentration of the protein was observed to lead to assembly of the molecule into larger bead-like structures, which are precipitation-prone and show high secondary structural content. Preliminary plate-like crystals of the protein have been obtained, and attempts are being made to refine crystallization conditions to carry out further structural analysis.
FIG. 6. Characteristics of RETHSP-2 prior to heating and following cooling to room temperature after heating at 90°C. Panel A, elution from a Superdex-200 column of unheated and heated-cooled samples as indicated. Panel B, Stern-Volmer plots of acrylamide quenching of protein fluorescence for unheated (open triangles) and heated-cooled (open circles) samples. Panel C, representative non-denaturing PAGE (10% acrylamide, stacking/resolving gels of pH 8.8). The first five lanes on the left show the markers, carbonic anhydrase, chicken egg albumin, bovine serum albumin, urease, and α-lactalbumin, respectively, with isoforms visible where present. The last two lanes, respectively, correspond to unheated and heated-cooled samples of RETHSP-2. Panel D, Ferguson plot (both axes in log10 scale) showing five standard protein molecular masses in kDa plotted against the negative values of slopes of individual linear (least squares) fits obtained for each protein (K_r) through initial plotting of relative mobility versus gel acrylamide percentage. The interpolation of the value of the slope obtained for RETHSP-2 is shown by horizontal and vertical lines.