Tryptophanyl substitutions in apomyoglobin determine protein aggregation and amyloid-like fibril formation at physiological pH.

Myoglobin is an alpha-helical globular protein that contains two highly conserved tryptophan residues located at positions 7 and 14 in the N-terminal region of the protein. Replacement of both indole residues with phenylalanine residues, i.e. W7F/W14F, results in the expression of an unstable, incorrectly folded protein that does not bind the prosthetic group. Here we report data (Congo red and thioflavin T binding assays, birefringence, and electron microscopy) showing that the double Trp/Phe replacements render apomyoglobin molecules highly susceptible to aggregation and amyloid-like fibril formation under physiological conditions in which most of the wild-type protein is in the native state. In refolding experiments, the W7F/W14F apomyoglobin mutant, like the wild-type protein, formed a soluble, partially folded helical state between pH 2.0 and pH 4.0. A pH increase from 4.0 to 7.0 restored the native structure only in the case of the wild-type protein and caused aggregation of W7F/W14F. The circular dichroism spectrum recorded immediately after neutralization showed that the polypeptide consists mainly of beta-structures. In conclusion, under physiological pH conditions, some mutations that affect folding may cause protein aggregation and the formation of amyloid-like fibrils.

Such chronic disorders as Alzheimer's disease, senile systemic amyloidosis, transmissible spongiform encephalopathies, and dialysis-related amyloidosis are characterized by the extracellular deposition of insoluble protein aggregates known as amyloid fibrils (1-6). About 20 proteins are now known to be involved in the generation of amyloid in vivo. Fibril formation is initiated in vitro under conditions that stabilize partially unfolded soluble intermediates of the native proteins either after the partial destabilization of physiologically folded proteins in the case of globular proteins (7) or after the partial stabilization (i.e. folding) of random coil polypeptide chains in the case of natively unfolded proteins (8). Despite substantial differences in both sequence and length (from 40 to 250 residues), all the proteins responsible for amyloid deposition form fibrils composed of β-strands oriented perpendicularly to the long axis of the fibril (9). Electron microscopy shows that the fibrils are straight and unbranched and are 40-120 Å in diameter (9,10). Also proteins not known to be associated with amyloid disease may form amyloid fibrils under in vitro conditions that favor partially folded states (11-15). These states are more prone to aggregation than the native state because hydrophobic residues, which are largely buried within the core of the native protein, become more exposed upon partial unfolding. The way in which proteins aggregate in the test tube is remarkably similar to how proteins form the so-called "amyloid" deposits. Even myoglobin, an ordinary all-α globular protein, can form fibrils containing β-strands under experimental conditions that favor the formation of partially folded states (15). Thus, amyloid formation does not appear to be restricted to a small number of protein sequences but is rather a property common to many, if not all, polypeptide chains.
Recent experimental data show that some human genetic amyloid diseases are due to point mutations that cause incomplete protein folding, thereby increasing the propensity of the polypeptide chain to aggregate (16,17).
Sperm whale apomyoglobin lends itself to structural studies of partially folded states. The folding of this protein proceeds through compact intermediates that have been detected in both equilibrium and kinetic experiments. At least three partially folded intermediates, which differ in the amount of their secondary structure, have been detected at low pH in the presence of different anions (18-21). Similar results were obtained in studies of methanol-induced conformational transitions (22). In these intermediates, A, G, and H helices are folded as in the native state, whereas the remainder of the molecule seems to be unordered. We previously showed that the simultaneous substitution of the two tryptophan residues at positions 7 and 14 along the A helix with phenylalanine residues causes incorrect folding and a subsequent loss of heme binding. This suggests that the tryptophan residues play an important role in driving the folding process and in determining the structure of apomyoglobin (23). Molecular dynamics simulation indicated that this is probably due to distortions introduced into the compact AGH subdomain, which influence the subsequent steps of the folding pathway (23).
The aim of this study was to further investigate the conformational properties of the W7F/W14F apomyoglobin mutant. The main finding is that the amino acid replacements render the protein highly susceptible to aggregation and formation of amyloid-like fibrils under experimental conditions not drastically different from the natural setting. Thus, W7F/W14F is a model system with which to investigate the relationship between folding and misfolding at the molecular level and to identify the specific interactions that are involved in amyloid formation.

* This work was supported by Ministero dell'Università e della Ricerca Scientifica e Tecnologica Grant PRIN 2000 and by grants from Consiglio Nazionale delle Ricerche (Target Project Biotechnology).

EXPERIMENTAL PROCEDURES
Materials-Oligonucleotides were obtained from Genset Oligos, Taq polymerase and ligase from Promega, and restriction endonuclease from New England Biolabs. Ni2+-nitrilotriacetic acid agarose was obtained from Qiagen. Congo red and thioflavin T were obtained from Sigma. All other chemicals were of analytical grade.
Plasmid Construction-The His-tagged version of the myoglobin W7F/W14F mutant was obtained in two steps. First, plasmid pMb413 (24) was mutated in both tryptophan residues (23); then, the mutated gene was amplified by polymerase chain reaction to create an in-frame fusion with the vector pQE-30 (Qiagen) sequence that encodes a His6 tag at the N terminus. The upstream primer was designed to delete the stop codon region and the PstI restriction site of the parental pMb413 plasmid and to introduce the BamH1 restriction site. The primers used were 5′-AGG AGA ACA ACA GGA TCC GTT CTG TCT GAA-3′ and 5′-GAG CTC GGT ACC TCA TTA ACC CTG GTA ACC-3′ for forward and reverse sequences, respectively (regions complementary to the synthetic myoglobin gene are underlined; the BamH1 site introduced is indicated in bold letters). The PCR was conducted for 30 cycles of heat denaturation at 94°C for 30 s; the primers were annealed to the DNA at 60°C for 1 min, and the DNA chain was extended with Taq polymerase at 72°C for 1 min, with a final extension at 72°C for 10 min. The PCR was also performed with the wild-type myoglobin gene. The PCR products were purified by phenol-chloroform extraction and alcohol precipitation. The DNAs were digested with BamH1 and KpnI, separated on a 1.5% agarose gel, and extracted by means of the QIAquick gel extraction kit (Qiagen). The DNA fragments were subcloned into the pQE-30 vector digested with BamH1 and KpnI and then transformed into Escherichia coli strain M15[pREP4]. Recombinant plasmids were identified by restriction endonuclease analysis, and clones containing the correct-size insert were sequenced by the dideoxynucleotide chain termination method.
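The primer design described above can be checked with a short script. The following is a minimal sketch in Python that locates the restriction sites in the two primers. The recognition sequences GGATCC (BamHI, written BamH1 in the text) and GGTACC (KpnI) are the standard ones; the helper name `find_site` is an illustrative choice and not part of the original protocol.

```python
# Primer sequences as given in the text, with the spaces removed.
FORWARD = "AGGAGAACAACAGGATCCGTTCTGTCTGAA"
REVERSE = "GAGCTCGGTACCTCATTAACCCTGGTAACC"

# Standard recognition sequences of the two enzymes used for cloning.
SITES = {"BamHI": "GGATCC", "KpnI": "GGTACC"}


def find_site(primer: str, site: str) -> int:
    """Return the 0-based index of the first occurrence of `site`, or -1."""
    return primer.upper().find(site.upper())


bamhi_pos = find_site(FORWARD, SITES["BamHI"])
kpni_pos = find_site(REVERSE, SITES["KpnI"])

print(f"BamHI site in forward primer at index {bamhi_pos}")
print(f"KpnI site in reverse primer at index {kpni_pos}")
```

A result of -1 for either primer would indicate a transcription error in the sequence, since both enzymes are used for the subsequent digestion.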
Protein Expression and Purification-The E. coli M15[pREP4] cells containing plasmid pQE-30/Mb were grown by shaking at 37°C in Luria-Bertani broth containing 0.125 mg/ml ampicillin until A600 reached 0.5-0.7. Protein expression was induced by the addition of 1 mM isopropyl-1-thio-β-D-galactopyranoside, and incubation was continued for 4 h. The culture was then centrifuged at 9000 × g for 20 min at 4°C. The wild-type His-tagged myoglobin was produced in soluble form and then purified under native conditions, whereas the mutant His-tagged myoglobin was sequestered into insoluble inclusion bodies and purified under denaturing conditions. In particular, the bacterial pellet of the mutant myoglobin was resuspended in lysis buffer (100 mM NaH2PO4, 10 mM Tris·HCl, 8.0 M urea, 10 mM imidazole, pH 8.0) and stirred at room temperature for 1 h. After centrifugation at 15,000 × g for 30 min, the supernatant was mixed with a slurry of nickel-nitrilotriacetic acid resin (Qiagen) and incubated on a rotary shaker for 1 h at room temperature. The lysate-resin mixture was packed into an empty column and washed twice with 100 mM NaH2PO4, 10 mM Tris·HCl, 8.0 M urea, 10 mM imidazole, pH 6.3. The protein was eluted 4 times with 100 mM NaH2PO4, 10 mM Tris·HCl, 8.0 M urea, pH 4.5. The fractions were collected and analyzed by 15% SDS-PAGE and Western blotting. The protein was concentrated using an Amicon concentrator and folded by dialysis against 10 mM NaH2PO4, pH 7.0, containing decreasing concentrations of urea to 0 M. Wild-type apomyoglobin was prepared from myoglobin by the 2-butanone extraction procedure described by Teale (25). Contamination of the apoprotein by myoglobin was assessed spectrophotometrically. Apomyoglobin concentration was determined by the Bradford assay (26) using bovine serum albumin as a standard and/or by measuring absorption at 280 and 275 nm for the wild-type and mutant protein, respectively.
The molar extinction coefficients ε280 = 13,500 M⁻¹ cm⁻¹ and ε275 = 3,750 M⁻¹ cm⁻¹, calculated from tryptophan and tyrosine content (27), were used.
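The conversion from absorbance to concentration follows the Beer-Lambert law, c = A / (ε·l). The sketch below, in Python, applies it with the two extinction coefficients quoted above. The 1-cm path length default and the example absorbance reading are assumptions made for illustration only; they are not values reported in this work.

```python
# Extinction coefficients quoted in the text (M^-1 cm^-1).
EPSILON_WT_280 = 13_500.0   # wild-type apomyoglobin, read at 280 nm
EPSILON_MUT_275 = 3_750.0   # W7F/W14F mutant, read at 275 nm


def molar_concentration(absorbance: float, epsilon: float,
                        path_cm: float = 1.0) -> float:
    """Beer-Lambert law: c = A / (epsilon * l), returned in mol/L.

    path_cm defaults to 1.0 cm, which is an assumption (the cuvette used
    for concentration measurements is not stated in the text).
    """
    if epsilon <= 0 or path_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    return absorbance / (epsilon * path_cm)


# Hypothetical reading, for illustration only.
example_c = molar_concentration(0.135, EPSILON_WT_280)
print(f"{example_c:.2e} M")
```

Because the mutant lacks both tryptophans, its extinction coefficient is much smaller, so the same absorbance corresponds to a higher molar concentration of W7F/W14F than of the wild-type protein.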
Congo Red Binding Assay-The binding of Congo red was monitored using absorption spectroscopy. A 5.0 mM stock solution of Congo red was prepared in 5.0 mM KH2PO4, 150 mM NaCl, pH 7.4. Congo red solution (5.0 × 10⁻⁶ M) was mixed with protein suspension (8.0 × 10⁻⁶ M) and incubated at room temperature for 30 min before absorption measurements. The difference spectra (Jasco V-550 spectrophotometer) were obtained by subtracting the Congo red spectra obtained in the absence of protein from the Congo red spectra measured in the presence of protein. Poly-L-lysine solution was used as positive control (28).
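The difference spectrum described above is a point-by-point subtraction over a shared wavelength grid. The following Python sketch shows one way to compute it and to locate its maximum. The function names and the absorbance values are placeholders invented for illustration; they are not measured data from this study.

```python
def difference_spectrum(with_protein: dict, dye_alone: dict) -> dict:
    """Subtract the dye-only spectrum from the dye-plus-protein spectrum.

    Both arguments map wavelength (nm) to absorbance and must share
    exactly the same wavelength grid.
    """
    if with_protein.keys() != dye_alone.keys():
        raise ValueError("spectra must share the same wavelengths")
    return {wl: with_protein[wl] - dye_alone[wl] for wl in sorted(with_protein)}


def peak_wavelength(spectrum: dict) -> int:
    """Return the wavelength at which the spectrum is largest."""
    return max(spectrum, key=spectrum.get)


# Placeholder absorbances (illustrative only, not experimental values).
dye_alone = {480: 0.20, 494: 0.25, 520: 0.18, 550: 0.08}
with_protein = {480: 0.19, 494: 0.24, 520: 0.24, 550: 0.20}

diff = difference_spectrum(with_protein, dye_alone)
print(peak_wavelength(diff))
```

A maximum in the difference spectrum at a longer wavelength than the free-dye maximum is what indicates the red shift associated with dye binding.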
Congo Red Birefringence-Drops of protein suspension were air-dried on gelatin-coated slides. The resulting films were stained with a saturated solution of Congo red and sodium chloride and corrected to pH 10.0 with NaOH (11). The stained slides were washed and air-dried before being examined with a Zeiss Axioskope 2 light microscope between cross-polarizers and imaged by means of a high resolution digital camera (C4742-95, Hamamatsu Photonics, Italy).
Thioflavin T Binding Assay-Samples containing 5 × 10⁻⁶ M of protein were mixed with 25 × 10⁻⁶ M thioflavin T in 10 mM phosphate buffer. Thioflavin T fluorescence was recorded from 460 to 500 nm with excitation at 450 nm and slits of 3.5 nm for both excitation and emission. For each sample, the fluorescence intensity at 482 nm was corrected by subtracting the emission intensity recorded before the addition of protein to thioflavin T solutions.
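The blank correction described above can be expressed in a few lines. This Python sketch subtracts the dye-only emission from the sample emission and, as a separate quantity, reports the ratio of the two. Treating a fold increase as the ratio of sample to free-dye intensity is an assumption about how such values are computed, and the intensities used are arbitrary illustrative numbers.

```python
def corrected_intensity(sample: float, dye_alone: float) -> float:
    """Emission of dye plus protein minus emission of the dye alone."""
    return sample - dye_alone


def fold_increase(sample: float, dye_alone: float) -> float:
    """Ratio of sample emission to free-dye emission (assumed definition)."""
    if dye_alone <= 0:
        raise ValueError("dye-alone intensity must be positive")
    return sample / dye_alone


# Arbitrary units, illustrative values only.
print(corrected_intensity(60.0, 10.0))
print(fold_increase(60.0, 10.0))
```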
Circular Dichroism Measurements-Far-UV circular dichroism (CD) measurements were performed on homogeneous wild-type and W7F/W14F apomyoglobin samples at a concentration of 10 × 10⁻⁶ M. Spectra were recorded at 25°C on a Jasco J-710 spectropolarimeter equipped with a temperature-controlled liquid system Neslab RTE-110. Cuvettes of 1-mm path length were used over the wavelength range from 250 to 190 nm. Spectra were acquired at 0.5-nm intervals with a 4-s integration time and a bandwidth of 1.0 nm. An average of three scans was obtained for all spectra. Photomultiplier absorbance did not exceed 600 V in the spectral region analyzed. Data were corrected for buffer contributions and smoothed using the software provided by the manufacturer (System Software version 1.00). All measurements were performed under nitrogen flow. The results are expressed as mean residue ellipticity in units of degree cm² dmol⁻¹. A mean residue mass of 115 was used for the peptide chromophore.
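The conversion of a raw CD signal to mean residue ellipticity uses the standard relation [θ] = θ(mdeg) × MRW / (10 × c(mg/ml) × l(cm)). The Python sketch below applies it with the mean residue mass of 115 stated above. The molecular mass used to convert the molar concentration to mg/ml and the raw signal are placeholders chosen for illustration; neither is taken from this study.

```python
def molar_to_mg_per_ml(conc_molar: float, molecular_mass: float) -> float:
    """Convert mol/L to mg/ml (numerically equal to g/L)."""
    return conc_molar * molecular_mass


def mean_residue_ellipticity(theta_mdeg: float, conc_mg_per_ml: float,
                             path_cm: float,
                             mean_residue_mass: float = 115.0) -> float:
    """Return [theta] in deg cm^2 dmol^-1.

    [theta] = theta_mdeg * MRW / (10 * c_mg_per_ml * path_cm)
    """
    if conc_mg_per_ml <= 0 or path_cm <= 0:
        raise ValueError("concentration and path length must be positive")
    return theta_mdeg * mean_residue_mass / (10.0 * conc_mg_per_ml * path_cm)


# 10 x 10^-6 M protein and a 1-mm (0.1-cm) cuvette, as in the text.
# The molecular mass of 17,000 is a placeholder, not a value from the paper.
conc = molar_to_mg_per_ml(10e-6, 17_000.0)
# Hypothetical raw signal of -10 mdeg, for illustration only.
print(mean_residue_ellipticity(-10.0, conc, 0.1))
```

Normalizing per residue makes spectra of proteins of different length, such as the tagged and untagged forms, directly comparable.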
Electron Microscopy-Drops (5 μl) of mutated myoglobin and wild-type apomyoglobin were placed on collodion-amyl-acetate-coated grids and air-dried under germ-free conditions. Samples were stained with 1% phosphotungstic acid for 30 s, then grids were blotted and again air-dried. Images were acquired using a Zeiss TM-109 electron microscope operating at 80-kV excitation voltage.

RESULTS
Mammalian myoglobins contain two tryptophan residues at the invariant positions 7 (A-5) and 14 (A-12) in the N-terminal region (A helix) of the protein.
We recently showed that the simultaneous substitution of both tryptophan residues with phenylalanine residues results in low expression of an unstable protein that is unable to bind heme and cannot be purified by standard procedures (23). Homogeneous W7F/W14F recombinant myoglobin was obtained after extraction of the histidine-tagged protein from inclusion bodies and purification under denaturing conditions, i.e. 8.0 M urea, as reported under "Experimental Procedures." Removal of denaturant by extensive dialysis against 10 mM phosphate buffer, pH 7.0, led to the formation of protein aggregates. By contrast, the histidine-tagged wild-type myoglobin was obtained as a soluble, native-like folded polypeptide chain, as documented by its far-UV CD spectrum and Soret absorption.
Congo red is a diazo dye that is widely used to identify amyloid deposits because of its ability to bind preferentially to aggregated amyloid peptides/proteins but not to the native, unassembled forms (28-30). We performed Congo red binding experiments to determine whether the W7F/W14F aggregates formed at neutral pH and at 25°C are amyloid structures. Congo red binding is detected as a red shift in its absorbance spectrum and as birefringence under polarized light due to the strong intercalation of the dye into the ordered, intermolecular β-pleated sheet structure of amyloid fibrils (28,31). Fig. 1 shows the absorption spectrum of Congo red in the presence of W7F/W14F at neutral pH compared with that recorded without protein and the difference spectrum. A significant red shift from 494 to 550 nm occurred immediately upon binding of the dye to the W7F/W14F apomyoglobin mutant. The wild-type protein did not affect the Congo red spectrum.
The development of birefringence under cross-polarized light is highly diagnostic for amyloid (31). Fig. 2 shows the optical microscope image obtained under cross-polarized light of W7F/W14F aggregates stained with Congo red. The figure shows the characteristic birefringence in regions rich in amyloid fibrils. We obtained negative controls by adding Congo red to wild-type protein.
Thioflavin T is a fluorescent dye that detects amyloid fibrils (32,33). The emission intensity of the dye increases significantly upon binding to the linear array of β-strands in amyloid fibrils (34,35). We examined changes in the emission intensity of thioflavin T in the presence of wild-type apomyoglobin and of the W7F/W14F mutant at neutral pH and at 25°C. The emission intensity of the dye at 485 nm increased 6-fold upon the addition of the fluorophore to W7F/W14F. Thirty days later the emission intensity was 20-fold higher versus the free fluorophore. There was no increase for the wild-type apomyoglobin.
Electron microscopy confirmed the presence of amyloid-like fibrils in W7F/W14F samples at pH 7.0 (Fig. 3, C and D); these fibrils had a mean diameter of 5-10 nm and a mean length between 500 and 800 nm. No fibrils were detected in the wild-type apomyoglobin samples.
To further probe the aggregation process of apomyoglobin, we examined the relationship between pH and far-UV CD activity. It is well known that the conformation of apomyoglobin depends on both pH and salt concentration (18, 36-38). Specifically, it is native at neutral pH and fully unfolded at pH 2.0 in the presence of very low amounts of salt. The addition of 0.3 M NaCl to fully unfolded apomyoglobin or a pH increase from 2.0 to 4.0 at low ionic strength results in the formation of a partially folded form that is similar to the kinetic intermediate identified on the folding pathway (19,20). At pH 2.0 and at a low salt concentration, both the wild type and W7F/W14F apomyoglobin were essentially unfolded as documented by a negative peak at 200-202 nm (Fig. 4). A pH increase from 2.0 to 4.0 caused the appearance of a significant amount of α-helical structure in both proteins. In fact, their CD spectra showed the characteristic minima at 222 and 208 nm (Fig. 4). The analysis of CD data by Yang's algorithm (39) produced the following secondary structure percentages for the wild-type and W7F/W14F apomyoglobin, respectively: helix 44.1, β 0, turn 22.8, and random 33.1, and helix 20.2, β 0, turn 34.5, and random 45.2. These data demonstrate that at pH 4.0 part of the polypeptide chain is organized in α-helical segments. The lower helical percentage observed for W7F/W14F could be related to a higher concentration of unfolded protein in equilibrium with the partially folded helical structures formed at pH 4.0. The CD spectra recorded upon the addition of 0.3 M NaCl to unfolded apoprotein at pH 2.0 were similar to those observed at pH 4.0 in the presence of low salt concentrations. A pH increase from 4.0 to 7.0 restored the far-UV CD typical of the native state only in the case of the wild-type apomyoglobin. The secondary structure percentages of refolded protein were: helix 63.9, β 0, turn 10.5, and random 25.5.
By contrast, the CD spectrum of the W7F/W14F apomyoglobin mutant at pH 7.0 shows a negative peak at 216 nm, which is typical of a β-like polypeptide. The estimated secondary structure percentages were: helix 22.1, β 38.6, turn 17.1, and random 22.2.
We examined the dependence of apomyoglobin aggregation on pH by monitoring thioflavin T fluorescence. There was no increase of thioflavin T fluorescence for either wild-type or W7F/W14F apomyoglobin at pH 2.0 and 4.0. At pH 7.0, thioflavin T fluorescence rapidly and significantly increased only in the case of W7F/W14F.

DISCUSSION
Myoglobin is a highly soluble globular protein whose native state properties do not suggest a predisposition to form amyloid fibrils. In fact, most of its amino acid sequence is organized in well defined α helices with few or no β-sheet elements (40). Moreover, all compact intermediates detected so far in equilibrium and kinetic experiments are predominantly helical (19, 20, 36-38, 41). Apomyoglobin incubated for a long time under experimental conditions that destabilize the native fold, i.e. pH 9.0 and 65°C, forms fibrillar structures that are indistinguishable from disease-related amyloid fibrils (15). In this respect, myoglobin resembles other non-disease-related proteins that form amyloid aggregates under conditions favoring molten or partially folded states. It is generally agreed that non-covalent interactions involving the polypeptide chain can occur under mild denaturing conditions, and this may contribute to stabilize β-sheet formation.
Here we demonstrate that substitution of both highly conserved tryptophan residues endows the apomyoglobin with a high propensity to rapidly form fibrils under physiological conditions in which most of the wild-type protein is in the native state. In other words, apomyoglobin fibrils of the same type as those associated with amyloid-related diseases are formed in the absence of any physical or chemical agent able to perturb and, hence, destabilize the native state.
Two mechanisms have been proposed to explain the increased propensity to aggregate and form fibrils caused by amino acid mutations. One attributes amyloid formation to a shift in the equilibrium between the native state and the ensemble of partially denatured or molten states toward the latter, which is more prone to aggregation (3,7,8,11). In this case, the enhanced propensity to form fibrils is due to the destabilizing effect caused by the amino acid replacement rather than to the loss or the formation of specific interactions. The second mechanism concerns changes in the aggregation rate of partially denatured or molten proteins caused by the amino acid mutation and is generally applicable to proteins that are essentially unstructured even under physiological conditions (42,43). In this respect, it is noteworthy that some mutations in prion proteins do not destabilize the protein but do increase the aggregation rate (44,45). In the case of myoglobin, fully unfolded W7F/W14F becomes partially folded when the pH is increased from 2.0 to 4.0 or when 0.3 M sodium chloride is added to the samples, but fibril formation does not occur, at least within the time of our observation. A further pH increase from 4.0 to 7.0 led to aggregation and the formation of highly organized amyloid fibrils. We cannot say which mechanism operates in the case of myoglobin until we identify the main kinetic component underlying amyloid formation. It could be unfolded or partially folded populations or both. The overall picture is that the two Trp/Phe substitutions make apomyoglobin more susceptible to aggregation; however, this is prevented at low pH because of electrostatic repulsion. On raising the pH from 4.0 to 7.0, the net charge of the polypeptide chain decreases, and this favors the aggregation process. Thus, two factors appear crucial for fibril formation in the case of W7F/W14F: non-covalent interactions and charge effects.
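The charge argument above can be illustrated with a rough Henderson-Hasselbalch estimate of net charge as a function of pH. The Python sketch below is only a qualitative illustration: the side-chain pKa values are generic textbook approximations, the residue counts are hypothetical placeholders rather than the composition of the protein studied here, and terminal groups, tyrosine, and cysteine are ignored.

```python
# Generic approximate side-chain pKa values (assumed, not fitted to this protein).
PKA_BASIC = {"K": 10.5, "R": 12.5, "H": 6.0}
PKA_ACIDIC = {"D": 3.9, "E": 4.3}

# Hypothetical residue counts, for illustration only.
COUNTS = {"K": 19, "R": 4, "H": 12, "D": 7, "E": 14}


def net_charge(ph: float, counts: dict = COUNTS) -> float:
    """Estimate net charge from independent ionizable side chains."""
    charge = 0.0
    for res, pka in PKA_BASIC.items():
        # Fraction of the basic group that is protonated (+1).
        charge += counts.get(res, 0) / (1.0 + 10.0 ** (ph - pka))
    for res, pka in PKA_ACIDIC.items():
        # Fraction of the acidic group that is deprotonated (-1).
        charge -= counts.get(res, 0) / (1.0 + 10.0 ** (pka - ph))
    return charge


for ph in (2.0, 4.0, 7.0):
    print(f"pH {ph}: estimated net charge {net_charge(ph):+.1f}")
```

With these placeholder inputs the estimated net positive charge falls steadily from pH 2.0 to pH 7.0, which is the qualitative trend invoked in the text; the absolute numbers carry no experimental meaning.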
Experimental and theoretical evidence indicates that apomyoglobin folding occurs through a nucleation site formed by the confluence of the A, G, and H helices (46-51). Therefore, incorrect GH and AH pairings influence the subsequent folding pattern. Molecular dynamics simulations suggest that the double tryptophan mutation in helix A promotes a substantial distortion and disruption of helix G that results in protein destabilization (23). The data reported herein corroborate the concept that formation of the AGH subdomain is a key step in apomyoglobin folding. In fact, distortions among the helices in this subdomain prevent the polypeptide from proceeding through the subsequent steps of the folding pathway.