Dissociation from the Oligomeric State Is the Rate-limiting Step in Fibril Formation by κ-Casein*

Amyloid fibrils are aggregated and precipitated forms of protein in which the protein exists in highly ordered, long, unbranching threadlike formations that are stable and resistant to degradation by proteases. Fibril formation is an ordered process that typically involves the unfolding of a protein to partially folded states that subsequently interact and aggregate through a nucleation-dependent mechanism. Here we report on studies investigating the molecular basis of the inherent propensity of the milk protein, κ-casein, to form amyloid fibrils. Using reduced and carboxymethylated κ-casein (RCMκ-CN), we show that fibril formation is accompanied by a characteristic increase in thioflavin T fluorescence intensity, solution turbidity, and β-sheet content of the protein. However, the lag phase of RCMκ-CN fibril formation is independent of protein concentration, and the rate of fibril formation does not increase upon the addition of seeds (preformed fibrils). Therefore, its mechanism of fibril formation differs from the archetypal nucleation-dependent aggregation mechanism. By digestion with trypsin or proteinase K and identification by mass spectrometry, we have determined that the region from Tyr25 to Lys86 is incorporated into the core of the fibrils. We suggest that this region, which is predicted to be aggregation-prone, accounts for the amyloidogenic nature of κ-casein. Based on these data, we propose that fibril formation by RCMκ-CN occurs through a novel mechanism whereby the rate-limiting step is the dissociation of an amyloidogenic precursor from an oligomeric state rather than the formation of stable nuclei, as has been described for most other fibril-forming systems.

Amyloid-like deposits have been found in bovine mammary glands within calcified stones known as corpora amylacea (14-16), and bundles of fibrils have also been reported in the cytoplasm of cells that surround these calcified deposits (17). These fibrils may result from the incapacity of other casein proteins to prevent amyloid formation by κ-casein.
Casein is the principal component of milk micelles and is a heterogeneous mix of proteins that include αS1-, αS2-, β-, and κ-casein. These proteins have been classified as intrinsically disordered (18,19), since they are extremely flexible, essentially unfolded, and have relatively little secondary or tertiary structure under physiological conditions. Although the caseins themselves are relatively small, ranging in mass between 19 and 25 kDa, they are usually associated, with calcium and other ions, in milk micelles that have molecular masses from 1 × 10^3 to 3 × 10^6 kDa (20). When purified from the other casein proteins, κ-casein in aqueous solution forms large spherical polymers with an average molecular mass of 1.18 MDa (21). These polymers are formed from smaller multimeric subunits (ranging from monomers to decamers) that result from intramolecular and intermolecular disulfide bonding (22). When completely reduced, κ-casein forms a self-associating polymeric species (comprising 31 monomers) in equilibrium with the monomeric form (23). It has been proposed that the monomeric form of the protein is a precursor for amyloid fibril formation (13).
The inherent propensity for κ-casein to form amyloid fibrils may be reflective of the relatively unstructured nature of its monomeric form in solution. A theoretical energy-minimized three-dimensional model of κ-casein proposed by Kumosinski et al. (24) suggests that the monomeric form of the protein can adopt a "horse and rider" configuration in which the legs of the horse are two sets of antiparallel β-sheets (from Lys21 to Phe55), which are rich in hydrophobic side chains. It has been speculated that the putative antiparallel β-sheet legs, due to their high degree of hydrophobicity and exposure to solution, may facilitate amyloid fibril formation by κ-casein (12).
The capacity of RCMκ-CN to form fibrils in a short time frame under conditions of physiological temperature and pH and without the need for denaturants makes it an ideal system in which to study amyloid fibril formation. Moreover, the link between casein amyloid fibril formation, corpora amylacea, and the intracellular fibril-like bundles identified in cells surrounding these deposits indicates that a more detailed understanding of the mechanism by which κ-casein forms fibrils is required. Here, we show that fibril formation by RCMκ-CN occurs through a unique mechanism whereby the rate-limiting step is the dissociation of an amyloidogenic precursor from an oligomeric state. Furthermore, we have identified the region of the protein incorporated into the core of the amyloid fibrils formed by RCMκ-CN and show that this region encompasses that predicted to adopt an antiparallel β-sheet structure in the native form of the protein (24), thereby providing the basis of its unique fibril-forming mechanism. These results help to explain how fibril formation by κ-casein is largely prevented in vivo. Similar mechanisms involving dissociation of an amyloidogenic precursor from its natural binding partner/oligomeric state are also likely to be important in other fibril-forming systems in vivo.

EXPERIMENTAL PROCEDURES
Materials-Bovine milk κ-casein (Swiss-Prot accession number P02668), without the N-terminal 21-amino acid leader peptide, was purchased from Sigma. The homogeneity of this protein was typically above 90%, as assessed by gel electrophoresis and ion exchange chromatography. Further purification of the protein by ion exchange chromatography showed that there was no change in the rate or degree of fibril formation in this sample compared with the protein supplied directly from Sigma (data not shown), indicating that the other minor components present did not influence the ability of reduced κ-casein to form fibrils. As such, we used the κ-casein without further purification in these studies. For the experiments described here, the protein was reduced and carboxymethylated as described previously (12). The vector pRSETB (Invitrogen), containing the human α-synuclein A53T gene, was a kind gift from Dr. Roberto Cappai (University of Melbourne, Australia), and the α-synuclein A53T protein was expressed and purified as described previously (25). Thioflavin T (ThT), 1,4-dithiothreitol, proteinase K (molecular biology grade), and β-mercaptoethanol were obtained from Sigma. Trypsin (proteomics grade) was also purchased from Sigma and prepared at 1 mg/ml in HCl. Uranyl acetate was purchased from Agar Scientific (Stansted, UK). The concentration of κ-casein was determined by spectrophotometric methods using a Cary 5000 UV-visible NIR spectrophotometer (Varian, Melbourne, Australia) and an extinction coefficient of 0.95 ml·mg⁻¹·cm⁻¹ (26).
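The concentration determination above follows the Beer-Lambert relation, c = A / (ε·l). The short sketch below illustrates that arithmetic with the stated extinction coefficient; the absorbance reading and the 1-cm path length are hypothetical values chosen for illustration and are not taken from the study.

```python
def protein_concentration_mg_per_ml(absorbance, path_length_cm=1.0, ext_coeff=0.95):
    """Beer-Lambert estimate: c = A / (epsilon * l).

    ext_coeff is in ml.mg^-1.cm^-1 (0.95 is the value quoted in the text);
    path_length_cm is an assumed cuvette path length.
    """
    if path_length_cm <= 0 or ext_coeff <= 0:
        raise ValueError("path length and extinction coefficient must be positive")
    return absorbance / (ext_coeff * path_length_cm)


# Hypothetical absorbance reading, for illustration only.
concentration = protein_concentration_mg_per_ml(0.475)
print(f"{concentration:.2f} mg/ml")
```

Readings taken on a diluted sample would need to be multiplied by the dilution factor before use.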
Thioflavin T and Light Scattering Assays-The formation of amyloid fibrils by RCMκ-CN was monitored using an in situ ThT binding assay (27). The protein was incubated at 37°C in 50 mM phosphate buffer, pH 7.2, unless stated otherwise. Samples were incubated with 10 μM ThT in black μClear 96-microwell plates (Greiner Bio-One, Stonehouse, UK). The plates were sealed to prevent evaporation, and the fluorescence levels were measured with a Fluostar Optima plate reader (BMG Labtechnologies, Melbourne, Australia) with a 440/490-nm excitation/emission filter set. Fibril formation was also monitored by light scattering at 340 nm using a Fluostar Optima plate reader. For each assay, samples were prepared in duplicate, and the change in ThT fluorescence at 490 nm or light scattering at 340 nm is presented in the figures.
In the seeding experiments, RCMκ-CN (0.2-2.5 mg/ml) was incubated under the same conditions as described above; however, prior to the start of the incubation, an aliquot of freshly prepared fibrils was added to the reaction such that both the amount and the concentration of RCMκ-CN in each sample were maintained. In some experiments, the fibrils were sonicated for 10 min prior to use. As a control for these experiments, α-synuclein A53T was incubated at 37°C in 50 mM phosphate buffer containing 100 mM NaCl, pH 7.4, and 10 μM ThT in a microtiter plate. Single time point ThT readings were taken for α-synuclein A53T during incubation, and the microtiter plate was subjected to constant shaking (700 rpm) using a Thermostar microplate shaker (BMG Labtechnologies) between readings.
Transmission Electron Microscopy (TEM)-Formvar and carbon-coated nickel electron microscopy grids (SPI Supplies, West Chester, PA) were prepared by the addition of 2 μl of protein sample, washed with 3 × 10 μl of milli-Q water, and negatively stained with 10 μl of uranyl acetate (2% w/v). Samples were viewed using a Philips CM100 transmission electron microscope (Philips, Eindhoven, The Netherlands) at a magnification range of 10,500-96,000 using an 80-kV excitation voltage.
Circular Dichroism-CD spectra over a wavelength range of 190-250 nm were acquired for samples of RCMκ-CN before and after fibril formation using a Jasco J-180 spectropolarimeter (Jasco, Victoria, Canada) connected to a Jasco water bath using a cell with a 0.1-mm optical path length. Samples were prepared in 20 mM phosphate buffer, pH 7.2, and diluted with the same buffer to a final concentration of 0.2 mg/ml before obtaining the spectra. The spectra were obtained at 37°C. Solvent spectra were subtracted from the measured protein spectra, and the results shown are the average of five scans.
FTIR Spectra and Deconvolution Procedures-Samples of RCMκ-CN before and after fibril formation were analyzed in a Bruker BioATRCell II using a Bruker Equinox 55 Fourier transform infrared spectrometer equipped with a liquid nitrogen-cooled mercury cadmium telluride detector and a silicon internal reflection element. For each spectrum, 256 interferograms were co-added at 2 cm⁻¹ resolution, and the water background was independently measured and subtracted from each protein spectrum. Spectral contributions from water vapor were subtracted using the atmospheric compensation feature of the Bruker Opus software package. The amide I region of the IR spectrum (1600-1700 cm⁻¹) consists of a series of overlapped component bands that arise from absorption by different secondary structural elements in proteins and peptides. In order to deconvolve these spectra, second derivatives of the amide I region were taken to determine the peak positions of the component bands. Subsequently, these component bands within the original spectra were fitted to a fixed number of Gaussian functions (28) using the OriginPro 7.5 software package by OriginLab, where the heights, widths, and positions of each band were optimized iteratively.
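The first step of the deconvolution described above, locating component bands from the second derivative of the amide I region, can be sketched as follows. This is a minimal illustration on a synthetic two-band spectrum, not a reproduction of the OriginPro procedure; the band heights, positions, and widths are hypothetical, and the subsequent iterative Gaussian fit is not shown.

```python
import math


def gaussian(x, height, center, width):
    """Single Gaussian band evaluated at wavenumber x."""
    return height * math.exp(-((x - center) ** 2) / (2.0 * width ** 2))


def second_derivative(y, dx):
    """Central finite-difference second derivative (interior points only)."""
    return [(y[i - 1] - 2.0 * y[i] + y[i + 1]) / dx ** 2
            for i in range(1, len(y) - 1)]


def band_positions(x, y):
    """Return x positions where the second derivative has a negative local minimum.

    Overlapped absorption bands appear as negative minima in the second
    derivative, which is why it is used to seed the band positions.
    """
    dx = x[1] - x[0]
    d2 = second_derivative(y, dx)
    positions = []
    for j in range(1, len(d2) - 1):
        if d2[j] < 0 and d2[j] < d2[j - 1] and d2[j] < d2[j + 1]:
            positions.append(x[j + 1])  # d2[j] belongs to x[j + 1]
    return positions


# Synthetic amide I region: two hypothetical overlapping bands (cm^-1).
wavenumbers = [1600.0 + 0.5 * i for i in range(201)]
components = [(1.0, 1625.0, 6.0), (0.8, 1650.0, 6.0)]
spectrum = [sum(gaussian(x, *c) for c in components) for x in wavenumbers]

print(band_positions(wavenumbers, spectrum))
```

On measured spectra the second derivative amplifies noise, so some smoothing is normally applied before this step; that choice is left open here.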
Proteolytic Cleavage of Amyloid Fibrils Formed by RCMκ-CN-Prior to proteolysis, RCMκ-CN fibrils were formed by incubating in 25 mM ammonium bicarbonate buffer, pH 7.4, overnight at 37°C. The presence of fibrils was confirmed by TEM. Limited proteolysis experiments were carried out by incubating native and fibrillar forms of RCMκ-CN with trypsin or proteinase K. Enzymatic digestions were performed at 37°C using enzyme-to-substrate ratios of 1:100 (w/w) for trypsin and 1:1000 (w/w) for proteinase K. The presence of amyloid fibrils following proteolysis was confirmed by TEM. At the end of the assay, an aliquot of each sample was taken for SDS-PAGE analysis, and the remainder was used for MALDI-TOF MS analysis. For SDS-PAGE, aliquots were removed at various time intervals, and the reaction was stopped by mixing with an equal volume of reducing sample buffer (50 mM Tris-HCl, pH 6.8, 4% (w/v) SDS, 2% (v/v) 2-mercaptoethanol, 12% glycerol, 0.01% bromphenol blue). These samples were then heated to 95°C for 5 min before being loaded onto gels. Proteolysis products were analyzed using 15 or 20% acrylamide gels (v/v) and standard techniques (29). Analysis of the samples by TEM showed that the sample preparation for SDS-PAGE analysis was sufficient to disaggregate the fibrils.
Mass Spectrometry-An Ultraflex III MALDI-TOF/TOF mass spectrometer (Bruker Daltonics, Bremen, Germany) was used for the analysis of intact protein and proteolytically cleaved peptides. The instrument was used in linear mode for peptides >6000 Da and in reflectron mode for smaller peptides. Linear mode was calibrated using protein calibration standard I (Bruker Daltonics). The protein mixture enables calibrations and testing in a mass range of ~3000 to 25,000 Da with an error of ~100 ppm. Reflector mode was calibrated using peptide calibration standard II (Bruker Daltonics), which enables calibrations and testing in a mass range of 500-6000 Da to a mass accuracy better than 50 ppm.
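To put the quoted accuracies in absolute terms, a parts-per-million figure can be converted to a mass window as in the sketch below. The function names are illustrative, and the m/z value used is the [M + H]+ ion reported later in the text.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def mass_tolerance_da(mz, ppm):
    """Absolute mass window (Da) corresponding to a ppm tolerance at a given m/z."""
    return mz * ppm / 1e6


# Linear mode (~100 ppm) at the m/z of the protease-resistant peptide.
print(round(mass_tolerance_da(7207.66, 100), 3))
```

At this mass, a ~100 ppm error corresponds to roughly 0.7 Da, which is why assignments of large fragments in linear mode benefit from independent confirmation.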
Aliquots taken for analysis by mass spectrometry were first mixed with urea to a final concentration of 6 M and left overnight at room temperature in order to disaggregate the fibrils. The sample was then mixed with trifluoroacetic acid to a final concentration of 0.1% (v/v) and acidified with HCl (final pH < 4) before being taken up in a ZipTip C-18 reverse-phase column (Millipore, Billerica, MA). After washing multiple times with 5% (v/v) acetonitrile containing 0.1% (v/v) trifluoroacetic acid to remove any salt and urea present, the sample was eluted stepwise with increasing concentrations of acetonitrile (20, 30, 40, 50, and 60% (v/v) in 0.1% (v/v) trifluoroacetic acid). The proteinase-resistant peptide of interest was eluted with 50 and 60% (v/v) acetonitrile. Samples were analyzed by MALDI-TOF/TOF MS in linear mode using 2,5-dihydroxybenzoic acid as the matrix (5 g/liter) in 50% (v/v) acetonitrile with 0.1% (v/v) trifluoroacetic acid.
To confirm that this peptide matched the protease-resistant band identified by SDS-PAGE, a tryptic in-gel digest was performed according to the method of Shevchenko et al. (30) with the following modifications. The Coomassie Blue-stained bands from the gel were cut with a sterile scalpel blade into small blocks and destained using 50 mM NH4HCO3 in 30% (v/v) acetonitrile. The blocks were rinsed and incubated overnight at 37°C with 10 ng/μl trypsin in 5 mM NH4HCO3. The tryptic fragments were extracted from the gel pieces by sonicating the samples for 15 min with 50% (v/v) acetonitrile with 0.1% (v/v) trifluoroacetic acid, and then the extracted peptide solution was dried down to a final volume of ~5 μl using a vacuum centrifuge. The tryptic peptides were analyzed directly by MALDI-TOF MS in reflector mode using α-cyano-4-hydroxycinnamic acid as the matrix. The identity of these peptides was confirmed by matching their observed masses to the expected masses of the in silico tryptic digest of the known κ-casein sequence using Biotools 3.0 software (Bruker Daltonics).
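The idea behind matching observed masses to an in silico tryptic digest can be sketched as below. This is not the Biotools implementation: it assumes the common rule that trypsin cleaves C-terminal to Lys or Arg except before Pro, uses standard monoisotopic masses for unmodified residues (carboxymethylated cysteine would need its own entry), allows no missed cleavages, and runs on a short hypothetical sequence rather than the κ-casein sequence.

```python
# Monoisotopic masses (Da) of unmodified amino acid residues.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def tryptic_digest(sequence):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def mh_plus(peptide):
    """Monoisotopic [M + H]+ of a linear, unmodified peptide."""
    return sum(RESIDUE_MASS[r] for r in peptide) + WATER + PROTON


def match_masses(observed, peptides, tolerance_ppm=50.0):
    """Pair each observed m/z with theoretical peptides within the tolerance."""
    matches = []
    for mz in observed:
        for peptide in peptides:
            theoretical = mh_plus(peptide)
            if abs(mz - theoretical) / theoretical * 1e6 <= tolerance_ppm:
                matches.append((mz, peptide))
    return matches


# Hypothetical toy sequence and observed mass, for illustration only.
toy_peptides = tryptic_digest("AGKPLRSYK")
print(toy_peptides)
print(match_masses([397.21], toy_peptides))
```

A fuller treatment would also enumerate peptides with missed cleavages and fixed modifications before matching.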

Amyloid Fibril Formation by RCMκ-CN Does Not Proceed by a Standard Nucleation-dependent Mechanism-First, RCMκ-CN was incubated at 37°C at various concentrations (0-10 mg/ml) in 50 mM phosphate buffer, pH 7.2, and the increase in ThT binding (which is associated with fibril formation) was examined by an in situ assay (Fig. 1, A and B). We have previously observed that the presence of the ThT dye in this assay does not affect the rate or extent of fibril formation by RCMκ-CN or the morphology of the fibrils formed. For all of the concentrations tested, we found that during the early phase of incubation (i.e. the first 30 min), there was an initial concentration-dependent decrease in the ThT fluorescence of the samples. We and others have shown that κ-casein is able to bind ThT in its native state (12,13), and we observed that this initial decrease in fluorescence intensity at the start of this assay was attributable to the temperature dependence of the ThT fluorescence (an increase in temperature leading to a decrease in fluorescence intensity). Thus, as the samples were warmed from room temperature to 37°C, the fluorescence intensity from ThT bound to the native state of the protein decreased. After this period, there was a concentration-dependent increase in ThT fluorescence, which reached a maximum at 600-800 min. The presence of fibrils in these samples was confirmed by TEM. Since the rate of increase in ThT fluorescence was observed to be dependent on the protein concentration (Fig. 1, A and B), we sought to determine the reaction rate for the formation of fibrils by RCMκ-CN. The apparent reaction order can be determined according to the equation ln V = A + n ln C, where V represents the initial reaction rate, C is the protein concentration, and n is the apparent reaction order. The plot of the log of the protein concentration against the log of the initial reaction rate (Fig. 1C) gave a slope (n) of 1.13 ± 0.15, indicating that the rate of increase in ThT fluorescence for RCMκ-CN follows an apparent first-order kinetic mechanism in relation to the protein concentration.

FIGURE 1. Fibril formation of RCMκ-CN (0-10 mg/ml) as monitored by ThT fluorescence (A and B) and light scattering (D) at 340 nm. Samples were incubated in 50 mM phosphate buffer, pH 7.2, at 37°C and monitored over time. C, a plot of the ln(RCMκ-CN) against ln(reaction rate). The reaction rate was determined from the initial rate of increase in ThT fluorescence from data, such as those shown in A and B. In D, the initial rate of the increase in light scattering has been extrapolated to determine the lag phase of the reaction (dotted lines). E, the effect on ThT fluorescence of decreasing the temperature to 32°C on fibril formation of RCMκ-CN (0-10 mg/ml). F, the effect of seeding on the rate of fibril formation by RCMκ-CN. A sample of RCMκ-CN (2.5 mg/ml) was incubated in 50 mM phosphate buffer, pH 7.2, at 37°C with or without the addition of preformed fibrils (seeds) at a 0.5 or 5.0% (w/w) ratio (note that the seeded curves overlay each other in the graph). Inset, the addition of seeds to a sample of α-synuclein A53T incubated at 2 mg/ml in 50 mM phosphate buffer, 100 mM NaCl, pH 7.4, at 37°C with shaking (separate symbols denote no seed added, 0.5% (w/w) seed added, and 2.0% (w/w) seed added). A.U., absorbance units.
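The apparent reaction order n in ln V = A + n ln C is the slope of a straight-line fit to ln(initial rate) against ln(concentration). The sketch below shows an ordinary least-squares version of that fit; the concentrations and rates are hypothetical values generated from an exactly first-order relationship and are not the data behind Fig. 1C.

```python
import math


def apparent_reaction_order(concentrations, initial_rates):
    """Least-squares fit of ln V = A + n ln C; returns (n, A)."""
    xs = [math.log(c) for c in concentrations]
    ys = [math.log(v) for v in initial_rates]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: rates constructed as V = 2.0 * C, i.e. first order.
concentrations = [1.0, 2.5, 5.0, 10.0]   # mg/ml
initial_rates = [2.0 * c for c in concentrations]

order, ln_prefactor = apparent_reaction_order(concentrations, initial_rates)
print(f"apparent order n = {order:.2f}")
```

A slope near 1 indicates that the initial rate scales linearly with protein concentration, which is the sense in which the text describes the kinetics as apparent first order.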
We also monitored the incubation of RCMκ-CN by a light scattering assay, which did not involve ThT (Fig. 1D). When RCMκ-CN (0-10 mg/ml) was incubated under the same conditions as for the ThT assay, there was a concentration-dependent increase in light scattering that was associated with fibril formation. Interestingly, both the ThT in situ (Fig. 1, A and B) and light scattering (Fig. 1D) data showed no evidence of a concentration-dependent variation in the lag phase associated with fibril formation, an attribute that is a component of a nucleation-dependent mechanism of aggregation (such as has been described for most amyloid fibril-forming systems) (3). Even when the protein was incubated at lower temperatures (e.g. 32°C (Fig. 1E) and 27°C (data not shown)), the lag phase of the increase in ThT fluorescence was very short (i.e. < 30 min) and concentration-independent. According to a nucleation-dependent aggregation mechanism, the rate-limiting step of fibril formation is the generation of stable nuclei, the rate of which is limited by the protein concentration. As such, the length of the lag phase of conversion of the protein to amyloid fibrils depends dramatically on the concentration of the protein (3). The clear absence of a concentration-dependent lag phase associated with fibril formation by RCMκ-CN suggested that the mechanism by which it forms fibrils may differ from that typically reported for other systems.
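One way to estimate a lag phase from a time course, in the spirit of the extrapolation shown in Fig. 1D, is to extend the rising phase back to the baseline and read off the intercept time. The sketch below is an assumption about how such an extrapolation could be done, not the authors' procedure: it uses the steepest interval between consecutive readings as the tangent and the first reading as the baseline, and the time course is synthetic.

```python
def lag_time(times, signal, baseline=None):
    """Extrapolate the steepest rising segment back to the baseline.

    Returns the time at which the tangent through the midpoint of the
    steepest segment crosses the baseline level.
    """
    if baseline is None:
        baseline = signal[0]
    best_slope, best_index = 0.0, None
    for i in range(len(times) - 1):
        slope = (signal[i + 1] - signal[i]) / (times[i + 1] - times[i])
        if best_index is None or slope > best_slope:
            best_slope, best_index = slope, i
    if best_slope <= 0:
        raise ValueError("signal has no rising phase")
    t_mid = (times[best_index] + times[best_index + 1]) / 2.0
    s_mid = (signal[best_index] + signal[best_index + 1]) / 2.0
    return t_mid - (s_mid - baseline) / best_slope


# Synthetic traces: flat for 30 min, then a linear rise whose slope
# scales with a hypothetical protein concentration.
times = [10.0 * i for i in range(11)]
low = [max(0.0, 2.0 * (t - 30.0)) for t in times]
high = [4.0 * value for value in low]

print(lag_time(times, low), lag_time(times, high))
```

In this synthetic case the two traces differ fourfold in rate but give the same lag estimate, which is the pattern the text reports for RCMκ-CN.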
We also tested whether adding preformed fibrils (seeds) to RCMκ-CN (2.5 mg/ml) incubated at 37°C increased the rate of fibril formation, since in a nucleation-dependent mechanism, the addition of exogenous seeds bypasses the need for nucleus formation, which, as discussed above, is typically considered to be the rate-limiting step in the aggregation process. However, the addition of seeds (0.5-10.0%, w/w) had no effect on the length of the lag phase or the rate of increase in ThT fluorescence, either at 32°C (data not shown) or 37°C (Fig. 1F). Preformed fibrils that were sonicated prior to the addition also had no effect on RCMκ-CN fibril formation (data not shown). Similar results were found when the protein was incubated at 37°C at 0.2 mg/ml and the reaction was seeded (10.0%, w/w) with preformed fibrils (data not shown). In contrast, seeding (0.5-2.0%, w/w) a sample of α-synuclein A53T (2 mg/ml), the Parkinson disease-related fibril-forming protein that has been shown to form fibrils through a classic nucleation-dependent mechanism, dramatically increased the rate of fibril formation and decreased the lag phase of the reaction (Fig. 1F, inset). Together, these data suggest that the rate of fibril formation by RCMκ-CN is not limited by the formation of stable nuclei, as is found in most nucleation-dependent mechanisms of aggregation, but instead is dependent on steps that precede the formation of nuclei (e.g. the dissociation of an amyloidogenic precursor from its oligomeric state).
Monitoring the Changes in RCMκ-CN before and after Fibril Formation by Spectroscopic Methods-Since RCMκ-CN has an inherent propensity to form fibrils when incubated under conditions of physiological pH and temperature, we investigated structural changes in RCMκ-CN that occur during the process of amyloid fibril formation to see if these would help to explain its amyloidogenic nature. RCMκ-CN (1 mg/ml) was incubated under conditions whereby it readily forms fibrils (i.e. 50 mM phosphate buffer, pH 7.2, at 37°C for 1000 min), and the presence of fibrils was confirmed by ThT fluorescence and TEM. Far-UV CD spectroscopy (Fig. 2A), used to probe the secondary structural changes that accompany RCMκ-CN fibril formation, showed that freshly prepared RCMκ-CN gives a spectrum with a maximum negative molar ellipticity at 200 nm and shoulder at 215 nm, indicative of a predominantly random coil protein and consistent with CD spectra of the protein reported previously (12). Following incubation under fibril-forming conditions, the CD spectrum of RCMκ-CN was red shifted such that the maximum negative molar ellipticity was at 203 nm and was decreased in intensity, and the shoulder at 215 nm was increased, indicating that upon fibril formation there is a loss in random coil and an increase in β-sheet structure compared with the native protein. Similarly, FTIR analysis of the protein before and after fibril formation showed an increased β-sheet content in the fibrillar form of the protein (Fig. 2C). The shift in the amide I maximum to lower wavenumbers (from 1645 to 1635 cm⁻¹) suggests that there is significant structural reorganization during fibril formation involving an increase in β-sheet structure (31). From the deconvolved data (Fig. 2D), showing the individual component bands of the spectra (32), an increase in the band at ~1625 cm⁻¹ and the development of bands at 1639 and 1677 cm⁻¹ following fibril formation are attributable to the increase in β-sheet content in the sample.
There was a corresponding decrease in bands at approximately 1669 and 1685 cm⁻¹ attributed to turns and bends, whereas the magnitude of the band corresponding to disordered structure (1648-1653 cm⁻¹) was largely unchanged. Overall, the deconvolved FTIR data indicated a ~2.4-fold increase in the amount of β-sheet in RCMκ-CN following fibril assembly (from 17 to 41%), which is primarily derived from the loss of turns and bends present in the nonfibrillar form of the protein (Table 1).
Identification of the Protease-resistant Core Region of Fibrils Formed from RCMκ-CN-Our data suggest that the critical, rate-limiting step in amyloid fibril formation by RCMκ-CN is the dissociation of a precursor that is highly amyloidogenic and readily incorporated into prefibrillar nuclei. Thus, we sought to identify the region of the protein incorporated into the core of the amyloid fibrils formed by RCMκ-CN in order to investigate whether this region showed characteristics of being highly amyloidogenic. In order to identify the core region of these fibrils, we performed proteolysis experiments on native and fibrillar forms of the protein. In the absence of trypsin, RCMκ-CN derived from fibrils was observed as a full-length protein (19 kDa) by SDS-PAGE. The addition of trypsin (1:100 (w/w) ratio of trypsin/substrate) to native RCMκ-CN resulted in its rapid proteolysis in less than 30 min, generating small peptide fragments of less than 1 kDa (Fig. 3A). In contrast, fibrillar RCMκ-CN was much more resistant to degradation, such that, even following 4 h of incubation with trypsin, a protease-resistant peptide of ~7 kDa was still evident by SDS-PAGE. Following proteolysis with trypsin, fibrils with a similar morphology to those present before digestion were observed by TEM (Fig. 3, B and C).
Similar results were obtained when proteinase K was used in place of trypsin (Fig. 4A) (i.e. native RCMκ-CN was degraded by proteinase K into small peptide fragments, whereas the fibrillar form was resistant to proteolytic degradation). After 4 h of incubation of fibrillar RCMκ-CN with proteinase K, four peptide bands were observed by SDS-PAGE at 18, 15, 10, and 8 kDa. Following digestion with proteinase K, amyloid fibrils were still observed by TEM (Fig. 4, B and C).
We identified the protease-resistant core region of these amyloid fibrils by mass spectrometry. Since digestion with trypsin produced a smaller protease-resistant peptide fragment compared with proteinase K, we used this treatment in order to identify the protease-resistant core. Following disaggregation of the trypsin-treated fibrils, we purified the resultant peptide and subjected it to MALDI-TOF MS. The peptide was seen as a [M + H]+ ion at 7207.66 Da in the spectrum (Fig. 5A); thus, its mass matched that of the peptide identified by SDS-PAGE (see Fig. 3A). A number of smaller tryptic fragments were also identified that matched fragments generated when the native form of the protein was treated with trypsin (data not shown). Based on its mass, the identity of this tryptic fragment could not be ... Table 2) and thus correspond to fragments of the κ-casein-(25-86) peptide. The sequence identity of these in-gel tryptic fragments was confirmed by tandem mass spectrometry. The peaks with lower m/z ratios in the spectrum arose from further fragmentation of these peptide fragments. We also confirmed that the same region of the protein was resistant to proteolytic degradation by proteinase K by performing an in-gel digest of the bands identified by SDS-PAGE (see Fig. 4A) and analyzing the fragments by MALDI-MS/MS (data not shown). Fragments corresponding to κ-casein-(17-111) and κ-casein-(17-86) were identified from the 10 and 8 kDa bands, respectively (Table 2). Both of these fragments encompass the region identified by trypsin proteolysis of the fibrils. Similar tryptic fragments were identified from the 18 and 15 kDa bands; therefore, these higher molecular mass bands incorporate the same protease-resistant region of κ-casein and represent larger peptides that were not fully digested by the proteinase K treatment.

Correlation of the Protease-resistant Core Region of Fibrils Formed by RCMκ-CN with Theoretical Predictions on the Structure and Fibril-forming Propensity of κ-Casein-Although κ-casein is generally classified as an intrinsically disordered protein (18,19), previous three-dimensional modeling (24) indicates that it can sample defined conformations, a finding that is supported by our CD and FTIR data (see Fig. 2). We have employed two independent disorder-predicting algorithms to assess the regions of κ-casein that are likely to fold into a well defined conformation and therefore facilitate fibril formation (Fig. 6). Analysis with the VL-XT algorithm using PONDR (33,34) indicated that, although almost half of the residues in the protein are predicted to be disordered (44%), the region Lys24-Pro64 is the most likely to adopt an ordered conformation. A similar region of the protein (Tyr30-Gln89) was predicted to fold into a defined conformation using the FoldIndex algorithm (35,36). Both of these regions of predicted order therefore correlate very well with the region identified herein as being incorporated into the fibrillar core of fibrils formed by RCMκ-CN (Tyr25-Lys86), suggesting that this region is likely to adopt the β-sheet conformation predicted by the previous modeling work (24).
An examination of so-called "amyloidogenic domains" within a number of amyloid-forming proteins has led to the development of algorithms to predict aggregation-prone regions. We have used two such algorithms to examine the regions of κ-casein predicted to be the most aggregation prone: 1) the algorithm TANGO (37) indicates these to be residues Ile28-Leu32 and Ile73-Leu79, and 2) a modified version of the Zyggregator algorithm (38) also predicts regions of significant aggregation propensity between Tyr25 and Gln29 and between Leu74 and Ser80 (Fig. 6). κ-Casein also contains two regions (Tyr38-Gln44 and Thr121-Ala126) that match the six-amino acid sequence pattern proposed by Lopez de la Paz and Serrano (39) to identify amyloidogenic stretches within proteins; however, the latter is preceded by a proline residue at position 120, which was found to be an "amyloid breaker." Hence, these predictive methods correlate well with the protease-resistant fragment (Tyr25-Lys86) identified in this study, which includes at least two regions of high aggregation propensity, encompassing residues Tyr25-Gln44 and Ile73-Ser80.
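The correspondence between the predicted regions and the protease-resistant core can be checked with simple interval arithmetic on the residue ranges quoted above. The sketch below is only a bookkeeping aid; the dictionary labels are illustrative names, and the residue ranges are those given in the text.

```python
# Protease-resistant core identified by trypsin digestion (residue numbers).
CORE = (25, 86)

# Predicted regions as quoted in the text (inclusive residue ranges).
PREDICTIONS = {
    "PONDR VL-XT (ordered)": [(24, 64)],
    "FoldIndex (ordered)": [(30, 89)],
    "TANGO": [(28, 32), (73, 79)],
    "Zyggregator (modified)": [(25, 29), (74, 80)],
    "Lopez de la Paz and Serrano pattern": [(38, 44), (121, 126)],
}


def overlap_length(region, core):
    """Number of residues shared by two inclusive ranges."""
    start = max(region[0], core[0])
    end = min(region[1], core[1])
    return max(0, end - start + 1)


def fraction_inside(region, core):
    """Fraction of a predicted region that falls within the core."""
    return overlap_length(region, core) / (region[1] - region[0] + 1)


for method, regions in PREDICTIONS.items():
    for region in regions:
        share = fraction_inside(region, CORE)
        print(f"{method}: residues {region[0]}-{region[1]}, "
              f"{share:.0%} within core {CORE[0]}-{CORE[1]}")
```

Only the Thr121-Ala126 stretch lies wholly outside the core, which is the stretch the text notes is preceded by a proline residue.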

Fibril Formation by RCMκ-CN Does Not Follow a Simple Nucleation-dependent Mechanism-We have previously shown that in its native state, κ-casein has an inherent propensity to form amyloid fibrils when incubated under conditions of physiological pH and temperature in vitro (13). The formation of fibrils is enhanced by reduction of its disulfide bonds; however, this is not attributable to significant secondary or tertiary structural changes in the protein upon reduction (12). As a result, we have used RCMκ-CN in these studies, since it forms fibrils much more readily than the nonreduced "native" form (13), and we attribute this primarily to the absence of disulfide bonds, thereby increasing the pool of dissociated species in solution that is available to form fibrils.
FIGURE 6. A summary of the key regions of κ-casein with regard to its amyloid fibril-forming ability. The diagram indicates the regions of the protein predicted to form β-sheet by three-dimensional modeling (24), the protease-resistant fibrillar core fragments generated by digestion with proteinase K and trypsin and identified by mass spectrometry, the region of the protein predicted to be ordered by the algorithms PONDR (33,34) and FoldIndex (35,36), the most amyloidogenic regions of the protein as predicted by TANGO (37) and a modified version of Zyggregator (38), and the region of the protein in which amino acid sequences match those considered amyloidogenic as proposed by Lopez de la Paz and Serrano (39). In the last, the asterisk indicates that this amino acid sequence is preceded by a proline residue, which is considered an amyloid fibril breaker.

The key to understanding the rate of amyloid formation involves the identification of the rate-determining step in the overall process. Amyloid fibril formation is generally considered to proceed via a nucleation-dependent mechanism in which the slow step is the formation of an ordered, stable nucleus (Fig. 7) (3). The time required for the formation of this nucleus corresponds to a lag phase of polymerization, which is strongly protein concentration-dependent. In this scheme, the lag phase can be shortened by the addition of preformed nuclei (seeds) to the reaction mix, which overcomes the time required for the formation of these seeds. Since fibril elongation also occurs from these added nuclei, the rate of fibril formation is also increased by seeding in a simple nucleation-dependent model (with the increase in rate being proportional to the amount of seeds added). In this study, we have shown that RCMκ-CN fibril formation appears to be regulated by a novel mechanism (i.e. not the formation of a stable nucleus as is the case in a simple nucleation-dependent model).
We have three distinct lines of evidence that have led us to this conclusion. First, the lag phase of fibril formation is independent of the protein concentration (Fig. 1, A, B, and D), suggesting that steps preceding the formation of a nucleus are significant in the reaction mechanism. Second, this preceding step is rate-limiting and follows an approximate first-order reaction mechanism (Fig. 1C) with respect to the protein concentration. Third, the addition of exogenous seeds to the reaction did not affect either the rate or lag phase of fibril formation (Fig. 1F).
Together, these data are consistent with a model in which the dissociated form of the protein is the precursor to fibril formation. Moreover, the dissociation of this amyloidogenic species from an oligomeric state is the rate-limiting step in fibril formation by RCMκ-CN (Fig. 7). This dissociation step therefore limits both the supply of species to form stable nuclei and the supply of precursor to the growing fibril during its elongation phase. That the formation of a stable nucleus is not rate-limiting accounts for the lack of dependence of the lag phase of polymerization on protein concentration, since the rate constant for dissociation would be independent of protein concentration. Such a model also explains why the rate of fibril formation depends on protein concentration, whereas adding preformed seeds to the reaction mix affects neither the rate nor the lag phase of fibril formation.
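The kinetic consequences of this model can be illustrated with a minimal numerical sketch. It is not fitted to the data in Fig. 1: the rate constants `k_diss` and `k_incorp`, the function name, and the representation of seeding as a larger incorporation rate constant are all assumptions made for illustration, and nucleation and reverse reactions are ignored. Because this reduced scheme has no true lag phase, the time to half-completion is used as a stand-in for the time scale of the reaction.

```python
def simulate_half_time(total_conc, k_diss=0.1, k_incorp=50.0,
                       dt=0.001, t_max=200.0):
    """Time for half of the protein to reach the fibril state.

    Hypothetical two-step scheme (all parameters are illustrative):
        oligomer --k_diss--> free precursor --k_incorp--> fibril
    with k_incorp >> k_diss, so that release from the oligomer is slow
    and incorporation into nuclei or fibrils is fast.
    """
    oligomer, free, fibril = total_conc, 0.0, 0.0
    t = 0.0
    while t < t_max:
        # Explicit Euler step for the two first-order reactions.
        released = k_diss * oligomer * dt
        incorporated = k_incorp * free * dt
        oligomer -= released
        free += released - incorporated
        fibril += incorporated
        t += dt
        if fibril >= 0.5 * total_conc:
            return t
    return None  # half-completion not reached within t_max


if __name__ == "__main__":
    # Vary the total protein concentration (arbitrary units).
    for conc in (1.0, 2.0, 5.0):
        print("conc", conc, "half-time", simulate_half_time(conc))
    # Crude proxy for seeding: faster incorporation of the free precursor.
    print("seeded half-time", simulate_half_time(1.0, k_incorp=100.0))
```

In this sketch, changing `total_conc` leaves the half-time unchanged while the absolute amount of fibril formed per unit time scales with concentration, and doubling `k_incorp` (a crude proxy for adding seeds) shifts the half-time only marginally, because the slow release step controls the overall rate.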
Thus, the precursor to fibril formation is the dissociated species rather than the oligomeric form of the protein. The reasoning for this is 2-fold. First, seeding of the reaction with up to 20% (w/w) preformed fibrils did not have an effect on the rate of fibril formation. If the oligomer were the precursor to fibril formation, then it is foreseeable that seeding at low concentrations (i.e. 0.5-2%, w/w) would not have a significant effect on the lag phase or rate of fibril formation, since the number of seeds would be relatively small compared with the number of pre-existing nuclei (in the form of oligomers). However, at higher concentrations, the number of seeds would be relatively high, so one would expect to see an observable change in the lag phase and rate of fibril elongation, since the number of amyloidogenic oligomers would not be limiting. Second, it is more likely that the dissociated species would enable the ordered aggregation required for fibril formation to proceed, since it would expose the hydrophobic "legs" that we have identified as making up the protease-resistant core of the fibrils (see below). In contrast, in the oligomeric form of the protein, the hydrophobic regions are thought to be internally buried (23,40), which would limit their ability to associate to form fibrils.
In the majority of studies describing fibril formation by a specific precursor protein, aggregation has been shown to involve solution conditions that disrupt the native structure and therefore facilitate unfolding. Thus, a critical component of the general model for amyloid fibril formation is the requirement for the protein to unfold, at least partially, as a precursor to amyloid nucleation (1) (see Fig. 7). However, for many systems involved with disease (e.g. Alzheimer and Parkinson disease), the peptides and proteins that have been found to aggregate adopt essentially unstructured conformations in their normal biological state (i.e. they are "natively unfolded") (10). Thus, aggregation does not need to be preceded by the unfolding of a globular protein but rather may proceed by direct self-assembly of the particular protein or peptide from an ensemble of unstructured conformations (10). Moreover, although dissociation of a protein from an oligomeric form can generate an amyloid precursor due to the change in solution conditions (e.g. for acid denaturation of the transthyretin tetramer to form monomers (41)), the possibility that amyloid fibril formation primarily occurs due to a "native" amyloidogenic species dissociating from its binding partner or parent oligomer, under physiological conditions, has not been fully explored.

FIGURE 7. A proposed model to describe differences in the process of fibril formation by proteins that follow a simple nucleation-dependent mechanism compared with fibril formation by κ-casein. In the nucleation-dependent mechanism, fibril formation commences with the unfolding of a native protein, forming a pool of partially folded intermediates, a process that is reversible. The partially folded intermediates are able to reversibly associate with each other until they reach a critical size/mass at which a stable nucleus is formed. The formation of this nucleus from the partially folded intermediates is slow and rate-limiting in the overall process of fibril formation. The time required for the formation of this nucleus is referred to as the lag phase, which is protein concentration-dependent, since increasing the concentration increases the number of intermediates that can associate to form a nucleus. Fibril elongation then proceeds via the addition of intermediates to the growing nucleus. The mechanism also explains how seeding the reaction increases the reaction rate and decreases the lag phase, since the addition of preformed fibrils overcomes the time required to form nuclei (orange arrow), and the concentration of partially folded intermediates is not limiting. In contrast, fibril formation by κ-casein depends on the dissociation of a highly amyloidogenic species from its oligomeric state, and this is the rate-limiting step in the fibril-forming pathway. Once released, these amyloidogenic species rapidly associate to form stable nuclei, and, since the dissociation constant is independent of protein concentration, the lag phase of fibril formation by κ-casein is independent of the protein concentration. Fibril formation proceeds through a similar mechanism of β-sheet stacking as with the nucleation-dependent model; however, the addition of preformed seeds to the reaction does not increase the reaction rate, since the supply of the amyloidogenic species for fibril elongation is also limited by the dissociation constant (orange arrow).
In the case of κ-casein, it is reasonable to speculate that the generation of the amyloidogenic intermediate may simply involve dissociation from its oligomeric state (i.e. quaternary structure destabilization). In the absence of the other casein proteins, monomeric forms of κ-casein are in dynamic equilibrium with a large oligomeric species (22). Treatments known to increase the rate of dissociation, such as heat and reduction, increase the rate of amyloid fibril assembly (13). The extent of unfolding undergone by these dissociated forms in the generation of an amyloidogenic precursor remains to be determined. However, it is likely to be limited, since modeling (24) suggests that the dissociated κ-casein monomer samples a structural conformation that is an ideal β-sheet-containing precursor for fibril formation, negating the need for further unfolding.
The Rate-limiting Species Formed during Amyloid Fibril Formation by RCMκ-CN Is Highly Amyloidogenic. As part of our proposed model for the mechanism by which RCMκ-CN forms fibrils, the dissociated species is highly amyloidogenic and therefore is readily incorporated into nuclei or growing fibrils when it becomes available. Thus, there is no free pool of this amyloidogenic species available to be incorporated into exogenous seeds as occurs in the simple nucleation-dependent model. We therefore sought to identify the regions of the protein incorporated into the protease-resistant core of fibrils formed by RCMκ-CN and investigated whether these help to explain its amyloidogenic nature. Our findings indicate that the region of RCMκ-CN from Tyr25 to Lys86 represents the protease-resistant β-sheet core of amyloid fibrils formed by κ-casein. This region correlates well with that predicted by protein disorder algorithms to fold into a well defined conformation and is predicted to be aggregation-prone (see Fig. 6). Although κ-casein is regarded as a natively unstructured protein, since its monomer adopts native molten globule-like states (i.e. it has ill defined tertiary structure while retaining some secondary structural elements (19)), theoretical three-dimensional modeling of the protein has suggested that it may sample a "horse and rider" conformation (24). In this model, the region from Lys21 to Phe55 is proposed to adopt an antiparallel β-sheet formation in which the β-sheets are separated by proline turns (24). These "legs," due to their high degree of hydrophobicity and exposure, would make an ideal site for sheet-sheet interactions. Under physiological conditions, such interactions would presumably occur with hydrophobic domains of other caseins, facilitating its incorporation into oligomeric species, such as milk micelles. However, this region of the protein would also favor β-sheet stacking with other κ-casein subunits and therefore facilitate amyloid fibril assembly.
The Assembly of Fibrils Formed from RCMκ-CN. Our studies show that a significant portion of the full-length protein is protected from proteolytic cleavage once it has adopted the fibrillar conformation, even when a relatively nonspecific protease, such as proteinase K, is used for the digestion. This is similar to what has been reported for other amyloid-forming systems (e.g. up to 89 residues (residues 10-99) of β2-microglobulin were found to be resistant to degradation by pepsin) (42). It is highly likely that a single κ-casein monomer may contribute more than one β-strand to the fibril structure, in a similar manner to that shown for the amyloid β peptide (Aβ-(1-40)) (43) and the C-terminal fragment of the HET-s protein (44). Moreover, since the protease-resistant region of fibrillar κ-casein is continuous, the loops joining these strands are unlikely to be solvent-exposed. Further structural studies of fibrils formed by κ-casein are required to confirm these proposals. It was unexpected that treatment with trypsin (a relatively specific protease) was able to cleave closer to the core of the amyloid fibrils than proteinase K (a relatively nonspecific protease). This may reflect differences in the accessibility of residues to the two proteases (because of the packing of the fibrils), arising from variations in their size and mode of action.
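The notion of a "relatively specific" protease can be made concrete with a minimal in silico digestion sketch. It applies the commonly used rule that trypsin cleaves on the C-terminal side of Lys or Arg except when the next residue is Pro. The function name and the short peptide are hypothetical and are not taken from the κ-casein sequence, and the sketch ignores fibril packing, which is the factor proposed above to limit protease access. Proteinase K has no comparably simple cleavage rule, so it is not modeled here.

```python
def tryptic_digest(sequence):
    """Split a one-letter amino acid sequence at predicted trypsin sites.

    Rule assumed here: cleave after K or R unless the next residue is P.
    Missed cleavages and structural protection are not considered.
    """
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        is_last = i == len(sequence) - 1
        if residue in "KR" and not is_last and sequence[i + 1] != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    # Whatever remains after the final cleavage site is the last peptide.
    peptides.append(sequence[start:])
    return peptides


if __name__ == "__main__":
    # Hypothetical peptide chosen only to exercise the rule.
    print(tryptic_digest("AKRPGKTR"))
```

Under this rule a fully accessible chain is cut only at a limited set of predictable positions, which is why the observation that trypsin cleaved closer to the fibril core than proteinase K was unexpected.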
Upon fibril assembly, κ-casein undergoes structural rearrangement, particularly in the region predicted to have a high propensity to form β-sheet (24). Indeed, our spectroscopic data indicate that, although some regions of the protein are not involved in fibril formation (as assessed by NMR spectroscopy),⁵ others undergo major rearrangements, leading to a significant increase in the β-sheet content of the protein (as shown by FTIR spectroscopy; see Fig. 2 and Table 1). Moreover, although we suggest that the inherent propensity for κ-casein to form amyloid fibrils originates from regions of the monomeric state that facilitate β-sheet stacking, the differences in the component bands of the amide I region of the FTIR spectra attributable to β-sheet suggest that the β-sheet domains in the native and fibrillar forms are structurally dissimilar. Thus, amyloid fibril growth does not simply involve the direct stacking of native β-sheets from individual precursors but instead is most likely preceded by a reorganization of these regions, as has been suggested for other β-sheet-containing proteins (45).
Conclusions. Here we show that fibril formation by RCMκ-CN is not governed by the simple nucleation-dependent model that has been proposed for most other amyloid fibril-forming systems studied to date (Fig. 7). Instead, we propose that the rate of fibril formation by RCMκ-CN is limited by the dissociation of an amyloidogenic precursor from an oligomeric state (Fig. 7). In milk, the amount of this amyloidogenic species would be very low because of the interaction of κ-casein with itself (e.g. via intermolecular disulfide bonding) and with the other caseins in milk (e.g. in milk micelles). Nonetheless, the proteinaceous deposits found within corpora amylacea of mammary tissue have been shown to be amyloid in nature (14-16), and peptides corresponding to casein fragments have been isolated from within these amyloid-like deposits (46,47). Moreover, intracellular fibrillar bundles have also been identified in epithelial cells that surround the deposits of corpora amylacea (17,48). These are sites at which casein synthesis and secretion are high, and the reducing environment of the cell would favor reduced forms of these proteins. Thus, it is possible that the dissociation of amyloidogenic precursors of κ-casein from milk micelles does occur under some circumstances, resulting in fibrillar deposits within these tissues. That amyloidoses associated with κ-casein fibril formation are not more prevalent in vivo is probably a testament to the interactions between the casein proteins, which prevent the release of this κ-casein amyloidogenic precursor and thereby inhibit large scale κ-casein fibril formation through their chaperone-like activity (13,49).