N-terminal Prion Protein Peptides (PrP(120–144)) Form Parallel In-register β-Sheets via Multiple Nucleation-dependent Pathways*

The prion diseases are a family of fatal neurodegenerative diseases associated with the misfolding and accumulation of normal prion protein (PrPC) into its pathogenic scrapie form (PrPSc). Understanding the fundamentals of prion protein aggregation and the molecular architecture of PrPSc is key to unraveling the pathology of prion diseases. Our work investigates the early-stage aggregation of three prion protein peptides, corresponding to residues 120–144 of human (Hu), bank vole (BV), and Syrian hamster (SHa) prion protein, from disordered monomers to β-sheet-rich fibrillar structures. Using 12 μs discontinuous molecular dynamics simulations combined with the PRIME20 force field, we find that the Hu-, BV-, and SHaPrP(120–144) aggregate via multiple nucleation-dependent pathways to form U-shaped, S-shaped, and Ω-shaped protofilaments. The S-shaped HuPrP(120–144) protofilament is similar to the amyloid core structure of HuPrP(112–141) predicted by Zweckstetter. HuPrP(120–144) has a shorter aggregation lag phase than BVPrP(120–144) followed by SHaPrP(120–144), consistent with experimental findings. Two amino acid substitutions I138M and I139M retard the formation of parallel in-register β-sheet dimers during the nucleation stage by increasing side chain-side chain association and reducing side chain interaction specificity. On average, HuPrP(120–144) aggregates contain more parallel β-sheet content than those formed by BV- and SHaPrP(120–144). Deletion of the C-terminal residues 138–144 prevents formation of fibrillar structures in agreement with the experiment. This work sheds light on the amyloid core structures underlying prion strains and how I138M, I139M, and S143N affect prion protein aggregation kinetics.

The prion diseases are a family of fatal neurodegenerative diseases associated with the misfolding and accumulation of normal prion protein (PrP C ) into its pathogenic scrapie form (PrP Sc ). Understanding the fundamentals of prion protein aggregation and the molecular architecture of PrP Sc is key to unraveling the pathology of prion diseases. Our work investigates the early-stage aggregation of three prion protein peptides, corresponding to residues 120 -144 of human (Hu), bank vole (BV), and Syrian hamster (SHa) prion protein, from disordered monomers to ␤-sheet-rich fibrillar structures. Using 12 s discontinuous molecular dynamics simulations combined with the PRIME20 force field, we find that the Hu-, BV-, and SHaPrP(120 -144) aggregate via multiple nucleation-dependent pathways to form U-shaped, S-shaped, and ⍀-shaped protofilaments. The S-shaped HuPrP(120 -144) protofilament is similar to the amyloid core structure of HuPrP(112-141) predicted by Zweckstetter. HuPrP(120 -144) has a shorter aggregation lag phase than BVPrP(120 -144) followed by SHaPrP(120 -144), consistent with experimental findings. Two amino acid substitutions I138M and I139M retard the formation of parallel in-register ␤-sheet dimers during the nucleation stage by increasing side chain-side chain association and reducing side chain interaction specificity. On average, HuPrP(120 -144) aggregates contain more parallel ␤-sheet content than those formed by BV-and SHaPrP(120 -144). Deletion of the C-terminal residues 138 -144 prevents formation of fibrillar structures in agreement with the experiment. This work sheds light on the amyloid core structures underlying prion strains and how I138M, I139M, and S143N affect prion protein aggregation kinetics.
Prion diseases are a family of infectious amyloid diseases that affect both humans and animals. They include Creutzfeldt-Jakob disease, Gerstmann-Sträussler-Scheinker syndrome, kuru, and fatal familial insomnia in humans, and scrapie, bovine spongiform encephalopathy, and chronic wasting disease in other mammals (1). The infectious agent is the scrapie form of the prion protein (PrP Sc ), 2 which is a ␤-sheet-rich aggregated form of normal prion protein (PrP C ) whose full structure is unknown. In contrast to viruses or other pathogens, which propagate by replicating themselves inside a host cell based on their nucleic acid genome, PrP Sc propagates by templating the misfolding and accumulation of PrP C into misfolded PrP Sc (2). In addition, within a single population, e.g. human, various strains of Creutzfeldt-Jakob disease are believed to be caused by the same prion protein but with diverse PrP Sc conformations (3). The molecular mechanism underlying PrP Sc -assisted propagation still remains unclear (4).
Various models for the full structure of PrP Sc have been postulated based on experiment and simulation studies. PrP Sc structures have been found to contain a 27-30-kDa proteaseresistant core consisting of residues 90 -231 (5). Govaerts et al. (6) postulated a trimeric model for PrP Sc in which the N-terminal residues 89 -175 adopt a left-handed ␤-helix, and residues 176 -227 in the C terminus maintain an ␣-helical conformation. Using molecular dynamics simulation, DeMarco and Daggett (7) observed the conversion of native two-stranded ␤-sheets (residues 129 -131 and 161-163) in the Syrian hamster prion protein N terminus into three-stranded ␤-sheets (residues 116 -119, 129 -132, 160 -164) and isolated ␤-strands (residues 135-140). Based on this, they proposed a spiral ␤-helix filament model containing PrP Sc -like trimers tightly packed along the fibril axis. Smirnovas et al. (8) applied hydrogen-deuterium exchange coupled with mass spectrometry to brain-derived PrP Sc and found that the proteinase K-resistant domain (from residues 90 to 231) of PrP Sc consists mainly of ␤-strands with relatively short connecting turns and loops and no ␣-helices, contradicting the notion that PrP Sc retains native ␣-helices. Based on solid-state NMR data, Groveman et al. (9) proposed a parallel in-register architecture for the mouse PrP(90 -231) fibril. Their MD simulation results show that residues 124 -227 adopt a more packed structure than residues 90 -123.
The N-terminal prion protein fragments have been the subject of experiments aimed at learning the fundamental mecha-* This work was supported by National Institutes of Health Grants GM056766 and EB006006 and in part by National Science Foundation Research Triangle Materials Research Science and Engineering Centers (MRSEC) Grant DMR-1121107. The authors declare that they have no conflicts of interest with the contents of this article. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. □ S This article contains supplemental Figs. S1-S6 and Table S1. 1 To whom correspondence should be addressed. nisms underlying prion protein propagation. Examples of prion fragments that have been studied include the prion Y145Stop (PrP(23-144)) and Q160Stop mutants (PrP(23-159)), which are associated with prion cerebral amyloid angiopathy, a Gerstmann-Sträussler-Scheinker syndrome-like prion disease characterized by the presence of amyloid deposits in the cerebral nervous system (10). Vanik et al. (11) found that recombinant PrP(23-144) from human, mouse, and hamster form fibrils with distinct lag phases. They concluded that speciesspecific mutations (I138M and I139M) are critical for determining the aggregation kinetics as well as the cross-seeding specificity of PrP(23-144). Jones et al. (12) found that residues 130 -140 are an essential region for amyloid formation because deletion of the conserved palindrome sequence 113 AGAAAAGA 120 results in an altered amyloid ␤-core but does not affect amyloidogenicity or seeding specificity. Later on, Chatterjee et al. (13) carried out seed-induced fibrillation experiments on recombinant full-length mouse PrP  and found that residues 127-143 participate in the formation of the amyloid core. They also found that synthetic MoPrP(107-143) and MoPrP(127-143) peptides can seed the fibrillation of recombinant full-length mouse prion protein, whereas MoPrP(107-126) cannot. The fibrillation of short N-terminal prion peptides from the amyloid core region of N-terminal prion protein fragment has also been investigated in experiments. Tagliavini et al. (14) performed the fibrillation experiments of synthetic peptides homologous to prion protein residues 106 -147 and found that PrP(106 -126) formed straight fibrils, whereas PrP(127-147) formed twisted fibrils. Skora et al. (15) applied H/D exchange coupled with NMR spectroscopy to fibrils formed by recombinant human PrP(108 -143) and found that residue 129 is deeply buried in the amyloid core. They also found that mutations at position 129 greatly impact aggregation kinetics. Li et al. (16) used vibrational spectroscopy to show that PrP(118 -135) starts to adopt a parallel ␤-sheet structure in a lipid bilayer at high peptide concentrations but prefers ␣-helical structure at low peptide concentrations. Apostol et al. (17) crystallized prion fragments corresponding to human (Hu)PrP(138 -143), MoPrP(137-142), and hamster PrP(138 -143) and found that HuPrP(138 -143) and MoPrP(137-142) form parallel in-register steric zippers, whereas hamster PrP(138 -143) forms antiparallel out-of-register steric zippers. Chuang et al. (18) investigated how various amino acid substitutions in synthetic bovine PrP(108 -144) affect peptide amyloidogenesis and seeding efficiency. They found that the mutation L138M promotes amyloid formation and increases seeding efficiency, whereas the mutations I139M and N143S retard amyloid formation and decrease PrP(108 -144) seeding efficiency.
There are many structural models proposed for short N-terminal prion peptides. Kuwata et al. (19) conducted MD simulations and confirmed the structural stability of tightly packed parallel ␤-sheets in the MoPrP(106 -126) octameric fibril. Helmus et al. (20) conducted solid state NMR on the human PrP(23-144) amyloid structure and found that the fibril contains a parallel in-register ␤-sheet core comprising residues 112-141 near the C terminus of the peptide. The remaining residues in human PrP(23-144) amyloid fibrils were found to be flexible and largely disordered. Later on, based on solid-state NMR data, Lin et al. (21) presented a parallel in-register ␤-sheet model for the human PrP(127-147) fibril where each peptide has two straight ␤-strands connected by a kink at Pro-137. Skora and Zweckstetter (22) applied the chemical shift ROSETTA method and predicted the conformation of human PrP(112-141) within the amyloid core formed by the Y145Stop prion mutant. Later on, Zweckstetter (23) used a combination of NMR spectroscopy, hydroxyl radical probing combined with mass spectrometry, electron microscopy, and computational methods to propose dimeric and trimeric left-handed ␤-helix models for the PrP(112-141) amyloid core.
In this work, we apply a combination of discontinuous molecular dynamics and the PRIME20 force field to investigate the spontaneous aggregation of prion protein peptides PrP(120 -144), an essential region for PrP(23-144) amyloid formation. Although the fibrillation of prion protein and its N-terminal fragments has been widely studied in experiments, this is, to our knowledge, the first simulation work to investigate the aggregation mechanism for this particular peptide. We investigate three different PrP(120 -144) peptides, which originate from human, bank vole, and Syrian hamster. The spontaneous aggregation of a system of eight PrP(120 -144) peptides is monitored, starting from a random configuration and eventually after 12 s forming a mixture of oligomers and protofilaments. The structures of oligomers and protofilaments formed by Hu-, BV-, and SHaPrP(120 -144) at the end of 37 independent simulations are analyzed. The aggregation kinetics of the three PrP(120 -144) peptides are compared. We also investigate how the amino acid substitutions I138M, I139M, and S143N affect PrP(120 -144) aggregation. This work provides molecular level insight into the amyloid core structure and the aggregation mechanism for PrP(120 -144) peptides.
Highlights of our simulation results are the following. PrP(120 -144) forms S-shaped, U-shaped, and ⍀-shaped protofilament structures via multiple nucleation-dependent pathways. The PrP(120 -144) protofilaments mainly contain parallel in-register ␤-sheets, which are consistent with previous experimental work (20,23). The key event during the aggregation is the formation of a nucleus consisting of a parallel in-register ␤-sheet dimer. In most cases, the nucleus first grows into a metastable oligomer, then undergoes backbone rearrangement, and eventually grows into a protofilament. In rare cases, the nucleus grows directly into a protofilament without intermediate steps. Hu-, BV-, and SHaPrP(120 -144) all aggregate in a nucleation-dependent manner. In terms of aggregation kinetics, HuPrP(120 -144) has a shorter lag phase than BVPrP(120 -144), followed by SHaPrP(120 -144), consistent with experimental findings (11). The HuPrP(120 -144) aggregates also have more parallel in-register ␤-sheet content than BV-and SHaPrP(120 -144). This is because the two species-specific amino acid substitutions I138M and I139M enhance side chain-side chain association and reduce side chain-side chain interaction specificity, thus decreasing the probability of forming parallel in-register ␤-sheets. S143N does not have any effect on PrP(120 -144) aggregation kinetics because it does not engender significant change in the side chain-side chain interactions.

Results
Hu-, BV-, and SHaPrP(120 -144) Aggregate into Protofilament Structures-We investigate the spontaneous fibrillation of prion protein fragments (PrP(120 -144)) by performing DMD/PRIME20 simulations on a system containing eight peptides. Eleven to 15 independent simulations were conducted on each of the Hu-, BV-, and SHaPrP(120 -144) peptides for a total of 37 simulations at T* ϭ 0.195 and peptide concentration of 20 mM. Fig. 1 shows the total interaction energy versus reduced time from 15 runs of HuPrP(120 -144). Starting from a high energetic state, the system goes through various pathways and reaches its relatively low energy state by the end of each simulation. The protofilament structures formed for HuPrP(120 -144) in the 3rd, 5th, 7th, 11th, and 12th runs have the lowest interaction energies and hence the highest thermodynamic stabilities among all the final structures. The system total energy profiles for BV-and SHaPrP(120 -144) are shown in supplemental Figs. S1 and S2. Fig. 2 shows the final structures for three different protofilaments formed by HuPrP(120 -144) in our simulations. Fig. 2A shows an S-shaped parallel in-register protofilament formed by HuPrP(120 -144) from the 5th run; Fig. 2B shows a schematic side view and the side chain placement for this structure. Such S-shaped protofilaments are also formed in the 2nd, 3rd, and 7th runs. From Fig. 2B, we see that there are four turning points located at Gly-124, Gly-126, Pro-137, and Gly-142 along the backbone chain of each peptide in the protofilament. Two ␤-strands (residues 127-136 and 138 -141) flank residues Arg-136 and Pro-137, which is similar to the amyloid core structure suggested by Skora (22) and Zweckstetter (23) for HuPrP(112-141). Having a proline at position 137 may explain why HuPrP(120 -144) adopts a "boomerang" structure, because proline facilitates turns and breaks in ␣-helices and ␤-sheets (24). In contrast, in the Zweckstetter (23) PrP(112-141) model, the two ␤-strands lie on a nearly straight line. Jones et al. (12) probed the fibril structure of both wild type PrP(23-144) and PrP(23-144) without the nonessential palindromic sequence (residues 113-120) by solid-state NMR spectroscopy and found that deletion of residues 113-120 changes the secondary structural distribution of the fibril core region. Because our PrP(120 -144) simulation does not include residues 113-120, the structural differences between our PrP(120 -144) protofilament and the Zweckstetter (23) PrP(112-141) amyloid core structure are not surprising. Fig. 2C shows the ⍀-shaped protofilament formed in the 12th run, and Fig. 2D shows a side view. It is apparent from Fig.  2D that the ⍀-shaped protofilament has five turning points at Gly-124, Gly-126, Gly-131, Pro-137, and Gly-142. Note that the C terminus of the yellow chain is not part of the protofilament. The ⍀-shaped protofilament has two triangularly shaped hydrophobic cores formed by the association of the hydrophobic side chain beads in white. The first hydrophobic core contains Val-121, Leu-125, and Leu-130. The second hydrophobic core contains Met-129, Met-134, and Ile-138, which is similar to the triangular hydrophobic core of Het-s (218 -289) fibrils (25). Fig. 2E shows the U-shaped protofilament formed by HuPrP(120 -144) in the 11th run, and Fig. 2F shows its side view. Compared with the ⍀-shaped protofilament, the N-terminal residues of the U-shaped protofilament curve inward to form a large hydrophobic side chain core that is essentially a combination of the two separate hydrophobic cores formed in the S-shaped protofilament. Cheon et al. (26) found that A␤(17-42) forms a similar U-shaped protofilament. BVPrP(120 -144) can form both S-shaped and ⍀-shaped protofilaments, as shown in Fig. 3. Regardless of the amino acid difference between BV-and HuPrP(120 -144), the peptides in the BVPrP(120 -144) S-shaped protofilament in Fig. 3A adopt nearly the same conformation as those in HuPrP(120 -144) S-shaped protofilament in Fig. 2A. From Fig. 3C, the ⍀-shaped protofilament formed by BVPrP(120 -144) has Val-121, Leu-125, and Leu-130 in the first hydrophobic core and Met-129, Ser-132, Met-134, and Met-138 in the second hydrophobic core.
Our simulations reveal that the Hu-, BV-, and SHaPrP(120 -144) peptides aggregate to form relatively heterogeneous aggregate structures. We believe that the main reason that PrP(120 -144) forms the S, ⍀, and U-shaped protofilaments is its conformational flexibility. Each of the three different PrP(120 -144) peptides contain six glycines and one proline, which facilitate turn structures in a protein. For HuPrP(120 -144), the S-, ⍀-, and U-shaped protofilaments are formed in 4, 1, and 1 out of 15 runs. For BVPrP(120 -144), the S-and ⍀-shaped protofilaments are formed in 2 and 1 out of 11 runs. For SHaPrP(120 -144), the S-and ⍀-shaped protofilaments are formed in 2 and 1 out of 11 runs. These S-shaped, ⍀-shaped, and U-shaped protofilaments are stable enough so that they maintain their conformation throughout the simulation. Aside from forming parallel ␤-sheets in 9 of the 15 runs, 8 out of the 11 runs, and 8 out of the 11 runs, Hu-, BV-, and SHaPrP(120 -144) form metastable oligomers that are characterized by a mixture of parallel and anti-parallel ␤-strands, as shown in supplemental Figs. S3-5, respectively.
Notably, all of the peptides considered, regardless of their amino acid differences, can form both S-shaped and ⍀-shaped protofilaments. Comparing the structure of the three S-shaped protofilaments formed by three PrP(120 -144)s in Figs. 2B, 3B, and 4B, we find that the peptides all have the same backbone alignment and the same side chain placement. However, the peptides in the ⍀-shaped protofilament adopt distinct side chain placements as shown in Figs. 2D, 3D, and 4D. The major structural difference between the three ⍀-shaped protofilaments is that the two pairs of adjacent residues, Met-129 and Leu-130 and Met-134 and Ser-135 switch their position between the two hydrophobic pockets. Hydrophobic residues (white), positively charged residues (red), negatively charged residues (blue), and polar residues (green) are shown. Structural Analysis of Hu-, BV-, SHaPrP(120 -144) Protofilament-The distribution of secondary structure in the HuPrP(120 -144) S-shaped protofilament was calculated using the STRIDE program (27) in visual molecular dynamics and is shown in Fig. 5 (28). The hydrophobic and polar residues, including Val-121 to Ser-135 and Ile-138 to Phe-141, have at least 70% ␤-sheet structure except for residues near the two termini. The glycine and proline are most likely to be in random coil or turn structures. The secondary structure distributions for the non-S-shaped protofilaments formed by Hu-, BV-, and SHaPrP(120 -144) are similar to those shown in Fig. 5 and thus are not displayed here. Table 1 shows the equilibrium values for the energies of the different-shaped Hu-, BV-, and SHaPrP(120 -144) protofilaments in Figs. 2-4. The backbone hydrogen bonding energy, E HB , side chain-side chain interaction energy, E side chain , and their sum total interaction energy, E total , are listed. All of the energies are calculated by averaging over the last 100 billion collisions.
A good way to understand the conformational difference between S-shaped, ⍀-shaped, and U-shaped protofilaments is to compare the equilibrium values for E side chain within each species. From Table 1, we find that for the protofilaments formed by HuPrP(120 -144), E side chain (U-shaped) Յ E side chain (⍀-shaped) Ͻ E side chain (S-shaped), indicating that the U-shaped protofilament has a stronger side chain-side chain association than the ⍀-shaped protofilament, followed by the S-shaped protofilament. The U-shaped protofilament contains a large hydrophobic core having seven side chain spheres per peptide contacting each other (Fig. 2F), whereas the ⍀-shaped protofilament contains two separate small hydrophobic cores (Fig. 2D) and the S-shaped protofilament contains only one small hydrophobic core near the N terminus and an open side chain pocket near the C terminus (Fig. 2B). In addition, the three different protofilaments formed by HuPrP(120 -144) have similar E HB because they all consist of parallel in-register ␤-sheets. Similar correlations between the side chain energies and protofilament structure can be applied to those formed by BV-and SHaPrP(120 -144).
Nucleation-dependent Aggregation of Hu-, BV-, and SHaPrP (120 -144)-Next, we investigate the protein aggregation pathway and hence the aggregation mechanism for the three prion protein fragments. Because of the large number of degrees of freedom that a system of aggregating proteins can have, the free   (120 -144). The secondary structure content is calculated and averaged over the last 100 billion collisions using the STRIDE program (27). ␤-Sheet, random coil, and turn structure are three major types. Other types of secondary structures are not shown here.  ), the side chain-side chain interaction energy (E side chain ), and the end-to-end distance (E end-to-end ). The clustering analysis is based on the Ward method (29), which proceeds in the following way. Each data point along the trajectory (which consists of three CVs) starts out as a single cluster. There are initially N clusters. The next step is to combine two of the clusters into one so that the total number of clusters is reduced to N Ϫ 1. The two nearest clusters are selected from N(N Ϫ 1)/2 possible pairs of clusters. The distance of two clusters is shown in Equation 1, where n A , n B are the total number of elements in clusters A and B, and n k ϭ 3 is the total number of CVs in this case, and x k A ,x k B is the average value for the kth CV over all elements in clusters A and B. By iteratively merging two nearest clusters, the N clusters are eventually grouped into 3-4 major clusters that are termed the lag phase, nucleus growth, oligomer rearrangement, and fibril formation. If the trajectory is grouped into three major clusters, we regard it as "one-step" fibrillation. If the trajectory is grouped into four major clusters, we regard it as "two-step" fibrillation. The results of the analysis for the 3rd (one-step fibrillation) and 5th (two-step fibrillation) runs on HuPrP(120 -144) are discussed below.
We find from our simulation that the rate-limiting step during the aggregation of Hu-, BV-, and SHaPrP(120 -144) peptides is nucleus formation. We define nucleus formation to be the first time that a small oligomer containing a parallel in-register ␤-sheet dimer forms; this is the initial step in the develop-ment of a new aggregated phase. Fig. 6, A-E, shows simulation snapshots of HuPrP(120 -144) aggregation from the 3rd run. Fig. 6F shows the total interaction energy versus simulation time for this run. Peptides are initially placed in a random configuration. After the temperature is cooled to T* ϭ 0.195, the peptides start to contact each other and aggregate. The aggregation process for the 3rd run can be divided into three major stages as follows: the nucleation stage, the nucleus growth stage, and the equilibration stage as shown in Fig. 6F. During the nucleation stage, peptides keep attaching and detaching from each other through side chain contacts or hydrogen bonding interaction. In Fig. 6B, peptides form a transient oligomer with peptides partially associated with each other to form hydrogen bonds. In Fig. 6C, peptides dissociate from and reversibly return to the disordered state due to the formation of transient antiparallel steric zippers. The key event during HuPrP(120 -144) aggregation is the formation of a nucleus that contains a parallel in-register ␤-sheet dimer, as shown in Fig.  6D. We identify the time window where the first dimer is formed by keeping track of the number of parallel in-register ␤-sheet segments (defined in Fig. 10) between any two peptides in the system to see when the "18 parallel ␤-sheet segments" criteria are satisfied. The nucleus rapidly grows into an S-shaped protofilament (Fig. 6E) by recruiting other peptides to its two ends. Fig. 6F shows that the system total interaction energy decreases sharply after the nucleus forms; the system here is stabilized by a large number of hydrogen bonds and maximal side chain contacts. Baftizadeh et al. (30) had a similar observation in their biased-exchange meta-dynamics simulation where the rate-limiting step during the aggregation of 18 A␤ (35)(36)(37)(38)(39)(40) peptides was associated with the formation of a specific type of steric zipper.
We find in our simulations that PrP(120 -144) goes through multiple aggregation pathways. Fig. 7, A-E, shows simulation snapshots of the 5th run for HuPrP(120 -144), and Fig. 7F shows the total interaction energy as a function of time. In this run, the eight HuPrP(120 -144) peptides aggregate to form an S-shaped protofilament via a two-step fibrillation pathway (Fig.  7F). Compared with the one-step fibrillation pathway described previously (Fig. 6F), the unstable nucleus on the two-step pathway requires a long time to overcome energy barriers, rearrange into a stable conformation, and finally grow into a protofilament. This is reminiscent of the Lumry-Eyring nucleated polymerization (LENP) model for protein aggregation proposed by Andrews et al. (31). In the LENP model, peptides first reversibly associate to form a pre-nucleus, which eventually goes through a conformational rearrangement step to form an irreversible nucleus. In Fig. 7A, peptides are initially disordered. In Fig. 7B, a nucleus containing a ␤-sheet dimer (colored cyan and green) is formed. It quickly grows into a metastable oligomer (Fig. 7C) consisting of five peptides, the N terminus of which has a tilted ␤-roll conformation. Disordered peptides repeatedly attach and detach from the transient oligomer. The peptides within the oligomer rearrange into a stable parallel in-register ␤-sheet conformation (Fig. 7D), which eventually elongates and grows into an S-shaped protofilament (Fig. 7E) by recruiting and templating the remaining disordered peptides at its two ends.
Although the two-step fibrillation pathway of HuPrP(120 -144) is similar to the LENP model, there are some differences. In the LENP model, the protein reversibly switches its conformation between its native state, an intermediate state, and an unfolded state prior to nucleus formation. However, in our simulation, there are no native, intermediate, or unfolded states because PrP(120 -144) is too short to form any native domain, much less a domain that is characteristic of the whole prion protein. Also, residues 120 -144 within the normal prion protein are intrinsically disordered, except that residues 128 -131 form a short antiparallel ␤-sheet with residues 161-164, which is not included in our studies.
The PrP(120 -144) peptides in all 37 independent simulations aggregate in a nucleation-dependent manner, as evidenced by the fact that they all go through a lag phase before aggregating into ␤-sheet-rich structures. The one-step and two-step fibrillation pathways are applicable only for those simulation runs that result in the formation of parallel in-register ␤-sheet protofilaments as shown in Figs. 2-4. For HuPrP(120 -144), aside from the 3rd run, the 2nd and 7th runs also follow the one step fibrillation pathway to form an S-shaped protofilament because the nucleus grows directly into a final fibrillar state without passing through an obvious metastable state. Therefore, there is no intermediate plateau between the lag phase and fibrillar phase along the total energy profile, as shown in Fig. 1. The formation of the U-shaped and ⍀-shaped protofilaments in the 11th and 12th runs also follows a one-step fibrillation pathway for the same reason. For BVPrP(120 -144), the 4th run follows a one-step and the 3rd and 5th runs follow a two-step fibrillation pathway. For SHaPrP(120 -144), the 10th run follows a one-step and the 1st and 7th follows a two-step fibrillation pathway. However, neither the one-nor two-step fibrillation pathways are applicable for the trajectories where a metastable oligomer, e.g. ␤-barrel oligomer, is formed such as those in supplemental Figs. S3-5. In those cases we cannot predict whether the oligomers are off-pathway products or onpathway toward forming protofilaments without using many more computing resources as well as applying biased-sampling techniques.
I138M and I139M Retard Nucleus Formation-We then investigate how amino acid substitutions (I138M, I139M, and S143N) between Hu-, BV-, and SHaPrP(120 -144) affect their aggregation kinetics with particular focus on the lag phase time. The lag phase time in our simulations is defined to be the time from the start of the simulation to the first time that two peptides form a stable parallel in-register ␤-sheet dimer. Our criterion for stability here is that the dimer should have at least 18 parallel ␤-sheet segments, as defined in Fig. 10. The rationale for using this definition is that the formation of a stable parallel ␤-sheet dimer leads to an obvious increase in ␤-sheet content of the system and so is easy to detect. If two peptides share less than 18 parallel ␤-sheet segments, they are metastable and may dissociate into the monomer state. The lag phase time for each species is averaged over 11 and 15 independent runs (15 runs for HuPrP(120 -144) and 11 runs for both BV-and SHaPrP(120 -144)). Table 2 shows a comparison of the aggre-

N-terminal Prion Peptides Form Parallel ␤-Sheets
OCTOBER 14, 2016 • VOLUME 291 • NUMBER 42 gation lag phase time for the three PrP(120 -144) peptides. HuPrP(120 -144) has a shorter average lag phase time than BVPrP(120 -144), followed by SHaPrP (120 -144). The relatively large error in the lag phase time is a consequence of having a random initial configuration that may go down numerous nucleation pathways. From a kinetics point of view, the relative values of the lag time for the three peptides can be interpreted as evidence that HuPrP(120 -144) has a smoother free energy landscape for fibrillation than BVPrP(120 -144), followed by SHaPrP(120 -144).
We find that the two amino acid substitutions I138M and I139M are responsible for retarding BV-and SHaPrP(120 -144) dimer formation compared with HuPrP(120 -144) and thus retard the aggregation kinetics. Fig. 8 describes the average side chain interaction energy and average number of side chain contacts experienced by residues 138, 139, and 143 in the Hu-, BV-, and SHaPrP(120 -144) peptides during the nucleation stage. From Fig. 8, A and B, we see that the Met-138 on BV-and SHaPrP(120 -144) have the same number of side chain contacts as the Ile-138 on HuPrP(120 -144), and the Met-139 on SHaPrP(120 -144) has the same number of side chain contacts as the Ile-139 on HuPrP(120 -144), but the Met-138 on BV-and SHaPrP(120 -144) have stronger side chain interaction energies than the Ile-138 on HuPrP(120 -144), and also the Met-139 on SHaPrP(120 -144) has a stronger side chain energy than the Ile-139 on HuPrP(120 -144). Thus, the Mets on BV and SHaPrP(120 -144) have a stronger side chain association with the surrounding residues than the Iles on HuPrP(120 -144) during the simulation. The stronger the attractive interaction that the C-terminal residues (138 and 139) experience with the surrounding residues, the more difficult it is for two peptides to form a perfect in-register ␤-sheet dimer. As shown in supplemental Table S1, which lists the energy parameter matrix between 19 amino acid side chains in the PRIME20 force field, Ile has a square well attraction only with hydrophobic residues, e.g. Ala, Leu, Ile, and Phe, whereas Met has a strong square well attraction not only with those hydrophobic residues but also with polar and charged residues, e.g. His, Arg, Ser, and Asn. In other words, supplemental Table S1 indicates that Met has a broader set of attractive interaction partners than Ile. One reason is that the divalent sulfur atom on the Met side chain can form hydrogen bonds with residues that have hydrogen donors in their side chains (32). A calculation by Berka et al. (33) using the RI-DFT-D method with the COSMO model also showed that Met has a stronger side chain interaction energy with Tyr, Arg, Ser, and Asn than Ile does in a water-like environment. Our finding that I138M and I139M retard PrP(120 -144) aggregation is also consistent with predictions of the amyloid aggregation propensity of individual amino acids by Pawar et al. (34), showing that Met has a lower ␤-sheet-forming propensity than Ile.
In contrast to the I138M or I139M, the amino acid substitution S143N does not retard the aggregation kinetics. From Fig.  8A, Ser-143 (for HuPrP(120 -144)) and Asn-143 (for BV-and SHaPrP(120 -144)) have nearly the same average side chain interaction energy, which means that they have similar degrees of side chain association. In fact, supplemental Table S1 shows that Asn and Ser have the same side chain energetic parameters in the PRIME20 force field. They have also been shown to have similar ␤-sheet-forming propensity (34,35). From the above analysis of simulation data, we conclude that S143N does not affect the aggregation kinetics of PrP(120 -144). Chuang et al. (18), however, had a different finding on how Asn-Ser mutation affects PrP(108 -144) aggregation kinetics; they found that the single amino acid substitution N143S can retard bovine PrP(108 -144) aggregation.
I138M and I139M Retard Parallel ␤-Sheet Formation-During the aggregation process, not only do I138M and I139M retard nucleus formation but they also retard the attachment of disordered peptides to the seed, thus decreasing the parallel ␤-sheet content of the final aggregation configuration. Table 3 shows the average number of parallel ␤-sheet and anti-parallel ␤-sheet content in the oligomers and protofilaments obtained at the end of all simulations for Hu-, BV-, and SHaPrP(120 -144). From Table 3, HuPrP(120 -144) on average has more parallel ␤-sheet content than BV-and SHaPrP(120 -144), indicating HuPrP(120 -144) has stronger propensity to form parallel ␤-sheets than BV-and SHaPrP(120 -144). In addition, experiments by Apostol et al. (17) show that HuPrP(138 -143) and MoPrP(138 -143) (MoPrP has the same sequence as BVPrP for residues 138 -143) both form parallel steric zippers, whereas SHaPrP(138 -143) preferentially forms anti-parallel out-ofregister steric zippers. This effect is not obvious from our simulation.
Residues 138 -144 Are Critical for PrP(120 -144) Fibril Formation-The C-terminal residues 138 -144 are found to be necessary for PrP(120 -144) to form fibrils in our simulations. Deleting the C-terminal residues 138 IIHFGSD 144 results in a non-amyloidogenic peptide PrP(120 -137). The data are shown in Fig. 9, which describes the number of hydrogen bonds formed during HuPrP(120 -144) (the 3rd run) and PrP(120 -137) aggregation simulations as a function of time. From Fig. 9, PrP(120 -144) forms ␤-sheet-rich aggregates, whereas PrP(120 -137) fails to form a nucleus and thus maintains the disordered state throughout the simulations, indicating PrP(120 -144) is much more amyloidogenic than PrP(120 -137). One possible reason is that without the C-terminal residues 138 -144, the ␤-sheet dimer of PrP(120 -137) becomes thermodynamically unstable, thus making it unable to form ordered aggregates. Our results are consistent with the Kundu

Discussion
Using DMD simulation combined with the PRIME20 force field, we investigated the early stage aggregation mechanism of amyloidogenic prion protein fragments PrP(120 -144). In this work, by simulating the aggregation of prion protein fragments PrP(120 -144) from three species (human, bank vole, and Syrian hamster), we provide molecular level insight into how the side chain substitutions, I138M and I139M, affect the aggregation kinetics of PrP(120 -144) peptide. The parallel in-register protofilament formed by stacking of PrP(120 -144) peptides shed light on the structure of the amyloid core of the PrP(23-144) fibril that is associated with familial PrP-cerebral amyloid angiopathy (20,37).
Our simulation results on PrP(120 -144) are consistent with a number of experimental studies on PrP(23-144) and it core region (residues 112-144). (No experiments have been conducted on PrP(120 -144), so we cannot compare our results to experiments directly.) First, we find that two species-specific amino acid substitutions (I138M and I139M) retard nucleus formation and the aggregation kinetics of N-terminal prion protein fragments. This is consistent with the Vanik et al. (11) experimental findings that I138M and I139M greatly affect the aggregation kinetics of PrP(23-144) and lead to cross-seeding barriers between PrP(23-144) from different species. In fact, the steric zipper structures formed by short PrP(138 -143) fragments have been shown to correlate with the cross-species transmission barrier of prion diseases by Eisenberg and co-workers (17,38). Second, we show that PrP(120 -144) is unable to form fibrillar structures without the C-terminal residues 138 -144. This is also consistent with the Kundu et al. (36) experiments that HuPrP(23-144) forms fibrils but HuPrP23-137 cannot. Our finding that residues 137-144 play a crucial role in the fibrillation of prion peptides is also consistent with the fact that the prion fragment PrP(136 -140) in experiment binds to PrP Sc and may be responsible for conversion from PrP C to PrP Sc (39). Third, we show that HuPrP(120 -144) aggregates to form ␤-sheet-rich protofilaments with a parallel in-register peptide alignment. This is also consistent with both Helmus et al. (20) and Zweckstetter (23) structural studies on the prion amyloid core PrP(112-141) region based on NMR and other experimental data.
One of the main focuses of this work is learning the aggregation mechanisms that drive disordered peptides PrP(120 -144) to form fibrillar structures. We found that PrP(120 -144) aggregates in a nucleation-dependent manner but along multiple pathways. Several studies have found multiple aggregation pathways for amyloidogenic proteins. Natalello et al. (40) used a combination of Fourier transform infrared spectroscopy and atomic force microscopy to find that human Pr(82-146) undergoes multiple aggregation pathways to form fibrils with various types of secondary structure. Also, Cheon et al. (26) found in DMD/PRIME20 simulations that A␤  assembles from an oligomer state to a U-shaped protofilament via two distinct pathways. Unlike the nucleated polymerization that short amy-   loidogenic peptides like A␤ (16 -22) undergo, PrP(120 -144) seems to go through heterogeneous oligomeric states before forming the nucleus because of numerous local energy minima resulting from the folding effects of peptide backbones. The eight-peptide system that we have simulated is large enough to allow investigation of the nucleation stage of prion protein fragments because generally only a few monomers are involved in the early stage of protein aggregation (31). For example, Cho et al. (41) found in experiment that the first step along the aggregation of recombinant full-length prion protein is the coming together of 3-4 monomers to form a seed. Note that our definition of nucleation is different from the description in classical nucleation theory (42), where nucleus formation is characterized by overcoming a free energy barrier to form a new thermodynamic phase. Because our system size is much smaller than real systems, we are unable to quantitatively compare the aggregation kinetics in our simulation with those in experiments because the macroscopic fibrillation rate is an ensemble average value over all possible kinetic pathways. Instead, the kinetics-related quantities that we can calculate, e.g. the average lag phase time, are simple measurements that only allow for qualitative comparison with experiments.
The conditions chosen for our simulation, e.g. temperature and peptide concentration, have a significant effect on the protein aggregation kinetics. Our choice of high simulation temperature and high peptide concentration allows us to explore the aggregation mechanism of PrP(120 -144) within a reasonable time scale. The major advantage of simulating at a high temperature, just below the critical fibrillation temperature, is that the high entropic fluctuations in this state smooth the peptide aggregation free energy landscape and decrease the probability that the system will be trapped in metastable states. In addition, we use a relatively high peptide concentration (20 mM) compared with experimental conditions because the barrier to aggregation kinetics is greatly lowered at a supersaturated peptide concentration such as this, and it is believed to be the driving force for protein aggregation (43).
The 25-residue-long peptide simulated in this work is the longest prion protein fragment that we have simulated so far due to limitations in computational resources. The longer the peptide sequence is, the more rugged the aggregation free energy landscape is, thus requiring more computation resources and advanced sampling techniques to sample the whole ensemble of system. In future work, we hope to simulate the aggregation of a longer prion peptide, PrP(112-144), because it is thought to include the whole prion amyloid core (44).

DMD-
The main simulation method employed in this work is discontinuous molecular dynamics, a fast alternative to traditional molecular dynamics (45). In DMD simulation, the potential energy between two particles is modeled as a discontinuous potential, such as the hard sphere, square well, and square well shoulder potentials. In DMD, the velocities and positions of two beads (particles or sites on molecules) at each time step are recalculated only when a discontinuity in their interaction potential is encountered. The types of events con-sidered include core collision, bond extension, well capture, dissociation, and bounce. This is different from traditional molecular dynamics where the position and velocity of every particle in the system need to be recalculated at very small regularly spaced time steps by solving Newton's equation of motion.
PRIME20 Model-PRIME20 is a coarse-grained model developed by Hall and co-workers that was specifically designed for use in DMD simulations of protein aggregation. Each amino acid is represented by three backbone spheres (N-H, C␣-H, and CϭO) and one side chain sphere (R). PRIME, the predecessor of PRIME20, was used to simulate the spontaneous formation of fibrils containing polyalanine and polyglutamine (46,47,48). PRIME was extended in 2010 to become PRIME20, a model that describes 20 different amino acids uniquely and considers side chain-side chain polar, charge-charge, and hydrophobic interactions (49). The geometric and energetic parameters for the 20 different side chain beads were obtained by applying a perceptron learning algorithm and a modified stochastic learning algorithm to optimize the energy discrepancy between 711 native state proteins and 2 million decoy conformations produced by gapless threading techniques. In PRIME20, the known backboneangles, C␣-C␣ distances and L-isomerization constraints on each amino acid are modeled by imposing pseudo-bonds between the relevant beads. A non-biased distancebased criterion is used to judge when directional hydrogen bonds are formed and broken (50). Glycine does not have a side chain bead, and proline cannot form backbone hydrogen bonds with other residues. The total interaction energy consists of the side chain-side chain interaction energy, E side chain , and the backbone-backbone hydrogen bonding energy, E HB . All the other non-bonded interactions (including backbone-backbone and backbone-side chain interactions) are modeled as hard sphere interactions that represent excluded volume effects but do not contribute to the system energy. The interactions between two bonded (adjacent) beads include hard sphere repulsion when the distance between the two beads reaches the minimum bond length and an infinite square well attraction when the distance reaches the maximum bond length. As a result, the bond length fluctuates within 2% of its assigned value.
PRIME20 has been used in the Hall group to simulate the spontaneous formation of twisted A␤ (16 -22) fibrils (51), fragment fibrils (50), and U-shaped A␤(17-42) protofilaments (26). PRIME20's ability to discriminate the fibrillation propensity of short peptides was tested on the STVIIE-designed sequence by Wagoner et al. (52) and López de la Paz and Serrano (53). It has also been used to study the effect of macromolecular crowding on A␤ (16 -22) aggregation (54,55). In this work, we use the same PRIME20 force field parameters as were used by Cheon et al. (51) in their simulations of the aggregation of A␤ (16 -22) peptides.
Definition of Parallel and Anti-parallel ␤-Sheet Segments-Two three-residue peptide strands (labeled i, iϩ1, iϩ2, and j, jϩ1, jϩ2) form a parallel ␤-strand segment if both NH i ϩ 1 -CO j and NH j ϩ 2 -CO i ϩ 1 form hydrogen bonds, as shown in Fig. 10A. They form an anti-parallel ␤-sheet segment if 1) both NH i ϩ 1 -CO j ϩ 1 and NH j ϩ 1 -CO i ϩ 1 form hydrogen bonds as shown in Fig. 10B; and 2) NH j -CO i ϩ 2 , NH i ϩ 2 -CO j , NH i -CO j ϩ 2 , and NH j ϩ 2 -CO i all form hydrogen bonds, as shown in Fig. 10C. As an example, two four-residue peptide strands (labeled A and B) could form two consecutive parallel ␤-sheet segments, with residues 1-3 of peptide A forming the first seg-ment with the same residues on peptide B and residues 2-4 of peptide A forming the second segment with the same residues on peptide B. In our simulation, two 25-residue PrP(120 -144) peptides form a parallel ␤-sheet dimer if they have at least 18 parallel ␤-sheet segments.
Relating the PRIME 20 Reduced Temperature and Time to Real Temperature and Time-We relate our simulation-reduced temperature T* to real temperature T by matching the DMD/PRIME20 simulation predictions of the ␣-helical content versus temperature profiles of alanine-rich polypeptides to those measured in experiments by Munoz and Serrano (56). DMD/PRIME20 simulations of the folding of the polypeptides, Ac-AAQAAAAQAAAAQAAY-NH 2 , Ac-AAAAKAAAAKA-AAAKA-NH 2 , Ac-AEAAAKEAAAKEAAAKA-NH 2 , Ac-WDA-AAKDAAAKDAAAKA-NH 2 , and Ac-EAEKAAKEAEKAA-KEAEK-NH 2 at reduced temperatures ranging from 0.12 to 0.25, show that peptides adopt a mostly ␣-helical structure at low temperature T* ϭ 0.12 undergo a folding transition at about T* ϭ 0.175 and become random coils at high temperatures T* ϭ 0.25 as shown in supplemental Fig. S6A. At each temperature point, we simulate a single peptide in an 87.25 ϫ 87.25 ϫ 87.25 Å 3 box for at least 20 billion collisions. By comparing the simulation data for the percentage of ␣-helical hydrogen bonds with the experimental data (supplemental Fig.  S6 B) from Munoz and Serrano (56), as shown in supplemental Fig. S6 D, we obtain a linear correlation between T and T* for each of five peptides. Then, after averaging over the slope (A) and interception (B) value of five T versus T* linear correlations functions, we obtain T/K ϭ ͗A͘T* ϩ ͗B͘ ϭ 2288.46T*Ϫ115.79. Thus, the simulations presented in this paper are conducted at a real temperature of 330 K. Justification for conducting our simulations at a temperature that is higher than body temperature are given under "Discussion." In our DMD/PRIME20 simulation, the system energy is scaled based on the hydrogen bonding energy unit ⑀ HB . To relate system energy values with realistic units, we set ⑀ HB equal to12.47 kJ/mol as used by Ricchiuto et al. (57) in their DMD simulation of peptide aggregation, which is also close to the optimal backbone hydrogen bonding energy 11.72 kJ/mol used in the STRIDE algorithm (27). The ⑀ HB value is a simple estimation for the PRIME20 model because in nature the hydrogen bonding strength is considered to have temperature dependence (58). Note that the reduced temperature in simulation is defined to be T* ϭ k B TЈ/⑀ HB , and this would suggest that the real temperature from our simulations that corresponds to T* ϭ 0.195 is 273 K, which is of course unrealistically low. It is for this reason that we chose to use the method described above to relate the reduced temperature in our simulations to realistic temperature.
To convert reduced time to real time, we compare the selfdiffusion coefficients (D) of the A␤ (16 -22) peptide from DMD/PRIME20 simulations and atomistic simulation. Recall that the diffusion coefficient can be obtained in terms of the mean squared displacement using the Einstein diffusion equation (59), 6⌬tD ϭ ͉͗r(t ϩ⌬t) Ϫ r(t)͉ 2 ͘, where the angle brackets indicate that the center of mass mean squared displacement of the peptide is averaged over all peptides in the system during the simulation time period ⌬t. In DMD, we simulate six A␤ (16 -22) peptides in a 100 ϫ 100 ϫ 100 nm box for 1 billion collisions at the reduced temperature T* ϭ 0.19 and find that D DMD ϭ 0.072 nm 2 /⌬t reduced , where ⌬t reduced is the reduced time unit. We then perform a 1-ns NPT followed by a 50-ns NVT explicit solvent atomistic simulation of a single A␤ (16 -22) peptide in a 6 ϫ 6 ϫ 6-nm 3 water box at T ϭ 298 K using GROMACS 5.0.4 package (60) and Amber ffSB99 force field (61). The N terminus of the peptide is capped with an acetyl group, and the C terminus is capped with an N-methyl group. We find that D MD ϭ 1.0 nm 2 /ns. By equating D DMD to D MD (1 nm 2 /1 ns ϭ 0.072 nm 2 /⌬t reduced ), we find that the reduced time unit, ⌬t reduced , is equivalent to 0.072 ns.
Description of DMD Simulation Techniques-We simulate three 25-residue-long prion peptides that correspond to residues 120 -144 from three different species as follows: human, bank vole, and Syrian hamster. Note that bank vole PrP has the same sequence as mouse PrP between residues 120 and 144. The amino acid sequences of these three prion protein fragments are shown in Table 4. The primary sequence differences between three peptides are at positions 138, 139, and 143. HuPrP(120 -144) has Ile at positions 138 and 139 and Ser at position 143. Compared with HuPrP(120 -144), BVPrP(120 -144) has two mutations, I138M and S143N, and SHaPrP(120 -144) has three mutations, I138M, I139M, and S143N.
The simulations are performed in the canonical ensemble (fixed number of particles, constant volume, and temperature). The Andersen thermostat is implemented to maintain the system at a constant temperature (62). The initial velocity for every bead in the system is generated based on a Maxwell-Boltzmann distribution centered at the desired temperature. The system is initialized by randomly placing eight peptides in a cubic box with box length equal to 87.25 Å corresponding to a relatively high peptide concentration of 20 mM. The peptide concentration is set to a high value to reduce the time required for monomer peptides separated far apart to move close together. The reduced temperature is set to be T* ϭ 0.195, which is right below the critical fibrillation temperature T C of the PrP(120 -144) peptide (no fibril forms when the temperature is above T C ). Each independent simulation starts with a random configuration and a high reduced temperature (T* ϭ 0.50) to relax each peptide chain into a random coil conformation. The system is then gradually cooled to T* ϭ 0.195 and then proceeds for at least 1 trillion collisions (12 s), which takes 1000 CPU hours (40 days) on a fast workstation.
Author Contributions-Y. W. and C. K. H. designed the research and wrote the manuscript. Y.M. and Q.S. conducted the research and analyzed the data.