Uncovering the Early Assembly Mechanism for Amyloidogenic β2-Microglobulin Using Cross-linking and Native Mass Spectrometry*

β2-Microglobulin (β2m), a key component of the major histocompatibility class I complex, can aggregate into fibrils with severe clinical consequences. As such, investigating the structural aspects of the formation of oligomeric intermediates of β2m and their subsequent progression toward fibrillar aggregates is of great importance. However, β2m aggregates are challenging targets in structural biology, primarily due to their inherent transient and heterogeneous nature. Here we study the oligomeric distributions and structures of the early intermediates of amyloidogenic β2m and its truncated variant ΔN6-β2m. We established compact oligomers for both variants by integrating advanced mass spectrometric techniques with available electron microscopy maps and atomic level structures from NMR spectroscopy and x-ray crystallography. Our results revealed a stepwise assembly mechanism by monomer addition and domain swapping for the oligomeric species of ΔN6-β2m. The observed structural similarity and common oligomerization pathway between the two variants is likely to enable ΔN6-β2m to cross-seed β2m fibrillation and allow the formation of mixed fibrils. We further determined the key subunit interactions in ΔN6-β2m tetramer, revealing the importance of a domain-swapped hinge region for formation of higher order oligomers. Overall, we deliver new mechanistic insights into β2m aggregation, paving the way for future studies on the mechanisms and cause of amyloid fibrillation.

The major histocompatibility class I complex (MHC I) is found on the surface of all nucleated cells and is responsible for antigen presentation (1,2). β2-Microglobulin (β2m) is a key component of this complex. After its dissociation from MHC I, serum β2m is broken down in the kidney. Circulating β2m can build up as a consequence of renal dysfunction and long term hemodialysis. This leads to amyloid fibril deposition in osteoarticular tissues and joint destruction in a condition known as dialysis-related amyloidosis.
Amyloidogenic proteins such as β2m tend to self-assemble into higher order oligomeric species through a complex aggregation process, which can ultimately lead to the formation of fibrils (3-6). Despite the clinical significance of amyloid formation, the principles governing the aggregation of β2m and related proteins remain largely unknown, primarily due to the transient nature of intermediates on-pathway to fibril formation (7). Structural elucidation of β2m oligomeric intermediates is, therefore, a challenging task that is further complicated by the uncertainty in differentiating between specific and nonspecific protein aggregates.
Despite these obstacles, progress has been made, most prominently with techniques such as nuclear magnetic resonance spectroscopy (NMR) (8), x-ray crystallography (9), atomic force microscopy (10), cryo-electron microscopy (EM) (11), and hydrogen/deuterium exchange (12), and by combining NMR with mass spectrometry (MS) (13). MS in particular is well suited for studying heterogeneous assembly intermediates, including proteins populating multiple oligomeric states (14). When MS is coupled with ion mobility (IM), the combined technique (IM-MS) allows the separation of different conformational states of co-populated oligomers (5,15). IM-MS has been successfully employed to investigate the structure and dynamics of amyloid assembly intermediates, revealing information on the aggregation process of Aβ40 and Aβ42 complexes (15,16) and, more recently, human amylin (17). Structural insights have also been gained for β2m (18,19), where IM-MS experiments suggested either an elongated or a more compact assembly mechanism for full-length β2m under different solution conditions (20).
The integration of different experimental techniques with modeling can provide powerful means to interrogate candidate models of protein assemblies (21-25). In particular, so-called hybrid approaches, which combine information from complementary experiments, have shed light on complexes intractable by single techniques, exemplified by the structural elucidation of the nuclear pore complex (26), the 26S proteasome (27,28), and the eukaryotic translation initiation factor 3 (29,30).
Here we used an integrative MS-based strategy for generating three-dimensional structural models of the oligomeric assembly intermediates of ΔN6-β2m, a truncated β2m isoform (11.1-kDa monomer). ΔN6-β2m makes up to 30% of amyloid deposits extracted from dialysis-related amyloidosis patients and can act as a seed for full-length β2m fibrillogenesis in vitro (1). Contrary to full-length β2m, ΔN6-β2m is highly amyloidogenic at neutral pH, making it a convenient model for studying β2m aggregation under laboratory conditions (2). Furthermore, the x-ray crystal structure of the dimeric intermediate built by the self-association of two ΔN6-β2m monomers was recently solved and proposed as a building block for growing oligomers on-pathway to fibril formation (1). In this structure, domain swapping occurs through the so-called hinge region, which corresponds to two NHVTLSQ heptapeptides interacting in an antiparallel fashion (1).
Using a combination of experimental and computational techniques, we predict the structures and an early assembly mechanism for ΔN6-β2m oligomers. We further compare oligomers of the truncated variant with those of the full-length protein, highlighting similar oligomeric distributions and compact topologies as well as inter- and intraprotein distances. This points to a common assembly mechanism in the early stages of their aggregation and may facilitate the ability of the truncated variant to cross-seed and form mixed fibrils (31) with full-length β2m in vivo. The data and the structural models generated from the integrative strategy further suggest an elongation mechanism of monomer addition consistent with domain swapping and self-templated growth. Furthermore, our model for the ΔN6-β2m tetramer shows that the domain-swapped hinge region found in the ΔN6-β2m dimer is key to both intra- and interdimer interactions.

Experimental Procedures
Protein Preparation-ΔN6-β2m and β2m were expressed in Escherichia coli and purified using ion exchange chromatography and size exclusion chromatography as previously described (1). Lyophilized protein was dissolved in 100 mM ammonium acetate, pH 5, before MS analysis.
Ion Mobility-Mass Spectrometry-IM-MS experiments were performed on a quadrupole ion mobility time-of-flight mass spectrometer (Synapt HDMS, Waters Corp., Manchester, UK) modified such that the traveling-wave IM cell is replaced with an 18-cm drift cell with radial RF confinement (RF amplitude 200 V) and a linear voltage gradient along the axis of ion transmission, as described in detail previously (32). The following parameters were used: source pressure 4-6 mbar, capillary voltage 1.0-1.5 kV, sample cone voltage 20 V, bias voltage 20 V, IM entrance DC 5 V, trap gas 6 ml min⁻¹, trap collision energy 5 V. Helium (2 torr) was used as the buffer gas, and the drift voltage was varied from 50 to 200 V. All spectra were mass-calibrated using cesium iodide (100 mg ml⁻¹).
Chemical Cross-linking MS-50 µl of ΔN6-β2m or β2m were cross-linked with 10 µl of a 25 mM 1:1 mixture of deuterated (d4) and non-deuterated (d0) bis[sulfosuccinimidyl] suberate (BS3). The reaction mixture was incubated at 25°C and 400 rpm for 1 h. 10 µl of the cross-linked proteins were analyzed by gel electrophoresis (NuPAGE system, Invitrogen) according to the manufacturer's protocol. The proteins were digested in-gel as described elsewhere (33).
The mixture of cross-linked and non-cross-linked peptides was analyzed by liquid chromatography-coupled tandem-mass spectrometry (LC-MS/MS) employing an LTQ-Orbitrap XL hybrid mass spectrometer (Thermo Scientific) coupled with a Dionex UltiMate 3000 RSLC nano System (Thermo Scientific).
The peptides were directly eluted into the mass spectrometer. Mass spectrometric conditions were: spray voltage of 1.8 kV, capillary temperature 180°C, normalized collision energy 35% at an activation of q = 0.25, and an activation time of 30 ms. The LTQ-Orbitrap XL was operated in data-dependent mode. Survey full scan MS spectra were acquired in the orbitrap (m/z 300-2,000) with a resolution of 30,000 at m/z 400 and an automatic gain control target of 10^6. The five most intense ions were selected for collision-induced dissociation in the linear ion trap at an automatic gain control target of 30,000. Previously selected precursor ions were dynamically excluded for 30 s. Singly charged ions as well as ions with unrecognized charge state were also excluded. Internal calibration of the Orbitrap was performed using the lock mass option (lock mass: m/z 445.120025) (34). mzXML files were generated from raw data using the MassMatrix file conversion tool. Potential cross-links were identified by searching against a reduced database containing ΔN6-β2m and β2m protein sequences using the MassMatrix Database Search Engine (35). Search parameters were: enzyme, trypsin; missed cleavage sites, two; variable modifications, carbamidomethylation of cysteines and oxidation of methionine; mass accuracy filter, 10 ppm for precursor ions and 0.8 Da for fragment ions; minimum pp and pp2 values, 5.0; minimum pptag, 1.3; maximum number of cross-links per peptide, 1. All searches were performed twice, including the deuterated and the non-deuterated BS3 cross-linker, respectively.
Modeling Restraints from Cross-linking-To test whether the cross-links identified in the ΔN6-β2m tetramer (or trimer) arose from interdimer or intradimer interactions, we projected the cross-links confirmed by MS/MS quality onto the dimer x-ray crystal structure (PDB ID 2X89). We measured the physical Cα-Cα distances (36) to check whether these were within the upper-bound interresidue distance threshold (35 Å) (37). Cross-links that did not satisfy this distance threshold are more likely to arise from interdimer interactions in the tetramer (or trimer) and were, therefore, assigned as such in our modeling analysis. Those within the distance threshold were most likely to arise from intradimer interactions.
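The distance check described above can be illustrated with a minimal sketch. It assumes that a cross-link counts as compatible with the dimer when the shortest Cα-Cα distance over all chain pairings is within the 35 Å threshold; the function names, the chain labels, and the toy coordinates are hypothetical and are not taken from PDB 2X89.

```python
import math
from itertools import product

THRESHOLD_A = 35.0  # upper-bound Calpha-Calpha distance in angstroms, as quoted in the text


def min_ca_distance(res_i, res_j, ca_coords, chains=("A", "B")):
    """Shortest Calpha-Calpha distance for a residue pair over all chain pairings."""
    return min(
        math.dist(ca_coords[(ci, res_i)], ca_coords[(cj, res_j)])
        for ci, cj in product(chains, repeat=2)
        if (ci, res_i) != (cj, res_j)  # skip an atom paired with itself
    )


def classify_crosslinks(crosslinks, ca_coords, threshold=THRESHOLD_A):
    """Label a cross-link 'intradimer' if the dimer structure can satisfy it."""
    labels = {}
    for res_i, res_j in crosslinks:
        d = min_ca_distance(res_i, res_j, ca_coords)
        labels[(res_i, res_j)] = "intradimer" if d <= threshold else "interdimer"
    return labels


# Toy coordinates in angstroms (hypothetical, NOT taken from PDB 2X89).
toy_coords = {
    ("A", 110): (0.0, 0.0, 0.0),
    ("B", 110): (40.0, 0.0, 0.0),
    ("A", 67): (10.0, 0.0, 0.0),
    ("B", 67): (50.0, 0.0, 0.0),
}
labels = classify_crosslinks([(110, 67), (110, 110)], toy_coords)
```

With these toy coordinates the K110:K67 pair can be satisfied within one chain, whereas K110:K110 can only be formed between chains that sit 40 Å apart, so it would be assigned to an interdimer contact.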
MS-based Hybrid Approach-We employed a hybrid approach for structural determination of oligomeric intermediates of ΔN6-β2m, primarily based on native MS, IM-MS, and cross-linking MS (CX-MS) (27,29) (Fig. 1). From native MS, we established the oligomeric state of the identified complexes (38). By combining MS with IM, topological information in the form of an orientation-averaged collision cross-section (CCS) was derived (39). The measured CCS from IM-MS was used as a shape restraint for interrogating candidate structural models (22). CX-MS identified lysines in close proximity and was used as a distance restraint (27). In addition to MS data, we made use of available structures from x-ray crystallography and NMR as well as EM density maps. The atomic level structures were used as starting points in our modeling strategy, whereas a segment of the EM map of β2m assembled into fibrils (EM Database ID 1613; type A) was used as a volume restraint, as there is no available EM density map for ΔN6-β2m fibrils (11). Structural information obtained from these methods was encoded into restraints and exploited by a scoring function for subsequent modeling analysis.
Integrative Modeling and Scoring Function-We generated structural models of the assemblies by employing a Monte Carlo search algorithm. The building process was guided by a scoring function that encoded the experimental data as a sum of individual restraints. This scoring function (S) evaluates the ensemble of candidate models generated against their quality-of-fit with the input data,
$$S = W_{\text{IM-MS}}\,S_{\text{IM-MS}} + W_{\text{CX-MS}}\,S_{\text{CX-MS}} + W_{\text{EM}}\,S_{\text{EM}}$$
where S_IM-MS and S_CX-MS refer to IM-MS and CX-MS restraints, respectively. S_IM-MS was implemented as a harmonic potential function (22,23), whereas S_CX-MS was applied as a distance restraint between two interacting residues (27,29). The EM restraint (S_EM) assessed the quality-of-fit between the model and the corresponding molecular volume of an appropriate section of the EM map, as defined by the cross-correlation coefficient. IM-MS, CX-MS, and EM were given 1:2:2 weightings (W), respectively, consistent with previous benchmark studies (27).
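As an illustration of a weighted sum of restraints with 1:2:2 weightings, the following is a minimal sketch and not the scoring code used in the study. The harmonic form of the IM-MS term follows the text; the flat-bottomed form of the cross-link term, the use of one minus the cross-correlation coefficient for the EM term, and the 3% width of the CCS harmonic are all assumptions made for the example, as are the function names.

```python
def harmonic(value, target, sigma):
    """Harmonic penalty: zero at the target, growing quadratically away from it."""
    return ((value - target) / sigma) ** 2


def crosslink_term(distances, upper_bound=35.0):
    """Penalize only cross-link distances that exceed the upper bound (angstroms)."""
    return sum(max(0.0, d - upper_bound) ** 2 for d in distances)


def em_term(cross_correlation):
    """Turn a cross-correlation coefficient (1 = perfect fit) into a penalty."""
    return 1.0 - cross_correlation


def total_score(ccs_model, ccs_exp, xl_distances, cross_correlation,
                weights=(1.0, 2.0, 2.0), ccs_sigma_fraction=0.03):
    """Weighted sum of IM-MS, CX-MS, and EM terms; lower is better."""
    w_im, w_xl, w_em = weights
    s_im = harmonic(ccs_model, ccs_exp, ccs_sigma_fraction * ccs_exp)  # assumed width
    s_xl = crosslink_term(xl_distances)
    s_em = em_term(cross_correlation)
    return w_im * s_im + w_xl * s_xl + w_em * s_em


# Tetramer values quoted in the text: CCS_exp 3057 A^2, three interdimer
# cross-link distances, and a cross-correlation coefficient of 0.77.
# The model CCS is set equal to CCS_exp purely for illustration.
score = total_score(3057.0, 3057.0, [22.3, 20.9, 32.4], 0.77)
violated = total_score(3057.0, 3057.0, [40.0], 1.0)
```

In this sketch a model that matches the experimental CCS and satisfies every cross-link is penalized only through the EM term, whereas a single cross-link stretched to 40 Å dominates the score.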
To assess the uniqueness of the ensemble of generated models, we performed ensemble analysis (e.g. clustering of top-scoring solutions), and the final solution was selected from the major cluster. The visual molecular dynamics (VMD) and the UCSF Chimera packages were used for visualization of the structures (40).
Collision Cross-section Calculations-To interpret the experimentally obtained CCS values, we compared them with theoretically calculated CCSs (41). Theoretical CCSs were obtained with the open source MOBCAL code using the projection approximation (PA) algorithm (42,43). The PA method is known to underestimate the experimental CCS of proteins because it neglects multiple collisions between ions and buffer gas (7,43). However, the PA CCS has been shown to correlate with the experimental CCS for protein complexes (R² > 0.99) (44). We use the scaled PA CCS as previously described, where the experimental CCS can typically be predicted (±3%) by multiplying the PA CCS by a factor of 1.14 (44). All CCS calculations include hydrogen atoms.
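The scaling step can be written out as a short sketch. The factor of 1.14 and the ±3% tolerance come from the text; the function names are hypothetical, the raw PA value of 1000 Å² is invented for illustration, and expressing the deviation relative to the experimental CCS is an assumption. The dimer values (CCS_exp 1900 Å², CCS_calc 1873 Å²) are those reported in Results.

```python
PA_SCALE = 1.14  # empirical scaling factor for projection-approximation CCS (from the text)


def scaled_pa_ccs(pa_ccs):
    """Scale a raw PA collision cross-section (A^2) to estimate the experimental value."""
    return PA_SCALE * pa_ccs


def percent_deviation(ccs_calc, ccs_exp):
    """Absolute deviation of a calculated CCS from experiment, in percent of experiment."""
    return 100.0 * abs(ccs_calc - ccs_exp) / ccs_exp


def within_tolerance(ccs_calc, ccs_exp, tol_percent=3.0):
    """True if the calculated CCS lies within the stated tolerance of experiment."""
    return percent_deviation(ccs_calc, ccs_exp) <= tol_percent


example_scaled = scaled_pa_ccs(1000.0)               # hypothetical raw PA value
dimer_deviation = percent_deviation(1873.0, 1900.0)  # dimer after gas-phase MD
dimer_ok = within_tolerance(1873.0, 1900.0)
```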
Molecular Dynamics Simulations-All simulations were performed in single (solution phase) or double floating-point precision (gas phase) with GROMACS 4.5.3 using the OPLS-AA/L forcefield (45).
Gas-phase simulations (10 ns) were performed at 300 K as described previously (46). The MS observed charge state was distributed evenly over solvent-accessible basic residues (<5 Å from the surface) (47,48). As such, we assigned the charged residues for the 6+ monomer, 9+ dimer, 11+ trimer, and 13+ tetramer charge states of ΔN6-β2m. CCSs were calculated for structures every 25 ps using the scaled PA method implemented in MOBCAL.

Figure 1. To predict the structures of higher oligomeric states of proteins, we implement an integrative modeling strategy. Here we use information derived from native MS, IM-MS, and CX-MS. The acquired data and other available information are converted into spatial restraints, which are subsequently utilized by a scoring function to guide the search for candidate model structures. Finally, an analysis step (e.g. clustering) of the top-scoring models determines the most likely structures of the oligomeric assembly pathways.
Solution phase simulations (10 ns) were carried out similarly, except that periodicity was applied and cutoffs of 0.9 and 1.4 nm were used for electrostatic and van der Waals forces, respectively; an integration step of 2 fs was used. Acidic and basic residues were charged as appropriate for solution. The total charge of the system was neutralized by the addition of an appropriate number of sodium ions.
Software and Scripts-Our integrative protocol was implemented within the open source Integrative Modeling Platform (IMP) software package.

Results
ΔN6-β2m Assembles into Compact Oligomers-We carried out IM-MS on ΔN6-β2m at various monomer concentrations (10-30 µM, pH 5), revealing multiple oligomeric species in equilibrium (Fig. 2A). Our experiments demonstrated that the formation of specific oligomers in solution is highly concentration-dependent (data not shown). At 10 µM, the predominant species observed were monomers and dimers, with a low amount of trimers formed. At 15 µM, we could clearly observe four charge state distributions corresponding to monomers, dimers, trimers, and tetramers of ΔN6-β2m (Fig. 2A). Higher concentrations (30 µM) revealed higher order oligomers (>pentamer). As a control we carried out similar experiments using cytochrome c, a monomeric protein of similar molecular mass (12 kDa). At 15 µM and below, only monomeric species were detected, whereas at higher concentrations (e.g. 30 µM), we could observe low intensity peaks for dimeric and trimeric species (data not shown). Therefore, we carried out experiments at a protein concentration of 15 µM to minimize any contribution from nonspecific aggregation (49). To correctly assign the oligomeric species of ΔN6-β2m, separation in mobility space was critical, as peaks at certain m/z values were co-populated with multiple species. For instance, the peak at m/z 2800 is composed of both dimer (8+ charge state) and trimer (12+ charge state) (Fig. 2A). Separation of different species with the same m/z but different CCS is a major strength of IM-MS.
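The overlap of the 8+ dimer and the 12+ trimer follows directly from the m/z arithmetic, as the sketch below shows. It assumes a monomer mass of 11,100 Da (the rounded 11.1-kDa value given in the text) and charging by proton attachment; the function name is hypothetical. With this rounded mass the peak lands near m/z 2776 rather than exactly at the value of 2800 quoted above.

```python
PROTON_MASS = 1.007276  # Da, mass of a proton
MONOMER_MASS = 11100.0  # Da, rounded 11.1-kDa monomer mass from the text (assumption)


def mz(n_subunits, charge, monomer_mass=MONOMER_MASS):
    """m/z of an n-mer carrying `charge` protons."""
    return (n_subunits * monomer_mass + charge * PROTON_MASS) / charge


dimer_8 = mz(2, 8)     # dimer, 8+ charge state
trimer_12 = mz(3, 12)  # trimer, 12+ charge state

# Both species have the same mass-to-charge ratio because 2/8 == 3/12,
# so only their different collision cross-sections can tell them apart.
overlap = abs(dimer_8 - trimer_12) < 1e-6
```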
We applied a similar approach to interrogate full-length β2m, which is expected to oligomerize to a lesser extent than its truncated variant at pH 5 (19). We observed monomer and dimer at a protein concentration of 15 µM (Fig. 2B), with trimers and tetramers only observed at higher concentrations (30 µM, data not shown). We measured CCSs for the oligomers of β2m at 15 µM (monomer and dimer) and at 30 µM (trimer and tetramer) (Table 1), revealing values <5% greater than those measured for ΔN6-β2m, consistent with the higher molecular weight. This suggests that full-length β2m and its truncated variant adopt similar conformations for their early oligomeric intermediates.
To establish the overall topology (i.e. compact or elongated) of ΔN6-β2m oligomers, we used the IM-MS data to inform a coarse-grained modeling strategy reported previously (22). Monomers were represented as spheres, with radii defined by the monomer CCS. Models were then built for the dimer and trimer by varying the intersubunit distances and angles. These models were then scored, and the one with the lowest total score at each stage was taken to the next step to form the (n+1) oligomer (Fig. 2D). To sample conformational space for the tetramer, a Monte Carlo approach was used, keeping the relative position of the three other subunits fixed. The 1% lowest scoring models were clustered, revealing three distinct clusters. The largest cluster (84.4%) is represented by a compact topological arrangement of subunits. This low resolution model of tetrameric ΔN6-β2m from IM-MS is consistent with a recently published EM map of β2m fibrils (Fig. 2E).
To assess the structural differences between full-length β2m and its truncated form, we performed cross-linking experiments on β2m following the same strategy as above. Bands corresponding to monomer, dimer, trimer, and tetramer of β2m were cut, proteins were digested, and peptides were analyzed by LC-MS/MS. We obtained 529 potential hits after database searching and validated 322 of these manually (false discovery rate 39.13%). We obtained two unique cross-links from the monomer band and up to nine unique cross-links for the dimer, trimer, or tetramer bands (Table 2). These cross-links were in good agreement with those obtained for ΔN6-β2m (Table 2). Due to the longer amino acid sequence in β2m, we identified additional interactions in the N-terminal regions of the protein.
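The quoted false discovery rate is consistent with treating every hit that failed manual validation as a false discovery. The sketch below reproduces that arithmetic; the definition and the function name are assumptions inferred from the reported numbers rather than a stated method.

```python
def false_discovery_rate(total_hits, validated_hits):
    """Percentage of database hits rejected on manual validation."""
    rejected = total_hits - validated_hits
    return 100.0 * rejected / total_hits


# 529 potential hits, of which 322 were validated manually (values from the text).
fdr = false_discovery_rate(529, 322)
```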
The high similarity between the observed cross-links in the oligomers of the two variants suggests highly conserved solution structures and initial assembly mechanism.
Building Atomic Models of ΔN6-β2m Oligomers-Having established ΔN6-β2m interresidue proximities from CX-MS and the overall compact assembly using IM-MS, we turned our attention to identifying suitable atomic resolution structures from the Protein Data Bank (PDB) for building further ΔN6-β2m oligomers. The NMR structure (PDB ID 2XKU) of the ΔN6-β2m monomer (50) was used to compare the calculated monomer CCS with that from IM-MS. These were found to differ by 6.3% (CCS_exp 1200 ± 36 Å²; CCS_calc 1276 Å²). Rearrangements in the gas phase are thought to be responsible for overall compaction of the structure in the absence of solution, leading to lower experimental CCS values than anticipated from the crystal structure (51). To account for this possibility, gas-phase molecular dynamics (MD) simulations were performed before CCS calculations (46). The calculated CCS of the simulated structure was in good agreement with the measured CCS (CCS_exp 1200 ± 36 Å²; CCS_calc 1144 Å²; 4.5% deviation). Next, we projected the experimentally identified cross-links (Table 2) onto the atomic structure and measured the Cα-Cα distances (Fig. 4A). The calculated distances revealed that the experimentally determined cross-links were in good agreement with the NMR structure using an upper bound interresidue distance threshold of 35 Å (27).

Figure 4. Workflow for structural characterization of ΔN6-β2m tetramer. A, structural data from IM-MS and CX-MS were used to select a suitable starting structure for building the tetramer. The cross-linking data (orange) were consistent with ΔN6-β2m monomer NMR (PDB ID 2XKU) and dimer x-ray (PDB ID 2X89) crystal structures. Calculated CCS for the energy-minimized monomer was in good agreement with experimental CCS from IM-MS; however, calculated dimer CCS was larger than observed experimentally. B, gas-phase MD simulations were performed on the dimer, and the subsequently calculated CCS was in good agreement with experimental CCS, suggesting subtle compaction in the gas phase. Information on the trimer from IM-MS and cross-linking MS was combined using a scoring function in an integrative approach to suggest model structures for trimeric ΔN6-β2m, starting from the validated monomer and dimer structures from NMR and x-ray crystallography, respectively. Gas-phase MD simulations were performed on the best-scoring trimeric model structure and CCS calculated, showing good agreement with experimental CCS. C, similarly, using restraints from IM-MS, cross-linking MS, and EM, model structures were suggested for tetrameric ΔN6-β2m, starting from the validated dimeric structure (1). Docking of the best-scoring tetramer model into the β2m fibril EM density map (EM Database ID 1613) showed excellent agreement, with a cross-correlation coefficient of 0.77.
The x-ray crystal structure (PDB ID 2X89) is composed of two ΔN6-β2m monomers associated through domain swapping. We compared the calculated CCS from this structure with the measured dimer CCS from IM-MS and found them to differ by 8% (CCS_exp 1900 ± 57 Å²; CCS_calc 2064 Å²). As for the monomer, we carried out gas-phase MD simulations and found that the calculated CCS of the simulated dimer was in good agreement with the measured CCS from IM-MS (CCS_exp 1900 ± 57 Å²; CCS_calc 1873 Å²; 1.4% deviation) (Fig. 4B). Structural rearrangement in the gas phase primarily occurred through compaction of the gross structure, as measured by the center of mass distance between the two interacting monomers, which decreased from 2.93 to 1.97 nm during the MD simulation. Structural agreement with the experimentally identified cross-links (Table 2) was confirmed by projecting them onto the x-ray crystal structure before and after the MD simulations.
Overall, we conclude that both the NMR structure for the ΔN6-β2m monomer and the x-ray structure for the dimer are consistent with our experimental data and are themselves structurally similar (Fig. 5, A and B). We, therefore, used these structures as a starting point to build higher order oligomers. Of particular interest is the dimeric structure, as domain swapping has been proposed as a plausible assembly mechanism for amyloidogenesis (1,9).
Integrative Modeling Predicts ΔN6-β2m Oligomers at Atomic Resolution-Having validated starting structures for building higher oligomers, we used an integrative approach to model trimeric and tetrameric ΔN6-β2m. To achieve this we computationally integrated our MS data (trimer and tetramer) with available information from EM (tetramer only) using a suitable scoring function ("Experimental Procedures").
We began by building a model for trimeric ΔN6-β2m at atomic resolution. We used as starting structures the domain-swapped dimer and the monomer from NMR. We generated 10,000 atomic models using a Monte Carlo sampling of conformational space, and the models generated were evaluated using the scoring function described above. In particular, we scored all models using the experimentally measured CCS (CCS_exp 2530 Å²) and four interdimer cross-links, K110:K113 (19.5 Å), K110:K110 (13.4 Å), K110:K67 (29.3 Å), and K110:K27 (18.0 Å), identified in the trimer band (Table 2). Clustering analysis of the top-scoring 1% of models revealed two main clusters (threshold 5 Å). We chose a representative structure of the largest cluster (60%). We finally performed gas phase MD simulations (46) showing a compaction (10%; CCS_calc 2640 Å²) of the trimeric structure in vacuum (Fig. 4B), in good agreement with the measured CCS (4% deviation).
Next, we employed a similar strategy to predict the most likely architecture of the tetrameric ΔN6-β2m, starting with two dimers (Fig. 4C). We scored all models using the experimentally measured CCS (CCS_exp 3057 Å²), the three identified interdimer cross-links for the tetramer, K110:K113 (22.3 Å), K110:K110 (20.9 Å), and K110:K67 (32.4 Å) (Table 2), and the section of density map corresponding to the globular tetramer of β2m (EM Database ID 1613) (11). To reflect the molecular envelope of tetrameric ΔN6-β2m, we chose a globular rather than elongated section from the EM map (Fig. 4C), consistent with our IM-MS experiments and coarse-grained modeling (Fig. 2E). As for the trimer, the top 1% scoring models for the tetramer were clustered, and the top scoring model in the largest cluster was chosen as the representative model structure (Fig. 4C). Interestingly, in this model the hinge regions, which consist of NHVTLSQ heptapeptides that readily form amyloids in isolation, are stacked together (Fig. 5C). These interact through dimer-dimer interfaces by hydrogen bonding between Ser-88 (D chain), Thr-86 and Ser-88 (C chain) from one dimer with Thr-86, Ser-88, and Gln-89 from a second dimer, respectively (Fig. 5C). Of further interest are the D strands, dynamic regions that are thought to play a role in amyloidogenicity (2,52). Here, these form the dimer interface on the opposite side to the domain-swapped hinge region (Fig. 5C).
Finally, to assess the stability of the predicted tetramer, we performed solution and gas phase MD simulations (46). These simulations revealed that the tetramer underwent only subtle changes over the simulation timeframe, suggesting a stable conformation (Fig. 6, A and B).

Discussion
ΔN6-β2m oligomers have been proposed as intermediate assemblies leading to fibrillogenesis either through nucleation and elongation of their own fibrils or through cross-seeding with full-length β2m (2). β2m fibrillogenesis may be seeded by preformed ΔN6-β2m filaments or fibrils (53). Alternatively, ΔN6-β2m monomers may interact with full-length β2m, enabling a transition into an aggregation-prone conformation (50). Evidence in support of this, from NMR (50) and IM-MS (18), shows that β2m monomers undergo a conformational change before oligomerization. This intermediate state is thought to have enhanced amyloidogenic potential (2). A similar intermediate state may be more readily accessible for ΔN6-β2m, giving rise to its increased amyloidogenicity compared with full-length β2m (2, 54, 55). Another possibility is that a proteolytic step may precede aggregation, in which the N-terminal hexapeptide is removed from β2m, and that ΔN6-β2m is, therefore, itself an on-pathway intermediate of β2m fibrillation (54). Although the specific interaction between β2m and ΔN6-β2m remains a topic of intense debate, it is becoming increasingly clear that studying the early oligomeric pathway of ΔN6-β2m at atomic level may be essential to understand the assembly mechanism for full-length β2m.
Here we predict structural models and an early assembly mechanism for oligomers of ΔN6-β2m up to tetramer, highlighting their compact topologies and key intersubunit interactions and comparing them with oligomers of the full-length protein.
Because fibrillogenesis has been observed for ΔN6-β2m under similar solution conditions to those used in our MS experiments (1), we believe that the observed oligomers serve as preamyloid intermediates en route to higher order oligomeric species and fibril formation.
From our comparative IM-MS and CX-MS results, we established that ΔN6-β2m and full-length β2m had similar oligomer profiles, CCS, and identified cross-links. This suggests a high structural similarity for the lower order oligomers of these two variants and points to a conserved initial assembly mechanism before fibrillogenesis. The specific experimental conditions, however, are important for determining the ex vivo aggregation pathway followed and the resulting fibril morphology (18,19,56). Previous studies of full-length β2m have shown that at low pH and low ionic strength, long straight fibrils are formed via a nucleated mechanism. On the other hand, at higher pH and ionic strength, worm-like fibrils are formed, with elongation proceeding via a non-nucleated, monomer addition pathway, and a conformational change in monomeric β2m thought to be responsible for initiating oligomerization. Interestingly, assembly of β2m in the presence of Cu(II) ions proceeds through dimer addition (52), with domain swapping also implicated (57).
Although the precise assembly mechanism(s) followed in vivo are not fully understood, the evidence is growing for a domain-swapped oligomeric intermediate preceding fibril formation (1,9). By exploiting the power of MS, we put forward a model at the atomic level of resolution, describing the early formation of ΔN6-β2m oligomers and indicating a stepwise assembly mechanism through the addition of monomeric subunits (Fig. 7A). We further mapped the oligomeric growth into the fibrillar EM map to show the conversion of stacked domain- to runaway-swapped oligomers in which the ends of the growing oligomers are capable of binding open monomers by self-templated growth (Fig. 7B). Overall, this study provides novel insights into the early mechanism of pre-amyloid assemblies and further makes inroads toward an atomic level description of β2m that entails detailed structural models of intermediate states and their associated transient interactions.