The Structure of Human Apolipoprotein A-IV as Revealed by Stable Isotope-assisted Cross-linking, Molecular Dynamics, and Small Angle X-ray Scattering*

Background: Apolipoprotein (apo)A-IV is involved in lipid and glucose metabolism, but its full-length structure is not known.
Results: Stable isotope-assisted cross-linking combined with molecular modeling produced new models of the full-length protein.
Conclusion: At least three hydrophobic residues participate in a unique clasp mechanism that regulates apoA-IV function.
Significance: We report the most detailed models of lipid-free apoA-IV to date and demonstrate their utility in terms of functional predictions.

Apolipoprotein (apo)A-IV plays important roles in dietary lipid and glucose metabolism, and knowledge of its structure is required to fully understand the molecular basis of these functions. However, typical of the entire class of exchangeable apolipoproteins, its dynamic nature and affinity for lipid have posed challenges to traditional high resolution structural approaches. We previously reported an x-ray crystal structure of a dimeric truncation mutant of apoA-IV, which showed a unique helix-swapping molecular interface. Unfortunately, the structures of the N and C termini that are important for lipid binding were not visualized. To build a more complete model, we used chemical cross-linking to derive distance constraints across the full-length protein. The approach was enhanced with stable isotope labeling to overcome ambiguities in determining the molecular span of the cross-links given the remarkable similarities in the monomeric and dimeric apoA-IV structures. Using 51 distance constraints, we created a starting model for full-length monomeric apoA-IV and then subjected it to two modeling approaches: (i) molecular dynamics simulations and (ii) fitting to small angle x-ray scattering data. This resulted in the most detailed models yet for lipid-free monomeric or dimeric apoA-IV.
Importantly, these models were of sufficient detail to direct the experimental identification of new functional residues that participate in a “clasp” mechanism to modulate apoA-IV lipid affinity. The isotope-assisted cross-linking approach should prove useful for further study of this family of apolipoproteins in both the lipid-free and -bound states.

Human apolipoprotein (apo)A-IV is a lipid-binding protein synthesized by the small intestine and secreted into the gastric lymphatics primarily associated with chylomicrons. Upon entering the plasma compartment, it rapidly dissociates from chylomicrons and distributes between high density lipoprotein particles (roughly 30-50%) and a lipid-free state (1). Unlike the other exchangeable apolipoproteins, apoA-IV message and protein increase markedly during lipid absorption in the gut (2,3). Although its specific function has not been fully settled, it has been suggested that apoA-IV co-evolved with apoB mRNA editing in mammals to play a role in chylomicron assembly (4), perhaps acting as a "barostat" to regulate interfacial pressures on nascent chylomicrons (5). Other evidence suggests that apoA-IV modulates chylomicron size by affecting its rate of passage through the cellular secretory pathway (6) or by facilitating triglyceride packing into the nascent particle (7). Additional proposed functions include cholesterol transport in high density lipoprotein (8-11), inhibition of lipoprotein oxidation (12), regulation of food intake (13), and mediation of gastric emptying (14,15). More recently, apoA-IV has been implicated in the control of glucose homeostasis (16). Its presence enhanced glucose-stimulated release of insulin from pancreatic islet cells, implicating it as a target for the treatment of diabetes.
To better define the role of apoA-IV in these varied processes, there has been significant interest in determining its structure. Furthermore, given its evolutionary relationship to other disease-relevant players like apoA-I and apoE (17), structural understanding of apoA-IV should translate to the entire class of exchangeable apolipoproteins. All three proteins, apoA-IV being the largest at 376 amino acids (46 kDa), are dominated by 22-amino acid repeats, which are predicted to form amphipathic α helices (18) and likely impart the ability to emulsify lipids into stable lipoproteins. ApoA-IV contains 12 such repeats, most of which are punctuated by proline residues (18), tightly clustered between residues 40 and 332. Like apoA-I, the N terminus of apoA-IV (residues 1-39) is encoded by a different exon than the remainder of the protein (19). Unlike apoA-I, the C terminus of apoA-IV contains a unique glutamine-rich sequence (residues 354-367) with unknown function (20). In the absence of lipid, human apoA-IV exists as a mixture of monomers (25%) and homodimers (75%) (21).
We have reported an x-ray crystal structure of the central domain of human apoA-IV (apoA-IV 64-335) in the dimeric form that shows a unique helix swapping interface (22). This interface was remarkably similar to that found in the crystal structure of a dimeric C-terminal truncated form of apoA-I (23). The nature of the interaction allowed us to propose that apoA-IV can monomerize when this shared helical domain doubles back onto the originating molecule, forming a four-helix bundle. Unfortunately, our structure did not shed light on the N- and C-terminal residues. This is important because we have previously identified residues in these domains that can strongly affect the ability of the protein to interact with lipid. We proposed that these sites interact to form a "clasp" that regulates the unfolding of the protein in response to lipid (24,25). More recent work using small angle x-ray scattering (SAXS) indicated that these terminal domains cluster at one end of the helical bundle, but their detailed three-dimensional arrangement remains unresolved (26).
Chemical cross-linking represents a powerful method for determining three-dimensional distance constraints for protein molecules as they exist in solution. We have used this technique extensively to study the structure of apoA-I (27,28), apoA-II (29), and even apoA-IV (25). However, the technique is limited when studying the structure of homodimeric proteins because it cannot distinguish between intramolecular and intermolecular cross-links. This is particularly important because the apoA-IV 64-335 structure predicts that the same cross-links can be generated both intra- and intermolecularly. In this work, we have overcome this technical challenge by combining isotopic labeling with chemical cross-linking to extend our model of monomeric apoA-IV. Using the cross-linking distance constraints, we created a starting model and then subjected it to two modeling approaches: (i) molecular dynamics simulations and (ii) fitting to SAXS data. These data-driven models are the most detailed yet achieved for lipid-free human apoA-IV.

EXPERIMENTAL PROCEDURES
ApoA-IV Expression and Endogenous Stable Isotope Labeling-Recombinant human apoA-IV was expressed in BL21 Escherichia coli and purified as described previously (30). The construct contained the mature sequence with an additional glycine at the N terminus after the histidine tag was removed by the tobacco etch virus protease. Purity was routinely >95% as determined by SDS-PAGE and mass spectrometry. For the 15N-labeled version, a modified expression system was used. Briefly, 1 ml of bacterial cells containing the expression vector was pelleted and resuspended in rich minimal medium containing vitamins, salts, glucose, and 15N-labeled ammonium chloride with no other nitrogen source. This was seeded into a 1-liter culture of the same medium and grown overnight. From here, the expression and purification of the labeled protein were similar to those of the unlabeled protein.
Separation of Stable Monomeric and Dimeric Forms of ApoA-IV-Recombinant 14N and 15N apoA-IV in PBS at pH 7.8 were individually brought to 3 M guanidine HCl with dry reagent to assure complete dissociation of any dimeric species. The two solutions were then mixed at a 1:1 molar ratio to a final concentration of >1 mg/ml and incubated at 37°C for 1 h to assure complete mixing of the isotopically labeled forms. After removal of the guanidine HCl by dialysis, the proteins were applied to a Superdex 200 16/60 (GE Healthcare) gel filtration column equilibrated in PBS and run at 4°C. Fractions corresponding to monomeric and dimeric apoA-IV were pooled, concentrated by ultrafiltration, and stored at 4°C until used. We previously showed that isolated monomeric and dimeric species were stable for longer than a week at 4°C. Protein concentrations were determined by the Markwell modified Lowry assay (31).
Chemical Cross-linking-The monomeric and dimeric samples were cross-linked with bis(sulfosuccinimidyl) suberate as previously reported (25). Freshly solubilized bis(sulfosuccinimidyl) suberate (spacer arm of 11.4 Å) was added to the protein solutions at molar ratios of cross-linker to protein that varied from 10:1 to 50:1. The samples were incubated at 4°C for periods of 2 or 12 h, with no differences in total cross-linking noted between the two incubation times. After quenching with an excess of Tris-HCl, the cross-linked proteins were exhaustively digested with trypsin at a 1:20 mass ratio of trypsin to apoA-IV for 2 h at 37°C followed by a second spike of trypsin and another 2-h incubation. The resulting peptides were then lyophilized to dryness and stored at -80°C until analyzed by MS.
Mass Spectrometry-Nano-LC-MS/MS analyses were performed on a TripleTOF 5600+ (AB Sciex, Toronto, Canada) coupled to an Eksigent (Dublin, CA) NanoLC-Ultra nanoflow system. Dried samples were reconstituted in formic acid (FA)/H2O 0.1/99.9 (v/v), and 5 μl (containing 1-3 μg of digest) was loaded onto a C18 IntegraFrit™ trap column (outer diameter of 360 μm, inner diameter of 100 μm, and packed bed of 25 μm) from New Objective, Inc. (Woburn, MA) at 2 μl/min in FA/H2O 0.1/99.9 (v/v) for 15 min to desalt and concentrate the samples. For the chromatographic separation, the trap column was switched to align with the analytical column, an Acclaim PepMap100 (inner diameter of 75 μm, length of 15 cm, C18 particle size of 3 μm, and pore size of 100 Å) from Dionex-Thermo Fisher Scientific (Sunnyvale, CA). The protein was eluted at 300 nl/min using a varying mobile phase gradient from 95% phase A (FA/H2O 0.1/99.9, v/v) to 40% phase B (FA/acetonitrile 0.1/99.9, v/v) for 35 min (1% per min) and then from 40% B to 85% B in 5 min with re-equilibration. The effluent from the nanoLC was introduced into the mass spectrometer using a NANOSpray III source (AB Sciex, Toronto, Canada). The instrument was operated in positive ion mode for 65 min, where each cycle consisted of one TOF-MS scan (0.25-s accumulation time, in a 350-1500 m/z window) followed by 30 information-dependent acquisition mode MS/MS scans on the most intense candidate ions selected from the initial TOF-MS scan of that cycle. Each product ion scan was operated under vendor-specified high sensitivity mode with an accumulation time of 0.075 s and a collision energy (CE) of 43 with an 8-unit scan range. The .wiff files were converted to Mascot generic files using a file converting algorithm embedded in PeakView v1.2.0.3 software (AB Sciex).
Mass Spectral Data Analysis-The MS data, in the form of a Mascot generic file, were uploaded into Cross-ID. Given the sequence of apoA-IV, the cross-linker used, the isotopic possibilities, and the enzyme used for peptide generation, the program first identifies all possible cross-link interactions based on a user-defined window of error for each mass detected in the experiment (20 ppm for this instrument). Once candidate identifications are determined, the program then compares theoretical MS/MS fragmentation patterns to the actual data, generating a probability score that reflects the confidence of the identification. These identifications are then visually verified by experienced personnel before the cross-link is declared valid. Valid identifications required co-eluting 14N and 15N versions of the same species and had to appear in two independent analyses of purified monomers and dimers.
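The mass-window step described above can be illustrated with a minimal Python sketch. This is not the Cross-ID implementation; the function names and the candidate masses are hypothetical, and only the 20 ppm tolerance comes from the text.

```python
def ppm_error(observed_mass: float, theoretical_mass: float) -> float:
    """Signed mass error in parts per million."""
    return (observed_mass - theoretical_mass) / theoretical_mass * 1e6


def match_candidates(observed_mass, candidates, tol_ppm=20.0):
    """Return the (label, mass) candidates within tol_ppm of the observed mass.

    `candidates` is a list of (label, theoretical_mass) tuples; in a real
    search these would be computed from the protein sequence, the enzyme,
    the cross-linker, and the isotope labeling state.
    """
    return [
        (label, mass)
        for label, mass in candidates
        if abs(ppm_error(observed_mass, mass)) <= tol_ppm
    ]


# Hypothetical theoretical masses (Da) for three candidate cross-linked pairs.
candidates = [("pair_1", 2000.000), ("pair_2", 2000.030), ("pair_3", 2000.100)]
hits = match_candidates(2000.010, candidates)
print([label for label, _ in hits])
```

Candidates that pass this filter would still need MS/MS scoring and manual inspection, as described in the text.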
Generation of Initial Model for Molecular Dynamics Studies-An initial full-length model of apoA-IV was generated using the Phyre2 server (32). The Phyre2 server searches for sequence homology within the Protein Data Bank database and generates models based on previously solved structures. The initial structure generated by Phyre2 was based on the apoA-IV 64-335 structure, and the secondary structure of this model consisted primarily of helices. This was then further refined manually using PyMOL to manipulate residues to conform to the distance constraints from cross-linking experiments. In conforming to these constraints, proline, glycine, or serine residues were used to initiate helical breaks or bends, similar to what is observed in the apoA-IV 64-335 structure. Additionally, hydrophobic residues were buried into the core of the helical bundle, again similar to what is observed in the apoA-IV 64-335 structure.
All Atom Molecular Dynamics Simulations-All atom MD simulations were performed using NAMD 2.9 (33) as described by Jones et al. (34). Using Visual Molecular Dynamics (35), each system was solvated with the solvate plug-in and ionized and charge-neutralized with NaCl to 0.15 M with the ionize plug-in. The TIP3P model was used for water (36), and the CHARMM 22 (37,38) force fields were used for protein.
The all atom simulations began with the starting structure of full-length monomeric apoA-IV described above. After heating to 310 K, the system was simulated for 15 ns. In another experiment, the system was subjected to a 15-ns MD-simulated annealing (MDSA) protocol as described by Jones et al. (34), involving a quick heating to 500 K that was maintained for 5 ns, followed by a cool down from 500 to 310 K over 5 ns, followed by 5 ns at 310 K. As a measure of the convergence of our simulations, the root mean square deviation was calculated over the course of each simulation.
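As an illustration of the MDSA temperature profile described above, the following Python sketch returns the target temperature at a given simulation time. The linear shape of the cool-down is an assumption (the text gives only the end points and durations), the initial rapid heating ramp is omitted, and the function name is hypothetical.

```python
def mdsa_temperature(t_ns, t_high=500.0, t_low=310.0, hold_ns=5.0, cool_ns=5.0):
    """Target temperature (K) at time t_ns for a hold / cool / hold schedule.

    Assumes a linear cool-down between t_high and t_low; the actual
    schedule used in the simulations may differ.
    """
    if t_ns < 0:
        raise ValueError("time must be non-negative")
    if t_ns <= hold_ns:
        return t_high  # 5 ns held at 500 K
    if t_ns <= hold_ns + cool_ns:
        fraction_cooled = (t_ns - hold_ns) / cool_ns
        return t_high - fraction_cooled * (t_high - t_low)  # 500 K -> 310 K
    return t_low  # final segment at 310 K


for t in (0.0, 5.0, 7.5, 10.0, 15.0):
    print(t, mdsa_temperature(t))
```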
To obtain contact maps over the trajectories of the simulations, the number of times that each pair of α-carbon atoms came within 25 Å of each other was counted over the course of that simulation. The counts were combined and then shown, using the graphics software GNUPLOT, as a contour plot in which different colors indicate the percentage of counts relative to the maximum count over all pairs of residues (a count of 0 has no color).
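The counting procedure described above can be sketched in a few lines of Python. This is an illustrative reimplementation, not the script used in the study: the input format (a list of frames, each a list of α-carbon coordinates in Å) and the exclusion of self-pairs are assumptions, and the toy coordinates are hypothetical.

```python
import math


def contact_percentages(frames, cutoff=25.0):
    """Count, for each residue pair, the frames in which the two
    alpha-carbons lie within `cutoff` angstroms, then express the counts
    as a percentage of the maximum count over all pairs.

    Self-pairs (i == j) are left at zero, which is an assumption here.
    """
    n_res = len(frames[0])
    counts = [[0] * n_res for _ in range(n_res)]
    for frame in frames:
        for i in range(n_res):
            for j in range(i + 1, n_res):
                if math.dist(frame[i], frame[j]) <= cutoff:
                    counts[i][j] += 1
                    counts[j][i] += 1
    max_count = max(max(row) for row in counts)
    if max_count == 0:
        return counts, [[0.0] * n_res for _ in range(n_res)]
    percent = [[100.0 * c / max_count for c in row] for row in counts]
    return counts, percent


# Toy trajectory: three residues on a line, three frames.
frames = [
    [(0, 0, 0), (10, 0, 0), (40, 0, 0)],
    [(0, 0, 0), (30, 0, 0), (40, 0, 0)],
    [(0, 0, 0), (10, 0, 0), (40, 0, 0)],
]
counts, percent = contact_percentages(frames)
print(counts)
print(percent)
```

The resulting percentage matrix is the kind of table that could be handed to a plotting program for the contour display.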
Model Comparisons to SAXS Data-To generate multiple models for SAXS comparison, the starting model or the MD-generated model described above was processed with the AllosMod-FoXS web server (39,40) using MODELLER (41) and constrained with the identified cross-links that link the N/C termini (20.0 ± 8.0 Å from Cα to Cα). The simulation was allowed to sample all intermediate conformations consistent with the input structure using a temperature scan. A total of 30 runs were conducted, each generating 101 possible structures, yielding both the single best fit model and the multiple ensemble search results (42).
Liposome Clearance Assay-ApoA-IV or its mutants were added to dimyristoyl phosphatidylcholine (DMPC) liposomes at a specified ratio of 2:1 (mol lipid:mol protein), and the absorbance of the solution at 325 nm was recorded every 30 s for 20 min. Samples were measured in triplicate, and the values were averaged. The data were expressed as a normalized optical density calculated by dividing the sample optical density by the initial optical density. All measurements were made on an Amersham Biosciences Ultraspec 4000 UV-visible spectrophotometer within a temperature-controlled cuvette at 24.5°C.
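The normalization described above amounts to simple arithmetic, sketched here in Python. The order of operations (averaging the triplicates first, then dividing by the initial value) is an assumption, the function names are hypothetical, and the absorbance readings are invented for illustration.

```python
def reading_times(interval_s=30, total_min=20):
    """Time points (s) for readings every interval_s over total_min minutes."""
    n_intervals = total_min * 60 // interval_s
    return [i * interval_s for i in range(n_intervals + 1)]


def mean_of_replicates(replicates):
    """Average replicate absorbance traces point by point."""
    return [sum(values) / len(values) for values in zip(*replicates)]


def normalized_od(trace):
    """Divide every reading by the initial optical density."""
    initial = trace[0]
    if initial == 0:
        raise ValueError("initial optical density must be non-zero")
    return [od / initial for od in trace]


# Invented triplicate traces (first three time points only).
triplicate = [
    [0.8, 0.6, 0.4],
    [1.0, 0.8, 0.6],
    [1.2, 1.0, 0.8],
]
normalized = normalized_od(mean_of_replicates(triplicate))
print(normalized)
print(len(reading_times()))
```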

RESULTS
Overall Strategy-We previously reported an x-ray crystal structure of the central region of human apoA-IV in the dimeric state (22). However, despite continuing work aimed at crystallizing the entire protein, the N- and C-terminal regions have not yet been visualized. In the meantime, we reasoned that chemical cross-linking could provide distance constraints for building a new model for full-length apoA-IV in both its monomeric and dimeric forms. However, when applied to homodimeric proteins, a drawback of the technique is uncertainty with regard to cross-link molecular span. The structures of truncated forms of apoA-IV and apoA-I clearly show that the monomeric and dimeric forms of both proteins are highly similar with respect to the protein contacts made. Thus, one can observe cross-linking of the same two peptides in both monomer and dimer forms, but these linkages occur intramolecularly in the former and intermolecularly in the latter.
To unambiguously distinguish between intra- and intermolecular cross-links, we produced versions of apoA-IV containing either naturally occurring nitrogen (14N) or its heavier isotope (15N). Thus, for every nitrogen atom in a peptide (one for each amino acid in the peptide backbone and more in some side chains), the mass increases by 1 Da in the labeled protein. These mass differences are easily detected by modern mass spectrometers and can be exploited to distinguish between intramolecular and intermolecular cross-linked peptides. Differentially labeled species were mixed 1:1 under denaturing conditions and then allowed to refold/reassociate, creating a mixture of isotopically labeled proteins that are either purely 14N or 15N, or a mixed 14N/15N dimer. Fig. 1 shows the example of the crystal structure of dimeric apoA-IV 64-335 with 14N-apoA-IV (green) and 15N-apoA-IV (red). For an intermolecular cross-link between two peptides (A and B), there are four possible mass combinations: 14A to 14B, 14A to 15B, 15A to 14B, and 15A to 15B.

Cross-linking Monomeric and Dimeric ApoA-IV-The 14N and 15N versions of the full-length apoA-IV were generated using our bacterial expression system (43). Fig. 2a shows that in solution at 1 mg/ml, both forms distributed into monomers (lower bands) and dimers (upper bands) at a ~1:2 ratio, respectively. Fig. 2b shows mass spectra for a typical peptide from each preparation. The expected 9-Da mass increase was apparent in the 15N-labeled peptide. The lack of significant peaks lower than 413.221 Da shows nearly complete 15N incorporation. The mixed 14N/15N preparation was separated by gel filtration chromatography to yield isolated monomer and dimer species, with minimal cross-contamination. We have previously demonstrated that the isolated species can be maintained for weeks at 4°C without redistribution (22).

Full-length apoA-IV Structure FEBRUARY 28, 2014 • VOLUME 289 • NUMBER 9
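The mass bookkeeping behind the 14N/15N labeling strategy can be made concrete with a short Python sketch. It counts nitrogen atoms per residue (one backbone nitrogen each, plus side-chain nitrogens for Lys, Arg, His, Asn, Gln, and Trp) and lists the mass shifts for the four isotope combinations of a cross-linked peptide pair. The two peptide sequences are hypothetical, the function names are illustrative, and the per-atom shift uses the 15N minus 14N mass difference of about 0.997 Da rather than the rounded 1 Da quoted in the text.

```python
from itertools import product

# Total nitrogen atoms per residue (backbone + side chain); all other
# residues carry only the single backbone nitrogen.
NITROGENS = {"K": 2, "R": 4, "H": 3, "N": 2, "Q": 2, "W": 2}
DELTA_15N = 0.997035  # Da, mass difference between 15N and 14N


def nitrogen_count(peptide: str) -> int:
    """Number of nitrogen atoms in a peptide given as one-letter codes."""
    return sum(NITROGENS.get(residue, 1) for residue in peptide)


def isotope_shift_table(peptide_a: str, peptide_b: str) -> dict:
    """Mass shift (Da, relative to the all-14N pair) for each of the four
    isotope combinations of a cross-linked pair A-B."""
    table = {}
    for iso_a, iso_b in product((14, 15), repeat=2):
        labeled_nitrogens = 0
        if iso_a == 15:
            labeled_nitrogens += nitrogen_count(peptide_a)
        if iso_b == 15:
            labeled_nitrogens += nitrogen_count(peptide_b)
        table[(iso_a, iso_b)] = labeled_nitrogens * DELTA_15N
    return table


# Hypothetical peptides, not taken from the apoA-IV sequence.
shifts = isotope_shift_table("AKLR", "LKEQR")
for combo, shift in shifts.items():
    print(combo, round(shift, 3))
```

An intramolecular cross-link can only produce the (14, 14) and (15, 15) entries, whereas an intermolecular cross-link in a mixed dimer can also produce the two mixed entries.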
The separated species were then individually cross-linked, trypsinized, and analyzed by MS as described under "Experimental Procedures." We identified 51 peptide pairs that contained an intact cross-link in both the monomer and dimer samples. The MS signatures of these species fell into two categories exemplified in Fig. 4. The majority contained 14N and 15N masses only (Fig. 4a), which are indicative of intramolecular linkages (Fig. 1). Eleven cross-linked pairs, always appearing in dimeric samples, exhibited the additional mixed forms, all with similar intensities (Fig. 4b, green spectrum), clearly indicating intermolecular span. Table 1 lists the peptides containing an intra-peptide cross-link, i.e., cross-links that occurred between two lysine residues located on the same tryptic peptide. By definition, these links are intramolecular. Table 2 lists the cross-links that were interpeptide. These can be intramolecular, occurring between distant peptides within the same protein molecule, or intermolecular. Inspection of the tables reveals several important observations. First, almost every cross-link that appeared in the monomeric samples also appeared in the dimeric samples. There were only two that were unique to the dimer and one unique to the monomer. Second, there were several instances where the same two peptides were cross-linked intramolecularly in the monomer but were intermolecular in the dimer (for example, Lys59-Lys330 in Table 2, with spectra shown in Fig. 4b).
Cross-link Consistency with the ApoA-IV 64-335 Crystal Structure-The cross-linking technique analyzes proteins in solution, whereas x-ray crystallography reports on proteins in a crystalline state. We first compared the cross-linked Lys residues listed in Tables 1 and 2 with predictions from the apoA-IV 64-335 crystal structure. Fig. 5 is a contact plot rendered from the crystal structure which shows the proximity of each residue with respect to all other residues in a two-dimensional format. The colored areas show where residues indicated on the axes come within 20 Å of each other (i.e., close enough to be cross-linked). Plotting the experimentally derived cross-links showed that all landed within the colored regions and were thus consistent with the crystal structure, even when far apart in the sequence. The observation of these cross-links on a previously validated homodimeric crystal structure shows that we can clearly differentiate intra- and intermolecular contacts. Based on the dimeric apoA-IV 64-335 crystal structure and later supported by SAXS (26), we proposed that apoA-IV could monomerize by keeping the shared helix to itself rather than swapping it with another molecule (22). The helix, bent at its midpoint, was posited to fold back onto the molecule making intramolecular interactions that are identical to the intermolecular interactions made in the dimer. One of the cross-links in Table 2 allowed direct evaluation of this prediction. Fig. 6 shows the cross-link between Lys103 and Lys191. Table 2 and Fig. 4b indicate that this cross-link is intramolecular in the monomer but intermolecular in the dimer. The insets in Fig. 6 show that this linkage can easily occur (~12 Å) in both models. In fact, this was the only model that we could envision that satisfies the data. Furthermore, this concept easily explains the remarkable similarity in the overall cross-linking patterns between the dimeric and monomeric samples. We interpreted these data as strongly supportive of the proposed monomer model and elected to pursue further modeling of full-length apoA-IV in the context of the monomer. After creating a starting model based on available data, we took two separate modeling approaches: (i) molecular dynamics simulation and (ii) constrained simulation using our previously generated SAXS data for the full-length protein (26).
Generation of a Starting Model for the Full-length ApoA-IV Monomer-Once we obtained the cross-linking distance constraints within regions that were not visible in the crystal structure of apoA-IV 64-335, we set out to generate a model of full-length apoA-IV that would be suitable as a starting point for in silico modeling. Using homology modeling techniques, coupled with secondary structure analysis, we created a preliminary model of full-length monomeric apoA-IV in which unknown regions of the protein were placed as directed by the cross-linking data (see "Experimental Procedures"). Subsequently, we energy-minimized the structure to generate the "starting" model.
Modeling Approach 1: Molecular Dynamics Simulation-We subjected the starting model to two 15-ns MD simulations: one at 310 K and a second MDSA simulation (5 ns at 500 K, 5 ns of cool down, and 5 ns at 310 K). These two methods are useful for comparing the stability of the initial model at physiological temperature with more aggressive simulation using the temperature jump in the MDSA approach. Root mean square deviation plots for both simulations showed good convergence after 10 ns (not shown).

FIGURE 6 (partial legend). The long, "swapped" helices that reciprocate across the dimer interface can be seen on the right side of the dimer. In our proposed model of the monomeric structure (right), the long helix in molecule A has a hinge introduced at its midpoint and folds back onto the molecule, making exactly the same interactions as the swapped helix from molecule B makes in the dimer. The insets show the cross-link between Lys191 and Lys103 (pink line), which was determined to be intermolecular in the dimer but intramolecular in the monomer (the mass spectra for a representative ion for this cross-link are shown in Fig. 4b).
The structure resulting from the 15-ns simulation is shown in Fig. 7 with its contact plot (with cross-linking data overlaid) and sequences predicted to be helical. The model is dominated by a four-helix bundle composed of the same residues that form the reciprocating four-helix bundle in the apoA-IV 64-335 crystal structure. The N and C termini are clustered at one end of the bundle in a significantly less helical, globular conformation. The calculated α-helicity of the model for the last 5 ns is 67%. The experimentally determined α-helical content of lipid-free apoA-IV varies considerably in the literature: 35% (44), 40% (30), 43% (45), 54% (46), 60% (47), and 68% (48). Thus, this model is on the high end of the range. The contact plot in Fig. 7a differs from Fig. 5 in that the distances plotted are over the entire 15 ns of the simulation rather than distances determined from a static structure. When the cross-links were superimposed on the plot, it was clear that all were plausible in the model. Because the cross-links were not set as a constraint during the simulations, this analysis indicates that the general organization of our starting model held up during the simulation. However, we caution that MD simulations may reflect localized energy minima that may not fully reflect the native solution structure of the protein.
The results of the simulated annealing (MDSA) simulation are not shown. The model was overall similar with a four-helical bundle and globular domain, although the bundle was less helical and slightly more curved. The N terminus folded back across the helical bundle and was in closer proximity to the C terminus versus the non-MDSA model. The helicity was ~66%. The cross-links were generally compatible with the model with the exception of Lys189-Lys325, which was not plausible in this model.
Modeling Approach 2: Comparing the Monomer Models to SAXS Data-We have previously used SAXS to determine the molecular shape of apoA-IV 64-335 (26). The molecular envelope of the dimer exhibited an excellent match with that predicted from the crystal structure. In that work, we also derived SAXS envelopes for isolated full-length monomer and dimer. We first compared the starting model to the full-length SAXS profile. Fig. 8a shows that it was in good agreement with the SAXS profile (χ = 1.51) and significantly better than a model that only includes the helical bundle (χ = 5.80). We then used the AllosMod-FoXS web server (39,40) to generate alternate conformations of the starting model to determine whether we could better represent the SAXS profile. The starting model was constrained with the experimentally observed cross-links as described under "Experimental Procedures." Fig. 8 (a and b) shows the single best fit from the analysis, which is in excellent agreement with the experimental SAXS profile (χ = 0.93). A minimal ensemble search (42) of the AllosMod-FoXS generated models revealed that a combination of two models resulted in a slight improvement to the fit of the SAXS profile (χ = 0.57). However, the models were divergent in their N/C termini, suggesting that these segments can sample multiple conformations (Fig. 8c).
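The fit values quoted above are goodness-of-fit scores between a profile computed from a model and the experimental SAXS profile. A minimal Python sketch of one such score is given below, assuming a χ statistic of the kind commonly used by SAXS profile-fitting tools: the root mean square of the error-weighted residuals after least-squares scaling of the computed profile. The exact definition and the additional fitted parameters used by the server may differ, and the function name and intensities are hypothetical.

```python
import math


def saxs_chi(i_exp, i_calc, sigma):
    """Return (chi, scale) for a computed profile against experimental data.

    scale is the least-squares factor c that minimizes the weighted
    residuals; chi = sqrt(mean(((I_exp - c * I_calc) / sigma) ** 2)).
    """
    weights = [1.0 / s ** 2 for s in sigma]
    numerator = sum(w * e * c for w, e, c in zip(weights, i_exp, i_calc))
    denominator = sum(w * c * c for w, c in zip(weights, i_calc))
    scale = numerator / denominator
    chi_squared = sum(
        ((e - scale * c) / s) ** 2 for e, c, s in zip(i_exp, i_calc, sigma)
    ) / len(i_exp)
    return math.sqrt(chi_squared), scale


# Invented intensities: the experimental profile is exactly twice the
# computed one, so the fit is perfect after scaling.
chi, scale = saxs_chi(
    i_exp=[20.0, 10.0, 4.0, 2.0],
    i_calc=[10.0, 5.0, 2.0, 1.0],
    sigma=[1.0, 1.0, 0.5, 0.5],
)
print(chi, scale)
```

Under this definition, values near 1 indicate that the model reproduces the data to within the experimental error, which is how the comparison between models above can be read.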
The single best fit molecular model is shown in Fig. 8b and shows that the general overall shape is maintained, whereas parts of the N/C termini are less compact and extended away from the bundle. Overall, the orientation satisfies the experimentally determined cross-linking profiles and SAXS scattering profile. Furthermore, we generated a GASBOR ab initio molecular envelope of the apoA-IV monomer, which contained extra density adjacent to the four-helical bundle (26). Our newly derived single best fit model strongly indicates that the extra density derives from the N/C termini (Fig. 8d). Nonetheless, alternate/multiple conformations may exist that could also satisfy these parameters.
Although the single best fit is consistent with the experimental data, we took a closer look at the experimental SAXS profile. Further analysis confirmed that monomeric apoA-IV exhibits some flexibility. This is demonstrated by the lack of a distinct plateau in the Porod-Debye plot (q^4 versus q^4·I(q)), whereas there is a clear plateau in the q^3 versus q^3·I(q) plot (Fig. 8e). A plateau in this plot is suggestive of intrinsic flexibility in the protein (49,50). This should not be mistaken for an unstructured protein because the q^2 versus q^2·I(q) plot displays a distinct decrease and not a plateau (Fig. 8e) (49,50).
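The three plots discussed above are simple power-law transforms of the scattering profile. The Python sketch below computes them for an arbitrary profile; the function name is hypothetical, and the synthetic intensity (decaying as q to the power of -3) is chosen only because it reproduces the qualitative pattern described in the text, not because it represents the measured data.

```python
def power_law_transforms(q, intensity):
    """Return {n: (q**n, q**n * I(q))} for n = 2, 3, 4."""
    transforms = {}
    for n in (2, 3, 4):
        x = [qi ** n for qi in q]
        y = [xi * ii for xi, ii in zip(x, intensity)]
        transforms[n] = (x, y)
    return transforms


# Synthetic profile with I(q) proportional to q**-3.
q = [0.10, 0.15, 0.20, 0.25]
intensity = [5.0 / qi ** 3 for qi in q]
plots = power_law_transforms(q, intensity)

print(plots[4][1])  # rises with q: no plateau in the q^4 plot
print(plots[3][1])  # constant: plateau in the q^3 plot
print(plots[2][1])  # falls with q: decrease in the q^2 plot
```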

For completeness, we also analyzed the MD model (non-MDSA) with respect to the SAXS data (not shown). The simulated model was in good agreement with the SAXS envelope (χ = 2.38), though the agreement was not as strong as for the starting model. The AllosMod-FoXS analysis yielded an improved single best fit model (χ = 0.97). Again, the models were divergent in the N and C termini (Fig. 8c), consistent with significant conformational flexibility in these regions.
Extrapolation of the Simulated Monomer Model to the Dimeric Form-Based on the cross-linking data in Tables 1 and 2 and our previous work, we reasoned that the structure of the globular domain was similar in the monomer and dimer. In the monomer, this domain consists of the N and C termini from the same molecule. In the dimer, there are two of these domains, each composed of termini from different molecules. Therefore, we generated a symmetry-related full-length dimer with swapped chains by superimposing the monomer MD results onto the apoA-IV 64-335 crystal structure dimer. Unfortunately, MD simulations of this model were not successful because of significant instability at the crossover in the middle of the model. Thus, we were unable to develop a consensus model and corresponding contact plot. However, by simple extrapolation, it is clear that the cross-links occurring between the N- and C-regions in the monomer can also occur intermolecularly in the dimer. An example is shown in Fig. 9. This conclusion also applies to the single best fit models derived from the SAXS analysis (not shown).
Functional Predictions from the Models-From the modeling approaches described above, it is evident that the N/C-terminal domains interact and likely fold back to interact with the helical bundle. Further inspection of the bundle revealed that a phenylalanine (Phe223) was positioned near hydrophobic residues known to be involved in triggering the lipid affinity of apoA-IV (24) (see "Discussion" for further details). Speculating that this residue may comprise part of the clasp mechanism, we mutated it to an alanine and then evaluated the effect on the ability of apoA-IV to reorganize DMPC liposomes into lipoproteins. Fig. 10 shows that apoA-I cleared the lipid avidly as demonstrated previously (51), whereas WT apoA-IV did so at a more moderate rate (47). On the other hand, a mutant of apoA-IV (apoA-IV(F334A)) in which the clasp interaction has been disrupted (24) cleared the lipid nearly as efficiently as apoA-I. Interestingly, apoA-IV(F223A) was significantly more effective than WT apoA-IV and nearly as effective as apoA-IV(F334A), indicating that this mutation may indeed affect the lipid binding clasp interaction. We observed the same effect when Phe223 was mutated to His (not shown). By contrast, other point mutations in the region, such as at Phe294, failed to exert significant effects on lipid solubilization (Fig. 10).

DISCUSSION
The remarkable structural plasticity of apolipoproteins is evident in the different forms they adopt. They can exist lipid-free (or lipid-poor) in solution, where they undergo a concentration-dependent self-association that bears a remarkable similarity to chemical detergents. Upon interaction with lipid, significant structural rearrangements are induced to generate and stabilize lipoprotein particles. These properties have made them notoriously difficult to study by traditional high resolution techniques such as nuclear magnetic resonance and x-ray crystallography. Although moderate success has been achieved with truncated mutants, our structural understanding of these proteins as they exist in lipoproteins and how they transition to this state remains limited. This has required the clever integration of a wide range of complementary structural techniques to understand the structure of the full-length proteins.

FIGURE 9 (partial legend). ... Table 2. b, the non-MDSA model was used to generate a preliminary model of the dimeric form of full-length apoA-IV as shown with molecule A in green and molecule B in teal. The same Lys residues are shown participating in an intermolecular cross-link.

Power of Isotope-assisted Cross-linking for Determining Spatial Constraints in Oligomeric Proteins-The recent crystal structures of dimeric apoA-I (23) and apoA-IV (22) fragments have highlighted the remarkable similarities in molecular contacts among the various oligomeric states of apolipoproteins. This may hint that their lipid-bound forms also exhibit similar reciprocating contacts. Therefore, it is critical to develop structural techniques that can distinguish between intra- and intermolecular contacts, even if they occur between exactly the same protein regions.
In earlier work, we proposed a crude way to estimate the molecular span of cross-links called the "oligomer isolation strategy." This involved setting the cross-linking conditions so that linked monomers and dimers could be separated by gel filtration chromatography and independently analyzed by MS (52). In theory, the dimeric species should contain all intra- and intermolecular cross-links, whereas the monomeric species should contain only intramolecular links. Those unique to the dimer were interpreted to be intermolecular. This approach was adopted by other laboratories (53,54) as well. However, it suffers from three limitations. First, it is not always possible to completely separate monomers and dimers in the analysis, resulting in uncertainty in assigning molecular span (52). Second, it is not possible to distinguish cross-link span in trimers or higher-order oligomers, even if they can be individually isolated. Third, and most importantly, the crystal structures of dimeric apoA-I (23) and apoA-IV (22) show that the same cross-link can appear in both the dimers and monomers but have different molecular spans in each.
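The assignment logic of the oligomer isolation strategy can be sketched as a simple set comparison. The following is a minimal illustration only, not the analysis pipeline used in the cited work; the function name and the site labels are hypothetical placeholders and do not correspond to real apoA-IV residue pairs.

```python
def classify_by_oligomer_isolation(monomer_links, dimer_links):
    """Assign molecular span using the oligomer isolation logic.

    Each cross-link is a hashable label (here, a pair of site names).
    Links seen in the isolated monomer are called intramolecular;
    links seen only in the isolated dimer are called intermolecular.
    """
    monomer_links = set(monomer_links)
    dimer_links = set(dimer_links)
    return {
        "intramolecular": sorted(monomer_links),
        "intermolecular": sorted(dimer_links - monomer_links),
    }


# Hypothetical cross-link labels (placeholders, not experimental data).
monomer = [("siteA", "siteB")]
dimer = [("siteA", "siteB"), ("siteC", "siteD")]

result = classify_by_oligomer_isolation(monomer, dimer)
print(result)
```

Note that a link present in both fractions is always called intramolecular by this logic, even if it spans two molecules in the dimer. This is the third limitation described above.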
The dual isotope labeling approach reported here addresses all these issues. Although we separated monomers and dimers in this study, this is not a requirement. Contributions of mixed systems should be interpretable by comparing the intensities of the mixed isotope cross-links with those of the pure 14N and 15N versions. A mixed system of equal monomer and dimer should produce a pattern in which the 14N and 15N peptide ion intensities are X and the mixed peptide ion intensities are 0.5X. A similar argument can be made for evaluating oligomers of higher order than dimer. For example, if all four ion intensities are equal, then all three molecules in a trimer can be interpreted as making intermolecular contacts. However, if the mixed species are some fraction of the 14N and 15N species, one of the molecules may be making different contacts than the other two. Therefore, we anticipate that this technique will be highly useful for studying the structure of oligomeric species of other proteins such as apoA-I as well as the lipid-bound forms of all apolipoproteins. However, because this technique requires purified protein from two different sources, it is most applicable to systems that can (re)assemble post-production.
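The intensity argument above can be made concrete with a small calculation. The sketch below is illustrative and rests on stated assumptions rather than on the analysis actually performed: 14N and 15N protein are mixed 1:1, chains assort randomly into oligomers, and all four isotope forms of a cross-linked peptide pair are detected with equal efficiency. The function names and the parameter `f_inter` (the fraction of events for a given cross-link that span two molecules) are hypothetical.

```python
def crosslink_isotope_pattern(f_inter, light_fraction=0.5):
    """Expected relative intensities of the four isotope forms of one cross-link.

    f_inter: fraction of cross-linking events that are intermolecular (0 to 1).
    light_fraction: fraction of 14N-labeled chains in the mixture (assumed 0.5).
    """
    if not 0.0 <= f_inter <= 1.0:
        raise ValueError("f_inter must be between 0 and 1")
    p = light_fraction
    q = 1.0 - p
    f_intra = 1.0 - f_inter
    return {
        # Intramolecular links join two peptides from the same chain,
        # so they contribute only to the pure 14N and pure 15N forms.
        "14N/14N": f_intra * p + f_inter * p * p,
        "14N/15N": f_inter * p * q,
        "15N/14N": f_inter * q * p,
        "15N/15N": f_intra * q + f_inter * q * q,
    }


def mixed_to_pure_ratio(f_inter):
    """Mean mixed-isotope intensity divided by mean pure-isotope intensity (1:1 mix)."""
    pattern = crosslink_isotope_pattern(f_inter)
    mixed = (pattern["14N/15N"] + pattern["15N/14N"]) / 2
    pure = (pattern["14N/14N"] + pattern["15N/15N"]) / 2
    return mixed / pure


def intermolecular_fraction(ratio):
    """Invert mixed_to_pure_ratio for a 1:1 mix: f = 2r / (1 + r)."""
    return 2.0 * ratio / (1.0 + ratio)


for f in (0.0, 2.0 / 3.0, 1.0):
    print(f, crosslink_isotope_pattern(f), mixed_to_pure_ratio(f))
```

Under these assumptions, a purely intermolecular link gives four equal intensities, and a purely intramolecular link gives no mixed species. A mixed-to-pure ratio of 0.5 corresponds to two-thirds of the events being intermolecular, which is what one would expect, for example, from equal numbers of monomer and dimer particles if each symmetric dimer carries two copies of the link. That last interpretation is an assumption of this sketch.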
Structure of ApoA-IV
We drew four major conclusions from the current study. First, the cross-linking pattern from soluble protein in buffer closely matched the predictions of the apoA-IV 65-335 crystal structure. This, combined with our previous SAXS work (26), is strong evidence that the crystal structure reflects soluble apoA-IV and that the crystallization conditions did not significantly perturb the structure. Second, because the cross-linking was performed on full-length apoA-IV and the crystal structure was determined from a truncation mutant, the addition of the N and C termini does not dramatically affect the structure of the central helical bundle of the protein, again arguing for the relevance of the crystal structure to full-length soluble apoA-IV. Third, the molecular span information from the cross-links strongly supports our proposed structure for monomeric apoA-IV. Finally, using the cross-linking/SAXS data and molecular dynamics, we created two new models for monomeric apoA-IV. In both, the hydrophobic components of each terminus interact to form a globular domain that appears to sit atop the four-helix bundle.
The best test of a protein structural model is whether it can be used as a basis for making functional predictions. Our previous work has provided significant evidence that sequences near the N terminus (particularly residue 12) and C terminus (residue 334 and possibly residue 335) interact to form a clasp that prevents apoA-IV from unfolding in response to lipid (24,25,47). Unfortunately, these residues were not visible in the apoA-IV 65-335 crystal structure. Upon inspection of the new full-length models in Figs. 7 and 8, we found that both residues are in proximity in both models (α-carbons within 12.3 Å). Furthermore, a phenylalanine (Phe223) was also clustered near these hydrophobic clasp residues. This residue had previously caught our attention because of its unusual exposure on the surface of the four-helix bundle in the crystal structure. The models suggested that Phe223 may comprise part of the clasp mechanism, possibly acting to keep the globular domain associated with the folded helical bundle. Consistent with this, mutating Phe223 had stimulatory effects on apoA-IV lipid binding similar to those of mutating Phe334. However, manipulation of another phenylalanine in the same general area failed to alter lipid binding. This may indicate a three-way interaction between the N terminus (residue 12), the C-terminal region (Phe334), and the helical bundle (Phe223) that plays a major role in regulating apoA-IV lipid binding (Fig. 11). When the loosely folded globular domain interacts with lipid, there may be a conformational cascade that results in its dissociation from the helical bundle, allowing the bundle to open and solubilize lipid bilayers more avidly. Based on the similarity of the structures, this process should occur similarly in the monomer or the dimer.
Another interesting issue relates to the impact of the globular domain on the transition between monomer and dimer. The degree to which the two termini are intertwined may be envisioned to impact the kinetics of the interconversion. Our new models indicate that the N and C termini are more or less independently folded, although each may depend on the other for stabilizing interactions. Thus, the dimer to monomer transition may not require a complex conformational change in this domain. This is supported by our previous studies showing that both the WT apoA-IV and apoA-IV 65-335 (lacking the N and C termini) have similar dimer to monomer interconversion rates (22). If the N and C termini interacted in a more integrated or tight fashion, the transition should be slower when the termini are present. However, we caution that the structural resolution of these domains needs to be improved before more definitive conclusions can be made.
As additional data become available, the quality of full-length human apolipoprotein models is consistently improving. The currently reported apoA-IV models are consistent with (i) our partial crystal structure, (ii) SAXS molecular shape data, and (iii) upwards of 50 chemical cross-links. In addition, they are consistent with secondary structure estimates made by circular dichroism. Nevertheless, there remains some uncertainty in the structure of the globular domain. This area is likely dynamic enough that it may not adopt any single structure at a given time. Indeed, this is consistent with our data showing that these domains are the first to be digested under limited proteolysis (30). Although crystallography and nuclear magnetic resonance methods will continue to be pursued, the dynamic nature of these apolipoproteins may preclude complete structural visualization by these methods. Thus, it is important to continue to develop alternative methodologies, such as the isotope-assisted cross-linking technique, to meet this goal.